I run a dnsmasq server on my router (which is a Raspberry Pi 2) to handle local DNS, DNS proxying and DHCP. For some reason one of the hosts stopped registering its hostname with the DHCP server, and so I couldn’t resolve its name to an IP address from other clients on my network.
I’m pretty sure it used to work, and I’m also pretty sure I didn’t change anything – so why did it suddenly stop? My theory is that the disk on the client became corrupt and a fsck fix removed some files.
Anyway, the cause is that the DHCP client didn’t know to send its hostname along with the DHCP request.
This is fixed by creating (or editing) /etc/dhcp/dhclient.conf and adding this line:
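The line itself is missing from this copy of the post. The stock Debian dhclient.conf uses the directive below for exactly this purpose, so here is a minimal sketch assuming that is what was meant. The helper name add_hostname_directive is made up for illustration.

```shell
# Append the hostname directive to a dhclient.conf file, but only once.
# Assumption: the missing line was the usual "send host-name" directive.
add_hostname_directive() {
    conf="$1"
    line='send host-name = gethostname();'
    touch "$conf"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
}
```

Run as root it would be called as `add_hostname_directive /etc/dhcp/dhclient.conf`, followed by a lease renewal on the client so the hostname is actually sent.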
A few years ago, when I started working at home, I had a second ADSL line installed so that I could still get online if my ISP had an outage. As well as fault tolerance I wanted to try and use all the available bandwidth rather than just have it sitting there “just in case”. I achieved this using multi path routing and documented the solution here: Over Engineering FTW.
This has been running really well on a Raspberry Pi for about 3 years (with an older kernel, see later in this post for why) but recently the SD card has started to fail. This would be easy to fix (simply replace the SD card and copy my scripts over), but the rural town I live in has just been upgraded to FTTC and so my connection speed has gone from about 8 Mbps to about 70 Mbps on each line. The first generation Pi doesn’t have enough horsepower to cope with 70 Mbps, let alone 140 Mbps, and indeed its ethernet interface is only 100 Mbps. I had a Raspberry Pi 2 spare anyway so I figured I would use that and add a second gigabit NIC so I could cope with the theoretical 140 Mbps connection to the internet, and since I had two NICs I might as well use both of them.
This is what I came up with:
Two lines coming from the cabinet to my house, one with Plusnet and one with TalkTalk
The Plusnet line:
It came with an OpenReach vDSL bridge and a crappy locked down router, so I chucked the router away and used PPPoE tools to bring up the PPP connection
The vDSL bridge talks to the Raspberry Pi over a VLAN to keep it separated from the other noise on the switch
Interface eth1.1000 is an unnumbered interface and pppoeconf uses a layer 2 discovery protocol to find the bridge
Once the PPP connection is established ppp1 can be used to route traffic to the internet
The TalkTalk line:
It too came with a crappy router, but no OpenReach bridge. So I had to use it.
The TalkTalk router talks to the Raspberry Pi over VLAN 10. Those ports are untagged on the switch, so as far as everyone on that network knows it’s just a self-contained LAN.
Interface eth0 on the Raspberry Pi has an address on that LAN and uses the TalkTalk router to talk to the internet
The main LAN:
Interface eth1 is used to connect to the main LAN
Clients on the LAN use the Raspberry Pi as their default gateway
With me so far? Essentially we have the normal eth0 interface of the Pi connected to one LAN with its own router, while eth1 (a USB gigabit ethernet adapter) has a tagged VLAN for connection to the OpenReach bridge (eth1.1000) and an untagged default network for connecting to the main LAN. Once the layer 2 connection with the bridge is established a PPP connection becomes the second route to the internet.
The death of route caching
Around version 3.6 of the Linux kernel “route caching” was removed. With route caching in place you could set up a default route with multiple next hops, something along the lines of:
ip route add default nexthop via 192.168.1.254 dev eth0 nexthop via 192.168.2.254 dev eth1
When a packet needed routing to the internet the kernel would do a round-robin selection of which route to use and then remember that route for a period of time. The upshot of this was, for example, that if you connected to www.bbc.co.uk and got routed first via 192.168.1.254 and so SNATed to 18.104.22.168 then all subsequent traffic for that destination also got routed via the same route and had the same source IP address. Without route caching the next packet to that same destination would (probably) use the other route, and in the case of my home user scenario would arrive from a different source IP address – my two internet connections having different IP addresses. Although HTTP itself is a stateless protocol, this change of IP address did seem to freak some services out. For long-lived connections the story is worse, e.g. packets of an SSH connection would arrive at the far end from two different IP addresses and probably get dropped. Route caching was a simple fix for this issue and worked well, as far as I was concerned anyway.
I’m sure the reasons to remove it are valid, but for my simple use case it worked very well, and the alternative (now the only option) is to use connection marking to simulate the route caching. When I first looked at it I was baffled and thought I would just go back to a pre-3.6 kernel and use route caching again. But in the standard Raspbian distro there isn’t a kernel old enough for the Raspberry Pi 2 to make use of it.
So I was stuck… I had to use a Raspberry Pi 2 to get enough packet throughput to max out my internet connections, and I couldn’t use route caching because there wasn’t a kernel old enough. This meant I was going to have to either compile my own kernel or learn to use connection marking. Joy.
The documentation for Netfilter is extensive but I found a lot of it to be out of date and very hard to grok. I found a few projects that had already implemented connection tracking/marking, namely FWGuardian and Fault Tolerant Router.
FWGuardian is, as far as I can tell, designed for something orthogonal to my setup: cases where you might have lots of connections coming in to a server, or a number of offices which need to connect to other offices via pre-defined routes. I played around with it for a while, and Humberto very kindly offered me support over email, but ultimately it was too involved and complex for my needs. You should check out the project though if you have advanced requirements. It’s got some brilliant features for a more enterprise-oriented setup.
Fault Tolerant Router is a much simpler setup and matched my requirements very closely. At its core it’s a Ruby script which can write your iptables rules and routing tables and constantly monitor the links. If one goes down it can dynamically rewrite your rules and direct all traffic down the working connection. However, it’s not expecting to use a PPP connection where gateways can change and it’s not really been tested with VLANs, although in practice it handled VLANs just fine.
But, at the end of the day, I wanted to learn how to do this myself and so I used the rules generated by Fault Tolerant Router to understand how connection marking was supposed to work and then started to implement my own home-grown solution for teh lolz.
Multi-path routing and connection marking
As I understand it, connection tracking is netfilter keeping a record of each conversation (identified by protocol, source and destination addresses and ports), and connection marking is attaching an identifier to one of those tracked connections. When a new conversation starts the connection is marked with an identifier. You can then set ip rules to dictate which route packets with a particular mark take. In essence, once a new connection is established and a route selected, all other packets in that conversation take on the same mark and so the same route. This emulates the route caching of the past. HTTP is no exception: it runs over TCP, so every packet of a request and its response belongs to one tracked connection and gets the same mark. This page has some more details, but I haven’t read it properly yet. Anyway, that’s as much of the HOW as we need. Good enough.
First of all we need to create the iptables configuration to set up connection marking. Here’s the relevant extract from the iptables.save file:
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:INPUT ACCEPT [0:0]
[0:0] -A PREROUTING -i eth1 -j CONNMARK --restore-mark
[0:0] -A PREROUTING -i ppp1 -m conntrack --ctstate NEW -j CONNMARK --set-mark 1
[0:0] -A PREROUTING -i eth0 -m conntrack --ctstate NEW -j CONNMARK --set-mark 2
[0:0] -A POSTROUTING -o ppp1 -m conntrack --ctstate NEW -j CONNMARK --set-mark 1
[0:0] -A POSTROUTING -o eth0 -m conntrack --ctstate NEW -j CONNMARK --set-mark 2
-i = --in-interface and -o = --out-interface
These rules set a mark depending on which interface is used. These changes happen in the mangle table.
Packets going in or out the WAN via ppp1 or eth0 which are a new connection are marked with a 1 or a 2 depending on which interface they use. The decision about which route to use is done in the rules which we will see later. Any packets coming in to eth1, so from the LAN, have their marks restored on the way in so they can be dealt with accordingly.
Now let’s have a look at the filter table:
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
:LAN_WAN - [0:0]
:WAN_LAN - [0:0]
[0:0] -A INPUT -i lo -j ACCEPT
[0:0] -A INPUT -i eth1 -j ACCEPT
[0:0] -A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
[0:0] -A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
[0:0] -A FORWARD -i eth1 -o ppp1 -j LAN_WAN
[0:0] -A FORWARD -i eth1 -o eth0 -j LAN_WAN
[0:0] -A FORWARD -i ppp1 -o eth1 -j WAN_LAN
[0:0] -A FORWARD -i eth0 -o eth1 -j WAN_LAN
## Clamp MSS (ideal for PPPoE connections)
[0:0] -I FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
[0:0] -A LAN_WAN -j ACCEPT
[0:0] -A WAN_LAN -j REJECT
The default policy for INPUT and FORWARD is set to DROP, so any packet not matching one of the rules is dropped.
INPUT applies to packets which are bound for the router itself. Packets from the local interface are allowed, and packets from eth1 (the main LAN) are also allowed.
FORWARD applies to packets which are passing through the router on their way somewhere else. Packets which are known to be part of an already in-progress session are allowed. Packets are then categorised as LAN to WAN or WAN to LAN and dealt with by the chains LAN_WAN or WAN_LAN, getting accepted and rejected respectively. All this boils down to the following: packets from LAN clients using the Raspberry Pi as their router are allowed out, and packets coming in from the internet are rejected unless they are part of an on-going connection.
Clamping MSS to MTU deals with a particular issue with using PPPoE connections where the MTU can’t be the usual 1500 bytes. Because a lot of ISPs block the ICMP messages that would normally deal with asking the client to send smaller packet sizes we use this handy trick to make sure that packets can go out unfragmented. If you find that some web pages are slow to load and others are not, then try switching this on. If you’re only using upstream ISP provided routers you probably don’t need this.
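To make the numbers concrete, here is a small worked calculation. It assumes IPv4 and TCP headers without options (20 bytes each) and the standard 8 bytes of PPPoE overhead on a 1500 byte ethernet frame. The function names are made up for this illustration.

```shell
# MTU available inside a PPPoE session carried over ethernet:
# 1500 bytes minus 8 bytes of PPPoE/PPP header.
pppoe_mtu() {
    echo $(( 1500 - 8 ))
}

# Largest TCP segment that fits in one packet of the given MTU,
# assuming a 20 byte IPv4 header and a 20 byte TCP header.
mss_for_mtu() {
    echo $(( $1 - 20 - 20 ))
}
```

So on a plain ethernet path the MSS would be 1460, while over the PPPoE link it has to drop to 1452, which is the adjustment the clamp rule makes to the SYN packets it forwards.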
Lastly in iptables we enable SNAT or masquerading so that connections out to the internet appear to come from a valid internet routable IP address not our LAN IP address:
#SNAT: LAN --> WAN
[0:0] -A POSTROUTING -o ppp1 -j SNAT --to-source 22.214.171.124
[0:0] -A POSTROUTING -o eth0 -j SNAT --to-source 192.168.1.253
We’ve configured iptables to add a mark to traffic depending on which WAN interface it is going in or out of. But this only marks the packets; there is no logic yet to make sure that packets with the same mark use the same route. To make this happen we use ip rules.
First create three new routing tables by editing /etc/iproute2/rt_tables. I’ve added this to the bottom:
1 plusnet
2 talktalk
3 loadbal
Now we add a default route to the first two of those tables:
ip route add default via $PPP_GATEWAY_ADDRESS dev ppp1 src 126.96.36.199 table plusnet
ip route add default via 192.168.1.254 dev eth0 src 192.168.1.253 table talktalk
$PPP_GATEWAY_ADDRESS is set when the PPP session is established and can change each time the session comes up. We can look at ways to find that address later, but for now just substitute the “P-t-P” IP address from “ifconfig ppp1” (or whatever your PPP interface number is) or, in the case of an ISP-provided router, the LAN-side IP of that router.
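One possible way to find the gateway address without reading it off by hand is to parse the output of the ip tool, which shows the far end of a point-to-point link after the word "peer". This is only a sketch: the function name ppp_peer_address is hypothetical, and it assumes the usual "inet LOCAL peer REMOTE/32" layout of that output.

```shell
# Read `ip -4 addr show dev pppN` output on stdin and print the peer
# (point-to-point) address, without its /prefix suffix.
ppp_peer_address() {
    awk '$1 == "inet" {
        for (i = 2; i < NF; i++) {
            if ($i == "peer") {
                split($(i + 1), parts, "/")
                print parts[1]
                exit
            }
        }
    }'
}

# Typical use (ppp1 is the interface name used in this post):
#   PPP_GATEWAY_ADDRESS=$(ip -4 addr show dev ppp1 | ppp_peer_address)
```

Another option is pppd's ip-up hook script, which receives the local and remote addresses as its fourth and fifth arguments each time the link comes up.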
This is simply creating a routing table with the name of the ISP that will be used and a default route which can find its way to the internet for that ISP.
Next we create the loadbal routing table which is a combination of the previous two:
ip route add default table loadbal nexthop via $PPP_GATEWAY_ADDRESS dev ppp1 nexthop via 192.168.1.254 dev eth0
which is the same idea as we used in the old route caching days, a round-robin route which flicks between the two available routes to the internet.
We’ve now created the iptables entries to track and mark traffic from each of the two ISPs and add some basic firewalling and IP masquerading. We’ve also created a routing table for each ISP and a load-balancing table which splits the traffic between the two ISPs.
Now we need to create some rules to govern which of the routing tables is used for a particular connection. The commands to do this are:
ip rule add from $PPP_IPADDR table plusnet pref 40000
ip rule add from 192.168.1.253 table talktalk pref 40100
ip rule add fwmark 0x1 table plusnet pref 40200
ip rule add fwmark 0x2 table talktalk pref 40300
ip rule add from 0/0 table loadbal pref 40400
The rules are matched in numerical order based on preference and once a rule matches that’s it. The first two rules make sure that traffic sourced from the Pi’s own address on each WAN link uses the matching table.
The important rules are the last three. Traffic which has been marked “1” will always use the plusnet routing table, traffic marked as “2” will always use the talktalk routing table. This ensures that all traffic which is part of an on-going conversation will always use the same router out to the internet, and so always come from the same IP address.
The last rule is only reached by traffic which is not already marked, i.e. new conversations. This routing table, as can be seen in the previous section, has a multi-path route to balance traffic between the two routes out. Once a conversation is established the iptables conntrack rules will mark the traffic and so one of the two fwmark rules will match.
Now delete the main default route so that the above rules don’t get bypassed with a route in the “main” table:
ip route del default
And that’s it. You should now have a router which splits the traffic fairly evenly across two internet connections and keeps tabs on which packets should go out of which routers. I’ve had this running for a month or so now, and it seems to be working fine. I’ve had the Pi lock up a couple of times, but I think that’s related to the USB gigabit ethernet adapter.
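To get a rough idea of how evenly connections are being split, one option is to count tracked connections per mark. The sketch below assumes the conntrack command-line tool is installed and that its listing includes a "mark=N" field on each line. The helper name count_marks is made up.

```shell
# Read `conntrack -L` style lines on stdin and print one
# "mark=N count" line per connection mark.
count_marks() {
    # The leading space keeps "secmark=" from matching.
    grep -o ' mark=[0-9]*' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Typical use (needs root):
#   conntrack -L 2>/dev/null | count_marks
```

With the marks used in this post, mark=1 would be connections pinned to the Plusnet line and mark=2 those pinned to TalkTalk.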
Smart Netflix hacks
Services such as unblock-us allow you to work around some geographic content blocks by acting as your DNS server and replying with the IP address of, say, the US based Netflix server instead of the UK ones. I’ve installed dnsmasq on my Pi as well and configured it to use the Unblock DNS servers instead of my ISP or Google servers. The clients on the LAN get their network configuration over DHCP from the Pi which sets the DNS server address for the clients to the Pi itself which then handles DNS lookups using the Unblock servers upstream. This works really well for most Netflix clients but I was having a lot of problems getting the Chromecast to work with Netflix and Unblock US.
It turns out that Google have hard-coded its own DNS servers into the Chromecast and so your local DNS settings are ignored. Nice one Google.
Because we’re using a Linux box as our router we can do this:
iptables -t nat -A PREROUTING -s <Netflix Client IP>/32 -d 8.8.8.8 -p udp --dport 53 -j DNAT --to <Alternative DNS Server IP Address>
iptables -t nat -A PREROUTING -s <Netflix Client IP>/32 -d 8.8.4.4 -p udp --dport 53 -j DNAT --to <Alternative DNS Server IP Address>
Using the NAT table we rewrite the DNS lookup bound for Google’s DNS servers to send it to our dnsmasq server instead. lol.
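DNS can also fall back to TCP for large responses, so it may be worth redirecting both protocols. The sketch below only prints the rules so they can be reviewed before being run. The function name dns_redirect_rules and the example addresses are assumptions for illustration, and 8.8.8.8 and 8.8.4.4 are Google's public DNS servers.

```shell
# Print DNAT rules that send a client's lookups to Google's public DNS
# servers to another DNS server instead. Nothing is applied here.
#   $1 = client IP address, $2 = DNS server to redirect to
dns_redirect_rules() {
    client="$1"
    target="$2"
    for dns in 8.8.8.8 8.8.4.4; do
        for proto in udp tcp; do
            echo "iptables -t nat -A PREROUTING -s $client/32 -d $dns -p $proto --dport 53 -j DNAT --to-destination $target"
        done
    done
}
```

For example, `dns_redirect_rules 192.168.0.50 192.168.0.1` prints four rules, which can be pasted into a root shell once they look right.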
Spreading interrupts across cores
Network cards have queues for tx and rx. Higher end cards will typically have more queues, but on the Pi the on-board NIC (which is actually connected via USB) has one for tx and one for rx, as do the VLAN interfaces and the PPP interfaces. Each of these queues has a CPU affinity and it seems that by default the queues all use the same CPU core.
When downloading an ISO with BitTorrent and the load-balancing set up I was able to achieve just over 10 MBytes a second. But the Pi became really unresponsive. Looking at top showed one CPU core maxed out in soft interrupts:
By adjusting the CPU affinity to spread these IRQs across multiple CPUs I squeezed out a tiny bit more network throughput, but more usefully the Pi remained responsive under heavy load:
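The exact commands used are not shown in this copy of the post. One way to spread receive processing over several cores, which may or may not be what was done here, is Receive Packet Steering: each rx queue has an rps_cpus file in sysfs that takes a hexadecimal CPU bitmask. The function names below are made up for this sketch.

```shell
# Hex bitmask with the lowest N bits set, e.g. 4 CPUs -> f.
cpu_mask() {
    printf '%x\n' $(( (1 << $1) - 1 ))
}

# Allow every rx queue of a device to be processed on all CPUs.
# Needs root. $1 = interface name, e.g. eth0
spread_rx_queues() {
    dev="$1"
    mask=$(cpu_mask "$(nproc)")
    for queue in /sys/class/net/"$dev"/queues/rx-*; do
        echo "$mask" > "$queue/rps_cpus"
    done
}
```

On the four-core Pi 2 the mask would be f, and the function would need calling once per interface (eth0, eth1, the VLAN interface and so on).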
As part of my ever expanding home automation system I wanted to use MQTT to publish data on my network. With the release of the Raspberry Pi 2 I can run Ubuntu Core to create a reliable, secure and easily updated server which is a perfect fit for requirements of an MQTT broker and general HA controller. I asked some Ubuntu friends to help me package Mosquitto as a Snap, and in return I would write down how we did it. Here’s the story…
In summary, a Snappy application is secure because it’s wrapped with AppArmor. It’s easier to install and upgrade because everything is packaged in a single file and installed to a single location. That location is backed up before you install a new version, and so if the installation goes wrong you can revert to the previous version easily by copying the original files back (or rather, Snappy will do all of that for you). Simplifying things slightly, there are two types of Snappy “application”: Apps and Frameworks. Frameworks can extend the OS and provide a mediation layer to access shared resources. Apps are your more traditional top-level items which can use the provided frameworks, or bundle everything they need into their Snap. This makes things much easier for app providers because they are now in charge – they can be assured that no library will change underneath them. This is a huge benefit!
Then launch the virtual machine. This command port forwards 8022 on your local machine to 22 on the virtual machine, so you can SSH to port 8022 on localhost and actually connect to the Ubuntu Core machine. It gives the Core machine 512 MB of RAM, nicely achievable on a modest budget (the Pi 2 has 1 GB). We also forward port 1883 from the host to the VM, which will allow us to connect to the Mosquitto server on our VM once it’s all installed.
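The launch command itself is missing from this copy of the post. A rough equivalent using QEMU's user-mode networking with the same memory size and port forwards might look like the sketch below. The function name core_vm_command and the image file name are placeholders, not taken from the original.

```shell
# Print a QEMU command line matching the description above:
# 512 MB of RAM, host port 8022 -> guest 22, host port 1883 -> guest 1883.
#   $1 = path to the Ubuntu Core disk image (placeholder)
core_vm_command() {
    image="$1"
    echo "qemu-system-x86_64 -enable-kvm -m 512" \
         "-netdev user,id=net0,hostfwd=tcp::8022-:22,hostfwd=tcp::1883-:1883" \
         "-device e1000,netdev=net0 -hda $image"
}
```

Printing the command rather than running it makes it easy to check first. Once the VM is up, `ssh -p 8022 <user>@localhost` should reach the guest.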
Time to build Mosquitto. Before you run the commands below, a bit of background information. The cmake line will force cmake to install the binaries to the location specified with CMAKE_INSTALL_PREFIX, rather than /usr/local. This is required to bundle all of the binaries and other files into the “install” directory we created above, making it possible to package as a Snap.
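The build commands are not preserved in this copy, so what follows is a sketch of what they probably looked like rather than the original. It assumes a "build" directory sitting next to the "install" directory inside the Mosquitto source tree. The function name build_mosquitto is made up.

```shell
# Configure, build and install Mosquitto into ./install.
# Run from the top of the Mosquitto source tree (assumed layout).
build_mosquitto() {
    (
        mkdir -p build install &&
        cd build &&
        # Install under ../install instead of the default /usr/local
        cmake -DCMAKE_INSTALL_PREFIX=../install .. &&
        # Use one make job per available processor core
        make -j"$(nproc)" &&
        make install
    )
}
```

The subshell keeps the `cd` from changing the caller's working directory.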
nproc spits out the number of processor cores you have, so the make line above will use as many processor cores as you have available. It’s not required, and for Mosquitto which is fairly small it’s not worth worrying about, but for a bigger job this is quite handy.
If you look in the “../install” directory you’ll see a familiar structure containing all the goodies needed by Mosquitto.
3. Find the libraries needed and copy them in to your Snappy project
Change in to the install/lib directory and use ldd to display the linked libraries for the two main .so files:
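The ldd commands are not shown in this copy of the post. As a sketch, the helper below (lib_names, a made-up name) reduces ldd output to a de-duplicated list of bare library file names, which is a convenient form to paste into the search script that follows.

```shell
# Read ldd output on stdin and print the unique library file names,
# dropping directory parts and the "file:" headers that ldd prints
# when given more than one file.
lib_names() {
    awk 'NF && $1 !~ /:$/ { print $1 }' | sed 's|.*/||' | sort -u
}

# Typical use from the install/lib directory:
#   ldd ./*.so* | lib_names
```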
Now, on the Ubuntu Core machine we can run this little script:
for i in $(cat); do find /lib /usr/lib -name "$i"; done
Copy the list from the previous command to the clipboard and then paste it in to terminal where this command is running and hit Ctrl-D to submit the list. The script will then search Ubuntu Core for the libraries required. If it finds them they will be displayed, if it doesn’t then they are not available in Ubuntu Core by default and will need to be included in your Snappy package.
linux-vdso is a virtual library that the Linux kernel itself provides to every process, so it is available on every Linux system by default and we don’t need to provide it.
libssl, libcrypto, libpthread, librt, libc and libdl are all available in Ubuntu Core by default – so we don’t need those either.
That leaves just libcares to be copied in to our package.
cp /usr/lib/x86_64-linux-gnu/libcares.so.2.1.0 .
We should already be in the ‘lib’ directory, hence the ‘.’ above. We are copying libcares in to the lib directory of our Snap, and when we run the Snap we will pass in the library path to make sure Mosquitto can find it. More on this later.
4. Add the meta data required for the Snappy package
Information about these fields and what they mean is available in the reference linked to above, but they are easily understandable. A comment on the name though: you need to append .<yournamespace>, where your namespace is the one you selected in your Ubuntu myapps account. One thing to mention: you can see that to start our Snap we are calling a shell script. This allows us to pass in extra options to Mosquitto when it runs.
Next we need to create a readme file:
This file needs to contain at least a couple of non-blank lines. Here’s what we put in it:
This is a Snappy package for Mosquitto MQTT broker.
Information about Mosquitto is available here: http://mosquitto.org/
Information about MQTT is available here: http://mqtt.org/
We also need to configure our Mosquitto server by editing the conf file. Most of the settings can be left as default, so we will create a new conf file with only the bits we need.
We need to change this to run as root. Since our Snap will be confined there is no risk here. I expect the ability to run as non-root users when using Snappy will be improved, but really it’s not necessary.
We also need to add a small shell script to start Mosquitto with the right options. Create a file in install/sbin called mosquitto.sh:
If you see an error about ImportError: No module named ‘click.repository’ then you likely have a clash between the Click library version in the SDK team PPA and the version in the Snappy PPA. This will be fixed soon, but in the meantime I would suggest installing ppa-purge via apt-get and then running sudo ppa-purge ppa:ubuntu-sdk-team/ppa.
If you see an error about “expected <block end>” in the package.yaml check the whitespace in the file. It’s likely a copy and paste error.
6. Install your Snappy package
Once you have your .snap file you can install it to your virtual machine like this:
We’ve built a Snappy package for amd64 (or whatever your native architecture is), but we really need to be cross-architecture to give people the best choice of platform on which to use the package. This involves cross compiling, which can be tricky to put it mildly.
I spoke to Alexander Sack, the Director of Ubuntu Core, and asked what was coming next for Snappy and I was very excited to hear about easier cross-compilation methods as well as a cool script to help automate gathering the libraries in to your package. I’ll find out more about these and follow up with another post about
A huge “Thank You!” to Saviq and Didrocks for doing the actual work and letting me watch.