On Wed, Jan 15, 2014 at 01:13:19PM +0200, Vangelis Koukis wrote:
> On Tue, Jan 14, 2014 at 06:01:22pm +0100, Jose A. Lopes wrote:
> > > A simple scenario is:
> > > a) snf-nfdhcpd starts. Upon initialization, it creates an NFQUEUE
> > >    (e.g., 42, configurable), and listens on it for incoming DHCP
> > >    requests. It also begins to watch its state directory,
> > >    /var/lib/nfdhcpd via inotify().
> > > b) A new VM gets created, let's assume its NIC has address mac0,
> > >    lives on TAP interface tap0, and is to receive IP address ip0
> > >    via DHCP.
> > > c) Someone (e.g., a Ganeti KVM ifup script, or in our case
> > >    snf-network, see http://code.grnet.gr/projects/snf-network)
> > >    creates a new binding file informing snf-nfdhcpd that it is to
> > >    reply to DHCP requests from MAC mac0 on TAP interface tap0, and
> > >    include IP ip0 in the DHCP reply.
> > > d) The administrator injects snf-nfdhcpd in the processing pipeline
> > >    for packets coming from tap0, using iptables. This can happen
> > >    for every TAP interface, e.g.:
> > >    # iptables -t mangle -A PREROUTING -i tap+ -m udp -p udp --dport 67 -j NFQUEUE --queue-num 42
> >
> > We decided to use the DHCP server to avoid iptables, given that
> > existing installations probably already have complicated iptables
> > setups. This way, we can avoid adding to the confusion. By the way,
> > avoiding iptables was actually a request from Apollon.
> >
>
> Hello Jose,
>
> When did the discussion on avoiding iptables take place in
> ganeti-devel?
>
> I'm not sure I have all the context, so I can't really comment on this.
> Could you provide me with pointers to the discussion, so I can be in
> sync?

I believe it was an offline discussion during GanetiCon.

> In previous mails, we focused on whether Ganeti or the administrator
> would be responsible for handling the special network interface used by
> the communication mechanism. You explained that Ganeti was going to do
> everything necessary on its own (bringing up/down the interface,
> starting/stopping the DHCP server, updating its state as VMs go up and
> down etc.).
>
> So, Ganeti implements a specific *policy*, and is itself responsible for
> setting it up properly.
>
> For standard NICs, it is the administrator who is responsible for
> setting up networking: All Ganeti does is run the admin-provided ifup
> scripts, and it is the administrator's responsibility to set up routing
> rules, NAT, proxy ARP, firewalling, or whatever else.
>
> The question is:
> Ganeti needs a special network interface per VM to implement the
> communication mechanism. This interface has specific policy applied to
> it: On a specific TAP tap0, only a specific VM with MAC mac0 may
> obtain IP ip0 and communicate with the host by exchanging IP packets
> with 169.254.169.254:80. Since the admin is not involved now, and no
> hooks are provided, isn't it Ganeti's responsibility to set up
> iptables-based firewalling accordingly?
>
> Ganeti is setting up new IP interfaces on the *host*, which communicate
> directly with untrusted VMs. Who is going to guarantee that the only
> service reachable by the VMs is the HTTP server needed for host<->VM
> traffic as the Ganeti communication mechanism prescribes?

Skip the next answer and below.

> Who is going to guarantee that VM1 cannot get VM0's IP address or
> MAC?

Answered in the new version of the design doc.

> If Ganeti is going to be totally responsible for enforcing policy on
> these TAP interfaces (which makes sense), then it should go all the way,
> and this includes iptables.
> Otherwise, the host will be exposed directly
> to the VMs, via interfaces about which the administrator doesn't
> really know anything, and this could have important security
> implications.

Perhaps a suitable compromise would be the following: Ganeti will do
all the configuration and routing as mentioned in the design doc but
will not configure iptables, for the reasons I have listed before;
however, it will provide the ifup hooks to allow users to customize
iptables, if they want to. What do you think?

> > If we were to use iptables, we could probably get by by throwing away
> > the DHCP server and using NAT instead to replace any source IP address
> > of incoming packets from the TAP interfaces with a unique IP address
> > within the host.
> >
>
> Please see above.
> It doesn't matter if you do NAT or not, the issue is isolating the VMs
> and making sure the host is properly firewalled on these TAP interfaces.
>
> > Is there a way to configure snf-nfdhcpd without iptables?
> >
>
> Not that I know of currently. The alternative (as dnsmasq or any DHCP
> server would do) is to have the host use an interface with full IP
> networking, which has its own set of potential problems.
>
> Finally, can you please comment on the issues raised on the other branch
> of this thread? Specifically, how will the current dnsmasq-based
> approach support dynamic updates of the state of the DHCP server,
> and how will it enforce MAC address <-> TAP interface pairs?

Comment sent!

Thanks,
Jose

> Thank you,
> Vangelis.
>
> > >    or for individual TAP interfaces.
> > > e) From now on, whenever a DHCP request is sent out by the VM, the
> > >    iptables rule will forward the packet to nfdhcpd, which will consult
> > >    its bindings database, find the entry for tap0, verify the source MAC,
> > >    and inject a DHCP reply for the corresponding IP address into tap0.
> > >
> > > This has various advantages compared to dnsmasq or similar servers:
> > >
> > > a) The DHCP service can be activated dynamically, per-interface,
> > >    by manipulating iptables accordingly. There is no need to restart
> > >    the daemon, or edit (potentially read-only) configuration files,
> > >    you only need to drop a file under /var/lib/nfdhcpd.
> > >
> > > b) There is no interference with existing DHCP servers listening on
> > >    port 67. Everything happens directly via NFQUEUE.
> > >
> > > c) The host doesn't even need to have an IP address on the interfaces
> > >    where DHCP replies are served, making it invisible to the VMs. This
> > >    may be beneficial from a security perspective. Similarly, it doesn't
> > >    matter if the TAP interface is bridged or routed.
> > >
> > > d) MAC addresses are bound to TAP interfaces. Requests coming from
> > >    unrelated TAP interfaces are ignored, and packet processing happens
> > >    as if snf-nfdhcpd didn't exist in the first place.
> > >
> > > e) snf-nfdhcpd is written in pure Python and uses scapy for packet
> > >    processing. This has proved super-useful when trying to troubleshoot
> > >    networking problems in production.
> > >
> > > Example snf-nfdhcpd binding file:
> > > A binding file in snf-nfdhcpd's state directory is named after the
> > > physical interface where the daemon is to receive incoming DHCP
> > > requests, and defines at least the following variables:
> > >
> > > * INDEV: The logical interface where the packet is received. For
> > >   bridged setups, the bridge interface, e.g., br0. Otherwise, same as
> > >   the file name.
> > > * MAC: The MAC address where the DHCP request must be originating from
> > > * IP: The IPv4 address to be returned in DHCP replies
> > > * SUBNET: The IPv4 subnet to be returned in DHCP replies in CIDR form
> > > * GATEWAY: The IPv4 gateway to be returned in DHCP replies
> > >
> > > Please see package snf-network for an example Ganeti KVM ifup script
> > > which exercises all of the above, plus IPv6 autoconfiguration:
> > >
> > > https://code.grnet.gr/projects/snf-network/repository/revisions/master/entry/common.sh#L153
> > >
> > > We are in the process of integrating the above description as
> > > documentation for snf-nfdhcpd.
> > >
> > > Looking forward to your comments,
> > >
> > > Thanks,
> > > Vangelis.
> > >
> > >
> > --
> > Jose Antonio Lopes
> > Ganeti Engineering
> > Google Germany GmbH
> > Dienerstr. 12, 80331, München
> >
> > Registergericht und -nummer: Hamburg, HRB 86891
> > Sitz der Gesellschaft: Hamburg
> > Geschäftsführer: Graham Law, Christine Elizabeth Flores
> > Steuernummer: 48/725/00206
> > Umsatzsteueridentifikationsnummer: DE813741370
>
> --
> Vangelis Koukis
> [email protected]
> OpenPGP public key ID:
> pub  1024D/1D038E97 2003-07-13 Vangelis Koukis <[email protected]>
>      Key fingerprint = C5CD E02E 2C78 7C10 8A00  53D8 FBFC 3799 1D03 8E97
>
> Only those who will risk going too far
> can possibly find out how far one can go.
>         -- T.S. Eliot

--
Jose Antonio Lopes
Ganeti Engineering
Google Germany GmbH
Dienerstr. 12, 80331, München

Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Steuernummer: 48/725/00206
Umsatzsteueridentifikationsnummer: DE813741370
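
For reference, the binding lookup described in the quoted scenario (find the entry for the TAP interface, verify the source MAC, return the bound IP) could look roughly like the Python sketch below. This is only an illustration, not snf-nfdhcpd's actual code. It assumes the binding file holds simple KEY=VALUE lines, which the thread does not confirm, and the names Binding, parse_binding and lookup_reply_ip are made up for the example. The NFQUEUE and scapy handling is left out entirely.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Binding:
    """One binding file, named after the TAP interface (hypothetical model)."""
    tap: str
    indev: str
    mac: str
    ip: ipaddress.IPv4Address
    subnet: ipaddress.IPv4Network
    gateway: Optional[ipaddress.IPv4Address]


def parse_binding(tap: str, text: str) -> Binding:
    """Parse the assumed KEY=VALUE format; blank and '#' lines are skipped."""
    fields: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip().upper()] = value.strip().strip("'\"")
    gateway = fields.get("GATEWAY")
    return Binding(
        tap=tap,
        # INDEV defaults to the file name, as for non-bridged setups.
        indev=fields.get("INDEV", tap),
        mac=fields["MAC"].lower(),
        ip=ipaddress.IPv4Address(fields["IP"]),
        # strict=False tolerates host bits being set in the CIDR value.
        subnet=ipaddress.IPv4Network(fields["SUBNET"], strict=False),
        gateway=ipaddress.IPv4Address(gateway) if gateway else None,
    )


def lookup_reply_ip(
    bindings: Dict[str, Binding], tap: str, src_mac: str
) -> Optional[ipaddress.IPv4Address]:
    """Return the IP to offer, or None if the request should be ignored."""
    binding = bindings.get(tap)
    if binding is None or binding.mac != src_mac.lower():
        return None
    return binding.ip
```

With a binding for tap0 loaded, a request from the bound MAC on tap0 yields the bound IP, while a request from any other MAC, or from an interface without a binding file, yields None and would be left alone.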
