KVM networking configuration refactoring

Current state and shortcomings

Currently, Ganeti uses KVM's built-in per-NIC “script=” option to configure tap
devices. This approach generally works, but it has a number of shortcomings:

 a) The tap interface name is not known beforehand; furthermore, the script
    does not know which NIC it is called for, which rules out passing the whole
    instance configuration as environment variables to KVM and from there to
    the script. Thus, Ganeti has to write an ifup script per instance NIC to
    disk, which among other things means that /tmp cannot be mounted noexec.

 b) There is no easy way (other than getting it from the script) for the
    admin to know which tap device(s) are attached to a running KVM instance.

 c) Currently no action can be run during an instance's networking shutdown;
    using KVM's downscript support with either the “pool” or “user” security
    model is probably pointless, as the downscript will be run as an
    unprivileged user.

 d) Perhaps the most significant issue is that live migration of routed
    instances is broken: KVM configures the network interface as soon as it
    is spawned, even in “incoming” mode. This is harmless for bridged
    instances, but for routed instances it means that the instance's IP
    addresses are announced (either directly through e.g. an IGP, or indirectly
    through proxy-ARP/proxy-NDP) by two nodes during the whole migration
    process, causing possible network disruption for the migrated instance.

Refactoring proposal

We propose moving KVM's network configuration to Ganeti by taking advantage of
QEMU/KVM's ability to receive an open tap device as a file descriptor. For
this to be possible, minor modifications are required to utils.RunCmd, as well
as additional methods to open and initialize tap devices from within Ganeti.
The current default script can be preserved as a system-wide
/etc/ganeti/kvm-vif-bridge script which will be run directly for each tap
device with the same environment as before. For freshly-started instances, the
network is configured as soon as KVM starts; for incoming instances, it is
configured during FinalizeMigration, after a successful migration.
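
To illustrate the mechanism, here is a minimal sketch of opening a tap device
and handing its file descriptor to QEMU/KVM. It is not the code from the
patches: the helper names (build_ifreq, open_tap, tap_netdev_arg) and the
“net0” id are hypothetical, the ioctl number is the x86 value from
<linux/if_tun.h>, and the “-netdev tap,fd=” form is assumed to be available
in the QEMU version in use. Calling open_tap() requires CAP_NET_ADMIN.

```python
import fcntl
import os
import struct

# Constants from <linux/if_tun.h>; TUNSETIFF is the x86/x86_64 value and
# differs on some other architectures.
TUNSETIFF = 0x400454CA
IFF_TAP = 0x0002
IFF_NO_PI = 0x1000
IFNAMSIZ = 16


def build_ifreq(name="", flags=IFF_TAP | IFF_NO_PI):
    """Pack the leading part of a struct ifreq: 16-byte name + short flags.

    An empty name asks the kernel to pick one (tap0, tap1, ...).
    """
    raw = name.encode("ascii")
    if len(raw) >= IFNAMSIZ:
        raise ValueError("interface name too long: %r" % name)
    return struct.pack("16sH", raw, flags)


def open_tap(name=""):
    """Open and initialize a tap device; return (ifname, fd). Needs root."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    try:
        res = fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name))
    except OSError:
        os.close(fd)
        raise
    # The kernel writes the name it actually assigned back into the struct.
    ifname = res[:IFNAMSIZ].rstrip(b"\0").decode("ascii")
    return ifname, fd


def tap_netdev_arg(fd, netdev_id="net0"):
    """Build the QEMU arguments for a tap backend using an already-open fd."""
    return ["-netdev", "tap,id=%s,fd=%d" % (netdev_id, fd)]
```

With this shape, the caller knows the tap name before KVM is spawned, so the
configuration script can be run at whatever point is appropriate (right after
startup, or only once an incoming migration has succeeded). The fd must remain
open in the child process for KVM to be able to use it.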

What's missing

There is room for some unit tests (checking e.g. RunCmd's “noclose_fds=”
behaviour). Also, possible extensions of this approach would include making
network configuration more modular, using run-parts on e.g.
/etc/ganeti/kvm-if-up.d, /etc/ganeti/kvm-if-pre-up.d etc., as well as modifying
LUQueryInstanceData to report an instance's tap devices to the caller.
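
As a rough idea of what such a test could check, the sketch below uses the
standard library's subprocess module (its pass_fds argument) as a stand-in
for RunCmd's “noclose_fds=”; it does not call Ganeti's utils.RunCmd, and the
names run_with_fds and check_fd_retained are hypothetical. A child process is
asked to write to a pipe fd, which should succeed only if that fd was kept
open across the exec.

```python
import os
import subprocess
import sys


def run_with_fds(cmd, keep_fds=()):
    """Run cmd with all fds above stderr closed, except those in keep_fds."""
    proc = subprocess.run(cmd, close_fds=True, pass_fds=tuple(keep_fds),
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.returncode


def check_fd_retained(retain):
    """Return (exit code, data read) for a child writing to a pipe fd."""
    rfd, wfd = os.pipe()
    try:
        code = "import os, sys; os.write(int(sys.argv[1]), b'ok')"
        keep = (wfd,) if retain else ()
        rc = run_with_fds([sys.executable, "-c", code, str(wfd)], keep)
    finally:
        # Close our write end so the read below sees EOF if nothing was written.
        os.close(wfd)
    try:
        data = os.read(rfd, 2)
    finally:
        os.close(rfd)
    return rc, data
```

When the fd is retained, the child's write reaches the parent; when it is not,
the child fails with a bad file descriptor error and the parent reads nothing.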

Thanks

PS: I have tested the following patches and they work for me. Perhaps 
they need some polishing but I chose to submit them early to get some 
feedback as I will be away for the rest of the week.

Apollon Oikonomopoulos (3):
  KVM: Add auxiliary functions to handle tap devices
  Add ability to retain specified fds open in RunCmd
  KVM: Perform network configuration in Ganeti

 lib/hypervisor/hv_kvm.py      |  278 ++++++++++++++++++++++++-----------------
 lib/utils.py                  |   45 ++++++--
 test/ganeti.utils_unittest.py |    2 +-
 3 files changed, 198 insertions(+), 127 deletions(-)
