On Thu, Apr 23, 2015 at 12:35:28PM -0400, Laine Stump wrote:
> On 04/22/2015 01:20 PM, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (berra...@redhat.com) wrote:
> >> On Wed, Apr 22, 2015 at 06:12:25PM +0100, Dr. David Alan Gilbert wrote:
> >>> * Daniel P. Berrange (berra...@redhat.com) wrote:
> >>>> On Wed, Apr 22, 2015 at 06:01:56PM +0100, Dr. David Alan Gilbert wrote:
> >>>>> * Daniel P. Berrange (berra...@redhat.com) wrote:
> >>>>>> On Fri, Apr 17, 2015 at 04:53:02PM +0800, Chen Fan wrote:
> >>>>>>> Background:
> >>>>>>> Live migration is one of the most important features of
> >>>>>>> virtualization technology, and for recent virtualization
> >>>>>>> workloads the performance of network I/O is critical.
> >>>>>>> Current network I/O virtualization (e.g. para-virtualized I/O,
> >>>>>>> VMDq) has a significant performance gap compared with native
> >>>>>>> network I/O. Pass-through network devices have near-native
> >>>>>>> performance; however, they have thus far prevented live
> >>>>>>> migration. No existing method solves the problem of live
> >>>>>>> migration with pass-through devices perfectly.
> >>>>>>>
> >>>>>>> There was an idea to solve the problem, described at:
> >>>>>>> https://www.kernel.org/doc/ols/2008/ols2008v2-pages-261-267.pdf
> >>>>>>> Please refer to the above document for detailed information.
> >>>>>>>
> >>>>>>> So I think this problem could perhaps be solved by combining
> >>>>>>> existing technologies. The following are the steps we are
> >>>>>>> considering implementing:
> >>>>>>>
> >>>>>>> -  before booting the VM, we specify two NICs in the XML for
> >>>>>>>    creating a bonding device (one passed-through and one
> >>>>>>>    virtual NIC). Here we can specify the NICs' MAC addresses
> >>>>>>>    in the XML, which makes it easier for qemu-guest-agent to
> >>>>>>>    find the network interfaces in the guest.
> >>>>>>>
> >>>>>>> -  when qemu-guest-agent starts up in the guest, it sends a
> >>>>>>>    notification to libvirt; libvirt then calls the previously
> >>>>>>>    registered initialization callbacks. Through those callback
> >>>>>>>    functions we can create the bonding device according to the
> >>>>>>>    XML configuration. Here we use the netcf tool, which makes
> >>>>>>>    it easy to create the bonding device.
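(For concreteness, a sketch of what such a declaration might look like. The elements below are standard libvirt domain XML; the MAC addresses and PCI address are made up, and pairing the two NICs into a bond is the proposal here, not an existing feature:)

```xml
<!-- hypothetical sketch: one passed-through VF and one virtio NIC,
     each given a known MAC address so the guest agent can find them -->
<interface type='hostdev' managed='yes'>
  <mac address='52:54:00:6d:90:01'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x07' slot='0x10' function='0x0'/>
  </source>
</interface>
<interface type='network'>
  <mac address='52:54:00:6d:90:02'/>
  <source network='default'/>
  <model type='virtio'/>
</interface>
```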
> >>>>>> I'm not really clear on why libvirt/guest agent needs to be
> >>>>>> involved in this. I think configuration of networking is really
> >>>>>> something that must be left to the guest OS admin to control. I
> >>>>>> don't think the guest agent should be trying to reconfigure
> >>>>>> guest networking itself, as that is inevitably going to conflict
> >>>>>> with configuration attempted by things in the guest like
> >>>>>> NetworkManager or systemd-networkd.
> >>>>>>
> >>>>>> IOW, if you want to do this setup where the guest is given
> >>>>>> multiple NICs connected to the same host LAN, then I think we
> >>>>>> should just let the guest admin configure bonding in whatever
> >>>>>> manner they decide is best for their OS install.
> >>>>> I disagree; there should be a way for the admin not to have to
> >>>>> do this manually; however, it should interact well with existing
> >>>>> management stuff.
> >>>>>
> >>>>> At the simplest, something that marks the two NICs in a
> >>>>> discoverable way, so that it can be seen that they're part of a
> >>>>> set; with just that ID system, an installer or setup tool could
> >>>>> notice them and offer to put them into a bond automatically. I'd
> >>>>> assume it would be possible to add a rule somewhere saying that
> >>>>> anything with the same ID is automatically added to the bond.
> >>>> I didn't mean the admin would literally configure stuff manually. I 
> >>>> really
> >>>> just meant that the guest OS itself should decide how it is done, whether
> >>>> NetworkManager magically does the right thing, or the person building the
> >>>> cloud disk image provides a magic udev rule, or $something else. I just
> >>>> don't think that the QEMU guest agent should be involved, as that will
> >>>> definitely trample all over other things that manage networking in the
> >>>> guest.
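(For illustration, the "magic udev rule" approach might look roughly like this. The rule syntax is standard udev; the MAC address and the enslaving script are entirely hypothetical:)

```
# /etc/udev/rules.d/99-auto-bond.rules (hypothetical)
# When a NIC with the agreed-upon MAC appears, hand it to a script
# that enslaves it to bond0.
SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="52:54:00:6d:90:01", \
    RUN+="/usr/local/sbin/enslave-to-bond %k bond0"
```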
> >>> OK, good, that's about the same level I was at.
> >>>
> >>>> I could see this being solved in the cloud disk images by using
> >>>> cloud-init metadata to mark the NICs as being in a set, or perhaps there
> >>>> is some magic you could define in SMBIOS tables, or something else again.
> >>>> A cloud-init based solution wouldn't need any QEMU work, but an SMBIOS
> >>>> solution might.
> >>> Would either of these work with hotplug though?  I guess as the VM
> >>> starts off with the pair of NICs, then when you remove one and add
> >>> it back after migration you don't need any more information added;
> >>> so yes, cloud-init or SMBIOS would do it.  (I was thinking of
> >>> SMBIOS in the way that it provides the device/slot numbering that
> >>> NIC naming is sometimes based on.)
> >>>
> >>> What about if we hot-add a new NIC later on (not during migration);
> >>> a normal hot-add of a NIC now turns into a hot-add of two new NICs; how
> >>> do we pass the information at hot-add time to provide that?
> >> Hmm, yes, actually hotplug would be a problem with that.
> >>
> >> An even simpler idea would be to keep things really dumb and simply
> >> use the same MAC address for both NICs. Once you put them in a bond
> >> device, the kernel will be copying the MAC address of the first NIC
> >> into the second NIC anyway, so unless I'm missing something, we might
> >> as well just use the same MAC address for both right away. That makes
> >> it easy for guest to discover NICs in the same set and works with
> >> hotplug trivially.
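(The discovery the guest would have to do in that case is trivial; a minimal sketch, with a hypothetical helper that groups interfaces by the MAC address as it would be read from /sys/class/net/<ifname>/address on Linux:)

```python
from collections import defaultdict

def group_nics_by_mac(nic_macs):
    """Group interface names that share a MAC address.

    nic_macs: dict mapping interface name -> MAC address string,
    e.g. as read from /sys/class/net/<ifname>/address on Linux.
    Returns a dict mapping MAC -> sorted list of interface names.
    """
    groups = defaultdict(list)
    for ifname, mac in nic_macs.items():
        groups[mac.lower()].append(ifname)
    return {mac: sorted(names) for mac, names in groups.items()}

# Example: the virtio NIC and the VF were given the same MAC by the
# host, so they fall into one set; an unrelated NIC stays separate.
sets_by_mac = group_nics_by_mac({
    "eth0": "52:54:00:6d:90:01",   # virtio (emulated)
    "eth1": "52:54:00:6D:90:01",   # passed-through VF, same MAC
    "eth2": "52:54:00:aa:bb:cc",   # unrelated NIC
})
print(sets_by_mac["52:54:00:6d:90:01"])  # -> ['eth0', 'eth1']
```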
> > I bet you need to distinguish the two NICs though; you'd want the
> > bond to send all the traffic through the real NIC during normal use.
> > And how does the guest know, when it sees the hotplug of the 1st NIC
> > in the pair, that this is a special NIC and that it's about to see
> > its sibling arrive?
> 
> Yeah, there needs to be *some way* for the guest OS to differentiate
> between the emulated NIC (which will be operational all the time, but
> only used during migration when the passed-through NIC is missing) and
> the passed-through NIC (which should be preferred for all traffic when
> it is present). The simplest method of differentiating would be for the
> admin who configures it to know the MAC address. Another way could be
> [some bit of magic I don't know how to do] that sets the bonding config
> based on which driver is used for the NIC (the emulated NIC will almost
> certainly be virtio, and the passed-through will be igbvf, ixgbevf, or
> similar).
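(The driver-based differentiation could be sketched like this. The helper is hypothetical; on Linux the driver name for an interface can be read from the /sys/class/net/<ifname>/device/driver symlink:)

```python
def pick_primary(nic_drivers, emulated_drivers=("virtio_net",)):
    """Pick the bond's preferred (primary) slave by driver name.

    nic_drivers: dict mapping interface name -> kernel driver name.
    Any NIC whose driver is not an emulated one (virtio) is assumed
    to be the passed-through VF and is preferred; if none is present
    (e.g. while migrating), fall back to the emulated NIC.
    """
    passthrough = [name for name, drv in sorted(nic_drivers.items())
                   if drv not in emulated_drivers]
    if passthrough:
        return passthrough[0]
    # No VF present: use the emulated NIC as the active slave.
    return sorted(nic_drivers)[0]

# VF present: prefer it over the virtio NIC.
print(pick_primary({"eth0": "virtio_net", "eth1": "ixgbevf"}))  # -> eth1
# VF hot-removed for migration: only the virtio NIC remains.
print(pick_primary({"eth0": "virtio_net"}))  # -> eth0
```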

Why not supply this information using the qemu ga?

> A complicating factor with using MAC address to differentiate is that it
> isn't possible for the guest to modify the MAC address of a
> passed-through SRIOV VF - the only way that could be done would be for
> the guest to notify the host, then the host could use an RTM_SETLINK
> message sent for the PF+VF# to change the MAC address, otherwise it is
> prohibited by the hardware.
> 
> Likewise (but at least technically possible to solve with current
> libvirt+qemu), the default configuration for a macvtap connection to an
> emulated guest ethernet device (which is probably what the "backup"
> device of the bond would be) doesn't pass any traffic once the guest has
> changed the MAC address of the emulated device - qemu does send an
> RX_FILTER_CHANGED event to libvirt, and if the interface's config has
> trustGuestRxFilters='yes', then and only then will libvirt modify the
> MAC address of the host side of the macvtap device.
> 
> Thinking about this more, it seems a bit problematic from a security
> point of view to allow the guest to arbitrarily change its MAC addresses
> just to support this, so maybe the requirement should be that the MAC
> addresses be set to the same value, and the guest config required to
> figure out which is the "preferred" and which is the "backup" by
> examining the driver used for the device.

That's an unrelated question.  Some people want to allow changing
the MAC, some don't. Don't use MAC addresses to identify devices,
and the problem will go away.

-- 
MST
