* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Wed, May 13, 2015 at 10:00:42AM +0100, Dr. David Alan Gilbert wrote:
> > * Peter Krempa (pkre...@redhat.com) wrote:
> > > On Wed, May 13, 2015 at 09:40:23 +0100, Dr. David Alan Gilbert wrote:
> > > > * Peter Krempa (pkre...@redhat.com) wrote:
> > > > > On Wed, May 13, 2015 at 09:08:39 +0100, Dr. David Alan Gilbert wrote:
> > > > > > * Peter Krempa (pkre...@redhat.com) wrote:
> > > > > > > On Wed, May 13, 2015 at 11:36:26 +0800, Chen Fan wrote:
> > > > > > > > > My main goal is to add support for migration with host NIC
> > > > > > > > > passthrough devices while keeping network connectivity.
> > > > > > > > > 
> > > > > > > > > This patch series is based on Shradha's patches at
> > > > > > > > > https://www.redhat.com/archives/libvir-list/2012-November/msg01324.html
> > > > > > > > > which add migration support for host passthrough devices.
> > > > > > > > 
> > > > > > > >  1) unplug the ephemeral devices before migration
> > > > > > > > 
> > > > > > > >  2) do native migration
> > > > > > > > 
> > > > > > > >  3) when migration finished, hotplug the ephemeral devices
> > > > > > > 
> > > > > > > > IMHO this algorithm is something that an upper layer management app
> > > > > > > > should do. The device unplug operation is complex and it might not
> > > > > > > > succeed which will make the current migration thread hang or fail in an
> > > > > > > > intermediate state that will not be recoverable.
> > > > > > 
> > > > > > > However you wouldn't want each of the upper layer management apps implementing
> > > > > > > their own hacks for this; so something somewhere needs to standardise
> > > > > > > what the guest sees.
> > > > > 
> > > > > > The guest will still see a PCI device unplug request and will have to
> > > > > > respond to it, then will be paused and after resume a new PCI device
> > > > > > will appear. This is standardised. The nonstandardised part (which can't
> > > > > > really be standardised) is how the bonding or other guest-dependent
> > > > > > stuff will be handled, but that is up to the guest OS to handle.
> > > > 
> > > > > Why can't that be standardised? Don't we need to provide the information
> > > > > on what to bond to the guest and that this process is happening? The previous
> > > > > suggestion was to use guest-agent for this.
> > > 
> > > Well, since only in linux you've got multiple ways to do that including
> > > legacy init scripts on various distros, the systemd-networkd thingie or
> > > how it's called or network manager, standardising this part won't be
> > > that easy. Not speaking of possible different OSes.
> > 
> > Right - so we need to standardise on the messaging we send to the guest to
> > tell it that we've got this bonded hotplug setup, and then the different
> > OSs can implement what they need off using that information.
> > 
> > > > > > From libvirt's perspective this is only something that will trigger the
> > > > > > device unplug and plug the devices back. And there are a lot of issues
> > > > > > here:
> > > > > 
> > > > > 1) the destination of the migration might not have the desired devices
> > > > > 
> > > > > >     This will trigger a lot of problems as we will not be able to guarantee
> > > > > >     that the devices reappear on the destination and if we'd wanted to check
> > > > > >     we'd need a new migration protocol AFAIK.
> > > > 
> > > > > But if it's using the bonding trick then that isn't fatal; it would still
> > > > > be able to have the bonded virtio device.
> > > > 
> > > > > > 2) The guest OS might refuse to detach the PCI device (it might be stuck
> > > > > > before PCI code is loaded)
> > > > > 
> > > > > >     In that case the migration will be stuck forever and abort attempts
> > > > > >     will make the domain state basically undefined depending on the
> > > > > >     phase where it failed.
> > > > > 
> > > > > > Since we can't guarantee that the unplug of the PCI host devices will be
> > > > > > atomic or that it will succeed we basically can't guarantee in any way
> > > > > > in which state the VM will end up later after (a possibly failed)
> > > > > > migration. To recover such state there are too many options that could be
> > > > > > desired by the user that would be hard to implement in a way that would
> > > > > > be flexible enough.
> > > > 
> > > > > I don't understand why this is any different to any other PCI device hot-unplug.
> > > 
> > > It's the same, but once libvirt would be doing multiple PCI unplug
> > > requests along with the migration code, things might not go well. If you
> > > then couple this with different user expectations what should happen in
> > > various error cases it gets even more messy.
> > 
> > Well, since we've got the bond it shouldn't get quite that bad;  the
> > error cases don't sound that bad:
> >    1) If we can't hot-unplug then we don't migrate/cancel migration.
> >       We warn the user, if we're unlucky we're left running on the bond.
> >    2) If we can't hot-plug at the end, then we've still got the bond in,
> >       so the guest carries on running (albeit with reduced performance).
> >       We need to flag this to the user somehow.
> 
> If there are multiple PCI devices attached to the guest, we may end up
> with some PCI devices removed and some still present, and some for which
> we don't know if they are removed or present at all as the guest may simply
> not have responded to us yet. Further there are devices which are not just
> bonded NICs, so I'm really not happy for us to design a policy that works
> for bonded NICs but which is quite possibly going to be useless for other
> types of PCI device people will inevitably want to deal with later.

This is only trying to address the problem for devices that can have
the equivalent of a bond; so it's not NIC specific; the same should work for
storage devices with multipath.
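
To make the flow and the two error cases discussed above concrete, here is a
rough sketch in Python. It is only a model, not libvirt or QEMU code: the
Guest class, the Outcome names and the unplug/migrate/plug callables are all
hypothetical stand-ins, and it assumes every listed device has a fallback
path (a bond or multipath leg) that keeps the guest working while the
passthrough device is absent.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    MIGRATED_FULL = "migrated, passthrough devices restored"
    MIGRATED_DEGRADED = "migrated, guest left on the fallback path"
    NOT_MIGRATED = "migration cancelled, guest stays on the source"


@dataclass
class Guest:
    # Hypothetical model: passthrough devices currently plugged in,
    # plus warnings that would have to be surfaced to the user.
    plugged: set = field(default_factory=set)
    warnings: list = field(default_factory=list)


def _replug(guest, devices, plug):
    """Try to hotplug each device; a failure leaves the guest on the fallback."""
    for dev in devices:
        if plug(dev):
            guest.plugged.add(dev)
        else:
            guest.warnings.append(f"{dev} not restored; guest stays on fallback")


def migrate_with_ephemeral(guest, devices, unplug, migrate, plug):
    """unplug(dev), migrate() and plug(dev) are caller-supplied callables
    that return True on success (assumed interfaces, for illustration)."""
    removed = []
    for dev in devices:
        if not unplug(dev):
            # Error case 1: cannot hot-unplug, so do not migrate. Put back
            # whatever was already removed; if that fails too, the guest is
            # left running on the fallback path.
            guest.warnings.append(f"could not unplug {dev}; migration cancelled")
            _replug(guest, removed, plug)
            return Outcome.NOT_MIGRATED
        guest.plugged.discard(dev)
        removed.append(dev)

    if not migrate():
        guest.warnings.append("migration failed; restoring devices on source")
        _replug(guest, removed, plug)
        return Outcome.NOT_MIGRATED

    # Error case 2: hotplug on the destination may fail; the guest carries
    # on with reduced performance and the user is warned.
    _replug(guest, removed, plug)
    if all(dev in guest.plugged for dev in removed):
        return Outcome.MIGRATED_FULL
    return Outcome.MIGRATED_DEGRADED
```

Note that the sketch treats an unplug as a synchronous yes/no answer. It does
not model the case Daniel raises, where the guest has simply not responded
yet and the state of a device is unknown.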

Dave

> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
