On 2018-07-03 12:58:25 +0300, Roman Kagan wrote:
> On Mon, Jul 02, 2018 at 02:14:52PM -0700, si-wei liu wrote:
> > On 7/2/2018 9:14 AM, Roman Kagan wrote:
> > > On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
> > > > The patch set "Enable virtio_net to act as a standby for a passthru
> > > > device" [1] deals with live migration of guests that use passthrough
> > > > devices. However, that scheme uses the MAC address for pairing
> > > > the virtio device and the passthrough device. The thread "netvsc:
> > > > refactor notifier/event handling code to use the failover framework"
> > > > [2] discusses an alternate mechanism, such as using a UUID, for pairing
> > > > the devices. Based on that discussion, proposals "Add "Group Identifier"
> > > > to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
> > > > store pairing information..." [4] were made.
> > > > 
> > > > The current patch set includes all the feedback received for proposals [3]
> > > > and [4]. For the sake of completeness, patch for the virtio specification
> > > > is also included here. Following is the updated proposal.
> > > > 
> > > > 1. Extend the virtio specification to include a new virtio PCI capability
> > > >     "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> > > > 
> > > > 2. Enhance the QEMU CLI to include a "failover-group-id" option to the
> > > >     virtio device. The "failover-group-id" is a 64 bit value.
> > > > 
> > > > 3. Enhance the QEMU CLI to include a "failover-group-id" option to the
> > > >     Red Hat PCI bridge device (support for the i440FX model).
> > > > 
> > > > 4. Add a new "pcie-downstream" device, with the option
> > > >     "failover-group-id" (support for the Q35 model).
> > > > 
> > > > 5. The operator creates a 64 bit unique identifier, failover-group-id.
> > > > 
> > > > 6. When the virtio device is created, the operator uses the
> > > >     "failover-group-id" option (for example, '-device
> > > >     virtio-net-pci,failover-group-id=<identifier>') and specifies the
> > > >     failover-group-id created in step 5.
> > > > 
> > > >     QEMU stores the failover-group-id in the virtio device's configuration
> > > >     space in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> > > > 
> > > > 7. When assigning a PCI device to the guest in passthrough mode, the
> > > >     operator first creates a bridge using the "failover-group-id" option
> > > >     (for example, '-device pcie-downstream,failover-group-id=<identifier>')
> > > >     to specify the failover-group-id created in step 5, and then attaches
> > > >     the passthrough device to the bridge.
> > > > 
> > > >     QEMU stores the failover-group-id in the configuration space of the
> > > >     bridge as Vendor-Specific capability (0x09). The "Vendor" here is
> > > >     not to be confused with a specific organization. Instead, the vendor
> > > >     of the bridge is QEMU.
> > > > 
> > > > 8. Patch 4 in patch series "Enable virtio_net to act as a standby for
> > > >     a passthru device" [1] needs to be modified to use the UUID values
> > > >     present in the bridge's configuration space and the virtio device's
> > > >     configuration space instead of the MAC address for pairing the devices.
> > > I'm still missing a few bits in the overall scheme.
> > > 
> > > Is the guest supposed to acknowledge the support for PT-PV failover?
> > 
> > Yes. We are leveraging virtio's feature negotiation mechanism for that.
> > Guest which does not acknowledge the support will not have PT plugged in.
> > 
> > > Should the PT device be visible to the guest before it acknowledges the
> > > support for failover?
> > No. QEMU will only expose PT device after guest acknowledges the support
> > through virtio's feature negotiation.
> > 
> > >    How is this supposed to work with legacy guests that don't support it?
> > Only PV device will be exposed on legacy guest.
> 
> So how is this coordination going to work?  One possibility is that the
> PV device emits a QMP event upon the guest driver confirming the support
> for failover, the management layer intercepts the event and performs
> device_add of the PT device.  Another is that the PT device is added
> from the very beginning (e.g. on the QEMU command line) but its parent
> PCI bridge subscribes a callback with the PV device to "activate" the PT
> device upon negotiating the failover feature.
> 
> I think this needs to be decided within the scope of this patchset.
> 
> > > Is the guest supposed to signal the datapath switch to the host?
> > No, guest doesn't need to be initiating datapath switch at all.
> 
> What happens if the guest supports failover in its PV driver, but lacks
> the driver for the PT device?
> 
> > However, QMP
> > events may be generated when exposing or hiding the PT device through hot
> > plug/unplug to facilitate host to switch datapath.
> 
> The PT device hot plug/unplug are initiated by the host, aren't they?  Why
> would it also need QMP events for them?
> 
> > > Is the scheme going to be applied/extended to other transports (vmbus,
> > > virtio-ccw, etc.)?
> > > Well, it depends on the use case, and on how feasibly it can be extended
> > > to other transports given their constraints and specifics.
> > 
> > > Is the failover group concept going to be used beyond PT-PV network
> > > device failover?
> > Although the concept of failover group is generic, the implementation itself
> > may vary.
> 
> My point with these two questions is that since this patchset is
> defining external interfaces -- with guest OS, with management layer --

This patch set does not define any external interfaces. All it does is
provide the means and locations to store the "group identifier". How
that info will be used, I thought, should be another patch set.
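
To illustrate what "means and locations" could amount to, here is a rough
Python sketch of a 64-bit group identifier stored in a Vendor-Specific
capability (0x09) and read back for comparison. The byte layout, the format
string, and the helper names (pack_group_id_cap, unpack_group_id_cap,
same_group) are assumptions made for this example only; they are not the
layout used by the patches or by the virtio specification.

```python
import struct

PCI_CAP_ID_VNDR = 0x09  # Vendor-Specific capability ID, as named in the proposal

# Hypothetical layout (an assumption, not taken from the patches):
# cap ID, next pointer, cap length, one pad byte, 64-bit little-endian group id.
CAP_FMT = "<BBBxQ"
CAP_LEN = struct.calcsize(CAP_FMT)  # 12 bytes with this layout


def pack_group_id_cap(group_id, next_ptr=0):
    """Build the raw capability bytes that would hold the group identifier."""
    return struct.pack(CAP_FMT, PCI_CAP_ID_VNDR, next_ptr, CAP_LEN, group_id)


def unpack_group_id_cap(raw):
    """Read the group identifier back out of raw capability bytes."""
    cap_id, _next_ptr, cap_len, group_id = struct.unpack(CAP_FMT, raw[:CAP_LEN])
    if cap_id != PCI_CAP_ID_VNDR or cap_len != CAP_LEN:
        raise ValueError("not a group-id capability")
    return group_id


def same_group(virtio_cap, bridge_cap):
    """True when the virtio device and the bridge carry the same identifier."""
    return unpack_group_id_cap(virtio_cap) == unpack_group_id_cap(bridge_cap)
```

The sketch covers only storing and comparing the identifier. When the
comparison happens and what follows from a match is the part left to a
separate patch set.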

Venu

> which are not easy to change later, it might make sense to try and see
> if the interfaces map to other usecases.  E.g. I think we can get enough
> information on how Hyper-V handles PT-PV network device failover from
> the current Linux implementation; it may be a good idea to share some
> concepts and workflows with virtio-pci.
> 
> Thanks,
> Roman.
