On Thu, 2016-01-28 at 02:25 +0530, Kirti Wankhede wrote:
> 
> On 1/27/2016 9:30 PM, Alex Williamson wrote:
> > On Wed, 2016-01-27 at 13:36 +0530, Kirti Wankhede wrote:
> > > 
> > > On 1/27/2016 1:36 AM, Alex Williamson wrote:
> > > > On Tue, 2016-01-26 at 02:20 -0800, Neo Jia wrote:
> > > > > On Mon, Jan 25, 2016 at 09:45:14PM +0000, Tian, Kevin wrote:
> > > > > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > > 
> > > > > Hi Alex, Kevin and Jike,
> > > > > 
> > > > > (Seems I shouldn't use attachment, resend it again to the list, 
> > > > > patches are
> > > > > inline at the end)
> > > > > 
> > > > > Thanks for adding me to this technical discussion, a great opportunity
> > > > > for us to design together which can bring both Intel and NVIDIA vGPU 
> > > > > solution to
> > > > > KVM platform.
> > > > > 
> > > > > Instead of directly jumping to the proposal that we have been working 
> > > > > on
> > > > > recently for NVIDIA vGPU on KVM, I think it is better for me to put 
> > > > > out couple
> > > > > quick comments / thoughts regarding the existing discussions on this 
> > > > > thread as
> > > > > fundamentally I think we are solving the same problem, DMA, interrupt 
> > > > > and MMIO.
> > > > > 
> > > > > Then we can look at what we have, hopefully we can reach some 
> > > > > consensus soon.
> > > > > 
> > > > > > Yes, and since you're creating and destroying the vgpu here, this is
> > > > > > where I'd expect a struct device to be created and added to an IOMMU
> > > > > > group.  The lifecycle management should really include links between
> > > > > > the vGPU and physical GPU, which would be much, much easier to do 
> > > > > > with
> > > > > > struct devices create here rather than at the point where we start
> > > > > > doing vfio "stuff".
> > > > > 
> > > > > Infact to keep vfio-vgpu to be more generic, vgpu device creation and 
> > > > > management
> > > > > can be centralized and done in vfio-vgpu. That also include adding to 
> > > > > IOMMU
> > > > > group and VFIO group.
> > > > Is this really a good idea?  The concept of a vgpu is not unique to
> > > > vfio, we want vfio to be a driver for a vgpu, not an integral part of
> > > > the lifecycle of a vgpu.  That certainly doesn't exclude adding
> > > > infrastructure to make lifecycle management of a vgpu more consistent
> > > > between drivers, but it should be done independently of vfio.  I'll go
> > > > back to the SR-IOV model, vfio is often used with SR-IOV VFs, but vfio
> > > > does not create the VF, that's done in coordination with the PF making
> > > > use of some PCI infrastructure for consistency between drivers.
> > > > 
> > > > It seems like we need to take more advantage of the class and driver
> > > > core support to perhaps setup a vgpu bus and class with vfio-vgpu just
> > > > being a driver for those devices.
> > > 
> > > For device passthrough or SR-IOV model, PCI devices are created by PCI
> > > bus driver and from the probe routine each device is added in vfio group.
> > 
> > An SR-IOV VF is created by the PF driver using standard interfaces
> > provided by the PCI core.  The IOMMU group for a VF is added by the
> > IOMMU driver when the device is created on the pci_bus_type.  The probe
> > routine of the vfio bus driver (vfio-pci) is what adds the device into
> > the vfio group.
> > 
> > > For vgpu, there should be a common module that create vgpu device, say
> > > vgpu module, add vgpu device to an IOMMU group and then add it to vfio
> > > group.  This module can handle management of vgpus. Advantage of keeping
> > > this module a separate module than doing device creation in vendor
> > > modules is to have generic interface for vgpu management, for example,
> > > files /sys/class/vgpu/vgpu_start and  /sys/class/vgpu/vgpu_shudown and
> > > vgpu driver registration interface.
> > 
> > But you're suggesting something very different from the SR-IOV model.
> > If we wanted to mimic that model, the GPU specific driver should create
> > the vgpu using services provided by a common interface.  For instance
> > i915 could call a new vgpu_device_create() which creates the device,
> > adds it to the vgpu class, etc.  That vgpu device should not be assumed
> > to be used with vfio though, that should happen via a separate probe
> > using a vfio-vgpu driver.  It's that vfio bus driver that will add the
> > device to a vfio group.
> > 
> 
> In that case vgpu driver should provide a driver registration interface 
> to register vfio-vgpu driver.
> 
> struct vgpu_driver {
>       const char *name;
>       int (*probe) (struct vgpu_device *vdev);
>       void (*remove) (struct vgpu_device *vdev);
> }
> 
> int vgpu_register_driver(struct vgpu_driver *driver)
> {
> ...
> }
> EXPORT_SYMBOL(vgpu_register_driver);
> 
> int vgpu_unregister_driver(struct vgpu_driver *driver)
> {
> ...
> }
> EXPORT_SYMBOL(vgpu_unregister_driver);
> 
> vfio-vgpu driver registers to vgpu driver. Then from 
> vgpu_device_create(), after creating the device it calls 
> vgpu_driver->probe(vgpu_device) and vfio-vgpu driver adds the device to 
> vfio group.
> 
> +--------------+    vgpu_register_driver()+---------------+
> >     __init() +------------------------->+               |
> >              |                          |               |
> >              +<-------------------------+    vgpu.ko    |
> > vfio_vgpu.ko |   probe()/remove()       |               |
> >              |                +---------+               +---------+
> +--------------+                |         +-------+-------+         |
>                                  |                 ^                 |
>                                  | callback        |                 |
>                                  |         +-------+--------+        |
>                                  |         |vgpu_register_device()   |
>                                  |         |                |        |
>                                  +---^-----+-----+    +-----+------+-+
>                                      | nvidia.ko |    |  i915.ko   |
>                                      |           |    |            |
>                                      +-----------+    +------------+
> 
> Is my understanding correct?

We have an entire driver core subsystem in Linux for the purpose of
matching devices to drivers, I don't think we should be re-inventing
that.  That's why I'm suggesting that we should have infrastructure
which facilitates GPU drivers to create vGPU devices in a common way,
perhaps even placing the devices on a virtual vgpu bus, and then allow a
vfio-vgpu driver to register as a driver for devices of that bus/class
and use the existing driver callbacks.  Thanks,

Alex


Reply via email to