Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-08-08 Thread Gerd Hoffmann
On Fri, Jul 20, 2018 at 04:56:15AM +, Yuan, Hang wrote:
> Hi Gerd,
> 
> Could I ask about the status of the boot display support work? I'm interested
> in trying it in some real use cases.

https://git.kraxel.org/cgit/qemu/log/?h=sirius/ramfb-vfio

Most of the bits needed (general ramfb support) are merged upstream and will be
in 3.0.

Wiring up ramfb for vfio display devices is in the branch listed above
and should follow in 3.1.
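
For anyone who wants to experiment, a minimal sketch of launching a guest this
way (the 'display' and 'ramfb' vfio-pci property names are taken from the
branch above and may still change before merge; the OVMF path and mdev UUID
are placeholders):

  # Hypothetical launch of a vfio mdev display device with ramfb boot
  # display; property names from the sirius/ramfb-vfio branch.
  import subprocess

  mdev = "a297db4a-f4c2-11e6-90f6-d3b88d6c9525"   # placeholder mdev UUID
  subprocess.run([
      "qemu-system-x86_64",
      "-machine", "q35,accel=kvm", "-m", "4096",
      # ramfb boot display is expected to be UEFI-only, hence OVMF:
      "-drive", "if=pflash,format=raw,readonly=on,"
                "file=/usr/share/OVMF/OVMF_CODE.fd",
      "-device", "vfio-pci,sysfsdev=/sys/bus/mdev/devices/%s,"
                 "display=on,ramfb=on" % mdev,
      "-display", "gtk,gl=on",
  ], check=True)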

cheers,
  Gerd



Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-07-20 Thread Yuan, Hang
Hi Gerd,

Could I ask about the status of the boot display support work? I'm interested
in trying it in some real use cases.

Thanks,
Henry

> -----Original Message-----
> From: intel-gvt-dev [mailto:intel-gvt-dev-boun...@lists.freedesktop.org] On
> Behalf Of Gerd Hoffmann
> Sent: Monday, May 7, 2018 2:26 PM
> To: Alex Williamson 
> Cc: Neo Jia ; k...@vger.kernel.org; Erik Skultety
> ; libvirt ; Dr. David Alan
> Gilbert ; Zhang, Tina ; Kirti
> Wankhede ; Laine Stump ;
> Daniel P. Berrange ; Jiri Denemark
> ; intel-gvt-...@lists.freedesktop.org
> Subject: Re: Expose vfio device display/migration to libvirt and above, was
> Re: [PATCH 0/3] sample: vfio mdev display devices.
> 
>   Hi,
> 
> > This raises another question, is the configuration of the emulated
> > graphics a factor in handling the mdev device's display option?
> > AFAIK, neither vGPU vendor provides a VBIOS for boot graphics, so even
> > with a display option, we're mostly targeting a secondary graphics
> > head, otherwise the user will be running headless until the guest OS
> > drivers initialize.
> 
> Right now yes, no boot display for vgpu devices.  I'm trying to fix that with
> ramfb.  There are a bunch of rough edges still and details to be hashed out.
> It'll probably be UEFI only.
> 
> cheers,
>   Gerd



Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-05-10 Thread Alex Williamson
On Thu, 10 May 2018 13:00:29 +0200
Erik Skultety  wrote:

> ...
> 
> > > Now, if we (theoretically) can settle on easing the restrictions Alex
> > > has mentioned, we in fact could introduce a QMP command to probe
> > > these devices and provide libvirt with useful information at that
> > > point in time. Of course, since the 3rd party vendor is "de-coupled"
> > > from qemu, libvirt would have no way to find out that the driver has
> > > changed in the meantime, thus still using the old information we
> > > gathered, ergo potentially causing the QEMU process to fail
> > > eventually. But then again, there's very often a strong
> > > recommendation to reboot your host after a driver update, especially
> > > in NVIDIA's case, which means this fact wouldn't matter. However,
> > > there's also a significant drawback to my proposal which probably
> > > renders it completely useless (but we can continue from there...) and
> > > that is that the devices would either have to be present already (not an
> > > option) or QEMU would need to be enhanced in such a way that it would
> > > create a dummy device during QMP probing, open it, collect the
> > > information libvirt needs, close it and remove it. If the driver
> > > doesn't change in the meantime, this should be sufficient for a VM to
> > > be successfully instantiated with a display, right?  
> >
> > I don't think this last requirement is possible, QEMU is as clueless
> > about the capabilities of an mdev device as anyone else until that
> > device is opened and probed, so how would we invent this "dummy
> > device"?  I don't really see how there's any ability for
> > pre-determination of the device capabilities, we can only probe the
> > actual device we intend to use.  
> 
> Hmm, let's say libvirt is able to create mdevs. Do the vendor drivers impose
> any kind of limitations on whether a specific device-type or a specific
> instance of a type does or does not present certain features like display or
> migration in comparison to the other types/instances? IOW, I would assume that
> once the driver version supports display/migration, any mdev instance of any
> mdev type the driver supports will "inherit" the support for display/migration.
> If this assumption holds, libvirt, knowing there are some mdev-capable parent
> devices, could technically create a dummy instance of the first type it can for
> each parent device, passing the UUID to a qemu QMP query command; qemu would
> then open and probe the device, returning the capabilities, which libvirt would
> then cache. Next time a VM is due to start, libvirt can use the device UUID to
> check the capabilities we cached and try setting appropriate config options.
> However, as you've mentioned, this approach is fairly policy-driven, which
> doesn't align with libvirt's goal. Would such a suggestion help at all from
> QEMU's POV?

There is no guarantee that all mdevs are equal for a given vendor.  For
instance we know that the smallest vGPU instance for Intel is intended
for compute offload; it's configured with barely enough framebuffer and
screen resolution for a working desktop.  Does it necessarily make
sense that it would support all of the same capabilities as a more
desktop-focused mdev instance?  For that matter, can we necessarily
guarantee that all mdev types for a given parent device are the same
class of device?  For a GPU parent device we might have some VGA class
devices supporting a display and some 3D controllers which don't.  So I
think the operative word above is "assumption".  You can make whatever
assumptions you want, but they're only that, there's nothing that binds
the mdev vendor driver to those assumptions.
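
(For illustration: the standard mdev sysfs ABI already shows how much the
types under one parent can differ.  Below is a rough sketch that walks the
documented mdev_supported_types layout; note that nothing in these attributes
advertises display or migration support, which is exactly the gap being
discussed here.)

  import os

  MDEV_BUS = "/sys/class/mdev_bus"

  def read_attr(path):
      with open(path) as f:
          return f.read().strip()

  # Each parent device lists its own mdev types; name, device_api and
  # available_instances routinely differ between types of one parent.
  for parent in os.listdir(MDEV_BUS):
      types_dir = os.path.join(MDEV_BUS, parent, "mdev_supported_types")
      for mdev_type in sorted(os.listdir(types_dir)):
          tdir = os.path.join(types_dir, mdev_type)
          attrs = {}
          for attr in ("name", "device_api", "available_instances"):
              p = os.path.join(tdir, attr)
              if os.path.exists(p):
                  attrs[attr] = read_attr(p)
          print(parent, mdev_type, attrs)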

> > > > The above has pressed the need for investigating some sort of
> > > > alternative API through which libvirt might introspect a vfio device
> > > > and with vfio device migration on the horizon, it's natural that
> > > > some sort of support for migration state compatibility for the
> > > > device need be considered as a second user of such an API.
> > > > However, we currently have no concept of migration compatibility on
> > > > a per-device level as there are no migratable devices that live
> > > > outside of the QEMU code base. It's therefore assumed that per
> > > > device migration compatibility is encompassed by the versioned
> > > > machine type for the overall VM.  We need participation all the way
> > > > to the top of the VM management stack to resolve this issue and
> > > > it's dragging down the (possibly) simpler question of how we
> > > > resolve the display situation.  Therefore I'm looking for
> > > > alternatives for display that work within what we have available to
> > > > us at the moment.
> > > >
> > > > Erik Skultety, who initially raised the display question, has
> > > > identified one possible solution, which is to simply make the
> > > > display configuration the user's problem (apologies if I've
> > > > misinterpreted Erik).

Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-05-10 Thread Erik Skultety
...

> > Now, if we (theoretically) can settle on easing the restrictions Alex
> > has mentioned, we in fact could introduce a QMP command to probe
> > these devices and provide libvirt with useful information at that
> > point in time. Of course, since the 3rd party vendor is "de-coupled"
> > from qemu, libvirt would have no way to find out that the driver has
> > changed in the meantime, thus still using the old information we
> > gathered, ergo potentially causing the QEMU process to fail
> > eventually. But then again, there's very often a strong
> > recommendation to reboot your host after a driver update, especially
> > in NVIDIA's case, which means this fact wouldn't matter. However,
> > there's also a significant drawback to my proposal which probably
> > renders it completely useless (but we can continue from there...) and
> > that is that the devices would either have to be present already (not an
> > option) or QEMU would need to be enhanced in such a way that it would
> > create a dummy device during QMP probing, open it, collect the
> > information libvirt needs, close it and remove it. If the driver
> > doesn't change in the meantime, this should be sufficient for a VM to
> > be successfully instantiated with a display, right?
>
> I don't think this last requirement is possible, QEMU is as clueless
> about the capabilities of an mdev device as anyone else until that
> device is opened and probed, so how would we invent this "dummy
> device"?  I don't really see how there's any ability for
> pre-determination of the device capabilities, we can only probe the
> actual device we intend to use.

Hmm, let's say libvirt is able to create mdevs. Do the vendor drivers impose
any kind of limitations on whether a specific device-type or a specific
instance of a type does or does not present certain features like display or
migration in comparison to the other types/instances? IOW, I would assume that
once the driver version supports display/migration, any mdev instance of any
mdev type the driver supports will "inherit" the support for display/migration.
If this assumption holds, libvirt, knowing there are some mdev-capable parent
devices, could technically create a dummy instance of the first type it can for
each parent device, passing the UUID to a qemu QMP query command; qemu would
then open and probe the device, returning the capabilities, which libvirt would
then cache. Next time a VM is due to start, libvirt can use the device UUID to
check the capabilities we cached and try setting appropriate config options.
However, as you've mentioned, this approach is fairly policy-driven, which
doesn't align with libvirt's goal. Would such a suggestion help at all from
QEMU's POV?

>
> > > The above has pressed the need for investigating some sort of
> > > alternative API through which libvirt might introspect a vfio device
> > > and with vfio device migration on the horizon, it's natural that
> > > some sort of support for migration state compatibility for the
> > > device need be considered as a second user of such an API.
> > > However, we currently have no concept of migration compatibility on
> > > a per-device level as there are no migratable devices that live
> > > outside of the QEMU code base. It's therefore assumed that per
> > > device migration compatibility is encompassed by the versioned
> > > machine type for the overall VM.  We need participation all the way
> > > to the top of the VM management stack to resolve this issue and
> > > it's dragging down the (possibly) simpler question of how we
> > > resolve the display situation.  Therefore I'm looking for
> > > alternatives for display that work within what we have available to
> > > us at the moment.
> > >
> > > Erik Skultety, who initially raised the display question, has
> > > identified one possible solution, which is to simply make the
> > > display configuration the user's problem (apologies if I've
> > > misinterpreted Erik).  I believe this would work something like:
> > >
> > >  - libvirt identifies a version of QEMU that includes 'display'
> > > support for vfio-pci devices and defaults to adding display=off for
> > > every vfio-pci device [have we chosen the wrong default (auto) in
> > > QEMU?].
> >
> > From libvirt's POV, having a new XML attribute 'display' on the host
> > device type mdev should work, with a default value 'off', potentially
> > extending this to 'auto' once we have enough information to base our
> > decision on. We'll need to combine this with a new attribute value
> > for the <video> element that would prevent adding an emulated VGA any
> > time <graphics> (spice, VNC) is requested, but that's something we'd
> > need to do anyway, so I'm just mentioning it.
>
> This raises another question, is the configuration of the emulated
> graphics a factor in handling the mdev device's display option?
> AFAIK, neither vGPU vendor provides a VBIOS for boot graphics, so even

Good point, I forgot about the fact that we don't have boot graphics yet.

Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-05-07 Thread Gerd Hoffmann
  Hi,

> This raises another question, is the configuration of the emulated
graphics a factor in handling the mdev device's display option?
> AFAIK, neither vGPU vendor provides a VBIOS for boot graphics, so even
> with a display option, we're mostly targeting a secondary graphics
> head, otherwise the user will be running headless until the guest OS
> drivers initialize.

Right now yes, no boot display for vgpu devices.  I'm trying to fix that
with ramfb.  There are a bunch of rough edges still and details to be
hashed out.  It'll probably be UEFI only.

cheers,
  Gerd



Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-05-07 Thread Gerd Hoffmann
  Hi,

> Maybe some guiding questions:
> 
>  - Will dma-buf always require GL support?

Yes.

>  - Does GL support limit our ability to have a display over a remote
>connection?

Currently yes, although the plan is to support gl display remotely in
spice.  The workflow will be completely different though.  Non-gl spice
uses the classic display channel; the plan for gl spice is to feed the
dma-bufs into the gpu's video encoder and then send a video stream.

>  - Do region-based displays also work with GL support, even if not
>required?

Yes.  Any qemu display device works with a gl-enabled UI.

cheers,
  Gerd



Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-05-04 Thread Alex Williamson
On Fri, 4 May 2018 10:16:09 +0100
Daniel P. Berrangé  wrote:

> On Thu, May 03, 2018 at 12:58:00PM -0600, Alex Williamson wrote:
> > Hi,
> > 
> > The previous discussion hasn't produced results, so let's start over.
> > Here's the situation:
> > 
> >  - We currently have kernel and QEMU support for the QEMU vfio-pci
> >display option.
> > 
> >  - The default for this option is 'auto', so the device will attempt to
> >generate a display if the underlying device supports it, currently
> >only GVTg and some future release of NVIDIA vGPU (plus Gerd's
> >sample mdpy and mbochs).
> > 
> >  - The display option is implemented via two different mechanisms, a
> >vfio region (NVIDIA, mdpy) or a dma-buf (GVTg, mbochs).
> > 
> >  - Displays using dma-buf require OpenGL support, displays making
> >use of region support do not.
> > 
> >  - Enabling OpenGL support requires specific VM configurations, which
> >libvirt /may/ want to facilitate.
> > 
> >  - Probing display support for a given device is complicated by the
> >fact that GVTg and NVIDIA both impose requirements on the process
> >opening the device file descriptor through the vfio API:
> > 
> >- GVTg requires a KVM association or will fail to allow the device
> >  to be opened.
> > 
> >- NVIDIA requires that their vgpu-manager process can locate a UUID
> >  for the VM via the process commandline.
> > 
> >- These are both horrible impositions and prevent libvirt from
> >  simply probing the device itself.  
> 
> Agreed, these requirements are just horrific. Probing for features
> should not require this level of environmental setup. I can
> just about understand & accept how we ended up here, because this
> scenario is not one that was strongly considered when the first impls
> were being done. I don't think we should accept it as a long term
> requirement though.
> 
> > Erik Skultety, who initially raised the display question, has identified
> > one possible solution, which is to simply make the display configuration
> > the user's problem (apologies if I've misinterpreted Erik).  I believe
> > this would work something like:
> > 
> >  - libvirt identifies a version of QEMU that includes 'display' support
> >for vfio-pci devices and defaults to adding display=off for every
> >vfio-pci device [have we chosen the wrong default (auto) in QEMU?].
> > 
> >  - New XML support would allow a user to enable display support on the
> >vfio device.
> > 
> >  - Resolving any OpenGL dependencies of that change would be left to
> >the user.
> > 
> > A nice aspect of this is that policy decisions are left to the user and
> > clearly no interface changes are necessary, perhaps with the exception
> > of deciding whether we've made the wrong default choice for vfio-pci
> > devices in QEMU.  
> 
> Unless I'm misunderstanding, this isn't really a solution to the
> problem, rather it is us simply giving up and telling someone else
> to try to fix the problem. The 'user' here is not a human - it is
> simply the next level up in the mgmt stack, eg OpenStack or oVirt.
> If we can't solve it acceptably in libvirt code, I don't have much
> hope that OpenStack can solve it in their code, since they have
> even stronger need to automate everything.

But to solve this at any level other than the user suggests there is
one "right" answer to automatically configuring the device.  Is there?
If a device supports a display, does the user necessarily want to
enable it?  If there's a difference between enabling a display for a
local user or a remote user, is there any reasonable expectation that
we can automatically make that determination?

> > On the other hand, if we do want to give libvirt a mechanism to probe
> > the display support for a device, we can make a simplified QEMU
> > instance be the mechanism through which we do that.  For example the
> > script[1] can be provided with either a PCI device or sysfs path to an
> > mdev device and run a minimal VM instance meeting the requirements of
> > both GVTg and NVIDIA to report the display support and GL requirements
> > for a device.  There are clearly some unrefined and atrocious bits of
> > this script, but it's only a proof of concept, the process management
> > can be improved and we can decide whether we want to provide a qmp
> > mechanism to introspect the device rather than grep'ing error
> > messages.  The goal is simply to show that we could choose to embrace
> > QEMU and use it not as a VM, but simply a tool for poking at a device
> > given the restrictions the mdev vendor drivers have already imposed.  
> 
> Feels like a pretty heavyweight solution that just encourages the
> drivers to continue down the undesirable path they're already on,
> possibly making the situation even worse over time.

I'm not getting the impression that the vendor drivers are considering
a change, or necessarily can change.  The NVIDIA UUID requirement

Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-05-04 Thread Alex Williamson
On Fri, 4 May 2018 09:49:44 +0200
Erik Skultety  wrote:

> On Thu, May 03, 2018 at 12:58:00PM -0600, Alex Williamson wrote:
> > Hi,
> >
> > The previous discussion hasn't produced results, so let's start over.
> > Here's the situation:
> >
> >  - We currently have kernel and QEMU support for the QEMU vfio-pci
> >display option.
> >
> >  - The default for this option is 'auto', so the device will attempt to
> >generate a display if the underlying device supports it, currently
> >only GVTg and some future release of NVIDIA vGPU (plus Gerd's
> >sample mdpy and mbochs).
> >
> >  - The display option is implemented via two different mechanisms, a
> >vfio region (NVIDIA, mdpy) or a dma-buf (GVTg, mbochs).
> >
> >  - Displays using dma-buf require OpenGL support, displays making
> >use of region support do not.
> >
> >  - Enabling OpenGL support requires specific VM configurations, which
> >libvirt /may/ want to facilitate.
> >
> >  - Probing display support for a given device is complicated by the
> >fact that GVTg and NVIDIA both impose requirements on the process
> >opening the device file descriptor through the vfio API:
> >
> >- GVTg requires a KVM association or will fail to allow the device
> >  to be opened.  
> 
> How exactly is this association checked?

The intel_vgpu_open() callback for the mdev device registers a vfio
group notifier for VFIO_GROUP_NOTIFY_SET_KVM events. The KVM pointer is
already registered via the addition of the vfio group to the vfio-kvm
pseudo device, so the registration synchronously triggers the notifier
callback and the result is tested slightly later in the open path in
kvmgt_guest_init().
 
> >
> >- NVIDIA requires that their vgpu-manager process can locate a
> > UUID for the VM via the process commandline.
> >
> >- These are both horrible impositions and prevent libvirt from
> >  simply probing the device itself.  
> 
> So I feel like we're trying to solve a problem coming from one layer
> on a bunch of different layers, which inherently prevents us from
> producing a viable long-term solution without dragging in a significant
> amount of hacky, nasty code, and it is not the missing sysfs attributes
> I have in mind. Why does NVIDIA's vgpu-manager need to locate a UUID
> of a qemu VM? I assume that's to prevent multiple VM instances trying
> to use the same mdev device, in which case can't the vgpu-manager
> track references to how many "open" and "close" calls have been made

Hard to say, NVIDIA hasn't been terribly forthcoming about this
requirement, but probably not multiple users of the same mdev device
as that's already prevented through vfio in general.  Intel has
discussed that their requirement is to be able to track VM page table
updates so they can update their shadow tables, so effectively rather
than mediating interactions directly with the device, they're using a
KVM back channel to manage the DMA translation address space for the
device.

The flip side is that while these requirements are annoying and hard
for non-VM users to deal with, is there a next logical point in the
interaction with the vfio device where the vendor driver can reasonably
impose those requirements?  For instance, both vendors expose a
vfio-pci interface, so they could prevent the user driver from enabling
bus master in the PCI command register, but that's a fairly subtle
failure; typically drivers wouldn't even bother to read back after a
write to the bus master bit to see if it sticks, and this sort of
enabling is done by the guest, not the hypervisor.  There's really no
error path for a write to the device.

> to the same device? This is just from a layman's perspective, but it
> would allow the following:
> - when libvirt starts, it initializes all its drivers (let's
> focus on QEMU)
> - as part of this initialization, libvirt probes QEMU for
> capabilities and caches them in order to use them when spawning VMs
> 
> Now, if we (theoretically) can settle on easing the restrictions Alex
> has mentioned, we in fact could introduce a QMP command to probe
> these devices and provide libvirt with useful information at that
> point in time. Of course, since the 3rd party vendor is "de-coupled"
> from qemu, libvirt would have no way to find out that the driver has
> changed in the meantime, thus still using the old information we
> gathered, ergo potentially causing the QEMU process to fail
> eventually. But then again, there's very often a strong
> recommendation to reboot your host after a driver update, especially
> in NVIDIA's case, which means this fact wouldn't matter. However,
> there's also a significant drawback to my proposal which probably
> renders it completely useless (but we can continue from there...) and
> that is that the devices would either have to be present already (not an
> option) or QEMU would need to be enhanced in such a way that it would
> create a dummy device during QMP probing, open it, collect the
> information libvirt needs, close it and remove it.

Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-05-04 Thread Daniel P. Berrangé
On Thu, May 03, 2018 at 12:58:00PM -0600, Alex Williamson wrote:
> Hi,
> 
> The previous discussion hasn't produced results, so let's start over.
> Here's the situation:
> 
>  - We currently have kernel and QEMU support for the QEMU vfio-pci
>display option.
> 
>  - The default for this option is 'auto', so the device will attempt to
>generate a display if the underlying device supports it, currently
>only GVTg and some future release of NVIDIA vGPU (plus Gerd's
>sample mdpy and mbochs).
> 
>  - The display option is implemented via two different mechanisms, a
>vfio region (NVIDIA, mdpy) or a dma-buf (GVTg, mbochs).
> 
>  - Displays using dma-buf require OpenGL support, displays making
>use of region support do not.
> 
>  - Enabling OpenGL support requires specific VM configurations, which
>libvirt /may/ want to facilitate.
> 
>  - Probing display support for a given device is complicated by the
>fact that GVTg and NVIDIA both impose requirements on the process
>opening the device file descriptor through the vfio API:
> 
>- GVTg requires a KVM association or will fail to allow the device
>  to be opened.
> 
>- NVIDIA requires that their vgpu-manager process can locate a UUID
>  for the VM via the process commandline.
> 
>- These are both horrible impositions and prevent libvirt from
>  simply probing the device itself.

Agreed, these requirements are just horrific. Probing for features
should not require this level of environmental setup. I can
just about understand & accept how we ended up here, because this
scenario is not one that was strongly considered when the first impls
were being done. I don't think we should accept it as a long term
requirement though.

> Erik Skultety, who initially raised the display question, has identified
> one possible solution, which is to simply make the display configuration
> the user's problem (apologies if I've misinterpreted Erik).  I believe
> this would work something like:
> 
>  - libvirt identifies a version of QEMU that includes 'display' support
>for vfio-pci devices and defaults to adding display=off for every
>vfio-pci device [have we chosen the wrong default (auto) in QEMU?].
> 
>  - New XML support would allow a user to enable display support on the
>vfio device.
> 
>  - Resolving any OpenGL dependencies of that change would be left to
>the user.
> 
> A nice aspect of this is that policy decisions are left to the user and
> clearly no interface changes are necessary, perhaps with the exception
> of deciding whether we've made the wrong default choice for vfio-pci
> devices in QEMU.

Unless I'm misunderstanding, this isn't really a solution to the
problem, rather it is us simply giving up and telling someone else
to try to fix the problem. The 'user' here is not a human - it is
simply the next level up in the mgmt stack, eg OpenStack or oVirt.
If we can't solve it acceptably in libvirt code, I don't have much
hope that OpenStack can solve it in their code, since they have
even stronger need to automate everything.

> On the other hand, if we do want to give libvirt a mechanism to probe
> the display support for a device, we can make a simplified QEMU
> instance be the mechanism through which we do that.  For example the
> script[1] can be provided with either a PCI device or sysfs path to an
> mdev device and run a minimal VM instance meeting the requirements of
> both GVTg and NVIDIA to report the display support and GL requirements
> for a device.  There are clearly some unrefined and atrocious bits of
> this script, but it's only a proof of concept, the process management
> can be improved and we can decide whether we want to provide a qmp
> mechanism to introspect the device rather than grep'ing error
> messages.  The goal is simply to show that we could choose to embrace
> QEMU and use it not as a VM, but simply a tool for poking at a device
> given the restrictions the mdev vendor drivers have already imposed.

Feels like a pretty heavyweight solution that just encourages the
drivers to continue down the undesirable path they're already on,
possibly making the situation even worse over time.


Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|



Re: [libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-05-04 Thread Erik Skultety
On Thu, May 03, 2018 at 12:58:00PM -0600, Alex Williamson wrote:
> Hi,
>
> The previous discussion hasn't produced results, so let's start over.
> Here's the situation:
>
>  - We currently have kernel and QEMU support for the QEMU vfio-pci
>display option.
>
>  - The default for this option is 'auto', so the device will attempt to
>generate a display if the underlying device supports it, currently
>only GVTg and some future release of NVIDIA vGPU (plus Gerd's
>sample mdpy and mbochs).
>
>  - The display option is implemented via two different mechanisms, a
>vfio region (NVIDIA, mdpy) or a dma-buf (GVTg, mbochs).
>
>  - Displays using dma-buf require OpenGL support, displays making
>use of region support do not.
>
>  - Enabling OpenGL support requires specific VM configurations, which
>libvirt /may/ want to facilitate.
>
>  - Probing display support for a given device is complicated by the
>fact that GVTg and NVIDIA both impose requirements on the process
>opening the device file descriptor through the vfio API:
>
>- GVTg requires a KVM association or will fail to allow the device
>  to be opened.

How exactly is this association checked?

>
>- NVIDIA requires that their vgpu-manager process can locate a UUID
>  for the VM via the process commandline.
>
>- These are both horrible impositions and prevent libvirt from
>  simply probing the device itself.

So I feel like we're trying to solve a problem coming from one layer on a bunch
of different layers, which inherently prevents us from producing a viable long
term solution without dragging in a significant amount of hacky, nasty code, and it is
not the missing sysfs attributes I have in mind. Why does NVIDIA's vgpu-manager
need to locate a UUID of a qemu VM? I assume that's to prevent multiple VM
instances trying to use the same mdev device, in which case can't the
vgpu-manager track references to how many "open" and "close" calls have been
made to the same device? This is just from a layman's perspective, but it would
allow the following:
- when libvirt starts, it initializes all its drivers (let's focus on
  QEMU)
- as part of this initialization, libvirt probes QEMU for capabilities and
  caches them in order to use them when spawning VMs
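
(As a sketch of that probing step: libvirt could ask a scratch QEMU, started
with something like '-machine none -qmp unix:/tmp/qmp-probe.sock,server,nowait',
whether vfio-pci even has a 'display' property.  device-list-properties is an
existing QMP command; the socket path here is a placeholder.  Of course this
only says the property exists, not whether a concrete mdev can use it, which
is the gap discussed below.)

  import json, socket

  def qmp(f, cmd, args=None):
      # Send one QMP command and return the parsed one-line reply.
      msg = {"execute": cmd}
      if args:
          msg["arguments"] = args
      f.write(json.dumps(msg) + "\n"); f.flush()
      return json.loads(f.readline())

  s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
  s.connect("/tmp/qmp-probe.sock")        # placeholder QMP socket path
  f = s.makefile("rw")
  json.loads(f.readline())                # QMP greeting
  qmp(f, "qmp_capabilities")
  props = qmp(f, "device-list-properties", {"typename": "vfio-pci"})
  print("vfio-pci has 'display' property:",
        any(p["name"] == "display" for p in props.get("return", [])))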

Now, if we (theoretically) can settle on easing the restrictions Alex has
mentioned, we in fact could introduce a QMP command to probe these devices and
provide libvirt with useful information at that point in time. Of course, since
the 3rd party vendor is "de-coupled" from qemu, libvirt would have no way to
find out that the driver has changed in the meantime, thus still using the old
information we gathered, ergo potentially causing the QEMU process to fail
eventually. But then again, there's very often a strong recommendation to reboot
your host after a driver update, especially in NVIDIA's case, which means this
fact wouldn't matter. However, there's also a significant drawback to my
proposal which probably renders it completely useless (but we can continue from
there...), and that is that the devices would either have to be present already (not
an option) or QEMU would need to be enhanced in such a way that it would create a
dummy device during QMP probing, open it, collect the information libvirt
needs, close it and remove it. If the driver doesn't change in the meantime,
this should be sufficient for a VM to be successfully instantiated with a
display, right?

>
> The above has pressed the need for investigating some sort of
> alternative API through which libvirt might introspect a vfio device
> and with vfio device migration on the horizon, it's natural that some
> sort of support for migration state compatibility for the device need be
> considered as a second user of such an API.  However, we currently have
> no concept of migration compatibility on a per-device level as there
> are no migratable devices that live outside of the QEMU code base.
> It's therefore assumed that per device migration compatibility is
> encompassed by the versioned machine type for the overall VM.  We need
> participation all the way to the top of the VM management stack to
> resolve this issue and it's dragging down the (possibly) more simple
> question of how do we resolve the display situation.  Therefore I'm
> looking for alternatives for display that work within what we have
> available to us at the moment.
>
> Erik Skultety, who initially raised the display question, has identified
> one possible solution, which is to simply make the display configuration
> the user's problem (apologies if I've misinterpreted Erik).  I believe
> this would work something like:
>
>  - libvirt identifies a version of QEMU that includes 'display' support
>for vfio-pci devices and defaults to adding display=off for every
>vfio-pci device [have we chosen the wrong default (auto) in QEMU?].

From libvirt's POV, having a new XML attribute 'display' on the host device type
mdev should work, with a default value 'off', potentially extending this to 'auto'
once we have enough information to base our decision on. We'll need to combine
this with a new attribute value for the <video> element that would prevent adding
an emulated VGA any time <graphics> (spice, VNC) is requested, but that's
something we'd need to do anyway, so I'm just mentioning it.
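
To illustrate, the guest XML could end up looking something like the sketch
below (hypothetical: the 'display' attribute doesn't exist yet and its name is
only the proposal; the surrounding hostdev/mdev syntax is today's):

  # Render the proposed hostdev XML; 'display' is the new, hypothetical bit.
  import xml.etree.ElementTree as ET

  hostdev = ET.Element("hostdev", mode="subsystem", type="mdev",
                       model="vfio-pci", display="off")
  source = ET.SubElement(hostdev, "source")
  ET.SubElement(source, "address", uuid="a297db4a-f4c2-11e6-90f6-d3b88d6c9525")
  print(ET.tostring(hostdev, encoding="unicode"))
  # Prints (reformatted here for readability):
  #   <hostdev mode="subsystem" type="mdev" model="vfio-pci" display="off">
  #     <source><address uuid="a297db4a-..." /></source>
  #   </hostdev>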

[libvirt] Expose vfio device display/migration to libvirt and above, was Re: [PATCH 0/3] sample: vfio mdev display devices.

2018-05-03 Thread Alex Williamson
Hi,

The previous discussion hasn't produced results, so let's start over.
Here's the situation:

 - We currently have kernel and QEMU support for the QEMU vfio-pci
   display option.

 - The default for this option is 'auto', so the device will attempt to
   generate a display if the underlying device supports it, currently
   only GVTg and some future release of NVIDIA vGPU (plus Gerd's
   sample mdpy and mbochs).

 - The display option is implemented via two different mechanisms, a
   vfio region (NVIDIA, mdpy) or a dma-buf (GVTg, mbochs).

 - Displays using dma-buf require OpenGL support, displays making
   use of region support do not.

 - Enabling OpenGL support requires specific VM configurations, which
   libvirt /may/ want to facilitate.

 - Probing display support for a given device is complicated by the
   fact that GVTg and NVIDIA both impose requirements on the process
   opening the device file descriptor through the vfio API:

   - GVTg requires a KVM association or will fail to allow the device
 to be opened.

   - NVIDIA requires that their vgpu-manager process can locate a UUID
 for the VM via the process commandline.

   - These are both horrible impositions and prevent libvirt from
 simply probing the device itself.

The above has pressed the need for investigating some sort of
alternative API through which libvirt might introspect a vfio device
and with vfio device migration on the horizon, it's natural that some
sort of support for migration state compatibility for the device need be
considered as a second user of such an API.  However, we currently have
no concept of migration compatibility on a per-device level as there
are no migratable devices that live outside of the QEMU code base.
It's therefore assumed that per device migration compatibility is
encompassed by the versioned machine type for the overall VM.  We need
participation all the way to the top of the VM management stack to
resolve this issue and it's dragging down the (possibly) simpler
question of how we resolve the display situation.  Therefore I'm
looking for alternatives for display that work within what we have
available to us at the moment.

Erik Skultety, who initially raised the display question, has identified
one possible solution, which is to simply make the display configuration
the user's problem (apologies if I've misinterpreted Erik).  I believe
this would work something like:

 - libvirt identifies a version of QEMU that includes 'display' support
   for vfio-pci devices and defaults to adding display=off for every
   vfio-pci device [have we chosen the wrong default (auto) in QEMU?].

 - New XML support would allow a user to enable display support on the
   vfio device.

 - Resolving any OpenGL dependencies of that change would be left to
   the user.

A nice aspect of this is that policy decisions are left to the user and
clearly no interface changes are necessary, perhaps with the exception
of deciding whether we've made the wrong default choice for vfio-pci
devices in QEMU.

On the other hand, if we do want to give libvirt a mechanism to probe
the display support for a device, we can make a simplified QEMU
instance be the mechanism through which we do that.  For example the
script[1] can be provided with either a PCI device or sysfs path to an
mdev device and run a minimal VM instance meeting the requirements of
both GVTg and NVIDIA to report the display support and GL requirements
for a device.  There are clearly some unrefined and atrocious bits of
this script, but it's only a proof of concept, the process management
can be improved and we can decide whether we want to provide a qmp
mechanism to introspect the device rather than grep'ing error
messages.  The goal is simply to show that we could choose to embrace
QEMU and use it not as a VM, but simply a tool for poking at a device
given the restrictions the mdev vendor drivers have already imposed.
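
(Condensed, a probe along these lines -- the real proof of concept is the
script at [1]; the exact flags and the timeout heuristic below are my
simplifications, not the script's logic verbatim:)

  import subprocess, uuid

  def probe_display(mdev_sysfs_path):
      """True if QEMU accepts display=on for this mdev device."""
      cmd = [
          "qemu-system-x86_64",
          "-machine", "accel=kvm",          # KVM association satisfies GVTg
          "-uuid", str(uuid.uuid4()),       # cmdline UUID satisfies NVIDIA
          "-display", "none", "-nodefaults", "-S",
          "-device", "vfio-pci,sysfsdev=%s,display=on" % mdev_sysfs_path,
      ]
      try:
          # If the device rejects display=on, realize fails and QEMU exits
          # with an error; otherwise it sits paused until the timeout.
          subprocess.run(cmd, capture_output=True, text=True, timeout=10)
      except subprocess.TimeoutExpired:
          return True                       # still running: option accepted
      return False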

So I think the question bounces back to libvirt, does libvirt want
enough information about the display requirements for a given device to
automatically attempt to add GL support for it, effectively a policy of
'if it's supported try to enable it', or should we leave well enough
alone and let the user choose to enable it?

Maybe some guiding questions:

 - Will dma-buf always require GL support?

 - Does GL support limit our ability to have a display over a remote
   connection?

 - Do region-based displays also work with GL support, even if not
   required?

Furthermore, should QEMU vfio-pci flip the default to 'off' for
compatibility?  Thanks,

Alex

[1] https://gist.github.com/awilliam/2ccd31e85923ac8135694a7db2306646
