Re: [PATCH] util: basic support for vendor-specific vfio drivers
On 8/5/22 2:20 PM, Jason Gunthorpe wrote:
> On Fri, Aug 05, 2022 at 11:24:08AM -0600, Alex Williamson wrote:
> > On Thu, 4 Aug 2022 21:11:07 -0300 Jason Gunthorpe wrote:
> > > On Thu, Aug 04, 2022 at 01:36:24PM -0600, Alex Williamson wrote:
> > > > > > That is reasonable, but I'd say those three kernels only have two
> > > > > > drivers and they both have vfio as a substring in their name - so the
> > > > > > simple thing of just substring searching 'vfio' would get us over that
> > > > > > gap.
> > > > >
> > > > > Looking at the aliases for exactly "vfio_pci" isn't that much more
> > > > > complicated, and "feels" a lot more reliable than just doing a substring
> > > > > search for "vfio" in the driver's name. (It would be, uh, "not smart"
> > > > > to name a driver "vfio" if it wasn't actually a vfio variant driver
> > > > > (or the opposite), but I could imagine it happening; :-/)
> > >
> > > This is still pretty hacky. I'm worried about what happens to the
> > > kernel if this becomes some crazy unintended uAPI that we never really
> > > thought about carefully... This was not a use case when we designed
> > > the modules.alias stuff at least.
> > >
> > > BTW - why not do things the normal way?
> > >
> > > 1. readlink /sys/bus/pci/devices/XX/iommu_group
> > > 2. Compute basename of #1
> > > 3. Check if /dev/vfio/#2 exists (or /sys/class/vfio/#2)
> > >
> > > It has a small edge case where a multi-device group might give a false
> > > positive for an undrivered device, but for the purposes of libvirt
> > > that seems pretty obscure.. (while the above has false negative
> > > issues, obviously)
> >
> > This is not a small edge case, it's extremely common. We have a *lot*
> > of users assigning desktop GPUs and other consumer grade hardware, which
> > are usually multi-function devices without isolation exposed via ACS or
> > quirks.
>
> The edge case is that the user has created a multi-device group,
> manually assigned device 1 in the group to VFIO, left device 2 with no
> driver and then told libvirt to manually use device 2. With the above
> approach libvirt won't detect this misconfiguration and qemu will
> fail.

libvirt will see that there is no driver at all, and recognize that, by
definition, "no driver" == "not a vfio variant driver" :-). So in this
case libvirt catches the misconfiguration.

> > The vfio group exists if any devices in the group are bound to a vfio
> > driver, but the device is not accessible from the group unless the
> > viability test passes. That means QEMU may not be able to get access
> > to the device because the device we want isn't actually bound to a vfio
> > driver or another device in the group is not in a viable state. Thanks,
>
> This is a different misconfiguration that libvirt also won't detect,
> right? In this case ownership claiming in the kernel will fail and
> qemu will fail too, like above.

Right. If we're relying on "iommu group matching" as you suggest, then
libvirt will mistakenly conclude that the driver is a vfio variant, but
then qemu will fail.

> This, and the above, could be handled by having libvirt also open the
> group FD and get the device. It would prove both correct binding and
> viability.
>
> I had understood the point of this logic was to give better error
> reporting to users so that common misconfigurations could be diagnosed
> earlier.

Correct. In the end QEMU and the kernel have the final say of course,
but if we can detect the problem sooner then it's more likely the user
will get a meaningful error message.

> When I say 'small edge case' I mean it seems like an unlikely
> misconfiguration that someone would know to setup VFIO but then use
> the wrong BDFs to do it - arguably less likely than someone would know
> to setup VFIO but forget to unbind the other drivers in the group?

You obviously haven't spent enough time trying to remotely troubleshoot
the setups of noobs on IRC :-). If anything can be done wrong, there is
certainly someone around who will do it that way. Of course we can't
eliminate 100% of these, but the more misconfigurations we can catch,
and the earlier we can catch them, the better. All of this without any
false error reports of course.

> It's better that some edge cases get through (and be shot down by QEMU)
> rather than that we would mistakenly prevent someone from using a
> totally viable config (which is the case right now). But maybe I don't
> get it at all ...

Nah, you get it. We just have minor differences of opinion of which
choice will be the best combination of simple to implement + less user
headaches (and in the end any of us could be right; personally my
opinion sways from minute to minute).
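For concreteness, the three-step group check Jason describes could be sketched roughly as follows. This is a minimal sketch, not libvirt code: the `vfio_group_exists` helper is hypothetical, and the `sysfs`/`dev` root parameters exist only so the sketch is self-contained and testable against a fake filesystem tree.

```python
import os

def vfio_group_exists(bdf, sysfs="/sys", dev="/dev"):
    """Hypothetical sketch of the three-step check:
    1. readlink the device's iommu_group symlink,
    2. take its basename (the iommu group number),
    3. check whether a matching vfio group node exists."""
    link = os.path.join(sysfs, "bus/pci/devices", bdf, "iommu_group")
    if not os.path.islink(link):
        return False  # device is not in any iommu group
    group = os.path.basename(os.readlink(link))  # e.g. "42"
    return (os.path.exists(os.path.join(dev, "vfio", group))
            or os.path.exists(os.path.join(sysfs, "class/vfio", group)))
```

As the thread notes, a True result only proves that *some* device in the group is bound to a vfio driver, not necessarily the device asked about.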
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On 8/5/22 2:56 PM, Alex Williamson wrote:
> On Fri, 5 Aug 2022 15:20:24 -0300 Jason Gunthorpe wrote:
> > On Fri, Aug 05, 2022 at 11:24:08AM -0600, Alex Williamson wrote:
> > > On Thu, 4 Aug 2022 21:11:07 -0300 Jason Gunthorpe wrote:
> > > > On Thu, Aug 04, 2022 at 01:36:24PM -0600, Alex Williamson wrote:
> > > > > > > That is reasonable, but I'd say those three kernels only have two
> > > > > > > drivers and they both have vfio as a substring in their name - so the
> > > > > > > simple thing of just substring searching 'vfio' would get us over that
> > > > > > > gap.
> > > > > >
> > > > > > Looking at the aliases for exactly "vfio_pci" isn't that much more
> > > > > > complicated, and "feels" a lot more reliable than just doing a substring
> > > > > > search for "vfio" in the driver's name. (It would be, uh, "not smart"
> > > > > > to name a driver "vfio" if it wasn't actually a vfio variant driver
> > > > > > (or the opposite), but I could imagine it happening; :-/)
> > > >
> > > > This is still pretty hacky. I'm worried about what happens to the
> > > > kernel if this becomes some crazy unintended uAPI that we never really
> > > > thought about carefully... This was not a use case when we designed
> > > > the modules.alias stuff at least.
> > > >
> > > > BTW - why not do things the normal way?
> > > >
> > > > 1. readlink /sys/bus/pci/devices/XX/iommu_group
> > > > 2. Compute basename of #1
> > > > 3. Check if /dev/vfio/#2 exists (or /sys/class/vfio/#2)
> > > >
> > > > It has a small edge case where a multi-device group might give a false
> > > > positive for an undrivered device, but for the purposes of libvirt
> > > > that seems pretty obscure.. (while the above has false negative
> > > > issues, obviously)
> > >
> > > This is not a small edge case, it's extremely common. We have a *lot*
> > > of users assigning desktop GPUs and other consumer grade hardware, which
> > > are usually multi-function devices without isolation exposed via ACS or
> > > quirks.
> >
> > The edge case is that the user has created a multi-device group,
> > manually assigned device 1 in the group to VFIO, left device 2 with no
> > driver and then told libvirt to manually use device 2. With the above
> > approach libvirt won't detect this misconfiguration and qemu will
> > fail.
> >
> > > The vfio group exists if any devices in the group are bound to a vfio
> > > driver, but the device is not accessible from the group unless the
> > > viability test passes. That means QEMU may not be able to get access
> > > to the device because the device we want isn't actually bound to a vfio
> > > driver or another device in the group is not in a viable state. Thanks,
> >
> > This is a different misconfiguration that libvirt also won't detect,
> > right? In this case ownership claiming in the kernel will fail and
> > qemu will fail too, like above.
> >
> > This, and the above, could be handled by having libvirt also open the
> > group FD and get the device. It would prove both correct binding and
> > viability.
>
> libvirt cannot do this in the group model because the group must be
> isolated in a container before the device can be accessed and libvirt
> cannot presume the QEMU container configuration. For direct device
> access, this certainly becomes a possibility and I've been trying to
> steer things in that direction, libvirt has the option to pass an fd
> for the iommufd and can then pass fds for each of the devices in the
> new uAPI paradigm.
>
> > I had understood the point of this logic was to give better error
> > reporting to users so that common misconfigurations could be diagnosed
> > earlier. When I say 'small edge case' I mean it seems like an unlikely
> > misconfiguration that someone would know to setup VFIO but then use
> > the wrong BDFs to do it - arguably less likely than someone would know
> > to setup VFIO but forget to unbind the other drivers in the group?
>
> I'm not sure how much testing libvirt does of other devices in a group,
> Laine?

It had been so long since I looked at that, and the code was so obtuse,
that I had to set something up to see what happened.

1) if there is another device in the same group, and that device is
bound to vfio-pci and in use by a different QEMU process, then libvirt
refuses to assign the device in question (it will allow assigning the
device to the same guest as other devices in the same group).

2) if there is another device in the same group, and that device is
bound to some other driver than vfio-pci, then libvirt doesn't notice,
tells QEMU to do the assignment, and QEMU fails.

Without looking/trying, I would have said that libvirt would check for
(2), but I guess nobody ever tried it :-/

> AIUI here, libvirt has a managed='yes|no' option per device. In the
> 'yes' case libvirt will unbind devices from their host driver and bind
> them to vfio-pci. In the 'no' case, I believe libvirt is still doing a
> sanity test on the driver, but only knows about vfio-pci.

Correct.

> The initial step is to then enlighten libvirt that other drivers can
> be compatible for the 'no' case and later we can make smarter choices
> about which driver to use or allow the user to specify (ie. a user
> should be able to use vfio-pci rather than a variant driver if they
> choose) in the 'yes' case.

Yes, that's the next step. I just wanted to first add a simple (i.e.
difficult to botch up)
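Both of the behaviors Laine observed come down to inspecting the other devices in the target's iommu group. A rough sketch of such a peer check follows; the `group_peer_drivers` helper is hypothetical (it is not libvirt's actual code), though the sysfs layout it walks (`/sys/kernel/iommu_groups/<n>/devices/` and the per-device `driver` symlink) is the standard one.

```python
import os

def group_peer_drivers(bdf, sysfs="/sys"):
    """Hypothetical sketch: map every device in the same iommu group as
    `bdf` to the name of its bound driver, read from the device's
    `driver` symlink.  A device with no `driver` link maps to None."""
    group_link = os.path.join(sysfs, "bus/pci/devices", bdf, "iommu_group")
    group = os.path.basename(os.readlink(group_link))
    devs_dir = os.path.join(sysfs, "kernel/iommu_groups", group, "devices")
    result = {}
    for dev in os.listdir(devs_dir):
        drv = os.path.join(sysfs, "bus/pci/devices", dev, "driver")
        result[dev] = (os.path.basename(os.readlink(drv))
                       if os.path.islink(drv) else None)
    return result
```

A caller could then flag case (2): any peer whose driver is neither None nor a vfio driver will make the QEMU assignment fail.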
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Fri, 5 Aug 2022 15:20:24 -0300 Jason Gunthorpe wrote:
> On Fri, Aug 05, 2022 at 11:24:08AM -0600, Alex Williamson wrote:
> > On Thu, 4 Aug 2022 21:11:07 -0300 Jason Gunthorpe wrote:
> > > On Thu, Aug 04, 2022 at 01:36:24PM -0600, Alex Williamson wrote:
> > > > > > That is reasonable, but I'd say those three kernels only have two
> > > > > > drivers and they both have vfio as a substring in their name - so the
> > > > > > simple thing of just substring searching 'vfio' would get us over that
> > > > > > gap.
> > > > >
> > > > > Looking at the aliases for exactly "vfio_pci" isn't that much more
> > > > > complicated, and "feels" a lot more reliable than just doing a substring
> > > > > search for "vfio" in the driver's name. (It would be, uh, "not smart"
> > > > > to name a driver "vfio" if it wasn't actually a vfio variant driver
> > > > > (or the opposite), but I could imagine it happening; :-/)
> > >
> > > This is still pretty hacky. I'm worried about what happens to the
> > > kernel if this becomes some crazy unintended uAPI that we never really
> > > thought about carefully... This was not a use case when we designed
> > > the modules.alias stuff at least.
> > >
> > > BTW - why not do things the normal way?
> > >
> > > 1. readlink /sys/bus/pci/devices/XX/iommu_group
> > > 2. Compute basename of #1
> > > 3. Check if /dev/vfio/#2 exists (or /sys/class/vfio/#2)
> > >
> > > It has a small edge case where a multi-device group might give a false
> > > positive for an undrivered device, but for the purposes of libvirt
> > > that seems pretty obscure.. (while the above has false negative
> > > issues, obviously)
> >
> > This is not a small edge case, it's extremely common. We have a *lot*
> > of users assigning desktop GPUs and other consumer grade hardware, which
> > are usually multi-function devices without isolation exposed via ACS or
> > quirks.
>
> The edge case is that the user has created a multi-device group,
> manually assigned device 1 in the group to VFIO, left device 2 with no
> driver and then told libvirt to manually use device 2. With the above
> approach libvirt won't detect this misconfiguration and qemu will
> fail.
>
> > The vfio group exists if any devices in the group are bound to a vfio
> > driver, but the device is not accessible from the group unless the
> > viability test passes. That means QEMU may not be able to get access
> > to the device because the device we want isn't actually bound to a vfio
> > driver or another device in the group is not in a viable state. Thanks,
>
> This is a different misconfiguration that libvirt also won't detect,
> right? In this case ownership claiming in the kernel will fail and
> qemu will fail too, like above.
>
> This, and the above, could be handled by having libvirt also open the
> group FD and get the device. It would prove both correct binding and
> viability.

libvirt cannot do this in the group model because the group must be
isolated in a container before the device can be accessed and libvirt
cannot presume the QEMU container configuration. For direct device
access, this certainly becomes a possibility and I've been trying to
steer things in that direction, libvirt has the option to pass an fd
for the iommufd and can then pass fds for each of the devices in the
new uAPI paradigm.

> I had understood the point of this logic was to give better error
> reporting to users so that common misconfigurations could be diagnosed
> earlier. When I say 'small edge case' I mean it seems like an unlikely
> misconfiguration that someone would know to setup VFIO but then use
> the wrong BDFs to do it - arguably less likely than someone would know
> to setup VFIO but forget to unbind the other drivers in the group?

I'm not sure how much testing libvirt does of other devices in a group,
Laine?

AIUI here, libvirt has a managed='yes|no' option per device. In the
'yes' case libvirt will unbind devices from their host driver and bind
them to vfio-pci. In the 'no' case, I believe libvirt is still doing a
sanity test on the driver, but only knows about vfio-pci. The initial
step is to then enlighten libvirt that other drivers can be compatible
for the 'no' case and later we can make smarter choices about which
driver to use or allow the user to specify (ie. a user should be able
to use vfio-pci rather than a variant driver if they choose) in the
'yes' case.

If libvirt is currently testing that only the target device is bound to
vfio-pci, then maybe we do have gaps for the ancillary devices in the
group, but that gap changes if instead we only test that a vfio group
exists relative to the iommu group of the target device. Thanks,

Alex
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Thu, 4 Aug 2022 21:11:07 -0300 Jason Gunthorpe wrote:
> On Thu, Aug 04, 2022 at 01:36:24PM -0600, Alex Williamson wrote:
> > > > That is reasonable, but I'd say those three kernels only have two
> > > > drivers and they both have vfio as a substring in their name - so the
> > > > simple thing of just substring searching 'vfio' would get us over that
> > > > gap.
> > >
> > > Looking at the aliases for exactly "vfio_pci" isn't that much more
> > > complicated, and "feels" a lot more reliable than just doing a substring
> > > search for "vfio" in the driver's name. (It would be, uh, "not smart"
> > > to name a driver "vfio" if it wasn't actually a vfio variant driver
> > > (or the opposite), but I could imagine it happening; :-/)
>
> This is still pretty hacky. I'm worried about what happens to the
> kernel if this becomes some crazy unintended uAPI that we never really
> thought about carefully... This was not a use case when we designed
> the modules.alias stuff at least.
>
> BTW - why not do things the normal way?
>
> 1. readlink /sys/bus/pci/devices/XX/iommu_group
> 2. Compute basename of #1
> 3. Check if /dev/vfio/#2 exists (or /sys/class/vfio/#2)
>
> It has a small edge case where a multi-device group might give a false
> positive for an undrivered device, but for the purposes of libvirt
> that seems pretty obscure.. (while the above has false negative
> issues, obviously)

This is not a small edge case, it's extremely common. We have a *lot*
of users assigning desktop GPUs and other consumer grade hardware, which
are usually multi-function devices without isolation exposed via ACS or
quirks. The vfio group exists if any devices in the group are bound to
a vfio driver, but the device is not accessible from the group unless
the viability test passes. That means QEMU may not be able to get
access to the device because the device we want isn't actually bound to
a vfio driver or another device in the group is not in a viable state.
Thanks,

Alex
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Fri, Aug 05, 2022 at 11:46:17AM -0400, Laine Stump wrote:
> On 8/5/22 5:40 AM, Daniel P. Berrangé wrote:
> > On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
> > > On 8/1/22 7:58 AM, Erik Skultety wrote:
> > > >
> > > > Instead of calling an external program and then grepping its output which
> > > > technically could change in the future, wouldn't it be better if we read
> > > > /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> > > > 'vfio-pci' substring and compared the module name with the user-provided
> > > > device driver?
> > >
> > > Again, although I was hesitant about calling an external command, and asked
> > > if there was something simpler, Alex still suggested modinfo, so I'll let
> > > him answer that. Alex?
> > >
> > > (Also, although the format of the output of "uname -r" is pretty much
> > > written in stone, you're still running an external command :-))
> >
> > You wouldn't actually call 'uname -r', you'd invoke the uname(2) function
> > and use the 'release' field in 'struct utsname'.
>
> Yeah, I wasn't thinking clearly when I said that :-P
>
> > I'd favour reading modules.alias directly over invoking modinfo for
> > sure, though I'd be even more in favour of the kernel just exposing
> > the sysfs attribute and in the meanwhile just hardcoding the only 2
> > driver names that exist so far.
>
> The problem with hardcoding the 2 existing driver names is that it wouldn't
> do any good to anyone developing a new driver, and part of the aim of doing
> this is to make it possible for developers to test their new drivers using
> libvirt (and management systems based on libvirt).

I'm only suggesting hardcoding the driver names, *if* the kernel folks
agree to expose the sysfs directory. Anyone developing new drivers is
unlikely to have their drivers merged before this new sysfs dir is
added, so it is not a significant enough real world blocker for them
for us to worry about. Best to focus on what the best long term
approach is, and not worry about problems that will only exist for a
couple of kernel releases today.

With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On 8/5/22 5:40 AM, Daniel P. Berrangé wrote:
> On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
> > On 8/1/22 7:58 AM, Erik Skultety wrote:
> > >
> > > Instead of calling an external program and then grepping its output which
> > > technically could change in the future, wouldn't it be better if we read
> > > /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> > > 'vfio-pci' substring and compared the module name with the user-provided
> > > device driver?
> >
> > Again, although I was hesitant about calling an external command, and asked
> > if there was something simpler, Alex still suggested modinfo, so I'll let
> > him answer that. Alex?
> >
> > (Also, although the format of the output of "uname -r" is pretty much
> > written in stone, you're still running an external command :-))
>
> You wouldn't actually call 'uname -r', you'd invoke the uname(2) function
> and use the 'release' field in 'struct utsname'.

Yeah, I wasn't thinking clearly when I said that :-P

> I'd favour reading modules.alias directly over invoking modinfo for
> sure, though I'd be even more in favour of the kernel just exposing
> the sysfs attribute and in the meanwhile just hardcoding the only 2
> driver names that exist so far.

The problem with hardcoding the 2 existing driver names is that it wouldn't
do any good to anyone developing a new driver, and part of the aim of doing
this is to make it possible for developers to test their new drivers using
libvirt (and management systems based on libvirt).
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
> On 8/1/22 7:58 AM, Erik Skultety wrote:
> >
> > Instead of calling an external program and then grepping its output which
> > technically could change in the future, wouldn't it be better if we read
> > /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> > 'vfio-pci' substring and compared the module name with the user-provided
> > device driver?
>
> Again, although I was hesitant about calling an external command, and asked
> if there was something simpler, Alex still suggested modinfo, so I'll let
> him answer that. Alex?
>
> (Also, although the format of the output of "uname -r" is pretty much
> written in stone, you're still running an external command :-))

You wouldn't actually call 'uname -r', you'd invoke the uname(2) function
and use the 'release' field in 'struct utsname'.

I'd favour reading modules.alias directly over invoking modinfo for
sure, though I'd be even more in favour of the kernel just exposing
the sysfs attribute and in the meanwhile just hardcoding the only 2
driver names that exist so far.

With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
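Erik's suggestion, reading modules.alias directly and building its path with uname(2) rather than running `uname -r`, might look roughly like this. This is a hedged sketch: the `is_vfio_variant` helper is hypothetical, the `alias <wildcard> <module>` layout matches how depmod writes modules.alias, but the exact alias string a variant driver emits is assumed here, so the check simply looks for "vfio_pci" in the alias field as the thread describes.

```python
import os

def is_vfio_variant(driver, modules_alias=None):
    """Hypothetical sketch: scan modules.alias for a line whose alias
    string mentions vfio_pci and whose module field matches the
    user-provided driver name.  The default path is built via os.uname()
    (the uname(2) wrapper) instead of spawning `uname -r`."""
    if modules_alias is None:
        modules_alias = "/lib/modules/%s/modules.alias" % os.uname().release
    with open(modules_alias) as f:
        for line in f:
            fields = line.split()
            # depmod writes lines of the form: "alias <wildcard> <module>"
            if (len(fields) == 3 and fields[0] == "alias"
                    and "vfio_pci" in fields[1] and fields[2] == driver):
                return True
    return False
```

A real implementation would also have to normalize '-' vs '_' in module names, which this sketch skips.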
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Thu, Aug 04, 2022 at 01:51:20PM -0300, Jason Gunthorpe wrote:
> On Mon, Aug 01, 2022 at 09:49:28AM -0600, Alex Williamson wrote:
> > > > > > Fortunately these new vendor/device-specific drivers can be easily
> > > > > > identified as being "vfio-pci + extra stuff" - all that's needed is
> > > > > > to look at the output of the "modinfo $driver_name" command to see
> > > > > > if "vfio_pci" is in the alias list for the driver.
>
> We are moving in a direction on the kernel side to expose a sysfs
> under the PCI device that definitively says it is VFIO enabled, eg
> something like
>
> /sys/devices/pci0000:00/0000:00:1f.6/vfio/
>
> Which is how every other subsystem in the kernel works. When this
> lands libvirt can simply stat the vfio directory and confirm that the
> device handle it is looking at is vfio enabled, for all things that
> vfio support.
>
> My thinking had been to do the above work a bit later, but if libvirt
> needs it right now then lets do it right away so we don't have to
> worry about this hacky modprobe stuff down the road?

I wouldn't go so far as to say libvirt "needs" it, as obviously we can
make it work using modules.alias information. I would say that exposing
this in sysfs though makes it simpler and faster, because the check then
essentially turns into a single stat() call. So from that POV libvirt
would be happy to see that improvement.

With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
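Once the kernel exposes the per-device vfio/ directory, the whole detection really does collapse to a single stat(). A sketch, assuming the proposed (not yet merged) interface; the `has_vfio_sysfs` helper is hypothetical and reaches the device node via the `/sys/bus/pci/devices/<bdf>/` symlink rather than the full `/sys/devices/...` path from the thread's example:

```python
import os

def has_vfio_sysfs(bdf, sysfs="/sys"):
    """Hypothetical sketch of the proposed kernel interface: if the
    device's sysfs node contains a `vfio/` directory, the bound driver
    is vfio-enabled.  One stat() (here os.path.isdir) decides it."""
    return os.path.isdir(os.path.join(sysfs, "bus/pci/devices", bdf, "vfio"))
```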
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Thu, Aug 04, 2022 at 03:11:07PM -0400, Laine Stump wrote:
> On 8/4/22 2:36 PM, Jason Gunthorpe wrote:
> > On Thu, Aug 04, 2022 at 12:18:26PM -0600, Alex Williamson wrote:
> > > On Thu, 4 Aug 2022 13:51:20 -0300 Jason Gunthorpe wrote:
> > > > On Mon, Aug 01, 2022 at 09:49:28AM -0600, Alex Williamson wrote:
> > > > > > > > > Fortunately these new vendor/device-specific drivers can be easily
> > > > > > > > > identified as being "vfio-pci + extra stuff" - all that's needed
> > > > > > > > > is to look at the output of the "modinfo $driver_name" command to
> > > > > > > > > see if "vfio_pci" is in the alias list for the driver.
> > > >
> > > > We are moving in a direction on the kernel side to expose a sysfs
> > > > under the PCI device that definitively says it is VFIO enabled, eg
> > > > something like
> > > >
> > > > /sys/devices/pci0000:00/0000:00:1f.6/vfio/
> > > >
> > > > Which is how every other subsystem in the kernel works. When this
> > > > lands libvirt can simply stat the vfio directory and confirm that the
> > > > device handle it is looking at is vfio enabled, for all things that
> > > > vfio support.
> > > >
> > > > My thinking had been to do the above work a bit later, but if libvirt
> > > > needs it right now then lets do it right away so we don't have to
> > > > worry about this hacky modprobe stuff down the road?
> > >
> > > That seems like a pretty long gap, there are vfio-pci variant drivers
> > > since v5.18 and this hasn't even been proposed for v6.0 (aka v5.20)
> > > midway through the merge window. We therefore have at least 3 kernels
> > > exposing devices in a way that libvirt can't make use of simply due to
> > > a driver matching test.
> >
> > That is reasonable, but I'd say those three kernels only have two
> > drivers and they both have vfio as a substring in their name - so the
> > simple thing of just substring searching 'vfio' would get us over that
> > gap.
>
> Looking at the aliases for exactly "vfio_pci" isn't that much more
> complicated, and "feels" a lot more reliable than just doing a substring
> search for "vfio" in the driver's name. (It would be, uh, "not smart"
> to name a driver "vfio" if it wasn't actually a vfio variant driver
> (or the opposite), but I could imagine it happening; :-/)

If it is just 2 drivers so far then we don't need to even do a substring.
We should do a precise full string match for just those couple of
drivers that exist. We don't need to care about out of tree drivers
IMHO.

With regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Thu, 4 Aug 2022 15:11:07 -0400 Laine Stump wrote:
> On 8/4/22 2:36 PM, Jason Gunthorpe wrote:
> > On Thu, Aug 04, 2022 at 12:18:26PM -0600, Alex Williamson wrote:
> > > On Thu, 4 Aug 2022 13:51:20 -0300 Jason Gunthorpe wrote:
> > > > On Mon, Aug 01, 2022 at 09:49:28AM -0600, Alex Williamson wrote:
> > > > > > > > > Fortunately these new vendor/device-specific drivers can be easily
> > > > > > > > > identified as being "vfio-pci + extra stuff" - all that's needed
> > > > > > > > > is to look at the output of the "modinfo $driver_name" command to
> > > > > > > > > see if "vfio_pci" is in the alias list for the driver.
> > > >
> > > > We are moving in a direction on the kernel side to expose a sysfs
> > > > under the PCI device that definitively says it is VFIO enabled, eg
> > > > something like
> > > >
> > > > /sys/devices/pci0000:00/0000:00:1f.6/vfio/
> > > >
> > > > Which is how every other subsystem in the kernel works. When this
> > > > lands libvirt can simply stat the vfio directory and confirm that the
> > > > device handle it is looking at is vfio enabled, for all things that
> > > > vfio support.
> > > >
> > > > My thinking had been to do the above work a bit later, but if libvirt
> > > > needs it right now then lets do it right away so we don't have to
> > > > worry about this hacky modprobe stuff down the road?
> > >
> > > That seems like a pretty long gap, there are vfio-pci variant drivers
> > > since v5.18 and this hasn't even been proposed for v6.0 (aka v5.20)
> > > midway through the merge window. We therefore have at least 3 kernels
> > > exposing devices in a way that libvirt can't make use of simply due to
> > > a driver matching test.
> >
> > That is reasonable, but I'd say those three kernels only have two
> > drivers and they both have vfio as a substring in their name - so the
> > simple thing of just substring searching 'vfio' would get us over that
> > gap.
>
> Looking at the aliases for exactly "vfio_pci" isn't that much more
> complicated, and "feels" a lot more reliable than just doing a substring
> search for "vfio" in the driver's name. (It would be, uh, "not smart"
> to name a driver "vfio" if it wasn't actually a vfio variant driver
> (or the opposite), but I could imagine it happening; :-/)
>
> > > might be leveraged for managed='yes' with variant drivers. Once vfio
> > > devices expose a chardev themselves, libvirt might order the tests as:
> >
> > I wasn't thinking to include the chardev part if we are to expedite
> > this. The struct device bit alone is enough and it doesn't have the
> > complex bits needed to make the cdev.
> >
> > If you say you want to do it we'll do it for v6.1..
>
> Since we already need to do something else as a stop-gap for the interim
> (in order to avoid making driver developers wait any longer if for no
> other reason), my opinion would be to not spend extra time splitting up
> patches just to give us this functionality slightly sooner; we'll anyway
> have something at least workable in place.

We also need to be careful in adding things piecemeal that libvirt can
determine when new functionality, such as vfio device chardevs, are
actually available and not simply a placeholder to fill a gap elsewhere.
Thanks,

Alex
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On 8/4/22 2:36 PM, Jason Gunthorpe wrote:
> On Thu, Aug 04, 2022 at 12:18:26PM -0600, Alex Williamson wrote:
> > On Thu, 4 Aug 2022 13:51:20 -0300 Jason Gunthorpe wrote:
> > > On Mon, Aug 01, 2022 at 09:49:28AM -0600, Alex Williamson wrote:
> > > > > > > > Fortunately these new vendor/device-specific drivers can be easily
> > > > > > > > identified as being "vfio-pci + extra stuff" - all that's needed
> > > > > > > > is to look at the output of the "modinfo $driver_name" command to
> > > > > > > > see if "vfio_pci" is in the alias list for the driver.
> > >
> > > We are moving in a direction on the kernel side to expose a sysfs
> > > under the PCI device that definitively says it is VFIO enabled, eg
> > > something like
> > >
> > > /sys/devices/pci0000:00/0000:00:1f.6/vfio/
> > >
> > > Which is how every other subsystem in the kernel works. When this
> > > lands libvirt can simply stat the vfio directory and confirm that the
> > > device handle it is looking at is vfio enabled, for all things that
> > > vfio support.
> > >
> > > My thinking had been to do the above work a bit later, but if libvirt
> > > needs it right now then lets do it right away so we don't have to
> > > worry about this hacky modprobe stuff down the road?
> >
> > That seems like a pretty long gap, there are vfio-pci variant drivers
> > since v5.18 and this hasn't even been proposed for v6.0 (aka v5.20)
> > midway through the merge window. We therefore have at least 3 kernels
> > exposing devices in a way that libvirt can't make use of simply due to
> > a driver matching test.
>
> That is reasonable, but I'd say those three kernels only have two
> drivers and they both have vfio as a substring in their name - so the
> simple thing of just substring searching 'vfio' would get us over that
> gap.

Looking at the aliases for exactly "vfio_pci" isn't that much more
complicated, and "feels" a lot more reliable than just doing a substring
search for "vfio" in the driver's name. (It would be, uh, "not smart" to
name a driver "vfio" if it wasn't actually a vfio variant driver (or the
opposite), but I could imagine it happening; :-/)

> > might be leveraged for managed='yes' with variant drivers. Once vfio
> > devices expose a chardev themselves, libvirt might order the tests as:
>
> I wasn't thinking to include the chardev part if we are to expedite
> this. The struct device bit alone is enough and it doesn't have the
> complex bits needed to make the cdev.
>
> If you say you want to do it we'll do it for v6.1..

Since we already need to do something else as a stop-gap for the interim
(in order to avoid making driver developers wait any longer if for no
other reason), my opinion would be to not spend extra time splitting up
patches just to give us this functionality slightly sooner; we'll anyway
have something at least workable in place. Definitely once it is there,
libvirt should check for it, since it would be quicker and just "feels
even more reliable". I'm updating my patches to directly look at
modules.alias and will resend based on that.
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Thu, 4 Aug 2022 13:51:20 -0300 Jason Gunthorpe wrote: > On Mon, Aug 01, 2022 at 09:49:28AM -0600, Alex Williamson wrote: > > > > > > > Fortunately these new vendor/device-specific drivers can be easily > > > > > > identified as being "vfio-pci + extra stuff" - all that's needed is > > > > > > to > > > > > > look at the output of the "modinfo $driver_name" command to see if > > > > > > "vfio_pci" is in the alias list for the driver. > > We are moving in a direction on the kernel side to expose a sysfs > under the PCI device that definitively says it is VFIO enabled, eg > something like > > /sys/devices/pci:00/:00:1f.6/vfio/ > > Which is how every other subsystem in the kernel works. When this > lands libvirt can simply stat the vfio directory and confirm that the > device handle it is looking at is vfio enabled, for all things that > vfio support. > > My thinking had been to do the above work a bit later, but if libvirt > needs it right now then lets do it right away so we don't have to > worry about this hacky modprobe stuff down the road? That seems like a pretty long gap, there are vfio-pci variant drivers since v5.18 and this hasn't even been proposed for v6.0 (aka v5.20) midway through the merge window. We therefore have at least 3 kernels exposing devices in a way that libvirt can't make use of simply due to a driver matching test. Libvirt needs backwards compatibility, so we'll need it to look for the vfio-pci driver through some long deprecation period. In the interim, it can look at module aliases, support for which will be necessary and might be leveraged for managed='yes' with variant drivers. Once vfio devices expose a chardev themselves, libvirt might order the tests as: a) vfio device chardev present b) driver is a vfio-pci modalias c) driver is vfio-pci The current state of the world though is that variant drivers exist and libvirt can't make use of them. Thanks, Alex
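The a/b/c test ordering proposed above can be sketched as follows. This is a minimal Python illustration, not libvirt code; the `vfio_pci:` modalias prefix and the function name are assumptions for the sketch, not part of the patch under discussion:

```python
def is_vfio_capable(driver_name, module_aliases, has_vfio_cdev=False):
    """Apply the proposed test ordering:
    a) the device already exposes a vfio chardev,
    b) the bound driver declares a vfio_pci modalias,
    c) the bound driver is vfio-pci itself.
    """
    if has_vfio_cdev:
        # (a) strongest signal: the kernel itself says the device is
        # vfio-enabled, regardless of the driver's name
        return True
    if any(a.startswith("vfio_pci:") for a in module_aliases):
        # (b) a vfio-pci variant driver advertises itself via its aliases
        return True
    # (c) fall back to the exact driver-name match libvirt uses today
    return driver_name == "vfio-pci"
```

A variant driver such as a hypothetical `mlx5-vfio-pci` would then be accepted via check (b) even though its name is not "vfio-pci", while an unrelated NIC driver with no vfio aliases fails all three checks.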
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Mon, 1 Aug 2022 16:02:05 +0200 Erik Skultety wrote: > Putting Alex on CC since I don't see him there: > +alex.william...@redhat.com Hmm, Laine cc'd me on the initial post but it seems it got dropped somewhere. > On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote: > > On 8/1/22 7:58 AM, Erik Skultety wrote: > > > On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote: > > > > Before a PCI device can be assigned to a guest with VFIO, that device > > > > must be bound to the vfio-pci driver rather than to the device's > > > > normal driver. The vfio-pci driver provides APIs that permit QEMU to > > > > perform all the necessary operations to make the device accessible to > > > > the guest. > > > > > > > > There has been kernel work recently to support vendor/device-specific > > > > VFIO drivers that provide the basic vfio-pci driver functionality > > > > while adding support for device-specific operations (for example these > > > > device-specific drivers are planned to support live migration of > > > > certain devices). All that will be needed to make this functionality > > > > available will be to bind the new vendor-specific driver to the device > > > > (rather than the generic vfio-pci driver, which will continue to work > > > > just without the extra functionality). > > > > > > > > But until now libvirt has required that all PCI devices being assigned > > > > to a guest with VFIO specifically have the "vfio-pci" driver bound to > > > > the device. So even if the user manually binds a shiny new > > > > vendor-specific driver to the device (and puts "managed='no'" in the > > > > config to prevent libvirt from changing that), libvirt will just fail > > > > during startup of the guest (or during hotplug) because the driver > > > > bound to the device isn't named exactly "vfio-pci". 
> > > > > > > > Fortunately these new vendor/device-specific drivers can be easily > > > > identified as being "vfio-pci + extra stuff" - all that's needed is to > > > > look at the output of the "modinfo $driver_name" command to see if > > > > "vfio_pci" is in the alias list for the driver. > > > > > > > > That's what this patch does. When libvirt checks the driver bound to a > > > > device (either to decide if it needs to bind to a different driver or > > > > perform some other operation, or if the current driver is acceptable > > > > as-is), if the driver isn't specifically "vfio-pci", then it will look > > > > at the output of modinfo for the driver that *is* bound to the device; > > > > if modinfo shows vfio_pci as an alias for that device, then we'll > > > > behave as if the driver was exactly "vfio-pci". > > > > > > Since you mention the vendor/device-specific drivers: does each of these > > > drivers > > > implement the base vfio-pci functionality, or do they simply call into the > > > base > > > driver? The reason why I'm asking is that if each of the vendor-specific > > > drivers depends on the vfio-pci module to be loaded as well, then reading > > > /proc/modules should suffice as vfio-pci should be listed right next to > > > the > > > vendor-specific one. What am I missing? > > I don't know the definitive answer to that, as I have no example of a > > working vendor-specific driver to look at and only know about the kernel > > work going on second-hand from Alex. It looks like even the vfio_pci driver > > itself depends on other presumably lower level vfio-* modules (it directly > > uses vfio_pci_core, which in turn uses vfio and vfio_virqfd), so possibly > > these new drivers would be depending on one or more of those lower level > > modules rather than vfio_pci. Also I would imagine it would be possible for > > other drivers to also depend on the vfio-pci driver while not themselves > > being a vfio driver. 
A module dependency on vfio-pci (actually vfio-pci-core) is a pretty loose requirement, *any* symbol dependency generates such a linkage, without necessarily exposing a vfio-pci uAPI. The alias support introduced to the kernel is intended to allow userspace to determine the most appropriate vfio-pci driver for a device, whether that's vfio-pci itself or a variant driver that augments device specific features. See the upstream commit here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cc6711b0bf36de068b10490198d05ac168377989 All we're doing here is extending libvirt to say that if the driver is vfio-pci or the modalias for the driver is prefixed with vfio-pci, then the driver exposes a vfio-pci compatible uAPI. I expect in the future libvirt, or some other utility, may take on the role as described in the above commit log to not only detect that a driver supports a vfio-pci uAPI, but also to identify the most appropriate driver for the device which exposes a vfio-uAPI. > > > The 'alias' field is optional so do we have any support guarantees from > > > the > > > vendors that it will always be filled in correctly? I mean you surely > > >
Re: [PATCH] util: basic support for vendor-specific vfio drivers
Putting Alex on CC since I don't see him there: +alex.william...@redhat.com On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote: > On 8/1/22 7:58 AM, Erik Skultety wrote: > > On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote: > > > Before a PCI device can be assigned to a guest with VFIO, that device > > > must be bound to the vfio-pci driver rather than to the device's > > > normal driver. The vfio-pci driver provides APIs that permit QEMU to > > > perform all the necessary operations to make the device accessible to > > > the guest. > > > > > > There has been kernel work recently to support vendor/device-specific > > > VFIO drivers that provide the basic vfio-pci driver functionality > > > while adding support for device-specific operations (for example these > > > device-specific drivers are planned to support live migration of > > > certain devices). All that will be needed to make this functionality > > > available will be to bind the new vendor-specific driver to the device > > > (rather than the generic vfio-pci driver, which will continue to work > > > just without the extra functionality). > > > > > > But until now libvirt has required that all PCI devices being assigned > > > to a guest with VFIO specifically have the "vfio-pci" driver bound to > > > the device. So even if the user manually binds a shiny new > > > vendor-specific driver to the device (and puts "managed='no'" in the > > > config to prevent libvirt from changing that), libvirt will just fail > > > during startup of the guest (or during hotplug) because the driver > > > bound to the device isn't named exactly "vfio-pci". > > > > > > Fortunately these new vendor/device-specific drivers can be easily > > > identified as being "vfio-pci + extra stuff" - all that's needed is to > > > look at the output of the "modinfo $driver_name" command to see if > > > "vfio_pci" is in the alias list for the driver. > > > > > > That's what this patch does. 
When libvirt checks the driver bound to a > > > device (either to decide if it needs to bind to a different driver or > > > perform some other operation, or if the current driver is acceptable > > > as-is), if the driver isn't specifically "vfio-pci", then it will look > > > at the output of modinfo for the driver that *is* bound to the device; > > > if modinfo shows vfio_pci as an alias for that device, then we'll > > > behave as if the driver was exactly "vfio-pci". > > > > Since you mention the vendor/device-specific drivers: does each of these > > drivers > > implement the base vfio-pci functionality, or do they simply call into the base > > driver? The reason why I'm asking is that if each of the vendor-specific > > drivers depends on the vfio-pci module to be loaded as well, then reading > > /proc/modules should suffice as vfio-pci should be listed right next to the > > vendor-specific one. What am I missing? > I don't know the definitive answer to that, as I have no example of a > working vendor-specific driver to look at and only know about the kernel > work going on second-hand from Alex. It looks like even the vfio_pci driver > itself depends on other presumably lower level vfio-* modules (it directly > uses vfio_pci_core, which in turn uses vfio and vfio_virqfd), so possibly > these new drivers would be depending on one or more of those lower level > modules rather than vfio_pci. Also I would imagine it would be possible for > other drivers to also depend on the vfio-pci driver while not themselves > being a vfio driver. > > > > > The 'alias' field is optional so do we have any support guarantees from the > > vendors that it will always be filled in correctly? I mean you surely > > handle that case in the code, but once we start supporting this there's no > > way > > back and we already know how painful it can be to convince the vendors to > > follow some kind of standard so that we don't need to maintain several code > > paths based on a vendor-matrix. 
> > The aliases are what is used to determine the "best" vfio driver for a > particular device, so I don't think it would be possible for a driver to not > implement it, and the method I've used here to determine if a driver is a > vfio driver was recommended by Alex after a couple of discussions on the > subject. > > > > > ... > > > > > +int > > > +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev, > > > + char **drvName, > > > + virPCIStubDriver *drvType) > > > +{ > > > +g_autofree char *drvPath = NULL; > > > +g_autoptr(virCommand) cmd = NULL; > > > +g_autofree char *output = NULL; > > > +g_autoptr(GRegex) regex = NULL; > > > +g_autoptr(GError) err = NULL; > > > +g_autoptr(GMatchInfo) info = NULL; > > > +int exit; > > > +int tmpType; > > > + > > > +if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0) > > > +return -1; > > > + > > > +if (!*drvName) { > > > +*drvType =
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote: > On 8/1/22 7:58 AM, Erik Skultety wrote: > > On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote: > > > Before a PCI device can be assigned to a guest with VFIO, that device > > > must be bound to the vfio-pci driver rather than to the device's > > > normal driver. The vfio-pci driver provides APIs that permit QEMU to > > > perform all the necessary operations to make the device accessible to > > > the guest. > > > > > > There has been kernel work recently to support vendor/device-specific > > > VFIO drivers that provide the basic vfio-pci driver functionality > > > while adding support for device-specific operations (for example these > > > device-specific drivers are planned to support live migration of > > > certain devices). All that will be needed to make this functionality > > > available will be to bind the new vendor-specific driver to the device > > > (rather than the generic vfio-pci driver, which will continue to work > > > just without the extra functionality). > > > > > > But until now libvirt has required that all PCI devices being assigned > > > to a guest with VFIO specifically have the "vfio-pci" driver bound to > > > the device. So even if the user manually binds a shiny new > > > vendor-specific driver to the device (and puts "managed='no'" in the > > > config to prevent libvirt from changing that), libvirt will just fail > > > during startup of the guest (or during hotplug) because the driver > > > bound to the device isn't named exactly "vfio-pci". > > > > > > Fortunately these new vendor/device-specific drivers can be easily > > > identified as being "vfio-pci + extra stuff" - all that's needed is to > > > look at the output of the "modinfo $driver_name" command to see if > > > "vfio_pci" is in the alias list for the driver. > > > > > > That's what this patch does. 
When libvirt checks the driver bound to a > > > device (either to decide if it needs to bind to a different driver or > > > perform some other operation, or if the current driver is acceptable > > > as-is), if the driver isn't specifically "vfio-pci", then it will look > > > at the output of modinfo for the driver that *is* bound to the device; > > > if modinfo shows vfio_pci as an alias for that device, then we'll > > > behave as if the driver was exactly "vfio-pci". > > > > Since you mention the vendor/device-specific drivers: does each of these > > drivers > > implement the base vfio-pci functionality, or do they simply call into the base > > driver? The reason why I'm asking is that if each of the vendor-specific > > drivers depends on the vfio-pci module to be loaded as well, then reading > > /proc/modules should suffice as vfio-pci should be listed right next to the > > vendor-specific one. What am I missing? > I don't know the definitive answer to that, as I have no example of a > working vendor-specific driver to look at and only know about the kernel > work going on second-hand from Alex. It looks like even the vfio_pci driver > itself depends on other presumably lower level vfio-* modules (it directly > uses vfio_pci_core, which in turn uses vfio and vfio_virqfd), so possibly > these new drivers would be depending on one or more of those lower level > modules rather than vfio_pci. Also I would imagine it would be possible for > other drivers to also depend on the vfio-pci driver while not themselves > being a vfio driver. > > > > > The 'alias' field is optional so do we have any support guarantees from the > > vendors that it will always be filled in correctly? I mean you surely > > handle that case in the code, but once we start supporting this there's no > > way > > back and we already know how painful it can be to convince the vendors to > > follow some kind of standard so that we don't need to maintain several code > > paths based on a vendor-matrix. 
> > The aliases are what is used to determine the "best" vfio driver for a > particular device, so I don't think it would be possible for a driver to not > implement it, and the method I've used here to determine if a driver is a > vfio driver was recommended by Alex after a couple of discussions on the > subject. > > > > > ... > > > > > +int > > > +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev, > > > + char **drvName, > > > + virPCIStubDriver *drvType) > > > +{ > > > +g_autofree char *drvPath = NULL; > > > +g_autoptr(virCommand) cmd = NULL; > > > +g_autofree char *output = NULL; > > > +g_autoptr(GRegex) regex = NULL; > > > +g_autoptr(GError) err = NULL; > > > +g_autoptr(GMatchInfo) info = NULL; > > > +int exit; > > > +int tmpType; > > > + > > > +if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0) > > > +return -1; > > > + > > > +if (!*drvName) { > > > +*drvType = VIR_PCI_STUB_DRIVER_NONE; > > > +return 0; > > > +} > > > + > > > +tmpType =
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On 8/1/22 7:58 AM, Erik Skultety wrote: On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote: Before a PCI device can be assigned to a guest with VFIO, that device must be bound to the vfio-pci driver rather than to the device's normal driver. The vfio-pci driver provides APIs that permit QEMU to perform all the necessary operations to make the device accessible to the guest. There has been kernel work recently to support vendor/device-specific VFIO drivers that provide the basic vfio-pci driver functionality while adding support for device-specific operations (for example these device-specific drivers are planned to support live migration of certain devices). All that will be needed to make this functionality available will be to bind the new vendor-specific driver to the device (rather than the generic vfio-pci driver, which will continue to work just without the extra functionality). But until now libvirt has required that all PCI devices being assigned to a guest with VFIO specifically have the "vfio-pci" driver bound to the device. So even if the user manually binds a shiny new vendor-specific driver to the device (and puts "managed='no'" in the config to prevent libvirt from changing that), libvirt will just fail during startup of the guest (or during hotplug) because the driver bound to the device isn't named exactly "vfio-pci". Fortunately these new vendor/device-specific drivers can be easily identified as being "vfio-pci + extra stuff" - all that's needed is to look at the output of the "modinfo $driver_name" command to see if "vfio_pci" is in the alias list for the driver. That's what this patch does. 
When libvirt checks the driver bound to a device (either to decide if it needs to bind to a different driver or perform some other operation, or if the current driver is acceptable as-is), if the driver isn't specifically "vfio-pci", then it will look at the output of modinfo for the driver that *is* bound to the device; if modinfo shows vfio_pci as an alias for that device, then we'll behave as if the driver was exactly "vfio-pci". Since you mention the vendor/device-specific drivers: does each of these drivers implement the base vfio-pci functionality, or do they simply call into the base driver? The reason why I'm asking is that if each of the vendor-specific drivers depends on the vfio-pci module to be loaded as well, then reading /proc/modules should suffice as vfio-pci should be listed right next to the vendor-specific one. What am I missing? I don't know the definitive answer to that, as I have no example of a working vendor-specific driver to look at and only know about the kernel work going on second-hand from Alex. It looks like even the vfio_pci driver itself depends on other presumably lower level vfio-* modules (it directly uses vfio_pci_core, which in turn uses vfio and vfio_virqfd), so possibly these new drivers would be depending on one or more of those lower level modules rather than vfio_pci. Also I would imagine it would be possible for other drivers to also depend on the vfio-pci driver while not themselves being a vfio driver. The 'alias' field is optional so do we have any support guarantees from the vendors that it will always be filled in correctly? I mean you surely handle that case in the code, but once we start supporting this there's no way back and we already know how painful it can be to convince the vendors to follow some kind of standard so that we don't need to maintain several code paths based on a vendor-matrix. 
The aliases are what is used to determine the "best" vfio driver for a particular device, so I don't think it would be possible for a driver to not implement it, and the method I've used here to determine if a driver is a vfio driver was recommended by Alex after a couple of discussions on the subject. ... +int +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev, + char **drvName, + virPCIStubDriver *drvType) +{ +g_autofree char *drvPath = NULL; +g_autoptr(virCommand) cmd = NULL; +g_autofree char *output = NULL; +g_autoptr(GRegex) regex = NULL; +g_autoptr(GError) err = NULL; +g_autoptr(GMatchInfo) info = NULL; +int exit; +int tmpType; + +if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0) +return -1; + +if (!*drvName) { +*drvType = VIR_PCI_STUB_DRIVER_NONE; +return 0; +} + +tmpType = virPCIStubDriverTypeFromString(*drvName); + +if (tmpType > VIR_PCI_STUB_DRIVER_NONE) { +*drvType = tmpType; +return 0; /* exact match of a known driver name (or no name) */ +} + +/* Check the output of "modinfo $drvName" to see if it has + * "vfio_pci" as an alias. If it does, then this driver should + * also be considered as a vfio-pci driver, because it implements + * all the functionality of the basic vfio-pci (plus additional + * device-specific
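For readers less familiar with the glib idioms in the C above, the flow of virPCIDeviceGetDriverNameAndType() can be sketched in Python. The "alias:" line format assumed here matches how modinfo prints aliases, but the stub-driver list is an illustrative subset, not libvirt's actual table:

```python
import re

# Illustrative subset of driver names libvirt recognizes as stub drivers
KNOWN_STUB_DRIVERS = ("vfio-pci", "pci-stub", "xen-pciback")

def classify_driver(drv_name, modinfo_output):
    """Mirror the flow above: no driver -> none; exact name match of a
    known stub driver -> that driver; otherwise scan the modinfo output
    for a vfio_pci alias and, if found, treat the driver as vfio-pci."""
    if not drv_name:
        return "none"
    if drv_name in KNOWN_STUB_DRIVERS:
        return drv_name  # exact match of a known driver name
    # "modinfo $drvName" prints one "alias:        <value>" line per alias
    if re.search(r"^alias:\s+vfio_pci\b", modinfo_output, re.MULTILINE):
        return "vfio-pci"  # variant driver: behave as if it were vfio-pci
    return "unknown"
```

The key design point the C code and this sketch share is that the alias scan only runs as a fallback, so devices bound to plain vfio-pci never pay the cost of invoking modinfo.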
Re: [PATCH] util: basic support for vendor-specific vfio drivers
On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote: > Before a PCI device can be assigned to a guest with VFIO, that device > must be bound to the vfio-pci driver rather than to the device's > normal driver. The vfio-pci driver provides APIs that permit QEMU to > perform all the necessary operations to make the device accessible to > the guest. > > There has been kernel work recently to support vendor/device-specific > VFIO drivers that provide the basic vfio-pci driver functionality > while adding support for device-specific operations (for example these > device-specific drivers are planned to support live migration of > certain devices). All that will be needed to make this functionality > available will be to bind the new vendor-specific driver to the device > (rather than the generic vfio-pci driver, which will continue to work > just without the extra functionality). > > But until now libvirt has required that all PCI devices being assigned > to a guest with VFIO specifically have the "vfio-pci" driver bound to > the device. So even if the user manually binds a shiny new > vendor-specific driver to the device (and puts "managed='no'" in the > config to prevent libvirt from changing that), libvirt will just fail > during startup of the guest (or during hotplug) because the driver > bound to the device isn't named exactly "vfio-pci". > > Fortunately these new vendor/device-specific drivers can be easily > identified as being "vfio-pci + extra stuff" - all that's needed is to > look at the output of the "modinfo $driver_name" command to see if > "vfio_pci" is in the alias list for the driver. > > That's what this patch does. 
When libvirt checks the driver bound to a > device (either to decide if it needs to bind to a different driver or > perform some other operation, or if the current driver is acceptable > as-is), if the driver isn't specifically "vfio-pci", then it will look > at the output of modinfo for the driver that *is* bound to the device; > if modinfo shows vfio_pci as an alias for that device, then we'll > behave as if the driver was exactly "vfio-pci". Since you mention the vendor/device-specific drivers: does each of these drivers implement the base vfio-pci functionality, or do they simply call into the base driver? The reason why I'm asking is that if each of the vendor-specific drivers depends on the vfio-pci module to be loaded as well, then reading /proc/modules should suffice as vfio-pci should be listed right next to the vendor-specific one. What am I missing? The 'alias' field is optional so do we have any support guarantees from the vendors that it will always be filled in correctly? I mean you surely handle that case in the code, but once we start supporting this there's no way back and we already know how painful it can be to convince the vendors to follow some kind of standard so that we don't need to maintain several code paths based on a vendor-matrix. ... 
> +int > +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev, > + char **drvName, > + virPCIStubDriver *drvType) > +{ > +g_autofree char *drvPath = NULL; > +g_autoptr(virCommand) cmd = NULL; > +g_autofree char *output = NULL; > +g_autoptr(GRegex) regex = NULL; > +g_autoptr(GError) err = NULL; > +g_autoptr(GMatchInfo) info = NULL; > +int exit; > +int tmpType; > + > +if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0) > +return -1; > + > +if (!*drvName) { > +*drvType = VIR_PCI_STUB_DRIVER_NONE; > +return 0; > +} > + > +tmpType = virPCIStubDriverTypeFromString(*drvName); > + > +if (tmpType > VIR_PCI_STUB_DRIVER_NONE) { > +*drvType = tmpType; > +return 0; /* exact match of a known driver name (or no name) */ > +} > + > +/* Check the output of "modinfo $drvName" to see if it has > + * "vfio_pci" as an alias. If it does, then this driver should > + * also be considered as a vfio-pci driver, because it implements > + * all the functionality of the basic vfio-pci (plus additional > + * device-specific stuff). > + */ Instead of calling an external program and then grepping its output which technically could change in the future, wouldn't it be better if we read /lib/modules/`uname -r`/modules.alias and filtered whatever line had the 'vfio-pci' substring and compared the module name with the user-provided device driver? If not then I think you should pass '-F alias' to the command to speed up the regex just a tiny bit. Regards, Erik
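Erik's modules.alias suggestion (which Laine later says he adopted) could look roughly like this in Python. The per-line format "alias <modalias-pattern> <module-name>" is the standard modules.alias layout generated by depmod; matching on a `vfio_pci:` prefix rather than a bare substring is an assumption carried over from the discussion:

```python
def vfio_variant_modules(modules_alias_text):
    """Scan modules.alias content and collect the names of modules
    that declare a vfio_pci:* alias, i.e. vfio-pci variant drivers.

    Each line in the file has the form:
        alias <modalias-pattern> <module-name>
    """
    variants = set()
    for line in modules_alias_text.splitlines():
        parts = line.split()
        if (len(parts) == 3
                and parts[0] == "alias"
                and parts[1].startswith("vfio_pci:")):
            variants.add(parts[2])
    return variants
```

A driver bound to the device would then be accepted if its module name appears in this set (or if it is vfio-pci itself), with no external modinfo invocation needed.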
[PATCH] util: basic support for vendor-specific vfio drivers
Before a PCI device can be assigned to a guest with VFIO, that device must be bound to the vfio-pci driver rather than to the device's normal driver. The vfio-pci driver provides APIs that permit QEMU to perform all the necessary operations to make the device accessible to the guest. There has been kernel work recently to support vendor/device-specific VFIO drivers that provide the basic vfio-pci driver functionality while adding support for device-specific operations (for example these device-specific drivers are planned to support live migration of certain devices). All that will be needed to make this functionality available will be to bind the new vendor-specific driver to the device (rather than the generic vfio-pci driver, which will continue to work just without the extra functionality). But until now libvirt has required that all PCI devices being assigned to a guest with VFIO specifically have the "vfio-pci" driver bound to the device. So even if the user manually binds a shiny new vendor-specific driver to the device (and puts "managed='no'" in the config to prevent libvirt from changing that), libvirt will just fail during startup of the guest (or during hotplug) because the driver bound to the device isn't named exactly "vfio-pci". Fortunately these new vendor/device-specific drivers can be easily identified as being "vfio-pci + extra stuff" - all that's needed is to look at the output of the "modinfo $driver_name" command to see if "vfio_pci" is in the alias list for the driver. That's what this patch does. When libvirt checks the driver bound to a device (either to decide if it needs to bind to a different driver or perform some other operation, or if the current driver is acceptable as-is), if the driver isn't specifically "vfio-pci", then it will look at the output of modinfo for the driver that *is* bound to the device; if modinfo shows vfio_pci as an alias for that device, then we'll behave as if the driver was exactly "vfio-pci". 
The effect of this patch is that users will now be able to pre-setup a device to be bound to a vendor-specific driver, then put "managed='no'" in the config and libvirt will allow that driver. What this patch does *not* do is handle automatically determining the proper/best vendor-specific driver and binding to it in the case of "managed='yes'". This will be implemented later when there is a widely available driver / device combo we can use for testing. This initial simple patch is just something simple that will permit initial testing of the new drivers' functionality. (I personally had to add an extra patch playing with driver names to my build just to test that everything was working as expected; that's okay for a patch as simple as this, but wouldn't be acceptable testing for anything more complex.) Signed-off-by: Laine Stump --- meson.build | 1 + src/hypervisor/virhostdev.c | 26 --- src/util/virpci.c | 90 ++--- src/util/virpci.h | 3 ++ 4 files changed, 97 insertions(+), 23 deletions(-) diff --git a/meson.build b/meson.build index de59b1be9c..9d96eb3ee3 100644 --- a/meson.build +++ b/meson.build @@ -822,6 +822,7 @@ optional_programs = [ 'iscsiadm', 'mdevctl', 'mm-ctl', + 'modinfo', 'modprobe', 'ovs-vsctl', 'pdwtags', diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c index c0ce867596..15b35fa75e 100644 --- a/src/hypervisor/virhostdev.c +++ b/src/hypervisor/virhostdev.c @@ -747,9 +747,8 @@ virHostdevPreparePCIDevicesImpl(virHostdevManager *mgr, mgr->inactivePCIHostdevs) < 0) goto reattachdevs; } else { -g_autofree char *driverPath = NULL; -g_autofree char *driverName = NULL; -int stub; +g_autofree char *drvName = NULL; +virPCIStubDriver drvType; /* Unmanaged devices should already have been marked as * inactive: if that's the case, we can simply move on */ @@ -769,18 +768,14 @@ virHostdevPreparePCIDevicesImpl(virHostdevManager *mgr, * information about active / inactive device across * daemon restarts has been implemented */ -if 
(virPCIDeviceGetDriverPathAndName(pci, - &driverPath, - &driverName) < 0) +if (virPCIDeviceGetDriverNameAndType(pci, &drvName, &drvType) < 0) goto reattachdevs; -stub = virPCIStubDriverTypeFromString(driverName); - -if (stub > VIR_PCI_STUB_DRIVER_NONE && -stub < VIR_PCI_STUB_DRIVER_LAST) { +if (drvType > VIR_PCI_STUB_DRIVER_NONE) { /* The device is bound to a known stub driver: store this * information and add a copy to the inactive list */ -virPCIDeviceSetStubDriver(pci, stub); +virPCIDeviceSetStubDriver(pci, drvType);