On Sun, Jun 10, 2012 at 09:15:10AM -0600, Alex Williamson wrote:
> On Sun, 2012-06-10 at 17:54 +0300, Michael S. Tsirkin wrote:
> > On Sun, Jun 10, 2012 at 08:41:03AM -0600, Alex Williamson wrote:
> > > On Sun, 2012-06-10 at 17:03 +0300, Michael S. Tsirkin wrote:
> > > > On Sun, Jun 10, 2012 at 07:41:51AM -0600, Alex Williamson wrote:
> > > > > > > >>>> vfio_pci.c contains a nice function called "parse_hostaddr".
> > > > > > > >>>> You may guess what it does. ;)
> > > > > > > >>>
> > > > > > > >>> Interesting. Why? This looks strange to me:
> > > > > > > >>> I would expect the admin to bind a device to vfio
> > > > > > > >>> the way it's now bound to a stub.
> > > > > > > >>> Then pass /dev/vfioXXX to qemu.
> > > > > > > >>
> > > > > > > >> That's the "libvirt way". We surely also want the "qemu
> > > > > > > >> command line way" for which this kind of service is needed.
> > > > > > > >>
> > > > > > > >> Jan
> > > > > > > >>
> > > > > > > >
> > > > > > > > Yes, I imagine the qemu command line passing in /dev/vfioXXX,
> > > > > > > > the libvirt way will pass in an fd for above. No?
> > > > > > >
> > > > > > > As far as I understand the API, there is no device file per
> > > > > > > assigned device.
> > > > > >
> > > > > > Does it do pci_get_domain_bus_and_slot like kvm then?
> > > > > > With all the warts like you have to remember to bind pci stub
> > > > > > or you get two drivers for one device?
> > > > > > If true that's unfortunate IMHO.
> > > >
> > > > I hope the answer to the above is no?
> > >
> > > No, it does a probe for devices. We need the devaddr to compare against
> > > dev_name of the device to figure out which device the user is attempting
> > > to identify.
> > >
> > > > > > > Also, this [domain:]bus:dev.fn format is more handy for the
> > > > > > > command line.
> > > > > > >
> > > > > > > Jan
> > > > > > >
> > > > > > >
> > > > > > Then users could add udev rules that will name vfio devices
> > > > > > like this. Another interesting option: /dev/vfio/eth0/vf1.
> > > > > > That's better I think: no one really likes running lspci
> > > > > > and guessing the address from there.
> > > > >
> > > > > That's not at all how VFIO works. /dev/vfio/# represents a group,
> > > > > which may contain one or more devices. Even if libvirt passes a file
> > > > > descriptor for the group, qemu needs to know which device in the group
> > > > > to add to the guest, so parsing a device address is still necessary.
> > > > > Thanks,
> > > > >
> > > > > Alex
> > > >
> > > > That's very unusual, and unfortunate. For example this means that I
> > > > must update applications just because I move a card to another slot.
> > > > UIO does not have this problem.
> > > > The fact that it's broken in kvm ATM seems to have made people
> > > > think it's okay, but it really is a bug. We didn't fix it
> > > > because vfio was supposed to be the solution.
> > >
> > > I don't know what you're talking about here. Are you suggesting that
> > > needing to specify -device pci-assign,host=3.0 changing to host=4.0 when
> > > you move a card is broken?
> >
> > Yes. Absolutely. Admin should be able to abstract it away without users
> > knowing anything about it.
>
> We don't have UUIDs on PCI devices, so who's to say that the device that
> was in slot 3 is the same device that's now in slot 4 and the user
> should still have access to it? That sounds even more problematic.

The PF has a driver loaded, so you can identify that, and identify the VF
through it. Again, this is really policy: it should be up to the admin
how to name the device.

> > > How does UIO avoid such a problem?
> >
> > Normally you use a misc device that you can name with udev.
> >
> > > UIO-pci
> > > requires the user to use pci-sysfs for resource access, so it surely
> > > cares about the device address.
> >
> > Only uio_pci_generic. Other uio devices let you drive the
> > device.
>
> If this is actually a problem, this is the first ever complaint I've
> heard about it. As above, I don't think we can assume the same access
> when a device is moved.

I thought the need for sane naming and for a sysfs interface was discussed
multiple times. But maybe I'm misremembering.

> > > > I do realize you want to represent a group of devices somehow but can't
> > > > this be solved without breaking naming devices with udev? For example,
> > > > the device could be a file as well. You would then use the fd to
> > > > identify the device within the group. And in a somewhat common case
> > > > of a single device within the group, you can even make opening the
> > > > group optional.
> > > > Don't know if this fix I suggest makes sense at all but it's a real
> > > > problem all the same.
> > >
> > > Unfortunately, exposing individual devices just confuses the ownership
> > > model we require for groups. It would provide the illusion of being
> > > able to assign an individual device, without the reality of the
> > > grouping. Groups are owned either by _a_ user or by the kernel, they
> > > can't be split across multiple users (at least not with any guarantees
> > > of isolation). The current interface makes this clear. Thanks,
> > >
> > > Alex
> >
> > So do users pass in group=/dev/vfio/1,host=0:3.0 then?
>
> No, vfio syntax is -device vfio-pci,host=0:3.0, just like pci-assign.
> Qemu will figure out which group that device belongs to and "do the
> right thing". If we add support for libvirt passing a groupfd, it will
> be mostly the same, just using scm_rights to get the groupfd instead of
> opening it directly. Thanks,
>
> Alex

Then how do you know which /dev/vfio/# to open?

--
MST
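
To make the question above concrete, here is a minimal sketch of one way a
user-space tool could go from a host=[domain:]bus:dev.fn string to a group
node. It is not taken from vfio_pci.c, and it rests on assumptions the thread
does not confirm: that each PCI device in sysfs carries an "iommu_group"
symlink whose last path component is the group number, and that the group
node is /dev/vfio/<number>. The function bodies, the vfio_group_path name and
the sysfs_root/dev_root parameters are hypothetical; only the name
parse_hostaddr and the address syntax come from the discussion. Short forms
such as host=3.0 are deliberately not handled.

```python
import os
import re

# Accepts "[domain:]bus:dev.fn" with hexadecimal fields, e.g. "0:3.0" or
# "0000:06:0d.0". This pattern is an illustration, not QEMU's actual parser.
_HOSTADDR = re.compile(
    r"(?:(?P<domain>[0-9a-fA-F]{1,4}):)?"
    r"(?P<bus>[0-9a-fA-F]{1,2}):"
    r"(?P<dev>[0-9a-fA-F]{1,2})\."
    r"(?P<fn>[0-7])"
)


def parse_hostaddr(text):
    """Return the canonical sysfs name 'dddd:bb:dd.f' for a host address."""
    m = _HOSTADDR.fullmatch(text)
    if m is None:
        raise ValueError("expected [domain:]bus:dev.fn, got %r" % (text,))
    domain = int(m.group("domain") or "0", 16)
    bus = int(m.group("bus"), 16)
    dev = int(m.group("dev"), 16)
    fn = int(m.group("fn"))
    if dev > 0x1F:  # a PCI bus has at most 32 device slots
        raise ValueError("device number out of range in %r" % (text,))
    return "%04x:%02x:%02x.%d" % (domain, bus, dev, fn)


def vfio_group_path(hostaddr, sysfs_root="/sys", dev_root="/dev"):
    """Map a host address to its group node (ASSUMED sysfs layout).

    Reads <sysfs_root>/bus/pci/devices/<addr>/iommu_group and uses the
    basename of the link target as the group number.
    """
    name = parse_hostaddr(hostaddr)
    link = os.path.join(sysfs_root, "bus", "pci", "devices", name,
                        "iommu_group")
    group = os.path.basename(os.readlink(link))
    return os.path.join(dev_root, "vfio", group)
```

With something like this, the command line keeps taking only the device
address, and the group is derived from it rather than named by the user; a
group fd passed in by a management tool would skip the lookup and only need
the address to pick the device within the group.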