> -----Original Message-----
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Monday, June 15, 2015 21:32
> To: Yehuda Yitschak
> Cc: Eric Auger; qemu-devel@nongnu.org; Yuval Caduri; Shadi Ammouri
> Subject: Re: Assigning an eth port to a guest VM
>
> On Mon, 2015-06-15 at 17:45 +0000, Yehuda Yitschak wrote:
> > ________________________________________
> > From: Alex Williamson <alex.william...@redhat.com>
> > Sent: Monday, June 15, 2015 8:15 PM
> > To: Yehuda Yitschak
> > Cc: Eric Auger; qemu-devel@nongnu.org; Yuval Caduri; Shadi Ammouri
> > Subject: Re: Assigning an eth port to a guest VM
> >
> > On Mon, 2015-06-15 at 16:52 +0000, Yehuda Yitschak wrote:
> > >> ________________________________________
> > >> From: Eric Auger <eric.au...@linaro.org>
> > >> Sent: Monday, June 15, 2015 4:42 PM
> > >> To: Yehuda Yitschak; qemu-devel@nongnu.org
> > >> Cc: Yuval Caduri; Shadi Ammouri
> > >> Subject: Re: Assigning an eth port to a guest VM
> > >>
> > >> Hi Yehuda,
> > >> On 06/15/2015 01:01 PM, Yehuda Yitschak wrote:
> > >> >> Cc: Eric Auger
> > >> >>
> > >> >>> -----Original Message-----
> > >> >>> From: Yehuda Yitschak
> > >> >>> Sent: Monday, June 15, 2015 9:35
> > >> >>> To: qemu-devel@nongnu.org
> > >> >>> Cc: Yuval Caduri; Shadi Ammouri
> > >> >>> Subject: Assigning an eth port to a guest VM
> > >> >>>
> > >> >>> Hello
> > >> >>>
> > >> >>> I would like to ask your advice on how to assign a
> > >> >>> semi-virtualized Ethernet port to a guest VM.
> > >> >>>
> > >> >>> The eth port's HW partially supports virtualization since the
> > >> >>> data path MMIO registers (which control rx/tx operation) are
> > >> >>> duplicated per VM.
> > >> >>> So for the run-time operation the guest can directly access the
> > >> >>> MMIO registers, using VFIO-PLATFORM, and enjoy the
> > >> >>> performance benefit.
> > >> >>>
> > >> >>> However, for the initial setup and occasional configuration the
> > >> >>> guest needs to access control path registers which are shared
> > >> >>> by all guests.
> > >> >>> AFAIK this is usually done with HW emulation using trap &
> > >> >>> emulate with QEMU.
> > >> >>> So, to the best of my knowledge I need a mix of VFIO and HW
> > >> >>> emulation to get the port to work with device assignment, right?
> > >> > Yes, to me you're correct.
> > >> >>>
> > >> >>> Are there any standard methods for achieving this?
> > >> >>> Is there an example of such existing HW in QEMU?
> > >> > Not yet unfortunately. To my knowledge the only platform devices
> > >> > that were assigned with QEMU VFIO platform were standalone
> > >> > duplicated devices, PL330, Calxeda Xgmac, SATA. So you are a
> > >> > trailblazer on that track.
> > >>
> > >> Thanks. It's good to know the diagnosis :-)
> > >>
> > >> BTW - I thought SR-IOV uses a somewhat similar concept. AFAIK each
> > >> virtual function (VF) gets a set of registers enabling it to
> > >> perform data path operations, but most of the configuration and
> > >> management operations are controlled by the host using the
> > >> Physical Function (PF) driver.
> > >> Are you familiar with that?
> > >> I know SR-IOV is not related to VFIO-PLATFORM, but if the mix of
> > >> direct access and emulation exists there as well then maybe I can
> > >> borrow some concepts.
> >
> > > The difference for SR-IOV is that emulation of shared resources is
> > > done almost entirely in the hardware. The PF configures the VFs and
> > > may interact with them to some degree at runtime, but VFs are largely
> > > separate devices from a software perspective.
> >
> > > The first question I would have for your device is whether there is
> > > IOMMU isolation between the individual "functions".
> >
> > Yes. IOMMU isolation is possible.
> >
> > > If not, there's really nothing vfio can help with and they probably
> > > ought to be used more as a macvtap interface. If there is
> > > isolation, then I'd assume we'd configure the device for direct
> > > access to the duplicated registers and trap to QEMU for the
> > > emulation portion.
> > > For things where the emulation portion needs to interact with the
> > > "PF", interfaces would need to be created in the kernel.
> >
> > Can you give a short example of such an interface?
> > Do you mean a special device or ioctl to handle the emulation request
> > from QEMU/VFIO?
>
> It's a trivial example, but with PCI we have a configuration space where the
> first 4 bytes expose the vendor and device ID of the device. With an SR-IOV
> VF, these bytes are not populated and are provided instead by the PF via the
> SR-IOV capability definition on the PF. The vfio-pci driver therefore exposes
> the static PF-defined vendor and device IDs through the VF config space. It's
> transparent to the user.
>
> I would hope we wouldn't need any sort of special device or ioctl. It sounds
> like the "PF" registers are separate and distinct from the "VF"
> registers, so the "PF" registers could be exposed through a separate VFIO
> memory region that does not allow mmap, forcing them to be trapped into
> QEMU and emulated in VFIO.
>
> > > The vfio-platform pieces specific to your device might be the
> > > logical place for that interaction with the PF to occur, i.e.
> > > emulation at the vfio-platform interface rather than in QEMU itself.
> > > Thanks,
> >
> > That sounds simpler than adding QEMU to the mix.
> > However, for that to happen we need to trap into the vfio-platform
> > driver, right?
> > Is that possible?
>
> Yes. The vfio-platform driver specific to this device would expose a memory
> region for those "VF" registers that does not allow mmap. The only access
> would be via read/write handlers. You could then emulate/gate/police
> access to those registers on the "PF" using kernel internal interfaces. It
> would be a kernel internal API for accessing the PF registers. Thanks,
Eric, Alex,

Thank you very much for all your answers and details.

From your answers it sounds like I need to extend vfio's resource query
mechanism to enable flagging certain resources as NO_MAP and then make
VFIO in QEMU act accordingly. That looks like the easier part.

The more complex part, in my view, is to manage the trap to the
vfio-platform driver and emulate the access.

In any case, I will take some time to process all this into a solution
and fill in some gaps in my knowledge.

Thanks again
Yehuda

> > Alex
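
For reference, the split discussed in this thread could be sketched roughly as below. This is only an illustration under stated assumptions, not code from QEMU or the kernel: the REGION_FLAG_* constants are local stand-ins that mirror the READ/WRITE/MMAP bits of VFIO_REGION_INFO_FLAG_* in <linux/vfio.h>, while struct region_desc, struct ctrl_block, and the functions choose_access_path(), ctrl_write() and ctrl_read() are hypothetical names made up for the example. A region without the mmap bit is reached only through the trapped read/write path, where access to the shared control registers can be gated per register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the VFIO_REGION_INFO_FLAG_* bits (assumption:
 * defined here so the sketch builds without kernel headers). */
#define REGION_FLAG_READ  (1u << 0)
#define REGION_FLAG_WRITE (1u << 1)
#define REGION_FLAG_MMAP  (1u << 2)

/* Hypothetical, trimmed-down region descriptor (not the real VFIO struct). */
struct region_desc {
    uint32_t flags;
    uint64_t size;
};

enum access_path { ACCESS_NONE, ACCESS_DIRECT_MMAP, ACCESS_TRAPPED_RW };

/* User-space side: map the region directly if allowed, otherwise fall back
 * to trapped read/write accesses. */
enum access_path choose_access_path(const struct region_desc *r)
{
    assert(r != NULL);
    if (r->flags & REGION_FLAG_MMAP)
        return ACCESS_DIRECT_MMAP;
    if (r->flags & (REGION_FLAG_READ | REGION_FLAG_WRITE))
        return ACCESS_TRAPPED_RW;
    return ACCESS_NONE;
}

/* Hypothetical shared control block: eight 32-bit registers plus a policy
 * mask saying which of them a guest is allowed to write. */
#define CTRL_NREGS 8
struct ctrl_block {
    uint32_t regs[CTRL_NREGS];
    uint32_t guest_writable;   /* bit i set => guest may write register i */
};

/* Trapped write: reject misaligned, out-of-range, or non-permitted offsets. */
bool ctrl_write(struct ctrl_block *cb, uint64_t off, uint32_t val)
{
    assert(cb != NULL);
    if (off % 4 != 0 || off / 4 >= CTRL_NREGS)
        return false;
    if (!(cb->guest_writable & (1u << (off / 4))))
        return false;
    cb->regs[off / 4] = val;
    return true;
}

/* Trapped read: any aligned, in-range register may be read in this sketch. */
bool ctrl_read(const struct ctrl_block *cb, uint64_t off, uint32_t *val)
{
    assert(cb != NULL && val != NULL);
    if (off % 4 != 0 || off / 4 >= CTRL_NREGS)
        return false;
    *val = cb->regs[off / 4];
    return true;
}
```

In a real driver the policy would come from the device-specific vfio-platform code and the register accesses would go to hardware through the kernel-internal PF interface; the in-memory array here only stands in for that.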