> -----Original Message----- > From: Peter Xu [mailto:pet...@redhat.com] > Sent: Thursday, April 20, 2017 11:04 AM > To: Lan, Tianyu <tianyu....@intel.com> > Cc: Liu, Yi L <yi.l....@intel.com>; qemu-devel@nongnu.org; Michael S . Tsirkin > <m...@redhat.com>; Jason Wang <jasow...@redhat.com>; Marcel Apfelbaum > <mar...@redhat.com>; David Gibson <da...@gibson.dropbear.id.au>; Tian, Kevin > <kevin.t...@intel.com> > Subject: Re: [Qemu-devel] [PATCH v2 7/7] intel_iommu: support passthrough (PT) > > On Wed, Apr 19, 2017 at 03:27:52PM +0800, Lan Tianyu wrote: > > Hi All: > > Sorry for later response. > > On 2017年04月18日 17:04, Liu, Yi L wrote: > > >> -----Original Message----- > > >> From: Peter Xu [mailto:pet...@redhat.com] > > >> Sent: Tuesday, April 18, 2017 3:27 PM > > >> To: Liu, Yi L <yi.l....@intel.com> > > >> Cc: qemu-devel@nongnu.org; Lan, Tianyu <tianyu....@intel.com>; Michael S > > >> . > > >> Tsirkin <m...@redhat.com>; Jason Wang <jasow...@redhat.com>; Marcel > > >> Apfelbaum <mar...@redhat.com>; David Gibson > > >> <da...@gibson.dropbear.id.au> > > >> Subject: Re: [Qemu-devel] [PATCH v2 7/7] intel_iommu: support > > >> passthrough (PT) > > >> > > >> On Tue, Apr 18, 2017 at 06:02:44AM +0000, Liu, Yi L wrote: > > >>>>> Hi Peter, > > >>>>> > > >>>>> Skip address space switching is a good idea to support Passthru mode. > > >>>>> However, without the address space, the vfio notifier would not > > >>>>> be registered, thus vIOMMU emulator has no way to connect to > > >>>>> host. It is no harm if there is only map/unmap notifier. But if > > >>>>> we have more notifiers other than map/unmap, it may be a problem. > > >>>>> > > >>>>> I think we need to reconsider it here. > > >>>> > > >>>> For now I think as switching is good to us in general. Could I > > >>>> know more context about this? Would it be okay to work on top of this > > >>>> in the > future? > > >>>> > > >>> > > >>> Let me explain. For vSVM enabling, it needs to add new notifier to > > >>> bind gPASID table to host. If an assigned device uses SVM in > > >>> guest, as designed guest IOMMU driver would set "pt" for it. This > > >>> is to make sure the host second-level page table stores GPA->HPA > > >>> mapping. So that pIOMMU can do nested translation to achieve gVA->GPA > GPA->HPA mapping. > > >> > > >> That should mean that in the guest it will only enable first level > > >> translation, while in the host it'll be nested (first + second)? > > >> Then you should be trapping the guest extended context entry > > >> invalidation, then pass the PASID table pointer downward to the host > > >> IOMMU, > am I correct? > > > > > > exactly. > > > > > >> > > >>> > > >>> So the situation is if we want to keep GPA->HPA mapping, we should > > >>> skip address space switch, but the vfio listener would not know > > >>> vIOMMU is added, so no vfio notifier would be registered. But if > > >>> we do the switch, the GPA->HPA mapping is unmapped. And need > > >>> another way to build the > > >> GPA->HPA mapping. > > >> > > >> If my understanding above is correct, current IOMMU notifier may > > >> not satisfy your need. If you see the current notify interface, > > >> it's delivering IOMMUTLBEntry. I suspect it only suites for IOTLB > > >> notifications, but not if you want to pass some pointer (one GPA) > > >> downward > somewhere. > > > > > > Yeah, you got it more than absolutely. One of my patch is to modify > > > the para to be void * and let each notifiers to translate > > > separately. I think it should be a reasonable change. > > > > > > Supposedly, I would send RFC for vSVM soon. We may talk more it at that > > > thread. > > I suspect whether it'll be good that we hang everything under current IOMMU > notifiers... But sure I'm looking forward to your RFC. :)
The current IOMMU notifier is no doubt MAP/UNMAP specific. However, I think it should be a common requirement that vIOMMU emulator needs notifiers other than MAP/UNMAP to notify pIOMMU. So I think it's better to show the basic design for vSVM. And discuss the best way to place the new notifiers. Would accelerate my work. And thx for your understanding. > > > > > >>> > > >>> I think we may have two choice here. Pls let me know your ideas. > > >>> > > >>> 1. skip the switch for "pt" mode, find other way to register vfio > > >>> notifiers. not sure if this is workable. Just a quick thought. > > >>> > > >>> 2. do the switch, and rebuild GPA->HPA mapping if device use "pt" > > >>> mode. For this option, I also have two way to build the GPA->HPA > > >>> mapping. > > >>> a) walk all the memory region sections in address_space_memory, > > >>> and build the > > >> mapping. > > >>> address_space_memory.dispatch->map.sections, sections stores all > > >>> the mapped > > >> sections. > > >>> > > >>> b) let guest iommu driver mimics a 1:1 mapping for the devices use > > >>> "pt" mode. in this way, the GPA->HPA mapping could be setup by > > >>> walking the guest SL page table. This is consistent with your vIOVA > > >>> replay > solution. > > >> > > >> I'll prefer (1). Reason explained above. > > >> > > >>> > > >>> Also I'm curious if Tianyu's fault report framework needs to register > > >>> new > notifiers. > > >> > > >> For Tianyu's case, I feel like we need other kind of notifier as > > >> well, but not the current IOTLB one. And, of course in this case > > >> it'll be in an reverted direction, which is from device to vIOMMU. > > >> > > >> (I am thinking whether it's a good time now to let any PCI device > > >> able to fetch its direct owner IOMMU object, then it can register > > >> anything it want onto that object > > >> maybe?) > > >> > > > Hmmm, let's see Tianyu's comment as he's driving fault report. Let's > > > keep in touch on this Passthru-Mode enabling. > > > > In my previous RFC patchset of fault event reporting, I registered > > fault notifier when there is a VFIO group attached to VFIO container > > and used the address space to check whether vIOMMU is added. The > > address space is returned by vtd_host_dma_iommu(). vtd_find_add_as() > > initializes device's IOMMU memory region and put it into device's > > address space as root memory region. > > Does this make sense? > > > > @@ -1103,6 +1132,14 @@ static int vfio_connect_container(VFIOGroup > > *group, AddressSpace *as, > > goto listener_release_exit; > > } > > > > + if (memory_region_is_iommu(container->space->as->root)) { > > I would suggest we don't play with as->root any more. After vtd vfio series, > this may > not be true if passthrough mode is enabled (then the root may be switched to > an > system memory alias). I don't know the best way to check this, one > alternative might > be that we check whether > container->space->as == system_memory(), it should be workable, but in > a slightly hackish way. In my understanding, container->space->as->root cannot work here no matter passthru-mode is enabled or not. The code here is aiming to check if vIOMMU exists. After the vfio series, the vtd_dev_as->root is not initialized to be a iommu MemoryRegion. Compared with checking if it is system_memory(), I think adding a mechanism to get the iommu MemoryRegion may be a better choice. Just like the current pci_device_iommu_address_space(). > > > + if (vfio_set_iommu_fault_notifier(container)) { > > + error_setg_errno(errp, -ret, > > + "Fail to set IOMMU fault notifier"); > > + goto listener_release_exit; > > + } > > + } > > + > > container->initialized = true; > > > > -- > > Best regards > > Tianyu Lan > > -- > Peter Xu Thanks, Yi L