* Alex Williamson <alex.william...@redhat.com> [2016-10-29 08:03:01 -0600]:
> On Sat, 29 Oct 2016 16:07:05 +0530
> Kirti Wankhede <kwankh...@nvidia.com> wrote:
>
> > On 10/29/2016 2:03 AM, Alex Williamson wrote:
> > > On Sat, 29 Oct 2016 01:32:35 +0530
> > > Kirti Wankhede <kwankh...@nvidia.com> wrote:
> > >
> > >> On 10/28/2016 6:10 PM, Alex Williamson wrote:
> > >>> On Fri, 28 Oct 2016 15:33:58 +0800
> > >>> Jike Song <jike.s...@intel.com> wrote:
> > >>>
> > ...
> > >>>>>
> > >>>>> +/*
> > >>>>> + * This function finds pfn in domain->external_addr_space->pfn_list for given
> > >>>>> + * iova range. If pfn exist, notify pfn to registered notifier list. On
> > >>>>> + * receiving notifier callback, vendor driver should invalidate the mapping and
> > >>>>> + * call vfio_unpin_pages() to unpin this pfn. With that vfio_pfn for this pfn
> > >>>>> + * gets removed from rb tree of pfn_list. That re-arranges rb tree, so while
> > >>>>> + * searching for next vfio_pfn in rb tree, start search from first node again.
> > >>>>> + * If any vendor driver doesn't unpin that pfn, vfio_pfn would not get removed
> > >>>>> + * from rb tree and so in next search vfio_pfn would be same as previous
> > >>>>> + * vfio_pfn. In that case, exit from loop.
> > >>>>> + */
> > >>>>> +static void vfio_notifier_call_chain(struct vfio_iommu *iommu,
> > >>>>> +                               struct vfio_iommu_type1_dma_unmap *unmap)
> > >>>>> +{
> > >>>>> +       struct vfio_domain *domain = iommu->external_domain;
> > >>>>> +       struct rb_node *n;
> > >>>>> +       struct vfio_pfn *vpfn = NULL, *prev_vpfn;
> > >>>>> +
> > >>>>> +       do {
> > >>>>> +               prev_vpfn = vpfn;
> > >>>>> +               mutex_lock(&domain->external_addr_space->pfn_list_lock);
> > >>>>> +
> > >>>>> +               n = rb_first(&domain->external_addr_space->pfn_list);
> > >>>>> +
> > >>>>> +               for (; n; n = rb_next(n), vpfn = NULL) {
> > >>>>> +                       vpfn = rb_entry(n, struct vfio_pfn, node);
> > >>>>> +
> > >>>>> +                       if ((vpfn->iova >= unmap->iova) &&
> > >>>>> +                           (vpfn->iova < unmap->iova + unmap->size))
> > >>>>> +                               break;
> > >>>>> +               }
> > >>>>> +
> > >>>>> +               mutex_unlock(&domain->external_addr_space->pfn_list_lock);
> > >>>>> +
> > >>>>> +               /* Notify any listeners about DMA_UNMAP */
> > >>>>> +               if (vpfn)
> > >>>>> +                       blocking_notifier_call_chain(&iommu->notifier,
> > >>>>> +                                               VFIO_IOMMU_NOTIFY_DMA_UNMAP,
> > >>>>> +                                               &vpfn->pfn);
> > >>>>
> > >>>> Hi Kirti,
> > >>>>
> > >>>> The information carried by notifier is only a pfn.
> > >>>>
> > >>>> Since your pin/unpin interfaces design, it's the vendor driver who should
> > >>>> guarantee pin/unpin same times. To achieve that, the vendor driver must
> > >>>> cache it's iova->pfn mapping on its side, to avoid pinning a same page
> > >>>> for multiple times.
> > >>>>
> > >>>> With the notifier carrying only a pfn, to find the iova by this pfn,
> > >>>> the vendor driver must *also* keep a reverse-mapping. That's a bit
> > >>>> too much.
> > >>>>
> > >>>> Since the vendor could also suffer from IOMMU-compatible problem,
> > >>>> which means a local cache is always helpful, so I'd like to have the
> > >>>> iova carried to the notifier.
> > >>>>
> > >>>> What'd you say?
> > >>>
> > >>> I agree, the pfn is not unique, multiple guest pfns (iovas) might be
> > >>> backed by the same host pfn.  DMA_UNMAP calls are based on iova, the
> > >>> notifier through to the vendor driver must be based on the same.
> > >>
> > >> Host pfn should be unique, right?
> > >
> > > Let's say a user does a malloc of a single page and does 100 calls to
> > > MAP_DMA populating 100 pages of IOVA space all backed by the same
> > > malloc'd page.  This is valid, I have unit tests that do essentially
> > > this.  Those will all have the same pfn.  The user then does an
> > > UNMAP_DMA to a single one of those IOVA pages.  Did the user unmap
> > > everything matching that pfn?  Of course not, they only unmapped that
> > > one IOVA page.  There is no guarantee of a 1:1 mapping of pfn to IOVA.
> > > UNMAP_DMA works based on IOVA.  Invalidation broadcasts to the vendor
> > > driver MUST therefore also work based on IOVA.  This is not an academic
> > > problem, address space aliases exist in real VMs, imagine a virtual
> > > IOMMU.  Thanks,
> > >
> >
> > So struct vfio_iommu_type1_dma_unmap should be passed as argument to
> > notifier callback:
> >
> >         if (unmapped && iommu->external_domain)
> > -               vfio_notifier_call_chain(iommu, unmap);
> > +               blocking_notifier_call_chain(&iommu->notifier,
> > +                                            VFIO_IOMMU_NOTIFY_DMA_UNMAP,
> > +                                            unmap);
> >
> > Then vendor driver should find pfns he has pinned from this range of
> > iovas, then invalidate and unpin pfns. Right?
>
> That seems like a valid choice.  It's probably better than calling the
> notifier for each page of iova.  Thanks,
>
> Alex
>

Hi Kirti,

This version requires the *vendor driver* to call vfio_register_notifier
for an mdev device before any pinning operations. I guess all of the
vendor drivers will end up with similar code for notifier
registration/unregistration.

My question is: how about letting the mdev framework manage the notifier
registration/unregistration process?
We could add a notifier_fn_t callback to "struct parent_ops"; the mdev
framework would then make sure that the vendor driver has assigned a value
to this callback. The mdev core could initialize a notifier_block for each
parent driver with its callback, and register/unregister it with vfio at
the right time.

--
Dong Jia