* Kirti Wankhede <kwankh...@nvidia.com> [2016-11-01 13:17:19 +0530]:
>
>
> On 11/1/2016 9:15 AM, Dong Jia Shi wrote:
> > * Alex Williamson <alex.william...@redhat.com> [2016-10-29 08:03:01 -0600]:
> >
> >> On Sat, 29 Oct 2016 16:07:05 +0530
> >> Kirti Wankhede <kwankh...@nvidia.com> wrote:
> >>
> >>> On 10/29/2016 2:03 AM, Alex Williamson wrote:
> >>>> On Sat, 29 Oct 2016 01:32:35 +0530
> >>>> Kirti Wankhede <kwankh...@nvidia.com> wrote:
> >>>>
> >>>>> On 10/28/2016 6:10 PM, Alex Williamson wrote:
> >>>>>> On Fri, 28 Oct 2016 15:33:58 +0800
> >>>>>> Jike Song <jike.s...@intel.com> wrote:
> >>>>>>
> >>> ...
> >>>>>>>>
> >>>>>>>> +/*
> >>>>>>>> + * This function finds pfn in domain->external_addr_space->pfn_list for given
> >>>>>>>> + * iova range. If pfn exist, notify pfn to registered notifier list. On
> >>>>>>>> + * receiving notifier callback, vendor driver should invalidate the mapping and
> >>>>>>>> + * call vfio_unpin_pages() to unpin this pfn. With that vfio_pfn for this pfn
> >>>>>>>> + * gets removed from rb tree of pfn_list. That re-arranges rb tree, so while
> >>>>>>>> + * searching for next vfio_pfn in rb tree, start search from first node again.
> >>>>>>>> + * If any vendor driver doesn't unpin that pfn, vfio_pfn would not get removed
> >>>>>>>> + * from rb tree and so in next search vfio_pfn would be same as previous
> >>>>>>>> + * vfio_pfn. In that case, exit from loop.
> >>>>>>>> + */
> >>>>>>>> +static void vfio_notifier_call_chain(struct vfio_iommu *iommu,
> >>>>>>>> +                                     struct vfio_iommu_type1_dma_unmap *unmap)
> >>>>>>>> +{
> >>>>>>>> +        struct vfio_domain *domain = iommu->external_domain;
> >>>>>>>> +        struct rb_node *n;
> >>>>>>>> +        struct vfio_pfn *vpfn = NULL, *prev_vpfn;
> >>>>>>>> +
> >>>>>>>> +        do {
> >>>>>>>> +                prev_vpfn = vpfn;
> >>>>>>>> +                mutex_lock(&domain->external_addr_space->pfn_list_lock);
> >>>>>>>> +
> >>>>>>>> +                n = rb_first(&domain->external_addr_space->pfn_list);
> >>>>>>>> +
> >>>>>>>> +                for (; n; n = rb_next(n), vpfn = NULL) {
> >>>>>>>> +                        vpfn = rb_entry(n, struct vfio_pfn, node);
> >>>>>>>> +
> >>>>>>>> +                        if ((vpfn->iova >= unmap->iova) &&
> >>>>>>>> +                            (vpfn->iova < unmap->iova + unmap->size))
> >>>>>>>> +                                break;
> >>>>>>>> +                }
> >>>>>>>> +
> >>>>>>>> +                mutex_unlock(&domain->external_addr_space->pfn_list_lock);
> >>>>>>>> +
> >>>>>>>> +                /* Notify any listeners about DMA_UNMAP */
> >>>>>>>> +                if (vpfn)
> >>>>>>>> +                        blocking_notifier_call_chain(&iommu->notifier,
> >>>>>>>> +                                                VFIO_IOMMU_NOTIFY_DMA_UNMAP,
> >>>>>>>> +                                                &vpfn->pfn);
> >>>>>>>
> >>>>>>> Hi Kirti,
> >>>>>>>
> >>>>>>> The information carried by notifier is only a pfn.
> >>>>>>>
> >>>>>>> Since your pin/unpin interfaces design, it's the vendor driver who should
> >>>>>>> guarantee pin/unpin same times. To achieve that, the vendor driver must
> >>>>>>> cache its iova->pfn mapping on its side, to avoid pinning the same page
> >>>>>>> multiple times.
> >>>>>>>
> >>>>>>> With the notifier carrying only a pfn, to find the iova by this pfn,
> >>>>>>> the vendor driver must *also* keep a reverse-mapping. That's a bit
> >>>>>>> too much.
> >>>>>>>
> >>>>>>> Since the vendor could also suffer from IOMMU-compatible problem,
> >>>>>>> which means a local cache is always helpful, so I'd like to have the
> >>>>>>> iova carried to the notifier.
> >>>>>>>
> >>>>>>> What'd you say?
> >>>>>>
> >>>>>> I agree, the pfn is not unique, multiple guest pfns (iovas) might be
> >>>>>> backed by the same host pfn. DMA_UNMAP calls are based on iova, the
> >>>>>> notifier through to the vendor driver must be based on the same.
> >>>>>
> >>>>> Host pfn should be unique, right?
> >>>>
> >>>> Let's say a user does a malloc of a single page and does 100 calls to
> >>>> MAP_DMA populating 100 pages of IOVA space all backed by the same
> >>>> malloc'd page. This is valid, I have unit tests that do essentially
> >>>> this. Those will all have the same pfn. The user then does an
> >>>> UNMAP_DMA to a single one of those IOVA pages. Did the user unmap
> >>>> everything matching that pfn? Of course not, they only unmapped that
> >>>> one IOVA page. There is no guarantee of a 1:1 mapping of pfn to IOVA.
> >>>> UNMAP_DMA works based on IOVA. Invalidation broadcasts to the vendor
> >>>> driver MUST therefore also work based on IOVA. This is not an academic
> >>>> problem, address space aliases exist in real VMs, imagine a virtual
> >>>> IOMMU. Thanks,
> >>>>
> >>>
> >>>
> >>> So struct vfio_iommu_type1_dma_unmap should be passed as argument to
> >>> notifier callback:
> >>>
> >>>         if (unmapped && iommu->external_domain)
> >>> -               vfio_notifier_call_chain(iommu, unmap);
> >>> +               blocking_notifier_call_chain(&iommu->notifier,
> >>> +                                           VFIO_IOMMU_NOTIFY_DMA_UNMAP,
> >>> +                                           unmap);
> >>>
> >>> Then vendor driver should find pfns he has pinned from this range of
> >>> iovas, then invalidate and unpin pfns. Right?
> >>
> >> That seems like a valid choice. It's probably better than calling the
> >> notifier for each page of iova. Thanks,
> >>
> >> Alex
> >>
> > Hi Kirti,
> >
> > This version requires the *vendor driver* call vfio_register_notifier
> > for an mdev device before any pinning operations. I guess all of the
> > vendor drivers may have some alike code for notifier
> > registration/unregistration.
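
For reference, on the range-based notification agreed on earlier in the thread: below is a rough userspace sketch (not kernel code) of what a vendor driver's handler might do when it is handed an iova range instead of a pfn. Everything in it is an assumption for illustration: struct pinned_page, struct unmap_range, the fixed-size cache[] and the helper names are all hypothetical, and the real driver would invalidate its mapping and call vfio_unpin_pages() where the comment says so. It replays Alex's aliasing case on a small scale: four IOVA pages backed by one host pfn, one of them unmapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ    4096ULL
#define MAX_PINNED 16

/* Hypothetical vendor-side cache entry: one pinned page, keyed by iova. */
struct pinned_page {
    uint64_t iova;
    uint64_t pfn;
    int in_use;
};

/* Stand-in for the iova/size pair carried by the unmap request. */
struct unmap_range {
    uint64_t iova;
    uint64_t size;
};

static struct pinned_page cache[MAX_PINNED];

/* Record a pinned page; returns 0 on success, -1 if the cache is full. */
static int cache_pin(uint64_t iova, uint64_t pfn)
{
    for (size_t i = 0; i < MAX_PINNED; i++) {
        if (!cache[i].in_use) {
            cache[i].iova = iova;
            cache[i].pfn = pfn;
            cache[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Number of pages currently held in the cache. */
static int count_pinned(void)
{
    int n = 0;
    for (size_t i = 0; i < MAX_PINNED; i++)
        n += cache[i].in_use;
    return n;
}

/* Range-based handler: drop every cached page whose iova lies in the range. */
static int on_dma_unmap(const struct unmap_range *r)
{
    int dropped = 0;
    for (size_t i = 0; i < MAX_PINNED; i++) {
        if (cache[i].in_use &&
            cache[i].iova >= r->iova &&
            cache[i].iova < r->iova + r->size) {
            /* A real driver would invalidate and unpin here. */
            cache[i].in_use = 0;
            dropped++;
        }
    }
    return dropped;
}

/* Four IOVA pages alias one pfn; unmap one page, return how many remain. */
static int alias_demo(void)
{
    const uint64_t base = 0x100000, pfn = 0x1234;

    memset(cache, 0, sizeof(cache));
    for (uint64_t i = 0; i < 4; i++)
        assert(cache_pin(base + i * PAGE_SZ, pfn) == 0);

    struct unmap_range r = { .iova = base + 2 * PAGE_SZ, .size = PAGE_SZ };
    assert(on_dma_unmap(&r) == 1);   /* only the one iova page goes away */
    return count_pinned();
}
```

Matching on pfn instead would have hit all four entries, which is the problem Alex describes. The linear scan is only for brevity; a real driver would more likely keep an iova-sorted tree.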
> >
> > My question is, how about letting the mdev framework manage the
> > notifier registration/unregistration process?
> >
> > We could add a notifier_fn_t callback to "struct parent_ops", then the
> > mdev framework should make sure that the vendor driver assigned a value
> > to this callback. The mdev core could initiate a notifier_block for each
> > parent driver with its callback, and register/unregister it to vfio at
> > the right time.
> >
>
> Module mdev_core is independent of VFIO so far and it should be
> independent of VFIO module.
>
> It's a good suggestion to have a notifier callback in parent_ops, but we
> shouldn't call vfio_register_notifier()/ vfio_unregister_notifier() from
> mdev core module. vfio_mdev module would take care to
> register/unregister notifier to vfio from its vfio_mdev_open()/
> vfio_mdev_release() call. That looks cleaner to me.

Nod.
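
Just to check that I understand the split, here is a small userspace model of it (not kernel code, and not the real vfio API). All names below are hypothetical stand-ins: notifier_cb plays the role of notifier_fn_t, struct notifier_slot models a one-entry notifier chain, and model_vfio_mdev_open()/model_vfio_mdev_release() mimic what vfio_mdev_open()/vfio_mdev_release() would do. Treating a NULL callback as "nothing to register" is my assumption.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's notifier callback type. */
typedef int (*notifier_cb)(unsigned long action, void *data);

/* Models a notifier chain that holds at most one callback. */
struct notifier_slot {
    notifier_cb cb;
};

/* Models the vendor driver's ops table with an optional callback. */
struct parent_ops_model {
    notifier_cb notifier;   /* NULL for drivers that never pin pages */
};

/* Models one mdev device as seen by the vfio_mdev layer. */
struct mdev_model {
    const struct parent_ops_model *ops;
    struct notifier_slot *vfio_chain;   /* owned by the "vfio" side */
    int registered;
};

/* Stand-in for registering with vfio; fails if the slot is taken. */
static int model_register_notifier(struct notifier_slot *chain, notifier_cb cb)
{
    if (chain->cb)
        return -1;
    chain->cb = cb;
    return 0;
}

/* Stand-in for unregistering from vfio. */
static void model_unregister_notifier(struct notifier_slot *chain)
{
    chain->cb = NULL;
}

/* Open path: register only when the vendor driver supplied a callback. */
static int model_vfio_mdev_open(struct mdev_model *m)
{
    if (!m->ops->notifier)
        return 0;   /* nothing to register, still a successful open */
    if (model_register_notifier(m->vfio_chain, m->ops->notifier))
        return -1;
    m->registered = 1;
    return 0;
}

/* Release path: undo the registration made at open time, if any. */
static void model_vfio_mdev_release(struct mdev_model *m)
{
    if (m->registered) {
        model_unregister_notifier(m->vfio_chain);
        m->registered = 0;
    }
}

/* Example vendor callback; a real one would unpin pages in the range. */
static int sample_unmap_cb(unsigned long action, void *data)
{
    (void)action;
    (void)data;
    return 0;
}
```

In this model the mdev core never touches the chain; only the open/release pair does, which matches the layering you describe.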
>
> Notifier callback in parent_ops should be optional since all vendor
> drivers might not pin/unpin pages, for example sample mtty driver.
> Notifier would be registered to vfio module only if the notifier
> callback is provided by vendor driver.
>
> If this looks reasonable, I'll have this patch in my next version of
> patch series.

Sounds reasonable to me. Thanks,

>
> Thanks,
> Kirti
>

-- 
Dong Jia