Re: [Intel-gfx] [PATCH v4 1/2] vfio: Replace the DMA unmapping notifier with a callback

2022-08-04 Thread Eric Farman
On Wed, 2022-07-20 at 17:04 -0600, Alex Williamson wrote:
> On Wed, 20 Jul 2022 17:08:29 -0300
> Jason Gunthorpe  wrote:
> 
> > On Wed, Jul 20, 2022 at 01:41:13PM -0600, Alex Williamson wrote:
> >  
> > > ie. we don't need the gfn, we only need the iova.  
> > 
> > Right, that makes sense
> >  
> > > However then I start to wonder why we're passing in 1 for the
> > > number of
> > > pages because this previously notifier, now callback is called
> > > for the
> > > entire vfio_dma range when we find any pinned pages.
> > 
> > Well, it is doing this because it only ever pins one page.
> 
> Of course that page is not necessarily the page it unpins given the
> contract misunderstanding below.
>  
> > The drivers are confused about what the contract is. vfio is
> > calling
> > the notifier with the entire IOVA range that is being unmapped and
> > the
> > drivers are expecting to receive notifications only for the IOVA
> > they
> > have actually pinned.
> > 
> > > Should ap and ccw implementations of .dma_unmap just be replaced
> > > with a
> > > BUG_ON(1)?  
> > 
> > The point of these callbacks is to halt concurrent DMA, and ccw
> > does
> > that today.
> 
> ccw essentially only checks whether the starting iova of the unmap is
> currently mapped.  If not it does nothing, if it is it tries to reset
> the device and unpin everything.  Chances are the first iova is not
> the
> one pinned, so we don't end up removing the pinned page and type1
> will
> eventually BUG_ON after a few tries.
> 
> > It looks like AP is missing a call to ap_aqic(), so it is
> > probably double wrong.
> 
> Thankfully the type1 unpinning path can't be tricked into unpinning
> something that wasn't pinned, so chances are the unpin call does
> nothing, with a small risk that it unpins another driver's pinned
> page,
> which might not yet have been notified and could still be using the
> page.  In the end, if ap did have a page pinned in the range, we'll
> hit
> the same BUG_ON as above.
> 
> > What I'd suggest is adding a WARN_ON that the dma->pfn_list is not
> > empty and leave these functions alone.
> 
> The BUG_ON still exists in type1.
> 
> Eric, Matt, Tony, Halil, JasonH, any quick fixes here?  ccw looks
> like
> it would be pretty straightforward to test against a range rather
> than
> a single iova.

Agreed, ccw looks pretty easy. Should I send something to go before
this series to make stable easier? (It's a trivial change in either
direction, so either is fine to me.)

Eric

>  
> > Most likely AP should be fixed to call vfio_ap_irq_disable() and to
> > check the q->saved_pfn against the IOVA.
> 
> Right, the q->saved_iova, perhaps calling vfio_ap_irq_disable() on
> finding a matching queue.
> 
> > But I'm inclined to leave this as-is for this series given we are
> > at
> > rc7.
> 
> On the grounds that it's no worse, maybe, but given the changes
> around this code hopefully we can submit fixes patches to stable if
> the
> backport isn't obvious and the BUG_ON in type1 is reachable.  Thanks,
> 
> Alex
> 



Re: [Intel-gfx] [PATCH v4 1/2] vfio: Replace the DMA unmapping notifier with a callback

2022-07-21 Thread Alex Williamson
On Thu, 21 Jul 2022 12:01:47 -0400
Eric Farman  wrote:

> On Wed, 2022-07-20 at 17:04 -0600, Alex Williamson wrote:
> > On Wed, 20 Jul 2022 17:08:29 -0300
> > Jason Gunthorpe  wrote:
> >   
> > > On Wed, Jul 20, 2022 at 01:41:13PM -0600, Alex Williamson wrote:
> > >
> > > > ie. we don't need the gfn, we only need the iova.
> > > 
> > > Right, that makes sense
> > >
> > > > However then I start to wonder why we're passing in 1 for the
> > > > number of
> > > > pages because this previously notifier, now callback is called
> > > > for the
> > > > entire vfio_dma range when we find any pinned pages.  
> > > 
> > > Well, it is doing this because it only ever pins one page.  
> > 
> > Of course that page is not necessarily the page it unpins given the
> > contract misunderstanding below.
> >
> > > The drivers are confused about what the contract is. vfio is
> > > calling
> > > the notifier with the entire IOVA range that is being unmapped and
> > > the
> > > drivers are expecting to receive notifications only for the IOVA
> > > they
> > > have actually pinned.
> > >   
> > > > Should ap and ccw implementations of .dma_unmap just be replaced
> > > > with a
> > > > BUG_ON(1)?
> > > 
> > > The point of these callbacks is to halt concurrent DMA, and ccw
> > > does
> > > that today.  
> > 
> > ccw essentially only checks whether the starting iova of the unmap is
> > currently mapped.  If not it does nothing, if it is it tries to reset
> > the device and unpin everything.  Chances are the first iova is not
> > the
> > one pinned, so we don't end up removing the pinned page and type1
> > will
> > eventually BUG_ON after a few tries.
> >   
> > > It looks like AP is missing a call to ap_aqic(), so it is
> > > probably double wrong.  
> > 
> > Thankfully the type1 unpinning path can't be tricked into unpinning
> > something that wasn't pinned, so chances are the unpin call does
> > nothing, with a small risk that it unpins another driver's pinned
> > page,
> > which might not yet have been notified and could still be using the
> > page.  In the end, if ap did have a page pinned in the range, we'll
> > hit
> > the same BUG_ON as above.
> >   
> > > What I'd suggest is adding a WARN_ON that the dma->pfn_list is not
> > > empty and leave these functions alone.  
> > 
> > The BUG_ON still exists in type1.
> > 
> > Eric, Matt, Tony, Halil, JasonH, any quick fixes here?  ccw looks
> > like
> > it would be pretty straightforward to test against a range rather
> > than
> > a single iova.  
> 
> Agreed, ccw looks pretty easy. Should I send something to go before
> this series to make stable easier? (It's a trivial change in either
> direction, so either is fine to me.)

It looks like we're expecting an rc8 for this development cycle, so the
merge window will be pushed out a week (which works better for some
upcoming PTO on my end), but if it's trivial either way let's plan for
the fix to follow Nicolin's and Jason's series and we can always post a
backport to the stable list if there's any trouble.  Thanks,

Alex



Re: [Intel-gfx] [PATCH v4 1/2] vfio: Replace the DMA unmapping notifier with a callback

2022-07-20 Thread Alex Williamson
On Wed, 20 Jul 2022 17:08:29 -0300
Jason Gunthorpe  wrote:

> On Wed, Jul 20, 2022 at 01:41:13PM -0600, Alex Williamson wrote:
>  
> > ie. we don't need the gfn, we only need the iova.  
> 
> Right, that makes sense
>  
> > However then I start to wonder why we're passing in 1 for the number of
> > pages because this previously notifier, now callback is called for the
> > entire vfio_dma range when we find any pinned pages.
> 
> Well, it is doing this because it only ever pins one page.

Of course that page is not necessarily the page it unpins given the
contract misunderstanding below.
 
> The drivers are confused about what the contract is. vfio is calling
> the notifier with the entire IOVA range that is being unmapped and the
> drivers are expecting to receive notifications only for the IOVA they
> have actually pinned.
> 
> > Should ap and ccw implementations of .dma_unmap just be replaced with a
> > BUG_ON(1)?  
> 
> The point of these callbacks is to halt concurrent DMA, and ccw does
> that today.

ccw essentially only checks whether the starting iova of the unmap is
currently mapped.  If not it does nothing, if it is it tries to reset
the device and unpin everything.  Chances are the first iova is not the
one pinned, so we don't end up removing the pinned page and type1 will
eventually BUG_ON after a few tries.

> It looks like AP is missing a call to ap_aqic(), so it is
> probably double wrong.

Thankfully the type1 unpinning path can't be tricked into unpinning
something that wasn't pinned, so chances are the unpin call does
nothing, with a small risk that it unpins another driver's pinned page,
which might not yet have been notified and could still be using the
page.  In the end, if ap did have a page pinned in the range, we'll hit
the same BUG_ON as above.

> What I'd suggest is adding a WARN_ON that the dma->pfn_list is not
> empty and leave these functions alone.

The BUG_ON still exists in type1.

Eric, Matt, Tony, Halil, JasonH, any quick fixes here?  ccw looks like
it would be pretty straightforward to test against a range rather than
a single iova.
 
> Most likely AP should be fixed to call vfio_ap_irq_disable() and to
> check the q->saved_pfn against the IOVA.

Right, the q->saved_iova, perhaps calling vfio_ap_irq_disable() on
finding a matching queue.

> But I'm inclined to leave this as-is for this series given we are at
> rc7.

On the grounds that it's no worse, maybe, but given the changes
around this code hopefully we can submit fixes patches to stable if the
backport isn't obvious and the BUG_ON in type1 is reachable.  Thanks,

Alex



Re: [Intel-gfx] [PATCH v4 1/2] vfio: Replace the DMA unmapping notifier with a callback

2022-07-20 Thread Alex Williamson
On Tue, 19 Jul 2022 21:02:48 -0300
Jason Gunthorpe  wrote:
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c 
> b/drivers/s390/crypto/vfio_ap_ops.c
> index a7d2a95796d360..bb1a1677c5c230 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -1226,34 +1226,14 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev 
> *matrix_mdev,
>   return 0;
>  }
>  
> -/**
> - * vfio_ap_mdev_iommu_notifier - IOMMU notifier callback
> - *
> - * @nb: The notifier block
> - * @action: Action to be taken
> - * @data: data associated with the request
> - *
> - * For an UNMAP request, unpin the guest IOVA (the NIB guest address we
> - * pinned before). Other requests are ignored.
> - *
> - * Return: for an UNMAP request, NOFITY_OK; otherwise NOTIFY_DONE.
> - */
> -static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
> -unsigned long action, void *data)
> +static void vfio_ap_mdev_dma_unmap(struct vfio_device *vdev, u64 iova,
> +u64 length)
>  {
> - struct ap_matrix_mdev *matrix_mdev;
> -
> - matrix_mdev = container_of(nb, struct ap_matrix_mdev, iommu_notifier);
> -
> - if (action == VFIO_IOMMU_NOTIFY_DMA_UNMAP) {
> - struct vfio_iommu_type1_dma_unmap *unmap = data;
> - unsigned long g_pfn = unmap->iova >> PAGE_SHIFT;
> -
> - vfio_unpin_pages(_mdev->vdev, _pfn, 1);
> - return NOTIFY_OK;
> - }
> + struct ap_matrix_mdev *matrix_mdev =
> + container_of(vdev, struct ap_matrix_mdev, vdev);
> + unsigned long g_pfn = iova >> PAGE_SHIFT;
>  
> - return NOTIFY_DONE;
> + vfio_unpin_pages(_mdev->vdev, _pfn, 1);
>  }
>  
>  /**


I tried to apply this on top of Nicolin's series which results in a
conflict that can be resolved as below:

diff --git a/drivers/s390/crypto/vfio_ap_ops.c 
b/drivers/s390/crypto/vfio_ap_ops.c
index e8856a7e151c..d7c38c82f694 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1219,33 +1219,13 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev 
*matrix_mdev,
return 0;
 }
 
-/**
- * vfio_ap_mdev_iommu_notifier - IOMMU notifier callback
- *
- * @nb: The notifier block
- * @action: Action to be taken
- * @data: data associated with the request
- *
- * For an UNMAP request, unpin the guest IOVA (the NIB guest address we
- * pinned before). Other requests are ignored.
- *
- * Return: for an UNMAP request, NOFITY_OK; otherwise NOTIFY_DONE.
- */
-static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
-  unsigned long action, void *data)
+static void vfio_ap_mdev_dma_unmap(struct vfio_device *vdev, u64 iova,
+  u64 length)
 {
-   struct ap_matrix_mdev *matrix_mdev;
-
-   matrix_mdev = container_of(nb, struct ap_matrix_mdev, iommu_notifier);
-
-   if (action == VFIO_IOMMU_NOTIFY_DMA_UNMAP) {
-   struct vfio_iommu_type1_dma_unmap *unmap = data;
-
-   vfio_unpin_pages(_mdev->vdev, unmap->iova, 1);
-   return NOTIFY_OK;
-   }
+   struct ap_matrix_mdev *matrix_mdev =
+   container_of(vdev, struct ap_matrix_mdev, vdev);
 
-   return NOTIFY_DONE;
+   vfio_unpin_pages(_mdev->vdev, iova, 1);
 }
 
 /**

ie. we don't need the gfn, we only need the iova.

However then I start to wonder why we're passing in 1 for the number of
pages because this previously notifier, now callback is called for the
entire vfio_dma range when we find any pinned pages.  It makes no sense for
a driver to assume that the first iova is pinned and is the only pinned
page.

ccw has the same issue:

static void vfio_ccw_dma_unmap(struct vfio_device *vdev, u64 iova, u64 length)
{
struct vfio_ccw_private *private =
container_of(vdev, struct vfio_ccw_private, vdev);

/* Drivers MUST unpin pages in response to an invalidation. */
if (!cp_iova_pinned(>cp, iova))
return;

vfio_ccw_mdev_reset(private);
}

Entirely ignoring the length arg.

It seems only GVT-g has this correct to actually look through the
extent of the range being unmapped:

static void intel_vgpu_dma_unmap(struct vfio_device *vfio_dev, u64 iova,
 u64 length)
{
struct intel_vgpu *vgpu = vfio_dev_to_vgpu(vfio_dev);
struct gvt_dma *entry;
u64 iov_pfn = iova >> PAGE_SHIFT;
u64 end_iov_pfn = iov_pfn + length / PAGE_SIZE;

mutex_lock(>cache_lock);
for (; iov_pfn < end_iov_pfn; iov_pfn++) {
entry = __gvt_cache_find_gfn(vgpu, iov_pfn);
if (!entry)
continue;

gvt_dma_unmap_page(vgpu, entry->gfn, entry->dma_addr,
   entry->size);
__gvt_cache_remove_entry(vgpu, entry);
}
mutex_unlock(>cache_lock);
}

Should ap and