On Mon, Jan 26, 2026 at 08:53:57PM +0000, Pranjal Shrivastava wrote:
> On Sat, Jan 24, 2026 at 09:14:16PM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <[email protected]>
> > 
> > dma-buf invalidation is handled asynchronously by the hardware, so VFIO
> > must wait until all affected objects have been fully invalidated.
> > 
> > In addition, the dma-buf exporter is expecting that all importers unmap any
> > buffers they previously mapped.
> > 
> > Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO 
> > regions")
> > Signed-off-by: Leon Romanovsky <[email protected]>
> > ---
> >  drivers/vfio/pci/vfio_pci_dmabuf.c | 71 
> > ++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 68 insertions(+), 3 deletions(-)

<...>

> > @@ -333,7 +359,37 @@ void vfio_pci_dma_buf_move(struct vfio_pci_core_device 
> > *vdev, bool revoked)
> >                     dma_resv_lock(priv->dmabuf->resv, NULL);
> >                     priv->revoked = revoked;
> >                     dma_buf_invalidate_mappings(priv->dmabuf);
> > +                   dma_resv_wait_timeout(priv->dmabuf->resv,
> > +                                         DMA_RESV_USAGE_BOOKKEEP, false,
> > +                                         MAX_SCHEDULE_TIMEOUT);
> >                     dma_resv_unlock(priv->dmabuf->resv);
> > +                   if (revoked) {
> > +                           kref_put(&priv->kref, vfio_pci_dma_buf_done);
> > +                           /* Let's wait till all DMA unmap are completed. 
> > */
> > +                           wait = wait_for_completion_timeout(
> > +                                   &priv->comp, secs_to_jiffies(1));
> 
> Is the 1-second constant sufficient for all hardware, or should the 
> invalidate_mappings() contract require the callback to block until 
> speculative reads are strictly fenced? I'm wondering about a case where
> a device's firmware has a high response latency, perhaps due to internal
> management tasks like error recovery or thermal and it exceeds the 1s 
> timeout. 
> 
> If the device is in the middle of a large DMA burst and the firmware is
> slow to flush the internal pipelines to a fully "quiesced"
> read-and-discard state, reclaiming the memory at exactly 1.001 seconds
> risks triggering platform-level faults..
> 
> Since the wen explicitly permit these speculative reads until unmap is
> complete, relying on a hardcoded timeout in the exporter seems to 
> introduce a hardware-dependent race condition that could compromise
> system stability via IOMMU errors or AER faults. 
> 
> Should the importer instead be required to guarantee that all 
> speculative access has ceased before the invalidation call returns?

It is guaranteed by the dma_resv_wait_timeout() call above. That call ensures
that the hardware has completed all pending operations. The 1‑second delay is
meant to catch cases where an in-kernel DMA unmap call is missing, which should
not trigger any DMA activity at that point.

So yes, one second is more than sufficient.

Thanks

> 
> Thanks
> Praan
> 
> > +                           /*
> > +                            * If you see this WARN_ON, it means that
> > +                            * importer didn't call unmap in response to
> > +                            * dma_buf_invalidate_mappings() which is not
> > +                            * allowed.
> > +                            */
> > +                           WARN(!wait,
> > +                                "Timed out waiting for DMABUF unmap, 
> > importer has a broken invalidate_mapping()");
> > +                   } else {
> > +                           /*
> > +                            * Kref is initialize again, because when revoke
> > +                            * was performed the reference counter was 
> > decreased
> > +                            * to zero to trigger completion.
> > +                            */
> > +                           kref_init(&priv->kref);
> > +                           /*
> > +                            * There is no need to wait as no mapping was
> > +                            * performed when the previous status was
> > +                            * priv->revoked == true.
> > +                            */
> > +                           reinit_completion(&priv->comp);
> > +                   }
> >             }
> >             fput(priv->dmabuf->file);
> >     }
> > @@ -346,6 +402,8 @@ void vfio_pci_dma_buf_cleanup(struct 
> > vfio_pci_core_device *vdev)
> >  
> >     down_write(&vdev->memory_lock);
> >     list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) {
> > +           unsigned long wait;
> > +
> >             if (!get_file_active(&priv->dmabuf->file))
> >                     continue;
> >  
> > @@ -354,7 +412,14 @@ void vfio_pci_dma_buf_cleanup(struct 
> > vfio_pci_core_device *vdev)
> >             priv->vdev = NULL;
> >             priv->revoked = true;
> >             dma_buf_invalidate_mappings(priv->dmabuf);
> > +           dma_resv_wait_timeout(priv->dmabuf->resv,
> > +                                 DMA_RESV_USAGE_BOOKKEEP, false,
> > +                                 MAX_SCHEDULE_TIMEOUT);
> >             dma_resv_unlock(priv->dmabuf->resv);
> > +           kref_put(&priv->kref, vfio_pci_dma_buf_done);
> > +           wait = wait_for_completion_timeout(&priv->comp,
> > +                                              secs_to_jiffies(1));
> > +           WARN_ON(!wait);
> >             vfio_device_put_registration(&vdev->vdev);
> >             fput(priv->dmabuf->file);
> >     }
> > 
> > -- 
> > 2.52.0
> > 
> > 
> 

Reply via email to