On Wed, Jan 21, 2026 at 04:28:17PM +0100, Christian König wrote:
> On 1/21/26 14:31, Jason Gunthorpe wrote:
> > On Wed, Jan 21, 2026 at 10:20:51AM +0100, Christian König wrote:
> >> On 1/20/26 15:07, Leon Romanovsky wrote:
> >>> From: Leon Romanovsky <[email protected]>
> >>>
> >>> dma-buf invalidation is performed asynchronously by hardware, so VFIO must
> >>> wait until all affected objects have been fully invalidated.
> >>>
> >>> Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO
> >>> regions")
> >>> Signed-off-by: Leon Romanovsky <[email protected]>
> >>
> >> Reviewed-by: Christian König <[email protected]>
> >>
> >> Please also keep in mind that while this waits for all fences for
> >> correctness, you also need to keep the mapping valid until
> >> dma_buf_unmap_attachment() has been called.
> >
> > Can you elaborate on this more?
> >
> > I think what we want for dma_buf_attach_revocable() is the strong
> > guarantee that the importer stops doing all access to the memory once
> > this sequence is completed and the exporter can rely on it. I don't
> > think this works any other way.
> >
> > This is already true for dynamic move capable importers, right?
>
> Not quite, no.

:( It is kind of shocking to hear these APIs work like this with such
a loose lifetime definition.

Leon, can you include some of these details in the new comments?
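To make that concrete, I'd expect the revoke path to end up looking
roughly like the sketch below. vfio_pci_dmabuf_revoke() is a name I
made up for illustration; only the dma_buf/dma_resv calls are the
existing APIs. The point is that move_notify only *requests* the
invalidation, and the fence wait is what makes it *complete*:

#include <linux/dma-buf.h>
#include <linux/dma-resv.h>
#include <linux/sched.h>

/* Illustrative sketch only, not the actual patch. */
static void vfio_pci_dmabuf_revoke(struct dma_buf *dmabuf)
{
	dma_resv_lock(dmabuf->resv, NULL);

	/* Ask every importer to invalidate its mapping... */
	dma_buf_move_notify(dmabuf);

	/*
	 * ...and wait for the fences the importers installed, since
	 * the hardware performs the invalidation asynchronously.
	 * Without this wait the invalidation has only been requested,
	 * not completed, and the device may still reach the MMIO BAR.
	 */
	dma_resv_wait_timeout(dmabuf->resv, DMA_RESV_USAGE_BOOKKEEP,
			      false, MAX_SCHEDULE_TIMEOUT);

	dma_resv_unlock(dmabuf->resv);
}

As far as I understand, DMA_RESV_USAGE_BOOKKEEP is the superset usage
class here, so the wait covers every fence installed on the object.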
> >> In other words you can only redirect the DMA-addresses previously
> >> given out into nirvana (or a dummy memory or similar), but you still
> >> need to avoid re-using them for something else.
> >
> > Does any driver do this? If you unload/reload a GPU driver it is
> > going to re-use the addresses handed out?
>
> I never fully read through all the source code, but if I'm not
> completely mistaken that is enforced for all GPU drivers through the
> DMA-buf and DRM layer lifetime handling, and I think even in other
> in-kernel frameworks like V4L, ALSA etc...
>
> What roughly happens is that each DMA-buf mapping through a couple
> of hoops keeps a reference on the device, so even after a hotplug
> event the device can only fully go away after all housekeeping
> structures are destroyed and buffers freed.

A simple reference on the device means nothing for these kinds of
questions. It does not stop unloading and reloading a driver.
Obviously if the driver is loaded fresh it will reallocate.

To do what you are saying, the DRM drivers would have to block during
driver remove until all unmaps happen.

> Background is that a lot of devices still make reads even after you
> have invalidated a mapping, but then discard the result.

And they also don't insert fences to conclude that?

> So when you don't have the same grace period you end up with PCI AER,
> warnings from the IOMMU, random accesses to PCI BARs which just happen
> to be in the old location of something, etc...

Yes, definitely. It is very important to have a definitive point in
the API where all accesses stop. While "read but discard" seems
harmless on the surface, there are corner cases where it is not OK.

Am I understanding right that these devices must finish their reads
before doing unmap?

> I would rather like to keep those semantics even for forceful
> shootdowns since it proved to be rather reliable.

We can investigate making unmap the barrier point if this is the
case.
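If unmap is to be the barrier, the importer-side contract would look
something like the sketch below. The my_imp_* names are invented for
illustration; only the dma_buf_*/dma_resv_* calls and the attach_ops
wiring are the real API. move_notify quiesces and fences but keeps
the addresses reserved; only dma_buf_unmap_attachment() releases them
for reuse:

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/dma-resv.h>

struct my_imp {
	struct dma_buf *dmabuf;
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
};

/* Invented helper: stop new submissions, fence in-flight work. */
static void my_imp_quiesce(struct my_imp *imp);

static void my_imp_move_notify(struct dma_buf_attachment *attach)
{
	struct my_imp *imp = attach->importer_priv;

	/*
	 * Quiesce and fence only.  Stray reads the device still issues
	 * must hit a dummy page; the DMA addresses covered by imp->sgt
	 * stay reserved and must not be handed out again yet.
	 */
	my_imp_quiesce(imp);
}

static const struct dma_buf_attach_ops my_imp_attach_ops = {
	.allow_peer2peer = true,
	.move_notify = my_imp_move_notify,
};

/* Only this point ends all access; afterwards the exporter may
 * recycle the DMA addresses. */
static void my_imp_teardown(struct my_imp *imp)
{
	dma_resv_lock(imp->dmabuf->resv, NULL);
	dma_buf_unmap_attachment(imp->attach, imp->sgt, DMA_BIDIRECTIONAL);
	dma_resv_unlock(imp->dmabuf->resv);
	dma_buf_detach(imp->dmabuf, imp->attach);
}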
Jason