On Tue, Mar 23, 2021 at 02:06:19PM -0700, Nadav Amit wrote:
> From: Nadav Amit <na...@vmware.com>
>
> Currently, IOMMU invalidations and device-IOTLB invalidations using
> AMD IOMMU fall back to full address-space invalidation if more than a
> single page need to be flushed.
>
> Full flushes are especially inefficient when the IOMMU is virtualized by
> a hypervisor, since it requires the hypervisor to synchronize the entire
> address-space.
>
> AMD IOMMUs allow to provide a mask to perform page-specific
> invalidations for multiple pages that match the address. The mask is
> encoded as part of the address, and the first zero bit in the address
> (in bits [51:12]) indicates the mask size.
>
> Use this hardware feature to perform selective IOMMU and IOTLB flushes.
> Combine the logic between both for better code reuse.
>
> The IOMMU invalidations passed a smoke-test. The device IOTLB
> invalidations are untested.
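
For reference, the mask encoding described in the quoted commit message can
be sketched in plain userspace C roughly as follows. This is only an
illustration and not the code from the patch: every `sketch_` name is
hypothetical, the separate "size" flag of the invalidation command is not
modeled, and the mapping "first zero bit at position k means a 2^(k+1)-byte
aligned range" is an assumption based on the description above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  (~((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1))
#define SKETCH_ADDR_TOP   51	/* highest address bit usable for the mask */

/* Index of the most significant set bit of a non-zero value. */
static int sketch_msb(uint64_t v)
{
	int i = -1;

	while (v) {
		v >>= 1;
		i++;
	}
	return i;
}

/*
 * Hypothetical helper: build the address field for invalidating
 * [addr, addr + size). Overflow of addr + size is not handled here.
 *
 * Returns  0 if a single page suffices (*field is the page address),
 *          1 if *field carries a multi-page mask,
 *         -1 if the range cannot be expressed and a full flush is needed.
 */
static int sketch_encode_range(uint64_t addr, uint64_t size, uint64_t *field)
{
	uint64_t end;
	int msb_diff;

	assert(size > 0);
	end = addr + size - 1;

	if ((addr >> SKETCH_PAGE_SHIFT) == (end >> SKETCH_PAGE_SHIFT)) {
		*field = addr & SKETCH_PAGE_MASK;
		return 0;
	}

	/* Highest bit in which the first and last byte addresses differ. */
	msb_diff = sketch_msb(addr ^ end);
	if (msb_diff > SKETCH_ADDR_TOP)
		return -1;

	/*
	 * Bit msb_diff is clear in addr (addr < end), so setting every bit
	 * below it makes it the first zero bit at or above bit 12.
	 */
	*field = (addr | ((UINT64_C(1) << msb_diff) - 1)) & SKETCH_PAGE_MASK;
	return 1;
}

/* Size in bytes of the aligned range that a masked field covers. */
static uint64_t sketch_decoded_size(uint64_t field)
{
	int bit;

	for (bit = SKETCH_PAGE_SHIFT; bit <= SKETCH_ADDR_TOP; bit++) {
		if (!(field & (UINT64_C(1) << bit)))
			return UINT64_C(1) << (bit + 1);
	}
	return 0;	/* no zero bit found: treat as "everything" */
}
```

Note that the covered range is always a naturally aligned power of two, so
an unaligned request invalidates more than was asked for: in this sketch,
flushing the two pages at 0x1000-0x2fff ends up covering 0x0000-0x3fff.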

Have you thoroughly tested this on real hardware? I had a patch-set doing
the same many years ago and it led to data corruption under load. Back
then it could have been a bug in my code, of course, but it made me
cautious about using targeted invalidations.

Regards,

	Joerg