> From: Jason Gunthorpe <j...@nvidia.com>
> Sent: Wednesday, June 1, 2022 7:11 AM
>
> On Tue, May 31, 2022 at 10:22:32PM +0100, Robin Murphy wrote:
>
> > There are only 3 instances where we'll free a table while the domain is
> > live. The first is the one legitimate race condition, where two map requests
> > targeting relatively nearby PTEs both go to fill in an intermediate level of
> > table; whoever loses that race frees the table they allocated, but it was
> > never visible to anyone else so that's definitely fine. The second is if
> > we're mapping a block entry, and find that there's already a table entry
> > there, wherein we assume the table must be empty, clear the entry,
> > invalidate any walk caches, install the block entry, then free the orphaned
> > table; since we're mapping the entire IOVA range covered by that table,
> > there should be no other operations on that IOVA range attempting to walk
> > the table at the same time, so it's fine.
>
> I saw these two in the Intel driver
>
> > The third is effectively the inverse, if we get a block-sized unmap
> > but find a table entry rather than a block at that point (on the
> > assumption that it's de-facto allowed for a single unmap to cover
> > multiple adjacent mappings as long as it does so exactly); similarly
> > we assume that the table must be full, and no other operations
> > should be racing because we're unmapping its whole IOVA range, so we
> > remove the table entry, invalidate, and free as before.
>
> Not sure I noticed this one though
>
> This all makes sense though.

The Intel driver also does this. See dma_pte_clear_level():

	/* If range covers entire pagetable, free it */
	if (start_pfn <= level_pfn &&
	    last_pfn >= level_pfn + level_size(level) - 1) {
		...
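
For anyone reading along without the source open, the coverage test above can be tried in userspace with something like the sketch below. This is not the driver code: the 9-bit stride, the level_size() definition and the covers_entire_entry() helper name are assumptions made for illustration, with level 1 taken to be the leaf level and PFNs counted in 4KiB pages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 9 index bits per level, i.e. 512 entries per table. */
#define LEVEL_STRIDE 9

/*
 * Number of page frames translated by a single entry at 'level'
 * (assumed: level 1 is the leaf level, so one entry covers one PFN).
 */
static inline uint64_t level_size(int level)
{
	return UINT64_C(1) << ((level - 1) * LEVEL_STRIDE);
}

/*
 * Hypothetical helper: true if the inclusive range [start_pfn, last_pfn]
 * covers everything translated by the entry at 'level' whose range begins
 * at level_pfn, i.e. the whole lower-level table that entry points to.
 */
static inline bool covers_entire_entry(uint64_t start_pfn, uint64_t last_pfn,
				       uint64_t level_pfn, int level)
{
	return start_pfn <= level_pfn &&
	       last_pfn >= level_pfn + level_size(level) - 1;
}
```

With these assumptions an entry at level 2 spans 512 PFNs, so an unmap of PFNs 0..511 covers the table under the first level-2 entry, while 0..510 or 1..1023 does not and would have to descend into it instead.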