>-----Original Message-----
>From: Peter Xu <pet...@redhat.com>
>Sent: Thursday, June 8, 2023 10:05 PM
>To: Duan, Zhenzhong <zhenzhong.d...@intel.com>; Jason Gunthorpe
><j...@nvidia.com>
>Cc: qemu-devel@nongnu.org; m...@redhat.com; jasow...@redhat.com;
>pbonz...@redhat.com; richard.hender...@linaro.org; edua...@habkost.net;
>marcel.apfelb...@gmail.com; alex.william...@redhat.com;
>c...@redhat.com; da...@redhat.com; phi...@linaro.org;
>kwankh...@nvidia.com; c...@nvidia.com; Liu, Yi L <yi.l....@intel.com>; Peng,
>Chao P <chao.p.p...@intel.com>
>Subject: Re: [PATCH v3 5/5] intel_iommu: Optimize out some unnecessary
>UNMAP calls
>
>On Thu, Jun 08, 2023 at 05:52:31PM +0800, Zhenzhong Duan wrote:
>> Commit 63b88968f1 ("intel-iommu: rework the page walk logic") added
>> logic to record mapped IOVA ranges so we only need to send MAP or
>> UNMAP when necessary. But there is still a corner case of unnecessary
>> UNMAP.
>>
>> During invalidation, either domain or device selective, we only need
>> to unmap when there are recorded mapped IOVA ranges, presuming most
>> OSes allocate IOVA ranges contiguously, e.g. on x86, Linux sets up
>> mappings from 0xffffffff downwards.
>>
>> Strace shows each UNMAP ioctl taking 0.000014s (14us), and we have 28
>> such ioctl()s in one invalidation, as the two notifiers on x86 are
>> split into power-of-2 pieces.
>>
>> ioctl(48, VFIO_IOMMU_UNMAP_DMA, 0x7ffffd5c42f0) = 0 <0.000014>
>
>Thanks for the numbers, but for a fair comparison IMHO it needs to be a
>comparison of before/after on the whole time used to unmap the AS. It'd
>be great to have finer-grained measurements like each ioctl, but the total
>time used should be more important (especially for the "after" case).
>(Side note: I don't think every UNMAP ioctl will take the same time; it
>should depend on whether a mapping exists.)
Yes, but what we want to optimize out is the case of unmapping a non-existent range. Will show the time diff spent in unmapping the AS.

>
>Actually it's hard to tell because this also depends on what's in the iova
>tree... but still, at least we know how it works in some cases.
>
>>
>> The other purpose of this patch is to eliminate the noisy error log
>> when we work with IOMMUFD. It looks like the duplicate UNMAP call fails
>> with IOMMUFD while it always succeeds with the legacy container. This
>> behavior difference leads to the below error log with IOMMUFD:
>>
>> IOMMU_IOAS_UNMAP failed: No such file or directory
>> vfio_container_dma_unmap(0x562012d6b6d0, 0x0, 0x80000000) = -2 (No
>> such file or directory)
>> IOMMU_IOAS_UNMAP failed: No such file or directory
>> vfio_container_dma_unmap(0x562012d6b6d0, 0x80000000,
>> 0x40000000) = -2 (No such file or directory)
>> ...
>
>My gut feeling is that the major motivation is actually this (not the
>perf): tens of ~14us ioctls is really nothing for a rare event.

To be honest, yes.

Thanks
Zhenzhong

>
>Jason Wang raised a question in a previous version, and I think JasonG's
>reply is here:
>
>https://lore.kernel.org/r/zhtaqxd3zybmh...@nvidia.com
>
>JasonG: sorry, I know zero about the iommufd API yet, but you said:
>
>  The VFIO emulation functions should do whatever VFIO does, is there
>  a mistake there?
>
>IIUC what VFIO does here is return success when unmapping over nothing,
>rather than failing like iommufd. Curious (like JasonW) why that retval?
>I'd assume that for returning "how much was unmapped" we could at least
>still return 0 for nothing.
>
>Are you suggesting that we could handle the -ENOENT on the QEMU side for
>iommufd only (a question for Yi)?
>
>If that's already a kernel ABI, I'm not sure whether it's even
>discussable, but just raising this.
>
>--
>Peter Xu