On Fri, May 15, 2026 at 11:33:21PM +0000, Michael Kelley wrote: > From: Michael Kelley <[email protected]> Sent: Friday, May 15, 2026 11:00 > AM > > > > From: Yu Zhang <[email protected]> Sent: Friday, May 15, 2026 > > 9:24 AM > > > > > > On Thu, May 14, 2026 at 06:14:22PM +0000, Michael Kelley wrote: > > > > From: Yu Zhang <[email protected]> Sent: Monday, May 11, > > > > 2026 9:24 AM > > > > > > > > > [....] > > > > > > > + unsigned long nr_pages = end_pfn - start_pfn; > > > > > + u16 count = 0; > > > > > + > > > > > + while (nr_pages > 0) { > > > > > + unsigned long flush_pages; > > > > > + int order; > > > > > + unsigned long pfn_align; > > > > > + unsigned long size_align; > > > > > + > > > > > + if (count >= HV_IOMMU_MAX_FLUSH_VA_COUNT) { > > > > > + count = HV_IOMMU_FLUSH_VA_OVERFLOW; > > > > > + break; > > > > > + } > > > > > + > > > > > + if (start_pfn) > > > > > + pfn_align = __ffs(start_pfn); > > > > > > > > I don't understand why __ffs() is correct here. I would expect > > > > __fls() so it is consistent with the calculation of size_align. But I > > > > can only surmise how the hypercall works since there's no > > > > documentation, so maybe my understanding of the hypercall is > > > > wrong. If __ffs really is correct, a comment explaining why > > > > would help. :-) > > > > > > > > > > The use of __ffs() is intentional. Each flush entry invalidates a > > > naturally aligned 2^N page block, and the hypervisor requires the > > > page_number to be aligned to 2^page_mask_shift. > > > > > > Here __ffs() and __fls() serve different purposes: > > > - __ffs(start_pfn) is about the alignment constraint, e.g., how > > > large a block can this address support? > > > - __fls(nr_pages) is about the size constraint, e.g., how large > > > a block can the remaining range hold? > > > > > > Taking min() of both ensures each entry is both properly aligned > > > and within bounds. > > > > > > Thanks for raising this - it definitely deserves a comment. I had to > > > stare at it for a while myself to remember why. :) > > > > Hmmm. Something about this still nags at me. I'll run some > > experiments to either convince myself that you are right, or to > > come up with a counterexample. > > I have resolved what was nagging at me. My understanding of how > _ffs() and __fls() work was wrong. :-( Your code is correct. > > > > > A related thought occurred to me. If each flush entry that is passed > > to Hyper-V describes a naturally aligned 2^N page block, I don't > > think the HV_IOMMU_MAX_FLUSH_VA_COUNT can ever > > be reached. The number of entries is limited by the number of > > bits in a PFN and the pages count, both of which are 64. And with > > 52 bit physical addressing and 4KiB pages, the actual size of > > a PFN and pages count is even smaller than 64. > > HV_IOMMU_MAX_FLUSH_VA_COUNT is the number of 8 byte > > union hv_iommu_flush_va entries that fit in a 4KiB page, so > > it's ~500. > > > > My statement applies to a single flush range. If multiple flush > > ranges were strung together in a single hypercall, a larger count > > could be reached, but hv_flush_device_domain_list() only does > > a single range. So I don't think the overflow case in > > hv_flush_device_domain_list() can ever happen. But let me > > do my experiments, and I will also look at this aspect to confirm > > if it's right. > > My experiments also convince me that the overflow case can't > happen as long as the hypercall is being populated with entries > derived from a single range.
Gosh. You're right. I think I've been overthinking this :( So we can simplify this to just do the list flush and fall back to HVCALL_FLUSH_DEVICE_DOMAIN only if the list flush fails. B.R. Yu

