On Thu, Jun 12, 2025 at 10:53:34AM -0700, Nicolin Chen wrote:
> On Thu, Jun 12, 2025 at 12:42:42PM -0300, Jason Gunthorpe wrote:
> > On Thu, Jun 12, 2025 at 05:23:01PM +0200, Thomas Weißschuh wrote:
> > > On Thu, Jun 12, 2025 at 11:58:01AM -0300, Jason Gunthorpe wrote:
> > > > On Thu, Jun 12, 2025 at 04:27:41PM +0200, Thomas Weißschuh wrote:
> > > > 
> > > > > If the assumption is that this is most likely a kernel bug,
> > > > > shouldn't it be fixed properly rather than worked around?
> > > > > After all the job of a selftest is to detect bugs to be fixed.
> > > > 
> > > > I investigated the history for a bit and it seems likely we cannot
> > > > change the kernel here. Call it an undocumented "feature".
> > > 
> > > I looked a bit and it seems to be mentioned in mmap(2):
> > > 
> > >   For mmap(), offset must be a multiple of the underlying huge page size.
> > >   The system automatically aligns length to be a multiple of the
> > >   underlying huge page size.
> > 
> > Oh there you go then :) Horrible design. No way for userspace to know
> > what the rounded up length actually was and thus no way for
> > userspace to unmap it.
> 
> OK. I think we would have to skip those cases then.

Or.. maybe we could just allocate a huge page:

@@ -2022,7 +2023,19 @@ FIXTURE_SETUP(iommufd_dirty_tracking)
        self->fd = open("/dev/iommu", O_RDWR);
        ASSERT_NE(-1, self->fd);

-       rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, variant->buffer_size);
+       if (variant->hugepages) {
+               /*
+                * Allocation must be aligned to the HUGEPAGE_SIZE, because the
+                * following mmap() will automatically align the length to be a
+                * multiple of the underlying huge page size. Failing to do the
+                * same at this allocation will result in a memory overwrite by
+                * the mmap().
+                */
+               size = __ALIGN_KERNEL(variant->buffer_size, HUGEPAGE_SIZE);
+       } else {
+               size = variant->buffer_size;
+       }
+       rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, size);
        if (rc || !self->buffer) {
                SKIP(return, "Skipping buffer_size=%lu due to errno=%d",
                           variant->buffer_size, rc);

This just upsizes the allocation, i.e. the test case will only
use the first 64MB or 128MB of the reserved 512MB huge page.
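
For illustration only, the rounding can be sketched as a standalone
userspace snippet. This is a rough sketch and not the selftest code:
align_up(), alloc_size() and alloc_test_buffer() are hypothetical names,
align_up() stands in for __ALIGN_KERNEL(), and the huge page size is
passed as a parameter rather than taken from HUGEPAGE_SIZE.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for __ALIGN_KERNEL(): round len up to the next
 * multiple of align. align must be a non-zero power of two.
 */
static size_t align_up(size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (len + align - 1) & ~(align - 1);
}

/*
 * Size to request from posix_memalign(). With huge pages, reserve a
 * whole multiple of the huge page size so that a later mmap(), which
 * rounds its length up the same way, stays inside the allocation.
 */
static size_t alloc_size(size_t buffer_size, size_t hugepage_size,
			 int hugepages)
{
	return hugepages ? align_up(buffer_size, hugepage_size) : buffer_size;
}

/* Returns a buffer aligned to hugepage_size, or NULL on failure. */
static void *alloc_test_buffer(size_t buffer_size, size_t hugepage_size,
			       int hugepages)
{
	void *buf = NULL;

	if (posix_memalign(&buf, hugepage_size,
			   alloc_size(buffer_size, hugepage_size, hugepages)))
		return NULL;
	return buf;
}
```

With a 512MB huge page, alloc_size() returns 512MB for both the 64MB
and the 128MB variants, and leaves the size unchanged when hugepages
is 0.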

Thanks
Nicolin
