Hi Krzysztof On Wed Nov 26, 2025 at 7:49 AM UTC, Krzysztof Karas wrote: > igt_mmap_migrate() tests migration with various parameters. > In one of the cases, where FILL and UNFAULTABLE flags are set, > during first stages of this test, a mock file is opened in > igt_mmap_offset(), which results in allocating GEM objects for > page table structures and scratch in GPU mappable memory. > > Then, also in igt_mmap_offset(), the file is closed (fput) and > the cleanup of these objects is scheduled on a delayed worqueue, > which is designed to execute after unspecified amount of time. > > Next, the test calls igt_fill_mappable() to fill mappable GPU > memory. At this point, three scenarios are possible > (N = max size of GPU memory for this test in MiB): > 1) the objects allocated for the mock file get cleaned up after > crucial part of the test is over, so the memory is full with > the 1 MiB they occupy and N - 1 MiB added by > igt_fill_mappable(), so the migration fails properly; > 2) the object cleanup fires before igt_fill_mappable() > completes, so the whole memory is populated with N MiB from > igt_fill_mappable(), so migration fails as well; > 3) the object cleanup is performed right after fill is done, > so only N - 1 MiB are in the mappable portion of GPU memory, > allowing the migration to succeed - we'd expect no space > left to perform migration, but an object was able to fit in > the remaining 1 MiB, which caused get_user() to succeed, so > a page fault did not fail. > > The test incorrectly assumes that the GPU mappable memory state > is unchanging during the test. Amend this by keeping the mock > file open until migration and page fault checking is complete. > > Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13929 > Signed-off-by: Krzysztof Karas <[email protected]> > --- > Resending, because the patch did not get any reviews. > > There are 7 GEM objects in total allocated as a result of mock > file creation: > a) 1 object from __i915_gem_object_create_lmem_with_ps() > -> i915_gem_object_create_region(), > b) 3 objects from i915_vm_alloc_pt_stash() -> alloc_pt() > -> alloc_pt_lmem(), > c) 3 objects from i915_vm_alloc_pt_stash() -> alloc_pd() > -> alloc_pt_lmem(). > > I inspected the behavior of this test on ATS-M and DG2 > platforms. The latter always freed the GEM objects originating > from mock file creation at the end of the test. ATS-M, on the > other hand, performed the release much sooner - around the time > a call to igt_fill_mappable() would be returning, so there was > not much leeway in the timing. I confirmed this by delaying the > fill operation by one second and that did away with the issue. > On the other end, adding a delay to __i915_gem_free_objects() > produced 100% reproduction rate of the issue. However, I felt > that juggling delays would not have resolved the problem, only > decreased the probability of this race, so I decided to increase > control over the contents of mappable memory instead. > > Chris Wilson had a suspicion that this patch papers over leaking > vm_area struct, which was addressed in > f768ebbba9110846c9f986a96109d70154d60b5d, but that did not > resolve the original issue. > > v2: > * change ownership of the file used in igt_mmap_offset* > functions to the caller (Krzysztof, Sebastian); > * rename igt_mmap_offset_get_file() to > igt_mmap_offset_with_file(); > > v3: > * remove double fput() call (Krzysztof); > > v4: > * extend the comment above mock_drm_getfile(); > * reword commit message to contain information about GEM > objects; > * rebase; > > .../drm/i915/gem/selftests/i915_gem_mman.c | 23 +++++++--- > drivers/gpu/drm/i915/selftests/igt_mmap.c | 46 +++++++++++++------ > drivers/gpu/drm/i915/selftests/igt_mmap.h | 8 ++++ > 3 files changed, 57 insertions(+), 20 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c > b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c > index 0d250d57496a..c561df41ba49 100644 > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c > @@ -1159,6 +1159,7 @@ static int __igt_mmap_migrate(struct > intel_memory_region **placements, > struct drm_i915_gem_object *obj; > struct i915_request *rq = NULL; > struct vm_area_struct *area; > + struct file *mock_file; > unsigned long addr; > LIST_HEAD(objects); > u64 offset; > @@ -1178,13 +1179,22 @@ static int __igt_mmap_migrate(struct > intel_memory_region **placements, > goto out_put; > > /* > - * This will eventually create a GEM context, due to opening dummy drm > - * file, which needs a tiny amount of mappable device memory for the top > - * level paging structures(and perhaps scratch), so make sure we > - * allocate early, to avoid tears. > + * Pretend to open("/dev/dri/card0"), which will eventually create a GEM > + * context along with multiple GEM objects (for paging structures and > + * scratch) that are placed in mappable portion of GPU memory. > + * Calling fput() on the file places objects' cleanup routines in > delayed > + * worqueues, which execute after unspecified amount of time. > + * Keep the file open until migration and page fault checks are done to > + * make sure object cleanup is not executed after igt_fill_mappable() > + * finishes and before migration is attempted - that would leave a gap > + * large enough for the migration to succeed, when we'd expect it to > fail. > */ > - addr = igt_mmap_offset(i915, offset, obj->base.size, > - PROT_WRITE, MAP_SHARED); > + mock_file = mock_drm_getfile(i915->drm.primary, O_RDWR); > + if (IS_ERR(mock_file)) > + return PTR_ERR(mock_file); > + > + addr = igt_mmap_offset_with_file(i915, offset, obj->base.size, > + PROT_WRITE, MAP_SHARED, mock_file); > if (IS_ERR_VALUE(addr)) { > err = addr; > goto out_put; > @@ -1295,6 +1305,7 @@ static int __igt_mmap_migrate(struct > intel_memory_region **placements, > vm_munmap(addr, obj->base.size); > > out_put: > + fput(mock_file); > i915_gem_object_put(obj); > igt_close_objects(i915, &objects); > return err; > diff --git a/drivers/gpu/drm/i915/selftests/igt_mmap.c > b/drivers/gpu/drm/i915/selftests/igt_mmap.c > index e920a461bd36..6f1b6d5cc2d3 100644 > --- a/drivers/gpu/drm/i915/selftests/igt_mmap.c > +++ b/drivers/gpu/drm/i915/selftests/igt_mmap.c > @@ -9,17 +9,22 @@ > #include "i915_drv.h" > #include "igt_mmap.h" > > -unsigned long igt_mmap_offset(struct drm_i915_private *i915, > - u64 offset, > - unsigned long size, > - unsigned long prot, > - unsigned long flags) > +unsigned long igt_mmap_offset_with_file(struct drm_i915_private *i915, > + u64 offset, > + unsigned long size, > + unsigned long prot, > + unsigned long flags, > + struct file *file) > { > struct drm_vma_offset_node *node; > - struct file *file; > unsigned long addr; > int err; > > + if (!file) { Minor suggestion, consider using WARN_ON() here for better debugging.
This patch seems okay from my perspective. Reviewed-by: Sebastian Brzezinka <[email protected]> -- Best regards, Sebastian
