Hi Krzysztof,

Thanks for looking at this.

On Thursday, 13 November 2025 08:55:01 CET Krzysztof Karas wrote:
> Hi Janusz,
> 
> > The smem-oom subtest forks SMEM helper processes from a loop run in the
> > main process.  That loop is supposed to be terminated only when exit
> > handler of a formerly forked LMEM process signals completion.  However,
> I'd add "eviction" keyword:
> ...formerly forked LMEM eviction process signals completion.

OK

> 
> > since the subtest arranges OOM conditions, the LMEM process may get killed
> > and never signal its completion.  When that happens, the subtest may keep
> > respawning SMEM helpers indefinitely and complete only when killed, e.g.,
> > by igt_runner on per-test timeout expiration.
> > 
> > Instead of waiting form completion of the loop of the SMEM helpers, run
> form -> for

Thanks.

> 
> > the loop in background and wait for completion of the LMEM process.  Also,
> > take care of signaling the SMEM helper processes about LMEM process
> > completion in case it has got killed and hasn't had a chance to do that
> > itself.
> > 
> > This patch addresses timeout results reported to the below mentioned
> > upstream issue.  Other failures (incomplete / dmesg-warn / crash) may
> > need additional patches, but let's fix those timeouts first to get a more
> > clear picture.
> > 
> > Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/5493
> > Signed-off-by: Janusz Krzysztofik <[email protected]>
> > ---
> >  tests/intel/gem_lmem_swapping.c | 19 ++++++++++++++-----
> >  1 file changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/tests/intel/gem_lmem_swapping.c 
> > b/tests/intel/gem_lmem_swapping.c
> > index 8e0dac42d8..45bd94adbb 100644
> > --- a/tests/intel/gem_lmem_swapping.c
> > +++ b/tests/intel/gem_lmem_swapping.c
> > @@ -678,6 +678,7 @@ static void test_smem_oom(int i915,
> >     const unsigned int num_alloc = 1 + smem_size / (alloc >> 20);
> >     struct igt_helper_process smem_proc = {};
> >     unsigned int n;
> > +   int lmem_err;
> >  
> >     lmem_done = mmap(0, sizeof(*lmem_done), PROT_WRITE,
> >                      MAP_SHARED | MAP_ANON, -1, 0);
> > @@ -703,8 +704,8 @@ static void test_smem_oom(int i915,
> >     }
> >  
> >     /* smem memory hog process, respawn till the lmem process completes */
> > -   while (!READ_ONCE(*lmem_done)) {
> > -           igt_fork_helper(&smem_proc) {
> > +   igt_fork_helper(&smem_proc) {
> > +           while (!READ_ONCE(*lmem_done)) {
> Huh, so instead of having a single helper that handles spawning
> grand-children for manual leaks, we had many of them. Since all
> but one processes were waiting for lelm_done flag, if the lmem
> eviction fork got killed, everything would spin indefinitely.
> Good catch!
> 
> >                     igt_fork(child, 1) {
> >                             for (int pass = 0; pass < num_alloc; pass++) {
> >                                     if (READ_ONCE(*lmem_done))
> > @@ -730,11 +731,19 @@ static void test_smem_oom(int i915,
> >                     for (n = 0; n < 2; n++)
> >                             wait(NULL);
> >             }
> > -           igt_wait_helper(&smem_proc);
> >     }
> > -   munmap(lmem_done, sizeof(*lmem_done));
> > +
> >     /* Reap exit status of the lmem process */
> > -   igt_waitchildren();
> > +   lmem_err = __igt_waitchildren();
> By changing igt_waitchildren() to __igt_waitchildren() you
> wanted to outline that this test is not supposed to run on multi
> GPU systems?

No, my intention was to avoid potential longjmp from inside 
igt_waitchildren()->igt_fail() if the lmem eviction process returns a failure, 
so we are still able to signal completion to the smem helpers as below.  Once 
cleaned up, we are then free to fail on lmem_err.

Thanks,
Janusz

> 
> > +
> > +   /* Make sure SMEM helpers stop even when the LMEM process gets killed */
> > +   if (lmem_err)
> > +           (*lmem_done)++;
> > +   munmap(lmem_done, sizeof(*lmem_done));
> > +
> > +   igt_wait_helper(&smem_proc);
> > +
> > +   igt_assert_eq(lmem_err, 0);
> >  }
> >  
> >  #define dynamic_lmem_subtest(reg, regs, subtest_name...) \
> 
> 




Reply via email to