Hi Krzysztof, Thanks for looking at this.
On Thursday, 13 November 2025 08:55:01 CET Krzysztof Karas wrote: > Hi Janusz, > > > The smem-oom subtest forks SMEM helper processes from a loop run in the > > main process. That loop is supposed to be terminated only when exit > > handler of a formerly forked LMEM process signals completion. However, > I'd add "eviction" keyword: > ...formerly forked LMEM eviction process signals completion. OK > > > since the subtest arranges OOM conditions, the LMEM process may get killed > > and never signal its completion. When that happens, the subtest may keep > > respawning SMEM helpers indefinitely and complete only when killed, e.g., > > by igt_runner on per-test timeout expiration. > > > > Instead of waiting form completion of the loop of the SMEM helpers, run > form -> for Thanks. > > > the loop in background and wait for completion of the LMEM process. Also, > > take care of signaling the SMEM helper processes about LMEM process > > completion in case it has got killed and hasn't had a chance to do that > > itself. > > > > This patch addresses timeout results reported to the below mentioned > > upstream issue. Other failures (incomplete / dmesg-warn / crash) may > > need additional patches, but let's fix those timeouts first to get a more > > clear picture. > > > > Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/5493 > > Signed-off-by: Janusz Krzysztofik <[email protected]> > > --- > > tests/intel/gem_lmem_swapping.c | 19 ++++++++++++++----- > > 1 file changed, 14 insertions(+), 5 deletions(-) > > > > diff --git a/tests/intel/gem_lmem_swapping.c > > b/tests/intel/gem_lmem_swapping.c > > index 8e0dac42d8..45bd94adbb 100644 > > --- a/tests/intel/gem_lmem_swapping.c > > +++ b/tests/intel/gem_lmem_swapping.c > > @@ -678,6 +678,7 @@ static void test_smem_oom(int i915, > > const unsigned int num_alloc = 1 + smem_size / (alloc >> 20); > > struct igt_helper_process smem_proc = {}; > > unsigned int n; > > + int lmem_err; > > > > lmem_done = mmap(0, sizeof(*lmem_done), PROT_WRITE, > > MAP_SHARED | MAP_ANON, -1, 0); > > @@ -703,8 +704,8 @@ static void test_smem_oom(int i915, > > } > > > > /* smem memory hog process, respawn till the lmem process completes */ > > - while (!READ_ONCE(*lmem_done)) { > > - igt_fork_helper(&smem_proc) { > > + igt_fork_helper(&smem_proc) { > > + while (!READ_ONCE(*lmem_done)) { > Huh, so instead of having a single helper that handles spawning > grand-children for manual leaks, we had many of them. Since all > but one processes were waiting for lelm_done flag, if the lmem > eviction fork got killed, everything would spin indefinitely. > Good catch! > > > igt_fork(child, 1) { > > for (int pass = 0; pass < num_alloc; pass++) { > > if (READ_ONCE(*lmem_done)) > > @@ -730,11 +731,19 @@ static void test_smem_oom(int i915, > > for (n = 0; n < 2; n++) > > wait(NULL); > > } > > - igt_wait_helper(&smem_proc); > > } > > - munmap(lmem_done, sizeof(*lmem_done)); > > + > > /* Reap exit status of the lmem process */ > > - igt_waitchildren(); > > + lmem_err = __igt_waitchildren(); > By changing igt_waitchildren() to __igt_waitchildren() you > wanted to outline that this test is not supposed to run on multi > GPU systems? No, my intention was to avoid potential longjmp from inside igt_waitchildren()->igt_fail() if the lmem eviction process returns a failure, so we are still able to signal completion to the smem helpers as below. Once cleaned up, we are then free to fail on lmem_err. Thanks, Janusz > > > + > > + /* Make sure SMEM helpers stop even when the LMEM process gets killed */ > > + if (lmem_err) > > + (*lmem_done)++; > > + munmap(lmem_done, sizeof(*lmem_done)); > > + > > + igt_wait_helper(&smem_proc); > > + > > + igt_assert_eq(lmem_err, 0); > > } > > > > #define dynamic_lmem_subtest(reg, regs, subtest_name...) \ > >
