On Monday, 7 July 2025 15:14:11 CEST Christian König wrote:
> On 07.07.25 14:25, Janusz Krzysztofik wrote:
> > Hi Christian,
> >
> > I've taken over that old issue and have a few questions for you.
>
> Thanks a lot for that, something really fishy seems to be going on here.
>
> > On Thursday, 27 February 2025 15:11:39 CEST Christian König wrote:
> ...
> >> Question is why the test is taking 26 (busy?) seconds to complete. That
> >> sounds really long even for a very old CPU.
> >
> > The failing test case operates on a dma fence chain built of 4096 dma
> > fences. Am I right that a single dma_fence_signal() call may result in up
> > to 4096 levels of nested sub-calls to dma_fence_chain_cb()?
>
> No, that could only happen if the test case is broken.
>
> See, the test is there to make sure that we don't overflow the kernel stack
> with many elements when enable_signaling is called.
>
> The callback is always only installed on the first unsignaled fence. So when
> the last fence is signaled, the dma_fence_chain_cb function is called
> exactly once, installs itself on the second to last fence, waits for that
> one to signal, and so on.
>
> What could be happening is that the test case calls enable_signaling on each
> of the 4096 dma_fence_chain elements individually, so each of the elements
> registers a dma_fence_chain_cb function, but then the test case would be
> rather broken and should probably be fixed.
>
> > The test case signals each fence of the chain in a loop, starting from the
> > last one. Then, dma_fence_chain_cb() is called 4096 * (4096 + 1) / 2 (an
> > arithmetic series) ~= 8.4 million times, isn't it?
>
> When the test case indeed enables signaling on all 4096 elements then yes.
>
> >
> > On the most powerful gen12 machines used in CI, that test case takes
> > slightly less than 3s; on low-end, few-years-old machines, ~10s. Should
> > we be surprised that it takes over 20s on the least powerful one (gen3
> > PineView Atom)? And, while reproducing the issue, I've never seen any
> > unrecoverable deadlocks. It just takes time to complete the loop.
> >
> > Does that address your doubts?
>
> No, not really.
>
> > Assuming the commit message will be extended to provide the above
> > explanation, I can still imagine a few ways of "fixing" that issue. We
> > can just limit the chain length and still execute all those
> > dma_fence_signal() calls without taking a breath, but why? Or we can
> > somehow measure expected processing speed before running the exercise and
> > limit the chain size accordingly at runtime, which in turn seems an
> > overcomplication to me. Or we can agree that there is no point in
> > avoiding that process being potentially swapped out from the CPU and take
> > the approach proposed by Nitin, perhaps limited to this particular
> > problematic test case. And I can see cond_resched() still used in recent
> > patches.
> >
> > Would any of those options be acceptable for you?
>
> No, if the test case really does what you describe here then the test is
> just broken and should potentially be fixed instead.
>
> There is no reason to test enabling signaling on each of the elements in a
> loop. So there should be something like 4096 calls to the
> dma_fence_chain_cb function, each jumping to the next unsignaled fence and
> re-installing the callback.
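
OK, then let me check I got the intended mechanism right. With a single
callback bubbling down the chain, signaling 4096 fences should cost ~4096
O(1) callback invocations in total, instead of the
4096 * 4097 / 2 = 8,390,656 I counted above. In pseudo-C, my understanding
is the following -- this is not the actual code from
drivers/dma-buf/dma-fence-chain.c, which also handles seqno ordering, RCU
and irq_work; next_unsignaled_link() is a made-up placeholder:

#include <linux/dma-fence.h>
#include <linux/dma-fence-chain.h>

/* Sketch only: the callback moves itself one link at a time. */
static void chain_cb(struct dma_fence *signaled, struct dma_fence_cb *cb)
{
	struct dma_fence_chain *chain = container_of(cb, typeof(*chain), cb);
	/* hypothetical helper standing in for the real chain walk */
	struct dma_fence *next = next_unsignaled_link(chain);

	if (next)
		dma_fence_add_callback(next, cb, chain_cb); /* move, don't recurse */
	else
		dma_fence_signal(&chain->base);

	dma_fence_put(signaled);
}

Is that roughly the picture?
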
So how should building a chain look in real use cases? When a user builds a
chain link from her fence and another fence, may she then enable signaling on
the new chain link? If that other fence happens to be a chain link itself,
who should take care of disabling signaling on it, so that signaling is
enabled only on the last link of the chain and we don't end up in a situation
similar to what we now have in the test case? IOW, what's the correct use
pattern of dma_fence_chain? I can't find that documented anywhere, neither in
inline docs nor in commit descriptions.
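
To make the question concrete, here is what I would naively do when adding a
fence as a new point on a timeline -- a sketch of my current reading of the
selftests only, where the refcounting comment is exactly one of the things
I'd like to see documented:

#include <linux/dma-fence.h>
#include <linux/dma-fence-chain.h>

/*
 * Naive sketch: append @fence as point @seqno on a timeline whose current
 * head is *@pprev. If I read the selftest helpers right,
 * dma_fence_chain_init() takes over the two references passed in.
 */
static int timeline_append(struct dma_fence **pprev,
			   struct dma_fence *fence, u64 seqno)
{
	struct dma_fence_chain *link = dma_fence_chain_alloc();

	if (!link)
		return -ENOMEM;

	dma_fence_chain_init(link, *pprev, dma_fence_get(fence), seqno);
	*pprev = &link->base;

	return 0;
}

And then a waiter would enable signaling (or add its callback) only on the
current head *pprev, never on the interior links -- is that the rule?
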
Thanks,
Janusz

> Regards,
> Christian.
>
> >
> > Thanks,
> > Janusz
> >
> >>
> >> Do we maybe have a udelay() here which should have been a usleep() or
> >> similar?
> >>
> >> Regards,
> >> Christian.
> >>
> >>>
> >>> Andi
> >>>
> >>>> ---
> >>>> Hi,
> >>>>
> >>>> For reviewer reference, adding here the watchdog issue seen on old
> >>>> machines during dma-fence-chain subtests testing. This log is
> >>>> retrieved from the device pstore log while testing dma-buf@all-tests:
> >>>>
> >>>> dma-buf: Running dma_fence_chain
> >>>> Panic#1 Part7
> >>>> <6> sizeof(dma_fence_chain)=184
> >>>> <6> dma-buf: Running dma_fence_chain/sanitycheck
> >>>> <6> dma-buf: Running dma_fence_chain/find_seqno
> >>>> <6> dma-buf: Running dma_fence_chain/find_signaled
> >>>> <6> dma-buf: Running dma_fence_chain/find_out_of_order
> >>>> <6> dma-buf: Running dma_fence_chain/find_gap
> >>>> <6> dma-buf: Running dma_fence_chain/find_race
> >>>> <6> Completed 4095 cycles
> >>>> <6> dma-buf: Running dma_fence_chain/signal_forward
> >>>> <6> dma-buf: Running dma_fence_chain/signal_backward
> >>>> <6> dma-buf: Running dma_fence_chain/wait_forward
> >>>> <6> dma-buf: Running dma_fence_chain/wait_backward
> >>>> <0> watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [dmabuf:2263]
> >>>> Panic#1 Part6
> >>>> <4> irq event stamp: 415735
> >>>> <4> hardirqs last enabled at (415734): [<ffffffff813d3a1b>] handle_softirqs+0xab/0x4d0
> >>>> <4> hardirqs last disabled at (415735): [<ffffffff827c7e31>] sysvec_apic_timer_interrupt+0x11/0xc0
> >>>> <4> softirqs last enabled at (415728): [<ffffffff813d3f8f>] __irq_exit_rcu+0x13f/0x160
> >>>> <4> softirqs last disabled at (415733): [<ffffffff813d3f8f>] __irq_exit_rcu+0x13f/0x160
> >>>> <4> CPU: 2 UID: 0 PID: 2263 Comm: dmabuf Not tainted 6.14.0-rc2-drm-next_483-g7b91683e7de7+ #1
> >>>> <4> Hardware name: Intel corporation NUC6CAYS/NUC6CAYB, BIOS AYAPLCEL.86A.0056.2018.0926.1100 09/26/2018
> >>>> <4> RIP: 0010:handle_softirqs+0xb1/0x4d0
> >>>> <4> RSP: 0018:ffffc90000154f60 EFLAGS: 00000246
> >>>> <4> RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
> >>>> <4> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >>>> <4> RBP: ffffc90000154fb8 R08: 0000000000000000 R09: 0000000000000000
> >>>> <4> R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000000a
> >>>> <4> R13: 0000000000000200 R14: 0000000000000200 R15: 0000000000400100
> >>>> <4> FS: 000077521c5cd940(0000) GS:ffff888277900000(0000) knlGS:0000000000000000
> >>>> Panic#1 Part5
> >>>> <4> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> <4> CR2: 00005dbfee8c00c4 CR3: 0000000133d38000 CR4: 00000000003526f0
> >>>> <4> Call Trace:
> >>>> <4> <IRQ>
> >>>> <4> ? show_regs+0x6c/0x80
> >>>> <4> ? watchdog_timer_fn+0x247/0x2d0
> >>>> <4> ? __pfx_watchdog_timer_fn+0x10/0x10
> >>>> <4> ? __hrtimer_run_queues+0x1d0/0x420
> >>>> <4> ? hrtimer_interrupt+0x116/0x290
> >>>> <4> ? __sysvec_apic_timer_interrupt+0x70/0x1e0
> >>>> <4> ? sysvec_apic_timer_interrupt+0x47/0xc0
> >>>> <4> ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
> >>>> <4> ? handle_softirqs+0xb1/0x4d0
> >>>> <4> __irq_exit_rcu+0x13f/0x160
> >>>> <4> irq_exit_rcu+0xe/0x20
> >>>> <4> sysvec_irq_work+0xa0/0xc0
> >>>> <4> </IRQ>
> >>>> <4> <TASK>
> >>>> <4> asm_sysvec_irq_work+0x1b/0x20
> >>>> <4> RIP: 0010:_raw_spin_unlock_irqrestore+0x57/0x80
> >>>> <4> RSP: 0018:ffffc9000292b8f0 EFLAGS: 00000246
> >>>> <4> RAX: 0000000000000000 RBX: ffff88810f235480 RCX: 0000000000000000
> >>>> <4> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >>>> <4> RBP: ffffc9000292b900 R08: 0000000000000000 R09: 0000000000000000
> >>>> <4> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000246
> >>>> <4> R13: 0000000000000000 R14: 0000000000000246 R15: 000000000003828c
> >>>> Panic#1 Part4
> >>>> <4> dma_fence_signal+0x49/0xb0
> >>>> <4> wait_backward+0xf8/0x140 [dmabuf_selftests]
> >>>> <4> __subtests+0x75/0x1f0 [dmabuf_selftests]
> >>>> <4> dma_fence_chain+0x94/0xe0 [dmabuf_selftests]
> >>>> <4> st_init+0x6a/0xff0 [dmabuf_selftests]
> >>>> <4> ? __pfx_st_init+0x10/0x10 [dmabuf_selftests]
> >>>> <4> do_one_initcall+0x79/0x400
> >>>> <4> do_init_module+0x97/0x2a0
> >>>> <4> load_module+0x2c23/0x2f60
> >>>> <4> init_module_from_file+0x97/0xe0
> >>>> <4> ? init_module_from_file+0x97/0xe0
> >>>> <4> idempotent_init_module+0x134/0x350
> >>>> <4> __x64_sys_finit_module+0x77/0x100
> >>>> <4> x64_sys_call+0x1f37/0x2650
> >>>> <4> do_syscall_64+0x91/0x180
> >>>> <4> ? trace_hardirqs_off+0x5d/0xe0
> >>>> <4> ? syscall_exit_to_user_mode+0x95/0x260
> >>>> <4> ? do_syscall_64+0x9d/0x180
> >>>> <4> ? do_syscall_64+0x9d/0x180
> >>>> <4> ? irqentry_exit+0x77/0xb0
> >>>> <4> ? sysvec_apic_timer_interrupt+0x57/0xc0
> >>>> <4> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >>>> <4> RIP: 0033:0x77521e72725d
> >>>>
> >>>>
> >>>>  drivers/dma-buf/st-dma-fence-chain.c | 14 +++++++++++---
> >>>>  1 file changed, 11 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/drivers/dma-buf/st-dma-fence-chain.c b/drivers/dma-buf/st-dma-fence-chain.c
> >>>> index ed4b323886e4..328a66ed59e5 100644
> >>>> --- a/drivers/dma-buf/st-dma-fence-chain.c
> >>>> +++ b/drivers/dma-buf/st-dma-fence-chain.c
> >>>> @@ -505,6 +505,7 @@ static int signal_forward(void *arg)
> >>>>  
> >>>>  	for (i = 0; i < fc.chain_length; i++) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>>  
> >>>>  		if (!dma_fence_is_signaled(fc.chains[i])) {
> >>>>  			pr_err("chain[%d] not signaled!\n", i);
> >>>> @@ -537,6 +538,7 @@ static int signal_backward(void *arg)
> >>>>  
> >>>>  	for (i = fc.chain_length; i--; ) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>>  
> >>>>  		if (i > 0 && dma_fence_is_signaled(fc.chains[i])) {
> >>>>  			pr_err("chain[%d] is signaled!\n", i);
> >>>> @@ -587,8 +589,10 @@ static int wait_forward(void *arg)
> >>>>  	get_task_struct(tsk);
> >>>>  	yield_to(tsk, true);
> >>>>  
> >>>> -	for (i = 0; i < fc.chain_length; i++)
> >>>> +	for (i = 0; i < fc.chain_length; i++) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>> +	}
> >>>>  
> >>>>  	err = kthread_stop_put(tsk);
> >>>>  
> >>>> @@ -616,8 +620,10 @@ static int wait_backward(void *arg)
> >>>>  	get_task_struct(tsk);
> >>>>  	yield_to(tsk, true);
> >>>>  
> >>>> -	for (i = fc.chain_length; i--; )
> >>>> +	for (i = fc.chain_length; i--; ) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>> +	}
> >>>>  
> >>>>  	err = kthread_stop_put(tsk);
> >>>>  
> >>>> @@ -663,8 +669,10 @@ static int wait_random(void *arg)
> >>>>  	get_task_struct(tsk);
> >>>>  	yield_to(tsk, true);
> >>>>  
> >>>> -	for (i = 0; i < fc.chain_length; i++)
> >>>> +	for (i = 0; i < fc.chain_length; i++) {
> >>>>  		dma_fence_signal(fc.fences[i]);
> >>>> +		cond_resched();
> >>>> +	}
> >>>>  
> >>>>  	err = kthread_stop_put(tsk);
> >>>>  
> >>
> >>
> >