On Fri, May 30, 2025 at 09:55:45AM +0800, Xiongfeng Wang wrote:
> Hi Joel,
>
> On 2025/5/29 0:30, Joel Fernandes wrote:
> > On Wed, May 21, 2025 at 5:43 AM Xiongfeng Wang
> > <wangxiongfe...@huawei.com> wrote:
> >>
> >> Hi RCU experts,
> >>
> >> When I ran syzkaller on Linux 6.6 with CONFIG_PREEMPT_RCU enabled, I got
> >> the following soft lockup. The call trace is too long, so I put it at the
> >> end. The issue can also be reproduced on the latest kernel.
> >>
> >> The issue is as follows: CPU3 is waiting for a spin_lock, which is held
> >> by CPU1, but CPU1 is stuck in the following endless loop.
> >>
> >> irq_exit()
> >>   __irq_exit_rcu()
> >>     /* in_hardirq() returns false after this */
> >>     preempt_count_sub(HARDIRQ_OFFSET)
> >>     tick_irq_exit()
> >>       tick_nohz_irq_exit()
> >>         tick_nohz_stop_sched_tick()
> >>           trace_tick_stop()  /* a bpf prog is hooked on this tracepoint */
> >>             __bpf_trace_tick_stop()
> >>               bpf_trace_run2()
> >>                 rcu_read_unlock_special()
> >>                   /* will send an IPI to itself */
> >>                   irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
> >>
> >> /* after interrupts are enabled again, the irq_work is handled */
> >> asm_sysvec_irq_work()
> >>   sysvec_irq_work()
> >>     irq_exit()  /* after handling the irq_work, we enter irq_exit() again */
> >>       __irq_exit_rcu()
> >>         ...skip...
> >>         /* we queue an irq_work again, and enter an endless loop */
> >>         irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
> >
> > This seems legitimate; Boqun and I were just talking about it. He may
> > share more thoughts, but here are a few:
> >
> > Maybe we can delay subsequent clearing of the flag in
> > rcu_preempt_deferred_qs_handler() using a timer and an exponential
> > back-off? That way we are not sending too many self-IPIs.
> >
> > And reset the process at the end of a grace period.
> >
> > Or just don't send subsequent self-IPIs if we just sent one for the
> > rdp. Chances are, if we did not get the scheduler's attention during
> > the first one, we may not in subsequent ones either. Plus we already send
> > other IPIs if the grace period is over-extended (from the FQS loop);
> > maybe we can tweak that?
>
> Thanks a lot for your reply. I think it's hard for me to fix this issue as
> described above without introducing new bugs, since I barely understand the
> RCU code. But I'm very glad to help test if you have any code modification
> that needs testing. I have the VM and the syzkaller workload that can
> reproduce the problem.
Sure, I understand. This is already incredibly valuable, so thank you again. I
will ask for your testing help soon. I also have a test module now which can
sort of reproduce this. Keep you posted!

thanks,

 - Joel
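
P.S. For concreteness, here is a very rough, untested sketch of the "don't send
another self-IPI while one is still outstanding for this rdp" idea discussed
above. It only illustrates the direction: the irq_work handler stops clearing
the pending flag immediately, and the flag is instead cleared once the deferred
quiescent state has actually been reported. The field name defer_qs_iw_pending
and the helper rcu_defer_qs_iw_done() are assumptions for illustration; the
real struct rcu_data layout, locking, and call sites would need to be checked
against the actual tree.

  /* Sketch only -- not a real patch. */
  static void rcu_preempt_deferred_qs_handler(struct irq_work *iwp)
  {
          /*
           * Intentionally do NOT clear rdp->defer_qs_iw_pending here.
           * Clearing it right away is what lets the next
           * rcu_read_unlock_special() -- reached again via
           * irq_exit() -> tick_nohz_stop_sched_tick() -> trace_tick_stop()
           * -- queue yet another self-IPI and livelock the CPU.
           */
  }

  /*
   * Hypothetical helper: call this from the point where the deferred
   * quiescent state is actually reported, so a new self-IPI is only
   * allowed after the previous one has done its job.
   */
  static void rcu_defer_qs_iw_done(struct rcu_data *rdp)
  {
          WRITE_ONCE(rdp->defer_qs_iw_pending, false);
  }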