On Thu, 26 Apr 2018 21:11:00 -0700
Joel Fernandes <joe...@google.com> wrote:
> I'm able to reproduce a lockdep splat when CONFIG_PROVE_LOCKING=y and
> CONFIG_PREEMPTIRQ_EVENTS=y.
>
> $ echo 1 > /d/tracing/events/preemptirq/preempt_enable/enable
> ---
> [   26.112609] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
> [   26.112636] WARNING: CPU: 0 PID: 118 at kernel/locking/lockdep.c:3854
> [...]
> [   26.144229] Call Trace:
> [   26.144926]  <IRQ>
> [   26.145506]  lock_acquire+0x55/0x1b0
> [   26.146499]  ? __do_softirq+0x46f/0x4d9
> [   26.147571]  ? __do_softirq+0x46f/0x4d9
> [   26.148646]  trace_preempt_on+0x8f/0x240
> [   26.149744]  ? trace_preempt_on+0x4d/0x240
> [   26.150862]  ? __do_softirq+0x46f/0x4d9
> [   26.151930]  preempt_count_sub+0x18a/0x1a0
> [   26.152985]  __do_softirq+0x46f/0x4d9
> [   26.153937]  irq_exit+0x68/0xe0
> [   26.154755]  smp_apic_timer_interrupt+0x271/0x280
> [   26.156056]  apic_timer_interrupt+0xf/0x20
> [   26.157105]  </IRQ>
>
> The problem is that lockdep's softirq annotation goes out of sync with
> the reality that softirqs are still off. This causes a lockdep splat
> because preempt_count_sub() can call into a preempt-off tracer or the
> trace events code, which in turn can call into lockdep *before*
> softirqs are really turned back on, tripping lockdep's softirq
> annotation check.

OK, so the issue was this:

        preempt_count = 1 << SOFTIRQ_SHIFT

        __local_bh_enable(cnt = 1 << SOFTIRQ_SHIFT) {
                if (softirq_count() == (cnt & SOFTIRQ_MASK)) {
                        trace_softirqs_on() {
                                current->softirqs_enabled = 1;
                        }
                }
                preempt_count_sub(cnt) {
                        trace_preempt_on() {
                                tracepoint() {
                                        rcu_read_lock_sched() {
                                                // jumps into lockdep

Where preempt_count still has softirqs disabled, but
current->softirqs_enabled is true, and we get a splat.

Your patch makes the tracing happen before we monkey with softirqs.
Yeah, looks good to me.

Reviewed-by: Steven Rostedt (VMware) <rost...@goodmis.org>

Thomas, you want to take this?

-- Steve

> The same issue was fixed in local_bh_disable_ip(), which has a comment
> explaining it:
>
>         /*
>          * The preempt tracer hooks into preempt_count_add and will break
>          * lockdep because it calls back into lockdep after SOFTIRQ_OFFSET
>          * is set and before current->softirq_enabled is cleared.
>          * We must manually increment preempt_count here and manually
>          * call the trace_preempt_off later.
>          */
>
> I have done a similar change to the local_bh_enable path to fix it.
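
For reference, the disable side already does this dance. A condensed
sketch of __local_bh_disable_ip() (paraphrased from kernels of this era,
with the debug-only bits trimmed, so not a verbatim copy) looks like:

        void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
        {
                unsigned long flags;

                raw_local_irq_save(flags);
                /*
                 * Bump the count with the non-tracing helper;
                 * preempt_count_add() would call back into lockdep while
                 * the softirq annotations are still out of sync.
                 */
                __preempt_count_add(cnt);

                /* Were softirqs turned off above? */
                if (softirq_count() == (cnt & SOFTIRQ_MASK))
                        trace_softirqs_off(ip);
                raw_local_irq_restore(flags);

                /* Annotations agree again; now it is safe to trace. */
                if (preempt_count() == cnt)
                        trace_preempt_off(CALLER_ADDR0, get_lock_parent_ip());
        }

The patch below mirrors that ordering on the enable side.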
> Cc: Steven Rostedt <rost...@goodmis.org>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: Mathieu Desnoyers <mathieu.desnoy...@efficios.com>
> Cc: Tom Zanussi <tom.zanu...@linux.intel.com>
> Cc: Namhyung Kim <namhy...@kernel.org>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Boqun Feng <boqun.f...@gmail.com>
> Cc: Paul McKenney <paul...@linux.vnet.ibm.com>
> Cc: Frederic Weisbecker <fweis...@gmail.com>
> Cc: Randy Dunlap <rdun...@infradead.org>
> Cc: Masami Hiramatsu <mhira...@kernel.org>
> Cc: Fengguang Wu <fengguang...@intel.com>
> Cc: Baohong Liu <baohong....@intel.com>
> Cc: Vedang Patel <vedang.pa...@intel.com>
> Cc: kernel-t...@android.com
> Signed-off-by: Joel Fernandes <joe...@google.com>
> ---
>  kernel/softirq.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 177de3640c78..8a040bcaa033 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -139,9 +139,13 @@ static void __local_bh_enable(unsigned int cnt)
>   {
>  	lockdep_assert_irqs_disabled();
>
> +	if (preempt_count() == cnt)
> +		trace_preempt_on(CALLER_ADDR0, get_lock_parent_ip());
> +
>  	if (softirq_count() == (cnt & SOFTIRQ_MASK))
>  		trace_softirqs_on(_RET_IP_);
> -	preempt_count_sub(cnt);
> +
> +	__preempt_count_sub(cnt);
>  }
>
>  /*
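
For what it's worth, the switch to __preempt_count_sub() is what keeps
the event from firing twice. The traced variant is roughly this
(condensed from kernel/sched/core.c of this era, debug checks omitted):

        void preempt_count_sub(int val)
        {
                /*
                 * Fires trace_preempt_on() when this subtraction is
                 * about to bring the count to zero.
                 */
                preempt_latency_stop(val);
                __preempt_count_sub(val);
        }

Calling that after trace_softirqs_on() is exactly the broken ordering in
the splat above. The patch instead hoists the trace_preempt_on() call to
the top, while SOFTIRQ_OFFSET is still set in preempt_count and
current->softirqs_enabled is still clear, so the two views stay
consistent, and then drops the count with the non-tracing helper.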