Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On (09/07/18 16:03), Peter Zijlstra wrote: > > > > I would even argue that placing printk_nmi_enter() between > > lockdep_off() and ftrace_nmi_enter() is wrong because if in the future > > printk_nmi_enter() were to do any ftrace tracing, it wont be caught, as > > it was by having it before lockdep_off(). > > > > printk_nmi_enter() should not muck with IRQ state, nor should it do any > > ftrace tracing. Since ftrace mucks with IRQ state when it gets enabled > > or disabled, it will screw up lockdep, and lockdep will complain. That > > way we can use lockdep not being off to catch this bug. > > The very bestest solution is to rm -rf printk ;-) Talented, capable and tremendously clever people had spent decades on making printk what it is today. I feel responsible for respecting that effort and, thus, my vote would be to keep printk around for a while. ... we also support !CONFIG_PRINTK builds ;) -ss
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, Sep 07, 2018 at 10:01:28AM -0400, Steven Rostedt wrote: > On Fri, 7 Sep 2018 09:55:33 -0400 > Steven Rostedt wrote: > > > On Fri, 7 Sep 2018 15:45:32 +0200 > > Peter Zijlstra wrote: > > > > > Yes really, we should not muck with the IRQ state from NMI context. > > > > Right, and we didn't. Your patch didn't change anything, but allow for > > printk_nmi_enter/exit() to be traced by ftrace, but that's wrong to > > begin with because it ftrace_nmi_enter() hasn't been called yet. > > > > I would even argue that placing printk_nmi_enter() between > lockdep_off() and ftrace_nmi_enter() is wrong because if in the future > printk_nmi_enter() were to do any ftrace tracing, it wont be caught, as > it was by having it before lockdep_off(). > > printk_nmi_enter() should not muck with IRQ state, nor should it do any > ftrace tracing. Since ftrace mucks with IRQ state when it gets enabled > or disabled, it will screw up lockdep, and lockdep will complain. That > way we can use lockdep not being off to catch this bug. The very bestest solution is to rm -rf printk ;-)
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, Sep 07, 2018 at 09:55:33AM -0400, Steven Rostedt wrote: > On Fri, 7 Sep 2018 15:45:32 +0200 > Peter Zijlstra wrote: > > > Yes really, we should not muck with the IRQ state from NMI context. > > Right, and we didn't. Your patch didn't change anything, but allow for It does, it kills lockdep before we get to mucking with the IRQ state. > printk_nmi_enter/exit() to be traced by ftrace, but that's wrong to > begin with because it ftrace_nmi_enter() hasn't been called yet. That's a second bug :-) A better solution might be to inline that function as well. A direct OR instruction is still way faster than a CALL+RET.
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, 7 Sep 2018 09:55:33 -0400 Steven Rostedt wrote: > On Fri, 7 Sep 2018 15:45:32 +0200 > Peter Zijlstra wrote: > > > Yes really, we should not muck with the IRQ state from NMI context. > > Right, and we didn't. Your patch didn't change anything, but allow for > printk_nmi_enter/exit() to be traced by ftrace, but that's wrong to > begin with because it ftrace_nmi_enter() hasn't been called yet. > I would even argue that placing printk_nmi_enter() between lockdep_off() and ftrace_nmi_enter() is wrong because if in the future printk_nmi_enter() were to do any ftrace tracing, it wont be caught, as it was by having it before lockdep_off(). printk_nmi_enter() should not muck with IRQ state, nor should it do any ftrace tracing. Since ftrace mucks with IRQ state when it gets enabled or disabled, it will screw up lockdep, and lockdep will complain. That way we can use lockdep not being off to catch this bug. -- Steve
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, 7 Sep 2018 15:45:32 +0200 Peter Zijlstra wrote: > Yes really, we should not muck with the IRQ state from NMI context. Right, and we didn't. Your patch didn't change anything, but allow for printk_nmi_enter/exit() to be traced by ftrace, but that's wrong to begin with because it ftrace_nmi_enter() hasn't been called yet. -- Steve
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, 7 Sep 2018 11:30:55 +0200 Peter Zijlstra wrote: > > That said, I am not against this change. Especially the inlining > > is a good move. Note that lockdep_off()/lockdep_on() must not > > be traced as well. > > Hard to trace an inline funcion; we could make it __always_inline to > feel better. In the old days, you would need to make it __always_inline, because if gcc didn't inline it (and it did so in the past), it would be traced and would re-introduce this bug. But I've hit this issue too many times in the past, that I added "notrace" to the "inline" definition, where all "inline" labeled functions are also "notrace". lockdep.o has the -pg option removed in the Makefile so that all functions in lockdep.c is notrace, which is why it was fine to keep it there. But adding __always_inline may not be a bad idea regardless. -- Steve
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, 7 Sep 2018 17:35:38 +0900 Sergey Senozhatsky wrote: > Should't printk_nmi_enter()/printk_nmi_exit() still be notrace? > Like you and Steven said - it's still before ftrace_nmi_enter() > and should be notrace regardless. Correct. My patch fixes both issues. The patch that Peter is posting is in case for some reason in the future printk_nmi_enter/exit() do something with interrupts or locking. And the printk_nmi_enter/exit() is placed where it is, in case for some reason lockdep_off/on() does a printk. The printk_nmi_enter/exit() and lockdep_off/on() are only changing state variables. Neither of them should be traced by ftrace. My patch solved the problem and was not just fixing a symptom. It could have been solved either way, but as Petr said, the printk_nmi_enter/exit() should have been notrace, because it was before ftrace_nmi_enter/exit(). Lockdep actually found this bug because the lockdep_on/off() was not first. -- Steve
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, Sep 07, 2018 at 09:41:48AM -0400, Steven Rostedt wrote: > On Fri, 7 Sep 2018 09:34:48 +0200 > Peter Zijlstra wrote: > > > On Wed, Sep 05, 2018 at 09:33:34PM -0400, Steven Rostedt wrote: > > > do_idle { > > > > > > [interrupts enabled] > > > > > > [interrupts disabled] > > > TRACE_IRQS_OFF [lockdep says irqs off] > > > [...] > > > TRACE_IRQS_IRET > > > test if pt_regs say return to interrupts enabled [yes] > > > TRACE_IRQS_ON [lockdep says irqs are on] > > > > > > > > > nmi_enter() { > > > printk_nmi_enter() [traced by ftrace] > > > [ hit ftrace breakpoint ] > > > > > > TRACE_IRQS_OFF [lockdep says irqs off] > > > [...] > > > TRACE_IRQS_IRET [return from breakpoint] > > > test if pt_regs say interrupts enabled [no] > > > [iret back to interrupt] > > > [iret back to code] > > > > > > tick_nohz_idle_enter() { > > > > > > lockdep_assert_irqs_enabled() [lockdep say no!] > > > > Isn't the problem that we muck with the IRQ state from NMI context? We > > shouldn't be doing that. > > Not really. Yes really, we should not muck with the IRQ state from NMI context.
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, 7 Sep 2018 09:34:48 +0200 Peter Zijlstra wrote: > On Wed, Sep 05, 2018 at 09:33:34PM -0400, Steven Rostedt wrote: > > do_idle { > > > > [interrupts enabled] > > > > [interrupts disabled] > > TRACE_IRQS_OFF [lockdep says irqs off] > > [...] > > TRACE_IRQS_IRET > > test if pt_regs say return to interrupts enabled [yes] > > TRACE_IRQS_ON [lockdep says irqs are on] > > > > > > nmi_enter() { > > printk_nmi_enter() [traced by ftrace] > > [ hit ftrace breakpoint ] > > > > TRACE_IRQS_OFF [lockdep says irqs off] > > [...] > > TRACE_IRQS_IRET [return from breakpoint] > >test if pt_regs say interrupts enabled [no] > >[iret back to interrupt] > >[iret back to code] > > > > tick_nohz_idle_enter() { > > > > lockdep_assert_irqs_enabled() [lockdep say no!] > > Isn't the problem that we muck with the IRQ state from NMI context? We > shouldn't be doing that. Not really. The problem is that the exception handling code jumps into lockdep, and we allow exceptions to happen in NMIs. It's just that the tracing of printk_nmi_enter() caused an exception before lockdep was turned off. > > The thing is, since we trace the IRQ state from within IRQ-disable, > since that's the only IRQ-safe option, it is very much not NMI-safe. > > Your patch might avoid the symptom, but I don't think it cures the > fundamental problem. No, the issue is that the nmi_enter() code was traced early which adds breakpoints and breakpoints causes exceptions which jumps into lockdep, and we did this before lockdep was turned off. -- Steve
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri, Sep 07, 2018 at 10:28:34AM +0200, Petr Mladek wrote: > On Fri 2018-09-07 09:45:31, Peter Zijlstra wrote: > > On Thu, Sep 06, 2018 at 11:31:51AM +0900, Sergey Senozhatsky wrote: > > > An alternative option, thus, could be re-instating back the rule that > > > lockdep_off/on should be the first and the last thing we do in > > > nmi_enter/nmi_exit. E.g. > > > > > > nmi_enter() > > > lockdep_off(); > > > printk_nmi_enter(); > > > > > > nmi_exit() > > > printk_nmi_exit(); > > > lockdep_on(); > > > > Yes that. Also, those should probably be inline functions. > > > > --- > > Subject: locking/lockdep: Fix NMI handling > > > > Someone put code in the NMI handler before lockdep_off(). Since lockdep > > is not NMI safe, this wrecks stuff. > > My view is that nmi_enter() has to switch several features into > NMI-safe mode. The code must not trigger the other features when > they are still in the unsafe mode. > > It is a chicken&egg problem. And it is hard to completely prevent > regressions caused by future changes. Sure, not bothered too much about the regression, that happens. > I though that printk_nmi_enter() should never need any lockdep-related > code. On the other hand, people might want to printk debug messages > when lockdep_off() is called. This is why I put it in the current order. Nah, that'd be broken. Or rather, if you want to debug NMI stuff, you had better know wth you're doing. Printk, as per mainline, is utterly useless for that -- I still carry those force_early_printk patches locally. Because even if the core printk code were to be NMI safe (possible I think, all we need is a lockless ring-buffer), then none of the console drivers are :/ (I really hate this current printk-nmi mess) > That said, I am not against this change. Especially the inlining > is a good move. Note that lockdep_off()/lockdep_on() must not > be traced as well. Hard to trace an inline funcion; we could make it __always_inline to feel better.
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On (09/07/18 10:28), Petr Mladek wrote: > On Fri 2018-09-07 09:45:31, Peter Zijlstra wrote: > > On Thu, Sep 06, 2018 at 11:31:51AM +0900, Sergey Senozhatsky wrote: > > > An alternative option, thus, could be re-instating back the rule that > > > lockdep_off/on should be the first and the last thing we do in > > > nmi_enter/nmi_exit. E.g. > > > > > > nmi_enter() > > > lockdep_off(); > > > printk_nmi_enter(); > > > > > > nmi_exit() > > > printk_nmi_exit(); > > > lockdep_on(); > > > > Yes that. Also, those should probably be inline functions. > > > > --- > > Subject: locking/lockdep: Fix NMI handling > > > > Someone put code in the NMI handler before lockdep_off(). Since lockdep > > is not NMI safe, this wrecks stuff. > > My view is that nmi_enter() has to switch several features into > NMI-safe mode. The code must not trigger the other features when > they are still in the unsafe mode. > > It is a chicken&egg problem. And it is hard to completely prevent > regressions caused by future changes. > > I though that printk_nmi_enter() should never need any lockdep-related > code. On the other hand, people might want to printk debug messages > when lockdep_off() is called. This is why I put it in the current order. > > That said, I am not against this change. Especially the inlining > is a good move. Note that lockdep_off()/lockdep_on() must not > be traced as well. Should't printk_nmi_enter()/printk_nmi_exit() still be notrace? Like you and Steven said - it's still before ftrace_nmi_enter() and should be notrace regardless. -ss
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Fri 2018-09-07 09:45:31, Peter Zijlstra wrote: > On Thu, Sep 06, 2018 at 11:31:51AM +0900, Sergey Senozhatsky wrote: > > An alternative option, thus, could be re-instating back the rule that > > lockdep_off/on should be the first and the last thing we do in > > nmi_enter/nmi_exit. E.g. > > > > nmi_enter() > > lockdep_off(); > > printk_nmi_enter(); > > > > nmi_exit() > > printk_nmi_exit(); > > lockdep_on(); > > Yes that. Also, those should probably be inline functions. > > --- > Subject: locking/lockdep: Fix NMI handling > > Someone put code in the NMI handler before lockdep_off(). Since lockdep > is not NMI safe, this wrecks stuff. My view is that nmi_enter() has to switch several features into NMI-safe mode. The code must not trigger the other features when they are still in the unsafe mode. It is a chicken&egg problem. And it is hard to completely prevent regressions caused by future changes. I though that printk_nmi_enter() should never need any lockdep-related code. On the other hand, people might want to printk debug messages when lockdep_off() is called. This is why I put it in the current order. That said, I am not against this change. Especially the inlining is a good move. Note that lockdep_off()/lockdep_on() must not be traced as well. Best Regards, Petr
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Thu, Sep 06, 2018 at 11:31:51AM +0900, Sergey Senozhatsky wrote: > An alternative option, thus, could be re-instating back the rule that > lockdep_off/on should be the first and the last thing we do in > nmi_enter/nmi_exit. E.g. > > nmi_enter() > lockdep_off(); > printk_nmi_enter(); > > nmi_exit() > printk_nmi_exit(); > lockdep_on(); Yes that. Also, those should probably be inline functions. --- Subject: locking/lockdep: Fix NMI handling Someone put code in the NMI handler before lockdep_off(). Since lockdep is not NMI safe, this wrecks stuff. Fixes: 42a0bb3f7138 ("printk/nmi: generic solution for safe printk in NMI") Signed-off-by: Peter Zijlstra (Intel) --- include/linux/hardirq.h | 4 ++-- include/linux/lockdep.h | 11 +-- kernel/locking/lockdep.c | 12 3 files changed, 11 insertions(+), 16 deletions(-) diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h index 0fbbcdf0c178..8d70270d9486 100644 --- a/include/linux/hardirq.h +++ b/include/linux/hardirq.h @@ -62,8 +62,8 @@ extern void irq_exit(void); #define nmi_enter()\ do {\ - printk_nmi_enter(); \ lockdep_off(); \ + printk_nmi_enter(); \ ftrace_nmi_enter(); \ BUG_ON(in_nmi()); \ preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \ @@ -78,8 +78,8 @@ extern void irq_exit(void); BUG_ON(!in_nmi()); \ preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \ ftrace_nmi_exit(); \ - lockdep_on(); \ printk_nmi_exit(); \ + lockdep_on(); \ } while (0) #endif /* LINUX_HARDIRQ_H */ diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index b0d0b51c4d85..70bb9e8fc8f9 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -272,8 +272,15 @@ extern void lockdep_reset_lock(struct lockdep_map *lock); extern void lockdep_free_key_range(void *start, unsigned long size); extern asmlinkage void lockdep_sys_exit(void); -extern void lockdep_off(void); -extern void lockdep_on(void); +static inline void lockdep_off(void) +{ + current->lockdep_recursion++; +} + +static inline void lockdep_on(void) +{ + current->lockdep_recursion--; +} /* * These methods are used by specific locking variants (spinlocks, diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c index e406c5fdb41e..da51ed1c0c21 100644 --- a/kernel/locking/lockdep.c +++ b/kernel/locking/lockdep.c @@ -317,18 +317,6 @@ static inline u64 iterate_chain_key(u64 key, u32 idx) return k0 | (u64)k1 << 32; } -void lockdep_off(void) -{ - current->lockdep_recursion++; -} -EXPORT_SYMBOL(lockdep_off); - -void lockdep_on(void) -{ - current->lockdep_recursion--; -} -EXPORT_SYMBOL(lockdep_on); - /* * Debugging switches: */
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Wed, Sep 05, 2018 at 09:33:34PM -0400, Steven Rostedt wrote: > do_idle { > > [interrupts enabled] > > [interrupts disabled] > TRACE_IRQS_OFF [lockdep says irqs off] > [...] > TRACE_IRQS_IRET > test if pt_regs say return to interrupts enabled [yes] > TRACE_IRQS_ON [lockdep says irqs are on] > > > nmi_enter() { > printk_nmi_enter() [traced by ftrace] > [ hit ftrace breakpoint ] > > TRACE_IRQS_OFF [lockdep says irqs off] > [...] > TRACE_IRQS_IRET [return from breakpoint] > test if pt_regs say interrupts enabled [no] > [iret back to interrupt] > [iret back to code] > > tick_nohz_idle_enter() { > > lockdep_assert_irqs_enabled() [lockdep say no!] Isn't the problem that we muck with the IRQ state from NMI context? We shouldn't be doing that. The thing is, since we trace the IRQ state from within IRQ-disable, since that's the only IRQ-safe option, it is very much not NMI-safe. Your patch might avoid the symptom, but I don't think it cures the fundamental problem.
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Thu, 6 Sep 2018 11:31:51 +0900 Sergey Senozhatsky wrote: > Hello, > > On (09/05/18 21:33), Steven Rostedt wrote: > > do_idle { > > > > [interrupts enabled] > > > > [interrupts disabled] > > TRACE_IRQS_OFF [lockdep says irqs off] > > [...] > > TRACE_IRQS_IRET > > test if pt_regs say return to interrupts enabled [yes] > > TRACE_IRQS_ON [lockdep says irqs are on] > > > > > > nmi_enter() { > > printk_nmi_enter() [traced by ftrace] > > [ hit ftrace breakpoint ] > > > > TRACE_IRQS_OFF [lockdep says irqs off] > > [...] > > TRACE_IRQS_IRET [return from breakpoint] > >test if pt_regs say interrupts enabled [no] > >[iret back to interrupt] > >[iret back to code] > > > > tick_nohz_idle_enter() { > > > > lockdep_assert_irqs_enabled() [lockdep say no!] > > > > Although interrupts are indeed enabled, lockdep thinks it is not, and since > > we now do asserts via lockdep, it gives a false warning. The issue here is > > that printk_nmi_enter() is called before lockdep_off(), which disables > > lockdep (for this reason) in NMIs. By simply not allowing ftrace to see > > printk_nmi_enter() (via notrace annotation) we keep lockdep from getting > > confused. > > Great catch and I like the patch! > > Indeed, with printk_nmi we changed the "everything that nmi_enter does > should happen after lockdep_off" to "everything that nmi_enter does should > happen after printk_nmi_enter" // +similar change to nmi_exit. > > An alternative option, thus, could be re-instating back the rule that > lockdep_off/on should be the first and the last thing we do in > nmi_enter/nmi_exit. E.g. > > nmi_enter() > lockdep_off(); > printk_nmi_enter(); > > nmi_exit() > printk_nmi_exit(); > lockdep_on(); > > I guess that we can keep printk_nmi_enter/printk_nmi_exit at the top > and at the bottom of nmi_enter/nmi_exit correspondingly just in case > if lockdep_off/lockdep_on sometime in the future starts invoking printk(), > which would deadlock us. Hence I thought about doing the move first, but figured it wasn't needed, as Petr said, it's before ftrace_nmi_enter() and should be notrace regrardless. > > Acked-by: Sergey Senozhatsky > > > > Cc: sta...@vger.kernel.org > > Fixes: 099f1c84c0052 ("printk: introduce per-cpu safe_print seq buffer") > > A minor nitpick, > > printk_nmi was introduced by 42a0bb3f71383b457 ("printk/nmi: generic > solution for safe printk in NMI") - in that commit we declared a new > printk_nmi_enter/exit should be the first and the last thing we do in > nmi_enter/nmi_exit rule. Thanks I'll update the change log. Grumble, my git blame foo is getting rusty. -- Steve
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
On Thu 2018-09-06 11:31:51, Sergey Senozhatsky wrote: > Hello, > > On (09/05/18 21:33), Steven Rostedt wrote: > > do_idle { > > > > [interrupts enabled] > > > > [interrupts disabled] > > TRACE_IRQS_OFF [lockdep says irqs off] > > [...] > > TRACE_IRQS_IRET > > test if pt_regs say return to interrupts enabled [yes] > > TRACE_IRQS_ON [lockdep says irqs are on] > > > > > > nmi_enter() { > > printk_nmi_enter() [traced by ftrace] > > [ hit ftrace breakpoint ] > > > > TRACE_IRQS_OFF [lockdep says irqs off] > > [...] > > TRACE_IRQS_IRET [return from breakpoint] > >test if pt_regs say interrupts enabled [no] > >[iret back to interrupt] > >[iret back to code] > > > > tick_nohz_idle_enter() { > > > > lockdep_assert_irqs_enabled() [lockdep say no!] > > > > Although interrupts are indeed enabled, lockdep thinks it is not, and since > > we now do asserts via lockdep, it gives a false warning. The issue here is > > that printk_nmi_enter() is called before lockdep_off(), which disables > > lockdep (for this reason) in NMIs. By simply not allowing ftrace to see > > printk_nmi_enter() (via notrace annotation) we keep lockdep from getting > > confused. > > Great catch and I like the patch! > > Indeed, with printk_nmi we changed the "everything that nmi_enter does > should happen after lockdep_off" to "everything that nmi_enter does should > happen after printk_nmi_enter" // +similar change to nmi_exit. > > An alternative option, thus, could be re-instating back the rule that > lockdep_off/on should be the first and the last thing we do in > nmi_enter/nmi_exit. E.g. > > nmi_enter() > lockdep_off(); > printk_nmi_enter(); > > nmi_exit() > printk_nmi_exit(); > lockdep_on(); Yup, this would help here as well. > I guess that we can keep printk_nmi_enter/printk_nmi_exit at the top > and at the bottom of nmi_enter/nmi_exit correspondingly just in case > if lockdep_off/lockdep_on sometime in the future starts invoking printk(), > which would deadlock us. Hence Yes. In general, it does not looks safe to allow tracing functions that are called before ftrace_nmi_enter(). > Acked-by: Sergey Senozhatsky Therefore I am all for Steven's patch. Acked-by: Petr Mladek Best Regards, Petr
Re: [PATCH] printk/tracing: Do not trace printk_nmi_enter()
Hello, On (09/05/18 21:33), Steven Rostedt wrote: > do_idle { > > [interrupts enabled] > > [interrupts disabled] > TRACE_IRQS_OFF [lockdep says irqs off] > [...] > TRACE_IRQS_IRET > test if pt_regs say return to interrupts enabled [yes] > TRACE_IRQS_ON [lockdep says irqs are on] > > > nmi_enter() { > printk_nmi_enter() [traced by ftrace] > [ hit ftrace breakpoint ] > > TRACE_IRQS_OFF [lockdep says irqs off] > [...] > TRACE_IRQS_IRET [return from breakpoint] > test if pt_regs say interrupts enabled [no] > [iret back to interrupt] > [iret back to code] > > tick_nohz_idle_enter() { > > lockdep_assert_irqs_enabled() [lockdep say no!] > > Although interrupts are indeed enabled, lockdep thinks it is not, and since > we now do asserts via lockdep, it gives a false warning. The issue here is > that printk_nmi_enter() is called before lockdep_off(), which disables > lockdep (for this reason) in NMIs. By simply not allowing ftrace to see > printk_nmi_enter() (via notrace annotation) we keep lockdep from getting > confused. Great catch and I like the patch! Indeed, with printk_nmi we changed the "everything that nmi_enter does should happen after lockdep_off" to "everything that nmi_enter does should happen after printk_nmi_enter" // +similar change to nmi_exit. An alternative option, thus, could be re-instating back the rule that lockdep_off/on should be the first and the last thing we do in nmi_enter/nmi_exit. E.g. nmi_enter() lockdep_off(); printk_nmi_enter(); nmi_exit() printk_nmi_exit(); lockdep_on(); I guess that we can keep printk_nmi_enter/printk_nmi_exit at the top and at the bottom of nmi_enter/nmi_exit correspondingly just in case if lockdep_off/lockdep_on sometime in the future starts invoking printk(), which would deadlock us. Hence Acked-by: Sergey Senozhatsky > Cc: sta...@vger.kernel.org > Fixes: 099f1c84c0052 ("printk: introduce per-cpu safe_print seq buffer") A minor nitpick, printk_nmi was introduced by 42a0bb3f71383b457 ("printk/nmi: generic solution for safe printk in NMI") - in that commit we declared a new printk_nmi_enter/exit should be the first and the last thing we do in nmi_enter/nmi_exit rule. -ss
[PATCH] printk/tracing: Do not trace printk_nmi_enter()
[ I'm currently testing this patch, and will push this to mainline, I'm just looking for acks and review bys. Also, Peter, this is the bug that my patch that differentiates if lockdep is broken or if the code is broken when lockdep_assert_irqs_{dis,en}abled() triggers, and would have been helpful. ] From: "Steven Rostedt (VMware)" I hit the following splat in my tests: [ cut here ] IRQs not enabled as expected WARNING: CPU: 3 PID: 0 at kernel/time/tick-sched.c:982 tick_nohz_idle_enter+0x44/0x8c Modules linked in: ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables ipv6 CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.0-rc2-test+ #2 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014 EIP: tick_nohz_idle_enter+0x44/0x8c Code: ec 05 00 00 00 75 26 83 b8 c0 05 00 00 00 75 1d 80 3d d0 36 3e c1 00 75 14 68 94 63 12 c1 c6 05 d0 36 3e c1 01 e8 04 ee f8 ff <0f> 0b 58 fa bb a0 e5 66 c1 e8 25 0f 04 00 64 03 1d 28 31 52 c1 8b EAX: 001c EBX: f26e7f8c ECX: 0006 EDX: 0007 ESI: f26dd1c0 EDI: EBP: f26e7f40 ESP: f26e7f38 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010296 CR0: 80050033 CR2: 0813c6b0 CR3: 2f342000 CR4: 001406f0 Call Trace: do_idle+0x33/0x202 cpu_startup_entry+0x61/0x63 start_secondary+0x18e/0x1ed startup_32_smp+0x164/0x168 irq event stamp: 18773830 hardirqs last enabled at (18773829): [] trace_hardirqs_on_thunk+0xc/0x10 hardirqs last disabled at (18773830): [] trace_hardirqs_off_thunk+0xc/0x10 softirqs last enabled at (18773824): [] __do_softirq+0x25f/0x2bf softirqs last disabled at (18773767): [] call_on_stack+0x45/0x4b ---[ end trace b7c64aa79e17954a ]--- After a bit of debugging, I found what was happening. This would trigger when performing "perf" with a high NMI interrupt rate, while enabling and disabling function tracer. Ftrace uses breakpoints to convert the nops at the start of functions to calls to the function trampolines. The breakpoint traps disable interrupts and this makes calls into lockdep via the trace_hardirqs_off_thunk in the entry.S code. What happens is the following: do_idle { [interrupts enabled] [interrupts disabled] TRACE_IRQS_OFF [lockdep says irqs off] [...] TRACE_IRQS_IRET test if pt_regs say return to interrupts enabled [yes] TRACE_IRQS_ON [lockdep says irqs are on] nmi_enter() { printk_nmi_enter() [traced by ftrace] [ hit ftrace breakpoint ] TRACE_IRQS_OFF [lockdep says irqs off] [...] TRACE_IRQS_IRET [return from breakpoint] test if pt_regs say interrupts enabled [no] [iret back to interrupt] [iret back to code] tick_nohz_idle_enter() { lockdep_assert_irqs_enabled() [lockdep say no!] Although interrupts are indeed enabled, lockdep thinks it is not, and since we now do asserts via lockdep, it gives a false warning. The issue here is that printk_nmi_enter() is called before lockdep_off(), which disables lockdep (for this reason) in NMIs. By simply not allowing ftrace to see printk_nmi_enter() (via notrace annotation) we keep lockdep from getting confused. Cc: sta...@vger.kernel.org Fixes: 099f1c84c0052 ("printk: introduce per-cpu safe_print seq buffer") Signed-off-by: Steven Rostedt (VMware) --- kernel/printk/printk_safe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c index a0a74c533e4b..0913b4d385de 100644 --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -306,12 +306,12 @@ static __printf(1, 0) int vprintk_nmi(const char *fmt, va_list args) return printk_safe_log_store(s, fmt, args); } -void printk_nmi_enter(void) +void notrace printk_nmi_enter(void) { this_cpu_or(printk_context, PRINTK_NMI_CONTEXT_MASK); } -void printk_nmi_exit(void) +void notrace printk_nmi_exit(void) { this_cpu_and(printk_context, ~PRINTK_NMI_CONTEXT_MASK); } -- 2.13.6