On Fri, May 15, 2020 at 07:33:08PM +0900, Sergey Senozhatsky wrote:
> On (20/05/15 10:50), Petr Mladek wrote:
> > kdb is able to stop kernel even in NMI context where printk() is redirected
> > to the printk_safe() lockless variant. Move the check and redirect to kdb
> > even in this case.
>
> Can I please have some context what problem does this solve?
> I can see that vkdb_printf() calls into console drivers:
>
> 	for_each_console(c) {
> 		c->write(c, cp, retlen - (cp - kdb_buffer));
> 		touch_nmi_watchdog();
> 	}
>
> Is this guaranteed that we never execute this path from NMI?

Absolutely not.

The execution context for kdb is pretty much unique... we are running a
debug mode with all CPUs parked in a holding loop with interrupts
disabled. One CPU is at an unknown exception state and the others are
either handling an IRQ or NMI depending on architecture[1].

However there are a number of factors that IMHO weigh in favour of
allowing kdb to intercept here.

1. kgdb/kdb are designed to work from NMI, modulo the bugs that are
   undoubtedly present.

2. A synchronous breakpoint (including an implicit breakpoint-on-oops)
   from any code that executes with irqs disabled will exhibit most of
   the same problems as an NMI but without waking up all the NMI logic.

3. kdb_trap_printk is only set for *very* narrow time intervals by the
   debug master (the single CPU in the system that is *not* in a
   holding loop). Thus in all cases the system has already successfully
   executed kdb_printf() several times before we ever call the printk()
   interception code.

   Or put another way, even if we did tickle a bug speculated about in
   #1, it won't be the call to printk() that triggers it; we'd never
   get that far!

> If so, can this please be added to the commit message? A more
> detailed commit message will help a lot.

I suspect Petr might prefer any future flames about kdb_printf() to be
pointed at me rather than him ;-) so if adding anything to the commit
message then I'd suggest it be based on the reasoning in #3 above.


Daniel.
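
PS: For anyone skimming the thread, the ordering being discussed can be
modelled in a few lines of user-space C. This is only a toy sketch, not
the kernel code: the enum, the struct and route_printk() are hypothetical
names invented for illustration, and only kdb_trap_printk corresponds to
a flag mentioned above. It shows the intended effect of the patch as
described in the commit message: the kdb check is evaluated before the
NMI redirect, so kdb intercepts printk() even in NMI context.

```c
#include <stdbool.h>

/* Hypothetical destinations for a printk() message in this toy model. */
enum printk_route {
	ROUTE_KDB,       /* handed to kdb's own output path */
	ROUTE_NMI_SAFE,  /* deferred via the lockless printk_safe variant */
	ROUTE_NORMAL,    /* regular printk path */
};

/* Hypothetical snapshot of the state the routing decision depends on. */
struct printk_ctx {
	bool kdb_trap_printk;  /* set briefly by the debug master */
	bool in_nmi;           /* caller is running in NMI context */
};

/* Decide where a message goes; an assumption-laden model, not kernel code. */
static enum printk_route route_printk(const struct printk_ctx *ctx)
{
	/* The kdb check comes first, so it also wins in NMI context. */
	if (ctx->kdb_trap_printk)
		return ROUTE_KDB;

	/* Otherwise NMI callers are redirected to the lockless variant. */
	if (ctx->in_nmi)
		return ROUTE_NMI_SAFE;

	return ROUTE_NORMAL;
}
```

With kdb_trap_printk clear, the model behaves exactly as before: NMI
callers get the lockless path and everyone else gets the normal one.
Since the flag is only set for narrow windows by the debug master, that
is the case almost all of the time.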