On Fri, May 15, 2020 at 07:33:08PM +0900, Sergey Senozhatsky wrote:
> On (20/05/15 10:50), Petr Mladek wrote:
> > kdb is able to stop kernel even in NMI context where printk() is redirected
> > to the printk_safe() lockless variant. Move the check and redirect to kdb
> > even in this case.
> 
> Can I please have some context what problem does this solve?
> I can see that vkdb_printf() calls into console drivers:
> 
>       for_each_console(c) {
>               c->write(c, cp, retlen - (cp - kdb_buffer));
>               touch_nmi_watchdog();
>       }
> 
> Is this guaranteed that we never execute this path from NMI?

Absolutely not.

The execution context for kdb is pretty much unique... we are running in a
debug mode with all CPUs parked in a holding loop with interrupts
disabled. One CPU is at an unknown exception state and the others are
either handling an IRQ or NMI depending on architecture[1].

However, there are a number of factors that IMHO weigh in favour of
allowing kdb to intercept here.

1. kgdb/kdb are designed to work from NMI, modulo the bugs that are
   undoubtedly present.

2. A synchronous breakpoint (including an implicit breakpoint-on-oops)
   from any code that executes with irqs disabled will exhibit most of
   the same problems as an NMI but without waking up all the NMI logic.

3. kdb_trap_printk is only set for *very* narrow time intervals by the
   debug master (the single CPU in the system that is *not* in a
   holding loop). Thus in all cases the system has already successfully
   executed kdb_printf() several times before we ever call the printk()
   interception code.

   Or put another way, even if we did tickle a bug speculated about in
   #1, it won't be the call to printk() that triggers it; we'd never
   get that far!
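
To make the ordering argument in #3 concrete, here is a rough userspace
model of the interception. It is only a sketch: every model_* name and
the two counters are made up for illustration, only kdb_trap_printk is a
name from the discussion above, and none of this is the actual kernel
code. It shows the debug master printing through the kdb path directly
before it opens the narrow window in which printk() gets redirected.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the kernel implementation. */
static int kdb_trap_printk;	/* nonzero only inside the narrow window */
static int kdb_lines;		/* messages that took the kdb output path */
static int log_lines;		/* messages that took the normal printk path */

/* Models the kdb output path (the real one writes to console drivers). */
static int model_vkdb_printf(const char *fmt, va_list ap)
{
	kdb_lines++;
	return vfprintf(stdout, fmt, ap);
}

static int model_kdb_printf(const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = model_vkdb_printf(fmt, ap);
	va_end(ap);
	return ret;
}

/* Models printk(): redirect to kdb only while the trap flag is set. */
static int model_printk(const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	if (kdb_trap_printk) {
		ret = model_vkdb_printf(fmt, ap);
	} else {
		log_lines++;
		ret = vfprintf(stderr, fmt, ap);
	}
	va_end(ap);
	return ret;
}

/* A helper that only knows how to report through printk(). */
static void model_show_stack(void)
{
	model_printk("  frame: caller+0x10\n");
}

/* Models a kdb command run by the debug master. */
static void model_kdb_backtrace(void)
{
	/* The kdb path has already been exercised before any trapping. */
	model_kdb_printf("Stack traceback:\n");

	kdb_trap_printk++;	/* open the narrow window */
	model_show_stack();	/* its printk() output lands in kdb */
	kdb_trap_printk--;	/* close it again */
}
```

In this model a model_printk() call made outside model_kdb_backtrace()
takes the normal path, while the one made by model_show_stack() is
redirected, and only after model_kdb_printf() has already run once.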


> If so, can this please be added to the commit message? A more
> detailed commit message will help a lot.

I suspect Petr might prefer any future flames about kdb_printf() to be
pointed at me rather than him ;-) so if adding anything to the commit
message then I'd suggest it be based on the reasoning in #3 above.


Daniel.
