On Wed 2023-08-30 10:53:59, Daniel Thompson wrote: > On Fri, Aug 25, 2023 at 07:18:44AM -0700, Doug Anderson wrote: > > Hi, > > > > On Fri, Aug 25, 2023 at 3:09 AM Daniel Thompson > > <daniel.thomp...@linaro.org> wrote: > > > > > > On Tue, Aug 22, 2023 at 01:19:46PM -0700, Douglas Anderson wrote: > > > > When entering kdb/kgdb on a kernel panic, it was be observed that the > > > > console isn't flushed before the `kdb` prompt came up. Specifically, > > > > when using the buddy lockup detector on arm64 and running: > > > > echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT > > > > > > > > I could see: > > > > [ 26.161099] lkdtm: Performing direct entry HARDLOCKUP > > > > [ 32.499881] watchdog: Watchdog detected hard LOCKUP on cpu 6 > > > > [ 32.552865] Sending NMI from CPU 5 to CPUs 6: > > > > [ 32.557359] NMI backtrace for cpu 6 > > > > ... [backtrace for cpu 6] ... > > > > [ 32.558353] NMI backtrace for cpu 5 > > > > ... [backtrace for cpu 5] ... > > > > [ 32.867471] Sending NMI from CPU 5 to CPUs 0-4,7: > > > > [ 32.872321] NMI backtrace forP cpuANC: Hard LOCKUP > > > > > > > > Entering kdb (current=..., pid 0) on processor 5 due to Keyboard Entry > > > > [5]kdb> > > > > > > > > As you can see, backtraces for the other CPUs start printing and get > > > > interleaved with the kdb PANIC print. > > > > > > > > Let's replicate the commands to flush the console in the kdb panic > > > > entry point to avoid this. > > > > > > > > Signed-off-by: Douglas Anderson <diand...@chromium.org> > > > > --- > > > > > > > > kernel/debug/debug_core.c | 3 +++ > > > > 1 file changed, 3 insertions(+) > > > > > > > > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c > > > > index d5e9ccde3ab8..3a904d8697c8 100644 > > > > --- a/kernel/debug/debug_core.c > > > > +++ b/kernel/debug/debug_core.c > > > > @@ -1006,6 +1006,9 @@ void kgdb_panic(const char *msg) > > > > if (panic_timeout) > > > > return; > > > > > > > > + debug_locks_off(); > > > > + console_flush_on_panic(CONSOLE_FLUSH_PENDING); > > > > + > > > > if (dbg_kdb_mode) > > > > kdb_printf("PANIC: %s\n", msg); > > > > > > I'm somewhat included to say *this* (calling kdb_printf() when not > > > actually in the debugger) is the cause of the problem. kdb_printf() > > > does some pretty horid things to the console and isn't intended to > > > run while the system is active. > > > > > > I'd therefore be more tempted to defer the print to the b.p. trap > > > handler itself and make this part of kgdb_panic() look more like: > > > > > > kgdb_panic_msg = msg; > > > kgdb_breakpoint(); > > > kgdb_panic_msg = NULL; > > > > Unfortunately I think that only solves half the problem. As a quick > > test, I tried simply commenting out the "kdb_printf" line in > > kgdb_panic(). While that avoids the interleaved panic message and > > backtrace, it does nothing to actually get the backtraces printed out > > before you end up in kdb. As an example, this is what happened when I > > used `echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT` and > > had the "kdb_printf" in kgdb_panic() commented out: > > > > [ 72.658424] lkdtm: Performing direct entry HARDLOCKUP > > [ 82.181857] watchdog: Watchdog detected hard LOCKUP on cpu 6 > > ... > > [ 82.234801] Sending NMI from CPU 5 to CPUs 6: > > [ 82.239296] NMI backtrace for cpu 6 > > ... [ stack trace for CPU 6 ] ... > > [ 82.240294] NMI backtrace for cpu 5 > > ... [ stack trace for CPU 5 ] ... > > [ 82.576443] Sending NMI from CPU 5 to CPUs 0-4,7: > > [ 82.581291] NMI backtrace > > Entering kdb (current=0xffffff80da5a1080, pid 6978) on processor 5 due > > to Keyboard Entry > > [5]kdb> > > > > As you can see, I don't see the traces for CPUs 0-4 and 7. Those do > > show up if I use the "dmesg" command but it's a bit of a hassle to run > > "dmesg" to look for any extra debug messages every time I drop in kdb. > > > > I guess perhaps that part isn't obvious from the commit message? > > I figured it was a risk. > > In fact it's an area where my instinct to honour console messages and my > instinct to get into the kernel as soon as possible after the decision > to invoke it has been made come into conflict. > > In other words does it matter that the console buffers are not flushed > before entering kgdb? However having thought about it for a little while > (and knowing the console code tends to be written to be decently robust) > I can come to the view the flushing is best.
Just for record. I wondered if there was bug in printk() that the backtraces were not flushed. And I think that it is because they are printed in NMI context where the conosle handling is deferred. From this POV, the explicit flush makes perfect sense. Best Regards, Petr _______________________________________________ Kgdb-bugreport mailing list Kgdb-bugreport@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport