From: Hoeun Ryu <hoeun....@lge.com>

Make printk_safe_flush() safe in NMI context.

nmi_trigger_cpumask_backtrace() can be called in NMI context. For example,
it is called from watchdog_overflow_callback() when the hardlockup
backtrace flag (sysctl_hardlockup_all_cpu_backtrace) is true, and
watchdog_overflow_callback() runs in NMI context on some architectures.

Calling printk_safe_flush() in nmi_trigger_cpumask_backtrace() eventually
tries to take logbuf_lock in vprintk_emit(). That lock might already be
held in another non-NMI context on the same CPU, or be involved in a soft-
or hard-lockup on another CPU. An example of the deadlock:

 CPU0
 local_irq_save();
 for (;;)
   req = blk_peek_request(q);
   if (unlikely(!scsi_device_online(sdev)))
     printk()
       vprintk_emit()
         console_unlock()
           logbuf_lock_irqsave()
             slow-serial-console-write() // close to watchdog threshold
               watchdog_overflow_callback()
                 trigger_allbutself_cpu_backtrace()
                   printk_safe_flush()
                     vprintk_emit()
                       logbuf_lock_irqsave()
                       ^^^^ deadlock

and some other cases.

This patch prevents a deadlock in printk_safe_flush() in NMI context. It
makes sure that we continue and eventually call
printk_safe_flush_on_panic() from panic(), which has a better chance of
succeeding.

There is a risk that logbuf_lock was not part of a soft- or dead-lockup,
in which case we might just lose the messages. But then there is a high
chance that irq_work will get called and the messages will get flushed
the normal way.

Signed-off-by: Hoeun Ryu <hoeun....@lge.com>
Suggested-by: Petr Mladek <pmla...@suse.com>
Suggested-by: Sergey Senozhatsky <sergey.senozhatsky.w...@gmail.com>
---
v2: fix comments in the commit message and code. No change in the code
    itself.

 kernel/printk/printk_safe.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 3e3c200..3b5c660 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -254,6 +254,17 @@ void printk_safe_flush(void)
 {
 	int cpu;
 
+	/*
+	 * Just avoid a deadlock here.
+	 * It makes sure that we continue and eventually call
+	 * printk_safe_flush_on_panic() from panic() that has better chances to succeed.
+	 * There is a risk that logbuf_lock was not part of a soft- or dead-lockup and
+	 * we might just lose the messages. But then there is a high chance that
+	 * irq_work will get called and the messages will get flushed the normal way.
+	 */
+	if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK)
+		return;
+
 	for_each_possible_cpu(cpu) {
 #ifdef CONFIG_PRINTK_NMI
 		__printk_safe_flush(&per_cpu(nmi_print_seq, cpu).work);
-- 
2.1.4