From: Hoeun Ryu <hoeun....@lge.com>

Make printk_safe_flush() safe in NMI context.

nmi_trigger_cpumask_backtrace() can be called in NMI context. For example, the
function is called from watchdog_overflow_callback() if the hardlockup
backtrace flag (sysctl_hardlockup_all_cpu_backtrace) is set, and
watchdog_overflow_callback() is called in NMI context on some architectures.

Calling printk_safe_flush() in nmi_trigger_cpumask_backtrace() eventually tries
to take logbuf_lock in vprintk_emit(). That lock might already be taken in
a non-NMI context on the same CPU, or be part of a soft- or hard-lockup on
another CPU. An example of the deadlock:

 CPU0
 local_irq_save();
 for (;;)
   req = blk_peek_request(q);
   if (unlikely(!scsi_device_online(sdev)))
     printk()
       vprintk_emit()
         console_unlock()
           logbuf_lock_irqsave()
             slow-serial-console-write()        // close to watchdog threshold
               watchdog_overflow_callback()
                 trigger_allbutself_cpu_backtrace()
                   printk_safe_flush()
                     vprintk_emit()
                       logbuf_lock_irqsave()
                       ^^^^ deadlock

and some other cases.

This patch prevents a deadlock in printk_safe_flush() in NMI context. It makes
sure that we continue and eventually call printk_safe_flush_on_panic() from
panic(), which has a better chance to succeed.

There is a risk that logbuf_lock was not part of a soft- or dead-lockup and we
might just lose the messages. But then there is a high chance that irq_work
will get called and the messages will get flushed the normal way.

Signed-off-by: Hoeun Ryu <hoeun....@lge.com>
Suggested-by: Petr Mladek <pmla...@suse.com>
Suggested-by: Sergey Senozhatsky <sergey.senozhatsky.w...@gmail.com>
---
 v2: fix comments in commit message and code. no change in code itself.
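
 For illustration only, not part of the patch: a minimal user-space sketch of
 the early-return guard described above. Everything prefixed with model_ or
 MODEL_ is a hypothetical stand-in, not a kernel symbol: model_printk_context
 stands in for this_cpu_read(printk_context), MODEL_NMI_CONTEXT_MASK for
 PRINTK_NMI_CONTEXT_MASK (the bit value here is an assumption), and the
 assert() stands in for the deadlock on logbuf_lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PRINTK_NMI_CONTEXT_MASK; the value is assumed. */
#define MODEL_NMI_CONTEXT_MASK 0x80000000u

/* Models this_cpu_read(printk_context) for a single CPU. */
static unsigned int model_printk_context;

/* Models logbuf_lock already being held by the interrupted context. */
static bool model_logbuf_locked;

/* Counts how many times the flush path actually ran. */
static int model_flushed;

/*
 * Models the guarded printk_safe_flush(): returns false when the flush is
 * skipped because we are in NMI context, true when the flush ran.
 */
static bool model_printk_safe_flush(void)
{
	/* Bail out in NMI context instead of risking the lock. */
	if (model_printk_context & MODEL_NMI_CONTEXT_MASK)
		return false;

	/* Taking an already-held lock here would be the deadlock. */
	assert(!model_logbuf_locked);

	model_flushed++;
	return true;
}
```

 With the NMI bit set and the lock modelled as held, the function returns
 without touching the lock. Outside NMI context it flushes as before.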

 kernel/printk/printk_safe.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 3e3c200..3b5c660 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -254,6 +254,17 @@ void printk_safe_flush(void)
 {
        int cpu;
 
+       /*
+        * Just avoid a deadlock here.
+        * It makes sure that we continue and eventually call
+        * printk_safe_flush_on_panic() from panic() that has better chances to succeed.
+        * There is a risk that logbuf_lock was not part of a soft- or dead-lockup and
+        * we might just lose the messages. But then there is a high chance that
+        * irq_work will get called and the messages will get flushed the normal way.
+        */
+       if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK)
+               return;
+
        for_each_possible_cpu(cpu) {
 #ifdef CONFIG_PRINTK_NMI
                __printk_safe_flush(&per_cpu(nmi_print_seq, cpu).work);
-- 
2.1.4
