On Sat, 17 Nov 2012 19:28:53 -0500 Sasha Levin <sasha.le...@oracle.com> wrote:
> Send an NMI to all CPUs when a lockup is detected and the lockup
> watchdog code is configured to panic. This gives us a fairly uptodate
> snapshot of all CPUs in the system.
>
> This lets us get stack trace of all CPUs which makes life easier
> trying to debug a deadlock, and the NMI doesn't change anything
> since the next step is a kernel panic.
>

nit: I'll rename this to "watchdog: trigger all-cpu backtrace when
locked up and going to panic". We don't know how the arch implements
trigger_all_cpu_backtrace() at this level!

> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -239,10 +239,12 @@ static void watchdog_overflow_callback(struct perf_event *event,
>                 if (__this_cpu_read(hard_watchdog_warn) == true)
>                         return;
>
> -               if (hardlockup_panic)
> +               if (hardlockup_panic) {
> +                       trigger_all_cpu_backtrace();
>                         panic("Watchdog detected hard LOCKUP on cpu %d", this_cpu);
> -               else
> +               } else {
>                         WARN(1, "Watchdog detected hard LOCKUP on cpu %d", this_cpu);
> +               }
>
>                 __this_cpu_write(hard_watchdog_warn, true);
>                 return;
> @@ -323,8 +325,10 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>                 else
>                         dump_stack();
>
> -               if (softlockup_panic)
> +               if (softlockup_panic) {
> +                       trigger_all_cpu_backtrace();
>                         panic("softlockup: hung tasks");
> +               }
>                 __this_cpu_write(soft_watchdog_warn, true);
>         } else
>                 __this_cpu_write(soft_watchdog_warn, false);

The change seems sensible, but I wonder about CONFIG_SMP=n machines.
Will they end up getting the same backtrace displayed twice? (I don't
remember whether trigger_all_cpu_backtrace() is really
trigger_all_other_cpu_backtrace() and we didn't document it).
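
To make the double-backtrace question concrete, here is a small userspace
model of the patched panic path. It is only a sketch, not kernel code:
model_trigger_backtrace(), model_lockup_panic() and struct dump_counts are
hypothetical names, and the include_self flag stands in for the open question
of whether trigger_all_cpu_backtrace() covers the calling CPU. It also assumes
that the panicking CPU prints its own stack once on the way down.

```c
#include <stdbool.h>

/* Hypothetical model: how many times each CPU's stack gets printed. */
enum { MAX_CPUS = 8 };

struct dump_counts {
	int per_cpu[MAX_CPUS];
};

/*
 * Stand-in for trigger_all_cpu_backtrace().  include_self selects between
 * "all CPUs" and "all other CPUs" semantics; which one a given arch really
 * implements is not settled in this thread.
 */
static void model_trigger_backtrace(struct dump_counts *c, int nr_cpus,
				    int this_cpu, bool include_self)
{
	for (int cpu = 0; cpu < nr_cpus && cpu < MAX_CPUS; cpu++) {
		if (include_self || cpu != this_cpu)
			c->per_cpu[cpu]++;
	}
}

/* Patched path: broadcast the backtrace request, then panic locally. */
static struct dump_counts model_lockup_panic(int nr_cpus, int this_cpu,
					     bool include_self)
{
	struct dump_counts c = { { 0 } };

	model_trigger_backtrace(&c, nr_cpus, this_cpu, include_self);
	c.per_cpu[this_cpu]++;	/* assumed: the local CPU dumps its own stack */
	return c;
}
```

Under those assumptions, model_lockup_panic(1, 0, true) counts two dumps for
the only CPU, which is the CONFIG_SMP=n duplicate I am worried about, while
model_lockup_panic(1, 0, false) counts one.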