On Sat, 17 Nov 2012 19:28:53 -0500
Sasha Levin <sasha.le...@oracle.com> wrote:

> Send an NMI to all CPUs when a lockup is detected and the lockup
> watchdog code is configured to panic. This gives us a fairly up-to-date
> snapshot of all CPUs in the system.
> 
> This lets us get a stack trace of all CPUs, which makes life easier
> when trying to debug a deadlock, and the NMI doesn't change anything
> since the next step is a kernel panic.
> 

nit: I'll rename this to "watchdog: trigger all-cpu backtrace when
locked up and going to panic".  We don't know how the arch implements
trigger_all_cpu_backtrace() at this level!


> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -239,10 +239,12 @@ static void watchdog_overflow_callback(struct perf_event *event,
>               if (__this_cpu_read(hard_watchdog_warn) == true)
>                       return;
>  
> -             if (hardlockup_panic)
> +             if (hardlockup_panic) {
> +                     trigger_all_cpu_backtrace();
>                       panic("Watchdog detected hard LOCKUP on cpu %d", this_cpu);
> -             else
> +             } else {
>                       WARN(1, "Watchdog detected hard LOCKUP on cpu %d", this_cpu);
> +             }
>  
>               __this_cpu_write(hard_watchdog_warn, true);
>               return;
> @@ -323,8 +325,10 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>               else
>                       dump_stack();
>  
> -             if (softlockup_panic)
> +             if (softlockup_panic) {
> +                     trigger_all_cpu_backtrace();
>                       panic("softlockup: hung tasks");
> +             }
>               __this_cpu_write(soft_watchdog_warn, true);
>       } else
>               __this_cpu_write(soft_watchdog_warn, false);

The change seems sensible, but I wonder about CONFIG_SMP=n machines. 
Will they end up getting the same backtrace displayed twice?

(I don't remember whether trigger_all_cpu_backtrace() is really
trigger_all_other_cpu_backtrace() and we didn't document it).
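
To make the question concrete, here is a rough userspace model of the
soft-lockup panic path. It is not kernel code: model_trigger_backtrace(),
model_softlockup_panic(), MAX_CPUS and the include_self flag are all
made-up names, and include_self stands in for the behaviour I can't
remember. The only thing taken from the patch is the ordering: the
detecting CPU dumps its own stack first, then the all-cpu backtrace is
triggered.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CPUS 8	/* arbitrary bound for this model only */

/*
 * Hypothetical stand-in for trigger_all_cpu_backtrace(): instead of
 * printing, bump a per-CPU counter for every trace that would be shown.
 * include_self models the undocumented "all" vs "all other" behaviour.
 */
static void model_trigger_backtrace(int ncpus, int this_cpu,
				    bool include_self, int traces[])
{
	for (int cpu = 0; cpu < ncpus; cpu++) {
		if (cpu == this_cpu && !include_self)
			continue;
		traces[cpu]++;
	}
}

/*
 * Models the patched soft-lockup path and returns how many times the
 * detecting CPU's own stack would be printed before the panic.
 */
static int model_softlockup_panic(int ncpus, int this_cpu, bool include_self)
{
	int traces[MAX_CPUS] = { 0 };

	assert(ncpus >= 1 && ncpus <= MAX_CPUS);
	assert(this_cpu >= 0 && this_cpu < ncpus);

	/* models the dump_stack() that runs before the panic check */
	traces[this_cpu]++;

	/* models the call added by the patch */
	model_trigger_backtrace(ncpus, this_cpu, include_self, traces);

	return traces[this_cpu];
}
```

With one CPU, model_softlockup_panic(1, 0, true) returns 2 and
model_softlockup_panic(1, 0, false) returns 1, so under this model the
duplicate only appears if the current CPU is included. Whether a real
CONFIG_SMP=n build behaves that way depends on the arch implementation.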
