On Thu, 28 Jul 2005 21:16:56 -0700, George Anzinger <george@mvista.com> wrote: >Keith Owens wrote: >> On Thu, 28 Jul 2005 13:31:58 -0700, >> George Anzinger <george@mvista.com> wrote: >> >>>I have been doing some work on kgdb to pull a few of it "fingers" out of >>>various places in the kernel. This is the final location where we have >>>a kgdb intercept not covered by a notify. >> >> >> I like the idea, but the hook should be in die_nmi(), not in the >> watchdog, using the reason that is already passed into die_nmi. >> die_nmi() is also called for a real NMI. >> >I had though that too, but it does not allow recovery (i.e. lets reset >the watchdog and try again).
die_nmi() returns to nmi_watchdog_tick(), nmi_watchdog_tick does the reset and continues. Patch below. >Hmm.. just looked at traps.c. Seems die_nmi is NOT called from the nmi >trap, only from the watchdog. Also, there is a notify in the path to >the other nmi stuff. I was looking at unknown_nmi_panic_callback(), which also calls die_nmi(). traps.c already has several notify_die() calls, nmi.c has none. It is cleaner to keep all the notification in traps.c, with this small change to nmi.c to cope with die_nmi() returning. Index: linux/arch/i386/kernel/nmi.c =================================================================== --- linux.orig/arch/i386/kernel/nmi.c 2005-07-28 17:22:06.735038510 +1000 +++ linux/arch/i386/kernel/nmi.c 2005-07-29 15:19:00.371196596 +1000 @@ -494,8 +494,10 @@ void nmi_watchdog_tick (struct pt_regs * * wait a few IRQs (5 seconds) before doing the oops ... */ alert_counter[cpu]++; - if (alert_counter[cpu] == 5*nmi_hz) + if (alert_counter[cpu] == 5*nmi_hz) { die_nmi(regs, "NMI Watchdog detected LOCKUP"); + alert_counter[cpu] = 0; + } } else { last_irq_sums[cpu] = sum; alert_counter[cpu] = 0; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/