Hi, I'm playing with the NMI watchdog (nmi_watchdog=1) on a reproducible hard lockup (no keyboard, etc.), but it seems it doesn't work and I can't understand why. Please explain the possible causes to me. I believe it should work in this situation.
environment:
P4C800 motherboard, P4-2.4 CPU (APIC 2.0 on)
Promise 20378 SATA controller on the motherboard (sil_promise driver)
Maxtor DiamondMax Plus 9 200G SATA disk
(and an empty PCI expander plus some other hardware under test, which doesn't matter in the experiment)
mainline kernel 2.6.11
serial and VGA console, root on NFS
steps to the lockup:
1. boot the machine with the SATA drive on the Promise controller
2. dd if=/dev/sda of=/dev/null bs=4k
3. unplug the power from the drive
4. wait about 2 seconds
5. plug the power back in
dd is stuck in 'D' state here for 10-15 seconds and then the kernel says: ata1: command timeout
and voila, the box is dead, but without any message from the NMI watchdog :(
The NMI watchdog only triggers if something is blocking interrupts from getting through; if timer interrupts are still happening, it won't activate. You can try Alt-SysRq-T to get a traceback of where the current process is stuck.
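
To illustrate why a still-ticking timer keeps the watchdog quiet, here is a rough user-space model of that kind of check. It is only a sketch, not the kernel's actual code: the struct, the function name watchdog_nmi_check, and the NMI_HZ and LOCKUP_SECONDS values are all made up for the example. The idea is that each watchdog NMI compares the timer interrupt count with the one it saw last time, and only complains after the count has stood still for several seconds.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of an NMI watchdog check.
 * Names and constants are illustrative assumptions, not kernel symbols. */
#define NMI_HZ          1   /* assumed: watchdog NMIs per second */
#define LOCKUP_SECONDS  5   /* assumed: stall time before a report */

struct watchdog_state {
    unsigned long last_timer_irqs; /* timer IRQ count seen on the previous NMI */
    unsigned int  stalled_ticks;   /* consecutive NMIs with no timer progress */
};

/* Called on every watchdog NMI with the current timer interrupt count.
 * Returns true when a lockup should be reported. */
bool watchdog_nmi_check(struct watchdog_state *ws, unsigned long timer_irqs)
{
    if (timer_irqs != ws->last_timer_irqs) {
        /* Timer interrupts are still arriving: the CPU counts as alive. */
        ws->last_timer_irqs = timer_irqs;
        ws->stalled_ticks = 0;
        return false;
    }

    /* No timer interrupt since the last NMI: count the stall. */
    ws->stalled_ticks++;
    return ws->stalled_ticks >= NMI_HZ * LOCKUP_SECONDS;
}
```

In this model a box that is "dead" from the user's point of view but still servicing timer interrupts never reaches the reporting branch, which would match a hang with no NMI watchdog message.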