On Tue, 2007-09-11 at 11:06 -0700, Daniel Walker wrote:
> On Tue, 2007-09-11 at 11:39 -0600, David Bahi wrote:
> > On Tue, 2007-09-11 at 02:10 +0000, David Bahi wrote:
> > > trying to get a crashdump in a clean -rt series kernel which hardlocks
> > > on the ltp openposix conformance interfaces sched_yield 1-1.test ...
> > >
> > > i was trying with nmi_watchdog = 1 (x86_64 machine, dual cpu, quad core,
> > > HT enabled) and it would report NMI LOCKUP on presentation of the login
> > > prompt - no chance to run the test.
> > >
> > >                          linux-2.6.22.1-rt9   linux-2.6.23-rc4-rt1
> > > lockup on openposix
> > > sched_yield test occurs         yes                  yes
> > >
> > > crash on login prompt
> > > with nmi_watchdog=1             yes*                 yes*
> > >
> > > (*) console logs attached
> > >
> >
> > fyi 2.6.23-rc4 w/o -rt passes the LTP openposix sched_yeild test and
> > does *not* crash (NMI LOCKUP) if nmi_watchdog=1 is a boot arg.
> >
> > so these don't seem to be related to what i'm seeing at least:
> >
> > http://thread.gmane.org/gmane.linux.kernel/577449
>
> The patches above are related to nmi_watchdog=2 . Is the same lock up
> detected with nmi_watchdog=2 ?

sorry to pull your work into this thread wrongly, Daniel. my point was
really that the 23-rc4 kernel does not experience either failure for me,
and that the current work being done for nmi_watchdog is unrelated, so
this still needs attention.

no hang with nmi_watchdog=2 in 23-rc4 (non -rt)
no hang with nmi_watchdog=2 in 23-rc4-rt1

test box has dual quad-core Xeons - not a Core Duo - so the coreduo_ed_ops
workaround isn't needed, right?

and the endflag=1 isn't needed since x86_64 already sets this in both
code paths (inefficiently):

	if (!atomic_read(&nmi_active)) {
		kfree(counts);
		atomic_set(&nmi_active, -1);
		endflag = 1;
		return -1;
	}
	endflag = 1;
	printk("OK.\n");

finally, i didn't try nmi_watchdog=2 earlier because the
Documentation/nmi_watchdog.txt file says it's not a useful setting for
x86_64. quote:

    For x86-64, the needed APIC is always compiled in, and the NMI
    watchdog is always enabled with I/O-APIC mode (nmi_watchdog=1).
    Currently, local APIC mode (nmi_watchdog=2) does not work on x86-64.

Is this no longer true? My experience with nmi_watchdog=2 and this LTP
openposix sched_yield 1-1.test is that the test hardlocks the host and
that no watchdog is triggered with this setting.
db
