On Wed, 2007-02-21 at 19:18 +0100, Thomas Gleixner wrote: > On Wed, 2007-02-21 at 09:38 -0800, Daniel Walker wrote: > > > > > > > > Could be the switch over then which confuses the NMI . > > > > > > Why? The switch just stops the PIT/HPET. It does not fiddle with IO_APIC > > > and friends at all. > > > > I'm not an expert on the io-apic, but the check_timer() function seemed > > to assume IRQ0 was happening regularly .. > > Again: > > check_timer() is called _BEFORE_ we even touch the local APIC timers. At > this point PIT/HPET _IS_ firing IRQ0 with HZ frequency.
Right, but eventually there isn't a regular timer interrupt through the io-apic. I don't think in the past IRQ0 stops without the system crashing, so check_timer() could assume the timer (IRQ0) is _always_ regular. do you know what the requirement are for routing the NMI through the io-apic? > > Well, I'm pretty sure it's HRT, cause in prior versions this only > > happened when HRT is enabled. Then you guys went to the lapic all the > > time, and now this is happening all the time .. > > The NMI is stuck: > > if (nmi_count(cpu) - prev_nmi_count[cpu] <= 5) { > printk("CPU#%d: NMI appears to be stuck (%d->%d)!\n", > cpu, > prev_nmi_count[cpu], > nmi_count(cpu)); > > This has nothing to do with jiffies. I think it has to do with IRQ0 . Did I mention this doesn't happen in 2.6.20 . > There have been a bunch of changes in arch/i386/kernel/nmi.c as well. > > > You can't reproduce this? > > Nope. Do you use nmi_watchdog=1 ? Daniel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/