On Thu, Feb 19, 2015 at 1:59 PM, Linus Torvalds
<torva...@linux-foundation.org> wrote:
>
> Is this worth looking at? Or is it something spurious? I might have
> gotten the vectors wrong, and maybe the warning is not because the ISR
> bit isn't set, but because I test the wrong bit.

I edited the patch to do ratelimiting (one per 10s max) rather than
"once". And tested it some more. It seems to work correctly. The irq
case during 8042 probing is not repeatable, and I suspect it happens
because the interrupt source goes away (some probe-time thing that
first triggers an interrupt, but then clears it itself), so it doesn't
happen every boot, and I've gotten it with slightly different
backtraces.

But it's the only warning that happens for me, so I think my code is
right (at least for the cases that trigger on this machine). It's
definitely not a "every interrupt causes the warning because the code
was buggy, and the WARN_ONCE() just printed the first one".

It would be interesting to hear if others see spurious APIC EOI cases
too. In particular, the people seeing the IPI lockup. Because a lot of
the lockups we've seen have *looked* like the IPI interrupt just never
happened, and so we're waiting forever for the target CPU to react to
it. And just maybe the spurious EOI could cause the wrong bit to be
cleared in the ISR, and then the interrupt never shows up.

Something like that would certainly explain why it only happens on
some machines and under certain timing circumstances.

                     Linus