On Wed, Nov 19, 2014 at 11:44 AM, Linus Torvalds <torva...@linux-foundation.org> wrote: > On Wed, Nov 19, 2014 at 11:29 AM, Andi Kleen <a...@firstfloor.org> wrote: >> >> The exception handlers which use the IST stacks don't necessarily >> set irq count. Maybe they should. > > Hmm. I think they should. Since they clearly must not schedule, as > they use a percpu stack. > > Which exceptions use IST? > > [ grep grep ] > > Looks like stack, doublefault, nmi, debug and mce. And yes, I really > think they should all raise the irq count if they don't already. > Rather than add random arch-specific "let's check that we're on the > right stack" code to the might-sleep stuff, just use the one we have. >
Resurrecting an old thread: The outcome of this discussion was that ist_enter now raises HARDIRQ_COUNT. I think this is causing a problem. If a user program enables TF, it generates a bunch of debug exceptions. The handlers raise the IRQ count and do stuff, and apparently some of that stuff can raise a softirq. (I have no idea where the softirq is being raised.) The softirq code notices that we're in_interrupt and doesn't wake ksoftirqd because it thinks we're about to exit the interrupt and process the softirq. But we don't, which causes occasional warnings and confuses things (and me!). So how do we fix it? If we stop raising HARDIRQ_COUNT (and apply $SUBJECT?), then raise_softirq will wake ksoftirqd and life is good. But this seems a bit silly, since, if we entered the ist exception handler from a context with irqs on and softirqs enabled, we *could* plausibly handle the softirq right away -- we're on an essentially empty stack. (Of course, it's a *small* stack, since it could be the IST stack.) Or we could just let ksoftirqd do its thing and stop raising HARDIRQ_COUNT. We could add a new preempt count field just for IST (yuck). We could try to hijack a different preempt count field (NMI?). But I kind of like the idea of just reinstating the original patch of explicitly checking that we're on a safe stack in schedule and __might_sleep, since that is the actual condition we care about. --Andy