Rumor has it that Linux 3.13 was supposed to get rid of all the silly rescheduling interrupts. It doesn't, although it does seem to have improved the situation.
A small number of reschedule interrupts appear to be due to a race: both resched_task and wake_up_idle_cpu do, essentially:

    set_tsk_need_resched(t);
    smp_mb();
    if (!tsk_is_polling(t))
        smp_send_reschedule(cpu);

The problem is that set_tsk_need_resched wakes the CPU and, if the CPU is too quick (which isn't surprising if it was in C0 or C1), then it could *clear* TS_POLLING before tsk_is_polling is read. The waker then sees a non-polling CPU and sends an interrupt that nobody needs.

Is there a good reason that TIF_NEED_RESCHED is in thread->flags and TS_POLLING is in thread->status? Couldn't both of these be in the same field in something like struct rq? That would allow a real atomic op here.

The more serious issue is that AFAICS default_wake_function is completely missing the polling check. It goes through ttwu_queue_remote, which unconditionally sends an interrupt.

--Andy
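
To illustrate the "same field" idea, here is a rough user-space sketch using C11 atomics. It is not kernel code: struct cpu_state, the two bit values, and both helper names are made up for illustration, and the real flags live elsewhere. The point is only that one fetch-or both sets the resched bit and reports whether the target was polling at that instant, so a quick wakeup can no longer slip in between the set and the test.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit layout; not the kernel's actual flag definitions. */
#define NEED_RESCHED (1u << 0)
#define POLLING      (1u << 1)

/* Hypothetical per-CPU word that holds both bits in one field. */
struct cpu_state {
    atomic_uint flags;
};

/*
 * Waker side: set NEED_RESCHED and, in the same atomic operation,
 * learn whether the target was polling when the bit was set.
 * Returns true if an interrupt would still be needed.
 */
static bool set_need_resched_needs_ipi(struct cpu_state *cs)
{
    unsigned int old = atomic_fetch_or(&cs->flags, NEED_RESCHED);

    return !(old & POLLING);
}

/*
 * Idle side: leave the polling state. Returns true if a reschedule
 * had already been requested while the CPU was still polling.
 */
static bool stop_polling(struct cpu_state *cs)
{
    unsigned int old = atomic_fetch_and(&cs->flags, ~POLLING);

    return (old & NEED_RESCHED) != 0;
}
```

With this layout, a waker that sees POLLING in the old value can skip the interrupt, since the idle loop is guaranteed to observe NEED_RESCHED either while polling or when it calls stop_polling.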