On Fri, 11 Apr 2014, Peter Zijlstra wrote: > Because mwait_idle_with_hints() gets called from !idle context it must > call current_clr_polling(). This however means that resched_task() is > very likely to send an IPI even when we were polling: > > CPU0 CPU1 > > if (current_set_polling_and_test()) > goto out; > > __monitor(&ti->flags); > if (!need_resched()) > __mwait(eax, ecx); > set_tsk_need_resched(p); > smp_mb(); > out: > current_clr_polling(); > if (!tsk_is_polling(p)) > smp_send_reschedule(cpu); > > > So while it is correct (extra IPIs aren't a problem, whereas a missed > IPI would be) it is a performance problem (for some). > > Avoid this issue by using fetch_or() to atomically set NEED_RESCHED > and test if POLLING_NRFLAG is set. > > Since a CPU stuck in mwait is unlikely to modify the flags word, > contention on the cmpxchg is unlikely and thus we should mostly > succeed in a single go.
Looks like a nice cleanup. Acked-by: Nicolas Pitre <n...@linaro.org> > > Cc: Thomas Gleixner <t...@linutronix.de> > Cc: Andy Lutomirski <l...@amacapital.net> > Signed-off-by: Peter Zijlstra <pet...@infradead.org> > --- > kernel/sched/core.c | 41 ++++++++++++++++++++++++++++++++++++----- > 1 file changed, 36 insertions(+), 5 deletions(-) > > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -505,6 +505,39 @@ static inline void init_hrtick(void) > #endif /* CONFIG_SCHED_HRTICK */ > > /* > + * cmpxchg based fetch_or, macro so it works for different integer types > + */ > +#define fetch_or(ptr, val) \ > +({ typeof(*(ptr)) __old, __val = *(ptr); \ > + for (;;) { \ > + __old = cmpxchg((ptr), __val, __val | (val)); \ > + if (__old == __val) \ > + break; \ > + __val = __old; \ > + } \ > + __old; \ > +}) > + > +#ifdef TIF_POLLING_NRFLAG > +/* > + * Atomically set TIF_NEED_RESCHED and test for TIF_POLLING_NRFLAG, > + * this avoids any races wrt polling state changes and thereby avoids > + * spurious IPIs. > + */ > +static bool set_nr_and_not_polling(struct task_struct *p) > +{ > + struct thread_info *ti = task_thread_info(p); > + return !(fetch_or(&ti->flags, _TIF_NEED_RESCHED) & _TIF_POLLING_NRFLAG); > +} > +#else > +static bool set_nr_and_not_polling(struct task_struct *p) > +{ > + set_tsk_need_resched(p); > + return true; > +} > +#endif > + > +/* > * resched_task - mark a task 'to be rescheduled now'. > * > * On UP this means the setting of the need_resched flag, on SMP it > @@ -520,17 +553,15 @@ void resched_task(struct task_struct *p) > if (test_tsk_need_resched(p)) > return; > > - set_tsk_need_resched(p); > - > cpu = task_cpu(p); > + > if (cpu == smp_processor_id()) { > + set_tsk_need_resched(p); > set_preempt_need_resched(); > return; > } > > - /* NEED_RESCHED must be visible before we test polling */ > - smp_mb(); > - if (!tsk_is_polling(p)) > + if (set_nr_and_not_polling(p)) > smp_send_reschedule(cpu); > } > > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/