On Tue, Jun 3, 2014 at 7:02 AM, Peter Zijlstra <pet...@infradead.org> wrote: > On Tue, Jun 03, 2014 at 12:43:47PM +0200, Peter Zijlstra wrote: >> We need rq->curr, rq->idle 'sleeps' with polling set and nr clear, but >> it obviously has no effect setting that if its not actually the current >> task. >> >> Touching rq->curr needs holding rcu_read_lock() though, to make sure the >> task stays around, still shouldn't be a problem. > >> @@ -1581,8 +1604,14 @@ void scheduler_ipi(void) >> >> static void ttwu_queue_remote(struct task_struct *p, int cpu) >> { >> - if (llist_add(&p->wake_entry, &cpu_rq(cpu)->wake_list)) >> - smp_send_reschedule(cpu); >> + struct rq *rq = cpu_rq(cpu); >> + >> + if (llist_add(&p->wake_entry, &rq->wake_list)) { >> + rcu_read_lock(); >> + if (!set_nr_if_polling(rq->curr)) >> + smp_send_reschedule(cpu); >> + rcu_read_unlock(); >> + } >> } > > Hrmm, I think that is still broken, see how in schedule() we clear NR > before setting the new ->curr. > > So I think I had a loop on rq->curr the last time we talked about this, > but alternatively we could look at clearing NR after setting a new curr. > > I think I once looked at why it was done before, of course I can't > actually remember the details :/
Wouldn't this be a little simpler and maybe even faster if we just changed the idle loop to make TIF_POLLING_NRFLAG be a real indication that the idle task is running and actively polling? That is, change the end of cpuidle_idle_loop to: preempt_set_need_resched(); tick_nohz_idle_exit(); clear_tsk_need_resched(current); __current_clr_polling(); smp_mb__after_clear_bit(); WARN_ON_ONCE(test_thread_flag(TIF_POLLING_NRFLAG)); sched_ttwu_pending(); schedule_preempt_disabled(); __current_set_polling(); This has the added benefit that the optimistic version of the cmpxchg loop would be safe again. I'm about to test this with this variant. I'll try and send a comprehensible set of patches in a few hours. Can you remind me what the benefit was of letting polling be set when the idle thread schedules? It seems racy to me: it probably prevents any safe use of the polling bit without holding the rq lock. I guess there's some benefit to having polling be set for as long as possible, but it only helps if there are wakeups in very rapid succession, and it costs a couple of extra bit ops per idle entry. -- Andy Lutomirski AMA Capital Management, LLC -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/