On Fri, May 15, 2020 at 02:03:46PM +0100, Mel Gorman wrote:
> On Fri, May 15, 2020 at 01:17:32PM +0200, Peter Zijlstra wrote:
> > On Fri, May 15, 2020 at 09:47:40AM +0100, Mel Gorman wrote:

> > +static bool ttwu_queue_remote(struct task_struct *p, int cpu, int wake_flags)
> > +{
> > +	if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
> > +		sched_clock_cpu(cpu); /* Sync clocks across CPUs */
> > +		__ttwu_queue_remote(p, cpu, wake_flags);
> > +		return true;
> > +	}
> > +
> > +	return false;
> > +}

> > +	if (READ_ONCE(p->on_cpu) && __ttwu_queue_remote(p, cpu, wake_flags))
> > +		goto unlock;

> I don't see a problem with moving the updating of p->state to the other
> side of the barrier but I'm relying on the comment that the barrier is
> only related to on_rq and on_cpu.

Yeah, I went with that too; like I said, I didn't think too hard about it.

> However, I'm less sure about what exactly you intended to do.
> __ttwu_queue_remote is void so maybe you meant to use ttwu_queue_remote.

That!

> In that case, we potentially avoid spinning on on_rq for wakeups between
> tasks that do not share CPU but it's not clear why it would be specific to
> remote tasks.

The thinking was that we can avoid spinning on ->on_cpu and let that CPU
get on with things. Rik had a workload where that spinning was
significant, and I understood you had seen the same.

By sticking the task on the wake_list of the CPU that is in charge of
clearing ->on_cpu, we ensure ->on_cpu is 0 by the time we get to doing
the actual enqueue.
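
To make the intent concrete, here is a rough sketch of the resulting
call site in try_to_wake_up() -- using the bool-returning
ttwu_queue_remote() quoted above instead of the void
__ttwu_queue_remote(). This only illustrates the idea discussed in this
thread, not the final form of any patch:

	/*
	 * If the task is still running on its old CPU, try to hand it
	 * off to that CPU's wake_list and bail out early; the remote
	 * CPU will do the actual enqueue once it has cleared ->on_cpu.
	 */
	if (READ_ONCE(p->on_cpu) && ttwu_queue_remote(p, cpu, wake_flags))
		goto unlock;

	/*
	 * Otherwise wait until the remote CPU is done with the task,
	 * i.e. until ->on_cpu drops to 0.
	 */
	smp_cond_load_acquire(&p->on_cpu, !VAL);

That way a cross-cache wakeup rides the IPI to the remote CPU instead
of busy-waiting on ->on_cpu here, which is exactly the spinning that
hurt Rik's workload.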