On Mon, 2017-06-26 at 18:12 +0200, Peter Zijlstra wrote: > On Mon, Jun 26, 2017 at 11:20:54AM -0400, Rik van Riel wrote: > > > Oh, indeed. I guess in wake_affine() we should test > > whether the CPUs are in the same NUMA node, rather than > > doing cpus_share_cache() ? > > Well, since select_idle_sibling() is on LLC; the early test on > cpus_share_cache(prev,this) seems to actually make sense. > > But then cutting out all the other bits seems wrong. Not in the least > because !NUMA_BALACING should also still keep working.
Even when !NUMA_BALANCING, I suspect it makes little sense to compare the loads just one the cores in question, since select_idle_sibling() will likely move the task somewhere else. I suspect we want to compare the load on the whole LLC for that reason, even with NUMA_BALANCING disabled. > > Or, alternatively, have an update_numa_stats() variant > > for numa_wake_affine() that works on the LLC level? > > I think we want to retain the existing behaviour for everything > larger than LLC, and when NUMA_BALANCING, smaller than NUMA. What do you mean by this, exactly? How does the "existing behaviour" of only looking at the load on two cores make sense when doing LLC-level task placement? > Also note that your use of task_h_load() in the new numa thing > suffers > from exactly the problem effective_load() is trying to solve. Are you saying task_h_load is wrong in task_numa_compare() too, then? Should both use effective_load()?