On Mon, Dec 07, 2020 at 04:04:41PM +0100, Vincent Guittot wrote:
> On Mon, 7 Dec 2020 at 10:15, Mel Gorman <mgor...@techsingularity.net> wrote:
> >
> > This is a minimal series to reduce the amount of runqueue scanning in
> > select_idle_sibling in the worst case.
> >
> > Patch 1 removes SIS_AVG_CPU because it's unused.
> >
> > Patch 2 improves the hit rate of p->recent_used_cpu to reduce the amount
> > of scanning. It should be relatively uncontroversial
> >
> > Patch 3-4 scans the runqueues in a single pass for select_idle_core()
> > and select_idle_cpu() so runqueues are not scanned twice. It's
> > a tradeoff because it benefits deep scans but introduces overhead
> > for shallow scans.
> >
> > Even if patch 3-4 is rejected to allow more time for Aubrey's idle cpu mask
>
> patch 3 looks fine and doesn't collide with Aubrey's work. But I don't
> like patch 4 which manipulates different cpumask including
> load_balance_mask out of LB and I prefer to wait for v6 of Aubrey's
> patchset which should fix the problem of possibly scanning twice busy
> cpus in select_idle_core and select_idle_cpu
>
Seems fair, we can see where we stand after V6 of Aubrey's work. A lot of
the motivation for patch 4 would go away if we managed to avoid calling
select_idle_core() unnecessarily. As it stands, we can call it a lot from
hackbench even though the chance of getting an idle core is minimal.

Assuming I revisit it, I'll update the schedstat debug patches to include
the number of times select_idle_core() starts versus how many times it
fails and see if I can think of a useful heuristic.

I'll wait for more review on patches 1-3 and if I hear nothing, I'll resend
just those.

Thanks Vincent.

-- 
Mel Gorman
SUSE Labs