On 2020/12/7 23:42, Mel Gorman wrote:
> On Mon, Dec 07, 2020 at 04:04:41PM +0100, Vincent Guittot wrote:
>> On Mon, 7 Dec 2020 at 10:15, Mel Gorman <mgor...@techsingularity.net> wrote:
>>>
>>> This is a minimal series to reduce the amount of runqueue scanning in
>>> select_idle_sibling in the worst case.
>>>
>>> Patch 1 removes SIS_AVG_CPU because it's unused.
>>>
>>> Patch 2 improves the hit rate of p->recent_used_cpu to reduce the amount
>>> of scanning. It should be relatively uncontroversial
>>>
>>> Patch 3-4 scans the runqueues in a single pass for select_idle_core()
>>> and select_idle_cpu() so runqueues are not scanned twice. It's
>>> a tradeoff because it benefits deep scans but introduces overhead
>>> for shallow scans.
>>>
>>> Even if patch 3-4 is rejected to allow more time for Aubrey's idle cpu mask
>>
>> patch 3 looks fine and doesn't collide with Aubrey's work. But I don't
>> like patch 4 which manipulates different cpumask including
>> load_balance_mask out of LB and I prefer to wait for v6 of Aubrey's
>> patchset which should fix the problem of possibly scanning twice busy
>> cpus in select_idle_core and select_idle_cpu
>>
>
> Seems fair, we can see where we stand after V6 of Aubrey's work. A lot
> of the motivation for patch 4 would go away if we managed to avoid calling
> select_idle_core() unnecessarily. As it stands, we can call it a lot from
> hackbench even though the chance of getting an idle core are minimal.
>

Sorry for the delay, I sent v6 out just now. Compared to v5, v6 follows
Vincent's suggestion to decouple the idle cpumask update from the stop_tick
signal; that is, the CPU is set in the idle cpumask every time it enters
idle. This should address Peter's concern about the Facebook tail-latency
workload, as I didn't see any regression in the schbench 99.0000th
percentile latency report.

However, I also haven't seen any significant benefit so far; I probably need
to put more load on the system. I'll do more characterization of the uperf
workload to see if I can find anything.

Thanks,
-Aubrey