On 29/10/2020 17:18, Vincent Guittot wrote: [...]
> - On hikey960 with performance governor (EAS disable) > > ./perf bench sched pipe -T -l 50000 > mainline w/ patch > # migrations 999364 0 > ops/sec 149313(+/-0.28%) 182587(+/- 0.40) +22% > > - On hikey with performance governor > > ./perf bench sched pipe -T -l 50000 > mainline w/ patch > # migrations 0 0 > ops/sec 47721(+/-0.76%) 47899(+/- 0.56) +0.4% Tested on hikey960 (big cluster 0xf0) with perf gov on tip sched/core + patch) and defconfig plus: # CONFIG_ARM_CPUIDLE is not set # CONFIG_CPU_THERMAL is not set # CONFIG_HISI_THERMAL is not set and for 'w/ uclamp' tests: CONFIG_UCLAMP_TASK=y CONFIG_UCLAMP_BUCKETS_COUNT=5 CONFIG_UCLAMP_TASK_GROUP=y (a) perf stat -n -r 20 taskset 0xf0 perf bench sched pipe -T -l 50000 (b) perf stat -n -r 20 -- cgexec -g cpu:A/B taskset 0xf0 perf bench sched pipe -T -l 50000 (1) w/o uclamp (a) w/o patch: 0.392850 +- 0.000289 seconds time elapsed ( +- 0.07% ) w/ patch: 0.330786 +- 0.000401 seconds time elapsed ( +- 0.12% ) (b) w/o patch: 0.414644 +- 0.000375 seconds time elapsed ( +- 0.09% ) w/ patch: 0.353113 +- 0.000393 seconds time elapsed ( +- 0.11% ) (2) w/ uclamp (a) w/o patch: 0.393781 +- 0.000488 seconds time elapsed ( +- 0.12% ) w/ patch: 0.342726 +- 0.000661 seconds time elapsed ( +- 0.19% ) (b) w/o patch: 0.416645 +- 0.000520 seconds time elapsed ( +- 0.12% ) w/ patch: 0.358098 +- 0.000577 seconds time elapsed ( +- 0.16% ) Tested-by: Dietmar Eggemann <dietmar.eggem...@arm.com> > According to test on hikey, the patch doesn't impact symmetric system > compared to current implementation (only tested on arm64) > > Also read the uclamped value of task's utilization at most twice instead > instead each time we compare task's utilization with cpu's capacity. task_util could be passed into select_idle_capacity() avoiding the second call to uclamp_task_util()? With this I see a small improvement for (a) (3) w/ uclamp and passing task_util into sic() (a) w/ patch: 0.337032 +- 0.000564 seconds time elapsed ( +- 0.17% ) (b) w/ patch: 0.358467 +- 0.000381 seconds time elapsed ( +- 0.11% ) [...] > -symmetric: > - if (available_idle_cpu(target) || sched_idle_cpu(target)) > + if ((available_idle_cpu(target) || sched_idle_cpu(target)) && > + asym_fits_capacity(task_util, target)) > return target; Braces because of multi-line condition ? > /* > * If the previous CPU is cache affine and idle, don't be stupid: > */ > if (prev != target && cpus_share_cache(prev, target) && > - (available_idle_cpu(prev) || sched_idle_cpu(prev))) > + (available_idle_cpu(prev) || sched_idle_cpu(prev)) && > + asym_fits_capacity(task_util, prev)) > return prev; and here ... [...]