On 29/10/2020 17:18, Vincent Guittot wrote:

[...]

> - On hikey960 with performance governor (EAS disable)
> 
> ./perf bench sched pipe -T -l 50000
>              mainline           w/ patch
> # migrations   999364                  0
> ops/sec        149313(+/-0.28%)   182587(+/- 0.40) +22%
> 
> - On hikey with performance governor
> 
> ./perf bench sched pipe -T -l 50000
>              mainline           w/ patch
> # migrations        0                  0
> ops/sec         47721(+/-0.76%)    47899(+/- 0.56) +0.4%

Tested on hikey960 (big cluster 0xf0) with the performance governor on
tip sched/core (+ patch) and defconfig plus:

# CONFIG_ARM_CPUIDLE is not set
# CONFIG_CPU_THERMAL is not set
# CONFIG_HISI_THERMAL is not set

and for 'w/ uclamp' tests:

CONFIG_UCLAMP_TASK=y
CONFIG_UCLAMP_BUCKETS_COUNT=5
CONFIG_UCLAMP_TASK_GROUP=y

(a) perf stat -n -r 20 taskset 0xf0 perf bench sched pipe -T -l 50000

(b) perf stat -n -r 20 -- cgexec -g cpu:A/B taskset 0xf0 perf bench sched pipe -T -l 50000


(1) w/o uclamp

(a) w/o patch: 0.392850 +- 0.000289 seconds time elapsed  ( +-  0.07% )

    w/  patch: 0.330786 +- 0.000401 seconds time elapsed  ( +-  0.12% )

(b) w/o patch: 0.414644 +- 0.000375 seconds time elapsed  ( +-  0.09% )

    w/  patch: 0.353113 +- 0.000393 seconds time elapsed  ( +-  0.11% )

(2) w/ uclamp

(a) w/o patch: 0.393781 +- 0.000488 seconds time elapsed  ( +-  0.12% )

    w/  patch: 0.342726 +- 0.000661 seconds time elapsed  ( +-  0.19% )

(b) w/o patch: 0.416645 +- 0.000520 seconds time elapsed  ( +-  0.12% )

    w/  patch: 0.358098 +- 0.000577 seconds time elapsed  ( +-  0.16% )

Tested-by: Dietmar Eggemann <dietmar.eggem...@arm.com>

> According to test on hikey, the patch doesn't impact symmetric system
> compared to current implementation (only tested on arm64)
> 
> Also read the uclamped value of task's utilization at most twice instead
> of each time we compare task's utilization with cpu's capacity.

Could task_util be passed into select_idle_capacity() to avoid the
second call to uclamp_task_util()?

With this I see a small improvement for (a); (b) is essentially unchanged:

(3) w/ uclamp and passing task_util into select_idle_capacity()

(a) w/  patch: 0.337032 +- 0.000564 seconds time elapsed  ( +-  0.17% )

(b) w/  patch: 0.358467 +- 0.000381 seconds time elapsed  ( +-  0.11% )

[...]

> -symmetric:
> -     if (available_idle_cpu(target) || sched_idle_cpu(target))
> +     if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
> +         asym_fits_capacity(task_util, target))
>               return target;

Braces because of the multi-line condition?

>       /*
>        * If the previous CPU is cache affine and idle, don't be stupid:
>        */
>       if (prev != target && cpus_share_cache(prev, target) &&
> -         (available_idle_cpu(prev) || sched_idle_cpu(prev)))
> +         (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> +         asym_fits_capacity(task_util, prev))
>               return prev;

and here ...

[...]
