I'm writing to see if it makes sense to track idle CPUs in a shared cpumask
in the sched domain, so that when a task wakes up it can select an idle CPU
from this cpumask instead of scanning all the CPUs in the last level cache
domain. When the system is heavily loaded, the scanning cost could be
significantly reduced. The price is that atomic cpumask ops are added to the
idle entry and exit paths.
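
To make the idea concrete, here is a minimal userspace sketch in C11. It is
not the patch itself: the names idle_mask, idle_mask_enter(),
idle_mask_exit() and idle_mask_pick() are made up for illustration, and
NR_CPUS is simply set to the 192 CPUs of the test machine. A kernel version
would presumably use the existing cpumask helpers and also honor the task's
allowed CPUs, which this sketch ignores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS       192   /* assumption: size of the test machine */
#define BITS_PER_WORD 64
#define MASK_WORDS    ((NR_CPUS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical shared state: one bit per CPU, set while that CPU is idle. */
struct idle_mask {
	_Atomic uint64_t bits[MASK_WORDS];
};

/* Idle entry path: atomically mark the CPU as idle. */
static void idle_mask_enter(struct idle_mask *m, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	atomic_fetch_or(&m->bits[cpu / BITS_PER_WORD],
			UINT64_C(1) << (cpu % BITS_PER_WORD));
}

/* Idle exit path: atomically mark the CPU as busy again. */
static void idle_mask_exit(struct idle_mask *m, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	atomic_fetch_and(&m->bits[cpu / BITS_PER_WORD],
			 ~(UINT64_C(1) << (cpu % BITS_PER_WORD)));
}

/*
 * Wakeup path: return the lowest-numbered CPU whose bit is set, or -1 if
 * no CPU is marked idle. Whole words that are zero are skipped with one
 * load each, which is where the saving over a per-CPU scan comes from.
 */
static int idle_mask_pick(struct idle_mask *m)
{
	for (unsigned int w = 0; w < MASK_WORDS; w++) {
		uint64_t word = atomic_load(&m->bits[w]);

		if (!word)
			continue;
		for (unsigned int b = 0; b < BITS_PER_WORD; b++) {
			if (word & (UINT64_C(1) << b))
				return (int)(w * BITS_PER_WORD + b);
		}
	}
	return -1;
}
```

The result is only a hint: a CPU picked from the mask may already have left
idle by the time the task is placed, so a caller would still need to cope
with a stale answer.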

I tested the following benchmarks on an x86 4-socket system with 24 cores per
socket and 2 hyperthreads per core, for a total of 192 CPUs:

uperf throughput: netperf workload, tcp_nodelay, r/w size = 90

  threads       baseline-avg    %std    patch-avg       %std
  96            1               1.24    0.98            2.76
  144           1               1.13    1.35            4.01
  192           1               0.58    1.67            3.25
  240           1               2.49    1.68            3.55

hackbench: process mode, 100000 loops, 40 file descriptors per group

  group         baseline-avg    %std    patch-avg       %std
  2(80)         1               12.05   0.97            9.88
  3(120)        1               12.48   0.95            11.62
  4(160)        1               13.83   0.97            13.22
  5(200)        1               2.76    1.01            2.94

schbench: 99th percentile latency, 16 workers per message thread

  mthread       baseline-avg    %std    patch-avg       %std
  6(96)         1               1.24    0.993           1.73
  9(144)        1               0.38    0.998           0.39
  12(192)       1               1.58    0.995           1.64
  15(240)       1               51.71   0.606           37.41

sysbench mysql throughput: read/write, table size = 10,000,000

  thread        baseline-avg    %std    patch-avg       %std
  96            1               1.77    1.015           1.71
  144           1               3.39    0.998           4.05
  192           1               2.88    1.002           2.81
  240           1               2.07    1.011           2.09

kbuild: kexec reboot every time

  baseline-avg  patch-avg
  1             1

Any suggestions are highly appreciated!

Thanks,
-Aubrey

Aubrey Li (1):
  sched/fair: select idle cpu from idle cpumask in sched domain

 include/linux/sched/topology.h | 13 +++++++++++++
 kernel/sched/fair.c            |  4 +++-
 kernel/sched/topology.c        |  2 +-
 3 files changed, 17 insertions(+), 2 deletions(-)

-- 
2.25.1
