On Sun, May 8, 2016 at 2:19 AM, Wanpeng Li <kernel...@gmail.com> wrote: > > 2016-05-06 23:43 GMT+08:00 Tomasz Grabiec <tgrab...@scylladb.com>: > > Hi, > > > > System details: > > > > Kernel: 4.2.0-27-generic #32~14.04.1-Ubuntu SMP Fri Jan 22 15:32:26 > > UTC 2016 x86_64 x86_64 x86_64 GNU/Linux > > CPU: 2 sockets, 8 cores per socket, 2 threads per core > > CPU Model: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz > > > > I was trying to reduce scheduling delays below 1/HZ, which is 4ms > > for our kernel. I enabled HRTICK scheduler feature and set > > scheduler tunables to: > > > > sched_latency_ns: 1000000 > > sched_min_granularity_ns: 400000 > > sched_wakeup_granularity_ns: 400000 > > sched_autogroup_enabled: 0 > > sched_tunable_scaling: 1 > > sched_cfs_bandwidth_slice_us: 5000 > > sched_child_runs_first: 0 > > sched_migration_cost_ns: 500000 > > sched_nr_migrate: 32 > > sched_rr_timeslice_ms: 25 > > sched_rt_period_us: 1000000 > > sched_rt_runtime_us: 950000 > > sched_shares_window_ns: 1000000 > > sched_time_avg_ms: 1000 > > > > Note that scheduling latency is set to 1ms and min_granularity to 0.4ms. > > > > The test case I used was to run two processes competing for CPU > > pinned to a single core, like this: > > > > python -c "import math; math.factorial(100000000)" & > > pid1=$! > > python -c "import math; math.factorial(100000000)" & > > pid2=$! > > sleep 1 > > sudo taskset -cap 2 $pid1 > > sudo taskset -cap 2 $pid2 > > > > Using perf I've found that scheduling delays do not stay around > > 0.5ms as could be expected from scheduling latency, but switch > > between modes of 0.5ms and 4ms back and forth. > > > > For example (perf script): > > > > python 147208 [002] 3272129.329011: sched:sched_stat_wait: > > comm=python pid=147207 delay=504875 [ns] > > python 147207 [002] 3272129.329515: sched:sched_stat_wait: > > comm=python pid=147208 delay=504096 [ns] > > python 147208 [002] 3272129.330019: sched:sched_stat_wait: > > comm=python pid=147207 delay=503936 [ns] > > python 147207 [002] 3272129.330523: sched:sched_stat_wait: > > comm=python pid=147208 delay=503888 [ns] > > python 147208 [002] 3272129.331067: sched:sched_stat_wait: > > comm=python pid=147207 delay=543902 [ns] > > python 147207 [002] 3272129.334047: sched:sched_stat_wait: > > comm=python pid=147208 delay=2980114 [ns] > > python 147208 [002] 3272129.338048: sched:sched_stat_wait: > > comm=python pid=147207 delay=4000695 [ns] > > python 147207 [002] 3272129.342051: sched:sched_stat_wait: > > comm=python pid=147208 delay=4002851 [ns] > > python 147208 [002] 3272129.346055: sched:sched_stat_wait: > > comm=python pid=147207 delay=4004720 [ns] > > python 147207 [002] 3272129.350058: sched:sched_stat_wait: > > comm=python pid=147208 delay=4002403 [ns] > > python 147208 [002] 3272129.354064: sched:sched_stat_wait: > > comm=python pid=147207 delay=4006037 [ns] > > python 147207 [002] 3272129.358064: sched:sched_stat_wait: > > comm=python pid=147208 delay=3999730 [ns] > > > > More detailed trace around such transition point looks like this: > > > > perf record \ > > -e sched:sched_stat_runtime \ > > -e sched:sched_stat_wait \ > > -e sched:sched_switch \ > > -e probe:hrtick \ > > -e probe:hrtick_start \ > > -e probe:scheduler_tick -C 2 > > > > python 873 [002] 3273334.927836: probe:hrtick: (ffffffff8109b430) > > python 873 [002] 3273334.927837: sched:sched_stat_runtime: > > comm=python pid=873 runtime=502915 [ns] vruntime=1549788940847739 [ns] > > python 873 [002] 3273334.927837: sched:sched_stat_runtime: > > comm=python pid=873 runtime=880 [ns] vruntime=1549788940848619 [ns] > > python 873 [002] 3273334.927838: sched:sched_stat_wait: > > comm=python pid=874 delay=503795 [ns] > > python 873 [002] 3273334.927838: probe:hrtick_start: (ffffffff8109f4f0) > > python 873 [002] 3273334.927839: sched:sched_switch: > > prev_comm=python prev_pid=873 prev_prio=120 prev_state=R ==> > > next_comm=python next_pid=874 next_prio=120 > > python 874 [002] 3273334.928233: probe:scheduler_tick: > > (ffffffff810a1f30) > > python 874 [002] 3273334.928233: sched:sched_stat_runtime: > > comm=python pid=874 runtime=395992 [ns] vruntime=1549788941220502 [ns] > > python 874 [002] 3273334.928236: sched:sched_stat_runtime: > > comm=python pid=874 runtime=2278 [ns] vruntime=1549788941222780 [ns] > > python 874 [002] 3273334.928238: sched:sched_stat_wait: > > comm=kworker/2:0 pid=93149 delay=0 [ns] > > python 874 [002] 3273334.928238: probe:hrtick_start: (ffffffff8109f4f0) > > python 874 [002] 3273334.928239: sched:sched_switch: > > prev_comm=python prev_pid=874 prev_prio=120 prev_state=R ==> > > next_comm=kworker/2:0 next_pid=93149 next_prio=120 > > kworker/2:0 93149 [002] 3273334.928242: sched:sched_stat_runtime: > > comm=kworker/2:0 pid=93149 runtime=5827 [ns] vruntime=1549788940354446 > > [ns] > > kworker/2:0 93149 [002] 3273334.928242: probe:hrtick_start: > > (ffffffff8109f4f0) > > kworker/2:0 93149 [002] 3273334.928243: sched:sched_stat_wait: > > comm=python pid=874 delay=5827 [ns] > > kworker/2:0 93149 [002] 3273334.928244: probe:hrtick_start: > > (ffffffff8109f4f0) > > kworker/2:0 93149 [002] 3273334.928244: sched:sched_switch: > > prev_comm=kworker/2:0 prev_pid=93149 prev_prio=120 prev_state=S ==> > > next_comm=python next_pid=874 next_prio=120 > > python 874 [002] 3273334.928745: probe:hrtick: (ffffffff8109b430) > > python 874 [002] 3273334.928745: sched:sched_stat_runtime: > > comm=python pid=874 runtime=503888 [ns] vruntime=1549788941726668 [ns] > > python 874 [002] 3273334.928746: sched:sched_stat_runtime: > > comm=python pid=874 runtime=910 [ns] vruntime=1549788941727578 [ns] > > python 874 [002] 3273334.928747: sched:sched_stat_wait: > > comm=python pid=873 delay=908895 [ns] > > python 874 [002] 3273334.928747: probe:hrtick_start: (ffffffff8109f4f0) > > python 874 [002] 3273334.928748: sched:sched_switch: > > prev_comm=python prev_pid=874 prev_prio=120 prev_state=R ==> > > next_comm=python next_pid=873 next_prio=120 > > python 873 [002] 3273334.929249: probe:hrtick: (ffffffff8109b430) > > python 873 [002] 3273334.929249: sched:sched_stat_runtime: > > comm=python pid=873 runtime=502850 [ns] vruntime=1549788941351469 [ns] > > python 873 [002] 3273334.929250: sched:sched_stat_runtime: > > comm=python pid=873 runtime=918 [ns] vruntime=1549788941352387 [ns] > > python 873 [002] 3273334.932236: probe:scheduler_tick: > > (ffffffff810a1f30) > > python 873 [002] 3273334.932237: sched:sched_stat_runtime: > > comm=python pid=873 runtime=2986981 [ns] vruntime=1549788944339368 > > [ns] > > python 873 [002] 3273334.932239: sched:sched_stat_runtime: > > comm=python pid=873 runtime=1685 [ns] vruntime=1549788944341053 [ns] > > python 873 [002] 3273334.932239: sched:sched_stat_wait: > > comm=python pid=874 delay=3492434 [ns] > > python 873 [002] 3273334.932240: probe:hrtick_start: (ffffffff8109f4f0) > > python 873 [002] 3273334.932240: sched:sched_switch: > > prev_comm=python prev_pid=873 prev_prio=120 prev_state=R ==> > > next_comm=python next_pid=874 next_prio=120 > > python 874 [002] 3273334.932741: probe:hrtick: (ffffffff8109b430) > > python 874 [002] 3273334.932742: sched:sched_stat_runtime: > > comm=python pid=874 runtime=503174 [ns] vruntime=1549788942230752 [ns] > > python 874 [002] 3273334.932743: sched:sched_stat_runtime: > > comm=python pid=874 runtime=948 [ns] vruntime=1549788942231700 [ns] > > python 874 [002] 3273334.936240: probe:scheduler_tick: > > (ffffffff810a1f30) > > python 874 [002] 3273334.936241: sched:sched_stat_runtime: > > comm=python pid=874 runtime=3497530 [ns] vruntime=1549788945729230 > > [ns] > > python 874 [002] 3273334.936249: sched:sched_stat_runtime: > > comm=python pid=874 runtime=8553 [ns] vruntime=1549788945737783 [ns] > > python 874 [002] 3273334.936250: sched:sched_stat_wait: > > comm=python pid=873 delay=4010205 [ns] > > python 874 [002] 3273334.936250: probe:hrtick_start: (ffffffff8109f4f0) > > python 874 [002] 3273334.936251: sched:sched_switch: > > prev_comm=python prev_pid=874 prev_prio=120 prev_state=R ==> > > next_comm=python next_pid=873 next_prio=120 > > > > > > Please advise, is this a bug? > > > > Could you cat /sys/kernel/debug/sched_features?
Here it goes: $ sudo cat /sys/kernel/debug/sched_features | xargs -n 1 echo GENTLE_FAIR_SLEEPERS START_DEBIT NO_NEXT_BUDDY LAST_BUDDY CACHE_HOT_BUDDY WAKEUP_PREEMPTION ARCH_CAPACITY HRTICK NO_DOUBLE_TICK LB_BIAS NONTASK_CAPACITY TTWU_QUEUE RT_PUSH_IPI NO_FORCE_SD_OVERLAP RT_RUNTIME_SHARE NO_LB_MIN NUMA NUMA_FAVOUR_HIGHER NO_NUMA_RESIST_LOWER