On 23/02/21 10:30, kernel test robot wrote:
> Greeting,
>
> FYI, we noticed a -13.9% regression of stress-ng.vm-segv.ops_per_sec due to commit:
>
>
> commit: b360fb5e5954a8a440ef95bf11257e2e7ea90340 ("[PATCH v2 1/7] sched/fair: Ignore percpu threads for imbalance pulls")
> url: https://github.com/0day-ci/linux/commits/Valentin-Schneider/sched-fair-misfit-task-load-balance-tweaks/20210219-211028
> base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git c5e6fc08feb2b88dc5dac2f3c817e1c2a4cafda4
>
> in testcase: stress-ng
> on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
> with following parameters:
>
>         nr_threads: 10%
>         disk: 1HDD
>         testtime: 60s
>         fs: ext4
>         class: vm
>         test: vm-segv
>         cpufreq_governor: performance
>         ucode: 0x5003003
>
>
>
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <oliver.s...@intel.com>
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> To reproduce:
>
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp split-job --compatible job.yaml
>         bin/lkp run compatible-job.yaml
>
> =========================================================================================
> class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
>
> vm/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp7/vm-segv/stress-ng/60s/0x5003003
>
> commit:
>   c5e6fc08fe ("sched,x86: Allow !PREEMPT_DYNAMIC")
>   b360fb5e59 ("sched/fair: Ignore percpu threads for imbalance pulls")
>
> c5e6fc08feb2b88d b360fb5e5954a8a440ef95bf112
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>           1:6           -3%           1:6     perf-profile.children.cycles-pp.error_entry
>           0:6           -1%           0:6     perf-profile.self.cycles-pp.error_entry
>
>            %stddev   %change            %stddev
>                \         |                   \
>      11324 ±  3%    -28.1%        8140 ±  3%  stress-ng.time.involuntary_context_switches
>       6818 ± 15%   +315.2%       28311 ± 12%  stress-ng.time.major_page_faults
>   30952041          -12.8%    26988502        stress-ng.time.minor_page_faults
>     378.82           +5.3%      398.75        stress-ng.time.system_time
>     215.82          -10.0%      194.24        stress-ng.time.user_time
>   62102177          -13.9%    53448474        stress-ng.time.voluntary_context_switches
>     810348          -13.9%      698034        stress-ng.vm-segv.ops
>      13505          -13.9%       11633        stress-ng.vm-segv.ops_per_sec

My hunch was that this could be due to the balance interval no longer being
increased when load balance catches pcpu kworkers, but that's not the case:
we would still have LBF_ALL_PINNED, which will still double the balance
interval if no task was moved.

I'm not sure which stat to look at wrt softirqs; this seems to say there
weren't that many more:

>       1.11              -0.3        0.85        mpstat.cpu.all.irq%
>       0.18              -0.0        0.16        mpstat.cpu.all.soft%
>       0.40              -0.1        0.35        mpstat.cpu.all.usr%

But this does:

>      11501 ±  5%     +9.6%       12610 ±  6%  softirqs.CPU12.RCU
>      10678 ±  5%    +16.0%       12383 ±  2%  softirqs.CPU16.RCU
>      10871 ±  4%    +13.1%       12294 ±  2%  softirqs.CPU17.RCU
>      10724 ±  2%    +13.8%       12205 ±  3%  softirqs.CPU18.RCU
>      10810 ±  4%    +16.2%       12560 ±  3%  softirqs.CPU19.RCU
>      10647 ±  6%    +16.2%       12372 ±  6%  softirqs.CPU20.RCU
>      10863 ±  3%    +14.7%       12461 ±  3%  softirqs.CPU21.RCU
>      11231 ±  5%    +14.6%       12873 ±  6%  softirqs.CPU22.RCU
>      11141 ±  6%    +21.0%       13480 ±  8%  softirqs.CPU64.RCU
>      11209 ±  6%    +20.8%       13545 ±  2%  softirqs.CPU65.RCU
>      11108 ±  3%    +20.0%       13334 ±  6%  softirqs.CPU66.RCU
>      11414 ±  9%    +16.9%       13345 ±  6%  softirqs.CPU67.RCU
>      11162 ±  4%    +16.2%       12968 ±  9%  softirqs.CPU68.RCU
>      11035 ±  5%    +13.6%       12533 ±  4%  softirqs.CPU69.RCU
>      11003 ±  5%    +18.9%       13078 ±  8%  softirqs.CPU70.RCU
>      11097 ±  4%    +14.9%       12756 ±  5%  softirqs.CPU71.RCU
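
For reference, the LBF_ALL_PINNED point earlier in this mail can be modelled
roughly as below. This is a toy Python sketch of the interval back-off at the
end of load_balance(), not the kernel code itself: next_balance_interval() is
a hypothetical helper, and the constant values (LBF_ALL_PINNED = 0x01,
MAX_PINNED_INTERVAL = 512) are what I recall from kernel/sched/fair.c, so
treat them as assumptions.

```python
# Toy model of the balance-interval back-off when a load-balance pass
# moves no task. Constants are assumed to mirror kernel/sched/fair.c.
LBF_ALL_PINNED = 0x01
MAX_PINNED_INTERVAL = 512


def next_balance_interval(flags, balance_interval, max_interval, newly_idle=False):
    """Return the sched-domain balance interval after a pass that moved nothing.

    Hypothetical helper for illustration only.
    """
    if newly_idle:
        # Assumption: newidle balancing leaves the interval untouched.
        return balance_interval

    all_pinned = bool(flags & LBF_ALL_PINNED)
    # Double the interval if every candidate task was pinned (up to the
    # pinned cap), or if we are still below the domain's max interval.
    if (all_pinned and balance_interval < MAX_PINNED_INTERVAL) or \
            balance_interval < max_interval:
        return balance_interval * 2
    return balance_interval


if __name__ == "__main__":
    # Already at max_interval: only the all-pinned case keeps backing off.
    print(next_balance_interval(LBF_ALL_PINNED, 64, 64))
    print(next_balance_interval(0, 64, 64))
```

In this model, a domain already at its max interval keeps doubling only while
the all-pinned flag is set, which is why ignoring pcpu kworkers for imbalance
pulls should not by itself stop the back-off.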