Hello,
kernel test robot noticed a -2.9% regression of will-it-scale.per_thread_ops on:
commit: 0ede61d8589cc2d93aa78230d74ac58b5b8d0244 ("file: convert to SLAB_TYPESAFE_BY_RCU")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: will-it-scale
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:
	nr_task: 16
	mode: thread
	test: poll2
	cpufreq_governor: performance
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.s...@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202311201406.2022ca3f-oliver.s...@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231120/202311201406.2022ca3f-oliver.s...@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-12/performance/x86_64-rhel-8.3/thread/16/debian-11.1-x86_64-20220510.cgz/lkp-cpl-4sp2/poll2/will-it-scale
commit:
93faf426e3 ("vfs: shave work on failed file open")
0ede61d858 ("file: convert to SLAB_TYPESAFE_BY_RCU")
93faf426e3cc000c 0ede61d8589cc2d93aa78230d74
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      0.01 ±  9%  +58125.6%       4.17 ±175%  perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     89056            -2.0%      87309        proc-vmstat.nr_slab_unreclaimable
     97958 ±  7%      -9.7%      88449 ±  4%  sched_debug.cpu.avg_idle.stddev
      0.00 ± 12%     +24.2%       0.00 ± 17%  sched_debug.cpu.next_balance.stddev
   6391048            -2.9%    6208584        will-it-scale.16.threads
    399440            -2.9%     388036        will-it-scale.per_thread_ops
   6391048            -2.9%    6208584        will-it-scale.workload
     19.99 ±  4%       -2.2      17.74        perf-profile.calltrace.cycles-pp.fput.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      1.27 ±  5%       +0.8       2.11 ±  3%  perf-profile.calltrace.cycles-pp.__fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
     32.69 ±  4%       +5.0      37.70        perf-profile.calltrace.cycles-pp.__fget_light.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.00            +27.9      27.85        perf-profile.calltrace.cycles-pp.__get_file_rcu.__fget_light.do_poll.do_sys_poll.__x64_sys_poll
     20.00 ±  4%       -2.3      17.75        perf-profile.children.cycles-pp.fput
      0.24 ± 10%       -0.1       0.18 ±  2%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.48 ±  5%       +0.5       1.98 ±  3%  perf-profile.children.cycles-pp.__fdget
     31.85 ±  4%       +6.0      37.86        perf-profile.children.cycles-pp.__fget_light
      0.00            +27.7      27.67        perf-profile.children.cycles-pp.__get_file_rcu
     30.90 ±  4%      -20.6      10.35 ±  2%  perf-profile.self.cycles-pp.__fget_light
     19.94 ±  4%       -2.4      17.53        perf-profile.self.cycles-pp.fput
      9.81 ±  4%       -2.4       7.42 ±  2%  perf-profile.self.cycles-pp.do_poll
      0.23 ± 11%       -0.1       0.17 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.00            +26.5      26.48        perf-profile.self.cycles-pp.__get_file_rcu
 2.146e+10 ±  2%      +8.5%  2.329e+10 ±  2%  perf-stat.i.branch-instructions
      0.22 ± 14%       -0.0       0.19 ± 14%  perf-stat.i.branch-miss-rate%
 1.404e+10 ±  2%      +8.7%  1.526e+10 ±  2%  perf-stat.i.dTLB-stores
     70.87             -2.3      68.59        perf-stat.i.iTLB-load-miss-rate%
   5267608            -5.5%    4979133 ±  2%  perf-stat.i.iTLB-load-misses
   2102507            +5.4%    2215725        perf-stat.i.iTLB-loads
     18791 ±  3%     +10.5%      20757 ±  2%  perf-stat.i.instructions-per-iTLB-miss
    266.67 ±  2%      +6.8%     284.75 ±  2%  perf-stat.i.metric.M/sec
      0.01 ± 10%     -10.5%       0.01 ±  5%  perf-stat.overall.MPKI
      0.19             -0.0       0.17        perf-stat.overall.branch-miss-rate%
      0.65            -3.1%       0.63        perf-stat.overall.cpi
      0.00 ±  4%       -0.0       0.00 ±  4%  perf-stat.overall.dTLB-store-miss-rate%
     71.48             -2.3      69.21        perf-stat.overall.iTLB-load-miss-rate%
     18757           +10.0%      20629        perf-stat.overall.instructions-per-iTLB-miss
      1.54            +3.2%       1.59        perf-stat.overall.ipc
   4795147            +6.4%    5100406        perf-stat.overall.path-length
  2.14e+10 ±  2%      +8.5%  2.322e+10 ±  2%  perf-stat.ps.branch-instructions
   1.4e+10 ±  2%      +8.7%  1.522e+10 ±  2%  perf-stat.ps.dTLB-stores
   5253923            -5.5%    4966218 ±  2%  perf-stat.ps.iTLB-load-misses
   2095770            +5.4%    2208605        perf-stat.ps.iTLB-loads
 3.065e+13            +3.3%  3.167e+13        perf-stat.total.instructions
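
For readers who want to sanity-check the headline figures, here is a minimal Python sketch that recomputes a few of them from the values in the comparison above. The helper name pct_change is hypothetical and not part of any lkp tooling. The reading of perf-stat.overall.path-length as total instructions divided by workload operations is an assumption; it appears consistent with the reported numbers to within rounding.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


# Figures copied from the comparison above: (parent commit, tested commit).
workload = (6391048, 6208584)        # will-it-scale.workload
per_thread = (399440, 388036)        # will-it-scale.per_thread_ops
cpi = (0.65, 0.63)                   # perf-stat.overall.cpi
instructions = (3.065e13, 3.167e13)  # perf-stat.total.instructions
nr_task = 16                         # from the test parameters

# Headline regression: relative change in per-thread throughput.
print(f"per_thread_ops change: {pct_change(*per_thread):+.1f}%")

# per_thread_ops matches workload divided by the number of tasks.
print(f"workload // nr_task:   {workload[0] // nr_task}, {workload[1] // nr_task}")

# IPC is the reciprocal of CPI, so a lower CPI shows up as a higher IPC.
print(f"1 / cpi:               {1 / cpi[0]:.2f}, {1 / cpi[1]:.2f}")

# Assumption: path-length = total instructions / workload operations.
path_length = tuple(i / w for i, w in zip(instructions, workload))
print(f"instructions/workload: {path_length[0]:.0f}, {path_length[1]:.0f}")
```

Read together, these figures suggest the tested commit executes more instructions per poll operation (path-length +6.4%) at a slightly better IPC (+3.2%), which is in line with the net -2.9% throughput change.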
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki