Jeff Layton <jeff.lay...@primarydata.com> writes: > On Fri, 29 Jan 2016 09:32:19 +0800 > kernel test robot <ying.hu...@linux.intel.com> wrote: > >> FYI, we noticed the below changes on >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master >> commit 7f3697e24dc3820b10f445a4a7d914fc356012d1 ("locks: fix unlock when >> fcntl_setlk races with a close") >> >> >> ========================================================================================= >> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase: >> >> gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/lkp-snb01/lock1/will-it-scale >> >> commit: >> 9189922675ecca0fab38931d86b676e9d79602dc >> 7f3697e24dc3820b10f445a4a7d914fc356012d1 >> >> 9189922675ecca0f 7f3697e24dc3820b10f445a4a7 >> ---------------- -------------------------- >> %stddev %change %stddev >> \ | \ >> 2376432 ± 0% +2.1% 2427484 ± 0% will-it-scale.per_process_ops >> 807889 ± 0% +35.1% 1091496 ± 4% will-it-scale.per_thread_ops >> 22.08 ± 2% +89.1% 41.75 ± 5% will-it-scale.time.user_time >> 1238371 ± 14% +100.4% 2481345 ± 39% cpuidle.C1E-SNB.time >> 3098 ± 57% -66.6% 1035 ±171% numa-numastat.node1.other_node >> 379.25 ± 8% -21.4% 298.00 ± 12% >> numa-vmstat.node0.nr_alloc_batch >> 22.08 ± 2% +89.1% 41.75 ± 5% time.user_time >> 1795 ± 4% +7.5% 1930 ± 2% vmstat.system.cs >> 0.54 ± 5% +136.9% 1.28 ± 10% >> perf-profile.cycles.___might_sleep.__might_sleep.kmem_cache_alloc.locks_alloc_lock.__posix_lock_file >> 1.65 ± 57% +245.2% 5.70 ± 29% >> perf-profile.cycles.__fdget_raw.sys_fcntl.entry_SYSCALL_64_fastpath >> 1.58 ± 59% +248.3% 5.50 ± 31% >> perf-profile.cycles.__fget.__fget_light.__fdget_raw.sys_fcntl.entry_SYSCALL_64_fastpath >> 1.62 ± 58% +246.3% 5.63 ± 30% >> perf-profile.cycles.__fget_light.__fdget_raw.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.00 ± -1% +Inf% 5.88 ± 11% >> perf-profile.cycles.__memset.locks_alloc_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait >> 2.50 ± 2% -100.0% 0.00 ± -1% >> perf-profile.cycles.__memset.locks_alloc_lock.__posix_lock_file.vfs_lock_file.fcntl_setlk >> 1.29 ± 4% +138.8% 3.09 ± 11% >> perf-profile.cycles.__memset.locks_alloc_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.47 ± 9% +144.4% 1.16 ± 11% >> perf-profile.cycles.__might_fault.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.37 ± 12% +140.3% 0.90 ± 9% >> perf-profile.cycles.__might_sleep.__might_fault.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.86 ± 6% +137.7% 2.05 ± 10% >> perf-profile.cycles.__might_sleep.kmem_cache_alloc.locks_alloc_lock.__posix_lock_file.vfs_lock_file >> 0.61 ± 14% +56.8% 0.95 ± 14% >> perf-profile.cycles.__might_sleep.kmem_cache_alloc.locks_alloc_lock.fcntl_setlk.sys_fcntl >> 0.00 ± -1% +Inf% 39.84 ± 12% >> perf-profile.cycles.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk.sys_fcntl >> 16.44 ± 3% -100.0% 0.00 ± -1% >> perf-profile.cycles.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.00 ± -1% +Inf% 1.77 ± 11% >> perf-profile.cycles._raw_spin_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk >> 59.34 ± 1% -72.4% 16.36 ± 33% >> perf-profile.cycles._raw_spin_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.46 ± 11% +144.9% 1.13 ± 19% >> perf-profile.cycles.avc_has_perm.inode_has_perm.file_has_perm.selinux_file_fcntl.security_file_fcntl >> 0.87 ± 6% +103.2% 1.77 ± 12% >> perf-profile.cycles.avc_has_perm.inode_has_perm.file_has_perm.selinux_file_lock.security_file_lock >> 0.81 ± 4% +135.7% 1.90 ± 10% >> perf-profile.cycles.copy_user_generic_string.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.00 ± -1% +Inf% 41.86 ± 12% >> perf-profile.cycles.do_lock_file_wait.part.29.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.88 ± 6% +127.8% 2.00 ± 9% >> perf-profile.cycles.entry_SYSCALL_64 >> 0.86 ± 4% +122.6% 1.92 ± 12% >> perf-profile.cycles.entry_SYSCALL_64_after_swapgs >> 84.98 ± 0% -9.1% 77.20 ± 2% >> perf-profile.cycles.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.76 ± 10% +142.1% 1.84 ± 14% >> perf-profile.cycles.file_has_perm.selinux_file_fcntl.security_file_fcntl.sys_fcntl.entry_SYSCALL_64_fastpath >> 1.35 ± 4% +106.3% 2.78 ± 11% >> perf-profile.cycles.file_has_perm.selinux_file_lock.security_file_lock.fcntl_setlk.sys_fcntl >> 0.00 ± -1% +Inf% 0.89 ± 12% >> perf-profile.cycles.flock_to_posix_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 6.90 ± 4% -48.6% 3.55 ± 27% >> perf-profile.cycles.fput.entry_SYSCALL_64_fastpath >> 0.51 ± 10% +140.5% 1.23 ± 16% >> perf-profile.cycles.inode_has_perm.isra.31.file_has_perm.selinux_file_fcntl.security_file_fcntl.sys_fcntl >> 0.98 ± 4% +97.7% 1.93 ± 11% >> perf-profile.cycles.inode_has_perm.isra.31.file_has_perm.selinux_file_lock.security_file_lock.fcntl_setlk >> 0.00 ± -1% +Inf% 6.56 ± 10% >> perf-profile.cycles.kmem_cache_alloc.locks_alloc_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait >> 2.75 ± 4% -100.0% 0.00 ± -1% >> perf-profile.cycles.kmem_cache_alloc.locks_alloc_lock.__posix_lock_file.vfs_lock_file.fcntl_setlk >> 1.53 ± 7% +119.7% 3.37 ± 13% >> perf-profile.cycles.kmem_cache_alloc.locks_alloc_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.00 ± -1% +Inf% 1.79 ± 11% >> perf-profile.cycles.kmem_cache_free.locks_free_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait >> 0.46 ± 14% +257.0% 1.66 ± 11% >> perf-profile.cycles.kmem_cache_free.locks_free_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.40 ± 7% +158.6% 1.05 ± 17% >> perf-profile.cycles.kmem_cache_free.locks_free_lock.locks_dispose_list.__posix_lock_file.vfs_lock_file >> 0.00 ± -1% +Inf% 0.96 ± 10% >> perf-profile.cycles.lg_local_lock.locks_insert_lock_ctx.__posix_lock_file.vfs_lock_file.do_lock_file_wait >> 0.00 ± -1% +Inf% 14.69 ± 10% >> perf-profile.cycles.locks_alloc_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk >> 6.38 ± 3% -100.0% 0.00 ± -1% >> perf-profile.cycles.locks_alloc_lock.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl >> 3.28 ± 6% +127.1% 7.45 ± 12% >> perf-profile.cycles.locks_alloc_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.00 ± -1% +Inf% 9.75 ± 13% >> perf-profile.cycles.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk >> 3.61 ± 1% -100.0% 0.00 ± -1% >> perf-profile.cycles.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl >> 0.00 ± -1% +Inf% 1.84 ± 11% >> perf-profile.cycles.locks_dispose_list.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk >> 0.00 ± -1% +Inf% 2.42 ± 10% >> perf-profile.cycles.locks_free_lock.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk >> 1.00 ± 3% -100.0% 0.00 ± -1% >> perf-profile.cycles.locks_free_lock.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl >> 0.63 ± 11% +224.1% 2.05 ± 10% >> perf-profile.cycles.locks_free_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.00 ± -1% +Inf% 1.22 ± 14% >> perf-profile.cycles.locks_free_lock.locks_dispose_list.__posix_lock_file.vfs_lock_file.do_lock_file_wait >> 0.00 ± -1% +Inf% 6.17 ± 15% >> perf-profile.cycles.locks_insert_lock_ctx.__posix_lock_file.vfs_lock_file.do_lock_file_wait.fcntl_setlk >> 2.31 ± 6% -100.0% 0.00 ± -1% >> perf-profile.cycles.locks_insert_lock_ctx.__posix_lock_file.vfs_lock_file.fcntl_setlk.sys_fcntl >> 0.00 ± -1% +Inf% 8.96 ± 13% >> perf-profile.cycles.locks_unlink_lock_ctx.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file.do_lock_file_wait >> 3.27 ± 1% -100.0% 0.00 ± -1% >> perf-profile.cycles.locks_unlink_lock_ctx.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file.fcntl_setlk >> 53.88 ± 1% -79.7% 10.92 ± 46% >> perf-profile.cycles.native_queued_spin_lock_slowpath._raw_spin_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 2.75 ± 0% +183.3% 7.79 ± 13% >> perf-profile.cycles.put_pid.locks_unlink_lock_ctx.locks_delete_lock_ctx.__posix_lock_file.vfs_lock_file >> 1.11 ± 9% +137.2% 2.63 ± 14% >> perf-profile.cycles.security_file_fcntl.sys_fcntl.entry_SYSCALL_64_fastpath >> 1.69 ± 4% +118.2% 3.69 ± 11% >> perf-profile.cycles.security_file_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.91 ± 9% +139.0% 2.17 ± 14% >> perf-profile.cycles.selinux_file_fcntl.security_file_fcntl.sys_fcntl.entry_SYSCALL_64_fastpath >> 1.39 ± 4% +114.6% 2.97 ± 10% >> perf-profile.cycles.selinux_file_lock.security_file_lock.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 0.00 ± -1% +Inf% 41.12 ± 12% >> perf-profile.cycles.vfs_lock_file.do_lock_file_wait.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 17.04 ± 3% -100.0% 0.00 ± -1% >> perf-profile.cycles.vfs_lock_file.fcntl_setlk.sys_fcntl.entry_SYSCALL_64_fastpath >> 34.75 ±148% +132.4% 80.75 ± 82% sched_debug.cfs_rq:/.load.8 >> 15.00 ± 9% +198.3% 44.75 ± 72% >> sched_debug.cfs_rq:/.load_avg.21 >> 25.00 ± 29% +574.0% 168.50 ± 78% sched_debug.cfs_rq:/.load_avg.9 >> 38.47 ± 5% +29.1% 49.65 ± 26% >> sched_debug.cfs_rq:/.load_avg.avg >> 63.17 ± 10% +44.3% 91.16 ± 36% >> sched_debug.cfs_rq:/.load_avg.stddev >> 893865 ± 12% -12.5% 782455 ± 0% >> sched_debug.cfs_rq:/.min_vruntime.25 >> 18.25 ± 26% +52.1% 27.75 ± 25% >> sched_debug.cfs_rq:/.runnable_load_avg.9 >> -57635 ±-68% -196.4% 55548 ±130% sched_debug.cfs_rq:/.spread0.1 >> -802264 ±-25% -29.5% -565458 ±-49% sched_debug.cfs_rq:/.spread0.8 >> -804662 ±-25% -29.4% -567811 ±-48% >> sched_debug.cfs_rq:/.spread0.min >> 1233 ± 5% +30.9% 1614 ± 28% >> sched_debug.cfs_rq:/.tg_load_avg.0 >> 1233 ± 5% +30.9% 1614 ± 28% >> sched_debug.cfs_rq:/.tg_load_avg.1 >> 1228 ± 5% +30.3% 1601 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.10 >> 1228 ± 5% +30.4% 1601 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.11 >> 1228 ± 5% +30.3% 1601 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.12 >> 1229 ± 5% +30.0% 1598 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.13 >> 1228 ± 5% +30.1% 1598 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.14 >> 1229 ± 5% +30.0% 1598 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.15 >> 1226 ± 5% +30.3% 1598 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.16 >> 1226 ± 5% +30.2% 1597 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.17 >> 1227 ± 5% +30.1% 1595 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.18 >> 1227 ± 5% +29.4% 1588 ± 26% >> sched_debug.cfs_rq:/.tg_load_avg.19 >> 1233 ± 5% +30.4% 1609 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.2 >> 1222 ± 5% +29.9% 1587 ± 26% >> sched_debug.cfs_rq:/.tg_load_avg.20 >> 1223 ± 5% +24.2% 1519 ± 20% >> sched_debug.cfs_rq:/.tg_load_avg.21 >> 1223 ± 5% +23.8% 1515 ± 20% >> sched_debug.cfs_rq:/.tg_load_avg.22 >> 1223 ± 5% +23.9% 1515 ± 20% >> sched_debug.cfs_rq:/.tg_load_avg.23 >> 1223 ± 5% +23.9% 1515 ± 20% >> sched_debug.cfs_rq:/.tg_load_avg.24 >> 1223 ± 5% +23.5% 1511 ± 19% >> sched_debug.cfs_rq:/.tg_load_avg.25 >> 1224 ± 5% +23.5% 1512 ± 19% >> sched_debug.cfs_rq:/.tg_load_avg.26 >> 1223 ± 5% +23.1% 1506 ± 19% >> sched_debug.cfs_rq:/.tg_load_avg.27 >> 1223 ± 5% +22.5% 1499 ± 19% >> sched_debug.cfs_rq:/.tg_load_avg.28 >> 1224 ± 5% +22.5% 1499 ± 19% >> sched_debug.cfs_rq:/.tg_load_avg.29 >> 1233 ± 5% +30.3% 1607 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.3 >> 1223 ± 5% +22.2% 1495 ± 18% >> sched_debug.cfs_rq:/.tg_load_avg.30 >> 1224 ± 5% +22.0% 1493 ± 19% >> sched_debug.cfs_rq:/.tg_load_avg.31 >> 1234 ± 5% +30.0% 1604 ± 28% >> sched_debug.cfs_rq:/.tg_load_avg.4 >> 1233 ± 5% +30.0% 1604 ± 28% >> sched_debug.cfs_rq:/.tg_load_avg.5 >> 1231 ± 5% +30.3% 1604 ± 28% >> sched_debug.cfs_rq:/.tg_load_avg.6 >> 1233 ± 5% +30.0% 1603 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.7 >> 1231 ± 5% +30.1% 1601 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.8 >> 1228 ± 5% +30.3% 1601 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.9 >> 1228 ± 5% +27.8% 1569 ± 24% >> sched_debug.cfs_rq:/.tg_load_avg.avg >> 1246 ± 5% +30.7% 1628 ± 27% >> sched_debug.cfs_rq:/.tg_load_avg.max >> 1212 ± 5% +22.2% 1481 ± 19% >> sched_debug.cfs_rq:/.tg_load_avg.min >> 15.00 ± 9% +198.3% 44.75 ± 72% >> sched_debug.cfs_rq:/.tg_load_avg_contrib.21 >> 25.00 ± 29% +574.0% 168.50 ± 78% >> sched_debug.cfs_rq:/.tg_load_avg_contrib.9 >> 38.53 ± 5% +29.0% 49.71 ± 26% >> sched_debug.cfs_rq:/.tg_load_avg_contrib.avg >> 63.34 ± 10% +44.1% 91.30 ± 36% >> sched_debug.cfs_rq:/.tg_load_avg_contrib.stddev >> 532.25 ± 2% +8.5% 577.50 ± 6% >> sched_debug.cfs_rq:/.util_avg.15 >> 210.75 ± 14% -14.4% 180.50 ± 4% >> sched_debug.cfs_rq:/.util_avg.29 >> 450.00 ± 22% +50.7% 678.00 ± 18% sched_debug.cfs_rq:/.util_avg.9 >> 955572 ± 4% -10.2% 857813 ± 5% sched_debug.cpu.avg_idle.6 >> 23.99 ± 60% -76.2% 5.71 ± 24% sched_debug.cpu.clock.stddev >> 23.99 ± 60% -76.2% 5.71 ± 24% >> sched_debug.cpu.clock_task.stddev >> 2840 ± 37% -47.4% 1492 ± 65% sched_debug.cpu.curr->pid.25 >> 34.75 ±148% +132.4% 80.75 ± 82% sched_debug.cpu.load.8 >> 61776 ± 7% -7.1% 57380 ± 0% >> sched_debug.cpu.nr_load_updates.25 >> 6543 ± 2% +20.4% 7879 ± 9% sched_debug.cpu.nr_switches.0 >> 5256 ± 23% +177.1% 14566 ± 52% sched_debug.cpu.nr_switches.27 >> 7915 ± 3% +8.7% 8605 ± 3% sched_debug.cpu.nr_switches.avg >> -0.25 ±-519% +1900.0% -5.00 ±-24% >> sched_debug.cpu.nr_uninterruptible.12 >> 2.00 ± 93% -125.0% -0.50 ±-300% >> sched_debug.cpu.nr_uninterruptible.24 >> 17468 ± 14% +194.3% 51413 ± 75% sched_debug.cpu.sched_count.15 >> 2112 ± 2% +20.8% 2552 ± 11% sched_debug.cpu.sched_goidle.0 >> 2103 ± 34% +219.0% 6709 ± 55% sched_debug.cpu.sched_goidle.27 >> 3159 ± 3% +8.2% 3418 ± 4% >> sched_debug.cpu.sched_goidle.avg >> 1323 ± 64% -72.7% 361.50 ± 15% sched_debug.cpu.ttwu_count.23 >> 3264 ± 12% +94.4% 6347 ± 41% sched_debug.cpu.ttwu_count.27 >> 3860 ± 3% +9.0% 4208 ± 3% sched_debug.cpu.ttwu_count.avg >> 2358 ± 3% +28.7% 3035 ± 9% sched_debug.cpu.ttwu_local.0 >> 1110 ± 22% +54.6% 1716 ± 28% sched_debug.cpu.ttwu_local.27 >> 1814 ± 8% +16.1% 2106 ± 5% >> sched_debug.cpu.ttwu_local.stddev >> >> >> lkp-snb01: Sandy Bridge-EP >> Memory: 32G >> >> will-it-scale.per_thread_ops >> >> 1.2e+06 ++---------------------------------------------------------------+ >> | O | >> 1.15e+06 O+O O O O O O O O | >> 1.1e+06 ++ | >> | O O O O O OO | >> 1.05e+06 ++ O O | >> 1e+06 ++ | >> | | >> 950000 ++ | >> 900000 ++ | >> | | >> 850000 ++ | >> 800000 *+*.*.*.*.*.*.*.*.*.*.*.*. .*.*. *.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.* >> | * * | >> 750000 ++---------------------------------------------------------------+ >> >> >> will-it-scale.time.user_time >> >> 50 ++---------------------------------------------------------------------+ >> | | >> 45 ++ O O O O O O | >> O O O O | >> | O O O O O | >> 40 ++ O O O O | >> | | >> 35 ++ | >> | | >> 30 ++ | >> | | >> | * | >> 25 ++ + + | >> *.*.*.*..*.* *.*.*..*.*.*.*.*.*.*..*.*.*.*.*.*.*..*.*.*.*.*.*..*.*.*.* >> 20 ++---------------------------------------------------------------------+ >> >> >> [*] bisect-good sample >> [O] bisect-bad sample >> >> To reproduce: >> >> git clone >> git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git >> cd lkp-tests >> bin/lkp install job.yaml # job file is attached in this email >> bin/lkp run job.yaml >> >> >> Disclaimer: >> Results have been estimated based on internal Intel analysis and are provided >> for informational purposes only. Any difference in system hardware or >> software >> design or configuration may affect actual performance. >> >> >> Thanks, >> Ying Huang > > Thanks... > > Huh...I'm stumped on this one. If anything I would have expected better > performance with this patch since we don't even take the file_lock or > do the fcheck in the F_UNLCK codepath now, or when there is an error. > > I'll see if I can reproduce it on my own test rig, but I'd welcome > ideas of where and how this performance regression could have crept in.
This is a performance increase instead of performance regression. Best Regards, Huang, Ying