On Wed, 2015-08-05 at 10:38 +0200, Ingo Molnar wrote:
> * kernel test robot <ying.hu...@intel.com> wrote:
>
> > FYI, we noticed the below changes on
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/asm
> > commit b2c51106c7581866c37ffc77c5d739f3d4b7cbc9 ("x86/build: Fix detection
> > of GCC -mpreferred-stack-boundary support")
>
> Does the performance regression go away reproducibly if you do:
>
>   git revert b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
>
> ?
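(For context, the commit above changes how the kernel build probes GCC for -mpreferred-stack-boundary support. A generic flag probe of that sort can be sketched as below; this is only an illustration, not the kernel's actual cc-option Makefile helper, and the `cc_supports` name is made up for this sketch.)

```shell
#!/bin/sh
# Hedged sketch of a compiler-flag probe, in the spirit of the kernel's
# cc-option helper. The cc_supports helper name is illustrative only.
CC=${CC:-cc}

cc_supports() {
    # Compile an empty program with the candidate flag; exit status 0
    # means the compiler accepted it.
    echo 'int main(void){return 0;}' \
        | "$CC" "$1" -x c -o /dev/null - 2>/dev/null
}

if cc_supports -mpreferred-stack-boundary=3; then
    echo "compiler accepts -mpreferred-stack-boundary=3"
else
    echo "compiler rejects -mpreferred-stack-boundary=3"
fi
```

Whether the probe succeeds depends on the compiler version and target, which is exactly why the build has to test for it rather than pass the flag unconditionally.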
Sorry for the late reply! Reverting the commit restores part of the performance, as shown below.

parent commit: f2a50f8b7da45ff2de93a71393e715a2ab9f3b68
the commit:    b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
revert commit: 987d12601a4a82cc2f2151b1be704723eb84cb9d

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/cpufreq_governor/test:
  wsm/will-it-scale/debian-x86_64-2015-02-07.cgz/x86_64-rhel/gcc-4.9/performance/readseek2

commit:
  f2a50f8b7da45ff2de93a71393e715a2ab9f3b68
  b2c51106c7581866c37ffc77c5d739f3d4b7cbc9
  987d12601a4a82cc2f2151b1be704723eb84cb9d

f2a50f8b7da45ff2 b2c51106c7581866c37ffc77c5 987d12601a4a82cc2f2151b1be
---------------- -------------------------- --------------------------
       %stddev     %change         %stddev     %change         %stddev
           \          |                \          |                \
    879002 ±  0%     -18.1%     720270 ±  7%      -3.6%     847011 ±  2%  will-it-scale.per_process_ops
      0.02 ±  0%     +34.5%       0.02 ±  7%      +5.6%       0.02 ±  2%  will-it-scale.scalability
     11144 ±  0%      +0.1%      11156 ±  0%     +10.6%      12320 ±  0%  will-it-scale.time.minor_page_faults
    769.30 ±  0%      -0.9%     762.15 ±  0%      +1.1%     777.42 ±  0%  will-it-scale.time.system_time
  26153173 ±  0%      +7.0%   27977076 ±  0%      +3.5%   27078124 ±  0%  will-it-scale.time.voluntary_context_switches
      2964 ±  2%      +1.4%       3004 ±  1%     -51.9%       1426 ±  2%  proc-vmstat.pgactivate
      0.06 ± 27%    +154.5%       0.14 ± 44%    +122.7%       0.12 ± 24%  turbostat.CPU%c3
    370683 ±  0%      +6.2%     393491 ±  0%      +2.4%     379575 ±  0%  vmstat.system.cs
     11144 ±  0%      +0.1%      11156 ±  0%     +10.6%      12320 ±  0%  time.minor_page_faults
     15.70 ±  2%     +14.5%      17.98 ±  0%      +1.5%      15.94 ±  1%  time.user_time
    830343 ± 56%     -54.0%     382128 ± 39%     -22.3%     645308 ± 65%  cpuidle.C1E-NHM.time
    788.25 ± 14%     -21.7%     617.25 ± 16%     -12.3%     691.00 ±  3%  cpuidle.C1E-NHM.usage
   2489132 ± 20%     +79.3%    4464147 ± 33%     +78.4%    4440574 ± 21%  cpuidle.C3-NHM.time
   1082762 ±162%    -100.0%       0.00 ± -1%    +189.3%    3132030 ±110%  latency_stats.avg.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
    102189 ±  2%      -2.1%     100087 ±  5%     -32.9%      68568 ±  2%  latency_stats.hits.pipe_wait.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
   1082762 ±162%    -100.0%       0.00 ± -1%    +289.6%    4217977 ±109%  latency_stats.max.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
   1082762 ±162%    -100.0%       0.00 ± -1%    +478.5%    6264061 ±110%  latency_stats.sum.nfs_wait_on_request.nfs_updatepage.nfs_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      5.10 ±  2%      -8.0%       4.69 ±  1%     +13.0%       5.76 ±  1%  perf-profile.cpu-cycles.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency
      2.58 ±  8%     +19.5%       3.09 ±  3%      -1.8%       2.54 ± 11%  perf-profile.cpu-cycles._raw_spin_lock_irqsave.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry
      7.02 ±  3%      +9.2%       7.67 ±  2%      +7.1%       7.52 ±  3%  perf-profile.cpu-cycles._raw_spin_lock_irqsave.prepare_to_wait_exclusive.__wait_on_bit_lock.__lock_page.find_lock_entry
      3.07 ±  2%     +14.8%       3.53 ±  3%      -1.4%       3.03 ±  5%  perf-profile.cpu-cycles.finish_wait.__wait_on_bit_lock.__lock_page.find_lock_entry.shmem_getpage_gfp
      3.05 ±  5%      -8.4%       2.79 ±  4%      -5.2%       2.90 ±  5%  perf-profile.cpu-cycles.hrtimer_start_range_ns.tick_nohz_stop_sched_tick.__tick_nohz_idle_enter.tick_nohz_idle_enter.cpu_startup_entry
      0.89 ±  5%      -7.6%       0.82 ±  3%     +16.3%       1.03 ±  5%  perf-profile.cpu-cycles.is_ftrace_trampoline.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk
      0.98 ±  3%     -25.1%       0.74 ±  7%     -16.8%       0.82 ±  2%  perf-profile.cpu-cycles.is_ftrace_trampoline.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency
      1.58 ±  3%      -5.2%       1.50 ±  2%     +44.2%       2.28 ±  1%  perf-profile.cpu-cycles.is_module_text_address.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk
      1.82 ± 18%     +46.6%       2.67 ±  3%     -32.6%       1.23 ± 56%  perf-profile.cpu-cycles.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.finish_wait.__wait_on_bit_lock.__lock_page
      8.05 ±  3%      +9.5%       8.82 ±  3%      +5.4%       8.49 ±  2%  perf-profile.cpu-cycles.prepare_to_wait_exclusive.__wait_on_bit_lock.__lock_page.find_lock_entry.shmem_getpage_gfp
      1.16 ±  2%      +6.9%       1.25 ±  5%     +11.4%       1.30 ±  5%  perf-profile.cpu-cycles.put_page.shmem_file_read_iter.__vfs_read.vfs_read.sys_read
     11102 ±  1%      +0.0%      11102 ±  1%     -95.8%     468.00 ±  0%  slabinfo.Acpi-ParseExt.active_objs
    198.25 ±  1%      +0.0%     198.25 ±  1%     -93.9%      12.00 ±  0%  slabinfo.Acpi-ParseExt.active_slabs
     11102 ±  1%      +0.0%      11102 ±  1%     -95.8%     468.00 ±  0%  slabinfo.Acpi-ParseExt.num_objs
    198.25 ±  1%      +0.0%     198.25 ±  1%     -93.9%      12.00 ±  0%  slabinfo.Acpi-ParseExt.num_slabs
    341.25 ± 14%      +2.9%     351.00 ± 11%    -100.0%       0.00 ± -1%  slabinfo.blkdev_ioc.active_objs
    341.25 ± 14%      +2.9%     351.00 ± 11%    -100.0%       0.00 ± -1%  slabinfo.blkdev_ioc.num_objs
    438.00 ± 16%      -8.3%     401.50 ± 20%    -100.0%       0.00 ± -1%  slabinfo.file_lock_ctx.active_objs
    438.00 ± 16%      -8.3%     401.50 ± 20%    -100.0%       0.00 ± -1%  slabinfo.file_lock_ctx.num_objs
      4398 ±  1%      +1.4%       4462 ±  0%     -14.5%       3761 ±  2%  slabinfo.ftrace_event_field.active_objs
      4398 ±  1%      +1.4%       4462 ±  0%     -14.5%       3761 ±  2%  slabinfo.ftrace_event_field.num_objs
      3947 ±  2%     +10.6%       4363 ±  3%    +107.1%       8175 ±  2%  slabinfo.kmalloc-192.active_objs
     93.00 ±  2%     +10.8%     103.00 ±  3%    +120.2%     204.75 ±  2%  slabinfo.kmalloc-192.active_slabs
      3947 ±  2%     +10.6%       4363 ±  3%    +118.4%       8620 ±  2%  slabinfo.kmalloc-192.num_objs
     93.00 ±  2%     +10.8%     103.00 ±  3%    +120.2%     204.75 ±  2%  slabinfo.kmalloc-192.num_slabs
      1794 ±  0%      +3.2%       1851 ±  2%     +12.2%       2012 ±  3%  slabinfo.trace_event_file.active_objs
      1794 ±  0%      +3.2%       1851 ±  2%     +12.2%       2012 ±  3%  slabinfo.trace_event_file.num_objs
      7065 ±  7%      -5.4%       6684 ±  8%    -100.0%       0.00 ± -1%  slabinfo.vm_area_struct.active_objs
    160.50 ±  7%      -5.5%     151.75 ±  8%    -100.0%       0.00 ± -1%  slabinfo.vm_area_struct.active_slabs
      7091 ±  7%      -5.6%       6694 ±  8%    -100.0%       0.00 ± -1%  slabinfo.vm_area_struct.num_objs
    160.50 ±  7%      -5.5%     151.75 ±  8%    -100.0%       0.00 ± -1%  slabinfo.vm_area_struct.num_slabs
    857.50 ± 29%     +75.7%       1506 ± 78%    +157.6%       2209 ± 33%  sched_debug.cfs_rq[11]:/.blocked_load_avg
     52.75 ± 29%     -29.4%      37.25 ± 60%    +103.3%     107.25 ± 43%  sched_debug.cfs_rq[11]:/.load
    914.50 ± 29%     +69.9%       1553 ± 77%    +155.6%       2337 ± 32%  sched_debug.cfs_rq[11]:/.tg_load_contrib
      7.75 ± 34%     -64.5%       2.75 ± 64%     -12.9%       6.75 ±115%  sched_debug.cfs_rq[2]:/.nr_spread_over
      1135 ± 20%     -43.6%     640.75 ± 49%     -18.8%     922.50 ± 51%  sched_debug.cfs_rq[3]:/.blocked_load_avg
      1215 ± 21%     -43.1%     691.50 ± 46%     -21.3%     956.25 ± 50%  sched_debug.cfs_rq[3]:/.tg_load_contrib
     38.50 ± 21%    +129.9%      88.50 ± 36%     +96.1%      75.50 ± 56%  sched_debug.cfs_rq[4]:/.load
     26.00 ± 20%     +98.1%      51.50 ± 46%    +142.3%      63.00 ± 53%  sched_debug.cfs_rq[4]:/.runnable_load_avg
    128.25 ± 18%    +227.5%     420.00 ± 43%    +152.4%     323.75 ± 68%  sched_debug.cfs_rq[4]:/.utilization_load_avg
     28320 ± 12%      -6.3%      26545 ± 11%     -19.4%      22813 ± 13%  sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
      1015 ± 78%    +101.1%       2042 ± 25%     +64.4%       1669 ± 73%  sched_debug.cfs_rq[6]:/.blocked_load_avg
      1069 ± 72%    +100.2%       2140 ± 23%     +61.2%       1722 ± 70%  sched_debug.cfs_rq[6]:/.tg_load_contrib
    619.25 ± 12%      -6.3%     580.25 ± 11%     -19.2%     500.25 ± 13%  sched_debug.cfs_rq[6]:/.tg_runnable_contrib
     88.75 ± 14%     -47.3%      46.75 ± 36%     -24.5%      67.00 ± 11%  sched_debug.cfs_rq[9]:/.load
     59.25 ± 23%     -41.4%      34.75 ± 34%      -6.3%      55.50 ± 12%  sched_debug.cfs_rq[9]:/.runnable_load_avg
    315.50 ± 45%     -64.6%     111.67 ±  1%     -12.1%     277.25 ±  3%  sched_debug.cfs_rq[9]:/.utilization_load_avg
   2246758 ±  7%     +87.6%    4213925 ± 65%      -2.2%    2197475 ±  4%  sched_debug.cpu#0.nr_switches
   2249376 ±  7%     +87.4%    4215969 ± 65%      -2.2%    2199216 ±  4%  sched_debug.cpu#0.sched_count
   1121438 ±  7%     +81.0%    2030313 ± 61%      -2.2%    1096479 ±  4%  sched_debug.cpu#0.sched_goidle
   1151160 ±  7%     +86.5%    2146608 ± 64%      -1.9%    1129264 ±  3%  sched_debug.cpu#0.ttwu_count
     33.75 ± 15%     -22.2%      26.25 ±  6%      -8.9%      30.75 ± 10%  sched_debug.cpu#1.cpu_load[3]
     33.25 ± 10%     -18.0%      27.25 ±  7%      -3.8%      32.00 ± 11%  sched_debug.cpu#1.cpu_load[4]
     41.75 ± 29%     +23.4%      51.50 ± 33%     +53.9%      64.25 ± 16%  sched_debug.cpu#10.cpu_load[1]
     40.00 ± 18%     +24.4%      49.75 ± 18%     +49.4%      59.75 ±  8%  sched_debug.cpu#10.cpu_load[2]
     39.25 ± 14%     +22.3%      48.00 ± 10%     +38.9%      54.50 ±  7%  sched_debug.cpu#10.cpu_load[3]
     39.50 ± 15%     +20.3%      47.50 ±  6%     +30.4%      51.50 ±  7%  sched_debug.cpu#10.cpu_load[4]
   5269004 ±  1%     +27.8%    6731790 ± 30%      +1.4%    5342560 ±  2%  sched_debug.cpu#10.nr_switches
   5273193 ±  1%     +27.8%    6736526 ± 30%      +1.4%    5345791 ±  2%  sched_debug.cpu#10.sched_count
   2633974 ±  1%     +27.8%    3365271 ± 30%      +1.4%    2670901 ±  2%  sched_debug.cpu#10.sched_goidle
   2644149 ±  1%     +26.9%    3356318 ± 30%      +1.9%    2693295 ±  1%  sched_debug.cpu#10.ttwu_count
     26.50 ± 37%    +116.0%      57.25 ± 48%    +109.4%      55.50 ± 29%  sched_debug.cpu#11.cpu_load[0]
     30.75 ± 15%     +66.7%      51.25 ± 31%     +65.9%      51.00 ± 21%  sched_debug.cpu#11.cpu_load[1]
     33.50 ± 10%     +37.3%      46.00 ± 22%     +39.6%      46.75 ± 17%  sched_debug.cpu#11.cpu_load[2]
     37.00 ± 11%     +15.5%      42.75 ± 19%     +29.7%      48.00 ± 11%  sched_debug.cpu#11.cpu_load[4]
    508300 ± 11%      -0.6%     505024 ±  1%     +18.1%     600291 ±  7%  sched_debug.cpu#4.avg_idle
    454696 ±  9%      -5.9%     427894 ± 25%     +21.8%     553608 ±  4%  sched_debug.cpu#5.avg_idle
     66.00 ± 27%     +11.0%      73.25 ± 37%     -46.6%      35.25 ± 22%  sched_debug.cpu#6.cpu_load[0]
     62.00 ± 36%     +12.5%      69.75 ± 45%     -41.5%      36.25 ± 11%  sched_debug.cpu#6.cpu_load[1]
    247681 ± 19%     +21.0%     299747 ± 10%     +28.7%     318764 ± 17%  sched_debug.cpu#8.avg_idle
   5116609 ±  4%     +34.5%    6884238 ± 33%     +55.2%    7942254 ± 34%  sched_debug.cpu#9.nr_switches
   5120531 ±  4%     +34.5%    6889156 ± 33%     +55.2%    7945270 ± 34%  sched_debug.cpu#9.sched_count
   2557822 ±  4%     +34.5%    3441428 ± 33%     +55.2%    3970337 ± 34%  sched_debug.cpu#9.sched_goidle
   2565307 ±  4%     +32.9%    3410042 ± 33%     +54.0%    3949696 ± 34%  sched_debug.cpu#9.ttwu_count
      0.00 ±141%  +4.2e+05%       4.76 ±173%     +47.7%       0.00 ±-59671%  sched_debug.rt_rq[10]:/.rt_time
    155259 ±  0%      +0.0%     155259 ±  0%     -42.2%      89723 ±  0%  sched_debug.sysctl_sched.sysctl_sched_features

Best Regards,
Huang, Ying