[PATCH v4 0/5] perf report: Show branch type
v4:
---
1. Describe the major changes in the patch description.
   Thanks to Peter Zijlstra for the reminder.

2. Initialize branch type to 0 in intel_pmu_lbr_read_32 and
   intel_pmu_lbr_read_64. Remove the invalid else code in
   intel_pmu_lbr_filter.

v3:
---
1. Move the JCC forward/backward and cross page computing from
   kernel to userspace.

2. Use a lookup table to replace the original switch/case processing.

Changed:
  perf/core: Define the common branch type classification
  perf/x86/intel: Record branch type
  perf report: Show branch type statistics for stdio mode
  perf report: Show branch type in callchain entry

Not changed:
  perf record: Create a new option save_type in --branch-filter

v2:
---
1. Use 4 bits in perf_branch_entry to record branch type.

2. Pull out some common branch types from FAR_BRANCH. Now the branch
   types defined in perf_event.h:

   PERF_BR_NONE       : unknown
   PERF_BR_JCC_FWD    : conditional forward jump
   PERF_BR_JCC_BWD    : conditional backward jump
   PERF_BR_JMP        : jump
   PERF_BR_IND_JMP    : indirect jump
   PERF_BR_CALL       : call
   PERF_BR_IND_CALL   : indirect call
   PERF_BR_RET        : return
   PERF_BR_SYSCALL    : syscall
   PERF_BR_SYSRET     : syscall return
   PERF_BR_IRQ        : hw interrupt/trap/fault
   PERF_BR_INT        : sw interrupt
   PERF_BR_IRET       : return from interrupt
   PERF_BR_FAR_BRANCH : others not generic far branch type

3. Use 2 bits in perf_branch_entry for a "cross" metric that checks
   whether the branch crosses a 4K or 2M area. This is an approximate
   check of whether the branch crosses a 4K page or a 2MB page.

For example:

perf record -g --branch-filter any,save_type

perf report --stdio

  JCC forward: 27.7%
  JCC backward: 9.8%
  JMP: 0.0%
  IND_JMP: 6.5%
  CALL: 26.6%
  IND_CALL: 0.0%
  RET: 29.3%
  IRET: 0.0%
  CROSS_4K: 0.0%
  CROSS_2M: 14.3%

perf report --branch-history --stdio --no-children

-23.60%--main div.c:42 (RET cycles:2)
         compute_flag div.c:28 (RET cycles:2)
         compute_flag div.c:27 (RET CROSS_2M cycles:1)
         rand rand.c:28 (RET CROSS_2M cycles:1)
         rand rand.c:28 (RET cycles:1)
         __random random.c:298 (RET cycles:1)
         __random random.c:297 (JCC forward cycles:1)
         __random random.c:295 (JCC forward cycles:1)
         __random random.c:295 (JCC forward cycles:1)
         __random random.c:295 (JCC forward cycles:1)
         __random random.c:295 (RET cycles:9)

Changed:
  perf/core: Define the common branch type classification
  perf/x86/intel: Record branch type
  perf report: Show branch type statistics for stdio mode
  perf report: Show branch type in callchain entry

Not changed:
  perf record: Create a new option save_type in --branch-filter

v1:
---
It is often useful to know the branch types while analyzing branch
data. For example, a call is very different from a conditional branch.

Currently we have to look the type up in the binary, but the binary
may not be available later, and even when it is available the lookup
takes the user some time. It is very useful for the user to check it
directly in perf report.

Perf already has support for disassembling the branch instruction
to get the branch type.

The patch series records the branch type and shows it together with
other LBR information in the callchain entry via perf report. The
patch series also adds a branch type summary at the end of
perf report --stdio.

To keep kernel and userspace consistent and make the classification
more common, the patch adds the common branch type classification
in perf_event.h.

The common branch types are:

  JCC forward:  Conditional forward jump
  JCC backward: Conditional backward jump
  JMP:          Jump imm
  IND_JMP:      Jump reg/mem
  CALL:         Call imm
  IND_CALL:     Call reg/mem
  RET:          Ret
  FAR_BRANCH:   SYSCALL/SYSRET, IRQ, IRET, TSX Abort

An example:

1. Record branch type (new option "save_type")

perf record -g --branch-filter any,save_type

2. Show the branch type statistics at the end of perf report --stdio

perf report --stdio

  JCC forward: 34.0%
  JCC backward: 3.6%
  JMP: 0.0%
  IND_JMP: 6.5%
  CALL: 26.6%
  IND_CALL: 0.0%
  RET: 29.3%
  FAR_BRANCH: 0.0%

3. Show branch type in callchain entry

perf report --branch-history --stdio --no-children

--23.91%--main div.c:42 (RET cycles:2)
          compute_flag div.c:28 (RET cycles:2)
          compute_flag div.c:27 (RET cycles:1)
          rand rand.c:28 (RET cycles:1)
          rand rand.c:28 (RET cycles:1)
          __random random.c:298 (RET cycles:1)
          __random random.c:297 (JCC forward cycles:1)
          __random random.c:295 (JCC forward cycles:1)
          __random random.c:295 (JCC forward cycles:1)
          __random random.c:295 (JCC forward cycles:1)
          __random random.c:295 (RET cycles:9)

Jin Yao (5
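The v3 note above says the JCC forward/backward and cross-page computing moved to userspace, and the v2 note calls the cross check approximate. A minimal sketch of what such a computation could look like is given below, in Python for brevity rather than the C used by perf. The function names, the constants, and the choice to report CROSS_2M in preference to CROSS_4K are assumptions made for illustration; they are not taken from the patches.

```python
# Illustrative sketch only: names and precedence rules are assumptions,
# not the code from this patch series.

PAGE_4K_SHIFT = 12  # 4K  = 2**12 bytes
AREA_2M_SHIFT = 21  # 2MB = 2**21 bytes


def jcc_direction(from_addr, to_addr):
    """Classify a conditional jump by comparing source and target address."""
    return "JCC forward" if to_addr > from_addr else "JCC backward"


def cross_area(from_addr, to_addr):
    """Report whether source and target fall in different 4K or 2M areas.

    The check only compares the aligned area numbers of the two addresses,
    so it is approximate. Assumed here: a branch that crosses a 2M boundary
    is reported as CROSS_2M only.
    """
    if (from_addr >> AREA_2M_SHIFT) != (to_addr >> AREA_2M_SHIFT):
        return "CROSS_2M"
    if (from_addr >> PAGE_4K_SHIFT) != (to_addr >> PAGE_4K_SHIFT):
        return "CROSS_4K"
    return None


if __name__ == "__main__":
    print(jcc_direction(0x401000, 0x401040))
    print(cross_area(0x401ff8, 0x402000))
```

Comparing shifted addresses keeps the check to two integer operations per branch entry, which matters when a report walks many LBR samples.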
Re: [PATCH v4 0/5] perf report: Show branch type
On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:

SNIP

>
> 3. Use 2 bits in perf_branch_entry for a "cross" metrics checking
>    for branch cross 4K or 2M area. It's an approximate computing
>    for checking if the branch cross 4K page or 2MB page.
>
> For example:
>
> perf record -g --branch-filter any,save_type
>
> perf report --stdio
>
> JCC forward: 27.7%
> JCC backward: 9.8%
> JMP: 0.0%
> IND_JMP: 6.5%
> CALL: 26.6%
> IND_CALL: 0.0%
> RET: 29.3%
> IRET: 0.0%
> CROSS_4K: 0.0%
> CROSS_2M: 14.3%

got mangled perf report --stdio output for:

[root@ibm-x3650m4-02 perf]# ./perf record -j any,save_type kill
kill: not enough arguments
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data (18 samples) ]

[root@ibm-x3650m4-02 perf]# ./perf report --stdio -f | head -30
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 253 of event 'cycles'
# Event count (approx.): 253
#
# Overhead Command Source Shared Object Source Symbol Target SymbolBasic Block Cycles
# ... ... ... ..
#
8.30% perf
Um [kernel.vmlinux] [k] __intel_pmu_enable_all.constprop.17 [k] native_write_msr -
7.91% perf
Um [kernel.vmlinux] [k] intel_pmu_lbr_enable_all [k] __intel_pmu_enable_all.constprop.17 -
7.91% perf
Um [kernel.vmlinux] [k] native_write_msr [k] intel_pmu_lbr_enable_all -
6.32% kill libc-2.24.so [.] _dl_addr [.] _dl_addr -
5.93% perf
Um [kernel.vmlinux] [k] perf_iterate_ctx [k] perf_iterate_ctx -
2.77% kill libc-2.24.so [.] malloc [.] malloc -
1.98% kill libc-2.24.so [.] _int_malloc [.] _int_malloc -
1.58% kill [kernel.vmlinux] [k] __rb_insert_augmented [k] __rb_insert_augmented-
1.58% perf
Um [kernel.vmlinux] [k] perf_event_exec [k] perf_event_exec -
1.19% kill [kernel.vmlinux] [k] anon_vma_interval_tree_insert [k] anon_vma_interval_tree_insert-
1.19% kill [kernel.vmlinux] [k] free_pgd_range [k] free_pgd_range -
1.19% kill [kernel.vmlinux] [k] n_tty_write [k] n_tty_write -
1.19% perf
Um [kernel.vmlinux] [k] native_sched_clock [k] sched_clock -
...

SNIP

jirka
Re: [PATCH v4 0/5] perf report: Show branch type
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
> On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
>
> SNIP
>
> > 3. Use 2 bits in perf_branch_entry for a "cross" metrics checking
> >    for branch cross 4K or 2M area. It's an approximate computing
> >    for checking if the branch cross 4K page or 2MB page.
> >
> > For example:
> >
> > perf record -g --branch-filter any,save_type
> >
> > perf report --stdio
> >
> > JCC forward: 27.7%
> > JCC backward: 9.8%
> > JMP: 0.0%
> > IND_JMP: 6.5%
> > CALL: 26.6%
> > IND_CALL: 0.0%
> > RET: 29.3%
> > IRET: 0.0%
> > CROSS_4K: 0.0%
> > CROSS_2M: 14.3%
>
> got mangled perf report --stdio output for:
>
> [root@ibm-x3650m4-02 perf]# ./perf record -j any,save_type kill
> kill: not enough arguments
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.013 MB perf.data (18 samples) ]
>
> [root@ibm-x3650m4-02 perf]# ./perf report --stdio -f | head -30
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 253 of event 'cycles'
> # Event count (approx.): 253
> #
> # Overhead Command Source Shared Object Source Symbol Target SymbolBasic Block Cycles
> # ... ... ... ..
> #
> 8.30% perf
> Um [kernel.vmlinux] [k] __intel_pmu_enable_all.constprop.17 [k] native_write_msr -
> 7.91% perf
> Um [kernel.vmlinux] [k] intel_pmu_lbr_enable_all [k] __intel_pmu_enable_all.constprop.17 -
> 7.91% perf
> Um [kernel.vmlinux] [k] native_write_msr [k] intel_pmu_lbr_enable_all -
> 6.32% kill libc-2.24.so [.] _dl_addr [.] _dl_addr -
> 5.93% perf
> Um [kernel.vmlinux] [k] perf_iterate_ctx [k] perf_iterate_ctx -
> 2.77% kill libc-2.24.so [.] malloc [.] malloc -
> 1.98% kill libc-2.24.so [.] _int_malloc [.] _int_malloc -
> 1.58% kill [kernel.vmlinux] [k] __rb_insert_augmented [k] __rb_insert_augmented-
> 1.58% perf
> Um [kernel.vmlinux] [k] perf_event_exec [k] perf_event_exec -
> 1.19% kill [kernel.vmlinux] [k] anon_vma_interval_tree_insert [k] anon_vma_interval_tree_insert-
> 1.19% kill [kernel.vmlinux] [k] free_pgd_range [k] free_pgd_range -
> 1.19% kill [kernel.vmlinux] [k] n_tty_write [k] n_tty_write -
> 1.19% perf
> Um [kernel.vmlinux] [k] native_sched_clock [k] sched_clock -
> ...
>
> SNIP
>
> jirka

Hi,

Thanks so much for trying this patch.

The branch statistics are printed at the end of perf report --stdio.

For example, on my machine:

root@skl:/tmp# perf record -j any,save_type kill
. . . . . .
For more details see kill(1).

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]

root@skl:/tmp# perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 3 of event 'cycles'
# Event count (approx.): 3
#
# Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles
# ... ..
#
33.33% perf [kernel.vmlinux] [k] __intel_pmu_enable_all[k] native_write_msr 10
33.33% perf [kernel.vmlinux] [k] intel_pmu_lbr_enable_all [k] __intel_pmu_enable_all4
33.33% perf [kernel.vmlinux] [k] native_write_msr [k] intel_pmu_lbr_enable_all -

#
# (Tip: Show current config key-value pairs: perf config --list)
#

#
# Branch Statistics:
#
CROSS_4K: 100.0%

CALL: 33.3%

RET: 66.7%

Thanks
Jin Yao
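The summary percentages in the report above can be reproduced with a simple tally. The sketch below is a hypothetical illustration in Python, not perf's implementation: the `branch_statistics` helper and the per-entry breakdown (one CALL and two RET entries, each crossing a 4K boundary) are assumptions chosen only because they are consistent with the three samples and the percentages reported.

```python
from collections import Counter


def branch_statistics(entries):
    """Return {label: percent} for branch types and cross flags.

    `entries` is a list of (branch_type, cross) tuples, where `cross` is
    "CROSS_4K", "CROSS_2M" or None. Both the function and this input
    shape are hypothetical, used only to illustrate the arithmetic.
    """
    total = len(entries)
    if total == 0:
        return {}
    counts = Counter()
    for branch_type, cross in entries:
        counts[branch_type] += 1
        if cross is not None:
            counts[cross] += 1
    # Each label is reported as a share of all recorded branch entries.
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


# Assumed breakdown consistent with the 3-sample report above.
entries = [("CALL", "CROSS_4K"), ("RET", "CROSS_4K"), ("RET", "CROSS_4K")]
stats = branch_statistics(entries)
for name in ("CROSS_4K", "CALL", "RET"):
    print(f"{name}: {stats[name]}%")
```

Because the cross flags are counted against the same total as the branch types, the cross percentages and the type percentages are independent sums, which is why CROSS_4K can read 100.0% while CALL and RET also add up to 100%.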
Re: [PATCH v4 0/5] perf report: Show branch type
On Wed, Apr 12, 2017 at 08:25:34PM +0800, Jin, Yao wrote:

SNIP

> > # Overhead Command Source Shared Object Source Symbol
> > Target SymbolBasic Block Cycles
> > # ...
> > ...
> > ... ..
> > #
> > 8.30% perf
> > Um [kernel.vmlinux] [k] __intel_pmu_enable_all.constprop.17 [k]
> > native_write_msr -
> > 7.91% perf
> > Um [kernel.vmlinux] [k] intel_pmu_lbr_enable_all [k]
> > __intel_pmu_enable_all.constprop.17 -
> > 7.91% perf
> > Um [kernel.vmlinux] [k] native_write_msr [k]
> > intel_pmu_lbr_enable_all -
> > 6.32% kill libc-2.24.so [.] _dl_addr
> > [.] _dl_addr -
> > 5.93% perf
> > Um [kernel.vmlinux] [k] perf_iterate_ctx [k]
> > perf_iterate_ctx -
> > 2.77% kill libc-2.24.so [.] malloc
> > [.] malloc -
> > 1.98% kill libc-2.24.so [.] _int_malloc
> > [.] _int_malloc -
> > 1.58% kill [kernel.vmlinux] [k] __rb_insert_augmented
> > [k] __rb_insert_augmented-
> > 1.58% perf
> > Um [kernel.vmlinux] [k] perf_event_exec [k]
> > perf_event_exec -
> > 1.19% kill [kernel.vmlinux] [k]
> > anon_vma_interval_tree_insert[k] anon_vma_interval_tree_insert
> > -
> > 1.19% kill [kernel.vmlinux] [k] free_pgd_range
> > [k] free_pgd_range -
> > 1.19% kill [kernel.vmlinux] [k] n_tty_write
> > [k] n_tty_write -
> > 1.19% perf
> > Um [kernel.vmlinux] [k] native_sched_clock [k]
> > sched_clock -
> > ...
> > SNIP
> >
> >
> > jirka
>
> Hi,
>
> Thanks so much for trying this patch.
>
> The branch statistics is printed at the end of perf report --stdio.

yep, but for some reason with your changes the head report
got changed as well, I haven't checked the details yet..

jirka
Re: [PATCH v4 0/5] perf report: Show branch type
On 4/12/2017 10:26 PM, Jiri Olsa wrote:
> On Wed, Apr 12, 2017 at 08:25:34PM +0800, Jin, Yao wrote:
>
> SNIP
>
> > > # Overhead Command Source Shared Object Source Symbol Target SymbolBasic Block Cycles
> > > # ... ... ... ..
> > > #
> > > 8.30% perf
> > > Um [kernel.vmlinux] [k] __intel_pmu_enable_all.constprop.17 [k] native_write_msr -
> > > 7.91% perf
> > > Um [kernel.vmlinux] [k] intel_pmu_lbr_enable_all [k] __intel_pmu_enable_all.constprop.17 -
> > > 7.91% perf
> > > Um [kernel.vmlinux] [k] native_write_msr [k] intel_pmu_lbr_enable_all -
> > > 6.32% kill libc-2.24.so [.] _dl_addr [.] _dl_addr -
> > > 5.93% perf
> > > Um [kernel.vmlinux] [k] perf_iterate_ctx [k] perf_iterate_ctx -
> > > 2.77% kill libc-2.24.so [.] malloc [.] malloc -
> > > 1.98% kill libc-2.24.so [.] _int_malloc [.] _int_malloc -
> > > 1.58% kill [kernel.vmlinux] [k] __rb_insert_augmented [k] __rb_insert_augmented-
> > > 1.58% perf
> > > Um [kernel.vmlinux] [k] perf_event_exec [k] perf_event_exec -
> > > 1.19% kill [kernel.vmlinux] [k] anon_vma_interval_tree_insert [k] anon_vma_interval_tree_insert-
> > > 1.19% kill [kernel.vmlinux] [k] free_pgd_range [k] free_pgd_range -
> > > 1.19% kill [kernel.vmlinux] [k] n_tty_write [k] n_tty_write -
> > > 1.19% perf
> > > Um [kernel.vmlinux] [k] native_sched_clock [k] sched_clock -
> > > ...
> > >
> > > SNIP
> > >
> > > jirka
> >
> > Hi,
> >
> > Thanks so much for trying this patch.
> >
> > The branch statistics is printed at the end of perf report --stdio.
>
> yep, but for some reason with your changes the head report
> got changed as well, I haven't checked the details yet..
>
> jirka

The kill command returns immediately with a missing-parameter error.
Could you try an application which can run for a while?

For example:
perf record -j any,save_type top

Thanks
Jin Yao
Re: [PATCH v4 0/5] perf report: Show branch type
On Wed, Apr 12, 2017 at 11:42:44PM +0800, Jin, Yao wrote:
>
>
> On 4/12/2017 10:26 PM, Jiri Olsa wrote:
> > On Wed, Apr 12, 2017 at 08:25:34PM +0800, Jin, Yao wrote:
> >
> > SNIP
> >
> > > > # Overhead Command Source Shared Object Source Symbol
> > > > Target SymbolBasic Block Cycles
> > > > # ...
> > > > ...
> > > > ... ..
> > > > #
> > > > 8.30% perf
> > > > Um [kernel.vmlinux] [k] __intel_pmu_enable_all.constprop.17 [k]
> > > > native_write_msr -
> > > > 7.91% perf
> > > > Um [kernel.vmlinux] [k] intel_pmu_lbr_enable_all [k]
> > > > __intel_pmu_enable_all.constprop.17 -
> > > > 7.91% perf
> > > > Um [kernel.vmlinux] [k] native_write_msr [k]
> > > > intel_pmu_lbr_enable_all -
> > > > 6.32% kill libc-2.24.so [.] _dl_addr
> > > > [.] _dl_addr -
> > > > 5.93% perf
> > > > Um [kernel.vmlinux] [k] perf_iterate_ctx [k]
> > > > perf_iterate_ctx -
> > > > 2.77% kill libc-2.24.so [.] malloc
> > > > [.] malloc -
> > > > 1.98% kill libc-2.24.so [.] _int_malloc
> > > > [.] _int_malloc -
> > > > 1.58% kill [kernel.vmlinux] [k] __rb_insert_augmented
> > > > [k] __rb_insert_augmented-
> > > > 1.58% perf
> > > > Um [kernel.vmlinux] [k] perf_event_exec [k]
> > > > perf_event_exec -
> > > > 1.19% kill [kernel.vmlinux] [k]
> > > > anon_vma_interval_tree_insert[k] anon_vma_interval_tree_insert
> > > > -
> > > > 1.19% kill [kernel.vmlinux] [k] free_pgd_range
> > > > [k] free_pgd_range -
> > > > 1.19% kill [kernel.vmlinux] [k] n_tty_write
> > > > [k] n_tty_write -
> > > > 1.19% perf
> > > > Um [kernel.vmlinux] [k] native_sched_clock [k]
> > > > sched_clock -
> > > > ...
> > > > SNIP
> > > >
> > > >
> > > > jirka
> > > Hi,
> > >
> > > Thanks so much for trying this patch.
> > >
> > > The branch statistics is printed at the end of perf report --stdio.
> > yep, but for some reason with your changes the head report
> > got changed as well, I haven't checked the details yet..
> >
> > jirka
>
> The kill returns immediately with no parameter error. Could you try an
> application which can run for a while?
>
> For example:
> perf record -j any,save_type top

sure, but it does not change the fact that the report output is broken,
we need to fix it even for the 'kill' record case

jirka
Re: [PATCH v4 0/5] perf report: Show branch type
On 4/12/2017 6:58 PM, Jiri Olsa wrote:
> On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
>
> SNIP
>
> > 3. Use 2 bits in perf_branch_entry for a "cross" metrics checking
> >    for branch cross 4K or 2M area. It's an approximate computing
> >    for checking if the branch cross 4K page or 2MB page.
> >
> > For example:
> >
> > perf record -g --branch-filter any,save_type
> >
> > perf report --stdio
> >
> > JCC forward: 27.7%
> > JCC backward: 9.8%
> > JMP: 0.0%
> > IND_JMP: 6.5%
> > CALL: 26.6%
> > IND_CALL: 0.0%
> > RET: 29.3%
> > IRET: 0.0%
> > CROSS_4K: 0.0%
> > CROSS_2M: 14.3%
>
> got mangled perf report --stdio output for:
>
> [root@ibm-x3650m4-02 perf]# ./perf record -j any,save_type kill
> kill: not enough arguments
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.013 MB perf.data (18 samples) ]
>
> [root@ibm-x3650m4-02 perf]# ./perf report --stdio -f | head -30
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 253 of event 'cycles'
> # Event count (approx.): 253
> #
> # Overhead Command Source Shared Object Source Symbol Target SymbolBasic Block Cycles
> # ... ... ... ..
> #
> 8.30% perf
> Um [kernel.vmlinux] [k] __intel_pmu_enable_all.constprop.17 [k] native_write_msr -
> 7.91% perf
> Um [kernel.vmlinux] [k] intel_pmu_lbr_enable_all [k] __intel_pmu_enable_all.constprop.17 -
> 7.91% perf
> Um [kernel.vmlinux] [k] native_write_msr [k] intel_pmu_lbr_enable_all -
> 6.32% kill libc-2.24.so [.] _dl_addr [.] _dl_addr -
> 5.93% perf
> Um [kernel.vmlinux] [k] perf_iterate_ctx [k] perf_iterate_ctx -
> 2.77% kill libc-2.24.so [.] malloc [.] malloc -
> 1.98% kill libc-2.24.so [.] _int_malloc [.] _int_malloc -
> 1.58% kill [kernel.vmlinux] [k] __rb_insert_augmented [k] __rb_insert_augmented-
> 1.58% perf
> Um [kernel.vmlinux] [k] perf_event_exec [k] perf_event_exec -
> 1.19% kill [kernel.vmlinux] [k] anon_vma_interval_tree_insert [k] anon_vma_interval_tree_insert-
> 1.19% kill [kernel.vmlinux] [k] free_pgd_range [k] free_pgd_range -
> 1.19% kill [kernel.vmlinux] [k] n_tty_write [k] n_tty_write -
> 1.19% perf
> Um [kernel.vmlinux] [k] native_sched_clock [k] sched_clock -
> ...
>
> SNIP
>
> jirka

Sorry, I looked at this issue at midnight in Shanghai and mistook
the above output for a mail formatting issue. Sorry about that.

Now I have rechecked the output, and yes, the perf report output is
mangled. But my patch doesn't touch the associated code.

Anyway, I removed my patches, pulled the latest update from the
perf/core branch and ran tests to check if it's a regression issue.
I tested on both HSW and SKL.

1. On HSW.

root@hsw:/tmp# perf record -j any kill
.. /* SNIP */
For more details see kill(1).

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.014 MB perf.data (9 samples) ]

root@hsw:/tmp# perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 144 of event 'cycles'
# Event count (approx.): 144
#
# Overhead Command Source Shared Object Source SymbolTarget SymbolBasic Block Cycles
# ... ... ... ..
#
10.42% kill libc-2.23.so [.] read_alias_file [.] read_alias_file -
9.72% kill [kernel.vmlinux] [k] update_load_avg [k] update_load_avg -
9.03% perf
Um [unknown] [k] [k] -
8.33% kill libc-2.23.so [.] _int_malloc [.] _int_malloc -
.. /* SNIP */
0.69% kill [kernel.vmlinux] [k] _raw_spin_lock [k] unmap_page_range -
0.69% perf
Um [kernel.vmlinux] [k] __intel_pmu_enable_all [k] native_write_msr -
0.69% perf
Um [kernel.vmlinux] [k] intel_pmu_lbr_enable_
Re: [PATCH v4 0/5] perf report: Show branch type
On 4/13/2017 10:00 AM, Jin, Yao wrote:
> On 4/12/2017 6:58 PM, Jiri Olsa wrote:
> > On Wed, Apr 12, 2017 at 06:21:01AM +0800, Jin Yao wrote:
> >
> > SNIP
> >
> > > 3. Use 2 bits in perf_branch_entry for a "cross" metrics checking
> > >    for branch cross 4K or 2M area. It's an approximate computing
> > >    for checking if the branch cross 4K page or 2MB page.
> > >
> > > For example:
> > >
> > > perf record -g --branch-filter any,save_type
> > >
> > > perf report --stdio
> > >
> > > JCC forward: 27.7%
> > > JCC backward: 9.8%
> > > JMP: 0.0%
> > > IND_JMP: 6.5%
> > > CALL: 26.6%
> > > IND_CALL: 0.0%
> > > RET: 29.3%
> > > IRET: 0.0%
> > > CROSS_4K: 0.0%
> > > CROSS_2M: 14.3%
> >
> > got mangled perf report --stdio output for:
> >
> > [root@ibm-x3650m4-02 perf]# ./perf record -j any,save_type kill
> > kill: not enough arguments
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.013 MB perf.data (18 samples) ]
> >
> > [root@ibm-x3650m4-02 perf]# ./perf report --stdio -f | head -30
> > # To display the perf.data header info, please use --header/--header-only options.
> > #
> > #
> > # Total Lost Samples: 0
> > #
> > # Samples: 253 of event 'cycles'
> > # Event count (approx.): 253
> > #
> > # Overhead Command Source Shared Object Source SymbolTarget SymbolBasic Block Cycles
> > # ... ... ... ..
> > #
> > 8.30% perf
> > Um [kernel.vmlinux] [k] __intel_pmu_enable_all.constprop.17 [k] native_write_msr -
> > 7.91% perf
> > Um [kernel.vmlinux] [k] intel_pmu_lbr_enable_all [k] __intel_pmu_enable_all.constprop.17 -
> > 7.91% perf
> > Um [kernel.vmlinux] [k] native_write_msr [k] intel_pmu_lbr_enable_all -
> > 6.32% kill libc-2.24.so [.] _dl_addr [.] _dl_addr -
> > 5.93% perf
> > Um [kernel.vmlinux] [k] perf_iterate_ctx [k] perf_iterate_ctx -
> > 2.77% kill libc-2.24.so [.] malloc [.] malloc -
> > 1.98% kill libc-2.24.so [.] _int_malloc [.] _int_malloc -
> > 1.58% kill [kernel.vmlinux] [k] __rb_insert_augmented[k] __rb_insert_augmented-
> > 1.58% perf
> > Um [kernel.vmlinux] [k] perf_event_exec [k] perf_event_exec -
> > 1.19% kill [kernel.vmlinux] [k] anon_vma_interval_tree_insert[k] anon_vma_interval_tree_insert-
> > 1.19% kill [kernel.vmlinux] [k] free_pgd_range [k] free_pgd_range -
> > 1.19% kill [kernel.vmlinux] [k] n_tty_write [k] n_tty_write -
> > 1.19% perf
> > Um [kernel.vmlinux] [k] native_sched_clock [k] sched_clock -
> > ...
> >
> > SNIP
> >
> > jirka
>
> Sorry, I look at this issue at midnight in Shanghai. I misunderstood
> that the above output was only a mail format issue. Sorry about that.
>
> Now I recheck the output, and yes, the perf report output is mangled.
> But my patch doesn't touch the associated code.
>
> Anyway I remove my patches, pull the latest update from perf/core
> branch and run tests to check if its a regression issue. I test on
> HSW and SKL both.
>
> 1. On HSW.
>
> root@hsw:/tmp# perf record -j any kill
> .. /* SNIP */
> For more details see kill(1).
>
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.014 MB perf.data (9 samples) ]
>
> root@hsw:/tmp# perf report --stdio
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 144 of event 'cycles'
> # Event count (approx.): 144
> #
> # Overhead Command Source Shared Object Source SymbolTarget SymbolBasic Block Cycles
> # ... ... ... ..
> #
> 10.42% kill libc-2.23.so [.] read_alias_file [.] read_alias_file -
> 9.72% kill [kernel.vmlinux] [k] update_load_avg [k] update_load_avg -
> 9.03% perf
> Um [unknown] [k] [k] -
> 8.33% kill libc-2.23.so [.] _int_malloc [.] _int_malloc -
> .. /* SNIP */
> 0.69% kill [kernel.vmlinux] [k] _raw_spin_lock [k] unmap_page_range -
> 0.69% perf
> Um [kernel.vmlinux] [k] __intel_pmu_enable_all [k] native_write_msr -
> 0.69%
Re: [PATCH v4 0/5] perf report: Show branch type
On Thu, Apr 13, 2017 at 11:25:39AM +0800, Jin, Yao wrote:

SNIP

> >
> > Now it works without my patch and it runs with latest perf/core branch.
> > So it looks like a regression issue.
> >
> > Thanks
> > Jin Yao
> >
> >
>
> I have tested, the regression issue is happened after this commit:
>
> bdd97ca perf tools: Refactor the code to strip command name with {l,r}trim()
>
> CC to the author for double checking.

cool, thanks

jirka