On Sat, Jul 18, 2015 at 08:24:45AM -0700, Andi Kleen wrote:
> [v2: Addressed review comments. Fixed display problems and
> correctly compute IPC now. See patches for detailed changes.]
> [v3: Merged with current Arnaldo perf/core and added acked-by.]
>
> [Note the respective kernel patches to report cycles are in
> peterz's perf/core queue, but so far not in tip. The patchkit
> can be tested however with the "fake cycles" debug patch added at
> the end]
>
> The upcoming Skylake CPU has a new timed branch stack feature,
> that reports cycle counts for individual branches in the
> last branch record.
>
> This allows to get fine grained cost information for code, and also allows
> to compute fine grained IPC.
Thanks, applied.

- Arnaldo

> Available from
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git
> perf/skl-tools3
>
> This patchkit adds support for this in the perf tools:
> - Basic support for the cycles field like other branch fields
> - Show cycles in the standard branch sort view (no IPC here,
>   as IPC needs the instruction counts from annotation)
> - Annotate cycles and IPC in the assembler annotate view
> - Add branch support to top, so we can do live annotation.
> - Misc support, like dumping it in perf report -D
>
> Example output for annotate (with made up numbers):
>
> The second column is the IPC and third average cycles for the basic block.
>
>                   │    static int hex(char ch)
>                   │    {
>         0.12      │      push   %rbp
>         0.12      │      mov    %rsp,%rbp
>         0.12      │      sub    $0x20,%rsp
>         0.12      │      mov    %edi,%eax
>         0.12      │      mov    %al,-0x14(%rbp)
>         0.12      │      mov    %fs:0x28,%rax
>         0.12      │      mov    %rax,-0x8(%rbp)
>         0.12      │      xor    %eax,%eax
>                   │    if ((ch >= '0') && (ch <= '9'))
>         0.12      │      cmpb   $0x2f,-0x14(%rbp)
>  66.67  0.12  123 │    ↓ jle    31
>         0.12      │      cmpb   $0x39,-0x14(%rbp)
>         0.12  123 │    ↓ jg     31
>                   │    return ch - '0';
>  22.22  0.12      │      movsbl -0x14(%rbp),%eax
>         0.12      │      sub    $0x30,%eax
>         0.12  123 │    ↓ jmp    60
>                   │    if ((ch >= 'a') && (ch <= 'f'))
>         0.06      │31:   cmpb   $0x60,-0x14(%rbp)
>         0.06  123 │    ↓ jle    46
>         0.06      │      cmpb   $0x66,-0x14(%rbp)
>         0.06      │    ↓ jg     46
>                   │    return ch - 'a' + 10;
>         0.06      │      movsbl -0x14(%rbp),%eax
>
>
> Example output for branch view (again with fake data):
>
> Overhead  Command  Source Shared Object  Source Symbol          Target Symbol          Basic Block Cycles
>   30.08%  tcall    tcall                 [.] f1                 [.] f2                 123
>   27.44%  tcall    tcall                 [.] f2                 [.] f1                 123
>   15.60%  tcall    tcall                 [.] main               [.] f1                 123
>   12.96%  tcall    tcall                 [.] f1                 [.] main               123
>   12.86%  tcall    tcall                 [.] main               [.] main               123
>    0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt  [k] hrtimer_interrupt  123
>
> IPC computation has a few limitations (see the comments in the respective
> patches), in particular it punts on overlapping basic blocks.
>
> The annotation only works for the interactive annotation. Currently it is not
> working in the scripted perf annotate, as that is missing a lot of the
> infrastructure needed for per instruction state.
>
> It would be nice to add column headers to annotate.
>
> So far no support in --branch-history or in perf script.
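
For readers following the annotate example quoted above, here is a rough sketch of how a per-basic-block IPC figure can be derived from LBR cycle counts. It is an illustration only, not the code from the patches: the BlockStats class and its fields are hypothetical names, and it assumes IPC is the number of instructions in the block divided by the average cycles observed for it. It also ignores the overlapping-basic-block case that the cover letter says the real implementation punts on.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Accumulated cycle samples for one basic block (hypothetical structure)."""

    n_insns: int            # instructions in the block, known from disassembly
    total_cycles: int = 0   # sum of cycle counts reported for this block
    n_samples: int = 0      # number of branch records that covered this block

    def add_sample(self, cycles: int) -> None:
        """Record one cycle count taken from a branch record."""
        self.total_cycles += cycles
        self.n_samples += 1

    def avg_cycles(self) -> float:
        """Average cycles per execution of the block."""
        if self.n_samples == 0:
            raise ValueError("no cycle samples recorded for this block")
        return self.total_cycles / self.n_samples

    def ipc(self) -> float:
        """Instructions per cycle, assuming IPC = n_insns / average cycles."""
        return self.n_insns / self.avg_cycles()


if __name__ == "__main__":
    # Made-up numbers, in the spirit of the fake data in the quoted output.
    block = BlockStats(n_insns=10)
    block.add_sample(120)
    block.add_sample(126)
    print(f"avg cycles: {block.avg_cycles():.0f}, IPC: {block.ipc():.2f}")
```

The instruction count has to come from disassembly, which would explain why the cover letter shows IPC only in the annotate view and not in the branch sort view.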