On Sat, Jul 18, 2015 at 08:24:45AM -0700, Andi Kleen wrote:
> [v2: Addressed review comments. Fixed display problems and 
> correctly compute IPC now. See patches for detailed changes.]
> [v3: Merged with current Arnaldo perf/core and added acked-by.]
> 
> [Note the respective kernel patches to report cycles are in
> peterz's perf/core queue, but so far not in tip. The patchkit
> can be tested however with the "fake cycles" debug patch added at
> the end]
> 
> The upcoming Skylake CPU has a new timed branch stack feature,
> that reports cycle counts for individual branches in the
> last branch record.
> 
> This makes it possible to get fine-grained cost information for code and
> to compute fine-grained IPC.

Thanks, applied.

- Arnaldo
 
> Available from
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git 
> perf/skl-tools3
> 
> This patchkit adds support for this in the perf tools:
> - Basic support for the cycles field like other branch fields
> - Show cycles in the standard branch sort view (no IPC here,
>   as IPC needs the instruction counts from annotation)
> - Annotate cycles and IPC in the assembler annotate view
> - Add branch support to top, so we can do live annotation.
> - Misc support, like dumping it in perf report -D
> 
> Example output for annotate (with made up numbers):
>     
> The second column is the IPC and third average cycles for the basic block.
> 
>                    │    static int hex(char ch)
>                    │    {
>         0.12       │      push   %rbp
>         0.12       │      mov    %rsp,%rbp
>         0.12       │      sub    $0x20,%rsp
>         0.12       │      mov    %edi,%eax
>         0.12       │      mov    %al,-0x14(%rbp)
>         0.12       │      mov    %fs:0x28,%rax
>         0.12       │      mov    %rax,-0x8(%rbp)
>         0.12       │      xor    %eax,%eax
>                    │            if ((ch >= '0') && (ch <= '9'))
>         0.12       │      cmpb   $0x2f,-0x14(%rbp)
>  66.67  0.12   123 │    ↓ jle    31
>         0.12       │      cmpb   $0x39,-0x14(%rbp)
>         0.12   123 │    ↓ jg     31
>                    │                    return ch - '0';
>  22.22  0.12       │      movsbl -0x14(%rbp),%eax
>         0.12       │      sub    $0x30,%eax
>         0.12   123 │    ↓ jmp    60
>                    │            if ((ch >= 'a') && (ch <= 'f'))
>         0.06       │31:   cmpb   $0x60,-0x14(%rbp)
>         0.06   123 │    ↓ jle    46
>         0.06       │      cmpb   $0x66,-0x14(%rbp)
>         0.06       │    ↓ jg     46
>                    │                    return ch - 'a' + 10;
>         0.06       │      movsbl -0x14(%rbp),%eax
>     
> 
> Example output for branch view (again with fake data):
> 
> Overhead  Command  Source Shared Object  Source Symbol          Target Symbol          Basic Block Cycles
>   30.08%  tcall    tcall                 [.] f1                 [.] f2                 123
>   27.44%  tcall    tcall                 [.] f2                 [.] f1                 123
>   15.60%  tcall    tcall                 [.] main               [.] f1                 123
>   12.96%  tcall    tcall                 [.] f1                 [.] main               123
>   12.86%  tcall    tcall                 [.] main               [.] main               123
>    0.08%  tcall    [kernel.kallsyms]     [k] hrtimer_interrupt  [k] hrtimer_interrupt  123
> 
> IPC computation has a few limitations (see the comments in the respective
> patches); in particular, it punts on overlapping basic blocks.
> 
> The annotation only works for the interactive annotation. Currently it is not
> working in the scripted perf annotate, as that is missing a lot of the
> infrastructure needed for per instruction state.
> 
> It would be nice to add column headers to annotate.
> 
> So far no support in --branch-history or in perf script.
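
For readers following the IPC columns described in the quoted cover letter,
here is a minimal sketch of the arithmetic, assuming IPC for a basic block is
the block's instruction count (known from annotation) divided by the average
cycle count reported for that block. The struct and function names are
hypothetical and are not taken from the patches; the real code also has to
deal with the overlapping-block limitation mentioned above, which this sketch
ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-basic-block accumulator; not the perf tools' struct. */
struct block_cycles {
	uint64_t cycles;	/* sum of cycle counts reported for the block */
	uint64_t samples;	/* number of branch records covering the block */
	unsigned int insns;	/* instructions in the block, from annotation */
};

/* Average cycles per pass through the block; 0.0 if never sampled. */
static double block_avg_cycles(const struct block_cycles *b)
{
	assert(b != NULL);
	if (b->samples == 0)
		return 0.0;
	return (double)b->cycles / (double)b->samples;
}

/* Assumed definition: IPC = instructions / average cycles; 0.0 if undefined. */
static double block_ipc(const struct block_cycles *b)
{
	double avg = block_avg_cycles(b);

	if (avg == 0.0)
		return 0.0;
	return (double)b->insns / avg;
}
```

This also illustrates why the branch sort view cannot show IPC on its own: the
cycle counts come from the branch records, but the instruction count is only
available once the code has been disassembled for annotation.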