On Thu, Oct 15, 2020 at 03:50:33PM +0100, Leo Yan wrote:
> If the memory event doesn't contain a HITM tag (as with Arm SPE), we
> cannot rely on the HITM display to report cache false sharing.
> Alternatively, we can use the LLC access and multi-thread info to locate
> the data address of potential false sharing. If we then connect this
> with the source code and analyze the threads' execution timing, we can
> conclude whether they load and store the same cache line at the same
> time, which is helpful for resolving the cache false sharing issue.
> 
> This patch set enables a display sorted on LLC load accesses; it adds
> dimensions for total LLC hit and LLC load accesses, and these
> dimensions are used for the shared cache line table and pareto.
> 
> This patch set is dependent on the patch set "perf c2c: Refine the
> organization of metrics" [1].
> 
> [1] https://lore.kernel.org/patchwork/cover/1321499/

Ok, that one is applied and will appear publicly as soon as it goes thru
my usual set of build tests.

- Arnaldo
 
> With this patch set, we can get the 'llc' display as follows:
> 
>   # perf c2c report -d llc --coalesce tid,pid,iaddr,dso --stdio
> 
>   [...]
> 
>   =================================================
>              Shared Data Cache Line Table
>   =================================================
>   #
>   #        ----------- Cacheline ----------  LLC Hit   LLC Hit    Total    Total    Total  ---- Stores ----  ----- Core Load Hit -----  - LLC Load Hit --  - RMT Load Hit --  --- Load Dram ----
>   # Index             Address  Node  PA cnt      Pct     Total  records    Loads   Stores    L1Hit   L1Miss       FB       L1       L2    LclHit  LclHitm    RmtHit  RmtHitm       Lcl       Rmt
>   # .....  ..................  ....  ......  .......  ........  .......  .......  .......  .......  .......  .......  .......  .......  ........  .......  ........  .......  ........  ........
>   #
>         0      0x563b01e83100     0    1401   65.32%       648     7011     3738     3273     2582      691      515     2516       59       143      505         0        0         0         0
>         1      0x563b01e830c0     0       1   26.51%       263      400      400        0        0        0      130        3        4       262        1         0        0         0         0
>         2      0x563b01e83080     0       1    7.76%        77      650      650        0        0        0      180      348       45        14       63         0        0         0         0
>         3  0xffff88c3d74e82c0     0       1    0.10%         1        1        1        0        0        0        0        0        0         1        0         0        0         0         0
>         4  0xffffa587c11e38c0   N/A       0    0.10%         1        2        1        1        1        0        0        0        0         1        0         0        0         0         0
>         5  0xffffffffbd5e6fc0     0       1    0.10%         1        1        1        0        0        0        0        0        0         0        1         0        0         0         0
>         6      0x7f90a4d6c2c0     0       1    0.10%         1        1        1        0        0        0        0        0        0         1        0         0        0         0         0
> 
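
As a cross-check of the 'LLC Hit' columns in the table above, here is a minimal Python sketch. It assumes (this is inferred from the rows shown, not taken from the patches) that 'LLC Hit Total' is LclHit + LclHitm per cache line and that 'LLC Hit Pct' is that total divided by the sum over the listed cache lines. The function name llc_hit_table is hypothetical and is not part of perf.

```python
# Rows copied from the table above: (cache line address, LclHit, LclHitm).
rows = [
    ("0x563b01e83100", 143, 505),
    ("0x563b01e830c0", 262, 1),
    ("0x563b01e83080", 14, 63),
    ("0xffff88c3d74e82c0", 1, 0),
    ("0xffffa587c11e38c0", 1, 0),
    ("0xffffffffbd5e6fc0", 0, 1),
    ("0x7f90a4d6c2c0", 1, 0),
]


def llc_hit_table(rows):
    """Return (address, total, pct) tuples sorted by total LLC load hits.

    Assumption: total = LclHit + LclHitm, and pct is the share of the
    sum of all totals. This mirrors the numbers in the quoted output only.
    """
    totals = [(addr, hit + hitm) for addr, hit, hitm in rows]
    grand_total = sum(total for _, total in totals)
    table = [(addr, total, 100.0 * total / grand_total) for addr, total in totals]
    # Highest LLC load hit count first, as in the '-d llc' display.
    return sorted(table, key=lambda entry: entry[1], reverse=True)


for addr, total, pct in llc_hit_table(rows):
    print(f"{addr:>18}  {total:5d}  {pct:6.2f}%")
```

Under these assumptions the first three lines come out as 648, 263 and 77 hits, which agrees with the Total and Pct columns quoted above.
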
>   =================================================
>         Shared Cache Line Distribution Pareto
>   =================================================
>   #
>   #        ---- LLC LD ----  -- Store Refs --  --------- Data address ---------                                                   ---------- cycles ----------    Total       cpu                                  Shared
>   #   Num   LclHit  LclHitm   L1 Hit  L1 Miss              Offset  Node  PA cnt      Pid                 Tid        Code address  rmt hitm  lcl hitm      load  records       cnt               Symbol             Object                  Source:Line  Node
>   # .....  .......  .......  .......  .......  ..................  ....  ......  .......  ..................  ..................  ........  ........  ........  .......  ........  ...................  .................  ...........................  ....
>   #
>     -------------------------------------------------------------
>         0      143      505     2582      691      0x563b01e83100
>     -------------------------------------------------------------
>             96.50%    7.72%   46.79%    0.00%                 0x0     0       1    14100    14102:lock_th         0x563b01c81c16         0      1949      1331     1876         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:145   0
>              0.00%   35.05%    0.00%    0.00%                 0x0     0       1    14100    14102:lock_th         0x563b01c81c1d         0      2651       975      748         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
>              0.00%   30.89%    0.00%    0.00%                 0x0     0       1    14100    14103:lock_th         0x563b01c81c1d         0      1425      1003      762         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
>              2.10%    7.52%   49.19%    0.00%                 0x0     0       1    14100    14103:lock_th         0x563b01c81c16         0      1585      1053     2037         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:145   0
>              0.00%    0.00%    2.52%   44.86%                 0x0     0       1    14100    14102:lock_th         0x563b01c81c28         0         0         0      375         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
>              0.00%    0.00%    1.51%   55.14%                 0x0     0       1    14100    14103:lock_th         0x563b01c81c28         0         0         0      420         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:146   0
>              1.40%   12.87%    0.00%    0.00%                0x20     0       1    14100    14104:reader_thd      0x563b01c81c73         0       166        99      417         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:155   0
>              0.00%    5.94%    0.00%    0.00%                0x20     0       1    14100    14105:reader_thd      0x563b01c81c73         0       144        85      376         1  [.] read_write_func  false_sharing.exe  false_sharing_example.c:155   0
> 
>   [...]
> 
> 
> Leo Yan (8):
>   perf mem: Add structure field c2c_stats::tot_llchit
>   perf c2c: Add dimensions for total LLC hit
>   perf c2c: Add dimensions for LLC load hit
>   perf c2c: Change to general naming for macros
>   perf c2c: Rename for shared cache line stats
>   perf c2c: Refactor hist entry validation
>   perf c2c: Add option '-d llc' for sorting with LLC load
>   perf c2c: Update documentation for display option 'llc'
> 
>  tools/perf/Documentation/perf-c2c.txt |  18 +-
>  tools/perf/builtin-c2c.c              | 333 +++++++++++++++++++++-----
>  tools/perf/util/mem-events.c          |   3 +
>  tools/perf/util/mem-events.h          |   1 +
>  4 files changed, 286 insertions(+), 69 deletions(-)
> 
> -- 
> 2.17.1
> 

-- 

- Arnaldo
