Hi Arnaldo,

On 1/21/2021 9:02 PM, Arnaldo Carvalho de Melo wrote:
So if we just want to append to the default list, we only need to set
detailed_run=1; ideally perf-stat would then print the default list as well.
But for now, task-clock, context-switches, cpu-migrations, page-faults,
instructions, branches and branch-misses are not displayed:
root@kbl-ppc:~# ./perf stat -e cycles -d -a -- sleep 1

  Performance counter stats for 'system wide':

        124,178,207      cycles                                                 (80.02%)
          6,444,490      L1-dcache-loads                                        (80.01%)
          1,043,169      L1-dcache-load-misses     #   16.19% of all L1-dcache accesses  (80.02%)
            564,474      LLC-loads                                              (80.02%)
             49,262      LLC-load-misses           #    8.73% of all LL-cache accesses  (79.92%)

        1.001614947 seconds time elapsed

Do we still need the '+' prefix to add the specified event on top of the
default list? It looks like the current syntax should already support that
feature; we just need to fix some issues.
I think we can do away with that '+' when showing the added events and
their counts.

- Arnaldo

Could you take a look at my v3 (I will post it soon)? It is only a one-line change, but it achieves the goal. Another advantage is that it makes it easy to append metrics to the default event list.

For example,

root@kbl-ppc:~# ./perf stat -M Page_Walks_Utilization -d -a -- sleep 1

 Performance counter stats for 'system wide':

         1,417,358      itlb_misses.walk_pending  #    0.177 M/sec
                                                  #     0.05 Page_Walks_Utilization   (30.44%)
         1,145,481      dtlb_store_misses.walk_pending #    0.143 M/sec                    (30.85%)
       126,098,937      cycles                    #    0.016 GHz                      (31.25%)
         9,069,839      dtlb_load_misses.walk_pending #    1.132 M/sec                    (31.64%)
                 0      ept.walk_pending          #    0.000 K/sec                    (31.61%)
          8,009.41 msec cpu-clock                 #    7.994 CPUs utilized
               300      context-switches          #    0.037 K/sec
                 8      cpu-migrations            #    0.001 K/sec
                 3      page-faults               #    0.000 K/sec
       124,456,362      cycles                    #    0.016 GHz                      (31.20%)
        23,924,628      instructions              #    0.19  insn per cycle           (38.79%)
         4,532,511      branches                  #    0.566 M/sec                    (38.39%)
           650,797      branch-misses             #   14.36% of all branches          (38.00%)
         6,332,823      L1-dcache-loads           #    0.791 M/sec                    (37.95%)
         1,056,199      L1-dcache-load-misses     #   16.68% of all L1-dcache accesses  (37.95%)
           572,791      LLC-loads                 #    0.072 M/sec                    (30.36%)
            52,025      LLC-load-misses           #    9.08% of all LL-cache accesses  (30.36%)

       1.001966758 seconds time elapsed

It appends the metric 'Page_Walks_Utilization' to the default event list.
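
As a quick sanity check that the default and detailed events in this combined output are still consistent with each other, the derived columns can be recomputed from the raw counts. This is only a minimal sketch in Python, not perf code: it assumes each percentage is a plain ratio of the two counters printed next to it, and that "insn per cycle" is computed against the second 'cycles' line (the one from the default list). The helper name ratio_percent is made up for this illustration.

```python
def ratio_percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to two decimals."""
    if denominator == 0:
        return 0.0
    return round(100.0 * numerator / denominator, 2)


# Raw counts copied from the 'perf stat -M Page_Walks_Utilization -d' output.
cycles = 124_456_362          # assumption: the 'cycles' line from the default list
instructions = 23_924_628
branches = 4_532_511
branch_misses = 650_797
l1_loads = 6_332_823
l1_load_misses = 1_056_199
llc_loads = 572_791
llc_load_misses = 52_025

insn_per_cycle = round(instructions / cycles, 2)
branch_miss_pct = ratio_percent(branch_misses, branches)
l1_miss_pct = ratio_percent(l1_load_misses, l1_loads)
llc_miss_pct = ratio_percent(llc_load_misses, llc_loads)

print(f"insn per cycle:       {insn_per_cycle:.2f}")
print(f"branch-misses:        {branch_miss_pct:.2f}% of all branches")
print(f"L1-dcache-load-misses: {l1_miss_pct:.2f}% of all L1-dcache accesses")
print(f"LLC-load-misses:      {llc_miss_pct:.2f}% of all LL-cache accesses")
```

Under those assumptions the recomputed values agree with the ones perf printed above.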

Anyway, if you think it's not a good solution, I'd like to change it. :)

Thanks
Jin Yao
