Re: [PATCH v5 0/5] perf stat: Support overall statistics for interval mode
Hi Kajoljain,

On 5/14/2020 5:53 PM, kajoljain wrote:
>
> On 5/14/20 11:06 AM, Jin Yao wrote:
>> Currently perf-stat supports printing counts at a regular interval (-I),
>> but it's not easy for the user to get the overall statistics.
>>
>> With this patchset, it reports a summary at the end of the
>> interval output.
>>
>> For example,
>>
>>   root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
>>   #           time             counts unit events
>>        1.000412064          2,281,114      cycles
>>        2.001383658          2,547,880      cycles
>>
>>    Performance counter stats for 'system wide':
>>
>>            4,828,994      cycles
>>
>>          2.002860349 seconds time elapsed
>>
>>   root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
>>   #           time             counts unit events
>>        1.000389902          1,536,093      cycles
>>        1.000389902            420,226      instructions              #    0.27  insn per cycle
>>        2.001433453          2,213,952      cycles
>>        2.001433453            735,465      instructions              #    0.33  insn per cycle
>>
>>    Performance counter stats for 'system wide':
>>
>>            3,750,045      cycles
>>            1,155,691      instructions              #    0.31  insn per cycle
>>
>>          2.003023361 seconds time elapsed
>>
>>   root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
>>   #           time             counts unit events
>>        1.000435121            905,303      inst_retired.any          #      2.9 CPI
>>        1.000435121          2,663,333      cycles
>>        1.000435121            914,702      inst_retired.any          #      0.3 IPC
>>        1.000435121          2,676,559      cpu_clk_unhalted.thread
>>        2.001615941          1,951,092      inst_retired.any          #      1.8 CPI
>>        2.001615941          3,551,357      cycles
>>        2.001615941          1,950,837      inst_retired.any          #      0.5 IPC
>>        2.001615941          3,551,044      cpu_clk_unhalted.thread
>>
>>    Performance counter stats for 'system wide':
>>
>>            2,856,395      inst_retired.any          #      2.2 CPI
>>            6,214,690      cycles
>>            2,865,539      inst_retired.any          #      0.5 IPC
>>            6,227,603      cpu_clk_unhalted.thread
>>
>>          2.003403078 seconds time elapsed
>
> Hi Jin,
> 	Reporting the summary will be great for understanding overall stats.
> So, before the patch, where we were resetting rt_stat before read_counters
> to make sure that whatever is printed in the final aggregate matches the
> counts for that interval,

Yes, I had similar thoughts, so I posted the following patch.

https://lore.kernel.org/lkml/20200420145417.6864-1-yao@linux.intel.com/

> we used to update stats->means and other info as described in the RFC:
> https://lkml.org/lkml/2020/3/24/158

I've checked your patch, but sorry, I'm also not sure whether it's the
expected behavior.

> Now, stats->means is the same as the counts which we are using in the
> generic_metric function. Is this the expected behavior? I am not sure
> whether data like stats->means is supposed to be updated per interval or
> whether we are using it somewhere else.

I just think it's easy to understand: the metric is calculated from the
counts per interval.

> So, as we call update_stats for each event and for each interval, can we
> somehow use that to print overall stats, maybe by adding a variable in
> `struct stats` to keep the total count for that event?
>
> Please let me know if my understanding is fine.

Adding a variable in 'struct stats' does not look sufficient (or it would be
more complicated), because perf-stat also needs to report counts according
to different aggregation modes (not only the metric). I just think copying
the total counts to the current counts is an easy way, because we can reuse
most of the existing non-interval processing code.

Thanks
Jin Yao

> Thanks,
> Kajol Jain
>>
>>   v5:
>>   ---
>>   1. Create new patch "perf stat: Save aggr value to first member
>>      of prev_raw_counts".
>>
>>   2. Call perf_evlist__save_aggr_prev_raw_counts to save the aggr value
>>      to the first member of prev_raw_counts for AGGR_GLOBAL. Then
>>      perf_stat_process_counter can create aggr values from per-cpu
>>      values.
>>
>>   The following patches are impacted in v5:
>>      perf stat: Copy counts from prev_raw_counts to evsel->counts
>>      perf stat: Save aggr value to first member of prev_raw_counts
>>      perf stat: Report summary for interval mode
>>
>>   v4:
>>   ---
>>   1. Create runtime_stat_reset.
>>
>>   2. Zero the aggr in perf_counts__reset and use it to reset
>>      prev_raw_counts.
>>
>>   3. Move affinity setup and read_counter_cpu to a new function
>>      read_affinity_counters. It's only called when stat_config.summary
>>      is not set.
>>
>>   v3:
>>   ---
>>   1. 'perf stat: Fix wrong per-thread runtime stat for interval mode'
>>      is a new patch which fixes an existing issue found in test.
>>
>>   2. We use the prev_raw_counts for summary counts. Drop the summary_counts in
>>      v2.
>>
>>   3. Fix some issues.
>>
>>   v2:
>>   ---
>>   Rebase to perf/core branch
>>
>> Jin Yao (5):
>>    perf stat: Fix wrong per-th
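
The "copy total counts to current counts" idea from the reply above can be illustrated with a toy model. This is only a sketch under stated assumptions: it assumes each read returns a cumulative raw value, that the per-interval count is the delta against the previous raw reading, and that the last raw reading therefore holds the running total. The `Counter` class and its method names are hypothetical and are not perf code; the cumulative readings are derived from the first `cycles` example in the cover letter.

```python
class Counter:
    """Toy stand-in for one event. Hypothetical, not perf's evsel."""

    def __init__(self):
        self.prev_raw = 0  # last cumulative reading (plays the role of prev_raw_counts)
        self.counts = 0    # value handed to the ordinary print path

    def process_interval(self, raw):
        # Interval value = new cumulative reading minus the previous reading.
        self.counts = raw - self.prev_raw
        self.prev_raw = raw
        return self.counts

    def prepare_summary(self):
        # Copy the cumulative total into counts so that the same
        # non-interval print path can report it unchanged.
        self.counts = self.prev_raw
        return self.counts


cycles = Counter()
# Cumulative readings: 2,281,114 after 1s and 4,828,994 after 2s.
per_interval = [cycles.process_interval(raw) for raw in (2_281_114, 4_828_994)]
summary = cycles.prepare_summary()

print(per_interval)  # per-interval deltas
print(summary)       # total reported in the summary
```

In this model no separate summary storage is needed, which is in line with the v3 note about dropping summary_counts in favour of prev_raw_counts.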
Re: [PATCH v5 0/5] perf stat: Support overall statistics for interval mode
On 5/14/20 11:06 AM, Jin Yao wrote:
> Currently perf-stat supports printing counts at a regular interval (-I),
> but it's not easy for the user to get the overall statistics.
>
> With this patchset, it reports a summary at the end of the
> interval output.
>
> For example,
>
>  root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
>  #           time             counts unit events
>       1.000412064          2,281,114      cycles
>       2.001383658          2,547,880      cycles
>
>   Performance counter stats for 'system wide':
>
>           4,828,994      cycles
>
>         2.002860349 seconds time elapsed
>
>  root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
>  #           time             counts unit events
>       1.000389902          1,536,093      cycles
>       1.000389902            420,226      instructions              #    0.27  insn per cycle
>       2.001433453          2,213,952      cycles
>       2.001433453            735,465      instructions              #    0.33  insn per cycle
>
>   Performance counter stats for 'system wide':
>
>           3,750,045      cycles
>           1,155,691      instructions              #    0.31  insn per cycle
>
>         2.003023361 seconds time elapsed
>
>  root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
>  #           time             counts unit events
>       1.000435121            905,303      inst_retired.any          #      2.9 CPI
>       1.000435121          2,663,333      cycles
>       1.000435121            914,702      inst_retired.any          #      0.3 IPC
>       1.000435121          2,676,559      cpu_clk_unhalted.thread
>       2.001615941          1,951,092      inst_retired.any          #      1.8 CPI
>       2.001615941          3,551,357      cycles
>       2.001615941          1,950,837      inst_retired.any          #      0.5 IPC
>       2.001615941          3,551,044      cpu_clk_unhalted.thread
>
>   Performance counter stats for 'system wide':
>
>           2,856,395      inst_retired.any          #      2.2 CPI
>           6,214,690      cycles
>           2,865,539      inst_retired.any          #      0.5 IPC
>           6,227,603      cpu_clk_unhalted.thread
>
>         2.003403078 seconds time elapsed

Hi Jin,
	Reporting the summary will be great for understanding overall stats.
So, before the patch, where we were resetting rt_stat before read_counters
to make sure that whatever is printed in the final aggregate matches the
counts for that interval, we used to update stats->means and other info as
described in the RFC: https://lkml.org/lkml/2020/3/24/158

Now, stats->means is the same as the counts which we are using in the
generic_metric function. Is this the expected behavior? I am not sure
whether data like stats->means is supposed to be updated per interval or
whether we are using it somewhere else.

So, as we call update_stats for each event and for each interval, can we
somehow use that to print overall stats, maybe by adding a variable in
`struct stats` to keep the total count for that event?

Please let me know if my understanding is fine.

Thanks,
Kajol Jain
>
>  v5:
>  ---
>  1. Create new patch "perf stat: Save aggr value to first member
>     of prev_raw_counts".
>
>  2. Call perf_evlist__save_aggr_prev_raw_counts to save the aggr value
>     to the first member of prev_raw_counts for AGGR_GLOBAL. Then
>     perf_stat_process_counter can create aggr values from per-cpu
>     values.
>
>  The following patches are impacted in v5:
>     perf stat: Copy counts from prev_raw_counts to evsel->counts
>     perf stat: Save aggr value to first member of prev_raw_counts
>     perf stat: Report summary for interval mode
>
>  v4:
>  ---
>  1. Create runtime_stat_reset.
>
>  2. Zero the aggr in perf_counts__reset and use it to reset
>     prev_raw_counts.
>
>  3. Move affinity setup and read_counter_cpu to a new function
>     read_affinity_counters. It's only called when stat_config.summary
>     is not set.
>
>  v3:
>  ---
>  1. 'perf stat: Fix wrong per-thread runtime stat for interval mode'
>     is a new patch which fixes an existing issue found in test.
>
>  2. We use the prev_raw_counts for summary counts. Drop the summary_counts in
>     v2.
>
>  3. Fix some issues.
>
>  v2:
>  ---
>  Rebase to perf/core branch
>
> Jin Yao (5):
>   perf stat: Fix wrong per-thread runtime stat for interval mode
>   perf counts: Reset prev_raw_counts counts
>   perf stat: Copy counts from prev_raw_counts to evsel->counts
>   perf stat: Save aggr value to first member of prev_raw_counts
>   perf stat: Report summary for interval mode
>
>  tools/perf/builtin-stat.c | 101 ++
>  tools/perf/util/counts.c  |   4 +-
>  tools/perf/util/counts.h  |   1 +
>  tools/perf/util/stat.c    |  43 +---
>  tools/perf/util/stat.h    |   3 ++
>  5 files changed, 113 insertions(+), 39 deletions(-)
>
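
The suggestion above (keeping a running total next to the per-interval statistics) could look roughly like the sketch below. It is a minimal illustration, not perf's code: the `Stats` class is a Python stand-in for a running mean/variance accumulator, its layout is an assumption, and the `total` field is the hypothetical extra variable being proposed. The sample values are the two `cycles` interval counts from the cover letter.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Hypothetical running-statistics record; field names are assumptions."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0   # sum of squared deviations (Welford's method)
    total: int = 0    # proposed extra field: sum of all interval counts


def update_stats(stats, val):
    # Welford-style online update of the mean and squared deviations.
    stats.n += 1
    delta = val - stats.mean
    stats.mean += delta / stats.n
    stats.m2 += delta * (val - stats.mean)
    # Keep the overall count for this event across all intervals.
    stats.total += val


cycles = Stats()
for interval_count in (2_281_114, 2_547_880):
    update_stats(cycles, interval_count)

print(cycles.total)  # overall count across both intervals
print(cycles.mean)   # per-interval mean
```

A single scalar per event covers the system-wide case shown here; it does not by itself carry per-CPU or per-thread breakdowns, so other aggregation modes would need more than this one field.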
[PATCH v5 0/5] perf stat: Support overall statistics for interval mode
Currently perf-stat supports printing counts at a regular interval (-I),
but it's not easy for the user to get the overall statistics.

With this patchset, it reports a summary at the end of the
interval output.

For example,

 root@kbl-ppc:~# perf stat -e cycles -I1000 --interval-count 2
 #           time             counts unit events
      1.000412064          2,281,114      cycles
      2.001383658          2,547,880      cycles

  Performance counter stats for 'system wide':

          4,828,994      cycles

        2.002860349 seconds time elapsed

 root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
 #           time             counts unit events
      1.000389902          1,536,093      cycles
      1.000389902            420,226      instructions              #    0.27  insn per cycle
      2.001433453          2,213,952      cycles
      2.001433453            735,465      instructions              #    0.33  insn per cycle

  Performance counter stats for 'system wide':

          3,750,045      cycles
          1,155,691      instructions              #    0.31  insn per cycle

        2.003023361 seconds time elapsed

 root@kbl-ppc:~# perf stat -M CPI,IPC -I1000 --interval-count 2
 #           time             counts unit events
      1.000435121            905,303      inst_retired.any          #      2.9 CPI
      1.000435121          2,663,333      cycles
      1.000435121            914,702      inst_retired.any          #      0.3 IPC
      1.000435121          2,676,559      cpu_clk_unhalted.thread
      2.001615941          1,951,092      inst_retired.any          #      1.8 CPI
      2.001615941          3,551,357      cycles
      2.001615941          1,950,837      inst_retired.any          #      0.5 IPC
      2.001615941          3,551,044      cpu_clk_unhalted.thread

  Performance counter stats for 'system wide':

          2,856,395      inst_retired.any          #      2.2 CPI
          6,214,690      cycles
          2,865,539      inst_retired.any          #      0.5 IPC
          6,227,603      cpu_clk_unhalted.thread

        2.003403078 seconds time elapsed

 v5:
 ---
 1. Create new patch "perf stat: Save aggr value to first member
    of prev_raw_counts".

 2. Call perf_evlist__save_aggr_prev_raw_counts to save the aggr value
    to the first member of prev_raw_counts for AGGR_GLOBAL. Then
    perf_stat_process_counter can create aggr values from per-cpu
    values.

 The following patches are impacted in v5:
    perf stat: Copy counts from prev_raw_counts to evsel->counts
    perf stat: Save aggr value to first member of prev_raw_counts
    perf stat: Report summary for interval mode

 v4:
 ---
 1. Create runtime_stat_reset.

 2. Zero the aggr in perf_counts__reset and use it to reset
    prev_raw_counts.

 3. Move affinity setup and read_counter_cpu to a new function
    read_affinity_counters. It's only called when stat_config.summary
    is not set.

 v3:
 ---
 1. 'perf stat: Fix wrong per-thread runtime stat for interval mode'
    is a new patch which fixes an existing issue found in test.

 2. We use the prev_raw_counts for summary counts. Drop the summary_counts in
    v2.

 3. Fix some issues.

 v2:
 ---
 Rebase to perf/core branch

Jin Yao (5):
  perf stat: Fix wrong per-thread runtime stat for interval mode
  perf counts: Reset prev_raw_counts counts
  perf stat: Copy counts from prev_raw_counts to evsel->counts
  perf stat: Save aggr value to first member of prev_raw_counts
  perf stat: Report summary for interval mode

 tools/perf/builtin-stat.c | 101 ++
 tools/perf/util/counts.c  |   4 +-
 tools/perf/util/counts.h  |   1 +
 tools/perf/util/stat.c    |  43 +---
 tools/perf/util/stat.h    |   3 ++
 5 files changed, 113 insertions(+), 39 deletions(-)

-- 
2.17.1
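
The relationship between the interval lines and the final summary in the second example above can be checked with a few lines of arithmetic. This is only a sketch for sanity-checking the example output: the helper names `summarize` and `insn_per_cycle` are hypothetical and are not part of perf, and the counts are copied from the `cycles,instructions` run.

```python
# Interval counts copied from the cycles,instructions example.
intervals = [
    {"cycles": 1_536_093, "instructions": 420_226},
    {"cycles": 2_213_952, "instructions": 735_465},
]


def insn_per_cycle(counts):
    """Instructions per cycle for one set of counts."""
    return counts["instructions"] / counts["cycles"]


def summarize(intervals):
    """Sum each event over all intervals (hypothetical helper, not perf code)."""
    total = {}
    for counts in intervals:
        for event, value in counts.items():
            total[event] = total.get(event, 0) + value
    return total


summary = summarize(intervals)

for i, counts in enumerate(intervals, start=1):
    print(f"interval {i}: {insn_per_cycle(counts):.2f} insn per cycle")
print(f"summary: {summary['cycles']:,} cycles, "
      f"{summary['instructions']:,} instructions, "
      f"{insn_per_cycle(summary):.2f} insn per cycle")
```

The summed counts match the 3,750,045 cycles and 1,155,691 instructions in the summary, and the ratio of the totals rounds to the 0.31 shown there. A plain mean of the two per-interval ratios would round to 0.30 instead, so the example output is consistent with the summary metric being computed from the total counts.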