Re: [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts

2020-05-13 Thread Jin, Yao

Hi Jiri,

On 5/13/2020 11:31 PM, Jiri Olsa wrote:

> On Fri, May 08, 2020 at 03:58:16PM +0800, Jin Yao wrote:
> > It would be useful to support the overall statistics for perf-stat
> > interval mode. For example, report the summary at the end of
> > "perf-stat -I" output.
> >
> > But since perf-stat can support many aggregation modes, such as
> > --per-thread, --per-socket, -M, etc., we need a solution which
> > doesn't bring much complexity.
> >
> > The idea is to use 'evsel->prev_raw_counts' which is updated in
> > each interval and it's saved with the latest counts. Before reporting
> > the summary, we copy the counts from evsel->prev_raw_counts to
> > evsel->counts, and next we just follow non-interval processing.
> >
> > In evsel__compute_deltas, this patch saves counts to the member
> > [cpu0,thread0] of perf_counts for AGGR_GLOBAL.
> >
> > That's because after copying evsel->prev_raw_counts to evsel->counts,
> > perf_counts(evsel->counts, cpu, thread) are all 0 for AGGR_GLOBAL.
> > Once we go to process_counter_maps again, all members of perf_counts
> > are 0.
> >
> > So this patch uses a trick that saves the previous aggr value to
> > the member [cpu0,thread0] of perf_counts, then aggr calculation
> > in process_counter_values can work correctly.
> >
> >  v4:
> >  ---
> >  Change the commit message.
> >  No functional change.
> >
> > Signed-off-by: Jin Yao
> > ---
> >  tools/perf/util/evsel.c |  1 +
> >  tools/perf/util/stat.c  | 24 ++++++++++++++++++++++++
> >  tools/perf/util/stat.h  |  1 +
> >  3 files changed, 26 insertions(+)
> >
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index 28683b0eb738..6fae1ec28886 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -1283,6 +1283,7 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
> >  	if (cpu == -1) {
> >  		tmp = evsel->prev_raw_counts->aggr;
> >  		evsel->prev_raw_counts->aggr = *count;
> > +		*perf_counts(evsel->prev_raw_counts, 0, 0) = *count;
>
> ok, I think I understand that now.. it's only for AGGR_GLOBAL mode,
> because the perf_stat_process_counter will create aggr values from
> per cpu values
>
> but why do we need to do that all the time? can't we just set it up
> before you zero prev_raw_counts in next patch?
>
>
> 	if (interval) {
> 		stat_config.interval = 0;
> 		stat_config.summary = true;
> 		perf_evlist__copy_prev_raw_counts(evsel_list);
>
> 		-> for AGGR_GLOBAL set the counts[0,0] to prev_raw_counts->aggr
>
> 		perf_evlist__reset_prev_raw_counts(evsel_list);
> 		runtime_stat_reset(&stat_config);
> 		perf_stat__reset_shadow_per_stat(&rt_stat);
> 	}




Yes, I think that's a good idea.

In v5, I've created a new patch, "perf stat: Save aggr value to first member of
prev_raw_counts", which saves the aggr value to the first member of
prev_raw_counts for AGGR_GLOBAL. After that, perf_stat_process_counter can
create the aggr values from the per-cpu values successfully.
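
To illustrate why the first member needs to be seeded, here is a minimal
sketch. It does not use the real perf structures: toy_values, toy_counts,
toy_sum_cpus, toy_copy_per_cpu and toy_seed_first are hypothetical stand-ins,
and NR_CPUS is an arbitrary constant. It assumes that in AGGR_GLOBAL mode only
the aggr member of prev_raw_counts is kept up to date, so the per-cpu slots
stay at 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4	/* arbitrary size for this toy model */

/* Hypothetical stand-in for struct perf_counts_values. */
struct toy_values {
	uint64_t val, ena, run;
};

/* Hypothetical stand-in for struct perf_counts: aggregate + per-cpu slots. */
struct toy_counts {
	struct toy_values aggr;
	struct toy_values cpu[NR_CPUS];
};

/* Rebuild an aggregate by summing the per-cpu slots. */
static struct toy_values toy_sum_cpus(const struct toy_counts *c)
{
	struct toy_values sum = { 0, 0, 0 };
	size_t i;

	assert(c != NULL);
	for (i = 0; i < NR_CPUS; i++) {
		sum.val += c->cpu[i].val;
		sum.ena += c->cpu[i].ena;
		sum.run += c->cpu[i].run;
	}
	return sum;
}

/* Copy only the per-cpu slots; dst->aggr is left untouched. */
static void toy_copy_per_cpu(struct toy_counts *dst,
			     const struct toy_counts *src)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < NR_CPUS; i++)
		dst->cpu[i] = src->cpu[i];
}

/* Store the saved aggregate in slot 0 so that a later sum finds it. */
static void toy_seed_first(struct toy_counts *c)
{
	assert(c != NULL);
	c->cpu[0] = c->aggr;
}
```

In this model, summing the slots after a plain copy gives 0, while seeding
slot 0 first makes the sum equal to the saved aggregate.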


Thanks
Jin Yao



> thanks,
> jirka



Re: [PATCH v4 3/4] perf stat: Copy counts from prev_raw_counts to evsel->counts

2020-05-13 Thread Jiri Olsa
On Fri, May 08, 2020 at 03:58:16PM +0800, Jin Yao wrote:
> It would be useful to support the overall statistics for perf-stat
> interval mode. For example, report the summary at the end of
> "perf-stat -I" output.
> 
> But since perf-stat can support many aggregation modes, such as
> --per-thread, --per-socket, -M, etc., we need a solution which
> doesn't bring much complexity.
> 
> The idea is to use 'evsel->prev_raw_counts' which is updated in
> each interval and it's saved with the latest counts. Before reporting
> the summary, we copy the counts from evsel->prev_raw_counts to
> evsel->counts, and next we just follow non-interval processing.
> 
> In evsel__compute_deltas, this patch saves counts to the member
> [cpu0,thread0] of perf_counts for AGGR_GLOBAL.
> 
> That's because after copying evsel->prev_raw_counts to evsel->counts,
> perf_counts(evsel->counts, cpu, thread) are all 0 for AGGR_GLOBAL.
> Once we go to process_counter_maps again, all members of perf_counts
> are 0.
> 
> So this patch uses a trick that saves the previous aggr value to
> the member [cpu0,thread0] of perf_counts, then aggr calculation
> in process_counter_values can work correctly.
> 
>  v4:
>  ---
>  Change the commit message.
>  No functional change.
> 
> Signed-off-by: Jin Yao 
> ---
>  tools/perf/util/evsel.c |  1 +
>  tools/perf/util/stat.c  | 24 ++++++++++++++++++++++++
>  tools/perf/util/stat.h  |  1 +
>  3 files changed, 26 insertions(+)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 28683b0eb738..6fae1ec28886 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1283,6 +1283,7 @@ void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
>   if (cpu == -1) {
>   tmp = evsel->prev_raw_counts->aggr;
>   evsel->prev_raw_counts->aggr = *count;
> + *perf_counts(evsel->prev_raw_counts, 0, 0) = *count;

ok, I think I understand that now.. it's only for AGGR_GLOBAL mode,
because the perf_stat_process_counter will create aggr values from
per cpu values

but why do we need to do that all the time? can't we just set it up
before you zero prev_raw_counts in next patch?


	if (interval) {
		stat_config.interval = 0;
		stat_config.summary = true;
		perf_evlist__copy_prev_raw_counts(evsel_list);

		-> for AGGR_GLOBAL set the counts[0,0] to prev_raw_counts->aggr

		perf_evlist__reset_prev_raw_counts(evsel_list);
		runtime_stat_reset(&stat_config);
		perf_stat__reset_shadow_per_stat(&rt_stat);
	}
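
The ordering above (copy, seed counts[0,0], then reset) could be modelled
roughly as follows. This is only a sketch on a toy structure: toy_evsel and
toy_prepare_summary are hypothetical names, the per-thread dimension is left
out, and the real code would go through the perf_evlist__* helpers shown
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS 4	/* arbitrary size for this toy model */

/* Hypothetical, flattened stand-in for the counters of one evsel. */
struct toy_evsel {
	uint64_t prev_aggr;		/* models prev_raw_counts->aggr */
	uint64_t prev_cpu[NR_CPUS];	/* models prev_raw_counts per-cpu slots */
	uint64_t counts_cpu[NR_CPUS];	/* models counts per-cpu slots */
};

/* Prepare the final summary once the last interval has been printed. */
static void toy_prepare_summary(struct toy_evsel *ev, bool aggr_global)
{
	assert(ev != NULL);

	/* 1. Copy the saved per-cpu values into counts. */
	memcpy(ev->counts_cpu, ev->prev_cpu, sizeof(ev->counts_cpu));

	/* 2. For global aggregation, seed slot 0 with the saved aggregate. */
	if (aggr_global)
		ev->counts_cpu[0] = ev->prev_aggr;

	/* 3. Only now zero the saved values; doing so earlier loses them. */
	ev->prev_aggr = 0;
	memset(ev->prev_cpu, 0, sizeof(ev->prev_cpu));
}
```

Doing step 2 once in the summary path, rather than on every interval, is what
keeps the seeding out of the delta computation.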


thanks,
jirka