[tip: perf/core] perf metricgroup: Support multiple events for metricgroup

2019-09-02 Thread tip-bot2 for Jin Yao
The following commit has been merged into the perf/core branch of tip:

Commit-ID: f01642e4912bb80a01d693f4cc6fb0897207a090
Gitweb: https://git.kernel.org/tip/f01642e4912bb80a01d693f4cc6fb0897207a090
Author: Jin Yao
AuthorDate: Wed, 28 Aug 2019 13:59:32 +08:00
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Sat, 31 Aug 2019 22:27:52 -03:00

perf metricgroup: Support multiple events for metricgroup

Some uncore metrics don't work as expected. For example, on
cascadelakex:

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1

   Performance counter stats for 'system wide':

   1841092  unc_m_pmm_rpq_inserts
   3680816  unc_m_pmm_wpq_inserts

   1.001775055 seconds time elapsed

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1

   Performance counter stats for 'system wide':

 860649746  unc_m_pmm_rpq_occupancy.all
   1840557  unc_m_pmm_rpq_inserts
   12790627455  unc_m_clockticks

   1.001773348 seconds time elapsed

No metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' or 'UNC_M_PMM_READ_LATENCY' are
reported.

The problem is that an alias expanding to multiple events, which is
typical for uncore events, is not supported (see the comments in
find_evsel_group()).

For UNC_M_PMM_BANDWIDTH.TOTAL in the above example, the expanded event
group is '{unc_m_pmm_rpq_inserts,unc_m_pmm_wpq_inserts}:W', but the
actual events passed to find_evsel_group() are:

  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_rpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts
  unc_m_pmm_wpq_inserts

This multiple-events case is not handled correctly.

This patch introduces a new field 'metric_leader' in struct evsel. The
first event is treated as the metric leader. Each of the remaining
identical events points to the first event via its metric_leader field
in struct evsel.

This design allows the counts of all identical events to be added to
the first event in the group (the metric_leader).
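The leader idea can be illustrated with a minimal sketch. This is not
the code from the patch: 'struct sketch_evsel' and both helpers are
hypothetical, simplified stand-ins, and the sketch assumes that events
are matched by name only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for perf's struct evsel. */
struct sketch_evsel {
	const char *name;
	unsigned long long count;
	struct sketch_evsel *metric_leader;
};

/*
 * For each event, find the first event with the same name and record it
 * as the metric leader. A leader points to itself.
 */
static void assign_metric_leaders(struct sketch_evsel *evs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		evs[i].metric_leader = &evs[i];
		for (size_t j = 0; j < i; j++) {
			if (strcmp(evs[j].name, evs[i].name) == 0) {
				evs[i].metric_leader = evs[j].metric_leader;
				break;
			}
		}
	}
}

/* Fold the count of every non-leader event into its leader. */
static void fold_counts_into_leaders(struct sketch_evsel *evs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(evs[i].metric_leader != NULL);
		if (evs[i].metric_leader != &evs[i]) {
			evs[i].metric_leader->count += evs[i].count;
			evs[i].count = 0;
		}
	}
}
```

After both helpers run, only the first unc_m_pmm_rpq_inserts and the
first unc_m_pmm_wpq_inserts entries carry a count, so a metric
expression can be evaluated on one value per event name.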

With this patch,

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_BANDWIDTH.TOTAL -a -- sleep 1

   Performance counter stats for 'system wide':

   1842108  unc_m_pmm_rpq_inserts  # 337.2 MB/sec  UNC_M_PMM_BANDWIDTH.TOTAL
   3682209  unc_m_pmm_wpq_inserts

   1.001819706 seconds time elapsed

  root@lkp-csl-2sp2:~# perf stat -M UNC_M_PMM_READ_LATENCY -a -- sleep 1

   Performance counter stats for 'system wide':

 861970685  unc_m_pmm_rpq_occupancy.all  # 219.4 ns  UNC_M_PMM_READ_LATENCY
   1842772  unc_m_pmm_rpq_inserts
   12790196356  unc_m_clockticks

   1.001749103 seconds time elapsed

Now we can see the correct metrics 'UNC_M_PMM_BANDWIDTH.TOTAL' and
'UNC_M_PMM_READ_LATENCY'.

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lore.kernel.org/lkml/20190828055932.8269-5-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/evsel.h   |  1 +-
 tools/perf/util/metricgroup.c | 84 +-
 tools/perf/util/stat-shadow.c | 27 +--
 3 files changed, 68 insertions(+), 44 deletions(-)

diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index fd60cac..68321d1 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -168,6 +168,7 @@ struct evsel {
const char *metric_expr;
const char *metric_name;
struct evsel **metric_events;
+   struct evsel *metric_leader;
bool collect_stat;
bool weak_group;
bool percore;
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index f474a29..a7c0424 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -90,57 +90,61 @@ struct egroup {
const char *metric_unit;
 };
 
-static bool record_evsel(int *ind, struct evsel **start,
-int idnum,
-struct evsel **metric_events,
-struct evsel *ev)
-{
-   metric_events[*ind] = ev;
-   if (*ind == 0)
-   *start = ev;
-   if (++*ind == idnum) {
-   metric_events[*ind] = NULL;
-   return true;
-   }
-   return false;
-}
-
 static struct evsel *find_evsel_group(struct evlist *perf_evlist,
  const char **ids,
  int idnum,
  struct evsel **metric_events)
 {
-   struct evsel *ev, *start = NULL;
-   int ind = 0;
+   struct evsel *ev;
+   int i = 0;
+   bool leader_found;
 
evlist__for_each_entry (perf_evlist, ev) {
- 

[tip: perf/core] perf metricgroup: Scale the metric result

2019-09-02 Thread tip-bot2 for Jin Yao
The following commit has been merged into the perf/core branch of tip:

Commit-ID: 287f2649f791819dd2d8f32f0213c8c521d6dfa0
Gitweb: https://git.kernel.org/tip/287f2649f791819dd2d8f32f0213c8c521d6dfa0
Author: Jin Yao
AuthorDate: Wed, 28 Aug 2019 13:59:31 +08:00
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Sat, 31 Aug 2019 22:27:52 -03:00

perf metricgroup: Scale the metric result

Some metrics define the scale unit, such as

{
"BriefDescription": "Intel Optane DC persistent memory read latency (ns). Derived from unc_m_pmm_rpq_occupancy.all",
"Counter": "0,1,2,3",
"EventCode": "0xE0",
"EventName": "UNC_M_PMM_READ_LATENCY",
"MetricExpr": "UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS",
"MetricName": "UNC_M_PMM_READ_LATENCY",
"PerPkg": "1",
"ScaleUnit": "60ns",
"UMask": "0x1",
"Unit": "iMC"
},

For the above example, the ratio should be:

ratio = (UNC_M_PMM_RPQ_OCCUPANCY.ALL / UNC_M_PMM_RPQ_INSERTS / UNC_M_CLOCKTICKS) * 60

But the current code does not scale the ratio (the '* 60' is missing).

With this patch, the ratio is scaled and the unit (ns) is printed.

For example,
  #219.4 ns  UNC_M_PMM_READ_LATENCY
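As a worked illustration of the scaling step, here is a minimal sketch.
The helper name 'scaled_metric' and its parameters are hypothetical and
do not appear in the patch; it only shows the expression result being
multiplied by the numeric part of ScaleUnit.

```c
#include <assert.h>

/*
 * Hypothetical helper: evaluate an expression of the form
 * occupancy / inserts / clockticks and apply the ScaleUnit multiplier.
 */
static double scaled_metric(double occupancy, double inserts,
			    double clockticks, double scale)
{
	assert(inserts != 0.0 && clockticks != 0.0);
	return occupancy / inserts / clockticks * scale;
}
```

With made-up inputs, scaled_metric(120.0, 4.0, 3.0, 60.0) evaluates
120 / 4 / 3 = 10 and then scales it to 600. Without the scale factor
the printed value would be 10.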

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lore.kernel.org/lkml/20190828055932.8269-4-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/metricgroup.c |  3 +++-
 tools/perf/util/metricgroup.h |  1 +-
 tools/perf/util/stat-shadow.c | 38 --
 3 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 33f5e21..f474a29 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -87,6 +87,7 @@ struct egroup {
const char **ids;
const char *metric_name;
const char *metric_expr;
+   const char *metric_unit;
 };
 
 static bool record_evsel(int *ind, struct evsel **start,
@@ -182,6 +183,7 @@ static int metricgroup__setup_events(struct list_head *groups,
}
expr->metric_expr = eg->metric_expr;
expr->metric_name = eg->metric_name;
+   expr->metric_unit = eg->metric_unit;
expr->metric_events = metric_events;
list_add(&expr->nd, &me->head);
}
@@ -453,6 +455,7 @@ static int metricgroup__add_metric(const char *metric, struct strbuf *events,
eg->idnum = idnum;
eg->metric_name = pe->metric_name;
eg->metric_expr = pe->metric_expr;
+   eg->metric_unit = pe->unit;
list_add_tail(&eg->nd, group_list);
ret = 0;
}
diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
index e5092f6..475c7f9 100644
--- a/tools/perf/util/metricgroup.h
+++ b/tools/perf/util/metricgroup.h
@@ -20,6 +20,7 @@ struct metric_expr {
struct list_head nd;
const char *metric_expr;
const char *metric_name;
+   const char *metric_unit;
struct evsel **metric_events;
 };
 
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 2ed5e00..696d263 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -715,6 +715,7 @@ static void generic_metric(struct perf_stat_config *config,
   struct evsel **metric_events,
   char *name,
   const char *metric_name,
+  const char *metric_unit,
   double avg,
   int cpu,
   struct perf_stat_output_ctx *out,
@@ -722,7 +723,7 @@ static void generic_metric(struct perf_stat_config *config,
 {
print_metric_t print_metric = out->print_metric;
struct parse_ctx pctx;
-   double ratio;
+   double ratio, scale;
int i;
void *ctxp = out->ctx;
char *n, *pn;
@@ -732,7 +733,6 @@ static void generic_metric(struct perf_stat_config *config,
for (i = 0; metric_events[i]; i++) {
struct saved_value *v;
struct stats *stats;
-   double scale;
 
if (!strcmp(metric_events[i]->name, "duration_time")) {
stats = &walltime_nsecs_stats;
@@ -762,16 +762,32 @@ static void generic_metric(struct perf_stat_config *config,
if (!metric_events[i]) {
const char *p = metric_expr;
 
-   if (expr__parse(&ratio, &pctx, &p) == 0)
-   print_metric(config, ctxp, NULL, "%8.1f",
-   metric_name ?
-   metric_name :
-   out->force_header ?  name : "",
-   

[tip: perf/core] perf pmu: Change convert_scale from static to global

2019-09-02 Thread tip-bot2 for Jin Yao
The following commit has been merged into the perf/core branch of tip:

Commit-ID: a55ab7c4ca6986a542d313b02043a39ebf712a39
Gitweb: https://git.kernel.org/tip/a55ab7c4ca6986a542d313b02043a39ebf712a39
Author: Jin Yao
AuthorDate: Wed, 28 Aug 2019 13:59:29 +08:00
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Sat, 31 Aug 2019 22:27:51 -03:00

perf pmu: Change convert_scale from static to global

The function convert_scale() can be used to convert string to unit and
scale. For example,

  s = "60ns";
  convert_scale(s, &unit, &scale);

unit = "ns", scale = 60.

Currently this function is static. This patch renames the function to
perf_pmu__convert_scale and changes the function to global.  No
functional change.
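The conversion described above can be sketched with strtod(). This is
an illustration, not the perf implementation: 'sketch_convert_scale' is
a hypothetical name, it uses 'const char **end' rather than the
'char **end' of the real prototype, and it ignores locale concerns that
real code may need to handle.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for perf_pmu__convert_scale(): parse the leading
 * number of a string such as "60ns" into *sval and point *end at the
 * unit suffix. Returns 0 on success, -1 if no number could be parsed.
 */
static int sketch_convert_scale(const char *scale, const char **end,
				double *sval)
{
	char *rest;

	assert(scale != NULL && sval != NULL);
	errno = 0;
	*sval = strtod(scale, &rest);
	if (rest == scale || errno == ERANGE)
		return -1;
	if (end)
		*end = rest;
	return 0;
}
```

Passing NULL for 'end' is allowed in this sketch when the caller only
wants the number, which mirrors how one of the call sites in the diff
passes NULL.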

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lore.kernel.org/lkml/20190828055932.8269-2-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/pmu.c | 6 +++---
 tools/perf/util/pmu.h | 2 ++
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 6b3448f..fb597fa 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -102,7 +102,7 @@ static int pmu_format(const char *name, struct list_head *format)
return 0;
 }
 
-static int convert_scale(const char *scale, char **end, double *sval)
+int perf_pmu__convert_scale(const char *scale, char **end, double *sval)
 {
char *lc;
int ret = 0;
@@ -165,7 +165,7 @@ static int perf_pmu__parse_scale(struct perf_pmu_alias *alias, char *dir, char *
else
scale[sret] = '\0';
 
-   ret = convert_scale(scale, NULL, &alias->scale);
+   ret = perf_pmu__convert_scale(scale, NULL, &alias->scale);
 error:
close(fd);
return ret;
@@ -373,7 +373,7 @@ static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
desc ? strdup(desc) : NULL;
alias->topic = topic ? strdup(topic) : NULL;
if (unit) {
-   if (convert_scale(unit, &unit, &alias->scale) < 0)
+   if (perf_pmu__convert_scale(unit, &unit, &alias->scale) < 0)
return -1;
snprintf(alias->unit, sizeof(alias->unit), "%s", unit);
}
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 3f8b79b..f36ade6 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -96,4 +96,6 @@ struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu);
 
 struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu);
 
+int perf_pmu__convert_scale(const char *scale, char **end, double *sval);
+
 #endif /* __PMU_H */


[tip: perf/core] perf diff: Report noisy for cycles diff

2019-10-14 Thread tip-bot2 for Jin Yao
The following commit has been merged into the perf/core branch of tip:

Commit-ID: cebf7d51a6c3babc4d0589da7aec0de1af0a5691
Gitweb: https://git.kernel.org/tip/cebf7d51a6c3babc4d0589da7aec0de1af0a5691
Author: Jin Yao
AuthorDate: Wed, 25 Sep 2019 09:14:46 +08:00
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Fri, 11 Oct 2019 10:57:00 -03:00

perf diff: Report noisy for cycles diff

This patch prints the stddev and a histogram for the cycles diff of
each program block. This helps us understand whether the cycles are
noisy.

This patch is inspired by Andi Kleen's patch:

  https://lwn.net/Articles/600471/

We add a new option '--cycles-hist'.
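To make the notion of "noisy" concrete, here is a generic sketch of
accumulating a mean and variance over cycle samples with Welford's
method. It is an assumption that something along these lines feeds the
stddev column; the struct and function names are hypothetical and are
not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical running-statistics accumulator (Welford's method). */
struct sketch_stats {
	double n;
	double mean;
	double m2;	/* sum of squared deviations from the mean */
};

/* Add one sample and update the running mean and squared deviations. */
static void sketch_update(struct sketch_stats *s, double sample)
{
	double delta;

	assert(s != NULL);
	s->n += 1.0;
	delta = sample - s->mean;
	s->mean += delta / s->n;
	s->m2 += delta * (sample - s->mean);
}

/* Sample variance; the stddev is its square root. */
static double sketch_variance(const struct sketch_stats *s)
{
	assert(s != NULL);
	return s->n > 1.0 ? s->m2 / (s->n - 1.0) : 0.0;
}
```

A relative figure such as the "± 37.8%" shown in the output would then
be the stddev divided by the mean, expressed as a percentage.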

Example:

  perf record -b ./div
  perf record -b ./div
  perf diff -c cycles

  # Baseline                                       [Program Block Range]  Cycles Diff  Shared Object      Symbol
  # ........  ..........................................................  ...........  .................  ..........................
  #
      46.72%                                      [div.c:40 -> div.c:40]            0  div                [.] main
      46.72%                                      [div.c:42 -> div.c:44]            0  div                [.] main
      46.72%                                      [div.c:42 -> div.c:39]            0  div                [.] main
      20.54%                          [random_r.c:357 -> random_r.c:394]            1  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -> random_r.c:380]            0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:388]            0  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:391]            0  libc-2.27.so       [.] __random_r
      17.04%                              [random.c:288 -> random.c:291]            0  libc-2.27.so       [.] __random
      17.04%                              [random.c:291 -> random.c:291]            0  libc-2.27.so       [.] __random
      17.04%                              [random.c:293 -> random.c:293]            0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]            0  libc-2.27.so       [.] __random
      17.04%                              [random.c:295 -> random.c:295]            0  libc-2.27.so       [.] __random
      17.04%                              [random.c:298 -> random.c:298]            0  libc-2.27.so       [.] __random
       8.40%                                      [div.c:22 -> div.c:25]            0  div                [.] compute_flag
       8.40%                                      [div.c:27 -> div.c:28]            0  div                [.] compute_flag
       5.14%                                    [rand.c:26 -> rand.c:27]            0  libc-2.27.so       [.] rand
       5.14%                                    [rand.c:28 -> rand.c:28]            0  libc-2.27.so       [.] rand
       2.15%                                  [rand@plt+0 -> rand@plt+0]            0  div                [.] rand@plt
       0.00%                                                                           [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
       0.00%                                [do_mmap+714 -> do_mmap+732]          -10  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+737 -> do_mmap+765]            1  [kernel.kallsyms]  [k] do_mmap
       0.00%                                [do_mmap+262 -> do_mmap+299]            0  [kernel.kallsyms]  [k] do_mmap
       0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]            7  [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
       0.00%            [native_sched_clock+0 -> native_sched_clock+119]           -1  [kernel.kallsyms]  [k] native_sched_clock
       0.00%                 [native_write_msr+0 -> native_write_msr+16]          -13  [kernel.kallsyms]  [k] native_write_msr

When we enable the option '--cycles-hist', the output is

  perf diff -c cycles --cycles-hist

  # Baseline                                       [Program Block Range]  Cycles Diff  stddev/Hist       Shared Object      Symbol
  # ........  ..........................................................  ...........  ................  .................  ..........................
  #
      46.72%                                      [div.c:40 -> div.c:40]            0  ± 37.8% ▁█▁▁██▁█  div                [.] main
      46.72%                                      [div.c:42 -> div.c:44]            0  ± 49.4% ▁▁▂█      div                [.] main
      46.72%                                      [div.c:42 -> div.c:39]            0  ± 24.1% ▃█▂▄▁▃▂▁  div                [.] main
      20.54%                          [random_r.c:357 -> random_r.c:394]            1  ± 33.5% ▅▂▁█▃▁▂▁  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:357 -> random_r.c:380]            0  ± 39.4% ▁▁█▁██▅▁  libc-2.27.so       [.] __random_r
      20.54%                          [random_r.c:388 -> random_r.c:388]            0                    libc-2.27.so       [.] __random_r
  20.54% 

[tip: perf/core] perf report: Add warning when libunwind not compiled in

2019-10-21 Thread tip-bot2 for Jin Yao
The following commit has been merged into the perf/core branch of tip:

Commit-ID: 800d3f561659b5436f8c57e7c26dd1f6928b5615
Gitweb: https://git.kernel.org/tip/800d3f561659b5436f8c57e7c26dd1f6928b5615
Author: Jin Yao
AuthorDate: Fri, 11 Oct 2019 10:21:22 +08:00
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Tue, 15 Oct 2019 08:36:22 -03:00

perf report: Add warning when libunwind not compiled in

We received a user report that call-graph DWARF mode was enabled in
'perf record' but 'perf report' didn't unwind the callstack correctly.
The reason was that libunwind was not compiled in.

We can use 'perf -vv' to check the compiled libraries, but it would be
valuable to warn the user directly (especially a perf newbie).

The warning is:

Warning:
Please install libunwind development packages during the perf build.

Both TUI and stdio are supported.

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lore.kernel.org/lkml/20191011022122.26369-1-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index aae0e57..7accaf8 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -399,6 +399,13 @@ static int report__setup_sample_type(struct report *rep)
PERF_SAMPLE_BRANCH_ANY))
rep->nonany_branch_mode = true;
 
+#ifndef HAVE_LIBUNWIND_SUPPORT
+   if (dwarf_callchain_users) {
+   ui__warning("Please install libunwind development packages "
+   "during the perf build.\n");
+   }
+#endif
+
return 0;
 }
 


[tip: perf/core] perf stat: Support --all-kernel/--all-user

2019-10-21 Thread tip-bot2 for Jin Yao
The following commit has been merged into the perf/core branch of tip:

Commit-ID: dd071024bf52156eed31deaf511c6e7a82a6f57b
Gitweb: https://git.kernel.org/tip/dd071024bf52156eed31deaf511c6e7a82a6f57b
Author: Jin Yao
AuthorDate: Fri, 11 Oct 2019 13:05:45 +08:00
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Tue, 15 Oct 2019 08:39:42 -03:00

perf stat: Support --all-kernel/--all-user

'perf record' already supports --all-kernel / --all-user to configure
all used events to run in kernel space or in user space, but 'perf stat'
doesn't support these options.

It would be useful to support these options in 'perf stat' too to keep
the same semantics available in both tools.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lore.kernel.org/lkml/20191011050545.3899-1-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-stat.txt |  6 ++
 tools/perf/builtin-stat.c  |  6 ++
 tools/perf/util/stat.c | 10 ++
 tools/perf/util/stat.h |  2 ++
 4 files changed, 24 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 930c51c..a9af4e4 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -323,6 +323,12 @@ The output is SMI cycles%, equals to (aperf - unhalted core cycles) / aperf
 
 Users who wants to get the actual value can apply --no-metric-only.
 
+--all-kernel::
+Configure all used events to run in kernel space.
+
+--all-user::
+Configure all used events to run in user space.
+
 EXAMPLES
 
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 468fc49..c88d4e1 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -803,6 +803,12 @@ static struct option stat_options[] = {
OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
 "monitor specified metrics or metric groups (separated by ,)",
 parse_metric_groups),
+   OPT_BOOLEAN_FLAG(0, "all-kernel", &stat_config.all_kernel,
+"Configure all used events to run in kernel space.",
+PARSE_OPT_EXCLUSIVE),
+   OPT_BOOLEAN_FLAG(0, "all-user", &stat_config.all_user,
+"Configure all used events to run in user space.",
+PARSE_OPT_EXCLUSIVE),
OPT_END()
 };
 
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ebdd130..6822e4f 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -490,6 +490,16 @@ int create_perf_stat_counter(struct evsel *evsel,
if (config->identifier)
attr->sample_type = PERF_SAMPLE_IDENTIFIER;
 
+   if (config->all_user) {
+   attr->exclude_kernel = 1;
+   attr->exclude_user   = 0;
+   }
+
+   if (config->all_kernel) {
+   attr->exclude_kernel = 0;
+   attr->exclude_user   = 1;
+   }
+
/*
 * Disabling all counters initially, they will be enabled
 * either manually by us or by kernel via enable_on_exec
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index edbeb2f..081c4a5 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -106,6 +106,8 @@ struct perf_stat_config {
bool big_num;
bool no_merge;
bool walltime_run_table;
+   bool all_kernel;
+   bool all_user;
FILE *output;
unsigned int interval;
unsigned int timeout;


[tip: perf/core] perf list: Hide deprecated events by default

2019-10-21 Thread tip-bot2 for Jin Yao
The following commit has been merged into the perf/core branch of tip:

Commit-ID: a7f6c8c81afdd6d24eb12558f2fb66901207d349
Gitweb: https://git.kernel.org/tip/a7f6c8c81afdd6d24eb12558f2fb66901207d349
Author: Jin Yao
AuthorDate: Tue, 15 Oct 2019 10:53:57 +08:00
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Sat, 19 Oct 2019 15:35:01 -03:00

perf list: Hide deprecated events by default

perf list shows some deprecated events. We can't simply remove them
from perf list because some old scripts may still use them.

Deprecated events are old names of renamed events.  When an event gets
renamed the old name is kept around for some time and marked with
Deprecated. The newer Intel event lists in the tree already have these
headers.

So we need to keep them in the event list, but provide a new option to
show them. The new option is "--deprecated".

With this patch, the deprecated events are hidden by default but they
can be displayed when option "--deprecated" is enabled.
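The hide-by-default behaviour can be pictured with a small sketch. The
struct, the event names used with it, and the function 'count_listed'
are hypothetical and only illustrate the filtering rule, not the actual
perf list code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified event alias entry. */
struct sketch_alias {
	const char *name;
	bool deprecated;
};

/*
 * Count the entries that would be listed. Deprecated entries are shown
 * only when show_deprecated is set (the "--deprecated" case).
 */
static size_t count_listed(const struct sketch_alias *aliases, size_t n,
			   bool show_deprecated)
{
	size_t shown = 0;

	assert(aliases != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (aliases[i].deprecated && !show_deprecated)
			continue;
		shown++;
	}
	return shown;
}
```

The deprecated entries stay in the table either way, which is what
keeps old scripts that reference them working.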

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lore.kernel.org/lkml/20191015025357.8708-1-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-list.txt |  3 +++-
 tools/perf/builtin-list.c  | 14 +
 tools/perf/pmu-events/jevents.c| 26 +++--
 tools/perf/pmu-events/jevents.h|  3 ++-
 tools/perf/pmu-events/pmu-events.h |  1 +-
 tools/perf/util/parse-events.c |  4 ++--
 tools/perf/util/parse-events.h |  2 +-
 tools/perf/util/pmu.c  | 17 
 tools/perf/util/pmu.h  |  4 +++-
 9 files changed, 55 insertions(+), 19 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 18ed1b0..6345db3 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -36,6 +36,9 @@ Enable debugging output.
 Print how named events are resolved internally into perf events, and also
 any extra expressions computed by perf stat.
 
+--deprecated::
+Print deprecated events. By default the deprecated events are hidden.
+
 [[EVENT_MODIFIERS]]
 EVENT MODIFIERS
 ---
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 08e62ae..965ef01 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -26,6 +26,7 @@ int cmd_list(int argc, const char **argv)
int i;
bool raw_dump = false;
bool long_desc_flag = false;
+   bool deprecated = false;
struct option list_options[] = {
OPT_BOOLEAN(0, "raw-dump", &raw_dump, "Dump raw events"),
OPT_BOOLEAN('d', "desc", &desc_flag,
@@ -34,6 +35,8 @@ int cmd_list(int argc, const char **argv)
"Print longer event descriptions."),
OPT_BOOLEAN(0, "details", &details_flag,
"Print information on the perf event names and expressions used internally by events."),
+   OPT_BOOLEAN(0, "deprecated", &deprecated,
+   "Print deprecated events."),
OPT_INCR(0, "debug", &verbose,
 "Enable debugging output"),
OPT_END()
@@ -55,7 +58,7 @@ int cmd_list(int argc, const char **argv)
 
if (argc == 0) {
print_events(NULL, raw_dump, !desc_flag, long_desc_flag,
-   details_flag);
+   details_flag, deprecated);
return 0;
}
 
@@ -78,7 +81,8 @@ int cmd_list(int argc, const char **argv)
print_hwcache_events(NULL, raw_dump);
else if (strcmp(argv[i], "pmu") == 0)
print_pmu_events(NULL, raw_dump, !desc_flag,
-   long_desc_flag, details_flag);
+   long_desc_flag, details_flag,
+   deprecated);
else if (strcmp(argv[i], "sdt") == 0)
print_sdt_events(NULL, NULL, raw_dump);
else if (strcmp(argv[i], "metric") == 0 || strcmp(argv[i], "metrics") == 0)
@@ -91,7 +95,8 @@ int cmd_list(int argc, const char **argv)
if (sep == NULL) {
print_events(argv[i], raw_dump, !desc_flag,
long_desc_flag,
-   details_flag);
+   details_flag,
+   deprecated);
continue;
}
sep_idx = sep - argv[i];
@@ -117,7 +122,8 @@ int cmd_list(int argc, co

[tip: perf/core] perf stat: Zero all the 'ena' and 'run' array slot stats for interval mode

2020-05-08 Thread tip-bot2 for Jin Yao
The following commit has been merged into the perf/core branch of tip:

Commit-ID: 0e0bf1ea1147fcf74eab19c2d3c853cc3740a72f
Gitweb: https://git.kernel.org/tip/0e0bf1ea1147fcf74eab19c2d3c853cc3740a72f
Author: Jin Yao
AuthorDate: Thu, 09 Apr 2020 15:07:55 +08:00
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Wed, 22 Apr 2020 15:51:01 -03:00

perf stat: Zero all the 'ena' and 'run' array slot stats for interval mode

As the code comments in perf_stat_process_counter() say, we calculate
counter's data every interval, and the display code shows ps->res_stats
avg value. We need to zero the stats for interval mode.

But the current code only zeroes res_stats[0]; it doesn't zero
res_stats[1] and res_stats[2], which are for ena and run of counter.

This patch zeros the whole res_stats[] for interval mode.
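The difference between the old and new reset can be shown in a
stand-alone sketch. 'struct sketch_res_stat' and the helper names are
hypothetical; the three-slot layout follows the description above.

```c
#include <assert.h>
#include <stddef.h>

#define NR_RES_STATS 3	/* three slots, as in the commit message */

/* Hypothetical, simplified per-slot statistics. */
struct sketch_res_stat {
	double n;
	double mean;
};

static void sketch_init(struct sketch_res_stat *s)
{
	s->n = 0.0;
	s->mean = 0.0;
}

/* Old behaviour: the array decays to a pointer, so only slot 0 is reset. */
static void reset_first_slot_only(struct sketch_res_stat *res)
{
	assert(res != NULL);
	sketch_init(res);
}

/* Fixed behaviour: every slot is reset. */
static void reset_all_slots(struct sketch_res_stat *res)
{
	assert(res != NULL);
	for (size_t i = 0; i < NR_RES_STATS; i++)
		sketch_init(&res[i]);
}
```

Calling reset_first_slot_only() on an array leaves slots 1 and 2
holding values from earlier intervals, which is the situation the
patch removes.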

Fixes: 51fd2df1e882 ("perf stat: Fix interval output values")
Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lore.kernel.org/lkml/20200409070755.17261-1-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/stat.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 5f26137..242476e 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -368,8 +368,10 @@ int perf_stat_process_counter(struct perf_stat_config *config,
 * interval mode, otherwise overall avg running
 * averages will be shown for each interval.
 */
-   if (config->interval)
-   init_stats(ps->res_stats);
+   if (config->interval) {
+   for (i = 0; i < 3; i++)
+   init_stats(&ps->res_stats[i]);
+   }
 
if (counter->per_pkg)
zero_per_pkg(counter);


[tip: perf/core] perf stat: Improve runtime stat for interval mode

2020-05-08 Thread tip-bot2 for Jin Yao
The following commit has been merged into the perf/core branch of tip:

Commit-ID: 197ba86fdc888dc0d3d6b89b402c9c6851d4c6fb
Gitweb: https://git.kernel.org/tip/197ba86fdc888dc0d3d6b89b402c9c6851d4c6fb
Author: Jin Yao
AuthorDate: Mon, 20 Apr 2020 22:54:17 +08:00
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Thu, 23 Apr 2020 11:03:46 -03:00

perf stat: Improve runtime stat for interval mode

In interval mode, the metric, if one exists, is printed after the '#'
character. However, it is not calculated from the counts generated in
the current interval.

See the following examples:

  root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
  #       time       counts unit events
   1.000422803      764,809      inst_retired.any  #  2.9 CPI
   1.000422803    2,234,932      cycles
   2.001464585    1,960,061      inst_retired.any  #  1.6 CPI
   2.001464585    4,022,591      cycles

The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)

  root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
  #       time       counts unit events
   1.000429493    2,869,311      cycles
   1.000429493      816,875      instructions      #  0.28  insn per cycle
   2.001516426    9,260,973      cycles
   2.001516426    5,250,634      instructions      #  0.87  insn per cycle

The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is
0.57).

The current code uses a global variable 'rt_stat' for tracking and
updating the std dev of the runtime stat. The counts are reset for each
interval, as shown below, but 'rt_stat' is not.

  perf_stat_process_counter()
  {
  if (config->interval)
  init_stats(ps->res_stats);
  }

So for interval mode, the 'rt_stat' variable should be reset too.

This patch resets 'rt_stat' before read_counters(), so the runtime stat
is calculated only from the counts generated in this interval.
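The effect of the missing reset can be sketched with a simple
accumulator. This assumes the runtime stat behaves like a running
average across updates; 'struct sketch_runtime' and its helpers are
hypothetical, and the counts used with them are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator standing in for one runtime stat in 'rt_stat'. */
struct sketch_runtime {
	double sum;
	double n;
};

/* Record the count observed in one interval. */
static void sketch_runtime_update(struct sketch_runtime *r, double count)
{
	assert(r != NULL);
	r->sum += count;
	r->n += 1.0;
}

/* Average over everything recorded since the last reset. */
static double sketch_runtime_avg(const struct sketch_runtime *r)
{
	assert(r != NULL);
	return r->n > 0.0 ? r->sum / r->n : 0.0;
}

/* What the patch adds, conceptually: forget earlier intervals. */
static void sketch_runtime_reset(struct sketch_runtime *r)
{
	assert(r != NULL);
	r->sum = 0.0;
	r->n = 0.0;
}
```

With counts of 100 in the first interval and 300 in the second, the
average reads 200 if the accumulator is never reset, but 300 if it is
reset before the second interval is recorded. Any metric derived from
the stale average is therefore skewed by earlier intervals.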

With this patch:

  root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
  #       time       counts unit events
   1.000420924    2,408,818      inst_retired.any  #  2.1 CPI
   1.000420924    5,010,111      cycles
   2.001448579    2,798,407      inst_retired.any  #  1.6 CPI
   2.001448579    4,599,861      cycles

  root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
  #       time       counts unit events
   1.000428555    2,769,714      cycles
   1.000428555      774,462      instructions      #  0.28  insn per cycle
   2.001471562    3,595,904      cycles
   2.001471562    1,243,703      instructions      #  0.35  insn per cycle

Now the second 'insn per cycle' and CPI are calculated by the counts
generated in this interval.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-By: Kajol Jain 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lore.kernel.org/lkml/20200420145417.6864-1-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-stat.txt | 2 ++
 tools/perf/builtin-stat.c  | 1 +
 2 files changed, 3 insertions(+)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4d56586..3fb5028 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -176,6 +176,8 @@ Print count deltas every N milliseconds (minimum: 1ms)
 The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals.  Use with caution.
example: 'perf stat -I 1000 -e cycles -a sleep 5'
 
+If the metric exists, it is calculated by the counts generated in this interval and the metric is printed after #.
+
 --interval-count times::
 Print count deltas for fixed number of times.
 This option should be used together with "-I" option.
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 9207b6c..3f050d8 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -359,6 +359,7 @@ static void process_interval(void)
clock_gettime(CLOCK_MONOTONIC, &ts);
diff_timespec(&rs, &ts, &ref_time);
 
+   perf_stat__reset_shadow_per_stat(&rt_stat);
read_counters(&rs);
 
if (STAT_RECORD) {