[tip:perf/urgent] perf pmu-events: Fix missing "cpu_clk_unhalted.core" event

2019-08-08 Thread tip-bot for Jin Yao
Commit-ID:  8e6e5bea2e34c61291d00cb3f47560341aa84bc3
Gitweb: https://git.kernel.org/tip/8e6e5bea2e34c61291d00cb3f47560341aa84bc3
Author: Jin Yao 
AuthorDate: Mon, 29 Jul 2019 15:27:55 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 8 Aug 2019 15:41:37 -0300

perf pmu-events: Fix missing "cpu_clk_unhalted.core" event

The events defined in the pmu-events JSON files are parsed and added to the
perf tool. For fixed counters, the encodings are translated between JSON
and perf through a static array, fixed[].

But fixed[] is missing an important event, "cpu_clk_unhalted.core".

For example, on the Tremont platform,

  [root@localhost ~]# perf stat -e cpu_clk_unhalted.core -a
  event syntax error: 'cpu_clk_unhalted.core'
   \___ parser error

With this patch, the event cpu_clk_unhalted.core can be parsed.

  [root@localhost perf]# ./perf stat -e cpu_clk_unhalted.core -a -vvv
  
  perf_event_attr:
type 4
size 112
config   0x3c
sample_type  IDENTIFIER
read_format  TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit  1
exclude_guest1
  
...
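
As a rough illustration of the table-driven translation described above, the sketch below looks up a fixed-counter name and returns its perf encoding. The table holds only the entries visible in the hunk below; `struct fixed_event`, `fixed_table` and `lookup_fixed()` are hypothetical names for this example, and the case-insensitive comparison is an assumption, not a claim about jevents.c.

```c
#include <stddef.h>
#include <strings.h>

/* Illustrative table: only the entries visible in the hunk, not the full jevents.c array. */
struct fixed_event {
	const char *name;
	const char *event;
};

static const struct fixed_event fixed_table[] = {
	{ "inst_retired.any_p", "event=0xc0" },
	{ "cpu_clk_unhalted.ref", "event=0x0,umask=0x03" },
	{ "cpu_clk_unhalted.thread", "event=0x3c" },
	{ "cpu_clk_unhalted.core", "event=0x3c" },
	{ "cpu_clk_unhalted.thread_any", "event=0x3c,any=1" },
	{ NULL, NULL },
};

/* Hypothetical helper: return the perf encoding for a name, or NULL if it is not listed. */
static const char *lookup_fixed(const char *name)
{
	for (size_t i = 0; fixed_table[i].name; i++) {
		/* Assumption: names are matched without regard to case. */
		if (!strcasecmp(name, fixed_table[i].name))
			return fixed_table[i].event;
	}
	return NULL;
}
```

Without the "cpu_clk_unhalted.core" row, the lookup returns NULL for that name, which corresponds to the parser error shown earlier.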

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/20190729072755.2166-1-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/pmu-events/jevents.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index 1a91a197cafb..d413761621b0 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -453,6 +453,7 @@ static struct fixed {
{ "inst_retired.any_p", "event=0xc0" },
{ "cpu_clk_unhalted.ref", "event=0x0,umask=0x03" },
{ "cpu_clk_unhalted.thread", "event=0x3c" },
+   { "cpu_clk_unhalted.core", "event=0x3c" },
{ "cpu_clk_unhalted.thread_any", "event=0x3c,any=1" },
{ NULL, NULL},
 };


[tip:perf/core] perf diff: Documentation -c cycles option

2019-07-03 Thread tip-bot for Jin Yao
Commit-ID:  c8f7bc1a080b081a178bff20356cb7575d385f84
Gitweb: https://git.kernel.org/tip/c8f7bc1a080b081a178bff20356cb7575d385f84
Author: Jin Yao 
AuthorDate: Fri, 28 Jun 2019 17:23:04 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 2 Jul 2019 13:20:51 -0300

perf diff: Documentation -c cycles option

Document the new computation selection 'cycles'.

 v4:
 ---
 Change the column 'Block cycles diff [start:end]' to
 '[Program Block Range] Cycles Diff'

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1561713784-30533-8-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-diff.txt | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index facd91e4e945..d5cc15e651cf 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -90,9 +90,10 @@ OPTIONS
 
 -c::
 --compute::
-Differential computation selection - delta, ratio, wdiff, delta-abs
-(default is delta-abs).  Default can be changed using diff.compute
-config option.  See COMPARISON METHODS section for more info.
+Differential computation selection - delta, ratio, wdiff, cycles,
+delta-abs (default is delta-abs).  Default can be changed using
+diff.compute config option.  See COMPARISON METHODS section for
+more info.
 
 -p::
 --period::
@@ -280,6 +281,16 @@ If specified the 'Weighted diff' column is displayed with value 'd' computed as:
 - WEIGHT-A being the weight of the data file
 - WEIGHT-B being the weight of the baseline data file
 
+cycles
+~~
+If specified the '[Program Block Range] Cycles Diff' column is displayed.
+It displays the cycles difference of same program basic block amongst
+two perf.data. The program basic block is the code between two branches.
+
+'[Program Block Range]' indicates the range of a program basic block.
+Source line is reported if it can be found otherwise uses symbol+offset
+instead.
+
 SEE ALSO
 
 linkperf:perf-record[1], linkperf:perf-report[1]


[tip:perf/core] perf diff: Print the basic block cycles diff

2019-07-03 Thread tip-bot for Jin Yao
Commit-ID:  b10c78c50964da952e6d4db78a3692ab051e6638
Gitweb: https://git.kernel.org/tip/b10c78c50964da952e6d4db78a3692ab051e6638
Author: Jin Yao 
AuthorDate: Fri, 28 Jun 2019 17:23:03 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 2 Jul 2019 13:20:51 -0300

perf diff: Print the basic block cycles diff

 $ perf record -b ./div
 $ perf record -b ./div

Following is the default perf diff output

 $ perf diff

 # Event 'cycles'
 #
 # Baseline  Delta Abs  Shared Object  Symbol
 # ........  .........  .............  ....................
 #
     48.75%     +0.33%  div            [.] main
      8.21%     -0.20%  div            [.] compute_flag
     19.02%     -0.12%  libc-2.23.so   [.] __random_r
     16.17%     -0.09%  libc-2.23.so   [.] __random
      2.27%     -0.03%  div            [.] rand@plt
                +0.02%  [i915]         [k] gen8_irq_handler
      5.52%     +0.02%  libc-2.23.so   [.] rand

This patch creates a new computation selection 'cycles'.

 $ perf diff -c cycles

 # Event 'cycles'
 #
 # Baseline  [Program Block Range] Cycles Diff        Shared Object  Symbol
 # ........  .......................................  .............  ...........................
 #
     48.75%  [div.c:42 -> div.c:45]              147  div            [.] main
     48.75%  [div.c:31 -> div.c:40]                4  div            [.] main
     48.75%  [div.c:40 -> div.c:40]                0  div            [.] main
     48.75%  [div.c:42 -> div.c:42]                0  div            [.] main
     48.75%  [div.c:42 -> div.c:44]                0  div            [.] main
     19.02%  [random_r.c:357 -> random_r.c:360]    0  libc-2.23.so   [.] __random_r
     19.02%  [random_r.c:357 -> random_r.c:373]    0  libc-2.23.so   [.] __random_r
     19.02%  [random_r.c:357 -> random_r.c:376]    0  libc-2.23.so   [.] __random_r
     19.02%  [random_r.c:357 -> random_r.c:380]    0  libc-2.23.so   [.] __random_r
     19.02%  [random_r.c:357 -> random_r.c:392]    0  libc-2.23.so   [.] __random_r
     16.17%  [random.c:288 -> random.c:291]        0  libc-2.23.so   [.] __random
     16.17%  [random.c:288 -> random.c:291]        0  libc-2.23.so   [.] __random
     16.17%  [random.c:288 -> random.c:295]        0  libc-2.23.so   [.] __random
     16.17%  [random.c:288 -> random.c:297]        0  libc-2.23.so   [.] __random
     16.17%  [random.c:291 -> random.c:291]        0  libc-2.23.so   [.] __random
     16.17%  [random.c:293 -> random.c:293]        0  libc-2.23.so   [.] __random
      8.21%  [div.c:22 -> div.c:22]              148  div            [.] compute_flag
      8.21%  [div.c:22 -> div.c:25]                0  div            [.] compute_flag
      8.21%  [div.c:27 -> div.c:28]                0  div            [.] compute_flag
      5.52%  [rand.c:26 -> rand.c:27]              0  libc-2.23.so   [.] rand
      5.52%  [rand.c:26 -> rand.c:28]              0  libc-2.23.so   [.] rand
      2.27%  [rand@plt+0 -> rand@plt+0]            0  div            [.] rand@plt
      0.01%  [entry_64.S:694 -> entry_64.S:694]   16  [vmlinux]      [k] native_irq_return_iret
      0.00%  [fair.c:7676 -> fair.c:7665]        162  [vmlinux]      [k] update_blocked_averages

"[Program Block Range]" indicates the range of program basic block
(start -> end). If we can find the source line it prints the source line
otherwise it prints the symbol+offset instead.
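
The fallback from source line to symbol+offset can be sketched as below. This is a hypothetical illustration, not the code from this patch: `format_block_end()` and `format_block_range()` are made-up names, and printing the offset in decimal is an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: describe one end of a basic block.
 * Uses the source line when it is known, otherwise symbol+offset.
 */
static int format_block_end(char *buf, size_t len, const char *srcline,
			    const char *sym, unsigned long offset)
{
	if (srcline)
		return snprintf(buf, len, "%s", srcline);

	/* Assumption: the offset is printed in decimal. */
	return snprintf(buf, len, "%s+%lu", sym, offset);
}

/* Hypothetical helper: join both ends into a "[start -> end]" label. */
static int format_block_range(char *buf, size_t len,
			      const char *start, const char *end)
{
	return snprintf(buf, len, "[%s -> %s]", start, end);
}
```

With a source line available this yields a label such as "[div.c:42 -> div.c:45]"; without one it yields a label such as "[rand@plt+0 -> rand@plt+0]".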

 v4:
 ---
 Use source lines or symbol+offset to indicate the basic block. It should
 be easier to understand.

 v3:
 ---
 Cast 'struct hist_entry' to 'struct block_hist' in hist_entry__block_fprintf.
 Use symbol_conf.report_block to check if executing hist_entry__block_fprintf.

 v2:
 ---
 Keep standard perf diff format and display the 'Baseline' and
 'Shared Object'.

The output is sorted by "Baseline" and the basic blocks in the same
function are sorted by cycles diff.
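
A minimal sketch of ordering the blocks of one function by cycles diff follows. It assumes the ordering is by the magnitude of the diff, largest first, which matches the sample output above; `struct block_row` and both functions are hypothetical and are not the perf implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical row type for illustration; not a perf structure. */
struct block_row {
	const char *range;
	long cycles_diff;
};

/* Order rows by the absolute value of the cycles diff, largest first. */
static int block_row_cmp(const void *a, const void *b)
{
	long l = labs(((const struct block_row *)a)->cycles_diff);
	long r = labs(((const struct block_row *)b)->cycles_diff);

	return (r > l) - (r < l);
}

static void sort_block_rows(struct block_row *rows, size_t n)
{
	qsort(rows, n, sizeof(*rows), block_row_cmp);
}
```

Comparing with `(r > l) - (r < l)` rather than `r - l` avoids truncating a long difference into the int return value.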

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1561713784-30533-7-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-diff.c | 80 ---
 tools/perf/ui/stdio/hist.c| 27 +++
 tools/perf/util/hist.c| 18 ++
 tools/perf/util/hist.h|  3 ++
 tools/perf/util/srcline.c |  4 ++-
 tools/perf/util/symbol_conf.h |  4 ++-
 6 files changed, 130 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index fafb7b3f58fb..f924b46910b5 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -21,6 +21,7 @@
 #include "util/config.h"
 #include "util/time-utils.h"
 #include "util/annotate.h"
+#include "util/map.h"
 
 #include 
 #include 
@@ -46,6 +47,7 @@ enum {
PERF_HPP_DIFF__WEIGHTED_DIFF,
PERF_HPP_DIFF__FORMULA,
PERF_HPP_DIFF__DELTA_ABS,
+   PERF_HPP_DIFF__CYCLES,
 
PERF_HPP_DIFF__MAX_INDEX
 };
@@ -114,6 +116,7 @@ 

[tip:perf/core] perf diff: Link same basic blocks among different data

2019-07-03 Thread tip-bot for Jin Yao
Commit-ID:  f3810817b20645ffae809feb30e9fe260fbd6c4d
Gitweb: https://git.kernel.org/tip/f3810817b20645ffae809feb30e9fe260fbd6c4d
Author: Jin Yao 
AuthorDate: Fri, 28 Jun 2019 17:23:02 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 2 Jul 2019 13:20:15 -0300

perf diff: Link same basic blocks among different data

The goal is to compare the performance difference (cycles diff) of
the same basic blocks across different data files.

Two basic blocks are the same when they have the same function, the same
start address and the same end address. This patch finds the same basic
blocks in different data files, links them together and re-sorts them by
the cycles diff.
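
The matching rule and the diff computation can be sketched as follows. `struct block_sample` is a simplified stand-in for the block information, and both functions are hypothetical. The diff is the average cycles of the paired block minus the average cycles of the baseline block, as in compute_cycles_diff() in the hunk below; the check for a zero sample count is an addition made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the per-block data; not the perf structure. */
struct block_sample {
	const char *sym_name;
	uint64_t start;
	uint64_t end;
	uint64_t cycles_aggr;	/* total cycles over all samples */
	int num_aggr;		/* number of samples */
};

/* Same basic block: same function, same start and same end address. */
static bool same_block(const struct block_sample *a,
		       const struct block_sample *b)
{
	return a->sym_name && b->sym_name &&
	       !strcmp(a->sym_name, b->sym_name) &&
	       a->start == b->start && a->end == b->end;
}

/*
 * Store (average cycles of pair) - (average cycles of base) in *diff.
 * Returns false when the blocks differ or either side has no samples.
 */
static bool cycles_diff(const struct block_sample *base,
			const struct block_sample *pair, int64_t *diff)
{
	if (!same_block(base, pair) ||
	    base->num_aggr <= 0 || pair->num_aggr <= 0)
		return false;

	*diff = (int64_t)(pair->cycles_aggr / pair->num_aggr) -
		(int64_t)(base->cycles_aggr / base->num_aggr);
	return true;
}
```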

 v3:
 ---
 The block data is maintained by the new structure 'block_hist',
 so this patch is updated accordingly.

 v2:
 ---
 Since the basic block hists are now per symbol,
 the patch only links the basic block hists for the same
 symbol in different data files.

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1561713784-30533-6-git-send-email-yao@linux.intel.com
[ sym->name is an array, not a pointer, so no need to check it for NULL, fixes
the build in some distros ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-diff.c | 87 +++
 1 file changed, 87 insertions(+)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 83b8c0f3fb16..fafb7b3f58fb 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -641,6 +641,82 @@ static int process_block_per_sym(struct hist_entry *he)
return 0;
 }
 
+static int block_pair_cmp(struct hist_entry *a, struct hist_entry *b)
+{
+   struct block_info *bi_a = a->block_info;
+   struct block_info *bi_b = b->block_info;
+   int cmp;
+
+   if (!bi_a->sym || !bi_b->sym)
+   return -1;
+
+   cmp = strcmp(bi_a->sym->name, bi_b->sym->name);
+
+   if ((!cmp) && (bi_a->start == bi_b->start) && (bi_a->end == bi_b->end))
+   return 0;
+
+   return -1;
+}
+
+static struct hist_entry *get_block_pair(struct hist_entry *he,
+struct hists *hists_pair)
+{
+   struct rb_root_cached *root = hists_pair->entries_in;
+   struct rb_node *next = rb_first_cached(root);
+   int cmp;
+
+   while (next != NULL) {
+   struct hist_entry *he_pair = rb_entry(next, struct hist_entry,
+ rb_node_in);
+
+   next = rb_next(&he_pair->rb_node_in);
+
+   cmp = block_pair_cmp(he_pair, he);
+   if (!cmp)
+   return he_pair;
+   }
+
+   return NULL;
+}
+
+static void compute_cycles_diff(struct hist_entry *he,
+   struct hist_entry *pair)
+{
+   pair->diff.computed = true;
+   if (pair->block_info->num && he->block_info->num) {
+   pair->diff.cycles =
+   pair->block_info->cycles_aggr / pair->block_info->num_aggr -
+   he->block_info->cycles_aggr / he->block_info->num_aggr;
+   }
+}
+
+static void block_hists_match(struct hists *hists_base,
+ struct hists *hists_pair)
+{
+   struct rb_root_cached *root = hists_base->entries_in;
+   struct rb_node *next = rb_first_cached(root);
+
+   while (next != NULL) {
+   struct hist_entry *he = rb_entry(next, struct hist_entry,
+rb_node_in);
+   struct hist_entry *pair = get_block_pair(he, hists_pair);
+
+   next = rb_next(&he->rb_node_in);
+
+   if (pair) {
+   hist_entry__add_pair(pair, he);
+   compute_cycles_diff(he, pair);
+   }
+   }
+}
+
+static int filter_cb(struct hist_entry *he, void *arg __maybe_unused)
+{
+   /* Skip the calculation of column length in output_resort */
+   he->filtered = true;
+   return 0;
+}
+
 static void hists__precompute(struct hists *hists)
 {
struct rb_root_cached *root;
@@ -653,6 +729,7 @@ static void hists__precompute(struct hists *hists)
 
next = rb_first_cached(root);
while (next != NULL) {
+   struct block_hist *bh, *pair_bh;
struct hist_entry *he, *pair;
struct data__file *d;
int i;
@@ -681,6 +758,16 @@ static void hists__precompute(struct hists *hists)
break;
case COMPUTE_CYCLES:
process_block_per_sym(pair);
+   bh = container_of(he, struct block_hist, he);
+   pair_bh = container_of(pair, struct block_hist,
+  he);
+
+   if (bh->valid && 

[tip:perf/core] perf diff: Check if all data files with branch stacks

2019-07-03 Thread tip-bot for Jin Yao
Commit-ID:  30d815534e63d737f8004414d12b1679c032e0dd
Gitweb: https://git.kernel.org/tip/30d815534e63d737f8004414d12b1679c032e0dd
Author: Jin Yao 
AuthorDate: Fri, 28 Jun 2019 17:23:00 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 2 Jul 2019 12:46:11 -0300

perf diff: Check if all data files with branch stacks

We will expand perf diff to support diffing the cycles of individual program
blocks, which requires all data files to have branch stacks.

This patch checks HEADER_BRANCH_STACK in the header, and only sets the flag
has_br_stack when HEADER_BRANCH_STACK is set in all data files.
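
The "set only if every file has it" rule can be sketched independently of the perf session API. `struct data_file_info` and `all_have_branch_stack()` are hypothetical names for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file summary; perf reads this from the file header. */
struct data_file_info {
	const char *path;
	bool has_branch_stack;
};

/*
 * Return true only when every file carries branch stack data.
 * The first file without it decides the answer, so the loop stops early.
 */
static bool all_have_branch_stack(const struct data_file_info *files, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!files[i].has_branch_stack)
			return false;
	}
	return true;
}
```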

 v2:
 ---
 Move check_file_brstack() from __cmd_diff() to cmd_diff(),
 because a later patch will check the flag 'has_br_stack' before
 ui_init().

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1561713784-30533-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-diff.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 6e7920793729..a7e04202955c 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -32,6 +32,7 @@ struct perf_diff {
struct perf_time_interval   *ptime_range;
int  range_size;
int  range_num;
+   bool has_br_stack;
 };
 
 /* Diff command specific HPP columns. */
@@ -873,6 +874,31 @@ static int parse_time_str(struct data__file *d, char *abstime_ostr,
return ret;
 }
 
+static int check_file_brstack(void)
+{
+   struct data__file *d;
+   bool has_br_stack;
+   int i;
+
+   data__for_each_file(i, d) {
+   d->session = perf_session__new(&d->data, false, &pdiff.tool);
+   if (!d->session) {
+   pr_err("Failed to open %s\n", d->data.path);
+   return -1;
+   }
+
+   has_br_stack = perf_header__has_feat(&d->session->header,
+HEADER_BRANCH_STACK);
+   perf_session__delete(d->session);
+   if (!has_br_stack)
+   return 0;
+   }
+
+   /* Set only all files having branch stacks */
+   pdiff.has_br_stack = true;
+   return 0;
+}
+
 static int __cmd_diff(void)
 {
struct data__file *d;
@@ -1487,6 +1513,9 @@ int cmd_diff(int argc, const char **argv)
if (data_init(argc, argv) < 0)
return -1;
 
+   if (check_file_brstack() < 0)
+   return -1;
+
if (ui_init() < 0)
return -1;
 


[tip:perf/core] perf diff: Use hists to manage basic blocks per symbol

2019-07-03 Thread tip-bot for Jin Yao
Commit-ID:  99150a1faab2963d3f5bf353354afe79bdddb75f
Gitweb: https://git.kernel.org/tip/99150a1faab2963d3f5bf353354afe79bdddb75f
Author: Jin Yao 
AuthorDate: Fri, 28 Jun 2019 17:23:01 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 2 Jul 2019 12:47:07 -0300

perf diff: Use hists to manage basic blocks per symbol

hist__account_cycles() can account cycles per basic block. The basic
block information is saved in the cycles_hist structure.

This patch processes each symbol, gets the basic blocks from cycles_hist and
adds the basic block entries to a new hists (in 'struct block_hist').
A hists is used because we need to compare, sort and print the basic
blocks later.
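
The embedding pattern behind 'struct block_hist', where generic code sees only the embedded entry and the owner is recovered with container_of(), can be sketched as below. `struct entry` and `struct block_entry` are simplified stand-ins, and the macro is a plain offsetof-based version written for this example rather than the kernel one.

```c
#include <stddef.h>
#include <stdlib.h>

/* Plain offsetof-based container_of, written for this sketch. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct entry {			/* stand-in for struct hist_entry */
	int id;
};

struct block_entry {		/* stand-in for struct block_hist */
	int nr_blocks;		/* extra per-symbol block data */
	struct entry he;	/* embedded generic entry */
};

/* Allocate the wrapper but hand out only the embedded entry. */
static struct entry *block_entry_new(int id)
{
	struct block_entry *be = calloc(1, sizeof(*be));

	if (!be)
		return NULL;
	be->he.id = id;
	return &be->he;
}

/* Recover the wrapper from a pointer to its embedded entry. */
static struct block_entry *to_block_entry(struct entry *he)
{
	return container_of(he, struct block_entry, he);
}

static void block_entry_free(struct entry *he)
{
	if (he)
		free(to_block_entry(he));
}
```

This keeps the generic entry type unchanged while callers that know about blocks can still reach the extra fields.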

 v6:
 ---
 Since 'ops' argument is removed from hists__add_entry_block,
 update the code accordingly. No functional change.

 v5:
 ---
 Since now we still carry block_info in 'struct hist_entry'
 we don't need to use our own new/free ops for hist entries.
 And the block_info is released in hist_entry__delete.

 v3:
 ---
 1. In v2, we put the block data in 'struct hist_entry', but
 that is not a good design. In v3, we create a new
 'struct block_hist' and cast the 'struct hist_entry' to
 'struct block_hist' in some places, which avoids adding
 new fields to 'struct hist_entry'.

 2. abs() -> labs(), in block_cycles_diff_cmp().

 v2:
 ---
 v1 adds the basic block entries to per data-file hists
 but v2 adds the basic block entries to per symbol hists.
 This keeps the current perf-diff format. The result is
 shown in the next patches.

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1561713784-30533-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-diff.c | 190 +-
 tools/perf/util/hist.c|   3 +
 tools/perf/util/sort.h|  12 +++
 3 files changed, 202 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index a7e04202955c..83b8c0f3fb16 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -20,6 +20,7 @@
 #include "util/data.h"
 #include "util/config.h"
 #include "util/time-utils.h"
+#include "util/annotate.h"
 
 #include 
 #include 
@@ -87,11 +88,14 @@ static s64 compute_wdiff_w2;
 static const char  *cpu_list;
 static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 
+static struct addr_location dummy_al;
+
 enum {
COMPUTE_DELTA,
COMPUTE_RATIO,
COMPUTE_WEIGHTED_DIFF,
COMPUTE_DELTA_ABS,
+   COMPUTE_CYCLES,
COMPUTE_MAX,
 };
 
@@ -100,6 +104,7 @@ const char *compute_names[COMPUTE_MAX] = {
[COMPUTE_DELTA_ABS] = "delta-abs",
[COMPUTE_RATIO] = "ratio",
[COMPUTE_WEIGHTED_DIFF] = "wdiff",
+   [COMPUTE_CYCLES] = "cycles",
 };
 
 static int compute = COMPUTE_DELTA_ABS;
@@ -234,6 +239,8 @@ static int setup_compute(const struct option *opt, const char *str,
for (i = 0; i < COMPUTE_MAX; i++)
if (!strcmp(cstr, compute_names[i])) {
*cp = i;
+   if (i == COMPUTE_CYCLES)
+   break;
return setup_compute_opt(option);
}
 
@@ -336,6 +343,31 @@ static int formula_fprintf(struct hist_entry *he, struct hist_entry *pair,
return -1;
 }
 
+static void *block_hist_zalloc(size_t size)
+{
+   struct block_hist *bh;
+
+   bh = zalloc(size + sizeof(*bh));
+   if (!bh)
+   return NULL;
+
+   return &bh->he;
+}
+
+static void block_hist_free(void *he)
+{
+   struct block_hist *bh;
+
+   bh = container_of(he, struct block_hist, he);
+   hists__delete_entries(&bh->block_hists);
+   free(bh);
+}
+
+struct hist_entry_ops block_hist_ops = {
+   .new= block_hist_zalloc,
+   .free   = block_hist_free,
+};
+
 static int diff__process_sample_event(struct perf_tool *tool,
  union perf_event *event,
  struct perf_sample *sample,
@@ -363,9 +395,22 @@ static int diff__process_sample_event(struct perf_tool *tool,
goto out_put;
}
 
-   if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample, true)) {
-   pr_warning("problem incrementing symbol period, skipping event\n");
-   goto out_put;
+   if (compute != COMPUTE_CYCLES) {
+   if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample,
+ true)) {
+   pr_warning("problem incrementing symbol period, "
+  "skipping event\n");
+   goto out_put;
+   }
+   } else {
+   if (!hists__add_entry_ops(hists, &block_hist_ops, &al, NULL,
+ NULL, NULL, sample, true)) {
+   

[tip:perf/core] perf hists: Add block_info in hist_entry

2019-07-03 Thread tip-bot for Jin Yao
Commit-ID:  fe96245c7f38c4ea92c1c599b43f176e27d9921e
Gitweb: https://git.kernel.org/tip/fe96245c7f38c4ea92c1c599b43f176e27d9921e
Author: Jin Yao 
AuthorDate: Fri, 28 Jun 2019 17:22:59 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 2 Jul 2019 12:45:23 -0300

perf hists: Add block_info in hist_entry

The block_info contains the program basic block information, i.e.,
the start address and the end address of the basic block and
how many cycles it takes.

We need to compare, sort and even print out the basic blocks in some
order, e.g. sorted by cycles.

For this purpose, we add a block_info field to hist_entry. In order not to
impact the current interface, we create a new function,
hists__add_entry_block.

 v6:
 ---
 Remove the 'ops' argument in hists__add_entry_block

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1561713784-30533-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/hist.c | 20 ++--
 tools/perf/util/hist.h |  5 +
 tools/perf/util/sort.h |  1 +
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index fb3271fd420c..c4defff151ed 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -574,6 +574,8 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
 */
mem_info__zput(entry->mem_info);
 
+   block_info__zput(entry->block_info);
+
/* If the map of an existing hist_entry has
 * become out-of-date due to an exec() or
 * similar, update it.  Otherwise we will
@@ -645,6 +647,7 @@ __hists__add_entry(struct hists *hists,
   struct symbol *sym_parent,
   struct branch_info *bi,
   struct mem_info *mi,
+  struct block_info *block_info,
   struct perf_sample *sample,
   bool sample_self,
   struct hist_entry_ops *ops)
@@ -677,6 +680,7 @@ __hists__add_entry(struct hists *hists,
.hists  = hists,
.branch_info = bi,
.mem_info = mi,
+   .block_info = block_info,
.transaction = sample->transaction,
.raw_data = sample->raw_data,
.raw_size = sample->raw_size,
@@ -699,7 +703,7 @@ struct hist_entry *hists__add_entry(struct hists *hists,
struct perf_sample *sample,
bool sample_self)
 {
-   return __hists__add_entry(hists, al, sym_parent, bi, mi,
+   return __hists__add_entry(hists, al, sym_parent, bi, mi, NULL,
  sample, sample_self, NULL);
 }
 
@@ -712,10 +716,22 @@ struct hist_entry *hists__add_entry_ops(struct hists *hists,
struct perf_sample *sample,
bool sample_self)
 {
-   return __hists__add_entry(hists, al, sym_parent, bi, mi,
+   return __hists__add_entry(hists, al, sym_parent, bi, mi, NULL,
  sample, sample_self, ops);
 }
 
+struct hist_entry *hists__add_entry_block(struct hists *hists,
+ struct addr_location *al,
+ struct block_info *block_info)
+{
+   struct hist_entry entry = {
+   .block_info = block_info,
+   .hists = hists,
+   }, *he = hists__findnew_entry(hists, &entry, al, false);
+
+   return he;
+}
+
 static int
 iter_next_nop_entry(struct hist_entry_iter *iter __maybe_unused,
struct addr_location *al __maybe_unused)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 76ff6c6d03b8..c670122b4e40 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -16,6 +16,7 @@ struct addr_location;
 struct map_symbol;
 struct mem_info;
 struct branch_info;
+struct block_info;
 struct symbol;
 
 enum hist_filter {
@@ -149,6 +150,10 @@ struct hist_entry *hists__add_entry_ops(struct hists *hists,
struct perf_sample *sample,
bool sample_self);
 
+struct hist_entry *hists__add_entry_block(struct hists *hists,
+ struct addr_location *al,
+ struct block_info *bi);
+
 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
 int max_stack_depth, void *arg);
 
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index ce376a73f964..43623fa874b2 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -144,6 +144,7 @@ struct hist_entry {
long

[tip:perf/core] perf symbol: Create block_info structure

2019-07-03 Thread tip-bot for Jin Yao
Commit-ID:  0cec2447e7d209b77e52c6ec62169cc564df54e7
Gitweb: https://git.kernel.org/tip/0cec2447e7d209b77e52c6ec62169cc564df54e7
Author: Jin Yao 
AuthorDate: Fri, 28 Jun 2019 17:22:58 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 2 Jul 2019 12:44:19 -0300

perf symbol: Create block_info structure

'perf diff' currently can only diff symbols (functions).

We should expand it to diff the cycles of individual program blocks as
reported by timed LBR. This would allow changes in specific code to be
identified accurately.

We need a new structure to maintain the basic block information, such as
the symbol (function), the start/end address of the block and its cycles.
This patch creates this structure along with some ops.
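
The new/get/put life cycle of such a structure can be sketched with a plain integer count. This is an illustration only: `struct block_ref` and its helpers are hypothetical, the counter is not atomic, and the patch itself uses refcount_t as shown in the diff below.

```c
#include <stdlib.h>

/* Simplified stand-in for a reference-counted block descriptor. */
struct block_ref {
	unsigned long start;
	unsigned long end;
	int refcnt;		/* plain int here; not thread-safe */
};

/* Allocate a zeroed object that starts with one reference. */
static struct block_ref *block_ref_new(void)
{
	struct block_ref *b = calloc(1, sizeof(*b));

	if (b)
		b->refcnt = 1;
	return b;
}

/* Take an additional reference; NULL is passed through. */
static struct block_ref *block_ref_get(struct block_ref *b)
{
	if (b)
		b->refcnt++;
	return b;
}

/* Drop a reference; returns 1 when the object was freed, else 0. */
static int block_ref_put(struct block_ref *b)
{
	if (b && --b->refcnt == 0) {
		free(b);
		return 1;
	}
	return 0;
}
```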

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1561713784-30533-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/symbol.c | 22 ++
 tools/perf/util/symbol.h | 23 +++
 2 files changed, 45 insertions(+)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 46d2c03814a1..ae2ce255e848 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2351,3 +2351,25 @@ struct mem_info *mem_info__new(void)
refcount_set(&mi->refcnt, 1);
return mi;
 }
+
+struct block_info *block_info__get(struct block_info *bi)
+{
+   if (bi)
+   refcount_inc(&bi->refcnt);
+   return bi;
+}
+
+void block_info__put(struct block_info *bi)
+{
+   if (bi && refcount_dec_and_test(&bi->refcnt))
+   free(bi);
+}
+
+struct block_info *block_info__new(void)
+{
+   struct block_info *bi = zalloc(sizeof(*bi));
+
+   if (bi)
+   refcount_set(&bi->refcnt, 1);
+   return bi;
+}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 9a8fe012910a..12755b42ea93 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -131,6 +131,17 @@ struct mem_info {
refcount_t  refcnt;
 };
 
+struct block_info {
+   struct symbol   *sym;
+   u64 start;
+   u64 end;
+   u64 cycles;
+   u64 cycles_aggr;
+   int num;
+   int num_aggr;
+   refcount_t  refcnt;
+};
+
 struct addr_location {
struct machine *machine;
struct thread *thread;
@@ -332,4 +343,16 @@ static inline void __mem_info__zput(struct mem_info **mi)
 
 #define mem_info__zput(mi) __mem_info__zput(&mi)
 
+struct block_info *block_info__new(void);
+struct block_info *block_info__get(struct block_info *bi);
+void   block_info__put(struct block_info *bi);
+
+static inline void __block_info__zput(struct block_info **bi)
+{
+   block_info__put(*bi);
+   *bi = NULL;
+}
+
+#define block_info__zput(bi) __block_info__zput(&bi)
+
 #endif /* __PERF_SYMBOL */


[tip:perf/core] perf stat: Support 'percore' event qualifier

2019-05-18 Thread tip-bot for Jin Yao
Commit-ID:  4fc4d8dfa056dfd48afe73b9ea3b7570ceb80b9c
Gitweb: https://git.kernel.org/tip/4fc4d8dfa056dfd48afe73b9ea3b7570ceb80b9c
Author: Jin Yao 
AuthorDate: Fri, 12 Apr 2019 21:59:49 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 16 May 2019 14:17:24 -0300

perf stat: Support 'percore' event qualifier

With this patch, we can use the 'percore' event qualifier in perf-stat.

  root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000
  1.000773050 S0-C0   98,352,832 cpu/event=0,umask=0x3,percore=1/  (50.01%)
  1.000773050 S0-C1  103,763,057 cpu/event=0,umask=0x3,percore=1/  (50.02%)
  1.000773050 S0-C2  196,776,995 cpu/event=0,umask=0x3,percore=1/  (50.02%)
  1.000773050 S0-C3  176,493,779 cpu/event=0,umask=0x3,percore=1/  (50.02%)
  1.000773050 CPU0    47,699,641 cpu/event=0,umask=0x3/            (50.02%)
  1.000773050 CPU1    49,052,451 cpu/event=0,umask=0x3/            (49.98%)
  1.000773050 CPU2   102,771,422 cpu/event=0,umask=0x3/            (49.98%)
  1.000773050 CPU3   100,784,662 cpu/event=0,umask=0x3/            (49.98%)
  1.000773050 CPU4    43,171,342 cpu/event=0,umask=0x3/            (49.98%)
  1.000773050 CPU5    54,152,158 cpu/event=0,umask=0x3/            (49.98%)
  1.000773050 CPU6    93,618,410 cpu/event=0,umask=0x3/            (49.98%)
  1.000773050 CPU7    74,477,589 cpu/event=0,umask=0x3/            (49.99%)

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line. From the output, we can see:

  S0-C0 = CPU0 + CPU4
  S0-C1 = CPU1 + CPU5
  S0-C2 = CPU2 + CPU6
  S0-C3 = CPU3 + CPU7

So the result is as expected (the tiny difference is ignored).

Note that the 'percore' event qualifier must be used with the '-A' option.
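
The per-core aggregation can be sketched as a sum over sibling hardware threads. The sketch assumes the topology of the example above, where CPU n and CPU n+4 share a core; `NR_CPUS_EXAMPLE`, `NR_CORES_EXAMPLE` and `sum_per_core()` are hypothetical names, and perf itself derives the mapping from the CPU topology rather than from a modulo.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_EXAMPLE  8
#define NR_CORES_EXAMPLE 4

/*
 * Sum per-CPU counts into per-core counts.
 * Assumption: CPU n and CPU n + NR_CORES_EXAMPLE are siblings,
 * as in the S0-C0 = CPU0 + CPU4 example.
 */
static void sum_per_core(const uint64_t cpu_counts[NR_CPUS_EXAMPLE],
			 uint64_t core_counts[NR_CORES_EXAMPLE])
{
	for (size_t core = 0; core < NR_CORES_EXAMPLE; core++)
		core_counts[core] = 0;

	for (size_t cpu = 0; cpu < NR_CPUS_EXAMPLE; cpu++)
		core_counts[cpu % NR_CORES_EXAMPLE] += cpu_counts[cpu];
}
```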

Signed-off-by: Jin Yao 
Tested-by: Ravi Bangoria 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1555077590-27664-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-stat.txt |  4 
 tools/perf/builtin-stat.c  | 21 +
 tools/perf/util/stat-display.c | 43 ++
 tools/perf/util/stat.c |  8 ---
 4 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 39c05f89104e..1e312c2672e4 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -43,6 +43,10 @@ report::
  param1 and param2 are defined as formats for the PMU in
  /sys/bus/event_source/devices//format/*
 
+ 'percore' is a event qualifier that sums up the event counts for both
+ hardware threads in a core. For example:
+ perf stat -A -a -e cpu/event,percore=1/,otherevent ...
+
- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
  where M, N, K are numbers (in decimal, hex, octal format).
  Acceptable values for each of 'config', 'config1' and 'config2'
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a3c060878faa..24b8e690fb69 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -847,6 +847,18 @@ static int perf_stat__get_core_cached(struct perf_stat_config *config,
return perf_stat__get_aggr(config, perf_stat__get_core, map, idx);
 }
 
+static bool term_percore_set(void)
+{
+   struct perf_evsel *counter;
+
+   evlist__for_each_entry(evsel_list, counter) {
+   if (counter->percore)
+   return true;
+   }
+
+   return false;
+}
+
 static int perf_stat_init_aggr_mode(void)
 {
int nr;
@@ -867,6 +879,15 @@ static int perf_stat_init_aggr_mode(void)
stat_config.aggr_get_id = perf_stat__get_core_cached;
break;
case AGGR_NONE:
+   if (term_percore_set()) {
+   if (cpu_map__build_core_map(evsel_list->cpus,
+   &stat_config.aggr_map)) {
+   perror("cannot build core map");
+   return -1;
+   }
+   stat_config.aggr_get_id = perf_stat__get_core_cached;
+   }
+   break;
case AGGR_GLOBAL:
case AGGR_THREAD:
case AGGR_UNSET:
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index f5b4ee79568c..4c53bae5644b 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -88,9 +88,17 @@ static void aggr_printout(struct perf_stat_config *config,
config->csv_sep);
break;
case AGGR_NONE:
-   fprintf(config->output, "CPU%*d%s",
-   config->csv_output ? 0 : -4,
-

[tip:perf/core] perf stat: Factor out aggregate counts printing

2019-05-18 Thread tip-bot for Jin Yao
Commit-ID:  40480a8136700d678dc07222c4d7287c89d0c04d
Gitweb: https://git.kernel.org/tip/40480a8136700d678dc07222c4d7287c89d0c04d
Author: Jin Yao 
AuthorDate: Fri, 12 Apr 2019 21:59:48 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 16 May 2019 14:17:24 -0300

perf stat: Factor out aggregate counts printing

Move the aggregate counts printing to a new function
print_counter_aggrdata, which will be used in following patches.

Signed-off-by: Jin Yao 
Tested-by: Ravi Bangoria 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1555077590-27664-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/stat-display.c | 64 +-
 1 file changed, 39 insertions(+), 25 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 3324f23c7efc..f5b4ee79568c 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -594,6 +594,41 @@ static void aggr_cb(struct perf_stat_config *config,
}
 }
 
+static void print_counter_aggrdata(struct perf_stat_config *config,
+  struct perf_evsel *counter, int s,
+  char *prefix, bool metric_only,
+  bool *first)
+{
+   struct aggr_data ad;
+   FILE *output = config->output;
+   u64 ena, run, val;
+   int id, nr;
+   double uval;
+
+   ad.id = id = config->aggr_map->map[s];
+   ad.val = ad.ena = ad.run = 0;
+   ad.nr = 0;
+   if (!collect_data(config, counter, aggr_cb, &ad))
+   return;
+
+   nr = ad.nr;
+   ena = ad.ena;
+   run = ad.run;
+   val = ad.val;
+   if (*first && metric_only) {
+   *first = false;
+   aggr_printout(config, counter, id, nr);
+   }
+   if (prefix && !metric_only)
+   fprintf(output, "%s", prefix);
+
+   uval = val * counter->scale;
+   printout(config, id, nr, counter, uval, prefix,
+run, ena, 1.0, &rt_stat);
+   if (!metric_only)
+   fputc('\n', output);
+}
+
 static void print_aggr(struct perf_stat_config *config,
   struct perf_evlist *evlist,
   char *prefix)
@@ -601,9 +636,7 @@ static void print_aggr(struct perf_stat_config *config,
bool metric_only = config->metric_only;
FILE *output = config->output;
struct perf_evsel *counter;
-   int s, id, nr;
-   double uval;
-   u64 ena, run, val;
+   int s;
bool first;
 
if (!(config->aggr_map || config->aggr_get_id))
@@ -616,33 +649,14 @@ static void print_aggr(struct perf_stat_config *config,
 * Without each counter has its own line.
 */
for (s = 0; s < config->aggr_map->nr; s++) {
-   struct aggr_data ad;
if (prefix && metric_only)
fprintf(output, "%s", prefix);
 
-   ad.id = id = config->aggr_map->map[s];
first = true;
evlist__for_each_entry(evlist, counter) {
-   ad.val = ad.ena = ad.run = 0;
-   ad.nr = 0;
-   if (!collect_data(config, counter, aggr_cb, &ad))
-   continue;
-   nr = ad.nr;
-   ena = ad.ena;
-   run = ad.run;
-   val = ad.val;
-   if (first && metric_only) {
-   first = false;
-   aggr_printout(config, counter, id, nr);
-   }
-   if (prefix && !metric_only)
-   fprintf(output, "%s", prefix);
-
-   uval = val * counter->scale;
-   printout(config, id, nr, counter, uval, prefix,
-run, ena, 1.0, &rt_stat);
-   if (!metric_only)
-   fputc('\n', output);
+   print_counter_aggrdata(config, counter, s,
+  prefix, metric_only,
+  &first);
}
if (metric_only)
fputc('\n', output);


[tip:perf/core] perf tools: Add a 'percore' event qualifier

2019-05-18 Thread tip-bot for Jin Yao
Commit-ID:  064b4e82aa1633c27c383cc686b87ced57e072d1
Gitweb: https://git.kernel.org/tip/064b4e82aa1633c27c383cc686b87ced57e072d1
Author: Jin Yao 
AuthorDate: Fri, 12 Apr 2019 21:59:47 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 16 May 2019 14:17:24 -0300

perf tools: Add a 'percore' event qualifier

Add a 'percore' event qualifier, like cpu/event=0,umask=0x3,percore=1/,
that sums up the event counts for both hardware threads in a core.

We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
So we need to support this per-core counting on an event level.

This can be implemented in only the user tool, no kernel support needed.
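
The summing described above can be pictured with a minimal sketch. The
struct and function names below are hypothetical and are not taken from the
perf sources; the sketch only assumes that each hardware-thread reading
carries the id of the core it belongs to.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-hardware-thread reading: owning core id and raw count. */
struct thread_count_sketch {
	int core_id;
	uint64_t val;
};

/*
 * Sum the counts of every hardware thread that belongs to core_id.
 * This is purely a user-space aggregation; nothing is asked of the kernel.
 */
static uint64_t percore_sum_sketch(const struct thread_count_sketch *tc,
				   size_t n, int core_id)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		if (tc[i].core_id == core_id)
			sum += tc[i].val;
	}
	return sum;
}
```

With two threads per core, calling percore_sum_sketch() once per core id
yields one value per core, while the untouched per-thread values remain
available for the other events.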

 v4:
 ---
 1. Add Arnaldo's patch which updates the documentation for
this new qualifier.
 2. Rebase to latest perf/core branch

 v3:
 ---
 Simplify the code according to Jiri's comments.
 Before:
   "return term->val.percore ? true : false;"
 Now:
   "return term->val.percore;"

 v2:
 ---
 Change the qualifier name from 'coresum' to 'percore' according to
 comments from Jiri and Andi.

Signed-off-by: Jin Yao 
Tested-by: Ravi Bangoria 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1555077590-27664-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-list.txt | 12 
 tools/perf/util/evsel.c|  2 ++
 tools/perf/util/evsel.h|  3 +++
 tools/perf/util/parse-events.c | 27 +++
 tools/perf/util/parse-events.h |  1 +
 tools/perf/util/parse-events.l |  1 +
 6 files changed, 46 insertions(+)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 138fb6e94b3c..18ed1b0fceb3 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -199,6 +199,18 @@ also be supplied. For example:
 
   perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
 
+EVENT QUALIFIERS:
+
+It is also possible to add extra qualifiers to an event:
+
+percore:
+
+Sums up the event counts for all hardware threads in a core, e.g.:
+
+
+  perf stat -e cpu/event=0,umask=0x3,percore=1/
+
+
 EVENT GROUPS
 
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a10cf4cde920..a6f572a40deb 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -813,6 +813,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
break;
case PERF_EVSEL__CONFIG_TERM_DRV_CFG:
break;
+   case PERF_EVSEL__CONFIG_TERM_PERCORE:
+   break;
default:
break;
}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 6d190cbf1070..cad54e8ba522 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -50,6 +50,7 @@ enum term_type {
PERF_EVSEL__CONFIG_TERM_OVERWRITE,
PERF_EVSEL__CONFIG_TERM_DRV_CFG,
PERF_EVSEL__CONFIG_TERM_BRANCH,
+   PERF_EVSEL__CONFIG_TERM_PERCORE,
 };
 
 struct perf_evsel_config_term {
@@ -67,6 +68,7 @@ struct perf_evsel_config_term {
booloverwrite;
char*branch;
unsigned long max_events;
+   boolpercore;
} val;
bool weak;
 };
@@ -158,6 +160,7 @@ struct perf_evsel {
struct perf_evsel   **metric_events;
boolcollect_stat;
boolweak_group;
+   boolpercore;
const char  *pmu_name;
struct {
perf_evsel__sb_cb_t *cb;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4432bfe039fd..cf0b9b81c5aa 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -950,6 +950,7 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
[PARSE_EVENTS__TERM_TYPE_OVERWRITE] = "overwrite",
[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]   = "no-overwrite",
[PARSE_EVENTS__TERM_TYPE_DRV_CFG]   = "driver-config",
+   [PARSE_EVENTS__TERM_TYPE_PERCORE]   = "percore",
 };
 
 static bool config_term_shrinked;
@@ -970,6 +971,7 @@ config_term_avail(int term_type, struct parse_events_error *err)
case PARSE_EVENTS__TERM_TYPE_CONFIG2:
case PARSE_EVENTS__TERM_TYPE_NAME:
case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
+   case PARSE_EVENTS__TERM_TYPE_PERCORE:
return true;
default:
if (!err)
@@ -1061,6 +1063,14 @@ do { 
   \
case PARSE_EVENTS__TERM_TYPE_MAX_EVENTS:

[tip:perf/core] perf annotate: Remove hist__account_cycles() from callback

2019-05-18 Thread tip-bot for Jin Yao
Commit-ID:  bdd1666b3d03d675bdb7f8d92b29f2797acbc5e8
Gitweb: https://git.kernel.org/tip/bdd1666b3d03d675bdb7f8d92b29f2797acbc5e8
Author: Jin Yao 
AuthorDate: Sat, 16 Mar 2019 05:16:17 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 15 May 2019 16:36:46 -0300

perf annotate: Remove hist__account_cycles() from callback

The hist__account_cycles() function is executed when the
hist_iter__branch_callback() is called.

But it looks like this is not necessary: hist__account_cycles() already
walks all branch entries.

This patch moves hist__account_cycles() out of the callback, so the data
processing is much faster than before.

The previous code has an issue: ch[offset].num++ (in
__symbol__account_cycles) is executed repeatedly, since
hist__account_cycles() is called in each hist_iter__branch_callback(), so
the count in ch[offset].num is not correct (too big).

With this patch, the issue is fixed. We also no longer need the check
"ch->reset >= ch->num / 2" for too many overlaps (in
annotation__count_and_fill); keeping it would hide some data.
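
The over-counting can be illustrated with a deliberately simplified sketch.
It assumes the callback fires once per branch entry and that every entry
lands on the same offset; the names are hypothetical stand-ins, not the
perf functions themselves.

```c
#include <stddef.h>

/* Hypothetical stand-in for one per-offset cycles histogram slot. */
struct cyc_hist_sketch {
	unsigned int num;
};

/* Walks every branch entry of one sample and bumps the counter each time. */
static void account_cycles_sketch(struct cyc_hist_sketch *ch, size_t nr_branches)
{
	for (size_t i = 0; i < nr_branches; i++)
		ch->num++;
}

/* Old flow: accounting is triggered from a callback run once per branch entry. */
static unsigned int count_in_callback_sketch(size_t nr_branches)
{
	struct cyc_hist_sketch ch = { 0 };

	for (size_t i = 0; i < nr_branches; i++)
		account_cycles_sketch(&ch, nr_branches);
	return ch.num;
}

/* New flow: accounting runs once per sample, outside the callback. */
static unsigned int count_once_per_sample_sketch(size_t nr_branches)
{
	struct cyc_hist_sketch ch = { 0 };

	account_cycles_sketch(&ch, nr_branches);
	return ch.num;
}
```

Under these assumptions a sample with 4 branch entries is counted 16 times
in the old flow and 4 times in the new one, which is also why the new flow
does less work per sample.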

Now, we can try, for example:

  perf record -b ...
  perf annotate or perf report -s symbol

The before/after output should show no change.

 v3:
 ---
 Fix the crash in stdio mode.
 As in the previous code, ui__has_annotation() must be checked
 before calling hist__account_cycles().

 v2:
 ---
 1. Cover the similar perf report
 2. Remove the checking code "ch->reset >= ch->num / 2"

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1552684577-29041-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-annotate.c |  4 ++--
 tools/perf/builtin-report.c   | 11 +--
 tools/perf/util/annotate.c|  2 +-
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 67f9d9ffacfb..77deb3a40596 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -159,8 +159,6 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
struct perf_evsel *evsel = iter->evsel;
int err;
 
-   hist__account_cycles(sample->branch_stack, al, sample, false);
-
bi = he->branch_info;
err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
 
@@ -199,6 +197,8 @@ static int process_branch_callback(struct perf_evsel *evsel,
if (a.map != NULL)
a.map->dso->hit = 1;
 
+   hist__account_cycles(sample->branch_stack, al, sample, false);
+
ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
return ret;
 }
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 4054eb1f98ac..91e27ac297c2 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -136,9 +136,6 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
if (!ui__has_annotation() && !rep->symbol_ipc)
return 0;
 
-   hist__account_cycles(sample->branch_stack, al, sample,
-rep->nonany_branch_mode);
-
if (sort__mode == SORT_MODE__BRANCH) {
bi = he->branch_info;
err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
@@ -181,9 +178,6 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
if (!ui__has_annotation() && !rep->symbol_ipc)
return 0;
 
-   hist__account_cycles(sample->branch_stack, al, sample,
-rep->nonany_branch_mode);
-
bi = he->branch_info;
err = addr_map_symbol__inc_samples(&bi->from, sample, evsel);
if (err)
@@ -282,6 +276,11 @@ static int process_sample_event(struct perf_tool *tool,
if (al.map != NULL)
al.map->dso->hit = 1;
 
+   if (ui__has_annotation() || rep->symbol_ipc) {
+   hist__account_cycles(sample->branch_stack, &al, sample,
+rep->nonany_branch_mode);
+   }
+
ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
if (ret < 0)
pr_debug("problem adding hist entry, skipping event\n");
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 09762985c713..0b8573fd9b05 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1021,7 +1021,7 @@ static void annotation__count_and_fill(struct annotation *notes, u64 start, u64
float ipc = n_insn / ((double)ch->cycles / (double)ch->num);
 
/* Hide data when there are too many overlaps. */
-   if (ch->reset >= 0x7fff || ch->reset >= ch->num / 2)
+   if (ch->reset >= 0x7fff)
return;
 
for (offset = start; offset <= end; offset++) {


[tip:perf/urgent] perf diff: Support --pid/--tid filter options

2019-03-09 Thread tip-bot for Jin Yao
Commit-ID:  c1d3e633e16db3eb64f519c7099171bfcef94b20
Gitweb: https://git.kernel.org/tip/c1d3e633e16db3eb64f519c7099171bfcef94b20
Author: Jin Yao 
AuthorDate: Tue, 5 Mar 2019 21:05:43 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 6 Mar 2019 18:06:16 -0300

perf diff: Support --pid/--tid filter options

Use the existing symbol_conf.pid_list_str and symbol_conf.tid_list_str
logic to filter samples by process or thread ID.

For example:

  perf diff --tid 13965

It'll only diff the samples for thread 13965.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1551791143-10334-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-diff.txt | 6 ++
 tools/perf/builtin-diff.c  | 4 
 2 files changed, 10 insertions(+)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index 8c2c229faf50..da7809b15cc9 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -168,6 +168,12 @@ OPTIONS
CPUs are specified with -: 0-2. Default is to report samples on all
CPUs.
 
+--pid=::
+   Only diff samples for given process ID (comma separated list).
+
+--tid=::
+   Only diff samples for given thread ID (comma separated list).
+
 COMPARISON
 --
 The comparison is governed by the baseline file. The baseline perf.data
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index dfe6c7606f5a..6e7920793729 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -985,6 +985,10 @@ static const struct option options[] = {
OPT_STRING(0, "time", &pdiff.time_str, "str",
   "Time span (time percent or absolute timestamp)"),
OPT_STRING(0, "cpu", &cpu_list, "cpu", "list of cpus to profile"),
+   OPT_STRING(0, "pid", &symbol_conf.pid_list_str, "pid[,pid...]",
+  "only consider symbols in these pids"),
+   OPT_STRING(0, "tid", &symbol_conf.tid_list_str, "tid[,tid...]",
+  "only consider symbols in these tids"),
OPT_END()
 };
 


[tip:perf/urgent] perf diff: Support --cpu filter option

2019-03-09 Thread tip-bot for Jin Yao
Commit-ID:  daca23b2007595b6a48255ca08c763f56050d1c5
Gitweb: https://git.kernel.org/tip/daca23b2007595b6a48255ca08c763f56050d1c5
Author: Jin Yao 
AuthorDate: Tue, 5 Mar 2019 21:05:42 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 6 Mar 2019 18:05:21 -0300

perf diff: Support --cpu filter option

To improve 'perf diff', implement a --cpu filter option.

Multiple CPUs can be provided as a comma-separated list with no space:
0,1.  Ranges of CPUs are specified with -: 0-2. Default is to report
samples on all CPUs.

For example,

  perf diff --cpu 0,1

It only diffs the samples for CPU0 and CPU1.
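
As a rough illustration of the list syntax, here is a small stand-alone
sketch that turns a string such as "0,1" or "0-2" into a bitmask and tests
a sample's CPU against it. It is not the perf implementation (which uses
perf_session__cpu_bitmap() and a MAX_NR_CPUS bitmap); the function names
and the 64-CPU limit are assumptions made to keep the example short.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a comma-separated CPU list with optional ranges ("0,1", "0-2",
 * "0,4-6") into a 64-bit mask. Returns false on malformed input or on
 * CPU numbers above 63 (a limit of this sketch only).
 */
static bool parse_cpu_list_sketch(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	if (!s || !*s)
		return false;

	while (*s) {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s)
			return false;		/* no digits where a CPU was expected */
		if (*end == '-') {
			const char *p = end + 1;

			hi = strtol(p, &end, 10);
			if (end == p)
				return false;	/* range without an upper bound */
		}
		if (lo < 0 || hi < lo || hi > 63)
			return false;
		for (long c = lo; c <= hi; c++)
			m |= (uint64_t)1 << c;

		if (*end == ',' && end[1] != '\0')
			s = end + 1;		/* another item follows */
		else if (*end == '\0')
			s = end;		/* end of list */
		else
			return false;		/* stray character or trailing comma */
	}
	*mask = m;
	return true;
}

/* True when the sample's CPU is part of the parsed list. */
static bool cpu_selected_sketch(uint64_t mask, unsigned int cpu)
{
	return cpu < 64 && ((mask >> cpu) & 1);
}
```

A sample whose CPU is not selected would simply be skipped before it is
added to the histograms, which matches the early "goto out_put" in the
patch below.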

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1551791143-10334-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-diff.txt |  5 +
 tools/perf/builtin-diff.c  | 16 
 2 files changed, 21 insertions(+)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index 036d65bded51..8c2c229faf50 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -163,6 +163,11 @@ OPTIONS
the end of perf.data.old and analyzes the perf.data from the
timestamp 3971.150589 to the end of perf.data.
 
+--cpu:: Only diff samples for the list of CPUs provided. Multiple CPUs can
+   be provided as a comma-separated list with no space: 0,1. Ranges of
+   CPUs are specified with -: 0-2. Default is to report samples on all
+   CPUs.
+
 COMPARISON
 --
 The comparison is governed by the baseline file. The baseline perf.data
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 17cd898074c8..dfe6c7606f5a 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -83,6 +83,9 @@ static unsigned int sort_compute = 1;
 static s64 compute_wdiff_w1;
 static s64 compute_wdiff_w2;
 
+static const char  *cpu_list;
+static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
+
 enum {
COMPUTE_DELTA,
COMPUTE_RATIO,
@@ -354,6 +357,11 @@ static int diff__process_sample_event(struct perf_tool *tool,
return -1;
}
 
+   if (cpu_list && !test_bit(sample->cpu, cpu_bitmap)) {
+   ret = 0;
+   goto out_put;
+   }
+
if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample, true)) {
pr_warning("problem incrementing symbol period, skipping event\n");
goto out_put;
@@ -892,6 +900,13 @@ static int __cmd_diff(void)
goto out_delete;
}
 
+   if (cpu_list) {
+   ret = perf_session__cpu_bitmap(d->session, cpu_list,
+  cpu_bitmap);
+   if (ret < 0)
+   goto out_delete;
+   }
+
ret = perf_session__process_events(d->session);
if (ret) {
pr_err("Failed to process %s\n", d->data.path);
@@ -969,6 +984,7 @@ static const struct option options[] = {
 "How to display percentage of filtered entries", 
parse_filter_percentage),
OPT_STRING(0, "time", &pdiff.time_str, "str",
   "Time span (time percent or absolute timestamp)"),
+   OPT_STRING(0, "cpu", &cpu_list, "cpu", "list of cpus to profile"),
OPT_END()
 };
 


[tip:perf/urgent] perf diff: Support --time filter option

2019-03-09 Thread tip-bot for Jin Yao
Commit-ID:  4802138d78caed36cee2a859f77fb2035f230018
Gitweb: https://git.kernel.org/tip/4802138d78caed36cee2a859f77fb2035f230018
Author: Jin Yao 
AuthorDate: Tue, 5 Mar 2019 21:05:41 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 6 Mar 2019 18:03:23 -0300

perf diff: Support --time filter option

To improve 'perf diff', implement a --time filter option to diff the
samples within given time window.

It supports time percent with multiple time ranges. The time string
format is 'a%/n,b%/m,...' or 'a%-b%,c%-d%,...'.

For example:

Select the second 10% time slice to diff:

  perf diff --time 10%/2

Select from 0% to 10% time slice to diff:

  perf diff --time 0%-10%

Select the first and the second 10% time slices to diff:

  perf diff --time 10%/1,10%/2

Select from 0% to 10% and 30% to 40% slices to diff:

  perf diff --time 0%-10%,30%-40%
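
To make the percent syntax concrete, the sketch below maps 'a%/n' and
'a%-b%' onto absolute windows, given the first and last sample timestamps
of a file. It only illustrates the arithmetic implied by the examples
above; the struct and function names are hypothetical and the real parser
lives in perf's time-utils code.

```c
#include <stdint.h>

/* Hypothetical window type: [start, end] in the same unit as the inputs. */
struct time_window_sketch {
	uint64_t start;
	uint64_t end;
};

/*
 * 'pct%/n': the n-th (1-based) slice of width pct percent.
 * Example: 10%/2 covers 10%..20% of the recorded time span.
 * Returns 0 on success, -1 on invalid arguments.
 */
static int percent_slash_window_sketch(uint64_t first, uint64_t last,
				       unsigned int pct, unsigned int n,
				       struct time_window_sketch *w)
{
	uint64_t total;

	if (last < first || pct == 0 || n == 0 || (uint64_t)pct * n > 100)
		return -1;

	total = last - first;
	/* Fine for realistic timestamps; extreme spans could overflow here. */
	w->start = first + total * pct * (n - 1) / 100;
	w->end = first + total * pct * n / 100;
	return 0;
}

/* 'a%-b%': the window from a percent to b percent of the time span. */
static int percent_dash_window_sketch(uint64_t first, uint64_t last,
				      unsigned int a, unsigned int b,
				      struct time_window_sketch *w)
{
	uint64_t total;

	if (last < first || a >= b || b > 100)
		return -1;

	total = last - first;
	w->start = first + total * a / 100;
	w->end = first + total * b / 100;
	return 0;
}
```

A comma-separated list such as '0%-10%,30%-40%' would produce one such
window per item, and a sample is kept when its timestamp falls inside any
of them.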

It also supports analysing samples within a given time window
<start>,<stop>.

Times have the format seconds.microseconds.

If 'start' is not given (i.e., time string is ',x.y') then analysis starts at
the beginning of the file.

If the stop time is not given (i.e., time string is 'x.y,') then analysis
goes to end of file.

Time string is 'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for
different perf.data files.

For example, we get the timestamp information from perf script.

  perf script -i perf.data.old

mgen 13940 [000]  3946.361400: ...

  perf script -i perf.data

mgen 13940 [000]  3971.150589 ...

  perf diff --time 3946.361400,:3971.150589,

It analyzes the perf.data.old from the timestamp 3946.361400 to the end of
perf.data.old and analyzes the perf.data from the timestamp 3971.150589 to the
end of perf.data.

 v4:
 ---
 Update abstime_str_dup() to return an error if strdup
 fails, and update __cmd_diff() accordingly.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1551791143-10334-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-diff.txt |  45 ++
 tools/perf/builtin-diff.c  | 148 +
 2 files changed, 179 insertions(+), 14 deletions(-)

diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index a79c84ae61aa..036d65bded51 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -118,6 +118,51 @@ OPTIONS
sum of shown entries will be always 100%.  "absolute" means it retains
the original value before and after the filter is applied.
 
+--time::
+   Analyze samples within given time window. It supports time
+   percent with multiple time ranges. Time string is 'a%/n,b%/m,...'
+   or 'a%-b%,c%-d%,...'.
+
+   For example:
+
+   Select the second 10% time slice to diff:
+
+ perf diff --time 10%/2
+
+   Select from 0% to 10% time slice to diff:
+
+ perf diff --time 0%-10%
+
+   Select the first and the second 10% time slices to diff:
+
+ perf diff --time 10%/1,10%/2
+
+   Select from 0% to 10% and 30% to 40% slices to diff:
+
+ perf diff --time 0%-10%,30%-40%
+
+   It also supports analyzing samples within a given time window
+   <start>,<stop>. Times have the format seconds.microseconds. If 'start'
+   is not given (i.e., time string is ',x.y') then analysis starts at
+   the beginning of the file. If stop time is not given (i.e., time
+   string is 'x.y,') then analysis goes to the end of the file. Time string is
+   'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for different
+   perf.data files.
+
+   For example, we get the timestamp information from 'perf script'.
+
+ perf script -i perf.data.old
+   mgen 13940 [000]  3946.361400: ...
+
+ perf script -i perf.data
+   mgen 13940 [000]  3971.150589 ...
+
+ perf diff --time 3946.361400,:3971.150589,
+
+   It analyzes the perf.data.old from the timestamp 3946.361400 to
+   the end of perf.data.old and analyzes the perf.data from the
+   timestamp 3971.150589 to the end of perf.data.
+
 COMPARISON
 --
 The comparison is governed by the baseline file. The baseline perf.data
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 58fe0e88215c..17cd898074c8 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -19,12 +19,21 @@
 #include "util/util.h"
 #include "util/data.h"
 #include "util/config.h"
+#include "util/time-utils.h"
 
 #include 
 #include 
 #include 
 #include 
 
+struct perf_diff {
+   struct perf_tool tool;
+   const char  *time_str;
+   struct perf_time_interval   *ptime_range;
+   int  range_size;
+   int  range_num;
+};
+
 /* Diff 

[tip:perf/urgent] perf time-utils: Refactor time range parsing code

2019-03-09 Thread tip-bot for Jin Yao
Commit-ID:  284c4e18f55e85155fbcbef5f88b6e62d2b1c29c
Gitweb: https://git.kernel.org/tip/284c4e18f55e85155fbcbef5f88b6e62d2b1c29c
Author: Jin Yao 
AuthorDate: Fri, 1 Mar 2019 18:13:06 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 1 Mar 2019 11:03:53 -0300

perf time-utils: Refactor time range parsing code

Jiri points out that we don't need any time checking and time string
parsing if the --time option is not set. That makes sense.

This patch refactors the time range parsing code, moving the duplicated
code from perf report and perf script to time-utils and checking whether
the --time option is set before parsing the time string. No logic change
is expected, so the usage of --time is the same as before.

For example:

Select the first and second 10% time slices:
  perf report --time 10%/1,10%/2
  perf script --time 10%/1,10%/2

Select the slices from 0% to 10% and from 30% to 40%:
  perf report --time 0%-10%,30%-40%
  perf script --time 0%-10%,30%-40%

Select the time slices from timestamp 3971 to 3973
  perf report --time 3971,3973
  perf script --time 3971,3973

Committer testing:

Using the above examples, check before and after to see if it remains
the same:

  $ perf record -F 1 -- find . -name "*.[ch]" -exec cat {} + > /dev/null
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 1.626 MB perf.data (42392 samples) ]
  $
  $ perf report --time 10%/1,10%/2 > /tmp/report.before.1
  $ perf script --time 10%/1,10%/2 > /tmp/script.before.1
  $ perf report --time 0%-10%,30%-40% > /tmp/report.before.2
  $ perf script --time 0%-10%,30%-40% > /tmp/script.before.2
  $ perf report --time 180457.375844,180457.377717 > /tmp/report.before.3
  $ perf script --time 180457.375844,180457.377717 > /tmp/script.before.3

For example, the 3rd test produces this slice:

  $ cat /tmp/script.before.3
cat  3147 180457.375844:   2143 cycles:uppp:  7f79362590d9 cfree@GLIBC_2.2.5+0x9 (/usr/lib64/libc-2.28.so)
cat  3147 180457.375986:   2245 cycles:uppp:  558b70f3d86e [unknown] (/usr/bin/cat)
cat  3147 180457.376012:   2164 cycles:uppp:  7f7936257430 _int_malloc+0x8c0 (/usr/lib64/libc-2.28.so)
cat  3147 180457.376140:   2921 cycles:uppp:  558b70f3a554 [unknown] (/usr/bin/cat)
cat  3147 180457.376296:   2844 cycles:uppp:  7f7936258abe malloc+0x4e (/usr/lib64/libc-2.28.so)
cat  3147 180457.376431:   2717 cycles:uppp:  558b70f3b0ca [unknown] (/usr/bin/cat)
cat  3147 180457.376667:   2630 cycles:uppp:  558b70f3d86e [unknown] (/usr/bin/cat)
cat  3147 180457.376795:   2442 cycles:uppp:  7f79362bff55 read+0x15 (/usr/lib64/libc-2.28.so)
cat  3147 180457.376927:   2376 cycles:uppp:  9aa00163 [unknown] ([unknown])
cat  3147 180457.376954:   2307 cycles:uppp:  7f7936257438 _int_malloc+0x8c8 (/usr/lib64/libc-2.28.so)
cat  3147 180457.377116:   3091 cycles:uppp:  7f7936258a70 malloc+0x0 (/usr/lib64/libc-2.28.so)
cat  3147 180457.377362:   2945 cycles:uppp:  558b70f3a3b0 [unknown] (/usr/bin/cat)
cat  3147 180457.377517:   2727 cycles:uppp:  558b70f3a9aa [unknown] (/usr/bin/cat)
  $

Install 'coreutils-debuginfo' to see cat's guts (symbols), but then, the
above chunk translates into this 'perf report' output:

  $ cat /tmp/report.before.3
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 13  of event 'cycles:uppp' (time slices: 180457.375844,180457.377717)
  # Event count (approx.): 33552
  #
  # Overhead  Command  Shared Object Symbol
  #   ...    ..
  #
  17.69%  cat  libc-2.28.so  [.] malloc
  14.53%  cat  cat   [.] 0x586e
  13.33%  cat  libc-2.28.so  [.] _int_malloc
   8.78%  cat  cat   [.] 0x23b0
   8.71%  cat  cat   [.] 0x2554
   8.13%  cat  cat   [.] 0x29aa
   8.10%  cat  cat   [.] 0x30ca
   7.28%  cat  libc-2.28.so  [.] read
   7.08%  cat  [unknown] [k] 0x9aa00163
   6.39%  cat  libc-2.28.so  [.] cfree@GLIBC_2.2.5

  #
  # (Tip: Order by the overhead of source file name and line number: perf report -s srcline)
  #
  $

Now let's see after applying this patch; nothing should change:

  $ perf report --time 10%/1,10%/2 > /tmp/report.after.1
  $ perf script --time 10%/1,10%/2 > /tmp/script.after.1
  $ perf report --time 0%-10%,30%-40% > /tmp/report.after.2
  $ perf script --time 0%-10%,30%-40% > /tmp/script.after.2
  $ perf report --time 180457.375844,180457.377717 > /tmp/report.after.3
  $ perf script --time 180457.375844,180457.377717 > /tmp/script.after.3
  $ diff -u /tmp/report.before.1 /tmp/report.after.1
  $ diff -u /tmp/script.before.1 

[tip:perf/urgent] perf report: Fix wrong iteration count in --branch-history

2019-01-08 Thread tip-bot for Jin Yao
Commit-ID:  a3366db06bb656cef2e03f30f780d93059bcc594
Gitweb: https://git.kernel.org/tip/a3366db06bb656cef2e03f30f780d93059bcc594
Author: Jin Yao 
AuthorDate: Fri, 4 Jan 2019 14:10:30 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 4 Jan 2019 12:54:49 -0300

perf report: Fix wrong iteration count in --branch-history

By calculating the removed loops, we can get the iteration count.

But the iteration count could be reported incorrectly, reporting
impossibly high counts.

That's because previous code uses the number of removed LBR entries for
the iteration count. That's not good. Fix this by increasing the
iteration count when a loop is detected.

When matching the chain, the iteration counts are added up, so we need
to compute the average value when printing.
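
The averaging can be sketched as follows. The field names mirror those
visible in the diff below (iter_count, iter_cycles, from_count), but the
struct and helper names are hypothetical and the code is only a simplified
model of the accumulate-then-average idea, not the callchain code itself.

```c
#include <stdint.h>

/* Hypothetical accumulator for one merged callchain node. */
struct iter_stats_sketch {
	uint64_t iter_count;	/* sum of loop iterations over merged chains */
	uint64_t iter_cycles;	/* sum of cycles spent in those iterations */
	uint64_t from_count;	/* number of chains merged into this node */
};

/* Called each time another chain matches this node. */
static void iter_stats_merge_sketch(struct iter_stats_sketch *s,
				    uint64_t nr_loop_iter, uint64_t iter_cycles)
{
	s->iter_count += nr_loop_iter;
	s->iter_cycles += iter_cycles;
	s->from_count++;
}

/* Average iterations per merged chain; 0 means "nothing to print". */
static uint64_t iter_avg_sketch(const struct iter_stats_sketch *s)
{
	if (!s->iter_count || !s->from_count)
		return 0;
	return s->iter_count / s->from_count;
}

/* Average cycles per iteration; 0 when no iteration was recorded. */
static uint64_t iter_avg_cycles_sketch(const struct iter_stats_sketch *s)
{
	if (!s->iter_count)
		return 0;
	return s->iter_cycles / s->iter_count;
}
```

Merging three chains that each saw one loop iteration of 23 cycles gives
iter:1 and avg_cycles:23 rather than an ever-growing iteration total.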

For example,

  $ perf report --branch-history --stdio --no-children

Before:

  ---f2 +0
 |
 |--33.62%--f1 +9 (cycles:1)
 |  f1 +0
 |  main +22 (cycles:1)
 |  main +17
 |  main +38 (cycles:1)
 |  main +27
 |  f1 +26 (cycles:1)
 |  f1 +24
 |  f2 +27 (cycles:7)
 |  f2 +0
 |  f1 +19 (cycles:1)
 |  f1 +14
 |  f2 +27 (cycles:11)
 |  f2 +0
 |  f1 +9 (cycles:1 iter:2968 avg_cycles:3)
 |  f1 +0
 |  main +22 (cycles:1 iter:2968 avg_cycles:3)
 |  main +17
 |  main +38 (cycles:1 iter:2968 avg_cycles:3)

2968 is an impossibly high iteration count and avg_cycles is too small.

After:

  ---f2 +0
 |
 |--33.62%--f1 +9 (cycles:1)
 |  f1 +0
 |  main +22 (cycles:1)
 |  main +17
 |  main +38 (cycles:1)
 |  main +27
 |  f1 +26 (cycles:1)
 |  f1 +24
 |  f2 +27 (cycles:7)
 |  f2 +0
 |  f1 +19 (cycles:1)
 |  f1 +14
 |  f2 +27 (cycles:11)
 |  f2 +0
 |  f1 +9 (cycles:1 iter:1 avg_cycles:23)
 |  f1 +0
 |  main +22 (cycles:1 iter:1 avg_cycles:23)
 |  main +17
 |  main +38 (cycles:1 iter:1 avg_cycles:23)

avg_cycles:23 is the average cycles of this iteration.

Fixes: c4ee06251d42 ("perf report: Calculate the average cycles of iterations")

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1546582230-17507-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/callchain.c | 32 
 tools/perf/util/callchain.h |  1 +
 tools/perf/util/machine.c   |  2 +-
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 32ef7bdca1cf..dc2212e12184 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -766,6 +766,7 @@ static enum match_result match_chain(struct callchain_cursor_node *node,
cnode->cycles_count += node->branch_flags.cycles;
cnode->iter_count += node->nr_loop_iter;
cnode->iter_cycles += node->iter_cycles;
+   cnode->from_count++;
}
}
 
@@ -1345,10 +1346,10 @@ static int branch_to_str(char *bf, int bfsize,
 static int branch_from_str(char *bf, int bfsize,
   u64 branch_count,
   u64 cycles_count, u64 iter_count,
-  u64 iter_cycles)
+  u64 iter_cycles, u64 from_count)
 {
int printed = 0, i = 0;
-   u64 cycles;
+   u64 cycles, v = 0;
 
cycles = cycles_count / branch_count;
if (cycles) {
@@ -1357,14 +1358,16 @@ static int branch_from_str(char *bf, int bfsize,
bf + printed, bfsize - printed);
}
 
-   if (iter_count) {
-   printed += count_pri64_printf(i++, "iter",
-   iter_count,
-   bf + printed, bfsize - printed);
+   if (iter_count && from_count) {
+   v = iter_count / from_count;
+   if (v) {
+   printed += count_pri64_printf(i++, "iter",
+   v, bf + printed, bfsize - printed);
 
-   printed += count_pri64_printf(i++, "avg_cycles",
-   iter_cycles / iter_count,
-   bf + printed, bfsize - printed);
+   printed += count_pri64_printf(i++, "avg_cycles",
+   iter_cycles / iter_count,
+   bf + printed, bfsize - printed);
+   }
}
 
if (i)
@@ -1377,6 +1380,7 @@ static int counts_str_build(char *bf, int bfsize,
   

[tip:perf/urgent] perf stat: Fix endless wait for child process

2019-01-08 Thread tip-bot for Jin Yao
Commit-ID:  8a99255a50c0b4c2a449b96fd8d45fcc8d72c701
Gitweb: https://git.kernel.org/tip/8a99255a50c0b4c2a449b96fd8d45fcc8d72c701
Author: Jin Yao 
AuthorDate: Thu, 3 Jan 2019 15:40:45 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 3 Jan 2019 12:12:18 -0300

perf stat: Fix endless wait for child process

We hit a 'perf stat' issue when using the following script:

  #!/bin/bash

  sleep 1000 &
  exec perf stat -a -e cycles -I1000 -- sleep 5

Since "perf stat" is launched by exec, the "sleep 1000" would be the
child process of "perf stat". The wait4() call will not return because
it's waiting for the child process "sleep 1000" to end. So 'perf stat'
doesn't return even after 5s passes.

This patch lets 'perf stat' return when the specified child process ends
(in this case, the specified child process is "sleep 5").
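
The guard can be sketched in isolation. The example below uses POSIX
waitpid() instead of wait4() to stay portable, and the function name is
hypothetical. It relies on the documented behaviour that a pid of -1 means
"wait for any child", and assumes, as the patch's check suggests, that
child_pid may already be -1 by the time the wait is reached.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Wait only for the workload we were asked about.
 * Returns 1 if that child was reaped, 0 if there was nothing to wait for
 * (child_pid == -1), and -1 on error.
 *
 * Passing -1 straight to waitpid()/wait4() would block on ANY child,
 * including an unrelated one inherited across exec.
 */
static int wait_for_workload_sketch(pid_t child_pid, int *status)
{
	if (child_pid == -1)
		return 0;

	return waitpid(child_pid, status, 0) == child_pid ? 1 : -1;
}
```

In the script above, the unrelated child is "sleep 1000"; skipping the
wait when no specific pid is left avoids blocking on it.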

Committer testing:

  # cat test.sh
  #!/bin/bash

  sleep 10 &
  exec perf stat -a -e cycles -I1000 -- sleep 5
  #

Before:

  # time ./test.sh
  #   time counts unit events
   1.001113090108,453,351  cycles
   2.002062196142,075,435  cycles
   3.002896194164,801,068  cycles
   4.003731666107,062,140  cycles
   5.002068867112,241,832  cycles

  real  0m10.066s
  user  0m0.016s
  sys   0m0.101s
  #

After:

  # time ./test.sh
  #   time counts unit events
   1.001016096 91,412,027  cycles
   2.002014963124,063,708  cycles
   3.002883964125,993,929  cycles
   4.003706470120,465,734  cycles
   5.002006778163,560,355  cycles

  real  0m5.123s
  user  0m0.014s
  sys   0m0.105s
  #

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1546501245-4512-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-stat.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 1410d66192f7..63a3afc7f32b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -561,7 +561,8 @@ try_again:
break;
}
}
-   wait4(child_pid, &status, 0, &stat_config.ru_data);
+   if (child_pid != -1)
+       wait4(child_pid, &status, 0, &stat_config.ru_data);
 
if (workload_exec_errno) {
const char *emsg = str_error_r(workload_exec_errno, msg, sizeof(msg));


[tip:perf/core] perf report: Documentation average IPC and IPC coverage

2018-12-18 Thread tip-bot for Jin Yao
Commit-ID:  239ca3e78609378a1ed5d9db1c7db629a71c2857
Gitweb: https://git.kernel.org/tip/239ca3e78609378a1ed5d9db1c7db629a71c2857
Author: Jin Yao 
AuthorDate: Fri, 30 Nov 2018 21:54:57 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 17 Dec 2018 14:55:49 -0300

perf report: Documentation average IPC and IPC coverage

Add explanations for new columns "IPC" and "IPC coverage" in perf
documentation.

 v5:
 ---
 Update the description according to Ingo's comments.

Signed-off-by: Jin Yao 
Reviewed-by: Ingo Molnar 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1543586097-27632-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-report.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 474a4941f65d..ed2bf37ab132 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -126,6 +126,14 @@ OPTIONS
And default sort keys are changed to comm, dso_from, symbol_from, dso_to
and symbol_to, see '--branch-stack'.
 
+   When the sort key symbol is specified, columns "IPC" and "IPC Coverage"
+   are enabled automatically. Column "IPC" reports the average IPC per function
+   and column "IPC coverage" reports the percentage of instructions with
+   sampled IPC in this function. IPC means Instruction Per Cycle. If it's low,
+   it indicates there may be a performance bottleneck when the function is
+   executed, such as a memory access bottleneck. If a function has high overhead
+   and low IPC, it's worth further analyzing it to optimize its performance.
+
If the --mem-mode option is used, the following sort keys are also available
(incompatible with --branch-stack):
symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline.


[tip:perf/core] perf annotate: Create a annotate2 flag in struct symbol

2018-12-18 Thread tip-bot for Jin Yao
Commit-ID:  246fda09c127e689780d32ef72f2e870615ece3f
Gitweb: https://git.kernel.org/tip/246fda09c127e689780d32ef72f2e870615ece3f
Author: Jin Yao 
AuthorDate: Fri, 30 Nov 2018 21:54:55 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 17 Dec 2018 14:55:40 -0300

perf annotate: Create a annotate2 flag in struct symbol

We often use symbol__annotate2() to annotate a specified symbol.
Annotating may take some time, so in order to avoid annotating the
same symbol repeatedly, this patch creates a new flag to indicate that
the symbol has already been annotated.
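
How a caller might use such a flag can be sketched as follows. This is an
illustration under stated assumptions: struct demo_symbol, demo_annotate2()
and demo_annotate_once() are hypothetical stand-ins, and the annotate_calls
counter exists only to make the effect visible. Only the idea of setting
annotate2 once the annotation has succeeded comes from the patch.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for perf's struct symbol. */
struct demo_symbol {
	bool annotate2;		/* set once the symbol has been annotated */
	int annotate_calls;	/* demo-only counter of expensive runs */
};

/* Stand-in for the expensive annotation work; sets the flag on success. */
static int demo_annotate2(struct demo_symbol *sym)
{
	sym->annotate_calls++;
	sym->annotate2 = true;
	return 0;
}

/* Caller-side check: annotate only if it has not been done yet. */
static int demo_annotate_once(struct demo_symbol *sym)
{
	if (sym->annotate2)
		return 0;

	return demo_annotate2(sym);
}
```

Calling demo_annotate_once() repeatedly on the same symbol runs the expensive
step only the first time.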

Signed-off-by: Jin Yao 
Reviewed-by: Ingo Molnar 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1543586097-27632-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/annotate.c | 1 +
 tools/perf/util/symbol.h   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4b2b1b09b8f1..f69d8e177fa3 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2798,6 +2798,7 @@ int symbol__annotate2(struct symbol *sym, struct map *map, struct perf_evsel *ev
notes->nr_events = nr_pcnt;
 
annotation__update_column_widths(notes);
+   sym->annotate2 = true;
 
return 0;
 
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index d026d215bdc6..14d9d438e7e2 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -63,6 +63,7 @@ struct symbol {
u8  ignore:1;
u8  inlined:1;
u8  arch_sym;
+   bool    annotate2;
char    name[0];
 };
 


[tip:perf/core] perf report: Display average IPC and IPC coverage per symbol

2018-12-18 Thread tip-bot for Jin Yao
Commit-ID:  ec6ae74fe8f00c7df018628ada9d33190de72efa
Gitweb: https://git.kernel.org/tip/ec6ae74fe8f00c7df018628ada9d33190de72efa
Author: Jin Yao 
AuthorDate: Fri, 30 Nov 2018 21:54:56 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 17 Dec 2018 14:55:44 -0300

perf report: Display average IPC and IPC coverage per symbol

Support displaying the average IPC and IPC coverage per symbol in 'perf
report' --tui and --stdio modes.

For example,

 $ perf record -b ...
 $ perf report -s symbol

 Overhead  Symbol            IPC   [IPC Coverage]
   39.60%  [.] __random      2.30  [ 54.8%]
   18.02%  [.] main          0.43  [ 54.3%]
   14.21%  [.] compute_flag  2.29  [100.0%]
   14.16%  [.] rand          0.36  [100.0%]
    7.06%  [.] __random_r    2.57  [ 70.5%]
    6.85%  [.] rand@plt      0.00  [  0.0%]

Jiri Olsa provided the patch to support the --stdio mode. I merged
Jiri's code into this patch.

  $ perf report -s symbol --stdio

  # Overhead  Symbol                 IPC   [IPC Coverage]
  # ........  .....................  ....  ..............
  #
      39.60%  [.] __random           2.30  [ 54.8%]
      18.02%  [.] main               0.43  [ 54.3%]
      14.21%  [.] compute_flag       2.29  [100.0%]
      14.16%  [.] rand               0.36  [100.0%]
       7.06%  [.] __random_r         2.57  [ 70.5%]
       6.85%  [.] rand@plt           0.00  [  0.0%]
       0.02%  [k] run_timer_softirq  1.60  [ 57.2%]

The columns "IPC" and "[IPC Coverage]" are automatically enabled when
the sort-key "symbol" is specified. If the perf.data file doesn't
contain timed LBR information, columns are filled with "-".

For example,

  # Overhead  Symbol                       IPC   [IPC Coverage]
  # ........  ...........................  ....  ..............
  #
      46.57%  [.] main                     -     -
      17.60%  [.] rand                     -     -
      15.84%  [.] __random_r               -     -
      11.90%  [.] __random                 -     -
       6.50%  [.] compute_flag             -     -
       1.59%  [.] rand@plt                 -     -
       0.00%  [.] _dl_relocate_object      -     -
       0.00%  [k] tlb_flush_mmu            -     -
       0.00%  [k] perf_event_mmap          -     -
       0.00%  [k] native_sched_clock       -     -
       0.00%  [k] intel_pmu_handle_irq_v4  -     -
       0.00%  [k] native_write_msr         -     -
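
The two cell formats seen in the tables above can be sketched like this. It
is only an illustration: format_ipc_cell(), format_coverage_cell() and the
has_ipc parameter are hypothetical, and the column widths are assumptions
chosen to reproduce the sample output, not taken from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: format the "IPC" cell of one report row. */
static int format_ipc_cell(char *bf, size_t size, bool has_ipc, double ipc)
{
	/* Without timed LBR data there is nothing to show, so print "-". */
	if (!has_ipc)
		return snprintf(bf, size, "%-5s", "-");

	return snprintf(bf, size, "%-5.2f", ipc);
}

/* Hypothetical helper: format the "[IPC Coverage]" cell of one report row. */
static int format_coverage_cell(char *bf, size_t size, bool has_ipc, double pct)
{
	if (!has_ipc)
		return snprintf(bf, size, "%s", "-");

	return snprintf(bf, size, "[%5.1f%%]", pct);
}
```

The has_ipc argument stands for whatever test the real code uses to decide
that a symbol has IPC data; that test is not shown in this message.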

 v3:
 ---
 Removed the sortkey 'ipc' from command-line. The columns "IPC"
 and "[IPC Coverage]" are automatically enabled when "symbol"
 is specified.

 v2:
 ---
 Merge in Jiri's patch to support stdio mode

Signed-off-by: Jin Yao 
Reviewed-by: Ingo Molnar 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1543586097-27632-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 26 ---
 tools/perf/util/hist.h  |  1 +
 tools/perf/util/sort.c  | 61 +
 tools/perf/util/sort.h  |  2 ++
 4 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 257c9c18cb7e..4958095be4fc 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -85,6 +85,7 @@ struct report {
int socket_filter;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
struct branch_type_stat brtype_stat;
+   bool    symbol_ipc;
 };
 
 static int report__config(const char *var, const char *value, void *cb)
@@ -129,7 +130,7 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
struct mem_info *mi;
struct branch_info *bi;
 
-   if (!ui__has_annotation())
+   if (!ui__has_annotation() && !rep->symbol_ipc)
return 0;
 
hist__account_cycles(sample->branch_stack, al, sample,
@@ -174,7 +175,7 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
struct perf_evsel *evsel = iter->evsel;
int err;
 
-   if (!ui__has_annotation())
+   if (!ui__has_annotation() && !rep->symbol_ipc)
return 0;
 
hist__account_cycles(sample->branch_stack, al, sample,
@@ -1133,6 +1134,7 @@ int cmd_report(int argc, const char **argv)
.mode  = PERF_DATA_MODE_READ,
};
int ret = hists__init();
+   char sort_tmp[128];
 
if (ret < 0)
return ret;
@@ -1284,6 +1286,24 @@ repeat:
else
use_browser = 0;
 
+   if (sort_order && strstr(sort_order, "ipc")) {
+   parse_options_usage(report_usage, options, "s", 1);
+   goto error;
+   }
+
+   if (sort_order && 

[tip:perf/core] perf annotate: Compute average IPC and IPC coverage per symbol

2018-12-18 Thread tip-bot for Jin Yao
Commit-ID:  ace4f8faea54f62521e349f70b49797e48873e1f
Gitweb: https://git.kernel.org/tip/ace4f8faea54f62521e349f70b49797e48873e1f
Author: Jin Yao 
AuthorDate: Fri, 30 Nov 2018 21:54:54 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 17 Dec 2018 14:55:32 -0300

perf annotate: Compute average IPC and IPC coverage per symbol

Add support to 'perf report' annotate view or 'perf annotate --stdio2'
to aggregate the IPC derived from timed LBRs per symbol. We compute the
average IPC and the IPC coverage percentage.

For example:

  $ perf annotate --stdio2

  Percent  IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%)

  Disassembly of section .text:

  0003aac0 :
     8.32  3.28              sub    $0x18,%rsp
           3.28              mov    $0x1,%esi
           3.28              xor    %eax,%eax
           3.28              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
    11.57  3.28     1      ↓ je     20
                             lock   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
                           ↓ jne    29
                           ↓ jmp    43
    11.57  1.10        20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
     0.00  1.10     1      ↓ je     43
                       29:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                             sub    $0x80,%rsp
                           → callq  __lll_lock_wait_private
                             add    $0x80,%rsp
     0.00  3.00        43:   lea    __ctype_b@GLIBC_2.2.5+0x38,%rdi
           3.00              lea    0xc(%rsp),%rsi
     8.49  3.00     1      → callq  __random_r
     7.91  1.94              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
     0.00  1.94     1      ↓ je     68
                             lock   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
                           ↓ jne    70
                           ↓ jmp    8a
     0.00  2.00        68:   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
    21.56  2.00     1      ↓ je     8a
                       70:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                             sub    $0x80,%rsp
                           → callq  __lll_unlock_wake_private
                             add    $0x80,%rsp
    21.56  2.90        8a:   movslq 0xc(%rsp),%rax
           2.90              add    $0x18,%rsp
     9.03  2.90     1      ← retq

It shows for this symbol the average IPC is 2.30 and the IPC coverage is
54.8%.
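
The diff below is cut off before the body of ipc_coverage_string(), so the
following is only a sketch of how the two summary values could be derived
from the counters the patch introduces (hit_cycles, hit_insn, cover_insn,
total_insn). The formulas are assumptions: average IPC as hit_insn divided by
hit_cycles, and coverage as cover_insn as a percentage of total_insn. The
struct ipc_stats type and the function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical stand-in for the counters added to struct annotation. */
struct ipc_stats {
	unsigned long long hit_cycles;	/* cycles of the sampled blocks */
	unsigned long long hit_insn;	/* instructions executed in them */
	unsigned int cover_insn;	/* instructions that got an IPC value */
	unsigned int total_insn;	/* all instructions in the symbol */
};

/* Assumed formula: instructions per cycle over the sampled blocks. */
static double average_ipc(const struct ipc_stats *s)
{
	return s->hit_cycles ? (double)s->hit_insn / (double)s->hit_cycles : 0.0;
}

/* Assumed formula: share of instructions with a sampled IPC, in percent. */
static double ipc_coverage_percent(const struct ipc_stats *s)
{
	return s->total_insn ? s->cover_insn * 100.0 / (double)s->total_insn : 0.0;
}

/* Format the summary the way it appears in the annotate header line. */
static int format_ipc_summary(char *bf, size_t size, const struct ipc_stats *s)
{
	return snprintf(bf, size, "(Average IPC: %.2f, IPC Coverage: %.1f%%)",
			average_ipc(s), ipc_coverage_percent(s));
}
```

With made-up counters hit_insn = 2300, hit_cycles = 1000, cover_insn = 17 and
total_insn = 31, this yields "(Average IPC: 2.30, IPC Coverage: 54.8%)". Both
helpers return 0 when their divisor is 0, so a symbol without samples does
not divide by zero.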

Signed-off-by: Jin Yao 
Reviewed-by: Ingo Molnar 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1543586097-27632-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/annotate.c | 41 ++---
 tools/perf/util/annotate.h |  5 +
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 6936daf89ddd..4b2b1b09b8f1 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1000,6 +1000,7 @@ static unsigned annotation__count_insn(struct annotation *notes, u64 start, u64
 static void annotation__count_and_fill(struct annotation *notes, u64 start, u64 end, struct cyc_hist *ch)
 {
unsigned n_insn;
+   unsigned int cover_insn = 0;
u64 offset;
 
n_insn = annotation__count_insn(notes, start, end);
@@ -1013,21 +1014,34 @@ static void annotation__count_and_fill(struct annotation *notes, u64 start, u64
for (offset = start; offset <= end; offset++) {
struct annotation_line *al = notes->offsets[offset];
 
-   if (al)
+   if (al && al->ipc == 0.0) {
al->ipc = ipc;
+   cover_insn++;
+   }
+   }
+
+   if (cover_insn) {
+   notes->hit_cycles += ch->cycles;
+   notes->hit_insn += n_insn * ch->num;
+   notes->cover_insn += cover_insn;
}
}
 }
 
 void annotation__compute_ipc(struct annotation *notes, size_t size)
 {
-   u64 offset;
+   s64 offset;
 
if (!notes->src || !notes->src->cycles_hist)
return;
 
+   notes->total_insn = annotation__count_insn(notes, 0, size - 1);
+   notes->hit_cycles = 0;
+   notes->hit_insn = 0;
+   notes->cover_insn = 0;
+
pthread_mutex_lock(&notes->lock);
-   for (offset = 0; offset < size; ++offset) {
+   for (offset = size - 1; offset >= 0; --offset) {
struct cyc_hist *ch;
 
ch = &notes->src->cycles_hist[offset];
@@ -2563,6 +2577,22 @@ call_like:
disasm_line__scnprintf(dl, bf, size, !notes->options->use_offset);
 }
 
+static void ipc_coverage_string(char *bf, int size, struct annotation *notes)
+{
+   double ipc = 0.0, coverage = 0.0;

[tip:perf/core] perf annotate: Compute average IPC and IPC coverage per symbol

2018-12-14 Thread tip-bot for Jin Yao
Commit-ID:  9d2a4fa18816c8a7c4df5873755b92956f6a7128
Gitweb: https://git.kernel.org/tip/9d2a4fa18816c8a7c4df5873755b92956f6a7128
Author: Jin Yao 
AuthorDate: Fri, 30 Nov 2018 21:54:54 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 30 Nov 2018 17:14:52 -0300

perf annotate: Compute average IPC and IPC coverage per symbol

Add support to 'perf report' annotate view or 'perf annotate --stdio2'
to aggregate the IPC derived from timed LBRs per symbol. We compute the
average IPC and the IPC coverage percentage.

For example:

  $ perf annotate --stdio2

  Percent  IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%)

  Disassembly of section .text:

  0003aac0 :
     8.32  3.28              sub    $0x18,%rsp
           3.28              mov    $0x1,%esi
           3.28              xor    %eax,%eax
           3.28              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
    11.57  3.28     1      ↓ je     20
                             lock   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
                           ↓ jne    29
                           ↓ jmp    43
    11.57  1.10        20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
     0.00  1.10     1      ↓ je     43
                       29:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                             sub    $0x80,%rsp
                           → callq  __lll_lock_wait_private
                             add    $0x80,%rsp
     0.00  3.00        43:   lea    __ctype_b@GLIBC_2.2.5+0x38,%rdi
           3.00              lea    0xc(%rsp),%rsi
     8.49  3.00     1      → callq  __random_r
     7.91  1.94              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0
     0.00  1.94     1      ↓ je     68
                             lock   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
                           ↓ jne    70
                           ↓ jmp    8a
     0.00  2.00        68:   decl   __abort_msg@@GLIBC_PRIVATE+0x8a0
    21.56  2.00     1      ↓ je     8a
                       70:   lea    __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi
                             sub    $0x80,%rsp
                           → callq  __lll_unlock_wake_private
                             add    $0x80,%rsp
    21.56  2.90        8a:   movslq 0xc(%rsp),%rax
           2.90              add    $0x18,%rsp
     9.03  2.90     1      ← retq

It shows for this symbol the average IPC is 2.30 and the IPC coverage is
54.8%.

Signed-off-by: Jin Yao 
Reviewed-by: Ingo Molnar 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1543586097-27632-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/annotate.c | 41 ++---
 tools/perf/util/annotate.h |  5 +
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 6936daf89ddd..4b2b1b09b8f1 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1000,6 +1000,7 @@ static unsigned annotation__count_insn(struct annotation *notes, u64 start, u64
 static void annotation__count_and_fill(struct annotation *notes, u64 start, u64 end, struct cyc_hist *ch)
 {
unsigned n_insn;
+   unsigned int cover_insn = 0;
u64 offset;
 
n_insn = annotation__count_insn(notes, start, end);
@@ -1013,21 +1014,34 @@ static void annotation__count_and_fill(struct annotation *notes, u64 start, u64
for (offset = start; offset <= end; offset++) {
struct annotation_line *al = notes->offsets[offset];
 
-   if (al)
+   if (al && al->ipc == 0.0) {
al->ipc = ipc;
+   cover_insn++;
+   }
+   }
+
+   if (cover_insn) {
+   notes->hit_cycles += ch->cycles;
+   notes->hit_insn += n_insn * ch->num;
+   notes->cover_insn += cover_insn;
}
}
 }
 
 void annotation__compute_ipc(struct annotation *notes, size_t size)
 {
-   u64 offset;
+   s64 offset;
 
if (!notes->src || !notes->src->cycles_hist)
return;
 
+   notes->total_insn = annotation__count_insn(notes, 0, size - 1);
+   notes->hit_cycles = 0;
+   notes->hit_insn = 0;
+   notes->cover_insn = 0;
+
pthread_mutex_lock(&notes->lock);
-   for (offset = 0; offset < size; ++offset) {
+   for (offset = size - 1; offset >= 0; --offset) {
struct cyc_hist *ch;
 
ch = &notes->src->cycles_hist[offset];
@@ -2563,6 +2577,22 @@ call_like:
disasm_line__scnprintf(dl, bf, size, !notes->options->use_offset);
 }
 
+static void ipc_coverage_string(char *bf, int size, struct annotation *notes)
+{
+   double ipc = 0.0, coverage = 0.0;

[tip:perf/core] perf annotate: Create a annotate2 flag in struct symbol

2018-12-14 Thread tip-bot for Jin Yao
Commit-ID:  41405354bd534857a02448c14df4454bf8abe432
Gitweb: https://git.kernel.org/tip/41405354bd534857a02448c14df4454bf8abe432
Author: Jin Yao 
AuthorDate: Fri, 30 Nov 2018 21:54:55 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 30 Nov 2018 17:14:52 -0300

perf annotate: Create a annotate2 flag in struct symbol

We often use symbol__annotate2() to annotate a specified symbol.
Annotating may take some time, so in order to avoid annotating the
same symbol repeatedly, this patch creates a new flag to indicate that
the symbol has already been annotated.

Signed-off-by: Jin Yao 
Reviewed-by: Ingo Molnar 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1543586097-27632-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/annotate.c | 1 +
 tools/perf/util/symbol.h   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4b2b1b09b8f1..f69d8e177fa3 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2798,6 +2798,7 @@ int symbol__annotate2(struct symbol *sym, struct map *map, struct perf_evsel *ev
notes->nr_events = nr_pcnt;
 
annotation__update_column_widths(notes);
+   sym->annotate2 = true;
 
return 0;
 
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index d026d215bdc6..14d9d438e7e2 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -63,6 +63,7 @@ struct symbol {
u8  ignore:1;
u8  inlined:1;
u8  arch_sym;
+   bool    annotate2;
char    name[0];
 };
 


[tip:perf/core] perf report: Documentation average IPC and IPC coverage

2018-12-14 Thread tip-bot for Jin Yao
Commit-ID:  eb58157b92bfdddc5257f9a170edd3db96e96748
Gitweb: https://git.kernel.org/tip/eb58157b92bfdddc5257f9a170edd3db96e96748
Author: Jin Yao 
AuthorDate: Fri, 30 Nov 2018 21:54:57 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 30 Nov 2018 17:14:52 -0300

perf report: Documentation average IPC and IPC coverage

Add explanations for new columns "IPC" and "IPC coverage" in perf
documentation.

 v5:
 ---
 Update the description according to Ingo's comments.

Signed-off-by: Jin Yao 
Reviewed-by: Ingo Molnar 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1543586097-27632-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-report.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 474a4941f65d..ed2bf37ab132 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -126,6 +126,14 @@ OPTIONS
And default sort keys are changed to comm, dso_from, symbol_from, dso_to
and symbol_to, see '--branch-stack'.
 
+   When the sort key symbol is specified, columns "IPC" and "IPC Coverage"
+   are enabled automatically. Column "IPC" reports the average IPC per function
+   and column "IPC coverage" reports the percentage of instructions with
+   sampled IPC in this function. IPC means Instruction Per Cycle. If it's low,
+   it indicates there may be a performance bottleneck when the function is
+   executed, such as a memory access bottleneck. If a function has high overhead
+   and low IPC, it's worth further analyzing it to optimize its performance.
+
If the --mem-mode option is used, the following sort keys are also available
(incompatible with --branch-stack):
symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline.


[tip:perf/core] perf report: Display average IPC and IPC coverage per symbol

2018-12-14 Thread tip-bot for Jin Yao
Commit-ID:  8eced1d1581499203caa785b9b60bdcfdab5dcce
Gitweb: https://git.kernel.org/tip/8eced1d1581499203caa785b9b60bdcfdab5dcce
Author: Jin Yao 
AuthorDate: Fri, 30 Nov 2018 21:54:56 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 30 Nov 2018 17:14:52 -0300

perf report: Display average IPC and IPC coverage per symbol

Support displaying the average IPC and IPC coverage per symbol in 'perf
report' --tui and --stdio modes.

For example,

 $ perf record -b ...
 $ perf report -s symbol

 Overhead  Symbol            IPC   [IPC Coverage]
   39.60%  [.] __random      2.30  [ 54.8%]
   18.02%  [.] main          0.43  [ 54.3%]
   14.21%  [.] compute_flag  2.29  [100.0%]
   14.16%  [.] rand          0.36  [100.0%]
    7.06%  [.] __random_r    2.57  [ 70.5%]
    6.85%  [.] rand@plt      0.00  [  0.0%]

Jiri Olsa provided the patch to support the --stdio mode. I merged
Jiri's code into this patch.

  $ perf report -s symbol --stdio

  # Overhead  Symbol                 IPC   [IPC Coverage]
  # ........  .....................  ....  ..............
  #
      39.60%  [.] __random           2.30  [ 54.8%]
      18.02%  [.] main               0.43  [ 54.3%]
      14.21%  [.] compute_flag       2.29  [100.0%]
      14.16%  [.] rand               0.36  [100.0%]
       7.06%  [.] __random_r         2.57  [ 70.5%]
       6.85%  [.] rand@plt           0.00  [  0.0%]
       0.02%  [k] run_timer_softirq  1.60  [ 57.2%]

The columns "IPC" and "[IPC Coverage]" are automatically enabled when
the sort-key "symbol" is specified. If the perf.data file doesn't
contain timed LBR information, columns are filled with "-".

For example,

  # Overhead  Symbol                       IPC   [IPC Coverage]
  # ........  ...........................  ....  ..............
  #
      46.57%  [.] main                     -     -
      17.60%  [.] rand                     -     -
      15.84%  [.] __random_r               -     -
      11.90%  [.] __random                 -     -
       6.50%  [.] compute_flag             -     -
       1.59%  [.] rand@plt                 -     -
       0.00%  [.] _dl_relocate_object      -     -
       0.00%  [k] tlb_flush_mmu            -     -
       0.00%  [k] perf_event_mmap          -     -
       0.00%  [k] native_sched_clock       -     -
       0.00%  [k] intel_pmu_handle_irq_v4  -     -
       0.00%  [k] native_write_msr         -     -

 v3:
 ---
 Removed the sortkey 'ipc' from command-line. The columns "IPC"
 and "[IPC Coverage]" are automatically enabled when "symbol"
 is specified.

 v2:
 ---
 Merge in Jiri's patch to support stdio mode

Signed-off-by: Jin Yao 
Reviewed-by: Ingo Molnar 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1543586097-27632-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 26 ---
 tools/perf/util/hist.h  |  1 +
 tools/perf/util/sort.c  | 61 +
 tools/perf/util/sort.h  |  2 ++
 4 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 257c9c18cb7e..4958095be4fc 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -85,6 +85,7 @@ struct report {
int socket_filter;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
struct branch_type_stat brtype_stat;
+   bool    symbol_ipc;
 };
 
 static int report__config(const char *var, const char *value, void *cb)
@@ -129,7 +130,7 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
struct mem_info *mi;
struct branch_info *bi;
 
-   if (!ui__has_annotation())
+   if (!ui__has_annotation() && !rep->symbol_ipc)
return 0;
 
hist__account_cycles(sample->branch_stack, al, sample,
@@ -174,7 +175,7 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
struct perf_evsel *evsel = iter->evsel;
int err;
 
-   if (!ui__has_annotation())
+   if (!ui__has_annotation() && !rep->symbol_ipc)
return 0;
 
hist__account_cycles(sample->branch_stack, al, sample,
@@ -1133,6 +1134,7 @@ int cmd_report(int argc, const char **argv)
.mode  = PERF_DATA_MODE_READ,
};
int ret = hists__init();
+   char sort_tmp[128];
 
if (ret < 0)
return ret;
@@ -1284,6 +1286,24 @@ repeat:
else
use_browser = 0;
 
+   if (sort_order && strstr(sort_order, "ipc")) {
+   parse_options_usage(report_usage, options, "s", 1);
+   goto error;
+   }
+
+   if (sort_order && 

[tip:perf/urgent] perf top: Display the LBR stats in callchain entry

2018-11-06 Thread tip-bot for Jin Yao
Commit-ID:  590ac60d8aa929bd21e35cd95a7d8720d00eb4f3
Gitweb: https://git.kernel.org/tip/590ac60d8aa929bd21e35cd95a7d8720d00eb4f3
Author: Jin Yao 
AuthorDate: Wed, 31 Oct 2018 19:06:35 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 5 Nov 2018 14:37:11 -0300

perf top: Display the LBR stats in callchain entry

'perf report' already supports displaying LBR stats (such as cycles,
predicted%) in callchain entries.

For example:

  $ perf report --branch-history --stdio

  --1.01%--intel_idle mwait.h:29
intel_idle cpufeature.h:164 (cycles:5)
intel_idle cpufeature.h:164 (predicted:76.4%)
intel_idle mwait.h:102 (cycles:41)
intel_idle current.h:15

However, 'perf top' doesn't support that.

For example:

  $ perf top -a -b --call-graph branch

  -   13.86%     0.23%  [kernel]  [k] __x86_indirect_thunk_rax
     - 13.65% __x86_indirect_thunk_rax
        + 1.69% do_syscall_64
        + 1.68% do_select
        + 1.41% ktime_get
        + 0.70% __schedule
        + 0.62% do_sys_poll
          0.58% __x86_indirect_thunk_rax

Actually it's very easy to enable this feature in 'perf top'.

With this patch, the result is:

  $ perf top -a -b --call-graph branch

  $ -   13.58%     0.00%  [kernel]  [k] __x86_indirect_thunk_rax
  $    - 13.57% __x86_indirect_thunk_rax (predicted:93.9%)
  $       + 1.78% do_select (cycles:2)
  $       + 1.68% perf_pmu_disable.part.99 (cycles:1)
  $       + 1.45% ___sys_recvmsg (cycles:25)
  $       + 0.81% unix_stream_sendmsg (cycles:18)
  $       + 0.80% ktime_get (cycles:400)
  $         0.58% pick_next_task_fair (cycles:47)
  $       + 0.56% i915_request_retire (cycles:2)
  $       + 0.52% do_sys_poll (cycles:4)

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1540983995-20462-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-top.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index b2838de13de0..aa0c73e57924 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1429,6 +1429,9 @@ int cmd_top(int argc, const char **argv)
}
}
 
+   if (opts->branch_stack && callchain_param.enabled)
+   symbol_conf.show_branchflag_count = true;
+
sort__mode = SORT_MODE__TOP;
/* display thread wants entries to be collapsed in a different tree */
perf_hpp_list.need_collapse = 1;


[tip:perf/urgent] perf script python: Move dsoname code to a new function

2018-06-07 Thread tip-bot for Jin Yao
Commit-ID:  5f9e0f3158a5cd0ef7bb205b9f1826b2ec1893a9
Gitweb: https://git.kernel.org/tip/5f9e0f3158a5cd0ef7bb205b9f1826b2ec1893a9
Author: Jin Yao 
AuthorDate: Fri, 1 Jun 2018 17:01:01 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 6 Jun 2018 12:52:09 -0300

perf script python: Move dsoname code to a new function

This patch creates a new function get_dsoname() and moves the code which
gets the dsoname string into this function.

That's because in the next patch, when we process LBR data, we will also
need get_dsoname() to return the dsoname for branch from/to.

Signed-off-by: Jin Yao 
Reviewed-by: Andi Kleen 
Cc: Alexander Shishkin 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1527843663-32288-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 .../util/scripting-engines/trace-event-python.c| 23 ++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index 7f8afacd08ee..f863e96fb7bc 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -372,6 +372,19 @@ static PyObject *get_field_numeric_entry(struct event_format *event,
return obj;
 }
 
+static const char *get_dsoname(struct map *map)
+{
+   const char *dsoname = "[unknown]";
+
+   if (map && map->dso) {
+   if (symbol_conf.show_kernel_path && map->dso->long_name)
+   dsoname = map->dso->long_name;
+   else
+   dsoname = map->dso->name;
+   }
+
+   return dsoname;
+}
 
 static PyObject *python_process_callchain(struct perf_sample *sample,
 struct perf_evsel *evsel,
@@ -427,14 +440,8 @@ static PyObject *python_process_callchain(struct perf_sample *sample,
}
 
if (node->map) {
-   struct map *map = node->map;
-   const char *dsoname = "[unknown]";
-   if (map && map->dso) {
-   if (symbol_conf.show_kernel_path && map->dso->long_name)
-   dsoname = map->dso->long_name;
-   else
-   dsoname = map->dso->name;
-   }
+   const char *dsoname = get_dsoname(node->map);
+
pydict_set_item_string_decref(pyelem, "dso",
_PyUnicode_FromString(dsoname));
}


[tip:perf/urgent] perf script python: Add more PMU fields to event handler dict

2018-06-07 Thread tip-bot for Jin Yao
Commit-ID:  48a1f565261d2ab1e17f9a3ad532cf6d9e07748d
Gitweb: https://git.kernel.org/tip/48a1f565261d2ab1e17f9a3ad532cf6d9e07748d
Author: Jin Yao 
AuthorDate: Fri, 1 Jun 2018 17:01:02 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 6 Jun 2018 15:38:26 -0300

perf script python: Add more PMU fields to event handler dict

When doing PMU sampling and then running a script with 'perf script -s
script.py', the process_event function gets a dictionary with some fields
from the perf ring buffer (such as ip, sym, callchain, etc.).

But the dictionary is missing quite a few fields that perf reports now,
for example LBRs, data source, weight, transaction, iregs, uregs, etc.

This patch reports these fields for perf script python processing.

  New keys/items:
  ---
  key  : brstack
  items: from, to, from_dsoname, to_dsoname, mispred,
 predicted, in_tx, abort, cycles.

  key  : brstacksym
  items: from, to, pred, in_tx, abort (converted string)

  key  : datasrc
  key  : datasrc_decode (decoded string)
  key  : iregs
  key  : uregs
  key  : weight
  key  : transaction

  v2:
  ---
  Add new fields for dso.
  Use PyBool_FromLong() for mispred/predicted/in_tx/abort

Committer notes:

!sym->name isn't a valid test, as name is not a pointer but a [0] array;
use !sym->name[0] instead, guaranteed to be the case by symbol__new.

This was caught by just one of the containers:

  5254.22 ubuntu:17.04  : FAIL gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406

CC   /tmp/build/perf/util/scripting-engines/trace-event-python.o
  util/scripting-engines/trace-event-python.c:534:20: error: address of array 'sym->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
  if (!sym || !sym->name)
~~^~~~
  1 error generated.
  mv: cannot stat '/tmp/build/perf/util/scripting-engines/.trace-event-python.o.tmp': No such file or directory
  /git/linux/tools/build/Makefile.build:96: recipe for target '/tmp/build/perf/util/scripting-engines/trace-event-python.o' failed
  make[5]: *** [/tmp/build/perf/util/scripting-engines/trace-event-python.o] Error 1
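
The point of the committer note can be reproduced with a minimal sketch. The struct and helpers below are hypothetical (mock_ prefix) and use a C99 flexible array member in place of the [0] array mentioned above; the allocation scheme is an assumption and not a copy of symbol__new. Because the name is stored inline, the expression sym->name is an address inside the allocation and is never NULL, so only the first character can tell whether a name is present.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a symbol whose name is stored inline. */
struct mock_symbol {
	unsigned long start;
	char name[];	/* flexible array member: storage follows the struct */
};

/* Allocate the struct and the name in one block (assumed scheme). */
static struct mock_symbol *mock_symbol__new(unsigned long start, const char *name)
{
	size_t len = strlen(name) + 1;
	struct mock_symbol *sym = malloc(sizeof(*sym) + len);

	if (sym) {
		sym->start = start;
		memcpy(sym->name, name, len);
	}
	return sym;
}

/*
 * True when there is no symbol or its name is the empty string.
 * Testing "!sym->name" here would always be false for a non-NULL sym.
 */
static bool mock_symbol__unnamed(const struct mock_symbol *sym)
{
	return !sym || !sym->name[0];
}
```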

Signed-off-by: Jin Yao 
Reviewed-by: Andi Kleen 
Cc: Alexander Shishkin 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1527843663-32288-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 .../util/scripting-engines/trace-event-python.c| 227 -
 1 file changed, 226 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index f863e96fb7bc..46e9e19ab1ac 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -48,6 +48,7 @@
 #include "cpumap.h"
 #include "print_binary.h"
 #include "stat.h"
+#include "mem-events.h"
 
 #if PY_MAJOR_VERSION < 3
 #define _PyUnicode_FromString(arg) \
@@ -455,6 +456,166 @@ exit:
return pylist;
 }
 
+static PyObject *python_process_brstack(struct perf_sample *sample,
+   struct thread *thread)
+{
+   struct branch_stack *br = sample->branch_stack;
+   PyObject *pylist;
+   u64 i;
+
+   pylist = PyList_New(0);
+   if (!pylist)
+   Py_FatalError("couldn't create Python list");
+
+   if (!(br && br->nr))
+   goto exit;
+
+   for (i = 0; i < br->nr; i++) {
+   PyObject *pyelem;
+   struct addr_location al;
+   const char *dsoname;
+
+   pyelem = PyDict_New();
+   if (!pyelem)
+   Py_FatalError("couldn't create Python dictionary");
+
+   pydict_set_item_string_decref(pyelem, "from",
+   PyLong_FromUnsignedLongLong(br->entries[i].from));
+   pydict_set_item_string_decref(pyelem, "to",
+   PyLong_FromUnsignedLongLong(br->entries[i].to));
+   pydict_set_item_string_decref(pyelem, "mispred",
+   PyBool_FromLong(br->entries[i].flags.mispred));
+   pydict_set_item_string_decref(pyelem, "predicted",
+   PyBool_FromLong(br->entries[i].flags.predicted));
+   pydict_set_item_string_decref(pyelem, "in_tx",
+   PyBool_FromLong(br->entries[i].flags.in_tx));
+   pydict_set_item_string_decref(pyelem, "abort",
+   PyBool_FromLong(br->entries[i].flags.abort));
+   pydict_set_item_string_decref(pyelem, "cycles",
+   PyLong_FromUnsignedLongLong(br->entries[i].flags.cycles));
+
+   thread__find_map(thread, sample->cpumode,
+br->entries[i].from, &al);
+   dsoname = get_dsoname(al.map);
+   pydict_set_item_string_decref(pyelem, "from_dsoname",
+ 


[tip:perf/urgent] perf script python: Add dict fields introduction to Documentation

2018-06-07 Thread tip-bot for Jin Yao
Commit-ID:  ac56aa4549cdfd9c56387b35e99e3c868cfc7bd0
Gitweb: https://git.kernel.org/tip/ac56aa4549cdfd9c56387b35e99e3c868cfc7bd0
Author: Jin Yao 
AuthorDate: Fri, 1 Jun 2018 17:01:03 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 6 Jun 2018 15:40:10 -0300

perf script python: Add dict fields introduction to Documentation

Add a brief introduction about fields to perf-script-python.txt.

It should help Python script developers easily find which fields are
supported.

Signed-off-by: Jin Yao 
Reviewed-by: Andi Kleen 
Cc: Alexander Shishkin 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1527843663-32288-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-script-python.txt | 26 +
 1 file changed, 26 insertions(+)

diff --git a/tools/perf/Documentation/perf-script-python.txt b/tools/perf/Documentation/perf-script-python.txt
index 51ec2d20068a..0fb9eda3cbca 100644
--- a/tools/perf/Documentation/perf-script-python.txt
+++ b/tools/perf/Documentation/perf-script-python.txt
@@ -610,6 +610,32 @@ Various utility functions for use with perf script:
   nsecs_str(nsecs) - returns printable string in the form secs.nsecs
   avg(total, n) - returns average given a sum and a total number of values
 
+SUPPORTED FIELDS
+
+
+Currently supported fields:
+
+ev_name, comm, pid, tid, cpu, ip, time, period, phys_addr, addr,
+symbol, dso, time_enabled, time_running, values, callchain,
+brstack, brstacksym, datasrc, datasrc_decode, iregs, uregs,
+weight, transaction, raw_buf, attr.
+
+Some fields have sub items:
+
+brstack:
+from, to, from_dsoname, to_dsoname, mispred,
+predicted, in_tx, abort, cycles.
+
+brstacksym:
+items: from, to, pred, in_tx, abort (converted string)
+
+For example,
+We can use this code to print brstack "from", "to", "cycles".
+
+if 'brstack' in dict:
+   for entry in dict['brstack']:
+   print "from %s, to %s, cycles %s" % (entry["from"], 
entry["to"], entry["cycles"])
+
 SEE ALSO
 
 linkperf:perf-script[1]


[tip:perf/core] perf annotate: Show group event string for stdio

2018-05-23 Thread tip-bot for Jin Yao
Commit-ID:  787e4da9f95fd44376b3af6fa163ac0b3a48a1fc
Gitweb: https://git.kernel.org/tip/787e4da9f95fd44376b3af6fa163ac0b3a48a1fc
Author: Jin Yao 
AuthorDate: Tue, 22 May 2018 19:38:35 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 23 May 2018 10:26:40 -0300

perf annotate: Show group event string for stdio

When the group is enabled, the first line of the tui/stdio2 output
includes the group event string, but the stdio output shows only one event.

For example,

perf record -e cycles,branches ./div
perf annotate --group --stdio

Percent |  Source code & Disassembly of div for cycles (44407 samples)
..

The first line doesn't include the event 'branches'.

With this patch, it will show the correct group event string.

perf annotate --group --stdio

Percent |  Source code & Disassembly of div for cycles, branches (44407 samples)
..
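
As a rough illustration of how such a header label can be assembled into a fixed-size buffer like the buf[512] added by this patch, the sketch below joins event names with ", ". The helper mock_group_desc() is hypothetical; the real perf_evsel__group_desc() is not shown in this message and may format its output differently. The sketch only follows the "cycles, branches" form seen in the example output.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Join nr event names with ", " into buf, truncating to fit.
 * Returns the number of characters stored, not counting the NUL.
 */
static size_t mock_group_desc(const char *const *names, size_t nr,
			      char *buf, size_t size)
{
	size_t used = 0, i;

	if (!size)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < nr && used < size - 1; i++) {
		int n = snprintf(buf + used, size - used, "%s%s",
				 i ? ", " : "", names[i]);

		if (n < 0)
			break;
		/* snprintf reports the untruncated length; clamp it. */
		if ((size_t)n < size - used)
			used += (size_t)n;
		else
			used = size - 1;
	}

	return used;
}
```

Clamping the return value of snprintf() keeps the running offset inside the buffer when the joined names do not fit.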

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1526989115-14435-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/annotate.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 6612c7f90af4..71897689dacf 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1965,6 +1965,7 @@ int symbol__annotate_printf(struct symbol *sym, struct map *map,
u64 len;
int width = symbol_conf.show_total_period ? 12 : 8;
int graph_dotted_len;
+   char buf[512];
 
filename = strdup(dso->long_name);
if (!filename)
@@ -1977,8 +1978,11 @@ int symbol__annotate_printf(struct symbol *sym, struct map *map,
 
len = symbol__size(sym);
 
-   if (perf_evsel__is_group_event(evsel))
+   if (perf_evsel__is_group_event(evsel)) {
width *= evsel->nr_members;
+   perf_evsel__group_desc(evsel, buf, sizeof(buf));
+   evsel_name = buf;
+   }
 
graph_dotted_len = printf(" %-*.*s| Source code & Disassembly of %s for %s (%" PRIu64 " samples)\n",
  width, width, symbol_conf.show_total_period ? "Period" :


[tip:perf/core] perf annotate: Support '--group' option

2018-05-23 Thread tip-bot for Jin Yao
Commit-ID:  7ebaf4890f63eb90856b76864a0847413cdf6c86
Gitweb: https://git.kernel.org/tip/7ebaf4890f63eb90856b76864a0847413cdf6c86
Author: Jin Yao 
AuthorDate: Mon, 21 May 2018 22:57:46 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 May 2018 14:41:25 -0300

perf annotate: Support '--group' option

With the '--group' option, 'perf annotate' enables group output even for
events that were not recorded as an explicit group.

For example,

  $ perf record -e cycles,branches ./div
  $ perf annotate main --stdio --group

 :Disassembly of section .text:
 :
 :004004b0 :
 :main():
 :
 :return i;
 :}
 :
 :int main(void)
 :{
0.00    0.00 :   4004b0:   push   %rbx
 :int i;
 :int flag;
 :volatile double x = 1212121212, y = 121212;
 :
 :s_randseed = time(0);
0.00    0.00 :   4004b1:   xor    %edi,%edi
 :srand(s_randseed);
0.00    0.00 :   4004b3:   mov    $0x77359400,%ebx
 :
 :return i;
 :}
 :

Without --group, only one event is reported.

  $ perf annotate main --stdio

 :Disassembly of section .text:
 :
 :004004b0 :
 :main():
 :
 :return i;
 :}
 :
 :int main(void)
 :{
0.00 :   4004b0:   push   %rbx
 :int i;
 :int flag;
 :volatile double x = 1212121212, y = 121212;
 :
 :s_randseed = time(0);
0.00 :   4004b1:   xor    %edi,%edi
 :srand(s_randseed);
0.00 :   4004b3:   mov    $0x77359400,%ebx
 :
 :return i;
 :}

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1526914666-31839-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-annotate.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 6e5d9f718154..da5704240239 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -45,6 +45,7 @@ struct perf_annotate {
bool   print_line;
bool   skip_missing;
bool   has_br_stack;
+   bool   group_set;
const char *sym_hist_filter;
const char *cpu_list;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -508,6 +509,9 @@ int cmd_annotate(int argc, const char **argv)
"Don't shorten the displayed pathnames"),
OPT_BOOLEAN(0, "skip-missing", &annotate.skip_missing,
"Skip symbols that cannot be annotated"),
+   OPT_BOOLEAN_SET(0, "group", &symbol_conf.event_group,
+   &annotate.group_set,
+   "Show event group information together"),
OPT_STRING('C', "cpu", &annotate.cpu_list, "cpu", "list of cpus to profile"),
OPT_CALLBACK(0, "symfs", NULL, "directory",
 "Look for files with symbols relative to this directory",
@@ -570,6 +574,9 @@ int cmd_annotate(int argc, const char **argv)
annotate.has_br_stack = perf_header__has_feat(&annotate.session->header,
  HEADER_BRANCH_STACK);
 
+   if (annotate.group_set)
+   perf_evlist__force_leader(annotate.session->evlist);
+
ret = symbol__annotation_init();
if (ret < 0)
goto out_delete;


[tip:perf/core] perf report: Use perf_evlist__force_leader to support '--group'

2018-05-23 Thread tip-bot for Jin Yao
Commit-ID:  a26bb0ba706aef4f42cc9377c0d4e849378574a4
Gitweb: https://git.kernel.org/tip/a26bb0ba706aef4f42cc9377c0d4e849378574a4
Author: Jin Yao 
AuthorDate: Mon, 21 May 2018 22:57:45 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 May 2018 14:41:01 -0300

perf report: Use perf_evlist__force_leader to support '--group'

Since we created a new function perf_evlist__force_leader(), remove the
old code and use that new evlist method.

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1526914666-31839-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 4c931afb2e80..ad978e3ee2b8 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -194,20 +194,11 @@ out:
return err;
 }
 
-/*
- * Events in data file are not collect in groups, but we still want
- * the group display. Set the artificial group and set the leader's
- * forced_leader flag to notify the display code.
- */
 static void setup_forced_leader(struct report *report,
struct perf_evlist *evlist)
 {
-   if (report->group_set && !evlist->nr_groups) {
-   struct perf_evsel *leader = perf_evlist__first(evlist);
-
-   perf_evlist__set_leader(evlist);
-   leader->forced_leader = true;
-   }
+   if (report->group_set)
+   perf_evlist__force_leader(evlist);
 }
 
 static int process_feature_event(struct perf_tool *tool,


[tip:perf/core] perf evlist: Introduce force_leader() method

2018-05-23 Thread tip-bot for Jin Yao
Commit-ID:  e2bdbe80a0b7dea9ba73582701b8a67c01e1da4f
Gitweb: https://git.kernel.org/tip/e2bdbe80a0b7dea9ba73582701b8a67c01e1da4f
Author: Jin Yao 
AuthorDate: Mon, 21 May 2018 22:57:44 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 May 2018 14:40:54 -0300

perf evlist: Introduce force_leader() method

For events that were not recorded in an explicit group (explicit groups are
those created with -e '{eventA,eventB}'), 'perf report' supports an option
'--group' which enables group output.

We also need to support 'perf annotate' with the same '--group'.

Create a new function perf_evlist__force_leader() which contains common
code to force setting the group leader.

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1526914666-31839-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/evlist.c | 15 +++
 tools/perf/util/evlist.h |  3 +++
 2 files changed, 18 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a59281d64368..e7a4b31a84fb 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1795,3 +1795,18 @@ bool perf_evlist__exclude_kernel(struct perf_evlist *evlist)
 
return true;
 }
+
+/*
+ * Events in data file are not collect in groups, but we still want
+ * the group display. Set the artificial group and set the leader's
+ * forced_leader flag to notify the display code.
+ */
+void perf_evlist__force_leader(struct perf_evlist *evlist)
+{
+   if (!evlist->nr_groups) {
+   struct perf_evsel *leader = perf_evlist__first(evlist);
+
+   perf_evlist__set_leader(evlist);
+   leader->forced_leader = true;
+   }
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 6c41b2f78713..dc66436add98 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -309,4 +309,7 @@ struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
union perf_event *event);
 
 bool perf_evlist__exclude_kernel(struct perf_evlist *evlist);
+
+void perf_evlist__force_leader(struct perf_evlist *evlist);
+
 #endif /* __PERF_EVLIST_H */
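
The behaviour of the new helper can be explored with a stand-alone sketch. All mock_ types and functions below are hypothetical simplifications: the event list is a plain array instead of perf's linked list, and mock_evlist__set_leader() is only an assumed approximation of what perf_evlist__set_leader() does. The nr_groups check and the forced_leader flag follow the patch above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified event selector. */
struct mock_evsel {
	const char *name;
	struct mock_evsel *leader;
	bool forced_leader;
	size_t nr_members;
};

/* Hypothetical, simplified event list backed by an array. */
struct mock_evlist {
	struct mock_evsel *entries;
	size_t nr_entries;
	int nr_groups;
};

/* Assumed behaviour: make the first event the leader of all events. */
static void mock_evlist__set_leader(struct mock_evlist *evlist)
{
	size_t i;

	if (!evlist->nr_entries)
		return;

	evlist->nr_groups = evlist->nr_entries > 1 ? 1 : 0;
	for (i = 0; i < evlist->nr_entries; i++)
		evlist->entries[i].leader = &evlist->entries[0];
	evlist->entries[0].nr_members = evlist->nr_entries;
}

/*
 * Mirrors the patch: only build an artificial group when the data file
 * has no explicit groups. The nr_entries check is an extra guard added
 * for this array-based sketch.
 */
static void mock_evlist__force_leader(struct mock_evlist *evlist)
{
	if (!evlist->nr_groups && evlist->nr_entries) {
		struct mock_evsel *leader = &evlist->entries[0];

		mock_evlist__set_leader(evlist);
		leader->forced_leader = true;
	}
}
```

An event list that already contains explicit groups is left untouched, which is why both 'perf report' and 'perf annotate' can call the helper unconditionally once '--group' has been given.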


[tip:perf/core] perf annotate: Create hotkey 'c' to show min/max cycles

2018-05-19 Thread tip-bot for Jin Yao
Commit-ID:  3e71fc0319775723adc08991ba7fbaeff1150347
Gitweb: https://git.kernel.org/tip/3e71fc0319775723adc08991ba7fbaeff1150347
Author: Jin Yao 
AuthorDate: Thu, 17 May 2018 22:58:38 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Sat, 19 May 2018 06:42:49 -0300

perf annotate: Create hotkey 'c' to show min/max cycles

In the 'perf annotate' view, a new hotkey 'c' is created for showing the
min/max cycles.

For example, after pressing 'c', the annotate view is:

  Percent│ IPC       Cycle(min/max)
         │
         │
         │                             Disassembly of section .text:
         │
         │                             0003aab0 :
    8.22 │3.92                           sub    $0x18,%rsp
         │3.92                           mov    $0x1,%esi
         │3.92                           xor    %eax,%eax
         │3.92                           cmpl   $0x0,argp_program_version_hook@@G
         │3.92         1(2/1)          ↓ je     20
         │                               lock   cmpxchg %esi,__abort_msg@@GLIBC_P
         │                             ↓ jne    29
         │                             ↓ jmp    43
         │1.10                     20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+
    8.93 │1.10         1(5/1)          ↓ je     43

Pressing 'c' again switches the annotate view back:

  Percent│ IPC Cycle
         │
         │
         │                Disassembly of section .text:
         │
         │                0003aab0 :
    8.22 │3.92              sub    $0x18,%rsp
         │3.92              mov    $0x1,%esi
         │3.92              xor    %eax,%eax
         │3.92              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x
         │3.92     1      ↓ je     20
         │                  lock   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
         │                ↓ jne    29
         │                ↓ jmp    43
         │1.10        20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
    8.93 │1.10     1      ↓ je     43

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1526569118-14217-3-git-send-email-yao@linux.intel.com
[ Rename all maxmin to minmax ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
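
A minimal sketch of the toggle described above, assuming a cycles column that prints either the average alone or the average followed by "(min/max)". `struct view_options`, `handle_key()` and `format_cycles()` are hypothetical names used for illustration; only the `show_minmax_cycle` flag comes from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the annotation options used by the browser. */
struct view_options {
	bool show_minmax_cycle;
};

/* The 'c' hotkey flips between the plain and the min/max cycle view. */
static void handle_key(struct view_options *opts, int key)
{
	assert(opts != NULL);
	if (key == 'c')
		opts->show_minmax_cycle = !opts->show_minmax_cycle;
}

/* Format the cycles column as "avg" or "avg(min/max)"; returns the length. */
static int format_cycles(char *buf, size_t size, const struct view_options *opts,
			 uint64_t avg, uint64_t min, uint64_t max)
{
	assert(buf != NULL && opts != NULL);
	if (opts->show_minmax_cycle)
		return snprintf(buf, size, "%" PRIu64 "(%" PRIu64 "/%" PRIu64 ")",
				avg, min, max);
	return snprintf(buf, size, "%" PRIu64, avg);
}
```

Because the min/max form is wider than the plain form, the column widths have to be recomputed after each toggle, which is why the patch calls annotation__update_column_widths() in the 'c' case.
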
 tools/perf/ui/browsers/annotate.c |  8 
 tools/perf/util/annotate.c| 37 +++--
 tools/perf/util/annotate.h|  7 ++-
 3 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 3781d74088a7..8be40fa903aa 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -695,6 +695,7 @@ static int annotate_browser__run(struct annotate_browser *browser,
 		"O Bump offset level (jump targets -> +call -> all -> cycle thru)\n"
 		"s Toggle source code view\n"
 		"t Circulate percent, total period, samples view\n"
+		"c Show min/max cycle\n"
 		"/ Search string\n"
 		"k Toggle line numbers\n"
 		"P Print to [symbol_name].annotation file.\n"
@@ -791,6 +792,13 @@ show_sup_ins:
 				notes->options->show_total_period = true;
 			annotation__update_column_widths(notes);
 			continue;
+		case 'c':
+			if (notes->options->show_minmax_cycle)
+				notes->options->show_minmax_cycle = false;
+			else
+				notes->options->show_minmax_cycle = true;
+			annotation__update_column_widths(notes);
+			continue;
 		case K_LEFT:
 		case K_ESC:
 		case 'q':
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4fcfefea3bc2..6612c7f90af4 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2498,13 +2498,38 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
 		else
 			obj__printf(obj, "%*s ", ANNOTATION__IPC_WIDTH - 1, "IPC");
 
-		if (al->cycles)
-			obj__printf(obj, "%*" PRIu64 " ",
+		if (!notes->options->show_minmax_cycle) {
+			if (al->cycles)
+				obj__printf(obj, "%*" PRIu64 " ",
 					   ANNOTATION__CYCLES_WIDTH - 1, al->cycles);
-		else if (!show_title)
-			obj__printf(obj, "%*s", ANNOTATION__CYCLES_WIDTH, " ");
-		else
-			obj__printf(obj, "%*s ", ANNOTATION__CYCLES_WIDTH - 1, "Cycle");
+			else if (!show_title)
+   

[tip:perf/core] perf annotate: Record the min/max cycles

2018-05-19 Thread tip-bot for Jin Yao
Commit-ID:  48659ebf37e5d9d23bda6dbf032bdbe9708929f1
Gitweb: https://git.kernel.org/tip/48659ebf37e5d9d23bda6dbf032bdbe9708929f1
Author: Jin Yao 
AuthorDate: Thu, 17 May 2018 22:58:37 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 18 May 2018 16:31:41 -0300

perf annotate: Record the min/max cycles

perf can already account cycles using LBRs.

For example, on Skylake:

  perf record -b ...
  perf report or perf annotate

Browsing the annotate browser then shows the average cycle count for
each program block.

For some analysis it would be useful if we could know not only the
average cycles but also the min and max cycles.

This patch records the min and max cycles.

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1526569118-14217-2-git-send-email-yao@linux.intel.com
[ Switch from max/min to min/max ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
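
The bookkeeping added by this patch can be sketched on its own as follows. The update rules mirror the hunk below (a stored minimum of 0 means "not set yet", and a sample of 0 cycles never lowers the minimum). The reduced `struct cyc_hist` and the helper names `account_cycles()` and `average_cycles()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical version of perf's per-offset cycle histogram. */
struct cyc_hist {
	uint64_t cycles_aggr;	/* sum of all samples */
	uint64_t cycles_max;
	uint64_t cycles_min;	/* 0 means "no minimum recorded yet" */
	uint32_t num_aggr;	/* number of samples */
};

/* Account one sample and keep the running min/max up to date. */
static void account_cycles(struct cyc_hist *ch, uint64_t cycles)
{
	assert(ch != NULL);
	ch->num_aggr++;
	ch->cycles_aggr += cycles;

	if (cycles > ch->cycles_max)
		ch->cycles_max = cycles;

	if (ch->cycles_min) {
		if (cycles && cycles < ch->cycles_min)
			ch->cycles_min = cycles;
	} else {
		ch->cycles_min = cycles;
	}
}

/* Average as shown in the annotate view (integer division). */
static uint64_t average_cycles(const struct cyc_hist *ch)
{
	assert(ch != NULL);
	return ch->num_aggr ? ch->cycles_aggr / ch->num_aggr : 0;
}
```

For the samples 5, 2 and 9 this yields a minimum of 2, a maximum of 9 and an average of 5.
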
 tools/perf/util/annotate.c | 14 +-
 tools/perf/util/annotate.h |  4 
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 5d74a30fe00f..4fcfefea3bc2 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -760,6 +760,15 @@ static int __symbol__account_cycles(struct annotation *notes,
 	ch[offset].num_aggr++;
 	ch[offset].cycles_aggr += cycles;
 
+	if (cycles > ch[offset].cycles_max)
+		ch[offset].cycles_max = cycles;
+
+	if (ch[offset].cycles_min) {
+		if (cycles && cycles < ch[offset].cycles_min)
+			ch[offset].cycles_min = cycles;
+	} else
+		ch[offset].cycles_min = cycles;
+
 	if (!have_start && ch[offset].have_start)
 		return 0;
 	if (ch[offset].num) {
@@ -953,8 +962,11 @@ void annotation__compute_ipc(struct annotation *notes, size_t size)
 			if (ch->have_start)
 				annotation__count_and_fill(notes, ch->start, offset, ch);
 			al = notes->offsets[offset];
-			if (al && ch->num_aggr)
+			if (al && ch->num_aggr) {
 				al->cycles = ch->cycles_aggr / ch->num_aggr;
+				al->cycles_max = ch->cycles_max;
+				al->cycles_min = ch->cycles_min;
+			}
 			notes->have_cycles = true;
 		}
 	}
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index f28a9e43421d..d50363d56f73 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -105,6 +105,8 @@ struct annotation_line {
 	int			 jump_sources;
 	float			 ipc;
 	u64			 cycles;
+	u64			 cycles_max;
+	u64			 cycles_min;
 	size_t			 privsize;
 	char			*path;
 	u32			 idx;
@@ -186,6 +188,8 @@ struct cyc_hist {
 	u64	start;
 	u64	cycles;
 	u64	cycles_aggr;
+	u64	cycles_max;
+	u64	cycles_min;
 	u32	num;
 	u32	num_aggr;
 	u8	have_start;


[tip:perf/urgent] perf annotate: Display all available events on --stdio

2018-05-15 Thread tip-bot for Jin Yao
Commit-ID:  04d2600ab669b2d44dd7920cc8a1b95c8144084c
Gitweb: https://git.kernel.org/tip/04d2600ab669b2d44dd7920cc8a1b95c8144084c
Author: Jin Yao 
AuthorDate: Wed, 9 May 2018 23:57:15 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 10 May 2018 15:19:30 -0300

perf annotate: Display all available events on --stdio

When we run the following commands:

  $ perf record -e "{cycles,branches}" ./div
  $ perf annotate main --stdio

the output shows only the first event, "cycles", and the display
format is incorrect.

 Percent |      Source code & Disassembly of div for cycles (44550 samples)
-----------------------------------------------------------------------------
         :
         :
         :
         :            Disassembly of section .text:
         :
         :            004004b0 :
         :            main():
         :
         :                    return i;
         :            }
         :
         :            int main(void)
         :            {
    0.00 :   4004b0:       push   %rbx
         :                    int i;
         :                    int flag;
         :                    volatile double x = 1212121212, y = 121212;
         :
         :                    s_randseed = time(0);
    0.00 :   4004b1:       xor    %edi,%edi
         :                    srand(s_randseed);
    0.00 :   4004b3:       mov    $0x77359400,%ebx
         :
         :                    return i;
         :            }

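The problem is that the 'nr_percent' variable is hardcoded to 1, so only
one percent column is printed. This patch raises it to al->samples_nr
when the line carries samples for more than one event.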

With this patch, the output is:

 Percent         |      Source code & Disassembly of div for cycles (44550 samples)
-------------------------------------------------------------------------------------
                 :
                 :
                 :
                 :            Disassembly of section .text:
                 :
                 :            004004b0 :
                 :            main():
                 :
                 :                    return i;
                 :            }
                 :
                 :            int main(void)
                 :            {
    0.00    0.00 :   4004b0:       push   %rbx
                 :                    int i;
                 :                    int flag;
                 :                    volatile double x = 1212121212, y = 121212;
                 :
                 :                    s_randseed = time(0);
    0.00    0.00 :   4004b1:       xor    %edi,%edi
                 :                    srand(s_randseed);
    0.00    0.00 :   4004b3:       mov    $0x77359400,%ebx
                 :
                 :                    return i;
                 :            }

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Fixes: f681d593d1ce ("perf annotate: Remove disasm__calc_percent() from disasm_line__print()")
Link: http://lkml.kernel.org/r/1525881435-4092-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
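
The effect of the fix can be illustrated with a small sketch: the number of percent columns starts at 1 and grows to the number of per-event samples attached to the line. `struct annotated_line`, `percent_columns()` and `format_percents()` are hypothetical simplifications, not the perf data structures, and the " %7.2f" column format is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_EVENTS 4	/* arbitrary limit for this sketch */

/* Hypothetical, reduced annotation line: one percentage per event. */
struct annotated_line {
	int samples_nr;
	struct { double percent; } samples[MAX_EVENTS];
};

/* One column by default, more when the line has samples for several events. */
static int percent_columns(const struct annotated_line *al)
{
	int nr_percent = 1;

	assert(al != NULL && al->samples_nr <= MAX_EVENTS);
	if (al->samples_nr > nr_percent)
		nr_percent = al->samples_nr;
	return nr_percent;
}

/* Write one " %7.2f" column per event into buf; returns the length written. */
static int format_percents(char *buf, size_t size, const struct annotated_line *al)
{
	int nr_percent = percent_columns(al);
	int len = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (int i = 0; i < nr_percent && (size_t)len < size; i++) {
		double pct = i < al->samples_nr ? al->samples[i].percent : 0.0;

		len += snprintf(buf + len, size - len, " %7.2f", pct);
	}
	return len;
}
```

With two events in a group, each line gets two percent columns instead of one, which matches the "after" output in the commit message.
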
 tools/perf/util/annotate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 536ee148bff8..5d74a30fe00f 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1263,6 +1263,9 @@ annotation_line__print(struct annotation_line *al, struct symbol *sym, u64 start
 			max_percent = sample->percent;
 	}
 
+	if (al->samples_nr > nr_percent)
+		nr_percent = al->samples_nr;
+
 	if (max_percent < min_pcnt)
 		return -1;
 


[tip:perf/urgent] perf version: Print status for syscall_table

2018-04-16 Thread tip-bot for Jin Yao
Commit-ID:  8a812bf552d98f6f887f860d3910f201b4a97b26
Gitweb: https://git.kernel.org/tip/8a812bf552d98f6f887f860d3910f201b4a97b26
Author: Jin Yao 
AuthorDate: Mon, 9 Apr 2018 18:26:49 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 12 Apr 2018 10:33:34 -0300

perf version: Print status for syscall_table

With this patch, the "libaudit" line is no longer printed when
HAVE_SYSCALL_TABLE_SUPPORT is defined, and a new line reports the status
of HAVE_SYSCALL_TABLE_SUPPORT.

For example,

  $ ./perf -vv
  perf version 4.13.rc5.gc2f8af9
                   dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                   glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                    gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
           syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                  libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
               libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
               get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT

The line "syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT" is
newly added.

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1523269609-28824-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
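
A self-contained sketch of the behaviour described above, assuming the "name: [ on  ]  # MACRO" layout from the sample output. `format_status()`, the `*_ON` helper macros and this reduced `library_status()` are hypothetical; the real code uses the STATUS() macro shown in the hunk below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Turn "is the feature macro defined?" into a value usable at run time. */
#ifdef HAVE_SYSCALL_TABLE_SUPPORT
#define SYSCALL_TABLE_ON 1
#else
#define SYSCALL_TABLE_ON 0
#endif

#ifdef HAVE_LIBAUDIT_SUPPORT
#define LIBAUDIT_ON 1
#else
#define LIBAUDIT_ON 0
#endif

/* Format one status line: right-aligned name, on/OFF flag, macro name. */
static int format_status(char *buf, size_t size, const char *name,
			 const char *macro, bool on)
{
	assert(buf != NULL && name != NULL && macro != NULL);
	return snprintf(buf, size, "%22s: [ %-3s ]  # %s",
			name, on ? "on" : "OFF", macro);
}

/* Print the status lines; returns how many lines were printed. */
static int library_status(FILE *out)
{
	char line[128];
	int lines = 0;

	assert(out != NULL);
#ifndef HAVE_SYSCALL_TABLE_SUPPORT
	/* libaudit only matters when the built-in syscall table is absent. */
	format_status(line, sizeof(line), "libaudit", "HAVE_LIBAUDIT_SUPPORT",
		      LIBAUDIT_ON);
	fprintf(out, "%s\n", line);
	lines++;
#endif
	format_status(line, sizeof(line), "syscall_table",
		      "HAVE_SYSCALL_TABLE_SUPPORT", SYSCALL_TABLE_ON);
	fprintf(out, "%s\n", line);
	lines++;
	return lines;
}
```

Building the sketch with -DHAVE_SYSCALL_TABLE_SUPPORT drops the libaudit line and reports syscall_table as "on"; without the flag both lines are printed.
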
 tools/perf/builtin-version.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c
index 2abe3910d6b6..50df168be326 100644
--- a/tools/perf/builtin-version.c
+++ b/tools/perf/builtin-version.c
@@ -60,7 +60,10 @@ static void library_status(void)
 	STATUS(HAVE_DWARF_GETLOCATIONS_SUPPORT, dwarf_getlocations);
 	STATUS(HAVE_GLIBC_SUPPORT, glibc);
 	STATUS(HAVE_GTK2_SUPPORT, gtk2);
+#ifndef HAVE_SYSCALL_TABLE_SUPPORT
 	STATUS(HAVE_LIBAUDIT_SUPPORT, libaudit);
+#endif
+	STATUS(HAVE_SYSCALL_TABLE_SUPPORT, syscall_table);
 	STATUS(HAVE_LIBBFD_SUPPORT, libbfd);
 	STATUS(HAVE_LIBELF_SUPPORT, libelf);
 	STATUS(HAVE_LIBNUMA_SUPPORT, libnuma);


[tip:perf/urgent] perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT

2018-04-16 Thread tip-bot for Jin Yao
Commit-ID:  22e9af4e94801bbdf6945e55db64b877be7c71b3
Gitweb: https://git.kernel.org/tip/22e9af4e94801bbdf6945e55db64b877be7c71b3
Author: Jin Yao 
AuthorDate: Mon, 9 Apr 2018 18:26:48 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 12 Apr 2018 10:33:31 -0300

perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT

To be consistent with other HAVE_XXX_SUPPORT uses in Makefile.config,
this patch renames HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT and
updates the C code accordingly.

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1523269609-28824-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
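
The guard that this rename touches follows a common pattern: a command is compiled into the command table only when one of the feature macros is defined. A minimal sketch of that pattern, with a hypothetical two-entry table and handler stubs (`TRACE_AVAILABLE`, `struct cmd`, `find_command()` and the stub handlers are illustrative names, not perf code):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "trace" is available with either libaudit or the built-in syscall table. */
#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE_SUPPORT)
#define TRACE_AVAILABLE 1
#else
#define TRACE_AVAILABLE 0
#endif

struct cmd {
	const char *name;
	int (*fn)(int argc, const char **argv);
};

/* Stub handlers standing in for the real command implementations. */
static int cmd_test(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	return 0;
}

#if TRACE_AVAILABLE
static int cmd_trace(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	return 0;
}
#endif

static const struct cmd commands[] = {
	{ "test", cmd_test },
#if TRACE_AVAILABLE
	{ "trace", cmd_trace },
#endif
};

/* Look a command up by name; NULL when it was not compiled in. */
static const struct cmd *find_command(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(commands) / sizeof(commands[0]); i++)
		if (strcmp(commands[i].name, name) == 0)
			return &commands[i];
	return NULL;
}
```

A macro that is renamed in the Makefile but not in such a guard silently removes the command from the build, which is why the patch updates every user of the old name together.
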
 tools/perf/Makefile.config  | 2 +-
 tools/perf/builtin-help.c   | 2 +-
 tools/perf/perf.c   | 4 ++--
 tools/perf/util/generate-cmdlist.sh | 2 +-
 tools/perf/util/syscalltbl.c| 6 +++---
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 6b307e97dc57..ae7dc46e8f8a 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -68,7 +68,7 @@ ifeq ($(NO_PERF_REGS),0)
 endif
 
 ifneq ($(NO_SYSCALL_TABLE),1)
-  CFLAGS += -DHAVE_SYSCALL_TABLE
+  CFLAGS += -DHAVE_SYSCALL_TABLE_SUPPORT
 endif
 
 # So far there's only x86 and arm libdw unwind support merged in perf.
diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 4aca13f23b9d..1c41b4eaf73c 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -439,7 +439,7 @@ int cmd_help(int argc, const char **argv)
 #ifdef HAVE_LIBELF_SUPPORT
 		"probe",
 #endif
-#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)
+#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE_SUPPORT)
 		"trace",
 #endif
 	NULL };
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 1659029d03fc..20a08cb32332 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -73,7 +73,7 @@ static struct cmd_struct commands[] = {
 	{ "lock",	cmd_lock,	0 },
 	{ "kvm",	cmd_kvm,	0 },
 	{ "test",	cmd_test,	0 },
-#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)
+#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE_SUPPORT)
 	{ "trace",	cmd_trace,	0 },
 #endif
 	{ "inject",	cmd_inject,	0 },
@@ -491,7 +491,7 @@ int main(int argc, const char **argv)
 		argv[0] = cmd;
 	}
 	if (strstarts(cmd, "trace")) {
-#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)
+#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE_SUPPORT)
 		setup_path();
 		argv[0] = "trace";
 		return cmd_trace(argc, argv);
diff --git a/tools/perf/util/generate-cmdlist.sh b/tools/perf/util/generate-cmdlist.sh
index ff17920a5ebc..c3cef36d4176 100755
--- a/tools/perf/util/generate-cmdlist.sh
+++ b/tools/perf/util/generate-cmdlist.sh
@@ -38,7 +38,7 @@ do
 done
 echo "#endif /* HAVE_LIBELF_SUPPORT */"
 
-echo "#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)"
+echo "#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE_SUPPORT)"
 sed -n -e 's/^perf-\([^]*\)[   ].* audit*/\1/p' command-list.txt |
 sort |
 while read cmd
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 895122d638dd..0ee7f568d60c 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -17,7 +17,7 @@
 #include 
 #include 
 
-#ifdef HAVE_SYSCALL_TABLE
+#ifdef HAVE_SYSCALL_TABLE_SUPPORT
 #include 
 #include "string2.h"
 #include "util.h"
@@ -139,7 +139,7 @@ int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_g
 	return syscalltbl__strglobmatch_next(tbl, syscall_glob, idx);
 }
 
-#else /* HAVE_SYSCALL_TABLE */
+#else /* HAVE_SYSCALL_TABLE_SUPPORT */
 
 #include 
 
@@ -176,4 +176,4 @@ int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_g
 {
 	return syscalltbl__strglobmatch_next(tbl, syscall_glob, idx);
 }
-#endif /* HAVE_SYSCALL_TABLE */
+#endif /* HAVE_SYSCALL_TABLE_SUPPORT */


[tip:perf/urgent] perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT

2018-04-16 Thread tip-bot for Jin Yao
Commit-ID:  22e9af4e94801bbdf6945e55db64b877be7c71b3
Gitweb: https://git.kernel.org/tip/22e9af4e94801bbdf6945e55db64b877be7c71b3
Author: Jin Yao 
AuthorDate: Mon, 9 Apr 2018 18:26:48 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 12 Apr 2018 10:33:31 -0300

perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT

To be consistent with other HAVE_XXX_SUPPORT uses in Makefile.config,
this patch renames HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT and
updates the C code accordingly.

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1523269609-28824-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Makefile.config  | 2 +-
 tools/perf/builtin-help.c   | 2 +-
 tools/perf/perf.c   | 4 ++--
 tools/perf/util/generate-cmdlist.sh | 2 +-
 tools/perf/util/syscalltbl.c| 6 +++---
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 6b307e97dc57..ae7dc46e8f8a 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -68,7 +68,7 @@ ifeq ($(NO_PERF_REGS),0)
 endif
 
 ifneq ($(NO_SYSCALL_TABLE),1)
-  CFLAGS += -DHAVE_SYSCALL_TABLE
+  CFLAGS += -DHAVE_SYSCALL_TABLE_SUPPORT
 endif
 
 # So far there's only x86 and arm libdw unwind support merged in perf.
diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index 4aca13f23b9d..1c41b4eaf73c 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -439,7 +439,7 @@ int cmd_help(int argc, const char **argv)
 #ifdef HAVE_LIBELF_SUPPORT
"probe",
 #endif
-#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)
+#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE_SUPPORT)
"trace",
 #endif
NULL };
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 1659029d03fc..20a08cb32332 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -73,7 +73,7 @@ static struct cmd_struct commands[] = {
{ "lock",   cmd_lock,   0 },
{ "kvm",cmd_kvm,0 },
{ "test",   cmd_test,   0 },
-#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)
+#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE_SUPPORT)
{ "trace",  cmd_trace,  0 },
 #endif
{ "inject", cmd_inject, 0 },
@@ -491,7 +491,7 @@ int main(int argc, const char **argv)
argv[0] = cmd;
}
if (strstarts(cmd, "trace")) {
-#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)
+#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE_SUPPORT)
setup_path();
argv[0] = "trace";
return cmd_trace(argc, argv);
diff --git a/tools/perf/util/generate-cmdlist.sh 
b/tools/perf/util/generate-cmdlist.sh
index ff17920a5ebc..c3cef36d4176 100755
--- a/tools/perf/util/generate-cmdlist.sh
+++ b/tools/perf/util/generate-cmdlist.sh
@@ -38,7 +38,7 @@ do
 done
 echo "#endif /* HAVE_LIBELF_SUPPORT */"
 
-echo "#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)"
+echo "#if defined(HAVE_LIBAUDIT_SUPPORT) || 
defined(HAVE_SYSCALL_TABLE_SUPPORT)"
 sed -n -e 's/^perf-\([^]*\)[   ].* audit*/\1/p' command-list.txt |
 sort |
 while read cmd
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 895122d638dd..0ee7f568d60c 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -17,7 +17,7 @@
 #include 
 #include 
 
-#ifdef HAVE_SYSCALL_TABLE
+#ifdef HAVE_SYSCALL_TABLE_SUPPORT
 #include 
 #include "string2.h"
 #include "util.h"
@@ -139,7 +139,7 @@ int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_g
return syscalltbl__strglobmatch_next(tbl, syscall_glob, idx);
 }
 
-#else /* HAVE_SYSCALL_TABLE */
+#else /* HAVE_SYSCALL_TABLE_SUPPORT */
 
 #include 
 
@@ -176,4 +176,4 @@ int syscalltbl__strglobmatch_first(struct syscalltbl *tbl, const char *syscall_g
 {
return syscalltbl__strglobmatch_next(tbl, syscall_glob, idx);
 }
-#endif /* HAVE_SYSCALL_TABLE */
+#endif /* HAVE_SYSCALL_TABLE_SUPPORT */


[tip:perf/urgent] perf version: Add man page

2018-04-03 Thread tip-bot for Jin Yao
Commit-ID:  709846725673a944ee38da1c275a6dfbf0576d0f
Gitweb: https://git.kernel.org/tip/709846725673a944ee38da1c275a6dfbf0576d0f
Author: Jin Yao 
AuthorDate: Fri, 30 Mar 2018 17:27:16 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 2 Apr 2018 13:52:23 -0300

perf version: Add man page

A new option, '--build-options', was added to 'perf version', so
document it in a man page.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1522402036-22915-7-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-version.txt | 24 
 1 file changed, 24 insertions(+)

diff --git a/tools/perf/Documentation/perf-version.txt b/tools/perf/Documentation/perf-version.txt
new file mode 100644
index ..e207b7cfca26
--- /dev/null
+++ b/tools/perf/Documentation/perf-version.txt
@@ -0,0 +1,24 @@
+perf-version(1)
+===============
+
+NAME
+----
+perf-version - display the version of perf binary
+
+SYNOPSIS
+--------
+'perf version' [--build-options]
+
+DESCRIPTION
+-----------
+With no options given, the 'perf version' prints the perf version
+on the standard output.
+
+If the option '--build-options' is given, then the status of
+compiled-in libraries are printed on the standard output.
+
+OPTIONS
+-------
+--build-options::
+	Prints the status of compiled-in libraries on the
+	standard output.


[tip:perf/urgent] perf version: Print the compiled-in status of libraries

2018-04-03 Thread tip-bot for Jin Yao
Commit-ID:  9ff2a64708a642b3dee867d0a083171077663b0a
Gitweb: https://git.kernel.org/tip/9ff2a64708a642b3dee867d0a083171077663b0a
Author: Jin Yao 
AuthorDate: Fri, 30 Mar 2018 17:27:14 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 2 Apr 2018 13:50:30 -0300

perf version: Print the compiled-in status of libraries

This patch checks the values passed by CFLAGS (-DHAVE_XXX) and then
prints the status of the libraries.

For example, if HAVE_DWARF_SUPPORT is defined, the library "dwarf" is
compiled in. The patch prints the status "on" for this library;
otherwise it prints the status "OFF".

A new option, '--build-options', is created for 'perf version' to
support printing the library status.

For example:

$ ./perf version --build-options
or
  ./perf --version --build-options
or
  ./perf -v --build-options

perf version 4.13.rc5.g6727c5
                 dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
    dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                 glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                  gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
              libaudit: [ OFF ]  # HAVE_LIBAUDIT_SUPPORT
                libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
               libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
               libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
             libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
              libslang: [ on  ]  # HAVE_SLANG_SUPPORT
             libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
             libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
    libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                  zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                  lzma: [ on  ]  # HAVE_LZMA_SUPPORT
             get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                   bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT

v4:

1. Also print the macro name. That makes it easier
   to grep around in the source looking for where code
   related to a particular feature is located.

2. Update since HAVE_DWARF_GETLOCATIONS is renamed to
   HAVE_DWARF_GETLOCATIONS_SUPPORT

v3:

Remove following unnecessary help message.

1. [ on  ]: library is compiled-in
   [ OFF ]: library is disabled in make configuration
OR library is not installed in build environment

2. Create '--build-options' option.

3. Use standard option parsing API 'parse_options'.

v2:

1. Use IS_BUILTIN macro to replace #ifdef/#endif block.

2. Print color for on/OFF.
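
The compile-time check described above can be sketched in isolation. This
is only an illustration, not the code from the patch: the SK_* names and
the HAVE_DEMO_* flags are hypothetical, and the placeholder trick shown is
one common way to turn "is this flag defined to 1" into a 0/1 expression,
assuming the flag is passed the way -DHAVE_XXX passes it. Colour output is
left out.

```c
#include <stdio.h>

/* Stand-in for a flag that a build would pass as -DHAVE_DEMO_A_SUPPORT. */
#define HAVE_DEMO_A_SUPPORT 1
/* HAVE_DEMO_B_SUPPORT is deliberately left undefined. */

/*
 * Placeholder trick: if the flag expands to 1, SK_PLACEHOLDER_1 expands to
 * "0," which shifts a literal 1 into the second argument slot. Otherwise a
 * junk token stays glued to the first slot and the second argument is 0.
 */
#define SK_PLACEHOLDER_1 0,
#define SK_SECOND(first, second, ...) second
#define SK_DEFINED_2(arg_or_junk) SK_SECOND(arg_or_junk 1, 0, 0)
#define SK_DEFINED_1(val) SK_DEFINED_2(SK_PLACEHOLDER_##val)
#define SK_IS_DEFINED(flag) SK_DEFINED_1(flag)

/* Format one status line: right-aligned name, on/OFF, then the macro. */
static int sk_status_line(char *buf, size_t len, const char *name,
			  const char *macro, int builtin)
{
	return snprintf(buf, len, "%22s: [ %-3s ]  # %s", name,
			builtin ? "on" : "OFF", macro);
}

/* #flag keeps the macro's spelling; the bare use of flag is expanded. */
#define SK_STATUS(buf, len, flag, name) \
	sk_status_line(buf, len, #name, #flag, SK_IS_DEFINED(flag))
```

Calling SK_STATUS(buf, sizeof(buf), HAVE_DEMO_A_SUPPORT, demo_a) fills buf
with an "on" line, while the same call with the undefined
HAVE_DEMO_B_SUPPORT yields "OFF", without any #ifdef at the call site.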

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Suggested-by: Ingo Molnar 
Suggested-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1522402036-22915-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-version.c | 82 +++-
 1 file changed, 81 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c
index 37019c5d675f..2abe3910d6b6 100644
--- a/tools/perf/builtin-version.c
+++ b/tools/perf/builtin-version.c
@@ -1,11 +1,91 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "builtin.h"
 #include "perf.h"
+#include "color.h"
 #include 
+#include 
 #include 
+#include 
+#include 
 
-int cmd_version(int argc __maybe_unused, const char **argv __maybe_unused)
+int version_verbose;
+
+struct version {
+   bool build_options;
+};
+
+static struct version version;
+
+static struct option version_options[] = {
+   OPT_BOOLEAN(0, "build-options", &version.build_options,
+   "display the build options"),
+};
+
+static const char * const version_usage[] = {
+   "perf version [<options>]",
+   NULL
+};
+
+static void on_off_print(const char *status)
+{
+   printf("[ ");
+
+   if (!strcmp(status, "OFF"))
+   color_fprintf(stdout, PERF_COLOR_RED, "%-3s", status);
+   else
+   color_fprintf(stdout, PERF_COLOR_GREEN, "%-3s", status);
+
+   printf(" ]");
+}
+
+static void status_print(const char *name, const char *macro,
+const char *status)
 {
+   printf("%22s: ", name);
+   on_off_print(status);
+   printf("  # %s\n", macro);
+}
+
+#define STATUS(__d, __m)   \
+do {   \
+   if (IS_BUILTIN(__d))\
+   status_print(#__m, #__d, "on"); \
+   else\
+   status_print(#__m, #__d, "OFF");\
+} while (0)
+
+static void library_status(void)
+{
+   STATUS(HAVE_DWARF_SUPPORT, dwarf);
+   STATUS(HAVE_DWARF_GETLOCATIONS_SUPPORT, dwarf_getlocations);
+   STATUS(HAVE_GLIBC_SUPPORT, glibc);
+   STATUS(HAVE_GTK2_SUPPORT, gtk2);
+ 

[tip:perf/urgent] perf tools: Add 'perf -vv' as an alias to 'perf version --build-options'

2018-04-03 Thread tip-bot for Jin Yao
Commit-ID:  3aa94b10ab0a818ed9fa2dc06c40812c136f9a5a
Gitweb: https://git.kernel.org/tip/3aa94b10ab0a818ed9fa2dc06c40812c136f9a5a
Author: Jin Yao 
AuthorDate: Fri, 30 Mar 2018 17:27:15 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 2 Apr 2018 13:50:35 -0300

perf tools: Add 'perf -vv' as an alias to 'perf version --build-options'

We keep getting bug reports from users who build perf on their own
without installing some needed libraries, such as libelf or
libbfd/libiberty.

perf still builds, but it is missing important functionality.

This patch provides a new option '-vv' for perf which will print the
compiled-in status of libraries.

The 'perf -vv' is mapped to 'perf version --build-options'.

For example:

$ ./perf -vv

perf version 4.13.rc5.g6727c5
                 dwarf: [ on  ]  # HAVE_DWARF_SUPPORT
    dwarf_getlocations: [ on  ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
                 glibc: [ on  ]  # HAVE_GLIBC_SUPPORT
                  gtk2: [ on  ]  # HAVE_GTK2_SUPPORT
              libaudit: [ OFF ]  # HAVE_LIBAUDIT_SUPPORT
                libbfd: [ on  ]  # HAVE_LIBBFD_SUPPORT
                libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
               libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
               libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
             libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
              libslang: [ on  ]  # HAVE_SLANG_SUPPORT
             libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
             libunwind: [ on  ]  # HAVE_LIBUNWIND_SUPPORT
    libdw-dwarf-unwind: [ on  ]  # HAVE_DWARF_SUPPORT
                  zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                  lzma: [ on  ]  # HAVE_LZMA_SUPPORT
             get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                   bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT

v3:

v2 did not process options such as '-vabc' correctly. Fix this bug.

v2:

Use a global variable version_verbose to record the number of 'v'.

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1522402036-22915-6-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/perf.c | 6 ++
 tools/perf/perf.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 1b3fc8ec0fa2..1659029d03fc 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -190,6 +190,12 @@ static int handle_options(const char ***argv, int *argc, int *envchanged)
break;
}
 
+   if (!strcmp(cmd, "-vv")) {
+   (*argv)[0] = "version";
+   version_verbose = 1;
+   break;
+   }
+
/*
 * Check remaining flags.
 */
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index 8fec1abd0f1f..a1a97956136f 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -84,6 +84,7 @@ struct record_opts {
 struct option;
 extern const char * const *record_usage;
 extern struct option *record_options;
+extern int version_verbose;
 
 int record__parse_freq(const struct option *opt, const char *str, int unset);
 #endif


[tip:perf/urgent] perf config: Rename to HAVE_DWARF_GETLOCATIONS_SUPPORT

2018-04-03 Thread tip-bot for Jin Yao
Commit-ID:  a36ebe4e242a2f6818f424b03a5e8dae3964e458
Gitweb: https://git.kernel.org/tip/a36ebe4e242a2f6818f424b03a5e8dae3964e458
Author: Jin Yao 
AuthorDate: Fri, 30 Mar 2018 17:27:13 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 2 Apr 2018 13:50:24 -0300

perf config: Rename to HAVE_DWARF_GETLOCATIONS_SUPPORT

To give all library flags in Makefile.config the _SUPPORT suffix,
rename HAVE_DWARF_GETLOCATIONS to HAVE_DWARF_GETLOCATIONS_SUPPORT.
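
A rename like this has to touch both the -D flag and every guard that
tests it, because a guard left on the old spelling does not fail to
build; it just compiles the other branch. The sketch below illustrates
that with hypothetical HAVE_DEMO_* names and sk_* helpers; nothing in it
is taken from the perf sources.

```c
/* Pretend the build system passed -DHAVE_DEMO_GETLOCATIONS_SUPPORT. */
#define HAVE_DEMO_GETLOCATIONS_SUPPORT 1

/* Guard spelled with the new name: the real branch is compiled in. */
#ifdef HAVE_DEMO_GETLOCATIONS_SUPPORT
static const char *sk_new_guard(void) { return "enabled"; }
#else
static const char *sk_new_guard(void) { return "disabled"; }
#endif

/* Guard still using the old name: silently falls back to the stub. */
#ifdef HAVE_DEMO_GETLOCATIONS
static const char *sk_old_guard(void) { return "enabled"; }
#else
static const char *sk_old_guard(void) { return "disabled"; }
#endif
```

Both functions compile without warnings, which is why the flag definition
and its users are changed together in one patch.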

Signed-off-by: Jin Yao 
Suggested-by: Ingo Molnar 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1522402036-22915-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Makefile.config  | 2 +-
 tools/perf/util/dwarf-aux.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index deb8fba2f4f1..c7abd83a8e19 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -346,7 +346,7 @@ else
   ifneq ($(feature-dwarf_getlocations), 1)
 msg := $(warning Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157);
   else
-CFLAGS += -DHAVE_DWARF_GETLOCATIONS
+CFLAGS += -DHAVE_DWARF_GETLOCATIONS_SUPPORT
   endif # dwarf_getlocations
 endif # Dwarf support
   endif # libelf support
diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index f5acda13dcfa..7eb7de5aee44 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -979,7 +979,7 @@ int die_get_varname(Dwarf_Die *vr_die, struct strbuf *buf)
return ret < 0 ? ret : strbuf_addf(buf, "\t%s", dwarf_diename(vr_die));
 }
 
-#ifdef HAVE_DWARF_GETLOCATIONS
+#ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT
 /**
  * die_get_var_innermost_scope - Get innermost scope range of given variable DIE
  * @sp_die: a subprogram DIE


[tip:perf/core] perf annotate: Support to display the IPC/Cycle in TUI mode

2018-03-09 Thread tip-bot for Jin Yao
Commit-ID:  bb848c14f80d93059cb10b1e1446cc6823d77142
Gitweb: https://git.kernel.org/tip/bb848c14f80d93059cb10b1e1446cc6823d77142
Author: Jin Yao 
AuthorDate: Tue, 27 Feb 2018 17:38:47 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 8 Mar 2018 11:30:52 -0300

perf annotate: Support to display the IPC/Cycle in TUI mode

Unlike the interactive annotate mode of perf report, perf annotate
doesn't display the IPC/Cycle columns even if branch info is recorded
in the perf data file.

perf record -b ...
perf annotate function

It should show IPC/cycle, but it doesn't.

This patch lets perf annotate support the displaying of IPC/Cycle if
branch info is in perf data.
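
As a rough illustration of what the two columns mean, the sketch below
derives an IPC value for one basic block from cycle counts gathered from
branch records. It is an assumption about the arithmetic, not code from
this patch: struct sk_block_cycles and both helpers are hypothetical, and
IPC is taken to be the number of instructions in the block divided by the
average cycles measured for it.

```c
/* Hypothetical per-block counters accumulated from branch samples. */
struct sk_block_cycles {
	unsigned int  n_insn;  /* instructions in the basic block */
	unsigned long cycles;  /* cycles summed over all samples */
	unsigned int  num;     /* number of samples covering the block */
};

/* Average cycles per traversal of the block; 0 when never sampled. */
static double sk_avg_cycles(const struct sk_block_cycles *b)
{
	return b->num ? (double)b->cycles / b->num : 0.0;
}

/* IPC = instructions in the block / average cycles spent in it. */
static double sk_block_ipc(const struct sk_block_cycles *b)
{
	double avg = sk_avg_cycles(b);

	return avg > 0.0 ? b->n_insn / avg : 0.0;
}
```

Under this assumption every instruction in a block shares one IPC value,
which matches the repeated figures in the example output.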

For example,

  perf annotate compute_flag

  Percent│ IPC Cycle
 │
 │
 │Disassembly of section .text:
 │
 │00400640 :
 │compute_flag():
 │volatile int count;
 │static unsigned int s_randseed;
 │
 │__attribute__((noinline))
 │int compute_flag()
 │{
   22.96 │1.18   584sub$0x8,%rsp
 │int i;
 │
 │i = rand() % 2;
   23.02 │1.18 1  → callq  rand@plt
 │
 │return i;
   27.05 │3.37  mov%eax,%edx
 │}
 │3.37  add$0x8,%rsp
 │{
 │int i;
 │
 │i = rand() % 2;
 │
 │return i;
 │3.37  shr$0x1f,%edx
 │3.37  add%edx,%eax
 │3.37  and$0x1,%eax
 │3.37  sub%edx,%eax
 │}
   26.97 │3.37 2  ← retq

Note that this patch only supports TUI mode. For stdio, the original
behavior is kept for now; support will be added in a follow-up patch.

  $ perf annotate compute_flag --stdio

   Percent |  Source code & Disassembly of div for cycles:ppp (7993 samples)
  --
   :
   :
   :
   :Disassembly of section .text:
   :
   :00400640 :
   :compute_flag():
   :volatile int count;
   :static unsigned int s_randseed;
   :
   :__attribute__((noinline))
   :int compute_flag()
   :{
  0.29 :   400640:   sub$0x8,%rsp # +100.00%
   :int i;
   :
   :i = rand() % 2;
 42.93 :   400644:   callq  400490  # -100.00% (p:100.00%)
   :
   :return i;
  0.10 :   400649:   mov%eax,%edx # +100.00%
   :}
  0.94 :   40064b:   add$0x8,%rsp
   :{
   :int i;
   :
   :i = rand() % 2;
   :
   :return i;
 27.02 :   40064f:   shr$0x1f,%edx
  0.15 :   400652:   add%edx,%eax
  1.24 :   400654:   and$0x1,%eax
  2.08 :   400657:   sub%edx,%eax
   :}
 25.26 :   400659:   retq # -100.00% (p:100.00%)

Signed-off-by: Jin Yao 
Acked-by: Andi Kleen 
Link: http://lkml.kernel.org/r/20180223170210.gc7...@tassilo.jf.intel.com
Cc: Alexander Shishkin 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1519724327-7773-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-annotate.c | 88 ---
 1 file changed, 82 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index f15731a3d438..ead6ae4549e5 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -44,6 +44,7 @@ struct perf_annotate {
bool   full_paths;
bool   print_line;
bool   skip_missing;
+   bool   has_br_stack;
const char *sym_hist_filter;
const char *cpu_list;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -146,16 +147,73 @@ static void process_branch_stack(struct branch_stack *bs, struct addr_location *
free(bi);
 }
 
+static int hist_iter__branch_callback(struct hist_entry_iter *iter,
+ struct addr_location *al __maybe_unused,
+ bool single __maybe_unused,
+ void *arg __maybe_unused)
+{
+   struct hist_entry *he = iter->he;
+   struct branch_info *bi;
+   struct 

[tip:perf/core] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-03-05 Thread tip-bot for Jin Yao
Commit-ID:  ab6c79b819f5a50cf41a11ebec17bef63b530333
Gitweb: https://git.kernel.org/tip/ab6c79b819f5a50cf41a11ebec17bef63b530333
Author: Jin Yao 
AuthorDate: Tue, 16 Jan 2018 23:43:08 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 27 Feb 2018 11:29:21 -0300

perf stat: Ignore error thread when enabling system-wide --per-thread

If we execute 'perf stat --per-thread' as a non-root user (even with
kernel.perf_event_paranoid = -1 set), it reports the error:

  jinyao@skl:~$ perf stat --per-thread
  Error:
  You may not have permission to collect system-wide stats.

  Consider tweaking /proc/sys/kernel/perf_event_paranoid,
  which controls use of the performance events system by
  unprivileged users (without CAP_SYS_ADMIN).

  The current value is 2:

-1: Allow use of (almost) all events by all users
Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
  >= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
Disallow raw tracepoint access by users without CAP_SYS_ADMIN
  >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
  >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN

  To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

  kernel.perf_event_paranoid = -1

Perhaps the ptrace rules don't allow tracing some processes. In any
case, the global --per-thread mode should ignore such errors and
continue working on the other threads.

This patch records the index of the failing thread in perf_evsel__open()
and removes this thread before retrying.
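
The record-remove-retry idea can be sketched on its own. The code below
is a simplified model, not the perf implementation: struct sk_thread_map
and the sk_* functions are hypothetical, the fixed-size array is chosen
for brevity, and the "odd pids cannot be opened" rule merely simulates a
permission failure.

```c
#define SK_MAX_THREADS 8

/* Hypothetical, simplified thread map; perf's real structure differs. */
struct sk_thread_map {
	int nr;          /* number of valid entries in pid[] */
	int err_thread;  /* index that failed to open, or -1 */
	int pid[SK_MAX_THREADS];
};

/* Drop entry idx by shifting the tail down. Returns 0 on success. */
static int sk_thread_map_remove(struct sk_thread_map *map, int idx)
{
	int i;

	if (idx < 0 || idx >= map->nr)
		return -1;
	for (i = idx; i < map->nr - 1; i++)
		map->pid[i] = map->pid[i + 1];
	map->nr--;
	return 0;
}

/* Stand-in for the open step: pretend odd pids may not be traced. */
static int sk_open_all(struct sk_thread_map *map)
{
	int thread;

	for (thread = 0; thread < map->nr; thread++) {
		if (map->pid[thread] % 2) {
			map->err_thread = thread; /* remember who failed */
			return -1;
		}
	}
	return 0;
}

/* Retry until every remaining thread opens or removal fails. */
static int sk_open_skipping_errors(struct sk_thread_map *map)
{
	while (sk_open_all(map) < 0) {
		if (map->err_thread == -1 ||
		    sk_thread_map_remove(map, map->err_thread))
			return -1;
		map->err_thread = -1; /* reset before the next attempt */
	}
	return 0;
}
```

Each failed attempt shrinks the map by one entry, so the loop ends after
at most as many retries as there were threads to begin with.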

For example (run as a non-root user, without setting kernel.perf_event_paranoid):

  jinyao@skl:~$ perf stat --per-thread
  ^C
   Performance counter stats for 'system wide':

         vmstat-3458           6.171984      cpu-clock:u (msec)   #    0.000 CPUs utilized
           perf-3670           0.515599      cpu-clock:u (msec)   #    0.000 CPUs utilized
         vmstat-3458          1,163,643      cycles:u             #    0.189 GHz
           perf-3670             40,881      cycles:u             #    0.079 GHz
         vmstat-3458          1,410,238      instructions:u       #    1.21  insn per cycle
           perf-3670              3,536      instructions:u       #    0.09  insn per cycle
         vmstat-3458            288,937      branches:u           #   46.814 M/sec
           perf-3670                936      branches:u           #    1.815 M/sec
         vmstat-3458             15,195      branch-misses:u      #    5.26% of all branches
           perf-3670                 76      branch-misses:u      #    8.12% of all branches

12.651675247 seconds time elapsed

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1516117388-10120-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-stat.c| 14 +-
 tools/perf/util/evsel.c  |  3 +++
 tools/perf/util/thread_map.c |  1 +
 tools/perf/util/thread_map.h |  1 +
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index fadcff52cd09..6214d2b220b2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -637,7 +637,19 @@ try_again:
 if (verbose > 0)
 ui__warning("%s\n", msg);
 goto try_again;
-}
+   } else if (target__has_per_thread(&target) &&
+  evsel_list->threads &&
+  evsel_list->threads->err_thread != -1) {
+   /*
+* For global --per-thread case, skip current
+* error thread.
+*/
+   if (!thread_map__remove(evsel_list->threads,
+   evsel_list->threads->err_thread)) {
+   evsel_list->threads->err_thread = -1;
+   goto try_again;
+   }
+   }
 
perf_evsel__open_strerror(counter, &target,
  errno, msg, sizeof(msg));
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ef351688b797..b56e1c2ddaee 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1915,6 +1915,9 @@ try_fallback:
goto fallback_missing_features;
}
 out_close:
+   if (err)
+   threads->err_thread = thread;
+
do {
while (--thread >= 0) {
close(FD(evsel, cpu, thread));
diff 

[tip:perf/core] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-03-05 Thread tip-bot for Jin Yao
Commit-ID:  ab6c79b819f5a50cf41a11ebec17bef63b530333
Gitweb: https://git.kernel.org/tip/ab6c79b819f5a50cf41a11ebec17bef63b530333
Author: Jin Yao 
AuthorDate: Tue, 16 Jan 2018 23:43:08 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 27 Feb 2018 11:29:21 -0300

perf stat: Ignore error thread when enabling system-wide --per-thread

If we run 'perf stat --per-thread' as a non-root user (even with
kernel.perf_event_paranoid set to -1), it reports the error:

  jinyao@skl:~$ perf stat --per-thread
  Error:
  You may not have permission to collect system-wide stats.

  Consider tweaking /proc/sys/kernel/perf_event_paranoid,
  which controls use of the performance events system by
  unprivileged users (without CAP_SYS_ADMIN).

  The current value is 2:

-1: Allow use of (almost) all events by all users
Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
  >= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
Disallow raw tracepoint access by users without CAP_SYS_ADMIN
  >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
  >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN

  To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

  kernel.perf_event_paranoid = -1
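
The level table in the message above can be read as a set of thresholds. The following is a minimal sketch of that reading for a user without CAP_SYS_ADMIN; the enum and the function name are hypothetical, and the real checks live in the kernel, not in this form.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three restrictions listed in the message. */
enum demo_access {
	DEMO_RAW_TRACEPOINT,    /* ftrace function / raw tracepoint access */
	DEMO_CPU_EVENT,         /* CPU event access */
	DEMO_KERNEL_PROFILING   /* kernel profiling */
};

/*
 * Returns true if a user without CAP_SYS_ADMIN may use the given kind of
 * access at the given perf_event_paranoid level, following the table:
 * >= 0 blocks tracepoints, >= 1 blocks CPU events, >= 2 blocks kernel
 * profiling.
 */
static bool demo_unprivileged_may(int level, enum demo_access what)
{
	assert(what <= DEMO_KERNEL_PROFILING);

	switch (what) {
	case DEMO_RAW_TRACEPOINT:
		return level < 0;
	case DEMO_CPU_EVENT:
		return level < 1;
	case DEMO_KERNEL_PROFILING:
		return level < 2;
	}
	return false;
}
```

With the value 2 reported above, none of the three kinds of access is available to an unprivileged user; with -1 all of them are.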

Perhaps a ptrace rule does not allow tracing some processes. In any case,
the global --per-thread mode should ignore such errors and continue
working on the other threads.

This patch records the index of the failing thread in perf_evsel__open()
and removes that thread before retrying.
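
The record-then-remove-then-retry idea can be sketched in isolation. The code below is not perf's implementation: struct tid_map, the helper names and the demo predicate are hypothetical stand-ins that only mirror the err_thread bookkeeping described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for perf's thread map: a flat array of thread ids. */
struct tid_map {
	int nr;          /* number of valid entries in tid[] */
	int err_thread;  /* index of the thread that failed to open, or -1 */
	int tid[16];
};

/* Demo predicate (assumption): only even thread ids may be traced. */
static bool demo_may_trace(int tid)
{
	return tid % 2 == 0;
}

/* Remove the entry at idx by shifting the tail down. Returns 0 on success. */
static int tid_map_remove(struct tid_map *map, int idx)
{
	if (idx < 0 || idx >= map->nr)
		return -1;
	memmove(&map->tid[idx], &map->tid[idx + 1],
		(size_t)(map->nr - idx - 1) * sizeof(map->tid[0]));
	map->nr--;
	return 0;
}

/* Open step: stops at the first thread that fails and records its index. */
static int open_all(struct tid_map *map, bool (*may_trace)(int tid))
{
	for (int i = 0; i < map->nr; i++) {
		if (!may_trace(map->tid[i])) {
			map->err_thread = i;
			return -1;
		}
	}
	return 0;
}

/* Retry until every remaining thread opens, dropping the failing ones. */
static int open_skipping_errors(struct tid_map *map,
				bool (*may_trace)(int tid))
{
	assert(map->nr >= 0 && map->nr <= 16);

	while (open_all(map, may_trace) != 0) {
		if (map->err_thread == -1 ||
		    tid_map_remove(map, map->err_thread) != 0)
			return -1;
		map->err_thread = -1;  /* reset before the next attempt */
	}
	return 0;
}
```

Resetting err_thread before each retry matters: a stale index would otherwise remove a healthy thread if a later attempt failed for an unrelated reason.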

For example (run with non-root, kernel.perf_event_paranoid isn't set):

  jinyao@skl:~$ perf stat --per-thread
  ^C
   Performance counter stats for 'system wide':

       vmstat-3458      6.171984   cpu-clock:u (msec) #  0.000 CPUs utilized
         perf-3670      0.515599   cpu-clock:u (msec) #  0.000 CPUs utilized
       vmstat-3458     1,163,643   cycles:u           #  0.189 GHz
         perf-3670        40,881   cycles:u           #  0.079 GHz
       vmstat-3458     1,410,238   instructions:u     #  1.21  insn per cycle
         perf-3670         3,536   instructions:u     #  0.09  insn per cycle
       vmstat-3458       288,937   branches:u         # 46.814 M/sec
         perf-3670           936   branches:u         #  1.815 M/sec
       vmstat-3458        15,195   branch-misses:u    #  5.26% of all branches
         perf-3670            76   branch-misses:u    #  8.12% of all branches

12.651675247 seconds time elapsed

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1516117388-10120-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-stat.c| 14 +-
 tools/perf/util/evsel.c  |  3 +++
 tools/perf/util/thread_map.c |  1 +
 tools/perf/util/thread_map.h |  1 +
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index fadcff52cd09..6214d2b220b2 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -637,7 +637,19 @@ try_again:
 if (verbose > 0)
 ui__warning("%s\n", msg);
 goto try_again;
-}
+   } else if (target__has_per_thread(&target) &&
+  evsel_list->threads &&
+  evsel_list->threads->err_thread != -1) {
+   /*
+* For global --per-thread case, skip current
+* error thread.
+*/
+   if (!thread_map__remove(evsel_list->threads,
+   evsel_list->threads->err_thread)) {
+   evsel_list->threads->err_thread = -1;
+   goto try_again;
+   }
+   }
 
perf_evsel__open_strerror(counter, &target,
  errno, msg, sizeof(msg));
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ef351688b797..b56e1c2ddaee 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1915,6 +1915,9 @@ try_fallback:
goto fallback_missing_features;
}
 out_close:
+   if (err)
+   threads->err_thread = thread;
+
do {
while (--thread >= 0) {
close(FD(evsel, cpu, thread));
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 729dad8f412d..5d467d8ae9ab 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -32,6 +32,7 @@ static void 

[tip:perf/core] perf report: Fix wrong jump arrow

2018-02-17 Thread tip-bot for Jin Yao
Commit-ID:  b40982e8468b46b8f7f5bba5a7e541ec04a29d7d
Gitweb: https://git.kernel.org/tip/b40982e8468b46b8f7f5bba5a7e541ec04a29d7d
Author: Jin Yao 
AuthorDate: Mon, 29 Jan 2018 18:57:53 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 16 Feb 2018 14:55:47 -0300

perf report: Fix wrong jump arrow

In the interactive annotate view of perf report, the jump arrow is
drawn at the wrong position. For example,

1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'

Percent│ IPC Cycle
       │      if (flag)
  1.37 │0.4┌──   1      ↓ je     82
       │   │      x += x / y + y / x;
  0.00 │0.4│  1310       movsd  (%rsp),%xmm0
  0.00 │0.4│   565       movsd  0x8(%rsp),%xmm4
       │0.4│             movsd  0x8(%rsp),%xmm1
       │0.4│             movsd  (%rsp),%xmm3
       │0.4│             divsd  %xmm4,%xmm0
  0.00 │0.4│   579       divsd  %xmm3,%xmm1
       │0.4│             movsd  (%rsp),%xmm2
       │0.4│             addsd  %xmm1,%xmm0
       │0.4│             addsd  %xmm2,%xmm0
  0.00 │0.4│             movsd  %xmm0,(%rsp)
       │   │      volatile double x = 1212121212, y = 121212;
       │   │
       │   │      s_randseed = time(0);
       │   │      srand(s_randseed);
       │   │
       │   │      for (i = 0; i < 20; i++) {
  1.37 │0.4└─→  82:   sub    $0x1,%ebx
 28.21 │0.48    17    ↑ jne    38

The jump arrow in the above example is misplaced: its column offset
should also include the widths of the IPC and Cycle columns.
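
The column arithmetic behind the fix can be sketched as a small helper. The function name is hypothetical, and the two width constants are illustrative values only; the real IPC_WIDTH and CYCLES_WIDTH are defined in perf.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative widths only (assumption), not the values perf uses. */
enum { DEMO_IPC_WIDTH = 6, DEMO_CYCLES_WIDTH = 6 };

/*
 * Column at which the jump arrow is drawn. When the IPC and Cycle columns
 * are shown, the arrow has to be shifted right by their combined width.
 */
static int demo_jump_arrow_column(int pcnt_width, int addr_width,
				  bool have_cycles)
{
	int width = 0;

	assert(pcnt_width >= 0 && addr_width >= 0);

	if (have_cycles)
		width = DEMO_IPC_WIDTH + DEMO_CYCLES_WIDTH;

	return pcnt_width + 2 + addr_width + width;
}
```

Without cycle data the result is unchanged from the old expression, so views that never showed the extra columns are not affected.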

With this patch, the result is:

Percent│ IPC Cycle
       │                if (flag)
  1.37 │0.48     1 ┌──je     82
       │           │    x += x / y + y / x;
  0.00 │0.48  1310 │  movsd  (%rsp),%xmm0
  0.00 │0.48   565 │  movsd  0x8(%rsp),%xmm4
       │0.48       │  movsd  0x8(%rsp),%xmm1
       │0.48       │  movsd  (%rsp),%xmm3
       │0.48       │  divsd  %xmm4,%xmm0
  0.00 │0.48   579 │  divsd  %xmm3,%xmm1
       │0.48       │  movsd  (%rsp),%xmm2
       │0.48       │  addsd  %xmm1,%xmm0
       │0.48       │  addsd  %xmm2,%xmm0
  0.00 │0.48       │  movsd  %xmm0,(%rsp)
       │           │    volatile double x = 1212121212, y = 121212;
       │           │
       │           │    s_randseed = time(0);
       │           │    srand(s_randseed);
       │           │
       │           │    for (i = 0; i < 20; i++) {
  1.37 │0.48    82:└─→sub    $0x1,%ebx
 28.21 │0.48    17   ↑ jne    38

Committer notes:

Please note that the cycle counts in each branch record entry are only
available from LBRv5 (according to Jiri) onwards, i.e. on Skylake or
later, so seeing the Cycles and IPC columns, and testing this patch,
requires capable hardware.

While applying this I first tested it on a Broadwell class machine and
couldn't get those columns. I will add code to the annotate browser to
warn the user about that, i.e. you have branch records but no cycles;
use more recent hardware to get the cycles and IPC columns.

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/ui/browsers/annotate.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 2864279..e2f6663 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -319,6 +319,7 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
struct map_symbol *ms = ab->b.priv;
struct symbol *sym = ms->sym;
u8 pcnt_width = annotate_browser__pcnt_width(ab);
+   int width = 0;
 
/* PLT symbols contain external offsets */
if (strstr(sym->name, "@plt"))
@@ -340,13 +341,17 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
to = (u64)btarget->idx;
}
 
+   if (ab->have_cycles)
+   width = IPC_WIDTH + CYCLES_WIDTH;
+
ui_browser__set_color(browser, HE_COLORSET_JUMP_ARROWS);
-   __ui_browser__line_arrow(browser, pcnt_width + 2 + ab->addr_width,
+   __ui_browser__line_arrow(browser,
+pcnt_width + 2 + ab->addr_width + width,
 from, to);
 
if (is_fused(ab, cursor)) {
ui_browser__mark_fused(browser,
-  pcnt_width + 3 + ab->addr_width,
+  pcnt_width + 3 + ab->addr_width + width,
   from - 1,
   to > from ? true : false);
}


[tip:perf/core] perf tools: Use target->per_thread and target->system_wide flags

2018-02-17 Thread tip-bot for Jin Yao
Commit-ID:  147c508f3004df6e2958f6c8867909531c2a15e2
Gitweb: https://git.kernel.org/tip/147c508f3004df6e2958f6c8867909531c2a15e2
Author: Jin Yao 
AuthorDate: Mon, 12 Feb 2018 13:32:36 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 16 Feb 2018 14:55:40 -0300

perf tools: Use target->per_thread and target->system_wide flags

Mathieu Poirier reported that commit 73c0ca1eee3d ("perf thread_map:
Enumerate all threads from /proc") has a negative impact on 'perf
record --per-thread': it creates a kernel event for each thread in the
system.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
fixes this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch, but it doesn't add a new
target->all_threads flag. It just uses 'target->per_thread &&
target->system_wide' as the condition that identifies the all-threads case.
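
The effect of combining the two flags can be sketched as a small decision function. This is a simplified illustration, not perf's code: struct demo_target, the enum and the function name are hypothetical, and the uid-based branch of thread_map__new_str() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for perf's struct target. */
struct demo_target {
	const char *pid;
	const char *tid;
	bool per_thread;
	bool system_wide;
};

enum demo_map_kind {
	DEMO_MAP_BY_PID,       /* threads of the given pid(s) */
	DEMO_MAP_ALL_THREADS,  /* every thread in the system */
	DEMO_MAP_BY_TID        /* the given tid(s), or the default map */
};

/* Only "per-thread and system-wide" selects full thread enumeration. */
static enum demo_map_kind demo_pick_thread_map(const struct demo_target *t)
{
	bool all_threads;

	assert(t != NULL);
	all_threads = t->per_thread && t->system_wide;

	if (t->pid)
		return DEMO_MAP_BY_PID;
	if (all_threads)
		return DEMO_MAP_ALL_THREADS;
	return DEMO_MAP_BY_TID;
}
```

With per_thread set and system_wide clear, as the commit describes for 'perf record --per-thread', the function does not choose full enumeration, which is the behavior the patch restores.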

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: linux-arm-ker...@lists.infradead.org
Fixes: 73c0ca1eee3d ("perf thread_map: Enumerate all threads from /proc")
Link: http://lkml.kernel.org/r/1518467557-18505-3-git-send-email-mathieu.poir...@linaro.org
Signed-off-by: Mathieu Poirier 
[Fixed checkpatch warning about line over 80 characters]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/evlist.c | 21 -
 tools/perf/util/thread_map.c |  4 ++--
 tools/perf/util/thread_map.h |  2 +-
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e5fc14e..7b7d535 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1086,11 +1086,30 @@ int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 {
+   bool all_threads = (target->per_thread && target->system_wide);
struct cpu_map *cpus;
struct thread_map *threads;
 
+   /*
+* If specify '-a' and '--per-thread' to perf record, perf record
+* will override '--per-thread'. target->per_thread = false and
+* target->system_wide = true.
+*
+* If specify '--per-thread' only to perf record,
+* target->per_thread = true and target->system_wide = false.
+*
+* So target->per_thread && target->system_wide is false.
+* For perf record, thread_map__new_str doesn't call
+* thread_map__new_all_cpus. That will keep perf record's
+* current behavior.
+*
+* For perf stat, it allows the case that target->per_thread and
+* target->system_wide are all true. It means to collect system-wide
+* per-thread data. thread_map__new_str will call
+* thread_map__new_all_cpus to enumerate all threads.
+*/
threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ all_threads);
 
if (!threads)
return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..729dad8 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ out_free_threads:
 }
 
 struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
 {
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
 
-   if (per_thread)
+   if (all_threads)
return thread_map__new_all_cpus();
 
return thread_map__new_by_tid_str(tid);
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b9..5ec91cf 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
 void thread_map__put(struct thread_map *map);
 
 struct thread_map *thread_map__new_str(const char *pid,
-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);
 
 struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
 


[tip:perf/core] perf script: Remove the time slices number limitation

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  cc2ef584a863b7c8033b78723cd253ca47e9a589
Gitweb: https://git.kernel.org/tip/cc2ef584a863b7c8033b78723cd253ca47e9a589
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:33 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:37 -0300

perf script: Remove the time slices number limitation

Previously, 'perf script --time' accepted at most 10 time slices.

This patch removes this limitation.
For example, the following command line is now accepted (12 time slices):

perf script --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12
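
Removing a fixed limit like this usually means sizing the array from the input. The sketch below shows one way to do that by counting the comma-separated slices in the time string; it is an assumption about the approach, and demo_range_alloc and struct demo_interval are hypothetical names, not perf's perf_time__range_alloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct perf_time_interval. */
struct demo_interval {
	unsigned long long start;
	unsigned long long end;
};

/*
 * Allocate one zeroed interval per comma-separated slice in time_str.
 * A NULL string still gets one interval. The slice count is stored in
 * *size; the caller owns the returned array and must free() it.
 */
static struct demo_interval *demo_range_alloc(const char *time_str, int *size)
{
	int n = 1;

	assert(size != NULL);

	if (time_str) {
		const char *p;

		for (p = strchr(time_str, ','); p; p = strchr(p + 1, ','))
			n++;
	}

	*size = n;
	return calloc((size_t)n, sizeof(struct demo_interval));
}
```

For the 12-slice command line above, the string contains 11 commas, so the sketch allocates 12 intervals.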

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1515596433-24653-9-git-send-email-yao@linux.intel.com
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-script.txt | 10 +-
 tools/perf/builtin-script.c  | 16 
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 806ec63..7730c1d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -351,19 +351,19 @@ include::itrace.txt[]
to end of file.
 
Also support time percent with multipe time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
-   Select the second 10% time slice
+   Select the second 10% time slice:
perf script --time 10%/2
 
-   Select from 0% to 10% time slice
+   Select from 0% to 10% time slice:
perf script --time 0%-10%
 
-   Select the first and second 10% time slices
+   Select the first and second 10% time slices:
perf script --time 10%/1,10%/2
 
-   Select from 0% to 10% and 30% to 40% slices
+   Select from 0% to 10% and 30% to 40% slices:
perf script --time 0%-10%,30%-40%
 
 --max-blocks::
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ac78191..3499d68 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1480,8 +1480,6 @@ static int perf_sample__fprintf_synth(struct perf_sample *sample,
return 0;
 }
 
-#define PTIME_RANGE_MAX 10
-
 struct perf_script {
struct perf_tool tool;
struct perf_session *session;
@@ -1496,7 +1494,8 @@ struct perf_script {
struct thread_map   *threads;
int name_width;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
 };
 
@@ -3445,6 +3444,13 @@ int cmd_script(int argc, const char **argv)
if (err < 0)
goto out_delete;
 
+   script.ptime_range = perf_time__range_alloc(script.time_str,
+   &script.range_size);
+   if (!script.ptime_range) {
+   err = -ENOMEM;
+   goto out_delete;
+   }
+
/* needs to be parsed after looking up reference time */
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
@@ -3457,7 +3463,7 @@ int cmd_script(int argc, const char **argv)
}
 
script.range_num = perf_time__percent_parse_str(
-   script.ptime_range, PTIME_RANGE_MAX,
+   script.ptime_range, script.range_size,
script.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
@@ -3476,6 +3482,8 @@ int cmd_script(int argc, const char **argv)
flush_scripting();
 
 out_delete:
+   zfree(&script.ptime_range);
+
perf_evlist__free_stats(session->evlist);
perf_session__delete(session);
 


[tip:perf/core] perf report: Remove the time slices number limitation

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  0a3cc3ae05c363dabd891ed5f918c62197de8c7f
Gitweb: https://git.kernel.org/tip/0a3cc3ae05c363dabd891ed5f918c62197de8c7f
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:32 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:37 -0300

perf report: Remove the time slices number limitation

Previously, 'perf report --time' accepted at most 10 time slices.

This patch removes this limitation.
For example, the following command line is now accepted (12 time slices):

perf report --stdio --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1515596433-24653-8-git-send-email-yao@linux.intel.com
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-report.txt |  2 +-
 tools/perf/builtin-report.c  | 22 --
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 63d0db3..907e505 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -403,7 +403,7 @@ OPTIONS
to end of file.
 
Also support time percent with multiple time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
Select the second 10% time slice:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 437..42a52dc 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -54,8 +54,6 @@
 #include 
 #include 
 
-#define PTIME_RANGE_MAX 10
-
 struct report {
struct perf_tool tool;
struct perf_session *session;
@@ -76,7 +74,8 @@ struct report {
const char  *cpu_list;
const char  *symbol_filter_str;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
float   min_percent;
u64 nr_entries;
@@ -1300,24 +1299,33 @@ repeat:
if (symbol__init(&session->header.env) < 0)
goto error;
 
+   report.ptime_range = perf_time__range_alloc(report.time_str,
+   &report.range_size);
+   if (!report.ptime_range) {
+   ret = -ENOMEM;
+   goto error;
+   }
+
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
pr_err("HINT: no first/last sample time found in perf data.\n"
   "Please use latest perf binary to execute 'perf record'\n"
   "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
 
report.range_num = perf_time__percent_parse_str(
-   report.ptime_range, PTIME_RANGE_MAX,
+   report.ptime_range, report.range_size,
report.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
 
if (report.range_num < 0) {
pr_err("Invalid time string\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
} else {
report.range_num = 1;
@@ -1333,6 +1341,8 @@ repeat:
ret = 0;
 
 error:
+   zfree(&report.ptime_range);
+
perf_session__delete(session);
return ret;
 }
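
The single-exit cleanup used in the patch above (set ret, jump to the error
label, free unconditionally) can be sketched in isolation. This is only an
illustration: struct ctx, run() and zfree_sketch() are hypothetical names,
and zfree_sketch() is a simplified stand-in for perf's zfree(), assumed here
to free the pointer and reset it to NULL.

```c
#include <errno.h>
#include <stdlib.h>

/* Simplified stand-in for zfree(): free *ptr, then reset it to NULL. */
#define zfree_sketch(ptr) do { free(*(ptr)); *(ptr) = NULL; } while (0)

/* Hypothetical context holding a dynamically sized buffer. */
struct ctx {
	int *ranges;
	int nr;
};

/* Every exit path, success or failure, goes through the same cleanup. */
static int run(struct ctx *c, int nr, int fail)
{
	int ret;

	c->ranges = calloc((size_t)nr, sizeof(*c->ranges));
	if (!c->ranges) {
		ret = -ENOMEM;
		goto error;
	}
	c->nr = nr;

	if (fail) {
		ret = -EINVAL;
		goto error;
	}

	ret = 0;

error:
	/* free(NULL) is a no-op, so no NULL check is needed here. */
	zfree_sketch(&c->ranges);
	return ret;
}
```

Because free(NULL) is defined to do nothing, the cleanup is safe even when
the allocation itself failed, which is the point of the committer's note.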


[tip:perf/core] perf util: Allocate time slices buffer according to number of comma

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  5a031f887cb8d60fe87d21159c3cf82c38f55679
Gitweb: https://git.kernel.org/tip/5a031f887cb8d60fe87d21159c3cf82c38f55679
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:31 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:36 -0300

perf util: Allocate time slices buffer according to number of comma

Previously we used the magic number 10 to limit the number of time slices,
which is an arbitrary cap.

This patch creates a new function perf_time__range_alloc() to allocate
the time slices buffer. The number of buffer entries is the number of
commas in the string plus one, so at least one entry is allocated even
if no comma is found.

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1515596433-24653-7-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 28 
 tools/perf/util/time-utils.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 5769f97..6193b46 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -325,6 +325,34 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
return -1;
 }
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size)
+{
+   const char *p1, *p2;
+   int i = 1;
+   struct perf_time_interval *ptime;
+
+   /*
+* At least allocate one time range.
+*/
+   if (!ostr)
+   goto alloc;
+
+   p1 = ostr;
+   while (p1 < ostr + strlen(ostr)) {
+   p2 = strchr(p1, ',');
+   if (!p2)
+   break;
+
+   p1 = p2 + 1;
+   i++;
+   }
+
+alloc:
+   *size = i;
+   ptime = calloc(i, sizeof(*ptime));
+   return ptime;
+}
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
 {
/* if time is not set don't drop sample */
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 34d5eba..70b177d 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -16,6 +16,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr);
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end);
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size);
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
 bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
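
For readers who want to try the sizing rule outside the perf tree, the
comma counting done by perf_time__range_alloc() can be sketched as a
standalone function. This is a minimal sketch, not the perf code itself:
struct time_range and range_alloc() are hypothetical names standing in for
struct perf_time_interval and the function added by the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct perf_time_interval. */
struct time_range {
	unsigned long long start, end;
};

/*
 * Allocate one zeroed entry per comma-separated field in str.
 * A NULL or comma-free string still yields one entry.
 * *size is set even if the allocation fails, as in the patch.
 */
static struct time_range *range_alloc(const char *str, int *size)
{
	int n = 1;

	if (str) {
		const char *p;

		/* Each comma found adds one more field. */
		for (p = strchr(str, ','); p; p = strchr(p + 1, ','))
			n++;
	}

	*size = n;
	return calloc((size_t)n, sizeof(struct time_range));
}
```

With the string "1%/1,1%/2,1%/3" this yields three entries; with NULL or a
string without commas it yields one. The caller owns the returned buffer
and must free it.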


[tip:perf/core] perf report: Add an indication of what time slices are used

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  7425664bbd3174814500c7ab8740cbb9bb25396c
Gitweb: https://git.kernel.org/tip/7425664bbd3174814500c7ab8740cbb9bb25396c
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:30 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:36 -0300

perf report: Add an indication of what time slices are used

Add a time slices indication to the perf report header.

For example,

  # perf report --stdio --time 10%

  # Total Lost Samples: 0
  #
  # Samples: 9K of event 'cycles:ppp' (time slices: 10%)
  # Event count (approx.): 8951288803

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1515596433-24653-6-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 7d4f0a5..437 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -404,6 +404,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
if (evname != NULL)
ret += fprintf(fp, " of event '%s'", evname);
 
+   if (rep->time_str)
+   ret += fprintf(fp, " (time slices: %s)", rep->time_str);
+
if (symbol_conf.show_ref_callgraph &&
strstr(evname, "call-graph=no")) {
ret += fprintf(fp, ", show reference callgraph");


[tip:perf/core] perf util: Support no index time percent slice

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  3002812e602d3f991a5b8cdc0499e63e13ff65c4
Gitweb: https://git.kernel.org/tip/3002812e602d3f991a5b8cdc0499e63e13ff65c4
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:29 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:35 -0300

perf util: Support no index time percent slice

Previously, a time percent slice needed an index to specify which slice
the user wanted.

It may be easier to use if the index can be omitted.  So with this
patch, for example,

perf report --stdio --time 10%/1 should be equivalent to
perf report --stdio --time 10%

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1515596433-24653-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 88510ab..5769f97 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -261,6 +261,37 @@ static int percent_comma_split(struct perf_time_interval *ptime_buf, int num,
return i;
 }
 
+static int one_percent_convert(struct perf_time_interval *ptime_buf,
+  const char *ostr, u64 start, u64 end, char *c)
+{
+   char *str;
+   int len = strlen(ostr), ret;
+
+   /*
+* c points to '%'.
+* '%' should be the last character
+*/
+   if (ostr + len - 1 != c)
+   return -1;
+
+   /*
+* Construct a string like "xx%/1"
+*/
+   str = malloc(len + 3);
+   if (str == NULL)
+   return -ENOMEM;
+
+   memcpy(str, ostr, len);
+   strcpy(str + len, "/1");
+
+   ret = percent_slash_split(str, ptime_buf, start, end);
+   if (ret == 0)
+   ret = 1;
+
+   free(str);
+   return ret;
+}
+
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end)
 {
@@ -270,6 +301,7 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 * ostr example:
 * 10%/2,10%/3: select the second 10% slice and the third 10% slice
 * 0%-10%,30%-40%: multiple time range
+* 50%: just one percent
 */
 
memset(ptime_buf, 0, sizeof(*ptime_buf) * num);
@@ -286,6 +318,10 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
   end, percent_dash_split);
}
 
+   c = strchr(ostr, '%');
+   if (c)
+   return one_percent_convert(ptime_buf, ostr, start, end, c);
+
return -1;
 }
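
The string rewrite at the heart of one_percent_convert() (turning "xx%"
into "xx%/1" so the existing slash parser can handle it) can be tried on
its own. The sketch below is an assumption-laden illustration:
add_default_index() is a hypothetical helper name, and it returns the new
string rather than calling percent_slash_split() as the patch does.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of str with "/1" appended.
 * Returns NULL if str has no '%', if the first '%' is not the last
 * character, or if the allocation fails. The caller frees the result.
 */
static char *add_default_index(const char *str)
{
	size_t len = strlen(str);
	const char *pct = strchr(str, '%');
	char *out;

	if (!pct || pct != str + len - 1)
		return NULL;

	out = malloc(len + 3);	/* room for "/1" and the terminating NUL */
	if (!out)
		return NULL;

	memcpy(out, str, len);
	strcpy(out + len, "/1");
	return out;
}
```

For example, "10%" becomes "10%/1", while "10%/2" and "10" are rejected
because '%' is not the final character.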
 


[tip:perf/core] perf util: Improve error checking for time percent input

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  6e761cbc9127fb8fc609aea2265ee8279b8d6c55
Gitweb: https://git.kernel.org/tip/6e761cbc9127fb8fc609aea2265ee8279b8d6c55
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:28 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:35 -0300

perf util: Improve error checking for time percent input

A command line such as 'perf report --stdio --time 1abc%/1' was accepted
by perf even though '1abc' is not a valid number.

This patch replaces atof() with strtod() and checks that the entire
string was consumed. The same command line now fails with the error
message "Invalid time string".

root@skl:/tmp# perf report --stdio --time 1abc%/1
Invalid time string

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1515596433-24653-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 3f7f18f..88510ab 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -116,7 +116,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
 
 static int parse_percent(double *pcnt, char *str)
 {
-   char *c;
+   char *c, *endptr;
+   double d;
 
c = strchr(str, '%');
if (c)
@@ -124,8 +125,11 @@ static int parse_percent(double *pcnt, char *str)
else
return -1;
 
-   *pcnt = atof(str) / 100.0;
+   d = strtod(str, &endptr);
+   if (endptr != str + strlen(str))
+   return -1;
 
+   *pcnt = d / 100.0;
return 0;
 }
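
The difference between atof() and a checked strtod() is easy to reproduce
outside perf. The following is a minimal sketch under stated assumptions:
parse_percent_strict() is a hypothetical name, it overwrites the '%' with
a NUL before parsing (as the surrounding perf code appears to do), and it
additionally rejects an empty number, which the patch itself does not
check.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse "NN%" into a fraction stored in *pcnt.
 * Returns 0 on success, -1 if '%' is missing or if the text before it
 * is not entirely a number. Note: str is modified in place.
 */
static int parse_percent_strict(double *pcnt, char *str)
{
	char *c, *endptr;
	double d;

	c = strchr(str, '%');
	if (!c)
		return -1;
	*c = '\0';

	d = strtod(str, &endptr);
	/* endptr == str: nothing parsed; *endptr != 0: trailing junk. */
	if (endptr == str || *endptr != '\0')
		return -1;

	*pcnt = d / 100.0;
	return 0;
}
```

With this check "10%" parses to 0.1, whereas "1abc%" is rejected; atof()
would have silently returned 1 for the latter.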
 


[tip:perf/core] perf script: Improve error msg when no first/last sample time found

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  1e2778e91616086177a255f3fc8c72ecaa564ae6
Gitweb: https://git.kernel.org/tip/1e2778e91616086177a255f3fc8c72ecaa564ae6
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:27 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:34 -0300

perf script: Improve error msg when no first/last sample time found

The following message will be returned to user when executing 'perf
script --time' if perf data file doesn't contain the first/last sample
time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1515596433-24653-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-script.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 08bc818..ac78191 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3449,7 +3449,9 @@ int cmd_script(int argc, const char **argv)
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf data.\n"
+  "Please use latest perf binary to execute 'perf record'\n"
+  "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
err = -EINVAL;
goto out_delete;
}


[tip:perf/core] perf report: Improve error msg when no first/last sample time found

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  eb0b419eff8cf51af8e16cc8c5d2a92d19824266
Gitweb: https://git.kernel.org/tip/eb0b419eff8cf51af8e16cc8c5d2a92d19824266
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:26 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:34 -0300

perf report: Improve error msg when no first/last sample time found

The following message will be returned to user when executing
'perf report --time' if perf data file doesn't contain the
first/last sample time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1515596433-24653-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6593779..7d4f0a5 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1300,7 +1300,9 @@ repeat:
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf data.\n"
+  "Please use latest perf binary to execute 'perf record'\n"
+  "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
return -EINVAL;
}
 


[tip:perf/core] perf script: Support time percent and multiple time ranges

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  2ab046cd01e33a854798a3e245c9e3f32b950a7d
Gitweb: https://git.kernel.org/tip/2ab046cd01e33a854798a3e245c9e3f32b950a7d
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:46 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 12:07:06 -0300

perf script: Support time percent and multiple time ranges

perf script has a --time option to limit the time range of output.  It
only supports absolute time.

Now this option is extended to support multiple time ranges and support
the percent of time.
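
To make the slice notation concrete, here is a rough sketch of how a
percent slice could map onto absolute timestamps and how samples outside
every selected range could be skipped. It is not the perf implementation:
struct time_range, slice_to_range() and skip_sample() are hypothetical
names, and the arithmetic assumes that "p%/n" means the n-th consecutive
slice of width p% between the first and last sample times.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct perf_time_interval. */
struct time_range {
	uint64_t start, end;
};

/*
 * Select the idx-th (1-based) slice of width pcnt (a fraction in (0, 1])
 * of the interval [start, end]. Returns 0 on success, -1 on bad input.
 */
static int slice_to_range(double pcnt, int idx, uint64_t start, uint64_t end,
			  struct time_range *out)
{
	uint64_t width;

	if (end < start || pcnt <= 0.0 || pcnt > 1.0 || idx < 1)
		return -1;
	if (pcnt * idx > 1.0)	/* slice would run past the end */
		return -1;

	width = (uint64_t)(pcnt * (double)(end - start));
	out->start = start + width * (uint64_t)(idx - 1);
	out->end = out->start + width;
	return 0;
}

/* Return 1 if timestamp t lies outside all n ranges, 0 otherwise. */
static int skip_sample(const struct time_range *r, int n, uint64_t t)
{
	int i;

	for (i = 0; i < n; i++) {
		if (t >= r[i].start && t <= r[i].end)
			return 0;
	}
	return 1;
}
```

Under these assumptions, a 25% slice with index 2 over the interval
1000..2000 covers 1250..1500, and a sample is kept as soon as it falls
inside any one of the selected ranges.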

For example:

1. Select the first and second 10% time slices:

   perf script --time 10%/1,10%/2

2. Select from 0% to 10% and 30% to 40% slices:

   perf script --time 0%-10%,30%-40%

Changelog:

v6: Fix the merge issue with latest perf/core branch.
No functional changes.

v5: Add checking of first/last sample time to detect if it's recorded
in perf.data. If it's not recorded, returns error message to user.

v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample

v3: Since the definitions of first_sample_time/last_sample_time
are moved from perf_session to perf_evlist so change the
related code.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1512738826-2628-7-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-script.txt | 16 +++
 tools/perf/builtin-script.c  | 34 ++--
 2 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 974ceb1..7b622a8 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -329,6 +329,22 @@ include::itrace.txt[]
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
 
+   Also support time percent with multipe time range. Time string is
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 10.
+
+   For example:
+   Select the second 10% time slice
+   perf script --time 10%/2
+
+   Select from 0% to 10% time slice
+   perf script --time 0%-10%
+
+   Select the first and second 10% time slices
+   perf script --time 10%/1,10%/2
+
+   Select from 0% to 10% and 30% to 40% slices
+   perf script --time 0%-10%,30%-40%
+
 --max-blocks::
Set the maximum number of program blocks to print with brstackasm for
each sample.
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 77e47cf..330dcd9 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1436,6 +1436,8 @@ static int perf_sample__fprintf_synth(struct perf_sample *sample,
return 0;
 }
 
+#define PTIME_RANGE_MAX 10
+
 struct perf_script {
struct perf_tool tool;
struct perf_session *session;
@@ -1449,7 +1451,8 @@ struct perf_script {
struct thread_map   *threads;
int name_width;
const char  *time_str;
-   struct perf_time_interval ptime;
+   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   int range_num;
 };
 
 static int perf_evlist__max_name_len(struct perf_evlist *evlist)
@@ -1734,8 +1737,10 @@ static int process_sample_event(struct perf_tool *tool,
struct perf_script *scr = container_of(tool, struct perf_script, tool);
struct addr_location al;
 
-   if (perf_time__skip_sample(&scr->ptime, sample->time))
+   if (perf_time__ranges_skip_sample(scr->ptime_range, scr->range_num,
+ sample->time)) {
return 0;
+   }
 
if (debug_mode) {
if (sample->time < last_timestamp) {
@@ -3360,10 +3365,27 @@ int cmd_script(int argc, const char **argv)
goto out_delete;
 
/* needs to be parsed after looking up reference time */
-   if (perf_time__parse_str(&script.ptime, script.time_str) != 0) {
-   pr_err("Invalid time string\n");
-   err = -EINVAL;
-   goto out_delete;
+   if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
+   if (session->evlist->first_sample_time == 0 &&
+   session->evlist->last_sample_time == 0) {
+   pr_err("No first/last sample time in perf data\n");
+   err = -EINVAL;
+   goto out_delete;
+   }
+
+   script.range_num = perf_time__percent_parse_str(
+   script.ptime_range, PTIME_RANGE_MAX,
+   script.time_str,
+   session->evlist->first_sample_time,

[tip:perf/core] perf report: Support time percent and multiple time ranges

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  5b969bc766807e5c2f184d1d6f97b8471de946f1
Gitweb: https://git.kernel.org/tip/5b969bc766807e5c2f184d1d6f97b8471de946f1
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:45 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 12:06:20 -0300

perf report: Support time percent and multiple time ranges

perf report has a --time option to limit the time range of output.  It
only supports absolute time.

Now this option is extended to support multiple time ranges and support
the percent of time.

For example:

1. Select the first and second 10% time slices:

perf report --time 10%/1,10%/2

2. Select from 0% to 10% and 30% to 40% slices:

perf report --time 0%-10%,30%-40%
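
The 'a%-b%' form in example 2 can be sketched as follows. This is only an
illustration under stated assumptions: parse_pct() and range_from_spec() are
hypothetical names, not functions from the perf sources, and the sketch
handles a single range (a caller would split the string on ',' first). The
first/last arguments stand for the first and last sample times of the trace.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse "P%" at *s into a fraction in [0, 1]; advance *s past the '%'. */
static int parse_pct(const char **s, double *frac)
{
	char *rest;
	double v = strtod(*s, &rest);

	/* Written as a negated range check so that NaN is rejected too. */
	if (rest == *s || *rest != '%' || !(v >= 0.0 && v <= 100.0))
		return -1;

	*frac = v / 100.0;
	*s = rest + 1;
	return 0;
}

/*
 * Hypothetical helper: turn one "A%-B%" spec into absolute timestamps
 * within [first, last]. Returns 0 on success, -1 on malformed input.
 */
static int range_from_spec(const char *spec, uint64_t first, uint64_t last,
			   uint64_t *start, uint64_t *end)
{
	double a, b, total;

	assert(spec != NULL && last >= first);
	total = (double)(last - first);

	if (parse_pct(&spec, &a) < 0 || *spec != '-')
		return -1;
	spec++;
	if (parse_pct(&spec, &b) < 0 || *spec != '\0' || a > b)
		return -1;

	/* Adding 0.5 before truncating rounds a non-negative value. */
	*start = first + (uint64_t)(a * total + 0.5);
	*end = first + (uint64_t)(b * total + 0.5);
	return 0;
}
```

With first = 1000 and last = 2000, "30%-40%" would map to the range
1300..1400 in this sketch.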

Changelog:

v6: Fix the merge issue with latest perf/core branch.
No functional changes.

v5: Add checking of first/last sample time to detect if it's recorded
in perf.data. If it's not recorded, returns error message to user.

v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample

v3: Since the definitions of first_sample_time/last_sample_time
are moved from perf_session to perf_evlist so change the
related code.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1512738826-2628-6-git-send-email-yao@linux.intel.com
[ Add missing colons at end of examples in the man page ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-report.txt | 20 
 tools/perf/builtin-report.c  | 31 ++-
 2 files changed, 46 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index ddde2b5..1e02c4e 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -402,6 +402,26 @@ OPTIONS
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
 
+   Also support time percent with multiple time range. Time string is
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 10.
+
+   For example:
+   Select the second 10% time slice:
+
+ perf report --time 10%/2
+
+   Select from 0% to 10% time slice:
+
+ perf report --time 0%-10%
+
+   Select the first and second 10% time slices:
+
+ perf report --time 10%/1,10%/2
+
+   Select from 0% to 10% and 30% to 40% slices:
+
+ perf report --time 0%-10%,30%-40%
+
 --itrace::
Options for decoding instruction tracing data. The options are:
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 07827cd..770bf8a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -52,6 +52,8 @@
 #include 
 #include 
 
+#define PTIME_RANGE_MAX 10
+
 struct report {
struct perf_tool tool;
struct perf_session *session;
@@ -69,7 +71,8 @@ struct report {
const char  *cpu_list;
const char  *symbol_filter_str;
const char  *time_str;
-   struct perf_time_interval ptime;
+   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   int range_num;
float   min_percent;
u64 nr_entries;
u64 queue_size;
@@ -202,8 +205,10 @@ static int process_sample_event(struct perf_tool *tool,
};
int ret = 0;
 
-   if (perf_time__skip_sample(&rep->ptime, sample->time))
+   if (perf_time__ranges_skip_sample(rep->ptime_range, rep->range_num,
+ sample->time)) {
return 0;
+   }
 
if (machine__resolve(machine, &al, sample) < 0) {
pr_debug("problem processing %d event, skipping it.\n",
@@ -1093,9 +1098,25 @@ repeat:
if (symbol__init(&session->header.env) < 0)
goto error;
 
-   if (perf_time__parse_str(&report.ptime, report.time_str) != 0) {
-   pr_err("Invalid time string\n");
-   return -EINVAL;
+   if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
+   if (session->evlist->first_sample_time == 0 &&
+   session->evlist->last_sample_time == 0) {
+   pr_err("No first/last sample time in perf data\n");
+   return -EINVAL;
+   }
+
+   report.range_num = perf_time__percent_parse_str(
+   report.ptime_range, PTIME_RANGE_MAX,
+   report.time_str,
+   session->evlist->first_sample_time,
+   session->evlist->last_sample_time);
+
+   if (report.range_num < 0) {
+   pr_err("Invalid time 

[tip:perf/core] perf tools: Create function to perform multiple time range checking

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  9a9b8b4b2271e763c1600311a3d4ecc2ac359b55
Gitweb: https://git.kernel.org/tip/9a9b8b4b2271e763c1600311a3d4ecc2ac359b55
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:44 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:41:06 -0300

perf tools: Create function to perform multiple time range checking

Previous patch supports the multiple time range.

For example, select the first and second 10% time slices.
perf report --time 10%/1,10%/2

We need a function to check if a timestamp is in the ranges of
[0, 10%) and [10%, 20%].

Note that it includes the last element in [10%, 20%] but it doesn't
include the last element in [0, 10%). It's to avoid the overlap.

This patch implements a new function, perf_time__ranges_skip_sample,
for this check.
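
The boundary rule described above can be sketched as a small standalone
function. This is an illustration only: struct interval and find_range() are
hypothetical names, and the sketch leaves out the special cases the real
function handles (a zero timestamp, and zero or one range).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for perf's struct perf_time_interval. */
struct interval {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the index of the range containing ts, or -1 if none does.
 * Every range is half-open [start, end) except the last one, which is
 * closed [start, end], so adjacent ranges never both claim a timestamp.
 */
static int find_range(const struct interval *r, int num, uint64_t ts)
{
	assert(r != NULL && num >= 0);

	for (int i = 0; i < num; i++) {
		bool is_last = (i == num - 1);

		if (ts < r[i].start)
			continue;
		if (ts < r[i].end || (is_last && ts == r[i].end))
			return i;
	}
	return -1;
}
```

For the ranges {0, 10} and {10, 20}, timestamp 10 is matched only by the
second range, and timestamp 20 is still accepted because it closes the last
range.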

Change log:

v4: Let perf_time__ranges_skip_sample be compatible with
perf_time__skip_sample when only one time range.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1512738826-2628-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 28 
 tools/perf/util/time-utils.h |  3 +++
 2 files changed, 31 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 61c46022..3f7f18f 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -300,6 +300,34 @@ bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
return false;
 }
 
+bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
+  int num, u64 timestamp)
+{
+   struct perf_time_interval *ptime;
+   int i;
+
+   if ((timestamp == 0) || (num == 0))
+   return false;
+
+   if (num == 1)
+   return perf_time__skip_sample(&ptime_buf[0], timestamp);
+
+   /*
+* start/end of multiple time ranges must be valid.
+*/
+   for (i = 0; i < num; i++) {
+   ptime = &ptime_buf[i];
+
+   if (timestamp >= ptime->start &&
+   ((timestamp < ptime->end && i < num - 1) ||
+(timestamp <= ptime->end && i == num - 1))) {
+   break;
+   }
+   }
+
+   return (i == num) ? true : false;
+}
+
 int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz)
 {
u64  sec = timestamp / NSEC_PER_SEC;
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 2308723..34d5eba 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -18,6 +18,9 @@ int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
+bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
+  int num, u64 timestamp);
+
 int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);
 
 int fetch_current_timestamp(char *buf, size_t sz);


[tip:perf/core] perf tools: Create function to parse time percent

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  13a70f350665580708ab11f725d3578eaacbf2d0
Gitweb: https://git.kernel.org/tip/13a70f350665580708ab11f725d3578eaacbf2d0
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:43 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:39:09 -0300

perf tools: Create function to parse time percent

Currently perf report/script/... have a --time option to limit the time
range of output, but it only supports absolute time. Add support for
time percentage.

For example:

1. Select the second 10% time slice
   perf report --time 10%/2

2. Select from 0% to 10% time slice
   perf report --time 0%-10%

It also supports multiple time ranges.

3. Select the first and second 10% time slices
   perf report --time 10%/1,10%/2

4. Select from 0% to 10% and 30% to 40% slices
   perf report --time 0%-10%,30%-40%
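
The arithmetic behind the 'p%/n' form can be sketched as below. This is a
minimal illustration, assuming a hypothetical helper name slice_from_spec();
it is not the function added by this patch. The first/last arguments stand
for the first and last sample times of the trace.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "P%/N" (for example "10%/2") and compute
 * the Nth P% slice of the trace [first, last].
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int slice_from_spec(const char *spec, uint64_t first, uint64_t last,
			   uint64_t *start, uint64_t *end)
{
	char *rest;
	double pcnt, total;
	long n;

	assert(spec != NULL && last >= first);
	total = (double)(last - first);

	pcnt = strtod(spec, &rest) / 100.0;
	if (rest == spec || rest[0] != '%' || rest[1] != '/')
		return -1;

	spec = rest + 2;
	n = strtol(spec, &rest, 10);
	if (rest == spec || *rest != '\0')
		return -1;

	/* Negated comparisons so that a NaN percentage is rejected too. */
	if (!(pcnt > 0.0) || n < 1 || !(pcnt * (double)n <= 1.0))
		return -1;

	/* Slice n covers [pcnt * (n - 1), pcnt * n] of the total time. */
	*start = first + (uint64_t)(pcnt * (double)(n - 1) * total + 0.5);
	*end = first + (uint64_t)(pcnt * (double)n * total + 0.5);
	return 0;
}
```

With first = 1000 and last = 2000, "10%/2" would select 1100..1200 in this
sketch, that is, the second tenth of the trace.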

Changelog:

v4: An issue was found: the following invalid time string was accepted.
perf script --time 10%/10x12321xsdfdasfdsafdsafdsa

strtol is now used instead of atoi so that trailing garbage is rejected.
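
A short sketch of why strtol helps here: unlike atoi, it reports where
parsing stopped, so leftover characters can be detected. The helper name
parse_int_strict() is hypothetical and only illustrates the technique.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a whole-string decimal integer. Returns 0 and stores the value
 * in *out on success; returns -1 if the string has no digits, has
 * trailing characters, or is out of range for int.
 */
static int parse_int_strict(const char *s, int *out)
{
	char *end;
	long v;

	assert(s != NULL && out != NULL);

	errno = 0;
	v = strtol(s, &end, 10);
	if (end == s || *end != '\0')
		return -1;	/* nothing parsed, or trailing garbage */
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -1;

	*out = (int)v;
	return 0;
}
```

atoi("10x12321xsdfdasfdsafdsafdsa") silently yields 10, while the strict
variant above rejects the same string.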

Committer notes:

This just puts in place the infrastructure, so the examples in this cset
comment will only work later, after more patches in this series are
applied.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1512738826-2628-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 205 ---
 tools/perf/util/time-utils.h |   3 +
 2 files changed, 196 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 81927d0..61c46022 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include <math.h>
 
 #include "perf.h"
 #include "debug.h"
@@ -60,11 +61,10 @@ static int parse_timestr_sec_nsec(struct perf_time_interval *ptime,
return 0;
 }
 
-int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
+static int split_start_end(char **start, char **end, const char *ostr, char ch)
 {
char *start_str, *end_str;
char *d, *str;
-   int rc = 0;
 
if (ostr == NULL || *ostr == '\0')
return 0;
@@ -74,25 +74,35 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
if (str == NULL)
return -ENOMEM;
 
-   ptime->start = 0;
-   ptime->end = 0;
-
-   /* str has the format: <start>,<stop>
-* variations: <start>,
-* ,<stop>
-* ,
-*/
start_str = str;
-   d = strchr(start_str, ',');
+   d = strchr(start_str, ch);
if (d) {
*d = '\0';
++d;
}
end_str = d;
 
+   *start = start_str;
+   *end = end_str;
+
+   return 0;
+}
+
+int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
+{
+   char *start_str = NULL, *end_str;
+   int rc;
+
+   rc = split_start_end(&start_str, &end_str, ostr, ',');
+   if (rc || !start_str)
+   return rc;
+
+   ptime->start = 0;
+   ptime->end = 0;
+
rc = parse_timestr_sec_nsec(ptime, start_str, end_str);
 
-   free(str);
+   free(start_str);
 
/* make sure end time is after start time if it was given */
if (rc == 0 && ptime->end && ptime->end < ptime->start)
@@ -104,6 +114,177 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
return rc;
 }
 
+static int parse_percent(double *pcnt, char *str)
+{
+   char *c;
+
+   c = strchr(str, '%');
+   if (c)
+   *c = '\0';
+   else
+   return -1;
+
+   *pcnt = atof(str) / 100.0;
+
+   return 0;
+}
+
+static int percent_slash_split(char *str, struct perf_time_interval *ptime,
+  u64 start, u64 end)
+{
+   char *p, *end_str;
+   double pcnt, start_pcnt, end_pcnt;
+   u64 total = end - start;
+   int i;
+
+   /*
+* Example:
+* 10%/2: select the second 10% slice and the third 10% slice
+*/
+
+   /* We can modify this string since the original one is copied */
+   p = strchr(str, '/');
+   if (!p)
+   return -1;
+
+   *p = '\0';
+   if (parse_percent(&pcnt, str) < 0)
+   return -1;
+
+   p++;
+   i = (int)strtol(p, &end_str, 10);
+   if (*end_str)
+   return -1;
+
+   if (pcnt <= 0.0)
+   return -1;
+
+   start_pcnt = pcnt * (i - 1);
+   end_pcnt = pcnt * i;
+
+   if (start_pcnt < 0.0 || start_pcnt > 1.0 ||
+   end_pcnt < 0.0 || end_pcnt > 1.0) {
+   return -1;
+   }
+
+   ptime->start = start + round(start_pcnt * total);
+   ptime->end = start + round(end_pcnt * total);
+
+   return 0;
+}
+
+static int percent_dash_split(char *str, struct perf_time_interval *ptime,
+  

[tip:perf/core] perf header: Add infrastructure to record first and last sample time

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  6011518db3bd04c80cd3ce3e6aea1c399739adb4
Gitweb: https://git.kernel.org/tip/6011518db3bd04c80cd3ce3e6aea1c399739adb4
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:41 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:20:51 -0300

perf header: Add infrastructure to record first and last sample time

perf report/script/... have a --time option to limit the time range of
output. That's very useful to slice large traces, e.g. when processing
the output of perf script for some analysis.

But right now --time only supports absolute time. Also there is no fast
way to get the start/end times of a given trace except for looking at
it.  This makes it hard to e.g. only decode the first half of the trace,
which is useful for parallelization of scripts.

Another problem is that perf records are variable size and there is no
synchronization mechanism. So the only way to find the last sample
reliably would be to walk all samples. But we want to avoid that in perf
report/...  because it is already quite expensive. That is why storing
the first sample time and last sample time in perf record is better.

This patch creates a new header feature type HEADER_SAMPLE_TIME and
related ops. It saves the first sample time and the last sample time to
the feature section in the perf file header. That will be done when, for
instance, processing build-ids: we already have to process all samples
there to create the build-id table, so we take advantage of that to
further amortize that processing by storing HEADER_SAMPLE_TIME, which
makes 'perf report/script' faster when using --time.
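
As a rough illustration of the payload this feature carries (two 64-bit
timestamps), the sketch below packs and unpacks such a pair in host byte
order. The struct and function names are hypothetical and this is not the
code perf uses to write or read its header; in particular it does no
byte-swapping for files recorded on a machine with different endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of the two recorded timestamps. */
struct sample_time {
	uint64_t first;
	uint64_t last;
};

#define SAMPLE_TIME_SIZE (2 * sizeof(uint64_t))

/* Store first, then last, as two consecutive 64-bit values. */
static void sample_time_pack(const struct sample_time *st, unsigned char *buf)
{
	assert(st != NULL && buf != NULL);

	memcpy(buf, &st->first, sizeof(uint64_t));
	memcpy(buf + sizeof(uint64_t), &st->last, sizeof(uint64_t));
}

/* Read the pair back; reject a payload of unexpected size. */
static int sample_time_unpack(const unsigned char *buf, size_t len,
			      struct sample_time *st)
{
	assert(buf != NULL && st != NULL);

	if (len != SAMPLE_TIME_SIZE)
		return -1;

	memcpy(&st->first, buf, sizeof(uint64_t));
	memcpy(&st->last, buf + sizeof(uint64_t), sizeof(uint64_t));
	return 0;
}
```

A reader that finds both values equal to zero can treat the times as not
recorded, which is the condition the later patches in this series check
before accepting a percent-based --time string.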

Committer testing:

After this patch is applied, the header is written with zeroes; the next
patch is needed for "perf record" to actually write the timestamps:

  # perf report -D | grep PERF_RECORD_SAMPLE\(
  22501155244406 0x44f0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xa21be8c5 period: 1 addr: 0
  
  22501155793625 0x4a30 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xa21ffd50 period: 2828043 addr: 0
  # perf report --header | grep "time of "
  # time of first sample : 0.00
  # time of last sample : 0.00
  #

Changelog:

v7: 1. Rebase to latest perf/core branch.

2. Add following clarification in patch description according to
   Arnaldo's suggestion.

   "That will be done when, for instance, processing build-ids,
where we already have to process all samples to create the
build-id table, take advantage of that to further amortize
that processing by storing HEADER_SAMPLE_TIME to make
'perf report/script' faster when using --time."

v4: Use perf script time style for timestamp printing. Also add
printing of the sample duration.

v3: Remove the definitions of first_sample_time/last_sample_time from
perf_session. Just define them in perf_evlist

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1512738826-2628-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf.data-file-format.txt |  4 ++
 tools/perf/util/evlist.h   |  2 +
 tools/perf/util/header.c   | 60 ++
 tools/perf/util/header.h   |  1 +
 4 files changed, 67 insertions(+)

diff --git a/tools/perf/Documentation/perf.data-file-format.txt 
b/tools/perf/Documentation/perf.data-file-format.txt
index 15e8b48..f7d85e8 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -261,6 +261,10 @@ struct {
struct perf_header_string map;
 }[number_of_cache_levels];
 
+   HEADER_SAMPLE_TIME = 21,
+
+Two uint64_t for the time of first sample and the time of last sample.
+
other bits are reserved and should ignored for now
HEADER_FEAT_BITS= 256,
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7516066..e7fbca6 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -50,6 +50,8 @@ struct perf_evlist {
struct perf_evsel *selected;
struct events_stats stats;
struct perf_env *env;
+   u64 first_sample_time;
+   u64 last_sample_time;
 };
 
 struct perf_evsel_str_handler {
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index ca73aa7..a326e0d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "evlist.h"
 #include "evsel.h"
@@ -35,6 +36,7 @@
 #include 
 #include "asm/bug.h"
 #include 

[tip:perf/core] perf header: Add infrastructure to record first and last sample time

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  6011518db3bd04c80cd3ce3e6aea1c399739adb4
Gitweb: https://git.kernel.org/tip/6011518db3bd04c80cd3ce3e6aea1c399739adb4
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:41 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:20:51 -0300

perf header: Add infrastructure to record first and last sample time

perf report/script/... have a --time option to limit the time range of
output. That's very useful to slice large traces, e.g. when processing
the output of perf script for some analysis.

But right now --time only supports absolute time. Also there is no fast
way to get the start/end times of a given trace except for looking at
it. This makes it hard to, e.g., decode only the first half of the trace,
which is useful for parallelizing scripts.
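
As a rough illustration of the parallelization use case: once the first
and last sample times are known, the trace can be divided into equal time
slices, one per worker. The sketch below is not perf code; 'struct
time_slice' and split_range() are hypothetical names, and timestamps are
assumed to be nanoseconds held in a 64-bit unsigned integer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: one [start, end] slice of the trace, in ns. */
struct time_slice {
	uint64_t start;
	uint64_t end;
};

/*
 * Divide [first, last] into n contiguous, non-overlapping slices.
 * The last slice absorbs the remainder so it always ends at 'last'.
 * Returns 0 on success, -1 if the arguments are invalid or the range
 * is too short to give every slice a non-zero step.
 */
static int split_range(uint64_t first, uint64_t last, size_t n,
		       struct time_slice *out)
{
	uint64_t step;
	size_t i;

	if (n == 0 || last < first || out == NULL)
		return -1;

	step = (last - first) / n;
	if (step == 0)
		return -1;

	for (i = 0; i < n; i++) {
		out[i].start = first + i * step;
		out[i].end = (i == n - 1) ? last : first + (i + 1) * step - 1;
	}
	return 0;
}
```

Each slice, converted to seconds, could then be handed to a separate
'perf script --time start,stop' invocation.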

Another problem is that perf records are variable size and there is no
synchronization mechanism. So the only way to find the last sample
reliably would be to walk all samples. But we want to avoid that in perf
report/...  because it is already quite expensive. That is why storing
the first sample time and last sample time in perf record is better.

This patch creates a new header feature type HEADER_SAMPLE_TIME and
related ops. Save the first sample time and the last sample time to the
feature section in perf file header. That will be done when, for
instance, processing build-ids, where we already have to process all
samples to create the build-id table, take advantage of that to further
amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf
report/script' faster when using --time.
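
A minimal sketch of the payload this feature carries: two 64-bit values,
the first sample time followed by the last sample time. The helpers below
are illustrative only and are not the write_sample_time() implementation;
sample_time_pack() and sample_time_unpack() are hypothetical names, and
the buffer is assumed to be in host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Two uint64_t values: time of first sample, then time of last sample. */
#define SAMPLE_TIME_PAYLOAD_SIZE (2 * sizeof(uint64_t))

/* Store first, then last, into a buffer of SAMPLE_TIME_PAYLOAD_SIZE bytes. */
static void sample_time_pack(unsigned char *buf, uint64_t first, uint64_t last)
{
	memcpy(buf, &first, sizeof(first));
	memcpy(buf + sizeof(first), &last, sizeof(last));
}

/* Read the two values back in the same order. */
static void sample_time_unpack(const unsigned char *buf,
			       uint64_t *first, uint64_t *last)
{
	memcpy(first, buf, sizeof(*first));
	memcpy(last, buf + sizeof(*first), sizeof(*last));
}
```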

Committer testing:

After this patch is applied the header is written with zeroes; the next
patch is needed for "perf record" to actually write the timestamps:

  # perf report -D | grep PERF_RECORD_SAMPLE\(
  22501155244406 0x44f0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xa21be8c5 period: 1 addr: 0
  
  22501155793625 0x4a30 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xa21ffd50 period: 2828043 addr: 0
  # perf report --header | grep "time of "
  # time of first sample : 0.00
  # time of last sample : 0.00
  #

Changelog:

v7: 1. Rebase to latest perf/core branch.

2. Add following clarification in patch description according to
   Arnaldo's suggestion.

   "That will be done when, for instance, processing build-ids,
where we already have to process all samples to create the
build-id table, take advantage of that to further amortize
that processing by storing HEADER_SAMPLE_TIME to make
'perf report/script' faster when using --time."

v4: Use perf script time style for timestamp printing. Also add
printing of the sample duration.

v3: Remove the definitions of first_sample_time/last_sample_time from
perf_session. Just define them in perf_evlist.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1512738826-2628-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf.data-file-format.txt |  4 ++
 tools/perf/util/evlist.h   |  2 +
 tools/perf/util/header.c   | 60 ++
 tools/perf/util/header.h   |  1 +
 4 files changed, 67 insertions(+)

diff --git a/tools/perf/Documentation/perf.data-file-format.txt 
b/tools/perf/Documentation/perf.data-file-format.txt
index 15e8b48..f7d85e8 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -261,6 +261,10 @@ struct {
struct perf_header_string map;
 }[number_of_cache_levels];
 
+   HEADER_SAMPLE_TIME = 21,
+
+Two uint64_t for the time of first sample and the time of last sample.
+
other bits are reserved and should ignored for now
HEADER_FEAT_BITS= 256,
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7516066..e7fbca6 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -50,6 +50,8 @@ struct perf_evlist {
struct perf_evsel *selected;
struct events_stats stats;
struct perf_env *env;
+   u64 first_sample_time;
+   u64 last_sample_time;
 };
 
 struct perf_evsel_str_handler {
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index ca73aa7..a326e0d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "evlist.h"
 #include "evsel.h"
@@ -35,6 +36,7 @@
 #include 
 #include "asm/bug.h"
 #include "tool.h"
+#include "time-utils.h"
 
 #include "sane_ctype.h"
 
@@ -1180,6 +1182,20 @@ static int write_stat(struct feat_fd *ff __maybe_unused,
return 0;
 }
 
+static int write_sample_time(struct feat_fd *ff,
+ 

[tip:perf/core] perf record: Record the first and last sample time in the header

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  68588baf8d01826673f2874f434123029e519052
Gitweb: https://git.kernel.org/tip/68588baf8d01826673f2874f434123029e519052
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:42 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:20:56 -0300

perf record: Record the first and last sample time in the header

In the default 'perf record' configuration, all samples are processed,
to create the HEADER_BUILD_ID table. So it's very easy to get the
first/last samples and save the time to perf file header via the
function write_sample_time().

Later, at post processing time, perf report/script will fetch the time
from perf file header.

Committer testing:

  # perf record -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ]
  [root@jouet home]# perf report --header | grep "time of "
  # time of first sample : 22947.909226
  # time of last sample : 22948.910704
  #
  # perf report -D | grep PERF_RECORD_SAMPLE\(
  0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xa21b1af3 period: 1 addr: 0
  0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xa200d204 period: 1 addr: 0
  
  3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 28251/28251: 0xa22071d8 period: 169518 addr: 0
  0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xa2856816 period: 198807 addr: 0
  2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xa2856816 period: 88111 addr: 0
  #
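
The header values correspond to the nanosecond timestamps in the
'perf report -D' output, printed as seconds.microseconds: 22947909226101
becomes 22947.909226. A small sketch of that conversion, assuming
nanosecond input; format_sample_time() is a hypothetical helper, not the
perf function that does the printing.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC  1000000000ULL
#define NSEC_PER_USEC 1000ULL

/*
 * Format a nanosecond timestamp as "sec.usec". Sub-microsecond digits
 * are truncated, not rounded. Returns what snprintf() returns.
 */
static int format_sample_time(uint64_t ns, char *buf, size_t len)
{
	uint64_t sec = ns / NSEC_PER_SEC;
	uint64_t usec = (ns % NSEC_PER_SEC) / NSEC_PER_USEC;

	return snprintf(buf, len, "%" PRIu64 ".%06" PRIu64, sec, usec);
}
```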

Changelog:

v7: Just update the patch description according to Arnaldo's suggestion.

v6: Currently '--buildid-all' is not enabled by default, so walking
all samples is the default operation. There is no big overhead
in calculating the timestamp boundary in the process_sample_event
handler since we already go through all samples. So the timestamp
boundary calculation is enabled by default when '--buildid-all' is
not enabled.

If '--buildid-all' is enabled, a new option "--timestamp-boundary"
lets the user decide whether to enable the timestamp boundary
calculation.

v5: There is an issue that the sample walking can only work when
'--buildid-all' is not enabled. So we need to let the walking
work even if '--buildid-all' is enabled, and let the processing
skip the dso hit marking in this case.

At first, I wanted to provide a new option "--record-time-boundaries".
After consideration, I think a new option is not really necessary.

v3: Remove the definitions of first_sample_time and last_sample_time
from struct record and directly save them in perf_evlist.
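
The rule described in v6 can be summarized as a small predicate: samples
are walked, and the boundary is recorded, unless '--buildid-all' is given
without '--timestamp-boundary'. The sketch below restates that behaviour
and the first/last bookkeeping; it is not the code from builtin-record.c,
and needs_sample_walk(), 'struct sample_bounds' and sample_bounds_update()
are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Walk all samples unless --buildid-all is set without --timestamp-boundary. */
static bool needs_sample_walk(bool buildid_all, bool timestamp_boundary)
{
	return !buildid_all || timestamp_boundary;
}

/* Hypothetical holder for the two timestamps; 0 means "not seen yet". */
struct sample_bounds {
	uint64_t first;
	uint64_t last;
};

/* Record the first processed sample once; always overwrite the last one. */
static void sample_bounds_update(struct sample_bounds *b, uint64_t time)
{
	if (b->first == 0)
		b->first = time;
	b->last = time;
}
```

Note that using 0 as the "unset" marker means a sample whose timestamp is
really 0 would not be kept as the first sample.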

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-record.txt |  3 +++
 tools/perf/builtin-record.c  | 18 +++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt 
b/tools/perf/Documentation/perf-record.txt
index 5a626ef..3eea6de 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -430,6 +430,9 @@ Configure all used events to run in user space.
 --timestamp-filename
 Append timestamp to output file name.
 
+--timestamp-boundary::
+Record timestamp boundary (time of first/last samples).
+
 --switch-output[=mode]::
 Generate multiple perf.data files, timestamp prefixed, switching to a new one
 based on 'mode' value:
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 50385d8..65681a1 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -78,6 +78,7 @@ struct record {
bool no_buildid_cache_set;
bool buildid_all;
bool timestamp_filename;
+   bool timestamp_boundary;
struct switch_output switch_output;
unsigned long long  samples;
 };
@@ -409,8 +410,15 @@ static int process_sample_event(struct perf_tool *tool,
 {
struct record *rec = container_of(tool, struct record, tool);
 
-   rec->samples++;
+   if (rec->evlist->first_sample_time == 0)
+   rec->evlist->first_sample_time = sample->time;
+
+   rec->evlist->last_sample_time = sample->time;
 
+   if (rec->buildid_all)
+   return 0;
+
+   rec->samples++;
return build_id__mark_dso_hit(tool, event, sample, evsel, machine);
 }
 
@@ -435,9 +443,11 @@ static int process_buildids(struct record *rec)
 
/*
 * If --buildid-all is given, it marks all DSO regardless of hits,
-* so no need to process 

[tip:perf/core] perf report: Fix a no annotate browser displayed issue

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  40c39e3046411f84bab82f66783ff3593e2bcd9b
Gitweb: https://git.kernel.org/tip/40c39e3046411f84bab82f66783ff3593e2bcd9b
Author: Jin Yao 
AuthorDate: Tue, 26 Dec 2017 18:42:43 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:11:57 -0300

perf report: Fix a no annotate browser displayed issue

When enabling '-b' option in perf record, for example,

  perf record -b ...
  perf report

and then browsing the annotate browser from perf report (press 'A'), it
would fail (annotate browser can't be displayed).

This is because the '.add_entry_cb' op of struct report is overwritten by
hist_iter__branch_callback() in builtin-report.c, and this function does not
do things such as mapping symbols and sources. As a result, do_annotate()
returns early:

notes = symbol__annotation(act->ms.sym);
if (!notes->src)
return 0;

This patch adds the lost code to hist_iter__branch_callback (refer to
hist_iter__report_callback).

v2:

Fix a crash when performing 'perf report --stdio'.

The reason is that the symbol annotation is initialized only in browser mode;
resources are not allocated/initialized for stdio mode.

So now hist_iter__branch_callback() returns directly if it's not in
browser mode.

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1514284963-18587-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index eb9ce63..07827cd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -162,12 +162,28 @@ static int hist_iter__branch_callback(struct 
hist_entry_iter *iter,
struct hist_entry *he = iter->he;
struct report *rep = arg;
struct branch_info *bi;
+   struct perf_sample *sample = iter->sample;
+   struct perf_evsel *evsel = iter->evsel;
+   int err;
+
+   if (!ui__has_annotation())
+   return 0;
+
+   hist__account_cycles(sample->branch_stack, al, sample,
+rep->nonany_branch_mode);
 
bi = he->branch_info;
+   err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+   if (err)
+   goto out;
+
+   err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+
branch_type_count(&rep->brtype_stat, &bi->flags,
  bi->from.addr, bi->to.addr);
 
-   return 0;
+out:
+   return err;
 }
 
 static int process_sample_event(struct perf_tool *tool,


[tip:perf/core] perf report: Fix a wrong offset issue when using /proc/kcore

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  935f5a9d4500020879858c9224c98dfabf16101d
Gitweb: https://git.kernel.org/tip/935f5a9d4500020879858c9224c98dfabf16101d
Author: Jin Yao 
AuthorDate: Sat, 30 Dec 2017 00:26:52 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:11:57 -0300

perf report: Fix a wrong offset issue when using /proc/kcore

When a valid vmlinux is not found, 'perf report' falls back to looking at
/proc/kcore. In this case, it reports impossibly large offsets.

For example:

  # perf record -b -e cycles:k find /etc/ > /dev/null
  # perf report --stdio --branch-history

22.77%  _vm_normal_page+18446603336221188162
|
---page_remove_rmap +18446603336221188324
   page_remove_rmap +18446603336221188487 (cycles:5)
   unlock_page_memcg +18446603336221188096
   page_remove_rmap +18446603336221188327 (cycles:1)

The issue is that the value passed to the 'addr' parameter of
__get_srcline() is the objdump address, so calculating the offset as
'addr - sym->start' is not correct.

This patch adds a new parameter 'ip' to __get_srcline(). It is not
converted to an objdump address.
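
To see why mixing address spaces produces such values: the offset is an
unsigned 64-bit subtraction, so when the address has been converted to an
objdump address while sym->start has not, the result can wrap around to a
huge number. A minimal sketch of the arithmetic only; sym_offset() is a
hypothetical helper, not a perf function.

```c
#include <stdint.h>

/*
 * Offset of an address inside a symbol. Both arguments must be in the
 * same address space; otherwise the unsigned subtraction may wrap.
 */
static uint64_t sym_offset(uint64_t addr, uint64_t sym_start)
{
	return addr - sym_start;
}
```

For example, with a made-up sym_start and ip = sym_start + 66,
sym_offset(ip, sym_start) is 66, whereas passing an address that lies
below sym_start wraps around to a value above 2^63.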

With this patch, the perf report output is:

22.77%  _vm_normal_page+66
|
---page_remove_rmap +228
   page_remove_rmap +391 (cycles:5)
   unlock_page_memcg +0
   page_remove_rmap +231 (cycles:1)
   page_remove_rmap +236

Committer testing:

Make sure you get any valid vmlinux out of the way, using '-v' on the
'perf report' case and deleting it from places where perf searches them,
like your kernel build dir and the build-id cache, in ~/.debug/.

Reported-by: Arnaldo Carvalho de Melo 
Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1514564812-17344-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/annotate.c |  3 ++-
 tools/perf/util/machine.c  |  2 +-
 tools/perf/util/map.c  |  2 +-
 tools/perf/util/sort.c | 16 ++--
 tools/perf/util/srcline.c  |  9 +
 tools/perf/util/srcline.h  |  5 +++--
 6 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 68e687d..28b233c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1960,7 +1960,8 @@ static void annotation__calc_lines(struct annotation 
*notes, struct map *map,
if (percent_max <= 0.5)
continue;
 
-   al->path = get_srcline(map->dso, start + al->offset, NULL, 
false, true);
+   al->path = get_srcline(map->dso, start + al->offset, NULL,
+  false, true, start + al->offset);
insert_source_line(&tmp_root, al);
}
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 64d255f..b05a674 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1726,7 +1726,7 @@ static char *callchain_srcline(struct map *map, struct 
symbol *sym, u64 ip)
bool show_addr = callchain_param.key == CCKEY_ADDRESS;
 
srcline = get_srcline(map->dso, map__rip_2objdump(map, ip),
- sym, show_sym, show_addr);
+ sym, show_sym, show_addr, ip);
srcline__tree_insert(&map->dso->srclines, ip, srcline);
}
 
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 6d40efd..8fe5703 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -419,7 +419,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const 
char *prefix,
if (map && map->dso) {
srcline = get_srcline(map->dso,
  map__rip_2objdump(map, addr), NULL,
- true, true);
+ true, true, addr);
if (srcline != SRCLINE_UNKNOWN)
ret = fprintf(fp, "%s%s", prefix, srcline);
free_srcline(srcline);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index a00eacd..211e7f3 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -336,7 +336,7 @@ char *hist_entry__get_srcline(struct hist_entry *he)
return SRCLINE_UNKNOWN;
 
return get_srcline(map->dso, map__rip_2objdump(map, he->ip),
-  he->ms.sym, true, true);
+  he->ms.sym, true, true, he->ip);
 }
 
 static int64_t
@@ -380,7 +380,8 @@ sort__srcline_from_cmp(struct hist_entry *left, struct 
hist_entry *right)
   map__rip_2objdump(map,
 
left->branch_info->from.al_addr),
