Re: [PATCH] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-02-26 Thread Jin, Yao



On 1/23/2018 10:19 PM, Jiri Olsa wrote:

On Mon, Jan 22, 2018 at 01:10:31PM +0800, Jin, Yao wrote:



On 1/16/2018 9:17 PM, Jiri Olsa wrote:

On Tue, Jan 16, 2018 at 09:06:09PM +0800, Jin, Yao wrote:

Just tested. But looks it's not OK for '--per-thread' case.


yea, I haven't tested much.. might need soem tweaking,
but my point was that it could be doable on one place
instead of introducing another if possible

jirka



Hi Jiri,

I ever considered to move the operation of removing error thread to
perf_evsel__fallback(). The perf_evsel__fallback() is common code and it's
shared by perf report, perf stat and perf top.

While finally I think it'd better let the caller decide to remove error
thread and try again, or just return the warning message.
perf_evsel__fallback() probably doesn't know what the caller want to do.

That's my current thinking. Maybe there will be a better fix...


ok, can't think of better fix atm.. looks good ;-)

Acked-by: Jiri Olsa 

thanks,
jirka



Hi Arnaldo,

Could this fix be accepted?

Thanks
Jin Yao


Re: [PATCH] perf annotate: Support to display the LBR data in tui mode

2018-02-23 Thread Jin, Yao



On 2/23/2018 11:29 PM, Arnaldo Carvalho de Melo wrote:

Em Fri, Feb 23, 2018 at 09:25:00AM +0100, Peter Zijlstra escreveu:

On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:

Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the LBR data.



perf record -b ...
perf annotate function



It should show IPC/cycle, but it doesn't.



There is far more than IPC/cycle for the LBR data, so this Changelog is
misleading.



Also, I think that this patch goes the wrong way, we should reduce the
divergence of the various modes, not make it worse.


Right, Peter, what would you think if I made --stdio use the same
routines used to format the TUI, i.e. stdio would be equal to the TUI
modulo de navigation/jump arrows, etc.

We would have switches to provide the TUI output options that make sense
for non-interactive mode, like:

   J   Toggle showing number of jump sources on targets
   o   Toggle disassembler output/simplified view
   s   Toggle source code view
   t   Circulate percent, total period, samples view
   k   Toggle line numbers



Hi Arnaldo, looks your idea is very similar as my idea. In my 
understanding, for example, we may provide switch to tui routine like 
annotate_browser__write() and use fprintf() to replace 
ui_browser__printf()/ui_browser_write__xxx() if switch is on for stdio.


Is that your idea?

For this approach, I think, the benefit is we can reuse most of existing 
code but the disadvantage is we have to mix tui and stdio up.


Thanks
Jin Yao


And would still have e --stdio-classic (deprecated), that we would keep
for a while.

I think that this new mode with "dissassembler output" would be the same
as the current --stdio, I'll check.

To further clarify, this wouldn't use any ncurses/slang TUI code, just
plain printf with things formatted using what is used now for the TUI
mode.

This way there would never be any drift amongst the output modes and we
would have less work to do when implementing new stuff like this LBR
case.

What do you think?

- Arnaldo



Re: [PATCH] perf annotate: Support to display the LBR data in tui mode

2018-02-23 Thread Jin, Yao



On 2/23/2018 11:29 PM, Arnaldo Carvalho de Melo wrote:

Em Fri, Feb 23, 2018 at 09:25:00AM +0100, Peter Zijlstra escreveu:

On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:

Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the LBR data.



perf record -b ...
perf annotate function



It should show IPC/cycle, but it doesn't.



There is far more than IPC/cycle for the LBR data, so this Changelog is
misleading.



Also, I think that this patch goes the wrong way, we should reduce the
divergence of the various modes, not make it worse.


Right, Peter, what would you think if I made --stdio use the same
routines used to format the TUI, i.e. stdio would be equal to the TUI
modulo de navigation/jump arrows, etc.

We would have switches to provide the TUI output options that make sense
for non-interactive mode, like:

   J   Toggle showing number of jump sources on targets
   o   Toggle disassembler output/simplified view
   s   Toggle source code view
   t   Circulate percent, total period, samples view
   k   Toggle line numbers



Hi Arnaldo, looks your idea is very similar as my idea. In my 
understanding, for example, we may provide switch to tui routine like 
annotate_browser__write() and use fprintf() to replace 
ui_browser__printf()/ui_browser_write__xxx() if switch is on for stdio.


Is that your idea?

For this approach, I think, the benefit is we can reuse most of existing 
code but the disadvantage is we have to mix tui and stdio up.


Thanks
Jin Yao


And would still have e --stdio-classic (deprecated), that we would keep
for a while.

I think that this new mode with "dissassembler output" would be the same
as the current --stdio, I'll check.

To further clarify, this wouldn't use any ncurses/slang TUI code, just
plain printf with things formatted using what is used now for the TUI
mode.

This way there would never be any drift amongst the output modes and we
would have less work to do when implementing new stuff like this LBR
case.

What do you think?

- Arnaldo



Re: [PATCH] perf annotate: Support to display the LBR data in tui mode

2018-02-23 Thread Jin, Yao



On 2/23/2018 4:25 PM, Peter Zijlstra wrote:

On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:

Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the LBR data.

perf record -b ...
perf annotate function

It should show IPC/cycle, but it doesn't.


There is far more than IPC/cycle for the LBR data, so this Changelog is
misleading.



I will change the changelog to make it more clear.


Also, I think that this patch goes the wrong way, we should reduce the
divergence of the various modes, not make it worse.

I do plan to support stdio mode. While stdio mode needs more changes 
than tui mode, so I plan to do it in a follow-up patch.


Posting this patch now is because I want to listen from community first 
for this feature. If the tui patch could be accepted, then it's worth 
putting more efforts on stdio version. That's my thoughts.


Thanks
Jin Yao


Re: [PATCH] perf annotate: Support to display the LBR data in tui mode

2018-02-23 Thread Jin, Yao



On 2/23/2018 4:25 PM, Peter Zijlstra wrote:

On Fri, Feb 23, 2018 at 10:35:58PM +0800, Jin Yao wrote:

Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the LBR data.

perf record -b ...
perf annotate function

It should show IPC/cycle, but it doesn't.


There is far more than IPC/cycle for the LBR data, so this Changelog is
misleading.



I will change the changelog to make it more clear.


Also, I think that this patch goes the wrong way, we should reduce the
divergence of the various modes, not make it worse.

I do plan to support stdio mode. While stdio mode needs more changes 
than tui mode, so I plan to do it in a follow-up patch.


Posting this patch now is because I want to listen from community first 
for this feature. If the tui patch could be accepted, then it's worth 
putting more efforts on stdio version. That's my thoughts.


Thanks
Jin Yao


[PATCH] perf annotate: Support to display the LBR data in tui mode

2018-02-22 Thread Jin Yao
Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the LBR data.

perf record -b ...
perf annotate function

It should show IPC/cycle, but it doesn't.

This patch lets perf annotate support the displaying of LBR data.

For example,

perf annotate compute_flag

Percent│ IPC Cycle
   │
   │
   │Disassembly of section .text:
   │
   │00400640 :
   │compute_flag():
   │volatile int count;
   │static unsigned int s_randseed;
   │
   │__attribute__((noinline))
   │int compute_flag()
   │{
 22.96 │1.18   584sub$0x8,%rsp
   │int i;
   │
   │i = rand() % 2;
 23.02 │1.18 1  → callq  rand@plt
   │
   │return i;
 27.05 │3.37  mov%eax,%edx
   │}
   │3.37  add$0x8,%rsp
   │{
   │int i;
   │
   │i = rand() % 2;
   │
   │return i;
   │3.37  shr$0x1f,%edx
   │3.37  add%edx,%eax
   │3.37  and$0x1,%eax
   │3.37  sub%edx,%eax
   │}
 26.97 │3.37 2  ← retq

Note that, this patch only supports tui mode. For other mode like --stdio,
it just keeps original behavior.

perf annotate compute_flag --stdio

 Percent |  Source code & Disassembly of div for cycles:ppp (7993 samples)
--
 :
 :
 :
 :Disassembly of section .text:
 :
 :00400640 :
 :compute_flag():
 :volatile int count;
 :static unsigned int s_randseed;
 :
 :__attribute__((noinline))
 :int compute_flag()
 :{
0.29 :   400640:   sub$0x8,%rsp # +100.00%
 :int i;
 :
 :i = rand() % 2;
   42.93 :   400644:   callq  400490 <rand@plt> # -100.00% (p:100.00%)
 :
 :return i;
0.10 :   400649:   mov%eax,%edx # +100.00%
 :}
0.94 :   40064b:   add$0x8,%rsp
 :{
 :int i;
 :
 :i = rand() % 2;
 :
 :return i;
   27.02 :   40064f:   shr$0x1f,%edx
0.15 :   400652:   add%edx,%eax
1.24 :   400654:   and$0x1,%eax
2.08 :   400657:   sub%edx,%eax
 :}
   25.26 :   400659:   retq # -100.00% (p:100.00%)

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/builtin-annotate.c | 88 ---
 1 file changed, 82 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index f15731a..ead6ae4 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -44,6 +44,7 @@ struct perf_annotate {
bool   full_paths;
bool   print_line;
bool   skip_missing;
+   bool   has_br_stack;
const char *sym_hist_filter;
const char *cpu_list;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -146,16 +147,73 @@ static void process_branch_stack(struct branch_stack *bs, 
struct addr_location *
free(bi);
 }
 
+static int hist_iter__branch_callback(struct hist_entry_iter *iter,
+ struct addr_location *al __maybe_unused,
+ bool single __maybe_unused,
+ void *arg __maybe_unused)
+{
+   struct hist_entry *he = iter->he;
+   struct branch_info *bi;
+   struct perf_sample *sample = iter->sample;
+   struct perf_evsel *evsel = iter->evsel;
+   int err;
+
+   hist__account_cycles(sample->branch_stack, al, sample, false);
+
+   bi = he->branch_info;
+   err = addr_map_symbol__inc_samples(>from, sample, evsel->idx);
+
+   if (err)
+   goto out;
+
+   err = addr_map_symbol__inc_samples(>to, sample, evsel->idx);
+
+out:
+   return err;
+}
+
+static int process_branch_callback(struct perf_evsel *evsel,
+  struct perf_sample *sample,
+  struct addr_location *al __maybe_unused,
+  struct perf_annotate *ann,
+  struct machine *machine)
+{
+   struct hist_entry_iter iter = {
+   .evse

[PATCH] perf annotate: Support to display the LBR data in tui mode

2018-02-22 Thread Jin Yao
Unlike the perf report interactive annotate mode, the perf annotate
doesn't display the LBR data.

perf record -b ...
perf annotate function

It should show IPC/cycle, but it doesn't.

This patch lets perf annotate support the displaying of LBR data.

For example,

perf annotate compute_flag

Percent│ IPC Cycle
   │
   │
   │Disassembly of section .text:
   │
   │00400640 :
   │compute_flag():
   │volatile int count;
   │static unsigned int s_randseed;
   │
   │__attribute__((noinline))
   │int compute_flag()
   │{
 22.96 │1.18   584sub$0x8,%rsp
   │int i;
   │
   │i = rand() % 2;
 23.02 │1.18 1  → callq  rand@plt
   │
   │return i;
 27.05 │3.37  mov%eax,%edx
   │}
   │3.37  add$0x8,%rsp
   │{
   │int i;
   │
   │i = rand() % 2;
   │
   │return i;
   │3.37  shr$0x1f,%edx
   │3.37  add%edx,%eax
   │3.37  and$0x1,%eax
   │3.37  sub%edx,%eax
   │}
 26.97 │3.37 2  ← retq

Note that, this patch only supports tui mode. For other mode like --stdio,
it just keeps original behavior.

perf annotate compute_flag --stdio

 Percent |  Source code & Disassembly of div for cycles:ppp (7993 samples)
--
 :
 :
 :
 :Disassembly of section .text:
 :
 :00400640 :
 :compute_flag():
 :volatile int count;
 :static unsigned int s_randseed;
 :
 :__attribute__((noinline))
 :int compute_flag()
 :{
0.29 :   400640:   sub$0x8,%rsp # +100.00%
 :int i;
 :
 :i = rand() % 2;
   42.93 :   400644:   callq  400490  # -100.00% (p:100.00%)
 :
 :return i;
0.10 :   400649:   mov%eax,%edx # +100.00%
 :}
0.94 :   40064b:   add$0x8,%rsp
 :{
 :int i;
 :
 :i = rand() % 2;
 :
 :return i;
   27.02 :   40064f:   shr$0x1f,%edx
0.15 :   400652:   add%edx,%eax
1.24 :   400654:   and$0x1,%eax
2.08 :   400657:   sub%edx,%eax
 :}
   25.26 :   400659:   retq # -100.00% (p:100.00%)

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-annotate.c | 88 ---
 1 file changed, 82 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index f15731a..ead6ae4 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -44,6 +44,7 @@ struct perf_annotate {
bool   full_paths;
bool   print_line;
bool   skip_missing;
+   bool   has_br_stack;
const char *sym_hist_filter;
const char *cpu_list;
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -146,16 +147,73 @@ static void process_branch_stack(struct branch_stack *bs, 
struct addr_location *
free(bi);
 }
 
+static int hist_iter__branch_callback(struct hist_entry_iter *iter,
+ struct addr_location *al __maybe_unused,
+ bool single __maybe_unused,
+ void *arg __maybe_unused)
+{
+   struct hist_entry *he = iter->he;
+   struct branch_info *bi;
+   struct perf_sample *sample = iter->sample;
+   struct perf_evsel *evsel = iter->evsel;
+   int err;
+
+   hist__account_cycles(sample->branch_stack, al, sample, false);
+
+   bi = he->branch_info;
+   err = addr_map_symbol__inc_samples(>from, sample, evsel->idx);
+
+   if (err)
+   goto out;
+
+   err = addr_map_symbol__inc_samples(>to, sample, evsel->idx);
+
+out:
+   return err;
+}
+
+static int process_branch_callback(struct perf_evsel *evsel,
+  struct perf_sample *sample,
+  struct addr_location *al __maybe_unused,
+  struct perf_annotate *ann,
+  struct machine *machine)
+{
+   struct hist_entry_iter iter = {
+   .evsel  = evsel,
+

[tip:perf/core] perf report: Fix wrong jump arrow

2018-02-17 Thread tip-bot for Jin Yao
Commit-ID:  b40982e8468b46b8f7f5bba5a7e541ec04a29d7d
Gitweb: https://git.kernel.org/tip/b40982e8468b46b8f7f5bba5a7e541ec04a29d7d
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Mon, 29 Jan 2018 18:57:53 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Fri, 16 Feb 2018 14:55:47 -0300

perf report: Fix wrong jump arrow

When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,

1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'

Percent│ IPC Cycle
   │if (flag)
  1.37 │0.4┌──   1  ↓ je 82
   │   │x += x / y + y / x;
  0.00 │0.4│  1310movsd  (%rsp),%xmm0
  0.00 │0.4│   565movsd  0x8(%rsp),%xmm4
   │0.4│  movsd  0x8(%rsp),%xmm1
   │0.4│  movsd  (%rsp),%xmm3
   │0.4│  divsd  %xmm4,%xmm0
  0.00 │0.4│   579divsd  %xmm3,%xmm1
   │0.4│  movsd  (%rsp),%xmm2
   │0.4│  addsd  %xmm1,%xmm0
   │0.4│  addsd  %xmm2,%xmm0
  0.00 │0.4│  movsd  %xmm0,(%rsp)
   │   │volatile double x = 1212121212, y = 121212;
   │   │
   │   │s_randseed = time(0);
   │   │srand(s_randseed);
   │   │
   │   │for (i = 0; i < 20; i++) {
  1.37 │0.4└─→  82:   sub$0x1,%ebx
 28.21 │0.4817  ↑ jne38

The jump arrow in above example is not correct. It should add the
width of IPC and Cycle.

With this patch, the result is:

Percent│ IPC Cycle
   │if (flag)
  1.37 │0.48 1 ┌──je 82
   │   │x += x / y + y / x;
  0.00 │0.48  1310 │  movsd  (%rsp),%xmm0
  0.00 │0.48   565 │  movsd  0x8(%rsp),%xmm4
   │0.48   │  movsd  0x8(%rsp),%xmm1
   │0.48   │  movsd  (%rsp),%xmm3
   │0.48   │  divsd  %xmm4,%xmm0
  0.00 │0.48   579 │  divsd  %xmm3,%xmm1
   │0.48   │  movsd  (%rsp),%xmm2
   │0.48   │  addsd  %xmm1,%xmm0
   │0.48   │  addsd  %xmm2,%xmm0
  0.00 │0.48   │  movsd  %xmm0,(%rsp)
   │   │volatile double x = 1212121212, y = 121212;
   │   │
   │   │s_randseed = time(0);
   │   │srand(s_randseed);
   │   │
   │   │for (i = 0; i < 20; i++) {
  1.37 │0.4882:└─→sub$0x1,%ebx
 28.21 │0.4817  ↑ jne38

Committer notes:

Please note that only from LBRv5 (according to Jiri) onwards, i.e. >=
Skylake is that we'll have the cycles counts in each branch record
entry, so to see the Cycles and IPC columns, and be able to test this
patch, one need a capable hardware.

While applying this I first tested it on a Broadwell class machine and
couldn't get those columns, will add code to the annotate browser to
warn the user about that, i.e. you have branch records, but no cycles,
use a more recent hardware to get the cycles and IPC columns.

Signed-off-by: Jin Yao <yao@linux.intel.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jin Yao <yao@intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/ui/browsers/annotate.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c 
b/tools/perf/ui/browsers/annotate.c
index 2864279..e2f6663 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -319,6 +319,7 @@ static void annotate_browser__draw_current_jump(struct 
ui_browser *browser)
struct map_symbol *ms = ab->b.priv;
struct symbol *sym = ms->sym;
u8 pcnt_width = annotate_browser__pcnt_width(ab);
+   int width = 0;
 
/* PLT symbols contain external offsets */
if (strstr(sym->name, "@plt"))
@@ -340,13 +341,17 @@ static void annotate_browser__draw_current_jump(struct 
ui_browser *browser)
to = (u64)btarget->idx;
}
 
+   if (ab->have_cycles)
+   width = IPC_WIDTH + CYCLES_WIDTH;
+
ui_browser__set_color(browser, HE_COLORSET_JUMP_ARROWS);
-   __ui_browser__line_arrow(browser, pcnt_width + 2 + ab->addr_width,
+   __ui_browser__line_arrow(browser,
+pcnt_width + 2 + ab->addr_width + width,
 from, to);
 
if (is_fused(ab, cursor)) {
  

[tip:perf/core] perf report: Fix wrong jump arrow

2018-02-17 Thread tip-bot for Jin Yao
Commit-ID:  b40982e8468b46b8f7f5bba5a7e541ec04a29d7d
Gitweb: https://git.kernel.org/tip/b40982e8468b46b8f7f5bba5a7e541ec04a29d7d
Author: Jin Yao 
AuthorDate: Mon, 29 Jan 2018 18:57:53 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 16 Feb 2018 14:55:47 -0300

perf report: Fix wrong jump arrow

When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,

1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'

Percent│ IPC Cycle
   │if (flag)
  1.37 │0.4┌──   1  ↓ je 82
   │   │x += x / y + y / x;
  0.00 │0.4│  1310movsd  (%rsp),%xmm0
  0.00 │0.4│   565movsd  0x8(%rsp),%xmm4
   │0.4│  movsd  0x8(%rsp),%xmm1
   │0.4│  movsd  (%rsp),%xmm3
   │0.4│  divsd  %xmm4,%xmm0
  0.00 │0.4│   579divsd  %xmm3,%xmm1
   │0.4│  movsd  (%rsp),%xmm2
   │0.4│  addsd  %xmm1,%xmm0
   │0.4│  addsd  %xmm2,%xmm0
  0.00 │0.4│  movsd  %xmm0,(%rsp)
   │   │volatile double x = 1212121212, y = 121212;
   │   │
   │   │s_randseed = time(0);
   │   │srand(s_randseed);
   │   │
   │   │for (i = 0; i < 20; i++) {
  1.37 │0.4└─→  82:   sub$0x1,%ebx
 28.21 │0.4817  ↑ jne38

The jump arrow in above example is not correct. It should add the
width of IPC and Cycle.

With this patch, the result is:

Percent│ IPC Cycle
   │if (flag)
  1.37 │0.48 1 ┌──je 82
   │   │x += x / y + y / x;
  0.00 │0.48  1310 │  movsd  (%rsp),%xmm0
  0.00 │0.48   565 │  movsd  0x8(%rsp),%xmm4
   │0.48   │  movsd  0x8(%rsp),%xmm1
   │0.48   │  movsd  (%rsp),%xmm3
   │0.48   │  divsd  %xmm4,%xmm0
  0.00 │0.48   579 │  divsd  %xmm3,%xmm1
   │0.48   │  movsd  (%rsp),%xmm2
   │0.48   │  addsd  %xmm1,%xmm0
   │0.48   │  addsd  %xmm2,%xmm0
  0.00 │0.48   │  movsd  %xmm0,(%rsp)
   │   │volatile double x = 1212121212, y = 121212;
   │   │
   │   │s_randseed = time(0);
   │   │srand(s_randseed);
   │   │
   │   │for (i = 0; i < 20; i++) {
  1.37 │0.4882:└─→sub$0x1,%ebx
 28.21 │0.4817  ↑ jne38

Committer notes:

Please note that only from LBRv5 (according to Jiri) onwards, i.e. >=
Skylake is that we'll have the cycles counts in each branch record
entry, so to see the Cycles and IPC columns, and be able to test this
patch, one need a capable hardware.

While applying this I first tested it on a Broadwell class machine and
couldn't get those columns, will add code to the annotate browser to
warn the user about that, i.e. you have branch records, but no cycles,
use a more recent hardware to get the cycles and IPC columns.

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/ui/browsers/annotate.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c 
b/tools/perf/ui/browsers/annotate.c
index 2864279..e2f6663 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -319,6 +319,7 @@ static void annotate_browser__draw_current_jump(struct 
ui_browser *browser)
struct map_symbol *ms = ab->b.priv;
struct symbol *sym = ms->sym;
u8 pcnt_width = annotate_browser__pcnt_width(ab);
+   int width = 0;
 
/* PLT symbols contain external offsets */
if (strstr(sym->name, "@plt"))
@@ -340,13 +341,17 @@ static void annotate_browser__draw_current_jump(struct 
ui_browser *browser)
to = (u64)btarget->idx;
}
 
+   if (ab->have_cycles)
+   width = IPC_WIDTH + CYCLES_WIDTH;
+
ui_browser__set_color(browser, HE_COLORSET_JUMP_ARROWS);
-   __ui_browser__line_arrow(browser, pcnt_width + 2 + ab->addr_width,
+   __ui_browser__line_arrow(browser,
+pcnt_width + 2 + ab->addr_width + width,
 from, to);
 
if (is_fused(ab, cursor)) {
ui_browser__mark_fused(browser,
-  pcnt_width + 3 + ab->addr_width,
+  pcnt_width + 3 + ab->addr_width + width,
   from - 1,
   to > from ? true : false);
}


[tip:perf/core] perf tools: Use target->per_thread and target->system_wide flags

2018-02-17 Thread tip-bot for Jin Yao
Commit-ID:  147c508f3004df6e2958f6c8867909531c2a15e2
Gitweb: https://git.kernel.org/tip/147c508f3004df6e2958f6c8867909531c2a15e2
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Mon, 12 Feb 2018 13:32:36 -0700
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Fri, 16 Feb 2018 14:55:40 -0300

perf tools: Use target->per_thread and target->system_wide flags

Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on 'perf
record --per-thread'. It has the effect of creating a kernel event for
each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

Signed-off-by: Jin Yao <yao@linux.intel.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Namhyung Kim <namhy...@kernel.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: linux-arm-ker...@lists.infradead.org
Fixes: 73c0ca1eee3d ("perf thread_map: Enumerate all threads from /proc")
Link: 
http://lkml.kernel.org/r/1518467557-18505-3-git-send-email-mathieu.poir...@linaro.org
Signed-off-by: Mathieu Poirier <mathieu.poir...@linaro.org>
[Fixed checkpatch warning about line over 80 characters]
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/util/evlist.c | 21 -
 tools/perf/util/thread_map.c |  4 ++--
 tools/perf/util/thread_map.h |  2 +-
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e5fc14e..7b7d535 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1086,11 +1086,30 @@ int perf_evlist__mmap(struct perf_evlist *evlist, 
unsigned int pages)
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 {
+   bool all_threads = (target->per_thread && target->system_wide);
struct cpu_map *cpus;
struct thread_map *threads;
 
+   /*
+* If specify '-a' and '--per-thread' to perf record, perf record
+* will override '--per-thread'. target->per_thread = false and
+* target->system_wide = true.
+*
+* If specify '--per-thread' only to perf record,
+* target->per_thread = true and target->system_wide = false.
+*
+* So target->per_thread && target->system_wide is false.
+* For perf record, thread_map__new_str doesn't call
+* thread_map__new_all_cpus. That will keep perf record's
+* current behavior.
+*
+* For perf stat, it allows the case that target->per_thread and
+* target->system_wide are all true. It means to collect system-wide
+* per-thread data. thread_map__new_str will call
+* thread_map__new_all_cpus to enumerate all threads.
+*/
threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ all_threads);
 
if (!threads)
return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..729dad8 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ out_free_threads:
 }
 
 struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
 {
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
 
-   if (per_thread)
+   if (all_threads)
return thread_map__new_all_cpus();
 
return thread_map__new_by_tid_str(tid);
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b9..5ec91cf 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
 void thread_map__put(struct thread_map *map);
 
 struct thread_map *thread_map__new_str(const char *pid,
-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);
 
 struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
 


[tip:perf/core] perf tools: Use target->per_thread and target->system_wide flags

2018-02-17 Thread tip-bot for Jin Yao
Commit-ID:  147c508f3004df6e2958f6c8867909531c2a15e2
Gitweb: https://git.kernel.org/tip/147c508f3004df6e2958f6c8867909531c2a15e2
Author: Jin Yao 
AuthorDate: Mon, 12 Feb 2018 13:32:36 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 16 Feb 2018 14:55:40 -0300

perf tools: Use target->per_thread and target->system_wide flags

Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on 'perf
record --per-thread'. It has the effect of creating a kernel event for
each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

Signed-off-by: Jin Yao 
Cc: Alexander Shishkin 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: linux-arm-ker...@lists.infradead.org
Fixes: 73c0ca1eee3d ("perf thread_map: Enumerate all threads from /proc")
Link: 
http://lkml.kernel.org/r/1518467557-18505-3-git-send-email-mathieu.poir...@linaro.org
Signed-off-by: Mathieu Poirier 
[Fixed checkpatch warning about line over 80 characters]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/evlist.c | 21 -
 tools/perf/util/thread_map.c |  4 ++--
 tools/perf/util/thread_map.h |  2 +-
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index e5fc14e..7b7d535 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1086,11 +1086,30 @@ int perf_evlist__mmap(struct perf_evlist *evlist, 
unsigned int pages)
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 {
+   bool all_threads = (target->per_thread && target->system_wide);
struct cpu_map *cpus;
struct thread_map *threads;
 
+   /*
+* If specify '-a' and '--per-thread' to perf record, perf record
+* will override '--per-thread'. target->per_thread = false and
+* target->system_wide = true.
+*
+* If specify '--per-thread' only to perf record,
+* target->per_thread = true and target->system_wide = false.
+*
+* So target->per_thread && target->system_wide is false.
+* For perf record, thread_map__new_str doesn't call
+* thread_map__new_all_cpus. That will keep perf record's
+* current behavior.
+*
+* For perf stat, it allows the case that target->per_thread and
+* target->system_wide are all true. It means to collect system-wide
+* per-thread data. thread_map__new_str will call
+* thread_map__new_all_cpus to enumerate all threads.
+*/
threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ all_threads);
 
if (!threads)
return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..729dad8 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ out_free_threads:
 }
 
 struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
 {
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
 
-   if (per_thread)
+   if (all_threads)
return thread_map__new_all_cpus();
 
return thread_map__new_by_tid_str(tid);
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b9..5ec91cf 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
 void thread_map__put(struct thread_map *map);
 
 struct thread_map *thread_map__new_str(const char *pid,
-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);
 
 struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
 


Re: [PATCH] perf report: Fix a memory corrupton issue when enabling --branch-history

2018-02-15 Thread Jin, Yao



On 2/13/2018 10:00 PM, Jin, Yao wrote:



On 2/13/2018 5:45 PM, Jiri Olsa wrote:

On Tue, Feb 13, 2018 at 04:44:28PM +0800, Jin Yao wrote:

Following command lines will cause perf crash.

perf record -j call -g -a 
perf report --branch-history

*** Error in `perf': double free or corruption (!prev): 
0x104aa040 ***

=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
/lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
perf[0x51b914]
perf(hist_entry_iter__add+0x1e5)[0x51f305]
perf[0x43cf01]
perf[0x4fa3bf]
perf[0x4fa923]
perf[0x4fd396]
perf[0x4f9614]
perf(perf_session__process_events+0x89e)[0x4fc38e]
perf(cmd_report+0x15d2)[0x43f202]
perf[0x4a059f]
perf(main+0x631)[0x427b71]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
perf(_start+0x29)[0x427d89]

The memory corruption happens at:

iter_add_next_cumulative_entry()
{
 ...
 for (i = 0; i < iter->curr; i++) {
 ...
}

Whatever in iter_next_cumulative_entry() or in 
iter_add_next_cumulative_entry(),

they all don't check if iter->curr exceeds the array 'he_cache[]'.

If there are too many nodes in callchain, it's possible that 
iter->curr >

iter->max_stack, then memory corruption occurs.

This patch will reallocate array 'he_cache[]' in 
iter_next_cumulative_entry()

if necessary (the case of too many nodes in callchain).


right, the max_stack does not say how many nodes end up in
callchain_cursor at the end.. good catch, please mention
that also in the changelog



max_stack looks only to limit the number of calls but not for other 
branches.



however we know the final count from callchain_cursor itself,
the attached patch might do the same job, right?



I think the attached patch is ok.


also could we now get rid of iter->max_stack?



 From my opinion, the option '--max-stack' in perf report looks not very 
necessary. While it's just my personal opinion, need to hear from more 
people. :)


Thanks
Jin Yao


thanks,
jirka


---
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b6140950301e..b50b7b70dcca 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -879,7 +879,7 @@ iter_prepare_cumulative_entry(struct 
hist_entry_iter *iter,

   * cumulated only one time to prevent entries more than 100%
   * overhead.
   */
-    he_cache = malloc(sizeof(*he_cache) * (iter->max_stack + 1));
+    he_cache = malloc(sizeof(*he_cache) * (callchain_cursor.nr + 1));
  if (he_cache == NULL)
  return -ENOMEM;



Hi Jiri,

I guess you will post this patch, right?

Thanks
Jin Yao


Re: [PATCH] perf report: Fix a memory corrupton issue when enabling --branch-history

2018-02-15 Thread Jin, Yao



On 2/13/2018 10:00 PM, Jin, Yao wrote:



On 2/13/2018 5:45 PM, Jiri Olsa wrote:

On Tue, Feb 13, 2018 at 04:44:28PM +0800, Jin Yao wrote:

Following command lines will cause perf crash.

perf record -j call -g -a 
perf report --branch-history

*** Error in `perf': double free or corruption (!prev): 
0x104aa040 ***

=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
/lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
perf[0x51b914]
perf(hist_entry_iter__add+0x1e5)[0x51f305]
perf[0x43cf01]
perf[0x4fa3bf]
perf[0x4fa923]
perf[0x4fd396]
perf[0x4f9614]
perf(perf_session__process_events+0x89e)[0x4fc38e]
perf(cmd_report+0x15d2)[0x43f202]
perf[0x4a059f]
perf(main+0x631)[0x427b71]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
perf(_start+0x29)[0x427d89]

The memory corruption happens at:

iter_add_next_cumulative_entry()
{
 ...
 for (i = 0; i < iter->curr; i++) {
 ...
}

Whatever in iter_next_cumulative_entry() or in 
iter_add_next_cumulative_entry(),

they all don't check if iter->curr exceeds the array 'he_cache[]'.

If there are too many nodes in callchain, it's possible that 
iter->curr >

iter->max_stack, then memory corruption occurs.

This patch will reallocate array 'he_cache[]' in 
iter_next_cumulative_entry()

if necessary (the case of too many nodes in callchain).


right, the max_stack does not say how many nodes end up in
callchain_cursor at the end.. good catch, please mention
that also in the changelog



max_stack looks only to limit the number of calls but not for other 
branches.



however we know the final count from callchain_cursor itself,
the attached patch might do the same job, right?



I think the attached patch is ok.


also could we now get rid of iter->max_stack?



 From my opinion, the option '--max-stack' in perf report looks not very 
necessary. While it's just my personal opinion, need to hear from more 
people. :)


Thanks
Jin Yao


thanks,
jirka


---
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b6140950301e..b50b7b70dcca 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -879,7 +879,7 @@ iter_prepare_cumulative_entry(struct 
hist_entry_iter *iter,

   * cumulated only one time to prevent entries more than 100%
   * overhead.
   */
-    he_cache = malloc(sizeof(*he_cache) * (iter->max_stack + 1));
+    he_cache = malloc(sizeof(*he_cache) * (callchain_cursor.nr + 1));
  if (he_cache == NULL)
  return -ENOMEM;



Hi Jiri,

I guess you will post this patch, right?

Thanks
Jin Yao


Re: [PATCH] perf report: Fix a memory corrupton issue when enabling --branch-history

2018-02-13 Thread Jin, Yao



On 2/13/2018 5:45 PM, Jiri Olsa wrote:

On Tue, Feb 13, 2018 at 04:44:28PM +0800, Jin Yao wrote:

Following command lines will cause perf crash.

perf record -j call -g -a 
perf report --branch-history

*** Error in `perf': double free or corruption (!prev): 0x104aa040 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
/lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
perf[0x51b914]
perf(hist_entry_iter__add+0x1e5)[0x51f305]
perf[0x43cf01]
perf[0x4fa3bf]
perf[0x4fa923]
perf[0x4fd396]
perf[0x4f9614]
perf(perf_session__process_events+0x89e)[0x4fc38e]
perf(cmd_report+0x15d2)[0x43f202]
perf[0x4a059f]
perf(main+0x631)[0x427b71]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
perf(_start+0x29)[0x427d89]

The memory corruption happens at:

iter_add_next_cumulative_entry()
{
 ...
 for (i = 0; i < iter->curr; i++) {
 ...
}

Whatever in iter_next_cumulative_entry() or in iter_add_next_cumulative_entry(),
they all don't check if iter->curr exceeds the array 'he_cache[]'.

If there are too many nodes in callchain, it's possible that iter->curr >
iter->max_stack, then memory corruption occurs.

This patch will reallocate array 'he_cache[]' in iter_next_cumulative_entry()
if necessary (the case of too many nodes in callchain).


right, the max_stack does not say how many nodes end up in
callchain_cursor at the end.. good catch, please mention
that also in the changelog



max_stack looks only to limit the number of calls but not for other 
branches.



however we know the final count from callchain_cursor itself,
the attached patch might do the same job, right?



I think the attached patch is ok.


also could we now get rid of iter->max_stack?



From my opinion, the option '--max-stack' in perf report looks not very 
necessary. While it's just my personal opinion, need to hear from more 
people. :)


Thanks
Jin Yao


thanks,
jirka


---
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b6140950301e..b50b7b70dcca 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -879,7 +879,7 @@ iter_prepare_cumulative_entry(struct hist_entry_iter *iter,
 * cumulated only one time to prevent entries more than 100%
 * overhead.
 */
-   he_cache = malloc(sizeof(*he_cache) * (iter->max_stack + 1));
+   he_cache = malloc(sizeof(*he_cache) * (callchain_cursor.nr + 1));
if (he_cache == NULL)
return -ENOMEM;
  



Re: [PATCH] perf report: Fix a memory corrupton issue when enabling --branch-history

2018-02-13 Thread Jin, Yao



On 2/13/2018 5:45 PM, Jiri Olsa wrote:

On Tue, Feb 13, 2018 at 04:44:28PM +0800, Jin Yao wrote:

Following command lines will cause perf crash.

perf record -j call -g -a 
perf report --branch-history

*** Error in `perf': double free or corruption (!prev): 0x104aa040 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
/lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
perf[0x51b914]
perf(hist_entry_iter__add+0x1e5)[0x51f305]
perf[0x43cf01]
perf[0x4fa3bf]
perf[0x4fa923]
perf[0x4fd396]
perf[0x4f9614]
perf(perf_session__process_events+0x89e)[0x4fc38e]
perf(cmd_report+0x15d2)[0x43f202]
perf[0x4a059f]
perf(main+0x631)[0x427b71]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
perf(_start+0x29)[0x427d89]

The memory corruption happens at:

iter_add_next_cumulative_entry()
{
 ...
 for (i = 0; i < iter->curr; i++) {
 ...
}

Whatever in iter_next_cumulative_entry() or in iter_add_next_cumulative_entry(),
they all don't check if iter->curr exceeds the array 'he_cache[]'.

If there are too many nodes in callchain, it's possible that iter->curr >
iter->max_stack, then memory corruption occurs.

This patch will reallocate array 'he_cache[]' in iter_next_cumulative_entry()
if necessary (the case of too many nodes in callchain).


right, the max_stack does not say how many nodes end up in
callchain_cursor at the end.. good catch, please mention
that also in the changelog



max_stack looks only to limit the number of calls but not for other 
branches.



however we know the final count from callchain_cursor itself,
the attached patch might do the same job, right?



I think the attached patch is ok.


also could we now get rid of iter->max_stack?



From my opinion, the option '--max-stack' in perf report looks not very 
necessary. While it's just my personal opinion, need to hear from more 
people. :)


Thanks
Jin Yao


thanks,
jirka


---
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b6140950301e..b50b7b70dcca 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -879,7 +879,7 @@ iter_prepare_cumulative_entry(struct hist_entry_iter *iter,
 * cumulated only one time to prevent entries more than 100%
 * overhead.
 */
-   he_cache = malloc(sizeof(*he_cache) * (iter->max_stack + 1));
+   he_cache = malloc(sizeof(*he_cache) * (callchain_cursor.nr + 1));
if (he_cache == NULL)
return -ENOMEM;
  



[PATCH] perf report: Fix a memory corrupton issue when enabling --branch-history

2018-02-12 Thread Jin Yao
Following command lines will cause perf crash.

perf record -j call -g -a 
perf report --branch-history

*** Error in `perf': double free or corruption (!prev): 0x104aa040 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
/lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
perf[0x51b914]
perf(hist_entry_iter__add+0x1e5)[0x51f305]
perf[0x43cf01]
perf[0x4fa3bf]
perf[0x4fa923]
perf[0x4fd396]
perf[0x4f9614]
perf(perf_session__process_events+0x89e)[0x4fc38e]
perf(cmd_report+0x15d2)[0x43f202]
perf[0x4a059f]
perf(main+0x631)[0x427b71]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
perf(_start+0x29)[0x427d89]

The memory corruption happens at:

iter_add_next_cumulative_entry()
{
...
for (i = 0; i < iter->curr; i++) {
...
}

Whatever in iter_next_cumulative_entry() or in iter_add_next_cumulative_entry(),
they all don't check if iter->curr exceeds the array 'he_cache[]'.

If there are too many nodes in callchain, it's possible that iter->curr >
iter->max_stack, then memory corruption occurs.

This patch will reallocate array 'he_cache[]' in iter_next_cumulative_entry()
if necessary (the case of too many nodes in callchain).

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/util/hist.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b614095..71f07d2 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -926,11 +926,32 @@ iter_next_cumulative_entry(struct hist_entry_iter *iter,
   struct addr_location *al)
 {
struct callchain_cursor_node *node;
+   struct hist_entry **tmp;
+   int i;
 
node = callchain_cursor_current(_cursor);
if (node == NULL)
return 0;
 
+   /*
+* If there are too many nodes in callchain,
+* increase the size of he_cache[].
+*/
+   if (iter->curr == iter->max_stack) {
+   i = 2 * iter->max_stack + 1;
+   tmp = realloc(iter->priv, sizeof(struct hist_entry *) * i);
+   if (tmp == NULL) {
+   /*
+* No need to free iter->priv here. It will be
+* freed in iter_finish_cumulative_entry.
+*/
+   return 0;
+   }
+
+   iter->priv = tmp;
+   iter->max_stack = i;
+   }
+
return fill_callchain_info(al, node, iter->hide_unresolved);
 }
 
-- 
2.7.4



[PATCH] perf report: Fix a memory corrupton issue when enabling --branch-history

2018-02-12 Thread Jin Yao
Following command lines will cause perf crash.

perf record -j call -g -a 
perf report --branch-history

*** Error in `perf': double free or corruption (!prev): 0x104aa040 ***
=== Backtrace: =
/lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
/lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
perf[0x51b914]
perf(hist_entry_iter__add+0x1e5)[0x51f305]
perf[0x43cf01]
perf[0x4fa3bf]
perf[0x4fa923]
perf[0x4fd396]
perf[0x4f9614]
perf(perf_session__process_events+0x89e)[0x4fc38e]
perf(cmd_report+0x15d2)[0x43f202]
perf[0x4a059f]
perf(main+0x631)[0x427b71]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
perf(_start+0x29)[0x427d89]

The memory corruption happens at:

iter_add_next_cumulative_entry()
{
...
for (i = 0; i < iter->curr; i++) {
...
}

Whatever in iter_next_cumulative_entry() or in iter_add_next_cumulative_entry(),
they all don't check if iter->curr exceeds the array 'he_cache[]'.

If there are too many nodes in callchain, it's possible that iter->curr >
iter->max_stack, then memory corruption occurs.

This patch will reallocate array 'he_cache[]' in iter_next_cumulative_entry()
if necessary (the case of too many nodes in callchain).

Signed-off-by: Jin Yao 
---
 tools/perf/util/hist.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index b614095..71f07d2 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -926,11 +926,32 @@ iter_next_cumulative_entry(struct hist_entry_iter *iter,
   struct addr_location *al)
 {
struct callchain_cursor_node *node;
+   struct hist_entry **tmp;
+   int i;
 
node = callchain_cursor_current(_cursor);
if (node == NULL)
return 0;
 
+   /*
+* If there are too many nodes in callchain,
+* increase the size of he_cache[].
+*/
+   if (iter->curr == iter->max_stack) {
+   i = 2 * iter->max_stack + 1;
+   tmp = realloc(iter->priv, sizeof(struct hist_entry *) * i);
+   if (tmp == NULL) {
+   /*
+* No need to free iter->priv here. It will be
+* freed in iter_finish_cumulative_entry.
+*/
+   return 0;
+   }
+
+   iter->priv = tmp;
+   iter->max_stack = i;
+   }
+
return fill_callchain_info(al, node, iter->hide_unresolved);
 }
 
-- 
2.7.4



Re: [PATCH] perf report: Fix wrong jump arrow

2018-02-12 Thread Jin, Yao

Hi Arnaldo,

Thanks for applying the patch. Yes the issue only happens on skl+.

The committer notes are very good and clear.

Thanks
Jin Yao

On 2/9/2018 11:15 PM, Arnaldo Carvalho de Melo wrote:

Em Mon, Jan 29, 2018 at 06:57:53PM +0800, Jin Yao escreveu:

When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,

1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'

Percent│ IPC Cycle
│if (flag)
   1.37 │0.4┌──   1  ↓ je 82
│   │x += x / y + y / x;


Applied and added this to the cset log, please check:

Committer notes:

Please note that only from LBRv5 (according to Jiri) onwards, i.e. >=
Skylake is that we'll have the cycles counts in each branch record
entry, so to see the Cycles and IPC columns, and be able to test this
patch, one need a capable hardware.

While applying this I first tested it on a Broadwell class machine and
couldn't get those columns, will add code to the annotate browser to
warn the user about that, i.e. you have branch records, but no cycles,
use a more recent hardware to get the cycles and IPC columns.

- Arnaldo



Re: [PATCH] perf report: Fix wrong jump arrow

2018-02-12 Thread Jin, Yao

Hi Arnaldo,

Thanks for applying the patch. Yes the issue only happens on skl+.

The committer notes are very good and clear.

Thanks
Jin Yao

On 2/9/2018 11:15 PM, Arnaldo Carvalho de Melo wrote:

Em Mon, Jan 29, 2018 at 06:57:53PM +0800, Jin Yao escreveu:

When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,

1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'

Percent│ IPC Cycle
│if (flag)
   1.37 │0.4┌──   1  ↓ je 82
│   │x += x / y + y / x;


Applied and added this to the cset log, please check:

Committer notes:

Please note that only from LBRv5 (according to Jiri) onwards, i.e. >=
Skylake is that we'll have the cycles counts in each branch record
entry, so to see the Cycles and IPC columns, and be able to test this
patch, one need a capable hardware.

While applying this I first tested it on a Broadwell class machine and
couldn't get those columns, will add code to the annotate browser to
warn the user about that, i.e. you have branch records, but no cycles,
use a more recent hardware to get the cycles and IPC columns.

- Arnaldo



[PATCH] perf report: Fix wrong jump arrow

2018-01-28 Thread Jin Yao
When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,

1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'

Percent│ IPC Cycle
   │if (flag)
  1.37 │0.4┌──   1  ↓ je 82
   │   │x += x / y + y / x;
  0.00 │0.4│  1310movsd  (%rsp),%xmm0
  0.00 │0.4│   565movsd  0x8(%rsp),%xmm4
   │0.4│  movsd  0x8(%rsp),%xmm1
   │0.4│  movsd  (%rsp),%xmm3
   │0.4│  divsd  %xmm4,%xmm0
  0.00 │0.4│   579divsd  %xmm3,%xmm1
   │0.4│  movsd  (%rsp),%xmm2
   │0.4│  addsd  %xmm1,%xmm0
   │0.4│  addsd  %xmm2,%xmm0
  0.00 │0.4│  movsd  %xmm0,(%rsp)
   │   │volatile double x = 1212121212, y = 121212;
   │   │
   │   │s_randseed = time(0);
   │   │srand(s_randseed);
   │   │
   │   │for (i = 0; i < 20; i++) {
  1.37 │0.4└─→  82:   sub$0x1,%ebx
 28.21 │0.4817  ↑ jne38

The jump arrow in above example is not correct. It should add the
width of IPC and Cycle.

With this patch, the result is:

Percent│ IPC Cycle
   │if (flag)
  1.37 │0.48 1 ┌──je 82
   │   │x += x / y + y / x;
  0.00 │0.48  1310 │  movsd  (%rsp),%xmm0
  0.00 │0.48   565 │  movsd  0x8(%rsp),%xmm4
   │0.48   │  movsd  0x8(%rsp),%xmm1
   │0.48   │  movsd  (%rsp),%xmm3
   │0.48   │  divsd  %xmm4,%xmm0
  0.00 │0.48   579 │  divsd  %xmm3,%xmm1
   │0.48   │  movsd  (%rsp),%xmm2
   │0.48   │  addsd  %xmm1,%xmm0
   │0.48   │  addsd  %xmm2,%xmm0
  0.00 │0.48   │  movsd  %xmm0,(%rsp)
   │   │volatile double x = 1212121212, y = 121212;
   │   │
   │   │s_randseed = time(0);
   │   │srand(s_randseed);
   │   │
   │   │for (i = 0; i < 20; i++) {
  1.37 │0.4882:└─→sub$0x1,%ebx
 28.21 │0.4817  ↑ jne38

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/ui/browsers/annotate.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c 
b/tools/perf/ui/browsers/annotate.c
index 2864279..e2f6663 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -319,6 +319,7 @@ static void annotate_browser__draw_current_jump(struct 
ui_browser *browser)
struct map_symbol *ms = ab->b.priv;
struct symbol *sym = ms->sym;
u8 pcnt_width = annotate_browser__pcnt_width(ab);
+   int width = 0;
 
/* PLT symbols contain external offsets */
if (strstr(sym->name, "@plt"))
@@ -340,13 +341,17 @@ static void annotate_browser__draw_current_jump(struct 
ui_browser *browser)
to = (u64)btarget->idx;
}
 
+   if (ab->have_cycles)
+   width = IPC_WIDTH + CYCLES_WIDTH;
+
ui_browser__set_color(browser, HE_COLORSET_JUMP_ARROWS);
-   __ui_browser__line_arrow(browser, pcnt_width + 2 + ab->addr_width,
+   __ui_browser__line_arrow(browser,
+pcnt_width + 2 + ab->addr_width + width,
 from, to);
 
if (is_fused(ab, cursor)) {
ui_browser__mark_fused(browser,
-  pcnt_width + 3 + ab->addr_width,
+  pcnt_width + 3 + ab->addr_width + width,
   from - 1,
   to > from ? true : false);
}
-- 
2.7.4



[PATCH] perf report: Fix wrong jump arrow

2018-01-28 Thread Jin Yao
When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,

1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'

Percent│ IPC Cycle
   │if (flag)
  1.37 │0.4┌──   1  ↓ je 82
   │   │x += x / y + y / x;
  0.00 │0.4│  1310movsd  (%rsp),%xmm0
  0.00 │0.4│   565movsd  0x8(%rsp),%xmm4
   │0.4│  movsd  0x8(%rsp),%xmm1
   │0.4│  movsd  (%rsp),%xmm3
   │0.4│  divsd  %xmm4,%xmm0
  0.00 │0.4│   579divsd  %xmm3,%xmm1
   │0.4│  movsd  (%rsp),%xmm2
   │0.4│  addsd  %xmm1,%xmm0
   │0.4│  addsd  %xmm2,%xmm0
  0.00 │0.4│  movsd  %xmm0,(%rsp)
   │   │volatile double x = 1212121212, y = 121212;
   │   │
   │   │s_randseed = time(0);
   │   │srand(s_randseed);
   │   │
   │   │for (i = 0; i < 20; i++) {
  1.37 │0.4└─→  82:   sub$0x1,%ebx
 28.21 │0.4817  ↑ jne38

The jump arrow in above example is not correct. It should add the
width of IPC and Cycle.

With this patch, the result is:

Percent│ IPC Cycle
   │if (flag)
  1.37 │0.48 1 ┌──je 82
   │   │x += x / y + y / x;
  0.00 │0.48  1310 │  movsd  (%rsp),%xmm0
  0.00 │0.48   565 │  movsd  0x8(%rsp),%xmm4
   │0.48   │  movsd  0x8(%rsp),%xmm1
   │0.48   │  movsd  (%rsp),%xmm3
   │0.48   │  divsd  %xmm4,%xmm0
  0.00 │0.48   579 │  divsd  %xmm3,%xmm1
   │0.48   │  movsd  (%rsp),%xmm2
   │0.48   │  addsd  %xmm1,%xmm0
   │0.48   │  addsd  %xmm2,%xmm0
  0.00 │0.48   │  movsd  %xmm0,(%rsp)
   │   │volatile double x = 1212121212, y = 121212;
   │   │
   │   │s_randseed = time(0);
   │   │srand(s_randseed);
   │   │
   │   │for (i = 0; i < 20; i++) {
  1.37 │0.4882:└─→sub$0x1,%ebx
 28.21 │0.4817  ↑ jne38

Signed-off-by: Jin Yao 
---
 tools/perf/ui/browsers/annotate.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c 
b/tools/perf/ui/browsers/annotate.c
index 2864279..e2f6663 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -319,6 +319,7 @@ static void annotate_browser__draw_current_jump(struct 
ui_browser *browser)
struct map_symbol *ms = ab->b.priv;
struct symbol *sym = ms->sym;
u8 pcnt_width = annotate_browser__pcnt_width(ab);
+   int width = 0;
 
/* PLT symbols contain external offsets */
if (strstr(sym->name, "@plt"))
@@ -340,13 +341,17 @@ static void annotate_browser__draw_current_jump(struct 
ui_browser *browser)
to = (u64)btarget->idx;
}
 
+   if (ab->have_cycles)
+   width = IPC_WIDTH + CYCLES_WIDTH;
+
ui_browser__set_color(browser, HE_COLORSET_JUMP_ARROWS);
-   __ui_browser__line_arrow(browser, pcnt_width + 2 + ab->addr_width,
+   __ui_browser__line_arrow(browser,
+pcnt_width + 2 + ab->addr_width + width,
 from, to);
 
if (is_fused(ab, cursor)) {
ui_browser__mark_fused(browser,
-  pcnt_width + 3 + ab->addr_width,
+  pcnt_width + 3 + ab->addr_width + width,
   from - 1,
   to > from ? true : false);
}
-- 
2.7.4



[PATCH v2] perf util: Use target->per_thread and target->system_wide flags

2018-01-23 Thread Jin Yao
Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on
'perf record --per-thread'. It has the effect of creating a kernel event
for each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

v2:
---
According to Jiri's comment, add description to explain why the patch
uses 'target->per_thread && target->system_wide'.

v2 doesn't have functional change.

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/util/evlist.c | 20 +++-
 tools/perf/util/thread_map.c |  4 ++--
 tools/perf/util/thread_map.h |  2 +-
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ac35cd2..3eea0f8 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,8 +1106,26 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
struct cpu_map *cpus;
struct thread_map *threads;
 
+   /*
+* If specify '-a' and '--per-thread' to perf record, perf record
+* will override '--per-thread'. target->per_thread = false and
+* target->system_wide = true.
+*
+* If specify '--per-thread' only to perf record,
+* target->per_thread = true and target->system_wide = false.
+*
+* So target->per_thread && target->system_wide is false.
+* For perf record, thread_map__new_str doesn't call
+* thread_map__new_all_cpus. That will keep perf record's
+* current behavior.
+*
+* For perf stat, it allows the case that target->per_thread and
+* target->system_wide are all true. It means to collect system-wide
+* per-thread data. thread_map__new_str will call
+* thread_map__new_all_cpus to enumerate all threads.
+*/
threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);
 
if (!threads)
return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..729dad8 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char 
*tid_str)
 }
 
 struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
 {
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
 
-   if (per_thread)
+   if (all_threads)
return thread_map__new_all_cpus();
 
return thread_map__new_by_tid_str(tid);
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b9..5ec91cf 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
 void thread_map__put(struct thread_map *map);
 
 struct thread_map *thread_map__new_str(const char *pid,
-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);
 
 struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
 
-- 
2.7.4



[PATCH v2] perf util: Use target->per_thread and target->system_wide flags

2018-01-23 Thread Jin Yao
Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on
'perf record --per-thread'. It has the effect of creating a kernel event
for each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

v2:
---
According to Jiri's comment, add description to explain why the patch
uses 'target->per_thread && target->system_wide'.

v2 doesn't have functional change.

Signed-off-by: Jin Yao 
---
 tools/perf/util/evlist.c | 20 +++-
 tools/perf/util/thread_map.c |  4 ++--
 tools/perf/util/thread_map.h |  2 +-
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ac35cd2..3eea0f8 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,8 +1106,26 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
struct cpu_map *cpus;
struct thread_map *threads;
 
+   /*
+* If specify '-a' and '--per-thread' to perf record, perf record
+* will override '--per-thread'. target->per_thread = false and
+* target->system_wide = true.
+*
+* If specify '--per-thread' only to perf record,
+* target->per_thread = true and target->system_wide = false.
+*
+* So target->per_thread && target->system_wide is false.
+* For perf record, thread_map__new_str doesn't call
+* thread_map__new_all_cpus. That will keep perf record's
+* current behavior.
+*
+* For perf stat, it allows the case that target->per_thread and
+* target->system_wide are all true. It means to collect system-wide
+* per-thread data. thread_map__new_str will call
+* thread_map__new_all_cpus to enumerate all threads.
+*/
threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);
 
if (!threads)
return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..729dad8 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char 
*tid_str)
 }
 
 struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
 {
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
 
-   if (per_thread)
+   if (all_threads)
return thread_map__new_all_cpus();
 
return thread_map__new_by_tid_str(tid);
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b9..5ec91cf 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
 void thread_map__put(struct thread_map *map);
 
 struct thread_map *thread_map__new_str(const char *pid,
-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);
 
 struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
 
-- 
2.7.4



Re: [PATCH] perf util: Use target->per_thread and target->system_wide flags

2018-01-23 Thread Jin, Yao



On 1/23/2018 10:40 PM, Jiri Olsa wrote:

On Tue, Jan 23, 2018 at 07:02:44AM +0800, Jin, Yao wrote:

SNIP


  threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);


At first glance I thought your solution would do the trick but perf
record does use target->system_wide when the '-a' switch is used.
Moreover specifying the '-a' switch doesn't prevent the '--per-thread'
option from being used as well, making both target->perf_thread and
target_system_wide equal to true (and that is not good).

Although not a fan of adding more to struct target, the advantage of
having target->all_threads is that we are guaranteed that it isn't
used anywhere else.

Let me know what you think,
Mathieu



If we specify both '-a' and '--per-thread' to perf record, perf record will
override'--per-thread'. So now target->per_thread = false, and
target->system_wide = true.

If we specify '--per-thread' only to perf record, target->per_thread = true,
and target->system_wide = false.

So whatever for any case, target->per_thread && target->system_wide is
false.

Since the parameter is false, in thread_map__new_str(), it will not execute
the thread_map__new_all_cpus(). So that will not change perf record previous
behavior.

In perf stat, it allows the case that target->per_thread and
target->system_wide are all true. That means we want to collect system-wide
per-thread metrics.


could you please put this description into comment
before the thread_map__new_str is called?

thanks,
jirka



OK, I will put this comment in v2. I will send v2 soon.

Thanks
Jin Yao


Re: [PATCH] perf util: Use target->per_thread and target->system_wide flags

2018-01-23 Thread Jin, Yao



On 1/23/2018 10:40 PM, Jiri Olsa wrote:

On Tue, Jan 23, 2018 at 07:02:44AM +0800, Jin, Yao wrote:

SNIP


  threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);


At first glance I thought your solution would do the trick but perf
record does use target->system_wide when the '-a' switch is used.
Moreover specifying the '-a' switch doesn't prevent the '--per-thread'
option from being used as well, making both target->perf_thread and
target_system_wide equal to true (and that is not good).

Although not a fan of adding more to struct target, the advantage of
having target->all_threads is that we are guaranteed that it isn't
used anywhere else.

Let me know what you think,
Mathieu



If we specify both '-a' and '--per-thread' to perf record, perf record will
override'--per-thread'. So now target->per_thread = false, and
target->system_wide = true.

If we specify '--per-thread' only to perf record, target->per_thread = true,
and target->system_wide = false.

So whatever for any case, target->per_thread && target->system_wide is
false.

Since the parameter is false, in thread_map__new_str(), it will not execute
the thread_map__new_all_cpus(). So that will not change perf record previous
behavior.

In perf stat, it allows the case that target->per_thread and
target->system_wide are all true. That means we want to collect system-wide
per-thread metrics.


could you please put this description into comment
before the thread_map__new_str is called?

thanks,
jirka



OK, I will put this comment in v2. I will send v2 soon.

Thanks
Jin Yao


Re: [PATCH] perf util: Use target->per_thread and target->system_wide flags

2018-01-22 Thread Jin, Yao



On 1/23/2018 7:56 AM, Mathieu Poirier wrote:

On 22 January 2018 at 15:15, Jin Yao <yao@linux.intel.com> wrote:

Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on
'perf record --per-thread'. It has the effect of creating a kernel event
for each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
  tools/perf/util/evlist.c | 2 +-
  tools/perf/util/thread_map.c | 4 ++--
  tools/perf/util/thread_map.h | 2 +-
  3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 120efd8..9dff74a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,7 +1106,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
 struct thread_map *threads;

 threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);

 if (!threads)
 return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..729dad8 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char 
*tid_str)
  }

  struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
  {
 if (pid)
 return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
 if (!tid && uid != UINT_MAX)
 return thread_map__new_by_uid(uid);

-   if (per_thread)
+   if (all_threads)
 return thread_map__new_all_cpus();

 return thread_map__new_by_tid_str(tid);
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b9..5ec91cf 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
  void thread_map__put(struct thread_map *map);

  struct thread_map *thread_map__new_str(const char *pid,
-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);

  struct thread_map *thread_map__new_by_tid_str(const char *tid_str);


Reviewed-and-tested-by: Mathieu Poirier "mathieu.poir...@linaro.org"



Thanks a lot for reviewing the patch.

Thanks
Jin Yao




Re: [PATCH] perf util: Use target->per_thread and target->system_wide flags

2018-01-22 Thread Jin, Yao



On 1/23/2018 7:56 AM, Mathieu Poirier wrote:

On 22 January 2018 at 15:15, Jin Yao  wrote:

Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on
'perf record --per-thread'. It has the effect of creating a kernel event
for each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

Signed-off-by: Jin Yao 
---
  tools/perf/util/evlist.c | 2 +-
  tools/perf/util/thread_map.c | 4 ++--
  tools/perf/util/thread_map.h | 2 +-
  3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 120efd8..9dff74a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,7 +1106,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
 struct thread_map *threads;

 threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);

 if (!threads)
 return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..729dad8 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char 
*tid_str)
  }

  struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
  {
 if (pid)
 return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
 if (!tid && uid != UINT_MAX)
 return thread_map__new_by_uid(uid);

-   if (per_thread)
+   if (all_threads)
 return thread_map__new_all_cpus();

 return thread_map__new_by_tid_str(tid);
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b9..5ec91cf 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
  void thread_map__put(struct thread_map *map);

  struct thread_map *thread_map__new_str(const char *pid,
-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);

  struct thread_map *thread_map__new_by_tid_str(const char *tid_str);


Reviewed-and-tested-by: Mathieu Poirier "mathieu.poir...@linaro.org"



Thanks a lot for reviewing the patch.

Thanks
Jin Yao




Re: [PATCH] perf util: Use target->per_thread and target->system_wide flags

2018-01-22 Thread Jin, Yao



On 1/23/2018 5:10 AM, Mathieu Poirier wrote:

On 22 January 2018 at 15:15, Jin Yao <yao@linux.intel.com> wrote:

Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on
'perf record --per-thread'. It has the effect of creating a kernel event
for each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
  tools/perf/util/evlist.c | 2 +-
  tools/perf/util/thread_map.c | 4 ++--
  tools/perf/util/thread_map.h | 2 +-
  3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 120efd8..9dff74a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,7 +1106,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
 struct thread_map *threads;

 threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);


At first glance I thought your solution would do the trick but perf
record does use target->system_wide when the '-a' switch is used.
Moreover specifying the '-a' switch doesn't prevent the '--per-thread'
option from being used as well, making both target->perf_thread and
target_system_wide equal to true (and that is not good).

Although not a fan of adding more to struct target, the advantage of
having target->all_threads is that we are guaranteed that it isn't
used anywhere else.

Let me know what you think,
Mathieu



If we specify both '-a' and '--per-thread' to perf record, perf record 
will override'--per-thread'. So now target->per_thread = false, and 
target->system_wide = true.


If we specify '--per-thread' only to perf record, target->per_thread = 
true, and target->system_wide = false.


So whatever for any case, target->per_thread && target->system_wide is 
false.


Since the parameter is false, in thread_map__new_str(), it will not 
execute the thread_map__new_all_cpus(). So that will not change perf 
record previous behavior.


In perf stat, it allows the case that target->per_thread and 
target->system_wide are all true. That means we want to collect 
system-wide per-thread metrics.


That's my current thinking.

Thanks
Jin Yao


Re: [PATCH] perf util: Use target->per_thread and target->system_wide flags

2018-01-22 Thread Jin, Yao



On 1/23/2018 5:10 AM, Mathieu Poirier wrote:

On 22 January 2018 at 15:15, Jin Yao  wrote:

Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on
'perf record --per-thread'. It has the effect of creating a kernel event
for each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

Signed-off-by: Jin Yao 
---
  tools/perf/util/evlist.c | 2 +-
  tools/perf/util/thread_map.c | 4 ++--
  tools/perf/util/thread_map.h | 2 +-
  3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 120efd8..9dff74a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,7 +1106,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
 struct thread_map *threads;

 threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);


At first glance I thought your solution would do the trick but perf
record does use target->system_wide when the '-a' switch is used.
Moreover specifying the '-a' switch doesn't prevent the '--per-thread'
option from being used as well, making both target->perf_thread and
target_system_wide equal to true (and that is not good).

Although not a fan of adding more to struct target, the advantage of
having target->all_threads is that we are guaranteed that it isn't
used anywhere else.

Let me know what you think,
Mathieu



If we specify both '-a' and '--per-thread' to perf record, perf record 
will override'--per-thread'. So now target->per_thread = false, and 
target->system_wide = true.


If we specify '--per-thread' only to perf record, target->per_thread = 
true, and target->system_wide = false.


So whatever for any case, target->per_thread && target->system_wide is 
false.


Since the parameter is false, in thread_map__new_str(), it will not 
execute the thread_map__new_all_cpus(). So that will not change perf 
record previous behavior.


In perf stat, it allows the case that target->per_thread and 
target->system_wide are all true. That means we want to collect 
system-wide per-thread metrics.


That's my current thinking.

Thanks
Jin Yao


[PATCH] perf util: Use target->per_thread and target->system_wide flags

2018-01-22 Thread Jin Yao
Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on
'perf record --per-thread'. It has the effect of creating a kernel event
for each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/util/evlist.c | 2 +-
 tools/perf/util/thread_map.c | 4 ++--
 tools/perf/util/thread_map.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 120efd8..9dff74a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,7 +1106,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
struct thread_map *threads;
 
threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);
 
if (!threads)
return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..729dad8 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char 
*tid_str)
 }
 
 struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
 {
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
 
-   if (per_thread)
+   if (all_threads)
return thread_map__new_all_cpus();
 
return thread_map__new_by_tid_str(tid);
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b9..5ec91cf 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
 void thread_map__put(struct thread_map *map);
 
 struct thread_map *thread_map__new_str(const char *pid,
-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);
 
 struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
 
-- 
2.7.4



[PATCH] perf util: Use target->per_thread and target->system_wide flags

2018-01-22 Thread Jin Yao
Mathieu Poirier reports issue in commit ("73c0ca1eee3d perf thread_map:
Enumerate all threads from /proc") that it has negative impact on
'perf record --per-thread'. It has the effect of creating a kernel event
for each thread in the system for 'perf record --per-thread'.

Mathieu Poirier's patch ("perf util: Do not reuse target->per_thread flag")
can fix this issue by creating a new target->all_threads flag.

This patch is based on Mathieu Poirier's patch but it doesn't use a new
target->all_threads flag. This patch just uses 'target->per_thread &&
target->system_wide' as a condition to check for all threads case.

Signed-off-by: Jin Yao 
---
 tools/perf/util/evlist.c | 2 +-
 tools/perf/util/thread_map.c | 4 ++--
 tools/perf/util/thread_map.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 120efd8..9dff74a 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,7 +1106,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
struct thread_map *threads;
 
threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
+ target->per_thread && 
target->system_wide);
 
if (!threads)
return -1;
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..729dad8 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char 
*tid_str)
 }
 
 struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
 {
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
 
-   if (per_thread)
+   if (all_threads)
return thread_map__new_all_cpus();
 
return thread_map__new_by_tid_str(tid);
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b9..5ec91cf 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
 void thread_map__put(struct thread_map *map);
 
 struct thread_map *thread_map__new_str(const char *pid,
-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);
 
 struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
 
-- 
2.7.4



Re: [PATCH 1/1] perf util: Do not reuse target->per_thread flag

2018-01-22 Thread Jin, Yao

Hi Mathieu,

Sorry, I just see this patch today.

I have tested this patch and it works.

Another idea from me is it doesn't need to add a new target->all_threads 
flag.


We just use target->per-thread && target->system_wide as a condition to 
check for all per-thread case.


I just think for your perf record case, the target->system_wide will not 
be set. Instead, if target->per-thread and target->system_wide are both 
set, that means we needs to trace on all threads, right?


Thanks
Jin Yao

On 1/13/2018 7:12 AM, Mathieu Poirier wrote:

Commit ("73c0ca1eee3d perf thread_map: Enumerate all threads from /proc")
is using the target->per_thread flag to specify that all threads in a
system should be taken into account.  That is then used in function
thread_map__new_str() where all threads are added to evlist->threads.

Variable target->per_thread is also used by 'perf record' when handling
trace sessions using the --per-thread command line option.  Since the
target->per_thread flag is set all threads will be added to
evlist->threads, which has the effect of creating a kernel event for each
thread in the system.

This patch address the issue by creating a new target->all_threads flag
that gets set from the 'stat' utility, avoiding any conflict with other
utilities using target->per_thread.

Signed-off-by: Mathieu Poirier <mathieu.poir...@linaro.org>
---
  tools/perf/builtin-stat.c| 2 +-
  tools/perf/util/evlist.c | 2 +-
  tools/perf/util/target.h | 1 +
  tools/perf/util/thread_map.c | 4 ++--
  tools/perf/util/thread_map.h | 2 +-
  5 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 98bf9d32f222..87e156bfb45e 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2831,7 +2831,7 @@ int cmd_stat(int argc, const char **argv)
target__validate();
  
  	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))

-   target.per_thread = true;
+   target.per_thread = target.all_threads = true;
  
  	if (perf_evlist__create_maps(evsel_list, ) < 0) {

if (target__has_task()) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f0a5e09c4071..c239eb895612 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,7 +1106,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
struct thread_map *threads;
  
  	threads = thread_map__new_str(target->pid, target->tid, target->uid,

- target->per_thread);
+ target->all_threads);
  
  	if (!threads)

return -1;
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 6ef01a83b24e..0da8ea2b6801 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -15,6 +15,7 @@ struct target {
bool uses_mmap;
bool default_per_cpu;
bool per_thread;
+   bool all_threads;
  };
  
  enum target_errno {

diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f6491c..729dad8f412d 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char 
*tid_str)
  }
  
  struct thread_map *thread_map__new_str(const char *pid, const char *tid,

-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
  {
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
  
-	if (per_thread)

+   if (all_threads)
return thread_map__new_all_cpus();
  
  	return thread_map__new_by_tid_str(tid);

diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b99e73c..5ec91cfd1869 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
  void thread_map__put(struct thread_map *map);
  
  struct thread_map *thread_map__new_str(const char *pid,

-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);
  
  struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
  



Re: [PATCH 1/1] perf util: Do not reuse target->per_thread flag

2018-01-22 Thread Jin, Yao

Hi Mathieu,

Sorry, I just see this patch today.

I have tested this patch and it works.

Another idea from me is it doesn't need to add a new target->all_threads 
flag.


We just use target->per-thread && target->system_wide as a condition to 
check for all per-thread case.


I just think for your perf record case, the target->system_wide will not 
be set. Instead, if target->per-thread and target->system_wide are both 
set, that means we needs to trace on all threads, right?


Thanks
Jin Yao

On 1/13/2018 7:12 AM, Mathieu Poirier wrote:

Commit ("73c0ca1eee3d perf thread_map: Enumerate all threads from /proc")
is using the target->per_thread flag to specify that all threads in a
system should be taken into account.  That is then used in function
thread_map__new_str() where all threads are added to evlist->threads.

Variable target->per_thread is also used by 'perf record' when handling
trace sessions using the --per-thread command line option.  Since the
target->per_thread flag is set all threads will be added to
evlist->threads, which has the effect of creating a kernel event for each
thread in the system.

This patch address the issue by creating a new target->all_threads flag
that gets set from the 'stat' utility, avoiding any conflict with other
utilities using target->per_thread.

Signed-off-by: Mathieu Poirier 
---
  tools/perf/builtin-stat.c| 2 +-
  tools/perf/util/evlist.c | 2 +-
  tools/perf/util/target.h | 1 +
  tools/perf/util/thread_map.c | 4 ++--
  tools/perf/util/thread_map.h | 2 +-
  5 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 98bf9d32f222..87e156bfb45e 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2831,7 +2831,7 @@ int cmd_stat(int argc, const char **argv)
target__validate();
  
  	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))

-   target.per_thread = true;
+   target.per_thread = target.all_threads = true;
  
  	if (perf_evlist__create_maps(evsel_list, ) < 0) {

if (target__has_task()) {
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f0a5e09c4071..c239eb895612 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1106,7 +1106,7 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
struct thread_map *threads;
  
  	threads = thread_map__new_str(target->pid, target->tid, target->uid,

- target->per_thread);
+ target->all_threads);
  
  	if (!threads)

return -1;
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 6ef01a83b24e..0da8ea2b6801 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -15,6 +15,7 @@ struct target {
bool uses_mmap;
bool default_per_cpu;
bool per_thread;
+   bool all_threads;
  };
  
  enum target_errno {

diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f6491c..729dad8f412d 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -323,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char 
*tid_str)
  }
  
  struct thread_map *thread_map__new_str(const char *pid, const char *tid,

-  uid_t uid, bool per_thread)
+  uid_t uid, bool all_threads)
  {
if (pid)
return thread_map__new_by_pid_str(pid);
@@ -331,7 +331,7 @@ struct thread_map *thread_map__new_str(const char *pid, 
const char *tid,
if (!tid && uid != UINT_MAX)
return thread_map__new_by_uid(uid);
  
-	if (per_thread)

+   if (all_threads)
return thread_map__new_all_cpus();
  
  	return thread_map__new_by_tid_str(tid);

diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index 0a806b99e73c..5ec91cfd1869 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -31,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
  void thread_map__put(struct thread_map *map);
  
  struct thread_map *thread_map__new_str(const char *pid,

-   const char *tid, uid_t uid, bool per_thread);
+   const char *tid, uid_t uid, bool all_threads);
  
  struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
  



Re: [PATCH] perf util: Fix evlist->threads when working with 'perf stat --per-thread'

2018-01-22 Thread Jin, Yao



On 1/13/2018 3:53 AM, Arnaldo Carvalho de Melo wrote:

Em Fri, Jan 12, 2018 at 09:44:53AM -0700, Mathieu Poirier escreveu:

After commit ("73c0ca1eee3d perf thread_map: Enumerate all threads
from /proc") any condition where target->per_thread is set will see all the
threads in a system added to evlist->threads.

Since the 'record' utility also uses function thread_map__new_str(), any
attempt to use the --perf-thread option will also end up accounting for all


--perf-thread? Do you mean --per-thread, ok, so 'record' has this, uses
the opts->per_thread that then Jin reused for his recent patches... Jin,
can you try to untangle this, probably you can't reuse that
opts->per-thread thing...

Without looking further, I think that could be a better avenue to fix
this issue, without having to add a 'stat' specific method to the core
classes.

- Arnaldo



Hi,

Very sorry, I just see this mail today.

Please let me look at this issue first.

Thanks
Jin Yao


threads in a system, resulting in new kernel events being created for all
threads rather than just the thread of interest.

This patch keeps the newly introduced functionality but make sure it
doesn't affect other utilities outside of 'stat'.

Signed-off-by: Mathieu Poirier <mathieu.poir...@linaro.org>
---
  tools/perf/builtin-stat.c |  2 +-
  tools/perf/util/evlist.c  | 29 -
  tools/perf/util/evlist.h  |  2 ++
  3 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 98bf9d32f222..82d16426b984 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2833,7 +2833,7 @@ int cmd_stat(int argc, const char **argv)
if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
target.per_thread = true;
  
-	if (perf_evlist__create_maps(evsel_list, ) < 0) {

+   if (perf_evlist__create_stat_maps(evsel_list, ) < 0) {
if (target__has_task()) {
pr_err("Problems finding threads of monitor\n");
parse_options_usage(stat_usage, stat_options, "p", 1);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f0a5e09c4071..a4ae684f28de 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1100,13 +1100,11 @@ int perf_evlist__mmap(struct perf_evlist *evlist, 
unsigned int pages)
return perf_evlist__mmap_ex(evlist, pages, 0, false);
  }
  
-int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)

+static int _perf_evlist__create_maps(struct perf_evlist *evlist,
+struct target *target,
+struct thread_map *threads)
  {
struct cpu_map *cpus;
-   struct thread_map *threads;
-
-   threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
  
  	if (!threads)

return -1;
@@ -1130,6 +1128,27 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
return -1;
  }
  
+int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)

+{
+   struct thread_map *threads;
+
+   threads = thread_map__new_str(target->pid, target->tid, target->uid,
+ false);
+
+   return _perf_evlist__create_maps(evlist, target, threads);
+}
+
+int perf_evlist__create_stat_maps(struct perf_evlist *evlist,
+ struct target *target)
+{
+   struct thread_map *threads;
+
+   threads = thread_map__new_str(target->pid, target->tid, target->uid,
+ target->per_thread);
+
+   return _perf_evlist__create_maps(evlist, target, threads);
+}
+
  void perf_evlist__set_maps(struct perf_evlist *evlist, struct cpu_map *cpus,
   struct thread_map *threads)
  {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 75160666d305..80c654990e19 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -188,6 +188,8 @@ void perf_evlist__set_selected(struct perf_evlist *evlist,
  void perf_evlist__set_maps(struct perf_evlist *evlist, struct cpu_map *cpus,
   struct thread_map *threads);
  int perf_evlist__create_maps(struct perf_evlist *evlist, struct target 
*target);
+int perf_evlist__create_stat_maps(struct perf_evlist *evlist,
+ struct target *target);
  int perf_evlist__apply_filters(struct perf_evlist *evlist, struct perf_evsel 
**err_evsel);
  
  void __perf_evlist__set_leader(struct list_head *list);

--
2.7.4


Re: [PATCH] perf util: Fix evlist->threads when working with 'perf stat --per-thread'

2018-01-22 Thread Jin, Yao



On 1/13/2018 3:53 AM, Arnaldo Carvalho de Melo wrote:

Em Fri, Jan 12, 2018 at 09:44:53AM -0700, Mathieu Poirier escreveu:

After commit ("73c0ca1eee3d perf thread_map: Enumerate all threads
from /proc") any condition where target->per_thread is set will see all the
threads in a system added to evlist->threads.

Since the 'record' utility also uses function thread_map__new_str(), any
attempt to use the --perf-thread option will also end up accounting for all


--perf-thread? Do you mean --per-thread, ok, so 'record' has this, uses
the opts->per_thread that then Jin reused for his recent patches... Jin,
can you try to untangle this, probably you can't reuse that
opts->per-thread thing...

Without looking further, I think that could be a better avenue to fix
this issue, without having to add a 'stat' specific method to the core
classes.

- Arnaldo



Hi,

Very sorry, I just see this mail today.

Please let me look at this issue first.

Thanks
Jin Yao


threads in a system, resulting in new kernel events being created for all
threads rather than just the thread of interest.

This patch keeps the newly introduced functionality but make sure it
doesn't affect other utilities outside of 'stat'.

Signed-off-by: Mathieu Poirier 
---
  tools/perf/builtin-stat.c |  2 +-
  tools/perf/util/evlist.c  | 29 -
  tools/perf/util/evlist.h  |  2 ++
  3 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 98bf9d32f222..82d16426b984 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2833,7 +2833,7 @@ int cmd_stat(int argc, const char **argv)
if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
target.per_thread = true;
  
-	if (perf_evlist__create_maps(evsel_list, ) < 0) {

+   if (perf_evlist__create_stat_maps(evsel_list, ) < 0) {
if (target__has_task()) {
pr_err("Problems finding threads of monitor\n");
parse_options_usage(stat_usage, stat_options, "p", 1);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f0a5e09c4071..a4ae684f28de 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1100,13 +1100,11 @@ int perf_evlist__mmap(struct perf_evlist *evlist, 
unsigned int pages)
return perf_evlist__mmap_ex(evlist, pages, 0, false);
  }
  
-int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)

+static int _perf_evlist__create_maps(struct perf_evlist *evlist,
+struct target *target,
+struct thread_map *threads)
  {
struct cpu_map *cpus;
-   struct thread_map *threads;
-
-   threads = thread_map__new_str(target->pid, target->tid, target->uid,
- target->per_thread);
  
  	if (!threads)

return -1;
@@ -1130,6 +1128,27 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, 
struct target *target)
return -1;
  }
  
+int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)

+{
+   struct thread_map *threads;
+
+   threads = thread_map__new_str(target->pid, target->tid, target->uid,
+ false);
+
+   return _perf_evlist__create_maps(evlist, target, threads);
+}
+
+int perf_evlist__create_stat_maps(struct perf_evlist *evlist,
+ struct target *target)
+{
+   struct thread_map *threads;
+
+   threads = thread_map__new_str(target->pid, target->tid, target->uid,
+ target->per_thread);
+
+   return _perf_evlist__create_maps(evlist, target, threads);
+}
+
  void perf_evlist__set_maps(struct perf_evlist *evlist, struct cpu_map *cpus,
   struct thread_map *threads)
  {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 75160666d305..80c654990e19 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -188,6 +188,8 @@ void perf_evlist__set_selected(struct perf_evlist *evlist,
  void perf_evlist__set_maps(struct perf_evlist *evlist, struct cpu_map *cpus,
   struct thread_map *threads);
  int perf_evlist__create_maps(struct perf_evlist *evlist, struct target 
*target);
+int perf_evlist__create_stat_maps(struct perf_evlist *evlist,
+ struct target *target);
  int perf_evlist__apply_filters(struct perf_evlist *evlist, struct perf_evsel 
**err_evsel);
  
  void __perf_evlist__set_leader(struct list_head *list);

--
2.7.4


Re: [PATCH] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-01-21 Thread Jin, Yao



On 1/16/2018 9:17 PM, Jiri Olsa wrote:

On Tue, Jan 16, 2018 at 09:06:09PM +0800, Jin, Yao wrote:

Just tested. But looks it's not OK for '--per-thread' case.


yea, I haven't tested much.. might need soem tweaking,
but my point was that it could be doable on one place
instead of introducing another if possible

jirka



Hi Jiri,

I ever considered to move the operation of removing error thread to 
perf_evsel__fallback(). The perf_evsel__fallback() is common code and 
it's shared by perf report, perf stat and perf top.


While finally I think it'd better let the caller decide to remove error 
thread and try again, or just return the warning message. 
perf_evsel__fallback() probably doesn't know what the caller want to do.


That's my current thinking. Maybe there will be a better fix...

Thanks
Jin Yao


Re: [PATCH] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-01-21 Thread Jin, Yao



On 1/16/2018 9:17 PM, Jiri Olsa wrote:

On Tue, Jan 16, 2018 at 09:06:09PM +0800, Jin, Yao wrote:

Just tested. But looks it's not OK for '--per-thread' case.


yea, I haven't tested much.. might need soem tweaking,
but my point was that it could be doable on one place
instead of introducing another if possible

jirka



Hi Jiri,

I ever considered to move the operation of removing error thread to 
perf_evsel__fallback(). The perf_evsel__fallback() is common code and 
it's shared by perf report, perf stat and perf top.


While finally I think it'd better let the caller decide to remove error 
thread and try again, or just return the warning message. 
perf_evsel__fallback() probably doesn't know what the caller want to do.


That's my current thinking. Maybe there will be a better fix...

Thanks
Jin Yao


[tip:perf/core] perf script: Remove the time slices number limitation

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  cc2ef584a863b7c8033b78723cd253ca47e9a589
Gitweb: https://git.kernel.org/tip/cc2ef584a863b7c8033b78723cd253ca47e9a589
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Wed, 10 Jan 2018 23:00:33 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Wed, 17 Jan 2018 10:23:37 -0300

perf script: Remove the time slices number limitation

Previously it was only allowed to use at most 10 time slices in 'perf
script --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf script --time 
1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao <yao@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Reviewed-by: Jiri Olsa <jo...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1515596433-24653-9-git-send-email-yao@linux.intel.com
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/Documentation/perf-script.txt | 10 +-
 tools/perf/builtin-script.c  | 16 
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt 
b/tools/perf/Documentation/perf-script.txt
index 806ec63..7730c1d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -351,19 +351,19 @@ include::itrace.txt[]
to end of file.
 
Also support time percent with multipe time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
-   Select the second 10% time slice
+   Select the second 10% time slice:
perf script --time 10%/2
 
-   Select from 0% to 10% time slice
+   Select from 0% to 10% time slice:
perf script --time 0%-10%
 
-   Select the first and second 10% time slices
+   Select the first and second 10% time slices:
perf script --time 10%/1,10%/2
 
-   Select from 0% to 10% and 30% to 40% slices
+   Select from 0% to 10% and 30% to 40% slices:
perf script --time 0%-10%,30%-40%
 
 --max-blocks::
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ac78191..3499d68 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1480,8 +1480,6 @@ static int perf_sample__fprintf_synth(struct perf_sample 
*sample,
return 0;
 }
 
-#define PTIME_RANGE_MAX10
-
 struct perf_script {
struct perf_tooltool;
struct perf_session *session;
@@ -1496,7 +1494,8 @@ struct perf_script {
struct thread_map   *threads;
int name_width;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
 };
 
@@ -3445,6 +3444,13 @@ int cmd_script(int argc, const char **argv)
if (err < 0)
goto out_delete;
 
+   script.ptime_range = perf_time__range_alloc(script.time_str,
+   _size);
+   if (!script.ptime_range) {
+   err = -ENOMEM;
+   goto out_delete;
+   }
+
/* needs to be parsed after looking up reference time */
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
@@ -3457,7 +3463,7 @@ int cmd_script(int argc, const char **argv)
}
 
script.range_num = perf_time__percent_parse_str(
-   script.ptime_range, PTIME_RANGE_MAX,
+   script.ptime_range, script.range_size,
script.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
@@ -3476,6 +3482,8 @@ int cmd_script(int argc, const char **argv)
flush_scripting();
 
 out_delete:
+   zfree(_range);
+
perf_evlist__free_stats(session->evlist);
perf_session__delete(session);
 


[tip:perf/core] perf script: Remove the time slices number limitation

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  cc2ef584a863b7c8033b78723cd253ca47e9a589
Gitweb: https://git.kernel.org/tip/cc2ef584a863b7c8033b78723cd253ca47e9a589
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:33 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:37 -0300

perf script: Remove the time slices number limitation

Previously it was only allowed to use at most 10 time slices in 'perf
script --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf script --time 
1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1515596433-24653-9-git-send-email-yao@linux.intel.com
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-script.txt | 10 +-
 tools/perf/builtin-script.c  | 16 
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt 
b/tools/perf/Documentation/perf-script.txt
index 806ec63..7730c1d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -351,19 +351,19 @@ include::itrace.txt[]
to end of file.
 
Also support time percent with multipe time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
-   Select the second 10% time slice
+   Select the second 10% time slice:
perf script --time 10%/2
 
-   Select from 0% to 10% time slice
+   Select from 0% to 10% time slice:
perf script --time 0%-10%
 
-   Select the first and second 10% time slices
+   Select the first and second 10% time slices:
perf script --time 10%/1,10%/2
 
-   Select from 0% to 10% and 30% to 40% slices
+   Select from 0% to 10% and 30% to 40% slices:
perf script --time 0%-10%,30%-40%
 
 --max-blocks::
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ac78191..3499d68 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1480,8 +1480,6 @@ static int perf_sample__fprintf_synth(struct perf_sample 
*sample,
return 0;
 }
 
-#define PTIME_RANGE_MAX10
-
 struct perf_script {
struct perf_tooltool;
struct perf_session *session;
@@ -1496,7 +1494,8 @@ struct perf_script {
struct thread_map   *threads;
int name_width;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
 };
 
@@ -3445,6 +3444,13 @@ int cmd_script(int argc, const char **argv)
if (err < 0)
goto out_delete;
 
+   script.ptime_range = perf_time__range_alloc(script.time_str,
+   _size);
+   if (!script.ptime_range) {
+   err = -ENOMEM;
+   goto out_delete;
+   }
+
/* needs to be parsed after looking up reference time */
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
@@ -3457,7 +3463,7 @@ int cmd_script(int argc, const char **argv)
}
 
script.range_num = perf_time__percent_parse_str(
-   script.ptime_range, PTIME_RANGE_MAX,
+   script.ptime_range, script.range_size,
script.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
@@ -3476,6 +3482,8 @@ int cmd_script(int argc, const char **argv)
flush_scripting();
 
 out_delete:
+   zfree(_range);
+
perf_evlist__free_stats(session->evlist);
perf_session__delete(session);
 


[tip:perf/core] perf report: Remove the time slices number limitation

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  0a3cc3ae05c363dabd891ed5f918c62197de8c7f
Gitweb: https://git.kernel.org/tip/0a3cc3ae05c363dabd891ed5f918c62197de8c7f
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Wed, 10 Jan 2018 23:00:32 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Wed, 17 Jan 2018 10:23:37 -0300

perf report: Remove the time slices number limitation

Previously it was only allowed to use at most 10 time slices in 'perf
report --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf report --stdio --time 
1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao <yao@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Reviewed-by: Jiri Olsa <jo...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1515596433-24653-8-git-send-email-yao@linux.intel.com
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/Documentation/perf-report.txt |  2 +-
 tools/perf/builtin-report.c  | 22 --
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index 63d0db3..907e505 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -403,7 +403,7 @@ OPTIONS
to end of file.
 
Also support time percent with multiple time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
Select the second 10% time slice:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 437..42a52dc 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -54,8 +54,6 @@
 #include 
 #include 
 
-#define PTIME_RANGE_MAX10
-
 struct report {
struct perf_tooltool;
struct perf_session *session;
@@ -76,7 +74,8 @@ struct report {
const char  *cpu_list;
const char  *symbol_filter_str;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
float   min_percent;
u64 nr_entries;
@@ -1300,24 +1299,33 @@ repeat:
if (symbol__init(>header.env) < 0)
goto error;
 
+   report.ptime_range = perf_time__range_alloc(report.time_str,
+   _size);
+   if (!report.ptime_range) {
+   ret = -ENOMEM;
+   goto error;
+   }
+
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
pr_err("HINT: no first/last sample time found in perf 
data.\n"
   "Please use latest perf binary to execute 'perf 
record'\n"
   "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
 
report.range_num = perf_time__percent_parse_str(
-   report.ptime_range, PTIME_RANGE_MAX,
+   report.ptime_range, report.range_size,
report.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
 
if (report.range_num < 0) {
pr_err("Invalid time string\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
} else {
report.range_num = 1;
@@ -1333,6 +1341,8 @@ repeat:
ret = 0;
 
 error:
+   zfree(_range);
+
perf_session__delete(session);
return ret;
 }


[tip:perf/core] perf report: Remove the time slices number limitation

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  0a3cc3ae05c363dabd891ed5f918c62197de8c7f
Gitweb: https://git.kernel.org/tip/0a3cc3ae05c363dabd891ed5f918c62197de8c7f
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:32 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:37 -0300

perf report: Remove the time slices number limitation

Previously it was only allowed to use at most 10 time slices in 'perf
report --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf report --stdio --time 
1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1515596433-24653-8-git-send-email-yao@linux.intel.com
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-report.txt |  2 +-
 tools/perf/builtin-report.c  | 22 --
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index 63d0db3..907e505 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -403,7 +403,7 @@ OPTIONS
to end of file.
 
Also support time percent with multiple time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
Select the second 10% time slice:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 437..42a52dc 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -54,8 +54,6 @@
 #include 
 #include 
 
-#define PTIME_RANGE_MAX10
-
 struct report {
struct perf_tooltool;
struct perf_session *session;
@@ -76,7 +74,8 @@ struct report {
const char  *cpu_list;
const char  *symbol_filter_str;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
float   min_percent;
u64 nr_entries;
@@ -1300,24 +1299,33 @@ repeat:
if (symbol__init(>header.env) < 0)
goto error;
 
+   report.ptime_range = perf_time__range_alloc(report.time_str,
+   _size);
+   if (!report.ptime_range) {
+   ret = -ENOMEM;
+   goto error;
+   }
+
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
pr_err("HINT: no first/last sample time found in perf 
data.\n"
   "Please use latest perf binary to execute 'perf 
record'\n"
   "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
 
report.range_num = perf_time__percent_parse_str(
-   report.ptime_range, PTIME_RANGE_MAX,
+   report.ptime_range, report.range_size,
report.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
 
if (report.range_num < 0) {
pr_err("Invalid time string\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
} else {
report.range_num = 1;
@@ -1333,6 +1341,8 @@ repeat:
ret = 0;
 
 error:
+   zfree(_range);
+
perf_session__delete(session);
return ret;
 }


[tip:perf/core] perf util: Allocate time slices buffer according to number of comma

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  5a031f887cb8d60fe87d21159c3cf82c38f55679
Gitweb: https://git.kernel.org/tip/5a031f887cb8d60fe87d21159c3cf82c38f55679
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Wed, 10 Jan 2018 23:00:31 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Wed, 17 Jan 2018 10:23:36 -0300

perf util: Allocate time slices buffer according to number of comma

Previously we use a magic number 10 to limit the number of time slices.
It's not very good.

This patch creates a new function perf_time__range_alloc() to allocate
time slices buffer. The number of buffer entries is determined by the
number of comma in string but at least it will allocate one entry even
if no comma is found.

Signed-off-by: Jin Yao <yao@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Reviewed-by: Jiri Olsa <jo...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1515596433-24653-7-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/util/time-utils.c | 28 
 tools/perf/util/time-utils.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 5769f97..6193b46 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -325,6 +325,34 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
return -1;
 }
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size)
+{
+   const char *p1, *p2;
+   int i = 1;
+   struct perf_time_interval *ptime;
+
+   /*
+* At least allocate one time range.
+*/
+   if (!ostr)
+   goto alloc;
+
+   p1 = ostr;
+   while (p1 < ostr + strlen(ostr)) {
+   p2 = strchr(p1, ',');
+   if (!p2)
+   break;
+
+   p1 = p2 + 1;
+   i++;
+   }
+
+alloc:
+   *size = i;
+   ptime = calloc(i, sizeof(*ptime));
+   return ptime;
+}
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
 {
/* if time is not set don't drop sample */
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 34d5eba..70b177d 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -16,6 +16,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr);
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end);
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size);
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
 bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,


[tip:perf/core] perf util: Allocate time slices buffer according to number of comma

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  5a031f887cb8d60fe87d21159c3cf82c38f55679
Gitweb: https://git.kernel.org/tip/5a031f887cb8d60fe87d21159c3cf82c38f55679
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:31 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:36 -0300

perf util: Allocate time slices buffer according to number of comma

Previously we use a magic number 10 to limit the number of time slices.
It's not very good.

This patch creates a new function perf_time__range_alloc() to allocate
time slices buffer. The number of buffer entries is determined by the
number of comma in string but at least it will allocate one entry even
if no comma is found.

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1515596433-24653-7-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 28 
 tools/perf/util/time-utils.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 5769f97..6193b46 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -325,6 +325,34 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
return -1;
 }
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size)
+{
+   const char *p1, *p2;
+   int i = 1;
+   struct perf_time_interval *ptime;
+
+   /*
+* At least allocate one time range.
+*/
+   if (!ostr)
+   goto alloc;
+
+   p1 = ostr;
+   while (p1 < ostr + strlen(ostr)) {
+   p2 = strchr(p1, ',');
+   if (!p2)
+   break;
+
+   p1 = p2 + 1;
+   i++;
+   }
+
+alloc:
+   *size = i;
+   ptime = calloc(i, sizeof(*ptime));
+   return ptime;
+}
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
 {
/* if time is not set don't drop sample */
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 34d5eba..70b177d 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -16,6 +16,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr);
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end);
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size);
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
 bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,


[tip:perf/core] perf report: Add an indication of what time slices are used

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  7425664bbd3174814500c7ab8740cbb9bb25396c
Gitweb: https://git.kernel.org/tip/7425664bbd3174814500c7ab8740cbb9bb25396c
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Wed, 10 Jan 2018 23:00:30 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Wed, 17 Jan 2018 10:23:36 -0300

perf report: Add an indication of what time slices are used

Add a time slices indication to the perf report header.

For example,

  # perf report --stdio --time 10%

  # Total Lost Samples: 0
  #
  # Samples: 9K of event 'cycles:ppp' (time slices: 10%)
  # Event count (approx.): 8951288803

Signed-off-by: Jin Yao <yao@linux.intel.com>
Suggested--by: Arnaldo Carvalho de Melo <a...@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Reviewed-by: Jiri Olsa <jo...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1515596433-24653-6-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/builtin-report.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 7d4f0a5..437 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -404,6 +404,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists 
*hists, struct report
if (evname != NULL)
ret += fprintf(fp, " of event '%s'", evname);
 
+   if (rep->time_str)
+   ret += fprintf(fp, " (time slices: %s)", rep->time_str);
+
if (symbol_conf.show_ref_callgraph &&
strstr(evname, "call-graph=no")) {
ret += fprintf(fp, ", show reference callgraph");


[tip:perf/core] perf report: Add an indication of what time slices are used

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  7425664bbd3174814500c7ab8740cbb9bb25396c
Gitweb: https://git.kernel.org/tip/7425664bbd3174814500c7ab8740cbb9bb25396c
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:30 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:36 -0300

perf report: Add an indication of what time slices are used

Add a time slices indication to the perf report header.

For example,

  # perf report --stdio --time 10%

  # Total Lost Samples: 0
  #
  # Samples: 9K of event 'cycles:ppp' (time slices: 10%)
  # Event count (approx.): 8951288803

Signed-off-by: Jin Yao 
Suggested--by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1515596433-24653-6-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 7d4f0a5..437 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -404,6 +404,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists 
*hists, struct report
if (evname != NULL)
ret += fprintf(fp, " of event '%s'", evname);
 
+   if (rep->time_str)
+   ret += fprintf(fp, " (time slices: %s)", rep->time_str);
+
if (symbol_conf.show_ref_callgraph &&
strstr(evname, "call-graph=no")) {
ret += fprintf(fp, ", show reference callgraph");


[tip:perf/core] perf util: Support no index time percent slice

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  3002812e602d3f991a5b8cdc0499e63e13ff65c4
Gitweb: https://git.kernel.org/tip/3002812e602d3f991a5b8cdc0499e63e13ff65c4
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Wed, 10 Jan 2018 23:00:29 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Wed, 17 Jan 2018 10:23:35 -0300

perf util: Support no index time percent slice

Previously, the time percent slice needs an index to specify which one
the user wants.

It may be easier to use if the index can be omitted.  So with this
patch, for example,

perf report --stdio --time 10%/1 should be equivalent to
perf report --stdio --time 10%

Signed-off-by: Jin Yao <yao@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Reviewed-by: Jiri Olsa <jo...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1515596433-24653-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/util/time-utils.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 88510ab..5769f97 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -261,6 +261,37 @@ static int percent_comma_split(struct perf_time_interval 
*ptime_buf, int num,
return i;
 }
 
+static int one_percent_convert(struct perf_time_interval *ptime_buf,
+  const char *ostr, u64 start, u64 end, char *c)
+{
+   char *str;
+   int len = strlen(ostr), ret;
+
+   /*
+* c points to '%'.
+* '%' should be the last character
+*/
+   if (ostr + len - 1 != c)
+   return -1;
+
+   /*
+* Construct a string like "xx%/1"
+*/
+   str = malloc(len + 3);
+   if (str == NULL)
+   return -ENOMEM;
+
+   memcpy(str, ostr, len);
+   strcpy(str + len, "/1");
+
+   ret = percent_slash_split(str, ptime_buf, start, end);
+   if (ret == 0)
+   ret = 1;
+
+   free(str);
+   return ret;
+}
+
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end)
 {
@@ -270,6 +301,7 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
 * ostr example:
 * 10%/2,10%/3: select the second 10% slice and the third 10% slice
 * 0%-10%,30%-40%: multiple time range
+* 50%: just one percent
 */
 
memset(ptime_buf, 0, sizeof(*ptime_buf) * num);
@@ -286,6 +318,10 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
   end, percent_dash_split);
}
 
+   c = strchr(ostr, '%');
+   if (c)
+   return one_percent_convert(ptime_buf, ostr, start, end, c);
+
return -1;
 }
 


[tip:perf/core] perf util: Support no index time percent slice

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  3002812e602d3f991a5b8cdc0499e63e13ff65c4
Gitweb: https://git.kernel.org/tip/3002812e602d3f991a5b8cdc0499e63e13ff65c4
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:29 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:35 -0300

perf util: Support no index time percent slice

Previously, the time percent slice needs an index to specify which one
the user wants.

It may be easier to use if the index can be omitted.  So with this
patch, for example,

perf report --stdio --time 10%/1 should be equivalent to
perf report --stdio --time 10%

Signed-off-by: Jin Yao 
Suggested-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1515596433-24653-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 88510ab..5769f97 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -261,6 +261,37 @@ static int percent_comma_split(struct perf_time_interval 
*ptime_buf, int num,
return i;
 }
 
+static int one_percent_convert(struct perf_time_interval *ptime_buf,
+  const char *ostr, u64 start, u64 end, char *c)
+{
+   char *str;
+   int len = strlen(ostr), ret;
+
+   /*
+* c points to '%'.
+* '%' should be the last character
+*/
+   if (ostr + len - 1 != c)
+   return -1;
+
+   /*
+* Construct a string like "xx%/1"
+*/
+   str = malloc(len + 3);
+   if (str == NULL)
+   return -ENOMEM;
+
+   memcpy(str, ostr, len);
+   strcpy(str + len, "/1");
+
+   ret = percent_slash_split(str, ptime_buf, start, end);
+   if (ret == 0)
+   ret = 1;
+
+   free(str);
+   return ret;
+}
+
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end)
 {
@@ -270,6 +301,7 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
 * ostr example:
 * 10%/2,10%/3: select the second 10% slice and the third 10% slice
 * 0%-10%,30%-40%: multiple time range
+* 50%: just one percent
 */
 
memset(ptime_buf, 0, sizeof(*ptime_buf) * num);
@@ -286,6 +318,10 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
   end, percent_dash_split);
}
 
+   c = strchr(ostr, '%');
+   if (c)
+   return one_percent_convert(ptime_buf, ostr, start, end, c);
+
return -1;
 }
 


[tip:perf/core] perf util: Improve error checking for time percent input

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  6e761cbc9127fb8fc609aea2265ee8279b8d6c55
Gitweb: https://git.kernel.org/tip/6e761cbc9127fb8fc609aea2265ee8279b8d6c55
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Wed, 10 Jan 2018 23:00:28 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Wed, 17 Jan 2018 10:23:35 -0300

perf util: Improve error checking for time percent input

The command line like 'perf report --stdio --time 1abc%/1' could be
accepted by perf. It looks not very good.

This patch uses strtod() to replace original atof() and check the entire
string. Now for the same command line, it would return error message
"Invalid time string".

root@skl:/tmp# perf report --stdio --time 1abc%/1
Invalid time string

Signed-off-by: Jin Yao <yao@linux.intel.com>
Reviewed-by: Jiri Olsa <jo...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1515596433-24653-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/util/time-utils.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 3f7f18f..88510ab 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -116,7 +116,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr)
 
 static int parse_percent(double *pcnt, char *str)
 {
-   char *c;
+   char *c, *endptr;
+   double d;
 
c = strchr(str, '%');
if (c)
@@ -124,8 +125,11 @@ static int parse_percent(double *pcnt, char *str)
else
return -1;
 
-   *pcnt = atof(str) / 100.0;
+   d = strtod(str, );
+   if (endptr != str + strlen(str))
+   return -1;
 
+   *pcnt = d / 100.0;
return 0;
 }
 


[tip:perf/core] perf util: Improve error checking for time percent input

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  6e761cbc9127fb8fc609aea2265ee8279b8d6c55
Gitweb: https://git.kernel.org/tip/6e761cbc9127fb8fc609aea2265ee8279b8d6c55
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:28 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:35 -0300

perf util: Improve error checking for time percent input

The command line like 'perf report --stdio --time 1abc%/1' could be
accepted by perf. It looks not very good.

This patch uses strtod() to replace original atof() and check the entire
string. Now for the same command line, it would return error message
"Invalid time string".

root@skl:/tmp# perf report --stdio --time 1abc%/1
Invalid time string

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1515596433-24653-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 3f7f18f..88510ab 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -116,7 +116,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr)
 
 static int parse_percent(double *pcnt, char *str)
 {
-   char *c;
+   char *c, *endptr;
+   double d;
 
c = strchr(str, '%');
if (c)
@@ -124,8 +125,11 @@ static int parse_percent(double *pcnt, char *str)
else
return -1;
 
-   *pcnt = atof(str) / 100.0;
+   d = strtod(str, );
+   if (endptr != str + strlen(str))
+   return -1;
 
+   *pcnt = d / 100.0;
return 0;
 }
 


[tip:perf/core] perf script: Improve error msg when no first/last sample time found

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  1e2778e91616086177a255f3fc8c72ecaa564ae6
Gitweb: https://git.kernel.org/tip/1e2778e91616086177a255f3fc8c72ecaa564ae6
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Wed, 10 Jan 2018 23:00:27 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Wed, 17 Jan 2018 10:23:34 -0300

perf script: Improve error msg when no first/last sample time found

The following message will be returned to user when executing 'perf
script --time' if perf data file doesn't contain the first/last sample
time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao <yao@linux.intel.com>
Reviewed-by: Jiri Olsa <jo...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1515596433-24653-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/builtin-script.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 08bc818..ac78191 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3449,7 +3449,9 @@ int cmd_script(int argc, const char **argv)
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf 
data.\n"
+  "Please use latest perf binary to execute 'perf 
record'\n"
+  "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
err = -EINVAL;
goto out_delete;
}


[tip:perf/core] perf script: Improve error msg when no first/last sample time found

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  1e2778e91616086177a255f3fc8c72ecaa564ae6
Gitweb: https://git.kernel.org/tip/1e2778e91616086177a255f3fc8c72ecaa564ae6
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:27 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:34 -0300

perf script: Improve error msg when no first/last sample time found

The following message will be returned to user when executing 'perf
script --time' if perf data file doesn't contain the first/last sample
time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1515596433-24653-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-script.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 08bc818..ac78191 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3449,7 +3449,9 @@ int cmd_script(int argc, const char **argv)
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf 
data.\n"
+  "Please use latest perf binary to execute 'perf 
record'\n"
+  "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
err = -EINVAL;
goto out_delete;
}


[tip:perf/core] perf report: Improve error msg when no first/last sample time found

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  eb0b419eff8cf51af8e16cc8c5d2a92d19824266
Gitweb: https://git.kernel.org/tip/eb0b419eff8cf51af8e16cc8c5d2a92d19824266
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Wed, 10 Jan 2018 23:00:26 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Wed, 17 Jan 2018 10:23:34 -0300

perf report: Improve error msg when no first/last sample time found

The following message will be returned to user when executing
'perf report --time' if perf data file doesn't contain the
first/last sample time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao <yao@linux.intel.com>
Reviewed-by: Jiri Olsa <jo...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1515596433-24653-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/builtin-report.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6593779..7d4f0a5 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1300,7 +1300,9 @@ repeat:
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf 
data.\n"
+  "Please use latest perf binary to execute 'perf 
record'\n"
+  "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
return -EINVAL;
}
 


[tip:perf/core] perf report: Improve error msg when no first/last sample time found

2018-01-17 Thread tip-bot for Jin Yao
Commit-ID:  eb0b419eff8cf51af8e16cc8c5d2a92d19824266
Gitweb: https://git.kernel.org/tip/eb0b419eff8cf51af8e16cc8c5d2a92d19824266
Author: Jin Yao 
AuthorDate: Wed, 10 Jan 2018 23:00:26 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Jan 2018 10:23:34 -0300

perf report: Improve error msg when no first/last sample time found

The following message will be returned to user when executing
'perf report --time' if perf data file doesn't contain the
first/last sample time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao 
Reviewed-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1515596433-24653-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 6593779..7d4f0a5 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1300,7 +1300,9 @@ repeat:
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf 
data.\n"
+  "Please use latest perf binary to execute 'perf 
record'\n"
+  "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
return -EINVAL;
}
 


Re: [PATCH v1 0/8] perf: Follow-up patches to improve time slice

2018-01-16 Thread Jin, Yao



On 1/16/2018 10:48 PM, Arnaldo Carvalho de Melo wrote:

Em Tue, Jan 16, 2018 at 12:55:19PM +0100, Jiri Olsa escreveu:

On Wed, Jan 10, 2018 at 11:00:25PM +0800, Jin Yao wrote:

It's follow-up patches to improve the perf time slice feature
(perf report/script --time xxx)

1. Improve the error message
perf report: Improve error msg when no first/last sample time found
perf script: Improve error msg when no first/last sample time found

2. Fix an issue that illegal percent was accepted previously (e.g. 1abc%)
perf util: Improve error checking for time percent input

3. Omit the slice index if possible. For example,
perf report --stdio --time 10%/1 is equivalent to
perf report --stdio --time 10%

perf util: Support no index time percent slice

4. Add indication of time slices in perf report header.
perf report: Add an indication of what time slices are used
>>> 5. Remove the time slices number limitation in perf report/script
perf util: Allocate time slices buffer according to number of comma
perf report: Remove the time slices number limitation
perf script: Remove the time slices number limitation

Jin Yao (8):
   perf report: Improve error msg when no first/last sample time found
   perf script: Improve error msg when no first/last sample time found
   perf util: Improve error checking for time percent input
   perf util: Support no index time percent slice
   perf report: Add an indication of what time slices are used
   perf util: Allocate time slices buffer according to number of comma
   perf report: Remove the time slices number limitation
   perf script: Remove the time slices number limitation


from quick look it looks ok

Reviewed-by: Jiri Olsa <jo...@redhat.com>


Thanks, applied, now test building with 'make -C tools/perf build-test',
containers, etc.

- Arnaldo



Thanks Arnaldo!

Thanks
Jin Yao


Re: [PATCH v1 0/8] perf: Follow-up patches to improve time slice

2018-01-16 Thread Jin, Yao



On 1/16/2018 10:48 PM, Arnaldo Carvalho de Melo wrote:

Em Tue, Jan 16, 2018 at 12:55:19PM +0100, Jiri Olsa escreveu:

On Wed, Jan 10, 2018 at 11:00:25PM +0800, Jin Yao wrote:

It's follow-up patches to improve the perf time slice feature
(perf report/script --time xxx)

1. Improve the error message
perf report: Improve error msg when no first/last sample time found
perf script: Improve error msg when no first/last sample time found

2. Fix an issue that illegal percent was accepted previously (e.g. 1abc%)
perf util: Improve error checking for time percent input

3. Omit the slice index if possible. For example,
perf report --stdio --time 10%/1 is equivalent to
perf report --stdio --time 10%

perf util: Support no index time percent slice

4. Add indication of time slices in perf report header.
perf report: Add an indication of what time slices are used
>>> 5. Remove the time slices number limitation in perf report/script
perf util: Allocate time slices buffer according to number of comma
perf report: Remove the time slices number limitation
perf script: Remove the time slices number limitation

Jin Yao (8):
   perf report: Improve error msg when no first/last sample time found
   perf script: Improve error msg when no first/last sample time found
   perf util: Improve error checking for time percent input
   perf util: Support no index time percent slice
   perf report: Add an indication of what time slices are used
   perf util: Allocate time slices buffer according to number of comma
   perf report: Remove the time slices number limitation
   perf script: Remove the time slices number limitation


from quick look it looks ok

Reviewed-by: Jiri Olsa 


Thanks, applied, now test building with 'make -C tools/perf build-test',
containers, etc.

- Arnaldo



Thanks Arnaldo!

Thanks
Jin Yao


Re: [PATCH] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-01-16 Thread Jin, Yao



On 1/16/2018 9:17 PM, Jiri Olsa wrote:

On Tue, Jan 16, 2018 at 09:06:09PM +0800, Jin, Yao wrote:

Just tested. But looks it's not OK for '--per-thread' case.


yea, I haven't tested much.. might need soem tweaking,
but my point was that it could be doable on one place
instead of introducing another if possible

jirka



Yes, I understand. It'd better we can put the code to more common places.

Actually I used the similar patch as yours at first. While I found it 
didn't work for the case of system-wide '--per-thread' then I developed 
current one.


Thanks
Jin Yao




Re: [PATCH] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-01-16 Thread Jin, Yao



On 1/16/2018 9:17 PM, Jiri Olsa wrote:

On Tue, Jan 16, 2018 at 09:06:09PM +0800, Jin, Yao wrote:

Just tested. But looks it's not OK for '--per-thread' case.


yea, I haven't tested much.. might need soem tweaking,
but my point was that it could be doable on one place
instead of introducing another if possible

jirka



Yes, I understand. It'd better we can put the code to more common places.

Actually I used the similar patch as yours at first. While I found it 
didn't work for the case of system-wide '--per-thread' then I developed 
current one.


Thanks
Jin Yao




Re: [PATCH] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-01-16 Thread Jin, Yao

Just tested. But looks it's not OK for '--per-thread' case.

jinyao@skl:~/$ ./perf stat --per-thread
WARNING: Ignored open failure for pid 1
WARNING: Ignored open failure for pid 2
WARNING: Ignored open failure for pid 4
WARNING: Ignored open failure for pid 6
WARNING: Ignored open failure for pid 7
WARNING: Ignored open failure for pid 8
WARNING: Ignored open failure for pid 9
WARNING: Ignored open failure for pid 10
..
WARNING: Ignored open failure for pid 11783
WARNING: Ignored open failure for pid 12275
WARNING: Ignored open failure for pid 31733
WARNING: Ignored open failure for pid 31737
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).

The current value is 2:

  -1: Allow use of (almost) all events by all users
  Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
  Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN

To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

kernel.perf_event_paranoid = -1

Thanks
Jin Yao

On 1/16/2018 8:51 PM, Jiri Olsa wrote:

On Tue, Jan 16, 2018 at 11:43:08PM +0800, Jin Yao wrote:

If we execute 'perf stat --per-thread' with non-root account
(even set kernel.perf_event_paranoid = -1 yet), it reports the error:

jinyao@skl:~$ perf stat --per-thread
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).

The current value is 2:

   -1: Allow use of (almost) all events by all users
   Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK

= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN

   Disallow raw tracepoint access by users without CAP_SYS_ADMIN

= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN


To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

 kernel.perf_event_paranoid = -1

Perhaps the ptrace rule doesn't allow to trace some processes. But anyway
the global --per-thread mode had better ignore such errors and continue
working on other threads.

This patch will record the index of error thread in perf_evsel__open()
and remove this thread before retrying.

For example (run with non-root, kernel.perf_event_paranoid isn't set):

jinyao@skl:~$ perf stat --per-thread
^C
  Performance counter stats for 'system wide':

   vmstat-3458   6.171984  cpu-clock:u (msec)#  
  0.000 CPUs utilized
 perf-3670   0.515599  cpu-clock:u (msec)#  
  0.000 CPUs utilized
   vmstat-3458  1,163,643  cycles:u  #  
  0.189 GHz
 perf-3670 40,881  cycles:u  #  
  0.079 GHz
   vmstat-3458  1,410,238  instructions:u#  
  1.21  insn per cycle
 perf-3670  3,536  instructions:u#  
  0.09  insn per cycle
   vmstat-3458288,937  branches:u#  
 46.814 M/sec
 perf-3670936  branches:u#  
  1.815 M/sec
   vmstat-3458 15,195  branch-misses:u   #  
  5.26% of all branches
 perf-3670 76  branch-misses:u   #  
  8.12% of all branches

   12.651675247 seconds time elapsed


could we use the existing code like in attached patch?

we might also swap following warning for some generic one:
   WARNING: Ignored open failure for pid 28392

jirka


---
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 98bf9d32f222..f9d4d3ad6fc5 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -276,6 +276,9 @@ static int create_perf_stat_counter(struct perf_evsel 
*evsel)
attr->enable_on_exec = 1;
}
  
+	if (stat_config.aggr_mode == AGGR_THREAD)

+   evsel->ignore_missing_thread = true;
+
if (target__has_cpu() && !target__has_per_thread())
return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
  
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c

index 8f971a2301d1..509ee175bc97 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1656,7 +1656,7 @@ static bool ignore_missing_thread(struct perf_evsel 
*evsel,
return false;
  
  	/* The -ESRCH is perf event syscall errno for pid's not found. */

- 

Re: [PATCH] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-01-16 Thread Jin, Yao

Just tested. But looks it's not OK for '--per-thread' case.

jinyao@skl:~/$ ./perf stat --per-thread
WARNING: Ignored open failure for pid 1
WARNING: Ignored open failure for pid 2
WARNING: Ignored open failure for pid 4
WARNING: Ignored open failure for pid 6
WARNING: Ignored open failure for pid 7
WARNING: Ignored open failure for pid 8
WARNING: Ignored open failure for pid 9
WARNING: Ignored open failure for pid 10
..
WARNING: Ignored open failure for pid 11783
WARNING: Ignored open failure for pid 12275
WARNING: Ignored open failure for pid 31733
WARNING: Ignored open failure for pid 31737
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).

The current value is 2:

  -1: Allow use of (almost) all events by all users
  Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
  Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN

To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

kernel.perf_event_paranoid = -1

Thanks
Jin Yao

On 1/16/2018 8:51 PM, Jiri Olsa wrote:

On Tue, Jan 16, 2018 at 11:43:08PM +0800, Jin Yao wrote:

If we execute 'perf stat --per-thread' with non-root account
(even set kernel.perf_event_paranoid = -1 yet), it reports the error:

jinyao@skl:~$ perf stat --per-thread
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).

The current value is 2:

   -1: Allow use of (almost) all events by all users
   Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK

= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN

   Disallow raw tracepoint access by users without CAP_SYS_ADMIN

= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN


To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

 kernel.perf_event_paranoid = -1

Perhaps the ptrace rule doesn't allow to trace some processes. But anyway
the global --per-thread mode had better ignore such errors and continue
working on other threads.

This patch will record the index of error thread in perf_evsel__open()
and remove this thread before retrying.

For example (run with non-root, kernel.perf_event_paranoid isn't set):

jinyao@skl:~$ perf stat --per-thread
^C
  Performance counter stats for 'system wide':

   vmstat-3458   6.171984  cpu-clock:u (msec)#  
  0.000 CPUs utilized
 perf-3670   0.515599  cpu-clock:u (msec)#  
  0.000 CPUs utilized
   vmstat-3458  1,163,643  cycles:u  #  
  0.189 GHz
 perf-3670 40,881  cycles:u  #  
  0.079 GHz
   vmstat-3458  1,410,238  instructions:u#  
  1.21  insn per cycle
 perf-3670  3,536  instructions:u#  
  0.09  insn per cycle
   vmstat-3458288,937  branches:u#  
 46.814 M/sec
 perf-3670936  branches:u#  
  1.815 M/sec
   vmstat-3458 15,195  branch-misses:u   #  
  5.26% of all branches
 perf-3670 76  branch-misses:u   #  
  8.12% of all branches

   12.651675247 seconds time elapsed


could we use the existing code like in attached patch?

we might also swap following warning for some generic one:
   WARNING: Ignored open failure for pid 28392

jirka


---
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 98bf9d32f222..f9d4d3ad6fc5 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -276,6 +276,9 @@ static int create_perf_stat_counter(struct perf_evsel 
*evsel)
attr->enable_on_exec = 1;
}
  
+	if (stat_config.aggr_mode == AGGR_THREAD)

+   evsel->ignore_missing_thread = true;
+
if (target__has_cpu() && !target__has_per_thread())
return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
  
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c

index 8f971a2301d1..509ee175bc97 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1656,7 +1656,7 @@ static bool ignore_missing_thread(struct perf_evsel 
*evsel,
return false;
  
  	/* The -ESRCH is perf event syscall errno for pid's not found. */

- 

Re: [PATCH v1 0/8] perf: Follow-up patches to improve time slice

2018-01-16 Thread Jin, Yao



On 1/16/2018 7:55 PM, Jiri Olsa wrote:

On Wed, Jan 10, 2018 at 11:00:25PM +0800, Jin Yao wrote:

It's follow-up patches to improve the perf time slice feature
(perf report/script --time xxx)

1. Improve the error message
perf report: Improve error msg when no first/last sample time found
perf script: Improve error msg when no first/last sample time found

2. Fix an issue that illegal percent was accepted previously (e.g. 1abc%)
perf util: Improve error checking for time percent input

3. Omit the slice index if possible. For example,
perf report --stdio --time 10%/1 is equivalent to
perf report --stdio --time 10%

perf util: Support no index time percent slice

4. Add indication of time slices in perf report header.
perf report: Add an indication of what time slices are used

5. Remove the time slices number limitation in perf report/script
perf util: Allocate time slices buffer according to number of comma
perf report: Remove the time slices number limitation
perf script: Remove the time slices number limitation

Jin Yao (8):
   perf report: Improve error msg when no first/last sample time found
   perf script: Improve error msg when no first/last sample time found
   perf util: Improve error checking for time percent input
   perf util: Support no index time percent slice
   perf report: Add an indication of what time slices are used
   perf util: Allocate time slices buffer according to number of comma
   perf report: Remove the time slices number limitation
   perf script: Remove the time slices number limitation


from quick look it looks ok

Reviewed-by: Jiri Olsa <jo...@redhat.com>

thanks,
jirka



Hi Jiri,

Thanks so much for reviewing the patch.

Thanks
Jin Yao


Re: [PATCH v1 0/8] perf: Follow-up patches to improve time slice

2018-01-16 Thread Jin, Yao



On 1/16/2018 7:55 PM, Jiri Olsa wrote:

On Wed, Jan 10, 2018 at 11:00:25PM +0800, Jin Yao wrote:

It's follow-up patches to improve the perf time slice feature
(perf report/script --time xxx)

1. Improve the error message
perf report: Improve error msg when no first/last sample time found
perf script: Improve error msg when no first/last sample time found

2. Fix an issue that illegal percent was accepted previously (e.g. 1abc%)
perf util: Improve error checking for time percent input

3. Omit the slice index if possible. For example,
perf report --stdio --time 10%/1 is equivalent to
perf report --stdio --time 10%

perf util: Support no index time percent slice

4. Add indication of time slices in perf report header.
perf report: Add an indication of what time slices are used

5. Remove the time slices number limitation in perf report/script
perf util: Allocate time slices buffer according to number of comma
perf report: Remove the time slices number limitation
perf script: Remove the time slices number limitation

Jin Yao (8):
   perf report: Improve error msg when no first/last sample time found
   perf script: Improve error msg when no first/last sample time found
   perf util: Improve error checking for time percent input
   perf util: Support no index time percent slice
   perf report: Add an indication of what time slices are used
   perf util: Allocate time slices buffer according to number of comma
   perf report: Remove the time slices number limitation
   perf script: Remove the time slices number limitation


from quick look it looks ok

Reviewed-by: Jiri Olsa 

thanks,
jirka



Hi Jiri,

Thanks so much for reviewing the patch.

Thanks
Jin Yao


[PATCH] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-01-15 Thread Jin Yao
If we execute 'perf stat --per-thread' with non-root account
(even set kernel.perf_event_paranoid = -1 yet), it reports the error:

jinyao@skl:~$ perf stat --per-thread
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).

The current value is 2:

  -1: Allow use of (almost) all events by all users
  Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
  Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN

To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

kernel.perf_event_paranoid = -1

Perhaps the ptrace rule doesn't allow to trace some processes. But anyway
the global --per-thread mode had better ignore such errors and continue
working on other threads.

This patch will record the index of error thread in perf_evsel__open()
and remove this thread before retrying.

For example (run with non-root, kernel.perf_event_paranoid isn't set):

jinyao@skl:~$ perf stat --per-thread
^C
 Performance counter stats for 'system wide':

  vmstat-3458   6.171984  cpu-clock:u (msec)#   
 0.000 CPUs utilized
perf-3670   0.515599  cpu-clock:u (msec)#   
 0.000 CPUs utilized
  vmstat-3458  1,163,643  cycles:u  #   
 0.189 GHz
perf-3670 40,881  cycles:u  #   
 0.079 GHz
  vmstat-3458  1,410,238  instructions:u#   
 1.21  insn per cycle
perf-3670  3,536  instructions:u#   
 0.09  insn per cycle
  vmstat-3458288,937  branches:u#   
46.814 M/sec
perf-3670936  branches:u#   
 1.815 M/sec
  vmstat-3458 15,195  branch-misses:u   #   
 5.26% of all branches
perf-3670 76  branch-misses:u   #   
 8.12% of all branches

  12.651675247 seconds time elapsed

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/builtin-stat.c| 14 +-
 tools/perf/util/evsel.c  |  3 +++
 tools/perf/util/thread_map.c |  1 +
 tools/perf/util/thread_map.h |  1 +
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 98bf9d3..bcdb47c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -632,7 +632,19 @@ static int __run_perf_stat(int argc, const char **argv)
 if (verbose > 0)
 ui__warning("%s\n", msg);
 goto try_again;
-}
+   } else if (target__has_per_thread() &&
+  evsel_list->threads &&
+  evsel_list->threads->err_thread != -1) {
+   /*
+* For global --per-thread case, skip current
+* error thread.
+*/
+   if (!thread_map__remove(evsel_list->threads,
+   
evsel_list->threads->err_thread)) {
+   evsel_list->threads->err_thread = -1;
+   goto try_again;
+   }
+   }
 
perf_evsel__open_strerror(counter, ,
  errno, msg, sizeof(msg));
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a4d256e..12f8733 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1899,6 +1899,9 @@ int perf_evsel__open(struct perf_evsel *evsel, struct 
cpu_map *cpus,
goto fallback_missing_features;
}
 out_close:
+   if (err)
+   threads->err_thread = thread;
+
do {
while (--thread >= 0) {
close(FD(evsel, cpu, thread));
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..870cb0c 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -32,6 +32,7 @@ static void thread_map__reset(struct thread_map *map, int 
start, int nr)
size_t size = (nr - start) * sizeof(map->map[0]);
 
memset(>map[start], 0, size);
+   map->err_thread = -1;
 }
 
 static struct thread_map *thread_map__

[PATCH] perf stat: Ignore error thread when enabling system-wide --per-thread

2018-01-15 Thread Jin Yao
If we execute 'perf stat --per-thread' with non-root account
(even set kernel.perf_event_paranoid = -1 yet), it reports the error:

jinyao@skl:~$ perf stat --per-thread
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).

The current value is 2:

  -1: Allow use of (almost) all events by all users
  Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN
  Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN

To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

kernel.perf_event_paranoid = -1

Perhaps the ptrace rule doesn't allow to trace some processes. But anyway
the global --per-thread mode had better ignore such errors and continue
working on other threads.

This patch will record the index of error thread in perf_evsel__open()
and remove this thread before retrying.

For example (run with non-root, kernel.perf_event_paranoid isn't set):

jinyao@skl:~$ perf stat --per-thread
^C
 Performance counter stats for 'system wide':

  vmstat-3458   6.171984  cpu-clock:u (msec)#   
 0.000 CPUs utilized
perf-3670   0.515599  cpu-clock:u (msec)#   
 0.000 CPUs utilized
  vmstat-3458  1,163,643  cycles:u  #   
 0.189 GHz
perf-3670 40,881  cycles:u  #   
 0.079 GHz
  vmstat-3458  1,410,238  instructions:u#   
 1.21  insn per cycle
perf-3670  3,536  instructions:u#   
 0.09  insn per cycle
  vmstat-3458288,937  branches:u#   
46.814 M/sec
perf-3670936  branches:u#   
 1.815 M/sec
  vmstat-3458 15,195  branch-misses:u   #   
 5.26% of all branches
perf-3670 76  branch-misses:u   #   
 8.12% of all branches

  12.651675247 seconds time elapsed

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-stat.c| 14 +-
 tools/perf/util/evsel.c  |  3 +++
 tools/perf/util/thread_map.c |  1 +
 tools/perf/util/thread_map.h |  1 +
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 98bf9d3..bcdb47c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -632,7 +632,19 @@ static int __run_perf_stat(int argc, const char **argv)
 if (verbose > 0)
 ui__warning("%s\n", msg);
 goto try_again;
-}
+   } else if (target__has_per_thread() &&
+  evsel_list->threads &&
+  evsel_list->threads->err_thread != -1) {
+   /*
+* For global --per-thread case, skip current
+* error thread.
+*/
+   if (!thread_map__remove(evsel_list->threads,
+   
evsel_list->threads->err_thread)) {
+   evsel_list->threads->err_thread = -1;
+   goto try_again;
+   }
+   }
 
perf_evsel__open_strerror(counter, ,
  errno, msg, sizeof(msg));
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a4d256e..12f8733 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1899,6 +1899,9 @@ int perf_evsel__open(struct perf_evsel *evsel, struct 
cpu_map *cpus,
goto fallback_missing_features;
}
 out_close:
+   if (err)
+   threads->err_thread = thread;
+
do {
while (--thread >= 0) {
close(FD(evsel, cpu, thread));
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 3e1038f..870cb0c 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -32,6 +32,7 @@ static void thread_map__reset(struct thread_map *map, int 
start, int nr)
size_t size = (nr - start) * sizeof(map->map[0]);
 
memset(>map[start], 0, size);
+   map->err_thread = -1;
 }
 
 static struct thread_map *thread_map__realloc(struct thread_

Re: [PATCH] perf util: Display warning when perf report/annotate is missing some libs

2018-01-11 Thread Jin, Yao



On 1/11/2018 11:30 PM, Jiri Olsa wrote:

On Thu, Jan 11, 2018 at 07:03:06PM +0800, Jin Yao wrote:

We keep having bug reports that when users build perf on their own,


we already have same warnings during the build



Yes, there will be warnings displayed during the build if some libraries 
are missing.


While I do find the users (especially the new perf user) don't notice 
the warning. They only care about the success of build.


Once they use this perf version and find something working strangely, 
they will say that perf may have bug. :(



but they don't install some needed libraries like libelf, libbfd/libibery.


how about saying that in the symbol column,
instead of poluting report's output, like:

$ perf report --stdio

# Overhead  Command  Shared Object Symbol (disabled)
#   ...    ..



I just think it'd better provide some hints to user. For example, 
"symbol is disabled and you need to install libelf/xxx", say something 
like that.


But it looks the column can't contain too much information (i.e. no more 
space to contain the entire hints).


Any idea? Or just add this warning in verbose mode?


also your change does not affect tui mode

annotation for some reason does not start at all.. could be
little more verbose ;-)

jirka



Yes, it doesn't affect tui mode.

Or we just add this warning in verbose mode?

e.g. perf report -v?

Thanks
Jin Yao


Re: [PATCH] perf util: Display warning when perf report/annotate is missing some libs

2018-01-11 Thread Jin, Yao



On 1/11/2018 11:30 PM, Jiri Olsa wrote:

On Thu, Jan 11, 2018 at 07:03:06PM +0800, Jin Yao wrote:

We keep having bug reports that when users build perf on their own,


we already have same warnings during the build



Yes, there will be warnings displayed during the build if some libraries 
are missing.


While I do find the users (especially the new perf user) don't notice 
the warning. They only care about the success of build.


Once they use this perf version and find something working strangely, 
they will say that perf may have bug. :(



but they don't install some needed libraries like libelf, libbfd/libibery.


how about saying that in the symbol column,
instead of poluting report's output, like:

$ perf report --stdio

# Overhead  Command  Shared Object Symbol (disabled)
#   ...    ..



I just think it'd better provide some hints to user. For example, 
"symbol is disabled and you need to install libelf/xxx", say something 
like that.


But it looks the column can't contain too much information (i.e. no more 
space to contain the entire hints).


Any idea? Or just add this warning in verbose mode?


also your change does not affect tui mode

annotation for some reason does not start at all.. could be
little more verbose ;-)

jirka



Yes, it doesn't affect tui mode.

Or we just add this warning in verbose mode?

e.g. perf report -v?

Thanks
Jin Yao


[tip:perf/core] perf script: Support time percent and multiple time ranges

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  2ab046cd01e33a854798a3e245c9e3f32b950a7d
Gitweb: https://git.kernel.org/tip/2ab046cd01e33a854798a3e245c9e3f32b950a7d
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Fri, 8 Dec 2017 21:13:46 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Mon, 8 Jan 2018 12:07:06 -0300

perf script: Support time percent and multiple time ranges

perf script has a --time option to limit the time range of output.  It
only supports absolute time.

Now this option is extended to support multiple time ranges and support
the percent of time.

For example:

1. Select the first and second 10% time slices:

   perf script --time 10%/1,10%/2

2. Select from 0% to 10% and 30% to 40% slices:

   perf script --time 0%-10%,30%-40%

Changelog:

v6: Fix the merge issue with latest perf/core branch.
No functional changes.

v5: Add checking of first/last sample time to detect if it's recorded
in perf.data. If it's not recorded, returns error message to user.

v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample

v3: Since the definitions of first_sample_time/last_sample_time
are moved from perf_session to perf_evlist so change the
related code.

Signed-off-by: Jin Yao <yao@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1512738826-2628-7-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/Documentation/perf-script.txt | 16 +++
 tools/perf/builtin-script.c  | 34 ++--
 2 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt 
b/tools/perf/Documentation/perf-script.txt
index 974ceb1..7b622a8 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -329,6 +329,22 @@ include::itrace.txt[]
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
 
+   Also support time percent with multipe time range. Time string is
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+
+   For example:
+   Select the second 10% time slice
+   perf script --time 10%/2
+
+   Select from 0% to 10% time slice
+   perf script --time 0%-10%
+
+   Select the first and second 10% time slices
+   perf script --time 10%/1,10%/2
+
+   Select from 0% to 10% and 30% to 40% slices
+   perf script --time 0%-10%,30%-40%
+
 --max-blocks::
Set the maximum number of program blocks to print with brstackasm for
each sample.
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 77e47cf..330dcd9 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1436,6 +1436,8 @@ static int perf_sample__fprintf_synth(struct perf_sample 
*sample,
return 0;
 }
 
+#define PTIME_RANGE_MAX10
+
 struct perf_script {
struct perf_tooltool;
struct perf_session *session;
@@ -1449,7 +1451,8 @@ struct perf_script {
struct thread_map   *threads;
int name_width;
const char  *time_str;
-   struct perf_time_interval ptime;
+   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   int range_num;
 };
 
 static int perf_evlist__max_name_len(struct perf_evlist *evlist)
@@ -1734,8 +1737,10 @@ static int process_sample_event(struct perf_tool *tool,
struct perf_script *scr = container_of(tool, struct perf_script, tool);
struct addr_location al;
 
-   if (perf_time__skip_sample(>ptime, sample->time))
+   if (perf_time__ranges_skip_sample(scr->ptime_range, scr->range_num,
+ sample->time)) {
return 0;
+   }
 
if (debug_mode) {
if (sample->time < last_timestamp) {
@@ -3360,10 +3365,27 @@ int cmd_script(int argc, const char **argv)
goto out_delete;
 
/* needs to be parsed after looking up reference time */
-   if (perf_time__parse_str(, script.time_str) != 0) {
-   pr_err("Invalid time string\n");
-   err = -EINVAL;
-   goto out_delete;
+   if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
+   if (session->evlist->first_sample_time == 0 &&
+   session->evlist->last_sample_time == 0) {
+   pr_err("No first/last sample time in perf data\n");
+   err = -EINVAL;
+  

[tip:perf/core] perf script: Support time percent and multiple time ranges

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  2ab046cd01e33a854798a3e245c9e3f32b950a7d
Gitweb: https://git.kernel.org/tip/2ab046cd01e33a854798a3e245c9e3f32b950a7d
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:46 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 12:07:06 -0300

perf script: Support time percent and multiple time ranges

perf script has a --time option to limit the time range of output.  It
only supports absolute time.

Now this option is extended to support multiple time ranges and support
the percent of time.

For example:

1. Select the first and second 10% time slices:

   perf script --time 10%/1,10%/2

2. Select from 0% to 10% and 30% to 40% slices:

   perf script --time 0%-10%,30%-40%

Changelog:

v6: Fix the merge issue with latest perf/core branch.
No functional changes.

v5: Add checking of first/last sample time to detect if it's recorded
in perf.data. If it's not recorded, returns error message to user.

v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample

v3: Since the definitions of first_sample_time/last_sample_time
are moved from perf_session to perf_evlist so change the
related code.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1512738826-2628-7-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-script.txt | 16 +++
 tools/perf/builtin-script.c  | 34 ++--
 2 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt 
b/tools/perf/Documentation/perf-script.txt
index 974ceb1..7b622a8 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -329,6 +329,22 @@ include::itrace.txt[]
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
 
+   Also support time percent with multipe time range. Time string is
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+
+   For example:
+   Select the second 10% time slice
+   perf script --time 10%/2
+
+   Select from 0% to 10% time slice
+   perf script --time 0%-10%
+
+   Select the first and second 10% time slices
+   perf script --time 10%/1,10%/2
+
+   Select from 0% to 10% and 30% to 40% slices
+   perf script --time 0%-10%,30%-40%
+
 --max-blocks::
Set the maximum number of program blocks to print with brstackasm for
each sample.
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 77e47cf..330dcd9 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1436,6 +1436,8 @@ static int perf_sample__fprintf_synth(struct perf_sample 
*sample,
return 0;
 }
 
+#define PTIME_RANGE_MAX10
+
 struct perf_script {
struct perf_tooltool;
struct perf_session *session;
@@ -1449,7 +1451,8 @@ struct perf_script {
struct thread_map   *threads;
int name_width;
const char  *time_str;
-   struct perf_time_interval ptime;
+   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   int range_num;
 };
 
 static int perf_evlist__max_name_len(struct perf_evlist *evlist)
@@ -1734,8 +1737,10 @@ static int process_sample_event(struct perf_tool *tool,
struct perf_script *scr = container_of(tool, struct perf_script, tool);
struct addr_location al;
 
-   if (perf_time__skip_sample(>ptime, sample->time))
+   if (perf_time__ranges_skip_sample(scr->ptime_range, scr->range_num,
+ sample->time)) {
return 0;
+   }
 
if (debug_mode) {
if (sample->time < last_timestamp) {
@@ -3360,10 +3365,27 @@ int cmd_script(int argc, const char **argv)
goto out_delete;
 
/* needs to be parsed after looking up reference time */
-   if (perf_time__parse_str(, script.time_str) != 0) {
-   pr_err("Invalid time string\n");
-   err = -EINVAL;
-   goto out_delete;
+   if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
+   if (session->evlist->first_sample_time == 0 &&
+   session->evlist->last_sample_time == 0) {
+   pr_err("No first/last sample time in perf data\n");
+   err = -EINVAL;
+   goto out_delete;
+   }
+
+   script.range_num = perf_time__percent_parse_str(
+   script.ptime_range, PTIME_RANGE_MAX,
+   script.time_str,
+

[tip:perf/core] perf report: Support time percent and multiple time ranges

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  5b969bc766807e5c2f184d1d6f97b8471de946f1
Gitweb: https://git.kernel.org/tip/5b969bc766807e5c2f184d1d6f97b8471de946f1
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Fri, 8 Dec 2017 21:13:45 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Mon, 8 Jan 2018 12:06:20 -0300

perf report: Support time percent and multiple time ranges

perf report has a --time option to limit the time range of output.  It
only supports absolute time.

Now this option is extended to support multiple time ranges and support
the percent of time.

For example:

1. Select the first and second 10% time slices:

perf report --time 10%/1,10%/2

2. Select from 0% to 10% and 30% to 40% slices:

perf report --time 0%-10%,30%-40%

Changelog:

v6: Fix the merge issue with latest perf/core branch.
No functional changes.

v5: Add checking of first/last sample time to detect if it's recorded
in perf.data. If it's not recorded, returns error message to user.

v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample

v3: Since the definitions of first_sample_time/last_sample_time
are moved from perf_session to perf_evlist so change the
related code.

Signed-off-by: Jin Yao <yao@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1512738826-2628-6-git-send-email-yao@linux.intel.com
[ Add missing colons at end of examples in the man page ]
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/Documentation/perf-report.txt | 20 
 tools/perf/builtin-report.c  | 31 ++-
 2 files changed, 46 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index ddde2b5..1e02c4e 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -402,6 +402,26 @@ OPTIONS
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
 
+   Also support time percent with multiple time range. Time string is
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+
+   For example:
+   Select the second 10% time slice:
+
+ perf report --time 10%/2
+
+   Select from 0% to 10% time slice:
+
+ perf report --time 0%-10%
+
+   Select the first and second 10% time slices:
+
+ perf report --time 10%/1,10%/2
+
+   Select from 0% to 10% and 30% to 40% slices:
+
+ perf report --time 0%-10%,30%-40%
+
 --itrace::
Options for decoding instruction tracing data. The options are:
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 07827cd..770bf8a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -52,6 +52,8 @@
 #include 
 #include 
 
+#define PTIME_RANGE_MAX10
+
 struct report {
struct perf_tooltool;
struct perf_session *session;
@@ -69,7 +71,8 @@ struct report {
const char  *cpu_list;
const char  *symbol_filter_str;
const char  *time_str;
-   struct perf_time_interval ptime;
+   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   int range_num;
float   min_percent;
u64 nr_entries;
u64 queue_size;
@@ -202,8 +205,10 @@ static int process_sample_event(struct perf_tool *tool,
};
int ret = 0;
 
-   if (perf_time__skip_sample(>ptime, sample->time))
+   if (perf_time__ranges_skip_sample(rep->ptime_range, rep->range_num,
+ sample->time)) {
return 0;
+   }
 
if (machine__resolve(machine, , sample) < 0) {
pr_debug("problem processing %d event, skipping it.\n",
@@ -1093,9 +1098,25 @@ repeat:
if (symbol__init(>header.env) < 0)
goto error;
 
-   if (perf_time__parse_str(, report.time_str) != 0) {
-   pr_err("Invalid time string\n");
-   return -EINVAL;
+   if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
+   if (session->evlist->first_sample_time == 0 &&
+   session->evlist->last_sample_time == 0) {
+   pr_err("No first/last sample time in perf data\n");
+   return -EINVAL;
+   }
+
+   report.range_num = perf_time__percent_parse_str

[tip:perf/core] perf report: Support time percent and multiple time ranges

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  5b969bc766807e5c2f184d1d6f97b8471de946f1
Gitweb: https://git.kernel.org/tip/5b969bc766807e5c2f184d1d6f97b8471de946f1
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:45 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 12:06:20 -0300

perf report: Support time percent and multiple time ranges

perf report has a --time option to limit the time range of output.  It
only supports absolute time.

Now this option is extended to support multiple time ranges and support
the percent of time.

For example:

1. Select the first and second 10% time slices:

perf report --time 10%/1,10%/2

2. Select from 0% to 10% and 30% to 40% slices:

perf report --time 0%-10%,30%-40%

Changelog:

v6: Fix the merge issue with latest perf/core branch.
No functional changes.

v5: Add checking of first/last sample time to detect if it's recorded
in perf.data. If it's not recorded, returns error message to user.

v4: Remove perf_time__skip_sample, only uses perf_time__ranges_skip_sample

v3: Since the definitions of first_sample_time/last_sample_time
are moved from perf_session to perf_evlist so change the
related code.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1512738826-2628-6-git-send-email-yao@linux.intel.com
[ Add missing colons at end of examples in the man page ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-report.txt | 20 
 tools/perf/builtin-report.c  | 31 ++-
 2 files changed, 46 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index ddde2b5..1e02c4e 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -402,6 +402,26 @@ OPTIONS
stop time is not given (i.e, time string is 'x.y,') then analysis goes
to end of file.
 
+   Also support time percent with multiple time range. Time string is
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+
+   For example:
+   Select the second 10% time slice:
+
+ perf report --time 10%/2
+
+   Select from 0% to 10% time slice:
+
+ perf report --time 0%-10%
+
+   Select the first and second 10% time slices:
+
+ perf report --time 10%/1,10%/2
+
+   Select from 0% to 10% and 30% to 40% slices:
+
+ perf report --time 0%-10%,30%-40%
+
 --itrace::
Options for decoding instruction tracing data. The options are:
 
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 07827cd..770bf8a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -52,6 +52,8 @@
 #include 
 #include 
 
+#define PTIME_RANGE_MAX10
+
 struct report {
struct perf_tooltool;
struct perf_session *session;
@@ -69,7 +71,8 @@ struct report {
const char  *cpu_list;
const char  *symbol_filter_str;
const char  *time_str;
-   struct perf_time_interval ptime;
+   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   int range_num;
float   min_percent;
u64 nr_entries;
u64 queue_size;
@@ -202,8 +205,10 @@ static int process_sample_event(struct perf_tool *tool,
};
int ret = 0;
 
-   if (perf_time__skip_sample(>ptime, sample->time))
+   if (perf_time__ranges_skip_sample(rep->ptime_range, rep->range_num,
+ sample->time)) {
return 0;
+   }
 
if (machine__resolve(machine, , sample) < 0) {
pr_debug("problem processing %d event, skipping it.\n",
@@ -1093,9 +1098,25 @@ repeat:
if (symbol__init(>header.env) < 0)
goto error;
 
-   if (perf_time__parse_str(, report.time_str) != 0) {
-   pr_err("Invalid time string\n");
-   return -EINVAL;
+   if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
+   if (session->evlist->first_sample_time == 0 &&
+   session->evlist->last_sample_time == 0) {
+   pr_err("No first/last sample time in perf data\n");
+   return -EINVAL;
+   }
+
+   report.range_num = perf_time__percent_parse_str(
+   report.ptime_range, PTIME_RANGE_MAX,
+   report.time_str,
+   session->evlist->first_sample_time,
+   session->evlist->last_sample_time);
+
+ 

[tip:perf/core] perf tools: Create function to perform multiple time range checking

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  9a9b8b4b2271e763c1600311a3d4ecc2ac359b55
Gitweb: https://git.kernel.org/tip/9a9b8b4b2271e763c1600311a3d4ecc2ac359b55
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Fri, 8 Dec 2017 21:13:44 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Mon, 8 Jan 2018 11:41:06 -0300

perf tools: Create function to perform multiple time range checking

Previous patch supports the multiple time range.

For example, select the first and second 10% time slices.
perf report --time 10%/1,10%/2

We need a function to check if a timestamp is in the ranges of
[0, 10%) and [10%, 20%].

Note that it includes the last element in [10%, 20%] but it doesn't
include the last element in [0, 10%). It's to avoid the overlap.

This patch implments a new function perf_time__ranges_skip_sample
for this checking.

Change log:

v4: Let perf_time__ranges_skip_sample be compatible with
perf_time__skip_sample when only one time range.

Signed-off-by: Jin Yao <yao@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1512738826-2628-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/util/time-utils.c | 28 
 tools/perf/util/time-utils.h |  3 +++
 2 files changed, 31 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 61c46022..3f7f18f 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -300,6 +300,34 @@ bool perf_time__skip_sample(struct perf_time_interval 
*ptime, u64 timestamp)
return false;
 }
 
+bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
+  int num, u64 timestamp)
+{
+   struct perf_time_interval *ptime;
+   int i;
+
+   if ((timestamp == 0) || (num == 0))
+   return false;
+
+   if (num == 1)
+   return perf_time__skip_sample(_buf[0], timestamp);
+
+   /*
+* start/end of multiple time ranges must be valid.
+*/
+   for (i = 0; i < num; i++) {
+   ptime = _buf[i];
+
+   if (timestamp >= ptime->start &&
+   ((timestamp < ptime->end && i < num - 1) ||
+(timestamp <= ptime->end && i == num - 1))) {
+   break;
+   }
+   }
+
+   return (i == num) ? true : false;
+}
+
 int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz)
 {
u64  sec = timestamp / NSEC_PER_SEC;
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 2308723..34d5eba 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -18,6 +18,9 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
 
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
+bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
+  int num, u64 timestamp);
+
 int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);
 
 int fetch_current_timestamp(char *buf, size_t sz);


[tip:perf/core] perf tools: Create function to perform multiple time range checking

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  9a9b8b4b2271e763c1600311a3d4ecc2ac359b55
Gitweb: https://git.kernel.org/tip/9a9b8b4b2271e763c1600311a3d4ecc2ac359b55
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:44 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:41:06 -0300

perf tools: Create function to perform multiple time range checking

Previous patch supports the multiple time range.

For example, select the first and second 10% time slices.
perf report --time 10%/1,10%/2

We need a function to check if a timestamp is in the ranges of
[0, 10%) and [10%, 20%].

Note that it includes the last element in [10%, 20%] but it doesn't
include the last element in [0, 10%). It's to avoid the overlap.

This patch implments a new function perf_time__ranges_skip_sample
for this checking.

Change log:

v4: Let perf_time__ranges_skip_sample be compatible with
perf_time__skip_sample when only one time range.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1512738826-2628-5-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 28 
 tools/perf/util/time-utils.h |  3 +++
 2 files changed, 31 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 61c46022..3f7f18f 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -300,6 +300,34 @@ bool perf_time__skip_sample(struct perf_time_interval 
*ptime, u64 timestamp)
return false;
 }
 
+bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
+  int num, u64 timestamp)
+{
+   struct perf_time_interval *ptime;
+   int i;
+
+   if ((timestamp == 0) || (num == 0))
+   return false;
+
+   if (num == 1)
+   return perf_time__skip_sample(_buf[0], timestamp);
+
+   /*
+* start/end of multiple time ranges must be valid.
+*/
+   for (i = 0; i < num; i++) {
+   ptime = _buf[i];
+
+   if (timestamp >= ptime->start &&
+   ((timestamp < ptime->end && i < num - 1) ||
+(timestamp <= ptime->end && i == num - 1))) {
+   break;
+   }
+   }
+
+   return (i == num) ? true : false;
+}
+
 int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz)
 {
u64  sec = timestamp / NSEC_PER_SEC;
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 2308723..34d5eba 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -18,6 +18,9 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
 
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
+bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
+  int num, u64 timestamp);
+
 int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);
 
 int fetch_current_timestamp(char *buf, size_t sz);


[tip:perf/core] perf tools: Create function to parse time percent

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  13a70f350665580708ab11f725d3578eaacbf2d0
Gitweb: https://git.kernel.org/tip/13a70f350665580708ab11f725d3578eaacbf2d0
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Fri, 8 Dec 2017 21:13:43 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Mon, 8 Jan 2018 11:39:09 -0300

perf tools: Create function to parse time percent

Current perf report/script/... have a --time option to limit the time
range of output. But right now it only supports absolute time, add
support for time percentage.

For example:

1. Select the second 10% time slice
   perf report --time 10%/2

2. Select from 0% to 10% time slice
   perf report --time 0%-10%

It also support the multiple time ranges.

3. Select the first and second 10% time slices
   perf report --time 10%/1,10%/2

4. Select from 0% to 10% and 30% to 40% slices
   perf report --time 0%-10%,30%-40%

Changelog:

v4: An issue is found. Following passes.
perf script --time 10%/10x12321xsdfdasfdsafdsafdsa

Now it uses strtol to replace atoi.

Committer notes:

This just puts in place the infrastructure, so the examples in this cset
comment will only work later, after more patches in this series are
applied.

Signed-off-by: Jin Yao <yao@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1512738826-2628-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/util/time-utils.c | 205 ---
 tools/perf/util/time-utils.h |   3 +
 2 files changed, 196 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 81927d0..61c46022 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "perf.h"
 #include "debug.h"
@@ -60,11 +61,10 @@ static int parse_timestr_sec_nsec(struct perf_time_interval 
*ptime,
return 0;
 }
 
-int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
+static int split_start_end(char **start, char **end, const char *ostr, char ch)
 {
char *start_str, *end_str;
char *d, *str;
-   int rc = 0;
 
if (ostr == NULL || *ostr == '\0')
return 0;
@@ -74,25 +74,35 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr)
if (str == NULL)
return -ENOMEM;
 
-   ptime->start = 0;
-   ptime->end = 0;
-
-   /* str has the format: ,
-* variations: ,
-* ,
-* ,
-*/
start_str = str;
-   d = strchr(start_str, ',');
+   d = strchr(start_str, ch);
if (d) {
*d = '\0';
++d;
}
end_str = d;
 
+   *start = start_str;
+   *end = end_str;
+
+   return 0;
+}
+
+int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
+{
+   char *start_str = NULL, *end_str;
+   int rc;
+
+   rc = split_start_end(_str, _str, ostr, ',');
+   if (rc || !start_str)
+   return rc;
+
+   ptime->start = 0;
+   ptime->end = 0;
+
rc = parse_timestr_sec_nsec(ptime, start_str, end_str);
 
-   free(str);
+   free(start_str);
 
/* make sure end time is after start time if it was given */
if (rc == 0 && ptime->end && ptime->end < ptime->start)
@@ -104,6 +114,177 @@ int perf_time__parse_str(struct perf_time_interval 
*ptime, const char *ostr)
return rc;
 }
 
+static int parse_percent(double *pcnt, char *str)
+{
+   char *c;
+
+   c = strchr(str, '%');
+   if (c)
+   *c = '\0';
+   else
+   return -1;
+
+   *pcnt = atof(str) / 100.0;
+
+   return 0;
+}
+
+static int percent_slash_split(char *str, struct perf_time_interval *ptime,
+  u64 start, u64 end)
+{
+   char *p, *end_str;
+   double pcnt, start_pcnt, end_pcnt;
+   u64 total = end - start;
+   int i;
+
+   /*
+* Example:
+* 10%/2: select the second 10% slice and the third 10% slice
+*/
+
+   /* We can modify this string since the original one is copied */
+   p = strchr(str, '/');
+   if (!p)
+   return -1;
+
+   *p = '\0';
+   if (parse_percent(, str) < 0)
+   return -1;
+
+   p++;
+   i = (int)strtol(p, _str, 10);
+   if (*end_str)
+   return -1;
+
+   if (pcnt <= 0.0)
+   return -1;
+
+   start_pcnt = pcnt * (i - 1);
+   end_pcnt = pcnt * i;
+
+   if (start_pcnt < 0.0 

[tip:perf/core] perf tools: Create function to parse time percent

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  13a70f350665580708ab11f725d3578eaacbf2d0
Gitweb: https://git.kernel.org/tip/13a70f350665580708ab11f725d3578eaacbf2d0
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:43 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:39:09 -0300

perf tools: Create function to parse time percent

Current perf report/script/... have a --time option to limit the time
range of output. But right now it only supports absolute time, add
support for time percentage.

For example:

1. Select the second 10% time slice
   perf report --time 10%/2

2. Select from 0% to 10% time slice
   perf report --time 0%-10%

It also support the multiple time ranges.

3. Select the first and second 10% time slices
   perf report --time 10%/1,10%/2

4. Select from 0% to 10% and 30% to 40% slices
   perf report --time 0%-10%,30%-40%

Changelog:

v4: An issue is found. Following passes.
perf script --time 10%/10x12321xsdfdasfdsafdsafdsa

Now it uses strtol to replace atoi.

Committer notes:

This just puts in place the infrastructure, so the examples in this cset
comment will only work later, after more patches in this series are
applied.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1512738826-2628-4-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/time-utils.c | 205 ---
 tools/perf/util/time-utils.h |   3 +
 2 files changed, 196 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 81927d0..61c46022 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "perf.h"
 #include "debug.h"
@@ -60,11 +61,10 @@ static int parse_timestr_sec_nsec(struct perf_time_interval 
*ptime,
return 0;
 }
 
-int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
+static int split_start_end(char **start, char **end, const char *ostr, char ch)
 {
char *start_str, *end_str;
char *d, *str;
-   int rc = 0;
 
if (ostr == NULL || *ostr == '\0')
return 0;
@@ -74,25 +74,35 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr)
if (str == NULL)
return -ENOMEM;
 
-   ptime->start = 0;
-   ptime->end = 0;
-
-   /* str has the format: ,
-* variations: ,
-* ,
-* ,
-*/
start_str = str;
-   d = strchr(start_str, ',');
+   d = strchr(start_str, ch);
if (d) {
*d = '\0';
++d;
}
end_str = d;
 
+   *start = start_str;
+   *end = end_str;
+
+   return 0;
+}
+
+int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
+{
+   char *start_str = NULL, *end_str;
+   int rc;
+
+   rc = split_start_end(_str, _str, ostr, ',');
+   if (rc || !start_str)
+   return rc;
+
+   ptime->start = 0;
+   ptime->end = 0;
+
rc = parse_timestr_sec_nsec(ptime, start_str, end_str);
 
-   free(str);
+   free(start_str);
 
/* make sure end time is after start time if it was given */
if (rc == 0 && ptime->end && ptime->end < ptime->start)
@@ -104,6 +114,177 @@ int perf_time__parse_str(struct perf_time_interval 
*ptime, const char *ostr)
return rc;
 }
 
+static int parse_percent(double *pcnt, char *str)
+{
+   char *c;
+
+   c = strchr(str, '%');
+   if (c)
+   *c = '\0';
+   else
+   return -1;
+
+   *pcnt = atof(str) / 100.0;
+
+   return 0;
+}
+
+static int percent_slash_split(char *str, struct perf_time_interval *ptime,
+  u64 start, u64 end)
+{
+   char *p, *end_str;
+   double pcnt, start_pcnt, end_pcnt;
+   u64 total = end - start;
+   int i;
+
+   /*
+* Example:
+* 10%/2: select the second 10% slice and the third 10% slice
+*/
+
+   /* We can modify this string since the original one is copied */
+   p = strchr(str, '/');
+   if (!p)
+   return -1;
+
+   *p = '\0';
+   if (parse_percent(, str) < 0)
+   return -1;
+
+   p++;
+   i = (int)strtol(p, _str, 10);
+   if (*end_str)
+   return -1;
+
+   if (pcnt <= 0.0)
+   return -1;
+
+   start_pcnt = pcnt * (i - 1);
+   end_pcnt = pcnt * i;
+
+   if (start_pcnt < 0.0 || start_pcnt > 1.0 ||
+   end_pcnt < 0.0 || end_pcnt > 1.0) {
+   return -1;
+   }
+
+   ptime->start = start + round(start_pcnt * total);
+   ptime->end = start + round(end_pcnt * total);
+
+   retur

[tip:perf/core] perf header: Add infrastructure to record first and last sample time

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  6011518db3bd04c80cd3ce3e6aea1c399739adb4
Gitweb: https://git.kernel.org/tip/6011518db3bd04c80cd3ce3e6aea1c399739adb4
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Fri, 8 Dec 2017 21:13:41 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Mon, 8 Jan 2018 11:20:51 -0300

perf header: Add infrastructure to record first and last sample time

perf report/script/... have a --time option to limit the time range of
output. That's very useful to slice large traces, e.g. when processing
the output of perf script for some analysis.

But right now --time only supports absolute time. Also there is no fast
way to get the start/end times of a given trace except for looking at
it.  This makes it hard to e.g. only decode the first half of the trace,
which is useful for parallelization of scripts

Another problem is that perf records are variable size and there is no
synchronization mechanism. So the only way to find the last sample
reliably would be to walk all samples. But we want to avoid that in perf
report/...  because it is already quite expensive. That is why storing
the first sample time and last sample time in perf record is better.

This patch creates a new header feature type HEADER_SAMPLE_TIME and
related ops. Save the first sample time and the last sample time to the
feature section in perf file header. That will be done when, for
instance, processing build-ids, where we already have to process all
samples to create the build-id table, take advantage of that to further
amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf
report/script' faster when using --time.

Committer testing:

After this patch is applied the header is written with zeroes, we need
the next patch, for "perf record" to actually write the timestamps:

  # perf report -D | grep PERF_RECORD_SAMPLE\(
  22501155244406 0x44f0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 
0xa21be8c5 period: 1 addr: 0
  
  22501155793625 0x4a30 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 
0xa21ffd50 period: 2828043 addr: 0
  # perf report --header | grep "time of "
  # time of first sample : 0.00
  # time of last sample : 0.00
  #

Changelog:

v7: 1. Rebase to latest perf/core branch.

2. Add following clarification in patch description according to
   Arnaldo's suggestion.

   "That will be done when, for instance, processing build-ids,
where we already have to process all samples to create the
build-id table, take advantage of that to further amortize
that processing by storing HEADER_SAMPLE_TIME to make
'perf report/script' faster when using --time."

v4: Use perf script time style for timestamp printing. Also add with
the printing of sample duration.

v3: Remove the definitions of first_sample_time/last_sample_time from
perf_session. Just define them in perf_evlist

Signed-off-by: Jin Yao <yao@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1512738826-2628-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/Documentation/perf.data-file-format.txt |  4 ++
 tools/perf/util/evlist.h   |  2 +
 tools/perf/util/header.c   | 60 ++
 tools/perf/util/header.h   |  1 +
 4 files changed, 67 insertions(+)

diff --git a/tools/perf/Documentation/perf.data-file-format.txt 
b/tools/perf/Documentation/perf.data-file-format.txt
index 15e8b48..f7d85e8 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -261,6 +261,10 @@ struct {
struct perf_header_string map;
 }[number_of_cache_levels];
 
+   HEADER_SAMPLE_TIME = 21,
+
+Two uint64_t for the time of first sample and the time of last sample.
+
other bits are reserved and should ignored for now
HEADER_FEAT_BITS= 256,
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7516066..e7fbca6 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -50,6 +50,8 @@ struct perf_evlist {
struct perf_evsel *selected;
struct events_stats stats;
struct perf_env *env;
+   u64 first_sample_time;
+   u64 last_sample_time;
 };
 
 struct perf_evsel_str_handler {
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index ca73aa7..a326e0d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include &

[tip:perf/core] perf record: Record the first and last sample time in the header

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  68588baf8d01826673f2874f434123029e519052
Gitweb: https://git.kernel.org/tip/68588baf8d01826673f2874f434123029e519052
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Fri, 8 Dec 2017 21:13:42 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Mon, 8 Jan 2018 11:20:56 -0300

perf record: Record the first and last sample time in the header

In the default 'perf record' configuration, all samples are processed,
to create the HEADER_BUILD_ID table. So it's very easy to get the
first/last samples and save the time to perf file header via the
function write_sample_time().

Later, at post processing time, perf report/script will fetch the time
from perf file header.

Committer testing:

  # perf record -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ]
  [root@jouet home]# perf report --header | grep "time of "
  # time of first sample : 22947.909226
  # time of last sample : 22948.910704
  #
  # perf report -D | grep PERF_RECORD_SAMPLE\(
  0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 
0xa21b1af3 period: 1 addr: 0
  0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 
0xa200d204 period: 1 addr: 0
  
  3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 
28251/28251: 0xa22071d8 period: 169518 addr: 0
  0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 
0xa2856816 period: 198807 addr: 0
  2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 
0xa2856816 period: 88111 addr: 0
  #

Changelog:

v7: Just update the patch description according to Arnaldo's suggestion.

v6: Currently '--buildid-all' is not enabled at default. So the walking
on all samples is the default operation. There is no big overhead
to calculate the timestamp boundary in process_sample_event handler
once we already go through all samples. So the timestamp boundary
calculation is enabled by default when '--buildid-all' is not enabled.

While if '--buildid-all' is enabled, we creates a new option
"--timestamp-boundary" for user to decide if it enables the
timestamp boundary calculation.

v5: There is an issue that the sample walking can only work when
'--buildid-all' is not enabled. So we need to let the walking
be able to work even if '--buildid-all' is enabled and let the
processing skips the dso hit marking for this case.

At first, I want to provide a new option "--record-time-boundaries".
While after consideration, I think a new option is not very
necessary.

v3: Remove the definitions of first_sample_time and last_sample_time
from struct record and directly save them in perf_evlist.

Signed-off-by: Jin Yao <yao@linux.intel.com>
Acked-by: Jiri Olsa <jo...@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/Documentation/perf-record.txt |  3 +++
 tools/perf/builtin-record.c  | 18 +++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt 
b/tools/perf/Documentation/perf-record.txt
index 5a626ef..3eea6de 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -430,6 +430,9 @@ Configure all used events to run in user space.
 --timestamp-filename
 Append timestamp to output file name.
 
+--timestamp-boundary::
+Record timestamp boundary (time of first/last samples).
+
 --switch-output[=mode]::
 Generate multiple perf.data files, timestamp prefixed, switching to a new one
 based on 'mode' value:
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 50385d8..65681a1 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -78,6 +78,7 @@ struct record {
boolno_buildid_cache_set;
boolbuildid_all;
booltimestamp_filename;
+   booltimestamp_boundary;
struct switch_outputswitch_output;
unsigned long long  samples;
 };
@@ -409,8 +410,15 @@ static int process_sample_event(struct perf_tool *tool,
 {
struct record *rec = container_of(tool, struct record, tool);
 
-   rec->samples++;
+   if (rec->evlist->first_sample_time == 0)
+   rec->evlist->first_sample_time = sample->time;
+
+   rec->evlist->last_sample_time = sample->time;
 
+   if (rec->bu

[tip:perf/core] perf header: Add infrastructure to record first and last sample time

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  6011518db3bd04c80cd3ce3e6aea1c399739adb4
Gitweb: https://git.kernel.org/tip/6011518db3bd04c80cd3ce3e6aea1c399739adb4
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:41 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:20:51 -0300

perf header: Add infrastructure to record first and last sample time

perf report/script/... have a --time option to limit the time range of
output. That's very useful to slice large traces, e.g. when processing
the output of perf script for some analysis.

But right now --time only supports absolute time. Also there is no fast
way to get the start/end times of a given trace except for looking at
it.  This makes it hard to e.g. only decode the first half of the trace,
which is useful for parallelization of scripts

Another problem is that perf records are variable size and there is no
synchronization mechanism. So the only way to find the last sample
reliably would be to walk all samples. But we want to avoid that in perf
report/...  because it is already quite expensive. That is why storing
the first sample time and last sample time in perf record is better.

This patch creates a new header feature type HEADER_SAMPLE_TIME and
related ops. Save the first sample time and the last sample time to the
feature section in perf file header. That will be done when, for
instance, processing build-ids, where we already have to process all
samples to create the build-id table, take advantage of that to further
amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf
report/script' faster when using --time.

Committer testing:

After this patch is applied the header is written with zeroes, we need
the next patch, for "perf record" to actually write the timestamps:

  # perf report -D | grep PERF_RECORD_SAMPLE\(
  22501155244406 0x44f0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 
0xa21be8c5 period: 1 addr: 0
  
  22501155793625 0x4a30 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 
0xa21ffd50 period: 2828043 addr: 0
  # perf report --header | grep "time of "
  # time of first sample : 0.00
  # time of last sample : 0.00
  #

Changelog:

v7: 1. Rebase to latest perf/core branch.

2. Add following clarification in patch description according to
   Arnaldo's suggestion.

   "That will be done when, for instance, processing build-ids,
where we already have to process all samples to create the
build-id table, take advantage of that to further amortize
that processing by storing HEADER_SAMPLE_TIME to make
'perf report/script' faster when using --time."

v4: Use perf script time style for timestamp printing. Also add with
the printing of sample duration.

v3: Remove the definitions of first_sample_time/last_sample_time from
perf_session. Just define them in perf_evlist

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1512738826-2628-2-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf.data-file-format.txt |  4 ++
 tools/perf/util/evlist.h   |  2 +
 tools/perf/util/header.c   | 60 ++
 tools/perf/util/header.h   |  1 +
 4 files changed, 67 insertions(+)

diff --git a/tools/perf/Documentation/perf.data-file-format.txt 
b/tools/perf/Documentation/perf.data-file-format.txt
index 15e8b48..f7d85e8 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -261,6 +261,10 @@ struct {
struct perf_header_string map;
 }[number_of_cache_levels];
 
+   HEADER_SAMPLE_TIME = 21,
+
+Two uint64_t for the time of first sample and the time of last sample.
+
other bits are reserved and should ignored for now
HEADER_FEAT_BITS= 256,
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 7516066..e7fbca6 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -50,6 +50,8 @@ struct perf_evlist {
struct perf_evsel *selected;
struct events_stats stats;
struct perf_env *env;
+   u64 first_sample_time;
+   u64 last_sample_time;
 };
 
 struct perf_evsel_str_handler {
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index ca73aa7..a326e0d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "evlist.h"
 #include "evsel.h"
@@ -35,6 +36,7 @@
 #include 
 #include "asm/bug.h"
 #include "tool.h"
+#include "time-utils.h"
 
 #include "sane_ctype.h"
 
@@ -1180,6 +1182,20 @@ static int write_stat(struct feat_fd *

[tip:perf/core] perf record: Record the first and last sample time in the header

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  68588baf8d01826673f2874f434123029e519052
Gitweb: https://git.kernel.org/tip/68588baf8d01826673f2874f434123029e519052
Author: Jin Yao 
AuthorDate: Fri, 8 Dec 2017 21:13:42 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:20:56 -0300

perf record: Record the first and last sample time in the header

In the default 'perf record' configuration, all samples are processed,
to create the HEADER_BUILD_ID table. So it's very easy to get the
first/last samples and save the time to perf file header via the
function write_sample_time().

Later, at post processing time, perf report/script will fetch the time
from perf file header.

Committer testing:

  # perf record -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ]
  [root@jouet home]# perf report --header | grep "time of "
  # time of first sample : 22947.909226
  # time of last sample : 22948.910704
  #
  # perf report -D | grep PERF_RECORD_SAMPLE\(
  0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 
0xa21b1af3 period: 1 addr: 0
  0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 
0xa200d204 period: 1 addr: 0
  
  3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 
28251/28251: 0xa22071d8 period: 169518 addr: 0
  0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 
0xa2856816 period: 198807 addr: 0
  2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 
0xa2856816 period: 88111 addr: 0
  #

Changelog:

v7: Just update the patch description according to Arnaldo's suggestion.

v6: Currently '--buildid-all' is not enabled at default. So the walking
on all samples is the default operation. There is no big overhead
to calculate the timestamp boundary in process_sample_event handler
once we already go through all samples. So the timestamp boundary
calculation is enabled by default when '--buildid-all' is not enabled.

While if '--buildid-all' is enabled, we creates a new option
"--timestamp-boundary" for user to decide if it enables the
timestamp boundary calculation.

v5: There is an issue that the sample walking can only work when
'--buildid-all' is not enabled. So we need to let the walking
be able to work even if '--buildid-all' is enabled and let the
processing skips the dso hit marking for this case.

At first, I want to provide a new option "--record-time-boundaries".
While after consideration, I think a new option is not very
necessary.

v3: Remove the definitions of first_sample_time and last_sample_time
from struct record and directly save them in perf_evlist.

Signed-off-by: Jin Yao 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf-record.txt |  3 +++
 tools/perf/builtin-record.c  | 18 +++---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt 
b/tools/perf/Documentation/perf-record.txt
index 5a626ef..3eea6de 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -430,6 +430,9 @@ Configure all used events to run in user space.
 --timestamp-filename
 Append timestamp to output file name.
 
+--timestamp-boundary::
+Record timestamp boundary (time of first/last samples).
+
 --switch-output[=mode]::
 Generate multiple perf.data files, timestamp prefixed, switching to a new one
 based on 'mode' value:
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 50385d8..65681a1 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -78,6 +78,7 @@ struct record {
boolno_buildid_cache_set;
boolbuildid_all;
booltimestamp_filename;
+   booltimestamp_boundary;
struct switch_outputswitch_output;
unsigned long long  samples;
 };
@@ -409,8 +410,15 @@ static int process_sample_event(struct perf_tool *tool,
 {
struct record *rec = container_of(tool, struct record, tool);
 
-   rec->samples++;
+   if (rec->evlist->first_sample_time == 0)
+   rec->evlist->first_sample_time = sample->time;
+
+   rec->evlist->last_sample_time = sample->time;
 
+   if (rec->buildid_all)
+   return 0;
+
+   rec->samples++;
return build_id__mark_dso_hit(tool, event, sample, evsel, machine);
 }
 
@@ -435,9 +443,11 @@ static int process_buildids(struct record *rec)
 
/*
 * If --buildid-all is given, it marks all D

[tip:perf/core] perf report: Fix a wrong offset issue when using /proc/kcore

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  935f5a9d4500020879858c9224c98dfabf16101d
Gitweb: https://git.kernel.org/tip/935f5a9d4500020879858c9224c98dfabf16101d
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Sat, 30 Dec 2017 00:26:52 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Mon, 8 Jan 2018 11:11:57 -0300

perf report: Fix a wrong offset issue when using /proc/kcore

When a valid vmlinux is not found, 'perf report' falls back to look at
/proc/kcore. In this case, it will report the impossible large offset.

For example:

  # perf record -b -e cycles:k find /etc/ > /dev/null
  # perf report --stdio --branch-history

22.77%  _vm_normal_page+18446603336221188162
|
---page_remove_rmap +18446603336221188324
   page_remove_rmap +18446603336221188487 (cycles:5)
   unlock_page_memcg +18446603336221188096
   page_remove_rmap +18446603336221188327 (cycles:1)

The issue is the value which is passed to parameter 'addr' in
__get_srcline() is the objdump address. It's not correct if we calculate
the offset by using 'addr - sym->start'.

This patch creates a new parameter 'ip' in __get_srcline(). It is not
converted to objdump address.

With this patch, the perf report output is:

22.77%  _vm_normal_page+66
|
---page_remove_rmap +228
   page_remove_rmap +391 (cycles:5)
   unlock_page_memcg +0
   page_remove_rmap +231 (cycles:1)
   page_remove_rmap +236

Committer testing:

Make sure you get any valid vmlinux out of the way, using '-v' on the
'perf report' case and deleting it from places where perf searches them,
like your kernel build dir and the build-id cache, in ~/.debug/.

Reported-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Signed-off-by: Jin Yao <yao@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1514564812-17344-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/util/annotate.c |  3 ++-
 tools/perf/util/machine.c  |  2 +-
 tools/perf/util/map.c  |  2 +-
 tools/perf/util/sort.c | 16 ++--
 tools/perf/util/srcline.c  |  9 +
 tools/perf/util/srcline.h  |  5 +++--
 6 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 68e687d..28b233c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1960,7 +1960,8 @@ static void annotation__calc_lines(struct annotation 
*notes, struct map *map,
if (percent_max <= 0.5)
continue;
 
-   al->path = get_srcline(map->dso, start + al->offset, NULL, 
false, true);
+   al->path = get_srcline(map->dso, start + al->offset, NULL,
+  false, true, start + al->offset);
insert_source_line(_root, al);
}
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 64d255f..b05a674 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1726,7 +1726,7 @@ static char *callchain_srcline(struct map *map, struct 
symbol *sym, u64 ip)
bool show_addr = callchain_param.key == CCKEY_ADDRESS;
 
srcline = get_srcline(map->dso, map__rip_2objdump(map, ip),
- sym, show_sym, show_addr);
+ sym, show_sym, show_addr, ip);
srcline__tree_insert(>dso->srclines, ip, srcline);
}
 
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 6d40efd..8fe5703 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -419,7 +419,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const 
char *prefix,
if (map && map->dso) {
srcline = get_srcline(map->dso,
  map__rip_2objdump(map, addr), NULL,
- true, true);
+ true, true, addr);
if (srcline != SRCLINE_UNKNOWN)
ret = fprintf(fp, "%s%s", prefix, srcline);
free_srcline(srcline);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index a00eacd..211e7f3 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -336,7 +336,7 @@ char *hist_entry__get_srcline(struct hist_entry *he)
return SRCLINE_UNKNOWN;
 
return get_srcline(map->dso, map__rip_2objdump(map, he->ip),
-  he->ms.sym, true,

[tip:perf/core] perf report: Fix a no annotate browser displayed issue

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  40c39e3046411f84bab82f66783ff3593e2bcd9b
Gitweb: https://git.kernel.org/tip/40c39e3046411f84bab82f66783ff3593e2bcd9b
Author: Jin Yao <yao@linux.intel.com>
AuthorDate: Tue, 26 Dec 2017 18:42:43 +0800
Committer:  Arnaldo Carvalho de Melo <a...@redhat.com>
CommitDate: Mon, 8 Jan 2018 11:11:57 -0300

perf report: Fix a no annotate browser displayed issue

When enabling '-b' option in perf record, for example,

  perf record -b ...
  perf report

and then browsing the annotate browser from perf report (press 'A'), it
would fail (annotate browser can't be displayed).

It's because the '.add_entry_cb' op of struct report is overwritten by
hist_iter__branch_callback() in builtin-report.c. But this function doesn't do
something like mapping symbols and sources. So next, do_annotate() will return
directly.

notes = symbol__annotation(act->ms.sym);
if (!notes->src)
return 0;

This patch adds the lost code to hist_iter__branch_callback (refer to
hist_iter__report_callback).

v2:

Fix a crash bug when perform 'perf report --stdio'.

The reason is that we init the symbol annotation only in browser mode, it
doesn't allocate/init resources for stdio mode.

So now in hist_iter__branch_callback(), it will return directly if it's not in
browser mode.

Signed-off-by: Jin Yao <yao@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <a...@redhat.com>
Cc: Alexander Shishkin <alexander.shish...@linux.intel.com>
Cc: Andi Kleen <a...@linux.intel.com>
Cc: Jiri Olsa <jo...@kernel.org>
Cc: Kan Liang <kan.li...@intel.com>
Cc: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1514284963-18587-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
---
 tools/perf/builtin-report.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index eb9ce63..07827cd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -162,12 +162,28 @@ static int hist_iter__branch_callback(struct 
hist_entry_iter *iter,
struct hist_entry *he = iter->he;
struct report *rep = arg;
struct branch_info *bi;
+   struct perf_sample *sample = iter->sample;
+   struct perf_evsel *evsel = iter->evsel;
+   int err;
+
+   if (!ui__has_annotation())
+   return 0;
+
+   hist__account_cycles(sample->branch_stack, al, sample,
+rep->nonany_branch_mode);
 
bi = he->branch_info;
+   err = addr_map_symbol__inc_samples(>from, sample, evsel->idx);
+   if (err)
+   goto out;
+
+   err = addr_map_symbol__inc_samples(>to, sample, evsel->idx);
+
branch_type_count(>brtype_stat, >flags,
  bi->from.addr, bi->to.addr);
 
-   return 0;
+out:
+   return err;
 }
 
 static int process_sample_event(struct perf_tool *tool,


[tip:perf/core] perf report: Fix a no annotate browser displayed issue

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  40c39e3046411f84bab82f66783ff3593e2bcd9b
Gitweb: https://git.kernel.org/tip/40c39e3046411f84bab82f66783ff3593e2bcd9b
Author: Jin Yao 
AuthorDate: Tue, 26 Dec 2017 18:42:43 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:11:57 -0300

perf report: Fix a no annotate browser displayed issue

When enabling '-b' option in perf record, for example,

  perf record -b ...
  perf report

and then browsing the annotate browser from perf report (press 'A'), it
would fail (annotate browser can't be displayed).

It's because the '.add_entry_cb' op of struct report is overwritten by
hist_iter__branch_callback() in builtin-report.c. But this function doesn't do
something like mapping symbols and sources. So next, do_annotate() will return
directly.

notes = symbol__annotation(act->ms.sym);
if (!notes->src)
return 0;

This patch adds the lost code to hist_iter__branch_callback (refer to
hist_iter__report_callback).

v2:

Fix a crash bug when perform 'perf report --stdio'.

The reason is that we init the symbol annotation only in browser mode, it
doesn't allocate/init resources for stdio mode.

So now in hist_iter__branch_callback(), it will return directly if it's not in
browser mode.

Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1514284963-18587-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-report.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index eb9ce63..07827cd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -162,12 +162,28 @@ static int hist_iter__branch_callback(struct 
hist_entry_iter *iter,
struct hist_entry *he = iter->he;
struct report *rep = arg;
struct branch_info *bi;
+   struct perf_sample *sample = iter->sample;
+   struct perf_evsel *evsel = iter->evsel;
+   int err;
+
+   if (!ui__has_annotation())
+   return 0;
+
+   hist__account_cycles(sample->branch_stack, al, sample,
+rep->nonany_branch_mode);
 
bi = he->branch_info;
+   err = addr_map_symbol__inc_samples(>from, sample, evsel->idx);
+   if (err)
+   goto out;
+
+   err = addr_map_symbol__inc_samples(>to, sample, evsel->idx);
+
branch_type_count(>brtype_stat, >flags,
  bi->from.addr, bi->to.addr);
 
-   return 0;
+out:
+   return err;
 }
 
 static int process_sample_event(struct perf_tool *tool,


[tip:perf/core] perf report: Fix a wrong offset issue when using /proc/kcore

2018-01-10 Thread tip-bot for Jin Yao
Commit-ID:  935f5a9d4500020879858c9224c98dfabf16101d
Gitweb: https://git.kernel.org/tip/935f5a9d4500020879858c9224c98dfabf16101d
Author: Jin Yao 
AuthorDate: Sat, 30 Dec 2017 00:26:52 +0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 8 Jan 2018 11:11:57 -0300

perf report: Fix a wrong offset issue when using /proc/kcore

When a valid vmlinux is not found, 'perf report' falls back to look at
/proc/kcore. In this case, it will report the impossible large offset.

For example:

  # perf record -b -e cycles:k find /etc/ > /dev/null
  # perf report --stdio --branch-history

22.77%  _vm_normal_page+18446603336221188162
|
---page_remove_rmap +18446603336221188324
   page_remove_rmap +18446603336221188487 (cycles:5)
   unlock_page_memcg +18446603336221188096
   page_remove_rmap +18446603336221188327 (cycles:1)

The issue is the value which is passed to parameter 'addr' in
__get_srcline() is the objdump address. It's not correct if we calculate
the offset by using 'addr - sym->start'.

This patch creates a new parameter 'ip' in __get_srcline(). It is not
converted to objdump address.

With this patch, the perf report output is:

22.77%  _vm_normal_page+66
|
---page_remove_rmap +228
   page_remove_rmap +391 (cycles:5)
   unlock_page_memcg +0
   page_remove_rmap +231 (cycles:1)
   page_remove_rmap +236

Committer testing:

Make sure you get any valid vmlinux out of the way, using '-v' on the
'perf report' case and deleting it from places where perf searches them,
like your kernel build dir and the build-id cache, in ~/.debug/.

Reported-by: Arnaldo Carvalho de Melo 
Signed-off-by: Jin Yao 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Kan Liang 
Cc: Peter Zijlstra 
Link: 
http://lkml.kernel.org/r/1514564812-17344-1-git-send-email-yao@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/annotate.c |  3 ++-
 tools/perf/util/machine.c  |  2 +-
 tools/perf/util/map.c  |  2 +-
 tools/perf/util/sort.c | 16 ++--
 tools/perf/util/srcline.c  |  9 +
 tools/perf/util/srcline.h  |  5 +++--
 6 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 68e687d..28b233c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1960,7 +1960,8 @@ static void annotation__calc_lines(struct annotation 
*notes, struct map *map,
if (percent_max <= 0.5)
continue;
 
-   al->path = get_srcline(map->dso, start + al->offset, NULL, 
false, true);
+   al->path = get_srcline(map->dso, start + al->offset, NULL,
+  false, true, start + al->offset);
insert_source_line(_root, al);
}
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 64d255f..b05a674 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1726,7 +1726,7 @@ static char *callchain_srcline(struct map *map, struct 
symbol *sym, u64 ip)
bool show_addr = callchain_param.key == CCKEY_ADDRESS;
 
srcline = get_srcline(map->dso, map__rip_2objdump(map, ip),
- sym, show_sym, show_addr);
+ sym, show_sym, show_addr, ip);
srcline__tree_insert(>dso->srclines, ip, srcline);
}
 
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 6d40efd..8fe5703 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -419,7 +419,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const 
char *prefix,
if (map && map->dso) {
srcline = get_srcline(map->dso,
  map__rip_2objdump(map, addr), NULL,
- true, true);
+ true, true, addr);
if (srcline != SRCLINE_UNKNOWN)
ret = fprintf(fp, "%s%s", prefix, srcline);
free_srcline(srcline);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index a00eacd..211e7f3 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -336,7 +336,7 @@ char *hist_entry__get_srcline(struct hist_entry *he)
return SRCLINE_UNKNOWN;
 
return get_srcline(map->dso, map__rip_2objdump(map, he->ip),
-  he->ms.sym, true, true);
+  he->ms.sym, true, true, he->ip);
 }
 
 static int64_t
@@ -380,7 +380,8 @@ sort__srcline_from_cmp(struct hist_entry *left, struct 
hist_entry *right)
   map__rip_2objdump(map,
   

[PATCH] perf util: Display warning when perf report/annotate is missing some libs

2018-01-10 Thread Jin Yao
We keep having bug reports that when users build perf on their own,
but they don't install some needed libraries like libelf, libbfd/libibery.

The perf can build, but it is missing important functionality.

For example, perf report doesn't display any symbols and perf annotate
doesn't work.

This patch displays warnings that these libraries are missing in build
when perf report / perf annotate are used.

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/builtin-annotate.c |  2 ++
 tools/perf/builtin-report.c   |  2 ++
 tools/perf/util/symbol.c  | 21 +
 tools/perf/util/symbol.h  |  2 ++
 4 files changed, 27 insertions(+)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index f15731a..609eb7a 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -511,6 +511,8 @@ int cmd_annotate(int argc, const char **argv)
 
setup_browser(true);
 
+   build_lib_warning();
+
ret = __cmd_annotate();
 
 out_delete:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index dd4df9a..550adb7 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1293,6 +1293,8 @@ int cmd_report(int argc, const char **argv)
}
}
 
+   build_lib_warning();
+
if (symbol__init(>header.env) < 0)
goto error;
 
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index cc065d4..4205ba8 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2224,3 +2224,24 @@ int symbol__config_symfs(const struct option *opt 
__maybe_unused,
free(bf);
return 0;
 }
+
+void build_lib_warning(void)
+{
+#ifndef HAVE_LIBELF_SUPPORT
+   pr_warning("Symbols are disabled!\n"
+  "Please install libelf-dev, libelf-devel "
+  "or elfutils-libelf-devel before building perf.\n");
+#endif
+
+#ifndef HAVE_DWARF_SUPPORT
+   pr_warning("Unwind support is disabled!\n"
+  "Please install elfutils-devel/libdw-dev "
+  "before building perf.\n");
+#endif
+
+#ifndef HAVE_LIBBFD_SUPPORT
+   pr_warning("C++ demangling and line numbers are disabled!\n"
+  "Please install binutils-dev[el]/"
+  "libiberty-dev before building perf.\n");
+#endif
+}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 0563f33..d88e046 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -383,6 +383,8 @@ int get_sdt_note_list(struct list_head *head, const char 
*target);
 int cleanup_sdt_note_list(struct list_head *sdt_notes);
 int sdt_notes__get_count(struct list_head *start);
 
+void build_lib_warning(void);
+
 #define SDT_BASE_SCN ".stapsdt.base"
 #define SDT_NOTE_SCN  ".note.stapsdt"
 #define SDT_NOTE_TYPE 3
-- 
2.7.4



[PATCH] perf util: Display warning when perf report/annotate is missing some libs

2018-01-10 Thread Jin Yao
We keep having bug reports that when users build perf on their own,
but they don't install some needed libraries like libelf, libbfd/libibery.

The perf can build, but it is missing important functionality.

For example, perf report doesn't display any symbols and perf annotate
doesn't work.

This patch displays warnings that these libraries are missing in build
when perf report / perf annotate are used.

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-annotate.c |  2 ++
 tools/perf/builtin-report.c   |  2 ++
 tools/perf/util/symbol.c  | 21 +
 tools/perf/util/symbol.h  |  2 ++
 4 files changed, 27 insertions(+)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index f15731a..609eb7a 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -511,6 +511,8 @@ int cmd_annotate(int argc, const char **argv)
 
setup_browser(true);
 
+   build_lib_warning();
+
ret = __cmd_annotate();
 
 out_delete:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index dd4df9a..550adb7 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1293,6 +1293,8 @@ int cmd_report(int argc, const char **argv)
}
}
 
+   build_lib_warning();
+
if (symbol__init(>header.env) < 0)
goto error;
 
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index cc065d4..4205ba8 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -2224,3 +2224,24 @@ int symbol__config_symfs(const struct option *opt 
__maybe_unused,
free(bf);
return 0;
 }
+
+void build_lib_warning(void)
+{
+#ifndef HAVE_LIBELF_SUPPORT
+   pr_warning("Symbols are disabled!\n"
+  "Please install libelf-dev, libelf-devel "
+  "or elfutils-libelf-devel before building perf.\n");
+#endif
+
+#ifndef HAVE_DWARF_SUPPORT
+   pr_warning("Unwind support is disabled!\n"
+  "Please install elfutils-devel/libdw-dev "
+  "before building perf.\n");
+#endif
+
+#ifndef HAVE_LIBBFD_SUPPORT
+   pr_warning("C++ demangling and line numbers are disabled!\n"
+  "Please install binutils-dev[el]/"
+  "libiberty-dev before building perf.\n");
+#endif
+}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 0563f33..d88e046 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -383,6 +383,8 @@ int get_sdt_note_list(struct list_head *head, const char 
*target);
 int cleanup_sdt_note_list(struct list_head *sdt_notes);
 int sdt_notes__get_count(struct list_head *start);
 
+void build_lib_warning(void);
+
 #define SDT_BASE_SCN ".stapsdt.base"
 #define SDT_NOTE_SCN  ".note.stapsdt"
 #define SDT_NOTE_TYPE 3
-- 
2.7.4



[PATCH v1 6/8] perf util: Allocate time slices buffer according to number of comma

2018-01-09 Thread Jin Yao
Previously we use a magic number 10 to limit the number of
time slices. It's not very good.

This patch creates a new function perf_time__range_alloc()
to allocate time slices buffer. The number of buffer entries is
determined by the number of comma in string but at least it will
allocate one entry even if no comma is found.

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/util/time-utils.c | 28 
 tools/perf/util/time-utils.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 5769f97..6193b46 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -325,6 +325,34 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
return -1;
 }
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size)
+{
+   const char *p1, *p2;
+   int i = 1;
+   struct perf_time_interval *ptime;
+
+   /*
+* At least allocate one time range.
+*/
+   if (!ostr)
+   goto alloc;
+
+   p1 = ostr;
+   while (p1 < ostr + strlen(ostr)) {
+   p2 = strchr(p1, ',');
+   if (!p2)
+   break;
+
+   p1 = p2 + 1;
+   i++;
+   }
+
+alloc:
+   *size = i;
+   ptime = calloc(i, sizeof(*ptime));
+   return ptime;
+}
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
 {
/* if time is not set don't drop sample */
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 34d5eba..70b177d 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -16,6 +16,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr);
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end);
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size);
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
 bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
-- 
2.7.4



[PATCH v1 6/8] perf util: Allocate time slices buffer according to number of comma

2018-01-09 Thread Jin Yao
Previously we use a magic number 10 to limit the number of
time slices. It's not very good.

This patch creates a new function perf_time__range_alloc()
to allocate time slices buffer. The number of buffer entries is
determined by the number of comma in string but at least it will
allocate one entry even if no comma is found.

Signed-off-by: Jin Yao 
---
 tools/perf/util/time-utils.c | 28 
 tools/perf/util/time-utils.h |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 5769f97..6193b46 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -325,6 +325,34 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
return -1;
 }
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size)
+{
+   const char *p1, *p2;
+   int i = 1;
+   struct perf_time_interval *ptime;
+
+   /*
+* At least allocate one time range.
+*/
+   if (!ostr)
+   goto alloc;
+
+   p1 = ostr;
+   while (p1 < ostr + strlen(ostr)) {
+   p2 = strchr(p1, ',');
+   if (!p2)
+   break;
+
+   p1 = p2 + 1;
+   i++;
+   }
+
+alloc:
+   *size = i;
+   ptime = calloc(i, sizeof(*ptime));
+   return ptime;
+}
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
 {
/* if time is not set don't drop sample */
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 34d5eba..70b177d 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -16,6 +16,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr);
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end);
 
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size);
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
 bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
-- 
2.7.4



[PATCH v1 4/8] perf util: Support no index time percent slice

2018-01-09 Thread Jin Yao
Previously, the time percent slice needs an index to specify
which one the user wants.

While it may be easy for using if the index can be omitted.
So with this patch, for example,

perf report --stdio --time 10%/1 should be equivalent to
perf report --stdio --time 10%

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/util/time-utils.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 88510ab..5769f97 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -261,6 +261,37 @@ static int percent_comma_split(struct perf_time_interval 
*ptime_buf, int num,
return i;
 }
 
+static int one_percent_convert(struct perf_time_interval *ptime_buf,
+  const char *ostr, u64 start, u64 end, char *c)
+{
+   char *str;
+   int len = strlen(ostr), ret;
+
+   /*
+* c points to '%'.
+* '%' should be the last character
+*/
+   if (ostr + len - 1 != c)
+   return -1;
+
+   /*
+* Construct a string like "xx%/1"
+*/
+   str = malloc(len + 3);
+   if (str == NULL)
+   return -ENOMEM;
+
+   memcpy(str, ostr, len);
+   strcpy(str + len, "/1");
+
+   ret = percent_slash_split(str, ptime_buf, start, end);
+   if (ret == 0)
+   ret = 1;
+
+   free(str);
+   return ret;
+}
+
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end)
 {
@@ -270,6 +301,7 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
 * ostr example:
 * 10%/2,10%/3: select the second 10% slice and the third 10% slice
 * 0%-10%,30%-40%: multiple time range
+* 50%: just one percent
 */
 
memset(ptime_buf, 0, sizeof(*ptime_buf) * num);
@@ -286,6 +318,10 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
   end, percent_dash_split);
}
 
+   c = strchr(ostr, '%');
+   if (c)
+   return one_percent_convert(ptime_buf, ostr, start, end, c);
+
return -1;
 }
 
-- 
2.7.4



[PATCH v1 2/8] perf script: Improve error msg when no first/last sample time found

2018-01-09 Thread Jin Yao
The following message will be returned to user when executing
'perf script --time' if perf data file doesn't contain the
first/last sample time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/builtin-script.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c1cce47..4f691af 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3448,7 +3448,9 @@ int cmd_script(int argc, const char **argv)
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf 
data.\n"
+  "Please use latest perf binary to execute 'perf 
record'\n"
+  "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
err = -EINVAL;
goto out_delete;
}
-- 
2.7.4



[PATCH v1 4/8] perf util: Support no index time percent slice

2018-01-09 Thread Jin Yao
Previously, the time percent slice needs an index to specify
which one the user wants.

While it may be easy for using if the index can be omitted.
So with this patch, for example,

perf report --stdio --time 10%/1 should be equivalent to
perf report --stdio --time 10%

Signed-off-by: Jin Yao 
---
 tools/perf/util/time-utils.c | 36 
 1 file changed, 36 insertions(+)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 88510ab..5769f97 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -261,6 +261,37 @@ static int percent_comma_split(struct perf_time_interval 
*ptime_buf, int num,
return i;
 }
 
+static int one_percent_convert(struct perf_time_interval *ptime_buf,
+  const char *ostr, u64 start, u64 end, char *c)
+{
+   char *str;
+   int len = strlen(ostr), ret;
+
+   /*
+* c points to '%'.
+* '%' should be the last character
+*/
+   if (ostr + len - 1 != c)
+   return -1;
+
+   /*
+* Construct a string like "xx%/1"
+*/
+   str = malloc(len + 3);
+   if (str == NULL)
+   return -ENOMEM;
+
+   memcpy(str, ostr, len);
+   strcpy(str + len, "/1");
+
+   ret = percent_slash_split(str, ptime_buf, start, end);
+   if (ret == 0)
+   ret = 1;
+
+   free(str);
+   return ret;
+}
+
 int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
 const char *ostr, u64 start, u64 end)
 {
@@ -270,6 +301,7 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
 * ostr example:
 * 10%/2,10%/3: select the second 10% slice and the third 10% slice
 * 0%-10%,30%-40%: multiple time range
+* 50%: just one percent
 */
 
memset(ptime_buf, 0, sizeof(*ptime_buf) * num);
@@ -286,6 +318,10 @@ int perf_time__percent_parse_str(struct perf_time_interval 
*ptime_buf, int num,
   end, percent_dash_split);
}
 
+   c = strchr(ostr, '%');
+   if (c)
+   return one_percent_convert(ptime_buf, ostr, start, end, c);
+
return -1;
 }
 
-- 
2.7.4



[PATCH v1 2/8] perf script: Improve error msg when no first/last sample time found

2018-01-09 Thread Jin Yao
The following message will be returned to user when executing
'perf script --time' if perf data file doesn't contain the
first/last sample time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-script.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c1cce47..4f691af 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3448,7 +3448,9 @@ int cmd_script(int argc, const char **argv)
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf 
data.\n"
+  "Please use latest perf binary to execute 'perf 
record'\n"
+  "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
err = -EINVAL;
goto out_delete;
}
-- 
2.7.4



[PATCH v1 8/8] perf script: Remove the time slices number limitation

2018-01-09 Thread Jin Yao
Previously it was only allowed to use at most 10 time slices
in 'perf script --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf script --time 
1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/Documentation/perf-script.txt | 10 +-
 tools/perf/builtin-script.c  | 17 +
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt 
b/tools/perf/Documentation/perf-script.txt
index 806ec63..7730c1d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -351,19 +351,19 @@ include::itrace.txt[]
to end of file.
 
Also support time percent with multipe time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
-   Select the second 10% time slice
+   Select the second 10% time slice:
perf script --time 10%/2
 
-   Select from 0% to 10% time slice
+   Select from 0% to 10% time slice:
perf script --time 0%-10%
 
-   Select the first and second 10% time slices
+   Select the first and second 10% time slices:
perf script --time 10%/1,10%/2
 
-   Select from 0% to 10% and 30% to 40% slices
+   Select from 0% to 10% and 30% to 40% slices:
perf script --time 0%-10%,30%-40%
 
 --max-blocks::
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4f691af..f116a31 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1480,8 +1480,6 @@ static int perf_sample__fprintf_synth(struct perf_sample 
*sample,
return 0;
 }
 
-#define PTIME_RANGE_MAX10
-
 struct perf_script {
struct perf_tooltool;
struct perf_session *session;
@@ -1496,7 +1494,8 @@ struct perf_script {
struct thread_map   *threads;
int name_width;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
 };
 
@@ -3444,6 +3443,13 @@ int cmd_script(int argc, const char **argv)
if (err < 0)
goto out_delete;
 
+   script.ptime_range = perf_time__range_alloc(script.time_str,
+   _size);
+   if (!script.ptime_range) {
+   err = -ENOMEM;
+   goto out_delete;
+   }
+
/* needs to be parsed after looking up reference time */
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
@@ -3456,7 +3462,7 @@ int cmd_script(int argc, const char **argv)
}
 
script.range_num = perf_time__percent_parse_str(
-   script.ptime_range, PTIME_RANGE_MAX,
+   script.ptime_range, script.range_size,
script.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
@@ -3475,6 +3481,9 @@ int cmd_script(int argc, const char **argv)
flush_scripting();
 
 out_delete:
+   if (script.ptime_range)
+   free(script.ptime_range);
+
perf_evlist__free_stats(session->evlist);
perf_session__delete(session);
 
-- 
2.7.4



[PATCH v1 8/8] perf script: Remove the time slices number limitation

2018-01-09 Thread Jin Yao
Previously it was only allowed to use at most 10 time slices
in 'perf script --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf script --time 
1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao 
---
 tools/perf/Documentation/perf-script.txt | 10 +-
 tools/perf/builtin-script.c  | 17 +
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt 
b/tools/perf/Documentation/perf-script.txt
index 806ec63..7730c1d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -351,19 +351,19 @@ include::itrace.txt[]
to end of file.
 
Also support time percent with multipe time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
-   Select the second 10% time slice
+   Select the second 10% time slice:
perf script --time 10%/2
 
-   Select from 0% to 10% time slice
+   Select from 0% to 10% time slice:
perf script --time 0%-10%
 
-   Select the first and second 10% time slices
+   Select the first and second 10% time slices:
perf script --time 10%/1,10%/2
 
-   Select from 0% to 10% and 30% to 40% slices
+   Select from 0% to 10% and 30% to 40% slices:
perf script --time 0%-10%,30%-40%
 
 --max-blocks::
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 4f691af..f116a31 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1480,8 +1480,6 @@ static int perf_sample__fprintf_synth(struct perf_sample 
*sample,
return 0;
 }
 
-#define PTIME_RANGE_MAX10
-
 struct perf_script {
struct perf_tooltool;
struct perf_session *session;
@@ -1496,7 +1494,8 @@ struct perf_script {
struct thread_map   *threads;
int name_width;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
 };
 
@@ -3444,6 +3443,13 @@ int cmd_script(int argc, const char **argv)
if (err < 0)
goto out_delete;
 
+   script.ptime_range = perf_time__range_alloc(script.time_str,
+   _size);
+   if (!script.ptime_range) {
+   err = -ENOMEM;
+   goto out_delete;
+   }
+
/* needs to be parsed after looking up reference time */
if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
@@ -3456,7 +3462,7 @@ int cmd_script(int argc, const char **argv)
}
 
script.range_num = perf_time__percent_parse_str(
-   script.ptime_range, PTIME_RANGE_MAX,
+   script.ptime_range, script.range_size,
script.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
@@ -3475,6 +3481,9 @@ int cmd_script(int argc, const char **argv)
flush_scripting();
 
 out_delete:
+   if (script.ptime_range)
+   free(script.ptime_range);
+
perf_evlist__free_stats(session->evlist);
perf_session__delete(session);
 
-- 
2.7.4



[PATCH v1 7/8] perf report: Remove the time slices number limitation

2018-01-09 Thread Jin Yao
Previously it was only allowed to use at most 10 time slices
in 'perf report --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf report --stdio --time 
1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/Documentation/perf-report.txt |  2 +-
 tools/perf/builtin-report.c  | 23 +--
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index 5522ce0..1940c4f 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -403,7 +403,7 @@ OPTIONS
to end of file.
 
Also support time percent with multiple time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
Select the second 10% time slice:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 77c954c..fe89021 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -54,8 +54,6 @@
 #include 
 #include 
 
-#define PTIME_RANGE_MAX10
-
 struct report {
struct perf_tooltool;
struct perf_session *session;
@@ -76,7 +74,8 @@ struct report {
const char  *cpu_list;
const char  *symbol_filter_str;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
float   min_percent;
u64 nr_entries;
@@ -1299,24 +1298,33 @@ int cmd_report(int argc, const char **argv)
if (symbol__init(>header.env) < 0)
goto error;
 
+   report.ptime_range = perf_time__range_alloc(report.time_str,
+   _size);
+   if (!report.ptime_range) {
+   ret = -ENOMEM;
+   goto error;
+   }
+
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
pr_err("HINT: no first/last sample time found in perf 
data.\n"
   "Please use latest perf binary to execute 'perf 
record'\n"
   "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
 
report.range_num = perf_time__percent_parse_str(
-   report.ptime_range, PTIME_RANGE_MAX,
+   report.ptime_range, report.range_size,
report.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
 
if (report.range_num < 0) {
pr_err("Invalid time string\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
} else {
report.range_num = 1;
@@ -1332,6 +1340,9 @@ int cmd_report(int argc, const char **argv)
ret = 0;
 
 error:
+   if (report.ptime_range)
+   free(report.ptime_range);
+
perf_session__delete(session);
return ret;
 }
-- 
2.7.4



[PATCH v1 3/8] perf util: Improve error checking for time percent input

2018-01-09 Thread Jin Yao
The command line like 'perf report --stdio --time 1abc%/1' could be
accepted by perf. It looks not very good.

This patch uses strtod() to replace original atof() and check the
entire string. Now for the same command line, it would return error
message "Invalid time string".

root@skl:/tmp# perf report --stdio --time 1abc%/1
Invalid time string

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/util/time-utils.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 3f7f18f..88510ab 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -116,7 +116,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr)
 
 static int parse_percent(double *pcnt, char *str)
 {
-   char *c;
+   char *c, *endptr;
+   double d;
 
c = strchr(str, '%');
if (c)
@@ -124,8 +125,11 @@ static int parse_percent(double *pcnt, char *str)
else
return -1;
 
-   *pcnt = atof(str) / 100.0;
+   d = strtod(str, );
+   if (endptr != str + strlen(str))
+   return -1;
 
+   *pcnt = d / 100.0;
return 0;
 }
 
-- 
2.7.4



[PATCH v1 7/8] perf report: Remove the time slices number limitation

2018-01-09 Thread Jin Yao
Previously it was only allowed to use at most 10 time slices
in 'perf report --time'.

This patch removes this limitation.
For example, following command line is OK (12 time slices)

perf report --stdio --time 
1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12

Signed-off-by: Jin Yao 
---
 tools/perf/Documentation/perf-report.txt |  2 +-
 tools/perf/builtin-report.c  | 23 +--
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt 
b/tools/perf/Documentation/perf-report.txt
index 5522ce0..1940c4f 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -403,7 +403,7 @@ OPTIONS
to end of file.
 
Also support time percent with multiple time range. Time string is
-   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'. The maximum number of slices is 
10.
+   'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
 
For example:
Select the second 10% time slice:
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 77c954c..fe89021 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -54,8 +54,6 @@
 #include 
 #include 
 
-#define PTIME_RANGE_MAX10
-
 struct report {
struct perf_tooltool;
struct perf_session *session;
@@ -76,7 +74,8 @@ struct report {
const char  *cpu_list;
const char  *symbol_filter_str;
const char  *time_str;
-   struct perf_time_interval ptime_range[PTIME_RANGE_MAX];
+   struct perf_time_interval *ptime_range;
+   int range_size;
int range_num;
float   min_percent;
u64 nr_entries;
@@ -1299,24 +1298,33 @@ int cmd_report(int argc, const char **argv)
if (symbol__init(>header.env) < 0)
goto error;
 
+   report.ptime_range = perf_time__range_alloc(report.time_str,
+   _size);
+   if (!report.ptime_range) {
+   ret = -ENOMEM;
+   goto error;
+   }
+
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
pr_err("HINT: no first/last sample time found in perf 
data.\n"
   "Please use latest perf binary to execute 'perf 
record'\n"
   "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
 
report.range_num = perf_time__percent_parse_str(
-   report.ptime_range, PTIME_RANGE_MAX,
+   report.ptime_range, report.range_size,
report.time_str,
session->evlist->first_sample_time,
session->evlist->last_sample_time);
 
if (report.range_num < 0) {
pr_err("Invalid time string\n");
-   return -EINVAL;
+   ret = -EINVAL;
+   goto error;
}
} else {
report.range_num = 1;
@@ -1332,6 +1340,9 @@ int cmd_report(int argc, const char **argv)
ret = 0;
 
 error:
+   if (report.ptime_range)
+   free(report.ptime_range);
+
perf_session__delete(session);
return ret;
 }
-- 
2.7.4



[PATCH v1 3/8] perf util: Improve error checking for time percent input

2018-01-09 Thread Jin Yao
The command line like 'perf report --stdio --time 1abc%/1' could be
accepted by perf. It looks not very good.

This patch uses strtod() to replace original atof() and check the
entire string. Now for the same command line, it would return error
message "Invalid time string".

root@skl:/tmp# perf report --stdio --time 1abc%/1
Invalid time string

Signed-off-by: Jin Yao 
---
 tools/perf/util/time-utils.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 3f7f18f..88510ab 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -116,7 +116,8 @@ int perf_time__parse_str(struct perf_time_interval *ptime, 
const char *ostr)
 
 static int parse_percent(double *pcnt, char *str)
 {
-   char *c;
+   char *c, *endptr;
+   double d;
 
c = strchr(str, '%');
if (c)
@@ -124,8 +125,11 @@ static int parse_percent(double *pcnt, char *str)
else
return -1;
 
-   *pcnt = atof(str) / 100.0;
+   d = strtod(str, );
+   if (endptr != str + strlen(str))
+   return -1;
 
+   *pcnt = d / 100.0;
return 0;
 }
 
-- 
2.7.4



[PATCH v1 5/8] perf report: Add an indication of what time slices are used

2018-01-09 Thread Jin Yao
Add a time slices indication to the perf report header.

For example,

  # perf report --stdio --time 10%

  # Total Lost Samples: 0
  #
  # Samples: 9K of event 'cycles:ppp' (time slices: 10%)
  # Event count (approx.): 8951288803

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/builtin-report.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index a6c5cf2..77c954c 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -403,6 +403,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists 
*hists, struct report
if (evname != NULL)
ret += fprintf(fp, " of event '%s'", evname);
 
+   if (rep->time_str)
+   ret += fprintf(fp, " (time slices: %s)", rep->time_str);
+
if (symbol_conf.show_ref_callgraph &&
strstr(evname, "call-graph=no")) {
ret += fprintf(fp, ", show reference callgraph");
-- 
2.7.4



[PATCH v1 5/8] perf report: Add an indication of what time slices are used

2018-01-09 Thread Jin Yao
Add a time slices indication to the perf report header.

For example,

  # perf report --stdio --time 10%

  # Total Lost Samples: 0
  #
  # Samples: 9K of event 'cycles:ppp' (time slices: 10%)
  # Event count (approx.): 8951288803

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-report.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index a6c5cf2..77c954c 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -403,6 +403,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists 
*hists, struct report
if (evname != NULL)
ret += fprintf(fp, " of event '%s'", evname);
 
+   if (rep->time_str)
+   ret += fprintf(fp, " (time slices: %s)", rep->time_str);
+
if (symbol_conf.show_ref_callgraph &&
strstr(evname, "call-graph=no")) {
ret += fprintf(fp, ", show reference callgraph");
-- 
2.7.4



[PATCH v1 1/8] perf report: Improve error msg when no first/last sample time found

2018-01-09 Thread Jin Yao
The following message will be returned to user when executing
'perf report --time' if perf data file doesn't contain the
first/last sample time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao <yao@linux.intel.com>
---
 tools/perf/builtin-report.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index dd4df9a..a6c5cf2 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1299,7 +1299,9 @@ int cmd_report(int argc, const char **argv)
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf 
data.\n"
+  "Please use latest perf binary to execute 'perf 
record'\n"
+  "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
return -EINVAL;
}
 
-- 
2.7.4



[PATCH v1 1/8] perf report: Improve error msg when no first/last sample time found

2018-01-09 Thread Jin Yao
The following message will be returned to user when executing
'perf report --time' if perf data file doesn't contain the
first/last sample time.

"HINT: no first/last sample time found in perf data.
 Please use latest perf binary to execute 'perf record'
 (if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."

Signed-off-by: Jin Yao 
---
 tools/perf/builtin-report.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index dd4df9a..a6c5cf2 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1299,7 +1299,9 @@ int cmd_report(int argc, const char **argv)
if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
if (session->evlist->first_sample_time == 0 &&
session->evlist->last_sample_time == 0) {
-   pr_err("No first/last sample time in perf data\n");
+   pr_err("HINT: no first/last sample time found in perf 
data.\n"
+  "Please use latest perf binary to execute 'perf 
record'\n"
+  "(if '--buildid-all' is enabled, please set 
'--timestamp-boundary').\n");
return -EINVAL;
}
 
-- 
2.7.4



[PATCH v1 0/8] perf: Follow-up patches to improve time slice

2018-01-09 Thread Jin Yao
It's follow-up patches to improve the perf time slice feature
(perf report/script --time xxx)

1. Improve the error message
   perf report: Improve error msg when no first/last sample time found
   perf script: Improve error msg when no first/last sample time found

2. Fix an issue that illegal percent was accepted previously (e.g. 1abc%)
   perf util: Improve error checking for time percent input

3. Omit the slice index if possible. For example,
   perf report --stdio --time 10%/1 is equivalent to
   perf report --stdio --time 10%

   perf util: Support no index time percent slice 

4. Add indication of time slices in perf report header.
   perf report: Add an indication of what time slices are used

5. Remove the time slices number limitation in perf report/script
   perf util: Allocate time slices buffer according to number of comma
   perf report: Remove the time slices number limitation
   perf script: Remove the time slices number limitation

Jin Yao (8):
  perf report: Improve error msg when no first/last sample time found
  perf script: Improve error msg when no first/last sample time found
  perf util: Improve error checking for time percent input
  perf util: Support no index time percent slice
  perf report: Add an indication of what time slices are used
  perf util: Allocate time slices buffer according to number of comma
  perf report: Remove the time slices number limitation
  perf script: Remove the time slices number limitation

 tools/perf/Documentation/perf-report.txt |  2 +-
 tools/perf/Documentation/perf-script.txt | 10 ++---
 tools/perf/builtin-report.c  | 30 +
 tools/perf/builtin-script.c  | 21 +++---
 tools/perf/util/time-utils.c | 72 +++-
 tools/perf/util/time-utils.h |  2 +
 6 files changed, 117 insertions(+), 20 deletions(-)

-- 
2.7.4



<    5   6   7   8   9   10   11   12   13   14   >