[tip:perf/urgent] perf script: Assume native_arch for pipe mode

2019-07-13 Thread tip-bot for Song Liu
Commit-ID:  9d49169c5958e429ffa6874fbef734ae7502ad65
Gitweb: https://git.kernel.org/tip/9d49169c5958e429ffa6874fbef734ae7502ad65
Author: Song Liu 
AuthorDate: Thu, 20 Jun 2019 18:44:38 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 9 Jul 2019 10:13:28 -0300

perf script: Assume native_arch for pipe mode

In pipe mode, session->header.env.arch is not populated until the events
are processed. Therefore, the following command crashes:

   perf record -o - | perf script

(gdb) bt

It fails when we try to compare env.arch against uts.machine:

if (!strcmp(uts.machine, session->header.env.arch) ||
(!strcmp(uts.machine, "x86_64") &&
 !strcmp(session->header.env.arch, "i386")))
native_arch = true;

In pipe mode, it is tricky to find env.arch at this stage. To keep it
simple, let's just assume native_arch is always true for pipe mode.
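A minimal sketch of the resulting check, written as a standalone helper for illustration. The function name arch_is_native() and the extra NULL guard are assumptions made here and are not part of the patch; the patch itself only adds the data.is_pipe test in front of the existing strcmp() calls.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perf): decide whether recorded data
 * should be treated as native to the host.
 *   is_pipe       - true when reading from a pipe, so the recorded arch
 *                   has not been parsed yet and may be NULL
 *   host_arch     - e.g. uts.machine from uname(2)
 *   recorded_arch - e.g. session->header.env.arch
 */
bool arch_is_native(bool is_pipe, const char *host_arch,
		    const char *recorded_arch)
{
	/* Test the pipe flag first so strcmp() never sees a NULL arch. */
	if (is_pipe)
		return true;
	/* Extra guard, an assumption of this sketch only. */
	if (!host_arch || !recorded_arch)
		return false;
	if (!strcmp(host_arch, recorded_arch))
		return true;
	/* A 32-bit x86 recording on a 64-bit x86 host also counts. */
	return !strcmp(host_arch, "x86_64") && !strcmp(recorded_arch, "i386");
}
```

Because the pipe flag is evaluated first, the comparison against an unpopulated arch string is never reached in pipe mode.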

Reported-by: David Carrillo Cisneros 
Signed-off-by: Song Liu 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Andi Kleen 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: kernel-t...@fb.com
Cc: sta...@vger.kernel.org #v5.1+
Fixes: 3ab481a1cfe1 ("perf script: Support insn output for normal samples")
Link: http://lkml.kernel.org/r/20190621014438.810342-1-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-script.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index b3536820f9a8..79367087bd18 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3752,7 +3752,8 @@ int cmd_script(int argc, const char **argv)
goto out_delete;
 
uname(&uts);
-   if (!strcmp(uts.machine, session->header.env.arch) ||
+   if (data.is_pipe ||  /* assume pipe_mode indicates native_arch */
+   !strcmp(uts.machine, session->header.env.arch) ||
(!strcmp(uts.machine, "x86_64") &&
 !strcmp(session->header.env.arch, "i386")))
native_arch = true;


[tip:perf/core] perf header: Assign proper ff->ph in perf_event__synthesize_features()

2019-07-09 Thread tip-bot for Song Liu
Commit-ID:  c952b35f4b15dd1b83e952718dec3307256383ef
Gitweb: https://git.kernel.org/tip/c952b35f4b15dd1b83e952718dec3307256383ef
Author: Song Liu 
AuthorDate: Wed, 19 Jun 2019 18:04:53 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Sat, 6 Jul 2019 14:29:04 -0300

perf header: Assign proper ff->ph in perf_event__synthesize_features()

bpf/btf write_* functions need ff->ph->env.

With this missing, pipe mode (perf record -o -) would crash like:

Program terminated with signal SIGSEGV, Segmentation fault.

This patch assigns the proper ph value to ff.

Committer testing:

  (gdb) run record -o -
  Starting program: /root/bin/perf record -o -
  PERFILE2
  
  Thread 1 "perf" received signal SIGSEGV, Segmentation fault.
  __do_write_buf (size=4, buf=0x160, ff=0x7fff8f80) at util/header.c:126
  126   memcpy(ff->buf + ff->offset, buf, size);
  (gdb) bt
  #0  __do_write_buf (size=4, buf=0x160, ff=0x7fff8f80) at util/header.c:126
  #1  do_write (ff=ff@entry=0x7fff8f80, buf=buf@entry=0x160, size=4) at 
util/header.c:137
  #2  0x004eddba in write_bpf_prog_info (ff=0x7fff8f80, 
evlist=) at util/header.c:912
  #3  0x004f69d7 in perf_event__synthesize_features 
(tool=tool@entry=0x97cc00 , session=session@entry=0x7fffe9c6d010,
  evlist=0x7fffe9cae010, process=process@entry=0x4435d0 
) at util/header.c:3695
  #4  0x00443c79 in record__synthesize (tail=tail@entry=false, 
rec=0x97cc00 ) at builtin-record.c:1214
  #5  0x00444ec9 in __cmd_record (rec=0x97cc00 , 
argv=, argc=0) at builtin-record.c:1435
  #6  cmd_record (argc=0, argv=) at builtin-record.c:2450
  #7  0x004ae3e9 in run_builtin (p=p@entry=0x98e058 , 
argc=argc@entry=3, argv=0x7fffd670) at perf.c:304
  #8  0x0042eded in handle_internal_command (argv=, 
argc=) at perf.c:356
  #9  run_argv (argcp=, argv=) at perf.c:400
  #10 main (argc=3, argv=) at perf.c:522
  (gdb)

After the patch the SIGSEGV is gone.

Reported-by: David Carrillo Cisneros 
Signed-off-by: Song Liu 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: kernel-t...@fb.com
Cc: sta...@vger.kernel.org # v5.1+
Fixes: 606f972b1361 ("perf bpf: Save bpf_prog_info information as headers to perf.data")
Link: http://lkml.kernel.org/r/20190620010453.4118689-1-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/header.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 847ae51a524b..fb0aa661644b 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3602,6 +3602,7 @@ int perf_event__synthesize_features(struct perf_tool 
*tool,
return -ENOMEM;
 
ff.size = sz - sz_hdr;
+   ff.ph = &session->header;
 
for_each_set_bit(feat, header->adds_features, HEADER_FEAT_BITS) {
if (!feat_ops[feat].synthesize) {


[tip:x86/urgent] perf/x86: Always store regs->ip in perf_callchain_kernel()

2019-06-27 Thread tip-bot for Song Liu
Commit-ID:  83f44ae0f8afcc9da659799db8693f74847e66b3
Gitweb: https://git.kernel.org/tip/83f44ae0f8afcc9da659799db8693f74847e66b3
Author: Song Liu 
AuthorDate: Wed, 26 Jun 2019 19:33:52 -0500
Committer:  Thomas Gleixner 
CommitDate: Fri, 28 Jun 2019 00:11:20 +0200

perf/x86: Always store regs->ip in perf_callchain_kernel()

The stacktrace_map_raw_tp BPF selftest is failing because the RIP saved by
perf_arch_fetch_caller_regs() isn't getting saved by perf_callchain_kernel().

This was broken by the following commit:

  d15d356887e7 ("perf/x86: Make perf callchains work without 
CONFIG_FRAME_POINTER")

With that change, when starting with non-HW regs, the unwinder starts
with the current stack frame and unwinds until it passes up the frame
which called perf_arch_fetch_caller_regs().  So regs->ip needs to be
saved deliberately.
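The ordering this fix establishes can be sketched with a toy model. All names below (toy_callchain, toy_store, toy_callchain_kernel) are hypothetical stand-ins and not kernel APIs; the frames array plays the role of the addresses the unwinder yields once it has skipped past the frame that captured the registers.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 8

/* Toy stand-in for struct perf_callchain_entry_ctx. */
struct toy_callchain {
	uint64_t ip[MAX_DEPTH];
	size_t nr;
};

/* Returns non-zero when the chain is full, so the caller stops walking. */
int toy_store(struct toy_callchain *c, uint64_t ip)
{
	if (c->nr >= MAX_DEPTH)
		return -1;
	c->ip[c->nr++] = ip;
	return 0;
}

/*
 * regs_ip: the IP captured in regs.
 * frames:  return addresses produced by the unwinder; regs_ip itself is
 *          not among them, which is why it must be stored explicitly.
 */
size_t toy_callchain_kernel(struct toy_callchain *c, uint64_t regs_ip,
			    const uint64_t *frames, size_t nr_frames)
{
	c->nr = 0;
	/* Store regs->ip unconditionally, for HW and non-HW regs alike. */
	if (toy_store(c, regs_ip))
		return c->nr;
	for (size_t i = 0; i < nr_frames; i++) {
		if (toy_store(c, frames[i]))
			break;
	}
	return c->nr;
}
```

With the store hoisted out of the hardware-regs branch, the first entry of the chain is always the saved IP, regardless of how the unwinder was started.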

Fixes: d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")
Signed-off-by: Song Liu 
Signed-off-by: Josh Poimboeuf 
Signed-off-by: Thomas Gleixner 
Acked-by: Peter Zijlstra (Intel) 
Cc: Kairui Song 
Cc: Steven Rostedt 
Cc: Borislav Petkov 
Link: 
https://lkml.kernel.org/r/3975a298fa52b506fea32666d8ff6a13467eee6d.1561595111.git.jpoim...@redhat.com

---
 arch/x86/events/core.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index f315425d8468..4fb3ca1e699d 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2402,13 +2402,13 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx 
*entry, struct pt_regs *re
return;
}
 
-   if (perf_hw_regs(regs)) {
-   if (perf_callchain_store(entry, regs->ip))
-   return;
+   if (perf_callchain_store(entry, regs->ip))
+   return;
+
+   if (perf_hw_regs(regs))
unwind_start(&state, current, regs, NULL);
-   } else {
+   else
unwind_start(&state, current, NULL, (void *)regs->sp);
-   }
 
for (; !unwind_done(&state); unwind_next_frame(&state)) {
addr = unwind_get_return_address(&state);


[tip:perf/core] perf data: Add description of header HEADER_BPF_PROG_INFO and HEADER_BPF_BTF

2019-06-17 Thread tip-bot for Song Liu
Commit-ID:  8e21be4f815ca8edfee1decd87f298f92123f719
Gitweb: https://git.kernel.org/tip/8e21be4f815ca8edfee1decd87f298f92123f719
Author: Song Liu 
AuthorDate: Mon, 20 May 2019 23:44:06 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 5 Jun 2019 09:47:52 -0300

perf data: Add description of header HEADER_BPF_PROG_INFO and HEADER_BPF_BTF

This patch adds a description of HEADER_BPF_PROG_INFO and HEADER_BPF_BTF
to perf.data-file-format.txt.
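As an illustration of the HEADER_BPF_BTF record layout documented by this patch (u32 id, u32 data_size, then data_size bytes), here is a minimal sketch of reading one such record from a memory buffer. The names btf_record_hdr and parse_btf_record() are hypothetical, the buffer is assumed to be in host byte order, and any framing that surrounds the records inside the feature section is deliberately not covered.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed-size prefix of one record: u32 id, u32 data_size. */
struct btf_record_hdr {
	uint32_t id;
	uint32_t data_size;
};

/*
 * Hypothetical parser: read one record from buf (len bytes).
 * On success, fills *hdr, points *data at the payload and returns the
 * number of bytes consumed. Returns 0 if the buffer is too short.
 */
size_t parse_btf_record(const unsigned char *buf, size_t len,
			struct btf_record_hdr *hdr,
			const unsigned char **data)
{
	if (len < sizeof(*hdr))
		return 0;
	/* memcpy() avoids unaligned loads from the raw buffer. */
	memcpy(hdr, buf, sizeof(*hdr));
	if (hdr->data_size > len - sizeof(*hdr))
		return 0;
	*data = buf + sizeof(*hdr);
	return sizeof(*hdr) + hdr->data_size;
}
```

Validating data_size against the remaining length before touching the payload keeps a truncated or corrupt file from causing an out-of-bounds read.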

Requested-by: Arnaldo Carvalho de Melo 
Signed-off-by: Song Liu 
Cc: Jiri Olsa 
Cc: Peter Zijlstra 
Fixes: 606f972b1361 ("perf bpf: Save bpf_prog_info information as headers to perf.data")
Link: http://lkml.kernel.org/r/20190521064406.2498925-1-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Documentation/perf.data-file-format.txt | 16 
 1 file changed, 16 insertions(+)

diff --git a/tools/perf/Documentation/perf.data-file-format.txt 
b/tools/perf/Documentation/perf.data-file-format.txt
index 6967e9b02be5..022bb8b1c84a 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -272,6 +272,22 @@ struct {
 
 Two uint64_t for the time of first sample and the time of last sample.
 
+HEADER_BPF_PROG_INFO = 25,
+
+struct bpf_prog_info_linear, which contains detailed information about
+a BPF program, including type, id, tag, jited/xlated instructions, etc.
+
+HEADER_BPF_BTF = 26,
+
+Contains BPF Type Format (BTF). For more information about BTF, please
+refer to Documentation/bpf/btf.rst.
+
+struct {
+   u32 id;
+   u32 data_size;
+   char data[];
+};
+
 HEADER_COMPRESSED = 27,
 
 struct {


[tip:perf/core] perf/core: Allow non-privileged uprobe for user processes

2019-06-03 Thread tip-bot for Song Liu
Commit-ID:  9fd2e48b9ae17978b2c2a98c055c774d5d90bce8
Gitweb: https://git.kernel.org/tip/9fd2e48b9ae17978b2c2a98c055c774d5d90bce8
Author: Song Liu 
AuthorDate: Tue, 7 May 2019 09:15:45 -0700
Committer:  Ingo Molnar 
CommitDate: Mon, 3 Jun 2019 11:58:18 +0200

perf/core: Allow non-privileged uprobe for user processes

Currently, a non-privileged user can only use uprobes with

kernel.perf_event_paranoid = -1

However, setting perf_event_paranoid to -1 leaks other users' processes to
non-privileged uprobes.

To introduce proper permission control of uprobes, we are building the
following system:

  A daemon with CAP_SYS_ADMIN is in charge of creating uprobes via tracefs;
  Users ask the daemon to create uprobes;
  Each user can then attach a uprobe only to processes owned by that user.

This patch allows a non-privileged user to attach uprobes to processes
owned by that user.
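The core of the change in perf_tp_event_match() can be modelled as a two-input predicate. This is a toy model only: tp_event_allowed() is a hypothetical name, and the two booleans stand in for event->attr.exclude_kernel and user_mode(regs).

```c
#include <stdbool.h>

/*
 * Toy model of the check in perf_tp_event_match() after this patch.
 *   exclude_kernel - the event was opened with attr.exclude_kernel set
 *   from_user      - the tracepoint fired in user mode (a uprobe)
 * Returns true if the event passes this particular check.
 */
bool tp_event_allowed(bool exclude_kernel, bool from_user)
{
	/* Before the patch, exclude_kernel alone rejected every event. */
	if (exclude_kernel && !from_user)
		return false;
	return true;
}
```

Only the combination of exclude_kernel with a kernel-mode hit is still rejected, so uprobe hits get through for events that exclude the kernel.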

The following example shows how to use a uprobe as a non-privileged user.
It is based on Brendan's blog post [1].

1. Create uprobe with root:

  sudo perf probe -x 'readline%return +0($retval):string'

2. Then non-root user can use the uprobe as:

  perf record -vvv -e probe_bash:readline__return -p <pid> sleep 20
  perf script

[1] http://www.brendangregg.com/blog/2015-06-28/linux-ftrace-uprobe.html

Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: https://lkml.kernel.org/r/20190507161545.788381-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 kernel/events/core.c| 4 ++--
 kernel/trace/trace_uprobe.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index abbd4b3b96c2..3005c80f621d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8532,9 +8532,9 @@ static int perf_tp_event_match(struct perf_event *event,
if (event->hw.state & PERF_HES_STOPPED)
return 0;
/*
-* All tracepoints are from kernel-space.
+* If exclude_kernel, only trace user-space tracepoints (uprobes)
 */
-   if (event->attr.exclude_kernel)
+   if (event->attr.exclude_kernel && !user_mode(regs))
return 0;
 
if (!perf_tp_filter_match(event, data))
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index eb7e06b54741..0d60d6856de5 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -1331,7 +1331,7 @@ static inline void init_trace_event_call(struct 
trace_uprobe *tu,
call->event.funcs = &uprobe_funcs;
call->class->define_fields = uprobe_event_define_fields;
 
-   call->flags = TRACE_EVENT_FL_UPROBE;
+   call->flags = TRACE_EVENT_FL_UPROBE | TRACE_EVENT_FL_CAP_ANY;
call->class->reg = trace_uprobe_register;
call->data = tu;
 }


[tip:perf/urgent] perf tools: Check maps for bpf programs

2019-04-19 Thread tip-bot for Song Liu
Commit-ID:  a93e0b2365e81e5a5b61f25e269b5dc73d242cba
Gitweb: https://git.kernel.org/tip/a93e0b2365e81e5a5b61f25e269b5dc73d242cba
Author: Song Liu 
AuthorDate: Tue, 16 Apr 2019 18:01:22 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 17 Apr 2019 14:30:11 -0300

perf tools: Check maps for bpf programs

As reported by Jiri Olsa in:

  "[BUG] perf: intel_pt won't display kernel function"
  https://lore.kernel.org/lkml/20190403143738.GB32001@krava

Recent changes to support PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT
broke the --kallsyms option, because they broke the __map__is_kmodule()
test.

This patch fixes this by adding a check for BPF programs, so that their
maps are not mistaken for kernel modules.
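The name-based fallback in this patch is a prefix test: strstr(name, "bpf_prog_") == name holds only when the match starts at the first character. A minimal sketch of the same test using strncmp() follows; name_is_bpf_prog() is a hypothetical helper, not a perf function.

```c
#include <stdbool.h>
#include <string.h>

#define BPF_PROG_PREFIX "bpf_prog_"

/* True when name begins with "bpf_prog_"; NULL names never match. */
bool name_is_bpf_prog(const char *name)
{
	return name &&
	       !strncmp(name, BPF_PROG_PREFIX, sizeof(BPF_PROG_PREFIX) - 1);
}
```

Both forms give the same answer; strncmp() simply stops after the prefix length instead of scanning the whole string.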

Signed-off-by: Song Liu 
Reported-by: Jiri Olsa 
Cc: Adrian Hunter 
Cc: Alexander Shishkin 
Cc: Alexei Starovoitov 
Cc: Andi Kleen 
Cc: Andrii Nakryiko 
Cc: Daniel Borkmann 
Cc: Martin KaFai Lau 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Yonghong Song 
Link: http://lkml.kernel.org/r/20190416160127.30203-8-jo...@kernel.org
Fixes: 76193a94522f ("perf, bpf: Introduce PERF_RECORD_KSYMBOL")
Signed-off-by: Jiri Olsa 
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/map.c | 16 
 tools/perf/util/map.h |  4 +++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index e32628cd20a7..28d484ef74ae 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -261,6 +261,22 @@ bool __map__is_extra_kernel_map(const struct map *map)
return kmap && kmap->name[0];
 }
 
+bool __map__is_bpf_prog(const struct map *map)
+{
+   const char *name;
+
+   if (map->dso->binary_type == DSO_BINARY_TYPE__BPF_PROG_INFO)
+   return true;
+
+   /*
+* If PERF_RECORD_BPF_EVENT is not included, the dso will not have
+* type of DSO_BINARY_TYPE__BPF_PROG_INFO. In such cases, we can
+* guess the type based on name.
+*/
+   name = map->dso->short_name;
+   return name && (strstr(name, "bpf_prog_") == name);
+}
+
 bool map__has_symbols(const struct map *map)
 {
return dso__has_symbols(map->dso);
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 0e20749f2c55..dc93787c74f0 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -159,10 +159,12 @@ int map__set_kallsyms_ref_reloc_sym(struct map *map, 
const char *symbol_name,
 
 bool __map__is_kernel(const struct map *map);
 bool __map__is_extra_kernel_map(const struct map *map);
+bool __map__is_bpf_prog(const struct map *map);
 
 static inline bool __map__is_kmodule(const struct map *map)
 {
-   return !__map__is_kernel(map) && !__map__is_extra_kernel_map(map);
+   return !__map__is_kernel(map) && !__map__is_extra_kernel_map(map) &&
+  !__map__is_bpf_prog(map);
 }
 
 bool map__has_symbols(const struct map *map);


[tip:perf/urgent] perf bpf: Extract logic to create program names from perf_event__synthesize_one_bpf_prog()

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  fc462ac75b36daaa61e9bda7fba66ed1b3a500b4
Gitweb: https://git.kernel.org/tip/fc462ac75b36daaa61e9bda7fba66ed1b3a500b4
Author: Song Liu 
AuthorDate: Tue, 19 Mar 2019 09:54:53 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 21 Mar 2019 11:27:04 -0300

perf bpf: Extract logic to create program names from 
perf_event__synthesize_one_bpf_prog()

Extract logic to create program names to synthesize_bpf_prog_name(), so
that it can be reused in header.c:print_bpf_prog_info().

This commit doesn't change the behavior.
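For illustration, the name format produced here is "bpf_prog_" followed by the program tag in hex and an optional "_<short_name>" suffix. The sketch below reproduces only that format. make_bpf_prog_name() is a hypothetical stand-in, it takes the already chosen short name as a parameter instead of deriving it from BTF, and it uses plain snprintf() rather than the snprintf_hex() helper used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TAG_SIZE 8	/* same value as BPF_TAG_SIZE in the UAPI headers */

/*
 * Writes "bpf_prog_<tag as hex>[_<short_name>]" into buf.
 * Returns the length the full name needs, as snprintf() does, so a
 * return value >= size means the name was truncated.
 */
int make_bpf_prog_name(char *buf, size_t size,
		       const unsigned char tag[TAG_SIZE],
		       const char *short_name)
{
	char hex[2 * TAG_SIZE + 1];

	assert(buf && size > 0);
	/* Two hex digits per tag byte; each call re-terminates the string. */
	for (int i = 0; i < TAG_SIZE; i++)
		snprintf(hex + 2 * i, 3, "%02x", (unsigned int)tag[i]);
	if (short_name)
		return snprintf(buf, size, "bpf_prog_%s_%s", hex, short_name);
	return snprintf(buf, size, "bpf_prog_%s", hex);
}
```

Returning the untruncated length lets a caller detect a buffer that is too small, which is one possible design and not necessarily what the perf code relies on.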

Signed-off-by: Song Liu 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Link: http://lkml.kernel.org/r/20190319165454.1298742-2-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/bpf-event.c | 62 +
 1 file changed, 35 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 2a8c245ca942..d5b041649f26 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -111,6 +111,38 @@ static int perf_env__fetch_btf(struct perf_env *env,
return 0;
 }
 
+static int synthesize_bpf_prog_name(char *buf, int size,
+   struct bpf_prog_info *info,
+   struct btf *btf,
+   u32 sub_id)
+{
+   u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(uintptr_t)(info->prog_tags);
+   void *func_infos = (void *)(uintptr_t)(info->func_info);
+   u32 sub_prog_cnt = info->nr_jited_ksyms;
+   const struct bpf_func_info *finfo;
+   const char *short_name = NULL;
+   const struct btf_type *t;
+   int name_len;
+
+   name_len = snprintf(buf, size, "bpf_prog_");
+   name_len += snprintf_hex(buf + name_len, size - name_len,
+prog_tags[sub_id], BPF_TAG_SIZE);
+   if (btf) {
+   finfo = func_infos + sub_id * info->func_info_rec_size;
+   t = btf__type_by_id(btf, finfo->type_id);
+   short_name = btf__name_by_offset(btf, t->name_off);
+   } else if (sub_id == 0 && sub_prog_cnt == 1) {
+   /* no subprog */
+   if (info->name[0])
+   short_name = info->name;
+   } else
+   short_name = "F";
+   if (short_name)
+   name_len += snprintf(buf + name_len, size - name_len,
+"_%s", short_name);
+   return name_len;
+}
+
 /*
  * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
  * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
@@ -135,7 +167,6 @@ static int perf_event__synthesize_one_bpf_prog(struct 
perf_session *session,
struct bpf_prog_info_node *info_node;
struct bpf_prog_info *info;
struct btf *btf = NULL;
-   bool has_btf = false;
struct perf_env *env;
u32 sub_prog_cnt, i;
int err = 0;
@@ -189,19 +220,13 @@ static int perf_event__synthesize_one_bpf_prog(struct 
perf_session *session,
btf = NULL;
goto out;
}
-   has_btf = true;
perf_env__fetch_btf(env, info->btf_id, btf);
}
 
/* Synthesize PERF_RECORD_KSYMBOL */
for (i = 0; i < sub_prog_cnt; i++) {
-   u8 (*prog_tags)[BPF_TAG_SIZE] = (void 
*)(uintptr_t)(info->prog_tags);
-   __u32 *prog_lens  = (__u32 *)(uintptr_t)(info->jited_func_lens);
+   __u32 *prog_lens = (__u32 *)(uintptr_t)(info->jited_func_lens);
__u64 *prog_addrs = (__u64 *)(uintptr_t)(info->jited_ksyms);
-   void *func_infos  = (void *)(uintptr_t)(info->func_info);
-   const struct bpf_func_info *finfo;
-   const char *short_name = NULL;
-   const struct btf_type *t;
int name_len;
 
*ksymbol_event = (struct ksymbol_event){
@@ -214,26 +239,9 @@ static int perf_event__synthesize_one_bpf_prog(struct 
perf_session *session,
.ksym_type = PERF_RECORD_KSYMBOL_TYPE_BPF,
.flags = 0,
};
-   name_len = snprintf(ksymbol_event->name, KSYM_NAME_LEN,
-   "bpf_prog_");
-   name_len += snprintf_hex(ksymbol_event->name + name_len,
-KSYM_NAME_LEN - name_len,
-prog_tags[i], BPF_TAG_SIZE);
-   if (has_btf) {
-   finfo = func_infos + i * info->func_info_rec_size;
-   t = btf__type_by_id(btf, finfo->type_id);
-   short_name = btf__name_by_offset(btf, t->name_off);
-   } else if (i == 0 && sub_prog_cnt == 1) {
-   /* no subprog */
-   if 

[tip:perf/urgent] perf tools: Save bpf_prog_info and BTF of new BPF programs

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  d56354dc49091e33d9ffca732ac913ed2df70537
Gitweb: https://git.kernel.org/tip/d56354dc49091e33d9ffca732ac913ed2df70537
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:51 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 21 Mar 2019 11:27:04 -0300

perf tools: Save bpf_prog_info and BTF of new BPF programs

To fully annotate BPF programs with source code mapping, four different
pieces of information are needed:

1) PERF_RECORD_KSYMBOL
2) PERF_RECORD_BPF_EVENT
3) bpf_prog_info
4) btf

This patch handles 3) and 4) for BPF programs loaded after 'perf
record|top'.

To process this information in a timely manner, a dedicated event is
added to the side band evlist.

When PERF_RECORD_BPF_EVENT is received via the side band event, the
polling thread gathers 3) and 4) via sys_bpf and stores them in perf_env.

This information is saved to perf.data at the end of 'perf record'.

Committer testing:

The 'wakeup_watermark' member in 'struct perf_event_attr' is inside an
unnamed union, so it can't be used in a struct designated initializer
with older gccs. Take it out of the initializer and set it separately as
'attr.wakeup_watermark = 1;' so that it works with all gcc versions.

We also need to add '--no-bpf-event' to the 'perf record'
perf_event_attr tests in 'perf test'. That test intercepts the events
being set up and checks whether they match the fields described in the
control files. Since it now finds the side band event used to catch
PERF_RECORD_BPF_EVENT first, all of those tests fail without the option.

With these issues fixed:

Same scenario as for testing BPF programs loaded before 'perf record' or
'perf top' starts, except that the BPF programs are started after 'perf
record|top', so that their information gets collected by the sideband
threads. The rest works as for programs loaded before monitoring starts.

Add missing 'inline' to the bpf_event__add_sb_event() when
HAVE_LIBBPF_SUPPORT is not defined, fixing the build in systems without
binutils devel files installed.

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Link: http://lkml.kernel.org/r/20190312053051.2690567-16-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-record.c|   3 +
 tools/perf/builtin-top.c   |   3 +
 tools/perf/tests/attr/test-record-C0   |   2 +-
 tools/perf/tests/attr/test-record-basic|   2 +-
 tools/perf/tests/attr/test-record-branch-any   |   2 +-
 .../perf/tests/attr/test-record-branch-filter-any  |   2 +-
 .../tests/attr/test-record-branch-filter-any_call  |   2 +-
 .../tests/attr/test-record-branch-filter-any_ret   |   2 +-
 tools/perf/tests/attr/test-record-branch-filter-hv |   2 +-
 .../tests/attr/test-record-branch-filter-ind_call  |   2 +-
 tools/perf/tests/attr/test-record-branch-filter-k  |   2 +-
 tools/perf/tests/attr/test-record-branch-filter-u  |   2 +-
 tools/perf/tests/attr/test-record-count|   2 +-
 tools/perf/tests/attr/test-record-data |   2 +-
 tools/perf/tests/attr/test-record-freq |   2 +-
 tools/perf/tests/attr/test-record-graph-default|   2 +-
 tools/perf/tests/attr/test-record-graph-dwarf  |   2 +-
 tools/perf/tests/attr/test-record-graph-fp |   2 +-
 tools/perf/tests/attr/test-record-group|   2 +-
 tools/perf/tests/attr/test-record-group-sampling   |   2 +-
 tools/perf/tests/attr/test-record-group1   |   2 +-
 tools/perf/tests/attr/test-record-no-buffering |   2 +-
 tools/perf/tests/attr/test-record-no-inherit   |   2 +-
 tools/perf/tests/attr/test-record-no-samples   |   2 +-
 tools/perf/tests/attr/test-record-period   |   2 +-
 tools/perf/tests/attr/test-record-raw  |   2 +-
 tools/perf/util/bpf-event.c| 100 +
 tools/perf/util/bpf-event.h|  15 
 28 files changed, 145 insertions(+), 24 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6f645fd72fed..4e2d953d4bc5 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1238,6 +1238,9 @@ static int __cmd_record(struct record *rec, int argc, 
const char **argv)
goto out_child;
}
 
+   if (!opts->no_bpf_event)
+   bpf_event__add_sb_event(&sb_evlist, &session->header.env);
+
if (perf_evlist__start_sb_thread(sb_evlist, &rec->opts.target)) {
pr_debug("Couldn't start the BPF side band thread:\nBPF 
programs starting from now on won't be annotatable\n");
opts->no_bpf_event = true;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 3ce8a8db6c1d..1999d6533d12 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1637,6 +1637,9 @@ int cmd_top(int argc, const char **argv)
  

[tip:perf/urgent] perf annotate: Enable annotation of BPF programs

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  6987561c9e86eace45f2dbb0c564964a63f4150a
Gitweb: https://git.kernel.org/tip/6987561c9e86eace45f2dbb0c564964a63f4150a
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:48 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 20 Mar 2019 16:43:15 -0300

perf annotate: Enable annotation of BPF programs

In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into
a new function symbol__disassemble_bpf(), where annotation line
information is filled based on the bpf_prog_info and btf data saved in
given perf_env.

symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf
programs.

Committer testing:

After fixing this:

  -   u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
  +   u64 *addrs = (u64 
*)(uintptr_t)(info_linear->info.jited_ksyms);

Detected when crossbuilding to a 32-bit arch.

And making all this dependent on HAVE_LIBBFD_SUPPORT and
HAVE_LIBBPF_SUPPORT:

1) Have a BPF program running, one that has BTF info, etc, I used
   the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place
   by 'perf trace'.

  # grep -B1 augmented_raw ~/.perfconfig
  [trace]
add_events = 
/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
  #
  # perf trace -e *mmsg
  dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2
  NetworkManager/10055 sendmmsg(22, 0x7f8126ad1bb0, 2, 
MSG_NOSIGNAL) = 2

2) Then do a 'perf record' system wide for a while:

  # perf record -a
  ^C[ perf record: Woken up 68 times to write data ]
  [ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ]
  #

3) Check that we captured BPF and BTF info in the perf.data file:

  # perf report --header-only | grep 'b[pt]f'
  # event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 
294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, 
sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 
1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, 
exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
  # bpf_prog_info of id 13
  # bpf_prog_info of id 14
  # bpf_prog_info of id 15
  # bpf_prog_info of id 16
  # bpf_prog_info of id 17
  # bpf_prog_info of id 18
  # bpf_prog_info of id 21
  # bpf_prog_info of id 22
  # bpf_prog_info of id 41
  # bpf_prog_info of id 42
  # btf info of id 2
  #

4) Check which programs got recorded:

   # perf report | grep bpf_prog | head
 0.16%  exe  bpf_prog_819967866022f1e1_sys_enter  [k] 
bpf_prog_819967866022f1e1_sys_enter
 0.14%  exe  bpf_prog_c1bd85c092d6e4aa_sys_exit   [k] 
bpf_prog_c1bd85c092d6e4aa_sys_exit
 0.08%  fuse-overlayfs   bpf_prog_819967866022f1e1_sys_enter  [k] 
bpf_prog_819967866022f1e1_sys_enter
 0.07%  fuse-overlayfs   bpf_prog_c1bd85c092d6e4aa_sys_exit   [k] 
bpf_prog_c1bd85c092d6e4aa_sys_exit
 0.01%  clang-4.0bpf_prog_c1bd85c092d6e4aa_sys_exit   [k] 
bpf_prog_c1bd85c092d6e4aa_sys_exit
 0.01%  clang-4.0bpf_prog_819967866022f1e1_sys_enter  [k] 
bpf_prog_819967866022f1e1_sys_enter
 0.00%  clangbpf_prog_c1bd85c092d6e4aa_sys_exit   [k] 
bpf_prog_c1bd85c092d6e4aa_sys_exit
 0.00%  runc bpf_prog_819967866022f1e1_sys_enter  [k] 
bpf_prog_819967866022f1e1_sys_enter
 0.00%  clangbpf_prog_819967866022f1e1_sys_enter  [k] 
bpf_prog_819967866022f1e1_sys_enter
 0.00%  sh   bpf_prog_c1bd85c092d6e4aa_sys_exit   [k] 
bpf_prog_c1bd85c092d6e4aa_sys_exit
  #

  This was with the default --sort order for 'perf report', which is:

--sort comm,dso,symbol

  If we just look for the symbol, for instance:

   # perf report --sort symbol | grep bpf_prog | head
 0.26%  [k] bpf_prog_819967866022f1e1_sys_enter-  -
 0.24%  [k] bpf_prog_c1bd85c092d6e4aa_sys_exit -  -
   #

  or the DSO:

   # perf report --sort dso | grep bpf_prog | head
 0.26%  bpf_prog_819967866022f1e1_sys_enter
 0.24%  bpf_prog_c1bd85c092d6e4aa_sys_exit
  #

We'll see the two BPF programs that augmented_raw_syscalls.o puts in
place,  one attached to the raw_syscalls:sys_enter and another to the
raw_syscalls:sys_exit tracepoints, as expected.

Now we can finally do, from the command line, annotation for one of
those two symbols, with the original BPF program source code intermixed
with the disassembled JITed code:

  # perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter

  Samples: 950  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 
553756947, [percent: local period]
  bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
  Percent  int sys_enter(struct syscall_enter_args *args)
   53.41 push   %rbp

0.63 mov%rsp,%rbp
0.31 sub$0x170,%rsp
1.93 sub$0x28,%rbp
7.02 mov%rbx,0x0(%rbp)
3.20  

[tip:perf/urgent] perf bpf: Show more BPF program info in print_bpf_prog_info()

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  f8dfeae009effc0b6dac2741cf8d7cbb91edb982
Gitweb: https://git.kernel.org/tip/f8dfeae009effc0b6dac2741cf8d7cbb91edb982
Author: Song Liu 
AuthorDate: Tue, 19 Mar 2019 09:54:54 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 21 Mar 2019 11:27:04 -0300

perf bpf: Show more BPF program info in print_bpf_prog_info()

This patch enables showing bpf program name, address, and size in the
header.

Before the patch:

  perf report --header-only
  ...
  # bpf_prog_info of id 9
  # bpf_prog_info of id 10
  # bpf_prog_info of id 13

After the patch:

  # bpf_prog_info 9: bpf_prog_7be49e3934a125ba addr 0xa0024947 size 229
  # bpf_prog_info 10: bpf_prog_2a142ef67aaad174 addr 0xa007c94d size 229
  # bpf_prog_info 13: bpf_prog_47368425825d7384_task__task_newt addr 
0xa0251137 size 369

Committer notes:

Fix the fallback definition when HAVE_LIBBPF_SUPPORT is not defined,
i.e. add the missing 'static inline' and add the __maybe_unused to the
args. Also add stdio.h since we now use FILE * in bpf-event.h.

Signed-off-by: Song Liu 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Link: http://lkml.kernel.org/r/20190319165454.1298742-3-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/bpf-event.c | 40 
 tools/perf/util/bpf-event.h | 11 ++-
 tools/perf/util/header.c|  5 +++--
 3 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index d5b041649f26..2a4a0da35632 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -438,3 +438,43 @@ int bpf_event__add_sb_event(struct perf_evlist **evlist,
 
return perf_evlist__add_sb_event(evlist, &attr, bpf_event__sb_cb, env);
 }
+
+void bpf_event__print_bpf_prog_info(struct bpf_prog_info *info,
+   struct perf_env *env,
+   FILE *fp)
+{
+   __u32 *prog_lens = (__u32 *)(uintptr_t)(info->jited_func_lens);
+   __u64 *prog_addrs = (__u64 *)(uintptr_t)(info->jited_ksyms);
+   char name[KSYM_NAME_LEN];
+   struct btf *btf = NULL;
+   u32 sub_prog_cnt, i;
+
+   sub_prog_cnt = info->nr_jited_ksyms;
+   if (sub_prog_cnt != info->nr_prog_tags ||
+   sub_prog_cnt != info->nr_jited_func_lens)
+   return;
+
+   if (info->btf_id) {
+   struct btf_node *node;
+
+   node = perf_env__find_btf(env, info->btf_id);
+   if (node)
+   btf = btf__new((__u8 *)(node->data),
+  node->data_size);
+   }
+
+   if (sub_prog_cnt == 1) {
+   synthesize_bpf_prog_name(name, KSYM_NAME_LEN, info, btf, 0);
+   fprintf(fp, "# bpf_prog_info %u: %s addr 0x%llx size %u\n",
+   info->id, name, prog_addrs[0], prog_lens[0]);
+   return;
+   }
+
+   fprintf(fp, "# bpf_prog_info %u:\n", info->id);
+   for (i = 0; i < sub_prog_cnt; i++) {
+   synthesize_bpf_prog_name(name, KSYM_NAME_LEN, info, btf, i);
+
+   fprintf(fp, "# \tsub_prog %u: %s addr 0x%llx size %u\n",
+   i, name, prog_addrs[i], prog_lens[i]);
+   }
+}
diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
index 8cb1189149ec..04c33b3bfe28 100644
--- a/tools/perf/util/bpf-event.h
+++ b/tools/perf/util/bpf-event.h
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include "event.h"
+#include <stdio.h>
 
 struct machine;
 union perf_event;
@@ -38,7 +39,9 @@ int perf_event__synthesize_bpf_events(struct perf_session 
*session,
  struct record_opts *opts);
 int bpf_event__add_sb_event(struct perf_evlist **evlist,
 struct perf_env *env);
-
+void bpf_event__print_bpf_prog_info(struct bpf_prog_info *info,
+   struct perf_env *env,
+   FILE *fp);
 #else
 static inline int machine__process_bpf_event(struct machine *machine 
__maybe_unused,
 union perf_event *event 
__maybe_unused,
@@ -61,5 +64,11 @@ static inline int bpf_event__add_sb_event(struct perf_evlist 
**evlist __maybe_un
return 0;
 }
 
+static inline void bpf_event__print_bpf_prog_info(struct bpf_prog_info *info 
__maybe_unused,
+ struct perf_env *env 
__maybe_unused,
+ FILE *fp __maybe_unused)
+{
+
+}
 #endif // HAVE_LIBBPF_SUPPORT
 #endif
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 01dda2f65d36..b9e693825873 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1468,8 +1468,9 @@ static void print_bpf_prog_info(struct feat_fd *ff, FILE 
*fp)
 
  

[tip:perf/urgent] perf evlist: Introduce side band thread

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  657ee5531903339b06697581532ed32d4762526e
Gitweb: https://git.kernel.org/tip/657ee5531903339b06697581532ed32d4762526e
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:50 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 21 Mar 2019 11:27:03 -0300

perf evlist: Introduce side band thread

This patch introduces a side band thread that captures extended
information for events like PERF_RECORD_BPF_EVENT.

This new thread uses its own evlist, which uses a ring buffer with a very
low watermark for lower latency.

To use the side band thread, we need to:

1. add side band event(s) by calling perf_evlist__add_sb_event();
2. call perf_evlist__start_sb_thread();
3. at the end of the perf run, call perf_evlist__stop_sb_thread().

In the next patch, we use this thread to handle PERF_RECORD_BPF_EVENT.

Committer notes:

Add fix by Jiri Olsa for when the sb_thread can't get started and then at
the end the stop_sb_thread() segfaults when joining the (non-existing)
thread.

That can happen when running 'perf top' or 'perf record' as a normal
user, for instance.

Further checks need to be done on top of this to more gracefully handle
these possible failure scenarios.
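
The three steps above, together with the failure case from the committer
notes, can be sketched roughly as follows. This is only an illustration
built on plain pthreads: 'struct sb_ctx', sb_start() and sb_stop() are
hypothetical names, not the perf API, and the loop in sb_worker() is a
stand-in for polling the side band ring buffer. The point is the 'started'
flag, which keeps the stop path from joining a thread that never existed.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the side band state kept next to the evlist. */
struct sb_ctx {
	pthread_t   thread;
	bool        started;  /* true only after pthread_create() succeeded */
	atomic_bool done;     /* set by sb_stop() to make the worker exit */
	atomic_int  polls;    /* loop iterations, kept only for illustration */
};

static void *sb_worker(void *arg)
{
	struct sb_ctx *ctx = arg;

	assert(ctx != NULL);
	/* Stand-in for polling the low-watermark ring buffer. */
	while (!atomic_load(&ctx->done)) {
		atomic_fetch_add(&ctx->polls, 1);
		sched_yield();
	}
	return NULL;
}

/* Returns 0 on success, -1 if there is no context or the thread failed. */
static int sb_start(struct sb_ctx *ctx)
{
	if (ctx == NULL)
		return -1;

	ctx->started = false;
	atomic_store(&ctx->done, false);
	atomic_store(&ctx->polls, 0);

	if (pthread_create(&ctx->thread, NULL, sb_worker, ctx) != 0)
		return -1;

	ctx->started = true;
	return 0;
}

/* Safe on a NULL or zero-initialized context and after a failed start. */
static void sb_stop(struct sb_ctx *ctx)
{
	if (ctx == NULL || !ctx->started)
		return;  /* no thread to join */

	atomic_store(&ctx->done, true);
	pthread_join(ctx->thread, NULL);
	ctx->started = false;
}
```

A caller would zero-initialize the context, call sb_start() once after the
events are set up, and call sb_stop() unconditionally on the exit path.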

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Link: http://lkml.kernel.org/r/20190312053051.2690567-15-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-record.c |   9 
 tools/perf/builtin-top.c|   9 
 tools/perf/util/evlist.c| 119 
 tools/perf/util/evlist.h|  12 +
 tools/perf/util/evsel.h |   6 +++
 5 files changed, 155 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e79faccd7842..6f645fd72fed 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1137,6 +1137,7 @@ static int __cmd_record(struct record *rec, int argc, 
const char **argv)
struct perf_data *data = &rec->data;
struct perf_session *session;
bool disabled = false, draining = false;
+   struct perf_evlist *sb_evlist = NULL;
int fd;
 
atexit(record__sig_exit);
@@ -1237,6 +1238,11 @@ static int __cmd_record(struct record *rec, int argc, 
const char **argv)
goto out_child;
}
 
+   if (perf_evlist__start_sb_thread(sb_evlist, &rec->opts.target)) {
+   pr_debug("Couldn't start the BPF side band thread:\nBPF 
programs starting from now on won't be annotatable\n");
+   opts->no_bpf_event = true;
+   }
+
err = record__synthesize(rec, false);
if (err < 0)
goto out_child;
@@ -1487,6 +1493,9 @@ out_child:
 
 out_delete_session:
perf_session__delete(session);
+
+   if (!opts->no_bpf_event)
+   perf_evlist__stop_sb_thread(sb_evlist);
return status;
 }
 
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index c2ea22c4ea67..3ce8a8db6c1d 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1501,6 +1501,7 @@ int cmd_top(int argc, const char **argv)
"number of thread to run event synthesize"),
OPT_END()
};
+   struct perf_evlist *sb_evlist = NULL;
const char * const top_usage[] = {
"perf top []",
NULL
@@ -1636,8 +1637,16 @@ int cmd_top(int argc, const char **argv)
goto out_delete_evlist;
}
 
+   if (perf_evlist__start_sb_thread(sb_evlist, target)) {
+   pr_debug("Couldn't start the BPF side band thread:\nBPF 
programs starting from now on won't be annotatable\n");
+   opts->no_bpf_event = true;
+   }
+
status = __cmd_top(&top);
 
+   if (!opts->no_bpf_event)
+   perf_evlist__stop_sb_thread(sb_evlist);
+
 out_delete_evlist:
perf_evlist__delete(top.evlist);
perf_session__delete(top.session);
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ed20f4379956..ec78e93085de 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -19,6 +19,7 @@
 #include "debug.h"
 #include "units.h"
 #include "asm/bug.h"
+#include "bpf-event.h"
 #include 
 #include 
 
@@ -1856,3 +1857,121 @@ struct perf_evsel *perf_evlist__reset_weak_group(struct 
perf_evlist *evsel_list,
}
return leader;
 }
+
+int perf_evlist__add_sb_event(struct perf_evlist **evlist,
+ struct perf_event_attr *attr,
+ perf_evsel__sb_cb_t cb,
+ void *data)
+{
+   struct perf_evsel *evsel;
+   bool new_evlist = (*evlist) == NULL;
+
+   if (*evlist == NULL)
+   *evlist = perf_evlist__new();
+   if (*evlist == NULL)
+   return -1;
+
+   if (!attr->sample_id_all) {
+   pr_warning("enabling 

[tip:perf/urgent] perf top: Add option --no-bpf-event

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  ee7a112fbcc8edb4cf2f84ce5fcc2da7818fd4b8
Gitweb: https://git.kernel.org/tip/ee7a112fbcc8edb4cf2f84ce5fcc2da7818fd4b8
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:46 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf top: Add option --no-bpf-event

This patch adds the --no-bpf-event option to 'perf top', matching the
existing 'perf record' option of the same name.

The following patches will use this option.

Committer testing:

  # perf top -vv 2> /tmp/perf_event_attr.out
  # cat  /tmp/perf_event_attr.out
  
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  
  #

After this patch:

  # perf top --no-bpf-event -vv 2> /tmp/perf_event_attr.out
  # cat  /tmp/perf_event_attr.out
  
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
  
  #

Signed-off-by: Song Liu 
Tested-by: Arnaldo Carvalho de Melo 
Reviewed-by: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-11-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-top.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 77e6190211d2..c2ea22c4ea67 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1469,6 +1469,7 @@ int cmd_top(int argc, const char **argv)
"Display raw encoding of assembly instructions (default)"),
OPT_BOOLEAN(0, "demangle-kernel", _conf.demangle_kernel,
"Enable kernel symbol demangling"),
+   OPT_BOOLEAN(0, "no-bpf-event", _opts.no_bpf_event, "do not 
record bpf events"),
OPT_STRING(0, "objdump", _opts.objdump_path, "path",
"objdump binary to use for disassembly and annotations"),
OPT_STRING('M', "disassembler-style", 
_opts.disassembler_style, "disassembler style",


[tip:perf/urgent] perf build: Check what binutils's 'disassembler()' signature to use

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  8a1b1718214cfd945fef14b3031e4e7262882a86
Gitweb: https://git.kernel.org/tip/8a1b1718214cfd945fef14b3031e4e7262882a86
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:48 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 20 Mar 2019 16:42:10 -0300

perf build: Check what binutils's 'disassembler()' signature to use

Commit 003ca0fd2286 ("Refactor disassembler selection") in the binutils
repo changed the disassembler() function signature, so we must use the
feature test introduced in fb982666e380 ("tools/bpftool: fix bpftool
build with bintutils >= 2.9") to deal with that.

Committer testing:

After adding the missing function call to test-all.c, and:

  FEATURE_CHECK_LDFLAGS-disassembler-four-args = -lbfd -lopcodes

And the fallbacks for cases where we need -liberty and sometimes -lz to
tools/perf/Makefile.config, we get:

  $ make -C tools/perf O=/tmp/build/perf install-bin
  make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD:   Doing 'make -j8' parallel build

  Auto-detecting system features:
  ...                  dwarf: [ on  ]
  ...     dwarf_getlocations: [ on  ]
  ...                  glibc: [ on  ]
  ...                   gtk2: [ on  ]
  ...               libaudit: [ on  ]
  ...                 libbfd: [ on  ]
  ...                 libelf: [ on  ]
  ...                libnuma: [ on  ]
  ... numa_num_possible_cpus: [ on  ]
  ...                libperl: [ on  ]
  ...              libpython: [ on  ]
  ...               libslang: [ on  ]
  ...              libcrypto: [ on  ]
  ...              libunwind: [ on  ]
  ...     libdw-dwarf-unwind: [ on  ]
  ...                   zlib: [ on  ]
  ...                   lzma: [ on  ]
  ...              get_cpuid: [ on  ]
  ...                    bpf: [ on  ]
  ...                 libaio: [ on  ]
  ... disassembler-four-args: [ on  ]
CC   /tmp/build/perf/jvmti/libjvmti.o
CC   /tmp/build/perf/builtin-bench.o
  
  $
  $

The feature detection test-all.bin gets successfully built and linked:

  $ ls -la /tmp/build/perf/feature/test-all.bin
  -rwxrwxr-x. 1 acme acme 2680352 Mar 19 11:07 /tmp/build/perf/feature/test-all.bin
  $ nm /tmp/build/perf/feature/test-all.bin  | grep -w disassembler
  00061f90 T disassembler
  $

Time to move on to the patches that make use of this disassembler()
routine in binutils's libopcodes.

Signed-off-by: Song Liu 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Jakub Kicinski 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Roman Gushchin 
Cc: Stanislav Fomichev 
Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubrav...@fb.com
[ split from a larger patch, added missing FEATURE_CHECK_LDFLAGS-disassembler-four-args ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/build/Makefile.feature   | 6 --
 tools/build/feature/test-all.c | 5 +
 tools/perf/Makefile.config | 9 +
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 61e46d54a67c..8d3864b061f3 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -66,7 +66,8 @@ FEATURE_TESTS_BASIC :=  \
 sched_getcpu   \
 sdt\
 setns  \
-libaio
+libaio \
+disassembler-four-args
 
 # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
 # of all feature tests
@@ -118,7 +119,8 @@ FEATURE_DISPLAY ?=  \
  lzma   \
  get_cpuid  \
  bpf   \
- libaio
+ libaio\
+ disassembler-four-args
 
 # Set FEATURE_CHECK_(C|LD)FLAGS-all for all FEATURE_TESTS features.
 # If in the future we need per-feature checks/flags for features not
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index e903b86b742f..7853e6d91090 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -178,6 +178,10 @@
 # include "test-reallocarray.c"
 #undef main
 
+#define main main_test_disassembler_four_args
+# include "test-disassembler-four-args.c"
+#undef main
+
 int main(int argc, char *argv[])
 {
main_test_libpython();
@@ -219,6 +223,7 @@ int main(int argc, char *argv[])
main_test_setns();
main_test_libaio();
main_test_reallocarray();
+   main_test_disassembler_four_args();
 
return 0;
 }
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index df4ad45599ca..fe3f97e342fa 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -227,6 +227,8 @@ FEATURE_CHECK_LDFLAGS-libpython-version := 
$(PYTHON_EMBED_LDOPTS)
 

[tip:perf/urgent] perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotation

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  3ca3877a9732b68cf0289367a859f6c163a79bfa
Gitweb: https://git.kernel.org/tip/3ca3877a9732b68cf0289367a859f6c163a79bfa
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:49 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotation

This patch adds processing of PERF_BPF_EVENT_PROG_LOAD: for each memory
region mapped to a BPF program, it sets the DSO binary type to
DSO_BINARY_TYPE__BPF_PROG_INFO and records the program id, sub-program
index and perf_env in the DSO.
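
As a rough sketch of that processing step, assuming a flat array of maps
instead of perf's map groups: every name starting with 'demo_' is
hypothetical, and demo_find_map() merely stands in for the real address
lookup. Each JITed sub-program address is looked up, and the map that
covers it is tagged with the program id and the sub-program index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_binary_type { DEMO_TYPE_NOT_FOUND, DEMO_TYPE_BPF_PROG_INFO };

/* Hypothetical stand-in for a kernel map plus the DSO fields being set. */
struct demo_map {
	uint64_t start, end;          /* covers addresses in [start, end) */
	enum demo_binary_type type;
	uint32_t prog_id, sub_id;
};

/* Linear lookup standing in for the real address-to-map search. */
static struct demo_map *demo_find_map(struct demo_map *maps, size_t n,
				      uint64_t addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (addr >= maps[i].start && addr < maps[i].end)
			return &maps[i];
	return NULL;
}

/* Tag the map of every JITed sub-program; returns how many were tagged. */
static size_t demo_tag_bpf_maps(struct demo_map *maps, size_t n,
				const uint64_t *ksyms, uint32_t nr_ksyms,
				uint32_t prog_id)
{
	size_t tagged = 0;
	uint32_t i;

	assert(maps != NULL || n == 0);
	for (i = 0; i < nr_ksyms; i++) {
		struct demo_map *map = demo_find_map(maps, n, ksyms[i]);

		if (map == NULL)
			continue;  /* address not mapped: nothing to tag */

		map->type = DEMO_TYPE_BPF_PROG_INFO;
		map->prog_id = prog_id;
		map->sub_id = i;   /* index of the sub-program in the array */
		tagged++;
	}
	return tagged;
}
```

Addresses without a matching map are skipped silently, which mirrors the
'if (map)' check in the diff below.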

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-14-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/bpf-event.c | 54 +
 1 file changed, 54 insertions(+)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index a4fc52b4ffae..852e960692cb 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -12,6 +12,7 @@
 #include "machine.h"
 #include "env.h"
 #include "session.h"
+#include "map.h"
 
 #define ptr_to_u64(ptr)((__u64)(unsigned long)(ptr))
 
@@ -25,12 +26,65 @@ static int snprintf_hex(char *buf, size_t size, unsigned 
char *data, size_t len)
return ret;
 }
 
+static int machine__process_bpf_event_load(struct machine *machine,
+  union perf_event *event,
+  struct perf_sample *sample 
__maybe_unused)
+{
+   struct bpf_prog_info_linear *info_linear;
+   struct bpf_prog_info_node *info_node;
+   struct perf_env *env = machine->env;
+   int id = event->bpf_event.id;
+   unsigned int i;
+
+   /* perf-record, no need to handle bpf-event */
+   if (env == NULL)
+   return 0;
+
+   info_node = perf_env__find_bpf_prog_info(env, id);
+   if (!info_node)
+   return 0;
+   info_linear = info_node->info_linear;
+
+   for (i = 0; i < info_linear->info.nr_jited_ksyms; i++) {
+   u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
+   u64 addr = addrs[i];
+   struct map *map;
+
+   map = map_groups__find(&machine->kmaps, addr);
+
+   if (map) {
+   map->dso->binary_type = DSO_BINARY_TYPE__BPF_PROG_INFO;
+   map->dso->bpf_prog.id = id;
+   map->dso->bpf_prog.sub_id = i;
+   map->dso->bpf_prog.env = env;
+   }
+   }
+   return 0;
+}
+
 int machine__process_bpf_event(struct machine *machine __maybe_unused,
   union perf_event *event,
   struct perf_sample *sample __maybe_unused)
 {
if (dump_trace)
perf_event__fprintf_bpf_event(event, stdout);
+
+   switch (event->bpf_event.type) {
+   case PERF_BPF_EVENT_PROG_LOAD:
+   return machine__process_bpf_event_load(machine, event, sample);
+
+   case PERF_BPF_EVENT_PROG_UNLOAD:
+   /*
+* Do not free bpf_prog_info and btf of the program here,
+* as annotation still need them. They will be freed at
+* the end of the session.
+*/
+   break;
+   default:
+   pr_debug("unexpected bpf_event type of %d\n",
+event->bpf_event.type);
+   break;
+   }
return 0;
 }
 


[tip:perf/urgent] perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  9b86d04d53b98399017fea44e9047165ffe12d42
Gitweb: https://git.kernel.org/tip/9b86d04d53b98399017fea44e9047165ffe12d42
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:48 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO

Introduce a new dso type DSO_BINARY_TYPE__BPF_PROG_INFO for BPF programs. In
symbol__disassemble(), a DSO_BINARY_TYPE__BPF_PROG_INFO dso will call into a
new function symbol__disassemble_bpf() in an upcoming patch, where annotation
line information is filled in based on the bpf_prog_info and btf saved in the
given perf_env.

Committer notes:

Removed the unnamed union with 'bpf_prog' and 'cache' in 'struct dso',
to fix this bug when exiting 'perf top':

  # perf top
  perf: Segmentation fault
   backtrace 
  perf[0x5a785a]
  /lib64/libc.so.6(+0x385bf)[0x7fd68443c5bf]
  perf(rb_first+0x2b)[0x4d6eeb]
  perf(dso__delete+0xb7)[0x4dffb7]
  perf[0x4f9e37]
  perf(perf_session__delete+0x64)[0x504df4]
  perf(cmd_top+0x1957)[0x454467]
  perf[0x4aad18]
  perf(main+0x61c)[0x42ec7c]
  /lib64/libc.so.6(__libc_start_main+0xf2)[0x7fd684428412]
  perf(_start+0x2d)[0x42eead]
  #
  # addr2line -fe ~/bin/perf 0x4dffb7
  dso_cache__free
  /home/acme/git/perf/tools/perf/util/dso.c:713

That is trying to access the dso->data.cache, and that is not used with
BPF programs, so we end up accessing what is in bpf_prog.first_member,
b00m.
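
The aliasing behind that crash can be reproduced in isolation. A minimal
sketch, assuming simplified stand-in types (all 'fake_' and 'dso_' names
below are hypothetical and only mimic the shape of 'struct dso'): when
'bpf_prog' shares storage with the cache root, storing a BPF program id
also changes the bytes read back as the cache root pointer, whereas
separate members leave the cache root untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct fake_rb_root  { void *rb_node; };  /* stand-in for struct rb_root */
struct fake_bpf_prog { uint32_t id; uint32_t sub_id; void *env; };

/* Layout with the union: cache and bpf_prog overlap. */
struct dso_with_union {
	union {
		struct fake_rb_root  cache;
		struct fake_bpf_prog bpf_prog;
	};
};

/* Layout without the union: each member has its own storage. */
struct dso_separate {
	struct fake_rb_root  cache;
	struct fake_bpf_prog bpf_prog;
};

/* Returns nonzero if storing the BPF id leaves the cache root non-NULL. */
static int union_cache_clobbered(uint32_t id)
{
	struct dso_with_union d;

	memset(&d, 0, sizeof(d));  /* cache root starts out empty */
	assert(d.cache.rb_node == NULL);
	d.bpf_prog.id = id;        /* the BPF path fills in its own fields */
	return d.cache.rb_node != NULL;
}

/* Same sequence on the fixed layout. */
static int separate_cache_clobbered(uint32_t id)
{
	struct dso_separate d;

	memset(&d, 0, sizeof(d));
	d.bpf_prog.id = id;
	return d.cache.rb_node != NULL;
}
```

With the union, a cleanup routine that walks the cache tree would start
from a bogus root, which matches the rb_first() frame in the backtrace.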

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubrav...@fb.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/dso.c| 1 +
 tools/perf/util/dso.h| 8 
 tools/perf/util/symbol.c | 1 +
 3 files changed, 10 insertions(+)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index ab8a455d2283..e059976d9d93 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -184,6 +184,7 @@ int dso__read_binary_type_filename(const struct dso *dso,
case DSO_BINARY_TYPE__KALLSYMS:
case DSO_BINARY_TYPE__GUEST_KALLSYMS:
case DSO_BINARY_TYPE__JAVA_JIT:
+   case DSO_BINARY_TYPE__BPF_PROG_INFO:
case DSO_BINARY_TYPE__NOT_FOUND:
ret = -1;
break;
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index bb417c54c25a..6e3f63781e51 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -14,6 +14,7 @@
 
 struct machine;
 struct map;
+struct perf_env;
 
 enum dso_binary_type {
DSO_BINARY_TYPE__KALLSYMS = 0,
@@ -35,6 +36,7 @@ enum dso_binary_type {
DSO_BINARY_TYPE__KCORE,
DSO_BINARY_TYPE__GUEST_KCORE,
DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO,
+   DSO_BINARY_TYPE__BPF_PROG_INFO,
DSO_BINARY_TYPE__NOT_FOUND,
 };
 
@@ -189,6 +191,12 @@ struct dso {
u64  debug_frame_offset;
u64  eh_frame_hdr_offset;
} data;
+   /* bpf prog information */
+   struct {
+   u32 id;
+   u32 sub_id;
+   struct perf_env *env;
+   } bpf_prog;
 
union { /* Tool specific area */
void *priv;
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 58442ca5e3c4..5cbad55cd99d 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1455,6 +1455,7 @@ static bool dso__is_compatible_symtab_type(struct dso 
*dso, bool kmod,
case DSO_BINARY_TYPE__BUILD_ID_CACHE_DEBUGINFO:
return true;
 
+   case DSO_BINARY_TYPE__BPF_PROG_INFO:
case DSO_BINARY_TYPE__NOT_FOUND:
default:
return false;


[tip:perf/urgent] perf bpf: Save BTF in a rbtree in perf_env

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  3792cb2ff43b1b193136a03ce1336462a827d792
Gitweb: https://git.kernel.org/tip/3792cb2ff43b1b193136a03ce1336462a827d792
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:44 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf bpf: Save BTF in a rbtree in perf_env

BTF contains information necessary to annotate BPF programs. This patch
saves BTF for BPF programs loaded in the system.
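
The storage scheme can be sketched as below. This is a simplified
illustration, not the perf code: it uses a plain unbalanced binary search
tree instead of the kernel rbtree, omits locking, and every 'btf_blob'
name is hypothetical. It keeps the two ideas visible in the diff: a node
allocated together with its raw BTF bytes through a flexible array member,
and an insert that refuses a second node with the same id.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: header plus raw BTF bytes in one allocation. */
struct btf_blob {
	struct btf_blob *left, *right;
	uint32_t id;
	uint32_t data_size;
	char data[];               /* flexible array member */
};

static struct btf_blob *btf_blob_new(uint32_t id, const void *data,
				     uint32_t size)
{
	struct btf_blob *n = malloc(sizeof(*n) + size);

	if (n == NULL)
		return NULL;
	n->left = n->right = NULL;
	n->id = id;
	n->data_size = size;
	if (size)
		memcpy(n->data, data, size);
	return n;
}

/* Returns 0 if linked in, -1 on a duplicate id (caller still owns node). */
static int btf_tree_insert(struct btf_blob **root, struct btf_blob *node)
{
	struct btf_blob **p = root;

	assert(root != NULL && node != NULL);
	while (*p != NULL) {
		if (node->id < (*p)->id)
			p = &(*p)->left;
		else if (node->id > (*p)->id)
			p = &(*p)->right;
		else
			return -1;
	}
	*p = node;
	return 0;
}

/* Returns the node with the given id, or NULL if it is not in the tree. */
static struct btf_blob *btf_tree_find(struct btf_blob *root, uint32_t id)
{
	while (root != NULL) {
		if (id < root->id)
			root = root->left;
		else if (id > root->id)
			root = root->right;
		else
			return root;
	}
	return NULL;
}
```

Unlike the lookup in the diff, this sketch returns NULL explicitly when
the id is absent, so a caller can test the result directly.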

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-9-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/bpf-event.c | 23 
 tools/perf/util/bpf-event.h |  7 +
 tools/perf/util/env.c   | 67 +
 tools/perf/util/env.h   |  5 
 4 files changed, 102 insertions(+)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 37ee4e2a728a..a4fc52b4ffae 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -34,6 +34,28 @@ int machine__process_bpf_event(struct machine *machine 
__maybe_unused,
return 0;
 }
 
+static int perf_env__fetch_btf(struct perf_env *env,
+  u32 btf_id,
+  struct btf *btf)
+{
+   struct btf_node *node;
+   u32 data_size;
+   const void *data;
+
+   data = btf__get_raw_data(btf, &data_size);
+
+   node = malloc(data_size + sizeof(struct btf_node));
+   if (!node)
+   return -1;
+
+   node->id = btf_id;
+   node->data_size = data_size;
+   memcpy(node->data, data, data_size);
+
+   perf_env__insert_btf(env, node);
+   return 0;
+}
+
 /*
  * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
  * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
@@ -113,6 +135,7 @@ static int perf_event__synthesize_one_bpf_prog(struct 
perf_session *session,
goto out;
}
has_btf = true;
+   perf_env__fetch_btf(env, info->btf_id, btf);
}
 
/* Synthesize PERF_RECORD_KSYMBOL */
diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
index fad932f7404f..b9ec394dc7c7 100644
--- a/tools/perf/util/bpf-event.h
+++ b/tools/perf/util/bpf-event.h
@@ -16,6 +16,13 @@ struct bpf_prog_info_node {
struct rb_node  rb_node;
 };
 
+struct btf_node {
+   struct rb_node  rb_node;
+   u32 id;
+   u32 data_size;
+   chardata[];
+};
+
 #ifdef HAVE_LIBBPF_SUPPORT
 int machine__process_bpf_event(struct machine *machine, union perf_event 
*event,
   struct perf_sample *sample);
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 98cd36f0e317..c6351b557bb0 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -64,6 +64,58 @@ struct bpf_prog_info_node 
*perf_env__find_bpf_prog_info(struct perf_env *env,
return node;
 }
 
+void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node)
+{
+   struct rb_node *parent = NULL;
+   __u32 btf_id = btf_node->id;
+   struct btf_node *node;
+   struct rb_node **p;
+
+   down_write(&env->bpf_progs.lock);
+   p = &env->bpf_progs.btfs.rb_node;
+
+   while (*p != NULL) {
+   parent = *p;
+   node = rb_entry(parent, struct btf_node, rb_node);
+   if (btf_id < node->id) {
+   p = &(*p)->rb_left;
+   } else if (btf_id > node->id) {
+   p = &(*p)->rb_right;
+   } else {
+   pr_debug("duplicated btf %u\n", btf_id);
+   goto out;
+   }
+   }
+
+   rb_link_node(&btf_node->rb_node, parent, p);
+   rb_insert_color(&btf_node->rb_node, &env->bpf_progs.btfs);
+   env->bpf_progs.btfs_cnt++;
+out:
+   up_write(&env->bpf_progs.lock);
+}
+
+struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id)
+{
+   struct btf_node *node = NULL;
+   struct rb_node *n;
+
+   down_read(&env->bpf_progs.lock);
+   n = env->bpf_progs.btfs.rb_node;
+
+   while (n) {
+   node = rb_entry(n, struct btf_node, rb_node);
+   if (btf_id < node->id)
+   n = n->rb_left;
+   else if (btf_id > node->id)
+   n = n->rb_right;
+   else
+   break;
+   }
+
+   up_read(&env->bpf_progs.lock);
+   return node;
+}
+
 /* purge data in bpf_progs.infos tree */
 static void perf_env__purge_bpf(struct perf_env *env)
 {
@@ -86,6 +138,20 @@ static void perf_env__purge_bpf(struct perf_env *env)
 
env->bpf_progs.infos_cnt = 0;
 
+   root = &env->bpf_progs.btfs;
+   next = rb_first(root);
+
+   

[tip:perf/urgent] perf feature detection: Add -lopcodes to feature-libbfd

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  31be9478ed7f43d6351e0d5a2257ca76609c83d3
Gitweb: https://git.kernel.org/tip/31be9478ed7f43d6351e0d5a2257ca76609c83d3
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:47 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf feature detection: Add -lopcodes to feature-libbfd

Both libbfd and libopcodes are distributed with binutils-dev/devel. When
libbfd is present, it is OK to assume that libopcodes is also present. This
has been a safe assumption for bpftool.

This patch adds -lopcodes to perf/Makefile.config. libopcodes will be
used in the next commit for BPF annotation.

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-12-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Makefile.config | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 0f11d5891301..df4ad45599ca 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -713,7 +713,7 @@ else
 endif
 
 ifeq ($(feature-libbfd), 1)
-  EXTLIBS += -lbfd
+  EXTLIBS += -lbfd -lopcodes
 else
   # we are on a system that requires -liberty and (maybe) -lz
   # to link against -lbfd; test each case individually here
@@ -724,10 +724,10 @@ else
   $(call feature_check,libbfd-liberty-z)
 
   ifeq ($(feature-libbfd-liberty), 1)
-EXTLIBS += -lbfd -liberty
+EXTLIBS += -lbfd -lopcodes -liberty
   else
 ifeq ($(feature-libbfd-liberty-z), 1)
-  EXTLIBS += -lbfd -liberty -lz
+  EXTLIBS += -lbfd -lopcodes -liberty -lz
 endif
   endif
 endif


[tip:perf/urgent] perf bpf: Save BTF information as headers to perf.data

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  a70a1123174ab592c5fa8ecf09f9fad9b335b872
Gitweb: https://git.kernel.org/tip/a70a1123174ab592c5fa8ecf09f9fad9b335b872
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:45 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf bpf: Save BTF information as headers to perf.data

This patch enables 'perf record' to save BTF information as headers to
perf.data.

A new header type HEADER_BPF_BTF is introduced for this data.
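
Judging from write_bpf_btf() in the diff, the section body is a u32 count
followed, for each BTF object, by its id, its data_size and the raw bytes.
The sketch below encodes and decodes that layout in a memory buffer. It is
an illustration under stated assumptions: 'struct btf_record',
btf_section_write() and btf_section_read() are hypothetical helpers, the
data is kept in native byte order, and no attempt is made to match the
rest of the perf.data feature framing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view of one BTF object. */
struct btf_record {
	uint32_t id;
	uint32_t data_size;
	const unsigned char *data;
};

/* Layout: u32 count, then per record: u32 id, u32 data_size, raw bytes.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t btf_section_write(unsigned char *buf, size_t cap,
				const struct btf_record *recs, uint32_t cnt)
{
	size_t off = 0;
	uint32_t i;

	if (cap < sizeof(cnt))
		return 0;
	memcpy(buf, &cnt, sizeof(cnt));
	off += sizeof(cnt);

	for (i = 0; i < cnt; i++) {
		size_t need = 2 * sizeof(uint32_t) + recs[i].data_size;

		if (cap - off < need)
			return 0;
		memcpy(buf + off, &recs[i].id, sizeof(uint32_t));
		memcpy(buf + off + 4, &recs[i].data_size, sizeof(uint32_t));
		off += 2 * sizeof(uint32_t);
		if (recs[i].data_size)
			memcpy(buf + off, recs[i].data, recs[i].data_size);
		off += recs[i].data_size;
	}
	return off;
}

/* Decodes record number idx; out->data points into buf.
 * Returns 0 on success, -1 if idx is out of range or buf is truncated. */
static int btf_section_read(const unsigned char *buf, size_t len,
			    uint32_t idx, struct btf_record *out)
{
	size_t off = sizeof(uint32_t);
	uint32_t cnt, i;

	assert(buf != NULL && out != NULL);
	if (len < sizeof(cnt))
		return -1;
	memcpy(&cnt, buf, sizeof(cnt));
	if (idx >= cnt)
		return -1;

	for (i = 0; ; i++) {
		if (len - off < 2 * sizeof(uint32_t))
			return -1;
		memcpy(&out->id, buf + off, sizeof(uint32_t));
		memcpy(&out->data_size, buf + off + 4, sizeof(uint32_t));
		off += 2 * sizeof(uint32_t);
		if (len - off < out->data_size)
			return -1;
		out->data = buf + off;
		if (i == idx)
			return 0;
		off += out->data_size;
	}
}
```

Because each record carries its own data_size, a reader can walk the
section sequentially without any separate index.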

Committer testing:

As root, being on the kernel sources top level directory, run:

# perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg

Just to compile and load a BPF program that attaches to the
raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending
in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc).

Make sure you have a recent enough clang, say version 9, to get the
BTF ELF sections needed for this testing:

  # clang --version | head -1
  clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500)
  # readelf -SW tools/perf/examples/bpf/augmented_raw_syscalls.o | grep BTF
    [22] .BTF          PROGBITS 000ede 000b0e 00   0   0  1
    [23] .BTF.ext      PROGBITS 0019ec 0002a0 00   0   0  1
    [24] .rel.BTF.ext  REL      002fa8 000270 10  30  23  8

Then do a systemwide perf record session for a few seconds:

  # perf record -a sleep 2s

Then look at:

  # perf report --header-only | grep b[pt]f
  # event : name = cycles:ppp, , id = { 1116204, 1116205, 1116206, 1116207, 
1116208, 1116209, 1116210, 1116211 }, size = 112, { sample_period, sample_freq 
} = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, 
inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, 
precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, 
ksymbol = 1, bpf_event = 1
  # bpf_prog_info of id 13
  # bpf_prog_info of id 14
  # bpf_prog_info of id 15
  # bpf_prog_info of id 16
  # bpf_prog_info of id 17
  # bpf_prog_info of id 18
  # bpf_prog_info of id 21
  # bpf_prog_info of id 22
  # bpf_prog_info of id 51
  # bpf_prog_info of id 52
  # btf info of id 8
  #

We need to show more info about these BPF and BTF entries, but that can
be done later.

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-10-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/header.c | 101 ++-
 tools/perf/util/header.h |   1 +
 2 files changed, 101 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index e6a81af516f6..01dda2f65d36 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -928,6 +928,39 @@ static int write_bpf_prog_info(struct feat_fd *ff 
__maybe_unused,
 }
 #endif // HAVE_LIBBPF_SUPPORT
 
+static int write_bpf_btf(struct feat_fd *ff,
+struct perf_evlist *evlist __maybe_unused)
+{
+   struct perf_env *env = &ff->ph->env;
+   struct rb_root *root;
+   struct rb_node *next;
+   int ret;
+
+   down_read(&env->bpf_progs.lock);
+
+   ret = do_write(ff, &env->bpf_progs.btfs_cnt,
+  sizeof(env->bpf_progs.btfs_cnt));
+
+   if (ret < 0)
+   goto out;
+
+   root = &env->bpf_progs.btfs;
+   next = rb_first(root);
+   while (next) {
+   struct btf_node *node;
+
+   node = rb_entry(next, struct btf_node, rb_node);
+   next = rb_next(&node->rb_node);
+   ret = do_write(ff, &node->id,
+  sizeof(u32) * 2 + node->data_size);
+   if (ret < 0)
+   goto out;
+   }
+out:
+   up_read(&env->bpf_progs.lock);
+   return ret;
+}
+
 static int cpu_cache_level__sort(const void *a, const void *b)
 {
struct cpu_cache_level *cache_a = (struct cpu_cache_level *)a;
@@ -1442,6 +1475,28 @@ static void print_bpf_prog_info(struct feat_fd *ff, FILE 
*fp)
up_read(&env->bpf_progs.lock);
 }
 
+static void print_bpf_btf(struct feat_fd *ff, FILE *fp)
+{
+   struct perf_env *env = &ff->ph->env;
+   struct rb_root *root;
+   struct rb_node *next;
+
+   down_read(&env->bpf_progs.lock);
+
+   root = &env->bpf_progs.btfs;
+   next = rb_first(root);
+
+   while (next) {
+   struct btf_node *node;
+
+   node = rb_entry(next, struct btf_node, rb_node);
+   next = rb_next(&node->rb_node);
+   fprintf(fp, "# btf info of id %u\n", node->id);
+   }
+
+   up_read(&env->bpf_progs.lock);
+}
+
 static void 

[tip:perf/urgent] perf bpf: Save bpf_prog_info information as headers to perf.data

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  606f972b1361f477cbd4e6e8ac00742fde4b39db
Gitweb: https://git.kernel.org/tip/606f972b1361f477cbd4e6e8ac00742fde4b39db
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:43 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:06 -0300

perf bpf: Save bpf_prog_info information as headers to perf.data

This patch enables perf-record to save bpf_prog_info information as
headers to perf.data. A new header type HEADER_BPF_PROG_INFO is
introduced for this data.

Committer testing:

As root, being on the kernel sources top level directory, run:

  # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg

Just to compile and load a BPF program that attaches to the
raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending
in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc).

Then do a systemwide perf record session for a few seconds:

  # perf record -a sleep 2s

Then look at:

  # perf report --header-only | grep -i bpf
  # bpf_prog_info of id 13
  # bpf_prog_info of id 14
  # bpf_prog_info of id 15
  # bpf_prog_info of id 16
  # bpf_prog_info of id 17
  # bpf_prog_info of id 18
  # bpf_prog_info of id 21
  # bpf_prog_info of id 22
  # bpf_prog_info of id 208
  # bpf_prog_info of id 209
  #

We need to show more info about these programs, like bpftool does for
the ones running on the system, i.e. 'perf record/perf report' become a
way of saving the BPF state in a machine to then analyse on another,
together with all the other information that is already saved in the
perf.data header:

  # perf report --header-only
  # 
  # captured on: Tue Mar 12 11:42:13 2019
  # header version : 1
  # data offset: 296
  # data size  : 16294184
  # feat offset: 16294480
  # hostname : quaco
  # os release : 5.0.0+
  # perf version : 5.0.gd783c8
  # arch : x86_64
  # nrcpus online : 8
  # nrcpus avail : 8
  # cpudesc : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
  # cpuid : GenuineIntel,6,142,10
  # total memory : 24555720 kB
  # cmdline : /home/acme/bin/perf (deleted) record -a
  # event : name = cycles:ppp, , id = { 3190123, 3190124, 3190125, 3190126, 
3190127, 3190128, 3190129, 3190130 }, size = 112, { sample_period, sample_freq 
} = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, 
inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, 
sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1
  # CPU_TOPOLOGY info available, use -I to display
  # NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: intel_pt = 8, software = 1, power = 11, uprobe = 7, uncore_imc = 12, cpu = 4, cstate_core = 18, uncore_cbox_2 = 15, breakpoint = 5, uncore_cbox_0 = 13, tracepoint = 2, cstate_pkg = 19, uncore_arb = 17, kprobe = 6, i915 = 10, msr = 9, uncore_cbox_3 = 16, uncore_cbox_1 = 14
  # CACHE info available, use -I to display
  # time of first sample : 116392.441701
  # time of last sample : 116400.932584
  # sample duration :   8490.883 ms
  # MEM_TOPOLOGY info available, use -I to display
  # bpf_prog_info of id 13
  # bpf_prog_info of id 14
  # bpf_prog_info of id 15
  # bpf_prog_info of id 16
  # bpf_prog_info of id 17
  # bpf_prog_info of id 18
  # bpf_prog_info of id 21
  # bpf_prog_info of id 22
  # bpf_prog_info of id 208
  # bpf_prog_info of id 209
  # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT
  # 
  #

Committer notes:

We can't use libbpf unconditionally, as perf may have been built with
NO_LIBBPF, in which case we end up with linking errors. So provide dummy
{process,write}_bpf_prog_info() functions, guarded by
HAVE_LIBBPF_SUPPORT, for that case.

Printing is not affected by this, so it can continue as is.
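
The stub arrangement described in the committer notes can be sketched as
below. This is only an illustration of the pattern, not the code from
header.c: struct sketch_feat_ctx and sketch_write_prog_info() are
hypothetical names, and the "real" branch just counts calls instead of
calling into libbpf.

```c
/* Hypothetical stand-in for the state a feature writer works on. */
struct sketch_feat_ctx {
	int nr_written;
};

#ifdef HAVE_LIBBPF_SUPPORT
/*
 * Built with libbpf: the real writer would serialize the program info
 * here. This sketch only counts the call.
 */
static int sketch_write_prog_info(struct sketch_feat_ctx *ctx)
{
	ctx->nr_written++;
	return 0;
}
#else
/*
 * Built with NO_LIBBPF: a dummy with the same signature, so callers
 * still compile and link, and nothing is written.
 */
static int sketch_write_prog_info(struct sketch_feat_ctx *ctx)
{
	(void)ctx;
	return 0;
}
#endif
```

Callers use the same function name in both configurations, which keeps
the #ifdef out of the call sites.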

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-8-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/header.c | 153 ++-
 tools/perf/util/header.h |   1 +
 2 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index b0683bf4d9f3..e6a81af516f6 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "evlist.h"
 #include "evsel.h"
@@ -40,6 +41,7 @@
 #include "time-utils.h"
 #include "units.h"
 #include "cputopo.h"
+#include "bpf-event.h"
 
 #include "sane_ctype.h"
 
@@ -876,6 +878,56 @@ static int write_dir_format(struct feat_fd *ff,
return do_write(ff, &data->dir.version, sizeof(data->dir.version));
 }
 
+#ifdef HAVE_LIBBPF_SUPPORT
+static int write_bpf_prog_info(struct feat_fd *ff,
+  struct perf_evlist *evlist __maybe_unused)
+{
+   struct perf_env *env = &ff->ph->env;

[tip:perf/urgent] perf bpf: Save bpf_prog_info in a rbtree in perf_env

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  e4378f0cb90be0368c48baad69a99203c58e3196
Gitweb: https://git.kernel.org/tip/e4378f0cb90be0368c48baad69a99203c58e3196
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:42 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:06 -0300

perf bpf: Save bpf_prog_info in a rbtree in perf_env

bpf_prog_info contains information necessary to annotate bpf programs.

This patch saves bpf_prog_info for bpf programs loaded in the system.

The big picture for the next few patches:

To fully annotate BPF programs with source code mapping, four different
pieces of information are needed:

1) PERF_RECORD_KSYMBOL
2) PERF_RECORD_BPF_EVENT
3) bpf_prog_info
4) btf

Before this set, 1) and 2) are already saved to the perf.data file. For
BPF programs that were loaded before perf runs, 1) and 2) are
synthesized by perf_event__synthesize_bpf_events(). For short-lived BPF
programs, 1) and 2) are generated by the kernel.

This set handles 3) and 4). Again, existing BPF programs and short-lived
programs have to be handled separately.

This patch handles 3) for existing BPF programs while synthesizing 1) and
2) in perf_event__synthesize_bpf_events(). This data is stored in
perf_env. The next patch saves it from perf_env to perf.data as
headers.

Similarly, the two patches after the next save 4) for existing BPF
programs to perf_env and perf.data.

A later patch will handle 3) and 4) for short-lived BPF programs by
monitoring 1) and 2) in a dedicated thread.
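
As a rough illustration of keeping per-program info in a tree keyed by
program id, here is a minimal sketch. It uses a plain unbalanced binary
search tree as a stand-in for the rbtree named in the subject, and
struct sketch_info_node plus both helpers are hypothetical names, not
the ones added to env.c.

```c
#include <stddef.h>

/* One node per BPF program; the program id is the lookup key. */
struct sketch_info_node {
	unsigned int id;
	struct sketch_info_node *left;
	struct sketch_info_node *right;
};

/*
 * Insert node into the tree. If a node with the same id is already
 * present, leave the tree unchanged and return the existing node.
 */
static struct sketch_info_node *
sketch_info_insert(struct sketch_info_node **root, struct sketch_info_node *node)
{
	struct sketch_info_node **p = root;

	while (*p) {
		if (node->id < (*p)->id)
			p = &(*p)->left;
		else if (node->id > (*p)->id)
			p = &(*p)->right;
		else
			return *p;
	}
	node->left = NULL;
	node->right = NULL;
	*p = node;
	return node;
}

/* Look up a program by id; returns NULL when it is not in the tree. */
static struct sketch_info_node *
sketch_info_find(struct sketch_info_node *root, unsigned int id)
{
	while (root) {
		if (id < root->id)
			root = root->left;
		else if (id > root->id)
			root = root->right;
		else
			return root;
	}
	return NULL;
}
```

A balanced tree gives the same interface with bounded lookup cost, which
matters once many programs are loaded.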

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-7-songliubrav...@fb.com
[ set env->bpf_progs.infos_cnt to zero in perf_env__purge_bpf() as noted by jolsa ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/perf.c   |  1 +
 tools/perf/util/bpf-event.c | 30 +++-
 tools/perf/util/bpf-event.h |  7 +++-
 tools/perf/util/env.c   | 88 +
 tools/perf/util/env.h   | 19 ++
 tools/perf/util/session.c   |  1 +
 6 files changed, 144 insertions(+), 2 deletions(-)

diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index a11cb006f968..72df4b6fa36f 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -298,6 +298,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
use_pager = 1;
commit_pager_choice();
 
+   perf_env__init(&perf_env);
perf_env__set_cmdline(&perf_env, argc, argv);
status = p->fn(argc, argv);
perf_config__exit();
diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 5237e8f11997..37ee4e2a728a 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -10,6 +10,7 @@
 #include "debug.h"
 #include "symbol.h"
 #include "machine.h"
+#include "env.h"
 #include "session.h"
 
 #define ptr_to_u64(ptr)((__u64)(unsigned long)(ptr))
@@ -54,17 +55,28 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
struct bpf_event *bpf_event = &event->bpf_event;
struct bpf_prog_info_linear *info_linear;
struct perf_tool *tool = session->tool;
+   struct bpf_prog_info_node *info_node;
struct bpf_prog_info *info;
struct btf *btf = NULL;
bool has_btf = false;
+   struct perf_env *env;
u32 sub_prog_cnt, i;
int err = 0;
u64 arrays;
 
+   /*
+* for perf-record and perf-report use header.env;
+* otherwise, use global perf_env.
+*/
+   env = session->data ? &session->header.env : &perf_env;
+
arrays = 1UL << BPF_PROG_INFO_JITED_KSYMS;
arrays |= 1UL << BPF_PROG_INFO_JITED_FUNC_LENS;
arrays |= 1UL << BPF_PROG_INFO_FUNC_INFO;
arrays |= 1UL << BPF_PROG_INFO_PROG_TAGS;
+   arrays |= 1UL << BPF_PROG_INFO_JITED_INSNS;
+   arrays |= 1UL << BPF_PROG_INFO_LINE_INFO;
+   arrays |= 1UL << BPF_PROG_INFO_JITED_LINE_INFO;
 
info_linear = bpf_program__get_prog_info_linear(fd, arrays);
if (IS_ERR_OR_NULL(info_linear)) {
@@ -153,8 +165,8 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
 machine, process);
}
 
-   /* Synthesize PERF_RECORD_BPF_EVENT */
if (!opts->no_bpf_event) {
+   /* Synthesize PERF_RECORD_BPF_EVENT */
*bpf_event = (struct bpf_event){
.header = {
.type = PERF_RECORD_BPF_EVENT,
@@ -167,6 +179,22 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
memcpy(bpf_event->tag, info->tag, BPF_TAG_SIZE);
memset((void *)event + event->header.size, 0, machine->id_hdr_size);
event->header.size += 

[tip:perf/urgent] perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  e5416950454fa79b7bdc86dac45661b97d887c97
Gitweb: https://git.kernel.org/tip/e5416950454fa79b7bdc86dac45661b97d887c97
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:41 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:06 -0300

perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool

This patch changes the arguments of perf_event__synthesize_bpf_events()
to include perf_session* instead of perf_tool*. perf_session will be
used in the next patch.

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-6-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-record.c | 2 +-
 tools/perf/builtin-top.c| 2 +-
 tools/perf/util/bpf-event.c | 8 +---
 tools/perf/util/bpf-event.h | 4 ++--
 4 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f29874192d3e..e79faccd7842 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1114,7 +1114,7 @@ static int record__synthesize(struct record *rec, bool tail)
return err;
}
 
-   err = perf_event__synthesize_bpf_events(tool, process_synthesized_event,
+   err = perf_event__synthesize_bpf_events(session, process_synthesized_event,
machine, opts);
if (err < 0)
pr_warning("Couldn't synthesize bpf events.\n");
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 2508a7a552fa..77e6190211d2 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1208,7 +1208,7 @@ static int __cmd_top(struct perf_top *top)
 
init_process_thread(top);
 
-   ret = perf_event__synthesize_bpf_events(&top->tool, perf_event__process,
+   ret = perf_event__synthesize_bpf_events(top->session, perf_event__process,
&top->session->machines.host,
&top->record_opts);
if (ret < 0)
diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index e0cbe7f87170..5237e8f11997 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -10,6 +10,7 @@
 #include "debug.h"
 #include "symbol.h"
 #include "machine.h"
+#include "session.h"
 
 #define ptr_to_u64(ptr)((__u64)(unsigned long)(ptr))
 
@@ -42,7 +43,7 @@ int machine__process_bpf_event(struct machine *machine __maybe_unused,
  *   -1 for failures;
  *   -2 for lack of kernel support.
  */
-static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
+static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
   perf_event__handler_t process,
   struct machine *machine,
   int fd,
@@ -52,6 +53,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
struct ksymbol_event *ksymbol_event = &event->ksymbol_event;
struct bpf_event *bpf_event = &event->bpf_event;
struct bpf_prog_info_linear *info_linear;
+   struct perf_tool *tool = session->tool;
struct bpf_prog_info *info;
struct btf *btf = NULL;
bool has_btf = false;
@@ -175,7 +177,7 @@ out:
return err ? -1 : 0;
 }
 
-int perf_event__synthesize_bpf_events(struct perf_tool *tool,
+int perf_event__synthesize_bpf_events(struct perf_session *session,
  perf_event__handler_t process,
  struct machine *machine,
  struct record_opts *opts)
@@ -209,7 +211,7 @@ int perf_event__synthesize_bpf_events(struct perf_tool *tool,
continue;
}
 
-   err = perf_event__synthesize_one_bpf_prog(tool, process,
+   err = perf_event__synthesize_one_bpf_prog(session, process,
  machine, fd,
  event, opts);
close(fd);
diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
index 7890067e1a37..6698683612a7 100644
--- a/tools/perf/util/bpf-event.h
+++ b/tools/perf/util/bpf-event.h
@@ -15,7 +15,7 @@ struct record_opts;
 int machine__process_bpf_event(struct machine *machine, union perf_event *event,
   struct perf_sample *sample);
 
-int perf_event__synthesize_bpf_events(struct perf_tool *tool,
+int perf_event__synthesize_bpf_events(struct perf_session *session,
  perf_event__handler_t process,
  struct machine *machine,

[tip:perf/urgent] perf bpf: Synthesize bpf events with bpf_program__get_prog_info_linear()

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  a742258af131e570a68ad8cf16cd2cc4692675a0
Gitweb: https://git.kernel.org/tip/a742258af131e570a68ad8cf16cd2cc4692675a0
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:40 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:06 -0300

perf bpf: Synthesize bpf events with bpf_program__get_prog_info_linear()

With bpf_program__get_prog_info_linear, we can simplify the logic that
synthesizes bpf events.

This patch doesn't change the behavior of the code.

Committer notes:

Needed this (for all four variables), suggested by Song, to overcome
build failure on debian experimental cross building to MIPS 32-bit:

  -   u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(info->prog_tags);
  +   u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(uintptr_t)(info->prog_tags);

  util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
  util/bpf-event.c:143:35: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
 u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(info->prog_tags);
 ^
  util/bpf-event.c:144:22: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
 __u32 *prog_lens = (__u32 *)(info->jited_func_lens);
^
  util/bpf-event.c:145:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
 __u64 *prog_addrs = (__u64 *)(info->jited_ksyms);
 ^
  util/bpf-event.c:146:22: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
 void *func_infos = (void *)(info->func_info);
^
  cc1: all warnings being treated as errors
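
The reason for the extra cast can be shown with a small sketch. The
bpf_prog_info fields hold addresses as 64-bit integers, so on a 32-bit
target a direct cast to a pointer changes the integer's size and trips
-Wint-to-pointer-cast. Going through uintptr_t first makes the narrowing
explicit. The helper names below are illustrative only (sketch_u64_to_ptr
is hypothetical; sketch_ptr_to_u64 mirrors the ptr_to_u64 idea used in
these patches).

```c
#include <stdint.h>

/*
 * Turn a 64-bit integer that holds a user-space address back into a
 * pointer. The intermediate uintptr_t has pointer width, so the final
 * cast never changes size, on 32-bit or 64-bit builds.
 */
static inline void *sketch_u64_to_ptr(uint64_t val)
{
	return (void *)(uintptr_t)val;
}

/* The reverse direction: widen a pointer into a 64-bit field. */
static inline uint64_t sketch_ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}
```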

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: kernel-t...@fb.com
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Link: http://lkml.kernel.org/r/20190312053051.2690567-5-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/bpf-event.c | 118 +++-
 1 file changed, 40 insertions(+), 78 deletions(-)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index ea012b735a37..e0cbe7f87170 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -3,7 +3,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include "bpf-event.h"
 #include "debug.h"
 #include "symbol.h"
@@ -49,99 +51,62 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
 {
struct ksymbol_event *ksymbol_event = &event->ksymbol_event;
struct bpf_event *bpf_event = &event->bpf_event;
-   u32 sub_prog_cnt, i, func_info_rec_size = 0;
-   u8 (*prog_tags)[BPF_TAG_SIZE] = NULL;
-   struct bpf_prog_info info = { .type = 0, };
-   u32 info_len = sizeof(info);
-   void *func_infos = NULL;
-   u64 *prog_addrs = NULL;
+   struct bpf_prog_info_linear *info_linear;
+   struct bpf_prog_info *info;
struct btf *btf = NULL;
-   u32 *prog_lens = NULL;
bool has_btf = false;
-   char errbuf[512];
+   u32 sub_prog_cnt, i;
int err = 0;
+   u64 arrays;
 
-   /* Call bpf_obj_get_info_by_fd() to get sizes of arrays */
-   err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
+   arrays = 1UL << BPF_PROG_INFO_JITED_KSYMS;
+   arrays |= 1UL << BPF_PROG_INFO_JITED_FUNC_LENS;
+   arrays |= 1UL << BPF_PROG_INFO_FUNC_INFO;
+   arrays |= 1UL << BPF_PROG_INFO_PROG_TAGS;
 
-   if (err) {
-   pr_debug("%s: failed to get BPF program info: %s, aborting\n",
-__func__, str_error_r(errno, errbuf, sizeof(errbuf)));
+   info_linear = bpf_program__get_prog_info_linear(fd, arrays);
+   if (IS_ERR_OR_NULL(info_linear)) {
+   info_linear = NULL;
+   pr_debug("%s: failed to get BPF program info. aborting\n", __func__);
return -1;
}
-   if (info_len < offsetof(struct bpf_prog_info, prog_tags)) {
+
+   if (info_linear->info_len < offsetof(struct bpf_prog_info, prog_tags)) {
pr_debug("%s: the kernel is too old, aborting\n", __func__);
return -2;
}
 
+   info = &info_linear->info;
+
/* number of ksyms, func_lengths, and tags should match */
-   sub_prog_cnt = info.nr_jited_ksyms;
-   if (sub_prog_cnt != info.nr_prog_tags ||
-   sub_prog_cnt != info.nr_jited_func_lens)
+   sub_prog_cnt = info->nr_jited_ksyms;
+   if (sub_prog_cnt != info->nr_prog_tags ||
+   sub_prog_cnt != info->nr_jited_func_lens)
return -1;
 
/* check BTF func info support */
-   if (info.btf_id && info.nr_func_info && info.func_info_rec_size) {
+   if (info->btf_id && info->nr_func_info && info->func_info_rec_size) {
/* btf func info number should be same as sub_prog_cnt */
-   if 

[tip:perf/urgent] bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump()

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  cae73f2339231d61022769f09c94e4500e8ad47a
Gitweb: https://git.kernel.org/tip/cae73f2339231d61022769f09c94e4500e8ad47a
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:39 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:06 -0300

bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump()

This patch uses bpf_program__get_prog_info_linear() to simplify the
logic in prog.c do_dump().

Committer testing:

Before:

  # bpftool prog dump xlated id 208 > /tmp/dump.xlated.before
  # bpftool prog dump jited id 208 > /tmp/dump.jited.before
  # bpftool map dump id 107 > /tmp/map.dump.before

After:

  # ~acme/git/perf/tools/bpf/bpftool/bpftool map dump id 107 > /tmp/map.dump.after
  # ~acme/git/perf/tools/bpf/bpftool/bpftool prog dump xlated id 208 > /tmp/dump.xlated.after
  # ~acme/git/perf/tools/bpf/bpftool/bpftool prog dump jited id 208 > /tmp/dump.jited.after
  # diff -u /tmp/dump.xlated.before /tmp/dump.xlated.after
  # diff -u /tmp/dump.jited.before /tmp/dump.jited.after
  # diff -u /tmp/map.dump.before /tmp/map.dump.after
  # ~acme/git/perf/tools/bpf/bpftool/bpftool prog dump xlated id 208
 0: (bf) r6 = r1
 1: (85) call bpf_get_current_pid_tgid#80800
 2: (63) *(u32 *)(r10 -328) = r0
 3: (bf) r2 = r10
 4: (07) r2 += -328
 5: (18) r1 = map[id:107]
 7: (85) call __htab_map_lookup_elem#85680
 8: (15) if r0 == 0x0 goto pc+1
 9: (07) r0 += 56
10: (b7) r7 = 0
11: (55) if r0 != 0x0 goto pc+52
12: (bf) r1 = r10
13: (07) r1 += -328
14: (b7) r2 = 64
15: (bf) r3 = r6
16: (85) call bpf_probe_read#-46848
17: (bf) r2 = r10
18: (07) r2 += -320
19: (18) r1 = map[id:106]
21: (07) r1 += 208
22: (61) r0 = *(u32 *)(r2 +0)
23: (35) if r0 >= 0x200 goto pc+3
24: (67) r0 <<= 3
25: (0f) r0 += r1
26: (05) goto pc+1
27: (b7) r0 = 0
28: (15) if r0 == 0x0 goto pc+35
29: (71) r1 = *(u8 *)(r0 +0)
30: (15) if r1 == 0x0 goto pc+33
31: (b7) r5 = 64
32: (79) r1 = *(u64 *)(r10 -320)
33: (15) if r1 == 0x2 goto pc+2
34: (15) if r1 == 0x101 goto pc+3
35: (55) if r1 != 0x15 goto pc+19
36: (79) r3 = *(u64 *)(r6 +16)
37: (05) goto pc+1
38: (79) r3 = *(u64 *)(r6 +24)
39: (15) if r3 == 0x0 goto pc+15
40: (b7) r1 = 0
41: (63) *(u32 *)(r10 -260) = r1
42: (bf) r1 = r10
43: (07) r1 += -256
44: (b7) r2 = 256
45: (85) call bpf_probe_read_str#-46704
46: (b7) r5 = 328
47: (63) *(u32 *)(r10 -264) = r0
48: (bf) r1 = r0
49: (67) r1 <<= 32
50: (77) r1 >>= 32
51: (25) if r1 > 0xff goto pc+3
52: (07) r0 += 72
53: (57) r0 &= 255
54: (bf) r5 = r0
55: (bf) r4 = r10
56: (07) r4 += -328
57: (bf) r1 = r6
58: (18) r2 = map[id:105]
60: (18) r3 = 0x
62: (85) call bpf_perf_event_output_tp#-45104
63: (bf) r7 = r0
64: (bf) r0 = r7
65: (95) exit
  #

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Acked-by: Daniel Borkmann 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: kernel-t...@fb.com
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Link: http://lkml.kernel.org/r/20190312053051.2690567-4-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/bpf/bpftool/prog.c | 266 +++
 1 file changed, 59 insertions(+), 207 deletions(-)

diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 8ef80d65a474..d2be5a06c339 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -401,41 +401,31 @@ static int do_show(int argc, char **argv)
 
 static int do_dump(int argc, char **argv)
 {
-   unsigned int finfo_rec_size, linfo_rec_size, jited_linfo_rec_size;
-   void *func_info = NULL, *linfo = NULL, *jited_linfo = NULL;
-   unsigned int nr_finfo, nr_linfo = 0, nr_jited_linfo = 0;
+   struct bpf_prog_info_linear *info_linear;
struct bpf_prog_linfo *prog_linfo = NULL;
-   unsigned long *func_ksyms = NULL;
-   struct bpf_prog_info info = {};
-   unsigned int *func_lens = NULL;
+   enum {DUMP_JITED, DUMP_XLATED} mode;
const char *disasm_opt = NULL;
-   unsigned int nr_func_ksyms;
-   unsigned int nr_func_lens;
+   struct bpf_prog_info *info;
struct dump_data dd = {};
-   __u32 len = sizeof(info);
+   void *func_info = NULL;
struct btf *btf = NULL;
-   unsigned int buf_size;
char *filepath = NULL;
bool opcodes = false;
bool visual = false;
char func_sig[1024];
unsigned char *buf;
bool linum = false;
-   __u32 *member_len;
-   __u64 *member_ptr;
+   __u32 member_len;
+   __u64 arrays;
ssize_t n;
-   int err;
int fd;
 
if (is_prefix(*argv, "jited")) {
if (disasm_init())
return -1;
-
-   member_len = &info.jited_prog_len;
-   member_ptr 

[tip:perf/urgent] tools lib bpf: Introduce bpf_program__get_prog_info_linear()

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  34be16466d4dc06f3d604dafbcdb3327b72e78da
Gitweb: https://git.kernel.org/tip/34be16466d4dc06f3d604dafbcdb3327b72e78da
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:38 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:06 -0300

tools lib bpf: Introduce bpf_program__get_prog_info_linear()

Currently, bpf_prog_info includes 9 arrays. The user has the option to
fetch any combination of these arrays. However, this requires a lot of
handling.

This work becomes trickier when we need to store bpf_prog_info in a
file, because these arrays are allocated independently.

This patch introduces 'struct bpf_prog_info_linear', which stores the
arrays of bpf_prog_info in contiguous memory.

Helper functions are introduced to unify the work to get different sets
of bpf_prog_info.  Specifically, bpf_program__get_prog_info_linear()
allows the user to select which arrays to fetch, and handles details for
the user.

Please see the comments right before 'enum bpf_prog_info_array' for more
details and examples.
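
To make the idea concrete, here is a minimal sketch of selecting arrays
with a bitmask and packing them into one contiguous allocation. It is
not the libbpf implementation: every name below (sketch_array,
sketch_src, sketch_linear, sketch_pack) is hypothetical, only two arrays
are modelled, and rounding each array up to 8 bytes is an assumption of
this sketch.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum sketch_array { SK_KSYMS, SK_FUNC_LENS, SK_NR_ARRAYS };

/* Independently allocated source arrays, lengths in bytes. */
struct sketch_src {
	const void *data[SK_NR_ARRAYS];
	uint32_t len[SK_NR_ARRAYS];
};

/* One allocation: bookkeeping first, then all selected arrays. */
struct sketch_linear {
	uint64_t arrays;               /* bitmask of arrays that were packed */
	uint32_t offset[SK_NR_ARRAYS]; /* byte offset of each array in data[] */
	uint32_t len[SK_NR_ARRAYS];    /* byte length of each array */
	uint32_t data_len;             /* total bytes used in data[] */
	uint8_t data[];
};

static uint32_t sketch_round8(uint32_t n)
{
	return (n + 7) & ~7u;
}

/* Copy the arrays selected in the bitmask into one contiguous buffer. */
static struct sketch_linear *sketch_pack(const struct sketch_src *src,
					 uint64_t arrays)
{
	struct sketch_linear *lin;
	uint32_t total = 0, off = 0;
	int i;

	for (i = 0; i < SK_NR_ARRAYS; i++)
		if (arrays & (1ULL << i))
			total += sketch_round8(src->len[i]);

	lin = calloc(1, sizeof(*lin) + total);
	if (!lin)
		return NULL;

	lin->arrays = arrays;
	lin->data_len = total;
	for (i = 0; i < SK_NR_ARRAYS; i++) {
		if (!(arrays & (1ULL << i)))
			continue;
		lin->offset[i] = off;
		lin->len[i] = src->len[i];
		if (src->len[i])
			memcpy(lin->data + off, src->data[i], src->len[i]);
		off += sketch_round8(src->len[i]);
	}
	return lin;
}
```

Because the arrays are addressed by offset rather than by pointer, the
whole buffer can be written to a file and read back without fixing up
each array separately.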

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Acked-by: Daniel Borkmann 
Link: https://lkml.kernel.org/r/ce92c091-e80d-a0c1-4aa0-987706c42...@iogearbox.net
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: kernel-t...@fb.com
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Link: http://lkml.kernel.org/r/20190312053051.2690567-3-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/lib/bpf/libbpf.c   | 251 +++
 tools/lib/bpf/libbpf.h   |  63 
 tools/lib/bpf/libbpf.map |   3 +
 3 files changed, 317 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 4884557aa17f..8fb6e89b4b2c 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -112,6 +112,11 @@ void libbpf_print(enum libbpf_print_level level, const char *format, ...)
 # define LIBBPF_ELF_C_READ_MMAP ELF_C_READ
 #endif
 
+static inline __u64 ptr_to_u64(const void *ptr)
+{
+   return (__u64) (unsigned long) ptr;
+}
+
 struct bpf_capabilities {
/* v4.14: kernel support for program & map names. */
__u32 name:1;
@@ -2997,3 +3002,249 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
ring_buffer_write_tail(header, data_tail);
return ret;
 }
+
+struct bpf_prog_info_array_desc {
+   int array_offset;   /* e.g. offset of jited_prog_insns */
+   int count_offset;   /* e.g. offset of jited_prog_len */
+   int size_offset;/* > 0: offset of rec size,
+* < 0: fix size of -size_offset
+*/
+};
+
+static struct bpf_prog_info_array_desc bpf_prog_info_array_desc[] = {
+   [BPF_PROG_INFO_JITED_INSNS] = {
+   offsetof(struct bpf_prog_info, jited_prog_insns),
+   offsetof(struct bpf_prog_info, jited_prog_len),
+   -1,
+   },
+   [BPF_PROG_INFO_XLATED_INSNS] = {
+   offsetof(struct bpf_prog_info, xlated_prog_insns),
+   offsetof(struct bpf_prog_info, xlated_prog_len),
+   -1,
+   },
+   [BPF_PROG_INFO_MAP_IDS] = {
+   offsetof(struct bpf_prog_info, map_ids),
+   offsetof(struct bpf_prog_info, nr_map_ids),
+   -(int)sizeof(__u32),
+   },
+   [BPF_PROG_INFO_JITED_KSYMS] = {
+   offsetof(struct bpf_prog_info, jited_ksyms),
+   offsetof(struct bpf_prog_info, nr_jited_ksyms),
+   -(int)sizeof(__u64),
+   },
+   [BPF_PROG_INFO_JITED_FUNC_LENS] = {
+   offsetof(struct bpf_prog_info, jited_func_lens),
+   offsetof(struct bpf_prog_info, nr_jited_func_lens),
+   -(int)sizeof(__u32),
+   },
+   [BPF_PROG_INFO_FUNC_INFO] = {
+   offsetof(struct bpf_prog_info, func_info),
+   offsetof(struct bpf_prog_info, nr_func_info),
+   offsetof(struct bpf_prog_info, func_info_rec_size),
+   },
+   [BPF_PROG_INFO_LINE_INFO] = {
+   offsetof(struct bpf_prog_info, line_info),
+   offsetof(struct bpf_prog_info, nr_line_info),
+   offsetof(struct bpf_prog_info, line_info_rec_size),
+   },
+   [BPF_PROG_INFO_JITED_LINE_INFO] = {
+   offsetof(struct bpf_prog_info, jited_line_info),
+   offsetof(struct bpf_prog_info, nr_jited_line_info),
+   offsetof(struct bpf_prog_info, jited_line_info_rec_size),
+   },
+   [BPF_PROG_INFO_PROG_TAGS] = {
+   offsetof(struct bpf_prog_info, prog_tags),
+   offsetof(struct bpf_prog_info, nr_prog_tags),
+   -(int)sizeof(__u8) * BPF_TAG_SIZE,
+   },
+
+};
+
+static __u32 bpf_prog_info_read_offset_u32(struct bpf_prog_info *info, int offset)
+{
+   __u32 *array = (__u32 *)info;
+
+   if (offset >= 0)
+   return array[offset / 

[tip:perf/urgent] perf record: Replace option --bpf-event with --no-bpf-event

2019-03-22 Thread tip-bot for Song Liu
Commit-ID:  71184c6ab7e60fd59d8dbc8fed62a1c753dc4934
Gitweb: https://git.kernel.org/tip/71184c6ab7e60fd59d8dbc8fed62a1c753dc4934
Author: Song Liu 
AuthorDate: Mon, 11 Mar 2019 22:30:37 -0700
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Tue, 19 Mar 2019 16:52:06 -0300

perf record: Replace option --bpf-event with --no-bpf-event

Currently, monitoring of BPF programs through bpf_event is off by
default for 'perf record'.

To turn it on, the user needs to use the option "--bpf-event".  As BPF
gets wider adoption in different subsystems, this option becomes
inconvenient.

This patch makes bpf_event on by default, and adds option "--no-bpf-event"
to turn it off. Since option --bpf-event is not released yet, it is safe
to remove it.
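
The effect of flipping the option's polarity can be sketched as follows.
With a negative flag, a zero-initialized options struct leaves the
feature enabled, so callers that never touch the flag get the new
default. The struct and function names are hypothetical; the expression
follows the shape of the evsel.c change in this patch.

```c
#include <stdbool.h>

/* Only the field relevant to this sketch. */
struct sketch_record_opts {
	bool no_bpf_event; /* false by default, so bpf_event stays on */
};

/*
 * Request bpf_event only when side-band tracking is on, the user did
 * not pass --no-bpf-event, and the kernel is not known to lack it.
 */
static bool sketch_want_bpf_event(bool track,
				  const struct sketch_record_opts *opts,
				  bool kernel_lacks_bpf_event)
{
	return track && !opts->no_bpf_event && !kernel_lacks_bpf_event;
}
```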

Signed-off-by: Song Liu 
Reviewed-by: Jiri Olsa 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: kernel-t...@fb.com
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Stanislav Fomichev 
Link: http://lkml.kernel.org/r/20190312053051.2690567-2-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/builtin-record.c | 2 +-
 tools/perf/perf.h   | 2 +-
 tools/perf/util/bpf-event.c | 2 +-
 tools/perf/util/evsel.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index e7144a1c1c82..f29874192d3e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1891,7 +1891,7 @@ static struct option __record_options[] = {
OPT_BOOLEAN(0, "tail-synthesize", &record.opts.tail_synthesize,
"synthesize non-sample events at the end of output"),
OPT_BOOLEAN(0, "overwrite", &record.opts.overwrite, "use overwrite mode"),
-   OPT_BOOLEAN(0, "bpf-event", &record.opts.bpf_event, "record bpf events"),
+   OPT_BOOLEAN(0, "no-bpf-event", &record.opts.no_bpf_event, "record bpf events"),
OPT_BOOLEAN(0, "strict-freq", &record.opts.strict_freq,
"Fail if the specified frequency can't be used"),
OPT_CALLBACK('F', "freq", &record.opts, "freq or 'max'",
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index b120e547ddc7..c59743def8d3 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -66,7 +66,7 @@ struct record_opts {
bool ignore_missing_thread;
bool strict_freq;
bool sample_id;
-   bool bpf_event;
+   bool no_bpf_event;
unsigned int freq;
unsigned int mmap_pages;
unsigned int auxtrace_mmap_pages;
diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 028c8ec1f62a..ea012b735a37 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -187,7 +187,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
}
 
/* Synthesize PERF_RECORD_BPF_EVENT */
-   if (opts->bpf_event) {
+   if (!opts->no_bpf_event) {
*bpf_event = (struct bpf_event){
.header = {
.type = PERF_RECORD_BPF_EVENT,
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 1a2023da5d9c..7835e05f0c0a 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1036,7 +1036,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
attr->mmap2 = track && !perf_missing_features.mmap2;
attr->comm  = track;
attr->ksymbol = track && !perf_missing_features.ksymbol;
-   attr->bpf_event = track && opts->bpf_event &&
+   attr->bpf_event = track && !opts->no_bpf_event &&
!perf_missing_features.bpf_event;
 
if (opts->record_namespaces)


[tip:perf/urgent] perf, bpf: Consider events with attr.bpf_event as side-band events

2019-03-09 Thread tip-bot for Song Liu
Commit-ID:  21038f2baa05a0550f56f010f609a5c871b6a274
Gitweb: https://git.kernel.org/tip/21038f2baa05a0550f56f010f609a5c871b6a274
Author: Song Liu 
AuthorDate: Mon, 25 Feb 2019 16:20:05 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 28 Feb 2019 14:20:35 -0300

perf, bpf: Consider events with attr.bpf_event as side-band events

Events with attr.bpf_event set should be considered as side-band events,
as they carry information about BPF programs.

Signed-off-by: Song Liu 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: kernel-t...@fb.com
Cc: net...@vger.kernel.org
Fixes: 6ee52e2a3fe4 ("perf, bpf: Introduce PERF_RECORD_BPF_EVENT")
Link: http://lkml.kernel.org/r/20190226002019.3748539-2-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 kernel/events/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5f59d848171e..dd9698ad3d66 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4238,7 +4238,8 @@ static bool is_sb_event(struct perf_event *event)
if (attr->mmap || attr->mmap_data || attr->mmap2 ||
attr->comm || attr->comm_exec ||
attr->task || attr->ksymbol ||
-   attr->context_switch)
+   attr->context_switch ||
+   attr->bpf_event)
return true;
return false;
 }


[tip:perf/core] perf utils: Silence "Couldn't synthesize bpf events" warning for EPERM

2019-02-15 Thread tip-bot for Song Liu
Commit-ID:  39f4a913d6d439178177cae8aa2e9a232160fd51
Gitweb: https://git.kernel.org/tip/39f4a913d6d439178177cae8aa2e9a232160fd51
Author: Song Liu 
AuthorDate: Mon, 4 Feb 2019 11:31:40 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Thu, 14 Feb 2019 13:31:11 -0300

perf utils: Silence "Couldn't synthesize bpf events" warning for EPERM

Synthesizing BPF events is only supported for root. Silence the warning
message when a non-root user runs perf-record.
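
The errno handling this patch changes can be sketched in isolation as
below. The function name is hypothetical, and it covers only the branch
touched by the diff: errno values treated as "nothing to report" map to
0, everything else to -1.

```c
#include <errno.h>

/*
 * Map the errno left behind by a failed "get next program id" call to
 * the value the synthesis loop returns:
 *   EINVAL - kernel too old to support the call, not an error
 *   EPERM  - caller is not root, not an error either
 * Any other value is reported as a failure.
 */
static int sketch_next_id_status(int err_no)
{
	return (err_no == EINVAL || err_no == EPERM) ? 0 : -1;
}
```

Since the caller prints the "Couldn't synthesize bpf events" warning
only for a negative return, mapping EPERM to 0 is what silences it for
non-root users.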

Reported-by: David Carrillo-Cisneros 
Signed-off-by: Song Liu 
Tested-by: David Carrillo-Cisneros 
Acked-by: Jiri Olsa 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190204193140.719740-1-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/bpf-event.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 796ef793f4ce..62dda96b0096 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -236,8 +236,8 @@ int perf_event__synthesize_bpf_events(struct perf_tool *tool,
pr_debug("%s: can't get next program: %s%s",
 __func__, strerror(errno),
 errno == EINVAL ? " -- kernel too old?" : "");
-   /* don't report error on old kernel */
-   err = (errno == EINVAL) ? 0 : -1;
+   /* don't report error on old kernel or EPERM  */
+   err = (errno == EINVAL || errno == EPERM) ? 0 : -1;
break;
}
fd = bpf_prog_get_fd_by_id(id);


[tip:perf/core] perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT

2019-01-26 Thread tip-bot for Song Liu
Commit-ID:  811184fb6977bb02c21512d8af6a613a7ebce329
Gitweb: https://git.kernel.org/tip/811184fb6977bb02c21512d8af6a613a7ebce329
Author: Song Liu 
AuthorDate: Tue, 22 Jan 2019 13:02:18 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 25 Jan 2019 15:12:10 +0100

perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT

Add the missing machine->id_hdr_size to event->header.size. Also fix the
size of PERF_RECORD_KSYMBOL by removing the extra bytes for the name.

Committer notes:

We need to malloc that extra machine->id_hdr_size at the start of
perf_event__synthesize_bpf_events(), and we also need to cast the event
to (void *) before adding the byte offset, otherwise we segfault. Fix
both.
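
The size arithmetic behind this fix can be sketched as follows. The
record size starts at the offset of the name field (not the size of the
whole struct, which would count the full name buffer), adds the name
rounded up to 8 bytes, and then adds room for the trailing sample id
header. The struct layout and all sketch_ names are assumptions made for
illustration; only the three-term sum comes from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_KSYM_NAME_LEN 256

struct sketch_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Assumed layout: fixed fields followed by a maximum-size name buffer. */
struct sketch_ksymbol_event {
	struct sketch_event_header header;
	uint64_t addr;
	uint32_t len;
	uint16_t ksym_type;
	uint16_t flags;
	char name[SKETCH_KSYM_NAME_LEN];
};

/* Round up to a multiple of sizeof(u64). */
static size_t sketch_align_u64(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/*
 * Bytes actually used by one synthesized record: fixed part, the name
 * including its terminating NUL (aligned), and the id header.
 */
static size_t sketch_ksymbol_size(size_t name_len, size_t id_hdr_size)
{
	return offsetof(struct sketch_ksymbol_event, name) +
	       sketch_align_u64(name_len + 1) + id_hdr_size;
}
```

The buffer allocated for the event has to be large enough for the
largest value this sum can take, which is why the malloc() also grows by
machine->id_hdr_size.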

Reported-by: Arnaldo Carvalho de Melo 
Suggested-by: Jiri Olsa 
Signed-off-by: Song Liu 
Acked-by: Jiri Olsa 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Peter Zijlstra 
Cc: kernel-t...@fb.com
Fixes: 7b612e291a5a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs")
Link: http://lkml.kernel.org/r/20190122210218.358664-1-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/bpf-event.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 01e1dc1bb7fb..796ef793f4ce 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -7,6 +7,7 @@
 #include "bpf-event.h"
 #include "debug.h"
 #include "symbol.h"
+#include "machine.h"
 
 #define ptr_to_u64(ptr)((__u64)(unsigned long)(ptr))
 
@@ -149,7 +150,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
*ksymbol_event = (struct ksymbol_event){
.header = {
.type = PERF_RECORD_KSYMBOL,
-   .size = sizeof(struct ksymbol_event),
+   .size = offsetof(struct ksymbol_event, name),
},
.addr = prog_addrs[i],
.len = prog_lens[i],
@@ -178,6 +179,9 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
 
ksymbol_event->header.size += PERF_ALIGN(name_len + 1,
 sizeof(u64));
+
+   memset((void *)event + event->header.size, 0, machine->id_hdr_size);
+   event->header.size += machine->id_hdr_size;
err = perf_tool__process_synth_event(tool, event,
 machine, process);
}
@@ -194,6 +198,8 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool,
.id = info.id,
};
memcpy(bpf_event->tag, prog_tags[i], BPF_TAG_SIZE);
+   memset((void *)event + event->header.size, 0, machine->id_hdr_size);
+   event->header.size += machine->id_hdr_size;
err = perf_tool__process_synth_event(tool, event,
 machine, process);
}
@@ -217,7 +223,7 @@ int perf_event__synthesize_bpf_events(struct perf_tool *tool,
int err;
int fd;
 
-   event = malloc(sizeof(event->bpf_event) + KSYM_NAME_LEN);
+   event = malloc(sizeof(event->bpf_event) + KSYM_NAME_LEN + machine->id_hdr_size);
if (!event)
return -1;
while (true) {


[tip:perf/core] bpf: Add module name [bpf] to ksymbols for bpf programs

2019-01-22 Thread tip-bot for Song Liu
Commit-ID:  6934058d9fb6c058fb5e5b11cdcb19834e205c91
Gitweb: https://git.kernel.org/tip/6934058d9fb6c058fb5e5b11cdcb19834e205c91
Author: Song Liu 
AuthorDate: Thu, 17 Jan 2019 08:15:21 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 Jan 2019 17:38:56 -0300

bpf: Add module name [bpf] to ksymbols for bpf programs

With this patch, /proc/kallsyms will show BPF programs as

   <addr> t bpf_prog_<tag>_<name> [bpf]
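
For readers consuming this output, a minimal user-space sketch of splitting
such a line into address, type, name and optional module follows. The struct
and function names are hypothetical, and the sample line in the note below is
illustrative rather than taken from a real system.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for one parsed /proc/kallsyms line */
struct kallsym_line {
	unsigned long long addr;
	char type;
	char name[256];
	char module[64];	/* empty string when no [module] suffix */
};

/* Returns 0 on success, -1 if the line lacks addr, type or name */
static int parse_kallsyms_line(const char *line, struct kallsym_line *out)
{
	int n;

	out->module[0] = '\0';
	/* "%63[^]]" reads everything up to the closing bracket */
	n = sscanf(line, "%llx %c %255s [%63[^]]]",
		   &out->addr, &out->type, out->name, out->module);
	return n >= 3 ? 0 : -1;
}
```

Feeding it a line such as "a0257cf9 t bpf_prog_b07ccb89267cf242_F [bpf]"
would leave "bpf" in the module field; a line without a bracketed suffix
leaves the module field empty.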

Signed-off-by: Song Liu 
Reviewed-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Acked-by: Peter Zijlstra 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Peter Zijlstra 
Cc: kernel-t...@fb.com
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-10-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 kernel/kallsyms.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index f3a04994e063..14934afa9e68 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -494,7 +494,7 @@ static int get_ksymbol_ftrace_mod(struct kallsym_iter *iter)
 
 static int get_ksymbol_bpf(struct kallsym_iter *iter)
 {
-   iter->module_name[0] = '\0';
+   strlcpy(iter->module_name, "bpf", MODULE_NAME_LEN);
iter->exported = 0;
return bpf_get_kallsym(iter->pos - iter->pos_ftrace_mod_end,
   &iter->value, &iter->type,


[tip:perf/core] perf tools: Synthesize PERF_RECORD_* for loaded BPF programs

2019-01-22 Thread tip-bot for Song Liu
Commit-ID:  7b612e291a5affb12b9d0b87332c71bcbe9c5db4
Gitweb: https://git.kernel.org/tip/7b612e291a5affb12b9d0b87332c71bcbe9c5db4
Author: Song Liu 
AuthorDate: Thu, 17 Jan 2019 08:15:19 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 Jan 2019 17:36:39 -0300

perf tools: Synthesize PERF_RECORD_* for loaded BPF programs

This patch synthesizes PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for
BPF programs loaded before perf-record. This is achieved by gathering
information about all BPF programs via sys_bpf.

Committer notes:

Fix the build on some older systems such as amazonlinux:1 where it was
breaking with:

  util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
  util/bpf-event.c:52:9: error: missing initializer for field 'type' of 'struct bpf_prog_info' [-Werror=missing-field-initializers]
struct bpf_prog_info info = {};
   ^
  In file included from /git/linux/tools/lib/bpf/bpf.h:26:0,
   from util/bpf-event.c:3:
  /git/linux/tools/include/uapi/linux/bpf.h:2699:8: note: 'type' declared here
__u32 type;
  ^
  cc1: all warnings being treated as errors

Further fix on a centos:6 system:

  cc1: warnings being treated as errors
  util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
  util/bpf-event.c:50: error: 'func_info_rec_size' may be used uninitialized in this function

The compiler is wrong, but to silence it, initialize that variable to
zero.

One more fix, this time for debian:experimental-x-mips, x-mips64 and
x-mipsel:

  util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog':
  util/bpf-event.c:93:16: error: implicit declaration of function 'calloc' [-Werror=implicit-function-declaration]
 func_infos = calloc(sub_prog_cnt, func_info_rec_size);
  ^~
  util/bpf-event.c:93:16: error: incompatible implicit declaration of built-in function 'calloc' [-Werror]
  util/bpf-event.c:93:16: note: include '<stdlib.h>' or provide a declaration of 'calloc'

Add the missing header.

Committer testing:

  # perf record --bpf-event sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.021 MB perf.data (7 samples) ]
  # perf report -D | grep PERF_RECORD_BPF_EVENT | nl
 1  0 0x4b10 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13
 2  0 0x4c60 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14
 3  0 0x4db0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15
 4  0 0x4f00 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16
 5  0 0x5050 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17
 6  0 0x51a0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18
 7  0 0x52f0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21
 8  0 0x5440 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22
  # bpftool prog
  13: cgroup_skb  tag 7be49e3934a125ba  gpl
loaded_at 2019-01-19T09:09:43-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 13,14
  14: cgroup_skb  tag 2a142ef67aaad174  gpl
loaded_at 2019-01-19T09:09:43-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 13,14
  15: cgroup_skb  tag 7be49e3934a125ba  gpl
loaded_at 2019-01-19T09:09:43-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 15,16
  16: cgroup_skb  tag 2a142ef67aaad174  gpl
loaded_at 2019-01-19T09:09:43-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 15,16
  17: cgroup_skb  tag 7be49e3934a125ba  gpl
loaded_at 2019-01-19T09:09:44-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 17,18
  18: cgroup_skb  tag 2a142ef67aaad174  gpl
loaded_at 2019-01-19T09:09:44-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 17,18
  21: cgroup_skb  tag 7be49e3934a125ba  gpl
loaded_at 2019-01-19T09:09:45-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 21,22
  22: cgroup_skb  tag 2a142ef67aaad174  gpl
loaded_at 2019-01-19T09:09:45-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 21,22
  #

  # perf report -D | grep -B22 PERF_RECORD_KSYMBOL
  . ... raw event: size 312 bytes
  .  0000:  11 00 00 00 00 00 38 01 ff 44 06 c0 ff ff ff ff  ......8..D......
  .  0010:  e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67  ........bpf_prog
  .  0020:  5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62  _7be49e3934a125b
  .  0030:  61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  a...............
   
  .  0110:  00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00  ........!.......
  .  0120:  7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00  {..94.%.........
  .  0130:  00 00 00 00 00 00 00 00                          ........

  0 0x49d8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr c00644ff len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba
  --
  . 

[tip:perf/core] perf tools: Handle PERF_RECORD_BPF_EVENT

2019-01-22 Thread tip-bot for Song Liu
Commit-ID:  45178a928a4b7c6093f6621e627d09909e81cc13
Gitweb: https://git.kernel.org/tip/45178a928a4b7c6093f6621e627d09909e81cc13
Author: Song Liu 
AuthorDate: Thu, 17 Jan 2019 08:15:18 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 Jan 2019 17:00:57 -0300

perf tools: Handle PERF_RECORD_BPF_EVENT

This patch adds basic handling of PERF_RECORD_BPF_EVENT.  Tracking of
PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to
turn it on.

Committer notes:

Add dummy machine__process_bpf_event() variant that returns zero for
systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking
the build in such systems.

Remove the needless include of machine.h from bpf-event.h, provide just
forward declarations for the structs and unions in the parameters, to
reduce compilation time and needless rebuilds when machine.h gets
changed.

Committer testing:

When running with:

 # perf record --bpf-event

On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL
are not present, we fall back to removing those two bits from
perf_event_attr, allowing the tool to continue working on older kernels:

  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  
  sys_perf_event_open: pid 5779  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off bpf_event
  
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
  
  sys_perf_event_open: pid 5779  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off ksymbol
  
  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
  

And then proceeds to work without those two features.
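
The fallback sequence in the log above can be modelled with a short sketch.
Everything here is hypothetical scaffolding (struct sketch_attr, the callback
type, the fake "old kernel" opener); the only behaviour borrowed from the
log is the order: on -EINVAL, first switch off bpf_event, then ksymbol, and
retry.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical subset of perf_event_attr: only the two new bits */
struct sketch_attr {
	bool ksymbol;
	bool bpf_event;
};

/* Hypothetical opener: returns an fd >= 0 or a negative errno value */
typedef int (*sketch_open_fn)(const struct sketch_attr *attr);

/* Retry the open, dropping one optional feature per -EINVAL */
static int sketch_open_with_fallback(struct sketch_attr *attr,
				     sketch_open_fn open_fn)
{
	for (;;) {
		int ret = open_fn(attr);

		if (ret != -EINVAL)
			return ret;
		if (attr->bpf_event)
			attr->bpf_event = false;	/* "switching off bpf_event" */
		else if (attr->ksymbol)
			attr->ksymbol = false;		/* "switching off ksymbol" */
		else
			return ret;			/* nothing left to drop */
	}
}

/* Stand-in for a kernel that knows neither bit */
static int sketch_old_kernel_open(const struct sketch_attr *attr)
{
	return (attr->ksymbol || attr->bpf_event) ? -EINVAL : 3;
}
```

With both bits initially set and sketch_old_kernel_open() as the opener, the
loop makes three attempts and succeeds on the third, matching the three
perf_event_attr dumps shown above.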

As passing --bpf-event is an explicit action performed by the user, perhaps we
should emit a warning telling that the kernel has no such feature, but this can
be done on top of this patch.

Now with a kernel that supports these events, start the 'record --bpf-event -a'
and then run 'perf trace sleep 1' that will use the BPF
augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus
should generate PERF_RECORD_BPF_EVENT events:

  [root@quaco ~]# perf record -e dummy -a --bpf-event
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.713 MB perf.data ]

  [root@quaco ~]# bpftool prog
  13: cgroup_skb  tag 7be49e3934a125ba  gpl
loaded_at 2019-01-19T09:09:43-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 13,14
  14: cgroup_skb  tag 2a142ef67aaad174  gpl
loaded_at 2019-01-19T09:09:43-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 13,14
  15: cgroup_skb  tag 7be49e3934a125ba  gpl
loaded_at 2019-01-19T09:09:43-0300  uid 0
xlated 296B  jited 229B  memlock 4096B  map_ids 

[tip:perf/core] perf tools: Handle PERF_RECORD_KSYMBOL

2019-01-22 Thread tip-bot for Song Liu
Commit-ID:  9aa0bfa370b278a539077002b3c660468d66b5e7
Gitweb: https://git.kernel.org/tip/9aa0bfa370b278a539077002b3c660468d66b5e7
Author: Song Liu 
AuthorDate: Thu, 17 Jan 2019 08:15:17 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 Jan 2019 17:00:57 -0300

perf tools: Handle PERF_RECORD_KSYMBOL

This patch handles PERF_RECORD_KSYMBOL in perf record/report.
Specifically, map and symbol are created for ksymbol register, and
removed for ksymbol unregister.
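
As a rough model of that register/unregister handling, the sketch below
keeps a small fixed-size table instead of perf's real map and symbol
objects. All sketch_ names are hypothetical; the unregister flag value
(1 << 0) matches PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER from the uapi header in
this series.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_SYMS 16
#define SKETCH_KSYMBOL_FLAGS_UNREGISTER (1 << 0)

struct sketch_sym {
	uint64_t addr;
	uint32_t len;
	char name[64];
	int used;
};

struct sketch_symtab {
	struct sketch_sym syms[SKETCH_MAX_SYMS];
};

/* Add a symbol on register, drop it on unregister; 0 on success */
static int sketch_process_ksymbol(struct sketch_symtab *tab, uint64_t addr,
				  uint32_t len, uint16_t flags,
				  const char *name)
{
	int i;

	if (flags & SKETCH_KSYMBOL_FLAGS_UNREGISTER) {
		for (i = 0; i < SKETCH_MAX_SYMS; i++) {
			if (tab->syms[i].used && tab->syms[i].addr == addr) {
				tab->syms[i].used = 0;
				return 0;
			}
		}
		return -1;	/* unregister for an unknown symbol */
	}

	for (i = 0; i < SKETCH_MAX_SYMS; i++) {
		struct sketch_sym *s = &tab->syms[i];

		if (!s->used) {
			s->addr = addr;
			s->len = len;
			snprintf(s->name, sizeof(s->name), "%s", name);
			s->used = 1;
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Resolve an instruction pointer to a symbol name, or NULL */
static const char *sketch_lookup(const struct sketch_symtab *tab, uint64_t ip)
{
	int i;

	for (i = 0; i < SKETCH_MAX_SYMS; i++) {
		const struct sketch_sym *s = &tab->syms[i];

		if (s->used && ip >= s->addr && ip - s->addr < s->len)
			return s->name;
	}
	return NULL;
}
```

A zero-initialized table resolves nothing; after a register event an address
inside [addr, addr + len) resolves to the name, and after the matching
unregister event it no longer does.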

This patch also sets perf_event_attr.ksymbol properly. The flag is ON by
default.

Committer notes:

Use proper inttypes.h for u64, fixing the build in some environments
like in the android NDK r15c targeting ARM 32-bit.

I.e. fixing this build error:

  util/event.c: In function 'perf_event__fprintf_ksymbol':
  util/event.c:1489:10: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=]
event->ksymbol_event.flags, event->ksymbol_event.name);
^
  cc1: all warnings being treated as errors
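
The portable way to print a 64-bit value, which is what this build fix
relies on, can be shown in a few lines. The function name is made up for
the example; PRIx64 and snprintf() are standard C99.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Format a 64-bit address with PRIx64 so the conversion matches
 * uint64_t on both 32-bit and 64-bit targets. Returns the number of
 * characters that would be written, as snprintf() does.
 */
static int sketch_format_addr(char *buf, size_t size, uint64_t addr)
{
	return snprintf(buf, size, "addr %" PRIx64, addr);
}
```

Using "%lx" here would only be correct where unsigned long is 64 bits wide,
which is not the case on 32-bit ARM.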

Signed-off-by: Song Liu 
Reviewed-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Peter Zijlstra 
Cc: kernel-t...@fb.com
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-6-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/event.c   | 21 ++
 tools/perf/util/event.h   | 20 +
 tools/perf/util/evsel.c   | 10 -
 tools/perf/util/evsel.h   |  1 +
 tools/perf/util/machine.c | 55 +++
 tools/perf/util/machine.h |  3 +++
 tools/perf/util/session.c |  4 
 tools/perf/util/tool.h|  4 +++-
 8 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 937a5a4f71cc..f06f3811b25b 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -24,6 +24,7 @@
 #include "symbol/kallsyms.h"
 #include "asm/bug.h"
 #include "stat.h"
+#include "session.h"
 
 #define DEFAULT_PROC_MAP_PARSE_TIMEOUT 500
 
@@ -45,6 +46,7 @@ static const char *perf_event__names[] = {
[PERF_RECORD_SWITCH]= "SWITCH",
[PERF_RECORD_SWITCH_CPU_WIDE]   = "SWITCH_CPU_WIDE",
[PERF_RECORD_NAMESPACES]= "NAMESPACES",
+   [PERF_RECORD_KSYMBOL]   = "KSYMBOL",
[PERF_RECORD_HEADER_ATTR]   = "ATTR",
[PERF_RECORD_HEADER_EVENT_TYPE] = "EVENT_TYPE",
[PERF_RECORD_HEADER_TRACING_DATA]   = "TRACING_DATA",
@@ -1329,6 +1331,14 @@ int perf_event__process_switch(struct perf_tool *tool __maybe_unused,
return machine__process_switch_event(machine, event);
 }
 
+int perf_event__process_ksymbol(struct perf_tool *tool __maybe_unused,
+   union perf_event *event,
+   struct perf_sample *sample __maybe_unused,
+   struct machine *machine)
+{
+   return machine__process_ksymbol(machine, event, sample);
+}
+
 size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp)
 {
return fprintf(fp, " %d/%d: [%#" PRIx64 "(%#" PRIx64 ") @ %#" PRIx64 "]: %c %s\n",
@@ -1461,6 +1471,14 @@ static size_t perf_event__fprintf_lost(union perf_event *event, FILE *fp)
return fprintf(fp, " lost %" PRIu64 "\n", event->lost.lost);
 }
 
+size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp)
+{
+   return fprintf(fp, " ksymbol event with addr %" PRIx64 " len %u type %u flags 0x%x name %s\n",
+  event->ksymbol_event.addr, event->ksymbol_event.len,
+  event->ksymbol_event.ksym_type,
+  event->ksymbol_event.flags, event->ksymbol_event.name);
+}
+
 size_t perf_event__fprintf(union perf_event *event, FILE *fp)
 {
size_t ret = fprintf(fp, "PERF_RECORD_%s",
@@ -1496,6 +1514,9 @@ size_t perf_event__fprintf(union perf_event *event, FILE *fp)
case PERF_RECORD_LOST:
ret += perf_event__fprintf_lost(event, fp);
break;
+   case PERF_RECORD_KSYMBOL:
+   ret += perf_event__fprintf_ksymbol(event, fp);
+   break;
default:
ret += fprintf(fp, "\n");
}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index eb95f3384958..018322f2a13e 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "../perf.h"
 #include "build-id.h"
@@ -84,6 +85,19 @@ struct throttle_event {
u64 stream_id;
 };
 
+#ifndef KSYM_NAME_LEN
+#define KSYM_NAME_LEN 256
+#endif
+
+struct ksymbol_event {
+   struct perf_event_header header;
+   u64 addr;
+   u32 len;
+   u16 ksym_type;
+   u16 flags;
+   char name[KSYM_NAME_LEN];
+};
+
 #define PERF_SAMPLE_MASK   

[tip:perf/core] tools headers uapi: Sync tools/include/uapi/linux/perf_event.h

2019-01-22 Thread tip-bot for Song Liu
Commit-ID:  df063c83aa2c58412ddf533ada9aaf25986120ec
Gitweb: https://git.kernel.org/tip/df063c83aa2c58412ddf533ada9aaf25986120ec
Author: Song Liu 
AuthorDate: Thu, 17 Jan 2019 08:15:16 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 Jan 2019 17:00:57 -0300

tools headers uapi: Sync tools/include/uapi/linux/perf_event.h

Sync for PERF_RECORD_BPF_EVENT.

Signed-off-by: Song Liu 
Reviewed-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Peter Zijlstra 
Cc: kernel-t...@fb.com
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-5-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/include/uapi/linux/perf_event.h | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 1dee5c8f166b..7198ddd0c6b1 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -373,7 +373,8 @@ struct perf_event_attr {
 write_backward :  1, /* Write ring buffer from end to beginning */
 namespaces :  1, /* include namespaces data */
 ksymbol:  1, /* include ksymbol events */
-   __reserved_1   : 34;
+   bpf_event  :  1, /* include bpf events */
+   __reserved_1   : 33;
 
union {
__u32   wakeup_events;/* wakeup every n events */
@@ -979,6 +980,25 @@ enum perf_event_type {
 */
PERF_RECORD_KSYMBOL = 17,
 
+   /*
+* Record bpf events:
+*  enum perf_bpf_event_type {
+*  PERF_BPF_EVENT_UNKNOWN  = 0,
+*  PERF_BPF_EVENT_PROG_LOAD= 1,
+*  PERF_BPF_EVENT_PROG_UNLOAD  = 2,
+*  };
+*
+* struct {
+*  struct perf_event_header    header;
+*  u16                         type;
+*  u16                         flags;
+*  u32                         id;
+*  u8                          tag[BPF_TAG_SIZE];
+*  struct sample_id            sample_id;
+* };
+*/
+   PERF_RECORD_BPF_EVENT   = 18,
+
PERF_RECORD_MAX,/* non-ABI */
 };
 
@@ -990,6 +1010,13 @@ enum perf_record_ksymbol_type {
 
 #define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER   (1 << 0)
 
+enum perf_bpf_event_type {
+   PERF_BPF_EVENT_UNKNOWN  = 0,
+   PERF_BPF_EVENT_PROG_LOAD= 1,
+   PERF_BPF_EVENT_PROG_UNLOAD  = 2,
+   PERF_BPF_EVENT_MAX, /* non-ABI */
+};
+
 #define PERF_MAX_STACK_DEPTH   127
 #define PERF_MAX_CONTEXTS_PER_STACK  8
 


[tip:perf/core] perf, bpf: Introduce PERF_RECORD_BPF_EVENT

2019-01-22 Thread tip-bot for Song Liu
Commit-ID:  6ee52e2a3fe4ea35520720736e6791df1fb67106
Gitweb: https://git.kernel.org/tip/6ee52e2a3fe4ea35520720736e6791df1fb67106
Author: Song Liu 
AuthorDate: Thu, 17 Jan 2019 08:15:15 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 Jan 2019 17:00:57 -0300

perf, bpf: Introduce PERF_RECORD_BPF_EVENT

For better performance analysis of BPF programs, this patch introduces
PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program
load/unload information to user space.

Each BPF program may contain up to BPF_MAX_SUBPROGS (256) sub programs.
The following example shows kernel symbols for a BPF program with 7 sub
programs:

a0257cf9 t bpf_prog_b07ccb89267cf242_F
a02592e1 t bpf_prog_2dcecc18072623fc_F
a025b0e9 t bpf_prog_bb7a405ebaec5d5c_F
a025dd2c t bpf_prog_a7540d4a39ec1fc7_F
a025fcca t bpf_prog_05762d4ade0e3737_F
a026108f t bpf_prog_db4bd11e35df90d4_F
a0263f00 t bpf_prog_89d64e4abf0f0126_F
a0257cf9 t bpf_prog_ae31629322c4b018__dummy_tracepoi

When a bpf program is loaded, PERF_RECORD_KSYMBOL is generated for each
of these sub programs. Therefore, PERF_RECORD_BPF_EVENT is not needed
for simple profiling.

For annotation, user space needs to listen to PERF_RECORD_BPF_EVENT and
gather more information about these (sub) programs via sys_bpf.
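
One small piece of that correlation work is mapping the 8-byte tag carried
in a PERF_RECORD_BPF_EVENT to the "bpf_prog_<hex tag>" prefix used in the
kernel symbol names listed above. The sketch below is an illustration only;
the function name is hypothetical and the real symbol may carry an extra
"_<name>" suffix that this code does not attempt to build.

```c
#include <stdint.h>
#include <stdio.h>

#define SKETCH_BPF_TAG_SIZE 8

/* Build "bpf_prog_<16 hex digits>" from a tag; 0 on success, -1 if buf is too small */
static int sketch_tag_to_symname(const uint8_t tag[SKETCH_BPF_TAG_SIZE],
				 char *buf, size_t size)
{
	size_t off;
	int i, n;

	n = snprintf(buf, size, "bpf_prog_");
	if (n < 0 || (size_t)n >= size)
		return -1;
	off = (size_t)n;

	for (i = 0; i < SKETCH_BPF_TAG_SIZE; i++) {
		n = snprintf(buf + off, size - off, "%02x", tag[i]);
		if (n < 0 || (size_t)n >= size - off)
			return -1;
		off += (size_t)n;
	}
	return 0;
}
```

A tool could compare the resulting prefix against names seen in
PERF_RECORD_KSYMBOL records to decide which symbols belong to the program
that was just loaded.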

Signed-off-by: Song Liu 
Reviewed-by: Arnaldo Carvalho de Melo 
Acked-by: Alexei Starovoitov 
Acked-by: Peter Zijlstra (Intel) 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Daniel Borkmann 
Cc: Peter Zijlstra 
Cc: kernel-t...@fb.com
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-4-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 include/linux/filter.h  |   7 +++
 include/linux/perf_event.h  |   6 +++
 include/uapi/linux/perf_event.h |  29 +-
 kernel/bpf/core.c   |   2 +-
 kernel/bpf/syscall.c|   2 +
 kernel/events/core.c| 115 
 6 files changed, 159 insertions(+), 2 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index ad106d845b22..d531d4250bff 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -951,6 +951,7 @@ bpf_address_lookup(unsigned long addr, unsigned long *size,
 
 void bpf_prog_kallsyms_add(struct bpf_prog *fp);
 void bpf_prog_kallsyms_del(struct bpf_prog *fp);
+void bpf_get_prog_name(const struct bpf_prog *prog, char *sym);
 
 #else /* CONFIG_BPF_JIT */
 
@@ -1006,6 +1007,12 @@ static inline void bpf_prog_kallsyms_add(struct bpf_prog *fp)
 static inline void bpf_prog_kallsyms_del(struct bpf_prog *fp)
 {
 }
+
+static inline void bpf_get_prog_name(const struct bpf_prog *prog, char *sym)
+{
+   sym[0] = '\0';
+}
+
 #endif /* CONFIG_BPF_JIT */
 
 void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp);
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 136fe0495374..a79e59fc3b7d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1125,6 +1125,9 @@ extern void perf_event_mmap(struct vm_area_struct *vma);
 
 extern void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
   bool unregister, const char *sym);
+extern void perf_event_bpf_event(struct bpf_prog *prog,
+enum perf_bpf_event_type type,
+u16 flags);
 
 extern struct perf_guest_info_callbacks *perf_guest_cbs;
 extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
@@ -1350,6 +1353,9 @@ static inline void perf_event_mmap(struct vm_area_struct *vma){ }
 typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
 static inline void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
  bool unregister, const char *sym) { }
+static inline void perf_event_bpf_event(struct bpf_prog *prog,
+   enum perf_bpf_event_type type,
+   u16 flags)  { }
 static inline void perf_event_exec(void)   { }
 static inline void perf_event_comm(struct task_struct *tsk, bool exec) { }
 static inline void perf_event_namespaces(struct task_struct *tsk)  { }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 1dee5c8f166b..7198ddd0c6b1 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -373,7 +373,8 @@ struct perf_event_attr {
 write_backward :  1, /* Write ring buffer from end to beginning */
 namespaces :  1, /* include namespaces data */
 ksymbol:  1, /* include ksymbol events */
-   __reserved_1   : 34;
+   bpf_event  :  1, 

[tip:perf/core] tools headers uapi: Sync tools/include/uapi/linux/perf_event.h

2019-01-22 Thread tip-bot for Song Liu
Commit-ID:  d764ac6464915523e68e220b6aa4c3c2eb8e3f94
Gitweb: https://git.kernel.org/tip/d764ac6464915523e68e220b6aa4c3c2eb8e3f94
Author: Song Liu 
AuthorDate: Thu, 17 Jan 2019 08:15:14 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 Jan 2019 17:00:57 -0300

tools headers uapi: Sync tools/include/uapi/linux/perf_event.h

Sync changes for PERF_RECORD_KSYMBOL.

Signed-off-by: Song Liu 
Reviewed-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Peter Zijlstra 
Cc: kernel-t...@fb.com
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-3-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/include/uapi/linux/perf_event.h | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index ea19b5d491bf..1dee5c8f166b 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -372,7 +372,8 @@ struct perf_event_attr {
context_switch :  1, /* context switch data */
 write_backward :  1, /* Write ring buffer from end to beginning */
 namespaces :  1, /* include namespaces data */
-   __reserved_1   : 35;
+   ksymbol:  1, /* include ksymbol events */
+   __reserved_1   : 34;
 
union {
__u32   wakeup_events;/* wakeup every n events */
@@ -963,9 +964,32 @@ enum perf_event_type {
 */
PERF_RECORD_NAMESPACES  = 16,
 
+   /*
+* Record ksymbol register/unregister events:
+*
+* struct {
+*  struct perf_event_header    header;
+*  u64                         addr;
+*  u32                         len;
+*  u16                         ksym_type;
+*  u16                         flags;
+*  char                        name[];
+*  struct sample_id            sample_id;
+* };
+*/
+   PERF_RECORD_KSYMBOL = 17,
+
PERF_RECORD_MAX,/* non-ABI */
 };
 
+enum perf_record_ksymbol_type {
+   PERF_RECORD_KSYMBOL_TYPE_UNKNOWN= 0,
+   PERF_RECORD_KSYMBOL_TYPE_BPF= 1,
+   PERF_RECORD_KSYMBOL_TYPE_MAX/* non-ABI */
+};
+
+#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER   (1 << 0)
+
 #define PERF_MAX_STACK_DEPTH   127
 #define PERF_MAX_CONTEXTS_PER_STACK  8
 


[tip:perf/core] perf, bpf: Introduce PERF_RECORD_KSYMBOL

2019-01-22 Thread tip-bot for Song Liu
Commit-ID:  76193a94522f1d4edf2447a536f3f796ce56343b
Gitweb: https://git.kernel.org/tip/76193a94522f1d4edf2447a536f3f796ce56343b
Author: Song Liu 
AuthorDate: Thu, 17 Jan 2019 08:15:13 -0800
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 21 Jan 2019 17:00:57 -0300

perf, bpf: Introduce PERF_RECORD_KSYMBOL

For better performance analysis of dynamically JITed and loaded kernel
functions, such as BPF programs, this patch introduces
PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
register/unregister information to user space.

The following data structure is used for PERF_RECORD_KSYMBOL.

/*
 * struct {
 *  struct perf_event_header    header;
 *  u64                         addr;
 *  u32                         len;
 *  u16                         ksym_type;
 *  u16                         flags;
 *  char                        name[];
 *  struct sample_id            sample_id;
 * };
 */
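
A user-space reader of this record might decode it roughly as below. This is
a sketch under several assumptions: the buffer is in host byte order, the
8-byte header is laid out as u32 type, u16 misc, u16 size, the trailing
sample_id is ignored, and all sketch_ names are invented for the example.
The record type value 17 is PERF_RECORD_KSYMBOL from the uapi change in this
patch.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_PERF_RECORD_KSYMBOL 17
#define SKETCH_KSYMBOL_FIXED_SIZE 24	/* header (8) + addr (8) + len (4) + type (2) + flags (2) */

struct sketch_ksymbol {
	uint64_t addr;
	uint32_t len;
	uint16_t ksym_type;
	uint16_t flags;
	const char *name;	/* points into the caller's buffer */
};

/* Decode one raw record; returns 0 on success, -1 if it is malformed */
static int sketch_parse_ksymbol(const uint8_t *buf, size_t buf_len,
				struct sketch_ksymbol *out)
{
	uint32_t type;
	uint16_t size;

	if (buf_len < SKETCH_KSYMBOL_FIXED_SIZE)
		return -1;

	memcpy(&type, buf, sizeof(type));
	memcpy(&size, buf + 6, sizeof(size));
	if (type != SKETCH_PERF_RECORD_KSYMBOL ||
	    size > buf_len || size <= SKETCH_KSYMBOL_FIXED_SIZE)
		return -1;

	memcpy(&out->addr, buf + 8, sizeof(out->addr));
	memcpy(&out->len, buf + 16, sizeof(out->len));
	memcpy(&out->ksym_type, buf + 20, sizeof(out->ksym_type));
	memcpy(&out->flags, buf + 22, sizeof(out->flags));

	/* The name must be NUL-terminated inside the record */
	if (!memchr(buf + SKETCH_KSYMBOL_FIXED_SIZE, '\0',
		    size - SKETCH_KSYMBOL_FIXED_SIZE))
		return -1;
	out->name = (const char *)(buf + SKETCH_KSYMBOL_FIXED_SIZE);
	return 0;
}
```

memcpy() is used for every field so the code does not depend on the buffer
being suitably aligned for direct struct access.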

Signed-off-by: Song Liu 
Reviewed-by: Arnaldo Carvalho de Melo 
Tested-by: Arnaldo Carvalho de Melo 
Acked-by: Peter Zijlstra 
Cc: Alexei Starovoitov 
Cc: Daniel Borkmann 
Cc: Peter Zijlstra 
Cc: kernel-t...@fb.com
Cc: net...@vger.kernel.org
Link: http://lkml.kernel.org/r/20190117161521.1341602-2-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 include/linux/perf_event.h  |  8 
 include/uapi/linux/perf_event.h | 26 ++-
 kernel/events/core.c| 98 -
 3 files changed, 130 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 4eb88065a9b5..136fe0495374 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1122,6 +1122,10 @@ static inline void perf_event_task_sched_out(struct task_struct *prev,
 }
 
 extern void perf_event_mmap(struct vm_area_struct *vma);
+
+extern void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
+  bool unregister, const char *sym);
+
 extern struct perf_guest_info_callbacks *perf_guest_cbs;
 extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
 extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
@@ -1342,6 +1346,10 @@ static inline int perf_unregister_guest_info_callbacks
 (struct perf_guest_info_callbacks *callbacks)  { return 0; }
 
 static inline void perf_event_mmap(struct vm_area_struct *vma) { }
+
+typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data);
+static inline void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len,
+ bool unregister, const char *sym) { }
 static inline void perf_event_exec(void)   { }
 static inline void perf_event_comm(struct task_struct *tsk, bool exec) { }
 static inline void perf_event_namespaces(struct task_struct *tsk)  { }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index ea19b5d491bf..1dee5c8f166b 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -372,7 +372,8 @@ struct perf_event_attr {
context_switch :  1, /* context switch data */
 write_backward :  1, /* Write ring buffer from end to beginning */
 namespaces :  1, /* include namespaces data */
-   __reserved_1   : 35;
+   ksymbol:  1, /* include ksymbol events */
+   __reserved_1   : 34;
 
union {
__u32   wakeup_events;/* wakeup every n events */
@@ -963,9 +964,32 @@ enum perf_event_type {
 */
PERF_RECORD_NAMESPACES  = 16,
 
+   /*
+* Record ksymbol register/unregister events:
+*
+* struct {
+*  struct perf_event_header    header;
+*  u64                         addr;
+*  u32                         len;
+*  u16                         ksym_type;
+*  u16                         flags;
+*  char                        name[];
+*  struct sample_id            sample_id;
+* };
+*/
+   PERF_RECORD_KSYMBOL = 17,
+
PERF_RECORD_MAX,/* non-ABI */
 };
 
+enum perf_record_ksymbol_type {
+   PERF_RECORD_KSYMBOL_TYPE_UNKNOWN= 0,
+   PERF_RECORD_KSYMBOL_TYPE_BPF= 1,
+   PERF_RECORD_KSYMBOL_TYPE_MAX/* non-ABI */
+};
+
+#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER   (1 << 0)
+
 #define PERF_MAX_STACK_DEPTH   127
 #define PERF_MAX_CONTEXTS_PER_STACK  8
 
diff --git a/kernel/events/core.c b/kernel/events/core.c

[tip:perf/core] perf/core: Fix bad use of igrab()

2018-05-25 Thread tip-bot for Song Liu
Commit-ID:  9511bce9fe8e5e6c0f923c09243a713eba560141
Gitweb: https://git.kernel.org/tip/9511bce9fe8e5e6c0f923c09243a713eba560141
Author: Song Liu 
AuthorDate: Tue, 17 Apr 2018 23:29:07 -0700
Committer:  Ingo Molnar 
CommitDate: Fri, 25 May 2018 08:11:10 +0200

perf/core: Fix bad use of igrab()

As Miklos reported and suggested:

 "This pattern repeats two times in trace_uprobe.c and in
  kernel/events/core.c as well:

  ret = kern_path(filename, LOOKUP_FOLLOW, &path);
  if (ret)
  goto fail_address_parse;

  inode = igrab(d_inode(path.dentry));
  path_put(&path);

  And it's wrong.  You can only hold a reference to the inode if you
  have an active ref to the superblock as well (which is normally
  through path.mnt) or holding s_umount.

  This way unmounting the containing filesystem while the tracepoint is
  active will give you the "VFS: Busy inodes after unmount..." message
  and a crash when the inode is finally put.

  Solution: store path instead of inode."

This patch fixes the issue in kernel/events/core.c.

Reviewed-and-tested-by: Alexander Shishkin 
Reported-by: Miklos Szeredi 
Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Fixes: 375637bc5249 ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/20180418062907.3210386-2-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 arch/x86/events/intel/pt.c |  4 ++--
 include/linux/perf_event.h |  2 +-
 kernel/events/core.c   | 21 +
 3 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 3b993942a0e4..8d016ce5b80d 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -1194,7 +1194,7 @@ static int pt_event_addr_filters_validate(struct list_head *filters)
filter->action == PERF_ADDR_FILTER_ACTION_START)
return -EOPNOTSUPP;
 
-   if (!filter->inode) {
+   if (!filter->path.dentry) {
if (!valid_kernel_ip(filter->offset))
return -EINVAL;
 
@@ -1221,7 +1221,7 @@ static void pt_event_addr_filters_sync(struct perf_event *event)
return;
 
list_for_each_entry(filter, &head->list, entry) {
-   if (filter->inode && !offs[range]) {
+   if (filter->path.dentry && !offs[range]) {
msr_a = msr_b = 0;
} else {
/* apply the offset */
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index def866f7269b..bea0b0cd4bf7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -467,7 +467,7 @@ enum perf_addr_filter_action_t {
  */
 struct perf_addr_filter {
struct list_headentry;
-   struct inode*inode;
+   struct path path;
unsigned long   offset;
unsigned long   size;
enum perf_addr_filter_action_t  action;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index ce6aa5ff3c96..24dea13a27ed 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6668,7 +6668,7 @@ static void perf_event_addr_filters_exec(struct perf_event *event, void *data)
 
raw_spin_lock_irqsave(&ifh->lock, flags);
list_for_each_entry(filter, &ifh->list, entry) {
-   if (filter->inode) {
+   if (filter->path.dentry) {
event->addr_filters_offs[count] = 0;
restart++;
}
@@ -7333,7 +7333,7 @@ static bool perf_addr_filter_match(struct perf_addr_filter *filter,
 struct file *file, unsigned long offset,
 unsigned long size)
 {
-   if (filter->inode != file_inode(file))
+   if (d_inode(filter->path.dentry) != file_inode(file))
return false;
 
if (filter->offset > offset + size)
@@ -8686,8 +8686,7 @@ static void free_filters_list(struct list_head *filters)
struct perf_addr_filter *filter, *iter;
 
list_for_each_entry_safe(filter, iter, filters, entry) {
-   if (filter->inode)
-   iput(filter->inode);
+   path_put(&filter->path);
list_del(&filter->entry);
kfree(filter);
}
@@ -8784,7 +8783,7 @@ static void perf_event_addr_filters_apply(struct 
perf_event *event)
 * Adjust base offset if the filter is associated to a binary
 * that needs to be mapped:
 */
-   if (filter->inode)
+   if (filter->path.dentry)
event->addr_filters_offs[count] =
perf_addr_filter_apply(filter, mm);
 
@@ -8858,7 +8857,6 

[tip:perf/core] perf/core: Fix group scheduling with mixed hw and sw events

2018-05-25 Thread tip-bot for Song Liu
Commit-ID:  a1150c202207cc8501bebc45b63c264f91959260
Gitweb: https://git.kernel.org/tip/a1150c202207cc8501bebc45b63c264f91959260
Author: Song Liu 
AuthorDate: Thu, 3 May 2018 12:47:16 -0700
Committer:  Ingo Molnar 
CommitDate: Fri, 25 May 2018 08:11:10 +0200

perf/core: Fix group scheduling with mixed hw and sw events

When hw and sw events are mixed in the same group, they are all attached
to the hw perf_event_context. This sometimes requires moving a group of
perf_events to a different context.

We found a bug in how the kernel handles this, for example if we do:

   perf stat -e '{faults,ref-cycles,faults}'  -I 1000

     1.005591180              1,297      faults
     1.005591180        457,476,576      ref-cycles
     1.005591180    <not supported>      faults

First, sw event "faults" is attached to the sw context, and becomes the
group leader. Then, hw event "ref-cycles" is attached, so both events
are moved to the hw context. Last, another sw "faults" tries to attach,
but it fails because of mismatch between the new target ctx (from sw
pmu) and the group_leader's ctx (hw context, same as ref-cycles).

The broken condition is:
   group_leader is sw event;
   group_leader is on hw context;
   add a sw event to the group.

Fix this scenario by checking group_leader's context (instead of just
event type). If group_leader is on hw context, use the ->pmu of this
context to look up context for the new event.
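
The broken condition can be illustrated with a minimal user-space sketch.
This is a simplified model, not the kernel code: struct model_event,
enum ctx_kind and the model_* helpers are hypothetical stand-ins, and the
move_group path (a hw event joining a pure sw group) is deliberately left
out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two kinds of perf_event_context. */
enum ctx_kind { CTX_HW, CTX_SW };

/* Hypothetical, reduced model of a perf_event. */
struct model_event {
	bool is_software;	/* models is_software_event(event) */
	enum ctx_kind ctx;	/* context the event currently sits on */
};

/* Models in_software_context(): is the event on the sw context? */
static bool model_in_software_context(const struct model_event *e)
{
	return e->ctx == CTX_SW;
}

/* Context an event would get from its own pmu. */
static enum ctx_kind model_own_ctx(const struct model_event *e)
{
	return e->is_software ? CTX_SW : CTX_HW;
}

/*
 * Old check: only compares event types. A sw event joining a sw leader
 * is never redirected, even if the leader was already moved to hw context.
 */
static enum ctx_kind model_target_ctx_old(const struct model_event *event,
					  const struct model_event *leader)
{
	if (leader && event->is_software != leader->is_software &&
	    event->is_software)
		return CTX_HW;	/* leader is a hw event, use its pmu */
	return model_own_ctx(event);
}

/* New check: looks at the context the leader is actually on. */
static enum ctx_kind model_target_ctx_new(const struct model_event *event,
					  const struct model_event *leader)
{
	if (leader && event->is_software && !model_in_software_context(leader))
		return leader->ctx;	/* follow the leader's context */
	return model_own_ctx(event);
}
```

With a leader of { .is_software = true, .ctx = CTX_HW } (a sw leader that
was moved when ref-cycles joined), the old helper still picks CTX_SW for the
second "faults" event, which mismatches the leader's context, while the new
helper picks CTX_HW.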

Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Fixes: b04243ef7006 ("perf: Complete software pmu grouping")
Link: http://lkml.kernel.org/r/20180503194716.162815-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 include/linux/perf_event.h |  8 
 kernel/events/core.c   | 21 +++--
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e71e99eb9a4e..def866f7269b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1016,6 +1016,14 @@ static inline int is_software_event(struct perf_event *event)
return event->event_caps & PERF_EV_CAP_SOFTWARE;
 }
 
+/*
+ * Return 1 for event in sw context, 0 for event in hw context
+ */
+static inline int in_software_context(struct perf_event *event)
+{
+   return event->ctx->pmu->task_ctx_nr == perf_sw_context;
+}
+
 extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
 
 extern void ___perf_sw_event(u32, u64, struct pt_regs *, u64);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 67612ce359ad..ce6aa5ff3c96 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10521,19 +10521,20 @@ SYSCALL_DEFINE5(perf_event_open,
if (pmu->task_ctx_nr == perf_sw_context)
event->event_caps |= PERF_EV_CAP_SOFTWARE;
 
-   if (group_leader &&
-   (is_software_event(event) != is_software_event(group_leader))) {
-   if (is_software_event(event)) {
+   if (group_leader) {
+   if (is_software_event(event) &&
+   !in_software_context(group_leader)) {
/*
-* If event and group_leader are not both a software
-* event, and event is, then group leader is not.
+* If the event is a sw event, but the group_leader
+* is on hw context.
 *
-* Allow the addition of software events to !software
-* groups, this is safe because software events never
-* fail to schedule.
+* Allow the addition of software events to hw
+* groups, this is safe because software events
+* never fail to schedule.
 */
-   pmu = group_leader->pmu;
-   } else if (is_software_event(group_leader) &&
+   pmu = group_leader->ctx->pmu;
+   } else if (!is_software_event(event) &&
+  is_software_event(group_leader) &&
   (group_leader->group_caps & PERF_EV_CAP_SOFTWARE)) {
/*
 * In case the group is a pure software group, and we


[tip:perf/urgent] trace_kprobe: Remove warning message "Could not insert probe at..."

2018-04-16 Thread tip-bot for Song Liu
Commit-ID:  5c8dad48e4f53d6fd0a7e4f95d7c1c983374de88
Gitweb: https://git.kernel.org/tip/5c8dad48e4f53d6fd0a7e4f95d7c1c983374de88
Author: Song Liu 
AuthorDate: Fri, 13 Apr 2018 11:55:13 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 17 Apr 2018 07:54:57 +0200

trace_kprobe: Remove warning message "Could not insert probe at..."

This warning message is not very helpful, as the return value should
already show information about the error. Also, this message will
spam dmesg if the user space does testing in a loop, like:

for x in {0..5}
do
echo p:xx xx+$x >> /sys/kernel/debug/tracing/kprobe_events
done

Reported-by: Vince Weaver 
Signed-off-by: Song Liu 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20180413185513.3626052-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 kernel/trace/trace_kprobe.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 1cd3fb4d70f8..02aed76e0978 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -512,8 +512,6 @@ static int __register_trace_kprobe(struct trace_kprobe *tk)
if (ret == 0)
tk->tp.flags |= TP_FLAG_REGISTERED;
else {
-   pr_warn("Could not insert probe at %s+%lu: %d\n",
-   trace_kprobe_symbol(tk), trace_kprobe_offset(tk), ret);
if (ret == -ENOENT && trace_kprobe_is_on_module(tk)) {
pr_warn("This probe might be able to register after target module is loaded. Continue.\n");
ret = 0;


[tip:perf/urgent] perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()

2018-04-12 Thread tip-bot for Song Liu
Commit-ID:  32e6e967fb36bf77ed99221ae3ce1909f045d8f9
Gitweb: https://git.kernel.org/tip/32e6e967fb36bf77ed99221ae3ce1909f045d8f9
Author: Song Liu 
AuthorDate: Wed, 11 Apr 2018 18:02:37 +
Committer:  Ingo Molnar 
CommitDate: Thu, 12 Apr 2018 09:55:50 +0200

perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()

Non-root users cannot create a kprobe or uprobe through the text-based
interface (kprobe_events, uprobe_events), so they should not be able
to create probes via perf_event_open() either.

Reported-by: Vince Weaver 
Signed-off-by: Song Liu 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Fixes: 33ea4b24277b ("perf/core: Implement the 'perf_uprobe' PMU")
Fixes: e12f03d7031a ("perf/core: Implement the 'perf_kprobe' PMU")
Link: http://lkml.kernel.org/r/c0b2efb5-c403-4bdb-9046-c14b3ee66...@fb.com
Signed-off-by: Ingo Molnar 
---
 kernel/events/core.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index d7af82827373..2d5fe26551f8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8400,6 +8400,10 @@ static int perf_kprobe_event_init(struct perf_event *event)
 
if (event->attr.type != perf_kprobe.type)
return -ENOENT;
+
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
+
/*
 * no branch sampling for probe events
 */
@@ -8437,6 +8441,10 @@ static int perf_uprobe_event_init(struct perf_event *event)
 
if (event->attr.type != perf_uprobe.type)
return -ENOENT;
+
+   if (!capable(CAP_SYS_ADMIN))
+   return -EACCES;
+
/*
 * no branch sampling for probe events
 */


[tip:perf/urgent] perf/cgroup: Fix child event counting bug

2018-03-20 Thread tip-bot for Song Liu
Commit-ID:  c917e0f259908e75bd2a65877e25f9d90c22c848
Gitweb: https://git.kernel.org/tip/c917e0f259908e75bd2a65877e25f9d90c22c848
Author: Song Liu 
AuthorDate: Mon, 12 Mar 2018 09:59:43 -0700
Committer:  Ingo Molnar 
CommitDate: Tue, 20 Mar 2018 08:58:47 +0100

perf/cgroup: Fix child event counting bug

When a perf_event is attached to parent cgroup, it should count events
for all children cgroups:

   parent_group   < perf_event
 \
  - child_group  < process(es)

However, in our tests, we found this perf_event cannot report reliable
results. Here is an example case:

  # create cgroups
  mkdir -p /sys/fs/cgroup/p/c
  # start perf for parent group
  perf stat -e instructions -G "p"

  # on another console, run test process in child cgroup:
  stressapptest -s 2 -M 1000 & echo $! > /sys/fs/cgroup/p/c/cgroup.procs

  # after the test process is done, stop perf in the first console; it shows:

       <not counted>      instructions              p

The instructions event should not be "not counted", as the process runs in
the child cgroup.

We found this is because perf_event->cgrp and cpuctx->cgrp are not
identical, thus perf_event->cgrp is not updated properly.

This patch fixes this by updating perf_cgroup properly for ancestor
cgroup(s).
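
The ancestor walk can be sketched in user space as follows. This is only an
illustration of the pattern, assuming a reduced data layout: struct
model_css, struct model_cgroup, model_container_of and
model_set_timestamp_with_ancestors are hypothetical names, not kernel
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cgroup_subsys_state: only the parent link. */
struct model_css {
	struct model_css *parent;
};

/* Hypothetical stand-in for perf_cgroup with one timestamp field. */
struct model_cgroup {
	struct model_css css;
	unsigned long long timestamp;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Update the cgroup and every ancestor, so that an event attached to a
 * parent cgroup sees the same timestamp as the child that is running.
 */
static void model_set_timestamp_with_ancestors(struct model_cgroup *cgrp,
						unsigned long long now)
{
	struct model_css *css;

	assert(cgrp != NULL);
	for (css = &cgrp->css; css; css = css->parent) {
		struct model_cgroup *cur =
			model_container_of(css, struct model_cgroup, css);

		cur->timestamp = now;
	}
}
```

Updating only the child would leave the parent's timestamp stale, which
corresponds to the unreliable counts described above.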

Reported-by: Ephraim Park 
Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: 
Cc: 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Link: http://lkml.kernel.org/r/20180312165943.1057894-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 kernel/events/core.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4b838470fac4..709a55b9ad97 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -724,9 +724,15 @@ static inline void __update_cgrp_time(struct perf_cgroup *cgrp)
 
 static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx)
 {
-   struct perf_cgroup *cgrp_out = cpuctx->cgrp;
-   if (cgrp_out)
-   __update_cgrp_time(cgrp_out);
+   struct perf_cgroup *cgrp = cpuctx->cgrp;
+   struct cgroup_subsys_state *css;
+
+   if (cgrp) {
+   for (css = &cgrp->css; css; css = css->parent) {
+   cgrp = container_of(css, struct perf_cgroup, css);
+   __update_cgrp_time(cgrp);
+   }
+   }
 }
 
 static inline void update_cgrp_time_from_event(struct perf_event *event)
@@ -754,6 +760,7 @@ perf_cgroup_set_timestamp(struct task_struct *task,
 {
struct perf_cgroup *cgrp;
struct perf_cgroup_info *info;
+   struct cgroup_subsys_state *css;
 
/*
 * ctx->lock held by caller
@@ -764,8 +771,12 @@ perf_cgroup_set_timestamp(struct task_struct *task,
return;
 
cgrp = perf_cgroup_from_task(task, ctx);
-   info = this_cpu_ptr(cgrp->info);
-   info->timestamp = ctx->timestamp;
+
+   for (css = &cgrp->css; css; css = css->parent) {
+   cgrp = container_of(css, struct perf_cgroup, css);
+   info = this_cpu_ptr(cgrp->info);
+   info->timestamp = ctx->timestamp;
+   }
 }
 
 static DEFINE_PER_CPU(struct list_head, cgrp_cpuctx_list);


[tip:perf/urgent] perf/core: Fix ctx_event_type in ctx_resched()

2018-03-09 Thread tip-bot for Song Liu
Commit-ID:  bd903afeb504db5655a45bb4cf86f38be5b1bf62
Gitweb: https://git.kernel.org/tip/bd903afeb504db5655a45bb4cf86f38be5b1bf62
Author: Song Liu 
AuthorDate: Mon, 5 Mar 2018 21:55:04 -0800
Committer:  Ingo Molnar 
CommitDate: Fri, 9 Mar 2018 08:03:02 +0100

perf/core: Fix ctx_event_type in ctx_resched()

In ctx_resched(), EVENT_FLEXIBLE should be sched_out when EVENT_PINNED is
added. However, ctx_resched() calculates ctx_event_type before checking
this condition. As a result, pinned events will NOT get higher priority
than flexible events.

The following shows this issue on an Intel CPU (where ref-cycles can
only use one hardware counter).

  1. First start:
   perf stat -C 0 -e ref-cycles  -I 1000
  2. Then, in the second console, run:
   perf stat -C 0 -e ref-cycles:D -I 1000

The second perf uses pinned events, which are expected to have higher
priority. However, because of the failure in ctx_resched(), it is never
run.

This patch fixes this by calculating ctx_event_type after re-evaluating
event_type.
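
The ordering problem can be reproduced with plain bit masks. The sketch
below is a user-space illustration only; the flag values and the two helper
functions are assumptions chosen for the example and are not copied from the
kernel.

```c
/* Illustrative flag values (assumed for this sketch). */
enum {
	EVENT_FLEXIBLE = 0x1,
	EVENT_PINNED   = 0x2,
	EVENT_CPU      = 0x8,
	EVENT_ALL      = EVENT_FLEXIBLE | EVENT_PINNED,
};

/* Before the fix: the mask is taken before event_type is widened. */
static unsigned int ctx_event_type_before_fix(unsigned int event_type)
{
	unsigned int ctx_event_type = event_type & EVENT_ALL;

	if (event_type & EVENT_PINNED)
		event_type |= EVENT_FLEXIBLE;	/* too late for the mask */

	return ctx_event_type;
}

/* After the fix: widen event_type first, then take the mask. */
static unsigned int ctx_event_type_after_fix(unsigned int event_type)
{
	if (event_type & EVENT_PINNED)
		event_type |= EVENT_FLEXIBLE;

	return event_type & EVENT_ALL;
}
```

For an input of EVENT_PINNED | EVENT_CPU, the first helper yields only
EVENT_PINNED, so flexible events would not be scheduled out, while the
second yields EVENT_PINNED | EVENT_FLEXIBLE.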

Reported-by: Ephraim Park 
Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: 
Cc: 
Cc: Alexander Shishkin 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Stephane Eranian 
Cc: Thomas Gleixner 
Cc: Vince Weaver 
Fixes: 487f05e18aa4 ("perf/core: Optimize event rescheduling on active contexts")
Link: http://lkml.kernel.org/r/20180306055504.3283731-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 kernel/events/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 96db9ae5d5af..4b838470fac4 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2246,7 +2246,7 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
struct perf_event_context *task_ctx,
enum event_type_t event_type)
 {
-   enum event_type_t ctx_event_type = event_type & EVENT_ALL;
+   enum event_type_t ctx_event_type;
bool cpu_event = !!(event_type & EVENT_CPU);
 
/*
@@ -2256,6 +2256,8 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
if (event_type & EVENT_PINNED)
event_type |= EVENT_FLEXIBLE;
 
+   ctx_event_type = event_type & EVENT_ALL;
+
perf_pmu_disable(cpuctx->ctx.pmu);
if (task_ctx)
task_ctx_sched_out(cpuctx, task_ctx, event_type);


[tip:perf/core] perf/core: Implement the 'perf_uprobe' PMU

2018-02-06 Thread tip-bot for Song Liu
Commit-ID:  33ea4b24277b06dbc55d7f5772a46f029600255e
Gitweb: https://git.kernel.org/tip/33ea4b24277b06dbc55d7f5772a46f029600255e
Author: Song Liu 
AuthorDate: Wed, 6 Dec 2017 14:45:16 -0800
Committer:  Ingo Molnar 
CommitDate: Tue, 6 Feb 2018 11:29:28 +0100

perf/core: Implement the 'perf_uprobe' PMU

This patch adds perf_uprobe support, following a pattern similar to the
previous patch (for kprobe).

Two functions, create_local_trace_uprobe() and
destroy_local_trace_uprobe(), are created so a uprobe can be created
and attached to the file descriptor created by perf_event_open().

Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Yonghong Song 
Reviewed-by: Josef Bacik 
Cc: 
Cc: 
Cc: 
Cc: 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20171206224518.3598254-7-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 include/linux/trace_events.h|  4 ++
 kernel/events/core.c| 48 ++-
 kernel/trace/trace_event_perf.c | 53 +
 kernel/trace/trace_probe.h  |  4 ++
 kernel/trace/trace_uprobe.c | 86 +
 5 files changed, 186 insertions(+), 9 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 21c5d43..0d9d6cb 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -537,6 +537,10 @@ extern void perf_trace_del(struct perf_event *event, int 
flags);
 extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
 extern void perf_kprobe_destroy(struct perf_event *event);
 #endif
+#ifdef CONFIG_UPROBE_EVENTS
+extern int  perf_uprobe_init(struct perf_event *event, bool is_retprobe);
+extern void perf_uprobe_destroy(struct perf_event *event);
+#endif
 extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
 char *filter_str);
 extern void ftrace_profile_free_filter(struct perf_event *event);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3337355..5a54630 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7992,7 +7992,7 @@ static struct pmu perf_tracepoint = {
.read   = perf_swevent_read,
 };
 
-#ifdef CONFIG_KPROBE_EVENTS
+#if defined(CONFIG_KPROBE_EVENTS) || defined(CONFIG_UPROBE_EVENTS)
 /*
  * Flags in config, used by dynamic PMU kprobe and uprobe
  * The flags should match following PMU_FORMAT_ATTR().
@@ -8020,7 +8020,9 @@ static const struct attribute_group *probe_attr_groups[] 
= {
_format_group,
NULL,
 };
+#endif
 
+#ifdef CONFIG_KPROBE_EVENTS
 static int perf_kprobe_event_init(struct perf_event *event);
 static struct pmu perf_kprobe = {
.task_ctx_nr= perf_sw_context,
@@ -8057,12 +8059,52 @@ static int perf_kprobe_event_init(struct perf_event 
*event)
 }
 #endif /* CONFIG_KPROBE_EVENTS */
 
+#ifdef CONFIG_UPROBE_EVENTS
+static int perf_uprobe_event_init(struct perf_event *event);
+static struct pmu perf_uprobe = {
+   .task_ctx_nr= perf_sw_context,
+   .event_init = perf_uprobe_event_init,
+   .add= perf_trace_add,
+   .del= perf_trace_del,
+   .start  = perf_swevent_start,
+   .stop   = perf_swevent_stop,
+   .read   = perf_swevent_read,
+   .attr_groups= probe_attr_groups,
+};
+
+static int perf_uprobe_event_init(struct perf_event *event)
+{
+   int err;
+   bool is_retprobe;
+
+   if (event->attr.type != perf_uprobe.type)
+   return -ENOENT;
+   /*
+* no branch sampling for probe events
+*/
+   if (has_branch_stack(event))
+   return -EOPNOTSUPP;
+
+   is_retprobe = event->attr.config & PERF_PROBE_CONFIG_IS_RETPROBE;
+   err = perf_uprobe_init(event, is_retprobe);
+   if (err)
+   return err;
+
+   event->destroy = perf_uprobe_destroy;
+
+   return 0;
+}
+#endif /* CONFIG_UPROBE_EVENTS */
+
 static inline void perf_tp_register(void)
 {
perf_pmu_register(_tracepoint, "tracepoint", PERF_TYPE_TRACEPOINT);
 #ifdef CONFIG_KPROBE_EVENTS
perf_pmu_register(_kprobe, "kprobe", -1);
 #endif
+#ifdef CONFIG_UPROBE_EVENTS
+   perf_pmu_register(_uprobe, "uprobe", -1);
+#endif
 }
 
 static void perf_event_free_filter(struct perf_event *event)
@@ -8151,6 +8193,10 @@ static inline bool perf_event_is_tracing(struct 
perf_event *event)
if (event->pmu == _kprobe)
return true;
 #endif
+#ifdef CONFIG_UPROBE_EVENTS
+   if (event->pmu 

[tip:perf/core] perf/core: Implement the 'perf_uprobe' PMU

2018-02-06 Thread tip-bot for Song Liu
Commit-ID:  33ea4b24277b06dbc55d7f5772a46f029600255e
Gitweb: https://git.kernel.org/tip/33ea4b24277b06dbc55d7f5772a46f029600255e
Author: Song Liu 
AuthorDate: Wed, 6 Dec 2017 14:45:16 -0800
Committer:  Ingo Molnar 
CommitDate: Tue, 6 Feb 2018 11:29:28 +0100

perf/core: Implement the 'perf_uprobe' PMU

This patch adds perf_uprobe support, following the same pattern as the
previous patch (for kprobe).

Two functions, create_local_trace_uprobe() and
destroy_local_trace_uprobe(), are created so a uprobe can be created
and attached to the file descriptor created by perf_event_open().
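
Because the PMU is registered under the name "uprobe" with type -1, its type
id is not a fixed constant. User space would normally look the id up in sysfs
(/sys/bus/event_source/devices/uprobe/type) before filling attr.type. The
sketch below is a minimal userspace illustration of that lookup;
demo_read_pmu_type() is a hypothetical helper, not part of this patch or of
any library.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read a PMU type id from a sysfs-style file such as
 * /sys/bus/event_source/devices/uprobe/type.
 * Returns the id on success, or -1 if the file is missing or malformed.
 */
int demo_read_pmu_type(const char *path)
{
	FILE *f;
	int type = -1;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	/* The file is expected to hold a single decimal integer. */
	if (fscanf(f, "%d", &type) != 1)
		type = -1;

	fclose(f);
	return type;
}
```

The returned value would then be stored in attr.type of the struct
perf_event_attr passed to perf_event_open(); a negative result suggests the
running kernel does not provide the uprobe PMU.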

Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Yonghong Song 
Reviewed-by: Josef Bacik 
Cc: 
Cc: 
Cc: 
Cc: 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20171206224518.3598254-7-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 include/linux/trace_events.h|  4 ++
 kernel/events/core.c| 48 ++-
 kernel/trace/trace_event_perf.c | 53 +
 kernel/trace/trace_probe.h  |  4 ++
 kernel/trace/trace_uprobe.c | 86 +
 5 files changed, 186 insertions(+), 9 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 21c5d43..0d9d6cb 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -537,6 +537,10 @@ extern void perf_trace_del(struct perf_event *event, int flags);
 extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
 extern void perf_kprobe_destroy(struct perf_event *event);
 #endif
+#ifdef CONFIG_UPROBE_EVENTS
+extern int  perf_uprobe_init(struct perf_event *event, bool is_retprobe);
+extern void perf_uprobe_destroy(struct perf_event *event);
+#endif
 extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
 char *filter_str);
 extern void ftrace_profile_free_filter(struct perf_event *event);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3337355..5a54630 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7992,7 +7992,7 @@ static struct pmu perf_tracepoint = {
.read   = perf_swevent_read,
 };
 
-#ifdef CONFIG_KPROBE_EVENTS
+#if defined(CONFIG_KPROBE_EVENTS) || defined(CONFIG_UPROBE_EVENTS)
 /*
  * Flags in config, used by dynamic PMU kprobe and uprobe
  * The flags should match following PMU_FORMAT_ATTR().
@@ -8020,7 +8020,9 @@ static const struct attribute_group *probe_attr_groups[] = {
&probe_format_group,
NULL,
 };
+#endif
 
+#ifdef CONFIG_KPROBE_EVENTS
 static int perf_kprobe_event_init(struct perf_event *event);
 static struct pmu perf_kprobe = {
.task_ctx_nr= perf_sw_context,
@@ -8057,12 +8059,52 @@ static int perf_kprobe_event_init(struct perf_event *event)
 }
 #endif /* CONFIG_KPROBE_EVENTS */
 
+#ifdef CONFIG_UPROBE_EVENTS
+static int perf_uprobe_event_init(struct perf_event *event);
+static struct pmu perf_uprobe = {
+   .task_ctx_nr= perf_sw_context,
+   .event_init = perf_uprobe_event_init,
+   .add= perf_trace_add,
+   .del= perf_trace_del,
+   .start  = perf_swevent_start,
+   .stop   = perf_swevent_stop,
+   .read   = perf_swevent_read,
+   .attr_groups= probe_attr_groups,
+};
+
+static int perf_uprobe_event_init(struct perf_event *event)
+{
+   int err;
+   bool is_retprobe;
+
+   if (event->attr.type != perf_uprobe.type)
+   return -ENOENT;
+   /*
+* no branch sampling for probe events
+*/
+   if (has_branch_stack(event))
+   return -EOPNOTSUPP;
+
+   is_retprobe = event->attr.config & PERF_PROBE_CONFIG_IS_RETPROBE;
+   err = perf_uprobe_init(event, is_retprobe);
+   if (err)
+   return err;
+
+   event->destroy = perf_uprobe_destroy;
+
+   return 0;
+}
+#endif /* CONFIG_UPROBE_EVENTS */
+
 static inline void perf_tp_register(void)
 {
perf_pmu_register(&perf_tracepoint, "tracepoint", PERF_TYPE_TRACEPOINT);
 #ifdef CONFIG_KPROBE_EVENTS
perf_pmu_register(&perf_kprobe, "kprobe", -1);
 #endif
+#ifdef CONFIG_UPROBE_EVENTS
+   perf_pmu_register(&perf_uprobe, "uprobe", -1);
+#endif
 }
 
 static void perf_event_free_filter(struct perf_event *event)
@@ -8151,6 +8193,10 @@ static inline bool perf_event_is_tracing(struct perf_event *event)
if (event->pmu == &perf_kprobe)
return true;
 #endif
+#ifdef CONFIG_UPROBE_EVENTS
+   if (event->pmu == &perf_uprobe)
+   return true;
+#endif
return false;
 }
 
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 779baad..2c41650 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -286,6 +286,59 @@ void perf_kprobe_destroy(struct perf_event *p_event)
 }
 

[tip:perf/core] perf/core: Implement the 'perf_kprobe' PMU

2018-02-06 Thread tip-bot for Song Liu
Commit-ID:  e12f03d7031a977356e3d7b75a68c2185ff8d155
Gitweb: https://git.kernel.org/tip/e12f03d7031a977356e3d7b75a68c2185ff8d155
Author: Song Liu 
AuthorDate: Wed, 6 Dec 2017 14:45:15 -0800
Committer:  Ingo Molnar 
CommitDate: Tue, 6 Feb 2018 11:29:26 +0100

perf/core: Implement the 'perf_kprobe' PMU

A new PMU type, perf_kprobe is added. Based on attr from perf_event_open(),
perf_kprobe creates a kprobe (or kretprobe) for the perf_event. This
kprobe is private to this perf_event, and thus not added to global
lists, and not available in tracefs.

Two functions, create_local_trace_kprobe() and
destroy_local_trace_kprobe(), are added to create and destroy these
local trace_kprobes.
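
As a rough userspace illustration of how the choice between kprobe and
kretprobe is encoded, the sketch below mirrors the flag test that the patch
performs on attr.config. The enum value is copied from the patch;
demo_is_retprobe() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit as in the patch; PMU_FORMAT_ATTR(retprobe, "config:0") names it. */
enum perf_probe_config {
	PERF_PROBE_CONFIG_IS_RETPROBE = 1U << 0,
};

static_assert(PERF_PROBE_CONFIG_IS_RETPROBE == 1, "config:0 is bit 0");

/* Hypothetical: true if config asks for a kretprobe, false for a kprobe. */
bool demo_is_retprobe(uint64_t config)
{
	return (config & PERF_PROBE_CONFIG_IS_RETPROBE) != 0;
}
```

Only bit 0 is consulted here, so demo_is_retprobe(0) is false (plain kprobe)
and demo_is_retprobe(1) is true (kretprobe); other bits of config are
ignored by this check.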

Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Yonghong Song 
Reviewed-by: Josef Bacik 
Cc: 
Cc: 
Cc: 
Cc: 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20171206224518.3598254-6-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 include/linux/trace_events.h|   4 ++
 kernel/events/core.c| 142 ++--
 kernel/trace/trace_event_perf.c |  49 ++
 kernel/trace/trace_kprobe.c |  91 ++---
 kernel/trace/trace_probe.h  |   7 ++
 5 files changed, 250 insertions(+), 43 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index af44e7c..21c5d43 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -533,6 +533,10 @@ extern int  perf_trace_init(struct perf_event *event);
 extern void perf_trace_destroy(struct perf_event *event);
 extern int  perf_trace_add(struct perf_event *event, int flags);
 extern void perf_trace_del(struct perf_event *event, int flags);
+#ifdef CONFIG_KPROBE_EVENTS
+extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
+extern void perf_kprobe_destroy(struct perf_event *event);
+#endif
 extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
 char *filter_str);
 extern void ftrace_profile_free_filter(struct perf_event *event);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index d99fe3f..3337355 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7992,9 +7992,77 @@ static struct pmu perf_tracepoint = {
.read   = perf_swevent_read,
 };
 
+#ifdef CONFIG_KPROBE_EVENTS
+/*
+ * Flags in config, used by dynamic PMU kprobe and uprobe
+ * The flags should match following PMU_FORMAT_ATTR().
+ *
+ * PERF_PROBE_CONFIG_IS_RETPROBE if set, create kretprobe/uretprobe
+ *   if not set, create kprobe/uprobe
+ */
+enum perf_probe_config {
+   PERF_PROBE_CONFIG_IS_RETPROBE = 1U << 0,  /* [k,u]retprobe */
+};
+
+PMU_FORMAT_ATTR(retprobe, "config:0");
+
+static struct attribute *probe_attrs[] = {
+   &format_attr_retprobe.attr,
+   NULL,
+};
+
+static struct attribute_group probe_format_group = {
+   .name = "format",
+   .attrs = probe_attrs,
+};
+
+static const struct attribute_group *probe_attr_groups[] = {
+   &probe_format_group,
+   NULL,
+};
+
+static int perf_kprobe_event_init(struct perf_event *event);
+static struct pmu perf_kprobe = {
+   .task_ctx_nr= perf_sw_context,
+   .event_init = perf_kprobe_event_init,
+   .add= perf_trace_add,
+   .del= perf_trace_del,
+   .start  = perf_swevent_start,
+   .stop   = perf_swevent_stop,
+   .read   = perf_swevent_read,
+   .attr_groups= probe_attr_groups,
+};
+
+static int perf_kprobe_event_init(struct perf_event *event)
+{
+   int err;
+   bool is_retprobe;
+
+   if (event->attr.type != perf_kprobe.type)
+   return -ENOENT;
+   /*
+* no branch sampling for probe events
+*/
+   if (has_branch_stack(event))
+   return -EOPNOTSUPP;
+
+   is_retprobe = event->attr.config & PERF_PROBE_CONFIG_IS_RETPROBE;
+   err = perf_kprobe_init(event, is_retprobe);
+   if (err)
+   return err;
+
+   event->destroy = perf_kprobe_destroy;
+
+   return 0;
+}
+#endif /* CONFIG_KPROBE_EVENTS */
+
 static inline void perf_tp_register(void)
 {
perf_pmu_register(&perf_tracepoint, "tracepoint", PERF_TYPE_TRACEPOINT);
+#ifdef CONFIG_KPROBE_EVENTS
+   perf_pmu_register(&perf_kprobe, "kprobe", -1);
+#endif
 }
 
 static void perf_event_free_filter(struct perf_event *event)
@@ -8071,13 +8139,28 @@ static void perf_event_free_bpf_handler(struct perf_event *event)
 }
 #endif
 
+/*
+ * returns true if the event is a tracepoint, or a kprobe/uprobe created
+ * with perf_event_open()
+ */
+static inline bool perf_event_is_tracing(struct perf_event *event)
+{
+   if (event->pmu == &perf_tracepoint)
+   return true;
+#ifdef CONFIG_KPROBE_EVENTS
+   if (event->pmu == &perf_kprobe)
+   

[tip:perf/core] perf/headers: Sync new perf_event.h with the tools/include/uapi version

2018-02-06 Thread tip-bot for Song Liu
Commit-ID:  0d8dd67be013727ae57645ecd3ea2c36365d7da8
Gitweb: https://git.kernel.org/tip/0d8dd67be013727ae57645ecd3ea2c36365d7da8
Author: Song Liu 
AuthorDate: Wed, 6 Dec 2017 14:45:14 -0800
Committer:  Ingo Molnar 
CommitDate: Tue, 6 Feb 2018 10:18:05 +0100

perf/headers: Sync new perf_event.h with the tools/include/uapi version

perf_event.h was updated in the previous patch; this patch applies the same
changes to the tools/ version. This part is put in a separate
patch in case the two files are backported separately.

Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Yonghong Song 
Reviewed-by: Josef Bacik 
Acked-by: Alexei Starovoitov 
Cc: 
Cc: 
Cc: 
Cc: 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20171206224518.3598254-5-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 tools/include/uapi/linux/perf_event.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index c77c9a2..5d49cfc 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -380,10 +380,14 @@ struct perf_event_attr {
__u32   bp_type;
union {
__u64   bp_addr;
+   __u64   kprobe_func; /* for perf_kprobe */
+   __u64   uprobe_path; /* for perf_uprobe */
__u64   config1; /* extension of config */
};
union {
__u64   bp_len;
+   __u64   kprobe_addr; /* when kprobe_func == NULL */
+   __u64   probe_offset; /* for perf_[k,u]probe */
__u64   config2; /* extension of config1 */
};
__u64   branch_sample_type; /* enum perf_branch_sample_type */


[tip:perf/core] perf/core: Prepare perf_event.h for new types: 'perf_kprobe' and 'perf_uprobe'

2018-02-06 Thread tip-bot for Song Liu
Commit-ID:  65074d43fc77bcae32776724b7fa2696923c78e4
Gitweb: https://git.kernel.org/tip/65074d43fc77bcae32776724b7fa2696923c78e4
Author: Song Liu 
AuthorDate: Wed, 6 Dec 2017 14:45:13 -0800
Committer:  Ingo Molnar 
CommitDate: Tue, 6 Feb 2018 10:18:04 +0100

perf/core: Prepare perf_event.h for new types: 'perf_kprobe' and 'perf_uprobe'

Two new perf types, perf_kprobe and perf_uprobe, will be added to allow
creating [k,u]probes with perf_event_open(). These [k,u]probes are associated
with the file descriptor created by perf_event_open(), and are thus easy to
clean up when the file descriptor is destroyed.

kprobe_func and uprobe_path are added to the config1 union; they hold a
pointer to the function name (for a kprobe) or to the binary path (for a
uprobe).

kprobe_addr and probe_offset are added to the config2 union; they hold the
kernel address (when kprobe_func is NULL) or the [k,u]probe offset.
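
The aliasing described above can be sketched in standalone C. The struct
below is a hypothetical stand-in (struct demo_probe_attr) that copies only
the two unions from the diff, not the real struct perf_event_attr, and
demo_kprobe_by_name() is an illustrative helper. It assumes a C11 compiler
for anonymous unions and static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the two unions extended by this patch. */
struct demo_probe_attr {
	union {
		uint64_t bp_addr;
		uint64_t kprobe_func;   /* for perf_kprobe */
		uint64_t uprobe_path;   /* for perf_uprobe */
		uint64_t config1;       /* extension of config */
	};
	union {
		uint64_t bp_len;
		uint64_t kprobe_addr;   /* when kprobe_func == NULL */
		uint64_t probe_offset;  /* for perf_[k,u]probe */
		uint64_t config2;       /* extension of config1 */
	};
};

/* Members of the same union share one offset, so the new names alias. */
static_assert(offsetof(struct demo_probe_attr, kprobe_func) ==
	      offsetof(struct demo_probe_attr, config1), "aliases config1");
static_assert(offsetof(struct demo_probe_attr, probe_offset) ==
	      offsetof(struct demo_probe_attr, config2), "aliases config2");

/* Illustrative: describe a kprobe by function name plus offset. */
struct demo_probe_attr demo_kprobe_by_name(const char *func, uint64_t offset)
{
	struct demo_probe_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.kprobe_func = (uint64_t)(uintptr_t)func;  /* pointer to the name */
	attr.probe_offset = offset;                    /* offset into func */
	return attr;
}
```

Since kprobe_addr and probe_offset occupy the same storage, a caller picks
one interpretation: an offset when kprobe_func points to a name, or an
absolute kernel address when kprobe_func is NULL.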

Signed-off-by: Song Liu 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Yonghong Song 
Reviewed-by: Josef Bacik 
Acked-by: Alexei Starovoitov 
Cc: 
Cc: 
Cc: 
Cc: 
Cc: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Linus Torvalds 
Cc: Namhyung Kim 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20171206224518.3598254-4-songliubrav...@fb.com
Signed-off-by: Ingo Molnar 
---
 include/uapi/linux/perf_event.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index c77c9a2..5d49cfc 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -380,10 +380,14 @@ struct perf_event_attr {
__u32   bp_type;
union {
__u64   bp_addr;
+   __u64   kprobe_func; /* for perf_kprobe */
+   __u64   uprobe_path; /* for perf_uprobe */
__u64   config1; /* extension of config */
};
union {
__u64   bp_len;
+   __u64   kprobe_addr; /* when kprobe_func == NULL */
+   __u64   probe_offset; /* for perf_[k,u]probe */
__u64   config2; /* extension of config1 */
};
__u64   branch_sample_type; /* enum perf_branch_sample_type */

