[tip:perf/urgent] perf script: Assume native_arch for pipe mode
Commit-ID:  9d49169c5958e429ffa6874fbef734ae7502ad65
Gitweb:     https://git.kernel.org/tip/9d49169c5958e429ffa6874fbef734ae7502ad65
Author:     Song Liu
AuthorDate: Thu, 20 Jun 2019 18:44:38 -0700
Committer:  Arnaldo Carvalho de Melo
CommitDate: Tue, 9 Jul 2019 10:13:28 -0300

perf script: Assume native_arch for pipe mode

In pipe mode, session->header.env.arch is not populated until the events
are processed. Therefore, the following command crashes:

   perf record -o - | perf script
   (gdb) bt

It fails when we try to compare env.arch against uts.machine:

        if (!strcmp(uts.machine, session->header.env.arch) ||
            (!strcmp(uts.machine, "x86_64") &&
             !strcmp(session->header.env.arch, "i386")))
                native_arch = true;

In pipe mode, it is tricky to find env.arch at this stage. To keep it
simple, let's just assume native_arch is always true for pipe mode.

Reported-by: David Carrillo Cisneros
Signed-off-by: Song Liu
Tested-by: Arnaldo Carvalho de Melo
Cc: Andi Kleen
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: kernel-t...@fb.com
Cc: sta...@vger.kernel.org #v5.1+
Fixes: 3ab481a1cfe1 ("perf script: Support insn output for normal samples")
Link: http://lkml.kernel.org/r/20190621014438.810342-1-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/builtin-script.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index b3536820f9a8..79367087bd18 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3752,7 +3752,8 @@ int cmd_script(int argc, const char **argv)
 		goto out_delete;
 
 	uname();
-	if (!strcmp(uts.machine, session->header.env.arch) ||
+	if (data.is_pipe ||  /* assume pipe_mode indicates native_arch */
+	    !strcmp(uts.machine, session->header.env.arch) ||
 	    (!strcmp(uts.machine, "x86_64") &&
 	     !strcmp(session->header.env.arch, "i386")))
 		native_arch = true;
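
The condition introduced by this patch can be looked at in isolation. What follows is only a minimal sketch, not perf's code: is_native_arch() and its parameters are hypothetical names, and the NULL check on the recorded architecture is an extra assumption of the sketch (the patch itself only short-circuits on pipe mode).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perf): mirrors the condition in
 * cmd_script() after the patch.
 */
bool is_native_arch(bool is_pipe, const char *host_arch,
		    const char *recorded_arch)
{
	assert(host_arch != NULL);

	/* Pipe mode: env.arch is not populated yet, so assume native. */
	if (is_pipe)
		return true;

	/* Assumption of this sketch only: a missing arch is "not native". */
	if (recorded_arch == NULL)
		return false;

	if (!strcmp(host_arch, recorded_arch))
		return true;

	/* A 64-bit x86 host can handle data recorded on i386. */
	return !strcmp(host_arch, "x86_64") && !strcmp(recorded_arch, "i386");
}
```

Checking is_pipe first matters: it keeps strcmp() from ever seeing the unpopulated env.arch pointer, which is where the crash described above came from.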
[tip:perf/core] perf header: Assign proper ff->ph in perf_event__synthesize_features()
Commit-ID:  c952b35f4b15dd1b83e952718dec3307256383ef
Gitweb:     https://git.kernel.org/tip/c952b35f4b15dd1b83e952718dec3307256383ef
Author:     Song Liu
AuthorDate: Wed, 19 Jun 2019 18:04:53 -0700
Committer:  Arnaldo Carvalho de Melo
CommitDate: Sat, 6 Jul 2019 14:29:04 -0300

perf header: Assign proper ff->ph in perf_event__synthesize_features()

bpf/btf write_* functions need ff->ph->env.

With this missing, pipe-mode (perf record -o -) would crash like:

Program terminated with signal SIGSEGV, Segmentation fault.

This patch assigns the proper ph value to ff.

Committer testing:

  (gdb) run record -o -
  Starting program: /root/bin/perf record -o -
  PERFILE2
  Thread 1 "perf" received signal SIGSEGV, Segmentation fault.
  __do_write_buf (size=4, buf=0x160, ff=0x7fff8f80) at util/header.c:126
  126		memcpy(ff->buf + ff->offset, buf, size);
  (gdb) bt
  #0  __do_write_buf (size=4, buf=0x160, ff=0x7fff8f80) at util/header.c:126
  #1  do_write (ff=ff@entry=0x7fff8f80, buf=buf@entry=0x160, size=4) at util/header.c:137
  #2  0x004eddba in write_bpf_prog_info (ff=0x7fff8f80, evlist=) at util/header.c:912
  #3  0x004f69d7 in perf_event__synthesize_features (tool=tool@entry=0x97cc00 , session=session@entry=0x7fffe9c6d010, evlist=0x7fffe9cae010, process=process@entry=0x4435d0 ) at util/header.c:3695
  #4  0x00443c79 in record__synthesize (tail=tail@entry=false, rec=0x97cc00 ) at builtin-record.c:1214
  #5  0x00444ec9 in __cmd_record (rec=0x97cc00 , argv=, argc=0) at builtin-record.c:1435
  #6  cmd_record (argc=0, argv=) at builtin-record.c:2450
  #7  0x004ae3e9 in run_builtin (p=p@entry=0x98e058 , argc=argc@entry=3, argv=0x7fffd670) at perf.c:304
  #8  0x0042eded in handle_internal_command (argv=, argc=) at perf.c:356
  #9  run_argv (argcp=, argv=) at perf.c:400
  #10 main (argc=3, argv=) at perf.c:522
  (gdb)

After the patch the SIGSEGV is gone.

Reported-by: David Carrillo Cisneros
Signed-off-by: Song Liu
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: kernel-t...@fb.com
Cc: sta...@vger.kernel.org # v5.1+
Fixes: 606f972b1361 ("perf bpf: Save bpf_prog_info information as headers to perf.data")
Link: http://lkml.kernel.org/r/20190620010453.4118689-1-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/header.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 847ae51a524b..fb0aa661644b 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3602,6 +3602,7 @@ int perf_event__synthesize_features(struct perf_tool *tool,
 		return -ENOMEM;
 
 	ff.size = sz - sz_hdr;
+	ff.ph = >header;
 
 	for_each_set_bit(feat, header->adds_features, HEADER_FEAT_BITS) {
 		if (!feat_ops[feat].synthesize) {
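
To illustrate why a single missing assignment is enough to crash, here is a reduced model of the situation. It is a sketch under stated assumptions: every type and function name below is hypothetical and much smaller than perf's real struct feat_fd, struct perf_header and write_* callbacks. The NULL guard exists only so the sketch can show the failure as a return value; going by the backtrace above, the real writers dereference the pointer directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for perf's types. */
struct env_sketch {
	int nr_bpf_progs;
};

struct header_sketch {
	struct env_sketch env;
};

struct feat_fd_sketch {
	struct header_sketch *ph;	/* must be set before any writer runs */
	size_t size;
};

/* Models a bpf/btf write_* callback: it reaches through ff->ph->env. */
int write_bpf_prog_count(const struct feat_fd_sketch *ff)
{
	assert(ff != NULL);

	if (ff->ph == NULL)
		return -1;	/* guard added for the sketch only */

	return ff->ph->env.nr_bpf_progs;
}

/* Models the synthesize path: set every field the writers rely on. */
void feat_fd_init(struct feat_fd_sketch *ff, struct header_sketch *ph,
		  size_t size)
{
	assert(ff != NULL);

	ff->size = size;
	ff->ph = ph;	/* the counterpart of the one-line fix */
}
```

In this model, a zero-initialised feat_fd_sketch makes the writer fail, and the same writer succeeds once feat_fd_init() has run.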
[tip:x86/urgent] perf/x86: Always store regs->ip in perf_callchain_kernel()
Commit-ID:  83f44ae0f8afcc9da659799db8693f74847e66b3
Gitweb:     https://git.kernel.org/tip/83f44ae0f8afcc9da659799db8693f74847e66b3
Author:     Song Liu
AuthorDate: Wed, 26 Jun 2019 19:33:52 -0500
Committer:  Thomas Gleixner
CommitDate: Fri, 28 Jun 2019 00:11:20 +0200

perf/x86: Always store regs->ip in perf_callchain_kernel()

The stacktrace_map_raw_tp BPF selftest is failing because the RIP saved by
perf_arch_fetch_caller_regs() isn't getting saved by perf_callchain_kernel().

This was broken by the following commit:

  d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")

With that change, when starting with non-HW regs, the unwinder starts with
the current stack frame and unwinds until it passes up the frame which
called perf_arch_fetch_caller_regs(). So regs->ip needs to be saved
deliberately.

Fixes: d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")
Signed-off-by: Song Liu
Signed-off-by: Josh Poimboeuf
Signed-off-by: Thomas Gleixner
Acked-by: Peter Zijlstra (Intel)
Cc: Kairui Song
Cc: Steven Rostedt
Cc: Borislav Petkov
Link: https://lkml.kernel.org/r/3975a298fa52b506fea32666d8ff6a13467eee6d.1561595111.git.jpoim...@redhat.com
---
 arch/x86/events/core.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index f315425d8468..4fb3ca1e699d 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2402,13 +2402,13 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
 		return;
 	}
 
-	if (perf_hw_regs(regs)) {
-		if (perf_callchain_store(entry, regs->ip))
-			return;
+	if (perf_callchain_store(entry, regs->ip))
+		return;
+
+	if (perf_hw_regs(regs))
 		unwind_start(, current, regs, NULL);
-	} else {
+	else
 		unwind_start(, current, NULL, (void *)regs->sp);
-	}
 
 	for (; !unwind_done(); unwind_next_frame()) {
 		addr = unwind_get_return_address();
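
The ordering the patch establishes (store regs->ip unconditionally, then walk the unwinder) can be modelled outside the kernel. This is a sketch under assumptions, not kernel code: chain_sketch, chain_store() and callchain_kernel_sketch() are hypothetical names, and the unwinder is replaced by a plain array of return addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DEPTH 8

/* Hypothetical stand-in for struct perf_callchain_entry_ctx. */
struct chain_sketch {
	uint64_t ip[SKETCH_MAX_DEPTH];
	size_t nr;
};

/* Returns nonzero when there is no room left, 0 on success. */
int chain_store(struct chain_sketch *c, uint64_t ip)
{
	if (c->nr >= SKETCH_MAX_DEPTH)
		return -1;
	c->ip[c->nr++] = ip;
	return 0;
}

/*
 * regs_ip:          instruction pointer captured in the register snapshot
 * frames/nr_frames: return addresses a real unwinder would yield
 */
void callchain_kernel_sketch(struct chain_sketch *c, uint64_t regs_ip,
			     const uint64_t *frames, size_t nr_frames)
{
	size_t i;

	assert(c != NULL);

	/* Always record the captured IP first, whatever kind of regs we have. */
	if (chain_store(c, regs_ip))
		return;

	for (i = 0; i < nr_frames; i++) {
		if (chain_store(c, frames[i]))
			break;
	}
}
```

With the store hoisted out of the HW-regs branch, the first entry of the chain is the captured IP in both cases, which is what the selftest named in the commit message depends on.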
[tip:perf/core] perf data: Add description of header HEADER_BPF_PROG_INFO and HEADER_BPF_BTF
Commit-ID:  8e21be4f815ca8edfee1decd87f298f92123f719
Gitweb:     https://git.kernel.org/tip/8e21be4f815ca8edfee1decd87f298f92123f719
Author:     Song Liu
AuthorDate: Mon, 20 May 2019 23:44:06 -0700
Committer:  Arnaldo Carvalho de Melo
CommitDate: Wed, 5 Jun 2019 09:47:52 -0300

perf data: Add description of header HEADER_BPF_PROG_INFO and HEADER_BPF_BTF

This patch adds descriptions of HEADER_BPF_PROG_INFO and HEADER_BPF_BTF
to perf.data-file-format.txt.

Requested-by: Arnaldo Carvalho de Melo
Signed-off-by: Song Liu
Cc: Jiri Olsa
Cc: Peter Zijlstra
Fixes: 606f972b1361 ("perf bpf: Save bpf_prog_info information as headers to perf.data")
Link: http://lkml.kernel.org/r/20190521064406.2498925-1-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/Documentation/perf.data-file-format.txt | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
index 6967e9b02be5..022bb8b1c84a 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -272,6 +272,22 @@ struct {
 
 Two uint64_t for the time of first sample and the time of last sample.
 
+	HEADER_BPF_PROG_INFO = 25,
+
+struct bpf_prog_info_linear, which contains detailed information about
+a BPF program, including type, id, tag, jited/xlated instructions, etc.
+
+	HEADER_BPF_BTF = 26,
+
+Contains BPF Type Format (BTF). For more information about BTF, please
+refer to Documentation/bpf/btf.rst.
+
+struct {
+	u32	id;
+	u32	data_size;
+	char	data[];
+};
+
 	HEADER_COMPRESSED = 27,
 
 struct {
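
As an illustration of the HEADER_BPF_BTF record layout documented above (u32 id, u32 data_size, then data_size bytes), here is a small bounds-checked reader for one such record. It is a sketch under assumptions: btf_record_sketch and parse_btf_record() are hypothetical names, the fields are assumed to be in host byte order with no padding, and how records are arranged within the whole feature section is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* View of one record; `data` points into the caller's buffer. */
struct btf_record_sketch {
	uint32_t id;
	uint32_t data_size;
	const unsigned char *data;
};

/*
 * Parse one { u32 id; u32 data_size; char data[]; } record from buf.
 * Returns the number of bytes consumed, or 0 if buf is too short.
 */
size_t parse_btf_record(const unsigned char *buf, size_t len,
			struct btf_record_sketch *out)
{
	const size_t hdr = 2 * sizeof(uint32_t);

	assert(buf != NULL && out != NULL);

	if (len < hdr)
		return 0;

	/* memcpy avoids unaligned loads from an arbitrary byte buffer. */
	memcpy(&out->id, buf, sizeof(uint32_t));
	memcpy(&out->data_size, buf + sizeof(uint32_t), sizeof(uint32_t));

	/* Reject a payload that would run past the end of the buffer. */
	if (out->data_size > len - hdr)
		return 0;

	out->data = buf + hdr;
	return hdr + out->data_size;
}
```

Returning the consumed length lets a caller step through several records back to back, if that is how a given file lays them out.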
[tip:perf/core] perf/core: Allow non-privileged uprobe for user processes
Commit-ID:  9fd2e48b9ae17978b2c2a98c055c774d5d90bce8
Gitweb:     https://git.kernel.org/tip/9fd2e48b9ae17978b2c2a98c055c774d5d90bce8
Author:     Song Liu
AuthorDate: Tue, 7 May 2019 09:15:45 -0700
Committer:  Ingo Molnar
CommitDate: Mon, 3 Jun 2019 11:58:18 +0200

perf/core: Allow non-privileged uprobe for user processes

Currently, a non-privileged user can only use uprobes with

    kernel.perf_event_paranoid = -1

However, setting perf_event_paranoid to -1 leaks other users' processes to
non-privileged uprobes.

To introduce proper permission control of uprobes, we are building the
following system:

  A daemon with CAP_SYS_ADMIN is in charge of creating uprobes via tracefs;
  Users ask the daemon to create uprobes;
  Then a user can attach a uprobe only to processes owned by that user.

This patch allows a non-privileged user to attach a uprobe to processes
owned by the user.

The following example shows how to use a uprobe as a non-privileged user.
This is based on Brendan's blog post [1]

1. Create uprobe with root:

  sudo perf probe -x 'readline%return +0($retval):string'

2. Then non-root user can use the uprobe as:

  perf record -vvv -e probe_bash:readline__return -p sleep 20
  perf script

[1] http://www.brendangregg.com/blog/2015-06-28/linux-ftrace-uprobe.html

Signed-off-by: Song Liu
Signed-off-by: Peter Zijlstra (Intel)
Cc:
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: https://lkml.kernel.org/r/20190507161545.788381-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar
---
 kernel/events/core.c        | 4 ++--
 kernel/trace/trace_uprobe.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index abbd4b3b96c2..3005c80f621d 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8532,9 +8532,9 @@ static int perf_tp_event_match(struct perf_event *event,
 	if (event->hw.state & PERF_HES_STOPPED)
 		return 0;
 	/*
-	 * All tracepoints are from kernel-space.
+	 * If exclude_kernel, only trace user-space tracepoints (uprobes)
 	 */
-	if (event->attr.exclude_kernel)
+	if (event->attr.exclude_kernel && !user_mode(regs))
 		return 0;
 
 	if (!perf_tp_filter_match(event, data))
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index eb7e06b54741..0d60d6856de5 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -1331,7 +1331,7 @@ static inline void init_trace_event_call(struct trace_uprobe *tu,
 	call->event.funcs = _funcs;
 	call->class->define_fields = uprobe_event_define_fields;
 
-	call->flags = TRACE_EVENT_FL_UPROBE;
+	call->flags = TRACE_EVENT_FL_UPROBE | TRACE_EVENT_FL_CAP_ANY;
 	call->class->reg = trace_uprobe_register;
 	call->data = tu;
 }
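
The change to perf_tp_event_match() is small enough to write out as a truth table. The two functions below are a sketch with hypothetical names that only model the exclude_kernel part of the check. The boolean regs_user_mode stands in for the kernel's user_mode(regs).

```c
#include <assert.h>
#include <stdbool.h>

/* Before the patch: exclude_kernel rejected every tracepoint hit. */
bool tp_passes_exclude_kernel_old(bool exclude_kernel)
{
	return !exclude_kernel;
}

/*
 * After the patch: exclude_kernel rejects a hit only when the register
 * snapshot was not taken in user mode, so uprobe hits get through.
 */
bool tp_passes_exclude_kernel(bool exclude_kernel, bool regs_user_mode)
{
	if (exclude_kernel && !regs_user_mode)
		return false;
	return true;
}
```

The only input combination whose result changes is exclude_kernel set with user-mode regs, which is the case of an event restricted to user space hitting a uprobe.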
[tip:perf/urgent] perf tools: Check maps for bpf programs
Commit-ID:  a93e0b2365e81e5a5b61f25e269b5dc73d242cba
Gitweb:     https://git.kernel.org/tip/a93e0b2365e81e5a5b61f25e269b5dc73d242cba
Author:     Song Liu
AuthorDate: Tue, 16 Apr 2019 18:01:22 +0200
Committer:  Arnaldo Carvalho de Melo
CommitDate: Wed, 17 Apr 2019 14:30:11 -0300

perf tools: Check maps for bpf programs

As reported by Jiri Olsa in:

  "[BUG] perf: intel_pt won't display kernel function"
  https://lore.kernel.org/lkml/20190403143738.GB32001@krava

Recent changes to support PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT
broke --kallsyms option. This is because it broke test __map__is_kmodule.

This patch fixes this by adding check for bpf program, so that these maps
are not mistaken as kernel modules.

Signed-off-by: Song Liu
Reported-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Alexei Starovoitov
Cc: Andi Kleen
Cc: Andrii Nakryiko
Cc: Daniel Borkmann
Cc: Martin KaFai Lau
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Yonghong Song
Link: http://lkml.kernel.org/r/20190416160127.30203-8-jo...@kernel.org
Fixes: 76193a94522f ("perf, bpf: Introduce PERF_RECORD_KSYMBOL")
Signed-off-by: Jiri Olsa
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/map.c | 16 ++++++++++++++++
 tools/perf/util/map.h |  4 +++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index e32628cd20a7..28d484ef74ae 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -261,6 +261,22 @@ bool __map__is_extra_kernel_map(const struct map *map)
 	return kmap && kmap->name[0];
 }
 
+bool __map__is_bpf_prog(const struct map *map)
+{
+	const char *name;
+
+	if (map->dso->binary_type == DSO_BINARY_TYPE__BPF_PROG_INFO)
+		return true;
+
+	/*
+	 * If PERF_RECORD_BPF_EVENT is not included, the dso will not have
+	 * type of DSO_BINARY_TYPE__BPF_PROG_INFO. In such cases, we can
+	 * guess the type based on name.
+	 */
+	name = map->dso->short_name;
+	return name && (strstr(name, "bpf_prog_") == name);
+}
+
 bool map__has_symbols(const struct map *map)
 {
 	return dso__has_symbols(map->dso);
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 0e20749f2c55..dc93787c74f0 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -159,10 +159,12 @@ int map__set_kallsyms_ref_reloc_sym(struct map *map, const char *symbol_name,
 
 bool __map__is_kernel(const struct map *map);
 bool __map__is_extra_kernel_map(const struct map *map);
+bool __map__is_bpf_prog(const struct map *map);
 
 static inline bool __map__is_kmodule(const struct map *map)
 {
-	return !__map__is_kernel(map) && !__map__is_extra_kernel_map(map);
+	return !__map__is_kernel(map) && !__map__is_extra_kernel_map(map) &&
+	       !__map__is_bpf_prog(map);
 }
 
 bool map__has_symbols(const struct map *map);
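
The name-based fallback in __map__is_bpf_prog() is a prefix test: strstr(name, "bpf_prog_") == name holds only when the match starts at the first character. The sketch below shows the same test with strncmp() and how it feeds the module classification. The function names are hypothetical, and the binary_type check from the patch is deliberately left out of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when name starts with "bpf_prog_"; NULL is treated as no match. */
bool name_looks_like_bpf_prog(const char *name)
{
	static const char prefix[] = "bpf_prog_";

	return name != NULL &&
	       strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}

/*
 * Reduced model of __map__is_kmodule() after the patch: a map counts as
 * a kernel module only if it is none of the other three kinds.
 */
bool is_kmodule_sketch(bool is_kernel, bool is_extra_kernel_map,
		       const char *short_name)
{
	return !is_kernel && !is_extra_kernel_map &&
	       !name_looks_like_bpf_prog(short_name);
}
```

strncmp() stops after the prefix length, while strstr() scans the whole string before the pointer comparison. For short DSO names the difference is negligible, and both give the same answer.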
[tip:perf/urgent] perf bpf: Extract logic to create program names from perf_event__synthesize_one_bpf_prog()
Commit-ID:  fc462ac75b36daaa61e9bda7fba66ed1b3a500b4
Gitweb:     https://git.kernel.org/tip/fc462ac75b36daaa61e9bda7fba66ed1b3a500b4
Author:     Song Liu
AuthorDate: Tue, 19 Mar 2019 09:54:53 -0700
Committer:  Arnaldo Carvalho de Melo
CommitDate: Thu, 21 Mar 2019 11:27:04 -0300

perf bpf: Extract logic to create program names from perf_event__synthesize_one_bpf_prog()

Extract logic to create program names to synthesize_bpf_prog_name(), so
that it can be reused in header.c:print_bpf_prog_info().

This commit doesn't change the behavior.

Signed-off-by: Song Liu
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stanislav Fomichev
Link: http://lkml.kernel.org/r/20190319165454.1298742-2-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/bpf-event.c | 62 +
 1 file changed, 35 insertions(+), 27 deletions(-)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index 2a8c245ca942..d5b041649f26 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -111,6 +111,38 @@ static int perf_env__fetch_btf(struct perf_env *env,
 	return 0;
 }
 
+static int synthesize_bpf_prog_name(char *buf, int size,
+				    struct bpf_prog_info *info,
+				    struct btf *btf,
+				    u32 sub_id)
+{
+	u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(uintptr_t)(info->prog_tags);
+	void *func_infos = (void *)(uintptr_t)(info->func_info);
+	u32 sub_prog_cnt = info->nr_jited_ksyms;
+	const struct bpf_func_info *finfo;
+	const char *short_name = NULL;
+	const struct btf_type *t;
+	int name_len;
+
+	name_len = snprintf(buf, size, "bpf_prog_");
+	name_len += snprintf_hex(buf + name_len, size - name_len,
+				 prog_tags[sub_id], BPF_TAG_SIZE);
+	if (btf) {
+		finfo = func_infos + sub_id * info->func_info_rec_size;
+		t = btf__type_by_id(btf, finfo->type_id);
+		short_name = btf__name_by_offset(btf, t->name_off);
+	} else if (sub_id == 0 && sub_prog_cnt == 1) {
+		/* no subprog */
+		if (info->name[0])
+			short_name = info->name;
+	} else
+		short_name = "F";
+	if (short_name)
+		name_len += snprintf(buf + name_len, size - name_len,
+				     "_%s", short_name);
+	return name_len;
+}
+
 /*
  * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf
  * program. One PERF_RECORD_BPF_EVENT is generated for the program. And
@@ -135,7 +167,6 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
 	struct bpf_prog_info_node *info_node;
 	struct bpf_prog_info *info;
 	struct btf *btf = NULL;
-	bool has_btf = false;
 	struct perf_env *env;
 	u32 sub_prog_cnt, i;
 	int err = 0;
@@ -189,19 +220,13 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
 			btf = NULL;
 			goto out;
 		}
-		has_btf = true;
 		perf_env__fetch_btf(env, info->btf_id, btf);
 	}
 
 	/* Synthesize PERF_RECORD_KSYMBOL */
 	for (i = 0; i < sub_prog_cnt; i++) {
-		u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(uintptr_t)(info->prog_tags);
-		__u32 *prog_lens = (__u32 *)(uintptr_t)(info->jited_func_lens);
+		__u32 *prog_lens = (__u32 *)(uintptr_t)(info->jited_func_lens);
 		__u64 *prog_addrs = (__u64 *)(uintptr_t)(info->jited_ksyms);
-		void *func_infos = (void *)(uintptr_t)(info->func_info);
-		const struct bpf_func_info *finfo;
-		const char *short_name = NULL;
-		const struct btf_type *t;
 		int name_len;
 
 		*ksymbol_event = (struct ksymbol_event){
@@ -214,26 +239,9 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session,
 			.ksym_type = PERF_RECORD_KSYMBOL_TYPE_BPF,
 			.flags = 0,
 		};
-		name_len = snprintf(ksymbol_event->name, KSYM_NAME_LEN,
-				    "bpf_prog_");
-		name_len += snprintf_hex(ksymbol_event->name + name_len,
-					 KSYM_NAME_LEN - name_len,
-					 prog_tags[i], BPF_TAG_SIZE);
-		if (has_btf) {
-			finfo = func_infos + i * info->func_info_rec_size;
-			t = btf__type_by_id(btf, finfo->type_id);
-			short_name = btf__name_by_offset(btf, t->name_off);
-		} else if (i == 0 && sub_prog_cnt == 1) {
-			/* no subprog */
-			if
[tip:perf/urgent] perf tools: Save bpf_prog_info and BTF of new BPF programs
Commit-ID:  d56354dc49091e33d9ffca732ac913ed2df70537
Gitweb:     https://git.kernel.org/tip/d56354dc49091e33d9ffca732ac913ed2df70537
Author:     Song Liu
AuthorDate: Mon, 11 Mar 2019 22:30:51 -0700
Committer:  Arnaldo Carvalho de Melo
CommitDate: Thu, 21 Mar 2019 11:27:04 -0300

perf tools: Save bpf_prog_info and BTF of new BPF programs

To fully annotate BPF programs with source code mapping, four pieces of
information are needed:

    1) PERF_RECORD_KSYMBOL
    2) PERF_RECORD_BPF_EVENT
    3) bpf_prog_info
    4) btf

This patch handles 3) and 4) for BPF programs loaded after 'perf
record|top'.

For timely processing of this information, a dedicated event is added to
the side band evlist.

When PERF_RECORD_BPF_EVENT is received via the side band event, the
polling thread gathers 3) and 4) via sys_bpf and stores them in perf_env.

This information is saved to perf.data at the end of 'perf record'.

Committer testing:

The 'wakeup_watermark' member in 'struct perf_event_attr' is inside an
unnamed union, so it can't be used in a struct designated initialization
with older gccs; get it out of that, isolating it as
'attr.wakeup_watermark = 1;' to work with all gcc versions.

We also need to add '--no-bpf-event' to the 'perf record'
perf_event_attr tests in 'perf test', as the way that test goes is to
intercept the events being set up and check whether they match the fields
described in the control files. Since it now finds first the side band
event used to catch the PERF_RECORD_BPF_EVENT, they all fail.

With these issues fixed:

Same scenario as for testing BPF programs loaded before 'perf record' or
'perf top' starts, only start the BPF programs after 'perf record|top',
so that their information gets collected by the sideband threads; the rest
works as for the programs loaded before monitoring starts.

Add missing 'inline' to the bpf_event__add_sb_event() when
HAVE_LIBBPF_SUPPORT is not defined, fixing the build in systems without
binutils devel files installed.

Signed-off-by: Song Liu
Reviewed-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stanislav Fomichev
Link: http://lkml.kernel.org/r/20190312053051.2690567-16-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/builtin-record.c                        |   3 +
 tools/perf/builtin-top.c                           |   3 +
 tools/perf/tests/attr/test-record-C0               |   2 +-
 tools/perf/tests/attr/test-record-basic            |   2 +-
 tools/perf/tests/attr/test-record-branch-any       |   2 +-
 .../perf/tests/attr/test-record-branch-filter-any  |   2 +-
 .../tests/attr/test-record-branch-filter-any_call  |   2 +-
 .../tests/attr/test-record-branch-filter-any_ret   |   2 +-
 tools/perf/tests/attr/test-record-branch-filter-hv |   2 +-
 .../tests/attr/test-record-branch-filter-ind_call  |   2 +-
 tools/perf/tests/attr/test-record-branch-filter-k  |   2 +-
 tools/perf/tests/attr/test-record-branch-filter-u  |   2 +-
 tools/perf/tests/attr/test-record-count            |   2 +-
 tools/perf/tests/attr/test-record-data             |   2 +-
 tools/perf/tests/attr/test-record-freq             |   2 +-
 tools/perf/tests/attr/test-record-graph-default    |   2 +-
 tools/perf/tests/attr/test-record-graph-dwarf      |   2 +-
 tools/perf/tests/attr/test-record-graph-fp         |   2 +-
 tools/perf/tests/attr/test-record-group            |   2 +-
 tools/perf/tests/attr/test-record-group-sampling   |   2 +-
 tools/perf/tests/attr/test-record-group1           |   2 +-
 tools/perf/tests/attr/test-record-no-buffering     |   2 +-
 tools/perf/tests/attr/test-record-no-inherit       |   2 +-
 tools/perf/tests/attr/test-record-no-samples       |   2 +-
 tools/perf/tests/attr/test-record-period           |   2 +-
 tools/perf/tests/attr/test-record-raw              |   2 +-
 tools/perf/util/bpf-event.c                        | 100 +
 tools/perf/util/bpf-event.h                        |  15
 28 files changed, 145 insertions(+), 24 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 6f645fd72fed..4e2d953d4bc5 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1238,6 +1238,9 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
 		goto out_child;
 	}
 
+	if (!opts->no_bpf_event)
+		bpf_event__add_sb_event(_evlist, >header.env);
+
 	if (perf_evlist__start_sb_thread(sb_evlist, >opts.target)) {
 		pr_debug("Couldn't start the BPF side band thread:\nBPF programs starting from now on won't be annotatable\n");
 		opts->no_bpf_event = true;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 3ce8a8db6c1d..1999d6533d12 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1637,6 +1637,9 @@ int cmd_top(int argc, const char **argv)
[tip:perf/urgent] perf annotate: Enable annotation of BPF programs
Commit-ID:  6987561c9e86eace45f2dbb0c564964a63f4150a
Gitweb:     https://git.kernel.org/tip/6987561c9e86eace45f2dbb0c564964a63f4150a
Author:     Song Liu
AuthorDate: Mon, 11 Mar 2019 22:30:48 -0700
Committer:  Arnaldo Carvalho de Melo
CommitDate: Wed, 20 Mar 2019 16:43:15 -0300

perf annotate: Enable annotation of BPF programs

In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into
a new function symbol__disassemble_bpf(), where annotation line
information is filled based on the bpf_prog_info and btf data saved in
given perf_env.

symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf
programs.

Committer testing:

After fixing this:

  -	u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
  +	u64 *addrs = (u64 *)(uintptr_t)(info_linear->info.jited_ksyms);

Detected when crossbuilding to a 32-bit arch.

And making all this dependent on HAVE_LIBBFD_SUPPORT and
HAVE_LIBBPF_SUPPORT:

1) Have a BPF program running, one that has BTF info, etc, I used
   the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place
   by 'perf trace'.

  # grep -B1 augmented_raw ~/.perfconfig
  [trace]
  add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
  #
  # perf trace -e *mmsg
  dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2
  NetworkManager/10055 sendmmsg(22, 0x7f8126ad1bb0, 2, MSG_NOSIGNAL) = 2

2) Then do a 'perf record' system wide for a while:

  # perf record -a
  ^C[ perf record: Woken up 68 times to write data ]
  [ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ]
  #

3) Check that we captured BPF and BTF info in the perf.data file:

  # perf report --header-only | grep 'b[pt]f'
  # event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
  # bpf_prog_info of id 13
  # bpf_prog_info of id 14
  # bpf_prog_info of id 15
  # bpf_prog_info of id 16
  # bpf_prog_info of id 17
  # bpf_prog_info of id 18
  # bpf_prog_info of id 21
  # bpf_prog_info of id 22
  # bpf_prog_info of id 41
  # bpf_prog_info of id 42
  # btf info of id 2
  #

4) Check which programs got recorded:

  # perf report | grep bpf_prog | head
     0.16%  exe             bpf_prog_819967866022f1e1_sys_enter  [k] bpf_prog_819967866022f1e1_sys_enter
     0.14%  exe             bpf_prog_c1bd85c092d6e4aa_sys_exit   [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
     0.08%  fuse-overlayfs  bpf_prog_819967866022f1e1_sys_enter  [k] bpf_prog_819967866022f1e1_sys_enter
     0.07%  fuse-overlayfs  bpf_prog_c1bd85c092d6e4aa_sys_exit   [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
     0.01%  clang-4.0       bpf_prog_c1bd85c092d6e4aa_sys_exit   [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
     0.01%  clang-4.0       bpf_prog_819967866022f1e1_sys_enter  [k] bpf_prog_819967866022f1e1_sys_enter
     0.00%  clang           bpf_prog_c1bd85c092d6e4aa_sys_exit   [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
     0.00%  runc            bpf_prog_819967866022f1e1_sys_enter  [k] bpf_prog_819967866022f1e1_sys_enter
     0.00%  clang           bpf_prog_819967866022f1e1_sys_enter  [k] bpf_prog_819967866022f1e1_sys_enter
     0.00%  sh              bpf_prog_c1bd85c092d6e4aa_sys_exit   [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
  #

  This was with the default --sort order for 'perf report', which is:

    --sort comm,dso,symbol

  If we just look for the symbol, for instance:

  # perf report --sort symbol | grep bpf_prog | head
     0.26%  [k] bpf_prog_819967866022f1e1_sys_enter  -  -
     0.24%  [k] bpf_prog_c1bd85c092d6e4aa_sys_exit   -  -
  #

  or the DSO:

  # perf report --sort dso | grep bpf_prog | head
     0.26%  bpf_prog_819967866022f1e1_sys_enter
     0.24%  bpf_prog_c1bd85c092d6e4aa_sys_exit
  #

We'll see the two BPF programs that augmented_raw_syscalls.o puts in
place, one attached to the raw_syscalls:sys_enter and another to the
raw_syscalls:sys_exit tracepoints, as expected.

Now we can finally do, from the command line, annotation for one of
those two symbols, with the original BPF program source code intermixed
with the disassembled JITed code:

  # perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter

  Samples: 950  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 553756947, [percent: local period]
  bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
  Percent      int sys_enter(struct syscall_enter_args *args)
   53.41         push   %rbp

    0.63         mov    %rsp,%rbp
    0.31         sub    $0x170,%rsp
    1.93         sub    $0x28,%rbp
    7.02         mov    %rbx,0x0(%rbp)
    3.20
[tip:perf/urgent] perf bpf: Show more BPF program info in print_bpf_prog_info()
Commit-ID:  f8dfeae009effc0b6dac2741cf8d7cbb91edb982
Gitweb:     https://git.kernel.org/tip/f8dfeae009effc0b6dac2741cf8d7cbb91edb982
Author:     Song Liu
AuthorDate: Tue, 19 Mar 2019 09:54:54 -0700
Committer:  Arnaldo Carvalho de Melo
CommitDate: Thu, 21 Mar 2019 11:27:04 -0300

perf bpf: Show more BPF program info in print_bpf_prog_info()

This patch enables showing bpf program name, address, and size in the
header.

Before the patch:

  perf report --header-only
  ...
  # bpf_prog_info of id 9
  # bpf_prog_info of id 10
  # bpf_prog_info of id 13

After the patch:

  # bpf_prog_info 9: bpf_prog_7be49e3934a125ba addr 0xa0024947 size 229
  # bpf_prog_info 10: bpf_prog_2a142ef67aaad174 addr 0xa007c94d size 229
  # bpf_prog_info 13: bpf_prog_47368425825d7384_task__task_newt addr 0xa0251137 size 369

Committer notes:

Fix the fallback definition when HAVE_LIBBPF_SUPPORT is not defined,
i.e. add the missing 'static inline' and add the __maybe_unused to the
args. Also add stdio.h since we now use FILE * in bpf-event.h.

Signed-off-by: Song Liu
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stanislav Fomichev
Link: http://lkml.kernel.org/r/20190319165454.1298742-3-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/bpf-event.c | 40 ++++++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-event.h | 11 ++++++++++-
 tools/perf/util/header.c    |  5 +++--
 3 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index d5b041649f26..2a4a0da35632 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -438,3 +438,43 @@ int bpf_event__add_sb_event(struct perf_evlist **evlist,
 
 	return perf_evlist__add_sb_event(evlist, , bpf_event__sb_cb, env);
 }
+
+void bpf_event__print_bpf_prog_info(struct bpf_prog_info *info,
+				    struct perf_env *env,
+				    FILE *fp)
+{
+	__u32 *prog_lens = (__u32 *)(uintptr_t)(info->jited_func_lens);
+	__u64 *prog_addrs = (__u64 *)(uintptr_t)(info->jited_ksyms);
+	char name[KSYM_NAME_LEN];
+	struct btf *btf = NULL;
+	u32 sub_prog_cnt, i;
+
+	sub_prog_cnt = info->nr_jited_ksyms;
+	if (sub_prog_cnt != info->nr_prog_tags ||
+	    sub_prog_cnt != info->nr_jited_func_lens)
+		return;
+
+	if (info->btf_id) {
+		struct btf_node *node;
+
+		node = perf_env__find_btf(env, info->btf_id);
+		if (node)
+			btf = btf__new((__u8 *)(node->data),
+				       node->data_size);
+	}
+
+	if (sub_prog_cnt == 1) {
+		synthesize_bpf_prog_name(name, KSYM_NAME_LEN, info, btf, 0);
+		fprintf(fp, "# bpf_prog_info %u: %s addr 0x%llx size %u\n",
+			info->id, name, prog_addrs[0], prog_lens[0]);
+		return;
+	}
+
+	fprintf(fp, "# bpf_prog_info %u:\n", info->id);
+	for (i = 0; i < sub_prog_cnt; i++) {
+		synthesize_bpf_prog_name(name, KSYM_NAME_LEN, info, btf, i);
+
+		fprintf(fp, "# \tsub_prog %u: %s addr 0x%llx size %u\n",
+			i, name, prog_addrs[i], prog_lens[i]);
+	}
+}
diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h
index 8cb1189149ec..04c33b3bfe28 100644
--- a/tools/perf/util/bpf-event.h
+++ b/tools/perf/util/bpf-event.h
@@ -7,6 +7,7 @@
 #include
 #include
 #include "event.h"
+#include <stdio.h>
 
 struct machine;
 union perf_event;
@@ -38,7 +39,9 @@ int perf_event__synthesize_bpf_events(struct perf_session *session,
 				      struct record_opts *opts);
 int bpf_event__add_sb_event(struct perf_evlist **evlist,
 				 struct perf_env *env);
-
+void bpf_event__print_bpf_prog_info(struct bpf_prog_info *info,
+				    struct perf_env *env,
+				    FILE *fp);
 #else
 static inline int machine__process_bpf_event(struct machine *machine __maybe_unused,
 					     union perf_event *event __maybe_unused,
@@ -61,5 +64,11 @@ static inline int bpf_event__add_sb_event(struct perf_evlist **evlist __maybe_un
 	return 0;
 }
 
+static inline void bpf_event__print_bpf_prog_info(struct bpf_prog_info *info __maybe_unused,
+						  struct perf_env *env __maybe_unused,
+						  FILE *fp __maybe_unused)
+{
+
+}
 #endif // HAVE_LIBBPF_SUPPORT
 #endif
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 01dda2f65d36..b9e693825873 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1468,8 +1468,9 @@ static void print_bpf_prog_info(struct feat_fd *ff, FILE *fp)
[tip:perf/urgent] perf evlist: Introduce side band thread
Commit-ID: 657ee5531903339b06697581532ed32d4762526e Gitweb: https://git.kernel.org/tip/657ee5531903339b06697581532ed32d4762526e Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:50 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Thu, 21 Mar 2019 11:27:03 -0300 perf evlist: Introduce side band thread This patch introduces side band thread that captures extended information for events like PERF_RECORD_BPF_EVENT. This new thread uses its own evlist that uses ring buffer with very low watermark for lower latency. To use side band thread, we need to: 1. add side band event(s) by calling perf_evlist__add_sb_event(); 2. calls perf_evlist__start_sb_thread(); 3. at the end of perf run, perf_evlist__stop_sb_thread(). In the next patch, we use this thread to handle PERF_RECORD_BPF_EVENT. Committer notes: Add fix by Jiri Olsa for when te sb_tread can't get started and then at the end the stop_sb_thread() segfaults when joining the (non-existing) thread. That can happen when running 'perf top' or 'perf record' as a normal user, for instance. Further checks need to be done on top of this to more graciously handle these possible failure scenarios. 
Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Tested-by: Arnaldo Carvalho de Melo Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Link: http://lkml.kernel.org/r/20190312053051.2690567-15-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-record.c | 9 tools/perf/builtin-top.c| 9 tools/perf/util/evlist.c| 119 tools/perf/util/evlist.h| 12 + tools/perf/util/evsel.h | 6 +++ 5 files changed, 155 insertions(+) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index e79faccd7842..6f645fd72fed 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -1137,6 +1137,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv) struct perf_data *data = >data; struct perf_session *session; bool disabled = false, draining = false; + struct perf_evlist *sb_evlist = NULL; int fd; atexit(record__sig_exit); @@ -1237,6 +1238,11 @@ static int __cmd_record(struct record *rec, int argc, const char **argv) goto out_child; } + if (perf_evlist__start_sb_thread(sb_evlist, >opts.target)) { + pr_debug("Couldn't start the BPF side band thread:\nBPF programs starting from now on won't be annotatable\n"); + opts->no_bpf_event = true; + } + err = record__synthesize(rec, false); if (err < 0) goto out_child; @@ -1487,6 +1493,9 @@ out_child: out_delete_session: perf_session__delete(session); + + if (!opts->no_bpf_event) + perf_evlist__stop_sb_thread(sb_evlist); return status; } diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index c2ea22c4ea67..3ce8a8db6c1d 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -1501,6 +1501,7 @@ int cmd_top(int argc, const char **argv) "number of thread to run event synthesize"), OPT_END() }; + struct perf_evlist *sb_evlist = NULL; const char * const top_usage[] = { "perf top []", NULL @@ -1636,8 +1637,16 @@ int cmd_top(int argc, const char **argv) goto out_delete_evlist; } + if 
(perf_evlist__start_sb_thread(sb_evlist, target)) { + pr_debug("Couldn't start the BPF side band thread:\nBPF programs starting from now on won't be annotatable\n"); + opts->no_bpf_event = true; + } + status = __cmd_top(); + if (!opts->no_bpf_event) + perf_evlist__stop_sb_thread(sb_evlist); + out_delete_evlist: perf_evlist__delete(top.evlist); perf_session__delete(top.session); diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index ed20f4379956..ec78e93085de 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -19,6 +19,7 @@ #include "debug.h" #include "units.h" #include "asm/bug.h" +#include "bpf-event.h" #include #include @@ -1856,3 +1857,121 @@ struct perf_evsel *perf_evlist__reset_weak_group(struct perf_evlist *evsel_list, } return leader; } + +int perf_evlist__add_sb_event(struct perf_evlist **evlist, + struct perf_event_attr *attr, + perf_evsel__sb_cb_t cb, + void *data) +{ + struct perf_evsel *evsel; + bool new_evlist = (*evlist) == NULL; + + if (*evlist == NULL) + *evlist = perf_evlist__new(); + if (*evlist == NULL) + return -1; + + if (!attr->sample_id_all) { + pr_warning("enabling
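The start/stop contract from the "Introduce side band thread" message, including the committer's note about not joining a thread that never started, can be sketched in plain pthreads. This is only an illustration under stated assumptions: `struct sb_thread`, `sb_thread_start()` and `sb_thread_stop()` are hypothetical names, not perf's API, and the loop body merely counts iterations where the real thread would drain its ring buffer.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the side band state; not perf's struct perf_evlist. */
struct sb_thread {
	pthread_t th;
	bool started;		/* true only after pthread_create() succeeded */
	atomic_bool done;	/* set by the owner to ask the thread to exit */
	unsigned long polls;	/* stands in for "events consumed" */
};

/* Thread body: keep polling until the owner sets 'done'. */
static void *sb_thread_main(void *arg)
{
	struct sb_thread *sb = arg;

	while (!atomic_load(&sb->done)) {
		sb->polls++;	/* a real thread would drain its ring buffer here */
		sched_yield();
	}
	return NULL;
}

/* Start the thread; returns 0 on success, -1 on failure. */
static int sb_thread_start(struct sb_thread *sb)
{
	assert(sb != NULL);
	atomic_store(&sb->done, false);
	sb->polls = 0;
	sb->started = pthread_create(&sb->th, NULL, sb_thread_main, sb) == 0;
	return sb->started ? 0 : -1;
}

/* Stop the thread, but never join one that was not started. */
static void sb_thread_stop(struct sb_thread *sb)
{
	if (sb == NULL || !sb->started)
		return;
	atomic_store(&sb->done, true);
	pthread_join(sb->th, NULL);
	sb->started = false;
}
```

A caller would pair `sb_thread_start()` with `sb_thread_stop()` around its main loop. The `started` flag plays the role that `opts->no_bpf_event` plays in the hunks above: it records that the start failed so the stop path skips the join.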
[tip:perf/urgent] perf top: Add option --no-bpf-event
Commit-ID: ee7a112fbcc8edb4cf2f84ce5fcc2da7818fd4b8
Gitweb: https://git.kernel.org/tip/ee7a112fbcc8edb4cf2f84ce5fcc2da7818fd4b8
Author: Song Liu
AuthorDate: Mon, 11 Mar 2019 22:30:46 -0700
Committer: Arnaldo Carvalho de Melo
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf top: Add option --no-bpf-event

This patch adds option --no-bpf-event to 'perf top', which is the same as the option of 'perf record'.

The following patches will use this option.

Committer testing:

  # perf top -vv 2> /tmp/perf_event_attr.out
  # cat /tmp/perf_event_attr.out

  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1

  #

After this patch:

  # perf top --no-bpf-event -vv 2> /tmp/perf_event_attr.out
  # cat /tmp/perf_event_attr.out

  perf_event_attr:
    size                             112
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    task                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1

  #

Signed-off-by: Song Liu
Tested-by: Arnaldo Carvalho de Melo
Reviewed-by: Jiri Olsa
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stanislav Fomichev
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-11-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/builtin-top.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 77e6190211d2..c2ea22c4ea67 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1469,6 +1469,7 @@ int cmd_top(int argc, const char **argv)
 		    "Display raw encoding of assembly instructions (default)"),
 	OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
 		    "Enable kernel symbol demangling"),
+	OPT_BOOLEAN(0, "no-bpf-event", &top.record_opts.no_bpf_event, "do not record bpf events"),
 	OPT_STRING(0, "objdump", &top.annotation_opts.objdump_path, "path",
 		    "objdump binary to use for disassembly and annotations"),
 	OPT_STRING('M', "disassembler-style", &top.annotation_opts.disassembler_style, "disassembler style",
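The effect shown in the committer testing (`ksymbol` stays set, `bpf_event` disappears) can be modelled with a small sketch. It assumes a hand-rolled argument scan instead of perf's `OPT_BOOLEAN` table, and `struct opts_sketch`, `struct attr_sketch`, `parse_opts()` and `setup_attr()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-ins for perf's record options and
 * perf_event_attr; only the two bits discussed above are modelled. */
struct opts_sketch {
	bool no_bpf_event;
};

struct attr_sketch {
	unsigned int ksymbol : 1;
	unsigned int bpf_event : 1;
};

/* Scan argv for --no-bpf-event; unknown arguments are ignored here. */
static void parse_opts(int argc, const char **argv, struct opts_sketch *opts)
{
	assert(opts != NULL);
	opts->no_bpf_event = false;
	for (int i = 1; i < argc; i++) {
		if (strcmp(argv[i], "--no-bpf-event") == 0)
			opts->no_bpf_event = true;
	}
}

/* Derive the attr bits: ksymbol stays on, bpf_event follows the option. */
static struct attr_sketch setup_attr(const struct opts_sketch *opts)
{
	struct attr_sketch attr = { .ksymbol = 1, .bpf_event = !opts->no_bpf_event };

	return attr;
}
```

Keeping the option as a single boolean in the options structure means later code, such as the side band thread setup, can consult the same flag without re-parsing the command line.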
[tip:perf/urgent] perf build: Check what binutils's 'disassembler()' signature to use
Commit-ID: 8a1b1718214cfd945fef14b3031e4e7262882a86 Gitweb: https://git.kernel.org/tip/8a1b1718214cfd945fef14b3031e4e7262882a86 Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:48 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Wed, 20 Mar 2019 16:42:10 -0300 perf build: Check what binutils's 'disassembler()' signature to use Commit 003ca0fd2286 ("Refactor disassembler selection") in the binutils repo, which changed the disassembler() function signature, so we must use the feature test introduced in fb982666e380 ("tools/bpftool: fix bpftool build with bintutils >= 2.9") to deal with that. Committer testing: After adding the missing function call to test-all.c, and: FEATURE_CHECK_LDFLAGS-disassembler-four-args = -bfd -lopcodes And the fallbacks for cases where we need -liberty and sometimes -lz to tools/perf/Makefile.config, we get: $ make -C tools/perf O=/tmp/build/perf install-bin make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j8' parallel build Auto-detecting system features: ... dwarf: [ on ] ...dwarf_getlocations: [ on ] ... glibc: [ on ] ... gtk2: [ on ] ... libaudit: [ on ] ...libbfd: [ on ] ...libelf: [ on ] ... libnuma: [ on ] ...numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libslang: [ on ] ... libcrypto: [ on ] ... libunwind: [ on ] ...libdw-dwarf-unwind: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ...libaio: [ on ] ...disassembler-four-args: [ on ] CC /tmp/build/perf/jvmti/libjvmti.o CC /tmp/build/perf/builtin-bench.o $ $ The feature detection test-all.bin gets successfully built and linked: $ ls -la /tmp/build/perf/feature/test-all.bin -rwxrwxr-x. 1 acme acme 2680352 Mar 19 11:07 /tmp/build/perf/feature/test-all.bin $ nm /tmp/build/perf/feature/test-all.bin | grep -w disassembler 00061f90 T disassembler $ Time to move on to the patches that make use of this disassembler() routine in binutils's libopcodes. 
Signed-off-by: Song Liu Tested-by: Arnaldo Carvalho de Melo Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Jakub Kicinski Cc: Jiri Olsa Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Roman Gushchin Cc: Stanislav Fomichev Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubrav...@fb.com [ split from a larger patch, added missing FEATURE_CHECK_LDFLAGS-disassembler-four-args ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/build/Makefile.feature | 6 -- tools/build/feature/test-all.c | 5 + tools/perf/Makefile.config | 9 + 3 files changed, 18 insertions(+), 2 deletions(-) diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index 61e46d54a67c..8d3864b061f3 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -66,7 +66,8 @@ FEATURE_TESTS_BASIC := \ sched_getcpu \ sdt\ setns \ -libaio +libaio \ +disassembler-four-args # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list # of all feature tests @@ -118,7 +119,8 @@ FEATURE_DISPLAY ?= \ lzma \ get_cpuid \ bpf \ - libaio + libaio\ + disassembler-four-args # Set FEATURE_CHECK_(C|LD)FLAGS-all for all FEATURE_TESTS features. 
# If in the future we need per-feature checks/flags for features not diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c index e903b86b742f..7853e6d91090 100644 --- a/tools/build/feature/test-all.c +++ b/tools/build/feature/test-all.c @@ -178,6 +178,10 @@ # include "test-reallocarray.c" #undef main +#define main main_test_disassembler_four_args +# include "test-disassembler-four-args.c" +#undef main + int main(int argc, char *argv[]) { main_test_libpython(); @@ -219,6 +223,7 @@ int main(int argc, char *argv[]) main_test_setns(); main_test_libaio(); main_test_reallocarray(); + main_test_disassembler_four_args(); return 0; } diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index df4ad45599ca..fe3f97e342fa 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -227,6 +227,8 @@ FEATURE_CHECK_LDFLAGS-libpython-version := $(PYTHON_EMBED_LDOPTS)
[tip:perf/urgent] perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotation
Commit-ID: 3ca3877a9732b68cf0289367a859f6c163a79bfa
Gitweb: https://git.kernel.org/tip/3ca3877a9732b68cf0289367a859f6c163a79bfa
Author: Song Liu
AuthorDate: Mon, 11 Mar 2019 22:30:49 -0700
Committer: Arnaldo Carvalho de Melo
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf bpf: Process PERF_BPF_EVENT_PROG_LOAD for annotation

This patch adds processing of PERF_BPF_EVENT_PROG_LOAD, which sets proper DSO type/id/etc of memory regions mapped to BPF programs to DSO_BINARY_TYPE__BPF_PROG_INFO.

Signed-off-by: Song Liu
Reviewed-by: Jiri Olsa
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stanislav Fomichev
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-14-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/bpf-event.c | 54 +
 1 file changed, 54 insertions(+)

diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c
index a4fc52b4ffae..852e960692cb 100644
--- a/tools/perf/util/bpf-event.c
+++ b/tools/perf/util/bpf-event.c
@@ -12,6 +12,7 @@
 #include "machine.h"
 #include "env.h"
 #include "session.h"
+#include "map.h"
 
 #define ptr_to_u64(ptr)    ((__u64)(unsigned long)(ptr))
 
@@ -25,12 +26,65 @@ static int snprintf_hex(char *buf, size_t size, unsigned char *data, size_t len)
 	return ret;
 }
 
+static int machine__process_bpf_event_load(struct machine *machine,
+					   union perf_event *event,
+					   struct perf_sample *sample __maybe_unused)
+{
+	struct bpf_prog_info_linear *info_linear;
+	struct bpf_prog_info_node *info_node;
+	struct perf_env *env = machine->env;
+	int id = event->bpf_event.id;
+	unsigned int i;
+
+	/* perf-record, no need to handle bpf-event */
+	if (env == NULL)
+		return 0;
+
+	info_node = perf_env__find_bpf_prog_info(env, id);
+	if (!info_node)
+		return 0;
+	info_linear = info_node->info_linear;
+
+	for (i = 0; i < info_linear->info.nr_jited_ksyms; i++) {
+		u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
+		u64 addr = addrs[i];
+		struct map *map;
+
+		map = map_groups__find(&machine->kmaps, addr);
+
+		if (map) {
+			map->dso->binary_type = DSO_BINARY_TYPE__BPF_PROG_INFO;
+			map->dso->bpf_prog.id = id;
+			map->dso->bpf_prog.sub_id = i;
+			map->dso->bpf_prog.env = env;
+		}
+	}
+	return 0;
+}
+
 int machine__process_bpf_event(struct machine *machine __maybe_unused,
 			       union perf_event *event,
 			       struct perf_sample *sample __maybe_unused)
 {
 	if (dump_trace)
 		perf_event__fprintf_bpf_event(event, stdout);
+
+	switch (event->bpf_event.type) {
+	case PERF_BPF_EVENT_PROG_LOAD:
+		return machine__process_bpf_event_load(machine, event, sample);
+
+	case PERF_BPF_EVENT_PROG_UNLOAD:
+		/*
+		 * Do not free bpf_prog_info and btf of the program here,
+		 * as annotation still need them. They will be freed at
+		 * the end of the session.
+		 */
+		break;
+	default:
+		pr_debug("unexpected bpf_event type of %d\n",
+			 event->bpf_event.type);
+		break;
+	}
 	return 0;
 }
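The tagging loop in the PROG_LOAD handler can be tried outside perf with a reduced model. The sketch below assumes a flat array of address ranges in place of perf's map groups. `struct map_sketch`, `struct dso_sketch`, `find_map()` and `tag_bpf_prog()` are hypothetical names, not perf identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature types; perf's struct map and struct dso carry far more. */
enum binary_type_sketch { TYPE_NOT_FOUND, TYPE_BPF_PROG_INFO };

struct dso_sketch {
	enum binary_type_sketch binary_type;
	uint32_t prog_id;
	uint32_t sub_id;
};

struct map_sketch {
	uint64_t start, end;	/* half-open range [start, end) */
	struct dso_sketch dso;
};

/* Linear lookup standing in for map_groups__find(). */
static struct map_sketch *find_map(struct map_sketch *maps, size_t nr, uint64_t addr)
{
	for (size_t i = 0; i < nr; i++)
		if (addr >= maps[i].start && addr < maps[i].end)
			return &maps[i];
	return NULL;
}

/* Tag each map that holds a jited sub-program; returns how many were tagged. */
static unsigned int tag_bpf_prog(struct map_sketch *maps, size_t nr_maps,
				 const uint64_t *ksyms, uint32_t nr_ksyms, uint32_t id)
{
	unsigned int tagged = 0;

	assert(maps != NULL && ksyms != NULL);
	for (uint32_t i = 0; i < nr_ksyms; i++) {
		struct map_sketch *map = find_map(maps, nr_maps, ksyms[i]);

		if (!map)
			continue;	/* address not mapped: skip, as the patch does */
		map->dso.binary_type = TYPE_BPF_PROG_INFO;
		map->dso.prog_id = id;
		map->dso.sub_id = i;
		tagged++;
	}
	return tagged;
}
```

The sub-program index is stored next to the program id because one BPF program may be jited into several kernel symbols, and annotation later needs to know which of them a given map covers.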
[tip:perf/urgent] perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO
Commit-ID: 9b86d04d53b98399017fea44e9047165ffe12d42
Gitweb: https://git.kernel.org/tip/9b86d04d53b98399017fea44e9047165ffe12d42
Author: Song Liu
AuthorDate: Mon, 11 Mar 2019 22:30:48 -0700
Committer: Arnaldo Carvalho de Melo
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf symbols: Introduce DSO_BINARY_TYPE__BPF_PROG_INFO

Introduce a new dso type DSO_BINARY_TYPE__BPF_PROG_INFO for BPF programs. In symbol__disassemble(), a DSO_BINARY_TYPE__BPF_PROG_INFO dso will call into a new function symbol__disassemble_bpf() in an upcoming patch, where annotation line information is filled based on the bpf_prog_info and btf saved in the given perf_env.

Committer notes:

Removed the unnamed union with 'bpf_prog' and 'cache' in 'struct dso', to fix this bug when exiting 'perf top':

  # perf top
  perf: Segmentation fault
  backtrace
  perf[0x5a785a]
  /lib64/libc.so.6(+0x385bf)[0x7fd68443c5bf]
  perf(rb_first+0x2b)[0x4d6eeb]
  perf(dso__delete+0xb7)[0x4dffb7]
  perf[0x4f9e37]
  perf(perf_session__delete+0x64)[0x504df4]
  perf(cmd_top+0x1957)[0x454467]
  perf[0x4aad18]
  perf(main+0x61c)[0x42ec7c]
  /lib64/libc.so.6(__libc_start_main+0xf2)[0x7fd684428412]
  perf(_start+0x2d)[0x42eead]
  #
  # addr2line -fe ~/bin/perf 0x4dffb7
  dso_cache__free
  /home/acme/git/perf/tools/perf/util/dso.c:713

That is trying to access the dso->data.cache, and that is not used with BPF programs, so we end up accessing what is in bpf_prog.first_member, b00m.
Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Cc: kernel-t...@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubrav...@fb.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/dso.c| 1 + tools/perf/util/dso.h| 8 tools/perf/util/symbol.c | 1 + 3 files changed, 10 insertions(+) diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c index ab8a455d2283..e059976d9d93 100644 --- a/tools/perf/util/dso.c +++ b/tools/perf/util/dso.c @@ -184,6 +184,7 @@ int dso__read_binary_type_filename(const struct dso *dso, case DSO_BINARY_TYPE__KALLSYMS: case DSO_BINARY_TYPE__GUEST_KALLSYMS: case DSO_BINARY_TYPE__JAVA_JIT: + case DSO_BINARY_TYPE__BPF_PROG_INFO: case DSO_BINARY_TYPE__NOT_FOUND: ret = -1; break; diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h index bb417c54c25a..6e3f63781e51 100644 --- a/tools/perf/util/dso.h +++ b/tools/perf/util/dso.h @@ -14,6 +14,7 @@ struct machine; struct map; +struct perf_env; enum dso_binary_type { DSO_BINARY_TYPE__KALLSYMS = 0, @@ -35,6 +36,7 @@ enum dso_binary_type { DSO_BINARY_TYPE__KCORE, DSO_BINARY_TYPE__GUEST_KCORE, DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO, + DSO_BINARY_TYPE__BPF_PROG_INFO, DSO_BINARY_TYPE__NOT_FOUND, }; @@ -189,6 +191,12 @@ struct dso { u64 debug_frame_offset; u64 eh_frame_hdr_offset; } data; + /* bpf prog information */ + struct { + u32 id; + u32 sub_id; + struct perf_env *env; + } bpf_prog; union { /* Tool specific area */ void *priv; diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c index 58442ca5e3c4..5cbad55cd99d 100644 --- a/tools/perf/util/symbol.c +++ b/tools/perf/util/symbol.c @@ -1455,6 +1455,7 @@ static bool dso__is_compatible_symtab_type(struct dso *dso, bool kmod, case DSO_BINARY_TYPE__BUILD_ID_CACHE_DEBUGINFO: return true; + case DSO_BINARY_TYPE__BPF_PROG_INFO: case DSO_BINARY_TYPE__NOT_FOUND: default: return false;
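The crash described in the committer notes comes down to two fields sharing storage. A minimal sketch of that aliasing, assuming simplified layouts (`struct dso_union` and `struct dso_split` are hypothetical and only echo the field names in the patch), shows why cleanup code that walks `cache` sees garbage once the BPF fields have been written through a union, and why separate members avoid it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts; field names only echo the ones in the patch. */
struct bpf_prog_sketch {
	uint32_t id;
	uint32_t sub_id;
};

/* Layout A: cache pointer and BPF info share storage through a union. */
struct dso_union {
	union {
		void *cache;
		struct bpf_prog_sketch bpf_prog;
	};
};

/* Layout B: the two live side by side, as in the committed version. */
struct dso_split {
	void *cache;
	struct bpf_prog_sketch bpf_prog;
};

/* Returns 1 if cleanup code would see a non-NULL cache after tagging. */
static int union_cache_clobbered(uint32_t id)
{
	struct dso_union d;

	assert(id != 0);		/* id 0 would leave the bytes zeroed */
	memset(&d, 0, sizeof(d));
	d.bpf_prog.id = id;		/* overwrites the bytes of d.cache */
	return d.cache != NULL;
}

/* Same check for the split layout: cache keeps its own storage. */
static int split_cache_clobbered(uint32_t id)
{
	struct dso_split d;

	assert(id != 0);
	memset(&d, 0, sizeof(d));
	d.bpf_prog.id = id;
	return d.cache != NULL;
}
```

The split layout costs a few extra bytes per dso, which the committed patch accepts in exchange for not having to teach every cleanup path which union member is live.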
[tip:perf/urgent] perf bpf: Save BTF in a rbtree in perf_env
Commit-ID: 3792cb2ff43b1b193136a03ce1336462a827d792 Gitweb: https://git.kernel.org/tip/3792cb2ff43b1b193136a03ce1336462a827d792 Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:44 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 19 Mar 2019 16:52:07 -0300 perf bpf: Save BTF in a rbtree in perf_env BTF contains information necessary to annotate BPF programs. This patch saves BTF for BPF programs loaded in the system. Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Cc: kernel-t...@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-9-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/bpf-event.c | 23 tools/perf/util/bpf-event.h | 7 + tools/perf/util/env.c | 67 + tools/perf/util/env.h | 5 4 files changed, 102 insertions(+) diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c index 37ee4e2a728a..a4fc52b4ffae 100644 --- a/tools/perf/util/bpf-event.c +++ b/tools/perf/util/bpf-event.c @@ -34,6 +34,28 @@ int machine__process_bpf_event(struct machine *machine __maybe_unused, return 0; } +static int perf_env__fetch_btf(struct perf_env *env, + u32 btf_id, + struct btf *btf) +{ + struct btf_node *node; + u32 data_size; + const void *data; + + data = btf__get_raw_data(btf, _size); + + node = malloc(data_size + sizeof(struct btf_node)); + if (!node) + return -1; + + node->id = btf_id; + node->data_size = data_size; + memcpy(node->data, data, data_size); + + perf_env__insert_btf(env, node); + return 0; +} + /* * Synthesize PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for one bpf * program. One PERF_RECORD_BPF_EVENT is generated for the program. 
And @@ -113,6 +135,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session, goto out; } has_btf = true; + perf_env__fetch_btf(env, info->btf_id, btf); } /* Synthesize PERF_RECORD_KSYMBOL */ diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h index fad932f7404f..b9ec394dc7c7 100644 --- a/tools/perf/util/bpf-event.h +++ b/tools/perf/util/bpf-event.h @@ -16,6 +16,13 @@ struct bpf_prog_info_node { struct rb_node rb_node; }; +struct btf_node { + struct rb_node rb_node; + u32 id; + u32 data_size; + chardata[]; +}; + #ifdef HAVE_LIBBPF_SUPPORT int machine__process_bpf_event(struct machine *machine, union perf_event *event, struct perf_sample *sample); diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c index 98cd36f0e317..c6351b557bb0 100644 --- a/tools/perf/util/env.c +++ b/tools/perf/util/env.c @@ -64,6 +64,58 @@ struct bpf_prog_info_node *perf_env__find_bpf_prog_info(struct perf_env *env, return node; } +void perf_env__insert_btf(struct perf_env *env, struct btf_node *btf_node) +{ + struct rb_node *parent = NULL; + __u32 btf_id = btf_node->id; + struct btf_node *node; + struct rb_node **p; + + down_write(>bpf_progs.lock); + p = >bpf_progs.btfs.rb_node; + + while (*p != NULL) { + parent = *p; + node = rb_entry(parent, struct btf_node, rb_node); + if (btf_id < node->id) { + p = &(*p)->rb_left; + } else if (btf_id > node->id) { + p = &(*p)->rb_right; + } else { + pr_debug("duplicated btf %u\n", btf_id); + goto out; + } + } + + rb_link_node(_node->rb_node, parent, p); + rb_insert_color(_node->rb_node, >bpf_progs.btfs); + env->bpf_progs.btfs_cnt++; +out: + up_write(>bpf_progs.lock); +} + +struct btf_node *perf_env__find_btf(struct perf_env *env, __u32 btf_id) +{ + struct btf_node *node = NULL; + struct rb_node *n; + + down_read(>bpf_progs.lock); + n = env->bpf_progs.btfs.rb_node; + + while (n) { + node = rb_entry(n, struct btf_node, rb_node); + if (btf_id < node->id) + n = n->rb_left; + else if (btf_id > node->id) + n = 
n->rb_right; + else + break; + } + + up_read(>bpf_progs.lock); + return node; +} + /* purge data in bpf_progs.infos tree */ static void perf_env__purge_bpf(struct perf_env *env) { @@ -86,6 +138,20 @@ static void perf_env__purge_bpf(struct perf_env *env) env->bpf_progs.infos_cnt = 0; + root = >bpf_progs.btfs; + next = rb_first(root); + +
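The insert and lookup by BTF id can be sketched without the kernel rbtree helpers. The version below swaps in a plain unbalanced binary search tree for `rb_link_node()`/`rb_insert_color()` and omits the locking, so it illustrates the keyed walk only. `struct btf_node_sketch`, `btf_node_new()`, `btf_insert()` and `btf_find()` are hypothetical names. Note that the lookup loop as quoted above appears to hand back the last node visited when the id is absent, whereas this sketch returns NULL on a miss.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: same idea as struct btf_node, minus the rb_node. */
struct btf_node_sketch {
	struct btf_node_sketch *left, *right;
	uint32_t id;
	uint32_t data_size;
	char data[];		/* raw BTF bytes follow the header */
};

/* Allocate header and payload in one block, as the patch does with malloc(). */
static struct btf_node_sketch *btf_node_new(uint32_t id, const void *data, uint32_t size)
{
	struct btf_node_sketch *node = malloc(sizeof(*node) + size);

	if (!node)
		return NULL;
	node->left = node->right = NULL;
	node->id = id;
	node->data_size = size;
	memcpy(node->data, data, size);
	return node;
}

/* Insert by id; returns 0 on success, -1 if the id is already present. */
static int btf_insert(struct btf_node_sketch **root, struct btf_node_sketch *node)
{
	struct btf_node_sketch **p = root;

	assert(root != NULL && node != NULL);
	while (*p) {
		if (node->id < (*p)->id)
			p = &(*p)->left;
		else if (node->id > (*p)->id)
			p = &(*p)->right;
		else
			return -1;	/* duplicate: caller keeps ownership */
	}
	*p = node;
	return 0;
}

/* Lookup by id; returns NULL when the walk falls off the tree. */
static struct btf_node_sketch *btf_find(struct btf_node_sketch *root, uint32_t id)
{
	while (root) {
		if (id < root->id)
			root = root->left;
		else if (id > root->id)
			root = root->right;
		else
			return root;
	}
	return NULL;
}
```

Storing the payload in a flexible array member keeps each record in one allocation, which is also what lets the header-writing code treat id, size and data as one contiguous run.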
[tip:perf/urgent] perf feature detection: Add -lopcodes to feature-libbfd
Commit-ID: 31be9478ed7f43d6351e0d5a2257ca76609c83d3
Gitweb: https://git.kernel.org/tip/31be9478ed7f43d6351e0d5a2257ca76609c83d3
Author: Song Liu
AuthorDate: Mon, 11 Mar 2019 22:30:47 -0700
Committer: Arnaldo Carvalho de Melo
CommitDate: Tue, 19 Mar 2019 16:52:07 -0300

perf feature detection: Add -lopcodes to feature-libbfd

Both libbfd and libopcodes are distributed with binutils-dev/devel. When libbfd is present, it is OK to assume that libopcodes is also present. This has been a safe assumption for bpftool.

This patch adds -lopcodes to perf/Makefile.config. libopcodes will be used in the next commit for BPF annotation.

Signed-off-by: Song Liu
Reviewed-by: Jiri Olsa
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stanislav Fomichev
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-12-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/Makefile.config | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 0f11d5891301..df4ad45599ca 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -713,7 +713,7 @@ else
   endif
 
   ifeq ($(feature-libbfd), 1)
-    EXTLIBS += -lbfd
+    EXTLIBS += -lbfd -lopcodes
   else
     # we are on a system that requires -liberty and (maybe) -lz
     # to link against -lbfd; test each case individually here
@@ -724,10 +724,10 @@ else
     $(call feature_check,libbfd-liberty-z)
 
     ifeq ($(feature-libbfd-liberty), 1)
-      EXTLIBS += -lbfd -liberty
+      EXTLIBS += -lbfd -lopcodes -liberty
     else
       ifeq ($(feature-libbfd-liberty-z), 1)
-        EXTLIBS += -lbfd -liberty -lz
+        EXTLIBS += -lbfd -lopcodes -liberty -lz
       endif
     endif
   endif
[tip:perf/urgent] perf bpf: Save BTF information as headers to perf.data
Commit-ID: a70a1123174ab592c5fa8ecf09f9fad9b335b872 Gitweb: https://git.kernel.org/tip/a70a1123174ab592c5fa8ecf09f9fad9b335b872 Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:45 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 19 Mar 2019 16:52:07 -0300 perf bpf: Save BTF information as headers to perf.data This patch enables 'perf record' to save BTF information as headers to perf.data. A new header type HEADER_BPF_BTF is introduced for this data. Committer testing: As root, being on the kernel sources top level directory, run: # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg Just to compile and load a BPF program that attaches to the raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc). Make sure you have a recent enough clang, say version 9, to get the BTF ELF sections needed for this testing: # clang --version | head -1 clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500) # readelf -SW tools/perf/examples/bpf/augmented_raw_syscalls.o | grep BTF [22] .BTF PROGBITS 000ede 000b0e 00 0 0 1 [23] .BTF.ext PROGBITS 0019ec 0002a0 00 0 0 1 [24] .rel.BTF.ext REL 002fa8 000270 10 30 23 8 Then do a systemwide perf record session for a few seconds: # perf record -a sleep 2s Then look at: # perf report --header-only | grep b[pt]f # event : name = cycles:ppp, , id = { 1116204, 1116205, 1116206, 1116207, 1116208, 1116209, 1116210, 1116211 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, enable_on_exec = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1 # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info 
of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 51 # bpf_prog_info of id 52 # btf info of id 8 # We need to show more info about these BPF and BTF entries , but that can be done later. Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Tested-by: Arnaldo Carvalho de Melo Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Cc: kernel-t...@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-10-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/header.c | 101 ++- tools/perf/util/header.h | 1 + 2 files changed, 101 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c index e6a81af516f6..01dda2f65d36 100644 --- a/tools/perf/util/header.c +++ b/tools/perf/util/header.c @@ -928,6 +928,39 @@ static int write_bpf_prog_info(struct feat_fd *ff __maybe_unused, } #endif // HAVE_LIBBPF_SUPPORT +static int write_bpf_btf(struct feat_fd *ff, +struct perf_evlist *evlist __maybe_unused) +{ + struct perf_env *env = >ph->env; + struct rb_root *root; + struct rb_node *next; + int ret; + + down_read(>bpf_progs.lock); + + ret = do_write(ff, >bpf_progs.btfs_cnt, + sizeof(env->bpf_progs.btfs_cnt)); + + if (ret < 0) + goto out; + + root = >bpf_progs.btfs; + next = rb_first(root); + while (next) { + struct btf_node *node; + + node = rb_entry(next, struct btf_node, rb_node); + next = rb_next(>rb_node); + ret = do_write(ff, >id, + sizeof(u32) * 2 + node->data_size); + if (ret < 0) + goto out; + } +out: + up_read(>bpf_progs.lock); + return ret; +} + static int cpu_cache_level__sort(const void *a, const void *b) { struct cpu_cache_level *cache_a = (struct cpu_cache_level *)a; @@ -1442,6 +1475,28 @@ static void print_bpf_prog_info(struct feat_fd *ff, FILE *fp) up_read(>bpf_progs.lock); } +static void print_bpf_btf(struct feat_fd *ff, FILE *fp) +{ + struct perf_env *env = >ph->env; + struct rb_root *root; + struct 
rb_node *next; + + down_read(>bpf_progs.lock); + + root = >bpf_progs.btfs; + next = rb_first(root); + + while (next) { + struct btf_node *node; + + node = rb_entry(next, struct btf_node, rb_node); + next = rb_next(>rb_node); + fprintf(fp, "# btf info of id %u\n", node->id); + } + + up_read(>bpf_progs.lock); +} + static void
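The on-disk layout that write_bpf_btf() produces, as far as the quoted hunk shows, is a count followed by one (id, data_size, data) record per BTF object. The sketch below writes that layout into a memory buffer. It makes several assumptions: the records come from a flat array rather than an rbtree, the fields are written one by one instead of as a single contiguous run starting at the id, and `struct btf_rec`, `put()` and `btf_section_write()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory record; perf keeps these in an rbtree instead. */
struct btf_rec {
	uint32_t id;
	uint32_t data_size;
	const unsigned char *data;
};

/* Append 'len' bytes at *off if they fit; returns 0 on success, -1 otherwise. */
static int put(unsigned char *buf, size_t cap, size_t *off, const void *src, size_t len)
{
	if (len > cap - *off)
		return -1;
	memcpy(buf + *off, src, len);
	*off += len;
	return 0;
}

/* Layout: u32 count, then per record: u32 id, u32 data_size, data bytes.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t btf_section_write(unsigned char *buf, size_t cap,
				const struct btf_rec *recs, uint32_t count)
{
	size_t off = 0;

	assert(buf != NULL);
	if (put(buf, cap, &off, &count, sizeof(count)))
		return 0;
	for (uint32_t i = 0; i < count; i++) {
		if (put(buf, cap, &off, &recs[i].id, sizeof(uint32_t)) ||
		    put(buf, cap, &off, &recs[i].data_size, sizeof(uint32_t)) ||
		    put(buf, cap, &off, recs[i].data, recs[i].data_size))
			return 0;
	}
	return off;
}
```

A reader for this layout would mirror the loop: read the count, then for each record read the two fixed-size fields and use data_size to know how many payload bytes follow.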
[tip:perf/urgent] perf bpf: Save bpf_prog_info information as headers to perf.data
Commit-ID: 606f972b1361f477cbd4e6e8ac00742fde4b39db Gitweb: https://git.kernel.org/tip/606f972b1361f477cbd4e6e8ac00742fde4b39db Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:43 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 19 Mar 2019 16:52:06 -0300 perf bpf: Save bpf_prog_info information as headers to perf.data This patch enables perf-record to save bpf_prog_info information as headers to perf.data. A new header type HEADER_BPF_PROG_INFO is introduced for this data. Committer testing: As root, being on the kernel sources top level directory, run: # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c -e *msg Just to compile and load a BPF program that attaches to the raw_syscalls:sys_{enter,exit} tracepoints to trace the syscalls ending in "msg" (recvmsg, sendmsg, recvmmsg, sendmmsg, etc). Then do a systemwide perf record session for a few seconds: # perf record -a sleep 2s Then look at: # perf report --header-only | grep -i bpf # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 208 # bpf_prog_info of id 209 # We need to show more info about these programs, like bpftool does for the ones running on the system, i.e. 
'perf record/perf report' become a way of saving the BPF state in a machine to then analyse on another, together with all the other information that is already saved in the perf.data header: # perf report --header-only # # captured on: Tue Mar 12 11:42:13 2019 # header version : 1 # data offset: 296 # data size : 16294184 # feat offset: 16294480 # hostname : quaco # os release : 5.0.0+ # perf version : 5.0.gd783c8 # arch : x86_64 # nrcpus online : 8 # nrcpus avail : 8 # cpudesc : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz # cpuid : GenuineIntel,6,142,10 # total memory : 24555720 kB # cmdline : /home/acme/bin/perf (deleted) record -a # event : name = cycles:ppp, , id = { 3190123, 3190124, 3190125, 3190126, 3190127, 3190128, 3190129, 3190130 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1 # CPU_TOPOLOGY info available, use -I to display # NUMA_TOPOLOGY info available, use -I to display # pmu mappings: intel_pt = 8, software = 1, power = 11, uprobe = 7, uncore_imc = 12, cpu = 4, cstate_core = 18, uncore_cbox_2 = 15, breakpoint = 5, uncore_cbox_0 = 13, tracepoint = 2, cstate_pkg = 19, uncore_arb = 17, kprobe = 6, i915 = 10, msr = 9, uncore_cbox_3 = 16, uncore_cbox_1 = 14 # CACHE info available, use -I to display # time of first sample : 116392.441701 # time of last sample : 116400.932584 # sample duration : 8490.883 ms # MEM_TOPOLOGY info available, use -I to display # bpf_prog_info of id 13 # bpf_prog_info of id 14 # bpf_prog_info of id 15 # bpf_prog_info of id 16 # bpf_prog_info of id 17 # bpf_prog_info of id 18 # bpf_prog_info of id 21 # bpf_prog_info of id 22 # bpf_prog_info of id 208 # bpf_prog_info of id 209 # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT # # Committer notes: We can't use the libbpf unconditionally, as 
the build may have been done with NO_LIBBPF, in which case we end up with linking errors, so provide dummy {process,write}_bpf_prog_info() wrapped by HAVE_LIBBPF_SUPPORT for that case. Printing is not affected by this, so it can continue as is.

Signed-off-by: Song Liu
Reviewed-by: Jiri Olsa
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Stanislav Fomichev
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20190312053051.2690567-8-songliubrav...@fb.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/header.c | 153 ++-
 tools/perf/util/header.h | 1 +
 2 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index b0683bf4d9f3..e6a81af516f6 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include "evlist.h"
 #include "evsel.h"
@@ -40,6 +41,7 @@
 #include "time-utils.h"
 #include "units.h"
 #include "cputopo.h"
+#include "bpf-event.h"
 #include "sane_ctype.h"
@@ -876,6 +878,56 @@ static int write_dir_format(struct feat_fd *ff,
 	return do_write(ff, &data->dir.version, sizeof(data->dir.version));
 }
 
+#ifdef HAVE_LIBBPF_SUPPORT
+static int write_bpf_prog_info(struct feat_fd *ff,
+			       struct perf_evlist *evlist __maybe_unused)
+{
+	struct perf_env *env = &ff->ph->env;
[tip:perf/urgent] perf bpf: Save bpf_prog_info in a rbtree in perf_env
Commit-ID: e4378f0cb90be0368c48baad69a99203c58e3196
Gitweb: https://git.kernel.org/tip/e4378f0cb90be0368c48baad69a99203c58e3196
Author: Song Liu
AuthorDate: Mon, 11 Mar 2019 22:30:42 -0700
Committer: Arnaldo Carvalho de Melo
CommitDate: Tue, 19 Mar 2019 16:52:06 -0300

perf bpf: Save bpf_prog_info in a rbtree in perf_env

bpf_prog_info contains information necessary to annotate BPF programs. This patch saves bpf_prog_info for BPF programs loaded in the system.

The big picture for the next few patches:

To fully annotate BPF programs with source code mapping, four pieces of information are needed:

  1) PERF_RECORD_KSYMBOL
  2) PERF_RECORD_BPF_EVENT
  3) bpf_prog_info
  4) btf

Before this set, 1) and 2) in the list are already saved to the perf.data file. For BPF programs that are already loaded before perf runs, 1) and 2) are synthesized by perf_event__synthesize_bpf_events(). For short-lived BPF programs, 1) and 2) are generated by the kernel.

This set handles 3) and 4) from the list. Again, it is necessary to handle existing BPF programs and short-lived programs separately.

This patch handles 3) for existing BPF programs while synthesizing 1) and 2) in perf_event__synthesize_bpf_events(). These data are stored in perf_env. The next patch saves these data from perf_env to perf.data as headers.

Similarly, the two patches after the next save 4) of existing BPF programs to perf_env and perf.data.

A later patch will handle 3) and 4) for short-lived BPF programs by monitoring 1) and 2) in a dedicated thread.
Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Cc: kernel-t...@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-7-songliubrav...@fb.com [ set env->bpf_progs.infos_cnt to zero in perf_env__purge_bpf() as noted by jolsa ] Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/perf.c | 1 + tools/perf/util/bpf-event.c | 30 +++- tools/perf/util/bpf-event.h | 7 +++- tools/perf/util/env.c | 88 + tools/perf/util/env.h | 19 ++ tools/perf/util/session.c | 1 + 6 files changed, 144 insertions(+), 2 deletions(-) diff --git a/tools/perf/perf.c b/tools/perf/perf.c index a11cb006f968..72df4b6fa36f 100644 --- a/tools/perf/perf.c +++ b/tools/perf/perf.c @@ -298,6 +298,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv) use_pager = 1; commit_pager_choice(); + perf_env__init(_env); perf_env__set_cmdline(_env, argc, argv); status = p->fn(argc, argv); perf_config__exit(); diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c index 5237e8f11997..37ee4e2a728a 100644 --- a/tools/perf/util/bpf-event.c +++ b/tools/perf/util/bpf-event.c @@ -10,6 +10,7 @@ #include "debug.h" #include "symbol.h" #include "machine.h" +#include "env.h" #include "session.h" #define ptr_to_u64(ptr)((__u64)(unsigned long)(ptr)) @@ -54,17 +55,28 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session, struct bpf_event *bpf_event = >bpf_event; struct bpf_prog_info_linear *info_linear; struct perf_tool *tool = session->tool; + struct bpf_prog_info_node *info_node; struct bpf_prog_info *info; struct btf *btf = NULL; bool has_btf = false; + struct perf_env *env; u32 sub_prog_cnt, i; int err = 0; u64 arrays; + /* +* for perf-record and perf-report use header.env; +* otherwise, use global perf_env. +*/ + env = session->data ? 
>header.env : _env; + arrays = 1UL << BPF_PROG_INFO_JITED_KSYMS; arrays |= 1UL << BPF_PROG_INFO_JITED_FUNC_LENS; arrays |= 1UL << BPF_PROG_INFO_FUNC_INFO; arrays |= 1UL << BPF_PROG_INFO_PROG_TAGS; + arrays |= 1UL << BPF_PROG_INFO_JITED_INSNS; + arrays |= 1UL << BPF_PROG_INFO_LINE_INFO; + arrays |= 1UL << BPF_PROG_INFO_JITED_LINE_INFO; info_linear = bpf_program__get_prog_info_linear(fd, arrays); if (IS_ERR_OR_NULL(info_linear)) { @@ -153,8 +165,8 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session, machine, process); } - /* Synthesize PERF_RECORD_BPF_EVENT */ if (!opts->no_bpf_event) { + /* Synthesize PERF_RECORD_BPF_EVENT */ *bpf_event = (struct bpf_event){ .header = { .type = PERF_RECORD_BPF_EVENT, @@ -167,6 +179,22 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_session *session, memcpy(bpf_event->tag, info->tag, BPF_TAG_SIZE); memset((void *)event + event->header.size, 0, machine->id_hdr_size); event->header.size +=
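The patch above keeps one bpf_prog_info per loaded program in a tree inside perf_env, keyed by program id. The following is only a minimal sketch of that insert/lookup-by-id pattern. It uses a plain unbalanced binary search tree in place of the kernel-style rbtree, and every name in it (prog_node, prog_tree_insert, prog_tree_find) is hypothetical, not taken from perf.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical node; it only stands in for perf's per-program info node. */
struct prog_node {
	uint32_t id;                     /* BPF program id, used as the key */
	struct prog_node *left, *right;
};

/* Insert a node keyed by id; returns 0 on success, -1 on duplicate or OOM. */
static int prog_tree_insert(struct prog_node **root, uint32_t id)
{
	struct prog_node **p = root;

	assert(root != NULL);
	while (*p) {
		if (id < (*p)->id)
			p = &(*p)->left;
		else if (id > (*p)->id)
			p = &(*p)->right;
		else
			return -1;       /* id already present */
	}

	*p = calloc(1, sizeof(**p));
	if (!*p)
		return -1;
	(*p)->id = id;
	return 0;
}

/* Look up a node by program id; returns NULL when the id is unknown. */
static struct prog_node *prog_tree_find(struct prog_node *root, uint32_t id)
{
	while (root) {
		if (id < root->id)
			root = root->left;
		else if (id > root->id)
			root = root->right;
		else
			return root;
	}
	return NULL;
}
```

A real rbtree rebalances on insert, which this sketch deliberately omits; only the keyed walk is illustrated.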
[tip:perf/urgent] perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool
Commit-ID: e5416950454fa79b7bdc86dac45661b97d887c97 Gitweb: https://git.kernel.org/tip/e5416950454fa79b7bdc86dac45661b97d887c97 Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:41 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 19 Mar 2019 16:52:06 -0300 perf bpf: Make synthesize_bpf_events() receive perf_session pointer instead of perf_tool This patch changes the arguments of perf_event__synthesize_bpf_events() to include perf_session* instead of perf_tool*. perf_session will be used in the next patch. Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Cc: kernel-t...@fb.com Link: http://lkml.kernel.org/r/20190312053051.2690567-6-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-record.c | 2 +- tools/perf/builtin-top.c| 2 +- tools/perf/util/bpf-event.c | 8 +--- tools/perf/util/bpf-event.h | 4 ++-- 4 files changed, 9 insertions(+), 7 deletions(-) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index f29874192d3e..e79faccd7842 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -1114,7 +1114,7 @@ static int record__synthesize(struct record *rec, bool tail) return err; } - err = perf_event__synthesize_bpf_events(tool, process_synthesized_event, + err = perf_event__synthesize_bpf_events(session, process_synthesized_event, machine, opts); if (err < 0) pr_warning("Couldn't synthesize bpf events.\n"); diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index 2508a7a552fa..77e6190211d2 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -1208,7 +1208,7 @@ static int __cmd_top(struct perf_top *top) init_process_thread(top); - ret = perf_event__synthesize_bpf_events(>tool, perf_event__process, + ret = perf_event__synthesize_bpf_events(top->session, perf_event__process, >session->machines.host, >record_opts); if (ret < 0) diff --git 
a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c index e0cbe7f87170..5237e8f11997 100644 --- a/tools/perf/util/bpf-event.c +++ b/tools/perf/util/bpf-event.c @@ -10,6 +10,7 @@ #include "debug.h" #include "symbol.h" #include "machine.h" +#include "session.h" #define ptr_to_u64(ptr)((__u64)(unsigned long)(ptr)) @@ -42,7 +43,7 @@ int machine__process_bpf_event(struct machine *machine __maybe_unused, * -1 for failures; * -2 for lack of kernel support. */ -static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool, +static int perf_event__synthesize_one_bpf_prog(struct perf_session *session, perf_event__handler_t process, struct machine *machine, int fd, @@ -52,6 +53,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool, struct ksymbol_event *ksymbol_event = >ksymbol_event; struct bpf_event *bpf_event = >bpf_event; struct bpf_prog_info_linear *info_linear; + struct perf_tool *tool = session->tool; struct bpf_prog_info *info; struct btf *btf = NULL; bool has_btf = false; @@ -175,7 +177,7 @@ out: return err ? 
-1 : 0; } -int perf_event__synthesize_bpf_events(struct perf_tool *tool, +int perf_event__synthesize_bpf_events(struct perf_session *session, perf_event__handler_t process, struct machine *machine, struct record_opts *opts) @@ -209,7 +211,7 @@ int perf_event__synthesize_bpf_events(struct perf_tool *tool, continue; } - err = perf_event__synthesize_one_bpf_prog(tool, process, + err = perf_event__synthesize_one_bpf_prog(session, process, machine, fd, event, opts); close(fd); diff --git a/tools/perf/util/bpf-event.h b/tools/perf/util/bpf-event.h index 7890067e1a37..6698683612a7 100644 --- a/tools/perf/util/bpf-event.h +++ b/tools/perf/util/bpf-event.h @@ -15,7 +15,7 @@ struct record_opts; int machine__process_bpf_event(struct machine *machine, union perf_event *event, struct perf_sample *sample); -int perf_event__synthesize_bpf_events(struct perf_tool *tool, +int perf_event__synthesize_bpf_events(struct perf_session *session, perf_event__handler_t process, struct machine *machine,
[tip:perf/urgent] perf bpf: Synthesize bpf events with bpf_program__get_prog_info_linear()
Commit-ID: a742258af131e570a68ad8cf16cd2cc4692675a0 Gitweb: https://git.kernel.org/tip/a742258af131e570a68ad8cf16cd2cc4692675a0 Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:40 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 19 Mar 2019 16:52:06 -0300 perf bpf: Synthesize bpf events with bpf_program__get_prog_info_linear() With bpf_program__get_prog_info_linear, we can simplify the logic that synthesizes bpf events. This patch doesn't change the behavior of the code. Committer notes: Needed this (for all four variables), suggested by Song, to overcome build failure on debian experimental cross building to MIPS 32-bit: - u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(info->prog_tags); + u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(uintptr_t)(info->prog_tags); util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog': util/bpf-event.c:143:35: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(info->prog_tags); ^ util/bpf-event.c:144:22: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] __u32 *prog_lens = (__u32 *)(info->jited_func_lens); ^ util/bpf-event.c:145:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] __u64 *prog_addrs = (__u64 *)(info->jited_ksyms); ^ util/bpf-event.c:146:22: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] void *func_infos = (void *)(info->func_info); ^ cc1: all warnings being treated as errors Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: kernel-t...@fb.com Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Link: http://lkml.kernel.org/r/20190312053051.2690567-5-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/bpf-event.c | 118 +++- 1 file changed, 40 insertions(+), 78 deletions(-) diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c index 
ea012b735a37..e0cbe7f87170 100644 --- a/tools/perf/util/bpf-event.c +++ b/tools/perf/util/bpf-event.c @@ -3,7 +3,9 @@ #include #include #include +#include #include +#include #include "bpf-event.h" #include "debug.h" #include "symbol.h" @@ -49,99 +51,62 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool, { struct ksymbol_event *ksymbol_event = >ksymbol_event; struct bpf_event *bpf_event = >bpf_event; - u32 sub_prog_cnt, i, func_info_rec_size = 0; - u8 (*prog_tags)[BPF_TAG_SIZE] = NULL; - struct bpf_prog_info info = { .type = 0, }; - u32 info_len = sizeof(info); - void *func_infos = NULL; - u64 *prog_addrs = NULL; + struct bpf_prog_info_linear *info_linear; + struct bpf_prog_info *info; struct btf *btf = NULL; - u32 *prog_lens = NULL; bool has_btf = false; - char errbuf[512]; + u32 sub_prog_cnt, i; int err = 0; + u64 arrays; - /* Call bpf_obj_get_info_by_fd() to get sizes of arrays */ - err = bpf_obj_get_info_by_fd(fd, , _len); + arrays = 1UL << BPF_PROG_INFO_JITED_KSYMS; + arrays |= 1UL << BPF_PROG_INFO_JITED_FUNC_LENS; + arrays |= 1UL << BPF_PROG_INFO_FUNC_INFO; + arrays |= 1UL << BPF_PROG_INFO_PROG_TAGS; - if (err) { - pr_debug("%s: failed to get BPF program info: %s, aborting\n", -__func__, str_error_r(errno, errbuf, sizeof(errbuf))); + info_linear = bpf_program__get_prog_info_linear(fd, arrays); + if (IS_ERR_OR_NULL(info_linear)) { + info_linear = NULL; + pr_debug("%s: failed to get BPF program info. 
aborting\n", __func__); return -1; } - if (info_len < offsetof(struct bpf_prog_info, prog_tags)) { + + if (info_linear->info_len < offsetof(struct bpf_prog_info, prog_tags)) { pr_debug("%s: the kernel is too old, aborting\n", __func__); return -2; } + info = _linear->info; + /* number of ksyms, func_lengths, and tags should match */ - sub_prog_cnt = info.nr_jited_ksyms; - if (sub_prog_cnt != info.nr_prog_tags || - sub_prog_cnt != info.nr_jited_func_lens) + sub_prog_cnt = info->nr_jited_ksyms; + if (sub_prog_cnt != info->nr_prog_tags || + sub_prog_cnt != info->nr_jited_func_lens) return -1; /* check BTF func info support */ - if (info.btf_id && info.nr_func_info && info.func_info_rec_size) { + if (info->btf_id && info->nr_func_info && info->func_info_rec_size) { /* btf func info number should be same as sub_prog_cnt */ - if
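Two idioms from this patch can be shown in isolation: selecting arrays with a one-bit-per-array mask, and converting a 64-bit field that carries an address into a pointer through uintptr_t, which is what the committer notes add to fix the 32-bit MIPS build. This is a sketch only. The enum below is an illustrative stand-in, and its numeric values are assumptions rather than values copied from libbpf.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-ins for libbpf's array selectors; the numeric
 * values are assumptions made for this sketch.
 */
enum ex_info_array {
	EX_JITED_KSYMS     = 3,
	EX_JITED_FUNC_LENS = 4,
	EX_FUNC_INFO       = 5,
	EX_PROG_TAGS       = 8,
};

/* Build the request mask: one bit per array the caller wants filled in. */
static uint64_t ex_wanted_arrays(void)
{
	uint64_t arrays;

	arrays  = 1ULL << EX_JITED_KSYMS;
	arrays |= 1ULL << EX_JITED_FUNC_LENS;
	arrays |= 1ULL << EX_FUNC_INFO;
	arrays |= 1ULL << EX_PROG_TAGS;
	return arrays;
}

/*
 * Convert a 64-bit field holding an address into a pointer. Casting
 * through uintptr_t first narrows the integer to pointer width, which
 * avoids -Wint-to-pointer-cast where pointers are 32 bits wide.
 */
static void *ex_u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}

/* The reverse direction, for storing a pointer in a 64-bit field. */
static uint64_t ex_ptr_to_u64(const void *p)
{
	assert(sizeof(uintptr_t) <= sizeof(uint64_t));
	return (uint64_t)(uintptr_t)p;
}
```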
[tip:perf/urgent] bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump()
Commit-ID: cae73f2339231d61022769f09c94e4500e8ad47a Gitweb: https://git.kernel.org/tip/cae73f2339231d61022769f09c94e4500e8ad47a Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:39 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 19 Mar 2019 16:52:06 -0300 bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump() This patch uses bpf_program__get_prog_info_linear() to simplify the logic in prog.c do_dump(). Committer testing: Before: # bpftool prog dump xlated id 208 > /tmp/dump.xlated.before # bpftool prog dump jited id 208 > /tmp/dump.jited.before # bpftool map dump id 107 > /tmp/map.dump.before After: # ~acme/git/perf/tools/bpf/bpftool/bpftool map dump id 107 > /tmp/map.dump.after # ~acme/git/perf/tools/bpf/bpftool/bpftool prog dump xlated id 208 > /tmp/dump.xlated.after # ~acme/git/perf/tools/bpf/bpftool/bpftool prog dump jited id 208 > /tmp/dump.jited.after # diff -u /tmp/dump.xlated.before /tmp/dump.xlated.after # diff -u /tmp/dump.jited.before /tmp/dump.jited.after # diff -u /tmp/map.dump.before /tmp/map.dump.after # ~acme/git/perf/tools/bpf/bpftool/bpftool prog dump xlated id 208 0: (bf) r6 = r1 1: (85) call bpf_get_current_pid_tgid#80800 2: (63) *(u32 *)(r10 -328) = r0 3: (bf) r2 = r10 4: (07) r2 += -328 5: (18) r1 = map[id:107] 7: (85) call __htab_map_lookup_elem#85680 8: (15) if r0 == 0x0 goto pc+1 9: (07) r0 += 56 10: (b7) r7 = 0 11: (55) if r0 != 0x0 goto pc+52 12: (bf) r1 = r10 13: (07) r1 += -328 14: (b7) r2 = 64 15: (bf) r3 = r6 16: (85) call bpf_probe_read#-46848 17: (bf) r2 = r10 18: (07) r2 += -320 19: (18) r1 = map[id:106] 21: (07) r1 += 208 22: (61) r0 = *(u32 *)(r2 +0) 23: (35) if r0 >= 0x200 goto pc+3 24: (67) r0 <<= 3 25: (0f) r0 += r1 26: (05) goto pc+1 27: (b7) r0 = 0 28: (15) if r0 == 0x0 goto pc+35 29: (71) r1 = *(u8 *)(r0 +0) 30: (15) if r1 == 0x0 goto pc+33 31: (b7) r5 = 64 32: (79) r1 = *(u64 *)(r10 -320) 33: (15) if r1 == 0x2 goto pc+2 34: (15) if r1 == 0x101 goto pc+3 35: (55) if r1 != 0x15 goto pc+19 36: 
(79) r3 = *(u64 *)(r6 +16) 37: (05) goto pc+1 38: (79) r3 = *(u64 *)(r6 +24) 39: (15) if r3 == 0x0 goto pc+15 40: (b7) r1 = 0 41: (63) *(u32 *)(r10 -260) = r1 42: (bf) r1 = r10 43: (07) r1 += -256 44: (b7) r2 = 256 45: (85) call bpf_probe_read_str#-46704 46: (b7) r5 = 328 47: (63) *(u32 *)(r10 -264) = r0 48: (bf) r1 = r0 49: (67) r1 <<= 32 50: (77) r1 >>= 32 51: (25) if r1 > 0xff goto pc+3 52: (07) r0 += 72 53: (57) r0 &= 255 54: (bf) r5 = r0 55: (bf) r4 = r10 56: (07) r4 += -328 57: (bf) r1 = r6 58: (18) r2 = map[id:105] 60: (18) r3 = 0x 62: (85) call bpf_perf_event_output_tp#-45104 63: (bf) r7 = r0 64: (bf) r0 = r7 65: (95) exit # Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Acked-by: Daniel Borkmann Tested-by: Arnaldo Carvalho de Melo Cc: Alexei Starovoitov Cc: kernel-t...@fb.com Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Link: http://lkml.kernel.org/r/20190312053051.2690567-4-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/bpf/bpftool/prog.c | 266 +++ 1 file changed, 59 insertions(+), 207 deletions(-) diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 8ef80d65a474..d2be5a06c339 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -401,41 +401,31 @@ static int do_show(int argc, char **argv) static int do_dump(int argc, char **argv) { - unsigned int finfo_rec_size, linfo_rec_size, jited_linfo_rec_size; - void *func_info = NULL, *linfo = NULL, *jited_linfo = NULL; - unsigned int nr_finfo, nr_linfo = 0, nr_jited_linfo = 0; + struct bpf_prog_info_linear *info_linear; struct bpf_prog_linfo *prog_linfo = NULL; - unsigned long *func_ksyms = NULL; - struct bpf_prog_info info = {}; - unsigned int *func_lens = NULL; + enum {DUMP_JITED, DUMP_XLATED} mode; const char *disasm_opt = NULL; - unsigned int nr_func_ksyms; - unsigned int nr_func_lens; + struct bpf_prog_info *info; struct dump_data dd = {}; - __u32 len = sizeof(info); + void *func_info = NULL; struct btf *btf = NULL; - unsigned int 
buf_size; char *filepath = NULL; bool opcodes = false; bool visual = false; char func_sig[1024]; unsigned char *buf; bool linum = false; - __u32 *member_len; - __u64 *member_ptr; + __u32 member_len; + __u64 arrays; ssize_t n; - int err; int fd; if (is_prefix(*argv, "jited")) { if (disasm_init()) return -1; - - member_len = _prog_len; - member_ptr
[tip:perf/urgent] tools lib bpf: Introduce bpf_program__get_prog_info_linear()
Commit-ID: 34be16466d4dc06f3d604dafbcdb3327b72e78da Gitweb: https://git.kernel.org/tip/34be16466d4dc06f3d604dafbcdb3327b72e78da Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:38 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 19 Mar 2019 16:52:06 -0300 tools lib bpf: Introduce bpf_program__get_prog_info_linear() Currently, bpf_prog_info includes 9 arrays. The user has the option to fetch any combination of these arrays. However, this requires a lot of handling. This work becomes trickier when we need to store bpf_prog_info to a file, because these arrays are allocated independently. This patch introduces 'struct bpf_prog_info_linear', which stores arrays of bpf_prog_info in contiguous memory. Helper functions are introduced to unify the work to get different sets of bpf_prog_info. Specifically, bpf_program__get_prog_info_linear() allows the user to select which arrays to fetch, and handles details for the user. Please see the comments right before 'enum bpf_prog_info_array' for more details and examples. Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Acked-by: Daniel Borkmann Link: https://lkml.kernel.org/r/ce92c091-e80d-a0c1-4aa0-987706c42...@iogearbox.net Tested-by: Arnaldo Carvalho de Melo Cc: Alexei Starovoitov Cc: kernel-t...@fb.com Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Link: http://lkml.kernel.org/r/20190312053051.2690567-3-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/lib/bpf/libbpf.c | 251 +++ tools/lib/bpf/libbpf.h | 63 tools/lib/bpf/libbpf.map | 3 + 3 files changed, 317 insertions(+) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 4884557aa17f..8fb6e89b4b2c 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -112,6 +112,11 @@ void libbpf_print(enum libbpf_print_level level, const char *format, ...) 
# define LIBBPF_ELF_C_READ_MMAP ELF_C_READ #endif +static inline __u64 ptr_to_u64(const void *ptr) +{ + return (__u64) (unsigned long) ptr; +} + struct bpf_capabilities { /* v4.14: kernel support for program & map names. */ __u32 name:1; @@ -2997,3 +3002,249 @@ bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size, ring_buffer_write_tail(header, data_tail); return ret; } + +struct bpf_prog_info_array_desc { + int array_offset; /* e.g. offset of jited_prog_insns */ + int count_offset; /* e.g. offset of jited_prog_len */ + int size_offset;/* > 0: offset of rec size, +* < 0: fix size of -size_offset +*/ +}; + +static struct bpf_prog_info_array_desc bpf_prog_info_array_desc[] = { + [BPF_PROG_INFO_JITED_INSNS] = { + offsetof(struct bpf_prog_info, jited_prog_insns), + offsetof(struct bpf_prog_info, jited_prog_len), + -1, + }, + [BPF_PROG_INFO_XLATED_INSNS] = { + offsetof(struct bpf_prog_info, xlated_prog_insns), + offsetof(struct bpf_prog_info, xlated_prog_len), + -1, + }, + [BPF_PROG_INFO_MAP_IDS] = { + offsetof(struct bpf_prog_info, map_ids), + offsetof(struct bpf_prog_info, nr_map_ids), + -(int)sizeof(__u32), + }, + [BPF_PROG_INFO_JITED_KSYMS] = { + offsetof(struct bpf_prog_info, jited_ksyms), + offsetof(struct bpf_prog_info, nr_jited_ksyms), + -(int)sizeof(__u64), + }, + [BPF_PROG_INFO_JITED_FUNC_LENS] = { + offsetof(struct bpf_prog_info, jited_func_lens), + offsetof(struct bpf_prog_info, nr_jited_func_lens), + -(int)sizeof(__u32), + }, + [BPF_PROG_INFO_FUNC_INFO] = { + offsetof(struct bpf_prog_info, func_info), + offsetof(struct bpf_prog_info, nr_func_info), + offsetof(struct bpf_prog_info, func_info_rec_size), + }, + [BPF_PROG_INFO_LINE_INFO] = { + offsetof(struct bpf_prog_info, line_info), + offsetof(struct bpf_prog_info, nr_line_info), + offsetof(struct bpf_prog_info, line_info_rec_size), + }, + [BPF_PROG_INFO_JITED_LINE_INFO] = { + offsetof(struct bpf_prog_info, jited_line_info), + offsetof(struct bpf_prog_info, nr_jited_line_info), + 
offsetof(struct bpf_prog_info, jited_line_info_rec_size), + }, + [BPF_PROG_INFO_PROG_TAGS] = { + offsetof(struct bpf_prog_info, prog_tags), + offsetof(struct bpf_prog_info, nr_prog_tags), + -(int)sizeof(__u8) * BPF_TAG_SIZE, + }, + +}; + +static __u32 bpf_prog_info_read_offset_u32(struct bpf_prog_info *info, int offset) +{ + __u32 *array = (__u32 *)info; + + if (offset >= 0) + return array[offset /
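The descriptor table in this patch encodes, for each array, where its element count lives and how large each element is, with a negative size_offset meaning "fixed element size of -size_offset". The sketch below illustrates only that decoding step on a cut-down, hypothetical struct (ex_info). It does not reproduce the libbpf implementation, and all ex_ names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct bpf_prog_info. */
struct ex_info {
	uint32_t nr_map_ids;
	uint32_t nr_func_info;
	uint32_t func_info_rec_size;
};

struct ex_array_desc {
	int count_offset;   /* offset of the element-count field */
	int size_offset;    /* > 0: offset of the record-size field,
			     * < 0: fixed element size of -size_offset */
};

/* Fixed-size elements: map ids are always one u32 each. */
static const struct ex_array_desc ex_map_ids_desc = {
	(int)offsetof(struct ex_info, nr_map_ids),
	-(int)sizeof(uint32_t),
};

/* Variable-size records: the record size is read from the struct. */
static const struct ex_array_desc ex_func_info_desc = {
	(int)offsetof(struct ex_info, nr_func_info),
	(int)offsetof(struct ex_info, func_info_rec_size),
};

/* Read a u32 at a byte offset, or decode a negative offset as a constant. */
static uint32_t ex_read_u32(const struct ex_info *info, int offset)
{
	uint32_t v;

	if (offset < 0)
		return (uint32_t)-offset;
	assert((size_t)offset + sizeof(v) <= sizeof(*info));
	memcpy(&v, (const char *)info + offset, sizeof(v));
	return v;
}

/* Bytes needed to hold one array: element count times element size. */
static uint64_t ex_array_bytes(const struct ex_info *info,
			       const struct ex_array_desc *desc)
{
	uint64_t count = ex_read_u32(info, desc->count_offset);
	uint64_t size = ex_read_u32(info, desc->size_offset);

	return count * size;
}
```

Driving the lookup from a table keeps one code path for every array, which is presumably why the patch describes all nine arrays with offsets instead of handling each by name.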
[tip:perf/urgent] perf record: Replace option --bpf-event with --no-bpf-event
Commit-ID: 71184c6ab7e60fd59d8dbc8fed62a1c753dc4934 Gitweb: https://git.kernel.org/tip/71184c6ab7e60fd59d8dbc8fed62a1c753dc4934 Author: Song Liu AuthorDate: Mon, 11 Mar 2019 22:30:37 -0700 Committer: Arnaldo Carvalho de Melo CommitDate: Tue, 19 Mar 2019 16:52:06 -0300 perf record: Replace option --bpf-event with --no-bpf-event Currently, monitoring of BPF programs through bpf_event is off by default for 'perf record'. To turn it on, the user needs to use the option "--bpf-event". As BPF gets wider adoption in different subsystems, this option becomes inconvenient. This patch makes bpf_event on by default, and adds the option "--no-bpf-event" to turn it off. Since the option --bpf-event is not released yet, it is safe to remove it. Signed-off-by: Song Liu Reviewed-by: Jiri Olsa Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: kernel-t...@fb.com Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Stanislav Fomichev Link: http://lkml.kernel.org/r/20190312053051.2690567-2-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-record.c | 2 +- tools/perf/perf.h | 2 +- tools/perf/util/bpf-event.c | 2 +- tools/perf/util/evsel.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index e7144a1c1c82..f29874192d3e 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -1891,7 +1891,7 @@ static struct option __record_options[] = { OPT_BOOLEAN(0, "tail-synthesize", _synthesize, "synthesize non-sample events at the end of output"), OPT_BOOLEAN(0, "overwrite", , "use overwrite mode"), - OPT_BOOLEAN(0, "bpf-event", _event, "record bpf events"), + OPT_BOOLEAN(0, "no-bpf-event", _bpf_event, "record bpf events"), OPT_BOOLEAN(0, "strict-freq", _freq, "Fail if the specified frequency can't be used"), OPT_CALLBACK('F', "freq", , "freq or 'max'", diff --git a/tools/perf/perf.h b/tools/perf/perf.h index b120e547ddc7..c59743def8d3 100644 --- a/tools/perf/perf.h +++ b/tools/perf/perf.h @@ 
-66,7 +66,7 @@ struct record_opts { bool ignore_missing_thread; bool strict_freq; bool sample_id; - bool bpf_event; + bool no_bpf_event; unsigned int freq; unsigned int mmap_pages; unsigned int auxtrace_mmap_pages; diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c index 028c8ec1f62a..ea012b735a37 100644 --- a/tools/perf/util/bpf-event.c +++ b/tools/perf/util/bpf-event.c @@ -187,7 +187,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool, } /* Synthesize PERF_RECORD_BPF_EVENT */ - if (opts->bpf_event) { + if (!opts->no_bpf_event) { *bpf_event = (struct bpf_event){ .header = { .type = PERF_RECORD_BPF_EVENT, diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 1a2023da5d9c..7835e05f0c0a 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -1036,7 +1036,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts, attr->mmap2 = track && !perf_missing_features.mmap2; attr->comm = track; attr->ksymbol = track && !perf_missing_features.ksymbol; - attr->bpf_event = track && opts->bpf_event && + attr->bpf_event = track && !opts->no_bpf_event && !perf_missing_features.bpf_event; if (opts->record_namespaces)
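Inverting the flag is what makes the feature default-on: a zero-initialized options struct now means "record bpf events". A minimal sketch of the resulting predicate, assuming a hypothetical cut-down options struct and a hypothetical kernel_lacks_bpf_event parameter standing in for perf_missing_features.bpf_event:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for perf's struct record_opts. */
struct ex_record_opts {
	bool no_bpf_event;   /* set only when the user passes --no-bpf-event */
};

/*
 * Decide whether to request bpf events for a tracking event. With the
 * inverted flag, zero-initialized options enable the feature by default.
 */
static bool ex_want_bpf_event(bool track, const struct ex_record_opts *opts,
			      bool kernel_lacks_bpf_event)
{
	assert(opts != NULL);
	return track && !opts->no_bpf_event && !kernel_lacks_bpf_event;
}
```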
[tip:perf/urgent] perf, bpf: Consider events with attr.bpf_event as side-band events
Commit-ID: 21038f2baa05a0550f56f010f609a5c871b6a274 Gitweb: https://git.kernel.org/tip/21038f2baa05a0550f56f010f609a5c871b6a274 Author: Song Liu AuthorDate: Mon, 25 Feb 2019 16:20:05 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Thu, 28 Feb 2019 14:20:35 -0300 perf, bpf: Consider events with attr.bpf_event as side-band events Events with attr.bpf_event set should be considered as side-band events, as they carry information about BPF programs. Signed-off-by: Song Liu Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Jiri Olsa Cc: Namhyung Kim Cc: Peter Zijlstra Cc: kernel-t...@fb.com Cc: net...@vger.kernel.org Fixes: 6ee52e2a3fe4 ("perf, bpf: Introduce PERF_RECORD_BPF_EVENT") Link: http://lkml.kernel.org/r/20190226002019.3748539-2-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- kernel/events/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 5f59d848171e..dd9698ad3d66 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -4238,7 +4238,8 @@ static bool is_sb_event(struct perf_event *event) if (attr->mmap || attr->mmap_data || attr->mmap2 || attr->comm || attr->comm_exec || attr->task || attr->ksymbol || - attr->context_switch) + attr->context_switch || + attr->bpf_event) return true; return false; }
[tip:perf/core] perf utils: Silence "Couldn't synthesize bpf events" warning for EPERM
Commit-ID: 39f4a913d6d439178177cae8aa2e9a232160fd51 Gitweb: https://git.kernel.org/tip/39f4a913d6d439178177cae8aa2e9a232160fd51 Author: Song Liu AuthorDate: Mon, 4 Feb 2019 11:31:40 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Thu, 14 Feb 2019 13:31:11 -0300 perf utils: Silence "Couldn't synthesize bpf events" warning for EPERM Synthesizing BPF events is only supported for root. Silence the warning message when a non-root user runs perf-record. Reported-by: David Carrillo-Cisneros Signed-off-by: Song Liu Tested-by: David Carrillo-Cisneros Acked-by: Jiri Olsa Cc: kernel-t...@fb.com Link: http://lkml.kernel.org/r/20190204193140.719740-1-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/bpf-event.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c index 796ef793f4ce..62dda96b0096 100644 --- a/tools/perf/util/bpf-event.c +++ b/tools/perf/util/bpf-event.c @@ -236,8 +236,8 @@ int perf_event__synthesize_bpf_events(struct perf_tool *tool, pr_debug("%s: can't get next program: %s%s", __func__, strerror(errno), errno == EINVAL ? " -- kernel too old?" : ""); - /* don't report error on old kernel */ - err = (errno == EINVAL) ? 0 : -1; + /* don't report error on old kernel or EPERM */ + err = (errno == EINVAL || errno == EPERM) ? 0 : -1; break; } fd = bpf_prog_get_fd_by_id(id);
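The changed line reduces to a small mapping from errno to the function's return value. As a sketch, with a hypothetical helper name (ex_next_id_error) that does not exist in perf:

```c
#include <assert.h>
#include <errno.h>

/*
 * Map the errno left by a failed "get next program id" call to the
 * synthesize function's result: an old kernel (EINVAL) or a non-root
 * caller (EPERM) is not treated as an error, anything else is.
 */
static int ex_next_id_error(int err_no)
{
	assert(err_no > 0);
	return (err_no == EINVAL || err_no == EPERM) ? 0 : -1;
}
```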
[tip:perf/core] perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT
Commit-ID: 811184fb6977bb02c21512d8af6a613a7ebce329 Gitweb: https://git.kernel.org/tip/811184fb6977bb02c21512d8af6a613a7ebce329 Author: Song Liu AuthorDate: Tue, 22 Jan 2019 13:02:18 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Fri, 25 Jan 2019 15:12:10 +0100 perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT Added missing machine->id_hdr_size to event->header.size. Also fixed size of PERF_RECORD_KSYMBOL by removing extra bytes for name. Committer notes: We need to malloc that extra machine->id_hdr_size at the start of perf_event__synthesize_bpf_events() and also need to cast the event to (void *) otherwise we segfault, fix it. Reported-by: Arnaldo Carvalho de Melo Suggested-by: Jiri Olsa Signed-off-by: Song Liu Acked-by: Jiri Olsa Tested-by: Arnaldo Carvalho de Melo Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Peter Zijlstra Cc: kernel-t...@fb.com Fixes: 7b612e291a5a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs") Link: http://lkml.kernel.org/r/20190122210218.358664-1-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/bpf-event.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/bpf-event.c b/tools/perf/util/bpf-event.c index 01e1dc1bb7fb..796ef793f4ce 100644 --- a/tools/perf/util/bpf-event.c +++ b/tools/perf/util/bpf-event.c @@ -7,6 +7,7 @@ #include "bpf-event.h" #include "debug.h" #include "symbol.h" +#include "machine.h" #define ptr_to_u64(ptr)((__u64)(unsigned long)(ptr)) @@ -149,7 +150,7 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool, *ksymbol_event = (struct ksymbol_event){ .header = { .type = PERF_RECORD_KSYMBOL, - .size = sizeof(struct ksymbol_event), + .size = offsetof(struct ksymbol_event, name), }, .addr = prog_addrs[i], .len = prog_lens[i], @@ -178,6 +179,9 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool, ksymbol_event->header.size += PERF_ALIGN(name_len + 1, sizeof(u64)); + + memset((void *)event + 
event->header.size, 0, machine->id_hdr_size); + event->header.size += machine->id_hdr_size; err = perf_tool__process_synth_event(tool, event, machine, process); } @@ -194,6 +198,8 @@ static int perf_event__synthesize_one_bpf_prog(struct perf_tool *tool, .id = info.id, }; memcpy(bpf_event->tag, prog_tags[i], BPF_TAG_SIZE); + memset((void *)event + event->header.size, 0, machine->id_hdr_size); + event->header.size += machine->id_hdr_size; err = perf_tool__process_synth_event(tool, event, machine, process); } @@ -217,7 +223,7 @@ int perf_event__synthesize_bpf_events(struct perf_tool *tool, int err; int fd; - event = malloc(sizeof(event->bpf_event) + KSYM_NAME_LEN); + event = malloc(sizeof(event->bpf_event) + KSYM_NAME_LEN + machine->id_hdr_size); if (!event) return -1; while (true) {
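The size arithmetic fixed by this patch can be checked in isolation: start from the offset of the name field instead of the full struct size, add the name padded to 8 bytes, then add the id header size. The struct layout below is an assumption made for this sketch (an 8-byte header, then addr, len, ksym_type, flags and a 256-byte name buffer), and all ex_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_KSYM_NAME_LEN 256
/* Round x up to a multiple of a; a must be a power of two. */
#define EX_ALIGN(x, a) (((x) + (a) - 1) & ~((size_t)(a) - 1))

/* Assumed layout, mirroring the fields named in the patch. */
struct ex_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct ex_ksymbol_event {
	struct ex_event_header header;
	uint64_t addr;
	uint32_t len;
	uint16_t ksym_type;
	uint16_t flags;
	char name[EX_KSYM_NAME_LEN];
};

/* Size of a synthesized ksymbol event after the fix. */
static size_t ex_ksymbol_size(size_t name_len, size_t id_hdr_size)
{
	size_t size = offsetof(struct ex_ksymbol_event, name);

	assert(name_len < EX_KSYM_NAME_LEN);
	size += EX_ALIGN(name_len + 1, sizeof(uint64_t)); /* name + NUL, padded */
	size += id_hdr_size;                              /* room for sample id */
	return size;
}
```

Under this assumed layout, the 25-character name bpf_prog_7be49e3934a125ba gives 24 + 32 = 56 bytes before the id header. The pre-fix computation, sizeof(struct) plus the padded name, gives 280 + 32 = 312, which agrees with the "raw event: size 312 bytes" dump quoted in the committer testing of the patch being fixed.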
[tip:perf/core] bpf: Add module name [bpf] to ksymbols for bpf programs
Commit-ID: 6934058d9fb6c058fb5e5b11cdcb19834e205c91 Gitweb: https://git.kernel.org/tip/6934058d9fb6c058fb5e5b11cdcb19834e205c91 Author: Song Liu AuthorDate: Thu, 17 Jan 2019 08:15:21 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 21 Jan 2019 17:38:56 -0300 bpf: Add module name [bpf] to ksymbols for bpf programs With this patch, /proc/kallsyms will show BPF programs as t bpf_prog__ [bpf] Signed-off-by: Song Liu Reviewed-by: Arnaldo Carvalho de Melo Tested-by: Arnaldo Carvalho de Melo Acked-by: Peter Zijlstra Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Peter Zijlstra Cc: kernel-t...@fb.com Cc: net...@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-10-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- kernel/kallsyms.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c index f3a04994e063..14934afa9e68 100644 --- a/kernel/kallsyms.c +++ b/kernel/kallsyms.c @@ -494,7 +494,7 @@ static int get_ksymbol_ftrace_mod(struct kallsym_iter *iter) static int get_ksymbol_bpf(struct kallsym_iter *iter) { - iter->module_name[0] = '\0'; + strlcpy(iter->module_name, "bpf", MODULE_NAME_LEN); iter->exported = 0; return bpf_get_kallsym(iter->pos - iter->pos_ftrace_mod_end, >value, >type,
[tip:perf/core] perf tools: Synthesize PERF_RECORD_* for loaded BPF programs
Commit-ID: 7b612e291a5affb12b9d0b87332c71bcbe9c5db4 Gitweb: https://git.kernel.org/tip/7b612e291a5affb12b9d0b87332c71bcbe9c5db4 Author: Song Liu AuthorDate: Thu, 17 Jan 2019 08:15:19 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 21 Jan 2019 17:36:39 -0300 perf tools: Synthesize PERF_RECORD_* for loaded BPF programs This patch synthesizes PERF_RECORD_KSYMBOL and PERF_RECORD_BPF_EVENT for BPF programs loaded before perf-record. This is achieved by gathering information about all BPF programs via sys_bpf. Committer notes: Fix the build on some older systems such as amazonlinux:1 where it was breaking with: util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog': util/bpf-event.c:52:9: error: missing initializer for field 'type' of 'struct bpf_prog_info' [-Werror=missing-field-initializers] struct bpf_prog_info info = {}; ^ In file included from /git/linux/tools/lib/bpf/bpf.h:26:0, from util/bpf-event.c:3: /git/linux/tools/include/uapi/linux/bpf.h:2699:8: note: 'type' declared here __u32 type; ^ cc1: all warnings being treated as errors Further fix on a centos:6 system: cc1: warnings being treated as errors util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog': util/bpf-event.c:50: error: 'func_info_rec_size' may be used uninitialized in this function The compiler is wrong, but to silence it, initialize that variable to zero. One more fix, this time for debian:experimental-x-mips, x-mips64 and x-mipsel: util/bpf-event.c: In function 'perf_event__synthesize_one_bpf_prog': util/bpf-event.c:93:16: error: implicit declaration of function 'calloc' [-Werror=implicit-function-declaration] func_infos = calloc(sub_prog_cnt, func_info_rec_size); ^~ util/bpf-event.c:93:16: error: incompatible implicit declaration of built-in function 'calloc' [-Werror] util/bpf-event.c:93:16: note: include '' or provide a declaration of 'calloc' Add the missing header. 
Committer testing: # perf record --bpf-event sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.021 MB perf.data (7 samples) ] # perf report -D | grep PERF_RECORD_BPF_EVENT | nl 1 0 0x4b10 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 13 2 0 0x4c60 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 14 3 0 0x4db0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 15 4 0 0x4f00 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 16 5 0 0x5050 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 17 6 0 0x51a0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 18 7 0 0x52f0 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 21 8 0 0x5440 [0x18]: PERF_RECORD_BPF_EVENT bpf event with type 1, flags 0, id 22 # bpftool prog 13: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 14: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 15: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 16: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 15,16 17: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 18: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:44-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 17,18 21: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 22: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:45-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 21,22 # # perf report -D | grep -B22 PERF_RECORD_KSYMBOL . ... 
raw event: size 312 bytes . 0000: 11 00 00 00 00 00 38 01 ff 44 06 c0 ff ff ff ff ..8..D.. . 0010: e5 00 00 00 01 00 00 00 62 70 66 5f 70 72 6f 67 bpf_prog . 0020: 5f 37 62 65 34 39 65 33 39 33 34 61 31 32 35 62 _7be49e3934a125b . 0030: 61 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a... . 0110: 00 00 00 00 00 00 00 00 21 00 00 00 00 00 00 00 !... . 0120: 7b e4 9e 39 34 a1 25 ba 00 00 00 00 00 00 00 00 {..94.%. . 0130: 00 00 00 00 00 00 00 00 0 0x49d8 [0x138]: PERF_RECORD_KSYMBOL ksymbol event with addr c00644ff len 229 type 1 flags 0x0 name bpf_prog_7be49e3934a125ba -- .
[tip:perf/core] perf tools: Handle PERF_RECORD_BPF_EVENT
Commit-ID: 45178a928a4b7c6093f6621e627d09909e81cc13 Gitweb: https://git.kernel.org/tip/45178a928a4b7c6093f6621e627d09909e81cc13 Author: Song Liu AuthorDate: Thu, 17 Jan 2019 08:15:18 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 21 Jan 2019 17:00:57 -0300 perf tools: Handle PERF_RECORD_BPF_EVENT This patch adds basic handling of PERF_RECORD_BPF_EVENT. Tracking of PERF_RECORD_BPF_EVENT is OFF by default. Option --bpf-event is added to turn it on. Committer notes: Add dummy machine__process_bpf_event() variant that returns zero for systems without HAVE_LIBBPF_SUPPORT, such as Alpine Linux, unbreaking the build in such systems. Remove the needless include from bpf->event.h, provide just forward declarations for the structs and unions in the parameters, to reduce compilation time and needless rebuilds when machine.h gets changed. Committer testing: When running with: # perf record --bpf-event On an older kernel where PERF_RECORD_BPF_EVENT and PERF_RECORD_KSYMBOL is not present, we fallback to removing those two bits from perf_event_attr, making the tool to continue to work on older kernels: perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all1 exclude_guest1 mmap21 comm_exec1 ksymbol 1 bpf_event1 sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off bpf_event perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all1 exclude_guest1 mmap21 comm_exec1 ksymbol 1 sys_perf_event_open: pid 5779 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off ksymbol perf_event_attr: size 112 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|PERIOD read_format ID disabled 1 inherit 1 mmap 1 comm 1 
freq 1 enable_on_exec 1 task 1 precise_ip 3 sample_id_all1 exclude_guest1 mmap21 comm_exec1 And then proceeds to work without those two features. As passing --bpf-event is an explicit action performed by the user, perhaps we should emit a warning telling that the kernel has no such feature, but this can be done on top of this patch. Now with a kernel that supports these events, start the 'record --bpf-event -a' and then run 'perf trace sleep 1' that will use the BPF augmented_raw_syscalls.o prebuilt (for another kernel version even) and thus should generate PERF_RECORD_BPF_EVENT events: [root@quaco ~]# perf record -e dummy -a --bpf-event ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.713 MB perf.data ] [root@quaco ~]# bpftool prog 13: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 14: cgroup_skb tag 2a142ef67aaad174 gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids 13,14 15: cgroup_skb tag 7be49e3934a125ba gpl loaded_at 2019-01-19T09:09:43-0300 uid 0 xlated 296B jited 229B memlock 4096B map_ids
[tip:perf/core] perf tools: Handle PERF_RECORD_KSYMBOL
Commit-ID: 9aa0bfa370b278a539077002b3c660468d66b5e7 Gitweb: https://git.kernel.org/tip/9aa0bfa370b278a539077002b3c660468d66b5e7 Author: Song Liu AuthorDate: Thu, 17 Jan 2019 08:15:17 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 21 Jan 2019 17:00:57 -0300 perf tools: Handle PERF_RECORD_KSYMBOL This patch handles PERF_RECORD_KSYMBOL in perf record/report. Specifically, map and symbol are created for ksymbol register, and removed for ksymbol unregister. This patch also sets perf_event_attr.ksymbol properly. The flag is ON by default. Committer notes: Use proper inttypes.h for u64, fixing the build in some environments like in the android NDK r15c targetting ARM 32-bit. I.e. fixing this build error: util/event.c: In function 'perf_event__fprintf_ksymbol': util/event.c:1489:10: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=] event->ksymbol_event.flags, event->ksymbol_event.name); ^ cc1: all warnings being treated as errors Signed-off-by: Song Liu Reviewed-by: Arnaldo Carvalho de Melo Tested-by: Arnaldo Carvalho de Melo Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Peter Zijlstra Cc: kernel-t...@fb.com Cc: net...@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-6-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/event.c | 21 ++ tools/perf/util/event.h | 20 + tools/perf/util/evsel.c | 10 - tools/perf/util/evsel.h | 1 + tools/perf/util/machine.c | 55 +++ tools/perf/util/machine.h | 3 +++ tools/perf/util/session.c | 4 tools/perf/util/tool.h| 4 +++- 8 files changed, 116 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index 937a5a4f71cc..f06f3811b25b 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -24,6 +24,7 @@ #include "symbol/kallsyms.h" #include "asm/bug.h" #include "stat.h" +#include "session.h" #define DEFAULT_PROC_MAP_PARSE_TIMEOUT 500 @@ -45,6 +46,7 @@ static const char 
*perf_event__names[] = { [PERF_RECORD_SWITCH]= "SWITCH", [PERF_RECORD_SWITCH_CPU_WIDE] = "SWITCH_CPU_WIDE", [PERF_RECORD_NAMESPACES]= "NAMESPACES", + [PERF_RECORD_KSYMBOL] = "KSYMBOL", [PERF_RECORD_HEADER_ATTR] = "ATTR", [PERF_RECORD_HEADER_EVENT_TYPE] = "EVENT_TYPE", [PERF_RECORD_HEADER_TRACING_DATA] = "TRACING_DATA", @@ -1329,6 +1331,14 @@ int perf_event__process_switch(struct perf_tool *tool __maybe_unused, return machine__process_switch_event(machine, event); } +int perf_event__process_ksymbol(struct perf_tool *tool __maybe_unused, + union perf_event *event, + struct perf_sample *sample __maybe_unused, + struct machine *machine) +{ + return machine__process_ksymbol(machine, event, sample); +} + size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp) { return fprintf(fp, " %d/%d: [%#" PRIx64 "(%#" PRIx64 ") @ %#" PRIx64 "]: %c %s\n", @@ -1461,6 +1471,14 @@ static size_t perf_event__fprintf_lost(union perf_event *event, FILE *fp) return fprintf(fp, " lost %" PRIu64 "\n", event->lost.lost); } +size_t perf_event__fprintf_ksymbol(union perf_event *event, FILE *fp) +{ + return fprintf(fp, " ksymbol event with addr %" PRIx64 " len %u type %u flags 0x%x name %s\n", + event->ksymbol_event.addr, event->ksymbol_event.len, + event->ksymbol_event.ksym_type, + event->ksymbol_event.flags, event->ksymbol_event.name); +} + size_t perf_event__fprintf(union perf_event *event, FILE *fp) { size_t ret = fprintf(fp, "PERF_RECORD_%s", @@ -1496,6 +1514,9 @@ size_t perf_event__fprintf(union perf_event *event, FILE *fp) case PERF_RECORD_LOST: ret += perf_event__fprintf_lost(event, fp); break; + case PERF_RECORD_KSYMBOL: + ret += perf_event__fprintf_ksymbol(event, fp); + break; default: ret += fprintf(fp, "\n"); } diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h index eb95f3384958..018322f2a13e 100644 --- a/tools/perf/util/event.h +++ b/tools/perf/util/event.h @@ -5,6 +5,7 @@ #include #include #include +#include #include "../perf.h" #include "build-id.h" @@ -84,6 
+85,19 @@ struct throttle_event { u64 stream_id; }; +#ifndef KSYM_NAME_LEN +#define KSYM_NAME_LEN 256 +#endif + +struct ksymbol_event { + struct perf_event_header header; + u64 addr; + u32 len; + u16 ksym_type; + u16 flags; + char name[KSYM_NAME_LEN]; +}; + #define PERF_SAMPLE_MASK
[tip:perf/core] tools headers uapi: Sync tools/include/uapi/linux/perf_event.h
Commit-ID: df063c83aa2c58412ddf533ada9aaf25986120ec Gitweb: https://git.kernel.org/tip/df063c83aa2c58412ddf533ada9aaf25986120ec Author: Song Liu AuthorDate: Thu, 17 Jan 2019 08:15:16 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 21 Jan 2019 17:00:57 -0300 tools headers uapi: Sync tools/include/uapi/linux/perf_event.h Sync for PERF_RECORD_BPF_EVENT. Signed-off-by: Song Liu Reviewed-by: Arnaldo Carvalho de Melo Tested-by: Arnaldo Carvalho de Melo Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Peter Zijlstra Cc: kernel-t...@fb.com Cc: net...@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-5-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/include/uapi/linux/perf_event.h | 29 - 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 1dee5c8f166b..7198ddd0c6b1 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -373,7 +373,8 @@ struct perf_event_attr { write_backward : 1, /* Write ring buffer from end to beginning */ namespaces : 1, /* include namespaces data */ ksymbol: 1, /* include ksymbol events */ - __reserved_1 : 34; + bpf_event : 1, /* include bpf events */ + __reserved_1 : 33; union { __u32 wakeup_events;/* wakeup every n events */ @@ -979,6 +980,25 @@ enum perf_event_type { */ PERF_RECORD_KSYMBOL = 17, + /* +* Record bpf events: +* enum perf_bpf_event_type { +* PERF_BPF_EVENT_UNKNOWN = 0, +* PERF_BPF_EVENT_PROG_LOAD= 1, +* PERF_BPF_EVENT_PROG_UNLOAD = 2, +* }; +* +* struct { +* struct perf_event_headerheader; +* u16 type; +* u16 flags; +* u32 id; +* u8 tag[BPF_TAG_SIZE]; +* struct sample_idsample_id; +* }; +*/ + PERF_RECORD_BPF_EVENT = 18, + PERF_RECORD_MAX,/* non-ABI */ }; @@ -990,6 +1010,13 @@ enum perf_record_ksymbol_type { #define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0) +enum perf_bpf_event_type { + PERF_BPF_EVENT_UNKNOWN = 0, + PERF_BPF_EVENT_PROG_LOAD= 1, + 
PERF_BPF_EVENT_PROG_UNLOAD = 2, + PERF_BPF_EVENT_MAX, /* non-ABI */ +}; + #define PERF_MAX_STACK_DEPTH 127 #define PERF_MAX_CONTEXTS_PER_STACK 8
[tip:perf/core] perf, bpf: Introduce PERF_RECORD_BPF_EVENT
Commit-ID: 6ee52e2a3fe4ea35520720736e6791df1fb67106 Gitweb: https://git.kernel.org/tip/6ee52e2a3fe4ea35520720736e6791df1fb67106 Author: Song Liu AuthorDate: Thu, 17 Jan 2019 08:15:15 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 21 Jan 2019 17:00:57 -0300 perf, bpf: Introduce PERF_RECORD_BPF_EVENT For better performance analysis of BPF programs, this patch introduces PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program load/unload information to user space. Each BPF program may contain up to BPF_MAX_SUBPROGS (256) sub programs. The following example shows kernel symbols for a BPF program with 7 sub programs: a0257cf9 t bpf_prog_b07ccb89267cf242_F a02592e1 t bpf_prog_2dcecc18072623fc_F a025b0e9 t bpf_prog_bb7a405ebaec5d5c_F a025dd2c t bpf_prog_a7540d4a39ec1fc7_F a025fcca t bpf_prog_05762d4ade0e3737_F a026108f t bpf_prog_db4bd11e35df90d4_F a0263f00 t bpf_prog_89d64e4abf0f0126_F a0257cf9 t bpf_prog_ae31629322c4b018__dummy_tracepoi When a bpf program is loaded, PERF_RECORD_KSYMBOL is generated for each of these sub programs. Therefore, PERF_RECORD_BPF_EVENT is not needed for simple profiling. For annotation, user space need to listen to PERF_RECORD_BPF_EVENT and gather more information about these (sub) programs via sys_bpf. 
Signed-off-by: Song Liu Reviewed-by: Arnaldo Carvalho de Melo Acked-by: Alexei Starovoitov Acked-by: Peter Zijlstra (Intel) Tested-by: Arnaldo Carvalho de Melo Cc: Daniel Borkmann Cc: Peter Zijlstra Cc: kernel-t...@fb.com Cc: net...@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-4-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- include/linux/filter.h | 7 +++ include/linux/perf_event.h | 6 +++ include/uapi/linux/perf_event.h | 29 +- kernel/bpf/core.c | 2 +- kernel/bpf/syscall.c| 2 + kernel/events/core.c| 115 6 files changed, 159 insertions(+), 2 deletions(-) diff --git a/include/linux/filter.h b/include/linux/filter.h index ad106d845b22..d531d4250bff 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -951,6 +951,7 @@ bpf_address_lookup(unsigned long addr, unsigned long *size, void bpf_prog_kallsyms_add(struct bpf_prog *fp); void bpf_prog_kallsyms_del(struct bpf_prog *fp); +void bpf_get_prog_name(const struct bpf_prog *prog, char *sym); #else /* CONFIG_BPF_JIT */ @@ -1006,6 +1007,12 @@ static inline void bpf_prog_kallsyms_add(struct bpf_prog *fp) static inline void bpf_prog_kallsyms_del(struct bpf_prog *fp) { } + +static inline void bpf_get_prog_name(const struct bpf_prog *prog, char *sym) +{ + sym[0] = '\0'; +} + #endif /* CONFIG_BPF_JIT */ void bpf_prog_kallsyms_del_subprogs(struct bpf_prog *fp); diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 136fe0495374..a79e59fc3b7d 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1125,6 +1125,9 @@ extern void perf_event_mmap(struct vm_area_struct *vma); extern void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len, bool unregister, const char *sym); +extern void perf_event_bpf_event(struct bpf_prog *prog, +enum perf_bpf_event_type type, +u16 flags); extern struct perf_guest_info_callbacks *perf_guest_cbs; extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks); @@ -1350,6 +1353,9 
@@ static inline void perf_event_mmap(struct vm_area_struct *vma){ } typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data); static inline void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len, bool unregister, const char *sym) { } +static inline void perf_event_bpf_event(struct bpf_prog *prog, + enum perf_bpf_event_type type, + u16 flags) { } static inline void perf_event_exec(void) { } static inline void perf_event_comm(struct task_struct *tsk, bool exec) { } static inline void perf_event_namespaces(struct task_struct *tsk) { } diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 1dee5c8f166b..7198ddd0c6b1 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -373,7 +373,8 @@ struct perf_event_attr { write_backward : 1, /* Write ring buffer from end to beginning */ namespaces : 1, /* include namespaces data */ ksymbol: 1, /* include ksymbol events */ - __reserved_1 : 34; + bpf_event : 1,
[tip:perf/core] tools headers uapi: Sync tools/include/uapi/linux/perf_event.h
Commit-ID: d764ac6464915523e68e220b6aa4c3c2eb8e3f94 Gitweb: https://git.kernel.org/tip/d764ac6464915523e68e220b6aa4c3c2eb8e3f94 Author: Song Liu AuthorDate: Thu, 17 Jan 2019 08:15:14 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 21 Jan 2019 17:00:57 -0300 tools headers uapi: Sync tools/include/uapi/linux/perf_event.h Sync changes for PERF_RECORD_KSYMBOL. Signed-off-by: Song Liu Reviewed-by: Arnaldo Carvalho de Melo Tested-by: Arnaldo Carvalho de Melo Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Peter Zijlstra Cc: kernel-t...@fb.com Cc: net...@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-3-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/include/uapi/linux/perf_event.h | 26 +- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index ea19b5d491bf..1dee5c8f166b 100644 --- a/tools/include/uapi/linux/perf_event.h +++ b/tools/include/uapi/linux/perf_event.h @@ -372,7 +372,8 @@ struct perf_event_attr { context_switch : 1, /* context switch data */ write_backward : 1, /* Write ring buffer from end to beginning */ namespaces : 1, /* include namespaces data */ - __reserved_1 : 35; + ksymbol: 1, /* include ksymbol events */ + __reserved_1 : 34; union { __u32 wakeup_events;/* wakeup every n events */ @@ -963,9 +964,32 @@ enum perf_event_type { */ PERF_RECORD_NAMESPACES = 16, + /* +* Record ksymbol register/unregister events: +* +* struct { +* struct perf_event_headerheader; +* u64 addr; +* u32 len; +* u16 ksym_type; +* u16 flags; +* charname[]; +* struct sample_idsample_id; +* }; +*/ + PERF_RECORD_KSYMBOL = 17, + PERF_RECORD_MAX,/* non-ABI */ }; +enum perf_record_ksymbol_type { + PERF_RECORD_KSYMBOL_TYPE_UNKNOWN= 0, + PERF_RECORD_KSYMBOL_TYPE_BPF= 1, + PERF_RECORD_KSYMBOL_TYPE_MAX/* non-ABI */ +}; + +#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0) + #define PERF_MAX_STACK_DEPTH 127 #define PERF_MAX_CONTEXTS_PER_STACK 8
[tip:perf/core] perf, bpf: Introduce PERF_RECORD_KSYMBOL
Commit-ID: 76193a94522f1d4edf2447a536f3f796ce56343b Gitweb: https://git.kernel.org/tip/76193a94522f1d4edf2447a536f3f796ce56343b Author: Song Liu AuthorDate: Thu, 17 Jan 2019 08:15:13 -0800 Committer: Arnaldo Carvalho de Melo CommitDate: Mon, 21 Jan 2019 17:00:57 -0300 perf, bpf: Introduce PERF_RECORD_KSYMBOL For better performance analysis of dynamically JITed and loaded kernel functions, such as BPF programs, this patch introduces PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol register/unregister information to user space. The following data structure is used for PERF_RECORD_KSYMBOL. /* * struct { * struct perf_event_headerheader; * u64 addr; * u32 len; * u16 ksym_type; * u16 flags; * charname[]; * struct sample_idsample_id; * }; */ Signed-off-by: Song Liu Reviewed-by: Arnaldo Carvalho de Melo Tested-by: Arnaldo Carvalho de Melo Acked-by: Peter Zijlstra Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: Peter Zijlstra Cc: kernel-t...@fb.com Cc: net...@vger.kernel.org Link: http://lkml.kernel.org/r/20190117161521.1341602-2-songliubrav...@fb.com Signed-off-by: Arnaldo Carvalho de Melo --- include/linux/perf_event.h | 8 include/uapi/linux/perf_event.h | 26 ++- kernel/events/core.c| 98 - 3 files changed, 130 insertions(+), 2 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 4eb88065a9b5..136fe0495374 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1122,6 +1122,10 @@ static inline void perf_event_task_sched_out(struct task_struct *prev, } extern void perf_event_mmap(struct vm_area_struct *vma); + +extern void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len, + bool unregister, const char *sym); + extern struct perf_guest_info_callbacks *perf_guest_cbs; extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks); extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks); @@ -1342,6 +1346,10 @@ static inline int 
perf_unregister_guest_info_callbacks (struct perf_guest_info_callbacks *callbacks) { return 0; } static inline void perf_event_mmap(struct vm_area_struct *vma) { } + +typedef int (perf_ksymbol_get_name_f)(char *name, int name_len, void *data); +static inline void perf_event_ksymbol(u16 ksym_type, u64 addr, u32 len, + bool unregister, const char *sym) { } static inline void perf_event_exec(void) { } static inline void perf_event_comm(struct task_struct *tsk, bool exec) { } static inline void perf_event_namespaces(struct task_struct *tsk) { } diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index ea19b5d491bf..1dee5c8f166b 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -372,7 +372,8 @@ struct perf_event_attr { context_switch : 1, /* context switch data */ write_backward : 1, /* Write ring buffer from end to beginning */ namespaces : 1, /* include namespaces data */ - __reserved_1 : 35; + ksymbol: 1, /* include ksymbol events */ + __reserved_1 : 34; union { __u32 wakeup_events;/* wakeup every n events */ @@ -963,9 +964,32 @@ enum perf_event_type { */ PERF_RECORD_NAMESPACES = 16, + /* +* Record ksymbol register/unregister events: +* +* struct { +* struct perf_event_headerheader; +* u64 addr; +* u32 len; +* u16 ksym_type; +* u16 flags; +* charname[]; +* struct sample_idsample_id; +* }; +*/ + PERF_RECORD_KSYMBOL = 17, + PERF_RECORD_MAX,/* non-ABI */ }; +enum perf_record_ksymbol_type { + PERF_RECORD_KSYMBOL_TYPE_UNKNOWN= 0, + PERF_RECORD_KSYMBOL_TYPE_BPF= 1, + PERF_RECORD_KSYMBOL_TYPE_MAX/* non-ABI */ +}; + +#define PERF_RECORD_KSYMBOL_FLAGS_UNREGISTER (1 << 0) + #define PERF_MAX_STACK_DEPTH 127 #define PERF_MAX_CONTEXTS_PER_STACK 8 diff --git a/kernel/events/core.c b/kernel/events/core.c
[tip:perf/core] perf/core: Fix bad use of igrab()
Commit-ID: 9511bce9fe8e5e6c0f923c09243a713eba560141 Gitweb: https://git.kernel.org/tip/9511bce9fe8e5e6c0f923c09243a713eba560141 Author: Song Liu AuthorDate: Tue, 17 Apr 2018 23:29:07 -0700 Committer: Ingo Molnar CommitDate: Fri, 25 May 2018 08:11:10 +0200 perf/core: Fix bad use of igrab() As Miklos reported and suggested: "This pattern repeats two times in trace_uprobe.c and in kernel/events/core.c as well: ret = kern_path(filename, LOOKUP_FOLLOW, &path); if (ret) goto fail_address_parse; inode = igrab(d_inode(path.dentry)); path_put(&path); And it's wrong. You can only hold a reference to the inode if you have an active ref to the superblock as well (which is normally through path.mnt) or holding s_umount. This way unmounting the containing filesystem while the tracepoint is active will give you the "VFS: Busy inodes after unmount..." message and a crash when the inode is finally put. Solution: store path instead of inode." This patch fixes the issue in kernel/events/core.c. Reviewed-and-tested-by: Alexander Shishkin Reported-by: Miklos Szeredi Signed-off-by: Song Liu Signed-off-by: Peter Zijlstra (Intel) Cc: Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Fixes: 375637bc5249 ("perf/core: Introduce address range filtering") Link: http://lkml.kernel.org/r/20180418062907.3210386-2-songliubrav...@fb.com Signed-off-by: Ingo Molnar --- arch/x86/events/intel/pt.c | 4 ++-- include/linux/perf_event.h | 2 +- kernel/events/core.c | 21 + 3 files changed, 12 insertions(+), 15 deletions(-) diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c index 3b993942a0e4..8d016ce5b80d 100644 --- a/arch/x86/events/intel/pt.c +++ b/arch/x86/events/intel/pt.c @@ -1194,7 +1194,7 @@ static int pt_event_addr_filters_validate(struct list_head *filters) filter->action == PERF_ADDR_FILTER_ACTION_START) return -EOPNOTSUPP; - if (!filter->inode) { + if (!filter->path.dentry) { if 
(!valid_kernel_ip(filter->offset)) return -EINVAL; @@ -1221,7 +1221,7 @@ static void pt_event_addr_filters_sync(struct perf_event *event) return; list_for_each_entry(filter, &head->list, entry) { - if (filter->inode && !offs[range]) { + if (filter->path.dentry && !offs[range]) { msr_a = msr_b = 0; } else { /* apply the offset */ diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index def866f7269b..bea0b0cd4bf7 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -467,7 +467,7 @@ enum perf_addr_filter_action_t { */ struct perf_addr_filter { struct list_head entry; - struct inode *inode; + struct path path; unsigned long offset; unsigned long size; enum perf_addr_filter_action_t action; diff --git a/kernel/events/core.c b/kernel/events/core.c index ce6aa5ff3c96..24dea13a27ed 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -6668,7 +6668,7 @@ static void perf_event_addr_filters_exec(struct perf_event *event, void *data) raw_spin_lock_irqsave(&ifh->lock, flags); list_for_each_entry(filter, &ifh->list, entry) { - if (filter->inode) { + if (filter->path.dentry) { event->addr_filters_offs[count] = 0; restart++; } @@ -7333,7 +7333,7 @@ static bool perf_addr_filter_match(struct perf_addr_filter *filter, struct file *file, unsigned long offset, unsigned long size) { - if (filter->inode != file_inode(file)) + if (d_inode(filter->path.dentry) != file_inode(file)) return false; if (filter->offset > offset + size) @@ -8686,8 +8686,7 @@ static void free_filters_list(struct list_head *filters) struct perf_addr_filter *filter, *iter; list_for_each_entry_safe(filter, iter, filters, entry) { - if (filter->inode) - iput(filter->inode); + path_put(&filter->path); list_del(&filter->entry); kfree(filter); } @@ -8784,7 +8783,7 @@ static void perf_event_addr_filters_apply(struct perf_event *event) * Adjust base offset if the filter is associated to a binary * that needs to be mapped: */ - if (filter->inode) + if (filter->path.dentry) 
event->addr_filters_offs[count] = 
perf_addr_filter_apply(filter, mm); @@ -8858,7 +8857,6
[tip:perf/core] perf/core: Fix group scheduling with mixed hw and sw events
Commit-ID: a1150c202207cc8501bebc45b63c264f91959260 Gitweb: https://git.kernel.org/tip/a1150c202207cc8501bebc45b63c264f91959260 Author: Song Liu AuthorDate: Thu, 3 May 2018 12:47:16 -0700 Committer: Ingo Molnar CommitDate: Fri, 25 May 2018 08:11:10 +0200 perf/core: Fix group scheduling with mixed hw and sw events When hw and sw events are mixed in the same group, they are all attached to the hw perf_event_context. This sometimes requires moving group of perf_event to a different context. We found a bug in how the kernel handles this, for example if we do: perf stat -e '{faults,ref-cycles,faults}' -I 1000 1.005591180 1,297 faults 1.005591180 457,476,576 ref-cycles 1.005591180 <not supported> faults First, sw event "faults" is attached to the sw context, and becomes the group leader. Then, hw event "ref-cycles" is attached, so both events are moved to the hw context. Last, another sw "faults" tries to attach, but it fails because of mismatch between the new target ctx (from sw pmu) and the group_leader's ctx (hw context, same as ref-cycles). The broken condition is: group_leader is sw event; group_leader is on hw context; add a sw event to the group. Fix this scenario by checking group_leader's context (instead of just event type). If group_leader is on hw context, use the ->pmu of this context to look up context for the new event. 
Signed-off-by: Song Liu Signed-off-by: Peter Zijlstra (Intel) Cc: Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Fixes: b04243ef7006 ("perf: Complete software pmu grouping") Link: http://lkml.kernel.org/r/20180503194716.162815-1-songliubrav...@fb.com Signed-off-by: Ingo Molnar --- include/linux/perf_event.h | 8 kernel/events/core.c | 21 +++-- 2 files changed, 19 insertions(+), 10 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index e71e99eb9a4e..def866f7269b 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1016,6 +1016,14 @@ static inline int is_software_event(struct perf_event *event) return event->event_caps & PERF_EV_CAP_SOFTWARE; } +/* + * Return 1 for event in sw context, 0 for event in hw context + */ +static inline int in_software_context(struct perf_event *event) +{ + return event->ctx->pmu->task_ctx_nr == perf_sw_context; +} + extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX]; extern void ___perf_sw_event(u32, u64, struct pt_regs *, u64); diff --git a/kernel/events/core.c b/kernel/events/core.c index 67612ce359ad..ce6aa5ff3c96 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -10521,19 +10521,20 @@ SYSCALL_DEFINE5(perf_event_open, if (pmu->task_ctx_nr == perf_sw_context) event->event_caps |= PERF_EV_CAP_SOFTWARE; - if (group_leader && - (is_software_event(event) != is_software_event(group_leader))) { - if (is_software_event(event)) { + if (group_leader) { + if (is_software_event(event) && + !in_software_context(group_leader)) { /* -* If event and group_leader are not both a software -* event, and event is, then group leader is not. +* If the event is a sw event, but the group_leader +* is on hw context. * -* Allow the addition of software events to !software -* groups, this is safe because software events never -* fail to schedule. 
+* Allow the addition of software events to hw +* groups, this is safe because software events +* never fail to schedule. */ - pmu = group_leader->pmu; - } else if (is_software_event(group_leader) && + pmu = group_leader->ctx->pmu; + } else if (!is_software_event(event) && + is_software_event(group_leader) && (group_leader->group_caps & PERF_EV_CAP_SOFTWARE)) { /* * In case the group is a pure software group, and we
[tip:perf/core] perf/core: Fix group scheduling with mixed hw and sw events
Commit-ID:  a1150c202207cc8501bebc45b63c264f91959260
Gitweb:     https://git.kernel.org/tip/a1150c202207cc8501bebc45b63c264f91959260
Author:     Song Liu
AuthorDate: Thu, 3 May 2018 12:47:16 -0700
Committer:  Ingo Molnar
CommitDate: Fri, 25 May 2018 08:11:10 +0200

perf/core: Fix group scheduling with mixed hw and sw events

When hw and sw events are mixed in the same group, they are all attached
to the hw perf_event_context. This sometimes requires moving group of
perf_event to a different context.

We found a bug in how the kernel handles this, for example if we do:

   perf stat -e '{faults,ref-cycles,faults}' -I 1000

     1.005591180              1,297      faults
     1.005591180        457,476,576      ref-cycles
     1.005591180    <not supported>      faults

First, sw event "faults" is attached to the sw context, and becomes the
group leader. Then, hw event "ref-cycles" is attached, so both events
are moved to the hw context. Last, another sw "faults" tries to attach,
but it fails because of mismatch between the new target ctx (from sw
pmu) and the group_leader's ctx (hw context, same as ref-cycles).

The broken condition is:
   group_leader is sw event;
   group_leader is on hw context;
   add a sw event to the group.

Fix this scenario by checking group_leader's context (instead of just
event type). If group_leader is on hw context, use the ->pmu of this
context to look up context for the new event.

Signed-off-by: Song Liu
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Vince Weaver
Fixes: b04243ef7006 ("perf: Complete software pmu grouping")
Link: http://lkml.kernel.org/r/20180503194716.162815-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar
---
 include/linux/perf_event.h |  8 ++++++++
 kernel/events/core.c       | 21 +++++++++++----------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e71e99eb9a4e..def866f7269b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1016,6 +1016,14 @@ static inline int is_software_event(struct perf_event *event)
 	return event->event_caps & PERF_EV_CAP_SOFTWARE;
 }

+/*
+ * Return 1 for event in sw context, 0 for event in hw context
+ */
+static inline int in_software_context(struct perf_event *event)
+{
+	return event->ctx->pmu->task_ctx_nr == perf_sw_context;
+}
+
 extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];

 extern void ___perf_sw_event(u32, u64, struct pt_regs *, u64);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 67612ce359ad..ce6aa5ff3c96 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10521,19 +10521,20 @@ SYSCALL_DEFINE5(perf_event_open,
 	if (pmu->task_ctx_nr == perf_sw_context)
 		event->event_caps |= PERF_EV_CAP_SOFTWARE;

-	if (group_leader &&
-	    (is_software_event(event) != is_software_event(group_leader))) {
-		if (is_software_event(event)) {
+	if (group_leader) {
+		if (is_software_event(event) &&
+		    !in_software_context(group_leader)) {
 			/*
-			 * If event and group_leader are not both a software
-			 * event, and event is, then group leader is not.
+			 * If the event is a sw event, but the group_leader
+			 * is on hw context.
 			 *
-			 * Allow the addition of software events to !software
-			 * groups, this is safe because software events never
-			 * fail to schedule.
+			 * Allow the addition of software events to hw
+			 * groups, this is safe because software events
+			 * never fail to schedule.
 			 */
-			pmu = group_leader->pmu;
-		} else if (is_software_event(group_leader) &&
+			pmu = group_leader->ctx->pmu;
+		} else if (!is_software_event(event) &&
+			   is_software_event(group_leader) &&
 			   (group_leader->group_caps & PERF_EV_CAP_SOFTWARE)) {
 			/*
 			 * In case the group is a pure software group, and we
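The context-selection rule described in the commit message above can be modelled outside the kernel. The following is only a simplified sketch: struct model_event, enum model_ctx, old_target_ctx() and new_target_ctx() are hypothetical names invented for illustration, not kernel API, and the model ignores the move_group path entirely. It shows why comparing event types alone misses a sw group leader that has already been moved to the hw context.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state, for illustration only. */
enum model_ctx { MODEL_SW_CONTEXT, MODEL_HW_CONTEXT };

struct model_event {
	bool is_software;	/* plays the role of is_software_event() */
	enum model_ctx ctx;	/* context the event currently lives on */
};

/* Context an event would get from its own PMU, ignoring the group. */
static enum model_ctx own_ctx(const struct model_event *event)
{
	return event->is_software ? MODEL_SW_CONTEXT : MODEL_HW_CONTEXT;
}

/*
 * Old rule: only the event *types* are compared. A sw leader that was
 * already moved to the hw context looks identical to the new sw event,
 * so the new event stays on the sw context and the contexts mismatch.
 */
static enum model_ctx old_target_ctx(const struct model_event *event,
				     const struct model_event *leader)
{
	if (event->is_software && !leader->is_software)
		return MODEL_HW_CONTEXT;	/* follow the hw leader */
	return own_ctx(event);
}

/*
 * New rule: look at the context the leader is actually on. A sw event
 * joining a leader that sits on the hw context follows it there.
 */
static enum model_ctx new_target_ctx(const struct model_event *event,
				     const struct model_event *leader)
{
	if (event->is_software && leader->ctx != MODEL_SW_CONTEXT)
		return leader->ctx;
	return own_ctx(event);
}
```

With a leader that is a sw event sitting on the hw context (the state after "ref-cycles" joined the group), the old rule picks the sw context for the second "faults" event while the new rule picks the hw context, which matches the leader.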
[tip:perf/urgent] trace_kprobe: Remove warning message "Could not insert probe at..."
Commit-ID:  5c8dad48e4f53d6fd0a7e4f95d7c1c983374de88
Gitweb:     https://git.kernel.org/tip/5c8dad48e4f53d6fd0a7e4f95d7c1c983374de88
Author:     Song Liu
AuthorDate: Fri, 13 Apr 2018 11:55:13 -0700
Committer:  Ingo Molnar
CommitDate: Tue, 17 Apr 2018 07:54:57 +0200

trace_kprobe: Remove warning message "Could not insert probe at..."

This warning message is not very helpful, as the return value should
already show information about the error. Also, this message will
spam dmesg if the user space does testing in a loop, like:

  for x in {0..5}
  do
    echo p:xx xx+$x >> /sys/kernel/debug/tracing/kprobe_events
  done

Reported-by: Vince Weaver
Signed-off-by: Song Liu
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: kernel-t...@fb.com
Link: http://lkml.kernel.org/r/20180413185513.3626052-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar
---
 kernel/trace/trace_kprobe.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 1cd3fb4d70f8..02aed76e0978 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -512,8 +512,6 @@ static int __register_trace_kprobe(struct trace_kprobe *tk)
 	if (ret == 0)
 		tk->tp.flags |= TP_FLAG_REGISTERED;
 	else {
-		pr_warn("Could not insert probe at %s+%lu: %d\n",
-			trace_kprobe_symbol(tk), trace_kprobe_offset(tk), ret);
 		if (ret == -ENOENT && trace_kprobe_is_on_module(tk)) {
 			pr_warn("This probe might be able to register after target module is loaded. Continue.\n");
 			ret = 0;
[tip:perf/urgent] perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()
Commit-ID:  32e6e967fb36bf77ed99221ae3ce1909f045d8f9
Gitweb:     https://git.kernel.org/tip/32e6e967fb36bf77ed99221ae3ce1909f045d8f9
Author:     Song Liu
AuthorDate: Wed, 11 Apr 2018 18:02:37 +0000
Committer:  Ingo Molnar
CommitDate: Thu, 12 Apr 2018 09:55:50 +0200

perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()

Non-root user cannot create kprobe or uprobe through the text-based
interface (kprobe_events, uprobe_events), so they should not be able
to create probes via perf_event_open() either.

Reported-by: Vince Weaver
Signed-off-by: Song Liu
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Fixes: 33ea4b24277b ("perf/core: Implement the 'perf_uprobe' PMU")
Fixes: e12f03d7031a ("perf/core: Implement the 'perf_kprobe' PMU")
Link: http://lkml.kernel.org/r/c0b2efb5-c403-4bdb-9046-c14b3ee66...@fb.com
Signed-off-by: Ingo Molnar
---
 kernel/events/core.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index d7af82827373..2d5fe26551f8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -8400,6 +8400,10 @@ static int perf_kprobe_event_init(struct perf_event *event)

 	if (event->attr.type != perf_kprobe.type)
 		return -ENOENT;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
 	/*
 	 * no branch sampling for probe events
 	 */
@@ -8437,6 +8441,10 @@ static int perf_uprobe_event_init(struct perf_event *event)

 	if (event->attr.type != perf_uprobe.type)
 		return -ENOENT;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
 	/*
 	 * no branch sampling for probe events
 	 */
[tip:perf/urgent] perf/cgroup: Fix child event counting bug
Commit-ID:  c917e0f259908e75bd2a65877e25f9d90c22c848
Gitweb:     https://git.kernel.org/tip/c917e0f259908e75bd2a65877e25f9d90c22c848
Author:     Song Liu
AuthorDate: Mon, 12 Mar 2018 09:59:43 -0700
Committer:  Ingo Molnar
CommitDate: Tue, 20 Mar 2018 08:58:47 +0100

perf/cgroup: Fix child event counting bug

When a perf_event is attached to parent cgroup, it should count events
for all children cgroups:

   parent_group   <---- perf_event
     \
      - child_group  <---- process(es)

However, in our tests, we found this perf_event cannot report reliable
results. Here is an example case:

  # create cgroups
  mkdir -p /sys/fs/cgroup/p/c
  # start perf for parent group
  perf stat -e instructions -G "p"

  # on another console, run test process in child cgroup:
  stressapptest -s 2 -M 1000 & echo $! > /sys/fs/cgroup/p/c/cgroup.procs

  # after the test process is done, stop perf in the first console shows

       <not counted>      instructions              p

The instruction should not be "not counted" as the process runs in the
child cgroup.

We found this is because perf_event->cgrp and cpuctx->cgrp are not
identical, thus perf_event->cgrp are not updated properly.

This patch fixes this by updating perf_cgroup properly for ancestor
cgroup(s).

Reported-by: Ephraim Park
Signed-off-by: Song Liu
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Vince Weaver
Link: http://lkml.kernel.org/r/20180312165943.1057894-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar
---
 kernel/events/core.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4b838470fac4..709a55b9ad97 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -724,9 +724,15 @@ static inline void __update_cgrp_time(struct perf_cgroup *cgrp)

 static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx)
 {
-	struct perf_cgroup *cgrp_out = cpuctx->cgrp;
-	if (cgrp_out)
-		__update_cgrp_time(cgrp_out);
+	struct perf_cgroup *cgrp = cpuctx->cgrp;
+	struct cgroup_subsys_state *css;
+
+	if (cgrp) {
+		for (css = &cgrp->css; css; css = css->parent) {
+			cgrp = container_of(css, struct perf_cgroup, css);
+			__update_cgrp_time(cgrp);
+		}
+	}
 }

 static inline void update_cgrp_time_from_event(struct perf_event *event)
@@ -754,6 +760,7 @@ perf_cgroup_set_timestamp(struct task_struct *task,
 {
 	struct perf_cgroup *cgrp;
 	struct perf_cgroup_info *info;
+	struct cgroup_subsys_state *css;

 	/*
 	 * ctx->lock held by caller
@@ -764,8 +771,12 @@ perf_cgroup_set_timestamp(struct task_struct *task,
 		return;

 	cgrp = perf_cgroup_from_task(task, ctx);
-	info = this_cpu_ptr(cgrp->info);
-	info->timestamp = ctx->timestamp;
+
+	for (css = &cgrp->css; css; css = css->parent) {
+		cgrp = container_of(css, struct perf_cgroup, css);
+		info = this_cpu_ptr(cgrp->info);
+		info->timestamp = ctx->timestamp;
+	}
 }

 static DEFINE_PER_CPU(struct list_head, cgrp_cpuctx_list);
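The ancestor walk that this patch adds can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel code: struct model_cgroup and set_timestamp_with_ancestors() are hypothetical names, the per-CPU info structure is collapsed into a single field, and the css/container_of() indirection is replaced by a plain parent pointer.

```c
#include <stddef.h>

/* Hypothetical, simplified cgroup node; parent is NULL for the root. */
struct model_cgroup {
	struct model_cgroup *parent;
	unsigned long long timestamp;	/* stands in for info->timestamp */
};

/*
 * Stamp the task's cgroup and every ancestor, so that an event attached
 * to a parent cgroup sees an up-to-date timestamp when a task in a
 * child cgroup is scheduled in.
 */
static void set_timestamp_with_ancestors(struct model_cgroup *cgrp,
					 unsigned long long now)
{
	for (; cgrp; cgrp = cgrp->parent)
		cgrp->timestamp = now;
}
```

Before the fix, the equivalent of this loop touched only the first node, which is why an event on "p" never saw time accrue for a task running in "p/c".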
[tip:perf/urgent] perf/core: Fix ctx_event_type in ctx_resched()
Commit-ID:  bd903afeb504db5655a45bb4cf86f38be5b1bf62
Gitweb:     https://git.kernel.org/tip/bd903afeb504db5655a45bb4cf86f38be5b1bf62
Author:     Song Liu
AuthorDate: Mon, 5 Mar 2018 21:55:04 -0800
Committer:  Ingo Molnar
CommitDate: Fri, 9 Mar 2018 08:03:02 +0100

perf/core: Fix ctx_event_type in ctx_resched()

In ctx_resched(), EVENT_FLEXIBLE should be sched_out when EVENT_PINNED is
added. However, ctx_resched() calculates ctx_event_type before checking
this condition. As a result, pinned events will NOT get higher priority
than flexible events.

The following shows this issue on an Intel CPU (where ref-cycles can
only use one hardware counter).

  1. First start:
       perf stat -C 0 -e ref-cycles -I 1000
  2. Then, in the second console, run:
       perf stat -C 0 -e ref-cycles:D -I 1000

The second perf uses pinned events, which is expected to have higher
priority. However, because it failed in ctx_resched(), it is never
run.

This patch fixes this by calculating ctx_event_type after re-evaluating
event_type.

Reported-by: Ephraim Park
Signed-off-by: Song Liu
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Vince Weaver
Fixes: 487f05e18aa4 ("perf/core: Optimize event rescheduling on active contexts")
Link: http://lkml.kernel.org/r/20180306055504.3283731-1-songliubrav...@fb.com
Signed-off-by: Ingo Molnar
---
 kernel/events/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 96db9ae5d5af..4b838470fac4 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2246,7 +2246,7 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
 			struct perf_event_context *task_ctx,
 			enum event_type_t event_type)
 {
-	enum event_type_t ctx_event_type = event_type & EVENT_ALL;
+	enum event_type_t ctx_event_type;
 	bool cpu_event = !!(event_type & EVENT_CPU);

 	/*
@@ -2256,6 +2256,8 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
 	if (event_type & EVENT_PINNED)
 		event_type |= EVENT_FLEXIBLE;

+	ctx_event_type = event_type & EVENT_ALL;
+
 	perf_pmu_disable(cpuctx->ctx.pmu);
 	if (task_ctx)
 		task_ctx_sched_out(cpuctx, task_ctx, event_type);
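The ordering problem fixed here is easy to reproduce with plain bit flags. The sketch below is a standalone model, not the kernel function: the MODEL_EVENT_* constants and the two helper functions are hypothetical names, and the flag values are chosen for illustration (they are assumed to resemble, but are not guaranteed to equal, the kernel's enum event_type_t).

```c
/* Illustrative flag values; assumed, not taken from the kernel headers. */
enum {
	MODEL_EVENT_FLEXIBLE	= 0x1,
	MODEL_EVENT_PINNED	= 0x2,
	MODEL_EVENT_CPU		= 0x8,
	MODEL_EVENT_ALL		= MODEL_EVENT_FLEXIBLE | MODEL_EVENT_PINNED,
};

/* Before the fix: the mask is taken before FLEXIBLE is added. */
static unsigned int ctx_event_type_before_fix(unsigned int event_type)
{
	unsigned int ctx_event_type = event_type & MODEL_EVENT_ALL;

	if (event_type & MODEL_EVENT_PINNED)
		event_type |= MODEL_EVENT_FLEXIBLE;	/* too late to matter */

	(void)event_type;
	return ctx_event_type;
}

/* After the fix: FLEXIBLE is added first, then the mask is taken. */
static unsigned int ctx_event_type_after_fix(unsigned int event_type)
{
	if (event_type & MODEL_EVENT_PINNED)
		event_type |= MODEL_EVENT_FLEXIBLE;

	return event_type & MODEL_EVENT_ALL;
}
```

For a pinned CPU event, the first variant yields only the pinned bit, so flexible events are not scheduled out of the CPU context; the second yields both bits, which is the behaviour the commit message asks for.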
[tip:perf/core] perf/core: Implement the 'perf_uprobe' PMU
Commit-ID:  33ea4b24277b06dbc55d7f5772a46f029600255e
Gitweb:     https://git.kernel.org/tip/33ea4b24277b06dbc55d7f5772a46f029600255e
Author:     Song Liu
AuthorDate: Wed, 6 Dec 2017 14:45:16 -0800
Committer:  Ingo Molnar
CommitDate: Tue, 6 Feb 2018 11:29:28 +0100

perf/core: Implement the 'perf_uprobe' PMU

This patch adds perf_uprobe support with similar pattern as previous
patch (for kprobe).

Two functions, create_local_trace_uprobe() and
destroy_local_trace_uprobe(), are created so a uprobe can be created
and attached to the file descriptor created by perf_event_open().

Signed-off-by: Song Liu
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Yonghong Song
Reviewed-by: Josef Bacik
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20171206224518.3598254-7-songliubrav...@fb.com
Signed-off-by: Ingo Molnar
---
 include/linux/trace_events.h    |  4 ++
 kernel/events/core.c            | 48 ++-
 kernel/trace/trace_event_perf.c | 53 +
 kernel/trace/trace_probe.h      |  4 ++
 kernel/trace/trace_uprobe.c     | 86 +
 5 files changed, 186 insertions(+), 9 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 21c5d43..0d9d6cb 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -537,6 +537,10 @@ extern void perf_trace_del(struct perf_event *event, int flags);
 extern int perf_kprobe_init(struct perf_event *event, bool is_retprobe);
 extern void perf_kprobe_destroy(struct perf_event *event);
 #endif
+#ifdef CONFIG_UPROBE_EVENTS
+extern int perf_uprobe_init(struct perf_event *event, bool is_retprobe);
+extern void perf_uprobe_destroy(struct perf_event *event);
+#endif
 extern int ftrace_profile_set_filter(struct perf_event *event, int event_id,
 				     char *filter_str);
 extern void ftrace_profile_free_filter(struct perf_event *event);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3337355..5a54630 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7992,7 +7992,7 @@ static struct pmu perf_tracepoint = {
 	.read		= perf_swevent_read,
 };

-#ifdef CONFIG_KPROBE_EVENTS
+#if defined(CONFIG_KPROBE_EVENTS) || defined(CONFIG_UPROBE_EVENTS)
 /*
  * Flags in config, used by dynamic PMU kprobe and uprobe
  * The flags should match following PMU_FORMAT_ATTR().
@@ -8020,7 +8020,9 @@ static const struct attribute_group *probe_attr_groups[] = {
 	&probe_format_group,
 	NULL,
 };
+#endif

+#ifdef CONFIG_KPROBE_EVENTS
 static int perf_kprobe_event_init(struct perf_event *event);
 static struct pmu perf_kprobe = {
 	.task_ctx_nr	= perf_sw_context,
@@ -8057,12 +8059,52 @@ static int perf_kprobe_event_init(struct perf_event *event)
 }
 #endif /* CONFIG_KPROBE_EVENTS */

+#ifdef CONFIG_UPROBE_EVENTS
+static int perf_uprobe_event_init(struct perf_event *event);
+static struct pmu perf_uprobe = {
+	.task_ctx_nr	= perf_sw_context,
+	.event_init	= perf_uprobe_event_init,
+	.add		= perf_trace_add,
+	.del		= perf_trace_del,
+	.start		= perf_swevent_start,
+	.stop		= perf_swevent_stop,
+	.read		= perf_swevent_read,
+	.attr_groups	= probe_attr_groups,
+};
+
+static int perf_uprobe_event_init(struct perf_event *event)
+{
+	int err;
+	bool is_retprobe;
+
+	if (event->attr.type != perf_uprobe.type)
+		return -ENOENT;
+	/*
+	 * no branch sampling for probe events
+	 */
+	if (has_branch_stack(event))
+		return -EOPNOTSUPP;
+
+	is_retprobe = event->attr.config & PERF_PROBE_CONFIG_IS_RETPROBE;
+	err = perf_uprobe_init(event, is_retprobe);
+	if (err)
+		return err;
+
+	event->destroy = perf_uprobe_destroy;
+
+	return 0;
+}
+#endif /* CONFIG_UPROBE_EVENTS */
+
 static inline void perf_tp_register(void)
 {
 	perf_pmu_register(&perf_tracepoint, "tracepoint", PERF_TYPE_TRACEPOINT);
 #ifdef CONFIG_KPROBE_EVENTS
 	perf_pmu_register(&perf_kprobe, "kprobe", -1);
 #endif
+#ifdef CONFIG_UPROBE_EVENTS
+	perf_pmu_register(&perf_uprobe, "uprobe", -1);
+#endif
 }

 static void perf_event_free_filter(struct perf_event *event)
@@ -8151,6 +8193,10 @@ static inline bool perf_event_is_tracing(struct perf_event *event)
 	if (event->pmu == &perf_kprobe)
 		return true;
 #endif
+#ifdef CONFIG_UPROBE_EVENTS
+	if (event->pmu == &perf_uprobe)
+		return true;
+#endif
 	return false;
 }

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 779baad..2c41650 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -286,6 +286,59 @@ void perf_kprobe_destroy(struct perf_event *p_event)
 }
[tip:perf/core] perf/core: Implement the 'perf_kprobe' PMU
Commit-ID: e12f03d7031a977356e3d7b75a68c2185ff8d155 Gitweb: https://git.kernel.org/tip/e12f03d7031a977356e3d7b75a68c2185ff8d155 Author: Song LiuAuthorDate: Wed, 6 Dec 2017 14:45:15 -0800 Committer: Ingo Molnar CommitDate: Tue, 6 Feb 2018 11:29:26 +0100 perf/core: Implement the 'perf_kprobe' PMU A new PMU type, perf_kprobe is added. Based on attr from perf_event_open(), perf_kprobe creates a kprobe (or kretprobe) for the perf_event. This kprobe is private to this perf_event, and thus not added to global lists, and not available in tracefs. Two functions, create_local_trace_kprobe() and destroy_local_trace_kprobe() are added to created and destroy these local trace_kprobe. Signed-off-by: Song Liu Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Yonghong Song Reviewed-by: Josef Bacik Cc: Cc: Cc: Cc: Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20171206224518.3598254-6-songliubrav...@fb.com Signed-off-by: Ingo Molnar --- include/linux/trace_events.h| 4 ++ kernel/events/core.c| 142 ++-- kernel/trace/trace_event_perf.c | 49 ++ kernel/trace/trace_kprobe.c | 91 ++--- kernel/trace/trace_probe.h | 7 ++ 5 files changed, 250 insertions(+), 43 deletions(-) diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h index af44e7c..21c5d43 100644 --- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -533,6 +533,10 @@ extern int perf_trace_init(struct perf_event *event); extern void perf_trace_destroy(struct perf_event *event); extern int perf_trace_add(struct perf_event *event, int flags); extern void perf_trace_del(struct perf_event *event, int flags); +#ifdef CONFIG_KPROBE_EVENTS +extern int perf_kprobe_init(struct perf_event *event, bool is_retprobe); +extern void perf_kprobe_destroy(struct perf_event *event); +#endif extern int ftrace_profile_set_filter(struct perf_event *event, int event_id, char *filter_str); extern void 
ftrace_profile_free_filter(struct perf_event *event); diff --git a/kernel/events/core.c b/kernel/events/core.c index d99fe3f..3337355 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7992,9 +7992,77 @@ static struct pmu perf_tracepoint = { .read = perf_swevent_read, }; +#ifdef CONFIG_KPROBE_EVENTS +/* + * Flags in config, used by dynamic PMU kprobe and uprobe + * The flags should match following PMU_FORMAT_ATTR(). + * + * PERF_PROBE_CONFIG_IS_RETPROBE if set, create kretprobe/uretprobe + * if not set, create kprobe/uprobe + */ +enum perf_probe_config { + PERF_PROBE_CONFIG_IS_RETPROBE = 1U << 0, /* [k,u]retprobe */ +}; + +PMU_FORMAT_ATTR(retprobe, "config:0"); + +static struct attribute *probe_attrs[] = { + _attr_retprobe.attr, + NULL, +}; + +static struct attribute_group probe_format_group = { + .name = "format", + .attrs = probe_attrs, +}; + +static const struct attribute_group *probe_attr_groups[] = { + _format_group, + NULL, +}; + +static int perf_kprobe_event_init(struct perf_event *event); +static struct pmu perf_kprobe = { + .task_ctx_nr= perf_sw_context, + .event_init = perf_kprobe_event_init, + .add= perf_trace_add, + .del= perf_trace_del, + .start = perf_swevent_start, + .stop = perf_swevent_stop, + .read = perf_swevent_read, + .attr_groups= probe_attr_groups, +}; + +static int perf_kprobe_event_init(struct perf_event *event) +{ + int err; + bool is_retprobe; + + if (event->attr.type != perf_kprobe.type) + return -ENOENT; + /* +* no branch sampling for probe events +*/ + if (has_branch_stack(event)) + return -EOPNOTSUPP; + + is_retprobe = event->attr.config & PERF_PROBE_CONFIG_IS_RETPROBE; + err = perf_kprobe_init(event, is_retprobe); + if (err) + return err; + + event->destroy = perf_kprobe_destroy; + + return 0; +} +#endif /* CONFIG_KPROBE_EVENTS */ + static inline void perf_tp_register(void) { perf_pmu_register(_tracepoint, "tracepoint", PERF_TYPE_TRACEPOINT); +#ifdef CONFIG_KPROBE_EVENTS + perf_pmu_register(_kprobe, "kprobe", -1); 
+#endif
 }
 
 static void perf_event_free_filter(struct perf_event *event)
@@ -8071,13 +8139,28 @@ static void perf_event_free_bpf_handler(struct perf_event *event)
 }
 #endif
 
+/*
+ * returns true if the event is a tracepoint, or a kprobe/upprobe created
+ * with perf_event_open()
+ */
+static inline bool perf_event_is_tracing(struct perf_event *event)
+{
+	if (event->pmu == &perf_tracepoint)
+		return true;
+#ifdef CONFIG_KPROBE_EVENTS
+	if (event->pmu == &perf_kprobe)
+
[tip:perf/core] perf/headers: Sync new perf_event.h with the tools/include/uapi version
Commit-ID:  0d8dd67be013727ae57645ecd3ea2c36365d7da8
Gitweb:     https://git.kernel.org/tip/0d8dd67be013727ae57645ecd3ea2c36365d7da8
Author:     Song Liu
AuthorDate: Wed, 6 Dec 2017 14:45:14 -0800
Committer:  Ingo Molnar
CommitDate: Tue, 6 Feb 2018 10:18:05 +0100

perf/headers: Sync new perf_event.h with the tools/include/uapi version

perf_event.h was updated in the previous patch; this patch applies the same
changes to the tools/ version. This part is put in a separate patch in case
the two files are backported separately.

Signed-off-by: Song Liu
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Yonghong Song
Reviewed-by: Josef Bacik
Acked-by: Alexei Starovoitov
Cc:
Cc:
Cc:
Cc:
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20171206224518.3598254-5-songliubrav...@fb.com
Signed-off-by: Ingo Molnar
---
 tools/include/uapi/linux/perf_event.h | 4
 1 file changed, 4 insertions(+)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index c77c9a2..5d49cfc 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -380,10 +380,14 @@ struct perf_event_attr {
 	__u32			bp_type;
 	union {
 		__u64		bp_addr;
+		__u64		kprobe_func; /* for perf_kprobe */
+		__u64		uprobe_path; /* for perf_uprobe */
 		__u64		config1; /* extension of config */
 	};
 	union {
 		__u64		bp_len;
+		__u64		kprobe_addr; /* when kprobe_func == NULL */
+		__u64		probe_offset; /* for perf_[k,u]probe */
 		__u64		config2; /* extension of config1 */
 	};
 	__u64	branch_sample_type; /* enum perf_branch_sample_type */
[tip:perf/core] perf/core: Prepare perf_event.h for new types: 'perf_kprobe' and 'perf_uprobe'
Commit-ID:  65074d43fc77bcae32776724b7fa2696923c78e4
Gitweb:     https://git.kernel.org/tip/65074d43fc77bcae32776724b7fa2696923c78e4
Author:     Song Liu
AuthorDate: Wed, 6 Dec 2017 14:45:13 -0800
Committer:  Ingo Molnar
CommitDate: Tue, 6 Feb 2018 10:18:04 +0100

perf/core: Prepare perf_event.h for new types: 'perf_kprobe' and 'perf_uprobe'

Two new perf types, perf_kprobe and perf_uprobe, will be added to allow
creating [k,u]probes with perf_event_open(). These [k,u]probes are associated
with the file descriptor created by perf_event_open(), and are thus easy to
clean up when the file descriptor is destroyed.

kprobe_func and uprobe_path are added to union config1 as pointers to the
function name for a kprobe or the binary path for a uprobe.

kprobe_addr and probe_offset are added to union config2 for the kernel
address (when kprobe_func is NULL), or the [k,u]probe offset.

Signed-off-by: Song Liu
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Yonghong Song
Reviewed-by: Josef Bacik
Acked-by: Alexei Starovoitov
Cc:
Cc:
Cc:
Cc:
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20171206224518.3598254-4-songliubrav...@fb.com
Signed-off-by: Ingo Molnar
---
 include/uapi/linux/perf_event.h | 4
 1 file changed, 4 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index c77c9a2..5d49cfc 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -380,10 +380,14 @@ struct perf_event_attr {
 	__u32			bp_type;
 	union {
 		__u64		bp_addr;
+		__u64		kprobe_func; /* for perf_kprobe */
+		__u64		uprobe_path; /* for perf_uprobe */
 		__u64		config1; /* extension of config */
 	};
 	union {
 		__u64		bp_len;
+		__u64		kprobe_addr; /* when kprobe_func == NULL */
+		__u64		probe_offset; /* for perf_[k,u]probe */
 		__u64		config2; /* extension of config1 */
 	};
 	__u64	branch_sample_type; /* enum perf_branch_sample_type */