Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support
On 2016/11/15 13:21, Alexei Starovoitov wrote:
> On Mon, Nov 14, 2016 at 9:03 PM, Wangnan (F) wrote:
>> On 2016/11/15 12:57, Alexei Starovoitov wrote:
>>> On Mon, Nov 14, 2016 at 8:05 PM, Wang Nan wrote:
>>>> This is version 2 of perf builtin clang patch series. Compare to v1,
>>>> add an exciting feature: jit compiling perf hook functions. This
>>>> features allows script writer report result through BPF map in a
>>>> customized way.
>>>
>>> looks great.
>>>
>>>> SEC("perfhook:record_start")
>>>> void record_start(void *ctx)
>>>> {
>>>>         int perf_pid = getpid(), key = G_perf_pid;
>>>>         printf("Start count, perfpid=%d\n", perf_pid);
>>>>         jit_helper__map_update_elem(ctx, &GVALS, &key, &perf_pid, 0);
>>>
>>> the name, I think, is too verbose.
>>> Why not to keep them as bpf_map_update_elem
>>> even for user space programs?
>>
>> I can make it shorter by give it a better name or use a wrapper like
>>
>> BPF_MAP(update_elem)
>
> the macro isn't pretty, since function calls won't look like calls.
>
>> but the only thing I can't do is to make perfhook and in-kernel script
>> use a uniform name for these bpf_map functions, because
>> bpf_map_update_elem is already defined:
>>
>> "static long (*bpf_map_update_elem)(void *, void *, void *, unsigned long) =
>> (void *)2;\n"
>
> right. i guess you could have #ifdef it, so it's different for bpf backend
> and for native.

Then the '.c' -> LLVM IR compilation would have to be done twice, once for
BPF and once for JIT, to make the macro work. In the current implementation
we have only one LLVM IR. It is faster and makes sure the data layout
("maps" section) is identical.

> Another alternative is to call it map_update_elem
> or map_update or bpf_map_update.
> Something shorter is already a win.
> 'jit_helper__' prefix is an implementation detail. The users don't need
> to know and don't need to spell it out everywhere.

Good. Let's choose a better name for them.

Thank you.
Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support
On Mon, Nov 14, 2016 at 9:03 PM, Wangnan (F) wrote:
> On 2016/11/15 12:57, Alexei Starovoitov wrote:
>> On Mon, Nov 14, 2016 at 8:05 PM, Wang Nan wrote:
>>> This is version 2 of perf builtin clang patch series. Compare to v1,
>>> add an exciting feature: jit compiling perf hook functions. This
>>> features allows script writer report result through BPF map in a
>>> customized way.
>>
>> looks great.
>>
>>> SEC("perfhook:record_start")
>>> void record_start(void *ctx)
>>> {
>>>         int perf_pid = getpid(), key = G_perf_pid;
>>>         printf("Start count, perfpid=%d\n", perf_pid);
>>>         jit_helper__map_update_elem(ctx, &GVALS, &key, &perf_pid, 0);
>>
>> the name, I think, is too verbose.
>> Why not to keep them as bpf_map_update_elem
>> even for user space programs?
>
> I can make it shorter by give it a better name or use a wrapper like
>
> BPF_MAP(update_elem)

the macro isn't pretty, since function calls won't look like calls.

> but the only thing I can't do is to make perfhook and in-kernel script
> use a uniform name for these bpf_map functions, because
> bpf_map_update_elem is already defined:
>
> "static long (*bpf_map_update_elem)(void *, void *, void *, unsigned long) =
> (void *)2;\n"

right. i guess you could have #ifdef it, so it's different for bpf backend
and for native.

Another alternative is to call it map_update_elem
or map_update or bpf_map_update.
Something shorter is already a win.
'jit_helper__' prefix is an implementation detail. The users don't need
to know and don't need to spell it out everywhere.
Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support
On 2016/11/15 12:57, Alexei Starovoitov wrote:
> On Mon, Nov 14, 2016 at 8:05 PM, Wang Nan wrote:
>> This is version 2 of perf builtin clang patch series. Compare to v1,
>> add an exciting feature: jit compiling perf hook functions. This
>> features allows script writer report result through BPF map in a
>> customized way.
>
> looks great.
>
>> SEC("perfhook:record_start")
>> void record_start(void *ctx)
>> {
>>         int perf_pid = getpid(), key = G_perf_pid;
>>         printf("Start count, perfpid=%d\n", perf_pid);
>>         jit_helper__map_update_elem(ctx, &GVALS, &key, &perf_pid, 0);
>
> the name, I think, is too verbose.
> Why not to keep them as bpf_map_update_elem
> even for user space programs?

I can make it shorter by giving it a better name or by using a wrapper like

BPF_MAP(update_elem)

but the only thing I can't do is to make perfhook and in-kernel scripts
use a uniform name for these bpf_map functions, because
bpf_map_update_elem is already defined:

"static long (*bpf_map_update_elem)(void *, void *, void *, unsigned long) =
(void *)2;\n"

>> SEC("perfhook:record_end")
>> void record_end(void *ctx)
>> {
>>         u64 key = -1, value;
>>         while (!jit_helper__map_get_next_key(ctx, &syscall_counter, &key, &key)) {
>>                 jit_helper__map_lookup_elem(ctx, &syscall_counter, &key, &value);
>>                 printf("syscall %ld\tcount: %ld\n", (long)key, (long)value);
>
> this loop will be less verbose as well.
Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support
On Mon, Nov 14, 2016 at 8:05 PM, Wang Nan wrote:
> This is version 2 of perf builtin clang patch series. Compare to v1,
> add an exciting feature: jit compiling perf hook functions. This
> features allows script writer report result through BPF map in a
> customized way.

looks great.

> SEC("perfhook:record_start")
> void record_start(void *ctx)
> {
>         int perf_pid = getpid(), key = G_perf_pid;
>         printf("Start count, perfpid=%d\n", perf_pid);
>         jit_helper__map_update_elem(ctx, &GVALS, &key, &perf_pid, 0);

the name, I think, is too verbose.
Why not to keep them as bpf_map_update_elem
even for user space programs?

> SEC("perfhook:record_end")
> void record_end(void *ctx)
> {
>         u64 key = -1, value;
>         while (!jit_helper__map_get_next_key(ctx, &syscall_counter, &key,
> &key)) {
>                 jit_helper__map_lookup_elem(ctx, &syscall_counter, &key,
> &value);
>                 printf("syscall %ld\tcount: %ld\n", (long)key, (long)value);

this loop will be less verbose as well.
Re: [PATCH 00/34] perf clang: Builtin clang and perfhook support
On 2016/11/15 12:05, Wang Nan wrote:
> $ sudo -s
> # ulimit -l unlimited
> # perf record -e ./count_syscalls.c echo "Haha"
> Start count, perfpid=25209
> Haha
> [ perf record: Woken up 1 times to write data ]
> syscall 8	count: 6
> syscall 11	count: 1
> syscall 4	count: 6
> syscall 21	count: 1
> syscall 5	count: 3
> syscall 231	count: 1
> syscall 45	count: 3
> syscall 0	count: 24
> syscall 257	count: 1
> syscall 59	count: 4
> syscall 23	count: 9
> syscall 78	count: 2
> syscall 41	count: 4
> syscall 72	count: 8
> syscall 10	count: 3
> syscall 321	count: 1
> syscall 298	count: 7
> syscall 16	count: 21
> syscall 9	count: 16
> syscall 1	count: 114
> syscall 12	count: 3
> syscall 14	count: 35
> syscall 158	count: 1
> syscall 2	count: 15
> syscall 7	count: 18
> syscall 3	count: 11
> [ perf record: Captured and wrote 0.011 MB perf.data ]

Note that this example builds a system-wide syscall histogram, not one
for the 'echo' process only. The in-kernel BPF script doesn't know the pid
of 'echo', so it can't filter based on it. I'm planning to add more perf
hook points to pass information like this.

Thank you.
[PATCH 00/34] perf clang: Builtin clang and perfhook support
This is version 2 of the perf builtin clang patch series. Compared to v1,
it adds an exciting feature: JIT compiling perf hook functions. This
feature allows script writers to report results through a BPF map in a
customized way. The end of this cover letter lists an example showing how
to capture and report a syscall histogram.

This patchset is based on current perf/core.

In this patchset:

 Patch  1 -  4 are bugfixes left in my local tree.

 Patch  5 -  7 are preparation in libbpf.

 Patch  8 -  9 introduce perf hooks to perf.

 Patch 10 - 21 add builtin clang support. Some of them are already
               collected by Arnaldo in tmp.perf/builtin-clang. They are
               slightly adjusted in this v2 series.

 Patch 22 - 29 add JIT compiling to builtin clang and hook it up to
               perf hooks.

 Patch 30 - 34 are ease-of-use improvements.
               1) builtin clang defines macros and helpers by default
                  using builtin include headers. Default headers can be
                  turned off by -UBUILTIN_CLANG_DEFAULT_INCLUDE.
               2) allow JITted perf hooks to access more functions in libc.

Example: build a syscall histogram, excluding syscalls issued by perf itself.

Please note the following improvements:

 1. There is no need to define many BPF helpers by hand. See
    bpf_map_lookup_elem and bpf_get_current_pid_tgid.
 2. See how the pid of perf is passed to the BPF script attached to
    raw_syscalls:sys_enter.

 $ cat ./count_syscalls.c
 typedef unsigned long u64;

 #define BPF_MAP_TYPE_HASH 1
 #define BPF_MAP_TYPE_ARRAY 2

 enum GVAL {
         G_perf_pid,
         NR_GVALS
 };

 struct bpf_map_def SEC("maps") GVALS = {
         .type = BPF_MAP_TYPE_ARRAY,
         .key_size = sizeof(int),
         .value_size = sizeof(u64),
         .max_entries = NR_GVALS,
 };

 struct bpf_map_def SEC("maps") syscall_counter = {
         .type = BPF_MAP_TYPE_HASH,
         .key_size = sizeof(u64),
         .value_size = sizeof(u64),
         .max_entries = 512,
 };

 SEC("raw_syscalls:sys_enter")
 int func(void *ctx)
 {
         int key = G_perf_pid;
         u64 id = *((u64 *)(ctx + 8));
         int self_pid = bpf_get_current_pid_tgid() & 0xffffffff;
         int *perf_pid = bpf_map_lookup_elem(&GVALS, &key);
         u64 *counter;

         if (!perf_pid)
                 return 0;
         if (*perf_pid == self_pid)
                 return 0;
         counter = bpf_map_lookup_elem(&syscall_counter, &id);
         if (!counter) {
                 u64 value = 1;
                 bpf_map_update_elem(&syscall_counter, &id, &value, 0);
                 return 0;
         }
         __sync_fetch_and_add(counter, 1);
         return 0;
 }

 SEC("perfhook:record_start")
 void record_start(void *ctx)
 {
         int perf_pid = getpid(), key = G_perf_pid;
         printf("Start count, perfpid=%d\n", perf_pid);
         jit_helper__map_update_elem(ctx, &GVALS, &key, &perf_pid, 0);
 }

 SEC("perfhook:record_end")
 void record_end(void *ctx)
 {
         u64 key = -1, value;
         while (!jit_helper__map_get_next_key(ctx, &syscall_counter, &key, &key)) {
                 jit_helper__map_lookup_elem(ctx, &syscall_counter, &key, &value);
                 printf("syscall %ld\tcount: %ld\n", (long)key, (long)value);
         }
 }
 char _license[] SEC("license") = "GPL";
 int _version SEC("version") = LINUX_VERSION_CODE;

 $ sudo -s
 # ulimit -l unlimited
 # perf record -e ./count_syscalls.c echo "Haha"
 Start count, perfpid=25209
 Haha
 [ perf record: Woken up 1 times to write data ]
 syscall 8	count: 6
 syscall 11	count: 1
 syscall 4	count: 6
 syscall 21	count: 1
 syscall 5	count: 3
 syscall 231	count: 1
 syscall 45	count: 3
 syscall 0	count: 24
 syscall 257	count: 1
 syscall 59	count: 4
 syscall 23	count: 9
 syscall 78	count: 2
 syscall 41	count: 4
 syscall 72	count: 8
 syscall 10	count: 3
 syscall 321	count: 1
 syscall 298	count: 7
 syscall 16	count: 21
 syscall 9	count: 16
 syscall 1	count: 114
 syscall 12	count: 3
 syscall 14	count: 35
 syscall 158	count: 1
 syscall 2	count: 15
 syscall 7	count: 18
 syscall 3	count: 11
 [ perf record: Captured and wrote 0.011 MB perf.data ]

Eric Leblond (1):
  tools lib bpf: fix maps resolution

Wang Nan (33):
  perf tools: Fix kernel version error in ubuntu
  perf record: Fix segfault when running with suid and kptr_restrict is 1
  tools perf: Add missing struct defeinition in probe_event.h
  tools lib bpf: Add missing bpf map functions
  tools lib bpf: Add private field for bpf_object
  tools lib bpf: Retrive bpf_map through offset of bpf_map_def
  perf tools: Introduce perf hooks
  perf tools: Pass context to perf hook functions
  perf llvm: Extract helpers in llvm-utils.c
  tools build: Add feature detection for LLVM
  tools build: Add feature detection for clang
  perf build: Add clang and llvm compile and linking support
  perf clang: Add builtin clang support ant test case
  perf clang: Use real file system for #include
  perf