So, I had 'perf test BPF' failing in perf/core, at first thought it was due to an update I had made to clang, but then I noticed that with perf/urgent it works, so something else, did a bisection and got to:
[acme@jouet linux]$ git bisect bad 88cda1c9da02c8aa31e1d5dcf22e8a35cc8c19f2 is the first bad commit commit 88cda1c9da02c8aa31e1d5dcf22e8a35cc8c19f2 Author: Martin KaFai Lau <ka...@fb.com> Date: Wed Sep 27 14:37:54 2017 -0700 bpf: libbpf: Provide basic API support to specify BPF obj name This patch extends the libbpf to provide API support to allow specifying BPF object name. In tools/lib/bpf/libbpf, the C symbol of the function and the map is used. Regarding section name, all maps are under the same section named "maps". Hence, section name is not a good choice for map's name. To be consistent with map, bpf_prog also follows and uses its function symbol as the prog's name. This patch adds logic to collect function's symbols in libbpf. There is existing codes to collect the map's symbols and no change is needed. The bpf_load_program_name() and bpf_map_create_name() are added to take the name argument. For the other bpf_map_create_xxx() variants, a name argument is directly added to them. In samples/bpf, bpf_load.c in particular, the symbol is also used as the map's name and the map symbols has already been collected in the existing code. For bpf_prog, bpf_load.c does not collect the function symbol name. We can consider to collect them later if there is a need to continue supporting the bpf_load.c. Signed-off-by: Martin KaFai Lau <ka...@fb.com> Acked-by: Alexei Starovoitov <a...@fb.com> Acked-by: Daniel Borkmann <dan...@iogearbox.net> Signed-off-by: David S. Miller <da...@davemloft.net> :040000 040000 9082e2747e39319fcd1555506bc24f36fc85ec43 433c27b777924b188028d79d2c5648a917135f39 M samples :040000 040000 6113e47f92559d3b4ac4eda399d2c980ac407603 9774db0bb47eab0ffc642623bd7d7a6f46b820c6 M tools [acme@jouet linux]$ [acme@jouet linux]$ Please CC me and Wang Nan when making changes to tools/lib/bpf/, as it started having tools/perf/ as its sole user, i.e. there is code there that uses it and we have to make sure it continues working :-\ What fails? [root@jouet ~]# perf test bpf 39: BPF filter : 39.1: Basic BPF filtering : FAILED! 39.2: BPF pinning : Skip 39.3: BPF prologue generation : Skip 39.4: BPF relocation checker : Skip [root@jouet ~]# With -v we can see that it is map setup that fails, with the error being: libbpf: failed to create map (name: 'flip_table'): Invalid argument libbpf: failed to load object '[basic_bpf_test]' bpf: load objects failed Failed to add events selected by BPF Full logs, see below for further info I had collected before doing the bisect: [root@jouet ~]# perf test -v bpf 39: BPF filter : 39.1: Basic BPF filtering : --- start --- test child forked, pid 12198 Kernel build dir is set to /lib/modules/4.14.0+/build set env: KBUILD_DIR=/lib/modules/4.14.0+/build unset env: KBUILD_OPTS include option is set to -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h set env: NR_CPUS=4 set env: LINUX_VERSION_CODE=0x40e00 set env: CLANG_EXEC=/usr/local/bin/clang set env: CLANG_OPTIONS=-xc set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h set env: WORKING_DIR=/lib/modules/4.14.0+/build set env: CLANG_SOURCE=- llvm compiling command template: echo '/* * bpf-script-example.c * Test basic LLVM building */ #ifndef LINUX_VERSION_CODE # error Need LINUX_VERSION_CODE # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig' #endif #define BPF_ANY 0 #define BPF_MAP_TYPE_ARRAY 2 #define BPF_FUNC_map_lookup_elem 1 #define BPF_FUNC_map_update_elem 2 static void *(*bpf_map_lookup_elem)(void *map, void *key) = (void *) BPF_FUNC_map_lookup_elem; static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) = (void *) BPF_FUNC_map_update_elem; struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) struct bpf_map_def SEC("maps") flip_table = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(int), .value_size = sizeof(int), .max_entries = 1, }; SEC("func=SyS_epoll_wait") int bpf_func__SyS_epoll_wait(void *ctx) { int ind =0; int *flag = bpf_map_lookup_elem(&flip_table, &ind); int new_flag; if (!flag) return 0; /* flip flag and store back */ new_flag = !*flag; bpf_map_update_elem(&flip_table, &ind, &new_flag, BPF_ANY); return new_flag; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o - libbpf: loading object '[basic_bpf_test]' from buffer libbpf: section .strtab, size 120, link 0, flags 0, type=3 libbpf: section .text, size 0, link 0, flags 6, type=1 libbpf: section func=SyS_epoll_wait, size 192, link 0, flags 6, type=1 libbpf: found program func=SyS_epoll_wait libbpf: section .relfunc=SyS_epoll_wait, size 32, link 8, flags 0, type=9 libbpf: section maps, size 16, link 0, flags 3, type=1 libbpf: section license, size 4, link 0, flags 3, type=1 libbpf: license of [basic_bpf_test] is GPL libbpf: section version, size 4, link 0, flags 3, type=1 libbpf: kernel version of [basic_bpf_test] is 40e00 libbpf: section .symtab, size 168, link 1, flags 0, type=2 libbpf: maps in [basic_bpf_test]: 1 maps in 16 bytes libbpf: map 0 is "flip_table" libbpf: collecting relocating info for: 'func=SyS_epoll_wait' libbpf: relocation: insn_idx=4 libbpf: relocation: find map 0 (flip_table) for insn 4 libbpf: relocation: insn_idx=17 libbpf: relocation: find map 0 (flip_table) for insn 17 bpf: config program 'func=SyS_epoll_wait' symbol:SyS_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null) bpf: config 'func=SyS_epoll_wait' is ok Looking at the vmlinux_path (8 entries long) Using /lib/modules/4.14.0+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.14.0+/build/vmlinux Try to find probe point from debuginfo. Matched function: SyS_epoll_wait [2e5a9be] found inline addr: 0xffffffff812af256 Probe point found: compat_SyS_epoll_pwait+150 found inline addr: 0xffffffff812af037 Probe point found: SyS_epoll_pwait+135 found inline addr: 0xffffffff812aeed0 Probe point found: SyS_epoll_wait+0 Found 3 probe_trace_events. Opening /sys/kernel/debug/tracing//kprobe_events write=1 Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval Group:probe Event:check_stack_object probe:r Writing event: p:perf_bpf_probe/func _text+2814550 Writing event: p:perf_bpf_probe/func_1 _text+2814007 Writing event: p:perf_bpf_probe/func_2 _text+2813648 libbpf: failed to create map (name: 'flip_table'): Invalid argument libbpf: failed to load object '[basic_bpf_test]' bpf: load objects failed Failed to add events selected by BPF Opening /sys/kernel/debug/tracing//kprobe_events write=1 Opening /sys/kernel/debug/tracing//uprobe_events write=1 Parsing probe_events: p:perf_bpf_probe/func _text+2814550 Group:perf_bpf_probe Event:func probe:p Parsing probe_events: p:perf_bpf_probe/func_1 _text+2814007 Group:perf_bpf_probe Event:func_1 probe:p Parsing probe_events: p:perf_bpf_probe/func_2 _text+2813648 Group:perf_bpf_probe Event:func_2 probe:p Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval Group:probe Event:check_stack_object probe:r Writing event: -:perf_bpf_probe/func Opening /sys/kernel/debug/tracing//kprobe_events write=1 Opening /sys/kernel/debug/tracing//uprobe_events write=1 Parsing probe_events: p:perf_bpf_probe/func_1 _text+2814007 Group:perf_bpf_probe Event:func_1 probe:p Parsing probe_events: p:perf_bpf_probe/func_2 _text+2813648 Group:perf_bpf_probe Event:func_2 probe:p Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval Group:probe Event:check_stack_object probe:r Writing event: -:perf_bpf_probe/func_1 Opening /sys/kernel/debug/tracing//kprobe_events write=1 Opening /sys/kernel/debug/tracing//uprobe_events write=1 Parsing probe_events: p:perf_bpf_probe/func_2 _text+2813648 Group:perf_bpf_probe Event:func_2 probe:p Parsing probe_events: r:probe/check_stack_object _text+2485296 ret=$retval Group:probe Event:check_stack_object probe:r Writing event: -:perf_bpf_probe/func_2 test child finished with -1 ---- end ---- BPF filter subtest 0: FAILED! 39.2: BPF pinning : --- force skipped --- BPF filter subtest 1: Skip 39.3: BPF prologue generation : --- force skipped --- BPF filter subtest 2: Skip 39.4: BPF relocation checker : --- force skipped --- BPF filter subtest 3: Skip [root@jouet ~]# Em Mon, Nov 27, 2017 at 04:34:25PM -0300, Arnaldo Carvalho de Melo escreveu: > So, I noticed that any maps are failing, I'll go dig, but may be some > new security tightening, even running as root, this was working > recently, was even part of our discussion on the bpf_probe_read_str() > trouble with clang's optimizer: > > [root@jouet bpf]# cat open.c > #include "bpf.h" > > SEC("prog=do_sys_open filename") > int prog(void *ctx, int err, char *filename_ptr) > { > char filename[128]; > int len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr); > if (len > 0) { > if (len == 1) > perf_event_output(ctx, &__bpf_stdout__, > BPF_F_CURRENT_CPU, filename, len); > else if (len < 128) > perf_event_output(ctx, &__bpf_stdout__, > BPF_F_CURRENT_CPU, filename, len); > } > return 1; > } > [root@jouet bpf]# > <SNIP> > Found 1 probe_trace_events. > Opening /sys/kernel/debug/tracing//kprobe_events write=1 > Writing event: p:perf_bpf_probe/prog _text+2493856 filename=%si:x64 > In map_prologue, ntevs=1 > mapping[0]=0 > libbpf: failed to create map (name: '__bpf_stdout__'): Invalid argument > libbpf: failed to load object 'open.c' > bpf: load objects failed > event syntax error: 'open.c' > \___ Operation not permitted > <SNIP> > > Using 'perf ftrace' to trace just 'perf trace': > > [root@jouet bpf]# perf ftrace -G SyS_bpf perf trace -v -e open.c,open cat > /tmp/somefile 2> /dev/null > 0) | SyS_bpf() { > 0) | capable() { > 0) | ns_capable_common() { > 0) | security_capable() { > 0) 0.045 us | cap_capable(); > 0) | selinux_capable() { > 0) 0.274 us | cred_has_capability(); > 0) 0.518 us | } > 0) 1.464 us | } > 0) 1.783 us | } > 0) 2.130 us | } > 0) 0.458 us | check_uarg_tail_zero(); > 0) | __check_object_size() { > 0) 0.046 us | __virt_addr_valid(); > 0) 0.040 us | check_stack_object(); > 0) 0.510 us | } > 0) 4.161 us | } > [root@jouet bpf]# > > /me goes to look at SyS_bpf() in this kernel... (4.14.0+). Tracing 'perf trace' with 'perf trace' we see: # perf trace -e bpf perf trace -e open.c,open cat /tmp/somefile <SNIP traced 'perf trace' error messages> 0.000 ( 0.015 ms): perf/16767 bpf(cmd: MAP_CREATE, uattr: 0x7ffc3c8c7ac0, size: 72) = -1 EINVAL Invalid argument # Humm, # perf probe check_stack_object%return 'ret=$retval' Added new event: probe:check_stack_object (on check_stack_object%return with ret=$retval) You can now use it in all perf tools, such as: perf record -e probe:check_stack_object -aR sleep 1 # # perf trace -e bpf,probe:check* perf trace -e open.c,open cat /tmp/somefile <SNIP lots of check_stack_object calls returning 0x0) 4626.779 ( 0.004 ms): perf/31498 bpf(cmd: MAP_CREATE, uattr: 0x7fff7dbbaab0, size: 72 ) ... 4626.784 ( ): probe:check_stack_object:(ffffffffb625ec30 <- ffffffffb625ed1f) ret=0x2) 4626.779 ( 0.006 ms): perf/31498 ... [continued]: bpf()) = -1 EINVAL Invalid argument <SNIP lots of check_stack_object calls returning 0x0) check_stack_object returning 0x2 means GOOD_STACK, 0x0 means NOT_STACK... - Arnaldo