On 11/20/17 5:31 AM, Arnaldo Carvalho de Melo wrote:
Em Tue, Nov 14, 2017 at 09:25:17PM +0100, Daniel Borkmann escreveu:
On 11/14/2017 07:15 PM, Yonghong Song wrote:
On 11/14/17 6:19 AM, Daniel Borkmann wrote:
On 11/14/2017 02:42 PM, Arnaldo Carvalho de Melo wrote:
Em Tue, Nov 14, 2017 at 02:09:34PM +0100, Daniel Borkmann escreveu:
On 11/14/2017 01:58 PM, Arnaldo Carvalho de Melo wrote:
Em Tue, Nov 14, 2017 at 01:09:39AM +0100, Daniel Borkmann escreveu:
On 11/13/2017 04:08 PM, Arnaldo Carvalho de Melo wrote:
libbpf: -- BEGIN DUMP LOG ---
libbpf:
0: (79) r3 = *(u64 *)(r1 +104)
1: (b7) r2 = 0
2: (bf) r6 = r1
3: (bf) r1 = r10
4: (07) r1 += -128
5: (b7) r2 = 128
6: (85) call bpf_probe_read_str#45
7: (bf) r1 = r0
8: (07) r1 += -1
9: (67) r1 <<= 32
10: (77) r1 >>= 32
11: (25) if r1 > 0x7f goto pc+11

Right, so the compiler is optimizing the two tests into a single one above,
which means lower bound cannot properly be derived again by the verifier due
to this and thus you'll get the error. Similar issue was seen recently [1].

Does the below hack work for you?

int prog([...])
{
          char filename[128];
          int ret = bpf_probe_read_str(filename, sizeof(filename), 
filename_ptr);
          if (ret > 0)
                  bpf_perf_event_output(ctx, &__bpf_stdout__, 
BPF_F_CURRENT_CPU, filename,
                                        ret & (sizeof(filename) - 1));
          return 1;
}

r0 should keep on tracking bounds here at least:

prog:
         0:    bf 16 00 00 00 00 00 00     r6 = r1
         1:    bf a1 00 00 00 00 00 00     r1 = r10
         2:    07 01 00 00 80 ff ff ff     r1 += -128
         3:    b7 02 00 00 80 00 00 00     r2 = 128
         4:    85 00 00 00 2d 00 00 00     call 45
         5:    67 00 00 00 20 00 00 00     r0 <<= 32
         6:    c7 00 00 00 20 00 00 00     r0 s>>= 32
         7:    b7 01 00 00 01 00 00 00     r1 = 1
         8:    6d 01 0a 00 00 00 00 00     if r1 s> r0 goto 10
         9:    57 00 00 00 7f 00 00 00     r0 &= 127
        10:    bf a4 00 00 00 00 00 00     r4 = r10
        11:    07 04 00 00 80 ff ff ff     r4 += -128
        12:    bf 61 00 00 00 00 00 00     r1 = r6
        13:    18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00     r2 = 0ll
        15:    18 03 00 00 ff ff ff ff 00 00 00 00 00 00 00 00     r3 = 
4294967295ll
        17:    bf 05 00 00 00 00 00 00     r5 = r0
        18:    85 00 00 00 19 00 00 00     call 25

    [1] 
https://urldefense.proofpoint.com/v2/url?u=http-3A__patchwork.ozlabs.org_project_netdev_list_-3Fseries-3D13211&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=z0d6b_hxStA845Kh7epJ-JiFwkiWqUH_z3fEadwqAQY&e=

Not yet:

6: (85) call bpf_probe_read_str#45
7: (bf) r1 = r0
8: (67) r1 <<= 32
9: (77) r1 >>= 32
10: (15) if r1 == 0x0 goto pc+10
   R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) 
R6=ctx(id=0,off=0,imm=0) R10=fp0
11: (57) r0 &= 127
12: (bf) r4 = r10
13: (07) r4 += -128
14: (bf) r1 = r6
15: (18) r2 = 0xffff92bfc2aba840u
17: (18) r3 = 0xffffffff
19: (bf) r5 = r0
20: (85) call bpf_perf_event_output#25
invalid stack type R4 off=-128 access_size=0

I'll try updating clang/llvm...

Full details:

[root@jouet bpf]# cat open.c
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, const char __user *filename_ptr)
{
     char filename[128];
     const unsigned len = bpf_probe_read_str(filename, sizeof(filename), 
filename_ptr);

Btw, I was using 'int' here above instead of 'unsigned' as strncpy_from_unsafe()
could potentially return errors like -EFAULT.

I changed to int, didn't help
Currently having a version compiled from the git tree:

# llc --version
LLVM 
(https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
    LLVM version 6.0.0git-2d810c2
    Optimized build.
    Default target: x86_64-unknown-linux-gnu
    Host CPU: skylake

[root@jouet bpf]# llc --version
LLVM 
(https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=Qp3xFfXEz-CT8rzYtrHeXbow2M6FlsUzwcY32i3_2Q0&s=BKC_Gu9s1hw0v13OCgCpfsGtAY2hE7dujFqg8LNaK2I&e=):
    LLVM version 4.0.0svn

Old stuff! ;-) Will change, but improving these messages should be on
the radar, I think :-)

Yep, agree, I think we need a generic, better solution for this type of
issue instead of converting individual helpers to handle 0 min bound and
then only bailing out in such case; need to brainstorm a bit on that.

I think for the above in your case ...

   [...]
    6: (85) call bpf_probe_read_str#45
    7: (bf) r1 = r0
    8: (67) r1 <<= 32
    9: (77) r1 >>= 32
   10: (15) if r1 == 0x0 goto pc+10
    R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) 
R6=ctx(id=0,off=0,imm=0) R10=fp0
   11: (57) r0 &= 127
   [...]

... the shifts on r1 might be due to using 32 bit type, so if you find

Where is it using a 32 bit type?

Here is it again, now using clang 6:

[root@jouet bpf]# trace -v -e *open,open.c  usleep 2
bpf: builtin compilation failed: -95, try external compiler
Kernel build dir is set to /lib/modules/4.14.0+/build
set env: KBUILD_DIR=/lib/modules/4.14.0+/build
unset env: KBUILD_OPTS
include option is set to  -nostdinc -isystem 
/usr/lib/gcc/x86_64-redhat-linux/7/include 
-I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  
-I/home/acme/git/linux/include -I./include 
-I/home/acme/git/linux/arch/x86/include/uapi 
-I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi 
-I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: NR_CPUS=4
set env: LINUX_VERSION_CODE=0x40e00
set env: CLANG_EXEC=/usr/local/bin/clang
unset env: CLANG_OPTIONS
set env: KERNEL_INC_OPTIONS= -nostdinc -isystem 
/usr/lib/gcc/x86_64-redhat-linux/7/include 
-I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  
-I/home/acme/git/linux/include -I./include 
-I/home/acme/git/linux/arch/x86/include/uapi 
-I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi 
-I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
set env: WORKING_DIR=/lib/modules/4.14.0+/build
set env: CLANG_SOURCE=/home/acme/bpf/open.c
llvm compiling command template: $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS 
-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS 
-Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c 
"$CLANG_SOURCE" -target bpf -O2 -o -
libbpf: loading object 'open.c' from buffer
libbpf: section .strtab, size 103, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section prog=do_sys_open filename, size 168, link 0, flags 6, type=1
libbpf: found program prog=do_sys_open filename
libbpf: section .relprog=do_sys_open filename, size 16, link 8, flags 0, type=9
libbpf: section maps, size 16, link 0, flags 3, type=1
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of open.c is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of open.c is 40e00
libbpf: section .symtab, size 144, link 1, flags 0, type=2
libbpf: maps in open.c: 1 maps in 16 bytes
libbpf: map 0 is "__bpf_stdout__"
libbpf: collecting relocating info for: 'prog=do_sys_open filename'
libbpf: relocation: insn_idx=13
libbpf: relocation: find map 0 (__bpf_stdout__) for insn 13
bpf: config program 'prog=do_sys_open filename'
symbol:do_sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
parsing arg: filename into filename
bpf: config 'prog=do_sys_open filename' is ok
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.14.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.14.0+/build/vmlinux
Try to find probe point from debuginfo.
Matched function: do_sys_open [2a2e5f8]
Probe point found: do_sys_open+0
Searching 'filename' variable in context.
Converting variable filename into trace event.
filename type is (null).
Opening /sys/kernel/debug/tracing//README write=0
Found 1 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:perf_bpf_probe/prog _text+2493856 filename=%si:x64
In map_prologue, ntevs=1
mapping[0]=0
libbpf: create map __bpf_stdout__: fd=3
prologue: pass validation
prologue: fast path
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf:
0: (79) r3 = *(u64 *)(r1 +104)
1: (b7) r2 = 0
2: (bf) r6 = r1
3: (bf) r1 = r10
4: (07) r1 += -128
5: (b7) r2 = 128
6: (85) call bpf_probe_read_str#45
7: (bf) r1 = r0
8: (67) r1 <<= 32
9: (77) r1 >>= 32

This shift is due to the return type of bpf_probe_read_str which is int.
The compiler does a sign extension to get it to u64 type.

10: (15) if r1 == 0x0 goto pc+10

Since it is u64 type, the compiler uses "==" instead of ">".

  R0=inv(id=0) R1=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) 
R6=ctx(id=0,off=0,imm=0) R10=fp0
11: (57) r0 &= 127
12: (bf) r4 = r10
13: (07) r4 += -128
14: (bf) r1 = r6
15: (18) r2 = 0xffff9b56ca5a2a80
17: (18) r3 = 0xffffffff
19: (bf) r5 = r0

r5 = r0, and r0 is still possbily 0.
   r0 = bpf_probe_read_str
   ...
   r0 &= 127
   ...

The same symptom as "int len" type.

20: (85) call bpf_perf_event_output#25
invalid stack type R4 off=-128 access_size=0

libbpf: -- END LOG --
libbpf: Loading the 0th instance of program 'prog=do_sys_open filename' failed
libbpf: failed to load program 'prog=do_sys_open filename'
libbpf: failed to load object 'open.c'
bpf: load objects failed
event syntax error: 'open.c'
                      \___ Kernel verifier blocks program loading

(add -v to see detail)
Run 'perf list' for a list of valid events

  Usage: perf trace [<options>] [<command>]
     or: perf trace [<options>] -- <command> [<options>]
     or: perf trace record [<options>] [<command>]
     or: perf trace record [<options>] -- <command> [<options>]

     -e, --event <event>   event/syscall selector. use 'perf list' to list 
available events
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Opening /sys/kernel/debug/tracing//uprobe_events write=1
Parsing probe_events: p:perf_bpf_probe/prog _text+2493856 filename=%si:x64
Group:perf_bpf_probe Event:prog probe:p
Writing event: -:perf_bpf_probe/prog
[root@jouet bpf]# /usr/local/bin/clang --version
clang version 6.0.0 
(https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_git_clang.git&d=DwIDAw&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=lALOhwxSDXp7fA6co6UyLBnLfqv0e2pf97GCuu5hGSw&s=FGoodXMmAdx4dqZJtvq6Vosu03to1-GFIRbagBiz0fg&e=
 56cc8f8880db2ebc433eeb6b6a707c101467a186) 
(https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org_git_llvm.git&d=DwIDAw&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=lALOhwxSDXp7fA6co6UyLBnLfqv0e2pf97GCuu5hGSw&s=_pQdnamFeu6n_um7WGrLwaCMtXqIVGLLEmYbcss_0aU&e=
 3656d83960a4f3fedf6d8f19043abf52379f78c3)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
[root@jouet bpf]# cat open.c
#include "bpf.h"

SEC("prog=do_sys_open filename")
int prog(void *ctx, int err, const char __user *filename_ptr)
{
        char filename[128];
        u64 len = bpf_probe_read_str(filename, sizeof(filename), filename_ptr);
        if (len > 0)
                        perf_event_output(ctx, &__bpf_stdout__, 
BPF_F_CURRENT_CPU, filename,
                                  len & (sizeof(filename) - 1));
        return 1;
}
[root@jouet bpf]#

a way to avoid these and have the test on r0 directly, we might get there.
Perhaps keep using a 64 bit type to avoid them. It would be useful to
propagate the deduced bound information back to r0 when we know that
neither r0 nor r1 has changed in the meantime.

It is tricky to do in the bpf_program. Compiler tries hard to optimize :-).

The issue is at "r0 &= 127".

9: (6d) if r1 s> r0 goto pc+10
  R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 
0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
10: R0=inv(id=0,umin_value=1,umax_value=9223372036854775807,var_off=(0x0; 
0x7fffffffffffffff)) R1=inv1 R6=ctx(id=0,off=0,imm=0) R10=fp0
10: (57) r0 &= 127
11: R0=inv(id=0,umax_value=127,var_off=(0x0; 0x7f)) R1=inv1 
R6=ctx(id=0,off=0,imm=0) R10=fp0

One possible solution for this problem is to relax the arg4 type
from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO.

Yeah, I know, that's what I mentioned earlier in this thread to resolve it,
but do we really want to add this hack everywhere? :( Potentially any function
having ARG_CONST_SIZE would need to handle size 0 and bail out again in their
helper implementation and it ends up that progs start relying on this runtime
check where we won't be able to get rid of it later on anymore.

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index a5580c6..a68d8bd 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -393,6 +393,9 @@ BPF_CALL_5(bpf_perf_event_output, struct pt_regs *, regs, 
struct bpf_map *, map,
                 },
         };

+       if (unlikely(size == 0))
+               return 0;
+
         if (unlikely(flags & ~(BPF_F_INDEX_MASK)))
                 return -EINVAL;

@@ -407,7 +410,7 @@ static const struct bpf_func_proto 
bpf_perf_event_output_proto = {
         .arg2_type      = ARG_CONST_MAP_PTR,
         .arg3_type      = ARG_ANYTHING,
         .arg4_type      = ARG_PTR_TO_MEM,
-       .arg5_type      = ARG_CONST_SIZE,
+       .arg5_type      = ARG_CONST_SIZE_OR_ZERO,
  };

Reply via email to