Re: [iovisor-dev] Bcc on 32-bit ARM
On Fri, Oct 13, 2017 at 5:53 AM, Adrian Szyndela via iovisor-dev wrote: > > W dniu 04.10.2017 o 21:14, Y Song pisze: >> >> On Wed, Oct 4, 2017 at 3:24 AM, Adrian Szyndela via iovisor-dev >> wrote: >>> >>> - return type of bpf_pseudo_fd is 'u64', and it is passed to functions >>> that >>> take pointers; the compiler generates 'trunc' in IR for this, which >>> becomes >>> shift-left + shift-right in final bpf code; and bpf verifier does not >>> like >>> it when pointers are shifted. We worked it around by changing return type >>> of >>> bpf_pseudo_fd/llvm.bpf.pseudo from 'u64' to 'void *' - both in bcc >>> >>> (https://github.com/aszyndela/bcc/commit/8eb8fb2e265b93d38c1ad53efcb7da8883a3d643) >>> and llvm >>> (include/llvm/IR/IntrinsicsBPF.td): >>> >>> def int_bpf_pseudo : GCCBuiltin<"__builtin_bpf_pseudo">, >>>- Intrinsic<[llvm_i64_ty], [llvm_i64_ty, llvm_i64_ty]>; >>>+ Intrinsic<[llvm_ptr_ty], [llvm_i64_ty, llvm_i64_ty]>; >>> >>> Obviously we don't know what we're doing, but at least it works for us. >> >> >> This should work as well. essentially bpf_pseudo tries to convert a >> load map_id to load_imm64 address. So its return value is indeed a ptr. > > > I asked about it on llvm-dev and the only suggestion I got so far is to set > bpf target for clang: > https://lists.llvm.org/pipermail/llvm-dev/2017-October/118149.html > > And it makes sense... > > We can easily get into trouble when we mix two targets in the process. For > example, it looks like sizeofs are computed at the "clang" stage, but actual > sizes of some types are "known" at the "llc" stage. > > However, switching to bpf target for the whole process takes away > convenience and maintainability: we can't rely on including kernel headers > anymore. So we have to craft everything we need by hand, including struct > pt_regs... > > Any thoughts how to handle this? All these are valid. For certain program types e.g., networking where pt_regs are not needed, you can already use -target bpf in clang. Check linux:tools/testing/selftests/bpf/Makefile. Since currently we do not have good solutions for pt_regs when "clang -target bpf" is used, let us stick to the approach "clang -target arm"/"llc -target bpf". > ___ > iovisor-dev mailing list > iovisor-dev@lists.iovisor.org > https://lists.iovisor.org/mailman/listinfo/iovisor-dev ___ iovisor-dev mailing list iovisor-dev@lists.iovisor.org https://lists.iovisor.org/mailman/listinfo/iovisor-dev
Re: [iovisor-dev] Bcc on 32-bit ARM
W dniu 04.10.2017 o 21:14, Y Song pisze: On Wed, Oct 4, 2017 at 3:24 AM, Adrian Szyndela via iovisor-dev wrote: - return type of bpf_pseudo_fd is 'u64', and it is passed to functions that take pointers; the compiler generates 'trunc' in IR for this, which becomes shift-left + shift-right in final bpf code; and bpf verifier does not like it when pointers are shifted. We worked it around by changing return type of bpf_pseudo_fd/llvm.bpf.pseudo from 'u64' to 'void *' - both in bcc (https://github.com/aszyndela/bcc/commit/8eb8fb2e265b93d38c1ad53efcb7da8883a3d643) and llvm (include/llvm/IR/IntrinsicsBPF.td): def int_bpf_pseudo : GCCBuiltin<"__builtin_bpf_pseudo">, - Intrinsic<[llvm_i64_ty], [llvm_i64_ty, llvm_i64_ty]>; + Intrinsic<[llvm_ptr_ty], [llvm_i64_ty, llvm_i64_ty]>; Obviously we don't know what we're doing, but at least it works for us. This should work as well. essentially bpf_pseudo tries to convert a load map_id to load_imm64 address. So its return value is indeed a ptr. I asked about it on llvm-dev and the only suggestion I got so far is to set bpf target for clang: https://lists.llvm.org/pipermail/llvm-dev/2017-October/118149.html And it makes sense... We can easily get into trouble when we mix two targets in the process. For example, it looks like sizeofs are computed at the "clang" stage, but actual sizes of some types are "known" at the "llc" stage. However, switching to bpf target for the whole process takes away convenience and maintainability: we can't rely on including kernel headers anymore. So we have to craft everything we need by hand, including struct pt_regs... Any thoughts how to handle this? ___ iovisor-dev mailing list iovisor-dev@lists.iovisor.org https://lists.iovisor.org/mailman/listinfo/iovisor-dev
Re: [iovisor-dev] Bcc on 32-bit ARM
W dniu 04.10.2017 o 21:14, Y Song pisze: Thanks for comprehensive hacking report to make arm32 work! More comments below. Thanks for taking a look! There could be some issues in BPF backend to handle 32bit pointers. I tried your example above, "clang -O2 -target arm -emit-llvm -S t.c" works fine. But llc complains calling convention failure. # llc -march=bpf -filetype=asm t.ll LLVM ERROR: Unsupported calling convention When I set the target to clang -O2 -target armv7l-unknown-linux-gnueabi -emit-llvm -S t.c it works for me the same as with the modified bcc. llc does not complain anymore. ___ iovisor-dev mailing list iovisor-dev@lists.iovisor.org https://lists.iovisor.org/mailman/listinfo/iovisor-dev
Re: [iovisor-dev] Bcc on 32-bit ARM
Hi, Adrian, Thanks for comprehensive hacking report to make arm32 work! More comments below. On Wed, Oct 4, 2017 at 3:24 AM, Adrian Szyndela via iovisor-dev wrote: > Hi, > > We're trying to launch bcc on an ARM 32-bit machine. We've managed to cook a > version that works, but with a few quirks. > > Besides adding general setup for compiling on ARM (defining registers, > calling convention, syscall number), we made some changes and corrections to > type sizes (mostly ulonglong -> size_t in libbcc.py), disabled some parts of > kernel headers and disabled 128-bit integer support. > (https://github.com/aszyndela/bcc/commit/bed992f2842007ec5563ec75caffe6e03b6bb7c4) I look at the change. Mostly it is okay. In the code, you mentioned, + // disable including kernel header files with assembly code + // as there is no assembly parser for ARM implemented yet + // same for arm64 arch: https://patchwork.kernel.org/patch/7605601/ + cflags->push_back("-D__ASM_SYSREG_H"); + cflags->push_back("-D__LINUX_ARM_ARCH__=7"); If you have latest llvm trunk code, assembly parser is there, you should not need to have -D__ASM_SYSREG_H. But for earlier version of llvm compiler, yes, you still need it. > > Additionally, we encountered the following quirks: > > - 'memset', which comes from rewriting in libbcc, is redefined in ARM kernel > headers > (https://elixir.free-electrons.com/linux/latest/source/arch/arm/include/asm/string.h#L29). > Redefined code calls __memzero, which is not allowed in bpf code because it > is an external function call, so we changed memset to __builtin_memset when > the code is rewritten. > (https://github.com/aszyndela/bcc/commit/6cc9adc4e6e8cc775b1d08641d0bf9c9f48ce3e2) The change here should be okay. > > - return type of bpf_pseudo_fd is 'u64', and it is passed to functions that > take pointers; the compiler generates 'trunc' in IR for this, which becomes > shift-left + shift-right in final bpf code; and bpf verifier does not like > it when pointers are shifted. We worked it around by changing return type of > bpf_pseudo_fd/llvm.bpf.pseudo from 'u64' to 'void *' - both in bcc > (https://github.com/aszyndela/bcc/commit/8eb8fb2e265b93d38c1ad53efcb7da8883a3d643) > and llvm > (include/llvm/IR/IntrinsicsBPF.td): > > def int_bpf_pseudo : GCCBuiltin<"__builtin_bpf_pseudo">, > - Intrinsic<[llvm_i64_ty], [llvm_i64_ty, llvm_i64_ty]>; > + Intrinsic<[llvm_ptr_ty], [llvm_i64_ty, llvm_i64_ty]>; > > Obviously we don't know what we're doing, but at least it works for us. This should work as well. essentially bpf_pseudo tries to convert a load map_id to load_imm64 address. So its return value is indeed a ptr. > > - when compiling to IR, the compiler considers pointers as 32-bit (as the > target triple is set to something like armv7l-unknown-linux-gnueabi). But, > when compiling from IR to bpf pointers present in IR code are considered > 64-bit. This needs more explanation probably. Here is an example - > hand-written code, which looks (suspiciously) similar to the code generated > by bcc for dereferences: > > struct teststruct > { > const char *field; > }; > int test_ko(struct pt_regs *ctx) { > void *conn = ctx->uregs[0]; > struct teststruct *t = ctx->uregs[1]; > const char *c = ({ > typeof(void*) _val; > __builtin_memset(&_val, 0, sizeof(_val)); > bpf_probe_read(&_val, sizeof(_val), (char *)t >+ offsetof(struct teststruct, field)); > _val; }); > return ({ > typeof(const char) _val; > __builtin_memset(&_val, 0, sizeof(_val)); > bpf_probe_read(&_val, sizeof(_val), (char *)c); > _val; }); > } > > The interesting part is the first __builtin_memset. After compiling to IR it > becomes this: > > %2 = alloca i8*, align 4 > [...] > %8 = bitcast i8** %2 to i32* > store i32 0, i32* %8, align 4 > > It explicitly sets 4 bytes on the stack to 0, because sizeof(void*) on ARM > is 4. Thus, 'void * val' should be zeroed, right? No. Because after > compiling to final bpf code, i8* allocated at %2 becomes 64-bit and we got > the following code, rejected by the verifier: > > 5: R6=imm0,min_value=0,max_value=0,min_align=2147483648 R10=fp > 5: (63) *(u32 *)(r10 -8) = r6 > [...] > 10: (79) r3 = *(u64 *)(r10 -8) > invalid read from stack off -8+4 size 8 There could be some issues in BPF backend to handle 32bit pointers. I tried your example above, "clang -O2 -target arm -emit-llvm -S t.c" works fine. But llc complains calling convention failure. # llc -march=bpf -filetype=asm t.ll LLVM ERROR: Unsupported calling convention I did not check out why though. > But, if we switch __builtin_memset to explicit initialization, things get > better: > > int test_ok(struct pt_regs *ctx) { > void *conn = ctx->uregs[0]; > struct teststruct *t = ctx->uregs[1]; > const char *c = ({ > typeof(void*) _val = 0; > // instead of __builtin_mem
[iovisor-dev] Bcc on 32-bit ARM
Hi, We're trying to launch bcc on an ARM 32-bit machine. We've managed to cook a version that works, but with a few quirks. Besides adding general setup for compiling on ARM (defining registers, calling convention, syscall number), we made some changes and corrections to type sizes (mostly ulonglong -> size_t in libbcc.py), disabled some parts of kernel headers and disabled 128-bit integer support. (https://github.com/aszyndela/bcc/commit/bed992f2842007ec5563ec75caffe6e03b6bb7c4) Additionally, we encountered the following quirks: - 'memset', which comes from rewriting in libbcc, is redefined in ARM kernel headers (https://elixir.free-electrons.com/linux/latest/source/arch/arm/include/asm/string.h#L29). Redefined code calls __memzero, which is not allowed in bpf code because it is an external function call, so we changed memset to __builtin_memset when the code is rewritten. (https://github.com/aszyndela/bcc/commit/6cc9adc4e6e8cc775b1d08641d0bf9c9f48ce3e2) - return type of bpf_pseudo_fd is 'u64', and it is passed to functions that take pointers; the compiler generates 'trunc' in IR for this, which becomes shift-left + shift-right in final bpf code; and bpf verifier does not like it when pointers are shifted. We worked it around by changing return type of bpf_pseudo_fd/llvm.bpf.pseudo from 'u64' to 'void *' - both in bcc (https://github.com/aszyndela/bcc/commit/8eb8fb2e265b93d38c1ad53efcb7da8883a3d643) and llvm (include/llvm/IR/IntrinsicsBPF.td): def int_bpf_pseudo : GCCBuiltin<"__builtin_bpf_pseudo">, - Intrinsic<[llvm_i64_ty], [llvm_i64_ty, llvm_i64_ty]>; + Intrinsic<[llvm_ptr_ty], [llvm_i64_ty, llvm_i64_ty]>; Obviously we don't know what we're doing, but at least it works for us. - when compiling to IR, the compiler considers pointers as 32-bit (as the target triple is set to something like armv7l-unknown-linux-gnueabi). But, when compiling from IR to bpf pointers present in IR code are considered 64-bit. This needs more explanation probably. Here is an example - hand-written code, which looks (suspiciously) similar to the code generated by bcc for dereferences: struct teststruct { const char *field; }; int test_ko(struct pt_regs *ctx) { void *conn = ctx->uregs[0]; struct teststruct *t = ctx->uregs[1]; const char *c = ({ typeof(void*) _val; __builtin_memset(&_val, 0, sizeof(_val)); bpf_probe_read(&_val, sizeof(_val), (char *)t + offsetof(struct teststruct, field)); _val; }); return ({ typeof(const char) _val; __builtin_memset(&_val, 0, sizeof(_val)); bpf_probe_read(&_val, sizeof(_val), (char *)c); _val; }); } The interesting part is the first __builtin_memset. After compiling to IR it becomes this: %2 = alloca i8*, align 4 [...] %8 = bitcast i8** %2 to i32* store i32 0, i32* %8, align 4 It explicitly sets 4 bytes on the stack to 0, because sizeof(void*) on ARM is 4. Thus, 'void * val' should be zeroed, right? No. Because after compiling to final bpf code, i8* allocated at %2 becomes 64-bit and we got the following code, rejected by the verifier: 5: R6=imm0,min_value=0,max_value=0,min_align=2147483648 R10=fp 5: (63) *(u32 *)(r10 -8) = r6 [...] 10: (79) r3 = *(u64 *)(r10 -8) invalid read from stack off -8+4 size 8 But, if we switch __builtin_memset to explicit initialization, things get better: int test_ok(struct pt_regs *ctx) { void *conn = ctx->uregs[0]; struct teststruct *t = ctx->uregs[1]; const char *c = ({ typeof(void*) _val = 0; // instead of __builtin_memset(&_val, 0, sizeof(_val)); bpf_probe_read(&_val, sizeof(_val), (char *)t + offsetof(struct teststruct, field)); _val; }); return ({ typeof(const char) _val; __builtin_memset(&_val, 0, sizeof(_val)); bpf_probe_read(&_val, sizeof(_val), (char *)c); _val; }); } IR is: %2 = alloca i8*, align 4 [...] store i8* null, i8** %2, align 4, !tbaa !14 And final bpf code is: 5: R6=imm0,min_value=0,max_value=0,min_align=2147483648 R10=fp 5: (7b) *(u64 *)(r10 -8) = r6 [...] 10: (79) r3 = *(u64 *)(r10 -8) Of course, as a workaround for this we explicitly use bpf_probe_read with explicit initialization in our tools. Not pretty at all... With those changes and workarounds applied we are able to run selected bcc tools on 32-bit ARM. It's very hacky, so our motivation is to fix it properly, and possibly have it integrated upstream. There is also a couple of issues with uprobes on 32-bit ARM, but that's a different story. What do you guys think about all of this? I would be grateful for comments. The changes we've made are available at https://github.com/aszyndela/bcc Thanks, Adrian ___ iovisor-dev mailing list iovisor-dev@lists.iovisor.org https://lis