[PATCH] bpf, x32: Fix regression caused by commit 24dea04767e6
Commit 24dea04767e6 ("bpf, x32: remove ld_abs/ld_ind") removed the
4-byte "extra space for skb_copy_bits buffer" from _STACK_SIZE, but it
didn't adjust the corresponding code in emit_prologue and emit_epilogue,
and this mismatch causes very strange kernel runtime errors. This patch
fixes it.

Fixes: 24dea04767e6 ("bpf, x32: remove ld_abs/ld_ind")
Signed-off-by: Wang YanQing
---
 arch/x86/net/bpf_jit_comp32.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 5579987..8f6cc71 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -1441,8 +1441,8 @@ static void emit_prologue(u8 **pprog, u32 stack_depth)
 	/* sub esp,STACK_SIZE */
 	EMIT2_off32(0x81, 0xEC, STACK_SIZE);
-	/* sub ebp,SCRATCH_SIZE+4+12*/
-	EMIT3(0x83, add_1reg(0xE8, IA32_EBP), SCRATCH_SIZE + 16);
+	/* sub ebp,SCRATCH_SIZE+12*/
+	EMIT3(0x83, add_1reg(0xE8, IA32_EBP), SCRATCH_SIZE + 12);
 	/* xor ebx,ebx */
 	EMIT2(0x31, add_2reg(0xC0, IA32_EBX, IA32_EBX));
@@ -1475,8 +1475,8 @@ static void emit_epilogue(u8 **pprog, u32 stack_depth)
 	/* mov edx,dword ptr [ebp+off]*/
 	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX), STACK_VAR(r0[1]));
-	/* add ebp,SCRATCH_SIZE+4+12*/
-	EMIT3(0x83, add_1reg(0xC0, IA32_EBP), SCRATCH_SIZE + 16);
+	/* add ebp,SCRATCH_SIZE+12*/
+	EMIT3(0x83, add_1reg(0xC0, IA32_EBP), SCRATCH_SIZE + 12);
 	/* mov ebx,dword ptr [ebp-12]*/
 	EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX), -12);
-- 
1.8.5.6.2.g3d8a54e.dirty
[PATCH] bpf, doc: clarification for the meaning of 'id'
For me, as a reader whose mother tongue isn't English, the old wording
was a little hard to follow, so this patch rewords the subsection to
make the meaning clearer. It also adds blank lines as separators in two
places to improve readability.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Documentation/networking/filter.txt | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index 5032e12..e6b4ebb 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -1142,6 +1142,7 @@ into a register from memory, the register's top 56 bits are known zero, while
 the low 8 are unknown - which is represented as the tnum (0x0; 0xff).  If we
 then OR this with 0x40, we get (0x40; 0xbf), then if we add 1 we get (0x0;
 0x1ff), because of potential carries.
+
 Besides arithmetic, the register state can also be updated by conditional
 branches.  For instance, if a SCALAR_VALUE is compared > 8, in the 'true' branch
 it will have a umin_value (unsigned minimum value) of 9, whereas in the 'false'
@@ -1150,14 +1151,16 @@ BPF_JSGE) would instead update the signed minimum/maximum values.  Information
 from the signed and unsigned bounds can be combined; for instance if a value is
 first tested < 8 and then tested s> 4, the verifier will conclude that the value
 is also > 4 and s< 8, since the bounds prevent crossing the sign boundary.
+
 PTR_TO_PACKETs with a variable offset part have an 'id', which is common to all
 pointers sharing that same variable offset.  This is important for packet range
-checks: after adding some variable to a packet pointer, if you then copy it to
-another register and (say) add a constant 4, both registers will share the same
-'id' but one will have a fixed offset of +4.  Then if it is bounds-checked and
-found to be less than a PTR_TO_PACKET_END, the other register is now known to
-have a safe range of at least 4 bytes.  See 'Direct packet access', below, for
-more on PTR_TO_PACKET ranges.
+checks: after adding a variable to a packet pointer register A, if you then copy
+it to another register B and then add a constant 4 to A, both registers will
+share the same 'id' but A will have a fixed offset of +4.  Then if A is
+bounds-checked and found to be less than a PTR_TO_PACKET_END, register B is
+now known to have a safe range of at least 4 bytes.  See 'Direct packet access',
+below, for more on PTR_TO_PACKET ranges.
+
 The 'id' field is also used on PTR_TO_MAP_VALUE_OR_NULL, common to all copies of
 the pointer returned from a map lookup.  This means that when one copy is
 checked and found to be non-NULL, all copies can become PTR_TO_MAP_VALUEs.
-- 
1.8.5.6.2.g3d8a54e.dirty
Re: [PATCH] bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform
On Sat, Apr 28, 2018 at 01:29:17PM +0800, Wang YanQing wrote:
> On Sat, Apr 28, 2018 at 01:33:15AM +0200, Daniel Borkmann wrote:
> > On 04/28/2018 12:48 AM, Alexei Starovoitov wrote:
> > > On Thu, Apr 26, 2018 at 05:57:49PM +0800, Wang YanQing wrote:
> > >> All the testcases for the BPF_PROG_TYPE_PERF_EVENT program type in
> > >> test_verifier (kselftest) report the errors below on x86_32:
> > >> "
> > >> 172/p unpriv: spill/fill of different pointers ldx FAIL
> > >> Unexpected error message!
> > >> 0: (bf) r6 = r10
> > >> 1: (07) r6 += -8
> > >> 2: (15) if r1 == 0x0 goto pc+3
> > >> R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
> > >> 3: (bf) r2 = r10
> > >> 4: (07) r2 += -76
> > >> 5: (7b) *(u64 *)(r6 +0) = r2
> > >> 6: (55) if r1 != 0x0 goto pc+1
> > >> R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1
> > >> R10=fp0,call_-1 fp-8=fp
> > >> 7: (7b) *(u64 *)(r6 +0) = r1
> > >> 8: (79) r1 = *(u64 *)(r6 +0)
> > >> 9: (79) r1 = *(u64 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=8
> > >>
> > >> 378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
> > >> Failed to load prog 'Permission denied'!
> > >> 0: (b7) r0 = 0
> > >> 1: (71) r0 = *(u8 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=1
> > >>
> > >> 379/p check bpf_perf_event_data->sample_period half load permitted FAIL
> > >> Failed to load prog 'Permission denied'!
> > >> 0: (b7) r0 = 0
> > >> 1: (69) r0 = *(u16 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=2
> > >>
> > >> 380/p check bpf_perf_event_data->sample_period word load permitted FAIL
> > >> Failed to load prog 'Permission denied'!
> > >> 0: (b7) r0 = 0
> > >> 1: (61) r0 = *(u32 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=4
> > >>
> > >> 381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
> > >> Failed to load prog 'Permission denied'!
> > >> 0: (b7) r0 = 0
> > >> 1: (79) r0 = *(u64 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=8
> > >> "
> > >>
> > >> This patch fixes it. The fix isn't only needed on x86_32; it fixes
> > >> the same problem on any other platform whose size of
> > >> bpf_user_pt_regs_t isn't a multiple of 8.
> > >>
> > >> Signed-off-by: Wang YanQing <udkni...@gmail.com>
> > >> ---
> > >> Hi all!
> > >> Once mainline accepts this patch, we need to submit a sync patch
> > >> to update tools/include/uapi/linux/bpf_perf_event.h.
> > >>
> > >> Thanks.
> > >>
> > >>  include/uapi/linux/bpf_perf_event.h | 2 +-
> > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/include/uapi/linux/bpf_perf_event.h b/include/uapi/linux/bpf_perf_event.h
> > >> index eb1b9d2..ff4c092 100644
> > >> --- a/include/uapi/linux/bpf_perf_event.h
> > >> +++ b/include/uapi/linux/bpf_perf_event.h
> > >> @@ -12,7 +12,7 @@
> > >>
> > >>  struct bpf_perf_event_data {
> > >>  	bpf_user_pt_regs_t regs;
> > >> -	__u64 sample_period;
> > >> +	__u64 sample_period __attribute__((aligned(8)));
> > >
> > > I don't think this is necessary.
> > > imo it's a bug in pe_prog_is_valid_access
> > > that should have allowed 8-byte access to 4-byte aligned sample_period.
> > > The access is rewritten by pe_prog_convert_ctx_access anyway,
> > > no alignment issues as far as I can see.
> >
> > Right, good point. Wang, could you give the below a test run:
> >
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 56ba0f2..95b9142 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -833,8 +833,14 @@ static bool pe_prog_is_valid_access(int off, int size,
> > 					enum bpf_access_type type
> > 		return false;
> > 	if (type != BPF_READ)
> > 		return false;
> > -	if (off % size != 0)
> > -		return false;
> > +	if (off % size != 0) {
> > +		if (sizeof(long) != 4)
> > +			return false;
> > +		if (size != 8)
> > +			return false;
> > +		if (off % size != 4)
> > +			return false;
> > +	}
> >
> > 	switch (off) {
> > 	case bpf_ctx_range(struct bpf_perf_event_data, sample_period):
>
> Hi all!
>
> I have tested this patch, but test_verifier reports the same errors
> for the five testcases.
>
> The reason is that they all fail the bpf_ctx_narrow_access_ok test.
>
> Thanks.

Hi Daniel,

Do you have any plan to fix bpf_ctx_narrow_access_ok for these problems?

Thanks.
[PATCH v6] bpf, x86_32: add eBPF JIT compiler for ia32
The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it
supports eBPF only; classic BPF is supported thanks to the conversion
performed by the BPF core.

Almost all instructions from the eBPF ISA are supported, except the
following:

 BPF_ALU64 | BPF_DIV | BPF_K
 BPF_ALU64 | BPF_DIV | BPF_X
 BPF_ALU64 | BPF_MOD | BPF_K
 BPF_ALU64 | BPF_MOD | BPF_X
 BPF_STX | BPF_XADD | BPF_W
 BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has only a few general purpose registers: EAX|EDX|ECX|EBX|ESI|EDI.
I use EAX|EDX|ECX|EBX as temporary registers to simulate the
instructions of the eBPF ISA, and allocate ESI|EDI to BPF_REG_AX for
constant blinding; all other eBPF registers, R0-R10, are simulated
through scratch space on the stack.

The reasons behind this hardware register allocation policy are:

1. MUL needs EAX:EDX and shift operations need ECX, so they aren't
   suitable for general eBPF 64-bit register simulation.
2. We need at least 4 registers to simulate most eBPF ISA operations
   on register operands rather than going through the stack.
3. We need to keep BPF_REG_AX in hardware registers, otherwise constant
   blinding degrades JIT performance heavily.

Tested on a PC (Intel(R) Core(TM) i5-5200U CPU).

Testing results on i5-5200U:

1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 83 PASSED, 0 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 828 PASSED, 0 FAILED.

The tests above were all run under the following two conditions
separately:

1. bpf_jit_enable=1 and bpf_jit_harden=0
2. bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this JIT implementation:

Note: I ran test_progs in kselftest 100 times continuously for every
condition; the numbers are in the format total/times=avg. The numbers
that test_bpf reports show almost the same relation.

a: jit_enable=0 and jit_harden=0           b: jit_enable=1 and jit_harden=0
test_pkt_access:PASS:ipv4:15622/100=156    test_pkt_access:PASS:ipv4:10674/100=106
test_pkt_access:PASS:ipv6:9130/100=91      test_pkt_access:PASS:ipv6:4855/100=48
test_xdp:PASS:ipv4:240198/100=2401         test_xdp:PASS:ipv4:138912/100=1389
test_xdp:PASS:ipv6:137326/100=1373         test_xdp:PASS:ipv6:68542/100=685
test_l4lb:PASS:ipv4:61100/100=611          test_l4lb:PASS:ipv4:37302/100=373
test_l4lb:PASS:ipv6:101000/100=1010        test_l4lb:PASS:ipv6:55030/100=550

c: jit_enable=1 and jit_harden=2
test_pkt_access:PASS:ipv4:10558/100=105
test_pkt_access:PASS:ipv6:5092/100=50
test_xdp:PASS:ipv4:131902/100=1319
test_xdp:PASS:ipv6:77932/100=779
test_l4lb:PASS:ipv4:38924/100=389
test_l4lb:PASS:ipv6:57520/100=575

The numbers show that we get a 30%~50% improvement.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
Changes v5-v6:
1. Add do {} while (0) to RETPOLINE_RAX_BPF_JIT for consistency reasons.
2. Clean up non-standard comments, reported by Daniel Borkmann.
3. Fix a memory leak issue, reported by Daniel Borkmann.

Changes v4-v5:
1. Delete is_on_stack; BPF_REG_AX is the only register kept in real
   hardware registers, so just check against it.
2. Apply commit 1612a981b766 ("bpf, x64: fix JIT emission for dead
   code"), suggested by Daniel Borkmann.

Changes v3-v4:
1. Fix changelog in commit. After installing llvm-6.0, test_progs won't
   report errors. I submitted another patch, "bpf: fix misaligned access
   for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform", to fix
   a separate problem; with that patch, test_verifier won't report
   errors either.
2. Fix clearing r0[1] twice unnecessarily in the *BPF_IND|BPF_ABS*
   simulation.

Changes v2-v3:
1. Move BPF_REG_AX to real hardware registers for performance reasons.
2. Use bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel
   Borkmann.
3. Delete partial codes in 1c2a088a6626, suggested by Daniel Borkmann.
4. Some bug fixes and comment improvements.

Changes v1-v2:
1. Fix bug in emit_ia32_neg64.
2. Fix bug in emit_ia32_arsh_r64.
3. Delete filename in top-level comment, suggested by Thomas Gleixner.
4. Delete unnecessary boilerplate text, suggested by Thomas Gleixner.
5. Reword some parts of the changelog.
6. CodingStyle improvements and a few more comments.

 arch/x86/Kconfig                     |    2 +-
 arch/x86/include/asm/nospec-branch.h |   30 +-
 arch/x86/net/Makefile                |    9 +-
 arch/x86/net/bpf_jit_comp32.c        | 2553 ++++++++++++++++++++++++++++++++
 4 files changed, 2588 insertions(+), 6 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c07f492..d51a71d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -138,7 +138,7 @@ config X86
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_EBPF_JIT			if X86_64
+	select HAVE_EBPF_JIT
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
[PATCH v5] bpf, x86_32: add eBPF JIT compiler for ia32
The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it
supports eBPF only; classic BPF is supported thanks to the conversion
performed by the BPF core.

Almost all instructions from the eBPF ISA are supported, except the
following:

 BPF_ALU64 | BPF_DIV | BPF_K
 BPF_ALU64 | BPF_DIV | BPF_X
 BPF_ALU64 | BPF_MOD | BPF_K
 BPF_ALU64 | BPF_MOD | BPF_X
 BPF_STX | BPF_XADD | BPF_W
 BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has only a few general purpose registers: EAX|EDX|ECX|EBX|ESI|EDI.
I use EAX|EDX|ECX|EBX as temporary registers to simulate the
instructions of the eBPF ISA, and allocate ESI|EDI to BPF_REG_AX for
constant blinding; all other eBPF registers, R0-R10, are simulated
through scratch space on the stack.

The reasons behind this hardware register allocation policy are:

1. MUL needs EAX:EDX and shift operations need ECX, so they aren't
   suitable for general eBPF 64-bit register simulation.
2. We need at least 4 registers to simulate most eBPF ISA operations
   on register operands rather than going through the stack.
3. We need to keep BPF_REG_AX in hardware registers, otherwise constant
   blinding degrades JIT performance heavily.

Tested on a PC (Intel(R) Core(TM) i5-5200U CPU).

Testing results on i5-5200U:

1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 83 PASSED, 0 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 828 PASSED, 0 FAILED.

The tests above were all run under the following two conditions
separately:

1. bpf_jit_enable=1 and bpf_jit_harden=0
2. bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this JIT implementation:

Note: I ran test_progs in kselftest 100 times continuously for every
condition; the numbers are in the format total/times=avg. The numbers
that test_bpf reports show almost the same relation.

a: jit_enable=0 and jit_harden=0           b: jit_enable=1 and jit_harden=0
test_pkt_access:PASS:ipv4:15622/100=156    test_pkt_access:PASS:ipv4:10674/100=106
test_pkt_access:PASS:ipv6:9130/100=91      test_pkt_access:PASS:ipv6:4855/100=48
test_xdp:PASS:ipv4:240198/100=2401         test_xdp:PASS:ipv4:138912/100=1389
test_xdp:PASS:ipv6:137326/100=1373         test_xdp:PASS:ipv6:68542/100=685
test_l4lb:PASS:ipv4:61100/100=611          test_l4lb:PASS:ipv4:37302/100=373
test_l4lb:PASS:ipv6:101000/100=1010        test_l4lb:PASS:ipv6:55030/100=550

c: jit_enable=1 and jit_harden=2
test_pkt_access:PASS:ipv4:10558/100=105
test_pkt_access:PASS:ipv6:5092/100=50
test_xdp:PASS:ipv4:131902/100=1319
test_xdp:PASS:ipv6:77932/100=779
test_l4lb:PASS:ipv4:38924/100=389
test_l4lb:PASS:ipv6:57520/100=575

The numbers show that we get a 30%~50% improvement.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
Changes v4-v5:
1. Delete is_on_stack; BPF_REG_AX is the only register kept in real
   hardware registers, so just check against it.
2. Apply commit 1612a981b766 ("bpf, x64: fix JIT emission for dead
   code"), suggested by Daniel Borkmann.

Changes v3-v4:
1. Fix changelog in commit. After installing llvm-6.0, test_progs won't
   report errors. I submitted another patch, "bpf: fix misaligned access
   for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform", to fix
   a separate problem; with that patch, test_verifier won't report
   errors either.
2. Fix clearing r0[1] twice unnecessarily in the *BPF_IND|BPF_ABS*
   simulation.

Changes v2-v3:
1. Move BPF_REG_AX to real hardware registers for performance reasons.
2. Use bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel
   Borkmann.
3. Delete partial codes in 1c2a088a6626, suggested by Daniel Borkmann.
4. Some bug fixes and comment improvements.

Changes v1-v2:
1. Fix bug in emit_ia32_neg64.
2. Fix bug in emit_ia32_arsh_r64.
3. Delete filename in top-level comment, suggested by Thomas Gleixner.
4. Delete unnecessary boilerplate text, suggested by Thomas Gleixner.
5. Reword some parts of the changelog.
6. CodingStyle improvements and a few more comments.

 arch/x86/Kconfig                     |    2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile                |    9 +-
 arch/x86/net/bpf_jit_comp32.c        | 2527 ++++++++++++++++++++++++++++++++
 4 files changed, 2559 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_EBPF_JIT			if X86_64
+	select HAVE_EBPF_JIT
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_EXIT_THREAD
 	select HAVE_FENTRY			if X86_64 || DYNAMIC_FTRACE
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f928ad9..a4c
Re: [PATCH] bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform
On Sat, Apr 28, 2018 at 01:33:15AM +0200, Daniel Borkmann wrote:
> On 04/28/2018 12:48 AM, Alexei Starovoitov wrote:
> > On Thu, Apr 26, 2018 at 05:57:49PM +0800, Wang YanQing wrote:
> >> All the testcases for the BPF_PROG_TYPE_PERF_EVENT program type in
> >> test_verifier (kselftest) report the errors below on x86_32:
> >> "
> >> 172/p unpriv: spill/fill of different pointers ldx FAIL
> >> Unexpected error message!
> >> 0: (bf) r6 = r10
> >> 1: (07) r6 += -8
> >> 2: (15) if r1 == 0x0 goto pc+3
> >> R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
> >> 3: (bf) r2 = r10
> >> 4: (07) r2 += -76
> >> 5: (7b) *(u64 *)(r6 +0) = r2
> >> 6: (55) if r1 != 0x0 goto pc+1
> >> R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1
> >> fp-8=fp
> >> 7: (7b) *(u64 *)(r6 +0) = r1
> >> 8: (79) r1 = *(u64 *)(r6 +0)
> >> 9: (79) r1 = *(u64 *)(r1 +68)
> >> invalid bpf_context access off=68 size=8
> >>
> >> 378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
> >> Failed to load prog 'Permission denied'!
> >> 0: (b7) r0 = 0
> >> 1: (71) r0 = *(u8 *)(r1 +68)
> >> invalid bpf_context access off=68 size=1
> >>
> >> 379/p check bpf_perf_event_data->sample_period half load permitted FAIL
> >> Failed to load prog 'Permission denied'!
> >> 0: (b7) r0 = 0
> >> 1: (69) r0 = *(u16 *)(r1 +68)
> >> invalid bpf_context access off=68 size=2
> >>
> >> 380/p check bpf_perf_event_data->sample_period word load permitted FAIL
> >> Failed to load prog 'Permission denied'!
> >> 0: (b7) r0 = 0
> >> 1: (61) r0 = *(u32 *)(r1 +68)
> >> invalid bpf_context access off=68 size=4
> >>
> >> 381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
> >> Failed to load prog 'Permission denied'!
> >> 0: (b7) r0 = 0
> >> 1: (79) r0 = *(u64 *)(r1 +68)
> >> invalid bpf_context access off=68 size=8
> >> "
> >>
> >> This patch fixes it. The fix isn't only needed on x86_32; it fixes
> >> the same problem on any other platform whose size of
> >> bpf_user_pt_regs_t isn't a multiple of 8.
> >>
> >> Signed-off-by: Wang YanQing <udkni...@gmail.com>
> >> ---
> >> Hi all!
> >> Once mainline accepts this patch, we need to submit a sync patch
> >> to update the tools/include/uapi/linux/bpf_perf_event.h.
> >>
> >> Thanks.
> >>
> >>  include/uapi/linux/bpf_perf_event.h | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/include/uapi/linux/bpf_perf_event.h b/include/uapi/linux/bpf_perf_event.h
> >> index eb1b9d2..ff4c092 100644
> >> --- a/include/uapi/linux/bpf_perf_event.h
> >> +++ b/include/uapi/linux/bpf_perf_event.h
> >> @@ -12,7 +12,7 @@
> >>
> >>  struct bpf_perf_event_data {
> >>  	bpf_user_pt_regs_t regs;
> >> -	__u64 sample_period;
> >> +	__u64 sample_period __attribute__((aligned(8)));
> >
> > I don't think this is necessary.
> > imo it's a bug in pe_prog_is_valid_access
> > that should have allowed 8-byte access to 4-byte aligned sample_period.
> > The access is rewritten by pe_prog_convert_ctx_access anyway,
> > no alignment issues as far as I can see.
>
> Right, good point. Wang, could you give the below a test run:
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 56ba0f2..95b9142 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -833,8 +833,14 @@ static bool pe_prog_is_valid_access(int off, int size,
> 					enum bpf_access_type type
> 		return false;
> 	if (type != BPF_READ)
> 		return false;
> -	if (off % size != 0)
> -		return false;
> +	if (off % size != 0) {
> +		if (sizeof(long) != 4)
> +			return false;
> +		if (size != 8)
> +			return false;
> +		if (off % size != 4)
> +			return false;
> +	}
>
> 	switch (off) {
> 	case bpf_ctx_range(struct bpf_perf_event_data, sample_period):

Hi all!

I have tested this patch, but test_verifier reports the same errors
for the five testcases.

The reason is that they all fail the bpf_ctx_narrow_access_ok test.

Thanks.
[PATCH v4] bpf, x86_32: add eBPF JIT compiler for ia32
The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it
supports eBPF only; classic BPF is supported thanks to the conversion
performed by the BPF core.

Almost all instructions from the eBPF ISA are supported, except the
following:

 BPF_ALU64 | BPF_DIV | BPF_K
 BPF_ALU64 | BPF_DIV | BPF_X
 BPF_ALU64 | BPF_MOD | BPF_K
 BPF_ALU64 | BPF_MOD | BPF_X
 BPF_STX | BPF_XADD | BPF_W
 BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has only a few general purpose registers: EAX|EDX|ECX|EBX|ESI|EDI.
I use EAX|EDX|ECX|EBX as temporary registers to simulate the
instructions of the eBPF ISA, and allocate ESI|EDI to BPF_REG_AX for
constant blinding; all other eBPF registers, R0-R10, are simulated
through scratch space on the stack.

The reasons behind this hardware register allocation policy are:

1. MUL needs EAX:EDX and shift operations need ECX, so they aren't
   suitable for general eBPF 64-bit register simulation.
2. We need at least 4 registers to simulate most eBPF ISA operations
   on register operands rather than going through the stack.
3. We need to keep BPF_REG_AX in hardware registers, otherwise constant
   blinding degrades JIT performance heavily.

Tested on a PC (Intel(R) Core(TM) i5-5200U CPU).

Testing results on i5-5200U:

1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 83 PASSED, 0 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 828 PASSED, 0 FAILED.

The tests above were all run under the following two conditions
separately:

1. bpf_jit_enable=1 and bpf_jit_harden=0
2. bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this JIT implementation:

Note: I ran test_progs in kselftest 100 times continuously for every
condition; the numbers are in the format total/times=avg. The numbers
that test_bpf reports show almost the same relation.

a: jit_enable=0 and jit_harden=0           b: jit_enable=1 and jit_harden=0
test_pkt_access:PASS:ipv4:15622/100=156    test_pkt_access:PASS:ipv4:10674/100=106
test_pkt_access:PASS:ipv6:9130/100=91      test_pkt_access:PASS:ipv6:4855/100=48
test_xdp:PASS:ipv4:240198/100=2401         test_xdp:PASS:ipv4:138912/100=1389
test_xdp:PASS:ipv6:137326/100=1373         test_xdp:PASS:ipv6:68542/100=685
test_l4lb:PASS:ipv4:61100/100=611          test_l4lb:PASS:ipv4:37302/100=373
test_l4lb:PASS:ipv6:101000/100=1010        test_l4lb:PASS:ipv6:55030/100=550

c: jit_enable=1 and jit_harden=2
test_pkt_access:PASS:ipv4:10558/100=105
test_pkt_access:PASS:ipv6:5092/100=50
test_xdp:PASS:ipv4:131902/100=1319
test_xdp:PASS:ipv6:77932/100=779
test_l4lb:PASS:ipv4:38924/100=389
test_l4lb:PASS:ipv6:57520/100=575

The numbers show that we get a 30%~50% improvement.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
Changes v3-v4:
1. Fix changelog in commit. After installing llvm-6.0, test_progs won't
   report errors. I submitted another patch, "bpf: fix misaligned access
   for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform", to fix
   a separate problem; with that patch, test_verifier won't report
   errors either.
2. Fix clearing r0[1] twice unnecessarily in the *BPF_IND|BPF_ABS*
   simulation.

Changes v2-v3:
1. Move BPF_REG_AX to real hardware registers for performance reasons.
2. Use bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel
   Borkmann.
3. Delete partial codes in 1c2a088a6626, suggested by Daniel Borkmann.
4. Some bug fixes and comment improvements.

Changes v1-v2:
1. Fix bug in emit_ia32_neg64.
2. Fix bug in emit_ia32_arsh_r64.
3. Delete filename in top-level comment, suggested by Thomas Gleixner.
4. Delete unnecessary boilerplate text, suggested by Thomas Gleixner.
5. Reword some parts of the changelog.
6. CodingStyle improvements and a few more comments.

Hi all!
This is the fourth version of this patch, and I think it is good enough
to go in. Is there any more feedback?

Thanks.

 arch/x86/Kconfig                     |    2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile                |    9 +-
 arch/x86/net/bpf_jit_comp32.c        | 2530 ++++++++++++++++++++++++++++++++
 4 files changed, 2562 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_EBPF_JIT			if X86_64
+	select HAVE_EBPF_JIT
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_EXIT_THREAD
 	select HAVE_FENTRY			if X86_64 || DYNAMIC_FTRACE
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f928ad9..a4c7ca4 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -291,14 +291,17
[PATCH] bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform
All the testcases for the BPF_PROG_TYPE_PERF_EVENT program type in
test_verifier (kselftest) report the errors below on x86_32:
"
172/p unpriv: spill/fill of different pointers ldx FAIL
Unexpected error message!
0: (bf) r6 = r10
1: (07) r6 += -8
2: (15) if r1 == 0x0 goto pc+3
R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
3: (bf) r2 = r10
4: (07) r2 += -76
5: (7b) *(u64 *)(r6 +0) = r2
6: (55) if r1 != 0x0 goto pc+1
R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1
fp-8=fp
7: (7b) *(u64 *)(r6 +0) = r1
8: (79) r1 = *(u64 *)(r6 +0)
9: (79) r1 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8

378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (71) r0 = *(u8 *)(r1 +68)
invalid bpf_context access off=68 size=1

379/p check bpf_perf_event_data->sample_period half load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (69) r0 = *(u16 *)(r1 +68)
invalid bpf_context access off=68 size=2

380/p check bpf_perf_event_data->sample_period word load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (61) r0 = *(u32 *)(r1 +68)
invalid bpf_context access off=68 size=4

381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (79) r0 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8
"

This patch fixes it. The fix isn't only needed on x86_32; it fixes the
same problem on any other platform whose size of bpf_user_pt_regs_t
isn't a multiple of 8.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
Hi all!
Once mainline accepts this patch, we need to submit a sync patch
to update the tools/include/uapi/linux/bpf_perf_event.h.

Thanks.

 include/uapi/linux/bpf_perf_event.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf_perf_event.h b/include/uapi/linux/bpf_perf_event.h
index eb1b9d2..ff4c092 100644
--- a/include/uapi/linux/bpf_perf_event.h
+++ b/include/uapi/linux/bpf_perf_event.h
@@ -12,7 +12,7 @@

 struct bpf_perf_event_data {
 	bpf_user_pt_regs_t regs;
-	__u64 sample_period;
+	__u64 sample_period __attribute__((aligned(8)));
 	__u64 addr;
 };
-- 
1.8.5.6.2.g3d8a54e.dirty
Re: [PATCH v2] bpf, x86_32: add eBPF JIT compiler for ia32
On Wed, Apr 25, 2018 at 02:11:16AM +0200, Daniel Borkmann wrote:
> On 04/19/2018 05:54 PM, Wang YanQing wrote:
> > Testing results on i5-5200U:
> >
> > 1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
> > 2) test_progs: Summary: 81 PASSED, 2 FAILED.
> >    test_progs reports "libbpf: incorrect bpf_call opcode" for
> >    test_l4lb_noinline and test_xdp_noinline, because there is
> >    no llvm-6.0 on my machine, and the current implementation doesn't
> >    support BPF_PSEUDO_CALL, so I think we can ignore the two failed
> >    testcases.
> > 3) test_lpm: OK
> > 4) test_lru_map: OK
> > 5) test_verifier: Summary: 823 PASSED, 5 FAILED
> >    test_verifier reports "invalid bpf_context access off=68 size=1/2/4/8"
> >    for all the 5 FAILED testcases with/without jit; we need to fix the
> >    failed testcases themselves instead of this jit.
>
> Can you elaborate further on these? Looks like this definitely needs
> fixing on 32 bit. Would be great to get a better understanding of the
> underlying bug(s) and properly fix them.

Hi Daniel Borkmann, here is the detailed log for the failed testcases.

linux: Gentoo 32 bit
llvm:
~ # llc --version
LLVM (http://llvm.org/):
  LLVM version 4.0.1
  Optimized build.
  Default target: i686-pc-linux-gnu
  Host CPU: broadwell

  Registered Targets:
    amdgcn  - AMD GCN GPUs
    bpf     - BPF (host endian)
    bpfeb   - BPF (big endian)
    bpfel   - BPF (little endian)
    nvptx   - NVIDIA PTX 32-bit
    nvptx64 - NVIDIA PTX 64-bit
    r600    - AMD GPUs HD2XXX-HD6XXX
    x86     - 32-bit X86: Pentium-Pro and above
    x86-64  - 64-bit X86: EM64T and AMD64

~ # clang --version
clang version 4.0.1 (tags/RELEASE_401/final)
Target: i686-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/4/bin

kernel version: 4.16.2

test program: test_verifier in kselftest
condition: bpf_jit_enable=0, bpf_jit_harden=0
log:
#172/p unpriv: spill/fill of different pointers ldx FAIL
Unexpected error message!
0: (bf) r6 = r10
1: (07) r6 += -8
2: (15) if r1 == 0x0 goto pc+3
R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
3: (bf) r2 = r10
4: (07) r2 += -76
5: (7b) *(u64 *)(r6 +0) = r2
6: (55) if r1 != 0x0 goto pc+1
R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1
fp-8=fp
7: (7b) *(u64 *)(r6 +0) = r1
8: (79) r1 = *(u64 *)(r6 +0)
9: (79) r1 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8

#378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (71) r0 = *(u8 *)(r1 +68)
invalid bpf_context access off=68 size=1

#379/p check bpf_perf_event_data->sample_period half load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (69) r0 = *(u16 *)(r1 +68)
invalid bpf_context access off=68 size=2

#380/p check bpf_perf_event_data->sample_period word load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (61) r0 = *(u32 *)(r1 +68)
invalid bpf_context access off=68 size=4

#381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (79) r0 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8

test program: test_progs
condition: bpf_jit_enable=0, bpf_jit_harden=0
bpf # ./test_progs
test_pkt_access:PASS:ipv4 53 nsec
test_pkt_access:PASS:ipv6 47 nsec
test_xdp:PASS:ipv4 1281 nsec
test_xdp:PASS:ipv6 749 nsec
test_l4lb:PASS:ipv4 427 nsec
test_l4lb:PASS:ipv6 562 nsec
libbpf: incorrect bpf_call opcode  <= caused by ./test_l4lb_noinline.o in function test_l4lb_all
libbpf: incorrect bpf_call opcode  <= caused by ././test_xdp_noinline.o in function test_xdp_noinline

Thanks.
[PATCH v3] bpf, x86_32: add eBPF JIT compiler for ia32
The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it supports eBPF only; classic BPF is supported through the conversion performed by the BPF core.

Almost all instructions from the eBPF ISA are supported, except the following:

BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has only a few general purpose registers: EAX|EDX|ECX|EBX|ESI|EDI. I use EAX|EDX|ECX|EBX as temporary registers to simulate instructions of the eBPF ISA, and allocate ESI|EDI to BPF_REG_AX for constant blinding; all other eBPF registers, R0-R10, are simulated through scratch space on the stack.

The reasons behind this hardware register allocation policy are:
1: MUL needs EAX:EDX and shift operations need ECX, so they aren't fit
   for general eBPF 64-bit register simulation.
2: We need at least 4 registers to simulate most eBPF ISA operations on
   register operands instead of on stack operands.
3: We need to keep BPF_REG_AX in hardware registers, or constant
   blinding will degrade JIT performance heavily.

Tested on PC (Intel(R) Core(TM) i5-5200U CPU).

Testing results on i5-5200U:

1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 81 PASSED, 2 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 823 PASSED, 5 FAILED
   Note: these failed testcases are caused by unrelated reasons; we need
   to fix the testcases themselves, not this JIT: "invalid bpf_context
   access" and "libbpf: incorrect bpf_call opcode".

The above tests are all done under the following two conditions separately:
1: bpf_jit_enable=1 and bpf_jit_harden=0
2: bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this JIT implementation:
Note:
I ran test_progs in kselftest 100 times continuously for every
condition; the numbers are in the format total/times=avg. The numbers
that test_bpf reports show almost the same relation.
a: jit_enable=0 and jit_harden=0
   test_pkt_access:PASS:ipv4:15622/100=156
   test_pkt_access:PASS:ipv6:9130/100=91
   test_xdp:PASS:ipv4:240198/100=2401
   test_xdp:PASS:ipv6:137326/100=1373
   test_l4lb:PASS:ipv4:61100/100=611
   test_l4lb:PASS:ipv6:101000/100=1010

b: jit_enable=1 and jit_harden=0
   test_pkt_access:PASS:ipv4:10674/100=106
   test_pkt_access:PASS:ipv6:4855/100=48
   test_xdp:PASS:ipv4:138912/100=1389
   test_xdp:PASS:ipv6:68542/100=685
   test_l4lb:PASS:ipv4:37302/100=373
   test_l4lb:PASS:ipv6:55030/100=550

c: jit_enable=1 and jit_harden=2
   test_pkt_access:PASS:ipv4:10558/100=105
   test_pkt_access:PASS:ipv6:5092/100=50
   test_xdp:PASS:ipv4:131902/100=1319
   test_xdp:PASS:ipv6:77932/100=779
   test_l4lb:PASS:ipv4:38924/100=389
   test_l4lb:PASS:ipv6:57520/100=575

The numbers show we get a 30%~50% improvement.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
Changes v2-v3:
 1: Move BPF_REG_AX to real hardware registers for performance reasons.
 3: Use bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel Borkmann.
 4: Delete partial code in 1c2a088a6626, suggested by Daniel Borkmann.
 5: Some bug fixes and comment improvements.

Changes v1-v2:
 1: Fix bug in emit_ia32_neg64.
 2: Fix bug in emit_ia32_arsh_r64.
 3: Delete filename in top level comment, suggested by Thomas Gleixner.
 4: Delete unnecessary boilerplate text, suggested by Thomas Gleixner.
 5: Rewrite some words in changelog.
 6: CodingStyle improvements and a little more comments.
 arch/x86/Kconfig                     |    2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile                |    9 +-
 arch/x86/net/bpf_jit_comp32.c        | 2533 ++
 4 files changed, 2565 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_EBPF_JIT			if X86_64
+	select HAVE_EBPF_JIT
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_EXIT_THREAD
 	select HAVE_FENTRY			if X86_64 || DYNAMIC_FTRACE
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f928ad9..a4c7ca4 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -291,14 +291,17 @@ static inline void indirect_branch_prediction_barrier(void)
  *	lfence
  *	jmp spec_trap
  * do_rop:
- *	mov %rax,(%rsp)
+ *	mov %rax,(%rsp) for x86_64
+ *	mov %edx,(%esp) for x86_32
  *	retq
  *
  * Without retpolines configured:
  *
- *	jmp *%rax
+ *	jmp *%e
Re: [PATCH v2] bpf, x86_32: add eBPF JIT compiler for ia32
On Wed, Apr 25, 2018 at 02:11:16AM +0200, Daniel Borkmann wrote: > On 04/19/2018 05:54 PM, Wang YanQing wrote: > > The JIT compiler emits ia32 bit instructions. Currently, It supports > > eBPF only. Classic BPF is supported because of the conversion by BPF core. > > > > Almost all instructions from eBPF ISA supported except the following: > > BPF_ALU64 | BPF_DIV | BPF_K > > BPF_ALU64 | BPF_DIV | BPF_X > > BPF_ALU64 | BPF_MOD | BPF_K > > BPF_ALU64 | BPF_MOD | BPF_X > > BPF_STX | BPF_XADD | BPF_W > > BPF_STX | BPF_XADD | BPF_DW > > > > It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL too. > > > > ia32 has few general purpose registers, EAX|EDX|ECX|EBX|ESI|EDI, > > and in these six registers, we can't treat all of them as real > > general purpose registers in jit: > > MUL instructions need EAX:EDX, shift instructions need ECX. > > > > So I decide to use stack to emulate all eBPF 64 registers, this will > > simplify the implementation a lot, because we don't need to face the > > flexible memory address modes on ia32, for example, we don't need to > > write below code pattern for one BPF_ADD instruction: > > > > if (src is a register && dst is a register) > > { > >//one instruction encoding for ADD instruction > > } else if (only src is a register) > > { > >//another different instruction encoding for ADD instruction > > } else if (only dst is a register) > > { > >//another different instruction encoding for ADD instruction > > } else > > { > >//src and dst are all on stack. > >//move src or dst to temporary registers > > } > > > > If the above example if-else-else-else isn't so painful, try to think > > it for BPF_ALU64|BPF_*SHIFT* instruction which we need to use many > > native instructions to emulate. > > > > Tested on my PC (Intel(R) Core(TM) i5-5200U CPU) and virtualbox. > > Just out of plain curiosity, do you have a specific use case on the > JIT for x86_32? Are you targeting this to be used for e.g. 
> Atom or the like (sorry, don't really have a good overview where
> x86_32 is still in use these days)?

Yes, you are right that x86_32 isn't a big deal anymore, but it won't
disappear completely in the near future, the same as the 32-bit Linux
kernel on 64-bit machines.

As for me, I have used 32-bit Linux on my development/hacking notebook
for many years. When I tried to trace/perf the kernel, I met eBPF, and
it was strange that there was no JIT for it, so I am trying to fill the
gap.

>
> > Testing results on i5-5200U:
> >
> > 1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
> > 2) test_progs: Summary: 81 PASSED, 2 FAILED.
> >    test_progs report "libbpf: incorrect bpf_call opcode" for
> >    test_l4lb_noinline and test_xdp_noinline, because there is
> >    no llvm-6.0 on my machine, and current implementation doesn't
> >    support BPF_PSEUDO_CALL, so I think we can ignore the two failed
> >    testcases.
> > 3) test_lpm: OK
> > 4) test_lru_map: OK
> > 5) test_verifier: Summary: 823 PASSED, 5 FAILED
> >    test_verifier report "invalid bpf_context access off=68 size=1/2/4/8"
> >    for all the 5 FAILED testcases with/without jit, we need to fix the
> >    failed testcases themself instead of this jit.
>
> Can you elaborate further on these? Looks like this definitely needs
> fixing on 32 bit. Would be great to get a better understanding of the
> underlying bug(s) and properly fix them.

Ok! I will try to dig them out.

>
> > Above tests are all done with following flags settings discretely:
> > 1:bpf_jit_enable=1 and bpf_jit_harden=0
> > 2:bpf_jit_enable=1 and bpf_jit_harden=2
> >
> > Below are some numbers for this jit implementation:
> > Note:
> > I run test_progs in kselftest 100 times continuously for every testcase,
> > the numbers are in format: total/times=avg.
> > The numbers that test_bpf reports show almost the same relation.
> > > > a:jit_enable=0 and jit_harden=0b:jit_enable=1 and jit_harden=0 > > test_pkt_access:PASS:ipv4:15622/100=156 > > test_pkt_access:PASS:ipv4:10057/100=100 > > test_pkt_access:PASS:ipv6:9130/100=91 > > test_pkt_access:PASS:ipv6:5055/100=50 > > test_xdp:PASS:ipv4:240198/100=2401 > > test_xdp:PASS:ipv4:145945/100=1459 > > test_xdp:PASS:ipv6:137326/100=1373 test_xdp:PASS:ipv6:67337/100=673 > > test_l4lb:PASS:ipv4:61100/100=611test_l4lb:PASS:ipv4:38137/100=381 > > test_l4lb:PASS:ipv6:101000/100=1010
[PATCH v2] bpf, x86_32: add eBPF JIT compiler for ia32
The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it supports eBPF only; classic BPF is supported through the conversion performed by the BPF core.

Almost all instructions from the eBPF ISA are supported, except the following:

BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL either.

ia32 has only a few general purpose registers, EAX|EDX|ECX|EBX|ESI|EDI, and we can't treat all six of them as real general purpose registers in the JIT: MUL instructions need EAX:EDX, and shift instructions need ECX.

So I decided to use the stack to emulate all eBPF 64-bit registers. This simplifies the implementation a lot, because we don't need to deal with the flexible memory addressing modes on ia32; for example, we don't need to write the code pattern below for a single BPF_ADD instruction:

if (src is a register && dst is a register)
{
   //one instruction encoding for ADD instruction
} else if (only src is a register)
{
   //another different instruction encoding for ADD instruction
} else if (only dst is a register)
{
   //another different instruction encoding for ADD instruction
} else
{
   //src and dst are all on stack.
   //move src or dst to temporary registers
}

If the above if-else-else-else example isn't painful enough, try to imagine it for the BPF_ALU64|BPF_*SHIFT* instructions, which need many native instructions to emulate.

Tested on my PC (Intel(R) Core(TM) i5-5200U CPU) and VirtualBox.

Testing results on i5-5200U:

1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 81 PASSED, 2 FAILED.
   test_progs reports "libbpf: incorrect bpf_call opcode" for
   test_l4lb_noinline and test_xdp_noinline, because there is
   no llvm-6.0 on my machine, and the current implementation doesn't
   support BPF_PSEUDO_CALL, so I think we can ignore the two failed
   testcases.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 823 PASSED, 5 FAILED
   test_verifier reports "invalid bpf_context access off=68 size=1/2/4/8"
   for all the 5 FAILED testcases with/without the JIT; we need to fix
   the failed testcases themselves instead of this JIT.

The above tests are all done with the following flag settings separately:
1: bpf_jit_enable=1 and bpf_jit_harden=0
2: bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this JIT implementation:
Note:
I ran test_progs in kselftest 100 times continuously for every testcase;
the numbers are in the format total/times=avg. The numbers that test_bpf
reports show almost the same relation.

a: jit_enable=0 and jit_harden=0
   test_pkt_access:PASS:ipv4:15622/100=156
   test_pkt_access:PASS:ipv6:9130/100=91
   test_xdp:PASS:ipv4:240198/100=2401
   test_xdp:PASS:ipv6:137326/100=1373
   test_l4lb:PASS:ipv4:61100/100=611
   test_l4lb:PASS:ipv6:101000/100=1010

b: jit_enable=1 and jit_harden=0
   test_pkt_access:PASS:ipv4:10057/100=100
   test_pkt_access:PASS:ipv6:5055/100=50
   test_xdp:PASS:ipv4:145945/100=1459
   test_xdp:PASS:ipv6:67337/100=673
   test_l4lb:PASS:ipv4:38137/100=381
   test_l4lb:PASS:ipv6:57779/100=577

c: jit_enable=1 and jit_harden=2
   test_pkt_access:PASS:ipv4:12650/100=126
   test_pkt_access:PASS:ipv6:7074/100=70
   test_xdp:PASS:ipv4:147211/100=1472
   test_xdp:PASS:ipv6:85783/100=857
   test_l4lb:PASS:ipv4:53222/100=532
   test_l4lb:PASS:ipv6:76322/100=763

Yes, the numbers are pretty good when jit_harden is turned off. If we
want to speed up jit_harden, we need to move BPF_REG_AX to a *real*
register instead of stack emulation, but when we do that, we need to
face all the pain I described above. We can do it in a next step.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
Changes v1-v2:
 1: Fix bug in emit_ia32_neg64.
 2: Fix bug in emit_ia32_arsh_r64.
 3: Delete filename in top level comment, suggested by Thomas Gleixner.
 4: Delete unnecessary boilerplate text, suggested by Thomas Gleixner.
 5: Rewrite some words in changelog.
 6: CodingStyle improvements and a little more comments.

Thanks.

 arch/x86/Kconfig                     |    2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile                |   10 +-
 arch/x86/net/bpf_jit32.S             |  143 +++
 arch/x86/net/bpf_jit_comp32.c        | 2258 ++
 5 files changed, 2434 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit32.S
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_EBPF_JIT			if X86_64
+	s
Re: [PATCH] bpf, x86_32: add eBPF JIT compiler for ia32 (x86_32)
On Wed, Apr 18, 2018 at 05:31:18PM +0800, Wang YanQing wrote: > The JIT compiler emits ia32 bit instructions. Currently, It supports > eBPF only. Classic BPF is supported because of the conversion by BPF core. > > Almost all instructions from eBPF ISA supported except the following: > BPF_ALU64 | BPF_DIV | BPF_K > BPF_ALU64 | BPF_DIV | BPF_X > BPF_ALU64 | BPF_MOD | BPF_K > BPF_ALU64 | BPF_MOD | BPF_X > BPF_STX | BPF_XADD | BPF_W > BPF_STX | BPF_XADD | BPF_DW > > It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL too. > > IA32 has few general purpose registers, EAX|EDX|ECX|EBX|ESI|EDI, > and for these six registers, we can't treat all of them as real > general purpose registers: > MUL instructions need EAX:EDX, shift instructions need ECX, ESI|EDI > for string manipulation instructions. > > So I decide to use stack to emulate all eBPF 64 registers, this will > simplify the implementation very much, because we don't need to face > the flexible memory address modes on ia32, for example, we don't need > to write below codes for one BPF_ADD instruction: > if (src_reg is a register && dst_reg is a register) > { >//one instruction encoding for ADD instruction > } else if (only src is a register) > { >//another different instruction encoding for ADD instruction > } else if (only dst is a register) > { >//another different instruction encoding for ADD instruction > } else > { >//src and dst are all on stack. >//another different instruction encoding for ADD instruction > } > > If you think above if-else-else-else isn't so painful, try to think > it for BPF_ALU64|BPF_*SHIFT* instruction:) > > Tested on my PC(Intel(R) Core(TM) i5-5200U CPU) and virtualbox. > > Testing results on i5-5200U: > > 1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed] > 2) test_progs: Summary: 81 PASSED, 2 FAILED. 
>test_progs report "libbpf: incorrect bpf_call opcode" for >test_l4lb_noinline and test_xdp_noinline, because there is >no llvm-6.0 on my machine, and current implementation doesn't >support BPF_CALL, so I think we can ignore it. > 3) test_lpm: OK > 4) test_lru_map: OK > 5) test_verifier: Summary: 823 PASSED, 5 FAILED >test_verifier report "invalid bpf_context access off=68 size=1/2/4/8" >for all the 5 FAILED testcases, and test_verifier report them when >we turn off the jit, so I think the jit can do nothing to fix them. > > Above tests are all done with following flags enabled discretely: > bpf_jit_enable=1 and bpf_jit_harden=2 > > Below are some numbers for this jit implementation: > Note: > I run test_progs 100 times in loop for every testcase, the numbers > are in format: total/times=avg. The numbers that test_bpf report > almost show the same relation. > > a:jit_enable=0 and jit_harden=0b:jit_enable=1 and jit_harden=0 > test_pkt_access:PASS:ipv4:15622/100=156 > test_pkt_access:PASS:ipv4:10057/100=100 > test_pkt_access:PASS:ipv6:9130/100=91 > test_pkt_access:PASS:ipv6:5055/100=50 > test_xdp:PASS:ipv4:240198/100=2401 test_xdp:PASS:ipv4:145945/100=1459 > test_xdp:PASS:ipv6:137326/100=1373 test_xdp:PASS:ipv6:67337/100=673 > test_l4lb:PASS:ipv4:61100/100=611test_l4lb:PASS:ipv4:38137/100=381 > test_l4lb:PASS:ipv6:101000/100=1010 test_l4lb:PASS:ipv6:57779/100=577 > > c:jit_enable=0 and jit_harden=2b:jit_enable=1 and jit_harden=2 > test_pkt_access:PASS:ipv4:15214/100=152 > test_pkt_access:PASS:ipv4:12650/100=126 > test_pkt_access:PASS:ipv6:9132/100=91 > test_pkt_access:PASS:ipv6:7074/100=70 > test_xdp:PASS:ipv4:237252/100=2372 test_xdp:PASS:ipv4:147211/100=1472 > test_xdp:PASS:ipv6:135977/100=1359 test_xdp:PASS:ipv6:85783/100=857 > test_l4lb:PASS:ipv4:61324/100=613test_l4lb:PASS:ipv4:53222/100=532 > test_l4lb:PASS:ipv6:100833/100=1008 test_l4lb:PASS:ipv6:76322/100=763 > > Yes, the numbers are pretty without turn on jit_harden, if we want to speedup > jit_harden, then 
we need to move BPF_REG_AX to *real* register instead of > stack > emulation, but If we do it, we need to face all the pain I describe above. We > can do it in next step. > > See Documentation/networking/filter.txt for more information. > > Signed-off-by: Wang YanQing <udkni...@gmail.com> > --- > arch/x86/Kconfig |2 +- > arch/x86/include/asm/nospec-branch.h | 26 +- > arch/x86/net/Makefile| 10 +- > arch/x86/net/bpf_jit32.S | 147 +++ > arch/x86/net/bpf_jit_comp32.c| 2239 > ++ > 5 files changed, 2419 insertions(+), 5 deletions(-) > create mode 100644 arch/x86/net/bpf_jit32.S > create mode 100644 arch/x86/net/bpf_jit_comp32.c Add CC to da...@davemloft.net
[PATCH] bpf, x86_32: add eBPF JIT compiler for ia32 (x86_32)
The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it supports eBPF only; classic BPF is supported through the conversion performed by the BPF core.

Almost all instructions from the eBPF ISA are supported, except the following:

BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL either.

IA32 has only a few general purpose registers, EAX|EDX|ECX|EBX|ESI|EDI, and we can't treat all of them as real general purpose registers: MUL instructions need EAX:EDX, shift instructions need ECX, and ESI|EDI are needed for string manipulation instructions.

So I decided to use the stack to emulate all eBPF 64-bit registers. This simplifies the implementation very much, because we don't need to deal with the flexible memory addressing modes on ia32; for example, we don't need to write the code below for a single BPF_ADD instruction:

if (src_reg is a register && dst_reg is a register)
{
   //one instruction encoding for ADD instruction
} else if (only src is a register)
{
   //another different instruction encoding for ADD instruction
} else if (only dst is a register)
{
   //another different instruction encoding for ADD instruction
} else
{
   //src and dst are all on stack.
   //another different instruction encoding for ADD instruction
}

If you think the above if-else-else-else isn't so painful, try to imagine it for the BPF_ALU64|BPF_*SHIFT* instructions :)

Tested on my PC (Intel(R) Core(TM) i5-5200U CPU) and VirtualBox.

Testing results on i5-5200U:

1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 81 PASSED, 2 FAILED.
   test_progs reports "libbpf: incorrect bpf_call opcode" for
   test_l4lb_noinline and test_xdp_noinline, because there is
   no llvm-6.0 on my machine, and the current implementation doesn't
   support BPF_CALL, so I think we can ignore it.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 823 PASSED, 5 FAILED
   test_verifier reports "invalid bpf_context access off=68 size=1/2/4/8"
   for all the 5 FAILED testcases, and it reports them when we turn off
   the JIT too, so I think the JIT can do nothing to fix them.

The above tests are all done with the following flags enabled:
bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this JIT implementation:
Note:
I ran test_progs 100 times in a loop for every testcase; the numbers are
in the format total/times=avg. The numbers that test_bpf reports show
almost the same relation.

a: jit_enable=0 and jit_harden=0
   test_pkt_access:PASS:ipv4:15622/100=156
   test_pkt_access:PASS:ipv6:9130/100=91
   test_xdp:PASS:ipv4:240198/100=2401
   test_xdp:PASS:ipv6:137326/100=1373
   test_l4lb:PASS:ipv4:61100/100=611
   test_l4lb:PASS:ipv6:101000/100=1010

b: jit_enable=1 and jit_harden=0
   test_pkt_access:PASS:ipv4:10057/100=100
   test_pkt_access:PASS:ipv6:5055/100=50
   test_xdp:PASS:ipv4:145945/100=1459
   test_xdp:PASS:ipv6:67337/100=673
   test_l4lb:PASS:ipv4:38137/100=381
   test_l4lb:PASS:ipv6:57779/100=577

c: jit_enable=0 and jit_harden=2
   test_pkt_access:PASS:ipv4:15214/100=152
   test_pkt_access:PASS:ipv6:9132/100=91
   test_xdp:PASS:ipv4:237252/100=2372
   test_xdp:PASS:ipv6:135977/100=1359
   test_l4lb:PASS:ipv4:61324/100=613
   test_l4lb:PASS:ipv6:100833/100=1008

d: jit_enable=1 and jit_harden=2
   test_pkt_access:PASS:ipv4:12650/100=126
   test_pkt_access:PASS:ipv6:7074/100=70
   test_xdp:PASS:ipv4:147211/100=1472
   test_xdp:PASS:ipv6:85783/100=857
   test_l4lb:PASS:ipv4:53222/100=532
   test_l4lb:PASS:ipv6:76322/100=763

Yes, the numbers are pretty good when jit_harden is turned off. If we
want to speed up jit_harden, we need to move BPF_REG_AX to a *real*
register instead of stack emulation, but if we do that, we need to face
all the pain I described above. We can do it in a next step.

See Documentation/networking/filter.txt for more information.
Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 arch/x86/Kconfig                     |    2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile                |   10 +-
 arch/x86/net/bpf_jit32.S             |  147 +++
 arch/x86/net/bpf_jit_comp32.c        | 2239 ++
 5 files changed, 2419 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit32.S
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
-	select HAVE_EBPF_JIT			if X86_64
+	select HAVE_EBPF_JIT
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_EXIT_THREAD
 	select HAVE_FENTRY if X86_64 || DY
[PATCH] bpf, doc: Correct one wrong value in "Register value tracking"
If we then OR this with 0x40, the value of the 6th bit (counting the first bit as the 0th) becomes known, so the right mask is 0xbf instead of 0xcf.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Documentation/networking/filter.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index 8781485..a4508ec 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -1134,7 +1134,7 @@ The verifier's knowledge about the variable offset consists of:
 mask and value; no bit should ever be 1 in both. For example, if a byte is read
 into a register from memory, the register's top 56 bits are known zero, while
 the low 8 are unknown - which is represented as the tnum (0x0; 0xff). If we
-then OR this with 0x40, we get (0x40; 0xcf), then if we add 1 we get (0x0;
+then OR this with 0x40, we get (0x40; 0xbf), then if we add 1 we get (0x0;
 0x1ff), because of potential carries.
 Besides arithmetic, the register state can also be updated by conditional
 branches. For instance, if a SCALAR_VALUE is compared > 8, in the 'true' branch
-- 
1.8.5.6.2.g3d8a54e.dirty
[PATCH v2] rtlwifi: pci: use dev_kfree_skb_irq instead of kfree_skb in rtl_pci_reset_trx_ring
We can't use kfree_skb in an IRQ-disabled context. Because spin_lock_irqsave makes sure we are always in an IRQ-disabled context here, using dev_kfree_skb_irq instead of kfree_skb is better than dev_kfree_skb_any.

This patch fixes the kernel warning below:

[ 7612.095528] [ cut here ]
[ 7612.095546] WARNING: CPU: 3 PID: 4460 at kernel/softirq.c:150 __local_bh_enable_ip+0x58/0x80()
[ 7612.095550] Modules linked in: rtl8723be x86_pkg_temp_thermal btcoexist rtl_pci rtlwifi rtl8723_common
[ 7612.095567] CPU: 3 PID: 4460 Comm: ifconfig Tainted: GW 4.4.0+ #4
[ 7612.095570] Hardware name: LENOVO 20DFA04FCD/20DFA04FCD, BIOS J5ET48WW (1.19 ) 08/27/2015
[ 7612.095574] da37fc70 c12ce7c5 da37fca0 c104cc59 c19d4454
[ 7612.095584] 0003 116c c19d4784 0096 c10508a8 c10508a8 0200 c1b42400
[ 7612.095594] f29be780 da37fcb0 c104ccad 0009 da37fcbc c10508a8 f21f08b8
[ 7612.095604] Call Trace:
[ 7612.095614] [] dump_stack+0x41/0x5c
[ 7612.095620] [] warn_slowpath_common+0x89/0xc0
[ 7612.095628] [] ? __local_bh_enable_ip+0x58/0x80
[ 7612.095634] [] ? __local_bh_enable_ip+0x58/0x80
[ 7612.095640] [] warn_slowpath_null+0x1d/0x20
[ 7612.095646] [] __local_bh_enable_ip+0x58/0x80
[ 7612.095653] [] destroy_conntrack+0x64/0xa0
[ 7612.095660] [] nf_conntrack_destroy+0xf/0x20
[ 7612.095665] [] skb_release_head_state+0x55/0xa0
[ 7612.095670] [] skb_release_all+0xb/0x20
[ 7612.095674] [] __kfree_skb+0xb/0x60
[ 7612.095679] [] kfree_skb+0x30/0x70
[ 7612.095686] [] ? rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
[ 7612.095692] [] rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
[ 7612.095698] [] rtl_pci_start+0x19/0x190 [rtl_pci]
[ 7612.095705] [] rtl_op_start+0x56/0x90 [rtlwifi]
[ 7612.095712] [] drv_start+0x36/0xc0
[ 7612.095717] [] ieee80211_do_open+0x2d3/0x890
[ 7612.095725] [] ? call_netdevice_notifiers_info+0x2e/0x60
[ 7612.095730] [] ieee80211_open+0x4d/0x50
[ 7612.095736] [] __dev_open+0xa3/0x130
[ 7612.095742] [] ? _raw_spin_unlock_bh+0x13/0x20
[ 7612.095748] [] __dev_change_flags+0x89/0x140
[ 7612.095753] [] ?
selinux_capable+0xd/0x10
[ 7612.095759] [] dev_change_flags+0x29/0x60
[ 7612.095765] [] devinet_ioctl+0x553/0x670
[ 7612.095772] [] ? _copy_to_user+0x28/0x40
[ 7612.095777] [] inet_ioctl+0x85/0xb0
[ 7612.095783] [] sock_ioctl+0x67/0x260
[ 7612.095788] [] ? sock_fasync+0x80/0x80
[ 7612.095795] [] do_vfs_ioctl+0x6b/0x550
[ 7612.095800] [] ? selinux_file_ioctl+0x102/0x1e0
[ 7612.095807] [] ? timekeeping_suspend+0x294/0x320
[ 7612.095813] [] ? __hrtimer_run_queues+0x14a/0x210
[ 7612.095820] [] ? security_file_ioctl+0x34/0x50
[ 7612.095827] [] SyS_ioctl+0x70/0x80
[ 7612.095832] [] do_fast_syscall_32+0x84/0x120
[ 7612.095839] [] sysenter_past_esp+0x36/0x55
[ 7612.095844] ---[ end trace 97e9c637a20e8348 ]---

Signed-off-by: Wang YanQing <udkni...@gmail.com>
Cc: Stable <sta...@vger.kernel.org>
---
Changes:
v1-v2:
1: add a Cc to stable.

 drivers/net/wireless/realtek/rtlwifi/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/pci.c b/drivers/net/wireless/realtek/rtlwifi/pci.c
index 1ac41b8..99a3a03 100644
--- a/drivers/net/wireless/realtek/rtlwifi/pci.c
+++ b/drivers/net/wireless/realtek/rtlwifi/pci.c
@@ -1572,7 +1572,7 @@ int rtl_pci_reset_trx_ring(struct ieee80211_hw *hw)
 					 true, HW_DESC_TXBUFF_ADDR),
 					 skb->len, PCI_DMA_TODEVICE);
-			kfree_skb(skb);
+			dev_kfree_skb_irq(skb);
 			ring->idx = (ring->idx + 1) % ring->entries;
 		}
 		ring->idx = 0;
-- 
1.8.5.6.2.g3d8a54e.dirty
[PATCH] rtlwifi: pci: use dev_kfree_skb_irq instead of kfree_skb in rtl_pci_reset_trx_ring
We can't use kfree_skb in an IRQ-disabled context. Because spin_lock_irqsave makes sure we are always in an IRQ-disabled context here, using dev_kfree_skb_irq instead of kfree_skb is better than dev_kfree_skb_any.

This patch fixes the kernel warning below:

[ 7612.095528] [ cut here ]
[ 7612.095546] WARNING: CPU: 3 PID: 4460 at kernel/softirq.c:150 __local_bh_enable_ip+0x58/0x80()
[ 7612.095550] Modules linked in: rtl8723be x86_pkg_temp_thermal btcoexist rtl_pci rtlwifi rtl8723_common
[ 7612.095567] CPU: 3 PID: 4460 Comm: ifconfig Tainted: GW 4.4.0+ #4
[ 7612.095570] Hardware name: LENOVO 20DFA04FCD/20DFA04FCD, BIOS J5ET48WW (1.19 ) 08/27/2015
[ 7612.095574] da37fc70 c12ce7c5 da37fca0 c104cc59 c19d4454
[ 7612.095584] 0003 116c c19d4784 0096 c10508a8 c10508a8 0200 c1b42400
[ 7612.095594] f29be780 da37fcb0 c104ccad 0009 da37fcbc c10508a8 f21f08b8
[ 7612.095604] Call Trace:
[ 7612.095614] [] dump_stack+0x41/0x5c
[ 7612.095620] [] warn_slowpath_common+0x89/0xc0
[ 7612.095628] [] ? __local_bh_enable_ip+0x58/0x80
[ 7612.095634] [] ? __local_bh_enable_ip+0x58/0x80
[ 7612.095640] [] warn_slowpath_null+0x1d/0x20
[ 7612.095646] [] __local_bh_enable_ip+0x58/0x80
[ 7612.095653] [] destroy_conntrack+0x64/0xa0
[ 7612.095660] [] nf_conntrack_destroy+0xf/0x20
[ 7612.095665] [] skb_release_head_state+0x55/0xa0
[ 7612.095670] [] skb_release_all+0xb/0x20
[ 7612.095674] [] __kfree_skb+0xb/0x60
[ 7612.095679] [] kfree_skb+0x30/0x70
[ 7612.095686] [] ? rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
[ 7612.095692] [] rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
[ 7612.095698] [] rtl_pci_start+0x19/0x190 [rtl_pci]
[ 7612.095705] [] rtl_op_start+0x56/0x90 [rtlwifi]
[ 7612.095712] [] drv_start+0x36/0xc0
[ 7612.095717] [] ieee80211_do_open+0x2d3/0x890
[ 7612.095725] [] ? call_netdevice_notifiers_info+0x2e/0x60
[ 7612.095730] [] ieee80211_open+0x4d/0x50
[ 7612.095736] [] __dev_open+0xa3/0x130
[ 7612.095742] [] ? _raw_spin_unlock_bh+0x13/0x20
[ 7612.095748] [] __dev_change_flags+0x89/0x140
[ 7612.095753] [] ?
selinux_capable+0xd/0x10
[ 7612.095759] [] dev_change_flags+0x29/0x60
[ 7612.095765] [] devinet_ioctl+0x553/0x670
[ 7612.095772] [] ? _copy_to_user+0x28/0x40
[ 7612.095777] [] inet_ioctl+0x85/0xb0
[ 7612.095783] [] sock_ioctl+0x67/0x260
[ 7612.095788] [] ? sock_fasync+0x80/0x80
[ 7612.095795] [] do_vfs_ioctl+0x6b/0x550
[ 7612.095800] [] ? selinux_file_ioctl+0x102/0x1e0
[ 7612.095807] [] ? timekeeping_suspend+0x294/0x320
[ 7612.095813] [] ? __hrtimer_run_queues+0x14a/0x210
[ 7612.095820] [] ? security_file_ioctl+0x34/0x50
[ 7612.095827] [] SyS_ioctl+0x70/0x80
[ 7612.095832] [] do_fast_syscall_32+0x84/0x120
[ 7612.095839] [] sysenter_past_esp+0x36/0x55
[ 7612.095844] ---[ end trace 97e9c637a20e8348 ]---

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 drivers/net/wireless/realtek/rtlwifi/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/pci.c b/drivers/net/wireless/realtek/rtlwifi/pci.c
index 1ac41b8..99a3a03 100644
--- a/drivers/net/wireless/realtek/rtlwifi/pci.c
+++ b/drivers/net/wireless/realtek/rtlwifi/pci.c
@@ -1572,7 +1572,7 @@ int rtl_pci_reset_trx_ring(struct ieee80211_hw *hw)
 					 true, HW_DESC_TXBUFF_ADDR),
 					 skb->len, PCI_DMA_TODEVICE);
-			kfree_skb(skb);
+			dev_kfree_skb_irq(skb);
 			ring->idx = (ring->idx + 1) % ring->entries;
 		}
 		ring->idx = 0;
-- 
1.8.5.6.2.g3d8a54e.dirty
[PATCH v2] rtlwifi: Remove double check for cnt_after_linked
rtl_lps_enter does two successive checks of cnt_after_linked to make sure some time has elapsed after linking. The second check isn't necessary, because if cnt_after_linked is at least 5, it is of course bigger than 2!

This patch removes the second check.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
Changes v1-v2:
1: rewrite subject and commit message.
2: fix issues reported by checkpatch.pl.

 drivers/net/wireless/realtek/rtlwifi/ps.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/ps.c b/drivers/net/wireless/realtek/rtlwifi/ps.c
index b69321d..93579ca 100644
--- a/drivers/net/wireless/realtek/rtlwifi/ps.c
+++ b/drivers/net/wireless/realtek/rtlwifi/ps.c
@@ -443,14 +443,10 @@ void rtl_lps_enter(struct ieee80211_hw *hw)
 
 	spin_lock_irqsave(&rtlpriv->locks.lps_lock, flag);
 
-	/* Idle for a while if we connect to AP a while ago. */
-	if (mac->cnt_after_linked >= 2) {
-		if (ppsc->dot11_psmode == EACTIVE) {
-			RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD,
-				 "Enter 802.11 power save mode...\n");
-
-			rtl_lps_set_psmode(hw, EAUTOPS);
-		}
+	if (ppsc->dot11_psmode == EACTIVE) {
+		RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD,
+			 "Enter 802.11 power save mode...\n");
+		rtl_lps_set_psmode(hw, EAUTOPS);
 	}
 
 	spin_unlock_irqrestore(&rtlpriv->locks.lps_lock, flag);
-- 
1.8.5.6.2.g3d8a54e.dirty
[PATCH v2] rtlwifi: Fix logic error in enter/exit power-save mode
In commit a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and
rtl_lps_enter() to use work queue"), the tests for enter/exit
power-save mode were inverted. With this change applied, the wifi
connection becomes much more stable.

Fixes: a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter() to use work queue")
Signed-off-by: Wang YanQing <udkni...@gmail.com>
CC: Stable <sta...@vger.kernel.org> [3.10+]
---
Hi, Larry!

Because commit a269913c52ad is the first commit that introduced this
problem, using the above commit message should be OK, right? Note
also that stable kernels 3.10-3.18 don't have commit fd09ff958777,
but do have a269913c52ad.

Thanks for the advice on writing a good subject and commit message;
it is harder than coding sometimes :)

Changes v1-v2:
1: Fix subject and commit message.

 drivers/net/wireless/realtek/rtlwifi/base.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/base.c b/drivers/net/wireless/realtek/rtlwifi/base.c
index c74eb13..264466f 100644
--- a/drivers/net/wireless/realtek/rtlwifi/base.c
+++ b/drivers/net/wireless/realtek/rtlwifi/base.c
@@ -1660,9 +1660,9 @@ void rtl_watchdog_wq_callback(void *data)
 		if (((rtlpriv->link_info.num_rx_inperiod +
 		      rtlpriv->link_info.num_tx_inperiod) > 8) ||
 		    (rtlpriv->link_info.num_rx_inperiod > 2))
-			rtl_lps_enter(hw);
-		else
 			rtl_lps_leave(hw);
+		else
+			rtl_lps_enter(hw);
 	}
 
 	rtlpriv->link_info.num_rx_inperiod = 0;
--
1.8.5.6.2.g3d8a54e.dirty
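The intended decision is easy to state: a busy link should leave power save so throughput doesn't suffer, and an idle link should enter it to save power. A standalone sketch of the corrected predicate (an illustrative restatement, not the kernel code itself; the thresholds 8 and 2 come from the condition in rtl_watchdog_wq_callback()):

```c
#include <assert.h>
#include <stdbool.h>

/* True when the traffic counters for the last watchdog period
 * indicate a busy link, i.e. the driver should LEAVE power save.
 * When false, the link is idle and the driver should ENTER it. */
static bool should_leave_power_save(int num_rx_inperiod, int num_tx_inperiod)
{
	return (num_rx_inperiod + num_tx_inperiod > 8) ||
	       (num_rx_inperiod > 2);
}
```

The pre-fix code called rtl_lps_enter() exactly when this predicate was true, i.e. it put the NIC to sleep under load and woke it when idle, which matches the reported unstable connections.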
[PATCH] rtlwifi: rtl_lps_enter: fix double check of cnt_after_linked
We have already checked cnt_after_linked earlier in the function:

"
	/*sleep after linked 10s, to let DHCP and 4-way handshake ok enough!! */
	if (mac->cnt_after_linked < 5)
		return;
"

So the second check isn't necessary. This patch deletes the second check.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 drivers/net/wireless/realtek/rtlwifi/ps.c | 11 +++--------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/ps.c b/drivers/net/wireless/realtek/rtlwifi/ps.c
index b69321d..9d73581c 100644
--- a/drivers/net/wireless/realtek/rtlwifi/ps.c
+++ b/drivers/net/wireless/realtek/rtlwifi/ps.c
@@ -443,14 +443,11 @@ void rtl_lps_enter(struct ieee80211_hw *hw)
 
 	spin_lock_irqsave(&rtlpriv->locks.lps_lock, flag);
 
-	/* Idle for a while if we connect to AP a while ago. */
-	if (mac->cnt_after_linked >= 2) {
-		if (ppsc->dot11_psmode == EACTIVE) {
-			RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD,
-				 "Enter 802.11 power save mode...\n");
+	if (ppsc->dot11_psmode == EACTIVE) {
+		RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD,
+			 "Enter 802.11 power save mode...\n");
 
-			rtl_lps_set_psmode(hw, EAUTOPS);
-		}
+		rtl_lps_set_psmode(hw, EAUTOPS);
 	}
 
 	spin_unlock_irqrestore(&rtlpriv->locks.lps_lock, flag);
--
1.8.5.6.2.g3d8a54e.dirty
[PATCH] rtlwifi: rtl_watchdog_wq_callback: fix calling rtl_lps_enter/rtl_lps_leave under opposite conditions
Commit a269913c52ad37952a4d9953bb6d748f7299c304 ("rtlwifi: Rework
rtl_lps_leave() and rtl_lps_enter() to use work queue") made a mistake
and inverted the meaning of the num_tx|rx_inperiod comparison test.
Commit fd09ff958777cf583d7541f180991c0fc50bd2f7 ("rtlwifi: Remove
extra workqueue for enter/leave power state") carried the mistake
forward, bringing us to the current code.

This patch fixes it.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
I think this patch should be backported to the stable kernels, 3.10+.
On my machine, I lose the wifi connection after a few minutes if I
enable fwlps.

 drivers/net/wireless/realtek/rtlwifi/base.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/base.c b/drivers/net/wireless/realtek/rtlwifi/base.c
index c74eb13..264466f 100644
--- a/drivers/net/wireless/realtek/rtlwifi/base.c
+++ b/drivers/net/wireless/realtek/rtlwifi/base.c
@@ -1660,9 +1660,9 @@ void rtl_watchdog_wq_callback(void *data)
 		if (((rtlpriv->link_info.num_rx_inperiod +
 		      rtlpriv->link_info.num_tx_inperiod) > 8) ||
 		    (rtlpriv->link_info.num_rx_inperiod > 2))
-			rtl_lps_enter(hw);
-		else
 			rtl_lps_leave(hw);
+		else
+			rtl_lps_enter(hw);
 	}
 
 	rtlpriv->link_info.num_rx_inperiod = 0;
--
1.8.5.6.2.g3d8a54e.dirty