[PATCH] bpf, x32: Fix regression caused by commit 24dea04767e6

2018-07-25 Thread Wang YanQing
Commit 24dea04767e6 ("bpf, x32: remove ld_abs/ld_ind")
removed the 4 bytes of /* Extra space for skb_copy_bits buffer */
from _STACK_SIZE, but it did not update the corresponding
adjustments in emit_prologue and emit_epilogue. The resulting
mismatch causes very strange kernel runtime errors.

This patch fixes it.

Fixes: 24dea04767e6 ("bpf, x32: remove ld_abs/ld_ind")
Signed-off-by: Wang YanQing 
---
 arch/x86/net/bpf_jit_comp32.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 5579987..8f6cc71 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -1441,8 +1441,8 @@ static void emit_prologue(u8 **pprog, u32 stack_depth)
 
/* sub esp,STACK_SIZE */
EMIT2_off32(0x81, 0xEC, STACK_SIZE);
-   /* sub ebp,SCRATCH_SIZE+4+12*/
-   EMIT3(0x83, add_1reg(0xE8, IA32_EBP), SCRATCH_SIZE + 16);
+   /* sub ebp,SCRATCH_SIZE+12*/
+   EMIT3(0x83, add_1reg(0xE8, IA32_EBP), SCRATCH_SIZE + 12);
/* xor ebx,ebx */
EMIT2(0x31, add_2reg(0xC0, IA32_EBX, IA32_EBX));
 
@@ -1475,8 +1475,8 @@ static void emit_epilogue(u8 **pprog, u32 stack_depth)
/* mov edx,dword ptr [ebp+off]*/
EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EDX), STACK_VAR(r0[1]));
 
-   /* add ebp,SCRATCH_SIZE+4+12*/
-   EMIT3(0x83, add_1reg(0xC0, IA32_EBP), SCRATCH_SIZE + 16);
+   /* add ebp,SCRATCH_SIZE+12*/
+   EMIT3(0x83, add_1reg(0xC0, IA32_EBP), SCRATCH_SIZE + 12);
 
/* mov ebx,dword ptr [ebp-12]*/
EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_EBX), -12);
-- 
1.8.5.6.2.g3d8a54e.dirty
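The frame bookkeeping that this fix restores can be modeled in plain C. The sketch below is purely illustrative: the constant value and helper names are invented, and the real constants live in arch/x86/net/bpf_jit_comp32.c. It only demonstrates why the ebp adjustment must track the post-24dea04767e6 frame layout.

```c
#include <assert.h>

/* Illustrative constants only. After commit 24dea04767e6 the frame no
 * longer contains the extra 4-byte skb_copy_bits slot, so the ebp
 * adjustment must be SCRATCH_SIZE + 12, not SCRATCH_SIZE + 16. */
#define SCRATCH_SIZE 96
#define FRAME_SIZE   (SCRATCH_SIZE + 12)

/* emit_prologue: "sub ebp,SCRATCH_SIZE+12" - must match FRAME_SIZE. */
static int prologue_ebp(int ebp)
{
    return ebp - (SCRATCH_SIZE + 12);
}

/* emit_epilogue: "add ebp,SCRATCH_SIZE+12" - must undo the prologue. */
static int epilogue_ebp(int ebp)
{
    return ebp + (SCRATCH_SIZE + 12);
}

/* The stale pre-fix adjustment still used the old "+ 4 + 12" layout. */
static int stale_prologue_ebp(int ebp)
{
    return ebp - (SCRATCH_SIZE + 16);
}
```

With the fixed pair, `epilogue_ebp(prologue_ebp(x)) == x` and the prologue lands exactly `FRAME_SIZE` below the entry value; the stale adjustment lands 4 bytes too low, so every ebp-relative scratch access is off by 4.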


[PATCH] bpf, doc: clarification for the meaning of 'id'

2018-05-09 Thread Wang YanQing
As a reader whose first language is not English, I found the old
wording a little hard to follow; this patch rewords the subsection
to make its meaning clearer.

This patch also adds blank lines as separators in two places
to improve readability.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Documentation/networking/filter.txt | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/Documentation/networking/filter.txt 
b/Documentation/networking/filter.txt
index 5032e12..e6b4ebb 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -1142,6 +1142,7 @@ into a register from memory, the register's top 56 bits 
are known zero, while
 the low 8 are unknown - which is represented as the tnum (0x0; 0xff).  If we
 then OR this with 0x40, we get (0x40; 0xbf), then if we add 1 we get (0x0;
 0x1ff), because of potential carries.
+
 Besides arithmetic, the register state can also be updated by conditional
 branches.  For instance, if a SCALAR_VALUE is compared > 8, in the 'true' 
branch
 it will have a umin_value (unsigned minimum value) of 9, whereas in the 'false'
@@ -1150,14 +1151,16 @@ BPF_JSGE) would instead update the signed 
minimum/maximum values.  Information
 from the signed and unsigned bounds can be combined; for instance if a value is
 first tested < 8 and then tested s> 4, the verifier will conclude that the 
value
 is also > 4 and s< 8, since the bounds prevent crossing the sign boundary.
+
 PTR_TO_PACKETs with a variable offset part have an 'id', which is common to all
 pointers sharing that same variable offset.  This is important for packet range
-checks: after adding some variable to a packet pointer, if you then copy it to
-another register and (say) add a constant 4, both registers will share the same
-'id' but one will have a fixed offset of +4.  Then if it is bounds-checked and
-found to be less than a PTR_TO_PACKET_END, the other register is now known to
-have a safe range of at least 4 bytes.  See 'Direct packet access', below, for
-more on PTR_TO_PACKET ranges.
+checks: after adding a variable to a packet pointer register A, if you then 
copy
+it to another register B and then add a constant 4 to A, both registers will
+share the same 'id' but the A will have a fixed offset of +4.  Then if A is
+bounds-checked and found to be less than a PTR_TO_PACKET_END, the register B is
+now known to have a safe range of at least 4 bytes.  See 'Direct packet 
access',
+below, for more on PTR_TO_PACKET ranges.
+
 The 'id' field is also used on PTR_TO_MAP_VALUE_OR_NULL, common to all copies 
of
 the pointer returned from a map lookup.  This means that when one copy is
 checked and found to be non-NULL, all copies can become PTR_TO_MAP_VALUEs.
-- 
1.8.5.6.2.g3d8a54e.dirty
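The 'id'-based packet-range propagation that the reworded subsection describes can be sketched as a toy model. The struct and function names below are invented for illustration; the real logic lives in the kernel verifier (kernel/bpf/verifier.c) and tracks considerably more state.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a verifier register holding a PTR_TO_PACKET. */
struct toy_reg {
    int id;        /* shared by all copies with the same variable offset */
    int fixed_off; /* constant part of the offset */
    int range;     /* bytes known safe to access past the pointer */
};

/* After the register at index 'checked' is found to be less than
 * PTR_TO_PACKET_END, every register sharing its id gains at least the
 * checked register's fixed offset as a safe range (simplified: the
 * other copies are assumed to have fixed offset 0, as in the example
 * from the patched text). */
static void propagate_range(struct toy_reg *regs, size_t n, size_t checked)
{
    int safe = regs[checked].fixed_off;

    for (size_t i = 0; i < n; i++)
        if (regs[i].id == regs[checked].id && regs[i].range < safe)
            regs[i].range = safe;
}
```

Mirroring the documentation's example: register A (id 1, fixed offset +4) is bounds-checked, so register B (id 1, fixed offset 0) is now known to have a safe range of at least 4 bytes.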


Re: [PATCH] bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform

2018-05-07 Thread Wang YanQing
On Sat, Apr 28, 2018 at 01:29:17PM +0800, Wang YanQing wrote:
> On Sat, Apr 28, 2018 at 01:33:15AM +0200, Daniel Borkmann wrote:
> > On 04/28/2018 12:48 AM, Alexei Starovoitov wrote:
> > > On Thu, Apr 26, 2018 at 05:57:49PM +0800, Wang YanQing wrote:
> > >> All the testcases for BPF_PROG_TYPE_PERF_EVENT program type in
> > >> test_verifier(kselftest) report below errors on x86_32:
> > >> "
> > >> 172/p unpriv: spill/fill of different pointers ldx FAIL
> > >> Unexpected error message!
> > >> 0: (bf) r6 = r10
> > >> 1: (07) r6 += -8
> > >> 2: (15) if r1 == 0x0 goto pc+3
> > >> R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
> > >> 3: (bf) r2 = r10
> > >> 4: (07) r2 += -76
> > >> 5: (7b) *(u64 *)(r6 +0) = r2
> > >> 6: (55) if r1 != 0x0 goto pc+1
> > >> R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 
> > >> R10=fp0,call_-1 fp-8=fp
> > >> 7: (7b) *(u64 *)(r6 +0) = r1
> > >> 8: (79) r1 = *(u64 *)(r6 +0)
> > >> 9: (79) r1 = *(u64 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=8
> > >>
> > >> 378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
> > >> Failed to load prog 'Permission denied'!
> > >> 0: (b7) r0 = 0
> > >> 1: (71) r0 = *(u8 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=1
> > >>
> > >> 379/p check bpf_perf_event_data->sample_period half load permitted FAIL
> > >> Failed to load prog 'Permission denied'!
> > >> 0: (b7) r0 = 0
> > >> 1: (69) r0 = *(u16 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=2
> > >>
> > >> 380/p check bpf_perf_event_data->sample_period word load permitted FAIL
> > >> Failed to load prog 'Permission denied'!
> > >> 0: (b7) r0 = 0
> > >> 1: (61) r0 = *(u32 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=4
> > >>
> > >> 381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
> > >> Failed to load prog 'Permission denied'!
> > >> 0: (b7) r0 = 0
> > >> 1: (79) r0 = *(u64 *)(r1 +68)
> > >> invalid bpf_context access off=68 size=8
> > >> "
> > >>
> > >> This patch fix it, the fix isn't only necessary for x86_32, it will fix 
> > >> the
> > >> same problem for other platforms too, if their size of bpf_user_pt_regs_t
> > >> can't divide exactly into 8.
> > >>
> > >> Signed-off-by: Wang YanQing <udkni...@gmail.com>
> > >> ---
> > >>  Hi all!
> > >>  After mainline accept this patch, then we need to submit a sync patch
> > >>  to update the tools/include/uapi/linux/bpf_perf_event.h.
> > >>
> > >>  Thanks.
> > >>
> > >>  include/uapi/linux/bpf_perf_event.h | 2 +-
> > >>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>
> > >> diff --git a/include/uapi/linux/bpf_perf_event.h 
> > >> b/include/uapi/linux/bpf_perf_event.h
> > >> index eb1b9d2..ff4c092 100644
> > >> --- a/include/uapi/linux/bpf_perf_event.h
> > >> +++ b/include/uapi/linux/bpf_perf_event.h
> > >> @@ -12,7 +12,7 @@
> > >>  
> > >>  struct bpf_perf_event_data {
> > >>  bpf_user_pt_regs_t regs;
> > >> -__u64 sample_period;
> > >> +__u64 sample_period __attribute__((aligned(8)));
> > > 
> > > I don't think this necessary.
> > > imo it's a bug in pe_prog_is_valid_access
> > > that should have allowed 8-byte access to 4-byte aligned sample_period.
> > > The access rewritten by pe_prog_convert_ctx_access anyway,
> > > no alignment issues as far as I can see.
> > 
> > Right, good point. Wang, could you give the below a test run:
> > 
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 56ba0f2..95b9142 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -833,8 +833,14 @@ static bool pe_prog_is_valid_access(int off, int size, 
> > enum bpf_access_type type
> > return false;
> > if (type != BPF_READ)
> > return false;
> > -   if (off % size != 0)
> > -   return false;
> > +   if (off % size != 0) {
> > +   if (sizeof(long) != 4)
> > +   return false;
> > +   if (size != 8)
> > +   return false;
> > +   if (off % size != 4)
> > +   return false;
> > +   }
> > 
> > switch (off) {
> > case bpf_ctx_range(struct bpf_perf_event_data, sample_period):
> Hi all!
> 
> I have tested this patch, but test_verifier reports the same errors
> for the five testcases.
> 
> The reason is they all failed to pass the test of bpf_ctx_narrow_access_ok.
> 
> Thanks.
Hi Daniel Borkmann,

Do you have any plan to fix bpf_ctx_narrow_access_ok for these problems?

Thanks.
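For reference, the relaxed alignment test that Daniel proposed in the quoted diff can be restated as a small standalone predicate. This is a sketch only: the real check sits in pe_prog_is_valid_access in kernel/trace/bpf_trace.c, and as noted in the reply it is still not sufficient on its own, because bpf_ctx_narrow_access_ok rejects the access later.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the proposed hunk: an access is alignment-ok if it is
 * naturally aligned, or, on an arch where sizeof(long) == 4, if it is
 * an 8-byte read at an offset that is 4 mod 8 - which is where
 * sample_period lands on x86_32 (off = 68). */
static bool pe_alignment_ok(int off, int size, int sizeof_long)
{
    if (off % size == 0)
        return true;
    if (sizeof_long != 4)
        return false;
    if (size != 8)
        return false;
    return off % size == 4;
}
```

This accepts the dword load at offset 68 on a 32-bit build (68 % 8 == 4) while still rejecting it on 64-bit builds and rejecting genuinely misaligned offsets.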


[PATCH v6] bpf, x86_32: add eBPF JIT compiler for ia32

2018-05-03 Thread Wang YanQing
The JIT compiler emits IA-32 (32-bit) instructions. Currently it supports
eBPF only; classic BPF is handled through its conversion to eBPF by the
BPF core.

Almost all instructions from the eBPF ISA are supported, except the
following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has only a few general purpose registers: EAX|EDX|ECX|EBX|ESI|EDI. I use
EAX|EDX|ECX|EBX as temporary registers to simulate instructions from the eBPF
ISA, and allocate ESI|EDI to BPF_REG_AX for constant blinding; all other
eBPF registers, R0-R10, are simulated through scratch space on the stack.

The reasons behind the hardware register allocation policy are:
1:MUL needs EAX:EDX and shift operations need ECX, so these registers
  aren't a good fit for general eBPF 64-bit register simulation.
2:We need at least 4 registers to simulate most eBPF ISA operations
  on register operands instead of on stack operands.
3:We need to keep BPF_REG_AX in hardware registers, or constant blinding
  will degrade JIT performance heavily.

Tested on PC (Intel(R) Core(TM) i5-5200U CPU).
Testing results on i5-5200U:
1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 83 PASSED, 0 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 828 PASSED, 0 FAILED.

Above tests are all done in following two conditions separately:
1:bpf_jit_enable=1 and bpf_jit_harden=0
2:bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this jit implementation:
Note:
  I run test_progs in kselftest 100 times continuously for every condition,
  the numbers are in format: total/times=avg.
  The numbers that test_bpf reports show almost the same relation.

a:jit_enable=0 and jit_harden=0
  test_pkt_access:PASS:ipv4:15622/100=156
  test_pkt_access:PASS:ipv6:9130/100=91
  test_xdp:PASS:ipv4:240198/100=2401
  test_xdp:PASS:ipv6:137326/100=1373
  test_l4lb:PASS:ipv4:61100/100=611
  test_l4lb:PASS:ipv6:101000/100=1010

b:jit_enable=1 and jit_harden=0
  test_pkt_access:PASS:ipv4:10674/100=106
  test_pkt_access:PASS:ipv6:4855/100=48
  test_xdp:PASS:ipv4:138912/100=1389
  test_xdp:PASS:ipv6:68542/100=685
  test_l4lb:PASS:ipv4:37302/100=373
  test_l4lb:PASS:ipv6:55030/100=550

c:jit_enable=1 and jit_harden=2
  test_pkt_access:PASS:ipv4:10558/100=105
  test_pkt_access:PASS:ipv6:5092/100=50
  test_xdp:PASS:ipv4:131902/100=1319
  test_xdp:PASS:ipv6:77932/100=779
  test_l4lb:PASS:ipv4:38924/100=389
  test_l4lb:PASS:ipv6:57520/100=575

The numbers show we get a 30%~50% improvement.
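As a sanity check on the averages, the "total/times=avg" figures and the quoted improvement range can be recomputed directly. The helper names below are invented; the input numbers are taken from the table above.

```c
#include <assert.h>

/* Average latency from the "total/times=avg" format used above. */
static int avg(int total, int times)
{
    return total / times;
}

/* Integer percentage improvement of the JIT'ed run over the
 * interpreted run. */
static int improvement_pct(int interp_avg, int jit_avg)
{
    return 100 * (interp_avg - jit_avg) / interp_avg;
}
```

For example, test_xdp ipv4 gives avg(240198, 100) = 2401 interpreted versus avg(138912, 100) = 1389 JIT'ed, an improvement of about 42%, consistent with the quoted 30%~50% range.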

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Changes v5-v6:
 1:Add do {} while (0) to RETPOLINE_RAX_BPF_JIT for
   consistency reasons.
 2:Clean up non-standard comments, reported by Daniel Borkmann.
 3:Fix a memory leak issue, reported by Daniel Borkmann.

 Changes v4-v5:
 1:Delete is_on_stack; BPF_REG_AX is the only eBPF
   register kept in a real hardware register, so just
   check against it.
 2:Apply commit 1612a981b766 ("bpf, x64: fix JIT emission
   for dead code"), suggested by Daniel Borkmann.

 Changes v3-v4:
 1:Fix the changelog in the commit.
   After installing llvm-6.0, test_progs no longer reports errors.
   I submitted another patch,
   "bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform",
   to fix another problem; with that patch applied, test_verifier no longer
   reports errors either.
 2:Fix clearing r0[1] twice unnecessarily in the *BPF_IND|BPF_ABS* simulation.

 Changes v2-v3:
 1:Move BPF_REG_AX to real hardware registers for performance reasons.
 3:Use bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel Borkmann.
 4:Delete partial code from 1c2a088a6626, suggested by Daniel Borkmann.
 5:Some bug fixes and comment improvements.

 Changes v1-v2:
 1:Fix a bug in emit_ia32_neg64.
 2:Fix a bug in emit_ia32_arsh_r64.
 3:Delete the filename in the top level comment, suggested by Thomas Gleixner.
 4:Delete unnecessary boilerplate text, suggested by Thomas Gleixner.
 5:Reword some parts of the changelog.
 6:CodingStyle improvements and a few more comments.

 arch/x86/Kconfig |2 +-
 arch/x86/include/asm/nospec-branch.h |   30 +-
 arch/x86/net/Makefile|9 +-
 arch/x86/net/bpf_jit_comp32.c| 2553 ++
 4 files changed, 2588 insertions(+), 6 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c07f492..d51a71d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -138,7 +138,7 @@ config X86
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
-   select HAVE_EBPF_JITif X86_64
+   select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS
 

[PATCH v5] bpf, x86_32: add eBPF JIT compiler for ia32

2018-04-29 Thread Wang YanQing
The JIT compiler emits IA-32 (32-bit) instructions. Currently it supports
eBPF only; classic BPF is handled through its conversion to eBPF by the
BPF core.

Almost all instructions from the eBPF ISA are supported, except the
following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has only a few general purpose registers: EAX|EDX|ECX|EBX|ESI|EDI. I use
EAX|EDX|ECX|EBX as temporary registers to simulate instructions from the eBPF
ISA, and allocate ESI|EDI to BPF_REG_AX for constant blinding; all other
eBPF registers, R0-R10, are simulated through scratch space on the stack.

The reasons behind the hardware register allocation policy are:
1:MUL needs EAX:EDX and shift operations need ECX, so these registers
  aren't a good fit for general eBPF 64-bit register simulation.
2:We need at least 4 registers to simulate most eBPF ISA operations
  on register operands instead of on stack operands.
3:We need to keep BPF_REG_AX in hardware registers, or constant blinding
  will degrade JIT performance heavily.

Tested on PC (Intel(R) Core(TM) i5-5200U CPU).
Testing results on i5-5200U:
1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 83 PASSED, 0 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 828 PASSED, 0 FAILED.

Above tests are all done in following two conditions separately:
1:bpf_jit_enable=1 and bpf_jit_harden=0
2:bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this jit implementation:
Note:
  I run test_progs in kselftest 100 times continuously for every condition,
  the numbers are in format: total/times=avg.
  The numbers that test_bpf reports show almost the same relation.

a:jit_enable=0 and jit_harden=0
  test_pkt_access:PASS:ipv4:15622/100=156
  test_pkt_access:PASS:ipv6:9130/100=91
  test_xdp:PASS:ipv4:240198/100=2401
  test_xdp:PASS:ipv6:137326/100=1373
  test_l4lb:PASS:ipv4:61100/100=611
  test_l4lb:PASS:ipv6:101000/100=1010

b:jit_enable=1 and jit_harden=0
  test_pkt_access:PASS:ipv4:10674/100=106
  test_pkt_access:PASS:ipv6:4855/100=48
  test_xdp:PASS:ipv4:138912/100=1389
  test_xdp:PASS:ipv6:68542/100=685
  test_l4lb:PASS:ipv4:37302/100=373
  test_l4lb:PASS:ipv6:55030/100=550

c:jit_enable=1 and jit_harden=2
  test_pkt_access:PASS:ipv4:10558/100=105
  test_pkt_access:PASS:ipv6:5092/100=50
  test_xdp:PASS:ipv4:131902/100=1319
  test_xdp:PASS:ipv6:77932/100=779
  test_l4lb:PASS:ipv4:38924/100=389
  test_l4lb:PASS:ipv6:57520/100=575

The numbers show we get a 30%~50% improvement.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Changes v4-v5:
 1:Delete is_on_stack; BPF_REG_AX is the only eBPF
   register kept in a real hardware register, so just
   check against it.
 2:Apply commit 1612a981b766 ("bpf, x64: fix JIT emission
   for dead code"), suggested by Daniel Borkmann.

 Changes v3-v4:
 1:Fix the changelog in the commit.
   After installing llvm-6.0, test_progs no longer reports errors.
   I submitted another patch,
   "bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform",
   to fix another problem; with that patch applied, test_verifier no longer
   reports errors either.
 2:Fix clearing r0[1] twice unnecessarily in the *BPF_IND|BPF_ABS* simulation.

 Changes v2-v3:
 1:Move BPF_REG_AX to real hardware registers for performance reasons.
 3:Use bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel Borkmann.
 4:Delete partial code from 1c2a088a6626, suggested by Daniel Borkmann.
 5:Some bug fixes and comment improvements.

 Changes v1-v2:
 1:Fix a bug in emit_ia32_neg64.
 2:Fix a bug in emit_ia32_arsh_r64.
 3:Delete the filename in the top level comment, suggested by Thomas Gleixner.
 4:Delete unnecessary boilerplate text, suggested by Thomas Gleixner.
 5:Reword some parts of the changelog.
 6:CodingStyle improvements and a few more comments.

 arch/x86/Kconfig |2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile|9 +-
 arch/x86/net/bpf_jit_comp32.c| 2527 ++
 4 files changed, 2559 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
-   select HAVE_EBPF_JITif X86_64
+   select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_EXIT_THREAD
select HAVE_FENTRY  if X86_64 || DYNAMIC_FTRACE
diff --git a/arch/x86/include/asm/nospec-branch.h 
b/arch/x86/include/asm/nospec-branch.h
index f928ad9..a4c

Re: [PATCH] bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform

2018-04-27 Thread Wang YanQing
On Sat, Apr 28, 2018 at 01:33:15AM +0200, Daniel Borkmann wrote:
> On 04/28/2018 12:48 AM, Alexei Starovoitov wrote:
> > On Thu, Apr 26, 2018 at 05:57:49PM +0800, Wang YanQing wrote:
> >> All the testcases for BPF_PROG_TYPE_PERF_EVENT program type in
> >> test_verifier(kselftest) report below errors on x86_32:
> >> "
> >> 172/p unpriv: spill/fill of different pointers ldx FAIL
> >> Unexpected error message!
> >> 0: (bf) r6 = r10
> >> 1: (07) r6 += -8
> >> 2: (15) if r1 == 0x0 goto pc+3
> >> R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
> >> 3: (bf) r2 = r10
> >> 4: (07) r2 += -76
> >> 5: (7b) *(u64 *)(r6 +0) = r2
> >> 6: (55) if r1 != 0x0 goto pc+1
> >> R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1 
> >> fp-8=fp
> >> 7: (7b) *(u64 *)(r6 +0) = r1
> >> 8: (79) r1 = *(u64 *)(r6 +0)
> >> 9: (79) r1 = *(u64 *)(r1 +68)
> >> invalid bpf_context access off=68 size=8
> >>
> >> 378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
> >> Failed to load prog 'Permission denied'!
> >> 0: (b7) r0 = 0
> >> 1: (71) r0 = *(u8 *)(r1 +68)
> >> invalid bpf_context access off=68 size=1
> >>
> >> 379/p check bpf_perf_event_data->sample_period half load permitted FAIL
> >> Failed to load prog 'Permission denied'!
> >> 0: (b7) r0 = 0
> >> 1: (69) r0 = *(u16 *)(r1 +68)
> >> invalid bpf_context access off=68 size=2
> >>
> >> 380/p check bpf_perf_event_data->sample_period word load permitted FAIL
> >> Failed to load prog 'Permission denied'!
> >> 0: (b7) r0 = 0
> >> 1: (61) r0 = *(u32 *)(r1 +68)
> >> invalid bpf_context access off=68 size=4
> >>
> >> 381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
> >> Failed to load prog 'Permission denied'!
> >> 0: (b7) r0 = 0
> >> 1: (79) r0 = *(u64 *)(r1 +68)
> >> invalid bpf_context access off=68 size=8
> >> "
> >>
> >> This patch fix it, the fix isn't only necessary for x86_32, it will fix the
> >> same problem for other platforms too, if their size of bpf_user_pt_regs_t
> >> can't divide exactly into 8.
> >>
> >> Signed-off-by: Wang YanQing <udkni...@gmail.com>
> >> ---
> >>  Hi all!
> >>  After mainline accept this patch, then we need to submit a sync patch
> >>  to update the tools/include/uapi/linux/bpf_perf_event.h.
> >>
> >>  Thanks.
> >>
> >>  include/uapi/linux/bpf_perf_event.h | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/include/uapi/linux/bpf_perf_event.h 
> >> b/include/uapi/linux/bpf_perf_event.h
> >> index eb1b9d2..ff4c092 100644
> >> --- a/include/uapi/linux/bpf_perf_event.h
> >> +++ b/include/uapi/linux/bpf_perf_event.h
> >> @@ -12,7 +12,7 @@
> >>  
> >>  struct bpf_perf_event_data {
> >>bpf_user_pt_regs_t regs;
> >> -  __u64 sample_period;
> >> +  __u64 sample_period __attribute__((aligned(8)));
> > 
> > I don't think this necessary.
> > imo it's a bug in pe_prog_is_valid_access
> > that should have allowed 8-byte access to 4-byte aligned sample_period.
> > The access rewritten by pe_prog_convert_ctx_access anyway,
> > no alignment issues as far as I can see.
> 
> Right, good point. Wang, could you give the below a test run:
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 56ba0f2..95b9142 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -833,8 +833,14 @@ static bool pe_prog_is_valid_access(int off, int size, 
> enum bpf_access_type type
>   return false;
>   if (type != BPF_READ)
>   return false;
> - if (off % size != 0)
> - return false;
> + if (off % size != 0) {
> + if (sizeof(long) != 4)
> + return false;
> + if (size != 8)
> + return false;
> + if (off % size != 4)
> + return false;
> + }
> 
>   switch (off) {
>   case bpf_ctx_range(struct bpf_perf_event_data, sample_period):
Hi all!

I have tested this patch, but test_verifier reports the same errors
for the five testcases.

The reason is they all failed to pass the test of bpf_ctx_narrow_access_ok.

Thanks.


[PATCH v4] bpf, x86_32: add eBPF JIT compiler for ia32

2018-04-26 Thread Wang YanQing
The JIT compiler emits IA-32 (32-bit) instructions. Currently it supports
eBPF only; classic BPF is handled through its conversion to eBPF by the
BPF core.

Almost all instructions from the eBPF ISA are supported, except the
following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has only a few general purpose registers: EAX|EDX|ECX|EBX|ESI|EDI. I use
EAX|EDX|ECX|EBX as temporary registers to simulate instructions from the eBPF
ISA, and allocate ESI|EDI to BPF_REG_AX for constant blinding; all other
eBPF registers, R0-R10, are simulated through scratch space on the stack.

The reasons behind the hardware register allocation policy are:
1:MUL needs EAX:EDX and shift operations need ECX, so these registers
  aren't a good fit for general eBPF 64-bit register simulation.
2:We need at least 4 registers to simulate most eBPF ISA operations
  on register operands instead of on stack operands.
3:We need to keep BPF_REG_AX in hardware registers, or constant blinding
  will degrade JIT performance heavily.

Tested on PC (Intel(R) Core(TM) i5-5200U CPU).
Testing results on i5-5200U:
1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 83 PASSED, 0 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 828 PASSED, 0 FAILED.

Above tests are all done in following two conditions separately:
1:bpf_jit_enable=1 and bpf_jit_harden=0
2:bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this jit implementation:
Note:
  I run test_progs in kselftest 100 times continuously for every condition,
  the numbers are in format: total/times=avg.
  The numbers that test_bpf reports show almost the same relation.

a:jit_enable=0 and jit_harden=0
  test_pkt_access:PASS:ipv4:15622/100=156
  test_pkt_access:PASS:ipv6:9130/100=91
  test_xdp:PASS:ipv4:240198/100=2401
  test_xdp:PASS:ipv6:137326/100=1373
  test_l4lb:PASS:ipv4:61100/100=611
  test_l4lb:PASS:ipv6:101000/100=1010

b:jit_enable=1 and jit_harden=0
  test_pkt_access:PASS:ipv4:10674/100=106
  test_pkt_access:PASS:ipv6:4855/100=48
  test_xdp:PASS:ipv4:138912/100=1389
  test_xdp:PASS:ipv6:68542/100=685
  test_l4lb:PASS:ipv4:37302/100=373
  test_l4lb:PASS:ipv6:55030/100=550

c:jit_enable=1 and jit_harden=2
  test_pkt_access:PASS:ipv4:10558/100=105
  test_pkt_access:PASS:ipv6:5092/100=50
  test_xdp:PASS:ipv4:131902/100=1319
  test_xdp:PASS:ipv6:77932/100=779
  test_l4lb:PASS:ipv4:38924/100=389
  test_l4lb:PASS:ipv6:57520/100=575

The numbers show we get a 30%~50% improvement.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Changes v3-v4:
 1:Fix the changelog in the commit.
   After installing llvm-6.0, test_progs no longer reports errors.
   I submitted another patch,
   "bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform",
   to fix another problem; with that patch applied, test_verifier no longer
   reports errors either.
 2:Fix clearing r0[1] twice unnecessarily in the *BPF_IND|BPF_ABS* simulation.

 Changes v2-v3:
 1:Move BPF_REG_AX to real hardware registers for performance reasons.
 3:Use bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel Borkmann.
 4:Delete partial code from 1c2a088a6626, suggested by Daniel Borkmann.
 5:Some bug fixes and comment improvements.

 Changes v1-v2:
 1:Fix a bug in emit_ia32_neg64.
 2:Fix a bug in emit_ia32_arsh_r64.
 3:Delete the filename in the top level comment, suggested by Thomas Gleixner.
 4:Delete unnecessary boilerplate text, suggested by Thomas Gleixner.
 5:Reword some parts of the changelog.
 6:CodingStyle improvements and a few more comments.

 Hi all!
 This is the fourth version of this patch, and I think it is good enough
 to land. Any more suggestions?

 Thanks.

 arch/x86/Kconfig |2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile|9 +-
 arch/x86/net/bpf_jit_comp32.c| 2530 ++
 4 files changed, 2562 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
-   select HAVE_EBPF_JITif X86_64
+   select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_EXIT_THREAD
select HAVE_FENTRY  if X86_64 || DYNAMIC_FTRACE
diff --git a/arch/x86/include/asm/nospec-branch.h 
b/arch/x86/include/asm/nospec-branch.h
index f928ad9..a4c7ca4 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -291,14 +291,17 

[PATCH] bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform

2018-04-26 Thread Wang YanQing
All the testcases for BPF_PROG_TYPE_PERF_EVENT program type in
test_verifier(kselftest) report below errors on x86_32:
"
172/p unpriv: spill/fill of different pointers ldx FAIL
Unexpected error message!
0: (bf) r6 = r10
1: (07) r6 += -8
2: (15) if r1 == 0x0 goto pc+3
R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
3: (bf) r2 = r10
4: (07) r2 += -76
5: (7b) *(u64 *)(r6 +0) = r2
6: (55) if r1 != 0x0 goto pc+1
R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1 
fp-8=fp
7: (7b) *(u64 *)(r6 +0) = r1
8: (79) r1 = *(u64 *)(r6 +0)
9: (79) r1 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8

378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (71) r0 = *(u8 *)(r1 +68)
invalid bpf_context access off=68 size=1

379/p check bpf_perf_event_data->sample_period half load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (69) r0 = *(u16 *)(r1 +68)
invalid bpf_context access off=68 size=2

380/p check bpf_perf_event_data->sample_period word load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (61) r0 = *(u32 *)(r1 +68)
invalid bpf_context access off=68 size=4

381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (79) r0 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8
"

This patch fixes it. The fix is not only necessary for x86_32: it also fixes
the same problem on any other platform whose sizeof(bpf_user_pt_regs_t) is
not a multiple of 8.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Hi all!
 After mainline accepts this patch, we will need to submit a follow-up patch
 to sync tools/include/uapi/linux/bpf_perf_event.h.

 Thanks.

 include/uapi/linux/bpf_perf_event.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/bpf_perf_event.h 
b/include/uapi/linux/bpf_perf_event.h
index eb1b9d2..ff4c092 100644
--- a/include/uapi/linux/bpf_perf_event.h
+++ b/include/uapi/linux/bpf_perf_event.h
@@ -12,7 +12,7 @@
 
 struct bpf_perf_event_data {
bpf_user_pt_regs_t regs;
-   __u64 sample_period;
+   __u64 sample_period __attribute__((aligned(8)));
__u64 addr;
 };
 
-- 
1.8.5.6.2.g3d8a54e.dirty


Re: [PATCH v2] bpf, x86_32: add eBPF JIT compiler for ia32

2018-04-25 Thread Wang YanQing
On Wed, Apr 25, 2018 at 02:11:16AM +0200, Daniel Borkmann wrote:
> On 04/19/2018 05:54 PM, Wang YanQing wrote:
> > Testing results on i5-5200U:
> > 
> > 1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
> > 2) test_progs: Summary: 81 PASSED, 2 FAILED.
> >test_progs report "libbpf: incorrect bpf_call opcode" for
> >test_l4lb_noinline and test_xdp_noinline, because there is
> >no llvm-6.0 on my machine, and current implementation doesn't
> >support BPF_PSEUDO_CALL, so I think we can ignore the two failed
> >testcases.
> > 3) test_lpm: OK
> > 4) test_lru_map: OK
> > 5) test_verifier: Summary: 823 PASSED, 5 FAILED
> >test_verifier report "invalid bpf_context access off=68 size=1/2/4/8"
> >for all the 5 FAILED testcases with/without jit, we need to fix the
> >failed testcases themself instead of this jit.
> 
> Can you elaborate further on these? Looks like this definitely needs
> fixing on 32 bit. Would be great to get a better understanding of the
> underlying bug(s) and properly fix them.
> 
Hi Daniel Borkmann, here are the detailed logs for the failed testcases.

linux: Gentoo 32 bit
llvm:
~ # llc  --version
LLVM (http://llvm.org/):
LLVM version 4.0.1
Optimized build.
Default target: i686-pc-linux-gnu
Host CPU: broadwell

Registered Targets:
amdgcn  - AMD GCN GPUs
bpf - BPF (host endian)
bpfeb   - BPF (big endian)
bpfel   - BPF (little endian)
nvptx   - NVIDIA PTX 32-bit
nvptx64 - NVIDIA PTX 64-bit
r600- AMD GPUs HD2XXX-HD6XXX
x86 - 32-bit X86: Pentium-Pro and above
x86-64  - 64-bit X86: EM64T and AMD64

~ # clang --version
clang version 4.0.1 (tags/RELEASE_401/final)
Target: i686-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/4/bin

kernel version:4.16.2
test program:test_verifier in kselftest
condition:bpf_jit_enable=0,bpf_jit_harden=0

log:
#172/p unpriv: spill/fill of different pointers ldx FAIL
Unexpected error message!
0: (bf) r6 = r10
1: (07) r6 += -8
2: (15) if r1 == 0x0 goto pc+3
R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
3: (bf) r2 = r10
4: (07) r2 += -76
5: (7b) *(u64 *)(r6 +0) = r2
6: (55) if r1 != 0x0 goto pc+1
R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1 fp-8=fp
7: (7b) *(u64 *)(r6 +0) = r1
8: (79) r1 = *(u64 *)(r6 +0)
9: (79) r1 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8

#378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (71) r0 = *(u8 *)(r1 +68)
invalid bpf_context access off=68 size=1
#379/p check bpf_perf_event_data->sample_period half load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (69) r0 = *(u16 *)(r1 +68)
invalid bpf_context access off=68 size=2
#380/p check bpf_perf_event_data->sample_period word load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (61) r0 = *(u32 *)(r1 +68)
invalid bpf_context access off=68 size=4
#381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
Failed to load prog 'Permission denied'!
0: (b7) r0 = 0
1: (79) r0 = *(u64 *)(r1 +68)
invalid bpf_context access off=68 size=8


test program:test_progs
condition:bpf_jit_enable=0,bpf_jit_harden=0

bpf # ./test_progs
test_pkt_access:PASS:ipv4 53 nsec
test_pkt_access:PASS:ipv6 47 nsec
test_xdp:PASS:ipv4 1281 nsec
test_xdp:PASS:ipv6 749 nsec
test_l4lb:PASS:ipv4 427 nsec
test_l4lb:PASS:ipv6 562 nsec
libbpf: incorrect bpf_call opcode <= caused by ./test_l4lb_noinline.o in function test_l4lb_all
libbpf: incorrect bpf_call opcode <= caused by ././test_xdp_noinline.o in function test_xdp_noinline


Thanks.


[PATCH v3] bpf, x86_32: add eBPF JIT compiler for ia32

2018-04-25 Thread Wang YanQing
The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it supports
eBPF only. Classic BPF is also supported through the conversion done by the BPF core.

Almost all instructions from the eBPF ISA are supported, except the following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL at the moment.

IA32 has only a few general purpose registers: EAX|EDX|ECX|EBX|ESI|EDI. I use
EAX|EDX|ECX|EBX as temporary registers to simulate instructions from the eBPF
ISA, and allocate ESI|EDI to BPF_REG_AX for constant blinding; all other
eBPF registers, R0-R10, are simulated through scratch space on the stack.

The reasons behind the hardware register allocation policy are:
1:MUL needs EAX:EDX and shift operations need ECX, so they aren't fit
  for general eBPF 64-bit register simulation.
2:We need at least 4 registers to simulate most eBPF ISA operations
  on register operands instead of on a mix of register and stack operands.
3:We need to keep BPF_REG_AX in hardware registers, or constant blinding
  will degrade JIT performance heavily.

Tested on PC (Intel(R) Core(TM) i5-5200U CPU).
Testing results on i5-5200U:
1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 81 PASSED, 2 FAILED.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 823 PASSED, 5 FAILED
Note: these failed testcases fail for reasons unrelated to this JIT; we need
  to fix the testcases themselves:
  "invalid bpf_context access" and "libbpf: incorrect bpf_call opcode"

The above tests were all done under the following two conditions separately:
1:bpf_jit_enable=1 and bpf_jit_harden=0
2:bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this jit implementation:
Note:
  I ran test_progs in kselftest 100 times continuously for every condition;
  the numbers are in the format total/times=avg.
  The numbers that test_bpf reports show almost the same relationship.

a:jit_enable=0 and jit_harden=0
  test_pkt_access:PASS:ipv4:15622/100=156
  test_pkt_access:PASS:ipv6:9130/100=91
  test_xdp:PASS:ipv4:240198/100=2401
  test_xdp:PASS:ipv6:137326/100=1373
  test_l4lb:PASS:ipv4:61100/100=611
  test_l4lb:PASS:ipv6:101000/100=1010

b:jit_enable=1 and jit_harden=0
  test_pkt_access:PASS:ipv4:10674/100=106
  test_pkt_access:PASS:ipv6:4855/100=48
  test_xdp:PASS:ipv4:138912/100=1389
  test_xdp:PASS:ipv6:68542/100=685
  test_l4lb:PASS:ipv4:37302/100=373
  test_l4lb:PASS:ipv6:55030/100=550

c:jit_enable=1 and jit_harden=2
  test_pkt_access:PASS:ipv4:10558/100=105
  test_pkt_access:PASS:ipv6:5092/100=50
  test_xdp:PASS:ipv4:131902/100=1319
  test_xdp:PASS:ipv6:77932/100=779
  test_l4lb:PASS:ipv4:38924/100=389
  test_l4lb:PASS:ipv6:57520/100=575

The numbers show we get a 30%~50% improvement.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Changes v2-v3:
 1:Move BPF_REG_AX to real hardware registers for performance reason.
 3:Using bpf_load_pointer instead of bpf_jit32.S, suggested by Daniel Borkmann.
 4:Delete partial codes in 1c2a088a6626, suggested by Daniel Borkmann.
 5:Some bug fixes and comments improvement.

 Changes v1-v2:
 1:Fix bug in emit_ia32_neg64.
 2:Fix bug in emit_ia32_arsh_r64.
 3:Delete filename in top level comment, suggested by Thomas Gleixner.
 4:Delete unnecessary boiler plate text, suggested by Thomas Gleixner.
 5:Rewrite some words in changelog.
 6:CodingStyle improvements and a few more comments.
  
 arch/x86/Kconfig |2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile|9 +-
 arch/x86/net/bpf_jit_comp32.c| 2533 ++
 4 files changed, 2565 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
-   select HAVE_EBPF_JITif X86_64
+   select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_EXIT_THREAD
select HAVE_FENTRY  if X86_64 || DYNAMIC_FTRACE
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f928ad9..a4c7ca4 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -291,14 +291,17 @@ static inline void indirect_branch_prediction_barrier(void)
  *lfence
  *jmp spec_trap
  *  do_rop:
- *mov %rax,(%rsp)
+ *mov %rax,(%rsp) for x86_64
+ *mov %edx,(%esp) for x86_32
  *retq
  *
  * Without retpolines configured:
  *
- *jmp *%rax
+ *jmp *%rax for x86_64
+ *jmp *%e

Re: [PATCH v2] bpf, x86_32: add eBPF JIT compiler for ia32

2018-04-25 Thread Wang YanQing
On Wed, Apr 25, 2018 at 02:11:16AM +0200, Daniel Borkmann wrote:
> On 04/19/2018 05:54 PM, Wang YanQing wrote:
> > The JIT compiler emits ia32 bit instructions. Currently, It supports
> > eBPF only. Classic BPF is supported because of the conversion by BPF core.
> > 
> > Almost all instructions from eBPF ISA supported except the following:
> > BPF_ALU64 | BPF_DIV | BPF_K
> > BPF_ALU64 | BPF_DIV | BPF_X
> > BPF_ALU64 | BPF_MOD | BPF_K
> > BPF_ALU64 | BPF_MOD | BPF_X
> > BPF_STX | BPF_XADD | BPF_W
> > BPF_STX | BPF_XADD | BPF_DW
> > 
> > It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL too.
> > 
> > ia32 has few general purpose registers, EAX|EDX|ECX|EBX|ESI|EDI,
> > and in these six registers, we can't treat all of them as real
> > general purpose registers in jit:
> > MUL instructions need EAX:EDX, shift instructions need ECX.
> > 
> > So I decide to use stack to emulate all eBPF 64 registers, this will
> > simplify the implementation a lot, because we don't need to face the
> > flexible memory address modes on ia32, for example, we don't need to
> > write below code pattern for one BPF_ADD instruction:
> > 
> > if (src is a register && dst is a register)
> > {
> >//one instruction encoding for ADD instruction
> > } else if (only src is a register)
> > {
> >//another different instruction encoding for ADD instruction
> > } else if (only dst is a register)
> > {
> >//another different instruction encoding for ADD instruction
> > } else
> > {
> >//src and dst are all on stack.
> >//move src or dst to temporary registers
> > }
> > 
> > If the above example if-else-else-else isn't so painful, try to think
> > it for BPF_ALU64|BPF_*SHIFT* instruction which we need to use many
> > native instructions to emulate.
> > 
> > Tested on my PC (Intel(R) Core(TM) i5-5200U CPU) and virtualbox.
> 
> Just out of plain curiosity, do you have a specific use case on the
> JIT for x86_32? Are you targeting this to be used for e.g. Atom or
> the like (sorry, don't really have a good overview where x86_32 is
> still in use these days)?
Yes, you are right that x86_32 isn't a big deal anymore, but it won't
disappear completely in the near future, the same as the 32-bit Linux
kernel on 64-bit machines.

As for me, I have used 32-bit Linux on my development/hacking notebook for
many years. When I tried to trace/perf the kernel, I met eBPF; it seemed
strange that there was no JIT for it, so I tried to fill the gap.
> 
> > Testing results on i5-5200U:
> > 
> > 1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
> > 2) test_progs: Summary: 81 PASSED, 2 FAILED.
> >test_progs report "libbpf: incorrect bpf_call opcode" for
> >test_l4lb_noinline and test_xdp_noinline, because there is
> >no llvm-6.0 on my machine, and current implementation doesn't
> >support BPF_PSEUDO_CALL, so I think we can ignore the two failed
> >testcases.
> > 3) test_lpm: OK
> > 4) test_lru_map: OK
> > 5) test_verifier: Summary: 823 PASSED, 5 FAILED
> >test_verifier report "invalid bpf_context access off=68 size=1/2/4/8"
> >for all the 5 FAILED testcases with/without jit, we need to fix the
> >failed testcases themself instead of this jit.
> 
> Can you elaborate further on these? Looks like this definitely needs
> fixing on 32 bit. Would be great to get a better understanding of the
> underlying bug(s) and properly fix them.
Ok! I will try to dig them out.
> 
> > Above tests are all done with following flags settings discretely:
> > 1:bpf_jit_enable=1 and bpf_jit_harden=0
> > 2:bpf_jit_enable=1 and bpf_jit_harden=2
> > 
> > Below are some numbers for this jit implementation:
> > Note:
> >   I run test_progs in kselftest 100 times continuously for every testcase,
> >   the numbers are in format: total/times=avg.
> >   The numbers that test_bpf reports show almost the same relation.
> > 
> > a:jit_enable=0 and jit_harden=0b:jit_enable=1 and jit_harden=0
> >   test_pkt_access:PASS:ipv4:15622/100=156  
> > test_pkt_access:PASS:ipv4:10057/100=100
> >   test_pkt_access:PASS:ipv6:9130/100=91
> > test_pkt_access:PASS:ipv6:5055/100=50
> >   test_xdp:PASS:ipv4:240198/100=2401   
> > test_xdp:PASS:ipv4:145945/100=1459
> >   test_xdp:PASS:ipv6:137326/100=1373   test_xdp:PASS:ipv6:67337/100=673
> >   test_l4lb:PASS:ipv4:61100/100=611test_l4lb:PASS:ipv4:38137/100=381
> >   test_l4lb:PASS:ipv6:101000/100=1010  

[PATCH v2] bpf, x86_32: add eBPF JIT compiler for ia32

2018-04-19 Thread Wang YanQing
The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it supports
eBPF only. Classic BPF is also supported through the conversion done by the BPF core.

Almost all instructions from the eBPF ISA are supported, except the following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL either.

ia32 has only a few general purpose registers (EAX|EDX|ECX|EBX|ESI|EDI),
and of these six, we can't treat all of them as real general purpose
registers in the JIT: MUL instructions need EAX:EDX, and shift instructions
need ECX.

So I decided to use the stack to emulate all eBPF 64-bit registers. This
simplifies the implementation a lot, because we don't have to deal with the
flexible memory addressing modes on ia32; for example, we don't need to
write the code pattern below for one BPF_ADD instruction:

if (src is a register && dst is a register)
{
   //one instruction encoding for ADD instruction
} else if (only src is a register)
{
   //another different instruction encoding for ADD instruction
} else if (only dst is a register)
{
   //another different instruction encoding for ADD instruction
} else
{
   //src and dst are all on stack.
   //move src or dst to temporary registers
}

If the above if-else-else-else example doesn't look so painful, try to
imagine it for the BPF_ALU64|BPF_*SHIFT* instructions, which need many
native instructions to emulate.

Tested on my PC (Intel(R) Core(TM) i5-5200U CPU) and virtualbox.

Testing results on i5-5200U:

1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 81 PASSED, 2 FAILED.
   test_progs reports "libbpf: incorrect bpf_call opcode" for
   test_l4lb_noinline and test_xdp_noinline, because there is
   no llvm-6.0 on my machine, and the current implementation doesn't
   support BPF_PSEUDO_CALL, so I think we can ignore the two failed
   testcases.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 823 PASSED, 5 FAILED
   test_verifier reports "invalid bpf_context access off=68 size=1/2/4/8"
   for all 5 FAILED testcases with/without the JIT; we need to fix the
   failed testcases themselves instead of this JIT.

The above tests were all done with the following flag settings separately:
1:bpf_jit_enable=1 and bpf_jit_harden=0
2:bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this jit implementation:
Note:
  I ran test_progs in kselftest 100 times continuously for every testcase;
  the numbers are in the format total/times=avg.
  The numbers that test_bpf reports show almost the same relationship.

a:jit_enable=0 and jit_harden=0
  test_pkt_access:PASS:ipv4:15622/100=156
  test_pkt_access:PASS:ipv6:9130/100=91
  test_xdp:PASS:ipv4:240198/100=2401
  test_xdp:PASS:ipv6:137326/100=1373
  test_l4lb:PASS:ipv4:61100/100=611
  test_l4lb:PASS:ipv6:101000/100=1010

b:jit_enable=1 and jit_harden=0
  test_pkt_access:PASS:ipv4:10057/100=100
  test_pkt_access:PASS:ipv6:5055/100=50
  test_xdp:PASS:ipv4:145945/100=1459
  test_xdp:PASS:ipv6:67337/100=673
  test_l4lb:PASS:ipv4:38137/100=381
  test_l4lb:PASS:ipv6:57779/100=577

c:jit_enable=1 and jit_harden=2
  test_pkt_access:PASS:ipv4:12650/100=126
  test_pkt_access:PASS:ipv6:7074/100=70
  test_xdp:PASS:ipv4:147211/100=1472
  test_xdp:PASS:ipv6:85783/100=857
  test_l4lb:PASS:ipv4:53222/100=532
  test_l4lb:PASS:ipv6:76322/100=763

Yes, the numbers are pretty when jit_harden is turned off. If we want to
speed up jit_harden, then we need to move BPF_REG_AX to a *real* register
instead of emulating it on the stack, but when we do that, we have to face
all the pain I described above. We can do it in a next step.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Changes v1-v2:
 1:Fix bug in emit_ia32_neg64.
 2:Fix bug in emit_ia32_arsh_r64.
 3:Delete filename in top level comment, suggested by Thomas Gleixner.
 4:Delete unnecessary boiler plate text, suggested by Thomas Gleixner.
 5:Rewrite some words in changelog.
 6:CodingStyle improvements and a few more comments.

 Thanks.

 arch/x86/Kconfig |2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile|   10 +-
 arch/x86/net/bpf_jit32.S |  143 +++
 arch/x86/net/bpf_jit_comp32.c| 2258 ++
 5 files changed, 2434 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit32.S
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
-   select HAVE_EBPF_JITif X86_64
+   s

Re: [PATCH] bpf, x86_32: add eBPF JIT compiler for ia32 (x86_32)

2018-04-18 Thread Wang YanQing
On Wed, Apr 18, 2018 at 05:31:18PM +0800, Wang YanQing wrote:
> The JIT compiler emits ia32 bit instructions. Currently, It supports
> eBPF only. Classic BPF is supported because of the conversion by BPF core.
> 
> Almost all instructions from eBPF ISA supported except the following:
> BPF_ALU64 | BPF_DIV | BPF_K
> BPF_ALU64 | BPF_DIV | BPF_X
> BPF_ALU64 | BPF_MOD | BPF_K
> BPF_ALU64 | BPF_MOD | BPF_X
> BPF_STX | BPF_XADD | BPF_W
> BPF_STX | BPF_XADD | BPF_DW
> 
> It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL too.
> 
> IA32 has few general purpose registers, EAX|EDX|ECX|EBX|ESI|EDI,
> and for these six registers, we can't treat all of them as real
> general purpose registers:
> MUL instructions need EAX:EDX, shift instructions need ECX, ESI|EDI
> for string manipulation instructions.
> 
> So I decide to use stack to emulate all eBPF 64 registers, this will
> simplify the implementation very much, because we don't need to face
> the flexible memory address modes on ia32, for example, we don't need
> to write below codes for one BPF_ADD instruction:
> if (src_reg is a register && dst_reg is a register)
> {
>//one instruction encoding for ADD instruction
> } else if (only src is a register)
> {
>//another different instruction encoding for ADD instruction
> } else if (only dst is a register)
> {
>//another different instruction encoding for ADD instruction
> } else
> {
>//src and dst are all on stack.
>//another different instruction encoding for ADD instruction
> }
> 
> If you think above if-else-else-else isn't so painful, try to think
> it for BPF_ALU64|BPF_*SHIFT* instruction:)
> 
> Tested on my PC(Intel(R) Core(TM) i5-5200U CPU) and virtualbox.
> 
> Testing results on i5-5200U:
> 
> 1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
> 2) test_progs: Summary: 81 PASSED, 2 FAILED.
>test_progs report "libbpf: incorrect bpf_call opcode" for
>test_l4lb_noinline and test_xdp_noinline, because there is
>no llvm-6.0 on my machine, and current implementation doesn't
>support BPF_CALL, so I think we can ignore it.
> 3) test_lpm: OK
> 4) test_lru_map: OK
> 5) test_verifier: Summary: 823 PASSED, 5 FAILED
>test_verifier report "invalid bpf_context access off=68 size=1/2/4/8"
>for all the 5 FAILED testcases, and test_verifier report them when
>we turn off the jit, so I think the jit can do nothing to fix them.
> 
> Above tests are all done with following flags enabled discretely:
> bpf_jit_enable=1 and bpf_jit_harden=2
> 
> Below are some numbers for this jit implementation:
> Note:
>   I run test_progs 100 times in loop for every testcase, the numbers
>   are in format: total/times=avg. The numbers that test_bpf report
>   almost show the same relation.
> 
> a:jit_enable=0 and jit_harden=0b:jit_enable=1 and jit_harden=0
>   test_pkt_access:PASS:ipv4:15622/100=156  
> test_pkt_access:PASS:ipv4:10057/100=100
>   test_pkt_access:PASS:ipv6:9130/100=91
> test_pkt_access:PASS:ipv6:5055/100=50
>   test_xdp:PASS:ipv4:240198/100=2401   test_xdp:PASS:ipv4:145945/100=1459
>   test_xdp:PASS:ipv6:137326/100=1373   test_xdp:PASS:ipv6:67337/100=673
>   test_l4lb:PASS:ipv4:61100/100=611test_l4lb:PASS:ipv4:38137/100=381
>   test_l4lb:PASS:ipv6:101000/100=1010  test_l4lb:PASS:ipv6:57779/100=577
> 
> c:jit_enable=0 and jit_harden=2b:jit_enable=1 and jit_harden=2
>   test_pkt_access:PASS:ipv4:15214/100=152  
> test_pkt_access:PASS:ipv4:12650/100=126
>   test_pkt_access:PASS:ipv6:9132/100=91
> test_pkt_access:PASS:ipv6:7074/100=70
>   test_xdp:PASS:ipv4:237252/100=2372   test_xdp:PASS:ipv4:147211/100=1472
>   test_xdp:PASS:ipv6:135977/100=1359   test_xdp:PASS:ipv6:85783/100=857
>   test_l4lb:PASS:ipv4:61324/100=613test_l4lb:PASS:ipv4:53222/100=532
>   test_l4lb:PASS:ipv6:100833/100=1008  test_l4lb:PASS:ipv6:76322/100=763
> 
> Yes, the numbers are pretty without turn on jit_harden, if we want to speedup
> jit_harden, then we need to move BPF_REG_AX to *real* register instead of 
> stack
> emulation, but If we do it, we need to face all the pain I describe above. We
> can do it in next step.
> 
> See Documentation/networking/filter.txt for more information.
> 
> Signed-off-by: Wang YanQing <udkni...@gmail.com>
> ---
>  arch/x86/Kconfig |2 +-
>  arch/x86/include/asm/nospec-branch.h |   26 +-
>  arch/x86/net/Makefile|   10 +-
>  arch/x86/net/bpf_jit32.S |  147 +++
>  arch/x86/net/bpf_jit_comp32.c| 2239 
> ++
>  5 files changed, 2419 insertions(+), 5 deletions(-)
>  create mode 100644 arch/x86/net/bpf_jit32.S
>  create mode 100644 arch/x86/net/bpf_jit_comp32.c
Add CC to da...@davemloft.net


[PATCH] bpf, x86_32: add eBPF JIT compiler for ia32 (x86_32)

2018-04-18 Thread Wang YanQing
The JIT compiler emits ia32 (32-bit x86) instructions. Currently, it supports
eBPF only. Classic BPF is also supported through the conversion done by the BPF core.

Almost all instructions from the eBPF ISA are supported, except the following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

It doesn't support BPF_JMP|BPF_CALL with BPF_PSEUDO_CALL either.

IA32 has only a few general purpose registers (EAX|EDX|ECX|EBX|ESI|EDI),
and we can't treat all six of them as real general purpose registers:
MUL instructions need EAX:EDX, shift instructions need ECX, and ESI|EDI
are needed for string manipulation instructions.

So I decided to use the stack to emulate all eBPF 64-bit registers. This
simplifies the implementation very much, because we don't have to deal
with the flexible memory addressing modes on ia32; for example, we don't
need to write the code below for one BPF_ADD instruction:
if (src_reg is a register && dst_reg is a register)
{
   //one instruction encoding for ADD instruction
} else if (only src is a register)
{
   //another different instruction encoding for ADD instruction
} else if (only dst is a register)
{
   //another different instruction encoding for ADD instruction
} else
{
   //src and dst are all on stack.
   //another different instruction encoding for ADD instruction
}

If you think the above if-else-else-else isn't so painful, try to imagine
it for a BPF_ALU64|BPF_*SHIFT* instruction :)

Tested on my PC(Intel(R) Core(TM) i5-5200U CPU) and virtualbox.

Testing results on i5-5200U:

1) test_bpf: Summary: 349 PASSED, 0 FAILED, [319/341 JIT'ed]
2) test_progs: Summary: 81 PASSED, 2 FAILED.
   test_progs reports "libbpf: incorrect bpf_call opcode" for
   test_l4lb_noinline and test_xdp_noinline, because there is
   no llvm-6.0 on my machine, and the current implementation doesn't
   support BPF_CALL, so I think we can ignore it.
3) test_lpm: OK
4) test_lru_map: OK
5) test_verifier: Summary: 823 PASSED, 5 FAILED
   test_verifier reports "invalid bpf_context access off=68 size=1/2/4/8"
   for all 5 FAILED testcases, and it reports them even when we turn off
   the JIT, so the JIT can do nothing to fix them.

The above tests were all done with the following flags enabled separately:
bpf_jit_enable=1 and bpf_jit_harden=2

Below are some numbers for this jit implementation:
Note:
  I ran test_progs 100 times in a loop for every testcase; the numbers
  are in the format total/times=avg. The numbers that test_bpf reports
  show almost the same relationship.

a:jit_enable=0 and jit_harden=0
  test_pkt_access:PASS:ipv4:15622/100=156
  test_pkt_access:PASS:ipv6:9130/100=91
  test_xdp:PASS:ipv4:240198/100=2401
  test_xdp:PASS:ipv6:137326/100=1373
  test_l4lb:PASS:ipv4:61100/100=611
  test_l4lb:PASS:ipv6:101000/100=1010

b:jit_enable=1 and jit_harden=0
  test_pkt_access:PASS:ipv4:10057/100=100
  test_pkt_access:PASS:ipv6:5055/100=50
  test_xdp:PASS:ipv4:145945/100=1459
  test_xdp:PASS:ipv6:67337/100=673
  test_l4lb:PASS:ipv4:38137/100=381
  test_l4lb:PASS:ipv6:57779/100=577

c:jit_enable=0 and jit_harden=2
  test_pkt_access:PASS:ipv4:15214/100=152
  test_pkt_access:PASS:ipv6:9132/100=91
  test_xdp:PASS:ipv4:237252/100=2372
  test_xdp:PASS:ipv6:135977/100=1359
  test_l4lb:PASS:ipv4:61324/100=613
  test_l4lb:PASS:ipv6:100833/100=1008

b:jit_enable=1 and jit_harden=2
  test_pkt_access:PASS:ipv4:12650/100=126
  test_pkt_access:PASS:ipv6:7074/100=70
  test_xdp:PASS:ipv4:147211/100=1472
  test_xdp:PASS:ipv6:85783/100=857
  test_l4lb:PASS:ipv4:53222/100=532
  test_l4lb:PASS:ipv6:76322/100=763

Yes, the numbers are pretty without jit_harden turned on. If we want to
speed up jit_harden, then we need to move BPF_REG_AX to a *real* register
instead of emulating it on the stack, but if we do that, we have to face
all the pain I described above. We can do it in a next step.

See Documentation/networking/filter.txt for more information.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 arch/x86/Kconfig |2 +-
 arch/x86/include/asm/nospec-branch.h |   26 +-
 arch/x86/net/Makefile|   10 +-
 arch/x86/net/bpf_jit32.S |  147 +++
 arch/x86/net/bpf_jit_comp32.c| 2239 ++
 5 files changed, 2419 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/net/bpf_jit32.S
 create mode 100644 arch/x86/net/bpf_jit_comp32.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 00fcf81..1f5fa2f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -137,7 +137,7 @@ config X86
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
-   select HAVE_EBPF_JITif X86_64
+   select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_EXIT_THREAD
select HAVE_FENTRY  if X86_64 || DY

[PATCH] bpf, doc: Correct one wrong value in "Register value tracking"

2018-01-24 Thread Wang YanQing
If we then OR this with 0x40, the value of the 6th bit (bit 0 being the
first bit) becomes known, so the correct mask is 0xbf instead of 0xcf.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Documentation/networking/filter.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index 8781485..a4508ec 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -1134,7 +1134,7 @@ The verifier's knowledge about the variable offset consists of:
 mask and value; no bit should ever be 1 in both.  For example, if a byte is read
 into a register from memory, the register's top 56 bits are known zero, while
 the low 8 are unknown - which is represented as the tnum (0x0; 0xff).  If we
-then OR this with 0x40, we get (0x40; 0xcf), then if we add 1 we get (0x0;
+then OR this with 0x40, we get (0x40; 0xbf), then if we add 1 we get (0x0;
 0x1ff), because of potential carries.
 Besides arithmetic, the register state can also be updated by conditional
 branches.  For instance, if a SCALAR_VALUE is compared > 8, in the 'true' branch
-- 
1.8.5.6.2.g3d8a54e.dirty


[PATCH v2] rtlwifi: pci: use dev_kfree_skb_irq instead of kfree_skb in rtl_pci_reset_trx_ring

2016-05-06 Thread Wang YanQing
We can't use kfree_skb in an irq-disabled context. Because spin_lock_irqsave
makes sure we are always in an irq-disabled context here, using
dev_kfree_skb_irq instead of kfree_skb is better than dev_kfree_skb_any.

This patch fixes the kernel warning below:
[ 7612.095528] [ cut here ]
[ 7612.095546] WARNING: CPU: 3 PID: 4460 at kernel/softirq.c:150 __local_bh_enable_ip+0x58/0x80()
[ 7612.095550] Modules linked in: rtl8723be x86_pkg_temp_thermal btcoexist rtl_pci rtlwifi rtl8723_common
[ 7612.095567] CPU: 3 PID: 4460 Comm: ifconfig Tainted: GW   4.4.0+ #4
[ 7612.095570] Hardware name: LENOVO 20DFA04FCD/20DFA04FCD, BIOS J5ET48WW (1.19) 08/27/2015
[ 7612.095574]    da37fc70 c12ce7c5  da37fca0 c104cc59 c19d4454
[ 7612.095584]  0003 116c c19d4784 0096 c10508a8 c10508a8 0200 c1b42400
[ 7612.095594]  f29be780 da37fcb0 c104ccad 0009  da37fcbc c10508a8 f21f08b8
[ 7612.095604] Call Trace:
[ 7612.095614]  [] dump_stack+0x41/0x5c
[ 7612.095620]  [] warn_slowpath_common+0x89/0xc0
[ 7612.095628]  [] ? __local_bh_enable_ip+0x58/0x80
[ 7612.095634]  [] ? __local_bh_enable_ip+0x58/0x80
[ 7612.095640]  [] warn_slowpath_null+0x1d/0x20
[ 7612.095646]  [] __local_bh_enable_ip+0x58/0x80
[ 7612.095653]  [] destroy_conntrack+0x64/0xa0
[ 7612.095660]  [] nf_conntrack_destroy+0xf/0x20
[ 7612.095665]  [] skb_release_head_state+0x55/0xa0
[ 7612.095670]  [] skb_release_all+0xb/0x20
[ 7612.095674]  [] __kfree_skb+0xb/0x60
[ 7612.095679]  [] kfree_skb+0x30/0x70
[ 7612.095686]  [] ? rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
[ 7612.095692]  [] rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
[ 7612.095698]  [] rtl_pci_start+0x19/0x190 [rtl_pci]
[ 7612.095705]  [] rtl_op_start+0x56/0x90 [rtlwifi]
[ 7612.095712]  [] drv_start+0x36/0xc0
[ 7612.095717]  [] ieee80211_do_open+0x2d3/0x890
[ 7612.095725]  [] ? call_netdevice_notifiers_info+0x2e/0x60
[ 7612.095730]  [] ieee80211_open+0x4d/0x50
[ 7612.095736]  [] __dev_open+0xa3/0x130
[ 7612.095742]  [] ? _raw_spin_unlock_bh+0x13/0x20
[ 7612.095748]  [] __dev_change_flags+0x89/0x140
[ 7612.095753]  [] ? selinux_capable+0xd/0x10
[ 7612.095759]  [] dev_change_flags+0x29/0x60
[ 7612.095765]  [] devinet_ioctl+0x553/0x670
[ 7612.095772]  [] ? _copy_to_user+0x28/0x40
[ 7612.095777]  [] inet_ioctl+0x85/0xb0
[ 7612.095783]  [] sock_ioctl+0x67/0x260
[ 7612.095788]  [] ? sock_fasync+0x80/0x80
[ 7612.095795]  [] do_vfs_ioctl+0x6b/0x550
[ 7612.095800]  [] ? selinux_file_ioctl+0x102/0x1e0
[ 7612.095807]  [] ? timekeeping_suspend+0x294/0x320
[ 7612.095813]  [] ? __hrtimer_run_queues+0x14a/0x210
[ 7612.095820]  [] ? security_file_ioctl+0x34/0x50
[ 7612.095827]  [] SyS_ioctl+0x70/0x80
[ 7612.095832]  [] do_fast_syscall_32+0x84/0x120
[ 7612.095839]  [] sysenter_past_esp+0x36/0x55
[ 7612.095844] ---[ end trace 97e9c637a20e8348 ]---

Signed-off-by: Wang YanQing <udkni...@gmail.com>
Cc: Stable <sta...@vger.kernel.org>
---
 Changes:
 v1-v2:
 1: add a Cc to stable.

 drivers/net/wireless/realtek/rtlwifi/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/pci.c b/drivers/net/wireless/realtek/rtlwifi/pci.c
index 1ac41b8..99a3a03 100644
--- a/drivers/net/wireless/realtek/rtlwifi/pci.c
+++ b/drivers/net/wireless/realtek/rtlwifi/pci.c
@@ -1572,7 +1572,7 @@ int rtl_pci_reset_trx_ring(struct ieee80211_hw *hw)
 true,
 HW_DESC_TXBUFF_ADDR),
 skb->len, PCI_DMA_TODEVICE);
-   kfree_skb(skb);
+   dev_kfree_skb_irq(skb);
ring->idx = (ring->idx + 1) % ring->entries;
}
ring->idx = 0;
-- 
1.8.5.6.2.g3d8a54e.dirty


[PATCH] rtlwifi: pci: use dev_kfree_skb_irq instead of kfree_skb in rtl_pci_reset_trx_ring

2016-05-05 Thread Wang YanQing
We can't use kfree_skb in an irq-disabled context. Because spin_lock_irqsave
makes sure we are always in an irq-disabled context here, using
dev_kfree_skb_irq instead of kfree_skb is better than dev_kfree_skb_any.

This patch fixes the kernel warning below:
[ 7612.095528] [ cut here ]
[ 7612.095546] WARNING: CPU: 3 PID: 4460 at kernel/softirq.c:150 __local_bh_enable_ip+0x58/0x80()
[ 7612.095550] Modules linked in: rtl8723be x86_pkg_temp_thermal btcoexist rtl_pci rtlwifi rtl8723_common
[ 7612.095567] CPU: 3 PID: 4460 Comm: ifconfig Tainted: GW   4.4.0+ #4
[ 7612.095570] Hardware name: LENOVO 20DFA04FCD/20DFA04FCD, BIOS J5ET48WW (1.19) 08/27/2015
[ 7612.095574]    da37fc70 c12ce7c5  da37fca0 c104cc59 c19d4454
[ 7612.095584]  0003 116c c19d4784 0096 c10508a8 c10508a8 0200 c1b42400
[ 7612.095594]  f29be780 da37fcb0 c104ccad 0009  da37fcbc c10508a8 f21f08b8
[ 7612.095604] Call Trace:
[ 7612.095614]  [] dump_stack+0x41/0x5c
[ 7612.095620]  [] warn_slowpath_common+0x89/0xc0
[ 7612.095628]  [] ? __local_bh_enable_ip+0x58/0x80
[ 7612.095634]  [] ? __local_bh_enable_ip+0x58/0x80
[ 7612.095640]  [] warn_slowpath_null+0x1d/0x20
[ 7612.095646]  [] __local_bh_enable_ip+0x58/0x80
[ 7612.095653]  [] destroy_conntrack+0x64/0xa0
[ 7612.095660]  [] nf_conntrack_destroy+0xf/0x20
[ 7612.095665]  [] skb_release_head_state+0x55/0xa0
[ 7612.095670]  [] skb_release_all+0xb/0x20
[ 7612.095674]  [] __kfree_skb+0xb/0x60
[ 7612.095679]  [] kfree_skb+0x30/0x70
[ 7612.095686]  [] ? rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
[ 7612.095692]  [] rtl_pci_reset_trx_ring+0x22d/0x370 [rtl_pci]
[ 7612.095698]  [] rtl_pci_start+0x19/0x190 [rtl_pci]
[ 7612.095705]  [] rtl_op_start+0x56/0x90 [rtlwifi]
[ 7612.095712]  [] drv_start+0x36/0xc0
[ 7612.095717]  [] ieee80211_do_open+0x2d3/0x890
[ 7612.095725]  [] ? call_netdevice_notifiers_info+0x2e/0x60
[ 7612.095730]  [] ieee80211_open+0x4d/0x50
[ 7612.095736]  [] __dev_open+0xa3/0x130
[ 7612.095742]  [] ? _raw_spin_unlock_bh+0x13/0x20
[ 7612.095748]  [] __dev_change_flags+0x89/0x140
[ 7612.095753]  [] ? selinux_capable+0xd/0x10
[ 7612.095759]  [] dev_change_flags+0x29/0x60
[ 7612.095765]  [] devinet_ioctl+0x553/0x670
[ 7612.095772]  [] ? _copy_to_user+0x28/0x40
[ 7612.095777]  [] inet_ioctl+0x85/0xb0
[ 7612.095783]  [] sock_ioctl+0x67/0x260
[ 7612.095788]  [] ? sock_fasync+0x80/0x80
[ 7612.095795]  [] do_vfs_ioctl+0x6b/0x550
[ 7612.095800]  [] ? selinux_file_ioctl+0x102/0x1e0
[ 7612.095807]  [] ? timekeeping_suspend+0x294/0x320
[ 7612.095813]  [] ? __hrtimer_run_queues+0x14a/0x210
[ 7612.095820]  [] ? security_file_ioctl+0x34/0x50
[ 7612.095827]  [] SyS_ioctl+0x70/0x80
[ 7612.095832]  [] do_fast_syscall_32+0x84/0x120
[ 7612.095839]  [] sysenter_past_esp+0x36/0x55
[ 7612.095844] ---[ end trace 97e9c637a20e8348 ]---

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 drivers/net/wireless/realtek/rtlwifi/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/pci.c 
b/drivers/net/wireless/realtek/rtlwifi/pci.c
index 1ac41b8..99a3a03 100644
--- a/drivers/net/wireless/realtek/rtlwifi/pci.c
+++ b/drivers/net/wireless/realtek/rtlwifi/pci.c
@@ -1572,7 +1572,7 @@ int rtl_pci_reset_trx_ring(struct ieee80211_hw *hw)
 true,
 HW_DESC_TXBUFF_ADDR),
 skb->len, PCI_DMA_TODEVICE);
-   kfree_skb(skb);
+   dev_kfree_skb_irq(skb);
ring->idx = (ring->idx + 1) % ring->entries;
}
ring->idx = 0;
-- 
1.8.5.6.2.g3d8a54e.dirty


[PATCH v2] rtlwifi: Remove double check for cnt_after_linked

2016-05-05 Thread Wang YanQing
rtl_lps_enter does two successive checks on cnt_after_linked
to make sure some time has elapsed since the link came up. The
second check isn't necessary: if cnt_after_linked is at least 5,
it is of course at least 2.

This patch removes the second check.
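Under that reasoning the old and new gates decide identically; a minimal C sketch (function names are illustrative, not the driver's) makes the equivalence checkable:

```c
#include <stdbool.h>

/* Old gate: early-return below 5, then a redundant >= 2 check. */
static bool enter_ps_old(int cnt_after_linked, bool psmode_active)
{
	if (cnt_after_linked < 5)
		return false;
	if (cnt_after_linked >= 2 && psmode_active)
		return true;
	return false;
}

/* New gate: the early return already guarantees cnt >= 5 >= 2. */
static bool enter_ps_new(int cnt_after_linked, bool psmode_active)
{
	if (cnt_after_linked < 5)
		return false;
	return psmode_active;
}
```

Exhaustively comparing both functions over small counter values shows they never disagree, which is why the patch is behavior-preserving.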

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 Changes
 v1-v2:
 1: rewrite subject and commit message.
 2: fix issues reported by checkpatch.pl.

 drivers/net/wireless/realtek/rtlwifi/ps.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/ps.c 
b/drivers/net/wireless/realtek/rtlwifi/ps.c
index b69321d..93579ca 100644
--- a/drivers/net/wireless/realtek/rtlwifi/ps.c
+++ b/drivers/net/wireless/realtek/rtlwifi/ps.c
@@ -443,14 +443,10 @@ void rtl_lps_enter(struct ieee80211_hw *hw)
 
spin_lock_irqsave(&rtlpriv->locks.lps_lock, flag);
 
-   /* Idle for a while if we connect to AP a while ago. */
-   if (mac->cnt_after_linked >= 2) {
-   if (ppsc->dot11_psmode == EACTIVE) {
-   RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD,
-"Enter 802.11 power save mode...\n");
-
-   rtl_lps_set_psmode(hw, EAUTOPS);
-   }
+   if (ppsc->dot11_psmode == EACTIVE) {
+   RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD,
+"Enter 802.11 power save mode...\n");
+   rtl_lps_set_psmode(hw, EAUTOPS);
}
 
spin_unlock_irqrestore(&rtlpriv->locks.lps_lock, flag);
-- 
1.8.5.6.2.g3d8a54e.dirty


[PATCH v2] rtlwifi: Fix logic error in enter/exit power-save mode

2016-05-02 Thread Wang YanQing
In commit a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and
rtl_lps_enter() to use work queue"), the tests for enter/exit
power-save mode were inverted. With this change applied, the
wifi connection becomes much more stable.
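A minimal sketch of the intended decision, assuming an illustrative helper name rather than the driver's actual structure, is:

```c
#include <stdbool.h>

/* Model of the fixed rtl_watchdog_wq_callback decision: heavy recent
 * traffic means the link is busy, so power save must be LEFT; a quiet
 * link lets us ENTER it. Thresholds mirror the driver's diff below;
 * the function name is illustrative. */
static bool should_leave_power_save(int num_rx_inperiod,
				    int num_tx_inperiod)
{
	return (num_rx_inperiod + num_tx_inperiod) > 8 ||
	       num_rx_inperiod > 2;
}
```

The regression swapped the two branches, so a busy link entered power save and a quiet one left it, which matches the unstable-connection symptom.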

Fixes: a269913c52ad ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter() to 
use work queue")
Signed-off-by: Wang YanQing <udkni...@gmail.com>
CC: Stable <sta...@vger.kernel.org> [3.10+]
---
 Hi, Larry!

 Because commit a269913c52ad is the first commit that introduced this
 problem, using the commit message above should be OK, right? And stable
 kernels 3.10-3.18 don't have commit fd09ff958777, but they do have
 a269913c52ad.

 Thanks for the suggestions on writing a good subject and commit
 message; it is harder than coding sometimes :)

 Changes:
 v1-v2:
 1: Fix subject and commit message.

 drivers/net/wireless/realtek/rtlwifi/base.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/base.c 
b/drivers/net/wireless/realtek/rtlwifi/base.c
index c74eb13..264466f 100644
--- a/drivers/net/wireless/realtek/rtlwifi/base.c
+++ b/drivers/net/wireless/realtek/rtlwifi/base.c
@@ -1660,9 +1660,9 @@ void rtl_watchdog_wq_callback(void *data)
if (((rtlpriv->link_info.num_rx_inperiod +
  rtlpriv->link_info.num_tx_inperiod) > 8) ||
(rtlpriv->link_info.num_rx_inperiod > 2))
-   rtl_lps_enter(hw);
-   else
rtl_lps_leave(hw);
+   else
+   rtl_lps_enter(hw);
}
 
rtlpriv->link_info.num_rx_inperiod = 0;
-- 
1.8.5.6.2.g3d8a54e.dirty


[PATCH] rtlwifi:rtl_lps_enter: fix double check cnt_after_linked

2016-05-02 Thread Wang YanQing
We have already checked cnt_after_linked in the code below:
"
/*sleep after linked 10s, to let DHCP and 4-way handshake ok enough!! */
if (mac->cnt_after_linked < 5)
return;
"
So the second check isn't necessary.

This patch deletes the second check.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 drivers/net/wireless/realtek/rtlwifi/ps.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/ps.c 
b/drivers/net/wireless/realtek/rtlwifi/ps.c
index b69321d..9d73581c 100644
--- a/drivers/net/wireless/realtek/rtlwifi/ps.c
+++ b/drivers/net/wireless/realtek/rtlwifi/ps.c
@@ -443,14 +443,11 @@ void rtl_lps_enter(struct ieee80211_hw *hw)
 
spin_lock_irqsave(&rtlpriv->locks.lps_lock, flag);
 
-   /* Idle for a while if we connect to AP a while ago. */
-   if (mac->cnt_after_linked >= 2) {
-   if (ppsc->dot11_psmode == EACTIVE) {
-   RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD,
-"Enter 802.11 power save mode...\n");
+   if (ppsc->dot11_psmode == EACTIVE) {
+   RT_TRACE(rtlpriv, COMP_POWER, DBG_LOUD,
+   "Enter 802.11 power save mode...\n");
 
-   rtl_lps_set_psmode(hw, EAUTOPS);
-   }
+   rtl_lps_set_psmode(hw, EAUTOPS);
}
 
spin_unlock_irqrestore(&rtlpriv->locks.lps_lock, flag);
-- 
1.8.5.6.2.g3d8a54e.dirty


[PATCH] rtlwifi:rtl_watchdog_wq_callback: fix calling rtl_lps_enter|rtl_lps_leave in opposite condition

2016-05-01 Thread Wang YanQing
Commit a269913c52ad37952a4d9953bb6d748f7299c304
("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter() to use work queue")
made a mistake and inverted the meaning of the num_tx|rx_inperiod
comparison test.

Commit fd09ff958777cf583d7541f180991c0fc50bd2f7
("rtlwifi: Remove extra workqueue for enter/leave power state")
carried the mistake forward, bringing us to the current code.

This patch fixes it.

Signed-off-by: Wang YanQing <udkni...@gmail.com>
---
 I think this patch should be ported back to stable kernels, 3.10+.
 On my machine, I lose the wifi connection after a few minutes if I
 enable fwlps.
 
 drivers/net/wireless/realtek/rtlwifi/base.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtlwifi/base.c 
b/drivers/net/wireless/realtek/rtlwifi/base.c
index c74eb13..264466f 100644
--- a/drivers/net/wireless/realtek/rtlwifi/base.c
+++ b/drivers/net/wireless/realtek/rtlwifi/base.c
@@ -1660,9 +1660,9 @@ void rtl_watchdog_wq_callback(void *data)
if (((rtlpriv->link_info.num_rx_inperiod +
  rtlpriv->link_info.num_tx_inperiod) > 8) ||
(rtlpriv->link_info.num_rx_inperiod > 2))
-   rtl_lps_enter(hw);
-   else
rtl_lps_leave(hw);
+   else
+   rtl_lps_enter(hw);
}
 
rtlpriv->link_info.num_rx_inperiod = 0;
-- 
1.8.5.6.2.g3d8a54e.dirty