Re: [PATCH bpf-next 1/2] bpf: Remove bpf_jit_enable=2 debugging mode

2021-04-16 Thread Jianlin Lv
On Thu, Apr 15, 2021 at 10:38 PM Daniel Borkmann  wrote:
>
> On 4/15/21 11:32 AM, Jianlin Lv wrote:
> > For debugging JITs, dumping the JITed image to kernel log is discouraged,
> > "bpftool prog dump jited" is much better way to examine JITed dumps.
> > This patch get rid of the code related to bpf_jit_enable=2 mode and
> > update the proc handler of bpf_jit_enable, also added auxiliary
> > information to explain how to use bpf_jit_disasm tool after this change.
> >
> > Signed-off-by: Jianlin Lv 
> [...]
> > diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
> > index 0a7a2870f111..8d36b4658076 100644
> > --- a/arch/x86/net/bpf_jit_comp32.c
> > +++ b/arch/x86/net/bpf_jit_comp32.c
> > @@ -2566,9 +2566,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog 
> > *prog)
> >   cond_resched();
> >   }
> >
> > - if (bpf_jit_enable > 1)
> > - bpf_jit_dump(prog->len, proglen, pass + 1, image);
> > -
> >   if (image) {
> >   bpf_jit_binary_lock_ro(header);
> >   prog->bpf_func = (void *)image;
> > diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
> > index c8496c1142c9..990b1720c7a4 100644
> > --- a/net/core/sysctl_net_core.c
> > +++ b/net/core/sysctl_net_core.c
> > @@ -273,16 +273,8 @@ static int proc_dointvec_minmax_bpf_enable(struct 
> > ctl_table *table, int write,
> >
> >   tmp.data = _enable;
> >   ret = proc_dointvec_minmax(, write, buffer, lenp, ppos);
> > - if (write && !ret) {
> > - if (jit_enable < 2 ||
> > - (jit_enable == 2 && bpf_dump_raw_ok(current_cred( {
> > - *(int *)table->data = jit_enable;
> > - if (jit_enable == 2)
> > - pr_warn("bpf_jit_enable = 2 was set! NEVER 
> > use this in production, only for JIT debugging!\n");
> > - } else {
> > - ret = -EPERM;
> > - }
> > - }
> > + if (write && !ret)
> > + *(int *)table->data = jit_enable;
> >   return ret;
> >   }
> >
> > @@ -389,7 +381,7 @@ static struct ctl_table net_core_table[] = {
> >   .extra2 = SYSCTL_ONE,
> >   # else
> >   .extra1 = SYSCTL_ZERO,
> > - .extra2 = ,
> > + .extra2 = SYSCTL_ONE,
> >   # endif
> >   },
> >   # ifdef CONFIG_HAVE_EBPF_JIT
> > diff --git a/tools/bpf/bpf_jit_disasm.c b/tools/bpf/bpf_jit_disasm.c
> > index c8ae95804728..efa4b17ae016 100644
> > --- a/tools/bpf/bpf_jit_disasm.c
> > +++ b/tools/bpf/bpf_jit_disasm.c
> > @@ -7,7 +7,7 @@
> >*
> >* To get the disassembly of the JIT code, do the following:
> >*
> > - *  1) `echo 2 > /proc/sys/net/core/bpf_jit_enable`
> > + *  1) Insert bpf_jit_dump() and recompile the kernel to output JITed 
> > image into log
>
> Hmm, if we remove bpf_jit_dump(), the next drive-by cleanup patch will be 
> thrown
> at bpf@vger stating that bpf_jit_dump() has no in-tree users and should be 
> removed.
> Maybe we should be removing bpf_jit_disasm.c along with it as well as 
> bpf_jit_dump()
> itself ... I guess if it's ever needed in those rare occasions for JIT 
> debugging we
> can resurrect it from old kernels just locally. But yeah, bpftool's jit dump 
> should
> suffice for vast majority of use cases.
>
> There was a recent set for ppc32 jit which was merged into ppc tree which 
> will create
> a merge conflict with this one [0]. So we would need a rebase and take it 
> maybe during
> merge win once the ppc32 landed..
>
>[0] 
> https://lore.kernel.org/bpf/cover.1616430991.git.christophe.le...@csgroup.eu/
>
> >*  2) Load a BPF filter (e.g. `tcpdump -p -n -s 0 -i eth1 host 
> > 192.168.20.0/24`)
> >*  3) Run e.g. `bpf_jit_disasm -o` to read out the last JIT code
> >*
> > diff --git a/tools/bpf/bpftool/feature.c b/tools/bpf/bpftool/feature.c
> > index 40a88df275f9..98c7eec2923f 100644
> > --- a/tools/bpf/bpftool/feature.c
> > +++ b/tools/bpf/bpftool/feature.c
> > @@ -203,9 +203,6 @@ static void probe_jit_enable(void)
> >   case 1:
> >   printf("JIT compiler is enabled\n");
> >   break;
> > - case 2:
> > - printf("JIT compiler is enabled with debugging traces 
> > in kernel logs\n");
> > - break;
>
> This would still need to be there for older kernels ...

I will submit another version after ppc32 landed to remove
bpf_jit_disasm.c and restore bpftool/feature.c

Jianlin


>
> >   case -1:
> >   printf("Unable to retrieve JIT-compiler status\n");
> >   break;
> >
>


[PATCH bpf-next 2/2] docs: bpf: bpf_jit_enable mode changed

2021-04-15 Thread Jianlin Lv
Remove information about bpf_jit_enable=2 mode and added description for
how to use the bpf_jit_disasm tool after get rid of =2 mode.

Signed-off-by: Jianlin Lv 
---
 Documentation/admin-guide/sysctl/net.rst |  1 -
 Documentation/networking/filter.rst  | 25 ++--
 2 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/net.rst 
b/Documentation/admin-guide/sysctl/net.rst
index c941b214e0b7..a39f99deac38 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -86,7 +86,6 @@ Values:
 
- 0 - disable the JIT (default value)
- 1 - enable the JIT
-   - 2 - enable the JIT and ask the compiler to emit traces on kernel log.
 
 bpf_jit_harden
 --
diff --git a/Documentation/networking/filter.rst 
b/Documentation/networking/filter.rst
index 251c6bd73d15..86954f922168 100644
--- a/Documentation/networking/filter.rst
+++ b/Documentation/networking/filter.rst
@@ -504,25 +504,12 @@ been previously enabled by root::
 
   echo 1 > /proc/sys/net/core/bpf_jit_enable
 
-For JIT developers, doing audits etc, each compile run can output the generated
-opcode image into the kernel log via::
-
-  echo 2 > /proc/sys/net/core/bpf_jit_enable
-
-Example output from dmesg::
-
-[ 3389.935842] flen=6 proglen=70 pass=3 image=a0069c8f
-[ 3389.935847] JIT code: : 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 
8b 4f 68
-[ 3389.935849] JIT code: 0010: 44 2b 4f 6c 4c 8b 87 d8 00 00 00 be 0c 
00 00 00
-[ 3389.935850] JIT code: 0020: e8 1d 94 ff e0 3d 00 08 00 00 75 16 be 
17 00 00
-[ 3389.935851] JIT code: 0030: 00 e8 28 94 ff e0 83 f8 01 75 07 b8 ff 
ff 00 00
-[ 3389.935852] JIT code: 0040: eb 02 31 c0 c9 c3
-
-When CONFIG_BPF_JIT_ALWAYS_ON is enabled, bpf_jit_enable is permanently set to 
1 and
-setting any other value than that will return in failure. This is even the 
case for
-setting bpf_jit_enable to 2, since dumping the final JIT image into the kernel 
log
-is discouraged and introspection through bpftool (under tools/bpf/bpftool/) is 
the
-generally recommended approach instead.
+When CONFIG_BPF_JIT_ALWAYS_ON is enabled, bpf_jit_enable is permanently set
+to 1 and setting any other value than that will return in failure.
+For debugging JITs, the introspection through bpftool (tools/bpf/bpftool/)
+is the generally recommended approach instead. For JIT developers, doing
+audits etc, you can insert bpf_jit_dump() and recompile the kernel to
+output the generated opcode image into the kernel log.
 
 In the kernel source tree under tools/bpf/, there's bpf_jit_disasm for
 generating disassembly out of the kernel log's hexdump::
-- 
2.25.1



[PATCH bpf-next 1/2] bpf: Remove bpf_jit_enable=2 debugging mode

2021-04-15 Thread Jianlin Lv
For debugging JITs, dumping the JITed image to kernel log is discouraged,
"bpftool prog dump jited" is much better way to examine JITed dumps.
This patch get rid of the code related to bpf_jit_enable=2 mode and
update the proc handler of bpf_jit_enable, also added auxiliary
information to explain how to use bpf_jit_disasm tool after this change.

Signed-off-by: Jianlin Lv 
---
 arch/arm/net/bpf_jit_32.c |  4 
 arch/arm64/net/bpf_jit_comp.c |  4 
 arch/mips/net/bpf_jit.c   |  4 
 arch/mips/net/ebpf_jit.c  |  4 
 arch/powerpc/net/bpf_jit_comp.c   | 10 --
 arch/powerpc/net/bpf_jit_comp64.c | 11 ---
 arch/riscv/net/bpf_jit_core.c |  3 ---
 arch/s390/net/bpf_jit_comp.c  |  4 
 arch/sparc/net/bpf_jit_comp_32.c  |  3 ---
 arch/sparc/net/bpf_jit_comp_64.c  | 13 -
 arch/x86/net/bpf_jit_comp.c   |  3 ---
 arch/x86/net/bpf_jit_comp32.c |  3 ---
 net/core/sysctl_net_core.c| 14 +++---
 tools/bpf/bpf_jit_disasm.c|  2 +-
 tools/bpf/bpftool/feature.c   |  3 ---
 15 files changed, 4 insertions(+), 81 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 897634d0a67c..92d669c0b2d3 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1997,10 +1997,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog 
*prog)
}
flush_icache_range((u32)header, (u32)(ctx.target + ctx.idx));
 
-   if (bpf_jit_enable > 1)
-   /* there are 2 passes here */
-   bpf_jit_dump(prog->len, image_size, 2, ctx.target);
-
bpf_jit_binary_lock_ro(header);
prog->bpf_func = (void *)ctx.target;
prog->jited = 1;
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index f7b194878a99..a13b83ac4ca8 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1090,10 +1090,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog 
*prog)
goto out_off;
}
 
-   /* And we're done. */
-   if (bpf_jit_enable > 1)
-   bpf_jit_dump(prog->len, prog_size, 2, ctx.image);
-
bpf_flush_icache(header, ctx.image + ctx.idx);
 
if (!prog->is_func || extra_pass) {
diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c
index 0af88622c619..b5221282dd88 100644
--- a/arch/mips/net/bpf_jit.c
+++ b/arch/mips/net/bpf_jit.c
@@ -1250,10 +1250,6 @@ void bpf_jit_compile(struct bpf_prog *fp)
/* Update the icache */
flush_icache_range((ptr)ctx.target, (ptr)(ctx.target + ctx.idx));
 
-   if (bpf_jit_enable > 1)
-   /* Dump JIT code */
-   bpf_jit_dump(fp->len, alloc_size, 2, ctx.target);
-
fp->bpf_func = (void *)ctx.target;
fp->jited = 1;
 
diff --git a/arch/mips/net/ebpf_jit.c b/arch/mips/net/ebpf_jit.c
index 939dd06764bc..dac5a1fc2462 100644
--- a/arch/mips/net/ebpf_jit.c
+++ b/arch/mips/net/ebpf_jit.c
@@ -1910,10 +1910,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog 
*prog)
flush_icache_range((unsigned long)ctx.target,
   (unsigned long)[ctx.idx]);
 
-   if (bpf_jit_enable > 1)
-   /* Dump JIT code */
-   bpf_jit_dump(prog->len, image_size, 2, ctx.target);
-
bpf_jit_binary_lock_ro(header);
prog->bpf_func = (void *)ctx.target;
prog->jited = 1;
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index e809cb5a1631..ebca629de2d1 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -646,18 +646,8 @@ void bpf_jit_compile(struct bpf_prog *fp)
bpf_jit_build_prologue(fp, code_base, );
bpf_jit_build_body(fp, code_base, , addrs);
bpf_jit_build_epilogue(code_base, );
-
-   if (bpf_jit_enable > 1)
-   pr_info("Pass %d: shrink = %d, seen = 0x%x\n", pass,
-   proglen - (cgctx.idx * 4), cgctx.seen);
}
 
-   if (bpf_jit_enable > 1)
-   /* Note that we output the base address of the code_base
-* rather than image, since opcodes are in code_base.
-*/
-   bpf_jit_dump(flen, proglen, pass, code_base);
-
bpf_flush_icache(code_base, code_base + (proglen/4));
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/net/bpf_jit_comp64.c 
b/arch/powerpc/net/bpf_jit_comp64.c
index aaf1a887f653..26243399ef2e 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -1215,20 +1215,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
bpf_jit_build_prologue(code_base, );
bpf_jit_build_body(fp, code_base, , addrs, extra_pass);
bpf_jit_build_epilogue(code_base, );
-
-   if (bpf_jit_enable > 1)
-

Re: [PATCH bpf-next] bpf: arm64: Redefine MOV consistent with arch insn

2021-03-31 Thread Jianlin Lv
On Wed, Mar 31, 2021 at 5:28 PM Will Deacon  wrote:
>
> On Wed, Mar 31, 2021 at 05:22:18PM +0800, Jianlin Lv wrote:
> > On Tue, Mar 30, 2021 at 5:31 PM Will Deacon  wrote:
> > >
> > > On Tue, Mar 30, 2021 at 03:42:35PM +0800, Jianlin Lv wrote:
> > > > A64_MOV is currently mapped to Add Instruction. Architecturally MOV
> > > > (register) is an alias of ORR (shifted register) and MOV (to or from SP)
> > > > is an alias of ADD (immediate).
> > > > This patch redefines A64_MOV and uses existing functionality
> > > > aarch64_insn_gen_move_reg() in insn.c to encode MOV (register) 
> > > > instruction.
> > > > For moving between register and stack pointer, rename macro to 
> > > > A64_MOV_SP.
> > >
> > > What does this gain us? There's no requirement for a BPF "MOV" to match an
> > > arm64 architectural "MOV", so what's the up-side of aligning them like 
> > > this?
> >
> > According to the description in the Arm Software Optimization Guide,
> > Arithmetic(basic) and Logical(basic) instructions have the same
> > Exec Latency and Execution Throughput.
> > This change did not bring about a performance improvement.
> > The original intention was to make the instruction map more 'natively'.
>
> I think we should leave the code as-is, then. Having a separate MOV_SP
> macro s confusing and, worse, I worry that somebody passing A64_SP to
> A64_MOV will end up using the zero register.
>
> Will

OK, your concerns are justified. I have made such mistakes.

Jianlin


Re: [PATCH bpf-next] bpf: arm64: Redefine MOV consistent with arch insn

2021-03-31 Thread Jianlin Lv
On Tue, Mar 30, 2021 at 5:31 PM Will Deacon  wrote:
>
> On Tue, Mar 30, 2021 at 03:42:35PM +0800, Jianlin Lv wrote:
> > A64_MOV is currently mapped to Add Instruction. Architecturally MOV
> > (register) is an alias of ORR (shifted register) and MOV (to or from SP)
> > is an alias of ADD (immediate).
> > This patch redefines A64_MOV and uses existing functionality
> > aarch64_insn_gen_move_reg() in insn.c to encode MOV (register) instruction.
> > For moving between register and stack pointer, rename macro to A64_MOV_SP.
>
> What does this gain us? There's no requirement for a BPF "MOV" to match an
> arm64 architectural "MOV", so what's the up-side of aligning them like this?
>
> Cheers,
>
> Will

According to the description in the Arm Software Optimization Guide,
Arithmetic(basic) and Logical(basic) instructions have the same
Exec Latency and Execution Throughput.
This change did not bring about a performance improvement.
The original intention was to make the instruction map more 'natively'.

Jianlin


[PATCH bpf-next] bpf: arm64: Redefine MOV consistent with arch insn

2021-03-30 Thread Jianlin Lv
A64_MOV is currently mapped to Add Instruction. Architecturally MOV
(register) is an alias of ORR (shifted register) and MOV (to or from SP)
is an alias of ADD (immediate).
This patch redefines A64_MOV and uses existing functionality
aarch64_insn_gen_move_reg() in insn.c to encode MOV (register) instruction.
For moving between register and stack pointer, rename macro to A64_MOV_SP.

Test:
modprobe test_bpf test_name="SPILL_FILL"
bpf_jit_disasm -o

without patch:
   0:   stp x29, x30, [sp, #-16]!
fd 7b bf a9
   4:   mov x29, sp
fd 03 00 91
...
  14:   mov x25, sp
f9 03 00 91
...
  24:   add x19, x0, #0x0
13 00 00 91
...
  8c:   add x0, x7, #0x0
e0 00 00 91

with patch:
   0:   stp x29, x30, [sp, #-16]!
fd 7b bf a9
   4:   mov x29, sp
fd 03 00 91
...
  14:   mov x25, sp
f9 03 00 91
...
  24:   mov x19, x0
f3 03 00 aa
...
  8c:   mov x0, x7
e0 03 07 aa

Signed-off-by: Jianlin Lv 
---
 arch/arm64/net/bpf_jit.h  | 7 +--
 arch/arm64/net/bpf_jit_comp.c | 4 ++--
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h
index cc0cf0f5c7c3..a2c3cddb1d2f 100644
--- a/arch/arm64/net/bpf_jit.h
+++ b/arch/arm64/net/bpf_jit.h
@@ -108,8 +108,8 @@
 #define A64_CMN_I(sf, Rn, imm12) A64_ADDS_I(sf, A64_ZR, Rn, imm12)
 /* Rn - imm12; set condition flags */
 #define A64_CMP_I(sf, Rn, imm12) A64_SUBS_I(sf, A64_ZR, Rn, imm12)
-/* Rd = Rn */
-#define A64_MOV(sf, Rd, Rn) A64_ADD_I(sf, Rd, Rn, 0)
+/* Rd = Rn; MOV (to/from SP) */
+#define A64_MOV_SP(sf, Rd, Rn) A64_ADD_I(sf, Rd, Rn, 0)
 
 /* Bitfield move */
 #define A64_BITFIELD(sf, Rd, Rn, immr, imms, type) \
@@ -134,6 +134,9 @@
 #define A64_UXTH(sf, Rd, Rn) A64_UBFM(sf, Rd, Rn, 0, 15)
 #define A64_UXTW(sf, Rd, Rn) A64_UBFM(sf, Rd, Rn, 0, 31)
 
+/* Rd = Rm; MOV (register) */
+#define A64_MOV(sf, Rd, Rm) aarch64_insn_gen_move_reg(Rd, Rm, A64_VARIANT(sf))
+
 /* Move wide (immediate) */
 #define A64_MOVEW(sf, Rd, imm16, shift, type) \
aarch64_insn_gen_movewide(Rd, imm16, shift, \
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index f7b194878a99..2a118c0fcb30 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -229,7 +229,7 @@ static int build_prologue(struct jit_ctx *ctx, bool 
ebpf_from_cbpf)
 
/* Save FP and LR registers to stay align with ARM64 AAPCS */
emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx);
-   emit(A64_MOV(1, A64_FP, A64_SP), ctx);
+   emit(A64_MOV_SP(1, A64_FP, A64_SP), ctx);
 
/* Save callee-saved registers */
emit(A64_PUSH(r6, r7, A64_SP), ctx);
@@ -237,7 +237,7 @@ static int build_prologue(struct jit_ctx *ctx, bool 
ebpf_from_cbpf)
emit(A64_PUSH(fp, tcc, A64_SP), ctx);
 
/* Set up BPF prog stack base register */
-   emit(A64_MOV(1, fp, A64_SP), ctx);
+   emit(A64_MOV_SP(1, fp, A64_SP), ctx);
 
if (!ebpf_from_cbpf) {
/* Initialize tail_call_cnt */
-- 
2.25.1



Re: [PATCH bpf-next] bpf: trace jit code when enable BPF_JIT_ALWAYS_ON

2021-03-27 Thread Jianlin Lv
On Sat, Mar 27, 2021 at 11:19 PM Alexei Starovoitov
 wrote:
>
> On Sat, Mar 27, 2021 at 1:19 AM Jianlin Lv  wrote:
> >
> > > On Fri, Mar 26, 2021 at 5:40 AM Jianlin Lv  wrote:
> > > >
> > > > When CONFIG_BPF_JIT_ALWAYS_ON is enabled, the value of
> > > bpf_jit_enable
> > > > in /proc/sys is limited to SYSCTL_ONE. This is not convenient for 
> > > > debugging.
> > > > This patch modifies the value of extra2 (max) to 2 that support
> > > > developers to emit traces on kernel log.
> > > >
> > > > Signed-off-by: Jianlin Lv 
> > > > ---
> > > >  net/core/sysctl_net_core.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
> > > > index d84c8a1b280e..aa16883ac445 100644
> > > > --- a/net/core/sysctl_net_core.c
> > > > +++ b/net/core/sysctl_net_core.c
> > > > @@ -386,7 +386,7 @@ static struct ctl_table net_core_table[] = {
> > > > .proc_handler   = proc_dointvec_minmax_bpf_enable,
> > > >  # ifdef CONFIG_BPF_JIT_ALWAYS_ON
> > > > .extra1 = SYSCTL_ONE,
> > > > -   .extra2 = SYSCTL_ONE,
> > > > +   .extra2 = ,
> > >
> > > "bpftool prog dump jited" is much better way to examine JITed dumps.
> > > I'd rather remove bpf_jit_enable=2 altogether.
> >
> > In my case, I introduced a bug when I made some adjustments to the arm64
> > jit macro A64_MOV(), which caused the SP register to be replaced by the
> > XZR register when building prologue, and the wrong value was stored in fp,
> > which triggered a crash.
> >
> > This bug is likely to cause the instruction to access the BPF stack in
> > jited prog to trigger a crash.
> > I tried to use bpftool to debug, but bpftool crashed when I executed the
> > "bpftool prog show" command.
> > The syslog shown that bpftool is loading and running some bpf prog.
> > because of the bug in the JIT compiler, the bpftool execution failed.
>
> Right 'bpftool prog show' command is loading a bpf iterator prog,
> but you didn't need to use it to dump JITed code.
> "bpftool prog dump jited name my_prog"
> would have dumped it even when JIT is all buggy.
>
> > bpf_jit_disasm saved me, it helped me dump the jited image:
> >
> > echo 2> /proc/sys/net/core/bpf_jit_enable
> > modprobe test_bpf test_name="SPILL_FILL"
> > ./bpf_jit_disasm -o
> >
> > So keeping bpf_jit_enable=2 is still very meaningful for developers who
> > try to modify the JIT compiler.
>
> sure and such JIT developers can compile the kernel
> without BPF_JIT_ALWAYS_ON just like you did.
> They can also insert printk, etc.
> bpf_jit_enable=2 was done long ago when there was no other way
> to see JITed code. Now we have proper apis.
> That =2 mode can and should be removed.

Thanks for your reply, I will prepare another patch to remove =2mode.

>
> > IMPORTANT NOTICE: The contents of this email and any attachments are 
> > confidential and may also be privileged. If you are not the intended 
> > recipient, please notify the sender immediately and do not disclose the 
> > contents to any other person, use it for any purpose, or store or copy the 
> > information in any medium. Thank you.
>
> please fix your email server/client/whatever. No patches will ever be
> accepted with
> such disclaimer.

Apologize for this.
Jianlin


RE: [PATCH bpf-next] bpf: trace jit code when enable BPF_JIT_ALWAYS_ON

2021-03-27 Thread Jianlin Lv


> -Original Message-
> From: Alexei Starovoitov 
> Sent: Friday, March 26, 2021 10:25 PM
> To: Jianlin Lv 
> Cc: bpf ; David S. Miller ;
> Jakub Kicinski ; Alexei Starovoitov ;
> Daniel Borkmann ; Andrii Nakryiko
> ; Martin KaFai Lau ; Song Liu
> ; Yonghong Song ; John Fastabend
> ; KP Singh ; Alexander
> Viro ; Andrey Ignatov ; Dmitry
> Vyukov ; Nicolas Dichtel
> ; Kees Cook ;
> Masahiro Yamada ; Mahesh Bandewar
> ; LKML ; Network
> Development ; iece...@gmail.com; nd
> 
> Subject: Re: [PATCH bpf-next] bpf: trace jit code when enable
> BPF_JIT_ALWAYS_ON
>
> On Fri, Mar 26, 2021 at 5:40 AM Jianlin Lv  wrote:
> >
> > When CONFIG_BPF_JIT_ALWAYS_ON is enabled, the value of
> bpf_jit_enable
> > in /proc/sys is limited to SYSCTL_ONE. This is not convenient for debugging.
> > This patch modifies the value of extra2 (max) to 2 that support
> > developers to emit traces on kernel log.
> >
> > Signed-off-by: Jianlin Lv 
> > ---
> >  net/core/sysctl_net_core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
> > index d84c8a1b280e..aa16883ac445 100644
> > --- a/net/core/sysctl_net_core.c
> > +++ b/net/core/sysctl_net_core.c
> > @@ -386,7 +386,7 @@ static struct ctl_table net_core_table[] = {
> > .proc_handler   = proc_dointvec_minmax_bpf_enable,
> >  # ifdef CONFIG_BPF_JIT_ALWAYS_ON
> > .extra1 = SYSCTL_ONE,
> > -   .extra2 = SYSCTL_ONE,
> > +   .extra2 = ,
>
> "bpftool prog dump jited" is much better way to examine JITed dumps.
> I'd rather remove bpf_jit_enable=2 altogether.

In my case, I introduced a bug when I made some adjustments to the arm64
jit macro A64_MOV(), which caused the SP register to be replaced by the
XZR register when building prologue, and the wrong value was stored in fp,
which triggered a crash.

Test case:
modprobe test_bpf test_name="SPILL_FILL"

jited code:
   0:   stp x29, x30, [sp, #-16]!
fd 7b bf a9
   4:   mov x29, xzr//Err, should be 'mov  x29, sp'
fd 03 1f aa
   8:   stp x19, x20, [sp, #-16]!
f3 53 bf a9
   c:   stp x21, x22, [sp, #-16]!
f5 5b bf a9
  10:   stp x25, x26, [sp, #-16]!
f9 6b bf a9
  14:   mov x25, xzr//Err,  should be 'mov x25, sp'
f9 03 1f aa
...
  3c:   mov x10, #0xfff8// #-8
ea 00 80 92
  40:   str w7, [x25, x10]// Crash
27 6b 2a b8

This bug is likely to cause the instruction to access the BPF stack in
jited prog to trigger a crash.
I tried to use bpftool to debug, but bpftool crashed when I executed the
"bpftool prog show" command.
The syslog shown that bpftool is loading and running some bpf prog.
because of the bug in the JIT compiler, the bpftool execution failed.

bpf_jit_disasm saved me, it helped me dump the jited image:

echo 2> /proc/sys/net/core/bpf_jit_enable
modprobe test_bpf test_name="SPILL_FILL"
./bpf_jit_disasm -o

So keeping bpf_jit_enable=2 is still very meaningful for developers who
try to modify the JIT compiler.

Jianlin



IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


[PATCH bpf-next] bpf: trace jit code when enable BPF_JIT_ALWAYS_ON

2021-03-26 Thread Jianlin Lv
When CONFIG_BPF_JIT_ALWAYS_ON is enabled, the value of bpf_jit_enable in
/proc/sys is limited to SYSCTL_ONE. This is not convenient for debugging.
This patch modifies the value of extra2 (max) to 2 that support developers
to emit traces on kernel log.

Signed-off-by: Jianlin Lv 
---
 net/core/sysctl_net_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index d84c8a1b280e..aa16883ac445 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -386,7 +386,7 @@ static struct ctl_table net_core_table[] = {
.proc_handler   = proc_dointvec_minmax_bpf_enable,
 # ifdef CONFIG_BPF_JIT_ALWAYS_ON
.extra1 = SYSCTL_ONE,
-   .extra2 = SYSCTL_ONE,
+   .extra2 = ,
 # else
.extra1 = SYSCTL_ZERO,
.extra2 = ,
-- 
2.25.1



Re: [PATCH bpf-next v2] bpf: Simplify expression for identify bpf mem type

2021-03-18 Thread Jianlin Lv
On Thu, Mar 18, 2021 at 11:58 PM Daniel Borkmann  wrote:
>
> On 3/18/21 7:36 AM, Jianlin Lv wrote:
> > Added BPF_LD_ST_SIZE_MASK macro as mask of size modifier that help to
> > reduce the evaluation of expressions in if statements,
> > and remove BPF_SIZE_MASK in netronome driver.
> >
> > Signed-off-by: Jianlin Lv 
> > ---
> > v2: Move the bpf_LD_ST_SIZE_MASK macro definition to include/linux/bpf.h
> > ---
> >   drivers/net/ethernet/netronome/nfp/bpf/main.h |  8 +++-
> >   include/linux/bpf.h   |  1 +
> >   kernel/bpf/verifier.c | 12 
> >   3 files changed, 8 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.h 
> > b/drivers/net/ethernet/netronome/nfp/bpf/main.h
> > index d0e17eebddd9..e90981e69763 100644
> > --- a/drivers/net/ethernet/netronome/nfp/bpf/main.h
> > +++ b/drivers/net/ethernet/netronome/nfp/bpf/main.h
> > @@ -346,8 +346,6 @@ struct nfp_insn_meta {
> >   struct list_head l;
> >   };
> >
> > -#define BPF_SIZE_MASK0x18
> > -
> >   static inline u8 mbpf_class(const struct nfp_insn_meta *meta)
> >   {
> >   return BPF_CLASS(meta->insn.code);
> > @@ -375,7 +373,7 @@ static inline bool is_mbpf_alu(const struct 
> > nfp_insn_meta *meta)
> >
> >   static inline bool is_mbpf_load(const struct nfp_insn_meta *meta)
> >   {
> > - return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_LDX | BPF_MEM);
> > + return (meta->insn.code & ~BPF_LD_ST_SIZE_MASK) == (BPF_LDX | 
> > BPF_MEM);
> >   }
> >
> >   static inline bool is_mbpf_jmp32(const struct nfp_insn_meta *meta)
> > @@ -395,7 +393,7 @@ static inline bool is_mbpf_jmp(const struct 
> > nfp_insn_meta *meta)
> >
> >   static inline bool is_mbpf_store(const struct nfp_insn_meta *meta)
> >   {
> > - return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_MEM);
> > + return (meta->insn.code & ~BPF_LD_ST_SIZE_MASK) == (BPF_STX | 
> > BPF_MEM);
> >   }
> >
> >   static inline bool is_mbpf_load_pkt(const struct nfp_insn_meta *meta)
> > @@ -430,7 +428,7 @@ static inline bool is_mbpf_classic_store_pkt(const 
> > struct nfp_insn_meta *meta)
> >
> >   static inline bool is_mbpf_atomic(const struct nfp_insn_meta *meta)
> >   {
> > - return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_ATOMIC);
> > + return (meta->insn.code & ~BPF_LD_ST_SIZE_MASK) == (BPF_STX | 
> > BPF_ATOMIC);
> >   }
> >
> >   static inline bool is_mbpf_mul(const struct nfp_insn_meta *meta)
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index a25730eaa148..e85924719c65 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -995,6 +995,7 @@ struct bpf_array {
> >BPF_F_RDONLY_PROG |\
> >BPF_F_WRONLY | \
> >BPF_F_WRONLY_PROG)
> > +#define BPF_LD_ST_SIZE_MASK  0x18/* mask of size modifier */
> >
> >   #define BPF_MAP_CAN_READBIT(0)
> >   #define BPF_MAP_CAN_WRITE   BIT(1)
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index f9096b049cd6..29fdfdb8abfa 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -11384,15 +11384,11 @@ static int convert_ctx_accesses(struct 
> > bpf_verifier_env *env)
> >   for (i = 0; i < insn_cnt; i++, insn++) {
> >   bpf_convert_ctx_access_t convert_ctx_access;
> >
> > - if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) ||
> > - insn->code == (BPF_LDX | BPF_MEM | BPF_H) ||
> > - insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
> > - insn->code == (BPF_LDX | BPF_MEM | BPF_DW))
> > + /* opcode: BPF_MEM |  | BPF_LDX */
> > + if ((insn->code & ~BPF_LD_ST_SIZE_MASK) == (BPF_LDX | 
> > BPF_MEM))
> >   type = BPF_READ;
> > - else if (insn->code == (BPF_STX | BPF_MEM | BPF_B) ||
> > -  insn->code == (BPF_STX | BPF_MEM | BPF_H) ||
> > -  insn->code == (BPF_STX | BPF_MEM | BPF_W) ||
> > -  insn->code == (BPF_STX | BPF_MEM | BPF_DW))
> > + /* opcode: BPF_MEM |  | BPF_STX */
> > + else if ((insn->code & ~BPF_LD_ST_SIZE_MASK) == (BPF_STX | 
> > BPF_MEM))
> > 

[PATCH bpf-next v2] bpf: Simplify expression for identify bpf mem type

2021-03-18 Thread Jianlin Lv
Added BPF_LD_ST_SIZE_MASK macro as mask of size modifier that help to
reduce the evaluation of expressions in if statements,
and remove BPF_SIZE_MASK in netronome driver.

Signed-off-by: Jianlin Lv 
---
v2: Move the bpf_LD_ST_SIZE_MASK macro definition to include/linux/bpf.h
---
 drivers/net/ethernet/netronome/nfp/bpf/main.h |  8 +++-
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 12 
 3 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.h 
b/drivers/net/ethernet/netronome/nfp/bpf/main.h
index d0e17eebddd9..e90981e69763 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.h
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.h
@@ -346,8 +346,6 @@ struct nfp_insn_meta {
struct list_head l;
 };
 
-#define BPF_SIZE_MASK  0x18
-
 static inline u8 mbpf_class(const struct nfp_insn_meta *meta)
 {
return BPF_CLASS(meta->insn.code);
@@ -375,7 +373,7 @@ static inline bool is_mbpf_alu(const struct nfp_insn_meta 
*meta)
 
 static inline bool is_mbpf_load(const struct nfp_insn_meta *meta)
 {
-   return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_LDX | BPF_MEM);
+   return (meta->insn.code & ~BPF_LD_ST_SIZE_MASK) == (BPF_LDX | BPF_MEM);
 }
 
 static inline bool is_mbpf_jmp32(const struct nfp_insn_meta *meta)
@@ -395,7 +393,7 @@ static inline bool is_mbpf_jmp(const struct nfp_insn_meta 
*meta)
 
 static inline bool is_mbpf_store(const struct nfp_insn_meta *meta)
 {
-   return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_MEM);
+   return (meta->insn.code & ~BPF_LD_ST_SIZE_MASK) == (BPF_STX | BPF_MEM);
 }
 
 static inline bool is_mbpf_load_pkt(const struct nfp_insn_meta *meta)
@@ -430,7 +428,7 @@ static inline bool is_mbpf_classic_store_pkt(const struct 
nfp_insn_meta *meta)
 
 static inline bool is_mbpf_atomic(const struct nfp_insn_meta *meta)
 {
-   return (meta->insn.code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_ATOMIC);
+   return (meta->insn.code & ~BPF_LD_ST_SIZE_MASK) == (BPF_STX | 
BPF_ATOMIC);
 }
 
 static inline bool is_mbpf_mul(const struct nfp_insn_meta *meta)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a25730eaa148..e85924719c65 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -995,6 +995,7 @@ struct bpf_array {
 BPF_F_RDONLY_PROG |\
 BPF_F_WRONLY | \
 BPF_F_WRONLY_PROG)
+#define BPF_LD_ST_SIZE_MASK0x18/* mask of size modifier */
 
 #define BPF_MAP_CAN_READ   BIT(0)
 #define BPF_MAP_CAN_WRITE  BIT(1)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f9096b049cd6..29fdfdb8abfa 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11384,15 +11384,11 @@ static int convert_ctx_accesses(struct 
bpf_verifier_env *env)
for (i = 0; i < insn_cnt; i++, insn++) {
bpf_convert_ctx_access_t convert_ctx_access;
 
-   if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) ||
-   insn->code == (BPF_LDX | BPF_MEM | BPF_H) ||
-   insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
-   insn->code == (BPF_LDX | BPF_MEM | BPF_DW))
+   /* opcode: BPF_MEM |  | BPF_LDX */
+   if ((insn->code & ~BPF_LD_ST_SIZE_MASK) == (BPF_LDX | BPF_MEM))
type = BPF_READ;
-   else if (insn->code == (BPF_STX | BPF_MEM | BPF_B) ||
-insn->code == (BPF_STX | BPF_MEM | BPF_H) ||
-insn->code == (BPF_STX | BPF_MEM | BPF_W) ||
-insn->code == (BPF_STX | BPF_MEM | BPF_DW))
+   /* opcode: BPF_MEM |  | BPF_STX */
+   else if ((insn->code & ~BPF_LD_ST_SIZE_MASK) == (BPF_STX | 
BPF_MEM))
type = BPF_WRITE;
else
continue;
-- 
2.25.1



[PATCH bpf-next] bpf: Remove insn_buf[] declaration in inner block

2021-03-17 Thread Jianlin Lv
Two insn_buf[16] variables are declared in the function, which act on
function scope and block scope respectively.
The statement in the inner blocks is redundant, so remove it.

Signed-off-by: Jianlin Lv 
---
 kernel/bpf/verifier.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f9096b049cd6..e26c5170c953 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11899,7 +11899,6 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
insn->code == (BPF_ALU64 | BPF_SUB | BPF_X)) {
const u8 code_add = BPF_ALU64 | BPF_ADD | BPF_X;
const u8 code_sub = BPF_ALU64 | BPF_SUB | BPF_X;
-   struct bpf_insn insn_buf[16];
struct bpf_insn *patch = _buf[0];
bool issrc, isneg;
u32 off_reg;
-- 
2.25.1



[PATCH bpf-next] bpf: Simplify expression for identify bpf mem type

2021-03-17 Thread Jianlin Lv
Added BPF_SIZE_MASK macro as mask of size modifier that help to reduce
the evaluation of expressions in if statements,
and remove BPF_SIZE_MASK in netronome driver.

Signed-off-by: Jianlin Lv 
---
 drivers/net/ethernet/netronome/nfp/bpf/main.h |  2 --
 include/uapi/linux/bpf.h  |  1 +
 kernel/bpf/verifier.c | 12 
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.h 
b/drivers/net/ethernet/netronome/nfp/bpf/main.h
index d0e17eebddd9..8b1c2509ce46 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.h
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.h
@@ -346,8 +346,6 @@ struct nfp_insn_meta {
struct list_head l;
 };
 
-#define BPF_SIZE_MASK  0x18
-
 static inline u8 mbpf_class(const struct nfp_insn_meta *meta)
 {
return BPF_CLASS(meta->insn.code);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2d3036e292a9..5d77675e7112 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -21,6 +21,7 @@
 #define BPF_DW 0x18/* double word (64-bit) */
 #define BPF_ATOMIC 0xc0/* atomic memory ops - op type in immediate */
 #define BPF_XADD   0xc0/* exclusive add - legacy name */
+#define BPF_SIZE_MASK  0x18/* mask of size modifier */
 
 /* alu/jmp fields */
 #define BPF_MOV0xb0/* mov reg to reg */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f9096b049cd6..9755bb4d7de4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11384,15 +11384,11 @@ static int convert_ctx_accesses(struct 
bpf_verifier_env *env)
for (i = 0; i < insn_cnt; i++, insn++) {
bpf_convert_ctx_access_t convert_ctx_access;
 
-   if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) ||
-   insn->code == (BPF_LDX | BPF_MEM | BPF_H) ||
-   insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
-   insn->code == (BPF_LDX | BPF_MEM | BPF_DW))
+   /* opcode: BPF_MEM |  | BPF_LDX */
+   if ((insn->code & ~BPF_SIZE_MASK) == (BPF_LDX | BPF_MEM))
type = BPF_READ;
-   else if (insn->code == (BPF_STX | BPF_MEM | BPF_B) ||
-insn->code == (BPF_STX | BPF_MEM | BPF_H) ||
-insn->code == (BPF_STX | BPF_MEM | BPF_W) ||
-insn->code == (BPF_STX | BPF_MEM | BPF_DW))
+   /* opcode: BPF_MEM |  | BPF_STX */
+   else if ((insn->code & ~BPF_SIZE_MASK) == (BPF_STX | BPF_MEM))
type = BPF_WRITE;
else
continue;
-- 
2.25.1



[PATCH net-next v2] bonding: Added -ENODEV interpret for slaves option

2021-03-14 Thread Jianlin Lv
When the incorrect interface name is stored in the slaves/active_slave
option of the bonding sysfs, the kernel does not record the log that
interface does not exist.

This patch adds a log for -ENODEV error, which will facilitate users to
figure out such issue.

Signed-off-by: Jianlin Lv 
---
v2: Update changlog.
---
 drivers/net/bonding/bond_options.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/bonding/bond_options.c 
b/drivers/net/bonding/bond_options.c
index 77d7c38bd435..c9d3604ae129 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -640,6 +640,15 @@ static void bond_opt_error_interpret(struct bonding *bond,
netdev_err(bond->dev, "option %s: unable to set because the 
bond device is up\n",
   opt->name);
break;
+   case -ENODEV:
+   if (val && val->string) {
+   p = strchr(val->string, '\n');
+   if (p)
+   *p = '\0';
+   netdev_err(bond->dev, "option %s: interface %s does not 
exist!\n",
+  opt->name, val->string);
+   }
+   break;
default:
break;
}
-- 
2.25.1



[PATCH net-next] bonding: Added -ENODEV interpret for slaves option

2021-03-13 Thread Jianlin Lv
After upgrading the kernel, the slave interface name is changed,
Systemd cannot use the original configuration to create bond interface,
thereby losing the connection with the host.

Adding log for ENODEV will make it easier to find out such problem lies.

Signed-off-by: Jianlin Lv 
---
 drivers/net/bonding/bond_options.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/bonding/bond_options.c 
b/drivers/net/bonding/bond_options.c
index 77d7c38bd435..c9d3604ae129 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -640,6 +640,15 @@ static void bond_opt_error_interpret(struct bonding *bond,
netdev_err(bond->dev, "option %s: unable to set because the 
bond device is up\n",
   opt->name);
break;
+   case -ENODEV:
+   if (val && val->string) {
+   p = strchr(val->string, '\n');
+   if (p)
+   *p = '\0';
+   netdev_err(bond->dev, "option %s: interface %s does not 
exist!\n",
+  opt->name, val->string);
+   }
+   break;
default:
break;
}
-- 
2.25.1



[PATCH v5] perf tools: Fix arm64 build error with gcc-11

2021-02-17 Thread Jianlin Lv
gcc version: 11.0.0 20210208 (experimental) (GCC)

Following build error on arm64:

...
In function ‘printf’,
inlined from ‘regs_dump__printf’ at util/session.c:1141:3,
inlined from ‘regs__printf’ at util/session.c:1169:2:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:107:10: \
  error: ‘%-5s’ directive argument is null [-Werror=format-overflow=]

107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, \
__va_arg_pack ());

..
In function ‘fprintf’,
  inlined from ‘perf_sample__fprintf_regs.isra’ at \
builtin-script.c:622:14:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:100:10: \
error: ‘%5s’ directive argument is null [-Werror=format-overflow=]
  100 |   return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
  101 | __va_arg_pack ());

cc1: all warnings being treated as errors
...

This patch fixes Wformat-overflow warnings. Add helper function to
convert NULL to "unknown".

Signed-off-by: Jianlin Lv 
---
v2: Add ternary operator to avoid similar errors in other arch.
v3: Declared reg_name in inner block.
v4: Add helper function: perf_reg_name_str, update changelog.
v5: Correct function names to follow common naming conventions.
---
 tools/perf/arch/arm/include/perf_regs.h | 2 +-
 tools/perf/arch/arm64/include/perf_regs.h   | 2 +-
 tools/perf/arch/csky/include/perf_regs.h| 2 +-
 tools/perf/arch/powerpc/include/perf_regs.h | 2 +-
 tools/perf/arch/riscv/include/perf_regs.h   | 2 +-
 tools/perf/arch/s390/include/perf_regs.h| 2 +-
 tools/perf/arch/x86/include/perf_regs.h | 2 +-
 tools/perf/util/perf_regs.h | 7 +++
 8 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/tools/perf/arch/arm/include/perf_regs.h 
b/tools/perf/arch/arm/include/perf_regs.h
index ed20e0253e25..4085419283d0 100644
--- a/tools/perf/arch/arm/include/perf_regs.h
+++ b/tools/perf/arch/arm/include/perf_regs.h
@@ -15,7 +15,7 @@ void perf_regs_load(u64 *regs);
 #define PERF_REG_IPPERF_REG_ARM_PC
 #define PERF_REG_SPPERF_REG_ARM_SP
 
-static inline const char *perf_reg_name(int id)
+static inline const char *__perf_reg_name(int id)
 {
switch (id) {
case PERF_REG_ARM_R0:
diff --git a/tools/perf/arch/arm64/include/perf_regs.h 
b/tools/perf/arch/arm64/include/perf_regs.h
index baaa5e64a3fb..fa3e07459f76 100644
--- a/tools/perf/arch/arm64/include/perf_regs.h
+++ b/tools/perf/arch/arm64/include/perf_regs.h
@@ -15,7 +15,7 @@ void perf_regs_load(u64 *regs);
 #define PERF_REG_IPPERF_REG_ARM64_PC
 #define PERF_REG_SPPERF_REG_ARM64_SP
 
-static inline const char *perf_reg_name(int id)
+static inline const char *__perf_reg_name(int id)
 {
switch (id) {
case PERF_REG_ARM64_X0:
diff --git a/tools/perf/arch/csky/include/perf_regs.h 
b/tools/perf/arch/csky/include/perf_regs.h
index 8f336ea1161a..25ac3bdcb9d1 100644
--- a/tools/perf/arch/csky/include/perf_regs.h
+++ b/tools/perf/arch/csky/include/perf_regs.h
@@ -15,7 +15,7 @@
 #define PERF_REG_IPPERF_REG_CSKY_PC
 #define PERF_REG_SPPERF_REG_CSKY_SP
 
-static inline const char *perf_reg_name(int id)
+static inline const char *__perf_reg_name(int id)
 {
switch (id) {
case PERF_REG_CSKY_A0:
diff --git a/tools/perf/arch/powerpc/include/perf_regs.h 
b/tools/perf/arch/powerpc/include/perf_regs.h
index 63f3ac91049f..004bed228693 100644
--- a/tools/perf/arch/powerpc/include/perf_regs.h
+++ b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -73,7 +73,7 @@ static const char *reg_names[] = {
[PERF_REG_POWERPC_SIER3] = "sier3",
 };
 
-static inline const char *perf_reg_name(int id)
+static inline const char *__perf_reg_name(int id)
 {
return reg_names[id];
 }
diff --git a/tools/perf/arch/riscv/include/perf_regs.h 
b/tools/perf/arch/riscv/include/perf_regs.h
index 7a8bcde7a2b1..6b02a767c918 100644
--- a/tools/perf/arch/riscv/include/perf_regs.h
+++ b/tools/perf/arch/riscv/include/perf_regs.h
@@ -19,7 +19,7 @@
 #define PERF_REG_IPPERF_REG_RISCV_PC
 #define PERF_REG_SPPERF_REG_RISCV_SP
 
-static inline const char *perf_reg_name(int id)
+static inline const char *__perf_reg_name(int id)
 {
switch (id) {
case PERF_REG_RISCV_PC:
diff --git a/tools/perf/arch/s390/include/perf_regs.h 
b/tools/perf/arch/s390/include/perf_regs.h
index bcfbaed78cc2..ce3031526623 100644
--- a/tools/perf/arch/s390/include/perf_regs.h
+++ b/tools/perf/arch/s390/include/perf_regs.h
@@ -14,7 +14,7 @@ void perf_regs_load(u64 *regs);
 #define PERF_REG_IP PERF_REG_S390_PC
 #define PERF_REG_SP PERF_REG_S390_R15
 
-static inline const char *perf_reg_name(int id)
+static inline const char *__perf_reg_name(int id)
 {
switch (id) {
case PERF_REG_S390_R0:
diff --git a/tools/perf/arch/x86/include/perf_regs.h 
b/tools/perf/arch/x86/include/perf_regs.h
index b7321337d100..cddc4cdc0d9b 100644
--- a/tools/perf/arch/x86/include/perf_regs.h
+++ b/tools/perf/arch/x86/include/perf_regs.h
@@

[PATCH v4] perf tools: Fix arm64 build error with gcc-11

2021-02-17 Thread Jianlin Lv
gcc version: 11.0.0 20210208 (experimental) (GCC)

Following build error on arm64:

...
In function ‘printf’,
inlined from ‘regs_dump__printf’ at util/session.c:1141:3,
inlined from ‘regs__printf’ at util/session.c:1169:2:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:107:10: \
  error: ‘%-5s’ directive argument is null [-Werror=format-overflow=]

107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, \
__va_arg_pack ());

..
In function ‘fprintf’,
  inlined from ‘perf_sample__fprintf_regs.isra’ at \
builtin-script.c:622:14:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:100:10: \
error: ‘%5s’ directive argument is null [-Werror=format-overflow=]
  100 |   return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
  101 | __va_arg_pack ());

cc1: all warnings being treated as errors
...

This patch fixes Wformat-overflow warnings. Add helper function to
convert NULL to "unknown".

Signed-off-by: Jianlin Lv 
---
v2: Add ternary operator to avoid similar errors in other arch.
v3: Declared reg_name in inner block.
v4: Add helper function: perf_reg_name_str, update changelog.
---
 tools/perf/builtin-script.c   |  2 +-
 tools/perf/util/perf_regs.h   | 11 ++-
 .../perf/util/scripting-engines/trace-event-python.c  |  2 +-
 tools/perf/util/session.c |  2 +-
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 42dad4a0f8cf..35cddca2c7a7 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -643,7 +643,7 @@ static int perf_sample__fprintf_regs(struct regs_dump 
*regs, uint64_t mask,
 
for_each_set_bit(r, (unsigned long *) , sizeof(mask) * 8) {
u64 val = regs->regs[i++];
-   printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), 
val);
+   printed += fprintf(fp, "%5s:0x%"PRIx64" ", 
perf_reg_name_str(r), val);
}
 
return printed;
diff --git a/tools/perf/util/perf_regs.h b/tools/perf/util/perf_regs.h
index a45499126184..e4a0a6f5408e 100644
--- a/tools/perf/util/perf_regs.h
+++ b/tools/perf/util/perf_regs.h
@@ -33,13 +33,22 @@ extern const struct sample_reg sample_reg_masks[];
 
 int perf_reg_value(u64 *valp, struct regs_dump *regs, int id);
 
+static inline const char *perf_reg_name_str(int id)
+{
+   const char *str = perf_reg_name(id);
+
+   if (!str)
+   return "unknown";
+   return str;
+}
+
 #else
 #define PERF_REGS_MASK 0
 #define PERF_REGS_MAX  0
 
 #define DWARF_MINIMAL_REGS PERF_REGS_MASK
 
-static inline const char *perf_reg_name(int id __maybe_unused)
+static inline const char *perf_reg_name_str(int id __maybe_unused)
 {
return "unknown";
 }
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index c83c2c6564e0..361307026485 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -702,7 +702,7 @@ static int regs_map(struct regs_dump *regs, uint64_t mask, 
char *bf, int size)
 
printed += scnprintf(bf + printed, size - printed,
 "%5s:0x%" PRIx64 " ",
-perf_reg_name(r), val);
+perf_reg_name_str(r), val);
}
 
return printed;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 25adbcce0281..0737d3e7e698 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1140,7 +1140,7 @@ static void regs_dump__printf(u64 mask, u64 *regs)
u64 val = regs[i++];
 
printf(" %-5s 0x%016" PRIx64 "\n",
-  perf_reg_name(rid), val);
+  perf_reg_name_str(rid), val);
}
 }
 
-- 
2.25.1



[PATCH v3] perf probe: fix kretprobe issue caused by GCC bug

2021-02-16 Thread Jianlin Lv
Perf failed to add kretprobe event with debuginfo of vmlinux which is
compiled by gcc with -fpatchable-function-entry option enabled.
The same issue with kernel module.

Issue:

  # perf probe  -v 'kernel_clone%return $retval'
  ..
  Writing event: r:probe/kernel_clone__return _text+599624 $retval
  Failed to write event: Invalid argument
Error: Failed to add events. Reason: Invalid argument (Code: -22)

  # cat /sys/kernel/debug/tracing/error_log
  [156.75] trace_kprobe: error: Retprobe address must be an function entry
  Command: r:probe/kernel_clone__return _text+599624 $retval
^

  # llvm-dwarfdump  vmlinux |grep  -A 10  -w 0x00df2c2b
  0x00df2c2b:   DW_TAG_subprogram
DW_AT_external  (true)
DW_AT_name  ("kernel_clone")
DW_AT_decl_file ("/home/code/linux-next/kernel/fork.c")
DW_AT_decl_line (2423)
DW_AT_decl_column   (0x07)
DW_AT_prototyped(true)
DW_AT_type  (0x00dcd492 "pid_t")
DW_AT_low_pc(0x800010092648)
DW_AT_high_pc   (0x800010092b9c)
DW_AT_frame_base(DW_OP_call_frame_cfa)

  # cat /proc/kallsyms |grep kernel_clone
  800010092640 T kernel_clone
  # readelf -s vmlinux |grep -i kernel_clone
  183173: 800010092640  1372 FUNCGLOBAL DEFAULT2 kernel_clone

  # objdump -d vmlinux |grep -A 10  -w \:
  800010092640 :
  800010092640:   d503201fnop
  800010092644:   d503201fnop
  800010092648:   d503233fpaciasp
  80001009264c:   a9b87bfdstp x29, x30, [sp, #-128]!
  800010092650:   910003fdmov x29, sp
  800010092654:   a90153f3stp x19, x20, [sp, #16]

The entry address of kernel_clone converted by debuginfo is _text+599624
(0x92648), which is consistent with the value of DW_AT_low_pc attribute.
But the symbolic address of kernel_clone from /proc/kallsyms is
800010092640.

This issue is found on arm64, -fpatchable-function-entry=2 is enabled when
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y;
Just as objdump displayed the assembler contents of kernel_clone,
GCC generate 2 NOPs  at the beginning of each function.

kprobe_on_func_entry detects that (_text+599624) is not the entry address
of the function, which leads to the failure of adding kretprobe event.

kprobe_on_func_entry
->_kprobe_addr
->kallsyms_lookup_size_offset
->arch_kprobe_on_func_entry // FALSE

The cause of the issue is that the first instruction in the compile unit
indicated by DW_AT_low_pc does not include NOPs.
This issue exists in all gcc versions that support
-fpatchable-function-entry option.

I have reported it to the GCC community:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776

Currently arm64 and PA-RISC may enable fpatchable-function-entry option.
The kernel compiled with clang does not have this issue.

FIX:

This GCC issue only cause the registration failure of the kretprobe event
which doesn't need debuginfo. So, stop using debuginfo for retprobe.
map will be used to query the probe function address.

Signed-off-by: Jianlin Lv 
---
v2: stop using debuginfo for retprobe, and update changelog.
v3: Update changelog, fixed misuse of --- marker.
---
 tools/perf/util/probe-event.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8eae2afff71a..a59d3268adb0 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -894,6 +894,16 @@ static int try_to_find_probe_trace_events(struct 
perf_probe_event *pev,
struct debuginfo *dinfo;
int ntevs, ret = 0;
 
+   /* Workaround for gcc #98776 issue.
+* Perf failed to add kretprobe event with debuginfo of vmlinux which is
+* compiled by gcc with -fpatchable-function-entry option enabled. The
+* same issue with kernel module. The retprobe doesn`t need debuginfo.
+* This workaround solution use map to query the probe function address
+* for retprobe event.
+*/
+   if (pev->point.retprobe)
+   return 0;
+
dinfo = open_debuginfo(pev->target, pev->nsi, !need_dwarf);
if (!dinfo) {
if (need_dwarf)
-- 
2.25.1



[PATCH v3] perf tools: Fix arm64 build error with gcc-11

2021-02-12 Thread Jianlin Lv
gcc version: 11.0.0 20210208 (experimental) (GCC)

Following build error on arm64:

...
In function ‘printf’,
inlined from ‘regs_dump__printf’ at util/session.c:1141:3,
inlined from ‘regs__printf’ at util/session.c:1169:2:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:107:10: \
  error: ‘%-5s’ directive argument is null [-Werror=format-overflow=]

107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, \
__va_arg_pack ());

..
In function ‘fprintf’,
  inlined from ‘perf_sample__fprintf_regs.isra’ at \
builtin-script.c:622:14:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:100:10: \
error: ‘%5s’ directive argument is null [-Werror=format-overflow=]
  100 |   return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
  101 | __va_arg_pack ());

cc1: all warnings being treated as errors
...

This patch fixes Wformat-overflow warnings. Add ternary operator,
The statement evaluates to "Unknown" if reg_name==NULL is met.

Signed-off-by: Jianlin Lv 
---
v2: Add ternary operator to avoid similar errors in other arch.
v3: Declared reg_name in inner block.
---
 tools/perf/builtin-script.c| 4 +++-
 tools/perf/util/scripting-engines/trace-event-python.c | 3 ++-
 tools/perf/util/session.c  | 3 ++-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 42dad4a0f8cf..0d52dc45b1c7 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -643,7 +643,9 @@ static int perf_sample__fprintf_regs(struct regs_dump 
*regs, uint64_t mask,
 
for_each_set_bit(r, (unsigned long *) , sizeof(mask) * 8) {
u64 val = regs->regs[i++];
-   printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), 
val);
+   const char *reg_name = perf_reg_name(r);
+
+   printed += fprintf(fp, "%5s:0x%"PRIx64" ", reg_name ?: 
"unknown", val);
}
 
return printed;
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index c83c2c6564e0..768bdd4240f4 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -699,10 +699,11 @@ static int regs_map(struct regs_dump *regs, uint64_t 
mask, char *bf, int size)
 
for_each_set_bit(r, (unsigned long *) , sizeof(mask) * 8) {
u64 val = regs->regs[i++];
+   const char *reg_name = perf_reg_name(r);
 
printed += scnprintf(bf + printed, size - printed,
 "%5s:0x%" PRIx64 " ",
-perf_reg_name(r), val);
+reg_name ?: "unknown", val);
}
 
return printed;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 25adbcce0281..2b40f1c431a3 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1138,9 +1138,10 @@ static void regs_dump__printf(u64 mask, u64 *regs)
 
for_each_set_bit(rid, (unsigned long *) , sizeof(mask) * 8) {
u64 val = regs[i++];
+   const char *reg_name = perf_reg_name(rid);
 
printf(" %-5s 0x%016" PRIx64 "\n",
-  perf_reg_name(rid), val);
+  reg_name ?: "unknown", val);
}
 }
 
-- 
2.25.1



RE: [PATCH v2] perf probe: fix kretprobe issue caused by GCC bug

2021-02-12 Thread Jianlin Lv



> -Original Message-
> From: Arnaldo Carvalho de Melo 
> Sent: Saturday, February 13, 2021 5:34 AM
> To: Jianlin Lv 
> Cc: pet...@infradead.org; mi...@redhat.com; Mark Rutland
> ; alexander.shish...@linux.intel.com;
> jo...@redhat.com; namhy...@kernel.org; nat...@kernel.org;
> ndesaulni...@google.com; mhira...@kernel.org; f...@redhat.com;
> irog...@google.com; suman...@linux.ibm.com; linux-
> ker...@vger.kernel.org; clang-built-li...@googlegroups.com
> Subject: Re: [PATCH v2] perf probe: fix kretprobe issue caused by GCC bug
>
> Em Wed, Feb 10, 2021 at 02:26:46PM +0800, Jianlin Lv escreveu:
> > Perf failed to add kretprobe event with debuginfo of vmlinux which is
> > compiled by gcc with -fpatchable-function-entry option enabled.
> > The same issue with kernel module.
> >
> > Issue:
> >
> >   # perf probe  -v 'kernel_clone%return $retval'
> >   ..
> >   Writing event: r:probe/kernel_clone__return _text+599624 $retval
> >   Failed to write event: Invalid argument
> > Error: Failed to add events. Reason: Invalid argument (Code: -22)
> >
> >   # cat /sys/kernel/debug/tracing/error_log
> >   [156.75] trace_kprobe: error: Retprobe address must be an function entry
> >   Command: r:probe/kernel_clone__return _text+599624 $retval
> > ^
> >
> >   # llvm-dwarfdump  vmlinux |grep  -A 10  -w 0x00df2c2b
> >   0x00df2c2b:   DW_TAG_subprogram
> > DW_AT_external  (true)
> > DW_AT_name  ("kernel_clone")
> > DW_AT_decl_file ("/home/code/linux-next/kernel/fork.c")
> > DW_AT_decl_line (2423)
> > DW_AT_decl_column   (0x07)
> > DW_AT_prototyped(true)
> > DW_AT_type  (0x00dcd492 "pid_t")
> > DW_AT_low_pc(0x800010092648)
> > DW_AT_high_pc   (0x800010092b9c)
> > DW_AT_frame_base(DW_OP_call_frame_cfa)
> >
> >   # cat /proc/kallsyms |grep kernel_clone
> >   800010092640 T kernel_clone
> >   # readelf -s vmlinux |grep -i kernel_clone
> >   183173: 800010092640  1372 FUNCGLOBAL DEFAULT2 kernel_clone
> >
> >   # objdump -d vmlinux |grep -A 10  -w \:
> >   800010092640 :
> >   800010092640:   d503201fnop
> >   800010092644:   d503201fnop
> >   800010092648:   d503233fpaciasp
> >   80001009264c:   a9b87bfdstp x29, x30, [sp, #-128]!
> >   800010092650:   910003fdmov x29, sp
> >   800010092654:   a90153f3stp x19, x20, [sp, #16]
> >
> > The entry address of kernel_clone converted by debuginfo is
> > _text+599624 (0x92648), which is consistent with the value of
> DW_AT_low_pc attribute.
> > But the symbolic address of kernel_clone from /proc/kallsyms is
> > 800010092640.
> >
> > This issue is found on arm64, -fpatchable-function-entry=2 is enabled
> > when CONFIG_DYNAMIC_FTRACE_WITH_REGS=y;
> > Just as objdump displayed the assembler contents of kernel_clone, GCC
> > generate 2 NOPs  at the beginning of each function.
> >
> > kprobe_on_func_entry detects that (_text+599624) is not the entry
> > address of the function, which leads to the failure of adding kretprobe
> event.
> >
> > ---
> > kprobe_on_func_entry
> > ->_kprobe_addr
> > ->kallsyms_lookup_size_offset
> > ->arch_kprobe_on_func_entry// FALSE
> > ---
>
> Please don't use --- at the start of a line, it is used to separate from the 
> patch
> itself, later down your message.
>
> It causes this:
>
> [acme@five perf]$ am /wb/1.patch
> Traceback (most recent call last):
>   File "/home/acme/bin/ksoff.py", line 180, in 
> sign_msg(sys.stdin, sys.stdout)
>   File "/home/acme/bin/ksoff.py", line 142, in sign_msg
> sob.remove(last_sob[0])
> TypeError: 'NoneType' object is not subscriptable [acme@five perf]$
>
> I'm fixing this by removing that --- markers
>

Sorry for the inconvenience?
Should I commit another version to fix this issue?

Jianlin

> > The cause of the issue is that the first instruction in the compile
> > unit indicated by DW_AT_low_pc does not include NOPs.
> > This issue exists in all gcc versions that support
> > -fpatchable-function-entry option.
> >
> > I have reported it to the GCC community:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776
> >
> > Currently arm64 and PA-RISC may enable fpatch

RE: [PATCH v2] perf tools: Fix arm64 build error with gcc-11

2021-02-10 Thread Jianlin Lv


> -Original Message-
> From: John Garry 
> Sent: Wednesday, February 10, 2021 5:38 PM
> To: Jianlin Lv ; w...@kernel.org;
> mathieu.poir...@linaro.org; leo@linaro.org; pet...@infradead.org;
> mi...@redhat.com; a...@kernel.org; Mark Rutland
> ; alexander.shish...@linux.intel.com;
> jo...@redhat.com; namhy...@kernel.org; irog...@google.com;
> agerstm...@redhat.com; kan.li...@linux.intel.com;
> adrian.hun...@intel.com
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] perf tools: Fix arm64 build error with gcc-11
>
> On 10/02/2021 03:24, Jianlin Lv wrote:
> > gcc version: 11.0.0 20210208 (experimental) (GCC)
> >
> > Following build error on arm64:
> >
> > ...
> > In function ‘printf’,
> >  inlined from ‘regs_dump__printf’ at util/session.c:1141:3,
> >  inlined from ‘regs__printf’ at util/session.c:1169:2:
> > /usr/include/aarch64-linux-gnu/bits/stdio2.h:107:10: \
> >error: ‘%-5s’ directive argument is null [-Werror=format-overflow=]
> >
> > 107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, \
> >  __va_arg_pack ());
> >
> > ..
> > In function ‘fprintf’,
> >inlined from ‘perf_sample__fprintf_regs.isra’ at \
> >  builtin-script.c:622:14:
> > /usr/include/aarch64-linux-gnu/bits/stdio2.h:100:10: \
> >  error: ‘%5s’ directive argument is null [-Werror=format-overflow=]
> >100 |   return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
> >101 | __va_arg_pack ());
> >
> > cc1: all warnings being treated as errors ...
> >
> > This patch fixes Wformat-overflow warnings. Add ternary operator, The
> > statement evaluates to "Unknown" if reg_name==NULL is met.
> >
> > Signed-off-by: Jianlin Lv 
> > ---
> > v2: Add ternary operator to avoid similar errors in other arch.
> > ---
> >   tools/perf/builtin-script.c| 4 +++-
> >   tools/perf/util/scripting-engines/trace-event-python.c | 4 +++-
> >   tools/perf/util/session.c  | 5 +++--
> >   3 files changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> > index 42dad4a0f8cf..d59da3a063d0 100644
> > --- a/tools/perf/builtin-script.c
> > +++ b/tools/perf/builtin-script.c
> > @@ -635,6 +635,7 @@ static int perf_sample__fprintf_regs(struct
> regs_dump *regs, uint64_t mask,
> >   {
> >   unsigned i = 0, r;
> >   int printed = 0;
> > +const char *reg_name;
> >
> >   if (!regs || !regs->regs)
> >   return 0;
> > @@ -643,7 +644,8 @@ static int perf_sample__fprintf_regs(struct
> > regs_dump *regs, uint64_t mask,
> >
> >   for_each_set_bit(r, (unsigned long *) , sizeof(mask) * 8) {
> >   u64 val = regs->regs[i++];
> > -printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r),
> val);
> > +reg_name = perf_reg_name(r);
> > +printed += fprintf(fp, "%5s:0x%"PRIx64" ", reg_name ?:
> "Unknown",
> > +val);
>
> is it possible not have to do this check for NULL and reassignment
> everywhere?
>

In fact, Wformat-overflow warning only occurs during the compilation of
the two files util/session.c and builtin-script.c.

util/scripting-engines/trace-event-python.c can be compiled. In order to
prevent unexpected exceptions, I changed it in the same way.

To be honest, I did not fully understand this comment. Do you mean to
adopt other ways to solve this issue? Could you give me more tips?

In addition, other comments are of great help to me, thank you.

Jianlin

> >   }
> >
> >   return printed;
> > diff --git a/tools/perf/util/scripting-engines/trace-event-python.c
> > b/tools/perf/util/scripting-engines/trace-event-python.c
> > index c83c2c6564e0..e1222cc6a699 100644
> > --- a/tools/perf/util/scripting-engines/trace-event-python.c
> > +++ b/tools/perf/util/scripting-engines/trace-event-python.c
> > @@ -691,6 +691,7 @@ static int regs_map(struct regs_dump *regs,
> uint64_t mask, char *bf, int size)
> >   {
> >   unsigned int i = 0, r;
> >   int printed = 0;
> > +const char *reg_name;
> >
> >   bf[0] = 0;
> >
> > @@ -700,9 +701,10 @@ static int regs_map(struct regs_dump *regs,
> uint64_t mask, char *bf, int size)
> >   for_each_set_bit(r, (unsigned long *) , sizeof(mask) * 8) {
>
> a good practice is to limit scope of variables as much as possible, so
> reg_name could be declared here
>
> >   u64 val = regs->regs[i++];
> >
> > +reg_name = pe

[PATCH v2] perf probe: fix kretprobe issue caused by GCC bug

2021-02-09 Thread Jianlin Lv
Perf failed to add kretprobe event with debuginfo of vmlinux which is
compiled by gcc with -fpatchable-function-entry option enabled.
The same issue with kernel module.

Issue:

  # perf probe  -v 'kernel_clone%return $retval'
  ..
  Writing event: r:probe/kernel_clone__return _text+599624 $retval
  Failed to write event: Invalid argument
Error: Failed to add events. Reason: Invalid argument (Code: -22)

  # cat /sys/kernel/debug/tracing/error_log
  [156.75] trace_kprobe: error: Retprobe address must be an function entry
  Command: r:probe/kernel_clone__return _text+599624 $retval
^

  # llvm-dwarfdump  vmlinux |grep  -A 10  -w 0x00df2c2b
  0x00df2c2b:   DW_TAG_subprogram
DW_AT_external  (true)
DW_AT_name  ("kernel_clone")
DW_AT_decl_file ("/home/code/linux-next/kernel/fork.c")
DW_AT_decl_line (2423)
DW_AT_decl_column   (0x07)
DW_AT_prototyped(true)
DW_AT_type  (0x00dcd492 "pid_t")
DW_AT_low_pc(0x800010092648)
DW_AT_high_pc   (0x800010092b9c)
DW_AT_frame_base(DW_OP_call_frame_cfa)

  # cat /proc/kallsyms |grep kernel_clone
  800010092640 T kernel_clone
  # readelf -s vmlinux |grep -i kernel_clone
  183173: 800010092640  1372 FUNCGLOBAL DEFAULT2 kernel_clone

  # objdump -d vmlinux |grep -A 10  -w \:
  800010092640 :
  800010092640:   d503201fnop
  800010092644:   d503201fnop
  800010092648:   d503233fpaciasp
  80001009264c:   a9b87bfdstp x29, x30, [sp, #-128]!
  800010092650:   910003fdmov x29, sp
  800010092654:   a90153f3stp x19, x20, [sp, #16]

The entry address of kernel_clone converted by debuginfo is _text+599624
(0x92648), which is consistent with the value of DW_AT_low_pc attribute.
But the symbolic address of kernel_clone from /proc/kallsyms is
800010092640.

This issue is found on arm64, -fpatchable-function-entry=2 is enabled when
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y;
Just as objdump displayed the assembler contents of kernel_clone,
GCC generate 2 NOPs  at the beginning of each function.

kprobe_on_func_entry detects that (_text+599624) is not the entry address
of the function, which leads to the failure of adding kretprobe event.

---
kprobe_on_func_entry
->_kprobe_addr
->kallsyms_lookup_size_offset
->arch_kprobe_on_func_entry // FALSE
---

The cause of the issue is that the first instruction in the compile unit
indicated by DW_AT_low_pc does not include NOPs.
This issue exists in all gcc versions that support
-fpatchable-function-entry option.

I have reported it to the GCC community:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776

Currently arm64 and PA-RISC may enable fpatchable-function-entry option.
The kernel compiled with clang does not have this issue.

FIX:

This GCC issue only cause the registration failure of the kretprobe event
which doesn't need debuginfo. So, stop using debuginfo for retprobe.
map will be used to query the probe function address.

Signed-off-by: Jianlin Lv 
---
v2: stop using debuginfo for retprobe, and update changelog.
---
 tools/perf/util/probe-event.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8eae2afff71a..a59d3268adb0 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -894,6 +894,16 @@ static int try_to_find_probe_trace_events(struct 
perf_probe_event *pev,
struct debuginfo *dinfo;
int ntevs, ret = 0;
 
+   /* Workaround for gcc #98776 issue.
+* Perf failed to add kretprobe event with debuginfo of vmlinux which is
+* compiled by gcc with -fpatchable-function-entry option enabled. The
+* same issue with kernel module. The retprobe doesn`t need debuginfo.
+* This workaround solution use map to query the probe function address
+* for retprobe event.
+*/
+   if (pev->point.retprobe)
+   return 0;
+
dinfo = open_debuginfo(pev->target, pev->nsi, !need_dwarf);
if (!dinfo) {
if (need_dwarf)
-- 
2.25.1



[PATCH v2] perf tools: Fix arm64 build error with gcc-11

2021-02-09 Thread Jianlin Lv
gcc version: 11.0.0 20210208 (experimental) (GCC)

Following build error on arm64:

...
In function ‘printf’,
inlined from ‘regs_dump__printf’ at util/session.c:1141:3,
inlined from ‘regs__printf’ at util/session.c:1169:2:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:107:10: \
  error: ‘%-5s’ directive argument is null [-Werror=format-overflow=]

107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, \
__va_arg_pack ());

..
In function ‘fprintf’,
  inlined from ‘perf_sample__fprintf_regs.isra’ at \
builtin-script.c:622:14:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:100:10: \
error: ‘%5s’ directive argument is null [-Werror=format-overflow=]
  100 |   return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
  101 | __va_arg_pack ());

cc1: all warnings being treated as errors
...

This patch fixes Wformat-overflow warnings. Add ternary operator,
The statement evaluates to "Unknown" if reg_name==NULL is met.

Signed-off-by: Jianlin Lv 
---
v2: Add ternary operator to avoid similar errors in other arch.
---
 tools/perf/builtin-script.c| 4 +++-
 tools/perf/util/scripting-engines/trace-event-python.c | 4 +++-
 tools/perf/util/session.c  | 5 +++--
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 42dad4a0f8cf..d59da3a063d0 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -635,6 +635,7 @@ static int perf_sample__fprintf_regs(struct regs_dump 
*regs, uint64_t mask,
 {
unsigned i = 0, r;
int printed = 0;
+   const char *reg_name;
 
if (!regs || !regs->regs)
return 0;
@@ -643,7 +644,8 @@ static int perf_sample__fprintf_regs(struct regs_dump 
*regs, uint64_t mask,
 
for_each_set_bit(r, (unsigned long *) , sizeof(mask) * 8) {
u64 val = regs->regs[i++];
-   printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), 
val);
+   reg_name = perf_reg_name(r);
+   printed += fprintf(fp, "%5s:0x%"PRIx64" ", reg_name ?: 
"Unknown", val);
}
 
return printed;
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c 
b/tools/perf/util/scripting-engines/trace-event-python.c
index c83c2c6564e0..e1222cc6a699 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -691,6 +691,7 @@ static int regs_map(struct regs_dump *regs, uint64_t mask, 
char *bf, int size)
 {
unsigned int i = 0, r;
int printed = 0;
+   const char *reg_name;
 
bf[0] = 0;
 
@@ -700,9 +701,10 @@ static int regs_map(struct regs_dump *regs, uint64_t mask, 
char *bf, int size)
for_each_set_bit(r, (unsigned long *) , sizeof(mask) * 8) {
u64 val = regs->regs[i++];
 
+   reg_name = perf_reg_name(r);
printed += scnprintf(bf + printed, size - printed,
 "%5s:0x%" PRIx64 " ",
-perf_reg_name(r), val);
+reg_name ?: "Unknown", val);
}
 
return printed;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 25adbcce0281..1058d8487e98 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1135,12 +1135,13 @@ static void branch_stack__printf(struct perf_sample 
*sample, bool callstack)
 static void regs_dump__printf(u64 mask, u64 *regs)
 {
unsigned rid, i = 0;
+   const char *reg_name;
 
for_each_set_bit(rid, (unsigned long *) , sizeof(mask) * 8) {
u64 val = regs[i++];
-
+   reg_name = perf_reg_name(rid);
printf(" %-5s 0x%016" PRIx64 "\n",
-  perf_reg_name(rid), val);
+  reg_name ?: "Unknown", val);
}
 }
 
-- 
2.25.1



RE: [PATCH] perf tools: Fix arm64 build error with gcc-11

2021-02-09 Thread Jianlin Lv


> -Original Message-
> From: Leo Yan 
> Sent: Tuesday, February 9, 2021 8:17 PM
> To: Jianlin Lv 
> Cc: john.ga...@huawei.com; w...@kernel.org; mathieu.poir...@linaro.org;
> pet...@infradead.org; mi...@redhat.com; a...@kernel.org; Mark Rutland
> ; alexander.shish...@linux.intel.com;
> jo...@redhat.com; namhy...@kernel.org; linux-arm-
> ker...@lists.infradead.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] perf tools: Fix arm64 build error with gcc-11
> 
> Hi Jianlin,
> 
> On Tue, Feb 09, 2021 at 07:33:57PM +0800, Jianlin Lv wrote:
> > gcc version: 11.0.0 20210208 (experimental) (GCC)
> >
> > Following build error on arm64:
> >
> > ...
> > In function ‘printf’,
> > inlined from ‘regs_dump__printf’ at util/session.c:1141:3,
> > inlined from ‘regs__printf’ at util/session.c:1169:2:
> > /usr/include/aarch64-linux-gnu/bits/stdio2.h:107:10: \
> >   error: ‘%-5s’ directive argument is null [-Werror=format-overflow=]
> >
> > 107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, \
> > __va_arg_pack ());
> >
> > ..
> > In function ‘fprintf’,
> >   inlined from ‘perf_sample__fprintf_regs.isra’ at \
> > builtin-script.c:622:14:
> > /usr/include/aarch64-linux-gnu/bits/stdio2.h:100:10: \
> > error: ‘%5s’ directive argument is null [-Werror=format-overflow=]
> >   100 |   return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
> >   101 | __va_arg_pack ());
> >
> > cc1: all warnings being treated as errors ...
> >
> > This patch fixes Wformat-overflow warnings by replacing the return
> > value NULL of perf_reg_name with "unknown".
> >
> > Signed-off-by: Jianlin Lv 
> > ---
> >  tools/perf/arch/arm64/include/perf_regs.h | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/arch/arm64/include/perf_regs.h
> > b/tools/perf/arch/arm64/include/perf_regs.h
> > index baaa5e64a3fb..901419f907c0 100644
> > --- a/tools/perf/arch/arm64/include/perf_regs.h
> > +++ b/tools/perf/arch/arm64/include/perf_regs.h
> > @@ -85,10 +85,10 @@ static inline const char *perf_reg_name(int id)
> > case PERF_REG_ARM64_PC:
> > return "pc";
> > default:
> > -   return NULL;
> > +   return "unknown";
> > }
> >
> > -   return NULL;
> > +   return "unknown";
> 
> This issue is a common issue crossing all archs.  So it's better to change the
> code in the places where calls perf_reg_name(), e.g. in
> util/session.c:
> 
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1135,12 +1135,14 @@ static void branch_stack__printf(struct
> perf_sample *sample, bool callstack)  static void regs_dump__printf(u64
> mask, u64 *regs)  {
> unsigned rid, i = 0;
> +   char *reg_name;
> 
> for_each_set_bit(rid, (unsigned long *) , sizeof(mask) * 8) {
> u64 val = regs[i++];
> 
> +   reg_name = perf_reg_name(rid);
> printf(" %-5s 0x%016" PRIx64 "\n",
> -  perf_reg_name(rid), val);
> +  reg_name ?: "Unknown", val);
> }
>  }
> 

Thanks for your comments, I will send a v2 of the patch today.

Jianlin


> And another potential issue is the format specifier "%-5s", it prints out
> maximum to 5 chars, but actually string "Unknown" has 7 chars.
> Actually the format specifier breaks other archs register names, e.g.
> [1][2], seems to me, it's better to change as "%-8s", you might need to use a
> dedicated patch for format specifier changes.
> 
> Thanks,
> Leo
> 
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/
> perf/arch/powerpc/include/perf_regs.h#n57
> [2]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/
> perf/arch/csky/include/perf_regs.h#n83


[PATCH] perf tools: Fix arm64 build error with gcc-11

2021-02-09 Thread Jianlin Lv
gcc version: 11.0.0 20210208 (experimental) (GCC)

Following build error on arm64:

...
In function ‘printf’,
inlined from ‘regs_dump__printf’ at util/session.c:1141:3,
inlined from ‘regs__printf’ at util/session.c:1169:2:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:107:10: \
  error: ‘%-5s’ directive argument is null [-Werror=format-overflow=]

107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, \
__va_arg_pack ());

..
In function ‘fprintf’,
  inlined from ‘perf_sample__fprintf_regs.isra’ at \
builtin-script.c:622:14:
/usr/include/aarch64-linux-gnu/bits/stdio2.h:100:10: \
error: ‘%5s’ directive argument is null [-Werror=format-overflow=]
  100 |   return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
  101 | __va_arg_pack ());

cc1: all warnings being treated as errors
...

This patch fixes Wformat-overflow warnings by replacing the return
value NULL of perf_reg_name with "unknown".

Signed-off-by: Jianlin Lv 
---
 tools/perf/arch/arm64/include/perf_regs.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/arch/arm64/include/perf_regs.h 
b/tools/perf/arch/arm64/include/perf_regs.h
index baaa5e64a3fb..901419f907c0 100644
--- a/tools/perf/arch/arm64/include/perf_regs.h
+++ b/tools/perf/arch/arm64/include/perf_regs.h
@@ -85,10 +85,10 @@ static inline const char *perf_reg_name(int id)
case PERF_REG_ARM64_PC:
return "pc";
default:
-   return NULL;
+   return "unknown";
}
 
-   return NULL;
+   return "unknown";
 }
 
 #endif /* ARCH_PERF_REGS_H */
-- 
2.25.1



RE: [PATCH] perf probe: fix kretprobe issue caused by GCC bug

2021-02-08 Thread Jianlin Lv



> -Original Message-
> From: Masami Hiramatsu 
> Sent: Monday, February 8, 2021 8:33 PM
> To: Jianlin Lv 
> Cc: pet...@infradead.org; mi...@redhat.com; a...@kernel.org; Mark
> Rutland ; alexander.shish...@linux.intel.com;
> jo...@redhat.com; namhy...@kernel.org; natechancel...@gmail.com;
> ndesaulni...@google.com; f...@redhat.com; irog...@google.com;
> suman...@linux.ibm.com; adrian.hun...@intel.com; linux-
> ker...@vger.kernel.org; clang-built-li...@googlegroups.com
> Subject: Re: [PATCH] perf probe: fix kretprobe issue caused by GCC bug
>
> Hi Jianlin,
>
> On Fri,  5 Feb 2021 17:35:58 +0800
> Jianlin Lv  wrote:
>
> > Perf failed to add kretprobe event with debuginfo of vmlinux which is
> > compiled by gcc with -fpatchable-function-entry option enabled.
> > The same issue with kernel module.
> >
> > Issue:
> >
> >   # perf probe  -v 'kernel_clone%return $retval'
> >   ..
> >   Writing event: r:probe/kernel_clone__return _text+599624 $retval
> >   Failed to write event: Invalid argument
> > Error: Failed to add events. Reason: Invalid argument (Code: -22)
> >
> >   # cat /sys/kernel/debug/tracing/error_log
> >   [156.75] trace_kprobe: error: Retprobe address must be an function entry
> >   Command: r:probe/kernel_clone__return _text+599624 $retval
> > ^
> >
> >   # llvm-dwarfdump  vmlinux |grep  -A 10  -w 0x00df2c2b
> >   0x00df2c2b:   DW_TAG_subprogram
> > DW_AT_external  (true)
> > DW_AT_name  ("kernel_clone")
> > DW_AT_decl_file ("/home/code/linux-next/kernel/fork.c")
> > DW_AT_decl_line (2423)
> > DW_AT_decl_column   (0x07)
> > DW_AT_prototyped(true)
> > DW_AT_type  (0x00dcd492 "pid_t")
> > DW_AT_low_pc(0x800010092648)
> > DW_AT_high_pc   (0x800010092b9c)
> > DW_AT_frame_base(DW_OP_call_frame_cfa)
> >
> >   # cat /proc/kallsyms |grep kernel_clone
> >   800010092640 T kernel_clone
> >   # readelf -s vmlinux |grep -i kernel_clone
> >   183173: 800010092640  1372 FUNCGLOBAL DEFAULT2 kernel_clone
> >
> >   # objdump -d vmlinux |grep -A 10  -w \:
> >   800010092640 :
> >   800010092640:   d503201fnop
> >   800010092644:   d503201fnop
> >   800010092648:   d503233fpaciasp
> >   80001009264c:   a9b87bfdstp x29, x30, [sp, #-128]!
> >   800010092650:   910003fdmov x29, sp
> >   800010092654:   a90153f3stp x19, x20, [sp, #16]
> >
> > The entry address of kernel_clone converted by debuginfo is
> > _text+599624 (0x92648), which is consistent with the value of
> DW_AT_low_pc attribute.
> > But the symbolic address of kernel_clone from /proc/kallsyms is
> > 800010092640.
>
> Oh, I had faced similar bug for fentry.
> 3d918a12a1b3 ("perf probe: Find fentry mcount fuzzed parameter location")
> GCC dwarf generator tends to skip this kind of function entry information...
>
> >
> > This issue is found on arm64, -fpatchable-function-entry=2 is enabled
> > when CONFIG_DYNAMIC_FTRACE_WITH_REGS=y;
> > Just as objdump displayed the assembler contents of kernel_clone, GCC
> > generate 2 NOPs  at the beginning of each function.
> >
> > kprobe_on_func_entry detects that (_text+599624) is not the entry
> > address of the function, which leads to the failure of adding kretprobe
> event.
> >
> > ---
> > kprobe_on_func_entry
> > ->_kprobe_addr
> > ->kallsyms_lookup_size_offset
> > ->arch_kprobe_on_func_entry// FALSE
> > ---
> >
> > The cause of the issue is that the first instruction in the compile
> > unit indicated by DW_AT_low_pc does not include NOPs.
> > This issue exists in all gcc versions that support
> > -fpatchable-function-entry option.
> >
> > I have reported it to the GCC community:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776
>
> Thanks for reporting it!
>
> > Currently arm64 and PA-RISC may enable fpatchable-function-entry option.
> > The kernel compiled with clang does not have this issue.
> >
> > FIX:
> >
> > The result of my investigation is that this GCC issue will only cause
> > the registration failure of the kretprobe event; Other functions of
> > perf probe will not be affected, such as line probe, local variable
> > p

[PATCH] perf probe: fix kretprobe issue caused by GCC bug

2021-02-05 Thread Jianlin Lv
Perf failed to add kretprobe event with debuginfo of vmlinux which is
compiled by gcc with -fpatchable-function-entry option enabled.
The same issue with kernel module.

Issue:

  # perf probe  -v 'kernel_clone%return $retval'
  ..
  Writing event: r:probe/kernel_clone__return _text+599624 $retval
  Failed to write event: Invalid argument
Error: Failed to add events. Reason: Invalid argument (Code: -22)

  # cat /sys/kernel/debug/tracing/error_log
  [156.75] trace_kprobe: error: Retprobe address must be an function entry
  Command: r:probe/kernel_clone__return _text+599624 $retval
^

  # llvm-dwarfdump  vmlinux |grep  -A 10  -w 0x00df2c2b
  0x00df2c2b:   DW_TAG_subprogram
DW_AT_external  (true)
DW_AT_name  ("kernel_clone")
DW_AT_decl_file ("/home/code/linux-next/kernel/fork.c")
DW_AT_decl_line (2423)
DW_AT_decl_column   (0x07)
DW_AT_prototyped(true)
DW_AT_type  (0x00dcd492 "pid_t")
DW_AT_low_pc(0x800010092648)
DW_AT_high_pc   (0x800010092b9c)
DW_AT_frame_base(DW_OP_call_frame_cfa)

  # cat /proc/kallsyms |grep kernel_clone
  800010092640 T kernel_clone
  # readelf -s vmlinux |grep -i kernel_clone
  183173: 800010092640  1372 FUNCGLOBAL DEFAULT2 kernel_clone

  # objdump -d vmlinux |grep -A 10  -w \:
  800010092640 :
  800010092640:   d503201fnop
  800010092644:   d503201fnop
  800010092648:   d503233fpaciasp
  80001009264c:   a9b87bfdstp x29, x30, [sp, #-128]!
  800010092650:   910003fdmov x29, sp
  800010092654:   a90153f3stp x19, x20, [sp, #16]

The entry address of kernel_clone converted by debuginfo is _text+599624
(0x92648), which is consistent with the value of DW_AT_low_pc attribute.
But the symbolic address of kernel_clone from /proc/kallsyms is
800010092640.

This issue is found on arm64, -fpatchable-function-entry=2 is enabled when
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y;
Just as objdump displayed the assembler contents of kernel_clone,
GCC generate 2 NOPs  at the beginning of each function.

kprobe_on_func_entry detects that (_text+599624) is not the entry address
of the function, which leads to the failure of adding kretprobe event.

---
kprobe_on_func_entry
->_kprobe_addr
->kallsyms_lookup_size_offset
->arch_kprobe_on_func_entry // FALSE
---

The cause of the issue is that the first instruction in the compile unit
indicated by DW_AT_low_pc does not include NOPs.
This issue exists in all gcc versions that support
-fpatchable-function-entry option.

I have reported it to the GCC community:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98776

Currently arm64 and PA-RISC may enable fpatchable-function-entry option.
The kernel compiled with clang does not have this issue.

FIX:

The result of my investigation is that this GCC issue will only cause the
registration failure of the kretprobe event;
Other functions of perf probe will not be affected, such as line probe,
local variable probe, uprobe, etc.

A workaround solution is to traverse all the compilation units in
debuginfo for the retprobe event and check whether the DW_AT_producer
attribute valaue of each CUs contains substrings: "GNU" and
"-fpatchable-function-entry". If these two substrings are included,
then debuginfo will not be used to convert perf_probe_event.
Instead, map will be used to query the probe function address.

-grecord-gcc-switches causes the command-line options used to invoke the
compiler to be appended to the DW_AT_producer attribute in DWARF debugging
information.It is enabled by default.

A potential defect is that if -gno-record-gcc-switches option is enabled,
the command-line options will not be recorded in debuginfo. This workaround
solution will fail.
Assume that this situation may not happen for kernel compilation.

Signed-off-by: Jianlin Lv 
---
 tools/perf/util/probe-event.c | 60 +++
 1 file changed, 60 insertions(+)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 8eae2afff71a..c0c1bcc59250 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -885,6 +885,60 @@ static int post_process_probe_trace_events(struct 
perf_probe_event *pev,
return ret;
 }
 
+/*
+ * Perf failed to add kretprobe event with debuginfo of vmlinux which is
+ * compiled by gcc with -fpatchable-function-entry option enabled.
+ * The same issue with kernel module. Refer to gcc issue: #98776
+ * This issue only cause the registration failure of kretprobe event,
+ * and it doesn't affect other perf probe functions.
+ * This workaround solution use map to query the probe function address
+ * for retprobe event.

[PATCH v2] perf probe: Added protection to avoid endless loop

2021-02-03 Thread Jianlin Lv
if dwarf_offdie() return NULL, the continue statement forces the next
iteration of the loop without update variable off. It will cause an
endless loop in the process of traversing the compilation unit.
So added exception protection for loop CUs.

Signed-off-by: Jianlin Lv 
---
v2: removed the statement that updates variable in the function argument.
---
 tools/perf/util/probe-finder.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 76dd349aa48d..1b118c9c86a6 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1187,8 +1187,10 @@ static int debuginfo__find_probe_location(struct 
debuginfo *dbg,
while (!dwarf_nextcu(dbg->dbg, off, , , NULL, NULL, NULL)) {
/* Get the DIE(Debugging Information Entry) of this CU */
diep = dwarf_offdie(dbg->dbg, off + cuhl, >cu_die);
-   if (!diep)
+   if (!diep) {
+   off = noff;
continue;
+   }
 
/* Check if target file is included. */
if (pp->file)
@@ -1949,8 +1951,10 @@ int debuginfo__find_line_range(struct debuginfo *dbg, 
struct line_range *lr)
 
/* Get the DIE(Debugging Information Entry) of this CU */
diep = dwarf_offdie(dbg->dbg, off + cuhl, _die);
-   if (!diep)
+   if (!diep) {
+   off = noff;
continue;
+   }
 
/* Check if target file is included. */
if (lr->file)
-- 
2.25.1



[PATCH] perf probe: Added protection to avoid endless loop

2021-02-02 Thread Jianlin Lv
if dwarf_offdie() return NULL, the continue statement forces the next
iteration of the loop without update variable off. It will cause an
endless loop in the process of traversing the compilation unit.
So added exception protection for loop CUs.

Signed-off-by: Jianlin Lv 
---
 tools/perf/util/probe-finder.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 76dd349aa48d..887bffb1cc58 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1156,7 +1156,7 @@ static int debuginfo__find_probe_location(struct 
debuginfo *dbg,
Dwarf_Die *diep;
int ret = 0;
 
-   off = 0;
+   noff = 0;
pf->lcache = intlist__new(NULL);
if (!pf->lcache)
return -ENOMEM;
@@ -1184,7 +1184,7 @@ static int debuginfo__find_probe_location(struct 
debuginfo *dbg,
}
 
/* Loop on CUs (Compilation Unit) */
-   while (!dwarf_nextcu(dbg->dbg, off, , , NULL, NULL, NULL)) {
+   while (!dwarf_nextcu(dbg->dbg, off = noff, , , NULL, NULL, 
NULL)) {
/* Get the DIE(Debugging Information Entry) of this CU */
diep = dwarf_offdie(dbg->dbg, off + cuhl, >cu_die);
if (!diep)
@@ -1208,7 +1208,6 @@ static int debuginfo__find_probe_location(struct 
debuginfo *dbg,
if (ret < 0)
break;
}
-   off = noff;
}
 
 found:
@@ -1919,7 +1918,7 @@ int debuginfo__find_line_range(struct debuginfo *dbg, 
struct line_range *lr)
 {
struct line_finder lf = {.lr = lr, .found = 0};
int ret = 0;
-   Dwarf_Off off = 0, noff;
+   Dwarf_Off off = 0, noff = 0;
size_t cuhl;
Dwarf_Die *diep;
const char *comp_dir;
@@ -1943,6 +1942,7 @@ int debuginfo__find_line_range(struct debuginfo *dbg, 
struct line_range *lr)
 
/* Loop on CUs (Compilation Unit) */
while (!lf.found && ret >= 0) {
+   off = noff;
if (dwarf_nextcu(dbg->dbg, off, , ,
 NULL, NULL, NULL) != 0)
break;
@@ -1967,7 +1967,6 @@ int debuginfo__find_line_range(struct debuginfo *dbg, 
struct line_range *lr)
ret = find_line_range_by_line(NULL, );
}
}
-   off = noff;
}
 
 found:
-- 
2.25.1



RE: [PATCH] tracing/kprobe: Fix to support kretprobe events on unloaded modules

2021-01-30 Thread Jianlin Lv



> -Original Message-
> From: Masami Hiramatsu 
> Sent: Friday, January 29, 2021 9:45 PM
> To: Jianlin Lv 
> Cc: Steven Rostedt ; Oleg Nesterov
> ; Ingo Molnar ; linux-
> ker...@vger.kernel.org
> Subject: Re: [PATCH] tracing/kprobe: Fix to support kretprobe events on
> unloaded modules
>
> Hi Jianlin,
>
> On Fri, 29 Jan 2021 07:45:32 +
> Jianlin Lv  wrote:
> >
> >
> > I have verified this patch on server and it solves the kretprobe issue
> > on unloaded modules very well.
> >
> > One comments is this patch did not cover the issue I reported before,
> > https://lore.kernel.org/lkml/20210127151507.4185234-1-Jianlin.Lv@arm.c
> > om/ if echo wrong function symbol to tracefs, error_log show as :
> >
> > # echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events # cat
> > error_log
> > [   87.746191] trace_kprobe: error: Invalid probed address or symbol
> >   Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
>
> This is the correct error message because the symbol is not exist, a bad
> symbol.
>
> Thank you,
>

Yes, it would be nice if errors could be detected before registering Kprobe.

Of course, this patch is looks good for me.

Jianlin

> >
> > Based on the current patch,  The following changes will make the error
> > log more accurate.
> >
> > @@ -828,9 +828,10 @@
> > static int trace_kprobe_create(int argc, const char *argv[]) if
> > (is_return) flags |= TPARG_FL_RETURN;
> > -if (kprobe_on_func_entry(NULL, symbol, offset))
> > +if (!kprobe_on_func_entry(NULL, symbol, offset))
> > flags |= TPARG_FL_FENTRY;
> > -if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
> > +/* Check whether symbol is really bad or from a module */
> > +if (!strchr(symbol, ':') && is_return && !(flags &
> > + TPARG_FL_FENTRY)) {
> > trace_probe_log_err(0, BAD_RETPROBE);
> > goto parse_error;
> >
> > If you don`t mind, I can send v5 of my patch after this patch be merged.
> >
> > Jianlin
> >
> >
> > IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended 
> recipient,
> please notify the sender immediately and do not disclose the contents to any
> other person, use it for any purpose, or store or copy the information in any
> medium. Thank you.
>
>
> --
> Masami Hiramatsu 
IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


[PATCH v4] tracing: precise log info for kretprobe addr err

2021-01-27 Thread Jianlin Lv
When trying to create kretprobe with the wrong function symbol in tracefs;
The error is triggered in the register_trace_kprobe() and recorded as
FAIL_REG_PROBE issue,

Example:
  $ cd /sys/kernel/debug/tracing
  $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
bash: echo: write error: Invalid argument
  $ cat error_log
[142797.347877] trace_kprobe: error: Failed to register probe event
Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
   ^

This error can be detected in the parameter parsing stage, the effect of
applying this patch is as follows:

  $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
bash: echo: write error: Invalid argument
  $ cat error_log
[415.89]trace_kprobe: error: Retprobe address must be an function entry
Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
   ^
v2 changes:
- Added !strchr(symbol, ':') to check whether symbol is really bad
  or from a module.

Signed-off-by: Jianlin Lv 
---
v2: added !strchr(symbol, ':') to check really bad symbol or from module.
v4: added changelog and code comments.
---
 kernel/trace/trace_kprobe.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index e6fba1798771..384208a38f82 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -830,7 +830,8 @@ static int trace_kprobe_create(int argc, const char *argv[])
flags |= TPARG_FL_RETURN;
if (kprobe_on_func_entry(NULL, symbol, offset))
flags |= TPARG_FL_FENTRY;
-   if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
+   /* Check whether symbol is really bad or from a module */
+   if (!strchr(symbol, ':') && is_return && !(flags & 
TPARG_FL_FENTRY)) {
trace_probe_log_err(0, BAD_RETPROBE);
goto parse_error;
}
-- 
2.25.1



RE: [PATCH v3] tracing: precise log info for kretprobe addr err

2021-01-27 Thread Jianlin Lv



> -Original Message-
> From: Masami Hiramatsu 
> Sent: Wednesday, January 27, 2021 9:28 PM
> To: Jianlin Lv 
> Cc: Oleg Nesterov ; Steven Rostedt
> ; mi...@redhat.com; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] tracing: precise log info for kretprobe addr err
>
> On Wed, 27 Jan 2021 02:46:10 +
> Jianlin Lv  wrote:
>
> >
> >
> > > -Original Message-
> > > From: Masami Hiramatsu 
> > > Sent: Wednesday, January 27, 2021 10:02 AM
> > > To: Oleg Nesterov 
> > > Cc: Steven Rostedt ; Jianlin Lv
> > > ; mi...@redhat.com; linux-
> ker...@vger.kernel.org
> > > Subject: Re: [PATCH v3] tracing: precise log info for kretprobe addr
> > > err
> > >
> > > On Tue, 26 Jan 2021 21:20:59 +0100
> > > Oleg Nesterov  wrote:
> > >
> > > > On 01/26, Masami Hiramatsu wrote:
> > > > >
> > > > > > >
> > > > > > > IOW, the "offset != 0" check removed by this patch is
> > > > > > > obviously wrong,
> > > right?
> > > > > > >
> > > > >
> > > > > No, not wrong. Even offset != 0, if the symbol exists in the
> > > > > kernel,
> > > > > kprobe_on_func_entry() will check it.
> > > >
> > > > Yes, but unless I am totally confused... if kprobe_on_func_entry()
> > > > returns false, then trace_kprobe_create() should fail with
> > > > BAD_RETPROBE
> > > even if offset == 0 ?
> > >
> > > Yes, if kprobe_on_func_entry() returns false, register_kretprobe()
> > > also returns an error.
> > >
> > > -
> > > int register_kretprobe(struct kretprobe *rp) {
> > > int ret = 0;
> > > struct kretprobe_instance *inst;
> > > int i;
> > > void *addr;
> > >
> > > if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name,
> > > rp-
> > > >kp.offset))
> > > return -EINVAL;
> > >
> > > -
> > >
> > > Thank you,
> > >
> > > --
> > > Masami Hiramatsu 
> >
> >
> > If register_kretprobe()returns an error -EINVAL.
> > This means that __register_trace_kprobe return -EINVAL,
> >
> > ---
> > ret = __register_trace_kprobe(tk);
> > if (ret == -ENOENT && !trace_kprobe_module_exist(tk)) { pr_warn("This
> > probe might be able to register after target module is loaded.
> > Continue.\n"); ret = 0; }
> > ---
> > As code show, cannot enable kretprobe for an unloaded module.
> >
> > This is consistent with my test results (no VXLAN module is loaded).
> >
> > # perf probe -m /lib/modules/5.11.0-rc2+/kernel/drivers/net/vxlan.ko
> > \ 'vxlan_xmit%return $retval'
> > Failed to write event: Invalid argument
> >   Error: Failed to add events.
> >
> > Is this a bug?
>
> Oops, good catch!
> It seems that the bug has been introduced when I added
> kprobe_on_func_entry() to register_Kretprobe.
> Let me fix it.
>
> Thank you!
>
>
> --
> Masami Hiramatsu 

After confirming this problem, my worries are eliminated,
and the current patch will be updated later.

I am also investigating this bug, and I think this process will deepen
my understanding of kernel probes.

Jianlin

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


RE: [PATCH v3] tracing: precise log info for kretprobe addr err

2021-01-26 Thread Jianlin Lv



> -Original Message-
> From: Masami Hiramatsu 
> Sent: Wednesday, January 27, 2021 10:02 AM
> To: Oleg Nesterov 
> Cc: Steven Rostedt ; Jianlin Lv ;
> mi...@redhat.com; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] tracing: precise log info for kretprobe addr err
>
> On Tue, 26 Jan 2021 21:20:59 +0100
> Oleg Nesterov  wrote:
>
> > On 01/26, Masami Hiramatsu wrote:
> > >
> > > > >
> > > > > IOW, the "offset != 0" check removed by this patch is obviously wrong,
> right?
> > > > >
> > >
> > > No, not wrong. Even offset != 0, if the symbol exists in the kernel,
> > > kprobe_on_func_entry() will check it.
> >
> > Yes, but unless I am totally confused... if kprobe_on_func_entry()
> > returns false, then trace_kprobe_create() should fail with BAD_RETPROBE
> even if offset == 0 ?
>
> Yes, if kprobe_on_func_entry() returns false, register_kretprobe() also
> returns an error.
>
> -
> int register_kretprobe(struct kretprobe *rp) {
> int ret = 0;
> struct kretprobe_instance *inst;
> int i;
> void *addr;
>
> if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp-
> >kp.offset))
> return -EINVAL;
>
> -
>
> Thank you,
>
> --
> Masami Hiramatsu 


If register_kretprobe()returns an error -EINVAL.
This means that __register_trace_kprobe return -EINVAL,

---
ret = __register_trace_kprobe(tk);
if (ret == -ENOENT && !trace_kprobe_module_exist(tk)) {
pr_warn("This probe might be able to register after target module is loaded. 
Continue.\n");
ret = 0;
}
---
As code show, cannot enable kretprobe for an unloaded module.

This is consistent with my test results (no VXLAN module is loaded).

# perf probe -m /lib/modules/5.11.0-rc2+/kernel/drivers/net/vxlan.ko  \
'vxlan_xmit%return $retval'
Failed to write event: Invalid argument
  Error: Failed to add events.

Is this a bug?

Jianlin

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


RE: [PATCH v3] tracing: precise log info for kretprobe addr err

2021-01-26 Thread Jianlin Lv



> -Original Message-
> From: Masami Hiramatsu 
> Sent: Tuesday, January 26, 2021 12:16 PM
> To: Steven Rostedt 
> Cc: Oleg Nesterov ; Jianlin Lv ;
> mi...@redhat.com; mhira...@kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v3] tracing: precise log info for kretprobe addr err
>
> On Mon, 25 Jan 2021 13:38:40 -0500
> Steven Rostedt  wrote:
>
> > On Mon, 25 Jan 2021 19:19:27 +0100
> > Oleg Nesterov  wrote:
> >
> > > On 01/26, Jianlin Lv wrote:
> > > >
> > > > When trying to create kretprobe with the wrong function symbol in
> > > > tracefs; The error is triggered in the register_trace_kprobe() and
> > > > recorded as FAIL_REG_PROBE issue,
> > > >
> > > > Example:
> > > >   $ cd /sys/kernel/debug/tracing
> > > >   $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
> > > > bash: echo: write error: Invalid argument
> > > >   $ cat error_log
> > > > [142797.347877] trace_kprobe: error: Failed to register probe event
> > > > Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
> > > >^
> > > >
> > > > This error can be detected in the parameter parsing stage, the
> > > > effect of applying this patch is as follows:
> > > >
> > > >   $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
> > > > bash: echo: write error: Invalid argument
> > > >   $ cat error_log
> > > > [415.89]trace_kprobe: error: Retprobe address must be an function
> entry
> > > > Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
> > >
> > > IOW, the "offset != 0" check removed by this patch is obviously wrong,
> right?
> > >
>
> No, not wrong. Even offset != 0, if the symbol exists in the kernel,
> kprobe_on_func_entry() will check it.
>
> > > Agreed, but...
> > >
> > > > --- a/kernel/trace/trace_kprobe.c
> > > > +++ b/kernel/trace/trace_kprobe.c
> > > > @@ -830,7 +830,7 @@ static int trace_kprobe_create(int argc, const
> char *argv[])
> > > >  flags |= TPARG_FL_RETURN;
> > > >  if (kprobe_on_func_entry(NULL, symbol, offset))
> > > >  flags |= TPARG_FL_FENTRY;
> > > > -if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
> > > > +if (!strchr(symbol, ':') && is_return && !(flags &
> > > > +TPARG_FL_FENTRY)) {
> > >
> > > but why did you add the strchr(':') check instead?
> > >
> > > I was really puzzled until I found the this email from Masami:
> > >
> https://lore.kernel.org/lkml/20210120131406.5a992c1e434681750a0cd5d4
> > > @kernel.org/
> > >
> > > So I leave this to you and Masami, but perhaps you can document this
> > > check at least in the changelog?
> > >
> >
> > No, you are correct. That needs to be documented in the code.
>
> Agreed. There should be commented that is defered until the module is
> loaded.
>
> >
> > I was about to comment that the check requires a comment ;-)
> >
> > Jianlin,
> >
> > Care to send a v4 of the patch with a comment to why we are checking
> > the symbol for ':'.
>
> Thank you!
>
> >
> > Thanks!
> >
> > -- Steve
> >

Thanks for everyone's comments;  I will update this patch.
In addition, I have another question.

perf-probe can add a probe on a module which has not been loaded yet.
But I still get an error when I execute the following command:
# perf probe -m /lib/modules/5.11.0-rc2+/kernel/drivers/net/vxlan.ko
'vxlan_xmit%return $retval'
Failed to write event: Invalid argument
  Error: Failed to add events.

According to my investigation, __register_trace_kprobe will return -EINVAL,
because the vxlan module is not loaded;

__register_trace_kprobe
->register_kretprobe
->kprobe_on_func_entry
->return -EINVAL;

The following code snippet shows that the probe of the offline module can be
registered successfully only when the ret value is -ENOENT.

ret = __register_trace_kprobe(tk);
if (ret == -ENOENT && !trace_kprobe_module_exist(tk)) {
pr_warn("This probe might be able to register after target module is loaded. 
Continue.\n");
ret = 0;
}

kretprobe events not work with Offline Modules.
Is this a bug?

Jianlin

>
>
> --
> Masami Hiramatsu 
IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


[PATCH v3] tracing: precise log info for kretprobe addr err

2021-01-25 Thread Jianlin Lv
When trying to create kretprobe with the wrong function symbol in tracefs;
The error is triggered in the register_trace_kprobe() and recorded as
FAIL_REG_PROBE issue,

Example:
  $ cd /sys/kernel/debug/tracing
  $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
bash: echo: write error: Invalid argument
  $ cat error_log
[142797.347877] trace_kprobe: error: Failed to register probe event
Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
   ^

This error can be detected in the parameter parsing stage, the effect of
applying this patch is as follows:

  $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
bash: echo: write error: Invalid argument
  $ cat error_log
[415.89]trace_kprobe: error: Retprobe address must be an function entry
Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
   ^

Signed-off-by: Jianlin Lv 
---
v2:add !strchr(symbol, ':') to check really bad symbol or not.
---
 kernel/trace/trace_kprobe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index e6fba1798771..bce63d5ecaec 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -830,7 +830,7 @@ static int trace_kprobe_create(int argc, const char *argv[])
flags |= TPARG_FL_RETURN;
if (kprobe_on_func_entry(NULL, symbol, offset))
flags |= TPARG_FL_FENTRY;
-   if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
+   if (!strchr(symbol, ':') && is_return && !(flags & 
TPARG_FL_FENTRY)) {
trace_probe_log_err(0, BAD_RETPROBE);
goto parse_error;
}
-- 
2.25.1



RE: [PATCH v2] tracing: precise log info for kretprobe addr err

2021-01-24 Thread Jianlin Lv
Hi Masami,
Thanks for your comments, it is very helpful to me.


Hi Steven,
Could you give me more detailed suggestions about the UPROBES section? I lack 
the knowledge of this part.
Can't fully understand your previous comments;


Thanks all for your patience.

Jianlin

> -Original Message-
> From: Masami Hiramatsu 
> Sent: Sunday, January 24, 2021 4:53 PM
> To: Jianlin Lv 
> Cc: Steven Rostedt ; mi...@redhat.com; linux-
> ker...@vger.kernel.org; o...@redhat.com
> Subject: Re: [PATCH v2] tracing: precise log info for kretprobe addr err
>
> Hi Jianlin,
>
> On Sat, 23 Jan 2021 14:21:48 +
> Jianlin Lv  wrote:
>
> > Regarding the UPROBES, I read the code of trace_uprobe_create() and
> found a place for improvement.
> >
> > diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
> > index 3cf7128e1ad3..8c81f04d7755 100644
> > --- a/kernel/trace/trace_uprobe.c
> > +++ b/kernel/trace/trace_uprobe.c
> > @@ -567,16 +567,18 @@ static int trace_uprobe_create(int argc, const
> char **argv)
> > if (!filename)
> > return -ENOMEM;
> >
> > +   trace_probe_log_init("trace_uprobe", argc, argv);
> > +   trace_probe_log_set_index(1);   /* filename is the 2nd argument */
> > +
> > /* Find the last occurrence, in case the path contains ':' too. */
> > arg = strrchr(filename, ':');
> > if (!arg || !isdigit(arg[1])) {
> > +   trace_probe_log_err(arg + 1 - filename,
> > + BAD_UPROBE_OFFS);
> > kfree(filename);
> > -   return -ECANCELED;
> > +   ret = -EINVAL;
> > +   goto out;
>
> Sorry, it's not a bug, but an expected behavior, because this is checking
> routine which identify the passed command is mine (uprobe event definition)
> or others (e.g. kprobe event definition).
>
> Note that the dyn_event_operations::create operator will be used from
> dyn_event, which passes the command string from
> TRACEFS/dynamic_events to each create operator and check the return is -
> ECANCELED.
> The -ECANCELED is not an error, it means "it is not mine, but maybe others."
> See create_dyn_event(int argc, char **argv) in kernel/trace/trace_dynevent.c.
>
> Thank you,
>
> --
> Masami Hiramatsu 
IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


RE: [PATCH v2] tracing: precise log info for kretprobe addr err

2021-01-23 Thread Jianlin Lv



> -Original Message-
> From: Masami Hiramatsu 
> Sent: Thursday, January 21, 2021 10:29 AM
> To: Steven Rostedt 
> Cc: Jianlin Lv ; mi...@redhat.com; linux-
> ker...@vger.kernel.org; Masami Hiramatsu ;
> o...@redhat.com
> Subject: Re: [PATCH v2] tracing: precise log info for kretprobe addr err
>
> On Wed, 20 Jan 2021 11:20:04 -0500
> Steven Rostedt  wrote:
>
> >
> > You forgot to include Masami on Cc again. Masami maintains kprobes.
> > Please include him on any updates, as he needs to review them, and
> > give his acknowledgment before acceptance.
> >
> > I need to update MAINTAINERS files to include trace_kprobe under
> KPROBES.
> > And I also don't see any UPROBES section there. That needs to be done
> > as well.

Regarding the UPROBES, I read the code of trace_uprobe_create() and  found a 
place for improvement.

diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 3cf7128e1ad3..8c81f04d7755 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -567,16 +567,18 @@ static int trace_uprobe_create(int argc, const char 
**argv)
if (!filename)
return -ENOMEM;

+   trace_probe_log_init("trace_uprobe", argc, argv);
+   trace_probe_log_set_index(1);   /* filename is the 2nd argument */
+
/* Find the last occurrence, in case the path contains ':' too. */
arg = strrchr(filename, ':');
if (!arg || !isdigit(arg[1])) {
+   trace_probe_log_err(arg + 1 - filename, BAD_UPROBE_OFFS);
kfree(filename);
-   return -ECANCELED;
+   ret = -EINVAL;
+   goto out;
}

-   trace_probe_log_init("trace_uprobe", argc, argv);
-   trace_probe_log_set_index(1);   /* filename is the 2nd argument */
-

Is there anything else that is missing?
Could you provide more detailed guidance?


>
> Uprobes is under kernel/events/, which seems to be handled by
> performance event subsystem. Maybe we should ask them too.
> Or, can I be a reviewer (R:) for tracing subsystem?
>
> >
> > On Wed, 20 Jan 2021 23:56:44 +0800
> > Jianlin Lv  wrote:
> >
> > > When trying to create kretprobe with the wrong function symbol in
> > > tracefs; The error is triggered in the register_trace_kprobe() and
> > > recorded as FAIL_REG_PROBE issue,
> > >
> > > Example:
> > >   $ cd /sys/kernel/debug/tracing
> > >   $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
> > > bash: echo: write error: Invalid argument
> > >   $ cat error_log
> > > [142797.347877] trace_kprobe: error: Failed to register probe event
> > > Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
> > >^
> > >
> > > This error can be detected in the parameter parsing stage, the
> > > effect of applying this patch is as follows:
> > >
> > >   $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
> > > bash: echo: write error: Invalid argument
> > >   $ cat error_log
> > > [415.89]trace_kprobe: error: Retprobe address must be an function
> entry
> > > Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
> > >^
> > >
> > > Signed-off-by: Jianlin Lv 
> > >
> > > v2:add !strchr(symbol, ':') to check really bad symbol or not.
> >
> > Also, the "changes since" section should be below the "---" so that
> > they don't get pulled into the commit.
>
> Except that, this looks good to me.
>
> Acked-by: Masami Hiramatsu 
>
> Thank you,
>
> >
> > Thanks!
> >
> > -- Steve
> >
> >
> > > ---
> > >  kernel/trace/trace_kprobe.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/trace/trace_kprobe.c
> > > b/kernel/trace/trace_kprobe.c index e6fba1798771..bce63d5ecaec
> > > 100644
> > > --- a/kernel/trace/trace_kprobe.c
> > > +++ b/kernel/trace/trace_kprobe.c
> > > @@ -830,7 +830,7 @@ static int trace_kprobe_create(int argc, const char
> *argv[])
> > >  flags |= TPARG_FL_RETURN;
> > >  if (kprobe_on_func_entry(NULL, symbol, offset))
> > >  flags |= TPARG_FL_FENTRY;
> > > -if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
> > > +if (!strchr(symbol, ':') && is_return && !(flags &
> > > +TPARG_FL_FENTRY)) {
> > >  trace_probe_log_err(0, BAD_RETPROBE);
> > >  goto parse_error;
> > >  }
> >
>
>
> --
> Masami Hiramatsu 
IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


[PATCH v2] tracing: precise log info for kretprobe addr err

2021-01-20 Thread Jianlin Lv
When trying to create kretprobe with the wrong function symbol in tracefs;
The error is triggered in the register_trace_kprobe() and recorded as
FAIL_REG_PROBE issue,

Example:
  $ cd /sys/kernel/debug/tracing
  $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
bash: echo: write error: Invalid argument
  $ cat error_log
[142797.347877] trace_kprobe: error: Failed to register probe event
Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
   ^

This error can be detected in the parameter parsing stage, the effect of
applying this patch is as follows:

  $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
bash: echo: write error: Invalid argument
  $ cat error_log
[415.89]trace_kprobe: error: Retprobe address must be an function entry
Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
   ^

Signed-off-by: Jianlin Lv 

v2:add !strchr(symbol, ':') to check really bad symbol or not.
---
 kernel/trace/trace_kprobe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index e6fba1798771..bce63d5ecaec 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -830,7 +830,7 @@ static int trace_kprobe_create(int argc, const char *argv[])
flags |= TPARG_FL_RETURN;
if (kprobe_on_func_entry(NULL, symbol, offset))
flags |= TPARG_FL_FENTRY;
-   if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
+   if (!strchr(symbol, ':') && is_return && !(flags & 
TPARG_FL_FENTRY)) {
trace_probe_log_err(0, BAD_RETPROBE);
goto parse_error;
}
-- 
2.25.1



[PATCH] tracing: precise log info for kretprobe addr err

2021-01-19 Thread Jianlin Lv
When trying to create kretprobe with the wrong function symbol in tracefs;
The error is triggered in the register_trace_kprobe() and recorded as
FAIL_REG_PROBE issue,

Example:
  $ cd /sys/kernel/debug/tracing
  $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
bash: echo: write error: Invalid argument
  $ cat error_log
[142797.347877] trace_kprobe: error: Failed to register probe event
Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
   ^

This error can be detected in the parameter parsing stage, the effect of
applying this patch is as follows:

  $ echo 'r:myprobe ERROR_SYMBOL_XXX ret=%x0' >> kprobe_events
bash: echo: write error: Invalid argument
  $ cat error_log
[415.89]trace_kprobe: error: Retprobe address must be an function entry
Command: r:myprobe ERROR_SYMBOL_XXX ret=%x0
   ^

Signed-off-by: Jianlin Lv 
---
 kernel/trace/trace_kprobe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index e6fba1798771..3dfd1b6711a3 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -830,7 +830,7 @@ static int trace_kprobe_create(int argc, const char *argv[])
flags |= TPARG_FL_RETURN;
if (kprobe_on_func_entry(NULL, symbol, offset))
flags |= TPARG_FL_FENTRY;
-   if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
+   if (is_return && !(flags & TPARG_FL_FENTRY)) {
trace_probe_log_err(0, BAD_RETPROBE);
goto parse_error;
}
-- 
2.25.1



[PATCH] arm64: rename S_FRAME_SIZE to PT_REGS_SIZE

2021-01-11 Thread Jianlin Lv
S_FRAME_SIZE is the size of the pt_regs structure, no longer the size of
the kernel stack frame, the name is misleading. In keeping with arm32,
rename S_FRAME_SIZE to PT_REGS_SIZE.

Signed-off-by: Jianlin Lv 
---
 arch/arm64/kernel/asm-offsets.c   |  2 +-
 arch/arm64/kernel/entry-ftrace.S  | 12 ++--
 arch/arm64/kernel/entry.S | 14 +++---
 arch/arm64/kernel/probes/kprobes_trampoline.S |  6 +++---
 4 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index f42fd9e33981..301784463587 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -75,7 +75,7 @@ int main(void)
   DEFINE(S_SDEI_TTBR1, offsetof(struct pt_regs, sdei_ttbr1));
   DEFINE(S_PMR_SAVE,   offsetof(struct pt_regs, pmr_save));
   DEFINE(S_STACKFRAME, offsetof(struct pt_regs, stackframe));
-  DEFINE(S_FRAME_SIZE, sizeof(struct pt_regs));
+  DEFINE(PT_REGS_SIZE, sizeof(struct pt_regs));
   BLANK();
 #ifdef CONFIG_COMPAT
   DEFINE(COMPAT_SIGFRAME_REGS_OFFSET,  offsetof(struct 
compat_sigframe, uc.uc_mcontext.arm_r0));
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index a338f40e64d3..b3e4f9a088b1 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -35,7 +35,7 @@
  */
.macro  ftrace_regs_entry, allregs=0
/* Make room for pt_regs, plus a callee frame */
-   sub sp, sp, #(S_FRAME_SIZE + 16)
+   sub sp, sp, #(PT_REGS_SIZE + 16)
 
/* Save function arguments (and x9 for simplicity) */
stp x0, x1, [sp, #S_X0]
@@ -61,15 +61,15 @@
.endif
 
/* Save the callsite's SP and LR */
-   add x10, sp, #(S_FRAME_SIZE + 16)
+   add x10, sp, #(PT_REGS_SIZE + 16)
stp x9, x10, [sp, #S_LR]
 
/* Save the PC after the ftrace callsite */
str x30, [sp, #S_PC]
 
/* Create a frame record for the callsite above pt_regs */
-   stp x29, x9, [sp, #S_FRAME_SIZE]
-   add x29, sp, #S_FRAME_SIZE
+   stp x29, x9, [sp, #PT_REGS_SIZE]
+   add x29, sp, #PT_REGS_SIZE
 
/* Create our frame record within pt_regs. */
stp x29, x30, [sp, #S_STACKFRAME]
@@ -120,7 +120,7 @@ ftrace_common_return:
ldr x9, [sp, #S_PC]
 
/* Restore the callsite's SP */
-   add sp, sp, #S_FRAME_SIZE + 16
+   add sp, sp, #PT_REGS_SIZE + 16
 
ret x9
 SYM_CODE_END(ftrace_common)
@@ -130,7 +130,7 @@ SYM_CODE_START(ftrace_graph_caller)
ldr x0, [sp, #S_PC]
sub x0, x0, #AARCH64_INSN_SIZE  // ip (callsite's BL insn)
add x1, sp, #S_LR   // parent_ip (callsite's LR)
-   ldr x2, [sp, #S_FRAME_SIZE] // parent fp (callsite's FP)
+   ldr x2, [sp, #PT_REGS_SIZE] // parent fp (callsite's FP)
bl  prepare_ftrace_return
b   ftrace_common_return
 SYM_CODE_END(ftrace_graph_caller)
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index a8c3e7aaca74..c9bae73f2621 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -75,7 +75,7 @@ alternative_else_nop_endif
.endif
 #endif
 
-   sub sp, sp, #S_FRAME_SIZE
+   sub sp, sp, #PT_REGS_SIZE
 #ifdef CONFIG_VMAP_STACK
/*
 * Test whether the SP has overflowed, without corrupting a GPR.
@@ -96,7 +96,7 @@ alternative_else_nop_endif
 * userspace, and can clobber EL0 registers to free up GPRs.
 */
 
-   /* Stash the original SP (minus S_FRAME_SIZE) in tpidr_el0. */
+   /* Stash the original SP (minus PT_REGS_SIZE) in tpidr_el0. */
msr tpidr_el0, x0
 
/* Recover the original x0 value and stash it in tpidrro_el0 */
@@ -253,7 +253,7 @@ alternative_else_nop_endif
 
scs_load tsk, x20
.else
-   add x21, sp, #S_FRAME_SIZE
+   add x21, sp, #PT_REGS_SIZE
get_current_task tsk
.endif /* \el == 0 */
mrs x22, elr_el1
@@ -377,7 +377,7 @@ alternative_else_nop_endif
ldp x26, x27, [sp, #16 * 13]
ldp x28, x29, [sp, #16 * 14]
ldr lr, [sp, #S_LR]
-   add sp, sp, #S_FRAME_SIZE   // restore sp
+   add sp, sp, #PT_REGS_SIZE   // restore sp
 
.if \el == 0
 alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
@@ -580,12 +580,12 @@ __bad_stack:
 
/*
 * Store the original GPRs to the new stack. The orginal SP (minus
-* S_FRAME_SIZE) was stashed in tpidr_el0 by kernel_ventry.
+* PT_REGS_SIZE) was stashed in tpidr_el0 by kernel_ventry.
 */
-   sub sp, sp, #S_FRAME_SIZE
+   sub sp, sp, #PT_REGS_SIZE
kernel_entry 1
mrs x0, tpidr_el0
-   add x0, x0, #S_FRAME_SIZE
+   add x0, x0

[PATCH net-next] net: Remove unnecessary intermediate variables

2020-08-21 Thread Jianlin Lv
It is not necessary to use src/dst as an intermediate variable for
assignment operation; Delete src/dst intermediate variables to avoid
unnecessary variable declarations.

Signed-off-by: Jianlin Lv 
---
 drivers/net/vxlan.c | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index b9fefe27e3e8..c00ca01ebe76 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2728,12 +2728,8 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
goto tx_error;
} else if (err) {
if (info) {
-   struct in_addr src, dst;
-
-   src = remote_ip.sin.sin_addr;
-   dst = local_ip.sin.sin_addr;
-   info->key.u.ipv4.src = src.s_addr;
-   info->key.u.ipv4.dst = dst.s_addr;
+   info->key.u.ipv4.src = 
remote_ip.sin.sin_addr.s_addr;
+   info->key.u.ipv4.dst = 
local_ip.sin.sin_addr.s_addr;
}
vxlan_encap_bypass(skb, vxlan, vxlan, vni, false);
dst_release(ndst);
@@ -2784,12 +2780,8 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
goto tx_error;
} else if (err) {
if (info) {
-   struct in6_addr src, dst;
-
-   src = remote_ip.sin6.sin6_addr;
-   dst = local_ip.sin6.sin6_addr;
-   info->key.u.ipv6.src = src;
-   info->key.u.ipv6.dst = dst;
+   info->key.u.ipv6.src = remote_ip.sin6.sin6_addr;
+   info->key.u.ipv6.dst = local_ip.sin6.sin6_addr;
}
 
vxlan_encap_bypass(skb, vxlan, vxlan, vni, false);
-- 
2.17.1



[PATCH net-next] net: remove redundant variable in vxlan_xmit_one

2020-08-20 Thread Jianlin Lv
dst/src is used multiple times in vxlan_xmit_one function as the variable
name, although its scope is different, but it reduces the readability and
it is unnecessary to use intermediate variables here;
This patch reduces unnecessary assignments and removes redundant variables

Signed-off-by: Jianlin Lv 
---
 drivers/net/vxlan.c | 40 +++-
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index b9fefe27e3e8..679260e1d9f1 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2597,7 +2597,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
struct ip_tunnel_info *info;
struct vxlan_dev *vxlan = netdev_priv(dev);
const struct iphdr *old_iph = ip_hdr(skb);
-   union vxlan_addr *dst;
union vxlan_addr remote_ip, local_ip;
struct vxlan_metadata _md;
struct vxlan_metadata *md = &_md;
@@ -2614,8 +2613,8 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
info = skb_tunnel_info(skb);
 
if (rdst) {
-   dst = >remote_ip;
-   if (vxlan_addr_any(dst)) {
+   remote_ip = rdst->remote_ip;
+   if (vxlan_addr_any(_ip)) {
if (did_rsc) {
/* short-circuited back to local bridge */
vxlan_encap_bypass(skb, vxlan, vxlan,
@@ -2635,7 +2634,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
ttl = ip_tunnel_get_ttl(old_iph, skb);
} else {
ttl = vxlan->cfg.ttl;
-   if (!ttl && vxlan_addr_multicast(dst))
+   if (!ttl && vxlan_addr_multicast(_ip))
ttl = 1;
}
 
@@ -2643,7 +2642,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
if (tos == 1)
tos = ip_tunnel_get_dsfield(old_iph, skb);
 
-   if (dst->sa.sa_family == AF_INET)
+   if (remote_ip.sa.sa_family == AF_INET)
udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM_TX);
else
udp_sum = !(flags & VXLAN_F_UDP_ZERO_CSUM6_TX);
@@ -2662,7 +2661,6 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
remote_ip.sin6.sin6_addr = info->key.u.ipv6.dst;
local_ip.sin6.sin6_addr = info->key.u.ipv6.src;
}
-   dst = _ip;
dst_port = info->key.tp_dst ? : vxlan->cfg.dst_port;
vni = tunnel_id_to_key32(info->key.tun_id);
ifindex = 0;
@@ -2681,7 +2679,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
 vxlan->cfg.port_max, true);
 
rcu_read_lock();
-   if (dst->sa.sa_family == AF_INET) {
+   if (remote_ip.sa.sa_family == AF_INET) {
struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
struct rtable *rt;
__be16 df = 0;
@@ -2690,7 +2688,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
ifindex = sock4->sock->sk->sk_bound_dev_if;
 
rt = vxlan_get_route(vxlan, dev, sock4, skb, ifindex, tos,
-dst->sin.sin_addr.s_addr,
+remote_ip.sin.sin_addr.s_addr,
 _ip.sin.sin_addr.s_addr,
 dst_port, src_port,
 dst_cache, info);
@@ -2701,7 +2699,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
 
if (!info) {
/* Bypass encapsulation if the destination is local */
-   err = encap_bypass_if_local(skb, dev, vxlan, dst,
+   err = encap_bypass_if_local(skb, dev, vxlan, _ip,
dst_port, ifindex, vni,
>dst, rt->rt_flags);
if (err)
@@ -2728,12 +2726,8 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
goto tx_error;
} else if (err) {
if (info) {
-   struct in_addr src, dst;
-
-   src = remote_ip.sin.sin_addr;
-   dst = local_ip.sin.sin_addr;
-   info->key.u.ipv4.src = src.s_addr;
-   info->key.u.ipv4.dst = dst.s_addr;
+   info->key.u.ipv4.src = 
remote_ip.sin.sin_addr.s

[PATCH bpf-next] docs: correct subject prefix and update LLVM info

2020-08-20 Thread Jianlin Lv
bpf_devel_QA.rst:152 The subject prefix information is not accurate, it
should be 'PATCH bpf-next v2'

Also update LLVM version info and add information about
‘-DLLVM_TARGETS_TO_BUILD’ to prompt the developer to build the desired
target.

Signed-off-by: Jianlin Lv 
---
 Documentation/bpf/bpf_devel_QA.rst | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/Documentation/bpf/bpf_devel_QA.rst 
b/Documentation/bpf/bpf_devel_QA.rst
index a26aa1b9b259..75a0dca5f295 100644
--- a/Documentation/bpf/bpf_devel_QA.rst
+++ b/Documentation/bpf/bpf_devel_QA.rst
@@ -149,7 +149,7 @@ In case the patch or patch series has to be reworked and 
sent out
 again in a second or later revision, it is also required to add a
 version number (``v2``, ``v3``, ...) into the subject prefix::
 
-  git format-patch --subject-prefix='PATCH net-next v2' start..finish
+  git format-patch --subject-prefix='PATCH bpf-next v2' start..finish
 
 When changes have been requested to the patch series, always send the
 whole patch series again with the feedback incorporated (never send
@@ -479,17 +479,18 @@ LLVM's static compiler lists the supported targets through
 
  $ llc --version
  LLVM (http://llvm.org/):
-   LLVM version 6.0.0svn
+   LLVM version 10.0.0
Optimized build.
Default target: x86_64-unknown-linux-gnu
Host CPU: skylake
 
Registered Targets:
- bpf- BPF (host endian)
- bpfeb  - BPF (big endian)
- bpfel  - BPF (little endian)
- x86- 32-bit X86: Pentium-Pro and above
- x86-64 - 64-bit X86: EM64T and AMD64
+ aarch64- AArch64 (little endian)
+ bpf- BPF (host endian)
+ bpfeb  - BPF (big endian)
+ bpfel  - BPF (little endian)
+ x86- 32-bit X86: Pentium-Pro and above
+ x86-64 - 64-bit X86: EM64T and AMD64
 
 For developers in order to utilize the latest features added to LLVM's
 BPF back end, it is advisable to run the latest LLVM releases. Support
@@ -517,6 +518,10 @@ from the git repositories::
 The built binaries can then be found in the build/bin/ directory, where
 you can point the PATH variable to.
 
+Set ``-DLLVM_TARGETS_TO_BUILD`` equal to the target you wish to build, you
+will find a full list of targets within the llvm-project/llvm/lib/Target
+directory.
+
 Q: Reporting LLVM BPF issues
 
 Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
-- 
2.17.1



[PATCH bpf-next] bpf: fix load XDP program error in test_xdp_vlan

2020-08-12 Thread Jianlin Lv
test_xdp_vlan.sh reports the error as below:

$ sudo ./test_xdp_vlan_mode_native.sh
+ '[' -z xdp_vlan_mode_native ']'
+ XDP_MODE=xdpgeneric
……
+ export XDP_PROG=xdp_vlan_remove_outer2
+ XDP_PROG=xdp_vlan_remove_outer2
+ ip netns exec ns1 ip link set veth1 xdpdrv off
Error: XDP program already attached.

ip will throw an error in case a XDP program is already attached to the
networking interface, to prevent it from being overridden by accident.
In order to replace the currently running XDP program with a new one,
the -force option must be used.

Signed-off-by: Jianlin Lv 
---
 tools/testing/selftests/bpf/test_xdp_vlan.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/test_xdp_vlan.sh 
b/tools/testing/selftests/bpf/test_xdp_vlan.sh
index bb8b0da91686..034e603aeb50 100755
--- a/tools/testing/selftests/bpf/test_xdp_vlan.sh
+++ b/tools/testing/selftests/bpf/test_xdp_vlan.sh
@@ -220,7 +220,7 @@ ip netns exec ns1 ping -i 0.2 -W 2 -c 2 $IPADDR2
 # ETH_P_8021Q indication, and this cause overwriting of our changes.
 #
 export XDP_PROG=xdp_vlan_remove_outer2
-ip netns exec ns1 ip link set $DEVNS1 $XDP_MODE off
+ip netns exec ns1 ip -force link set $DEVNS1 $XDP_MODE off
 ip netns exec ns1 ip link set $DEVNS1 $XDP_MODE object $FILE section $XDP_PROG
 
 # Now the namespaces should still be able reach each-other, test with ping:
-- 
2.17.1



[PATCH bpf-next v2] bpf: fix segmentation fault of test_progs

2020-08-10 Thread Jianlin Lv
test_progs reports the segmentation fault as below

$ sudo ./test_progs -t mmap --verbose
test_mmap:PASS:skel_open_and_load 0 nsec
..
test_mmap:PASS:adv_mmap1 0 nsec
test_mmap:PASS:adv_mmap2 0 nsec
test_mmap:PASS:adv_mmap3 0 nsec
test_mmap:PASS:adv_mmap4 0 nsec
Segmentation fault

This issue was triggered because mmap() and munmap() used inconsistent
length parameters; mmap() creates a new mapping of 3*page_size, but the
length parameter set in the subsequent re-map and munmap() functions is
4*page_size; this leads to the destruction of the process space.

To fix this issue, first create 4 pages of anonymous mapping,then do all
the mmap() with MAP_FIXED.

Another issue is that when unmap the second page fails, the length
parameter to delete tmp1 mappings should be 4*page_size.

Signed-off-by: Jianlin Lv 
---
v2:
- Update commit messages
- Create 4 pages of anonymous mapping that serve the subsequent mmap()
---
 tools/testing/selftests/bpf/prog_tests/mmap.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/mmap.c 
b/tools/testing/selftests/bpf/prog_tests/mmap.c
index 43d0b5578f46..9c3c5c0f068f 100644
--- a/tools/testing/selftests/bpf/prog_tests/mmap.c
+++ b/tools/testing/selftests/bpf/prog_tests/mmap.c
@@ -21,7 +21,7 @@ void test_mmap(void)
const long page_size = sysconf(_SC_PAGE_SIZE);
int err, duration = 0, i, data_map_fd, data_map_id, tmp_fd, rdmap_fd;
struct bpf_map *data_map, *bss_map;
-   void *bss_mmaped = NULL, *map_mmaped = NULL, *tmp1, *tmp2;
+   void *bss_mmaped = NULL, *map_mmaped = NULL, *tmp0, *tmp1, *tmp2;
struct test_mmap__bss *bss_data;
struct bpf_map_info map_info;
__u32 map_info_sz = sizeof(map_info);
@@ -183,16 +183,23 @@ void test_mmap(void)
 
/* check some more advanced mmap() manipulations */
 
+   tmp0 = mmap(NULL, 4 * page_size, PROT_READ, MAP_SHARED | MAP_ANONYMOUS,
+ -1, 0);
+   if (CHECK(tmp0 == MAP_FAILED, "adv_mmap0", "errno %d\n", errno))
+   goto cleanup;
+
/* map all but last page: pages 1-3 mapped */
-   tmp1 = mmap(NULL, 3 * page_size, PROT_READ, MAP_SHARED,
+   tmp1 = mmap(tmp0, 3 * page_size, PROT_READ, MAP_SHARED | MAP_FIXED,
  data_map_fd, 0);
-   if (CHECK(tmp1 == MAP_FAILED, "adv_mmap1", "errno %d\n", errno))
+   if (CHECK(tmp0 != tmp1, "adv_mmap1", "tmp0: %p, tmp1: %p\n", tmp0, 
tmp1)) {
+   munmap(tmp0, 4 * page_size);
goto cleanup;
+   }
 
/* unmap second page: pages 1, 3 mapped */
err = munmap(tmp1 + page_size, page_size);
if (CHECK(err, "adv_mmap2", "errno %d\n", errno)) {
-   munmap(tmp1, map_sz);
+   munmap(tmp1, 4 * page_size);
goto cleanup;
}
 
@@ -201,7 +208,7 @@ void test_mmap(void)
MAP_SHARED | MAP_FIXED, data_map_fd, 0);
if (CHECK(tmp2 == MAP_FAILED, "adv_mmap3", "errno %d\n", errno)) {
munmap(tmp1, page_size);
-   munmap(tmp1 + 2*page_size, page_size);
+   munmap(tmp1 + 2*page_size, 2 * page_size);
goto cleanup;
}
CHECK(tmp1 + page_size != tmp2, "adv_mmap4",
@@ -211,7 +218,7 @@ void test_mmap(void)
tmp2 = mmap(tmp1, 4 * page_size, PROT_READ, MAP_SHARED | MAP_FIXED,
data_map_fd, 0);
if (CHECK(tmp2 == MAP_FAILED, "adv_mmap5", "errno %d\n", errno)) {
-   munmap(tmp1, 3 * page_size); /* unmap page 1 */
+   munmap(tmp1, 4 * page_size); /* unmap page 1 */
goto cleanup;
}
CHECK(tmp1 != tmp2, "adv_mmap6", "tmp1: %p, tmp2: %p\n", tmp1, tmp2);
-- 
2.17.1



[PATCH bpf-next] bpf: fix segmentation fault of test_progs

2020-08-07 Thread Jianlin Lv
test_progs reports the segmentation fault as below

$ sudo ./test_progs -t mmap --verbose
test_mmap:PASS:skel_open_and_load 0 nsec
..
test_mmap:PASS:adv_mmap1 0 nsec
test_mmap:PASS:adv_mmap2 0 nsec
test_mmap:PASS:adv_mmap3 0 nsec
test_mmap:PASS:adv_mmap4 0 nsec
Segmentation fault

This issue was triggered because mmap() and munmap() used inconsistent
length parameters; mmap() creates a new mapping of 3*page_size, but the
length parameter set in the subsequent re-map and munmap() functions is
4*page_size; this leads to the destruction of the process space.

Another issue is that when unmap the second page fails, the length
parameter to delete tmp1 mappings should be 3*page_size.

Signed-off-by: Jianlin Lv 
---
 tools/testing/selftests/bpf/prog_tests/mmap.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/mmap.c 
b/tools/testing/selftests/bpf/prog_tests/mmap.c
index 43d0b5578f46..2070cfe19cac 100644
--- a/tools/testing/selftests/bpf/prog_tests/mmap.c
+++ b/tools/testing/selftests/bpf/prog_tests/mmap.c
@@ -192,7 +192,7 @@ void test_mmap(void)
/* unmap second page: pages 1, 3 mapped */
err = munmap(tmp1 + page_size, page_size);
if (CHECK(err, "adv_mmap2", "errno %d\n", errno)) {
-   munmap(tmp1, map_sz);
+   munmap(tmp1, 3 * page_size);
goto cleanup;
}
 
@@ -207,8 +207,8 @@ void test_mmap(void)
CHECK(tmp1 + page_size != tmp2, "adv_mmap4",
  "tmp1: %p, tmp2: %p\n", tmp1, tmp2);
 
-   /* re-map all 4 pages */
-   tmp2 = mmap(tmp1, 4 * page_size, PROT_READ, MAP_SHARED | MAP_FIXED,
+   /* re-map all 3 pages */
+   tmp2 = mmap(tmp1, 3 * page_size, PROT_READ, MAP_SHARED | MAP_FIXED,
data_map_fd, 0);
if (CHECK(tmp2 == MAP_FAILED, "adv_mmap5", "errno %d\n", errno)) {
munmap(tmp1, 3 * page_size); /* unmap page 1 */
@@ -226,7 +226,7 @@ void test_mmap(void)
CHECK_FAIL(map_data->val[2] != 321);
CHECK_FAIL(map_data->val[far] != 3 * 321);
 
-   munmap(tmp2, 4 * page_size);
+   munmap(tmp2, 3 * page_size);
 
/* map all 4 pages, but with pg_off=1 page, should fail */
tmp1 = mmap(NULL, 4 * page_size, PROT_READ, MAP_SHARED | MAP_FIXED,
-- 
2.17.1



[PATCH bpf-next v2] bpf: fix compilation warning of selftests

2020-08-06 Thread Jianlin Lv
Clang compiler version: 12.0.0
The following warning appears during the selftests/bpf compilation:

prog_tests/send_signal.c:51:3: warning: ignoring return value of ‘write’,
declared with attribute warn_unused_result [-Wunused-result]
   51 |   write(pipe_c2p[1], buf, 1);
  |   ^~
prog_tests/send_signal.c:54:3: warning: ignoring return value of ‘read’,
declared with attribute warn_unused_result [-Wunused-result]
   54 |   read(pipe_p2c[0], buf, 1);
  |   ^
..

prog_tests/stacktrace_build_id_nmi.c:13:2: warning: ignoring return value
of ‘fscanf’,declared with attribute warn_unused_result [-Wunused-resul]
   13 |  fscanf(f, "%llu", _freq);
  |  ^~~

test_tcpnotify_user.c:133:2: warning:ignoring return value of ‘system’,
declared with attribute warn_unused_result [-Wunused-result]
  133 |  system(test_script);
  |  ^~~
test_tcpnotify_user.c:138:2: warning:ignoring return value of ‘system’,
declared with attribute warn_unused_result [-Wunused-result]
  138 |  system(test_script);
  |  ^~~
test_tcpnotify_user.c:143:2: warning:ignoring return value of ‘system’,
declared with attribute warn_unused_result [-Wunused-result]
  143 |  system(test_script);
  |  ^~~

Add code that fix compilation warning about ignoring return value and
handles any errors; Check return value of library`s API make the code
more secure.

Signed-off-by: Jianlin Lv 
---
v2:
- replease CHECK_FAIL by CHECK
- fix test_tcpnotify_user failed issue
---
 .../selftests/bpf/prog_tests/send_signal.c | 18 --
 .../bpf/prog_tests/stacktrace_build_id_nmi.c   |  4 +++-
 .../selftests/bpf/test_tcpnotify_user.c| 13 ++---
 3 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c 
b/tools/testing/selftests/bpf/prog_tests/send_signal.c
index 504abb7bfb95..7043e6ded0e6 100644
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -48,21 +48,19 @@ static void test_send_signal_common(struct perf_event_attr 
*attr,
close(pipe_p2c[1]); /* close write */
 
/* notify parent signal handler is installed */
-   write(pipe_c2p[1], buf, 1);
+   CHECK(write(pipe_c2p[1], buf, 1) != 1, "pipe_write", "err 
%d\n", -errno);
 
/* make sure parent enabled bpf program to send_signal */
-   read(pipe_p2c[0], buf, 1);
+   CHECK(read(pipe_p2c[0], buf, 1) != 1, "pipe_read", "err %d\n", 
-errno);
 
/* wait a little for signal handler */
sleep(1);
 
-   if (sigusr1_received)
-   write(pipe_c2p[1], "2", 1);
-   else
-   write(pipe_c2p[1], "0", 1);
+   buf[0] = sigusr1_received ? '2' : '0';
+   CHECK(write(pipe_c2p[1], buf, 1) != 1, "pipe_write", "err 
%d\n", -errno);
 
/* wait for parent notification and exit */
-   read(pipe_p2c[0], buf, 1);
+   CHECK(read(pipe_p2c[0], buf, 1) != 1, "pipe_read", "err %d\n", 
-errno);
 
close(pipe_c2p[1]);
close(pipe_p2c[0]);
@@ -99,7 +97,7 @@ static void test_send_signal_common(struct perf_event_attr 
*attr,
}
 
/* wait until child signal handler installed */
-   read(pipe_c2p[0], buf, 1);
+   CHECK(read(pipe_c2p[0], buf, 1) != 1, "pipe_read", "err %d\n", -errno);
 
/* trigger the bpf send_signal */
skel->bss->pid = pid;
@@ -107,7 +105,7 @@ static void test_send_signal_common(struct perf_event_attr 
*attr,
skel->bss->signal_thread = signal_thread;
 
/* notify child that bpf program can send_signal now */
-   write(pipe_p2c[1], buf, 1);
+   CHECK(write(pipe_p2c[1], buf, 1) != 1, "pipe_write", "err %d\n", 
-errno);
 
/* wait for result */
err = read(pipe_c2p[0], buf, 1);
@@ -121,7 +119,7 @@ static void test_send_signal_common(struct perf_event_attr 
*attr,
CHECK(buf[0] != '2', test_name, "incorrect result\n");
 
/* notify child safe to exit */
-   write(pipe_p2c[1], buf, 1);
+   CHECK(write(pipe_p2c[1], buf, 1) != 1, "pipe_write", "err %d\n", 
-errno);
 
 disable_pmu:
close(pmu_fd);
diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c 
b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
index f002e3090d92..11a769e18f5d 100644
--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
+++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_build_id_nmi.c
@@ -6,11 +6,13 @@ static __u64 read_perf_max_sample_fr

[PATCH bpf-next] bpf: fix compilation warning of selftests

2020-07-31 Thread Jianlin Lv
Clang compiler version: 12.0.0
The following warning appears during the selftests/bpf compilation:

prog_tests/send_signal.c:51:3: warning: ignoring return value of ‘write’,
declared with attribute warn_unused_result [-Wunused-result]
   51 |   write(pipe_c2p[1], buf, 1);
  |   ^~
prog_tests/send_signal.c:54:3: warning: ignoring return value of ‘read’,
declared with attribute warn_unused_result [-Wunused-result]
   54 |   read(pipe_p2c[0], buf, 1);
  |   ^
..

prog_tests/stacktrace_build_id_nmi.c:13:2: warning: ignoring return value
of ‘fscanf’,declared with attribute warn_unused_result [-Wunused-resul]
   13 |  fscanf(f, "%llu", _freq);
  |  ^~~

test_tcpnotify_user.c:133:2: warning:ignoring return value of ‘system’,
declared with attribute warn_unused_result [-Wunused-result]
  133 |  system(test_script);
  |  ^~~
test_tcpnotify_user.c:138:2: warning:ignoring return value of ‘system’,
declared with attribute warn_unused_result [-Wunused-result]
  138 |  system(test_script);
  |  ^~~
test_tcpnotify_user.c:143:2: warning:ignoring return value of ‘system’,
declared with attribute warn_unused_result [-Wunused-result]
  143 |  system(test_script);
  |  ^~~

Add code that fix compilation warning about ignoring return value and
handles any errors; Check return value of library`s API make the code
more secure.

Signed-off-by: Jianlin Lv 
---
 .../selftests/bpf/prog_tests/send_signal.c| 37 ++-
 .../bpf/prog_tests/stacktrace_build_id_nmi.c  |  3 +-
 .../selftests/bpf/test_tcpnotify_user.c   | 15 ++--
 3 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/send_signal.c 
b/tools/testing/selftests/bpf/prog_tests/send_signal.c
index 504abb7bfb95..7a5272e4e810 100644
--- a/tools/testing/selftests/bpf/prog_tests/send_signal.c
+++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c
@@ -48,22 +48,31 @@ static void test_send_signal_common(struct perf_event_attr 
*attr,
close(pipe_p2c[1]); /* close write */
 
/* notify parent signal handler is installed */
-   write(pipe_c2p[1], buf, 1);
+   if (CHECK_FAIL(write(pipe_c2p[1], buf, 1) != 1)) {
+   perror("Child: write pipe error");
+   goto close_out;
+   }
 
/* make sure parent enabled bpf program to send_signal */
-   read(pipe_p2c[0], buf, 1);
+   if (CHECK_FAIL(read(pipe_p2c[0], buf, 1) != 1)) {
+   perror("Child: read pipe error");
+   goto close_out;
+   }
 
/* wait a little for signal handler */
sleep(1);
 
-   if (sigusr1_received)
-   write(pipe_c2p[1], "2", 1);
-   else
-   write(pipe_c2p[1], "0", 1);
+   buf[0] = sigusr1_received ? '2' : '0';
+   if (CHECK_FAIL(write(pipe_c2p[1], buf, 1) != 1)) {
+   perror("Child: write pipe error");
+   goto close_out;
+   }
 
/* wait for parent notification and exit */
-   read(pipe_p2c[0], buf, 1);
+   if (CHECK_FAIL(read(pipe_p2c[0], buf, 1) != 1))
+   perror("Child: read pipe error");
 
+close_out:
close(pipe_c2p[1]);
close(pipe_p2c[0]);
exit(0);
@@ -99,7 +108,11 @@ static void test_send_signal_common(struct perf_event_attr 
*attr,
}
 
/* wait until child signal handler installed */
-   read(pipe_c2p[0], buf, 1);
+   if (CHECK_FAIL(read(pipe_c2p[0], buf, 1) != 1)) {
+   perror("Parent: read pipe error");
+   goto disable_pmu;
+   }
+
 
/* trigger the bpf send_signal */
skel->bss->pid = pid;
@@ -107,7 +120,10 @@ static void test_send_signal_common(struct perf_event_attr 
*attr,
skel->bss->signal_thread = signal_thread;
 
/* notify child that bpf program can send_signal now */
-   write(pipe_p2c[1], buf, 1);
+   if (CHECK_FAIL(write(pipe_p2c[1], buf, 1) != 1)) {
+   perror("Parent: write pipe error");
+   goto disable_pmu;
+   }
 
/* wait for result */
err = read(pipe_c2p[0], buf, 1);
@@ -121,7 +137,8 @@ static void test_send_signal_common(struct perf_event_attr 
*attr,
CHECK(buf[0] != '2', test_name, "incorrect result\n");
 
/* notify child safe to exit */
-   write(pipe_p2c[1], buf, 1);
+   if (CHECK_FAIL(write(pipe_p2c[1], buf, 1) != 1))
+   perror("Parent: write pipe error");
 
 disable_pmu:
close(pmu_fd

RE: [PATCH bpf-next] bpf: Generate cookie for new non-initial net NS

2020-07-22 Thread Jianlin Lv


> -Original Message-
> From: bpf-ow...@vger.kernel.org  On Behalf
> Of Daniel Borkmann
> Sent: Wednesday, July 22, 2020 4:18 AM
> To: Jianlin Lv ; b...@vger.kernel.org
> Cc: da...@davemloft.net; k...@kernel.org; a...@kernel.org; y...@fb.com;
> Song Zhu ; linux-kernel@vger.kernel.org;
> net...@vger.kernel.org
> Subject: Re: [PATCH bpf-next] bpf: Generate cookie for new non-initial net NS
>
> On 7/20/20 4:09 PM, Jianlin Lv wrote:
> > For non-initial network NS, the net cookie is generated when
> > bpf_get_netns_cookie_sock is called for the first time, but it is more
> > reasonable to complete the cookie generation work when creating a new
> > network NS, just like init_net.
> > net_gen_cookie() be moved into setup_net() that it can serve the
> > initial and non-initial network namespace.
> >
> > Signed-off-by: Jianlin Lv 
>
> What use-case are you trying to solve? Why should it be different than, say,
> socket cookie generation? I'm currently not seeing much of a point in moving
> this. When it's not used in the system, it would actually create more work.

This patch does not come from use-case, but based on the following points were 
considered:
1. setup_net() runs the initializers for the network namespace object, 
net_cookie is a member of struct net, and its initialization is more reasonable 
in setup_net();
2. For initial network namespaces, this patch does not introduce additional 
burden;
3. For systems that have not created non-initial network namespaces, this will 
not introduce additional work;
4. For newly created non-initial network namespaces, the added effort of 
net_gen_cookie() is weak for the entire network namespaces creation process, 
and net_cookie is only written once during the entire life cycle of network 
namespaces.

>
> > ---
> >   net/core/net_namespace.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c index
> > dcd61aca343e..5937bd0df56d 100644
> > --- a/net/core/net_namespace.c
> > +++ b/net/core/net_namespace.c
> > @@ -336,6 +336,7 @@ static __net_init int setup_net(struct net *net,
> struct user_namespace *user_ns)
> >   idr_init(>netns_ids);
> >   spin_lock_init(>nsid_lock);
> >   mutex_init(>ipv4.ra_mutex);
> > +net_gen_cookie(net);
> >
> >   list_for_each_entry(ops, _list, list) {
> >   error = ops_init(ops, net);
> > @@ -1101,7 +1102,6 @@ static int __init net_ns_init(void)
> >   panic("Could not allocate generic netns");
> >
> >   rcu_assign_pointer(init_net.gen, ng);
> > -net_gen_cookie(_net);
> >
> >   down_write(_ops_rwsem);
> >   if (setup_net(_net, _user_ns))
> >

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


[PATCH bpf-next] bpf: Generate cookie for new non-initial net NS

2020-07-20 Thread Jianlin Lv
For non-initial network NS, the net cookie is generated when
bpf_get_netns_cookie_sock is called for the first time, but it is more
reasonable to complete the cookie generation work when creating a new
network NS, just like init_net.
net_gen_cookie() be moved into setup_net() that it can serve the initial
and non-initial network namespace.

Signed-off-by: Jianlin Lv 
---
 net/core/net_namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index dcd61aca343e..5937bd0df56d 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -336,6 +336,7 @@ static __net_init int setup_net(struct net *net, struct 
user_namespace *user_ns)
idr_init(>netns_ids);
spin_lock_init(>nsid_lock);
mutex_init(>ipv4.ra_mutex);
+   net_gen_cookie(net);
 
list_for_each_entry(ops, _list, list) {
error = ops_init(ops, net);
@@ -1101,7 +1102,6 @@ static int __init net_ns_init(void)
panic("Could not allocate generic netns");
 
rcu_assign_pointer(init_net.gen, ng);
-   net_gen_cookie(_net);
 
down_write(_ops_rwsem);
if (setup_net(_net, _user_ns))
-- 
2.17.1