On Thu, 20 Aug 2020 at 15:23, Greg Kroah-Hartman <gre...@linuxfoundation.org> wrote: > > From: Muchun Song <songmuc...@bytedance.com> > > commit 0cb2f1372baa60af8456388a574af6133edd7d80 upstream. > > We found a case of kernel panic on our server. The stack trace is as > follows(omit some irrelevant information): > > BUG: kernel NULL pointer dereference, address: 0000000000000080 > RIP: 0010:kprobe_ftrace_handler+0x5e/0xe0 > RSP: 0018:ffffb512c6550998 EFLAGS: 00010282 > RAX: 0000000000000000 RBX: ffff8e9d16eea018 RCX: 0000000000000000 > RDX: ffffffffbe1179c0 RSI: ffffffffc0535564 RDI: ffffffffc0534ec0 > RBP: ffffffffc0534ec1 R08: ffff8e9d1bbb0f00 R09: 0000000000000004 > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 > R13: ffff8e9d1f797060 R14: 000000000000bacc R15: ffff8e9ce13eca00 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000080 CR3: 00000008453d0005 CR4: 00000000003606e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <IRQ> > ftrace_ops_assist_func+0x56/0xe0 > ftrace_call+0x5/0x34 > tcpa_statistic_send+0x5/0x130 [ttcp_engine] > > The tcpa_statistic_send is the function being kprobed. After analysis, > the root cause is that the fourth parameter regs of kprobe_ftrace_handler > is NULL. Why regs is NULL? We use the crash tool to analyze the kdump. > > crash> dis tcpa_statistic_send -r > <tcpa_statistic_send>: callq 0xffffffffbd8018c0 <ftrace_caller> > > The tcpa_statistic_send calls ftrace_caller instead of ftrace_regs_caller. > So it is reasonable that the fourth parameter regs of kprobe_ftrace_handler > is NULL. In theory, we should call the ftrace_regs_caller instead of the > ftrace_caller. After in-depth analysis, we found a reproducible path. > > Writing a simple kernel module which starts a periodic timer. The > timer's handler is named 'kprobe_test_timer_handler'. The module > name is kprobe_test.ko. > > 1) insmod kprobe_test.ko > 2) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}' > 3) echo 0 > /proc/sys/kernel/ftrace_enabled > 4) rmmod kprobe_test > 5) stop step 2) kprobe > 6) insmod kprobe_test.ko > 7) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}' > > We mark the kprobe as GONE but not disarm the kprobe in the step 4). > The step 5) also do not disarm the kprobe when unregister kprobe. So > we do not remove the ip from the filter. In this case, when the module > loads again in the step 6), we will replace the code to ftrace_caller > via the ftrace_module_enable(). When we register kprobe again, we will > not replace ftrace_caller to ftrace_regs_caller because the ftrace is > disabled in the step 3). So the step 7) will trigger kernel panic. Fix > this problem by disarming the kprobe when the module is going away. > > Link: > https://lkml.kernel.org/r/20200728064536.24405-1-songmuc...@bytedance.com > > Cc: sta...@vger.kernel.org > Fixes: ae6aa16fdc16 ("kprobes: introduce ftrace based optimization") > Acked-by: Masami Hiramatsu <mhira...@kernel.org> > Signed-off-by: Muchun Song <songmuc...@bytedance.com> > Co-developed-by: Chengming Zhou <zhouchengm...@bytedance.com> > Signed-off-by: Chengming Zhou <zhouchengm...@bytedance.com> > Signed-off-by: Steven Rostedt (VMware) <rost...@goodmis.org> > Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
stable rc branch 4.19 build warning on arm64. ../kernel/kprobes.c: In function ‘kill_kprobe’: ../kernel/kprobes.c:1070:33: warning: statement with no effect [-Wunused-value] 1070 | #define disarm_kprobe_ftrace(p) (-ENODEV) | ^ ../kernel/kprobes.c:2090:3: note: in expansion of macro ‘disarm_kprobe_ftrace’ 2090 | disarm_kprobe_ftrace(p); | ^~~~~~~~~~~~~~~~~~~~ > > --- > kernel/kprobes.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > --- a/kernel/kprobes.c > +++ b/kernel/kprobes.c > @@ -2077,6 +2077,13 @@ static void kill_kprobe(struct kprobe *p > * the original probed function (which will be freed soon) any more. > */ > arch_remove_kprobe(p); > + > + /* > + * The module is going away. We should disarm the kprobe which > + * is using ftrace. > + */ > + if (kprobe_ftrace(p)) > + disarm_kprobe_ftrace(p); > } > > /* Disable one kprobe */ > -- Linaro LKFT https://lkft.linaro.org