Re: Broken kretprobe stack traces

2021-03-04 Thread Daniel Xu
On Wed, Mar 3, 2021, at 6:18 PM, Daniel Xu wrote:
> On Wed, Mar 03, 2021 at 03:37:40PM -0500, Steven Rostedt wrote:
> > On Wed, 03 Mar 2021 12:13:08 -0800
> > "Daniel Xu"  wrote:
> > 
> > > On Wed, Mar 3, 2021, at 11:58 AM, Daniel Xu wrote:
> > > > On Wed, Mar 03, 2021 at 09:26:04AM -0500, Steven Rostedt wrote:  
> > > > > On Wed, 3 Mar 2021 13:48:28 +0900
> > > > > Masami Hiramatsu  wrote:
> > > > >   
> > > > > >   
> > > > > > > 
> > > > > > > I think (can't prove) this used to work:
> > > > > 
> > > > > Would be good to find out if it did.  
> > > > 
> > > > I'm installing some older kernels now to check. Will report back.  
> > > 
> > > Yep, works in 4.11. So there was a regression somewhere.
> > 
> > Care to bisect? ;-)
> 
> Took a while (I'll probably be typing "test_regression.sh" in my sleep
> tonight) but I've bisected it down to f95b23a112f1 ("Merge branch
> 'x86/urgent' into x86/asm, to pick up dependent fixes").
> 
> I think I saw the default option for stack unwinder change from frame
> pointers -> ORC so that may be the root cause. Not sure, though. Need to
> look more closely at the commits in the merge commit.
> 
> <...>
> 
> Daniel
>

Compiling with:

CONFIG_UNWINDER_ORC=n
CONFIG_UNWINDER_FRAME_POINTER=y

fixes the issues and leads me to believe stacktraces on kretprobes
never worked with ORC.

Josh, any chance you have an idea?

Thanks,
Daniel


Re: Broken kretprobe stack traces

2021-03-04 Thread Masami Hiramatsu
On Wed, 3 Mar 2021 09:26:04 -0500
Steven Rostedt  wrote:

> On Wed, 3 Mar 2021 13:48:28 +0900
> Masami Hiramatsu  wrote:
> 
> > 
> > > 
> > > I think (can't prove) this used to work:  
> 
> Would be good to find out if it did.
> 
> > 
> > I'm not sure the bpftrace had correctly handled it or not.
> > 
> > > 
> > > # bpftrace -e 'kretprobe:__tcp_retransmit_skb { @[kstack()] = count() 
> > > }'
> > > Attaching 1 probe...
> > > ^C
> > > 
> > > @[
> > > kretprobe_trampoline+0
> > > ]: 1  
> > 
> > Would you know how the bpftrace stacktracer rewinds the stack entries?
> > FYI, ftrace does it in trace_seq_print_sym()@kernel/trace/trace_output.c
> > 
> 
> The difference between trace events and normal function tracing stack
> traces is that it keeps its original return address. But kretprobes (and
> function graph tracing, and some bpf trampolines too) modify the return
> pointer, and that could possibly cause havoc with the stack trace.

BTW, I think if the stack tracer passes the nth of kretprobe_trampoline
or a cursor, kretprobe can find the correct return address from given task.

I've made a patch to do that only for the CONFIG_ARCH_STACKWALK=y

Here is an example on x86. 

 # echo r vfs_read > kprobe_events 
 # echo stacktrace > events/kprobes/r_vfs_read_0/trigger 
 # echo 1 > events/kprobes/r_vfs_read_0/enable 
 # echo 1 > options/sym-offset 
 # less trace 
...

  sh-132 [007] ...122.524917: 
 => kretprobe_dispatcher+0x7d/0xc0
 => __kretprobe_trampoline_handler+0xdb/0x1b0
 => trampoline_handler+0x48/0x60
 => kretprobe_trampoline+0x2a/0x50
 => ksys_read+0x70/0xf0
 => __x64_sys_read+0x1a/0x20
 => do_syscall_64+0x38/0x50
 => entry_SYSCALL_64_after_hwframe+0x44/0xae
 => 0
 => 0

--

>From 77a785a3a0791171b570830d0b2f099f8a4ba337 Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu 
Date: Thu, 4 Mar 2021 14:19:24 +0900
Subject: [PATCH] kprobes: stacktrace: Recover the address changed by kretprobe

Recover the return address on the stack which changed by the
kretprobe.

Signed-off-by: Masami Hiramatsu 
---
 include/linux/kprobes.h |  3 ++
 kernel/kprobes.c| 81 +++--
 kernel/stacktrace.c | 26 +
 3 files changed, 82 insertions(+), 28 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 1883a4a9f16a..a022e507d829 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -205,6 +205,9 @@ extern void arch_prepare_kretprobe(struct 
kretprobe_instance *ri,
   struct pt_regs *regs);
 extern int arch_trampoline_kprobe(struct kprobe *p);
 
+unsigned long kretprobe_real_stack_tsk(struct task_struct *tsk,
+  unsigned long addr,
+  struct llist_node **cur);
 /* If the trampoline handler called from a kprobe, use this version */
 unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
void *trampoline_address,
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 745f08fdd7a6..b3d9dbd6086f 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1863,46 +1863,56 @@ unsigned long __weak arch_deref_entry_point(void *entry)
 
 #ifdef CONFIG_KRETPROBES
 
-unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
-void *trampoline_address,
-void *frame_pointer)
+/* This assumes the tsk is current or the task which is not running. */
+unsigned long kretprobe_real_stack_tsk(struct task_struct *tsk,
+  unsigned long addr,
+  struct llist_node **cur)
 {
-   kprobe_opcode_t *correct_ret_addr = NULL;
struct kretprobe_instance *ri = NULL;
-   struct llist_node *first, *node;
-   struct kretprobe *rp;
+   struct llist_node *node = *cur;
 
-   /* Find all nodes for this frame. */
-   first = node = current->kretprobe_instances.first;
-   while (node) {
-   ri = container_of(node, struct kretprobe_instance, llist);
+   if (addr != (unsigned long)_trampoline)
+   return addr;
 
-   BUG_ON(ri->fp != frame_pointer);
+   if (!node)
+   node = tsk->kretprobe_instances.first;
+   else
+   node = node->next;
 
-   if (ri->ret_addr != trampoline_address) {
-   correct_ret_addr = ri->ret_addr;
-   /*
-* This is the real return address. Any other
-* instances associated with this task are for
-* other calls deeper on the call stack
-*/
-   goto found;
+   while (node) {
+   ri = container_of(node, struct kretprobe_instance, llist);
+   if (ri->ret_addr != (void *)_trampoline) {
+

Re: Broken kretprobe stack traces

2021-03-03 Thread Daniel Xu
On Wed, Mar 03, 2021 at 03:37:40PM -0500, Steven Rostedt wrote:
> On Wed, 03 Mar 2021 12:13:08 -0800
> "Daniel Xu"  wrote:
> 
> > On Wed, Mar 3, 2021, at 11:58 AM, Daniel Xu wrote:
> > > On Wed, Mar 03, 2021 at 09:26:04AM -0500, Steven Rostedt wrote:  
> > > > On Wed, 3 Mar 2021 13:48:28 +0900
> > > > Masami Hiramatsu  wrote:
> > > >   
> > > > >   
> > > > > > 
> > > > > > I think (can't prove) this used to work:
> > > > 
> > > > Would be good to find out if it did.  
> > > 
> > > I'm installing some older kernels now to check. Will report back.  
> > 
> > Yep, works in 4.11. So there was a regression somewhere.
> 
> Care to bisect? ;-)

Took a while (I'll probably be typing "test_regression.sh" in my sleep
tonight) but I've bisected it down to f95b23a112f1 ("Merge branch
'x86/urgent' into x86/asm, to pick up dependent fixes").

I think I saw the default option for stack unwinder change from frame
pointers -> ORC so that may be the root cause. Not sure, though. Need to
look more closely at the commits in the merge commit.

<...>

Daniel


Re: Broken kretprobe stack traces

2021-03-03 Thread Steven Rostedt
On Wed, 03 Mar 2021 12:13:08 -0800
"Daniel Xu"  wrote:

> On Wed, Mar 3, 2021, at 11:58 AM, Daniel Xu wrote:
> > On Wed, Mar 03, 2021 at 09:26:04AM -0500, Steven Rostedt wrote:  
> > > On Wed, 3 Mar 2021 13:48:28 +0900
> > > Masami Hiramatsu  wrote:
> > >   
> > > >   
> > > > > 
> > > > > I think (can't prove) this used to work:
> > > 
> > > Would be good to find out if it did.  
> > 
> > I'm installing some older kernels now to check. Will report back.  
> 
> Yep, works in 4.11. So there was a regression somewhere.

Care to bisect? ;-)

-- Steve



Re: Broken kretprobe stack traces

2021-03-03 Thread Daniel Xu
On Wed, Mar 03, 2021 at 01:48:28PM +0900, Masami Hiramatsu wrote:
> Hi Daniel,
> 
> On Tue, 02 Mar 2021 17:15:13 -0800
> "Daniel Xu"  wrote:
> 
> > Hi Masami,
> > 
> > Jakub reported a bug with kretprobe stack traces -- wondering if you've 
> > gotten
> > any bug reports related to stack traces being broken for kretprobes.
> 
> Yeah, stack dumper must check the stack entry is kretprobe'd and skip it.
> For example, ftrace checks it as below.
> 
> /sys/kernel/debug/tracing # echo r vfs_read > kprobe_events 
> /sys/kernel/debug/tracing # echo stacktrace > 
> events/kprobes/r_vfs_read_0/trigger 
> /sys/kernel/debug/tracing # echo 1 > events/kprobes/r_vfs_read_0/enable 
> /sys/kernel/debug/tracing # head -20 trace
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 15685/336094   #P:8
> #
> #_-=> irqs-off
> #   / _=> need-resched
> #  | / _---=> hardirq/softirq
> #  || / _--=> preempt-depth
> #  ||| / delay
> #   TASK-PID CPU#     TIMESTAMP  FUNCTION
> #  | | |     | |
>   sh-132 [006] ...138.920352: 
>  => kretprobe_dispatcher
>  => __kretprobe_trampoline_handler
>  => trampoline_handler
>  => [unknown/kretprobe'd]
>  => [unknown/kretprobe'd]
>  => __x64_sys_read
>  => do_syscall_64
>  => entry_SYSCALL_64_after_hwframe
> 
> 
> > 
> > I think (can't prove) this used to work:
> 
> I'm not sure the bpftrace had correctly handled it or not.
> 
> > 
> > # bpftrace -e 'kretprobe:__tcp_retransmit_skb { @[kstack()] = count() }'
> > Attaching 1 probe...
> > ^C
> > 
> > @[
> > kretprobe_trampoline+0
> > ]: 1
> 
> Would you know how the bpftrace stacktracer rewinds the stack entries?
> FYI, ftrace does it in trace_seq_print_sym()@kernel/trace/trace_output.c

Thanks for the hint, I'll take a look.

bpftrace generates a bpf program that calls into the kernel's
bpf_get_stackid():

https://github.com/torvalds/linux/blob/f69d02e37a85645aa90d18cacfff36dba370f797/include/uapi/linux/bpf.h#L1296

so it could be a bug with bpf.

<...>



Re: Broken kretprobe stack traces

2021-03-03 Thread Daniel Xu
On Wed, Mar 03, 2021 at 09:26:04AM -0500, Steven Rostedt wrote:
> On Wed, 3 Mar 2021 13:48:28 +0900
> Masami Hiramatsu  wrote:
> 
> > 
> > > 
> > > I think (can't prove) this used to work:  
> 
> Would be good to find out if it did.

I'm installing some older kernels now to check. Will report back.

<...>

Thanks,
Daniel


Re: Broken kretprobe stack traces

2021-03-03 Thread Daniel Xu
On Wed, Mar 3, 2021, at 11:58 AM, Daniel Xu wrote:
> On Wed, Mar 03, 2021 at 09:26:04AM -0500, Steven Rostedt wrote:
> > On Wed, 3 Mar 2021 13:48:28 +0900
> > Masami Hiramatsu  wrote:
> > 
> > > 
> > > > 
> > > > I think (can't prove) this used to work:  
> > 
> > Would be good to find out if it did.
> 
> I'm installing some older kernels now to check. Will report back.

Yep, works in 4.11. So there was a regression somewhere.

Thanks,
Daniel


Re: Broken kretprobe stack traces

2021-03-03 Thread Steven Rostedt
On Wed, 3 Mar 2021 13:48:28 +0900
Masami Hiramatsu  wrote:

> 
> > 
> > I think (can't prove) this used to work:  

Would be good to find out if it did.

> 
> I'm not sure the bpftrace had correctly handled it or not.
> 
> > 
> > # bpftrace -e 'kretprobe:__tcp_retransmit_skb { @[kstack()] = count() }'
> > Attaching 1 probe...
> > ^C
> > 
> > @[
> > kretprobe_trampoline+0
> > ]: 1  
> 
> Would you know how the bpftrace stacktracer rewinds the stack entries?
> FYI, ftrace does it in trace_seq_print_sym()@kernel/trace/trace_output.c
> 

The difference between trace events and normal function tracing stack
traces is that it keeps its original return address. But kretprobes (and
function graph tracing, and some bpf trampolines too) modify the return
pointer, and that could possibly cause havoc with the stack trace.

-- Steve


Re: Broken kretprobe stack traces

2021-03-03 Thread Masami Hiramatsu
Hi Daniel,

On Tue, 02 Mar 2021 17:15:13 -0800
"Daniel Xu"  wrote:

> Hi Masami,
> 
> Jakub reported a bug with kretprobe stack traces -- wondering if you've gotten
> any bug reports related to stack traces being broken for kretprobes.

Yeah, stack dumper must check the stack entry is kretprobe'd and skip it.
For example, ftrace checks it as below.

/sys/kernel/debug/tracing # echo r vfs_read > kprobe_events 
/sys/kernel/debug/tracing # echo stacktrace > 
events/kprobes/r_vfs_read_0/trigger 
/sys/kernel/debug/tracing # echo 1 > events/kprobes/r_vfs_read_0/enable 
/sys/kernel/debug/tracing # head -20 trace
# tracer: nop
#
# entries-in-buffer/entries-written: 15685/336094   #P:8
#
#_-=> irqs-off
#   / _=> need-resched
#  | / _---=> hardirq/softirq
#  || / _--=> preempt-depth
#  ||| / delay
#   TASK-PID CPU#     TIMESTAMP  FUNCTION
#  | | |     | |
  sh-132 [006] ...138.920352: 
 => kretprobe_dispatcher
 => __kretprobe_trampoline_handler
 => trampoline_handler
 => [unknown/kretprobe'd]
 => [unknown/kretprobe'd]
 => __x64_sys_read
 => do_syscall_64
 => entry_SYSCALL_64_after_hwframe


> 
> I think (can't prove) this used to work:

I'm not sure the bpftrace had correctly handled it or not.

> 
> # bpftrace -e 'kretprobe:__tcp_retransmit_skb { @[kstack()] = count() }'
> Attaching 1 probe...
> ^C
> 
> @[
> kretprobe_trampoline+0
> ]: 1

Would you know how the bpftrace stacktracer rewinds the stack entries?
FYI, ftrace does it in trace_seq_print_sym()@kernel/trace/trace_output.c

Thank you,

> 
> fentry/fexit probes seem to work:
> 
> # bpftrace -e 'kretfunc:__tcp_retransmit_skb { @[kstack()] = count() }'
> Attaching 1 probe...
> ^C
> 
> @[
> ftrace_trampoline+10799
> bpf_get_stackid_raw_tp+121
> ftrace_trampoline+10799
> __tun_chr_ioctl.isra.0.cold+33312
> __tcp_retransmit_skb+5
> tcp_send_loss_probe+254
> tcp_write_timer_handler+394
> tcp_write_timer+149
> call_timer_fn+41
> __run_timers+493
> run_timer_softirq+25
> __softirqentry_text_start+207
> asm_call_sysvec_on_stack+18
> do_softirq_own_stack+55
> irq_exit_rcu+158
> sysvec_apic_timer_interrupt+54
> asm_sysvec_apic_timer_interrupt+18
> ]: 1
> @[
> ftrace_trampoline+10799
> bpf_get_stackid_raw_tp+121
> ftrace_trampoline+10799
> __tun_chr_ioctl.isra.0.cold+33312
> __tcp_retransmit_skb+5
>   <...>
> 
> which makes me suspect it's a kprobe specific issue.
> 
> Thanks,
> Daniel


-- 
Masami Hiramatsu