Re: [BUG] Failed to obtain stack trace via bpf_get_stackid on ARM64 architecture

Google Wed, 24 Sep 2025 00:32:00 -0700

On Wed, 24 Sep 2025 14:25:36 +0800
Feng Yang <[email protected]> wrote:


> On Wed, 24 Sep 2025 00:32:15 +0900 Masami Hiramatsu (Google) 
> <[email protected]> wrote:
> 
> > On Mon, 22 Sep 2025 10:15:31 +0800
> > Feng Yang <[email protected]> wrote:
> > 
> > > On Sun, 21 Sep 2025 22:30:37 +0900 Masami Hiramatsu wrote:
> > > 
> > > > On Fri, 19 Sep 2025 19:56:20 -0700
> > > > Alexei Starovoitov <[email protected]> wrote:
> > > > 
> > > > > On Fri, Sep 19, 2025 at 12:19 AM Feng Yang <[email protected]> 
> > > > > wrote:
> > > > > >
> > > > > > When I use bpf_program__attach_kprobe_multi_opts to hook a BPF 
> > > > > > program that contains the bpf_get_stackid function on the arm64 
> > > > > > architecture,
> > > > > > I find that the stack trace cannot be obtained. The trace->nr in 
> > > > > > __bpf_get_stackid is 0, and the function returns -EFAULT.
> > > > > >
> > > > > > For example:
> > > > > > diff --git a/tools/testing/selftests/bpf/progs/kprobe_multi.c 
> > > > > > b/tools/testing/selftests/bpf/progs/kprobe_multi.c
> > > > > > index 9e1ca8e34913..844fa88cdc4c 100644
> > > > > > --- a/tools/testing/selftests/bpf/progs/kprobe_multi.c
> > > > > > +++ b/tools/testing/selftests/bpf/progs/kprobe_multi.c
> > > > > > @@ -36,6 +36,15 @@ __u64 kretprobe_test6_result = 0;
> > > > > >  __u64 kretprobe_test7_result = 0;
> > > > > >  __u64 kretprobe_test8_result = 0;
> > > > > >
> > > > > > +typedef __u64 stack_trace_t[2];
> > > > > > +
> > > > > > +struct {
> > > > > > +       __uint(type, BPF_MAP_TYPE_STACK_TRACE);
> > > > > > +       __uint(max_entries, 1024);
> > > > > > +       __type(key, __u32);
> > > > > > +       __type(value, stack_trace_t);
> > > > > > +} stacks SEC(".maps");
> > > > > > +
> > > > > >  static void kprobe_multi_check(void *ctx, bool is_return)
> > > > > >  {
> > > > > >         if (bpf_get_current_pid_tgid() >> 32 != pid)
> > > > > > @@ -100,7 +109,9 @@ int test_kretprobe(struct pt_regs *ctx)
> > > > > >  SEC("kprobe.multi")
> > > > > >  int test_kprobe_manual(struct pt_regs *ctx)
> > > > > >  {
> > > > > > +       int id = bpf_get_stackid(ctx, &stacks, 0);
> > > > > 
> > > > > ftrace_partial_regs() supposed to work on x86 and arm64,
> > > > > but since multi-kprobe is the only user...
> > > > 
> > > > It should be able to unwind stack. It saves sp, pc, lr, fp.
> > > > 
> > > >         regs->sp = afregs->sp;
> > > >         regs->pc = afregs->pc;
> > > >         regs->regs[29] = afregs->fp;
> > > >         regs->regs[30] = afregs->lr;
> > > > 
> > > > > I suspect the arm64 implementation wasn't really tested.
> > > > > Or maybe there is some other issue.
> > > > 
> > > > It depends on how bpf_get_stackid() works. Some registers for that
> > > > function may not be saved.
> > > > 
> > > > If it returns -EFAULT, the get_perf_callchain() returns NULL.
> > > > 
> > > 
> > > During my test, the reason for returning -EFAULT was that trace->nr was 0.
> > > 
> > > static long __bpf_get_stackid(struct bpf_map *map,
> > >                         struct perf_callchain_entry *trace, u64 flags)
> > > {
> > >   struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, 
> > > map);
> > >   struct stack_map_bucket *bucket, *new_bucket, *old_bucket;
> > >   u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
> > >   u32 hash, id, trace_nr, trace_len;
> > >   bool user = flags & BPF_F_USER_STACK;
> > >   u64 *ips;
> > >   bool hash_matches;
> > > 
> > >   if (trace->nr <= skip)
> > >           /* skipping more than usable stack trace */
> > >           return -EFAULT;
> > >   ......
> > 
> > Hmm. The "trace" is returned from get_perf_callchain()
> > 
> > get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool 
> > user,
> >                u32 max_stack, bool crosstask, bool add_mark)
> > {
> > ...
> > 
> >     if (kernel && !user_mode(regs)) {
> >             if (add_mark)
> >                     perf_callchain_store_context(&ctx, PERF_CONTEXT_KERNEL);
> >             perf_callchain_kernel(&ctx, regs);
> >     }
> > 
> > So this means `perf_callchain_kernel(&ctx, regs);` fails to unwind stack.
> > 
> > perf_callchain_kernel() -> arch_stack_walk() -> kunwind_stack_walk()
> > 
> > That is `kunwind_init_from_regs()` and `do_kunwind()`.
> > 
> >     if (regs) {
> >             if (task != current)
> >                     return -EINVAL;
> >             kunwind_init_from_regs(&state, regs);
> >     } else if (task == current) {
> >             kunwind_init_from_caller(&state);
> >     } else {
> >             kunwind_init_from_task(&state, task);
> >     }
> > 
> >     return do_kunwind(&state, consume_state, cookie);
> > 
> > For initialization, it should be OK because it only refers pc and 
> > fp(regs[29]), which are recovered by ftrace_partial_regs().
> > 
> > static __always_inline void
> > kunwind_init_from_regs(struct kunwind_state *state,
> >                    struct pt_regs *regs)
> > {
> >     kunwind_init(state, current);
> > 
> >     state->regs = regs;
> >     state->common.fp = regs->regs[29];
> >     state->common.pc = regs->pc;
> >     state->source = KUNWIND_SOURCE_REGS_PC;
> > }
> > 
> > And do_kunwind() should work increase trace->nr before return
> > unless `kunwind_recover_return_address()` fails.
> > 
> > static __always_inline int
> > do_kunwind(struct kunwind_state *state, kunwind_consume_fn consume_state,
> >        void *cookie)
> > {
> >     int ret;
> > 
> >     ret = kunwind_recover_return_address(state);
> >     if (ret)
> >             return ret;
> > 
> >     while (1) {
> >             if (!consume_state(state, cookie)) <--- this increases 
> > trace->nr (*).
> >                     return -EINVAL;
> >             ret = kunwind_next(state);
> >             if (ret == -ENOENT)
> >                     return 0;
> >             if (ret < 0)
> >                     return ret;
> >     }
> > }
> > 
> > (*) consume_state() == arch_kunwind_consume_entry() 
> >   ->  data->consume_entry == callchain_trace() -> perf_callchain_store().
> > 
> > Hmm, can you also dump the regs and insert pr_info() to find
> > which function fails?
> > 
> > Thanks,
> > 
> 
> After testing, it was found that the stack could not be obtained because 
> user_mode(regs) returned 1. 
> Referring to the arch_ftrace_fill_perf_regs function in your email 
> (https://lore.kernel.org/all/173518997908.391279.15910334347345106424.stgit@devnote2/),
>  
> I made the following modification: by setting the value of pstate, the stack 
> can now be obtained successfully.
> 
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index 058a99aa44bd..f2814175e958 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -159,11 +159,13 @@ ftrace_partial_regs(const struct ftrace_regs *fregs, 
> struct pt_regs *regs)
>  {
>         struct __arch_ftrace_regs *afregs = arch_ftrace_regs(fregs);
>  
>         memcpy(regs->regs, afregs->regs, sizeof(afregs->regs));
>         regs->sp = afregs->sp;
>         regs->pc = afregs->pc;
>         regs->regs[29] = afregs->fp;
>         regs->regs[30] = afregs->lr;
> +       regs->pstate = PSR_MODE_EL1h;

Good catch! 

>         return regs;
>  }
> However, I'm not sure if there will be any other impacts...
> 
> By the way, during my testing, I also noticed that when executing 
> bpf_get_stackid via kprobes or tracepoints, 
> the command bpftrace -e 'kprobe:bpf_get_stackid 
> {printf("bpf_get_stackid\n");}' produces no output. 

That is strange. since normal kprobes passes full pt_regs.

> However, it does output something when bpf_get_stackid is invoked via 
> uprobes. 
> This phenomenon also occurs on the x86 architecture, could this be a bug as 
> well?

Yes, it must be a bug.

Thanks!

> 
> Thanks.
> 


-- 
Masami Hiramatsu (Google) <[email protected]>

Re: [BUG] Failed to obtain stack trace via bpf_get_stackid on ARM64 architecture

Reply via email to