On Tue, Aug 27, 2024 at 12:29:38AM +0200, Oleg Nesterov wrote:
> On 08/27, Jiri Olsa wrote:
> >
> > did you just bpftrace-ed bpftrace? ;-) on my setup I'm getting:
> >
> > [root@qemu ex]# ../bpftrace/build/src/bpftrace -e 'kprobe:uprobe_register { printf("%s\n", kstack); }'
> > Attaching 1 probe...
> >
> >         uprobe_register+1
> 
> so I guess you are on tip/perf/core which killed uprobe_register_refctr()
> and changed bpf_uprobe_multi_link_attach() to use uprobe_register
> 
> >         bpf_uprobe_multi_link_attach+685
> >         __sys_bpf+9395
> >         __x64_sys_bpf+26
> >         do_syscall_64+128
> >         entry_SYSCALL_64_after_hwframe+118
> >
> >
> > I'm not sure what's bpftrace version in fedora 40, I'm using upstream build:
> 
> bpftrace v0.20.1
> 
> > [root@qemu ex]# ../bpftrace/build/src/bpftrace --info 2>&1 | grep uprobe_multi
> >   uprobe_multi: yes
> 
> Aha, I get
> 
>       uprobe_multi: no
> 
> OK. So, on your setup bpftrace uses bpf_uprobe_multi_link_attach()
> and this implies ->ret_handler = uprobe_multi_link_ret_handler()
> which calls uprobe_prog_run() which does
> 
>       if (link->task && current->mm != link->task->mm)
>               return 0;
> 
> So, can you reproduce the problem reported by Tianyi on your setup?

yes, I can reproduce the issue with uretprobe on top of perf event uprobe

running 2 tasks of the test code:

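        #include <stdio.h>
        #include <unistd.h>

        /* probed function: the uretprobe fires when it returns */
        int func(void) {
                return 0;
        }

        int main(void) {
                printf("pid: %d\n", getpid());
                while (1) {
                        sleep(2);
                        func();
                }
        }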

and running 2 instances of bpftrace (each with separate pid):

        [root@qemu ex]# ../bpftrace/build/src/bpftrace -p 1018 -e 'uretprobe:./test:func { printf("%d\n", pid); }'
        Attaching 1 probe...
        1018
        1017
        1018
        1017

        [root@qemu ex]# ../bpftrace/build/src/bpftrace -p 1017 -e 'uretprobe:./test:func { printf("%d\n", pid); }'
        Attaching 1 probe...
        1017
        1018
        1017
        1018

will execute the bpf programs of both bpftrace instances on each uretprobe hit, like:

          sched-in 1018 
            perf_trace_add

   ->     uprobe-hit
            handle_swbp
              handler_chain
              {
                for_each_uprobe_consumer {

                  // consumer for task 1017
                  uprobe_dispatcher
                    uprobe_perf_func
                      uprobe_perf_filter return false

                  // consumer for task 1018
                  uprobe_dispatcher
                    uprobe_perf_func
                      uprobe_perf_filter return true
                       -> could run bpf program, but none is configured
                }

                prepare_uretprobe
              }

   ->     uretprobe-hit
            handle_swbp
              uprobe_handle_trampoline
                handle_uretprobe_chain
                {

                  for_each_uprobe_consumer {
                    
                    // consumer for task 1017
                    uretprobe_dispatcher
                      uretprobe_perf_func
                        -> runs bpf program

                    // consumer for task 1018
                    uretprobe_dispatcher
                      uretprobe_perf_func
                        -> runs bpf program

                  }
                }

          sched-out 1018
            perf_trace_del
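
fwiw the asymmetry in the flow above can be modelled in plain user-space C.
This is only a sketch under my own assumptions: model_consumer, model_filter
and the two model_* chain functions are made-up stand-ins, not the kernel's
uprobe_consumer or trace_uprobe code, and the filter is reduced to a pid
comparison:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one perf-event consumer of the uprobe. */
struct model_consumer {
	int target_pid;  /* task the event was opened for */
	int entry_runs;  /* payload runs on the entry (uprobe) path */
	int ret_runs;    /* payload runs on the return (uretprobe) path */
};

/* Stand-in for the per-consumer filter: is this consumer meant for pid? */
bool model_filter(const struct model_consumer *c, int pid)
{
	return c->target_pid == pid;
}

/* Entry path: all consumers are visited, the filter gates the payload. */
void model_handler_chain(struct model_consumer *cs, size_t n, int current_pid)
{
	assert(cs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (model_filter(&cs[i], current_pid))
			cs[i].entry_runs++;
}

/* Return path as in the trace above: no filter, every payload runs. */
void model_handle_uretprobe_chain(struct model_consumer *cs, size_t n,
				  int current_pid)
{
	assert(cs != NULL || n == 0);
	(void)current_pid;  /* deliberately unused: nothing is filtered here */
	for (size_t i = 0; i < n; i++)
		cs[i].ret_runs++;
}
```

With one consumer for 1017 and one for 1018, and 1018 as the current task, the
entry chain bumps only the 1018 consumer while the return chain bumps both,
which matches the doubled output from the two bpftrace instances.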


and I think the same will happen for perf record in this case where instead of
running the program we will execute perf_tp_event

I think the uretprobe_dispatcher could call filter as suggested in the
original patch.. but I'm not sure we need to remove the uprobe from
handle_uretprobe_chain like we do in handler_chain.. maybe just to save
the next uprobe hit which would remove the uprobe?
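
To make the idea a bit more concrete, here is a rough user-space sketch of a
return-path dispatcher that consults the filter before running the payload.
All sketch_* names are hypothetical, the filter is reduced to an mm pointer
comparison, and a filtered-out consumer is only skipped (nothing gets
unregistered), so it does not answer the removal question:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct sketch_mm { int id; };
struct sketch_task { struct sketch_mm *mm; };

struct sketch_consumer {
	struct sketch_task *target;  /* task this consumer was attached for */
	int ret_runs;                /* how often the return payload ran */
};

/* Filter reduced to: does the consumer's target share the given mm? */
bool sketch_filter(const struct sketch_consumer *c, const struct sketch_mm *mm)
{
	return c->target != NULL && c->target->mm == mm;
}

/* Return-path dispatcher with the filter check added up front. */
bool sketch_uretprobe_dispatcher(struct sketch_consumer *c,
				 const struct sketch_task *current)
{
	if (!sketch_filter(c, current->mm))
		return false;  /* skip only, the probe stays installed */
	c->ret_runs++;         /* stands in for running the bpf program */
	return true;
}

/* Walk all consumers and return how many payloads actually ran. */
int sketch_handle_uretprobe_chain(struct sketch_consumer *cs, size_t n,
				  const struct sketch_task *current)
{
	int ran = 0;

	assert(current != NULL);
	for (size_t i = 0; i < n; i++)
		if (sketch_uretprobe_dispatcher(&cs[i], current))
			ran++;
	return ran;
}
```

With two consumers attached for two tasks that have different mm, a return hit
in one task runs exactly one payload in this model instead of two.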

jirka
