On Tue, Feb 03, 2026 at 03:17:05PM -0800, Alexei Starovoitov wrote:
> On Tue, Feb 3, 2026 at 1:38 AM Jiri Olsa <[email protected]> wrote:
> >
> > hi,
> > as an option to Menglong's change [1] I'm sending a proposal for tracing_multi
> > link that does not add static trampoline but attaches program to all needed
> > trampolines.
> >
> > This approach keeps the same performance but has some drawbacks:
> >
> >  - when attaching 20k functions we allocate and attach 20k trampolines
> >  - during attachment we hold each trampoline mutex, so for above
> >    20k functions we will hold 20k mutexes during the attachment,
> >    should be very prone to deadlock, but haven't hit it yet
> 
> If you check that it's sorted and always take them in the same order
> then there will be no deadlock.
> Or just grab one global mutex first and then grab trampolines mutexes
> next in any order. The global one will serialize this attach operation.
> 
> > It looks like the trampoline allocations/generation might not be a big problem
> > and I'll try to find a solution for holding that many mutexes. If there's
> > no better solution I think having one read/write mutex for tracing multi
> > link attach/detach should work.
> 
> If you mean to have one global mutex as I proposed above then I don't see
> a downside. It only serializes multiple libbpf calls.

we also need to serialize it with standard single trampoline attach,
because the direct ftrace update is now done under trampoline->mutex:

  bpf_trampoline_link_prog(tr)
  {
    mutex_lock(&tr->mutex);
    ...
    update_ftrace_direct_*
    ...
    mutex_unlock(&tr->mutex);
  }

for tracing_multi we would link the program to each trampoline first
(under tr->mutex) and do the bulk ftrace update later (with no tr->mutex
held):

  {
    for each involved trampoline:
      bpf_trampoline_link_prog

    --> and here we could race with some other thread doing single
        trampoline attach

    update_ftrace_direct_*
  }

note the current version holds all tr->mutex instances all the way
through the update_ftrace_direct_* call, so it does not have this window

I think we could use a global rwsem and take the read lock on the single
trampoline attach path and the write lock on tracing_multi attach

I thought we could take direct_mutex early, but that would mean taking
it before the trampoline mutex, which is the opposite order to what we
already have in the single attach path

or just sort those btf ids and take the trampoline mutexes in that order
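
the sorted-order option could look roughly like the sketch below. This is
again a user-space analogy with pthread mutexes, not kernel code; struct
tramp_ent, lock_all_sorted and unlock_all are hypothetical names, and it
assumes each btf id appears only once in the array. Every caller that
takes several mutexes goes through the same ascending order, so two such
callers can't end up waiting on each other.

```c
/* User-space analogy only; every name here is hypothetical. */
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

struct tramp_ent {
	uint32_t btf_id;	/* sort key, assumed unique */
	pthread_mutex_t mutex;	/* stands in for tr->mutex */
};

/* compare two 'struct tramp_ent *' array slots by btf_id */
static int cmp_btf_id(const void *a, const void *b)
{
	const struct tramp_ent *x = *(const struct tramp_ent * const *)a;
	const struct tramp_ent *y = *(const struct tramp_ent * const *)b;

	return (x->btf_id > y->btf_id) - (x->btf_id < y->btf_id);
}

/* sort the array in place, then lock in ascending btf_id order */
static void lock_all_sorted(struct tramp_ent **trs, size_t cnt)
{
	size_t i;

	qsort(trs, cnt, sizeof(*trs), cmp_btf_id);
	for (i = 0; i < cnt; i++)
		pthread_mutex_lock(&trs[i]->mutex);
}

/* unlock in reverse order of acquisition */
static void unlock_all(struct tramp_ent **trs, size_t cnt)
{
	while (cnt--)
		pthread_mutex_unlock(&trs[cnt]->mutex);
}
```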

jirka
