hi,

as an option to Menglong's change [1] I'm sending a proposal for a tracing_multi link that does not add a static trampoline but attaches the program to all needed trampolines.

This approach keeps the same performance but has some drawbacks:

  - when attaching 20k functions we allocate and attach 20k trampolines

  - during attachment we hold each trampoline's mutex, so for the above
    20k functions we will hold 20k mutexes during the attachment; this
    should be very prone to deadlock, but I haven't hit it yet

I was hoping we'd find some common solution, but it looks like it's either the static trampoline with a performance penalty, or the troubles described above while keeping the current trampoline performance.

It looks like the trampoline allocation/generation might not be a big problem, and I'll try to find a solution for holding that many mutexes. If there's no better solution, I think having one read/write mutex for tracing multi link attach/detach should work.

We'd like to use trampolines instead of kprobes for the performance gains, so naturally we want to keep the same performance even when the program is attached through the tracing multi link.

thoughts?

thanks,
jirka

[1] https://lore.kernel.org/bpf/[email protected]/
---
Jiri Olsa (12):
      ftrace: Add ftrace_hash_count function
      bpf: Add struct bpf_trampoline_ops object
      bpf: Add struct bpf_struct_ops_tramp_link object
      bpf: Add struct bpf_tramp_node object
      bpf: Add multi tracing attach types
      bpf: Add bpf_trampoline_multi_attach/detach functions
      bpf: Add support to create tracing multi link
      libbpf: Add btf__find_by_glob_kind function
      libbpf: Add support to create tracing multi link
      selftests/bpf: Add fentry tracing multi func test
      selftests/bpf: Add fentry intersected tracing multi func test
      selftests/bpf: Add tracing multi benchmark test

 arch/arm64/net/bpf_jit_comp.c                            |  58 +++++++--------
 arch/s390/net/bpf_jit_comp.c                             |  42 +++++------
 arch/x86/net/bpf_jit_comp.c                              |  54 +++++++-------
 include/linux/bpf.h                                      |  74 +++++++++++++------
 include/linux/ftrace.h                                   |   1 +
 include/linux/trace_events.h                             |   6 ++
 include/uapi/linux/bpf.h                                 |   7 ++
 kernel/bpf/bpf_struct_ops.c                              |  39 +++++-----
 kernel/bpf/syscall.c                                     |  62 +++++++++++-----
 kernel/bpf/trampoline.c                                  | 340 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------
 kernel/bpf/verifier.c                                    |   8 ++-
 kernel/trace/bpf_trace.c                                 | 105 +++++++++++++++++++++++++++
 kernel/trace/ftrace.c                                    |  14 ++--
 net/bpf/bpf_dummy_struct_ops.c                           |  23 +++--
 net/bpf/test_run.c                                       |   2 +
 tools/include/uapi/linux/bpf.h                           |   7 ++
 tools/lib/bpf/bpf.c                                      |   7 ++
 tools/lib/bpf/bpf.h                                      |   4 ++
 tools/lib/bpf/btf.c                                      |  41 +++++++++++
 tools/lib/bpf/btf.h                                      |   3 +
 tools/lib/bpf/libbpf.c                                   |  87 +++++++++++++++++++++++
 tools/lib/bpf/libbpf.h                                   |  14 ++++
 tools/lib/bpf/libbpf.map                                 |   1 +
 tools/testing/selftests/bpf/Makefile                     |   3 +-
 tools/testing/selftests/bpf/prog_tests/tracing_multi.c   | 363 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/tracing_multi_check.c  | 132 ++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/tracing_multi_fentry.c |  39 ++++++++++
 27 files changed, 1319 insertions(+), 217 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/tracing_multi.c
 create mode 100644 tools/testing/selftests/bpf/progs/tracing_multi_check.c
 create mode 100644 tools/testing/selftests/bpf/progs/tracing_multi_fentry.c
