The current use of guard(preempt_notrace)() within __DECLARE_TRACE() to protect the invocation of __DO_TRACE_CALL() means that BPF programs attached to tracepoints are non-preemptible. This is unhelpful on real-time systems, whose users apparently wish to use BPF while also achieving low latencies.
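For reference, the generated trace_##name() body is currently protected roughly like this (a heavily simplified sketch, not the exact macro), and the last patch replaces the preempt guard with an SRCU-fast read-side section; the srcu_struct name below is only a placeholder:

    static inline void trace_##name(proto)
    {
            if (static_branch_unlikely(&__tracepoint_##name.key)) {
                    if (cond) {
                            /* today: callbacks run with preemption disabled */
                            guard(preempt_notrace)();
                            __DO_TRACE_CALL(name, TP_ARGS(args));
                    }
            }
    }

    /* sketch of the SRCU-fast version (names are placeholders): */
    if (cond) {
            struct srcu_ctr __percpu *scp;

            /* callbacks may now be preempted inside the read-side section */
            scp = srcu_read_lock_fast(&tracepoint_srcu);
            __DO_TRACE_CALL(name, TP_ARGS(args));
            srcu_read_unlock_fast(&tracepoint_srcu, scp);
    }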
Change the protection of tracepoints to use SRCU-fast instead. This allows
the callbacks to be preempted, which also means that the callbacks
themselves need to handle this newfound preemptibility. For perf, add a
guard(preempt) inside its handler to keep the old behavior of perf events
being called with preemption disabled. For BPF, add a migrate_disable() to
its handler. Actually, just replace the rcu_read_lock() with
rcu_read_lock_dont_migrate() and make it cover more of the BPF callback
handler (a rough sketch of both handler changes is appended after the
diffstat).

[ I would have sent this out earlier, but had a death in the family which
  caused everything to be postponed ]

Changes since v5: https://patch.msgid.link/20260108220550.2f6638f3@fedora

- Add a separate patch for perf to call preempt_disable()

- Add a patch that has BPF call migrate_disable() directly

- Just change from preempt_disable() to SRCU-fast always.
  Do not do anything different for PREEMPT_RT.

- Now that BPF disables migration directly, do not have tracepoints
  disable migration in its code.

Steven Rostedt (3):
  tracing: perf: Have perf tracepoint callbacks always disable preemption
  bpf: Have __bpf_trace_run() use rcu_read_lock_dont_migrate()
  tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast

----
 include/linux/tracepoint.h   |  9 +++++----
 include/trace/perf.h         |  4 ++--
 include/trace/trace_events.h |  4 ++--
 kernel/trace/bpf_trace.c     |  5 ++---
 kernel/tracepoint.c          | 18 ++++++++++++++----
 5 files changed, 25 insertions(+), 15 deletions(-)
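As referenced above, a rough sketch of the two callback-side changes
(illustration only, not the actual patches; recursion checks and error
paths are omitted):

    /* perf: keep the old semantics, events still run with preemption off */
    static notrace void perf_trace_##call(void *__data, proto)
    {
            guard(preempt)();
            /* ... existing perf tracepoint output path, unchanged ... */
    }

    /* BPF: keep the program on one CPU even though it can now be preempted */
    static __always_inline void
    __bpf_trace_run(struct bpf_raw_tp_link *link, u64 *args)
    {
            struct bpf_prog *prog = link->link.prog;

            /* roughly migrate_disable() + rcu_read_lock() */
            rcu_read_lock_dont_migrate();
            (void) bpf_prog_run(prog, args);
            /* ... matching RCU unlock / migrate_enable() ... */
    }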
