This series enables the BPF verifier to inline bpf_kptr_xchg() into BPF_XCHG
on LoongArch64, and fixes the underlying JIT atomic ordering that makes such
inlining safe.

The BPF verifier can lower bpf_kptr_xchg() to a single BPF_XCHG atomic when
the JIT advertises ptr xchg support via bpf_jit_supports_ptr_xchg(). This
removes helper-call overhead from the kptr exchange fast path. Inlining is
only correct when the JITed exchange provides the same sequentially consistent
ordering as the bpf_kptr_xchg() helper.
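
As a rough userspace illustration of that equivalence (not code from this
series), the sketch below models a kptr slot with C11 atomics. All demo_*
names are hypothetical; the capability query stands in for
bpf_jit_supports_ptr_xchg(), and both paths use a sequentially consistent
exchange so that choosing one over the other is not observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a map value holding a single kptr field. */
struct demo_kptr_slot {
	_Atomic(void *) ptr;
};

/* Hypothetical stand-in for the JIT capability query. */
static bool demo_jit_supports_ptr_xchg(void)
{
	return true;
}

/* Out-of-line path: store the new pointer, return the old one. */
static void *demo_kptr_xchg_helper(struct demo_kptr_slot *slot, void *new_ptr)
{
	return atomic_exchange_explicit(&slot->ptr, new_ptr,
					memory_order_seq_cst);
}

/* Caller-side view: use the inline exchange only when it is advertised. */
static void *demo_kptr_xchg(struct demo_kptr_slot *slot, void *new_ptr)
{
	if (demo_jit_supports_ptr_xchg())
		/* Models the inlined BPF_XCHG; seq_cst by default. */
		return atomic_exchange(&slot->ptr, new_ptr);
	return demo_kptr_xchg_helper(slot, new_ptr);
}

/* Usage: each exchange hands back whatever pointer was stored before. */
static void demo_kptr_selfcheck(void)
{
	struct demo_kptr_slot slot;
	int obj = 0;

	atomic_init(&slot.ptr, NULL);
	assert(demo_kptr_xchg(&slot, &obj) == NULL);
	assert(demo_kptr_xchg_helper(&slot, NULL) == &obj);
}
```

If the inline path used a weaker ordering than the helper, the two branches
of demo_kptr_xchg() would no longer be interchangeable, which is the case
patch 1 is meant to rule out.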

Patch 1 fixes memory ordering for all value-returning BPF atomic RMW operations
emitted by the LoongArch BPF JIT. Per LKMM, value-returning atomic RMW must
provide sequentially consistent ordering. Plain AMO instructions and bare
ll/sc loops on LoongArch do not satisfy this by themselves:
  - BPF_FETCH (ADD/AND/OR/XOR): switch to am*_db.{b,h,w,d}
  - BPF_XCHG: switch to amswap_db.{b,h,w,d}
  - BPF_CMPXCHG: emit dbar 0x700 after the ll/sc loop, matching
    __WEAK_LLSC_MB in cmpxchg.h

Non-value-returning RMW ops (plain BPF_ADD, BPF_AND, etc.) remain weakly
ordered, consistent with LKMM. This fix is independent of kptr inlining and
benefits all BPF programs using value-returning atomics on LoongArch.
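
The split between ordered and weakly ordered operations can be sketched as
a small classifier. This is an illustration only, not the JIT code from
patch 1: the DEMO_* names and the enum are hypothetical, while the numeric
encodings mirror the atomic op values in include/uapi/linux/bpf.h.

```c
#include <assert.h>

/* Encodings mirror include/uapi/linux/bpf.h; DEMO_ names are illustrative. */
#define DEMO_BPF_ADD     0x00
#define DEMO_BPF_OR      0x40
#define DEMO_BPF_AND     0x50
#define DEMO_BPF_XOR     0xa0
#define DEMO_BPF_FETCH   0x01
#define DEMO_BPF_XCHG    (0xe0 | DEMO_BPF_FETCH)
#define DEMO_BPF_CMPXCHG (0xf0 | DEMO_BPF_FETCH)

/* Hypothetical emission strategies matching the list above. */
enum demo_emit {
	DEMO_EMIT_AMO_PLAIN,	/* no value returned: stays weakly ordered */
	DEMO_EMIT_AMO_DB,	/* value returned: barrier-carrying AMO */
	DEMO_EMIT_LLSC_DBAR,	/* cmpxchg: ll/sc loop followed by dbar */
};

/* Pick a strategy from the imm field of a BPF atomic instruction. */
static enum demo_emit demo_pick_emit(int imm)
{
	if (imm == DEMO_BPF_CMPXCHG)
		return DEMO_EMIT_LLSC_DBAR;
	if (imm & DEMO_BPF_FETCH)	/* covers *_FETCH and BPF_XCHG */
		return DEMO_EMIT_AMO_DB;
	return DEMO_EMIT_AMO_PLAIN;
}

/* Usage: only the value-returning forms pick up ordering. */
static void demo_emit_selfcheck(void)
{
	assert(demo_pick_emit(DEMO_BPF_ADD) == DEMO_EMIT_AMO_PLAIN);
	assert(demo_pick_emit(DEMO_BPF_ADD | DEMO_BPF_FETCH) == DEMO_EMIT_AMO_DB);
}
```

Since BPF_XCHG and BPF_CMPXCHG both carry the BPF_FETCH bit in their
encoding, testing that bit is enough to separate value-returning operations
from the rest; BPF_CMPXCHG is checked first only because it is emitted as an
ll/sc loop rather than a single AMO.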

Patch 2 implements bpf_jit_supports_ptr_xchg() so the verifier may inline
bpf_kptr_xchg() on LoongArch64.

Patches 3 and 4 extend the BPF selftests: functional coverage via kptr_xchg_inline,
and an optional kptr-xchg benchmark to compare helper vs inlined paths.

Chenguang Zhao (4):
  LoongArch: bpf: Fix memory ordering for value-returning atomics
  LoongArch: bpf: Advertise JIT support for kptr xchg inline
  selftests/bpf: Enable kptr_xchg_inline test on LoongArch
  selftests/bpf: Add kptr-xchg benchmark

 arch/loongarch/include/asm/inst.h          | 18 ++++
 arch/loongarch/net/bpf_jit.c               | 37 ++++---
 tools/testing/selftests/bpf/Makefile       |  2 +
 tools/testing/selftests/bpf/bench.c        |  2 +
 .../selftests/bpf/benchs/bench_kptr_xchg.c | 96 +++++++++++++++++++
 .../bpf/prog_tests/kptr_xchg_inline.c      |  3 +-
 .../selftests/bpf/progs/kptr_xchg_bench.c  | 49 ++++++++++
 7 files changed, 192 insertions(+), 15 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_kptr_xchg.c
 create mode 100644 tools/testing/selftests/bpf/progs/kptr_xchg_bench.c
---
v3:
- Expand the JIT ordering fix beyond BPF_XCHG w/d to all value-returning
  atomic RMW ops (BPF_FETCH, BPF_XCHG, BPF_CMPXCHG), using barrier-carrying
  AMO variants and dbar after ll/sc as required by LKMM.
- Add __btf_root() in the benchmark BPF program so bpf_obj_drop() remains
  visible to libbpf's kfunc linker when bpf_kptr_xchg() is inlined.
v2:
- https://lore.kernel.org/all/[email protected]/
v1:
- https://lore.kernel.org/all/[email protected]/
--
2.25.1