Re: [PATCH bpf-next v7 00/11] Atomics for eBPF
On Thu, Jan 14, 2021 at 10:18 AM Brendan Jackman wrote: > > There's still one unresolved review comment from John[3] which I > will resolve with a followup patch. > > Differences from v6->v7 [1]: > > * Fixed riscv build error detected by 0-day robot. Applied. Thanks a lot. Please address John's request in a followup and these few issues: - rst doesn't look correct. Example: rst2man Documentation/networking/filter.rst >/dev/null Documentation/networking/filter.rst:1053: (WARNING/2) Inline emphasis start-string without end-string. > Except ``BPF_ADD`` _without_ ``BPF_FETCH`` (for legacy reasons), all 4 byte > atomic operations require alu32 mode. Clang enables this mode by default in > architecture v3 (``-mcpu=v3``). For older versions it can be enabled with > ``-Xclang -target-feature -Xclang +alu32``. It reads confusing to me. I would rephrase 'clang enables this mode by default' into 'clang can generate new atomic instruction when -mcpu=v3 is enabled'. 'For older versions...' This part I didn't get. The users need clang 12 that is capable to emit these insns. What 'older versions' you're talking about?
Re: [PATCH bpf-next v7 00/11] Atomics for eBPF
Hello: This series was applied to bpf/bpf-next.git (refs/heads/master): On Thu, 14 Jan 2021 18:17:40 + you wrote: > There's still one unresolved review comment from John[3] which I > will resolve with a followup patch. > > Differences from v6->v7 [1]: > > * Fixed riscv build error detected by 0-day robot. > > [...] Here is the summary with links: - [bpf-next,v7,01/11] bpf: x86: Factor out emission of ModR/M for *(reg + off) https://git.kernel.org/bpf/bpf-next/c/11c11d0751fc - [bpf-next,v7,02/11] bpf: x86: Factor out emission of REX byte https://git.kernel.org/bpf/bpf-next/c/74007cfc1f71 - [bpf-next,v7,03/11] bpf: x86: Factor out a lookup table for some ALU opcodes https://git.kernel.org/bpf/bpf-next/c/e5f02caccfae - [bpf-next,v7,04/11] bpf: Rename BPF_XADD and prepare to encode other atomics in .imm https://git.kernel.org/bpf/bpf-next/c/91c960b00566 - [bpf-next,v7,05/11] bpf: Move BPF_STX reserved field check into BPF_STX verifier code https://git.kernel.org/bpf/bpf-next/c/c5bcb5eb4db6 - [bpf-next,v7,06/11] bpf: Add BPF_FETCH field / create atomic_fetch_add instruction https://git.kernel.org/bpf/bpf-next/c/5ca419f2864a - [bpf-next,v7,07/11] bpf: Add instructions for atomic_[cmp]xchg https://git.kernel.org/bpf/bpf-next/c/5ffa25502b5a - [bpf-next,v7,08/11] bpf: Pull out a macro for interpreting atomic ALU operations https://git.kernel.org/bpf/bpf-next/c/462910670e4a - [bpf-next,v7,09/11] bpf: Add bitwise atomic instructions https://git.kernel.org/bpf/bpf-next/c/981f94c3e921 - [bpf-next,v7,10/11] bpf: Add tests for new BPF atomic operations https://git.kernel.org/bpf/bpf-next/c/98d666d05a1d - [bpf-next,v7,11/11] bpf: Document new atomic instructions https://git.kernel.org/bpf/bpf-next/c/de948576f8e7 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
[PATCH bpf-next v7 00/11] Atomics for eBPF
There's still one unresolved review comment from John[3] which I will resolve with a followup patch. Differences from v6->v7 [1]: * Fixed riscv build error detected by 0-day robot. Differences from v5->v6 [1]: * Carried Björn Töpel's ack for RISC-V code, plus a couple more acks from Yonhgong. * Doc fixups. * Trivial cleanups. Differences from v4->v5 [1]: * Fixed bogus type casts in interpreter that led to warnings from the 0day robot. * Dropped feature-detection for Clang per Andrii's suggestion in [4]. The selftests will now fail to build unless you have llvm-project commit 286daafd6512. The ENABLE_ATOMICS_TEST macro is still needed to support the no_alu32 tests. * Carried some Acks from John and Yonghong. * Dropped confusing usage of __atomic_exchange from prog_test in favour of __sync_lock_test_and_set. * [Really] got rid of all the forest of instruction macros (BPF_ATOMIC_FETCH_ADD and friends); now there's just BPF_ATOMIC_OP to define all the instructions as we use them in the verifier tests. This makes the atomic ops less special in that API, and I don't think the resulting usage is actually any harder to read. Differences from v3->v4 [1]: * Added one Ack from Yonghong. He acked some other patches but those have now changed non-trivally so I didn't add those acks. * Fixups to commit messages. * Fixed disassembly and comments: first arg to atomic_fetch_* is a pointer. * Improved prog_test efficiency. BPF progs are now all loaded in a single call, then the skeleton is re-used for each subtest. * Dropped use of tools/build/feature in favour of a one-liner in the Makefile. * Dropped the commit that created an emit_neg helper in the x86 JIT. It's not used any more (it wasn't used in v3 either). * Combined all the different filter.h macros (used to be BPF_ATOMIC_ADD, BPF_ATOMIC_FETCH_ADD, BPF_ATOMIC_AND, etc) into just BPF_ATOMIC32 and BPF_ATOMIC64. * Removed some references to BPF_STX_XADD from tools/, samples/ and lib/ that I missed before. Differences from v2->v3 [1]: * More minor fixes and naming/comment changes * Dropped atomic subtract: compilers can implement this by preceding an atomic add with a NEG instruction (which is what the x86 JIT did under the hood anyway). * Dropped the use of -mcpu=v4 in the Clang BPF command-line; there is no longer an architecture version bump. Instead a feature test is added to Kbuild - it builds a source file to check if Clang supports BPF atomics. * Fixed the prog_test so it no longer breaks test_progs-no_alu32. This requires some ifdef acrobatics to avoid complicating the prog_tests model where the same userspace code exercises both the normal and no_alu32 BPF test objects, using the same skeleton header. Differences from v1->v2 [1]: * Fixed mistakes in the netronome driver * Addd sub, add, or, xor operations * The above led to some refactors to keep things readable. (Maybe I should have just waited until I'd implemented these before starting the review...) * Replaced BPF_[CMP]SET | BPF_FETCH with just BPF_[CMP]XCHG, which include the BPF_FETCH flag * Added a bit of documentation. Suggestions welcome for more places to dump this info... The prog_test that's added depends on Clang/LLVM features added by Yonghong in commit 286daafd6512 (was https://reviews.llvm.org/D72184). This only includes a JIT implementation for x86_64 - I don't plan to implement JIT support myself for other architectures. Operations == This patchset adds atomic operations to the eBPF instruction set. The use-case that motivated this work was a trivial and efficient way to generate globally-unique cookies in BPF progs, but I think it's obvious that these features are pretty widely applicable. The instructions that are added here can be summarised with this list of kernel operations: * atomic[64]_[fetch_]add * atomic[64]_[fetch_]and * atomic[64]_[fetch_]or * atomic[64]_xchg * atomic[64]_cmpxchg The following are left out of scope for this effort: * 16 and 8 bit operations * Explicit memory barriers Encoding I originally planned to add new values for bpf_insn.opcode. This was rather unpleasant: the opcode space has holes in it but no entire instruction classes[2]. Yonghong Song had a better idea: use the immediate field of the existing STX XADD instruction to encode the operation. This works nicely, without breaking existing programs, because the immediate field is currently reserved-must-be-zero, and extra-nicely because BPF_ADD happens to be zero. Note that this of course makes immediate-source atomic operations impossible. It's hard to imagine a measurable speedup from such instructions, and if it existed it would certainly not benefit x86, which has no support for them. The BPF_OP opcode fields are re-used in the immediate, and an additional flag BPF_FETCH is used to mark instructions that should fetch a pre-modification value from memory. So, BPF_XADD is now called BPF_A