From: Soumya AR <[email protected]>

Hi,

This patch series implements support for integer atomic fetch min/max operations
in GCC, backing the C++26 std::atomic<int>::fetch_max and
std::atomic<int>::fetch_min operations.

CC'ing the following maintainers, according to the parts of the compiler modified:

- middle-end maintainers: Patches 1-4 involve several tweaks in the middle-end.
  So do parts of Patch 8.
- runtime library maintainers: Patch 5 (extends libatomic) and Patch 7 
  (extends libstdc++).
- aarch64 maintainers: Patch 6 (extends aarch64 backend).
- Patch 8 extends various compiler components to handle atomic fetch min/max.
  CC'ing David Malcolm and Jakub Jelinek for the analyzer and asan
  modifications respectively.
  CC'ing Andrew MacLeod for the tree-ssa modifications.
- Additionally including some contributors who have worked on related areas.

Please let me know if anyone else should be CC'ed on this.

I've highlighted the key changes from each patch below; please see the
individual commit messages for more detail.

---

1. builtin: Add builtin types and function declarations for integer atomic fetch
   min/max

At the RTL level, we distinguish between signed and unsigned min/max operations
based on the optab: SMIN/SMAX or UMIN/UMAX.

To model the same behaviour at the builtin level, we define:
  BUILT_IN_ATOMIC_FETCH_MIN_N, BUILT_IN_ATOMIC_FETCH_MAX_N,
  BUILT_IN_ATOMIC_MIN_FETCH_N, BUILT_IN_ATOMIC_MAX_FETCH_N,
  BUILT_IN_ATOMIC_FETCH_{S,U}MIN_N, BUILT_IN_ATOMIC_FETCH_{S,U}MAX_N,
  BUILT_IN_ATOMIC_{S,U}MIN_FETCH_N, BUILT_IN_ATOMIC_{S,U}MAX_FETCH_N
and the sized variants for the signed builtins.
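
As a small illustration of why signedness has to be carried in the builtin
name, the sketch below compares the same 32-bit pattern under both
interpretations. It is plain C++ for illustration only; the helper names
min_as_signed and min_as_unsigned are hypothetical and not part of this series.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: treat both 32-bit patterns as signed before comparing.
// The conversion to int32_t wraps modulo 2^32 (guaranteed since C++20; GCC
// documents the same behaviour for earlier modes).
inline std::uint32_t min_as_signed(std::uint32_t a, std::uint32_t b) {
    auto sa = static_cast<std::int32_t>(a);
    auto sb = static_cast<std::int32_t>(b);
    return static_cast<std::uint32_t>(std::min(sa, sb));
}

// Hypothetical helper: compare the same patterns as unsigned values.
inline std::uint32_t min_as_unsigned(std::uint32_t a, std::uint32_t b) {
    return std::min(a, b);
}
```

For the inputs 0xFFFFFFFF and 1, the signed comparison picks 0xFFFFFFFF (it is
-1), while the unsigned comparison picks 1, so a single untyped min operation
would be ambiguous.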

---

2. optabs: Add sync optabs for atomic min/max operations

Adds optab definitions for both legacy __sync and __atomic min/max operations.

Rather than adding special clauses for the case where these optabs for the
legacy builtins do not exist, we define optabs for all operations and rely on
existing mechanisms to check whether these optabs are implemented or not.

This means that, although unlikely, backends that do implement these optabs
should work correctly.

---

3. c,c++: Expand atomic min/max builtins to CAS loops

When handling atomic builtins, we handle fetch_after variants as fetch_before
OP val. To issue this compensation code, we call expand_simple_binop(), which
calls expand_binop().

Therefore, expand_binop() is extended to handle min/max operations, either via a
conditional move or a compare and branch sequence. To do this, we migrate the
code from expand_expr_real_2() to expand_binop() for min/max operations.
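
For readers less familiar with the shape of this expansion, here is a rough
source-level analogue in C++. It is only a sketch of the technique, not the RTL
the patch emits; fetch_max_cas and max_fetch_cas are hypothetical names, and
the relaxed failure ordering is an assumption made for the example.

```cpp
#include <algorithm>
#include <atomic>

// Hypothetical helper: fetch_max emulated with a compare-and-swap loop.
// Returns the value held before the update (the "fetch_before" form).
template <typename T>
T fetch_max_cas(std::atomic<T>& obj, T val,
                std::memory_order order = std::memory_order_seq_cst) {
    T old = obj.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads `old`, so every retry
    // recomputes the maximum against the latest observed value.
    while (!obj.compare_exchange_weak(old, std::max(old, val), order,
                                      std::memory_order_relaxed)) {
    }
    return old;
}

// The "fetch_after" form: compensate by applying the same operation to the
// old value, i.e. fetch_before OP val.
template <typename T>
T max_fetch_cas(std::atomic<T>& obj, T val,
                std::memory_order order = std::memory_order_seq_cst) {
    return std::max(fetch_max_cas(obj, val, order), val);
}
```

The second function is the compensation step described above: no extra atomic
access is needed, only an ordinary max on the returned old value.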

---

4. middle-end + aarch64: Sanity tests for atomic min/max operations

Regression tests for the above CAS implementation (tested via aarch64).
The gcc.dg/atomic-op-*.c tests are also extended with min/max for
target-independent coverage.

---

5. libatomic: Add support for atomic fetch min/max builtins

We implement __atomic_fetch_{min,max} and __atomic_{min,max}_fetch operations in
libatomic for both signed and unsigned integer types.

This patch makes several significant design decisions, which are described in
more detail in the patch itself.

- We introduce STYPE for signed operations alongside the existing UTYPE, with
  BUILTIN_TYPE resolving to the appropriate type based on the operation.
- Extend DECLARE_ALL_SIZED macro to handle both signed and unsigned types.
- Some tweaks in fop_n.c to make fallbacks work for min/max.

IMP: Note that the naming convention used for libcalls in this change
(__atomic_fetch_{s,u}min/max_<size>) was also coordinated with the LLVM
community.

---

6. aarch64: Add backend support for atomic fetch min/max operations

This patch adds aarch64 support for atomic min/max operations via the following
execution paths:

- LSE inline: When LSE is available at compile time, emits native atomic min/max
  instructions (ldsmin, ldsmax, ldumin, ldumax).
- Outline atomics: Runtime CPU feature detection via libgcc functions that
  dispatch to either LSE instructions or LL/SC sequences.
- Inline LL/SC: When outline atomics are disabled (-mno-outline-atomics) on a
  non-LSE target, emits inline load-exclusive/store-exclusive sequences.

---

7. libstdc++: Use new atomic fetch min/max builtins in std::atomic

Enables C++26's atomic min/max operations in libstdc++ through the newly added
compiler builtins.

We use concepts to follow the same pattern as existing atomic_fetch_add/sub
operations. 

IMP: These functions are currently not guarded by C++26 feature checks. How
should we implement this?

---

8. middle-end: Extend compiler infrastructure for atomic min/max builtins

- Extend asan, gimple-ssa-warn, and analyzer for atomic min/max.
- The analyzer currently doesn't recognize ternary ops as MIN_EXPR/MAX_EXPR, so
  analyzer tests use constants to verify correct min/max computation in memory.
- tsan emits unsupported warnings for atomic min/max.
- tree-ssa-forwprop optimizes __atomic_fetch_min/max followed by min/max
  to the more efficient __atomic_min/max_fetch when the old value isn't needed.
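
To make the forwprop bullet concrete, the sketch below shows the source shape
in question, written in portable C++ because the new builtins are not available
without this series. fetch_min_emulated is a hypothetical stand-in for
__atomic_fetch_min, and new_value_after_min is an illustrative name only.

```cpp
#include <algorithm>
#include <atomic>

// Hypothetical stand-in for __atomic_fetch_min: returns the old value.
inline int fetch_min_emulated(std::atomic<int>& obj, int val) {
    int old = obj.load(std::memory_order_relaxed);
    while (!obj.compare_exchange_weak(old, std::min(old, val))) {
    }
    return old;
}

// The old value is used only as an input to a second min, so the pair
// computes exactly what a single min_fetch operation would return.
inline int new_value_after_min(std::atomic<int>& obj, int val) {
    int old = fetch_min_emulated(obj, val);
    return std::min(old, val);
}
```

When `old` has no other use, the fetch-then-min pair is a candidate for being
folded into the corresponding min_fetch form.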

---

Testing:

  Bootstrapped and regression tested on x86_64 and AArch64.
  Cross compiler regression tested on arm-linux.
  Cross compiler regression tested on AArch64 linux with Qemu emulating a
  machine that does not have LSE.

---

Soumya AR (8):
  builtin: Add builtin types and function declarations for integer
    atomic fetch min/max
  optabs: Add sync optabs for atomic min/max operations
  c,c++: Expand atomic min/max builtins to CAS loops
  middle-end + aarch64: Sanity tests for atomic min/max operations
  libatomic: Add support for atomic fetch min/max builtins
  aarch64: Add backend support for atomic fetch min/max operations
  libstdc++: Use new atomic fetch min/max builtins in std::atomic
  middle-end: Extend compiler infrastructure for atomic min/max builtins

 gcc/analyzer/kf.cc                            |  85 +++++
 gcc/asan.cc                                   |  40 ++
 gcc/builtin-types.def                         |  11 +
 gcc/builtins.cc                               | 116 ++++++
 gcc/c-family/c-common.cc                      |  50 +++
 gcc/config/aarch64/aarch64-protos.h           |   4 +
 gcc/config/aarch64/aarch64.cc                 |  51 +++
 gcc/config/aarch64/atomics.md                 |  54 ++-
 gcc/config/aarch64/iterators.md               |  30 +-
 gcc/expr.cc                                   |  88 +----
 gcc/fortran/types.def                         |  11 +
 gcc/gimple-ssa-warn-access.cc                 |   8 +
 gcc/optabs.cc                                 | 143 ++++++-
 gcc/optabs.def                                |  24 ++
 gcc/sync-builtins.def                         | 164 ++++++++
 .../c-c++-common/asan/atomic-max-invalid.c    |  19 +
 gcc/testsuite/gcc.dg/Wstringop-overflow-78.c  | 106 ++++++
 .../gcc.dg/analyzer/atomic-builtins-1.c       | 207 ++++++++++
 gcc/testsuite/gcc.dg/atomic-op-1.c            | 353 +++++++++++++++++
 gcc/testsuite/gcc.dg/atomic-op-2.c            | 353 +++++++++++++++++
 gcc/testsuite/gcc.dg/atomic-op-3.c            | 353 +++++++++++++++++
 gcc/testsuite/gcc.dg/atomic-op-4.c            | 353 +++++++++++++++++
 gcc/testsuite/gcc.dg/atomic-op-5.c            | 355 ++++++++++++++++++
 .../gcc.dg/tree-ssa/atomic-minmax-forwprop.c  |  56 +++
 gcc/testsuite/gcc.dg/tsan/atomic-minmax.c     |  48 +++
 .../gcc.target/aarch64/atomic-minmax-lse.c    | 122 ++++++
 .../gcc.target/aarch64/atomic-minmax-nolse.c  | 196 ++++++++++
 .../gcc.target/aarch64/atomic-minmax.c        | 128 +++++++
 .../gcc.target/aarch64/atomic-minmax.x        | 185 +++++++++
 gcc/tree-ssa-forwprop.cc                      |   7 +
 gcc/tsan.cc                                   |  56 ++-
 libatomic/Makefile.am                         |   3 +-
 libatomic/Makefile.in                         |   4 +-
 libatomic/acinclude.m4                        |  19 +
 libatomic/auto-config.h.in                    |  33 +-
 libatomic/config/linux/aarch64/atomic_16.S    | 126 +++++++
 libatomic/configure                           | 346 +++++++++++++++++
 libatomic/configure.ac                        |   1 +
 libatomic/fop_n.c                             | 118 +++++-
 libatomic/fsmax_n.c                           |  29 ++
 libatomic/fsmin_n.c                           |  29 ++
 libatomic/fumax_n.c                           |  28 ++
 libatomic/fumin_n.c                           |  28 ++
 libatomic/libatomic.map                       |  44 +++
 libatomic/libatomic_i.h                       |  26 +-
 libatomic/testsuite/libatomic.c/atomic-op-1.c | 353 +++++++++++++++++
 libatomic/testsuite/libatomic.c/atomic-op-2.c | 353 +++++++++++++++++
 libatomic/testsuite/libatomic.c/atomic-op-3.c | 353 +++++++++++++++++
 libatomic/testsuite/libatomic.c/atomic-op-4.c | 353 +++++++++++++++++
 libatomic/testsuite/libatomic.c/atomic-op-5.c | 355 ++++++++++++++++++
 libgcc/config/aarch64/lse.S                   |  62 ++-
 libgcc/config/aarch64/t-lse                   |   3 +-
 libstdc++-v3/include/bits/atomic_base.h       | 156 ++++++++
 .../include/c_compatibility/stdatomic.h       |   4 +
 libstdc++-v3/include/std/atomic               |  52 +++
 .../atomic_integral/fetch_minmax.cc           | 163 ++++++++
 .../atomic_integral/fetch_minmax_order.cc     | 111 ++++++
 .../29_atomics/atomic_integral/nonmembers.cc  |  24 ++
 .../29_atomics/atomic_ref/integral.cc         |  51 ++-
 .../headers/stdatomic.h/c_compat.cc           |   4 +
 60 files changed, 6874 insertions(+), 133 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/atomic-max-invalid.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/atomic-minmax-forwprop.c
 create mode 100644 gcc/testsuite/gcc.dg/tsan/atomic-minmax.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-minmax-lse.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-minmax-nolse.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-minmax.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/atomic-minmax.x
 create mode 100644 libatomic/fsmax_n.c
 create mode 100644 libatomic/fsmin_n.c
 create mode 100644 libatomic/fumax_n.c
 create mode 100644 libatomic/fumin_n.c
 create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_integral/fetch_minmax.cc
 create mode 100644 libstdc++-v3/testsuite/29_atomics/atomic_integral/fetch_minmax_order.cc

-- 
2.44.0
