Re: [PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire semantics

Andre Vieira (lists) via Gcc-patches Mon, 14 Nov 2022 06:08:54 -0800

Here is the latest version and an updated ChangeLog:

2022-11-14  Andre Vieira  <[email protected]>
                       Kyrylo Tkachov <[email protected]>


gcc/ChangeLog:

        * config/aarch64/aarch64.h (AARCH64_ISA_RCPC): New Macro.
        (TARGET_RCPC): New Macro.
        * config/aarch64/atomics.md (atomic_load<mode>): Change into an expand.
        (aarch64_atomic_load<mode>_rcpc): New define_insn for ldapr.
        (aarch64_atomic_load<mode>): Rename of old define_insn for ldar.
        * config/aarch64/iterators.md (UNSPECV_LDAP): New unspec enum value.
        * doc/invoke.texi (rcpc): Amend documentation to mention the effects
        on code generation.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/ldapr.c: New test.

On 10/11/2022 15:55, Kyrylo Tkachov wrote:

Hi Andre,

-----Original Message-----
From: Andre Vieira (lists) <[email protected]>
Sent: Thursday, November 10, 2022 11:17 AM
To: [email protected]
Cc: Kyrylo Tkachov <[email protected]>; Richard Earnshaw
<[email protected]>; Richard Sandiford
<[email protected]>
Subject: [PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire
semantics

Hello,

This patch enables the use of LDAPR for load-acquire semantics. After
some internal investigation based on the work published by Podkopaev et
al. (https://dl.acm.org/doi/10.1145/3290382) we can confirm that using
LDAPR for the C++ load-acquire semantics is a correct relaxation.

Bootstrapped and regression tested on aarch64-none-linux-gnu.

OK for trunk?

Thanks for the patch

2022-11-09  Andre Vieira  <[email protected]>
              Kyrylo Tkachov  <[email protected]>

gcc/ChangeLog:

          * config/aarch64/aarch64.h (AARCH64_ISA_RCPC): New Macro.
          (TARGET_RCPC): New Macro.
          * config/aarch64/atomics.md (atomic_load<mode>): Change into
          an expand.
          (aarch64_atomic_load<mode>_rcpc): New define_insn for ldapr.
          (aarch64_atomic_load<mode>): Rename of old define_insn for ldar.
          * config/aarch64/iterators.md (UNSPECV_LDAP): New unspec enum
          value.
          * doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
          (rcpc): Amend documentation to mention the effects on code
          generation.

gcc/testsuite/ChangeLog:

          * gcc.target/aarch64/ldapr.c: New test.
          * lib/target-supports.exp (add_options_for_aarch64_rcpc): New
          options procedure.
          (check_effective_target_aarch64_rcpc_ok_nocache): New
          check-effective-target.
          (check_effective_target_aarch64_rcpc_ok): Likewise.

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index bc95f6d9d15f190a3e33704b4def2860d5f339bd..801a62bf2ba432f35ae1931beb8c4405b77b36c3 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -657,7 +657,42 @@
    }
  )

-(define_insn "atomic_load<mode>"

+(define_expand "atomic_load<mode>"
+  [(match_operand:ALLI 0 "register_operand" "=r")
+   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
+   (match_operand:SI   2 "const_int_operand")]
+  ""
+  {
+    /* If TARGET_RCPC and this is an ACQUIRE load, then expand to a pattern
+       using UNSPECV_LDAP.  */
+    enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
+    if (TARGET_RCPC
+       && (is_mm_acquire (model)
+           || is_mm_acq_rel (model)))
+    {
+      emit_insn (gen_aarch64_atomic_load<mode>_rcpc (operands[0], operands[1],
+                                                    operands[2]));
+    }
+    else
+    {
+      emit_insn (gen_aarch64_atomic_load<mode> (operands[0], operands[1],
+                                               operands[2]));
+    }

No braces needed for single-statement bodies.

diff --git a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
index c2b23a6ee97ef2b7c74119f22c1d3e3d85385f4d..25d609238db7d45845dbc446ac21d12dddcf8eac 100644
--- a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
+++ b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
@@ -437,9 +437,9 @@ the following and their inverses no :samp:`{feature}` :
    floating-point instructions. This option is enabled by default for :option:`-march=armv8.4-a`. Use of this option with architectures prior to Armv8.2-A is not supported.

:samp:`rcpc`

-  Enable the RcPc extension.  This does not change code generation from GCC,
-  but is passed on to the assembler, enabling inline asm statements to use
-  instructions from the RcPc extension.
+  Enable the RcPc extension.  This enables the use of the LDAPR instructions for
+  load-acquire atomic semantics, and passes it on to the assembler, enabling
+  inline asm statements to use instructions from the RcPc extension.

Let's capitalize this consistently throughout the patch as "RCpc".

diff --git a/gcc/testsuite/gcc.target/aarch64/ldapr.c b/gcc/testsuite/gcc.target/aarch64/ldapr.c
new file mode 100644
index 0000000000000000000000000000000000000000..c36edfcd79a9ee41434ab09ac47d257a692a8606
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ldapr.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -std=c99" } */
+/* { dg-require-effective-target aarch64_rcpc_ok } */
+/* { dg-add-options aarch64_rcpc } */

If you're not doing an assemble here you probably don't care much about this
target business? (It's more important on the arm side, with incompatible ABIs
and Thumb-ness.)
I think in this case you can avoid introducing the effective targets and just add
#pragma GCC target "+rcpc"
to the body of the testcase (we use it in a few testcases for aarch64).
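
For illustration only, a compile-only testcase following that suggestion might look roughly like the sketch below. It is not the ldapr.c from the patch: the function and variable names are hypothetical, the preprocessor guard is an addition so the file also builds on other targets and compilers, and the instructions named in the comments are assumptions based on the expander logic in this patch rather than verified output.

```c
/* { dg-do compile } */
/* { dg-options "-O1 -std=c99" } */

#include <stdint.h>

/* Enable RCpc for this file via the pragma instead of an effective-target
   check.  The guard is an assumption added for portability of this sketch.  */
#if defined (__aarch64__) && defined (__GNUC__) && !defined (__clang__)
#pragma GCC target "+rcpc"
#endif

/* Hypothetical shared variable read by the loads below.  */
uint64_t shared_value = 42;

/* Acquire load: assumed to expand to LDAPR when +rcpc is enabled.  */
uint64_t
load_acquire (uint64_t *ptr)
{
  return __atomic_load_n (ptr, __ATOMIC_ACQUIRE);
}

/* Sequentially consistent load: assumed to keep using LDAR.  */
uint64_t
load_seq_cst (uint64_t *ptr)
{
  return __atomic_load_n (ptr, __ATOMIC_SEQ_CST);
}
```

With the pragma in the file body, the dg-require-effective-target and dg-add-options lines, and the target-supports.exp additions behind them, would presumably no longer be needed.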

Otherwise looks good!
Thanks,
Kyrill

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index e60f9bce023b2cd5e7233ee9b8c61fc93c1494c2..51a8aa02a5850d5c79255dbf7e0764ffdec73ccd 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -221,6 +221,7 @@ enum class aarch64_feature : unsigned char {
 #define AARCH64_ISA_V9_3A          (aarch64_isa_flags & AARCH64_FL_V9_3A)
 #define AARCH64_ISA_MOPS          (aarch64_isa_flags & AARCH64_FL_MOPS)
 #define AARCH64_ISA_LS64          (aarch64_isa_flags & AARCH64_FL_LS64)
+#define AARCH64_ISA_RCPC           (aarch64_isa_flags & AARCH64_FL_RCPC)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (AARCH64_ISA_CRYPTO)
@@ -328,6 +329,9 @@ enum class aarch64_feature : unsigned char {
 /* SB instruction is enabled through +sb.  */
 #define TARGET_SB (AARCH64_ISA_SB)
 
+/* RCPC loads from Armv8.3-a.  */
+#define TARGET_RCPC (AARCH64_ISA_RCPC)
+
 /* Apply the workaround for Cortex-A53 erratum 835769.  */
 #define TARGET_FIX_ERR_A53_835769      \
   ((aarch64_fix_a53_err835769 == 2)    \
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index bc95f6d9d15f190a3e33704b4def2860d5f339bd..dc5f52ee8a4b349c0d8466a16196f83604893cbb 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -657,7 +657,38 @@
   }
 )
 
-(define_insn "atomic_load<mode>"
+(define_expand "atomic_load<mode>"
+  [(match_operand:ALLI 0 "register_operand" "=r")
+   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
+   (match_operand:SI   2 "const_int_operand")]
+  ""
+  {
+    /* If TARGET_RCPC and this is an ACQUIRE load, then expand to a pattern
+       using UNSPECV_LDAP.  */
+    enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
+    if (TARGET_RCPC
+       && (is_mm_acquire (model)
+           || is_mm_acq_rel (model)))
+      emit_insn (gen_aarch64_atomic_load<mode>_rcpc (operands[0], operands[1],
+                                                    operands[2]));
+    else
+      emit_insn (gen_aarch64_atomic_load<mode> (operands[0], operands[1],
+                                               operands[2]));
+    DONE;
+  }
+)
+
+(define_insn "aarch64_atomic_load<mode>_rcpc"
+  [(set (match_operand:ALLI 0 "register_operand" "=r")
+    (unspec_volatile:ALLI
+      [(match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
+       (match_operand:SI 2 "const_int_operand")]                       ;; model
+      UNSPECV_LDAP))]
+  "TARGET_RCPC"
+  "ldapr<atomic_sfx>\t%<w>0, %1"
+)
+
+(define_insn "aarch64_atomic_load<mode>"
   [(set (match_operand:ALLI 0 "register_operand" "=r")
     (unspec_volatile:ALLI
       [(match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index a8ad4e5ff215ade06c3ca13a24ef18d259afcb6c..d8c2f9d6c32d6f188d584c2e9d8fb36511624de6 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -988,6 +988,7 @@
     UNSPECV_LX                 ; Represent a load-exclusive.
     UNSPECV_SX                 ; Represent a store-exclusive.
     UNSPECV_LDA                        ; Represent an atomic load or load-acquire.
+    UNSPECV_LDAP               ; Represent an atomic acquire load with RCpc semantics.
     UNSPECV_STL                        ; Represent an atomic store or store-release.
     UNSPECV_ATOMIC_CMPSW       ; Represent an atomic compare swap.
     UNSPECV_ATOMIC_EXCHG       ; Represent an atomic exchange.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 449df59729884aa3292559fffcfbbcc99182c13a..5a32d7b6e94502c57e6438cfd2563bc5631690e1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -20168,9 +20168,9 @@ Enable FP16 fmla extension.  This also enables FP16 extensions and
 floating-point instructions. This option is enabled by default for @option{-march=armv8.4-a}. Use of this option with architectures prior to Armv8.2-A is not supported.
 
 @item rcpc
-Enable the RcPc extension.  This does not change code generation from GCC,
-but is passed on to the assembler, enabling inline asm statements to use
-instructions from the RcPc extension.
+Enable the RCpc extension.  This enables the use of the LDAPR instructions for
+load-acquire atomic semantics, and passes it on to the assembler, enabling
+inline asm statements to use instructions from the RCpc extension.
 @item dotprod
 Enable the Dot Product extension.  This also enables Advanced SIMD instructions.
 @item aes
