https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71153

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2016-05-16
                 CC|                            |pinskia at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org      |pinskia at gcc dot gnu.org
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
LDCLRL does MEM &= ~op.

LDCLR: atomic AND NOT (bit clear) of a memory location with the value in a
register, with the original data loaded into a register.
It applies to 8, 16, 32 and 64 bits. Each instruction can have one of four
possible orderings (acquire, release, acquire and release, or no ordering),
as it performs both a load and a store.

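So the code is correct, just not optimal: to implement an AND, the operand has
to be inverted first, and for a constant operand that inversion can be folded
at compile time instead of being emitted as a separate instruction.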
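
To make the MEM &= ~op semantics concrete, here is a minimal sketch in C11
atomics. It only models the behaviour described above; it is not the
instruction itself and not GCC's implementation. The names ldclr_model and
fetch_and_via_ldclr are hypothetical, and relaxed ordering is an arbitrary
choice for the illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of LDCLR: atomically clear the bits of *mem that are
   set in op (MEM &= ~op) and return the previous contents.  */
static uint32_t
ldclr_model (_Atomic uint32_t *mem, uint32_t op)
{
  return atomic_fetch_and_explicit (mem, ~op, memory_order_relaxed);
}

/* An atomic fetch-and expressed through the model above: the operand
   handed to the bit-clear operation must be the complement of the mask.
   If mask is a compile-time constant, ~mask can be folded by the compiler;
   otherwise the complement has to be computed at run time.  */
static uint32_t
fetch_and_via_ldclr (_Atomic uint32_t *mem, uint32_t mask)
{
  return ldclr_model (mem, ~mask);
}
```

With a constant mask such as 0x0F, the complement is known at compile time,
which is the case the patch below tries to handle without an extra NOT.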


Something like this should get the code back to optimal:
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 236c819..ad92f6a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11607,6 +11607,7 @@ aarch64_gen_atomic_ldop (enum rtx_code code, rtx out_data, rtx out_result,
   aarch64_atomic_load_op_code ldop_code;
   rtx src;
   rtx x;
+  bool already_swapped = false;

   if (out_data)
     out_data = gen_lowpart (mode, out_data);
@@ -11614,6 +11615,12 @@ aarch64_gen_atomic_ldop (enum rtx_code code, rtx out_data, rtx out_result,
   if (out_result)
     out_result = gen_lowpart (mode, out_result);

+  if (code == AND && CONST_INT_P (value))
+    {
+      value = simplify_gen_unary (NOT, mode, value, mode);
+      already_swapped = true;
+    }
+
   /* Make sure the value is in a register, putting it into a destination
      register if it needs to be manipulated.  */
   if (!register_operand (value, mode)
@@ -11670,8 +11677,11 @@ aarch64_gen_atomic_ldop (enum rtx_code code, rtx out_data, rtx out_result,
        if (short_mode)
          src = gen_lowpart (wmode, src);

-       not_src = gen_rtx_NOT (wmode, src);
-       emit_insn (gen_rtx_SET (src, not_src));
+       if (!already_swapped)
+         {
+           not_src = gen_rtx_NOT (wmode, src);
+           emit_insn (gen_rtx_SET (src, not_src));
+         }

        if (short_mode)
          src = gen_lowpart (mode, src);
