On Wed, Nov 26, 2014 at 04:35:50PM +0000, James Greenhalgh wrote:
> Why do we want to turn off folding for the V4SF/V2SF/V2DF modes of these
> intrinsics? There should be no difference between the mid-end definition
> and the intrinsic definition of their behaviour.

Good point. Done.

> I also note that the integer forms of these now end up as an "abs" RTL
> expression - can we guarantee that preserves the intrinsics behaviour and
> no RTL-folder will come along and mis-optimize? Documentation is vague
> on this point.

I don't think we can guarantee that indefinitely, but it doesn't seem anyone has implemented such a rule *at present*. And we'd lose e.g. constant folding, if we switched to using an AArch64-specific UNSPEC. If anyone does add such a rule, I'd have some hope that the testcase will catch it...

> I'm also not convinced by the SIMD_ARG_SCRATCH foo you add. Looking at
> the aarch64.md:absdi2 pattern I can't see why we need that scratch at
> all. It seems we could get away with marking operand 0 early-clobber and
> using it as our scratch register. Then we could drop all the extra
> infrastructure from this patch.

Hah, good point. Ok, I attach a revised patch, updating the pattern as you suggest and omitting the infrastructure changes. (I'm leaving the now-unused qualifier_internal as it stands, as there are arguments both for removing it and "fixing" it as per previous patch, but I don't think we should hold this up in the case that we want to have that discussion!)

Cross-tested check-gcc on aarch64-none-elf, including previously-posted test of vabs_s8.

Ok for trunk (with previously-posted testcase) ?
--Alan

gcc/ChangeLog:

        * config/aarch64/aarch64.md (absdi2): Remove scratch operand by
        earlyclobbering result operand.

        * config/aarch64/aarch64-builtins.c (aarch64_types_unop_qualifiers):
        Remove final qualifier_internal.
        (aarch64_fold_builtin): Stop folding abs builtins, except on floats.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index c0881e643440dd688570bcbc2a849ae2d441225a..7a444053fc1aeb4f7d2764b7fd50a39fe0e4f83b 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -126,11 +126,9 @@ typedef struct
   enum aarch64_type_qualifiers *qualifiers;
 } aarch64_simd_builtin_datum;
 
-/*  The qualifier_internal allows generation of a unary builtin from
-    a pattern with a third pseudo-operand such as a match_scratch.  */
 static enum aarch64_type_qualifiers
 aarch64_types_unop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
-  = { qualifier_none, qualifier_none, qualifier_internal };
+  = { qualifier_none, qualifier_none };
 #define TYPES_UNOP (aarch64_types_unop_qualifiers)
 static enum aarch64_type_qualifiers
 aarch64_types_unopu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
@@ -1275,7 +1273,7 @@ aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args,
 
   switch (fcode)
     {
-      BUILTIN_VALLDI (UNOP, abs, 2)
+      BUILTIN_VDQF (UNOP, abs, 2)
 	return fold_build1 (ABS_EXPR, type, args[0]);
 	break;
       VAR1 (UNOP, floatv2si, 2, v2sf)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 17570ba026b45df097f8838751eed86d7ce03609..25ad10405a88ba6e629bb2bf671041a639876ff2 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1957,9 +1957,8 @@
 )
 
 (define_insn_and_split "absdi2"
-  [(set (match_operand:DI 0 "register_operand" "=r,w")
-	(abs:DI (match_operand:DI 1 "register_operand" "r,w")))
-   (clobber (match_scratch:DI 2 "=&r,X"))]
+  [(set (match_operand:DI 0 "register_operand" "=&r,w")
+	(abs:DI (match_operand:DI 1 "register_operand" "r,w")))]
   ""
   "@
    #
@@ -1969,7 +1968,7 @@
    && GP_REGNUM_P (REGNO (operands[1]))"
   [(const_int 0)]
   {
-    emit_insn (gen_rtx_SET (VOIDmode, operands[2],
+    emit_insn (gen_rtx_SET (VOIDmode, operands[0],
 			    gen_rtx_XOR (DImode,
 					 gen_rtx_ASHIFTRT (DImode,
 							   operands[1],
@@ -1978,7 +1977,7 @@
     emit_insn (gen_rtx_SET (VOIDmode,
 			    operands[0],
 			    gen_rtx_MINUS (DImode,
-					   operands[2],
+					   operands[0],
 					   gen_rtx_ASHIFTRT (DImode,
 							     operands[1],
 							     GEN_INT (63)))));

Reply via email to