Re: [PATCH][combine][v2] Canonicalise (r + r) to (r << 1) to aid recognition

2016-01-05 Thread Kyrill Tkachov

Hi all,

On 18/12/15 14:39, Bernd Schmidt wrote:
> On 12/18/2015 03:29 PM, Kyrill Tkachov wrote:
>> Bootstrapped and tested on arm, aarch64, x86_64.
>> As before, there were no codegen differences for SPEC2006 on x86_64.
>> aarch64 SPEC2006 sees the effects described above.
>
> I think this is OK. There may be some question as to whether this is a bug fix,
> so please wait a few days for objections before committing. I'm assuming
> there's no place which tries to do the opposite transformation...



As I didn't hear any objections, I'll be committing this later today.

Thanks for the reviews.

Kyrill



> Bernd





Re: [PATCH][combine][v2] Canonicalise (r + r) to (r << 1) to aid recognition

2015-12-18 Thread Bernd Schmidt

On 12/18/2015 03:29 PM, Kyrill Tkachov wrote:
> Bootstrapped and tested on arm, aarch64, x86_64.
> As before, there were no codegen differences for SPEC2006 on x86_64.
> aarch64 SPEC2006 sees the effects described above.

I think this is OK. There may be some question as to whether this is a
bug fix, so please wait a few days for objections before committing. I'm
assuming there's no place which tries to do the opposite transformation...



Bernd



[PATCH][combine][v2] Canonicalise (r + r) to (r << 1) to aid recognition

2015-12-18 Thread Kyrill Tkachov

Hi all,

Following up from https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01723.html
here is the patch that makes combine canonicalise x + x expressions into x << 1.
This allows for more simplification opportunities, as well as increasing the
recognition opportunities on targets that support combined arithmetic and
shift instructions, like aarch64 and arm.
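
To illustrate the idea at the source level, here is a minimal C sketch. The
function names are hypothetical and are not part of the patch. For unsigned
operands, x + x and x << 1 always produce the same value, so choosing the
shift as the single canonical form loses nothing, and an expression such as
(x << 1) | y then has the shifted-operand shape that a combined
arithmetic-and-shift instruction can match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not code from the patch.  */

/* The form combine used to see: a value added to itself.  */
static uint32_t
twice_via_plus (uint32_t x)
{
  return x + x;
}

/* The canonical form after the patch: a left shift by one.  */
static uint32_t
twice_via_shift (uint32_t x)
{
  return x << 1;
}

/* (x << 1) | y is the shape a shifted-operand OR instruction can match.  */
static uint32_t
or_with_doubled (uint32_t x, uint32_t y)
{
  return twice_via_shift (x) | y;
}

/* Check that both forms agree, including on values that wrap around.  */
static void
check_equivalence (void)
{
  const uint32_t samples[] = { 0u, 1u, 21u, 0x7fffffffu, 0x80000000u,
			       0xffffffffu };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (twice_via_plus (samples[i]) == twice_via_shift (samples[i]));
}
```

The sketch uses unsigned arithmetic so that wrap-around is well defined in C;
the RTL-level transformation in the patch is restricted to MODE_INT modes.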

This has the same effect on aarch64 as my first attempt, i.e. it increases the
combination opportunities for -mcpu=cortex-a53, with the added effect that
simple register adds of the form "add x1, x0, x0" are now transformed into
shifts "lsl x1, x0, #1".

It has been suggested in that thread that if the target wants to distinguish
between a shift-by-one and the plus form then it should match the shift form
and explicitly output the instruction pattern for the plus form.

This would be, of course, a separate aarch64-specific patch.
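
As a rough sketch of what that suggestion could amount to, the C fragment
below picks an output template once the shift form has been matched. It is
purely illustrative: the function name, the prefer_add flag and the template
strings are assumptions and do not come from the aarch64 backend or from
this patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch, not aarch64 backend code.  Given an already
   matched left shift, pick the assembly template to emit.  A target
   that prefers the plus form for a shift by one emits an add of the
   register to itself; every other case keeps the shift.  */
static const char *
choose_double_template (int shift_amount, int prefer_add)
{
  if (shift_amount == 1 && prefer_add)
    return "add\t%w0, %w1, %w1";
  return "lsl\t%w0, %w1, %2";
}

/* Small self-check of the selection logic.  */
static void
check_templates (void)
{
  assert (strncmp (choose_double_template (1, 1), "add", 3) == 0);
  assert (strncmp (choose_double_template (1, 0), "lsl", 3) == 0);
  assert (strncmp (choose_double_template (3, 1), "lsl", 3) == 0);
}
```

The point of the design is that the RTL stays in the single canonical shift
form, and only the final output step decides which mnemonic to print.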

Bootstrapped and tested on arm, aarch64, x86_64.
As before, there were no codegen differences for SPEC2006 on x86_64.
aarch64 SPEC2006 sees the effects described above.

How does this approach look?

2015-12-18  Kyrylo Tkachov  

PR rtl-optimization/68651
* combine.c (combine_simplify_rtx): Canonicalize x + x into
x << 1.

2015-12-18  Kyrylo Tkachov  

PR rtl-optimization/68651
* gcc.target/aarch64/pr68651_1.c: New test.
diff --git a/gcc/combine.c b/gcc/combine.c
index 64d334e0b2a6d731310e67779ad3fd74c326a186..dc0d4bd52c717b88608d21dbaffe444eeb68bb2d 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5897,6 +5897,13 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
 			  || XEXP (temp, 1) != XEXP (x, 0)
 	return temp;
 	}
+
+  /* Canonicalize x + x into x << 1.  */
+  if (GET_MODE_CLASS (mode) == MODE_INT
+	  && rtx_equal_p (XEXP (x, 0), XEXP (x, 1))
+	  && !side_effects_p (XEXP (x, 0)))
+	return simplify_gen_binary (ASHIFT, mode, XEXP (x, 0), const1_rtx);
+
   break;
 
 case MINUS:
diff --git a/gcc/testsuite/gcc.target/aarch64/pr68651_1.c b/gcc/testsuite/gcc.target/aarch64/pr68651_1.c
new file mode 100644
index ..ef9456f538776e7db01ecf5473425aed9efd9de2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr68651_1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a53" } */
+
+int
+foo (int x)
+{
+  return (x * 2) & 65535;
+}
+/* { dg-final { scan-assembler "ubfiz\tw\[0-9\]*, w\[0-9\]*.*\n" } } */
+
+int
+bar (int x, int y)
+{
+  return (x * 2) | y;
+}
+/* { dg-final { scan-assembler "orr\tw\[0-9\]*, w\[0-9\]*, w\[0-9\]*, lsl 1.*\n" } } */