Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-20 Thread Kyrill Tkachov


On 18/07/15 17:02, Segher Boessenkool wrote:

On Fri, Jul 17, 2015 at 02:47:34PM -0600, Jeff Law wrote:

I mean move the whole if (BINARY_P ... block to after the existing
simplify calls, to just before the First see if we can apply comment,
and not do a new simplify_rtx call at all.  Does that work?

Yes, and here's the patch.
It just moves the simplification block.
The effect on codegen in SPEC2006 on aarch64 looks sane in the same
way as the original patch I posted (i.e. many redundant zero_extends
eliminated)
and together with patch 2/2 this helps in the -abs testcase.

I'm bootstrapping this on aarch64, arm and x86.
Any other testing would be appreciated.

Is this version ok if testing comes clean?

Thanks,
Kyrill

2015-07-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * combine.c (combine_simplify_rtx): Move simplification step
 before various transformations/substitutions.

OK.
jeff

The patch improves generated code on most archs (or at least code size,
which strongly correlates for combine), or is neutral.  xtensa regresses
a tiny bit; powerpc64 and hppa64 regress more.  I analysed the powerpc64
differences, and it seems to be all down to code that is now expressed as

(set (reg:DI) (lt:DI (reg:SI) (const_int 0)))

where before it was a bit extract (of a subreg).  The newly generated
pattern is simper alright, but the backend didn't recognise it.  With a
simple patch, it does, and the generated code is nicely better than
before.


Thanks for analyzing. So will you submit a powerpc patch
for this? I'm not familiar with the patterns there :)



The hppa64 problem looks similar.  Maybe other targets could use such
an improvement as well.

So yes, the patch is fine.  Thank you for working on it Kyrill :-)


x86_64, aarch64 and arm bootstrap passed successfully on my end
and testing looked clean too.
I've committed this patch with r225996 and 2/2 with r225997.

Thanks for helping me work through this!
Kyrill




Segher





Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-18 Thread Segher Boessenkool
On Fri, Jul 17, 2015 at 02:47:34PM -0600, Jeff Law wrote:
 I mean move the whole if (BINARY_P ... block to after the existing
 simplify calls, to just before the First see if we can apply comment,
 and not do a new simplify_rtx call at all.  Does that work?
 
 Yes, and here's the patch.
 It just moves the simplification block.
 The effect on codegen in SPEC2006 on aarch64 looks sane in the same
 way as the original patch I posted (i.e. many redundant zero_extends
 eliminated)
 and together with patch 2/2 this helps in the -abs testcase.
 
 I'm bootstrapping this on aarch64, arm and x86.
 Any other testing would be appreciated.
 
 Is this version ok if testing comes clean?
 
 Thanks,
 Kyrill
 
 2015-07-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
  * combine.c (combine_simplify_rtx): Move simplification step
  before various transformations/substitutions.
 OK.
 jeff

The patch improves generated code on most archs (or at least code size,
which strongly correlates for combine), or is neutral.  xtensa regresses
a tiny bit; powerpc64 and hppa64 regress more.  I analysed the powerpc64
differences, and it seems to be all down to code that is now expressed as

(set (reg:DI) (lt:DI (reg:SI) (const_int 0)))

where before it was a bit extract (of a subreg).  The newly generated
pattern is simper alright, but the backend didn't recognise it.  With a
simple patch, it does, and the generated code is nicely better than
before.

The hppa64 problem looks similar.  Maybe other targets could use such
an improvement as well.

So yes, the patch is fine.  Thank you for working on it Kyrill :-)


Segher


Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-17 Thread Kyrill Tkachov


On 16/07/15 19:28, Segher Boessenkool wrote:

On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote:

If you always want to simplify first, does it work to move this whole big
block behind the simplify just following it?  Or do you want to simplify
after the transform as well?

You mean move this hunk outside the if (BINARY_P (x)...) block it's in?
I think it would work, but I'm not sure if it would affect other cases.
I was also conscious that simplify_rtx might not be a cheap function to call
so frequently (or is it? I didn't profile it), so I tried to avoid calling
it unless I need for the transformation in question here.

I mean move the whole if (BINARY_P ... block to after the existing
simplify calls, to just before the First see if we can apply comment,
and not do a new simplify_rtx call at all.  Does that work?


Yes, and here's the patch.
It just moves the simplification block.
The effect on codegen in SPEC2006 on aarch64 looks sane in the same
way as the original patch I posted (i.e. many redundant zero_extends eliminated)
and together with patch 2/2 this helps in the -abs testcase.

I'm bootstrapping this on aarch64, arm and x86.
Any other testing would be appreciated.

Is this version ok if testing comes clean?

Thanks,
Kyrill

2015-07-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* combine.c (combine_simplify_rtx): Move simplification step
before various transformations/substitutions.


commit d7c7cd4acad9a34b3e991fd589125cc165f25c1f
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Thu Jul 16 19:42:58 2015 +0100

[combine] Alternative approach

diff --git a/gcc/combine.c b/gcc/combine.c
index 574f874..2f806ab 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5489,6 +5489,51 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
   SUBST (XEXP (x, 1), temp);
 }
 
+  /* Try to fold this expression in case we have constants that weren't
+ present before.  */
+  temp = 0;
+  switch (GET_RTX_CLASS (code))
+{
+case RTX_UNARY:
+  if (op0_mode == VOIDmode)
+	op0_mode = GET_MODE (XEXP (x, 0));
+  temp = simplify_unary_operation (code, mode, XEXP (x, 0), op0_mode);
+  break;
+case RTX_COMPARE:
+case RTX_COMM_COMPARE:
+  {
+	machine_mode cmp_mode = GET_MODE (XEXP (x, 0));
+	if (cmp_mode == VOIDmode)
+	  {
+	cmp_mode = GET_MODE (XEXP (x, 1));
+	if (cmp_mode == VOIDmode)
+	  cmp_mode = op0_mode;
+	  }
+	temp = simplify_relational_operation (code, mode, cmp_mode,
+	  XEXP (x, 0), XEXP (x, 1));
+  }
+  break;
+case RTX_COMM_ARITH:
+case RTX_BIN_ARITH:
+  temp = simplify_binary_operation (code, mode, XEXP (x, 0), XEXP (x, 1));
+  break;
+case RTX_BITFIELD_OPS:
+case RTX_TERNARY:
+  temp = simplify_ternary_operation (code, mode, op0_mode, XEXP (x, 0),
+	 XEXP (x, 1), XEXP (x, 2));
+  break;
+default:
+  break;
+}
+
+  if (temp)
+{
+  x = temp;
+  code = GET_CODE (temp);
+  op0_mode = VOIDmode;
+  mode = GET_MODE (temp);
+}
+
   /* If this is a simple operation applied to an IF_THEN_ELSE, try
  applying it to the arms of the IF_THEN_ELSE.  This often simplifies
  things.  Check for cases where both arms are testing the same
@@ -5588,51 +5633,6 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
 	}
 }
 
-  /* Try to fold this expression in case we have constants that weren't
- present before.  */
-  temp = 0;
-  switch (GET_RTX_CLASS (code))
-{
-case RTX_UNARY:
-  if (op0_mode == VOIDmode)
-	op0_mode = GET_MODE (XEXP (x, 0));
-  temp = simplify_unary_operation (code, mode, XEXP (x, 0), op0_mode);
-  break;
-case RTX_COMPARE:
-case RTX_COMM_COMPARE:
-  {
-	machine_mode cmp_mode = GET_MODE (XEXP (x, 0));
-	if (cmp_mode == VOIDmode)
-	  {
-	cmp_mode = GET_MODE (XEXP (x, 1));
-	if (cmp_mode == VOIDmode)
-	  cmp_mode = op0_mode;
-	  }
-	temp = simplify_relational_operation (code, mode, cmp_mode,
-	  XEXP (x, 0), XEXP (x, 1));
-  }
-  break;
-case RTX_COMM_ARITH:
-case RTX_BIN_ARITH:
-  temp = simplify_binary_operation (code, mode, XEXP (x, 0), XEXP (x, 1));
-  break;
-case RTX_BITFIELD_OPS:
-case RTX_TERNARY:
-  temp = simplify_ternary_operation (code, mode, op0_mode, XEXP (x, 0),
-	 XEXP (x, 1), XEXP (x, 2));
-  break;
-default:
-  break;
-}
-
-  if (temp)
-{
-  x = temp;
-  code = GET_CODE (temp);
-  op0_mode = VOIDmode;
-  mode = GET_MODE (temp);
-}
-
   /* First see if we can apply the inverse distributive law.  */
   if (code == PLUS || code == MINUS
   || code == AND || code == IOR || code == XOR)


Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-17 Thread Pinski, Andrew




 On Jul 17, 2015, at 7:36 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 
 
 On 16/07/15 19:28, Segher Boessenkool wrote:
 On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote:
 If you always want to simplify first, does it work to move this whole big
 block behind the simplify just following it?  Or do you want to simplify
 after the transform as well?
 You mean move this hunk outside the if (BINARY_P (x)...) block it's in?
 I think it would work, but I'm not sure if it would affect other cases.
 I was also conscious that simplify_rtx might not be a cheap function to call
 so frequently (or is it? I didn't profile it), so I tried to avoid calling
 it unless I need for the transformation in question here.
 I mean move the whole if (BINARY_P ... block to after the existing
 simplify calls, to just before the First see if we can apply comment,
 and not do a new simplify_rtx call at all.  Does that work?
 
 Yes, and here's the patch.
 It just moves the simplification block.
 The effect on codegen in SPEC2006 on aarch64 looks sane in the same
 way as the original patch I posted (i.e. many redundant zero_extends 
 eliminated)
 and together with patch 2/2 this helps in the -abs testcase.
 
 I'm bootstrapping this on aarch64, arm and x86.
 Any other testing would be appreciated.
 
 Is this version ok if testing comes clean?

This combined with the other patch looks much better than the original target 
specific hack. And it goes to show that doing it in a non-target specific way 
can improve other cases you did not see before. 

Thanks,
Andrew

 
 Thanks,
 Kyrill
 
 2015-07-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
* combine.c (combine_simplify_rtx): Move simplification step
before various transformations/substitutions.
 
 
 combine-2.patch


Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-17 Thread Jeff Law

On 07/17/2015 05:36 AM, Kyrill Tkachov wrote:


On 16/07/15 19:28, Segher Boessenkool wrote:

On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote:

If you always want to simplify first, does it work to move this
whole big
block behind the simplify just following it?  Or do you want to
simplify
after the transform as well?

You mean move this hunk outside the if (BINARY_P (x)...) block it's
in?
I think it would work, but I'm not sure if it would affect other cases.
I was also conscious that simplify_rtx might not be a cheap function
to call
so frequently (or is it? I didn't profile it), so I tried to avoid
calling
it unless I need for the transformation in question here.

I mean move the whole if (BINARY_P ... block to after the existing
simplify calls, to just before the First see if we can apply comment,
and not do a new simplify_rtx call at all.  Does that work?


Yes, and here's the patch.
It just moves the simplification block.
The effect on codegen in SPEC2006 on aarch64 looks sane in the same
way as the original patch I posted (i.e. many redundant zero_extends
eliminated)
and together with patch 2/2 this helps in the -abs testcase.

I'm bootstrapping this on aarch64, arm and x86.
Any other testing would be appreciated.

Is this version ok if testing comes clean?

Thanks,
Kyrill

2015-07-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * combine.c (combine_simplify_rtx): Move simplification step
 before various transformations/substitutions.

OK.
jeff


Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Kyrill Tkachov


On 16/07/15 19:28, Segher Boessenkool wrote:

On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote:

If you always want to simplify first, does it work to move this whole big
block behind the simplify just following it?  Or do you want to simplify
after the transform as well?

You mean move this hunk outside the if (BINARY_P (x)...) block it's in?
I think it would work, but I'm not sure if it would affect other cases.
I was also conscious that simplify_rtx might not be a cheap function to call
so frequently (or is it? I didn't profile it), so I tried to avoid calling
it unless I need for the transformation in question here.

I mean move the whole if (BINARY_P ... block to after the existing
simplify calls, to just before the First see if we can apply comment,
and not do a new simplify_rtx call at all.  Does that work?


Yes, it does the transformation I want :) if it's combined (pardon the pun)
with the simplify-rtx.c patch at 
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01433.html



Which brings the question why it wasn't there in the first place, hrm.


Dunno, I'll test this approach more thoroughly tomorrow, check the impact on
some codebases and propose a patch if it all works out.

Thanks for the help,
Kyrill




Segher





[PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Kyrill Tkachov

Hi all,

This is an attempt to solve the problem in the thread starting at
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01010.html
in a generic way after some pointers from Segher and Andrew.

The problem I got was that combine_simplify_rtx was trying to
do some special handling of unary operations applied to if_then_else
but ended up exiting early due to:

  enum rtx_code cond_code = simplify_comparison (NE, cond, cop1);

  if (cond_code == NE  COMPARISON_P (cond))
return x;

I tried removing that bug that led to regressions in SPEC2006.
The solution that worked for me led to two patches.

In this first patch we add a simplification step to the rtx before trying any 
substitutions.
In the second patch I add the simplify-rtx.c simplification to transform - (y ? 
-x : x)
into (!y ? -x : x) which fixes the testcase I mentioned in the first thread.

This first patch by itself already showed to be an improvement for aarch64 with 
by managing
to eliminate a large amount of redundant zero_extend operations in SPEC2006.
Overall, I saw a 2.8% decrease in [su]xt[bhw] instructions generated for the 
whole of SPEC2006
and no regressions in code quality i.e. no instructions that were combined 
before but not combine
with this patch.

Bootstrapped and tested on arm, aarch64 and x86_64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* combine.c (combine_simplify_rtx): Try to simplify if_then_else
rtxes before trying substitutions.
commit 685bc1a66a36292329f678bae555e9c43e434e5d
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Thu Jul 9 16:54:23 2015 +0100

[combine] Try to simplify before substituting

diff --git a/gcc/combine.c b/gcc/combine.c
index 574f874..40d2231 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5510,6 +5510,17 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest,
 {
   rtx cond, true_rtx, false_rtx;
 
+  /* If some simplification is possible from the start, try it now.  */
+  temp = simplify_rtx (x);
+
+  if (temp)
+	{
+	  x = temp;
+	  code = GET_CODE (x);
+	  mode = GET_MODE (x);
+	  op0_mode = VOIDmode;
+	}
+
   cond = if_then_else_cond (x, true_rtx, false_rtx);
   if (cond != 0
 	  /* If everything is a comparison, what we have is highly unlikely


Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Segher Boessenkool
On Thu, Jul 16, 2015 at 04:25:14PM +0100, Kyrill Tkachov wrote:
 Hi all,
 
 This is an attempt to solve the problem in the thread starting at
 https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01010.html
 in a generic way after some pointers from Segher and Andrew.
 
 The problem I got was that combine_simplify_rtx was trying to
 do some special handling of unary operations applied to if_then_else
 but ended up exiting early due to:
 
   enum rtx_code cond_code = simplify_comparison (NE, cond, cop1);
 
   if (cond_code == NE  COMPARISON_P (cond))
 return x;
 
 I tried removing that bug that led to regressions in SPEC2006.
 The solution that worked for me led to two patches.
 
 In this first patch we add a simplification step to the rtx before trying 
 any substitutions.

 diff --git a/gcc/combine.c b/gcc/combine.c
 index 574f874..40d2231 100644
 --- a/gcc/combine.c
 +++ b/gcc/combine.c
 @@ -5510,6 +5510,17 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, 
 int in_dest,
  {
rtx cond, true_rtx, false_rtx;
  
 +  /* If some simplification is possible from the start, try it now.  */
 +  temp = simplify_rtx (x);
 +
 +  if (temp)
 + {
 +   x = temp;
 +   code = GET_CODE (x);
 +   mode = GET_MODE (x);
 +   op0_mode = VOIDmode;
 + }
 +
cond = if_then_else_cond (x, true_rtx, false_rtx);
if (cond != 0
 /* If everything is a comparison, what we have is highly unlikely

If you always want to simplify first, does it work to move this whole big
block behind the simplify just following it?  Or do you want to simplify
after the transform as well?


Segher


Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Segher Boessenkool
On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote:
 If you always want to simplify first, does it work to move this whole big
 block behind the simplify just following it?  Or do you want to simplify
 after the transform as well?
 
 You mean move this hunk outside the if (BINARY_P (x)...) block it's in?
 I think it would work, but I'm not sure if it would affect other cases.
 I was also conscious that simplify_rtx might not be a cheap function to call
 so frequently (or is it? I didn't profile it), so I tried to avoid calling
 it unless I need for the transformation in question here.

I mean move the whole if (BINARY_P ... block to after the existing
simplify calls, to just before the First see if we can apply comment,
and not do a new simplify_rtx call at all.  Does that work?

Which brings the question why it wasn't there in the first place, hrm.


Segher


Re: [PATCH][combine][1/2] Try to simplify before substituting

2015-07-16 Thread Kyrill Tkachov


On 16/07/15 19:13, Segher Boessenkool wrote:

On Thu, Jul 16, 2015 at 04:25:14PM +0100, Kyrill Tkachov wrote:

Hi all,

This is an attempt to solve the problem in the thread starting at
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01010.html
in a generic way after some pointers from Segher and Andrew.

The problem I got was that combine_simplify_rtx was trying to
do some special handling of unary operations applied to if_then_else
but ended up exiting early due to:

   enum rtx_code cond_code = simplify_comparison (NE, cond, cop1);

   if (cond_code == NE  COMPARISON_P (cond))
 return x;

I tried removing that bug that led to regressions in SPEC2006.
The solution that worked for me led to two patches.

In this first patch we add a simplification step to the rtx before trying
any substitutions.
diff --git a/gcc/combine.c b/gcc/combine.c
index 574f874..40d2231 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -5510,6 +5510,17 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int 
in_dest,
  {
rtx cond, true_rtx, false_rtx;
  
+  /* If some simplification is possible from the start, try it now.  */

+  temp = simplify_rtx (x);
+
+  if (temp)
+   {
+ x = temp;
+ code = GET_CODE (x);
+ mode = GET_MODE (x);
+ op0_mode = VOIDmode;
+   }
+
cond = if_then_else_cond (x, true_rtx, false_rtx);
if (cond != 0
  /* If everything is a comparison, what we have is highly unlikely

If you always want to simplify first, does it work to move this whole big
block behind the simplify just following it?  Or do you want to simplify
after the transform as well?


You mean move this hunk outside the if (BINARY_P (x)...) block it's in?
I think it would work, but I'm not sure if it would affect other cases.
I was also conscious that simplify_rtx might not be a cheap function to call
so frequently (or is it? I didn't profile it), so I tried to avoid calling
it unless I need for the transformation in question here.

Kyrill




Segher