Re: [PATCH][combine][1/2] Try to simplify before substituting
On 18/07/15 17:02, Segher Boessenkool wrote: On Fri, Jul 17, 2015 at 02:47:34PM -0600, Jeff Law wrote: I mean move the whole if (BINARY_P ... block to after the existing simplify calls, to just before the First see if we can apply comment, and not do a new simplify_rtx call at all. Does that work? Yes, and here's the patch. It just moves the simplification block. The effect on codegen in SPEC2006 on aarch64 looks sane in the same way as the original patch I posted (i.e. many redundant zero_extends eliminated) and together with patch 2/2 this helps in the -abs testcase. I'm bootstrapping this on aarch64, arm and x86. Any other testing would be appreciated. Is this version ok if testing comes clean? Thanks, Kyrill 2015-07-17 Kyrylo Tkachov kyrylo.tkac...@arm.com * combine.c (combine_simplify_rtx): Move simplification step before various transformations/substitutions. OK. jeff The patch improves generated code on most archs (or at least code size, which strongly correlates for combine), or is neutral. xtensa regresses a tiny bit; powerpc64 and hppa64 regress more. I analysed the powerpc64 differences, and it seems to be all down to code that is now expressed as (set (reg:DI) (lt:DI (reg:SI) (const_int 0))) where before it was a bit extract (of a subreg). The newly generated pattern is simper alright, but the backend didn't recognise it. With a simple patch, it does, and the generated code is nicely better than before. Thanks for analyzing. So will you submit a powerpc patch for this? I'm not familiar with the patterns there :) The hppa64 problem looks similar. Maybe other targets could use such an improvement as well. So yes, the patch is fine. Thank you for working on it Kyrill :-) x86_64, aarch64 and arm bootstrap passed successfully on my end and testing looked clean too. I've committed this patch with r225996 and 2/2 with r225997. Thanks for helping me work through this! Kyrill Segher
Re: [PATCH][combine][1/2] Try to simplify before substituting
On Fri, Jul 17, 2015 at 02:47:34PM -0600, Jeff Law wrote: I mean move the whole if (BINARY_P ... block to after the existing simplify calls, to just before the First see if we can apply comment, and not do a new simplify_rtx call at all. Does that work? Yes, and here's the patch. It just moves the simplification block. The effect on codegen in SPEC2006 on aarch64 looks sane in the same way as the original patch I posted (i.e. many redundant zero_extends eliminated) and together with patch 2/2 this helps in the -abs testcase. I'm bootstrapping this on aarch64, arm and x86. Any other testing would be appreciated. Is this version ok if testing comes clean? Thanks, Kyrill 2015-07-17 Kyrylo Tkachov kyrylo.tkac...@arm.com * combine.c (combine_simplify_rtx): Move simplification step before various transformations/substitutions. OK. jeff The patch improves generated code on most archs (or at least code size, which strongly correlates for combine), or is neutral. xtensa regresses a tiny bit; powerpc64 and hppa64 regress more. I analysed the powerpc64 differences, and it seems to be all down to code that is now expressed as (set (reg:DI) (lt:DI (reg:SI) (const_int 0))) where before it was a bit extract (of a subreg). The newly generated pattern is simper alright, but the backend didn't recognise it. With a simple patch, it does, and the generated code is nicely better than before. The hppa64 problem looks similar. Maybe other targets could use such an improvement as well. So yes, the patch is fine. Thank you for working on it Kyrill :-) Segher
Re: [PATCH][combine][1/2] Try to simplify before substituting
On 16/07/15 19:28, Segher Boessenkool wrote: On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote: If you always want to simplify first, does it work to move this whole big block behind the simplify just following it? Or do you want to simplify after the transform as well? You mean move this hunk outside the if (BINARY_P (x)...) block it's in? I think it would work, but I'm not sure if it would affect other cases. I was also conscious that simplify_rtx might not be a cheap function to call so frequently (or is it? I didn't profile it), so I tried to avoid calling it unless I need for the transformation in question here. I mean move the whole if (BINARY_P ... block to after the existing simplify calls, to just before the First see if we can apply comment, and not do a new simplify_rtx call at all. Does that work? Yes, and here's the patch. It just moves the simplification block. The effect on codegen in SPEC2006 on aarch64 looks sane in the same way as the original patch I posted (i.e. many redundant zero_extends eliminated) and together with patch 2/2 this helps in the -abs testcase. I'm bootstrapping this on aarch64, arm and x86. Any other testing would be appreciated. Is this version ok if testing comes clean? Thanks, Kyrill 2015-07-17 Kyrylo Tkachov kyrylo.tkac...@arm.com * combine.c (combine_simplify_rtx): Move simplification step before various transformations/substitutions. commit d7c7cd4acad9a34b3e991fd589125cc165f25c1f Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Thu Jul 16 19:42:58 2015 +0100 [combine] Alternative approach diff --git a/gcc/combine.c b/gcc/combine.c index 574f874..2f806ab 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -5489,6 +5489,51 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest, SUBST (XEXP (x, 1), temp); } + /* Try to fold this expression in case we have constants that weren't + present before. */ + temp = 0; + switch (GET_RTX_CLASS (code)) +{ +case RTX_UNARY: + if (op0_mode == VOIDmode) + op0_mode = GET_MODE (XEXP (x, 0)); + temp = simplify_unary_operation (code, mode, XEXP (x, 0), op0_mode); + break; +case RTX_COMPARE: +case RTX_COMM_COMPARE: + { + machine_mode cmp_mode = GET_MODE (XEXP (x, 0)); + if (cmp_mode == VOIDmode) + { + cmp_mode = GET_MODE (XEXP (x, 1)); + if (cmp_mode == VOIDmode) + cmp_mode = op0_mode; + } + temp = simplify_relational_operation (code, mode, cmp_mode, + XEXP (x, 0), XEXP (x, 1)); + } + break; +case RTX_COMM_ARITH: +case RTX_BIN_ARITH: + temp = simplify_binary_operation (code, mode, XEXP (x, 0), XEXP (x, 1)); + break; +case RTX_BITFIELD_OPS: +case RTX_TERNARY: + temp = simplify_ternary_operation (code, mode, op0_mode, XEXP (x, 0), + XEXP (x, 1), XEXP (x, 2)); + break; +default: + break; +} + + if (temp) +{ + x = temp; + code = GET_CODE (temp); + op0_mode = VOIDmode; + mode = GET_MODE (temp); +} + /* If this is a simple operation applied to an IF_THEN_ELSE, try applying it to the arms of the IF_THEN_ELSE. This often simplifies things. Check for cases where both arms are testing the same @@ -5588,51 +5633,6 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest, } } - /* Try to fold this expression in case we have constants that weren't - present before. */ - temp = 0; - switch (GET_RTX_CLASS (code)) -{ -case RTX_UNARY: - if (op0_mode == VOIDmode) - op0_mode = GET_MODE (XEXP (x, 0)); - temp = simplify_unary_operation (code, mode, XEXP (x, 0), op0_mode); - break; -case RTX_COMPARE: -case RTX_COMM_COMPARE: - { - machine_mode cmp_mode = GET_MODE (XEXP (x, 0)); - if (cmp_mode == VOIDmode) - { - cmp_mode = GET_MODE (XEXP (x, 1)); - if (cmp_mode == VOIDmode) - cmp_mode = op0_mode; - } - temp = simplify_relational_operation (code, mode, cmp_mode, - XEXP (x, 0), XEXP (x, 1)); - } - break; -case RTX_COMM_ARITH: -case RTX_BIN_ARITH: - temp = simplify_binary_operation (code, mode, XEXP (x, 0), XEXP (x, 1)); - break; -case RTX_BITFIELD_OPS: -case RTX_TERNARY: - temp = simplify_ternary_operation (code, mode, op0_mode, XEXP (x, 0), - XEXP (x, 1), XEXP (x, 2)); - break; -default: - break; -} - - if (temp) -{ - x = temp; - code = GET_CODE (temp); - op0_mode = VOIDmode; - mode = GET_MODE (temp); -} - /* First see if we can apply the inverse distributive law. */ if (code == PLUS || code == MINUS || code == AND || code == IOR || code == XOR)
Re: [PATCH][combine][1/2] Try to simplify before substituting
On Jul 17, 2015, at 7:36 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote: On 16/07/15 19:28, Segher Boessenkool wrote: On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote: If you always want to simplify first, does it work to move this whole big block behind the simplify just following it? Or do you want to simplify after the transform as well? You mean move this hunk outside the if (BINARY_P (x)...) block it's in? I think it would work, but I'm not sure if it would affect other cases. I was also conscious that simplify_rtx might not be a cheap function to call so frequently (or is it? I didn't profile it), so I tried to avoid calling it unless I need for the transformation in question here. I mean move the whole if (BINARY_P ... block to after the existing simplify calls, to just before the First see if we can apply comment, and not do a new simplify_rtx call at all. Does that work? Yes, and here's the patch. It just moves the simplification block. The effect on codegen in SPEC2006 on aarch64 looks sane in the same way as the original patch I posted (i.e. many redundant zero_extends eliminated) and together with patch 2/2 this helps in the -abs testcase. I'm bootstrapping this on aarch64, arm and x86. Any other testing would be appreciated. Is this version ok if testing comes clean? This combined with the other patch looks much better than the original target specific hack. And it goes to show that doing it in a non-target specific way can improve other cases you did not see before. Thanks, Andrew Thanks, Kyrill 2015-07-17 Kyrylo Tkachov kyrylo.tkac...@arm.com * combine.c (combine_simplify_rtx): Move simplification step before various transformations/substitutions. combine-2.patch
Re: [PATCH][combine][1/2] Try to simplify before substituting
On 07/17/2015 05:36 AM, Kyrill Tkachov wrote: On 16/07/15 19:28, Segher Boessenkool wrote: On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote: If you always want to simplify first, does it work to move this whole big block behind the simplify just following it? Or do you want to simplify after the transform as well? You mean move this hunk outside the if (BINARY_P (x)...) block it's in? I think it would work, but I'm not sure if it would affect other cases. I was also conscious that simplify_rtx might not be a cheap function to call so frequently (or is it? I didn't profile it), so I tried to avoid calling it unless I need for the transformation in question here. I mean move the whole if (BINARY_P ... block to after the existing simplify calls, to just before the First see if we can apply comment, and not do a new simplify_rtx call at all. Does that work? Yes, and here's the patch. It just moves the simplification block. The effect on codegen in SPEC2006 on aarch64 looks sane in the same way as the original patch I posted (i.e. many redundant zero_extends eliminated) and together with patch 2/2 this helps in the -abs testcase. I'm bootstrapping this on aarch64, arm and x86. Any other testing would be appreciated. Is this version ok if testing comes clean? Thanks, Kyrill 2015-07-17 Kyrylo Tkachov kyrylo.tkac...@arm.com * combine.c (combine_simplify_rtx): Move simplification step before various transformations/substitutions. OK. jeff
Re: [PATCH][combine][1/2] Try to simplify before substituting
On 16/07/15 19:28, Segher Boessenkool wrote: On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote: If you always want to simplify first, does it work to move this whole big block behind the simplify just following it? Or do you want to simplify after the transform as well? You mean move this hunk outside the if (BINARY_P (x)...) block it's in? I think it would work, but I'm not sure if it would affect other cases. I was also conscious that simplify_rtx might not be a cheap function to call so frequently (or is it? I didn't profile it), so I tried to avoid calling it unless I need for the transformation in question here. I mean move the whole if (BINARY_P ... block to after the existing simplify calls, to just before the First see if we can apply comment, and not do a new simplify_rtx call at all. Does that work? Yes, it does the transformation I want :) if it's combined (pardon the pun) with the simplify-rtx.c patch at https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01433.html Which brings the question why it wasn't there in the first place, hrm. Dunno, I'll test this approach more thoroughly tomorrow, check the impact on some codebases and propose a patch if it all works out. Thanks for the help, Kyrill Segher
[PATCH][combine][1/2] Try to simplify before substituting
Hi all, This is an attempt to solve the problem in the thread starting at https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01010.html in a generic way after some pointers from Segher and Andrew. The problem I got was that combine_simplify_rtx was trying to do some special handling of unary operations applied to if_then_else but ended up exiting early due to: enum rtx_code cond_code = simplify_comparison (NE, cond, cop1); if (cond_code == NE COMPARISON_P (cond)) return x; I tried removing that bug that led to regressions in SPEC2006. The solution that worked for me led to two patches. In this first patch we add a simplification step to the rtx before trying any substitutions. In the second patch I add the simplify-rtx.c simplification to transform - (y ? -x : x) into (!y ? -x : x) which fixes the testcase I mentioned in the first thread. This first patch by itself already showed to be an improvement for aarch64 with by managing to eliminate a large amount of redundant zero_extend operations in SPEC2006. Overall, I saw a 2.8% decrease in [su]xt[bhw] instructions generated for the whole of SPEC2006 and no regressions in code quality i.e. no instructions that were combined before but not combine with this patch. Bootstrapped and tested on arm, aarch64 and x86_64. Ok for trunk? Thanks, Kyrill 2015-07-16 Kyrylo Tkachov kyrylo.tkac...@arm.com * combine.c (combine_simplify_rtx): Try to simplify if_then_else rtxes before trying substitutions. commit 685bc1a66a36292329f678bae555e9c43e434e5d Author: Kyrylo Tkachov kyrylo.tkac...@arm.com Date: Thu Jul 9 16:54:23 2015 +0100 [combine] Try to simplify before substituting diff --git a/gcc/combine.c b/gcc/combine.c index 574f874..40d2231 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -5510,6 +5510,17 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest, { rtx cond, true_rtx, false_rtx; + /* If some simplification is possible from the start, try it now. */ + temp = simplify_rtx (x); + + if (temp) + { + x = temp; + code = GET_CODE (x); + mode = GET_MODE (x); + op0_mode = VOIDmode; + } + cond = if_then_else_cond (x, true_rtx, false_rtx); if (cond != 0 /* If everything is a comparison, what we have is highly unlikely
Re: [PATCH][combine][1/2] Try to simplify before substituting
On Thu, Jul 16, 2015 at 04:25:14PM +0100, Kyrill Tkachov wrote: Hi all, This is an attempt to solve the problem in the thread starting at https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01010.html in a generic way after some pointers from Segher and Andrew. The problem I got was that combine_simplify_rtx was trying to do some special handling of unary operations applied to if_then_else but ended up exiting early due to: enum rtx_code cond_code = simplify_comparison (NE, cond, cop1); if (cond_code == NE COMPARISON_P (cond)) return x; I tried removing that bug that led to regressions in SPEC2006. The solution that worked for me led to two patches. In this first patch we add a simplification step to the rtx before trying any substitutions. diff --git a/gcc/combine.c b/gcc/combine.c index 574f874..40d2231 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -5510,6 +5510,17 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest, { rtx cond, true_rtx, false_rtx; + /* If some simplification is possible from the start, try it now. */ + temp = simplify_rtx (x); + + if (temp) + { + x = temp; + code = GET_CODE (x); + mode = GET_MODE (x); + op0_mode = VOIDmode; + } + cond = if_then_else_cond (x, true_rtx, false_rtx); if (cond != 0 /* If everything is a comparison, what we have is highly unlikely If you always want to simplify first, does it work to move this whole big block behind the simplify just following it? Or do you want to simplify after the transform as well? Segher
Re: [PATCH][combine][1/2] Try to simplify before substituting
On Thu, Jul 16, 2015 at 07:17:54PM +0100, Kyrill Tkachov wrote: If you always want to simplify first, does it work to move this whole big block behind the simplify just following it? Or do you want to simplify after the transform as well? You mean move this hunk outside the if (BINARY_P (x)...) block it's in? I think it would work, but I'm not sure if it would affect other cases. I was also conscious that simplify_rtx might not be a cheap function to call so frequently (or is it? I didn't profile it), so I tried to avoid calling it unless I need for the transformation in question here. I mean move the whole if (BINARY_P ... block to after the existing simplify calls, to just before the First see if we can apply comment, and not do a new simplify_rtx call at all. Does that work? Which brings the question why it wasn't there in the first place, hrm. Segher
Re: [PATCH][combine][1/2] Try to simplify before substituting
On 16/07/15 19:13, Segher Boessenkool wrote: On Thu, Jul 16, 2015 at 04:25:14PM +0100, Kyrill Tkachov wrote: Hi all, This is an attempt to solve the problem in the thread starting at https://gcc.gnu.org/ml/gcc-patches/2015-07/msg01010.html in a generic way after some pointers from Segher and Andrew. The problem I got was that combine_simplify_rtx was trying to do some special handling of unary operations applied to if_then_else but ended up exiting early due to: enum rtx_code cond_code = simplify_comparison (NE, cond, cop1); if (cond_code == NE COMPARISON_P (cond)) return x; I tried removing that bug that led to regressions in SPEC2006. The solution that worked for me led to two patches. In this first patch we add a simplification step to the rtx before trying any substitutions. diff --git a/gcc/combine.c b/gcc/combine.c index 574f874..40d2231 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -5510,6 +5510,17 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest, { rtx cond, true_rtx, false_rtx; + /* If some simplification is possible from the start, try it now. */ + temp = simplify_rtx (x); + + if (temp) + { + x = temp; + code = GET_CODE (x); + mode = GET_MODE (x); + op0_mode = VOIDmode; + } + cond = if_then_else_cond (x, true_rtx, false_rtx); if (cond != 0 /* If everything is a comparison, what we have is highly unlikely If you always want to simplify first, does it work to move this whole big block behind the simplify just following it? Or do you want to simplify after the transform as well? You mean move this hunk outside the if (BINARY_P (x)...) block it's in? I think it would work, but I'm not sure if it would affect other cases. I was also conscious that simplify_rtx might not be a cheap function to call so frequently (or is it? I didn't profile it), so I tried to avoid calling it unless I need for the transformation in question here. Kyrill Segher