Re: [RFC]: Remove Mem/address type assumption in combiner
On 05/16/2015 07:55 PM, Hans-Peter Nilsson wrote: On Sat, 16 May 2015, Segher Boessenkool wrote: On Sat, May 16, 2015 at 12:36:38PM -0400, Hans-Peter Nilsson wrote: On Sat, 16 May 2015, Segher Boessenkool wrote: On Fri, May 15, 2015 at 10:40:48PM -0400, Hans-Peter Nilsson wrote: I confess the test-case-guarded addi pattern should have been expressed with a shift in addition to the multiplication. But they wouldn't ever match so they might very well have bitrotted by now :-( It seems you're saying that the canonicalization to ashift didn't work *at all*, when starting with an expression from an address? I knew it failed in part, but always thought it was just a partial failure. With a plus or minus combine would always write it as a mult. I don't think any other pass would create this combination. I haven't tested it though. Ports probably also generate that internally at various RTL passes, something that takes a bit more than an at-a-glance code inspection. Correct. THe PA port for example has a ton of this kind of RTL rewriting to exploit the shift-add insns and scaled indexed addressing modes (and correct for some oddities in the PA chip where the scaled modes don't exist in every context where you'd think they should). And you still have to to worry about things like combine taking a (mem (plus (mult))), selecting the (plus (mult)) as a split point and failing to canonicalize it into the ashift form. I ran into that while fixing up the PA for these changes. The good news is with two trivial combine changes and the expected changes to the PA backend, I can get code generation back to where it was before across my sample testfiles. Jeff
Re: [RFC]: Remove Mem/address type assumption in combiner
On Sat, May 16, 2015 at 09:55:08PM -0400, Hans-Peter Nilsson wrote: With a plus or minus combine would always write it as a mult. I don't think any other pass would create this combination. I haven't tested it though. Ports probably also generate that internally at various RTL passes, something that takes a bit more than an at-a-glance code inspection. Yeah, true. Hopefully port maintainers will know about what the port does, or at least notice the ICEs. I've been trying to come up with valid reasons there'd be ever be canonicalization by multiplication, but failed so I guess I'll rip it out. The unreachable thing should quickly tell you if that guess is wrong. Not something you want to leave in a production compiler, of course. I think you misunderstood; I mean that I pondered philosophical reasons to change the canonical representation; if there was some reasoning, current or looking forward, where we'd add back mult as non-address-canonical RTL. Ah. Yes, same here -- I don't see any situation where having mult instead of shift outside of a mem would be more convenient. Actually, there are two things you could help with: - (pointer-to) step-by-step instructions to recreate the Linux (kernel :) build, as those you did before for a multiple of targets. I'd like to know I fixed your observations. I used http://git.infradead.org/users/segher/buildall.git; it has a README. I see that doc is a little out of date, I'll update. - add a preferred canonicalization function to do conversion to/from memory-address-canonical RTL. Like fwprop.c:canonicalize_address (just not static :) and maybe also a canonicalize_nonaddress. At the moment, ports call replace_equiv_address (family of functions) when changing address RTL, but that code-path (at a glance) doesn't canonicalize whatever non-address-canonical RTL you throw at it. Maybe it should? Maybe validize_mem is nicer? It depends much what you really want to change. Does simplify_rtx not do what you want for nonaddress? Segher
Re: [RFC]: Remove Mem/address type assumption in combiner
On Sun, 17 May 2015, Segher Boessenkool wrote: I used http://git.infradead.org/users/segher/buildall.git; it has a README. I see that doc is a little out of date, I'll update. (git: not http: for cloning) Thanks, looks useful. Hm, maybe we already mention this in the wiki... - add a preferred canonicalization function to do conversion to/from memory-address-canonical RTL. Like fwprop.c:canonicalize_address (just not static :) and maybe also a canonicalize_nonaddress. At the moment, ports call replace_equiv_address (family of functions) when changing address RTL, but that code-path (at a glance) doesn't canonicalize whatever non-address-canonical RTL you throw at it. Maybe it should? Maybe validize_mem is nicer? It depends much what you really want to change. I'd had made use of a function that'd automatically rewrite an ASHIFT into a MULT. No, validize_mem is kind-of a sibling to replace_equiv_address with just the containing MEM and will just call it, says the code. A call to replace_equiv_address ( Co.) will *break out* invalid expressions like an ASHIFT - or at least is supposed to do that. But that'd be mostly for convenience of other ports; I've coped now, I think. Does simplify_rtx not do what you want for nonaddress? Oh, right, that's already what we have, thanks. brgds, H-P
Re: [RFC]: Remove Mem/address type assumption in combiner
On Sat, May 16, 2015 at 12:36:38PM -0400, Hans-Peter Nilsson wrote: On Sat, 16 May 2015, Segher Boessenkool wrote: On Fri, May 15, 2015 at 10:40:48PM -0400, Hans-Peter Nilsson wrote: I confess the test-case-guarded addi pattern should have been expressed with a shift in addition to the multiplication. But they wouldn't ever match so they might very well have bitrotted by now :-( It seems you're saying that the canonicalization to ashift didn't work *at all*, when starting with an expression from an address? I knew it failed in part, but always thought it was just a partial failure. With a plus or minus combine would always write it as a mult. I don't think any other pass would create this combination. I haven't tested it though. (In addition to as the canonically wrong one used to be the combine-matching pattern; I'm not sure I should really drop that just yet.) It is harmless to leave it in. It will rot though, eventually -- better take it out before then. Add some gcc_unreachable, perhaps. I've been trying to come up with valid reasons there'd be ever be canonicalization by multiplication, but failed so I guess I'll rip it out. The unreachable thing should quickly tell you if that guess is wrong. Not something you want to leave in a production compiler, of course. Looks like quite some work for you, I'm sorry about that, It's almost over, at least the editing part... Great to hear that! Segher
Re: [RFC]: Remove Mem/address type assumption in combiner
On Fri, May 15, 2015 at 10:40:48PM -0400, Hans-Peter Nilsson wrote: I confess the test-case-guarded addi pattern should have been expressed with a shift in addition to the multiplication. But they wouldn't ever match so they might very well have bitrotted by now :-( (In addition to as the canonically wrong one used to be the combine-matching pattern; I'm not sure I should really drop that just yet.) It is harmless to leave it in. It will rot though, eventually -- better take it out before then. Add some gcc_unreachable, perhaps. Supposedly more noteworthy: this now-stricter canonicalization leads to a requirement to rewrite patterns that used to be: (parallel [(set reg0 (mem (plus (mult reg1 N) reg2))) (set reg3 (plus (mult reg1 N) reg2))]) into the awkwardly asymmetric: (parallel [(set reg0 (mem (plus (mult reg1 2) reg2))) (set reg3 (plus (ashift reg1 M) reg2))]) (where M = log2 N) Yeah. This is true of parallels in general: canonicalisation works on each arm separately. I'm not sure what can be done about it. Looks like quite some work for you, I'm sorry about that, Segher
Re: [RFC]: Remove Mem/address type assumption in combiner
On Sat, 16 May 2015, Segher Boessenkool wrote: On Fri, May 15, 2015 at 10:40:48PM -0400, Hans-Peter Nilsson wrote: I confess the test-case-guarded addi pattern should have been expressed with a shift in addition to the multiplication. But they wouldn't ever match so they might very well have bitrotted by now :-( It seems you're saying that the canonicalization to ashift didn't work *at all*, when starting with an expression from an address? I knew it failed in part, but always thought it was just a partial failure. (In addition to as the canonically wrong one used to be the combine-matching pattern; I'm not sure I should really drop that just yet.) It is harmless to leave it in. It will rot though, eventually -- better take it out before then. Add some gcc_unreachable, perhaps. I've been trying to come up with valid reasons there'd be ever be canonicalization by multiplication, but failed so I guess I'll rip it out. Supposedly more noteworthy: this now-stricter canonicalization leads to a requirement to rewrite patterns that used to be: (parallel [(set reg0 (mem (plus (mult reg1 N) reg2))) (set reg3 (plus (mult reg1 N) reg2))]) into the awkwardly asymmetric: (parallel [(set reg0 (mem (plus (mult reg1 2) reg2))) (set reg3 (plus (ashift reg1 M) reg2))]) (where M = log2 N) Yeah. This is true of parallels in general: canonicalisation works on each arm separately. I'm not sure what can be done about it. Looks like quite some work for you, I'm sorry about that, It's almost over, at least the editing part... brgds, H-P
Re: [RFC]: Remove Mem/address type assumption in combiner
On Sat, 16 May 2015, Segher Boessenkool wrote: On Sat, May 16, 2015 at 12:36:38PM -0400, Hans-Peter Nilsson wrote: On Sat, 16 May 2015, Segher Boessenkool wrote: On Fri, May 15, 2015 at 10:40:48PM -0400, Hans-Peter Nilsson wrote: I confess the test-case-guarded addi pattern should have been expressed with a shift in addition to the multiplication. But they wouldn't ever match so they might very well have bitrotted by now :-( It seems you're saying that the canonicalization to ashift didn't work *at all*, when starting with an expression from an address? I knew it failed in part, but always thought it was just a partial failure. With a plus or minus combine would always write it as a mult. I don't think any other pass would create this combination. I haven't tested it though. Ports probably also generate that internally at various RTL passes, something that takes a bit more than an at-a-glance code inspection. (In addition to as the canonically wrong one used to be the combine-matching pattern; I'm not sure I should really drop that just yet.) It is harmless to leave it in. It will rot though, eventually -- better take it out before then. Add some gcc_unreachable, perhaps. I've been trying to come up with valid reasons there'd be ever be canonicalization by multiplication, but failed so I guess I'll rip it out. The unreachable thing should quickly tell you if that guess is wrong. Not something you want to leave in a production compiler, of course. I think you misunderstood; I mean that I pondered philosophical reasons to change the canonical representation; if there was some reasoning, current or looking forward, where we'd add back mult as non-address-canonical RTL. For the current scheme, keeping the matching-patterns but with assertions added might be helpful, yes. (People looking for practical clues: for spotting dead target code there are besides the alluded-to gcc_unreachable(), also the options to call internal_error() with a friendly string or generating invalid assembly with a descriptive mnemonic.) Actually, there are two things you could help with: - (pointer-to) step-by-step instructions to recreate the Linux (kernel :) build, as those you did before for a multiple of targets. I'd like to know I fixed your observations. - add a preferred canonicalization function to do conversion to/from memory-address-canonical RTL. Like fwprop.c:canonicalize_address (just not static :) and maybe also a canonicalize_nonaddress. At the moment, ports call replace_equiv_address (family of functions) when changing address RTL, but that code-path (at a glance) doesn't canonicalize whatever non-address-canonical RTL you throw at it. Maybe it should? Looks like quite some work for you, I'm sorry about that, It's almost over, at least the editing part... Great to hear that! Thanks for your enthusiasm! Well, I've known about this wart for 6+ years as seen in PR37939 (but before that likely at least as long), so I *should* thank you for fixing it; it's just that the added churn and required tweaks has somewhat of a sour taste. :) brgds, H-P
Re: [RFC]: Remove Mem/address type assumption in combiner
On Fri, 1 May 2015, Segher Boessenkool wrote: On Wed, Apr 29, 2015 at 12:03:35PM -0500, Segher Boessenkool wrote: On Wed, Apr 29, 2015 at 09:25:21AM +, Kumar, Venkataramanan wrote: diff --git a/gcc/combine.c b/gcc/combine.c index 5c763b4..945abdb 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7703,8 +7703,6 @@ make_compound_operation (rtx x, enum rtx_code in_code) but once inside, go back to our default of SET. */ next_code = (code == MEM ? MEM - : ((code == PLUS || code == MINUS) - SCALAR_INT_MODE_P (mode)) ? MEM : ((code == COMPARE || COMPARISON_P (x)) XEXP (x, 1) == const0_rtx) ? COMPARE : in_code == COMPARE ? SET : in_code); On X86_64, it passes bootstrap and regression tests. But on Aarch64 the test in PR passed, but I got a few test case failures. There are few patterns based on multiplication operations in Aarch64 backend which are used to match with the pattern combiner generated. Now those patterns have to be fixed to use SHIFTS. Also need to see any impact on other targets. Right. It would be good if you could find out for what targets it matters. The thing is, if a target expects only the patterns as combine used to make them, it will regress (as you've seen on aarch64); but it will regress _silently_. Which isn't so nice. But before that I wanted to check if the assumption in combiner, can simply be removed ? Seeing for what targets / patterns it makes a difference would tell us the answer to that, too :-) I'll run some tests with and without your patch. So I ran the tests. These are text sizes for vmlinux built for most architectures (before resp. after the patch). Thanks for actually checking the impact. Results are good, but they show many targets need some work... BIGGER 2212322 2212706 cris Also observable as noted in PR66171, a regression: Running /tmp/hpautotest-gcc0/gcc/gcc/testsuite/gcc.target/cris/cris.exp ... FAIL: gcc.target/cris/biap.c scan-assembler addi FAIL: gcc.target/cris/biap.c scan-assembler-not lsl I confess the test-case-guarded addi pattern should have been expressed with a shift in addition to the multiplication. (In addition to as the canonically wrong one used to be the combine-matching pattern; I'm not sure I should really drop that just yet.) I'll have to check that expressing using a shift fixes this issue. Supposedly more noteworthy: this now-stricter canonicalization leads to a requirement to rewrite patterns that used to be: (parallel [(set reg0 (mem (plus (mult reg1 N) reg2))) (set reg3 (plus (mult reg1 N) reg2))]) into the awkwardly asymmetric: (parallel [(set reg0 (mem (plus (mult reg1 2) reg2))) (set reg3 (plus (ashift reg1 M) reg2))]) (where M = log2 N) and of course a need to verify that combine actually *does* make use of the pattern after the change at least as much as it did before. Some targets need some love (but what else is new). Meh. Well, you now got some whine to go with that cheese. brgds, H-P
RE: [RFC]: Remove Mem/address type assumption in combiner
On Tue, 2015-05-12 at 05:21 +, Kumar, Venkataramanan wrote: Hi Steve, Yes this is expected. As Segher pointed out, we need to change .md patterns in target to be based on shifts instead of mults. Regards, Venkat. Here is my patch to change this. I tested it with the mips-mti-linux-gnu and mips-mti-elf targets and it fixed the MIPS specific tests that were scanning for an lsa instruction and it also had no regressions. Since the lsa instruction was the only one using the 'y' operand code and my change got rid of the only use of it, I removed it as part of this patch. Matthew or Catherine is this OK to checkin? 2015-05-12 Steve Ellcey sell...@imgtec.com * config/mips/mips.c (mips_print_operand): Remove 'y' operand code. * config/mips/mips.md (GPR:dlsa): Rewrite with shift operator. * config/mips/predicates.md (const_immlsa_operand): Remove log call. diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 16ed5f0..82d55b6 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -8440,7 +8440,6 @@ mips_print_operand_punct_valid_p (unsigned char code) 'x' Print the low 16 bits of CONST_INT OP in hexadecimal format. 'd' Print CONST_INT OP in decimal. 'm' Print one less than CONST_INT OP in decimal. - 'y' Print exact log2 of CONST_INT OP in decimal. 'h' Print the high-part relocation associated with OP, after stripping any outermost HIGH. 'R' Print the low-part relocation associated with OP. @@ -8504,19 +8503,6 @@ mips_print_operand (FILE *file, rtx op, int letter) output_operand_lossage (invalid use of '%%%c', letter); break; -case 'y': - if (CONST_INT_P (op)) - { - int val = exact_log2 (INTVAL (op)); - if (val != -1) - fprintf (file, %d, val); - else - output_operand_lossage (invalid use of '%%%c', letter); - } - else - output_operand_lossage (invalid use of '%%%c', letter); - break; - case 'h': if (code == HIGH) op = XEXP (op, 0); diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md index 76f2108..f6921c6 100644 --- a/gcc/config/mips/mips.md +++ b/gcc/config/mips/mips.md @@ -5528,11 +5528,11 @@ (define_insn GPR:dlsa [(set (match_operand:GPR 0 register_operand =d) - (plus:GPR (mult:GPR (match_operand:GPR 1 register_operand d) - (match_operand 2 const_immlsa_operand )) + (plus:GPR (ashift:GPR (match_operand:GPR 1 register_operand d) +(match_operand 2 const_immlsa_operand )) (match_operand:GPR 3 register_operand d)))] ISA_HAS_GPR:DLSA - GPR:dlsa\t%0,%1,%3,%y2 + GPR:dlsa\t%0,%1,%3,%2 [(set_attr type arith) (set_attr mode GPR:MODE)]) diff --git a/gcc/config/mips/predicates.md b/gcc/config/mips/predicates.md index fa17ac7..4929c3d 100644 --- a/gcc/config/mips/predicates.md +++ b/gcc/config/mips/predicates.md @@ -35,7 +35,7 @@ (define_predicate const_immlsa_operand (and (match_code const_int) - (match_test IN_RANGE (exact_log2 (INTVAL (op)), 1, 4 + (match_test IN_RANGE (INTVAL (op), 1, 4 (define_predicate const_uimm6_operand (and (match_code const_int)
RE: [RFC]: Remove Mem/address type assumption in combiner
-Original Message- From: Steve Ellcey [mailto:sell...@imgtec.com] Sent: Tuesday, May 12, 2015 12:51 PM To: Kumar, Venkataramanan Cc: Segher Boessenkool; Jeff Law (l...@redhat.com); gcc- patc...@gcc.gnu.org; maxim.kuvyr...@linaro.org; Matthew Fortune; Moore, Catherine Subject: RE: [RFC]: Remove Mem/address type assumption in combiner On Tue, 2015-05-12 at 05:21 +, Kumar, Venkataramanan wrote: Hi Steve, Yes this is expected. As Segher pointed out, we need to change .md patterns in target to be based on shifts instead of mults. Regards, Venkat. Here is my patch to change this. I tested it with the mips-mti-linux-gnu and mips-mti-elf targets and it fixed the MIPS specific tests that were scanning for an lsa instruction and it also had no regressions. Since the lsa instruction was the only one using the 'y' operand code and my change got rid of the only use of it, I removed it as part of this patch. Matthew or Catherine is this OK to checkin? 2015-05-12 Steve Ellcey sell...@imgtec.com * config/mips/mips.c (mips_print_operand): Remove 'y' operand code. * config/mips/mips.md (GPR:dlsa): Rewrite with shift operator. * config/mips/predicates.md (const_immlsa_operand): Remove log call. This is OK. Catherine
Re: [RFC]: Remove Mem/address type assumption in combiner
Hi Steve, On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote: This patch broke a number of MIPS tests, specifically mips32r6 tests that look for the lsa instruction (load scaled address) which shifts one register and then adds it to a second register. I am not sure if this needs to be addressed in combine.c or if we need to add a peephole optimization to mips.md to handle the new instruction sequence. What do you think? Is the change here what you would expect to see from your patch? Yes, this is as expected. AFAICS the only change you need in the MIPS backend is to change the GPR:dlsa pattern to match a shift instead of a mult (and change the const_immlsa_operand predicate to just match 1..4 instead of the powers). Segher
RE: [RFC]: Remove Mem/address type assumption in combiner
Jeff Law l...@redhat.com writes: On 05/11/2015 01:46 PM, Jeff Law wrote: On 05/11/2015 01:44 PM, Steve Ellcey wrote: On Mon, 2015-05-11 at 13:22 -0500, Segher Boessenkool wrote: Hi Steve, On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote: This patch broke a number of MIPS tests, specifically mips32r6 tests that look for the lsa instruction (load scaled address) which shifts one register and then adds it to a second register. I am not sure if this needs to be addressed in combine.c or if we need to add a peephole optimization to mips.md to handle the new instruction sequence. What do you think? Is the change here what you would expect to see from your patch? Yes, this is as expected. AFAICS the only change you need in the MIPS backend is to change the GPR:dlsa pattern to match a shift instead of a mult (and change the const_immlsa_operand predicate to just match 1..4 instead of the powers). Segher Hm, I thought it was going to be more complicated than that, but it seems to be working. I will do a complete test run and then submit a patch. Yea, it really should be that easy. I'm pretty sure the sh[123]add insns in the PA need to be updated in a similar manner. Oh, and just to be clear, I'm not expecting you to do this Steve. It'd be great if you did, but not required. Does this patch effectively change the canonicalization rules? The following Still exists in md.texi: @item Within address computations (i.e., inside @code{mem}), a left shift is converted into the appropriate multiplication by a power of two. Thanks, Matthew
Re: [RFC]: Remove Mem/address type assumption in combiner
On 05/11/2015 01:46 PM, Jeff Law wrote: On 05/11/2015 01:44 PM, Steve Ellcey wrote: On Mon, 2015-05-11 at 13:22 -0500, Segher Boessenkool wrote: Hi Steve, On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote: This patch broke a number of MIPS tests, specifically mips32r6 tests that look for the lsa instruction (load scaled address) which shifts one register and then adds it to a second register. I am not sure if this needs to be addressed in combine.c or if we need to add a peephole optimization to mips.md to handle the new instruction sequence. What do you think? Is the change here what you would expect to see from your patch? Yes, this is as expected. AFAICS the only change you need in the MIPS backend is to change the GPR:dlsa pattern to match a shift instead of a mult (and change the const_immlsa_operand predicate to just match 1..4 instead of the powers). Segher Hm, I thought it was going to be more complicated than that, but it seems to be working. I will do a complete test run and then submit a patch. Yea, it really should be that easy. I'm pretty sure the sh[123]add insns in the PA need to be updated in a similar manner. Oh, and just to be clear, I'm not expecting you to do this Steve. It'd be great if you did, but not required. jeff
Re: [RFC]: Remove Mem/address type assumption in combiner
On Mon, May 11, 2015 at 08:16:41PM +, Matthew Fortune wrote: Does this patch effectively change the canonicalization rules? The following Still exists in md.texi: @item Within address computations (i.e., inside @code{mem}), a left shift is converted into the appropriate multiplication by a power of two. No, it makes combine *follow* those rules -- this isn't inside a MEM. Segher
RE: [RFC]: Remove Mem/address type assumption in combiner
Segher Boessenkool seg...@kernel.crashing.org writes: On Mon, May 11, 2015 at 08:16:41PM +, Matthew Fortune wrote: Does this patch effectively change the canonicalization rules? The following Still exists in md.texi: @item Within address computations (i.e., inside @code{mem}), a left shift is converted into the appropriate multiplication by a power of two. No, it makes combine *follow* those rules -- this isn't inside a MEM. Thanks, I'm being a bit slow today. Matthew
Re: [RFC]: Remove Mem/address type assumption in combiner
On 05/11/2015 01:44 PM, Steve Ellcey wrote: On Mon, 2015-05-11 at 13:22 -0500, Segher Boessenkool wrote: Hi Steve, On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote: This patch broke a number of MIPS tests, specifically mips32r6 tests that look for the lsa instruction (load scaled address) which shifts one register and then adds it to a second register. I am not sure if this needs to be addressed in combine.c or if we need to add a peephole optimization to mips.md to handle the new instruction sequence. What do you think? Is the change here what you would expect to see from your patch? Yes, this is as expected. AFAICS the only change you need in the MIPS backend is to change the GPR:dlsa pattern to match a shift instead of a mult (and change the const_immlsa_operand predicate to just match 1..4 instead of the powers). Segher Hm, I thought it was going to be more complicated than that, but it seems to be working. I will do a complete test run and then submit a patch. Yea, it really should be that easy. I'm pretty sure the sh[123]add insns in the PA need to be updated in a similar manner. jeff
Re: [RFC]: Remove Mem/address type assumption in combiner
On Mon, 2015-05-11 at 13:22 -0500, Segher Boessenkool wrote: Hi Steve, On Mon, May 11, 2015 at 10:50:02AM -0700, Steve Ellcey wrote: This patch broke a number of MIPS tests, specifically mips32r6 tests that look for the lsa instruction (load scaled address) which shifts one register and then adds it to a second register. I am not sure if this needs to be addressed in combine.c or if we need to add a peephole optimization to mips.md to handle the new instruction sequence. What do you think? Is the change here what you would expect to see from your patch? Yes, this is as expected. AFAICS the only change you need in the MIPS backend is to change the GPR:dlsa pattern to match a shift instead of a mult (and change the const_immlsa_operand predicate to just match 1..4 instead of the powers). Segher Hm, I thought it was going to be more complicated than that, but it seems to be working. I will do a complete test run and then submit a patch. Steve Ellcey
RE: [RFC]: Remove Mem/address type assumption in combiner
Hi Steve, Yes this is expected. As Segher pointed out, we need to change .md patterns in target to be based on shifts instead of mults. Regards, Venkat. -Original Message- From: Steve Ellcey [mailto:sell...@imgtec.com] Sent: Monday, May 11, 2015 11:20 PM To: Kumar, Venkataramanan Cc: Segher Boessenkool; Jeff Law (l...@redhat.com); gcc-patches@gcc.gnu.org; maxim.kuvyr...@linaro.org; Matthew Fortune; clm Subject: RE: [RFC]: Remove Mem/address type assumption in combiner On Thu, 2015-05-07 at 11:01 +, Kumar, Venkataramanan wrote: Hi Segher, Thank you I committed as r222874. Ref: https://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=222874 Regards, Venkat. Venkat, This patch broke a number of MIPS tests, specifically mips32r6 tests that look for the lsa instruction (load scaled address) which shifts one register and then adds it to a second register. I am not sure if this needs to be addressed in combine.c or if we need to add a peephole optimization to mips.md to handle the new instruction sequence. What do you think? Is the change here what you would expect to see from your patch? With this C code: signed short test (signed short *a, int index) { return a[index]; } GCC/combine for mips32r6 used to produce: (insn 8 7 9 2 (set (reg/f:SI 203) (plus:SI (mult:SI (reg:SI 5 $5 [ index ]) (const_int 2 [0x2])) (reg:SI 4 $4 [ a ]))) lsa.c:3 444 {lsa} (expr_list:REG_DEAD (reg:SI 5 $5 [ index ]) (expr_list:REG_DEAD (reg:SI 4 $4 [ a ]) (nil (insn 15 10 16 2 (set (reg/i:SI 2 $2) (sign_extend:SI (mem:HI (reg/f:SI 203) [1 *_5+0 S2 A16]))) lsa.c:4 237 {*extendhisi2_seh} (expr_list:REG_DEAD (reg/f:SI 203) (nil))) And would generate: lsa $4,$5,$4,1 lh $2,0($4) But now it produces: (insn 7 4 8 2 (set (reg:SI 202) (ashift:SI (reg:SI 5 $5 [ index ]) (const_int 1 [0x1]))) lsa.c:3 432 {*ashlsi3} (expr_list:REG_DEAD (reg:SI 5 $5 [ index ]) (nil))) (insn 8 7 9 2 (set (reg/f:SI 203) (plus:SI (reg:SI 4 $4 [ a ]) (reg:SI 202))) lsa.c:3 13 {*addsi3} (expr_list:REG_DEAD (reg:SI 4 $4 [ a ]) (expr_list:REG_DEAD (reg:SI 202) (nil))) (insn 15 10 16 2 (set (reg/i:SI 2 $2) (sign_extend:SI (mem:HI (reg/f:SI 203) [1 *_5+0 S2 A16]))) lsa.c:4 237 {*extendhisi2_seh} (expr_list:REG_DEAD (reg/f:SI 203) (nil))) Which generates: sll $5,$5,1 addu$4,$4,$5 lh $2,0($4) Steve Ellcey sell...@imgtec.com
RE: [RFC]: Remove Mem/address type assumption in combiner
On Thu, 2015-05-07 at 11:01 +, Kumar, Venkataramanan wrote: Hi Segher, Thank you I committed as r222874. Ref: https://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=222874 Regards, Venkat. Venkat, This patch broke a number of MIPS tests, specifically mips32r6 tests that look for the lsa instruction (load scaled address) which shifts one register and then adds it to a second register. I am not sure if this needs to be addressed in combine.c or if we need to add a peephole optimization to mips.md to handle the new instruction sequence. What do you think? Is the change here what you would expect to see from your patch? With this C code: signed short test (signed short *a, int index) { return a[index]; } GCC/combine for mips32r6 used to produce: (insn 8 7 9 2 (set (reg/f:SI 203) (plus:SI (mult:SI (reg:SI 5 $5 [ index ]) (const_int 2 [0x2])) (reg:SI 4 $4 [ a ]))) lsa.c:3 444 {lsa} (expr_list:REG_DEAD (reg:SI 5 $5 [ index ]) (expr_list:REG_DEAD (reg:SI 4 $4 [ a ]) (nil (insn 15 10 16 2 (set (reg/i:SI 2 $2) (sign_extend:SI (mem:HI (reg/f:SI 203) [1 *_5+0 S2 A16]))) lsa.c:4 237 {*extendhisi2_seh} (expr_list:REG_DEAD (reg/f:SI 203) (nil))) And would generate: lsa $4,$5,$4,1 lh $2,0($4) But now it produces: (insn 7 4 8 2 (set (reg:SI 202) (ashift:SI (reg:SI 5 $5 [ index ]) (const_int 1 [0x1]))) lsa.c:3 432 {*ashlsi3} (expr_list:REG_DEAD (reg:SI 5 $5 [ index ]) (nil))) (insn 8 7 9 2 (set (reg/f:SI 203) (plus:SI (reg:SI 4 $4 [ a ]) (reg:SI 202))) lsa.c:3 13 {*addsi3} (expr_list:REG_DEAD (reg:SI 4 $4 [ a ]) (expr_list:REG_DEAD (reg:SI 202) (nil))) (insn 15 10 16 2 (set (reg/i:SI 2 $2) (sign_extend:SI (mem:HI (reg/f:SI 203) [1 *_5+0 S2 A16]))) lsa.c:4 237 {*extendhisi2_seh} (expr_list:REG_DEAD (reg/f:SI 203) (nil))) Which generates: sll $5,$5,1 addu$4,$4,$5 lh $2,0($4) Steve Ellcey sell...@imgtec.com
RE: [RFC]: Remove Mem/address type assumption in combiner
Hi Segher, Thank you I committed as r222874. Ref: https://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=222874 Regards, Venkat. -Original Message- From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] Sent: Tuesday, May 05, 2015 10:46 PM To: Kumar, Venkataramanan Cc: Jeff Law (l...@redhat.com); gcc-patches@gcc.gnu.org; maxim.kuvyr...@linaro.org Subject: Re: [RFC]: Remove Mem/address type assumption in combiner On Tue, May 05, 2015 at 04:14:03PM +, Kumar, Venkataramanan wrote: Hi Segher, Thank you for testing the patch and approving it. Before I commit it, I wanted to check with you on the changelog entry. Please find the updated patch with the changelog entry. I have just removed the comments that says checks for PLUS/MINUS RTX is a hack. Let me know if it ok. Regards, Venkat. Change Log --- 2015-05-05 Venkataramanan Kumar venkataramanan.ku...@amd.com * combine.c (make_compound_operation): Remove checks for PLUS/MINUS rtx type. Yes, this is okay for trunk. Segher
RE: [RFC]: Remove Mem/address type assumption in combiner
Hi Segher, Thank you for testing the patch and approving it. Before I commit it, I wanted to check with you on the changelog entry. Please find the updated patch with the changelog entry. I have just removed the comments that says checks for PLUS/MINUS RTX is a hack. Let me know if it ok. Regards, Venkat. Change Log --- 2015-05-05 Venkataramanan Kumar venkataramanan.ku...@amd.com * combine.c (make_compound_operation): Remove checks for PLUS/MINUS rtx type. ---patch--- diff --git a/gcc/combine.c b/gcc/combine.c index c04146a..9e3eb03 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7723,9 +7723,8 @@ extract_left_shift (rtx x, int count) We try, as much as possible, to re-use rtl expressions to save memory. IN_CODE says what kind of expression we are processing. Normally, it is - SET. In a memory address (inside a MEM, PLUS or minus, the latter two - being kludges), it is MEM. When processing the arguments of a comparison - or a COMPARE against zero, it is COMPARE. */ + SET. In a memory address it is MEM. When processing the arguments of + a comparison or a COMPARE against zero, it is COMPARE. */ rtx make_compound_operation (rtx x, enum rtx_code in_code) @@ -7745,8 +7744,6 @@ make_compound_operation (rtx x, enum rtx_code in_code) but once inside, go back to our default of SET. */ next_code = (code == MEM ? MEM - : ((code == PLUS || code == MINUS) - SCALAR_INT_MODE_P (mode)) ? MEM : ((code == COMPARE || COMPARISON_P (x)) XEXP (x, 1) == const0_rtx) ? COMPARE : in_code == COMPARE ? SET : in_code); ---patch--- -Original Message- From: Segher Boessenkool [mailto:seg...@kernel.crashing.org] Sent: Friday, May 01, 2015 8:48 PM To: Kumar, Venkataramanan Cc: Jeff Law (l...@redhat.com); gcc-patches@gcc.gnu.org; maxim.kuvyr...@linaro.org Subject: Re: [RFC]: Remove Mem/address type assumption in combiner On Wed, Apr 29, 2015 at 12:03:35PM -0500, Segher Boessenkool wrote: On Wed, Apr 29, 2015 at 09:25:21AM +, Kumar, Venkataramanan wrote: diff --git a/gcc/combine.c b/gcc/combine.c index 5c763b4..945abdb 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7703,8 +7703,6 @@ make_compound_operation (rtx x, enum rtx_code in_code) but once inside, go back to our default of SET. */ next_code = (code == MEM ? MEM - : ((code == PLUS || code == MINUS) - SCALAR_INT_MODE_P (mode)) ? MEM : ((code == COMPARE || COMPARISON_P (x)) XEXP (x, 1) == const0_rtx) ? COMPARE : in_code == COMPARE ? SET : in_code); On X86_64, it passes bootstrap and regression tests. But on Aarch64 the test in PR passed, but I got a few test case failures. There are few patterns based on multiplication operations in Aarch64 backend which are used to match with the pattern combiner generated. Now those patterns have to be fixed to use SHIFTS. Also need to see any impact on other targets. Right. It would be good if you could find out for what targets it matters. The thing is, if a target expects only the patterns as combine used to make them, it will regress (as you've seen on aarch64); but it will regress _silently_. Which isn't so nice. But before that I wanted to check if the assumption in combiner, can simply be removed ? Seeing for what targets / patterns it makes a difference would tell us the answer to that, too :-) I'll run some tests with and without your patch. So I ran the tests. These are text sizes for vmlinux built for most architectures (before resp. after the patch). Results are good, but they show many targets need some work... BIGGER 5410496 5432744 alpha 3848987 3849143 arm 2190276 2190292 blackfin 2188431 2191503 c6x (but data shrank by more than text grew) 2212322 2212706 cris 10896454 10896965 i386 7471891 7488771 parisc 6168799 6195391 parisc64 1545840 1545872 shnommu 1946649 1954149 xtensa I looked at some of those. Alpha regresses with s4add things, arm with add ,lsl things, parisc with shladd things. I tried to fix the parisc one (it seemed simplest), and the 32-bit kernel got most (but not all) of the size difference back; and the 64-bit wouldn't build (an assert in the kernel failed, and it uses a floating point reg where an integer one is needed -- I gave up). SMALLER 8688171 8688003 powerpc 20252605 20251797 powerpc64 11425334 11422558 s390 12300135 12299767 x86_64 The PowerPC differences are mostly what first was rlwinm;addi;rlwinm;lwz and now is rlwinm;lwz; i.e. the add is moved after the two rotates, the rotates are merged, and the add is made part of the offset in the load. NO DIFF 3321492 3321492 frv 3252747 3252747 m32r 4708232 4708232 microblaze 3949145 3949145 mips 5693019
Re: [RFC]: Remove Mem/address type assumption in combiner
On Tue, May 05, 2015 at 04:14:03PM +, Kumar, Venkataramanan wrote: Hi Segher, Thank you for testing the patch and approving it. Before I commit it, I wanted to check with you on the changelog entry. Please find the updated patch with the changelog entry. I have just removed the comments that says checks for PLUS/MINUS RTX is a hack. Let me know if it ok. Regards, Venkat. Change Log --- 2015-05-05 Venkataramanan Kumar venkataramanan.ku...@amd.com * combine.c (make_compound_operation): Remove checks for PLUS/MINUS rtx type. Yes, this is okay for trunk. Segher
Re: [RFC]: Remove Mem/address type assumption in combiner
On Wed, Apr 29, 2015 at 12:03:35PM -0500, Segher Boessenkool wrote: On Wed, Apr 29, 2015 at 09:25:21AM +, Kumar, Venkataramanan wrote: diff --git a/gcc/combine.c b/gcc/combine.c index 5c763b4..945abdb 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7703,8 +7703,6 @@ make_compound_operation (rtx x, enum rtx_code in_code) but once inside, go back to our default of SET. */ next_code = (code == MEM ? MEM - : ((code == PLUS || code == MINUS) - SCALAR_INT_MODE_P (mode)) ? MEM : ((code == COMPARE || COMPARISON_P (x)) XEXP (x, 1) == const0_rtx) ? COMPARE : in_code == COMPARE ? SET : in_code); On X86_64, it passes bootstrap and regression tests. But on Aarch64 the test in PR passed, but I got a few test case failures. There are few patterns based on multiplication operations in Aarch64 backend which are used to match with the pattern combiner generated. Now those patterns have to be fixed to use SHIFTS. Also need to see any impact on other targets. Right. It would be good if you could find out for what targets it matters. The thing is, if a target expects only the patterns as combine used to make them, it will regress (as you've seen on aarch64); but it will regress _silently_. Which isn't so nice. But before that I wanted to check if the assumption in combiner, can simply be removed ? Seeing for what targets / patterns it makes a difference would tell us the answer to that, too :-) I'll run some tests with and without your patch. So I ran the tests. These are text sizes for vmlinux built for most architectures (before resp. after the patch). Results are good, but they show many targets need some work... BIGGER 5410496 5432744 alpha 3848987 3849143 arm 2190276 2190292 blackfin 2188431 2191503 c6x (but data shrank by more than text grew) 2212322 2212706 cris 10896454 10896965 i386 7471891 7488771 parisc 6168799 6195391 parisc64 1545840 1545872 shnommu 1946649 1954149 xtensa I looked at some of those. Alpha regresses with s4add things, arm with add ,lsl things, parisc with shladd things. I tried to fix the parisc one (it seemed simplest), and the 32-bit kernel got most (but not all) of the size difference back; and the 64-bit wouldn't build (an assert in the kernel failed, and it uses a floating point reg where an integer one is needed -- I gave up). SMALLER 8688171 8688003 powerpc 20252605 20251797 powerpc64 11425334 11422558 s390 12300135 12299767 x86_64 The PowerPC differences are mostly what first was rlwinm;addi;rlwinm;lwz and now is rlwinm;lwz; i.e. the add is moved after the two rotates, the rotates are merged, and the add is made part of the offset in the load. NO DIFF 3321492 3321492 frv 3252747 3252747 m32r 4708232 4708232 microblaze 3949145 3949145 mips 5693019 5693019 mips64 2373441 2373441 mn10300 6489846 6489846 sh64 3733749 3733749 sparc 6075122 6075122 sparc64 Also quite a few without any diff. Good :-) Some of those might have unnecessary add/mult patterns now, but they also have the add/shift. I didn't see any big differences, and all are of the expected kind. Some targets need some love (but what else is new). The patch is approved for trunk. Thanks, Segher
Re: [RFC]: Remove Mem/address type assumption in combiner
On 04/29/2015 11:03 AM, Segher Boessenkool wrote: Right. It would be good if you could find out for what targets it matters. The thing is, if a target expects only the patterns as combine used to make them, it will regress (as you've seen on aarch64); but it will regress _silently_. Which isn't so nice. But before that I wanted to check if the assumption in combiner, can simply be removed ? Seeing for what targets / patterns it makes a difference would tell us the answer to that, too :-) Right. ANd that was one of the two general directions I recommended earlier this week ;-) 1. Figure out if this code still matters at all. 2. If the code still matters, accurately track if we're inside a MEM so that things canonicalize correctly. jeff
Re: [RFC]: Remove Mem/address type assumption in combiner
On 04/29/2015 03:25 AM, Kumar, Venkataramanan wrote: Hi Jeff/Segher, When we see an RTX code with PLUS or MINUS then it is treated as MEM/address type (we are inside address RTX). Is there any significance on that assumption? I removed this assumption and the test case in the PR 63949 passed. When appearing inside a MEM, we have different canonicalization rules. The comment in make_compound_operation clearly indicates that the PLUS/MINUS support is a hack. However, I'm pretty sure it was strictly for getting better code than for correctness. One path forward is to properly track if we're in a MEM, at which point the hack for PLUS/MINUS could probably just go away. diff --git a/gcc/combine.c b/gcc/combine.c index 5c763b4..945abdb 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7703,8 +7703,6 @@ make_compound_operation (rtx x, enum rtx_code in_code) but once inside, go back to our default of SET. */ next_code = (code == MEM ? MEM - : ((code == PLUS || code == MINUS) - SCALAR_INT_MODE_P (mode)) ? MEM : ((code == COMPARE || COMPARISON_P (x)) XEXP (x, 1) == const0_rtx) ? COMPARE : in_code == COMPARE ? SET : in_code); On X86_64, it passes bootstrap and regression tests. But on Aarch64 the test in PR passed, but I got a few test case failures. Tests that now fail, but worked before: gcc.target/aarch64/adds1.c scan-assembler adds\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3 gcc.target/aarch64/adds1.c scan-assembler adds\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3 So that test seems to want to verify that you can generate a shift-add type instruction. I suspect the others are similar. I'd be curious to see the .combine dump as well as the final assembly for this test. Which is a strong hint that we should be looking at target with shift-add style instructions. ARM, AArch64, HPPA, x86 come immediately to mind. There are few patterns based on multiplication operations in Aarch64 backend which are used to match with the pattern combiner generated. Now those patterns have to be fixed to use SHIFTS. Also need to see any impact on other targets. But before that I wanted to check if the assumption in combiner, can simply be removed ? Given what I'm seeing now, I doubt it can simply be removed at this point (which is a change in my position) since ports with these shift-add style instructions have probably been tuned to work with existing combine behaviour. We may need to do a deep test across various targets to identify those affected and fix them. jeff
Re: [RFC]: Remove Mem/address type assumption in combiner
On 04/29/2015 03:25 AM, Kumar, Venkataramanan wrote: Hi Jeff/Segher, Restarting the discussion on the GCC combiner assumption about Memory/address type. Ref: https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01298.html https://gcc.gnu.org/ml/gcc/2015-04/msg00028.html Regards, Venkat. PS: I am starting a new thread since I no more have access to Linaro ID from where I sent the earlier mail. Funny, I sent mail just a few days ago trying to get this restarted, but it went to your Linaro address. I'll forward it separately. Jeff
Re: [RFC]: Remove Mem/address type assumption in combiner
Hello Venkat, On Wed, Apr 29, 2015 at 09:25:21AM +, Kumar, Venkataramanan wrote: diff --git a/gcc/combine.c b/gcc/combine.c index 5c763b4..945abdb 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7703,8 +7703,6 @@ make_compound_operation (rtx x, enum rtx_code in_code) but once inside, go back to our default of SET. */ next_code = (code == MEM ? MEM - : ((code == PLUS || code == MINUS) - SCALAR_INT_MODE_P (mode)) ? MEM : ((code == COMPARE || COMPARISON_P (x)) XEXP (x, 1) == const0_rtx) ? COMPARE : in_code == COMPARE ? SET : in_code); On X86_64, it passes bootstrap and regression tests. But on Aarch64 the test in PR passed, but I got a few test case failures. There are few patterns based on multiplication operations in Aarch64 backend which are used to match with the pattern combiner generated. Now those patterns have to be fixed to use SHIFTS. Also need to see any impact on other targets. Right. It would be good if you could find out for what targets it matters. The thing is, if a target expects only the patterns as combine used to make them, it will regress (as you've seen on aarch64); but it will regress _silently_. Which isn't so nice. But before that I wanted to check if the assumption in combiner, can simply be removed ? Seeing for what targets / patterns it makes a difference would tell us the answer to that, too :-) I'll run some tests with and without your patch. Segher