Re: [PR81611] improve auto-inc
On 02/27/2018 04:18 PM, Alexandre Oliva wrote: > On Feb 14, 2018, Jeff Lawwrote: > >>> + regno = REGNO (inc_insn.reg0); >>> + int luid = DF_INSN_LUID (mem_insn.insn); >>> + mem_insn.insn = get_next_ref (regno, bb, reg_next_use); >> So I think a comment is warranted right as we enter the TRUE arm. > >> At that point INC_INSN is an inc/dec. But MEM_INSN is not necessarily a >> memory reference. It could be a memory reference, it could be a copy, >> it could be something completely different (it's just the next insn that >> references the result of the increment). In the case we care about we >> want it to be a copy of INC_INSN's REG_RES back to REG0. > >> ISTM that verifying MEM_INSN is a reg->reg copy (reg_res -> reg0) before >> we call get_next_ref for reg0 is advisable and probably good from a >> compile-time standpoint by avoiding calls into find_address. > > But we don't need it to be a copy. The transformation is just as > legitimate if the regs go independent ways after that point. We have > reg_res set to reg0+reg1, and then a use of reg0 in a MEM before any > other use of reg_res. We turn that into a copy of reg0 to reg_res, and > the MEM addr into a post_add of reg_res with reg1 (possibly a post_inc), > so that the MEM dereferences reg_res while it's still equal to reg0, and > after the MEM, reg_res becomes reg0+reg1, as it should for any > subsequent uses, and reg0 is unmodified. Whether or not a subsequent > copy from reg_res to reg0 is to be found won't make the transformation > any more or less legitimate. > >> After we call get_next_ref to get the next reference of the source of >> the increment, then we're hoping to find a memory reference that uses >> REG0. But it's not guaranteed it's a memory reference insn. > > Yeah, find_address will determine if it contains any of the MEM patterns > we might be interested in, but it could be anything whatsoever. The MEM > pattern might appear virtually anywhere in the insn. > >> I was having an awful time understanding how this code could work from >> the comments until I put it under a debugger and got a sense of the >> state as we entered that IF block. Then it was much clearer :-) > > Sorry, I realize the comments were written based on a lot of context > about the overall behavior of the pass, that I had learned while trying > to figure it out. At the risk of making it redundant, I've expanded the > comments, and added further tests that won't affect current behavior in > any significant way, but that might speed things up a bit and will save > us trouble should find_address be extended to catch additional patterns. > > >> I believe Georg had other testcases in subsequent comments in the BZ, >> but I don't believe they were flagged as regressions. > > However, with the testcases I realized the incremented register could > still be live, even if we didn't find a subsequent use for it. > Adjusting for that made those testcases use post_inc too. > > Here's the improved patch, regstrapped on aarch64-, ppc64-, and > ppc64el-linux-gnu. Ok to install? > > > [PR81611] turn inc-and-use-of-dead-orig into auto-inc > > When the addressing modes available on the machine don't allow offsets > in addresses, odds are that post-increments will be represented in > trees and RTL as: > > y <= x + 1 > ... *(x) ... > x <= y > > so deal with it by turning such RTL as: > > (set y (plus x n)) > ... (mem x) ... > > without intervening uses of y into > > (set y x) > ... (mem (post_add y n)) ... > > so as to create auto-inc addresses that we'd otherwise miss. > > > for gcc/ChangeLog > > PR rtl-optimization/81611 > * auto-inc-dec.c (attempt_change): Move dead note from > mem_insn if it's the next use of regno > (find_address): Take address use of reg holding > non-incremented value. Add parm to limit search to the named > reg only. > (merge_in_block): Attempt to use a mem insn that is the next > use of the original regno. OK. Thanks! Jeff
Re: [PR81611] improve auto-inc
On Feb 14, 2018, Jeff Lawwrote: >> + regno = REGNO (inc_insn.reg0); >> + int luid = DF_INSN_LUID (mem_insn.insn); >> + mem_insn.insn = get_next_ref (regno, bb, reg_next_use); > So I think a comment is warranted right as we enter the TRUE arm. > At that point INC_INSN is an inc/dec. But MEM_INSN is not necessarily a > memory reference. It could be a memory reference, it could be a copy, > it could be something completely different (it's just the next insn that > references the result of the increment). In the case we care about we > want it to be a copy of INC_INSN's REG_RES back to REG0. > ISTM that verifying MEM_INSN is a reg->reg copy (reg_res -> reg0) before > we call get_next_ref for reg0 is advisable and probably good from a > compile-time standpoint by avoiding calls into find_address. But we don't need it to be a copy. The transformation is just as legitimate if the regs go independent ways after that point. We have reg_res set to reg0+reg1, and then a use of reg0 in a MEM before any other use of reg_res. We turn that into a copy of reg0 to reg_res, and the MEM addr into a post_add of reg_res with reg1 (possibly a post_inc), so that the MEM dereferences reg_res while it's still equal to reg0, and after the MEM, reg_res becomes reg0+reg1, as it should for any subsequent uses, and reg0 is unmodified. Whether or not a subsequent copy from reg_res to reg0 is to be found won't make the transformation any more or less legitimate. > After we call get_next_ref to get the next reference of the source of > the increment, then we're hoping to find a memory reference that uses > REG0. But it's not guaranteed it's a memory reference insn. Yeah, find_address will determine if it contains any of the MEM patterns we might be interested in, but it could be anything whatsoever. The MEM pattern might appear virtually anywhere in the insn. > I was having an awful time understanding how this code could work from > the comments until I put it under a debugger and got a sense of the > state as we entered that IF block. Then it was much clearer :-) Sorry, I realize the comments were written based on a lot of context about the overall behavior of the pass, that I had learned while trying to figure it out. At the risk of making it redundant, I've expanded the comments, and added further tests that won't affect current behavior in any significant way, but that might speed things up a bit and will save us trouble should find_address be extended to catch additional patterns. > I believe Georg had other testcases in subsequent comments in the BZ, > but I don't believe they were flagged as regressions. However, with the testcases I realized the incremented register could still be live, even if we didn't find a subsequent use for it. Adjusting for that made those testcases use post_inc too. Here's the improved patch, regstrapped on aarch64-, ppc64-, and ppc64el-linux-gnu. Ok to install? [PR81611] turn inc-and-use-of-dead-orig into auto-inc When the addressing modes available on the machine don't allow offsets in addresses, odds are that post-increments will be represented in trees and RTL as: y <= x + 1 ... *(x) ... x <= y so deal with it by turning such RTL as: (set y (plus x n)) ... (mem x) ... without intervening uses of y into (set y x) ... (mem (post_add y n)) ... so as to create auto-inc addresses that we'd otherwise miss. for gcc/ChangeLog PR rtl-optimization/81611 * auto-inc-dec.c (attempt_change): Move dead note from mem_insn if it's the next use of regno (find_address): Take address use of reg holding non-incremented value. Add parm to limit search to the named reg only. (merge_in_block): Attempt to use a mem insn that is the next use of the original regno. --- gcc/auto-inc-dec.c | 140 1 file changed, 130 insertions(+), 10 deletions(-) diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c index d02fa9d081c7..e6dc1c30d716 100644 --- a/gcc/auto-inc-dec.c +++ b/gcc/auto-inc-dec.c @@ -508,7 +508,11 @@ attempt_change (rtx new_addr, rtx inc_reg) before the memory reference. */ gcc_assert (mov_insn); emit_insn_before (mov_insn, inc_insn.insn); - move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0); + regno = REGNO (inc_insn.reg0); + if (reg_next_use[regno] == mem_insn.insn) + move_dead_notes (mov_insn, mem_insn.insn, inc_insn.reg0); + else + move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0); regno = REGNO (inc_insn.reg_res); reg_next_def[regno] = mov_insn; @@ -825,13 +829,15 @@ parse_add_or_inc (rtx_insn *insn, bool before_mem) /* A recursive function that checks all of the mem uses in ADDRESS_OF_X to see if any single one of them is compatible with - what has been found in inc_insn. + what has
Re: [PR81611] improve auto-inc
On 01/23/2018 08:42 PM, Alexandre Oliva wrote: > These two patches fix PR81611. > > The first one improves forwprop so that we avoid adding SSA conflicting > by forwpropping the iv increment, which may cause both the incremented > and the original value to be live, even when the iv is copied between > the PHI node and the increment. We already handled the case in which > there aren't any such copies. > > Alas, this is not enough to address the problem on avr, even though it > fixes it on e.g. ppc. The reason is that avr rejects var+offset > addresses, and this prevents the memory access in a post-inc code > sequence from being adjusted to an address that auto-inc-dec can > recognize. > > The second patch adjusts auto-inc-dec to recognize a code sequence in > which the original, unincremented pseudo is used in an address after > it's incremented into another pseudo, and turn that into a post-inc > address, leaving the copying for subsequent passes to eliminate. > > Regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu and > aarch64-linux-gnu. Ok to install? > > > I'd appreciate suggestions on how to turn the submitted testcase into a > regression test; I suppose an avr-specific test that requires the > auto-inc transformation is a possibility, but that feels a bit too > limited, doesn't it? Thoughts? Thanks in advance, > > [ snip the DOM change which was approved and installed ] > [PR81611] turn inc-and-use-of-dead-orig into auto-inc > > When the addressing modes available on the machine don't allow offsets > in addresses, odds are that post-increments will be represented in > trees and RTL as: > > y <= x + 1 > ... *(x) ... > x <= y > > so deal with this form so as to create auto-inc addresses that we'd > otherwise miss. > > > for gcc/ChangeLog > > PR rtl-optimization/81611 > * auto-inc-dec.c (attempt_change): Move dead note from > mem_insn if it's the next use of regno > (find_address): Take address use of reg holding > non-incremented value. > (merge_in_block): Attempt to use a mem insn that is the next > use of the original regno. > --- > gcc/auto-inc-dec.c | 46 -- > 1 file changed, 44 insertions(+), 2 deletions(-) > > diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c > index d02fa9d081c7..4ffbcf56a456 100644 > --- a/gcc/auto-inc-dec.c > +++ b/gcc/auto-inc-dec.c > @@ -508,7 +508,11 @@ attempt_change (rtx new_addr, rtx inc_reg) >before the memory reference. */ >gcc_assert (mov_insn); >emit_insn_before (mov_insn, inc_insn.insn); > - move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0); > + regno = REGNO (inc_insn.reg0); > + if (reg_next_use[regno] == mem_insn.insn) > + move_dead_notes (mov_insn, mem_insn.insn, inc_insn.reg0); > + else > + move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0); > >regno = REGNO (inc_insn.reg_res); >reg_next_def[regno] = mov_insn; > @@ -1370,6 +1385,33 @@ merge_in_block (int max_reg, basic_block bb) > insn_is_add_or_inc = false; > } > } > + > + if (ok && insn_is_add_or_inc > + && inc_insn.reg0 != inc_insn.reg_res) > + { > + regno = REGNO (inc_insn.reg0); > + int luid = DF_INSN_LUID (mem_insn.insn); > + mem_insn.insn = get_next_ref (regno, bb, reg_next_use); So I think a comment is warranted right as we enter the TRUE arm. At that point INC_INSN is an inc/dec. But MEM_INSN is not necessarily a memory reference. It could be a memory reference, it could be a copy, it could be something completely different (it's just the next insn that references the result of the increment). In the case we care about we want it to be a copy of INC_INSN's REG_RES back to REG0. ISTM that verifying MEM_INSN is a reg->reg copy (reg_res -> reg0) before we call get_next_ref for reg0 is advisable and probably good from a compile-time standpoint by avoiding calls into find_address. After we call get_next_ref to get the next reference of the source of the increment, then we're hoping to find a memory reference that uses REG0. But it's not guaranteed it's a memory reference insn. I was having an awful time understanding how this code could work from the comments until I put it under a debugger and got a sense of the state as we entered that IF block. Then it was much clearer :-) I believe Georg had other testcases in subsequent comments in the BZ, but I don't believe they were flagged as regressions. So I think once you check in your patch we note in the BZ that the original testcase (and thus the regression) is fixed, but that the later tests are not. We then remove the regression marker. If Georg (or anyone else) does the analysis to show those later tests are also regressions, then we'll add the regression marker
Re: [PR81611] improve auto-inc
On Jan 25, 2018, Alexandre Olivawrote: > Thanks, I'll retest with the simplified test (just in case; for I can't > recall why I ended up with all those redundant conditions), As suspected, removing the redundant tests didn't regress anything. I suppose they mattered in some earlier experimental version of the patch (I vaguely recall having a long sequence of tests within the loop at some point), and I just failed to further simplify the final form. Thanks for catching the further simplification opportunities! FTR, here's the patch I'm installing, while awaiting a review from someone else on the rtl auto-inc patch (the second and last patch in https://gcc.gnu.org/ml/gcc-patches/2018-01/msg01994.html). [PR81611] accept copies in simple_iv_increment_p If there are copies between the GIMPLE_PHI at the loop body and the increment that reaches it (presumably through a back edge), still regard it as a simple_iv_increment, so that we won't consider the value in the back edge eligible for forwprop. Doing so would risk making the phi node and the incremented conflicting value live within the loop, and the phi node to be preserved for propagated uses after the loop. for gcc/ChangeLog PR tree-optimization/81611 * tree-ssa-dom.c (simple_iv_increment_p): Skip intervening copies. --- gcc/tree-ssa-dom.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c index 2b371667253a..a6f176c5def0 100644 --- a/gcc/tree-ssa-dom.c +++ b/gcc/tree-ssa-dom.c @@ -1276,8 +1276,11 @@ record_equality (tree x, tree y, class const_and_copies *const_and_copies) /* Returns true when STMT is a simple iv increment. It detects the following situation: - i_1 = phi (..., i_2) - i_2 = i_1 +/- ... */ + i_1 = phi (..., i_k) + [...] + i_j = i_{j-1} for each j : 2 <= j <= k-1 + [...] + i_k = i_{k-1} +/- ... */ bool simple_iv_increment_p (gimple *stmt) @@ -1305,8 +1308,15 @@ simple_iv_increment_p (gimple *stmt) return false; phi = SSA_NAME_DEF_STMT (preinc); - if (gimple_code (phi) != GIMPLE_PHI) -return false; + while (gimple_code (phi) != GIMPLE_PHI) +{ + /* Follow trivial copies, but not the DEF used in a back edge, +so that we don't prevent coalescing. */ + if (!gimple_assign_ssa_name_copy_p (phi)) + return false; + preinc = gimple_assign_rhs1 (phi); + phi = SSA_NAME_DEF_STMT (preinc); +} for (i = 0; i < gimple_phi_num_args (phi); i++) if (gimple_phi_arg_def (phi, i) == lhs) -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR81611] improve auto-inc
On Jan 24, 2018, Richard Bienerwrote: > given gimple_assign_ssa_name_copy checks it is an assign *nod* > the lhs != preinc check is always false given you got to phi via > SSA_NAME_DEF_STMT of preinc. *nod* > The simple_iv_increment_p change is ok with that change. Thanks, I'll retest with the simplified test (just in case; for I can't recall why I ended up with all those redundant conditions), repost and install. -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Re: [PR81611] improve auto-inc
On Wed, Jan 24, 2018 at 4:42 AM, Alexandre Olivawrote: > These two patches fix PR81611. > > The first one improves forwprop so that we avoid adding SSA conflicting > by forwpropping the iv increment, which may cause both the incremented > and the original value to be live, even when the iv is copied between > the PHI node and the increment. We already handled the case in which > there aren't any such copies. > > Alas, this is not enough to address the problem on avr, even though it > fixes it on e.g. ppc. The reason is that avr rejects var+offset > addresses, and this prevents the memory access in a post-inc code > sequence from being adjusted to an address that auto-inc-dec can > recognize. > > The second patch adjusts auto-inc-dec to recognize a code sequence in > which the original, unincremented pseudo is used in an address after > it's incremented into another pseudo, and turn that into a post-inc > address, leaving the copying for subsequent passes to eliminate. > > Regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu and > aarch64-linux-gnu. Ok to install? > > > I'd appreciate suggestions on how to turn the submitted testcase into a > regression test; I suppose an avr-specific test that requires the > auto-inc transformation is a possibility, but that feels a bit too > limited, doesn't it? Thoughts? Thanks in advance, > > > [PR81611] accept copies in simple_iv_increment_p > > If there are copies between the GIMPLE_PHI at the loop body and the > increment that reaches it (presumably through a back edge), still > regard it as a simple_iv_increment, so that we won't consider the > value in the back edge eligible for forwprop. Doing so would risk > making the phi node and the incremented conflicting value live > within the loop, and the phi node to be preserved for propagated > uses after the loop. > > for gcc/ChangeLog > > PR tree-optimization/81611 > * tree-ssa-dom.c (simple_iv_increment_p): Skip intervening > copies. > --- > gcc/tree-ssa-dom.c | 21 + > 1 file changed, 17 insertions(+), 4 deletions(-) > > diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c > index 2b371667253a..3c0ff9458342 100644 > --- a/gcc/tree-ssa-dom.c > +++ b/gcc/tree-ssa-dom.c > @@ -1276,8 +1276,11 @@ record_equality (tree x, tree y, class > const_and_copies *const_and_copies) > /* Returns true when STMT is a simple iv increment. It detects the > following situation: > > - i_1 = phi (..., i_2) > - i_2 = i_1 +/- ... */ > + i_1 = phi (..., i_k) > + [...] > + i_j = i_{j-1} for each j : 2 <= j <= k-1 > + [...] > + i_k = i_{k-1} +/- ... */ > > bool > simple_iv_increment_p (gimple *stmt) > @@ -1305,8 +1308,18 @@ simple_iv_increment_p (gimple *stmt) > return false; > >phi = SSA_NAME_DEF_STMT (preinc); > - if (gimple_code (phi) != GIMPLE_PHI) > -return false; > + while (gimple_code (phi) != GIMPLE_PHI) > +{ > + /* Follow trivial copies, but not the DEF used in a back edge, > +so that we don't prevent coalescing. */ > + if (gimple_code (phi) != GIMPLE_ASSIGN > + || gimple_assign_lhs (phi) != preinc > + || !gimple_assign_ssa_name_copy_p (phi)) given gimple_assign_ssa_name_copy checks it is an assign just do if (!gimple_assign_ssa-anme_Copy_p (phi)) the lhs != preinc check is always false given you got to phi via SSA_NAME_DEF_STMT of preinc. The simple_iv_increment_p change is ok with that change. The other change is RTL which I defer to somebody else. Richard. > + return false; > + > + preinc = gimple_assign_rhs1 (phi); > + phi = SSA_NAME_DEF_STMT (preinc); > +} > >for (i = 0; i < gimple_phi_num_args (phi); i++) > if (gimple_phi_arg_def (phi, i) == lhs) > > > [PR81611] turn inc-and-use-of-dead-orig into auto-inc > > When the addressing modes available on the machine don't allow offsets > in addresses, odds are that post-increments will be represented in > trees and RTL as: > > y <= x + 1 > ... *(x) ... > x <= y > > so deal with this form so as to create auto-inc addresses that we'd > otherwise miss. > > > for gcc/ChangeLog > > PR rtl-optimization/81611 > * auto-inc-dec.c (attempt_change): Move dead note from > mem_insn if it's the next use of regno > (find_address): Take address use of reg holding > non-incremented value. > (merge_in_block): Attempt to use a mem insn that is the next > use of the original regno. > --- > gcc/auto-inc-dec.c | 46 -- > 1 file changed, 44 insertions(+), 2 deletions(-) > > diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c > index d02fa9d081c7..4ffbcf56a456 100644 > --- a/gcc/auto-inc-dec.c > +++ b/gcc/auto-inc-dec.c > @@ -508,7 +508,11 @@ attempt_change (rtx new_addr, rtx inc_reg) > before the memory reference. */ >gcc_assert (mov_insn); >emit_insn_before (mov_insn,
[PR81611] improve auto-inc
These two patches fix PR81611. The first one improves forwprop so that we avoid adding SSA conflicting by forwpropping the iv increment, which may cause both the incremented and the original value to be live, even when the iv is copied between the PHI node and the increment. We already handled the case in which there aren't any such copies. Alas, this is not enough to address the problem on avr, even though it fixes it on e.g. ppc. The reason is that avr rejects var+offset addresses, and this prevents the memory access in a post-inc code sequence from being adjusted to an address that auto-inc-dec can recognize. The second patch adjusts auto-inc-dec to recognize a code sequence in which the original, unincremented pseudo is used in an address after it's incremented into another pseudo, and turn that into a post-inc address, leaving the copying for subsequent passes to eliminate. Regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu and aarch64-linux-gnu. Ok to install? I'd appreciate suggestions on how to turn the submitted testcase into a regression test; I suppose an avr-specific test that requires the auto-inc transformation is a possibility, but that feels a bit too limited, doesn't it? Thoughts? Thanks in advance, [PR81611] accept copies in simple_iv_increment_p If there are copies between the GIMPLE_PHI at the loop body and the increment that reaches it (presumably through a back edge), still regard it as a simple_iv_increment, so that we won't consider the value in the back edge eligible for forwprop. Doing so would risk making the phi node and the incremented conflicting value live within the loop, and the phi node to be preserved for propagated uses after the loop. for gcc/ChangeLog PR tree-optimization/81611 * tree-ssa-dom.c (simple_iv_increment_p): Skip intervening copies. --- gcc/tree-ssa-dom.c | 21 + 1 file changed, 17 insertions(+), 4 deletions(-) diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c index 2b371667253a..3c0ff9458342 100644 --- a/gcc/tree-ssa-dom.c +++ b/gcc/tree-ssa-dom.c @@ -1276,8 +1276,11 @@ record_equality (tree x, tree y, class const_and_copies *const_and_copies) /* Returns true when STMT is a simple iv increment. It detects the following situation: - i_1 = phi (..., i_2) - i_2 = i_1 +/- ... */ + i_1 = phi (..., i_k) + [...] + i_j = i_{j-1} for each j : 2 <= j <= k-1 + [...] + i_k = i_{k-1} +/- ... */ bool simple_iv_increment_p (gimple *stmt) @@ -1305,8 +1308,18 @@ simple_iv_increment_p (gimple *stmt) return false; phi = SSA_NAME_DEF_STMT (preinc); - if (gimple_code (phi) != GIMPLE_PHI) -return false; + while (gimple_code (phi) != GIMPLE_PHI) +{ + /* Follow trivial copies, but not the DEF used in a back edge, +so that we don't prevent coalescing. */ + if (gimple_code (phi) != GIMPLE_ASSIGN + || gimple_assign_lhs (phi) != preinc + || !gimple_assign_ssa_name_copy_p (phi)) + return false; + + preinc = gimple_assign_rhs1 (phi); + phi = SSA_NAME_DEF_STMT (preinc); +} for (i = 0; i < gimple_phi_num_args (phi); i++) if (gimple_phi_arg_def (phi, i) == lhs) [PR81611] turn inc-and-use-of-dead-orig into auto-inc When the addressing modes available on the machine don't allow offsets in addresses, odds are that post-increments will be represented in trees and RTL as: y <= x + 1 ... *(x) ... x <= y so deal with this form so as to create auto-inc addresses that we'd otherwise miss. for gcc/ChangeLog PR rtl-optimization/81611 * auto-inc-dec.c (attempt_change): Move dead note from mem_insn if it's the next use of regno (find_address): Take address use of reg holding non-incremented value. (merge_in_block): Attempt to use a mem insn that is the next use of the original regno. --- gcc/auto-inc-dec.c | 46 -- 1 file changed, 44 insertions(+), 2 deletions(-) diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c index d02fa9d081c7..4ffbcf56a456 100644 --- a/gcc/auto-inc-dec.c +++ b/gcc/auto-inc-dec.c @@ -508,7 +508,11 @@ attempt_change (rtx new_addr, rtx inc_reg) before the memory reference. */ gcc_assert (mov_insn); emit_insn_before (mov_insn, inc_insn.insn); - move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0); + regno = REGNO (inc_insn.reg0); + if (reg_next_use[regno] == mem_insn.insn) + move_dead_notes (mov_insn, mem_insn.insn, inc_insn.reg0); + else + move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0); regno = REGNO (inc_insn.reg_res); reg_next_def[regno] = mov_insn; @@ -842,7 +846,7 @@ find_address (rtx *address_of_x) if (code == MEM && rtx_equal_p (XEXP (x, 0), inc_insn.reg_res)) { - /* Match with *reg0. */ + /* Match with *reg_res. */ mem_insn.mem_loc = address_of_x;