On Tue, Jul 24, 2018 at 7:18 PM Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
>
> This patch allows combine to combine two insns into two.  This helps
> in many cases, by reducing instruction path length, and also allowing
> further combinations to happen.  PR85160 is a typical example of code
> that it can improve.
>
> This patch does not allow such combinations if either of the original
> instructions was a simple move instruction.  In those cases combining
> the two instructions increases register pressure without improving the
> code.  With this move test register pressure no longer increases
> noticeably as far as I can tell.
>
> (At first I also didn't allow either of the resulting insns to be a
> move instruction.  But that is actually a very good thing to have, as
> should have been obvious).
>
> Tested for many months; tested on about 30 targets.
>
> I'll commit this later this week if there are no objections.

Sounds good - but, _any_ testcase?  Please! ;)

Richard.

>
> Segher
>
>
> 2018-07-24  Segher Boessenkool  <seg...@kernel.crashing.org>
>
>         PR rtl-optimization/85160
>         * combine.c (is_just_move): New function.
>         (try_combine): Allow combining two instructions into two if neither of
>         the original instructions was a move.
>
> ---
>  gcc/combine.c | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index cfe0f19..d64e84d 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -2604,6 +2604,17 @@ can_split_parallel_of_n_reg_sets (rtx_insn *insn, int n)
>    return true;
>  }
>
> +/* Return whether X is just a single set, with the source
> +   a general_operand.  */
> +static bool
> +is_just_move (rtx x)
> +{
> +  if (INSN_P (x))
> +    x = PATTERN (x);
> +
> +  return (GET_CODE (x) == SET && general_operand (SET_SRC (x), VOIDmode));
> +}
> +
>  /* Try to combine the insns I0, I1 and I2 into I3.
>     Here I0, I1 and I2 appear earlier than I3.
>     I0 and I1 can be zero; then we combine just I2 into I3, or I1 and I2 into
> @@ -2668,6 +2679,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
>    int swap_i2i3 = 0;
>    int split_i2i3 = 0;
>    int changed_i3_dest = 0;
> +  bool i2_was_move = false, i3_was_move = false;
>
>    int maxreg;
>    rtx_insn *temp_insn;
> @@ -3059,6 +3071,10 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
>        return 0;
>      }
>
> +  /* Record whether i2 and i3 are trivial moves.  */
> +  i2_was_move = is_just_move (i2);
> +  i3_was_move = is_just_move (i3);
> +
>    /* Record whether I2DEST is used in I2SRC and similarly for the other
>       cases.  Knowing this will help in register status updating below.  */
>    i2dest_in_i2src = reg_overlap_mentioned_p (i2dest, i2src);
> @@ -4014,8 +4030,10 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
>        && XVECLEN (newpat, 0) == 2
>        && GET_CODE (XVECEXP (newpat, 0, 0)) == SET
>        && GET_CODE (XVECEXP (newpat, 0, 1)) == SET
> -      && (i1 || set_noop_p (XVECEXP (newpat, 0, 0))
> -                || set_noop_p (XVECEXP (newpat, 0, 1)))
> +      && (i1
> +         || set_noop_p (XVECEXP (newpat, 0, 0))
> +         || set_noop_p (XVECEXP (newpat, 0, 1))
> +         || (!i2_was_move && !i3_was_move))
>        && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
>        && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
>        && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
> --
> 1.8.3.1
>
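
For readers skimming the last hunk, the changed condition can be modelled
outside of GCC as a plain boolean gate.  The following is only a minimal
sketch, not GCC code: `struct combo_state` and `allow_two_to_two` are
hypothetical names made up for illustration, and the flags stand in for
"i1 is non-null", the two set_noop_p results, and the two is_just_move
results recorded by the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state try_combine consults at this
   point; none of these names exist in GCC.  */
struct combo_state
{
  bool have_i1;       /* models `i1` being non-null */
  bool set0_is_noop;  /* models set_noop_p (XVECEXP (newpat, 0, 0)) */
  bool set1_is_noop;  /* models set_noop_p (XVECEXP (newpat, 0, 1)) */
  bool i2_was_move;   /* models is_just_move (i2) */
  bool i3_was_move;   /* models is_just_move (i3) */
};

/* Models only the sub-condition touched by the patch.  The first three
   alternatives were already accepted before the patch; the last one is
   the new case: a 2->2 combination is allowed when neither original
   insn was a simple move.  */
static bool
allow_two_to_two (const struct combo_state *s)
{
  return (s->have_i1
          || s->set0_is_noop
          || s->set1_is_noop
          || (!s->i2_was_move && !s->i3_was_move));
}
```

With all flags false the gate passes, which is the newly allowed case.
Setting only i2_was_move or only i3_was_move makes it fail, matching the
description that such combinations are rejected when either original
instruction was a move.  The surrounding checks in try_combine (the
PARALLEL shape, ZERO_EXTRACT and STRICT_LOW_PART tests, and so on) are
deliberately left out of this model.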