On Mon, 22 Oct 2018 at 22:17, Segher Boessenkool <seg...@kernel.crashing.org> wrote: > > On most targets every function starts with moves from the parameter > passing (hard) registers into pseudos. Similarly, after every call > there is a move from the return register into a pseudo. These moves > usually combine with later instructions (leaving pretty much the same > instruction, just with a hard reg instead of a pseudo). > > This isn't a good idea. Register allocation can get rid of unnecessary > moves just fine, and moving the parameter passing registers into many > later instructions tends to prevent good register allocation. This > patch disallows combining moves from a hard (non-fixed) register. > > This also avoid the problem mentioned in PR87600 #c3 (combining hard > registers into inline assembler is problematic). > > Because the register move can often be combined with other instructions > *itself*, for example for setting some condition code, this patch adds > extra copies via new pseudos after every copy-from-hard-reg. > > On some targets this reduces average code size. On others it increases > it a bit, 0.1% or 0.2% or so. (I tested this on all *-linux targets). > > I'll commit this to trunk now. If there are problems, please don't > hesitate to let me know! Thanks. >
Hi, I have noticed many regressions on arm and aarch64 between 265366 and 265408 (this commit is 265398). I bisected at least one to this commit on aarch64: FAIL: gcc.dg/ira-shrinkwrap-prep-1.c scan-rtl-dump ira "Split live-range of register" The same test also regresses on arm. For a whole picture of all the regressions I noticed during these two commits, have a look at: http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/265408/report-build-info.html Christophe > > Segher > > > 2018-10-22 Segher Boessenkool <seg...@kernel.crashing.org> > > PR rtl-optimization/87600 > * combine.c: Add include of expr.h. > (cant_combine_insn_p): Do not combine moves from any hard non-fixed > register to a pseudo. > (make_more_copies): New function, add a copy to a new pseudo after > the moves from hard registers into pseudos. > (rest_of_handle_combine): Declare rebuild_jump_labels_after_combine > later. Call make_more_copies. > > --- > gcc/combine.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 46 insertions(+), 4 deletions(-) > > diff --git a/gcc/combine.c b/gcc/combine.c > index 256b5a4..3ff1760 100644 > --- a/gcc/combine.c > +++ b/gcc/combine.c > @@ -99,6 +99,7 @@ along with GCC; see the file COPYING3. If not see > #include "explow.h" > #include "insn-attr.h" > #include "rtlhooks-def.h" > +#include "expr.h" > #include "params.h" > #include "tree-pass.h" > #include "valtrack.h" > @@ -2348,8 +2349,7 @@ cant_combine_insn_p (rtx_insn *insn) > dest = SUBREG_REG (dest); > if (REG_P (src) && REG_P (dest) > && ((HARD_REGISTER_P (src) > - && ! TEST_HARD_REG_BIT (fixed_reg_set, REGNO (src)) > - && targetm.class_likely_spilled_p (REGNO_REG_CLASS (REGNO (src)))) > + && ! TEST_HARD_REG_BIT (fixed_reg_set, REGNO (src))) > || (HARD_REGISTER_P (dest) > && ! TEST_HARD_REG_BIT (fixed_reg_set, REGNO (dest)) > && targetm.class_likely_spilled_p (REGNO_REG_CLASS (REGNO > (dest)))))) > @@ -14969,11 +14969,53 @@ dump_combine_total_stats (FILE *file) > total_attempts, total_merges, total_extras, total_successes); > } > > +/* Make pseudo-to-pseudo copies after every hard-reg-to-pseudo-copy, because > + the reg-to-reg copy can usefully combine with later instructions, but we > + do not want to combine the hard reg into later instructions, for that > + restricts register allocation. */ > +static void > +make_more_copies (void) > +{ > + basic_block bb; > + > + FOR_EACH_BB_FN (bb, cfun) > + { > + rtx_insn *insn; > + > + FOR_BB_INSNS (bb, insn) > + { > + if (!NONDEBUG_INSN_P (insn)) > + continue; > + > + rtx set = single_set (insn); > + if (!set) > + continue; > + rtx src = SET_SRC (set); > + rtx dest = SET_DEST (set); > + if (GET_CODE (src) == SUBREG) > + src = SUBREG_REG (src); > + if (!(REG_P (src) && HARD_REGISTER_P (src))) > + continue; > + if (TEST_HARD_REG_BIT (fixed_reg_set, REGNO (src))) > + continue; > + > + rtx new_reg = gen_reg_rtx (GET_MODE (dest)); > + rtx_insn *insn1 = gen_move_insn (new_reg, src); > + rtx_insn *insn2 = gen_move_insn (dest, new_reg); > + emit_insn_after (insn1, insn); > + emit_insn_after (insn2, insn1); > + delete_insn (insn); > + > + insn = insn2; > + } > + } > +} > + > /* Try combining insns through substitution. */ > static unsigned int > rest_of_handle_combine (void) > { > - int rebuild_jump_labels_after_combine; > + make_more_copies (); > > df_set_flags (DF_LR_RUN_DCE + DF_DEFER_INSN_RESCAN); > df_note_add_problem (); > @@ -14982,7 +15024,7 @@ rest_of_handle_combine (void) > regstat_init_n_sets_and_refs (); > reg_n_sets_max = max_reg_num (); > > - rebuild_jump_labels_after_combine > + int rebuild_jump_labels_after_combine > = combine_instructions (get_insns (), max_reg_num ()); > > /* Combining insns may have turned an indirect jump into a > -- > 1.8.3.1 >