On Sat, May 23, 2015 at 08:45:23AM -0500, Segher Boessenkool wrote:
> Thanks.  Did you see improvements around return as well, or mostly /
> only related to the function start?

The rlwinm vs. rldicl change was in dwarf2out.c add_ranges_num for r3
one insn before a blr.  I'm sure that was due to not combining return
value copies.  In fact, I'm sure all the improvements I saw were due
to changing the exit..  See below.

[snip]
> I agree it looks like a bug waiting to happen.  Please post it as a
> separate patch though.

OK.

> > +/* Twiddle BLOCK_FOR_INSN to TO for instructions in the first block BB
> > +   that we don't want to combine with other instructions.  */
> > +
> > +static void
> > +twiddle_first_block (basic_block bb, basic_block to)
> 
> I don't like this much.  Messing with global state makes things harder
> to change later.  If it is hard to come up with a good name for a
> factor, it usually means it is not such a good factor.
> 
> You can do these inside can_combine_{def,use}_p as far as I can see?
> Probably need to give those an extra parameter then: for _def, a bool
> that says "don't allow moves from hard regs", set when the scan has
> encountered a NOTE_INSN_FUNCTION_BEG in this bb; and for _use, a regset
> of those hard regs we don't want to allow moves to (those seen in USEs
> later in that same block).  Or do it in the main loop itself, _{def,use}
> are each called in one place only; whatever works best.

Huh, that's the way I started writing this patch..  The reason I went
with modifying BLOCK_FOR_INSN is that the loop setting up LOG_LINKS
needs to test that anyway.  Changing code in the main loop means
every insn in a function will need to be tested for additional
conditions.  I thought that might be slower.  I can see your point
though, especially if someone wanted to wean combine off LOG_LINKS.
I'll rewrite the patch, which I realized was buggy anyway, and didn't
prevent param copies from being combined.  :-(

> What makes return values different from CALL args here, btw?  Why do
> you not want to combine return values, but you leave alone call args?

I don't think there is much difference between SETs for return values
and SETs for call args.  The reason I deliberately left them out is
that in the original discussion we were talking about parameters and
return values.  So I thought I'd better restrict the change to just
those SETs.

It would be a much simpler patch to make combine ignore all SETs
that are a reg,reg copy with one of them a hard reg.  I was a little
worried I'd regress some target if I tried that.  (Results on
powerpc64le-linux for such a change look good.  A lot more cases where
code is better, due to catching the parameter reg,reg copies.  In
looking at gcc/*.o I haven't yet seen any regressions in code quality.)

> > +/* Fill in log links field for all insns that we wish to combine.  */
> 
> Please don't change this comment; it was more correct before.

But it wasn't correct!  LOG_LINKS are not set up for *all* insns.
Perhaps I should add "that we might wish to combine" rather than
"that we wish to combine"?

> > @@ -1103,7 +1192,7 @@ create_log_links (void)
> >              we might wind up changing the semantics of the insn,
> >              even if reload can make what appear to be valid
> >              assignments later.  */
> > -         if (regno < FIRST_PSEUDO_REGISTER
> > +         if (HARD_REGISTER_NUM_P (regno)
> >               && asm_noperands (PATTERN (use_insn)) >= 0)
> >             continue;
> 
> An independent change.

Yeah, I guess.  I tidy these if I'm working on a piece of code.
Here's the whole file done.  Boostrapped and regression tested
powerpc64le-linux and x86_64-linux.

        * combine.c: Use HARD_REGISTER_P and HARD_REGISTER_NUM_P as
        appropriate throughout file in place of comparing regno
        against FIRST_PSEUDO_REGISTER.

Index: combine.c
===================================================================
--- combine.c   (revision 223463)
+++ combine.c   (working copy)
@@ -1103,7 +1103,7 @@ create_log_links (void)
                 we might wind up changing the semantics of the insn,
                 even if reload can make what appear to be valid
                 assignments later.  */
-             if (regno < FIRST_PSEUDO_REGISTER
+             if (HARD_REGISTER_NUM_P (regno)
                  && asm_noperands (PATTERN (use_insn)) >= 0)
                continue;
 
@@ -1730,7 +1730,7 @@ set_nonzero_bits_and_sign_copies (rtx x, const_rtx
   rtx_insn *insn = (rtx_insn *) data;
 
   if (REG_P (x)
-      && REGNO (x) >= FIRST_PSEUDO_REGISTER
+      && !HARD_REGISTER_P (x)
       /* If this register is undefined at the start of the file, we can't
         say what its contents were.  */
       && ! REGNO_REG_SET_P
@@ -1897,7 +1897,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_i
                          && (REGNO (XEXP (i3elt, 0)) == regno
                              ? reg_set_between_p (XEXP (elt, 0),
                                                   PREV_INSN (insn), i3)
-                             : regno >= FIRST_PSEUDO_REGISTER))
+                             : !HARD_REGISTER_NUM_P (regno)))
                        return 0;
                    }
                  while (--i >= 0);
@@ -1971,7 +1971,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_i
       || (CALL_P (i3)
          && (find_reg_fusage (i3, USE, dest)
              || (REG_P (dest)
-                 && REGNO (dest) < FIRST_PSEUDO_REGISTER
+                 && HARD_REGISTER_P (dest)
                  && global_regs[REGNO (dest)])))
       /* Don't substitute into an incremented register.  */
       || FIND_REG_INC_NOTE (i3, dest)
@@ -2021,7 +2021,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_i
         register.  */
 
       if (REG_P (src)
-         && ((REGNO (dest) < FIRST_PSEUDO_REGISTER
+         && ((HARD_REGISTER_P (dest)
               && ! HARD_REGNO_MODE_OK (REGNO (dest), GET_MODE (dest)))
              /* Don't extend the life of a hard register unless it is
                 user variable (if we have few registers) or it can't
@@ -2030,7 +2030,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_i
                 Also avoid substituting a return register into I3, because
                 reload can't handle a conflict with constraints of other
                 inputs.  */
-             || (REGNO (src) < FIRST_PSEUDO_REGISTER
+             || (HARD_REGISTER_P (src)
                  && ! HARD_REGNO_MODE_OK (REGNO (src), GET_MODE (src)))))
        return 0;
     }
@@ -2052,7 +2052,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_i
             we leave it up to the machine description to either accept or
             reject use-and-clobber patterns.  */
          if (!REG_P (reg)
-             || REGNO (reg) >= FIRST_PSEUDO_REGISTER
+             || !HARD_REGISTER_P (reg)
              || !fixed_regs[REGNO (reg)])
            if (reg_overlap_mentioned_p (reg, src))
              return 0;
@@ -2075,7 +2075,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_i
      to be an explicit register variable, and was chosen for a reason.  */
 
   if (GET_CODE (src) == ASM_OPERANDS
-      && REG_P (dest) && REGNO (dest) < FIRST_PSEUDO_REGISTER)
+      && REG_P (dest) && HARD_REGISTER_P (dest))
     return 0;
 
   /* If INSN contains volatile references (specifically volatile MEMs),
@@ -2221,7 +2221,7 @@ combinable_i3pat (rtx_insn *i3, rtx *loc, rtx i2de
             checks this; here, we do a more specific test for this case.  */
 
          || (REG_P (inner_dest)
-             && REGNO (inner_dest) < FIRST_PSEUDO_REGISTER
+             && HARD_REGISTER_P (inner_dest)
              && (! HARD_REGNO_MODE_OK (REGNO (inner_dest),
                                        GET_MODE (inner_dest))))
          || (i1_not_in_src && reg_overlap_mentioned_p (i1dest, src))
@@ -2469,7 +2469,7 @@ can_change_dest_mode (rtx x, int added_sets, machi
   regno = REGNO (x);
   /* Allow hard registers if the new mode is legal, and occupies no more
      registers than the old mode.  */
-  if (regno < FIRST_PSEUDO_REGISTER)
+  if (HARD_REGISTER_NUM_P (regno))
     return (HARD_REGNO_MODE_OK (regno, mode)
            && REG_NREGS (x) >= hard_regno_nregs[regno][mode]);
 
@@ -2790,7 +2790,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn
 
   if (i1 == 0 && NONJUMP_INSN_P (i3) && GET_CODE (PATTERN (i3)) == SET
       && REG_P (SET_SRC (PATTERN (i3)))
-      && REGNO (SET_SRC (PATTERN (i3))) >= FIRST_PSEUDO_REGISTER
+      && !HARD_REGISTER_P (SET_SRC (PATTERN (i3)))
       && find_reg_note (i3, REG_DEAD, SET_SRC (PATTERN (i3)))
       && GET_CODE (PATTERN (i2)) == PARALLEL
       && ! side_effects_p (SET_DEST (PATTERN (i3)))
@@ -3219,7 +3219,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn
                {
                  unsigned int regno = REGNO (newpat_dest);
                  compare_mode = new_mode;
-                 if (regno < FIRST_PSEUDO_REGISTER)
+                 if (HARD_REGISTER_NUM_P (regno))
                    newpat_dest = gen_rtx_REG (compare_mode, regno);
                  else
                    {
@@ -3588,7 +3588,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn
              machine_mode old_mode = GET_MODE (i2dest);
              rtx ni2dest;
 
-             if (REGNO (i2dest) < FIRST_PSEUDO_REGISTER)
+             if (HARD_REGISTER_P (i2dest))
                ni2dest = gen_rtx_REG (new_mode, REGNO (i2dest));
              else
                {
@@ -3604,7 +3604,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn
              m_split_insn = combine_split_insns (parallel, i3);
 
              if (m_split_insn == 0
-                 && REGNO (i2dest) >= FIRST_PSEUDO_REGISTER)
+                 && !HARD_REGISTER_P (i2dest))
                {
                  struct undo *buf;
 
@@ -3725,7 +3725,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn
             validated that we can do this.  */
          if (GET_MODE (i2dest) != split_mode && split_mode != VOIDmode)
            {
-             if (REGNO (i2dest) < FIRST_PSEUDO_REGISTER)
+             if (HARD_REGISTER_P (i2dest))
                newdest = gen_rtx_REG (split_mode, REGNO (i2dest));
              else
                {
@@ -5390,7 +5390,7 @@ subst (rtx x, rtx from, rtx to, int in_dest, int i
 
                  if (code == SUBREG
                      && REG_P (to)
-                     && REGNO (to) < FIRST_PSEUDO_REGISTER
+                     && HARD_REGISTER_P (to)
                      && simplify_subreg_regno (REGNO (to), GET_MODE (to),
                                                SUBREG_BYTE (x),
                                                GET_MODE (x)) < 0)
@@ -6645,7 +6645,7 @@ simplify_set (rtx x)
              unsigned int regno = REGNO (dest);
              rtx new_dest;
 
-             if (regno < FIRST_PSEUDO_REGISTER)
+             if (HARD_REGISTER_NUM_P (regno))
                new_dest = gen_rtx_REG (compare_mode, regno);
              else
                {
@@ -6750,7 +6750,7 @@ simplify_set (rtx x)
        < GET_MODE_SIZE (GET_MODE (SUBREG_REG (src))))
 #endif
 #ifdef CANNOT_CHANGE_MODE_CLASS
-      && ! (REG_P (dest) && REGNO (dest) < FIRST_PSEUDO_REGISTER
+      && ! (REG_P (dest) && HARD_REGISTER_P (dest)
            && REG_CANNOT_CHANGE_MODE_P (REGNO (dest),
                                         GET_MODE (SUBREG_REG (src)),
                                         GET_MODE (src)))
@@ -9815,7 +9815,7 @@ reg_nonzero_bits_for_combine (const_rtx x, machine
           && rsp->last_set_label < label_tick)
          || (rsp->last_set_label == label_tick
               && DF_INSN_LUID (rsp->last_set) < subst_low_luid)
-         || (REGNO (x) >= FIRST_PSEUDO_REGISTER
+         || (!HARD_REGISTER_P (x)
              && REGNO (x) < reg_n_sets_max
              && REG_N_SETS (REGNO (x)) == 1
              && !REGNO_REG_SET_P
@@ -9879,7 +9879,7 @@ reg_num_sign_bit_copies_for_combine (const_rtx x,
           && rsp->last_set_label < label_tick)
          || (rsp->last_set_label == label_tick
               && DF_INSN_LUID (rsp->last_set) < subst_low_luid)
-         || (REGNO (x) >= FIRST_PSEUDO_REGISTER
+         || (!HARD_REGISTER_P (x)
              && REGNO (x) < reg_n_sets_max
              && REG_N_SETS (REGNO (x)) == 1
              && !REGNO_REG_SET_P
@@ -12921,7 +12921,7 @@ record_truncated_value (rtx x)
     }
   /* ??? For hard-regs we now record everything.  We might be able to
      optimize this using last_set_mode.  */
-  else if (REG_P (x) && REGNO (x) < FIRST_PSEUDO_REGISTER)
+  else if (REG_P (x) && HARD_REGISTER_P (x))
     truncated_mode = GET_MODE (x);
   else
     return false;
@@ -13012,7 +13012,7 @@ get_last_value_validate (rtx *loc, rtx_insn *insn,
          if (rsp->last_set_invalid
              /* If this is a pseudo-register that was only set once and not
                 live at the beginning of the function, it is always valid.  */
-             || (! (regno >= FIRST_PSEUDO_REGISTER
+             || (! (!HARD_REGISTER_NUM_P (regno)
                     && regno < reg_n_sets_max
                     && REG_N_SETS (regno) == 1
                     && (!REGNO_REG_SET_P
@@ -13129,7 +13129,7 @@ get_last_value (const_rtx x)
 
   if (value == 0
       || (rsp->last_set_label < label_tick_ebb_start
-         && (regno < FIRST_PSEUDO_REGISTER
+         && (HARD_REGISTER_NUM_P (regno)
              || regno >= reg_n_sets_max
              || REG_N_SETS (regno) != 1
              || REGNO_REG_SET_P
@@ -13257,7 +13257,7 @@ reg_dead_at_p (rtx reg, rtx_insn *insn)
   /* Check that reg isn't mentioned in NEWPAT_USED_REGS.  For fixed registers
      we allow the machine description to decide whether use-and-clobber
      patterns are OK.  */
-  if (reg_dead_regno < FIRST_PSEUDO_REGISTER)
+  if (HARD_REGISTER_NUM_P (reg_dead_regno))
     {
       for (i = reg_dead_regno; i < reg_dead_endregno; i++)
        if (!fixed_regs[i] && TEST_HARD_REG_BIT (newpat_used_regs, i))
@@ -13331,7 +13331,7 @@ mark_used_regs_combine (rtx x)
       regno = REGNO (x);
       /* A hard reg in a wide mode may really be multiple registers.
         If so, mark all of them just like the first.  */
-      if (regno < FIRST_PSEUDO_REGISTER)
+      if (HARD_REGISTER_NUM_P (regno))
        {
          /* None of this applies to the stack, frame or arg pointers.  */
          if (regno == STACK_POINTER_REGNUM
@@ -13449,7 +13449,7 @@ move_deaths (rtx x, rtx maybe_kill_insn, int from_
             including X.  In that case, we must put REG_DEAD notes for
             the remaining registers in place of NOTE.  */
 
-         if (note != 0 && regno < FIRST_PSEUDO_REGISTER
+         if (note != 0 && HARD_REGISTER_NUM_P (regno)
              && (GET_MODE_SIZE (GET_MODE (XEXP (note, 0)))
                  > GET_MODE_SIZE (GET_MODE (x))))
            {
@@ -13472,7 +13472,7 @@ move_deaths (rtx x, rtx maybe_kill_insn, int from_
                    || (note != 0
                        && (GET_MODE_SIZE (GET_MODE (XEXP (note, 0)))
                            < GET_MODE_SIZE (GET_MODE (x)))))
-                  && regno < FIRST_PSEUDO_REGISTER
+                  && HARD_REGISTER_NUM_P (regno)
                   && REG_NREGS (x) > 1)
            {
              unsigned int ourend = END_REGNO (x);
@@ -13588,7 +13588,7 @@ reg_bitfield_target_p (rtx x, rtx body)
        return 0;
 
       tregno = REGNO (target), regno = REGNO (x);
-      if (tregno >= FIRST_PSEUDO_REGISTER || regno >= FIRST_PSEUDO_REGISTER)
+      if (!HARD_REGISTER_NUM_P (tregno) || !HARD_REGISTER_NUM_P (regno))
        return target == x;
 
       endtregno = end_hard_regno (GET_MODE (target), tregno);
@@ -13922,7 +13922,7 @@ distribute_notes (rtx notes, rtx_insn *from_insn,
                     TEM_INSN is doing.  If so, delete TEM_INSN.  Otherwise, 
make this
                     into a REG_UNUSED note instead. Don't delete sets to
                     global register vars.  */
-                 if ((REGNO (XEXP (note, 0)) >= FIRST_PSEUDO_REGISTER
+                 if ((!HARD_REGISTER_P (XEXP (note, 0))
                       || !global_regs[REGNO (XEXP (note, 0))])
                      && reg_set_p (XEXP (note, 0), PATTERN (tem_insn)))
                    {

-- 
Alan Modra
Australia Development Lab, IBM

Reply via email to