On 2020-03-20 1:19 p.m., Richard Sandiford wrote:
Ping
Richard, sorry.  I missed your original message.
Richard Sandiford <richard.sandif...@arm.com> writes:
[See:

   https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541694.html
   https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541759.html

for a walkthrough of what goes wrong in the testcase, but hopefully
the change makes sense on first principles.]

simplify_operand_subreg tries to detect whether the allocation for
a pseudo in a paradoxical subreg is also valid for the outer mode.
The condition it used to check for an invalid combination was:

   else if (REG_P (reg)
           && REGNO (reg) >= FIRST_PSEUDO_REGISTER
           && (hard_regno = lra_get_regno_hard_regno (REGNO (reg))) >= 0
           && (hard_regno_nregs (hard_regno, innermode)
               < hard_regno_nregs (hard_regno, mode))
           && (regclass = lra_get_allocno_class (REGNO (reg)))
           && (type != OP_IN
               || !in_hard_reg_set_p (reg_class_contents[regclass],
                                      mode, hard_regno)
               || overlaps_hard_reg_set_p (lra_no_alloc_regs,
                                           mode, hard_regno)))

I think there are two problems with this:

(1) It never actually checks whether the hard register is valid for the
     outer mode (in the hard_regno_mode_ok sense).  If it isn't, any attempt
     to reload in the outer mode is likely to cycle, because the implied
     regno/mode combination will be just as invalid next time
     curr_insn_transform sees the subreg.

(2) The check is valid for little-endian only.  For big-endian we need
     to move hard_regno backwards.

Using simplify_subreg_regno should avoid both problems.

As the existing comment says, IRA should always take subreg references
into account when allocating hard registers, so this fix-up should only
really be needed for pseudos allocated by LRA itself.

Tested on aarch64-linux-gnu, armeb-eabi and x86_64-linux-gnu.  OK for master?
I think for backports it would be safer to go with Tamar's patch.

OK.  The patch is logical but as usually it is hard to predict how it will behave.  So we need to pay attention to its effect after its committing.

Btw, I checked my patches for PR92303 and PR94185.  It seems they also solve the problem with cycling in PR94052.

gcc/
2020-03-10  Richard Sandiford  <richard.sandif...@arm.com>

        PR rtl-optimization/94052
        * lra-constraints.c (simplify_operand_subreg): Reload the inner
        register of a paradoxical subreg if simplify_subreg_regno fails
        to give a valid hard register for the outer mode.

gcc/testsuite/
2020-03-05  Tamar Christina  <tamar.christ...@arm.com>

        PR target/94052
        * gcc.target/aarch64/pr94052.C: New test.

Reply via email to