https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70802

            Bug ID: 70802
           Summary: IRA memory cost calculation incorrect for immediates
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wdijkstr at arm dot com
  Target Milestone: ---

The following code in ira-costs.c tries to improve the memory cost for
rematerializable loads. However, it has several issues:

1. The memory cost can become negative, forcing a spill, which is known to
cause incorrect code (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242).
2. The code only handles a subset of immediate loads, not all rematerializable
values.
3. The cost adjustment is not sufficient to choose well between allocating an
immediate to a callee-save register and spilling a variable, or allocating
the variable and rematerializing the immediate.

As an example of (3), if there is only one callee-save register free, IRA will
use it to allocate the immediate rather than the variable:

float bad_alloc(float x)
{
  x += 3.0f;
  g();
  x *= 3.0f;
  return x;
}

With -O2 -fomit-frame-pointer -ffixed-d8 -ffixed-d9 -ffixed-d10 -ffixed-d11
-ffixed-d12 -ffixed-d13 -ffixed-d14:

        str     x30, [sp, -32]!
        str     d15, [sp, 8]
        fmov    s15, 3.0e+0
        fadd    s0, s0, s15
        str     s0, [sp, 28]
        bl      g
        ldr     s0, [sp, 28]
        fmul    s0, s0, s15
        ldr     d15, [sp, 8]
        ldr     x30, [sp], 32
        ret

  a0(r76,l0) costs: CALLER_SAVE_REGS:15000,15000 GENERAL_REGS:15000,15000
FP_REGS:0,0 ALL_REGS:15000,15000 MEM:12000,12000
  a1(r73,l0) costs: CALLER_SAVE_REGS:10000,10000 GENERAL_REGS:10000,10000
FP_REGS:0,0 ALL_REGS:10000,10000 MEM:8000,8000

The immediate value r76 is counted as 1 def and 2 uses, giving a memory cost
of 12000, while r73 has 1 def and 1 use, giving a memory cost of 8000. However,
the worst-case rematerialization cost of r76 would be 2 moves, one of which
already exists, so its memory cost should have been 4000.
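
The arithmetic above can be sketched in a few lines of C. This is only a toy
model, not IRA code: the function names are hypothetical, and the per-reference
cost of 4000 is inferred from the dump (12000 / 3 references, 8000 / 2
references). It also assumes that a rematerializing move costs the same as one
memory reference, which is what the figure of 4000 implies.

```c
#include <assert.h>

/* Assumed cost of one reference, inferred from the dump above:
   12000 / (1 def + 2 uses) == 8000 / (1 def + 1 use) == 4000.  */
enum { COST_PER_REF = 4000 };

/* Memory cost as currently counted: every def and every use is charged.  */
int
spill_mem_cost (int defs, int uses)
{
  assert (defs >= 0 && uses >= 0);
  return (defs + uses) * COST_PER_REF;
}

/* Worst-case rematerialization cost: one move per use, minus the one move
   that already exists (the original load of the immediate).  Assumes a move
   costs the same as one memory reference.  */
int
remat_mem_cost (int uses)
{
  assert (uses >= 0);
  int extra_moves = uses > 1 ? uses - 1 : 0;
  return extra_moves * COST_PER_REF;
}
```

Under these assumptions, spill_mem_cost (1, 2) gives the 12000 reported for
r76, spill_mem_cost (1, 1) gives the 8000 reported for r73, and
remat_mem_cost (2) gives the 4000 that r76 arguably should have had.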

ira-costs.c, ~line 1458:

  if (set != 0 && REG_P (SET_DEST (set)) && MEM_P (SET_SRC (set))
      && (note = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL_RTX
      && ((MEM_P (XEXP (note, 0))
           && !side_effects_p (SET_SRC (set)))
          || (CONSTANT_P (XEXP (note, 0))
              && targetm.legitimate_constant_p (GET_MODE (SET_DEST (set)),
                                                XEXP (note, 0))
              && REG_N_SETS (REGNO (SET_DEST (set))) == 1))
      && general_operand (SET_SRC (set), GET_MODE (SET_SRC (set))))
    {
      enum reg_class cl = GENERAL_REGS;
      rtx reg = SET_DEST (set);
      int num = COST_INDEX (REGNO (reg));

      COSTS (costs, num)->mem_cost
        -= ira_memory_move_cost[GET_MODE (reg)][cl][1] * frequency;
      record_address_regs (GET_MODE (SET_SRC (set)),
                           MEM_ADDR_SPACE (SET_SRC (set)),
                           XEXP (SET_SRC (set), 0), 0, MEM, SCRATCH,
                           frequency * 2);
      counted_mem = true;
    }
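
To illustrate issue (1), here is a minimal stand-alone model of the
subtraction in the excerpt above. It is not GCC code: the function names and
the numbers used with them are made up for illustration, and the clamped
variant is only one hypothetical way to keep the result non-negative, not a
fix proposed in this report.

```c
#include <assert.h>

/* Toy model of "mem_cost -= ira_memory_move_cost[...] * frequency".
   Nothing stops the result from dropping below zero.  */
int
adjust_mem_cost (int mem_cost, int load_cost, int frequency)
{
  assert (load_cost >= 0 && frequency >= 0);
  return mem_cost - load_cost * frequency;
}

/* Hypothetical guard: same adjustment, but never below zero, so that
   memory never looks cheaper than "free".  */
int
adjust_mem_cost_clamped (int mem_cost, int load_cost, int frequency)
{
  int adjusted = adjust_mem_cost (mem_cost, load_cost, frequency);
  return adjusted < 0 ? 0 : adjusted;
}
```

With illustrative inputs, adjust_mem_cost (4000, 8, 1000) yields -4000, while
the clamped variant yields 0 for the same inputs and leaves a non-negative
result such as adjust_mem_cost (12000, 4, 1000) == 8000 unchanged.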
