https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112426
Bug ID: 112426
Summary: sched1 pessimizes codegen on aarch64 by increasing
register pressure
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: acoplan at gcc dot gnu.org
Target Milestone: ---
Consider this example:
long *foo (long *p, long x, long y)
{
  p[0] = x;
  p[1] = y;
  return p + 2;
}
on aarch64 at -O2 we get:
foo:
        mov     x3, x0
        add     x0, x0, 16
        stp     x1, x2, [x3]
        ret
and disabling sched1 (with -fno-schedule-insns) we get:
foo:
        stp     x1, x2, [x0]
        add     x0, x0, 16
        ret
so it looks like sched1 is making things worse. The RTL going into sched1 is:
    8: [r93:DI]=r98:DI
      REG_DEAD r98:DI
    9: [r93:DI+0x8]=r99:DI
      REG_DEAD r99:DI
   10: NOTE_INSN_DELETED
   15: x0:DI=r93:DI+0x10
which allows r93 to be allocated to x0, but sched1 moves the add above the
stores, leading to:
   15: x0:DI=r93:DI+0x10
      REG_DEAD r93:DI
    8: [r93:DI]=r98:DI
      REG_DEAD r98:DI
    9: [r93:DI+0x8]=r99:DI
which requires r93 to be allocated to a register other than x0, since r93 is
now still live after x0 has been set, and leads to an additional move.
Note that the codegen with sched1 disabled also presents an stp writeback
opportunity, whereas this isn't the case with sched1 enabled.