https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89575
Bug ID: 89575
Summary: LRA for msp430 - Max. number of generated reload insns - frame pointer subreg simplification
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jozef.l at mittosystems dot com
Target Milestone: ---

Created attachment 45881
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45881&action=edit
testcase

When enabling LRA for msp430, libgcc fails to build, specifically _muldi3.o.

> gcc -S tester.i -O2
> during RTL pass: reload
> ../../../../libgcc/libgcc2.c: In function '__muldi3':
> ../../../../libgcc/libgcc2.c:558:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90)
>
>   558 | }
>       | ^
> 0xa2ab20 lra_constraints(bool)
>         ../../gcc/lra-constraints.c:4875
> 0xa12b84 lra(_IO_FILE*)
>         ../../gcc/lra.c:2461
> 0x9c68f1 do_reload
>         ../../gcc/ira.c:5516
> 0x9c68f1 execute
>         ../../gcc/ira.c:5700

The cycling reload occurs because IRA assigns hard register R4 (also
FRAME_POINTER_REGNUM, but not fixed for this use) to a pseudo reg, but when LRA
goes to simplify a subreg of that pseudo, the simplification is disallowed.
Specifically, in simplify_subreg_regno (rtlanal.c):

>   /* We shouldn't simplify stack-related registers. */
>   if ((!reload_completed || frame_pointer_needed)
>       && xregno == FRAME_POINTER_REGNUM)
>     return -1;

This is in an output reload, so a new set of mov insns is generated to load the
value back into the original, problematic pseudo assigned to R4. Once again
simplify_subreg_regno is called to simplify the subreg of that pseudo, it is
disallowed again, and the cycle continues.

From the IRA dump:
> Disposition:
>   0:r28 l0 8 2:r30 l0 4 1:r31 l0 4
> ...
> (insn 2 6 3 2 (set (subreg:HI (reg/v:DI 30 [ arg1 ]) 0)
>         (reg:HI 12 R12 [ arg1 ])) "tester.c":16:1 12 {movhi}
>      (expr_list:REG_DEAD (reg:HI 12 R12 [ arg1 ])
>         (nil)))

From the reload dump:
> Creating newreg=37 from oldreg=30, assigning class NO_REGS to subreg reg r37
>     2: r37:DI#0=R12:HI
> ...
> Inserting subreg reload after:
>    42: r30:DI#0=r37:DI#0
> ...
> Creating newreg=38 from oldreg=30, assigning class NO_REGS to subreg reg r38
>    42: r38:DI#0=r37:DI#0
> ...
> Inserting subreg reload after:
>    52: r30:DI#0=r38:DI#0

And so on.

Is it OK to allow simplification of a subreg of FRAME_POINTER_REGNUM when
lra_in_progress is true? After all, constraints on the allocation of hard regs
shouldn't get more restrictive as compilation progresses?
e.g.

diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 3873b4098b0..9700928ff4e 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -3971,7 +3971,7 @@ simplify_subreg_regno (unsigned int xregno, machine_mode xmode,
     return -1;

   /* We shouldn't simplify stack-related registers. */
-  if ((!reload_completed || frame_pointer_needed)
+  if ((!(reload_completed || lra_in_progress) || frame_pointer_needed)
       && xregno == FRAME_POINTER_REGNUM)
     return -1;

This fixes the cycling reload for insn 2, as the frame pointer is not needed,
but there are further separate issues building the test case.

I've attached a reduced test case, and the IRA and reload dumps.

> gcc -v
> Target: msp430-elf
> Configured with: ../configure --target=msp430-elf --disable-nls --enable-languages=c,c++
> Thread model: single
> gcc version 9.0.1 20190301 (experimental) (GCC)
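
To make the difference between the two conditions easier to see, here is a minimal standalone C model of just the frame pointer check. It is a sketch, not GCC source: struct pass_state, MODEL_FRAME_POINTER_REGNUM, fp_subreg_blocked_current, fp_subreg_blocked_proposed and check_model are hypothetical names introduced for illustration, and the three flags are plain booleans standing in for the GCC globals of the same name. A result of true corresponds to simplify_subreg_regno returning -1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for FRAME_POINTER_REGNUM on msp430 (R4). */
#define MODEL_FRAME_POINTER_REGNUM 4u

/* Snapshot of the compiler state flags that the check reads. */
struct pass_state {
  bool reload_completed;
  bool lra_in_progress;
  bool frame_pointer_needed;
};

/* Condition currently in rtlanal.c: true means "refuse to simplify". */
static bool fp_subreg_blocked_current(const struct pass_state *s,
                                      unsigned int xregno)
{
  return (!s->reload_completed || s->frame_pointer_needed)
         && xregno == MODEL_FRAME_POINTER_REGNUM;
}

/* Condition from the proposed patch: LRA is treated like completed reload. */
static bool fp_subreg_blocked_proposed(const struct pass_state *s,
                                       unsigned int xregno)
{
  return (!(s->reload_completed || s->lra_in_progress)
          || s->frame_pointer_needed)
         && xregno == MODEL_FRAME_POINTER_REGNUM;
}

/* Checks the case from this report: LRA running, frame pointer not needed. */
void check_model(void)
{
  const struct pass_state during_lra = { false, true, false };

  /* Current condition refuses, which is what feeds the reload cycle. */
  assert(fp_subreg_blocked_current(&during_lra, MODEL_FRAME_POINTER_REGNUM));
  /* Proposed condition allows the simplification. */
  assert(!fp_subreg_blocked_proposed(&during_lra, MODEL_FRAME_POINTER_REGNUM));
}
```

Calling check_model() from a test harness should return without tripping an assertion. In this model the two conditions differ only while lra_in_progress is set and frame_pointer_needed is clear; when the frame pointer is needed, both still refuse the simplification.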