https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107250
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|tree-optimization           |target
            Version|unknown                     |13.0
   Last reconfirmed|                            |2022-10-14
             Status|UNCONFIRMED                 |NEW
             Target|                            |x86_64-*-*
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The question is _why_ we generate worse code ... looks like pro/epilogue
generation differs:

@@ -11,15 +11,19 @@
 Attempting shrink-wrapping optimization.
 Block 2 needs prologue due to insn 2:
 (insn 2 4 3 2 (set (reg/v/f:DI 3 bx [orig:84 f ] [84])
-        (reg:DI 5 di [87])) "t.c":4:23 82 {*movdi_internal}
+        (reg:DI 5 di [86])) "t.c":4:23 82 {*movdi_internal}
      (nil))
 After wrapping required blocks, PRO is now 2
 Avoiding non-duplicatable blocks, PRO is now 2
 Bumping back to anticipatable blocks, PRO is now 2
...
     1: NOTE_INSN_DELETED
     4: NOTE_INSN_BASIC_BLOCK 2
-   18: [--sp:DI]=bx:DI
-   19: NOTE_INSN_PROLOGUE_END
+   18: [--sp:DI]=bp:DI
+   19: [--sp:DI]=bx:DI
+   20: {sp:DI=sp:DI-0x8;clobber flags:CC;clobber [scratch];}
+      REG_CFA_ADJUST_CFA sp:DI=sp:DI-0x8
+   21: NOTE_INSN_PROLOGUE_END
     2: bx:DI=di:DI
     3: NOTE_INSN_FUNCTION_BEG
-    6: di:DI=0x8
-    7: ax:DI=call [`malloc'] argc:0
+    6: bp:DI=[bx:DI]
+    7: di:DI=0x8
+    8: ax:DI=call [`malloc'] argc:0
       REG_CALL_DECL `malloc'
       REG_EH_REGION 0
-   10: dx:DI=[bx:DI]
-      REG_EQUIV [bx:DI]
-   11: [ax:DI]=dx:DI
+   11: [ax:DI]=bp:DI
    12: [bx:DI]=ax:DI
-   20: NOTE_INSN_EPILOGUE_BEG
-   21: bx:DI=[sp:DI++]
+   22: NOTE_INSN_EPILOGUE_BEG
+   23: {sp:DI=sp:DI+0x8;clobber flags:CC;clobber [scratch];}
       REG_CFA_ADJUST_CFA sp:DI=sp:DI+0x8
-   22: simple_return
-   25: barrier
+   24: bx:DI=[sp:DI++]
+      REG_CFA_ADJUST_CFA sp:DI=sp:DI+0x8
+   25: bp:DI=[sp:DI++]
+      REG_CFA_ADJUST_CFA sp:DI=sp:DI+0x8
+   26: simple_return
+   29: barrier
    17: NOTE_INSN_DELETED