http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41993
--- Comment #4 from Uros Bizjak <ubizjak at gmail dot com> 2012-11-04 16:45:51 UTC ---

I have looked a bit into this problem, since AVX vzeroupper insertion now
depends on MODE_EXIT functionality.

IMO, the patch in Comment #1 is correct for all optimization levels. The
problem is triggered only at -O0 because __builtin_return loads from memory,
and at -O0 gcc emits the offsets to the memory locations using a pseudo:

...
(insn 9 8 11 2 (set (reg:SI 0 r0)
        (mem:SI (reg/f:SI 163) [0 S4 A8])) pr41933.c:3 238 {movsi_ie}
     (nil))

(insn 11 9 12 2 (set (reg:SI 165)
        (mem/f/c:SI (plus:SI (reg/f:SI 162)
                (const_int 60 [0x3c])) [0 rframe+0 S4 A32])) pr41933.c:3 238 {movsi_ie}
     (nil))

(insn 12 11 13 2 (set (reg/f:SI 164)
        (plus:SI (reg:SI 165)
            (const_int 4 [0x4]))) pr41933.c:3 62 {*addsi3_compact}
     (nil))

(insn 13 12 10 2 (set (reg:SI 64 fr0)
        (mem:SI (reg/f:SI 164) [0 S4 A8])) pr41933.c:3 238 {movsi_ie}
     (nil))

(insn 10 13 14 2 (use (reg:SI 0 r0)) pr41933.c:3 -1
     (nil))

(insn 14 10 22 2 (use (reg:SI 64 fr0)) pr41933.c:3 -1
     (nil))

(insn 22 14 0 2 (use (reg/i:SI 0 r0)) pr41933.c:4 -1
     (nil))

This additional pseudo is what breaks the compilation.

At -O2, we enter mode-switching with:

(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(insn 2 4 3 2 (set (reg/v/f:SI 161 [ rframe ])
        (reg:SI 4 r4 [ rframe ])) pr41933.c:2 238 {movsi_ie}
     (expr_list:REG_DEAD (reg:SI 4 r4 [ rframe ])
        (nil)))

(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)

(insn 6 3 8 2 (set (reg:SI 0 r0)
        (mem:SI (reg/v/f:SI 161 [ rframe ]) [0 S4 A8])) pr41933.c:3 238 {movsi_ie}
     (nil))

(insn 8 6 7 2 (set (reg:SI 64 fr0)
        (mem:SI (plus:SI (reg/v/f:SI 161 [ rframe ])
                (const_int 4 [0x4])) [0 S4 A8])) pr41933.c:3 238 {movsi_ie}
     (expr_list:REG_DEAD (reg/v/f:SI 161 [ rframe ])
        (nil)))

(insn 7 8 9 2 (use (reg:SI 0 r0)) pr41933.c:3 -1
     (nil))

(insn 9 7 17 2 (use (reg:SI 64 fr0)) pr41933.c:3 -1
     (expr_list:REG_DEAD (reg:SI 64 fr0)
        (nil)))

(insn 17 9 0 2 (use (reg/i:SI 0 r0)) pr41933.c:4 -1
     (nil))

In this case, we found many return registers (due to __builtin_return), and
consequently lowered nregs to zero. This satisfies the following assert in the
(!nregs) and (nregs != hard_regno_nregs[ret_start][GET_MODE (ret_reg)]) cases.

In the -O0 case, we break out of the discovery loop too early, so we can't
find all return regs. I would argue that we should ignore non-relevant pseudos
with:

--cut here--
Index: mode-switching.c
===================================================================
--- mode-switching.c (revision 193133)
+++ mode-switching.c (working copy)
@@ -324,7 +324,10 @@ create_pre_exit (int n_entities, int *entity_map,
                     else
                       break;
                     if (copy_start >= FIRST_PSEUDO_REGISTER)
-                      break;
+                      {
+                        last_insn = return_copy;
+                        continue;
+                      }
                     copy_num = hard_regno_nregs[copy_start][GET_MODE (copy_reg)];
--cut here--

This is the same way that e.g. UNSPEC_VOLATILE is handled in the preceding
code.
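
To illustrate why skipping pseudos matters, here is a minimal toy model of the backward scan. It is a sketch, not GCC code: the names toy_remaining_return_regs, toy_demo_O0 and TOY_FIRST_PSEUDO are hypothetical, the threshold 100 is only a placeholder for the target-specific FIRST_PSEUDO_REGISTER, and the model looks at set destinations only. The register numbers are taken from the -O0 dump (r0 = 0, fr0 = 64, pseudos 164 and 165).

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for GCC's target-specific FIRST_PSEUDO_REGISTER
   (assumption: any value between 64 and 161 fits the dumps).  */
#define TOY_FIRST_PSEUDO 100

/* Scan the set destinations backwards from the end of the block and
   return how many return registers were NOT found.  With
   stop_on_pseudo != 0 the scan stops at the first pseudo (the
   original "break"); with 0 it skips the pseudo and keeps scanning
   (the proposed "continue").  */
static int
toy_remaining_return_regs (const int *set_dests, size_t n_sets,
                           const int *ret_regs, size_t n_ret,
                           int stop_on_pseudo)
{
  int nregs = (int) n_ret;

  assert (set_dests != NULL && ret_regs != NULL);

  for (size_t i = n_sets; i-- > 0; )
    {
      int dest = set_dests[i];

      if (dest >= TOY_FIRST_PSEUDO)
        {
          if (stop_on_pseudo)
            break;      /* old behaviour: give up here */
          continue;     /* patched behaviour: ignore the pseudo */
        }

      /* A set of a return hard register accounts for that register.  */
      for (size_t j = 0; j < n_ret; j++)
        if (ret_regs[j] == dest)
          {
            nregs--;
            break;
          }
    }
  return nregs;
}

/* Set destinations of insns 9, 11, 12 and 13 from the -O0 dump,
   in insn-chain order.  */
static int
toy_demo_O0 (int stop_on_pseudo)
{
  static const int set_dests[] = { 0, 165, 164, 64 };
  static const int ret_regs[] = { 0, 64 };

  return toy_remaining_return_regs (set_dests, 4, ret_regs, 2,
                                    stop_on_pseudo);
}
```

In this model, toy_demo_O0 (1) leaves one return register unaccounted for, because the scan stops at pseudo 164 before reaching the set of r0, while toy_demo_O0 (0) reaches zero, matching the -O2 situation where nregs is lowered to zero.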