https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114415

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sayle at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Seems it is the lr_shrinkage pass where things go wrong.

In bar0, all 3 calls the function makes have two 64-byte arguments (plus a
64-byte return value passed by hidden reference). -Oz apparently uses
-mno-accumulate-outgoing-args, allocates some stack and then has (at least
initially) before each of the 3 calls two spots which decrement %rsp by 64 and
after each call one spot which increments %rsp by 128. csa merges the first
%rsp -= 64 with the previous much larger %rsp decrement and in some cases the
%rsp += 128 with the following %rsp -= 64.

The function sets %r12 to be the REG_ARGS_SIZE 64 %rsp level and %rbx to be the
REG_ARGS_SIZE 128 %rsp level, i.e. %rbx = %r12 - 64.

In the asmcons pass, we still have before the call to bar1:
  %rsp -= 64;
  rep_movsi (fills in the second argument);
  %rsp -= 64;
  rep_movsi (fills in the first argument).
But then the lr_shrinkage pass moves both stack decrements after all the
rep_movsi insns, so we then have:

(insn 60 71 66 2 (parallel [
            (set (reg/f:DI 7 sp)
                (plus:DI (reg/f:DI 7 sp)
                    (const_int -64 [0xffffffffffffffc0])))
            (clobber (reg:CC 17 flags))
        ]) "pr114415.c":27:8 272 {*adddi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_ARGS_SIZE (const_int 64 [0x40])
            (nil))))
(insn 66 60 72 2 (parallel [
            (set (reg/f:DI 7 sp)
                (plus:DI (reg/f:DI 129)
                    (const_int -64 [0xffffffffffffffc0])))
            (clobber (reg:CC 17 flags))
        ]) "pr114415.c":27:8 272 {*adddi_1}
     (expr_list:REG_DEAD (reg/f:DI 129)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (expr_list:REG_ARGS_SIZE (const_int 128 [0x80])
                (nil)))))

followed only by the set of %rdi and the call to bar1.
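
For readers without the testcase at hand, the kind of call described above can be sketched in C roughly as follows. This is only an illustration: the names blob and combine are made up, the actual types and functions in pr114415.c are not shown in this comment, and the ABI remarks assume the x86-64 System V calling convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 64-byte aggregate; the real testcase's type is not shown
   in the comment above.  */
struct blob { unsigned char b[64]; };

/* Two by-value 64-byte arguments and a 64-byte return value.  Assuming the
   x86-64 System V ABI, aggregates this large are passed in memory on the
   stack (hence the two %rsp -= 64 adjustments, each followed by a block
   copy) and returned through a hidden pointer that the caller passes in
   %rdi (hence the set of %rdi just before the call).  */
struct blob
combine (struct blob x, struct blob y)
{
  struct blob r;
  for (size_t i = 0; i < sizeof r.b; i++)
    r.b[i] = x.b[i] ^ y.b[i];
  return r;
}
```

With -mno-accumulate-outgoing-args the space for x and y is pushed and popped around each such call rather than being reserved once in the prologue, which is why the %rsp adjustments show up as separate insns.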

Now, as x86-64 has a red zone, this could in theory still work fine (dunno if
the backend is supposed to have some barriers which prevent moving the
rep movsl insns across the stack pointer adjustments; unfortunately the fact
that the insn uses some %rsp based address is not directly visible in the insn
due to CSE). But then comes the peephole2
;; With -Oz, transform mov $imm,reg to the shorter push $imm; pop reg.
which converts the (set (reg:DI %rcx) (const_int 16)) insns to push/pop pairs.
When the stack pointer is higher than it should have been, the push clobbers
the value.
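
The clobbering can be illustrated with a toy model of the stack. This is a sketch under simplifying assumptions, not GCC code: toy_stack, push_pop_imm and second_arg_intact are hypothetical names, the stack is a plain byte array, and the exact position of each push relative to the copies in the real function is assumed rather than taken from the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 512, ARG_BYTES = 64 };

/* Toy stack that grows towards lower indices, like the x86-64 stack.  */
struct toy_stack
{
  unsigned char mem[STACK_BYTES];
  size_t sp;			/* plays the role of %rsp */
};

/* Model of "push $imm; pop reg": the push stores 8 bytes just below sp,
   the pop reads them back and restores sp.  */
static uint64_t
push_pop_imm (struct toy_stack *s, uint64_t imm)
{
  uint64_t reg;
  assert (s->sp >= sizeof imm);
  s->sp -= sizeof imm;
  memcpy (&s->mem[s->sp], &imm, sizeof imm);
  memcpy (&reg, &s->mem[s->sp], sizeof reg);
  s->sp += sizeof imm;
  return reg;
}

/* Copy two 64-byte arguments into fixed slots, loading the dword count 16
   through push/pop before each copy.  If ADJUST_SP_LATE is nonzero, both
   sp decrements happen only after the copies (the order described for
   lr_shrinkage); otherwise each decrement precedes its copy.
   Returns 1 if the slot of the second argument still holds its data.  */
static int
second_arg_intact (int adjust_sp_late)
{
  struct toy_stack s;
  memset (s.mem, 0, sizeof s.mem);
  s.sp = 256;
  size_t slot2 = s.sp - ARG_BYTES;	/* second argument */
  size_t slot1 = s.sp - 2 * ARG_BYTES;	/* first argument */

  if (!adjust_sp_late)
    s.sp -= ARG_BYTES;
  uint64_t n = push_pop_imm (&s, 16);	/* count in 4-byte units */
  memset (&s.mem[slot2], 0xAA, n * 4);	/* stands in for rep movs */

  if (!adjust_sp_late)
    s.sp -= ARG_BYTES;
  n = push_pop_imm (&s, 16);	/* late case: stores into the top of slot2 */
  memset (&s.mem[slot1], 0xBB, n * 4);

  if (adjust_sp_late)
    s.sp -= 2 * ARG_BYTES;

  for (size_t i = 0; i < ARG_BYTES; i++)
    if (s.mem[slot2 + i] != 0xAA)
      return 0;
  return 1;
}
```

In this model second_arg_intact (0) returns 1 and second_arg_intact (1) returns 0: with the late adjustment the second push lands in the last 8 bytes of the already copied second argument, which without the push would merely have been sitting in the red zone.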