Hello! Before vzeroupper is emitted ahead of a function call, the compiler checks whether there are live call-saved SSE registers at the insertion point. This check is intended to handle the Windows ABI, so that we don't clear the upper parts of XMM registers that are live across the call.
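
As a rough illustration of that check, here is a minimal standalone sketch; it is a model, not GCC code. The 16-bit register masks and the names WIN64_CALL_SAVED, SYSV_CALL_SAVED, old_emits_vzeroupper and new_emits_vzeroupper are hypothetical and exist only for this example. It assumes the Windows x64 convention, where xmm6..xmm15 are call-saved, and the System V x86-64 convention, where no SSE register is call-saved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: 16 SSE registers, one bit per register.  */
#define NUM_SSE_REGS 16

/* Bit i set means xmm<i> is call-saved (preserved by the callee).
   Assumption: Windows x64 preserves xmm6..xmm15, System V x86-64
   preserves none.  */
static const uint16_t WIN64_CALL_SAVED = 0xFFC0; /* bits 6..15 */
static const uint16_t SYSV_CALL_SAVED = 0x0000;

/* Model of the check described above: vzeroupper insertion is
   cancelled as soon as any live register is also call-saved.  */
static bool
old_emits_vzeroupper (uint16_t regs_live, uint16_t call_saved)
{
  for (int i = 0; i < NUM_SSE_REGS; i++)
    if (((regs_live >> i) & 1) && ((call_saved >> i) & 1))
      return false;
  return true;
}

/* Model of the behavior without the check: liveness is ignored and
   vzeroupper is always emitted.  */
static bool
new_emits_vzeroupper (uint16_t regs_live, uint16_t call_saved)
{
  (void) regs_live;
  (void) call_saved;
  return true;
}
```

In this model the two functions differ only when a call-saved register is live, which can happen only with the Windows mask; with the System V mask the check never cancels anything.
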
However, the called function saves only the lower 128-bit part of the XMM register, so it seems that wider modes have to be saved and restored by the caller anyway. If this is the case, we don't have to cancel vzeroupper insertion before the call. The attached patch removes this cancellation, since all other ABIs clobber all XMM registers.

2018-11-21  Uros Bizjak  <ubiz...@gmail.com>

	* config/i386/i386.c (ix86_avx_emit_vzeroupper): Remove.
	(ix86_emit_mode_set) <case AVX_U128>: Emit vzeroupper here.

The patch is untested, since I have no Windows target here. Daniel, can you please review the above assumptions and test the patch on a Windows target?

Uros.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c18c60a1d191..598165103716 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19167,37 +19167,11 @@ emit_i387_cw_initialization (int mode)
   emit_move_insn (new_mode, reg);
 }
 
-/* Emit vzeroupper.  */
-
-void
-ix86_avx_emit_vzeroupper (HARD_REG_SET regs_live)
-{
-  int i;
-
-  /* Cancel automatic vzeroupper insertion if there are
-     live call-saved SSE registers at the insertion point.  */
-
-  for (i = FIRST_SSE_REG; i <= LAST_SSE_REG; i++)
-    if (TEST_HARD_REG_BIT (regs_live, i) && !call_used_regs[i])
-      return;
-
-  if (TARGET_64BIT)
-    for (i = FIRST_REX_SSE_REG; i <= LAST_REX_SSE_REG; i++)
-      if (TEST_HARD_REG_BIT (regs_live, i) && !call_used_regs[i])
-	return;
-
-  emit_insn (gen_avx_vzeroupper ());
-}
-
 /* Generate one or more insns to set ENTITY to MODE.  */
 
-/* Generate one or more insns to set ENTITY to MODE.  HARD_REG_LIVE
-   is the set of hard registers live at the point where the insn(s)
-   are to be inserted.  */
-
 static void
 ix86_emit_mode_set (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
-		    HARD_REG_SET regs_live)
+		    HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
   switch (entity)
     {
@@ -19207,7 +19181,7 @@ ix86_emit_mode_set (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
       break;
     case AVX_U128:
       if (mode == AVX_U128_CLEAN)
-	ix86_avx_emit_vzeroupper (regs_live);
+	emit_insn (gen_avx_vzeroupper ());
       break;
     case I387_TRUNC:
     case I387_FLOOR: