https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78954

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That depends on which CPU you tune for.
E.g. with -mtune=intel or -mtune=core2 etc. you get what you are asking for.
-mtune=generic takes into account that the movd    %edi, %xmm1 insn is very
slow on some AMD CPUs, and because moving through the stack isn't that much
slower on Intel CPUs, it is a compromise between those tunings.
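
One way to see the difference is to compile a small function that has to get
an integer argument into an SSE register. The sketch below is not the
testcase from this report; the names v4si and splat are made up for
illustration, and it relies on the GCC vector extension rather than on
intrinsics.

```c
/* Hypothetical example, not taken from the bug report.
   A 16-byte vector of four ints, using the GCC vector extension.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Broadcast an integer argument into all four lanes.  On x86-64 the
   argument arrives in %edi and has to reach an %xmm register somehow.  */
v4si
splat (int x)
{
  return (v4si) { x, x, x, x };
}
```

Comparing the assembly from gcc -O2 -S -mtune=generic with that from
gcc -O2 -S -mtune=intel should show whether the value goes through a stack
slot or through a direct movd. The exact output may differ between GCC
versions.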
/* X86_TUNE_INTER_UNIT_MOVES_TO_VEC: Enable moves in from integer
   to SSE registers.  If disabled, the moves will be done by storing
   the value to memory and reloading.  */
DEF_TUNE (X86_TUNE_INTER_UNIT_MOVES_TO_VEC, "inter_unit_moves_to_vec",
          ~(m_AMD_MULTIPLE | m_GENERIC))

/* X86_TUNE_INTER_UNIT_MOVES_FROM_VEC: Enable moves in from SSE
   to integer registers.  If disabled, the moves will be done by storing
   the value to memory and reloading.  */
DEF_TUNE (X86_TUNE_INTER_UNIT_MOVES_FROM_VEC, "inter_unit_moves_from_vec",
          ~m_ATHLON_K8)

where
#define m_AMD_MULTIPLE (m_ATHLON_K8 | m_AMDFAM10 | m_BDVER | m_BTVER \
                        | m_ZNVER1)
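
To make the masks concrete: each m_* macro is a bit for one processor (or a
group of them), and a tuning is enabled when the bit of the CPU selected by
-mtune is set in the DEF_TUNE mask. The following is a simplified model, not
the real GCC code. The enum, its values, the BIT macro and tune_enabled are
assumptions made for illustration, and each CPU family is reduced to a
single bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical processor indices; GCC's real enum has many more
   entries and different values.  */
enum processor
{
  PROC_CORE2, PROC_INTEL, PROC_GENERIC, PROC_K8,
  PROC_AMDFAM10, PROC_BDVER, PROC_BTVER, PROC_ZNVER1
};

#define BIT(p) (UINT64_C (1) << (p))

/* One bit per processor.  In GCC some of these cover several CPUs;
   here each family is collapsed into one bit.  */
#define m_GENERIC   BIT (PROC_GENERIC)
#define m_ATHLON_K8 BIT (PROC_K8)
#define m_AMDFAM10  BIT (PROC_AMDFAM10)
#define m_BDVER     BIT (PROC_BDVER)
#define m_BTVER     BIT (PROC_BTVER)
#define m_ZNVER1    BIT (PROC_ZNVER1)
#define m_AMD_MULTIPLE (m_ATHLON_K8 | m_AMDFAM10 | m_BDVER | m_BTVER \
                        | m_ZNVER1)

/* The two masks quoted above: enabled everywhere except the listed CPUs.  */
const uint64_t inter_unit_moves_to_vec = ~(m_AMD_MULTIPLE | m_GENERIC);
const uint64_t inter_unit_moves_from_vec = ~m_ATHLON_K8;

/* A tuning is on when the selected CPU's bit is set in its mask.  */
bool
tune_enabled (uint64_t feature_mask, enum processor tune)
{
  return (feature_mask & BIT (tune)) != 0;
}
```

In this model tune_enabled (inter_unit_moves_to_vec, PROC_GENERIC) is false
and tune_enabled (inter_unit_moves_to_vec, PROC_INTEL) is true, which
matches the behaviour described above. Moves from SSE to integer registers
stay enabled for generic because only m_ATHLON_K8 is excluded there.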
