https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
After setting remove_partial_avx_dependency to true for the register
alternative, we get:

        vxorps  %xmm3, %xmm3, %xmm3
        vmovsd  .LC16(%rip), %xmm6
        vmovsd  .LC14(%rip), %xmm5
        vcvtss2sd       %xmm0, %xmm3, %xmm0
        vmulsd  .LC12(%rip), %xmm0, %xmm1
        vroundsd        $9, %xmm1, %xmm3, %xmm2
        vcvttsd2siq     %xmm2, %rax
        vsubsd  %xmm2, %xmm1, %xmm1
        vfmadd231sd     .LC13(%rip), %xmm0, %xmm1
        vfmadd213sd     .LC17(%rip), %xmm1, %xmm6
        vmovsd  .LC18(%rip), %xmm0
        vfmadd213sd     .LC19(%rip), %xmm1, %xmm0
        vfmadd213sd     .LC15(%rip), %xmm1, %xmm5
        movq    %rax, %rdx
        sarq    $4, %rax
        vmulsd  %xmm1, %xmm1, %xmm4
        addq    $1023, %rax
        andl    $15, %edx
        salq    $52, %rax
        vmovq   %rax, %xmm7
        vmulsd  tb.1(,%rdx,8), %xmm7, %xmm2
        vfmadd132sd     %xmm4, %xmm6, %xmm0
        vmulsd  %xmm2, %xmm1, %xmm1
        vfmadd132sd     %xmm4, %xmm5, %xmm0
        vfmadd132sd     %xmm1, %xmm2, %xmm0
        vcvtsd2ss       %xmm0, %xmm3, %xmm0

Also

> Also there's a potentially related issue that GCC copies the initial xmm0
> value to eax via stack in the beginning of the function:

This issue disappears (it should still be latent after the fix).