https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110215
--- Comment #6 from Hongyu Wang <wwwhhhyyy333 at gmail dot com> ---
Thanks for the fix. With it, the main loop of the attached test no longer contains any loads.

One issue remains: the loop epilogue still loads from the stack and from the constant pool:

.L9:
        movslq  %edx, %rax
        movss   72(%rsp), %xmm5
        salq    $2, %rax
        leaq    (%rbx,%rax), %rcx
        movaps  %xmm5, %xmm1
        subss   (%rcx), %xmm1
        andps   .LC4(%rip), %xmm1
        movss   %xmm1, (%rcx)
        leal    1(%rdx), %ecx
        addss   %xmm1, %xmm0
        cmpl    %ecx, %r12d
        jle     .L8

The IRA dump shows that the pseudos have no conflicts, yet they still fail to be allocated a register. This issue does not exist on aarch64.