https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67206
            Bug ID: 67206
           Summary: Redundant spills in simple copy loop for 32-bit x86 target
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ysrumyan at gmail dot com
  Target Milestone: ---

For the attached simple test case we see strange spills to the stack. Namely, for the loop

    for (i=0; i<n; i++)
      out[j * n + i] = in[j * n + i];

the following code is generated:

.L9:
        movdqa  (%eax), %xmm0
        addl    $1, %edx
        movdqu  %xmm0, (%ecx)
        addl    $16, %eax
        movdqa  %xmm0, 32(%esp)    ?? Redundant
        addl    $16, %ecx
        movl    %eax, 32(%esp)     ?? Redundant
        cmpl    52(%esp), %edx
        movl    %ecx, 48(%esp)     ?? Redundant
        jb      .L9

Another issue is that loop distribution does not recognize such a loop as a memmove loop.

Note that this also reproduces with the 4.9 compiler.
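
The attached test case is not reproduced in this report. As a rough sketch only, a function containing the quoted inner loop might look like the following. The function name copy_rows, the unsigned element and index types, and the enclosing loop over j are all assumptions made for illustration; only the inner loop is taken from the report.

```c
#include <stddef.h>

/* Hypothetical reproducer sketch, not the reporter's attachment.
   Copies an n x n block row by row. The name copy_rows, the unsigned
   types, and the outer loop over j are assumed; the inner loop is the
   one quoted in the report. */
void copy_rows(unsigned *out, const unsigned *in, unsigned n)
{
    unsigned i, j;

    for (j = 0; j < n; j++)
        for (i = 0; i < n; i++)
            out[j * n + i] = in[j * n + i];
}
```

Compiling something like this for a 32-bit x86 target with vectorization enabled (for example with -O3 -m32 -msse2, which is an assumed option set, not one stated in the report) and inspecting the assembly from -S would be one way to check whether the spills inside the vectorized loop appear.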