https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100258
            Bug ID: 100258
           Summary: constant store pulled out of the loop causes an extra memory load
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-linux-gnu

Take:
void f(float *x, int t)
{
  for(int i = 0; i < t; i++)
    x[i*3] = 1.0;
}

Right now, at -O2, this produces:
        testl   %esi, %esi
        jle     .L5
        leal    -1(%rsi), %eax
        leaq    (%rax,%rax,2), %rax
        vmovss  .LC0(%rip), %xmm0
        leaq    12(%rdi,%rax,4), %rax
        .p2align 4,,10
        .p2align 3
.L3:
        vmovss  %xmm0, (%rdi)
        addq    $12, %rdi
        cmpq    %rax, %rdi
        jne     .L3
.L5:
        ret

----- CUT ----

If we don't have a loop, e.g. just a store to *x, we get:
        movl    $0x3f800000, (%rdi)

This is more efficient: the constant is encoded as an immediate in the store itself. We just need a loop around that store, without the load of .LC0.
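
For reference, the immediate 0x3f800000 in the non-loop case is the IEEE-754 single-precision bit pattern of 1.0f. The sketch below is not part of the report: f_bits and float_bits are hypothetical names, and it assumes a 32-bit IEEE-754 float. It writes the same values as f() by storing that integer pattern directly. Whether GCC then keeps the constant as an immediate inside the loop is not something the report states; check with -O2 -S.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE-754 binary32 value. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Return the raw bit pattern of a float. */
static uint32_t float_bits(float v)
{
    uint32_t bits;
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* Hypothetical variant of f() from the report: stores the integer bit
   pattern of 1.0f into every third element instead of a float constant. */
static void f_bits(float *x, int t)
{
    const uint32_t one = 0x3f800000u; /* bit pattern of 1.0f */
    for (int i = 0; i < t; i++)
        memcpy(&x[i * 3], &one, sizeof one); /* same bytes as x[i*3] = 1.0f */
}
```

Using memcpy rather than a pointer cast keeps the store free of strict-aliasing problems; compilers typically lower a fixed 4-byte memcpy to a plain store.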