https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100258

            Bug ID: 100258
           Summary: constant store pulled out of the loop causes an extra
                    memory load
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-linux-gnu

Take:
void f(float *x, int t)
{
  for(int i = 0; i < t; i++)
    x[i*3] = 1.0;
}

Right now, at -O2, this produces:
        testl   %esi, %esi
        jle     .L5
        leal    -1(%rsi), %eax
        leaq    (%rax,%rax,2), %rax
        vmovss  .LC0(%rip), %xmm0
        leaq    12(%rdi,%rax,4), %rax
        .p2align 4,,10
        .p2align 3
.L3:
        vmovss  %xmm0, (%rdi)
        addq    $12, %rdi
        cmpq    %rax, %rdi
        jne     .L3
.L5:
        ret

----- CUT ----
If we don't have a loop, e.g. just a store to *x, we get:
        movl    $0x3f800000, (%rdi)
Which is more efficient, since the constant is encoded as an immediate in the
store itself. We just need a loop around that store, without the load of .LC0.
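
For illustration only, here is a source-level sketch of the store being asked for. It assumes float is a 32-bit IEEE-754 single, so that 1.0f has the bit pattern 0x3f800000 seen in the movl above. The name f_bits is hypothetical and not part of the report, and nothing here claims what code GCC emits for it; it only shows that storing the integer bit pattern has the same effect as the float store in f.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE-754 single-precision value. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "float must be 32 bits for this sketch");

/* Loop from the report: stores the float constant 1.0 every third element. */
void f(float *x, int t)
{
  for (int i = 0; i < t; i++)
    x[i*3] = 1.0f;
}

/* Hypothetical variant: stores the bit pattern of 1.0f (0x3f800000) as an
   integer. memcpy is used so the store does not violate aliasing rules. */
void f_bits(float *x, int t)
{
  const uint32_t one_bits = 0x3f800000u;
  for (int i = 0; i < t; i++)
    memcpy(&x[i*3], &one_bits, sizeof one_bits);
}
```

Both functions leave the same bytes in memory, so comparing the output of f and f_bits is a quick way to confirm that the immediate form is a valid replacement for the load of .LC0.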
