https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79460

            Bug ID: 79460
           Summary: gcc fails to optimise out a simple additive loop for
                    seemingly arbitrary numbers of iterations
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: drraph at gmail dot com
  Target Milestone: ---

Consider:

float f(float x[]) {
  float p = 1.0;
  for (int i = 0; i < 202; i++)
    p += 1;
  return p;
}

and

float f(float x[]) {
  float p = 1.0;
  for (int i = 0; i < 200; i++)
    p += 1;
  return p;
}

both compiled with gcc 7 (20170210 snapshot) using -Ofast.

In the former case (the 202 case) you get:

f:
        movss   xmm0, DWORD PTR .LC0[rip]
        ret
.LC0:
        .long   1128988672
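
As a sanity check, the value stored at .LC0 can be decoded as a single-precision bit pattern. This is only a sketch: it assumes float is IEEE 754 binary32 (as on x86-64), and the helper name bits_to_float is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 binary32 value. */
static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

/* Hypothetical helper: reinterpret a 32-bit pattern as a float. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);  /* well-defined way to reinterpret the bits */
    return f;
}
```

Under that assumption, bits_to_float(1128988672u) yields 203.0f, i.e. the initial 1.0 plus 202 increments, so the loop has been folded to its final value.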



In the latter case (the 200 case) you get:


f:
        movaps  xmm0, XMMWORD PTR .LC0[rip]
        xor     eax, eax
        movaps  xmm3, XMMWORD PTR .LC1[rip]
        movaps  xmm2, XMMWORD PTR .LC2[rip]
.L2:
        movaps  xmm1, xmm0
        add     eax, 1
        cmp     eax, 50
        addps   xmm0, xmm3
        addps   xmm1, xmm2
        jne     .L2
        shufps  xmm1, xmm1, 255
        movaps  xmm0, xmm1
        ret
.LC0:
        .long   1065353216
        .long   1073741824
        .long   1077936128
        .long   1082130432
.LC1:
        .long   1082130432
        .long   1082130432
        .long   1082130432
        .long   1082130432
.LC2:
        .long   1065353216
        .long   1065353216
        .long   1065353216
        .long   1065353216
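
To make the listing easier to follow, here is a scalar model of what the emitted loop appears to compute. It is a sketch, not compiler output: the arrays stand in for the xmm registers, the names bits_to_float and model_200 are invented for illustration, and float is assumed to be IEEE 754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 binary32 value. */
static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

/* Hypothetical helper: reinterpret a 32-bit pattern as a float. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical scalar model of the loop emitted for the 200 case. */
float model_200(void)
{
    /* .LC0 from the listing: decodes to {1, 2, 3, 4} */
    static const uint32_t lc0[4] = {
        1065353216u, 1073741824u, 1077936128u, 1082130432u
    };
    const float lc1 = bits_to_float(1082130432u);  /* .LC1: 4.0 in every lane */
    const float lc2 = bits_to_float(1065353216u);  /* .LC2: 1.0 in every lane */
    float xmm0[4], xmm1[4];
    int eax = 0;

    for (int lane = 0; lane < 4; lane++) {
        xmm0[lane] = bits_to_float(lc0[lane]);
        xmm1[lane] = 0.0f;
    }

    do {
        for (int lane = 0; lane < 4; lane++) {
            xmm1[lane] = xmm0[lane] + lc2;  /* movaps xmm1, xmm0 ; addps xmm1, xmm2 */
            xmm0[lane] += lc1;              /* addps xmm0, xmm3 */
        }
        eax++;                              /* add eax, 1 */
    } while (eax != 50);                    /* cmp eax, 50 ; jne .L2 */

    return xmm1[3];  /* shufps xmm1, xmm1, 255 broadcasts lane 3 */
}
```

In this model model_200() returns 201.0f, the same value the scalar source produces (1.0 plus 200 increments), so the result is fully determined at compile time even though the loop is still executed at run time.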

There are many other pairs of consecutive even-numbered limits where one is
optimized well and the other is not, for example 194 and 196.
