https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101993

            Bug ID: 101993
           Summary: Potential vectorization opportunity when condition
                    checks array address
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wwwhhhyyy333 at gmail dot com
  Target Milestone: ---

For

float foo(int * restrict a, int * restrict res, int n)
{
  int i;
  for (i = 0; i < 8; i++)
  {
    if (a + i)
      res[i] = *(a + i) * 2;
  }
}

Compile with -O3

Clang generates

foo:                                    # @foo
        testq   %rdi, %rdi
        je      .LBB0_2
        movdqu  (%rdi), %xmm0
        paddd   %xmm0, %xmm0
        movdqu  %xmm0, (%rsi)
.LBB0_2:
        retq

While GCC generates

foo:
        testq   %rdi, %rdi
        je      .L5
        movl    (%rdi), %eax
        leaq    8(%rdi), %rdx
        addl    %eax, %eax
        movl    %eax, (%rsi)
        movl    4(%rdi), %eax
        addl    %eax, %eax
.L3:
        movl    %eax, 4(%rsi)
        movl    (%rdx), %eax
        addl    %eax, %eax
        movl    %eax, 8(%rsi)
        movl    12(%rdi), %eax
        addl    %eax, %eax
        movl    %eax, 12(%rsi)
        ret
.L5:
        movl    4, %eax
        movl    $8, %edx
        addl    %eax, %eax
        jmp     .L3

For a + i to be zero, a would have to be 0 or negative, i.e. an invalid
pointer. Clang seems to make this assumption: it tests a once and then
vectorizes the loop body.
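
To make the idea concrete, here is a rough source-level sketch of the
transformation, not a description of how either compiler implements it. The
names foo_original, foo_hoisted and check_equivalence are made up for
illustration, and the return type and the unused n parameter from the test
case are dropped for brevity. It assumes a either is null or points to at
least 8 ints, so that a + i is never null for a valid a.

```c
#include <assert.h>
#include <stddef.h>

/* Loop as written in the report: the address a + i is tested on every
   iteration. */
static void foo_original(int *restrict a, int *restrict res)
{
  for (int i = 0; i < 8; i++)
    if (a + i)
      res[i] = *(a + i) * 2;
}

/* Hypothetical hand-hoisted form: a single null test on a, followed by a
   branch-free loop that is a straightforward vectorization candidate. */
static void foo_hoisted(int *restrict a, int *restrict res)
{
  if (a == NULL)
    return;
  for (int i = 0; i < 8; i++)
    res[i] = a[i] * 2;
}

/* Check that both forms agree on a valid 8-element input. */
static void check_equivalence(void)
{
  int in[8] = {1, 2, 3, 4, 5, 6, 7, 8};
  int out1[8] = {0};
  int out2[8] = {0};

  foo_original(in, out1);
  foo_hoisted(in, out2);
  for (int i = 0; i < 8; i++)
    assert(out1[i] == out2[i] && out2[i] == 2 * in[i]);
}
```

The two forms only need to agree for valid pointers, since forming a + i
from a null a with i > 0 is already undefined in C.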

Is it possible for GCC to do this optimization?
