https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101993
Bug ID: 101993
Summary: Potential vectorization opportunity when condition checks array address
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wwwhhhyyy333 at gmail dot com
Target Milestone: ---
For
float foo(int * restrict a, int * restrict res, int n)
{
  int i;
  for (i = 0; i < 8; i++)
    {
      if (a + i)
        res[i] = *(a + i) * 2;
    }
}
compiled with -O3, Clang generates:
foo: # @foo
testq %rdi, %rdi
je .LBB0_2
movdqu (%rdi), %xmm0
paddd %xmm0, %xmm0
movdqu %xmm0, (%rsi)
.LBB0_2:
retq
While GCC generates:
foo:
testq %rdi, %rdi
je .L5
movl (%rdi), %eax
leaq 8(%rdi), %rdx
addl %eax, %eax
movl %eax, (%rsi)
movl 4(%rdi), %eax
addl %eax, %eax
.L3:
movl %eax, 4(%rsi)
movl (%rdx), %eax
addl %eax, %eax
movl %eax, 8(%rsi)
movl 12(%rdi), %eax
addl %eax, %eax
movl %eax, 12(%rsi)
ret
.L5:
movl 4, %eax
movl $8, %edx
addl %eax, %eax
jmp .L3
If a is 0 (or negative), it must be an invalid pointer. Clang seems to rely on
that assumption: it tests a once up front and then vectorizes the loop body.
Would it be possible for GCC to perform the same optimization?
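
For illustration, the transformation asked about can be written by hand roughly as follows. This is only a sketch, not what either compiler does internally. It assumes that a + i can be null only when a itself is null, since pointer arithmetic that leaves the bounds of an object is already undefined. The names foo_original, foo_hoisted and check_equivalent are hypothetical, and the return type is changed from float to void because the test case never returns a value.

```c
#include <assert.h>
#include <stddef.h>

/* The loop as written in the report. Return type changed to void here
   (an assumption for this sketch), since the original returns nothing. */
void foo_original(int *restrict a, int *restrict res, int n)
{
    (void)n; /* unused, as in the report */
    for (int i = 0; i < 8; i++) {
        if (a + i)
            res[i] = *(a + i) * 2;
    }
}

/* Hypothetical hand-transformed version: test a once, then run an
   unconditional loop with no per-element address check. */
void foo_hoisted(int *restrict a, int *restrict res, int n)
{
    (void)n;
    if (a == NULL)
        return;
    for (int i = 0; i < 8; i++)
        res[i] = a[i] * 2;
}

/* Small self-check: both versions agree for a valid, non-null input. */
void check_equivalent(void)
{
    int a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    int r1[8] = {0}, r2[8] = {0};

    foo_original(a, r1, 8);
    foo_hoisted(a, r2, 8);
    for (int i = 0; i < 8; i++)
        assert(r1[i] == r2[i]);
}
```

With the check hoisted out, the loop body has no control flow, which is the shape Clang's output above corresponds to. Whether GCC can derive this form on its own from the original loop is the question of this report.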