https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101993
Bug ID: 101993
Summary: Potential vectorization opportunity when condition checks array address
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wwwhhhyyy333 at gmail dot com
Target Milestone: ---

For

float foo(int * restrict a, int * restrict res, int n)
{
    int i;
    for (i = 0; i < 8; i++)
    {
        if (a + i)
            res[i] = *(a + i) * 2;
    }
}

compiled with -O3, Clang generates

foo:                                    # @foo
        testq   %rdi, %rdi
        je      .LBB0_2
        movdqu  (%rdi), %xmm0
        paddd   %xmm0, %xmm0
        movdqu  %xmm0, (%rsi)
.LBB0_2:
        retq

while GCC generates

foo:
        testq   %rdi, %rdi
        je      .L5
        movl    (%rdi), %eax
        leaq    8(%rdi), %rdx
        addl    %eax, %eax
        movl    %eax, (%rsi)
        movl    4(%rdi), %eax
        addl    %eax, %eax
.L3:
        movl    %eax, 4(%rsi)
        movl    (%rdx), %eax
        addl    %eax, %eax
        movl    %eax, 8(%rsi)
        movl    12(%rdi), %eax
        addl    %eax, %eax
        movl    %eax, 12(%rsi)
        ret
.L5:
        movl    4, %eax
        movl    $8, %edx
        addl    %eax, %eax
        jmp     .L3

If a is 0 or negative, it should be an invalid pointer. It seems Clang makes
this assumption: it tests a once up front and then vectorizes the loop body.
Is it possible for GCC to do the same optimization?
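
For illustration, the transformation Clang appears to apply can be written out by hand at the source level. This is only a sketch, not what either compiler does internally. It assumes that a + i can be non-null only when a itself is non-null, and that a non-null a points to at least N ints. The names scale_checked, scale_hoisted, and N are hypothetical, and the return type is void here because the test case never returns a value.

```c
#include <stddef.h>

enum { N = 8 };  /* loop bound taken from the test case above */

/* Form from the report: the address a + i is tested on every iteration. */
void scale_checked(int *restrict a, int *restrict res)
{
    for (int i = 0; i < N; i++) {
        if (a + i)
            res[i] = *(a + i) * 2;
    }
}

/* Hand-hoisted form (assumption: a + i is non-null exactly when a is).
 * The single null test guards a branch-free loop, which is a
 * straightforward candidate for vectorization. */
void scale_hoisted(int *restrict a, int *restrict res)
{
    if (a == NULL)
        return;
    for (int i = 0; i < N; i++)
        res[i] = a[i] * 2;
}
```

For any valid non-null a, both functions store the same values into res. They can differ only when a is null, where the pointer arithmetic in the checked form is already outside what the C standard defines.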