https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770

            Bug ID: 97770
           Summary: Missing vectorization for vpopcnt
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---

cat test.c
---
void
foo(int* __restrict dest, int* src, int n)
{
  for (int i = 0; i != 8; i++)
    dest[i] = __builtin_popcount (src[i]);
}
---

gcc -O3 -march=icelake-server -S -fopt-info-all test.c

Inlined 0 calls, eliminated 0 functions

test.c:4:3: missed: couldn't vectorize loop
test.c:5:15: missed: not vectorized: relevant stmt not supported: _7 = __builtin_popcount (_5);
test.c:2:1: note: vectorized 0 loops in function.
test.c:4:3: note: ***** Analysis failed with vector mode VOID
test.c:4:3: note: ***** Analysis failed with vector mode V8SI
test.c:4:3: note: ***** Skipping vector mode V32QI, which would repeat the analysis for V8SI
test.c:6:1: note: ***** Analysis failed with vector mode VOID


ICC and Clang both vectorize this loop:

foo(int*, int*, int):
        vpopcntd  ymm0, YMMWORD PTR [rsi]                       #5.15
        vmovdqu   YMMWORD PTR [rdi], ymm0                       #5.5
        vzeroupper                                              #6.1
        ret
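
For reference, here is a minimal sketch of the same operation written by hand with intrinsics, which is roughly what the vectorizer would be expected to produce for this loop. The name foo_by_hand is hypothetical and not part of the testcase. The sketch assumes the target provides AVX512VPOPCNTDQ and AVX512VL (as -march=icelake-server does) and falls back to the scalar loop otherwise.

```c
#if defined(__AVX512VPOPCNTDQ__) && defined(__AVX512VL__)
#include <immintrin.h>
#endif

/* Hypothetical hand-written equivalent of foo for its fixed trip
   count of 8; not taken from the report.  */
void
foo_by_hand (int *__restrict dest, const int *src)
{
#if defined(__AVX512VPOPCNTDQ__) && defined(__AVX512VL__)
  /* One unaligned 256-bit load, one vpopcntd, one unaligned store.  */
  __m256i v = _mm256_loadu_si256 ((const __m256i *) src);
  _mm256_storeu_si256 ((__m256i *) dest, _mm256_popcnt_epi32 (v));
#else
  /* Scalar fallback so the sketch still builds on targets without
     vpopcntd.  */
  for (int i = 0; i != 8; i++)
    dest[i] = __builtin_popcount ((unsigned int) src[i]);
#endif
}
```

When built with -O3 -march=icelake-server, the intrinsic path should reduce to a vpopcntd on a ymm register followed by a store, much like the ICC output above.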
