https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98563

            Bug ID: 98563
           Summary: regression: vectorization fails while it worked on gcc
                    9 and earlier
           Product: gcc
           Version: 10.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nathanael.schaeffer at gmail dot com
  Target Milestone: ---

I have found what seems to be a regression.

The following code is compiled to 128-bit (xmm) code instead of 256-bit AVX
(ymm) code when built with -fopenmp-simd, whereas it is vectorized at full
width without that option!

Here is the resulting code with different options, using gcc 10.1:
-O3 -fopenmp-simd  => xmm
-O3                => ymm
-O3 -fopenmp-simd -fno-signed-zeros  => ymm

gcc 9 and earlier always vectorize to full width (ymm).

#include <complex>
typedef std::complex<double> cplx;

void test(cplx* __restrict__ a, const cplx* b, double c, int N)
{
    #pragma omp simd
    for (int i=0; i<8*N; i++) {
        a[i] = c*(a[i]-b[i]);
    }
}

See the result on godbolt: https://godbolt.org/z/9ThqKE
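
For anyone who wants to check the loop's results as well as the generated
assembly, here is a minimal sketch of a self-contained harness. It assumes
that multiplying a std::complex<double> by a real double scales the real and
imaginary parts independently. test_ref and agrees are hypothetical helpers
and are not part of the original test case. The inputs are ordinary finite
values, so this does not probe the signed-zero handling that the
-fno-signed-zeros result hints at.

```cpp
#include <cassert>
#include <complex>
#include <vector>

typedef std::complex<double> cplx;

// Same loop as in the report.
void test(cplx* __restrict__ a, const cplx* b, double c, int N)
{
    #pragma omp simd
    for (int i=0; i<8*N; i++) {
        a[i] = c*(a[i]-b[i]);
    }
}

// Hypothetical reference: the same computation written on the real and
// imaginary parts separately (assumes real*complex is component-wise).
void test_ref(cplx* a, const cplx* b, double c, int N)
{
    for (int i=0; i<8*N; i++) {
        double re = c*(a[i].real() - b[i].real());
        double im = c*(a[i].imag() - b[i].imag());
        a[i] = cplx(re, im);
    }
}

// Hypothetical helper: runs both versions on identical inputs and reports
// whether every element matches exactly.
bool agrees(int N)
{
    assert(N > 0);
    std::vector<cplx> a1(8*N), a2(8*N), b(8*N);
    for (int i=0; i<8*N; i++) {
        a1[i] = a2[i] = cplx(i + 0.5, -0.25*i);
        b[i] = cplx(2.0*i, 1.0 - i);
    }
    test(a1.data(), b.data(), 3.0, N);
    test_ref(a2.data(), b.data(), 3.0, N);
    for (int i=0; i<8*N; i++) {
        if (a1[i] != a2[i]) return false;
    }
    return true;
}
```

Building this file with each of the three option sets listed above and
calling agrees() should show whether the xmm and ymm variants produce the same
values for these inputs.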

I also noticed that no avx512 code is generated for this loop. Is this
intended? Is there an option to enable avx512 vectorization?
