[Bug target/110227] gcc generates invalid AVX-512 code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110227

--- Comment #1 from Joe Weening ---
Sorry, forgot to include the command line:

    $ gcc -march=cooperlake -O3 -c bug.c
[Bug target/110227] New: gcc generates invalid AVX-512 code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110227

            Bug ID: 110227
           Summary: gcc generates invalid AVX-512 code
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: joseph.weening at gmail dot com
  Target Milestone: ---

gcc version 13.1.0 (GCC)
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-13.1.0/configure --prefix=/usr/local/gcc/13.1.0
--disable-multilib --enable-languages=c,c++,fortran
--with-gmp=/usr/local/gmp/6.2.1 --with-mpc=/usr/local/mpc/1.3.1
--with-mpfr=/usr/local/mpfr/4.2.0 --with-isl=/usr/local/isl/0.24

Compiling the program below fails with an assembler error:

    /tmp/ccAuHFqz.s: Assembler messages:
    /tmp/ccAuHFqz.s:28: Error: unsupported instruction `vpcmpeqd'

The assembly code contains

    vpcmpeqd %xmm16, %xmm16, %xmm16

which perhaps is invalid for xmm registers above 15.

    #include <immintrin.h>
    #include <stdint.h>

    /* Swap two 256-bit vectors loaded from x. */
    __attribute__((noinline))
    static void vswap(int32_t *x)
    {
      __m256i x0 = _mm256_loadu_si256((__m256i *) (&x[0]));
      __m256i x1 = _mm256_loadu_si256((__m256i *) (&x[1]));
      _mm256_storeu_si256((__m256i *) (&x[0]),(x1));
      _mm256_storeu_si256((__m256i *) (&x[1]),(x0));
    }

    void vproc(int32_t *x)
    {
      for (int32_t p=4; p>=1; p>>=1) {
        if (p == 4) {
          /* All-ones in the low four lanes, zero in the high four. */
          __m256i mask = _mm256_set_epi32(0, 0, 0, 0, -1, -1, -1, -1);
          __m256i x0 = _mm256_loadu_si256((__m256i *) (&x[0]));
          x0 = _mm256_xor_si256(x0, mask);
          _mm256_storeu_si256((__m256i *) (&x[0]),(x0));
        }
        vswap(x);
      }
    }
[Bug target/91652] New: -march=skylake-avx512 -mno-fma -O2 generates FMA instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91652

            Bug ID: 91652
           Summary: -march=skylake-avx512 -mno-fma -O2 generates FMA
                    instructions
           Product: gcc
           Version: 9.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: joseph.weening at gmail dot com
  Target Milestone: ---

Created attachment 46807
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46807&action=edit
program to demonstrate generation of FMA instructions

On x86_64, compiling C code with -march=skylake-avx512 -mno-fma -O2 generates
FMA instructions. (-mno-fma works correctly with -march=skylake or
-march=haswell.)

The attached program demonstrates the problem. It prints "d is 0.00" when not
using FMA, and "d is -1.00" when using FMA.

    % gcc -v
    Using built-in specs.
    COLLECT_GCC=gcc
    COLLECT_LTO_WRAPPER=/var/tmp/gcc/libexec/gcc/x86_64-pc-linux-gnu/9.2.0/lto-wrapper
    Target: x86_64-pc-linux-gnu
    Configured with: ../gcc-9.2.0/configure --prefix=/var/tmp/gcc
    --disable-multilib --enable-languages=c --with-gmp=/usr/local/gmp/6.1.2
    --with-mpfr=/usr/local/mpfr/4.0.1 --with-mpc=/usr/local/mpc/1.1.0
    Thread model: posix
    gcc version 9.2.0 (GCC)

Command to demonstrate the bug:

    % gcc -march=skylake-avx512 -mno-fma -O2 fmabug.i && a.out

Compiler errors, warnings: (none)