https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96512

            Bug ID: 96512
           Summary: wrong code generated with avx512 intrinsics in some
                    cases
           Product: gcc
           Version: 8.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nathanael.schaeffer at gmail dot com
  Target Milestone: ---

Created attachment 49013
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49013&action=edit
bug demonstrator

With gcc 8.3.0 (and possibly some other versions), I found a nasty bug
that corrupts the results of my calculations.
I worked hard to produce a simple reproducer, which is attached.

With gcc 8.3.0, compiling with
   gcc -O1 -g -D_GCC_VEC_=1 -march=skylake-avx512 bug_gcc_avx512.c

running ./a.out leads to a wrong result, displayed like so:
   ERROR :: 0.874347 == 0

Examining the generated assembly, I suspect this instruction to be wrong:
0x0000000000401186 <+100>:   vbroadcastsd zmm0,QWORD PTR [r8*8+0x1]
because r8 is aligned, and the 0x1 offset does not seem right...
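
The suspicion about the 0x1 offset can be checked with simple arithmetic. The
operand [r8*8+0x1] has no base register, so the effective address is
r8*8 + 1, which leaves remainder 1 modulo 8 for any value of r8 and is
therefore never 8-byte aligned. A minimal sketch in C follows; the helper
names effective_address and is_aligned are hypothetical and are not taken
from the attached demonstrator.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of an x86-64 memory operand of the form
 * [index*scale + disp] with no base register.
 * Hypothetical helper, for illustration only. */
static uint64_t effective_address(uint64_t index, uint64_t scale, int64_t disp)
{
    return index * scale + (uint64_t)disp;
}

/* Returns 1 if addr is a multiple of alignment, 0 otherwise.
 * alignment must be a non-zero power of two. */
static int is_aligned(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}
```

For example, is_aligned(effective_address(r8, 8, 1), 8) is 0 for every r8,
while the same call with a displacement of 0 gives 1. This only shows that
the address cannot be 8-byte aligned; it does not by itself establish which
address the compiler should have generated.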

When compiling with -march=skylake the problem goes away.
When using "alloca" instead of a variable length array, the problem goes away.
When inserting a printf, the problem goes away.

See the attached source file: the lines commented with "NO BUG" make the bug
go away.

This has been a nightmare to spot, as it does not happen on all compiler
versions. I hope somebody can reproduce it and fix it...
Note that the assembly generated on godbolt does not seem to have this issue...
