https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96512
            Bug ID: 96512
           Summary: wrong code generated with avx512 intrinsics in some cases
           Product: gcc
           Version: 8.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nathanael.schaeffer at gmail dot com
  Target Milestone: ---

Created attachment 49013
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49013&action=edit
bug demonstrator

With gcc 8.3.0 (and possibly some other versions), I found a nasty bug that messes up the results of my calculations.
I worked hard to produce a simple reproducer, which is attached.

With gcc 8.3.0, compiling with
  gcc -O1 -g -D_GCC_VEC_=1 -march=skylake-avx512 bug_gcc_avx512.c
and running ./a.out leads to a wrong result, displayed like so:
  ERROR :: 0.874347 == 0

Examining the generated assembly, I suspect this instruction to be wrong:
  0x0000000000401186 <+100>: vbroadcastsd zmm0,QWORD PTR [r8*8+0x1]
because r8 is aligned, and the 0x1 offset does not seem right...

When compiling with -march=skylake, the problem goes away.
When using "alloca" instead of a variable length array, the problem goes away.
When inserting a printf, the problem goes away.

See the attached source file: the lines commented with "NO BUG" make the bug go away.

This has been a nightmare to spot, as it does not happen on all compiler versions.
I hope somebody can reproduce it and fix it...

Note that the assembly generated on godbolt does not seem to have this issue...
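
To make the suspicion about the [r8*8+0x1] operand concrete, here is a minimal sketch in C of the address arithmetic only. It is not the attached demonstrator and does not reproduce the bug; the helper names effective_address and is_aligned_8 are hypothetical. It just shows that an address of the form index*8 + 1 can never be a multiple of 8, whatever r8 holds, which is unexpected for a load of a double.

```c
#include <stdint.h>

/* Address computed by an x86 memory operand of the form
   [index*scale + disp] with no base register.
   Hypothetical helper, for illustration only. */
static uint64_t effective_address(uint64_t index, uint64_t scale, uint64_t disp)
{
    return index * scale + disp;
}

/* Nonzero if addr is a multiple of 8, the natural alignment of a double. */
static int is_aligned_8(uint64_t addr)
{
    return (addr & 7u) == 0;
}
```

For example, effective_address(0x1000, 8, 1) is 0x8001, which is_aligned_8 rejects, whereas a displacement of 0 would give 0x8000.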