https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109519

            Bug ID: 109519
           Summary: aarch64: wrong code with NEON intrinsics on gcc-10 and
                    later
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: spop at gcc dot gnu.org
  Target Milestone: ---

Steps to reproduce:
$ git clone https://github.com/sebpop/bitshuffle.git -b gcc-10-bug
$ cd bitshuffle/reproduce
$ make
$ ./a.out

The expected output, as produced by gcc-7, gcc-9, and clang-15, is:
16384
4
14
16
33
39
45
51
57
67
102
108
120
126
128
134
138
140
[...]

gcc-9 is the last version of gcc I tested that works.

gcc-10 produces the following output:
$ ./a.out
16384
0
0
0
0
39
45
51
57

gcc-11 and gcc-trunk produce the following output:
$ ./a.out
16384
0
0
0
0
0
0
0

The output is also correct when the second-to-last patch is removed from the
git repo (https://github.com/kiyo-masui/bitshuffle/pull/140).
This patch exposes the bug in gcc by using NEON intrinsics instead of scalar
computations to translate move_mask instructions from SSE2 to NEON.
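
For readers unfamiliar with move_mask, here is a minimal sketch of the operation, assuming it refers to SSE2's _mm_movemask_epi8 (collect the top bit of each of 16 bytes into a 16-bit mask). This is not the code from the repo or the pull request: movemask_scalar, movemask_neon, and movemask_selfcheck are hypothetical names, and the NEON variant is only one possible formulation, compiled on aarch64 only.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Scalar reference: bit i of the result is the top bit of bytes[i]. */
static uint16_t movemask_scalar(const uint8_t bytes[16])
{
    uint16_t mask = 0;
    for (int i = 0; i < 16; i++)
        mask |= (uint16_t)((bytes[i] >> 7) << i);
    return mask;
}

#if defined(__aarch64__)
/* One possible NEON formulation (an assumption, not the repo's code). */
static uint16_t movemask_neon(const uint8_t bytes[16])
{
    /* Per-lane left shift: lane k of each 8-byte half moves to bit k. */
    static const int8_t shifts[16] = {0, 1, 2, 3, 4, 5, 6, 7,
                                      0, 1, 2, 3, 4, 5, 6, 7};
    uint8x16_t v    = vld1q_u8(bytes);
    uint8x16_t msb  = vshrq_n_u8(v, 7);                 /* 0 or 1 per lane */
    uint8x16_t bits = vshlq_u8(msb, vld1q_s8(shifts));  /* 1 << k or 0 */
    /* Each half holds distinct powers of two, so the sum fits in 8 bits. */
    uint16_t lo = vaddv_u8(vget_low_u8(bits));
    uint16_t hi = vaddv_u8(vget_high_u8(bits));
    return (uint16_t)(lo | (hi << 8));
}
#endif

/* On aarch64, check that the NEON variant agrees with the scalar reference;
   on other targets this does nothing. */
static void movemask_selfcheck(const uint8_t bytes[16])
{
#if defined(__aarch64__)
    assert(movemask_neon(bytes) == movemask_scalar(bytes));
#else
    (void)bytes;
#endif
}
```

Comparing a vectorized variant against the scalar reference in this way, at different optimization levels, may help narrow down where the wrong code is generated.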
