https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82108
Bug ID: 82108
Summary: [7.2 Regression] Wrong vectorized code generated for x86_64
Product: gcc
Version: 7.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ell_se at yahoo dot com
Target Milestone: ---

The following snippet generates wrong vectorized code with GCC 7.2 on x86_64
with -O3:

  void
  downscale_2 (const float* src,
               int          src_n,
               float*       dst)
  {
    int i;

    for (i = 0; i < src_n; i += 2)
      {
        const float* a = src;
        const float* b = src + 4;

        dst[0] = (a[0] + b[0]) / 2;
        dst[1] = (a[1] + b[1]) / 2;
        dst[2] = (a[2] + b[2]) / 2;
        dst[3] = (a[3] + b[3]) / 2;

        src += 2 * 4;
        dst += 4;
      }
  }

The assembly for the vectorized version of the loop is:

  .L5:
          addl    $1, %ecx
          movups  (%rdi,%rax), %xmm0
          movups  16(%rdi,%rax,2), %xmm2
          addps   %xmm2, %xmm0
          mulps   %xmm1, %xmm0
          movups  %xmm0, (%rdx,%rax)
          addq    $16, %rax
          cmpl    %r8d, %ecx
          jb      .L5

Notice the missing ,2 on the first movups: src advances by 32 bytes per
iteration while %rax advances by only 16, so the load of a[] should use
(%rdi,%rax,2), just as the load of b[] does.

It can be tested with:

  #include <stdio.h>

  int
  main ()
  {
    const float in[4 * 4] = { 1, 2, 3, 4, 5, 6, 7, 8,
                              1, 2, 3, 4, 5, 6, 7, 8 };
    float out[2 * 4];

    downscale_2 (in, 4, out);

    /* correct: 3, 4, 5, 6 */
    printf ("%g, %g, %g, %g\n", out[0], out[1], out[2], out[3]);

    /* incorrect: 5, 6, 7, 8; should also be 3, 4, 5, 6 */
    printf ("%g, %g, %g, %g\n", out[4], out[5], out[6], out[7]);
  }

This doesn't seem to happen with 7.1 or 6.3.

For a chuckle, this affects a recent build of GIMP, and has resulted in this
beauty:
https://bug787222.bugzilla-attachments.gnome.org/attachment.cgi?id=359042 :)

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/7.2.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared --enable-threads=posix --enable-libmpx --with-system-zlib --with-isl --enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-plugin --enable-install-libiberty --with-linker-hash-style=gnu --enable-gnu-indirect-function --disable-multilib --disable-werror --enable-checking=release --enable-default-pie --enable-default-ssp
Thread model: posix
gcc version 7.2.0 (GCC)