[Bug tree-optimization/86924] tree-slp-vectorize may create unaligned memory access, causing segmentation fault

2018-08-21 Thread contact at ligh dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86924

--- Comment #3 from Mario Rohkrämer  ---
Created attachment 44567
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44567&action=edit
Zipped temp output encoder.i by lupo...

This is the "-save-temps" output which user lupo... attached in comment 12 to
the Chromium bug report I linked above.

--- Comment #2 from Mario Rohkrämer  ---
Unfortunately, I do not have much experience with running a compilation
manually; I only run the "media-autobuild suite" batch script.

https://github.com/jb-alvarado/media-autobuild_suite/

I would not know for sure where to manipulate these batch/shell files to add
the requested argument, or how to manually run the compilation for one specific
file. It's all automated. But I will ask around and try to get advice.

[Bug tree-optimization/86924] New: tree-slp-vectorize may create unaligned memory access, causing segmentation fault

2018-08-12 Thread contact at ligh dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86924

            Bug ID: 86924
           Summary: tree-slp-vectorize may create unaligned memory access,
                    causing segmentation fault
           Product: gcc
           Version: 8.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: contact at ligh dot de
  Target Milestone: ---

Compiler version: 8.2.0 for Windows 64 bit, as released in MSYS2 / MinGW64
Windows 7 SP1, 64 bit


$ gcc -v
Using built-in specs.
COLLECT_GCC=H:\development\media-autobuild_suite-master\msys64\mingw64\bin\gcc.exe
COLLECT_LTO_WRAPPER=H:/development/media-autobuild_suite-master/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../gcc-8.2.0/configure --prefix=/mingw64
--with-local-prefix=/mingw64/local --build=x86_64-w64-mingw32
--host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32
--with-native-system-header-dir=/mingw64/x86_64-w64-mingw32/include
--libexecdir=/mingw64/lib --enable-bootstrap --with-arch=x86-64
--with-tune=generic --enable-languages=ada,c,lto,c++,objc,obj-c++,fortran
--enable-shared --enable-static --enable-libatomic --enable-threads=posix
--enable-graphite --enable-fully-dynamic-string
--enable-libstdcxx-filesystem-ts=yes --enable-libstdcxx-time=yes
--disable-libstdcxx-pch --disable-libstdcxx-debug --disable-isl-version-check
--enable-lto --enable-libgomp --disable-multilib --enable-checking=release
--disable-rpath --disable-win32-registry --disable-nls --disable-werror
--disable-symvers --with-libiconv --with-system-zlib --with-gmp=/mingw64
--with-mpfr=/mingw64 --with-mpc=/mingw64 --with-isl=/mingw64
--with-pkgversion='Rev1, Built by MSYS2 project'
--with-bugurl=https://sourceforge.net/projects/msys2 --with-gnu-as
--with-gnu-ld
Thread model: posix
gcc version 8.2.0 (Rev1, Built by MSYS2 project)


The AOMedia AV1 video encoder crashes while encoding when it is compiled with
this GCC version; the problem is probably independent of the operating system.
The following report in the Chromium bug tracker analyzes the crash, and
comment 7 in particular traces it down to the disassembly:

https://bugs.chromium.org/p/aomedia/issues/detail?id=2055#c7

Summary by lupo...:

+
Bug appears in the compilation of
https://aomedia.googlesource.com/aom/+/da17065690c185ae678d5db9466cf0a402ca6b6d/av1/encoder/encoder.c#3415
More precisely in the optimized and inlined lshift_bwd_ref_frames(cpi) inside
update_reference_frames

Disassembly listings follow, each preceded by the cmake command used for that
build:
cmake -G "MSYS Makefiles" -DCONFIG_LOWBITDEPTH=1 -DENABLE_DOCS=0
-DENABLE_TESTS=off ../aom
loc_4D5CD2:
mov edx, [rcx+35624Ch]
movdqa  xmm3, xmmword ptr [rcx+478E38h]
mov [rcx+356248h], edx
mov edx, [rcx+356254h]
movaps  xmmword ptr [rcx+478E28h], xmm3
movdqa  xmm3, xmmword ptr [rcx+478E58h]
mov [rcx+35624Ch], edx
movaps  xmmword ptr [rcx+478E38h], xmm3
mov [rcx+356254h], r11d
jmp loc_4D58A0

cmake -G "MSYS Makefiles" -DCONFIG_LOWBITDEPTH=1 -DENABLE_DOCS=0
-DENABLE_TESTS=off -DAOM_EXTRA_C_FLAGS="-fno-tree-slp-vectorize"
-DAOM_EXTRA_CXX_FLAGS="-fno-tree-slp-vectorize" ../aom
loc_4D5DC2:
mov edx, [rcx+35624Ch]
movdqu  xmm3, xmmword ptr [rcx+478E38h]
movdqu  xmm5, xmmword ptr [rcx+478E58h]
mov [rcx+356248h], edx
mov edx, [rcx+356254h]
movups  xmmword ptr [rcx+478E28h], xmm3
mov [rcx+35624Ch], edx
movups  xmmword ptr [rcx+478E38h], xmm5
mov [rcx+356254h], r11d
jmp loc_4D5993

It all reduces to aligned vs unaligned memory access. By manually patching the
faulty executable, changing movdqa to movdqu and movaps to movups, I have been
able to finish an encode without problems.
+
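
The alignment arithmetic behind this summary can be checked with a small C
sketch. It relies on an assumption that is not confirmed anywhere in the
report: that rcx holds a 16-byte-aligned base pointer at this point. The base
value 0x10000 used in the usage note and the helper names are hypothetical;
only the three displacements are copied from the listings. Under that
assumption every 16-byte access lands 8 bytes past a 16-byte boundary, which
movdqa/movaps reject and movdqu/movups accept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* movdqa/movaps require the effective address to be a multiple of 16. */
static int is_aligned16(uintptr_t addr)
{
    return (addr & (uintptr_t)15) == 0;
}

/* Effective address of a memory operand written as [base + disp]. */
static uintptr_t effective_address(uintptr_t base, uintptr_t disp)
{
    return base + disp;
}

/* Displacements of the 16-byte accesses, copied from the listings. */
static const uintptr_t xmm_disp[] = { 0x478E28u, 0x478E38u, 0x478E58u };

/* Counts how many of those accesses are misaligned for a given base.
 * The base is an assumption; the report does not state the value of rcx. */
static size_t count_misaligned(uintptr_t base)
{
    size_t n = 0;
    for (size_t i = 0; i < sizeof xmm_disp / sizeof xmm_disp[0]; ++i)
        if (!is_aligned16(effective_address(base, xmm_disp[i])))
            ++n;
    return n;
}
```

With a 16-byte-aligned base such as the hypothetical 0x10000,
count_misaligned() reports all three accesses as misaligned, because each
displacement ends in 8. They would only be aligned if the base itself were
offset by 8 modulo 16.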


Please excuse me for not providing all the details requested in the "Reporting
Bugs" guide. I believe the linked bug report in the Chromium tracker is
detailed enough to understand the issue.