http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53712
             Bug #: 53712
           Summary: SEGV in generated code for _mm_cmpistri with unaligned
                    operand when using -O0
    Classification: Unclassified
           Product: gcc
           Version: 4.6.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: jbem...@zonnet.nl

Created attachment 27647
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27647
Test program triggering the bug - compile with "-O0 -msse4.2"

Compile the attached program with "-msse4.2 -O0".

The failing generated code looks like this (-S):

test:
.LFB575:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movq    %rdi, -40(%rbp)
        movq    %rsi, -48(%rbp)
        movq    -48(%rbp), %rax
        movq    %rax, -24(%rbp)
        movq    -24(%rbp), %rax
        movdqu  (%rax), %xmm0
        movdqa  %xmm0, -16(%rbp)
        movq    -40(%rbp), %rax
SEGV => movdqa  (%rax), %xmm1
        movdqa  -16(%rbp), %xmm0
        pcmpistri       $0, %xmm1, %xmm0
        movl    %ecx, %eax
?    => pcmpistrm       $0, %xmm1, %xmm0
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

This code causes a SEGV at the indicated instruction, because %rax holds
0x4006d8, which is not 16-byte aligned.

Compiling with -O1 (or higher) fixes the problem:

test:
.LFB643:
        .cfi_startproc
        movdqu  (%rsi), %xmm0
        pcmpistri       $0, (%rdi), %xmm0
        movl    %ecx, %eax
        ret
        .cfi_endproc

The root cause is that "pcmpistri" allows an unaligned memory operand, while
gcc generates an aligned load (movdqa) for that operand when using -O0.

A second issue is that gcc generates a redundant "pcmpistrm" instruction at
-O0; it is unclear where this is coming from.

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.6.3/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.6.3 20120306 (Red Hat 4.6.3-2) (GCC)
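
For reference, the alignment claim in the description (the %rax value 0x4006d8 is not a multiple of 16, so movdqa faults on it) can be checked with a few lines of C. This is only a sketch of the arithmetic, not part of the attached test program; the helper names is_aligned_16, misalignment_16 and check_reported_address are hypothetical and were chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Return nonzero if addr is a multiple of 16, which is what movdqa
   requires of a memory operand. */
static int is_aligned_16(uintptr_t addr)
{
    return (addr & 0xF) == 0;
}

/* Number of bytes by which addr lies past the previous 16-byte boundary. */
static unsigned misalignment_16(uintptr_t addr)
{
    return (unsigned)(addr & 0xF);
}

/* Hypothetical self-check using the %rax value quoted in the report. */
static void check_reported_address(void)
{
    const uintptr_t rax = 0x4006d8;

    assert(!is_aligned_16(rax));        /* low nibble is 8, not 0 */
    assert(misalignment_16(rax) == 8);
    assert(is_aligned_16(rax + 8));     /* 0x4006e0 is the next boundary */
}
```

The address is 8 bytes past a 16-byte boundary, which is acceptable for movdqu or for the memory operand of pcmpistri (as in the -O1 code), but not for the movdqa emitted at -O0.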