http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53712

             Bug #: 53712
           Summary: SEGV in generated code for _mm_cmpistri with unaligned
                    operand when using -O0
    Classification: Unclassified
           Product: gcc
           Version: 4.6.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: jbem...@zonnet.nl


Created attachment 27647
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27647
Test program triggering the bug - compile with "-O0 -msse4.2"

Compile the attached program with "-msse4.2 -O0" and run it.

The failing generated code looks like this (-S):
test:
.LFB575:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movq    %rdi, -40(%rbp)
        movq    %rsi, -48(%rbp)
        movq    -48(%rbp), %rax
        movq    %rax, -24(%rbp)
        movq    -24(%rbp), %rax
        movdqu  (%rax), %xmm0
        movdqa  %xmm0, -16(%rbp)
        movq    -40(%rbp), %rax
SEGV => movdqa  (%rax), %xmm1
        movdqa  -16(%rbp), %xmm0
        pcmpistri       $0, %xmm1, %xmm0
        movl    %ecx, %eax
 ? =>   pcmpistrm       $0, %xmm1, %xmm0
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

This code causes a SEGV at the indicated instruction because %rax holds the
value 0x4006d8, which is not 16-byte aligned, and "movdqa" requires a 16-byte
aligned memory operand.
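
For reference, the alignment claim can be checked with a little arithmetic.
The helper below is only an illustrative sketch; the name misalignment16 is
made up here and is not part of the attached test program.

```c
#include <stdint.h>

/* Hypothetical helper: returns how many bytes addr lies past the previous
   16-byte boundary. A result of 0 means the address is 16-byte aligned. */
unsigned misalignment16(uintptr_t addr)
{
    return (unsigned)(addr & 15u);
}
```

For the address above, misalignment16(0x4006d8) gives 8, so the operand sits
8 bytes past a 16-byte boundary.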

Compiling with -O1 (or higher) fixes the problem:
test:
.LFB643:
        .cfi_startproc
        movdqu  (%rsi), %xmm0
        pcmpistri       $0, (%rdi), %xmm0
        movl    %ecx, %eax
        ret
        .cfi_endproc

The root cause is that "pcmpistri" allows an unaligned memory operand, while
at -O0 gcc first loads that operand into a register with an aligned load
("movdqa").
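
A possible source-level way to sidestep the aligned load is to read both
operands with explicit unaligned loads. This is only a sketch: the attachment
is not reproduced in this report, so the function name, parameter names and
operand order below are assumptions, and the target attribute is used just to
keep the snippet compilable without -msse4.2 on compilers that support it.

```c
#include <nmmintrin.h>  /* SSE4.2 intrinsics, including _mm_cmpistri */

/* Hypothetical stand-in for the attached test(); not the reporter's code.
   Both pointers may be unaligned, but each must have 16 readable bytes. */
__attribute__((target("sse4.2")))
int first_match_index(const char *text, const char *set)
{
    /* _mm_loadu_si128 is the unaligned load (movdqu), so neither operand
       needs to sit on a 16-byte boundary. */
    __m128i s = _mm_loadu_si128((const __m128i *)set);
    __m128i t = _mm_loadu_si128((const __m128i *)text);

    /* Mode 0, the same immediate as in the assembly above. */
    return _mm_cmpistri(s, t, 0);
}
```

With loads written this way, the generated code should use "movdqu" for both
operands regardless of the optimization level.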

A second issue is that gcc generates a redundant "pcmpistrm" instruction at
-O0; it is unclear where this comes from.

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.6.3/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin
--enable-java-awt=gtk --disable-dssi
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.6.3 20120306 (Red Hat 4.6.3-2) (GCC)
