http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53712

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2012-06-18
                 CC|                            |uros at gcc dot gnu.org
     Ever Confirmed|0                           |1
            Summary|SEGV in generated code for  |Does not combine unaligned
                   |_mm_cmpistri with unaligned |load with  _mm_cmpistri,
                   |operand when using -O0      |redundant instruction at
                   |                            |-O0
      Known to fail|                            |4.8.0

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-06-18 08:45:51 UTC ---
You have an unaligned load in the _mm_cmpistri arguments:

  * (const __m128i *) (s1)

s1 is not properly aligned.
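
To make the alignment requirement concrete, here is a minimal sketch (not part of the original report). The helper names is_aligned_16 and load16_any are hypothetical and chosen for illustration only. It assumes that dereferencing a (const __m128i *) needs a 16-byte aligned address, whereas _mm_loadu_si128 accepts any address; a memcpy fallback is used when SSE2 is not available.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: nonzero if p lies on a 16-byte boundary, which a
   plain dereference of a (const __m128i *) requires. */
static inline int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: copy 16 bytes from src (any alignment) into out. */
static inline void load16_any(const void *src, unsigned char out[16])
{
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)src);  /* unaligned load */
    _mm_storeu_si128((__m128i *)out, v);                /* unaligned store */
#else
    memcpy(out, src, 16);  /* portable fallback when SSE2 is unavailable */
#endif
}
```

A string literal such as s1 in the testcase carries no 16-byte alignment guarantee, so a check like is_aligned_16(s1) may well fail; load16_any is the safe path in that case.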

At -O0, _mm_cmpistri is a macro, while with optimization enabled it is an
inline function.  I am not sure where the pcmpistrm instruction comes from.
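
The macro versus inline function split can be sketched as follows. This is an illustration of the general pattern only, not the actual header contents; scale_by is a hypothetical name. It assumes the header keys off __OPTIMIZE__, which GCC defines when optimization is enabled. The usual motivation for such a split is that an operand which must be a compile-time constant is only guaranteed to be one inside a function after inlining.

```c
/* Sketch of an intrinsic-style definition that changes form with the
   optimization level. scale_by is a hypothetical stand-in. */
#ifdef __OPTIMIZE__
/* With optimization: a real inline function, so arguments are
   type-checked and evaluated exactly once. */
static inline int scale_by(int x, const int k)
{
    return x * k;
}
#else
/* Without optimization: a macro, so k reaches the expression as the
   literal constant the caller wrote. */
#define scale_by(x, k) ((int)((x) * (k)))
#endif
```

Either form gives the same result for a call such as scale_by(3, 4); only the way the arguments reach the underlying operation differs.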

Using

#include <nmmintrin.h>
#include <stdio.h>

int test( const char* s1, const char * s2 )
{
        __m128i s1chars = _mm_loadu_si128( (const __m128i *) s2 );
        __m128i s2chars = _mm_loadu_si128( (const __m128i *) (s1));
        return _mm_cmpistri( s1chars, s2chars, _SIDD_CMP_EQUAL_ANY );
}

int main( int argc, char * argv[] )
{
        const char* s1 = "1234567890b1234567890";
        const char* s2 = "abcdefghijklmnop";

        int r = test( s1, s2 );
        fprintf( stderr, "\nResult: %d", r );
        r = test( s1, s2+1 ); // misaligned s2
        fprintf( stderr, "\nResult: %d", r );
        return 0;
}

the testcase works as expected, although the "redundant"(?) instruction is
still emitted.  So your source is invalid, but the missed optimization looks
odd (though it only occurs at -O0).  GCC also fails to combine the
unaligned load into the cmpistri instruction.
