------- Comment #2 from hjl dot tools at gmail dot com  2010-06-09 03:26 -------
(In reply to comment #0)
>
> While running some tests against SSE4.2 instructions, I noticed that the
> __builtin_ia32_pcmpestri128 method generates the correct pcmpestri call
> followed immediately by an extraneous pcmpestrm call.  The second call goes
> away when compiled with any optimization level.
>

-O0 generates unoptimized code.  In your testcase, gcc also generates

        movdqa  .LC0(%rip), %xmm0
        movdqa  %xmm0, -32(%rbp)
        movdqa  .LC1(%rip), %xmm0
        movdqa  %xmm0, -48(%rbp)
        movdqa  -48(%rbp), %xmm1
        movdqa  -32(%rbp), %xmm0

Those aren't necessary either.


-- 

hjl dot tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WORKSFORME


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44472