https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54349

--- Comment #11 from Alexander Peslyak <solar-gcc at openwall dot com> ---
Turns out that gcc 4.6.x to 4.8.x generating "movd" instead of "movq" is
actually a deliberate hack, to support binutils older than 2.17 ("movq" support
committed in 2005, released in 2006) and (presumably) non-GNU assemblers:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43215

Also related, on "vmovd":

https://sourceware.org/ml/binutils/2008-05/msg00257.html

Per H.J. Lu, this is because of an error in AMD's spec for x86-64.

More detail on this cursed intrinsic: gcc got the _mm_cvtsi128_si64x() (with
'x') form before it got Intel's _mm_cvtsi128_si64() name (without 'x').  (When
using the inline asm workaround above, this does not matter, as the macro
brings the form without 'x' to older gcc as well.)  Older MSVC and Open64 had
bugs in the intrinsic (without 'x'):

http://www.thesalmons.org/john/random123/releases/1.08/docs/sse_8h_source.html#l00108

This refers to https://bugs.open64.net/show_bug.cgi?id=873 for the Open64 bug,
and I had looked at it before, but unfortunately right now their bug tracker
refuses connections over https and gives a 404 for that path over http.  I have
no detail on what the MSVC bug was.  Apparently, these bugs could result in
incorrect computation at runtime (the comment at the URL above mentions failed
assertions).  Using _mm_extract_epi64(x, 0) is a workaround (SSE4.1+, sometimes
slower).
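
For illustration, here is a minimal sketch of that workaround in C.  The helper
name low64() and the LOW64_AVOID_CVTSI128 macro are made up for this example
(they are not from gcc or Random123), and it assumes an x86-64 target with a
compiler that provides <emmintrin.h>.  The SSE4.1 path is only taken when the
macro is defined and the build enables SSE4.1 (e.g. -msse4.1):

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>          /* SSE2: _mm_cvtsi128_si64, _mm_set_epi64x */
#ifdef __SSE4_1__
#include <smmintrin.h>          /* SSE4.1: _mm_extract_epi64 */
#endif

/*
 * Hypothetical helper: return the low 64 bits of a 128-bit vector.
 * Define LOW64_AVOID_CVTSI128 (a made-up macro) to sidestep
 * _mm_cvtsi128_si64() on compilers where it is suspected to be buggy.
 */
static inline int64_t low64(__m128i x)
{
#if defined(LOW64_AVOID_CVTSI128) && defined(__SSE4_1__)
    /* Workaround path: needs SSE4.1 and may be slower. */
    return (int64_t)_mm_extract_epi64(x, 0);
#else
    /* Default path: the Intel-named intrinsic (without 'x'). */
    return (int64_t)_mm_cvtsi128_si64(x);
#endif
}

/* Small self-check: the low half of {hi, lo} must come back as lo. */
static void low64_selfcheck(void)
{
    __m128i v = _mm_set_epi64x(0x1111111111111111LL, -5LL);
    assert(low64(v) == -5);
}
```

Either path should return the same value; the macro only selects which
instruction sequence the compiler is asked for.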
