http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49133
Summary: [4.6 Regression] modification of aliased __m128d miscompiles
Product: gcc
Version: 4.6.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: kr...@kde.org

Compile the following testcase with "g++ -msse2 -O2":

#include <xmmintrin.h>

typedef double double_a __attribute__((__may_alias__));

struct V {
    __m128d data;
};

int main()
{
    V a;
    __m128d b;
    b = _mm_set_pd(1., 0.);
    a.data = _mm_set_pd(1., 0.);
    a.data = _mm_add_pd(a.data, _mm_and_pd(_mm_cmpeq_pd(a.data, _mm_set1_pd(0.)), _mm_set1_pd(2.)));
    reinterpret_cast<double_a *>(&a.data)[1] += 1.;
    b = _mm_add_pd(b, _mm_and_pd(_mm_cmpeq_pd(b, _mm_set1_pd(0.)), _mm_set1_pd(1.)));
    b = _mm_add_pd(b, _mm_and_pd(_mm_cmpeq_pd(b, _mm_set1_pd(1.)), _mm_set1_pd(1.)));
    if (_mm_movemask_pd(_mm_cmpeq_pd(a.data, b)) != 0x3) {
        abort();
    }
    return 0;
}

GCC 4.6.0 and 4.6.1 calculate the correct values for a.data[0] and a.data[1] but fail to combine the results correctly, i.e. the resulting shufpd $0x1 is wrong. GCC 4.5.x uses unpcklpd, which gives the correct result, but emits unnecessary stores to the stack.
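
For readers unfamiliar with the two instructions, here is a rough scalar model of why unpcklpd and shufpd $0x1 can give different answers when recombining the two elements. The instruction semantics follow the documented behaviour of SHUFPD and UNPCKLPD. The Vec2 type and the register contents (vec, scalar) are hypothetical: they are an assumption made for illustration and are not taken from the assembly GCC actually generated. Before the scalar update a.data is {2, 1}, and the expected result after it is {2, 2}.

```cpp
// Scalar model of the two x86 instructions named in the report.
// Vec2 stands in for an __m128d: lo is bits 63:0, hi is bits 127:64.
struct Vec2 {
    double lo;
    double hi;
};

// unpcklpd dest, src: result = { dest.lo, src.lo }
constexpr Vec2 unpcklpd(Vec2 dest, Vec2 src) {
    return Vec2{dest.lo, src.lo};
}

// shufpd dest, src, imm8:
//   bit 0 of imm8 picks dest.hi (1) or dest.lo (0) for the low lane,
//   bit 1 of imm8 picks src.hi (1) or src.lo (0) for the high lane.
constexpr Vec2 shufpd(Vec2 dest, Vec2 src, unsigned imm8) {
    return Vec2{(imm8 & 1u) ? dest.hi : dest.lo,
                (imm8 & 2u) ? src.hi : src.lo};
}

// Hypothetical register contents (an assumption, not read from the
// generated assembly): `vec` is a.data before the scalar update, {2, 1},
// and `scalar` carries the updated element 1 (1 + 1 = 2) in its low lane.
constexpr Vec2 vec{2.0, 1.0};
constexpr Vec2 scalar{2.0, 0.0};

// unpcklpd keeps element 0 and inserts the new element 1: {2, 2}.
static_assert(unpcklpd(vec, scalar).lo == 2.0, "low lane kept");
static_assert(unpcklpd(vec, scalar).hi == 2.0, "high lane replaced");

// shufpd $0x1 moves the stale element 1 into the low lane: {1, 2}.
static_assert(shufpd(vec, scalar, 0x1).lo == 1.0, "stale value in low lane");
static_assert(shufpd(vec, scalar, 0x1).hi == 2.0, "high lane replaced");
```

The checks are static_asserts, so compiling the snippet (for example with "g++ -std=c++11 -c") is enough to evaluate them. Under the assumed register contents, a shufpd immediate of 0x0 would select the same lanes as unpcklpd.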