bash-3.2$ cat x.c
typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));

__m128 __attribute__((noinline))
iszero (__m128 x)
{
  return x;
}

typedef __m128 __attribute__((aligned(1))) unaligned;

__m128 __attribute__((noinline))
foo (__m128 a1, __m128 a2, __m128 a3, __m128 a4,
     __m128 a5, __m128 a6, __m128 a7, __m128 a8,
     int b1, int b2, int b3, int b4, int b5, int b6, int b7, unaligned y)
{
  return iszero (y);
}

int
main (void)
{
  unaligned x;
  __m128 y, x0 = { 0 };
  x = x0;
  y = foo (x0, x0, x0, x0, x0, x0, x0, x0, 1, 2, 3, 4, 5, 6, 7, x);
  return !__builtin_memcmp (&y, &x0, sizeof (y));
}
bash-3.2$ /export/build/gnu/gcc/build-x86_64-linux/stage1-gcc/xgcc -B/export/build/gnu/gcc/build-x86_64-linux/stage1-gcc/ -O x.c -o x
bash-3.2$ ./x
Segmentation fault
bash-3.2$

The issue here is that V4SFmode may not always be properly aligned.
This is very similar to PR 32000. The difference is that in PR 32000
TDmode is passed as TImode on the stack, whereas here V4SFmode is used.
The same problem exists for all other SSE modes.

-- 
           Summary: x86 backend uses aligned load on unaligned memory
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: hjl dot tools at gmail dot com
GCC target triplet: x86_64-unknown-linux-gnu
BugsThisDependsOn: 32000

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35767