gcc version 4.3.2 (Ubuntu 4.3.2-1ubuntu12)

This code causes an alignment fault because __builtin_ia32_loadups will generate a movaps instruction instead of a movups instruction.

    #include <xmmintrin.h>

    void sse_func(__m128, __m128);
    float *get_arg(int);

    int main(void)
    {
        sse_func(__builtin_ia32_loadups(get_arg(0)),
                 __builtin_ia32_loadups(get_arg(0)));
    }

Disassembly:

    pushl   $0
    call    get_arg
    movaps  (%eax), %xmm0
    movups  %xmm0, 16(%esp)
    movl    $0, (%esp)
    call    get_arg
    movaps  16(%esp), %xmm1
    movups  (%eax), %xmm0
    call    sse_func
    addl    $40, %esp

get_arg() returns a non-aligned pointer, but it is loaded with movaps, which expects a 16-byte aligned pointer. movups should be used instead. Also, the next instruction stores the SSE register to the stack, which is already guaranteed to be 16-byte aligned, but it uses movups, which should only be used for pointers that are not 16-byte aligned.

Command-line:

    gcc -O1 -march=i386 -msse -fomit-frame-pointer -S test.c -o -

Bug also occurs with -O2 and -O3, but not with -O0, and with -march=i486 and -march=i586, but not with -march=i686. The bug disappears when -fomit-frame-pointer is removed. The bug also disappears when get_arg(int) is changed to get_arg(void):

    #include <xmmintrin.h>

    void sse_func(__m128, __m128);
    float *get_arg(void);

    int main(void)
    {
        sse_func(__builtin_ia32_loadups(get_arg()),
                 __builtin_ia32_loadups(get_arg()));
    }

--
           Summary: In some cases __builtin_ia32_loadups generates a movaps instruction
           Product: gcc
           Version: 4.3.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: amanieu at gmail dot com
 GCC build triplet: i486-pc-linux-gnu
  GCC host triplet: i486-pc-linux-gnu
GCC target triplet: i486-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39442
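
The test case only declares get_arg(), so linking and running it to observe the fault needs a definition. The following is a minimal sketch of one possible stand-in, not part of the original report: the names storage and is_aligned_16 are hypothetical, and the aligned attribute is the GCC extension. It returns a pointer that is 4-byte aligned but deliberately off a 16-byte boundary, which is the situation where a movaps load would be expected to fault.

```c
#include <stdint.h>

/* Hypothetical backing storage (not in the original report). It is forced
   onto a 16-byte boundary so that an offset of one float is guaranteed
   NOT to be 16-byte aligned. */
static float storage[8] __attribute__((aligned(16)));

/* Hypothetical helper: nonzero if p sits on a 16-byte boundary, which is
   what movaps requires of a memory operand. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Assumed stand-in for get_arg() from the test case. The argument is
   ignored; the result points at storage[1], leaving four valid floats
   to load from a misaligned address. */
float *get_arg(int unused)
{
    (void)unused;
    return &storage[1];
}
```

Placed in a separate translation unit, a definition like this keeps the compiler from seeing through the call when building test.c. is_aligned_16() can be used to confirm that the returned pointer really is misaligned before attributing a crash to the movaps load.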