http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57748
--- Comment #34 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
Hmm,

this looked like a working example (if it is valid C at all),
but after some thought, I now see that it exposes a data store race:

#include <stdio.h>
#include <string.h>

typedef long long V
  __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));

union x
{
   long long a;
   float b;
} __attribute__((aligned(1))) ;

struct s
{
  union x xx[0];
  V x;
} __attribute__((packed));

void __attribute__((noinline, noclone))
foo(struct s * x)
{
  x->xx[0].a = -1;
  x->xx[0].b = 3.14;
  x->x[1] = 0x123456789ABCDEF;
}

int
main()
{
  struct s ss;
  memset(&ss, 0, sizeof(ss));
  foo (&ss);
  printf("%f %llX\n", ss.xx[0].b, ss.xx[0].a);
  printf("%llX %llX\n", ss.x[0], ss.x[1]);
}

The resulting code is:

foo:
.LFB23:
        .cfi_startproc
        movdqu  (%rdi), %xmm0
        movabsq $-4294967296, %rdx
        movq    .LC1(%rip), %xmm1
        psrldq  $8, %xmm0
        punpcklqdq      %xmm0, %xmm1
        movdqu  %xmm1, (%rdi)
        movdqu  (%rdi), %xmm2
        movdqa  %xmm2, -24(%rsp)
        movq    -24(%rsp), %rax
        andq    %rdx, %rax
        orq     $1078523331, %rax
        movq    %rax, -24(%rsp)
        movdqa  -24(%rsp), %xmm3
        movdqu  %xmm3, (%rdi)
        movdqu  (%rdi), %xmm0
        movhps  .LC2(%rip), %xmm0
        movdqu  %xmm0, (%rdi)
        ret

This shows that all read and write accesses are 16 bytes at a time,
even for the stores to the 8-byte and 4-byte union members, and this
creates a forbidden data store race.

Looks like I shot my own patch down now :-)
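
To illustrate why the 16-byte read-modify-write sequences above are a
problem, here is a minimal single-threaded sketch. It is not part of the
test case, and the names (struct pair, narrow_store, wide_rmw_store,
simulate_store) are made up for illustration. It only simulates another
thread storing to the neighbouring 8 bytes between the wide load and the
wide store; it does not reproduce the GCC code generation itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical 16-byte object: two adjacent 8-byte fields. */
struct pair
{
  long long lo;
  long long hi;
};

/* Store to 'lo' that touches exactly the 8 bytes of 'lo'. */
static void
narrow_store (struct pair *p, long long v)
{
  memcpy (&p->lo, &v, sizeof v);
}

/* Store to 'lo' done as 16-byte load, modify, 16-byte store, similar in
   shape to the movdqu sequences in the comment.  'other_hi' stands in for
   a store to 'hi' by another thread that lands between load and store.  */
static void
wide_rmw_store (struct pair *p, long long v, long long other_hi)
{
  unsigned char tmp[sizeof *p];

  memcpy (tmp, p, sizeof tmp);                            /* 16-byte load */
  p->hi = other_hi;                                       /* simulated other thread */
  memcpy (tmp + offsetof (struct pair, lo), &v, sizeof v); /* modify the copy */
  memcpy (p, tmp, sizeof tmp);                            /* 16-byte store, stale 'hi' */
}

/* Returns the final value of 'hi' after a store to 'lo' and a simulated
   concurrent store of 42 to 'hi'.  wide != 0 selects the 16-byte variant.  */
long long
simulate_store (int wide)
{
  struct pair p = { 0, 0 };
  const long long other_hi = 42;

  if (wide)
    wide_rmw_store (&p, -1, other_hi);
  else
    {
      narrow_store (&p, -1);
      p.hi = other_hi;
    }
  return p.hi;
}
```

With the narrow store, simulate_store (0) returns 42, so the other
writer's value survives. With the wide variant, simulate_store (1)
returns 0, because the 16-byte store writes back the stale copy of 'hi'.
That lost update is the kind of store data race the generated code for
foo can introduce on bytes the source never wrote.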