http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53759
--- Comment #1 from Dag Lem <dag at nimrod dot no> 2012-06-24 12:45:55 UTC ---
Test code as follows:

------------------------
typedef float v4sf __attribute__ ((vector_size (4*4)));
typedef float v2sf __attribute__ ((vector_size (4*2)));

v2sf mem[1];

int main()
{
  v4sf reg = (v4sf){0,0,0,0};
  reg = __builtin_ia32_loadlps(reg, mem);
  return reg[0];
}
------------------------

With -msse, gcc emits the following code:

    xorps   %xmm0, %xmm0
    movlps  mem, %xmm0

However with -mavx, gcc emits:

    vxorps  %xmm0, %xmm0, %xmm0
    vmovlps mem, %xmm1, %xmm1
    vshufps $0xe4, %xmm0, %xmm1, %xmm0

Shouldn't this rather have been something like

    vxorps  %xmm0, %xmm0, %xmm0
    vmovlps mem, %xmm0, %xmm0

???
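
To make the claim concrete, here is a rough scalar model in plain C of the two AVX sequences. It is only a sketch and not part of the original report: the names xmm_t, model_vmovlps, model_vshufps, same_lanes and check_equivalence are made up for illustration, and the lane selection follows my reading of the instruction semantics (low two lanes of vshufps picked from the first source, high two from the second, using 2-bit fields of the immediate). It says nothing about why gcc picks the longer sequence; it only checks that both leave the same value in %xmm0 for this test case.

```c
#include <assert.h>

/* Hypothetical model of a 4 x float register; lane[0] is the lowest element. */
typedef struct { float lane[4]; } xmm_t;

/* Model of "vmovlps mem, src, dst": the low two lanes are loaded from
   memory, the high two lanes are copied from src. */
xmm_t model_vmovlps(const float mem[2], xmm_t src)
{
    xmm_t dst = src;
    dst.lane[0] = mem[0];
    dst.lane[1] = mem[1];
    return dst;
}

/* Model of "vshufps $imm, src2, src1, dst" (AT&T operand order):
   lanes 0-1 are selected from src1 and lanes 2-3 from src2,
   each by one 2-bit field of imm. */
xmm_t model_vshufps(unsigned imm, xmm_t src2, xmm_t src1)
{
    xmm_t dst;
    dst.lane[0] = src1.lane[(imm >> 0) & 3];
    dst.lane[1] = src1.lane[(imm >> 2) & 3];
    dst.lane[2] = src2.lane[(imm >> 4) & 3];
    dst.lane[3] = src2.lane[(imm >> 6) & 3];
    return dst;
}

/* Lane-by-lane comparison of two modelled registers. */
int same_lanes(xmm_t a, xmm_t b)
{
    for (int i = 0; i < 4; i++)
        if (a.lane[i] != b.lane[i])
            return 0;
    return 1;
}

/* Compare the three-instruction -mavx sequence with the proposed
   two-instruction one. The memory contents and the stale value of
   xmm1 are arbitrary placeholders. */
void check_equivalence(void)
{
    const float mem[2] = { 1.5f, 2.5f };
    xmm_t zero  = { { 0.0f, 0.0f, 0.0f, 0.0f } };   /* after vxorps */
    xmm_t stale = { { 9.0f, 9.0f, 9.0f, 9.0f } };   /* whatever xmm1 held */

    /* vmovlps mem, %xmm1, %xmm1 ; vshufps $0xe4, %xmm0, %xmm1, %xmm0 */
    xmm_t xmm1       = model_vmovlps(mem, stale);
    xmm_t three_insn = model_vshufps(0xe4, zero, xmm1);

    /* vmovlps mem, %xmm0, %xmm0 */
    xmm_t two_insn = model_vmovlps(mem, zero);

    assert(same_lanes(three_insn, two_insn));
}
```

Calling check_equivalence() from a small main() should pass without tripping the assertion: with immediate 0xe4 the shuffle keeps lanes 0-1 of %xmm1 (the loaded values) and lanes 2-3 of %xmm0 (zero), so the stale upper half of %xmm1 is discarded and the result matches a single vmovlps into the zeroed register.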