On Wed, Oct 30, 2013 at 11:05:58AM +0100, Jakub Jelinek wrote:
> On Wed, Oct 30, 2013 at 11:00:13AM +0100, Jakub Jelinek wrote:
> > But the above is 16 byte unaligned load. Furthermore, GCC supports
> > -mavx256-split-unaligned-load and can emit 32 byte loads either as an
> > unaligned 32 byte load, or merge of 16 byte unaligned loads. The patch
> > affects only the cases where we were already emitting 16 byte or 32 byte
> > unaligned loads rather than split loads.
>
> With my patch, the differences (in all cases only on f1) for
> -O2 -mavx -ftree-vectorize with the patch is (16 byte unaligned load, not
> split):

My point was that this could mask split loads; thanks for clarifying.