https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65965
--- Comment #5 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 22 Sep 2015, alalaw01 at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65965
>
> --- Comment #4 from alalaw01 at gcc dot gnu.org ---
> (In reply to Richard Biener from comment #3)
> > Fixed for GCC 6.
>
> Indeed. I note that the same testcase does _not_ SLP/vectorize if I use
> consecutive indices:
>
> void
> test (int*__restrict a, int*__restrict b)
> {
>   a[0] = b[0];
>   a[1] = b[1];
>   a[2] = b[2];
>   a[3] = b[3];
>   a[4] = 0;
>   a[5] = 0;
>   a[6] = 0;
>   a[7] = 0;
> }
>
> loop26a.c:6:13: note: Build SLP failed: different operation in stmt MEM[(int *)a_4(D) + 28B] = 0;
> loop26a.c:6:13: note: original stmt *a_4(D) = _3;
> loop26a.c:6:13: note: === vect_slp_analyze_data_ref_dependences ===
> loop26a.c:6:13: note: === vect_slp_analyze_operations ===
> loop26a.c:6:13: note: not vectorized: bad operation in basic block.
>
> Worth another bug?

The above looks as if SLP is trying a vector size of v8si.  It _should_
work for v4si.  For v8si we indeed can't vectorize this, as we don't
support "partial" loads.  We could vectorize with masked loads, and IIRC
on x86_64 the masked elements can be initialized to 0 or -1, so we can
OR in the constant pieces.

Not sure if that's worth another bug; please double-check your vector
size first.
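
For illustration only, here is a hand-written sketch of what "work for v4si" could mean for this testcase: the eight stores split into two groups of four, one copying b[0..3] and one storing zeros. It uses the GCC vector extension (`vector_size`); the name `test_v4si` is hypothetical, and this is not meant to represent the code the SLP vectorizer actually emits.

```c
#include <string.h>

/* GCC vector extension: four 32-bit ints in one 16-byte vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Scalar reference: the testcase quoted from comment #4.  */
void
test (int *__restrict a, int *__restrict b)
{
  a[0] = b[0];
  a[1] = b[1];
  a[2] = b[2];
  a[3] = b[3];
  a[4] = 0;
  a[5] = 0;
  a[6] = 0;
  a[7] = 0;
}

/* Hypothetical hand-vectorized equivalent using two v4si groups.
   memcpy is used so that no 16-byte alignment of a or b is assumed.  */
void
test_v4si (int *__restrict a, int *__restrict b)
{
  v4si lo;
  const v4si zero = { 0, 0, 0, 0 };

  memcpy (&lo, b, sizeof lo);          /* group 1: load b[0..3]     */
  memcpy (a, &lo, sizeof lo);          /* group 1: store to a[0..3] */
  memcpy (a + 4, &zero, sizeof zero);  /* group 2: a[4..7] = 0      */
}
```

With a single v8si group, by contrast, the first four lanes would come from a load and the last four from a constant, which is the "partial" load case described above.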