https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112736

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The vectorizer sees

  <bb 3> [local count: 214748368]:
  # a.3_5 = PHI <_2(5), 2(2)>
  # ivtmp_9 = PHI <ivtmp_3(5), 3(2)>
  _14 = b[a.3_5];
  c[a.3_5][0] = _14;
  c[a.3_5][1] = _14;
  c[a.3_5][2] = _14;
  c[a.3_5][3] = _14;
  _2 = a.3_5 + -1;
  ivtmp_3 = ivtmp_9 - 1;
  if (ivtmp_3 != 0)
    goto <bb 5>; [89.00%]
  else
    goto <bb 4>; [11.00%]

  <bb 5> [local count: 191126048]:
  goto <bb 3>; [100.00%]

and uses SLP, this is likely caused by my patch to allow non-grouped-loads
there.

t.c:7:17: note:   node 0x4637048 (max_nunits=4, refcnt=1) vector(4) int
t.c:7:17: note:   op template: _14 = b[a.3_5];
t.c:7:17: note:         stmt 0 _14 = b[a.3_5];
t.c:7:17: note:         stmt 1 _14 = b[a.3_5];
t.c:7:17: note:         stmt 2 _14 = b[a.3_5];
t.c:7:17: note:         stmt 3 _14 = b[a.3_5];
t.c:7:17: note:         load permutation { 0 0 0 0 }

I think we need to force strided-SLP for them.
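
For reference, the GIMPLE quoted in the comment corresponds roughly to the scalar loop sketched below. This is a hypothetical reconstruction from the dump only, not the t.c attached to the PR. The names a, b and c and the int element type are taken from the dump. The array extents, the function names and the check helper are assumptions made for illustration. Each iteration performs a single load of b[a] and stores it to all four elements of row c[a], which is why the SLP load node lists the same statement in all four lanes with load permutation { 0 0 0 0 }.

```c
#include <assert.h>

/* Assumed extent: the dump only shows indices 2, 1, 0 being used. */
#define N 3

int a;        /* loop index, a.3_5 in the dump */
int b[N];     /* source of the single load _14 = b[a.3_5] */
int c[N][4];  /* destination of the four stores per iteration */

/* Scalar form of the loop in <bb 3>: a takes the values 2, 1, 0
   (three iterations, matching ivtmp starting at 3), and each
   iteration stores the one loaded value b[a] to c[a][0..3]. */
void splat_rows(void)
{
  for (a = 2; a >= 0; a--)
    for (int j = 0; j < 4; j++)
      c[a][j] = b[a];
}

/* Hypothetical helper: asserts that every row of c holds four copies
   of the matching element of b, which is what correct code, scalar
   or vectorized, must produce. */
void check_rows(void)
{
  for (int i = 0; i < N; i++)
    for (int j = 0; j < 4; j++)
      assert(c[i][j] == b[i]);
}
```

Calling splat_rows() followed by check_rows() after filling b gives a simple way to compare a vectorized build against an unvectorized one.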