https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #3 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to Richard Biener from comment #2)
> It might be
>
> t.c:29:22: missed: Data access with gaps requires scalar epilogue loop
>
> required when vectorizing the load groups.  We end up with
>
> t.c:29:22: note: === vect_analyze_data_ref_accesses ===
> t.c:29:22: note: Detected single element interleaving array1[0][_8] step 4
> t.c:29:22: note: Detected single element interleaving array1[1][_8] step 4
> t.c:29:22: note: Detected single element interleaving array1[2][_8] step 4
> t.c:29:22: note: Detected single element interleaving array1[3][_8] step 4
> t.c:29:22: note: Detected single element interleaving array1[0][_1] step 4
> t.c:29:22: note: Detected single element interleaving array1[1][_1] step 4
> t.c:29:22: note: Detected single element interleaving array1[2][_1] step 4
> t.c:29:22: note: Detected single element interleaving array1[3][_1] step 4
> t.c:29:22: missed: not consecutive access array2[_4][_8] = _69;
> t.c:29:22: note: using strided accesses
> t.c:29:22: missed: not consecutive access array2[_4][_1] = _67;
> t.c:29:22: note: using strided accesses
>
> it's better to use signed 'm' (or uint64_t I guess), then we get
>
> t.c:29:22: note: === vect_analyze_data_ref_accesses ===
> t.c:29:22: note: Detected interleaving load array1[0][_1] and array1[0][_8]
> t.c:29:22: note: Detected interleaving load array1[1][_1] and array1[1][_8]
> t.c:29:22: note: Detected interleaving load array1[2][_1] and array1[2][_8]
> t.c:29:22: note: Detected interleaving load array1[3][_1] and array1[3][_8]
> t.c:29:22: note: Detected interleaving store array2[_4][_1] and array2[_4][_8]
> t.c:29:22: note: Detected interleaving load of size 2
> t.c:29:22: note: _2 = array1[0][_1];
> t.c:29:22: note: _9 = array1[0][_8];
> t.c:29:22: note: Detected interleaving load of size 2
> t.c:29:22: note: _18 = array1[1][_1];
> t.c:29:22: note: _23 = array1[1][_8];
> t.c:29:22: note: Detected interleaving load of size 2
> t.c:29:22: note: _32 = array1[2][_1];
> t.c:29:22: note: _37 = array1[2][_8];
> t.c:29:22: note: Detected interleaving load of size 2
> t.c:29:22: note: _46 = array1[3][_1];
> t.c:29:22: note: _51 = array1[3][_8];
> t.c:29:22: note: Detected interleaving store of size 2
> t.c:29:22: note: array2[_4][_1] = _67;
> t.c:29:22: note: array2[_4][_8] = _69;
>
> and no gap peeling required.
>
> I guess you say GCC 13 is bad as well?

Sorry, I hadn't checked GCC 13 before. After investigating, I can confirm that
GCC 13.2.0 doesn't have the regression:

https://godbolt.org/z/ndaWToaxP

With GCC 13.2.0, "requires scalar epilogue loop" never appears in the dump,
whereas with GCC 14 it appears 72 times.
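
For readers without the attached test case, here is a minimal sketch of the kind of difference discussed above. It is NOT the reproducer from this PR: the array shapes, the `2 * m` indexing, and all names (`kernel_unsigned`, `kernel_signed`, `kernels_agree`) are assumptions made for illustration. The idea, as I read it, is that with a 32-bit unsigned index `2 * m + 1` is computed modulo 2^32, so the compiler may be unable to prove that the two accesses are neighbours, while signed overflow is undefined and lets it assume they are.

```c
#include <assert.h>
#include <string.h>

#define ROWS 4
#define COLS 64 /* even, so the pairs (2*m, 2*m + 1) stay in bounds */

int array1[ROWS][COLS];
int out_unsigned[ROWS][COLS];
int out_signed[ROWS][COLS];

/* Hypothetical kernel with a 32-bit unsigned index: 2*m and 2*m + 1 are
   evaluated with wrap-around semantics. */
void kernel_unsigned(unsigned n)
{
    for (unsigned j = 0; j < ROWS; j++)
        for (unsigned m = 0; m < n; m++) {
            out_unsigned[j][2 * m] = array1[0][2 * m] + array1[1][2 * m]
                                   + array1[2][2 * m] + array1[3][2 * m];
            out_unsigned[j][2 * m + 1] = array1[0][2 * m + 1] + array1[1][2 * m + 1]
                                       + array1[2][2 * m + 1] + array1[3][2 * m + 1];
        }
}

/* Same computation with a signed index, as suggested in comment #2. */
void kernel_signed(int n)
{
    for (int j = 0; j < ROWS; j++)
        for (int m = 0; m < n; m++) {
            out_signed[j][2 * m] = array1[0][2 * m] + array1[1][2 * m]
                                 + array1[2][2 * m] + array1[3][2 * m];
            out_signed[j][2 * m + 1] = array1[0][2 * m + 1] + array1[1][2 * m + 1]
                                     + array1[2][2 * m + 1] + array1[3][2 * m + 1];
        }
}

/* Fill the input, run both kernels, and return 1 if the results match. */
int kernels_agree(void)
{
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            array1[r][c] = r * COLS + c;

    memset(out_unsigned, 0, sizeof out_unsigned);
    memset(out_signed, 0, sizeof out_signed);

    kernel_unsigned(COLS / 2);
    kernel_signed(COLS / 2);

    assert(sizeof out_unsigned == sizeof out_signed);
    return memcmp(out_unsigned, out_signed, sizeof out_signed) == 0;
}
```

Both kernels compute the same values; only the index type differs. Compiling each with something like `gcc -O3 -fopt-info-vec-missed` and comparing the messages is one way to look for the difference, though I have not verified that this sketch triggers the same "requires scalar epilogue loop" diagnostics as the real test case.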