https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112774
Bug ID: 112774
Summary: Vectorize the loop by inferring nonwrapping information from arrays
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: hliu at amperecomputing dot com
Target Milestone: ---

This case is extracted from another benchmark. It is simpler than the case in PR101450, because the array provides additional boundary information:

int A[1024 * 2];

int foo (unsigned offset, unsigned N)
{
  int sum = 0;

  for (unsigned i = 0; i < N; i++)
    sum += A[i + offset];

  return sum;
}

The GIMPLE before the vectorization pass is:

  <bb 3> [local count: 955630224]:
  # sum_12 = PHI <sum_9(6), 0(5)>
  # i_14 = PHI <i_10(6), 0(5)>
  _1 = offset_8(D) + i_14;
  _2 = A[_1];
  sum_9 = _2 + sum_12;
  i_10 = i_14 + 1;

GCC fails to vectorize it because the chrec "{offset_8, +, 1}_1" may overflow/wrap. I summarized more details in this email:
https://gcc.gnu.org/pipermail/gcc/2023-November/242854.html

Actually, GCC already knows that it won't wrap, by inferring the range from the array (in estimate_numbers_of_iterations -> infer_loop_bounds_from_undefined -> infer_loop_bounds_from_array):

Induction variable (unsigned int) offset_8(D) + 1 * iteration does not wrap in statement _2 = A[_1]; in loop 1.
Statement _2 = A[_1]; is executed at most 2047 (bounded by 2047) + 1 times in loop 1.

We can re-use this information to vectorize this case. I already have a simple patch to achieve this, and will send it out later (after doing more tests).
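
For illustration only, the reasoning above can be sketched in plain C. This is not the patch and not GCC internals: the helpers index_may_wrap, max_in_bounds_iterations and check_example are hypothetical names introduced here. The sketch shows that "offset + i" can wrap in unsigned arithmetic in general, while any in-bounds access to A limits the loop to at most 2047 + 1 iterations, so the index cannot wrap in a program without undefined behavior.

```c
#include <assert.h>
#include <limits.h>

#define A_LEN (1024 * 2)

int A[A_LEN];

/* Same loop as in the report. */
int foo(unsigned offset, unsigned N)
{
    int sum = 0;

    for (unsigned i = 0; i < N; i++)
        sum += A[i + offset];

    return sum;
}

/* Hypothetical helper: returns 1 if offset + i wraps around for some
   i < N in unsigned arithmetic, ignoring the array bound entirely. */
int index_may_wrap(unsigned offset, unsigned N)
{
    if (N == 0)
        return 0;
    return offset > UINT_MAX - (N - 1);
}

/* Hypothetical helper: the largest iteration count for which every
   access A[offset + i] stays inside the array. */
unsigned max_in_bounds_iterations(unsigned offset)
{
    return offset < A_LEN ? A_LEN - offset : 0;
}

/* Small self-check tying the two helpers together. */
void check_example(void)
{
    /* Without the array bound, the index may wrap. */
    assert(index_may_wrap(UINT_MAX, 2));

    /* With it, at most 2047 + 1 iterations are valid (offset == 0),
       and such a trip count can never wrap the index. */
    assert(max_in_bounds_iterations(0) == 2048);
    assert(!index_may_wrap(0, max_in_bounds_iterations(0)));
}
```

Calling check_example() exercises both observations. The bound it checks (2048 iterations when offset is 0) matches the "2047 (bounded by 2047) + 1" figure in the dump quoted above.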