https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112774

            Bug ID: 112774
           Summary: Vectorize the loop by inferring nonwrapping
                    information from arrays
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hliu at amperecomputing dot com
  Target Milestone: ---

This case is extracted from another benchmark and is simpler than the case in
PR101450, as it has additional boundary information from the array:

    int A[1024 * 2];

    int foo (unsigned offset, unsigned N) 
    {
      int sum = 0;

      for (unsigned i = 0; i < N; i++)
        sum += A[i + offset];

      return sum;
    }

The Gimple before the vectorization pass is:

    <bb 3> [local count: 955630224]:
    # sum_12 = PHI <sum_9(6), 0(5)>
    # i_14 = PHI <i_10(6), 0(5)>
    _1 = offset_8(D) + i_14;
    _2 = A[_1];
    sum_9 = _2 + sum_12;
    i_10 = i_14 + 1;

GCC fails to vectorize it because the chrec "{offset_8, +, 1}_1" may
overflow/wrap. I summarized more details in this email:
https://gcc.gnu.org/pipermail/gcc/2023-November/242854.html
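To make the wrapping concern concrete, here is a minimal sketch (not taken from GCC) that compares the index as the loop computes it in unsigned arithmetic with the same sum computed in a wider type. The helper names are hypothetical, and the sketch assumes that `unsigned` is narrower than 64 bits.

```c
#include <limits.h>
#include <stdint.h>

/* Index as the loop computes it: an unsigned addition, which wraps
   modulo UINT_MAX + 1.  */
unsigned index_as_computed (unsigned offset, unsigned i)
{
  return offset + i;
}

/* The same sum in a wider type, where it cannot wrap.  This is the
   value an affine address computation would need.
   Assumption: unsigned is narrower than 64 bits.  */
uint64_t index_if_affine (unsigned offset, unsigned i)
{
  return (uint64_t) offset + (uint64_t) i;
}

/* Nonzero when the two disagree, i.e. when offset + i wrapped.  */
int index_wrapped (unsigned offset, unsigned i)
{
  return (uint64_t) index_as_computed (offset, i)
         != index_if_affine (offset, i);
}
```

For example, index_as_computed (UINT_MAX, 1) is 0 while the widened sum is UINT_MAX + 1, so without extra range information the compiler cannot treat "offset + i" as a simple linear address.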

However, GCC already knows that it does not wrap, having inferred the range from the array
(in estimate_numbers_of_iterations -> infer_loop_bounds_from_undefined ->
infer_loop_bounds_from_array):

    Induction variable (unsigned int) offset_8(D) + 1 * iteration does not wrap
    in statement _2 = A[_1];
     in loop 1.
    Statement _2 = A[_1];
     is executed at most 2047 (bounded by 2047) + 1 times in loop 1.
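The reasoning behind this dump can be sketched as follows: A has 2048 elements, so every executed access A[_1] must use an index in [0, 2047], and an index that starts in that range and steps by 1 leaves the array long before it could wrap. The helpers below are hypothetical illustrations of that argument, not GCC's implementation, and again assume `unsigned` is narrower than 64 bits.

```c
#include <limits.h>
#include <stdint.h>

enum { A_LEN = 1024 * 2 };   /* number of elements in A */

/* Largest number of times A[offset + i] can execute with every
   index in bounds, for a given starting offset and step 1.  */
unsigned max_in_bounds_executions (unsigned offset)
{
  return offset < A_LEN ? A_LEN - offset : 0u;
}

/* Nonzero if the index would wrap within n iterations, checked by
   computing the last index in a wider type.  */
int index_iv_wraps (unsigned offset, unsigned n)
{
  if (n == 0)
    return 0;
  return (uint64_t) offset + (uint64_t) (n - 1) > UINT_MAX;
}
```

The largest value of max_in_bounds_executions is 2048, reached at offset 0, which matches the "2047 + 1" bound in the dump. For any in-bounds offset and any execution count within that bound, index_iv_wraps returns 0.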

We can re-use this information to vectorize this case. I already have a
simple patch to achieve this, and will send it out later (after more
testing).
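As an illustration of what the non-wrapping fact permits (this is not the patch, only a hand-written source-level sketch), the loop can be rewritten so that the index no longer contains the possibly wrapping sum. The name foo_hoisted is hypothetical. The two functions agree whenever every access A[i + offset] is in bounds, which is exactly the condition the array-based inference relies on.

```c
int A[1024 * 2];

/* Original form: the index is the unsigned sum i + offset.  */
int foo (unsigned offset, unsigned N)
{
  int sum = 0;

  for (unsigned i = 0; i < N; i++)
    sum += A[i + offset];

  return sum;
}

/* Hand-written equivalent, valid only if i + offset never wraps:
   the offset is applied once, and the loop indexes with i alone.  */
int foo_hoisted (unsigned offset, unsigned N)
{
  const int *p = A + offset;
  int sum = 0;

  for (unsigned i = 0; i < N; i++)
    sum += p[i];

  return sum;
}
```

In the second form the access is p[i], with i running from 0 to N - 1, so the address is a plain linear function of the iteration number.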
