https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100756

            Bug ID: 100756
           Summary: vect: Superfluous epilog created on s390x
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rdapp at linux dot ibm.com
  Target Milestone: ---

Since g:d846f225c25c5885250c303c8d118caa08c447ab we create an epilog loop on
s390 for the following test case:

/* { dg-do compile } */
/* { dg-options "-O3 -mzarch -march=z13" } */
/* { dg-require-effective-target s390_vx } */

int
foo (int * restrict a, int n)
{
  int i, result = 0;

  for (i = 0; i < n * 4; i++)
    result += a[i];
  return result;
}

vec.c:10:17: note:  epilog loop required

The following check in
tree-vect-loop.c:vect_need_peeling_or_partial_vectors_p() is now true:

               || ((tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
                     < (unsigned) exact_log2 (const_vf))

LOOP_VINFO_NITERS (loop_vinfo) is now _15 > 0 ? (unsigned int) _15 : 1,
whereas before it was (unsigned int) _15. tree_ctz() returns 0 for the
conditional. Before, it returned 2, which did not trigger the epilog
requirement.
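
To illustrate the arithmetic, here is a minimal sketch in plain C. It is not
GCC code. It assumes that the known trailing zero bits of a conditional are
the minimum over its two arms (which is how I understand tree_ctz() to treat
a COND_EXPR) and that const_vf is 4 here (128-bit vectors of 32-bit ints), so
exact_log2 (const_vf) is 2. The helper names cond_ctz and needs_epilog are
hypothetical.

```c
/* Hypothetical model, not GCC code: known trailing zero bits of
   "cond ? a : b" can be no better than the weaker of the two arms.  */
static unsigned
cond_ctz (unsigned then_ctz, unsigned else_ctz)
{
  return then_ctz < else_ctz ? then_ctz : else_ctz;
}

/* Simplified form of the quoted check: an epilog is required when the
   iteration count is not known to be a multiple of the vectorization
   factor, i.e. when it has fewer known trailing zero bits than
   log2 (vf).  */
static int
needs_epilog (unsigned niters_ctz, unsigned log2_vf)
{
  return niters_ctz < log2_vf;
}
```

Under these assumptions, the old niters (n * 4, two known trailing zero bits)
gives needs_epilog (2, 2) == 0. The new niters has the constant 1 in its else
arm, which has no trailing zero bits, so cond_ctz (2, 0) is 0 and
needs_epilog (0, 2) == 1.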

may_be_zero is _15 > 0 so it looks to me like we rather want to check the
not-may_be_zero part of niter for alignment. Not sure if this is the right/safe
thing to do, though.
