https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113026

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I do have a patch to avoid the epilog, but that doesn't help when adjusting the
testcase to

char dst[17];
void
foo (char *src, long n)
{
  for (long i = 0; i < n; i++)
    dst[i] = src[i];
}

because then we still fail to constrain the number of epilog iterations
(the different cases of control flow are now quite complicated, and the
pending early exit vectorization patches will complicate them further).

I'll see what to do _after_ that work has gone in.

The following helps the dst[16] case; ideally we'd refactor that a bit
according to the comment.

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 7a3db5f098b..a4dd2caa400 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -1260,7 +1260,11 @@ vect_need_peeling_or_partial_vectors_p (loop_vec_info loop_vinfo)
             the epilogue is unnecessary.  */
          && (!LOOP_REQUIRES_VERSIONING (loop_vinfo)
              || ((unsigned HOST_WIDE_INT) max_niter
-                 > (th / const_vf) * const_vf))))
+                 /* We'd like to use LOOP_VINFO_VERSIONING_THRESHOLD
+                    but that's only computed later based on our result.
+                    The following is the most conservative approximation.  */
+                 > (std::max ((unsigned HOST_WIDE_INT) th,
+                              const_vf) / const_vf) * const_vf))))
     return true;

   return false;
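
To make the arithmetic in the changed comparison concrete, here is a minimal
standalone sketch.  It is not GCC code: old_bound, new_bound and the uhwi
alias are hypothetical names, and the values VF = 16, th = 5 and
max_niter = 16 are assumed for illustration only.  It shows that when th is
smaller than the VF the old expression rounds down to zero, so any nonzero
max_niter compares greater, while the patched expression never drops below
one full vector iteration.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for unsigned HOST_WIDE_INT; not GCC's type.
using uhwi = std::uint64_t;

// Old right-hand side: th rounded down to a multiple of the VF.
// This is 0 whenever th < const_vf.
constexpr uhwi old_bound (uhwi th, uhwi const_vf)
{
  return (th / const_vf) * const_vf;
}

// Patched right-hand side: clamp th to at least one VF before rounding.
constexpr uhwi new_bound (uhwi th, uhwi const_vf)
{
  return (std::max (th, const_vf) / const_vf) * const_vf;
}

// Assumed values: VF = 16, th = 5, max_niter = 16.
static_assert (old_bound (5, 16) == 0, "old bound collapses to zero");
static_assert (new_bound (5, 16) == 16, "patched bound is one full VF");
static_assert (16 > old_bound (5, 16), "old comparison is true");
static_assert (!(16 > new_bound (5, 16)), "patched comparison is false");
```

Under these assumed values the old comparison holds and the patched one does
not, which matches the intent stated in the surrounding comment: if the
maximum number of iterations equals the threshold, the epilogue is
unnecessary.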

It's interesting that we don't seem to adjust the upper bound of the niters
for the epilog at all.  The following cures that as well:

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index bcd90a331f5..07a30b7ee98 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -3193,6 +3193,19 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
            bb_before_epilog->count = single_pred_edge (bb_before_epilog)->count ();
          bb_before_epilog = loop_preheader_edge (epilog)->src;
        }
+      else
+       {
+         /* When we do not have a loop-around edge to the epilog we know
+            the vector loop covered at least VF scalar iterations.  Update
+            any known upper bound with this knowledge.  */
+         if (loop->any_upper_bound)
+           epilog->nb_iterations_upper_bound -= constant_lower_bound (vf);
+         if (loop->any_likely_upper_bound)
+           epilog->nb_iterations_likely_upper_bound -= constant_lower_bound (vf);
+         if (loop->any_estimate)
+           epilog->nb_iterations_estimate -= constant_lower_bound (vf);
+       }
+
       /* If loop is peeled for non-zero constant times, now niters refers to
         orig_niters - prolog_peeling, it won't overflow even the orig_niters
         overflows.  */
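
The reasoning behind the epilog bound update can be sketched outside of GCC
as follows.  This is a simplified model, not the real data structure:
loop_bound and epilog_bound are hypothetical names, the bound is treated as
a plain scalar iteration count rather than GCC's internal representation,
and clamping at zero is an assumption of this sketch that the patch itself
does not make.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of a loop's known iteration bound; not GCC's class loop.
struct loop_bound
{
  std::optional<std::uint64_t> upper_bound;  // empty means no known bound
};

// If the epilog is only reached after the vector loop has run, at least VF
// scalar iterations are already done, so a known bound shrinks by VF.
inline loop_bound epilog_bound (loop_bound scalar, std::uint64_t vf)
{
  if (scalar.upper_bound)
    {
      std::uint64_t b = *scalar.upper_bound;
      // Assumption of this sketch: clamp at zero instead of wrapping.
      scalar.upper_bound = b > vf ? b - vf : 0;
    }
  return scalar;
}
```

For the dst[17] testcase, assuming a VF of 16, a scalar bound of 17
iterations would leave at most 17 - 16 = 1 iteration for the epilog in this
model.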
