Hi,

In PR120922 we first disabled setting a range on niters_vector for
partial vectorization and later introduced a ceiling division instead.

In PR121985 we ran into this again: a bogus range caused wrong code
later on.  On top of that, I saw several instances of this issue on a
local branch that enables more VLS length-controlled loops.

I believe we must not set niters_vector's range to TYPE_MAX / VF,
regardless of the rounding, because of the way niters_vector is used.
It is not really the number of vector iterations: the loop actually
iterates niters_vector / step times, where step = VF for partial
vectors.

Thus, only set the range to TYPE_MAX / VF if step == 1.
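
To illustrate the reasoning above, here is a minimal sketch in plain C,
not GCC code.  All function names are hypothetical, and it assumes that
with step == 1 niters_vector already holds the divided count, while with
step == VF it still holds the scalar count.  The round-up in
loop_iterations is likewise an assumption about how the final partial
vector is counted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: upper bound of the range under discussion,
   TYPE_MAX / VF with truncating division.  */
unsigned range_max (unsigned type_max, unsigned vf)
{
  assert (vf > 0);
  return type_max / vf;
}

/* Hypothetical model of the value niters_vector holds.  Assumption:
   for step == 1 it is the divided count, for step == VF it is still
   the scalar iteration count.  */
unsigned niters_vector_value (unsigned niters, unsigned vf, unsigned step)
{
  assert (vf > 0);
  return step == 1 ? niters / vf : niters;
}

/* Number of times the loop actually iterates: niters_vector / step,
   rounded up here so a final partial vector counts as one iteration.  */
unsigned loop_iterations (unsigned niters_vector, unsigned step)
{
  assert (step > 0);
  return (niters_vector + step - 1) / step;
}

/* Does the range [1, TYPE_MAX / VF] hold for niters_vector?  */
bool range_holds (unsigned niters, unsigned vf, unsigned step,
                  unsigned type_max)
{
  unsigned v = niters_vector_value (niters, vf, step);
  return v >= 1 && v <= range_max (type_max, vf);
}
```

With an 8-bit type (TYPE_MAX = 255), VF = 16 and 100 scalar iterations,
the range [1, 15] holds for step == 1 but not for step == 16, where
niters_vector is still 100.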

Bootstrapped and regtested on x86 and power10, regtested on aarch64 and 
riscv64.

Regards
 Robin

gcc/ChangeLog:

        PR middle-end/121985

        * tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Only
        set niters_vector's range if step == 1.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/pr121985.c: New test.
---
 .../gcc.target/riscv/rvv/autovec/pr121985.c   | 17 +++++++++++++
 gcc/tree-vect-loop-manip.cc                   | 25 ++++++-------------
 2 files changed, 24 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr121985.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr121985.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr121985.c
new file mode 100644
index 00000000000..6e5cdd927b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr121985.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv_zvl512b -mabi=lp64d -mrvv-max-lmul=m8" } */
+
+unsigned short a=3;
+char f=1;
+
+int main()
+{
+  for (char var=f; var<6; var++)
+    a *= 5;
+
+  return a;
+}
+
+/* We would set a wrong niter range that would cause us to extract the wrong
+   element.  */
+/* { dg-final { scan-assembler-not "vslidedown.vi.*,0" } } */
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 6af07efe68a..87ad7ee5bca 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -2850,29 +2850,18 @@ vect_gen_vector_loop_niters (loop_vec_info loop_vinfo, tree niters,
         we set range information to make niters analyzer's life easier.
         Note the number of latch iteration value can be TYPE_MAX_VALUE so
         we have to represent the vector niter TYPE_MAX_VALUE + 1 / vf.  */
-      if (stmts != NULL && const_vf > 0 && !LOOP_VINFO_EPILOGUE_P (loop_vinfo))
+      if (stmts != NULL
+         && integer_onep (step_vector))
        {
-         if (niters_no_overflow
-             && LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
-           {
-             int_range<1> vr (type,
-                              wi::one (TYPE_PRECISION (type)),
-                              wi::div_ceil (wi::max_value
-                                                       (TYPE_PRECISION (type),
-                                                        TYPE_SIGN (type)),
-                                            const_vf,
-                                            TYPE_SIGN (type)));
-             set_range_info (niters_vector, vr);
-           }
-         else if (niters_no_overflow)
+         if (niters_no_overflow)
            {
              int_range<1> vr (type,
                               wi::one (TYPE_PRECISION (type)),
                               wi::div_trunc (wi::max_value
-                                                       (TYPE_PRECISION (type),
-                                                        TYPE_SIGN (type)),
-                                          const_vf,
-                                          TYPE_SIGN (type)));
+                                             (TYPE_PRECISION (type),
+                                              TYPE_SIGN (type)),
+                                             const_vf,
+                                             TYPE_SIGN (type)));
              set_range_info (niters_vector, vr);
            }
          /* For VF == 1 the vector IV might also overflow so we cannot
-- 
2.51.0

