https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125767

--- Comment #4 from Christopher Bazley <Chris.Bazley at arm dot com> ---
(In reply to Richard Sandiford from comment #3)
> I agree that this is a missing case, but it's more of a missed optimisation
> rather than a correctness issue.  Returning false should always be
> conservatively correct.

Hi Richard,

It would be possible to work around the issue, but I reported it as a bug
because the function does not seem to behave according to its documented
contract.

The last time I encountered this issue, I put in a workaround:

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index cd3ba6fa1cb..367a9c63ea4 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1664,37 +1664,41 @@ check_load_store_for_partial_vectors (vec_info *vinfo, tree vectype,
     }

   /* We might load more scalars than we need for permuting SLP loads.
      We checked in get_load_store_type that the extra elements
      don't leak into a new vector.  */
   auto group_memory_nvectors = [](poly_uint64 size, poly_uint64 nunits)
   {
     unsigned int nvectors;
     if (can_div_away_from_zero_p (size, nunits, &nvectors))
       return nvectors;
-    gcc_unreachable ();
+
+    gcc_assert (known_le (size, nunits));
+    return 1u;
   };

Now, I need a similar workaround in gen_lowpart_common:

    {
      /* MODE must occupy no more of the underlying registers than X.  */
      poly_uint64 regsize = REGMODE_NATURAL_SIZE (innermode);
      unsigned int mregs, xregs;
      if (!can_div_away_from_zero_p (msize, regsize, &mregs)
          || !can_div_away_from_zero_p (xsize, regsize, &xregs)
          || mregs > xregs)
        return 0;
    }

At this point, I think it is better just to modify the function to give the
expected result.
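
To illustrate the case that the first workaround covers (size known to be no
larger than nunits, so the rounded-up quotient is 1), here is a minimal
standalone sketch. It does not use GCC's poly-int.h: Poly1, known_le_sketch
and can_ceil_div_sketch are hypothetical names invented for this example, and
the model assumes a value of the form c0 + c1 * x for an unknown x >= 0. It is
not a proposed patch, only a small model of the expected result.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly_uint64:
// the value is c0 + c1 * x for some unknown runtime x >= 0.
struct Poly1 {
  uint64_t c0;
  uint64_t c1;
};

// True if a <= b for every x >= 0 (both coefficients must be <=).
inline bool known_le_sketch(Poly1 a, Poly1 b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// Sketch of a division that rounds away from zero.  Returns true and
// sets *quotient only if ceil(a / b) is the same constant for every
// x >= 0; otherwise returns false, which is always conservative.
// Overflow in the products is ignored here for brevity.
inline bool can_ceil_div_sketch(Poly1 a, Poly1 b, uint64_t *quotient) {
  if (b.c0 == 0)
    return false;  // assumption: the divisor is at least 1 when x == 0

  // Case 1: a is proportional to b, so a / b equals a.c0 / b.c0 for
  // every x.  This also covers two plain constants (c1 == 0 for both).
  if (a.c0 * b.c1 == a.c1 * b.c0) {
    *quotient = a.c0 / b.c0 + (a.c0 % b.c0 != 0 ? 1 : 0);
    return true;
  }

  // Case 2: 0 < a <= b for every x, so the rounded-up quotient is 1.
  // This mirrors the known_le (size, nunits) workaround in the diff.
  if (a.c0 >= 1 && known_le_sketch(a, b)) {
    *quotient = 1;
    return true;
  }

  return false;
}

// Small self-check showing the intended results.
inline void example_checks() {
  uint64_t q = 0;
  // 4 / (4 + 4x): truncating division varies with x, rounding up is 1.
  assert(can_ceil_div_sketch(Poly1{4, 0}, Poly1{4, 4}, &q) && q == 1);
  // 6 / (4 + 4x): the rounded-up quotient is 2 at x == 0 and 1 at x == 1.
  assert(!can_ceil_div_sketch(Poly1{6, 0}, Poly1{4, 4}, &q));
}
```

In this model the caller no longer needs a fallback path: a size of 4 against
4 + 4x units yields 1 directly, while genuinely varying quotients still report
failure.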
