https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123175
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #5)
> (In reply to Richard Biener from comment #4)
> > I'd say we might at least want to enforce nelts equality for constant size
> > vectors, thus ICE on must_ne ()?
>
> Yeah I have a patch for this on a patch series I didn't get a chance to
> upstream this year.
>
> The fix I have is more extensive since the pattern also mis-identifies sub
> patterns.
>
> To do so it extends the vec_perm_indices::all_in_range_p API to support
> matches on multiple subranges vec_perm_indices::all_in_ranges_p and the fix
> then becomes
>
> @@ -11084,6 +11097,8 @@ and,
> {
> /* Create a vec_perm_indices for the integer vector. */
> poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
> + poly_uint64 e_nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (op0));
> + bool same_size = known_eq (nelts, e_nelts);
> bool single_arg = (op0 == op1);
> vec_perm_indices sel (builder, single_arg ? 1 : 2, nelts);
> }
> @@ -11095,9 +11110,12 @@ and,
> {
> if (!single_arg)
> {
> - if (sel.all_from_input_p (0))
> + if ((same_size && sel.all_from_input_p (0))
> + || (!same_size && sel.all_in_range_p (0, e_nelts)))
> op1 = op0;
> - else if (sel.all_from_input_p (1))
> + else if ((same_size && sel.all_from_input_p (1))
> + || (!same_size
> + && sel.all_in_range_p (2 * e_nelts, e_nelts)))
> {
> op0 = op1;
> sel.rotate_inputs (1);
>
> I have had it for a while now but had no way to trigger it without the
> series so never submitted it.
>
> Could do so if you want?
all_from_input_p works if nelts is correct, so this fix seems wrong. For
the particular pattern I think just initializing nelts from op0 is correct.
But as said, I wonder if it was really intended to relax VEC_PERM_EXPR this
much. I wonder if we even ever get those on non-VLA targets?
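
To illustrate the nelts point with a toy model (this is not the real
vec_perm_indices; the helper name and the 8-element/4-element sizes are
made up for the example): a check of the form "every index lies in
[input * nelts, (input + 1) * nelts)" only gives the right answer when
nelts is the element count of the inputs, not of the result.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in (hypothetical) for an "all from input N" query: true if
// every selector index lies in [input * nelts, (input + 1) * nelts).
bool all_from_input(const std::vector<uint64_t> &sel,
                    unsigned input, uint64_t nelts)
{
  for (uint64_t idx : sel)
    if (idx / nelts != input)
      return false;
  return true;
}

// Assumed scenario: two 8-element inputs permuted into a 4-element result.
void demo()
{
  std::vector<uint64_t> sel = {4, 5, 6, 7};  // all lanes of the first input

  // nelts taken from the result type (4): the indices look like they
  // come from the second input, which is wrong.
  assert(!all_from_input(sel, 0, 4));
  assert(all_from_input(sel, 1, 4));

  // nelts taken from the input type (8): correctly attributed to input 0.
  assert(all_from_input(sel, 0, 8));
}
```

With nelts taken from the input, the existing single-query form is enough
in this sketch and no extra range check is needed.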
Going forward I'd like to see vec_perm_indices CTORs from gassign * and
from tree (for match.pd it would be convenient if the tree one handled an
SSA name by looking at its definition) to avoid such issues.
Do you have a non-GIMPLE testcase that shows the issue you are fixing above?