https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123175
Bug ID: 123175
Summary: Wrong folding of VEC_PERM_EXPR
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
When we use the relaxed rules for VEC_PERM_EXPR, where the input vectors can
have fewer elements than the result, like
typedef int v4si __attribute__((vector_size(16)));
typedef int v2si __attribute__((vector_size(8)));
typedef char v4qi __attribute__((vector_size(4)));
v4si __GIMPLE() foo (v2si a, v2si b)
{
  v4si res;
  res = __VEC_PERM (a, b, _Literal (v4qi) { 0, 1, 2, 3 });
  return res;
}
we mis-fold that to
res_3 = a_1(D);
because of the match.pd rule
(simplify
 (vec_perm @0 @1 VECTOR_CST@2)
 (with
  {
    tree op0 = @0, op1 = @1, op2 = @2;
    machine_mode result_mode = TYPE_MODE (type);
...
which uses
/* Create a vec_perm_indices for the integer vector. */
poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
bool single_arg = (op0 == op1);
vec_perm_indices sel (builder, single_arg ? 1 : 2, nelts);
with nelts == 4, the element count of the result rather than of the
two-element inputs, which misleads sel.series_p (0, 1, 0, 1) but also
sel.all_from_input_p, as can be seen with
v4si __GIMPLE() foo (v2si a, v2si b)
{
  v4si res;
  res = __VEC_PERM (a, b, _Literal (v4qi) { 0, 2, 2, 1 });
  return res;
}
which we fold to
res_3 = VEC_PERM_EXPR <a_1(D), a_1(D), { 0, 2, 2, 1 }>;
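The effect of the wrong element count can be reproduced outside the compiler.
The following is a minimal standalone C++ sketch, not GCC code: all_from_input
and permute are hypothetical helpers that only model the kind of range check
all_from_input_p performs and the two-input permute semantics, and the element
values 10, 11, 20, 21 are made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the range check: true if every selector element
// falls into the index range of input number `input`, assuming each
// input has `nelts_per_input` elements.
static bool all_from_input(const std::vector<unsigned> &sel,
                           unsigned input, unsigned nelts_per_input)
{
  for (unsigned idx : sel)
    if (idx / nelts_per_input != input)
      return false;
  return true;
}

// Hypothetical model of a two-input permute: with n elements per input,
// index i < n selects a[i], otherwise b[i - n].
static std::vector<int> permute(const std::vector<int> &a,
                                const std::vector<int> &b,
                                const std::vector<unsigned> &sel)
{
  assert(a.size() == b.size());
  const unsigned n = static_cast<unsigned>(a.size());
  std::vector<int> res;
  for (unsigned idx : sel)
    {
      idx %= 2 * n;  // Selector elements wrap at the total input length.
      res.push_back(idx < n ? a[idx] : b[idx - n]);
    }
  return res;
}
```

With two elements per input, { 0, 1, 2, 3 } concatenates a and b, and
{ 0, 2, 2, 1 } selects a[0], b[0], b[0], a[1]. In this model,
all_from_input (sel, 0, 4) holds for both selectors while
all_from_input (sel, 0, 2) does not, which matches the symptom of treating the
permute as reading only from the first input.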
I've seen this when removing the unnecessary padding of shufflevector inputs
to mask length for PR123156.