https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116583
Bug ID: 116583
Summary: vectorizable_slp_permutation cannot handle even/odd extract from VLA vector
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---

gcc.dg/vect/O3-pr39675-2.c:9:1: note: node 0x450c0e0 (max_nunits=1, refcnt=1) vector([4,4]) int
gcc.dg/vect/O3-pr39675-2.c:9:1: note: op: VEC_PERM_EXPR
gcc.dg/vect/O3-pr39675-2.c:9:1: note: stmt 0 a0_8 = in[_1];
gcc.dg/vect/O3-pr39675-2.c:9:1: note: stmt 1 a2_10 = in[_3];
gcc.dg/vect/O3-pr39675-2.c:9:1: note: lane permutation { 0[0] 0[2] }
gcc.dg/vect/O3-pr39675-2.c:9:1: note: children 0x450bf18
gcc.dg/vect/O3-pr39675-2.c:9:1: note: node 0x450bf18 (max_nunits=4, refcnt=2) vector([4,4]) int
gcc.dg/vect/O3-pr39675-2.c:9:1: note: op template: a0_8 = in[_1];
gcc.dg/vect/O3-pr39675-2.c:9:1: note: stmt 0 a0_8 = in[_1];
gcc.dg/vect/O3-pr39675-2.c:9:1: note: stmt 1 a1_9 = in[_2];
gcc.dg/vect/O3-pr39675-2.c:9:1: note: stmt 2 a2_10 = in[_3];
gcc.dg/vect/O3-pr39675-2.c:9:1: note: stmt 3 a3_11 = in[_4];

Because the number of lanes in the two SLP nodes does not agree, we end up
with repeating_p == false, which causes the permute to be rejected as
unsupported for VLA vectors. repeating_p is initially set to

  tree vectype = SLP_TREE_VECTYPE (node);
  poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
  bool repeating_p = multiple_p (nunits, SLP_TREE_LANES (node));

I suppose as long as 'child' is repeating in the same sense, the overall
thing is still repeating.
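
To make the lane arithmetic concrete, here is a small standalone model of the check, not GCC code. PolyU64, multiple_of, repeating_current and repeating_relaxed are hypothetical names; the two-coefficient value stands in for poly_uint64, and the "current" variant only models the behaviour described above (a lane-count mismatch clears the flag). The "relaxed" variant is one possible reading of the suggestion that the child only needs to be repeating in the same sense.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly_uint64:
// the runtime value is c0 + c1 * x for some x >= 0.
struct PolyU64 {
    std::uint64_t c0;
    std::uint64_t c1;
};

// True if c0 + c1 * x is a multiple of k for every x >= 0,
// which holds exactly when both coefficients are multiples of k.
bool multiple_of(PolyU64 v, std::uint64_t k) {
    assert(k != 0);
    return v.c0 % k == 0 && v.c1 % k == 0;
}

// Model of the behaviour described in the report (an assumption, not the
// actual GCC source): differing lane counts force the flag to false.
bool repeating_current(PolyU64 nunits, std::uint64_t node_lanes,
                       std::uint64_t child_lanes) {
    return multiple_of(nunits, node_lanes) && node_lanes == child_lanes;
}

// Model of the suggested relaxation: the child only has to be
// "repeating in the same sense", i.e. also divide the vector length.
bool repeating_relaxed(PolyU64 nunits, std::uint64_t node_lanes,
                       std::uint64_t child_lanes) {
    return multiple_of(nunits, node_lanes) && multiple_of(nunits, child_lanes);
}
```

With nunits = [4,4], a 2-lane permute node and a 4-lane child (the case in the dump), the first model rejects the permute while the relaxed one accepts it.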
When doing that we get

  vect_a0_8.6_27 = .MASK_LOAD (vectp_in.4_23, 32B, loop_mask_19);
  vectp_in.4_28 = vectp_in.4_23 + POLY_INT_CST [16, 16];
  vect_a0_8.7_29 = .MASK_LOAD (vectp_in.4_28, 32B, loop_mask_18);
  vectp_in.4_30 = vectp_in.4_23 + POLY_INT_CST [32, 32];
  vect_a0_8.8_31 = .MASK_LOAD (vectp_in.4_30, 32B, loop_mask_6);
  vectp_in.4_32 = vectp_in.4_23 + POLY_INT_CST [48, 48];
  vect_a0_8.9_33 = .MASK_LOAD (vectp_in.4_32, 32B, loop_mask_5);
  _43 = VEC_PERM_EXPR <vect_a0_8.6_27, vect_a0_8.6_27, { 0, 2, 4, ... }>;
  _44 = VEC_PERM_EXPR <vect_a0_8.7_29, vect_a0_8.7_29, { 0, 2, 4, ... }>;
  _45 = VEC_PERM_EXPR <_43, _43, { 0, 2, 4, ... }>;

That isn't entirely what we expect, though: each permute uses the same
vector for both inputs. We'd have expected _27 and _29 in the first,
_30 and _32 in the second, and _43 and _44 in the third permute.
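
Why the duplicated operand matters can be seen in a fixed-length model of a two-input permute. This is only a sketch: vec_perm and even_selector are hypothetical helpers, the vector length is a plain runtime size rather than a VLA type, and out-of-range selector indices are rejected instead of being wrapped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of a two-input permute: selector index i picks element i of the
// concatenation of a and b (indices below a.size() come from a).
std::vector<int> vec_perm(const std::vector<int>& a,
                          const std::vector<int>& b,
                          const std::vector<std::size_t>& sel) {
    assert(a.size() == b.size() && sel.size() == a.size());
    std::vector<int> out;
    out.reserve(sel.size());
    for (std::size_t idx : sel) {
        assert(idx < 2 * a.size());
        out.push_back(idx < a.size() ? a[idx] : b[idx - a.size()]);
    }
    return out;
}

// Builds the selector { 0, 2, 4, ... } with n elements.
std::vector<std::size_t> even_selector(std::size_t n) {
    std::vector<std::size_t> sel(n);
    for (std::size_t i = 0; i < n; ++i)
        sel[i] = 2 * i;
    return sel;
}
```

For a = {0, 1, 2, 3} and b = {4, 5, 6, 7}, vec_perm(a, b, even_selector(4)) yields the even elements of both inputs, {0, 2, 4, 6}, whereas vec_perm(a, a, even_selector(4)) yields {0, 2, 0, 2}: the second vector's elements never reach the result, which is the problem with the generated permutes above.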