The following moves the rejection of loop vectorization with vector(1) typed vectors from the initial vector type determination to after SLP discovery, where we can check whether there is any instance using something other than vector(1) vectors. For RVV at least, vector(1) instances serve as a limited way to support partial loop vectorization. The following restores that support.
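
As a rough illustration of the new decision rule (not the vectorizer code itself), the sketch below models each SLP instance only by the lane count of its root vector type. ToyInstance, count_interesting and worth_vectorizing are hypothetical names, and the plain "> 1" comparison stands in for known_gt on TYPE_VECTOR_SUBPARTS, assuming constant lane counts.

```cpp
#include <vector>

// Hypothetical stand-in for an SLP instance: only the number of lanes
// in the root node's vector type is modelled.
struct ToyInstance {
  unsigned root_lanes;
};

// Count instances whose root vector type has more than one lane.
// This mirrors the new condition guarding decided_to_slp++.
unsigned count_interesting(const std::vector<ToyInstance> &instances) {
  unsigned decided = 0;
  for (const ToyInstance &inst : instances)
    if (inst.root_lanes > 1)
      ++decided;
  return decided;
}

// Assumption: the loop is worth vectorizing only if at least one
// instance was counted, i.e. not all roots are vector(1).
bool worth_vectorizing(const std::vector<ToyInstance> &instances) {
  return count_interesting(instances) > 0;
}
```

Under this model a loop whose instances are all vector(1) is still rejected, while a mix of vector(1) and wider instances is accepted, which is the case RVV relies on.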

Bootstrapped and tested on x86_64-unknown-linux-gnu.

        PR tree-optimization/121048
        * tree-vect-loop.cc (vect_determine_vectype_for_stmt_1): Remove
        rejecting vector(1) vector types.
        (vect_set_stmts_vectype): Likewise.
        * tree-vect-slp.cc (vect_make_slp_decision): Only count instances
        with non-vector(1) root towards whether we have any interesting
        instances to vectorize.
---
 gcc/tree-vect-loop.cc |  9 +--------
 gcc/tree-vect-slp.cc  | 15 ++++++++++-----
 2 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 8e4ef21cbf2..09e59cc30ed 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -191,12 +191,6 @@ vect_determine_vectype_for_stmt_1 (vec_info *vinfo, stmt_vec_info stmt_info,
 
   if (stmt_vectype)
     {
-      if (known_le (TYPE_VECTOR_SUBPARTS (stmt_vectype), 1U))
-        return opt_result::failure_at (STMT_VINFO_STMT (stmt_info),
-                                       "not vectorized: unsupported "
-                                       "data-type in %G",
-                                       STMT_VINFO_STMT (stmt_info));
-
       if (STMT_VINFO_VECTYPE (stmt_info))
         /* The only case when a vectype had been already set is for stmts
            that contain a data ref, or for "pattern-stmts" (stmts generated
@@ -303,8 +297,7 @@ vect_set_stmts_vectype (loop_vec_info loop_vinfo)
                              scalar_type);
 
           vectype = get_vectype_for_scalar_type (loop_vinfo, scalar_type);
-          if (!vectype
-              || known_le (TYPE_VECTOR_SUBPARTS (vectype), 1U))
+          if (!vectype)
             return opt_result::failure_at (phi,
                                            "not vectorized: unsupported "
                                            "data-type %T\n",
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index fe67d4dbc46..7c23496b5e0 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -7586,20 +7586,25 @@ vect_make_slp_decision (loop_vec_info loop_vinfo)
   hash_set<slp_tree> visited;
   FOR_EACH_VEC_ELT (slp_instances, i, instance)
     {
-      /* FORNOW: SLP if you can.  */
+      slp_tree root = SLP_INSTANCE_TREE (instance);
+
       /* All unroll factors have the form:
 
            GET_MODE_SIZE (vinfo->vector_mode) * X
 
          for some rational X, so they must have a common multiple.  */
-      vect_update_slp_vf_for_node (SLP_INSTANCE_TREE (instance),
-                                   unrolling_factor, visited);
+      vect_update_slp_vf_for_node (root, unrolling_factor, visited);
 
       /* Mark all the stmts that belong to INSTANCE as PURE_SLP stmts.  Later we
          call vect_detect_hybrid_slp () to find stmts that need hybrid SLP and
          loop-based vectorization.  Such stmts will be marked as HYBRID.  */
-      vect_mark_slp_stmts (loop_vinfo, SLP_INSTANCE_TREE (instance));
-      decided_to_slp++;
+      vect_mark_slp_stmts (loop_vinfo, root);
+
+      /* If all instances ended up with vector(1) T roots make sure to
+         not vectorize.  RVV for example relies on loop vectorization
+         when some instances are essentially kept scalar.  See PR121048.  */
+      if (known_gt (TYPE_VECTOR_SUBPARTS (SLP_TREE_VECTYPE (root)), 1U))
+        decided_to_slp++;
     }
 
   LOOP_VINFO_SLP_UNROLLING_FACTOR (loop_vinfo) = unrolling_factor;
-- 
2.43.0