Richard Biener <richard.guent...@gmail.com> writes: > On Wed, Nov 6, 2019 at 4:58 PM Richard Sandiford > <richard.sandif...@arm.com> wrote: >> >> Richard Biener <richard.guent...@gmail.com> writes: >> > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford >> > <richard.sandif...@arm.com> wrote: >> >> >> >> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully >> >> we don't see in practice) and no-op conversions that are required >> >> to maintain correct gimple, such as changes between signed and >> >> unsigned types. These cases shouldn't generate any code and so >> >> shouldn't count against either the scalar or vector costs. >> >> >> >> Later patches test this, but it seemed worth splitting out. >> > >> > Hmm, but you have to adjust vect_compute_single_scalar_iteration_cost and >> > possibly the SLP cost walk as well, otherwise we're artificially making >> > those copies cheaper when vectorized. >> >> Ah, yeah. It looks complicated to reproduce the conditions exactly >> there, so how about just costing 1 copy in vectorizable_assignment >> to counteract it, and ignore ncopies? > > I guess costing a single scalar_stmt ought to make it exactly offset > the scalar cost?
To summarise what we said on IRC: the problem with that is that we need to count VF scalar stmts, where VF might be a runtime value. The follow-on loop costing code copes with variable VF without relying on vect_vf_for_cost. Calling vectorizable_assignment from the scalar costing code seemed like too much of a hack. And it turns out that we can't delay the scalar costing until after vect_analyze_stmts because vect_enhance_data_refs_alignment needs it before then. Reworking this whole thing is too much work for GCC 10 at this stage. So this patch goes with your suggestion of using a test based on tree_nop_conversion. To make sure that the scalar and vector costs stay somewhat consistent, vectorizable_assignment continues to cost stmts for which the new predicate is false. Tested as before. Thanks, Richard 2019-11-07 Richard Sandiford <richard.sandif...@arm.com> gcc/ * tree-vectorizer.h (vect_nop_conversion_p): Declare. * tree-vect-stmts.c (vect_nop_conversion_p): New function. (vectorizable_assignment): Don't add a cost for nop conversions. * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost): Likewise. * tree-vect-slp.c (vect_bb_slp_scalar_cost): Likewise. Index: gcc/tree-vectorizer.h =================================================================== --- gcc/tree-vectorizer.h 2019-11-07 15:11:22.290972236 +0000 +++ gcc/tree-vectorizer.h 2019-11-07 16:32:14.817523866 +0000 @@ -1654,6 +1654,7 @@ extern tree vect_get_vec_def_for_stmt_co extern bool vect_transform_stmt (stmt_vec_info, gimple_stmt_iterator *, slp_tree, slp_instance); extern void vect_remove_stores (stmt_vec_info); +extern bool vect_nop_conversion_p (stmt_vec_info); extern opt_result vect_analyze_stmt (stmt_vec_info, bool *, slp_tree, slp_instance, stmt_vector_for_cost *); extern void vect_get_load_cost (stmt_vec_info, int, bool, Index: gcc/tree-vect-stmts.c =================================================================== --- gcc/tree-vect-stmts.c 2019-11-07 15:11:50.134775028 +0000 +++ gcc/tree-vect-stmts.c 2019-11-07 16:32:14.817523866 +0000 @@ -5284,6 +5284,29 @@ vectorizable_conversion (stmt_vec_info s return true; } +/* Return true if we can assume from the scalar form of STMT_INFO that + neither the scalar nor the vector forms will generate code. STMT_INFO + is known not to involve a data reference. */ + +bool +vect_nop_conversion_p (stmt_vec_info stmt_info) +{ + gassign *stmt = dyn_cast <gassign *> (stmt_info->stmt); + if (!stmt) + return false; + + tree lhs = gimple_assign_lhs (stmt); + tree_code code = gimple_assign_rhs_code (stmt); + tree rhs = gimple_assign_rhs1 (stmt); + + if (code == SSA_NAME || code == VIEW_CONVERT_EXPR) + return true; + + if (CONVERT_EXPR_CODE_P (code)) + return tree_nop_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)); + + return false; +} /* Function vectorizable_assignment. @@ -5399,7 +5422,9 @@ vectorizable_assignment (stmt_vec_info s { STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type; DUMP_VECT_SCOPE ("vectorizable_assignment"); - vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, cost_vec); + if (!vect_nop_conversion_p (stmt_info)) + vect_model_simple_cost (stmt_info, ncopies, dt, ndts, slp_node, + cost_vec); return true; } Index: gcc/tree-vect-loop.c =================================================================== --- gcc/tree-vect-loop.c 2019-11-07 15:11:22.290972236 +0000 +++ gcc/tree-vect-loop.c 2019-11-07 16:32:14.813523891 +0000 @@ -1126,7 +1126,9 @@ vect_compute_single_scalar_iteration_cos else kind = scalar_store; } - else + else if (vect_nop_conversion_p (stmt_info)) + continue; + else kind = scalar_stmt; record_stmt_cost (&LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), Index: gcc/tree-vect-slp.c =================================================================== --- gcc/tree-vect-slp.c 2019-11-07 15:11:22.290972236 +0000 +++ gcc/tree-vect-slp.c 2019-11-07 16:32:14.817523866 +0000 @@ -3037,6 +3037,8 @@ vect_bb_slp_scalar_cost (basic_block bb, else kind = scalar_store; } + else if (vect_nop_conversion_p (stmt_info)) + continue; else kind = scalar_stmt; record_stmt_cost (cost_vec, 1, kind, stmt_info, 0, vect_body);