vect_model_store_cost had:

  /* Costs of the stores.  */
  if (STMT_VINFO_STRIDED_P (stmt_info)
      && !STMT_VINFO_GROUPED_ACCESS (stmt_info))
    {
      /* N scalar stores plus extracting the elements.  */
      inside_cost += record_stmt_cost (body_cost_vec,
				       ncopies * TYPE_VECTOR_SUBPARTS (vectype),
				       scalar_store, stmt_info, 0, vect_body);
But non-SLP strided groups also use individual scalar stores rather than
vector stores, so I think we should skip this only for SLP groups.
The same applies to vect_model_load_cost.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
	* tree-vect-stmts.c (vect_model_store_cost): For non-SLP
	strided groups, use the cost of N scalar accesses instead of
	ncopies vector accesses.
	(vect_model_load_cost): Likewise.

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index e90eeda..f883580 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -926,8 +926,7 @@ vect_model_store_cost (stmt_vec_info stmt_info, int ncopies,
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
 
   /* Costs of the stores.  */
-  if (STMT_VINFO_STRIDED_P (stmt_info)
-      && !STMT_VINFO_GROUPED_ACCESS (stmt_info))
+  if (STMT_VINFO_STRIDED_P (stmt_info) && !(slp_node && grouped_access_p))
     {
       /* N scalar stores plus extracting the elements.  */
       inside_cost += record_stmt_cost (body_cost_vec,
@@ -1059,8 +1058,7 @@ vect_model_load_cost (stmt_vec_info stmt_info, int ncopies,
     }
 
   /* The loads themselves.  */
-  if (STMT_VINFO_STRIDED_P (stmt_info)
-      && !STMT_VINFO_GROUPED_ACCESS (stmt_info))
+  if (STMT_VINFO_STRIDED_P (stmt_info) && !(slp_node && grouped_access_p))
     {
       /* N scalar loads plus gathering them into a vector.  */
       tree vectype = STMT_VINFO_VECTYPE (stmt_info);
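For context, a hypothetical source loop with a strided grouped store (my own
illustration, not taken from the patch or the testsuite) might look like the
following.  The two writes per iteration form one group whose start address
moves by a runtime stride, so without SLP the vectorizer has to emit them as
individual scalar stores, which is what the patch now costs as N scalar
accesses:

```c
#include <stddef.h>

/* Hypothetical example of a strided grouped store: a[i * stride] and
   a[i * stride + 1] are treated as one access group, but the group's
   base advances by a variable stride, so a non-SLP vectorization would
   fall back to individual scalar stores for each group member.  */
void
strided_group_store (int *a, ptrdiff_t stride, int n)
{
  for (int i = 0; i < n; ++i)
    {
      a[i * stride] = 1;
      a[i * stride + 1] = 2;
    }
}
```
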