Kewen Lin <li...@linux.ibm.com> writes:
> @@ -9743,11 +9739,23 @@ vectorizable_load (vec_info *vinfo,
>  	  unsigned int n_groups = 0;
>  	  for (j = 0; j < ncopies; j++)
>  	    {
> -	      if (nloads > 1)
> +	      if (nloads > 1 && !costing_p)
>  		vec_alloc (v, nloads);
>  	      gimple *new_stmt = NULL;
>  	      for (i = 0; i < nloads; i++)
>  		{
> +		  if (costing_p)
> +		    {
> +		      if (VECTOR_TYPE_P (ltype))
> +			vect_get_load_cost (vinfo, stmt_info, 1,
> +					    alignment_support_scheme, misalignment,
> +					    false, &inside_cost, nullptr, cost_vec,
> +					    cost_vec, true);
> +		      else
> +			inside_cost += record_stmt_cost (cost_vec, 1, scalar_load,
> +							 stmt_info, 0, vect_body);
> +		      continue;
> +		    }
Just a note that this might make life harder for AArch64 costing.
Strided SLP loads are still equivalent to vector loads for AArch64,
since they happen on the FPR/vector side even if they have integral
modes.

But I agree this is more accurate from a general target-independent
POV, especially given the relatively coarse-grained costing enum.
So I think that's just something AArch64 will need to account for.

Thanks,
Richard