On Thu, 14 Oct 2021, Richard Sandiford wrote:

> The current vector cost interface has quite a bit of redundancy
> built in. Each target that defines its own hooks has to replicate
> the basic unsigned[3] management. Currently each target also
> duplicates the cost adjustment for inner loops.
>
> This patch instead defines a vector_costs class for holding
> the scalar or vector cost and allows targets to subclass it.
> There is then only one costing hook: to create a new costs
> structure of the appropriate type. Everything else can be
> virtual functions, with common concepts implemented in the
> base class rather than in each target's derivation.
>
> This might seem like excess C++-ification, but it shaves
> ~100 LOC. I've also got some follow-on changes that become
> significantly easier with this patch. Maybe it could help
> with things like weighting blocks based on frequency too.
>
> This will clash with Andre's unrolling patches. His patches
> have priority so this patch should queue behind them.
>
> The x86 and rs6000 parts fully convert to a self-contained class.
> The equivalent aarch64 changes are more complex, so this patch
> just does the bare minimum. A later patch will rework the
> aarch64 bits.
>
> Tested on aarch64-linux-gnu, arm-linux-gnueabihf, x86_64-linux-gnu
> and powerpc64le-linux-gnu. WDYT?
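
For anyone skimming the thread, the shape described above (one creation
hook, a base class that owns the three accumulators, and target subclasses
that override virtual functions) can be sketched in standalone C++. This is
only an illustration under assumptions, not the code from the patch:
costs_base, toy_target_costs, cost_location and create_costs are
hypothetical stand-ins, and the per-statement costs of 1 and 2 units are
made up.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for vect_cost_model_location.
enum cost_location { loc_prologue, loc_body, loc_epilogue };

// Hypothetical base class: owns the three accumulators so that
// targets no longer have to manage an unsigned[3] themselves.
class costs_base
{
public:
  explicit costs_base (bool costing_for_scalar)
    : m_costing_for_scalar (costing_for_scalar), m_costs (), m_finished (false)
  {}
  virtual ~costs_base () {}

  // Default behaviour: every statement costs one (made-up) unit.
  virtual unsigned int add_stmt_cost (int count, cost_location where)
  {
    return record (where, 1u * count);
  }

  // Mark the costs as final; may only be called once.
  virtual void finish_cost ()
  {
    assert (!m_finished);
    m_finished = true;
  }

  // Accessors are only valid after finish_cost.
  unsigned int prologue_cost () const { assert (m_finished); return m_costs[loc_prologue]; }
  unsigned int body_cost () const { assert (m_finished); return m_costs[loc_body]; }
  unsigned int epilogue_cost () const { assert (m_finished); return m_costs[loc_epilogue]; }

protected:
  // Common bookkeeping shared by all derived classes.
  unsigned int record (cost_location where, unsigned int cost)
  {
    m_costs[where] += cost;
    return cost;
  }

  bool m_costing_for_scalar;
  unsigned int m_costs[3];
  bool m_finished;
};

// Hypothetical target subclass: only the per-statement cost differs.
class toy_target_costs : public costs_base
{
public:
  using costs_base::costs_base;

  unsigned int add_stmt_cost (int count, cost_location where) override
  {
    // Made-up numbers: vector statements cost 2 units, scalar ones 1.
    unsigned int per_stmt = m_costing_for_scalar ? 1 : 2;
    return record (where, per_stmt * count);
  }
};

// Stand-in for the single creation hook: the only place that needs
// to know which concrete class is in use.
std::unique_ptr<costs_base>
create_costs (bool use_toy_target, bool costing_for_scalar)
{
  if (use_toy_target)
    return std::unique_ptr<costs_base> (new toy_target_costs (costing_for_scalar));
  return std::unique_ptr<costs_base> (new costs_base (costing_for_scalar));
}
```

A caller in this sketch obtains an object from create_costs, calls
add_stmt_cost for each statement, calls finish_cost once, and then reads
the three totals back through the accessors. Destruction is handled by the
virtual destructor, which is why a separate "destroy" hook is not needed
in this shape.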

I like it! Thus OK.

I suggested something similar to Martin for the backend state in
'[PATCH 3/N] Come up with casm global state.', abstracting varasm global
state and allowing targets to override this via the adjusted
init_sections target hook.

Richard.

> Richard
>
>
> gcc/
> * target.def (targetm.vectorize.init_cost): Replace with...
> (targetm.vectorize.create_costs): ...this.
> (targetm.vectorize.add_stmt_cost): Delete.
> (targetm.vectorize.finish_cost): Likewise.
> (targetm.vectorize.destroy_cost_data): Likewise.
> * doc/tm.texi.in (TARGET_VECTORIZE_INIT_COST): Replace with...
> (TARGET_VECTORIZE_CREATE_COSTS): ...this.
> (TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> (TARGET_VECTORIZE_FINISH_COST): Likewise.
> (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> * doc/tm.texi: Regenerate.
> * tree-vectorizer.h (vec_info::vec_info): Remove target_cost_data
> parameter.
> (vec_info::target_cost_data): Change from a void * to a vector_costs *.
> (vector_costs): New class.
> (init_cost): Take a vec_info and return a vector_costs.
> (dump_stmt_cost): Remove data parameter.
> (add_stmt_cost): Replace vinfo and data parameters with a vector_costs.
> (add_stmt_costs): Likewise.
> (finish_cost): Replace data parameter with a vector_costs.
> (destroy_cost_data): Delete.
> * tree-vectorizer.c (dump_stmt_cost): Remove data argument and
> don't print it.
> (vec_info::vec_info): Remove the target_cost_data parameter and
> initialize the member variable to null instead.
> (vec_info::~vec_info): Delete target_cost_data instead of calling
> destroy_cost_data.
> (vector_costs::add_stmt_cost): New function.
> (vector_costs::finish_cost): Likewise.
> (vector_costs::record_stmt_cost): Likewise.
> (vector_costs::adjust_cost_for_freq): Likewise.
> * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update
> call to vec_info::vec_info.
> (vect_compute_single_scalar_iteration_cost): Update after above
> changes to costing interface.
> (vect_analyze_loop_operations): Likewise.
> (vect_estimate_min_profitable_iters): Likewise.
> (vect_analyze_loop_2): Initialize LOOP_VINFO_TARGET_COST_DATA
> at the start_over point, where it needs to be recreated after
> trying without slp. Update retry code accordingly.
> * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Update call
> to vec_info::vec_info.
> (vect_slp_analyze_operations): Update after above changes to costing
> interface.
> (vect_bb_vectorization_profitable_p): Likewise.
> * targhooks.h (default_init_cost): Replace with...
> (default_vectorize_create_costs): ...this.
> (default_add_stmt_cost): Delete.
> (default_finish_cost, default_destroy_cost_data): Likewise.
> * targhooks.c (default_init_cost): Replace with...
> (default_vectorize_create_costs): ...this.
> (default_add_stmt_cost): Delete, moving logic to vector_costs instead.
> (default_finish_cost, default_destroy_cost_data): Delete.
> * config/aarch64/aarch64.c (aarch64_vector_costs): Inherit from
> vector_costs. Add a constructor.
> (aarch64_init_cost): Replace with...
> (aarch64_vectorize_create_costs): ...this.
> (aarch64_add_stmt_cost): Replace with...
> (aarch64_vector_costs::add_stmt_cost): ...this. Use record_stmt_cost
> to adjust the cost for inner loops.
> (aarch64_finish_cost): Replace with...
> (aarch64_vector_costs::finish_cost): ...this.
> (aarch64_destroy_cost_data): Delete.
> (TARGET_VECTORIZE_INIT_COST): Replace with...
> (TARGET_VECTORIZE_CREATE_COSTS): ...this.
> (TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> (TARGET_VECTORIZE_FINISH_COST): Likewise.
> (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> * config/i386/i386.c (ix86_vector_costs): New structure.
> (ix86_init_cost): Replace with...
> (ix86_vectorize_create_costs): ...this.
> (ix86_add_stmt_cost): Replace with...
> (ix86_vector_costs::add_stmt_cost): ...this. Use adjust_cost_for_freq
> to adjust the cost for inner loops.
> (ix86_finish_cost, ix86_destroy_cost_data): Delete.
> (TARGET_VECTORIZE_INIT_COST): Replace with...
> (TARGET_VECTORIZE_CREATE_COSTS): ...this.
> (TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> (TARGET_VECTORIZE_FINISH_COST): Likewise.
> (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> * config/rs6000/rs6000.c (TARGET_VECTORIZE_INIT_COST): Replace with...
> (TARGET_VECTORIZE_CREATE_COSTS): ...this.
> (TARGET_VECTORIZE_ADD_STMT_COST): Delete.
> (TARGET_VECTORIZE_FINISH_COST): Likewise.
> (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
> (rs6000_cost_data): Inherit from vector_costs.
> Add a constructor. Drop loop_info, cost and costing_for_scalar
> in favor of the corresponding vector_costs member variables.
> Add "m_" to the names of the remaining member variables and
> initialize them.
> (rs6000_density_test): Replace with...
> (rs6000_cost_data::density_test): ...this.
> (rs6000_init_cost): Replace with...
> (rs6000_vectorize_create_costs): ...this.
> (rs6000_update_target_cost_per_stmt): Replace with...
> (rs6000_cost_data::update_target_cost_per_stmt): ...this.
> (rs6000_add_stmt_cost): Replace with...
> (rs6000_cost_data::add_stmt_cost): ...this. Use adjust_cost_for_freq
> to adjust the cost for inner loops.
> (rs6000_adjust_vect_cost_per_loop): Replace with...
> (rs6000_cost_data::adjust_vect_cost_per_loop): ...this.
> (rs6000_finish_cost): Replace with...
> (rs6000_cost_data::finish_cost): ...this. Group loop code
> into a single if statement and pass the loop_vinfo down to
> subroutines.
> (rs6000_destroy_cost_data): Delete.
> --- > gcc/config/aarch64/aarch64.c | 137 ++++++++++-------------- > gcc/config/i386/i386.c | 76 ++++---------- > gcc/config/rs6000/rs6000.c | 196 ++++++++++++++--------------------- > gcc/doc/tm.texi | 25 +---- > gcc/doc/tm.texi.in | 8 +- > gcc/target.def | 49 +-------- > gcc/targhooks.c | 61 +---------- > gcc/targhooks.h | 8 +- > gcc/tree-vect-loop.c | 51 ++++----- > gcc/tree-vect-slp.c | 18 ++-- > gcc/tree-vectorizer.c | 67 ++++++++++-- > gcc/tree-vectorizer.h | 141 ++++++++++++++++++++----- > 12 files changed, 374 insertions(+), 463 deletions(-) > > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index 902402d7503..f2e90990d9f 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -6255,7 +6255,7 @@ type @code{internal_fn}) should be considered expensive > when the mask is > all zeros. GCC can then try to branch around the instruction instead. > @end deftypefn > > -@deftypefn {Target Hook} {void *} TARGET_VECTORIZE_INIT_COST (class loop > *@var{loop_info}, bool @var{costing_for_scalar}) > +@deftypefn {Target Hook} {class vector_costs *} > TARGET_VECTORIZE_CREATE_COSTS (vec_info *@var{vinfo}, bool > @var{costing_for_scalar}) > This hook should initialize target-specific data structures in preparation > for modeling the costs of vectorizing a loop or basic block. The default > allocates three unsigned integers for accumulating costs for the prologue, > @@ -6266,29 +6266,6 @@ current cost model is for the scalar version of a loop > or block; otherwise > it is for the vector version. > @end deftypefn > > -@deftypefn {Target Hook} unsigned TARGET_VECTORIZE_ADD_STMT_COST (class > vec_info *@var{}, void *@var{data}, int @var{count}, enum vect_cost_for_stmt > @var{kind}, class _stmt_vec_info *@var{stmt_info}, tree @var{vectype}, int > @var{misalign}, enum vect_cost_model_location @var{where}) > -This hook should update the target-specific @var{data} in response to > -adding @var{count} copies of the given @var{kind} of statement to a > -loop or basic block. 
The default adds the builtin vectorizer cost for > -the copies of the statement to the accumulator specified by @var{where}, > -(the prologue, body, or epilogue) and returns the amount added. The > -return value should be viewed as a tentative cost that may later be > -revised. > -@end deftypefn > - > -@deftypefn {Target Hook} void TARGET_VECTORIZE_FINISH_COST (void > *@var{data}, unsigned *@var{prologue_cost}, unsigned *@var{body_cost}, > unsigned *@var{epilogue_cost}) > -This hook should complete calculations of the cost of vectorizing a loop > -or basic block based on @var{data}, and return the prologue, body, and > -epilogue costs as unsigned integers. The default returns the value of > -the three accumulators. > -@end deftypefn > - > -@deftypefn {Target Hook} void TARGET_VECTORIZE_DESTROY_COST_DATA (void > *@var{data}) > -This hook should release @var{data} and any related data structures > -allocated by TARGET_VECTORIZE_INIT_COST. The default releases the > -accumulator. > -@end deftypefn > - > @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_GATHER (const_tree > @var{mem_vectype}, const_tree @var{index_type}, int @var{scale}) > Target builtin that implements vector gather operation. @var{mem_vectype} > is the vector type of the load and @var{index_type} is scalar type of > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in > index 86352dc9bd2..738e7b8c19e 100644 > --- a/gcc/doc/tm.texi.in > +++ b/gcc/doc/tm.texi.in > @@ -4190,13 +4190,7 @@ address; but often a machine-dependent strategy can > generate better code. 
> > @hook TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE > > -@hook TARGET_VECTORIZE_INIT_COST > - > -@hook TARGET_VECTORIZE_ADD_STMT_COST > - > -@hook TARGET_VECTORIZE_FINISH_COST > - > -@hook TARGET_VECTORIZE_DESTROY_COST_DATA > +@hook TARGET_VECTORIZE_CREATE_COSTS > > @hook TARGET_VECTORIZE_BUILTIN_GATHER > > diff --git a/gcc/target.def b/gcc/target.def > index c5d90cace80..1baaba4cd0f 100644 > --- a/gcc/target.def > +++ b/gcc/target.def > @@ -2051,7 +2051,7 @@ stores.", > > /* Target function to initialize the cost model for a loop or block. */ > DEFHOOK > -(init_cost, > +(create_costs, > "This hook should initialize target-specific data structures in > preparation\n\ > for modeling the costs of vectorizing a loop or basic block. The default\n\ > allocates three unsigned integers for accumulating costs for the prologue,\n\ > @@ -2060,50 +2060,9 @@ non-NULL, it identifies the loop being vectorized; > otherwise a single block\n\ > is being vectorized. If @var{costing_for_scalar} is true, it indicates > the\n\ > current cost model is for the scalar version of a loop or block; otherwise\n\ > it is for the vector version.", > - void *, > - (class loop *loop_info, bool costing_for_scalar), > - default_init_cost) > - > -/* Target function to record N statements of the given kind using the > - given vector type within the cost model data for the current loop or > - block. */ > -DEFHOOK > -(add_stmt_cost, > - "This hook should update the target-specific @var{data} in response to\n\ > -adding @var{count} copies of the given @var{kind} of statement to a\n\ > -loop or basic block. The default adds the builtin vectorizer cost for\n\ > -the copies of the statement to the accumulator specified by @var{where},\n\ > -(the prologue, body, or epilogue) and returns the amount added. 
The\n\ > -return value should be viewed as a tentative cost that may later be\n\ > -revised.", > - unsigned, > - (class vec_info *, void *data, int count, enum vect_cost_for_stmt kind, > - class _stmt_vec_info *stmt_info, tree vectype, int misalign, > - enum vect_cost_model_location where), > - default_add_stmt_cost) > - > -/* Target function to calculate the total cost of the current vectorized > - loop or block. */ > -DEFHOOK > -(finish_cost, > - "This hook should complete calculations of the cost of vectorizing a loop\n\ > -or basic block based on @var{data}, and return the prologue, body, and\n\ > -epilogue costs as unsigned integers. The default returns the value of\n\ > -the three accumulators.", > - void, > - (void *data, unsigned *prologue_cost, unsigned *body_cost, > - unsigned *epilogue_cost), > - default_finish_cost) > - > -/* Function to delete target-specific cost modeling data. */ > -DEFHOOK > -(destroy_cost_data, > - "This hook should release @var{data} and any related data structures\n\ > -allocated by TARGET_VECTORIZE_INIT_COST. The default releases the\n\ > -accumulator.", > - void, > - (void *data), > - default_destroy_cost_data) > + class vector_costs *, > + (vec_info *vinfo, bool costing_for_scalar), > + default_vectorize_create_costs) > > HOOK_VECTOR_END (vectorize) > > diff --git a/gcc/targhooks.c b/gcc/targhooks.c > index cbbcedf790f..ee8798cc84b 100644 > --- a/gcc/targhooks.c > +++ b/gcc/targhooks.c > @@ -1474,65 +1474,10 @@ default_empty_mask_is_expensive (unsigned ifn) > loop body, and epilogue) for a vectorized loop or block. So allocate an > array of three unsigned ints, set it to zero, and return its address. 
*/ > > -void * > -default_init_cost (class loop *loop_info ATTRIBUTE_UNUSED, > - bool costing_for_scalar ATTRIBUTE_UNUSED) > -{ > - unsigned *cost = XNEWVEC (unsigned, 3); > - cost[vect_prologue] = cost[vect_body] = cost[vect_epilogue] = 0; > - return cost; > -} > - > -/* By default, the cost model looks up the cost of the given statement > - kind and mode, multiplies it by the occurrence count, accumulates > - it into the cost specified by WHERE, and returns the cost added. */ > - > -unsigned > -default_add_stmt_cost (class vec_info *vinfo, void *data, int count, > - enum vect_cost_for_stmt kind, > - class _stmt_vec_info *stmt_info, tree vectype, > - int misalign, > - enum vect_cost_model_location where) > -{ > - unsigned *cost = (unsigned *) data; > - unsigned retval = 0; > - int stmt_cost = targetm.vectorize.builtin_vectorization_cost (kind, > vectype, > - misalign); > - /* Statements in an inner loop relative to the loop being > - vectorized are weighted more heavily. The value here is > - arbitrary and could potentially be improved with analysis. */ > - if (where == vect_body && stmt_info > - && stmt_in_inner_loop_p (vinfo, stmt_info)) > - { > - loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo); > - gcc_assert (loop_vinfo); > - count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); > - } > - > - retval = (unsigned) (count * stmt_cost); > - cost[where] += retval; > - > - return retval; > -} > - > -/* By default, the cost model just returns the accumulated costs. */ > - > -void > -default_finish_cost (void *data, unsigned *prologue_cost, > - unsigned *body_cost, unsigned *epilogue_cost) > -{ > - unsigned *cost = (unsigned *) data; > - *prologue_cost = cost[vect_prologue]; > - *body_cost = cost[vect_body]; > - *epilogue_cost = cost[vect_epilogue]; > -} > - > -/* Free the cost data. 
*/ > - > -void > -default_destroy_cost_data (void *data) > +vector_costs * > +default_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar) > { > - free (data); > + return new vector_costs (vinfo, costing_for_scalar); > } > > /* Determine whether or not a pointer mode is valid. Assume defaults > diff --git a/gcc/targhooks.h b/gcc/targhooks.h > index 92d51992e62..64f2dc7e4a6 100644 > --- a/gcc/targhooks.h > +++ b/gcc/targhooks.h > @@ -118,13 +118,7 @@ extern opt_machine_mode default_vectorize_related_mode > (machine_mode, > poly_uint64); > extern opt_machine_mode default_get_mask_mode (machine_mode); > extern bool default_empty_mask_is_expensive (unsigned); > -extern void *default_init_cost (class loop *, bool); > -extern unsigned default_add_stmt_cost (class vec_info *, void *, int, > - enum vect_cost_for_stmt, > - class _stmt_vec_info *, tree, int, > - enum vect_cost_model_location); > -extern void default_finish_cost (void *, unsigned *, unsigned *, unsigned *); > -extern void default_destroy_cost_data (void *); > +extern vector_costs *default_vectorize_create_costs (vec_info *, bool); > > /* OpenACC hooks. */ > extern bool default_goacc_validate_dims (tree, int [], int, unsigned); > diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c > index 961c1623f81..201000af425 100644 > --- a/gcc/tree-vect-loop.c > +++ b/gcc/tree-vect-loop.c > @@ -814,7 +814,7 @@ bb_in_loop_p (const_basic_block bb, const void *data) > stmt_vec_info structs for all the stmts in LOOP_IN. */ > > _loop_vec_info::_loop_vec_info (class loop *loop_in, vec_info_shared *shared) > - : vec_info (vec_info::loop, init_cost (loop_in, false), shared), > + : vec_info (vec_info::loop, shared), > loop (loop_in), > bbs (XCNEWVEC (basic_block, loop->num_nodes)), > num_itersm1 (NULL_TREE), > @@ -1292,18 +1292,18 @@ vect_compute_single_scalar_iteration_cost > (loop_vec_info loop_vinfo) > } > > /* Now accumulate cost. 
*/ > - void *target_cost_data = init_cost (loop, true); > + vector_costs *target_cost_data = init_cost (loop_vinfo, true); > stmt_info_for_cost *si; > int j; > FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), > j, si) > - (void) add_stmt_cost (loop_vinfo, target_cost_data, si->count, > + (void) add_stmt_cost (target_cost_data, si->count, > si->kind, si->stmt_info, si->vectype, > si->misalign, si->where); > unsigned prologue_cost = 0, body_cost = 0, epilogue_cost = 0; > finish_cost (target_cost_data, &prologue_cost, &body_cost, > &epilogue_cost); > - destroy_cost_data (target_cost_data); > + delete target_cost_data; > LOOP_VINFO_SINGLE_SCALAR_ITERATION_COST (loop_vinfo) > = prologue_cost + body_cost + epilogue_cost; > } > @@ -1783,7 +1783,7 @@ vect_analyze_loop_operations (loop_vec_info loop_vinfo) > } > } /* bbs */ > > - add_stmt_costs (loop_vinfo, loop_vinfo->target_cost_data, &cost_vec); > + add_stmt_costs (loop_vinfo->target_cost_data, &cost_vec); > > /* All operations in the loop are either irrelevant (deal with loop > control, or dead), or only used outside the loop and can be moved > @@ -2393,6 +2393,8 @@ start_over: > LOOP_VINFO_INT_NITERS (loop_vinfo)); > } > > + LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) = init_cost (loop_vinfo, false); > + > /* Analyze the alignment of the data-refs in the loop. > Fail if a data reference is found that cannot be vectorized. */ > > @@ -2757,9 +2759,8 @@ again: > LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).release (); > LOOP_VINFO_CHECK_UNEQUAL_ADDRS (loop_vinfo).release (); > /* Reset target cost data. */ > - destroy_cost_data (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo)); > - LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) > - = init_cost (LOOP_VINFO_LOOP (loop_vinfo), false); > + delete LOOP_VINFO_TARGET_COST_DATA (loop_vinfo); > + LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) = nullptr; > /* Reset accumulated rgroup information. 
*/ > release_vec_loop_controls (&LOOP_VINFO_MASKS (loop_vinfo)); > release_vec_loop_controls (&LOOP_VINFO_LENS (loop_vinfo)); > @@ -3895,7 +3896,7 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > int scalar_outside_cost = 0; > int assumed_vf = vect_vf_for_cost (loop_vinfo); > int npeel = LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo); > - void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo); > + vector_costs *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo); > > /* Cost model disabled. */ > if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo))) > @@ -3912,7 +3913,7 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > { > /* FIXME: Make cost depend on complexity of individual check. */ > unsigned len = LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo).length (); > - (void) add_stmt_cost (loop_vinfo, target_cost_data, len, vector_stmt, > + (void) add_stmt_cost (target_cost_data, len, vector_stmt, > NULL, NULL_TREE, 0, vect_prologue); > if (dump_enabled_p ()) > dump_printf (MSG_NOTE, > @@ -3925,12 +3926,12 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > { > /* FIXME: Make cost depend on complexity of individual check. */ > unsigned len = LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo).length (); > - (void) add_stmt_cost (loop_vinfo, target_cost_data, len, vector_stmt, > + (void) add_stmt_cost (target_cost_data, len, vector_stmt, > NULL, NULL_TREE, 0, vect_prologue); > len = LOOP_VINFO_CHECK_UNEQUAL_ADDRS (loop_vinfo).length (); > if (len) > /* Count LEN - 1 ANDs and LEN comparisons. 
*/ > - (void) add_stmt_cost (loop_vinfo, target_cost_data, len * 2 - 1, > + (void) add_stmt_cost (target_cost_data, len * 2 - 1, > scalar_stmt, NULL, NULL_TREE, 0, vect_prologue); > len = LOOP_VINFO_LOWER_BOUNDS (loop_vinfo).length (); > if (len) > @@ -3941,7 +3942,7 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > for (unsigned int i = 0; i < len; ++i) > if (!LOOP_VINFO_LOWER_BOUNDS (loop_vinfo)[i].unsigned_p) > nstmts += 1; > - (void) add_stmt_cost (loop_vinfo, target_cost_data, nstmts, > + (void) add_stmt_cost (target_cost_data, nstmts, > scalar_stmt, NULL, NULL_TREE, 0, vect_prologue); > } > if (dump_enabled_p ()) > @@ -3954,7 +3955,7 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > if (LOOP_REQUIRES_VERSIONING_FOR_NITERS (loop_vinfo)) > { > /* FIXME: Make cost depend on complexity of individual check. */ > - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, vector_stmt, > + (void) add_stmt_cost (target_cost_data, 1, vector_stmt, > NULL, NULL_TREE, 0, vect_prologue); > if (dump_enabled_p ()) > dump_printf (MSG_NOTE, > @@ -3963,7 +3964,7 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > } > > if (LOOP_REQUIRES_VERSIONING (loop_vinfo)) > - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, cond_branch_taken, > + (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken, > NULL, NULL_TREE, 0, vect_prologue); > > /* Count statements in scalar loop. 
Using this as scalar cost for a single > @@ -4051,7 +4052,7 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > if (peel_iters_prologue) > FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), j, si) > { > - (void) add_stmt_cost (loop_vinfo, target_cost_data, > + (void) add_stmt_cost (target_cost_data, > si->count * peel_iters_prologue, si->kind, > si->stmt_info, si->vectype, si->misalign, > vect_prologue); > @@ -4061,7 +4062,7 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > if (peel_iters_epilogue) > FOR_EACH_VEC_ELT (LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo), j, si) > { > - (void) add_stmt_cost (loop_vinfo, target_cost_data, > + (void) add_stmt_cost (target_cost_data, > si->count * peel_iters_epilogue, si->kind, > si->stmt_info, si->vectype, si->misalign, > vect_epilogue); > @@ -4070,20 +4071,20 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > /* Add possible cond_branch_taken/cond_branch_not_taken cost. */ > > if (prologue_need_br_taken_cost) > - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, cond_branch_taken, > + (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken, > NULL, NULL_TREE, 0, vect_prologue); > > if (prologue_need_br_not_taken_cost) > - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, > + (void) add_stmt_cost (target_cost_data, 1, > cond_branch_not_taken, NULL, NULL_TREE, 0, > vect_prologue); > > if (epilogue_need_br_taken_cost) > - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, cond_branch_taken, > + (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken, > NULL, NULL_TREE, 0, vect_epilogue); > > if (epilogue_need_br_not_taken_cost) > - (void) add_stmt_cost (loop_vinfo, target_cost_data, 1, > + (void) add_stmt_cost (target_cost_data, 1, > cond_branch_not_taken, NULL, NULL_TREE, 0, > vect_epilogue); > > @@ -4111,9 +4112,9 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > simpler and safer to use the worst-case cost; if this 
ends up > being the tie-breaker between vectorizing or not, then it's > probably better not to vectorize. */ > - (void) add_stmt_cost (loop_vinfo, target_cost_data, num_masks, > + (void) add_stmt_cost (target_cost_data, num_masks, > vector_stmt, NULL, NULL_TREE, 0, vect_prologue); > - (void) add_stmt_cost (loop_vinfo, target_cost_data, num_masks - 1, > + (void) add_stmt_cost (target_cost_data, num_masks - 1, > vector_stmt, NULL, NULL_TREE, 0, vect_body); > } > else if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo)) > @@ -4163,9 +4164,9 @@ vect_estimate_min_profitable_iters (loop_vec_info > loop_vinfo, > body_stmts += 3 * num_vectors; > } > > - (void) add_stmt_cost (loop_vinfo, target_cost_data, prologue_stmts, > + (void) add_stmt_cost (target_cost_data, prologue_stmts, > scalar_stmt, NULL, NULL_TREE, 0, vect_prologue); > - (void) add_stmt_cost (loop_vinfo, target_cost_data, body_stmts, > + (void) add_stmt_cost (target_cost_data, body_stmts, > scalar_stmt, NULL, NULL_TREE, 0, vect_body); > } > > diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c > index 709bcb63686..a2eb20faef5 100644 > --- a/gcc/tree-vect-slp.c > +++ b/gcc/tree-vect-slp.c > @@ -4355,7 +4355,7 @@ vect_detect_hybrid_slp (loop_vec_info loop_vinfo) > /* Initialize a bb_vec_info struct for the statements in BBS basic blocks. 
> */ > > _bb_vec_info::_bb_vec_info (vec<basic_block> _bbs, vec_info_shared *shared) > - : vec_info (vec_info::bb, init_cost (NULL, false), shared), > + : vec_info (vec_info::bb, shared), > bbs (_bbs), > roots (vNULL) > { > @@ -4897,7 +4897,7 @@ vect_slp_analyze_operations (vec_info *vinfo) > instance->cost_vec = cost_vec; > else > { > - add_stmt_costs (vinfo, vinfo->target_cost_data, &cost_vec); > + add_stmt_costs (vinfo->target_cost_data, &cost_vec); > cost_vec.release (); > } > } > @@ -5337,32 +5337,30 @@ vect_bb_vectorization_profitable_p (bb_vec_info > bb_vinfo, > continue; > } > > - void *scalar_target_cost_data = init_cost (NULL, true); > + class vector_costs *scalar_target_cost_data = init_cost (bb_vinfo, > true); > do > { > - add_stmt_cost (bb_vinfo, scalar_target_cost_data, > - li_scalar_costs[si].second); > + add_stmt_cost (scalar_target_cost_data, li_scalar_costs[si].second); > si++; > } > while (si < li_scalar_costs.length () > && li_scalar_costs[si].first == sl); > unsigned dummy; > finish_cost (scalar_target_cost_data, &dummy, &scalar_cost, &dummy); > - destroy_cost_data (scalar_target_cost_data); > + delete scalar_target_cost_data; > > /* Complete the target-specific vector cost calculation. 
*/ > - void *vect_target_cost_data = init_cost (NULL, false); > + class vector_costs *vect_target_cost_data = init_cost (bb_vinfo, > false); > do > { > - add_stmt_cost (bb_vinfo, vect_target_cost_data, > - li_vector_costs[vi].second); > + add_stmt_cost (vect_target_cost_data, li_vector_costs[vi].second); > vi++; > } > while (vi < li_vector_costs.length () > && li_vector_costs[vi].first == vl); > finish_cost (vect_target_cost_data, &vec_prologue_cost, > &vec_inside_cost, &vec_epilogue_cost); > - destroy_cost_data (vect_target_cost_data); > + delete vect_target_cost_data; > > vec_outside_cost = vec_prologue_cost + vec_epilogue_cost; > > diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c > index 20daa31187d..0b3a2dd6dc0 100644 > --- a/gcc/tree-vectorizer.c > +++ b/gcc/tree-vectorizer.c > @@ -98,11 +98,10 @@ auto_purge_vect_location::~auto_purge_vect_location () > /* Dump a cost entry according to args to F. */ > > void > -dump_stmt_cost (FILE *f, void *data, int count, enum vect_cost_for_stmt kind, > +dump_stmt_cost (FILE *f, int count, enum vect_cost_for_stmt kind, > stmt_vec_info stmt_info, tree, int misalign, unsigned cost, > enum vect_cost_model_location where) > { > - fprintf (f, "%p ", data); > if (stmt_info) > { > print_gimple_expr (f, STMT_VINFO_STMT (stmt_info), 0, TDF_SLIM); > @@ -457,12 +456,11 @@ shrink_simd_arrays > /* Initialize the vec_info with kind KIND_IN and target cost data > TARGET_COST_DATA_IN. 
*/ > > -vec_info::vec_info (vec_info::vec_kind kind_in, void *target_cost_data_in, > - vec_info_shared *shared_) > +vec_info::vec_info (vec_info::vec_kind kind_in, vec_info_shared *shared_) > : kind (kind_in), > shared (shared_), > stmt_vec_info_ro (false), > - target_cost_data (target_cost_data_in) > + target_cost_data (nullptr) > { > stmt_vec_infos.create (50); > } > @@ -472,7 +470,7 @@ vec_info::~vec_info () > for (slp_instance &instance : slp_instances) > vect_free_slp_instance (instance); > > - destroy_cost_data (target_cost_data); > + delete target_cost_data; > free_stmt_vec_infos (); > } > > @@ -1694,3 +1692,60 @@ scalar_cond_masked_key::get_cond_ops_from_tree (tree t) > this->op0 = t; > this->op1 = build_zero_cst (TREE_TYPE (t)); > } > + > +/* See the comment above the declaration for details. */ > + > +unsigned int > +vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, > + stmt_vec_info stmt_info, tree vectype, > + int misalign, vect_cost_model_location where) > +{ > + unsigned int cost > + = builtin_vectorization_cost (kind, vectype, misalign) * count; > + return record_stmt_cost (stmt_info, where, cost); > +} > + > +/* See the comment above the declaration for details. */ > + > +void > +vector_costs::finish_cost () > +{ > + gcc_assert (!m_finished); > + m_finished = true; > +} > + > +/* Record a base cost of COST units against WHERE. If STMT_INFO is > + nonnull, use it to adjust the cost based on execution frequency > + (where appropriate). */ > + > +unsigned int > +vector_costs::record_stmt_cost (stmt_vec_info stmt_info, > + vect_cost_model_location where, > + unsigned int cost) > +{ > + cost = adjust_cost_for_freq (stmt_info, where, cost); > + m_costs[where] += cost; > + return cost; > +} > + > +/* COST is the base cost we have calculated for an operation in location > WHERE. > + If STMT_INFO is nonnull, use it to adjust the cost based on execution > + frequency (where appropriate). Return the adjusted cost. 
*/ > + > +unsigned int > +vector_costs::adjust_cost_for_freq (stmt_vec_info stmt_info, > + vect_cost_model_location where, > + unsigned int cost) > +{ > + /* Statements in an inner loop relative to the loop being > + vectorized are weighted more heavily. The value here is > + arbitrary and could potentially be improved with analysis. */ > + if (where == vect_body > + && stmt_info > + && stmt_in_inner_loop_p (m_vinfo, stmt_info)) > + { > + loop_vec_info loop_vinfo = as_a<loop_vec_info> (m_vinfo); > + cost *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); > + } > + return cost; > +} > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h > index 4aa84acff59..44afda2bc9b 100644 > --- a/gcc/tree-vectorizer.h > +++ b/gcc/tree-vectorizer.h > @@ -368,7 +368,7 @@ public: > typedef hash_set<int_hash<machine_mode, E_VOIDmode, E_BLKmode> > mode_set; > enum vec_kind { bb, loop }; > > - vec_info (vec_kind, void *, vec_info_shared *); > + vec_info (vec_kind, vec_info_shared *); > ~vec_info (); > > stmt_vec_info add_stmt (gimple *); > @@ -406,7 +406,7 @@ public: > auto_vec<stmt_vec_info> grouped_stores; > > /* Cost data used by the target cost model. */ > - void *target_cost_data; > + class vector_costs *target_cost_data; > > /* The set of vector modes used in the vectorized region. */ > mode_set used_vector_modes; > @@ -1395,6 +1395,103 @@ struct gather_scatter_info { > #define PURE_SLP_STMT(S) ((S)->slp_type == pure_slp) > #define STMT_SLP_TYPE(S) (S)->slp_type > > +/* Contains the scalar or vector costs for a vec_info. */ > +class vector_costs > +{ > +public: > + vector_costs (vec_info *, bool); > + virtual ~vector_costs () {} > + > + /* Update the costs in response to adding COUNT copies of a statement. > + > + - WHERE specifies whether the cost occurs in the loop prologue, > + the loop body, or the loop epilogue. > + - KIND is the kind of statement, which is always meaningful. > + - STMT_INFO, if nonnull, describes the statement that will be > + vectorized. 
> + - VECTYPE, if nonnull, is the vector type that the vectorized > + statement will operate on. Note that this should be used in > + preference to STMT_VINFO_VECTYPE (STMT_INFO) since the latter > + is not correct for SLP. > + - for unaligned_load and unaligned_store statements, MISALIGN is > + the byte misalignment of the load or store relative to the target's > + preferred alignment for VECTYPE, or DR_MISALIGNMENT_UNKNOWN > + if the misalignment is not known. > + > + Return the calculated cost as well as recording it. The return > + value is used for dumping purposes. */ > + virtual unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind, > + stmt_vec_info stmt_info, tree vectype, > + int misalign, > + vect_cost_model_location where); > + > + /* Finish calculating the cost of the code. The results can be > + read back using the functions below. */ > + virtual void finish_cost (); > + > + unsigned int prologue_cost () const; > + unsigned int body_cost () const; > + unsigned int epilogue_cost () const; > + > +protected: > + unsigned int record_stmt_cost (stmt_vec_info, vect_cost_model_location, > + unsigned int); > + unsigned int adjust_cost_for_freq (stmt_vec_info, vect_cost_model_location, > + unsigned int); > + > + /* The region of code that we're considering vectorizing. */ > + vec_info *m_vinfo; > + > + /* True if we're costing the scalar code, false if we're costing > + the vector code. */ > + bool m_costing_for_scalar; > + > + /* The costs of the three regions, indexed by vect_cost_model_location. */ > + unsigned int m_costs[3]; > + > + /* True if finish_cost has been called. */ > + bool m_finished; > +}; > + > +/* Create costs for VINFO. COSTING_FOR_SCALAR is true if the costs > + are for scalar code, false if they are for vector code. 
*/ > + > +inline > +vector_costs::vector_costs (vec_info *vinfo, bool costing_for_scalar) > + : m_vinfo (vinfo), > + m_costing_for_scalar (costing_for_scalar), > + m_costs (), > + m_finished (false) > +{ > +} > + > +/* Return the cost of the prologue code (in abstract units). */ > + > +inline unsigned int > +vector_costs::prologue_cost () const > +{ > + gcc_checking_assert (m_finished); > + return m_costs[vect_prologue]; > +} > + > +/* Return the cost of the body code (in abstract units). */ > + > +inline unsigned int > +vector_costs::body_cost () const > +{ > + gcc_checking_assert (m_finished); > + return m_costs[vect_body]; > +} > + > +/* Return the cost of the epilogue code (in abstract units). */ > + > +inline unsigned int > +vector_costs::epilogue_cost () const > +{ > + gcc_checking_assert (m_finished); > + return m_costs[vect_epilogue]; > +} > + > #define VECT_MAX_COST 1000 > > /* The maximum number of intermediate steps required in multi-step type > @@ -1531,29 +1628,28 @@ int vect_get_stmt_cost (enum vect_cost_for_stmt > type_of_cost) > > /* Alias targetm.vectorize.init_cost. */ > > -static inline void * > -init_cost (class loop *loop_info, bool costing_for_scalar) > +static inline vector_costs * > +init_cost (vec_info *vinfo, bool costing_for_scalar) > { > - return targetm.vectorize.init_cost (loop_info, costing_for_scalar); > + return targetm.vectorize.create_costs (vinfo, costing_for_scalar); > } > > -extern void dump_stmt_cost (FILE *, void *, int, enum vect_cost_for_stmt, > +extern void dump_stmt_cost (FILE *, int, enum vect_cost_for_stmt, > stmt_vec_info, tree, int, unsigned, > enum vect_cost_model_location); > > /* Alias targetm.vectorize.add_stmt_cost. 
*/ > > static inline unsigned > -add_stmt_cost (vec_info *vinfo, void *data, int count, > +add_stmt_cost (vector_costs *costs, int count, > enum vect_cost_for_stmt kind, > stmt_vec_info stmt_info, tree vectype, int misalign, > enum vect_cost_model_location where) > { > - unsigned cost = targetm.vectorize.add_stmt_cost (vinfo, data, count, kind, > - stmt_info, vectype, > - misalign, where); > + unsigned cost = costs->add_stmt_cost (count, kind, stmt_info, vectype, > + misalign, where); > if (dump_file && (dump_flags & TDF_DETAILS)) > - dump_stmt_cost (dump_file, data, count, kind, stmt_info, vectype, > misalign, > + dump_stmt_cost (dump_file, count, kind, stmt_info, vectype, misalign, > cost, where); > return cost; > } > @@ -1561,36 +1657,31 @@ add_stmt_cost (vec_info *vinfo, void *data, int count, > /* Alias targetm.vectorize.add_stmt_cost. */ > > static inline unsigned > -add_stmt_cost (vec_info *vinfo, void *data, stmt_info_for_cost *i) > +add_stmt_cost (vector_costs *costs, stmt_info_for_cost *i) > { > - return add_stmt_cost (vinfo, data, i->count, i->kind, i->stmt_info, > + return add_stmt_cost (costs, i->count, i->kind, i->stmt_info, > i->vectype, i->misalign, i->where); > } > > /* Alias targetm.vectorize.finish_cost. */ > > static inline void > -finish_cost (void *data, unsigned *prologue_cost, > +finish_cost (vector_costs *costs, unsigned *prologue_cost, > unsigned *body_cost, unsigned *epilogue_cost) > { > - targetm.vectorize.finish_cost (data, prologue_cost, body_cost, > epilogue_cost); > -} > - > -/* Alias targetm.vectorize.destroy_cost_data. 
*/ > - > -static inline void > -destroy_cost_data (void *data) > -{ > - targetm.vectorize.destroy_cost_data (data); > + costs->finish_cost (); > + *prologue_cost = costs->prologue_cost (); > + *body_cost = costs->body_cost (); > + *epilogue_cost = costs->epilogue_cost (); > } > > inline void > -add_stmt_costs (vec_info *vinfo, void *data, stmt_vector_for_cost *cost_vec) > +add_stmt_costs (vector_costs *costs, stmt_vector_for_cost *cost_vec) > { > stmt_info_for_cost *cost; > unsigned i; > FOR_EACH_VEC_ELT (*cost_vec, i, cost) > - add_stmt_cost (vinfo, data, cost->count, cost->kind, cost->stmt_info, > + add_stmt_cost (costs, cost->count, cost->kind, cost->stmt_info, > cost->vectype, cost->misalign, cost->where); > } > > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index 76d99d247ae..93388ef9684 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -14523,11 +14523,15 @@ struct aarch64_sve_op_count : aarch64_vec_op_count > }; > > /* Information about vector code that we're in the process of costing. */ > -struct aarch64_vector_costs > +struct aarch64_vector_costs : public vector_costs > { > - /* The normal latency-based costs for each region (prologue, body and > - epilogue), indexed by vect_cost_model_location. */ > - unsigned int region[3] = {}; > + using vector_costs::vector_costs; > + > + unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind, > + stmt_vec_info stmt_info, tree vectype, > + int misalign, > + vect_cost_model_location where) override; > + void finish_cost () override; > > /* True if we have performed one-time initialization based on the vec_info. > > @@ -14593,11 +14597,11 @@ struct aarch64_vector_costs > hash_map<nofree_ptr_hash<_stmt_vec_info>, unsigned int> seen_loads; > }; > > -/* Implement TARGET_VECTORIZE_INIT_COST. */ > -void * > -aarch64_init_cost (class loop *, bool) > +/* Implement TARGET_VECTORIZE_CREATE_COSTS. 
*/ > +vector_costs * > +aarch64_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar) > { > - return new aarch64_vector_costs; > + return new aarch64_vector_costs (vinfo, costing_for_scalar); > } > > /* Return true if the current CPU should use the new costs defined > @@ -15283,7 +15287,7 @@ aarch64_adjust_stmt_cost (vect_cost_for_stmt kind, > stmt_vec_info stmt_info, > } > > /* VINFO, COSTS, COUNT, KIND, STMT_INFO and VECTYPE are the same as for > - TARGET_VECTORIZE_ADD_STMT_COST and they describe an operation in the > + vector_costs::add_stmt_cost and they describe an operation in the > body of a vector loop. Record issue information relating to the vector > operation in OPS, where OPS is one of COSTS->scalar_ops, > COSTS->advsimd_ops > or COSTS->sve_ops; see the comments above those variables for details. > @@ -15479,32 +15483,29 @@ aarch64_count_ops (class vec_info *vinfo, > aarch64_vector_costs *costs, > } > } > > -/* Implement targetm.vectorize.add_stmt_cost. */ > -static unsigned > -aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count, > - enum vect_cost_for_stmt kind, > - struct _stmt_vec_info *stmt_info, tree vectype, > - int misalign, enum vect_cost_model_location where) > +unsigned > +aarch64_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, > + stmt_vec_info stmt_info, tree vectype, > + int misalign, > + vect_cost_model_location where) > { > - auto *costs = static_cast<aarch64_vector_costs *> (data); > - > fractional_cost stmt_cost > = aarch64_builtin_vectorization_cost (kind, vectype, misalign); > > bool in_inner_loop_p = (where == vect_body > && stmt_info > - && stmt_in_inner_loop_p (vinfo, stmt_info)); > + && stmt_in_inner_loop_p (m_vinfo, stmt_info)); > > /* Do one-time initialization based on the vinfo. 
*/ > - loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo); > - bb_vec_info bb_vinfo = dyn_cast<bb_vec_info> (vinfo); > - if (!costs->analyzed_vinfo && aarch64_use_new_vector_costs_p ()) > + loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo); > + bb_vec_info bb_vinfo = dyn_cast<bb_vec_info> (m_vinfo); > + if (!analyzed_vinfo && aarch64_use_new_vector_costs_p ()) > { > if (loop_vinfo) > - aarch64_analyze_loop_vinfo (loop_vinfo, costs); > + aarch64_analyze_loop_vinfo (loop_vinfo, this); > else > - aarch64_analyze_bb_vinfo (bb_vinfo, costs); > - costs->analyzed_vinfo = true; > + aarch64_analyze_bb_vinfo (bb_vinfo, this); > + this->analyzed_vinfo = true; > } > > /* Try to get a more accurate cost by looking at STMT_INFO instead > @@ -15512,7 +15513,7 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void > *data, int count, > if (stmt_info && aarch64_use_new_vector_costs_p ()) > { > if (vectype && aarch64_sve_only_stmt_p (stmt_info, vectype)) > - costs->saw_sve_only_op = true; > + this->saw_sve_only_op = true; > > /* If we scalarize a strided store, the vectorizer costs one > vec_to_scalar for each element. However, we can store the first > @@ -15521,17 +15522,17 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void > *data, int count, > count -= 1; > > stmt_cost = aarch64_detect_scalar_stmt_subtype > - (vinfo, kind, stmt_info, stmt_cost); > + (m_vinfo, kind, stmt_info, stmt_cost); > > - if (vectype && costs->vec_flags) > - stmt_cost = aarch64_detect_vector_stmt_subtype (vinfo, kind, > + if (vectype && this->vec_flags) > + stmt_cost = aarch64_detect_vector_stmt_subtype (m_vinfo, kind, > stmt_info, vectype, > where, stmt_cost); > } > > /* Do any SVE-specific adjustments to the cost. 
*/ > if (stmt_info && vectype && aarch64_sve_mode_p (TYPE_MODE (vectype))) > - stmt_cost = aarch64_sve_adjust_stmt_cost (vinfo, kind, stmt_info, > + stmt_cost = aarch64_sve_adjust_stmt_cost (m_vinfo, kind, stmt_info, > vectype, stmt_cost); > > if (stmt_info && aarch64_use_new_vector_costs_p ()) > @@ -15547,36 +15548,36 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void > *data, int count, > auto *issue_info = aarch64_tune_params.vec_costs->issue_info; > if (loop_vinfo > && issue_info > - && costs->vec_flags > + && this->vec_flags > && where == vect_body > && (!LOOP_VINFO_LOOP (loop_vinfo)->inner || in_inner_loop_p) > && vectype > && stmt_cost != 0) > { > /* Record estimates for the scalar code. */ > - aarch64_count_ops (vinfo, costs, count, kind, stmt_info, vectype, > - 0, &costs->scalar_ops, issue_info->scalar, > + aarch64_count_ops (m_vinfo, this, count, kind, stmt_info, vectype, > + 0, &this->scalar_ops, issue_info->scalar, > vect_nunits_for_cost (vectype)); > > - if (aarch64_sve_mode_p (vinfo->vector_mode) && issue_info->sve) > + if (aarch64_sve_mode_p (m_vinfo->vector_mode) && issue_info->sve) > { > /* Record estimates for a possible Advanced SIMD version > of the SVE code. */ > - aarch64_count_ops (vinfo, costs, count, kind, stmt_info, > - vectype, VEC_ADVSIMD, &costs->advsimd_ops, > + aarch64_count_ops (m_vinfo, this, count, kind, stmt_info, > + vectype, VEC_ADVSIMD, &this->advsimd_ops, > issue_info->advsimd, > aarch64_estimated_sve_vq ()); > > /* Record estimates for the SVE code itself. */ > - aarch64_count_ops (vinfo, costs, count, kind, stmt_info, > - vectype, VEC_ANY_SVE, &costs->sve_ops, > + aarch64_count_ops (m_vinfo, this, count, kind, stmt_info, > + vectype, VEC_ANY_SVE, &this->sve_ops, > issue_info->sve, 1); > } > else > /* Record estimates for the Advanced SIMD code. Treat SVE like > Advanced SIMD if the CPU has no specific SVE costs. 
*/ > - aarch64_count_ops (vinfo, costs, count, kind, stmt_info, > - vectype, VEC_ADVSIMD, &costs->advsimd_ops, > + aarch64_count_ops (m_vinfo, this, count, kind, stmt_info, > + vectype, VEC_ADVSIMD, &this->advsimd_ops, > issue_info->advsimd, 1); > } > > @@ -15585,24 +15586,11 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void > *data, int count, > loop. For simplicitly, we assume that one iteration of the > Advanced SIMD loop would need the same number of statements > as one iteration of the SVE loop. */ > - if (where == vect_body && costs->unrolled_advsimd_niters) > - costs->unrolled_advsimd_stmts > - += count * costs->unrolled_advsimd_niters; > + if (where == vect_body && this->unrolled_advsimd_niters) > + this->unrolled_advsimd_stmts > + += count * this->unrolled_advsimd_niters; > } > - > - /* Statements in an inner loop relative to the loop being > - vectorized are weighted more heavily. The value here is > - arbitrary and could potentially be improved with analysis. */ > - if (in_inner_loop_p) > - { > - gcc_assert (loop_vinfo); > - count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); /* FIXME */ > - } > - > - unsigned retval = (count * stmt_cost).ceil (); > - costs->region[where] += retval; > - > - return retval; > + return record_stmt_cost (stmt_info, where, (count * stmt_cost).ceil ()); > } > > /* Dump information about the structure. */ > @@ -15966,27 +15954,15 @@ aarch64_adjust_body_cost (aarch64_vector_costs > *costs, unsigned int body_cost) > return body_cost; > } > > -/* Implement TARGET_VECTORIZE_FINISH_COST. 
*/ > -static void > -aarch64_finish_cost (void *data, unsigned *prologue_cost, > - unsigned *body_cost, unsigned *epilogue_cost) > +void > +aarch64_vector_costs::finish_cost () > { > - auto *costs = static_cast<aarch64_vector_costs *> (data); > - *prologue_cost = costs->region[vect_prologue]; > - *body_cost = costs->region[vect_body]; > - *epilogue_cost = costs->region[vect_epilogue]; > - > - if (costs->is_loop > - && costs->vec_flags > + if (this->is_loop > + && this->vec_flags > && aarch64_use_new_vector_costs_p ()) > - *body_cost = aarch64_adjust_body_cost (costs, *body_cost); > -} > + m_costs[vect_body] = aarch64_adjust_body_cost (this, m_costs[vect_body]); > > -/* Implement TARGET_VECTORIZE_DESTROY_COST_DATA. */ > -static void > -aarch64_destroy_cost_data (void *data) > -{ > - delete static_cast<aarch64_vector_costs *> (data); > + vector_costs::finish_cost (); > } > > static void initialize_aarch64_code_model (struct gcc_options *); > @@ -26285,17 +26261,8 @@ aarch64_libgcc_floating_mode_supported_p > #undef TARGET_ARRAY_MODE_SUPPORTED_P > #define TARGET_ARRAY_MODE_SUPPORTED_P aarch64_array_mode_supported_p > > -#undef TARGET_VECTORIZE_INIT_COST > -#define TARGET_VECTORIZE_INIT_COST aarch64_init_cost > - > -#undef TARGET_VECTORIZE_ADD_STMT_COST > -#define TARGET_VECTORIZE_ADD_STMT_COST aarch64_add_stmt_cost > - > -#undef TARGET_VECTORIZE_FINISH_COST > -#define TARGET_VECTORIZE_FINISH_COST aarch64_finish_cost > - > -#undef TARGET_VECTORIZE_DESTROY_COST_DATA > -#define TARGET_VECTORIZE_DESTROY_COST_DATA aarch64_destroy_cost_data > +#undef TARGET_VECTORIZE_CREATE_COSTS > +#define TARGET_VECTORIZE_CREATE_COSTS aarch64_vectorize_create_costs > > #undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST > #define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \ > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index fb656094e9e..e40ae2b9c49 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -22842,26 +22842,30 @@ 
ix86_noce_conversion_profitable_p (rtx_insn *seq, > struct noce_if_info *if_info) > return default_noce_conversion_profitable_p (seq, if_info); > } > > -/* Implement targetm.vectorize.init_cost. */ > +/* x86-specific vector costs. */ > +class ix86_vector_costs : public vector_costs > +{ > + using vector_costs::vector_costs; > + > + unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind, > + stmt_vec_info stmt_info, tree vectype, > + int misalign, > + vect_cost_model_location where) override; > +}; > > -static void * > -ix86_init_cost (class loop *, bool) > +/* Implement targetm.vectorize.create_costs. */ > + > +static vector_costs * > +ix86_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar) > { > - unsigned *cost = XNEWVEC (unsigned, 3); > - cost[vect_prologue] = cost[vect_body] = cost[vect_epilogue] = 0; > - return cost; > + return new ix86_vector_costs (vinfo, costing_for_scalar); > } > > -/* Implement targetm.vectorize.add_stmt_cost. */ > - > -static unsigned > -ix86_add_stmt_cost (class vec_info *vinfo, void *data, int count, > - enum vect_cost_for_stmt kind, > - class _stmt_vec_info *stmt_info, tree vectype, > - int misalign, > - enum vect_cost_model_location where) > +unsigned > +ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, > + stmt_vec_info stmt_info, tree vectype, > + int misalign, vect_cost_model_location where) > { > - unsigned *cost = (unsigned *) data; > unsigned retval = 0; > bool scalar_p > = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store); > @@ -23032,15 +23036,7 @@ ix86_add_stmt_cost (class vec_info *vinfo, void > *data, int count, > /* Statements in an inner loop relative to the loop being > vectorized are weighted more heavily. The value here is > arbitrary and could potentially be improved with analysis. 
*/ > - if (where == vect_body && stmt_info > - && stmt_in_inner_loop_p (vinfo, stmt_info)) > - { > - loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo); > - gcc_assert (loop_vinfo); > - count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); /* FIXME. */ > - } > - > - retval = (unsigned) (count * stmt_cost); > + retval = adjust_cost_for_freq (stmt_info, where, count * stmt_cost); > > /* We need to multiply all vector stmt cost by 1.7 (estimated cost) > for Silvermont as it has out of order integer pipeline and can execute > @@ -23055,31 +23051,11 @@ ix86_add_stmt_cost (class vec_info *vinfo, void > *data, int count, > retval = (retval * 17) / 10; > } > > - cost[where] += retval; > + m_costs[where] += retval; > > return retval; > } > > -/* Implement targetm.vectorize.finish_cost. */ > - > -static void > -ix86_finish_cost (void *data, unsigned *prologue_cost, > - unsigned *body_cost, unsigned *epilogue_cost) > -{ > - unsigned *cost = (unsigned *) data; > - *prologue_cost = cost[vect_prologue]; > - *body_cost = cost[vect_body]; > - *epilogue_cost = cost[vect_epilogue]; > -} > - > -/* Implement targetm.vectorize.destroy_cost_data. */ > - > -static void > -ix86_destroy_cost_data (void *data) > -{ > - free (data); > -} > - > /* Validate target specific memory model bits in VAL. 
*/ > > static unsigned HOST_WIDE_INT > @@ -24363,14 +24339,8 @@ ix86_libgcc_floating_mode_supported_p > ix86_autovectorize_vector_modes > #undef TARGET_VECTORIZE_GET_MASK_MODE > #define TARGET_VECTORIZE_GET_MASK_MODE ix86_get_mask_mode > -#undef TARGET_VECTORIZE_INIT_COST > -#define TARGET_VECTORIZE_INIT_COST ix86_init_cost > -#undef TARGET_VECTORIZE_ADD_STMT_COST > -#define TARGET_VECTORIZE_ADD_STMT_COST ix86_add_stmt_cost > -#undef TARGET_VECTORIZE_FINISH_COST > -#define TARGET_VECTORIZE_FINISH_COST ix86_finish_cost > -#undef TARGET_VECTORIZE_DESTROY_COST_DATA > -#define TARGET_VECTORIZE_DESTROY_COST_DATA ix86_destroy_cost_data > +#undef TARGET_VECTORIZE_CREATE_COSTS > +#define TARGET_VECTORIZE_CREATE_COSTS ix86_vectorize_create_costs > > #undef TARGET_SET_CURRENT_FUNCTION > #define TARGET_SET_CURRENT_FUNCTION ix86_set_current_function > diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c > index 01a95591a5d..2e7b3bcad7e 100644 > --- a/gcc/config/rs6000/rs6000.c > +++ b/gcc/config/rs6000/rs6000.c > @@ -1452,14 +1452,8 @@ static const struct attribute_spec > rs6000_attribute_table[] = > #undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE > #define TARGET_VECTORIZE_PREFERRED_SIMD_MODE \ > rs6000_preferred_simd_mode > -#undef TARGET_VECTORIZE_INIT_COST > -#define TARGET_VECTORIZE_INIT_COST rs6000_init_cost > -#undef TARGET_VECTORIZE_ADD_STMT_COST > -#define TARGET_VECTORIZE_ADD_STMT_COST rs6000_add_stmt_cost > -#undef TARGET_VECTORIZE_FINISH_COST > -#define TARGET_VECTORIZE_FINISH_COST rs6000_finish_cost > -#undef TARGET_VECTORIZE_DESTROY_COST_DATA > -#define TARGET_VECTORIZE_DESTROY_COST_DATA rs6000_destroy_cost_data > +#undef TARGET_VECTORIZE_CREATE_COSTS > +#define TARGET_VECTORIZE_CREATE_COSTS rs6000_vectorize_create_costs > > #undef TARGET_LOOP_UNROLL_ADJUST > #define TARGET_LOOP_UNROLL_ADJUST rs6000_loop_unroll_adjust > @@ -5263,21 +5257,33 @@ rs6000_preferred_simd_mode (scalar_mode mode) > return word_mode; > } > > -struct rs6000_cost_data > +class 
rs6000_cost_data : public vector_costs > { > - struct loop *loop_info; > - unsigned cost[3]; > +public: > + using vector_costs::vector_costs; > + > + unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind, > + stmt_vec_info stmt_info, tree vectype, > + int misalign, > + vect_cost_model_location where) override; > + void finish_cost () override; > + > +protected: > + void update_target_cost_per_stmt (vect_cost_for_stmt, stmt_vec_info, > + vect_cost_model_location, int, > + unsigned int); > + void density_test (loop_vec_info); > + void adjust_vect_cost_per_loop (loop_vec_info); > + > /* Total number of vectorized stmts (loop only). */ > - unsigned nstmts; > + unsigned m_nstmts = 0; > /* Total number of loads (loop only). */ > - unsigned nloads; > + unsigned m_nloads = 0; > /* Possible extra penalized cost on vector construction (loop only). */ > - unsigned extra_ctor_cost; > + unsigned m_extra_ctor_cost = 0; > /* For each vectorized loop, this var holds TRUE iff a non-memory vector > instruction is needed by the vectorization. */ > - bool vect_nonmem; > - /* Indicates this is costing for the scalar version of a loop or block. */ > - bool costing_for_scalar; > + bool m_vect_nonmem = false; > }; > > /* Test for likely overcommitment of vector hardware resources. If a > @@ -5286,20 +5292,19 @@ struct rs6000_cost_data > adequately reflect delays from unavailable vector resources. > Penalize the loop body cost for this case. */ > > -static void > -rs6000_density_test (rs6000_cost_data *data) > +void > +rs6000_cost_data::density_test (loop_vec_info loop_vinfo) > { > /* This density test only cares about the cost of vector version of the > loop, so immediately return if we are passed costing for the scalar > version (namely computing single scalar iteration cost). 
*/ > - if (data->costing_for_scalar) > + if (m_costing_for_scalar) > return; > > - struct loop *loop = data->loop_info; > + struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); > basic_block *bbs = get_loop_body (loop); > int nbbs = loop->num_nodes; > - loop_vec_info loop_vinfo = loop_vec_info_for_loop (data->loop_info); > - int vec_cost = data->cost[vect_body], not_vec_cost = 0; > + int vec_cost = m_costs[vect_body], not_vec_cost = 0; > > for (int i = 0; i < nbbs; i++) > { > @@ -5326,7 +5331,7 @@ rs6000_density_test (rs6000_cost_data *data) > if (density_pct > rs6000_density_pct_threshold > && vec_cost + not_vec_cost > rs6000_density_size_threshold) > { > - data->cost[vect_body] = vec_cost * (100 + rs6000_density_penalty) / > 100; > + m_costs[vect_body] = vec_cost * (100 + rs6000_density_penalty) / 100; > if (dump_enabled_p ()) > dump_printf_loc (MSG_NOTE, vect_location, > "density %d%%, cost %d exceeds threshold, penalizing " > @@ -5336,10 +5341,10 @@ rs6000_density_test (rs6000_cost_data *data) > > /* Check whether we need to penalize the body cost to account > for excess strided or elementwise loads. */ > - if (data->extra_ctor_cost > 0) > + if (m_extra_ctor_cost > 0) > { > - gcc_assert (data->nloads <= data->nstmts); > - unsigned int load_pct = (data->nloads * 100) / data->nstmts; > + gcc_assert (m_nloads <= m_nstmts); > + unsigned int load_pct = (m_nloads * 100) / m_nstmts; > > /* It's likely to be bounded by latency and execution resources > from many scalar loads which are strided or elementwise loads > @@ -5351,10 +5356,10 @@ rs6000_density_test (rs6000_cost_data *data) > the loads. > One typical case is the innermost loop of the hotspot of SPEC2017 > 503.bwaves_r without loop interchange. 
*/ > - if (data->nloads > (unsigned int) rs6000_density_load_num_threshold > + if (m_nloads > (unsigned int) rs6000_density_load_num_threshold > && load_pct > (unsigned int) rs6000_density_load_pct_threshold) > { > - data->cost[vect_body] += data->extra_ctor_cost; > + m_costs[vect_body] += m_extra_ctor_cost; > if (dump_enabled_p ()) > dump_printf_loc (MSG_NOTE, vect_location, > "Found %u loads and " > @@ -5363,28 +5368,18 @@ rs6000_density_test (rs6000_cost_data *data) > "penalizing loop body " > "cost by extra cost %u " > "for ctor.\n", > - data->nloads, load_pct, > - data->extra_ctor_cost); > + m_nloads, load_pct, > + m_extra_ctor_cost); > } > } > } > > -/* Implement targetm.vectorize.init_cost. */ > +/* Implement targetm.vectorize.create_costs. */ > > -static void * > -rs6000_init_cost (struct loop *loop_info, bool costing_for_scalar) > +static vector_costs * > +rs6000_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar) > { > - rs6000_cost_data *data = XNEW (rs6000_cost_data); > - data->loop_info = loop_info; > - data->cost[vect_prologue] = 0; > - data->cost[vect_body] = 0; > - data->cost[vect_epilogue] = 0; > - data->vect_nonmem = false; > - data->nstmts = 0; > - data->nloads = 0; > - data->extra_ctor_cost = 0; > - data->costing_for_scalar = costing_for_scalar; > - return data; > + return new rs6000_cost_data (vinfo, costing_for_scalar); > } > > /* Adjust vectorization cost after calling rs6000_builtin_vectorization_cost. > @@ -5413,13 +5408,12 @@ rs6000_adjust_vect_cost_per_stmt (enum > vect_cost_for_stmt kind, > /* Helper function for add_stmt_cost. Check each statement cost > entry, gather information and update the target_cost fields > accordingly. 
*/ > -static void > -rs6000_update_target_cost_per_stmt (rs6000_cost_data *data, > - enum vect_cost_for_stmt kind, > - struct _stmt_vec_info *stmt_info, > - enum vect_cost_model_location where, > - int stmt_cost, > - unsigned int orig_count) > +void > +rs6000_cost_data::update_target_cost_per_stmt (vect_cost_for_stmt kind, > + stmt_vec_info stmt_info, > + vect_cost_model_location where, > + int stmt_cost, > + unsigned int orig_count) > { > > /* Check whether we're doing something other than just a copy loop. > @@ -5431,17 +5425,19 @@ rs6000_update_target_cost_per_stmt (rs6000_cost_data > *data, > || kind == vec_construct > || kind == scalar_to_vec > || (where == vect_body && kind == vector_stmt)) > - data->vect_nonmem = true; > + m_vect_nonmem = true; > > /* Gather some information when we are costing the vectorized instruction > for the statements located in a loop body. */ > - if (!data->costing_for_scalar && data->loop_info && where == vect_body) > + if (!m_costing_for_scalar > + && is_a<loop_vec_info> (m_vinfo) > + && where == vect_body) > { > - data->nstmts += orig_count; > + m_nstmts += orig_count; > > if (kind == scalar_load || kind == vector_load > || kind == unaligned_load || kind == vector_gather_load) > - data->nloads += orig_count; > + m_nloads += orig_count; > > /* Power processors do not currently have instructions for strided > and elementwise loads, and instead we must generate multiple > @@ -5469,20 +5465,16 @@ rs6000_update_target_cost_per_stmt (rs6000_cost_data > *data, > const unsigned int MAX_PENALIZED_COST_FOR_CTOR = 12; > if (extra_cost > MAX_PENALIZED_COST_FOR_CTOR) > extra_cost = MAX_PENALIZED_COST_FOR_CTOR; > - data->extra_ctor_cost += extra_cost; > + m_extra_ctor_cost += extra_cost; > } > } > } > > -/* Implement targetm.vectorize.add_stmt_cost. 
*/ > - > -static unsigned > -rs6000_add_stmt_cost (class vec_info *vinfo, void *data, int count, > - enum vect_cost_for_stmt kind, > - struct _stmt_vec_info *stmt_info, tree vectype, > - int misalign, enum vect_cost_model_location where) > +unsigned > +rs6000_cost_data::add_stmt_cost (int count, vect_cost_for_stmt kind, > + stmt_vec_info stmt_info, tree vectype, > + int misalign, vect_cost_model_location where) > { > - rs6000_cost_data *cost_data = (rs6000_cost_data*) data; > unsigned retval = 0; > > if (flag_vect_cost_model) > @@ -5494,19 +5486,11 @@ rs6000_add_stmt_cost (class vec_info *vinfo, void > *data, int count, > vectorized are weighted more heavily. The value here is > arbitrary and could potentially be improved with analysis. */ > unsigned int orig_count = count; > - if (where == vect_body && stmt_info > - && stmt_in_inner_loop_p (vinfo, stmt_info)) > - { > - loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo); > - gcc_assert (loop_vinfo); > - count *= LOOP_VINFO_INNER_LOOP_COST_FACTOR (loop_vinfo); /* FIXME. */ > - } > - > - retval = (unsigned) (count * stmt_cost); > - cost_data->cost[where] += retval; > + retval = adjust_cost_for_freq (stmt_info, where, count * stmt_cost); > + m_costs[where] += retval; > > - rs6000_update_target_cost_per_stmt (cost_data, kind, stmt_info, where, > - stmt_cost, orig_count); > + update_target_cost_per_stmt (kind, stmt_info, where, > + stmt_cost, orig_count); > } > > return retval; > @@ -5518,13 +5502,9 @@ rs6000_add_stmt_cost (class vec_info *vinfo, void > *data, int count, > vector with length by counting number of required lengths under condition > LOOP_VINFO_FULLY_WITH_LENGTH_P. 
*/ > > -static void > -rs6000_adjust_vect_cost_per_loop (rs6000_cost_data *data) > +void > +rs6000_cost_data::adjust_vect_cost_per_loop (loop_vec_info loop_vinfo) > { > - struct loop *loop = data->loop_info; > - gcc_assert (loop); > - loop_vec_info loop_vinfo = loop_vec_info_for_loop (loop); > - > if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo)) > { > rgroup_controls *rgc; > @@ -5535,49 +5515,29 @@ rs6000_adjust_vect_cost_per_loop (rs6000_cost_data > *data) > /* Each length needs one shift to fill into bits 0-7. */ > shift_cnt += num_vectors_m1 + 1; > > - rs6000_add_stmt_cost (loop_vinfo, (void *) data, shift_cnt, > scalar_stmt, > - NULL, NULL_TREE, 0, vect_body); > + add_stmt_cost (shift_cnt, scalar_stmt, NULL, NULL_TREE, 0, vect_body); > } > } > > -/* Implement targetm.vectorize.finish_cost. */ > - > -static void > -rs6000_finish_cost (void *data, unsigned *prologue_cost, > - unsigned *body_cost, unsigned *epilogue_cost) > +void > +rs6000_cost_data::finish_cost () > { > - rs6000_cost_data *cost_data = (rs6000_cost_data*) data; > - > - if (cost_data->loop_info) > + if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo)) > { > - rs6000_adjust_vect_cost_per_loop (cost_data); > - rs6000_density_test (cost_data); > - } > + adjust_vect_cost_per_loop (loop_vinfo); > + density_test (loop_vinfo); > > - /* Don't vectorize minimum-vectorization-factor, simple copy loops > - that require versioning for any reason. The vectorization is at > - best a wash inside the loop, and the versioning checks make > - profitability highly unlikely and potentially quite harmful. */ > - if (cost_data->loop_info) > - { > - loop_vec_info vec_info = loop_vec_info_for_loop (cost_data->loop_info); > - if (!cost_data->vect_nonmem > - && LOOP_VINFO_VECT_FACTOR (vec_info) == 2 > - && LOOP_REQUIRES_VERSIONING (vec_info)) > - cost_data->cost[vect_body] += 10000; > + /* Don't vectorize minimum-vectorization-factor, simple copy loops > + that require versioning for any reason. 
The vectorization is at > + best a wash inside the loop, and the versioning checks make > + profitability highly unlikely and potentially quite harmful. */ > + if (!m_vect_nonmem > + && LOOP_VINFO_VECT_FACTOR (loop_vinfo) == 2 > + && LOOP_REQUIRES_VERSIONING (loop_vinfo)) > + m_costs[vect_body] += 10000; > } > > - *prologue_cost = cost_data->cost[vect_prologue]; > - *body_cost = cost_data->cost[vect_body]; > - *epilogue_cost = cost_data->cost[vect_epilogue]; > -} > - > -/* Implement targetm.vectorize.destroy_cost_data. */ > - > -static void > -rs6000_destroy_cost_data (void *data) > -{ > - free (data); > + vector_costs::finish_cost (); > } > > /* Implement targetm.loop_unroll_adjust. */ > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
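For readers skimming the quoted patch, here is a minimal standalone sketch of the pattern it introduces: a base class owns the three per-region totals and the inner-loop weighting, and a target only overrides what differs. This is not GCC code. All names here (sketch_costs, sketch_target_costs, create_costs, cost_location, inner_loop_factor, example_body_cost) are hypothetical, the inner-loop test is reduced to a plain bool, and the per-statement costs, the factor of 50 and the penalty of 10 are made-up numbers chosen only for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for vect_cost_model_location.
enum cost_location { loc_prologue = 0, loc_body = 1, loc_epilogue = 2 };

// Made-up weighting for statements in an inner loop (assumption).
const unsigned inner_loop_factor = 50;

// Base class: owns the three region totals and the common adjustments.
class sketch_costs
{
public:
  explicit sketch_costs (bool costing_for_scalar)
    : m_costing_for_scalar (costing_for_scalar) {}
  virtual ~sketch_costs () {}

  // Default policy: every statement costs one unit.
  virtual unsigned add_stmt_cost (int count, bool in_inner_loop,
                                  cost_location where)
  {
    return record_stmt_cost (in_inner_loop, where, count * 1u);
  }

  // Must be called exactly once before the totals are read back.
  virtual void finish_cost ()
  {
    assert (!m_finished);
    m_finished = true;
  }

  unsigned prologue_cost () const { assert (m_finished); return m_costs[loc_prologue]; }
  unsigned body_cost () const { assert (m_finished); return m_costs[loc_body]; }
  unsigned epilogue_cost () const { assert (m_finished); return m_costs[loc_epilogue]; }

protected:
  // Apply the frequency adjustment, accumulate, and return the adjusted cost.
  unsigned record_stmt_cost (bool in_inner_loop, cost_location where,
                             unsigned cost)
  {
    cost = adjust_cost_for_freq (in_inner_loop, where, cost);
    m_costs[where] += cost;
    return cost;
  }

  // Only body statements in an inner loop are weighted more heavily.
  unsigned adjust_cost_for_freq (bool in_inner_loop, cost_location where,
                                 unsigned cost) const
  {
    if (where == loc_body && in_inner_loop)
      cost *= inner_loop_factor;
    return cost;
  }

  bool m_costing_for_scalar;
  unsigned m_costs[3] = {};
  bool m_finished = false;
};

// A hypothetical target: different per-statement cost, plus a final tweak.
class sketch_target_costs : public sketch_costs
{
public:
  using sketch_costs::sketch_costs;

  unsigned add_stmt_cost (int count, bool in_inner_loop,
                          cost_location where) override
  {
    // Made-up target cost of three units per statement.
    return record_stmt_cost (in_inner_loop, where, count * 3u);
  }

  void finish_cost () override
  {
    // Made-up body penalty for vector code, applied before the base
    // class marks the costs as finished.
    if (!m_costing_for_scalar)
      m_costs[loc_body] += 10;
    sketch_costs::finish_cost ();
  }
};

// The single "hook": create a costs object of the target's type.
std::unique_ptr<sketch_costs> create_costs (bool costing_for_scalar)
{
  return std::unique_ptr<sketch_costs>
    (new sketch_target_costs (costing_for_scalar));
}

// Usage: cost two ordinary body statements and one inner-loop statement.
unsigned example_body_cost ()
{
  std::unique_ptr<sketch_costs> costs = create_costs (false);
  costs->add_stmt_cost (2, false, loc_body);
  costs->add_stmt_cost (1, true, loc_body);
  costs->finish_cost ();
  return costs->body_cost ();
}
```

With these made-up numbers, example_body_cost () returns 166: 2 * 3 for the ordinary statements, 1 * 3 * 50 for the inner-loop statement, and the penalty of 10 added in finish_cost. Ownership through the virtual destructor is what lets the caller simply delete the object, as the patch does in vec_info::~vec_info, instead of going through a separate destroy hook.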