https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #27 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #26)
> (In reply to Hongtao Liu from comment #25)
> > in vectorization, store shouldn't have such high cost because once the
> > address is computed, the store instruction doesn't stall the pipeline
> > (except for memory dependencies that are hard for compilers to detect). I'm
> > experimenting with adjusting store costs: setting integer stores to
> > COSTS_N_INSNS (1) - 1 (minus 1 since integer can benefit from renaming and
> > have lower STLF costs), while keeping vector/floating-point stores at
> > COSTS_N_INSNS (1). This approach should discourage vectorization for integer
> > vector construction + vector store patterns, while still promoting
> > vectorization for floating-point operations in vector construct + vector
> > store scenarios.
> >
> >
> > I'm testing below patch to see if there's any surprise.
>
> Sth like this was also on my TODO list.  I'd have gone further and made
> stores zero cost as we generally do not model AGU latency (also a zero
> would more likely show up issues).

We still want to prefer bigger stores because of STLF (store-to-load
forwarding), so COSTS_N_INSNS (1) may be more profitable than zero.

> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index 52f82185e32..420104a04b2 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -25363,8 +25363,7 @@ ix86_default_vector_cost (enum vect_cost_for_stmt
> > type_of_cost,
> >                              : ix86_cost->int_load [2]) / 2;
> >
> >        case scalar_store:
> > -        return COSTS_N_INSNS (fp ? ix86_cost->sse_store[0]
> > -                            : ix86_cost->int_store [2]) / 2;
> > +        return fp ? COSTS_N_INSNS (1) : COSTS_N_INSNS (1) - 1;
>
> why is FP more costly than int?

1) Many processors support memory renaming for integer stores but not for
fp/sse.
2) For integer there's a potential extra integer_to_sse cost that isn't
accounted for, as in this PR.
That's why I reduce the integer scalar_store cost by 1.
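
To make the numbers concrete, here is a minimal standalone sketch of what the patched scalar_store case evaluates to. It is not the GCC source: COSTS_N_INSNS is re-declared here as a constexpr function following its rtl.h definition (N * 4), and old_scalar_store_cost, proposed_scalar_store_cost and the table entry 6 are hypothetical names and values used only for illustration.

```cpp
// Illustrative sketch only, not code from gcc/config/i386/i386.cc.

// Stand-in for GCC's COSTS_N_INSNS macro (rtl.h defines it as (N) * 4).
constexpr int COSTS_N_INSNS(int n) { return n * 4; }

// Old formula from the diff: half the cost of the per-CPU table entry.
// `table_entry` stands in for ix86_cost->sse_store[0] or
// ix86_cost->int_store[2]; real values depend on the tuning target.
constexpr int old_scalar_store_cost(int table_entry) {
  return COSTS_N_INSNS(table_entry) / 2;
}

// Proposed formula from the diff: a flat cost of one instruction for
// fp/sse stores, and one unit less for integer stores.
constexpr int proposed_scalar_store_cost(bool fp) {
  return fp ? COSTS_N_INSNS(1) : COSTS_N_INSNS(1) - 1;
}

// With the proposed patch: fp store = 4 units, integer store = 3 units.
static_assert(proposed_scalar_store_cost(true) == 4, "fp scalar store");
static_assert(proposed_scalar_store_cost(false) == 3, "int scalar store");

// With a hypothetical table entry of 6, the old formula gives 12 units.
static_assert(old_scalar_store_cost(6) == 12, "old formula, entry 6");
```

So an integer scalar store ends up one cost unit (a quarter of an instruction) cheaper than an fp one, which slightly biases the vectorizer against integer vector construct + vector store sequences without making stores free.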
