Hi Kyrill, > -----Original Message----- > From: Kyrylo Tkachov <ktkac...@nvidia.com> > Sent: Friday, July 18, 2025 10:40 AM > To: GCC Patches <gcc-patches@gcc.gnu.org> > Cc: Tamar Christina <tamar.christ...@arm.com>; Richard Sandiford > <richard.sandif...@arm.com>; Alex Coplan <alex.cop...@arm.com>; Andrew > Pinski <pins...@gmail.com> > Subject: [PATCH 1/2] aarch64: NFC - Make vec_* rtx costing logic consistent > > Hi all, > > The rtx costs logic for CONST_VECTOR, VEC_DUPLICATE and VEC_SELECT sets > the cost unconditionally to the movi, dup or extract fields of extra_cost, > when the normal practice in that function is to use extra_cost only when speed > is set. When speed is false the function should estimate the size cost only. > This patch makes the logic consistent by using the extra_cost fields to > increment the cost when speed is set. This requires reducing the extra_cost > values > of the movi, dup and extract fields by COSTS_N_INSNS (1), as every insn being > costed > has a cost of COSTS_N_INSNS (1) at the start of the function. The cost > tables for > the CPUs are updated in line with this. >
I've always personally interpreted the "extra" costs structures to mean "additional costs" not extra costs on top of a baseline costs, hence the = vs +=. I think this diverged between the vector and scalar variants, so I'm not against this but, I'm not sure I understand why for SPEED we don't take into account that we still want a single instruction. i.e. even for -Os we still want movi x, #1 over a literal load. I think all these instructions are better for both codesize and speed but I'm probably missing something? Thanks, Tamar > With these changes the testsuite is unaffected so no different costing > decisions are made and this patch is just a cleanup. > > Bootstrapped and tested on aarch64-none-linux-gnu. > Ok for trunk? > Thanks > Kyrill > > Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com> > > gcc/ > > * config/aarch64/aarch64.cc (aarch64_rtx_costs): Add extra_cost values > only when speed is true for CONST_VECTOR, VEC_DUPLICATE, VEC_SELECT > cases. > * config/aarch64/aarch64-cost-tables.h (qdf24xx_extra_costs, > thunderx_extra_costs, thunderx2t99_extra_costs, > thunderx3t110_extra_costs, tsv110_extra_costs, a64fx_extra_costs, > ampere1_extra_costs, ampere1a_extra_costs, ampere1b_extra_costs): > Reduce cost of movi, dup, extract fields by COSTS_N_INSNS (1). > * config/arm/aarch-cost-tables.h (generic_extra_costs, > cortexa53_extra_costs, cortexa57_extra_costs, cortexa76_extra_costs, > exynosm1_extra_costs, xgene1_extra_costs): Likewise.