Hi Kyrill,

> -----Original Message-----
> From: Kyrylo Tkachov <ktkac...@nvidia.com>
> Sent: Friday, July 18, 2025 10:40 AM
> To: GCC Patches <gcc-patches@gcc.gnu.org>
> Cc: Tamar Christina <tamar.christ...@arm.com>; Richard Sandiford
> <richard.sandif...@arm.com>; Alex Coplan <alex.cop...@arm.com>; Andrew
> Pinski <pins...@gmail.com>
> Subject: [PATCH 1/2] aarch64: NFC - Make vec_* rtx costing logic consistent
> 
> Hi all,
> 
> The rtx costs logic for CONST_VECTOR, VEC_DUPLICATE and VEC_SELECT sets
> the cost unconditionally to the movi, dup or extract fields of extra_cost,
> when the normal practice in that function is to use extra_cost only when speed
> is set.  When speed is false the function should estimate the size cost only.
> This patch makes the logic consistent by using the extra_cost fields to
> increment the cost when speed is set.  This requires reducing the extra_cost
> values of the movi, dup and extract fields by COSTS_N_INSNS (1), as every insn
> being costed has a cost of COSTS_N_INSNS (1) at the start of the function.
> The cost tables for the CPUs are updated in line with this.
> 

I've always personally interpreted the "extra" cost structures to mean
"additional costs", not extra costs on top of a baseline cost, hence the
= vs +=.

I think this diverged between the vector and scalar variants, so I'm not
against this, but I'm not sure I understand why for SPEED we don't take into
account that we still want a single instruction.

i.e. even for -Os we still want movi x, #1 over a literal load.  I think all
these instructions are better for both code size and speed, but I'm probably
missing something?
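
For reference, the = vs += difference can be sketched as below.  This is a
minimal standalone C++ illustration, not the code in aarch64_rtx_costs: the
helper names old_scheme and new_scheme and the table values used with them are
hypothetical, and only COSTS_N_INSNS (N) being (N) * 4 is taken from GCC.

```cpp
// Standalone sketch; none of these helpers exist in GCC under these names.

// Mirrors GCC's COSTS_N_INSNS (N) macro, which expands to (N) * 4.
constexpr int costs_n_insns(int n) { return n * 4; }

// "=" style: the table entry is the whole cost, regardless of speed.
constexpr int old_scheme(int table_entry, bool /* speed */) {
  return table_entry;
}

// "+=" style: start from one insn and add the table entry only for speed.
constexpr int new_scheme(int table_entry, bool speed) {
  return costs_n_insns(1) + (speed ? table_entry : 0);
}

// Hypothetical table entry before the change, and the same entry after
// reducing it by COSTS_N_INSNS (1) as the patch description says.
constexpr int old_entry = costs_n_insns(2);
constexpr int new_entry = old_entry - costs_n_insns(1);

// When optimising for speed the two schemes agree.
static_assert(old_scheme(old_entry, true) == new_scheme(new_entry, true),
              "speed costs match");
```

In this sketch the speed cost is identical under both schemes.  When speed is
false, new_scheme always returns the COSTS_N_INSNS (1) baseline, while
old_scheme returns the table entry, so the two only agree for size when the old
entry was exactly COSTS_N_INSNS (1).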

Thanks,
Tamar

> With these changes the testsuite is unaffected so no different costing
> decisions are made and this patch is just a cleanup.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for trunk?
> Thanks
> Kyrill
> 
> Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com>
> 
> gcc/
> 
>       * config/aarch64/aarch64.cc (aarch64_rtx_costs): Add extra_cost values
>       only when speed is true for CONST_VECTOR, VEC_DUPLICATE, VEC_SELECT
>       cases.
>       * config/aarch64/aarch64-cost-tables.h (qdf24xx_extra_costs,
>       thunderx_extra_costs, thunderx2t99_extra_costs,
>       thunderx3t110_extra_costs, tsv110_extra_costs, a64fx_extra_costs,
>       ampere1_extra_costs, ampere1a_extra_costs, ampere1b_extra_costs):
>       Reduce cost of movi, dup, extract fields by COSTS_N_INSNS (1).
>       * config/arm/aarch-cost-tables.h (generic_extra_costs,
>       cortexa53_extra_costs, cortexa57_extra_costs, cortexa76_extra_costs,
>       exynosm1_extra_costs, xgene1_extra_costs): Likewise.
