https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112935
--- Comment #8 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #7)
> (In reply to Xi Ruoyao from comment #5)
> >
> > so we still slightly penalize multiplication.  To me we should code
> > COSTS_N_INSNS (1) + 1 into loongarch_rtx_cost_optimize_size instead of
> > special casing it in loongarch_rtx_costs.
>
> Oh yes, a slight penalty is definitely not going to make a huge difference
> if the cost of a mult instruction is worse than an and and a neg.
>
> >
> > For the default value (used at -O2) I'll do some micro-benchmarking...

I've changed it to

/* Default RTX cost initializer.  */
loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
  : fp_add (COSTS_N_INSNS (5)),
    fp_mult_sf (COSTS_N_INSNS (5)),
    fp_mult_df (COSTS_N_INSNS (5)),
    fp_div_sf (COSTS_N_INSNS (8)),
    fp_div_df (COSTS_N_INSNS (8)),
    int_mult_si (COSTS_N_INSNS (4)),
    int_mult_di (COSTS_N_INSNS (4)),
    int_div_si (COSTS_N_INSNS (5)),
    int_div_di (COSTS_N_INSNS (5)),
    branch_cost (6),
    memory_latency (4)
{}

based on micro-benchmark results.  This fixes the int * _Bool case and the
int * 17 case.  But for the original test case I still get a multiplication
instruction.
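
For readers following along, here is a minimal C sketch of the two cases mentioned above and of the cost arithmetic behind them. COSTS_N_INSNS is reproduced with the same definition GCC's rtl.h uses, so COSTS_N_INSNS (1) + 1 is 5 units: slightly more than one instruction (4) but well under two (8). The helper names are hypothetical. The neg/and form follows comment #7, while the shift-plus-add form for "* 17" is only an assumption about what a multiplication-free sequence could look like, not necessarily what GCC emits on LoongArch.

```c
#include <assert.h>
#include <stdbool.h>

/* Same definition as in GCC's rtl.h: one "instruction" is 4 cost units. */
#define COSTS_N_INSNS(N) ((N) * 4)

/* x * b for a boolean b, written as a negate plus an AND.
   Hypothetical helper name; -(int)b is 0 or -1 (all bits set). */
static int mul_bool_neg_and(int x, bool b)
{
    return x & -(int)b;
}

/* x * 17 written as a shift plus an add (assumed form).
   Unsigned arithmetic is used to avoid signed-overflow UB. */
static int mul17_shift_add(int x)
{
    unsigned int u = (unsigned int)x;
    return (int)((u << 4) + u);
}

/* Check that both rewrites agree with plain multiplication
   over a small range of inputs. */
static void check_equivalences(void)
{
    for (int x = -1000; x <= 1000; ++x) {
        assert(mul_bool_neg_and(x, true) == x * 1);
        assert(mul_bool_neg_and(x, false) == x * 0);
        assert(mul17_shift_add(x) == x * 17);
    }
}
```

With int_mult_si at COSTS_N_INSNS (4), i.e. 16 units, a two-instruction replacement costed at roughly COSTS_N_INSNS (2) = 8 units would look cheaper than the multiply, which is consistent with these two cases no longer using a multiplication. How the backend actually costs each replacement instruction is not shown in this comment, so treat the numbers as illustrative only.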