Hi Maciej: LGTM, thanks for modeling this in cost model!
On Tue, Jul 19, 2022 at 12:48 AM Maciej W. Rozycki <ma...@embecosm.com> wrote: > > Fix a performance regression from commit 391500af1932 ("Do not ignore > costs of jump insns in combine."), a part of the m68k series for MODE_CC > conversion (<https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01028.html>), > observed in soft-fp code in libgcc used by some of the embench-iot > benchmarks. > > The immediate origin of the regression is the middle end, which in the > absence of cost information from the backend estimates the cost of an > RTL expression by assuming a single machine instruction for each of the > expression's subexpression. > > So for `if_then_else', which takes 3 operands, the estimated cost is 3 > instructions (i.e. 12 units) even though a branch instruction evaluates > it in a single machine cycle (ignoring the cost of actually taking the > branch of course, which is handled elsewhere). Consequently an insn > sequence like: > > (insn 595 594 596 43 (set (reg:DI 305) > (lshiftrt:DI (reg/v:DI 160 [ R_f ]) > (const_int 55 [0x37]))) ".../libgcc/soft-fp/adddf3.c":46:3 216 > {lshrdi3} > (nil)) > (insn 596 595 597 43 (set (reg:DI 304) > (and:DI (reg:DI 305) > (const_int 1 [0x1]))) ".../libgcc/soft-fp/adddf3.c":46:3 109 > {anddi3} > (expr_list:REG_DEAD (reg:DI 305) > (nil))) > (jump_insn 597 596 598 43 (set (pc) > (if_then_else (eq (reg:DI 304) > (const_int 0 [0])) > (label_ref:DI 1644) > (pc))) ".../libgcc/soft-fp/adddf3.c":46:3 237 {*branchdi} > (expr_list:REG_DEAD (reg:DI 304) > (int_list:REG_BR_PROB 536870916 (nil))) > -> 1644) > > does not (anymore, as from the commit referred) get combined into: > > (note 595 594 596 43 NOTE_INSN_DELETED) > (note 596 595 597 43 NOTE_INSN_DELETED) > (jump_insn 597 596 598 43 (parallel [ > (set (pc) > (if_then_else (eq (zero_extract:DI (reg/v:DI 160 [ R_f ]) > (const_int 1 [0x1]) > (const_int 55 [0x37])) > (const_int 0 [0])) > (label_ref:DI 1644) > (pc))) > (clobber (scratch:DI)) > ]) ".../libgcc/soft-fp/adddf3.c":46:3 243 {*branch_on_bitdi} > (int_list:REG_BR_PROB 536870916 (nil)) > -> 1644) > > This is because the new cost is incorrectly calculated as 28 units while > the cost of the original 3 instructions was 24: > > rejecting combination of insns 595, 596 and 597 > original costs 4 + 4 + 16 = 24 > replacement cost 28 > > Before the commit referred the cost of jump instruction was ignored and > considered 0 (i.e. unknown) and a sequence of instructions of a known > cost used to win: > > allowing combination of insns 595, 596 and 597 > original costs 4 + 4 + 0 = 0 > replacement cost 28 > > Add the missing costs for the 3 variants of `if_then_else' expressions > we currently define in the backend. > > With the fix in place the cost of this particular `if_then_else' pattern > is 2 instructions or 8 units (because of the shift operation) and > therefore the ultimate cost of the original 3 RTL insns will work out at > 16 units (4 + 4 + 8), however the replacement single RTL insn will cost > 8 units only. > > gcc/ > * config/riscv/riscv.cc (riscv_rtx_costs) <IF_THEN_ELSE>: New > case. > --- > gcc/config/riscv/riscv.cc | 27 +++++++++++++++++++++++++++ > 1 file changed, 27 insertions(+) > > gcc-riscv-rtx-costs-if-then-else.diff > Index: gcc/gcc/config/riscv/riscv.cc > =================================================================== > --- gcc.orig/gcc/config/riscv/riscv.cc > +++ gcc/gcc/config/riscv/riscv.cc > @@ -1853,6 +1853,33 @@ riscv_rtx_costs (rtx x, machine_mode mod > /* Otherwise use the default handling. */ > return false; > > + case IF_THEN_ELSE: > + if (TARGET_SFB_ALU > + && register_operand (XEXP (x, 1), mode) > + && sfb_alu_operand (XEXP (x, 2), mode) > + && comparison_operator (XEXP (x, 0), VOIDmode)) > + { > + /* For predicated conditional-move operations we assume the cost > + of a single instruction even though there are actually two. */ > + *total = COSTS_N_INSNS (1); > + return true; > + } > + else if (LABEL_REF_P (XEXP (x, 1)) && XEXP (x, 2) == pc_rtx) > + { > + if (equality_operator (XEXP (x, 0), mode) > + && GET_CODE (XEXP (XEXP (x, 0), 0)) == ZERO_EXTRACT) > + { > + *total = COSTS_N_INSNS (SINGLE_SHIFT_COST + 1); > + return true; > + } > + if (order_operator (XEXP (x, 0), mode)) > + { > + *total = COSTS_N_INSNS (1); > + return true; > + } > + } > + return false; > + > case NOT: > *total = COSTS_N_INSNS (GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1); > return false;