On 2023-11-29 19:09 Fei Gao <gao...@eswincomputing.com> wrote: > >On 2023-11-29 13:26 Jeff Law <jeffreya...@gmail.com> wrote: >> >> >> >>On 11/27/23 19:32, Fei Gao wrote: >>> op=[PLUS, MINUS, IOR, XOR, ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT] >>> >>> SIGN_EXTEND, ZERO_EXTEND and SUBREG has been considered >>> to support SImode in 64-bit machine. >>Let's defer these for now. We're supposed to be wrapping up work that >>was posted before stage1 closed. If these opcodes were trivial to >>support, then I would let them through, but SUBREGs for example can be >>problematical as their semantics can be complex. > >I can delete the 32bit-support in 64 bit machine. >Or split this patch into 2 parts word-size support and 32bit-support in 64 bit >machine. >Or keep current status if the following words persuades you :) > >Regarding complexity, the current patch differs from 1st version by >"to_replace" and reduces >complexity significantly. noce_simple_bbs() ensures we have only single set >for insn_a like x=sext(subreg(y) op subreg(z)). >Instead of constructing an insn considering complex extension and subreg from >scratch, to_replace locates the rtx z >in noce_bbs_ok_for_cond_zero_arith, >and then replace it with czero result. In this way, extension and subreg are >as simple as reg cases. >All the cases for extension and subreg have been uploaded in this patch. >They can be found by searching "int" in gcc.target/riscv/zicond_ifcvt_opt.c > >Could you please let me known which you prefer? I split the patches. See new series https://patchwork.sourceware.org/project/gcc/list/?series=27924 > >> >> >>> >>> Conditional op, if zero >>> rd = (rc == 0) ? (rs1 op rs2) : rs1 >>> --> >>> czero.nez rd, rs2, rc >>> op rd, rs1, rd >>> >>> Conditional op, if non-zero >>> rd = (rc != 0) ? (rs1 op rs2) : rs1 >>> --> >>> czero.eqz rd, rs2, rc >>> op rd, rs1, rd >>> >>> Co-authored-by: Xiao Zeng<zengx...@eswincomputing.com> >>> >>> gcc/ChangeLog: >>> >>> * ifcvt.cc (noce_try_cond_zero_arith):handler for condtional zero >>>based ifcvt >>> (noce_emit_czero): helper for noce_try_cond_zero_arith >>> (noce_cond_zero_binary_op_supported): check supported OPs for >>>condtional zero based ifcvt >>> (get_base_reg): get base reg of a subreg or the reg itself >>> (noce_bbs_ok_for_cond_zero_arith): check if BBs are OK for >>>condtional zero based ifcvt >>> (noce_process_if_block): add noce_try_cond_zero_arith >>> >>> gcc/testsuite/ChangeLog: >>> >>> * gcc.target/riscv/zicond_ifcvt_opt.c: New test. >>> --- >>> gcc/ifcvt.cc | 210 ++++++ >>> .../gcc.target/riscv/zicond_ifcvt_opt.c | 682 ++++++++++++++++++ >>> 2 files changed, 892 insertions(+) >>> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c >>> >>> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc >>> index a0af553b9ff..8f6a0e7f5fe 100644 >>> --- a/gcc/ifcvt.cc >>> +++ b/gcc/ifcvt.cc >>> @@ -787,6 +787,7 @@ static rtx noce_get_alt_condition (struct noce_if_info >>> *, rtx, rtx_insn **); >>> static bool noce_try_minmax (struct noce_if_info *); >>> static bool noce_try_abs (struct noce_if_info *); >>> static bool noce_try_sign_mask (struct noce_if_info *); >>> +static int noce_try_cond_zero_arith (struct noce_if_info *); >>> >>> /* Return the comparison code for reversed condition for IF_INFO, >>> or UNKNOWN if reversing the condition is not possible. */ >>> @@ -1831,6 +1832,40 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx >>> x, enum rtx_code code, >>> return NULL_RTX; >>> } >>> >>> +/* Emit a conditional zero, returning TARGET or NULL_RTX upon failure. >>> + IF_INFO describes the if-conversion scenario under consideration. >>> + CZERO_CODE selects the condition (EQ/NE). >>> + NON_ZERO_OP is the nonzero operand of the conditional move >>> + TARGET is the desired output register. */ >>> + >>> +static rtx >>> +noce_emit_czero (struct noce_if_info *if_info, enum rtx_code czero_code, >>> + rtx non_zero_op, rtx target) >>[ ... ] >>The code you wrote is safe in that if constructs a suitable if-then-else >>as a single object, starts a new sequence the uses emit_insn to put that >>object onto a sequence. Then you extract that one and only one insn >>from the sequence and validate it can be recognized. >> >>In cases where you want to do something like this and know you're going >>to emit one and only one insn you can use 'make_insn_raw' without >>generating a new sequence. >> >>I would suggest you replace all the code starting with start_sequence() >>and ending with end_sequence () (inclusive) with something like >> >>insn = make_insn_raw (set); >>if (recog_memoized (insn) >= 0) >> { >> emit_insn (insn); >> return target; >> } >>return NULL_RTX; >Thanks for your advice, I will take it in next version. I used add_insn instead of emit_insn to avoid a segment fault caused by invalid NEXT_INSN (insn) of the raw insn.
BR, Fei > >> >> >>Note that in the future (gcc-15) when this code is generalized to use >>the expander it will potentially generate multiple insns at which point >>we'll have to put them on a sequence and validate they all are >>recognizable. But we'll tackle that at the appropriate time. >Sure. > >> >> >>> + >>> + return false; >>> +} >>> + >>> +/* Helper function to return REG itself or inner expression of a SUBREG, >>> + otherwise NULL_RTX for other RTX_CODE. */ >>> + >>> +static rtx >>> +get_base_reg (rtx exp) >>> +{ >>> + if (REG_P (exp)) >>> + return exp; >>> + else if (SUBREG_P (exp)) >>> + return SUBREG_REG (exp); >>> + >>> + return NULL_RTX; >>> + >>I would advise against handling subregs at this point. What you're >>doing is probably too simplistic given the semantics of subregs. >> >> >> >>> + >>> + /* Strip sign_extend if any. */ >>> + if (GET_CODE (a) == SIGN_EXTEND || GET_CODE (a) == ZERO_EXTEND) >>> + bin_exp = XEXP (a, 0); >>> + else >>> + bin_exp = a; >>Similarly while I do think we're going to want to handle extensions, >>let's not try and add them at this point. We want to get this wrapped >>up & integrated so that everyone can move their focus to bugfixing for >>gcc-14. >Same reply with the one regarding *SIGN_EXTEND, ZERO_EXTEND and SUBREG* at >the beginning. > >> >>> + >>> +/* Try to covert if-then-else with conditional zero, >>> + returning TURE on success or FALSE on failure. >>> + IF_INFO describes the if-conversion scenario under consideration. */ >>> + >>> +static int >>> +noce_try_cond_zero_arith (struct noce_if_info *if_info) >>> +{ >>> + rtx target, a; >>> + rtx_insn *seq; >>> + machine_mode mode = GET_MODE (if_info->x); >>> + rtx common = NULL_RTX; >>> + enum rtx_code czero_code = UNKNOWN; >>> + rtx non_zero_op = NULL_RTX; >>> + rtx *to_replace = NULL; >>> + >>> + if (!noce_bbs_ok_for_cond_zero_arith (if_info, &common, &czero_code, &a, >>> + &to_replace)) >>> + return false; >>> + >>> + /* Disallow CONST_INT currently for simplicity*/ >>> + if (to_replace == NULL || !REG_P (*to_replace)) >>> + return false;If you're not going to allow CONST_INT reject them in the >>ok_for_condzero_arith routine. Presumably you're rejecting constants >>because zicond doesn't allow them in the arms. We'll want to handle >>them in the future and can do so easily once we're using the expander. > >I remembered the shift-by-6 case mentioned by you, see [PATCH 3/4] for >const_int support:) >https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327149.html > >Also AND is supported in >https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327219.html > >Could you please reveiw 2 patches above so I will repost together if needed? >> >> >> >>> + >>> + non_zero_op = *to_replace; >>> + >>> + start_sequence (); >>> + >>> + /* If x is used in both input and out like x = c ? x + z : x, >>> + use a new reg to avoid modifying x */ >>> + if (common && rtx_equal_p (common, if_info->x)) >>> + target = gen_reg_rtx (mode); >>> + else >>> + target = if_info->x; >>> + >>> + target = noce_emit_czero (if_info, czero_code, non_zero_op, target); >>> + if (!target || !to_replace) >>> + { >>> + end_sequence (); >>> + return false; >>> + } >>> + >>> + *to_replace = target; >>> + emit_insn (gen_rtx_SET (if_info->x, a)); >>This isn't necessarily safe. Use emit_move_insn instead. >OK. > >Thanks & BR >Fei > >> >>This is pretty close. Hopefully one more iteration and it can be >>integrated. >> >>jeff