Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 11/27/23 19:57, Fei Gao wrote: 1. In find_if_header function, I found the following piece of codes: if (!reload_completed && noce_find_if_block(...)), and find_if_header must be called before noce_try_cond_zero_arith(). Ah good. 2. In noce_try_strore_flag_constants, new registers are also generated without can_create_pseudo_p() check. So I guess no need to add can_create_pseudo_p() here. Agreed. I could possibly make an argument that it might be nice to be able to look for these things after reload has completed, but I think we can ignore that for now. Thanks! Jeff
Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 11/27/23 19:46, Fei Gao wrote: On 2023-11-20 14:46 Jeff Law wrote: On 10/30/23 21:35, Fei Gao wrote: So just a few notes to further illustrate why I'm currently looking to take the VRULL+Ventana implementation. The code above would be much better handled by just calling noce_emit_cmove. noce_emit_cmove will go through the conditional move expander. So any improvement we make in the expander "just work" when called from the if-converter. noce_emit_czero is used here to make sure czero insns are emited. noce_emit_cmove includes SFB and Thead movcc, which will take precedence over zicond in RISCV if enabled. Unfortunately we have products with SFB and Zicond both available and saw such conflict. And that is also the reason to add hook TARGET_HAVE_COND_ZERO in [PATCH 1/4] to disallow ineffient code emited by SFB enable and Zicond disabled case. I understand what you're trying to do, but I would consider the TARGET_HAVE_COND_ZERO fundamentally the wrong approach. Hi Jeff Thanks for your review. I just post the new series. https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327148.html https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327151.html https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327149.html https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327150.html TARGET_HAVE_COND_ZERO has been deleted. THanks for the V2. I'll see what I can do with them this week. The series was posted prior to close of stage1, so it can still be integrated into gcc-14 if we get it cleaned up. Jeff
Re: Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 2023-11-20 14:59 Jeff Law wrote: > > > >On 10/30/23 01:25, Fei Gao wrote: >> Conditional add, if zero >> rd = (rc == 0) ? (rs1 + rs2) : rs1 >> --> >> czero.nez rd, rs2, rc >> add rd, rs1, rd >> >> Conditional add, if non-zero >> rd = (rc != 0) ? (rs1 + rs2) : rs1 >> --> >> czero.eqz rd, rs2, rc >> add rd, rs1, rd >> >> Co-authored-by: Xiao Zeng >> >> gcc/ChangeLog: >> >> * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith >> (noce_try_cond_zero_arith): handler for condtional zero op >> (noce_process_if_block): add noce_try_cond_zero_arith with hook >>control >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/riscv/zicond_ifcvt_opt.c: New test. >> --- >> gcc/ifcvt.cc | 112 +++ >> .../gcc.target/riscv/zicond_ifcvt_opt.c | 130 ++ >> 2 files changed, 242 insertions(+) >> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c >> >> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc >> index a0af553b9ff..4f98c1c7bf9 100644 >> --- a/gcc/ifcvt.cc >> +++ b/gcc/ifcvt.cc >> @@ -781,12 +781,14 @@ static bool noce_try_store_flag_constants (struct >> noce_if_info *); >> static bool noce_try_store_flag_mask (struct noce_if_info *); >> static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx, >> rtx, rtx, rtx, rtx = NULL, rtx = NULL); >> +static rtx noce_emit_czero (struct noce_if_info *, enum rtx_code, rtx, rtx); >> static bool noce_try_cmove (struct noce_if_info *); >> static bool noce_try_cmove_arith (struct noce_if_info *); >> static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn >>**); >> static bool noce_try_minmax (struct noce_if_info *); >> static bool noce_try_abs (struct noce_if_info *); >> static bool noce_try_sign_mask (struct noce_if_info *); >> +static bool noce_try_cond_zero_arith (struct noce_if_info *); >> >> /* Return the comparison code for reversed condition for IF_INFO, >> or UNKNOWN if reversing the condition is not possible. */ >> @@ -1831,6 +1833,32 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, >> enum rtx_code code, >> return NULL_RTX; >> } >> >> +static rtx >> +noce_emit_czero (struct noce_if_info *if_info, enum rtx_code czero_code, >> rtx z, rtx target) >Every function needs a comment describing what the function does, it's >return value(s) and its arguments. There are many examples in ifcvt.cc >you can use to guide you. I might start with something like this: > >/* Emit a conditional zero, returning the location of the result > or NULL_RTX upon failure. > > IF_INFO describes the if-conversion scenario under consideration. > CZERO_CODE selects the condition (EQ/NE). > Z is the nonzero operand of the conditional move > TARGET is the desired output register. */ > >Or something like that. I would suggest renaming "Z" to something more >meaningful. Hi Jeff Thanks for your patients. All comments regarding coding style have been addressed in new patches. > > > >> >> +/* Convert x = c ? y + z : y or x = c ? y : y + z. */ >> + >> +static bool >> +noce_try_cond_zero_arith (struct noce_if_info *if_info) >The function comment really should be improved. For example it doesn't >indicate what the return value is. > >> + >> + /* cond must be EQ or NEQ comparision of a reg and 0. */ >In general when you refer to a variable in a comment, do so in upper >case. Use NE rather than NEQ as the former is how most code refers to a >not-equal rtx code. > > >> + if (GET_CODE (cond) != NE && GET_CODE (cond) != EQ) >> + return false; >> + if (!REG_P (XEXP (cond, 0)) || !rtx_equal_p (XEXP (cond, 1), const0_rtx)) >> + return false; >> + >> + /* check y + z:y*/ >> + if (GET_CODE (a) == PLUS && REG_P (XEXP (a, 0)) && REG_P (XEXP (a, 1)) >> + && REG_P (b) && rtx_equal_p (XEXP (a, 0), b)) >Write comments as complete sentences. > >> + { >> + common = b; >> + z = XEXP (a, 1); >Rather than "z" use a more descriptive variable name. > > >> + >> + /* If we have x = c ? x + z : x, use a new reg to avoid modifying x */ >> + if (common && rtx_equal_p (common, if_info->x)) >> + target = gen_reg_rtx (mode); >> + else >> + target = if_info->x; >if-conversion runs both before and after register allocation. So you >have to handle the case where you can not generate new registers. Use >can_create_pseudo_p () to test for that. You may need to fail if you >can't generate a new register. 1. In find_if_header function, I found the following piece of codes: if (!reload_completed && noce_find_if_block(...)), and find_if_header must be called before noce_try_cond_zero_arith(). 2. In noce_try_strore_flag_constants, new registers are also generated without can_create_pseudo_p() check. So I guess no need to add can_create_pseudo_p() here. > >> + >> + target = noce_emit_czero (if_info, czero_code, z, target); >> + if (!target) >> + { >> + end_seque
Re: Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 2023-11-20 14:46 Jeff Law wrote: > > > >On 10/30/23 21:35, Fei Gao wrote: > >>> So just a few notes to further illustrate why I'm currently looking to >>> take the VRULL+Ventana implementation. The code above would be much >>> better handled by just calling noce_emit_cmove. noce_emit_cmove will go >>> through the conditional move expander. So any improvement we make in >>> the expander "just work" when called from the if-converter. >> noce_emit_czero is used here to make sure czero insns are emited. >> noce_emit_cmove includes SFB and Thead movcc, which will take precedence >> over zicond in RISCV if enabled. Unfortunately we have products with SFB and >> Zicond >> both available and saw such conflict. >> And that is also the reason to add hook TARGET_HAVE_COND_ZERO >> in [PATCH 1/4] to disallow ineffient code emited by SFB enable and Zicond >> disabled case. >I understand what you're trying to do, but I would consider the >TARGET_HAVE_COND_ZERO fundamentally the wrong approach. Hi Jeff Thanks for your review. I just post the new series. https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327148.html https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327151.html https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327149.html https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg327150.html TARGET_HAVE_COND_ZERO has been deleted. > >I'm willing to defer routing everything through noce_emit_cmove for now, >but that's really where this code needs to be going. If that's causing >a conflict for a particular implementation with both SFB and Zicond, >then we'll have to look at the details and adjust things in the target >files. Agree. We can try noce_emit_cmove later with more TCs integrated recently. Also I tried to solve the conflict found in my TCs in [PATCH 1/4] and [PATCH 4/4]. > > >> Cool and waiting for your submit. Shifts/rotates can be added in >> noce_try_cond_zero_arith. >Fully agreed. Those are easy. Shifts/rotates have been added. BR, Fei > >> I tried to keep noce_try_cond_zero_arith simple without introducing SCC and >> other stuff >> as addtional insns will be generated for greater than like comparision >> but may not be generated for branch-insn based SFB. >And I think the result is we're going to fail to implement many >profitable if-conversions. > > >Jeff
Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 10/30/23 01:25, Fei Gao wrote: Conditional add, if zero rd = (rc == 0) ? (rs1 + rs2) : rs1 --> czero.nez rd, rs2, rc add rd, rs1, rd Conditional add, if non-zero rd = (rc != 0) ? (rs1 + rs2) : rs1 --> czero.eqz rd, rs2, rc add rd, rs1, rd Co-authored-by: Xiao Zeng gcc/ChangeLog: * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith (noce_try_cond_zero_arith): handler for condtional zero op (noce_process_if_block): add noce_try_cond_zero_arith with hook control gcc/testsuite/ChangeLog: * gcc.target/riscv/zicond_ifcvt_opt.c: New test. --- gcc/ifcvt.cc | 112 +++ .../gcc.target/riscv/zicond_ifcvt_opt.c | 130 ++ 2 files changed, 242 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc index a0af553b9ff..4f98c1c7bf9 100644 --- a/gcc/ifcvt.cc +++ b/gcc/ifcvt.cc @@ -781,12 +781,14 @@ static bool noce_try_store_flag_constants (struct noce_if_info *); static bool noce_try_store_flag_mask (struct noce_if_info *); static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx, rtx, rtx, rtx, rtx = NULL, rtx = NULL); +static rtx noce_emit_czero (struct noce_if_info *, enum rtx_code, rtx, rtx); static bool noce_try_cmove (struct noce_if_info *); static bool noce_try_cmove_arith (struct noce_if_info *); static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **); static bool noce_try_minmax (struct noce_if_info *); static bool noce_try_abs (struct noce_if_info *); static bool noce_try_sign_mask (struct noce_if_info *); +static bool noce_try_cond_zero_arith (struct noce_if_info *); /* Return the comparison code for reversed condition for IF_INFO, or UNKNOWN if reversing the condition is not possible. */ @@ -1831,6 +1833,32 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code, return NULL_RTX; } +static rtx +noce_emit_czero (struct noce_if_info *if_info, enum rtx_code czero_code, rtx z, rtx target) Every function needs a comment describing what the function does, it's return value(s) and its arguments. There are many examples in ifcvt.cc you can use to guide you. I might start with something like this: /* Emit a conditional zero, returning the location of the result or NULL_RTX upon failure. IF_INFO describes the if-conversion scenario under consideration. CZERO_CODE selects the condition (EQ/NE). Z is the nonzero operand of the conditional move TARGET is the desired output register. */ Or something like that. I would suggest renaming "Z" to something more meaningful. +/* Convert x = c ? y + z : y or x = c ? y : y + z. */ + +static bool +noce_try_cond_zero_arith (struct noce_if_info *if_info) The function comment really should be improved. For example it doesn't indicate what the return value is. + + /* cond must be EQ or NEQ comparision of a reg and 0. */ In general when you refer to a variable in a comment, do so in upper case. Use NE rather than NEQ as the former is how most code refers to a not-equal rtx code. + if (GET_CODE (cond) != NE && GET_CODE (cond) != EQ) +return false; + if (!REG_P (XEXP (cond, 0)) || !rtx_equal_p (XEXP (cond, 1), const0_rtx)) +return false; + + /* check y + z:y*/ + if (GET_CODE (a) == PLUS && REG_P (XEXP (a, 0)) && REG_P (XEXP (a, 1)) + && REG_P (b) && rtx_equal_p (XEXP (a, 0), b)) Write comments as complete sentences. +{ + common = b; + z = XEXP (a, 1); Rather than "z" use a more descriptive variable name. + + /* If we have x = c ? x + z : x, use a new reg to avoid modifying x */ + if (common && rtx_equal_p (common, if_info->x)) +target = gen_reg_rtx (mode); + else +target = if_info->x; if-conversion runs both before and after register allocation. So you have to handle the case where you can not generate new registers. Use can_create_pseudo_p () to test for that. You may need to fail if you can't generate a new register. + + target = noce_emit_czero (if_info, czero_code, z, target); + if (!target) +{ + end_sequence (); + return false; +} + + target = expand_simple_binop (mode, PLUS, common, target, if_info->x, 0, + OPTAB_DIRECT); I think you want OPTAB_WIDEN and in the other calls. @@ -3975,6 +4085,8 @@ noce_process_if_block (struct noce_if_info *if_info) goto success; if (noce_try_store_flag_mask (if_info)) goto success; + if (targetm.have_cond_zero () && noce_try_cond_zero_arith (if_info)) + goto success; Replace targetm.have_cond_zero with HAVE_conditional_move since that's the RTL primitive we're building from. +**test_ADD_ceqz: +** czero\.eqz a3,a2,a3 +** add a0,a1,a3 +** ret Please don't use explicit registers unless you know they will always
Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 10/30/23 21:35, Fei Gao wrote: So just a few notes to further illustrate why I'm currently looking to take the VRULL+Ventana implementation. The code above would be much better handled by just calling noce_emit_cmove. noce_emit_cmove will go through the conditional move expander. So any improvement we make in the expander "just work" when called from the if-converter. noce_emit_czero is used here to make sure czero insns are emited. noce_emit_cmove includes SFB and Thead movcc, which will take precedence over zicond in RISCV if enabled. Unfortunately we have products with SFB and Zicond both available and saw such conflict. And that is also the reason to add hook TARGET_HAVE_COND_ZERO in [PATCH 1/4] to disallow ineffient code emited by SFB enable and Zicond disabled case. I understand what you're trying to do, but I would consider the TARGET_HAVE_COND_ZERO fundamentally the wrong approach. I'm willing to defer routing everything through noce_emit_cmove for now, but that's really where this code needs to be going. If that's causing a conflict for a particular implementation with both SFB and Zicond, then we'll have to look at the details and adjust things in the target files. Cool and waiting for your submit. Shifts/rotates can be added in noce_try_cond_zero_arith. Fully agreed. Those are easy. I tried to keep noce_try_cond_zero_arith simple without introducing SCC and other stuff as addtional insns will be generated for greater than like comparision but may not be generated for branch-insn based SFB. And I think the result is we're going to fail to implement many profitable if-conversions. Jeff
Re: Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 2023-10-31 03:16 Jeff Law wrote: > > > >On 10/30/23 01:25, Fei Gao wrote: >> Conditional add, if zero >> rd = (rc == 0) ? (rs1 + rs2) : rs1 >> --> >> czero.nez rd, rs2, rc >> add rd, rs1, rd >> >> Conditional add, if non-zero >> rd = (rc != 0) ? (rs1 + rs2) : rs1 >> --> >> czero.eqz rd, rs2, rc >> add rd, rs1, rd >> >> Co-authored-by: Xiao Zeng >> >> gcc/ChangeLog: >> >> * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith >> (noce_try_cond_zero_arith): handler for condtional zero op >> (noce_process_if_block): add noce_try_cond_zero_arith with hook >>control >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/riscv/zicond_ifcvt_opt.c: New test. >> --- >> gcc/ifcvt.cc | 112 +++ >> .../gcc.target/riscv/zicond_ifcvt_opt.c | 130 ++ >> 2 files changed, 242 insertions(+) >> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c >> >> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc >> index a0af553b9ff..4f98c1c7bf9 100644 >> --- a/gcc/ifcvt.cc >> +++ b/gcc/ifcvt.cc >> +static rtx >> +noce_emit_czero (struct noce_if_info *if_info, enum rtx_code czero_code, >> rtx z, rtx target) >> +{ >> + machine_mode mode = GET_MODE (target); >> + rtx cond_op0 = XEXP (if_info->cond, 0); >> + rtx czero_cond >> + = gen_rtx_fmt_ee (czero_code, GET_MODE (cond_op0), cond_op0, >> const0_rtx); >> + rtx if_then_else = gen_rtx_IF_THEN_ELSE (mode, czero_cond, const0_rtx, z); >> + rtx set = gen_rtx_SET (target, if_then_else); >> + >> + start_sequence (); >> + rtx_insn *insn = emit_insn (set); >> + >> + if (recog_memoized (insn) >= 0) >> + { >> + rtx_insn *seq = get_insns (); >> + end_sequence (); >> + emit_insn (seq); >> + >> + return target; >> + } >> + >> + end_sequence (); >> + return NULL_RTX; >> +} >So just a few notes to further illustrate why I'm currently looking to >take the VRULL+Ventana implementation. The code above would be much >better handled by just calling noce_emit_cmove. noce_emit_cmove will go >through the conditional move expander. So any improvement we make in >the expander "just work" when called from the if-converter. noce_emit_czero is used here to make sure czero insns are emited. noce_emit_cmove includes SFB and Thead movcc, which will take precedence over zicond in RISCV if enabled. Unfortunately we have products with SFB and Zicond both available and saw such conflict. And that is also the reason to add hook TARGET_HAVE_COND_ZERO in [PATCH 1/4] to disallow ineffient code emited by SFB enable and Zicond disabled case. >> + >> /* Try only simple constants and registers here. More complex cases >> are handled in noce_try_cmove_arith after noce_try_store_flag_arith >> has had a go at it. */ >> @@ -2880,6 +2908,88 @@ noce_try_sign_mask (struct noce_if_info *if_info) >> return true; >> } >> >> +/* Convert x = c ? y + z : y or x = c ? y : y + z. */ >> + >> +static bool >> +noce_try_cond_zero_arith (struct noce_if_info *if_info) >> +{ >> + rtx target; >> + rtx_insn *seq; >> + machine_mode mode = GET_MODE (if_info->x); >> + rtx common = NULL_RTX; >> + enum rtx_code czero_code = UNKNOWN; >> + rtx a = if_info->a; >> + rtx b = if_info->b; >> + rtx z = NULL_RTX; >> + rtx cond = if_info->cond; >> + >> + if (!noce_simple_bbs (if_info)) >> + return false; >[ ... ] >So the internal code we have does a bit of canonicalization before the >optimizing transformations. In particular we may be presented with > >(a == 0) ? b : a which we transform into (a != 0 ? a : b) which allows >us to pick up more cases. (b != 0 ? b : a) gets similar handling. > >As I mentioned earlier, the VRULL+Ventana code handles wrapping >extensions & subregs. Our code also handles if-converting shifts/rotates. Cool and waiting for your submit. Shifts/rotates can be added in noce_try_cond_zero_arith. I tried to keep noce_try_cond_zero_arith simple without introducing SCC and other stuff as addtional insns will be generated for greater than like comparision but may not be generated for branch-insn based SFB. IMHO, the earlier the noce_try* function emerges in noce_process_if_block, the simpler optimization scenarios are and more efficent codes shall be generated, then the later function like noce_try_cmove_arith will handle the more general case. BR, Fei > >Hopefully that explains a bit more why I think cleaning up the >VRULL+Ventana code is a better choice. > >jeff
Re: Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 2023-10-31 00:36 Jeff Law wrote: > > > >On 10/30/23 01:25, Fei Gao wrote: >> Conditional add, if zero >> rd = (rc == 0) ? (rs1 + rs2) : rs1 >> --> >> czero.nez rd, rs2, rc >> add rd, rs1, rd >> >> Conditional add, if non-zero >> rd = (rc != 0) ? (rs1 + rs2) : rs1 >> --> >> czero.eqz rd, rs2, rc >> add rd, rs1, rd >> >> Co-authored-by: Xiao Zeng >> >> gcc/ChangeLog: >> >> * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith >> (noce_try_cond_zero_arith): handler for condtional zero op >> (noce_process_if_block): add noce_try_cond_zero_arith with hook >>control >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/riscv/zicond_ifcvt_opt.c: New test. >So the idea here is to improve upon the current code we generate for >conditional arithmetic. Right now we support conditional arithmetic >using zicond, but the sequence is poor. > >Basically the if-converter knows how to generate a conditional add, but >it does so in a way that isn't as efficient as it could be. > >In effect ifcvt wants to generate > >t = a + b >res = cond ? t : b > > >We want to change it to > >t = cond ? b : 0; >res = a + t; > >The latter sequence expands to more efficient code trivially for risc-v. Exactly. 2 less insns for add case below: long test_ADD_ceqz(long x, long y, long z, long c){ if (c) x = y + z; else x = y; return x; } test_ADD_ceqz(before this patch): add a2,a1,a2 czero.eqz a0,a2,a3 czero.nez a3,a1,a3 or a0,a3,a0 ret test_ADD_ceqz(after this patch): czero.eqz a3,a2,a3 add a0,a1,a3 ret > >I wandered a bit through the combine dumps to see if it would be easy to >capture this class of cases. We never get anything useful, and while I >can imagine "bridge" patterns that would potentially expose enough RTL >to allow us to rewrite without changing ifcvt, it'd just be a hack IMHO. > >So going back to ifcvt... > >In the first sequence the addition must wait for both "a" and "b" to be >available and the conditional move can fire on the next cycle. > >In the second sequence the conditional move can fire when just "b" is >available. So that gives "a" another cycle to become ready (say if it's >coming from memory or a multi-cycle operation like multiply). > >On the other hand the second sequence does keep "a" live longer. > >In the end I strongly suspect neither sequence is significantly better >than the other. Meaning I don't think we need to conditionalize using >condzero arith at all. As shown case above, 2 less insns with using condzero arith. > > >I'll note that subsequent patches add MINUS, IOR, XOR and AND. It's >also possible (and important) to handle shifts. There's a conditional >shift-by-6 in leela's hot path. This series is a initial framework for simple condzero arith. Shift may come later as it involes sugreg stuff. > >Overall this looks a lot like the VRULL code, but just less complete. >My inclination is to do a cleanup pass on the VRULL code verify it >handles all the cases in your tests and commit the VRULL implementation >with your tests. I searched and didn't find VRULL codes, could you please provide a link at your convience? My colleague Zeng Xiao posted monthes ago https://patchwork.sourceware.org/project/gcc/patch/20230719101156.21771-6-zengx...@eswincomputing.com/ But after fixing several bugs, we realized the previous implementation is quite complex and come up with this patch series. > >I'll do some further poking at this today. Thanks for re-submitting >these bits. Getting this target independent work cleaned up has been on >my TODO for a while now. Thanks for your patience. BR, Fei > >jeff
Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 10/30/23 01:25, Fei Gao wrote: Conditional add, if zero rd = (rc == 0) ? (rs1 + rs2) : rs1 --> czero.nez rd, rs2, rc add rd, rs1, rd Conditional add, if non-zero rd = (rc != 0) ? (rs1 + rs2) : rs1 --> czero.eqz rd, rs2, rc add rd, rs1, rd Co-authored-by: Xiao Zeng gcc/ChangeLog: * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith (noce_try_cond_zero_arith): handler for condtional zero op (noce_process_if_block): add noce_try_cond_zero_arith with hook control gcc/testsuite/ChangeLog: * gcc.target/riscv/zicond_ifcvt_opt.c: New test. --- gcc/ifcvt.cc | 112 +++ .../gcc.target/riscv/zicond_ifcvt_opt.c | 130 ++ 2 files changed, 242 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc index a0af553b9ff..4f98c1c7bf9 100644 --- a/gcc/ifcvt.cc +++ b/gcc/ifcvt.cc +static rtx +noce_emit_czero (struct noce_if_info *if_info, enum rtx_code czero_code, rtx z, rtx target) +{ + machine_mode mode = GET_MODE (target); + rtx cond_op0 = XEXP (if_info->cond, 0); + rtx czero_cond += gen_rtx_fmt_ee (czero_code, GET_MODE (cond_op0), cond_op0, const0_rtx); + rtx if_then_else = gen_rtx_IF_THEN_ELSE (mode, czero_cond, const0_rtx, z); + rtx set = gen_rtx_SET (target, if_then_else); + + start_sequence (); + rtx_insn *insn = emit_insn (set); + + if (recog_memoized (insn) >= 0) +{ + rtx_insn *seq = get_insns (); + end_sequence (); + emit_insn (seq); + + return target; +} + + end_sequence (); + return NULL_RTX; +} So just a few notes to further illustrate why I'm currently looking to take the VRULL+Ventana implementation. The code above would be much better handled by just calling noce_emit_cmove. noce_emit_cmove will go through the conditional move expander. So any improvement we make in the expander "just work" when called from the if-converter. + /* Try only simple constants and registers here. More complex cases are handled in noce_try_cmove_arith after noce_try_store_flag_arith has had a go at it. */ @@ -2880,6 +2908,88 @@ noce_try_sign_mask (struct noce_if_info *if_info) return true; } +/* Convert x = c ? y + z : y or x = c ? y : y + z. */ + +static bool +noce_try_cond_zero_arith (struct noce_if_info *if_info) +{ + rtx target; + rtx_insn *seq; + machine_mode mode = GET_MODE (if_info->x); + rtx common = NULL_RTX; + enum rtx_code czero_code = UNKNOWN; + rtx a = if_info->a; + rtx b = if_info->b; + rtx z = NULL_RTX; + rtx cond = if_info->cond; + + if (!noce_simple_bbs (if_info)) +return false; [ ... ] So the internal code we have does a bit of canonicalization before the optimizing transformations. In particular we may be presented with (a == 0) ? b : a which we transform into (a != 0 ? a : b) which allows us to pick up more cases. (b != 0 ? b : a) gets similar handling. As I mentioned earlier, the VRULL+Ventana code handles wrapping extensions & subregs. Our code also handles if-converting shifts/rotates. Hopefully that explains a bit more why I think cleaning up the VRULL+Ventana code is a better choice. jeff
Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 10/30/23 01:25, Fei Gao wrote: Conditional add, if zero rd = (rc == 0) ? (rs1 + rs2) : rs1 --> czero.nez rd, rs2, rc add rd, rs1, rd Conditional add, if non-zero rd = (rc != 0) ? (rs1 + rs2) : rs1 --> czero.eqz rd, rs2, rc add rd, rs1, rd Co-authored-by: Xiao Zeng gcc/ChangeLog: * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith (noce_try_cond_zero_arith): handler for condtional zero op (noce_process_if_block): add noce_try_cond_zero_arith with hook control gcc/testsuite/ChangeLog: * gcc.target/riscv/zicond_ifcvt_opt.c: New test. Just an intermediate follow-up. As I expected, the work we have internally fixes all the cases included in this patch. Similarly for patch #3. jeff
Re: [PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
On 10/30/23 01:25, Fei Gao wrote: Conditional add, if zero rd = (rc == 0) ? (rs1 + rs2) : rs1 --> czero.nez rd, rs2, rc add rd, rs1, rd Conditional add, if non-zero rd = (rc != 0) ? (rs1 + rs2) : rs1 --> czero.eqz rd, rs2, rc add rd, rs1, rd Co-authored-by: Xiao Zeng gcc/ChangeLog: * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith (noce_try_cond_zero_arith): handler for condtional zero op (noce_process_if_block): add noce_try_cond_zero_arith with hook control gcc/testsuite/ChangeLog: * gcc.target/riscv/zicond_ifcvt_opt.c: New test. So the idea here is to improve upon the current code we generate for conditional arithmetic. Right now we support conditional arithmetic using zicond, but the sequence is poor. Basically the if-converter knows how to generate a conditional add, but it does so in a way that isn't as efficient as it could be. In effect ifcvt wants to generate t = a + b res = cond ? t : b We want to change it to t = cond ? b : 0; res = a + t; The latter sequence expands to more efficient code trivially for risc-v. I wandered a bit through the combine dumps to see if it would be easy to capture this class of cases. We never get anything useful, and while I can imagine "bridge" patterns that would potentially expose enough RTL to allow us to rewrite without changing ifcvt, it'd just be a hack IMHO. So going back to ifcvt... In the first sequence the addition must wait for both "a" and "b" to be available and the conditional move can fire on the next cycle. In the second sequence the conditional move can fire when just "b" is available. So that gives "a" another cycle to become ready (say if it's coming from memory or a multi-cycle operation like multiply). On the other hand the second sequence does keep "a" live longer. In the end I strongly suspect neither sequence is significantly better than the other. Meaning I don't think we need to conditionalize using condzero arith at all. I'll note that subsequent patches add MINUS, IOR, XOR and AND. It's also possible (and important) to handle shifts. There's a conditional shift-by-6 in leela's hot path. Overall this looks a lot like the VRULL code, but just less complete. My inclination is to do a cleanup pass on the VRULL code verify it handles all the cases in your tests and commit the VRULL implementation with your tests. I'll do some further poking at this today. Thanks for re-submitting these bits. Getting this target independent work cleaned up has been on my TODO for a while now. jeff
[PATCH 2/4] [ifcvt] if convert x=c ? y+z : y by RISC-V Zicond like insns
Conditional add, if zero rd = (rc == 0) ? (rs1 + rs2) : rs1 --> czero.nez rd, rs2, rc add rd, rs1, rd Conditional add, if non-zero rd = (rc != 0) ? (rs1 + rs2) : rs1 --> czero.eqz rd, rs2, rc add rd, rs1, rd Co-authored-by: Xiao Zeng gcc/ChangeLog: * ifcvt.cc (noce_emit_czero): helper for noce_try_cond_zero_arith (noce_try_cond_zero_arith): handler for condtional zero op (noce_process_if_block): add noce_try_cond_zero_arith with hook control gcc/testsuite/ChangeLog: * gcc.target/riscv/zicond_ifcvt_opt.c: New test. --- gcc/ifcvt.cc | 112 +++ .../gcc.target/riscv/zicond_ifcvt_opt.c | 130 ++ 2 files changed, 242 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc index a0af553b9ff..4f98c1c7bf9 100644 --- a/gcc/ifcvt.cc +++ b/gcc/ifcvt.cc @@ -781,12 +781,14 @@ static bool noce_try_store_flag_constants (struct noce_if_info *); static bool noce_try_store_flag_mask (struct noce_if_info *); static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx, rtx, rtx, rtx, rtx = NULL, rtx = NULL); +static rtx noce_emit_czero (struct noce_if_info *, enum rtx_code, rtx, rtx); static bool noce_try_cmove (struct noce_if_info *); static bool noce_try_cmove_arith (struct noce_if_info *); static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **); static bool noce_try_minmax (struct noce_if_info *); static bool noce_try_abs (struct noce_if_info *); static bool noce_try_sign_mask (struct noce_if_info *); +static bool noce_try_cond_zero_arith (struct noce_if_info *); /* Return the comparison code for reversed condition for IF_INFO, or UNKNOWN if reversing the condition is not possible. */ @@ -1831,6 +1833,32 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code, return NULL_RTX; } +static rtx +noce_emit_czero (struct noce_if_info *if_info, enum rtx_code czero_code, rtx z, rtx target) +{ + machine_mode mode = GET_MODE (target); + rtx cond_op0 = XEXP (if_info->cond, 0); + rtx czero_cond += gen_rtx_fmt_ee (czero_code, GET_MODE (cond_op0), cond_op0, const0_rtx); + rtx if_then_else = gen_rtx_IF_THEN_ELSE (mode, czero_cond, const0_rtx, z); + rtx set = gen_rtx_SET (target, if_then_else); + + start_sequence (); + rtx_insn *insn = emit_insn (set); + + if (recog_memoized (insn) >= 0) +{ + rtx_insn *seq = get_insns (); + end_sequence (); + emit_insn (seq); + + return target; +} + + end_sequence (); + return NULL_RTX; +} + /* Try only simple constants and registers here. More complex cases are handled in noce_try_cmove_arith after noce_try_store_flag_arith has had a go at it. */ @@ -2880,6 +2908,88 @@ noce_try_sign_mask (struct noce_if_info *if_info) return true; } +/* Convert x = c ? y + z : y or x = c ? y : y + z. */ + +static bool +noce_try_cond_zero_arith (struct noce_if_info *if_info) +{ + rtx target; + rtx_insn *seq; + machine_mode mode = GET_MODE (if_info->x); + rtx common = NULL_RTX; + enum rtx_code czero_code = UNKNOWN; + rtx a = if_info->a; + rtx b = if_info->b; + rtx z = NULL_RTX; + rtx cond = if_info->cond; + + if (!noce_simple_bbs (if_info)) +return false; + + /* cond must be EQ or NEQ comparision of a reg and 0. */ + if (GET_CODE (cond) != NE && GET_CODE (cond) != EQ) +return false; + if (!REG_P (XEXP (cond, 0)) || !rtx_equal_p (XEXP (cond, 1), const0_rtx)) +return false; + + /* check y + z:y*/ + if (GET_CODE (a) == PLUS && REG_P (XEXP (a, 0)) && REG_P (XEXP (a, 1)) + && REG_P (b) && rtx_equal_p (XEXP (a, 0), b)) +{ + common = b; + z = XEXP (a, 1); + /* x = c ? y+z : y, cond = !c --> x = cond ? y : y+z */ + czero_code = GET_CODE (cond); +} + /* check y : y+z */ + else if (GET_CODE (b) == PLUS && REG_P (XEXP (b, 0)) && REG_P (XEXP (b, 1)) + && REG_P (a) && rtx_equal_p (a, XEXP (b, 0))) +{ + common = a; + z = XEXP (b, 1); + /* x = c ? y : y+z, cond = !c --> x = !cond ? y : y+z */ + czero_code = noce_reversed_cond_code (if_info); +} + else +return false; + + if (czero_code == UNKNOWN) +return false; + + start_sequence (); + + /* If we have x = c ? x + z : x, use a new reg to avoid modifying x */ + if (common && rtx_equal_p (common, if_info->x)) +target = gen_reg_rtx (mode); + else +target = if_info->x; + + target = noce_emit_czero (if_info, czero_code, z, target); + if (!target) +{ + end_sequence (); + return false; +} + + target = expand_simple_binop (mode, PLUS, common, target, if_info->x, 0, + OPTAB_DIRECT); + if (!target) +{ + end_sequence (); + return false; +} + + if (target != if_info->x) +noce_emit_move_insn (if_info->x, target); + + seq = end_ifcvt_sequence (if_info); + if (!seq |