Re: [PATCH v2] ree: Improve ree pass for rs6000 target.
On 4/6/23 04:49, Ajit Agarwal wrote: Hello All: Eliminate unnecessary redundant extension within basic and across basic blocks. For rs6000 target we see redundant zero and sign extension and done to improve ree pass to eliminate such redundant zero and sign extension. Bootstrapped and regtested on powerpc64-linux-gnu. Thanks & Regards Ajit ree: Improve ree pass for rs6000 target. Eliminate unnecessary redundant extension within basic and across basic blocks. For rs6000 target we see redundant zero and sign extension and done to improve ree pass to eliminate such redundant zero and sign extension. 2023-04-06 Ajit Kumar Agarwal gcc/ChangeLog: * ree.cc (eliminate_across_bbs_p): Add checks to enable extension elimination across and within basic blocks. (def_arith_p): New function to check definition has arithmetic operation. (combine_set_extension): Modification to incorporate AND and current zero_extend and sign_extend instruction. (combline_reaching_defs): Add zero_extend and sign_extend. Add FUNCTION_ARG_REGNO_P abi interfaces calls and FUNCTION_VALUE_REGNO_P support. (merge_def_and_ext): Add calls to eliminate_across_bbs_p and zero_extend sign_extend and AND instruction. (insn_is_zext_p): New function. (add_removable_extension): Add FUNCTION_ARG_REGNO_P abi interface calls. * common/config/rs6000/rs6000-common.cc: Add REE pass as a default rs6000 target pass for O2 and above. gcc/testsuite/ChangeLog: * g++.target/powerpc/zext-elim.C: New testcase. * g++.target/powerpc/zext-elim-1.C: New testcase. * g++.target/powerpc/zext-elim-2.C: New testcase. * g++.target/powerpc/zext-elim-3.C: New testcase. * g++.target/powerpc/sext-elim.C: New testcase. It would be useful to know the kinds of patterns you're trying to improve. I get the sense there's at least three distinct cases you're trying to handle. One case appears to stem from operations which we know produce a zero extended results. For example x & 0x1. We can kindof view that as a zero extension from narrow modes up through word_mode since we know the upper bits are zero. Another stems from exploiting ABI characteristics. Finally extending to handle cases across basic blocks These should be independent changes. So I can easily see this patch should morph into a patch series with at least 4 entries. 1/4 Just moves code around and produces no functional changes. 2/4 Would implement the case where an operation is known to produce a zero extended result. 3/4 Would exploit the ABI characteristics to eliminate more extensions. 4/4 Would extend REE to work across blocks WIth this kind of structure patches #1 and #2 might to in fairly quickly, even if we have to figure out how to handle ABI issues. It's also easier for the reviewer on multiple levels. --- gcc/common/config/rs6000/rs6000-common.cc | 2 + gcc/ree.cc| 655 ++ gcc/testsuite/g++.target/powerpc/sext-elim.C | 18 + .../g++.target/powerpc/zext-elim-1.C | 19 + .../g++.target/powerpc/zext-elim-2.C | 11 + .../g++.target/powerpc/zext-elim-3.C | 16 + gcc/testsuite/g++.target/powerpc/zext-elim.C | 30 + 7 files changed, 606 insertions(+), 145 deletions(-) create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C diff --git a/gcc/common/config/rs6000/rs6000-common.cc b/gcc/common/config/rs6000/rs6000-common.cc index 2140c442ba9..968db215028 100644 --- a/gcc/common/config/rs6000/rs6000-common.cc +++ b/gcc/common/config/rs6000/rs6000-common.cc @@ -34,6 +34,8 @@ static const struct default_options rs6000_option_optimization_table[] = { OPT_LEVELS_ALL, OPT_fsplit_wide_types_early, NULL, 1 }, /* Enable -fsched-pressure for first pass instruction scheduling. */ { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 }, +/* Enable -free for zero extension and sign extension elimination.*/ +{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 }, So if we're going to make this change, then we need to update the documentation as well (there's a section which lists which -f options are enabled at the different -O optimization levels. It may be better to have the ppc backend enable the REE pass on its own rather than forcing it on for all the targets since it hasn't been tested on all the targets. That's been pretty standard practice for the REE implementation. /* Enable -munroll-only-small-loops with -funroll-loops to unroll small loops at
Re: [PATCH v2] ree: Improve ree pass for rs6000 target.
On 6 April 2023 12:49:53 CEST, Ajit Agarwal via Gcc-patches wrote: >Hello All: > >Eliminate unnecessary redundant extension within basic and across basic blocks. To borrow HP's "arm-chair development mode", unfortunately most of the comments i had for the previous version apply to this version too, fwiw. PS: Pretty please avoid moving functions around, if possible, as it spoils the history needlessly, IMHO. thanks, and looking forward to a stage 1 patch.
[PATCH v2] ree: Improve ree pass for rs6000 target.
Hello All: Eliminate unnecessary redundant extension within basic and across basic blocks. For rs6000 target we see redundant zero and sign extension and done to improve ree pass to eliminate such redundant zero and sign extension. Bootstrapped and regtested on powerpc64-linux-gnu. Thanks & Regards Ajit ree: Improve ree pass for rs6000 target. Eliminate unnecessary redundant extension within basic and across basic blocks. For rs6000 target we see redundant zero and sign extension and done to improve ree pass to eliminate such redundant zero and sign extension. 2023-04-06 Ajit Kumar Agarwal gcc/ChangeLog: * ree.cc (eliminate_across_bbs_p): Add checks to enable extension elimination across and within basic blocks. (def_arith_p): New function to check definition has arithmetic operation. (combine_set_extension): Modification to incorporate AND and current zero_extend and sign_extend instruction. (combline_reaching_defs): Add zero_extend and sign_extend. Add FUNCTION_ARG_REGNO_P abi interfaces calls and FUNCTION_VALUE_REGNO_P support. (merge_def_and_ext): Add calls to eliminate_across_bbs_p and zero_extend sign_extend and AND instruction. (insn_is_zext_p): New function. (add_removable_extension): Add FUNCTION_ARG_REGNO_P abi interface calls. * common/config/rs6000/rs6000-common.cc: Add REE pass as a default rs6000 target pass for O2 and above. gcc/testsuite/ChangeLog: * g++.target/powerpc/zext-elim.C: New testcase. * g++.target/powerpc/zext-elim-1.C: New testcase. * g++.target/powerpc/zext-elim-2.C: New testcase. * g++.target/powerpc/zext-elim-3.C: New testcase. * g++.target/powerpc/sext-elim.C: New testcase. --- gcc/common/config/rs6000/rs6000-common.cc | 2 + gcc/ree.cc| 655 ++ gcc/testsuite/g++.target/powerpc/sext-elim.C | 18 + .../g++.target/powerpc/zext-elim-1.C | 19 + .../g++.target/powerpc/zext-elim-2.C | 11 + .../g++.target/powerpc/zext-elim-3.C | 16 + gcc/testsuite/g++.target/powerpc/zext-elim.C | 30 + 7 files changed, 606 insertions(+), 145 deletions(-) create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C diff --git a/gcc/common/config/rs6000/rs6000-common.cc b/gcc/common/config/rs6000/rs6000-common.cc index 2140c442ba9..968db215028 100644 --- a/gcc/common/config/rs6000/rs6000-common.cc +++ b/gcc/common/config/rs6000/rs6000-common.cc @@ -34,6 +34,8 @@ static const struct default_options rs6000_option_optimization_table[] = { OPT_LEVELS_ALL, OPT_fsplit_wide_types_early, NULL, 1 }, /* Enable -fsched-pressure for first pass instruction scheduling. */ { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 }, +/* Enable -free for zero extension and sign extension elimination.*/ +{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 }, /* Enable -munroll-only-small-loops with -funroll-loops to unroll small loops at -O2 and above by default. */ { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_funroll_loops, NULL, 1 }, diff --git a/gcc/ree.cc b/gcc/ree.cc index 413aec7c8eb..8057f0325f4 100644 --- a/gcc/ree.cc +++ b/gcc/ree.cc @@ -253,6 +253,101 @@ struct ext_cand static int max_insn_uid; +/* Get all the reaching definitions of an instruction. The definitions are + desired for REG used in INSN. Return the definition list or NULL if a + definition is missing. If DEST is non-NULL, additionally push the INSN + of the definitions onto DEST. */ + +static struct df_link * +get_defs (rtx_insn *insn, rtx reg, vec *dest) +{ + df_ref use; + struct df_link *ref_chain, *ref_link; + + FOR_EACH_INSN_USE (use, insn) +{ + if (GET_CODE (DF_REF_REG (use)) == SUBREG) + return NULL; + if (REGNO (DF_REF_REG (use)) == REGNO (reg)) + break; +} + + if (use == NULL) +return NULL; + + ref_chain = DF_REF_CHAIN (use); + + for (ref_link = ref_chain; ref_link; ref_link = ref_link->next) +{ + /* Problem getting some definition for this instruction. */ + if (ref_link->ref == NULL) + return NULL; + if (DF_REF_INSN_INFO (ref_link->ref) == NULL) + return NULL; + /* As global regs are assumed to be defined at each function call +dataflow can report a call_insn as being a definition of REG. +But we can't do anything with that in this pass so proceed only +if the instruction really sets REG in a way that can be deduced +from the RTL structure. */ + if (global_regs[REGNO (reg)] + && !set_of