Re: [PATCH v2] ree: Improve ree pass for rs6000 target.

2023-04-14 Thread Jeff Law via Gcc-patches




On 4/6/23 04:49, Ajit Agarwal wrote:

Hello All:

Eliminate unnecessary redundant extension within basic and across basic blocks. 
For rs6000 target we see redundant zero and sign extension and done to improve 
ree pass to eliminate such redundant zero and sign extension.

Bootstrapped and regtested on powerpc64-linux-gnu.

Thanks & Regards
Ajit


ree: Improve ree pass for rs6000 target.

Eliminate unnecessary redundant extension within basic
and across basic blocks. For rs6000 target we see
redundant zero and sign extension and done to improve
ree pass to eliminate such redundant zero and sign
extension.

2023-04-06  Ajit Kumar Agarwal  

gcc/ChangeLog:

* ree.cc (eliminate_across_bbs_p): Add checks to enable extension
elimination across and within basic blocks.
(def_arith_p): New function to check definition has arithmetic
operation.
(combine_set_extension): Modification to incorporate AND
and current zero_extend and sign_extend instruction.
(combline_reaching_defs): Add zero_extend and sign_extend.
Add FUNCTION_ARG_REGNO_P abi interfaces calls and
FUNCTION_VALUE_REGNO_P support.
(merge_def_and_ext): Add calls to eliminate_across_bbs_p and
zero_extend sign_extend and AND instruction.
(insn_is_zext_p): New function.
(add_removable_extension): Add FUNCTION_ARG_REGNO_P abi
interface calls.
* common/config/rs6000/rs6000-common.cc: Add REE pass as a
default rs6000 target pass for O2 and above.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/zext-elim.C: New testcase.
* g++.target/powerpc/zext-elim-1.C: New testcase.
* g++.target/powerpc/zext-elim-2.C: New testcase.
* g++.target/powerpc/zext-elim-3.C: New testcase.
* g++.target/powerpc/sext-elim.C: New testcase.
It would be useful to know the kinds of patterns you're trying to 
improve.  I get the sense there's at least three distinct cases you're 
trying to handle.


One case appears to stem from operations which we know produce a zero 
extended results.  For example x & 0x1.   We can kindof view that as a 
zero extension from narrow modes up through word_mode since we know the 
upper bits are zero.


Another stems from exploiting ABI characteristics.

Finally extending to handle cases across basic blocks

These should be independent changes.  So I can easily see this patch 
should morph into a patch series with at least 4 entries.


1/4 Just moves code around and produces no functional changes.
2/4 Would implement the case where an operation is known to
produce a zero extended result.
3/4 Would exploit the ABI characteristics to eliminate more
extensions.
4/4 Would extend REE to work across blocks

WIth this kind of structure patches #1 and #2 might to in fairly 
quickly, even if we have to figure out how to handle ABI issues.  It's 
also easier for the reviewer on multiple levels.



---
  gcc/common/config/rs6000/rs6000-common.cc |   2 +
  gcc/ree.cc| 655 ++
  gcc/testsuite/g++.target/powerpc/sext-elim.C  |  18 +
  .../g++.target/powerpc/zext-elim-1.C  |  19 +
  .../g++.target/powerpc/zext-elim-2.C  |  11 +
  .../g++.target/powerpc/zext-elim-3.C  |  16 +
  gcc/testsuite/g++.target/powerpc/zext-elim.C  |  30 +
  7 files changed, 606 insertions(+), 145 deletions(-)
  create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
  create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C
  create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C
  create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C
  create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C

diff --git a/gcc/common/config/rs6000/rs6000-common.cc 
b/gcc/common/config/rs6000/rs6000-common.cc
index 2140c442ba9..968db215028 100644
--- a/gcc/common/config/rs6000/rs6000-common.cc
+++ b/gcc/common/config/rs6000/rs6000-common.cc
@@ -34,6 +34,8 @@ static const struct default_options 
rs6000_option_optimization_table[] =
  { OPT_LEVELS_ALL, OPT_fsplit_wide_types_early, NULL, 1 },
  /* Enable -fsched-pressure for first pass instruction scheduling.  */
  { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
+/* Enable -free for zero extension and sign extension elimination.*/
+{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
So if we're going to make this change, then we need to update the 
documentation as well (there's a section which lists which -f options 
are enabled at the different -O optimization levels.


It may be better to have the ppc backend enable the REE pass on its own 
rather than forcing it on for all the targets since it hasn't been 
tested on all the targets.  That's been pretty standard practice for the 
REE implementation.




  /* Enable -munroll-only-small-loops with -funroll-loops to unroll small
 loops at 

Re: [PATCH v2] ree: Improve ree pass for rs6000 target.

2023-04-12 Thread Bernhard Reutner-Fischer via Gcc-patches
On 6 April 2023 12:49:53 CEST, Ajit Agarwal via Gcc-patches 
 wrote:
>Hello All:
>
>Eliminate unnecessary redundant extension within basic and across basic blocks.

To borrow HP's "arm-chair development mode", unfortunately most of the comments 
i had for the previous version apply to this version too, fwiw.

PS: Pretty please avoid moving functions around, if possible, as it spoils the 
history needlessly, IMHO.

thanks, and looking forward to a stage 1 patch.


[PATCH v2] ree: Improve ree pass for rs6000 target.

2023-04-06 Thread Ajit Agarwal via Gcc-patches
Hello All:

Eliminate unnecessary redundant extension within basic and across basic blocks. 
For rs6000 target we see redundant zero and sign extension and done to improve 
ree pass to eliminate such redundant zero and sign extension.

Bootstrapped and regtested on powerpc64-linux-gnu.

Thanks & Regards
Ajit


ree: Improve ree pass for rs6000 target.

Eliminate unnecessary redundant extension within basic
and across basic blocks. For rs6000 target we see
redundant zero and sign extension and done to improve
ree pass to eliminate such redundant zero and sign
extension.

2023-04-06  Ajit Kumar Agarwal  

gcc/ChangeLog:

* ree.cc (eliminate_across_bbs_p): Add checks to enable extension
elimination across and within basic blocks.
(def_arith_p): New function to check definition has arithmetic
operation.
(combine_set_extension): Modification to incorporate AND
and current zero_extend and sign_extend instruction.
(combline_reaching_defs): Add zero_extend and sign_extend.
Add FUNCTION_ARG_REGNO_P abi interfaces calls and
FUNCTION_VALUE_REGNO_P support.
(merge_def_and_ext): Add calls to eliminate_across_bbs_p and
zero_extend sign_extend and AND instruction.
(insn_is_zext_p): New function.
(add_removable_extension): Add FUNCTION_ARG_REGNO_P abi
interface calls.
* common/config/rs6000/rs6000-common.cc: Add REE pass as a
default rs6000 target pass for O2 and above.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/zext-elim.C: New testcase.
* g++.target/powerpc/zext-elim-1.C: New testcase.
* g++.target/powerpc/zext-elim-2.C: New testcase.
* g++.target/powerpc/zext-elim-3.C: New testcase.
* g++.target/powerpc/sext-elim.C: New testcase.
---
 gcc/common/config/rs6000/rs6000-common.cc |   2 +
 gcc/ree.cc| 655 ++
 gcc/testsuite/g++.target/powerpc/sext-elim.C  |  18 +
 .../g++.target/powerpc/zext-elim-1.C  |  19 +
 .../g++.target/powerpc/zext-elim-2.C  |  11 +
 .../g++.target/powerpc/zext-elim-3.C  |  16 +
 gcc/testsuite/g++.target/powerpc/zext-elim.C  |  30 +
 7 files changed, 606 insertions(+), 145 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C

diff --git a/gcc/common/config/rs6000/rs6000-common.cc 
b/gcc/common/config/rs6000/rs6000-common.cc
index 2140c442ba9..968db215028 100644
--- a/gcc/common/config/rs6000/rs6000-common.cc
+++ b/gcc/common/config/rs6000/rs6000-common.cc
@@ -34,6 +34,8 @@ static const struct default_options 
rs6000_option_optimization_table[] =
 { OPT_LEVELS_ALL, OPT_fsplit_wide_types_early, NULL, 1 },
 /* Enable -fsched-pressure for first pass instruction scheduling.  */
 { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
+/* Enable -free for zero extension and sign extension elimination.*/
+{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
 /* Enable -munroll-only-small-loops with -funroll-loops to unroll small
loops at -O2 and above by default.  */
 { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_funroll_loops, NULL, 1 },
diff --git a/gcc/ree.cc b/gcc/ree.cc
index 413aec7c8eb..8057f0325f4 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -253,6 +253,101 @@ struct ext_cand
 
 static int max_insn_uid;
 
+/* Get all the reaching definitions of an instruction.  The definitions are
+   desired for REG used in INSN.  Return the definition list or NULL if a
+   definition is missing.  If DEST is non-NULL, additionally push the INSN
+   of the definitions onto DEST.  */
+
+static struct df_link *
+get_defs (rtx_insn *insn, rtx reg, vec *dest)
+{
+  df_ref use;
+  struct df_link *ref_chain, *ref_link;
+
+  FOR_EACH_INSN_USE (use, insn)
+{
+  if (GET_CODE (DF_REF_REG (use)) == SUBREG)
+   return NULL;
+  if (REGNO (DF_REF_REG (use)) == REGNO (reg))
+   break;
+}
+
+  if (use == NULL)
+return NULL;
+
+  ref_chain = DF_REF_CHAIN (use);
+
+  for (ref_link = ref_chain; ref_link; ref_link = ref_link->next)
+{
+  /* Problem getting some definition for this instruction.  */
+  if (ref_link->ref == NULL)
+   return NULL;
+  if (DF_REF_INSN_INFO (ref_link->ref) == NULL)
+   return NULL;
+  /* As global regs are assumed to be defined at each function call
+dataflow can report a call_insn as being a definition of REG.
+But we can't do anything with that in this pass so proceed only
+if the instruction really sets REG in a way that can be deduced
+from the RTL structure.  */
+  if (global_regs[REGNO (reg)]
+ && !set_of