On 6/15/23 11:28, Manolis Tsamis wrote:
This is a new RTL pass that tries to optimize memory offset calculations
by moving them from add immediate instructions to the memory loads/stores.
For example it can transform this:

   addi t4,sp,16
   add  t2,a6,t4
   shl  t3,t2,1
   ld   a2,0(t3)
   addi a2,1
   sd   a2,8(t2)

into the following (one instruction less):

   add  t2,a6,sp
   shl  t3,t2,1
   ld   a2,32(t3)
   addi a2,1
   sd   a2,24(t2)

Although there are places where this is done already, this pass is more
powerful and can handle the more difficult cases that are currently not
optimized. Also, it runs late enough and can optimize away unnecessary
stack pointer calculations.

gcc/ChangeLog:

        * Makefile.in: Add fold-mem-offsets.o.
        * passes.def: Schedule a new pass.
        * tree-pass.h (make_pass_fold_mem_offsets): Declare.
        * common.opt: New options.
        * doc/invoke.texi: Document new option.
        * fold-mem-offsets.cc: New file.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/fold-mem-offsets-1.c: New test.
        * gcc.target/riscv/fold-mem-offsets-2.c: New test.
        * gcc.target/riscv/fold-mem-offsets-3.c: New test.

Signed-off-by: Manolis Tsamis <manolis.tsa...@vrull.eu>
---

Changes in v2:
         - Made the pass target-independant instead of RISCV specific.
         - Fixed a number of bugs.
         - Add code to handle more ADD patterns as found
           in other targets (x86, aarch64).
         - Improved naming and comments.
         - Fixed bitmap memory leak.


diff --git a/gcc/fold-mem-offsets.cc b/gcc/fold-mem-offsets.cc
new file mode 100644
index 00000000000..8ef0f438191
--- /dev/null
+++ b/gcc/fold-mem-offsets.cc
@@ -0,0 +1,630 @@
+/* Late RTL pass to fold memory offsets.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
Do we still need this #define?




+/* Tracks which instructions can be reached through instructions that can
+   propagate offsets for folding.  */
+static bitmap_head can_fold_insns;
Is there any reason why you're using "bitmap_head" rather than just the generic "bitmap" type?

Also note that since you've got a class you could just put these objects into the class and make the routines that use them member functions. It's marginally cleaner than using static variables.


+
+/* Helper function that performs the actions defined by PHASE for INSN.  */
+static void
+fold_mem_offsets_driver (rtx_insn* insn, int phase)
+{
+  if (phase == FM_PHASE_COMMIT_INSNS)
So FM_PHASE_COMMIT_INSNS doesn't share any code with the other phases. Would it be better to just factor this into a distinct function?



+    {
+      rtx mem = get_foldable_mem (insn);
+
+      if (!mem)
+       return;
+
+      rtx mem_addr = XEXP (mem, 0);
+      rtx reg;
+      HOST_WIDE_INT cur_offset;
+
+      if (REG_P (mem_addr))
+       {
+         reg = mem_addr;
+         cur_offset = 0;
+       }
+      else if (GET_CODE (mem_addr) == PLUS
+              && REG_P (XEXP (mem_addr, 0))
+              && CONST_INT_P (XEXP (mem_addr, 1)))
+       {
+         reg = XEXP (mem_addr, 0);
+         cur_offset = INTVAL (XEXP (mem_addr, 1));
+       }
+      else
+       return;
So these is common to the non-commit phases. Would it be cleaner to factor it into its own function, then factor each of the non-commit phases into their own function which calls this common routine?



+      else if (phase == FM_PHASE_VALIDITY)
+       {
+         bitmap_head fold_insns;
+         bitmap_initialize (&fold_insns, NULL);
Note that we have auto-bitmap types which will clean up after themselves so that you don't have to manage allocation/deallocation.


Overall it looks really good. I could make an argument to include it now, but I think one more cycle would be best.

In the mean time, I've updated my tester to use the V2 version.

Thanks!
jeff

Reply via email to