On 11/9/22 16:07, Philipp Tomsich wrote:
Handling the register-const_int addition has very quickly escalated to
creating a full sign-extended 32bit constant and performing a
register-register for RISC-V in GCC so far, resulting in sequences like
(for the case of "a + 2048"):
        li      a5,4096
        addi    a5,a5,-2048
        add     a0,a0,a5

By adding an expansion for add<mode>3, we can emit optimised RTL that
matches the capabilities of RISC-V better by adding support for the
following, previously unoptimised cases:
   - addi + addi
        addi    a0,a0,2047
        addi    a0,a0,1
   - li + sh[123]add (if Zba is enabled)
        li      a5,960
        sh3add  a0,a5,a0

With this commit, we also fix up riscv_adjust_libcall_cfi_prologue()
and riscv_adjust_libcall_cfi_epilogue() to not use gen_add3_insn, as
the expander will otherwise wrap the resulting set-expression in an
insn (causing an ICE at dwarf2-time) when invoked with -msave-restore.

This closes the gap to LLVM, which has already been emitting these
optimised sequences.

Note that this benefits is perlbench (in SPEC CPU 2017), which needs
to add the constant 3840.

gcc/ChangeLog:

        * config/riscv/bitmanip.md (*shNadd): Rename.
        (riscv_shNadd<X:mode>): Expose as gen_riscv_shNadd{di/si}.
        * config/riscv/predicates.md (const_arith_shifted123_operand):
        New predicate (for constants that are a simm12, shifted by
        1, 2 or 3).
        (const_arith_2simm12_operand): New predicate (that can be
        expressed by adding 2 simm12 together).
        (addi_operand): New predicate (an immedaite operand suitable
        for the new add<mode>3 expansion).
        * config/riscv/riscv.cc (riscv_adjust_libcall_cfi_prologue):
        Don't use gen_add3_insn, where a RTX instead of an INSN is
        required (otherwise this will break as soon as we have a
        define_expand for add<mode>3).
        (riscv_adjust_libcall_cfi_epilogue): Same.
        * config/riscv/riscv.md (addsi3): Rename.
        (riscv_addsi3): New name for addsi3.
        (adddi3): Rename.
        (riscv_adddi3): New name for adddi3.
        (add<mode>3): New expander that handles the basic and fancy
        (such as li+sh[123]add, addi+addi, ...) cases for adding
        register-register and register-const_int.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/addi.c: New test.
        * gcc.target/riscv/zba-shNadd-06.c: New test.

Signed-off-by: Philipp Tomsich <philipp.toms...@vrull.eu>
---

  gcc/config/riscv/bitmanip.md                  |  2 +-
  gcc/config/riscv/predicates.md                | 28 +++++++++
  gcc/config/riscv/riscv.cc                     | 10 ++--
  gcc/config/riscv/riscv.md                     | 58 ++++++++++++++++++-
  gcc/testsuite/gcc.target/riscv/addi.c         | 39 +++++++++++++
  .../gcc.target/riscv/zba-shNadd-06.c          | 11 ++++
  6 files changed, 141 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/addi.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shNadd-06.c



diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 171a0cdced6..289ff7470c6 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -464,6 +464,60 @@
    [(set_attr "type" "arith")
     (set_attr "mode" "DI")])
+(define_expand "add<mode>3"
+  [(set (match_operand:GPR           0 "register_operand"      "=r,r")
+       (plus:GPR (match_operand:GPR 1 "register_operand"      " r,r")
+                 (match_operand:GPR 2 "addi_operand"          " r,I")))]
+  ""
+{
+  if (arith_operand (operands[2], <MODE>mode))
+    emit_insn (gen_riscv_add<mode>3 (operands[0], operands[1], operands[2]));
+  else if (const_arith_2simm12_operand (operands[2], <MODE>mode))
+    {
+      /* Split into two immediates that add up to the desired value:
+       * e.g., break up "a + 2445" into:
+       *         addi  a0,a0,2047
+       *        addi   a0,a0,398
+       */

Nit.  GNU comment style please.


+
+      HOST_WIDE_INT val = INTVAL (operands[2]);
+      HOST_WIDE_INT saturated = HOST_WIDE_INT_M1U << (IMM_BITS - 1);
+
+      if (val >= 0)
+        saturated = ~saturated;
+
+      val -= saturated;
+
+      rtx tmp = gen_reg_rtx (<MODE>mode);

Can't add<mode>3 be generated by LRA?  If so, don't you have to guard against going into this path as we shouldn't be creating new pseudos at that point (I know LRA can create some internally, but I don't think it handles new ones showing up due to target expanders).


Similarly for the shifted_123 case immediately following.


If we do indeed have an issue here, I'm not sure how best to resolve.  If the output operand does not overlap with the inputs, then we're golden and can just re-use it to form the constant.  If not,  then it's a bit tougher.  I'm not keen to add a test of no_new_pseudos to the operand predicate, but I don't see a better option yet.


jeff


Reply via email to