On 7/11/25 2:57 AM, Kuan-Lin Chen wrote:
This patch adds support for the XAndesperf ISA extension.
The 32-bit AndeStar V5 extension includes branch instructions,
load effective address instructions, and string processing
instructions for performance improvement.
New INSN patterns are added into the new file andes.md
as a seprated vender extension.
gcc/ChangeLog:
* config/riscv/constraints.md (Ou07): New constraint.
(ads_Bext): New constraint.
* config/riscv/iterators.md (ANYLE32): New iterator.
(sizen): New iterator.
(sh_limit): New iterator.
(sh_bit): New iterator.
* config/riscv/predicates.md (ads_branch_bbcs_operand): New predicate.
(ads_branch_bimm_operand): New predicate.
(ads_imm_extract_operand): New predicate.
(ads_extract_size_imm_si): New predicate.
(ads_extract_size_imm_di): New predicate.
(const_int5_operand): New predicate.
* config/riscv/riscv-builtins.cc:
Add new AVAIL andesperf32 and andesperf64.
Add new define RISCV_ATYPE_ULONG and RISCV_ATYPE_LONG.
* config/riscv/riscv-ftypes.def: New DEF_RISCV_FTYPE.
* config/riscv/riscv.cc
(riscv_extend_cost): Cost for pattern 'bfo'.
(riscv_rtx_costs): Cost for XAndesperf extension.
* config/riscv/riscv.md: Add support for XAndesperf to patterns
zero_extendsidi2_internal, zero_extendhi2, extendsidi2_internal,
extend<SHORT:mode><SUPERQI:mode>2, <any_extract:optab><GPR:mode>3
and branch_on_bit.
* config/riscv/vector-iterators.md
(sz): Add sign_extract and zero_extract.
* config/riscv/andes.def: New file for vender Andes.
* config/riscv/andes.md: New file for vender Andes.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xandesperf-1.c: New test.
* gcc.target/riscv/xandesperf-10.c: New test.
* gcc.target/riscv/xandesperf-2.c: New test.
* gcc.target/riscv/xandesperf-3.c: New test.
* gcc.target/riscv/xandesperf-4.c: New test.
* gcc.target/riscv/xandesperf-5.c: New test.
* gcc.target/riscv/xandesperf-6.c: New test.
* gcc.target/riscv/xandesperf-7.c: New test.
* gcc.target/riscv/xandesperf-8.c: New test.
* gcc.target/riscv/xandesperf-9.c: New test.
---
+
+(define_c_enum "unspec" [
+ ;; XANDESPERF string
+ UNSPEC_NDS_FFB
+ UNSPEC_NDS_FFZMISM
+ UNSPEC_NDS_FFMISM
+ UNSPEC_NDS_FLMISM
+])
+
+;; AndesPerf
+;; ....................
+;;
+;; BRANCH OPERATION
+;;
+;; ....................
+;;
+
+(define_insn "*nds_branch_imms7<mode>"
+ [(set (pc)
+ (if_then_else
+ (match_operator 1 "equality_operator"
+ [(match_operand:X 2 "register_operand" "r")
+ (match_operand:X 3 "ads_branch_bimm_operand" "Ou07")])
+ (label_ref (match_operand 0 "" ""))
+ (pc)))]
+ "TARGET_XANDESPERF"
+{
+ if (get_attr_length (insn) == 12)
+ return "nds.b%C1c\t%2,%3,1f; jump\t%l0,ra; 1:";
+
+ return "nds.b%C1c\t%2,%3,%l0";
+}
+ [(set_attr "type" "branch")
+ (set_attr "mode" "none")])
Not strictly necessary to change, but I think the preferred way to
handle the operator these days is with an iterator. So instead of a
match_operator construct, you can just use the "any_eq" iterator like
you would any other code.
That will expand to an EQ and a NE version of your pattern. Similarly
for your branch-on-bit pattern.
I don't think we have a code attribute for EQ/NE, but that would be
trivial to add and probably useful in general. See the code attr
"OPTAB" in iterators.md.
You could use the same kind of mechanism to define a code attribute that
emits "c" or "s" based on the EQ/NE pattern. As an example, see how the
"su" code attr works in the same file
+
+;; BFO[SZ]: msb < lsb: Combine pass doesn't convert (and (ashift) Y) to
+;; zero_extract when exact_log2 (Y + 1) < 0.
Wouldn't it be better to fix combine? I'm a bit surprised by the
comment since I've seen zero_extractions where the starting point and
size are arbitrary on other ports.
+
+;;
+;; ....................
+;;
+;; LOAD ADDRESS
+;;
+;; ....................
+;;
+
+(define_insn "lea_h<mode>"
+ [(set (match_operand:P 0 "register_operand" "=r")
+ (plus:P (ashift:P (match_operand:P 1 "register_operand" " r")
+ (const_int 1))
+ (match_operand:P 2 "register_operand" " r")))]
+ "TARGET_XANDESPERF"
+ { return "nds.lea.h\t%0,%2,%1"; }
+ [(set_attr "type" "arith")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "lea_w<mode>"
+ [(set (match_operand:P 0 "register_operand" "=r")
+ (plus:P (ashift:P (match_operand:P 1 "register_operand" " r")
+ (const_int 2))
+ (match_operand:P 2 "register_operand" " r")))]
+ "TARGET_XANDESPERF"
+ { return "nds.lea.w\t%0,%2,%1"; }
+ [(set_attr "type" "arith")
+ (set_attr "mode" "<MODE>")])
+
+(define_insn "lea_d<mode>"
+ [(set (match_operand:P 0 "register_operand" "=r")
+ (plus:P (ashift:P (match_operand:P 1 "register_operand" " r")
+ (const_int 3))
+ (match_operand:P 2 "register_operand" " r")))]
+ "TARGET_XANDESPERF"
+ { return "nds.lea.d\t%0,%2,%1"; }
+ [(set_attr "type" "arith")
+ (set_attr "mode" "<MODE>")])
So these seem effectively the same as the shNadd insns in bitmanip.md.
Though I guess you don't expect folks to turn on both Zba and Andes Perf
at the same time. So it probably works.
+
+;;
+;; ....................
+;;
+;; String Extension
+;;
+;; ....................
+;;
+
+(define_insn "riscv_nds_ffb<mode>"
+ [(set (match_operand:GPR 0 "register_operand" "=r")
+ (unspec:GPR [(match_operand:GPR 1 "reg_or_0_operand" "rJ")
+ (match_operand:GPR 2 "nonmemory_operand" "rJ")]
UNSPEC_NDS_FFB))]
+ ""
+ "nds.ffb\t%0, %z1, %z2"
+ [(set_attr "mode" "<MODE>")
+ (set_attr "type" "arith")])
+
+(define_insn "riscv_nds_ffzmism<mode>"
+ [(set (match_operand:GPR 0 "register_operand" "=r")
+ (unspec:GPR [(match_operand:GPR 1 "reg_or_0_operand" "rJ")
+ (match_operand:GPR 2 "reg_or_0_operand" "rJ")]
UNSPEC_NDS_FFZMISM))]
+ ""
+ "nds.ffzmism\t%0, %z1, %z2"
+ [(set_attr "mode" "<MODE>")
+ (set_attr "type" "arith")])
+
+(define_insn "riscv_nds_ffmism<mode>"
+ [(set (match_operand:GPR 0 "register_operand" "=r")
+ (unspec:GPR [(match_operand:GPR 1 "reg_or_0_operand" "rJ")
+ (match_operand:GPR 2 "reg_or_0_operand" "rJ")]
UNSPEC_NDS_FFMISM))]
+ ""
+ "nds.ffmism\t%0, %z1, %z2"
+ [(set_attr "mode" "<MODE>")
+ (set_attr "type" "arith")])
+
+(define_insn "riscv_nds_flmism<mode>"
+ [(set (match_operand:GPR 0 "register_operand" "=r")
+ (unspec:GPR [(match_operand:GPR 1 "reg_or_0_operand" "rJ")
+ (match_operand:GPR 2 "reg_or_0_operand" "rJ")]
UNSPEC_NDS_FLMISM))]
+ ""
+ "nds.flmism\t%0, %z1, %z2"
+ [(set_attr "mode" "<MODE>")
+ (set_attr "type" "arith")])
What are the semantics of these UNSPECs? Do they match existing RTL
code, and if so wouldn't it be better to use the existing RTL codes?
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4428f0e96d2..7736c6b963fa 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3928,6 +3928,10 @@ riscv_extend_cost (rtx op, bool unsigned_p)
if (MEM_P (op))
return 0;
+ /* Andes bfo patterns. */
+ if (TARGET_XANDESPERF)
+ return COSTS_N_INSNS (1);
Note that you're going to return 1 for any extension here. That would
include extensions of memory operands and arbitrary RTL. One of the
quirks about the cost interface is you're not guaranteed to be presented
with RTL that actually matches any backend pattern.
@@ -4216,6 +4220,12 @@ riscv_rtx_costs (rtx x, machine_mode mode, int
outer_code, int opno ATTRIBUTE_UN
return false;
case AND:
+ /* Andes bfo patterns. */
+ if (TARGET_XANDESPERF && GET_CODE (XEXP (x, 0)) == ASHIFT)
+ {
+ *total = COSTS_N_INSNS (1);
+ return true;
+ }
Similarly here. You're potentially going to have arbitrary operands for
X in this code. That may ultimately be OK. But you need to be aware.
@@ -4470,6 +4493,15 @@ riscv_rtx_costs (rtx x, machine_mode mode, int
outer_code, int opno ATTRIBUTE_UN
} while (false);
}
+ /* Andes lea patterns. */
+ if (TARGET_XANDESPERF &&
+ ((TARGET_64BIT && GET_CODE (XEXP (x, 0)) == AND)
+ || GET_CODE (XEXP (x, 0)) == ASHIFT))
Formatting nit. The "&&" belongs on the next line and indentation needs
to be fixed appropriately.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c6661f5646b3..823e2cb859c5 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1891,6 +1891,7 @@
(zero_extend:DI
(match_operand:SI 1 "nonimmediate_operand" " r,m")))]
"TARGET_64BIT && !TARGET_ZBA && !TARGET_XTHEADBB && !TARGET_XTHEADMEMIDX
+ && !TARGET_XANDESPERF
&& !(REG_P (operands[1]) && VL_REG_P (REGNO (operands[1])))"
So we've got a blob of conditional code that's getting repeated over and
over.
I'd suggest refactoring a bit to avoid the repeated code duplication. A
new #define for the common parts of these tests, or a new function seems
appropriate.
diff --git a/gcc/config/riscv/vector-iterators.md
b/gcc/config/riscv/vector-iterators.md
index e60e3a8399ce..33300a010a1c 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -4307,7 +4307,7 @@
(umax "%3,%4")
(mult "%3,%4")])
-(define_code_attr sz [(sign_extend "s") (zero_extend "z")])
+(define_code_attr sz [(sign_extend "s") (zero_extend "z") (sign_extract "s")
(zero_extract "z")])
;; VLS modes that has NUNITS < 32.
(define_mode_iterator VLS_AVL_IMM [
diff --git a/gcc/testsuite/gcc.target/riscv/xandesperf-1.c
b/gcc/testsuite/gcc.target/riscv/xandesperf-1.c
new file mode 100644
index 000000000000..503991416b55
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xandesperf-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_xandesperf" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xandesperf" { target { rv32 } } } */
I think we usually specify the -mabi arguments in this cases as well.
We should consider if we want a distinct directory for the Andes
extensions.
Overall it looks pretty good, but does need a bit more minor work.
jeff