[gcc r15-1654] [committed][RISC-V] Fix expected output for thead store pair test

2024-06-26 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:03a3dffa43145f80548d32b266b9b87be07b52ee

commit r15-1654-g03a3dffa43145f80548d32b266b9b87be07b52ee
Author: Jeff Law 
Date:   Wed Jun 26 06:59:26 2024 -0600

[committed][RISC-V] Fix expected output for thead store pair test

Surya's patch to IRA has improved the code we generate for one of the thead
store pair tests for both rv32 and rv64.  This patch adjusts the expectations
of that test.

I've verified that the test now passes on rv32 and rv64 in my tester.  Pushing
to the trunk.

gcc/testsuite
* gcc.target/riscv/xtheadmempair-3.c: Update expected output.

Diff:
---
 gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c b/gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c
index 5dec702819a..99a6ae7f4d7 100644
--- a/gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c
@@ -17,13 +17,11 @@ void bar (xlen_t, xlen_t, xlen_t, xlen_t, xlen_t, xlen_t, xlen_t, xlen_t);
 void baz (xlen_t a, xlen_t b, xlen_t c, xlen_t d, xlen_t e, xlen_t f, xlen_t g, xlen_t h)
 {
   foo (a, b, c, d, e, f, g, h);
-  /* RV64: We don't use 0(sp), therefore we can only get 3 mempairs.  */
-  /* RV32: We don't use 0(sp)-8(sp), therefore we can only get 2 mempairs.  */
   bar (a, b, c, d, e, f, g, h);
 }
 
-/* { dg-final { scan-assembler-times "th.ldd\t" 3 { target { rv64 } } } } */
-/* { dg-final { scan-assembler-times "th.sdd\t" 3 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.ldd\t" 4 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.sdd\t" 4 { target { rv64 } } } } */
 
-/* { dg-final { scan-assembler-times "th.lwd\t" 2 { target { rv32 } } } } */
-/* { dg-final { scan-assembler-times "th.swd\t" 2 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "th.lwd\t" 4 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "th.swd\t" 4 { target { rv32 } } } } */


[gcc r15-1655] [committed] Remove compromised sh test

2024-06-26 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:47b68cda2c4afe32e84c5f18da0196c39e5e0edf

commit r15-1655-g47b68cda2c4afe32e84c5f18da0196c39e5e0edf
Author: Jeff Law 
Date:   Wed Jun 26 07:20:29 2024 -0600

[committed] Remove compromised sh test

Surya's recent patch to IRA improves the code for sh/pr54602-1.c slightly.
Specifically it's able to eliminate a save/restore in the prologue/epilogue and
a bit of register shuffling.

As a result there literally aren't any insns that can be used to fill the delay
slot of the return, so a nop gets emitted and the test fails.

Given there literally aren't any insns to move into the delay slot, the best
course of action is to just drop the test.

gcc/testsuite
* gcc.target/sh/pr54602-1.c: Delete test.

Diff:
---
 gcc/testsuite/gcc.target/sh/pr54602-1.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/gcc/testsuite/gcc.target/sh/pr54602-1.c b/gcc/testsuite/gcc.target/sh/pr54602-1.c
deleted file mode 100644
index e7fb2a9a642..00000000000
--- a/gcc/testsuite/gcc.target/sh/pr54602-1.c
+++ /dev/null
@@ -1,14 +0,0 @@
-/* Verify that the delay slot is stuffed with register pop insns for normal
-   (i.e. not interrupt handler) function returns.  If everything goes as
-   expected we won't see any nop insns.  */
-/* { dg-do compile }  */
-/* { dg-options "-O1" } */
-/* { dg-final { scan-assembler-not "nop" } } */
-
-int test00 (int a, int b);
-
-int
-test01 (int a, int b, int c, int d)
-{
-  return test00 (a, b) + c;
-}


[gcc r15-1719] [committed] Fix mcore-elf regression after recent IRA change

2024-06-28 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:9fbbad9b6c6e7fa7eaf37552173f5b8b2958976b

commit r15-1719-g9fbbad9b6c6e7fa7eaf37552173f5b8b2958976b
Author: Jeff Law 
Date:   Fri Jun 28 18:36:50 2024 -0600

[committed] Fix mcore-elf regression after recent IRA change

So the recent IRA change exposed a bug in the mcore backend.

The mcore has a special instruction (xtrb3) which can zero extend a GPR into
R1.  It's useful because zextb requires a matching source/destination.
Unfortunately xtrb3 modifies CC.

The IRA changes twiddle register allocation such that we want to use xtrb3.
Unfortunately CC is live at the point where we want to use xtrb3 and clobbering
CC causes the test to fail.

Exposing the clobber in the expander and insn seems like the best path forward.
We could also drop the xtrb3 alternative, but that seems like it would hurt
codegen more than exposing the clobber.

The bitfield extraction patterns using xtrb look problematic as well, but I
didn't try to fix those.

This fixes the builtin-arith-overflow regressions and appears to fix
20010122-1.c as a side effect.

gcc/
* config/mcore/mcore.md  (zero_extendqihi2): Clobber CC in expander
and matching insn.
(zero_extendqisi2): Likewise.

Diff:
---
 gcc/config/mcore/mcore.md | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/config/mcore/mcore.md b/gcc/config/mcore/mcore.md
index d416ce24a97..432b89520d7 100644
--- a/gcc/config/mcore/mcore.md
+++ b/gcc/config/mcore/mcore.md
@@ -1057,15 +1057,17 @@
   [(set_attr "type" "load")])
 
 (define_expand "zero_extendqisi2"
-  [(set (match_operand:SI 0 "mcore_arith_reg_operand" "")
-   (zero_extend:SI (match_operand:QI 1 "general_operand" "")))]
+  [(parallel [(set (match_operand:SI 0 "mcore_arith_reg_operand" "")
+ (zero_extend:SI (match_operand:QI 1 "general_operand" "")))
+ (clobber (reg:CC 17))])]
   ""
   "") 
 
 ;; RBE: XXX: we don't recognize that the xtrb3 kills the CC register.
 (define_insn ""
   [(set (match_operand:SI 0 "mcore_arith_reg_operand" "=r,b,r")
-   (zero_extend:SI (match_operand:QI 1 "general_operand" "0,r,m")))]
+   (zero_extend:SI (match_operand:QI 1 "general_operand" "0,r,m")))
+   (clobber (reg:CC 17))]
   ""
   "@
zextb   %0
@@ -1091,15 +1093,17 @@
   [(set_attr "type" "load")])
 
 (define_expand "zero_extendqihi2"
-  [(set (match_operand:HI 0 "mcore_arith_reg_operand" "")
-   (zero_extend:HI (match_operand:QI 1 "general_operand" "")))]
+  [(parallel [(set (match_operand:HI 0 "mcore_arith_reg_operand" "")
+  (zero_extend:HI (match_operand:QI 1 "general_operand" "")))
+ (clobber (reg:CC 17))])]
   ""
   "") 
 
 ;; RBE: XXX: we don't recognize that the xtrb3 kills the CC register.
 (define_insn ""
   [(set (match_operand:HI 0 "mcore_arith_reg_operand" "=r,b,r")
-   (zero_extend:HI (match_operand:QI 1 "general_operand" "0,r,m")))]
+   (zero_extend:HI (match_operand:QI 1 "general_operand" "0,r,m")))
+   (clobber (reg:CC 17))]
   ""
   "@
zextb   %0
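
The failure mode described in the commit message (a flag-setting compare, then
a zero extend that silently overwrites CC, then a consumer of CC) can be
sketched with a toy model in C.  This is only an illustration: the `toy_*`
names are hypothetical, and the value written to CC by the clobbering form
("result != 0") is an assumption made for the sketch, not a statement about
what xtrb3 leaves in the flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy machine state: one condition-code bit and one result register.
   This is a hypothetical model, not a description of mcore.  */
struct toy_machine {
  bool cc;
  uint32_t r1;
};

/* Compare: sets CC when a != b.  */
static void toy_cmpne (struct toy_machine *m, uint32_t a, uint32_t b)
{
  m->cc = (a != b);
}

/* Stand-in for a zero extend that overwrites CC as a side effect.
   The value stored in CC is an assumption made for illustration.  */
static void toy_zext_clobbering (struct toy_machine *m, uint32_t src)
{
  m->r1 = src & 0xff;
  m->cc = (m->r1 != 0);
}

/* Stand-in for a zero extend that leaves CC untouched.  */
static void toy_zext_preserving (struct toy_machine *m, uint32_t src)
{
  m->r1 = src & 0xff;
}

/* Runs "compare; zero-extend" and returns the CC value that a later
   branch would observe.  */
bool toy_cc_after_sequence (uint32_t a, uint32_t b, uint32_t byte_src,
                            bool use_clobbering_form)
{
  struct toy_machine m = { false, 0 };
  toy_cmpne (&m, a, b);
  if (use_clobbering_form)
    toy_zext_clobbering (&m, byte_src);
  else
    toy_zext_preserving (&m, byte_src);
  return m.cc;
}
```

With `a != b` and a source whose low byte is zero, the preserving form leaves
CC set while the clobbering form clears it.  Declaring the clobber in the
pattern is what lets the register allocator and later passes see that CC does
not survive the instruction.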


[gcc r15-1723] [to-be-committed, RISC-V, V4] movmem for RISCV with V extension

2024-06-29 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:42946aa9b3228262e413481a3193bda85c20ef4b

commit r15-1723-g42946aa9b3228262e413481a3193bda85c20ef4b
Author: Sergei Lewis 
Date:   Sat Jun 29 14:34:31 2024 -0600

[to-be-committed,RISC-V,V4] movmem for RISCV with V extension

I hadn't updated my repo on the host where I handle email, so it picked
up the older version of this patch without the testsuite fix.  So, V4
with the testsuite option for lmul fixed.

--

And Sergei's movmem patch.  Just trivial testsuite adjustment for an
option name change and a whitespace fix from me.

I've spun this in my tester for rv32 and rv64.  I'll wait for pre-commit
CI before taking further action.

Just a reminder, this patch is designed to handle the case where we can
issue a single vector load/store which avoids all the complexities of
determining which direction to copy.

--

gcc/ChangeLog

* config/riscv/riscv.md (movmem): New expander.

gcc/testsuite/ChangeLog

PR target/112109
* gcc.target/riscv/rvv/base/movmem-1.c: New test

Diff:
---
 gcc/config/riscv/riscv.md  | 22 
 gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c | 60 ++
 2 files changed, 82 insertions(+)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ff37125e3f2..c0c960353eb 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2723,6 +2723,28 @@
 FAIL;
 })
 
+;; Inlining general memmove is a pessimisation: we can't avoid having to decide
+;; which direction to go at runtime, which is costly in instruction count
+;; however for situations where the entire move fits in one vector operation
+;; we can do all reads before doing any writes so we don't have to worry
+;; so generate the inline vector code in such situations
+;; nb. prefer scalar path for tiny memmoves.
+(define_expand "movmem"
+  [(parallel [(set (match_operand:BLK 0 "general_operand")
+   (match_operand:BLK 1 "general_operand"))
+(use (match_operand:P 2 "const_int_operand"))
+(use (match_operand:SI 3 "const_int_operand"))])]
+  "TARGET_VECTOR"
+{
+  if ((INTVAL (operands[2]) >= TARGET_MIN_VLEN / 8)
+   && (INTVAL (operands[2]) <= TARGET_MIN_VLEN)
+   && riscv_vector::expand_block_move (operands[0], operands[1],
+operands[2]))
+DONE;
+  else
+FAIL;
+})
+
 ;; Expand in-line code to clear the instruction cache between operand[0] and
 ;; operand[1].
 (define_expand "clear_cache"
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
new file mode 100644
index 00000000000..d9d4a70a392
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-O3 -mrvv-max-lmul=dynamic" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define MIN_VECTOR_BYTES (__riscv_v_min_vlen / 8)
+
+/* Tiny memmoves should not be vectorised.
+** f1:
+**  li\s+a2,\d+
+**  tail\s+memmove
+*/
+char *
+f1 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES - 1);
+}
+
+/* Vectorise+inline minimum vector register width with LMUL=1
+** f2:
+**  (
+**  vsetivli\s+zero,16,e8,m1,ta,ma
+**  |
+**  li\s+[ta][0-7],\d+
+**  vsetvli\s+zero,[ta][0-7],e8,m1,ta,ma
+**  )
+**  vle8\.v\s+v\d+,0\(a1\)
+**  vse8\.v\s+v\d+,0\(a0\)
+**  ret
+*/
+char *
+f2 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES);
+}
+
+/* Vectorise+inline up to LMUL=8
+** f3:
+**  li\s+[ta][0-7],\d+
+**  vsetvli\s+zero,[ta][0-7],e8,m8,ta,ma
+**  vle8\.v\s+v\d+,0\(a1\)
+**  vse8\.v\s+v\d+,0\(a0\)
+**  ret
+*/
+char *
+f3 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES * 8);
+}
+
+/* Don't vectorise if the move is too large for one operation
+** f4:
+**  li\s+a2,\d+
+**  tail\s+memmove
+*/
+char *
+f4 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES * 8 + 1);
+}
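
The reasoning in the expander comment (do all reads before any writes, so the
copy direction never matters) can be sketched in scalar C.  This is a rough
illustration, not the vector code GCC emits: the `sketch_*` names and the
128-byte staging buffer are hypothetical, and the window check simply restates
the expander condition with the minimum VLEN passed in as bits.

```c
#include <stddef.h>

/* Hypothetical bound standing in for "fits in one vector operation".  */
#define SKETCH_MAX_BYTES 128

/* Copy n bytes by loading everything first and storing afterwards.
   No store happens before the last load, so overlapping src and dst
   cannot corrupt the data, whichever way the regions overlap.
   Returns 0 on success, -1 if n does not fit the staging buffer.  */
int sketch_move_load_then_store (unsigned char *dst, const unsigned char *src,
                                 size_t n)
{
  unsigned char staged[SKETCH_MAX_BYTES];
  if (n > SKETCH_MAX_BYTES)
    return -1;
  for (size_t i = 0; i < n; i++)   /* all loads */
    staged[i] = src[i];
  for (size_t i = 0; i < n; i++)   /* all stores */
    dst[i] = staged[i];
  return 0;
}

/* Naive forward byte copy, for contrast: it re-reads bytes it has already
   overwritten when dst overlaps src from above.  */
void sketch_copy_forward (unsigned char *dst, const unsigned char *src,
                          size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i];
}

/* Size window from the expander condition, with min_vlen_bits in bits:
   min_vlen_bits / 8 <= length <= min_vlen_bits.  */
int sketch_length_in_window (long length, long min_vlen_bits)
{
  return length >= min_vlen_bits / 8 && length <= min_vlen_bits;
}
```

Assuming a minimum VLEN of 128 bits, the window admits lengths 16 through 128,
which lines up with the four test functions: f1 (15 bytes) and f4 (129 bytes)
fall back to the memmove call, while f2 (16) and f3 (128) are inlined.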


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add dg-remove-option for z* extensions

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:bf3011d4d1833a268a65800d5b6b5e6be800fda4

commit bf3011d4d1833a268a65800d5b6b5e6be800fda4
Author: Patrick O'Neill 
Date:   Mon Jun 24 12:06:15 2024 -0700

RISC-V: Add dg-remove-option for z* extensions

This introduces testsuite support infra for removing extensions.
Since z* extensions don't have ordering requirements the logic for
adding/removing those extensions has also been consolidated.

This fixes RVWMO compile testcases failing on Ztso targets by removing
the extension from the -march string.

gcc/ChangeLog:

* doc/sourcebuild.texi (dg-remove-option): Add documentation.
(dg-add-option): Add documentation for riscv_{a,zaamo,zalrsc,ztso}

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: Add dg-remove-options
for ztso.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-5.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-6.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-7.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-fence-5.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-load-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-load-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-load-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-store-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-store-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-store-compat-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-1.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-2.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-3.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-4.c: Ditto.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-5.c: Ditto.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-1.c: Replace manually
specified -march string with dg-add/remove-options directives.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-2.c: Ditto.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-3.c: Ditto.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-4.c: Ditto.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-5.c: Ditto.
* lib/target-supports-dg.exp: Add dg-remove-options.
* lib/target-supports.exp: Add dg-remove-options and consolidate z*
extension add/remove-option code.

Signed-off-by: Patrick O'Neill 
(cherry picked from commit 580c37f1ef7db8e7a398184eb8f5d7555124d30a)

Diff:
---
 gcc/doc/sourcebuild.texi   |  43 +
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c |   1 +
 .../riscv/amo/amo-table-a-6-compare-exchange-1.c   |   1 +
 .../riscv/amo/amo-table-a-6-compare-exchange-2.c   |   1 +
 .../riscv/amo/amo-table-a-6-compare-exchange-3.c   |   1 +
 .../riscv/amo/amo-table-a-6-compare-exchange-4.c   |   1 +
 .../riscv/amo/amo-table-a-6-compare-exchange-5.c   |   1 +
 .../riscv/amo/amo-table-a-6-compare-exchange-6.c   |   1 +
 .../riscv/amo/amo-table-a-6-compare-exchange-7.c   |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-fence-1.c   |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-fence-2.c   |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-fence-3.c   |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-fence-4.c   |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-fence-5.c   |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-load-1.c|   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-load-2.c|   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-load-3.c|   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-store-1.c   |   1 +
 .../gcc.target/riscv/amo/amo-table-a-6-store-2.c   |   1 +
 .../riscv/amo/amo-table-a-6-store-compat-3.c   

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [PATCH v2 2/3] RISC-V: setmem for RISCV with V extension

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:46289ca0be63cc7a395d01e2bcbf4f2a572249b1

commit 46289ca0be63cc7a395d01e2bcbf4f2a572249b1
Author: Sergei Lewis 
Date:   Mon Jun 24 14:20:14 2024 -0600

[PATCH v2 2/3] RISC-V: setmem for RISCV with V extension

This is primarily Sergei's work; my contributions were limited to
merging his expander with the one that's on the trunk, allowing a
non-constant value, and trivial testsuite adjustments due to option renaming.

I'm doing setmem first because it's the easiest.  The others will follow
soon enough.

I've tested this in my system, waiting on pre-commit CI to render its
verdict before moving forward.

gcc/ChangeLog

* config/riscv/riscv-protos.h (riscv_vector::expand_vec_setmem): New
function declaration.

* config/riscv/riscv-string.cc (riscv_vector::expand_vec_setmem): New
function: this generates an inline vectorised memory set, if and only if
we know the entire operation can be performed in a single vector store.

* config/riscv/riscv.md (setmem): Try riscv_vector::expand_vec_setmem
for constant lengths.  Do not require operand 2 to be a constant.

gcc/testsuite/ChangeLog

* gcc.target/riscv/rvv/base/setmem-1.c: New tests
* gcc.target/riscv/rvv/base/setmem-2.c: New tests
* gcc.target/riscv/rvv/base/setmem-3.c: New tests

(cherry picked from commit a424318d32103dde827e8507fa27d24d33407ec9)
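
The "single vector store" condition comes down to picking an LMUL large enough
for the whole length.  A standalone C sketch of the dynamic-LMUL path of
check_vectorise_memory_operation follows.  It is a simplified illustration
only: `sketch_pick_lmul` is a hypothetical name, the minimum VLEN is passed in
as a plain parameter, and the TARGET_VECTOR, stringop-strategy and fixed-LMUL
checks from the patch are not modelled.

```c
/* Returns the smallest LMUL in {1, 2, 4, 8} such that length_bytes fits in
   lmul * min_vlen_bits / 8 bytes, or 0 when the sketch would decline to
   vectorise (length too small, or too large even for LMUL=8).  */
int sketch_pick_lmul (long length_bytes, long min_vlen_bits)
{
  /* Tiny operations are left to the default expansion.  */
  if (length_bytes < min_vlen_bits / 8)
    return 0;

  /* Double LMUL until the whole operation fits or we run past 8.  */
  int lmul = 1;
  while (lmul <= 8 && length_bytes > (lmul * min_vlen_bits) / 8)
    lmul <<= 1;

  return lmul > 8 ? 0 : lmul;
}
```

Assuming a minimum VLEN of 128 bits, a 16-byte set selects LMUL=1, a 33-byte
set selects LMUL=4, and anything above 128 bytes is rejected.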

Diff:
---
 gcc/config/riscv/riscv-protos.h|   1 +
 gcc/config/riscv/riscv-string.cc   |  89 ++
 gcc/config/riscv/riscv.md  |   9 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/setmem-1.c | 103 +
 gcc/testsuite/gcc.target/riscv/rvv/base/setmem-2.c |  51 ++
 gcc/testsuite/gcc.target/riscv/rvv/base/setmem-3.c |  69 ++
 6 files changed, 320 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index d6473d0cd85..a3380d4250d 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -678,6 +678,7 @@ void expand_popcount (rtx *);
 void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false);
 bool expand_strcmp (rtx, rtx, rtx, rtx, unsigned HOST_WIDE_INT, bool);
 void emit_vec_extract (rtx, rtx, rtx);
+bool expand_vec_setmem (rtx, rtx, rtx);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
 enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 4702001bd9b..1ddebdcee3f 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -1516,4 +1516,93 @@ expand_strcmp (rtx result, rtx src1, rtx src2, rtx nbytes,
   return true;
 }
 
+/* Check we are permitted to vectorise a memory operation.
+   If so, return true and populate lmul_out.
+   Otherwise, return false and leave lmul_out unchanged.  */
+static bool
+check_vectorise_memory_operation (rtx length_in, HOST_WIDE_INT &lmul_out)
+{
+  /* If we either can't or have been asked not to vectorise, respect this.  */
+  if (!TARGET_VECTOR)
+return false;
+  if (!(stringop_strategy & STRATEGY_VECTOR))
+return false;
+
+  /* If we can't reason about the length, don't vectorise.  */
+  if (!CONST_INT_P (length_in))
+return false;
+
+  HOST_WIDE_INT length = INTVAL (length_in);
+
+  /* If it's tiny, default operation is likely better; maybe worth
+ considering fractional lmul in the future as well.  */
+  if (length < (TARGET_MIN_VLEN / 8))
+return false;
+
+  /* If we've been asked to use a specific LMUL,
+ check the operation fits and do that.  */
+  if (rvv_max_lmul != RVV_DYNAMIC)
+{
+  lmul_out = TARGET_MAX_LMUL;
+  return (length <= ((TARGET_MAX_LMUL * TARGET_MIN_VLEN) / 8));
+}
+
+  /* Find smallest lmul large enough for entire op.  */
+  HOST_WIDE_INT lmul = 1;
+  while ((lmul <= 8) && (length > ((lmul * TARGET_MIN_VLEN) / 8)))
+{
+  lmul <<= 1;
+}
+
+  if (lmul > 8)
+return false;
+
+  lmul_out = lmul;
+  return true;
+}
+
+/* Used by setmemdi in riscv.md.  */
+bool
+expand_vec_setmem (rtx dst_in, rtx length_in, rtx fill_value_in)
+{
+  HOST_WIDE_INT lmul;
+  /* Check we are able and allowed to vectorise this operation;
+ bail if not.  */
+  if (!check_vectorise_memory_operation (length_in, lmul))
+return false;
+
+  machine_mode vmode
+  = riscv_vector::get_vector_mode (QImode, BYTES_PER_RISCV_VECTOR * lmul)
+   .require ();
+  rtx dst_addr = copy_addr_to_reg (XEXP (dst_in, 0));
+  rtx dst = change_address (dst_in, vmode, dst_addr);
+
+  rtx fill_value = gen_reg_rtx (vmode);
+  rtx broadcast_ops[] = { fill_value, fill_value_in };
+
+  /* If the length is exactly vlmax for the selected mode, do that.
+ Otherwise, use a predicated store.  */
+  if (known_eq (GET_MODE_SIZE (vmode), I

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [committed][RISC-V] Fix some of the testsuite fallout from late-combine patch

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:d46ded6dc58a9d2705ddb37b9a4a33117d58c622

commit d46ded6dc58a9d2705ddb37b9a4a33117d58c622
Author: Jeff Law 
Date:   Mon Jun 24 23:22:21 2024 -0600

[committed][RISC-V] Fix some of the testsuite fallout from late-combine patch

This fixes most, but not all, of the testsuite fallout from the late-combine
patch.  Specifically in the vector space we're often able to eliminate a
broadcast of a scalar element across a vector.  That eliminates the vsetvl
related to the broadcast, but more importantly from the testsuite standpoint it
turns .vv forms into .vf or .vx forms.

There were two paths we could have taken here.  One was to accept .v*, ignoring
the actual register operands.  The other was to create new matches for the .vx
and .vf variants.  I selected the latter as I'd like us to know if the code to
avoid the broadcast regresses.

I'm pushing this through now so that we've got cleaner results and to prevent
duplicate work.  I've got patches for the rest of the testsuite fallout, but I
want to think about them a bit.

gcc/testsuite
* gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Adjust
expected test output after late-combine changes.
* gcc.target/riscv/rvv/autovec/binop/vadd-rv64gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vmul-rv64gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/binop/vsub-rv64gcv-nofm.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv64gcv.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax_zvfh-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin_zvfh-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Likewise.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: Likewise.

(cherry picked from commit 41ff74aa581ed38d04c46e6c8839eab48e1

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] ira: Scale save/restore costs of callee save registers with block frequency

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:6113813e19c6364bfc83fe22b218838b11ce3f38

commit 6113813e19c6364bfc83fe22b218838b11ce3f38
Author: Surya Kumari Jangala 
Date:   Tue Jun 25 08:37:49 2024 -0500

ira: Scale save/restore costs of callee save registers with block frequency

In assign_hard_reg(), when computing the costs of the hard registers, the
cost of saving/restoring a callee-save hard register in prolog/epilog is
taken into consideration. However, this cost is not scaled with the entry
block frequency. Without scaling, the cost of saving/restoring is quite
small and this can result in a callee-save register being chosen by
assign_hard_reg() even though there are free caller-save registers
available. Assigning a callee save register to a pseudo that is live
in the entire function and across a call will cause shrink wrap to fail.

2024-06-25  Surya Kumari Jangala  

gcc/
PR rtl-optimization/111673
* ira-color.cc (assign_hard_reg): Scale save/restore costs of
callee save registers with block frequency.

gcc/testsuite/
PR rtl-optimization/111673
* gcc.target/powerpc/pr111673.c: New test.

(cherry picked from commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b)
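
The effect of the scaling can be seen with a small arithmetic sketch in C that
mirrors the shape of the changed expression in assign_hard_reg().  The function
name and the numbers used below are hypothetical; in GCC the move costs come
from ira_memory_move_cost and the frequency from REG_FREQ_FROM_BB on the entry
block.

```c
/* Cost of saving and restoring a callee-save register in prologue/epilogue,
   following the structure of the patched expression:
     ((load + store) * saved_nregs / nregs - 1) * (optimize_size ? 1 : freq)
   All inputs are plain integers supplied by the caller.  */
int sketch_callee_save_cost (int load_cost, int store_cost, int saved_nregs,
                             int nregs, int entry_freq, int optimize_size)
{
  int unscaled = (load_cost + store_cost) * saved_nregs / nregs - 1;
  return unscaled * (optimize_size ? 1 : entry_freq);
}
```

With load and store costs of 4 each, one saved register and an assumed entry
frequency of 1000, the cost rises from 7 to 7000.  That puts the callee-save
register on the same scale as the other frequency-weighted costs, so a free
caller-save register is more likely to win.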

Diff:
---
 gcc/ira-color.cc|  4 +++-
 gcc/testsuite/gcc.target/powerpc/pr111673.c | 17 +
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
index b9ae32d1b4d..ca32a23a0c9 100644
--- a/gcc/ira-color.cc
+++ b/gcc/ira-color.cc
@@ -2178,7 +2178,9 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
add_cost = ((ira_memory_move_cost[mode][rclass][0]
 + ira_memory_move_cost[mode][rclass][1])
* saved_nregs / hard_regno_nregs (hard_regno,
- mode) - 1);
+ mode) - 1)
+  * (optimize_size ? 1 :
+ REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
cost += add_cost;
full_cost += add_cost;
  }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr111673.c b/gcc/testsuite/gcc.target/powerpc/pr111673.c
new file mode 100644
index 00000000000..e0c0f85460a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr111673.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -fdump-rtl-pro_and_epilogue" } */
+
+/* Verify there is an early return without the prolog and shrink-wrap
+   the function. */
+
+int f (int);
+int
+advance (int dz)
+{
+  if (dz > 0)
+return (dz + dz) * dz;
+  else
+return dz * f (dz);
+}
+
+/* { dg-final { scan-rtl-dump-times "Performing shrink-wrapping" 1 "pro_and_epilogue" } } */


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [PATCH v2 3/3] RISC-V: cmpmem for RISCV with V extension

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:6a975b9f6e819f4c0532593f15ff15ccaf3a616e

commit 6a975b9f6e819f4c0532593f15ff15ccaf3a616e
Author: Sergei Lewis 
Date:   Tue Jun 25 15:26:14 2024 -0600

[PATCH v2 3/3] RISC-V: cmpmem for RISCV with V extension

So this is the cmpmem patch from Sergei, updated for the trunk.

Updates included adjusting the existing cmpmemsi expander to
conditionally try expansion via vector.  And a minor testsuite
adjustment to turn off vector expansion in one test that is primarily
focused on vset optimization and ensuring we don't have extras.

I've spun this in my tester successfully and just want to see a clean
run through precommit CI before moving forward.

Jeff
gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_vector::expand_vec_cmpmem): New
function declaration.
* config/riscv/riscv-string.cc (riscv_vector::expand_vec_cmpmem): New
function.
* config/riscv/riscv.md (cmpmemsi): Try riscv_vector::expand_vec_cmpmem
for constant lengths.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/cmpmem-1.c: New codegen tests
* gcc.target/riscv/rvv/base/cmpmem-2.c: New execution tests
* gcc.target/riscv/rvv/base/cmpmem-3.c: New codegen tests
* gcc.target/riscv/rvv/base/cmpmem-4.c: New codegen tests
* gcc.target/riscv/rvv/autovec/vls/misalign-1.c: Turn off vector mem* and
str* handling.

(cherry picked from commit b1e828dd9694294de1ec71e319d32a6b30b087d8)
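
The strategy comment in expand_vec_cmpmem (load both blocks, build a mask of
differing bytes, find the first set bit, subtract the bytes at that offset) can
be walked through in scalar C.  This is a hedged sketch rather than the emitted
code: `sketch_vec_cmpmem` is a hypothetical name, and the 64-byte cap exists
only so the mask fits in one uint64_t.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compares n bytes of a and b using the mask-based strategy.
   Returns false (leaving *result untouched) when n exceeds the 64-byte cap
   of this sketch, mirroring an expander that declines and falls back.  */
bool sketch_vec_cmpmem (int *result, const unsigned char *a,
                        const unsigned char *b, size_t n)
{
  if (n > 64)
    return false;

  /* Mask of bytes that differ: bit i is set when a[i] != b[i].  */
  uint64_t mask = 0;
  for (size_t i = 0; i < n; i++)
    if (a[i] != b[i])
      mask |= (uint64_t) 1 << i;

  /* Offset of the first set bit, or 0 if no bit is set.  */
  size_t offset = 0;
  if (mask != 0)
    while (((mask >> offset) & 1) == 0)
      offset++;

  /* With no mismatch (or n == 0) the result is 0; otherwise it is the
     difference of the first mismatching bytes.  */
  *result = (mask == 0) ? 0 : (int) a[offset] - (int) b[offset];
  return true;
}
```

For equal blocks the mask is empty and the result is 0.  Otherwise the sign of
the result follows the first mismatching byte pair, which is the ordering
memcmp callers rely on.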

Diff:
---
 gcc/config/riscv/riscv-protos.h|   1 +
 gcc/config/riscv/riscv-string.cc   | 100 +
 gcc/config/riscv/riscv.md  |   7 +-
 .../gcc.target/riscv/rvv/autovec/vls/misalign-1.c  |   2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-1.c |  88 ++
 gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-2.c |  74 +++
 gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-3.c |  45 ++
 gcc/testsuite/gcc.target/riscv/rvv/base/cmpmem-4.c |  62 +
 8 files changed, 377 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index a3380d4250d..a8b76173fa0 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -679,6 +679,7 @@ void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false);
 bool expand_strcmp (rtx, rtx, rtx, rtx, unsigned HOST_WIDE_INT, bool);
 void emit_vec_extract (rtx, rtx, rtx);
 bool expand_vec_setmem (rtx, rtx, rtx);
+bool expand_vec_cmpmem (rtx, rtx, rtx, rtx);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
 enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 1ddebdcee3f..257a514d290 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -1605,4 +1605,104 @@ expand_vec_setmem (rtx dst_in, rtx length_in, rtx fill_value_in)
   return true;
 }
 
+/* Used by cmpmemsi in riscv.md.  */
+
+bool
+expand_vec_cmpmem (rtx result_out, rtx blk_a_in, rtx blk_b_in, rtx length_in)
+{
+  HOST_WIDE_INT lmul;
+  /* Check we are able and allowed to vectorise this operation;
+ bail if not.  */
+  if (!check_vectorise_memory_operation (length_in, lmul))
+return false;
+
+  /* Strategy:
+ load entire blocks at a and b into vector regs
+ generate mask of bytes that differ
+ find first set bit in mask
+ find offset of first set bit in mask, use 0 if none set
+ result is ((char*)a[offset] - (char*)b[offset])
+   */
+
+  machine_mode vmode
+  = riscv_vector::get_vector_mode (QImode, BYTES_PER_RISCV_VECTOR * lmul)
+ .require ();
+  rtx blk_a_addr = copy_addr_to_reg (XEXP (blk_a_in, 0));
+  rtx blk_a = change_address (blk_a_in, vmode, blk_a_addr);
+  rtx blk_b_addr = copy_addr_to_reg (XEXP (blk_b_in, 0));
+  rtx blk_b = change_address (blk_b_in, vmode, blk_b_addr);
+
+  rtx vec_a = gen_reg_rtx (vmode);
+  rtx vec_b = gen_reg_rtx (vmode);
+
+  machine_mode mask_mode = get_mask_mode (vmode);
+  rtx mask = gen_reg_rtx (mask_mode);
+  rtx mismatch_ofs = gen_reg_rtx (Pmode);
+
+  rtx ne = gen_rtx_NE (mask_mode, vec_a, vec_b);
+  rtx vmsops[] = { mask, ne, vec_a, vec_b };
+  rtx vfops[] = { mismatch_ofs, mask };
+
+  /* If the length is exactly vlmax for the selected mode, do that.
+ Otherwise, use a predicated store.  */
+
+  if (known_eq (GET_MODE_SIZE (vmode), INTVAL (length_in)))
+{
+  emit_move_insn (vec_a, blk_a);
+  emit_move_insn (vec_b, blk_b);
+  emit_vlmax_insn (code_for_pred_cmp (vmode), riscv_vector::COMPARE_OP,
+  vmsops);
+
+  emit_vlmax_insn (code_for_pred_ffs (mask_mode, Pmode),
+  riscv_vector::CPOP_OP, vfops);
+}
+  else
+{
+  if (!satisfies_constraint_K (length_in))
+ length_in = force_reg

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [committed][RISC-V] Fix expected output for thead store pair test

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:69251774e403bc643ed862c13ca54eb37e6cabb2

commit 69251774e403bc643ed862c13ca54eb37e6cabb2
Author: Jeff Law 
Date:   Wed Jun 26 06:59:26 2024 -0600

[committed][RISC-V] Fix expected output for thead store pair test

Surya's patch to IRA has improved the code we generate for one of the thead
store pair tests for both rv32 and rv64.  This patch adjusts the expectations
of that test.

I've verified that the test now passes on rv32 and rv64 in my tester.  Pushing
to the trunk.

gcc/testsuite
* gcc.target/riscv/xtheadmempair-3.c: Update expected output.

(cherry picked from commit 03a3dffa43145f80548d32b266b9b87be07b52ee)

Diff:
---
 gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c b/gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c
index 5dec702819a..99a6ae7f4d7 100644
--- a/gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-3.c
@@ -17,13 +17,11 @@ void bar (xlen_t, xlen_t, xlen_t, xlen_t, xlen_t, xlen_t, xlen_t, xlen_t);
 void baz (xlen_t a, xlen_t b, xlen_t c, xlen_t d, xlen_t e, xlen_t f, xlen_t g, xlen_t h)
 {
   foo (a, b, c, d, e, f, g, h);
-  /* RV64: We don't use 0(sp), therefore we can only get 3 mempairs.  */
-  /* RV32: We don't use 0(sp)-8(sp), therefore we can only get 2 mempairs.  */
   bar (a, b, c, d, e, f, g, h);
 }
 
-/* { dg-final { scan-assembler-times "th.ldd\t" 3 { target { rv64 } } } } */
-/* { dg-final { scan-assembler-times "th.sdd\t" 3 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.ldd\t" 4 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.sdd\t" 4 { target { rv64 } } } } */
 
-/* { dg-final { scan-assembler-times "th.lwd\t" 2 { target { rv32 } } } } */
-/* { dg-final { scan-assembler-times "th.swd\t" 2 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "th.lwd\t" 4 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "th.swd\t" 4 { target { rv32 } } } } */


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Rename amo testcases

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:0fe00385dd91ba9483bca688f5950390b70b0acd

commit 0fe00385dd91ba9483bca688f5950390b70b0acd
Author: Patrick O'Neill 
Date:   Tue Jun 25 14:14:16 2024 -0700

RISC-V: Rename amo testcases

Rename riscv/amo/ testcases to follow a
'{ext}-{model}-{name}-{memory order}.c' naming convention.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-load-2.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-load-1.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-load-3.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-compat-3.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-1.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-2.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-2.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-1.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-3.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-3.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-1.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-2.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-release.c: ...here.
* gcc.target/riscv/amo/amo-zaamo-preferred-over-zalrsc.c: Move to...
* gcc.target/riscv/amo/zaamo-preferred-over-zalrsc.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-6.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-7.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-6.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: ...here.
   

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Consolidate amo testcase variants

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:001dc95c8c47f6ff0ddac5c962f5049473bdb9ef

commit 001dc95c8c47f6ff0ddac5c962f5049473bdb9ef
Author: Patrick O'Neill 
Date:   Tue Jun 25 14:14:17 2024 -0700

RISC-V: Consolidate amo testcase variants

Many riscv/amo/ testcases use check-function-bodies. These testcases can be
consolidated with related testcases (memory ordering variants) without
affecting the assertions.

Give functions descriptive names so testsuite failures are obvious from the
'FAIL:' line.
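
As a rough illustration of the consolidation, a single file can hold one
descriptively named function per memory order.  The function names and the
demo helper below are hypothetical, and the check-function-bodies patterns
are left out because the expected assembly is not shown in this message.

```c
#include <assert.h>

/* Hypothetical consolidated test body: one function per memory order, so a
   'FAIL:' line would name the failing variant directly.  */
void atomic_add_fetch_int_relaxed (int *bar, int baz)
{
  __atomic_add_fetch (bar, baz, __ATOMIC_RELAXED);
}

void atomic_add_fetch_int_acquire (int *bar, int baz)
{
  __atomic_add_fetch (bar, baz, __ATOMIC_ACQUIRE);
}

void atomic_add_fetch_int_release (int *bar, int baz)
{
  __atomic_add_fetch (bar, baz, __ATOMIC_RELEASE);
}

void atomic_add_fetch_int_acq_rel (int *bar, int baz)
{
  __atomic_add_fetch (bar, baz, __ATOMIC_ACQ_REL);
}

void atomic_add_fetch_int_seq_cst (int *bar, int baz)
{
  __atomic_add_fetch (bar, baz, __ATOMIC_SEQ_CST);
}

/* Demo only: every variant performs the same addition; the variants differ
   in ordering, not in the arithmetic result.  */
void amo_add_demo (void)
{
  int v = 0;
  atomic_add_fetch_int_relaxed (&v, 1);
  atomic_add_fetch_int_seq_cst (&v, 2);
  assert (v == 3);
}
```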

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-1.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-2.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-3.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-4.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-5.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-5.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-1.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-2.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-3.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-4.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-5.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-5.c: Removed.
* gcc.target/riscv/amo/a-rvwmo-fence.c: New test.
* gcc.target/riscv/amo/a-ztso-fence.c: New test.
* gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: New test.
* gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: New test.
* gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c: New test.
* gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c: New test.

Signed-off-by: Patrick O'Neill 
(cherry picked from commit aa89e86f70ac65e2d51f33ac45849d05a4f30524)

Diff:
---
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-fence.c | 56 
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-fence.c  | 52 +++
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c | 17 -
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c | 17 -
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c | 17 -
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c | 17 -
 .../gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c | 17 -
 .../gcc.target/riscv/amo/amo-table-a-6-fence-1.c   | 15 -
 .../gcc.target/riscv/amo/amo-table-a-6-fence-2.c   | 16 -
 .../gcc.target/riscv/amo/amo-table-a-6-fence-3.c   | 16 -
 .../gcc.target/riscv/amo/amo-table-a-6-fence-4.c   | 16 -
 .../gcc.target/riscv/amo/amo-table-a-6-fence-5.c   | 16 -
 .../riscv/amo/amo-table-ztso-amo-add-1.c   | 17 -
 .../riscv/amo/amo-table-ztso-amo-add-2.c   | 17 -
 .../riscv/amo/amo-table-ztso-amo-add-3.c   | 17 -
 .../riscv/amo/amo-table-ztso-amo-add-4.c   | 17 -
 .../riscv/amo/amo-table-ztso-amo-add-5.c   | 17 -
 .../gcc.target/riscv/amo/amo-table-ztso-fence-1.c  | 15 -
 .../gcc.target/riscv/amo/amo-table-ztso-fence-2.c  | 15 -
 .../gcc.target/riscv/amo/amo-table-ztso-fence-3.c  | 15 -
 .../gcc.target/riscv/amo/amo-table-ztso-fence-4.c  | 15 -
 .../gcc.target/riscv/amo/amo-table-ztso-fence-5.c  | 16 -
 .../gcc.target/riscv/amo/amo-zalrsc-amo-add-1.c| 22 --
 .../gcc.target/riscv/amo/amo-zalrsc-amo-add-2.c| 22 --
 .../gcc.target/riscv/amo/amo-zalrsc-amo-add-3.c| 22 --
 .../gcc.target/riscv/amo/amo-zalrsc-amo-add-4.c| 22 --
 .../gcc.target/riscv/amo/amo-zalrsc-amo-add-5.c| 22 --
 .../gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c | 57 
 .../gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c  | 57 
 .../riscv/amo/zalrsc-rvwmo-amo-add-int.c   | 78 ++
 .../gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c | 78 ++
 31 files changed, 378 insertions(+), 435 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-fence.c 
b

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Update testcase comments to point to PSABI rather than Table A.6

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:09b41e93787a07e6dc3b6db35dea35d84b7481b1

commit 09b41e93787a07e6dc3b6db35dea35d84b7481b1
Author: Patrick O'Neill 
Date:   Tue Jun 25 14:14:18 2024 -0700

RISC-V: Update testcase comments to point to PSABI rather than Table A.6

Table A.6 was originally the source of truth for the recommended mappings.
Point to the PSABI doc since the memory model mappings have been moved there.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/a-rvwmo-fence.c: Replace A.6 reference with PSABI.
* gcc.target/riscv/amo/a-rvwmo-load-acquire.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-release.c: Ditto.
* gcc.target/riscv/amo/a-ztso-fence.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-acquire.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-release.c: Ditto.
* gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: Ditto.
* gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-seq-cst.c: Ditto.

Signed-off-by: Patrick O'Neill 
(cherry picked from commit 86a3dbeb6c6a36f8cf97c66cef83c9bc3ad82027)

Diff:
---
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-fence.c | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-load-acquire.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-load-relaxed.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c  | 3 ++-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-store-relaxed.c | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-rvwmo-store-release.c | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-fence.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-load-acquire.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-load-relaxed.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-load-seq-cst.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c   | 3 ++-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-store-relaxed.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/a-ztso-store-release.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c   | 2 +-
 gcc/te

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for vector truncate after .SAT_SUB

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:86b900f6761bb4265d3d30b900c366d1817e0f7d

commit 86b900f6761bb4265d3d30b900c366d1817e0f7d
Author: Pan Li 
Date:   Mon Jun 24 22:25:57 2024 +0800

RISC-V: Add testcases for vector truncate after .SAT_SUB

This patch adds test cases for vector truncation after .SAT_SUB, i.e.:

  #define DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T)                   \
  void __attribute__((noinline))                                       \
  vec_sat_u_sub_trunc_##OUT_T##_fmt_1 (OUT_T *out, IN_T *op_1, IN_T y, \
                                       unsigned limit)                 \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        IN_T x = op_1[i];                                              \
        out[i] = (OUT_T)(x >= y ? x - y : 0);                          \
      }                                                                \
  }

The following 3 cases are included.

DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint8_t, uint16_t)
DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint16_t, uint32_t)
DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint32_t, uint64_t)
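
For a concrete feel of what these functions compute, here is a small sketch
that instantiates the macro for the uint8_t/uint16_t case.  The macro is the
one quoted above; the demo function and its input values are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T)                   \
void __attribute__((noinline))                                       \
vec_sat_u_sub_trunc_##OUT_T##_fmt_1 (OUT_T *out, IN_T *op_1, IN_T y, \
                                     unsigned limit)                 \
{                                                                    \
  unsigned i;                                                        \
  for (i = 0; i < limit; i++)                                        \
    {                                                                \
      IN_T x = op_1[i];                                              \
      out[i] = (OUT_T)(x >= y ? x - y : 0);                          \
    }                                                                \
}

DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint8_t, uint16_t)

/* Demo only (hypothetical helper): saturate first, then truncate.  */
void vec_sat_u_sub_trunc_demo (void)
{
  uint16_t in[4] = { 3, 10, 300, 1000 };
  uint8_t out[4];

  vec_sat_u_sub_trunc_uint8_t_fmt_1 (out, in, 10, 4);

  assert (out[0] == 0);    /* 3 < 10, saturates to 0.  */
  assert (out[1] == 0);    /* 10 - 10.  */
  assert (out[2] == 34);   /* 290 truncated to 8 bits.  */
  assert (out[3] == 222);  /* 990 truncated to 8 bits.  */
}
```

The subtraction saturates at 0 in the wide input type, and only the low bits
of that result survive the cast to the narrower output type.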

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add helper
test macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-3.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit b55798c0fc5cb02512b58502961d8425fb60588f)

Diff:
---
 .../riscv/rvv/autovec/binop/vec_sat_arith.h| 19 ++
 .../rvv/autovec/binop/vec_sat_binary_scalar.h  | 27 
 .../rvv/autovec/binop/vec_sat_u_sub_trunc-1.c  | 21 ++
 .../rvv/autovec/binop/vec_sat_u_sub_trunc-2.c  | 21 ++
 .../rvv/autovec/binop/vec_sat_u_sub_trunc-3.c  | 21 ++
 .../rvv/autovec/binop/vec_sat_u_sub_trunc-run-1.c  | 74 ++
 .../rvv/autovec/binop/vec_sat_u_sub_trunc-run-2.c  | 74 ++
 .../rvv/autovec/binop/vec_sat_u_sub_trunc-run-3.c  | 74 ++
 8 files changed, 331 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
index d5c81fbe5a9..a3116033fb3 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
@@ -310,4 +310,23 @@ vec_sat_u_sub_##T##_fmt_10 (T *out, T *op_1, T *op_2, unsigned limit) \
 #define RUN_VEC_SAT_U_SUB_FMT_10(T, out, op_1, op_2, N) \
   vec_sat_u_sub_##T##_fmt_10(out, op_1, op_2, N)
 
+/**/
+/* Saturation Sub Truncated (Unsigned and Signed) */
+/**/
+#define DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T)   \
+void __attribute__((noinline))   \
+vec_sat_u_sub_trunc_##OUT_T##_fmt_1 (OUT_T *out, IN_T *op_1, IN_T y, \
+unsigned limit) \
+{\
+  unsigned i;\
+  for (i = 0; i < limit; i++)\
+{\
+  IN_T x = op_1[i];  \
+  out[i] = (OUT_T)(x >= y ? x - y : 0);  \
+}\
+}
+
+#define RUN_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T, out, op_1, y, N) \
+  vec_sat_u_sub_trunc_##OUT_T##_fmt_1(out, op_1, y, N)
+
 #endif
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h
new file mode 100644
index 000..c79b180054e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h
@@ -0,0 +1,27 @@
+#ifndef HAVE_DEFINED_VEC_SAT_BINARY_SCALAR
+#define HAVE_DEFINED_VEC_SAT_BINARY_SCALAR
+
+int
+main ()

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [to-be-committed, RISC-V, V4] movmem for RISCV with V extension

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:5bcf397a0c5a4a23e89af43fec6bd53c1faea3ee

commit 5bcf397a0c5a4a23e89af43fec6bd53c1faea3ee
Author: Sergei Lewis 
Date:   Sat Jun 29 14:34:31 2024 -0600

[to-be-committed,RISC-V,V4] movmem for RISCV with V extension

I hadn't updated my repo on the host where I handle email, so it picked
up the older version of this patch without the testsuite fix.  So, V4
with the testsuite option for lmul fixed.

--

And Sergei's movmem patch.  Just trivial testsuite adjustment for an
option name change and a whitespace fix from me.

I've spun this in my tester for rv32 and rv64.  I'll wait for pre-commit
CI before taking further action.

Just a reminder, this patch is designed to handle the case where we can
issue a single vector load/store which avoids all the complexities of
determining which direction to copy.
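
A plain C sketch of why the single load/store case needs no direction check:
if every byte is read before any byte is written, overlapping source and
destination cannot corrupt each other.  CHUNK_BYTES and move_one_chunk are
hypothetical names, and the temporary buffer merely stands in for a vector
register group; this is not the code the expander generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the capacity of one vector register group.  */
#define CHUNK_BYTES 16

/* Move n <= CHUNK_BYTES bytes with overlap allowed: all reads happen
   before any write, so the copy direction never matters.  */
void
move_one_chunk (char *dst, const char *src, size_t n)
{
  unsigned char tmp[CHUNK_BYTES];

  assert (n <= CHUNK_BYTES);
  memcpy (tmp, src, n);   /* "vector load": read everything first.  */
  memcpy (dst, tmp, n);   /* "vector store": then write everything.  */
}
```

Once a move no longer fits in one such chunk, reads and writes interleave,
and a general memmove has to pick a direction at run time.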

--

gcc/ChangeLog

* config/riscv/riscv.md (movmem): New expander.

gcc/testsuite/ChangeLog

PR target/112109
* gcc.target/riscv/rvv/base/movmem-1.c: New test

(cherry picked from commit 42946aa9b3228262e413481a3193bda85c20ef4b)

Diff:
---
 gcc/config/riscv/riscv.md  | 22 
 gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c | 60 ++
 2 files changed, 82 insertions(+)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ff37125e3f2..c0c960353eb 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2723,6 +2723,28 @@
 FAIL;
 })
 
+;; Inlining general memmove is a pessimisation: we can't avoid having to decide
+;; which direction to go at runtime, which is costly in instruction count
+;; however for situations where the entire move fits in one vector operation
+;; we can do all reads before doing any writes so we don't have to worry
+;; so generate the inline vector code in such situations
+;; nb. prefer scalar path for tiny memmoves.
+(define_expand "movmem"
+  [(parallel [(set (match_operand:BLK 0 "general_operand")
+   (match_operand:BLK 1 "general_operand"))
+(use (match_operand:P 2 "const_int_operand"))
+(use (match_operand:SI 3 "const_int_operand"))])]
+  "TARGET_VECTOR"
+{
+  if ((INTVAL (operands[2]) >= TARGET_MIN_VLEN / 8)
+   && (INTVAL (operands[2]) <= TARGET_MIN_VLEN)
+   && riscv_vector::expand_block_move (operands[0], operands[1],
+operands[2]))
+DONE;
+  else
+FAIL;
+})
+
 ;; Expand in-line code to clear the instruction cache between operand[0] and
 ;; operand[1].
 (define_expand "clear_cache"
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
new file mode 100644
index 000..d9d4a70a392
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-O3 -mrvv-max-lmul=dynamic" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define MIN_VECTOR_BYTES (__riscv_v_min_vlen / 8)
+
+/* Tiny memmoves should not be vectorised.
+** f1:
+**  li\s+a2,\d+
+**  tail\s+memmove
+*/
+char *
+f1 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES - 1);
+}
+
+/* Vectorise+inline minimum vector register width with LMUL=1
+** f2:
+**  (
+**  vsetivli\s+zero,16,e8,m1,ta,ma
+**  |
+**  li\s+[ta][0-7],\d+
+**  vsetvli\s+zero,[ta][0-7],e8,m1,ta,ma
+**  )
+**  vle8\.v\s+v\d+,0\(a1\)
+**  vse8\.v\s+v\d+,0\(a0\)
+**  ret
+*/
+char *
+f2 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES);
+}
+
+/* Vectorise+inline up to LMUL=8
+** f3:
+**  li\s+[ta][0-7],\d+
+**  vsetvli\s+zero,[ta][0-7],e8,m8,ta,ma
+**  vle8\.v\s+v\d+,0\(a1\)
+**  vse8\.v\s+v\d+,0\(a0\)
+**  ret
+*/
+char *
+f3 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES * 8);
+}
+
+/* Don't vectorise if the move is too large for one operation
+** f4:
+**  li\s+a2,\d+
+**  tail\s+memmove
+*/
+char *
+f4 (char *a, char const *b)
+{
+  return __builtin_memmove (a, b, MIN_VECTOR_BYTES * 8 + 1);
+}
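
The size window exercised by f1 through f4 can be restated as a small
predicate.  This is a sketch of the range check in the movmem expander only;
the function name is hypothetical, and the real expander additionally
requires riscv_vector::expand_block_move to succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical restatement of the expander's size window: at least one
   minimum-width vector register (min_vlen_bits / 8 bytes) and at most
   eight of them (LMUL=8, i.e. min_vlen_bits bytes).  */
bool
movmem_size_in_window (long n_bytes, long min_vlen_bits)
{
  return n_bytes >= min_vlen_bits / 8 && n_bytes <= min_vlen_bits;
}

/* Demo only: the four test sizes, assuming a minimum VLEN of 128 bits.  */
void
movmem_window_demo (void)
{
  assert (!movmem_size_in_window (15, 128));   /* f1: too small.  */
  assert (movmem_size_in_window (16, 128));    /* f2: LMUL=1.  */
  assert (movmem_size_in_window (128, 128));   /* f3: LMUL=8.  */
  assert (!movmem_size_in_window (129, 128));  /* f4: too large.  */
}
```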


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for unsigned scalar .SAT_ADD IMM form 1

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:170e8b651c138b37c5d86569a5acf909e137142c

commit 170e8b651c138b37c5d86569a5acf909e137142c
Author: Pan Li 
Date:   Sun Jun 30 16:03:41 2024 +0800

RISC-V: Add testcases for unsigned scalar .SAT_ADD IMM form 1

This patch adds test cases for the unsigned scalar .SAT_ADD IMM form 1,
i.e.:

Form 1:
  #define DEF_SAT_U_ADD_IMM_FMT_1(T)       \
  T __attribute__((noinline))              \
  sat_u_add_imm_##T##_fmt_1 (T x)          \
  {                                        \
    return (T)(x + 9) >= x ? (x + 9) : -1; \
  }

DEF_SAT_U_ADD_IMM_FMT_1(uint64_t)
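
To see the saturation behaviour concretely, the sketch below instantiates
the two-argument (T, IMM) variant of the macro that the sat_arith.h hunk in
this patch adds, using uint8_t and 9.  The demo function is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEF_SAT_U_ADD_IMM_FMT_1(T, IMM)      \
T __attribute__((noinline))                  \
sat_u_add_imm##IMM##_##T##_fmt_1 (T x)       \
{                                            \
  return (T)(x + IMM) >= x ? (x + IMM) : -1; \
}

DEF_SAT_U_ADD_IMM_FMT_1(uint8_t, 9)

/* Demo only (hypothetical helper): behaviour around the boundary.  */
void sat_u_add_imm9_demo (void)
{
  assert (sat_u_add_imm9_uint8_t_fmt_1 (0) == 9);
  assert (sat_u_add_imm9_uint8_t_fmt_1 (246) == 255);  /* Exact fit.  */
  assert (sat_u_add_imm9_uint8_t_fmt_1 (247) == 255);  /* Wraps, clamps.  */
  assert (sat_u_add_imm9_uint8_t_fmt_1 (255) == 255);
}
```

The wrapped sum being smaller than x is the overflow signal; returning -1
converts to the all-ones maximum of the unsigned type.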

The following test passes with this patch:
* The rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add helper test macro.
* gcc.target/riscv/sat_u_add_imm-1.c: New test.
* gcc.target/riscv/sat_u_add_imm-2.c: New test.
* gcc.target/riscv/sat_u_add_imm-3.c: New test.
* gcc.target/riscv/sat_u_add_imm-4.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-1.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-2.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-3.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-4.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit ed213b384fdca9375c3ec53c2a0eae134fb98612)

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h | 10 +
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-1.c   | 19 +
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-2.c   | 21 ++
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-3.c   | 18 +
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-4.c   | 17 
 .../gcc.target/riscv/sat_u_add_imm-run-1.c | 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-2.c | 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-3.c | 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-4.c | 46 ++
 9 files changed, 269 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index 0c2e44af718..4ec4ec36cc1 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -60,6 +60,16 @@ sat_u_add_##T##_fmt_6 (T x, T y)\
 #define RUN_SAT_U_ADD_FMT_5(T, x, y) sat_u_add_##T##_fmt_5(x, y)
 #define RUN_SAT_U_ADD_FMT_6(T, x, y) sat_u_add_##T##_fmt_6(x, y)
 
+#define DEF_SAT_U_ADD_IMM_FMT_1(T, IMM)  \
+T __attribute__((noinline))  \
+sat_u_add_imm##IMM##_##T##_fmt_1 (T x)   \
+{\
+  return (T)(x + IMM) >= x ? (x + IMM) : -1; \
+}
+
+#define RUN_SAT_U_ADD_IMM_FMT_1(T, x, IMM, expect) \
+  if (sat_u_add_imm##IMM##_##T##_fmt_1(x) != expect) __builtin_abort ()
+
 /**/
 /* Saturation Sub (Unsigned and Signed) */
 /**/
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-1.c b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-1.c
new file mode 100644
index 000..14e9b7595a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_imm9_uint8_t_fmt_1:
+** addi\s+[atx][0-9]+,\s*a0,\s*9
+** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** andi\s+a0,\s*a0,\s*0xff
+** ret
+*/
+DEF_SAT_U_ADD_IMM_FMT_1(uint8_t, 9)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-2.c b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-2.c
new file mode 100644
index 000..c1a3c6ff21d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_imm3_uint16_t_fmt_1:
+** addi\s+[atx][0-9]+,\s*a0,\s*3
+** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** slli\s+a0,\s*a0,\s*48
+** srli\s+a0,\s*a0,\s*48
+** ret
+*/
+DEF_SAT_U_ADD_IMM_FMT_1(uint16_t, 3)
+
+/* { dg-final { scan-rtl-du

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for unsigned scalar .SAT_ADD IMM form 2

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:0d154f4ef5020b85879e1013fa49ac6d1a7af62e

commit 0d154f4ef5020b85879e1013fa49ac6d1a7af62e
Author: Pan Li 
Date:   Sun Jun 30 16:14:38 2024 +0800

RISC-V: Add testcases for unsigned scalar .SAT_ADD IMM form 2

This patch adds test cases for the unsigned scalar .SAT_ADD IMM form 2,
i.e.:

Form 2:
  #define DEF_SAT_U_ADD_IMM_FMT_2(T)      \
  T __attribute__((noinline))             \
  sat_u_add_imm_##T##_fmt_2 (T x)         \
  {                                       \
    return (T)(x + 9) < x ? -1 : (x + 9); \
  }

DEF_SAT_U_ADD_IMM_FMT_2(uint64_t)
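
Form 2 is form 1 with the comparison inverted and the arms swapped, so both
should compute the same function.  The sketch below checks that exhaustively
for uint16_t with an immediate of 3, using the two-argument (T, IMM) macros
from sat_arith.h; the comparison helper is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEF_SAT_U_ADD_IMM_FMT_1(T, IMM)      \
T __attribute__((noinline))                  \
sat_u_add_imm##IMM##_##T##_fmt_1 (T x)       \
{                                            \
  return (T)(x + IMM) >= x ? (x + IMM) : -1; \
}

#define DEF_SAT_U_ADD_IMM_FMT_2(T, IMM)     \
T __attribute__((noinline))                 \
sat_u_add_imm##IMM##_##T##_fmt_2 (T x)      \
{                                           \
  return (T)(x + IMM) < x ? -1 : (x + IMM); \
}

DEF_SAT_U_ADD_IMM_FMT_1(uint16_t, 3)
DEF_SAT_U_ADD_IMM_FMT_2(uint16_t, 3)

/* Hypothetical helper: true if both forms agree on every uint16_t input.  */
bool sat_u_add_imm3_forms_agree (void)
{
  for (uint32_t v = 0; v <= UINT16_MAX; v++)
    if (sat_u_add_imm3_uint16_t_fmt_1 ((uint16_t) v)
        != sat_u_add_imm3_uint16_t_fmt_2 ((uint16_t) v))
      return false;
  assert (sat_u_add_imm3_uint16_t_fmt_2 (65534) == 65535);
  return true;
}
```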

The following test passes with this patch:
* The rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add helper test macro.
* gcc.target/riscv/sat_u_add_imm-5.c: New test.
* gcc.target/riscv/sat_u_add_imm-6.c: New test.
* gcc.target/riscv/sat_u_add_imm-7.c: New test.
* gcc.target/riscv/sat_u_add_imm-8.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-5.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-6.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-7.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-8.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit bff0d025aff8efaa5d991fcd13dd9876b115dc94)

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h | 10 +
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-5.c   | 19 +
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-6.c   | 21 ++
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-7.c   | 18 +
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-8.c   | 17 
 .../gcc.target/riscv/sat_u_add_imm-run-5.c | 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-6.c | 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-7.c | 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-8.c | 46 ++
 9 files changed, 269 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index 4ec4ec36cc1..d94f0fd602c 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -67,9 +67,19 @@ sat_u_add_imm##IMM##_##T##_fmt_1 (T x)   \
   return (T)(x + IMM) >= x ? (x + IMM) : -1; \
 }
 
+#define DEF_SAT_U_ADD_IMM_FMT_2(T, IMM) \
+T __attribute__((noinline)) \
+sat_u_add_imm##IMM##_##T##_fmt_2 (T x)  \
+{   \
+  return (T)(x + IMM) < x ? -1 : (x + IMM); \
+}
+
 #define RUN_SAT_U_ADD_IMM_FMT_1(T, x, IMM, expect) \
   if (sat_u_add_imm##IMM##_##T##_fmt_1(x) != expect) __builtin_abort ()
 
+#define RUN_SAT_U_ADD_IMM_FMT_2(T, x, IMM, expect) \
+  if (sat_u_add_imm##IMM##_##T##_fmt_2(x) != expect) __builtin_abort ()
+
 /**/
 /* Saturation Sub (Unsigned and Signed) */
 /**/
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-5.c b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-5.c
new file mode 100644
index 000..19b502db6c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-5.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_imm9_uint8_t_fmt_2:
+** addi\s+[atx][0-9]+,\s*a0,\s*9
+** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** andi\s+a0,\s*a0,\s*0xff
+** ret
+*/
+DEF_SAT_U_ADD_IMM_FMT_2(uint8_t, 9)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-6.c b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-6.c
new file mode 100644
index 000..0317370b67e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-6.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_imm3_uint16_t_fmt_2:
+** addi\s+[atx][0-9]+,\s*a0,\s*3
+** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** slli\s+a0,\s*a0,\s*48
+** srli\s+a0,\s*a0,\s*48
+** ret
+*/
+DEF_SAT_U_ADD_IMM_FMT_2(

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for unsigned scalar .SAT_ADD IMM form 3

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:1ee1caef0b826e95dc96e4a751d86981bdcd8ee2

commit 1ee1caef0b826e95dc96e4a751d86981bdcd8ee2
Author: Pan Li 
Date:   Sun Jun 30 16:41:16 2024 +0800

RISC-V: Add testcases for unsigned scalar .SAT_ADD IMM form 3

This patch adds test cases for the unsigned scalar
.SAT_ADD IMM form 3, namely:

Form 3:
  #define DEF_SAT_U_ADD_IMM_FMT_3(T)                       \
  T __attribute__((noinline))                              \
  sat_u_add_imm_##T##_fmt_3 (T x)                          \
  {                                                        \
    T ret;                                                 \
    return __builtin_add_overflow (x, 8, &ret) ? -1 : ret; \
  }

DEF_SAT_U_ADD_IMM_FMT_3(uint64_t)
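
For illustration only, here is a standalone sketch of what form 3 computes
when instantiated for uint8_t with an immediate of 9.  The names
sat_u_add_imm9_u8 and check_form3 are made up for this sketch and are not
part of sat_arith.h; __builtin_add_overflow is the GCC/Clang builtin used
by the macro above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone instance of form 3 for uint8_t and IMM = 9.
   __builtin_add_overflow stores the wrapped sum in *ret and returns true
   when the exact sum does not fit in the result type.  */
static uint8_t
sat_u_add_imm9_u8 (uint8_t x)
{
  uint8_t ret;
  return __builtin_add_overflow (x, 9, &ret) ? -1 : ret;
}

/* Spot checks: results clamp at UINT8_MAX instead of wrapping around.  */
void
check_form3 (void)
{
  assert (sat_u_add_imm9_u8 (0) == 9);
  assert (sat_u_add_imm9_u8 (246) == 255);   /* largest input that fits */
  assert (sat_u_add_imm9_u8 (247) == 255);   /* first input that saturates */
  assert (sat_u_add_imm9_u8 (255) == 255);
}
```

Calling check_form3 () from a small driver is enough to exercise the
boundary between the wrapping and saturating ranges.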

The following test passes with this patch:
* The rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add helper test macro.
* gcc.target/riscv/sat_u_add_imm-10.c: New test.
* gcc.target/riscv/sat_u_add_imm-11.c: New test.
* gcc.target/riscv/sat_u_add_imm-12.c: New test.
* gcc.target/riscv/sat_u_add_imm-9.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-10.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-11.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-12.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-9.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit 6d98e88f61f9b2e6864775ce390e9ce0a1359624)

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h | 11 ++
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-10.c  | 21 ++
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-11.c  | 18 +
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-12.c  | 17 
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-9.c   | 19 +
 .../gcc.target/riscv/sat_u_add_imm-run-10.c| 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-11.c| 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-12.c| 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-9.c | 46 ++
 9 files changed, 270 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index d94f0fd602c..83b294db476 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -74,12 +74,23 @@ sat_u_add_imm##IMM##_##T##_fmt_2 (T x)  \
   return (T)(x + IMM) < x ? -1 : (x + IMM); \
 }
 
+#define DEF_SAT_U_ADD_IMM_FMT_3(T, IMM)\
+T __attribute__((noinline))\
+sat_u_add_imm##IMM##_##T##_fmt_3 (T x) \
+{  \
+  T ret;   \
+  return __builtin_add_overflow (x, IMM, &ret) ? -1 : ret; \
+}
+
 #define RUN_SAT_U_ADD_IMM_FMT_1(T, x, IMM, expect) \
   if (sat_u_add_imm##IMM##_##T##_fmt_1(x) != expect) __builtin_abort ()
 
 #define RUN_SAT_U_ADD_IMM_FMT_2(T, x, IMM, expect) \
   if (sat_u_add_imm##IMM##_##T##_fmt_2(x) != expect) __builtin_abort ()
 
+#define RUN_SAT_U_ADD_IMM_FMT_3(T, x, IMM, expect) \
+  if (sat_u_add_imm##IMM##_##T##_fmt_3(x) != expect) __builtin_abort ()
+
 
/**/
 /* Saturation Sub (Unsigned and Signed)   
*/
 
/**/
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-10.c b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-10.c
new file mode 100644
index 000..24cdd267cca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-10.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_imm3_uint16_t_fmt_3:
+** addi\s+[atx][0-9]+,\s*a0,\s*3
+** slli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** srli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*48
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** slli\s+a0,\s*a0,\s*48
+** srli\s+a0,\s*a0,\s*48
+** ret
+*/
+DEF_SAT_U_ADD_IMM_FMT_3(uint16_t, 3)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-11.c b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-11.c
new file mode 100644
index 000..f30e2405a0d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-11.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details 
-fno-schedule-insns -fno

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for unsigned scalar .SAT_ADD IMM form 4

2024-07-02 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:f97a257c3f83ea9490139800793b2352f74eb275

commit f97a257c3f83ea9490139800793b2352f74eb275
Author: Pan Li 
Date:   Sun Jun 30 16:48:19 2024 +0800

RISC-V: Add testcases for unsigned scalar .SAT_ADD IMM form 4

This patch adds test cases for the unsigned scalar
.SAT_ADD IMM form 4, namely:

Form 4:
  #define DEF_SAT_U_ADD_IMM_FMT_4(T)                            \
  T __attribute__((noinline))                                   \
  sat_u_add_imm_##T##_fmt_4 (T x)                               \
  {                                                             \
    T ret;                                                      \
    return __builtin_add_overflow (x, 9, &ret) == 0 ? ret : -1; \
  }

DEF_SAT_U_ADD_IMM_FMT_4(uint64_t)
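
As a rough sanity check of the idea, the sketch below instantiates form 4
for uint8_t with an immediate of 9 and compares it exhaustively against
the form 2 expression that already lives in sat_arith.h.  The function
names are invented for this sketch; only the two expressions come from
the header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical instance of form 4 for uint8_t and IMM = 9: the overflow
   flag is compared against 0, so the arms of the conditional swap
   relative to form 3.  */
static uint8_t
sat_add9_fmt_4 (uint8_t x)
{
  uint8_t ret;
  return __builtin_add_overflow (x, 9, &ret) == 0 ? ret : -1;
}

/* The form 2 expression from sat_arith.h, instantiated the same way.  */
static uint8_t
sat_add9_fmt_2 (uint8_t x)
{
  return (uint8_t) (x + 9) < x ? -1 : (x + 9);
}

/* Return the number of uint8_t inputs on which the two forms disagree.  */
static int
count_mismatches (void)
{
  int mismatches = 0;
  for (int x = 0; x <= UINT8_MAX; x++)
    if (sat_add9_fmt_4 ((uint8_t) x) != sat_add9_fmt_2 ((uint8_t) x))
      mismatches++;
  return mismatches;
}

/* Both forms describe the same saturating add, so no input may differ.  */
void
check_form4 (void)
{
  assert (count_mismatches () == 0);
  assert (sat_add9_fmt_4 (250) == 255);
}
```

The exhaustive loop is only practical for the narrow types; for uint64_t
the committed run tests rely on selected boundary values instead.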

The following test passes with this patch:
* The rv64gcv regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add helper test macro.
* gcc.target/riscv/sat_u_add_imm-13.c: New test.
* gcc.target/riscv/sat_u_add_imm-14.c: New test.
* gcc.target/riscv/sat_u_add_imm-15.c: New test.
* gcc.target/riscv/sat_u_add_imm-16.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-13.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-14.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-15.c: New test.
* gcc.target/riscv/sat_u_add_imm-run-16.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit 7a65ab6b5f38d3018ffd456f278a9fd885487a27)

Diff:
---
 gcc/testsuite/gcc.target/riscv/sat_arith.h | 11 ++
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-13.c  | 19 +
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-14.c  | 21 ++
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-15.c  | 18 +
 gcc/testsuite/gcc.target/riscv/sat_u_add_imm-16.c  | 17 
 .../gcc.target/riscv/sat_u_add_imm-run-13.c| 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-14.c| 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-15.c| 46 ++
 .../gcc.target/riscv/sat_u_add_imm-run-16.c| 46 ++
 9 files changed, 270 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/sat_arith.h b/gcc/testsuite/gcc.target/riscv/sat_arith.h
index 83b294db476..75442c94dc1 100644
--- a/gcc/testsuite/gcc.target/riscv/sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/sat_arith.h
@@ -82,6 +82,14 @@ sat_u_add_imm##IMM##_##T##_fmt_3 (T x) \
   return __builtin_add_overflow (x, IMM, &ret) ? -1 : ret; \
 }
 
+#define DEF_SAT_U_ADD_IMM_FMT_4(T, IMM) \
+T __attribute__((noinline)) \
+sat_u_add_imm##IMM##_##T##_fmt_4 (T x)  \
+{   \
+  T ret;\
+  return __builtin_add_overflow (x, IMM, &ret) == 0 ? ret : -1; \
+}
+
 #define RUN_SAT_U_ADD_IMM_FMT_1(T, x, IMM, expect) \
   if (sat_u_add_imm##IMM##_##T##_fmt_1(x) != expect) __builtin_abort ()
 
@@ -91,6 +99,9 @@ sat_u_add_imm##IMM##_##T##_fmt_3 (T x) \
 #define RUN_SAT_U_ADD_IMM_FMT_3(T, x, IMM, expect) \
   if (sat_u_add_imm##IMM##_##T##_fmt_3(x) != expect) __builtin_abort ()
 
+#define RUN_SAT_U_ADD_IMM_FMT_4(T, x, IMM, expect) \
+  if (sat_u_add_imm##IMM##_##T##_fmt_4(x) != expect) __builtin_abort ()
+
 
/**/
 /* Saturation Sub (Unsigned and Signed)   
*/
 
/**/
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-13.c b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-13.c
new file mode 100644
index 000..a3b2679233c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-13.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -O3 -fdump-rtl-expand-details -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "sat_arith.h"
+
+/*
+** sat_u_add_imm9_uint8_t_fmt_4:
+** addi\s+[atx][0-9]+,\s*a0,\s*9
+** andi\s+[atx][0-9]+,\s*[atx][0-9]+,\s*0xff
+** sltu\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** neg\s+[atx][0-9]+,\s*[atx][0-9]+
+** or\s+[atx][0-9]+,\s*[atx][0-9]+,\s*[atx][0-9]+
+** andi\s+a0,\s*a0,\s*0xff
+** ret
+*/
+DEF_SAT_U_ADD_IMM_FMT_4(uint8_t, 9)
+
+/* { dg-final { scan-rtl-dump-times ".SAT_ADD " 2 "expand" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-14.c b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-14.c
new file mode 100644
index 000..968534b74da
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/sat_u_add_imm-14.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } 

[gcc r15-1823] [PATCH] ARC: Update gcc.target/arc/pr9001184797.c test

2024-07-03 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:c41eb4c702ed04993a475d5910c190af1ff66720

commit r15-1823-gc41eb4c702ed04993a475d5910c190af1ff66720
Author: Luis Silva 
Date:   Wed Jul 3 09:41:05 2024 -0600

[PATCH] ARC: Update gcc.target/arc/pr9001184797.c test

... to comply with new standards due to stricter analysis in
the latest GCC versions.

gcc/testsuite/ChangeLog:

* gcc.target/arc/pr9001184797.c: Fix compiler warnings.

Diff:
---
 gcc/testsuite/gcc.target/arc/pr9001184797.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arc/pr9001184797.c b/gcc/testsuite/gcc.target/arc/pr9001184797.c
index e76c6769042..6c5de5fe729 100644
--- a/gcc/testsuite/gcc.target/arc/pr9001184797.c
+++ b/gcc/testsuite/gcc.target/arc/pr9001184797.c
@@ -4,13 +4,15 @@
 
 /* This test studies the use of anchors and tls symbols. */
 
+extern int h();
+
 struct a b;
 struct a {
   long c;
   long d
 } e() {
   static __thread struct a f;
-  static __thread g;
+  static __thread int g;
   g = 5;
   h();
   if (f.c)


[gcc r15-1828] [committed] Fix previously latent bug in reorg affecting cris port

2024-07-03 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:e5f73853ae78d4e9ae434c707a12da1494459b24

commit r15-1828-ge5f73853ae78d4e9ae434c707a12da1494459b24
Author: Jeff Law 
Date:   Wed Jul 3 12:47:31 2024 -0600

[committed] Fix previously latent bug in reorg affecting cris port

The late-combine patch has triggered a previously latent bug in reorg.

Basically we have a sequence like this in the middle of reorg before we
start relaxing delay slots (cris-elf, gcc.dg/torture/pr98289.c):

> (insn 67 49 18 (sequence [
> (jump_insn 50 49 52 (set (pc)
> (if_then_else (ne (reg:CC 19 ccr)
> (const_int 0 [0]))
> (label_ref:SI 30)
> (pc))) "j.c":10:6 discrim 1 282 {*bnecc}
>  (expr_list:REG_DEAD (reg:CC 19 ccr)
> (int_list:REG_BR_PROB 7 (nil)))
>  -> 30)
> (insn/f 52 50 18 (set (mem:SI (reg/f:SI 14 sp) [1  S4 A8])
> (reg:SI 16 srp)) 37 {*mov_tomemsi}
>  (nil))
> ]) "j.c":10:6 discrim 1 -1
>  (nil))
>
> (note 18 67 54 [bb 3] NOTE_INSN_BASIC_BLOCK)
>
> (note 54 18 55 NOTE_INSN_EPILOGUE_BEG)
>
> (jump_insn 55 54 56 (return) "j.c":14:1 228 {*return_expanded}
>  (nil)
>  -> return)
>
> (barrier 56 55 43)
>
> (note 43 56 65 [bb 4] NOTE_INSN_BASIC_BLOCK)
>
> (note 65 43 30 NOTE_INSN_SWITCH_TEXT_SECTIONS)
>
> (code_label 30 65 8 5 6 (nil) [1 uses])
>
> (note 8 30 61 [bb 5] NOTE_INSN_BASIC_BLOCK)

At a high level there are two things to note.  First, insn 50
conditionally jumps around insn 55.  Second, there's a
SWITCH_TEXT_SECTIONS note between insn 50 and the target label for
insn 50 (code_label 30).

reorg sees the conditional jump around the unconditional jump/return and
will invert the jump and retarget the original jump to an appropriate
location.  In this case generating:

> (insn 67 49 18 (sequence [
> (jump_insn 50 49 52 (set (pc)
> (if_then_else (eq (reg:CC 19 ccr)
> (const_int 0 [0]))
> (label_ref:SI 68)
> (pc))) "j.c":10:6 discrim 1 281 {*beqcc}
>  (expr_list:REG_DEAD (reg:CC 19 ccr)
> (int_list:REG_BR_PROB 1073741831 (nil)))
>  -> 68)
> (insn/s/f 52 50 18 (set (mem:SI (reg/f:SI 14 sp) [1  S4 A8])
> (reg:SI 16 srp)) 37 {*mov_tomemsi}
>  (nil))
> ]) "j.c":10:6 discrim 1 -1
>  (nil))
>
> (note 18 67 54 [bb 3] NOTE_INSN_BASIC_BLOCK)
>
> (note 54 18 43 NOTE_INSN_EPILOGUE_BEG)
>
> (note 43 54 65 [bb 4] NOTE_INSN_BASIC_BLOCK)
>
> (note 65 43 8 NOTE_INSN_SWITCH_TEXT_SECTIONS)
>
> (note 8 65 61 [bb 5] NOTE_INSN_BASIC_BLOCK)
[ ... ]
Where the new target of the jump is a return statement later in the IL.

Note that we now have a SWITCH_TEXT_SECTIONS note that is not immediately
preceded by a BARRIER.  That triggers an assertion in the dwarf2 code.
Removal of the BARRIER is inherent in this optimization.

The fix is simple: we avoid this optimization when there's a
SWITCH_TEXT_SECTIONS note between the conditional jump insn and its target.
Thankfully we already have a routine to test for this in reorg, so we just
need to call it appropriately.  The other approach would be to drop the
note, which I considered and discarded.
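
To make the shape of the check concrete, here is a toy model of it.  This
is not GCC code: the toy_* types and functions are invented for this
sketch, the insn chain is a bare linked list rather than RTL, and the
real patch passes next_active_insn (next) as the end of the range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy insn kinds; real RTL has many more and a different representation.  */
enum toy_kind
{
  TOY_INSN,
  TOY_JUMP,
  TOY_BARRIER,
  TOY_LABEL,
  TOY_NOTE_SWITCH_TEXT_SECTIONS
};

struct toy_insn
{
  enum toy_kind kind;
  struct toy_insn *next;
};

/* Return true if a section-switch note lies strictly after BEG and before
   END.  The scan stops at END or at the end of the chain, whichever
   comes first.  */
static bool
toy_switch_sections_between_p (const struct toy_insn *beg,
                               const struct toy_insn *end)
{
  for (const struct toy_insn *p = beg->next; p != NULL && p != end;
       p = p->next)
    if (p->kind == TOY_NOTE_SWITCH_TEXT_SECTIONS)
      return true;
  return false;
}

/* The guard in toy form: only invert and retarget the conditional jump
   when no section switch sits between it and TARGET.  */
static bool
toy_may_retarget_p (const struct toy_insn *cond_jump,
                    const struct toy_insn *target)
{
  return !toy_switch_sections_between_p (cond_jump, target);
}

void
check_toy_reorg (void)
{
  /* cond jump -> return -> barrier -> section switch -> label  */
  struct toy_insn label = { TOY_LABEL, NULL };
  struct toy_insn note = { TOY_NOTE_SWITCH_TEXT_SECTIONS, &label };
  struct toy_insn barrier = { TOY_BARRIER, &note };
  struct toy_insn ret = { TOY_JUMP, &barrier };
  struct toy_insn cond = { TOY_JUMP, &ret };
  assert (!toy_may_retarget_p (&cond, &label));

  /* Without the note in between, the transformation stays allowed.  */
  barrier.next = &label;
  assert (toy_may_retarget_p (&cond, &label));
}
```

The first chain mirrors the cris dump: deleting the return and its
BARRIER would leave the section-switch note without a preceding BARRIER,
so the guard refuses the transformation.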

We don't have great coverage for delay slot targets.  I've tested arc, cris,
fr30, frv, h8, iq2000, microblaze, or1k, sh3 & visium in my tester as crosses
without new regressions, fixing one regression along the way.  Bootstrap &
regression testing on sh4 and hppa will take considerably longer.

gcc/

* reorg.cc (relax_delay_slots): Do not optimize a conditional
jump around an unconditional jump/return in the presence of
a text section switch.

Diff:
---
 gcc/reorg.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/reorg.cc b/gcc/reorg.cc
index 99228a22c69..633099ca765 100644
--- a/gcc/reorg.cc
+++ b/gcc/reorg.cc
@@ -3409,7 +3409,8 @@ relax_delay_slots (rtx_insn *first)
  && next && simplejump_or_return_p (next)
  && (next_active_insn (as_a (target_label))
  == next_active_insn (next))
- && no_labels_between_p (insn, next))
+ && no_labels_between_p (insn, next)
+ && !switch_text_sections_between_p (insn, next_active_insn (next)))
{
  rtx label = JUMP_LABEL (next);
  rtx old_label = JUMP_LABEL (delay_jump_insn);


[gcc r15-1834] [committed] Fix newlib build failure with rx as well as several dozen testsuite failures

2024-07-03 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:759f4abe1220a8202b8389f9b756c35b6c9c439d

commit r15-1834-g759f4abe1220a8202b8389f9b756c35b6c9c439d
Author: Jeff Law 
Date:   Wed Jul 3 21:11:07 2024 -0600

[committed] Fix newlib build failure with rx as well as several dozen testsuite failures

The rx port has been failing to build newlib for a bit over a week.  I can't
remember if it was the late-combine work or the IRA costing twiddle;
regardless, the real bug is in the rx backend.

Basically dwarf2cfi is blowing up because of inconsistent state caused by
the failure to mark a stack adjustment as frame related.  This instance in
the epilogue looks like a simple goof.

With the port building again, the testsuite would run and it showed a
number of regressions, again related to CFI handling.  The common thread was
a failure to mark a copy from FP to SP in the prologue as frame related.
The change which introduced this bug was supposed to just be changing
promotions of vector types.  It's unclear if Nick included the hunk
accidentally or just goof'd on the logic.  Regardless it looks quite
incorrect.

Reverting that hunk fixes the regressions *and* fixes 94 pre-existing
failures.

The net is rx-elf is regression free and has moved forward in terms of its
testsuite status.

Pushing to the trunk momentarily.

gcc/

* config/rx/rx.cc (rx_expand_prologue): Mark the copy from FP to SP
as frame related.
(rx_expand_epilogue): Mark the stack pointer adjustment as frame
related.

Diff:
---
 gcc/config/rx/rx.cc | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rx/rx.cc b/gcc/config/rx/rx.cc
index 8048cc98708..c84e1398aad 100644
--- a/gcc/config/rx/rx.cc
+++ b/gcc/config/rx/rx.cc
@@ -1845,8 +1845,7 @@ rx_expand_prologue (void)
gen_safe_add (stack_pointer_rtx, stack_pointer_rtx,
  GEN_INT (- (HOST_WIDE_INT) frame_size), true);
   else
-   gen_safe_add (stack_pointer_rtx, frame_pointer_rtx, NULL_RTX,
- false /* False because the epilogue will use the FP not the SP.  */);
+   gen_safe_add (stack_pointer_rtx, frame_pointer_rtx, NULL_RTX, true);
 }
 }
 
@@ -2119,7 +2118,7 @@ rx_expand_epilogue (bool is_sibcall)
   /* Cannot use the special instructions - deconstruct by hand.  */
   if (total_size)
gen_safe_add (stack_pointer_rtx, stack_pointer_rtx,
- GEN_INT (total_size), false);
+ GEN_INT (total_size), true);
 
   if (MUST_SAVE_ACC_REGISTER)
{


[gcc r15-1844] [committed][RISC-V] Fix test expectations after recent late-combine changes

2024-07-04 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:b611f3969249967d7f098c6adfcf5f701192a2d0

commit r15-1844-gb611f3969249967d7f098c6adfcf5f701192a2d0
Author: Jeff Law 
Date:   Thu Jul 4 09:25:20 2024 -0600

[committed][RISC-V] Fix test expectations after recent late-combine changes

With the recent DCE related adjustment to late-combine the
rvv/base/vcreate.c test no longer has those undesirable vmvNr statements.

It's a bit unclear why this wasn't written as a scan-assembler-not and
xfailed given the comment says we don't want to see vmvNr instructions.  I
must have missed that during review.

This patch adjusts the test to expect no vmvNr statements and if they're
ever re-introduced, we'll get a nice unexpected failure.

gcc/testsuite
* gcc.target/riscv/rvv/base/vcreate.c: Update expected output.

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c b/gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c
index 01006de7c81..1c7c154637e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c
@@ -256,6 +256,6 @@ test_vcreate_v_i64m2x4 (vint64m2_t v0, vint64m2_t v1, vint64m2_t v2,
 }
 
 // Ideally with O3, should find 0 instances of any vmvnr.v PR113913
-/* { dg-final { scan-assembler-times {vmv1r.v\s+v[0-9]+,\s*v[0-9]+} 72 } } */
-/* { dg-final { scan-assembler-times {vmv2r.v\s+v[0-9]+,\s*v[0-9]+} 36 } } */
-/* { dg-final { scan-assembler-times {vmv4r.v\s+v[0-9]+,\s*v[0-9]+} 16 } } */
+/* { dg-final { scan-assembler-not {vmv1r.v\s+v[0-9]+,\s*v[0-9]+} } } */
+/* { dg-final { scan-assembler-not {vmv2r.v\s+v[0-9]+,\s*v[0-9]+} } } */
+/* { dg-final { scan-assembler-not {vmv4r.v\s+v[0-9]+,\s*v[0-9]+} } } */


[gcc r15-1872] [committed] Fix various sh define_insn_and_split predicates

2024-07-06 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:cb9badea8be5396afe90f4c497e9f333cce1cb3f

commit r15-1872-gcb9badea8be5396afe90f4c497e9f333cce1cb3f
Author: Jeff Law 
Date:   Sat Jul 6 06:35:54 2024 -0600

[committed] Fix various sh define_insn_and_split predicates

The sh4-linux-gnu port has failed to bootstrap since the introduction of
late combine due to failures to split certain insns.

This is caused by incorrect predicates in various define_insn_and_split
patterns.  Essentially the insn's predicate is something like "TARGET_SH1".
The split predicate is "&& can_create_pseudo_p ()".  So these patterns will
match post-reload, but be un-splittable.  So at assembly output time, we get
the failure as the output template is "#".

This patch fixes the most obvious & egregious cases by bringing the split
condition into the insn's predicate and leaving "&& 1" as the split
condition.  That's enough to get sh4-linux-gnu bootstrapping again and I'm
hoping it does the same for sh4eb-linux-gnu.
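
The mismatch can be modelled in a few lines of C.  This is only a sketch:
toy_state and the functions below are invented for illustration, with two
booleans standing in for TARGET_SH1 and can_create_pseudo_p ().  It relies
on the documented rule that a split condition starting with "&&" is
appended to the insn condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the two facts the conditions look at.  */
struct toy_state
{
  bool target_sh1;        /* stands in for TARGET_SH1 */
  bool can_create_pseudo; /* stands in for can_create_pseudo_p () */
};

/* Before the patch: insn "TARGET_SH1", split "&& can_create_pseudo_p ()".  */
static bool
old_insn_ok (struct toy_state s)
{
  return s.target_sh1;
}

static bool
old_split_ok (struct toy_state s)
{
  return old_insn_ok (s) && s.can_create_pseudo;
}

/* After: insn "TARGET_SH1 && can_create_pseudo_p ()", split "&& 1".  */
static bool
new_insn_ok (struct toy_state s)
{
  return s.target_sh1 && s.can_create_pseudo;
}

static bool
new_split_ok (struct toy_state s)
{
  return new_insn_ok (s) && true;
}

/* With "#" as the output template the insn must be split before output,
   so "matches but cannot split" is the failure case.  */
static bool
old_stuck_p (struct toy_state s)
{
  return old_insn_ok (s) && !old_split_ok (s);
}

static bool
new_stuck_p (struct toy_state s)
{
  return new_insn_ok (s) && !new_split_ok (s);
}

void
check_split_conditions (void)
{
  struct toy_state before_reload = { true, true };
  struct toy_state after_reload = { true, false };
  assert (!old_stuck_p (before_reload));
  assert (old_stuck_p (after_reload));   /* the bootstrap failure */
  assert (!new_stuck_p (before_reload));
  assert (!new_stuck_p (after_reload));  /* insn no longer matches at all */
}
```

With the new conditions the pattern simply stops matching once pseudos
can no longer be created, so nothing can reach final with "#" as its
template through these patterns.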

Pushing to the trunk.

gcc/
* config/sh/sh.md (adddi3): Only allow matching when we can
still create new pseudos.
(subdi3, *rotcl, *rotcr, *rotcr_neg_t, negdi2): Likewise.
(abs2, negabs2, negdi_cond): Likewise.
(*swapbisi2_and_shl8, *swapbhisi2, *movsi_index_disp_load): Likewise.
(*movhi_index_disp_load, *movindex_disp_store): Likewise.
(*mov_t_msb_neg, *negt_msb, clipu_one): Likewise.

Diff:
---
 gcc/config/sh/sh.md | 100 ++--
 1 file changed, 50 insertions(+), 50 deletions(-)

diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
index 9491b49e55b..3e978254ab0 100644
--- a/gcc/config/sh/sh.md
+++ b/gcc/config/sh/sh.md
@@ -1542,9 +1542,9 @@
(plus:DI (match_operand:DI 1 "arith_reg_operand")
 (match_operand:DI 2 "arith_reg_operand")))
(clobber (reg:SI T_REG))]
-  "TARGET_SH1"
+  "TARGET_SH1 && can_create_pseudo_p ()"
   "#"
-  "&& can_create_pseudo_p ()"
+  "&& 1"
   [(const_int 0)]
 {
   emit_insn (gen_clrt ());
@@ -1934,9 +1934,9 @@
(minus:DI (match_operand:DI 1 "arith_reg_operand")
  (match_operand:DI 2 "arith_reg_operand")))
(clobber (reg:SI T_REG))]
-  "TARGET_SH1"
+  "TARGET_SH1 && can_create_pseudo_p ()"
   "#"
-  "&& can_create_pseudo_p ()"
+  "&& 1"
   [(const_int 0)]
 {
   emit_insn (gen_clrt ());
@@ -3174,9 +3174,9 @@
(and:SI (match_operand:SI 3 "arith_reg_or_t_reg_operand")
(const_int 1
(clobber (reg:SI T_REG))]
-  "TARGET_SH1"
+  "TARGET_SH1 && can_create_pseudo_p ()"
   "#"
-  "&& can_create_pseudo_p ()"
+  "&& 1"
   [(const_int 0)]
 {
   gcc_assert (INTVAL (operands[2]) > 0);
@@ -3259,9 +3259,9 @@
   (match_operand:SI 2 "const_int_operand"))
(match_operand 3 "treg_set_expr")))
(clobber (reg:SI T_REG))]
-  "TARGET_SH1"
+  "TARGET_SH1 && can_create_pseudo_p ()"
   "#"
-  "&& can_create_pseudo_p ()"
+  "&& 1"
   [(parallel [(set (match_dup 0)
   (ior:SI (ashift:SI (match_dup 1) (match_dup 2))
   (and:SI (match_dup 3) (const_int 1
@@ -3278,9 +3278,9 @@
(ashift:SI (match_operand:SI 2 "arith_reg_operand")
   (match_operand:SI 3 "const_int_operand"
(clobber (reg:SI T_REG))]
-  "TARGET_SH1"
+  "TARGET_SH1 && can_create_pseudo_p ()"
   "#"
-  "&& can_create_pseudo_p ()"
+  "&& 1"
   [(parallel [(set (match_dup 0)
   (ior:SI (ashift:SI (match_dup 2) (match_dup 3))
   (and:SI (match_dup 1) (const_int 1
@@ -3293,9 +3293,9 @@
(lshiftrt:SI (match_operand:SI 3 "arith_reg_operand")
 (const_int 31
(clobber (reg:SI T_REG))]
-  "TARGET_SH1"
+  "TARGET_SH1 && can_create_pseudo_p ()"
   "#"
-  "&& can_create_pseudo_p ()"
+  "&& 1"
   [(parallel [(set (match_dup 0)
   (ior:SI (ashift:SI (match_dup 1) (match_dup 2))
   (and:SI (reg:SI T_REG) (const_int 1
@@ -3312,9 +3312,9 @@
(ashift:SI (match_operand:SI 1 "arith_reg_operand")
   (match_operand:SI 2 "const_int_operand"
(clobber (reg:SI T_REG))]
-  "TARGET_SH1"
+  "TARGET_SH1 && can_create_pseudo_p ()"
   "#"
-  "&& can_create_pseudo_p ()"
+  "&& 1"
   [(parallel [(set (match_dup 0)
   (ior:SI (ashift:SI (match_dup 1) (match_dup 2))
   (and:SI (reg:SI T_REG) (const_int 1
@@ -3332,9 +3332,9 @@
 (const_int 1)
 (match_operand 4 "const_int_operand"
(clobber (reg:SI T_REG))]
-  "TARGET_SH1"
+  "TARGET_SH1 && can_create_pseudo_p ()"
   "#"
-  "&& can_create_pseudo_p ()"
+  "&& 1"
   [(parallel [(set (match_dup 0)
   (ior:SI (ashift:SI (match_dup 1) (match_du

[gcc r15-1874] [to-be-committed][v3][RISC-V] Handle bit manipulation of SImode values

2024-07-06 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:273f16a125c4fab664683376ae04a9a31e7d6a22

commit r15-1874-g273f16a125c4fab664683376ae04a9a31e7d6a22
Author: Jeff Law 
Date:   Sat Jul 6 12:57:59 2024 -0600

[to-be-committed][v3][RISC-V] Handle bit manipulation of SImode values

Last patch in this round of bitmanip work...  At least I think I'm going to
pause here and switch gears to other projects that need attention 🙂

This patch introduces the ability to generate bitmanip instructions for rv64
when operating on SI objects when we know something about the range of the
bit position (due to masking of the position).

I've got a note that the (7-pos % 8) bit position form was discovered by RAU
in 500.perl.  I took that and expanded it to the simple (pos & mask) form as
well as covering bset, binv and bclr.

As far as the implementation is concerned:

First, this turns the recently added define_splits into define_insn_and_split
constructs.  This allows combine to "see" enough RTL to realize a sign
extension is unnecessary.  Otherwise we get undesirable sign extensions for
the new testcases.

Second, it adds new patterns for the logical operations: two patterns for
IOR/XOR and two patterns for AND.

I think a key concept to keep in mind is that once we determine a Zbs
operation is safe to perform on a SI value, we can rewrite the RTL in 64bit
form.  If we were ever to try and use range information at expand time for
this stuff (and we probably should investigate that), that's the path I'd
suggest.

This is notably cleaner than my original implementation which actually kept
the more complex RTL form through final and emitted 2/3 instructions (mask
the bit position, then the bset/bclr/binv).

Tested in my tester, but waiting for pre-commit CI to report back before
taking further action.

gcc/
* config/riscv/bitmanip.md (bset splitters): Turn into
define_insn_and_splits.
Don't depend on combine splitting the "andn with constant" form.
(bset, binv, bclr with masked bit position): New patterns.

gcc/testsuite
* gcc.target/riscv/binv-for-simode-1.c: New test.
* gcc.target/riscv/bset-for-simode-1.c: New test.
* gcc.target/riscv/bclr-for-simode-1.c: New test.

Diff:
---
 gcc/config/riscv/bitmanip.md   | 135 ++---
 gcc/testsuite/gcc.target/riscv/bclr-for-simode-1.c |  25 
 gcc/testsuite/gcc.target/riscv/binv-for-simode-1.c |  24 
 gcc/testsuite/gcc.target/riscv/bset-for-simode-1.c |  24 
 4 files changed, 192 insertions(+), 16 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 3eedabffca0..f403ba8dbba 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -615,37 +615,140 @@
 ;; shift constant.  With the limited range we know the SImode sign
 ;; bit is never set, thus we can treat this as zero extending and
 ;; generate the bsetdi_2 pattern.
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
(any_extend:DI
 (ashift:SI (const_int 1)
(subreg:QI
- (and:DI (not:DI (match_operand:DI 1 "register_operand"))
+ (and:DI (not:DI (match_operand:DI 1 "register_operand" 
"r"))
  (match_operand 2 "const_int_operand")) 0
-   (clobber (match_operand:DI 3 "register_operand"))]
+   (clobber (match_scratch:X 3 "=&r"))]
   "TARGET_64BIT
&& TARGET_ZBS
&& (TARGET_ZBB || TARGET_ZBKB)
&& (INTVAL (operands[2]) & 0x1f) != 0x1f"
-   [(set (match_dup 0) (and:DI (not:DI (match_dup 1)) (match_dup 2)))
-(set (match_dup 0) (zero_extend:DI (ashift:SI
-  (const_int 1)
-  (subreg:QI (match_dup 0) 0])
+  "#"
+  "&& reload_completed"
+   [(set (match_dup 3) (match_dup 2))
+(set (match_dup 3) (and:DI (not:DI (match_dup 1)) (match_dup 3)))
+(set (match_dup 0) (zero_extend:DI
+(ashift:SI (const_int 1) (match_dup 4]
+  { operands[4] = gen_lowpart (QImode, operands[3]); }
+  [(set_attr "type" "bitmanip")])
 
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
-   (any_extend:DI
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(any_extend:DI
 (ashift:SI (const_int 1)
(subreg:QI
- (and:DI (match_operand:DI 1 "register_operand")
+ (and:DI (match_operand:DI 1 "register_operand" "r")
  (match_operand 2 "const_int_operand")) 0]
   "TARGET_64BIT
&& TARGET_ZBS
&& (INTVAL (operands[2]) & 0x1f) != 0x1f"
-   [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 2)))
-(set (match

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Fix asm check failure for truncated after SAT_SUB

2024-07-07 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:b810058b87e322cf66e0130f6cfb04f08764b701

commit b810058b87e322cf66e0130f6cfb04f08764b701
Author: Pan Li 
Date:   Wed Jul 3 13:17:16 2024 +0800

RISC-V: Fix asm check failure for truncated after SAT_SUB

The asm check for truncation after SAT_SUB appears to be incorrect;
it should check for the vx form of vssubu rather than the vv form.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c:
Update vssubu check from vv to vx.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c:
Ditto.

Signed-off-by: Pan Li 
(cherry picked from commit ab3e3d2f0564c2eb0640de3f4d0a50e1fcc8c318)

Diff:
---
 .../gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c  | 2 +-
 .../gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c  | 2 +-
 .../gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c
index dd9e3999a29..1e380657d74 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c
@@ -11,7 +11,7 @@
 ** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e16,\s*m1,\s*ta,\s*ma
 ** ...
 ** vle16\.v\s+v[0-9]+,\s*0\([atx][0-9]+\)
-** vssubu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vssubu\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[atx][0-9]+
 ** vsetvli\s+zero,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma
 ** vncvt\.x\.x\.w\s+v[0-9]+,\s*v[0-9]+
 ** ...
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c
index 738d1465a01..d7b8931f0ec 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c
@@ -11,7 +11,7 @@
 ** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e32,\s*m1,\s*ta,\s*ma
 ** ...
 ** vle32\.v\s+v[0-9]+,\s*0\([atx][0-9]+\)
-** vssubu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vssubu\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[atx][0-9]+
 ** vsetvli\s+zero,\s*zero,\s*e16,\s*mf2,\s*ta,\s*ma
 ** vncvt\.x\.x\.w\s+v[0-9]+,\s*v[0-9]+
 ** ...
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c
index b008b21cf0c..edf42a1f776 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c
@@ -11,7 +11,7 @@
 ** vsetvli\s+[atx][0-9]+,\s*[atx][0-9]+,\s*e64,\s*m1,\s*ta,\s*ma
 ** ...
 ** vle64\.v\s+v[0-9]+,\s*0\([atx][0-9]+\)
-** vssubu\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+
+** vssubu\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[atx][0-9]+
 ** vsetvli\s+zero,\s*zero,\s*e32,\s*mf2,\s*ta,\s*ma
 ** vncvt\.x\.x\.w\s+v[0-9]+,\s*v[0-9]+
 ** ...


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Bugfix vfmv insn honor zvfhmin for FP16 SEW [PR115763]

2024-07-07 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:7013a4562f74ba14cd14f4e57cc318225752c762

commit 7013a4562f74ba14cd14f4e57cc318225752c762
Author: Pan Li 
Date:   Wed Jul 3 22:06:48 2024 +0800

RISC-V: Bugfix vfmv insn honor zvfhmin for FP16 SEW [PR115763]

According to the ISA, the zvfhmin sub extension should only contain
conversion insns.  Thus, a vfmv insn acting on FP16 should not be
present when only the zvfhmin option is given.

This patch fixes it by splitting the pred_broadcast define_insn
into zvfhmin and zvfh parts.  Given the example below:

void test (_Float16 *dest, _Float16 bias) {
  dest[0] = bias;
  dest[1] = bias;
}

when compiled with -march=rv64gcv_zfh_zvfhmin

Before this patch:
test:
  vsetivli zero,2,e16,mf4,ta,ma
  vfmv.v.f v1,fa0 // should not leverage vfmv for zvfhmin
  vse16.v v1,0(a0)
  ret

After this patch:
test:
  addi sp,sp,-16
  fsh  fa0,14(sp)
  addi a5,sp,14
  vsetivli zero,2,e16,mf4,ta,ma
  vlse16.v v1,0(a5),zero
  vse16.v  v1,0(a0)
  addi sp,sp,16
  jr   ra

PR target/115763

gcc/ChangeLog:

* config/riscv/vector.md (*pred_broadcast): Split into
zvfh and zvfhmin part.
(*pred_broadcast_zvfh): New define_insn for zvfh part.
(*pred_broadcast_zvfhmin): Ditto but for zvfhmin.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/scalar_move-5.c: Adjust asm check.
* gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-7.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-8.c: Ditto.
* gcc.target/riscv/rvv/base/pr115763-1.c: New test.
* gcc.target/riscv/rvv/base/pr115763-2.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit de9254e224eb3d89303cb9b3ba50b4c479c55f7c)

Diff:
---
 gcc/config/riscv/vector.md | 49 +++---
 .../gcc.target/riscv/rvv/base/pr115763-1.c |  9 
 .../gcc.target/riscv/rvv/base/pr115763-2.c | 10 +
 .../gcc.target/riscv/rvv/base/scalar_move-5.c  |  4 +-
 .../gcc.target/riscv/rvv/base/scalar_move-6.c  |  6 +--
 .../gcc.target/riscv/rvv/base/scalar_move-7.c  |  6 +--
 .../gcc.target/riscv/rvv/base/scalar_move-8.c  |  6 +--
 7 files changed, 64 insertions(+), 26 deletions(-)

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index fe18ee5b5f7..d9474262d54 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -2080,31 +2080,50 @@
   [(set_attr "type" "vimov,vimov,vlds,vlds,vlds,vlds,vimovxv,vimovxv")
(set_attr "mode" "")])
 
-(define_insn "*pred_broadcast"
-  [(set (match_operand:V_VLSF_ZVFHMIN 0 "register_operand" "=vr, vr, 
vr, vr, vr, vr, vr, vr")
-   (if_then_else:V_VLSF_ZVFHMIN
+(define_insn "*pred_broadcast_zvfh"
+  [(set (match_operand:V_VLSF0 "register_operand"  "=vr,  vr,  
vr,  vr")
+   (if_then_else:V_VLSF
  (unspec:
-   [(match_operand: 1 "vector_broadcast_mask_operand" "Wc1,Wc1, 
vm, vm,Wc1,Wc1,Wb1,Wb1")
-(match_operand 4 "vector_length_operand"  " rK, rK, 
rK, rK, rK, rK, rK, rK")
-(match_operand 5 "const_int_operand"  "  i,  i,  
i,  i,  i,  i,  i,  i")
-(match_operand 6 "const_int_operand"  "  i,  i,  
i,  i,  i,  i,  i,  i")
-(match_operand 7 "const_int_operand"  "  i,  i,  
i,  i,  i,  i,  i,  i")
+   [(match_operand: 1 "vector_broadcast_mask_operand" "Wc1, Wc1, 
Wb1, Wb1")
+(match_operand  4 "vector_length_operand" " rK,  rK,  
rK,  rK")
+(match_operand  5 "const_int_operand" "  i,   i,   
i,   i")
+(match_operand  6 "const_int_operand" "  i,   i,   
i,   i")
+(match_operand  7 "const_int_operand" "  i,   i,   
i,   i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (vec_duplicate:V_VLSF_ZVFHMIN
-   (match_operand: 3 "direct_broadcast_operand"   " f,  
f,Wdm,Wdm,Wdm,Wdm,  f,  f"))
- (match_operand:V_VLSF_ZVFHMIN 2 "vector_merge_operand""vu,  0, 
vu,  0, vu,  0, vu,  0")))]
+ (vec_duplicate:V_VLSF
+   (match_operand: 3 "direct_broadcast_operand"  "  f,   f,   
f,   f"))
+ (match_operand:V_VLSF  2 "vector_merge_operand"  " vu,   0,  
vu,   0")))]
   "TARGET_VECTOR"
   "@
vfmv.v.f\t%0,%3
vfmv.v.f\t%0,%3
+   vfmv.s.f\t%0,%3
+   vfmv.s.f\t%0,%3"
+  [(set_attr "type" "vfmov,vfmov,vfmovfv,vfmovfv")
+   (set_attr "mode" "")])
+
+(define_insn "*pred_broadcast_zvfhmin"
+  [(set (match_operand:V_VLSF_ZVFHMIN   0 "register_operand"  
"=vr,  vr,  vr,  vr")
+   (if_then_else:V_VLSF_ZV

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add support for Zabha extension

2024-07-07 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:de6abc8953cdba01e27be7427afe7834a68c583d

commit de6abc8953cdba01e27be7427afe7834a68c583d
Author: Gianluca Guida 
Date:   Tue Jul 2 18:05:14 2024 -0700

RISC-V: Add support for Zabha extension

The Zabha extension adds support for subword Zaamo ops.

Extension: https://github.com/riscv/riscv-zabha.git
Ratification: https://jira.riscv.org/browse/RVS-1685
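
As an illustration of what "subword" means here, the following minimal sketch
uses the GCC __atomic builtins on 8-bit and 16-bit objects.  The function names
are hypothetical and are not taken from the new tests.  The assumption is that
with zabha enabled operations of this shape can use byte/halfword AMOs, while
without it they go through the lr/sc subword patterns; the exact code generated
depends on -march and the memory model.

```c
#include <stdint.h>

/* Hypothetical helper: atomically add V to the byte at P and return the
   previous value.  This is a subword (8-bit) atomic fetch-and-add.  */
uint8_t
fetch_add_u8 (uint8_t *p, uint8_t v)
{
  return __atomic_fetch_add (p, v, __ATOMIC_SEQ_CST);
}

/* Hypothetical helper: atomically swap the halfword at P with V and return
   the previous value.  This is a subword (16-bit) atomic exchange.  */
uint16_t
exchange_u16 (uint16_t *p, uint16_t v)
{
  return __atomic_exchange_n (p, v, __ATOMIC_SEQ_CST);
}
```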

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_subset_list::to_string): Skip zabha when not supported by
the assembler.
* config.in: Regenerate.
* config/riscv/arch-canonicalize: Make zabha imply zaamo.
* config/riscv/iterators.md (amobh): Add iterator for amo
byte/halfword.
* config/riscv/riscv.opt: Add zabha.
* config/riscv/sync.md (atomic_): Add
subword atomic op pattern.
(zabha_atomic_fetch_): Add subword
atomic_fetch op pattern.
(lrsc_atomic_fetch_): Prefer zabha over lrsc
for subword atomic ops.
(zabha_atomic_exchange): Add subword atomic exchange
pattern.
(lrsc_atomic_exchange): Prefer zabha over lrsc for subword
atomic exchange ops.
* configure: Regenerate.
* configure.ac: Add zabha assembler check.
* doc/sourcebuild.texi: Add zabha documentation.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add zabha testsuite infra support.
* gcc.target/riscv/amo/inline-atomics-1.c: Remove zabha to continue to
test the lr/sc subword patterns.
* gcc.target/riscv/amo/inline-atomics-2.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zabha-all-amo-ops-char-run.c: New test.
* gcc.target/riscv/amo/zabha-all-amo-ops-short-run.c: New test.
* gcc.target/riscv/amo/zabha-rvwmo-all-amo-ops-char.c: New test.
* gcc.target/riscv/amo/zabha-rvwmo-all-amo-ops-short.c: New test.
* gcc.target/riscv/amo/zabha-rvwmo-amo-add-char.c: New test.
* gcc.target/riscv/amo/zabha-rvwmo-amo-add-short.c: New test.
* gcc.target/riscv/amo/zabha-ztso-amo-add-char.c: New test.
* gcc.target/riscv/amo/zabha-ztso-amo-add-short.c: New test.

Co-Authored-By: Patrick O'Neill 
Signed-Off-By: Gianluca Guida 
Tested-by: Andrea Parri 
(cherry picked from commit 7b2b2e3d660edc8ef3a8cfbdfc2b0fd499459601)

Diff:
---
 gcc/common/config/riscv/riscv-common.cc| 12 
 gcc/config.in  |  6 ++
 gcc/config/riscv/arch-canonicalize |  3 +
 gcc/config/riscv/iterators.md  |  3 +
 gcc/config/riscv/riscv.opt |  2 +
 gcc/config/riscv/sync.md   | 81 +-
 gcc/configure  | 31 +
 gcc/configure.ac   |  5 ++
 gcc/doc/sourcebuild.texi   | 12 +++-
 .../gcc.target/riscv/amo/inline-atomics-1.c|  1 +
 .../gcc.target/riscv/amo/inline-atomics-2.c|  1 +
 .../riscv/amo/zabha-all-amo-ops-char-run.c |  5 ++
 .../riscv/amo/zabha-all-amo-ops-short-run.c|  5 ++
 .../riscv/amo/zabha-rvwmo-all-amo-ops-char.c   | 23 ++
 .../riscv/amo/zabha-rvwmo-all-amo-ops-short.c  | 23 ++
 .../riscv/amo/zabha-rvwmo-amo-add-char.c   | 57 +++
 .../riscv/amo/zabha-rvwmo-amo-add-short.c  | 57 +++
 .../gcc.target/riscv/amo/zabha-ztso-amo-add-char.c | 57 +++
 .../riscv/amo/zabha-ztso-amo-add-short.c   | 57 +++
 .../zalrsc-rvwmo-subword-amo-add-char-acq-rel.c|  1 +
 .../zalrsc-rvwmo-subword-amo-add-char-acquire.c|  1 +
 .../zalrsc-rvwmo-subword-amo-add-char-relaxed.c|  1 +
 .../zalrsc-rvwmo-subword-amo-add-char-release.c|  1 +
 .../zalrsc-rvwmo-subword-amo-add-char-seq-cst.c|  1 +
 .../amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c |  1 +
 .../

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Describe -march behavior for dependent extensions

2024-07-07 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:316bda6c8e290853cfca060f38b2f0f4281d65f9

commit 316bda6c8e290853cfca060f38b2f0f4281d65f9
Author: Palmer Dabbelt 
Date:   Tue Jul 2 18:20:39 2024 -0700

RISC-V: Describe -march behavior for dependent extensions

gcc/ChangeLog:

* doc/invoke.texi: Describe -march behavior for dependent extensions on
RISC-V.

(cherry picked from commit 70f6bc39c4b0e147a816ad1dad583f944616c367)

Diff:
---
 gcc/doc/invoke.texi | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 023ee575b86..9486758d463 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -30927,6 +30927,10 @@ If both @option{-march} and @option{-mcpu=} are not 
specified, the default for
 this argument is system dependent, users who want a specific architecture
 extensions should specify one explicitly.
 
+When the RISC-V specifications define an extension as depending on other
+extensions, GCC will implicitly add the dependent extensions to the enabled
+extension set if they weren't added explicitly.
+
 @opindex mcpu
 @item -mcpu=@var{processor-string}
 Use architecture of and optimize the output for the given processor, specified


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [committed][RISC-V] Fix test expectations after recent late-combine changes

2024-07-07 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:a2b380b973fc48b5b2f8c7ebff775ab5aab5a5cb

commit a2b380b973fc48b5b2f8c7ebff775ab5aab5a5cb
Author: Jeff Law 
Date:   Thu Jul 4 09:25:20 2024 -0600

[committed][RISC-V] Fix test expectations after recent late-combine changes

With the recent DCE related adjustment to late-combine the rvv/base/vcreate.c
test no longer has those undesirable vmvNr statements.

It's a bit unclear why this wasn't written as a scan-assembler-not and xfailed
given the comment says we don't want to see vmvNr instructions.  I must have
missed that during review.

This patch adjusts the test to expect no vmvNr statements and if they're ever
re-introduced, we'll get a nice unexpected failure.

gcc/testsuite
* gcc.target/riscv/rvv/base/vcreate.c: Update expected output.

(cherry picked from commit b611f3969249967d7f098c6adfcf5f701192a2d0)

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c
index 01006de7c81..1c7c154637e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vcreate.c
@@ -256,6 +256,6 @@ test_vcreate_v_i64m2x4 (vint64m2_t v0, vint64m2_t v1, 
vint64m2_t v2,
 }
 
 // Ideally with O3, should find 0 instances of any vmvnr.v PR113913
-/* { dg-final { scan-assembler-times {vmv1r.v\s+v[0-9]+,\s*v[0-9]+} 72 } } */
-/* { dg-final { scan-assembler-times {vmv2r.v\s+v[0-9]+,\s*v[0-9]+} 36 } } */
-/* { dg-final { scan-assembler-times {vmv4r.v\s+v[0-9]+,\s*v[0-9]+} 16 } } */
+/* { dg-final { scan-assembler-not {vmv1r.v\s+v[0-9]+,\s*v[0-9]+} } } */
+/* { dg-final { scan-assembler-not {vmv2r.v\s+v[0-9]+,\s*v[0-9]+} } } */
+/* { dg-final { scan-assembler-not {vmv4r.v\s+v[0-9]+,\s*v[0-9]+} } } */


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Use tu policy for first-element vec_set [PR115725].

2024-07-07 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:30267a9ae0d49265e80d7d6748cd0851eb4fe1ad

commit 30267a9ae0d49265e80d7d6748cd0851eb4fe1ad
Author: Robin Dapp 
Date:   Mon Jul 1 13:37:17 2024 +0200

RISC-V: Use tu policy for first-element vec_set [PR115725].

This patch changes the tail policy for vmv.s.x from ta to tu.
By default the bug does not show up with qemu because qemu's
current vmv.s.x implementation always uses the tail-undisturbed
policy.  With a local qemu version that overwrites the tail
with ones when the tail-agnostic policy is specified, the bug
shows.
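
To make the difference between the two policies concrete, here is a small
scalar model of a vmv.s.x-style write, written as a sketch only.  The names
(model_vmv_s_x, tail_policy) are hypothetical and this is not qemu or GCC
code.  It assumes an implementation that fills the tail with all-ones under
the tail-agnostic policy, which is the behavior of the local qemu version
mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two tail behaviors discussed above.  */
enum tail_policy
{
  TAIL_UNDISTURBED,	/* tu: elements past index 0 keep their value.  */
  TAIL_AGNOSTIC_ONES	/* ta: this model overwrites them with all-ones.  */
};

/* Write X to element 0 of VD, which has NELEM elements.  All elements with
   index > 0 are treated as tail elements.  */
void
model_vmv_s_x (int32_t *vd, size_t nelem, int32_t x, enum tail_policy policy)
{
  if (nelem == 0)
    return;

  vd[0] = x;

  if (policy == TAIL_AGNOSTIC_ONES)
    for (size_t i = 1; i < nelem; i++)
      vd[i] = -1;	/* All bits set.  */
}
```

With the ta policy, a vec_set of the first element can clobber the remaining
elements in this model, which is why the tu policy is needed when the rest of
the vector must survive.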

gcc/ChangeLog:

* config/riscv/autovec.md: Add TU policy.
* config/riscv/riscv-protos.h (enum insn_type): Define
SCALAR_MOVE_MERGED_OP_TU.

gcc/testsuite/ChangeLog:

PR target/115725

* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Adjust
test expectation.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Ditto.

(cherry picked from commit acc3b703c05debc6276451f9daae5d0ffc797eac)

Diff:
---
 gcc/config/riscv/autovec.md  |  3 ++-
 gcc/config/riscv/riscv-protos.h  |  4 
 .../gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c   | 12 
 .../gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c   | 12 
 .../gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c   | 12 
 .../gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c   | 12 
 6 files changed, 22 insertions(+), 33 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 66d70f678a6..0fb6316a2cf 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1341,7 +1341,8 @@
 {
   rtx ops[] = {operands[0], operands[0], operands[1]};
   riscv_vector::emit_nonvlmax_insn (code_for_pred_broadcast (mode),
-   riscv_vector::SCALAR_MOVE_MERGED_OP, 
ops, CONST1_RTX (Pmode));
+   riscv_vector::SCALAR_MOVE_MERGED_OP_TU,
+   ops, CONST1_RTX (Pmode));
 }
   else
 {
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index a8b76173fa0..abf6e34b5cc 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -524,6 +524,10 @@ enum insn_type : unsigned int
   SCALAR_MOVE_MERGED_OP = HAS_DEST_P | HAS_MASK_P | USE_ONE_TRUE_MASK_P
  | HAS_MERGE_P | TDEFAULT_POLICY_P | MDEFAULT_POLICY_P
  | UNARY_OP_P,
+
+  SCALAR_MOVE_MERGED_OP_TU = HAS_DEST_P | HAS_MASK_P | USE_ONE_TRUE_MASK_P
+ | HAS_MERGE_P | TU_POLICY_P | MDEFAULT_POLICY_P
+ | UNARY_OP_P,
 };
 
 enum vlmul_type
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c
index ecb160933d6..99b0f625c83 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c
@@ -64,14 +64,10 @@ typedef double vnx2df __attribute__((vector_size (16)));
 TEST_ALL1 (VEC_SET)
 TEST_ALL_VAR1 (VEC_SET_VAR1)
 
-/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
-/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*tu,\s*ma} 5 } } */
-/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*ta,\s*ma} 2 } } */
-/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*tu,\s*ma} 6 } } */
-/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*ta,\s*ma} 2 } } */
-/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*tu,\s*ma} 6 } } */
-/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*ta,\s*ma} 2 } } */
-/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*tu,\s*ma} 4 } } */
+/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e8,\s*m1,\s*tu,\s*ma} 6 } } */
+/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e16,\s*m1,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e32,\s*m1,\s*tu,\s*ma} 8 } } */
+/* { dg-final { scan-assembler-times 
{vset[i]*vli\s+[a-z0-9,]+,\s*e64,\s*m1,\s*tu,\s*ma} 6 } } */
 
 /* { dg-final { scan-assembler-times {\tvmv.v.x} 13 } } */
 /* { dg-final { scan-assembler-times {\tvfmv.v.f} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c
index 194abff77cc..64a40308eb1 100644
--- a/gcc

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: fix internal error on global variable-length array

2024-07-07 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:a3d53feb6ab6b71a11e220a800d8f5a013e7590c

commit a3d53feb6ab6b71a11e220a800d8f5a013e7590c
Author: Eric Botcazou 
Date:   Sat Jul 6 11:56:19 2024 +0200

RISC-V: fix internal error on global variable-length array

This is an ICE in the RISC-V back-end calling tree_to_uhwi on the DECL_SIZE
of a global variable-length array.
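
The shape of the fix is a "check before convert" guard.  The following is a
simplified stand-alone sketch of that pattern, not GCC code: struct size_expr
and size_or_default are hypothetical stand-ins for a size tree and for the
tree_fits_uhwi_p / tree_to_uhwi pair, used only to show why the extra test is
needed when the size is not a compile-time constant.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a size expression that may or may not be a
   compile-time constant (a variable-length array is not).  */
struct size_expr
{
  bool is_constant;	/* Plays the role of the "fits" predicate.  */
  uint64_t value;	/* Only meaningful when is_constant is true.  */
};

/* Return the constant size if there is one, otherwise FALLBACK.  Reading
   VALUE without checking IS_CONSTANT first is the analogue of the ICE.  */
uint64_t
size_or_default (const struct size_expr *size, uint64_t fallback)
{
  if (size != NULL && size->is_constant)
    return size->value;
  return fallback;
}
```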

gcc/
PR target/115591
* config/riscv/riscv.cc (riscv_valid_lo_sum_p): Add missing test on
tree_fits_uhwi_p before calling tree_to_uhwi.

gcc/testsuite/
* gnat.dg/array41.ads, gnat.dg/array41.adb: New test.

(cherry picked from commit 8bc5561c43b195e1638e5acace8b41b3f7512be3)

Diff:
---
 gcc/config/riscv/riscv.cc |  4 +++-
 gcc/testsuite/gnat.dg/array41.adb | 37 +
 gcc/testsuite/gnat.dg/array41.ads |  5 +
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index cca7ffde33a..4acd643fd8d 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1702,7 +1702,9 @@ riscv_valid_lo_sum_p (enum riscv_symbol_type sym_type, 
machine_mode mode,
   align = (SYMBOL_REF_DECL (x)
   ? DECL_ALIGN (SYMBOL_REF_DECL (x))
   : 1);
-  size = (SYMBOL_REF_DECL (x) && DECL_SIZE (SYMBOL_REF_DECL (x))
+  size = (SYMBOL_REF_DECL (x)
+ && DECL_SIZE (SYMBOL_REF_DECL (x))
+ && tree_fits_uhwi_p (DECL_SIZE (SYMBOL_REF_DECL (x)))
  ? tree_to_uhwi (DECL_SIZE (SYMBOL_REF_DECL (x)))
  : 2*BITS_PER_WORD);
 }
diff --git a/gcc/testsuite/gnat.dg/array41.adb 
b/gcc/testsuite/gnat.dg/array41.adb
new file mode 100644
index 000..d0d5a69eeaf
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/array41.adb
@@ -0,0 +1,37 @@
+-- { dg-do compile }
+
+with System.Storage_Elements;
+
+package body Array41 is
+
+   procedure Program_Initialization
+   with
+ Export,
+ Convention => Ada,
+ External_Name => "program_initialization";
+
+   procedure Program_Initialization is
+  use System.Storage_Elements;
+
+  Sdata : Storage_Element
+with Import, Convention => Asm, External_Name => "_sdata";
+  Edata : Storage_Element
+with Import, Convention => Asm, External_Name => "_edata";
+
+  Data_Size : constant Storage_Offset := Edata'Address - Sdata'Address;
+
+  --  Index from 1 so as to avoid subtracting 1 from the size
+  Data_In_Flash : constant Storage_Array (1 .. Data_Size)
+with Import, Convention => Asm, External_Name => "_sidata";
+
+  Data_In_Sram : Storage_Array (1 .. Data_Size)
+with Volatile, Import, Convention => Asm, External_Name => "_sdata";
+
+   begin
+  --  Copy rw data from flash to ram
+  for J in Data_In_Flash'Range loop
+ Data_In_Sram (J) := Data_In_Flash (J);
+  end loop;
+   end Program_Initialization;
+
+end Array41;
diff --git a/gcc/testsuite/gnat.dg/array41.ads 
b/gcc/testsuite/gnat.dg/array41.ads
new file mode 100644
index 000..50cde3cd819
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/array41.ads
@@ -0,0 +1,5 @@
+package Array41 is
+
+  pragma Elaborate_Body;
+
+end Array41;


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [to-be-committed][v3][RISC-V] Handle bit manipulation of SImode values

2024-07-07 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:576d8c0e57bc6894ab1215bfd45406b8adcfd2dd

commit 576d8c0e57bc6894ab1215bfd45406b8adcfd2dd
Author: Jeff Law 
Date:   Sat Jul 6 12:57:59 2024 -0600

[to-be-committed][v3][RISC-V] Handle bit manipulation of SImode values

Last patch in this round of bitmanip work...  At least I think I'm going to
pause here and switch gears to other projects that need attention 🙂

This patch introduces the ability to generate bitmanip instructions for rv64
when operating on SI objects when we know something about the range of the bit
position (due to masking of the position).

I've got to note that the (7-pos % 8) bit position form was discovered by RAU in
500.perl.  I took that and expanded it to the simple (pos & mask) form as well
as covering bset, binv and bclr.
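
For reference, the source-level shapes being discussed look roughly like the
sketch below.  These functions are illustrative assumptions and are not copied
from the new tests; the function names are hypothetical.  In each case the
masking keeps the bit position well below 31, so the SImode sign bit can never
be the bit that is set, cleared or inverted.

```c
/* Set bit (pos & 7) of X: the simple (pos & mask) form.  */
int
bset_masked (int x, unsigned pos)
{
  return x | (1 << (pos & 7));
}

/* Clear bit (pos & 7) of X.  */
int
bclr_masked (int x, unsigned pos)
{
  return x & ~(1 << (pos & 7));
}

/* Invert bit (pos & 7) of X.  */
int
binv_masked (int x, unsigned pos)
{
  return x ^ (1 << (pos & 7));
}

/* Set bit (7 - pos % 8) of X: the form seen in 500.perl.  */
int
bset_reversed (int x, unsigned pos)
{
  return x | (1 << (7 - pos % 8));
}
```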

As far as the implementation is concerned, there are two parts.

First, this turns the recently added define_splits into define_insn_and_split
constructs.  This allows combine to "see" enough RTL to realize a sign
extension is unnecessary.  Otherwise we get undesirable sign extensions for the
new testcases.

Second, it adds new patterns for the logical operations: two patterns for
IOR/XOR and two patterns for AND.

I think a key concept to keep in mind is that once we determine a Zbs operation
is safe to perform on a SI value, we can rewrite the RTL in 64bit form.  If we
were ever to try and use range information at expand time for this stuff (and
we probably should investigate that), that's the path I'd suggest.

This is notably cleaner than my original implementation which actually kept the
more complex RTL form through final and emitted 2/3 instructions (mask the bit
position, then the bset/bclr/binv).

Tested in my tester, but waiting for pre-commit CI to report back before taking
further action.

gcc/
* config/riscv/bitmanip.md (bset splitters): Turn into
define_insn_and_split constructs.
Don't depend on combine splitting the "andn with constant" form.
(bset, binv, bclr with masked bit position): New patterns.

gcc/testsuite
* gcc.target/riscv/binv-for-simode-1.c: New test.
* gcc.target/riscv/bset-for-simode-1.c: New test.
* gcc.target/riscv/bclr-for-simode-1.c: New test.

(cherry picked from commit 273f16a125c4fab664683376ae04a9a31e7d6a22)

Diff:
---
 gcc/config/riscv/bitmanip.md   | 135 ++---
 gcc/testsuite/gcc.target/riscv/bclr-for-simode-1.c |  25 
 gcc/testsuite/gcc.target/riscv/binv-for-simode-1.c |  24 
 gcc/testsuite/gcc.target/riscv/bset-for-simode-1.c |  24 
 4 files changed, 192 insertions(+), 16 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 3eedabffca0..f403ba8dbba 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -615,37 +615,140 @@
 ;; shift constant.  With the limited range we know the SImode sign
 ;; bit is never set, thus we can treat this as zero extending and
 ;; generate the bsetdi_2 pattern.
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
(any_extend:DI
 (ashift:SI (const_int 1)
(subreg:QI
- (and:DI (not:DI (match_operand:DI 1 "register_operand"))
+ (and:DI (not:DI (match_operand:DI 1 "register_operand" 
"r"))
  (match_operand 2 "const_int_operand")) 0
-   (clobber (match_operand:DI 3 "register_operand"))]
+   (clobber (match_scratch:X 3 "=&r"))]
   "TARGET_64BIT
&& TARGET_ZBS
&& (TARGET_ZBB || TARGET_ZBKB)
&& (INTVAL (operands[2]) & 0x1f) != 0x1f"
-   [(set (match_dup 0) (and:DI (not:DI (match_dup 1)) (match_dup 2)))
-(set (match_dup 0) (zero_extend:DI (ashift:SI
-  (const_int 1)
-  (subreg:QI (match_dup 0) 0])
+  "#"
+  "&& reload_completed"
+   [(set (match_dup 3) (match_dup 2))
+(set (match_dup 3) (and:DI (not:DI (match_dup 1)) (match_dup 3)))
+(set (match_dup 0) (zero_extend:DI
+(ashift:SI (const_int 1) (match_dup 4]
+  { operands[4] = gen_lowpart (QImode, operands[3]); }
+  [(set_attr "type" "bitmanip")])
 
-(define_split
-  [(set (match_operand:DI 0 "register_operand")
-   (any_extend:DI
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(any_extend:DI
 (ashift:SI (const_int 1)
(subreg:QI
- (and:DI (match_operand:DI 1 "register_operand")
+ (and:DI (match_operand:DI 1 "register_operand" "r")
  (match_operand 2 "const_int_operand")) 0]
   "TARGET_64BIT
&& TARGET_ZBS
&& (INTVAL (operands[2]) & 0x1f) != 0x1f"
-   [(set 

[gcc r15-1901] [to-be-committed][RISC-V][V3] DCE analysis for extension elimination

2024-07-08 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:98914f9eba5f19d3eb93fbce8726b5264631cba0

commit r15-1901-g98914f9eba5f19d3eb93fbce8726b5264631cba0
Author: Jeff Law 
Date:   Mon Jul 8 17:06:55 2024 -0600

[to-be-committed][RISC-V][V3] DCE analysis for extension elimination

The pre-commit testing showed that making ext-dce only active at -O2 and above
would require minor edits to the tests.  In some cases we had specified -O1 in
the test or specified no optimization level at all.  Those need to be bumped to
-O2.  In one test we had one set of dg-options overriding another.

The other approach that could have been taken would be to drop the -On
argument, add an explicit -fext-dce and add dg-skip-if options.  I originally
thought that was going to be the way to go, but the dg-skip-if aspect was going
to get ugly as things like interaction between unrolling, peeling and -ftracer
would have to be accounted for and would likely need semi-regular adjustment.

Changes since V2:
  Testsuite changes to deal with pass only being enabled at -O2 or
  higher.

--

Changes since V1:

  Check flag_ext_dce before running the new pass.  I'd forgotten that
  I had removed that part of the gate to facilitate more testing.
  Turn flag_ext_dce on at -O2 and above.
  Adjust one of the riscv tests to explicitly avoid vectors
  Adjust a few aarch64 tests
In tbz_2.c we remove an unnecessary extension which causes us to use
"x" registers instead of "w" registers.

In the pred_clobber tests we also remove an extension and that
ultimately causes a reg->reg copy to change locations.

--

This was actually ack'd late in the gcc-14 cycle, but I chose not to integrate
it given how late we were in the cycle.

The basic idea here is to track liveness of subobjects within a word and if we
find an extension where the bits set aren't actually used, then we convert the
extension into a subreg.  The subreg typically simplifies away.
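
As a source-level illustration of the idea (a sketch only; this function is
hypothetical and is not one of the new tests), consider a value that is
extended to 16 bits but whose upper bits are never read afterwards.  Whether
an extension actually appears in the RTL, and whether the pass removes it,
depends on the target and on earlier optimizations.

```c
/* The sum is narrowed to 16 bits when stored in S, which on a 64-bit target
   may be implemented as an extension.  Only the low 8 bits of S are used by
   the return value, so the bits produced by that extension are dead.  */
unsigned char
low_byte_sum (unsigned short a, unsigned short b)
{
  unsigned short s = a + b;
  return (unsigned char) s;
}
```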

I've seen this help a few routines in coremark, fix one bug in the testsuite
(pr111384) and fix a couple internally reported bugs in Ventana.

The original idea and code were from Joern; Jivan and I hacked it into usable
shape.  I've had this in my tester for ~8 months, so it's been through more
build/test cycles than I care to contemplate and nearly every architecture we
support.

But just in case, I'm going to wait for it to spin through the pre-commit CI
tester.  I'll find my old ChangeLog before committing.

gcc/
* Makefile.in (OBJS): Add ext-dce.o
* common.opt (ext-dce): Document new option.
* df-scan.cc (df_get_exit_block_use_set): Delete prototype and
make extern.
* df.h (df_get_exit_block_use_set): Prototype.
* ext-dce.cc: New file/pass.
* opts.cc (default_options_table): Handle ext-dce at -O2 or higher.
* passes.def: Add ext-dce before combine.
* tree-pass.h (make_pass_ext_dce): Prototype.

gcc/testsuite
* gcc.target/aarch64/sve/pred_clobber_1.c: Update expected output.
* gcc.target/aarch64/sve/pred_clobber_2.c: Likewise.
* gcc.target/aarch64/sve/pred_clobber_3.c: Likewise.
* gcc.target/aarch64/tbz_2.c: Likewise.
* gcc.target/riscv/core_bench_list.c: New test.
* gcc.target/riscv/core_init_matrix.c: New test.
* gcc.target/riscv/core_list_init.c: New test.
* gcc.target/riscv/matrix_add_const.c: New test.
* gcc.target/riscv/mem-extend.c: New test.
* gcc.target/riscv/pr111384.c: New test.

Co-authored-by: Jivan Hakobyan 
Co-authored-by: Joern Rennecke 

Diff:
---
 gcc/Makefile.in|   1 +
 gcc/common.opt |   4 +
 gcc/df-scan.cc |   3 +-
 gcc/df.h   |   1 +
 gcc/ext-dce.cc | 943 +
 gcc/opts.cc|   1 +
 gcc/passes.def |   1 +
 .../gcc.target/aarch64/sve/pred_clobber_1.c|   1 +
 .../gcc.target/aarch64/sve/pred_clobber_2.c|   1 +
 .../gcc.target/aarch64/sve/pred_clobber_3.c|   1 +
 gcc/testsuite/gcc.target/aarch64/tbz_2.c   |   6 +-
 gcc/testsuite/gcc.target/riscv/core_bench_list.c   |  15 +
 gcc/testsuite/gcc.target/riscv/core_init_matrix.c  |  17 +
 gcc/testsuite/gcc.target/riscv/core_list_init.c|  18 +
 gcc/testsuite/gcc.target/riscv/matrix_add_const.c  |  13 +
 gcc/testsuite/gcc.target/riscv/mem-extend.c|  14 +
 gcc/testsuite/gcc.target/riscv/pr111384.c  |  11 +
 gcc/tree-pass.h|   1 +
 18

[gcc r15-1976] [to-be-committed, RISC-V] Eliminate unnecessary sign extension after inlined str[n]cmp

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:74d8accaf88f83bfcab1150bf9be5140e7ac0e94

commit r15-1976-g74d8accaf88f83bfcab1150bf9be5140e7ac0e94
Author: Jeff Law 
Date:   Thu Jul 11 12:05:56 2024 -0600

[to-be-committed,RISC-V] Eliminate unnecessary sign extension after inlined str[n]cmp

This patch eliminates an unnecessary sign extension for scalar inlined
string comparisons on rv64.

Conceptually this is pretty simple.  Prove all the paths which "return"
a value from the inlined string comparison already have sign extended
values.

FINAL_LABEL is the point after the calculation of the return value.  So
if we have a jump to FINAL_LABEL, we must have a properly extended
result value at that point.

Second we're going to arrange in the .md part of the expander to use an
X mode temporary for the result.  After computing the result we will (if
necessary) extract the low part of the result using a SUBREG tagged with
the appropriate SUBREG_PROMOTED_* bits.

So with that background:

We find a jump to FINAL_LABEL in emit_strcmp_scalar_compare_byte.  Since
we know the result is X mode, we can just emit the subtraction of the
two chars in X mode and we'll have a properly sign extended result.

There are 4 jumps to final_label in emit_strcmp_scalar.

The first is just returning zero and needs trivial simplification to not
force the result into SImode.

The second is after calling strcmp in the library.  The ABI mandates
that value is sign extended, so there's nothing to do for that case.

The 3rd occurs after a call to
emit_strcmp_scalar_result_calculation_nonul.  If we dive into that
routine it needs simplification similar to what we did in
emit_strcmp_scalar_compare_byte.

The 4th occurs after a call to emit_strcmp_scalar_result_calculation
which again needs trivial adjustment like we've done in the other routines.

Finally, at the end of expand_strcmp, just store the X mode result
sitting in SUB to RESULT.

The net of all that is we know every path has its result properly
extended to X mode.  Standard redundant extension removal will take care
of the rest.
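
The byte-compare case can be checked with a small stand-alone sketch.  The
helper names are hypothetical and this is not the expander code; int64_t
stands in for X mode on rv64.  The difference of two zero-extended bytes
computed in 64 bits always lies in [-255, 255], so truncating it to 32 bits
and sign extending it again gives back the same value, which is why no extra
extension is required.

```c
#include <stdint.h>

/* Subtract two bytes in 64-bit arithmetic, as the expander now does in
   X mode for the byte comparison.  */
int64_t
byte_diff_xmode (unsigned char c1, unsigned char c2)
{
  return (int64_t) c1 - (int64_t) c2;
}

/* Return nonzero if V already equals the sign extension of its low 32 bits,
   i.e. an explicit SImode sign extension would be a no-op.  */
int
is_sign_extended_si (int64_t v)
{
  return v == (int64_t) (int32_t) v;
}
```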

We've been running this within Ventana for about 6 months, so naturally
it's been through various QA cycles, dhrystone, spec2017, etc.  It's
also been through a build/test cycle in my tester.  Waiting on results
from the pre-commit testing before moving forward.

gcc/
* config/riscv/riscv-string.cc
(emit_strcmp_scalar_compare_byte): Set RESULT directly rather
than using a new temporary.
(emit_strcmp_scalar_result_calculation_nonul): Likewise.
(emit_strcmp_scalar_result_calculation): Likewise.
(riscv_expand_strcmp_scalar): Use CONST0_RTX rather than
generating a new node.
(expand_strcmp): Copy directly from SUB to RESULT.
* config/riscv/riscv.md (cmpstrnsi, cmpstrsi): Pass an X
mode temporary to the expansion routines.  If necessary
extract low part of the word to store in final result location.

Diff:
---
 gcc/config/riscv/riscv-string.cc | 15 +--
 gcc/config/riscv/riscv.md| 28 
 2 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 257a514d2901..4736228e6f14 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -140,9 +140,7 @@ static void
 emit_strcmp_scalar_compare_byte (rtx result, rtx data1, rtx data2,
 rtx final_label)
 {
-  rtx tmp = gen_reg_rtx (Xmode);
-  do_sub3 (tmp, data1, data2);
-  emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+  do_sub3 (result, data1, data2);
   emit_jump_insn (gen_jump (final_label));
   emit_barrier (); /* No fall-through.  */
 }
@@ -310,8 +308,7 @@ emit_strcmp_scalar_result_calculation_nonul (rtx result, 
rtx data1, rtx data2)
   rtx tmp = gen_reg_rtx (Xmode);
   emit_insn (gen_slt_3 (LTU, Xmode, Xmode, tmp, data1, data2));
   do_neg2 (tmp, tmp);
-  do_ior3 (tmp, tmp, const1_rtx);
-  emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+  do_ior3 (result, tmp, const1_rtx);
 }
 
 /* strcmp-result calculation.
@@ -367,9 +364,7 @@ emit_strcmp_scalar_result_calculation (rtx result, rtx 
data1, rtx data2,
   unsigned int shiftr = (xlen - 1) * BITS_PER_UNIT;
   do_lshr3 (data1, data1, GEN_INT (shiftr));
   do_lshr3 (data2, data2, GEN_INT (shiftr));
-  rtx tmp = gen_reg_rtx (Xmode);
-  do_sub3 (tmp, data1, data2);
-  emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+  do_sub3 (result, data1, data2);
 }
 
 /* Expand str(n)cmp using Zbb/TheadBb instructions.
@@ -444,7 +439,7 @@ riscv_expand_strcmp_scalar (rtx result, rtx src1, rtx src2,
   /* All compared and everything was equal.  */
   if (nc

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [RISC-V] add implied extension repeatly until stable

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:fe7b8250ee5190f31a622d99e394861f0688feef

commit fe7b8250ee5190f31a622d99e394861f0688feef
Author: Fei Gao 
Date:   Fri Jul 5 09:56:30 2024 +

[RISC-V] add implied extension repeatly until stable

Call handle_implied_ext repeatedly until no new subset is
added to the subset list.
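
The loop is a standard fixed-point iteration.  Below is a self-contained
sketch of the same idea, under stated assumptions: three made-up extensions
numbered 0..2 with made-up implication rules (2 implies 1, 1 implies 0), a
plain array instead of the linked subset list, and hypothetical names
(close_implied, add_ext).  It is not the riscv-common.cc code.  A single walk
in index order would miss the second-level implication, so the walk repeats
until the count stops changing.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_EXT 3

/* Made-up implication rules: extension FROM implies extension TO.  */
struct rule
{
  int from;
  int to;
};

static const struct rule rules[] = { { 2, 1 }, { 1, 0 } };

/* Enable EXT if it is not enabled yet and return the updated count.  */
static unsigned
add_ext (bool enabled[NUM_EXT], int ext, unsigned count)
{
  if (!enabled[ext])
    {
      enabled[ext] = true;
      count++;
    }
  return count;
}

/* Add implied extensions until a full walk adds nothing new.  Returns the
   final number of enabled extensions.  */
unsigned
close_implied (bool enabled[NUM_EXT])
{
  unsigned count = 0;
  unsigned prev;

  for (int i = 0; i < NUM_EXT; i++)
    if (enabled[i])
      count++;

  do
    {
      prev = count;
      for (int i = 0; i < NUM_EXT; i++)
	{
	  if (!enabled[i])
	    continue;
	  for (size_t r = 0; r < sizeof rules / sizeof rules[0]; r++)
	    if (rules[r].from == i)
	      count = add_ext (enabled, rules[r].to, count);
	}
    }
  while (prev != count);

  return count;
}
```

Starting from only extension 2, the first walk adds 1, the second adds 0, and
the third adds nothing, which ends the loop.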

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc
(riscv_subset_list::riscv_subset_list): Init m_subset_num to 0.
(riscv_subset_list::add): Increase m_subset_num once a subset is added.
(riscv_subset_list::finalize): Call handle_implied_ext repeatedly
until no change in m_subset_num.
* config/riscv/riscv-subset.h: Add m_subset_num member.

Signed-off-by: Fei Gao 
(cherry picked from commit 682731d11f9c02b24358d1af1e2bf6fca0221ee7)

Diff:
---
 gcc/common/config/riscv/riscv-common.cc | 14 +++---
 gcc/config/riscv/riscv-subset.h |  3 +++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 16bdb3fd2259..b9bda3e110a2 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -569,7 +569,8 @@ riscv_subset_t::riscv_subset_t ()
 }
 
 riscv_subset_list::riscv_subset_list (const char *arch, location_t loc)
-  : m_arch (arch), m_loc (loc), m_head (NULL), m_tail (NULL), m_xlen (0)
+  : m_arch (arch), m_loc (loc), m_head (NULL), m_tail (NULL), m_xlen (0),
+m_subset_num (0)
 {
 }
 
@@ -815,6 +816,7 @@ riscv_subset_list::add (const char *subset, int major_version,
   return;
 }
 
+  m_subset_num++;
   riscv_subset_t *s = new riscv_subset_t ();
   riscv_subset_t *itr;
 
@@ -1597,9 +1599,15 @@ void
 riscv_subset_list::finalize ()
 {
   riscv_subset_t *subset;
+  unsigned pre_subset_num;
 
-  for (subset = m_head; subset != NULL; subset = subset->next)
-handle_implied_ext (subset->name.c_str ());
+  do
+{
+  pre_subset_num = m_subset_num;
+  for (subset = m_head; subset != NULL; subset = subset->next)
+   handle_implied_ext (subset->name.c_str ());
+}
+  while (pre_subset_num != m_subset_num);
 
   gcc_assert (check_implied_ext ());
 
diff --git a/gcc/config/riscv/riscv-subset.h b/gcc/config/riscv/riscv-subset.h
index fe7f54d8bc57..7dc196a20074 100644
--- a/gcc/config/riscv/riscv-subset.h
+++ b/gcc/config/riscv/riscv-subset.h
@@ -62,6 +62,9 @@ private:
   /* X-len of m_arch. */
   unsigned m_xlen;
 
+  /* Number of subsets. */
+  unsigned m_subset_num;
+
   riscv_subset_list (const char *, location_t);
 
   const char *parsing_subset_version (const char *, const char *, unsigned *,


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Implement .SAT_TRUNC for vector unsigned int

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:cbee085f82008a05b335754ea8b9764c3a78202c

commit cbee085f82008a05b335754ea8b9764c3a78202c
Author: Pan Li 
Date:   Fri Jul 5 09:02:47 2024 +0800

RISC-V: Implement .SAT_TRUNC for vector unsigned int

This patch implements .SAT_TRUNC for the RISC-V backend using the RVV
Vector Narrowing Fixed-Point Clip instructions.  The following SEW
conversions are supported:

* e64 => e32
* e64 => e16
* e64 => e8
* e32 => e16
* e32 => e8
* e16 => e8

Take the example below to see the changes to the generated asm.
Form 1:
  #define DEF_VEC_SAT_U_TRUNC_FMT_1(NT, WT)                             \
  void __attribute__((noinline))                                        \
  vec_sat_u_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \
  {                                                                     \
    unsigned i;                                                         \
    for (i = 0; i < limit; i++)                                         \
      {                                                                 \
        WT x = in[i];                                                   \
        bool overflow = x > (WT)(NT)(-1);                               \
        out[i] = ((NT)x) | (NT)-overflow;                               \
      }                                                                 \
  }

DEF_VEC_SAT_U_TRUNC_FMT_1 (uint32_t, uint64_t)

Before this patch:
.L3:
  vsetvli  a5,a2,e64,m1,ta,ma
  vle64.v  v1,0(a1)
  vmsgtu.vv  v0,v1,v2
  vsetvli  zero,zero,e32,mf2,ta,ma
  vncvt.x.x.w  v1,v1
  vmerge.vim   v1,v1,-1,v0
  vse32.v  v1,0(a0)
  slli a4,a5,3
  add  a1,a1,a4
  slli a4,a5,2
  add  a0,a0,a4
  sub  a2,a2,a5
  bne  a2,zero,.L3

After this patch:
.L3:
  vsetvli  a5,a2,e32,mf2,ta,ma
  vle64.v  v1,0(a1)
  vnclipu.wi   v1,v1,0
  vse32.v  v1,0(a0)
  slli a4,a5,3
  add  a1,a1,a4
  slli a4,a5,2
  add  a0,a0,a4
  sub  a2,a2,a5
  bne  a2,zero,.L3

Passed the full rv64gcv regression tests.
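
For reference, the per-element semantics of Form 1 can be checked in scalar C. The sketch below instantiates the macro body by hand for NT = uint32_t and WT = uint64_t and compares it with a plain clamp; the demo_* function names are hypothetical and not part of the testsuite.

```c
#include <stdbool.h>
#include <stdint.h>

/* Branchless form, mirroring the body of the test macro.  */
static uint32_t
demo_sat_u_trunc_fmt_1 (uint64_t x)
{
  bool overflow = x > (uint64_t) (uint32_t) -1;
  /* -overflow is 0 or -1; the cast turns -1 into an all-ones mask.  */
  return (uint32_t) x | (uint32_t) -overflow;
}

/* Straightforward clamp, used as a reference.  */
static uint32_t
demo_sat_u_trunc_ref (uint64_t x)
{
  return x > UINT32_MAX ? UINT32_MAX : (uint32_t) x;
}
```

Values that fit in 32 bits pass through unchanged; anything larger saturates to UINT32_MAX, which is the behaviour a single vnclipu.wi with shift 0 provides per element.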

gcc/ChangeLog:

* config/riscv/autovec.md (ustrunc2): Add
new pattern for double truncation.
(ustrunc2): Ditto but for quad truncation.
(ustrunc2): Ditto but for oct truncation.
* config/riscv/riscv-protos.h (expand_vec_double_ustrunc): Add
new func decl to expand double vec ustrunc.
(expand_vec_quad_ustrunc): Ditto but for quad.
(expand_vec_oct_ustrunc): Ditto but for oct.
* config/riscv/riscv-v.cc (expand_vec_double_ustrunc): Add new
func impl to expand vector double ustrunc.
(expand_vec_quad_ustrunc): Ditto but for quad.
(expand_vec_oct_ustrunc): Ditto but for oct.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add helper
test macros.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_data.h: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-3.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-4.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-5.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-6.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-4.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-5.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-6.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_unary_vv_run.h: New test.

Signed-off-by: Pan Li 
(cherry picked from commit dafd63d7c5cddce1e00803606e742d75927b1a1e)

Diff:
---
 gcc/config/riscv/autovec.md|  35 ++
 gcc/config/riscv/riscv-protos.h|   4 +
 gcc/config/riscv/riscv-v.cc|  46 +++
 .../riscv/rvv/autovec/binop/vec_sat_arith.h|  22 ++
 .../riscv/rvv/autovec/unop/vec_sat_data.h  | 394 +
 .../riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c |  19 +
 .../riscv/rvv/autovec/unop/vec_sat_u_trunc-2.c |  21 ++
 .../riscv/rvv/autovec/unop/vec_sat_u_trunc-3.c |  23 ++
 .../riscv/rvv/autovec/unop/vec_sat_u_trunc-4.c 

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for unsigned vector .SAT_ADD IMM form 1

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:79aca63c26b3eee221c26ee62803fa19e5996502

commit 79aca63c26b3eee221c26ee62803fa19e5996502
Author: Pan Li 
Date:   Mon Jul 8 20:31:31 2024 +0800

RISC-V: Add testcases for unsigned vector .SAT_ADD IMM form 1

Now that the middle-end supports the vector mode of .SAT_ADD, add more
testcases to ensure the correctness of the RISC-V backend for form 1, i.e.:

Form 1:
  #define DEF_VEC_SAT_U_ADD_IMM_FMT_1(T, IMM)  \
  T __attribute__((noinline))  \
  vec_sat_u_add_imm##IMM##_##T##_fmt_1 (T *out, T *in, unsigned limit) \
  {\
unsigned i;\
for (i = 0; i < limit; i++)\
  out[i] = (T)(in[i] + IMM) >= in[i] ? (in[i] + IMM) : -1; \
  }

DEF_VEC_SAT_U_ADD_IMM_FMT_1 (uint64_t, 9)

Passed the full rv64gcv regression tests.
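
The (T) cast in Form 1 matters for types narrower than int, because in[i] + IMM is computed after integer promotion and would otherwise never appear to wrap. A scalar sketch for T = uint8_t and IMM = 9 follows; the function name is hypothetical and only the expression is taken from the macro.

```c
#include <stdint.h>

/* Form 1 for a single element, T = uint8_t, IMM = 9.  */
static uint8_t
demo_sat_u_add_imm9_u8 (uint8_t x)
{
  /* x + 9 is an int; casting back to uint8_t exposes the wrap-around,
     and -1 converts to 255 (all ones) on return.  */
  return (uint8_t) (x + 9) >= x ? (x + 9) : -1;
}
```

Inputs up to 246 return x + 9 unchanged; larger inputs saturate to 255.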

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add helper
test macro.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-4.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-4.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit 35b1096896a94a90d787f5ef402ba009dd4f0393)

Diff:
---
 .../riscv/rvv/autovec/binop/vec_sat_arith.h|  25 ++
 .../riscv/rvv/autovec/binop/vec_sat_data.h | 256 +
 .../riscv/rvv/autovec/binop/vec_sat_u_add_imm-1.c  |  14 ++
 .../riscv/rvv/autovec/binop/vec_sat_u_add_imm-2.c  |  14 ++
 .../riscv/rvv/autovec/binop/vec_sat_u_add_imm-3.c  |  14 ++
 .../riscv/rvv/autovec/binop/vec_sat_u_add_imm-4.c  |  14 ++
 .../rvv/autovec/binop/vec_sat_u_add_imm-run-1.c|  28 +++
 .../rvv/autovec/binop/vec_sat_u_add_imm-run-2.c|  28 +++
 .../rvv/autovec/binop/vec_sat_u_add_imm-run-3.c|  28 +++
 .../rvv/autovec/binop/vec_sat_u_add_imm-run-4.c|  28 +++
 10 files changed, 449 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
index b55a589e019a..3733c8fd2c15 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
@@ -4,6 +4,14 @@
 #include 
 #include 
 
+#define VALIDATE_RESULT(out, expect, N)  \
+  do \
+{\
+  for (unsigned i = 0; i < N; i++)   \
+if (out[i] != expect[i]) __builtin_abort (); \
+}\
+  while (false)
+
 
/**/
 /* Saturation Add (unsigned and signed)   
*/
 
/**/
@@ -139,6 +147,23 @@ vec_sat_u_add_##T##_fmt_8 (T *out, T *op_1, T *op_2, unsigned limit) \
 #define RUN_VEC_SAT_U_ADD_FMT_8(T, out, op_1, op_2, N) \
   vec_sat_u_add_##T##_fmt_8(out, op_1, op_2, N)
 
+#define DEF_VEC_SAT_U_ADD_IMM_FMT_1(T, IMM)  \
+T __attribute__((noinline))  \
+vec_sat_u_add_imm##IMM##_##T##_fmt_1 (T *out, T *in, unsigned limit) \
+{\
+  unsigned i;\
+  for (i = 0; i < limit; i++)\
+out[i] = (T)(in[i] + IMM) >= in[i] ? (in[i] + IMM) : -1; \
+}
+#define DEF_VEC_SAT_U_ADD_IMM_FMT_1_WRAP(T, IMM) \
+  DEF_VEC_SAT_U_ADD_IMM_FMT_1(T, IMM)
+
+#define RUN_VEC_SAT_U_ADD_IMM_FMT_1(T, out, op_1, expect, IMM, N) \
+  vec_sat_u_add_imm##IMM##_##T##_fmt_1(out, op_1, N); \
+  VALIDATE_RESULT (out, expect, N)
+#define RUN_VEC_SAT_U_ADD_IMM_FMT_1_WRAP(T, out, op_1, expect, IMM, N) \
+  RUN_VEC_SAT_U_ADD_IMM_FMT_1(T, out, op_1, expect, IMM, N)
+
 
/**/
 /* Saturation Sub (Unsigned and Signed)   
*/
 
/*

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [to-be-committed][RISC-V][V3] DCE analysis for extension elimination

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:d678469ed4f98cee0180ff6a9961920404e82a14

commit d678469ed4f98cee0180ff6a9961920404e82a14
Author: Jeff Law 
Date:   Mon Jul 8 17:06:55 2024 -0600

[to-be-committed][RISC-V][V3] DCE analysis for extension elimination

The pre-commit testing showed that making ext-dce only active at -O2 and
above would require minor edits to the tests.  In some cases we had
specified -O1 in the test or specified no optimization level at all.  Those
need to be bumped to -O2.  In one test we had one set of dg-options
overriding another.

The other approach that could have been taken would be to drop the -On
argument, add an explicit -fext-dce and add dg-skip-if options.  I
originally thought that was going to be the way to go, but the dg-skip-if
aspect was going to get ugly as things like interaction between unrolling,
peeling and -ftracer would have to be accounted for and would likely need
semi-regular adjustment.

Changes since V2:
  Testsuite changes to deal with pass only being enabled at -O2 or
  higher.

--

Changes since V1:

  Check flag_ext_dce before running the new pass.  I'd forgotten that
  I had removed that part of the gate to facilitate more testing.
  Turn flag_ext_dce on at -O2 and above.
  Adjust one of the riscv tests to explicitly avoid vectors
  Adjust a few aarch64 tests
In tbz_2.c we remove an unnecessary extension which causes us to use
"x" registers instead of "w" registers.

In the pred_clobber tests we also remove an extension and that
ultimately causes a reg->reg copy to change locations.

--

This was actually ack'd late in the gcc-14 cycle, but I chose not to
integrate it given how late we were in the cycle.

The basic idea here is to track liveness of subobjects within a word and if
we find an extension where the bits set aren't actually used, then we
convert the extension into a subreg.  The subreg typically simplifies away.
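
As a source-level analogy only (the pass itself works on RTL, and nothing below is taken from ext-dce.cc), the following C sketch shows why an extension is dead when only the low bits of the result are consumed. The demo_* names are hypothetical.

```c
#include <stdint.h>

/* Sign-extend X to 64 bits, add Y, then keep only the low byte.  */
static uint8_t
demo_low_byte_ext (int32_t x, uint32_t y)
{
  uint64_t wide = (uint64_t) (int64_t) x; /* The extension.  */
  return (uint8_t) (wide + y);
}

/* Same computation without the extension: the upper bits it would have
   set never reach the result, so dropping it changes nothing.  */
static uint8_t
demo_low_byte_noext (int32_t x, uint32_t y)
{
  return (uint8_t) ((uint32_t) x + y);
}
```

Both functions return the same value for every input, because the low 8 bits of a sum depend only on the low 8 bits of its operands.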

I've seen this help a few routines in coremark, fix one bug in the testsuite
(pr111384) and fix a couple internally reported bugs in Ventana.

The original idea and code were from Joern; Jivan and I hacked it into
usable shape.  I've had this in my tester for ~8 months, so it's been
through more build/test cycles than I care to contemplate and nearly every
architecture we support.

But just in case, I'm going to wait for it to spin through the pre-commit CI
tester.  I'll find my old ChangeLog before committing.

gcc/
* Makefile.in (OBJS): Add ext-dce.o
* common.opt (ext-dce): Document new option.
* df-scan.cc (df_get_exit_block_use_set): Delete prototype and
make extern.
* df.h (df_get_exit_block_use_set): Prototype.
* ext-dce.cc: New file/pass.
* opts.cc (default_options_table): Handle ext-dce at -O2 or higher.
* passes.def: Add ext-dce before combine.
* tree-pass.h (make_pass_ext_dce): Prototype.

gcc/testsuite
* gcc.target/aarch64/sve/pred_clobber_1.c: Update expected output.
* gcc.target/aarch64/sve/pred_clobber_2.c: Likewise.
* gcc.target/aarch64/sve/pred_clobber_3.c: Likewise.
* gcc.target/aarch64/tbz_2.c: Likewise.
* gcc.target/riscv/core_bench_list.c: New test.
* gcc.target/riscv/core_init_matrix.c: New test.
* gcc.target/riscv/core_list_init.c: New test.
* gcc.target/riscv/matrix_add_const.c: New test.
* gcc.target/riscv/mem-extend.c: New test.
* gcc.target/riscv/pr111384.c: New test.

Co-authored-by: Jivan Hakobyan 
Co-authored-by: Joern Rennecke 

(cherry picked from commit 98914f9eba5f19d3eb93fbce8726b5264631cba0)

Diff:
---
 gcc/Makefile.in   |   1 +
 gcc/common.opt|   4 +
 gcc/df-scan.cc|   3 +-
 gcc/df.h  |   1 +
 gcc/ext-dce.cc| 943 ++
 gcc/opts.cc   |   1 +
 gcc/passes.def|   1 +
 gcc/testsuite/gcc.target/aarch64/tbz_2.c  |   6 +-
 gcc/testsuite/gcc.target/riscv/core_bench_list.c  |  15 +
 gcc/testsuite/gcc.target/riscv/core_init_matrix.c |  17 +
 gcc/testsuite/gcc.target/riscv/core_list_init.c   |  18 +
 gcc/testsuite/gcc.target/riscv/matrix_add_const.c |  13 +
 gcc/testsuite/gcc.target/riscv/mem-extend.c   |  14 +
 gcc/testsuite/gcc.target/riscv/pr111384.c |  11 +
 gcc/tree-pass.h   |   1 +
 15 files changed, 1044 insertions(+), 5 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a74761b7ab32..c070a38

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for unsigned vector .SAT_ADD IMM form 2

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:f3efae407d19273bf91c16e588ff63a80f0baf26

commit f3efae407d19273bf91c16e588ff63a80f0baf26
Author: Pan Li 
Date:   Mon Jul 8 21:58:59 2024 +0800

RISC-V: Add testcases for unsigned vector .SAT_ADD IMM form 2

Now that the middle-end supports the vector mode of .SAT_ADD, add more
testcases to ensure the correctness of the RISC-V backend for form 2, i.e.:

Form 2:
  #define DEF_VEC_SAT_U_ADD_IMM_FMT_2(T, IMM)  \
  T __attribute__((noinline))  \
  vec_sat_u_add_imm##IMM##_##T##_fmt_2 (T *out, T *in, unsigned limit) \
  {\
unsigned i;\
for (i = 0; i < limit; i++)\
  out[i] = (T)(in[i] + IMM) < in[i] ? -1 : (in[i] + IMM);  \
  }

DEF_VEC_SAT_U_ADD_IMM_FMT_2 (uint64_t, 9)

Passed the full rv64gcv regression tests.
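
Form 2 inverts the comparison and swaps the arms of the conditional, so it should compute the same value as Form 1 for every input. A small exhaustive check for T = uint16_t and IMM = 9 is sketched below; the demo_* names are hypothetical and only the two expressions come from the test macros.

```c
#include <stdbool.h>
#include <stdint.h>

/* Form 1: keep the sum unless it wrapped.  */
static uint16_t
demo_fmt_1 (uint16_t x)
{
  return (uint16_t) (x + 9) >= x ? (x + 9) : -1;
}

/* Form 2: saturate if the sum wrapped, else keep it.  */
static uint16_t
demo_fmt_2 (uint16_t x)
{
  return (uint16_t) (x + 9) < x ? -1 : (x + 9);
}

/* Compare both forms over the whole uint16_t range.  */
static bool
demo_forms_agree (void)
{
  for (uint32_t x = 0; x <= UINT16_MAX; x++)
    if (demo_fmt_1 ((uint16_t) x) != demo_fmt_2 ((uint16_t) x))
      return false;
  return true;
}
```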

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add helper
test macro.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-5.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-6.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-7.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-5.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-6.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-7.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-8.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit ecde8d50bea3573194f21277666f83463cbbe9c9)

Diff:
---
 .../riscv/rvv/autovec/binop/vec_sat_arith.h| 17 +
 .../riscv/rvv/autovec/binop/vec_sat_u_add_imm-5.c  | 14 +++
 .../riscv/rvv/autovec/binop/vec_sat_u_add_imm-6.c  | 14 +++
 .../riscv/rvv/autovec/binop/vec_sat_u_add_imm-7.c  | 14 +++
 .../riscv/rvv/autovec/binop/vec_sat_u_add_imm-8.c  | 14 +++
 .../rvv/autovec/binop/vec_sat_u_add_imm-run-5.c| 28 ++
 .../rvv/autovec/binop/vec_sat_u_add_imm-run-6.c| 28 ++
 .../rvv/autovec/binop/vec_sat_u_add_imm-run-7.c| 28 ++
 .../rvv/autovec/binop/vec_sat_u_add_imm-run-8.c| 28 ++
 9 files changed, 185 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
index 3733c8fd2c15..10459807b2c4 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
@@ -158,12 +158,29 @@ vec_sat_u_add_imm##IMM##_##T##_fmt_1 (T *out, T *in, unsigned limit) \
 #define DEF_VEC_SAT_U_ADD_IMM_FMT_1_WRAP(T, IMM) \
   DEF_VEC_SAT_U_ADD_IMM_FMT_1(T, IMM)
 
+#define DEF_VEC_SAT_U_ADD_IMM_FMT_2(T, IMM)  \
+T __attribute__((noinline))  \
+vec_sat_u_add_imm##IMM##_##T##_fmt_2 (T *out, T *in, unsigned limit) \
+{\
+  unsigned i;\
+  for (i = 0; i < limit; i++)\
+out[i] = (T)(in[i] + IMM) < in[i] ? -1 : (in[i] + IMM);  \
+}
+#define DEF_VEC_SAT_U_ADD_IMM_FMT_2_WRAP(T, IMM) \
+  DEF_VEC_SAT_U_ADD_IMM_FMT_2(T, IMM)
+
 #define RUN_VEC_SAT_U_ADD_IMM_FMT_1(T, out, op_1, expect, IMM, N) \
   vec_sat_u_add_imm##IMM##_##T##_fmt_1(out, op_1, N); \
   VALIDATE_RESULT (out, expect, N)
 #define RUN_VEC_SAT_U_ADD_IMM_FMT_1_WRAP(T, out, op_1, expect, IMM, N) \
   RUN_VEC_SAT_U_ADD_IMM_FMT_1(T, out, op_1, expect, IMM, N)
 
+#define RUN_VEC_SAT_U_ADD_IMM_FMT_2(T, out, op_1, expect, IMM, N) \
+  vec_sat_u_add_imm##IMM##_##T##_fmt_2(out, op_1, N); \
+  VALIDATE_RESULT (out, expect, N)
+#define RUN_VEC_SAT_U_ADD_IMM_FMT_2_WRAP(T, out, op_1, expect, IMM, N) \
+  RUN_VEC_SAT_U_ADD_IMM_FMT_2(T, out, op_1, expect, IMM, N)
+
 
/**/
 /* Saturation Sub (Unsigned and Signed)   
*/
 
/**/
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-5.c
new file mode 100644
index ..d25fdcf78f38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-5.c
@@ -0,0 +1,14 @@

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: testsuite: Properly gate LTO tests

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:6f195492e9798607b475c14934b0b77cf258a2a6

commit 6f195492e9798607b475c14934b0b77cf258a2a6
Author: Christoph Müllner 
Date:   Fri Jul 5 09:53:34 2024 +0200

RISC-V: testsuite: Properly gate LTO tests

There are two test cases with the following skip directive:
  dg-skip-if "" { *-*-* } { "-flto -fno-fat-lto-objects" }
This reads as: skip if both '-flto' and '-fno-fat-lto-objects'
are present.  This is not the case if only '-flto' is present.

Since both tests depend on instruction sequences (one uses
check-function-bodies, the other tests for an assembler error
message), they won't work reliably with fat LTO objects.

Let's change the skip line to gate the test on '-flto'
to avoid failing tests like this:

FAIL: gcc.target/riscv/interrupt-misaligned.c   -O2 -flto   check-function-bodies interrupt
FAIL: gcc.target/riscv/interrupt-misaligned.c   -O2 -flto -flto-partition=none   check-function-bodies interrupt
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto   (test for errors, line 10)
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto   (test for errors, line 9)
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto -flto-partition=none   (test for errors, line 10)
FAIL: gcc.target/riscv/pr93202.c   -O2 -flto -flto-partition=none   (test for errors, line 9)

gcc/testsuite/ChangeLog:

* gcc.target/riscv/interrupt-misaligned.c: Remove
"-fno-fat-lto-objects" from skip condition.
* gcc.target/riscv/pr93202.c: Likewise.

Signed-off-by: Christoph Müllner 
(cherry picked from commit 0717d50fc4ff983b79093bdef43b04e4584cc3cd)

Diff:
---
 gcc/testsuite/gcc.target/riscv/interrupt-misaligned.c | 2 +-
 gcc/testsuite/gcc.target/riscv/pr93202.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/interrupt-misaligned.c b/gcc/testsuite/gcc.target/riscv/interrupt-misaligned.c
index b5f8e6c2bbef..912f180e4d65 100644
--- a/gcc/testsuite/gcc.target/riscv/interrupt-misaligned.c
+++ b/gcc/testsuite/gcc.target/riscv/interrupt-misaligned.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -march=rv64gc -mabi=lp64d -fno-schedule-insns -fno-schedule-insns2" } */
-/* { dg-skip-if "" { *-*-* } { "-flto -fno-fat-lto-objects" } } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
 
 /*  Make sure no stack offset are misaligned.
 **  interrupt:
diff --git a/gcc/testsuite/gcc.target/riscv/pr93202.c b/gcc/testsuite/gcc.target/riscv/pr93202.c
index 5501191ea52c..5de003fac421 100644
--- a/gcc/testsuite/gcc.target/riscv/pr93202.c
+++ b/gcc/testsuite/gcc.target/riscv/pr93202.c
@@ -1,7 +1,7 @@
 /* PR inline-asm/93202 */
 /* { dg-do compile { target fpic } } */
 /* { dg-options "-fpic" } */
-/* { dg-skip-if "" { *-*-* } { "-flto -fno-fat-lto-objects" } } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
 
 void
 foo (void)


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Deduplicate arch subset list processing

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:6b9226f62dd3a5425815a0b1ddf64215548b2669

commit 6b9226f62dd3a5425815a0b1ddf64215548b2669
Author: Christoph Müllner 
Date:   Fri Jul 5 01:09:46 2024 +0200

RISC-V: Deduplicate arch subset list processing

The code in riscv_set_arch_by_subset_list() is duplicated in
riscv_parse_arch_string(): the latter parses an ISA string into a
subset_list and then does the same work as the former.

riscv_parse_arch_string() is used to process command line options and
riscv_set_arch_by_subset_list() processes target attributes.
Both functions should therefore behave the same.
Let's deduplicate the code to enforce this.
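
The shape of the deduplication can be sketched with toy types. Everything here is hypothetical (the demo_* names, the single flag bit, and the prefix-only parser); the point is only that the string-based entry point parses and then delegates to the list-based one, so the two paths cannot drift apart.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_MASK_64BIT 0x1u /* Hypothetical flag bit.  */

struct demo_opts { unsigned target_flags; };
struct demo_arch_list { unsigned xlen; };

/* Shared worker: the only place that derives flags from a list.  */
static void
demo_set_arch_by_list (const struct demo_arch_list *list,
                       struct demo_opts *opts)
{
  if (!opts)
    return;
  if (list->xlen == 32)
    opts->target_flags &= ~DEMO_MASK_64BIT;
  else if (list->xlen == 64)
    opts->target_flags |= DEMO_MASK_64BIT;
}

/* Toy parser: only looks at the rv32/rv64 prefix.  */
static bool
demo_parse (const char *isa, struct demo_arch_list *list)
{
  if (strncmp (isa, "rv32", 4) == 0)
    list->xlen = 32;
  else if (strncmp (isa, "rv64", 4) == 0)
    list->xlen = 64;
  else
    return false;
  return true;
}

/* String entry point: parse, then reuse the shared worker.  */
static void
demo_parse_arch_string (const char *isa, struct demo_opts *opts)
{
  struct demo_arch_list list;
  if (!demo_parse (isa, &list))
    return;
  demo_set_arch_by_list (&list, opts);
}
```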

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc 
(riscv_set_arch_by_subset_list):
Fix overlong line.
(riscv_parse_arch_string): Replace duplicated code by a call to
riscv_set_arch_by_subset_list.

Signed-off-by: Christoph Müllner 
(cherry picked from commit 85fa334fbcaa8e4b98ab197a8c9410dde87f0ae3)

Diff:
---
 gcc/common/config/riscv/riscv-common.cc | 32 ++--
 1 file changed, 6 insertions(+), 26 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index b9bda3e110a2..dab2e7679653 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1826,7 +1826,8 @@ riscv_set_arch_by_subset_list (riscv_subset_list *subset_list,
   else if (subset_list->xlen () == 64)
opts->x_target_flags |= MASK_64BIT;
 
-  for (arch_ext_flag_tab = &riscv_ext_flag_table[0]; arch_ext_flag_tab->ext;
+  for (arch_ext_flag_tab = &riscv_ext_flag_table[0];
+  arch_ext_flag_tab->ext;
   ++arch_ext_flag_tab)
{
  if (subset_list->lookup (arch_ext_flag_tab->ext))
@@ -1850,30 +1851,6 @@ riscv_parse_arch_string (const char *isa,
   if (!subset_list)
 return;
 
-  if (opts)
-{
-  const riscv_ext_flag_table_t *arch_ext_flag_tab;
-  /* Clean up target flags before we set.  */
-  for (arch_ext_flag_tab = &riscv_ext_flag_table[0];
-  arch_ext_flag_tab->ext;
-  ++arch_ext_flag_tab)
-   opts->*arch_ext_flag_tab->var_ref &= ~arch_ext_flag_tab->mask;
-
-  if (subset_list->xlen () == 32)
-   opts->x_target_flags &= ~MASK_64BIT;
-  else if (subset_list->xlen () == 64)
-   opts->x_target_flags |= MASK_64BIT;
-
-
-  for (arch_ext_flag_tab = &riscv_ext_flag_table[0];
-  arch_ext_flag_tab->ext;
-  ++arch_ext_flag_tab)
-   {
- if (subset_list->lookup (arch_ext_flag_tab->ext))
-   opts->*arch_ext_flag_tab->var_ref |= arch_ext_flag_tab->mask;
-   }
-}
-
   /* Avoid double delete if current_subset_list equals cmdline_subset_list.  */
   if (current_subset_list && current_subset_list != cmdline_subset_list)
 delete current_subset_list;
@@ -1881,7 +1858,10 @@ riscv_parse_arch_string (const char *isa,
   if (cmdline_subset_list)
 delete cmdline_subset_list;
 
-  current_subset_list = cmdline_subset_list = subset_list;
+  cmdline_subset_list = subset_list;
+  /* current_subset_list is set in the call below.  */
+
+  riscv_set_arch_by_subset_list (subset_list, opts);
 }
 
 /* Return the riscv_cpu_info entry for CPU, NULL if not found.  */


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Fix comment/naming in attribute parsing code

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:16d1d5b7db194fa7e0373bf78fccabdf8ef0d009

commit 16d1d5b7db194fa7e0373bf78fccabdf8ef0d009
Author: Christoph Müllner 
Date:   Fri Jul 5 04:58:07 2024 +0200

RISC-V: Fix comment/naming in attribute parsing code

Function target attributes have to be separated by semi-colons.
Let's fix the comment and variable naming to better explain what
the code does.
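
The check that the renamed variable supports can be sketched as follows: count the semicolons, count the non-empty tokens, and require tokens == semicolons + 1. This is a simplified stand-alone illustration with hypothetical demo_* names; it uses standard strtok where the GCC code uses strtok_r, which skips empty tokens in the same way.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Count occurrences of C in S.  */
static unsigned
demo_count_char (char c, const char *s)
{
  unsigned n = 0;
  for (; *s; s++)
    if (*s == c)
      n++;
  return n;
}

/* Return true if ARGS is a ';'-separated list with no empty entries.  */
static bool
demo_attr_list_ok (const char *args)
{
  size_t len = strlen (args);
  char *buf = malloc (len + 1);
  if (!buf)
    return false;
  memcpy (buf, args, len + 1);

  /* Count separators before strtok overwrites them with NULs.  */
  unsigned num_semicolons = demo_count_char (';', buf);

  /* strtok never returns empty tokens, so "a;;b" yields two tokens.  */
  unsigned num_attrs = 0;
  for (char *tok = strtok (buf, ";"); tok; tok = strtok (NULL, ";"))
    num_attrs++;

  free (buf);
  return num_attrs == num_semicolons + 1;
}
```

With this check, "attr1;attr2" is accepted, while "attr1;;attr2" and "attr1;" are rejected because the token count falls short of the separator count plus one.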

gcc/ChangeLog:

* config/riscv/riscv-target-attr.cc (riscv_process_target_attr):
Fix comments and variable names.

Signed-off-by: Christoph Müllner 
(cherry picked from commit 5ef0b7d2048a7142174ee3e8e021fc1a9c3e3334)

Diff:
---
 gcc/config/riscv/riscv-target-attr.cc | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-target-attr.cc b/gcc/config/riscv/riscv-target-attr.cc
index 19eb7b06d548..0bbe7df25d19 100644
--- a/gcc/config/riscv/riscv-target-attr.cc
+++ b/gcc/config/riscv/riscv-target-attr.cc
@@ -338,11 +338,11 @@ riscv_process_target_attr (tree fndecl, tree args, location_t loc,
   char *str_to_check = buf.get ();
   strcpy (str_to_check, TREE_STRING_POINTER (args));
 
-  /* Used to catch empty spaces between commas i.e.
+  /* Used to catch empty spaces between semi-colons i.e.
  attribute ((target ("attr1;;attr2"))).  */
-  unsigned int num_commas = num_occurences_in_str (';', str_to_check);
+  unsigned int num_semicolons = num_occurences_in_str (';', str_to_check);
 
-  /* Handle multiple target attributes separated by ','.  */
+  /* Handle multiple target attributes separated by ';'.  */
   char *token = strtok_r (str_to_check, ";", &str_to_check);
 
   riscv_target_attr_parser attr_parser (loc);
@@ -354,7 +354,7 @@ riscv_process_target_attr (tree fndecl, tree args, location_t loc,
   token = strtok_r (NULL, ";", &str_to_check);
 }
 
-  if (num_attrs != num_commas + 1)
+  if (num_attrs != num_semicolons + 1)
 {
   error_at (loc, "malformed % attribute",
TREE_STRING_POINTER (args));


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: fix zcmp popretz [PR113715]

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:4649bea8adb5b29d566dd4c63f68e56a877ebefc

commit 4649bea8adb5b29d566dd4c63f68e56a877ebefc
Author: Fei Gao 
Date:   Tue Jul 9 10:00:29 2024 +

RISC-V: fix zcmp popretz [PR113715]

No functional changes compared with V1, just space-to-tab conversion
in testcases to pass check-function-bodies.

V1 passed regression testing locally but surprisingly failed in pre-commit
CI.  After picking the patch from patchwork, I realized the tabs had been
converted to spaces before the patch was sent.

Root cause:

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
The commit above tries, in targetm.gen_epilogue (), to detect whether
there's a li a0,0 insn at the end of the insn chain; if so, cm.popret
is replaced by cm.popretz and the li a0,0 insn is deleted.

However, the generated epilogue sequence has not yet been inserted
into the insn chain at that point.
If shrink-wrap later decides NOT to insert the epilogue sequence at the end
of the insn chain, the li a0,0 insn has already been mistakenly removed.

Fix this issue by removing generation of cm.popretz in the epilogue,
leaving the assignment to a0 and the use insn in place with cm.popret.

That's likely going to result in some kind of code size regression,
but not a correctness regression.

The optimization can be done in the future.

Signed-off-by: Fei Gao 

gcc/ChangeLog:
PR target/113715

* config/riscv/riscv.cc (riscv_zcmp_can_use_popretz): Removed.
(riscv_gen_multi_pop_insn): Remove generation of cm.popretz.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rv32e_zcmp.c: Adapt TC.
* gcc.target/riscv/rv32i_zcmp.c: Likewise.

(cherry picked from commit 7a345d0314f8cf0f15ca3664b1e4430d65764570)

Diff:
---
 gcc/config/riscv/riscv.cc   | 53 -
 gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c |  3 +-
 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c |  3 +-
 3 files changed, 4 insertions(+), 55 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 4acd643fd8d3..ce73c18f88f8 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8167,52 +8167,6 @@ riscv_adjust_libcall_cfi_epilogue ()
   return dwarf;
 }
 
-/* return true if popretz pattern can be matched.
-   set (reg 10 a0) (const_int 0)
-   use (reg 10 a0)
-   NOTE_INSN_EPILOGUE_BEG  */
-static rtx_insn *
-riscv_zcmp_can_use_popretz (void)
-{
-  rtx_insn *insn = NULL, *use = NULL, *clear = NULL;
-
-  /* sequence stack for NOTE_INSN_EPILOGUE_BEG*/
-  struct sequence_stack *outer_seq = get_current_sequence ()->next;
-  if (!outer_seq)
-return NULL;
-  insn = outer_seq->first;
-  if (!insn || !NOTE_P (insn) || NOTE_KIND (insn) != NOTE_INSN_EPILOGUE_BEG)
-return NULL;
-
-  /* sequence stack for the insn before NOTE_INSN_EPILOGUE_BEG*/
-  outer_seq = outer_seq->next;
-  if (outer_seq)
-insn = outer_seq->last;
-
-  /* skip notes  */
-  while (insn && NOTE_P (insn))
-{
-  insn = PREV_INSN (insn);
-}
-  use = insn;
-
-  /* match use (reg 10 a0)  */
-  if (use == NULL || !INSN_P (use) || GET_CODE (PATTERN (use)) != USE
-  || !REG_P (XEXP (PATTERN (use), 0))
-  || REGNO (XEXP (PATTERN (use), 0)) != A0_REGNUM)
-return NULL;
-
-  /* match set (reg 10 a0) (const_int 0 [0])  */
-  clear = PREV_INSN (use);
-  if (clear != NULL && INSN_P (clear) && GET_CODE (PATTERN (clear)) == SET
-  && REG_P (SET_DEST (PATTERN (clear)))
-  && REGNO (SET_DEST (PATTERN (clear))) == A0_REGNUM
-  && SET_SRC (PATTERN (clear)) == const0_rtx)
-return clear;
-
-  return NULL;
-}
-
 static void
 riscv_gen_multi_pop_insn (bool use_multi_pop_normal, unsigned mask,
  unsigned multipop_size)
@@ -8223,13 +8177,6 @@ riscv_gen_multi_pop_insn (bool use_multi_pop_normal, unsigned mask,
   if (!use_multi_pop_normal)
 insn = emit_insn (
   riscv_gen_multi_push_pop_insn (POP_IDX, multipop_size, regs_count));
-  else if (rtx_insn *clear_a0_insn = riscv_zcmp_can_use_popretz ())
-{
-  delete_insn (NEXT_INSN (clear_a0_insn));
-  delete_insn (clear_a0_insn);
-  insn = emit_jump_insn (
-   riscv_gen_multi_push_pop_insn (POPRETZ_IDX, multipop_size, regs_count));
-}
   else
 insn = emit_jump_insn (
   riscv_gen_multi_push_pop_insn (POPRET_IDX, multipop_size, regs_count));
diff --git a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
index 50e443573ad9..0af4d7199f68 100644
--- a/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
+++ b/gcc/testsuite/gcc.target/riscv/rv32e_zcmp.c
@@ -259,7 +259,8 @@ foo (void)
 **test_popretz:
 ** cm.push {ra}, -16
 ** call  f1
-** cm.popretz  {ra}, 16
+** li  a0,0
+** cm.popret   {ra}, 16
 */
 long
 test_popretz ()
diff --git a/gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c 
b/gcc/testsu

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add support for B standard extension

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:ac013ac6c759c6398db35fcc76265300375f5b37

commit ac013ac6c759c6398db35fcc76265300375f5b37
Author: Edwin Lu 
Date:   Wed Jul 10 09:44:48 2024 -0700

RISC-V: Add support for B standard extension

This patch adds support for recognizing the B standard extension as the
collection of the Zba, Zbb, and Zbs extensions, for consistency and
conciseness across toolchains.

https://github.com/riscv/riscv-b/tags
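
The imply rules for B amount to three table entries keyed on "b". A minimal lookup sketch is shown below; the three pairs are the ones added by this patch, while the struct, table, and function names are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

struct demo_implied_info { const char *ext; const char *implied_ext; };

/* The three implications this patch adds for "b".  */
static const struct demo_implied_info demo_b_rules[] = {
  {"b", "zba"},
  {"b", "zbb"},
  {"b", "zbs"},
};

/* Return true if EXT directly implies CANDIDATE per the table above.  */
static bool
demo_b_implies (const char *ext, const char *candidate)
{
  for (unsigned i = 0; i < sizeof demo_b_rules / sizeof demo_b_rules[0]; i++)
    if (strcmp (demo_b_rules[i].ext, ext) == 0
        && strcmp (demo_b_rules[i].implied_ext, candidate) == 0)
      return true;
  return false;
}
```

The rules are one-directional: "b" pulls in zba, zbb, and zbs, but none of those implies "b" through this table.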

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add imply rules for B
extension.
* config/riscv/arch-canonicalize: Ditto.

Signed-off-by: Edwin Lu 
(cherry picked from commit 2a90c41a131080e5fdd2b5554fcdba5c654cb93f)

Diff:
---
 gcc/common/config/riscv/riscv-common.cc | 7 +++
 gcc/config/riscv/arch-canonicalize  | 1 +
 2 files changed, 8 insertions(+)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index dab2e7679653..b0a16f5bd30f 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -84,6 +84,10 @@ static const riscv_implied_info_t riscv_implied_info[] =
 
   {"zabha", "zaamo"},
 
+  {"b", "zba"},
+  {"b", "zbb"},
+  {"b", "zbs"},
+
   {"zdinx", "zfinx"},
   {"zfinx", "zicsr"},
   {"zdinx", "zicsr"},
@@ -245,6 +249,8 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"c", ISA_SPEC_CLASS_20190608, 2, 0},
   {"c", ISA_SPEC_CLASS_2P2,  2, 0},
 
+  {"b",   ISA_SPEC_CLASS_NONE, 1, 0},
+
   {"h",   ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"v",   ISA_SPEC_CLASS_NONE, 1, 0},
@@ -405,6 +411,7 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
 static const struct riscv_ext_version riscv_combine_info[] =
 {
   {"a", ISA_SPEC_CLASS_20191213, 2, 1},
+  {"b",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"zk",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"zkn",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"zks",  ISA_SPEC_CLASS_NONE, 1, 0},
diff --git a/gcc/config/riscv/arch-canonicalize 
b/gcc/config/riscv/arch-canonicalize
index 35a7fe4455a6..2ea514dd9869 100755
--- a/gcc/config/riscv/arch-canonicalize
+++ b/gcc/config/riscv/arch-canonicalize
@@ -45,6 +45,7 @@ IMPLIED_EXT = {
   "zabha" : ["zaamo"],
 
   "f" : ["zicsr"],
+  "b" : ["zba", "zbb", "zbs"],
   "zdinx" : ["zfinx", "zicsr"],
   "zfinx" : ["zicsr"],
   "zhinx" : ["zhinxmin", "zfinx", "zicsr"],


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Update testsuite to use b

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:6b32e85e97ef5f5654bb070172659f967c5ba50f

commit 6b32e85e97ef5f5654bb070172659f967c5ba50f
Author: Edwin Lu 
Date:   Wed Jul 3 17:17:27 2024 -0700

RISC-V: Update testsuite to use b

Update all instances of zba_zbb_zbs in the testsuite to use b instead

gcc/testsuite/ChangeLog:

* g++.target/riscv/redundant-bitmap-1.C: Use gcb instead of
zba_zbb_zbs
* g++.target/riscv/redundant-bitmap-2.C: Ditto
* g++.target/riscv/redundant-bitmap-3.C: Ditto
* g++.target/riscv/redundant-bitmap-4.C: Ditto
* gcc.target/riscv/shift-add-1.c: Ditto
* gcc.target/riscv/shift-add-2.c: Ditto
* gcc.target/riscv/synthesis-1.c: Ditto
* gcc.target/riscv/synthesis-2.c: Ditto
* gcc.target/riscv/synthesis-3.c: Ditto
* gcc.target/riscv/synthesis-4.c: Ditto
* gcc.target/riscv/synthesis-5.c: Ditto
* gcc.target/riscv/synthesis-6.c: Ditto
* gcc.target/riscv/synthesis-7.c: Ditto
* gcc.target/riscv/synthesis-8.c: Ditto
* gcc.target/riscv/zba_zbs_and-1.c: Ditto
* gcc.target/riscv/zbs-zext-3.c: Ditto
* lib/target-supports.exp: Add b to riscv_get_arch

Signed-off-by: Edwin Lu 
(cherry picked from commit 04df2a924bba38c271bfe4ed0e94af1877413818)

Diff:
---
 gcc/testsuite/g++.target/riscv/redundant-bitmap-1.C | 2 +-
 gcc/testsuite/g++.target/riscv/redundant-bitmap-2.C | 2 +-
 gcc/testsuite/g++.target/riscv/redundant-bitmap-3.C | 2 +-
 gcc/testsuite/g++.target/riscv/redundant-bitmap-4.C | 2 +-
 gcc/testsuite/gcc.target/riscv/shift-add-1.c| 2 +-
 gcc/testsuite/gcc.target/riscv/shift-add-2.c| 2 +-
 gcc/testsuite/gcc.target/riscv/synthesis-1.c| 2 +-
 gcc/testsuite/gcc.target/riscv/synthesis-2.c| 2 +-
 gcc/testsuite/gcc.target/riscv/synthesis-3.c| 2 +-
 gcc/testsuite/gcc.target/riscv/synthesis-4.c| 2 +-
 gcc/testsuite/gcc.target/riscv/synthesis-5.c| 2 +-
 gcc/testsuite/gcc.target/riscv/synthesis-6.c| 2 +-
 gcc/testsuite/gcc.target/riscv/synthesis-7.c| 2 +-
 gcc/testsuite/gcc.target/riscv/synthesis-8.c| 2 +-
 gcc/testsuite/gcc.target/riscv/zba_zbs_and-1.c  | 2 +-
 gcc/testsuite/gcc.target/riscv/zbs-zext-3.c | 4 ++--
 gcc/testsuite/lib/target-supports.exp   | 2 +-
 17 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/g++.target/riscv/redundant-bitmap-1.C 
b/gcc/testsuite/g++.target/riscv/redundant-bitmap-1.C
index 37066f10eeae..62bb2ab7b67d 100644
--- a/gcc/testsuite/g++.target/riscv/redundant-bitmap-1.C
+++ b/gcc/testsuite/g++.target/riscv/redundant-bitmap-1.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=rv64gc_zba_zbb_zbs -mabi=lp64" } */
+/* { dg-options "-O2 -march=rv64gcb -mabi=lp64" } */
 
 void setBit(char &a, int b) {
 char c = 0x1UL << b;
diff --git a/gcc/testsuite/g++.target/riscv/redundant-bitmap-2.C 
b/gcc/testsuite/g++.target/riscv/redundant-bitmap-2.C
index 86acaba298fc..52204daecd11 100644
--- a/gcc/testsuite/g++.target/riscv/redundant-bitmap-2.C
+++ b/gcc/testsuite/g++.target/riscv/redundant-bitmap-2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=rv64gc_zba_zbb_zbs -mabi=lp64" } */
+/* { dg-options "-O2 -march=rv64gcb -mabi=lp64" } */
 
 void setBit(char &a, int b) {
 char c = 0x1UL << b;
diff --git a/gcc/testsuite/g++.target/riscv/redundant-bitmap-3.C 
b/gcc/testsuite/g++.target/riscv/redundant-bitmap-3.C
index 16bd7c1785e7..6745220f2f41 100644
--- a/gcc/testsuite/g++.target/riscv/redundant-bitmap-3.C
+++ b/gcc/testsuite/g++.target/riscv/redundant-bitmap-3.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=rv64gc_zba_zbb_zbs -mabi=lp64" } */
+/* { dg-options "-O2 -march=rv64gcb -mabi=lp64" } */
 
 void setBit(char &a, int b) {
 char c = 0x1UL << b;
diff --git a/gcc/testsuite/g++.target/riscv/redundant-bitmap-4.C 
b/gcc/testsuite/g++.target/riscv/redundant-bitmap-4.C
index f664ee01a016..5e351fe457e9 100644
--- a/gcc/testsuite/g++.target/riscv/redundant-bitmap-4.C
+++ b/gcc/testsuite/g++.target/riscv/redundant-bitmap-4.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=rv64gc_zba_zbb_zbs -mabi=lp64" } */
+/* { dg-options "-O2 -march=rv64gcb -mabi=lp64" } */
 
 void setBit(char &a, int b) {
 char c = 0x1UL << b;
diff --git a/gcc/testsuite/gcc.target/riscv/shift-add-1.c 
b/gcc/testsuite/gcc.target/riscv/shift-add-1.c
index d98875c32716..db84a51a2227 100644
--- a/gcc/testsuite/gcc.target/riscv/shift-add-1.c
+++ b/gcc/testsuite/gcc.target/riscv/shift-add-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gc_zba_zbb_zbs -mabi=lp64" } */
+/* { dg-options "-march=rv64gcb -mabi=lp64" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
 
 int composeFromSurrogate(const unsigned short high) {
diff --git a/gcc/testsuite/gcc.

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: c implies zca, and conditionally zcf & zcd

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:b3586773824f09bf2bf16ce398d2f988428c0821

commit b3586773824f09bf2bf16ce398d2f988428c0821
Author: Fei Gao 
Date:   Wed Jul 10 10:12:02 2024 +

RISC-V: c implies zca, and conditionally zcf & zcd

According to Zc-1.0.4-3.pdf from

https://github.com/riscvarchive/riscv-code-size-reduction/releases/tag/v1.0.4-3
The rule is that:
- C always implies Zca
- C+F implies Zcf (RV32 only)
- C+D implies Zcd
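
The three rules can be written down as a small predicate.  This is only a
sketch in plain C: the struct c_implied and the function implied_by_c are
hypothetical names, and the real patch expresses the conditions as
predicates attached to riscv_implied_info entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags describing the subsets implied by C.  */
struct c_implied
{
  bool zca;
  bool zcf;
  bool zcd;
};

/* Apply the three rules quoted above for a target with the given XLEN
   (32 or 64) and with F and/or D present.  */
static struct c_implied
implied_by_c (int xlen, bool has_f, bool has_d)
{
  assert (xlen == 32 || xlen == 64);
  struct c_implied r;
  r.zca = true;                 /* C always implies Zca.  */
  r.zcf = xlen == 32 && has_f;  /* C+F implies Zcf, RV32 only.  */
  r.zcd = has_d;                /* C+D implies Zcd.  */
  return r;
}
```

For an rv32 target with F, D and C this gives all three of zca, zcf and
zcd, which matches the updated arch strings in the attribute tests.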

Signed-off-by: Fei Gao 
gcc/ChangeLog:

* common/config/riscv/riscv-common.cc:
c implies zca, and conditionally zcf & zcd.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/attribute-15.c: adapt TC.
* gcc.target/riscv/attribute-16.c: likewise.
* gcc.target/riscv/attribute-17.c: likewise.
* gcc.target/riscv/attribute-18.c: likewise.
* gcc.target/riscv/pr110696.c: likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-zcmp.c: likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-2-zcmp.c: likewise.
* gcc.target/riscv/rvv/base/pr114352-1.c: likewise.
* gcc.target/riscv/rvv/base/pr114352-3.c: likewise.
* gcc.target/riscv/arch-39.c: New test.
* gcc.target/riscv/arch-40.c: New test.

(cherry picked from commit 36e5e409190e595638cec053ea034d20d5c74d6b)

Diff:
---
 gcc/common/config/riscv/riscv-common.cc  | 12 
 gcc/testsuite/gcc.target/riscv/arch-39.c |  7 +++
 gcc/testsuite/gcc.target/riscv/arch-40.c |  7 +++
 gcc/testsuite/gcc.target/riscv/attribute-15.c|  2 +-
 gcc/testsuite/gcc.target/riscv/attribute-16.c|  2 +-
 gcc/testsuite/gcc.target/riscv/attribute-17.c|  2 +-
 gcc/testsuite/gcc.target/riscv/attribute-18.c|  2 +-
 gcc/testsuite/gcc.target/riscv/pr110696.c|  2 +-
 .../gcc.target/riscv/rvv/base/abi-callee-saved-1-zcmp.c  |  2 +-
 .../gcc.target/riscv/rvv/base/abi-callee-saved-2-zcmp.c  |  2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-1.c |  4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-3.c |  8 
 12 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index b0a16f5bd30f..3c4178c19c99 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -82,6 +82,18 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"a", "zaamo"},
   {"a", "zalrsc"},
 
+  {"c", "zca"},
+  {"c", "zcf",
+   [] (const riscv_subset_list *subset_list) -> bool
+   {
+ return subset_list->xlen () == 32 && subset_list->lookup ("f");
+   }},
+  {"c", "zcd",
+   [] (const riscv_subset_list *subset_list) -> bool
+   {
+ return subset_list->lookup ("d");
+   }},
+
   {"zabha", "zaamo"},
 
   {"b", "zba"},
diff --git a/gcc/testsuite/gcc.target/riscv/arch-39.c 
b/gcc/testsuite/gcc.target/riscv/arch-39.c
new file mode 100644
index ..beeb81e44c50
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/arch-39.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64idc_zcmt -mabi=lp64d" } */
+int
+foo ()
+{}
+
+/* { dg-error "zcd conflicts with zcmt" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/riscv/arch-40.c 
b/gcc/testsuite/gcc.target/riscv/arch-40.c
new file mode 100644
index ..eaefaf1d0d75
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/arch-40.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64idc_zcmp -mabi=lp64d" } */
+int
+foo ()
+{}
+
+/* { dg-error "zcd conflicts with zcmp" "" { target *-*-* } 0 } */
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-15.c 
b/gcc/testsuite/gcc.target/riscv/attribute-15.c
index a2e394b6489b..ac6caaecd4f7 100644
--- a/gcc/testsuite/gcc.target/riscv/attribute-15.c
+++ b/gcc/testsuite/gcc.target/riscv/attribute-15.c
@@ -3,4 +3,4 @@
 int foo()
 {
 }
-/* { dg-final { scan-assembler ".attribute arch, 
\"rv32i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_zaamo1p0_zalrsc1p0\"" } } */
+/* { dg-final { scan-assembler ".attribute arch, 
\"rv32i2p0_m2p0_a2p0_f2p0_d2p0_c2p0_zaamo1p0_zalrsc1p0_zca1p0_zcd1p0_zcf1p0\"" 
} } */
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-16.c 
b/gcc/testsuite/gcc.target/riscv/attribute-16.c
index d2b18160cb5d..539e426ca976 100644
--- a/gcc/testsuite/gcc.target/riscv/attribute-16.c
+++ b/gcc/testsuite/gcc.target/riscv/attribute-16.c
@@ -3,4 +3,4 @@
 int foo()
 {
 }
-/* { dg-final { scan-assembler ".attribute arch, 
\"rv32i2p1_m2p0_a2p0_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0\"" 
} } */
+/* { dg-final { scan-assembler ".attribute arch, 
\"rv32i2p1_m2p0_a2p0_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0_zca1p0_zcd1p0_zcf1p0\""
 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/attribute-17.c 
b/g

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add testcases for vector .SAT_SUB in zip benchmark

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:1726acdf1f7a3c3129a08fa571d750c5d09f8176

commit 1726acdf1f7a3c3129a08fa571d750c5d09f8176
Author: Pan Li 
Date:   Thu Jul 11 15:54:32 2024 +0800

RISC-V: Add testcases for vector .SAT_SUB in zip benchmark

This patch adds test cases for the vector .SAT_SUB form found in
the zip benchmark, namely:

Form in zip benchmark:
  #define DEF_VEC_SAT_U_SUB_ZIP(T1, T2) \
  void __attribute__((noinline))\
  vec_sat_u_sub_##T1##_##T2##_fmt_zip (T1 *x, T2 b, unsigned limit) \
  { \
T2 a;   \
T1 *p = x;  \
do {\
  a = *--p; \
  *p = (T1)(a >= b ? a - b : 0);\
} while (--limit);  \
  }

DEF_VEC_SAT_U_SUB_ZIP(uint16_t, uint32_t)

vec_sat_u_sub_uint16_t_uint32_t_fmt_zip:
  ...
  vsetvli       a4,zero,e32,m1,ta,ma
  vmv.v.x       v6,a1
  vsetvli       zero,zero,e16,mf2,ta,ma
  vid.v         v2
  li            a4,-1
  vnclipu.wi    v6,v6,0   // .SAT_TRUNC
.L3:
  vle16.v       v3,0(a3)
  vrsub.vx      v5,v2,a6
  mv            a7,a4
  addw          a4,a4,t3
  vrgather.vv   v1,v3,v5
  vssubu.vv     v1,v1,v6  // .SAT_SUB
  vrgather.vv   v3,v1,v5
  vse16.v       v3,0(a3)
  sub           a3,a3,t1
  bgtu          t4,a4,.L3

Passed the rv64gcv tests.
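
For reference, here is a scalar sketch of what the zip-form loop computes,
written out for 16-bit elements and a 32-bit subtrahend to match the
function name in the listing above.  The name sat_u_sub_zip_u16 is
hypothetical; note that X points one past the last element, because the
loop pre-decrements.

```c
#include <assert.h>
#include <stdint.h>

/* Walk LIMIT elements backwards from X, as the zip-form loop does, and
   replace each element A with A - B, clamped at zero.  The subtrahend B
   is wider than the elements.  LIMIT must be nonzero.  */
static void
sat_u_sub_zip_u16 (uint16_t *x, uint32_t b, unsigned limit)
{
  assert (limit != 0);
  uint16_t *p = x;
  do
    {
      uint32_t a = *--p;
      *p = (uint16_t) (a >= b ? a - b : 0);
    }
  while (--limit);
}
```

Any B larger than 65535 forces every element to zero, which is the case
the .SAT_TRUNC of B ahead of the loop has to preserve.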

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test
data for .SAT_SUB in zip benchmark.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip-run.c: New 
test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip.c: New test.

Signed-off-by: Pan Li 
(cherry picked from commit b3c686416e88bf135def0e72d316713af01445a1)

Diff:
---
 .../riscv/rvv/autovec/binop/vec_sat_arith.h| 18 +
 .../riscv/rvv/autovec/binop/vec_sat_binary_vx.h| 22 ++
 .../riscv/rvv/autovec/binop/vec_sat_data.h | 81 ++
 .../rvv/autovec/binop/vec_sat_u_sub_zip-run.c  | 16 +
 .../riscv/rvv/autovec/binop/vec_sat_u_sub_zip.c| 18 +
 5 files changed, 155 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
index 10459807b2c4..416a1e49a47b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h
@@ -322,6 +322,19 @@ vec_sat_u_sub_##T##_fmt_10 (T *out, T *op_1, T *op_2, 
unsigned limit) \
 } \
 }
 
+#define DEF_VEC_SAT_U_SUB_ZIP(T1, T2) \
+void __attribute__((noinline))\
+vec_sat_u_sub_##T1##_##T2##_fmt_zip (T1 *x, T2 b, unsigned limit) \
+{ \
+  T2 a;   \
+  T1 *p = x;  \
+  do {\
+a = *--p; \
+*p = (T1)(a >= b ? a - b : 0);\
+  } while (--limit);  \
+}
+#define DEF_VEC_SAT_U_SUB_ZIP_WRAP(T1, T2) DEF_VEC_SAT_U_SUB_ZIP(T1, T2)
+
 #define RUN_VEC_SAT_U_SUB_FMT_1(T, out, op_1, op_2, N) \
   vec_sat_u_sub_##T##_fmt_1(out, op_1, op_2, N)
 
@@ -352,6 +365,11 @@ vec_sat_u_sub_##T##_fmt_10 (T *out, T *op_1, T *op_2, 
unsigned limit) \
 #define RUN_VEC_SAT_U_SUB_FMT_10(T, out, op_1, op_2, N) \
   vec_sat_u_sub_##T##_fmt_10(out, op_1, op_2, N)
 
+#define RUN_VEC_SAT_U_SUB_FMT_ZIP(T1, T2, x, b, N) \
+  vec_sat_u_sub_##T1##_##T2##_fmt_zip(x, b, N)
+#define RUN_VEC_SAT_U_SUB_FMT_ZIP_WRAP(T1, T2, x, b, N) \
+  RUN_VEC_SAT_U_SUB_FMT_ZIP(T1, T2, x, b, N) \
+
 
/**/
 /* Saturation Sub Truncated (Unsigned and Signed) 
*/
 
/**/
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h
new file mode 100644
index 0

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [to-be-committed, RISC-V] Eliminate unnecessary sign extension after inlined str[n]cmp

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:e0bf7a4734598402720ed5b86285705c1791e457

commit e0bf7a4734598402720ed5b86285705c1791e457
Author: Jeff Law 
Date:   Thu Jul 11 12:05:56 2024 -0600

[to-be-committed,RISC-V] Eliminate unnecessary sign extension after inlined str[n]cmp

This patch eliminates an unnecessary sign extension for scalar inlined
string comparisons on rv64.

Conceptually this is pretty simple.  Prove all the paths which "return"
a value from the inlined string comparison already have sign extended
values.

FINAL_LABEL is the point after the calculation of the return value.  So
if we have a jump to FINAL_LABEL, we must have a properly extended
result value at that point.

Second we're going to arrange in the .md part of the expander to use an
X mode temporary for the result.  After computing the result we will (if
necessary) extract the low part of the result using a SUBREG tagged with
the appropriate SUBREG_PROMOTED_* bits.

So with that background.

We find a jump to FINAL_LABEL in emit_strcmp_scalar_compare_byte.  Since
we know the result is X mode, we can just emit the subtraction of the
two chars in X mode and we'll have a properly sign extended result.

There are 4 jumps to final_label in emit_strcmp_scalar.

The first is just returning zero and needs trivial simplification to not
force the result into SImode.

The second is after calling strcmp in the library.  The ABI mandates
that value is sign extended, so there's nothing to do for that case.

The 3rd occurs after a call to
emit_strcmp_scalar_result_calculation_nonul.  If we dive into that
routine it needs simplification similar to what we did in
emit_strcmp_scalar_compare_byte.

The 4th occurs after a call to emit_strcmp_scalar_result_calculation
which again needs trivial adjustment like we've done in the other routines.

Finally, at the end of expand_strcmp, just store the X mode result
sitting in SUB to RESULT.

The net of all that is we know every path has its result properly
extended to X mode.  Standard redundant extension removal will take care
of the rest.
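
The byte-compare case can be checked on a host machine.  The sketch below
models an rv64 X-mode subtraction of two zero-extended bytes with int64_t
and confirms that the result always equals the sign extension of its low
32 bits, so a later extension is redundant.  The helper names are
hypothetical, and int64_t merely stands in for an X-mode register.

```c
#include <assert.h>
#include <stdint.h>

/* Model of an rv64 X-mode subtraction of two zero-extended bytes.  */
static int64_t
byte_diff_xmode (unsigned char c1, unsigned char c2)
{
  return (int64_t) c1 - (int64_t) c2;
}

/* Return nonzero if V already equals the sign extension of its low
   32 bits, i.e. a later sign extension would change nothing.  */
static int
is_sign_extended_si (int64_t v)
{
  return v >= INT32_MIN && v <= INT32_MAX;
}

/* Exhaustively confirm the claim for every pair of byte values.  */
static void
check_all_byte_pairs (void)
{
  for (int c1 = 0; c1 < 256; c1++)
    for (int c2 = 0; c2 < 256; c2++)
      assert (is_sign_extended_si (byte_diff_xmode ((unsigned char) c1,
                                                    (unsigned char) c2)));
}
```

The difference always lies in [-255, 255], so the property holds
trivially; the other paths listed above need the same argument made for
their own result computation.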

We've been running this within Ventana for about 6 months, so naturally
it's been through various QA cycles, dhrystone, spec2017, etc.  It's
also been through a build/test cycle in my tester.  Waiting on results
from the pre-commit testing before moving forward.

gcc/
* config/riscv/riscv-string.cc
(emit_strcmp_scalar_compare_byte): Set RESULT directly rather
than using a new temporary.
(emit_strcmp_scalar_result_calculation_nonul): Likewise.
(emit_strcmp_scalar_result_calculation): Likewise.
(riscv_expand_strcmp_scalar): Use CONST0_RTX rather than
generating a new node.
(expand_strcmp): Copy directly from SUB to RESULT.
* config/riscv/riscv.md (cmpstrnsi, cmpstrsi): Pass an X
mode temporary to the expansion routines.  If necessary
extract low part of the word to store in final result location.

(cherry picked from commit 74d8accaf88f83bfcab1150bf9be5140e7ac0e94)

Diff:
---
 gcc/config/riscv/riscv-string.cc | 15 +--
 gcc/config/riscv/riscv.md| 28 
 2 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 257a514d2901..4736228e6f14 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -140,9 +140,7 @@ static void
 emit_strcmp_scalar_compare_byte (rtx result, rtx data1, rtx data2,
 rtx final_label)
 {
-  rtx tmp = gen_reg_rtx (Xmode);
-  do_sub3 (tmp, data1, data2);
-  emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+  do_sub3 (result, data1, data2);
   emit_jump_insn (gen_jump (final_label));
   emit_barrier (); /* No fall-through.  */
 }
@@ -310,8 +308,7 @@ emit_strcmp_scalar_result_calculation_nonul (rtx result, 
rtx data1, rtx data2)
   rtx tmp = gen_reg_rtx (Xmode);
   emit_insn (gen_slt_3 (LTU, Xmode, Xmode, tmp, data1, data2));
   do_neg2 (tmp, tmp);
-  do_ior3 (tmp, tmp, const1_rtx);
-  emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+  do_ior3 (result, tmp, const1_rtx);
 }
 
 /* strcmp-result calculation.
@@ -367,9 +364,7 @@ emit_strcmp_scalar_result_calculation (rtx result, rtx 
data1, rtx data2,
   unsigned int shiftr = (xlen - 1) * BITS_PER_UNIT;
   do_lshr3 (data1, data1, GEN_INT (shiftr));
   do_lshr3 (data2, data2, GEN_INT (shiftr));
-  rtx tmp = gen_reg_rtx (Xmode);
-  do_sub3 (tmp, data1, data2);
-  emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+  do_sub3 (result, data1, data2);
 }
 
 /* Expand str(n)cmp using Zbb/TheadBb instructions.
@@ -444,7 +439,7 @@ riscv_expand_strcmp_scalar (rtx result, rtx src1, 

[gcc r15-1989] [committed] Fix m68k bootstrap segfault with late-combine

2024-07-11 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:a91c51c187a78e4164bc4039ebdb543848e379d2

commit r15-1989-ga91c51c187a78e4164bc4039ebdb543848e379d2
Author: Jeff Law 
Date:   Thu Jul 11 21:37:34 2024 -0600

[committed] Fix m68k bootstrap segfault with late-combine

So the m68k port has failed to bootstrap since the introduction of
late-combine.  My suspicion has been this is a backend problem.  Sure enough,
after bisecting things down (thank goodness for the debug counter!) I'm happy
to report m68k (after this patch) has moved into its stage3 build for the
first time in a month.

Basically late-combine propagated an address calculation to its use points,
generating this insn (dwarf2out.cc, I forget what function):

> (insn 653 652 655 (parallel [
> (set (mem/j:DI (plus:SI (plus:SI (reg/f:SI 9 %a1 [orig:64 _67 ] [64])
> (reg:SI 0 %d0 [321]))
> (const_int 20 [0x14])) [0 slot_204->dw_attr_val.v.val_unsigned+0 S8 A16])
> (sign_extend:DI (mem/c:SI (plus:SI (reg/f:SI 14 %a6)
> (const_int -28 [0xffe4])) [870 %sfp+-28 S4 A16])))
> (clobber (reg:SI 0 %d0))
> ]) "../../../gcc/gcc/dwarf2out.cc":24961:23 93 {extendsidi2}
>  (expr_list:REG_DEAD (reg/f:SI 9 %a1 [orig:64 _67 ] [64])
> (expr_list:REG_DEAD (reg:SI 0 %d0 [321])
> (expr_list:REG_UNUSED (reg:SI 0 %d0)
> (nil)

Note how the output uses d0 in the address calculation and the clobber uses d0.

It matches this insn in the md file:

> (define_insn "extendsidi2"
>   [(set (match_operand:DI 0 "nonimmediate_operand" "=d,o,o,<")
> (sign_extend:DI
>  (match_operand:SI 1 "nonimmediate_src_operand" "rm,rm,r,rm")))
>(clobber (match_scratch:SI 2 "=X,&d,&d,&d"))]
>   ""
> {
>   if (which_alternative == 0)
> /* Handle alternative 0.  */
> {
>   if (TARGET_68020 || TARGET_COLDFIRE)
> return "move%.l %1,%R0\;smi %0\;extb%.l %0";
>   else
> return "move%.l %1,%R0\;smi %0\;ext%.w %0\;ext%.l %0";
> }
>
>   /* Handle alternatives 1, 2 and 3.  We don't need to adjust address by 4
>  in alternative 3 because autodecrement will do that for us.  */
>   operands[3] = adjust_address (operands[0], SImode,
> which_alternative == 3 ? 0 : 4);
>   operands[0] = adjust_address (operands[0], SImode, 0);
>
>   if (TARGET_68020 || TARGET_COLDFIRE)
> return "move%.l %1,%3\;smi %2\;extb%.l %2\;move%.l %2,%0";
>   else
> return "move%.l %1,%3\;smi %2\;ext%.w %2\;ext%.l %2\;move%.l %2,%0";
> }
>   [(set_attr "ok_for_coldfire" "yes,no,yes,yes")])

Note the smi/ext instruction pair in the case for alternatives 1..3.  Those
clobber the scratch register before we're done consuming inputs.  The scratch
register really needs to be marked as an earlyclobber.
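
To make the failure mode concrete, here is a toy model of the
alternative 1..3 sequence in plain C.  Everything in it is an assumption
made for illustration: memory is word-indexed rather than byte-addressed,
there are only two registers, and extendsidi2_model is a hypothetical
name.  It only shows what goes wrong when the scratch operand shares a
register with the address of the destination.

```c
#include <assert.h>
#include <stdint.h>

enum { D0, D1, NREGS };

/* Toy model of the alternative 1..3 sequence: store the sign-extended
   64-bit form of SRC as two 32-bit words at mem[base + regs[index_reg]],
   high word first.  SCRATCH_REG is the register picked for the
   match_scratch operand.  */
static void
extendsidi2_model (int32_t *mem, int32_t regs[NREGS], int base,
                   int index_reg, int scratch_reg, int32_t src)
{
  assert (index_reg < NREGS && scratch_reg < NREGS);
  /* move.l %1,%3: the low word goes to the second slot.  */
  mem[base + regs[index_reg] + 1] = src;
  /* smi/ext: the scratch register becomes 0 or -1.  */
  regs[scratch_reg] = src < 0 ? -1 : 0;
  /* move.l %2,%0: the address is formed again, after the scratch write.  */
  mem[base + regs[index_reg]] = regs[scratch_reg];
}
```

With a distinct scratch register both words land where expected.  If the
scratch register is the same as the index register, the second store
forms its address from the clobbered value and writes somewhere else,
which is the situation the earlyclobber rules out.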

That fixes the bootstrap problem, but a cursory review of m68k.md is not
encouraging.  I will not be surprised at all if there's more of this kind of
problem lurking.

But happy to at least have m68k bootstrapping again.   It's failing the
comparison test, but definitely progress.

* config/m68k/m68k.md (extendsidi2): Add missing early clobbers.

Diff:
---
 gcc/config/m68k/m68k.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index 037978db40c0..e5c252888448 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -1887,7 +1887,7 @@
   [(set (match_operand:DI 0 "nonimmediate_operand" "=d,o,o,<")
(sign_extend:DI
 (match_operand:SI 1 "nonimmediate_src_operand" "rm,rm,r,rm")))
-   (clobber (match_scratch:SI 2 "=X,d,d,d"))]
+   (clobber (match_scratch:SI 2 "=X,&d,&d,&d"))]
   ""
 {
   if (which_alternative == 0)


[gcc r15-2006] [RISC-V] Avoid unnecessary sign extension after memcmp

2024-07-12 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:ae829a27785307232e4db0df6a30ca275941b613

commit r15-2006-gae829a27785307232e4db0df6a30ca275941b613
Author: Jeff Law 
Date:   Fri Jul 12 07:53:41 2024 -0600

[RISC-V] Avoid unnecessary sign extension after memcmp

Similar to the str[n]cmp work, this adjusts the block compare expansion to do
its work in X mode with an appropriate lowpart extraction of the results at
the end of the sequence.

This has gone through my tester on rv32 and rv64, but that's it. Waiting on
pre-commit testing before moving forward.
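
The result calculation touched by this patch synthesizes
(data1 >= data2) ? 1 : -1 without a branch.  A host-side sketch of that
three-step sequence follows; memcmp_word_result is a hypothetical name,
and int64_t stands in for an rv64 X-mode register.

```c
#include <assert.h>
#include <stdint.h>

/* Branchless (data1 >= data2) ? 1 : -1 on word-sized values, mirroring
   a set-less-than-unsigned, a negate and an OR with 1.  */
static int64_t
memcmp_word_result (uint64_t data1, uint64_t data2)
{
  int64_t r = data1 < data2;  /* 1 if data1 < data2, else 0.  */
  r = -r;                     /* -1 or 0.  */
  r |= 1;                     /* -1 stays -1, 0 becomes 1.  */
  assert (r == 1 || r == -1);
  return r;
}
```

Both possible results are already valid sign-extended 32-bit values when
computed in X mode, so the final lowpart extraction needs no extra
extension.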

gcc/

* config/riscv/riscv-string.cc 
(emit_memcmp_scalar_load_and_compare):
Set RESULT directly rather than using a temporary.
(emit_memcmp_scalar_result_calculation): Similarly.
(riscv_expand_block_compare_scalar): Use CONST0_RTX rather than
generating new RTL.
* config/riscv/riscv.md (cmpmemsi): Pass an X mode temporary to the
expansion routines.  If necessary extract low part of the word to
store in final result location.

Diff:
---
 gcc/config/riscv/riscv-string.cc | 15 ++-
 gcc/config/riscv/riscv.md| 14 --
 2 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 4736228e6f14..80d22e87d571 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -663,9 +663,7 @@ emit_memcmp_scalar_load_and_compare (rtx result, rtx src1, 
rtx src2,
   /* Fast-path for a single byte.  */
   if (cmp_bytes == 1)
{
- rtx tmp = gen_reg_rtx (Xmode);
- do_sub3 (tmp, data1, data2);
- emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+ do_sub3 (result, data1, data2);
  emit_jump_insn (gen_jump (final_label));
  emit_barrier (); /* No fall-through.  */
  return;
@@ -702,12 +700,11 @@ emit_memcmp_scalar_result_calculation (rtx result, rtx 
data1, rtx data2)
   /* Get bytes in big-endian order and compare as words.  */
   do_bswap2 (data1, data1);
   do_bswap2 (data2, data2);
+
   /* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence.  */
-  rtx tmp = gen_reg_rtx (Xmode);
-  emit_insn (gen_slt_3 (LTU, Xmode, Xmode, tmp, data1, data2));
-  do_neg2 (tmp, tmp);
-  do_ior3 (tmp, tmp, const1_rtx);
-  emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+  emit_insn (gen_slt_3 (LTU, Xmode, Xmode, result, data1, data2));
+  do_neg2 (result, result);
+  do_ior3 (result, result, const1_rtx);
 }
 
 /* Expand memcmp using scalar instructions (incl. Zbb).
@@ -773,7 +770,7 @@ riscv_expand_block_compare_scalar (rtx result, rtx src1, 
rtx src2, rtx nbytes)
   data1, data2,
   diff_label, final_label);
 
-  emit_insn (gen_rtx_SET (result, gen_rtx_CONST_INT (SImode, 0)));
+  emit_move_insn (result, CONST0_RTX (GET_MODE (result)));
   emit_jump_insn (gen_jump (final_label));
   emit_barrier (); /* No fall-through.  */
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 2e2379dfca4f..5dee837a5878 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2675,9 +2675,19 @@
   operands[2], operands[3]))
 DONE;
 
-  if (riscv_expand_block_compare (operands[0], operands[1], operands[2],
+  rtx temp = gen_reg_rtx (word_mode);
+  if (riscv_expand_block_compare (temp, operands[1], operands[2],
   operands[3]))
-DONE;
+{
+  if (TARGET_64BIT)
+   {
+ temp = gen_lowpart (SImode, temp);
+ SUBREG_PROMOTED_VAR_P (temp) = 1;
+ SUBREG_PROMOTED_SET (temp, SRP_SIGNED);
+   }
+  emit_move_insn (operands[0], temp);
+  DONE;
+}
   else
 FAIL;
 })


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add SiFive extensions, xsfvcp and xsfcease

2024-07-12 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:68a2ba4a346992f1399abb410c5fb788aa9bca25

commit 68a2ba4a346992f1399abb410c5fb788aa9bca25
Author: Kito Cheng 
Date:   Tue Jul 9 15:50:57 2024 +0800

RISC-V: Add SiFive extensions, xsfvcp and xsfcease

We have already upstreamed these extensions into binutils, and now we need
GCC to recognize these extensions and pass them to binutils as well.  We also
plan to upstream intrinsics in the near future. :)

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_implied_info): Add 
xsfvcp.
(riscv_ext_version_table): Add xsfvcp, xsfcease.
(riscv_ext_flag_table): Ditto.
* config/riscv/riscv.opt (riscv_sifive_subext): New.
(XSFVCP): New.
(XSFCEASE): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/predef-sf-1.c: New.
* gcc.target/riscv/predef-sf-2.c: New.

(cherry picked from commit 3ea47ea1fcab95fd1b80acc724fdbb27fc436985)

Diff:
---
 gcc/common/config/riscv/riscv-common.cc  |  8 
 gcc/config/riscv/riscv.opt   |  7 +++
 gcc/testsuite/gcc.target/riscv/predef-sf-1.c | 19 +++
 gcc/testsuite/gcc.target/riscv/predef-sf-2.c | 14 ++
 4 files changed, 48 insertions(+)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 3c4178c19c99..d883efa7a3ab 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -216,6 +216,8 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"ssstateen", "zicsr"},
   {"sstc", "zicsr"},
 
+  {"xsfvcp", "zve32x"},
+
   {NULL, NULL}
 };
 
@@ -415,6 +417,9 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
 
   {"xventanacondops", ISA_SPEC_CLASS_NONE, 1, 0},
 
+  {"xsfvcp",   ISA_SPEC_CLASS_NONE, 1, 0},
+  {"xsfcease", ISA_SPEC_CLASS_NONE, 1, 0},
+
   /* Terminate the list.  */
   {NULL, ISA_SPEC_CLASS_NONE, 0, 0}
 };
@@ -1822,6 +1827,9 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
 
   {"xventanacondops", &gcc_options::x_riscv_xventana_subext, 
MASK_XVENTANACONDOPS},
 
+  {"xsfvcp",   &gcc_options::x_riscv_sifive_subext, MASK_XSFVCP},
+  {"xsfcease", &gcc_options::x_riscv_sifive_subext, MASK_XSFCEASE},
+
   {NULL, NULL, 0}
 };
 
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 32a0dda58439..a1d70b636382 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -507,6 +507,13 @@ int riscv_xventana_subext
 
 Mask(XVENTANACONDOPS) Var(riscv_xventana_subext)
 
+TargetVariable
+int riscv_sifive_subext
+
+Mask(XSFVCP) Var(riscv_sifive_subext)
+
+Mask(XSFCEASE) Var(riscv_sifive_subext)
+
 Enum
 Name(isa_spec_class) Type(enum riscv_isa_spec_class)
 Supported ISA specs (for use with the -misa-spec= option):
diff --git a/gcc/testsuite/gcc.target/riscv/predef-sf-1.c 
b/gcc/testsuite/gcc.target/riscv/predef-sf-1.c
new file mode 100644
index ..d6c07e7d9207
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/predef-sf-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g_xsfvcp -mabi=lp64" } */
+
+int main () {
+#if !defined(__riscv)
+#error "__riscv"
+#endif
+
+#if !defined(__riscv_zve32x)
+#error "__riscv_zve32x"
+#endif
+
+
+#if !defined(__riscv_xsfvcp)
+#error "__riscv_xsfvcp"
+#endif
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/predef-sf-2.c 
b/gcc/testsuite/gcc.target/riscv/predef-sf-2.c
new file mode 100644
index ..dcb746bcd260
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/predef-sf-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g_xsfcease -mabi=lp64" } */
+
+int main () {
+#if !defined(__riscv)
+#error "__riscv"
+#endif
+
+#if !defined(__riscv_xsfcease)
+#error "__riscv_xsfvcp"
+#endif
+
+  return 0;
+}


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Disable misaligned vector access in hook riscv_slow_unaligned_access[PR115862]

2024-07-12 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:d39d88ed512ae6affbd2f5128b1e55d92d8f5c0a

commit d39d88ed512ae6affbd2f5128b1e55d92d8f5c0a
Author: xuli 
Date:   Thu Jul 11 04:29:11 2024 +

RISC-V: Disable misaligned vector access in hook riscv_slow_unaligned_access[PR115862]

The reason is that in the following code, icode = movmisalignv8si has
already been rejected by TARGET_VECTOR_MISALIGN_SUPPORTED, but it is
allowed by targetm.slow_unaligned_access, which is contradictory.

(((icode = optab_handler (movmisalign_optab, mode))
   != CODE_FOR_nothing)
  || targetm.slow_unaligned_access (mode, align))

Misaligned vector access should be enabled by the -mno-vector-strict-align
option.

PR target/115862

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_slow_unaligned_access): Disable 
vector misalign.

Signed-off-by: Li Xu 
(cherry picked from commit 63d7d5998e3768f6e3703c29e8774e8b54af108c)

Diff:
---
 gcc/config/riscv/riscv.cc  |  5 ++-
 gcc/testsuite/gcc.target/riscv/rvv/base/pr115862.c | 52 ++
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index ce73c18f88f8..8a8826deac17 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -10269,9 +10269,10 @@ riscv_cannot_copy_insn_p (rtx_insn *insn)
 /* Implement TARGET_SLOW_UNALIGNED_ACCESS.  */
 
 static bool
-riscv_slow_unaligned_access (machine_mode, unsigned int)
+riscv_slow_unaligned_access (machine_mode mode, unsigned int)
 {
-  return riscv_slow_unaligned_access_p;
+  return VECTOR_MODE_P (mode) ? TARGET_VECTOR_MISALIGN_SUPPORTED
+ : riscv_slow_unaligned_access_p;
 }
 
 static bool
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr115862.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115862.c
new file mode 100644
index ..3cbc3c3a0ea4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115862.c
@@ -0,0 +1,52 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gcv_zvl512b -mabi=lp64d" } */
+
+struct mallinfo2
+{
+  int arena;
+  int ordblks;
+  int smblks;
+  int hblks;
+  int hblkhd;
+  int usmblks;
+  int fsmblks;
+  int uordblks;
+  int fordblks;
+  int keepcost;
+};
+
+struct mallinfo
+{
+  int arena;
+  int ordblks;
+  int smblks;
+  int hblks;
+  int hblkhd;
+  int usmblks;
+  int fsmblks;
+  int uordblks;
+  int fordblks;
+  int keepcost;
+};
+
+struct mallinfo
+__libc_mallinfo (void)
+{
+  struct mallinfo m;
+  struct mallinfo2 m2;
+
+  m.arena = m2.arena;
+  m.ordblks = m2.ordblks;
+  m.smblks = m2.smblks;
+  m.hblks = m2.hblks;
+  m.hblkhd = m2.hblkhd;
+  m.usmblks = m2.usmblks;
+  m.fsmblks = m2.fsmblks;
+  m.uordblks = m2.uordblks;
+  m.fordblks = m2.fordblks;
+  m.keepcost = m2.keepcost;
+
+  return m;
+}
+
+/* { dg-final { scan-assembler {vle32\.v} } } */


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: NO_WARNING preferred else value for RVV

2024-07-12 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:6322f7af2b8d41d5a298af66b5200d0c6ac79191

commit 6322f7af2b8d41d5a298af66b5200d0c6ac79191
Author: YunQiang Su 
Date:   Thu Jul 11 20:43:54 2024 +0800

RISC-V: NO_WARNING preferred else value for RVV

PR target/115840.

In riscv_preferred_else_value, we create an uninitialized tmp var
for the else value, instead of 0 (as default_preferred_else_value does)
or the pre-existing VAR (as aarch64 does), so that we can use the agnostic
policy.

The problem is that `warn_uninit` will emit a warning:
  '({anonymous})' may be used uninitialized

Let's mark this tmp var as NO_WARNING.

This problem was found when I tried to build glibc with the V extension.

gcc

PR target/115840
* config/riscv/riscv.cc (riscv_preferred_else_value): Mark
tmp_var as NO_WARNING.

gcc/testsuite
* gcc.dg/vect/pr115840.c: New testcase.

(cherry picked from commit c6f38e5e6d900b8ed6a4f5c126d3197946cad4dd)

Diff:
---
 gcc/config/riscv/riscv.cc|  6 +-
 gcc/testsuite/gcc.dg/vect/pr115840.c | 11 +++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 8a8826deac17..07539b62 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -11432,7 +11432,11 @@ riscv_preferred_else_value (unsigned ifn, tree vectype, unsigned int nops,
tree *ops)
 {
   if (riscv_v_ext_mode_p (TYPE_MODE (vectype)))
-return get_or_create_ssa_default_def (cfun, create_tmp_var (vectype));
+{
+  tree tmp_var = create_tmp_var (vectype);
+  TREE_NO_WARNING (tmp_var) = 1;
+  return get_or_create_ssa_default_def (cfun, tmp_var);
+}
 
   return default_preferred_else_value (ifn, vectype, nops, ops);
 }
diff --git a/gcc/testsuite/gcc.dg/vect/pr115840.c b/gcc/testsuite/gcc.dg/vect/pr115840.c
new file mode 100644
index ..09dc9e4eb7c2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr115840.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wall -Werror" } */
+
+double loads[16];
+
+void
+foo (double loadavg[], int count)
+{
+  for (int i = 0; i < count; i++)
+loadavg[i] = loads[i] / 1.5;
+}


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [RISC-V] Avoid unnecessary sign extension after memcmp

2024-07-12 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:67116d59607d1897629deb3b95f7ccb9d4d875cb

commit 67116d59607d1897629deb3b95f7ccb9d4d875cb
Author: Jeff Law 
Date:   Fri Jul 12 07:53:41 2024 -0600

[RISC-V] Avoid unnecessary sign extension after memcmp

Similar to the str[n]cmp work, this adjusts the block compare expansion to do
its work in X mode with an appropriate lowpart extraction of the results at the
end of the sequence.
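
As an illustration only (this is not the GCC source), the branchless result
computation that the expansion emits, followed by a single low-part extraction
at the end, can be modeled in plain C.  The helper names are hypothetical and
a 64-bit X mode (rv64) is assumed.

```c
#include <stdint.h>

/* Hypothetical model of the sltu / neg / ori sequence, carried out in a
   full 64-bit register (standing in for X mode on rv64).  */
static int64_t
memcmp_result_xmode (uint64_t data1, uint64_t data2)
{
  int64_t result = data1 < data2;  /* sltu: 1 if data1 < data2, else 0.  */
  result = -result;                /* neg: -1 or 0.  */
  result |= 1;                     /* ori: -1 or 1.  */
  return result;
}

/* Low-part extraction, done once at the end of the sequence.  */
static int32_t
memcmp_result_lowpart (int64_t xmode_result)
{
  /* The value is -1 or 1 here, so narrowing to 32 bits is exact.  */
  return (int32_t) xmode_result;
}
```

Because the full-width value is already a small signed number, taking its low
32 bits needs no additional sign extension, which is the point of the change.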

This has gone through my tester on rv32 and rv64, but that's it. Waiting on
pre-commit testing before moving forward.

gcc/

* config/riscv/riscv-string.cc (emit_memcmp_scalar_load_and_compare):
Set RESULT directly rather than using a temporary.
(emit_memcmp_scalar_result_calculation): Similarly.
(riscv_expand_block_compare_scalar): Use CONST0_RTX rather than
generating new RTL.
* config/riscv/riscv.md (cmpmemsi): Pass an X mode temporary to the
expansion routines.  If necessary extract low part of the word to store
in final result location.

(cherry picked from commit ae829a27785307232e4db0df6a30ca275941b613)

Diff:
---
 gcc/config/riscv/riscv-string.cc | 15 ++-
 gcc/config/riscv/riscv.md| 14 --
 2 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 4736228e6f14..80d22e87d571 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -663,9 +663,7 @@ emit_memcmp_scalar_load_and_compare (rtx result, rtx src1, rtx src2,
   /* Fast-path for a single byte.  */
   if (cmp_bytes == 1)
{
- rtx tmp = gen_reg_rtx (Xmode);
- do_sub3 (tmp, data1, data2);
- emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+ do_sub3 (result, data1, data2);
  emit_jump_insn (gen_jump (final_label));
  emit_barrier (); /* No fall-through.  */
  return;
@@ -702,12 +700,11 @@ emit_memcmp_scalar_result_calculation (rtx result, rtx data1, rtx data2)
   /* Get bytes in big-endian order and compare as words.  */
   do_bswap2 (data1, data1);
   do_bswap2 (data2, data2);
+
   /* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence.  */
-  rtx tmp = gen_reg_rtx (Xmode);
-  emit_insn (gen_slt_3 (LTU, Xmode, Xmode, tmp, data1, data2));
-  do_neg2 (tmp, tmp);
-  do_ior3 (tmp, tmp, const1_rtx);
-  emit_insn (gen_movsi (result, gen_lowpart (SImode, tmp)));
+  emit_insn (gen_slt_3 (LTU, Xmode, Xmode, result, data1, data2));
+  do_neg2 (result, result);
+  do_ior3 (result, result, const1_rtx);
 }
 
 /* Expand memcmp using scalar instructions (incl. Zbb).
@@ -773,7 +770,7 @@ riscv_expand_block_compare_scalar (rtx result, rtx src1, rtx src2, rtx nbytes)
   data1, data2,
   diff_label, final_label);
 
-  emit_insn (gen_rtx_SET (result, gen_rtx_CONST_INT (SImode, 0)));
+  emit_move_insn (result, CONST0_RTX (GET_MODE (result)));
   emit_jump_insn (gen_jump (final_label));
   emit_barrier (); /* No fall-through.  */
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 2e2379dfca4f..5dee837a5878 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2675,9 +2675,19 @@
   operands[2], operands[3]))
 DONE;
 
-  if (riscv_expand_block_compare (operands[0], operands[1], operands[2],
+  rtx temp = gen_reg_rtx (word_mode);
+  if (riscv_expand_block_compare (temp, operands[1], operands[2],
   operands[3]))
-DONE;
+{
+  if (TARGET_64BIT)
+   {
+ temp = gen_lowpart (SImode, temp);
+ SUBREG_PROMOTED_VAR_P (temp) = 1;
+ SUBREG_PROMOTED_SET (temp, SRP_SIGNED);
+   }
+  emit_move_insn (operands[0], temp);
+  DONE;
+}
   else
 FAIL;
 })


[gcc r15-2011] [PR rtl-optimization/115876] Fix one of two ubsan reported issues in new ext-dce.cc code

2024-07-12 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:a6f551d079de1d151b272bcdd3d42316857c9d4e

commit r15-2011-ga6f551d079de1d151b272bcdd3d42316857c9d4e
Author: Jeff Law 
Date:   Fri Jul 12 13:11:33 2024 -0600

[PR rtl-optimization/115876] Fix one of two ubsan reported issues in new ext-dce.cc code

David Binderman did a bootstrap build with ubsan enabled which triggered a few
errors in the new ext-dce.cc code.  This fixes the trivial case of shifting
negative values.

Bootstrapped and regression tested on x86.

Pushing to the trunk.

gcc/
PR rtl-optimization/115876
* ext-dce.cc (carry_backpropagate): Make mask and mmask unsigned.

Diff:
---
 gcc/ext-dce.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index adc9084df57d..91789d283fcd 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -374,13 +374,13 @@ binop_implies_op2_fully_live (rtx_code code)
exclusively pertain to the first operand.  */
 
 HOST_WIDE_INT
-carry_backpropagate (HOST_WIDE_INT mask, enum rtx_code code, rtx x)
+carry_backpropagate (unsigned HOST_WIDE_INT mask, enum rtx_code code, rtx x)
 {
   if (mask == 0)
 return 0;
 
   enum machine_mode mode = GET_MODE_INNER (GET_MODE (x));
-  HOST_WIDE_INT mmask = GET_MODE_MASK (mode);
+  unsigned HOST_WIDE_INT mmask = GET_MODE_MASK (mode);
   switch (code)
 {
 case PLUS:


[gcc r15-2048] Fix sign/carry bit handling in ext-dce.

2024-07-15 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:94b21f13763638f64e83e7f9959c7f1523b9eaed

commit r15-2048-g94b21f13763638f64e83e7f9959c7f1523b9eaed
Author: Jeff Law 
Date:   Mon Jul 15 16:57:44 2024 -0600

Fix sign/carry bit handling in ext-dce.

My change to fix a ubsan issue broke handling propagation of the carry/sign bit
down through a right shift.  Thanks to Andreas for the analysis and proposed
fix and Sergei for the testcase.
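
To illustrate why the signedness of the mask matters, here is a small
standalone sketch (not the ext-dce.cc code).  The commit itself casts the mask
to a signed HOST_WIDE_INT before shifting; the sketch instead spells out the
arithmetic shift portably, and both function names are hypothetical.

```c
#include <stdint.h>

/* Logical right shift: vacated high bits are filled with zeros, so a set
   top bit in MASK is not carried down.  N must be in the range 0..63.  */
static uint64_t
shift_mask_logical (uint64_t mask, unsigned n)
{
  return mask >> n;
}

/* Arithmetic right shift written without relying on signed shifts: the
   vacated high bits are filled with copies of the original top bit.
   N must be in the range 0..63.  */
static uint64_t
shift_mask_arithmetic (uint64_t mask, unsigned n)
{
  uint64_t fill = (mask >> 63) ? ~(UINT64_MAX >> n) : 0;
  return (mask >> n) | fill;
}
```

For a mask with only bit 63 set and a shift count of 4, the logical variant
yields 0x0800000000000000 while the arithmetic variant yields
0xf800000000000000, keeping the sign bit set.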

PR rtl-optimization/115876
PR rtl-optimization/115916
gcc/
* ext-dce.cc (carry_backpropagate): Make return type unsigned as well.
Cast to signed for right shift to preserve sign bit.

gcc/testsuite/

* g++.dg/torture/pr115916.C: New test.

Co-author: Andreas Schwab 
Co-author: Sergei Trofimovich 

Diff:
---
 gcc/ext-dce.cc  |  4 +-
 gcc/testsuite/g++.dg/torture/pr115916.C | 90 +
 2 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 91789d283fcd..2869a389c3aa 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -373,7 +373,7 @@ binop_implies_op2_fully_live (rtx_code code)
binop_implies_op2_fully_live (e.g. shifts), the computed mask may
exclusively pertain to the first operand.  */
 
-HOST_WIDE_INT
+unsigned HOST_WIDE_INT
 carry_backpropagate (unsigned HOST_WIDE_INT mask, enum rtx_code code, rtx x)
 {
   if (mask == 0)
@@ -393,7 +393,7 @@ carry_backpropagate (unsigned HOST_WIDE_INT mask, enum rtx_code code, rtx x)
 case ASHIFT:
   if (CONSTANT_P (XEXP (x, 1))
  && known_lt (UINTVAL (XEXP (x, 1)), GET_MODE_BITSIZE (mode)))
-   return mask >> INTVAL (XEXP (x, 1));
+   return (HOST_WIDE_INT)mask >> INTVAL (XEXP (x, 1));
   return (2ULL << floor_log2 (mask)) - 1;
 
 /* We propagate for the shifted operand, but not the shift
diff --git a/gcc/testsuite/g++.dg/torture/pr115916.C b/gcc/testsuite/g++.dg/torture/pr115916.C
new file mode 100644
index ..3d788678eaa3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr115916.C
@@ -0,0 +1,90 @@
+/* { dg-do run } */
+
+#include 
+#include 
+
+struct ve {
+ve() = default;
+ve(const ve&) = default;
+ve& operator=(const ve&) = default;
+
+// note that the code usually uses the first half of this array
+uint8_t raw[16] = {};
+};
+
+static ve First8_(void) {
+ve m;
+__builtin_memset(m.raw, 0xff, 8);
+return m;
+}
+
+static ve And_(ve a, ve b) {
+ve au;
+__builtin_memcpy(au.raw, a.raw, 16);
+for (size_t i = 0; i < 8; ++i) {
+au.raw[i] &= b.raw[i];
+}
+return au;
+}
+
+__attribute__((noipa, optimize(0)))
+static void vec_assert(ve a) {
+if (a.raw[6] != 0x06 && a.raw[6] != 0x07)
+__builtin_trap();
+}
+
+static ve Reverse4_(ve v) {
+ve ret;
+for (size_t i = 0; i < 8; i += 4) {
+ret.raw[i + 0] = v.raw[i + 3];
+ret.raw[i + 1] = v.raw[i + 2];
+ret.raw[i + 2] = v.raw[i + 1];
+ret.raw[i + 3] = v.raw[i + 0];
+}
+return ret;
+}
+
+static ve DupEven_(ve v) {
+for (size_t i = 0; i < 8; i += 2) {
+v.raw[i + 1] = v.raw[i];
+}
+return v;
+}
+
+template 
+ve Per4LaneBlockShuffle_(ve v) {
+if (b) {
+return Reverse4_(v);
+} else {
+return DupEven_(v);
+}
+}
+
+template 
+static inline __attribute__((always_inline)) void DoTestPer4LaneBlkShuffle(const ve v) {
+ve actual = Per4LaneBlockShuffle_(v);
+const auto valid_lanes_mask = First8_();
+ve actual_masked = And_(valid_lanes_mask, actual);
+vec_assert(actual_masked);
+}
+
+static void DoTestPer4LaneBlkShuffles(const ve v) {
+alignas(128) uint8_t src_lanes[8];
+__builtin_memcpy(src_lanes, v.raw, 8);
+// need both, hm
+DoTestPer4LaneBlkShuffle(v);
+DoTestPer4LaneBlkShuffle(v);
+}
+
+__attribute__((noipa, optimize(0)))
+static void bug(void) {
+   uint8_t iv[8] = {1,2,3,4,5,6,7,8};
+   ve v;
+   __builtin_memcpy(v.raw, iv, 8);
+   DoTestPer4LaneBlkShuffles(v);
+}
+
+int main(void) {
+bug();
+}
+


[gcc r15-2049] Fix liveness computation for shift/rotate counts in ext-dce

2024-07-15 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:b31b8af807f5459674b0b310cb62a5bc81b676e7

commit r15-2049-gb31b8af807f5459674b0b310cb62a5bc81b676e7
Author: Jeff Law 
Date:   Mon Jul 15 18:15:33 2024 -0600

Fix liveness computation for shift/rotate counts in ext-dce

So as I've noted before I believe the control flow in ext-dce.cc is horribly
messy.  While investigating a fix for 115877 I came across another problem
related to control flow handling.

Specifically, if we have a binary op which implies the 2nd operand is fully
live, then we'd actually fail to mark that operand as live.

We essentially broke out of the loop which was supposed to be safe.  But Y was
a REG and if Y is a REG or CONST_INT we skip sub-rtxs and thus failed to
process that operand (the shift count) at all.

Rather than muck around with control flow, we can just set all the bits as live
in DST_MASK and let normal processing continue.  With all the bits live in
DST_MASK all the bits implied by the mode of the argument will also be live.

No testcase.

Bootstrapped and regression tested on x86.  Pushing to the trunk.

gcc/
* ext-dce.cc (ext_dce_process_uses): Simplify control flow and fix
liveness computation for shift/rotate counts.

Diff:
---
 gcc/ext-dce.cc | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 2869a389c3aa..6c961feee635 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -632,10 +632,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap live_tmp)
  else if (!CONSTANT_P (y))
break;
 
- /* We might have (ashift (const_int 1) (reg...)) */
- /* XXX share this logic with code below.  */
+ /* We might have (ashift (const_int 1) (reg...))
+By setting dst_mask we can continue iterating on the
+the next operand and it will be considered fully live.  */
  if (binop_implies_op2_fully_live (GET_CODE (src)))
-   break;
+   dst_mask = -1;
 
  /* If this was anything but a binary operand, break the inner
 loop.  This is conservatively correct as it will cause the


[gcc r15-2185] [PR rtl-optimization/115877] Fix livein computation for ext-dce

2024-07-21 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:91e468b72dafc9dcd5dcf7915f1d0ef172264d53

commit r15-2185-g91e468b72dafc9dcd5dcf7915f1d0ef172264d53
Author: Jeff Law 
Date:   Sun Jul 21 07:36:37 2024 -0600

[PR rtl-optimization/115877] Fix livein computation for ext-dce

So I'm not yet sure how I'm going to break everything down, but this is easy
enough to break out as 1/N of ext-dce fixes/improvements.

When handling uses in an insn, we first determine what bits are set in the
destination which is represented in DST_MASK.  Then we use that to refine what
bits are live in the source operands.

In the source operand handling section we *modify* DST_MASK if the source
operand is a SUBREG (ugh!).  So if the first operand is a SUBREG, then we can
incorrectly compute which bit groups are live in the second operand, especially
if it is a SUBREG as well.
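
The failure mode can be reproduced with a toy model outside of GCC.  The
sketch below assumes, purely for illustration, that handling a SUBREG operand
shifts the working mask left by the SUBREG's bit offset; the function names
and the array-based interface are hypothetical and do not exist in ext-dce.cc.

```c
#include <stdint.h>

/* Toy model of the old behavior: the adjustment made for operand 0 is
   still in DST_MASK when operand 1 is processed.  */
static void
live_bits_without_restore (uint64_t dst_mask, const unsigned *bit_offset,
                           int nops, uint64_t *live)
{
  for (int i = 0; i < nops; i++)
    {
      dst_mask <<= bit_offset[i];   /* Per-operand SUBREG adjustment.  */
      live[i] = dst_mask;
    }
}

/* Toy model of the fix: save the mask once and restore it at the top of
   every iteration, so each operand starts from the same DST_MASK.  */
static void
live_bits_with_restore (uint64_t dst_mask, const unsigned *bit_offset,
                        int nops, uint64_t *live)
{
  const uint64_t save_mask = dst_mask;
  for (int i = 0; i < nops; i++)
    {
      dst_mask = save_mask;         /* Undo the previous operand's change.  */
      dst_mask <<= bit_offset[i];
      live[i] = dst_mask;
    }
}
```

With a starting mask of 0xff and bit offsets 8 and 16, the version without the
restore reports 0xff000000 for the second operand, whereas the version with
the restore reports 0xff0000.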

This was seen when testing a larger set of patches on the rl78 port
(builtin-arith-overflow-p-7 & pr71631 execution failures), so no new test for
this bugfix.

Run through my tester (in conjunction with other ext-dce changes) on the
various cross targets.  Run individually through a bootstrap and regression
test cycle on x86_64 as well.

Pushing to the trunk.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (ext_dce_process_uses): Restore the value of DST_MASK
for each operand.

Diff:
---
 gcc/ext-dce.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 6d4b8858ec63..b4450e42ed16 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -591,8 +591,10 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap live_tmp)
 making things live.  Breaking from this loop will cause
 the iterator to work on sub-rtxs, so it is safe to break
 if we see something we don't know how to handle.  */
+ unsigned HOST_WIDE_INT save_mask = dst_mask;
  for (;;)
{
+ dst_mask = save_mask;
  /* Strip an outer paradoxical subreg.  The bits outside
 the inner mode are don't cares.  So we can just strip
 and process the inner object.  */


[gcc r15-2186] [PR rtl-optimization/115877][2/n] Improve liveness computation for constant initialization

2024-07-21 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:9d8ef2711dfecd093077aef6123d9e93ea23454e

commit r15-2186-g9d8ef2711dfecd093077aef6123d9e93ea23454e
Author: Jeff Law 
Date:   Sun Jul 21 08:41:28 2024 -0600

[PR rtl-optimization/115877][2/n] Improve liveness computation for constant initialization

While debugging pr115877, I noticed we were failing to remove the destination
register from the LIVENOW bitmap when it was set to a constant value, i.e. (set
(dest) (const_int)).  This was a trivial oversight in
safe_for_live_propagation.

I don't have an example of this affecting code generation, but it certainly
could.  More importantly, by making LIVENOW more accurate it's easier to debug
when LIVENOW differs from expectations.

As with the prior patch this has been tested as part of a larger patchset with
the crosses as well as individually on x86_64.

Pushing to the trunk.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (safe_for_live_propagation): Handle RTX_CONST_OBJ.

Diff:
---
 gcc/ext-dce.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index b4450e42ed16..cbecfc53dba7 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -69,6 +69,7 @@ safe_for_live_propagation (rtx_code code)
   switch (GET_RTX_CLASS (code))
 {
   case RTX_OBJ:
+  case RTX_CONST_OBJ:
return true;
 
   case RTX_COMPARE:


[gcc r15-2196] [NFC][PR rtl-optimization/115877] Avoid setting irrelevant bit groups as live in ext-dce

2024-07-22 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:88d16194d0c8a6bdc2896c8944bfbf3e6038c9d2

commit r15-2196-g88d16194d0c8a6bdc2896c8944bfbf3e6038c9d2
Author: Jeff Law 
Date:   Mon Jul 22 08:45:10 2024 -0600

[NFC][PR rtl-optimization/115877] Avoid setting irrelevant bit groups as live in ext-dce

Another patch to refine liveness computations.  This should be NFC and is
designed to help debugging.

In simplest terms the patch avoids setting bit groups outside the size of a
pseudo as live.  Consider a HImode pseudo: bits 16..63 for such a pseudo don't
really have meaning, yet we often set bit groups related to bits 16..63 in
the liveness bitmaps.

This makes debugging harder than it needs to be by simply having larger bitmaps
to verify when walking through the code in a debugger.
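
As a rough standalone model (not the GCC function itself), the mapping from a
mode's size in bytes to the number of tracked bit groups can be written as
below.  It assumes the four groups the pass tracks (bits 0..7, 8..15, 16..31
and 32..BITS_PER_WORD-1) and uses a hypothetical helper name that takes a
plain byte count instead of an rtx.

```c
/* Return how many bit groups are meaningful for an object of SIZE_BYTES
   bytes.  Sizes that are not a power of two conservatively get all four
   groups, as do sizes larger than eight bytes.  */
static int
group_limit_for_size (int size_bytes)
{
  int log2 = -1;

  /* Compute log2 only for exact powers of two.  */
  if (size_bytes > 0 && (size_bytes & (size_bytes - 1)) == 0)
    for (int s = size_bytes; s > 0; s >>= 1)
      log2++;

  if (log2 < 0)
    return 4;

  log2++;
  return log2 > 4 ? 4 : log2;
}
```

Under this model a 2-byte (HImode-sized) object gets two groups, covering bits
0..7 and 8..15, so the groups for bits 16..63 are never marked live for it.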

This has been bootstrapped and regression tested on x86_64.  It's also been
tested on the crosses in my tester without regressions.

Pushing to the trunk.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (group_limit): New function.
(make_reg_live): Likewise.
(ext_dce_process_sets): Use new functions.
(ext_dce_process_uses): Likewise.
(ext_dce_init): Likewise.

Diff:
---
 gcc/ext-dce.cc | 64 +++---
 1 file changed, 57 insertions(+), 7 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index cbecfc53dba7..44f64e2d18cf 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -48,6 +48,57 @@ static bool modify;
bit 16..31
bit 32..BITS_PER_WORD-1  */
 
+/* For the given REG, return the number of bit groups implied by the
+   size of the REG's mode, up to a maximum of 4 (number of bit groups
+   tracked by this pass).
+
+   For partial integer and variable sized modes also return 4.  This
+   could possibly be refined for something like PSI mode, but it
+   does not seem worth the effort.  */
+
+static int
+group_limit (const_rtx reg)
+{
+  machine_mode mode = GET_MODE (reg);
+
+  if (!GET_MODE_BITSIZE (mode).is_constant ())
+return 4;
+
+  int size = GET_MODE_SIZE (mode).to_constant ();
+
+  size = exact_log2 (size);
+
+  if (size < 0)
+return 4;
+
+  size++;
+  return (size > 4 ? 4 : size);
+}
+
+/* Make all bit groups live for REGNO in bitmap BMAP.  For hard regs,
+   we assume all groups are live.  For a pseudo we consider the size
+   of the pseudo to avoid creating unnecessarily live chunks of data.  */
+
+static void
+make_reg_live (bitmap bmap, int regno)
+{
+  int limit;
+
+  /* For pseudos we can use the mode to limit how many bit groups
+ are marked as live since a pseudo only has one mode.  Hard
+ registers have to be handled more conservatively.  */
+  if (regno > FIRST_PSEUDO_REGISTER)
+{
+  rtx reg = regno_reg_rtx[regno];
+  limit = group_limit (reg);
+}
+  else
+limit = 4;
+
+  for (int i = 0; i < limit; i++)
+bitmap_set_bit (bmap, regno * 4 + i);
+}
+
 /* Note this pass could be used to narrow memory loads too.  It's
not clear if that's profitable or not in general.  */
 
@@ -196,7 +247,8 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
 
  /* Transfer all the LIVENOW bits for X into LIVE_TMP.  */
  HOST_WIDE_INT rn = REGNO (SUBREG_REG (x));
- for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + 4; i++)
+ int limit = group_limit (SUBREG_REG (x));
+ for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + limit; i++)
if (bitmap_bit_p (livenow, i))
  bitmap_set_bit (live_tmp, i);
 
@@ -260,7 +312,8 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
  /* Transfer the appropriate bits from LIVENOW into
 LIVE_TMP.  */
  HOST_WIDE_INT rn = REGNO (x);
- for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + 4; i++)
+ int limit = group_limit (x);
+ for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + limit; i++)
if (bitmap_bit_p (livenow, i))
  bitmap_set_bit (live_tmp, i);
 
@@ -692,7 +745,7 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap live_tmp)
   /* If we have a register reference that is not otherwise handled,
 just assume all the chunks are live.  */
   else if (REG_P (x))
-   bitmap_set_range (livenow, REGNO (x) * 4, 4);
+   bitmap_set_range (livenow, REGNO (x) * 4, group_limit (x));
 }
 }
 
@@ -819,10 +872,7 @@ ext_dce_init (void)
   unsigned i;
   bitmap_iterator bi;
   EXECUTE_IF_SET_IN_BITMAP (refs, 0, i, bi)
-{
-  for (int j = 0; j < 4; j++)
-   bitmap_set_bit (&livein[EXIT_BLOCK], i * 4 + j);
-}
+make_reg_live (&livein[EXIT_BLOCK], i);
 
   livenow = BITMAP_ALLOC (NULL);
   all_blocks = BITMAP_ALLOC (NULL);


[gcc r15-2203] [4/n][PR rtl-optimization/115877] Correct SUBREG handling in a destination

2024-07-22 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:ab7c0aed52054976d0b5e12c52e82239d4277b98

commit r15-2203-gab7c0aed52054976d0b5e12c52e82239d4277b98
Author: Jeff Law 
Date:   Mon Jul 22 10:11:57 2024 -0600

[4/n][PR rtl-optimization/115877] Correct SUBREG handling in a destination

If we encounter something during SET handling that we can not handle, the safe
thing to do is to ignore the destination and continue the loop.

We've actually been trying to do slightly better with SUBREG destinations by
iterating into SUBREG_REG.  It turns out that wasn't working as expected.

The problem is once we "continue" we lose the state that we were inside the SET
and thus we ended up ignoring the destination completely rather than tracking
the SUBREG_REG object.  This could be fixed by restarting SET processing, but I
just don't see this as all that important to handle.  So rather than leave the
code as-is, not working per design, I'm twiddling it to use the common 'skip
subrtxs and continue' idiom used elsewhere.

This is a prerequisite for another patch in this series.  Specifically I have a
patch that explicitly tracks if we skipped a destination rather than trying to
imply it from the state of LIVE_TMP.  So this is probably NFC right now, but
that's a short-lived NFC.

Bootstrapped and regression tested on x86 and also run as part of a larger kit
on the crosses in my tester.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (ext_dce_process_sets): More correctly handle SUBREG
destinations.

Diff:
---
 gcc/ext-dce.cc | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 44f64e2d18cf..21feabd9ce31 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -270,11 +270,18 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
= GET_MODE_MASK (GET_MODE_INNER (GET_MODE (x)));
  if (SUBREG_P (x))
{
- /* If we have a SUBREG that is too wide, just continue the loop
-and let the iterator go down into SUBREG_REG.  */
+ /* If we have a SUBREG destination that is too wide, just
+skip the destination rather than continuing this iterator.
+While continuing would be better, we'd need to strip the
+subreg and restart within the SET processing rather than
+the top of the loop which just complicates the flow even
+more.  */
  if (!is_a <scalar_int_mode> (GET_MODE (SUBREG_REG (x)), &outer_mode)
  || GET_MODE_BITSIZE (outer_mode) > 64)
-   continue;
+   {
+ iter.skip_subrtxes ();
+ continue;
+   }
 
  /* We can safely strip a paradoxical subreg.  The inner mode will
 be narrower than the outer mode.  We'll clear fewer bits in


[gcc r15-2212] [5/n][PR rtl-optimization/115877] Fix handling of input/output operands

2024-07-22 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:ad642d2c950657539777ea436b787e7fff4ec09e

commit r15-2212-gad642d2c950657539777ea436b787e7fff4ec09e
Author: Jeff Law 
Date:   Mon Jul 22 21:48:28 2024 -0600

[5/n][PR rtl-optimization/115877] Fix handling of input/output operands

So in this patch we're correcting a failure to mark objects live in scenarios
like

(set (dest) (plus (dest) (src)))

When handling set pseudos, we transfer the liveness information from LIVENOW
into LIVE_TMP.  LIVE_TMP is subsequently used to narrow what bit groups are
live for the inputs.

The first time we process the block we may not have DEST in the LIVENOW set (it
may be live across the loop, but not live after the loop).  Thus we can totally
miss making certain objects live, resulting in incorrect code.

The fix is pretty simple.  If LIVE_TMP is empty, then we should go ahead and
mark all the bit groups for the set object in LIVE_TMP.  This also removes an
invalid gcc_assert on the state of the liveness bitmaps.
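
A simplified sketch of that fallback, using one small bitmask per register
instead of GCC's bitmap type, might look as follows.  The function name and
the per-register view are assumptions made for illustration; the real code
tests the whole LIVE_TMP bitmap with bitmap_empty_p and then calls
make_reg_live.

```c
/* LIVENOW_GROUPS holds one bit per bit group of the destination register
   (bit 0 = bits 0..7, bit 1 = bits 8..15, and so on).  LIMIT is the number
   of groups that are meaningful for the register, from 1 to 4.  Returns the
   groups to record in LIVE_TMP.  */
static unsigned
transfer_live_groups (unsigned livenow_groups, int limit)
{
  unsigned all_groups = (1u << limit) - 1;
  unsigned live_tmp = livenow_groups & all_groups;

  /* Nothing was live-out, which can happen for an in-out operand the
     first time the block is processed.  Treat every group as live
     rather than narrowing the inputs to nothing.  */
  if (live_tmp == 0)
    live_tmp = all_groups;

  return live_tmp;
}
```

The conservative choice costs some optimization on the first pass over a block
but avoids dropping liveness for operands that are both read and written.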

This showed up on pru, rl78 and/or msp430 in the testsuite.  So no new test.

Bootstrapped and regression tested on x86_64 and also run through my tester on
all the cross platforms.

Pushing to the trunk.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (ext_dce_process_sets): Reasonably handle input/output
operands.
(ext_dce_rd_transfer_n): Drop bogus assertion.

Diff:
---
 gcc/ext-dce.cc | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 21feabd9ce31..c56dfb505b88 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -245,13 +245,25 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
  continue;
}
 
- /* Transfer all the LIVENOW bits for X into LIVE_TMP.  */
+ /* LIVE_TMP contains the set groups that are live-out and set in
+this insn.  It is used to narrow the groups live-in for the
+inputs of this insn.
+
+The simple thing to do is mark all the groups as live, but
+that will significantly inhibit optimization.
+
+We also need to be careful in the case where we have an in-out
+operand.  If we're not careful we'd clear LIVE_TMP
+incorrectly.  */
  HOST_WIDE_INT rn = REGNO (SUBREG_REG (x));
  int limit = group_limit (SUBREG_REG (x));
  for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + limit; i++)
if (bitmap_bit_p (livenow, i))
  bitmap_set_bit (live_tmp, i);
 
+ if (bitmap_empty_p (live_tmp))
+   make_reg_live (live_tmp, rn);
+
  /* The mode of the SUBREG tells us how many bits we can
 clear.  */
  machine_mode mode = GET_MODE (x);
@@ -316,14 +328,25 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
  /* Now handle the actual object that was changed.  */
  if (REG_P (x))
{
- /* Transfer the appropriate bits from LIVENOW into
-LIVE_TMP.  */
+ /* LIVE_TMP contains the set groups that are live-out and set in
+this insn.  It is used to narrow the groups live-in for the
+inputs of this insn.
+
+The simple thing to do is mark all the groups as live, but
+that will significantly inhibit optimization.
+
+We also need to be careful in the case where we have an in-out
+operand.  If we're not careful we'd clear LIVE_TMP
+incorrectly.  */
  HOST_WIDE_INT rn = REGNO (x);
  int limit = group_limit (x);
  for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + limit; i++)
if (bitmap_bit_p (livenow, i))
  bitmap_set_bit (live_tmp, i);
 
+ if (bitmap_empty_p (live_tmp))
+   make_reg_live (live_tmp, rn);
+
  /* Now clear the bits known written by this instruction.
 Note that BIT need not be a power of two, consider a
 ZERO_EXTRACT destination.  */
@@ -935,8 +958,6 @@ ext_dce_rd_transfer_n (int bb_index)
  the generic dataflow code that something changed.  */
   if (!bitmap_equal_p (&livein[bb_index], livenow))
 {
-  gcc_assert (!bitmap_intersect_compl_p (&livein[bb_index], livenow));
-
   bitmap_copy (&livein[bb_index], livenow);
   return true;
 }


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [PR rtl-optimization/115876] Fix one of two ubsan reported issues in new ext-dce.cc code

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:ead3454f089bc864e448b1bf6ace6b445eca3152

commit ead3454f089bc864e448b1bf6ace6b445eca3152
Author: Jeff Law 
Date:   Fri Jul 12 13:11:33 2024 -0600

[PR rtl-optimization/115876] Fix one of two ubsan reported issues in new ext-dce.cc code

David Binderman did a bootstrap build with ubsan enabled which triggered a few
errors in the new ext-dce.cc code.  This fixes the trivial case of shifting
negative values.

Bootstrapped and regression tested on x86.

Pushing to the trunk.

gcc/
PR rtl-optimization/115876
* ext-dce.cc (carry_backpropagate): Make mask and mmask unsigned.

(cherry picked from commit a6f551d079de1d151b272bcdd3d42316857c9d4e)

Diff:
---
 gcc/ext-dce.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index adc9084df57d..91789d283fcd 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -374,13 +374,13 @@ binop_implies_op2_fully_live (rtx_code code)
exclusively pertain to the first operand.  */
 
 HOST_WIDE_INT
-carry_backpropagate (HOST_WIDE_INT mask, enum rtx_code code, rtx x)
+carry_backpropagate (unsigned HOST_WIDE_INT mask, enum rtx_code code, rtx x)
 {
   if (mask == 0)
 return 0;
 
   enum machine_mode mode = GET_MODE_INNER (GET_MODE (x));
-  HOST_WIDE_INT mmask = GET_MODE_MASK (mode);
+  unsigned HOST_WIDE_INT mmask = GET_MODE_MASK (mode);
   switch (code)
 {
 case PLUS:


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add vector type of BFloat16 format

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:a5ca595e63500d080294f81eada2b25e320dd572

commit a5ca595e63500d080294f81eada2b25e320dd572
Author: Feng Wang 
Date:   Thu Jun 13 00:32:14 2024 +

RISC-V: Add vector type of BFloat16 format

v3: Rebase
v2: Rebase
The vector type of BFloat16 format is added in this patch;
subsequent extensions to zvfbfmin and zvfbfwma need to be based
on this patch.

Signed-off-by: Feng Wang 
gcc/ChangeLog:

* config/riscv/genrvv-type-indexer.cc (bfloat16_type):
Generate bf16 vector_type and scalar_type in DEF_RVV_TYPE_INDEX.
(bfloat16_wide_type): Ditto.
(same_ratio_eew_bf16_type): Ditto.
(main): Ditto.
* config/riscv/riscv-modes.def (ADJUST_BYTESIZE):
Add vector type for BFloat16.
(RVV_WHOLE_MODES): Add vector type for BFloat16.
(RVV_FRACT_MODE): Ditto.
(RVV_NF4_MODES): Ditto.
(RVV_NF8_MODES): Ditto.
(RVV_NF2_MODES): Ditto.
* config/riscv/riscv-vector-builtins-types.def (vbfloat16mf4_t):
Add builtin vector type for BFloat16.
(vbfloat16mf2_t): Add builtin vector type for BFloat16.
(vbfloat16m1_t): Ditto.
(vbfloat16m2_t): Ditto.
(vbfloat16m4_t): Ditto.
(vbfloat16m8_t): Ditto.
(vbfloat16mf4x2_t): Ditto.
(vbfloat16mf4x3_t): Ditto.
(vbfloat16mf4x4_t): Ditto.
(vbfloat16mf4x5_t): Ditto.
(vbfloat16mf4x6_t): Ditto.
(vbfloat16mf4x7_t): Ditto.
(vbfloat16mf4x8_t): Ditto.
(vbfloat16mf2x2_t): Ditto.
(vbfloat16mf2x3_t): Ditto.
(vbfloat16mf2x4_t): Ditto.
(vbfloat16mf2x5_t): Ditto.
(vbfloat16mf2x6_t): Ditto.
(vbfloat16mf2x7_t): Ditto.
(vbfloat16mf2x8_t): Ditto.
(vbfloat16m1x2_t): Ditto.
(vbfloat16m1x3_t): Ditto.
(vbfloat16m1x4_t): Ditto.
(vbfloat16m1x5_t): Ditto.
(vbfloat16m1x6_t): Ditto.
(vbfloat16m1x7_t): Ditto.
(vbfloat16m1x8_t): Ditto.
(vbfloat16m2x2_t): Ditto.
(vbfloat16m2x3_t): Ditto.
(vbfloat16m2x4_t): Ditto.
(vbfloat16m4x2_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (check_required_extensions):
Add required_ext checking for BFloat16.
* config/riscv/riscv-vector-builtins.def (vbfloat16mf4_t):
Add vector_type for BFloat16 in builtins.def.
(vbfloat16mf4x2_t): Ditto.
(vbfloat16mf4x3_t): Ditto.
(vbfloat16mf4x4_t): Ditto.
(vbfloat16mf4x5_t): Ditto.
(vbfloat16mf4x6_t): Ditto.
(vbfloat16mf4x7_t): Ditto.
(vbfloat16mf4x8_t): Ditto.
(vbfloat16mf2_t): Ditto.
(vbfloat16mf2x2_t): Ditto.
(vbfloat16mf2x3_t): Ditto.
(vbfloat16mf2x4_t): Ditto.
(vbfloat16mf2x5_t): Ditto.
(vbfloat16mf2x6_t): Ditto.
(vbfloat16mf2x7_t): Ditto.
(vbfloat16mf2x8_t): Ditto.
(vbfloat16m1_t): Ditto.
(vbfloat16m1x2_t): Ditto.
(vbfloat16m1x3_t): Ditto.
(vbfloat16m1x4_t): Ditto.
(vbfloat16m1x5_t): Ditto.
(vbfloat16m1x6_t): Ditto.
(vbfloat16m1x7_t): Ditto.
(vbfloat16m1x8_t): Ditto.
(vbfloat16m2_t): Ditto.
(vbfloat16m2x2_t): Ditto.
(vbfloat16m2x3_t): Ditto.
(vbfloat16m2x4_t): Ditto.
(vbfloat16m4_t): Ditto.
(vbfloat16m4x2_t): Ditto.
(vbfloat16m8_t): Ditto.
(double_trunc_bfloat_scalar): Add scalar_type def for BFloat16.
(double_trunc_bfloat_vector): Add vector_type def for BFloat16.
* config/riscv/riscv-vector-builtins.h (RVV_REQUIRE_ELEN_BF_16):
Add required definition of BFloat16 ext.
* config/riscv/riscv-vector-switch.def (ENTRY):
Add vector_type information for BFloat16.
(TUPLE_ENTRY): Add tuple vector_type information for BFloat16.

(cherry picked from commit 666f167bec09d1234e6496c86b566fe1a71f61f0)

Diff:
---
 gcc/config/riscv/genrvv-type-indexer.cc  | 115 +++
 gcc/config/riscv/riscv-modes.def |  30 +-
 gcc/config/riscv/riscv-vector-builtins-types.def |  50 ++
 gcc/config/riscv/riscv-vector-builtins.cc|   7 +-
 gcc/config/riscv/riscv-vector-builtins.def   |  55 ++-
 gcc/config/riscv/riscv-vector-builtins.h |   1 +
 gcc/config/riscv/riscv-vector-switch.def |  36 +++
 7 files changed, 291 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/genrvv-type-indexer.cc 
b/gcc/config/riscv/genrvv-type-indexer.cc
index 27cbd14982c1..8626ddeaaa8b 100644
--- a/gcc/config/riscv/genrvv-type-indexer.cc
+++ b/

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add Zvfbfmin and Zvfbfwma intrinsic

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:958e43c1baa3d40fcbb206bb8469c7782e044e7a

commit 958e43c1baa3d40fcbb206bb8469c7782e044e7a
Author: Feng Wang 
Date:   Mon Jun 17 01:59:57 2024 +

RISC-V: Add Zvfbfmin and Zvfbfwma intrinsic

v3: Modify warning message in riscv.cc
v2: Rebase
According to the intrinsic doc, the 'Zvfbfmin' and 'Zvfbfwma' intrinsic
functions are added by this patch.

Signed-off-by: Feng Wang 
gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc (class vfncvtbf16_f):
Add 'Zvfbfmin' intrinsic in bases.
(class vfwcvtbf16_f): Ditto.
(class vfwmaccbf16): Add 'Zvfbfwma' intrinsic in bases.
(BASE): Add BASE macro for 'Zvfbfmin' and 'Zvfbfwma'.
* config/riscv/riscv-vector-builtins-bases.h: Add declaration for 
'Zvfbfmin' and 'Zvfbfwma'.
* config/riscv/riscv-vector-builtins-functions.def 
(REQUIRED_EXTENSIONS):
Add builtins def for 'Zvfbfmin' and 'Zvfbfwma'.
(vfncvtbf16_f): Ditto.
(vfncvtbf16_f_frm): Ditto.
(vfwcvtbf16_f): Ditto.
(vfwmaccbf16): Ditto.
(vfwmaccbf16_frm): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (supports_vectype_p):
Add vector intrinsic build judgment for BFloat16.
(build_all): Ditto.
(BASE_NAME_MAX_LEN): Adjust max length.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_F32_OPS):
Add new operand type for BFloat16.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_F32_OPS): Ditto.
(validate_instance_type_required_extensions):
Add required_ext checking for 'Zvfbfmin' and 'Zvfbfwma'.
* config/riscv/riscv-vector-builtins.h (enum required_ext):
Add required_ext declaration for 'Zvfbfmin' and 'Zvfbfwma'.
(reqired_ext_to_isa_name): Ditto.
(required_extensions_specified): Ditto.
(struct function_group_info): Add match case for 'Zvfbfmin' and 
'Zvfbfwma'.
* config/riscv/riscv.cc (riscv_validate_vector_type):
Add required_ext checking for 'Zvfbfmin' and 'Zvfbfwma'.

(cherry picked from commit 281f021ed4fbf9c2336048e34b6b40c6f7119baa)

Diff:
---
 gcc/config/riscv/riscv-vector-builtins-bases.cc| 69 ++
 gcc/config/riscv/riscv-vector-builtins-bases.h |  7 +++
 .../riscv/riscv-vector-builtins-functions.def  | 15 +
 gcc/config/riscv/riscv-vector-builtins-shapes.cc   | 31 +-
 gcc/config/riscv/riscv-vector-builtins-types.def   | 13 
 gcc/config/riscv/riscv-vector-builtins.cc  | 67 +
 gcc/config/riscv/riscv-vector-builtins.h   | 34 +++
 gcc/config/riscv/riscv.cc  | 13 ++--
 8 files changed, 232 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 6483faba39c4..193392fbcc2a 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -2417,6 +2417,60 @@ public:
   }
 };
 
+/* Implements vfncvtbf16_f. */
+template 
+class vfncvtbf16_f : public function_base
+{
+public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
+  bool may_require_frm_p () const override { return true; }
+
+  rtx expand (function_expander &e) const override
+  {
+return e.use_exact_insn (code_for_pred_trunc_to_bf16 (e.vector_mode ()));
+  }
+};
+
+/* Implements vfwcvtbf16_f. */
+class vfwcvtbf16_f : public function_base
+{
+public:
+  rtx expand (function_expander &e) const override
+  {
+return e.use_exact_insn (code_for_pred_extend_bf16_to (e.vector_mode ()));
+  }
+};
+
+/* Implements vfwmaccbf16. */
+template 
+class vfwmaccbf16 : public function_base
+{
+public:
+  bool has_rounding_mode_operand_p () const override
+  {
+return FRM_OP == HAS_FRM;
+  }
+
+  bool may_require_frm_p () const override { return true; }
+
+  bool has_merge_operand_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+if (e.op_info->op == OP_TYPE_vf)
+  return e.use_widen_ternop_insn (
+   code_for_pred_widen_bf16_mul_scalar (e.vector_mode ()));
+if (e.op_info->op == OP_TYPE_vv)
+  return e.use_widen_ternop_insn (
+   code_for_pred_widen_bf16_mul (e.vector_mode ()));
+gcc_unreachable ();
+  }
+};
+
 static CONSTEXPR const vsetvl vsetvl_obj;
 static CONSTEXPR const vsetvl vsetvlmax_obj;
 static CONSTEXPR const loadstore vle_obj;
@@ -2734,6 +2788,14 @@ static CONSTEXPR const crypto_vv   
vsm4r_obj;
 static CONSTEXPR const vsm3me vsm3me_obj;
 static CONSTEXPR const vaeskf2_vsm3c   

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Add md files for vector BFloat16

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:c2b212fbed411183ca108323d674fcd62028c851

commit c2b212fbed411183ca108323d674fcd62028c851
Author: Feng Wang 
Date:   Tue Jun 18 06:13:35 2024 +

RISC-V: Add md files for vector BFloat16

V3: Add Bfloat16 vector insn in generic-vector-ooo.md
v2: Rebase
According to the BFloat16 spec, some vector iterators and new patterns
are added in md files.

Signed-off-by: Feng Wang 
gcc/ChangeLog:

* config/riscv/generic-vector-ooo.md: Add def_insn_reservation for 
vector BFloat16.
* config/riscv/riscv.md: Add new insn name for vector BFloat16.
* config/riscv/vector-iterators.md: Add some iterators for vector 
BFloat16.
* config/riscv/vector.md: Add some attribute for vector BFloat16.
* config/riscv/vector-bfloat16.md: New file. Add insn patterns for
vector BFloat16.

(cherry picked from commit 9f521632dd9ce71ce28ff1da9c161f76bc20fe3e)

Diff:
---
 gcc/config/riscv/generic-vector-ooo.md |   4 +-
 gcc/config/riscv/riscv.md  |  13 ++-
 gcc/config/riscv/vector-bfloat16.md| 135 ++
 gcc/config/riscv/vector-iterators.md   | 169 -
 gcc/config/riscv/vector.md | 103 +---
 5 files changed, 407 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/generic-vector-ooo.md 
b/gcc/config/riscv/generic-vector-ooo.md
index 5e933c838418..efe6bc41e864 100644
--- a/gcc/config/riscv/generic-vector-ooo.md
+++ b/gcc/config/riscv/generic-vector-ooo.md
@@ -53,7 +53,7 @@
 (define_insn_reservation "vec_fcmp" 3
   (eq_attr "type" "vfrecp,vfminmax,vfcmp,vfsgnj,vfclass,vfcvtitof,\
vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,vfncvtitof,\
-   vfncvtftoi,vfncvtftof")
+   vfncvtftoi,vfncvtftof,vfncvtbf16,vfwcvtbf16")
   "vxu_ooo_issue,vxu_ooo_alu")
 
 ;; Vector integer multiplication.
@@ -69,7 +69,7 @@
 
 ;; Vector float multiplication and FMA.
 (define_insn_reservation "vec_fmul" 6
-  (eq_attr "type" "vfmul,vfwmul,vfmuladd,vfwmuladd")
+  (eq_attr "type" "vfmul,vfwmul,vfmuladd,vfwmuladd,vfwmaccbf16")
   "vxu_ooo_issue,vxu_ooo_alu")
 
 ;; Vector crypto, assumed to be a generic operation for now.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 5dee837a5878..379015c60de8 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -200,6 +200,7 @@
   RVVMF64BI,RVVMF32BI,RVVMF16BI,RVVMF8BI,RVVMF4BI,RVVMF2BI,RVVM1BI,
   RVVM8QI,RVVM4QI,RVVM2QI,RVVM1QI,RVVMF2QI,RVVMF4QI,RVVMF8QI,
   RVVM8HI,RVVM4HI,RVVM2HI,RVVM1HI,RVVMF2HI,RVVMF4HI,
+  RVVM8BF,RVVM4BF,RVVM2BF,RVVM1BF,RVVMF2BF,RVVMF4BF,
   RVVM8HF,RVVM4HF,RVVM2HF,RVVM1HF,RVVMF2HF,RVVMF4HF,
   RVVM8SI,RVVM4SI,RVVM2SI,RVVM1SI,RVVMF2SI,
   RVVM8SF,RVVM4SF,RVVM2SF,RVVM1SF,RVVMF2SF,
@@ -219,6 +220,11 @@
   RVVM2x4HI,RVVM1x4HI,RVVMF2x4HI,RVVMF4x4HI,
   RVVM2x3HI,RVVM1x3HI,RVVMF2x3HI,RVVMF4x3HI,
   RVVM4x2HI,RVVM2x2HI,RVVM1x2HI,RVVMF2x2HI,RVVMF4x2HI,
+  RVVM1x8BF,RVVMF2x8BF,RVVMF4x8BF,RVVM1x7BF,RVVMF2x7BF,
+  RVVMF4x7BF,RVVM1x6BF,RVVMF2x6BF,RVVMF4x6BF,RVVM1x5BF,
+  RVVMF2x5BF,RVVMF4x5BF,RVVM2x4BF,RVVM1x4BF,RVVMF2x4BF,
+  RVVMF4x4BF,RVVM2x3BF,RVVM1x3BF,RVVMF2x3BF,RVVMF4x3BF,
+  RVVM4x2BF,RVVM2x2BF,RVVM1x2BF,RVVMF2x2BF,RVVMF4x2BF,
   RVVM1x8HF,RVVMF2x8HF,RVVMF4x8HF,RVVM1x7HF,RVVMF2x7HF,
   RVVMF4x7HF,RVVM1x6HF,RVVMF2x6HF,RVVMF4x6HF,RVVM1x5HF,
   RVVMF2x5HF,RVVMF4x5HF,RVVM2x4HF,RVVM1x4HF,RVVMF2x4HF,
@@ -462,6 +468,10 @@
 ;; vsm4rcrypto vector SM4 Rounds instructions
 ;; vsm3me   crypto vector SM3 Message Expansion instructions
 ;; vsm3ccrypto vector SM3 Compression instructions
+;; 18.Vector BF16 instrctions
+;; vfncvtbf16  vector narrowing single floating-point to brain floating-point 
instruction
+;; vfwcvtbf16  vector widening brain floating-point to single floating-point 
instruction
+;; vfwmaccbf16  vector BF16 widening multiply-accumulate
 (define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -483,7 +493,7 @@
vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,

vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vcpop,vrol,vror,vwsll,

vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz,
-   vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r,vsm3me,vsm3c"
+   
vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r,vsm3me,vsm3c,vfncvtbf16,vfwcvtbf16,vfwmaccbf16"
   (cond [(eq_attr "got" "load") (const_string "load")
 
 ;; If a doubleword move uses these expensive instructions,
@@ -4373,6 +4383,7 @@
 (include "generic-ooo.md")
 (include "vector.md")
 (include "vector-crypto.md")
+(include "vector-bfloat16.md")
 (include "zicond.md")
 (include "sfb.md")
 (include "zc.md")
diff --git a/gcc/config/riscv/vector-bfloat16.md 
b/gcc/config/riscv/vector-bfloat16.md
new file mode 100644
index ..562aa8ee5ed7
--- /dev/null
+++ b/g

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Fix testcase for vector .SAT_SUB in zip benchmark

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:21814631a11523712913a1fdec4055176aa89e28

commit 21814631a11523712913a1fdec4055176aa89e28
Author: Edwin Lu 
Date:   Fri Jul 12 11:31:16 2024 -0700

RISC-V: Fix testcase for vector .SAT_SUB in zip benchmark

The following testcase was not properly testing anything due to an
uninitialized variable. As a result, the loop was not iterating through
the testing data, but instead on undefined values which could cause an
unexpected abort.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h:
initialize variable

Signed-off-by: Edwin Lu 
(cherry picked from commit 4306f76192bc7ab71c5997a7e2c95320505029ab)

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h | 1 +
 1 file changed, 1 insertion(+)

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h
index d238c6392def..309d63377d53 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h
@@ -9,6 +9,7 @@ main ()
 
   for (i = 0; i < sizeof (DATA) / sizeof (DATA[0]); i++)
 {
+  d = DATA[i];
   RUN_BINARY_VX (&d.x[N], d.b, N);
 
   for (k = 0; k < N; k++)


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Implement locality for __builtin_prefetch

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:e8be9b1c419d1b59a6f5fe5f166c43bfea27ec0e

commit e8be9b1c419d1b59a6f5fe5f166c43bfea27ec0e
Author: Monk Chiang 
Date:   Thu Jul 6 14:05:17 2023 +0800

RISC-V: Implement locality for __builtin_prefetch

This patch adds the Zihintntl instructions to the prefetch pattern.
Zicbop provides the prefetch instructions; Zihintntl provides the NTL
(non-temporal locality) hint instructions.  If the target has the Zihintntl
extension, an NTL instruction is inserted before the prefetch instruction.
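
As a rough illustration of the behaviour described above (not the patch's
code), the sketch below maps the locality argument of __builtin_prefetch to
the text that would be emitted.  The hint names follow the 'L' case of
riscv_print_operand in the diff; the helper names ntl_hint_for_locality and
prefetch_asm are hypothetical, and the address operand is left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: localities 0..2 select an NTL hint, locality 3
// selects none.  The mapping mirrors the 'L' operand letter in the diff.
const char *ntl_hint_for_locality (int locality)
{
  switch (locality)
    {
    case 0: return "ntl.all";
    case 1: return "ntl.pall";
    case 2: return "ntl.p1";
    default: return nullptr;
    }
}

// Hypothetical helper: build the instruction text for one prefetch.
// The real pattern also prints the address operand, omitted here.
std::string prefetch_asm (bool is_write, int locality, bool have_zihintntl)
{
  assert (locality >= 0 && locality <= 3);
  std::string out;
  if (have_zihintntl)
    if (const char *hint = ntl_hint_for_locality (locality))
      {
	out += hint;
	out += "\n\t";
      }
  out += is_write ? "prefetch.w" : "prefetch.r";
  return out;
}
```

Under these assumptions a read prefetch with locality 0 yields "ntl.all"
followed by "prefetch.r", while locality 3, or a target without Zihintntl,
yields the bare prefetch instruction.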

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_print_operand): Add 'L' letter
to print zihintntl instructions string.
* config/riscv/riscv.md (prefetch): Add zihintntl instructions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/prefetch-zicbop.c: New test.
* gcc.target/riscv/prefetch-zihintntl.c: New test.

(cherry picked from commit bf26413fc4081dfd18b915580b35bdb71481327e)

Diff:
---
 gcc/config/riscv/riscv.cc  | 22 ++
 gcc/config/riscv/riscv.md  | 10 +++---
 gcc/testsuite/gcc.target/riscv/prefetch-zicbop.c   | 20 
 .../gcc.target/riscv/prefetch-zihintntl.c  | 20 
 4 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index d4553aacee96..9bedefa74c35 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6488,6 +6488,7 @@ riscv_asm_output_opcode (FILE *asm_out_file, const char 
*p)
'A' Print the atomic operation suffix for memory model OP.
'I' Print the LR suffix for memory model OP.
'J' Print the SC suffix for memory model OP.
+   'L' Print a non-temporal locality hints instruction.
'z' Print x0 if OP is zero, otherwise print OP normally.
'i' Print i if the operand is not a register.
'S' Print shift-index of single-bit mask OP.
@@ -6682,6 +6683,27 @@ riscv_print_operand (FILE *file, rtx op, int letter)
   break;
 }
 
+case 'L':
+  {
+   const char *ntl_hint = NULL;
+   switch (INTVAL (op))
+ {
+ case 0:
+   ntl_hint = "ntl.all";
+   break;
+ case 1:
+   ntl_hint = "ntl.pall";
+   break;
+ case 2:
+   ntl_hint = "ntl.p1";
+   break;
+ }
+
+  if (ntl_hint)
+   asm_fprintf (file, "%s\n\t", ntl_hint);
+  break;
+  }
+
 case 'i':
   if (code != REG)
 fputs ("i", file);
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 379015c60de8..46c46039c33a 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -4113,12 +4113,16 @@
 {
   switch (INTVAL (operands[1]))
   {
-case 0: return "prefetch.r\t%a0";
-case 1: return "prefetch.w\t%a0";
+case 0: return TARGET_ZIHINTNTL ? "%L2prefetch.r\t%a0" : "prefetch.r\t%a0";
+case 1: return TARGET_ZIHINTNTL ? "%L2prefetch.w\t%a0" : "prefetch.w\t%a0";
 default: gcc_unreachable ();
   }
 }
-  [(set_attr "type" "store")])
+  [(set_attr "type" "store")
+   (set (attr "length") (if_then_else (and (match_test "TARGET_ZIHINTNTL")
+  (match_test "IN_RANGE (INTVAL 
(operands[2]), 0, 2)"))
+ (const_string "8")
+ (const_string "4")))])
 
 (define_insn "riscv_prefetchi_"
   [(unspec_volatile:X [(match_operand:X 0 "address_operand" "r")
diff --git a/gcc/testsuite/gcc.target/riscv/prefetch-zicbop.c 
b/gcc/testsuite/gcc.target/riscv/prefetch-zicbop.c
new file mode 100644
index ..0faa120f1f79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/prefetch-zicbop.c
@@ -0,0 +1,20 @@
+/* { dg-do compile target { { rv64-*-*}}} */
+/* { dg-options "-march=rv64gc_zicbop -mabi=lp64" } */
+
+void foo (char *p)
+{
+  __builtin_prefetch (p, 0, 0);
+  __builtin_prefetch (p, 0, 1);
+  __builtin_prefetch (p, 0, 2);
+  __builtin_prefetch (p, 0, 3);
+  __builtin_prefetch (p, 1, 0);
+  __builtin_prefetch (p, 1, 1);
+  __builtin_prefetch (p, 1, 2);
+  __builtin_prefetch (p, 1, 3);
+}
+
+/* { dg-final { scan-assembler-not "ntl.all\t" } } */
+/* { dg-final { scan-assembler-not "ntl.pall\t" } } */
+/* { dg-final { scan-assembler-not "ntl.p1\t" } } */
+/* { dg-final { scan-assembler-times "prefetch.r" 4 } } */
+/* { dg-final { scan-assembler-times "prefetch.w" 4 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/prefetch-zihintntl.c 
b/gcc/testsuite/gcc.target/riscv/prefetch-zihintntl.c
new file mode 100644
index ..78a3afe68333
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/prefetch-zihintntl.c
@@ -0,0 +1,20 @@
+/* { dg-do compile target { { rv64-*-*}}} */
+/* { dg-options "-march=rv64gc_zicbop_zihintntl -mabi=lp64" } */
+
+void foo (char *p)
+{
+  __builtin_prefetch (p, 0, 0);
+  __builtin_prefetch (p, 0, 1);
+  __builtin_prefetch (p, 0, 2);
+  __builtin_prefetch (p, 0, 3);
+  __builtin_prefetch (p, 1, 0);

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:64b4b5211aa664e224e3cd722ab5aa11f278aa68

commit 64b4b5211aa664e224e3cd722ab5aa11f278aa68
Author: Christoph Müllner 
Date:   Fri Jul 5 04:48:15 2024 +0200

RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr

Allocating an object on the heap with new, wrapping it in a
std::unique_ptr and finally getting the buffer via buf.get()
is a correct way to allocate a buffer that is automatically
freed on return.  However, a simple invocation of alloca()
does the same with less overhead.
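
The two allocation styles can be compared in a small standalone sketch.  It
assumes a POSIX/glibc environment (for alloca.h and strtok_r); the function
names are hypothetical and the token counting only stands in for the real
parsing.  Note that alloca() reports no failure and takes the buffer from
the stack, so the input length must be known to be small.

```cpp
#include <alloca.h>
#include <string.h>
#include <cassert>
#include <memory>

// Count comma-separated tokens; strtok_r modifies BUF in place.
static size_t count_tokens_in_place (char *buf)
{
  size_t n = 0;
  char *save = nullptr;
  for (char *tok = strtok_r (buf, ",", &save); tok != nullptr;
       tok = strtok_r (nullptr, ",", &save))
    ++n;
  return n;
}

// Heap variant: the unique_ptr frees the buffer on every return path.
size_t count_tokens_heap (const char *str)
{
  assert (str != nullptr);
  size_t len = strlen (str);
  std::unique_ptr<char[]> buf (new char[len + 1]);
  strcpy (buf.get (), str);
  return count_tokens_in_place (buf.get ());
}

// Stack variant: the buffer is released when this function returns.
size_t count_tokens_stack (const char *str)
{
  assert (str != nullptr);
  size_t len = strlen (str);
  char *buf = (char *) alloca (len + 1);
  strcpy (buf, str);
  return count_tokens_in_place (buf);
}
```

Both variants give the same result for a string such as "+zba,+zbb"; they
differ only in where the temporary copy lives.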

gcc/ChangeLog:

* config/riscv/riscv-target-attr.cc 
(riscv_target_attr_parser::parse_arch):
Replace new + std::unique_ptr by alloca().
(riscv_process_one_target_attr): Likewise.
(riscv_process_target_attr): Likewise.

Signed-off-by: Christoph Müllner 
(cherry picked from commit 5040c273484d7123a40a99cdeb434cecbd17a2e9)

Diff:
---
 gcc/config/riscv/riscv-target-attr.cc | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv-target-attr.cc 
b/gcc/config/riscv/riscv-target-attr.cc
index 0bbe7df25d19..3d7753f64574 100644
--- a/gcc/config/riscv/riscv-target-attr.cc
+++ b/gcc/config/riscv/riscv-target-attr.cc
@@ -109,8 +109,7 @@ riscv_target_attr_parser::parse_arch (const char *str)
 {
   /* Parsing the extension list like "+[,+]*".  */
   size_t len = strlen (str);
-  std::unique_ptr buf (new char[len+1]);
-  char *str_to_check = buf.get ();
+  char *str_to_check = (char *) alloca (len + 1);
   strcpy (str_to_check, str);
   const char *token = strtok_r (str_to_check, ",", &str_to_check);
   m_subset_list = riscv_cmdline_subset_list ()->clone ();
@@ -247,8 +246,7 @@ riscv_process_one_target_attr (char *arg_str,
   return false;
 }
 
-  std::unique_ptr buf (new char[len+1]);
-  char *str_to_check = buf.get();
+  char *str_to_check = (char *) alloca (len + 1);
   strcpy (str_to_check, arg_str);
 
   char *arg = strchr (str_to_check, '=');
@@ -334,8 +332,7 @@ riscv_process_target_attr (tree fndecl, tree args, 
location_t loc,
   return false;
 }
 
-  std::unique_ptr buf (new char[len+1]);
-  char *str_to_check = buf.get ();
+  char *str_to_check = (char *) alloca (len + 1);
   strcpy (str_to_check, TREE_STRING_POINTER (args));
 
   /* Used to catch empty spaces between semi-colons i.e.


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Allow adding enabled extension via target arch attributes

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:4157d59413a5f35808603e06de06cfb388811a65

commit 4157d59413a5f35808603e06de06cfb388811a65
Author: Christoph Müllner 
Date:   Sat Jul 6 17:03:18 2024 +0200

RISC-V: Allow adding enabled extension via target arch attributes

The set of enabled extensions can be extended via target arch function
attributes by listing each extension with a '+' prefix and a comma as
list separator.  E.g.:
  __attribute__((target("arch=+zba,+zbb"))) void foo();

The programmer intends to ensure that one or more extensions
are enabled when building the code.  This is independent of the arch
string that is passed at build time via the -march= option.

Therefore, it is reasonable to allow enabling extensions via target arch
attributes, which have already been enabled via the -march= string.

The subset list code already supports such duplication for implied
extensions.  This patch adds an interface so the subset list
parser can be switched into a mode where duplication is allowed.
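
A simplified model of the duplicate handling is sketched below.  It is not
the riscv_subset_list implementation: the class subset_list_sketch and its
std::map storage are assumptions for illustration, and add() returns false
where the real code would report the "appear more than one time" error.
Only the flag name m_allow_adding_dup and its setter are taken from the
patch.

```cpp
#include <cassert>
#include <map>
#include <string>

class subset_list_sketch
{
  struct version_info
  {
    int major_version;
    int minor_version;
  };

  std::map<std::string, version_info> m_subsets;

  /* Allow adding the same extension more than once.  */
  bool m_allow_adding_dup = false;

public:
  void set_allow_adding_dup (bool v) { m_allow_adding_dup = v; }

  // Returns true if the extension was added or was an accepted duplicate.
  bool add (const std::string &subset, int major_version, int minor_version)
  {
    assert (!subset.empty ());
    auto it = m_subsets.find (subset);
    if (it == m_subsets.end ())
      {
	m_subsets.emplace (subset,
			   version_info{major_version, minor_version});
	return true;
      }
    // Already present: accept only in duplicate mode, and only when the
    // requested version matches the recorded one.
    return m_allow_adding_dup
	   && it->second.major_version == major_version
	   && it->second.minor_version == minor_version;
  }
};
```

In this model, adding "zbb" twice fails by default, succeeds once
set_allow_adding_dup (true) has been called, and still fails if the second
request names a different version.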

This commit fixes the following regressed test cases:
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_subset_list::add):
Allow adding enabled extension if m_allow_adding_dup is set.
* config/riscv/riscv-subset.h: Add m_allow_adding_dup and setter.
* config/riscv/riscv-target-attr.cc 
(riscv_target_attr_parser::parse_arch):
Allow adding enabled extensions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr115554.c: Change expected fail to expected 
pass.
* gcc.target/riscv/target-attr-16.c: New test.

Signed-off-by: Christoph Müllner 
(cherry picked from commit 61c21a719e205f70bd046c6a0275d1a3fd6341a4)

Diff:
---
 gcc/common/config/riscv/riscv-common.cc | 17 +--
 gcc/config/riscv/riscv-subset.h |  5 +
 gcc/config/riscv/riscv-target-attr.cc   |  3 +++
 gcc/testsuite/gcc.target/riscv/pr115554.c   |  2 --
 gcc/testsuite/gcc.target/riscv/target-attr-16.c | 28 +
 5 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 8e9beb6801f9..682826c0e344 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -702,12 +702,17 @@ riscv_subset_list::add (const char *subset, int 
major_version,
  ext->minor_version = minor_version;
}
   else
-   error_at (
- m_loc,
- "%<-march=%s%>: extension %qs appear more than one time",
- m_arch,
- subset);
-
+   {
+ /* The extension is already in the list.  */
+ if (!m_allow_adding_dup
+ || ext->major_version != major_version
+ || ext->minor_version != minor_version)
+   error_at (
+ m_loc,
+ "%<-march=%s%>: extension %qs appear more than one time",
+ m_arch,
+ subset);
+   }
   return;
 }
   else if (strlen (subset) == 1 && !standard_extensions_p (subset))
diff --git a/gcc/config/riscv/riscv-subset.h b/gcc/config/riscv/riscv-subset.h
index 279716feab57..dace4de65753 100644
--- a/gcc/config/riscv/riscv-subset.h
+++ b/gcc/config/riscv/riscv-subset.h
@@ -65,6 +65,9 @@ private:
   /* Number of subsets. */
   unsigned m_subset_num;
 
+  /* Allow adding the same extension more than once.  */
+  bool m_allow_adding_dup;
+
   riscv_subset_list (const char *, location_t);
 
   const char *parsing_subset_version (const char *, const char *, unsigned *,
@@ -109,6 +112,8 @@ public:
 
   void set_loc (location_t);
 
+  void set_allow_adding_dup (bool v) { m_allow_adding_dup = v; }
+
   void finalize ();
 };
 
diff --git a/gcc/config/riscv/riscv-target-attr.cc 
b/gcc/config/riscv/riscv-target-attr.cc
index 317806143949..57235c9c0a7e 100644
--- a/gcc/config/riscv/riscv-target-attr.cc
+++ b/gcc/config/riscv/riscv-target-attr.cc
@@ -109,6 +109,8 @@ riscv_target_attr_parser::parse_arch (const char *str)
  ? riscv_subset_list::parse (local_arch_str, m_loc)
  : riscv_cmdline_subset_list ()->clone ();
   m_subset_list->set_loc (m_loc);
+  m_subset_list->set_allow_adding_dup (true);
+
   while (token)
{
  if (token[0] != '+')
@@ -134,6 +136,7 @@ riscv_target_attr_parser::parse_arch (const char *str)
  token = strtok_r (NULL, ",", &str_to_check);
}
 
+

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Rewrite target attribute handling

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:fa716b37c5c85663b8f73d725a2d4020116e2a77

commit fa716b37c5c85663b8f73d725a2d4020116e2a77
Author: Christoph Müllner 
Date:   Sat Jun 22 21:59:04 2024 +0200

RISC-V: Rewrite target attribute handling

The target-arch attribute handling in RISC-V is only a few months old,
but already saw a rewrite (9941f0295a14), which addressed an important
issue.  This rewrite introduced a hash table in the backend, which is
used to keep track of target-arch attributes of all functions.
The index of this hash table is the pointer to the function declaration
object (fndecl).  However, objects like these don't have the lifetime
that is assumed here, which resulted in observing two fndecl objects
with the same address for different objects (triggering the assertion
in riscv_func_target_put() -- see also PR115562).

This patch removes the hash table approach in favor of storing target
specific options using the DECL_FUNCTION_SPECIFIC_TARGET() macro, which
is also used by other backends and is specifically designed for this
purpose (https://gcc.gnu.org/onlinedocs/gccint/Function-Properties.html).

To have an accessible field in the target options, we need to
adjust riscv.opt and introduce the field riscv_arch_string
(for the already existing option '-march=').

Using this macro allows removing much of the code in riscv-common.cc that
controls access to the objects 'func_target_table' and
'current_subset_list'.

One thing to mention is that we had two subset lists:
current_subset_list and cmdline_subset_list, with the latter being
introduced recently for target attribute handling.
This patch reduces them back to one (cmdline_subset_list) which
contains the list of extensions that have been enabled by the command
line arguments.

Note that the patch keeps the existing behavior of rejecting
duplications of extensions when added via the '+' operator in a function
target attribute.  E.g. "-march=rv64gc_zbb" and "arch=+zbb" will trigger
an error (see pr115554.c).  However, at the same time this patch breaks
the acceptance of adding implied extensions, which causes the following
six regressions (with the error "extension 'EXT' appear more than one 
time"):
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c

New tests were added to document the behavior and to ensure it won't
regress.  This patch did not show any regressions for rv32/rv64
and fixes the ICEs from PR115554 and PR115562.

PR target/115554
PR target/115562

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (struct 
riscv_func_target_info):
Remove.
(struct riscv_func_target_hasher): Likewise.
(riscv_func_decl_hash): Likewise.
(riscv_func_target_hasher::hash): Likewise.
(riscv_func_target_hasher::equal): Likewise.
(riscv_current_subset_list): Likewise.
(riscv_cmdline_subset_list): Remove obsolete space.
(riscv_func_target_table_lazy_init): Remove.
(riscv_func_target_get): Likewise.
(riscv_func_target_put): Likewise.
(riscv_func_target_remove_and_destory): Likewise.
(riscv_arch_str): Generate from cmdline_subset_list.
(riscv_set_arch_by_subset_list): Don't set current_subset_list.
(riscv_parse_arch_string): Remove current_subset_list.
* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins):
Get subset list via riscv_cmdline_subset_list().
* config/riscv/riscv-subset.h (riscv_current_subset_list):
Remove prototype.
(riscv_func_target_get): Likewise.
(riscv_func_target_put): Likewise.
(riscv_func_target_remove_and_destory): Likewise.
* config/riscv/riscv-target-attr.cc 
(riscv_target_attr_parser::parse_arch):
Build base arch string from existing target options, if any.
(riscv_target_attr_parser::update_settings): Store new arch
string in target options.
(riscv_process_one_target_attr): Whitespace fix.
(riscv_process_target_attr): Drop opts argument.
(riscv_option_valid_attribute_p): Properly save, change and restore
target options.
* config/riscv/riscv.cc (get_arch_str): New function.
(riscv_declare_function_name): Get arch string for option-arch
directive from function's target options.
*

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] Fix liveness computation for shift/rotate counts in ext-dce

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:fa543ce46c2e205f2813e13fd9d4df65e8544b87

commit fa543ce46c2e205f2813e13fd9d4df65e8544b87
Author: Jeff Law 
Date:   Mon Jul 15 18:15:33 2024 -0600

Fix liveness computation for shift/rotate counts in ext-dce

So as I've noted before I believe the control flow in ext-dce.cc is horribly
messy.  While investigating a fix for 115877 I came across another problem
related to control flow handling.

Specifically, if we have a binary op which implies the 2nd operand is fully
live, then we'd actually fail to mark that operand as live.

We essentially broke out of the loop which was supposed to be safe.  But Y was
a REG and if Y is a REG or CONST_INT we skip sub-rtxs and thus failed to
process that operand (the shift count) at all.

Rather than muck around with control flow, we can just set all the bits as live
in DST_MASK and let normal processing continue.  With all the bits live in
DST_MASK all the bits implied by the mode of the argument will also be live.
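
Why a shift count has to be treated as fully live can be seen in a toy
model, independent of ext-dce.cc.  The constant TOY_DST_MASK and both
function names are hypothetical.  For an addition, the live low bits of the
result depend only on the same low bits of the inputs; for a shift, a change
in a count bit outside the mask still changes the live bits of the result.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical liveness mask: only the low two bits of the result are used.
constexpr uint64_t TOY_DST_MASK = 0x3;

// Live part of an addition: bits of A and B above the mask cannot
// influence the low two bits of the sum.
uint64_t live_part_of_add (uint64_t a, uint64_t b)
{
  return (a + b) & TOY_DST_MASK;
}

// Live part of a left shift: every bit of COUNT can influence it.
uint64_t live_part_of_shift (uint64_t value, unsigned count)
{
  assert (count < 64);
  return (value << count) & TOY_DST_MASK;
}
```

Here live_part_of_add (1, 0) and live_part_of_add (1, 4) agree, because the
operands differ only above the mask, whereas live_part_of_shift (1, 0) and
live_part_of_shift (1, 4) differ even though the counts differ only in bit 2.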

No testcase.

Bootstrapped and regression tested on x86.  Pushing to the trunk.

gcc/
* ext-dce.cc (ext_dce_process_uses): Simplify control flow and fix
liveness computation for shift/rotate counts.

(cherry picked from commit b31b8af807f5459674b0b310cb62a5bc81b676e7)

Diff:
---
 gcc/ext-dce.cc | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 91789d283fcd..7ecb99fef81d 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -632,10 +632,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap 
live_tmp)
  else if (!CONSTANT_P (y))
break;
 
- /* We might have (ashift (const_int 1) (reg...)) */
- /* XXX share this logic with code below.  */
+ /* We might have (ashift (const_int 1) (reg...))
+By setting dst_mask we can continue iterating on the
+the next operand and it will be considered fully live.  */
  if (binop_implies_op2_fully_live (GET_CODE (src)))
-   break;
+   dst_mask = -1;
 
  /* If this was anything but a binary operand, break the inner
 loop.  This is conservatively correct as it will cause the


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] Revert "RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr"

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:20fe3e21e824daeb20679a24d3de78969c17710c

commit 20fe3e21e824daeb20679a24d3de78969c17710c
Author: Christoph Müllner 
Date:   Mon Jul 15 23:42:39 2024 +0200

Revert "RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr"

This reverts commit 5040c273484d7123a40a99cdeb434cecbd17a2e9.

(cherry picked from commit eb0c163aada970b8351067b17121f013fc58dbc9)

Diff:
---
 gcc/config/riscv/riscv-target-attr.cc | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv-target-attr.cc b/gcc/config/riscv/riscv-target-attr.cc
index 57235c9c0a7e..1645a6692177 100644
--- a/gcc/config/riscv/riscv-target-attr.cc
+++ b/gcc/config/riscv/riscv-target-attr.cc
@@ -101,7 +101,8 @@ riscv_target_attr_parser::parse_arch (const char *str)
 {
   /* Parsing the extension list like "+[,+]*".  */
   size_t len = strlen (str);
-  char *str_to_check = (char *) alloca (len + 1);
+  std::unique_ptr<char[]> buf (new char[len+1]);
+  char *str_to_check = buf.get ();
   strcpy (str_to_check, str);
   const char *token = strtok_r (str_to_check, ",", &str_to_check);
   const char *local_arch_str = global_options.x_riscv_arch_string;
@@ -253,7 +254,8 @@ riscv_process_one_target_attr (char *arg_str,
   return false;
 }
 
-  char *str_to_check = (char *) alloca (len + 1);
+  std::unique_ptr<char[]> buf (new char[len+1]);
+  char *str_to_check = buf.get();
   strcpy (str_to_check, arg_str);
 
   char *arg = strchr (str_to_check, '=');
@@ -339,7 +341,8 @@ riscv_process_target_attr (tree args, location_t loc)
   return false;
 }
 
-  char *str_to_check = (char *) alloca (len + 1);
+  std::unique_ptr<char[]> buf (new char[len+1]);
+  char *str_to_check = buf.get ();
   strcpy (str_to_check, TREE_STRING_POINTER (args));
 
   /* Used to catch empty spaces between semi-colons i.e.


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] Add debug counter for ext_dce

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:b33c9eebd9581a86d56e3cdba2bef96fda1727f4

commit b33c9eebd9581a86d56e3cdba2bef96fda1727f4
Author: Andrew Pinski 
Date:   Tue Jul 16 09:53:20 2024 -0700

Add debug counter for ext_dce

Like r15-1610-gb6215065a5b143 (which adds one for late_combine),
adding one for ext_dce is useful to debug some issues with this pass.
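
As a rough illustration of how a debug counter helps with bisection, the sketch below models one in plain C.  It is a simplified stand-in and not GCC's dbgcnt implementation: toy_counter, toy_dbg_cnt and toy_count_applied are hypothetical names, and the model only supports an upper limit (similar in spirit to passing something like -fdbg-cnt=ext_dce:N).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a debug counter: COUNT is how many times the
   counter has been queried, LIMIT is how many queries may succeed.  */
struct toy_counter
{
  unsigned count;
  unsigned limit;
};

/* Return true while the counter is still within its limit.  */
bool
toy_dbg_cnt (struct toy_counter *c)
{
  assert (c != 0);
  c->count++;
  return c->count <= c->limit;
}

/* Model a pass that sees CANDIDATES optimizable insns and return how
   many transformations are actually applied under LIMIT.  */
unsigned
toy_count_applied (unsigned limit, unsigned candidates)
{
  struct toy_counter c = { 0, limit };
  unsigned applied = 0;

  for (unsigned i = 0; i < candidates; i++)
    {
      /* Rejected due to debug counter: leave this insn alone.  */
      if (!toy_dbg_cnt (&c))
	continue;
      applied++;
    }
  return applied;
}
```

Raising or lowering the limit and re-running the failing test narrows the problem down to the first transformation that breaks things.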

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* dbgcnt.def (ext_dce): New debug counter.
* ext-dce.cc (ext_dce_try_optimize_insn): Reject the insn
if the debug counter says so.
(ext_dce): Rename to ...
(ext_dce_execute): This.
(pass_ext_dce::execute): Update for the name of ext_dce.

Signed-off-by: Andrew Pinski 
(cherry picked from commit 7c3287f3613210d4f98c8095bc739bea6582bfbb)

Diff:
---
 gcc/dbgcnt.def |  1 +
 gcc/ext-dce.cc | 16 +---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index ed9f062eac2c..4e7aaeae2da5 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -162,6 +162,7 @@ DEBUG_COUNTER (dom_unreachable_edges)
 DEBUG_COUNTER (dse)
 DEBUG_COUNTER (dse1)
 DEBUG_COUNTER (dse2)
+DEBUG_COUNTER (ext_dce)
 DEBUG_COUNTER (form_fma)
 DEBUG_COUNTER (gcse2_delete)
 DEBUG_COUNTER (gimple_unroll)
diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 7ecb99fef81d..7270de2a3bfe 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl-iter.h"
 #include "df.h"
 #include "print-rtl.h"
+#include "dbgcnt.h"
 
 /* These should probably move into a C++ class.  */
 static vec livein;
@@ -312,6 +313,15 @@ ext_dce_try_optimize_insn (rtx_insn *insn, rtx set)
   print_rtl_single (dump_file, SET_SRC (set));
 }
 
+  /* We decided to turn do the optimization but allow it to be rejected for
+ bisection purposes.  */
+  if (!dbg_cnt (::ext_dce))
+{
+  if (dump_file)
+   fprintf (dump_file, "Rejected due to debug counter.\n");
+  return;
+}
+
   new_pattern = simplify_gen_subreg (GET_MODE (src), inner,
 GET_MODE (inner), 0);
   /* simplify_gen_subreg may fail in which case NEW_PATTERN will be NULL.
@@ -881,8 +891,8 @@ static bool ext_dce_rd_confluence_n (edge) { return true; }
are never read.  Turn such extensions into SUBREGs instead which
can often be propagated away.  */
 
-static void
-ext_dce (void)
+void
+ext_dce_execute (void)
 {
   df_analyze ();
   ext_dce_init ();
@@ -929,7 +939,7 @@ public:
   virtual bool gate (function *) { return flag_ext_dce && optimize > 0; }
   virtual unsigned int execute (function *)
 {
-  ext_dce ();
+  ext_dce_execute ();
   return 0;
 }


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Fix testcase missing arch attribute

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:bdb2115f7ee854a6daecf6079274700321f1a2b5

commit bdb2115f7ee854a6daecf6079274700321f1a2b5
Author: Edwin Lu 
Date:   Tue Jul 16 17:43:45 2024 -0700

RISC-V: Fix testcase missing arch attribute

The C + F extension implies the zcf extension on rv32. Add missing zcf
extension for the rv32 target.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/target-attr-16.c: Update expected assembly

Signed-off-by: Edwin Lu 
(cherry picked from commit 5bb01e91d40c34e8f8230b142f7ebff3d6aa88d1)

Diff:
---
 gcc/testsuite/gcc.target/riscv/target-attr-16.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/riscv/target-attr-16.c b/gcc/testsuite/gcc.target/riscv/target-attr-16.c
index 1c7badccdeee..c6b626d0c6ce 100644
--- a/gcc/testsuite/gcc.target/riscv/target-attr-16.c
+++ b/gcc/testsuite/gcc.target/riscv/target-attr-16.c
@@ -24,5 +24,5 @@ void bar (void)
 {
 }
 
-/* { dg-final { scan-assembler-times ".option arch, rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0_zca1p0_zcd1p0_zba1p0_zbb1p0" 4 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times ".option arch, rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0_zca1p0_zcd1p0_zcf1p0_zba1p0_zbb1p0" 4 { target { rv32 } } } } */
 /* { dg-final { scan-assembler-times ".option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zaamo1p0_zalrsc1p0_zca1p0_zcd1p0_zba1p0_zbb1p0" 4 { target { rv64 } } } } */


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [PR rtl-optimization/115877][2/n] Improve liveness computation for constant initialization

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:faadb6b663ea7bda38ebfcd1ed772882cdc725da

commit faadb6b663ea7bda38ebfcd1ed772882cdc725da
Author: Jeff Law 
Date:   Sun Jul 21 08:41:28 2024 -0600

[PR rtl-optimization/115877][2/n] Improve liveness computation for constant initialization

While debugging pr115877, I noticed we were failing to remove the destination
register from the LIVENOW bitmap when it was set to a constant value, i.e.
(set (dest) (const_int)).  This was a trivial oversight in
safe_for_live_propagation.

I don't have an example of this affecting code generation, but it certainly
could.  More importantly, by making LIVENOW more accurate it's easier to debug
when LIVENOW differs from expectations.

As with the prior patch this has been tested as part of a larger patchset with
the crosses as well as individually on x86_64.

Pushing to the trunk.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (safe_for_live_propagation): Handle RTX_CONST_OBJ.

(cherry picked from commit 9d8ef2711dfecd093077aef6123d9e93ea23454e)

Diff:
---
 gcc/ext-dce.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index d431f8ac12d4..59bcc4572d57 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -69,6 +69,7 @@ safe_for_live_propagation (rtx_code code)
   switch (GET_RTX_CLASS (code))
 {
   case RTX_OBJ:
+  case RTX_CONST_OBJ:
return true;
 
   case RTX_COMPARE:


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Rearrange the test helper files for vector .SAT_*

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:97d90509f1d6b8189d6492d51383c06239c57bbe

commit 97d90509f1d6b8189d6492d51383c06239c57bbe
Author: Pan Li 
Date:   Sat Jul 20 10:43:44 2024 +0800

RISC-V: Rearrange the test helper files for vector .SAT_*

Rearrange the test helper header files, as well as align the naming
conventions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h: Move to...
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vvv_run.h: ...here.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h: Move to...
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vvx_run.h: ...here.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: Move to...
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx_run.h: ...here.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: Adjust
the include file names.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-13.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-14.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-15.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-17.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-18.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-19.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-20.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-21.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-22.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-23.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-24.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-25.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-26.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-27.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-28.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-29.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-30.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-31.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-32.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-13.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-14.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-15.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-17.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-18.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-19.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-20.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-21.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-22.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-23.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-24.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-25.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-26.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-27.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-28.c: Ditto.
   

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [PR rtl-optimization/115877] Fix livein computation for ext-dce

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:3e82d5753917b74abcbb9212465eec8d89fef824

commit 3e82d5753917b74abcbb9212465eec8d89fef824
Author: Jeff Law 
Date:   Sun Jul 21 07:36:37 2024 -0600

[PR rtl-optimization/115877] Fix livein computation for ext-dce

So I'm not yet sure how I'm going to break everything down, but this is easy
enough to break out as 1/N of ext-dce fixes/improvements.

When handling uses in an insn, we first determine what bits are set in the
destination which is represented in DST_MASK.  Then we use that to refine what
bits are live in the source operands.

In the source operand handling section we *modify* DST_MASK if the source
operand is a SUBREG (ugh!).  So if the first operand is a SUBREG, then we can
incorrectly compute which bit groups are live in the second operand,
especially if it is a SUBREG as well.
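
The effect of restoring DST_MASK for each operand can be sketched with a small toy model in C.  This is an illustration only and not the loop in ext_dce_process_uses: the function name and the way a SUBREG "narrows" the mask (a plain AND with a hypothetical mask) are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (not GCC code): return the destination mask that is in
   effect when the SECOND operand of a binary operation is processed.
   OP1_SUBREG_MASK models how handling a SUBREG first operand modifies
   the mask in place.  RESTORE_PER_OPERAND selects the fixed behaviour.  */
uint64_t
toy_second_operand_mask (uint64_t dst_mask, uint64_t op1_subreg_mask,
			 bool restore_per_operand)
{
  assert (dst_mask != 0);

  /* Saved once, before any operand is looked at.  */
  uint64_t save_mask = dst_mask;

  /* First operand: SUBREG handling modifies the working mask.  */
  dst_mask &= op1_subreg_mask;

  /* Second operand: without the restore it inherits the modified mask.  */
  if (restore_per_operand)
    dst_mask = save_mask;

  return dst_mask;
}
```

Without the restore, the second operand sees the mask left behind by the first one and can end up with too few live bit groups; with it, each operand starts from the mask computed for the destination.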

This was seen when testing a larger set of patches on the rl78 port
(builtin-arith-overflow-p-7 & pr71631 execution failures), so no new test for
this bugfix.

Run through my tester (in conjunction with other ext-dce changes) on the
various cross targets.  Run individually through a bootstrap and regression
test cycle on x86_64 as well.

Pushing to the trunk.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (ext_dce_process_uses): Restore the value of DST_MASK
for each operand.

(cherry picked from commit 91e468b72dafc9dcd5dcf7915f1d0ef172264d53)

Diff:
---
 gcc/ext-dce.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 7270de2a3bfe..d431f8ac12d4 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -591,8 +591,10 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap live_tmp)
 making things live.  Breaking from this loop will cause
 the iterator to work on sub-rtxs, so it is safe to break
 if we see something we don't know how to handle.  */
+ unsigned HOST_WIDE_INT save_mask = dst_mask;
  for (;;)
{
+ dst_mask = save_mask;
  /* Strip an outer paradoxical subreg.  The bits outside
 the inner mode are don't cares.  So we can just strip
 and process the inner object.  */


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [NFC][PR rtl-optimization/115877] Avoid setting irrelevant bit groups as live in ext-dce

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:316f9617dcf040ccad190d853ee7f94b2f9caace

commit 316f9617dcf040ccad190d853ee7f94b2f9caace
Author: Jeff Law 
Date:   Mon Jul 22 08:45:10 2024 -0600

[NFC][PR rtl-optimization/115877] Avoid setting irrelevant bit groups as live in ext-dce

Another patch to refine liveness computations.  This should be NFC and is
designed to help debugging.

In simplest terms the patch avoids setting bit groups outside the size of a
pseudo as live.  Consider a HImode pseudo: bits 16..63 for such a pseudo don't
really have meaning, yet we often set bit groups related to bits 16..63 in
the liveness bitmaps.

This makes debugging harder than it needs to be by simply having larger bitmaps
to verify when walking through the code in a debugger.
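
The mapping from a mode's size to the number of tracked bit groups can be sketched in standalone C as follows.  This is a simplified model that takes a size in bytes instead of an rtx; toy_group_limit is a hypothetical name and the loop stands in for GCC's exact_log2.

```c
#include <assert.h>

/* Toy model (not GCC code): return how many of the four tracked bit
   groups (bits 0..7, 8..15, 16..31, 32..63) are meaningful for an
   object of SIZE_BYTES bytes.  Sizes that are not a power of two are
   handled conservatively and get all four groups.  */
int
toy_group_limit (int size_bytes)
{
  assert (size_bytes > 0);

  /* Not a power of two: be conservative.  */
  if ((size_bytes & (size_bytes - 1)) != 0)
    return 4;

  /* log2 of the size: 1 -> 0, 2 -> 1, 4 -> 2, 8 -> 3, ...  */
  int log2 = 0;
  while ((1 << log2) < size_bytes)
    log2++;

  int groups = log2 + 1;
  return groups > 4 ? 4 : groups;
}
```

Under this model a 2-byte (HImode-sized) object gets two groups, so the groups covering bits 16..63 are never marked live for it.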

This has been bootstrapped and regression tested on x86_64.  It's also been
tested on the crosses in my tester without regressions.

Pushing to the trunk.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (group_limit): New function.
(make_reg_live): Likewise.
(ext_dce_process_sets): Use new functions.
(ext_dce_process_uses): Likewise.
(ext_dce_init): Likewise.

(cherry picked from commit 88d16194d0c8a6bdc2896c8944bfbf3e6038c9d2)

Diff:
---
 gcc/ext-dce.cc | 64 +++---
 1 file changed, 57 insertions(+), 7 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 59bcc4572d57..d1a31e1819e2 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -48,6 +48,57 @@ static bool modify;
bit 16..31
bit 32..BITS_PER_WORD-1  */
 
+/* For the given REG, return the number of bit groups implied by the
+   size of the REG's mode, up to a maximum of 4 (number of bit groups
+   tracked by this pass).
+
+   For partial integer and variable sized modes also return 4.  This
+   could possibly be refined for something like PSI mode, but it
+   does not seem worth the effort.  */
+
+static int
+group_limit (const_rtx reg)
+{
+  machine_mode mode = GET_MODE (reg);
+
+  if (!GET_MODE_BITSIZE (mode).is_constant ())
+return 4;
+
+  int size = GET_MODE_SIZE (mode).to_constant ();
+
+  size = exact_log2 (size);
+
+  if (size < 0)
+return 4;
+
+  size++;
+  return (size > 4 ? 4 : size);
+}
+
+/* Make all bit groups live for REGNO in bitmap BMAP.  For hard regs,
+   we assume all groups are live.  For a pseudo we consider the size
+   of the pseudo to avoid creating unnecessarily live chunks of data.  */
+
+static void
+make_reg_live (bitmap bmap, int regno)
+{
+  int limit;
+
+  /* For pseudos we can use the mode to limit how many bit groups
+ are marked as live since a pseudo only has one mode.  Hard
+ registers have to be handled more conservatively.  */
+  if (regno > FIRST_PSEUDO_REGISTER)
+{
+  rtx reg = regno_reg_rtx[regno];
+  limit = group_limit (reg);
+}
+  else
+limit = 4;
+
+  for (int i = 0; i < limit; i++)
+bitmap_set_bit (bmap, regno * 4 + i);
+}
+
 /* Note this pass could be used to narrow memory loads too.  It's
not clear if that's profitable or not in general.  */
 
@@ -196,7 +247,8 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
 
  /* Transfer all the LIVENOW bits for X into LIVE_TMP.  */
  HOST_WIDE_INT rn = REGNO (SUBREG_REG (x));
- for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + 4; i++)
+ int limit = group_limit (SUBREG_REG (x));
+ for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + limit; i++)
if (bitmap_bit_p (livenow, i))
  bitmap_set_bit (live_tmp, i);
 
@@ -260,7 +312,8 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
  /* Transfer the appropriate bits from LIVENOW into
 LIVE_TMP.  */
  HOST_WIDE_INT rn = REGNO (x);
- for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + 4; i++)
+ int limit = group_limit (x);
+ for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + limit; i++)
if (bitmap_bit_p (livenow, i))
  bitmap_set_bit (live_tmp, i);
 
@@ -692,7 +745,7 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap live_tmp)
   /* If we have a register reference that is not otherwise handled,
 just assume all the chunks are live.  */
   else if (REG_P (x))
-   bitmap_set_range (livenow, REGNO (x) * 4, 4);
+   bitmap_set_range (livenow, REGNO (x) * 4, group_limit (x));
 }
 }
 
@@ -819,10 +872,7 @@ ext_dce_init (void)
   unsigned i;
   bitmap_iterator bi;
   EXECUTE_IF_SET_IN_BITMAP (refs, 0, i, bi)
-{
-  for (int j = 0; j < 4; j++)
-   bitmap_set_bit (&livein[EXIT_BLOCK], i * 4 + j);
-}
+make_reg_live (&livein[EXIT_BLOCK], i);
 
   livenow = BITMAP_ALLOC (NULL);
   all_blocks = BITMAP_ALLOC (NULL);


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] RISC-V: Implement the .SAT_TRUNC for scalar

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:0ee41b02916d9c4c68ae6dbaa364cefcd79bb7da

commit 0ee41b02916d9c4c68ae6dbaa364cefcd79bb7da
Author: Pan Li 
Date:   Mon Jul 1 16:36:35 2024 +0800

RISC-V: Implement the .SAT_TRUNC for scalar

This patch implements the simple .SAT_TRUNC pattern
in the riscv backend, i.e.:

Form 1:
  #define DEF_SAT_U_TRUC_FMT_1(NT, WT)       \
  NT __attribute__((noinline))               \
  sat_u_truc_##WT##_to_##NT##_fmt_1 (WT x)   \
  {                                          \
    bool overflow = x > (WT)(NT)(-1);        \
    return ((NT)x) | (NT)-overflow;          \
  }

DEF_SAT_U_TRUC_FMT_1(uint32_t, uint64_t)
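
For readers who want to try the idiom outside the testsuite, here is a minimal sketch of Form 1 written out by hand for one type pair (uint16_t to uint8_t).  The function name is an assumption chosen for this example; the body follows the macro above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert (sizeof (uint8_t) < sizeof (uint16_t),
	       "the narrow type must be smaller than the wide type");

/* Unsigned saturating truncation: values that do not fit in the narrow
   type are clamped to its maximum instead of wrapping.  */
uint8_t
sat_u_trunc_u16_to_u8 (uint16_t x)
{
  /* (uint16_t)(uint8_t)(-1) is the narrow type's maximum, 255.  */
  bool overflow = x > (uint16_t) (uint8_t) (-1);

  /* -overflow is 0 or -1; cast to uint8_t that is 0x00 or 0xff, so the
     OR either leaves the truncated value alone or forces all ones.  */
  return ((uint8_t) x) | (uint8_t) -overflow;
}
```

Inputs up to 255 come back unchanged, and anything larger saturates to 255.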

Before this patch:
__attribute__((noinline))
uint8_t sat_u_truc_uint16_t_to_uint8_t_fmt_1 (uint16_t x)
{
  _Bool overflow;
  unsigned char _1;
  unsigned char _2;
  unsigned char _3;
  uint8_t _6;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  overflow_5 = x_4(D) > 255;
  _1 = (unsigned char) x_4(D);
  _2 = (unsigned char) overflow_5;
  _3 = -_2;
  _6 = _1 | _3;
  return _6;
;;succ:   EXIT

}

After this patch:
__attribute__((noinline))
uint8_t sat_u_truc_uint16_t_to_uint8_t_fmt_1 (uint16_t x)
{
  uint8_t _6;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _6 = .SAT_TRUNC (x_4(D)); [tail call]
  return _6;
;;succ:   EXIT

}

The test suites below pass with this patch:
1. The rv64gcv full regression test.
2. The rv64gcv build with glibc.

gcc/ChangeLog:

* config/riscv/iterators.md (ANYI_DOUBLE_TRUNC): Add new iterator
for int double truncation.
(ANYI_DOUBLE_TRUNCATED): Add new attr for int double truncation.
(anyi_double_truncated): Ditto but for lowercase.
* config/riscv/riscv-protos.h (riscv_expand_ustrunc): Add new
func decl for expanding ustrunc
* config/riscv/riscv.cc (riscv_expand_ustrunc): Add new func
impl to expand ustrunc.
* config/riscv/riscv.md (ustrunc2): Impl
the new pattern ustrunc2 for int.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macro.
* gcc.target/riscv/sat_arith_data.h: New test.
* gcc.target/riscv/sat_u_trunc-1.c: New test.
* gcc.target/riscv/sat_u_trunc-2.c: New test.
* gcc.target/riscv/sat_u_trunc-3.c: New test.
* gcc.target/riscv/sat_u_trunc-run-1.c: New test.
* gcc.target/riscv/sat_u_trunc-run-2.c: New test.
* gcc.target/riscv/sat_u_trunc-run-3.c: New test.
* gcc.target/riscv/scalar_sat_unary.h: New test.

Signed-off-by: Pan Li 
(cherry picked from commit 5d2115b850df63b0ecdf56efb720ad848e7afe21)

Diff:
---
 gcc/config/riscv/iterators.md  | 10 
 gcc/config/riscv/riscv-protos.h|  1 +
 gcc/config/riscv/riscv.cc  | 40 
 gcc/config/riscv/riscv.md  | 10 
 gcc/testsuite/gcc.target/riscv/sat_arith.h | 16 +++
 gcc/testsuite/gcc.target/riscv/sat_arith_data.h| 56 ++
 gcc/testsuite/gcc.target/riscv/sat_u_trunc-1.c | 17 +++
 gcc/testsuite/gcc.target/riscv/sat_u_trunc-2.c | 20 
 gcc/testsuite/gcc.target/riscv/sat_u_trunc-3.c | 19 
 gcc/testsuite/gcc.target/riscv/sat_u_trunc-run-1.c | 16 +++
 gcc/testsuite/gcc.target/riscv/sat_u_trunc-run-2.c | 16 +++
 gcc/testsuite/gcc.target/riscv/sat_u_trunc-run-3.c | 16 +++
 gcc/testsuite/gcc.target/riscv/scalar_sat_unary.h  | 22 +
 13 files changed, 259 insertions(+)

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index d61ed53a8b1b..734da041f0cb 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -65,6 +65,16 @@
 ;; Iterator for hardware-supported integer modes.
 (define_mode_iterator ANYI [QI HI SI (DI "TARGET_64BIT")])
 
+(define_mode_iterator ANYI_DOUBLE_TRUNC [HI SI (DI "TARGET_64BIT")])
+
+(define_mode_attr ANYI_DOUBLE_TRUNCATED [
+  (HI "QI") (SI "HI") (DI "SI")
+])
+
+(define_mode_attr anyi_double_truncated [
+  (HI "qi") (SI "hi") (DI "si")
+])
+
 ;; Iterator for hardware-supported floating-point modes.
 (define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT || TARGET_ZFINX")
(DF "TARGET_DOUBLE_FLOAT || TARGET_ZDINX")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 7c0ea1b445b1..ce5e38d3dbbf 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -135,6 +135,7 @@ riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
 extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
 extern void riscv_expand_usadd (rtx, rtx, rtx

[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [4/n][PR rtl-optimization/115877] Correct SUBREG handling in a destination

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:d4f5e86b8cf0666aefd9c1f10188274af147df46

commit d4f5e86b8cf0666aefd9c1f10188274af147df46
Author: Jeff Law 
Date:   Mon Jul 22 10:11:57 2024 -0600

[4/n][PR rtl-optimization/115877] Correct SUBREG handling in a destination

If we encounter something during SET handling that we can not handle, the safe
thing to do is to ignore the destination and continue the loop.

We've actually been trying to do slightly better with SUBREG destinations by
iterating into SUBREG_REG.  It turns out that wasn't working as expected.

The problem is once we "continue" we lose the state that we were inside the SET
and thus we ended up ignoring the destination completely rather than tracking
the SUBREG_REG object.  This could be fixed by restarting SET processing, but I
just don't see this as all that important to handle.  So rather than leave the
code as-is, not working per design, I'm twiddling it to use the common 'skip
subrtxs and continue' idiom used elsewhere.

This is a prerequisite for another patch in this series.  Specifically I have a
patch that explicitly tracks if we skipped a destination rather than trying to
infer it from the state of LIVE_TMP.  So this is probably NFC right now, but
that's a short-lived NFC.

Bootstrapped and regression tested on x86 and also run as part of a larger kit
on the crosses in my tester.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (ext_dce_process_sets): More correctly handle SUBREG
destinations.

(cherry picked from commit ab7c0aed52054976d0b5e12c52e82239d4277b98)

Diff:
---
 gcc/ext-dce.cc | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index d1a31e1819e2..7f0a6d725f1e 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -270,11 +270,18 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
= GET_MODE_MASK (GET_MODE_INNER (GET_MODE (x)));
  if (SUBREG_P (x))
{
- /* If we have a SUBREG that is too wide, just continue the loop
-and let the iterator go down into SUBREG_REG.  */
+ /* If we have a SUBREG destination that is too wide, just
+skip the destination rather than continuing this iterator.
+While continuing would be better, we'd need to strip the
+subreg and restart within the SET processing rather than
+the top of the loop which just complicates the flow even
+more.  */
  if (!is_a  (GET_MODE (SUBREG_REG (x)), 
&outer_mode)
  || GET_MODE_BITSIZE (outer_mode) > 64)
-   continue;
+   {
+ iter.skip_subrtxes ();
+ continue;
+   }
 
  /* We can safely strip a paradoxical subreg.  The inner mode will
 be narrower than the outer mode.  We'll clear fewer bits in


[gcc(refs/vendors/riscv/heads/gcc-14-with-riscv-opts)] [5/n][PR rtl-optimization/115877] Fix handling of input/output operands

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:99d15ac27522519377f7019cf6e5cb67b1497458

commit 99d15ac27522519377f7019cf6e5cb67b1497458
Author: Jeff Law 
Date:   Mon Jul 22 21:48:28 2024 -0600

[5/n][PR rtl-optimization/115877] Fix handling of input/output operands

So in this patch we're correcting a failure to mark objects live in scenarios
like

(set (dest) (plus (dest) (src)))

When handling set pseudos, we transfer the liveness information from LIVENOW
into LIVE_TMP.  LIVE_TMP is subsequently used to narrow what bit groups are
live for the inputs.

The first time we process the block we may not have DEST in the LIVENOW set (it
may be live across the loop, but not live after the loop).  Thus we can totally
miss making certain objects live, resulting in incorrect code.

The fix is pretty simple.  If LIVE_TMP is empty, then we should go ahead and
mark all the bit groups for the set object in LIVE_TMP.  This also removes an
invalid gcc_assert on the state of the liveness bitmaps.
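
The fallback described above can be modelled with plain bit masks.  The sketch below is a toy and not the bitmap code in ext-dce.cc: it packs four bit groups per register into one 64-bit word (so it only handles register numbers 0..15), ignores the per-mode group limit, and uses a hypothetical function name.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (not GCC code): compute LIVE_TMP for a register that is set
   by an insn.  LIVENOW holds four bit-group flags per register.  If none
   of the register's groups are live in LIVENOW (e.g. the first time a
   loop body is processed for an in-out operand), fall back to treating
   all of its groups as live.  */
uint64_t
toy_live_tmp_for_set (uint64_t livenow, int regno)
{
  assert (regno >= 0 && regno < 16);

  uint64_t reg_groups = (uint64_t) 0xf << (4 * regno);

  /* Transfer the register's live groups from LIVENOW.  */
  uint64_t live_tmp = livenow & reg_groups;

  /* Nothing known to be live: be conservative.  */
  if (live_tmp == 0)
    live_tmp = reg_groups;

  return live_tmp;
}
```

When some groups are already live they are kept as-is, so the narrowing still happens in the common case; only the empty case falls back to the conservative answer.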

This showed up on pru, rl78 and/or msp430 in the testsuite.  So no new test.

Bootstrapped and regression tested on x86_64 and also run through my tester on
all the cross platforms.

Pushing to the trunk.

PR rtl-optimization/115877
gcc/
* ext-dce.cc (ext_dce_process_sets): Reasonably handle input/output
operands.
(ext_dce_rd_transfer_n): Drop bogus assertion.

(cherry picked from commit ad642d2c950657539777ea436b787e7fff4ec09e)

Diff:
---
 gcc/ext-dce.cc | 31 ++-
 1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 7f0a6d725f1e..43d2447acb5d 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -245,13 +245,25 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
  continue;
}
 
- /* Transfer all the LIVENOW bits for X into LIVE_TMP.  */
+ /* LIVE_TMP contains the set groups that are live-out and set in
+this insn.  It is used to narrow the groups live-in for the
+inputs of this insn.
+
+The simple thing to do is mark all the groups as live, but
+that will significantly inhibit optimization.
+
+We also need to be careful in the case where we have an in-out
+operand.  If we're not careful we'd clear LIVE_TMP
+incorrectly.  */
  HOST_WIDE_INT rn = REGNO (SUBREG_REG (x));
  int limit = group_limit (SUBREG_REG (x));
  for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + limit; i++)
if (bitmap_bit_p (livenow, i))
  bitmap_set_bit (live_tmp, i);
 
+ if (bitmap_empty_p (live_tmp))
+   make_reg_live (live_tmp, rn);
+
  /* The mode of the SUBREG tells us how many bits we can
 clear.  */
  machine_mode mode = GET_MODE (x);
@@ -316,14 +328,25 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
  /* Now handle the actual object that was changed.  */
  if (REG_P (x))
{
- /* Transfer the appropriate bits from LIVENOW into
-LIVE_TMP.  */
+ /* LIVE_TMP contains the set groups that are live-out and set in
+this insn.  It is used to narrow the groups live-in for the
+inputs of this insn.
+
+The simple thing to do is mark all the groups as live, but
+that will significantly inhibit optimization.
+
+We also need to be careful in the case where we have an in-out
+operand.  If we're not careful we'd clear LIVE_TMP
+incorrectly.  */
  HOST_WIDE_INT rn = REGNO (x);
  int limit = group_limit (x);
  for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + limit; i++)
if (bitmap_bit_p (livenow, i))
  bitmap_set_bit (live_tmp, i);
 
+ if (bitmap_empty_p (live_tmp))
+   make_reg_live (live_tmp, rn);
+
  /* Now clear the bits known written by this instruction.
 Note that BIT need not be a power of two, consider a
 ZERO_EXTRACT destination.  */
@@ -935,8 +958,6 @@ ext_dce_rd_transfer_n (int bb_index)
  the generic dataflow code that something changed.  */
   if (!bitmap_equal_p (&livein[bb_index], livenow))
 {
-  gcc_assert (!bitmap_intersect_compl_p (&livein[bb_index], livenow));
-
   bitmap_copy (&livein[bb_index], livenow);
   return true;
 }


[gcc r15-2240] [PR rtl-optimization/115877][6/n] Add testcase from pr115877

2024-07-23 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:f9a60d575f02822852aa22513c636be38f9c63ea

commit r15-2240-gf9a60d575f02822852aa22513c636be38f9c63ea
Author: Jeff Law 
Date:   Tue Jul 23 19:11:04 2024 -0600

[PR rtl-optimization/115877][6/n] Add testcase from pr115877

This just adds the testcase from pr115877.  It's working now on the trunk.  I'm
not done with cleanups/bugfixing, but there's no reason to not have the
testcase installed at this point.

PR rtl-optimization/115877
gcc/testsuite
* gcc.dg/torture/pr115877.c: New test.

Diff:
---
 gcc/testsuite/gcc.dg/torture/pr115877.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/torture/pr115877.c b/gcc/testsuite/gcc.dg/torture/pr115877.c
new file mode 100644
index 000000000000..432b1280b177
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr115877.c
@@ -0,0 +1,20 @@
+/* { dg-do run { target int128 } } */
+
+char a[16];
+unsigned short u;
+
+__int128
+foo (int i)
+{
+  i -= (unsigned short) ~u;
+  a[(unsigned short) i] = 1;
+  return i;
+}
+
+int
+main ()
+{
+  __int128 x = foo (0);
+  if (x != -0xffff)
+__builtin_abort();
+}


[gcc r15-2275] [rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce

2024-07-24 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:679086172b84be18c55fdbb9cda7e97806e7c083

commit r15-2275-g679086172b84be18c55fdbb9cda7e97806e7c083
Author: Jeff Law 
Date:   Wed Jul 24 11:16:26 2024 -0600

[rtl-optimization/116037] Explicitly track if a destination was skipped in ext-dce

So this has been in the hopper since the first bugs were reported against
ext-dce.  I'd been holding off committing as I was finding other issues in
terms of correctness of live computations.  There are still problems in that
space, but I think it's time to push this chunk forward.  I'm marking it as
116037, but it may impact other bugs.

This patch starts explicitly tracking if set processing skipped a destination,
which can happen for wide modes (TI+), vectors, certain subregs, etc.  This is
computed during ext_dce_process_sets.

During use processing we use that flag to determine reliably if we need to
make the inputs fully live and to avoid even trying to eliminate an extension
if we skipped output processing.
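
As a rough illustration of the flag described above, here is a toy model in C.
It is not the code in ext-dce.cc: the `toy_*` names and the `struct toy_dest`
type are hypothetical, and the only detail taken from the patch is the shape of
the final check (`!skipped_dest && (dst_mask & ~src_mask) == 0`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one destination of an insn; not a GCC type.  */
struct toy_dest
{
  unsigned bits;   /* Width of the destination mode in bits.  */
  bool is_vector;  /* Vector destinations are not tracked precisely.  */
};

/* Toy analogue of set processing: report whether any destination was
   skipped because it cannot be tracked precisely.  */
static bool
toy_process_sets (const struct toy_dest *dests, int n)
{
  bool skipped_dest = false;
  for (int i = 0; i < n; i++)
    if (dests[i].bits > 64 || dests[i].is_vector)
      skipped_dest = true;
  return skipped_dest;
}

/* Toy analogue of use processing: if a destination was skipped, every
   input bit is treated as live; otherwise only the requested bits are.  */
static uint64_t
toy_live_input_mask (bool skipped_dest, uint64_t dst_mask)
{
  return skipped_dest ? UINT64_MAX : dst_mask;
}

/* An extension is a removal candidate only when nothing was skipped and
   no live bit lies outside the source mode.  */
static bool
toy_may_remove_extension (bool skipped_dest, uint64_t dst_mask,
			  uint64_t src_mask)
{
  return !skipped_dest && (dst_mask & ~src_mask) == 0;
}

static void
toy_check (void)
{
  struct toy_dest narrow = { 32, false };
  struct toy_dest wide = { 128, false };
  assert (!toy_process_sets (&narrow, 1));
  assert (toy_process_sets (&wide, 1));
  assert (toy_live_input_mask (true, 0xff) == UINT64_MAX);
  assert (toy_may_remove_extension (false, 0xffff, 0xffffffffu));
  assert (!toy_may_remove_extension (true, 0xffff, 0xffffffffu));
}
```

The point of returning the flag explicitly is that the conservative answer
("everything is live, do not optimize") no longer depends on inferring a skip
from the state of the temporary live bitmap.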

While testing this I found that a recent change to fix cases where we had
two subreg input operands mucked up the code to make things like a
shift/rotate count fully live.  So that goof has been fixed.

Bootstrapped and regression tested on x86.  Most, but not all, of these
changes have also been tested on the crosses.  Pushing to the trunk.

I'm not including it in this patch but I'm poking at converting this code to
use note_uses/note_stores to make it more maintainable.  The SUBREG and
STRICT_LOW_PART handling of note_stores is problematical, but I think it's
solvable.  I haven't tried a conversion to note_uses yet.

PR rtl-optimization/116037
gcc/
* ext-dce.cc (ext_dce_process_sets): Note if we ever skip a dest
and return that info explicitly.
(ext_dce_process_uses): If a set was skipped, then consider all bits
in every input as live.  Do not try to optimize away an extension if
we skipped processing a destination in the same insn.  Restore code
to make shift/rotate count fully live.
(ext_dce_process_bb): Handle API changes for ext_dce_process_sets.

gcc/testsuite/
* gcc.dg/torture/pr116037.c: New test

Diff:
---
 gcc/ext-dce.cc  | 42 ++---
 gcc/testsuite/gcc.dg/torture/pr116037.c | 36 
 2 files changed, 69 insertions(+), 9 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index c56dfb505b88..c94d1fc34145 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -181,9 +181,11 @@ safe_for_live_propagation (rtx_code code)
within an object) are set by INSN, the more aggressive the
optimization phase during use handling will be.  */
 
-static void
+static bool
 ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap live_tmp)
 {
+  bool skipped_dest = false;
+
   subrtx_iterator::array_type array;
   FOR_EACH_SUBRTX (iter, array, obj, NONCONST)
 {
@@ -210,6 +212,7 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap 
live_tmp)
  /* Skip the subrtxs of this destination.  There is
 little value in iterating into the subobjects, so
 just skip them for a bit of efficiency.  */
+ skipped_dest = true;
  iter.skip_subrtxes ();
  continue;
}
@@ -241,6 +244,7 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap 
live_tmp)
  /* Skip the subrtxs of the STRICT_LOW_PART.  We can't
 process them because it'll set objects as no longer
 live when they are in fact still live.  */
+ skipped_dest = true;
  iter.skip_subrtxes ();
  continue;
}
@@ -291,6 +295,7 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap 
live_tmp)
  if (!is_a  (GET_MODE (SUBREG_REG (x)), 
&outer_mode)
  || GET_MODE_BITSIZE (outer_mode) > 64)
{
+ skipped_dest = true;
  iter.skip_subrtxes ();
  continue;
}
@@ -318,6 +323,7 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap 
live_tmp)
 remain the same.  Thus we can not continue here, we must
 either figure out what part of the destination is modified
 or skip the sub-rtxs.  */
+ skipped_dest = true;
  iter.skip_subrtxes ();
  continue;
}
@@ -370,9 +376,11 @@ ext_dce_process_sets (rtx_insn *insn, rtx obj, bitmap 
live_tmp)
   else if (GET_CODE (x) == COND_EXEC)
{
  /* This isn't ideal, but may not be so bad in practice.  */
+ skipped_dest = true;
  iter.skip_subrtxes ();
}
 }
+  return skipped_dest;
 }
 
 /* INSN has a sign/zero exten

[gcc r15-2316] [committed] Trivial testcase adjustment

2024-07-25 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:2dd45655db47362153756261881413b368582597

commit r15-2316-g2dd45655db47362153756261881413b368582597
Author: Jeff Law 
Date:   Thu Jul 25 08:42:04 2024 -0600

[committed] Trivial testcase adjustment

I made pr116037.c dependent on int32 just based on the constants used
without noting the int128 vector type.  Naturally on targets that don't
support int128 the test fails.  Fixed by changing the target selector from
int32 to int128.

Pushed to the trunk.

gcc/testsuite
* gcc.dg/torture/pr116037.c: Fix target selector.

Diff:
---
 gcc/testsuite/gcc.dg/torture/pr116037.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr116037.c 
b/gcc/testsuite/gcc.dg/torture/pr116037.c
index cb34ba4e5d46..86ab50de4b2f 100644
--- a/gcc/testsuite/gcc.dg/torture/pr116037.c
+++ b/gcc/testsuite/gcc.dg/torture/pr116037.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-require-effective-target int32 } */
+/* { dg-require-effective-target int128 } */
 /* { dg-additional-options "-Wno-psabi" } */
 
 typedef __attribute__((__vector_size__ (64))) unsigned char VC;


[gcc r15-2321] [PR rtl-optimization/116039] Fix life computation for promoted subregs

2024-07-25 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:34fb0feca71f763b2fbe832548749666d34a4a76

commit r15-2321-g34fb0feca71f763b2fbe832548749666d34a4a76
Author: Jeff Law 
Date:   Thu Jul 25 12:32:28 2024 -0600

[PR rtl-optimization/116039] Fix life computation for promoted subregs

So this turned out to be a neat little test and while the fuzzer found it on
RISC-V, I wouldn't be surprised if the underlying issue is also the root
cause of the loongarch issue with ext-dce.

The key issue is that if we have something like

(set (dest) (any_extend (subreg (source))))

If the subreg object is marked with SUBREG_PROMOTED and the sign/unsigned
state matches the any_extend opcode, then combine (and I guess anything using
simplify-rtx) may simplify that to

(set (dest) (source))

That implies that bits outside the mode of the subreg are actually live and
valid.  This needs to be accounted for during liveness computation.
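
A small C model may make the liveness consequence concrete.  It is only a
sketch: the function names are made up for illustration, registers are modelled
as 64-bit bit patterns, and the subreg is assumed to be the low 32 bits with a
sign-promoted state.

```c
#include <assert.h>
#include <stdint.h>

/* Model of (sign_extend:DI (subreg:SI (reg:DI src) 0)): keep the low 32
   bits of SRC and replicate bit 31 into the upper half.  */
static uint64_t
sign_extend_low32 (uint64_t src)
{
  uint64_t low = src & UINT64_C (0xffffffff);
  return (low & UINT64_C (0x80000000))
	 ? (low | UINT64_C (0xffffffff00000000)) : low;
}

/* Model of the simplified form (set (dest) (source)): every bit of SRC,
   including the upper 32, reaches the destination.  */
static uint64_t
plain_copy (uint64_t src)
{
  return src;
}

static void
promoted_subreg_check (void)
{
  /* Properly promoted value: both forms agree, so the simplification is
     valid.  */
  uint64_t promoted = UINT64_C (0xffffffff80000000);
  assert (sign_extend_low32 (promoted) == plain_copy (promoted));

  /* If an earlier transformation had treated the upper bits as dead and
     left them zero, the two forms would disagree.  */
  uint64_t clobbered = UINT64_C (0x0000000080000000);
  assert (sign_extend_low32 (clobbered) != plain_copy (clobbered));
}
```

The second case is why the upper bits have to stay live: once the extension
can be replaced by a plain copy, nothing downstream recomputes them.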

We have to be careful here though. If we're too conservative about setting
additional bits live, then we'll inhibit the desired optimization in the
coremark examples.  To do a good job we need to know the extension opcode.

I'm extremely unhappy with how the use handling works in ext-dce.  It mixes
different conceptual steps and has horribly complex control flow.  It only
handles a subset of the unary/binary opcodes, etc etc.  It's just a damn mess.
It's going to need some more noodling around.

In the meantime this is a bit hacky in that it depends on non-obvious
behavior to know it can get the extension opcode, but I don't want to leave
the trunk in a broken state while I figure out the refactoring problem.

Bootstrapped and regression tested on x86 and tested on the crosses.  
Pushing to the trunk.

PR rtl-optimization/116039
gcc/
* ext-dce.cc (ext_dce_process_uses): Add some comments about concerns
with current code.  Mark additional bit groups as live when we have
an extension of a suitably promoted subreg.

gcc/testsuite
* gcc.dg/torture/pr116039.c: New test.

Diff:
---
 gcc/ext-dce.cc  | 43 -
 gcc/testsuite/gcc.dg/torture/pr116039.c | 20 +++
 2 files changed, 57 insertions(+), 6 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index c94d1fc34145..14f163a01d63 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -667,6 +667,12 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj,
  if (modify && !skipped_dest && (dst_mask & ~src_mask) == 0)
ext_dce_try_optimize_insn (insn, x);
 
+ /* Stripping the extension here just seems wrong on multiple
+levels.  It's source side handling, so it seems like it
+belongs in the loop below.  Stripping here also makes it
+harder than necessary to properly handle live bit groups
+for (ANY_EXTEND (SUBREG)) where the SUBREG has
+SUBREG_PROMOTED state.  */
  dst_mask &= src_mask;
  src = XEXP (src, 0);
  code = GET_CODE (src);
@@ -674,8 +680,8 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj,
 
  /* Optimization is done at this point.  We just want to make
 sure everything that should get marked as live is marked
-from here onward.  */
-
+from here onward.  Shouldn't the backpropagate step happen
+before optimization?  */
  dst_mask = carry_backpropagate (dst_mask, code, src);
 
  /* We will handle the other operand of a binary operator
@@ -688,7 +694,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj,
  /* We're inside a SET and want to process the source operands
 making things live.  Breaking from this loop will cause
 the iterator to work on sub-rtxs, so it is safe to break
-if we see something we don't know how to handle.  */
+if we see something we don't know how to handle.
+
+This code is just hokey as it really just handles trivial
+unary and binary cases.  Otherwise the loop exits and we
+continue iterating on sub-rtxs, but outside the set context.  
*/
  unsigned HOST_WIDE_INT save_mask = dst_mask;
  for (;;)
{
@@ -704,10 +714,26 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj,
y = XEXP (y, 0);
  else if (SUBREG_P (y) && SUBREG_BYTE (y).is_constant ())
{
- /* For anything but (subreg (reg)), break the inner loop
-and process normally (conservatively).  */
- if (!REG_P (SUBREG_REG (y)))
+ /* We really want to k

[gcc r15-2352] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter

2024-07-26 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:6e5aae47e3b910f9af6983f744d7a3e2dcecba1d

commit r15-2352-g6e5aae47e3b910f9af6983f744d7a3e2dcecba1d
Author: Jeff Law 
Date:   Fri Jul 26 17:30:08 2024 -0600

[RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter

A patch introduced a pattern to avoid unnecessary extensions when doing a
min/max operation where one of the values is a 32 bit positive constant.

> (define_insn_and_split "*minmax"
>   [(set (match_operand:DI 0 "register_operand" "=r")
> (sign_extend:DI
>   (subreg:SI
> (bitmanip_minmax:DI (zero_extend:DI (match_operand:SI 1 
"register_operand" "r"))
> (match_operand:DI 2 
"immediate_operand" "i"))
>0)))
>(clobber (match_scratch:DI 3 "=&r"))
>(clobber (match_scratch:DI 4 "=&r"))]
>   "TARGET_64BIT && TARGET_ZBB && sext_hwi (INTVAL (operands[2]), 32) >= 0"
>   "#"
>   "&& reload_completed"
>   [(set (match_dup 3) (sign_extend:DI (match_dup 1)))
>(set (match_dup 4) (match_dup 2))
>(set (match_dup 0) (:DI (match_dup 3) (match_dup 4)))]

Lots going on in here.  The key is the nonconstant value is zero extended
from SI to DI in the original RTL and we know the constant value is unchanged
if we were to sign extend it from 32 to 64 bits.

We change the extension of the nonconstant operand from zero to sign
extension.  I'm pretty confident the goal there is to take advantage of the
fact that SI values are kept sign extended and will often be optimized away.

The problem occurs when the nonconstant operand has the SI sign bit set.
As an example:

smax (0x80000000, 0x7)  resulting in 0x80000000

The split RTL will generate
smax (sign_extend (0x80000000), 0x7)

smax (0xffffffff80000000, 0x7) resulting in 0x7

Oops.

We really needed to change the opcode to umax for this transformation to
work.  That's easy enough.  But there are further improvements we can make.
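
The failure and the fix can be checked with a short C model.  This is a sketch
under stated assumptions: registers are 64-bit bit patterns, the helper names
are invented for illustration, and only the max case is shown.

```c
#include <assert.h>
#include <stdint.h>

/* Sign extend the low 32 bits of V to 64 bits.  */
static uint64_t
sext32 (uint64_t v)
{
  uint64_t low = v & UINT64_C (0xffffffff);
  return (low & UINT64_C (0x80000000))
	 ? (low | UINT64_C (0xffffffff00000000)) : low;
}

/* Signed max on 64-bit patterns: flipping the sign bit maps signed order
   onto unsigned order.  */
static uint64_t
smax64 (uint64_t a, uint64_t b)
{
  const uint64_t sign = UINT64_C (1) << 63;
  return ((a ^ sign) > (b ^ sign)) ? a : b;
}

/* Unsigned max on 64-bit patterns.  */
static uint64_t
umax64 (uint64_t a, uint64_t b)
{
  return a > b ? a : b;
}

/* Original RTL: sign_extend (subreg:SI (smax:DI (zero_extend:DI x) c) 0).  */
static uint64_t
original_form (uint32_t x, uint64_t c)
{
  return sext32 (smax64 ((uint64_t) x, c));
}

/* Old split: sign extend X, keep the signed opcode.  */
static uint64_t
split_with_smax (uint32_t x, uint64_t c)
{
  return smax64 (sext32 (x), c);
}

/* Fixed split: sign extend X, switch to the unsigned opcode.  */
static uint64_t
split_with_umax (uint32_t x, uint64_t c)
{
  return umax64 (sext32 (x), c);
}

static void
minmax_check (void)
{
  uint32_t x = UINT32_C (0x80000000);
  assert (original_form (x, 7) == UINT64_C (0xffffffff80000000));
  assert (split_with_smax (x, 7) == 7);                    /* Wrong.  */
  assert (split_with_umax (x, 7) == original_form (x, 7)); /* Agrees.  */
}
```

With the sign bit set, the sign-extended operand is negative as a signed
value but very large as an unsigned one, so only the unsigned comparison
preserves the ordering the zero-extended original had.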

First the pattern is a define_insn_and_split with a post-reload split
condition.  It would be better implemented as a 4->3 define_split so that the
costing model just works.  Second, if operands[1] is a suitably promoted
subreg, then we can elide the sign extension when we generate the split code,
so often it'll be a 4->2 split, again with the cost model working with no
adjustments needed.

Tested on rv32 and rv64 in my tester.  I'll wait for the pre-commit tester
to spin it as well.

PR target/116085
gcc/
* config/riscv/bitmanip.md (minmax extension avoidance splitter):
Rewrite as a simpler define_split.  Adjust the opcode appropriately.
Avoid emitting sign extension if it's clearly not needed.
* config/riscv/iterators.md (minmax_optab): Rename to uminmax_optab
and map everything to unsigned variants.

gcc/testsuite/
* gcc.target/riscv/pr116085.c: New test.

Diff:
---
 gcc/config/riscv/bitmanip.md  | 38 +++
 gcc/config/riscv/iterators.md |  9 
 gcc/testsuite/gcc.target/riscv/pr116085.c | 29 +++
 3 files changed, 58 insertions(+), 18 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 9fc5215d6e35..b19295cd9424 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -549,23 +549,33 @@
 
 ;; Optimize the common case of a SImode min/max against a constant
 ;; that is safe both for sign- and zero-extension.
-(define_insn_and_split "*minmax"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
(sign_extend:DI
  (subreg:SI
-   (bitmanip_minmax:DI (zero_extend:DI (match_operand:SI 1 
"register_operand" "r"))
-   (match_operand:DI 2 
"immediate_operand" "i"))
-  0)))
-   (clobber (match_scratch:DI 3 "=&r"))
-   (clobber (match_scratch:DI 4 "=&r"))]
+   (bitmanip_minmax:DI (zero_extend:DI
+ (match_operand:SI 1 "register_operand"))
+   (match_operand:DI 2 "immediate_operand")) 0)))
+   (clobber (match_operand:DI 3 "register_operand"))
+   (clobber (match_operand:DI 4 "register_operand"))]
   "TARGET_64BIT && TARGET_ZBB && sext_hwi (INTVAL (operands[2]), 32) >= 0"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 3) (sign_extend:DI (match_dup 1)))
-   (set (match_dup 4) (match_dup 2))
-   (set (match_dup 0) (:DI (match_dup 3) (match_dup 4)))]
-  ""
-  [(set_attr "type" "bitmanip")])
+  [(set (match_dup 0) (:DI (match_dup 4) (match_dup 3)))]
+  "
+{
+  /* Load the constant into a register.  */
+  emit_move_insn (operands[3], operands[2]);
+
+  /* If operands[1] is a sign

[gcc r14-9341] [PR target/113001] Fix incorrect operand swapping in conditional move

2024-03-06 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:10cbfcd60f9e5bdbe486e1c0192e0f168d899b77

commit r14-9341-g10cbfcd60f9e5bdbe486e1c0192e0f168d899b77
Author: Jeff Law 
Date:   Wed Mar 6 09:50:44 2024 -0700

[PR target/113001] Fix incorrect operand swapping in conditional move

This bug totally fell off my radar.  Sorry about that.

We have some special casing in the conditional move expander to simplify a
conditional move when comparing a register against zero and that same
register is one of the arms.

Specifically a (eq (reg) (const_int 0)) where reg is also the true arm or
(ne (reg) (const_int 0)) where reg is the false arm need not use the fully
generalized conditional move, thus saving an instruction for those cases.

In the NE case we swapped the operands, but didn't swap the condition, which
led to the ICE due to an unrecognized pattern.  The backend actually has
distinct patterns for those two cases.  So swapping the operands is neither
needed nor advisable.
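
To see why both special cases collapse to one conditional-zero operation, and
why swapping the arms alone is not a valid rewrite, here is a small C model.
The function names are illustrative only; `czero_eqz` follows the Zicond
definition (result is zero when the condition register is zero, otherwise the
value register), which is an assumption about which instruction these cases
map to.

```c
#include <assert.h>

/* Zicond czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1.  */
static long
czero_eqz (long rs1, long rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* (eq reg 0) with reg as the true arm: reg == 0 ? reg : alt.  */
static long
eq_case (long reg, long alt)
{
  return reg == 0 ? reg : alt;
}

/* (ne reg 0) with reg as the false arm: reg != 0 ? cons : reg.  */
static long
ne_case (long reg, long cons)
{
  return reg != 0 ? cons : reg;
}

/* The NE case with its arms swapped but the condition left alone.  */
static long
ne_case_arms_swapped (long reg, long cons)
{
  return reg != 0 ? reg : cons;
}

static void
cmov_check (void)
{
  for (long reg = -1; reg <= 1; reg++)
    {
      /* Both special cases are a single conditional zero of the other arm.  */
      assert (eq_case (reg, 9) == czero_eqz (9, reg));
      assert (ne_case (reg, 9) == czero_eqz (9, reg));
    }
  /* Swapping only the arms computes something else.  */
  assert (ne_case (3, 9) == 9);
  assert (ne_case_arms_swapped (3, 9) == 3);
}
```

Since the NE form already has the register in the arm where it evaluates to
zero, it needs no rearranging to be matched directly.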

Regression tested on rv64gc and verified the new tests pass.

Pushing to the trunk.

PR target/113001
PR target/112871
gcc/
* config/riscv/riscv.cc (expand_conditional_move): Do not swap
operands when the comparison operand is the same as the false
arm for a NE test.

gcc/testsuite
* gcc.target/riscv/zicond-ice-3.c: New test.
* gcc.target/riscv/zicond-ice-4.c: New test.

Diff:
---
 gcc/config/riscv/riscv.cc |  2 --
 gcc/testsuite/gcc.target/riscv/zicond-ice-3.c | 15 +++
 gcc/testsuite/gcc.target/riscv/zicond-ice-4.c | 22 ++
 3 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 691d967de29..680c4a728e9 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4633,8 +4633,6 @@ riscv_expand_conditional_move (rtx dest, rtx op, rtx 
cons, rtx alt)
   || (code == NE && rtx_equal_p (alt, op0)))
{
  rtx cond = gen_rtx_fmt_ee (code, GET_MODE (op0), op0, op1);
- if (!rtx_equal_p (cons, op0))
-   std::swap (alt, cons);
  alt = force_reg (mode, alt);
  emit_insn (gen_rtx_SET (dest,
  gen_rtx_IF_THEN_ELSE (mode, cond,
diff --git a/gcc/testsuite/gcc.target/riscv/zicond-ice-3.c 
b/gcc/testsuite/gcc.target/riscv/zicond-ice-3.c
new file mode 100644
index 000..650986825ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zicond-ice-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64d" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zicond -mabi=ilp32d" { target { rv32 } } } */
+
+long a, b;
+int c, d;
+void e(long *f) {
+  (b = *f) && --b;
+  for (; c;)
+;
+}
+void g() {
+  for (; d; d--)
+e(&a);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zicond-ice-4.c 
b/gcc/testsuite/gcc.target/riscv/zicond-ice-4.c
new file mode 100644
index 000..2be02c78a08
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zicond-ice-4.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64d" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zicond -mabi=ilp32d" { target { rv32 } } } */
+
+short a, c;
+int b, d, i;
+volatile char e;
+static int f[] = {1, 1};
+long g;
+int volatile h;
+short(j)() { return b ? a : 0; }
+void k() {
+l:
+  h;
+  g = 0;
+  for (; g <= 2; g++) {
+d | ((i || j() & (0 == f[g])) ^ i) && e;
+if (c)
+  goto l;
+  }
+}
+


[gcc r14-9415] [committed] [PR target/111362] Fix compare-debug issue with mode switching

2024-03-09 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:50531b6d400945793a1d549e6ee941d989319d42

commit r14-9415-g50531b6d400945793a1d549e6ee941d989319d42
Author: jlaw 
Date:   Sat Mar 9 19:27:32 2024 -0700

[committed] [PR target/111362] Fix compare-debug issue with mode switching

The issue here is the code we emit for mode-switching can change when -g is
added to the command line.  This is caused by processing debug notes
occurring after a call which is the last real statement in a basic block.

Without -g the CALL_INSN is literally the last insn in the block and the
loop exits.  If mode switching after the call is needed, it'll be handled as
we process outgoing edges.

With -g the loop iterates again and in the processing of the note the
backend signals that a mode switch is necessary.

I pondered fixing this in the target, but the better fix is to ignore the
debug notes in the insn stream.
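
The effect of the filter can be shown with a toy insn stream in C.  None of
this is GCC code: `enum toy_kind` and `toy_count_queries` are hypothetical, and
the count stands in for how often the target's mode-switching hook would be
consulted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical insn kinds for a toy basic block.  */
enum toy_kind { TOY_INSN, TOY_CALL, TOY_DEBUG };

/* Count how many insns would be handed to the target hook.  With
   SKIP_DEBUG set, debug insns are ignored, mirroring the switch from
   INSN_P to NONDEBUG_INSN_P.  */
static int
toy_count_queries (const enum toy_kind *insns, int n, bool skip_debug)
{
  int queries = 0;
  for (int i = 0; i < n; i++)
    {
      if (skip_debug && insns[i] == TOY_DEBUG)
	continue;
      queries++;
    }
  return queries;
}

static void
compare_debug_check (void)
{
  /* Same block compiled without and with -g.  */
  enum toy_kind without_g[] = { TOY_INSN, TOY_CALL };
  enum toy_kind with_g[] = { TOY_INSN, TOY_CALL, TOY_DEBUG };

  /* Old behaviour: the trailing debug insn adds an extra query.  */
  assert (toy_count_queries (without_g, 2, false) == 2);
  assert (toy_count_queries (with_g, 3, false) == 3);

  /* New behaviour: both streams are processed identically.  */
  assert (toy_count_queries (without_g, 2, true)
	  == toy_count_queries (with_g, 3, true));
}
```

Anything that makes a decision per insn has to see the same sequence of
decisions with and without -g, otherwise -fcompare-debug will flag it.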

I did a cursory review of some of the other compare-debug failures, but did
not immediately see others which would likely be fixed by this change.  Sigh.

Anyway, bootstrapped and regression tested on x86.  Regression tested on
rv64 as well.

PR target/111362
gcc/
* mode-switching.cc (optimize_mode_switching): Only process
NONDEBUG insns.

gcc/testsuite

* gcc.target/riscv/compare-debug-1.c: New test.
* gcc.target/riscv/compare-debug-2.c: New test.

Diff:
---
 gcc/mode-switching.cc| 2 +-
 gcc/testsuite/gcc.target/riscv/compare-debug-1.c | 9 +
 gcc/testsuite/gcc.target/riscv/compare-debug-2.c | 3 +++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/gcc/mode-switching.cc b/gcc/mode-switching.cc
index 583929184ce..a145b77397d 100644
--- a/gcc/mode-switching.cc
+++ b/gcc/mode-switching.cc
@@ -959,7 +959,7 @@ optimize_mode_switching (void)
 
  FOR_BB_INSNS (bb, insn)
{
- if (INSN_P (insn))
+ if (NONDEBUG_INSN_P (insn))
{
  int mode = targetm.mode_switching.needed (e, insn, live_now);
  rtx link;
diff --git a/gcc/testsuite/gcc.target/riscv/compare-debug-1.c 
b/gcc/testsuite/gcc.target/riscv/compare-debug-1.c
new file mode 100644
index 000..d65bb287b9a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/compare-debug-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fno-tree-ch --param=max-completely-peel-times=0 
-march=rv64iv -mabi=lp64d -fcompare-debug" } */
+
+
+void
+foo(void) {
+  for (unsigned i = 0; i < sizeof(foo); i++)
+__builtin_printf("%d", i);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/compare-debug-2.c 
b/gcc/testsuite/gcc.target/riscv/compare-debug-2.c
new file mode 100644
index 000..d87758475e4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/compare-debug-2.c
@@ -0,0 +1,3 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fno-tree-ch --param=max-completely-peel-times=0 
-march=rv64iv -mabi=lp64d -fno-dce -fschedule-insns -fcompare-debug" } */
+#include "compare-debug-1.c"


[gcc r14-9416] [committed] [target/102250] Document python requirement for risc-v

2024-03-09 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:7c8f0a79a7e1e42f846ddbca14b98b47ddcfd178

commit r14-9416-g7c8f0a79a7e1e42f846ddbca14b98b47ddcfd178
Author: jlaw 
Date:   Sat Mar 9 20:11:39 2024 -0700

[committed] [target/102250] Document python requirement for risc-v

PR target/102250
gcc/

* doc/install.texi: Document need for python when building
RISC-V compilers.

Diff:
---
 gcc/doc/install.texi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 173233096d1..e3650e0c4f4 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -253,6 +253,11 @@ name of the package depends on your distro) or you must 
build GCC as a
 @option{--disable-multilib}.  Otherwise, you may encounter an error such as
 @samp{fatal error: gnu/stubs-32.h: No such file}
 
+@item Python
+If you configure a RISC-V compiler with the option @option{--with-arch} and
+the specified architecture string is non-canonical, then you will need
+@command{python} installed on the build system.
+
 @item @anchor{GNAT-prerequisite}GNAT
 
 In order to build GNAT, the Ada compiler, you need a working GNAT


[gcc r14-9417] Revert "[committed] Adjust expectations for pr59533-1.c"

2024-03-09 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:6f7d000fcacef31a6947f95021e445c846170f92

commit r14-9417-g6f7d000fcacef31a6947f95021e445c846170f92
Author: jlaw 
Date:   Sat Mar 9 21:33:47 2024 -0700

Revert "[committed] Adjust expectations for pr59533-1.c"

This reverts commit 7e16f819ff413c48702f9087b62eaac39a060a14.

Diff:
---
 gcc/testsuite/gcc.target/sh/pr59533-1.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/sh/pr59533-1.c 
b/gcc/testsuite/gcc.target/sh/pr59533-1.c
index 859b8e2d24c..b0469859df5 100644
--- a/gcc/testsuite/gcc.target/sh/pr59533-1.c
+++ b/gcc/testsuite/gcc.target/sh/pr59533-1.c
@@ -2,15 +2,15 @@
 /* { dg-do compile }  */
 /* { dg-options "-O1" } */
 
-/* { dg-final { scan-assembler-times "shll" 3 } }  */
+/* { dg-final { scan-assembler-times "shll" 1 } }  */
 /* { dg-final { scan-assembler-times "movt" 5 } }  */
 /* { dg-final { scan-assembler-times "rotcl" 1 } }  */
 /* { dg-final { scan-assembler-times "and" 3 } }  */
 /* { dg-final { scan-assembler-times "extu.b" 5 } }  */
 
-/* { dg-final { scan-assembler-times "cmp/pz" 25 { target { ! sh2a } } } }  */
-/* { dg-final { scan-assembler-times "addc" 6 { target { ! sh2a } } } }  */
-/* { dg-final { scan-assembler-times "subc" 14 { target { ! sh2a } } } }  */
+/* { dg-final { scan-assembler-times "cmp/pz" 27 { target { ! sh2a } } } }  */
+/* { dg-final { scan-assembler-times "addc" 4 { target { ! sh2a } } } }  */
+/* { dg-final { scan-assembler-times "subc" 16 { target { ! sh2a } } } }  */
 
 /* { dg-final { scan-assembler-times "cmp/pz" 25 { target { sh2a } } } }  */
 /* { dg-final { scan-assembler-times "addc" 6 { target { sh2a } } } }  */


[gcc r14-9419] [committed] [PR tree-optimization/110199] Simplify MIN/MAX more often

2024-03-10 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:8fe27ed193d60f6cd8b34761858a720c95eabbdb

commit r14-9419-g8fe27ed193d60f6cd8b34761858a720c95eabbdb
Author: jlaw 
Date:   Sun Mar 10 11:58:00 2024 -0600

[committed] [PR tree-optimization/110199] Simplify MIN/MAX more often

So as I mentioned in the BZ, the case of

t = MIN_EXPR (A, B)

where we know something about the relationship between A and B can be
trivially handled by some existing code in DOM.  That existing code would
simplify when A == B.  But by testing GE and LE instead of EQ we can cover
more cases with minimal effort.  When applicable the MIN/MAX turns into a
simple copy.
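
The identity being exploited is simple enough to check directly.  The sketch
below is plain C with made-up helper names; it only demonstrates the
arithmetic fact, not how DOM records or looks up the relationship.

```c
#include <assert.h>

static int
min_int (int a, int b)
{
  return a < b ? a : b;
}

static int
max_int (int a, int b)
{
  return a > b ? a : b;
}

/* On a path where a <= b is known, MIN (a, b) is just a and MAX (a, b) is
   just b; on a path where a >= b is known, the roles flip.  Equality is the
   special case where both hold.  */
static void
check_pair (int a, int b)
{
  if (a <= b)
    {
      assert (min_int (a, b) == a);
      assert (max_int (a, b) == b);
    }
  if (a >= b)
    {
      assert (min_int (a, b) == b);
      assert (max_int (a, b) == a);
    }
}
```

Strict comparisons (a < b, a > b) imply the non-strict ones, which is why the
new tests can mix < and <= in the guard and in the conditional expression.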

I made one other change.  We have other binary operations that we simplify
when we know something about the relationship between the operands.  That
code was not canonicalizing the order of operands when building the
expression to look up in the hash tables to discover that relationship.
Since those paths are only testing for equality, we can trivially reverse
them and not have to worry about changing codes or anything like that.  So
extremely safe and avoids having to come back and fix that code to match the
MIN_EXPR/MAX_EXPR case later.
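
One way to picture the canonicalization is a lookup key that orders its two
operands, so A == B and B == A hit the same table entry.  This is a
hypothetical sketch with integer operand ids; the real code builds tree
expressions and is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lookup key for an equality between two operands, each
   identified by an integer id.  */
struct toy_eq_key
{
  int first;
  int second;
};

/* Build the key with the smaller id first.  Equality is symmetric, so no
   comparison code has to change when the operands are reordered.  */
static struct toy_eq_key
toy_canonical_eq_key (int op0, int op1)
{
  struct toy_eq_key key;
  key.first = op0 < op1 ? op0 : op1;
  key.second = op0 < op1 ? op1 : op0;
  return key;
}

static bool
toy_same_key (struct toy_eq_key a, struct toy_eq_key b)
{
  return a.first == b.first && a.second == b.second;
}

static void
toy_key_check (void)
{
  assert (toy_same_key (toy_canonical_eq_key (3, 7),
			toy_canonical_eq_key (7, 3)));
}
```

For an ordered comparison such as LE the same reordering would also require
flipping the code to GE, which is the extra care the MIN/MAX path needs.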

Bootstrapped on x86 and also tested on the crosses.  I briefly thought there
was an sh regression, but that was actually the recent fwprop changes
twiddling code generation for one test.

PR tree-optimization/110199
gcc/
* tree-ssa-scopedtables.cc
(avail_exprs_stack::simplify_binary_operation): Generalize handling
of MIN_EXPR/MAX_EXPR to allow additional simplifications.
Canonicalize comparison operands for other cases.

gcc/testsuite

* gcc.dg/tree-ssa/minmax-27.c: New test.
* gcc.dg/tree-ssa/minmax-28.c: New test.

Diff:
---
 gcc/testsuite/gcc.dg/tree-ssa/minmax-27.c | 118 ++
 gcc/testsuite/gcc.dg/tree-ssa/minmax-28.c | 117 +
 gcc/tree-ssa-scopedtables.cc  |  53 --
 3 files changed, 282 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-27.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmax-27.c
new file mode 100644
index 000..4b94203b0d0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-27.c
@@ -0,0 +1,118 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom2" } */
+
+
+int min1(int a, int b)
+{
+if (a <= b)
+return a < b ? a : b;
+return 0;
+}
+
+int min2(int a, int b)
+{
+if (a <= b)
+return a > b ? b : a;
+return 0;
+}
+
+int min3(int a, int b)
+{
+if (a < b)
+return a < b ? a : b;
+return 0;
+}
+
+int min4(int a, int b)
+{
+if (a < b)
+return a > b ? b : a;
+return 0;
+}
+
+int min5(int a, int b)
+{
+if (a <= b)
+return a <= b ? a : b;
+return 0;
+}
+
+int min6(int a, int b)
+{
+if (a <= b)
+return a >= b ? b : a;
+return 0;
+}
+
+int min7(int a, int b)
+{
+if (a < b)
+return a <= b ? a : b;
+return 0;
+}
+
+int min8(int a, int b)
+{
+if (b > a)
+return a >= b ? b : a;
+return 0;
+}
+
+int min9(int a, int b)
+{
+if (b >= a)
+return a < b ? a : b;
+return 0;
+}
+
+int min10(int a, int b)
+{
+if (b >= a)
+return a > b ? b : a;
+return 0;
+}
+
+int min11(int a, int b)
+{
+if (b > a)
+return a < b ? a : b;
+return 0;
+}
+
+int min12(int a, int b)
+{
+if (b > a)
+return a > b ? b : a;
+return 0;
+}
+
+int min13(int a, int b)
+{
+if (b >= a)
+return a <= b ? a : b;
+return 0;
+}
+
+int min14(int a, int b)
+{
+if (b >= a)
+return a >= b ? b : a;
+return 0;
+}
+
+int min15(int a, int b)
+{
+if (b > a)
+return a <= b ? a : b;
+return 0;
+}
+
+int min16(int a, int b)
+{
+if (b > a)
+return a >= b ? b : a;
+return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "MIN_EXPR" "dom2" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-28.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmax-28.c
new file mode 100644
index 000..732126d7449
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-28.c
@@ -0,0 +1,117 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom2" } */
+
+int max1(int a, int b)
+{
+if (a <= b)
+return a < b ? b : a;
+return 0;
+}
+
+int max2(int a, int b)
+{
+if (a <= b)
+return a > b ? a : b;
+return 0;
+}
+
+int max3(int a, int b)
+{
+if (a < b)
+return a < b ? b : a;
+return 0;
+}
+
+int max4(int a, int b)
+{
+if (a < b)
+return a > b ? a : b;
+return 0;
+}
+
+int max5(int a, int b)
+{
+if (a <= b)
+return a <= b ? b : a;
+return 0;
+}
+
+int max6(int a, int b)
+{
+if (a <= b)
+return a >= b ? a : b;
+return 0;
+}
+
+int max7(int a, int b)
+{
+if (a < b)
+re

[gcc r14-9531] [PATCH] RISC-V: Add XiangShan Nanhu microarchitecture.

2024-03-18 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:d91a0cee3611f477730a1fc10beff050dfc800ec

commit r14-9531-gd91a0cee3611f477730a1fc10beff050dfc800ec
Author: Chen Jiawei 
Date:   Mon Mar 18 20:54:45 2024 -0600

[PATCH] RISC-V: Add XiangShan Nanhu microarchitecture.

This patch adds the XiangShan Nanhu CPU microarchitecture.
Nanhu is a 6-issue, superscalar, out-of-order processor.
For more details, see: https://xiangshan-doc.readthedocs.io/zh-cn/latest/arch

gcc/ChangeLog:

* config/riscv/riscv-cores.def (RISCV_TUNE): New def.
(RISCV_CORE): Ditto.
* config/riscv/riscv-opts.h (enum riscv_microarchitecture_type): New
option.
* config/riscv/riscv.cc: New def.
* config/riscv/riscv.md: New include.
* config/riscv/xiangshan.md: New file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/mcpu-xiangshan-nanhu.c: New test.

Co-Authored by: Lin Jiawei 

Diff:
---
 gcc/config/riscv/riscv-cores.def   |   6 +
 gcc/config/riscv/riscv-opts.h  |   1 +
 gcc/config/riscv/riscv.cc  |  17 +++
 gcc/config/riscv/riscv.md  |   3 +-
 gcc/config/riscv/xiangshan.md  | 148 +
 .../gcc.target/riscv/mcpu-xiangshan-nanhu.c|  34 +
 6 files changed, 208 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
index 57928bccdc8..2f5efe3be86 100644
--- a/gcc/config/riscv/riscv-cores.def
+++ b/gcc/config/riscv/riscv-cores.def
@@ -40,6 +40,7 @@ RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
 RISCV_TUNE("sifive-p400-series", sifive_p400, sifive_p400_tune_info)
 RISCV_TUNE("sifive-p600-series", sifive_p600, sifive_p600_tune_info)
 RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
+RISCV_TUNE("xiangshan-nanhu", xiangshan, xiangshan_nanhu_tune_info)
 RISCV_TUNE("generic-ooo", generic_ooo, generic_ooo_tune_info)
 RISCV_TUNE("size", generic, optimize_size_tune_info)
 
@@ -90,4 +91,9 @@ RISCV_CORE("thead-c906",  
"rv64imafdc_xtheadba_xtheadbb_xtheadbs_xtheadcmo_"
  "xtheadcondmov_xtheadfmemidx_xtheadmac_"
  "xtheadmemidx_xtheadmempair_xtheadsync",
  "thead-c906")
+
+RISCV_CORE("xiangshan-nanhu",  "rv64imafdc_zba_zbb_zbc_zbs_"
+ "zbkb_zbkc_zbkx_zknd_zkne_zknh_zksed_zksh_"
+ "svinval_zicbom_zicboz",
+ "xiangshan-nanhu")
 #undef RISCV_CORE
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 281dd068c55..9ae86d52a75 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -57,6 +57,7 @@ enum riscv_microarchitecture_type {
   sifive_7,
   sifive_p400,
   sifive_p600,
+  xiangshan,
   generic_ooo
 };
 extern enum riscv_microarchitecture_type riscv_microarchitecture;
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 680c4a728e9..45015addd1f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -498,6 +498,23 @@ static const struct riscv_tune_param thead_c906_tune_info 
= {
   NULL,/* vector cost */
 };
 
+/* Costs to use when optimizing for xiangshan nanhu.  */
+static const struct riscv_tune_param xiangshan_nanhu_tune_info = {
+  {COSTS_N_INSNS (3), COSTS_N_INSNS (3)},  /* fp_add */
+  {COSTS_N_INSNS (3), COSTS_N_INSNS (3)},  /* fp_mul */
+  {COSTS_N_INSNS (10), COSTS_N_INSNS (20)},/* fp_div */
+  {COSTS_N_INSNS (3), COSTS_N_INSNS (3)},  /* int_mul */
+  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},  /* int_div */
+  6,   /* issue_rate */
+  3,   /* branch_cost */
+  3,   /* memory_cost */
+  3,   /* fmv_cost */
+  true,/* 
slow_unaligned_access */
+  false,   /* use_divmod_expansion */
+  RISCV_FUSE_ZEXTW | RISCV_FUSE_ZEXTH,  /* fusible_ops */
+  NULL,/* vector cost */
+};
+
 /* Costs to use when optimizing for a generic ooo profile.  */
 static const struct riscv_tune_param generic_ooo_tune_info = {
   {COSTS_N_INSNS (2), COSTS_N_INSNS (2)},  /* fp_add */
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index b16ed97909c..f433b03885c 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -685,7 +685,7 @@
 ;; Microarchitectures we know how to tune for.
 ;; Keep this in sync with enum riscv_microarchitecture.
 (define_attr "tune"
-  "generic,sifive_7,sifive_p400,sifive_p600,generic_ooo"
+  "generic,sifive_7,sifive_p400,sifive_p600,xiangshan,generic_ooo"
   (const (symbol_ref "

[gcc r14-9532] [PATCH v5 1/1] RISC-V: Add support for XCVbi extension in CV32E40P

2024-03-18 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:9eeca7753670d7bccd82e6ed7e4fe97cabd9a362

commit r14-9532-g9eeca7753670d7bccd82e6ed7e4fe97cabd9a362
Author: Mary Bennett 
Date:   Mon Mar 18 21:32:56 2024 -0600

[PATCH v5 1/1] RISC-V: Add support for XCVbi extension in CV32E40P

Spec: 
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
   Mary Bennett 
   Nandni Jamnadas 
   Pietra Ferreira 
   Charlie Keaney
   Jessica Mills
   Craig Blackmore 
   Simon Cook 
   Jeremy Bennett 
   Helene Chelin 

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Create XCVbi extension
support.
* config/riscv/riscv.opt: Likewise.
* config/riscv/corev.md: Implement cv_branch pattern
for cv.beqimm and cv.bneimm.
* config/riscv/riscv.md: Add CORE-V branch immediate to RISC-V
branch instruction pattern.
* config/riscv/constraints.md: Implement constraints
cv_bi_s5 - signed 5-bit immediate.
* config/riscv/predicates.md: Implement predicate
const_int5s_operand - signed 5 bit immediate.
* doc/sourcebuild.texi: Add XCVbi documentation.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-bi-beqimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-beqimm-compile-2.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-2.c: New test.
* lib/target-supports.exp: Add proc for XCVbi.
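
For illustration only, here is a minimal C sketch of the operand range that the new constraint accepts. The range test mirrors IN_RANGE (ival, -16, 15) from the CV_bi_sign5 constraint in the diff; the function names are hypothetical and are not part of the patch, and whether a given comparison actually becomes cv.beqimm or cv.bneimm depends on the target options and on what the optimizers do.

```c
#include <stdbool.h>

/* Mirrors the range test in the CV_bi_sign5 constraint:
   IN_RANGE (ival, -16, 15).  The function name is hypothetical.  */
bool fits_signed_imm5 (long value)
{
  return value >= -16 && value <= 15;
}

/* Hypothetical caller: an equality test of a register against a small
   constant is the shape of comparison the new branch pattern describes.  */
int branch_on_small_constant (int a)
{
  if (a == 10)
    return 1;
  return 0;
}
```

A constant such as 16 or -17 falls outside the range, so a comparison against it would not satisfy the constraint.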

Diff:
---
 gcc/common/config/riscv/riscv-common.cc|  2 +
 gcc/config/riscv/constraints.md|  6 +++
 gcc/config/riscv/corev.md  | 37 +
 gcc/config/riscv/predicates.md |  4 ++
 gcc/config/riscv/riscv.md  |  2 +-
 gcc/config/riscv/riscv.opt |  2 +
 gcc/doc/sourcebuild.texi   |  3 ++
 .../gcc.target/riscv/cv-bi-beqimm-compile-1.c  | 17 
 .../gcc.target/riscv/cv-bi-beqimm-compile-2.c  | 48 ++
 .../gcc.target/riscv/cv-bi-bneimm-compile-1.c  | 17 
 .../gcc.target/riscv/cv-bi-bneimm-compile-2.c  | 48 ++
 gcc/testsuite/lib/target-supports.exp  | 13 ++
 12 files changed, 198 insertions(+), 1 deletion(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 48efef40dfd..440127a2af0 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -366,6 +366,7 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"xcvalu", ISA_SPEC_CLASS_NONE, 1, 0},
   {"xcvelw", ISA_SPEC_CLASS_NONE, 1, 0},
   {"xcvsimd", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"xcvbi", ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"xtheadba", ISA_SPEC_CLASS_NONE, 1, 0},
   {"xtheadbb", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1618,6 +1619,7 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"xcvalu",&gcc_options::x_riscv_xcv_subext, MASK_XCVALU},
   {"xcvelw",&gcc_options::x_riscv_xcv_subext, MASK_XCVELW},
   {"xcvsimd",   &gcc_options::x_riscv_xcv_subext, MASK_XCVSIMD},
+  {"xcvbi", &gcc_options::x_riscv_xcv_subext, MASK_XCVBI},
 
   {"xtheadba",  &gcc_options::x_riscv_xthead_subext, MASK_XTHEADBA},
   {"xtheadbb",  &gcc_options::x_riscv_xthead_subext, MASK_XTHEADBB},
diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 41acaea04eb..972e8842c9f 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -268,6 +268,12 @@
(and (match_test "IN_RANGE (ival, 0, 1073741823)")
 (match_test "exact_log2 (ival + 1) != -1"
 
+(define_constraint "CV_bi_sign5"
+  "@internal
+   A 5-bit signed immediate for CORE-V Immediate Branch."
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (ival, -16, 15)")))
+
 (define_constraint "CV_simd_si6"
   "A 6-bit signed immediate for SIMD."
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/corev.md b/gcc/config/riscv/corev.md
index 3857c53ce10..e2db8f31130 100644
--- a/gcc/config/riscv/corev.md
+++ b/gcc/config/riscv/corev.md
@@ -2614,3 +2614,40 @@
 cv.subrotmj.div8\t%0,%1,%2"
[(set_attr "type" "arith")
(set_attr "mode" "SI")])
+
+;; XCVBI Instructions
+(define_insn "*cv_branch"
+  [(set (pc)
+   (if_then_else
+(match_operator 1 "equality_operator"
+[(match_operand:X 2 "register_operand" "r")
+ (match_operand:X 3 "const_int5s_operand" "CV_bi_sign5")])
+(label_ref (match_operand 0 "" ""))
+(pc)))]
+  "TARGET_XCVBI"
+{
+  if (get_attr_length (insn) == 12)
+return "cv.b%N1\t%2,%z3,1f; jump\t%l0,ra; 1:";
+
+  return "cv.b%C1imm\t%2,%3,%0";
+}
+  [(set_attr "t

[gcc r14-9606] [committed] Fix RISC-V missing stack tie

2024-03-21 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:c65046ff2ef0a9a46e59bc0b3369b2d226f6a239

commit r14-9606-gc65046ff2ef0a9a46e59bc0b3369b2d226f6a239
Author: Jeff Law 
Date:   Thu Mar 21 20:41:59 2024 -0600

[committed] Fix RISC-V missing stack tie

As some of you know, Raphael has been working on stack-clash support for the
RISC-V port.  A little while ago Florian reached out to us with an issue
where glibc was failing its smoke test due to referencing an unallocated
stack slot.

Without diving into the code in detail I (incorrectly) concluded it was a
problem with the fallback of using Ada's stack-check paths due to not having
stack-clash support.

Once enough stack-clash bits were ready I had Raphael review the code
generated for Florian's test and we concluded that the original case from
Florian was just wrong irrespective of stack clash/stack check.  While
Raphael's stack-clash work will indirectly fix Florian's case, it really
should also work without stack-clash.

In particular this code was called out by valgrind:

> 0003cb5e :
> __GI___realpath():
>    3cb5e:   81010113   addi    sp,sp,-2032
>    3cb62:   7d313423   sd      s3,1992(sp)
>    3cb66:   79fd       lui     s3,0xf
>    3cb68:   7e813023   sd      s0,2016(sp)
>    3cb6c:   7c913c23   sd      s1,2008(sp)
>    3cb70:   7f010413   addi    s0,sp,2032
>    3cb74:   35098793   addi    a5,s3,848 # f350 <__libc_initial+0xffe8946a>
>    3cb78:   74fd       lui     s1,0xf
>    3cb7a:   008789b3   add     s3,a5,s0
>    3cb7e:   f9048793   addi    a5,s1,-112 # ef90 <__libc_initial+0xffe890aa>
>    3cb82:   008784b3   add     s1,a5,s0
>    3cb86:   77fd       lui     a5,0xf
>    3cb88:   7d413023   sd      s4,1984(sp)
>    3cb8c:   7b513c23   sd      s5,1976(sp)
>    3cb90:   7e113423   sd      ra,2024(sp)
>    3cb94:   7d213823   sd      s2,2000(sp)
>    3cb98:   7b613823   sd      s6,1968(sp)
>    3cb9c:   7b713423   sd      s7,1960(sp)
>    3cba0:   7b813023   sd      s8,1952(sp)
>    3cba4:   79913c23   sd      s9,1944(sp)
>    3cba8:   79a13823   sd      s10,1936(sp)
>    3cbac:   79b13423   sd      s11,1928(sp)
>    3cbb0:   34878793   addi    a5,a5,840 # f348 <__libc_initial+0xffe89462>
>    3cbb4:   4713       li      a4,1024
>    3cbb8:   00132a17   auipc   s4,0x132
>    3cbbc:   ae0a3a03   ld      s4,-1312(s4) # 16e698 <__stack_chk_guard>
>    3cbc0:   01098893   addi    a7,s3,16
>    3cbc4:   42098693   addi    a3,s3,1056
>    3cbc8:   b8040a93   addi    s5,s0,-1152
>    3cbcc:   97a2       add     a5,a5,s0
>    3cbce:   000a3603   ld      a2,0(s4)
>    3cbd2:   f8c43423   sd      a2,-120(s0)
>    3cbd6:   4601       li      a2,0
>    3cbd8:   3d14b023   sd      a7,960(s1)
>    3cbdc:   3ce4b423   sd      a4,968(s1)
>    3cbe0:   7cd4b823   sd      a3,2000(s1)
>    3cbe4:   7ce4bc23   sd      a4,2008(s1)
>    3cbe8:   b7543823   sd      s5,-1168(s0)
>    3cbec:   b6e43c23   sd      a4,-1160(s0)
>    3cbf0:   e38c       sd      a1,0(a5)
>    3cbf2:   b0010113   addi    sp,sp,-1280

In particular note the store at 0x3cbd8.  That's hitting (s1 + 960).  If you
chase the values around, you'll find it's a bit more than 1k into
unallocated stack space.  It's also worth noting the final stack adjustment
at 0x3cbf2.
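
The "chase the values around" step can be written out as plain arithmetic. The C sketch below is not part of the patch; the function names are hypothetical, and all values are byte offsets relative to the stack pointer on entry. One assumption is labelled in the code: the listing prints the lui operand as 0xf, and I am reading the 0x74fd encoding as a compressed lui whose immediate field is all ones, so it loads -4096. That reading agrees with the "# ef90" annotation on the following addi.

```c
/* Offsets are in bytes relative to the stack pointer on entry to
   __GI___realpath (entry sp = 0, the stack grows downward).  */

/* sp after "addi sp,sp,-2032" at 0x3cb5e.  */
long sp_after_first_adjust (void) { return -2032; }

/* s0 after "addi s0,sp,2032" at 0x3cb70: back at the entry sp.  */
long s0_value (void) { return sp_after_first_adjust () + 2032; }

/* s1 after lui (0x3cb78), "addi a5,s1,-112" (0x3cb7e) and
   "add s1,a5,s0" (0x3cb82).
   ASSUMPTION: the lui loads -4096, decoded from the 0x74fd encoding.  */
long s1_value (void) { return (-4096 - 112) + s0_value (); }

/* Address written by "sd a7,960(s1)" at 0x3cbd8.  */
long store_address (void) { return s1_value () + 960; }

/* How far below the live stack pointer the store lands.  */
long bytes_below_sp (void)
{
  return sp_after_first_adjust () - store_address ();
}

/* sp after the final "addi sp,sp,-1280" at 0x3cbf2.  */
long sp_after_final_adjust (void) { return sp_after_first_adjust () - 1280; }
```

Under that assumption the store lands 1216 bytes below the stack pointer that is live at 0x3cbd8, which matches "a bit more than 1k", yet it sits inside the frame once the adjustment at 0x3cbf2 has executed. In other words the store itself is fine; it has simply been moved ahead of the allocation that makes it valid.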

While I haven't reproduced Florian's code exactly, I was able to get
reasonably close and verify my suspicion that everything was fine before
sched2 and incorrect after sched2.  It was also obvious at that point what
had gone wrong -- we were missing a stack tie after the final stack pointer
adjustment.

This patch adds the missing stack tie.

While not technically a regression, I shudder at the thought of chasing one
of these issues down again in the wild.  Been there, done that.

Regression tested on rv64gc.  Verified the scheduler no longer mucked up
realpath by hand.  Pushing to the trunk.

gcc/
* config/riscv/riscv.cc (riscv_expand_prologue): Add missing stack
 

[gcc r14-9715] [committed] Provide suitable output template for zero_extendqihi2 on H8

2024-03-28 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:c1e66532cbb424bd7ea8c3b2c1ffea4bb5233309

commit r14-9715-gc1e66532cbb424bd7ea8c3b2c1ffea4bb5233309
Author: Jeff Law 
Date:   Thu Mar 28 16:56:53 2024 -0600

[committed] Provide suitable output template for zero_extendqihi2 on H8

Segher's recent combine change, quite unexpectedly, triggered a regression
on the H8 port.  It failed to build newlib.

The zero_extendqihi2 pattern provided two alternatives.  In the first, the
source and destination matched, which turns into a suitable instruction
trivially.  The second alternative was actually meant to capture cases where
the value is coming from memory.

What was missing here was the reg->reg case where the source and destination
do not match.  That fell into the second case which was requested to be
split by the pattern's output template.

The splitter had a suitable condition to make sure it only triggered in the
right cases.  Unfortunately, the pattern requiring a split in a case where
the splitter was going to fail led to the fault.

So regardless of what's going on in the combiner, this code was just wrong.
Fixed thusly by providing a suitable output template for the reg->reg case.
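
As a rough model of what the new reg->reg alternative has to compute, the C sketch below mimics the two-instruction template from the diff (mov.b followed by extu.w) on a 16-bit value.  This is only my reading of that template, with hypothetical function names; it is not H8 code and says nothing about which alternative the register allocator will pick for a given function.

```c
#include <stdint.h>

/* Model of the two-instruction sequence for the reg->reg alternative:
   mov.b copies the source byte into the low byte of the destination,
   extu.w then clears the upper byte of the 16-bit register.  */
uint16_t zero_extend_qi_to_hi (uint16_t dest_before, uint8_t src)
{
  uint16_t dest = (uint16_t) ((dest_before & 0xff00u) | src);  /* mov.b */
  dest &= 0x00ffu;                                             /* extu.w */
  return dest;
}

/* The source-level operation: widening an unsigned 8-bit value to
   16 bits is a zero extension.  */
uint16_t widen (uint8_t x)
{
  return x;
}
```

Whatever was in the destination beforehand, the result equals the plain widening conversion, which is why no split is needed for this case.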

Regression tested on h8300-elf.  Pushing to the trunk.

gcc/

* config/h8300/extensions.md (zero_extendqihi*): Add output
template for reg->reg case where the regs don't match.

Diff:
---
 gcc/config/h8300/extensions.md | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/config/h8300/extensions.md b/gcc/config/h8300/extensions.md
index 7149dc0ac52..a1e8c4abd37 100644
--- a/gcc/config/h8300/extensions.md
+++ b/gcc/config/h8300/extensions.md
@@ -12,8 +12,8 @@
   })
 
 (define_insn_and_split "*zero_extendqihi2"
-  [(set (match_operand:HI 0 "register_operand" "=r,r")
-   (zero_extend:HI (match_operand:QI 1 "general_operand_src" "0,g>")))]
+  [(set (match_operand:HI 0 "register_operand" "=r,r,r")
+   (zero_extend:HI (match_operand:QI 1 "general_operand_src" "0,r,g>")))]
   ""
   "#"
   "&& reload_completed"
@@ -21,14 +21,15 @@
  (clobber (reg:CC CC_REG))])])
 
 (define_insn "*zero_extendqihi2"
-  [(set (match_operand:HI 0 "register_operand" "=r,r")
-   (zero_extend:HI (match_operand:QI 1 "general_operand_src" "0,g>")))
+  [(set (match_operand:HI 0 "register_operand" "=r,r,r")
+   (zero_extend:HI (match_operand:QI 1 "general_operand_src" "0,r,g>")))
(clobber (reg:CC CC_REG))]
   ""
   "@
   extu.w   %T0
+  mov.b\t%X1,%R0\;extu.w\t%T0
   #"
-  [(set_attr "length" "2,10")])
+  [(set_attr "length" "2,4,10")])
 
 ;; Split the zero extension of a general operand (actually a memory
 ;; operand) into a load of the operand and the actual zero extension


[gcc r14-9732] [committed] RISC-V: Add missing insn types to XiangShan Nanhu scheduler model

2024-03-31 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:08eaafadd5beaa56beb2d1fceca9f97eeb0219ba

commit r14-9732-g08eaafadd5beaa56beb2d1fceca9f97eeb0219ba
Author: Jeff Law 
Date:   Sun Mar 31 10:51:17 2024 -0600

[committed] RISC-V: Add missing insn types to XiangShan Nanhu scheduler model

The test for the recently added XiangShan Nanhu microarchitecture is failing
because the scheduler description does not have entries for certain insn
types.

I'm adding branch, jalr, ret and sfb_alu to the scheduler description;
that's enough to get the trivial test to pass.  However, I strongly suspect
running any significant code through the compiler when scheduling for this
microarchitecture will trigger faults.

Basically we have checking now that will fault if we have an insn in the IL
without an associated type or if we have an insn in the IL that does not map
to an insn reservation in the scheduler model.  We were tripping the latter
assertion for one of those branch types.  My suspicion is many insn types
aren't handled by that DFA.

The branch insns were pretty obvious and easy to fix.  But someone with more
experience with the uarch needs to do an audit to ensure that all insn types
map to an insn reservation.
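
To make the coverage problem concrete, here is a toy C model of the membership test for this one reservation.  The two type lists are copied from the xiangshan_jump reservation before and after the patch; everything else (the helper and function names) is hypothetical, and the real check lives in the compiler's scheduler support, not in code like this.

```c
#include <stdbool.h>
#include <string.h>

/* Type lists copied from the xiangshan_jump reservation before and
   after the patch.  */
static const char *const jump_types_before[] =
  { "jump", "call", "auipc", "unknown" };
static const char *const jump_types_after[] =
  { "jump", "call", "auipc", "unknown", "branch", "jalr", "ret", "sfb_alu" };

/* Hypothetical helper: true if TYPE appears in LIST.  */
static bool type_is_covered (const char *type, const char *const *list,
                             size_t count)
{
  for (size_t i = 0; i < count; i++)
    if (strcmp (type, list[i]) == 0)
      return true;
  return false;
}

bool covered_before (const char *type)
{
  return type_is_covered (type, jump_types_before,
                          sizeof jump_types_before / sizeof jump_types_before[0]);
}

bool covered_after (const char *type)
{
  return type_is_covered (type, jump_types_after,
                          sizeof jump_types_after / sizeof jump_types_after[0]);
}
```

This only models one reservation.  Other reservations in xiangshan.md cover other types, so a type missing from this list is a problem only if no other reservation picks it up.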

gcc/
* config/riscv/xiangshan.md (xiangshan_jump): Add branch, jalr, ret
and sfb_alu.

Diff:
---
 gcc/config/riscv/xiangshan.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/xiangshan.md b/gcc/config/riscv/xiangshan.md
index 381c3ce1428..76539d332b8 100644
--- a/gcc/config/riscv/xiangshan.md
+++ b/gcc/config/riscv/xiangshan.md
@@ -70,7 +70,7 @@
 
 (define_insn_reservation "xiangshan_jump" 1
   (and (eq_attr "tune" "xiangshan")
-   (eq_attr "type" "jump,call,auipc,unknown"))
+   (eq_attr "type" "jump,call,auipc,unknown,branch,jalr,ret,sfb_alu"))
   "xs_jmp_rs")
 
 (define_insn_reservation "xiangshan_i2f" 3


[gcc] Created branch 'riscv/heads/gcc-14-with-riscv-opts' in namespace 'refs/vendors'

2024-04-30 Thread Jeff Law via Gcc-cvs
The branch 'riscv/heads/gcc-14-with-riscv-opts' was created in namespace 'refs/vendors' pointing to:

 7a00c459cbb... libstdc++: Do not apply localized formatting to NaN and inf


[gcc r15-71] This is almost exclusively Jivan's work. His original post:

2024-04-30 Thread Jeff Law via Gcc-cvs
https://gcc.gnu.org/g:f652a35877e32d470d649d1aee5d94fa0169a478

commit r15-71-gf652a35877e32d470d649d1aee5d94fa0169a478
Author: Jivan Hakobyan 
Date:   Tue Apr 30 09:44:02 2024 -0600

This is almost exclusively Jivan's work.  His original post:

> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg336483.html

This patch is primarily meant to improve the code we generate for FP
rounding such as ceil/floor.  It also addresses some unnecessary sign
extensions in the same areas.

RISC-V's FP conversions have a bit of undesirable behavior that makes them
unsuitable as-is for ceil/floor and other related functions.  These
deficiencies are addressed in the Zfa extension, but there's no reason not
to pick up a nice improvement when we can.

Basically we can still use the basic FP conversions for floor/ceil and
friends when we don't care about inexact exceptions by checking for the
special cases first, then emitting the conversion when the special cases
don't apply.  That's still much faster than calling into glibc.
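
The "special cases first, then convert" idea can be sketched in portable C.  This is not the code from the patch, which works on RTL and RISC-V conversion instructions; the sketch swaps in a C cast (which truncates toward zero) plus a fix-up step, and the function name is hypothetical.  It relies on the fact that every double with magnitude of at least 2^52 is already an integer.

```c
/* Software model of the floor strategy; the name is hypothetical.  */
double floor_via_conversion (double x)
{
  /* Special cases first: NaN, infinities and values with magnitude of
     at least 2^52 are returned unchanged.  The comparisons are false
     for NaN, so NaN takes this path too.  */
  const double limit = 4503599627370496.0;  /* 2^52 */
  if (!(x < limit && x > -limit))
    return x;

  /* The conversion itself: double -> integer -> double.  The cast
     truncates toward zero and cannot overflow in this range.  */
  double r = (double) (long long) x;

  if (r == x)
    return x;        /* already integral; also keeps the sign of -0.0 */
  if (r > x)
    r -= 1.0;        /* truncation rounded up, which happens for x < 0 */
  return r;
}
```

Like the approach described above, this does not try to preserve the inexact exception behaviour of a library floor.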

The redundant sign extensions are eliminated using the same trick Jivan
added last year, just in a few more places ;-)

This eliminates roughly 10% of the dynamic instruction count for imagick.
But more importantly it's about a 17% performance improvement for that
workload within spec.

This has been bootstrapped as well as regression tested in a cross
environment.  It has also successfully built and run specint/specfp
correctly.

Pushing to the trunk and the coordination branch momentarily.

gcc/
* config/riscv/iterators.md (fix_ops, fix_uns): New iterators.
(RINT, rint_pattern, rint_rm): Remove unused iterators.
* config/riscv/riscv-protos.h (get_fp_rounding_coefficient): Prototype.
* config/riscv/riscv-v.cc (get_fp_rounding_coefficient): Externalize;
give it external linkage.
* config/riscv/riscv.md (UNSPEC_LROUND): Remove.
(fix_trunc2): Replace with ...
(_truncsi2): New expander & associated insn.
(_truncsi2_ext): New insn.
(_truncdi2): Likewise.
(l2): Replace with ...
(lrintsi2): New expander and associated insn.
(lrintsi2_ext, lrintdi2): New insns.
(2): Replace with
(lsi2): New expander and associated insn.
(lsi2_sext): New insn.
(ldi2): Likewise.
(2): New expander.

gcc/testsuite/
* gcc.target/riscv/fix.c: New test.
* gcc.target/riscv/round.c: New test.
* gcc.target/riscv/round_32.c: New test.
* gcc.target/riscv/round_64.c: New test.

Diff:
---
 gcc/config/riscv/iterators.md |  12 +-
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-v.cc   |   2 +-
 gcc/config/riscv/riscv.md | 211 +++---
 gcc/testsuite/gcc.target/riscv/fix.c  |  34 +
 gcc/testsuite/gcc.target/riscv/round.c| 144 
 gcc/testsuite/gcc.target/riscv/round_32.c |  22 
 gcc/testsuite/gcc.target/riscv/round_64.c |  23 
 8 files changed, 427 insertions(+), 22 deletions(-)

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index a7694137685..75e119e407a 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -196,6 +196,13 @@
 
 (define_code_iterator bitmanip_rotate [rotate rotatert])
 
+;; These code iterators allow the signed and unsigned fix operations to use
+;; the same template.
+(define_code_iterator fix_ops [fix unsigned_fix])
+
+(define_code_attr fix_uns [(fix "fix") (unsigned_fix "fixuns")])
+
+
 ;; ---
 ;; Code Attributes
 ;; ---
@@ -312,11 +319,6 @@
 ;; Int Iterators.
 ;; ---
 
-;; Iterator and attributes for floating-point rounding instructions.
-(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
-(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")])
-(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
-
 ;; Iterator and attributes for quiet comparisons.
 (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
 (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5d46a29d8b7..e5aebf3fc3d 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -711,6 +711,7 @@ bool gather_scatter_valid_offset_p (machine_mode);
 HOST_WIDE_INT estimated_poly_value (poly_int64, unsigned int);
 bool whole_reg_to_reg_move_p (rtx *, machine_mode, int);
 bool splat_to_scalar_move_
