Re: [PATCH 2/2] libiberty/reconcat: Add note about append string to NULL

2023-12-17 Thread Jakub Jelinek
On Mon, Dec 18, 2023 at 11:44:22AM +0800, YunQiang Su wrote:
> For reconcat, a NULL `optr` can only be used as the last entry
> of the string list; that is, we cannot append anything to it.
> Let's add a note to the documentation.
> 
> libiberty:
>   * concat.c (reconcat): Add a note to the documentation about
>   appending strings to NULL.
> ---
>  libiberty/concat.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/libiberty/concat.c b/libiberty/concat.c
> index 4cb1df3baf3..3a6b4ca71e8 100644
> --- a/libiberty/concat.c
> +++ b/libiberty/concat.c
> @@ -169,6 +169,9 @@ loop:
>    str = reconcat (str, "pre-", str, NULL);
>  @end example
>  
> +Note: don't try to append string(s) to a NULL string,
> +as the process will stop at the first NULL argument.
> +
>  @end deftypefn

I think this is unnecessary and misleading.
The fact that NULL is the variable argument terminator is already clearly
documented, and the first argument to reconcat can be NULL just fine,
so one just needs to be careful.

Jakub
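
To make the point about the NULL terminator concrete, here is a minimal sketch in C. It assumes a hypothetical stand-in, `my_reconcat`, that only mimics the documented behaviour of reconcat (concatenate the NULL-terminated argument list, then free the first argument); it is not libiberty's implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for reconcat, for illustration only: concatenate
   the NULL-terminated list starting at FIRST into a new buffer, then
   free OPTR.  */
static char *
my_reconcat (char *optr, const char *first, ...)
{
  va_list ap;
  size_t len = 0;

  /* First pass: add up the lengths, stopping at the first NULL.  */
  va_start (ap, first);
  for (const char *s = first; s != NULL; s = va_arg (ap, char *))
    len += strlen (s);
  va_end (ap);

  char *res = malloc (len + 1);
  if (res == NULL)
    abort ();
  res[0] = '\0';

  /* Second pass: copy the pieces, again stopping at the first NULL.  */
  va_start (ap, first);
  for (const char *s = first; s != NULL; s = va_arg (ap, char *))
    strcat (res, s);
  va_end (ap);

  /* OPTR may alias one of the pieces, so free it only after copying.
     free (NULL) is a no-op, so a NULL OPTR is fine.  */
  free (optr);
  return res;
}
```

With this sketch, `my_reconcat (NULL, "a", "b", (char *) NULL)` yields "ab", so a NULL first argument is harmless. By contrast, `my_reconcat (NULL, NULL, " -mnan=2008", (char *) NULL)` yields an empty string, because the NULL in the string list is taken as the terminator.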



RE: [PATCH v2] RISC-V: Bugfix for the RVV const vector

2023-12-17 Thread Li, Pan2
Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Monday, December 18, 2023 3:37 PM
To: Li, Pan2; gcc-patches
Cc: Li, Pan2; Wang, Yanzhang; kito.cheng
Subject: Re: [PATCH v2] RISC-V: Bugfix for the RVV const vector

OK, LGTM. It's an obvious fix, and it is not easy to add a test for it (no need
to add such a test).


juzhe.zh...@rivai.ai





Re: [RFC][V2] RISC-V: Support -mcmodel=large.

2023-12-17 Thread KuanLin Chen
Hi Jeff,

Sorry for missing this.
I've removed riscv_asm_output_pool_epilogue because the beginning of the pool
is always aligned according to FUNCTION_BOUNDARY.
Please find the updated patch attached. Thank you.

Jeff Law wrote on Mon, Dec 18, 2023 at 3:15 AM:

>
>
> On 11/10/23 02:10, KuanLin Chen wrote:
> > Sorry, the previous patch was missing a semicolon. Please find the new
> > one in the attachment. Thanks.
> Thanks.  I was going to do some final testing with the plan to integrate
> this patch today, but I think there's a piece missing.  Specifically I
> think it's missing a definition for riscv_asm_output_pool_epilogue.
>
> Can you please send an updated patch that includes that function?
>
> Thanks,
> Jeff
>


0001-RISC-V-Support-mcmodel-large.patch
Description: Binary data


Re: [PATCH v2] RISC-V: Bugfix for the RVV const vector

2023-12-17 Thread juzhe.zh...@rivai.ai
OK, LGTM. It's an obvious fix, and it is not easy to add a test for it (no need
to add such a test).



juzhe.zh...@rivai.ai
 


[PATCH v2] RISC-V: Bugfix for the RVV const vector

2023-12-17 Thread pan2 . li
From: Pan Li 

This patch fixes a bug in const vector generation for interleave.
Assume we need to generate an interleave const vector like the one below.

  V = {4, -4, 3, -3, 2, -2, 1, -1}

Before this patch:
  vsetvli a3, zero, e64, m8, ta, ma
  vid.v   v8                  v8  = {0, 1, 2, 3, 4}
  li      a6, -1
  vmul.vx v8, v8, a6          v8  = {-0, -1, -2, -3, -4}
  vadd.vi v24, v8, 4          v24 = { 4,  3,  2,  1,  0}
  vadd.vi v8, v8, -4          v8  = {-4, -5, -6, -7, -8}
  li      a6, 32
  vsll.vx v8, v8, a6          v8  = {0, -4, 0, -5, 0, -6, 0, -7} for e32
  vor.vv  v24, v24, v8        v24 = {4, -4, 3, -5, 2, -6, 1, -7} for e32

After this patch:
  vsetvli a6, zero, e64, m8, ta, ma
  vid.v   v8                  v8  = {0, 1, 2, 3, 4}
  li      a7, -1
  vmul.vx v16, v8, a7         v16 = {-0, -1, -2, -3, -4}
  vadd.vi v16, v16, 4         v16 = { 4,  3,  2,  1,  0}
  vadd.vi v8, v8, -4          v8  = {-4, -3, -2, -1,  0}
  li      a7, 32
  vsll.vx v8, v8, a7          v8  = {0, -4, 0, -3, 0, -2, ...} for e32
  vor.vv  v16, v16, v8        v16 = {4, -4, 3, -3, 2, -2, ...} for e32

It is not easy to add an asm check that is stable enough for this case, as we
would need to check that the operand of the vadd.vi -4 comes from the vid.v
output, which spans 4 instructions up to that point. Thus no test is added
here; the case will be covered by gcc.dg/vect/pr92420.c in the underlying
patches.

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Take step2
instead of step1 for second series.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index eade8db4cf1..d1eb7a0a9a5 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1331,7 +1331,7 @@ expand_const_vector (rtx target, rtx src)
  rtx tmp2 = gen_reg_rtx (new_mode);
  base2 = gen_int_mode (rtx_to_poly_int64 (base2), new_smode);
  expand_vec_series (tmp2, base2,
-gen_int_mode (step1, new_smode));
+gen_int_mode (step2, new_smode));
  rtx shifted_tmp2 = expand_simple_binop (
new_mode, ASHIFT, tmp2,
gen_int_mode (builder.inner_bits_size (), Pmode), NULL_RTX,
-- 
2.34.1
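
The step1/step2 mix-up fixed by this patch can be modeled in plain C. The sketch below is only an illustration under stated assumptions: the helper names `interleave_series` and `view_as_e32` are hypothetical, and the code imitates the arithmetic of the vsll.vx + vor.vv sequence rather than the actual expand_const_vector logic.

```c
#include <assert.h>
#include <stdint.h>

/* Build N 64-bit lanes.  Each lane holds element i of the first series
   in its low 32 bits and element i of the second series in its high
   32 bits, like the shift-by-32 and OR steps in the asm above.  */
static void
interleave_series (int32_t base1, int32_t step1,
                   int32_t base2, int32_t step2,
                   uint64_t *lanes, int n)
{
  for (int i = 0; i < n; i++)
    {
      int32_t lo = base1 + i * step1;   /* first series */
      int32_t hi = base2 + i * step2;   /* second series */
      lanes[i] = (uint64_t) (uint32_t) lo | ((uint64_t) (uint32_t) hi << 32);
    }
}

/* View the N 64-bit lanes as 2*N 32-bit elements, low half first
   (little-endian element order).  */
static void
view_as_e32 (const uint64_t *lanes, int n, int32_t *out)
{
  for (int i = 0; i < n; i++)
    {
      out[2 * i] = (int32_t) (uint32_t) (lanes[i] & 0xffffffffu);
      out[2 * i + 1] = (int32_t) (uint32_t) (lanes[i] >> 32);
    }
}
```

Calling `interleave_series (4, -1, -4, 1, lanes, 4)` corresponds to the fixed code and gives {4, -4, 3, -3, 2, -2, 1, -1} when viewed as e32. Passing -1 (step1) as the last step instead reproduces the wrong {4, -4, 3, -5, 2, -6, 1, -7} shown under "Before this patch".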



Re: [PATCH v1] RISC-V: Bugfix for the RVV const vector

2023-12-17 Thread juzhe.zh...@rivai.ai
The fix is reasonable.

But the test's asm check is too fragile and will easily break in the future.

The key of the check should be:

vid.v   v8 -> can be v0, v8, v16, v24 since LMUL = 8

vadd.vi v8, v8, -4  -> should be using the result of vid.

I think you should adjust the test check according to this suggestion.

Also, rename the test from const-vector-0.c to bug-7.c.


juzhe.zh...@rivai.ai
 


Re: [PATCH v2] testsuite: Fix cpymem-1.c dump checks under different riscv-sim for RVV.

2023-12-17 Thread juzhe.zh...@rivai.ai
LGTM.



juzhe.zh...@rivai.ai
 


[PATCH v2] testsuite: Fix cpymem-1.c dump checks under different riscv-sim for RVV.

2023-12-17 Thread Li Xu
From: xuli 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/cpymem-1.c: Fix checks.
---
 .../gcc.target/riscv/rvv/base/cpymem-1.c  | 29 +--
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c
index 549d6648104..ccde7575051 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-O1" } */
+/* { dg-additional-options "-O1 -fno-schedule-insns -fno-schedule-insns2" } */
 /* { dg-add-options riscv_v } */
 /* { dg-final { check-function-bodies "**" "" } } */
 
@@ -50,11 +50,34 @@ void f2 (__INT32_TYPE__* a, __INT32_TYPE__* b, int l)
Use extern here so that we get a known alignment, lest
DATA_ALIGNMENT force us to make the scan pattern accomodate
code for different alignments depending on word size.
-** f3: { target { any-opts "-mcmodel=medlow" } }
+** f3: { target { { any-opts "-mcmodel=medlow" } && { no-opts 
"-march=rv64gcv_zvl512b" "-march=rv64gcv_zvl1024b" 
"--param=riscv-autovec-lmul=dynamic" "--param=riscv-autovec-lmul=m2" 
"--param=riscv-autovec-lmul=m4" "--param=riscv-autovec-lmul=m8" 
"--param=riscv-autovec-preference=fixed-vlmax" } } }
 **lui\s+[ta][0-7],%hi\(a_a\)
+**addi\s+[ta][0-7],[ta][0-7],%lo\(a_a\)
 **lui\s+[ta][0-7],%hi\(a_b\)
 **addi\s+a4,[ta][0-7],%lo\(a_b\)
-**vsetivli\s+zero,16,e32,m4,ta,ma
+**vsetivli\s+zero,16,e32,m8,ta,ma
+**vle32.v\s+v\d+,0\([ta][0-7]\)
+**vse32\.v\s+v\d+,0\([ta][0-7]\)
+**ret
+*/
+
+/*
+** f3: { target { { any-opts "-mcmodel=medlow 
--param=riscv-autovec-preference=fixed-vlmax" "-mcmodel=medlow 
-march=rv64gcv_zvl512b --param=riscv-autovec-preference=fixed-vlmax" } && { 
no-opts "-march=rv64gcv_zvl1024b" } } }
+**lui\s+[ta][0-7],%hi\(a_a\)
+**lui\s+[ta][0-7],%hi\(a_b\)
+**addi\s+[ta][0-7],[ta][0-7],%lo\(a_a\)
+**addi\s+a4,[ta][0-7],%lo\(a_b\)
+**vl(1|4|2)re32\.v\s+v\d+,0\([ta][0-7]\)
+**vs(1|4|2)r\.v\s+v\d+,0\([ta][0-7]\)
+**ret
+*/
+
+/*
+** f3: { target { { any-opts "-mcmodel=medlow -march=rv64gcv_zvl1024b" 
"-mcmodel=medlow -march=rv64gcv_zvl512b" } && { no-opts 
"--param=riscv-autovec-preference=fixed-vlmax" } } }
+**lui\s+[ta][0-7],%hi\(a_a\)
+**lui\s+[ta][0-7],%hi\(a_b\)
+**addi\s+a4,[ta][0-7],%lo\(a_b\)
+**vsetivli\s+zero,16,e32,(m1|m4|mf2),ta,ma
 **vle32.v\s+v\d+,0\([ta][0-7]\)
 **addi\s+[ta][0-7],[ta][0-7],%lo\(a_a\)
 **vse32\.v\s+v\d+,0\([ta][0-7]\)
-- 
2.17.1



[PATCH v1] RISC-V: Bugfix for the RVV const vector

2023-12-17 Thread pan2 . li
From: Pan Li 

This patch fixes a bug in const vector generation for interleave.
Assume we need to generate an interleave const vector like the one below.

  V = {4, -4, 3, -3, 2, -2, 1, -1}

Before this patch:
  vsetvli a3, zero, e64, m8, ta, ma
  vid.v   v8                  v8  = {0, 1, 2, 3, 4}
  li      a6, -1
  vmul.vx v8, v8, a6          v8  = {-0, -1, -2, -3, -4}
  vadd.vi v24, v8, 4          v24 = { 4,  3,  2,  1,  0}
  vadd.vi v8, v8, -4          v8  = {-4, -5, -6, -7, -8}
  li      a6, 32
  vsll.vx v8, v8, a6          v8  = {0, -4, 0, -5, 0, -6, 0, -7} for e32
  vor.vv  v24, v24, v8        v24 = {4, -4, 3, -5, 2, -6, 1, -7} for e32

After this patch:
  vsetvli a6, zero, e64, m8, ta, ma
  vid.v   v8                  v8  = {0, 1, 2, 3, 4}
  li      a7, -1
  vmul.vx v16, v8, a7         v16 = {-0, -1, -2, -3, -4}
  vadd.vi v16, v16, 4         v16 = { 4,  3,  2,  1,  0}
  vadd.vi v8, v8, -4          v8  = {-4, -3, -2, -1,  0}
  li      a7, 32
  vsll.vx v8, v8, a7          v8  = {0, -4, 0, -3, 0, -2, ...} for e32
  vor.vv  v16, v16, v8        v16 = {4, -4, 3, -3, 2, -2, ...} for e32

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Take step2
instead of step1 for second series.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/const-vector-0.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc   |  2 +-
 .../riscv/rvv/autovec/const-vector-0.c| 39 +++
 2 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index eade8db4cf1..d1eb7a0a9a5 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1331,7 +1331,7 @@ expand_const_vector (rtx target, rtx src)
  rtx tmp2 = gen_reg_rtx (new_mode);
  base2 = gen_int_mode (rtx_to_poly_int64 (base2), new_smode);
  expand_vec_series (tmp2, base2,
-gen_int_mode (step1, new_smode));
+gen_int_mode (step2, new_smode));
  rtx shifted_tmp2 = expand_simple_binop (
new_mode, ASHIFT, tmp2,
gen_int_mode (builder.inner_bits_size (), Pmode), NULL_RTX,
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c
new file mode 100644
index 000..4f83121c663
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/const-vector-0.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d 
--param=riscv-autovec-lmul=m8 -ftree-vectorize -fno-vect-cost-model -O3 
-fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#define N 4
+struct C { int r, i; };
+
+/*
+** init_struct_data:
+** ...
+** vsetivli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m8,\s*ta,\s*ma
+** vid\.v\s+v8
+** li\s+[atx][0-9]+,\s*-1
+** vmul\.vx\s+v16,\s*v8,\s*[atx][0-9]+
+** vadd\.vi\s+v16,\s*v16,\s*4
+** vadd\.vi\s+v8,\s*v8,\s*-4
+** li\s+[axt][0-9]+,32
+** vsll\.vx\s+v8,\s*v8,\s*[atx][0-9]+
+** vor\.vv\s+v16,\s*v16,\s*v8
+** ...
+*/
+void
+init_struct_data (struct C * __restrict a, struct C * __restrict b,
+ struct C * __restrict c)
+{
+  int i;
+
+  for (i = 0; i < N; ++i)
+{
+  a[i].r = N - i;
+  a[i].i = i - N;
+
+  b[i].r = i - N;
+  b[i].i = i + N;
+
+  c[i].r = -1 - i;
+  c[i].i = 2 * N - 1 - i;
+}
+}
-- 
2.34.1



[PATCH] RISC-V: Enable vect test for RV32

2023-12-17 Thread Juzhe-Zhong
After recent fixes, almost all real FAILs in RV64 full coverage testing are
fixed.

So it's reasonable to start RV32 vect testing now.

We will enable full coverage testing for RV32 soon to see what else needs to be
fixed.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Enable RV32 vect testing.

---
 gcc/testsuite/lib/target-supports.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index bd38d72562d..5925457d343 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -11569,7 +11569,7 @@ proc check_vect_support_and_set_flags { } {
 }
 } elseif [istarget amdgcn-*-*] {
 set dg-do-what-default run
-} elseif [istarget riscv64-*-*] {
+} elseif [istarget riscv*-*-*] {
if [check_effective_target_riscv_v] {
lappend DEFAULT_VECTCFLAGS "--param" "riscv-vector-abi"
set dg-do-what-default run
-- 
2.36.3
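
The effect of widening the pattern from riscv64-*-* to riscv*-*-* can be checked with a small glob test. This sketch assumes that istarget does glob-style matching against the target triplet and uses POSIX fnmatch as a stand-in; the helper name `triplet_matches` and the example triplets are illustrative only.

```c
#include <assert.h>
#include <fnmatch.h>

/* Return 1 if TRIPLET matches the glob PATTERN, else 0.  POSIX fnmatch
   is used here as an approximation of the glob matching assumed for
   istarget.  */
static int
triplet_matches (const char *pattern, const char *triplet)
{
  return fnmatch (pattern, triplet, 0) == 0;
}
```

Under that assumption, `triplet_matches ("riscv64-*-*", "riscv32-unknown-elf")` is 0, while `triplet_matches ("riscv*-*-*", "riscv32-unknown-elf")` is 1, so the new pattern covers both RV32 and RV64 targets.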



Re: [PATCH] testsuite: Fix cpymem-1.c dump checks under different riscv-sim for RVV.

2023-12-17 Thread juzhe.zh...@rivai.ai
Could you add -fno-schedule-insns -fno-schedule-insns2?

That way the test won't break again when we tune the scheduling model and
cost model.



juzhe.zh...@rivai.ai
 


[PATCH] testsuite: Fix cpymem-1.c dump checks under different riscv-sim for RVV.

2023-12-17 Thread Li Xu
From: xuli 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/cpymem-1.c: Fix checks.
---
 .../gcc.target/riscv/rvv/base/cpymem-1.c  | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c
index 549d6648104..aac81079650 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/cpymem-1.c
@@ -50,11 +50,34 @@ void f2 (__INT32_TYPE__* a, __INT32_TYPE__* b, int l)
Use extern here so that we get a known alignment, lest
DATA_ALIGNMENT force us to make the scan pattern accomodate
code for different alignments depending on word size.
-** f3: { target { any-opts "-mcmodel=medlow" } }
+** f3: { target { { any-opts "-mcmodel=medlow" } && { no-opts 
"-march=rv64gcv_zvl512b" "-march=rv64gcv_zvl1024b" 
"--param=riscv-autovec-lmul=dynamic" "--param=riscv-autovec-lmul=m2" 
"--param=riscv-autovec-lmul=m4" "--param=riscv-autovec-lmul=m8" 
"--param=riscv-autovec-preference=fixed-vlmax" } } }
 **lui\s+[ta][0-7],%hi\(a_a\)
+**addi\s+[ta][0-7],[ta][0-7],%lo\(a_a\)
 **lui\s+[ta][0-7],%hi\(a_b\)
 **addi\s+a4,[ta][0-7],%lo\(a_b\)
-**vsetivli\s+zero,16,e32,m4,ta,ma
+**vsetivli\s+zero,16,e32,m8,ta,ma
+**vle32.v\s+v\d+,0\([ta][0-7]\)
+**vse32\.v\s+v\d+,0\([ta][0-7]\)
+**ret
+*/
+
+/*
+** f3: { target { { any-opts "-mcmodel=medlow 
--param=riscv-autovec-preference=fixed-vlmax" "-mcmodel=medlow 
-march=rv64gcv_zvl512b --param=riscv-autovec-preference=fixed-vlmax" } && { 
no-opts "-march=rv64gcv_zvl1024b" } } }
+**lui\s+[ta][0-7],%hi\(a_a\)
+**lui\s+[ta][0-7],%hi\(a_b\)
+**addi\s+[ta][0-7],[ta][0-7],%lo\(a_a\)
+**addi\s+a4,[ta][0-7],%lo\(a_b\)
+**vl(1|4|2)re32\.v\s+v\d+,0\([ta][0-7]\)
+**vs(1|4|2)r\.v\s+v\d+,0\([ta][0-7]\)
+**ret
+*/
+
+/*
+** f3: { target { { any-opts "-mcmodel=medlow -march=rv64gcv_zvl1024b" 
"-mcmodel=medlow -march=rv64gcv_zvl512b" } && { no-opts 
"--param=riscv-autovec-preference=fixed-vlmax" } } }
+**lui\s+[ta][0-7],%hi\(a_a\)
+**lui\s+[ta][0-7],%hi\(a_b\)
+**addi\s+a4,[ta][0-7],%lo\(a_b\)
+**vsetivli\s+zero,16,e32,(m1|m4|mf2),ta,ma
 **vle32.v\s+v\d+,0\([ta][0-7]\)
 **addi\s+[ta][0-7],[ta][0-7],%lo\(a_a\)
 **vse32\.v\s+v\d+,0\([ta][0-7]\)
-- 
2.17.1



Re: [PATCH] RISC-V: Add required_extensions in function_group

2023-12-17 Thread juzhe.zh...@rivai.ai
LGTM from my side.
Give Kito 1 day to chime in.



juzhe.zh...@rivai.ai
 
From: Feng Wang
Date: 2023-12-18 11:28
To: gcc-patches
CC: kito.cheng; jeffreyalaw; juzhe.zhong; Feng Wang
Subject: [PATCH] RISC-V: Add required_extensions in function_group
In order to add other vector-related extensions in the future, this
patch adds one more parameter to function_group_info; it will be
used to determine whether intrinsic registration processing is required.
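
The registration gating described above can be sketched in standalone C. This is only an illustration under stated assumptions: `struct target_flags` is a hypothetical replacement for the TARGET_* macros, only three enum values are modeled, and `ext_enabled` and `register_groups` are made-up names rather than functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A reduced version of the enum added by the patch.  */
enum required_ext { VECTOR_EXT, ZVBB_EXT, ZVBC_EXT };

/* Hypothetical stand-in for the TARGET_* macros.  */
struct target_flags { bool vector, zvbb, zvbc; };

/* A function group tagged with the extension it needs.  */
struct function_group
{
  const char *name;
  enum required_ext required_extensions;
};

/* Return true if the extension EXT is enabled in T.  */
static bool
ext_enabled (const struct target_flags *t, enum required_ext ext)
{
  switch (ext)
    {
    case VECTOR_EXT:
      return t->vector;
    case ZVBB_EXT:
      return t->zvbb;
    case ZVBC_EXT:
      return t->zvbc;
    }
  return false;
}

/* Record in REGISTERED the names of the groups whose required extension
   is enabled, and return how many were recorded.  */
static size_t
register_groups (const struct target_flags *t,
                 const struct function_group *groups, size_t n,
                 const char **registered)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (ext_enabled (t, groups[i].required_extensions))
      registered[count++] = groups[i].name;
  return count;
}
```

Tagging each table entry with its requirement keeps the registration loop generic: supporting a new sub-extension means adding an enum value and a case, not another registration path.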
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins-functions.def (REQUIRED_EXTENSIONS):
Add new macro for match function.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_FUNCTION):
Add one more parameter for macro expanding.
(handle_pragma_vector): Add match function calls.
* config/riscv/riscv-vector-builtins.h (enum required_ext):
Add enum definition for required extension.
(struct function_group_info): Add one more parameter for checking required-ext.
---
.../riscv/riscv-vector-builtins-functions.def |  2 +
gcc/config/riscv/riscv-vector-builtins.cc |  7 ++-
gcc/config/riscv/riscv-vector-builtins.h  | 46 +++
3 files changed, 53 insertions(+), 2 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 1c37fd5fffe..03421d5bc10 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3. If not see
#define DEF_RVV_FUNCTION(NAME, SHAPE, PREDS, OPS_INFO)
#endif
+#define REQUIRED_EXTENSIONS VECTOR_EXT
/* Internal helper functions for gimple fold use.  */
DEF_RVV_FUNCTION (read_vl, read_vl, none_preds, p_none_void_ops)
DEF_RVV_FUNCTION (vlenb, vlenb, none_preds, ul_none_void_ops)
@@ -650,5 +651,6 @@ DEF_RVV_FUNCTION (vsoxseg, seg_indexed_loadstore, 
none_m_preds, tuple_v_scalar_p
DEF_RVV_FUNCTION (vsoxseg, seg_indexed_loadstore, none_m_preds, 
tuple_v_scalar_ptr_eew32_index_ops)
DEF_RVV_FUNCTION (vsoxseg, seg_indexed_loadstore, none_m_preds, 
tuple_v_scalar_ptr_eew64_index_ops)
DEF_RVV_FUNCTION (vlsegff, seg_fault_load, full_preds, 
tuple_v_scalar_const_ptr_size_ptr_ops)
+#undef REQUIRED_EXTENSIONS
#undef DEF_RVV_FUNCTION
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 6330a3a41c3..4e2c66c2de7 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -2685,7 +2685,7 @@ static CONSTEXPR const function_type_info 
function_types[] = {
/* A list of all RVV intrinsic functions.  */
static function_group_info function_groups[] = {
#define DEF_RVV_FUNCTION(NAME, SHAPE, PREDS, OPS_INFO) \
-  {#NAME, ::NAME, ::SHAPE, PREDS, OPS_INFO},
+  {#NAME, ::NAME, ::SHAPE, PREDS, OPS_INFO, REQUIRED_EXTENSIONS},
#include "riscv-vector-builtins-functions.def"
};
@@ -4413,7 +4413,10 @@ handle_pragma_vector ()
 = new hash_table (1023);
   function_builder builder;
   for (unsigned int i = 0; i < ARRAY_SIZE (function_groups); ++i)
-builder.register_function_group (function_groups[i]);
+  {
+if (function_groups[i].match (function_groups[i].required_extensions))
+  builder.register_function_group (function_groups[i]);
+  }
}
/* Return the function decl with RVV function subcode CODE, or error_mark_node
diff --git a/gcc/config/riscv/riscv-vector-builtins.h 
b/gcc/config/riscv/riscv-vector-builtins.h
index cd8ccab1724..4f38c09d73d 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -110,6 +110,21 @@ static const unsigned int CP_WRITE_CSR = 1U << 5;
#define RVV_REQUIRE_MIN_VLEN_64 (1 << 5) /* Require TARGET_MIN_VLEN >= 64.  */
#define RVV_REQUIRE_ELEN_FP_16 (1 << 6) /* Require FP ELEN >= 32.  */
+/* Enumerates the required extensions.  */
+enum required_ext
+{
+  VECTOR_EXT,           /* Vector extension */
+  ZVBB_EXT,             /* Crypto vector Zvbb sub-ext */
+  ZVBB_OR_ZVKB_EXT,     /* Crypto vector Zvbb or Zvkb sub-ext */
+  ZVBC_EXT,             /* Crypto vector Zvbc sub-ext */
+  ZVKG_EXT,             /* Crypto vector Zvkg sub-ext */
+  ZVKNED_EXT,           /* Crypto vector Zvkned sub-ext */
+  ZVKNHA_OR_ZVKNHB_EXT, /* Crypto vector Zvknh[ab] sub-ext */
+  ZVKNHB_EXT,           /* Crypto vector Zvknhb sub-ext */
+  ZVKSED_EXT,           /* Crypto vector Zvksed sub-ext */
+  ZVKSH_EXT,            /* Crypto vector Zvksh sub-ext */
+};
+
/* Enumerates the RVV operand types.  */
enum operand_type_index
{
@@ -212,6 +227,35 @@ class function_shape;
/* Static information about a set of functions.  */
struct function_group_info
{
+  /* Return true if required extension is enabled */
+  bool match (required_ext ext_value) const
+  {
+switch (ext_value)
+{
+  case VECTOR_EXT:
+return TARGET_VECTOR;
+  case ZVBB_EXT:
+return TARGET_ZVBB;
+  case ZVBB_OR_ZVKB_EXT:
+return (TARGET_ZVBB || TARGET_ZVKB);
+  case ZVBC_EXT:
+return TARGET_ZVBC;
+  case ZVKG_EXT:
+return 

[PATCH 2/2] libiberty/reconcat: Add note about append string to NULL

2023-12-17 Thread YunQiang Su
For reconcat, a NULL `optr` cannot be used as one of the strings to
concatenate: the argument list ends at the first NULL, so nothing can be
appended to it. Let's add a note about this to the documentation.

libiberty:
* concat.c (reconcat): Add a note about appending strings to NULL
to the documentation.
---
 libiberty/concat.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libiberty/concat.c b/libiberty/concat.c
index 4cb1df3baf3..3a6b4ca71e8 100644
--- a/libiberty/concat.c
+++ b/libiberty/concat.c
@@ -169,6 +169,9 @@ loop:
   str = reconcat (str, "pre-", str, NULL);
 @end example
 
+Note: don't try to append string(s) to a NULL string,
+as the process will stop at the first NULL argument.
+
 @end deftypefn
 
 */
-- 
2.39.2



[PATCH 1/2] MIPS: host_detect_local_cpu, init ret with concat [PR112759]

2023-12-17 Thread YunQiang Su
The function `reconcat` cannot append string(s) to NULL,
as the concat process will stop at the first NULL.

Let's initialize `ret` with `concat (" ", NULL)`, then
it can be used by reconcat.
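
To illustrate the behaviour described above, here is a minimal sketch. It does
not use libiberty: `toy_reconcat` is a hypothetical stand-in that only mimics
the NULL-terminated argument convention, and the option string is made up.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for reconcat (an assumption, not libiberty's code):
// concatenate the strings from FIRST up to the first NULL argument, then
// free OPTR and return a newly allocated result.
static char *
toy_reconcat (char *optr, const char *first, ...)
{
  std::string out;
  va_list ap;
  va_start (ap, first);
  // The walk stops at the first NULL, whether or not it was meant as the
  // terminator.
  for (const char *s = first; s != nullptr; s = va_arg (ap, const char *))
    out += s;
  va_end (ap);

  char *res = static_cast<char *> (std::malloc (out.size () + 1));
  if (res == nullptr)
    std::abort ();
  std::memcpy (res, out.c_str (), out.size () + 1);
  std::free (optr);   // OPTR may itself be one of the strings, so free it last.
  return res;
}

// RET starts as NULL: the first string argument is NULL, so everything after
// it is ignored and the result is empty.
static char *
append_to_null ()
{
  char *ret = nullptr;
  return toy_reconcat (ret, ret, " ", "-mexample", static_cast<const char *> (nullptr));
}

// RET starts as a real (one-space) string: the appended pieces are kept.
static char *
append_to_initialized ()
{
  char *ret = toy_reconcat (nullptr, " ", static_cast<const char *> (nullptr));
  return toy_reconcat (ret, ret, " ", "-mexample", static_cast<const char *> (nullptr));
}
```

With a NULL `ret` the appended pieces are silently dropped, which is why the
patch starts `ret` from a non-NULL string.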

gcc/

PR target/112759
* config/mips/driver-native.cc (host_detect_local_cpu):
initialize ret with concat, so that it can be used by
reconcat later.
---
 gcc/config/mips/driver-native.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/mips/driver-native.cc b/gcc/config/mips/driver-native.cc
index afc276f5278..471d1925eff 100644
--- a/gcc/config/mips/driver-native.cc
+++ b/gcc/config/mips/driver-native.cc
@@ -44,7 +44,7 @@ const char *
 host_detect_local_cpu (int argc, const char **argv)
 {
   const char *cpu = NULL;
-  char *ret = NULL;
+  char *ret = concat(" ", NULL);
   char buf[128];
   FILE *f;
   bool arch;
-- 
2.39.2



[PATCH] RISC-V: Add required_extensions in function_group

2023-12-17 Thread Feng Wang
To make it possible to add other vector-related extensions in the future, this
patch adds one more field to function_group_info. It is used to determine
whether the intrinsics of a function group need to be registered.
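
The idea can be sketched outside of GCC roughly as follows. This is only an
illustration under assumptions: `target_flags` is a hypothetical replacement
for the TARGET_* macros, the enum is cut down to three entries, and the group
names in the usage note are just examples.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the TARGET_VECTOR / TARGET_ZVBB / TARGET_ZVKB
// macros used in the real patch.
struct target_flags
{
  bool vector;
  bool zvbb;
  bool zvkb;
};

// Reduced version of the enum added by the patch.
enum required_ext
{
  VECTOR_EXT,
  ZVBB_EXT,
  ZVBB_OR_ZVKB_EXT
};

// Reduced function_group_info: only a name and the required extension.
struct function_group_info
{
  const char *base_name;
  required_ext required_extensions;

  // Return true if the extension this group needs is enabled.
  bool match (const target_flags &t) const
  {
    switch (required_extensions)
      {
      case VECTOR_EXT:
        return t.vector;
      case ZVBB_EXT:
        return t.zvbb;
      case ZVBB_OR_ZVKB_EXT:
        return t.zvbb || t.zvkb;
      }
    return false;
  }
};

// Register (here: just collect the names of) the groups whose required
// extension is enabled, mirroring the loop in handle_pragma_vector.
static std::vector<std::string>
register_groups (const std::vector<function_group_info> &groups,
                 const target_flags &t)
{
  std::vector<std::string> registered;
  for (const function_group_info &g : groups)
    if (g.match (t))
      registered.push_back (g.base_name);
  return registered;
}
```

For example, with the vector extension and Zvkb enabled but Zvbb disabled, a
group tagged ZVBB_EXT is skipped while groups tagged VECTOR_EXT or
ZVBB_OR_ZVKB_EXT are registered.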

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-functions.def (REQUIRED_EXTENSIONS):
Add new macro for match function.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_FUNCTION):
Add one more parameter for macro expansion.
(handle_pragma_vector): Add match function calls.
* config/riscv/riscv-vector-builtins.h (enum required_ext):
Add enum definition for required extension.
(struct function_group_info): Add one more parameter for checking
required-ext.
---
 .../riscv/riscv-vector-builtins-functions.def |  2 +
 gcc/config/riscv/riscv-vector-builtins.cc |  7 ++-
 gcc/config/riscv/riscv-vector-builtins.h  | 46 +++
 3 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index 1c37fd5fffe..03421d5bc10 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3. If not see
 #define DEF_RVV_FUNCTION(NAME, SHAPE, PREDS, OPS_INFO)
 #endif
 
+#define REQUIRED_EXTENSIONS VECTOR_EXT
 /* Internal helper functions for gimple fold use.  */
 DEF_RVV_FUNCTION (read_vl, read_vl, none_preds, p_none_void_ops)
 DEF_RVV_FUNCTION (vlenb, vlenb, none_preds, ul_none_void_ops)
@@ -650,5 +651,6 @@ DEF_RVV_FUNCTION (vsoxseg, seg_indexed_loadstore, 
none_m_preds, tuple_v_scalar_p
 DEF_RVV_FUNCTION (vsoxseg, seg_indexed_loadstore, none_m_preds, 
tuple_v_scalar_ptr_eew32_index_ops)
 DEF_RVV_FUNCTION (vsoxseg, seg_indexed_loadstore, none_m_preds, 
tuple_v_scalar_ptr_eew64_index_ops)
 DEF_RVV_FUNCTION (vlsegff, seg_fault_load, full_preds, 
tuple_v_scalar_const_ptr_size_ptr_ops)
+#undef REQUIRED_EXTENSIONS
 
 #undef DEF_RVV_FUNCTION
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 6330a3a41c3..4e2c66c2de7 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -2685,7 +2685,7 @@ static CONSTEXPR const function_type_info 
function_types[] = {
 /* A list of all RVV intrinsic functions.  */
 static function_group_info function_groups[] = {
 #define DEF_RVV_FUNCTION(NAME, SHAPE, PREDS, OPS_INFO) 
\
-  {#NAME, ::NAME, ::SHAPE, PREDS, OPS_INFO},
+  {#NAME, ::NAME, ::SHAPE, PREDS, OPS_INFO, REQUIRED_EXTENSIONS},
 #include "riscv-vector-builtins-functions.def"
 };
 
@@ -4413,7 +4413,10 @@ handle_pragma_vector ()
 = new hash_table (1023);
   function_builder builder;
   for (unsigned int i = 0; i < ARRAY_SIZE (function_groups); ++i)
-builder.register_function_group (function_groups[i]);
+  {
+if (function_groups[i].match (function_groups[i].required_extensions))
+  builder.register_function_group (function_groups[i]);
+  }
 }
 
 /* Return the function decl with RVV function subcode CODE, or error_mark_node
diff --git a/gcc/config/riscv/riscv-vector-builtins.h 
b/gcc/config/riscv/riscv-vector-builtins.h
index cd8ccab1724..4f38c09d73d 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -110,6 +110,21 @@ static const unsigned int CP_WRITE_CSR = 1U << 5;
 #define RVV_REQUIRE_MIN_VLEN_64 (1 << 5)   /* Require TARGET_MIN_VLEN >= 
64.  */
 #define RVV_REQUIRE_ELEN_FP_16 (1 << 6) /* Require FP ELEN >= 32.  */
 
+/* Enumerates the required extensions.  */
+enum required_ext
+{
+  VECTOR_EXT,   /* Vector extension */
+  ZVBB_EXT,/* Crypto vector Zvbb sub-ext */
+  ZVBB_OR_ZVKB_EXT, /* Crypto vector Zvbb or Zvkb sub-ext */
+  ZVBC_EXT,/* Crypto vector Zvbc sub-ext */
+  ZVKG_EXT,/* Crypto vector Zvkg sub-ext */
+  ZVKNED_EXT,  /* Crypto vector Zvkned sub-ext */
+  ZVKNHA_OR_ZVKNHB_EXT, /* Crypto vector Zvknh[ab] sub-ext */
+  ZVKNHB_EXT,  /* Crypto vector Zvknhb sub-ext */
+  ZVKSED_EXT,  /* Crypto vector Zvksed sub-ext */
+  ZVKSH_EXT,   /* Crypto vector Zvksh sub-ext */
+};
+
 /* Enumerates the RVV operand types.  */
 enum operand_type_index
 {
@@ -212,6 +227,35 @@ class function_shape;
 /* Static information about a set of functions.  */
 struct function_group_info
 {
+  /* Return true if required extension is enabled */
+  bool match (required_ext ext_value) const
+  {
+switch (ext_value)
+{
+  case VECTOR_EXT:
+return TARGET_VECTOR;
+  case ZVBB_EXT:
+return TARGET_ZVBB;
+  case ZVBB_OR_ZVKB_EXT:
+return (TARGET_ZVBB || TARGET_ZVKB);
+  case ZVBC_EXT:
+return TARGET_ZVBC;
+  case ZVKG_EXT:
+return TARGET_ZVKG;
+  case ZVKNED_EXT:
+return 

[PATCH] RISC-V: Fix natural regsize for fixed-vlmax of -march=rv64gc_zve32f

2023-12-17 Thread Juzhe-Zhong
This patch fixes 12 ICEs of "full coverage" testing:
Running target 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.dg/torture/pr96513.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (internal compiler error: 
Segmentation fault)
FAIL: gcc.dg/torture/pr96513.c   -O3 -g  (internal compiler error: Segmentation 
fault)

Running target 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.dg/torture/pr111048.c   -O2  (internal compiler error: Segmentation 
fault)
FAIL: gcc.dg/torture/pr111048.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (internal compiler error: 
Segmentation fault)
FAIL: gcc.dg/torture/pr111048.c   -O3 -g  (internal compiler error: 
Segmentation fault)

FAIL: gcc.dg/torture/pr96513.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (internal compiler error: 
Segmentation fault)
FAIL: gcc.dg/torture/pr96513.c   -O3 -g  (internal compiler error: Segmentation 
fault)

Running target 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.dg/torture/pr96513.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (internal compiler error: 
Segmentation fault)
FAIL: gcc.dg/torture/pr96513.c   -O3 -g  (internal compiler error: Segmentation 
fault)

Running target 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
FAIL: gcc.c-torture/execute/2801-1.c   -O2  (internal compiler error: 
Segmentation fault)
FAIL: gcc.c-torture/execute/2801-1.c   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (internal compiler 
error: Segmentation fault)
FAIL: gcc.c-torture/execute/2801-1.c   -O3 -g  (internal compiler error: 
Segmentation fault)

The root cause of those ICEs is that the vector register size is 32 bits, whereas
the scalar register size is 64 bits.
That is, vector regsize < scalar regsize on -march=rv64gc_zve32f with FIXED-VLMAX.

So using the scalar register size as the natural regsize is incorrect.
Instead, we should return the minimum of the vector regsize and the scalar
regsize.
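
As a rough illustration of the arithmetic only (not the GCC code), the sketch
below models sizes as plain byte counts. The function name and its parameters
are hypothetical; the real hook works on machine modes and poly_uint64 values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical simplification: return the smaller of the vector register
// size and the scalar register size (UNITS_PER_WORD in GCC), both in bytes.
constexpr std::uint64_t
natural_regsize (std::uint64_t vector_reg_bytes, std::uint64_t units_per_word)
{
  return std::min (vector_reg_bytes, units_per_word);
}
```

For -march=rv64gc_zve32f with fixed-vlmax, the vector register is 4 bytes and
the scalar register is 8 bytes, so the natural size becomes 4 bytes instead of 8.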

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_regmode_natural_size): Fix ICE for 
FIXED-VLMAX of -march=rv64gc_zve32f.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/bug-4.c: New test.
* gcc.target/riscv/rvv/autovec/bug-5.c: New test.
* gcc.target/riscv/rvv/autovec/bug-6.c: New test.

---
 gcc/config/riscv/riscv.cc | 12 --
 .../gcc.target/riscv/rvv/autovec/bug-4.c  | 27 
 .../gcc.target/riscv/rvv/autovec/bug-5.c  | 24 +++
 .../gcc.target/riscv/rvv/autovec/bug-6.c  | 42 +++
 4 files changed, 102 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-6.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 3fef1ab1514..8ae65760b6e 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -9621,10 +9621,10 @@ riscv_regmode_natural_size (machine_mode mode)
 
   if (riscv_v_ext_mode_p (mode))
 {
+  poly_uint64 size = GET_MODE_SIZE (mode);
   if (riscv_v_ext_tuple_mode_p (mode))
{
- poly_uint64 size
-   = GET_MODE_SIZE (riscv_vector::get_subpart_mode (mode));
+ size = GET_MODE_SIZE (riscv_vector::get_subpart_mode (mode));
  if (known_lt (size, BYTES_PER_RISCV_VECTOR))
return size;
}
@@ -9634,8 +9634,14 @@ riscv_regmode_natural_size (machine_mode mode)
  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
return BYTES_PER_RISCV_VECTOR;
}
-  if (!GET_MODE_SIZE (mode).is_constant ())
+  if (!size.is_constant ())
return BYTES_PER_RISCV_VECTOR;
+  else if (!riscv_v_ext_vls_mode_p (mode))
+   /* For -march=rv64gc_zve32f, the natural vector register size
+  is 32bits which is smaller than scalar register size, so we
+  return minimum size between vector register size and scalar
+  register size.  */
+   return MIN (size.to_constant (), UNITS_PER_WORD);
 }
   return UNITS_PER_WORD;
 }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-4.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-4.c
new file mode 100644
index 000..c860e92dc3a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-4.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zve32f -mabi=lp64d -O3 
--param=riscv-autovec-lmul=m8 

[Patchv2, rs6000] Clean up pre-checkings of expand_block_compare

2023-12-17 Thread HAO CHEN GUI
Hi,
  This patch cleans up the pre-checks of expand_block_compare. It does the
following:
1. Asserts that only P7 and above can enter this function, as that is already
guarded by the expander.
2. Returns false when optimizing for size.
3. Removes the P7 processor test, as only P7 and above can enter this function
and P7 LE is excluded by targetm.slow_unaligned_access. On P7 BE, the
performance of the expansion is better than that of the library call when
the length is long.

  Compared to the last version,
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640082.html
the main change is to add some comments and move the variable definitions
closer to their use.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions. Is this OK for trunk?

Thanks
Gui Haochen

ChangeLog
rs6000: Clean up the pre-checkings of expand_block_compare

gcc/
* gcc/config/rs6000/rs6000-string.cc (expand_block_compare): Assert
only P7 above can enter this function.  Return false (call library)
when it's optimized for size.  Remove P7 CPU test as only P7 above
can enter this function and P7 LE is excluded by the checking of
targetm.slow_unaligned_access on word_mode.  Also performance test
shows the expand of block compare with 16 bytes to 64 bytes length
is better than library on P7 BE.

gcc/testsuite/
* gcc.target/powerpc/block-cmp-3.c: New.


patch.diff
diff --git a/gcc/config/rs6000/rs6000-string.cc 
b/gcc/config/rs6000/rs6000-string.cc
index cb9eeef05d8..49670cef4d7 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -1946,36 +1946,32 @@ expand_block_compare_gpr(unsigned HOST_WIDE_INT bytes, 
unsigned int base_align,
 bool
 expand_block_compare (rtx operands[])
 {
-  rtx target = operands[0];
-  rtx orig_src1 = operands[1];
-  rtx orig_src2 = operands[2];
-  rtx bytes_rtx = operands[3];
-  rtx align_rtx = operands[4];
+  /* TARGET_POPCNTD is already guarded at expand cmpmemsi.  */
+  gcc_assert (TARGET_POPCNTD);

-  /* This case is complicated to handle because the subtract
- with carry instructions do not generate the 64-bit
- carry and so we must emit code to calculate it ourselves.
- We choose not to implement this yet.  */
-  if (TARGET_32BIT && TARGET_POWERPC64)
+  if (optimize_insn_for_size_p ())
 return false;

-  bool isP7 = (rs6000_tune == PROCESSOR_POWER7);
-
   /* Allow this param to shut off all expansion.  */
   if (rs6000_block_compare_inline_limit == 0)
 return false;

-  /* targetm.slow_unaligned_access -- don't do unaligned stuff.
- However slow_unaligned_access returns true on P7 even though the
- performance of this code is good there.  */
-  if (!isP7
-  && (targetm.slow_unaligned_access (word_mode, MEM_ALIGN (orig_src1))
- || targetm.slow_unaligned_access (word_mode, MEM_ALIGN (orig_src2
+  /* This case is complicated to handle because the subtract
+ with carry instructions do not generate the 64-bit
+ carry and so we must emit code to calculate it ourselves.
+ We choose not to implement this yet.  */
+  if (TARGET_32BIT && TARGET_POWERPC64)
 return false;

-  /* Unaligned l*brx traps on P7 so don't do this.  However this should
- not affect much because LE isn't really supported on P7 anyway.  */
-  if (isP7 && !BYTES_BIG_ENDIAN)
+  rtx target = operands[0];
+  rtx orig_src1 = operands[1];
+  rtx orig_src2 = operands[2];
+  rtx bytes_rtx = operands[3];
+  rtx align_rtx = operands[4];
+
+  /* targetm.slow_unaligned_access -- don't do unaligned stuff.  */
+if (targetm.slow_unaligned_access (word_mode, MEM_ALIGN (orig_src1))
+   || targetm.slow_unaligned_access (word_mode, MEM_ALIGN (orig_src2)))
 return false;

   /* If this is not a fixed size compare, try generating loop code and
@@ -2023,14 +2019,6 @@ expand_block_compare (rtx operands[])
   if (!IN_RANGE (bytes, 1, max_bytes))
 return expand_compare_loop (operands);

-  /* The code generated for p7 and older is not faster than glibc
- memcmp if alignment is small and length is not short, so bail
- out to avoid those conditions.  */
-  if (targetm.slow_unaligned_access (word_mode, UINTVAL (align_rtx))
-  && ((base_align == 1 && bytes > 16)
- || (base_align == 2 && bytes > 32)))
-return false;
-
   rtx final_label = NULL;

   if (use_vec)
diff --git a/gcc/testsuite/gcc.target/powerpc/block-cmp-3.c 
b/gcc/testsuite/gcc.target/powerpc/block-cmp-3.c
new file mode 100644
index 000..c7e853ad593
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/block-cmp-3.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+/* { dg-final { scan-assembler-times {\mb[l]? memcmp\M} 1 } }  */
+
+int foo (const char* s1, const char* s2)
+{
+  return __builtin_memcmp (s1, s2, 4);
+}


[Patchv2, rs6000] Correct definition of macro of fixed point efficient unaligned

2023-12-17 Thread HAO CHEN GUI
Hi,
  The patch corrects the definition of
TARGET_EFFICIENT_OVERLAPPING_UNALIGNED by replacing it with a call to
slow_unaligned_access.

  Compared with the last version,
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640076.html
the main change is to replace the macro with slow_unaligned_access.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions. Is this OK for trunk?

Thanks
Gui Haochen

ChangeLog
rs6000: Correct definition of macro of fixed point efficient unaligned

The macro TARGET_EFFICIENT_OVERLAPPING_UNALIGNED is used in rs6000-string.cc to
guard platforms on which fixed-point unaligned load/store is efficient.
It was originally defined as TARGET_EFFICIENT_UNALIGNED_VSX, which is enabled
from P8 and can be disabled by the -mno-vsx option, so the definition is wrong.
This patch corrects the problem by calling slow_unaligned_access to judge whether
fixed-point unaligned load/store is efficient.
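
To show what the guarded decision looks like, here is a minimal sketch of the
"move the last load back so it overlaps" trick that this check controls. It is
an assumption-laden model, not the rs6000 code: `load_plan`, `plan_tail_load`
and the boolean `slow_unaligned` are hypothetical, and the fallback (largest
power-of-two load that fits) is a simplification.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one load in a block compare.
struct load_plan
{
  std::uint64_t offset;   // where the load starts
  unsigned size;          // how many bytes it reads
  bool overlapping;       // true if it re-reads already compared bytes
};

// Plan the load for the last BYTES (>= 1) bytes at OFFSET, given a preferred
// load width LOAD_SIZE.  SLOW_UNALIGNED stands in for the result of
// targetm.slow_unaligned_access.
static load_plan
plan_tail_load (std::uint64_t offset, unsigned bytes, unsigned load_size,
                bool slow_unaligned)
{
  if (bytes >= load_size)
    return {offset, load_size, false};

  // Moving back by (load_size - bytes) makes the load end exactly at the
  // end of the block; this is only worthwhile if unaligned access is fast
  // and there are enough earlier bytes to overlap with.
  unsigned back = load_size - bytes;
  if (!slow_unaligned && offset >= back)
    return {offset - back, load_size, true};

  // Simplified fallback: the largest power-of-two load that fits.
  unsigned size = 1;
  while (size * 2 <= bytes)
    size *= 2;
  return {offset, size, false};
}
```

With 3 bytes left at offset 5 and a 4-byte load, fast unaligned access gives
one overlapping load at offset 4; otherwise a narrower load at offset 5 is used.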

gcc/
* config/rs6000/rs6000.h (TARGET_EFFICIENT_OVERLAPPING_UNALIGNED):
Remove.
* config/rs6000/rs6000-string.cc (select_block_compare_mode):
Replace TARGET_EFFICIENT_OVERLAPPING_UNALIGNED with
targetm.slow_unaligned_access.
(expand_block_compare_gpr): Likewise.
(expand_block_compare): Likewise.
(expand_strncmp_gpr_sequence): Likewise.

gcc/testsuite/
* gcc.target/powerpc/block-cmp-1.c: New.
* gcc.target/powerpc/block-cmp-2.c: New.

patch.diff
diff --git a/gcc/config/rs6000/rs6000-string.cc 
b/gcc/config/rs6000/rs6000-string.cc
index 44a946cd453..cb9eeef05d8 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -305,7 +305,7 @@ select_block_compare_mode (unsigned HOST_WIDE_INT offset,
   else if (bytes == GET_MODE_SIZE (QImode))
 return QImode;
   else if (bytes < GET_MODE_SIZE (SImode)
-  && TARGET_EFFICIENT_OVERLAPPING_UNALIGNED
+  && !targetm.slow_unaligned_access (SImode, align)
   && offset >= GET_MODE_SIZE (SImode) - bytes)
 /* This matches the case were we have SImode and 3 bytes
and offset >= 1 and permits us to move back one and overlap
@@ -313,7 +313,7 @@ select_block_compare_mode (unsigned HOST_WIDE_INT offset,
unwanted bytes off of the input.  */
 return SImode;
   else if (word_mode_ok && bytes < UNITS_PER_WORD
-  && TARGET_EFFICIENT_OVERLAPPING_UNALIGNED
+  && !targetm.slow_unaligned_access (word_mode, align)
   && offset >= UNITS_PER_WORD-bytes)
 /* Similarly, if we can use DImode it will get matched here and
can do an overlapping read that ends at the end of the block.  */
@@ -1749,7 +1749,7 @@ expand_block_compare_gpr(unsigned HOST_WIDE_INT bytes, 
unsigned int base_align,
   load_mode_size = GET_MODE_SIZE (load_mode);
   if (bytes >= load_mode_size)
cmp_bytes = load_mode_size;
-  else if (TARGET_EFFICIENT_OVERLAPPING_UNALIGNED)
+  else if (!targetm.slow_unaligned_access (load_mode, align))
{
  /* Move this load back so it doesn't go past the end.
 P8/P9 can do this efficiently.  */
@@ -2026,7 +2026,7 @@ expand_block_compare (rtx operands[])
   /* The code generated for p7 and older is not faster than glibc
  memcmp if alignment is small and length is not short, so bail
  out to avoid those conditions.  */
-  if (!TARGET_EFFICIENT_OVERLAPPING_UNALIGNED
+  if (targetm.slow_unaligned_access (word_mode, UINTVAL (align_rtx))
   && ((base_align == 1 && bytes > 16)
  || (base_align == 2 && bytes > 32)))
 return false;
@@ -2168,7 +2168,7 @@ expand_strncmp_gpr_sequence (unsigned HOST_WIDE_INT 
bytes_to_compare,
   load_mode_size = GET_MODE_SIZE (load_mode);
   if (bytes_to_compare >= load_mode_size)
cmp_bytes = load_mode_size;
-  else if (TARGET_EFFICIENT_OVERLAPPING_UNALIGNED)
+  else if (!targetm.slow_unaligned_access (load_mode, align))
{
  /* Move this load back so it doesn't go past the end.
 P8/P9 can do this efficiently.  */
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 326c45221e9..3971a56c588 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -483,10 +483,6 @@ extern int rs6000_vector_align[];
 #define TARGET_NO_SF_SUBREGTARGET_DIRECT_MOVE_64BIT
 #define TARGET_ALLOW_SF_SUBREG (!TARGET_DIRECT_MOVE_64BIT)

-/* This wants to be set for p8 and newer.  On p7, overlapping unaligned
-   loads are slow. */
-#define TARGET_EFFICIENT_OVERLAPPING_UNALIGNED TARGET_EFFICIENT_UNALIGNED_VSX
-
 /* Byte/char syncs were added as phased in for ISA 2.06B, but are not present
in power7, so conditionalize them on p8 features.  TImode syncs need quad
memory support.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/block-cmp-1.c 
b/gcc/testsuite/gcc.target/powerpc/block-cmp-1.c
new file mode 100644
index 000..bcf0cb2ab4f
--- /dev/null
+++ 

Re: [pushed][PATCH v3 0/2] LoongArch D support

2023-12-17 Thread chenglulu

Pushed to r14-6648 r14-6649 and r14-6650.

Thanks.

On 2023/12/8 6:09 PM, Yang Yujie wrote:

This patchset is based on Zixing Liu's initial support patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631260.html

Updates
v1 -> v2: Rebased onto the dmd/druntime upstream state.
v2 -> v3: Dropped unnecessary changes.

Regtested on loongarch64-linux-gnu with the following result:

 === libphobos Summary ===

FAIL: libphobos.config/test22523.d -- --DRT-testmode=run-main execution test
FAIL: libphobos.gc/precisegc.d execution test
FAIL: libphobos.phobos/std/datetime/systime.d (test for excess errors)
UNRESOLVED: libphobos.phobos/std/datetime/systime.d compilation failed to 
produce executable
UNSUPPORTED: libphobos.phobos/std/net/curl.d: skipped test
UNSUPPORTED: libphobos.phobos_shared/std/net/curl.d: skipped test
FAIL: libphobos.shared/loadDR.c -ldl -pthread -g execution test (out-of-tree 
testing)

# of expected passes            1024
# of unexpected failures        4
# of unresolved testcases       1
# of unsupported tests          2

 === gdc Summary ===

FAIL: gdc.test/runnable/testaa.d   execution test
FAIL: gdc.test/runnable/testaa.d -fPIC   execution test

# of expected passes            10353
# of unexpected failures        2
# of unsupported tests          631


Yang Yujie (2):
   libruntime: Add fiber context switch code for LoongArch.
   libphobos: Update build scripts for LoongArch64.

  libphobos/configure   |  21 ++-
  libphobos/libdruntime/Makefile.am |   3 +
  libphobos/libdruntime/Makefile.in |  98 -
  .../config/loongarch/switchcontext.S  | 133 ++
  libphobos/m4/druntime/cpu.m4  |   5 +
  5 files changed, 220 insertions(+), 40 deletions(-)
  create mode 100644 libphobos/libdruntime/config/loongarch/switchcontext.S





Re: Re: [PATCH] RISC-V: Add viota missed avl_type attribute

2023-12-17 Thread Li Xu
Committed, thanks juzhe.



xu...@eswincomputing.com
 
From: juzhe.zhong
Date: 2023-12-18 09:08
To: Li Xu
CC: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; pal...@dabbelt.com
Subject: Re: [PATCH] RISC-V: Add viota missed avl_type attribute
lgtm
Replied Message
From: Li Xu
Date: 12/18/2023 09:04
To: gcc-patc...@gcc.gnu.org
CC: kito.ch...@gmail.com, pal...@dabbelt.com, juzhe.zh...@rivai.ai
Subject: [PATCH] RISC-V: Add viota missed avl_type attribute


Re: [RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64

2023-12-17 Thread Wang
On 2023/12/14 03:35, Kees Cook wrote:
> On Wed, Dec 13, 2023 at 05:01:07PM +0800, Wang wrote:
>> On 2023/12/13 16:48, Dan Li wrote:
>>> + Likun
>>>
>>> On Tue, 28 Mar 2023 at 06:18, Sami Tolvanen  wrote:
>>>> On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra wrote:
>>>>> On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote:
>>>>>
>>>>>> In the compiler part[4], most of the content is the same as Sami's
>>>>>> implementation[3], except for some minor differences, mainly including:
>>>>>>
>>>>>> 1. The function typeid is calculated differently and it is difficult
>>>>>> to be consistent.
>>>>> This means there is an effective ABI break between the compilers, which
>>>>> is sad :-( Is there really nothing to be done about this?
>>>> I agree, this would be unfortunate, and would also be a compatibility
>>>> issue with rustc where there's ongoing work to support
>>>> clang-compatible CFI type hashes:
>>>>
>>>> https://github.com/rust-lang/rust/pull/105452
>>>>
>>>> Sami
>>
>>
>> Hi Peter and Sami
>>
>> I am Dan Li's colleague, and I will take over and continue the work of CFI.
> 
> Welcome; this is great news! :) Thanks for picking up the work.
> 
>>
>> Regarding the issue of gcc cfi type id being compatible with clang, we
>> have analyzed and verified:
>>
>> 1. clang uses Mangling defined in Itanium C++ ABI to encode the function
>> prototype, and uses the encoding result as input to generate cfi type id;
>> 2. Currently, gcc only implements mangling for the C++ compiler, and the
>> function prototype coding generated by these interfaces is compatible
>> with clang, but gcc's c compiler does not support mangling;
>>
>> Adding mangling to gcc's c compiler is a huge and difficult task, because
>> we have to refactor the mangling of C++, splitting it into basic
>> mangling and language specific mangling, and adding support for the c
>> language which requires a deep understanding of the compiler and
>> language processing parts.
>>
>> And for the kernel cfi, I suggest separating type compatibility from CFI
>> basic functions. Type compatibility is independent from CFI basic
>> functions and should be dealt with under another topic. Should we focus
>> on the main issues of cfi, and let it work first on linux kernel, and
>> leave the compatibility issue to be solved later?
> 
> If you mean keeping the hashes identical between Clang/LLVM and GCC,
> I think this is going to be a requirement due to adding Rust to the
> build environment (which uses the LLVM mangling and hashing).
> 
> FWIW, I think the subset of type mangling needed isn't the entire C++
> language spec, so it shouldn't be hard to add this to GCC.
> 
> -Kees
> 

Thanks Kees, I will first try to implement a simple interface based on 
mangle to generate cfi type id.

Likun Wang

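Notice: This email is intended only for its designated recipient and is highly confidential. Others are prohibited from using, opening, copying, or forwarding any of its contents. If this email was sent to you in error, please contact the sender and delete it. Confidentiality and legal privilege are not waived or lost because of misdelivery. Any views or opinions expressed are solely those of the author and do not necessarily represent those of the company.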

Re: [PATCH] RISC-V: Add viota missed avl_type attribute

2023-12-17 Thread juzhe.zhong
lgtm
Replied Message
From: Li Xu
Date: 12/18/2023 09:04
To: gcc-patches@gcc.gnu.org
CC: kito.ch...@gmail.com, pal...@dabbelt.com, juzhe.zh...@rivai.ai
Subject: [PATCH] RISC-V: Add viota missed avl_type attribute


Re: [RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64

2023-12-17 Thread Wang
On 2023/12/13 22:45, Mark Rutland wrote:
> On Wed, Dec 13, 2023 at 05:01:07PM +0800, Wang wrote:
>> On 2023/12/13 16:48, Dan Li wrote:
>>> + Likun
>>>
>>> On Tue, 28 Mar 2023 at 06:18, Sami Tolvanen wrote:
>>>> On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra wrote:
>>>>> On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote:
>>>>>
>>>>>> In the compiler part[4], most of the content is the same as Sami's
>>>>>> implementation[3], except for some minor differences, mainly including:
>>>>>>
>>>>>> 1. The function typeid is calculated differently and it is difficult
>>>>>> to be consistent.
>>>>> This means there is an effective ABI break between the compilers, which
>>>>> is sad :-( Is there really nothing to be done about this?
>>>> I agree, this would be unfortunate, and would also be a compatibility
>>>> issue with rustc where there's ongoing work to support
>>>> clang-compatible CFI type hashes:
>>>>
>>>> https://github.com/rust-lang/rust/pull/105452
>>>>
>>>> Sami
>>
>> Hi Peter and Sami
>>
>> I am Dan Li's colleague, and I will take over and continue the work of CFI.
>>
>> Regarding the issue of gcc cfi type id being compatible with clang, we
>> have analyzed and verified:
>>
>> 1. clang uses Mangling defined in Itanium C++ ABI to encode the function
>> prototype, and uses the encoding result as input to generate cfi type id;
>> 2. Currently, gcc only implements mangling for the C++ compiler, and the
>> function prototype coding generated by these interfaces is compatible
>> with clang, but gcc's c compiler does not support mangling;
>>
>> Adding mangling to gcc's c compiler is a huge and difficult task, because
>> we have to refactor the mangling of C++, splitting it into basic
>> mangling and language specific mangling, and adding support for the c
>> language which requires a deep understanding of the compiler and
>> language processing parts.
>>
>> And for the kernel cfi, I suggest separating type compatibility from CFI
>> basic functions. Type compatibility is independent from CFI basic
>> functions and should be dealt with under another topic. Should we focus
>> on the main issues of cfi, and let it work first on linux kernel, and
>> leave the compatibility issue to be solved later?
> 
> I'm not sure what you're suggesting here exactly, do you mean to add a type ID
> scheme that's incompatible with clang, leaving everything else the same? If so,
> what sort of scheme are you proposing?
> 
> It seems unfortunate to have a different scheme, but IIUC we expect all kernel
> objects to be built with the same compiler.
> 
> Mark.

Thanks Mark, I will consider a scheme that is compatible with clang.

Likun Wang.


[PATCH] RISC-V: Add viota missed avl_type attribute

2023-12-17 Thread Li Xu
From: Juzhe-Zhong 

This patch fixes the following FAIL when LMUL = 8:

riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medany/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=scalable
FAIL: gcc.dg/vect/slp-multitypes-2.c execution test

The root cause is that the avl_type attribute was missing for viota, so we end up
with an incorrect vsetvl configuration:

vsetvli zero,a2,e64,m8,ta,ma
viota.m v16,v0

The value in 'a2' is garbage.

After this patch:

vsetvli a4,zero,e64,m8,ta,ma
viota.m v16,v0

gcc/ChangeLog:

* config/riscv/vector.md: Add viota avl_type attribute.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/bug-2.c: New test.

---
 gcc/config/riscv/vector.md|  2 +-
 .../gcc.target/riscv/rvv/autovec/bug-2.c  | 75 +++
 2 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-2.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index a1284fd3251..7646615b12a 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -831,7 +831,7 @@
  vfsqrt,vfrecp,vfmerge,vfcvtitof,vfcvtftoi,vfwcvtitof,\
  
vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,\
  vfclass,vired,viwred,vfredu,vfredo,vfwredu,vfwredo,\
- vimovxv,vfmovfv,vlsegde,vlsegdff")
+ vimovxv,vfmovfv,vlsegde,vlsegdff,vmiota")
   (const_int 7)
 (eq_attr "type" "vldm,vstm,vmalu,vmalu")
   (const_int 5)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-2.c
new file mode 100644
index 000..9ff93d3b163
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-2.c
@@ -0,0 +1,75 @@
+/* { dg-do run } */
+/* { dg-require-effective-target riscv_v } */
+/* { dg-options "--param=riscv-autovec-lmul=m8 
--param=riscv-autovec-preference=scalable -ftree-vectorize 
-fno-tree-loop-distribute-patterns -fno-vect-cost-model -fno-common -O2" } */
+
+#define N 128 
+
+__attribute__ ((noinline)) int
+main1 (unsigned short a0, unsigned short a1, unsigned short a2, 
+   unsigned short a3, unsigned short a4, unsigned short a5,
+   unsigned short a6, unsigned short a7, unsigned short a8,
+   unsigned short a9, unsigned short a10, unsigned short a11,
+   unsigned short a12, unsigned short a13, unsigned short a14,
+   unsigned short a15, unsigned char b0, unsigned char b1)
+{
+  int i;
+  unsigned short out[N*16];
+  unsigned char out2[N*16];
+
+  for (i = 0; i < N; i++)
+{
+  out[i*16] = a8;
+  out[i*16 + 1] = a7;
+  out[i*16 + 2] = a1;
+  out[i*16 + 3] = a2;
+  out[i*16 + 4] = a8;
+  out[i*16 + 5] = a5;
+  out[i*16 + 6] = a5;
+  out[i*16 + 7] = a4;
+  out[i*16 + 8] = a12;
+  out[i*16 + 9] = a13;
+  out[i*16 + 10] = a14;
+  out[i*16 + 11] = a15;
+  out[i*16 + 12] = a6;
+  out[i*16 + 13] = a9;
+  out[i*16 + 14] = a0;
+  out[i*16 + 15] = a7;
+
+  out2[i*2] = b1;
+  out2[i*2+1] = b0;
+}
+
+  /* check results:  */
+#pragma GCC novector
+  for (i = 0; i < N; i++)
+{
+  if (out[i*16] != a8
+  || out[i*16 + 1] != a7
+  || out[i*16 + 2] != a1
+  || out[i*16 + 3] != a2
+  || out[i*16 + 4] != a8
+  || out[i*16 + 5] != a5
+  || out[i*16 + 6] != a5
+  || out[i*16 + 7] != a4
+  || out[i*16 + 8] != a12
+  || out[i*16 + 9] != a13
+  || out[i*16 + 10] != a14
+  || out[i*16 + 11] != a15
+  || out[i*16 + 12] != a6
+  || out[i*16 + 13] != a9
+  || out[i*16 + 14] != a0
+  || out[i*16 + 15] != a7
+  || out2[i*2] != b1
+  || out2[i*2 + 1] != b0)
+__builtin_abort ();
+}
+
+  return 0;
+}
+
+int main (void)
+{
+  main1 (15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0,20,21);
+
+  return 0;
+}
-- 
2.36.3



Re: [PATCH v1] RISC-V: Fix POLY INT handle bug

2023-12-17 Thread juzhe.zhong
lgtm.

---- Replied Message ----
From: pan2...@intel.com
Date: 12/18/2023 08:22
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai, pan2...@intel.com, yanzhang.w...@intel.com, kito.ch...@gmail.com
Subject: [PATCH v1] RISC-V: Fix POLY INT handle bug


[PATCH v1] RISC-V: Fix POLY INT handle bug

2023-12-17 Thread pan2 . li
From: Pan Li 

This patch fixes the following FAIL:
Running target
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
FAIL: gcc.dg/vect/fast-math-vect-complex-3.c execution test

The root cause is that we generate incorrect code for (const_poly_int:DI
[549755813888, 549755813888]), because the coefficients were held in a
32-bit int.

Before this patch:

li  a7,0
vmv.v.x v0,a7

After this patch:

csrr a2,vlenb
slli a2,a2,33
vmv.v.x v0,a2
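
The "before" sequence is what one would expect if the coefficient lost its upper bits on the way to the expander. The following is a minimal sketch of that arithmetic, not GCC code: `narrow_to_32` is a hypothetical helper standing in for the old `int` variables, and the 64-byte vector length is an assumption taken from `zvl512b` in the new test's options.

```cpp
#include <cassert>
#include <cstdint>

// Coefficient from the failing const_poly_int: 549755813888 == 2^39.
constexpr std::int64_t kCoeff = 549755813888LL;

// Hypothetical stand-in for storing the coefficient in a 32-bit int:
// only the low 32 bits survive.
constexpr std::int32_t narrow_to_32(std::int64_t v) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(v));
}

// 2^39 has no bits set in its low 32 bits, so the narrowed value is 0,
// which is consistent with the "li a7,0" in the old output.
static_assert(narrow_to_32(kCoeff) == 0, "low 32 bits of 2^39 are zero");

// Assumption: the compile-time vlenb coefficient is 64 bytes (zvl512b).
// Then 2^39 == 64 << 33, consistent with the shift in "slli a2,a2,33".
constexpr std::int64_t kAssumedVlenb = 64;
static_assert(kCoeff == (kAssumedVlenb << 33), "2^39 == 2^6 * 2^33");
```

Keeping the value in a 64-bit type throughout, as the patch does with HOST_WIDE_INT, avoids the narrowing step entirely.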

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_expand_mult_with_const_int):
Change int into HOST_WIDE_INT.
(riscv_legitimize_poly_move): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/bug-3.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 10 +++--
 .../gcc.target/riscv/rvv/autovec/bug-3.c  | 39 +++
 2 files changed, 45 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f60726711e8..3fef1ab1514 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2371,7 +2371,7 @@ riscv_expand_op (enum rtx_code code, machine_mode mode, rtx op0, rtx op1,
 
 static void
 riscv_expand_mult_with_const_int (machine_mode mode, rtx dest, rtx multiplicand,
- int multiplier)
+ HOST_WIDE_INT multiplier)
 {
   if (multiplier == 0)
 {
@@ -2380,7 +2380,7 @@ riscv_expand_mult_with_const_int (machine_mode mode, rtx dest, rtx multiplicand,
 }
 
   bool neg_p = multiplier < 0;
-  int multiplier_abs = abs (multiplier);
+  unsigned HOST_WIDE_INT multiplier_abs = abs (multiplier);
 
   if (multiplier_abs == 1)
 {
@@ -2475,8 +2475,10 @@ void
 riscv_legitimize_poly_move (machine_mode mode, rtx dest, rtx tmp, rtx src)
 {
   poly_int64 value = rtx_to_poly_int64 (src);
-  int offset = value.coeffs[0];
-  int factor = value.coeffs[1];
+  /* Use HOST_WIDE_INT instead of int since a 32-bit type is not enough
+ for e.g. (const_poly_int:DI [549755813888, 549755813888]).  */
+  HOST_WIDE_INT offset = value.coeffs[0];
+  HOST_WIDE_INT factor = value.coeffs[1];
   int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];
   int div_factor = 0;
   /* Calculate (const_poly_int:MODE [m, n]) using scalar instructions.
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c
new file mode 100644
index 000..643e91b918e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/bug-3.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=scalable -fno-vect-cost-model -O2 -ffast-math" } */
+
+#define N 16
+
+_Complex float a[N] =
+{ 10.0F + 20.0iF, 11.0F + 21.0iF, 12.0F + 22.0iF, 13.0F + 23.0iF,
+  14.0F + 24.0iF, 15.0F + 25.0iF, 16.0F + 26.0iF, 17.0F + 27.0iF,
+  18.0F + 28.0iF, 19.0F + 29.0iF, 20.0F + 30.0iF, 21.0F + 31.0iF,
+  22.0F + 32.0iF, 23.0F + 33.0iF, 24.0F + 34.0iF, 25.0F + 35.0iF };
+_Complex float b[N] =
+{ 30.0F + 40.0iF, 31.0F + 41.0iF, 32.0F + 42.0iF, 33.0F + 43.0iF,
+  34.0F + 44.0iF, 35.0F + 45.0iF, 36.0F + 46.0iF, 37.0F + 47.0iF,
+  38.0F + 48.0iF, 39.0F + 49.0iF, 40.0F + 50.0iF, 41.0F + 51.0iF,
+  42.0F + 52.0iF, 43.0F + 53.0iF, 44.0F + 54.0iF, 45.0F + 55.0iF };
+
+_Complex float c[N];
+_Complex float res[N] =
+{ -500.0F + 1000.0iF, -520.0F + 1102.0iF,
+  -540.0F + 1208.0iF, -560.0F + 1318.0iF,
+  -580.0F + 1432.0iF, -600.0F + 1550.0iF,
+  -620.0F + 1672.0iF, -640.0F + 1798.0iF,
+  -660.0F + 1928.0iF, -680.0F + 2062.0iF,
+  -700.0F + 2200.0iF, -720.0F + 2342.0iF,
+  -740.0F + 2488.0iF, -760.0F + 2638.0iF,
+  -780.0F + 2792.0iF, -800.0F + 2950.0iF };
+
+
+void
+foo (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+c[i] = a[i] * b[i];
+}
+
+/* { dg-final { scan-assembler-not {li\s+[a-x0-9]+,\s*0} } } */
+/* { dg-final { scan-assembler-times {slli\s+[a-x0-9]+,\s*[a-x0-9]+,\s*33} 1 } } */
-- 
2.34.1



Re: [PATCH] c-family: Use -Wdiscarded-qualifiers for ignored qualifiers in __atomic_*

2023-12-17 Thread Jonathan Wakely
On Sun, 17 Dec 2023 at 15:38, Florian Weimer  wrote:
>
> This matches other compiler diagnostics.  No test updates are needed
> because c-c++-common/pr95378.c does not match a specific -W option.
>
> Fixes commit d2384b7b24f8557b66f6958a05ea99ff4307e75c ("c-family:
> check qualifiers of arguments to __atomic built-ins (PR 95378)").
>
> gcc/c-family/
>
> PR c/113050
> * c-common.cc (get_atomic_generic_size): Use
> OPT_Wdiscarded_qualifiers instead of
> OPT_Wincompatible_pointer_types.
>
> ---
>
> Jonathan, I assume this was just an oversight in your patch, and there
> is no fundamental reason to use -Wincompatible-pointer-types here?

You are correct. I think I forgot about -Wdiscarded-qualifiers at the time.



[PATCH] c++: Check null pointer deref when calling memfn in constexpr [PR102420]

2023-12-17 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

An alternative approach for the lambda issue would be to modify
'maybe_add_lambda_conv_op' to not pass a null pointer, but I wasn't sure
what the best approach for that would be.

-- >8 --

Calling a non-static member function on a null pointer is undefined
behaviour (see [expr.ref] p8) and should error in constant evaluation,
even if the 'this' pointer is never actually accessed within that
function.

One catch is that currently, the function pointer conversion operator
for a lambda passes a null pointer as the 'this' pointer to the underlying
'operator()', so for now we ignore such calls.
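
To illustrate the two cases described above, here is a small sketch (not part of the patch or its testsuite). It assumes C++17, where a captureless lambda's conversion to a function pointer is usable in constant expressions; the names `obj`, `g` and `fp` are made up for the example.

```cpp
#include <cassert>

struct X {
  constexpr int f() const { return 0; }
};

constexpr int g(const X *x) { return x->f(); }

// Fine: the object exists, so the member call is a constant expression.
constexpr X obj{};
static_assert(g(&obj) == 0, "call through a valid object pointer");

// With the check described above, this would be rejected during
// constant evaluation even though f() never reads through 'this':
//   constexpr int t = g(nullptr);

// Calls through a lambda's function pointer conversion must keep
// working, which is why the thunk case is exempted from the check.
constexpr int (*fp)(int) = [](int v) { return v + 1; };
static_assert(fp(41) == 42, "call through lambda conversion");
```

The commented-out line mirrors the new test case; it is left disabled so the sketch compiles on compilers with or without the change.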

PR c++/102420

gcc/cp/ChangeLog:

* constexpr.cc (cxx_bind_parameters_in_call): Check for calling
non-static member functions with a null pointer.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-memfn2.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/constexpr.cc   | 17 +
 gcc/testsuite/g++.dg/cpp0x/constexpr-memfn2.C | 10 ++
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-memfn2.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 051f73fb73f..9c18538b302 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -1884,6 +1884,23 @@ cxx_bind_parameters_in_call (const constexpr_ctx *ctx, tree t, tree fun,
 TARGET_EXPR, and use its CONSTRUCTOR as the value of the parm.  */
   arg = cxx_eval_constant_expression (ctx, x, vc_prvalue,
  non_constant_p, overflow_p);
+  /* Check we aren't dereferencing a null pointer when calling a non-static
+member function, which is undefined behaviour.  */
+  if (i == 0 && DECL_NONSTATIC_MEMBER_FUNCTION_P (fun)
+ && integer_zerop (arg)
+ /* But ignore calls from within the lambda function pointer
+conversion thunk, since this currently passes a null pointer.  */
+ && !(TREE_CODE (t) == CALL_EXPR
+  && CALL_FROM_THUNK_P (t)
+  && ctx->call
+  && ctx->call->fundef
+  && lambda_static_thunk_p (ctx->call->fundef->decl)))
+   {
+ if (!ctx->quiet)
+   error_at (cp_expr_loc_or_input_loc (x),
+ "dereferencing a null pointer");
+ *non_constant_p = true;
+   }
   /* Don't VERIFY_CONSTANT here.  */
   if (*non_constant_p && ctx->quiet)
break;
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-memfn2.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-memfn2.C
new file mode 100644
index 000..4749190a1f0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-memfn2.C
@@ -0,0 +1,10 @@
+// PR c++/102420
+// { dg-do compile { target c++11 } }
+
+struct X {
+  constexpr int f() { return 0; }
+};
+constexpr int g(X* x) {
+return x->f();  // { dg-error "dereferencing a null pointer" }
+}
+constexpr int t = g(nullptr);  // { dg-message "in .constexpr. expansion" }
-- 
2.42.0



Re: [RFC][V2] RISC-V: Support -mcmodel=large.

2023-12-17 Thread Jeff Law




On 11/10/23 02:10, KuanLin Chen wrote:
> Sorry. It missed a semicolon in the previous patch. Please find the new
> one in the attachment. Thanks.
Thanks.  I was going to do some final testing with the plan to integrate 
this patch today, but I think there's a piece missing.  Specifically I 
think it's missing a definition for riscv_asm_output_pool_epilogue.


Can you please send an updated patch that includes that function?

Thanks,
Jeff


[PATCH 5/5] OpenMP: Add prettyprinter support for context selectors.

2023-12-17 Thread Sandra Loosemore
With the change to use enumerators instead of strings to represent
context selector and selector-set names, the default tree-list output
for dumping selectors is less helpful for debugging and harder to use
in test cases.  This patch adds support for dumping context selectors
using syntax similar to that used for input to the compiler.

gcc/ChangeLog
* omp-general.cc (omp_context_name_list_prop): Remove static qualifier.
* omp-general.h (omp_context_name_list_prop): Declare.
* tree-cfg.cc (dump_function_to_file): Intercept
"omp declare variant base" attribute for special handling.
* tree-pretty-print.cc: Include omp-general.h.
(dump_omp_context_selector): New.
(print_omp_context_selector): New.
* tree-pretty-print.h (dump_omp_context_selector): Declare.
(print_omp_context_selector): Declare.
---
 gcc/omp-general.cc   |  2 +-
 gcc/omp-general.h|  1 +
 gcc/tree-cfg.cc  |  9 +
 gcc/tree-pretty-print.cc | 75 
 gcc/tree-pretty-print.h  |  3 ++
 5 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc
index 233f235d81e..65990df1238 100644
--- a/gcc/omp-general.cc
+++ b/gcc/omp-general.cc
@@ -1234,7 +1234,7 @@ struct omp_ts_info omp_ts_map[] =
 /* Return a name from PROP, a property in selectors accepting
name lists.  */
 
-static const char *
+const char *
 omp_context_name_list_prop (tree prop)
 {
   gcc_assert (OMP_TP_NAME (prop) == OMP_TP_NAMELIST_NODE);
diff --git a/gcc/omp-general.h b/gcc/omp-general.h
index 66ed4903513..3c2b221b226 100644
--- a/gcc/omp-general.h
+++ b/gcc/omp-general.h
@@ -164,6 +164,7 @@ extern gimple *omp_build_barrier (tree lhs);
 extern tree find_combined_omp_for (tree *, int *, void *);
 extern poly_uint64 omp_max_vf (void);
 extern int omp_max_simt_vf (void);
+extern const char *omp_context_name_list_prop (tree);
 extern void omp_construct_traits_to_codes (tree, int, enum tree_code *);
 extern tree omp_check_context_selector (location_t loc, tree ctx);
 extern void omp_mark_declare_variant (location_t loc, tree variant,
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index d784b911532..1ab18fa6b0f 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -8291,6 +8291,15 @@ dump_function_to_file (tree fndecl, FILE *file, dump_flags_t flags)
 
  if (strstr (IDENTIFIER_POINTER (name), "no_sanitize"))
print_no_sanitize_attr_value (file, TREE_VALUE (chain));
+ else if (!strcmp (IDENTIFIER_POINTER (name),
+   "omp declare variant base"))
+   {
+ tree a = TREE_VALUE (chain);
+ print_generic_expr (file, TREE_PURPOSE (a), dump_flags);
+ fprintf (file, " match ");
+ print_omp_context_selector (file, TREE_VALUE (a),
+ dump_flags);
+   }
  else
print_generic_expr (file, TREE_VALUE (chain), dump_flags);
  fprintf (file, ")");
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 68857ae1cdf..fd61d28faff 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gomp-constants.h"
 #include "gimple.h"
 #include "fold-const.h"
+#include "omp-general.h"
 
 /* Routines in this file get invoked via the default tree printer
used by diagnostics and thus they are called from pp_printf which
@@ -1497,6 +1498,80 @@ dump_omp_clauses (pretty_printer *pp, tree clause, int spc, dump_flags_t flags,
 }
 }
 
+/* Dump an OpenMP context selector CTX to PP.  */
+void
+dump_omp_context_selector (pretty_printer *pp, tree ctx, int spc,
+  dump_flags_t flags)
+{
+  for (tree set = ctx; set && set != error_mark_node; set = TREE_CHAIN (set))
+{
+  pp_string (pp, OMP_TSS_NAME (set));
+  pp_string (pp, " = {");
+  for (tree sel = OMP_TSS_TRAIT_SELECTORS (set);
+  sel && sel != error_mark_node; sel = TREE_CHAIN (sel))
+   {
+ if (OMP_TS_CODE (sel) == OMP_TRAIT_INVALID)
+   pp_string (pp, "");
+ else
+   pp_string (pp, OMP_TS_NAME (sel));
+ tree score = OMP_TS_SCORE (sel);
+ tree props = OMP_TS_PROPERTIES (sel);
+ if (props)
+   {
+ pp_string (pp, " (");
+ if (score)
+   {
+ pp_string (pp, "score(");
+ dump_generic_node (pp, score, spc + 4, flags, false);
+ pp_string (pp, "): ");
+   }
+ for (tree prop = props; prop; prop = TREE_CHAIN (prop))
+   {
+ if (OMP_TP_NAME (prop) == OMP_TP_NAMELIST_NODE)
+   {
+ const char *str = omp_context_name_list_prop (prop);
+ pp_string (pp, "\"");
+   

[PATCH V4 3/5] OpenMP: Use enumerators for names of trait-sets and traits

2023-12-17 Thread Sandra Loosemore
This patch introduces enumerators to represent trait-set names and
trait names, which makes it easier to use tables to control other
behavior and for switch statements to dispatch on the tags.  The tags
are stored in the same place in the TREE_LIST structure (OMP_TSS_ID or
OMP_TS_ID) and are encoded there as integer constants.
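
As a rough illustration of why enumerators help here, the sketch below shows a name table indexed by an enumerator plus a switch on the same code. It is not taken from the patch: the enumerator names, `trait_set_names`, `lookup_trait_set`, `trait_set_name` and `is_device_set` are all hypothetical, and only the four selector-set keywords come from the existing diagnostics.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical enumerators for the four selector-set keywords; the real
// definitions live in the new omp-selectors.h and may differ.
enum trait_set_code {
  TSS_CONSTRUCT,
  TSS_DEVICE,
  TSS_IMPLEMENTATION,
  TSS_USER,
  TSS_INVALID  // Also serves as the table length.
};

// Table indexed by enumerator, so the printable name is a plain lookup.
static const char *const trait_set_names[TSS_INVALID] = {
  "construct", "device", "implementation", "user"
};

const char *trait_set_name(trait_set_code code) {
  return code < TSS_INVALID ? trait_set_names[code] : "<invalid>";
}

// Map a parsed identifier to its code once, at parse time.
trait_set_code lookup_trait_set(const char *name) {
  for (int i = 0; i < TSS_INVALID; i++)
    if (std::strcmp(name, trait_set_names[i]) == 0)
      return static_cast<trait_set_code>(i);
  return TSS_INVALID;
}

// Later code can dispatch on the code instead of comparing strings.
bool is_device_set(trait_set_code code) {
  switch (code) {
    case TSS_DEVICE:
      return true;
    default:
      return false;
  }
}
```

With this shape, adding a new selector set means adding one enumerator and one table entry, rather than touching every string comparison.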

gcc/ChangeLog
* omp-selectors.h: New file.
* omp-general.h: Include omp-selectors.h.
(OMP_TSS_CODE, OMP_TSS_NAME): New.
(OMP_TS_CODE, OMP_TS_NAME): New.
(make_trait_set_selector, make_trait_selector): Adjust declarations.
(omp_construct_traits_to_codes): Likewise.
(omp_context_selector_set_compare): Likewise.
(omp_get_context_selector): Likewise.
(omp_get_context_selector_list): New.
* omp-general.cc (omp_construct_traits_to_codes): Pass length in
as argument instead of returning it.  Make it table-driven.
(omp_tss_map): New.
(kind_properties, vendor_properties, extension_properties): New.
(atomic_default_mem_order_properties): New.
(omp_ts_map): New.
(omp_check_context_selector): Simplify lookup and dispatch logic.
(omp_mark_declare_variant): Ignore variants with unknown construct
selectors.  Adjust for new representation.
(make_trait_set_selector, make_trait_selector): Adjust for new
representations.
(omp_context_selector_matches): Simplify dispatch logic.  Avoid
fixed-sized buffers and adjust call to omp_construct_traits_to_codes.
(omp_context_selector_props_compare): Adjust for new representations
and simplify dispatch logic.
(omp_context_selector_set_compare): Likewise.
(omp_context_selector_compare): Likewise.
(omp_get_context_selector): Adjust for new representations, and split
out...
(omp_get_context_selector_list): New function.
(omp_lookup_tss_code): New.
(omp_lookup_ts_code): New.
(omp_context_compute_score): Adjust for new representations.  Avoid
fixed-sized buffers and magic numbers.  Adjust call to
omp_construct_traits_to_codes.
* gimplify.cc (omp_construct_selector_matches): Avoid use of
fixed-size buffer.  Adjust call to omp_construct_traits_to_codes.

gcc/c/ChangeLog
* c-parser.cc (omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(c_parser_omp_context_selector): Adjust for new representations
and simplify dispatch logic.  Uniformly warn instead of sometimes
error when an unknown selector is found.
(c_parser_omp_context_selector_specification): Likewise.
(c_finish_omp_declare_variant): Adjust for new representations.

gcc/cp/ChangeLog
* decl.cc (omp_declare_variant_finalize_one): Adjust for new
representations.
* parser.cc (omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(cp_parser_omp_context_selector): Adjust for new representations
and simplify dispatch logic.  Uniformly warn instead of sometimes
error when an unknown selector is found.
(cp_parser_omp_context_selector_specification): Likewise.
* pt.cc (tsubst_attribute): Adjust for new representations.

gcc/fortran/ChangeLog
* gfortran.h: Include omp-selectors.h.
(enum gfc_omp_trait_property_kind): Delete, and replace all
references with equivalent omp_tp_type enumerators.
(struct gfc_omp_trait_property): Update for omp_tp_type.
(struct gfc_omp_selector): Replace string name with new enumerator.
(struct gfc_omp_set_selector): Likewise.
* openmp.cc (gfc_free_omp_trait_property_list): Update for
omp_tp_type.
(omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(gfc_ignore_trait_property_extension): New.
(gfc_ignore_trait_property_extension_list): New.
(gfc_match_omp_selector): Adjust for new representations and simplify
dispatch logic.  Uniformly warn instead of sometimes error when an
unknown selector is found.
(gfc_match_omp_context_selector): Adjust for new representations.
(gfc_match_omp_context_selector_specification): Likewise.
* trans-openmp.cc (gfc_trans_omp_declare_variant): Adjust for
new representations.

gcc/testsuite/
* c-c++-common/gomp/declare-variant-1.c: Expect warning on
unknown selectors.
* c-c++-common/gomp/declare-variant-2.c: Likewise.
* gfortran.dg/gomp/declare-variant-1.f90: Likewise.
* gfortran.dg/gomp/declare-variant-2.f90: Likewise.
---
 gcc/c/c-parser.cc | 234 

Re: [PATCH V3 3/4] OpenMP: Use enumerators for names of trait-sets and traits

2023-12-17 Thread Sandra Loosemore

On 12/12/23 05:05, Tobias Burnus wrote:

Hi Sandra,

On 07.12.23 16:52, Sandra Loosemore wrote:

This patch introduces enumerators to represent trait-set names and
trait names, which makes it easier to use tables to control other
behavior and for switch statements to dispatch on the tags.  The tags
are stored in the same place in the TREE_LIST structure (OMP_TSS_ID or
OMP_TS_ID) and are encoded there as integer constants.


Thanks - that looks like a huge improvement.

* * *

I think it is useful to prepare for 'target_device'. However, it is currently
not yet implemented on mainline, contrary to OG13.

Can you add some kind of error diagnostic for it? On mainline, the current 
result is:


error: expected ‘construct’, ‘device’, ‘implementation’ or ‘user’ before 
‘target_device’

    13 | #pragma omp declare variant (f05) match (target_device={kind(gpu)})
   |  ^

But with your patch, it is silently accepted, which is bad.

(That's a modified version of 
gcc/testsuite/c-c++-common/gomp/declare-variant-10.c:13)


I think you have two options:

* Either fail with the same error message as above

* Or update the error message to list 'target_device' (for C/C++/Fortran)
   and handle 'target_device' separately with a sorry.

Do whatever you think makes more sense for now, knowing that we do want to add
'target_device' in the not too distant future.

(I slightly prefer the updated error message + sorry variant, as it avoids
touching the messages again later, but either is fine.)


OK.  I had a FIXME in the code noting that listing all the valid selector-set 
keywords in the error message was prone to bit-rot anyway, so I have replaced 
it with something more generic, and added a sorry for the missing 
"target_device" support.


Also in V4 of the patch I have added a sorry for the missing "requires" 
selector support so it does that rather than ICE, as Julian discovered.


Finally, I improved the error handling for including a trait-score on selectors 
that don't permit it -- it now says explicitly that a score isn't permitted 
there, instead of a cascade of more obscure errors.  I have a test case 
for that coming later with the metadirectives patches, which are not ready for 
GCC 14.


And I have a new part 5 for this series coming along too, with 
prettyprinter support for the new selector representation.  I realize we're 
long past the end of stage 1 but I think this is still reasonable to consider 
for GCC 14.  It's only for internal debugging purposes, and I think it'll be 
useful for Julian's work and implementing missing 5.1/5.2/TR11 selector 
features for declare variant as well as my current hackery on finishing 
metadirectives.  There aren't any "declare variant" test cases that examine the 
attribute on the base function in dump files, BTW, but I did inspect the output 
by hand and also do some further testing with metadirective.



Otherwise, the patch LGTM.

As written before, 1/4, 2/4 and 4/4 are LGTM as posted.


Thanks.  I'll push parts 1-4 when part 3 is approved, and part 5 too if/when 
that's approved.


-Sandra


[V5] [C PATCH 4/4] c23: construct composite type for tagged types

2023-12-17 Thread Martin Uecker



Support for constructing composite types for structs and unions
in C23.

gcc/c:
* c-typeck.cc (composite_type_internal): Adapted from
composite_type to support structs and unions.
(composite_type): New wrapper function.
(build_conditional_operator): Return composite type.
* c-decl.cc (finish_struct): Allow NULL for
enclosing_struct_parse_info.

gcc/testsuite:
* gcc.dg/c23-tag-alias-6.c: New test.
* gcc.dg/c23-tag-composite-1.c: New test.
* gcc.dg/c23-tag-composite-2.c: New test.
* gcc.dg/c23-tag-composite-3.c: New test.
* gcc.dg/c23-tag-composite-4.c: New test.
* gcc.dg/c23-tag-composite-5.c: New test.
* gcc.dg/c23-tag-composite-6.c: New test.
* gcc.dg/c23-tag-composite-7.c: New test.
* gcc.dg/c23-tag-composite-8.c: New test.
* gcc.dg/c23-tag-composite-9.c: New test.
* gcc.dg/gnu23-tag-composite-1.c: New test.
* gcc.dg/gnu23-tag-composite-2.c: New test.
* gcc.dg/gnu23-tag-composite-3.c: New test.
* gcc.dg/gnu23-tag-composite-4.c: New test.
---
 gcc/c/c-decl.cc  |  21 +--
 gcc/c/c-typeck.cc| 137 ---
 gcc/testsuite/gcc.dg/c23-tag-alias-6.c   |  32 +
 gcc/testsuite/gcc.dg/c23-tag-composite-1.c   |  26 
 gcc/testsuite/gcc.dg/c23-tag-composite-2.c   |  16 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-3.c   |  50 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-4.c   |  21 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-5.c   |  25 
 gcc/testsuite/gcc.dg/c23-tag-composite-6.c   |  18 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-7.c   |  20 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-8.c   |  15 ++
 gcc/testsuite/gcc.dg/c23-tag-composite-9.c   |  19 +++
 gcc/testsuite/gcc.dg/gnu23-tag-composite-1.c |  45 ++
 gcc/testsuite/gcc.dg/gnu23-tag-composite-2.c |  30 
 gcc/testsuite/gcc.dg/gnu23-tag-composite-3.c |  24 
 gcc/testsuite/gcc.dg/gnu23-tag-composite-4.c |  28 
 16 files changed, 500 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-alias-6.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-3.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-4.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-5.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-6.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-7.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-8.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-9.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-composite-1.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-composite-2.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-composite-3.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-composite-4.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 6639ec35e5f..b72738ea04a 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9674,7 +9674,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 }
 
   /* Check for consistency with previous definition.  */
-  if (flag_isoc23)
+  if (flag_isoc23 && NULL != enclosing_struct_parse_info)
 {
   tree vistype = previous_tag (t);
   if (vistype
@@ -9744,16 +9744,19 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
   if (warn_cxx_compat)
 warn_cxx_compat_finish_struct (fieldlist, TREE_CODE (t), loc);
 
-  delete struct_parse_info;
+  if (NULL != enclosing_struct_parse_info)
+{
+  delete struct_parse_info;
 
-  struct_parse_info = enclosing_struct_parse_info;
+  struct_parse_info = enclosing_struct_parse_info;
 
-  /* If this struct is defined inside a struct, add it to
- struct_types.  */
-  if (warn_cxx_compat
-  && struct_parse_info != NULL
-  && !in_sizeof && !in_typeof && !in_alignof)
-struct_parse_info->struct_types.safe_push (t);
+  /* If this struct is defined inside a struct, add it to
+struct_types.  */
+  if (warn_cxx_compat
+ && struct_parse_info != NULL
+ && !in_sizeof && !in_typeof && !in_alignof)
+   struct_parse_info->struct_types.safe_push (t);
+ }
 
   return t;
 }
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 4d3079156ba..ac31eba6e46 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -381,8 +381,15 @@ build_functype_attribute_variant (tree ntype, tree otype, tree attrs)
nonzero; if that isn't so, this may crash.  In particular, we
assume that qualifiers match.  */
 
+struct composite_cache {
+  tree t1;
+  tree t2;
+  tree composite;
+  struct composite_cache* next;
+};
+
 tree
-composite_type (tree t1, tree t2)
+composite_type_internal (tree t1, tree t2, struct composite_cache* cache)
 {
   enum tree_code code1;
   enum tree_code code2;
@@ -427,7 

[V5] [C PATCH 3/4] c23: aliasing of compatible tagged types

2023-12-17 Thread Martin Uecker



Tell the backend which types are equivalent by setting
TYPE_CANONICAL to one struct in the set of equivalent
structs.  Structs are considered equivalent by ignoring
all sizes of arrays nested in types below field level.

The following two structs are incompatible and lvalues
with these types can be assumed not to alias:

 struct foo { int a[3]; };
 struct foo { int a[4]; };

The following two structs are also incompatible, but
will get the same TYPE_CANONICAL and it is then not
exploited that lvalues with those types can not alias:

 struct bar { int (*p)[3]; };
 struct bar { int (*p)[4]; };

The reason is that both are compatible to

 struct bar { int (*p)[]; };

and therefore are in the same equivalence class.  For
the same reason all enums with the same underlying type
are in the same equivalence class.  Tests are added
for the expected aliasing behavior with optimization.
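
The equivalence rule above can be modelled with a toy sketch. This is not how GCC represents types: `Field`, `StructType`, `equivalent` and `canonical_index` are hypothetical names, and the model only covers the two cases from the examples (an array directly at field level, and an array behind a pointer).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy description of one field: element type, array length, and whether
// the array sits behind a pointer (i.e. below field level).
struct Field {
  std::string elem;
  int array_len;
  bool behind_pointer;
};

struct StructType {
  std::string tag;
  std::vector<Field> fields;
};

// Equivalence in the sense described above: array sizes nested below
// field level are ignored, sizes at field level still count.
bool equivalent(const StructType &a, const StructType &b) {
  if (a.tag != b.tag || a.fields.size() != b.fields.size())
    return false;
  for (std::size_t i = 0; i < a.fields.size(); i++) {
    const Field &x = a.fields[i];
    const Field &y = b.fields[i];
    if (x.elem != y.elem || x.behind_pointer != y.behind_pointer)
      return false;
    if (!x.behind_pointer && x.array_len != y.array_len)
      return false;
  }
  return true;
}

// Return the index of the first registered type equivalent to T,
// registering T itself if there is none.  This plays the role of
// picking one canonical representative per equivalence class.
std::size_t canonical_index(std::vector<StructType> &seen,
                            const StructType &t) {
  for (std::size_t i = 0; i < seen.size(); i++)
    if (equivalent(seen[i], t))
      return i;
  seen.push_back(t);
  return seen.size() - 1;
}
```

In this model the two `struct foo` variants get different representatives, while the two `struct bar` variants share one, which is the behaviour the paragraph describes.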

gcc/c:
* c-decl.cc (c_struct_hasher): Hash table for struct
types.
(c_struct_hasher::hash, c_struct_hasher::equal): New
functions.
(finish_struct): Set TYPE_CANONICAL to first struct in
equivalence class.
* c-objc-common.cc (c_get_alias_set): Let structs or
unions with variable size alias anything.
* c-tree.h (comptypes_equiv): New prototype.
* c-typeck.cc (comptypes_equiv): New function.
(comptypes_internal): Implement equivalence mode.
(tagged_types_tu_compatible): Implement equivalence mode.

gcc/testsuite:
* gcc.dg/c23-tag-2.c: Activate.
* gcc.dg/c23-tag-5.c: Activate.
* gcc.dg/c23-tag-alias-1.c: New test.
* gcc.dg/c23-tag-alias-2.c: New test.
* gcc.dg/c23-tag-alias-3.c: New test.
* gcc.dg/c23-tag-alias-4.c: New test.
* gcc.dg/c23-tag-alias-5.c: New test.
* gcc.dg/gnu23-tag-alias-1.c: New test.
* gcc.dg/gnu23-tag-alias-2.c: New test.
* gcc.dg/gnu23-tag-alias-3.c: New test.
* gcc.dg/gnu23-tag-alias-4.c: New test.
* gcc.dg/gnu23-tag-alias-5.c: New test.
* gcc.dg/gnu23-tag-alias-6.c: New test.
* gcc.dg/gnu23-tag-alias-7.c: New test.
---
 gcc/c/c-decl.cc  |  51 ++-
 gcc/c/c-objc-common.cc   |   5 ++
 gcc/c/c-tree.h   |   1 +
 gcc/c/c-typeck.cc|  31 +++
 gcc/testsuite/gcc.dg/c23-tag-2.c |   2 +-
 gcc/testsuite/gcc.dg/c23-tag-5.c |   2 +-
 gcc/testsuite/gcc.dg/c23-tag-alias-1.c   |  49 +++
 gcc/testsuite/gcc.dg/c23-tag-alias-2.c   |  50 +++
 gcc/testsuite/gcc.dg/c23-tag-alias-3.c   |  32 +++
 gcc/testsuite/gcc.dg/c23-tag-alias-4.c   |  32 +++
 gcc/testsuite/gcc.dg/c23-tag-alias-5.c   |  36 
 gcc/testsuite/gcc.dg/gnu23-tag-alias-1.c |  33 +++
 gcc/testsuite/gcc.dg/gnu23-tag-alias-2.c |  85 ++
 gcc/testsuite/gcc.dg/gnu23-tag-alias-3.c |  83 ++
 gcc/testsuite/gcc.dg/gnu23-tag-alias-4.c |  36 
 gcc/testsuite/gcc.dg/gnu23-tag-alias-5.c | 107 +++
 gcc/testsuite/gcc.dg/gnu23-tag-alias-6.c |  60 +
 gcc/testsuite/gcc.dg/gnu23-tag-alias-7.c |  93 
 18 files changed, 785 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-alias-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-alias-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-alias-3.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-alias-4.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-alias-5.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-alias-1.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-alias-2.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-alias-3.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-alias-4.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-alias-5.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-alias-6.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-alias-7.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 26188aa225e..6639ec35e5f 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -634,6 +634,36 @@ public:
   auto_vec<tree> typedefs_seen;
 };
 
+
+/* Hash table for structs and unions.  */
+struct c_struct_hasher : ggc_ptr_hash<tree_node>
+{
+  static hashval_t hash (tree t);
+  static bool equal (tree, tree);
+};
+
+/* Hash an RECORD OR UNION.  */
+hashval_t
+c_struct_hasher::hash (tree type)
+{
+  inchash::hash hstate;
+
+  hstate.add_int (TREE_CODE (type));
+  hstate.add_object (TYPE_NAME (type));
+
+  return hstate.end ();
+}
+
+/* Compare two RECORD or UNION types.  */
+bool
+c_struct_hasher::equal (tree t1,  tree t2)
+{
+  return comptypes_equiv_p (t1, t2);
+}
+
+/* All tagged types so that TYPE_CANONICAL can be set correctly.  */
+static GTY (()) hash_table<c_struct_hasher> *c_struct_htab;
+
 /* Information for the struct or union currently being parsed, or
NULL if not parsing a struct or union.  */
 static class c_struct_parse_info *struct_parse_info;
@@ -8713,7 

[V5] [C PATCH 2/4] c23: tag compatibility rules for enums

2023-12-17 Thread Martin Uecker



Allow redefinition of enum types and enumerators.  Diagnose
nested redefinitions including redefinitions in the enum
specifier for enum types with fixed underlying type.

gcc/c:
* c-tree.h (c_parser_enum_specifier): Add parameter.
* c-decl.cc (start_enum): Allow redefinition.
(finish_enum): Diagnose conflicts.
(build_enumerator): Set context.
(diagnose_mismatched_decls): Diagnose conflicting enumerators.
(push_decl): Preserve context for enumerators.
* c-parser.cc (c_parser_enum_specifier): Remember when
seen is from an enum type which is not yet defined.

gcc/testsuite/:
* gcc.dg/c23-tag-enum-1.c: New test.
* gcc.dg/c23-tag-enum-2.c: New test.
* gcc.dg/c23-tag-enum-3.c: New test.
* gcc.dg/c23-tag-enum-4.c: New test.
* gcc.dg/c23-tag-enum-5.c: New test.
* gcc.dg/gnu23-tag-enum-1.c: New test.
---
 gcc/c/c-decl.cc | 65 +
 gcc/c/c-parser.cc   |  5 +-
 gcc/c/c-tree.h  |  3 +-
 gcc/c/c-typeck.cc   |  5 +-
 gcc/testsuite/gcc.dg/c23-tag-enum-1.c   | 56 +
 gcc/testsuite/gcc.dg/c23-tag-enum-2.c   | 17 +++
 gcc/testsuite/gcc.dg/c23-tag-enum-3.c   |  7 +++
 gcc/testsuite/gcc.dg/c23-tag-enum-4.c   | 22 +
 gcc/testsuite/gcc.dg/c23-tag-enum-5.c   | 18 +++
 gcc/testsuite/gcc.dg/gnu23-tag-enum-1.c | 19 
 10 files changed, 205 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-enum-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-enum-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-enum-3.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-enum-4.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-enum-5.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-enum-1.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 0e6b4a5248b..26188aa225e 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -2112,9 +2112,24 @@ diagnose_mismatched_decls (tree newdecl, tree olddecl,
  given scope.  */
   if (TREE_CODE (olddecl) == CONST_DECL)
 {
-  auto_diagnostic_group d;
-  error ("redeclaration of enumerator %q+D", newdecl);
-  locate_old_decl (olddecl);
+  if (flag_isoc23
+ && TYPE_NAME (DECL_CONTEXT (newdecl))
+ && DECL_CONTEXT (newdecl) != DECL_CONTEXT (olddecl)
+ && TYPE_NAME (DECL_CONTEXT (newdecl)) == TYPE_NAME (DECL_CONTEXT (olddecl)))
+   {
+ if (!simple_cst_equal (DECL_INITIAL (olddecl), DECL_INITIAL (newdecl)))
+   {
+ auto_diagnostic_group d;
+ error ("conflicting redeclaration of enumerator %q+D", newdecl);
+ locate_old_decl (olddecl);
+   }
+   }
+  else
+   {
+ auto_diagnostic_group d;
+ error ("redeclaration of enumerator %q+D", newdecl);
+ locate_old_decl (olddecl);
+   }
   return false;
 }
 
@@ -3275,8 +3290,11 @@ pushdecl (tree x)
 
   /* Must set DECL_CONTEXT for everything not at file scope or
  DECL_FILE_SCOPE_P won't work.  Local externs don't count
- unless they have initializers (which generate code).  */
+ unless they have initializers (which generate code).  We
+ also exclude CONST_DECLs because enumerators will get the
+ type of the enum as context.  */
   if (current_function_decl
+  && TREE_CODE (x) != CONST_DECL
   && (!VAR_OR_FUNCTION_DECL_P (x)
  || DECL_INITIAL (x) || !TREE_PUBLIC (x)))
 DECL_CONTEXT (x) = current_function_decl;
@@ -9759,7 +9777,7 @@ layout_array_type (tree t)
 
 tree
 start_enum (location_t loc, struct c_enum_contents *the_enum, tree name,
-   tree fixed_underlying_type)
+   tree fixed_underlying_type, bool potential_nesting_p)
 {
   tree enumtype = NULL_TREE;
   location_t enumloc = UNKNOWN_LOCATION;
@@ -9771,9 +9789,26 @@ start_enum (location_t loc, struct c_enum_contents *the_enum, tree name,
   if (name != NULL_TREE)
 enumtype = lookup_tag (ENUMERAL_TYPE, name, true, );
 
+  if (enumtype != NULL_TREE && TREE_CODE (enumtype) == ENUMERAL_TYPE)
+{
+  /* If the type is currently being defined or if we have seen an
+incomplete version which is now complete, this is a nested
+redefinition.  The latter happens if the redefinition occurs
+inside the enum specifier itself.  */
+  if (C_TYPE_BEING_DEFINED (enumtype)
+ || (potential_nesting_p && TYPE_VALUES (enumtype) != NULL_TREE))
+   error_at (loc, "nested redefinition of %", name);
+
+  /* For C23 we allow redefinitions.  We set to zero and check for
+consistency later.  */
+  if (flag_isoc23 && TYPE_VALUES (enumtype) != NULL_TREE)
+   enumtype = NULL_TREE;
+}
+
   if (enumtype == NULL_TREE || TREE_CODE (enumtype) != ENUMERAL_TYPE)
 {
   enumtype = make_node (ENUMERAL_TYPE);
+  TYPE_SIZE (enumtype) = NULL_TREE;
   pushtag (loc, 

[V5] [C PATCH 1/4] c23: tag compatibility rules for struct and unions

2023-12-17 Thread Martin Uecker


Here is the revised series.  The first three patches only
have changes in the tests as well as the return value
changes.   The fourth patch was now also revised,
with changes and tests to make sure that the composite
type works correctly for bit-fields, anonymous structs/unions,
alignment, packed structs, attributes, aliasing, etc. 
It now calls finish_struct to reuse the existing code for
setting up the struct.


Bootstrapped and regression tested on x86_64.




Implement redeclaration and compatibility rules for
structures and unions in C23.

gcc/c/:
* c-decl.cc (previous_tag): New function.
(parser_xref_tag): Find earlier definition.
(get_parm_info): Turn off warning for C23.
(start_struct): Allow redefinitions.
(finish_struct): Diagnose conflicts.
* c-tree.h (comptypes_same_p): Add prototype.
* c-typeck.cc (comptypes_same_p): New function.
(comptypes_internal): Activate comparison of tagged types.
(convert_for_assignment): Ignore qualifiers.
(digest_init): Add error.
(initialized_elementwise_p): Allow compatible types.

gcc/testsuite/:
* gcc.dg/c23-enum-7.c: Remove warning.
* gcc.dg/c23-tag-1.c: New test.
* gcc.dg/c23-tag-2.c: New deactivated test.
* gcc.dg/c23-tag-3.c: New test.
* gcc.dg/c23-tag-4.c: New test.
* gcc.dg/c23-tag-5.c: New deactivated test.
* gcc.dg/c23-tag-6.c: New test.
* gcc.dg/c23-tag-7.c: New test.
* gcc.dg/c23-tag-8.c: New test.
* gcc.dg/gnu23-tag-1.c: New test.
* gcc.dg/gnu23-tag-2.c: New test.
* gcc.dg/gnu23-tag-3.c: New test.
* gcc.dg/gnu23-tag-4.c: New test.
* gcc.dg/pr112488-2.c: Remove warning.
---
 gcc/c/c-decl.cc| 72 +++---
 gcc/c/c-tree.h |  1 +
 gcc/c/c-typeck.cc  | 38 +---
 gcc/testsuite/gcc.dg/c23-enum-7.c  |  6 +--
 gcc/testsuite/gcc.dg/c23-tag-1.c   | 67 +++
 gcc/testsuite/gcc.dg/c23-tag-2.c   | 43 ++
 gcc/testsuite/gcc.dg/c23-tag-3.c   | 16 +++
 gcc/testsuite/gcc.dg/c23-tag-4.c   | 26 +++
 gcc/testsuite/gcc.dg/c23-tag-5.c   | 33 ++
 gcc/testsuite/gcc.dg/c23-tag-6.c   | 58 
 gcc/testsuite/gcc.dg/c23-tag-7.c   | 12 +
 gcc/testsuite/gcc.dg/c23-tag-8.c   | 10 +
 gcc/testsuite/gcc.dg/gnu23-tag-1.c | 10 +
 gcc/testsuite/gcc.dg/gnu23-tag-2.c | 18 
 gcc/testsuite/gcc.dg/gnu23-tag-3.c | 28 
 gcc/testsuite/gcc.dg/gnu23-tag-4.c | 31 +
 gcc/testsuite/gcc.dg/pr112488-2.c  |  2 +-
 17 files changed, 454 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-3.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-4.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-5.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-6.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-7.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-8.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-1.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-2.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-3.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-4.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 039a66fef09..0e6b4a5248b 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -2037,6 +2037,28 @@ locate_old_decl (tree decl)
decl, TREE_TYPE (decl));
 }
 
+
+/* Helper function.  For a tagged type, it finds the declaration
+   for a visible tag declared in the same scope if such a
+   declaration exists.  */
+static tree
+previous_tag (tree type)
+{
+  struct c_binding *b = NULL;
+  tree name = TYPE_NAME (type);
+
+  if (name)
+b = I_TAG_BINDING (name);
+
+  if (b)
+b = b->shadowed;
+
+  if (b && B_IN_CURRENT_SCOPE (b))
+return b->decl;
+
+  return NULL_TREE;
+}
+
 /* Subroutine of duplicate_decls.  Compare NEWDECL to OLDDECL.
Returns true if the caller should proceed to merge the two, false
if OLDDECL should simply be discarded.  As a side effect, issues
@@ -8573,11 +8595,14 @@ get_parm_info (bool ellipsis, tree expr)
  if (TREE_CODE (decl) != UNION_TYPE || b->id != NULL_TREE)
{
  if (b->id)
-   /* The %s will be one of 'struct', 'union', or 'enum'.  */
-   warning_at (b->locus, 0,
-   "%<%s %E%> declared inside parameter list"
-   " will not be visible outside of this definition or"
-   " declaration", keyword, b->id);
+   {
+ /* The %s will be one of 'struct', 'union', or 'enum'.  */
+ if (!flag_isoc23)
+   warning_at (b->locus, 0,
+   "%<%s %E%> declared inside parameter list"
+ 

Re: [PATCH] Fortran: fix argument passing to CONTIGUOUS, TARGET dummy [PR97592]

2023-12-17 Thread Paul Richard Thomas
Hi Harald,

It might be a simple patch, but I have to confess it took a while for me to
get my head around the difference between gfc_is_not_contiguous and
!gfc_is_simply_contiguous :-(

Yes, this is OK for mainline and, after a short delay, for 13-branch.

Thanks for the patch

Paul


On Sat, 16 Dec 2023 at 18:28, Harald Anlauf  wrote:

> Dear all,
>
> the attached simple patch fixes a (9+) regression for passing
> to a CONTIGUOUS,TARGET dummy an *effective argument* that is
> contiguous, although the actual argument is not simply-contiguous
> (it is a pointer without the CONTIGUOUS attribute in the PR).
>
> Since a previous attempt at a patch led to regressions in
> gfortran.dg/bind-c-contiguous-3.f90, which is rather dense,
> I decided to enhance the current testcase with various
> combinations of actual and dummy arguments that allow us to
> study whether a _gfortran_internal_pack is generated in
> places where we want it.  (_gfortran_internal_pack does not
> create a temporary when no packing is needed).
>
> Regtested on x86_64-pc-linux-gnu.  OK for mainline?
>
> I would like to backport this - after a grace period - to
> at least 13-branch.  Any objections here?
>
> Thanks,
> Harald
>
>


[PATCH] c-family: Use -Wdiscarded-qualifiers for ignored qualifiers in __atomic_*

2023-12-17 Thread Florian Weimer
This matches other compiler diagnostics.  No test updates are needed
because c-c++-common/pr95378.c does not match a specific -W option.

Fixes commit d2384b7b24f8557b66f6958a05ea99ff4307e75c ("c-family:
check qualifiers of arguments to __atomic built-ins (PR 95378)").

gcc/c-family/

PR c/113050
* c-common.cc (get_atomic_generic_size): Use
OPT_Wdiscarded_qualifiers instead of
OPT_Wincompatible_pointer_types.

---

Jonathan, I assume this was just an oversight in your patch, and there
is no fundamental reason to use -Wincompatible-pointer-types here?

Thanks,
Florian

 gcc/c-family/c-common.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 0f1de44a348..6ea727f446f 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -7637,7 +7637,7 @@ get_atomic_generic_size (location_t loc, tree function,
return 0;
  }
else
- pedwarn (loc, OPT_Wincompatible_pointer_types, "argument %d "
+ pedwarn (loc, OPT_Wdiscarded_qualifiers, "argument %d "
   "of %qE discards % qualifier", x + 1,
   function);
  }
@@ -7651,7 +7651,7 @@ get_atomic_generic_size (location_t loc, tree function,
return 0;
  }
else
- pedwarn (loc, OPT_Wincompatible_pointer_types, "argument %d "
+ pedwarn (loc, OPT_Wdiscarded_qualifiers, "argument %d "
   "of %qE discards % qualifier", x + 1,
   function);
  }

base-commit: da70c5b17123b7c81155ef03fb4591b71a681344



Pushed: [PATCH 0/3] LoongArch: Fix instruction costs

2023-12-17 Thread Xi Ruoyao
On Sun, 2023-12-10 at 01:03 +0800, Xi Ruoyao wrote:
> Update LoongArch instruction costs based on the micro-benchmark results
> on LA464 and LA664.  In particular, this allows generating alsl/slli or
> alsl/slli + add pairs for multiplying some constants as on LA464/LA664
> a mul instruction is 4x slower than alsl, slli, or add instructions.
> 
> Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?
> 
> Xi Ruoyao (3):
>   LoongArch: Include rtl.h for COSTS_N_INSNS instead of hard coding our
>     own
>   LoongArch: Fix instruction costs [PR112936]
>   LoongArch: Add alslsi3_extend
> 
>  gcc/config/loongarch/loongarch-def.cc | 42 ++-
>  gcc/config/loongarch/loongarch.cc | 22 +-
>  gcc/config/loongarch/loongarch.md | 12 ++
>  .../loongarch/mul-const-reduction.c   | 11 +
>  4 files changed, 56 insertions(+), 31 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/mul-const-reduction.c

Pushed to r14-664{1,2,3} as all 3 patches are approved.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] LoongArch: Add sign_extend pattern for 32-bit rotate shift

2023-12-17 Thread Xi Ruoyao
Remove a redundant sign extension.

gcc/ChangeLog:

* config/loongarch/loongarch.md (rotrsi3_extend): New
define_insn.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/rotrw.c: New test.
---

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

 gcc/config/loongarch/loongarch.md  | 10 ++
 gcc/testsuite/gcc.target/loongarch/rotrw.c | 17 +
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/rotrw.c

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index c7058282a21..30025bf1908 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2893,6 +2893,16 @@ (define_insn "rotr3"
   [(set_attr "type" "shift,shift")
(set_attr "mode" "")])
 
+(define_insn "rotrsi3_extend"
+  [(set (match_operand:DI 0 "register_operand" "=r,r")
+   (sign_extend:DI
+ (rotatert:SI (match_operand:SI 1 "register_operand" "r,r")
+  (match_operand:SI 2 "arith_operand" "r,I"))))]
+  "TARGET_64BIT"
+  "rotr%i2.w\t%0,%1,%2"
+  [(set_attr "type" "shift,shift")
+   (set_attr "mode" "SI")])
+
 ;; The following templates were added to generate "bstrpick.d + alsl.d"
 ;; instruction pairs.
 ;; It is required that the values of const_immalsl_operand and
diff --git a/gcc/testsuite/gcc.target/loongarch/rotrw.c b/gcc/testsuite/gcc.target/loongarch/rotrw.c
new file mode 100644
index 000..6ed45e8b86c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/rotrw.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler "rotr\\.w\t\\\$r4,\\\$r4,\\\$r5" } } */
+/* { dg-final { scan-assembler "rotri\\.w\t\\\$r4,\\\$r4,5" } } */
+/* { dg-final { scan-assembler-not "slli\\.w" } } */
+
+unsigned
+rotr (unsigned a, unsigned b)
+{
+  return a >> b | a << 32 - b;
+}
+
+unsigned
+rotri (unsigned a)
+{
+  return a >> 5 | a << 27;
+}
-- 
2.43.0



[PATCH] LoongArch: Fix FP vector comparisons [PR113034]

2023-12-17 Thread Xi Ruoyao
We had the following mappings between vfcmp submnemonics and RTX
codes:

(define_code_attr fcc
  [(unordered "cun")
   (ordered   "cor")
   (eq   "ceq")
   (ne   "cne")
   (uneq  "cueq")
   (unle  "cule")
   (unlt  "cult")
   (le   "cle")
   (lt   "clt")])

This is inconsistent with scalar code:

(define_code_attr fcond [(unordered "cun")
 (uneq "cueq")
 (unlt "cult")
 (unle "cule")
 (eq "ceq")
 (lt "slt")
 (le "sle")
 (ordered "cor")
 (ltgt "sne")
 (ne "cune")
 (ge "sge")
 (gt "sgt")
 (unge "cuge")
 (ungt "cugt")])

For every RTX code for which the LSX/LASX code is different from the
scalar code, the scalar code is correct and the LSX/LASX code is wrong.
Most seriously, the RTX code NE should be mapped to "cune", not "cne".
Rewrite vfcmp define_insns in simd.md using the same mapping as
scalar fcmp.
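
The difference between NE and an ordered "not equal" only shows up with NaN operands. The following C sketch illustrates it under the assumption that "cune" means "unordered or not equal" and that the "ne" family without the "u" is false for unordered operands; the helper names `cmp_ne` and `cmp_ltgt` are hypothetical and are not part of the patch.

```c
#include <math.h>
#include <stdbool.h>

/* C's != corresponds to RTX NE: true when the operands are unordered
   (at least one NaN) or unequal.  The scalar table maps it to "cune".  */
static inline bool
cmp_ne (double a, double b)
{
  return a != b;
}

/* RTX LTGT: true only when the operands are ordered and unequal.
   islessgreater is the quiet C counterpart; it is false for NaN.  */
static inline bool
cmp_ltgt (double a, double b)
{
  return islessgreater (a, b) != 0;
}
```

With `a = NAN` and `b = 1.0`, `cmp_ne` is true while `cmp_ltgt` is false, so emitting the ordered comparison for a vectorized `a != b` would give the wrong lane mask whenever a NaN is present.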

Note that GAS does not support [x]vfcmp.{c/s}[u]{ge/gt} (pseudo)
instruction (although fcmp.{c/s}[u]{ge/gt} is supported), so we need to
switch the order of inputs and use [x]vfcmp.{c/s}[u]{le/lt} instead.

The vfcmp.{sult/sule/clt/cle}.{s/d} instructions do not have a single
RTX code, but they can be modeled as a "not" operation applied to the
inverse RTX code.  Doing so allows the compiler to optimize vectorized
__builtin_isless etc. to a single instruction.  This optimization should
be added for scalar code too and I'll do it later.
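
As a rough illustration of the "inverted code plus not" idea, a quiet less-than can be written as the negation of UNGE (unordered or greater-or-equal). This is a sketch of the logic only, assuming the usual IEC 60559 meaning of the predicates; `cmp_unge` and `cmp_lt_quiet` are hypothetical names, not identifiers from simd.md.

```c
#include <math.h>
#include <stdbool.h>

/* RTX UNGE: true when the operands are unordered or a >= b.
   isunordered is checked first, so >= never sees a NaN.  */
static inline bool
cmp_unge (double a, double b)
{
  return isunordered (a, b) || a >= b;
}

/* Quiet less-than expressed as NOT (UNGE).  For every input this
   agrees with isless (a, b): false for NaN, true only when a < b.  */
static inline bool
cmp_lt_quiet (double a, double b)
{
  return !cmp_unge (a, b);
}
```

Matching this shape in a single define_insn is what would let a vectorized `__builtin_isless` become one compare instruction instead of a compare followed by a separate inversion.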

Tests are added for mapping between C code, IEC 60559 operations, and
vfcmp instructions.

[1]:https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640713.html

gcc/ChangeLog:

PR target/113034
* config/loongarch/lasx.md (UNSPEC_LASX_XVFCMP_*): Remove.
(lasx_xvfcmp_caf_): Remove.
(lasx_xvfcmp_cune_): Remove.
(FSC256_UNS): Remove.
(fsc256): Remove.
(lasx_xvfcmp__): Remove.
(lasx_xvfcmp__): Remove.
* config/loongarch/lsx.md (UNSPEC_LSX_XVFCMP_*): Remove.
(lsx_vfcmp_caf_): Remove.
(lsx_vfcmp_cune_): Remove.
(vfcond): Remove.
(fcc): Remove.
(FSC_UNS): Remove.
(fsc): Remove.
(lsx_vfcmp__): Remove.
(lsx_vfcmp__): Remove.
* config/loongarch/simd.md
(fcond_simd): New define_code_iterator.
(_vfcmp__):
New define_insn.
(fcond_simd_rev): New define_code_iterator.
(fcond_rev_asm): New define_code_attr.
(_vfcmp__):
New define_insn.
(fcond_inv): New define_code_iterator.
(fcond_inv_rev): New define_code_iterator.
(fcond_inv_rev_asm): New define_code_attr.
(_vfcmp__): New define_insn.
(_vfcmp__):
New define_insn.
(UNSPEC_SIMD_FCMP_CAF, UNSPEC_SIMD_FCMP_SAF,
UNSPEC_SIMD_FCMP_SEQ, UNSPEC_SIMD_FCMP_SUN,
UNSPEC_SIMD_FCMP_SUEQ, UNSPEC_SIMD_FCMP_CNE,
UNSPEC_SIMD_FCMP_SOR, UNSPEC_SIMD_FCMP_SUNE): New unspecs.
(SIMD_FCMP): New define_int_iterator.
(fcond_unspec): New define_int_attr.
(_vfcmp__): New define_insn.
* config/loongarch/loongarch.cc (loongarch_expand_lsx_cmp):
Remove unneeded special cases.

gcc/testsuite/ChangeLog:

PR target/113034
* gcc.target/loongarch/vfcmp-f.c: New test.
* gcc.target/loongarch/vfcmp-d.c: New test.
* gcc.target/loongarch/xvfcmp-f.c: New test.
* gcc.target/loongarch/xvfcmp-d.c: New test.
* gcc.target/loongarch/vector/lasx/lasx-vcond-2.c: Scan for cune
instead of cne.
* gcc.target/loongarch/vector/lsx/lsx-vcond-2.c: Likewise.
---

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

 gcc/config/loongarch/lasx.md  |  76 
 gcc/config/loongarch/loongarch.cc |  60 +-
 gcc/config/loongarch/lsx.md   |  83 
 gcc/config/loongarch/simd.md  | 118 
 .../loongarch/vector/lasx/lasx-vcond-2.c  |   4 +-
 .../loongarch/vector/lsx/lsx-vcond-2.c|   4 +-
 gcc/testsuite/gcc.target/loongarch/vfcmp-d.c  |  28 +++
 gcc/testsuite/gcc.target/loongarch/vfcmp-f.c  | 178 ++
 gcc/testsuite/gcc.target/loongarch/xvfcmp-d.c |  29 +++
 gcc/testsuite/gcc.target/loongarch/xvfcmp-f.c |  27 +++
 10 files changed, 385 insertions(+), 222 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vfcmp-d.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vfcmp-f.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/xvfcmp-d.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/xvfcmp-f.c

diff --git a/gcc/config/loongarch/lasx.md 

Re: [PATCH,doc] install: Drop hppa*-hp-hpux10, remove old notes on hppa*-hp-hpux11

2023-12-17 Thread John David Anglin

On 2023-12-17 2:28 a.m., Gerald Pfeifer wrote:

Hi Dave,

based on our earlier e-mail, I understand we don't support hppa*-hp-hpux10
any longer, so let's remove them from the installation docs.

On the way remove references to GCC 2.95 and 3.0 from hppa*-hp-hpux11.

Okay?

The sentence about 64-bit libffi for hpux can also be removed.  I ported it
a few years ago.

Otherwise, the change is okay.



(I believe it would be great if you could have a look at that part of the
installation docs. I'm pretty confident there is quite a bit more we can
garbage collect or simplify.)

Maybe I can do it tomorrow.

Dave



Gerald


gcc:
PR target/69374
* doc/install.texi (Specific) <hppa*-hp-hpux10>: Remove section.
(Specific) <hppa*-hp-hpux11>: Remove references to GCC 2.95 and 3.0.
---
  gcc/doc/install.texi | 18 --
  1 file changed, 18 deletions(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 84d8834a9b5..17cef5a2bae 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3742,8 +3742,6 @@ information have to.
  @item
  @uref{#hppa-hp-hpux,,hppa*-hp-hpux*}
  @item
-@uref{#hppa-hp-hpux10,,hppa*-hp-hpux10}
-@item
  @uref{#hppa-hp-hpux11,,hppa*-hp-hpux11}
  @item
  @uref{#x-x-linux-gnu,,*-*-linux-gnu}
@@ -4152,27 +4150,11 @@ a list of the predefines used with each standard.
  
  More specific information to @samp{hppa*-hp-hpux*} targets follows.
  
-@html

-
-@end html
-@anchor{hppa-hp-hpux10}
-@heading hppa*-hp-hpux10
-For hpux10.20, we @emph{highly} recommend you pick up the latest sed patch
-@code{PHCO_19798} from HP@.
-
-The C++ ABI has changed incompatibly in GCC 4.0.  COMDAT subspaces are
-used for one-only code and data.  This resolves many of the previous
-problems in using C++ on this target.  However, the ABI is not compatible
-with the one implemented under HP-UX 11 using secondary definitions.
-
  @html
  
  @end html
  @anchor{hppa-hp-hpux11}
  @heading hppa*-hp-hpux11
-GCC 3.0 and up support HP-UX 11.  GCC 2.95.x is not supported and cannot
-be used to compile GCC 3.0 and up.
-
  The libffi library haven't been ported to 64-bit HP-UX@ and doesn't build.
  
  Refer to @uref{binaries.html,,binaries} for information about obtaining



--
John David Anglin  dave.ang...@bell.net



RE: [PATCH v4] [tree-optimization/110279] Consider FMA in get_reassociation_width

2023-12-17 Thread Di Zhao OS
Hello Thomas,

> -Original Message-
> From: Thomas Schwinge 
> Sent: Friday, December 15, 2023 5:46 PM
> To: Di Zhao OS ; gcc-patches@gcc.gnu.org
> Cc: Richard Biener 
> Subject: RE: [PATCH v4] [tree-optimization/110279] Consider FMA in
> get_reassociation_width
> 
> Hi!
> 
> On 2023-12-13T08:14:28+, Di Zhao OS  wrote:
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/pr110279-2.c
> > @@ -0,0 +1,41 @@
> > +/* PR tree-optimization/110279 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-Ofast --param tree-reassoc-width=4 --param fully-pipelined-fma=1 -fdump-tree-reassoc2-details -fdump-tree-optimized" } */
> > +/* { dg-additional-options "-march=armv8.2-a" { target aarch64-*-* } } */
> > +
> > +#define LOOP_COUNT 8
> > +typedef double data_e;
> > +
> > +#include 
> > +
> > +__attribute_noinline__ data_e
> > +foo (data_e in)
> 
> Pushed to master branch commit 91e9e8faea4086b3b8aef2355fc12c1559d425f6
> "Fix 'gcc.dg/pr110279-2.c' syntax error due to '__attribute_noinline__'",
> see attached.
> 
> However:
> 
> > +{
> > +  data_e a1, a2, a3, a4;
> > +  data_e tmp, result = 0;
> > +  a1 = in + 0.1;
> > +  a2 = in * 0.1;
> > +  a3 = in + 0.01;
> > +  a4 = in * 0.59;
> > +
> > +  data_e result2 = 0;
> > +
> > +  for (int ic = 0; ic < LOOP_COUNT; ic++)
> > +{
> > +  /* Test that a complete FMA chain with length=4 is not broken.  */
> > +  tmp = a1 + a2 * a2 + a3 * a3 + a4 * a4 ;
> > +  result += tmp - ic;
> > +  result2 = result2 / 2 - tmp;
> > +
> > +  a1 += 0.91;
> > +  a2 += 0.1;
> > +  a3 -= 0.01;
> > +  a4 -= 0.89;
> > +
> > +}
> > +
> > +  return result + result2;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-not "was chosen for reassociation" "reassoc2"} } */
> > +/* { dg-final { scan-tree-dump-times {\.FMA } 3 "optimized"} } */

Thank you for the fix.

> ..., I still see these latter two tree dump scans FAIL, for GCN:
> 
> $ grep -C2 'was chosen for reassociation' pr110279-2.c.197t.reassoc2
>   2 *: a3_40
>   2 *: a2_39
> Width = 4 was chosen for reassociation
> Transforming _15 = powmult_1 + powmult_3;
>  into _63 = powmult_1 + a1_38;
> $ grep -F .FMA pr110279-2.c.265t.optimized
>   _63 = .FMA (a2_39, a2_39, a1_38);
>   _64 = .FMA (a3_40, a3_40, powmult_5);
> 
> ..., nvptx:
> 
> $ grep -C2 'was chosen for reassociation' pr110279-2.c.197t.reassoc2
>   2 *: a3_40
>   2 *: a2_39
> Width = 4 was chosen for reassociation
> Transforming _15 = powmult_1 + powmult_3;
>  into _63 = powmult_1 + a1_38;
> $ grep -F .FMA pr110279-2.c.265t.optimized
>   _63 = .FMA (a2_39, a2_39, a1_38);
>   _64 = .FMA (a3_40, a3_40, powmult_5);

For these two targets, the reassoc_width for FMUL is 1 (the default value),
while the testcase assumes it to be 4.  The bug was introduced when I
updated the patch but forgot to update the testcase.
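
For readers following the dumps, the two shapes at stake can be written out by hand. This is only a sketch of the expression trees, not of what the reassoc pass does internally; `sum_chain` and `sum_split` are hypothetical names, and whether each marked statement actually becomes an FMA depends on the target and options.

```c
/* Complete chain: every step adds one product to the running sum,
   so each statement is a candidate for a single FMA (three in all,
   which is what the test's ".FMA" scan expects).  */
static inline double
sum_chain (double a1, double a2, double a3, double a4)
{
  double t = a1 + a2 * a2;   /* candidate FMA 1 */
  t = t + a3 * a3;           /* candidate FMA 2 */
  return t + a4 * a4;        /* candidate FMA 3 */
}

/* Reassociated into two independent halves: shorter dependency
   chain, but only two FMA candidates plus a multiply and an add,
   which matches the two .FMA lines in the GCN and nvptx dumps.  */
static inline double
sum_split (double a1, double a2, double a3, double a4)
{
  double lo = a1 + a2 * a2;        /* candidate FMA */
  double hi = a3 * a3 + a4 * a4;   /* multiply + candidate FMA */
  return lo + hi;                  /* plain add */
}
```

Both functions compute the same value for exactly representable inputs; they differ in how many fused operations can be formed and in the length of the dependency chain.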

> ..., but also x86_64-pc-linux-gnu:
> 
> $  grep -C2 'was chosen for reassociation' pr110279-2.c.197t.reassoc2
>   2 *: a3_40
>   2 *: a2_39
> Width = 2 was chosen for reassociation
> Transforming _15 = powmult_1 + powmult_3;
>  into _63 = powmult_1 + powmult_3;
> $ grep -cF .FMA pr110279-2.c.265t.optimized
> 0

For x86_64 this needs "-mfma"; sorry, the compile options missed that.
Can the change below fix these issues?  I moved the tests into
testsuite/gcc.target/aarch64, since they rely on tunings.

Tested on aarch64-unknown-linux-gnu.

> 
> Regards
>  Thomas
> 
> 

Thanks,
Di Zhao

---
 gcc/testsuite/{gcc.dg => gcc.target/aarch64}/pr110279-1.c | 3 +--
 gcc/testsuite/{gcc.dg => gcc.target/aarch64}/pr110279-2.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)
 rename gcc/testsuite/{gcc.dg => gcc.target/aarch64}/pr110279-1.c (83%)
 rename gcc/testsuite/{gcc.dg => gcc.target/aarch64}/pr110279-2.c (78%)

diff --git a/gcc/testsuite/gcc.dg/pr110279-1.c b/gcc/testsuite/gcc.target/aarch64/pr110279-1.c
similarity index 83%
rename from gcc/testsuite/gcc.dg/pr110279-1.c
rename to gcc/testsuite/gcc.target/aarch64/pr110279-1.c
index f25b6aec967..97d693f56a5 100644
--- a/gcc/testsuite/gcc.dg/pr110279-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr110279-1.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast --param avoid-fma-max-bits=512 --param tree-reassoc-width=4 -fdump-tree-widening_mul-details" } */
-/* { dg-additional-options "-march=armv8.2-a" { target aarch64-*-* } } */
+/* { dg-options "-Ofast -mcpu=generic --param avoid-fma-max-bits=512 --param tree-reassoc-width=4 -fdump-tree-widening_mul-details" } */
 
 #define LOOP_COUNT 8
 typedef double data_e;
diff --git a/gcc/testsuite/gcc.dg/pr110279-2.c 

Re: RFC -- targets with unsigned bitfields

2023-12-17 Thread Richard Biener



> On 17.12.2023 at 04:29, Jeff Law wrote:
> 
> 
> So mcore-elf is the slowest target to test with a simulator.  Not because
> its simulator is particularly bad, but because some tests time out as they've
> gotten into infinite loops.  This causes the mcore-elf port to take about 2X
> longer than most other gdbsim ports.
> 
> I tracked this down to the port unconditionally adding -funsigned-bitfields 
> to CC1_SPEC.  According to the comment it's how the ABI is defined for the 
> mcore targets.
> 
> It'd be nice to get reasonable results from mcore-elf in a reasonable amount 
> of time.  The question is how.
> 
> I *could* just disable the -funsigned-bitfields within the tester.  We 
> certainly have the ability to carry forward patches like this which exist 
> only to help the testing effort.
> 
> Another approach would be to add an explicit -fsigned-bitfields to the 
> arguments for the affected tests.  I'd guess it's on the order of around 35 
> distinct tests that would need to be updated.
> 
> A third approach would be to grub around and see if there's a way to add a 
> -fsigned-bitfields using dejagnu, perhaps in the baseboards file.

When the testcases are simply invalid with unsigned bitfields, I suggest
adding a dg effective target that we could require.  Or are the testcases
actually miscompiled?

I suppose neither -f[un]signed-bitfields is the standard behavior but bitfield
signedness is determined by the underlying type?  Or is this flag about
something else?

I could imagine a test needing the default behavior?

Richard 

> Looking for suggestions/recommendations here.
> 
> Jeff
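
For context on why a test can loop forever under -funsigned-bitfields, here is a small C sketch of the behavior in question. It assumes the usual rule that the signedness of a plain `int` bit-field is implementation-defined, which is what the flag selects; the struct and function names are hypothetical and do not come from any of the affected tests.

```c
/* Three one-bit fields that differ only in the declared type.  */
struct plain_bits    { int f : 1; };           /* signedness chosen by the ABI or flag */
struct signed_bits   { signed int f : 1; };    /* always holds 0 or -1 */
struct unsigned_bits { unsigned int f : 1; };  /* always holds 0 or 1 */

/* Store V into each kind of field and read it back.  */
static inline int
store_plain (int v)
{
  struct plain_bits p;
  p.f = v;
  return p.f;
}

static inline int
store_signed (int v)
{
  struct signed_bits s;
  s.f = v;
  return s.f;
}

static inline int
store_unsigned (int v)
{
  struct unsigned_bits u;
  u.f = v;
  return u.f;
}
```

`store_signed (-1)` returns -1 and `store_unsigned (-1)` returns 1; `store_plain (-1)` returns whichever of the two the target's default or the command-line flag selects. A loop such as `while (s.f >= 0)` or a comparison against -1 on a plain `int` bit-field therefore behaves differently on a target like mcore-elf, which is one plausible way a test ends up spinning.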