Re: FW: [RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 in riscv like x86_64 and arm.

2023-06-01 Thread juzhe.zh...@rivai.ai
I plan to implement BF16 vector support in GCC, but I am still waiting for the
ISA to be ratified, since GCC policy doesn't allow un-ratified ISA extensions.

Currently, we are working on INT8, INT16, INT32, INT64, FP16, FP32 and FP64
auto-vectorization.
Supporting BF16 should be very simple in the current vector framework in GCC.

Thanks.


juzhe.zh...@rivai.ai
 
From: Li, Pan2
Date: 2023-06-01 14:57
To: juzhe.zh...@rivai.ai
Subject: FW: [RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 
in riscv like x86_64 and arm.
FYI.
 
-Original Message-
From: Gcc-patches  On Behalf 
Of Jin Ma via Gcc-patches
Sent: Thursday, June 1, 2023 2:51 PM
To: gcc-patches@gcc.gnu.org
Cc: shi...@iscas.ac.cn; kito.ch...@gmail.com; Jin Ma 
Subject: [RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 in 
riscv like x86_64 and arm.
 
hi, 
 
Are there any new developments on Zfb? Are there any plans to implement the 
Zvfbfmin and Zvfbfwma extensions? I see that Zfb is being reviewed in LLVM; 
maybe we should do the same in GCC.
 
Ref: https://reviews.llvm.org/D151313
 https://reviews.llvm.org/D150929
 


[PATCH] RISC-V: Introduce vfloat16m{f}*_t and their machine mode.

2023-06-01 Thread Pan Li via Gcc-patches
From: Pan Li 

This patch introduces the built-in types vfloat16m{f}*_t, as well as
their machine modes VNx*HF. They depend on the zvfhmin or zvfh
architecture extension.

When zvfhmin or zvfh is given, the macro TARGET_VECTOR_ELEN_FP_16 will
be true.

A follow-up patch will implement the zvfhmin extension based on this.
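
As a rough illustration of the mechanism described above, here is a minimal
sketch in C of how an extension name could set a mask bit that the
TARGET_VECTOR_ELEN_FP_16 test then reads. The mask values and the macro are
taken from the diff in this mail; the flag variable, `struct ext_flag`,
`ext_flag_table` and `enable_extension` are simplified, hypothetical stand-ins
and not the real GCC option machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mask values as they appear in the riscv-opts.h hunk of this patch.  */
#define MASK_VECTOR_ELEN_FP_32 (1 << 2)
#define MASK_VECTOR_ELEN_FP_64 (1 << 3)
#define MASK_VECTOR_ELEN_FP_16 (1 << 6)

/* Hypothetical stand-in for gcc_options::x_riscv_vector_elen_flags.  */
static int riscv_vector_elen_flags;

#define TARGET_VECTOR_ELEN_FP_16 \
  ((riscv_vector_elen_flags & MASK_VECTOR_ELEN_FP_16) != 0)

/* Simplified, hypothetical model of one riscv_ext_flag_table entry.  */
struct ext_flag
{
  const char *ext;
  int mask;
};

static const struct ext_flag ext_flag_table[] = {
  {"zve64f", MASK_VECTOR_ELEN_FP_32},
  {"zve64d", MASK_VECTOR_ELEN_FP_64},
  {"zvfhmin", MASK_VECTOR_ELEN_FP_16},
  {"zvfh", MASK_VECTOR_ELEN_FP_16},
};

/* Set the mask bit for EXT; return false if EXT is not in the table.  */
static bool
enable_extension (const char *ext)
{
  for (size_t i = 0; i < sizeof ext_flag_table / sizeof ext_flag_table[0]; i++)
    if (strcmp (ext_flag_table[i].ext, ext) == 0)
      {
        riscv_vector_elen_flags |= ext_flag_table[i].mask;
        return true;
      }
  return false;
}
```

In this sketch, calling `enable_extension ("zve64f")` leaves
TARGET_VECTOR_ELEN_FP_16 false, while `enable_extension ("zvfhmin")` or
`enable_extension ("zvfh")` makes it true.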

Signed-off-by: Pan Li 

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add FP_16 mask to zvfhmin
and zvfh.
* config/riscv/genrvv-type-indexer.cc (valid_type): Allow FP16.
(main): Disable FP16 tuple.
* config/riscv/riscv-opts.h (MASK_VECTOR_ELEN_FP_16): New macro.
(TARGET_VECTOR_ELEN_FP_16): Ditto.
* config/riscv/riscv-vector-builtins.cc (check_required_extensions):
Add FP16.
* config/riscv/riscv-vector-builtins.def (vfloat16mf4_t): New type.
(vfloat16mf2_t): Ditto.
(vfloat16m1_t): Ditto.
(vfloat16m2_t): Ditto.
(vfloat16m4_t): Ditto.
(vfloat16m8_t): Ditto.
* config/riscv/riscv-vector-builtins.h (RVV_REQUIRE_ELEN_FP_16):
New macro.
* config/riscv/riscv-vector-switch.def (ENTRY): Allow FP16
machine mode based on TARGET_VECTOR_ELEN_FP_16.
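
To make the relation between the valid_type change and the six new type names
concrete, here is a small sketch in C. The `valid_type` bounds for SEW 8, 16
and 32 mirror the genrvv-type-indexer.cc hunk below; the bound for SEW 64 is
not visible in the diff and is an assumption, and `vfloat_type_name` is a
hypothetical helper written only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Single-vector validity check, modelled on the diff.  */
static bool
valid_type (unsigned sew, int lmul_log2, bool float_p)
{
  switch (sew)
    {
    case 8:
      return lmul_log2 >= -3 && !float_p;
    case 16:
      return lmul_log2 >= -2;   /* FP16 is allowed after this patch.  */
    case 32:
      return lmul_log2 >= -1;
    case 64:
      return lmul_log2 >= 0;    /* Assumption: bound not shown in the diff.  */
    default:
      return false;
    }
}

/* Hypothetical helper: write e.g. "vfloat16mf4_t" for SEW=16, LMUL=1/4.  */
static void
vfloat_type_name (char *buf, size_t size, unsigned sew, int lmul_log2)
{
  if (lmul_log2 < 0)
    snprintf (buf, size, "vfloat%umf%d_t", sew, 1 << -lmul_log2);
  else
    snprintf (buf, size, "vfloat%um%d_t", sew, 1 << lmul_log2);
}
```

Iterating lmul_log2 over -3..3 with sew == 16 and float_p == true and keeping
only the valid combinations gives mf4, mf2, m1, m2, m4 and m8, which matches
the six types listed in the ChangeLog.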
---
 gcc/common/config/riscv/riscv-common.cc|  2 ++
 gcc/config/riscv/genrvv-type-indexer.cc|  7 +--
 gcc/config/riscv/riscv-opts.h  |  4 
 gcc/config/riscv/riscv-vector-builtins.cc  |  2 ++
 gcc/config/riscv/riscv-vector-builtins.def | 20 +++
 gcc/config/riscv/riscv-vector-builtins.h   |  1 +
 gcc/config/riscv/riscv-vector-switch.def   | 23 ++
 7 files changed, 49 insertions(+), 10 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index e6ed3df9ea6..3247d526c0a 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1248,6 +1248,8 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   {"zve64x",   &gcc_options::x_riscv_vector_elen_flags, MASK_VECTOR_ELEN_64},
   {"zve64f",   &gcc_options::x_riscv_vector_elen_flags, 
MASK_VECTOR_ELEN_FP_32},
   {"zve64d",   &gcc_options::x_riscv_vector_elen_flags, 
MASK_VECTOR_ELEN_FP_64},
+  {"zvfhmin",  &gcc_options::x_riscv_vector_elen_flags, 
MASK_VECTOR_ELEN_FP_16},
+  {"zvfh", &gcc_options::x_riscv_vector_elen_flags, 
MASK_VECTOR_ELEN_FP_16},
 
   {"zvl32b",&gcc_options::x_riscv_zvl_flags, MASK_ZVL32B},
   {"zvl64b",&gcc_options::x_riscv_zvl_flags, MASK_ZVL64B},
diff --git a/gcc/config/riscv/genrvv-type-indexer.cc 
b/gcc/config/riscv/genrvv-type-indexer.cc
index 18e1b375396..8fc93ceaab4 100644
--- a/gcc/config/riscv/genrvv-type-indexer.cc
+++ b/gcc/config/riscv/genrvv-type-indexer.cc
@@ -54,7 +54,7 @@ valid_type (unsigned sew, int lmul_log2, bool float_p)
 case 8:
   return lmul_log2 >= -3 && !float_p;
 case 16:
-  return lmul_log2 >= -2 && !float_p;
+  return lmul_log2 >= -2;
 case 32:
   return lmul_log2 >= -1;
 case 64:
@@ -73,6 +73,9 @@ valid_type (unsigned sew, int lmul_log2, unsigned nf, bool 
float_p)
   if (nf > 8 || nf < 1)
 return false;
 
+  if (sew == 16 && nf != 1 && float_p) // Disable FP16 tuple in temporarily.
+return false;
+
   switch (lmul_log2)
 {
 case 1:
@@ -342,7 +345,7 @@ main (int argc, const char **argv)
fprintf (fp, ")\n");
  }
   // Build for vfloat
-  for (unsigned sew : {32, 64})
+  for (unsigned sew : {16, 32, 64})
 for (int lmul_log2 : {-3, -2, -1, 0, 1, 2, 3})
   for (unsigned nf : {1, 2, 3, 4, 5, 6, 7, 8})
{
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 5f387d0e393..208a557b8ff 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -154,6 +154,8 @@ enum riscv_entity
 #define MASK_VECTOR_ELEN_64(1 << 1)
 #define MASK_VECTOR_ELEN_FP_32 (1 << 2)
 #define MASK_VECTOR_ELEN_FP_64 (1 << 3)
+/* Align the bit index to riscv-vector-builtins.h.  */
+#define MASK_VECTOR_ELEN_FP_16 (1 << 6)
 
 #define TARGET_VECTOR_ELEN_32 \
   ((riscv_vector_elen_flags & MASK_VECTOR_ELEN_32) != 0)
@@ -163,6 +165,8 @@ enum riscv_entity
   ((riscv_vector_elen_flags & MASK_VECTOR_ELEN_FP_32) != 0)
 #define TARGET_VECTOR_ELEN_FP_64 \
   ((riscv_vector_elen_flags & MASK_VECTOR_ELEN_FP_64) != 0)
+#define TARGET_VECTOR_ELEN_FP_16 \
+  ((riscv_vector_elen_flags & MASK_VECTOR_ELEN_FP_16) != 0)
 
 #define MASK_ZVL32B(1 <<  0)
 #define MASK_ZVL64B(1 <<  1)
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 9fea70709fd..43bf6d8f262 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -2944,6 +2944,8 @@ check_required_extensions (const function_instance 
&instance)
 
   uint64_t riscv_isa_flags = 0;
 
+  if (TARGET_VECTOR_ELEN_FP_16)
+riscv_isa_flags |= RVV_REQUIRE_ELEN_FP_16;
   if (TARGET_VECTOR_ELEN_FP_32)
 riscv_

Re: [PATCH] RISC-V: Introduce vfloat16m{f}*_t and their machine mode.

2023-06-01 Thread juzhe.zh...@rivai.ai
LGTM. 

We are waiting for FP16 vector support before starting floating-point auto-vectorization.

Thanks so much.


juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2023-06-01 15:17
To: gcc-patches
CC: juzhe.zhong; kito.cheng; pan2.li; yanzhang.wang
Subject: [PATCH] RISC-V: Introduce vfloat16m{f}*_t and their machine mode.
From: Pan Li 
 

[PATCH v5] tree-ssa-sink: Improve code sinking pass

2023-06-01 Thread Ajit Agarwal via Gcc-patches
Hello All:

This patch improves the code sinking pass so that statements are sunk before
calls, which reduces register pressure.
Review comments have been incorporated.

For example:

void bar();
int j;
void foo(int a, int b, int c, int d, int e, int f)
{
  int l;
  l = a + b + c + d +e + f;
  if (a != 5)
{
  bar();
  j = l;
}
}

With this patch, code sinking produces the following:

void bar();
int j;
void foo(int a, int b, int c, int d, int e, int f)
{
  int l;
  
  if (a != 5)
{
  l = a + b + c + d +e + f; 
  bar();
  j = l;
}
}
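
The two forms above are meant to be observably equivalent. The following is a
minimal sketch in C that checks this for the example only; the names
`foo_before`, `foo_after` and the `bar_calls` counter are hypothetical and
exist just for this illustration, and it says nothing about how the pass
itself decides where to sink.

```c
#include <assert.h>

static int bar_calls;   /* Number of times bar () ran.  */
static int j;

static void
bar (void)
{
  bar_calls++;
}

/* Original form: the sum is computed before the branch.  */
static void
foo_before (int a, int b, int c, int d, int e, int f)
{
  int l;
  l = a + b + c + d + e + f;
  if (a != 5)
    {
      bar ();
      j = l;
    }
}

/* Sunk form: the sum is computed inside the branch, before the call.  */
static void
foo_after (int a, int b, int c, int d, int e, int f)
{
  int l;
  if (a != 5)
    {
      l = a + b + c + d + e + f;
      bar ();
      j = l;
    }
}
```

Because the sum is placed before the call to bar (), its value does not have
to be kept live across the call, and it is not computed at all on the path
where a == 5.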

Bootstrapped and regtested on powerpc64-linux-gnu.

Thanks & Regards
Ajit


tree-ssa-sink: Improve code sinking pass

Currently, code sinking will sink code after function calls.  This increases
register pressure for callee-saved registers.  The following patch improves
code sinking by placing the sunk code before calls in the use block or in
the immediate dominator of the use blocks.

2023-06-01  Ajit Kumar Agarwal  

gcc/ChangeLog:

PR tree-optimization/81953
* tree-ssa-sink.cc (statement_sink_location): Move statements before
calls.
(def_use_same_block): New function.
(select_best_block): Add heuristics to select the best blocks in the
immediate post dominator.

gcc/testsuite/ChangeLog:

PR tree-optimization/81953
* gcc.dg/tree-ssa/ssa-sink-20.c: New testcase.
* gcc.dg/tree-ssa/ssa-sink-21.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c | 15 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c | 19 ++
 gcc/tree-ssa-sink.cc| 71 ++---
 3 files changed, 95 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
new file mode 100644
index 000..d3b79ca5803
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-20.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sink-stats" } */
+void bar();
+int j;
+void foo(int a, int b, int c, int d, int e, int f)
+{
+  int l;
+  l = a + b + c + d +e + f;
+  if (a != 5)
+{
+  bar();
+  j = l;
+}
+}
+/* { dg-final { scan-tree-dump 
{l_12\s+=\s+_4\s+\+\s+f_11\(D\);\n\s+bar\s+\(\)} sink1 } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
new file mode 100644
index 000..84e7938c54f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-21.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sink-stats" } */
+void bar();
+int j, x;
+void foo(int a, int b, int c, int d, int e, int f)
+{
+  int l;
+  l = a + b + c + d +e + f;
+  if (a != 5)
+{
+  bar();
+  if (b != 3)
+x = 3;
+  else
+x = 5;
+  j = l;
+}
+}
+/* { dg-final { scan-tree-dump 
{l_13\s+=\s+_4\s+\+\s+f_12\(D\);\n\s+bar\s+\(\)} sink1 } } */
diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
index b1ba7a2ad6c..f1d25f1a0f8 100644
--- a/gcc/tree-ssa-sink.cc
+++ b/gcc/tree-ssa-sink.cc
@@ -171,9 +171,28 @@ nearest_common_dominator_of_uses (def_operand_p def_p, 
bool *debug_stmts)
   return commondom;
 }
 
+/* Return TRUE if immediate uses of the defs in
+   STMT occur in the same block as STMT, FALSE otherwise.  */
+
+static bool
+def_use_same_block (gimple *stmt)
+{
+  def_operand_p def;
+  ssa_op_iter iter;
+
+  FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
+{
+  gimple *def_stmt = SSA_NAME_DEF_STMT (DEF_FROM_PTR (def));
+  if ((gimple_bb (def_stmt) == gimple_bb (stmt)))
+   return true;
+ }
+  return false;
+}
+
 /* Given EARLY_BB and LATE_BB, two blocks in a path through the dominator
tree, return the best basic block between them (inclusive) to place
-   statements.
+   statements. The best basic block should be an immediate dominator of
+   best basic block if the use stmt is after the call.
 
We want the most control dependent block in the shallowest loop nest.
 
@@ -190,7 +209,8 @@ nearest_common_dominator_of_uses (def_operand_p def_p, bool 
*debug_stmts)
 static basic_block
 select_best_block (basic_block early_bb,
   basic_block late_bb,
-  gimple *stmt)
+  gimple *stmt,
+  gimple *use)
 {
   basic_block best_bb = late_bb;
   basic_block temp_bb = late_bb;
@@ -237,7 +257,40 @@ select_best_block (basic_block early_bb,
   /* If result of comparsion is unknown, prefer EARLY_BB.
 Thus use !(...>=..) rather than (...<...)  */
   && !(best_bb->count * 100 >= early_bb->count * threshold))
-return best_bb;
+{
+  basic_block new_best_bb = get_immediate_dominator (CDI_DOMINATORS, 
best_bb);
+  /* Return best_bb if def and use are in same block otherwise new_best_bb.
+
+Things to consider:
+
+  new_best_bb is not equal to best_bb 

Re: [PATCH] RISC-V: Introduce vfloat16m{f}*_t and their machine mode.

2023-06-01 Thread Kito Cheng via Gcc-patches
LGTM, thanks :)

On Thu, Jun 1, 2023 at 3:20 PM juzhe.zh...@rivai.ai
 wrote:
>
> LGTM.
>
> We are waiting for FP16 vector to start floating-point auto-vectorizations
>
> Thanks so much.
>
>
> juzhe.zh...@rivai.ai
>
> From: pan2.li
> Date: 2023-06-01 15:17
> To: gcc-patches
> CC: juzhe.zhong; kito.cheng; pan2.li; yanzhang.wang
> Subject: [PATCH] RISC-V: Introduce vfloat16m{f}*_t and their machine mode.
> From: Pan Li 
>

RE: [PATCH] RISC-V: Introduce vfloat16m{f}*_t and their machine mode.

2023-06-01 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito and Juzhe.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, June 1, 2023 3:21 PM
To: juzhe.zh...@rivai.ai
Cc: Li, Pan2 ; gcc-patches ; 
Kito.cheng ; Wang, Yanzhang 
Subject: Re: [PATCH] RISC-V: Introduce vfloat16m{f}*_t and their machine mode.

LGTM, thanks :)

On Thu, Jun 1, 2023 at 3:20 PM juzhe.zh...@rivai.ai  
wrote:
>
> LGTM.
>
> We are waiting for FP16 vector to start floating-point 
> auto-vectorizations
>
> Thanks so much.
>
>
> juzhe.zh...@rivai.ai
>
> From: pan2.li
> Date: 2023-06-01 15:17
> To: gcc-patches
> CC: juzhe.zhong; kito.cheng; pan2.li; yanzhang.wang
> Subject: [PATCH] RISC-V: Introduce vfloat16m{f}*_t and their machine mode.
> From: Pan Li 
>

Re: [PATCH 1/3] testsuite: Unbork multilib testing on RISC-V (and any target really)

2023-06-01 Thread Thomas Schwinge
Hi!

First, Vineet, great that you've now tracked this down!  :-) Indeed
"early exit" vs. 'torture-finish' was exactly the issue that I suspected.

It may not be what you originally intended, but I hope at least you've
learned some things about DejaGnu/TCL...  ;-P

Yesterday, I actually had begun looking into this.  To avoid the big
download and having to wait for a lot of packages to be built with your
'riscv-gnu-toolchain' recipe:
,
I intended to do just a quick GCC build on compile farm gcc92, which
(a) didn't turn out to be quick, and (b) eventually failed due to

"Bootstrap on RISC-V on Ubuntu 22.04 LTS: bits/libc-header-start.h: No such 
file or directory"...

(I'm now running 'riscv-gnu-toolchain' to verify this, and another thing.)

Before we push your patch, let me please verify that it indeed doesn't
change any 'gcc.misc-tests/i386-prefetch.exp' semantics, and:

On 2023-05-31T19:13:01+0100, Iain Sandoe via Gcc-patches 
 wrote:
>> On 31 May 2023, at 18:57, Jeff Law via Gcc-patches  
>> wrote:
>> On 5/31/23 10:25, Vineet Gupta wrote:
>>> Multilib testing on trunk is currently busted (and surprisingly this
>>> affects any/all targets but it seems nobody cares). We currently get the
>>> following splat:
>> I wouldn't say that nobody cares, it just hasn't bubbled up on anyone's 
>> priority list yet (most developers aren't working on targets that make heavy 
>> use of multilibs).

So I regularly do build x86_64 GNU/Linux with default '-m64' plus '-m32'
multilib -- but of course, there's no "early exit" for those, as there's
no 'string match "* -march=*" " [board_info target multilib_flags] "'...

>> But probably more importantly, this problem seems to not be triggering on 
>> all multilib targets.  For example, I just examined my tester's build logs 
>> and couldn't see this on the H8/300 or V850 ports.  Which begs the question, 
>> why?

..., which may be the case for those, too?  In other words: the problem
only shows up if '-march=[...]' appears in the flags, which indeed may
not be a common thing?  I'll cross-verify this with x86_64 and
'-march=[...]' flags.

And, I still intend to figure out why this issue apparently disappears
with my recent 'LTO_TORTURE_OPTIONS' patches reverted:
.


Otherwise:

> I do have a multilib problem [with libgomp] on Darwin (which has been noticed 
> : https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109951) but it is not obvious 
> how the fix proposed would solve this - unless it’s some subtle change in 
> global content for the multilib options.
>
> (testing anyway)

No, this is really a separate issue.  I understand what's happening, and
have an idea about how to address this that I'll post later.


Grüße
 Thomas


[PATCH] RISC-V: Add pseudo vwmul.wv pattern to enhance vwmul.vv instruction optimizations

2023-06-01 Thread juzhe . zhong
From: Juzhe-Zhong 

This patch enhances the combine optimizations for vwmul.vv.
Consider the following code:
void
vwadd_int16_t_int8_t (int16_t *__restrict dst, int16_t *__restrict dst2,
  int16_t *__restrict dst3, int16_t *__restrict dst4,
  int8_t *__restrict a, int8_t *__restrict b,
  int8_t *__restrict a2, int8_t *__restrict b2, int n)
{
  for (int i = 0; i < n; i++)
{
  dst[i] = (int16_t) a[i] * (int16_t) b[i];
  dst2[i] = (int16_t) a2[i] * (int16_t) b[i];
  dst3[i] = (int16_t) a2[i] * (int16_t) a[i];
  dst4[i] = (int16_t) a[i] * (int16_t) b2[i];
}
}

In such a complicated case, the operands are not single-use; each one is used
by multiple statements.
The GCC combine pass will iterate over the combinations of the operands.

First round -> combine one of the operands and change vsext + vmul into vwmul.wv
Second round -> combine the other operand and change vwmul.wv into vwmul.vv

Note that adding a pseudo vwmul.wv pattern makes the vwmulsu.vv testcase fail,
since GCC prefers this pattern order:

(mul: (zero_extend)
  (sign_extend))

So this patch changes the operand order of the vwmulsu.vv instruction.
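
To illustrate the two combine rounds described above, here is a scalar sketch
in C of the semantics involved. It is not the RTL pattern from this patch; all
function names are hypothetical. `widen_mul_wv` models the pseudo vwmul.wv
step (one operand already widened), `widen_mul_vv` models vwmul.vv, and
`widen_mul_su` models the signed-by-unsigned product of vwmulsu.vv.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of vsext: sign-extend each int8_t element to int16_t.  */
static void
sign_extend (int16_t *dst, const int8_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (int16_t) src[i];
}

/* Model of the pseudo vwmul.wv: wide operand times extended narrow one.  */
static void
widen_mul_wv (int16_t *dst, const int16_t *wide, const int8_t *narrow,
              size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (int16_t) (wide[i] * (int16_t) narrow[i]);
}

/* Model of vwmul.vv: both narrow operands are sign-extended.  */
static void
widen_mul_vv (int16_t *dst, const int8_t *a, const int8_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (int16_t) ((int16_t) a[i] * (int16_t) b[i]);
}

/* Model of vwmulsu.vv: signed operand times zero-extended operand.  */
static void
widen_mul_su (int16_t *dst, const int8_t *a, const uint8_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (int16_t) ((int16_t) a[i] * (int16_t) b[i]);
}
```

Running `sign_extend` on one operand and then `widen_mul_wv` gives the same
result as a single `widen_mul_vv`, which is the equivalence the second combine
round relies on.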

gcc/ChangeLog:

* config/riscv/vector.md: Shift zero_extend and sign_extend order.
* config/riscv/autovec-opt.md: New file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/widen/widen-7.c: New test.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c: New test.
* gcc.target/riscv/rvv/autovec/widen/widen_run-7.c: New test.

---
 gcc/config/riscv/autovec-opt.md   | 56 +++
 gcc/config/riscv/vector.md|  9 +--
 .../riscv/rvv/autovec/widen/widen-7.c | 27 +
 .../rvv/autovec/widen/widen-complicate-3.c| 32 +++
 .../riscv/rvv/autovec/widen/widen_run-7.c | 34 +++
 5 files changed, 154 insertions(+), 4 deletions(-)
 create mode 100644 gcc/config/riscv/autovec-opt.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
new file mode 100644
index 000..5b7dc9bef8c
--- /dev/null
+++ b/gcc/config/riscv/autovec-opt.md
@@ -0,0 +1,56 @@
+;; Machine description for optimization of RVV auto-vectorization.
+;; Copyright (C) 2023 Free Software Foundation, Inc.
+;; Contributed by Juzhe Zhong (juzhe.zh...@rivai.ai), RiVAI Technologies Ltd.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+;; We don't have vwmul.wv instruction like vwadd.wv in RVV.
+;; This pattern is an intermediate RTL IR as a pseudo vwmul.wv to enhance
+;; optimization of instructions combine.
+(define_insn_and_split "@pred_single_widen_mul"
+  [(set (match_operand:VWEXTI 0 "register_operand"  "=&vr,&vr")
+   (if_then_else:VWEXTI
+ (unspec:
+   [(match_operand: 1 "vector_mask_operand"   "vmWc1,vmWc1")
+(match_operand 5 "vector_length_operand"  "   rK,   rK")
+(match_operand 6 "const_int_operand"  "i,i")
+(match_operand 7 "const_int_operand"  "i,i")
+(match_operand 8 "const_int_operand"  "i,i")
+(reg:SI VL_REGNUM)
+(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+ (mult:VWEXTI
+   (any_extend:VWEXTI
+ (match_operand: 4 "register_operand" "   vr,   vr"))
+   (match_operand:VWEXTI 3 "register_operand" "   vr,   vr"))
+ (match_operand:VWEXTI 2 "vector_merge_operand"   "   vu,0")))]
+  "TARGET_VECTOR"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(const_int 0)]
+  {
+insn_code icode = code_for_pred_vf2 (, mode);
+rtx tmp = gen_reg_rtx (mode);
+rtx ops[] = {tmp, operands[4]};
+riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops);
+
+emit_insn (gen_pred (MULT, mode, operands[0], operands[1], operands[2],
+operands[3], tmp, operands[5], operands[6],
+operands[7], operands[8]));
+DONE;
+  }
+  [(set_attr "type" "viwmul")
+   (set_attr "mode" 

[PATCH] Don't try bswap + rotate when TYPE_PRECISION(n->type) > n->range.

2023-06-01 Thread liuhongt via Gcc-patches
For the testcase in the PR, we have

  br64 = br;
  br64 = ((br64 << 16) & 0x00ffull) | (br64 & 0xff00ull);

  n->n: 0x300200.
  n->range: 32.
  n->type: uint64.

The original code assumes n->range is the same as TYPE_PRECISION(n->type),
and tries to rotate the mask from 0x30200 -> 0x20300, which is
incorrect. The patch fixes this bug by not trying bswap + rotate when
TYPE_PRECISION(n->type) is not equal to n->range.
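
As a rough illustration of why the width matters, the sketch below rotates
the same bit pattern inside a 32-bit window and inside a 64-bit window. It is
not the code from gimple-ssa-store-merging.cc: rotl32, rotl64 and
rotate_allowed are hypothetical helpers, and 0x30200 is borrowed from the
message only as a convenient bit pattern (in the pass it is a symbolic marker,
not data).

```cpp
#include <cstdint>

// Rotate left within a 32-bit window.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned k) {
  k &= 31;
  return k == 0 ? x : (x << k) | (x >> (32 - k));
}

// Rotate left within a 64-bit window.
constexpr std::uint64_t rotl64(std::uint64_t x, unsigned k) {
  k &= 63;
  return k == 0 ? x : (x << k) | (x >> (64 - k));
}

// Hypothetical stand-in for the added guard: only consider a rotate when
// the tracked range covers the whole type.
constexpr bool rotate_allowed(unsigned range_bits, unsigned precision_bits) {
  return range_bits == precision_bits;
}
```

The two rotates disagree in the low 32 bits, so a rotation derived for a
32-bit range cannot simply be applied to a value held in a 64-bit type.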

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR tree-optimization/110067
* gimple-ssa-store-merging.cc (find_bswap_or_nop): Don't try
bswap + rotate when TYPE_PRECISION(n->type) > n->range.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110067.c: New test.
---
 gcc/gimple-ssa-store-merging.cc  |  3 +
 gcc/testsuite/gcc.target/i386/pr110067.c | 77 
 2 files changed, 80 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr110067.c

diff --git a/gcc/gimple-ssa-store-merging.cc b/gcc/gimple-ssa-store-merging.cc
index 9cb574fa315..401496a9231 100644
--- a/gcc/gimple-ssa-store-merging.cc
+++ b/gcc/gimple-ssa-store-merging.cc
@@ -1029,6 +1029,9 @@ find_bswap_or_nop (gimple *stmt, struct symbolic_number 
*n, bool *bswap,
   /* TODO, handle cast64_to_32 and big/litte_endian memory
 source when rsize < range.  */
   if (n->range == orig_range
+ /* There are cases like 0x30200 for uint32->uint64 cast,
+Don't handle this.  */
+ && n->range == TYPE_PRECISION (n->type)
  && ((orig_range == 32
   && optab_handler (rotl_optab, SImode) != CODE_FOR_nothing)
  || (orig_range == 64
diff --git a/gcc/testsuite/gcc.target/i386/pr110067.c 
b/gcc/testsuite/gcc.target/i386/pr110067.c
new file mode 100644
index 000..c4208811628
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110067.c
@@ -0,0 +1,77 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-strict-aliasing" } */
+
+#include <stdint.h>
+#define force_inline __inline__ __attribute__ ((__always_inline__))
+
+__attribute__((noipa))
+static void
+fetch_pixel_no_alpha_32_bug (void *out)
+{
+  uint32_t *ret = out;
+  *ret = 0xff499baf;
+}
+
+static force_inline uint32_t
+bilinear_interpolation_local (uint32_t tl, uint32_t tr,
+ uint32_t bl, uint32_t br,
+ int distx, int disty)
+{
+  uint64_t distxy, distxiy, distixy, distixiy;
+  uint64_t tl64, tr64, bl64, br64;
+  uint64_t f, r;
+
+  distx <<= 1;
+  disty <<= 1;
+
+  distxy = distx * disty;
+  distxiy = distx * (256 - disty);
+  distixy = (256 - distx) * disty;
+  distixiy = (256 - distx) * (256 - disty);
+
+  /* Alpha and Blue */
+  tl64 = tl & 0xffff;
+  tr64 = tr & 0xffff;
+  bl64 = bl & 0xffff;
+  br64 = br & 0xffff;
+
+  f = tl64 * distixiy + tr64 * distxiy + bl64 * distixy + br64 * distxy;
+  r = f & 0xffffull;
+
+  /* Red and Green */
+  tl64 = tl;
+  tl64 = ((tl64 << 16) & 0x00ffull) | (tl64 & 0xff00ull);
+
+  tr64 = tr;
+  tr64 = ((tr64 << 16) & 0x00ffull) | (tr64 & 0xff00ull);
+
+  bl64 = bl;
+  bl64 = ((bl64 << 16) & 0x00ffull) | (bl64 & 0xff00ull);
+
+  br64 = br;
+  br64 = ((br64 << 16) & 0x00ffull) | (br64 & 0xff00ull);
+
+  f = tl64 * distixiy + tr64 * distxiy + bl64 * distixy + br64 * distxy;
+  r |= ((f >> 16) & 0x00ffull) | (f & 0xff00ull);
+
+  return (uint32_t)(r >> 16);
+}
+
+__attribute__((noipa))
+static void
+bits_image_fetch_pixel_bilinear_32_bug (void *out)
+{
+  uint32_t br;
+  uint32_t *ret = out;
+
+  fetch_pixel_no_alpha_32_bug (&br);
+  *ret = bilinear_interpolation_local (0, 0, 0, br, 0x41, 0x42);
+}
+
+int main() {
+  uint32_t r;
+  bits_image_fetch_pixel_bilinear_32_bug (&r);
+  if (r != 0x4213282d)
+__builtin_abort ();
+  return 0;
+}
-- 
2.39.1.388.g2fc9e9ca3c



Re: Re: [PATCH V3] VECT: Change flow of decrement IV

2023-06-01 Thread juzhe.zh...@rivai.ai
Thanks Kewen. Let's wait for Richard and Richi.



juzhe.zh...@rivai.ai
 
From: Kewen.Lin
Date: 2023-06-01 13:24
To: juzhe.zh...@rivai.ai
CC: richard.sandiford; rguenther; gcc-patches
Subject: Re: [PATCH V3] VECT: Change flow of decrement IV
Hi,
 
on 2023/6/1 13:00, juzhe.zh...@rivai.ai wrote:
> This patch is no different from V2.
 
I support this patch based on the testing and SPEC2017 evaluation
results on Power (see my comments on patch v2).
 
> Just added PR tree-optimization/109971 as Kewen suggested.
 
Thanks for adding that. I was expecting you to add it when committing,
not really requesting a new version. :)  Btw, the PR marker(s) trigger
scripts that post some commit info (commit link, commit log) to the
specified PR(s), so people can easily find the connections between PRs
and the commits that fix or advance them.
 
BR,
Kewen
 
> 
> Already bootstrapped and regression-tested on x86 with no differences.
> 
> Ok for trunk ?
> --
> juzhe.zh...@rivai.ai
> 
>  
> *From:* juzhe.zhong 
> *Date:* 2023-06-01 12:36
> *To:* gcc-patches 
> *CC:* richard.sandiford; rguenther; linkw; Ju-Zhe Zhong
> *Subject:* [PATCH V3] VECT: Change flow of decrement IV
> From: Ju-Zhe Zhong 
>  
> Following Richi's suggestion, I changed the current decrement IV flow from:
>  
> do {
>remain -= MIN (vf, remain);
> } while (remain != 0);
>  
> into:
>  
> do {
>old_remain = remain;
>len = MIN (vf, remain);
>remain -= vf;
> } while (old_remain > vf);
>  
> to enhance SCEV.
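
A minimal sketch of the two flows, written as plain C++ so they can be
compared side by side. The function names are hypothetical and the vectors
only record the per-iteration length. The new flow uses the strict comparison
old_remain > vf, matching the ivtmp_9 > POLY_INT_CST [4, 4] test in the
patch's GIMPLE comment; with >= it would run one extra zero-length iteration
whenever the count is an exact multiple of vf.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Original flow: subtract the clamped length, exit when nothing remains.
std::vector<std::uint64_t> lens_old_flow(std::uint64_t remain, std::uint64_t vf) {
  std::vector<std::uint64_t> lens;
  do {
    std::uint64_t len = std::min(vf, remain);
    lens.push_back(len);
    remain -= len;
  } while (remain != 0);
  return lens;
}

// New flow: the IV always steps by vf, and the exit test looks at the
// value the IV had before the decrement.
std::vector<std::uint64_t> lens_new_flow(std::uint64_t remain, std::uint64_t vf) {
  std::vector<std::uint64_t> lens;
  std::uint64_t old_remain;
  do {
    old_remain = remain;
    lens.push_back(std::min(vf, remain));
    remain -= vf;  // may wrap on the last iteration; the value is then unused
  } while (old_remain > vf);
  return lens;
}
```

Because the step is now the invariant vf rather than the clamped length, the
IV becomes a simple affine recurrence, which is the shape SCEV can analyze.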
>  
> Includes fixes from Kewen.
>  
>  
> This patch will need to wait for Kewen's test feedback.
>  
> Testing on X86 is on-going
>  
> Co-Authored by: Kewen Lin  
>  
>   PR tree-optimization/109971
>  
> gcc/ChangeLog:
>  
> * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): 
> Change decrement IV flow.
> (vect_set_loop_condition_partial_vectors): Ditto.
>  
> ---
> gcc/tree-vect-loop-manip.cc | 36 +---
> 1 file changed, 25 insertions(+), 11 deletions(-)
>  
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index acf3642ceb2..3f735945e67 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -483,7 +483,7 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
> gimple_stmt_iterator loop_cond_gsi,
> rgroup_controls *rgc, tree niters,
> tree niters_skip, bool might_wrap_p,
> - tree *iv_step)
> + tree *iv_step, tree *compare_step)
> {
>tree compare_type = LOOP_VINFO_RGROUP_COMPARE_TYPE (loop_vinfo);
>tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
> @@ -538,9 +538,9 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
>...
>vect__4.8_28 = .LEN_LOAD (_17, 32B, _36, 0);
>...
> -ivtmp_35 = ivtmp_9 - _36;
> +ivtmp_35 = ivtmp_9 - POLY_INT_CST [4, 4];
>...
> -if (ivtmp_35 != 0)
> +if (ivtmp_9 > POLY_INT_CST [4, 4])
>  goto ; [83.33%]
>else
>  goto ; [16.67%]
> @@ -549,13 +549,15 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
>tree step = rgc->controls.length () == 1 ? rgc->controls[0]
>: make_ssa_name (iv_type);
>/* Create decrement IV.  */
> -  create_iv (nitems_total, MINUS_EXPR, step, NULL_TREE, loop, 
> &incr_gsi,
> - insert_after, &index_before_incr, &index_after_incr);
> +  create_iv (nitems_total, MINUS_EXPR, nitems_step, NULL_TREE, loop,
> + &incr_gsi, insert_after, &index_before_incr,
>  

[PATCH V2] RISC-V: Add pseudo vwmul.wv pattern to enhance vwmul.vv instruction optimizations

2023-06-01 Thread juzhe . zhong
From: Juzhe-Zhong 

This patch enhances the combine optimizations for vwmul.vv.
Consider the following code:
void
vwadd_int16_t_int8_t (int16_t *__restrict dst, int16_t *__restrict dst2,
  int16_t *__restrict dst3, int16_t *__restrict dst4,
  int8_t *__restrict a, int8_t *__restrict b,
  int8_t *__restrict a2, int8_t *__restrict b2, int n)
{
  for (int i = 0; i < n; i++)
{
  dst[i] = (int16_t) a[i] * (int16_t) b[i];
  dst2[i] = (int16_t) a2[i] * (int16_t) b[i];
  dst3[i] = (int16_t) a2[i] * (int16_t) a[i];
  dst4[i] = (int16_t) a[i] * (int16_t) b2[i];
}
}

In such a complicated case, an operand is not single-use: it is used by
multiple statements.
The GCC combine pass iterates over the combinations of the operands.

Also, we add another pattern for vwmulsu.vv to enhance the vwmulsu.vv
optimization.
Currently, we have this format:

(mult: (sign_extend) (zero_extend)) in vector.md for intrinsic calls.
Now, we add a new vwmulsu.ww with this format:
(mult: (zero_extend) (sign_extend))

This handles the following case (code mixing signed and unsigned widening
multiplication):
void
vwadd_int16_t_int8_t (int16_t *__restrict dst, int16_t *__restrict dst2,
  int16_t *__restrict dst3, int16_t *__restrict dst4,
  int8_t *__restrict a, uint8_t *__restrict b,
  uint8_t *__restrict a2, int8_t *__restrict b2, int n)
{
  for (int i = 0; i < n; i++)
{
  dst[i] = (int16_t) a[i] * (int16_t) b[i];
  dst2[i] = (int16_t) a2[i] * (int16_t) b[i];
  dst3[i] = (int16_t) a2[i] * (int16_t) a[i];
  dst4[i] = (int16_t) a[i] * (int16_t) b2[i];
}
}
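
For reference, a scalar sketch of what each lane computes in the three
widening multiply flavours that show up in the assembly below. The helper
names are hypothetical, and the unsigned x unsigned case returns uint16_t here
(the loop above stores into int16_t) so that every product is representable.

```cpp
#include <cstdint>

// signed x signed: both inputs are sign-extended (vwmul.vv shape).
constexpr std::int16_t widen_mul_ss(std::int8_t a, std::int8_t b) {
  return static_cast<std::int16_t>(static_cast<std::int16_t>(a) *
                                   static_cast<std::int16_t>(b));
}

// unsigned x unsigned: both inputs are zero-extended (vwmulu.vv shape).
constexpr std::uint16_t widen_mul_uu(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint16_t>(static_cast<std::uint16_t>(a) *
                                    static_cast<std::uint16_t>(b));
}

// signed x unsigned: one input is sign-extended, the other zero-extended
// (vwmulsu.vv shape).
constexpr std::int16_t widen_mul_su(std::int8_t a, std::uint8_t b) {
  return static_cast<std::int16_t>(static_cast<std::int16_t>(a) *
                                   static_cast<std::int16_t>(b));
}
```

Multiplication is commutative, so (zero_extend x sign_extend) and
(sign_extend x zero_extend) describe the same lane result; the extra pattern
only matters for which RTL operand order combine is able to match.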

Before this patch:

...
vsetvli zero,t1,e8,m1,ta,ma
vle8.v  v1,0(a4)
vsetvli t3,zero,e16,m2,ta,ma
vsext.vf2   v6,v1
vsetvli zero,t1,e8,m1,ta,ma
vle8.v  v1,0(a5)
vsetvli t3,zero,e16,m2,ta,ma
add t0,a0,t4
vzext.vf2   v4,v1
vmul.vv v2,v4,v6
vsetvli zero,t1,e16,m2,ta,ma
vse16.v v2,0(t0)
vle8.v  v1,0(a6)
vsetvli t3,zero,e16,m2,ta,ma
add t0,a1,t4
vzext.vf2   v2,v1
vmul.vv v4,v2,v4
vsetvli zero,t1,e16,m2,ta,ma
vse16.v v4,0(t0)
vsetvli t3,zero,e16,m2,ta,ma
add t0,a2,t4
vmul.vv v2,v2,v6
vsetvli zero,t1,e16,m2,ta,ma
vse16.v v2,0(t0)
add t0,a3,t4
vle8.v  v1,0(a7)
vsetvli t3,zero,e16,m2,ta,ma
sub t6,t6,t1
vsext.vf2   v2,v1
vmul.vv v2,v2,v6
vsetvli zero,t1,e16,m2,ta,ma
vse16.v v2,0(t0)
...

After this patch:
...
vsetvli zero,t1,e8,mf2,ta,ma
vle8.v  v1,0(a4)
vle8.v  v3,0(a5)
vsetvli t6,zero,e8,mf2,ta,ma
add t0,a0,t3
vwmulsu.vv  v2,v1,v3
vsetvli zero,t1,e16,m1,ta,ma
vse16.v v2,0(t0)
vle8.v  v2,0(a6)
vsetvli t6,zero,e8,mf2,ta,ma
add t0,a1,t3
vwmulu.vv   v4,v3,v2
vsetvli zero,t1,e16,m1,ta,ma
vse16.v v4,0(t0)
vsetvli t6,zero,e8,mf2,ta,ma
add t0,a2,t3
vwmulsu.vv  v3,v1,v2
vsetvli zero,t1,e16,m1,ta,ma
vse16.v v3,0(t0)
add t0,a3,t3
vle8.v  v3,0(a7)
vsetvli t6,zero,e8,mf2,ta,ma
sub t4,t4,t1
vwmul.vv    v2,v1,v3
vsetvli zero,t1,e16,m1,ta,ma
vse16.v v2,0(t0)
...

gcc/ChangeLog:

* config/riscv/vector.md: Add autovec-opt.md.
* config/riscv/autovec-opt.md: New file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/widen/widen-7.c: New test.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c: New test.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-4.c: New test.
* gcc.target/riscv/rvv/autovec/widen/widen_run-7.c: New test.

---
 gcc/config/riscv/autovec-opt.md   | 80 +++
 gcc/config/riscv/vector.md|  3 +-
 .../riscv/rvv/autovec/widen/widen-7.c | 27 +++
 .../rvv/autovec/widen/widen-complicate-3.c| 32 
 .../rvv/autovec/widen/widen-complicate-4.c| 31 +++
 .../riscv/rvv/autovec/widen/widen_run-7.c | 34 
 6 files changed, 206 insertions(+), 1 deletion(-)
 create mode 100644 gcc/config/riscv/autovec-opt.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
new file mode 100644
index 000..92cdc4e9a16
--- /dev/null
+++ b/gcc/config/riscv/autovec-opt.md
@@ -0,0 +1,80 @@
+;; Machine description for optimization of 

Re: [PATCH] RISC-V: Add pseudo vwmul.wv pattern to enhance vwmul.vv instruction optimizations

2023-06-01 Thread juzhe.zh...@rivai.ai
Hi, please disregard this patch.
Go directly to the V2 patch with the same title.

That's the last patch I am fine-tuning for integer widening auto-vectorization.

Thanks.


juzhe.zh...@rivai.ai
 
From: juzhe.zhong
Date: 2023-06-01 15:31
To: gcc-patches
CC: kito.cheng; kito.cheng; palmer; palmer; jeffreyalaw; rdapp.gcc; Juzhe-Zhong
Subject: [PATCH] RISC-V: Add pseudo vwmul.wv pattern to enhance vwmul.vv 
instruction optimizations
From: Juzhe-Zhong 
 
This patch enhances the combine optimizations for vwmul.vv.
Consider the following code:
void
vwadd_int16_t_int8_t (int16_t *__restrict dst, int16_t *__restrict dst2,
  int16_t *__restrict dst3, int16_t *__restrict dst4,
  int8_t *__restrict a, int8_t *__restrict b,
  int8_t *__restrict a2, int8_t *__restrict b2, int n)
{
  for (int i = 0; i < n; i++)
{
  dst[i] = (int16_t) a[i] * (int16_t) b[i];
  dst2[i] = (int16_t) a2[i] * (int16_t) b[i];
  dst3[i] = (int16_t) a2[i] * (int16_t) a[i];
  dst4[i] = (int16_t) a[i] * (int16_t) b2[i];
}
}
 
In such a complicated case, an operand is not single-use: it is used by
multiple statements.
The GCC combine pass iterates over the combinations of the operands.
 
First round -> combine one of the operands and change vsext + vmul into vwmul.wv
Second round -> combine the other operand and change vwmul.wv into vwmul.vv
 
Note that when I add a pseudo vwmul.wv pattern, the vwmulsu.vv testcase fails
since GCC prefers this pattern order:
 
(mul: (zero_extend)
  (sign_extend))
 
So change the vwmulsu.vv instruction operand order.
 
gcc/ChangeLog:
 
* config/riscv/vector.md: Shift zero_extend and sign_extend order.
* config/riscv/autovec-opt.md: New file.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/widen/widen-7.c: New test.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c: New test.
* gcc.target/riscv/rvv/autovec/widen/widen_run-7.c: New test.
 
---
gcc/config/riscv/autovec-opt.md   | 56 +++
gcc/config/riscv/vector.md|  9 +--
.../riscv/rvv/autovec/widen/widen-7.c | 27 +
.../rvv/autovec/widen/widen-complicate-3.c| 32 +++
.../riscv/rvv/autovec/widen/widen_run-7.c | 34 +++
5 files changed, 154 insertions(+), 4 deletions(-)
create mode 100644 gcc/config/riscv/autovec-opt.md
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-7.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-7.c
 
diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
new file mode 100644
index 000..5b7dc9bef8c
--- /dev/null
+++ b/gcc/config/riscv/autovec-opt.md
@@ -0,0 +1,56 @@
+;; Machine description for optimization of RVV auto-vectorization.
+;; Copyright (C) 2023 Free Software Foundation, Inc.
+;; Contributed by Juzhe Zhong (juzhe.zh...@rivai.ai), RiVAI Technologies Ltd.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; We don't have vwmul.wv instruction like vwadd.wv in RVV.
+;; This pattern is an intermediate RTL IR as a pseudo vwmul.wv to enhance
+;; optimization of instructions combine.
+(define_insn_and_split "@pred_single_widen_mul"
+  [(set (match_operand:VWEXTI 0 "register_operand"  "=&vr,&vr")
+ (if_then_else:VWEXTI
+   (unspec:
+ [(match_operand: 1 "vector_mask_operand"   "vmWc1,vmWc1")
+  (match_operand 5 "vector_length_operand"  "   rK,   rK")
+  (match_operand 6 "const_int_operand"  "i,i")
+  (match_operand 7 "const_int_operand"  "i,i")
+  (match_operand 8 "const_int_operand"  "i,i")
+  (reg:SI VL_REGNUM)
+  (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+   (mult:VWEXTI
+ (any_extend:VWEXTI
+   (match_operand: 4 "register_operand" "   vr,   vr"))
+ (match_operand:VWEXTI 3 "register_operand" "   vr,   vr"))
+   (match_operand:VWEXTI 2 "vector_merge_operand"   "   vu,0")))]
+  "TARGET_VECTOR"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(const_int 0)]
+  {
+insn_code icode = code_for_pred_vf2 (, mode);
+rtx tmp = gen_reg_rtx (mode);
+rtx ops[] = {tmp, operands[4]};
+riscv_vector::emit_vlmax_insn (icode, riscv_vector::R

Re: [PATCH v5] tree-ssa-sink: Improve code sinking pass

2023-06-01 Thread Bernhard Reutner-Fischer via Gcc-patches
On 1 June 2023 09:20:08 CEST, Ajit Agarwal  wrote:
>Hello All:
>
>This patch improves code sinking pass to sink statements before call to reduce
>register pressure.
>Review comments are incorporated.

Hi Ajit!

I had two comments for v4 that you did not address in v5 or follow up on.
thanks,


[PATCH][committed] aarch64: Add =r, m and =m, r alternatives to 64-bit vector move patterns

2023-06-01 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

We can use the X registers to load and store 64-bit vector modes; we just need
to add the alternatives to the mov patterns. This straightforward patch does
that, and for the pair variants too.
For the testcase in the patch we now generate the optimal assembly without any
superfluous GP<->SIMD moves.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (*aarch64_simd_mov):
Add =r,m and =m,r alternatives.
(load_pair): Likewise.
(vec_store_pair): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/xreg-vec-modes_1.c: New test.


rm64.patch
Description: rm64.patch


[committed] libstdc++: Fix condition for supported SIMD types on ARMv8

2023-06-01 Thread Matthias Kretz via Gcc-patches
pushed to trunk, will backport

tested on arm-linux-gnueabihf

-- 8< --

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR libstdc++/110050
* include/experimental/bits/simd.h (__vectorized_sizeof): With
__have_neon_a32 only single-precision float works (in addition
to integers).
---
 libstdc++-v3/include/experimental/bits/simd.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index f94b8361ab0..834fe923065 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -2808,8 +2808,10 @@ __vectorized_sizeof()
 	  return 16;
 
 	// ARM:
-	if constexpr (__have_neon_a64
-		  || (__have_neon_a32 && !is_same_v<_Tp, double>) )
+	if constexpr (__have_neon_a64)
+	  return 16;
+	if constexpr (__have_neon_a32 and (not is_floating_point_v<_Tp>
+	 or is_same_v<_Tp, float>))
 	  return 16;
 	if constexpr (__have_neon
 		  && sizeof(_Tp) < 8


Re: [r14-1452 Regression] FAIL: g++.dg/pr104547.C -std=gnu++17 scan-tree-dump-not vrp2 "_M_default_append" on Linux/x86_64

2023-06-01 Thread Christophe Lyon via Gcc-patches
Hi!

We have noticed the same problem on aarch64, if that's easier to reproduce.

Thanks,
Christophe


On Thu, 1 Jun 2023 at 06:20, haochen.jiang via Gcc-regression <
gcc-regress...@gcc.gnu.org> wrote:

> On Linux/x86_64,
>
> fb409a15d9babc78fe1d9957afcbaf1102cce58f is the first bad commit
> commit fb409a15d9babc78fe1d9957afcbaf1102cce58f
> Author: Jonathan Wakely 
> Date:   Thu May 25 09:57:46 2023 +0100
>
> libstdc++: Express std::vector's size() <= capacity() invariant in code
>
> caused
>
> FAIL: g++.dg/pr104547.C  -std=gnu++14  scan-tree-dump-not vrp2
> "_M_default_append"
> FAIL: g++.dg/pr104547.C  -std=gnu++17  scan-tree-dump-not vrp2
> "_M_default_append"
>
> with GCC configured with
>
> ../../gcc/configure
> --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-1452/usr
> --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
> --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet
> --without-isl --enable-libmpx x86_64-linux --disable-bootstrap
>
> To reproduce:
>
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/pr104547.C
> --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/pr104547.C
> --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/pr104547.C
> --target_board='unix{-m64}'"
> $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/pr104547.C
> --target_board='unix{-m64\ -march=cascadelake}'"
>
> (Please do not reply to this email, for question about this report,
> contact me at haochen dot jiang at intel.com)
>


Re: [RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 in riscv like x86_64 and arm.

2023-06-01 Thread Liao Shihua

Hi, Ma Jin

    1. There has been little development in GCC since May because the
Zfbf spec keeps changing.


    2. We (PLCT lab) will implement Zvfbfmin and Zvfbfwma after Zvfh
has been merged into GCC.


    3. I will send a patch to support bfloat16_t in the RISC-V port, but
the patch for the Zfbf extension will be sent after the extension is released.


Liao Shihua

On 2023/6/1 14:51, Jin Ma wrote:

hi,

Are there any new developments about Zfb? Are there any plans to implement
the Zvfbfmin and Zvfbfwma expansion? I see that Zfb is being reviewed in
llvm, maybe we should do the same on gcc.

Ref:https://reviews.llvm.org/D151313
  https://reviews.llvm.org/D150929


Re: [r14-1452 Regression] FAIL: g++.dg/pr104547.C -std=gnu++17 scan-tree-dump-not vrp2 "_M_default_append" on Linux/x86_64

2023-06-01 Thread Jonathan Wakely via Gcc-patches
On Thu, 1 Jun 2023 at 10:06, Christophe Lyon wrote:

> Hi!
>
> We have noticed the same problem on aarch64, if that's easier to reproduce.
>


I am already testing a fix.


Re: [PATCH] libstdc++: optimize EH phase 2

2023-06-01 Thread Jonathan Wakely via Gcc-patches
On Thu, 1 Jun 2023 at 04:13, Jason Merrill via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> Tested x86_64-pc-linux-gnu, OK for trunk?
>

OK, thanks.


>
> -- 8< --
>
> In the ABI's two-phase EH model, first we walk the stack looking for a
> handler, then we walk the stack running cleanups until we reach that
> handler.  In the cleanup phase, we shouldn't redundantly check the handlers
> along the way, e.g. when walking through g():
>
>   void f() { throw 42; }
>   void g() { try { f(); } catch (void *) { } }
>   int main() { try { g(); } catch (int) { } }
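
The example from the message can be made observable with a small harness. In
this sketch run() is a hypothetical stand-in for main() that reports which
value reached the int handler. It shows only the visible behaviour (the
exception passes through g() and is caught by the int handler); it cannot
show whether the personality routine re-checked g()'s handler during the
cleanup phase.

```cpp
// f() and g() are the functions from the message.
void f() { throw 42; }
void g() { try { f(); } catch (void *) { } }

// Returns the thrown value if the int handler caught it, -1 otherwise.
int run() {
  try {
    g();
  } catch (int v) {
    return v;
  }
  return -1;
}
```

The catch (void *) clause in g() never matches an int, so the search phase
already settles on the handler in run() before any cleanup is executed.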
>
> libstdc++-v3/ChangeLog:
>
> * libsupc++/eh_personality.cc (PERSONALITY_FUNCTION): Don't check
> handlers in the cleanup phase.
> ---
>  libstdc++-v3/libsupc++/eh_personality.cc | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/libstdc++-v3/libsupc++/eh_personality.cc
> b/libstdc++-v3/libsupc++/eh_personality.cc
> index 12391e563d6..cc6bc048892 100644
> --- a/libstdc++-v3/libsupc++/eh_personality.cc
> +++ b/libstdc++-v3/libsupc++/eh_personality.cc
> @@ -592,6 +592,10 @@ PERSONALITY_FUNCTION (int version,
>   // Zero filter values are cleanups.
>   saw_cleanup = true;
> }
> + else if (actions == _UA_CLEANUP_PHASE)
> +   // We checked the handlers in the search phase; if one of them
> +   // matched, actions would also have _UA_HANDLER_FRAME set.
> +   ;
>   else if (ar_filter > 0)
> {
>   // Positive filter values are handlers.
>
> base-commit: 68816ba245afc6d0e1482bde2d15b35b925b4195
> --
> 2.31.1
>
>


Re: [committed] libstdc++: Fix preprocessor conditions for std::from_chars [PR109921]

2023-06-01 Thread Christophe Lyon via Gcc-patches
Hi,


On Wed, 31 May 2023 at 14:25, Jonathan Wakely via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> Tested powerpc64le-linux. Pushed to trunk.
>
> -- >8 --
>
> We use the from_chars_strtod function with __strtof128 to read a
> _Float128 value, but from_chars_strtod is not defined unless uselocale
> is available. This can lead to compilation failures for some targets,
> because we try to define the _Float128 overload in terms of a
> non-existing from_chars_strtod function.
>
> Only try to use __strtof128 if uselocale is available, otherwise
> fall back to the long double overload of std::from_chars (which might
> fall back to the double overload, which should use fast_float).
>
> This ensures we always define the full set of overloads, even if they
> are not always accurate for all values of the wider types.
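
For readers unfamiliar with the overload set being discussed, here is a small
sketch of calling the long double overload from user code. parse_long_double
is a hypothetical helper, and the sketch assumes a standard library that
implements floating-point std::from_chars (libstdc++ does from GCC 11 on).

```cpp
#include <charconv>
#include <cstring>
#include <system_error>

// Parses the whole of `text` as a long double.
// Returns true only if parsing succeeded and consumed every character.
bool parse_long_double(const char* text, long double& out) {
  const char* last = text + std::strlen(text);
  auto res = std::from_chars(text, last, out);
  return res.ec == std::errc{} && res.ptr == last;
}
```

Which internal path serves this call (strtold, fast_float via the double
overload, or something else) is a library implementation detail; the caller
sees the same signature either way.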
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/109921
> * src/c++17/floating_from_chars.cc (USE_STRTOF128_FOR_FROM_CHARS):
> Only define when USE_STRTOD_FOR_FROM_CHARS is also defined.
> (USE_STRTOD_FOR_FROM_CHARS): Do not undefine when long double is
> binary64.
> (from_chars(const char*, const char*, double&, chars_format)):
> Check __LDBL_MANT_DIG__ == __DBL_MANT_DIG__ here.
> (from_chars(const char*, const char*, _Float128&, chars_format))
> Only use from_chars_strtod when USE_STRTOD_FOR_FROM_CHARS is
> defined, otherwise parse a long double and convert to _Float128.
>


This is causing a regression on aarch64:
 FAIL: libstdc++-abi/abi_check

The log says:

3 added symbols
0
_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE11_S_allocateERS3_m
std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>,
std::allocator<wchar_t> >::_S_allocate(std::allocator<wchar_t>&, unsigned
long)
version status: compatible
GLIBCXX_3.4.32
type: function
status: added

1
_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE11_S_allocateERS3_m
std::__cxx11::basic_string<char, std::char_traits<char>,
std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long)
version status: compatible
GLIBCXX_3.4.32
type: function
status: added

2
_ZSt10from_charsPKcS0_RDF128_St12chars_format
std::from_chars(char const*, char const*, _Float128&, std::chars_format)
version status: incompatible
GLIBCXX_3.4.31
type: function
status: added


2 undesignated symbols
0
_ZSt11__once_call
std::__once_call
version status: compatible
GLIBCXX_3.4.11
type: tls
type size: 8
status: undesignated

1
_ZSt15__once_callable
std::__once_callable
version status: compatible
GLIBCXX_3.4.11
type: tls
type size: 8
status: undesignated


1 incompatible symbols
0
_ZSt10from_charsPKcS0_RDF128_St12chars_format
std::from_chars(char const*, char const*, _Float128&, std::chars_format)
version status: incompatible
GLIBCXX_3.4.31
type: function
status: added



 libstdc++-v3 check-abi Summary 

# of added symbols:  3
# of missing symbols:0
# of undesignated symbols:   2
# of incompatible symbols:   1


Can you have a look?

Thanks,
Christophe

---
>  libstdc++-v3/src/c++17/floating_from_chars.cc | 20 ---
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/libstdc++-v3/src/c++17/floating_from_chars.cc
> b/libstdc++-v3/src/c++17/floating_from_chars.cc
> index ebd428d5be3..eea878072b0 100644
> --- a/libstdc++-v3/src/c++17/floating_from_chars.cc
> +++ b/libstdc++-v3/src/c++17/floating_from_chars.cc
> @@ -64,7 +64,7 @@
>  // strtold for __ieee128
>  extern "C" __ieee128 __strtoieee128(const char*, char**);
>  #elif __FLT128_MANT_DIG__ == 113 && __LDBL_MANT_DIG__ != 113 \
> -  && defined(__GLIBC_PREREQ)
> +  && defined(__GLIBC_PREREQ) && defined(USE_STRTOD_FOR_FROM_CHARS)
>  #define USE_STRTOF128_FOR_FROM_CHARS 1
>  extern "C" _Float128 __strtof128(const char*, char**)
>__asm ("strtof128")
> @@ -77,10 +77,6 @@ extern "C" _Float128 __strtof128(const char*, char**)
>  #if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 \
>  && __SIZE_WIDTH__ >= 32
>  # define USE_LIB_FAST_FLOAT 1
> -# if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__
> -// No need to use strtold.
> -#  undef USE_STRTOD_FOR_FROM_CHARS
> -# endif
>  #endif
>
>  #if USE_LIB_FAST_FLOAT
> @@ -1261,7 +1257,7 @@ from_chars_result
>  from_chars(const char* first, const char* last, long double& value,
>chars_format fmt) noexcept
>  {
> -#if ! USE_STRTOD_FOR_FROM_CHARS
> +#if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__ || !defined
> USE_STRTOD_FOR_FROM_CHARS
>// Either long double is the same as double, or we can't use strtold.
>// In the latter case, this might give an incorrect result (e.g. values
>// out of range of double give an error, even if they fit in long
> double).
> @@ -1329,13 +1325,23 @@
> _ZSt10from_charsPKcS0_RDF128_St12chars_format(const char* first,
>   __ieee128& value,
>   chars_format fmt) noexcept
>  __attribute__((alias
> ("_ZSt10from_charsPKcS0_Ru9__ieee128St12chars_form

[PATCH] doc: Fix description of x86 -m32 option [PR109954]

2023-06-01 Thread Jonathan Wakely via Gcc-patches
In https://gcc.gnu.org/PR109954 I suggested also adding:

"N.B., using @option{-march} might be required to produce code suitable
for a specific CPU family, e.g., @option{-march=i486}."

I realise that that is true for all of -m32, -m64 and -mx32, and similar
rules apply for other targets too. But I still feel that saying it
explicitly for -m32 doesn't hurt, and would avoid a common
misunderstanding by putting that info somewhere it's more likely to be
read.

But I'd prefer to just fix the part that is *wrong*, and then we can
discuss whether or not that other part is an improvement. This patch
fixes the wrongness.
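
For context, the part of the -m32 description that is not in dispute (the
type widths) can be observed with a sketch like the one below. TypeWidths and
type_widths are hypothetical names; per the documentation text, building with
-m32 should report 32 for all three fields, while -m64 should report 32 for
int and 64 for long and pointers.

```cpp
#include <climits>

// Bit widths of the types that the -m32/-m64 documentation talks about.
struct TypeWidths {
  unsigned int_bits;
  unsigned long_bits;
  unsigned pointer_bits;
};

constexpr TypeWidths type_widths() {
  return TypeWidths{static_cast<unsigned>(sizeof(int) * CHAR_BIT),
                    static_cast<unsigned>(sizeof(long) * CHAR_BIT),
                    static_cast<unsigned>(sizeof(void*) * CHAR_BIT)};
}
```

None of these widths says anything about which instructions are emitted;
that is governed by -march, which is the point of the wording fix.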

OK for trunk and release branches?

-- >8 --

This option does not imply -march=i386 so it's incorrect to say it
generates code that will run on "any i386 system".

gcc/ChangeLog:

PR target/109954
* doc/invoke.texi (x86 Options): Fix description of -m32 option.
---
 gcc/doc/invoke.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 898a88ce33e..ec71c2e9e0f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -34091,7 +34091,7 @@ on x86-64 processors in 64-bit environments.
 Generate code for a 16-bit, 32-bit or 64-bit environment.
 The @option{-m32} option sets @code{int}, @code{long}, and pointer types
 to 32 bits, and
-generates code that runs on any i386 system.
+generates code that runs in 32-bit mode.
 
 The @option{-m64} option sets @code{int} to 32 bits and @code{long} and pointer
 types to 64 bits, and generates code for the x86-64 architecture.
-- 
2.40.1



Re: [PATCH] doc: Fix description of x86 -m32 option [PR109954]

2023-06-01 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 01, 2023 at 11:37:37AM +0100, Jonathan Wakely via Gcc-patches wrote:
> In https://gcc.gnu.org/PR109954 I suggested also adding:
> 
> "N.B., using @option{-march} might be required to produce code suitable
> for a specific CPU family, e.g., @option{-march=i486}."
> 
> I realise that that is true for all of -m32, -m64 and -mx32, and similar
> rules apply for other targets too. But I still feel that saying it
> explicitly for -m32 doesn't hurt, and would avoid a common
> misunderstanding by putting that info somewhere it's more likely to be
> read.
> 
> But I'd prefer to just fix the part that is *wrong*, and then we can
> discuss whether or not that other part is an improvement. This patch
> fixes the wrongness.
> 
> OK for trunk and release branches?

Ok, thanks.

> This option does not imply -march=i386 so it's incorrect to say it
> generates code that will run on "any i386 system".
> 
> gcc/ChangeLog:
> 
>   PR target/109954
>   * doc/invoke.texi (x86 Options): Fix description of -m32 option.
> ---
>  gcc/doc/invoke.texi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 898a88ce33e..ec71c2e9e0f 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -34091,7 +34091,7 @@ on x86-64 processors in 64-bit environments.
>  Generate code for a 16-bit, 32-bit or 64-bit environment.
>  The @option{-m32} option sets @code{int}, @code{long}, and pointer types
>  to 32 bits, and
> -generates code that runs on any i386 system.
> +generates code that runs in 32-bit mode.
>  
>  The @option{-m64} option sets @code{int} to 32 bits and @code{long} and 
> pointer
>  types to 64 bits, and generates code for the x86-64 architecture.
> -- 
> 2.40.1

Jakub



Re: [committed] libstdc++: Fix preprocessor conditions for std::from_chars [PR109921]

2023-06-01 Thread Jonathan Wakely via Gcc-patches
On Thu, 1 Jun 2023 at 10:30, Christophe Lyon via Libstdc++
 wrote:
>
> Hi,
>
>
> On Wed, 31 May 2023 at 14:25, Jonathan Wakely via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
> > Tested powerpc64le-linux. Pushed to trunk.
> >
> > -- >8 --
> >
> > We use the from_chars_strtod function with __strtof128 to read a
> > _Float128 value, but from_chars_strtod is not defined unless uselocale
> > is available. This can lead to compilation failures for some targets,
> > because we try to define the _Float128 overload in terms of a
> > non-existing from_chars_strtod function.
> >
> > Only try to use __strtof128 if uselocale is available, otherwise
> > fallback to the long double overload of std::from_chars (which might
> > fallback to the double overload, which should use fast_float).
> >
> > This ensures we always define the full set of overloads, even if they
> > are not always accurate for all values of the wider types.
> >
> > libstdc++-v3/ChangeLog:
> >
> > PR libstdc++/109921
> > * src/c++17/floating_from_chars.cc (USE_STRTOF128_FOR_FROM_CHARS):
> > Only define when USE_STRTOD_FOR_FROM_CHARS is also defined.
> > (USE_STRTOD_FOR_FROM_CHARS): Do not undefine when long double is
> > binary64.
> > (from_chars(const char*, const char*, double&, chars_format)):
> > Check __LDBL_MANT_DIG__ == __DBL_MANT_DIG__ here.
> > (from_chars(const char*, const char*, _Float128&, chars_format))
> > Only use from_chars_strtod when USE_STRTOD_FOR_FROM_CHARS is
> > defined, otherwise parse a long double and convert to _Float128.
> >
>
>
> This is causing a regression on aarch64:
>  FAIL: libstdc++-abi/abi_check

This is now PR 110077.


>
> The log says:
>
> 3 added symbols
> 0
> _ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE11_S_allocateERS3_m
> std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>,
> std::allocator<wchar_t> >::_S_allocate(std::allocator<wchar_t>&, unsigned
> long)
> version status: compatible
> GLIBCXX_3.4.32
> type: function
> status: added
>
> 1
> _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE11_S_allocateERS3_m
> std::__cxx11::basic_string<char, std::char_traits<char>,
> std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long)
> version status: compatible
> GLIBCXX_3.4.32
> type: function
> status: added
>
> 2
> _ZSt10from_charsPKcS0_RDF128_St12chars_format
> std::from_chars(char const*, char const*, _Float128&, std::chars_format)
> version status: incompatible
> GLIBCXX_3.4.31
> type: function
> status: added
>
>
> 2 undesignated symbols
> 0
> _ZSt11__once_call
> std::__once_call
> version status: compatible
> GLIBCXX_3.4.11
> type: tls
> type size: 8
> status: undesignated
>
> 1
> _ZSt15__once_callable
> std::__once_callable
> version status: compatible
> GLIBCXX_3.4.11
> type: tls
> type size: 8
> status: undesignated
>
>
> 1 incompatible symbols
> 0
> _ZSt10from_charsPKcS0_RDF128_St12chars_format
> std::from_chars(char const*, char const*, _Float128&, std::chars_format)
> version status: incompatible
> GLIBCXX_3.4.31
> type: function
> status: added
>
>
>
>  libstdc++-v3 check-abi Summary 
>
> # of added symbols:  3
> # of missing symbols:0
> # of undesignated symbols:   2
> # of incompatible symbols:   1
>
>
> Can you have a look?
>
> Thanks,
> Christophe
>
> ---
> >  libstdc++-v3/src/c++17/floating_from_chars.cc | 20 ---
> >  1 file changed, 13 insertions(+), 7 deletions(-)
> >
> > diff --git a/libstdc++-v3/src/c++17/floating_from_chars.cc
> > b/libstdc++-v3/src/c++17/floating_from_chars.cc
> > index ebd428d5be3..eea878072b0 100644
> > --- a/libstdc++-v3/src/c++17/floating_from_chars.cc
> > +++ b/libstdc++-v3/src/c++17/floating_from_chars.cc
> > @@ -64,7 +64,7 @@
> >  // strtold for __ieee128
> >  extern "C" __ieee128 __strtoieee128(const char*, char**);
> >  #elif __FLT128_MANT_DIG__ == 113 && __LDBL_MANT_DIG__ != 113 \
> > -  && defined(__GLIBC_PREREQ)
> > +  && defined(__GLIBC_PREREQ) && defined(USE_STRTOD_FOR_FROM_CHARS)
> >  #define USE_STRTOF128_FOR_FROM_CHARS 1
> >  extern "C" _Float128 __strtof128(const char*, char**)
> >__asm ("strtof128")
> > @@ -77,10 +77,6 @@ extern "C" _Float128 __strtof128(const char*, char**)
> >  #if _GLIBCXX_FLOAT_IS_IEEE_BINARY32 && _GLIBCXX_DOUBLE_IS_IEEE_BINARY64 \
> >  && __SIZE_WIDTH__ >= 32
> >  # define USE_LIB_FAST_FLOAT 1
> > -# if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__
> > -// No need to use strtold.
> > -#  undef USE_STRTOD_FOR_FROM_CHARS
> > -# endif
> >  #endif
> >
> >  #if USE_LIB_FAST_FLOAT
> > @@ -1261,7 +1257,7 @@ from_chars_result
> >  from_chars(const char* first, const char* last, long double& value,
> >chars_format fmt) noexcept
> >  {
> > -#if ! USE_STRTOD_FOR_FROM_CHARS
> > +#if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__ || !defined
> > USE_STRTOD_FOR_FROM_CHARS
> >// Either long double is the same as double, or we can't use strtold.
> >// In the latter case, this might give an incorrect result (e.g. values
> >// o

Re: [PATCH 1/3] testsuite: Unbork multilib testing on RISC-V (and any target really)

2023-06-01 Thread Thomas Schwinge
Hi!

On 2023-06-01T09:24:20+0200, I wrote:
> First, Vineet, great that you've now tracked this down!  :-) Indeed
> "early exit" vs. 'torture-finish' was exactly the issue that I suspected.
>
> It may not be what you originally intended, but I hope at least you've
> learned some things about DejaGnu/TCL...  ;-P
>
> Yesterday, I actually had begun looking into this.  To avoid the big
> download and having to wait for a lot of packages to be built with your
> 'riscv-gnu-toolchain' recipe:
> ,
> I intended to do just a quick GCC build on compile farm gcc92, which
> (a) didn't turn out to be quick, and (b) eventually failed due to
> 
> "Bootstrap on RISC-V on Ubuntu 22.04 LTS: bits/libc-header-start.h: No such 
> file or directory"...
>
> (I'm now running 'riscv-gnu-toolchain' to verify this, and another thing.)

If running that with just 'RUNTESTFLAGS="i386-prefetch.exp"', I get:

[...]
Schedule of variations:
riscv-sim/-march=rv32imac/-mabi=ilp32/-mcmodel=medlow
riscv-sim/-march=rv32imafdc/-mabi=ilp32d/-mcmodel=medlow
riscv-sim/-march=rv64imac/-mabi=lp64/-mcmodel=medlow
riscv-sim/-march=rv64imafdc/-mabi=lp64d/-mcmodel=medlow

Running target riscv-sim/-march=rv32imac/-mabi=ilp32/-mcmodel=medlow
[...]
Running [...]/gcc.misc-tests/i386-prefetch.exp ...

=== gcc Summary for 
riscv-sim/-march=rv32imac/-mabi=ilp32/-mcmodel=medlow ===

Running target riscv-sim/-march=rv32imafdc/-mabi=ilp32d/-mcmodel=medlow
[...]
Running [...]/gcc.misc-tests/i386-prefetch.exp ...
ERROR: tcl error sourcing [...]/gcc.misc-tests/i386-prefetch.exp.
ERROR: tcl error code NONE
ERROR: torture-init: LTO_TORTURE_OPTIONS is not empty as expected
[...]

..., which indeed complains about 'LTO_TORTURE_OPTIONS' (which relates to
my recent changes in that area, which I now do understand, see below).

The issue is that indeed 'torture-init' now does set
'LTO_TORTURE_OPTIONS', whereas 'torture_without_loops',
'torture_with_loops' are set only when 'set-torture-options' is called.

> Before we push your patch, let me please verify that it indeed doesn't
> change any 'gcc.misc-tests/i386-prefetch.exp' semantics

Done.

> and:
>
> On 2023-05-31T19:13:01+0100, Iain Sandoe via Gcc-patches 
>  wrote:
>>> On 31 May 2023, at 18:57, Jeff Law via Gcc-patches 
>>>  wrote:
>>> On 5/31/23 10:25, Vineet Gupta wrote:
>>>> Multilib testing on trunk is currently busted (and surprisingly this
>>>> affects any/all targets but it seems nobody cares). We currently get the
>>>> following splat:
>>> I wouldn't say that nobody cares, it just hasn't bubbled up on anyone's 
>>> priority list yet (most developers aren't working on targets that make 
>>> heavy use of multilibs).
>
> So I regularly do build x86_64 GNU/Linux with default '-m64' plus '-m32'
> multilib -- but of course, there's no "early exit" for those, as there's
> no 'string match "* -march=*" " [board_info target multilib_flags] "'...
>
>>> But probably more importantly, this problem seems to not be triggering on 
>>> all multilib targets.  For example, I just examined my tester's build logs 
>>> and couldn't see this on the H8/300 or V850 ports.  Which begs the 
>>> question, why?
>
> ..., which may be the case for those, too?  In other words: the problem
> only shows up if '-march=[...]' appears in the flags, which indeed may
> not be a common thing?  I'll cross-verify this with x86_64 and
> '-march=[...]' flags.

That is the crucial thing indeed.  Vineet, please note that in the Git
commit log.  That is, instead of "Multilib testing", say "Multilib
testing involving '-march=[...]' flags", or similar.

The ERRORs do reproduce with x86_64 GNU/Linux with:


RUNTESTFLAGS='--target_board=unix\{-m32,-m64,-march=generic,-march=generic\} 
i386-prefetch.exp'

..., for example.  Here, '-m32' behaves as expected, '-m64' behaves as
expected, the first '-march=generic' does the 'torture-init' and "early
exit", the second '-march=generic' then again does 'torture-init' and
runs into the error condition.

> And, I still intend to figure out why this issue apparently disappears
> with my recent 'LTO_TORTURE_OPTIONS' patches reverted:
> .

In the "old world", 'torture-init', *not* followed by
'set-torture-options', *not* followed by 'torture-finish', then another
'torture-init' was not a problem -- but in the "new world" it now is.

This also explains my confusion; the original report was:

ERROR: torture-init: torture_without_loops is not empty as expected

..., note: not 'LTO_TORTURE_OPTIONS' but 'torture_without_loops', and
those I'd not directly touched in my recent changes, which had made me
confused.

The 'torture_without_loops' error condition now does arise if there's a
'torture-init', *not* followed by 'set-torture-options

Re: [PATCH] Move std::search into algobase.h

2023-06-01 Thread Rainer Orth
Jonathan Wakely via Gcc-patches  writes:

> On Wed, 31 May 2023 at 18:39, François Dumont via Libstdc++ <
> libstd...@gcc.gnu.org> wrote:
>
>> libstdc++: Reduce  inclusion to 
>>
>>
>> Move the std::search definition from stl_algo.h to stl_algobase.h and use
>> the latter in .
>>
>> For consistency also move std::__parallel::search and associated helpers
>> from
>>  to  so that
>> std::__parallel::search
>> is accessible along with std::search.
>>
>> libstdc++-v3/ChangeLog:
>>
>>  * include/bits/stl_algo.h
>>  (std::__search, std::search(_FwdIt1, _FwdIt1, _FwdIt2,
>> _FwdIt2, _BinPred)): Move...
>>  * include/bits/stl_algobase.h: ...here.
>>  * include/std/functional: Replace  include by
>> .
>>  * include/parallel/algo.h (std::__parallel::search<_FIt1,
>> _FIt2, _BinaryPred>)
>>  (std::__parallel::__search_switch<_FIt1, _FIt2,
>> _BinaryPred, _ItTag1, _ItTag2>):
>>  Move...
>>  * include/parallel/algobase.h: ...here.
>>  * include/std/functional: Remove  and
>> 
>>  includes. Include .
>>
>> Tested under Linux x86_64.
>>
>> Ok to commit ?
>>
>
> OK

This seems to have caused

+FAIL: 17_intro/headers/c++2011/parallel_mode.cc (test for excess errors)
+FAIL: 17_intro/headers/c++2014/parallel_mode.cc (test for excess errors)

on i386-pc-solaris2.11:

Excess errors:
/var/gcc/regression/master/11.4-gcc-gas/build/i386-pc-solaris2.11/libstdc++-v3/include/parallel/algobase.h:496:
 error: '__search_template' is not a member of '__gnu_parallel'; did you mean 
'__find_template'?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH 2/3] RISC-V: Add missing torture-init and torture-finish for rvv.exp

2023-06-01 Thread Thomas Schwinge
Hi!

On 2023-05-31T09:25:33-0700, Vineet Gupta  wrote:
> From: Kito Cheng 
>
> This is in line with recent test harness expectations and is a
> preventive change as it doesn't actually fix any errors.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/riscv/rvv/rvv.exp: Add torture-init and
>   torture-finish.
>
> Signed-off-by: Vineet Gupta 
> ---
>  gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp 
> b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> index 5e69235a268c..7ab7456d1d15 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
> @@ -39,6 +39,7 @@ if [istarget riscv32-*-*] then {
>
>  # Initialize `dg'.
>  dg-init
> +torture-init
>
>  # Main loop.
>  set CFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -mabi=$gcc_mabi -O3"
> @@ -90,5 +91,7 @@ foreach op $AUTOVEC_TEST_OPTS {
>  dg-runtest [lsort [glob -nocomplain 
> $srcdir/$subdir/autovec/vls-vlmax/*.\[cS\]]] \
>   "-std=c99 -O3 -ftree-vectorize --param 
> riscv-autovec-preference=fixed-vlmax" $CFLAGS
>
> +torture-finish
> +
>  # All done.
>  dg-finish

I suggest dropping this patch: 'gcc.target/riscv/rvv/rvv.exp' isn't doing
anything with torture testing flags etc., but (in addition to
'dg-runtest') just calls 'gcc-dg-runtest', which internally does
'torture-init', 'torture-finish' -- like in a number of other '*.exp'
files.  As you say, this patch "doesn't actually fix any errors".


Regards
 Thomas
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; Limited liability company (GmbH); Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Registration court: 
München, HRB 106955


Re: [PATCH] Move std::search into algobase.h

2023-06-01 Thread Jonathan Wakely via Gcc-patches
On Thu, 1 Jun 2023 at 12:52, Rainer Orth 
wrote:

> Jonathan Wakely via Gcc-patches  writes:
>
> > On Wed, 31 May 2023 at 18:39, François Dumont via Libstdc++ <
> > libstd...@gcc.gnu.org> wrote:
> >
> >> libstdc++: Reduce  inclusion to 
> >>
> >>
> >> Move the std::search definition from stl_algo.h to stl_algobase.h and
> use
> >> the latter in .
> >>
> >> For consistency also move std::__parallel::search and associated helpers
> >> from
> >>  to  so that
> >> std::__parallel::search
> >> is accessible along with std::search.
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >>  * include/bits/stl_algo.h
> >>  (std::__search, std::search(_FwdIt1, _FwdIt1, _FwdIt2,
> >> _FwdIt2, _BinPred)): Move...
> >>  * include/bits/stl_algobase.h: ...here.
> >>  * include/std/functional: Replace  include by
> >> .
> >>  * include/parallel/algo.h (std::__parallel::search<_FIt1,
> >> _FIt2, _BinaryPred>)
> >>  (std::__parallel::__search_switch<_FIt1, _FIt2,
> >> _BinaryPred, _ItTag1, _ItTag2>):
> >>  Move...
> >>  * include/parallel/algobase.h: ...here.
> >>  * include/std/functional: Remove  and
> >> 
> >>  includes. Include .
> >>
> >> Tested under Linux x86_64.
> >>
> >> Ok to commit ?
> >>
> >
> > OK
>
> This seems to have caused
>
> +FAIL: 17_intro/headers/c++2011/parallel_mode.cc (test for excess errors)
> +FAIL: 17_intro/headers/c++2014/parallel_mode.cc (test for excess errors)
>
> on i386-pc-solaris2.11:
>

I think it affects all targets.


>
> Excess errors:
> /var/gcc/regression/master/11.4-gcc-gas/build/i386-pc-solaris2.11/libstdc++-v3/include/parallel/algobase.h:496:
> error: '__search_template' is not a member of '__gnu_parallel'; did you
> mean '__find_template'?
>
> Rainer
>
> --
>
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>
>


[PATCH] RISC-V: Add test for vfloat16*_t (non tuple) types

2023-06-01 Thread Pan Li via Gcc-patches
From: Pan Li 

This patch adds test cases for the vfloat16*_t (non-tuple) types.
Without 'zvfh' or 'zvfhmin', these types are reported as unknown.

Signed-off-by: Pan Li 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-16.c: Add test cases.
* gcc.target/riscv/rvv/base/user-7.c: Likewise.
---
 gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c | 6 ++
 gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
index be2cbb5efd7..9e962a70acf 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
@@ -173,6 +173,12 @@ void f___rvv_int64m2x4_t () {__rvv_int64m2x4_t t;} /* { 
dg-error {unknown type n
 void f___rvv_uint64m2x4_t () {__rvv_uint64m2x4_t t;} /* { dg-error {unknown 
type name '__rvv_uint64m2x4_t'} } */
 void f___rvv_int64m4x2_t () {__rvv_int64m4x2_t t;} /* { dg-error {unknown type 
name '__rvv_int64m4x2_t'} } */
 void f___rvv_uint64m4x2_t () {__rvv_uint64m4x2_t t;} /* { dg-error {unknown 
type name '__rvv_uint64m4x2_t'} } */
+void f___rvv_float16mf4_t () {__rvv_float16mf4_t t;} /* { dg-error {unknown 
type name '__rvv_float16mf4_t'} } */
+void f___rvv_float16mf2_t () {__rvv_float16mf2_t t;} /* { dg-error {unknown 
type name '__rvv_float16mf2_t'} } */
+void f___rvv_float16m1_t () {__rvv_float16m1_t t;} /* { dg-error {unknown type 
name '__rvv_float16m1_t'} } */
+void f___rvv_float16m2_t () {__rvv_float16m2_t t;} /* { dg-error {unknown type 
name '__rvv_float16m2_t'} } */
+void f___rvv_float16m4_t () {__rvv_float16m4_t t;} /* { dg-error {unknown type 
name '__rvv_float16m4_t'} } */
+void f___rvv_float16m8_t () {__rvv_float16m8_t t;} /* { dg-error {unknown type 
name '__rvv_float16m8_t'} } */
 void f___rvv_float32mf2x2_t () {__rvv_float32mf2x2_t t;}
 void f___rvv_float32mf2x3_t () {__rvv_float32mf2x3_t t;}
 void f___rvv_float32mf2x4_t () {__rvv_float32mf2x4_t t;}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
index 2172a5c7c79..0620a728208 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
@@ -173,6 +173,12 @@ void f_vint64m2x4_t () {vint64m2x4_t t;} /* { dg-error 
{unknown type name 'vint6
 void f_vuint64m2x4_t () {vuint64m2x4_t t;} /* { dg-error {unknown type name 
'vuint64m2x4_t'} } */
 void f_vint64m4x2_t () {vint64m4x2_t t;} /* { dg-error {unknown type name 
'vint64m4x2_t'} } */
 void f_vuint64m4x2_t () {vuint64m4x2_t t;} /* { dg-error {unknown type name 
'vuint64m4x2_t'} } */
+void f_vfloat16mf4_t () {vfloat16mf4_t t;} /* { dg-error {unknown type name 
'vfloat16mf4_t'} } */
+void f_vfloat16mf2_t () {vfloat16mf2_t t;} /* { dg-error {unknown type name 
'vfloat16mf2_t'} } */
+void f_vfloat16m1_t () {vfloat16m1_t t;} /* { dg-error {unknown type name 
'vfloat16m1_t'} } */
+void f_vfloat16m2_t () {vfloat16m2_t t;} /* { dg-error {unknown type name 
'vfloat16m2_t'} } */
+void f_vfloat16m4_t () {vfloat16m4_t t;} /* { dg-error {unknown type name 
'vfloat16m4_t'} } */
+void f_vfloat16m8_t () {vfloat16m8_t t;} /* { dg-error {unknown type name 
'vfloat16m8_t'} } */
 void f_vfloat32mf2x2_t () {vfloat32mf2x2_t t;} /* { dg-error {unknown type 
name 'vfloat32mf2x2_t'} } */
 void f_vfloat32mf2x3_t () {vfloat32mf2x3_t t;} /* { dg-error {unknown type 
name 'vfloat32mf2x3_t'} } */
 void f_vfloat32mf2x4_t () {vfloat32mf2x4_t t;} /* { dg-error {unknown type 
name 'vfloat32mf2x4_t'} } */
-- 
2.34.1



RE: [PATCH] RISC-V: Add test for vfloat16*_t (non tuple) types

2023-06-01 Thread Li, Pan2 via Gcc-patches
Thanks, Juzhe, for pointing this out.

Pan

-Original Message-
From: Li, Pan2  
Sent: Thursday, June 1, 2023 8:09 PM
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2 ; 
Wang, Yanzhang 
Subject: [PATCH] RISC-V: Add test for vfloat16*_t (non tuple) types

From: Pan Li 

This patch adds test cases for the vfloat16*_t (non-tuple) types.
Without 'zvfh' or 'zvfhmin', these types are reported as unknown.

Signed-off-by: Pan Li 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-16.c: Add test cases.
* gcc.target/riscv/rvv/base/user-7.c: Likewise.
---
 gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c | 6 ++
 gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
index be2cbb5efd7..9e962a70acf 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
@@ -173,6 +173,12 @@ void f___rvv_int64m2x4_t () {__rvv_int64m2x4_t t;} /* { 
dg-error {unknown type n
 void f___rvv_uint64m2x4_t () {__rvv_uint64m2x4_t t;} /* { dg-error {unknown 
type name '__rvv_uint64m2x4_t'} } */
 void f___rvv_int64m4x2_t () {__rvv_int64m4x2_t t;} /* { dg-error {unknown type 
name '__rvv_int64m4x2_t'} } */
 void f___rvv_uint64m4x2_t () {__rvv_uint64m4x2_t t;} /* { dg-error {unknown 
type name '__rvv_uint64m4x2_t'} } */
+void f___rvv_float16mf4_t () {__rvv_float16mf4_t t;} /* { dg-error {unknown 
type name '__rvv_float16mf4_t'} } */
+void f___rvv_float16mf2_t () {__rvv_float16mf2_t t;} /* { dg-error {unknown 
type name '__rvv_float16mf2_t'} } */
+void f___rvv_float16m1_t () {__rvv_float16m1_t t;} /* { dg-error {unknown type 
name '__rvv_float16m1_t'} } */
+void f___rvv_float16m2_t () {__rvv_float16m2_t t;} /* { dg-error {unknown type 
name '__rvv_float16m2_t'} } */
+void f___rvv_float16m4_t () {__rvv_float16m4_t t;} /* { dg-error {unknown type 
name '__rvv_float16m4_t'} } */
+void f___rvv_float16m8_t () {__rvv_float16m8_t t;} /* { dg-error {unknown type 
name '__rvv_float16m8_t'} } */
 void f___rvv_float32mf2x2_t () {__rvv_float32mf2x2_t t;}
 void f___rvv_float32mf2x3_t () {__rvv_float32mf2x3_t t;}
 void f___rvv_float32mf2x4_t () {__rvv_float32mf2x4_t t;}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
index 2172a5c7c79..0620a728208 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
@@ -173,6 +173,12 @@ void f_vint64m2x4_t () {vint64m2x4_t t;} /* { dg-error 
{unknown type name 'vint6
 void f_vuint64m2x4_t () {vuint64m2x4_t t;} /* { dg-error {unknown type name 
'vuint64m2x4_t'} } */
 void f_vint64m4x2_t () {vint64m4x2_t t;} /* { dg-error {unknown type name 
'vint64m4x2_t'} } */
 void f_vuint64m4x2_t () {vuint64m4x2_t t;} /* { dg-error {unknown type name 
'vuint64m4x2_t'} } */
+void f_vfloat16mf4_t () {vfloat16mf4_t t;} /* { dg-error {unknown type name 
'vfloat16mf4_t'} } */
+void f_vfloat16mf2_t () {vfloat16mf2_t t;} /* { dg-error {unknown type name 
'vfloat16mf2_t'} } */
+void f_vfloat16m1_t () {vfloat16m1_t t;} /* { dg-error {unknown type name 
'vfloat16m1_t'} } */
+void f_vfloat16m2_t () {vfloat16m2_t t;} /* { dg-error {unknown type name 
'vfloat16m2_t'} } */
+void f_vfloat16m4_t () {vfloat16m4_t t;} /* { dg-error {unknown type name 
'vfloat16m4_t'} } */
+void f_vfloat16m8_t () {vfloat16m8_t t;} /* { dg-error {unknown type name 
'vfloat16m8_t'} } */
 void f_vfloat32mf2x2_t () {vfloat32mf2x2_t t;} /* { dg-error {unknown type 
name 'vfloat32mf2x2_t'} } */
 void f_vfloat32mf2x3_t () {vfloat32mf2x3_t t;} /* { dg-error {unknown type 
name 'vfloat32mf2x3_t'} } */
 void f_vfloat32mf2x4_t () {vfloat32mf2x4_t t;} /* { dg-error {unknown type 
name 'vfloat32mf2x4_t'} } */
--
2.34.1



Re: [PATCH] Move std::search into algobase.h

2023-06-01 Thread François Dumont via Gcc-patches
Sorry, I had fully tested the move from bits/stl_algo.h to
bits/stl_algobase.h.

But it appears that the script I used to run the tests after the other move
has not done what I expected.

I'll provide the patch shortly.


On Thu, 1 Jun 2023 at 14:06, Jonathan Wakely  wrote:

>
>
> On Thu, 1 Jun 2023 at 12:52, Rainer Orth 
> wrote:
>
>> Jonathan Wakely via Gcc-patches  writes:
>>
>> > On Wed, 31 May 2023 at 18:39, François Dumont via Libstdc++ <
>> > libstd...@gcc.gnu.org> wrote:
>> >
>> >> libstdc++: Reduce  inclusion to 
>> >>
>> >>
>> >> Move the std::search definition from stl_algo.h to stl_algobase.h and
>> use
>> >> the latter in .
>> >>
>> >> For consistency also move std::__parallel::search and associated
>> helpers
>> >> from
>> >>  to  so that
>> >> std::__parallel::search
>> >> is accessible along with std::search.
>> >>
>> >> libstdc++-v3/ChangeLog:
>> >>
>> >>  * include/bits/stl_algo.h
>> >>  (std::__search, std::search(_FwdIt1, _FwdIt1, _FwdIt2,
>> >> _FwdIt2, _BinPred)): Move...
>> >>  * include/bits/stl_algobase.h: ...here.
>> >>  * include/std/functional: Replace  include by
>> >> .
>> >>  * include/parallel/algo.h (std::__parallel::search<_FIt1,
>> >> _FIt2, _BinaryPred>)
>> >>  (std::__parallel::__search_switch<_FIt1, _FIt2,
>> >> _BinaryPred, _ItTag1, _ItTag2>):
>> >>  Move...
>> >>  * include/parallel/algobase.h: ...here.
>> >>  * include/std/functional: Remove  and
>> >> 
>> >>  includes. Include .
>> >>
>> >> Tested under Linux x86_64.
>> >>
>> >> Ok to commit ?
>> >>
>> >
>> > OK
>>
>> This seems to have caused
>>
>> +FAIL: 17_intro/headers/c++2011/parallel_mode.cc (test for excess errors)
>> +FAIL: 17_intro/headers/c++2014/parallel_mode.cc (test for excess errors)
>>
>> on i386-pc-solaris2.11:
>>
>
> I think it affects all targets.
>
>
>>
>> Excess errors:
>> /var/gcc/regression/master/11.4-gcc-gas/build/i386-pc-solaris2.11/libstdc++-v3/include/parallel/algobase.h:496:
>> error: '__search_template' is not a member of '__gnu_parallel'; did you
>> mean '__find_template'?
>>
>> Rainer
>>
>> --
>>
>> -
>> Rainer Orth, Center for Biotechnology, Bielefeld University
>>
>>


Re: [PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Max Filippov via Gcc-patches
On Wed, May 31, 2023 at 11:01 PM Takayuki 'January June' Suwa
 wrote:
> More optimized than the default RTL generation.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md (adddi3, subdi3):
> New RTL generation patterns implemented according to the
> instruction idioms described in the Xtensa ISA reference manual (p. 600).
> ---
>  gcc/config/xtensa/xtensa.md | 52 +
>  1 file changed, 52 insertions(+)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

-- 
Thanks.
-- Max


Re: [PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Max Filippov via Gcc-patches
On Wed, May 31, 2023 at 11:01 PM Takayuki 'January June' Suwa
 wrote:
> More optimized than the default RTL generation.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md (adddi3, subdi3):
> New RTL generation patterns implemented according to the
> instruction idioms described in the Xtensa ISA reference manual (p. 600).
> ---
>  gcc/config/xtensa/xtensa.md | 52 +
>  1 file changed, 52 insertions(+)
>
> diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
> index eda1353894b..21afa747e89 100644
> --- a/gcc/config/xtensa/xtensa.md
> +++ b/gcc/config/xtensa/xtensa.md
> @@ -190,6 +190,35 @@
> (set_attr "mode""SI")
> (set_attr "length"  "3")])
>
> +(define_expand "adddi3"
> +  [(set (match_operand:DI 0 "register_operand")
> +   (plus:DI (match_operand:DI 1 "register_operand")
> +(match_operand:DI 2 "register_operand")))]
> +  ""
> +{
> +  rtx lo_dest, hi_dest, lo_op0, hi_op0, lo_op1, hi_op1;
> +  rtx_code_label *label;
> +  if (rtx_equal_p (operands[0], operands[1])
> +  || rtx_equal_p (operands[0], operands[2])

> +  || ! REG_P (operands[1]) || ! REG_P (operands[2]))

I wonder if these additional conditions are necessary, given that
the operands have the "register_operand" predicates?

-- 
Thanks.
-- Max


[COMMITTED] cse: Change return type of predicate functions from int to bool

2023-06-01 Thread Uros Bizjak via Gcc-patches
Also change some function arguments to bool and remove one instance
of always zero function argument.

gcc/ChangeLog:

* rtl.h (exp_equiv_p): Change return type from int to bool.
* cse.cc (mention_regs): Change return type from int to bool
and adjust function body accordingly.
(exp_equiv_p): Ditto.
(insert_regs): Ditto. Change "modified" function argument to bool
and update usage accordingly.
(record_jump_cond): Remove always zero "reversed_nonequality"
function argument and update usage accordingly.
(fold_rtx): Change "changed" variable to bool.
(record_jump_equiv): Remove unneeded "reversed_nonequality" variable.
(is_dead_reg): Change return type from int to bool.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/cse.cc b/gcc/cse.cc
index 86403b95938..2bb63ac4105 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -511,8 +511,8 @@ static void new_basic_block (void);
 static void make_new_qty (unsigned int, machine_mode);
 static void make_regs_eqv (unsigned int, unsigned int);
 static void delete_reg_equiv (unsigned int);
-static int mention_regs (rtx);
-static int insert_regs (rtx, struct table_elt *, int);
+static bool mention_regs (rtx);
+static bool insert_regs (rtx, struct table_elt *, bool);
 static void remove_from_table (struct table_elt *, unsigned);
 static void remove_pseudo_from_table (rtx, unsigned);
 static struct table_elt *lookup (rtx, unsigned, machine_mode);
@@ -542,8 +542,7 @@ static enum rtx_code find_comparison_args (enum rtx_code, 
rtx *, rtx *,
 static rtx fold_rtx (rtx, rtx_insn *);
 static rtx equiv_constant (rtx);
 static void record_jump_equiv (rtx_insn *, bool);
-static void record_jump_cond (enum rtx_code, machine_mode, rtx, rtx,
- int);
+static void record_jump_cond (enum rtx_code, machine_mode, rtx, rtx);
 static void cse_insn (rtx_insn *);
 static void cse_prescan_path (struct cse_basic_block_data *);
 static void invalidate_from_clobbers (rtx_insn *);
@@ -967,19 +966,19 @@ delete_reg_equiv (unsigned int reg)
mention_regs is not called when a register itself
is being stored in the table.
 
-   Return 1 if we have done something that may have changed the hash code
-   of X.  */
+   Return true if we have done something that may have changed
+   the hash code of X.  */
 
-static int
+static bool
 mention_regs (rtx x)
 {
   enum rtx_code code;
   int i, j;
   const char *fmt;
-  int changed = 0;
+  bool changed = false;
 
   if (x == 0)
-return 0;
+return false;
 
   code = GET_CODE (x);
   if (code == REG)
@@ -997,7 +996,7 @@ mention_regs (rtx x)
  SUBREG_TICKED (i) = -1;
}
 
-  return 0;
+  return false;
 }
 
   /* If this is a SUBREG, we don't want to discard other SUBREGs of the same
@@ -1024,7 +1023,7 @@ mention_regs (rtx x)
 
   REG_IN_TABLE (i) = REG_TICK (i);
   SUBREG_TICKED (i) = REGNO (SUBREG_REG (x));
-  return 0;
+  return false;
 }
 
   /* If X is a comparison or a COMPARE and either operand is a register
@@ -1041,28 +1040,32 @@ mention_regs (rtx x)
 {
   if (REG_P (XEXP (x, 0))
  && ! REGNO_QTY_VALID_P (REGNO (XEXP (x, 0
-   if (insert_regs (XEXP (x, 0), NULL, 0))
+   if (insert_regs (XEXP (x, 0), NULL, false))
  {
rehash_using_reg (XEXP (x, 0));
-   changed = 1;
+   changed = true;
  }
 
   if (REG_P (XEXP (x, 1))
  && ! REGNO_QTY_VALID_P (REGNO (XEXP (x, 1
-   if (insert_regs (XEXP (x, 1), NULL, 0))
+   if (insert_regs (XEXP (x, 1), NULL, false))
  {
rehash_using_reg (XEXP (x, 1));
-   changed = 1;
+   changed = true;
  }
 }
 
   fmt = GET_RTX_FORMAT (code);
   for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
 if (fmt[i] == 'e')
-  changed |= mention_regs (XEXP (x, i));
+  {
+   if (mention_regs (XEXP (x, i)))
+ changed = true;
+  }
 else if (fmt[i] == 'E')
   for (j = 0; j < XVECLEN (x, i); j++)
-   changed |= mention_regs (XVECEXP (x, i, j));
+   if (mention_regs (XVECEXP (x, i, j)))
+ changed = true;
 
   return changed;
 }
@@ -1070,15 +1073,15 @@ mention_regs (rtx x)
 /* Update the register quantities for inserting X into the hash table
with a value equivalent to CLASSP.
(If the class does not contain a REG, it is irrelevant.)
-   If MODIFIED is nonzero, X is a destination; it is being modified.
+   If MODIFIED is true, X is a destination; it is being modified.
Note that delete_reg_equiv should be called on a register
before insert_regs is done on that register with MODIFIED != 0.
 
-   Nonzero value means that elements of reg_qty have changed
+   True value means that elements of reg_qty have changed
so X's hash code may be different.  */
 
-static int
-insert_regs (rtx x, struct table_elt *classp, int modified)
+static bool
+insert_regs (rtx x, struct table_elt *classp, bool mo

Re: [PATCH] Fix PR 110042: ifcvt regression due to paradoxical subregs

2023-06-01 Thread Jeff Law via Gcc-patches




On 5/31/23 15:22, Andrew Pinski wrote:

On Wed, May 31, 2023 at 12:29 AM Richard Biener via Gcc-patches
 wrote:


On Wed, May 31, 2023 at 6:34 AM Andrew Pinski via Gcc-patches
 wrote:


After r14-1014-gc5df248509b489364c573e8, GCC started to emit
directly a zero_extract for `(t1&0x8)!=0`. This introduced
a small regression where ifcvt would not do the ifconversion
as there is now a paradoxical subreg in the dest which
was being rejected. Since paradoxical subreg set the whole
register, we can treat it as the same as a reg in the two places.

OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.


OK I guess.   I vaguely remember SUBREG_PROMOTED_UNSIGNED_P
applies to non-paradoxical subregs but I might be swapping things - maybe
you remember better and whether that would cause any issues here?


So I looked into the history of the code in ifcvt.cc, this code was
added with r6-3071-ge65bf4e814d38c to accept more complex bb
(https://inbox.sourceware.org/gcc-patches/559fbb13.80...@arm.com/).
The thread where we start talking about subregs is located with Jeff's
email starting here:
https://inbox.sourceware.org/gcc-patches/55bbafac.5020...@redhat.com/ .

Jeff,
   I know Richard already approved this patch but could you provide a
second eye as you were involved reviewing the original code here and I
want to make sure I understood the code in a reasonable fashion?
It's been a while.   I think my original concerns were with RMW operands 
and making sure we tracked all sub-components of such operands correctly.


That was based on the original version, but that version looks like it 
should have been OK after reviewing the details of reg_referenced_p. 
It's since moved to DF.


So the only worry I immediately see is whether or not DF is giving uses 
and sets of sub-components of a RMW operand or multi-hard register modes.


Jeff


Re: [PATCH 1/3] testsuite: Unbork multilib testing on RISC-V (and any target really)

2023-06-01 Thread Jeff Law via Gcc-patches




On 6/1/23 01:24, Thomas Schwinge wrote:




But probably more importantly, this problem seems to not be triggering on all 
multilib targets.  For example, I just examined my tester's build logs and 
couldn't see this on the H8/300 or V850 ports.  Which begs the question, why?


..., which may be the case for those, too?  In other words: the problem
only shows up if '-march=[...]' appears in the flags, which indeed may
not be a common thing?  I'll cross-verify this with x86_64 and
'-march=[...]' flags.
Correct.  Those do not use -march.  They have distinct flags for turning 
on sub-variants.  So for example on the H8 -ms turns on H8/S code 
generation, -msx turns on H8/SX, etc.  V850 has different options, but 
follows the same basic principle.



Jeff


[PATCH 1/2] c++: refine dependent_alias_template_spec_p [PR90679]

2023-06-01 Thread Patrick Palka via Gcc-patches
For a complex alias template-id, dependent_alias_template_spec_p returns
true if any template argument of the template-id is dependent.  This
predicate indicates that substitution into the template-id may behave
differently with respect to SFINAE than substitution into the expanded
alias, and so the alias is in a way non-transparent.  For example
'first_t' in

  template using first_t = T;
  template first_t f();

is such an alias template-id since first_t doesn't use its second
template parameter and so the substitution into the expanded alias would
discard the SFINAE effects of the corresponding (dependent) argument 'T&'.

But this predicate is overly conservative since what really matters for
sake of SFINAE equivalence is whether a template argument corresponding
to an _unused_ template parameter is dependent.  So the predicate should
return false for e.g. 'first_t' or 'first_t'.
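
As a rough illustration of why such an alias is "non-transparent", here is a
minimal sketch.  The template heads are an assumed spelling (the snippet above
lost its angle brackets in the archive), and 'f' is declared only so that its
return type can be inspected; none of this is taken from the patch itself.

```cpp
#include <type_traits>

// Assumed form of the alias discussed above: the second template
// parameter is never used in the aliased type.
template<class T, class U> using first_t = T;

// After expansion the unused argument leaves no trace in the type.
static_assert(std::is_same<first_t<int, void>, int>::value,
              "first_t<int, void> expands to int");

// Hypothetical declaration in the spirit of the example: the return type
// names the dependent argument 'T&' only through the unused parameter, so
// the template-id cannot simply be replaced by 'T' without losing the
// substitution of 'T&'.
template<class T> first_t<T, T&> f();

// For a well-formed argument the alias is just T.
static_assert(std::is_same<decltype(f<int>()), int>::value,
              "f<int>() has return type int");
```

When every argument for the unused parameter is non-dependent, nothing is lost
by expanding the alias, which is the case the refined predicate is meant to
recognize.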

This patch refines the predicate appropriately.  We need to be able to
efficiently determine which template parameters of a complex alias
template are unused, so to that end we add a new out parameter to
complex_alias_template_p and cache its result in an on-the-side
hash_map that replaces the existing TEMPLATE_DECL_COMPLEX_ALIAS_P
flag.  And in doing so, we fix a latent bug that this flag wasn't
being propagated during partial instantiation, and so we were treating
all partially instantiated member alias templates as non-complex.

PR c++/90679

gcc/cp/ChangeLog:

* cp-tree.h (TEMPLATE_DECL_COMPLEX_ALIAS_P): Remove.
(most_general_template): Constify parameter.
* pt.cc (push_template_decl): Adjust after removing
TEMPLATE_DECL_COMPLEX_ALIAS_P.
(complex_alias_tmpl_info): New hash_map.
(uses_all_template_parms_data::seen): Change type to
tree* from bool*.
(complex_alias_template_r): Adjust accordingly.
(complex_alias_template_p): Add 'seen_out' out parameter.
Call most_general_template and check PRIMARY_TEMPLATE_P.
Use complex_alias_tmpl_info to cache the result and set
'*seen_out' accordingly.
(dependent_alias_template_spec_p): Add !processing_template_decl
early exit test.  Consider dependence of only template arguments
corresponding to seen template parameters as per

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-75.C: New test.
---
 gcc/cp/cp-tree.h   |   7 +-
 gcc/cp/pt.cc   | 101 +++--
 gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C |  24 +
 3 files changed, 100 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a1b882f11fe..5330d1e1f62 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -543,7 +543,6 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
2: DECL_THIS_EXTERN (in VAR_DECL, FUNCTION_DECL or PARM_DECL)
   DECL_IMPLICIT_TYPEDEF_P (in a TYPE_DECL)
   DECL_CONSTRAINT_VAR_P (in a PARM_DECL)
-  TEMPLATE_DECL_COMPLEX_ALIAS_P (in TEMPLATE_DECL)
   DECL_INSTANTIATING_NSDMI_P (in a FIELD_DECL)
   USING_DECL_UNRELATED_P (in USING_DECL)
3: DECL_IN_AGGR_P.
@@ -3655,10 +3654,6 @@ struct GTY(()) lang_decl {
 #define TYPE_DECL_ALIAS_P(NODE) \
   DECL_LANG_FLAG_6 (TYPE_DECL_CHECK (NODE))
 
-/* Nonzero for TEMPLATE_DECL means that it is a 'complex' alias template.  */
-#define TEMPLATE_DECL_COMPLEX_ALIAS_P(NODE) \
-  DECL_LANG_FLAG_2 (TEMPLATE_DECL_CHECK (NODE))
-
 /* Nonzero for a type which is an alias for another type; i.e, a type
which declaration was written 'using name-of-type =
another-type'.  */
@@ -7403,7 +7398,7 @@ extern tree tsubst_argument_pack  (tree, tree, 
tsubst_flags_t, tree);
 extern tree tsubst_template_args   (tree, tree, tsubst_flags_t, 
tree);
 extern tree tsubst_template_arg(tree, tree, 
tsubst_flags_t, tree);
 extern tree tsubst_function_parms  (tree, tree, tsubst_flags_t, 
tree);
-extern tree most_general_template  (tree);
+extern tree most_general_template  (const_tree);
 extern tree get_mostly_instantiated_function_type (tree);
 extern bool problematic_instantiation_changed  (void);
 extern void record_last_problematic_instantiation (void);
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 7fb3e75bceb..1b28195e10d 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -211,7 +211,6 @@ static tree listify (tree);
 static tree listify_autos (tree, tree);
 static tree tsubst_template_parm (tree, tree, tsubst_flags_t);
 static tree instantiate_alias_template (tree, tree, tsubst_flags_t);
-static bool complex_alias_template_p (const_tree tmpl);
 static tree get_underlying_template (tree);
 static tree tsubst_attributes (tree, tree, tsubst_flags_t, tree);
 static tree canonicalize_expr_argument (tree, tsubst_flags_t);
@@ -6233,8 +6232,6 @@ push_template_decl (tree decl, bool is_friend)
   

[PATCH 2/2] c++: partial ordering and dep alias tmpl specs [PR90679]

2023-06-01 Thread Patrick Palka via Gcc-patches
During partial ordering, we want to look through dependent alias
template specializations within template arguments and otherwise
treat them as opaque in other contexts (see e.g. r7-7116-g0c942f3edab108
and r11-7011-g6e0a231a4aa240).  To that end template_args_equal was
given a partial_order flag that controls this behavior.  This flag
does the right thing when a dependent alias template specialization
appears as template argument of the partial specialization, e.g. in

  template using first_t = T;
  template struct traits;
  template struct traits> { }; // #1
  template struct traits> { }; // #2

we correctly consider #2 to be more specialized than #1.  But if
the alias specialization appears as a template argument of another
class template specialization, e.g. in

  template struct traits>> { }; // #1
  template struct traits>> { }; // #2

then we incorrectly consider #1 and #2 to be unordered.  This is because

  1. we don't propagate the flag to recursive template_args_equal calls
  2. we don't use structural equality for class template specializations
 written in terms of dependent alias template specializations

This patch fixes the first issue by turning the partial_order flag into
a global.  This patch fixes the second issue by making us propagate
structural equality appropriately when building a class template
specialization.  In passing this patch also improves hashing of
specializations that use structural equality.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/90679

gcc/cp/ChangeLog:

* cp-tree.h (comp_template_args): Remove partial_order
parameter.
(template_args_equal): Likewise.
* pt.cc (iterative_hash_template_arg) : Hash
the template and arguments for specializations that use
structural equality.
(comparing_for_partial_ordering): New flag.
(template_args_equal): Remove partial order parameter and
use comparing_for_partial_ordering instead.
(comp_template_args): Likewise.
(comp_template_args_porder): Set comparing_for_partial_ordering
instead.  Make static.
(any_template_arguments_need_structural_equality_p): Return true
for an argument that's a dependent alias template specialization
or a class template specialization that itself needs structural
equality.
* tree.cc (cp_tree_equal) : Adjust call to
comp_template_args.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-75a.C: New test.
* g++.dg/cpp0x/alias-decl-75b.C: New test.
---
 gcc/cp/cp-tree.h|  4 +--
 gcc/cp/pt.cc| 40 +
 gcc/cp/tree.cc  |  2 +-
 gcc/testsuite/g++.dg/cpp0x/alias-decl-75a.C | 26 ++
 gcc/testsuite/g++.dg/cpp0x/alias-decl-75b.C | 26 ++
 5 files changed, 88 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-75a.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-75b.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 5330d1e1f62..f08e5630a5c 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7381,8 +7381,8 @@ extern int template_class_depth   (tree);
 extern int is_specialization_of(tree, tree);
 extern bool is_specialization_of_friend(tree, tree);
 extern bool comp_template_args (tree, tree, tree * = NULL,
-tree * = NULL, bool = false);
-extern int template_args_equal  (tree, tree, bool = false);
+tree * = NULL);
+extern int template_args_equal  (tree, tree);
 extern tree maybe_process_partial_specialization (tree);
 extern tree most_specialized_instantiation (tree);
 extern tree most_specialized_partial_spec   (tree, tsubst_flags_t);
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1b28195e10d..1a32f10b22b 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -1913,6 +1913,11 @@ iterative_hash_template_arg (tree arg, hashval_t val)
default:
  if (tree canonical = TYPE_CANONICAL (arg))
val = iterative_hash_object (TYPE_HASH (canonical), val);
+ else if (tree ti = TYPE_TEMPLATE_INFO (arg))
+   {
+ val = iterative_hash_template_arg (TI_TEMPLATE (ti), val);
+ val = iterative_hash_template_arg (TI_ARGS (ti), val);
+   }
  break;
}
 
@@ -9296,6 +9301,12 @@ coerce_template_parms (tree parms,
   return return_full_args ? new_args : new_inner_args;
 }
 
+/* Whether we are comparing template arguments during partial ordering
+   (and therefore want the comparison to look through dependent alias
+   template specializations).  */
+
+static int comparing_for_partial_ordering;
+
 /* Returns true if T is a wrapper to make a C++20 template paramet

Re: [PATCH 3/3] testsuite: print any leaking torture options for debugging

2023-06-01 Thread Jeff Law via Gcc-patches




On 5/31/23 10:25, Vineet Gupta wrote:

This was helpful when debugging the recent multilib testsuite failure.

gcc/testsuite:
* lib/torture-options.exp: print the value of non-empty options:
torture_without_loops, torture_with_loops, LTO_TORTURE_OPTIONS.

OK
jeff


Re: [PATCH 1/3] testsuite: Unbork multilib testing on RISC-V (and any target really)

2023-06-01 Thread Jeff Law via Gcc-patches




On 5/31/23 10:25, Vineet Gupta wrote:

Multilib testing on trunk is currently busted (and surprisingly this
affects any/all targets but it seems nobody cares). We currently get the
following splat:

| ERROR: tcl error code NONE
| ERROR: torture-init: torture_without_loops is not empty as expected

And this takes down pretty much all of testsuite.

|   = Summary of gcc testsuite =
|| # of unexpected case / # of unique unexpected 
case
||  gcc |  g++ | gfortran |
| rv64imafdc/  lp64d/ medlow | 5421 / 4 |1 / 1 |   72 /12 |
| rv32imafdc/ ilp32d/ medlow | 5422 / 5 |3 / 2 |   72 /12 |
|   rv32imac/  ilp32/ medlow |  391 / 5 |3 / 2 |  109 /19 |
|   rv64imac/   lp64/ medlow | 5422 / 5 |1 / 1 |  109 /19 |

There have been recent improvements in test harness around pairing of
torture-{init,finish} and checking for leaking torture options. This
however triggers a latent bug introduced way back in 2009: commit 3dd1415dc88
"i386-prefetch.exp: Skip tests when multilib flags contain -march" which
missed a pairing torture-finish. It was benign so far but in the new
regime it causes extra state "torture-init-done" confusing the 2nd round of
tests (in multilib).

This fix moves the early exit outside of torture-{init,finish} bracket
and brings RISC-V testing back to sanity.

| rv64imafdc/  lp64d/ medlow |3 / 2 |1 / 1 |   72 /12 |
| rv32imafdc/ ilp32d/ medlow |4 / 3 |3 / 2 |   72 /12 |
|   rv32imac/  ilp32/ medlow |3 / 2 |3 / 2 |  109 /19 |
|   rv64imac/   lp64/ medlow |5 / 4 |1 / 1 |  109 /19 |

gcc/testsuite:
* gcc.misc-tests/i386-prefetch.exp: Move early return outside
  the torture-{init,finish}
OK after addressing Thomas's comments which I think just amounted to 
moving the code to a different place and adjusting the comments in the 
commit message.


jeff


Re: [PATCH 2/3] RISC-V: Add missing torture-init and torture-finish for rvv.exp

2023-06-01 Thread Jeff Law via Gcc-patches




On 5/31/23 10:25, Vineet Gupta wrote:

From: Kito Cheng 

This is in line with recent test harness expectations and is a
preventive change as it doesn't actually fix any errors.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Add torture-init and
torture-finish.
Thomas's recommendation was to drop this as it doesn't change any 
observed behavior.  Do you agree with that recommendation?


jeff


Re: [PATCH] doc: improve docs for -pedantic{,-errors}

2023-06-01 Thread Jeff Law via Gcc-patches




On 5/31/23 21:31, Jason Merrill via Gcc-patches wrote:

Tested by looking at the makeinfo output.  OK for trunk?

-- 8< --

Recent discussion of -Wimplicit led me to want to clarify this section of
the documentation, and mark which diagnostics other than -Wpedantic are
affected by -pedantic-errors.

gcc/ChangeLog:

* doc/invoke.texi (-Wpedantic): Improve clarity.

OK
jeff


[committed] libstdc++: Document removal of implicit allocator rebinding extensions

2023-06-01 Thread Jonathan Wakely via Gcc-patches
Pushed to trunk. The first two changes will be backported too.

-- >8 --

Traditionally libstdc++ allowed containers and strings to be
instantiated with allocators that have the wrong value type, implicitly
rebinding the allocator to the container's value type. Since C++20 that
has been explicitly ill-formed, so the extension is no longer supported
in strict modes (e.g. -std=c++17) and in C++20 and later.
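
For code that relied on the extension, one portable alternative is to rebind
the allocator explicitly.  The following is only a sketch; the alias names
char_alloc, int_alloc and int_vector are made up for illustration.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// An allocator whose value_type (char) does not match the element type
// (int) of the container we want to use it with.
using char_alloc = std::allocator<char>;

// Rebind it explicitly instead of relying on the container to do so.
using int_alloc = std::allocator_traits<char_alloc>::rebind_alloc<int>;

static_assert(std::is_same<int_alloc, std::allocator<int>>::value,
              "rebinding std::allocator<char> to int gives std::allocator<int>");

// The container now sees an allocator with the matching value type, which
// is well-formed in strict modes and in C++20 alike.
using int_vector = std::vector<int, int_alloc>;

static_assert(std::is_same<int_vector::allocator_type,
                           std::allocator<int>>::value,
              "the container uses the rebound allocator");
```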

libstdc++-v3/ChangeLog:

* doc/xml/manual/evolution.xml: Document removal of implicit
allocator rebinding extensions in strict mode and for C++20.
* doc/html/*: Regenerate.
---
 libstdc++-v3/doc/html/manual/api.html | 13 +
 libstdc++-v3/doc/xml/manual/evolution.xml | 19 +++
 2 files changed, 32 insertions(+)

diff --git a/libstdc++-v3/doc/xml/manual/evolution.xml 
b/libstdc++-v3/doc/xml/manual/evolution.xml
index 4037a18d2df..db70f24f2f9 100644
--- a/libstdc++-v3/doc/xml/manual/evolution.xml
+++ b/libstdc++-v3/doc/xml/manual/evolution.xml
@@ -915,6 +915,13 @@ Calling a std::bind result as volatile was 
deprecated for C++17.
   libstdc++.so.8.
 
 
+
+  The extension allowing containers to be instantiated with an allocator
+  that doesn't match the container's value type is no longer allowed in
+  strict (-std=c++NN) modes, only in
+  -std=gnu++NN modes.
+
+
 
 
 9
@@ -998,6 +1005,12 @@ Calling a std::bind result as volatile was 
deprecated for C++17.
   added.
 
 
+
+  The extension allowing containers to be instantiated with an allocator
+  that doesn't match the container's value type is no longer allowed in
+  C++20 mode, even in non-strict -std=gnu++20 mode.
+
+
 
 
 11
@@ -1096,6 +1109,12 @@ Deprecate the non-standard overload that allows 
std::setfill
 to be used with std::basic_istream.
 
 
+
+  The extension allowing std::basic_string to be instantiated
+  with an allocator that doesn't match the string's character type is no
+  longer allowed in C++20 mode.
+
+
 
 
 
-- 
2.40.1



[committed] libstdc++: Fix code size regressions in std::vector [PR110060]

2023-06-01 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

My r14-1452-gfb409a15d9babc change to add optimization hints to
std::vector causes regressions because it makes std::vector::size() and
std::vector::capacity() too big to inline. That's the opposite of what
I wanted, so revert the changes to those functions.

To achieve the original aim of optimizing vec.assign(vec.size(), x) we
can add a local optimization hint to _M_fill_assign, so that it doesn't
affect all other uses of size() and capacity().
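
The idea of a local hint can be sketched outside of std::vector as follows.
This is a simplified model, not the library code: choose_assign_path and the
assign_path enumerators are hypothetical names, and __builtin_unreachable is
the GCC/Clang builtin that the patch itself uses.

```cpp
#include <cstddef>

// Hypothetical labels for the three branches of a fill-assign operation.
enum class assign_path { reallocate, grow_in_place, overwrite_and_erase };

// Simplified model of the branch selection.  The caller is assumed to
// uphold the container invariant size <= capacity.
constexpr assign_path
choose_assign_path(std::size_t n, std::size_t size, std::size_t capacity)
{
  if (n > capacity)
    {
      // Because size <= capacity, n > capacity implies n > size.  The
      // hint states that fact only on this path, so the accessors
      // themselves stay small enough to inline.
      if (n <= size)
        __builtin_unreachable();
      return assign_path::reallocate;
    }
  else if (n > size)
    return assign_path::grow_in_place;
  else
    return assign_path::overwrite_and_erase;
}
```

With the hint in place, a call whose first argument equals the current size
can never satisfy n > size, so the optimizer has what it needs to discard the
reallocating branch for that call.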

Additionally, add the same hint to the _M_assign_aux overload for
forward iterators and add that to the testcase.

It would be nice to similarly optimize:
  if (vec1.size() == vec2.size()) vec1 = vec2;
but adding hints to operator=(const vector&) doesn't help. Presumably
the relationships between the two sizes and two capacities are too
complex to track effectively.

libstdc++-v3/ChangeLog:

PR libstdc++/110060
* include/bits/stl_vector.h (_Vector_base::_M_invariant):
Remove.
(vector::size, vector::capacity): Remove calls to _M_invariant.
* include/bits/vector.tcc (vector::_M_fill_assign): Add
optimization hint to reallocating path.
(vector::_M_assign_aux(FwdIter, FwdIter, forward_iterator_tag)):
Likewise.
* testsuite/23_containers/vector/capacity/invariant.cc: Moved
to...
* testsuite/23_containers/vector/modifiers/assign/no_realloc.cc:
...here. Check assign(FwdIter, FwdIter) too.
* testsuite/23_containers/vector/types/1.cc: Revert addition
of -Wno-stringop-overread option.
---
 libstdc++-v3/include/bits/stl_vector.h| 23 +--
 libstdc++-v3/include/bits/vector.tcc  | 17 ++
 .../assign/no_realloc.cc} |  6 +
 .../testsuite/23_containers/vector/types/1.cc |  2 +-
 4 files changed, 20 insertions(+), 28 deletions(-)
 rename libstdc++-v3/testsuite/23_containers/vector/{capacity/invariant.cc => 
modifiers/assign/no_realloc.cc} (70%)

diff --git a/libstdc++-v3/include/bits/stl_vector.h 
b/libstdc++-v3/include/bits/stl_vector.h
index e593be443bc..70ced3d101f 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -389,23 +389,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
 protected:
 
-  __attribute__((__always_inline__))
-  _GLIBCXX20_CONSTEXPR void
-  _M_invariant() const
-  {
-#if __OPTIMIZE__
-   if (this->_M_impl._M_finish < this->_M_impl._M_start)
- __builtin_unreachable();
-   if (this->_M_impl._M_finish > this->_M_impl._M_end_of_storage)
- __builtin_unreachable();
-
-   size_t __sz = this->_M_impl._M_finish - this->_M_impl._M_start;
-   size_t __cap = this->_M_impl._M_end_of_storage - this->_M_impl._M_start;
-   if (__sz > __cap)
- __builtin_unreachable();
-#endif
-  }
-
   _GLIBCXX20_CONSTEXPR
   void
   _M_create_storage(size_t __n)
@@ -1005,10 +988,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
   size_type
   size() const _GLIBCXX_NOEXCEPT
-  {
-   _Base::_M_invariant();
-   return size_type(this->_M_impl._M_finish - this->_M_impl._M_start);
-  }
+  { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
 
   /**  Returns the size() of the largest possible %vector.  */
   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
@@ -1095,7 +1075,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   size_type
   capacity() const _GLIBCXX_NOEXCEPT
   {
-   _Base::_M_invariant();
return size_type(this->_M_impl._M_end_of_storage
   - this->_M_impl._M_start);
   }
diff --git a/libstdc++-v3/include/bits/vector.tcc 
b/libstdc++-v3/include/bits/vector.tcc
index d6fdea2dd01..acd11e2dc68 100644
--- a/libstdc++-v3/include/bits/vector.tcc
+++ b/libstdc++-v3/include/bits/vector.tcc
@@ -270,15 +270,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 vector<_Tp, _Alloc>::
 _M_fill_assign(size_t __n, const value_type& __val)
 {
+  const size_type __sz = size();
   if (__n > capacity())
{
+ if (__n <= __sz)
+   __builtin_unreachable();
  vector __tmp(__n, __val, _M_get_Tp_allocator());
  __tmp._M_impl._M_swap_data(this->_M_impl);
}
-  else if (__n > size())
+  else if (__n > __sz)
{
  std::fill(begin(), end(), __val);
- const size_type __add = __n - size();
+ const size_type __add = __n - __sz;
  _GLIBCXX_ASAN_ANNOTATE_GROW(__add);
  this->_M_impl._M_finish =
std::__uninitialized_fill_n_a(this->_M_impl._M_finish,
@@ -316,10 +319,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _M_assign_aux(_ForwardIterator __first, _ForwardIterator __last,
std::forward_iterator_tag)
   {
+   const size_type __sz = size();
const size_type __len = std::distance(__first, __last)

[committed] libstdc++: Do not use std::expected::value() in monadic ops (LWG 3938)

2023-06-01 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

The monadic operations in std::expected always check has_value() so we
can avoid the exceptional path in value() and the assertions in error()
by accessing _M_val and _M_unex directly. This means that the monadic
operations no longer require _M_unex to be copyable so that it can be
thrown from value(), as modified by LWG 3938.
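
The pattern can be shown with a deliberately tiny stand-in for std::expected.
Everything below (mini_expected, its members and the helper half) is
hypothetical and far simpler than the real class; it only illustrates checking
the state once and then reading the members directly, with no throwing
accessor involved.

```cpp
#include <utility>

// Hypothetical, heavily simplified stand-in for std::expected<T, E>.
// Both members always exist here; the real class uses a union.
template<typename T, typename E>
struct mini_expected
{
  bool has_val;
  T val;  // meaningful only when has_val is true
  E err;  // meaningful only when has_val is false

  // Monadic and_then for lvalues: the state is tested once, then val or
  // err is accessed directly rather than through a checking accessor.
  template<typename F>
  constexpr auto and_then(F&& f) const&
  {
    using U = decltype(std::forward<F>(f)(val));
    if (has_val)
      return std::forward<F>(f)(val);
    return U{false, {}, err};
  }
};

// Hypothetical operation: succeeds for even inputs, fails with code 1.
constexpr mini_expected<int, int> half(int x)
{
  if (x % 2 == 0)
    return {true, x / 2, 0};
  return {false, 0, 1};
}
```

Because and_then never calls a throwing accessor, nothing in it needs the
error to be copied into an exception object.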

This also fixes two incorrect uses of std::move in transform(F&&)& and
transform(F&&) const& which I found while making these changes.

Now that move-only error types are supported, it's possible to properly
test the constraints that LWG 3877 added to and_then and transform. The
lwg3877.cc test now does that.

libstdc++-v3/ChangeLog:

* include/std/expected (expected::and_then, expected::or_else)
(expected::transform_error): Use _M_val and _M_unex instead of
calling value() and error(), as per LWG 3938.
(expected::transform): Likewise. Remove incorrect std::move
calls from lvalue overloads.
(expected::and_then, expected::or_else)
(expected::transform): Use _M_unex instead of calling
error().
* testsuite/20_util/expected/lwg3877.cc: Add checks for and_then
and transform, and for std::expected.
* testsuite/20_util/expected/lwg3938.cc: New test.
---
 libstdc++-v3/include/std/expected |  78 +-
 .../testsuite/20_util/expected/lwg3877.cc | 145 ++
 .../testsuite/20_util/expected/lwg3938.cc | 142 +
 3 files changed, 298 insertions(+), 67 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/20_util/expected/lwg3938.cc

diff --git a/libstdc++-v3/include/std/expected 
b/libstdc++-v3/include/std/expected
index 5ea0d6a7cb9..a63557448f7 100644
--- a/libstdc++-v3/include/std/expected
+++ b/libstdc++-v3/include/std/expected
@@ -745,8 +745,7 @@ namespace __expected
   {
if (_M_has_value) [[likely]]
  return std::move(_M_val);
-   _GLIBCXX_THROW_OR_ABORT(bad_expected_access<_Er>(
- std::move(_M_unex)));
+   _GLIBCXX_THROW_OR_ABORT(bad_expected_access<_Er>(std::move(_M_unex)));
   }
 
   constexpr _Tp&&
@@ -754,8 +753,7 @@ namespace __expected
   {
if (_M_has_value) [[likely]]
  return std::move(_M_val);
-   _GLIBCXX_THROW_OR_ABORT(bad_expected_access<_Er>(
- std::move(_M_unex)));
+   _GLIBCXX_THROW_OR_ABORT(bad_expected_access<_Er>(std::move(_M_unex)));
   }
 
   constexpr const _Er&
@@ -849,9 +847,9 @@ namespace __expected
  static_assert(is_same_v);
 
  if (has_value())
-   return std::__invoke(std::forward<_Fn>(__f), value());
+   return std::__invoke(std::forward<_Fn>(__f), _M_val);
  else
-   return _Up(unexpect, error());
+   return _Up(unexpect, _M_unex);
}
 
   template requires is_constructible_v<_Er, const _Er&>
@@ -863,9 +861,9 @@ namespace __expected
  static_assert(is_same_v);
 
  if (has_value())
-   return std::__invoke(std::forward<_Fn>(__f), value());
+   return std::__invoke(std::forward<_Fn>(__f), _M_val);
  else
-   return _Up(unexpect, error());
+   return _Up(unexpect, _M_unex);
}
 
   template requires is_constructible_v<_Er, _Er>
@@ -877,9 +875,9 @@ namespace __expected
  static_assert(is_same_v);
 
  if (has_value())
-   return std::__invoke(std::forward<_Fn>(__f), std::move(value()));
+   return std::__invoke(std::forward<_Fn>(__f), std::move(_M_val));
  else
-   return _Up(unexpect, std::move(error()));
+   return _Up(unexpect, std::move(_M_unex));
}
 
 
@@ -892,9 +890,9 @@ namespace __expected
  static_assert(is_same_v);
 
  if (has_value())
-   return std::__invoke(std::forward<_Fn>(__f), std::move(value()));
+   return std::__invoke(std::forward<_Fn>(__f), std::move(_M_val));
  else
-   return _Up(unexpect, std::move(error()));
+   return _Up(unexpect, std::move(_M_unex));
}
 
   template requires is_constructible_v<_Tp, _Tp&>
@@ -906,9 +904,9 @@ namespace __expected
  static_assert(is_same_v);
 
  if (has_value())
-   return _Gr(in_place, value());
+   return _Gr(in_place, _M_val);
  else
-   return std::__invoke(std::forward<_Fn>(__f), error());
+   return std::__invoke(std::forward<_Fn>(__f), _M_unex);
}
 
   template requires is_constructible_v<_Tp, const _Tp&>
@@ -920,9 +918,9 @@ namespace __expected
  static_assert(is_same_v);
 
  if (has_value())
-   return _Gr(in_place, value());
+   return _Gr(in_place, _M_val);
  else
-   return std::__invoke(std::forward<_Fn>(__f), error());
+   return std::__invoke(std::forward<_Fn>(__f), _M_unex);
   

[Patch, fortran] PR87477 - [meta-bug] [F03] issues concerning the ASSOCIATE statement

2023-06-01 Thread Paul Richard Thomas via Gcc-patches
Hi All,

This started out as the search for a fix to pr109948 and evolved to roll in
5 other prs.

Basically parse_associate was far too clunky and, in any case, existing
functions in resolve.cc were well capable of doing the determination of the
target expression rank. While I was checking the comments, the lightbulb
flashed with respect to prs 102109/112/190 and the chunk dealing with
function results of unknown type was born.

Thanks to the changes in parse.cc, the problem in pr99326 migrated
upstream to the resolution and the chunklet in resolve.cc was an obvious
fix.

I am minded to s/{ dg-do run}/{ dg-do compile } for all six testcases. At
the testing stage, I wanted to check that the testcases actually did what
they are supposed to do :-)

Bootstraps and regtests OK - good for head?

Paul

PS I need to do some housekeeping on pr87477 now. Some of the blockers have
"fixed themselves" and others are awaiting backporting. I think that there
are only 4 or so left, of which 89645 and 99065 are the most difficult to
deal with.
diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc
index 5e2a95688d2..3947444f17c 100644
--- a/gcc/fortran/parse.cc
+++ b/gcc/fortran/parse.cc
@@ -4919,6 +4919,7 @@ parse_associate (void)
   gfc_state_data s;
   gfc_statement st;
   gfc_association_list* a;
+  gfc_array_spec *as;
 
   gfc_notify_std (GFC_STD_F2003, "ASSOCIATE construct at %C");
 
@@ -4934,8 +4935,7 @@ parse_associate (void)
   for (a = new_st.ext.block.assoc; a; a = a->next)
 {
   gfc_symbol* sym;
-  gfc_ref *ref;
-  gfc_array_ref *array_ref;
+  gfc_expr *target;
 
   if (gfc_get_sym_tree (a->name, NULL, &a->st, false))
 	gcc_unreachable ();
@@ -4952,6 +4952,7 @@ parse_associate (void)
 	 for parsing component references on the associate-name
 	 in case of association to a derived-type.  */
   sym->ts = a->target->ts;
+  target = a->target;
 
   /* Don’t share the character length information between associate
 	 variable and target if the length is not a compile-time constant,
@@ -4971,31 +4972,37 @@ parse_associate (void)
 	   && sym->ts.u.cl->length->expr_type == EXPR_CONSTANT))
 	sym->ts.u.cl = gfc_new_charlen (gfc_current_ns, NULL);
 
-  /* Check if the target expression is array valued.  This cannot always
-	 be done by looking at target.rank, because that might not have been
-	 set yet.  Therefore traverse the chain of refs, looking for the last
-	 array ref and evaluate that.  */
-  array_ref = NULL;
-  for (ref = a->target->ref; ref; ref = ref->next)
-	if (ref->type == REF_ARRAY)
-	  array_ref = &ref->u.ar;
-  if (array_ref || a->target->rank)
+  /* Check if the target expression is array valued. This cannot be done
+	 by calling gfc_resolve_expr because the context is unavailable.
+	 However, the references can be resolved and the rank of the target
+	 expression set.  */
+  if (target->ref && gfc_resolve_ref (target)
+	  && target->expr_type != EXPR_ARRAY
+	  && target->expr_type != EXPR_COMPCALL)
+	gfc_expression_rank (target);
+
+  /* Determine whether or not function expressions with unknown type are
+	 structure constructors. If so, the function result can be converted
+	 to be a derived type.
+	 TODO: Deal with references to sibling functions that have not yet been
+	 parsed (PRs 89645 and 99065).  */
+  if (target->expr_type == EXPR_FUNCTION && target->ts.type == BT_UNKNOWN)
 	{
-	  gfc_array_spec *as;
-	  int dim, rank = 0;
-	  if (array_ref)
+	  gfc_symbol *derived;
+	  /* The derived type has a leading uppercase character.  */
+	  gfc_find_symbol (gfc_dt_upper_string (target->symtree->name),
+			   my_ns->parent, 1, &derived);
+	  if (derived && derived->attr.flavor == FL_DERIVED)
 	{
-	  a->rankguessed = 1;
-	  /* Count the dimension, that have a non-scalar extend.  */
-	  for (dim = 0; dim < array_ref->dimen; ++dim)
-		if (array_ref->dimen_type[dim] != DIMEN_ELEMENT
-		&& !(array_ref->dimen_type[dim] == DIMEN_UNKNOWN
-			 && array_ref->end[dim] == NULL
-			 && array_ref->start[dim] != NULL))
-		  ++rank;
+	  sym->ts.type = BT_DERIVED;
+	  sym->ts.u.derived = derived;
 	}
-	  else
-	rank = a->target->rank;
+	}
+
+  if (target->rank)
+	{
+	  int rank = 0;
+	  rank = target->rank;
 	  /* When the rank is greater than zero then sym will be an array.  */
 	  if (sym->ts.type == BT_CLASS && CLASS_DATA (sym))
 	{
@@ -5006,8 +5013,8 @@ parse_associate (void)
 		  /* Don't just (re-)set the attr and as in the sym.ts,
 		 because this modifies the target's attr and as.  Copy the
 		 data and do a build_class_symbol.  */
-		  symbol_attribute attr = CLASS_DATA (a->target)->attr;
-		  int corank = gfc_get_corank (a->target);
+		  symbol_attribute attr = CLASS_DATA (target)->attr;
+		  int corank = gfc_get_corank (target);
 		  gfc_typespec type;
 
 		  if (rank || corank)
@@ -5042,7 +5049,7 @@ parse_associate (void)
 	  as = gfc_get_array_spec ();
 	  as->type = AS_DEFERR

Re: [PATCH 04/14] c++: use _P() defines from tree.h

2023-06-01 Thread Patrick Palka via Gcc-patches
On Sat, May 13, 2023 at 7:26 PM Bernhard Reutner-Fischer via
Gcc-patches  wrote:
>
> From: Bernhard Reutner-Fischer 
>
> gcc/cp/ChangeLog:
>
> * call.cc (promoted_arithmetic_type_p): Use _P defines from tree.h.
> (build_conditional_expr): Ditto.
> (convert_like_internal): Ditto.
> (convert_arg_to_ellipsis): Ditto.
> (build_over_call): Ditto.
> (compare_ics): Ditto.
> * class.cc (is_empty_base_ref): Ditto.
> * coroutines.cc (rewrite_param_uses): Ditto.
> * cp-tree.h (DECL_DISCRIMINATOR_P): Ditto.
> (ARITHMETIC_TYPE_P): Ditto.
> * cvt.cc (ocp_convert): Ditto.
> * cxx-pretty-print.cc (pp_cxx_template_argument_list): Ditto.
> * decl.cc (layout_var_decl): Ditto.
> (get_tuple_size): Ditto.
> * error.cc (dump_simple_decl): Ditto.
> * lambda.cc (start_lambda_scope): Ditto.
> * mangle.cc (write_template_arg): Ditto.
> * method.cc (spaceship_comp_cat): Ditto.
> * module.cc (node_template_info): Ditto.
> (trees_out::start): Ditto.
> (trees_out::decl_node): Ditto.
> (trees_in::read_var_def): Ditto.
> (set_instantiating_module): Ditto.
> * name-lookup.cc (maybe_record_mergeable_decl): Ditto.
> (consider_decl): Ditto.
> (maybe_add_fuzzy_decl): Ditto.
> * pt.cc (convert_nontype_argument): Ditto.
> * semantics.cc (handle_omp_array_sections_1): Ditto.
> (finish_omp_clauses): Ditto.
> (finish_omp_target_clauses_r): Ditto.
> (is_this_parameter): Ditto.
> * tree.cc (build_cplus_array_type): Ditto.
> (is_this_expression): Ditto.
> * typeck.cc (do_warn_enum_conversions): Ditto.
> * typeck2.cc (store_init_value): Ditto.
> (check_narrowing): Ditto.
> ---
>  gcc/cp/call.cc | 42 +++---
>  gcc/cp/class.cc|  2 +-
>  gcc/cp/coroutines.cc   |  2 +-
>  gcc/cp/cp-tree.h   |  4 ++--
>  gcc/cp/cvt.cc  |  2 +-
>  gcc/cp/cxx-pretty-print.cc |  2 +-
>  gcc/cp/decl.cc |  4 ++--
>  gcc/cp/error.cc|  2 +-
>  gcc/cp/lambda.cc   |  2 +-
>  gcc/cp/mangle.cc   |  2 +-
>  gcc/cp/method.cc   |  2 +-
>  gcc/cp/module.cc   | 12 +--
>  gcc/cp/name-lookup.cc  |  6 +++---
>  gcc/cp/pt.cc   |  2 +-
>  gcc/cp/semantics.cc| 24 +++---
>  gcc/cp/tree.cc |  4 ++--
>  gcc/cp/typeck.cc   |  4 ++--
>  gcc/cp/typeck2.cc  | 10 -
>  18 files changed, 64 insertions(+), 64 deletions(-)
>
> diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> index 2a06520c0c1..6e13d17f6b8 100644
> --- a/gcc/cp/call.cc
> +++ b/gcc/cp/call.cc
> @@ -2746,7 +2746,7 @@ promoted_arithmetic_type_p (tree type)
>   integral types plus floating types.  */
>return ((CP_INTEGRAL_TYPE_P (type)
>&& same_type_p (type_promotes_to (type), type))
> - || TREE_CODE (type) == REAL_TYPE);
> + || SCALAR_FLOAT_TYPE_P (type));
>  }
>
>  /* Create any builtin operator overload candidates for the operator in
> @@ -5759,10 +5759,10 @@ build_conditional_expr (const op_location_t &loc,
>if ((TREE_CODE (arg2) == EXCESS_PRECISION_EXPR
> || TREE_CODE (arg3) == EXCESS_PRECISION_EXPR)
>&& (TREE_CODE (arg2_type) == INTEGER_TYPE
> - || TREE_CODE (arg2_type) == REAL_TYPE
> + || SCALAR_FLOAT_TYPE_P (arg2_type)
>   || TREE_CODE (arg2_type) == COMPLEX_TYPE)
>&& (TREE_CODE (arg3_type) == INTEGER_TYPE
> - || TREE_CODE (arg3_type) == REAL_TYPE
> + || SCALAR_FLOAT_TYPE_P (arg3_type)
>   || TREE_CODE (arg3_type) == COMPLEX_TYPE))
>  {
>semantic_result_type
> @@ -5775,8 +5775,8 @@ build_conditional_expr (const op_location_t &loc,
> t1 = TREE_TYPE (t1);
>   if (TREE_CODE (t2) == COMPLEX_TYPE)
> t2 = TREE_TYPE (t2);
> - gcc_checking_assert (TREE_CODE (t1) == REAL_TYPE
> -  && TREE_CODE (t2) == REAL_TYPE
> + gcc_checking_assert (SCALAR_FLOAT_TYPE_P (t1)
> +  && SCALAR_FLOAT_TYPE_P (t2)
>&& (extended_float_type_p (t1)
>|| extended_float_type_p (t2))
>&& cp_compare_floating_point_conversion_ranks
> @@ -6127,8 +6127,8 @@ build_conditional_expr (const op_location_t &loc,
> t1 = TREE_TYPE (t1);
>   if (TREE_CODE (t2) == COMPLEX_TYPE)
> t2 = TREE_TYPE (t2);
> - gcc_checking_assert (TREE_CODE (t1) == REAL_TYPE
> -  && TREE_CODE (t2) == REAL_TYPE
> + gcc_checking_assert (SCALAR_FLOAT_TYPE_P (t1)
> +  && SCALAR_FLOAT_TYPE_P (t2)
>&& (extended_float_type_p (t1)
>

Re: [RFC] light expander sra for parameters and returns

2023-06-01 Thread Martin Jambor
Hi,

On Tue, May 30 2023, Richard Biener wrote:
> On Mon, 29 May 2023, Jiufu Guo wrote:
>
>> Hi,
>> 
>> Previously, I was investigating some PRs related to struct parameters and
>> returns: 69143, 65421 and 108073.
>> 
>> I investigated the issues case by case and drafted a patch for each of
>> them.  This helps us enhance the code incrementally.  However, patches
>> written this way would interact with each other and implement different
>> code for similar issues (because of the different paths in gimple/rtl).
>> We may be able to find a common fix for those issues.
>> 
>> We know that a few other related PRs (such as meta-bug PR101926) exist.
>> Those PRs affect different targets with different symptoms (and different
>> root causes), so I would expect one method to help some of them, but it may
>> be hard to handle all of them in one fix.
>> 
>> While investigating and reading the discussions on these issues, I recalled
>> a suggestion from Richard: it would be nice to perform some SRA-like analysis
>> of the accesses on the structs (parameters/returns).
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605117.html
>> This may be a 'fairly common method' for those issues.  With this idea,
>> I drafted the patch below.
>> 
>> I also thought about using tree-sra.cc directly, e.g. enhancing it and
>> rerunning it at the end of the GIMPLE passes.  However, some of the issues
>> are introduced inside the expander, so the patch below also cooperates with
>> other parts of the expander.  And since we already have tree-sra as a GIMPLE
>> pass, this patch only needs to take care of parameters and returns: other
>> decls are handled well by tree-sra.
>> 
>> The steps of this patch are:
>> 1. Collect struct-type parameters and returns, then scan the function to
>> get the accesses on them, and figure out which accesses would be profitable
>> to scalarize (using the registers of the parameter/return).  The current
>> patch checks reads of parameters and writes to returns.
>> 2. When/after the scalar registers are determined/expanded for the return or
>> parameters, compute the corresponding scalar register(s) for each access of
>> the return/parameter, and prepare the scalar RTLs for those accesses.
>> 3. When using/expanding the access expressions, use the computed/prepared
>> scalars directly.
>> 
>> This patch is tested on ppc64, both LE and BE.
>> Before continuing, I would like to ask for comments and suggestions, and then
>> update/enhance accordingly.  Thanks in advance!
>
> Thanks for working on this - the description above sounds exactly like
> what should be done.
>
> Now - I'd like the code to re-use the access tree data structure from
> SRA plus at least the worker creating the accesses from a stmt.

I have had a first look at the patch but still need to look into it more
to understand how it uses the information it gathers.

My plan is to make the access-tree infrastructure of IPA-SRA more
generic and hopefully usable even for this purpose, rather than the one
in tree-sra.cc.  But that really builds a tree of accesses, bailing out
on any partial overlaps, for example, which may not be the right thing
here since I don't see any tree-building here.  But I still need to
properly read set_scalar_rtx_for_aggregate_access function in the patch,
which I plan to do next week.

Thanks,

Martin

>
> The RTL expansion code already does a sweep over stmts in
> discover_nonconstant_array_refs which makes sure RTL expansion doesn't
> scalarize (aka assign non-stack) to variables which have accesses
> that would later eventually FAIL to expand when operating on registers.
> That's very much related to the task at hand so we should try to
> at least merge the CFG walks of both (it produces a forced_stack_vars
> bitmap).
>
> Can you work together with Martin to split out the access tree
> data structure and share it?
>
> I didn't look in detail as of how you make use of the information
> yet.
>
> Thanks,
> Richard.
>
>> 
>> BR,
>> Jeff (Jiufu)
>> 
>> 
>> ---
>>  gcc/cfgexpand.cc | 567 ++-
>>  gcc/expr.cc  |  15 +-
>>  gcc/function.cc  |  26 +-
>>  gcc/opts.cc  |   8 +-
>>  gcc/testsuite/g++.target/powerpc/pr102024.C  |   2 +-
>>  gcc/testsuite/gcc.target/powerpc/pr108073.c  |  29 +
>>  gcc/testsuite/gcc.target/powerpc/pr65421-1.c |   6 +
>>  gcc/testsuite/gcc.target/powerpc/pr65421-2.c |  32 ++
>>  8 files changed, 675 insertions(+), 10 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108073.c
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421-1.c
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421-2.c
>> 
>> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
>> index 85a93a547c0..95c29b6b6fe 100644
>> --- a/gcc/cfgexpand.cc
>> +++ b/gcc/cfgexpand.cc
>> @@ -97,6 +97,564 @@ static bool defer_stack_allocation (tree, bool

Re: [PATCH] RISC-V: Add test for vfloat16*_t (non tuple) types

2023-06-01 Thread Kito Cheng via Gcc-patches
Lgtm

Li, Pan2 via Gcc-patches 於 2023年6月1日 週四,20:10寫道:

> Thanks Juzhe for pointing out this.
>
> Pan
>
> -Original Message-
> From: Li, Pan2 
> Sent: Thursday, June 1, 2023 8:09 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2 <
> pan2...@intel.com>; Wang, Yanzhang 
> Subject: [PATCH] RISC-V: Add test for vfloat16*_t (non tuple) types
>
> From: Pan Li 
>
> This patch adds some test cases for the vfloat16*_t (non-tuple) types:
> without 'zvfh' or 'zvfhmin', using them results in an unknown type error.
>
> Signed-off-by: Pan Li 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/abi-16.c: Add test cases.
> * gcc.target/riscv/rvv/base/user-7.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c | 6 ++
> gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c | 6 ++
>  2 files changed, 12 insertions(+)
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
> index be2cbb5efd7..9e962a70acf 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
> @@ -173,6 +173,12 @@ void f___rvv_int64m2x4_t () {__rvv_int64m2x4_t t;} /* { dg-error {unknown type n
>  void f___rvv_uint64m2x4_t () {__rvv_uint64m2x4_t t;} /* { dg-error {unknown type name '__rvv_uint64m2x4_t'} } */
>  void f___rvv_int64m4x2_t () {__rvv_int64m4x2_t t;} /* { dg-error {unknown type name '__rvv_int64m4x2_t'} } */
>  void f___rvv_uint64m4x2_t () {__rvv_uint64m4x2_t t;} /* { dg-error {unknown type name '__rvv_uint64m4x2_t'} } */
> +void f___rvv_float16mf4_t () {__rvv_float16mf4_t t;} /* { dg-error {unknown type name '__rvv_float16mf4_t'} } */
> +void f___rvv_float16mf2_t () {__rvv_float16mf2_t t;} /* { dg-error {unknown type name '__rvv_float16mf2_t'} } */
> +void f___rvv_float16m1_t () {__rvv_float16m1_t t;} /* { dg-error {unknown type name '__rvv_float16m1_t'} } */
> +void f___rvv_float16m2_t () {__rvv_float16m2_t t;} /* { dg-error {unknown type name '__rvv_float16m2_t'} } */
> +void f___rvv_float16m4_t () {__rvv_float16m4_t t;} /* { dg-error {unknown type name '__rvv_float16m4_t'} } */
> +void f___rvv_float16m8_t () {__rvv_float16m8_t t;} /* { dg-error {unknown type name '__rvv_float16m8_t'} } */
>  void f___rvv_float32mf2x2_t () {__rvv_float32mf2x2_t t;}
>  void f___rvv_float32mf2x3_t () {__rvv_float32mf2x3_t t;}
>  void f___rvv_float32mf2x4_t () {__rvv_float32mf2x4_t t;}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
> index 2172a5c7c79..0620a728208 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
> @@ -173,6 +173,12 @@ void f_vint64m2x4_t () {vint64m2x4_t t;} /* { dg-error {unknown type name 'vint6
>  void f_vuint64m2x4_t () {vuint64m2x4_t t;} /* { dg-error {unknown type name 'vuint64m2x4_t'} } */
>  void f_vint64m4x2_t () {vint64m4x2_t t;} /* { dg-error {unknown type name 'vint64m4x2_t'} } */
>  void f_vuint64m4x2_t () {vuint64m4x2_t t;} /* { dg-error {unknown type name 'vuint64m4x2_t'} } */
> +void f_vfloat16mf4_t () {vfloat16mf4_t t;} /* { dg-error {unknown type name 'vfloat16mf4_t'} } */
> +void f_vfloat16mf2_t () {vfloat16mf2_t t;} /* { dg-error {unknown type name 'vfloat16mf2_t'} } */
> +void f_vfloat16m1_t () {vfloat16m1_t t;} /* { dg-error {unknown type name 'vfloat16m1_t'} } */
> +void f_vfloat16m2_t () {vfloat16m2_t t;} /* { dg-error {unknown type name 'vfloat16m2_t'} } */
> +void f_vfloat16m4_t () {vfloat16m4_t t;} /* { dg-error {unknown type name 'vfloat16m4_t'} } */
> +void f_vfloat16m8_t () {vfloat16m8_t t;} /* { dg-error {unknown type name 'vfloat16m8_t'} } */
>  void f_vfloat32mf2x2_t () {vfloat32mf2x2_t t;} /* { dg-error {unknown type name 'vfloat32mf2x2_t'} } */
>  void f_vfloat32mf2x3_t () {vfloat32mf2x3_t t;} /* { dg-error {unknown type name 'vfloat32mf2x3_t'} } */
>  void f_vfloat32mf2x4_t () {vfloat32mf2x4_t t;} /* { dg-error {unknown type name 'vfloat32mf2x4_t'} } */
> --
> 2.34.1
>
>


Re: [PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/06/01 23:20, Max Filippov wrote:
> On Wed, May 31, 2023 at 11:01 PM Takayuki 'January June' Suwa
>  wrote:
>> More optimized than the default RTL generation.
>>
>> gcc/ChangeLog:
>>
>> * config/xtensa/xtensa.md (adddi3, subdi3):
>> New RTL generation patterns implemented according to the
>> instruction idioms described in the Xtensa ISA reference manual (p. 600).
>> ---
>>  gcc/config/xtensa/xtensa.md | 52 +
>>  1 file changed, 52 insertions(+)
>>
>> diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
>> index eda1353894b..21afa747e89 100644
>> --- a/gcc/config/xtensa/xtensa.md
>> +++ b/gcc/config/xtensa/xtensa.md
>> @@ -190,6 +190,35 @@
>> (set_attr "mode""SI")
>> (set_attr "length"  "3")])
>>
>> +(define_expand "adddi3"
>> +  [(set (match_operand:DI 0 "register_operand")
>> +   (plus:DI (match_operand:DI 1 "register_operand")
>> +(match_operand:DI 2 "register_operand")))]
>> +  ""
>> +{
>> +  rtx lo_dest, hi_dest, lo_op0, hi_op0, lo_op1, hi_op1;
>> +  rtx_code_label *label;
>> +  if (rtx_equal_p (operands[0], operands[1])
>> +  || rtx_equal_p (operands[0], operands[2])
> 
>> +  || ! REG_P (operands[1]) || ! REG_P (operands[2]))
> 
> I wonder if these additional conditions are necessary, given that
> the operands have the "register_operand" predicates?
> 

See register_operand() in gcc/recog.cc.

In fact, I've encountered several operands that satisfy the
register_operand predicate but result in REG_P() being false.
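
To illustrate the point above, here is a toy model of why the two checks can disagree. It is only a sketch: ToyRtx, toy_reg_p and toy_register_operand are hypothetical stand-ins, not GCC's real rtx or predicates, and they capture just one aspect of register_operand (that it looks through a SUBREG wrapping a register, whereas REG_P only accepts a bare REG).

```cpp
// Toy model only: these types are hypothetical and are not GCC's real rtx.
#include <cassert>

enum class Code { Reg, Subreg, Mem };

struct ToyRtx {
  Code code;
  const ToyRtx *inner;  // Wrapped expression for Subreg, otherwise nullptr.
};

// In the spirit of REG_P: true only for a bare register.
inline bool toy_reg_p(const ToyRtx &x) { return x.code == Code::Reg; }

// In the spirit of register_operand: look through one SUBREG first.
inline bool toy_register_operand(const ToyRtx &x) {
  const ToyRtx *op = &x;
  if (op->code == Code::Subreg && op->inner != nullptr)
    op = op->inner;
  return toy_reg_p(*op);
}

// Usage: a SUBREG of a REG passes the predicate but is not itself a REG.
inline void check_toy_predicates() {
  const ToyRtx reg{Code::Reg, nullptr};
  const ToyRtx sub{Code::Subreg, &reg};
  const ToyRtx mem{Code::Mem, nullptr};
  assert(toy_register_operand(reg) && toy_reg_p(reg));
  assert(toy_register_operand(sub) && !toy_reg_p(sub));
  assert(!toy_register_operand(mem));
}
```

In this model the extra REG_P checks in the expander are not redundant, because the predicate alone lets the SUBREG case through.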


Re: [PATCH] c++: ahead of time variable template-id coercion [PR89442]

2023-06-01 Thread Patrick Palka via Gcc-patches
On Wed, May 3, 2023 at 9:50 AM Patrick Palka  wrote:
>
> This patch makes us coerce the arguments of a variable template-id ahead
> of time, as we do for other template-ids, which allows us to immediately
> diagnose template parameter/argument kind mismatches and arity mismatches.
>
> Unfortunately this causes a regression in cpp1z/constexpr-if20.C: coercing
> the variable template-id m ahead of time means we strip it of
> typedefs, yielding m::q, typename C::q>, but in this
> stripped form we're directly using 'i' and so we expect to have captured
> it.  This is PR107437 but with a variable template instead of a class
> template.  I'm not sure how to fix this :(
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?

Ping.
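
For readers skimming the thread, a small sketch of the kind of variable template-ids the description talks about. The templates is_small and scaled_size are made up for illustration and do not come from the patch or its testsuite; the ill-formed uses are left in comments so the snippet still compiles, and no particular diagnostic wording is implied.

```cpp
// Hypothetical variable templates, for illustration only (needs C++14).
#include <cassert>

template <typename T>
constexpr bool is_small = sizeof(T) <= 2;

template <typename T, int N>
constexpr int scaled_size = static_cast<int>(sizeof(T)) * N;

// Well-formed template-ids: argument kinds and arity match the parameters.
static_assert(is_small<char>, "char fits in two bytes");
static_assert(scaled_size<char, 3> == 3, "sizeof(char) is 1");

// Ill-formed template-ids of the sort the patch aims to diagnose up front:
//   is_small<42>        kind mismatch: a value where a type is expected
//   scaled_size<char>   arity mismatch: the second argument is missing

inline void check_variable_templates() {
  assert(is_small<char>);
  assert((scaled_size<char, 4>) == 4);
}
```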

>
> PR c++/89442
> PR c++/107437
>
> gcc/cp/ChangeLog:
>
> * cp-tree.h (lookup_template_variable): Add complain parameter.
> * parser.cc (cp_parser_template_id): Pass tf_warning_or_error
> to lookup_template_variable.
> * pt.cc (lookup_template_variable): Add complain parameter.
> Coerce template arguments here ...
> (finish_template_variable): ... instead of here.
> (lookup_and_finish_template_variable): Check for error_mark_node
> result from lookup_template_variable.
> (tsubst_copy) : Pass complain to
> lookup_template_variable.
> (instantiate_template): Use build2 instead of
> lookup_template_variable to build a TEMPLATE_ID_EXPR
> for most_specialized_partial_spec.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/cpp/pr64127.C: Expect "expected unqualified-id at end
> of input" error.
> * g++.dg/cpp0x/alias-decl-ttp1.C: Fix template parameter/argument
> kind mismatch for variable template has_P_match_V.
> * g++.dg/cpp1y/pr72759.C: Expect "template argument 1 is invalid"
> error.
> * g++.dg/cpp1z/constexpr-if20.C: XFAIL test due to bogus "'i' is
> not captured" error.
> * g++.dg/cpp1z/noexcept-type21.C: Fix arity of variable template d.
> * g++.dg/diagnostic/not-a-function-template-1.C: Add default
> template argument to variable template A so that A<> is valid.
> * g++.dg/parse/error56.C: Don't expect "ISO C++ forbids
> declaration with no type" error.
> * g++.dg/parse/template30.C: Don't expect "parse error in
> template argument list" error.
> * g++.dg/cpp1y/var-templ80.C: New test.
> ---
>  gcc/cp/cp-tree.h  |  2 +-
>  gcc/cp/parser.cc  |  2 +-
>  gcc/cp/pt.cc  | 20 ++-
>  gcc/testsuite/g++.dg/cpp/pr64127.C|  2 +-
>  gcc/testsuite/g++.dg/cpp0x/alias-decl-ttp1.C  |  2 +-
>  gcc/testsuite/g++.dg/cpp1y/pr72759.C  |  2 +-
>  gcc/testsuite/g++.dg/cpp1y/var-templ80.C  | 12 +++
>  gcc/testsuite/g++.dg/cpp1z/constexpr-if20.C   |  1 +
>  gcc/testsuite/g++.dg/cpp1z/noexcept-type21.C  |  2 +-
>  .../diagnostic/not-a-function-template-1.C|  2 +-
>  gcc/testsuite/g++.dg/parse/error56.C  |  1 -
>  gcc/testsuite/g++.dg/parse/template30.C   |  3 +--
>  12 files changed, 32 insertions(+), 19 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp1y/var-templ80.C
>
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index c9c4cd6f32f..96807282ec5 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -7349,7 +7349,7 @@ extern bool redeclare_class_template  
> (tree, tree, tree);
>  extern tree lookup_template_class  (tree, tree, tree, tree,
>  int, tsubst_flags_t);
>  extern tree lookup_template_function   (tree, tree);
> -extern tree lookup_template_variable   (tree, tree);
> +extern tree lookup_template_variable   (tree, tree, tsubst_flags_t);
>  extern bool uses_template_parms(tree);
>  extern bool uses_template_parms_level  (tree, int);
>  extern bool uses_outer_template_parms_in_constraints (tree);
> diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> index d89553e7da8..4982583809b 100644
> --- a/gcc/cp/parser.cc
> +++ b/gcc/cp/parser.cc
> @@ -18525,7 +18525,7 @@ cp_parser_template_id (cp_parser *parser,
>  }
>else if (variable_template_p (templ))
>  {
> -  template_id = lookup_template_variable (templ, arguments);
> +  template_id = lookup_template_variable (templ, arguments, 
> tf_warning_or_error);
>if (TREE_CODE (template_id) == TEMPLATE_ID_EXPR)
> SET_EXPR_LOCATION (template_id, combined_loc);
>  }
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 930291917f2..abf99feab20 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -10329,11 +10329,16 @@ lookup_template_class (tree d1, tree arglist, tree 
> in_decl, tree context,
>  /* Return a TEMPLATE_ID_EXPR for the given variable template and ARGL

[committed] libstdc++: Fix PSTL test that fails in C++20

2023-06-01 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk, will backport too.

Tom, this will require rebasing your PSTL rebase patch, but it should be
trivial.

-- >8 --

This test fails in C++20 and later due to a warning:

warning: C++20 says that these are ambiguous, even though the second is 
reversed:
note: candidate 1: 'bool MyClass::operator==(const MyClass&)'
note: candidate 2: 'bool MyClass::operator==(const MyClass&)' (reversed)
note: try making the operator a 'const' member function
FAIL: 26_numerics/pstl/numeric_ops/transform_reduce.cc (test for excess errors)

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/pstl/numeric_ops/transform_reduce.cc:
Add const to equality operator.
---
 .../testsuite/26_numerics/pstl/numeric_ops/transform_reduce.cc  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/26_numerics/pstl/numeric_ops/transform_reduce.cc b/libstdc++-v3/testsuite/26_numerics/pstl/numeric_ops/transform_reduce.cc
index ec020b42bbb..bec1c141278 100644
--- a/libstdc++-v3/testsuite/26_numerics/pstl/numeric_ops/transform_reduce.cc
+++ b/libstdc++-v3/testsuite/26_numerics/pstl/numeric_ops/transform_reduce.cc
@@ -68,7 +68,7 @@ class MyClass
 }
 friend MyClass operator*(const MyClass& x, const MyClass& y) { return MyClass(x.my_field * y.my_field); }
 bool
-operator==(const MyClass& in)
+operator==(const MyClass& in) const
 {
 return my_field == in.my_field;
 }
-- 
2.40.1
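
As a standalone illustration of the fix (not the test itself), here is a minimal sketch with a hypothetical class Value. With a non-const member operator==, C++20 considers both a.operator==(b) and the reversed b.operator==(a) for a == b on non-const objects, and neither candidate is better for both arguments. Once the member is const, both candidates bind their arguments identically and the non-reversed one is preferred.

```cpp
// Hypothetical class for illustration; not taken from the PSTL test.
#include <cassert>

class Value {
  int field_;

public:
  explicit Value(int v) : field_(v) {}

  // The const qualifier is the point: dropping it reintroduces the
  // "ambiguous, even though the second is reversed" situation in C++20.
  bool operator==(const Value &other) const { return field_ == other.field_; }
};

inline bool same(const Value &a, const Value &b) { return a == b; }

inline void check_value_equality() {
  Value a(3), b(3), c(4);
  assert(a == b);      // Non-const objects compare without ambiguity.
  assert(!(a == c));
  assert(same(a, b));  // Const references work too.
}
```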



[PATCH] libstdc++: Use AS_IF in configure.ac

2023-06-01 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. I'd appreciate a second set of eyeballs on this
before I push it.

-- >8 --

This ensures that anything that depends on AC_REQUIRE is hoisted out of
the conditional block.

The always-false test x"long_double_math_on_this_cpu" = x"yes" condition
is not altered by this commit, only changed to use the AS_IF syntax.

libstdc++-v3/ChangeLog:

* configure.ac: Use AS_IF.
* configure: Regenerate.
---
 libstdc++-v3/configure| 1148 +++--
 libstdc++-v3/configure.ac |   20 +-
 2 files changed, 590 insertions(+), 578 deletions(-)

diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index 0abe54e7b9a..f3bcf7affdd 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -266,7 +266,7 @@ AC_CHECK_HEADERS([linux/random.h], [], [],
 AC_CHECK_HEADERS([xlocale.h])
 
 # Only do link tests if native. Else, hardcode.
-if $GLIBCXX_IS_NATIVE; then
+AS_IF([$GLIBCXX_IS_NATIVE],[
 
   # We can do more elaborate tests that assume a working linker.
   CANADIAN=no
@@ -298,7 +298,7 @@ if $GLIBCXX_IS_NATIVE; then
   # For iconv support.
   AM_ICONV
 
-else
+],[
 
   # This lets us hard-code the functionality we know we'll have in the cross
   # target environment.  "Let" is a sugar-coated word placed on an especially
@@ -330,7 +330,7 @@ else
 
   # First, test for "known" system libraries.  We may be using newlib even
   # on a hosted environment.
-  if test "x${with_newlib}" = "xyes"; then
+  AS_IF([test "x${with_newlib}" = "xyes"],[
 os_include_dir="os/newlib"
 AC_DEFINE(HAVE_HYPOT)
 
@@ -386,14 +386,14 @@ else
 AC_DEFINE(HAVE_USLEEP)
 ;;
 esac
-  elif test "x$with_headers" != "xno"; then
+  ],[test "x$with_headers" != "xno" ],[
 GLIBCXX_CROSSCONFIG
-  fi
+  ])
 
   # At some point, we should differentiate between architectures
   # like x86, which have long double versions, and alpha/powerpc/etc.,
   # which don't. For the time being, punt.
-  if test x"long_double_math_on_this_cpu" = x"yes"; then
+  AS_IF([test x"long_double_math_on_this_cpu" = x"yes"],[
 AC_DEFINE(HAVE_ACOSL)
 AC_DEFINE(HAVE_ASINL)
 AC_DEFINE(HAVE_ATAN2L)
@@ -417,8 +417,8 @@ else
 AC_DEFINE(HAVE_SQRTL)
 AC_DEFINE(HAVE_TANL)
 AC_DEFINE(HAVE_TANHL)
-  fi
-fi
+  ])
+])
 
 # Check for _Unwind_GetIPInfo.
 GCC_CHECK_UNWIND_GETIPINFO
@@ -449,7 +449,7 @@ case "$target" in
 #error no need for long double compatibility
 #endif
   ], [ac_ldbl_compat=yes], [ac_ldbl_compat=no])
-  if test "$ac_ldbl_compat" = yes; then
+  AS_IF([test "$ac_ldbl_compat" = yes],[
 AC_DEFINE([_GLIBCXX_LONG_DOUBLE_COMPAT],1,
  [Define if compatibility should be provided for 
-mlong-double-64.])
 
port_specific_symbol_files="\$(top_srcdir)/config/os/gnu-linux/ldbl-extra.ver"
@@ -485,7 +485,7 @@ case "$target" in
 fi
;;
 esac
-  fi
+  ])
 esac
 AC_SUBST(LONG_DOUBLE_COMPAT_FLAGS)
 AC_SUBST(LONG_DOUBLE_128_FLAGS)
-- 
2.40.1



Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-06-01 Thread Andre Vieira (lists) via Gcc-patches

Hi,

This is the updated patch and cover letter. Patches for inline and 
gimple-op changes will follow soon.


DEF_INTERNAL_WIDENING_OPTAB_FN and DEF_INTERNAL_NARROWING_OPTAB_FN
are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN
respectively, except that they provide convenience wrappers
for a single vector to vector conversion, a hi/lo split or an even/odd
split.  Each definition for  will require either signed optabs 
named  and  (for widening) or a single  (for 
narrowing) for each of the five functions it creates.


For example, for widening addition,
DEF_INTERNAL_WIDENING_OPTAB_FN will create five internal functions:
IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
IFN_VEC_WIDEN_PLUS_EVEN and IFN_VEC_WIDEN_PLUS_ODD, each requiring two
optabs, one for signed and one for unsigned.

 Aarch64 implements the hi/lo split optabs:
 IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_add_hi_ -> (u/s)addl2
 IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_add_lo_ -> (u/s)addl

This gives the same functionality as the previous 
WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into 
VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
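
To make the hi/lo split concrete, here is a scalar sketch of what the two halves compute. It is only a model: Narrow, Wide, widen_add_lo and widen_add_hi are hypothetical names, lane 0 is arbitrarily treated as the "lo" end (the real mapping depends on the target), and nothing here is the vectorizer's actual code.

```cpp
// Scalar model of a hi/lo widening add; all names are hypothetical.
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 8;
using Narrow = std::array<std::int8_t, kLanes>;
using Wide = std::array<std::int16_t, kLanes / 2>;

// Widen each operand lane to 16 bits, then add, for half of the lanes
// starting at lane `first`.
inline Wide widen_add_half(const Narrow &a, const Narrow &b, std::size_t first) {
  Wide out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    const int sum = static_cast<std::int16_t>(a[first + i]) +
                    static_cast<std::int16_t>(b[first + i]);
    out[i] = static_cast<std::int16_t>(sum);
  }
  return out;
}

inline Wide widen_add_lo(const Narrow &a, const Narrow &b) {
  return widen_add_half(a, b, 0);
}

inline Wide widen_add_hi(const Narrow &a, const Narrow &b) {
  return widen_add_half(a, b, kLanes / 2);
}

inline void check_widen_add() {
  Narrow a{};
  Narrow b{};
  a.fill(100);
  b.fill(100);
  // 100 + 100 does not fit in 8 bits, but the widened result holds it.
  assert(widen_add_lo(a, b)[0] == 200);
  assert(widen_add_hi(a, b)[0] == 200);
}
```

Two half-width results together cover all input lanes, which is why a single widening operation is expressed as a LO/HI (or EVEN/ODD) pair.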


gcc/ChangeLog:

2023-04-25  Andre Vieira  
Joel Hutton  
Tamar Christina  

* config/aarch64/aarch64-simd.md
(vec_widen_addl_lo_): Rename this ...
(vec_widen_add_lo_): ... to this.
(vec_widen_addl_hi_): Rename this ...
(vec_widen_add_hi_): ... to this.
(vec_widen_subl_lo_): Rename this ...
(vec_widen_sub_lo_): ... to this.
(vec_widen_subl_hi_): Rename this ...
(vec_widen_sub_hi_): ... to this.
* doc/generic.texi: Document new IFN codes.
* internal-fn.cc (ifn_cmp): Function to compare ifn's for
sorting/searching.
(lookup_hilo_internal_fn): Add lookup function.
(commutative_binary_fn_p): Add widen_plus fn's.
(widening_fn_p): New function.
(narrowing_fn_p): New function.
(direct_internal_fn_optab): Change visibility.
* internal-fn.def (DEF_INTERNAL_WIDENING_OPTAB_FN): Macro to define an
internal_fn that expands into multiple internal_fns for widening.
(DEF_INTERNAL_NARROWING_OPTAB_FN): Likewise but for narrowing.
(IFN_VEC_WIDEN_PLUS, IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO,
 IFN_VEC_WIDEN_PLUS_EVEN, IFN_VEC_WIDEN_PLUS_ODD,
 IFN_VEC_WIDEN_MINUS, IFN_VEC_WIDEN_MINUS_HI, IFN_VEC_WIDEN_MINUS_LO,
 IFN_VEC_WIDEN_MINUS_ODD, IFN_VEC_WIDEN_MINUS_EVEN): Define widening
plus,minus functions.
* internal-fn.h (direct_internal_fn_optab): Declare new prototype.
(lookup_hilo_internal_fn): Likewise.
(widening_fn_p): Likewise.
(narrowing_fn_p): Likewise.
* optabs.cc (commutative_optab_p): Add widening plus optabs.
* optabs.def (OPTAB_D): Define widen add, sub optabs.
* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
patterns with a hi/lo or even/odd split.
(vect_recog_sad_pattern): Refactor to use new IFN codes.
(vect_recog_widen_plus_pattern): Likewise.
(vect_recog_widen_minus_pattern): Likewise.
(vect_recog_average_pattern): Likewise.
* tree-vect-stmts.cc (vectorizable_conversion): Add support for
_HILO IFNs.
(supportable_widening_operation): Likewise.
* tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vect-widen-add.c: Test that new
IFN_VEC_WIDEN_PLUS is being used.
* gcc.target/aarch64/vect-widen-sub.c: Test that new
IFN_VEC_WIDEN_MINUS is being used.

On 22/05/2023 14:06, Richard Biener wrote:

On Thu, 18 May 2023, Andre Vieira (lists) wrote:


How about this?

Not sure about the DEF_INTERNAL documentation I rewrote in internal-fn.def,
was struggling to word these, so improvements welcome!


The even/odd variant optabs are also commutative_optab_p, so is
the vec_widen_sadd without hi/lo or even/odd.

+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */

do you really want -all?  I think you want -details

+  else if (widening_fn_p (ifn)
+  || narrowing_fn_p (ifn))
+   {
+ tree lhs = gimple_get_lhs (stmt);
+ if (!lhs)
+   {
+ error ("vector IFN call with no lhs");
+ debug_generic_stmt (fn);

that's an error because ...?  Maybe we want to verify this
for all ECF_CONST|ECF_NOTHROW (or pure instead of const) internal
function calls, but I wouldn't add any verification as part
of this patch (not special to widening/narrowing fns either).

 if (gimple_call_internal_p (stmt))
- return 0;
+ {
+   internal_fn fn = gimple_call_internal_fn (stmt);
+   switch (fn)
+ {
+ case IFN_VEC_WIDEN_PLUS_HI:
+ case IFN_VEC_WIDEN_PLUS_LO:
+ case IFN_VEC_WIDEN_MINUS_HI:
+ ca

[PATCH] inline: improve internal function costs

2023-06-01 Thread Andre Vieira (lists) via Gcc-patches

Hi,

This is a follow-up of the internal function patch to add widening and 
narrowing patterns.  This patch improves the inliner cost estimation for 
internal functions.


Bootstrapped and regression tested on aarch64-unknown-linux-gnu.
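
The costing scheme in the tree-inline.cc hunk can be summarised with a small sketch. ToyIfn and toy_estimate_internal_call are hypothetical and list only a few representative cases; the real switch is on internal_fn and is in the diff itself.

```cpp
// Toy summary of the three cost tiers; the enum is hypothetical.
#include <cassert>

enum class ToyIfn { Nop, BuiltinExpect, UbsanNull, UbsanBounds, VecWidenPlus };

inline int toy_estimate_internal_call(ToyIfn fn) {
  switch (fn) {
    // Bookkeeping-style functions that are expected to expand to nothing.
    case ToyIfn::Nop:
    case ToyIfn::BuiltinExpect:
      return 0;
    // Sanitizer checks: estimated as a compare and a jump.
    case ToyIfn::UbsanNull:
    case ToyIfn::UbsanBounds:
      return 2;
    // Everything else is assumed to cost one instruction.
    default:
      return 1;
  }
}

inline void check_toy_costs() {
  assert(toy_estimate_internal_call(ToyIfn::Nop) == 0);
  assert(toy_estimate_internal_call(ToyIfn::UbsanNull) == 2);
  assert(toy_estimate_internal_call(ToyIfn::VecWidenPlus) == 1);
}
```

The relevant change from the previous behaviour is the default: an internal call that is not on either list now counts as one instruction instead of zero.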

gcc/ChangeLog:

* ipa-fnsummary.cc (analyze_function_body): Correctly handle
non-zero costed internal functions.
* tree-inline.cc (estimate_num_insns): Improve costing for internal
functions.

diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc
index b328bb8ce14b0725f6e5607da9d1e2f61e9baf62..449961fe44e4d86bf61e625dff0759d58e1e80ba 100644
--- a/gcc/ipa-fnsummary.cc
+++ b/gcc/ipa-fnsummary.cc
@@ -2862,16 +2862,19 @@ analyze_function_body (struct cgraph_node *node, bool early)
 to happen, but we cannot do that for call statements
 because edges are accounted specially.  */
 
- if (*(is_gimple_call (stmt) ? &bb_predicate : &p) != false)
+ if (*(is_gimple_call (stmt) && !gimple_call_internal_p (stmt)
+   ? &bb_predicate : &p) != false)
{
  time += final_time;
  size += this_size;
}
 
  /* We account everything but the calls.  Calls have their own
-size/time info attached to cgraph edges.  This is necessary
-in order to make the cost disappear after inlining.  */
- if (!is_gimple_call (stmt))
+size/time info attached to cgraph edges.  This is necessary
+in order to make the cost disappear after inlining.  The only
+exceptions are internal calls.  */
+ if (!is_gimple_call (stmt)
+ || gimple_call_internal_p (stmt))
{
  if (prob)
{
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 99efddc36c8906a797583a569424336e961c35d1..bac84d277254703369c27993dcad048de8d4ff70 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4427,7 +4427,48 @@ estimate_num_insns (gimple *stmt, eni_weights *weights)
tree decl;
 
if (gimple_call_internal_p (stmt))
- return 0;
+ {
+   switch (gimple_call_internal_fn (stmt))
+ {
+ default:
+   return 1;
+
+ case IFN_GOMP_TARGET_REV:
+ case IFN_GOMP_USE_SIMT:
+ case IFN_GOMP_SIMT_ENTER_ALLOC:
+ case IFN_GOMP_SIMT_EXIT:
+ case IFN_GOMP_SIMT_LANE:
+ case IFN_GOMP_SIMT_VF:
+ case IFN_GOMP_SIMT_LAST_LANE:
+ case IFN_GOMP_SIMT_ORDERED_PRED:
+ case IFN_GOMP_SIMT_VOTE_ANY:
+ case IFN_GOMP_SIMT_XCHG_BFLY:
+ case IFN_GOMP_SIMT_XCHG_IDX:
+ case IFN_GOMP_SIMD_LANE:
+ case IFN_GOMP_SIMD_VF:
+ case IFN_GOMP_SIMD_LAST_LANE:
+ case IFN_GOMP_SIMD_ORDERED_START:
+ case IFN_GOMP_SIMD_ORDERED_END:
+ case IFN_BUILTIN_EXPECT:
+ case IFN_ANNOTATE:
+ case IFN_NOP:
+ case IFN_UNIQUE:
+ case IFN_DEFERRED_INIT:
+ case IFN_ASSUME:
+   return 0;
+
+ case IFN_UBSAN_NULL:
+ case IFN_UBSAN_BOUNDS:
+ case IFN_UBSAN_VPTR:
+ case IFN_UBSAN_CHECK_ADD:
+ case IFN_UBSAN_CHECK_SUB:
+ case IFN_UBSAN_CHECK_MUL:
+ case IFN_UBSAN_PTR:
+ case IFN_UBSAN_OBJECT_SIZE:
+   /* Estimating a compare and jump.  */
+   return 2;
+ }
+ }
else if ((decl = gimple_call_fndecl (stmt))
 && fndecl_built_in_p (decl))
  {


[PATCH] gimple-range: implement widen plus range

2023-06-01 Thread Andre Vieira (lists) via Gcc-patches

Hi,

This patch adds gimple-range information for the new IFN_VEC_WIDEN_PLUS* 
internal functions, identical to what VEC_WIDEN_PLUS did.


Bootstrapped and regression tested on aarch64-unknown-linux-gnu.
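
As a rough picture of what a range for a widening plus looks like, here is a sketch under simplifying assumptions: 8-bit operands, a result type wide enough that the sum cannot wrap, and made-up names (ToyRange, full_8bit_range, toy_widen_plus). It is not ranger's API, only the arithmetic idea of adding the bounds after widening.

```cpp
// Toy interval arithmetic for a widening add; not ranger's real interface.
#include <cassert>
#include <cstdint>

struct ToyRange {
  std::int32_t lo;
  std::int32_t hi;
};

// Full value range of an 8-bit operand, depending on its signedness.
inline ToyRange full_8bit_range(bool is_signed) {
  return is_signed ? ToyRange{-128, 127} : ToyRange{0, 255};
}

// Operands are widened before the add, so the bounds simply add up.
inline ToyRange toy_widen_plus(ToyRange a, ToyRange b) {
  return ToyRange{a.lo + b.lo, a.hi + b.hi};
}

inline void check_toy_widen_plus() {
  const ToyRange s = toy_widen_plus(full_8bit_range(true), full_8bit_range(true));
  assert(s.lo == -256 && s.hi == 254);
  const ToyRange u = toy_widen_plus(full_8bit_range(false), full_8bit_range(false));
  assert(u.lo == 0 && u.hi == 510);
}
```

The signedness of the operands decides how their bits are interpreted before widening, which is why the patch selects between a signed and an unsigned operator.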

gcc/ChangeLog:

* gimple-range-op.cc (gimple_range_op_handler::maybe_non_standard):
Add support for IFN_VEC_WIDEN_PLUS*.

diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 59c47e2074ddc73065468fe92274c260bd5bac48..7a84931d6204a56549cf1563114d8db7a2e26a6a 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -1187,6 +1187,7 @@ gimple_range_op_handler::maybe_non_standard ()
 {
   range_operator *signed_op = ptr_op_widen_mult_signed;
   range_operator *unsigned_op = ptr_op_widen_mult_unsigned;
+  bool signed1, signed2, signed_ret;
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
 switch (gimple_assign_rhs_code (m_stmt))
   {
@@ -1196,32 +1197,58 @@ gimple_range_op_handler::maybe_non_standard ()
  m_op1 = gimple_assign_rhs1 (m_stmt);
  m_op2 = gimple_assign_rhs2 (m_stmt);
  tree ret = gimple_assign_lhs (m_stmt);
- bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
- bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
- bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
-
- /* Normally these operands should all have the same sign, but
-some passes and violate this by taking mismatched sign args.  At
-the moment the only one that's possible is mismatch inputs and
-unsigned output.  Once ranger supports signs for the operands we
-can properly fix it,  for now only accept the case we can do
-correctly.  */
- if ((signed1 ^ signed2) && signed_ret)
-   return;
-
- m_valid = true;
- if (signed2 && !signed1)
-   std::swap (m_op1, m_op2);
-
- if (signed1 || signed2)
-   m_int = signed_op;
- else
-   m_int = unsigned_op;
+ signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+ signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+ signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
  break;
}
default:
- break;
+ return;
+  }
+  else if (gimple_code (m_stmt) == GIMPLE_CALL
+  && gimple_call_internal_p (m_stmt)
+  && gimple_get_lhs (m_stmt) != NULL_TREE)
+switch (gimple_call_internal_fn (m_stmt))
+  {
+  case IFN_VEC_WIDEN_PLUS:
+  case IFN_VEC_WIDEN_PLUS_LO:
+  case IFN_VEC_WIDEN_PLUS_HI:
+  case IFN_VEC_WIDEN_PLUS_EVEN:
+  case IFN_VEC_WIDEN_PLUS_ODD:
+ {
+   signed_op = ptr_op_widen_plus_signed;
+   unsigned_op = ptr_op_widen_plus_unsigned;
+   m_valid = false;
+   m_op1 = gimple_call_arg (m_stmt, 0);
+   m_op2 = gimple_call_arg (m_stmt, 1);
+   tree ret = gimple_get_lhs (m_stmt);
+   signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+   signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+   signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
+   break;
+ }
+  default:
+   return;
   }
+  else
+return;
+
+  /* Normally these operands should all have the same sign, but some passes
+ violate this by taking mismatched sign args.  At the moment the only
+ one that's possible is mismatched inputs and unsigned output.  Once ranger
+ supports signs for the operands we can properly fix it; for now only
+ accept the case we can do correctly.  */
+  if ((signed1 ^ signed2) && signed_ret)
+return;
+
+  m_valid = true;
+  if (signed2 && !signed1)
+std::swap (m_op1, m_op2);
+
+  if (signed1 || signed2)
+m_int = signed_op;
+  else
+m_int = unsigned_op;
 }
 
 // Set up a gimple_range_op_handler for any built in function which can be
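
The sign handling that the patch hoists out of the switch can be modelled in isolation as follows.  This is a sketch under assumptions: `widen_op`, `widen_choice` and `choose_widen_op` are hypothetical names standing in for the m_int / m_valid / std::swap logic in maybe_non_standard, not ranger API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the signed/unsigned range operator choice.  */
enum widen_op { WIDEN_NONE, WIDEN_SIGNED, WIDEN_UNSIGNED };

struct widen_choice
{
  enum widen_op op;    /* WIDEN_NONE models leaving the handler invalid.  */
  bool swap_operands;  /* True when the operands should be exchanged.  */
};

static struct widen_choice
choose_widen_op (bool signed1, bool signed2, bool signed_ret)
{
  struct widen_choice c = { WIDEN_NONE, false };

  /* Mixed-sign inputs with a signed result are not handled.  */
  if ((signed1 ^ signed2) && signed_ret)
    return c;

  /* Put the signed operand first when only the second one is signed.  */
  c.swap_operands = signed2 && !signed1;
  c.op = (signed1 || signed2) ? WIDEN_SIGNED : WIDEN_UNSIGNED;
  assert (c.op != WIDEN_NONE);
  return c;
}
```

Because the decision depends only on the three sign flags, the same code can serve both the GIMPLE_ASSIGN and the internal-function path, which is what the restructuring achieves.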


Re: [PATCH 04/14] c++: use _P() defines from tree.h

2023-06-01 Thread Bernhard Reutner-Fischer via Gcc-patches
On Thu, 1 Jun 2023 11:24:06 -0400
Patrick Palka  wrote:

> On Sat, May 13, 2023 at 7:26 PM Bernhard Reutner-Fischer via
> Gcc-patches  wrote:

> > diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
> > index 131b212ff73..19dfb3ed782 100644
> > --- a/gcc/cp/tree.cc
> > +++ b/gcc/cp/tree.cc
> > @@ -1173,7 +1173,7 @@ build_cplus_array_type (tree elt_type, tree 
> > index_type, int dependent)
> >  }
> >
> >/* Avoid spurious warnings with VLAs (c++/54583).  */
> > -  if (TYPE_SIZE (t) && EXPR_P (TYPE_SIZE (t)))
> > +  if (CAN_HAVE_LOCATION_P (TYPE_SIZE (t)))  
> 
> Hmm, this change seems undesirable...

mhm, yes that is misleading. I'll prepare a patch to revert this.
Let me have a look if there were other such CAN_HAVE_LOCATION_P changes
that we'd want to revert.

thanks,


Re: FW: [RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 in riscv like x86_64 and arm.

2023-06-01 Thread Jeff Law via Gcc-patches




On 6/1/23 01:01, juzhe.zh...@rivai.ai wrote:
> I plan to implement BF16 vector in GCC but still waiting for ISA
> ratified since GCC policy doesn't allow un-ratified ISA.

Right.  So those specs need to move along further before we can start
integrating code.

> Currently, we are working on INT8,INT16,INT32,INT64,FP16,FP32,FP64
> auto-vectorization.
>
> It should be very simple to support BF16 in the current vector
> framework in GCC.

In prior architectures I've worked on, the bulk of BF16 work was just
adding additional entries to existing iterators.  So I agree, it should
be very simple :-)


Jeff



Re: FW: [RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 in riscv like x86_64 and arm.

2023-06-01 Thread Palmer Dabbelt

On Thu, 01 Jun 2023 09:48:47 PDT (-0700), jeffreya...@gmail.com wrote:
> On 6/1/23 01:01, juzhe.zh...@rivai.ai wrote:
> > I plan to implement BF16 vector in GCC but still waiting for ISA
> > ratified since GCC policy doesn't allow un-ratified ISA.
>
> Right.  So those specs need to move along further before we can start
> integrating code.
>
> > Currently, we are working on INT8,INT16,INT32,INT64,FP16,FP32,FP64
> > auto-vectorization.
> > It should be very simple to support BF16 in the current vector
> > framework in GCC.
>
> In prior architectures I've worked on, the bulk of BF16 work was just
> adding additional entries to existing iterators.  So I agree, it should
> be very simple :-)

We should also have someone who's a bit more plugged in to floating
point check to make sure the RISC-V bfloat16 semantics match IEEE.  I
don't see any issues, but I'm not really a FP person so I'm not sure.
There were certainly a lot of subtleties for the other FP bits, so even if
the implementation just plumbs straight through IMO it's worth checking.

We have one FP person at Rivos, I can try and rope him in if you want?
Happy to have someone else do it, though, as he's usually pretty busy ;)


Re: FW: [RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 in riscv like x86_64 and arm.

2023-06-01 Thread Philipp Tomsich
On Thu, 1 Jun 2023 at 18:49, Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 6/1/23 01:01, juzhe.zh...@rivai.ai wrote:
> > I plan to implement BF16 vector in GCC but still waiting for ISA
> > ratified since GCC policy doesn't allow un-ratified ISA.
> Right.  So those specs need to move along further before we can start
> integrating code.

Doesn't our policy require specs to only pass the FREEZE milestone
(i.e., the requirement for public review) before we can start
integrating them?
This should give us at least a 6 week (minimum 30 days public-review
plus 2 weeks for the TSC vote to send this up for ratification)
headstart on ratification (with the small risk of minor changes
required due to review comments) to start integrating support for new
extensions.

Best,
Philipp.

p.s.: Just for reference, the RISC-V Lifecycle Guide (defining these
milestones in specification development) is linked from
https://wiki.riscv.org/ for details.


> >
> > Currently, we are working on INT8,INT16,INT32,INT64,FP16,FP32,FP64
> > auto-vectorization.
> > It should be very simple to support BF16 in the current vector
> > framework in GCC.
> In prior architectures I've worked on the bulk of BF16 work was just
> adding additional entries to existing iterators.  So I agree, it should
> be very simple :-)
>
> Jeff
>


Re: FW: [RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 in riscv like x86_64 and arm.

2023-06-01 Thread Jeff Law via Gcc-patches




On 6/1/23 10:56, Palmer Dabbelt wrote:
> On Thu, 01 Jun 2023 09:48:47 PDT (-0700), jeffreya...@gmail.com wrote:
> > On 6/1/23 01:01, juzhe.zh...@rivai.ai wrote:
> > > I plan to implement BF16 vector in GCC but still waiting for ISA
> > > ratified since GCC policy doesn't allow un-ratified ISA.
> >
> > Right.  So those specs need to move along further before we can start
> > integrating code.
> >
> > > Currently, we are working on INT8,INT16,INT32,INT64,FP16,FP32,FP64
> > > auto-vectorization.
> > > It should be very simple to support BF16 in the current vector
> > > framework in GCC.
> >
> > In prior architectures I've worked on, the bulk of BF16 work was just
> > adding additional entries to existing iterators.  So I agree, it should
> > be very simple :-)
>
> We should also have someone who's a bit more plugged in to floating
> point check to make sure the RISC-V bfloat16 semantics match IEEE.  I
> don't see any issues, but I'm not really a FP person so I'm not sure.
> There were certainly a lot of subtleties for the other FP bits, so even if
> the implementation just plumbs straight through IMO it's worth checking.
>
> We have one FP person at Rivos, I can try and rope him in if you want?
> Happy to have someone else do it, though, as he's usually pretty busy ;)

I don't really have an FP expert here.  I can't honestly pretend to be
one myself.


jeff


Re: [Patch, fortran] PR87477 - [meta-bug] [F03] issues concerning the ASSOCIATE statement

2023-06-01 Thread Mikael Morin

On 01/06/2023 at 17:20, Paul Richard Thomas via Fortran wrote:
> Hi All,
>
> This started out as the search for a fix to pr109948 and evolved to roll in
> 5 other prs.
>
> Basically parse_associate was far too clunky and, in any case, existing
> functions in resolve.cc were well capable of doing the determination of the
> target expression rank. While I was checking the comments, the lightbulb
> flashed with respect to prs 102109/112/190 and the chunk dealing with
> function results of unknown type was born.
>
> Thanks to the changes in parse.cc, the problem in pr99326 migrated
> upstream to the resolution and the chunklet in resolve.cc was an obvious
> fix.
>
> I am minded to s/{ dg-do run}/{ dg-do compile } for all six testcases.

Makes sense, the PRs were bogus errors and ICEs, so all compile time issues.

> At
> the testing stage, I wanted to check that the testcases actually did what
> they are supposed to do :-)
>
> Bootstraps and regtests OK - good for head?

OK.  Thanks for this.

> Paul
>
> PS I need to do some housekeeping on pr87477 now. Some of the blockers have
> "fixed themselves" and others are awaiting backporting. I think that there
> are only 4 or so left, of which 89645 and 99065 are the most difficult to
> deal with.




Re: [PATCH 04/14] c++: use _P() defines from tree.h

2023-06-01 Thread Bernhard Reutner-Fischer via Gcc-patches
Hi David, Patrick,

On Thu, 1 Jun 2023 18:33:46 +0200
Bernhard Reutner-Fischer  wrote:

> On Thu, 1 Jun 2023 11:24:06 -0400
> Patrick Palka  wrote:
> 
> > On Sat, May 13, 2023 at 7:26 PM Bernhard Reutner-Fischer via
> > Gcc-patches  wrote:  
> 
> > > diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
> > > index 131b212ff73..19dfb3ed782 100644
> > > --- a/gcc/cp/tree.cc
> > > +++ b/gcc/cp/tree.cc
> > > @@ -1173,7 +1173,7 @@ build_cplus_array_type (tree elt_type, tree 
> > > index_type, int dependent)
> > >  }
> > >
> > >/* Avoid spurious warnings with VLAs (c++/54583).  */
> > > -  if (TYPE_SIZE (t) && EXPR_P (TYPE_SIZE (t)))
> > > +  if (CAN_HAVE_LOCATION_P (TYPE_SIZE (t)))
> > 
> > Hmm, this change seems undesirable...  
> 
> mhm, yes that is misleading. I'll prepare a patch to revert this.
> Let me have a look if there were other such CAN_HAVE_LOCATION_P changes
> that we'd want to revert.

Sorry for that!
I'd revert the hunk above and the one in gcc-rich-location.cc
(maybe_range_label_for_tree_type_mismatch::get_text), please see
attached. Bootstrap running, ok for trunk if it passes?

thanks,

From 322bce380144b5199cca5775f7a3f0fb30a219ae Mon Sep 17 00:00:00 2001
From: Bernhard Reutner-Fischer 
Date: Thu, 1 Jun 2023 19:44:19 +0200
Subject: [PATCH] c++, analyzer: Expand CAN_HAVE_LOCATION_P macro.

r14-985-gca2007a9bb3074 used the collapsed macro definition
CAN_HAVE_LOCATION_P in gcc-rich-location.cc and r14-977-g8861c80733da5c
in c++'s build_cplus_array_type ().
However, although otherwise correct, the usage of CAN_HAVE_LOCATION_P
in these two spots is misleading, so this patch reverts aforementioned
two hunks.

gcc/cp/ChangeLog:

* tree.cc (build_cplus_array_type): Revert using the macro
CAN_HAVE_LOCATION_P.

gcc/ChangeLog:

* gcc-rich-location.cc 
(maybe_range_label_for_tree_type_mismatch::get_text):
Revert using the macro CAN_HAVE_LOCATION_P.
---
 gcc/cp/tree.cc   | 2 +-
 gcc/gcc-rich-location.cc | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 19dfb3ed782..9363166152a 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -1173,7 +1173,7 @@ build_cplus_array_type (tree elt_type, tree index_type, 
int dependent)
 }
 
   /* Avoid spurious warnings with VLAs (c++/54583).  */
-  if (CAN_HAVE_LOCATION_P (TYPE_SIZE (t)))
+  if (TYPE_SIZE (t) && EXPR_P (TYPE_SIZE (t)))
 suppress_warning (TYPE_SIZE (t), OPT_Wunused);
 
   /* Push these needs up to the ARRAY_TYPE so that initialization takes
diff --git a/gcc/gcc-rich-location.cc b/gcc/gcc-rich-location.cc
index edecf07f81e..d02a5144cc6 100644
--- a/gcc/gcc-rich-location.cc
+++ b/gcc/gcc-rich-location.cc
@@ -200,7 +200,7 @@ maybe_range_label_for_tree_type_mismatch::get_text 
(unsigned range_idx) const
   tree expr_type = TREE_TYPE (m_expr);
 
   tree other_type = NULL_TREE;
-  if (CAN_HAVE_LOCATION_P (m_other_expr))
+  if (m_other_expr && EXPR_P (m_other_expr))
 other_type = TREE_TYPE (m_other_expr);
 
   range_label_for_type_mismatch inner (expr_type, other_type);
-- 
2.30.2
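
As the two hunks show, the collapsed and the spelled-out forms test the same thing: a non-null node that is an expression.  A minimal sketch of that equivalence, using a mock node type rather than GCC's tree (the `mock_node` struct and the MOCK_* macros are assumptions for illustration only):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock tree node; real GCC trees carry a tree code, not a boolean flag.  */
struct mock_node
{
  bool is_expr;
};

#define MOCK_EXPR_P(node) ((node)->is_expr)
/* Collapsed form: null check and expression check in one macro.  */
#define MOCK_CAN_HAVE_LOCATION_P(node) ((node) && MOCK_EXPR_P (node))

/* Spelled-out form, as the revert restores it.  */
static bool
is_expr_spelled_out (const struct mock_node *node)
{
  return node && MOCK_EXPR_P (node);
}

/* Collapsed form; the assert documents that both agree.  */
static bool
is_expr_via_macro (const struct mock_node *node)
{
  bool result = MOCK_CAN_HAVE_LOCATION_P (node);
  assert (result == is_expr_spelled_out (node));
  return result;
}
```

So the revert is about readability at the call sites, where the code does not care about locations at all, and not about behaviour.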



Re: [PATCH] doc: clarify semantics of vector bitwise shifts

2023-06-01 Thread Alexander Monakov via Gcc-patches


On Wed, 31 May 2023, Richard Biener wrote:

> On Tue, May 30, 2023 at 4:49 PM Alexander Monakov  wrote:
> >
> >
> > On Thu, 25 May 2023, Richard Biener wrote:
> >
> > > On Wed, May 24, 2023 at 8:36 PM Alexander Monakov  
> > > wrote:
> > > >
> > > >
> > > > On Wed, 24 May 2023, Richard Biener via Gcc-patches wrote:
> > > >
> > > > > I’d have to check the ISAs what they actually do here - it of course 
> > > > > depends
> > > > > on RTL semantics as well but as you say those are not strictly 
> > > > > defined here
> > > > > either.
> > > >
> > > > Plus, we can add the following executable test to the testsuite:
> > >
> > > Yeah, that's probably a good idea.  I think your documentation change
> > > with the added sentence about the truncation is OK.
> >
> > I am no longer confident in my patch, sorry.
> >
> > My claim about vector shift semantics in OpenCL was wrong. In fact it 
> > specifies
> > that RHS of a vector shift is masked to the exact bitwidth of the element 
> > type.
> >
> > So, to collect various angles:
> >
> > 1. OpenCL semantics would need an 'AND' before a shift (except VSX/Altivec).
> >
> > 2. From user side we had a request to follow C integer promotion semantics
> >in https://gcc.gnu.org/PR91838 but I now doubt we can do that.
> >
> > 3. LLVM makes oversized vector shifts UB both for 'vector_size' and
> >'ext_vector_type'.
> 
> I had the impression GCC desired to do 3. as well, matching what we do
> for scalar shifts.
> 
> > 4. Vector lowering does not emit promotions, and starting from gcc-12
> >ranger treats oversized shifts according to the documentation you
> >cite below, and optimizes (e.g. with '-O2 -mno-sse')
> >
> > typedef short v8hi __attribute__((vector_size(16)));
> >
> > void f(v8hi *p)
> > {
> > *p >>= 16;
> > }
> >
> >to zeroing '*p'. If this looks unintended, I can file a bug.
> >
> > I still think we need to clarify semantics of vector shifts, but probably
> > not in the way I proposed initially. What do you think?
> 
> I think the intent at some point was to adhere to the OpenCL spec
> for the GCC vector extension (because that's a written spec while
> GCCs vector extension docs are lacking).  Originally the powerpc
> altivec 'vector' keyword spurred most of the development IIRC
> so it might be useful to see how they specify shifts.

It doesn't look like they document the semantics of '<<' and '>>'
operators for vector types.

> So yes, we probably should clarify the semantics to match the
> implementation (since we have two targets doing things differently
> since forever we can only document it as UB) and also note the
> difference from OpenCL (in case OpenCL is still relevant these
> days we might want to offer a -fopencl-vectors to emit the required
> AND).

It doesn't have to be UB, in principle we could say that shift amount
is taken modulo some power of two depending on the target without UB.
But since LLVM already treats that as UB, we might as well follow.
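
For code that wants the OpenCL-style behaviour regardless of what the documentation ends up saying, the masking can be written out explicitly.  A minimal sketch using the vector extension (the helper names `shift_right_masked` and `lane` are made up for this example):

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t v8hi __attribute__ ((vector_size (16)));

/* Shift each lane right with the count reduced modulo the lane width (16),
   so an oversized count never reaches the shift itself.  */
static v8hi
shift_right_masked (v8hi v, v8hi count)
{
  const v8hi mask = { 15, 15, 15, 15, 15, 15, 15, 15 };
  return v >> (count & mask);
}

/* Read one lane, checking the index.  */
static int16_t
lane (v8hi v, int i)
{
  assert (i >= 0 && i < 8);
  return v[i];
}
```

With the explicit AND, a count of 16 behaves like a count of 0 and leaves the lane unchanged, instead of relying on whatever the target or the optimizers do with an oversized shift.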

I think for addition/multiplication of signed vectors everybody
expects them to have wrapping semantics without UB on overflow though?

Revised patch below.

> It would be also good to amend the RTL documentation.
> 
> It would be very nice to start an internals documentation section
> around collecting what the middle-end considers undefined
> or implementation defined (aka target defined) behavior in the
> GENERIC, GIMPLE and RTL ILs and what predicates eventually
> control that (like TYPE_OVERFLOW_UNDEFINED).  Maybe spread it over
> {gimple,generic,rtl}.texi, though gimple.texi is only about the representation
> and all semantics are shared and documented in generic.texi.

Hm, noted. Thanks.

---8<---

From e4e8d9e262f2f8dbc91a94291cf7accb74d27e7c Mon Sep 17 00:00:00 2001
From: Alexander Monakov 
Date: Wed, 24 May 2023 15:48:29 +0300
Subject: [PATCH] doc: clarify semantics of vector bitwise shifts

Explicitly say that attempted shift past element bit width is UB for
vector types.  Mention that integer promotions do not happen.

gcc/ChangeLog:

* doc/extend.texi (Vector Extensions): Clarify bitwise shift
semantics.
---
 gcc/doc/extend.texi | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e426a2eb7d..3723cfe467 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -12026,7 +12026,14 @@ elements in the operand.
 It is possible to use shifting operators @code{<<}, @code{>>} on
 integer-type vectors. The operation is defined as following: @code{@{a0,
 a1, @dots{}, an@} >> @{b0, b1, @dots{}, bn@} == @{a0 >> b0, a1 >> b1,
-@dots{}, an >> bn@}}@. Vector operands must have the same number of
+@dots{}, an >> bn@}}@.  Unlike OpenCL, values of @code{b} are not
+implicitly taken modulo bit width of the base type @code{B}, and the behavior
+is undefined if any @code{bi} is greater than or equal to @code{B}.
+
+In contrast to scalar operations in C and C+

Re: [PATCH V2] RISC-V: Support RVV permutation auto-vectorization

2023-06-01 Thread Jeff Law via Gcc-patches




On 5/31/23 20:36, juzhe.zh...@rivai.ai wrote:
> From: Juzhe-Zhong
>
> This patch supports vector permutation for VLS only by vec_perm pattern.
> We will support TARGET_VECTORIZE_VEC_PERM_CONST to support VLA permutation
> in the future.
>
> Fixed following comments from Robin.
> Ok for trunk?
>
> gcc/ChangeLog:
>
>  * config/riscv/autovec.md (vec_perm): New pattern.
>  * config/riscv/predicates.md (vector_perm_operand): New predicate.
>  * config/riscv/riscv-protos.h (enum insn_type): New enum.
>  (expand_vec_perm): New function.
>  * config/riscv/riscv-v.cc (const_vec_all_in_range_p): Ditto.
>  (gen_const_vector_dup): Ditto.
>  (emit_vlmax_gather_insn): Ditto.
>  (emit_vlmax_masked_gather_mu_insn): Ditto.
>  (expand_vec_perm): Ditto.

OK.
jeff


Re: [PATCH] RISC-V: Add vwadd.wv/vwsub.wv auto-vectorization lowering optimization

2023-06-01 Thread Jeff Law via Gcc-patches




On 5/31/23 21:48, juzhe.zh...@rivai.ai wrote:

From: Juzhe-Zhong 

1. This patch optimizes the codegen of the following auto-vectorized code:

void foo (int32_t * __restrict a, int64_t * __restrict b, int64_t * __restrict 
c, int n)
{
 for (int i = 0; i < n; i++)
   c[i] = (int64_t)a[i] + b[i];
}

Combine instruction from:

...
vsext.vf2
vadd.vv
...

into:

...
vwadd.wv
...

Since for the PLUS operation GCC prefers the following RTL operand order when
combining:

(plus: (sign_extend:..)
(reg:)

instead of

(plus: (reg:..)
(sign_extend:)




which is different from MINUS pattern.

Right.  Canonicalization rules will have the sign_extend as the first
operand when the opcode is associative.


I split patterns of vwadd/vwsub, and add dedicated patterns for them.

2. This patch not only optimizes the case mentioned in (1) above, but also
enhances the vwadd.vv/vwsub.vv optimization for more complicated PLUS/MINUS
code.  Consider the following code:

__attribute__ ((noipa)) void

vwadd_int16_t_int8_t (int16_t *__restrict dst, int16_t *__restrict dst2,
  int16_t *__restrict dst3, int8_t *__restrict a,
  int8_t *__restrict b, int8_t *__restrict a2,
  int8_t *__restrict b2, int n)
{
   for (int i = 0; i < n; i++)
 {
   dst[i] = (int16_t) a[i] + (int16_t) b[i];
   dst2[i] = (int16_t) a2[i] + (int16_t) b[i];
   dst3[i] = (int16_t) a2[i] + (int16_t) a[i];
 }
}

Before this patch:
...
 vsetvli zero,a6,e8,mf2,ta,ma
 vle8.v  v2,0(a3)
 vle8.v  v1,0(a4)
 vsetvli t1,zero,e16,m1,ta,ma
 vsext.vf2   v3,v2
 vsext.vf2   v2,v1
 vadd.vv v1,v2,v3
 vsetvli zero,a6,e16,m1,ta,ma
 vse16.v v1,0(a0)
 vle8.v  v4,0(a5)
 vsetvli t1,zero,e16,m1,ta,ma
 vsext.vf2   v1,v4
 vadd.vv v2,v1,v2
...

After this patch:
...
 vsetvli zero,a6,e8,mf2,ta,ma
 vle8.v  v3,0(a4)
 vle8.v  v1,0(a3)
 vsetvli t4,zero,e8,mf2,ta,ma
 vwadd.vv v2,v1,v3
 vsetvli zero,a6,e16,m1,ta,ma
 vse16.v v2,0(a0)
 vle8.v  v2,0(a5)
 vsetvli t4,zero,e8,mf2,ta,ma
 vwadd.vv v4,v3,v2
 vsetvli zero,a6,e16,m1,ta,ma
 vse16.v v4,0(a1)
 vsetvli t4,zero,e8,mf2,ta,ma
 sub a7,a7,a6
 vwadd.vv v3,v2,v1
 vsetvli zero,a6,e16,m1,ta,ma
 vse16.v v3,0(a2)
...

The reason current upstream GCC cannot fully optimize such code using vwadd
is that the combine pass needs an intermediate RTL pattern in which only one
of the operands is extended (vwadd.wv).  Based on that intermediate pattern,
it then extends the other operand to generate vwadd.vv.

So the vwadd.wv/vwsub.wv patterns also help the vwadd.vv/vwsub.vv
optimizations.
  
gcc/ChangeLog:


 * config/riscv/riscv-vector-builtins-bases.cc: Change 
vwadd.wv/vwsub.wv intrinsic API expander
 * config/riscv/vector.md 
(@pred_single_widen_): Remove it.
 (@pred_single_widen_sub): New pattern.
 (@pred_single_widen_add): New pattern.

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/rvv/autovec/widen/widen-5.c: New test.
 * gcc.target/riscv/rvv/autovec/widen/widen-6.c: New test.
 * gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c: New test.
 * gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c: New test.
 * gcc.target/riscv/rvv/autovec/widen/widen_run-5.c: New test.
 * gcc.target/riscv/rvv/autovec/widen/widen_run-6.c: New test.

OK
jeff
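
The two-step folding described in the quoted patch can be pictured with a per-element scalar model.  This is only an illustration of the arithmetic; the function names are made up, and the real work happens on RTL patterns in the combine pass, not on C functions.

```c
#include <assert.h>
#include <stdint.h>

/* Starting point: extend both operands, then add at the wide width
   (two vsext.vf2 followed by vadd.vv).  */
static int16_t
add_after_two_extends (int8_t a, int8_t b)
{
  int16_t wide_a = (int16_t) a;
  int16_t wide_b = (int16_t) b;
  return (int16_t) (wide_a + wide_b);
}

/* Step 1: one extend folded into the add (the vwadd.wv shape):
   a wide operand plus a narrow operand.  */
static int16_t
widen_add_wv (int16_t wide, int8_t narrow)
{
  return (int16_t) (wide + narrow);
}

/* Step 2: the remaining extend folded as well (the vwadd.vv shape).  */
static int16_t
widen_add_vv (int8_t a, int8_t b)
{
  int16_t result = (int16_t) ((int16_t) a + (int16_t) b);
  /* The fully folded form must agree with the intermediate one.  */
  assert (result == widen_add_wv ((int16_t) a, b));
  return result;
}
```

All three forms compute the same value for every int8_t pair, which is why having a pattern for the intermediate shape lets combine reach the final one.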


[PATCH] Fortran: force error on bad KIND specifier [PR88552]

2023-06-01 Thread Harald Anlauf via Gcc-patches
Dear all,

we sometimes silently accept wrong declarations with unbalanced
parentheses, as the PR and testcases therein show.

It appears that the fix is obvious: use the existing error paths in
gfc_match_kind_spec and error return from gfc_match_decl_type_spec.
I'm still posting it here in case I have missed something not so
obvious.

The patch regtests cleanly on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From a30ff5af130c4d33c086fd136978d5f49cb8bde4 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 1 Jun 2023 20:56:11 +0200
Subject: [PATCH] Fortran: force error on bad KIND specifier [PR88552]

gcc/fortran/ChangeLog:

	PR fortran/88552
	* decl.cc (gfc_match_kind_spec): Use error path on missing right
	parenthesis.
	(gfc_match_decl_type_spec): Use error return when an error occurred
	during matching a KIND specifier.

gcc/testsuite/ChangeLog:

	PR fortran/88552
	* gfortran.dg/pr88552.f90: New test.
---
 gcc/fortran/decl.cc   | 4 
 gcc/testsuite/gfortran.dg/pr88552.f90 | 6 ++
 2 files changed, 10 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/pr88552.f90

diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index 1de2b231242..deb20647fb9 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -3366,6 +3366,7 @@ close_brackets:
   else
 	gfc_error ("Missing right parenthesis at %C");
   m = MATCH_ERROR;
+  goto no_match;
 }
   else
  /* All tests passed.  */
@@ -4716,6 +4717,9 @@ get_kind:
   return MATCH_ERROR;
 }

+  if (m == MATCH_ERROR)
+return MATCH_ERROR;
+
   /* Defer association of the KIND expression of function results
  until after USE and IMPORT statements.  */
   if ((gfc_current_state () == COMP_NONE && gfc_error_flag_test ())
diff --git a/gcc/testsuite/gfortran.dg/pr88552.f90 b/gcc/testsuite/gfortran.dg/pr88552.f90
new file mode 100644
index 000..15e1b372f8f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr88552.f90
@@ -0,0 +1,6 @@
+! { dg-do compile }
+! PR fortran/88552
+! Contributed by G.Steinmetz
+
+integer(len((c)) :: n   ! { dg-error "must be CHARACTER" }
+end
--
2.35.3
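
The shape of the fix is plain error propagation: the sub-matcher reports the unbalanced parenthesis, and the caller must return that error instead of carrying on.  A toy model of the pattern, assuming a much simpler matcher than gfortran's (the functions `match_paren_spec` and `match_type_spec` are hypothetical; only the three-way result convention is borrowed from the patch):

```c
#include <assert.h>
#include <stddef.h>

/* Three-way result, modelled on the convention used by gfortran's matchers.  */
enum match_result { MATCH_NO, MATCH_YES, MATCH_ERROR };

/* Toy sub-matcher: accept a parenthesized spec such as "(4)" only when
   its parentheses balance.  */
static enum match_result
match_paren_spec (const char *s)
{
  assert (s != NULL);
  if (*s != '(')
    return MATCH_NO;

  int depth = 0;
  for (; *s != '\0'; s++)
    {
      if (*s == '(')
        depth++;
      else if (*s == ')' && --depth == 0)
        return MATCH_YES;
    }
  /* Ran out of input with a parenthesis still open.  */
  return MATCH_ERROR;
}

/* Toy caller: an error from the sub-matcher is returned, not dropped.  */
static enum match_result
match_type_spec (const char *s)
{
  enum match_result m = match_paren_spec (s);
  if (m == MATCH_ERROR)
    return MATCH_ERROR;
  /* No spec at all means the default kind, which is still a match.  */
  return MATCH_YES;
}
```

Feeding the model the spec from the new testcase, "(len((c))", yields MATCH_ERROR from both functions, whereas a caller without the explicit check would have reported success.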



Re: [PATCH 2/3] RISC-V: Add missing torture-init and torture-finish for rvv.exp

2023-06-01 Thread Vineet Gupta

On 6/1/23 07:54, Jeff Law wrote:
> On 5/31/23 10:25, Vineet Gupta wrote:
> > From: Kito Cheng
> >
> > This is in line with recent test harness expectations and is a
> > preventive change as it doesn't actually fix any errors.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/rvv/rvv.exp: Add torture-init and
> > torture-finish.
>
> Thomas's recommendation was to drop this as it doesn't change any
> observed behavior.  Do you agree with that recommendation?

Yep dropped now.

Thx,
-Vineet


[Committed] testsuite: Unbork multilib setups using -march flags (RISC-V)

2023-06-01 Thread Vineet Gupta
RISC-V multilib testing is currently busted with the following splat all over:

|Schedule of variations:
|riscv-sim/-march=rv64imafdc/-mabi=lp64d/-mcmodel=medlow
|riscv-sim/-march=rv32imafdc/-mabi=ilp32d/-mcmodel=medlow
|riscv-sim/-march=rv32imac/-mabi=ilp32/-mcmodel=medlow
|riscv-sim/-march=rv64imac/-mabi=lp64/-mcmodel=medlow
...
...
| ERROR: tcl error code NONE
| ERROR: torture-init: torture_without_loops is not empty as expected

causing an insane amount of false failures.

|   = Summary of gcc testsuite =
|                            | # of unexpected case / # of unique unexpected case
|                            |       gcc |    g++ | gfortran |
| rv64imafdc/  lp64d/ medlow |  5421 / 4 |  1 / 1 |    6 / 1 |
| rv32imafdc/ ilp32d/ medlow |  5422 / 5 |  3 / 2 |    6 / 1 |
|   rv32imac/  ilp32/ medlow |   391 / 5 |  3 / 2 |   43 / 8 |
|   rv64imac/   lp64/ medlow |  5422 / 5 |  1 / 1 |   43 / 8 |

The error splat itself is from recent test harness improvements for stricter
checks for torture-{init,finish} pairing. But the real issue is a latent bug
from 2009: commit 3dd1415dc88, ("i386-prefetch.exp: Skip tests when multilib
flags contain -march") which added an "early exit" condition to
i386-prefetch.exp which could potentially cause an unpaired
torture-{init,finish}.

The early exit only happens in a multilib setup using -march in flags
which is what RISC-V happens to use, hence the reason this was only seen
on RISC-V multilib testing.

Moving the early exit outside of torture-{init,finish} bracket
reinstates RISC-V testing.

| rv64imafdc/  lp64d/ medlow |     3 / 2 |  1 / 1 |    6 / 1 |
| rv32imafdc/ ilp32d/ medlow |     4 / 3 |  3 / 2 |    6 / 1 |
|   rv32imac/  ilp32/ medlow |     3 / 2 |  3 / 2 |   43 / 8 |
|   rv64imac/   lp64/ medlow |     5 / 4 |  1 / 1 |   43 / 8 |

gcc/testsuite:
* gcc.misc-tests/i386-prefetch.exp: Move early return outside
the torture-{init,finish}

Signed-off-by: Vineet Gupta 
---
 gcc/testsuite/gcc.misc-tests/i386-prefetch.exp | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/i386-prefetch.exp 
b/gcc/testsuite/gcc.misc-tests/i386-prefetch.exp
index ad9e56a54bcf..98aab506cba0 100644
--- a/gcc/testsuite/gcc.misc-tests/i386-prefetch.exp
+++ b/gcc/testsuite/gcc.misc-tests/i386-prefetch.exp
@@ -14,6 +14,13 @@
 # along with GCC; see the file COPYING3.  If not see
 # .
 
+if { [board_info target exists multilib_flags]
+ && [string match "* -march=*" " [board_info target multilib_flags] "] } {
+# Multilib flags come after the -march flags we pass and override
+# them, so skip these tests when such flags are passed.
+return
+}
+
 # Test that the correct data prefetch instructions (SSE or 3DNow! variant,
 # or none) are used for various i386 cpu-type and instruction set
 # extension options for __builtin_prefetch.  When using -mtune, specify
@@ -90,13 +97,6 @@ load_lib torture-options.exp
 dg-init
 torture-init
 
-if { [board_info target exists multilib_flags]
- && [string match "* -march=*" " [board_info target multilib_flags] "] } {
-# Multilib flags come after the -march flags we pass and override
-# them, so skip these tests when such flags are passed.
-return
-}
-
 set-torture-options $PREFETCH_NONE
 gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/i386-pf-none-*.c]] "" 
""
 
-- 
2.34.1
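
Why the position of the early return matters can be shown with a toy model of the harness state.  The sketch below is C rather than Tcl and every name in it is made up; it only mirrors the idea that torture-init sets state which torture-finish must clear before the next script runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the state that torture-init sets and torture-finish clears.  */
static bool torture_active = false;

static void
torture_init (void)
{
  /* The real harness raises a Tcl error here; the model asserts instead.  */
  assert (!torture_active);
  torture_active = true;
}

static void
torture_finish (void)
{
  assert (torture_active);
  torture_active = false;
}

/* Old ordering: the skip check sits between init and finish, so an
   early return leaves the state set for the next script.  */
static void
run_script_old (bool skip)
{
  torture_init ();
  if (skip)
    return;
  torture_finish ();
}

/* New ordering: decide whether to skip before touching harness state.  */
static void
run_script_new (bool skip)
{
  if (skip)
    return;
  torture_init ();
  torture_finish ();
}

/* Report whether state from an earlier script is still set.  */
static bool
state_leaked (void)
{
  return torture_active;
}
```

In the model, a skipped run with the old ordering leaves `torture_active` set, so the next script's init would trip its check, which corresponds to the "torture_without_loops is not empty as expected" error above.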



[Committed] testsuite: print any leaking torture options for debugging

2023-06-01 Thread Vineet Gupta
This was helpful when debugging the recent multilib testsuite failure.

gcc/testsuite:
* lib/torture-options.exp: print the value of non-empty options:
torture_without_loops, torture_with_loops, LTO_TORTURE_OPTIONS.

Signed-off-by: Vineet Gupta 
---
 gcc/testsuite/lib/torture-options.exp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/lib/torture-options.exp 
b/gcc/testsuite/lib/torture-options.exp
index d00d07e9378d..dfb536d1d96c 100644
--- a/gcc/testsuite/lib/torture-options.exp
+++ b/gcc/testsuite/lib/torture-options.exp
@@ -23,15 +23,15 @@ proc torture-init { args } {
 global torture_without_loops global_with_loops
 
 if [info exists torture_without_loops] {
-   error "torture-init: torture_without_loops is not empty as expected"
+   error "torture-init: torture_without_loops is not empty as expected = 
\"${torture_without_loops}\""
 }
 if [info exists torture_with_loops] {
-   error "torture-init: torture_with_loops is not empty as expected"
+   error "torture-init: torture_with_loops is not empty as expected = 
\"${torture_with_loops}\""
 }
 
 global LTO_TORTURE_OPTIONS
 if [info exists LTO_TORTURE_OPTIONS] {
-   error "torture-init: LTO_TORTURE_OPTIONS is not empty as expected"
+   error "torture-init: LTO_TORTURE_OPTIONS is not empty as expected =  
\"${LTO_TORTURE_OPTIONS}\""
 }
 set LTO_TORTURE_OPTIONS ""
 if [check_effective_target_lto] {
-- 
2.34.1



Re: [PATCH] rs6000: Fix __builtin_vec_xst_trunc definition

2023-06-01 Thread Carl Love via Gcc-patches
On Wed, 2023-05-31 at 12:59 -0500, Peter Bergner wrote:
> On 5/22/23 4:04 AM, Kewen.Lin wrote:
> > on 2023/5/11 02:06, Carl Love via Gcc-patches wrote:
> > > @@ -3161,12 +3161,15 @@
> > >void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed
> > > char *);
> > >  TR_STXVRBX vsx_stxvrbx {stvec}
> > >  
> > > -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed
> > > int *);
> > > +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed
> > > short *);
> > >  TR_STXVRHX vsx_stxvrhx {stvec}
> > >  
> > > -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed
> > > short *);
> > > +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed
> > > int *);
> > >  TR_STXVRWX vsx_stxvrwx {stvec}
> > 
> > Good catching!
> 
> This hunk should be its own patch and commit, as it is independent of
> the other change.  Especially since other built-ins also don't have
> {,un}signed long * as arguments, not just
> __builtin_altivec_tr_stxvr*x.

Yes, I was thinking the patch needs to be split into a bug fix and a
patch for the long * arguments.

I redid the patch to create the bug fix only.  The patch includes a
testcase that tests the __builtin_altivec_tr_stxvr* builtins.  I will
post the new patch.

The updated patch is now called:  " rs6000: Fix arguments for
__builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx"

> 
> 
> 
> > > +  void __builtin_altivec_tr_stxvrlx (vsq, signed long, signed
> > > long *);
> > > +TR_STXVRLX vsx_stxvrdx {stvec}
> > > +
> > 
> > This is mapped to the one used for type long long, it's a hard
> > mapping,
> > IMHO it's wrong and not consistent with what the users expect,
> > since on Power
> > the size of type long int is 4 bytes at -m32 while 8 bytes at -m64,
> > this
> > implementation binding to 8 bytes can cause trouble in 32-bit.  I
> > wonder if
> > it's a good idea to add one overloaded version for type long int,
> > for now
> > openxl also emits error message for long int type pointer (see its
> > doc [1]),
> > users can use casting to make it to the acceptable pointer types
> > (long long
> > or int as its size).
> 
> I'm the person who noticed that we don't accept signed/unsigned long
> * as
> an argument type and asked Carl to investigate.  I find it hard to
> believe
> we accept all integer pointer types, except long *.  I agree that it
> shouldn't
> always map to long long *, since as you say, that's wrong for -m32.
> My hope was that we could somehow automagically handle the long *
> types
> in the built-in machinery, mapping them to either the int * built-in
> or
> the long long * built-in depending on -m32 or -m64.  Again, this
> limitation
> is not limited to __builtin_altivec_tr_stx* built-ins, but others as
> well,
> so I was kind of hoping for a general solution that would fix them
> all.
> I'm not sure if that's possible though.

Per Peter's request, I added the overloaded version of the
__builtin_vec_xst_trunc builtin with the long * argument which Kewen
pushed back on.  So, that approach is not acceptable.  Not sure about
how to get the builtin infrastructure to automatically map long * to
int * or long long *?  If someone has some idea on how to do that, I
will gladly pursue it.  I will study the builtin support some more to
see if I can come up with any ideas as well.
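
For reference, the mapping discussed above (long * handled as int * at -m32
and as long long * at -m64) can be sketched at the source level.  This is only
an illustration under stated assumptions: store_trunc_word,
store_trunc_doubleword and store_trunc_long are hypothetical stand-ins, not
the rs6000 builtins, and they just copy low-order bytes so the sketch runs on
any host.  It needs C++17 for if constexpr.

```cpp
#include <cstring>

// The sketch assumes long has the size of either int or long long.
static_assert (sizeof (long) == sizeof (int)
               || sizeof (long) == sizeof (long long),
               "unexpected size for long");

// Hypothetical stand-in for the int * form: keep the low 32 bits.
inline void
store_trunc_word (unsigned long long value, int *dest)
{
  unsigned int low = static_cast<unsigned int> (value);
  std::memcpy (dest, &low, sizeof low);
}

// Hypothetical stand-in for the long long * form: keep all 64 bits.
inline void
store_trunc_doubleword (unsigned long long value, long long *dest)
{
  std::memcpy (dest, &value, sizeof value);
}

// Hypothetical wrapper: pick the stand-in whose element size matches long.
// The pointer is only cast, and the bytes are written with memcpy, so no
// object is accessed through a mismatched type.
inline void
store_trunc_long (unsigned long long value, long *dest)
{
  if constexpr (sizeof (long) == sizeof (int))
    store_trunc_word (value, reinterpret_cast<int *> (dest));
  else
    store_trunc_doubleword (value, reinterpret_cast<long long *> (dest));
}
```

A real solution would have to make this choice inside the builtin machinery
rather than in user code; the sketch only shows the size-based selection.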

 Carl



[PATCH] rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx

2023-06-01 Thread Carl Love via Gcc-patches
Kewen, Segher, Peter:

The following patch is a redo of the previous "rs6000: Fix
__builtin_vec_xst_trunc definition" patch.  

This patch fixes the argument in the two builtin definitions
__builtin_altivec_tr_stxvrwx and __builtin_altivec_tr_stxvrhx.  It also
adds a testcase to validate the related builtins which have the
third argument of char *, short *, int * and long long *.

I have tested the patch on Power 10 with no regressions.

Please let me know if this patch is acceptable for mainline.

  Carl 


rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, 
__builtin_altivec_tr_stxvrhx

The third argument for __builtin_altivec_tr_stxvrhx should be short *
not int *.  Similarly, the third argument for __builtin_altivec_tr_stxvrwx
should be int * not short *.  This patch fixes the arguments in the two
builtins.

A runnable test case is added to test the __builtin_altivec_tr_stxvrbx,
__builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx and
__builtin_altivec_tr_stxvrdx builtins.

gcc/
* config/rs6000/rs6000-builtins.def (__builtin_altivec_tr_stxvrhx,
__builtin_altivec_tr_stxvrwx): Fix type of third argument.

gcc/testsuite/
* gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c: New test
for __builtin_altivec_tr_stxvrbx, __builtin_altivec_tr_stxvrhx,
__builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrdx.
---
 gcc/config/rs6000/rs6000-builtins.def |   4 +-
 .../builtin_altivec_tr_stxvr_runnable.c   | 107 ++
 2 files changed, 109 insertions(+), 2 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 638d0bc72ca..d7839f2e06b 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -3161,10 +3161,10 @@
   void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
 TR_STXVRBX vsx_stxvrbx {stvec}
 
-  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
+  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
 TR_STXVRHX vsx_stxvrhx {stvec}
 
-  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
+  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
 TR_STXVRWX vsx_stxvrwx {stvec}
 
   void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
diff --git 
a/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c 
b/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
new file mode 100644
index 000..46014d83535
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
@@ -0,0 +1,107 @@
+/* Test of __builtin_vec_xst_trunc  */
+
+/* { dg-do run { target power10_hw } } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DEBUG 0
+
+vector signed __int128 store_data =
+  {  (__int128) 0x8ACE << 64 | (__int128) 0xfedcba9876543217ULL};
+
+union conv_t {
+  vector signed __int128 vsi128;
+  unsigned long long ull[2];
+} conv;
+
+void abort (void);
+
+
+int
+main () {
+  int i;
+  signed long sl;
+  signed char sc, expected_sc;
+  signed short ss, expected_ss;
+  signed int si, expected_si;
+  signed long long int sll, expected_sll;
+  signed char *psc;
+  signed short *pss;
+  signed int *psi;
+  signed long long int *psll;
+  
+#if DEBUG
+  val.vsi128 = store_data;
+   printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);
+#endif
+
+  psc = &sc;
+  pss = &ss;
+  psi = &si;
+  psll = &sll;
+
+  sl = 1;
+  sc =0xA1;
+  expected_sc = 0xA1;
+  __builtin_altivec_tr_stxvrbx (store_data, sl, psc);
+
+  if (expected_sc != sc & 0xFF)
+#if DEBUG
+printf(" ERROR: Signed char = 0x%x doesn't match expected value 0x%x\n",
+  sc & 0xFF, expected_sc);
+#else
+abort();
+#endif
+
+  sl = 1;
+  ss = 0x52;
+  expected_ss = 0x1752;
+  __builtin_altivec_tr_stxvrhx (store_data, sl, pss);
+
+  if (expected_ss != ss & 0x)
+#if DEBUG
+printf(" ERROR: Signed short = 0x%x doesn't match expected value 0x%x\n",
+  ss, expected_ss) & 0x;
+#else
+abort();
+#endif
+
+  sl = 1;
+  si = 0x21;
+  expected_si = 0x54321721;
+   __builtin_altivec_tr_stxvrwx (store_data, sl, psi);
+
+   if (expected_si != si)
+#if DEBUG
+printf(" ERROR: Signed int = 0x%x doesn't match expected value 0x%x\n",
+  si, expected_si);
+#else
+abort();
+#endif
+
+  sl = 1;
+  sll = 0x12FFULL;
+   expected_sll = 0xdcba9876543217FF;
+   __builtin_altivec_tr_stxvrdx (store_data, sl, psll);
+
+   if (expected_sll != sll)
+#if DEBUG
+printf(" ERROR: Signed long long int = 0x%llx doesn't match expected value 
0x%llx\n",
+  sll, expected_sll);
+#else
+

Re: [PATCH 1/2] libstdc++: Implement more maintainable header

2023-06-01 Thread Jonathan Wakely via Gcc-patches
On Sat, 29 Apr 2023 at 11:25, Arsen Arsenović via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> This commit replaces the ad-hoc logic in  with an AutoGen
> database that (mostly) declaratively generates a version.h bit which
> combines all of the FTM logic across all headers together.
>
> This generated header defines macros of the form __glibcxx_foo,
> equivalent to their __cpp_lib_foo variants, according to rules specified
> in version.def and, optionally, if __glibcxx_want_foo or
> __glibcxx_want_all are defined, also defines __cpp_lib_foo forms with
> the same definition.
>
> libstdc++-v3/ChangeLog:
>
> * include/Makefile.am (bits_freestanding): Add version.h.
> (allcreated): Add version.h.
> (${bits_srcdir}/version.h): New rule.  Regenerates
> version.h out of version.{def,tpl}.
> * include/Makefile.in: Regenerate.
> * include/bits/version.def: New file.  Declares a list of
> all feature test macros, their values and their preconditions.
> * include/bits/version.tpl: New file.  Turns version.def
> into a sequence of #if blocks.
> * include/bits/version.h: New file.  Generated from
> version.def.
> * include/std/version: Replace with a __glibcxx_want_all define
> and bits/version.h include.
> ---
>  libstdc++-v3/include/Makefile.am  |   10 +-
>  libstdc++-v3/include/Makefile.in  |   10 +-
>  libstdc++-v3/include/bits/version.def | 1591 
>  libstdc++-v3/include/bits/version.h   | 1937 +
>  libstdc++-v3/include/bits/version.tpl |  209 +++
>  libstdc++-v3/include/std/version  |  350 +
>  6 files changed, 3758 insertions(+), 349 deletions(-)
>  create mode 100644 libstdc++-v3/include/bits/version.def
>  create mode 100644 libstdc++-v3/include/bits/version.h
>  create mode 100644 libstdc++-v3/include/bits/version.tpl
>
> diff --git a/libstdc++-v3/include/Makefile.am
> b/libstdc++-v3/include/Makefile.am
> index a880e8ee227..a07b4c18585 100644
> --- a/libstdc++-v3/include/Makefile.am
> +++ b/libstdc++-v3/include/Makefile.am
> @@ -154,6 +154,7 @@ bits_freestanding = \
> ${bits_srcdir}/stl_raw_storage_iter.h \
> ${bits_srcdir}/stl_relops.h \
> ${bits_srcdir}/stl_uninitialized.h \
> +   ${bits_srcdir}/version.h \
> ${bits_srcdir}/string_view.tcc \
> ${bits_srcdir}/uniform_int_dist.h \
> ${bits_srcdir}/unique_ptr.h \
> @@ -1113,7 +1114,8 @@ allcreated = \
> ${host_builddir}/c++config.h \
> ${host_builddir}/largefile-config.h \
> ${thread_host_headers} \
> -   ${pch_build}
> +   ${pch_build} \
> +   ${bits_srcdir}/version.h
>
>  # Here are the rules for building the headers
>  all-local: ${allstamped} ${allcreated}
> @@ -1463,6 +1465,12 @@ ${pch3_output}: ${pch3_source} ${pch2_output}
> -mkdir -p ${pch3_output_builddir}
> $(CXX) $(PCHFLAGS) $(AM_CPPFLAGS) -O2 -g ${pch3_source} -o $@
>
> +# AutoGen .
> +${bits_srcdir}/version.h: ${bits_srcdir}/version.def \
> +   ${bits_srcdir}/version.tpl
> +   cd $(@D) && \
> +   autogen version.def
>

It looks like this will regenerate the bits/version.h file if it's older
than the definitions or the autogen template, right?

Generally we don't want to touch anything in the source tree as part of a
normal build. It's OK to do that when configured with
--enable-maintainer-mode (which nobody working on libstdc++ actually uses,
because it causes problems IME) or via a dedicated target which is not
built by default (e.g. doc/Makefile.am has the doc-html-docbook-regenerate
target, which isn't a prereq of any other targets so it's only run if
you explicitly request it).

The problem with modifying the source tree as part of a normal build is
that it might be on read-only media, and so the build will fail if this
target can't be updated. We would also want to add the version.h header to
the contrib/gcc_update script that updates the timestamps of generated
files, so that they are always newer than their prereqs.

Maybe the best option here is to assume that version.h is always up to
date, and add a custom target to regen it manually, which we can run after
editing the .def or .tpl files. What do you think?

My only other concern with this patch is that I don't speak lisp so the
Guile code in version.tpl is opaque and unmaintainable for me. That is
fixable though.


[RFC] range-op restructuring

2023-06-01 Thread Andrew MacLeod via Gcc-patches
With the addition of floating point ranges, we did a lot of additional 
class abstraction, then added a bunch more routines for floating point. 
We didn't know how it would look in the end, so we just marched forward 
and got it working.


Now that has settled down a bit, and before we go and add more kinds of 
ranges, I want to visit restructuring the files and provide a better 
dispatch to the range operators from a vrange.


We currently dispatch based on the type of the statement: int or 
float.  The line is blurred heavily when we have statements that have 
more than one kind of range, i.e.


int_value = (int) float_value
  vs
float_value = (float) int_value

Under the current regime, both kinds of casts have to go into the float 
table.. and this is going to get more complicated if we add more 
distinct kinds of ranges. With the current implementation, the floating 
point range operators don't even inherit from range_operator, they are 
their own kind of operator.   The ideal situation is to have a single 
unified range-operator class which has all the combinations, and they 
rest in a single table.  This simplifies numerous things, and avoids us 
having to classify anything in some arbitrary way. It also moves us back 
in the direction of the original vision I had for range-ops.


I've done an initial rough conversion so you can see what it looks like. 
I've attached a new range-op.h which shows class range_operator with all 
the virtual function combinations. The new dispatch mechanism buys us 
about 1% speedup in both VRP and in jump_threading.  The new mechanism 
also handles unsupported combinations of operands smoothly, simply 
returning false if an unsupported combination of parameters is 
invoked, which is what a default routine would do.
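
A minimal sketch of that "default returns false" idea, assuming heavily
simplified stand-ins: the irange and frange structs below are placeholders,
not GCC's classes, and range_operator_sketch / not_equal_sketch are
hypothetical names that drop the tree type and relation_trio arguments.

```cpp
// Stand-in range types; GCC's real irange and frange are far richer.
struct irange { long lo = 0, hi = 0; };
struct frange { double lo = 0.0, hi = 0.0; };

// Hypothetical unified base class: one virtual per operand combination,
// each defaulting to "unsupported".
class range_operator_sketch
{
public:
  virtual ~range_operator_sketch () = default;
  virtual bool fold_range (irange &, const irange &, const irange &) const
  { return false; }
  virtual bool fold_range (irange &, const frange &, const frange &) const
  { return false; }
  virtual bool fold_range (frange &, const frange &, const frange &) const
  { return false; }
};

// Helper for the sketch: boolean result range [lo, hi] for "op1 != op2".
template <typename R>
static irange
not_equal_result (const R &a, const R &b)
{
  irange r;
  if (a.hi < b.lo || b.hi < a.lo)
    r.lo = r.hi = 1;                       // disjoint: always true
  else if (a.lo == a.hi && b.lo == b.hi)
    r.lo = r.hi = 0;                       // equal singletons: always false
  else
    { r.lo = 0; r.hi = 1; }                // otherwise unknown
  return r;
}

// Overrides only the combinations that make sense for "!=".
class not_equal_sketch final : public range_operator_sketch
{
public:
  using range_operator_sketch::fold_range; // keep the other overloads visible
  bool fold_range (irange &r, const irange &a, const irange &b) const override
  { r = not_equal_result (a, b); return true; }
  bool fold_range (irange &r, const frange &a, const frange &b) const override
  { r = not_equal_result (a, b); return true; }
};
```

Calling the frange-result overload on a not_equal_sketch falls through to the
base class and reports false, with no per-type table lookup involved.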


As for conversion, let's take operator_not_equal as an example.  The 
end result in range-operator.h is:


class operator_not_equal : public range_operator
{
public:
  bool fold_range (irange &r, tree type, const irange &op1, const 
irange &op2, relation_trio = TRIO_VARYING) const final override;
  bool fold_range (irange &r, tree type, const frange &op1, const 
frange &op2, relation_trio = TRIO_VARYING) const final override;


  bool op1_range (irange &r, tree type, const irange &lhs, const irange 
&op2, relation_trio = TRIO_VARYING) const final override;
  bool op1_range (frange &r, tree type, const irange &lhs, const frange 
&op2, relation_trio = TRIO_VARYING) const final override;


  bool op2_range (frange &r, tree type, const irange &lhs, const frange 
&op1, relation_trio rel = TRIO_VARYING) const final override;
  bool op2_range (irange &r, tree type, const irange &lhs, const irange 
&op1, relation_trio = TRIO_VARYING) const final override;


  relation_kind op1_op2_relation (const irange &lhs) const;
  void update_bitmask (irange &r, const irange &lh, const irange &rh) 
const;

};
extern operator_not_equal rop_NE_EXPR;

When we add a new range type, such as pointers, you simply add the 
required prototypes, add new dispatch codes, and implement them.


This is going  to cause some churn but I am trying to keep it to a 
minimum.  I've been mucking about with it for a couple of weeks, and I 
was thinking to structure it something like this:


range-op.h and range-op.cc will have the base range_operator class, 
along with the range_op_handler class.  We move all the class headers, 
like the above operator_not_equal class, into "range-operator.h".


Where the code goes is the biggest struggle. Initially I was going to 
put it all in one file. This would be best as it allows us to co-locate 
all the code for various classes of routines. But that would already be 
about 8000 lines for int and float combined, and will only get larger 
with new range types. I also considered just including all the 
range-op-int.cc and range-op-float.cc files into range-op.cc when it 
compiles, but you still end up with a big compilation unit.  So this is 
what I'm thinking now:


We leave all the existing floating point code in range-op-float.cc, and 
then move all the existing integer code into a range-op-int.cc file (or 
I suppose we could even leave it in range-op.cc to avoid extra churn).  
The classes must move to  a header to be accessible from the various 
files which implement them.  This provides for minimal churn: a few 
deletes and renames, and that's it.  I've attached the diff which moves 
operator_not_equal to this form (once some other structuring is in place).


The plus is you can see in the header file range-operator.h exactly what 
is available for any opcode.  What's less than ideal is that some of the 
routines are in range-op.cc and some are in range-op-float.cc.  
Ultimately, I don't think that's such a big deal as those floating point 
routines often require common infrastructure, such as NaN querying, that 
integer things don't need.  When we add, say, a pointer range class, we 
would create range-op-pointer.cc and all the new stuff required for 
p

Re: [PATCH 2/2] libstdc++: Replace all manual FTM definitions and use

2023-06-01 Thread Jonathan Wakely via Gcc-patches
On Sat, 29 Apr 2023 at 11:24, Arsen Arsenović via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> libstdc++-v3/ChangeLog:
>
> * libsupc++/typeinfo: Switch to bits/version.h for
> __cpp_lib_constexpr_typeinfo.
>
>
Does this change have an impact on compilation speed?
With this change we'll be re-including bits/version.h multiple times in
most compilations, and unlike other headers the preprocessor can't optimize
away the second and subsequent times it's included, because the header
isn't idempotent.
It will only affect the preprocessing phase, which is a fraction of the
time taken by template instantiation and middle end optimizations, but I'd
like to know it's not *too* expensive before committing to this approach.



> @@ -234,9 +234,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>return __atomic_test_and_set (&_M_i, int(__m));
>  }
>
> -#if __cplusplus > 201703L
> -#define __cpp_lib_atomic_flag_test 201907L
> -
> +#ifdef __cpp_lib_atomic_flag_test
>  _GLIBCXX_ALWAYS_INLINE bool
>  test(memory_order __m = memory_order_seq_cst) const noexcept
>  {
>

This is more "structured" and maintainable than the current ad-hoc way we
deal with FTMs, but this seems like a readability/usability regression in
terms of being able to open the header and see "ah this feature is only
available for C++20 and up". Instead you can see it's available for the
specified FTM, but now you have to go and find where that's defined, and
that's not even defined in C++, it's in the version.def file. It's also
defined in bits/version.h, but that's a generated file and so is very
verbose and long.
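
For comparison, this is roughly what the FTM-style guard looks like from the
user's side.  It is only a sketch: flag_is_set is a hypothetical helper, and
the fallback branch is not equivalent to test() because it also sets the flag.

```cpp
#include <atomic>

// Hypothetical helper: report whether FLAG is set, using atomic_flag::test
// where the library advertises it through the feature test macro.
inline bool
flag_is_set (std::atomic_flag &flag)
{
#ifdef __cpp_lib_atomic_flag_test
  return flag.test ();
#else
  // Fallback for older modes; note that this also sets the flag.
  return flag.test_and_set ();
#endif
}
```

The guard names the feature rather than the language version, which is the
trade-off being discussed: precise, but the reader has to know where the
macro comes from.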


diff --git a/libstdc++-v3/include/bits/move_only_function.h
> b/libstdc++-v3/include/bits/move_only_function.h
> index 71d52074978..81d7d9f7c0a 100644
> --- a/libstdc++-v3/include/bits/move_only_function.h
> +++ b/libstdc++-v3/include/bits/move_only_function.h
> @@ -32,7 +32,10 @@
>
>  #pragma GCC system_header
>
> -#if __cplusplus > 202002L
> +#define __glibcxx_want_move_only_function
> +#include 
> +
> +#ifdef __cpp_lib_move_only_function
>

Here's another case where I think the __cplusplus > 202002L is more
discoverable.

Although maybe I'm biased, because I look at that and immediately see
"C++23 and up". Maybe the average user finds that less clear. Maybe the
average user doesn't need to look at this anyway, but I know *I* do it
fairly often.

I wonder if it would help if we kept a comment there with a (possibly
imprecise) hint about the conditions under which the feature is defined. So
in this case:

// Only defined for C++23
#ifdef __cpp_lib_move_only_function

That retains the info that's currently there, and is even more readable
than the __cplusplus check.

There's a risk that those comments would get out of step with reality,
which is one of the things this patch set aims to solve. But I think in
practice that's unlikely. std::move_only_function isn't suddenly going to
become available in C++20, or stop being available in C++23 and move to
C++26.

What do you think?


Re: [PATCH] Fortran: force error on bad KIND specifier [PR88552]

2023-06-01 Thread Mikael Morin

Hello,

On 01/06/2023 at 21:05, Harald Anlauf via Fortran wrote:

Dear all,

we sometimes silently accept wrong declarations with unbalanced
parentheses, as the PR and testcases therein show.

It appears that the fix is obvious: use the existing error paths in
gfc_match_kind_spec and error return from gfc_match_decl_type_spec.
I'm still posting it here in case I have missed something not so
obvious.

The patch regtests cleanly on x86_64-pc-linux-gnu.  OK for mainline?


It looks good, but...


Thanks,
Harald

From a30ff5af130c4d33c086fd136978d5f49cb8bde4 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 1 Jun 2023 20:56:11 +0200
Subject: [PATCH] Fortran: force error on bad KIND specifier [PR88552]

gcc/fortran/ChangeLog:

PR fortran/88552
* decl.cc (gfc_match_kind_spec): Use error path on missing right
parenthesis.
(gfc_match_decl_type_spec): Use error return when an error occurred
during matching a KIND specifier.

gcc/testsuite/ChangeLog:

PR fortran/88552
* gfortran.dg/pr88552.f90: New test.
---
 gcc/fortran/decl.cc   | 4 
 gcc/testsuite/gfortran.dg/pr88552.f90 | 6 ++
 2 files changed, 10 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/pr88552.f90

diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index 1de2b231242..deb20647fb9 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -3366,6 +3366,7 @@ close_brackets:
   else
gfc_error ("Missing right parenthesis at %C");
   m = MATCH_ERROR;
+  goto no_match;
 }
   else
  /* All tests passed.  */
@@ -4716,6 +4717,9 @@ get_kind:
   return MATCH_ERROR;
 }

+  if (m == MATCH_ERROR)
+return MATCH_ERROR;
+

... can you move this up to the place where m is set?
OK with that change.

Thanks


Re: [PATCH] Move std::search into algobase.h

2023-06-01 Thread François Dumont via Gcc-patches

It's of course not as easy as I thought.

I would never have detected this problem on my system because I'm 
missing omp.h.


I've implemented and added a:

// { dg-require-effective-target omp }

so that now those tests are UNRESOLVED rather than PASS.

Now I've installed OMP and am trying to rebuild the lib to reproduce the failure.

To be continued tomorrow...

On 01/06/2023 14:05, Jonathan Wakely wrote:



On Thu, 1 Jun 2023 at 12:52, Rainer Orth  
wrote:


Jonathan Wakely via Gcc-patches  writes:

> On Wed, 31 May 2023 at 18:39, François Dumont via Libstdc++ <
> libstd...@gcc.gnu.org > wrote:
>
>> libstdc++: Reduce  inclusion to 
>>
>>
>> Move the std::search definition from stl_algo.h to
stl_algobase.h and use
>> the latter in .
>>
>> For consistency also move std::__parallel::search and
associated helpers
>> from
>>  to  so that
>> std::__parallel::search
>> is accessible along with std::search.
>>
>> libstdc++-v3/ChangeLog:
>>
>>              * include/bits/stl_algo.h
>>              (std::__search, std::search(_FwdIt1, _FwdIt1, _FwdIt2,
>> _FwdIt2, _BinPred)): Move...
>>              * include/bits/stl_algobase.h: ...here.
>>              * include/std/functional: Replace 
include by
>> .
>>              * include/parallel/algo.h
(std::__parallel::search<_FIt1,
>> _FIt2, _BinaryPred>)
>> (std::__parallel::__search_switch<_FIt1, _FIt2,
>> _BinaryPred, _ItTag1, _ItTag2>):
>>              Move...
>>              * include/parallel/algobase.h: ...here.
>>              * include/std/functional: Remove  and
>> 
>>              includes. Include .
>>
>> Tested under Linux x86_64.
>>
>> Ok to commit ?
>>
>
> OK

This seems to have caused

+FAIL: 17_intro/headers/c++2011/parallel_mode.cc (test for excess
errors)
+FAIL: 17_intro/headers/c++2014/parallel_mode.cc (test for excess
errors)

on i386-pc-solaris2.11:


I think it affects all targets.


Excess errors:

/var/gcc/regression/master/11.4-gcc-gas/build/i386-pc-solaris2.11/libstdc++-v3/include/parallel/algobase.h:496:
error: '__search_template' is not a member of '__gnu_parallel';
did you mean '__find_template'?

        Rainer

-- 
-

Rainer Orth, Center for Biotechnology, Bielefeld University



Re: [PATCH] Fix PR 110042: ifcvt regression due to paradoxical subregs

2023-06-01 Thread Andrew Pinski via Gcc-patches
On Thu, Jun 1, 2023 at 7:36 AM Jeff Law  wrote:
>
>
>
> On 5/31/23 15:22, Andrew Pinski wrote:
> > On Wed, May 31, 2023 at 12:29 AM Richard Biener via Gcc-patches
> >  wrote:
> >>
> >> On Wed, May 31, 2023 at 6:34 AM Andrew Pinski via Gcc-patches
> >>  wrote:
> >>>
> >>> After r14-1014-gc5df248509b489364c573e8, GCC started to emit
> >>> directly a zero_extract for `(t1&0x8)!=0`. This introduced
> >>> a small regression where ifcvt would not do the ifconversion
> >>> as there is now a paradoxical subreg in the dest which
> >>> was being rejected. Since paradoxical subreg set the whole
> >>> register, we can treat it as the same as a reg in the two places.
> >>>
> >>> OK? Bootstrapped and tested on x86_64-linux-gnu and aarch64-linux-gnu.
> >>
> >> OK I guess.   I vaguely remember SUBREG_PROMOTED_UNSIGNED_P
> >> applies to non-paradoxical subregs but I might be swapping things - maybe
> >> you remember better and whether that would cause any issues here?
> >
> > So I looked into the history of the code in ifcvt.cc, this code was
> > added with r6-3071-ge65bf4e814d38c to accept more complex bb
> > (https://inbox.sourceware.org/gcc-patches/559fbb13.80...@arm.com/).
> > The thread where we start talking about subregs is located with Jeff's
> > email starting here:
> > https://inbox.sourceware.org/gcc-patches/55bbafac.5020...@redhat.com/ .
> >
> > Jeff,
> >I know Richard already approved this patch but could you provide a
> > second eye as you were involved reviewing the original code here and I
> > want to make sure I understood the code in a reasonable fashion?
> It's been a while.   I think my original concerns were with RMW operands
> and making sure we tracked all sub-components of such operands correctly.
>
> That was based on the original version, but that version looks like it
> should have been OK after reviewing the details of reg_referenced_p.
> It's since moved to DF.
>
> So the only worry I immediately see is whether or not DF is giving uses
> and sets of sub-components of a RMW operand or multi-hard register modes.

So I looked into the code some more, for bb_valid_for_noce_process_p,
we are checking to make sure if a reg that was being set is not used
by the comparison (and it works using reg_overlap_mentioned_p that
satisfies the multi-hard register mode use case and satisfies the RMW
case as we are just checking to make sure it does not do that to the
registers of the comparison).

For bbs_ok_for_cmove_arith, having the check as paradoxical_subreg_p
(rather than a plain SUBREG) satisfies RMW operand case (as it is not
a RMW operation) and DF already handles multi-hard register when using
FOR_EACH_INSN_DEF/FOR_EACH_INSN_USE (and a paradoxical hard register
is an invalid RTL in the first place).

I wrote this up more to convince myself this is safe and for future
references for others when looking into the code to understand it.

Thanks,
Andrew

>
> Jeff


Re: [PATCH] Fortran: force error on bad KIND specifier [PR88552]

2023-06-01 Thread Harald Anlauf via Gcc-patches

Hi Mikael,

On 01.06.23 at 22:33, Mikael Morin wrote:

Hello,

On 01/06/2023 at 21:05, Harald Anlauf via Fortran wrote:

Dear all,

we sometimes silently accept wrong declarations with unbalanced
parentheses, as the PR and testcases therein show.

It appears that the fix is obvious: use the existing error paths in
gfc_match_kind_spec and error return from gfc_match_decl_type_spec.
I'm still posting it here in case I have missed something not so
obvious.

The patch regtests cleanly on x86_64-pc-linux-gnu.  OK for mainline?


It looks good, but...


Thanks,
Harald

From a30ff5af130c4d33c086fd136978d5f49cb8bde4 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 1 Jun 2023 20:56:11 +0200
Subject: [PATCH] Fortran: force error on bad KIND specifier [PR88552]

gcc/fortran/ChangeLog:

PR fortran/88552
* decl.cc (gfc_match_kind_spec): Use error path on missing right
parenthesis.
(gfc_match_decl_type_spec): Use error return when an error occurred
during matching a KIND specifier.

gcc/testsuite/ChangeLog:

PR fortran/88552
* gfortran.dg/pr88552.f90: New test.
---
 gcc/fortran/decl.cc   | 4 
 gcc/testsuite/gfortran.dg/pr88552.f90 | 6 ++
 2 files changed, 10 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/pr88552.f90

diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index 1de2b231242..deb20647fb9 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -3366,6 +3366,7 @@ close_brackets:
   else
 gfc_error ("Missing right parenthesis at %C");
   m = MATCH_ERROR;
+  goto no_match;
 }
   else
  /* All tests passed.  */
@@ -4716,6 +4717,9 @@ get_kind:
   return MATCH_ERROR;
 }

+  if (m == MATCH_ERROR)
+    return MATCH_ERROR;
+

... can you move this up to the place where m is set?
OK with that change.


I was afraid that this would regress on the existing testcases
pr91660_[12].f90 that depend on an error message emitted just
before that hunk, but this turned out not to happen.

Adjusted version committed as:
r14-1477-gff8f45d20f9ea6acc99442ad29212d177f58e8fe .


Thanks



Thanks for the review!

Harald




[PATCH] c++: fix up caching of level lowered ttps

2023-06-01 Thread Patrick Palka via Gcc-patches
Due to level/depth mismatches between the template parameters of a level
lowered ttp and the original ttp, the ttp comparison check added by
r14-418-g0bc2a1dc327af9 never actually holds outside of erroneous cases.
Moreover, it'd be good to cache the overall TEMPLATE_TEMPLATE_PARM
instead of just the corresponding TEMPLATE_PARM_INDEX.

It's tricky to cache all level lowered ttps since the result of level
lowering may depend on more than just the depth of the arguments, e.g.
for TT in

  template<class T>
  struct A
  {
    template<template<T> class TT>
    void f();
  }

the substitution T=int yields a different level-lowered ttp than T=char.
But these kinds of ttps seem to be rare in practice, and "simple" ttps
that don't depend on outer template parameters are easy enough to
cache like so.  Unfortunately, this means we're back to expecting a
duplicate error in nontype12.C again since the ttp in question is
not "simple" so caching of the (erroneous) lowered ttp doesn't happen.
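
To make the distinction concrete, here is a small self-contained sketch of a
ttp that depends on an outer template parameter next to a "simple" one.  The
names int_tmpl, char_tmpl, type_tmpl and struct B are made up for
illustration; only the shape of A follows the example above.

```cpp
template<int>   struct int_tmpl  {};
template<char>  struct char_tmpl {};
template<class> struct type_tmpl {};

template<class T>
struct A
{
  // TT's own parameter list mentions the outer T, so substituting T=int and
  // T=char produces different template template parameters.
  template<template<T> class TT>
  static constexpr int f () { return 1; }
};

template<class T>
struct B
{
  // TT's parameter list does not mention T: "simple" in the patch's terms,
  // so the lowered ttp depends only on the depth of the arguments.
  template<template<class> class TT>
  static constexpr int g () { return 2; }
};

// A<int>::f only accepts templates taking an int, A<char>::f a char ...
static_assert (A<int>::f<int_tmpl> () == 1, "");
static_assert (A<char>::f<char_tmpl> () == 1, "");
// ... while B<T>::g accepts the same argument for every T.
static_assert (B<int>::g<type_tmpl> () == 2, "");
static_assert (B<char>::g<type_tmpl> () == 2, "");
```

Only the B-style parameter is the kind the patch caches.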

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  This reduces memory usage of range-v3's zip.cpp by 1%.

gcc/cp/ChangeLog:

* cp-tree.h (TEMPLATE_PARM_DESCENDANTS): Harden.
(TEMPLATE_TYPE_DESCENDANTS): Define.
(TEMPLATE_TEMPLATE_PARM_SIMPLE_P): Define.
* pt.cc (reduce_template_parm_level): Revert
r14-418-g0bc2a1dc327af9 change.
(process_template_parm): Set TEMPLATE_TEMPLATE_PARM_SIMPLE_P
appropriately.
(uses_outer_template_parms): Determine the outer depth of
a template template parm without relying on DECL_CONTEXT.
(tsubst) : Cache lowering a
simple template template parm.  Consistently use 'code'.

gcc/testsuite/ChangeLog:

* g++.dg/template/nontype12.C: Expect a duplicate error again.
---
 gcc/cp/cp-tree.h  | 10 +-
 gcc/cp/pt.cc  | 37 +--
 gcc/testsuite/g++.dg/template/nontype12.C |  3 +-
 3 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index f08e5630a5c..cd762667bec 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -525,6 +525,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
5: CLASS_TYPE_P (in RECORD_TYPE and UNION_TYPE)
   ENUM_FIXED_UNDERLYING_TYPE_P (in ENUMERAL_TYPE)
   AUTO_IS_DECLTYPE (in TEMPLATE_TYPE_PARM)
+  TEMPLATE_TEMPLATE_PARM_SIMPLE_P (in TEMPLATE_TEMPLATE_PARM)
6: TYPE_DEPENDENT_P_VALID
 
Usage of DECL_LANG_FLAG_?:
@@ -5991,7 +5992,7 @@ enum overload_flags { NO_SPECIAL = 0, DTOR_FLAG, 
TYPENAME_FLAG };
((template_parm_index*)TEMPLATE_PARM_INDEX_CHECK (NODE))
 #define TEMPLATE_PARM_IDX(NODE) (TEMPLATE_PARM_INDEX_CAST (NODE)->index)
 #define TEMPLATE_PARM_LEVEL(NODE) (TEMPLATE_PARM_INDEX_CAST (NODE)->level)
-#define TEMPLATE_PARM_DESCENDANTS(NODE) (TREE_CHAIN (NODE))
+#define TEMPLATE_PARM_DESCENDANTS(NODE) (TREE_CHAIN (TEMPLATE_PARM_INDEX_CHECK 
(NODE)))
 #define TEMPLATE_PARM_ORIG_LEVEL(NODE) (TEMPLATE_PARM_INDEX_CAST 
(NODE)->orig_level)
 #define TEMPLATE_PARM_DECL(NODE) (TEMPLATE_PARM_INDEX_CAST (NODE)->decl)
 #define TEMPLATE_PARM_PARAMETER_PACK(NODE) \
@@ -6009,6 +6010,8 @@ enum overload_flags { NO_SPECIAL = 0, DTOR_FLAG, 
TYPENAME_FLAG };
   (TEMPLATE_PARM_LEVEL (TEMPLATE_TYPE_PARM_INDEX (NODE)))
 #define TEMPLATE_TYPE_ORIG_LEVEL(NODE) \
   (TEMPLATE_PARM_ORIG_LEVEL (TEMPLATE_TYPE_PARM_INDEX (NODE)))
+#define TEMPLATE_TYPE_DESCENDANTS(NODE) \
+  (TEMPLATE_PARM_DESCENDANTS (TEMPLATE_TYPE_PARM_INDEX (NODE)))
 #define TEMPLATE_TYPE_DECL(NODE) \
   (TEMPLATE_PARM_DECL (TEMPLATE_TYPE_PARM_INDEX (NODE)))
 #define TEMPLATE_TYPE_PARAMETER_PACK(NODE) \
@@ -6018,6 +6021,11 @@ enum overload_flags { NO_SPECIAL = 0, DTOR_FLAG, 
TYPENAME_FLAG };
 #define CLASS_PLACEHOLDER_TEMPLATE(NODE) \
   (DECL_INITIAL (TYPE_NAME (TEMPLATE_TYPE_PARM_CHECK (NODE
 
+/* True iff the template parameters of this TEMPLATE_TEMPLATE_PARM don't
+   depend on outer template parameters.  */
+#define TEMPLATE_TEMPLATE_PARM_SIMPLE_P(NODE) \
+  (TYPE_LANG_FLAG_5 (TEMPLATE_TEMPLATE_PARM_CHECK (NODE)))
+
 /* Contexts in which auto deduction occurs. These flags are
used to control diagnostics in do_auto_deduction.  */
 
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1a32f10b22b..15128cf3c7c 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -219,6 +219,7 @@ static tree enclosing_instantiation_of (tree tctx);
 static void instantiate_body (tree pattern, tree args, tree d, bool nested);
 static tree maybe_dependent_member_ref (tree, tree, tsubst_flags_t, tree);
 static void mark_template_arguments_used (tree, tree);
+static bool uses_outer_template_parms (tree);
 
 /* Make the current scope suitable for access checking when we are
processing T.  T can be FUNCTION_DECL for instantiated function
@@ -4554,12 +4555,7 @@ reduce_template_parm_level (tree index, tree type, int 
levels, tree args,
   if (TEMPLATE_PARM_DESCENDANTS (index) == NULL_TREE
   || (TEMPLATE_PARM_LEVEL 

Re: [PATCH] Move std::search into algobase.h

2023-06-01 Thread Jonathan Wakely via Gcc-patches
On Thu, 1 Jun 2023, 21:37 François Dumont via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> It's of course not as easy as I thought.
>
> I would never have detected this problem on my system because I'm
> missing omp.h.
>
> I've implemented and added a:
>
> // { dg-require-effective-target omp }
>
> so that now those tests are UNRESOLVED rather than PASS.
>
> Now I've installed OMP and am trying to rebuild the lib to reproduce the failure.
>

You shouldn't need to install anything, just build gcc and don't configure
it with --disable-libgomp




> To be continued tomorrow...
>
> On 01/06/2023 14:05, Jonathan Wakely wrote:
> >
> >
> > On Thu, 1 Jun 2023 at 12:52, Rainer Orth 
> > wrote:
> >
> > Jonathan Wakely via Gcc-patches  writes:
> >
> > > On Wed, 31 May 2023 at 18:39, François Dumont via Libstdc++ <
> > > libstd...@gcc.gnu.org > wrote:
> > >
> > >> libstdc++: Reduce  inclusion to 
> > >>
> > >>
> > >> Move the std::search definition from stl_algo.h to
> > stl_algobase.h and use
> > >> the later in .
> > >>
> > >> For consistency also move std::__parallel::search and
> > associated helpers
> > >> from
> > >>  to  so that
> > >> std::__parallel::search
> > >> is accessible along with std::search.
> > >>
> > >> libstdc++-v3/ChangeLog:
> > >>
> > >>  * include/bits/stl_algo.h
> > >>  (std::__search, std::search(_FwdIt1, _FwdIt1,
> _FwdIt2,
> > >> _FwdIt2, _BinPred)): Move...
> > >>  * include/bits/stl_algobase.h: ...here.
> > >>  * include/std/functional: Replace 
> > include by
> > >> .
> > >>  * include/parallel/algo.h
> > (std::__parallel::search<_FIt1,
> > >> _FIt2, _BinaryPred>)
> > >> (std::__parallel::__search_switch<_FIt1, _FIt2,
> > >> _BinaryPred, _ItTag1, _ItTag2>):
> > >>  Move...
> > >>  * include/parallel/algobase.h: ...here.
> > >>  * include/std/functional: Remove 
> and
> > >> 
> > >>  includes. Include .
> > >>
> > >> Tested under Linux x86_64.
> > >>
> > >> Ok to commit ?
> > >>
> > >
> > > OK
> >
> > This seems to have caused
> >
> > +FAIL: 17_intro/headers/c++2011/parallel_mode.cc (test for excess
> > errors)
> > +FAIL: 17_intro/headers/c++2014/parallel_mode.cc (test for excess
> > errors)
> >
> > on i386-pc-solaris2.11:
> >
> >
> > I think it affects all targets.
> >
> >
> > Excess errors:
> >
>  
> /var/gcc/regression/master/11.4-gcc-gas/build/i386-pc-solaris2.11/libstdc++-v3/include/parallel/algobase.h:496:
> > error: '__search_template' is not a member of '__gnu_parallel';
> > did you mean '__find_template'?
> >
> > Rainer
> >
> > --
> >
>  -
> > Rainer Orth, Center for Biotechnology, Bielefeld University
> >
>


[PATCH] RISC-V: Add _mu C++ overloaded intrinsics for load && viota && vid

2023-06-01 Thread juzhe . zhong
From: Juzhe-Zhong 

Based on these:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/232
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/233

Add _mu C++ overloaded intrinsics for load && viota && vid.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Add _mu overloaded 
intrinsics.

---
 gcc/config/riscv/riscv-vector-builtins-bases.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index a8113f6602b..498c6ba042e 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -164,7 +164,7 @@ public:
   {
 if (STORE_P || LST_TYPE == LST_INDEXED)
   return true;
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   rtx expand (function_expander &e) const override
@@ -963,7 +963,7 @@ public:
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
 return pred == PRED_TYPE_tu || pred == PRED_TYPE_tum
-  || pred == PRED_TYPE_tumu;
+  || pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu;
   }
 
   rtx expand (function_expander &e) const override
@@ -979,7 +979,7 @@ public:
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
 return pred == PRED_TYPE_tu || pred == PRED_TYPE_tum
-  || pred == PRED_TYPE_tumu;
+  || pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu;
   }
 
   rtx expand (function_expander &e) const override
@@ -1749,7 +1749,7 @@ public:
 
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   rtx expand (function_expander &e) const override
@@ -1794,7 +1794,7 @@ public:
 
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   rtx expand (function_expander &e) const override
-- 
2.36.1



rs6000: Fix expected counts powerpc/p9-vec-length-full

2023-06-01 Thread Carl Love via Gcc-patches


GCC maintainers:

The following patch updates the expected instruction counts in four
tests.  The counts in all of the tests changed with commit
f574e2dfae79055f16d0c63cc12df24815d8ead6.  

The updated counts have been verified on both Power 9 and Power 10.

Please let me know if this patch is acceptable for mainline.  Thanks.

  Carl 


rs6000: Fix expected counts powerpc/p9-vec-length-full tests

The counts for instructions lxvl and stxvl in tests:

  p9-vec-length-full-1.c
  p9-vec-length-full-2.c
  p9-vec-length-full-6.c
  p9-vec-length-full-7.c

changed with commit:

   commit f574e2dfae79055f16d0c63cc12df24815d8ead6
   Author: Ju-Zhe Zhong 
   Date:   Thu May 25 22:42:35 2023 +0800

 VECT: Add decrement IV iteration loop control by variable amount support

 This patch is supporting decrement IV by following the flow designed by
 Richard:
   ...

The expected counts for lxvl changed from 20 to 40 and the counts for stxvl
changed from 10 to 20 in the first three tests.  The number of stxvl
instructions changed from 12 to 20 in p9-vec-length-full-7.c.  This
patch updates the number of expected instructions in the four tests.

The counts have been verified on Power 9 and Power 10.
---
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-1.c | 4 ++--
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-2.c | 4 ++--
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-6.c | 4 ++--
 gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-7.c | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-1.c 
b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-1.c
index f01f1c54fa5..5e4f34421d3 100644
--- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-1.c
@@ -12,5 +12,5 @@
 /* { dg-final { scan-assembler-not   {\mstxv\M} } } */
 /* { dg-final { scan-assembler-not   {\mlxvx\M} } } */
 /* { dg-final { scan-assembler-not   {\mstxvx\M} } } */
-/* { dg-final { scan-assembler-times {\mlxvl\M} 20 } } */
-/* { dg-final { scan-assembler-times {\mstxvl\M} 10 } } */
+/* { dg-final { scan-assembler-times {\mlxvl\M} 40 } } */
+/* { dg-final { scan-assembler-times {\mstxvl\M} 20 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-2.c 
b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-2.c
index f546e97fa7d..c7d927382c3 100644
--- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-2.c
@@ -12,5 +12,5 @@
 /* { dg-final { scan-assembler-not   {\mstxv\M} } } */
 /* { dg-final { scan-assembler-not   {\mlxvx\M} } } */
 /* { dg-final { scan-assembler-not   {\mstxvx\M} } } */
-/* { dg-final { scan-assembler-times {\mlxvl\M} 20 } } */
-/* { dg-final { scan-assembler-times {\mstxvl\M} 10 } } */
+/* { dg-final { scan-assembler-times {\mlxvl\M} 40 } } */
+/* { dg-final { scan-assembler-times {\mstxvl\M} 20 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-6.c 
b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-6.c
index 65ddf2b098a..f3be3842c62 100644
--- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-6.c
+++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-6.c
@@ -11,5 +11,5 @@
 /* It can use normal vector load for constant vector load.  */
 /* { dg-final { scan-assembler-times {\mstxvx?\M} 6 } } */
 /* 64bit/32bit pairs won't use partial vectors.  */
-/* { dg-final { scan-assembler-times {\mlxvl\M} 10 } } */
-/* { dg-final { scan-assembler-times {\mstxvl\M} 10 } } */
+/* { dg-final { scan-assembler-times {\mlxvl\M} 20 } } */
+/* { dg-final { scan-assembler-times {\mstxvl\M} 20 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-7.c 
b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-7.c
index e0e51d9a972..da086f1826a 100644
--- a/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-7.c
+++ b/gcc/testsuite/gcc.target/powerpc/p9-vec-length-full-7.c
@@ -12,4 +12,4 @@
 
 /* Each type has one stxvl excepting for int8 and uint8, that have two due to
rtl pass bbro duplicating the block which has one stxvl.  */
-/* { dg-final { scan-assembler-times {\mstxvl\M} 12 } } */
+/* { dg-final { scan-assembler-times {\mstxvl\M} 20 } } */
-- 
2.37.2




[PATCH] RISC-V: Add __RISCV_ prefix to VXRM and FRM enum

2023-06-01 Thread juzhe . zhong
From: Juzhe-Zhong 

According to the doc:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222/files
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/226

Add __RISCV_ prefix to VXRM and FRM enum.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc (DEF_RVV_VXRM_ENUM): Add 
__RISCV_ prefix.
(DEF_RVV_FRM_ENUM): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/frm-1.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-1.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-10.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-11.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-12.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-6.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-7.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-8.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-9.c: Ditto.

---
 gcc/config/riscv/riscv-vector-builtins.cc |  8 
 gcc/testsuite/gcc.target/riscv/rvv/base/frm-1.c   | 10 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c  |  8 
 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-10.c |  8 
 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-11.c |  4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-12.c |  4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-6.c  |  4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-7.c  |  4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-8.c  |  4 ++--
 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-9.c  |  8 
 10 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 43bf6d8f262..9e6dae98a6d 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4026,11 +4026,11 @@ register_vxrm ()
 {
   auto_vec values;
 #define DEF_RVV_VXRM_ENUM(NAME, VALUE) 
 \
-  values.quick_push (string_int_pair ("VXRM_" #NAME, VALUE));
+  values.quick_push (string_int_pair ("__RISCV_VXRM_" #NAME, VALUE));
 #include "riscv-vector-builtins.def"
 #undef DEF_RVV_VXRM_ENUM
 
-  lang_hooks.types.simulate_enum_decl (input_location, "RVV_VXRM", &values);
+  lang_hooks.types.simulate_enum_decl (input_location, "__RISCV_VXRM", 
&values);
 }
 
 /* Register the frm enum.  */
@@ -4039,11 +4039,11 @@ register_frm ()
 {
   auto_vec values;
 #define DEF_RVV_FRM_ENUM(NAME, VALUE)  
\
-  values.quick_push (string_int_pair ("FRM_" #NAME, VALUE));
+  values.quick_push (string_int_pair ("__RISCV_FRM_" #NAME, VALUE));
 #include "riscv-vector-builtins.def"
 #undef DEF_RVV_FRM_ENUM
 
-  lang_hooks.types.simulate_enum_decl (input_location, "RVV_FRM", &values);
+  lang_hooks.types.simulate_enum_decl (input_location, "__RISCV_FRM", &values);
 }
 
 /* Implement #pragma riscv intrinsic vector.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/frm-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/frm-1.c
index f5635fb959e..ff19c8bc089 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/frm-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/frm-1.c
@@ -5,27 +5,27 @@
 
 size_t f0 ()
 {
-  return FRM_RNE;
+  return __RISCV_FRM_RNE;
 }
 
 size_t f1 ()
 {
-  return FRM_RTZ;
+  return __RISCV_FRM_RTZ;
 }
 
 size_t f2 ()
 {
-  return FRM_RDN;
+  return __RISCV_FRM_RDN;
 }
 
 size_t f3 ()
 {
-  return FRM_RUP;
+  return __RISCV_FRM_RUP;
 }
 
 size_t f4 ()
 {
-  return FRM_RMM;
+  return __RISCV_FRM_RMM;
 }
 
 /* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*0} 1} } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c
index 0d364787ad0..b0ed27b0520 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c
@@ -5,22 +5,22 @@
 
 size_t f0 ()
 {
-  return VXRM_RNU;
+  return __RISCV_VXRM_RNU;
 }
 
 size_t f1 ()
 {
-  return VXRM_RNE;
+  return __RISCV_VXRM_RNE;
 }
 
 size_t f2 ()
 {
-  return VXRM_RDN;
+  return __RISCV_VXRM_RDN;
 }
 
 size_t f3 ()
 {
-  return VXRM_ROD;
+  return __RISCV_VXRM_ROD;
 }
 
 /* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*0} 1} } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-10.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-10.c
index a707aa1645e..3c7872bb73d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-10.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-10.c
@@ -8,16 +8,16 @@ void f (void * in, void *out, int32_t x, int n, int m)
   for (int i = 0; i < n; i++) {
 vint32m1_t v = __riscv_vle32_v_i32m1 (in + i, 4);
 vint32m1_t v2 = __riscv_vle32_v_i32m1_tu (v, in + 100 + i, 4);
-vint32m1_t v3 = __riscv_vaadd_vx_i32m1 (v2, 0, VXRM_RDN, 4);
-v3 = __riscv_vaadd_vx_i32m1 (v3, 3, VXRM_RDN, 4);
+vint32m1_t v3 = __riscv_vaadd_vx_i32m1 (v2, 0, __RISCV_VXRM_RDN, 4);
+v3 = __riscv_vaadd_vx_i32m1 (v3, 3, __RISCV_VXRM_RDN, 4);
 __riscv_vse32_v_i32m1 (out + 100 + i, v3, 4

Re: [PATCH] RISC-V: Add __RISCV_ prefix to VXRM and FRM enum

2023-06-01 Thread Jeff Law via Gcc-patches




On 6/1/23 17:19, juzhe.zh...@rivai.ai wrote:

From: Juzhe-Zhong 

According to doc:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222/files
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/226

Add __RISCV_ prefix to VXRM and FRM enum.

gcc/ChangeLog:

 * config/riscv/riscv-vector-builtins.cc (DEF_RVV_VXRM_ENUM): Add 
__RISCV_ prefix.
 (DEF_RVV_FRM_ENUM): Ditto.

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/rvv/base/frm-1.c: Ditto.
 * gcc.target/riscv/rvv/base/vxrm-1.c: Ditto.
 * gcc.target/riscv/rvv/base/vxrm-10.c: Ditto.
 * gcc.target/riscv/rvv/base/vxrm-11.c: Ditto.
 * gcc.target/riscv/rvv/base/vxrm-12.c: Ditto.
 * gcc.target/riscv/rvv/base/vxrm-6.c: Ditto.
 * gcc.target/riscv/rvv/base/vxrm-7.c: Ditto.
 * gcc.target/riscv/rvv/base/vxrm-8.c: Ditto.
 * gcc.target/riscv/rvv/base/vxrm-9.c: Ditto.

OK
jeff


[PATCH] i386: Add missing vector truncate patterns [PR92658].

2023-06-01 Thread liuhongt via Gcc-patches
Add missing insn patterns for v2si -> v2hi/v2qi and v2hi-> v2qi vector
truncate.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR target/92658
* config/i386/mmx.md (truncv2hiv2qi2): New define_insn.
(truncv2si2): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr92658-avx512bw-trunc-2.c: New test.
---
 gcc/config/i386/mmx.md| 21 +++
 .../i386/pr92658-avx512bw-trunc-2.c   | 27 +++
 2 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr92658-avx512bw-trunc-2.c

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index dbcb850ffde..bb45098f797 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -3667,6 +3667,27 @@ (define_expand "v2qiv2hi2"
   DONE;
 })
 
+(define_insn "truncv2hiv2qi2"
+  [(set (match_operand:V2QI 0 "register_operand" "=v")
+   (truncate:V2QI
+ (match_operand:V2HI 1 "register_operand" "v")))]
+  "TARGET_AVX512VL && TARGET_AVX512BW"
+  "vpmovwb\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "TI")])
+
+(define_mode_iterator V2QI_V2HI [V2QI V2HI])
+(define_insn "truncv2si2"
+  [(set (match_operand:V2QI_V2HI 0 "register_operand" "=v")
+   (truncate:V2QI_V2HI
+ (match_operand:V2SI 1 "register_operand" "v")))]
+  "TARGET_AVX512VL && TARGET_MMX_WITH_SSE"
+  "vpmovd\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "TI")])
+
 ;; Pack/unpack vector modes
 (define_mode_attr mmxpackmode
   [(V4HI "V8QI") (V2SI "V4HI")])
diff --git a/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-trunc-2.c 
b/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-trunc-2.c
new file mode 100644
index 000..2f5b7dc5668
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr92658-avx512bw-trunc-2.c
@@ -0,0 +1,27 @@
+/* PR target/92658 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512bw -mavx512vl" } */
+/* { dg-final { scan-assembler-times "vpmovwb" 1 } } */
+/* { dg-final { scan-assembler-times "vpmovdb" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "vpmovdw" 1 { target { ! ia32 } } } } */
+
+void
+foo (int* __restrict a, char* b)
+{
+b[0] = a[0];
+b[1] = a[1];
+}
+
+void
+foo2 (short* __restrict a, char* b)
+{
+b[0] = a[0];
+b[1] = a[1];
+}
+
+void
+foo3 (int* __restrict a, short* b)
+{
+b[0] = a[0];
+b[1] = a[1];
+}
-- 
2.39.1.388.g2fc9e9ca3c



[PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-06-01 Thread liuhongt via Gcc-patches
We already use an intermediate type in the WIDEN case, but not for NONE;
this patch extends that to NONE.

I didn't do that in pattern recog since we need to know whether the
stmt belongs to any slp_node to decide the vectype, and the related optabs
are checked according to vectype_in and vectype_out. For the non-SLP case,
vec_pack/unpack are always used when the lhs has a different size from the
rhs; for the SLP case, sometimes vec_pack/unpack is used and sometimes
direct conversion is used.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR target/110018
* tree-vect-stmts.cc (vectorizable_conversion): Use
intermiediate integer type for float_expr/fix_trunc_expr when
direct optab is not existed.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr110018-1.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr110018-1.c | 94 ++
 gcc/tree-vect-stmts.cc | 56 -
 2 files changed, 149 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr110018-1.c

diff --git a/gcc/testsuite/gcc.target/i386/pr110018-1.c 
b/gcc/testsuite/gcc.target/i386/pr110018-1.c
new file mode 100644
index 000..b1baffd7af1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110018-1.c
@@ -0,0 +1,94 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512fp16 -mavx512vl -O2 -mavx512dq" } */
+/* { dg-final { scan-assembler-times {(?n)vcvttp[dsh]2[dqw]} 5 } } */
+/* { dg-final { scan-assembler-times {(?n)vcvt[dqw]*2p[dsh]} 5 } } */
+
+void
+foo (double* __restrict a, char* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+}
+
+void
+foo1 (float* __restrict a, char* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+}
+
+void
+foo2 (_Float16* __restrict a, char* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+  a[4] = b[4];
+  a[5] = b[5];
+  a[6] = b[6];
+  a[7] = b[7];
+}
+
+void
+foo3 (double* __restrict a, short* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+}
+
+void
+foo4 (float* __restrict a, char* b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+}
+
+void
+foo5 (double* __restrict b, char* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+}
+
+void
+foo6 (float* __restrict b, char* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+}
+
+void
+foo7 (_Float16* __restrict b, char* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+  a[4] = b[4];
+  a[5] = b[5];
+  a[6] = b[6];
+  a[7] = b[7];
+}
+
+void
+foo8 (double* __restrict b, short* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+}
+
+void
+foo9 (float* __restrict b, char* a)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index bd3b07a3aa1..1118c89686d 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5162,6 +5162,49 @@ vectorizable_conversion (vec_info *vinfo,
return false;
   if (supportable_convert_operation (code, vectype_out, vectype_in, 
&code1))
break;
+  if ((code == FLOAT_EXPR
+  && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
+ || (code == FIX_TRUNC_EXPR
+ && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)))
+   {
+ bool float_expr_p = code == FLOAT_EXPR;
+ scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
+ fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode);
+ code1 = float_expr_p ? code : NOP_EXPR;
+ codecvt1 = float_expr_p ? NOP_EXPR : code;
+ FOR_EACH_2XWIDER_MODE (rhs_mode_iter, imode)
+   {
+ imode = rhs_mode_iter.require ();
+ if (GET_MODE_SIZE (imode) > fltsz)
+   break;
+
+ cvt_type
+   = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode),
+ 0);
+ cvt_type = get_vectype_for_scalar_type (vinfo, cvt_type,
+ slp_node);
+ /* This should only happened for SLP as long as loop vectorizer
+only supports same-sized vector.  */
+ if (cvt_type == NULL_TREE
+ || maybe_ne (TYPE_VECTOR_SUBPARTS (cvt_type), nunits_in)
+ || !supportable_convert_operation (code1, vectype_out,
+cvt_type, &code1)
+ || !supportable_convert_operation (codecvt1, cvt_type,
+vectype_in, &codecvt1))
+   continue;
+
+ found_mode = true;
+ break;
+   }
+
+ if (found_mode)
+   {
+ multi_step_cvt++;
+ interm_types.safe_push (cvt_type);
+ cvt_type = NULL_TREE;
+ break;
+   }
+   }
   /* FALLTHRU */
 unsupported:
   if (dump_enabled_p ())
@@ -5381,7 +5424,18 @@ vectorizable_conversion (vec_info *v

RE: [PATCH V2] RISC-V: Support RVV permutation auto-vectorization

2023-06-01 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Friday, June 2, 2023 2:49 AM
To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; kito.ch...@sifive.com; pal...@dabbelt.com; 
pal...@rivosinc.com; rdapp@gmail.com
Subject: Re: [PATCH V2] RISC-V: Support RVV permutation auto-vectorization



On 5/31/23 20:36, juzhe.zh...@rivai.ai wrote:
> From: Juzhe-Zhong 
> 
> This patch supports vector permutation for VLS only by vec_perm pattern.
> We will support TARGET_VECTORIZE_VEC_PERM_CONST to support VLA 
> permutation in the future.
> 
> Fixed following comments from Robin.
> Ok for trunk?
> 
> gcc/ChangeLog:
> 
>  * config/riscv/autovec.md (vec_perm): New pattern.
>  * config/riscv/predicates.md (vector_perm_operand): New predicate.
>  * config/riscv/riscv-protos.h (enum insn_type): New enum.
>  (expand_vec_perm): New function.
>  * config/riscv/riscv-v.cc (const_vec_all_in_range_p): Ditto.
>  (gen_const_vector_dup): Ditto.
>  (emit_vlmax_gather_insn): Ditto.
>  (emit_vlmax_masked_gather_mu_insn): Ditto.
>  (expand_vec_perm): Ditto.
OK.
jeff


RE: [PATCH] RISC-V: Add vwadd.wv/vwsub.wv auto-vectorization lowering optimization

2023-06-01 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Friday, June 2, 2023 2:52 AM
To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; kito.ch...@sifive.com; pal...@dabbelt.com; 
pal...@rivosinc.com; rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Add vwadd.wv/vwsub.wv auto-vectorization lowering 
optimization



On 5/31/23 21:48, juzhe.zh...@rivai.ai wrote:
> From: Juzhe-Zhong 
> 
> 1. This patch optimizes the codegen of the following auto-vectorization code:
> 
> void foo (int32_t * __restrict a, int64_t * __restrict b, int64_t * 
> __restrict c, int n) {
>  for (int i = 0; i < n; i++)
>c[i] = (int64_t)a[i] + b[i];
> }
> 
> Combine instruction from:
> 
> ...
> vsext.vf2
> vadd.vv
> ...
> 
> into:
> 
> ...
> vwadd.wv
> ...
> 
> Since for PLUS operation, GCC prefer the following RTL operand order when 
> combining:
> 
> (plus: (sign_extend:..)
> (reg:)
> 
> instead of
> 
> (plus: (reg:..)
> (sign_extend:)

> 
> which is different from MINUS pattern.
Right.  Canonicalization rules will have the sign_extend as the first operand 
when the opcode is associative.
> 
> I split patterns of vwadd/vwsub, and add dedicated patterns for them.
> 
> 2. This patch not only optimizes the case mentioned in (1) above, but also 
> enhances the vwadd.vv/vwsub.vv
> optimization for complicated PLUS/MINUS code. Consider the following 
> code:
> 
> __attribute__ ((noipa)) void
> vwadd_int16_t_int8_t (int16_t *__restrict dst, int16_t *__restrict dst2,
> int16_t *__restrict dst3, int8_t *__restrict a,
> int8_t *__restrict b, int8_t *__restrict a2,
> int8_t *__restrict b2, int n)
> {
>for (int i = 0; i < n; i++)
>  {
>dst[i] = (int16_t) a[i] + (int16_t) b[i];
>dst2[i] = (int16_t) a2[i] + (int16_t) b[i];
>dst3[i] = (int16_t) a2[i] + (int16_t) a[i];
>  }
> }
> 
> Before this patch:
> ...
>  vsetvli zero,a6,e8,mf2,ta,ma
>  vle8.v  v2,0(a3)
>  vle8.v  v1,0(a4)
>  vsetvli t1,zero,e16,m1,ta,ma
>  vsext.vf2   v3,v2
>  vsext.vf2   v2,v1
>  vadd.vv v1,v2,v3
>  vsetvli zero,a6,e16,m1,ta,ma
>  vse16.v v1,0(a0)
>  vle8.v  v4,0(a5)
>  vsetvli t1,zero,e16,m1,ta,ma
>  vsext.vf2   v1,v4
>  vadd.vv v2,v1,v2
> ...
> 
> After this patch:
> ...
>  vsetvli  zero,a6,e8,mf2,ta,ma
>   vle8.v  v3,0(a4)
>   vle8.v  v1,0(a3)
>   vsetvli t4,zero,e8,mf2,ta,ma
>   vwadd.vv v2,v1,v3
>   vsetvli zero,a6,e16,m1,ta,ma
>   vse16.v v2,0(a0)
>   vle8.v  v2,0(a5)
>   vsetvli t4,zero,e8,mf2,ta,ma
>   vwadd.vv v4,v3,v2
>   vsetvli zero,a6,e16,m1,ta,ma
>   vse16.v v4,0(a1)
>   vsetvli t4,zero,e8,mf2,ta,ma
>   sub a7,a7,a6
>   vwadd.vv v3,v2,v1
>   vsetvli zero,a6,e16,m1,ta,ma
>   vse16.v v3,0(a2)
> ...
> 
> The reason why current upstream GCC cannot optimize code using vwadd 
> thoroughly is that the combine pass needs an intermediate RTL IR (extending 
> one of the operands, i.e. the vwadd.wv pattern); based on this intermediate 
> RTL IR, it then extends the other operand to generate vwadd.vv.
> 
> So vwadd.wv/vwsub.wv definitely helps the vwadd.vv/vwsub.vv code optimizations.
>   
> gcc/ChangeLog:
> 
>  * config/riscv/riscv-vector-builtins-bases.cc: Change 
> vwadd.wv/vwsub.wv intrinsic API expander
>  * config/riscv/vector.md 
> (@pred_single_widen_): Remove it.
>  (@pred_single_widen_sub): New pattern.
>  (@pred_single_widen_add): New pattern.
> 
> gcc/testsuite/ChangeLog:
> 
>  * gcc.target/riscv/rvv/autovec/widen/widen-5.c: New test.
>  * gcc.target/riscv/rvv/autovec/widen/widen-6.c: New test.
>  * gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c: New test.
>  * gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c: New test.
>  * gcc.target/riscv/rvv/autovec/widen/widen_run-5.c: New test.
>  * gcc.target/riscv/rvv/autovec/widen/widen_run-6.c: New test.
OK
jeff


RE: [PATCH] RISC-V: Add __RISCV_ prefix to VXRM and FRM enum

2023-06-01 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Friday, June 2, 2023 7:49 AM
To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; pal...@rivosinc.com; rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Add __RISCV_ prefix to VXRM and FRM enum



On 6/1/23 17:19, juzhe.zh...@rivai.ai wrote:
> From: Juzhe-Zhong 
> 
> According to doc:
> https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222/files
> https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/226
> 
> Add __RISCV_ prefix to VXRM and FRM enum.
> 
> gcc/ChangeLog:
> 
>  * config/riscv/riscv-vector-builtins.cc (DEF_RVV_VXRM_ENUM): Add 
> __RISCV_ prefix.
>  (DEF_RVV_FRM_ENUM): Ditto.
> 
> gcc/testsuite/ChangeLog:
> 
>  * gcc.target/riscv/rvv/base/frm-1.c: Ditto.
>  * gcc.target/riscv/rvv/base/vxrm-1.c: Ditto.
>  * gcc.target/riscv/rvv/base/vxrm-10.c: Ditto.
>  * gcc.target/riscv/rvv/base/vxrm-11.c: Ditto.
>  * gcc.target/riscv/rvv/base/vxrm-12.c: Ditto.
>  * gcc.target/riscv/rvv/base/vxrm-6.c: Ditto.
>  * gcc.target/riscv/rvv/base/vxrm-7.c: Ditto.
>  * gcc.target/riscv/rvv/base/vxrm-8.c: Ditto.
>  * gcc.target/riscv/rvv/base/vxrm-9.c: Ditto.
OK
jeff


RE: [PATCH] RISC-V: Add test for vfloat16*_t (non tuple) types

2023-06-01 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Thursday, June 1, 2023 11:36 PM
To: Li, Pan2 
Cc: Wang, Yanzhang ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; kito.ch...@sifive.com
Subject: Re: [PATCH] RISC-V: Add test for vfloat16*_t (non tuple) types

Lgtm

On Thu, Jun 1, 2023 at 20:10, Li, Pan2 via Gcc-patches 
<gcc-patches@gcc.gnu.org> wrote:
Thanks Juzhe for pointing out this.

Pan

-Original Message-
From: Li, Pan2 <pan2...@intel.com>
Sent: Thursday, June 1, 2023 8:09 PM
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2 
<pan2...@intel.com>; Wang, Yanzhang <yanzhang.w...@intel.com>
Subject: [PATCH] RISC-V: Add test for vfloat16*_t (non tuple) types

From: Pan Li <pan2...@intel.com>

This patch adds test cases for the vfloat16*_t (non-tuple) types: without 
'zvfh' or 'zvfhmin', using them is reported as an unknown type.

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-16.c: Add test cases.
* gcc.target/riscv/rvv/base/user-7.c: Likewise.
---
 gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c | 6 ++
 gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
index be2cbb5efd7..9e962a70acf 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-16.c
@@ -173,6 +173,12 @@ void f___rvv_int64m2x4_t () {__rvv_int64m2x4_t t;} /* { dg-error {unknown type n
 void f___rvv_uint64m2x4_t () {__rvv_uint64m2x4_t t;} /* { dg-error {unknown type name '__rvv_uint64m2x4_t'} } */
 void f___rvv_int64m4x2_t () {__rvv_int64m4x2_t t;} /* { dg-error {unknown type name '__rvv_int64m4x2_t'} } */
 void f___rvv_uint64m4x2_t () {__rvv_uint64m4x2_t t;} /* { dg-error {unknown type name '__rvv_uint64m4x2_t'} } */
+void f___rvv_float16mf4_t () {__rvv_float16mf4_t t;} /* { dg-error {unknown type name '__rvv_float16mf4_t'} } */
+void f___rvv_float16mf2_t () {__rvv_float16mf2_t t;} /* { dg-error {unknown type name '__rvv_float16mf2_t'} } */
+void f___rvv_float16m1_t () {__rvv_float16m1_t t;} /* { dg-error {unknown type name '__rvv_float16m1_t'} } */
+void f___rvv_float16m2_t () {__rvv_float16m2_t t;} /* { dg-error {unknown type name '__rvv_float16m2_t'} } */
+void f___rvv_float16m4_t () {__rvv_float16m4_t t;} /* { dg-error {unknown type name '__rvv_float16m4_t'} } */
+void f___rvv_float16m8_t () {__rvv_float16m8_t t;} /* { dg-error {unknown type name '__rvv_float16m8_t'} } */
 void f___rvv_float32mf2x2_t () {__rvv_float32mf2x2_t t;}
 void f___rvv_float32mf2x3_t () {__rvv_float32mf2x3_t t;}
 void f___rvv_float32mf2x4_t () {__rvv_float32mf2x4_t t;}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
index 2172a5c7c79..0620a728208 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/user-7.c
@@ -173,6 +173,12 @@ void f_vint64m2x4_t () {vint64m2x4_t t;} /* { dg-error 
{unknown type name 'vint6  void f_vuint64m2x4_t () {vuint64m2x4_t t;} /* { 
dg-error {unknown type name 'vuint64m2x4_t'} } */  void f_vint64m4x2_t () 
{vint64m4x2_t t;} /* { dg-error {unknown type name 'vint64m4x2_t'} } */  void 
f_vuint64m4x2_t () {vuint64m4x2_t t;} /* { dg-error {unknown type name 
'vuint64m4x2_t'} } */
+void f_vfloat16mf4_t () {vfloat16mf4_t t;} /* { dg-error {unknown type
+name 'vfloat16mf4_t'} } */ void f_vfloat16mf2_t () {vfloat16mf2_t t;}
+/* { dg-error {unknown type name 'vfloat16mf2_t'} } */ void
+f_vfloat16m1_t () {vfloat16m1_t t;} /* { dg-error {unknown type name
+'vfloat16m1_t'} } */ void f_vfloat16m2_t () {vfloat16m2_t t;} /* {
+dg-error {unknown type name 'vfloat16m2_t'} } */ void f_vfloat16m4_t ()
+{vfloat16m4_t t;} /* { dg-error {unknown type name 'vfloat16m4_t'} } */
+void f_vfloat16m8_t () {vfloat16m8_t t;} /* { dg-error {unknown type
+name 'vfloat16m8_t'} } */
 void f_vfloat32mf2x2_t () {vfloat32mf2x2_t t;} /* { dg-error {unknown type 
name 'vfloat32mf2x2_t'} } */  void f_vfloat32mf2x3_t () {vfloat32mf2x3_t t;} /* 
{ dg-error {unknown type name 'vfloat32mf2x3_t'} } */  void f_vfloat32mf2x4_t 
() {vfloat32mf2x4_t t;} /* { dg-error {unknown type name 'vfloat32mf2x4_t'} } */
--
2.34.1


Re: [PATCH] RISC-V: Add _mu C++ overloaded intrinsics for load && viota && vid

2023-06-01 Thread KuanLin Chen via Gcc-patches
Hi Juzhe,

I think fault_load_def::get_name should remove "instance.pred ==
PRED_TYPE_mu", right?
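
The concern above can be sketched in isolation. The following is a minimal model, not the GCC code: the enum and all three function names are hypothetical stand-ins. It shows how a name builder that keeps its own hard-coded exclusion of the mu policy would still hide the overload after the base predicate has been relaxed, whereas one that delegates to the base predicate cannot drift out of sync.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the predication policies.
enum class Pred { none, tu, mu, tum, tumu };

// Models the relaxed base predicate: every policy except "none"
// has an overloaded form.
constexpr bool base_can_be_overloaded(Pred pred)
{
  return pred != Pred::none;
}

// Models a name builder with its own hard-coded exclusion list.
constexpr bool name_built_hardcoded(Pred pred)
{
  return !(pred == Pred::none || pred == Pred::mu);
}

// Models a name builder that simply asks the base predicate.
constexpr bool name_built_delegating(Pred pred)
{
  return base_can_be_overloaded(pred);
}

// Returns true when the two builders disagree for the given policy.
constexpr bool builders_disagree(Pred pred)
{
  return name_built_hardcoded(pred) != name_built_delegating(pred);
}

static_assert(builders_disagree(Pred::mu),
              "only the hard-coded builder still hides the mu overload");
```

In this model the two builders disagree only for the mu policy, which is the case the remark points at.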

 wrote on Fri, Jun 2, 2023 at 7:05 AM:
>
> From: Juzhe-Zhong 
>
> Based on these:
> https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/232
> https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/233
>
> Add _mu C++ overloaded intrinsics for load && viota && vid.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins-bases.cc: Add _mu overloaded 
> intrinsics.
>
> ---
>  gcc/config/riscv/riscv-vector-builtins-bases.cc | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
> b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> index a8113f6602b..498c6ba042e 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
> @@ -164,7 +164,7 @@ public:
>{
>  if (STORE_P || LST_TYPE == LST_INDEXED)
>return true;
> -return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
> +return pred != PRED_TYPE_none;
>}
>
>rtx expand (function_expander &e) const override
> @@ -963,7 +963,7 @@ public:
>bool can_be_overloaded_p (enum predication_type_index pred) const override
>{
>  return pred == PRED_TYPE_tu || pred == PRED_TYPE_tum
> -  || pred == PRED_TYPE_tumu;
> +  || pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu;
>}
>
>rtx expand (function_expander &e) const override
> @@ -979,7 +979,7 @@ public:
>bool can_be_overloaded_p (enum predication_type_index pred) const override
>{
>  return pred == PRED_TYPE_tu || pred == PRED_TYPE_tum
> -  || pred == PRED_TYPE_tumu;
> +  || pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu;
>}
>
>rtx expand (function_expander &e) const override
> @@ -1749,7 +1749,7 @@ public:
>
>bool can_be_overloaded_p (enum predication_type_index pred) const override
>{
> -return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
> +return pred != PRED_TYPE_none;
>}
>
>rtx expand (function_expander &e) const override
> @@ -1794,7 +1794,7 @@ public:
>
>bool can_be_overloaded_p (enum predication_type_index pred) const override
>{
> -return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
> +return pred != PRED_TYPE_none;
>}
>
>rtx expand (function_expander &e) const override
> --
> 2.36.1
>


[PATCH RFA] c++: make initializer_list array static again [PR110070]

2023-06-01 Thread Jason Merrill via Gcc-patches
I ended up deciding not to apply the DECL_NOT_OBSERVABLE patch that you
approved in stage 3 because I didn't feel like it was fully baked; I'm happy
with this version now, which seems like a more broadly useful flag.

Tested x86_64-pc-linux-gnu.  OK for trunk?

-- 8< --

After the maybe_init_list_as_* patches, I noticed that we were putting the
array of strings into .rodata, but then memcpying it into an automatic
array, which is pointless; we should be able to use it directly.

This doesn't happen automatically because TREE_ADDRESSABLE is set (since
r12-657 for PR100464), and so gimplify_init_constructor won't promote the
variable to static.  Theoretically we could do escape analysis to recognize
that the address, though taken, never leaves the function; that would allow
promotion when we're only using the address for indexing within the
function, as in initlist-opt2.C.  But this would be a new pass.

And in initlist-opt1.C, we're passing the array address to another function,
so it definitely escapes; it's only safe in this case because it's calling a
standard library function that we know only uses it for indexing.  So, a
flag seems needed.  I first thought to put the flag on the TARGET_EXPR, but
the VAR_DECL seems more appropriate.

In a previous revision of the patch I called this flag DECL_NOT_OBSERVABLE,
but I think DECL_MERGEABLE is a better name, especially if we're going to
apply it to the backing array of initializer_list, which is observable.  I
then also check it in places that check for -fmerge-all-constants, so that
multiple equivalent initializer-lists can also be combined.  And then it
seemed to make sense for [[no_unique_address]] to have this meaning for
user-written variables.

I think the note in [dcl.init.list]/6 intended to allow this kind of merging
for initializer_lists, but it didn't actually work; for an explicit array
with the same initializer, if the address escapes the program could tell
whether the same variable in two frames has the same address.  P2752 is
trying to correct this defect, so I'm going to assume that this is the
intent.
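
The observability point can be illustrated with a small sketch. This is an illustration under assumptions, not one of the testcases from the patch; the function name is hypothetical. An explicit array with a constant initializer is a distinct object in each live frame, so a program whose array address escapes can compare two frames and detect merging.

```cpp
// Hypothetical helper: compares the address of the explicit array in this
// frame with the one passed in from the calling frame.
bool same_address_in_nested_frame(const int *outer, int depth)
{
  const int local[] = {1, 2, 3};  // explicit array, constant initializer
  if (depth == 0)
    // Let the address escape into a second frame of the same function.
    return same_address_in_nested_frame(local, depth + 1);
  // Both arrays are alive here; for ordinary variables they must differ.
  return outer == local;
}
```

A conforming compiler has to return false here unless told otherwise (for instance by -fmerge-all-constants), which is why a per-declaration flag is needed before such arrays can be shared.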

PR c++/110070
PR c++/105838

gcc/ChangeLog:

* tree.h (DECL_MERGEABLE): New.
* tree-core.h (struct tree_decl_common): Mention it.
* gimplify.cc (gimplify_init_constructor): Check it.
* cgraph.cc (symtab_node::address_can_be_compared_p): Likewise.
* varasm.cc (categorize_decl_for_section): Likewise.

gcc/cp/ChangeLog:

* call.cc (maybe_init_list_as_array): Set DECL_MERGEABLE.
(convert_like_internal) [ck_list]: Set it.
(set_up_extended_ref_temp): Copy it.
* tree.cc (handle_no_unique_addr_attribute): Set it.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/initlist-opt1.C: Check for static array.
* g++.dg/tree-ssa/initlist-opt2.C: Likewise.
* g++.dg/tree-ssa/initlist-opt4.C: New test.
* g++.dg/opt/icf1.C: New test.
* g++.dg/opt/icf2.C: New test.
---
 gcc/tree-core.h                               |  3 ++-
 gcc/tree.h                                    |  6 ++
 gcc/cgraph.cc                                 |  2 +-
 gcc/cp/call.cc                                | 15 ---
 gcc/cp/tree.cc                                |  9 -
 gcc/gimplify.cc                               |  3 ++-
 gcc/testsuite/g++.dg/opt/icf1.C               | 16 
 gcc/testsuite/g++.dg/opt/icf2.C               | 17 +
 gcc/testsuite/g++.dg/tree-ssa/initlist-opt1.C |  1 +
 gcc/testsuite/g++.dg/tree-ssa/initlist-opt2.C |  2 ++
 gcc/testsuite/g++.dg/tree-ssa/initlist-opt4.C | 13 +
 gcc/varasm.cc                                 |  2 +-
 12 files changed, 81 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/icf1.C
 create mode 100644 gcc/testsuite/g++.dg/opt/icf2.C
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/initlist-opt4.C

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 9d44c04bf03..6dd7b680b57 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1803,7 +1803,8 @@ struct GTY(()) tree_decl_common {
  In VAR_DECL, PARM_DECL and RESULT_DECL, this is
  DECL_HAS_VALUE_EXPR_P.  */
   unsigned decl_flag_2 : 1;
-  /* In FIELD_DECL, this is DECL_PADDING_P.  */
+  /* In FIELD_DECL, this is DECL_PADDING_P.
+ In VAR_DECL, this is DECL_MERGEABLE.  */
   unsigned decl_flag_3 : 1;
   /* Logically, these two would go in a theoretical base shared by var and
  parm decl. */
diff --git a/gcc/tree.h b/gcc/tree.h
index 0b72663e6a1..8a4beba1230 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -3233,6 +3233,12 @@ extern void decl_fini_priority_insert (tree, priority_type);
 #define DECL_NONALIASED(NODE) \
   (VAR_DECL_CHECK (NODE)->base.nothrow_flag)
 
+/* In a VAR_DECL, nonzero if this variable is not required to have a distinct
+   address from other variables with the same constant value.  In other words,
+   consider -fmerge-all-constants to be on for this VAR_DECL.  *

Re: Re: [PATCH] RISC-V: Add _mu C++ overloaded intrinsics for load && viota && vid

2023-06-01 Thread juzhe.zh...@rivai.ai
Oh. Yes. Thanks for catching this!
Will send V2 soon.



juzhe.zh...@rivai.ai
 
From: KuanLin Chen
Date: 2023-06-02 09:26
To: gcc-patches; juzhe.zhong
CC: kito.cheng; palmer; rdapp.gcc; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Add _mu C++ overloaded intrinsics for load && 
viota && vid
Hi Juzhe,
 
I think fault_load_def::get_name should remove "instance.pred ==
PRED_TYPE_mu", right?
 
 


[PATCH] RISC-V: Add _mu C++ overloaded intrinsics for load && viota && vid

2023-06-01 Thread juzhe . zhong
From: Juzhe-Zhong 

Based on these:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/232
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/233

Add _mu C++ overloaded intrinsics for load && viota && vid.

Co-authored-by: KuanLin Chen 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Add _mu overloaded
intrinsics.
* config/riscv/riscv-vector-builtins-shapes.cc (struct fault_load_def):
Ditto.

---
 gcc/config/riscv/riscv-vector-builtins-bases.cc | 17 +++--
 .../riscv/riscv-vector-builtins-shapes.cc   |  5 ++---
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 3f92084929d..09870c327fa 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -164,7 +164,7 @@ public:
   {
 if (STORE_P || LST_TYPE == LST_INDEXED)
   return true;
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   rtx expand (function_expander &e) const override
@@ -967,7 +967,7 @@ public:
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
 return pred == PRED_TYPE_tu || pred == PRED_TYPE_tum
-  || pred == PRED_TYPE_tumu;
+  || pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu;
   }
 
   rtx expand (function_expander &e) const override
@@ -983,7 +983,7 @@ public:
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
 return pred == PRED_TYPE_tu || pred == PRED_TYPE_tum
-  || pred == PRED_TYPE_tumu;
+  || pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu;
   }
 
   rtx expand (function_expander &e) const override
@@ -1715,6 +1715,11 @@ public:
 return CP_READ_MEMORY | CP_WRITE_CSR;
   }
 
+  bool can_be_overloaded_p (enum predication_type_index pred) const override
+  {
+return pred != PRED_TYPE_none;
+  }
+
   gimple *fold (gimple_folder &f) const override
   {
 return fold_fault_load (f);
@@ -1753,7 +1758,7 @@ public:
 
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   rtx expand (function_expander &e) const override
@@ -1798,7 +1803,7 @@ public:
 
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   rtx expand (function_expander &e) const override
@@ -1888,7 +1893,7 @@ public:
 
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   gimple *fold (gimple_folder &f) const override
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 76262f07ce4..c8daae01f91 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -550,9 +550,8 @@ struct fault_load_def : public build_base
   char *get_name (function_builder &b, const function_instance &instance,
  bool overloaded_p) const override
   {
-if (overloaded_p)
-  if (instance.pred == PRED_TYPE_none || instance.pred == PRED_TYPE_mu)
-   return nullptr;
+if (overloaded_p && !instance.base->can_be_overloaded_p (instance.pred))
+  return nullptr;
 tree type = builtin_types[instance.type.index].vector;
 machine_mode mode = TYPE_MODE (type);
 int sew = GET_MODE_BITSIZE (GET_MODE_INNER (mode));
-- 
2.36.1



[PATCH V2] RISC-V: Add _mu C++ overloaded intrinsics for load && viota && vid

2023-06-01 Thread juzhe . zhong
From: Juzhe-Zhong 

Based on these:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/232
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/233

Add _mu C++ overloaded intrinsics for load && viota && vid.

Co-authored-by: KuanLin Chen 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Add _mu overloaded 
intrinsics.
* config/riscv/riscv-vector-builtins-shapes.cc (struct fault_load_def): 
Ditto.

---
 gcc/config/riscv/riscv-vector-builtins-bases.cc | 17 +++--
 .../riscv/riscv-vector-builtins-shapes.cc   |  5 ++---
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 3f92084929d..09870c327fa 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -164,7 +164,7 @@ public:
   {
 if (STORE_P || LST_TYPE == LST_INDEXED)
   return true;
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   rtx expand (function_expander &e) const override
@@ -967,7 +967,7 @@ public:
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
 return pred == PRED_TYPE_tu || pred == PRED_TYPE_tum
-  || pred == PRED_TYPE_tumu;
+  || pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu;
   }
 
   rtx expand (function_expander &e) const override
@@ -983,7 +983,7 @@ public:
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
 return pred == PRED_TYPE_tu || pred == PRED_TYPE_tum
-  || pred == PRED_TYPE_tumu;
+  || pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu;
   }
 
   rtx expand (function_expander &e) const override
@@ -1715,6 +1715,11 @@ public:
 return CP_READ_MEMORY | CP_WRITE_CSR;
   }
 
+  bool can_be_overloaded_p (enum predication_type_index pred) const override
+  {
+return pred != PRED_TYPE_none;
+  }
+
   gimple *fold (gimple_folder &f) const override
   {
 return fold_fault_load (f);
@@ -1753,7 +1758,7 @@ public:
 
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   rtx expand (function_expander &e) const override
@@ -1798,7 +1803,7 @@ public:
 
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   rtx expand (function_expander &e) const override
@@ -1888,7 +1893,7 @@ public:
 
   bool can_be_overloaded_p (enum predication_type_index pred) const override
   {
-return pred != PRED_TYPE_none && pred != PRED_TYPE_mu;
+return pred != PRED_TYPE_none;
   }
 
   gimple *fold (gimple_folder &f) const override
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 76262f07ce4..c8daae01f91 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -550,9 +550,8 @@ struct fault_load_def : public build_base
   char *get_name (function_builder &b, const function_instance &instance,
  bool overloaded_p) const override
   {
-if (overloaded_p)
-  if (instance.pred == PRED_TYPE_none || instance.pred == PRED_TYPE_mu)
-   return nullptr;
+if (overloaded_p && !instance.base->can_be_overloaded_p (instance.pred))
+  return nullptr;
 tree type = builtin_types[instance.type.index].vector;
 machine_mode mode = TYPE_MODE (type);
 int sew = GET_MODE_BITSIZE (GET_MODE_INNER (mode));
-- 
2.36.1



[COMMITTED] MAINTAINERS: Add myself as MIPS port maintainer

2023-06-01 Thread YunQiang Su
ChangeLog:

* MAINTAINERS (CPU Port Maintainers): Add myself as MIPS
port maintainer.
(Write After Approval): Remove myself.
---
 MAINTAINERS | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4a7c963914b..c8b787b6e1e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -91,7 +91,7 @@ m68k port Andreas Schwab  

 m68k-motorola-sysv portPhilippe De Muyter  
 mcore port Nick Clifton
 microblaze Michael Eager   
-mips port  Matthew Fortune 
+mips port  YunQiang Su 
 mmix port  Hans-Peter Nilsson  
 mn10300 port   Jeff Law
 mn10300 port   Alexandre Oliva 
@@ -652,7 +652,6 @@ Basile Starynkevitch

 Jakub Staszak  
 Graham Stott   
 Jeff Sturm 
-YunQiang Su
 Robert Suchanek
 Andrew Sutton  
 Gabriele Svelto
-- 
2.30.2



Followup on PR/109279: large constants on RISCV

2023-06-01 Thread Vineet Gupta

Hi Jeff,

I finally got around to collecting various observations on PR/109279 and, 
more importantly, on the state of large constants in the RV backend. Apologies 
in advance for the long email.


It seems the various commits in this area have improved the original test 
case of 0x1010101_01010101


  Before 2e886eef7f2b      | With 2e886eef7f2b     | With 0530254413f8   | With c104ef4b5eb1
    (const pool)           | define_insn_and_split | "splitter relaxed   |
                           | "*mvconst_internal"   |  new pseudos"       |
lui  a5,%hi(.LANCHOR0)     | li   a0,0x0101        | li   a5,0x0101      | li   a5,0x0101
ld   a0,%lo(.LANCHOR0)(a5) | addi a0,0x0101        | addi a5,a5,0x0101   | addi a5,a5,0x0101
ret                        | slli a0,a0,16         | mv   a0,a5          | slli a0,a5,32
                           | addi a0,a0,0x0101     | slli a5,a5,32       | add  a0,a0,a5
                           | slli a0,a0,16         | add  a0,a5,a0       |
                           | addi a0,a0,0x0101     | ret                 |
                           | ret                   |                     |

But the same commits seem to have regressed Andrew's test from the same PR 
(which is the theme of this email).

The seemingly contrived test turned out to be much more than I'd hoped for.

   long long f(void)
   {
 unsigned t = 0x101_0101;
 long long t1 = t;
 long long t2 = ((unsigned long long )t) << 32;
 asm("":"+r"(t1));
 return t1 | t2;
   }
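
For reference, the value this test computes can be written without the empty asm barrier. The sketch below assumes the 32-bit pattern written above as 0x101_0101 is 0x01010101; the function name is hypothetical. It only shows the arithmetic: the same word is placed in both halves of the result, which is why one materialization of the word plus a shift and an OR is enough.

```cpp
#include <cstdint>

// Places the same 32-bit pattern in the lower and upper words of a
// 64-bit value, mirroring t1 | t2 in the test case.
constexpr std::uint64_t replicate_word(std::uint32_t t)
{
  const std::uint64_t lower = t;                                   // t1
  const std::uint64_t upper = static_cast<std::uint64_t>(t) << 32; // t2
  return lower | upper;
}

static_assert(replicate_word(0x01010101u) == 0x0101010101010101ull,
              "both halves carry the same word");
```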

  Before 2e886eef7f2b  | With 2e886eef7f2b     | With 0530254413f8
    (ideal code)       | define_insn_and_split | "splitter relaxed new
                       |                       |  pseudos"
   li   a0,0x101       | li   a5,0x101         | li   a0,0x101_
   addi a0,a0,0x101    | addi a5,a5,0x101      | addi a0,a0,0x101
   slli a5,a0,32       | mv   a0,a5            | li   a5,0x101_
   or   a0,a0,a5       | slli a5,a5,32         | slli a0,a0,32
   ret                 | or   a0,a0,a5         | addi a5,a5,0x101
                       | ret                   | or   a0,a5,a0
                       |                       | ret

As a baseline, RTL just before cse1 (in 260r.dfinit) in all of above is:

   # lower word

   (insn 6 2 7 2 (set (reg:DI 138)
    (const_int [0x101]))  {*movdi_64bit}

   (insn 7 6 8 2 (set (reg:DI 137)
    (plus:DI (reg:DI 138)
    (const_int [0x101]))) {adddi3}
 (expr_list:REG_EQUAL (const_int [0x1010101]) )

   (insn 5 8 9 2 (set (reg/v:DI 134 [ t1 ])
    (reg:DI 136 [ t1 ])) {*movdi_64bit}

   # upper word (created independently)

   (insn 9 5 10 2 (set (reg:DI 141)
    (const_int [0x101]))  {*movdi_64bit}

   (insn 10 9 11 2 (set (reg:DI 142)
    (plus:DI (reg:DI 141)
    (const_int [0x101]))) {adddi3}

   (insn 11 10 12 2 (set (reg:DI 140)
    (ashift:DI (reg:DI 142)
    (const_int 32 [0x20]))) {ashldi3}
   (expr_list:REG_EQUAL (const_int [0x1010101]))

   # stitch them
   (insn 12 11 13 2 (set (reg:DI 139)
    (ior:DI (reg/v:DI 134 [ t1 ])
    (reg:DI 140))) "const2.c":7:13 99 {iordi3}


Prior to 2e886eef7f2b, cse1 could do its job: finding the oldest equivalent 
registers for the fragments of the constant and reusing them.


   (insn 7 6 8 2 (set (reg:DI 137)
    (plus:DI (reg:DI 138)
    (const_int [0x101]))) {adddi3}
    (expr_list:REG_EQUAL (const_int [0x1010101])))
   [...]

   (insn 11 10 12 2 (set (reg:DI 140)
    (ashift:DI (reg:DI 137)
 ^   OLD EQUIV REG
    (const_int 32 [0x20]))) {ashldi3}
    (expr_list:REG_EQUAL (const_int [0x1010101_])))


With 2e886eef7f2b, define_insn_and_split "*mvconst_internal" recog() 
kicks in during cse1, eliding insns for a const_int.


   (insn 7 6 8 2 (set (reg:DI 137)
    (const_int [0x1010101])) {*mvconst_internal}
    (expr_list:REG_EQUAL (const_int [0x1010101])))
   [...]

   (insn 11 10 12 2 (set (reg:DI 140)
    (const_int [0x1010101_])) {*mvconst_internal}
    (expr_list:REG_EQUAL (const_int  [0x1010101_]) ))

Eventually split1 breaks it up using the same mvconst_internal splitter, but 
the cse opportunity has been lost.
*This is now a baseline for large const handling in the RV backend which 
we all need to be aware of*.



(2) Now on to the nuances as to why things get progressively worse after 
commit 0530254413f8.


It all seems to come down to the register allocation passes:

sched1 before 0530254413f8

   ;; 0--> b  0: i  22 r140=0x101    :alu
   ;; 1--> b  0: i  20 r137=0x101    :alu
   ;; 2--> b  0: i  23 r140=r140+0x101   :alu
   ;; 3--> b  0: i  21 r137=r137+0x101   :alu
   ;; 4--> b  0: i  24 r140=r140<<0x20   :alu
   ;; 5--> b  0: i  25 r136=r137 :alu
   ;; 6--> b  0: i   8 r136=asm_operands :nothing
   ;; 7--> b  0: i  17 a0=r136|r140  :alu
   ;; 8--> b  0: i  18 use a0    :nothing

sched1 with 053

[PATCH RFA] varasm: check float size

2023-06-01 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, OK for trunk?

-- 8< --

In PR95226, the testcase was failing because we tried to output_constant a
NOP_EXPR to float from a double REAL_CST, and so we output a double where
the caller wanted a float.  That doesn't happen anymore, but with the
output_constant hunk we will ICE in that situation rather than emit the
wrong number of bytes.

Part of the problem was that initializer_constant_valid_p_1 returned true
for that NOP_EXPR, because it compared the sizes of integer types but not
floating-point types.  So the C++ front end assumed it didn't need to fold
the initializer.
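
The tightened rule can be modelled outside the compiler. The following is a minimal sketch under stated assumptions: it uses sizeof as a stand-in for the precision comparison, the function name is hypothetical, and the pointer and offset cases are left out. It shows that a double-to-float conversion is no longer treated as transparent, while same-width conversions still are.

```cpp
#include <type_traits>

// Hypothetical model: a conversion between two integer types or two
// floating-point types is accepted only when the widths match.
template <typename Dest, typename Src>
constexpr bool length_preserving_conversion()
{
  constexpr bool both_integral =
    std::is_integral<Dest>::value && std::is_integral<Src>::value;
  constexpr bool both_floating =
    std::is_floating_point<Dest>::value && std::is_floating_point<Src>::value;
  // sizeof stands in for the precision check in this sketch.
  return (both_integral || both_floating) && sizeof(Dest) == sizeof(Src);
}

static_assert(!length_preserving_conversion<float, double>(),
              "narrowing a double constant to float changes its size");
```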

PR c++/95226

gcc/ChangeLog:

* varasm.cc (output_constant) [REAL_TYPE]: Check that sizes match.
(initializer_constant_valid_p_1): Compare float precision.
---
 gcc/varasm.cc | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 34400ec39ef..dd84754a283 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -4876,16 +4876,16 @@ initializer_constant_valid_p_1 (tree value, tree endtype, tree *cache)
tree src_type = TREE_TYPE (src);
tree dest_type = TREE_TYPE (value);
 
-   /* Allow conversions between pointer types, floating-point
-  types, and offset types.  */
+   /* Allow conversions between pointer types and offset types.  */
if ((POINTER_TYPE_P (dest_type) && POINTER_TYPE_P (src_type))
-   || (FLOAT_TYPE_P (dest_type) && FLOAT_TYPE_P (src_type))
|| (TREE_CODE (dest_type) == OFFSET_TYPE
&& TREE_CODE (src_type) == OFFSET_TYPE))
  return initializer_constant_valid_p_1 (src, endtype, cache);
 
-   /* Allow length-preserving conversions between integer types.  */
-   if (INTEGRAL_TYPE_P (dest_type) && INTEGRAL_TYPE_P (src_type)
+   /* Allow length-preserving conversions between integer types and
+  floating-point types.  */
+   if (((INTEGRAL_TYPE_P (dest_type) && INTEGRAL_TYPE_P (src_type))
+|| (FLOAT_TYPE_P (dest_type) && FLOAT_TYPE_P (src_type)))
&& (TYPE_PRECISION (dest_type) == TYPE_PRECISION (src_type)))
  return initializer_constant_valid_p_1 (src, endtype, cache);
 
@@ -5255,6 +5255,7 @@ output_constant (tree exp, unsigned HOST_WIDE_INT size, unsigned int align,
   break;
 
 case REAL_TYPE:
+  gcc_assert (size == thissize);
   if (TREE_CODE (exp) != REAL_CST)
error ("initializer for floating value is not a floating constant");
   else

base-commit: 5fccebdbd9666e0adf6dd8357c21d4ef3ac3f83f
-- 
2.31.1



Re: [PATCH] rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrhx

2023-06-01 Thread Kewen.Lin via Gcc-patches
Hi Carl,

on 2023/6/2 04:01, Carl Love wrote:
> Kewen, Segher, Peter:
> 
> The following patch is a redo of the previous "rs6000: Fix
> __builtin_vec_xst_trunc definition" patch.  
> 
> This patch fixes the argument in the two builtin definitions
> __builtin_altivec_tr_stxvrwx and __builtin_altivec_tr_stxvrhx.  It also
> adds a testcase to validate the related builtins which have the
> third argument of char *, short *, int * and long long *.
> 
> I have tested the patch on Power 10 with no regressions.
> 
> Please let me know if this patch is acceptable for mainline.

Thanks for catching and fixing this.

OK for trunk with or without some nits below in test case fixed
(as it's just a test case :)).
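
As a portable illustration of why the pointer type matters for these builtins, here is a minimal sketch. It is a model only: it does not use the Power10 builtins, the function name is hypothetical, and it ignores the offset argument. The pointee type alone decides how many low-order bits of the source are written, so a halfword store needs a 16-bit pointee and a word store a 32-bit one.

```cpp
#include <cstdint>

// Hypothetical model of a truncating store: writes the low-order bits of
// `source` that fit in the pointee type T.  Unsigned pointee types are
// used so the narrowing conversion is well defined.
template <typename T>
void store_truncated(std::uint64_t source, T *dest)
{
  *dest = static_cast<T>(source);
}
```

With the low doubleword of the test data, 0xfedcba9876543217, a 16-bit destination receives 0x3217 and a 32-bit destination receives 0x76543217 in this model.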

> 
>   Carl 
> 
> 
> rs6000: Fix arguments for __builtin_altivec_tr_stxvrwx, 
> __builtin_altivec_tr_stxvrhx
> 
> The third argument for __builtin_altivec_tr_stxvrhx should be short *
> not int *.  Similarly, the third argument for __builtin_altivec_tr_stxvrwx
> should be int * not short *.  This patch fixes the arguments in the two
> builtins.
> 
> A runnable test case is added to test the __builtin_altivec_tr_stxvrbx,
> __builtin_altivec_tr_stxvrhx, __builtin_altivec_tr_stxvrwx and
> __builtin_altivec_tr_stxvrdx builtins.
> 
> gcc/
>   * config/rs6000/rs6000-builtins.def (__builtin_altivec_tr_stxvrhx,
>   __builtin_altivec_tr_stxvrwx): Fix type of third argument.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c: New test
>   for __builtin_altivec_tr_stxvrbx, __builtin_altivec_tr_stxvrhx,
>   __builtin_altivec_tr_stxvrwx, __builtin_altivec_tr_stxvrdx.
> ---
>  gcc/config/rs6000/rs6000-builtins.def |   4 +-
>  .../builtin_altivec_tr_stxvr_runnable.c   | 107 ++
>  2 files changed, 109 insertions(+), 2 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..d7839f2e06b 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -3161,10 +3161,10 @@
>void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
>  TR_STXVRBX vsx_stxvrbx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
>  TR_STXVRHX vsx_stxvrhx {stvec}
>  
> -  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
>  TR_STXVRWX vsx_stxvrwx {stvec}
>  
>void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
> diff --git 
> a/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c 
> b/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
> new file mode 100644
> index 000..46014d83535
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/builtin_altivec_tr_stxvr_runnable.c
> @@ -0,0 +1,107 @@
> +/* Test of __builtin_vec_xst_trunc  */
> +
> +/* { dg-do run { target power10_hw } } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DEBUG 0
> +
> +vector signed __int128 store_data =
> +  {  (__int128) 0x8ACE << 64 | (__int128) 0xfedcba9876543217ULL};
> +
> +union conv_t {
> +  vector signed __int128 vsi128;
> +  unsigned long long ull[2];
> +} conv;
> +
> +void abort (void);
> +
> +
> +int
> +main () {
> +  int i;
> +  signed long sl;
> +  signed char sc, expected_sc;
> +  signed short ss, expected_ss;
> +  signed int si, expected_si;
> +  signed long long int sll, expected_sll;
> +  signed char *psc;
> +  signed short *pss;
> +  signed int *psi;
> +  signed long long int *psll;
> +  
> +#if DEBUG
> +  val.vsi128 = store_data;
> +   printf("Data to store [%d] = 0x%llx %llx\n", i, val.ull[1], val.ull[0]);

odd indent.

> +#endif
> +
> +  psc = &sc;
> +  pss = &ss;
> +  psi = &si;
> +  psll = &sll;
> +
> +  sl = 1;
> +  sc =0xA1;

one more space after "=".

> +  expected_sc = 0xA1;
> +  __builtin_altivec_tr_stxvrbx (store_data, sl, psc);
> +
> +  if (expected_sc != sc & 0xFF)
> +#if DEBUG
> +printf(" ERROR: Signed char = 0x%x doesn't match expected value 0x%x\n",
> +sc & 0xFF, expected_sc);
> +#else
> +abort();
> +#endif
> +
> +  sl = 1;

redundant, and easy to cause misunderstanding that sl can get changed.

> +  ss = 0x52;
> +  expected_ss = 0x1752;
> +  __builtin_altivec_tr_stxvrhx (store_data, sl, pss);
> +
> +  if (expected_ss != ss & 0x)
> +#if DEBUG
> +printf(" ERROR: Signed short = 0x%x doesn't match expected value 0x%x\n",
> +ss, expected_ss) & 0x;
> +#else
> +abort();
> +#endif
> +
> +  sl = 1;

same as

[PATCH] RISC-V: Fix warning in predicated.md

2023-06-01 Thread juzhe . zhong
From: Juzhe-Zhong 

There is a warning in predicates.md:
../../../riscv-gcc/gcc/config/riscv/predicates.md: In function 'bool arith_operand_or_mode_mask(rtx, machine_mode)':
../../../riscv-gcc/gcc/config/riscv/predicates.md:33:14: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
 (match_test "INTVAL (op) == GET_MODE_MASK (HImode)
../../../riscv-gcc/gcc/config/riscv/predicates.md:34:20: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
 || INTVAL (op) == GET_MODE_MASK (SImode)"

gcc/ChangeLog:

* config/riscv/predicates.md: Change INTVAL into UINTVAL.

---
 gcc/config/riscv/predicates.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 1ed84850e35..d14b1ca30bb 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -31,7 +31,7 @@
   (ior (match_operand 0 "arith_operand")
(and (match_code "const_int")
 (match_test "INTVAL (op) == GET_MODE_MASK (HImode)
-|| INTVAL (op) == GET_MODE_MASK (SImode)"
+|| UINTVAL (op) == GET_MODE_MASK (SImode)"
 
 (define_predicate "lui_operand"
   (and (match_code "const_int")
-- 
2.36.1
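
The class of warning addressed by the patch above can be reproduced in a few lines. This is a hedged sketch, not the predicate itself: the function name and parameters are hypothetical, with `value` playing the role of a signed constant payload and `mask` the role of an unsigned mode mask. Converting the signed operand explicitly keeps the comparison result the same as the implicit conversion while making both operands unsigned.

```cpp
#include <cstdint>

// Hypothetical stand-in for comparing a signed constant with an unsigned
// mask.  The explicit cast performs the conversion the compiler would
// otherwise apply implicitly, so -Wsign-compare has nothing to report.
constexpr bool equals_mask(std::int64_t value, std::uint64_t mask)
{
  return static_cast<std::uint64_t>(value) == mask;
}

static_assert(equals_mask(0xffff, 0xffffu), "16-bit mask matches");
```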


