Re: [PATCH] developer option: -fdump-generic-nodes; initial incorporation

2024-02-28 Thread Richard Biener
On Wed, Feb 28, 2024 at 4:14 PM David Malcolm  wrote:
>
> On Wed, 2024-02-28 at 08:58 +0100, Richard Biener wrote:
> > On Tue, Feb 27, 2024 at 10:20 PM Robert Dubner 
> > wrote:
> > >
> > > Richard,
> > >
> > > Thank you very much for your comments.
> > >
> > > When I set out to create the capability, I had a "specification" in
> > > mind.
> > >
> > > I didn't have a clue how to create a GENERIC tree that could be fed
> > > to the
> > > middle end in a way that would successfully result in an
> > > executable.  And I
> > > needed to be able to do that in order to proceed with the project
> > > of
> > > creating a COBOL front end.
> > >
> > > So, I came up with the idea of using GCC to compile simple
> > > programs, and to
> > > hook into the compiler to examine the trees fed to the middle end,
> > > and to
> > > display those trees in the human-readable format I needed to
> > > understand
> > > them.  And that's what I did.
> > >
> > > My first incarnation generated pure text files, and I used that to
> > > get
> > > going.
> > >
> > > After a while I realized that when I used the output file, I was
> > > spending a
> > > lot of time searching through the text files.  And I had the
> > > brainstorm!
> > > Hyperlinks!  HTML files!  We have the technology!  So, I created
> > > the .HTML
> > > files as well.
> > >
> > > I found this useful to the point of necessity in order to learn how
> > > to
> > > generate the GENERIC trees.  I believe it would be equally useful
> > > to the
> > > next developer who, for whatever reason, needs to understand, on a
> > > "You need
> > > to learn the alphabet before you can learn how to read" level, what
> > > the
> > > middle end requires from a GENERIC tree generated by a front end.
> > >
> > > But I've never used it on a complex program. I've used it only to
> > > learn how
> > > to create the GENERIC nodes for very particular things, and so I
> > > would use
> > > the -fdump-generic-nodes feature on a very simple C program that
> > > demonstrated, in isolation, the feature I needed.  Once I figured
> > > it out, I
> > > would create front end C routines or macros that used the
> > > tree.h/tree.cc
> > > features to build those GENERIC trees, and then I would move on.
> > >
> > > I decided to offer it up here, in order to learn how to create
> > > patches
> > > and to get
> > > to know the people and the process, as well as from the desire to
> > > share it.
> > > And instantly I got the "How about a machine-readable format?"
> > > comments.
> > > Which are reasonable.  So, because it wasn't hard, I hacked at the
> > > existing
> > > code to create a JSON output.  (But I remind you that up until now,
> > > nobody
> > > seems to have needed a JSON representation.)
> > >
> > > And your observation that the human readable representation could
> > > be made
> > > from the JSON representation is totally accurate.
> > >
> > > But that wasn't my specification.  My specification was "A tool so
> > > that a
> > > human being can examine a simple GENERIC tree to learn how it's
> > > done."
> > >
> > > But it seems to me that we are now moving into the realm of a new
> > > specification.
> > >
> > > Said another way:  To go from "A human readable representation of a
> > > simple
> > > GENERIC tree" to "A machine readable JSON representation of an
> > > arbitrarily
> > > complex GENERIC tree, from which a human readable representation
> > > can be
> > > created" means, in effect, starting over on a different project
> > > that I don't
> > > need.  I already *have* a project that I am working on -- the COBOL
> > > front
> > > end.
> > >
> > > The complexity of GENERIC trees is, in my experienced opinion, an
> > > obstacle
> > > for the creation of front ends.  The GCC Internals document has a
> > > lot of
> > > information, but to go from it to a front end is like using the
> > > maintenance
> > > manual for an F16 fighter to try to learn to fly the aircraft.
> > >
> > > The program "main(){}" generates a tree with over seventy nodes.  I
> > > see no
> > > way to document why that's true; it's all arbitrary in the sense
> > > that "this
> > > is how GCC works".  -fdump-generic-nodes made it possible for me to
> > > figure
> > > out how those nodes are connected and, thus, how to create a new
> > > front end.
> > > I figure that other developers might find it useful, as well.
> > >
> > > I guess I am saying that I am not, at this time, able to work on a
> > > whole
> > > different tool.  I think what I have done so far does something
> > > useful that
> > > doesn't seem to otherwise exist in GCC.
> > >
> > > I suppose the question for you is, "Is it useful enough?"
> > >
> > > I won't be offended if the answer is "No" and I hope you won't be
> > > offended
> > > by my not having the bandwidth to address your very thoughtful and
> > > valid
> > > observations about how it could be better.
> >
> > No offense taken - I did realize how useful this was to you (and
> > specifically
> > 

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-02-28 Thread Richard Biener
On Wed, 28 Feb 2024, Andre Vieira (lists) wrote:

> 
> 
> On 27/02/2024 08:47, Richard Biener wrote:
> > On Mon, 26 Feb 2024, Andre Vieira (lists) wrote:
> > 
> >>
> >>
> >> On 05/02/2024 09:56, Richard Biener wrote:
> >>> On Thu, 1 Feb 2024, Andre Vieira (lists) wrote:
> >>>
> 
> 
>  On 01/02/2024 07:19, Richard Biener wrote:
> > On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
> >
> >
> > The patch didn't come with a testcase so it's really hard to tell
> > what goes wrong now and how it is fixed ...
> 
>  My bad! I had a testcase locally but never added it...
> 
>  However... now I look at it and ran it past Richard S, the codegen isn't
>  'wrong', but it does have the potential to lead to some pretty slow
>  codegen,
>  especially for inbranch simdclones where it transforms the SVE predicate
>  into
>  an Advanced SIMD vector by inserting the elements one at a time...
> 
>  An example of which can be seen if you do:
> 
>  gcc -O3 -march=armv8-a+sve -msve-vector-bits=128  -fopenmp-simd t.c -S
> 
>  with the following t.c:
>  #pragma omp declare simd simdlen(4) inbranch
>  int __attribute__ ((const)) fn5(int);
> 
>  void fn4 (int *a, int *b, int n)
>  {
>    for (int i = 0; i < n; ++i)
>    b[i] = fn5(a[i]);
>  }
> 
>  Now I do have to say, for our main usecase of libmvec we won't have any
>  'inbranch' Advanced SIMD clones, so we avoid that issue... But of course
>  that
>  doesn't mean user-code will.
> >>>
> >>> It seems to use SVE masks with vector(4)  and the
> >>> ABI says the mask is vector(4) int.  You say that's because we choose
> >>> an Adv SIMD clone for the SVE VLS vector code (it calls _ZGVnM4v_fn5).
> >>>
> >>> The vectorizer creates
> >>>
> >>> _44 = VEC_COND_EXPR ;
> >>>
> >>> and then vector lowering decomposes this.  That means the vectorizer
> >>> lacks a check that the target handles this VEC_COND_EXPR.
> >>>
> >>> Of course I would expect that SVE with VLS vectors is able to
> >>> code generate this operation, so it's missing patterns in the end.
> >>>
> >>> Richard.
> >>>
> >>
> >> What should we do for GCC-14? Going forward I think the right thing to do
> >> is
> >> to add these patterns. But I am not even going to try to do that right now
> >> and
> >> even though we can codegen for this, the result doesn't feel like it would
> >> ever be profitable which means I'd rather not vectorize, or well pick a
> >> different vector mode if possible.
> >>
> >> This would be achieved with the change to the targethook. If I change the
> >> hook
> >> to take modes, using STMT_VINFO_VECTYPE (stmt_vinfo), is that OK for now?
> > 
> > Passing in a mode is OK.  I'm still not fully understanding why the
> > clone isn't fully specifying 'mode' and if it does not why the
> > vectorizer itself can not disregard it.
> 
> 
> We could check that the modes of the parameters & return type are the same as
> the vector operands & result in the vectorizer. But then we'd also want to
> make sure we don't reject cases where we have simdclones with compatible
> modes, aka same element type, but a multiple element count.  Which is where we'd
> get in trouble again I think, because we'd want to accept V8SI -> 2x V4SI,
> but not V8SI -> 2x VNx4SI (with VLS and aarch64_sve_vg = 2), not because it's
> invalid, but because right now the codegen is bad. And it's easier to do this
> in the targethook, which we can technically also use to 'rank' simdclones by
> setting a target_badness value, so in the future we could decide to assign
> some 'badness' to influence the rank a SVE simdclone for Advanced SIMD loops
> vs an Advanced SIMD clone for Advanced SIMD loops.
> 
> This does touch another issue of simdclone costing, which is a larger issue in
> general and one we (arm) might want to approach in the future. It's a complex
> issue, because the vectorizer doesn't know the performance impact of a
> simdclone, we assume (as we should) that it's faster than the original scalar,
> though we currently don't record costs for either, but we don't know by how
> much or how much impact it has, so the vectorizer can't reason whether it's
> beneficial to use a simdclone if it has to do a lot of operand preparation, we
> can merely tell it to use it, or not and all the other operations in the loop
> will determine costing.
> 
> 
> > From the past discussion I understood the existing situation isn't
> > as bad as initially thought and no bad things happen right now?
> Nope, I thought the compiler would fall apart, but it seems to be able to
> transform the operands from one mode into the other, so without the targethook
> it just generates slower loops in certain cases, which we'd rather avoid given
> the usecase for simdclones is to speed things up ;)
> 
> 
> Attached reworked patch.
> 
> 
> This patch adds a machine_mode argument to TARGET_SIMD_CLONE_USABLE to make
> sure 

Re: [COMMITTED] aarch64: Fix memtag builtins vs GC [PR108174]

2024-02-28 Thread Andrew Pinski
On Wed, Feb 28, 2024 at 11:14 PM Andrew Pinski  wrote:
>
> The memtag builtins were being GC'ed away so we end up
> with a crash sometimes (maybe even wrong code).
> This fixes that issue by adding GTY on the variable/struct
> aarch64_memtag_builtin_data.
>
> Committed as obvious after a build/test for aarch64-linux-gnu.

Also committed to the GCC 13 branch.

Thanks,
Andrew

>
> PR target/108174
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-builtins.cc (aarch64_memtag_builtin_data): 
> Make
> static and mark with GTY.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/acle/memtag_4.c: New test.
>
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/config/aarch64/aarch64-builtins.cc   |  2 +-
>  gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c | 16 
>  2 files changed, 17 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
> b/gcc/config/aarch64/aarch64-builtins.cc
> index 277904f6d14..75d21de1401 100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -1840,7 +1840,7 @@ aarch64_init_prefetch_builtin (void)
>  }
>
>  /* Initialize the memory tagging extension (MTE) builtins.  */
> -struct
> +static GTY(()) struct GTY(())
>  {
>tree ftype;
>enum insn_code icode;
> diff --git a/gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c 
> b/gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c
> new file mode 100644
> index 000..1e209ffc25a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=armv9-a+memtag  --param ggc-min-expand=0 --param 
> ggc-min-heapsize=0" } */
> +/* PR target/108174 */
> +/* Check to make sure that the builtin functions are not GC'ed away. */
> +#include "arm_acle.h"
> +
> +void g(void)
> +{
> +  const char *c;
> +  __arm_mte_increment_tag(c , 0 );
> +}
> +void h(void)
> +{
> +  const char *c;
> +  __arm_mte_increment_tag( c,0);
> +}
> --
> 2.43.0
>


[PATCH] LoongArch: Allow s9 as a register alias

2024-02-28 Thread Xi Ruoyao
The psABI allows using s9 as an alias of r22.

gcc/ChangeLog:

* config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add
s9 as an alias of r22.
---

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

 gcc/config/loongarch/loongarch.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index 8b453ab3140..bf2351f0968 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -931,6 +931,7 @@ typedef struct {
   { "t8",  20 + GP_REG_FIRST },\
   { "x",   21 + GP_REG_FIRST },\
   { "fp",  22 + GP_REG_FIRST },\
+  { "s9",  22 + GP_REG_FIRST },\
   { "s0",  23 + GP_REG_FIRST },\
   { "s1",  24 + GP_REG_FIRST },\
   { "s2",  25 + GP_REG_FIRST },\
-- 
2.44.0



[PATCH] LoongArch: Emit R_LARCH_RELAX for TLS IE with non-extreme code model to allow the IE to LE linker relaxation

2024-02-28 Thread Xi Ruoyao
In Binutils we need to make IE to LE relaxation only allowed when there
is an R_LARCH_RELAX after R_LARCH_TLS_IE_PC_{HI20,LO12} so an invalid
"partial" relaxation won't happen with the extreme code model.  So if we
are emitting %ie_pc_{hi20,lo12} in a non-extreme code model, emit an
R_LARCH_RELAX to allow the relaxation.  The IE to LE relaxation does not
require the pcalau12i and the ld instruction to be adjacent, so we don't
need to limit ourselves to use the macro.

For the distro maintainers backporting changes: this change depends on
r14-8721; without r14-8721, R_LARCH_RELAX can be emitted mistakenly in
the extreme code model.
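
As a rough illustration of the rule described above (this is not the code
from the patch), the decision can be sketched in plain C.  The enum, the
helper name tls_ie_relax_prefix and its parameters are made up for this
example; only the emitted ".reloc\t.,R_LARCH_RELAX" text is taken from the
patch.  The sketch also folds the extreme code model check into the same
helper, which is an assumption made for readability; in the patch that case
is excluded elsewhere (see the r14-8721 note).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the back-end state; not GCC's real types.  */
enum sym_kind { SYM_OTHER, SYM_TLS_LE, SYM_TLS_IE };

/* Return the text to print in front of a TLS IE instruction: an
   R_LARCH_RELAX marker when linker relaxation is enabled, the symbol is
   TLS IE, and the code model is not extreme; otherwise an empty string.  */
static const char *
tls_ie_relax_prefix (bool linker_relaxation, bool extreme_cmodel,
                     enum sym_kind kind)
{
  if (!linker_relaxation || extreme_cmodel || kind != SYM_TLS_IE)
    return "";
  return ".reloc\t.,R_LARCH_RELAX\n\t";
}
```

A caller would print the returned prefix immediately before the pcalau12i
or ld instruction, so the relocation is attached to that instruction.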

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_print_operand_reloc):
Support 'Q' for R_LARCH_RELAX for TLS IE.
(loongarch_output_move): Use 'Q' to print R_LARCH_RELAX for TLS
IE.
* config/loongarch/loongarch.md (ld_from_got): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/tls-ie-relax.c: New test.
* gcc.target/loongarch/tls-ie-norelax.c: New test.
* gcc.target/loongarch/tls-ie-extreme.c: New test.
---

Bootstrapped & regtested on loongarch64-linux-gnu.  Ok for trunk?

 gcc/config/loongarch/loongarch.cc | 15 ++-
 gcc/config/loongarch/loongarch.md |  2 +-
 .../gcc.target/loongarch/tls-ie-extreme.c |  5 +
 .../gcc.target/loongarch/tls-ie-norelax.c |  5 +
 gcc/testsuite/gcc.target/loongarch/tls-ie-relax.c | 11 +++
 5 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/tls-ie-extreme.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/tls-ie-norelax.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/tls-ie-relax.c

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 0428b6e65d5..70e31bb831c 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4981,7 +4981,7 @@ loongarch_output_move (rtx dest, rtx src)
  if (type == SYMBOL_TLS_LE)
return "lu12i.w\t%0,%h1";
  else
-   return "pcalau12i\t%0,%h1";
+   return "%Q1pcalau12i\t%0,%h1";
}
 
   if (src_code == CONST_INT)
@@ -6145,6 +6145,7 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool 
hi64_part,
'L'  Print the low-part relocation associated with OP.
'm' Print one less than CONST_INT OP in decimal.
'N' Print the inverse of the integer branch condition for comparison OP.
+   'Q'  Print R_LARCH_RELAX for TLS IE.
'r'  Print address 12-31bit relocation associated with OP.
'R'  Print address 32-51bit relocation associated with OP.
'T' Print 'f' for (eq:CC ...), 't' for (ne:CC ...),
@@ -6282,6 +6283,18 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
letter);
   break;
 
+case 'Q':
+  if (!TARGET_LINKER_RELAXATION)
+   break;
+
+  if (code == HIGH)
+   op = XEXP (op, 0);
+
+  if (loongarch_classify_symbolic_expression (op) == SYMBOL_TLS_IE)
+   fprintf (file, ".reloc\t.,R_LARCH_RELAX\n\t");
+
+  break;
+
 case 'r':
   loongarch_print_operand_reloc (file, op, false /* hi64_part */,
 true /* lo_reloc */);
diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index f3b5c641fce..525e1e82183 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2620,7 +2620,7 @@ (define_insn "@ld_from_got"
(match_operand:P 2 "symbolic_operand")))]
UNSPEC_LOAD_FROM_GOT))]
   ""
-  "ld.\t%0,%1,%L2"
+  "%Q2ld.\t%0,%1,%L2"
   [(set_attr "type" "move")]
 )
 
diff --git a/gcc/testsuite/gcc.target/loongarch/tls-ie-extreme.c 
b/gcc/testsuite/gcc.target/loongarch/tls-ie-extreme.c
new file mode 100644
index 000..00c545a3e8c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/tls-ie-extreme.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=loongarch64 -mabi=lp64d -mcmodel=extreme 
-mexplicit-relocs=auto -mrelax" } */
+/* { dg-final { scan-assembler-not "R_LARCH_RELAX" { target tls_native } } } */
+
+#include "tls-ie-relax.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/tls-ie-norelax.c 
b/gcc/testsuite/gcc.target/loongarch/tls-ie-norelax.c
new file mode 100644
index 000..dd6bf3634a4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/tls-ie-norelax.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcmodel=normal -mexplicit-relocs -mno-relax" } */
+/* { dg-final { scan-assembler-not "R_LARCH_RELAX" { target tls_native } } } */
+
+#include "tls-ie-relax.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/tls-ie-relax.c 
b/gcc/testsuite/gcc.target/loongarch/tls-ie-relax.c
new file mode 100644
index 000..e9f7569b1da
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/tls-ie-relax.c
@@ 

[COMMITTED] aarch64: Fix memtag builtins vs GC [PR108174]

2024-02-28 Thread Andrew Pinski
The memtag builtins were being GC'ed away so we end up
with a crash sometimes (maybe even wrong code).
This fixes that issue by adding GTY on the variable/struct
aarch64_memtag_builtin_data.

Committed as obvious after a build/test for aarch64-linux-gnu.

PR target/108174

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc (aarch64_memtag_builtin_data): Make
static and mark with GTY.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/memtag_4.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/config/aarch64/aarch64-builtins.cc   |  2 +-
 gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c | 16 
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c

diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
b/gcc/config/aarch64/aarch64-builtins.cc
index 277904f6d14..75d21de1401 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -1840,7 +1840,7 @@ aarch64_init_prefetch_builtin (void)
 }
 
 /* Initialize the memory tagging extension (MTE) builtins.  */
-struct
+static GTY(()) struct GTY(())
 {
   tree ftype;
   enum insn_code icode;
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c 
b/gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c
new file mode 100644
index 000..1e209ffc25a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/memtag_4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv9-a+memtag  --param ggc-min-expand=0 --param 
ggc-min-heapsize=0" } */
+/* PR target/108174 */
+/* Check to make sure that the builtin functions are not GC'ed away. */
+#include "arm_acle.h"
+
+void g(void)
+{
+  const char *c;
+  __arm_mte_increment_tag(c , 0 );
+}
+void h(void)
+{
+  const char *c;
+  __arm_mte_increment_tag( c,0);
+}
-- 
2.43.0



[PATCH v2] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-28 Thread Xi Ruoyao
Recently I fixed two wrong FP vector negate implementations in targets
that produced wrong sign bits in zeros (r14-8786 and r14-8801).  To
prevent a similar issue from happening again, add a test case.

Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS
(with MSA), LoongArch (with LSX and LASX).

gcc/testsuite:

* gcc.dg/vect/vect-neg-zero.c: New test.
---

v1->v2: Remove { dg-do run } which was likely triggering a SIGILL on
Linaro ARM CI.

Ok for trunk?

 gcc/testsuite/gcc.dg/vect/vect-neg-zero.c | 38 +++
 1 file changed, 38 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-neg-zero.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-neg-zero.c 
b/gcc/testsuite/gcc.dg/vect/vect-neg-zero.c
new file mode 100644
index 000..6af4a02c517
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-neg-zero.c
@@ -0,0 +1,38 @@
+/* { dg-add-options ieee } */
+/* { dg-additional-options "-fsigned-zeros" } */
+
+double x[4] = {-0.0, 0.0, -0.0, 0.0};
+float y[8] = {-0.0, 0.0, -0.0, 0.0, -0.0, -0.0, 0.0, 0.0};
+
+static __attribute__ ((always_inline)) inline void
+test (int factor)
+{
+  double a[4];
+  float b[8];
+
+  asm ("" ::: "memory");
+
+  for (int i = 0; i < 2 * factor; i++)
+a[i] = -x[i];
+
+  for (int i = 0; i < 4 * factor; i++)
+b[i] = -y[i];
+
+#pragma GCC novector
+  for (int i = 0; i < 2 * factor; i++)
+if (__builtin_signbit (a[i]) == __builtin_signbit (x[i]))
+  __builtin_abort ();
+
+#pragma GCC novector
+  for (int i = 0; i < 4 * factor; i++)
+if (__builtin_signbit (b[i]) == __builtin_signbit (y[i]))
+  __builtin_abort ();
+}
+
+int
+main (void)
+{
+  test (1);
+  test (2);
+  return 0;
+}
-- 
2.44.0
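
For readers unfamiliar with why zeros are the interesting case, here is a
minimal scalar sketch.  It shows one classic way a negate can get zeros
wrong (computing 0.0 - v instead of flipping the sign bit); this is assumed
for illustration only and is not necessarily what the implementations fixed
in r14-8786 and r14-8801 did.  The function names are made up for this
example.

```c
#include <math.h>
#include <stdbool.h>

/* Correct negation: flips the sign bit, so 0.0 becomes -0.0.  */
static double
negate_by_sign_flip (double v)
{
  return -v;
}

/* Looks equivalent but is not for zeros: 0.0 - 0.0 is +0.0 under the
   default round-to-nearest mode, so the sign of +0.0 is not flipped.  */
static double
negate_by_subtraction (double v)
{
  return 0.0 - v;
}

/* True if BEFORE and AFTER have different sign bits.  */
static bool
sign_flipped (double before, double after)
{
  return (signbit (before) != 0) != (signbit (after) != 0);
}
```

The test in the patch checks the same property with __builtin_signbit: after
negation, every element must have a sign bit different from its input.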



[PATCH v2] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-02-28 Thread Xi Ruoyao
The vect_int_mod target selector is evaluated with the options in
DEFAULT_VECTCFLAGS in effect, but these options are not automatically
passed to tests outside the vect directories.  So this test fails on
targets where the integer vector modulo operation is supported but
requires an option to enable it, for example LoongArch.

In this test case, the only expected optimization that has not happened
in the original dump is in corge, because it needs forward propagation.
So we can scan the forwprop2 dump (where the vector operation is not
expanded to scalars yet) instead of optimized; then we don't need to
consider whether vect_int_mod holds.

gcc/testsuite/ChangeLog:

PR testsuite/113418
* gcc.dg/pr104992.c (dg-options): Use -fdump-tree-forwprop2
instead of -fdump-tree-optimized.
(dg-final): Scan forwprop2 dump instead of optimized, and remove
the use of vect_int_mod.
* lib/target-supports.exp (check_effective_target_vect_int_mod):
Remove because it's not used anymore.
---

v1->v2: Remove check_effective_target_vect_int_mod as it's now unused.

This fixes the test failure on loongarch64-linux-gnu.  Also tested on
x86_64-linux-gnu.  Ok for trunk?

 gcc/testsuite/gcc.dg/pr104992.c   |  5 ++---
 gcc/testsuite/lib/target-supports.exp | 13 -
 2 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pr104992.c b/gcc/testsuite/gcc.dg/pr104992.c
index 82f8c75559c..6fd513d34b2 100644
--- a/gcc/testsuite/gcc.dg/pr104992.c
+++ b/gcc/testsuite/gcc.dg/pr104992.c
@@ -1,6 +1,6 @@
 /* PR tree-optimization/104992 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wno-psabi -fdump-tree-optimized" } */
+/* { dg-options "-O2 -Wno-psabi -fdump-tree-forwprop2" } */
 
 #define vector __attribute__((vector_size(4*sizeof(int))))
 
@@ -54,5 +54,4 @@ __attribute__((noipa)) unsigned waldo (unsigned x, unsigned 
y, unsigned z) {
 return x / y * z == x;
 }
 
-/* { dg-final { scan-tree-dump-times " % " 9 "optimized" { target { ! 
vect_int_mod } } } } */
-/* { dg-final { scan-tree-dump-times " % " 6 "optimized" { target vect_int_mod 
} } } */
+/* { dg-final { scan-tree-dump-times " % " 6 "forwprop2" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 4138cc9a662..ae33c4f1e3a 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -9064,19 +9064,6 @@ proc check_effective_target_vect_long_mult { } {
 return $answer
 }
 
-# Return 1 if the target supports vector int modulus, 0 otherwise.
-
-proc check_effective_target_vect_int_mod { } {
-return [check_cached_effective_target_indexed vect_int_mod {
-  expr { ([istarget powerpc*-*-*]
- && [check_effective_target_has_arch_pwr10])
- || [istarget amdgcn-*-*]
- || ([istarget loongarch*-*-*]
-&& [check_effective_target_loongarch_sx])
- || ([istarget riscv*-*-*]
-&& [check_effective_target_riscv_v]) }}]
-}
-
 # Return 1 if the target supports vector even/odd elements extraction, 0 
otherwise.
 
 proc check_effective_target_vect_extract_even_odd { } {
-- 
2.44.0



Re: [PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread Jinyang He

On 2024-02-29 09:42, mengqinggang wrote:


Generate la.tls.desc macro instruction for TLS descriptors model.

la.tls.desc expands to
   pcalau12i $a0, %desc_pc_hi20(a)
   ld.d  $a1, $a0, %desc_ld_pc_lo12(a)
   addi.d$a0, $a0, %desc_add_pc_lo12(a)
   jirl  $ra, $a1, %desc_call(a)


Sorry if I missed something earlier; I am a little confused.
In binutils `la.tls.desc` has been resolved as

#define INSN_LA_TLS_DESC64  \
  "pcalau12i $r4,%%desc_pc_hi20(%2);"   \
  "addi.d $r4,$r4,%%desc_pc_lo12(%2);"  \
  "ld.d $r1,$r4,%%desc_ld(%2);" \
  "jirl $r1,$r1,%%desc_call(%2);",  \

Should this be consistent with binutils?




The default is TLS descriptors, but can be configured with
-mtls-dialect={desc,trad}.

gcc/ChangeLog:

* config.gcc: Add --with-tls to change the TLS flavor.
* config/loongarch/genopts/loongarch.opt.in: Add -mtls-dialect to
configure TLS flavor.
* config/loongarch/loongarch-opts.h (enum loongarch_tls_type): New.
* config/loongarch/loongarch-protos.h (NUM_SYMBOL_TYPES): New.
* config/loongarch/loongarch.cc (loongarch_symbol_insns): Add
instruction sequence length data for TLS DESC.
(loongarch_legitimize_tls_address): New TLS DESC instruction sequence.
* config/loongarch/loongarch.h (TARGET_TLS_DESC): New.
* config/loongarch/loongarch.md (@got_load_tls_desc): New.
* config/loongarch/loongarch.opt: Regenerated.
---
Changes v1 -> v2:
- Clobber fcc0-fcc7 registers in got_load_tls_desc template.
- Support --with-tls in configure.

  gcc/config.gcc| 15 ++-
  gcc/config/loongarch/genopts/loongarch.opt.in | 14 ++
  gcc/config/loongarch/loongarch-opts.h |  6 +++
  gcc/config/loongarch/loongarch-protos.h   |  3 +-
  gcc/config/loongarch/loongarch.cc | 45 +++
  gcc/config/loongarch/loongarch.h  |  8 
  gcc/config/loongarch/loongarch.md | 36 +++
  gcc/config/loongarch/loongarch.opt| 14 ++
  8 files changed, 130 insertions(+), 11 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index a0f9c672308..72a5e992821 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2546,6 +2546,7 @@ loongarch*-*-linux*)
# Force .init_array support.  The configure script cannot always
# automatically detect that GAS supports it, yet we require it.
gcc_cv_initfini_array=yes
+   with_tls=${with_tls:-desc}
;;
  
  loongarch*-*-elf*)

@@ -4987,7 +4988,7 @@ case "${target}" in
;;
  
  	loongarch*-*)

-   supported_defaults="abi arch tune fpu simd multilib-default 
strict-align-lib"
+   supported_defaults="abi arch tune fpu simd multilib-default 
strict-align-lib tls"
  
  		# Local variables

unset \
@@ -5245,6 +5246,18 @@ case "${target}" in
with_multilib_list="${abi_base}/${abi_ext}"
fi
  
+		# Handle --with-tls.

+   case "$with_tls" in
+   "" \
+   | trad | desc)
+   # OK
+   ;;
+   *)
+   echo "Unknown TLS method used in --with-tls=$with_tls" 1>&2
+   exit 1
+   ;;
+   esac
+
# Check if the configured default ABI combination is included in
# ${with_multilib_list}.
loongarch_multilib_list_sane=no
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 02f918053f5..2cc943ef683 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -262,3 +262,17 @@ default value is 4.
  ; CPUCFG independently, so we use bit flags to specify them.
  TargetVariable
  HOST_WIDE_INT la_isa_evolution = 0
+
+Enum
+Name(tls_type) Type(enum loongarch_tls_type)
+The possible TLS dialects:
+
+EnumValue
+Enum(tls_type) String(trad) Value(TLS_TRADITIONAL)
+
+EnumValue
+Enum(tls_type) String(desc) Value(TLS_DESCRIPTORS)
+
+mtls-dialect=
+Target RejectNegative Joined Enum(tls_type) Var(loongarch_tls_dialect) 
Init(TLS_DESCRIPTORS) Save
+Specify TLS dialect.
diff --git a/gcc/config/loongarch/loongarch-opts.h 
b/gcc/config/loongarch/loongarch-opts.h
index 586e67e65ee..a08ab6fac10 100644
--- a/gcc/config/loongarch/loongarch-opts.h
+++ b/gcc/config/loongarch/loongarch-opts.h
@@ -134,4 +134,10 @@ struct loongarch_flags {
  #define HAVE_AS_TLS_LE_RELAXATION 0
  #endif
  
+/* TLS types.  */

+enum loongarch_tls_type {
+  TLS_TRADITIONAL,
+  TLS_DESCRIPTORS
+};
+
  #endif /* LOONGARCH_OPTS_H */
diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index 1fdfda9af01..6b417a3c371 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -53,8 +53,9 @@ enum loongarch_symbol_type {

Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]

2024-02-28 Thread Hongtao Liu
On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek  wrote:
>
> Hi!
>
> Adding Hongtao and Honza into the loop as the ones who acked the original
> patch.
>
> The no_callee_saved_registers by default for noreturn functions change can
> break in-process backtrace(3) or backtraces from debugger or other process
> (quite often, any time the noreturn function decides to use the bp register
> and any of the parent frames uses a frame pointer; the unwinder just crashes
> in the libgcc unwinder case, gdb prints stack corrupted message), so I'd
> like to save bp register in that case:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646591.html
I think this patch makes sense and LGTM, we save and restore frame
pointer for noreturn.
>
> and additionally the no_callee_saved_registers by default for noreturn
> functions change can make debugging harder, again not localized to the
> noreturn function, but any of its callers.  So, if say glibc abort function
> implementation needs a lot of normally callee-saved registers, no matter how
> users recompile their apps, they will see garbage or optimized out
> vars/parameters in their code unless they rebuild their glibc with -O0.
> So, I think we should guard that by a non-default option:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646649.html
So it turns off the optimization for noreturn functions by default,
I'm not sure about this.
Any comments, H.J?
>
> Plus we need to somehow make sure to emit DW_CFA_undefined for the modified
> but not saved normally callee-saved registers, so that we at least don't get
> garbage in debug info.  H.J. posted some patches for that, so far I wasn't
> happy about the implementation but the actual change is desirable.
>
> Your thoughts on this?
>
> Jakub
>


-- 
BR,
Hongtao


Re: [PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread Xi Ruoyao
On Thu, 2024-02-29 at 14:08 +0800, Xi Ruoyao wrote:
> > +  "TARGET_TLS_DESC"
> > +  "la.tls.desc\t%0,%1"
> 
> With -mexplicit-relocs=always we should emit %desc_pc_lo12 etc. instead
> of la.tls.desc.  As we don't want to add too much code we can just hard
> code the 4 instructions here instead of splitting this insn, just
> something like
> 
> { return TARGET_EXPLICIT_RELOCS_ALWAS ? ".." : "la.tls.desc\t%0,%1"; }

And with -mcmodel=extreme we should use a 3-operand la.tls.desc.  Or if
we don't want to support this we can just error out for
-mcmodel=extreme -mtls-dialect=desc.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread Xi Ruoyao
On Thu, 2024-02-29 at 09:42 +0800, mengqinggang wrote:
> Generate la.tls.desc macro instruction for TLS descriptors model.
> 
> la.tls.desc expands to
>   pcalau12i $a0, %desc_pc_hi20(a)
>   ld.d  $a1, $a0, %desc_ld_pc_lo12(a)
>   addi.d    $a0, $a0, %desc_add_pc_lo12(a)
>   jirl  $ra, $a1, %desc_call(a)
> 
> The default is TLS descriptors, but can be configure with
> -mtls-dialect={desc,trad}.

Please keep trad as the default for now.  Glibc-2.40 will be released
after GCC 14.1, and we don't want to end up in a situation where the
default configuration of the latest GCC release creates something that
does not work with the latest Glibc release.

And there's also musl libc we need to take into account.

Or you can write an autoconf test for whether the assembler supports
tlsdesc, and check TARGET_GLIBC_MAJOR & TARGET_GLIBC_MINOR for the Glibc
version, to decide whether to enable desc by default.  If you want this
but don't have time to implement it, you can leave trad as the default and
I'll take care of this.

/* snip */

> +(define_insn "@got_load_tls_desc"
> +  [(set (match_operand:P 0 "register_operand" "=r")
> + (unspec:P
> +     [(match_operand:P 1 "symbolic_operand" "")]
> +     UNSPEC_TLS_DESC))
> +    (clobber (reg:SI FCC0_REGNUM))
> +    (clobber (reg:SI FCC1_REGNUM))
> +    (clobber (reg:SI FCC2_REGNUM))
> +    (clobber (reg:SI FCC3_REGNUM))
> +    (clobber (reg:SI FCC4_REGNUM))
> +    (clobber (reg:SI FCC5_REGNUM))
> +    (clobber (reg:SI FCC6_REGNUM))
> +    (clobber (reg:SI FCC7_REGNUM))
> +    (clobber (reg:SI A1_REGNUM))
> +    (clobber (reg:SI RETURN_ADDR_REGNUM))]

Ok, the clobber list is correct.

> +  "TARGET_TLS_DESC"
> +  "la.tls.desc\t%0,%1"

With -mexplicit-relocs=always we should emit %desc_pc_lo12 etc. instead
of la.tls.desc.  As we don't want to add too much code we can just hard
code the 4 instructions here instead of splitting this insn, just
something like

{ return TARGET_EXPLICIT_RELOCS_ALWAS ? ".." : "la.tls.desc\t%0,%1"; }

> +  [(set_attr "got" "load")
> +   (set_attr "mode" "<MODE>")])

We need (set_attr "length" "16") in this list as this actually expands
into 16 bytes.


-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[patch, libgfortran] Part 2: PR105456 Child I/O does not propage iostat

2024-02-28 Thread Jerry D
The attached patch adds the error checks similar to the first patch 
previously committed.


I noticed a redundancy in some defines MSGLEN and IOMSG_LEN so I 
consolidated this to one define in io.h. This is just cleanup stuff.


I have added test cases for each of the places where UDTIO is done in 
the library.


Regressions tested on x86_64.

OK for trunk?

Regards,

Jerry

commit 640991bd6b83df4197b2eaec63d1e0e695e48b75
Author: Jerry DeLisle 
Date:   Wed Feb 28 20:51:06 2024 -0800

Fortran: Add user defined error messages for UDTIO.

The defines IOMSG_LEN and MSGLEN were redundant so these are combined
into IOMSG_LEN as defined in io.h.

The remainder of the patch adds checks for when a user defined
derived type IO procedure sets the IOSTAT or IOMSG variables
independent of the library defined I/O messages.

PR libfortran/105456

libgfortran/ChangeLog:

* io/io.h (IOMSG_LEN): Moved to here.
* io/list_read.c (MSGLEN): Removed MSGLEN.
(convert_integer): Changed MSGLEN to IOMSG_LEN.
(parse_repeat): Likewise.
(read_logical): Likewise.
(read_integer): Likewise.
(read_character): Likewise.
(parse_real): Likewise.
(read_complex): Likewise.
(read_real): Likewise.
(check_type): Likewise.
(list_formatted_read_scalar): Adjust to IOMSG_LEN.
(nml_read_obj): Add user defined error message.
* io/transfer.c (unformatted_read): Add user defined error
message.
(unformatted_write): Add user defined error message.
(formatted_transfer_scalar_read): Add user defined error 
message.
(formatted_transfer_scalar_write): Add user defined error 
message.
* io/write.c (list_formatted_write_scalar): Add user 
defined error message.

(nml_write_obj): Add user defined error message.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr105456-nmlr.f90: New test.
* gfortran.dg/pr105456-nmlw.f90: New test.
* gfortran.dg/pr105456-ruf.f90: New test.
* gfortran.dg/pr105456-wf.f90: New test.
* gfortran.dg/pr105456-wuf.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/pr105456-nmlr.f90 b/gcc/testsuite/gfortran.dg/pr105456-nmlr.f90
new file mode 100644
index 000..5ce5d082133
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr105456-nmlr.f90
@@ -0,0 +1,60 @@
+! { dg-do run }
+! { dg-shouldfail "The users message" }
+module m
+  implicit none
+  type :: t
+character :: c
+integer :: k
+  contains
+procedure :: write_formatted
+generic :: write(formatted) => write_formatted
+procedure :: read_formatted
+generic :: read(formatted) => read_formatted
+  end type
+contains
+  subroutine write_formatted(dtv, unit, iotype, v_list, iostat, iomsg)
+class(t), intent(in) :: dtv
+integer, intent(in) :: unit
+character(*), intent(in) :: iotype
+integer, intent(in) :: v_list(:)
+integer, intent(out) :: iostat
+character(*), intent(inout) :: iomsg
+if (iotype.eq."NAMELIST") then
+  write (unit, '(a1,a1,i3)') dtv%c,',', dtv%k
+else
+  write (unit,*) dtv%c, dtv%k
+end if
+  end subroutine
+  subroutine read_formatted(dtv, unit, iotype, v_list, iostat, iomsg)
+class(t), intent(inout) :: dtv
+integer, intent(in) :: unit
+character(*), intent(in) :: iotype
+integer, intent(in) :: v_list(:)
+integer, intent(out) :: iostat
+character(*), intent(inout) :: iomsg
+character :: comma
+if (iotype.eq."NAMELIST") then
+  read (unit, '(a1,a1,i3)') dtv%c, comma, dtv%k
+else
+  read (unit,*) dtv%c, comma, dtv%k
+endif
+iostat = 42
+iomsg = "The users message"
+if (comma /= ',') STOP 1
+  end subroutine
+end module
+
+program p
+  use m
+  implicit none
+  character(len=50) :: buffer
+  type(t) :: x
+  namelist /nml/ x
+  x = t('a', 5)
+  write (buffer, nml)
+  if (buffer.ne.'   X=a,  5  /') STOP 1
+  x = t('x', 0)
+  read (buffer, nml)
+  if (x%c.ne.'a'.or. x%k.ne.5) STOP 2
+end
+! { dg-output "Fortran runtime error: The users message" }
diff --git a/gcc/testsuite/gfortran.dg/pr105456-nmlw.f90 b/gcc/testsuite/gfortran.dg/pr105456-nmlw.f90
new file mode 100644
index 000..2c496e611f4
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr105456-nmlw.f90
@@ -0,0 +1,60 @@
+! { dg-do run }
+! { dg-shouldfail "The users message" }
+module m
+  implicit none
+  type :: t
+character :: c
+integer :: k
+  contains
+procedure :: write_formatted
+generic :: write(formatted) => write_formatted
+procedure :: read_formatted
+generic :: read(formatted) => read_formatted
+  end type
+contains
+  subroutine write_formatted(dtv, unit, iotype, v_list, iostat, iomsg)
+class(t), intent(in) :: dtv
+integer, intent(in) :: unit
+character(*), intent(in) :: iotype
+integer, intent(in) :: v_list(:)

Re: [PATCH 08/11] rs6000, add tests and documentation for various, built-ins

2024-02-28 Thread Kewen.Lin
Hi,

on 2024/2/21 01:57, Carl Love wrote:
>  
>  GCC maintainers:
> 
> The patch adds documentation for a number of built-ins.
> 
> The patch has been tested on Power 10 with no regressions.
> 
> Please let me know if this patch is acceptable for mainline.  Thanks.
> 
>   Carl 
> 
>  rs6000, add tests and documentation for various built-ins
> 
> This patch adds a test case and documentation in extend.texi for the
> following built-ins:
> 
> __builtin_altivec_fix_sfsi
> __builtin_altivec_fixuns_sfsi
> __builtin_altivec_float_sisf
> __builtin_altivec_uns_float_sisf

I think these are covered by vec_{unsigned,signed,float}; could you
check?

> __builtin_altivec_vrsqrtfp

Just as __builtin_altivec_vrsqrtefp is covered by vec_rsqrte,
this one is already covered by vec_rsqrt, which has the vf instance
__builtin_vsx_xvrsqrtsp, so it is redundant and can be removed.


> __builtin_altivec_mask_for_load

This one is for internal use, I don't think we want to document it in
user manual.

> __builtin_altivec_vsel_1ti
> __builtin_altivec_vsel_1ti_uns

I think we can extend the existing vec_sel for vsq and vuq, also update
the documentation.

> __builtin_vec_init_v16qi
> __builtin_vec_init_v4sf
> __builtin_vec_init_v4si
> __builtin_vec_init_v8hi

There are more vec_init variants, __builtin_vec_init_{v2df,v2di,v1ti};
is there any reason not to include them here? ...

> __builtin_vec_set_v16qi
> __builtin_vec_set_v4sf
> __builtin_vec_set_v4si
> __builtin_vec_set_v8hi

... and some similar variants for this one?

It seems that users can just use something like:

  vector ... = {x, y} ...

for the vector initialization and something like:

  vector ... z;
  z[0] = ...;
  z[i] = ...;

for the vector set.  Can you check whether there are any
differences between the above style and the built-ins (on both
BE and LE), and what the historical reasons for adding them were?

If we really need them, I'd prefer that we just provide
the corresponding overloaded functions, like vec_init and vec_set,
instead of exposing the instances with different suffixes.
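
As a rough illustration of the source-level style referred to above, the
sketch below uses GCC's generic vector extension (vector_size) instead of the
AltiVec "vector" keyword, so that it builds on any target.  The type name v4si
and the helpers make_v4si and set_v4si are made up for this example; it makes
no claim about whether the generated code matches what the
__builtin_vec_init_* and __builtin_vec_set_* built-ins produce, which is
exactly the open question in the review.

```c
#include <assert.h>

/* Four 32-bit ints in one 16-byte vector (GCC/Clang generic vector
   extension; stands in for AltiVec "vector signed int").  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Initialization with a brace list, in place of __builtin_vec_init_v4si.  */
static v4si
make_v4si (int a, int b, int c, int d)
{
  v4si v = { a, b, c, d };
  return v;
}

/* Element update by subscript, in place of __builtin_vec_set_v4si.
   The index may be a run-time value.  */
static v4si
set_v4si (v4si v, int value, unsigned int index)
{
  assert (index < 4);
  v[index] = value;
  return v;
}
```

Calling set_v4si (make_v4si (1, 2, 3, 4), 9, 2) yields a vector whose elements
read back as 1, 2, 9, 4 through ordinary subscripting.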

BR,
Kewen

> 
> gcc/ChangeLog:
>   * doc/extend.texi (__builtin_altivec_fix_sfsi,
>   __builtin_altivec_fixuns_sfsi, __builtin_altivec_float_sisf,
>   __builtin_altivec_uns_float_sisf, __builtin_altivec_vrsqrtfp,
>   __builtin_altivec_mask_for_load, __builtin_altivec_vsel_1ti,
>   __builtin_altivec_vsel_1ti_uns, __builtin_vec_init_v16qi,
>   __builtin_vec_init_v4sf, __builtin_vec_init_v4si,
>   __builtin_vec_init_v8hi, __builtin_vec_set_v16qi,
>   __builtin_vec_set_v4sf, __builtin_vec_set_v4si,
>   __builtin_vec_set_v8hi): Add documentation.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/powerpc/altivec-38.c: New test case.
> ---
>  gcc/doc/extend.texi   |  98 
>  gcc/testsuite/gcc.target/powerpc/altivec-38.c | 503 ++
>  2 files changed, 601 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/altivec-38.c
> 
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 87fd30bfa9e..89d0a1f77b0 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -22678,6 +22678,104 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
>  @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
>  
>  
> +@smallexample
> +vector signed int __builtin_altivec_fix_sfsi (vector float);
> +vector signed int __builtin_altivec_fixuns_sfsi (vector float);
> +vector float __builtin_altivec_float_sisf (vector int);
> +vector float __builtin_altivec_uns_float_sisf (vector int);
> +vector float __builtin_altivec_vrsqrtfp (vector float);
> +@end smallexample
> +
> +The @code{__builtin_altivec_fix_sfsi} converts a vector of single precision
> +floating point values to a vector of signed integers with round to zero.
> +
> +The @code{__builtin_altivec_fixuns_sfsi} converts a vector of single precision
> +floating point values to a vector of unsigned integers with round to zero.  If
> +the rounded floating point value is less then 0 the result is 0 and VXCVI
> +is set to 1.
> +
> +The @code{__builtin_altivec_float_sisf} converts a vector of single precision
> +signed integers to a vector of floating point values using the rounding mode
> +specified by RN.
> +
> +The @code{__builtin_altivec_uns_float_sisf} converts a vector of single
> +precision unsigned integers to a vector of floating point values using the
> +rounding mode specified by RN.
> +
> +The @code{__builtin_altivec_vrsqrtfp} returns a vector of floating point
> +estimates of the reciprical square root of each floating point source vector
> +element.
> +
> +@smallexample
> +vector signed char test_altivec_mask_for_load (const void *);
> +@end smallexample
> +
> +The @code{__builtin_altivec_vrsqrtfp} returns a vector mask based on the
> +bottom four bits of the argument.  Let X be the 32-byte value:
> +0x00 || 0x01 || 0x02 || ... || 0x1D || 0x1E || 0x1F.
> +Bytes sh 

Re: [PATCH v3] c++/modules: Support lambdas attached to more places in modules [PR111710]

2024-02-28 Thread Nathaniel Shead
On Wed, Feb 28, 2024 at 12:34:51PM -0500, Jason Merrill wrote:
> On 2/27/24 23:12, Nathaniel Shead wrote:
> > On Tue, Feb 27, 2024 at 11:59:46AM -0500, Patrick Palka wrote:
> > > On Fri, 16 Feb 2024, Nathaniel Shead wrote:
> > > 
> > > > On Tue, Feb 13, 2024 at 07:52:01PM -0500, Jason Merrill wrote:
> > > > > On 2/10/24 17:57, Nathaniel Shead wrote:
> > > > > > The fix for PR107398 weakened the restrictions that lambdas must belong
> > > > > > to namespace scope. However this was not sufficient: we also need to
> > > > > > allow lambdas keyed to FIELD_DECLs or PARM_DECLs.
> > > > > 
> > > > > I wonder about keying such lambdas to the class and function, respectively,
> > > > > rather than specifically to the field or parameter, but I suppose it doesn't
> > > > > matter.
> > > > 
> > > > I did some more testing and realised my testcase didn't properly
> > > > exercise whether I'd properly deduplicated or not, and an improved
> > > > testcase proved that actually keying to the field rather than the class
> > > > did cause issues. (Parameter vs. function doesn't seem to have mattered
> > > > however.)
> > > > 
> > > > Here's an updated patch that fixes this, and includes the changes for
> > > > lambdas in base classes that I'd had as a separate patch earlier. I've
> > > > also added some concepts testcases just in case.
> > > > 
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > > 
> > > > -- >8 --
> > > > 
> > > > The fix for PR107398 weakened the restrictions that lambdas must belong
> > > > to namespace scope. However this was not sufficient: we also need to
> > > > allow lambdas attached to FIELD_DECLs, PARM_DECLs, and TYPE_DECLs.
> > > > 
> > > > For field decls we key the lambda to its class rather than the field
> > > > itself. This avoids some errors with deduplicating fields.
> > > > 
> > > > Additionally, by [basic.link] p15.2 a lambda defined anywhere in a
> > > > class-specifier should not be TU-local, which includes base-class
> > > > declarations, so ensure that lambdas declared there are keyed
> > > > appropriately as well.
> > > > 
> > > > Because this now requires 'DECL_MODULE_KEYED_DECLS_P' to be checked on a
> > > > fairly large number of different kinds of DECLs, and that in general
> > > > it's safe to just get 'false' as a result of a check on an unexpected
> > > > DECL type, this patch also removes the tree checking from the accessor.
> > > > 
> > > > Finally, to handle deduplicating templated lambda fields, we need to
> > > > ensure that we can determine that two lambdas from different field decls
> > > > match. The modules code does not attempt to deduplicate expression
> > > > nodes, which causes issues as the LAMBDA_EXPRs are then considered to be
> > > > different. However, rather than checking the LAMBDA_EXPR directly we can
> > > > instead check its type: the generated RECORD_TYPE for a LAMBDA_EXPR must
> > > > also be unique, and /is/ deduplicated on import, so we can just check
> > > > for that instead.
> > > 
> > > We probably should be deduplicating LAMBDA_EXPR on stream-in, perhaps
> > > something like
> > > 
> > > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > > index e8eabb1f6f9..1b2ba2e0fa8 100644
> > > --- a/gcc/cp/module.cc
> > > +++ b/gcc/cp/module.cc
> > > @@ -9183,6 +9183,13 @@ trees_in::tree_value ()
> > > return NULL_TREE;
> > >   }
> > > +  if (TREE_CODE (t) == LAMBDA_EXPR
> > > +  && CLASSTYPE_LAMBDA_EXPR (TREE_TYPE (t)))
> > > +{
> > > +  existing = CLASSTYPE_LAMBDA_EXPR (TREE_TYPE (t));
> > > +  back_refs[~tag] = existing;
> > > +}
> > > +
> > > dump (dumper::TREE) && dump ("Read tree:%d %C:%N", tag, TREE_CODE (t), t);
> > > if (TREE_CODE (existing) == INTEGER_CST && !TREE_OVERFLOW (existing))
> > > 
> > > would suffice?  If not we probably need to take inspiration from the
> > > TREE_BINFO streaming, and handle LAMBDA_EXPR similarly..
> > > 
> > 
> > Ah yup, right, that makes more sense. Your suggestion seems to work,
> > thanks! Here's an updated patch.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> 
> With that change, do you still need to key to the class instead of the field
> for dedup to work properly?
> 
> OK either way.
> 

Thanks. And yes, we still need to key to the class rather than the
field; it looks like otherwise what happens is that when reading the
lambda's TYPE_DECL for a lambda used as an NSDMI, 'key_mergeable' ends
up loading the FIELD_DECL it's attached to (and attempting to install
its keyed declarations) when reading its 'name' before it gets a chance
to decide that the TYPE_DECL has an existing definition and deduplicate,
which causes 'set_overrun' to be called due to apparent mismatch.

By keying to the class instead of the field this isn't an issue because
the keyed declarations of the containing type are handled after the
nested TYPE_DECLs are deduplicated.

I'll add a comment to this 

[PATCH gcc] Hurd x86_64: add unwind support for signal trampoline code

2024-02-28 Thread Flavio Cruz
Tested with some simple toy examples where an exception is thrown in the
signal handler.

libgcc/ChangeLog:
* config/i386/gnu-unwind.h: Support unwinding x86_64 signal frames.

Signed-off-by: Flavio Cruz 
---
 libgcc/config/i386/gnu-unwind.h | 97 -
 1 file changed, 94 insertions(+), 3 deletions(-)

diff --git a/libgcc/config/i386/gnu-unwind.h b/libgcc/config/i386/gnu-unwind.h
index 0751b5593d4..02b060ab4a5 100644
--- a/libgcc/config/i386/gnu-unwind.h
+++ b/libgcc/config/i386/gnu-unwind.h
@@ -32,9 +32,100 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 
 #ifdef __x86_64__
 
-/*
- * TODO: support for 64 bits needs to be implemented.
- */
+#define MD_FALLBACK_FRAME_STATE_FOR x86_gnu_fallback_frame_state
+
+static _Unwind_Reason_Code
+x86_gnu_fallback_frame_state
+(struct _Unwind_Context *context, _Unwind_FrameState *fs)
+{
+  static const unsigned char gnu_sigtramp_code[] =
+  {
+    /* rpc_wait_trampoline: */
+    0x48, 0xc7, 0xc0, 0xe7, 0xff, 0xff, 0xff,    /* mov    $-25,%rax */
+    0x0f, 0x05,                                  /* syscall */
+    0x49, 0x89, 0x04, 0x24,                      /* mov    %rax,(%r12) */
+    0x48, 0x89, 0xdc,                            /* mov    %rbx,%rsp */
+
+    /* trampoline: */
+    0x5f,                                        /* pop    %rdi */
+    0x5e,                                        /* pop    %rsi */
+    0x5a,                                        /* pop    %rdx */
+    0x48, 0x83, 0xc4, 0x08,                      /* add    $0x8,%rsp */
+    0x41, 0xff, 0xd5,                            /* call   *%r13 */
+
+    /* RA HERE */
+    0x48, 0x8b, 0x7c, 0x24, 0x10,                /* mov    0x10(%rsp),%rdi */
+    0xc3,                                        /* ret */
+
+    /* firewall: */
+    0xf4,                                        /* hlt */
+  };
+
+  const size_t gnu_sigtramp_len = sizeof gnu_sigtramp_code;
+  const size_t gnu_sigtramp_tail = 7; /* length of tail after RA */
+
+  struct stack_contents {
+void *sigreturn_addr;
+void *sigreturn_returns_here;
+struct sigcontext *return_scp;
+  } *stack_contents;
+  struct sigcontext *scp;
+  unsigned long usp;
+
+  unsigned char *adjusted_pc = (unsigned char*)(context->ra) +
+gnu_sigtramp_tail - gnu_sigtramp_len;
+  if (memcmp (adjusted_pc, gnu_sigtramp_code, gnu_sigtramp_len))
+return _URC_END_OF_STACK;
+
+  stack_contents = context->cfa;
+
+  scp = stack_contents->return_scp;
+  usp = scp->sc_ursp;
+
+  fs->regs.reg[0].loc.offset = (unsigned long)&scp->sc_rax - usp;
+  fs->regs.reg[1].loc.offset = (unsigned long)&scp->sc_rdx - usp;
+  fs->regs.reg[2].loc.offset = (unsigned long)&scp->sc_rcx - usp;
+  fs->regs.reg[3].loc.offset = (unsigned long)&scp->sc_rbx - usp;
+  fs->regs.reg[4].loc.offset = (unsigned long)&scp->sc_rsi - usp;
+  fs->regs.reg[5].loc.offset = (unsigned long)&scp->sc_rdi - usp;
+  fs->regs.reg[6].loc.offset = (unsigned long)&scp->sc_rbp - usp;
+  fs->regs.reg[8].loc.offset = (unsigned long)&scp->sc_r8 - usp;
+  fs->regs.reg[9].loc.offset = (unsigned long)&scp->sc_r9 - usp;
+  fs->regs.reg[10].loc.offset = (unsigned long)&scp->sc_r10 - usp;
+  fs->regs.reg[11].loc.offset = (unsigned long)&scp->sc_r11 - usp;
+  fs->regs.reg[12].loc.offset = (unsigned long)&scp->sc_r12 - usp;
+  fs->regs.reg[13].loc.offset = (unsigned long)&scp->sc_r13 - usp;
+  fs->regs.reg[14].loc.offset = (unsigned long)&scp->sc_r14 - usp;
+  fs->regs.reg[15].loc.offset = (unsigned long)&scp->sc_r15 - usp;
+  fs->regs.reg[16].loc.offset = (unsigned long)&scp->sc_rip - usp;
+
+  /* Register 7 is rsp  */
+  fs->regs.cfa_how = CFA_REG_OFFSET;
+  fs->regs.cfa_reg = 7;
+  fs->regs.cfa_offset = usp - (unsigned long) context->cfa;
+
+  fs->regs.how[0] = REG_SAVED_OFFSET;
+  fs->regs.how[1] = REG_SAVED_OFFSET;
+  fs->regs.how[2] = REG_SAVED_OFFSET;
+  fs->regs.how[3] = REG_SAVED_OFFSET;
+  fs->regs.how[4] = REG_SAVED_OFFSET;
+  fs->regs.how[5] = REG_SAVED_OFFSET;
+  fs->regs.how[6] = REG_SAVED_OFFSET;
+  fs->regs.how[8] = REG_SAVED_OFFSET;
+  fs->regs.how[9] = REG_SAVED_OFFSET;
+  fs->regs.how[10] = REG_SAVED_OFFSET;
+  fs->regs.how[11] = REG_SAVED_OFFSET;
+  fs->regs.how[12] = REG_SAVED_OFFSET;
+  fs->regs.how[13] = REG_SAVED_OFFSET;
+  fs->regs.how[14] = REG_SAVED_OFFSET;
+  fs->regs.how[15] = REG_SAVED_OFFSET;
+  fs->regs.how[16] = REG_SAVED_OFFSET;
+
+  fs->retaddr_column = 16;
+  fs->signal_frame = 1;
+
+  return _URC_NO_REASON;
+}
 
 #else /* ifdef __x86_64__  */
 
-- 
2.43.0
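
The frame-recognition step in the patch above (compute a candidate start
address from the return address, then memcmp against the known trampoline
bytes) can be sketched in isolation as follows.  This is only an illustration
under stated assumptions: matches_trampoline is a hypothetical helper, not part
of the patch, which performs the same arithmetic inline on context->ra with a
tail of 7 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return nonzero if RA is a return address inside a copy of the byte
   sequence CODE[0..LEN), with exactly TAIL bytes of CODE following RA.
   The candidate start of the sequence is RA - (LEN - TAIL), which is the
   same address as RA + TAIL - LEN.  */
static int
matches_trampoline (const unsigned char *ra, const unsigned char *code,
                    size_t len, size_t tail)
{
  assert (tail <= len);
  const unsigned char *start = ra - (len - tail);
  return memcmp (start, code, len) == 0;
}
```

A caller would treat a zero result the way the patch treats a memcmp mismatch,
that is, as "this is not a signal frame".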



Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64

2024-02-28 Thread Kito Cheng
Committed with Palmer's suggestions for the commit message, also I
plan to back port that to 11, 12 and 13 release branches :)

On Thu, Feb 29, 2024 at 4:27 AM Palmer Dabbelt  wrote:
>
> On Wed, 28 Feb 2024 09:36:38 PST (-0800), Patrick O'Neill wrote:
> >
> > On 2/28/24 07:02, Palmer Dabbelt wrote:
> >> On Wed, 28 Feb 2024 06:57:53 PST (-0800), jeffreya...@gmail.com wrote:
> >>>
> >>>
> >>> On 2/28/24 05:23, Kito Cheng wrote:
> >>>> atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic operation on
> >>>> RV64, however lr.w is doing sign extend to DI and compare instruction only have
> >>>> DI mode on RV64, so the expected value should be sign extend before compare as
> >>>> well, so that we can get right compare result.
> >>>>
> >>>> gcc/ChangeLog:
> >>>>
> >>>> PR target/114130
> >>>> * config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
> >>>> extend the expected value if needed.
> >>>>
> >>>> gcc/testsuite/ChangeLog:
> >>>>
> >>>> * gcc.target/riscv/pr114130.c: New.
> >>> Nearly rejected this as I think the description was a bit ambiguous and
> >>> I thought you were extending the result of the lr.w.  But it's actually
> >>> the other value you're ensuring gets properly extended.
> >>
> >> I had the same response, but after reading it I'm not quite sure how
> >> to say it better.
>
> Maybe something like
>
> atomic_compare_and_swapsi will use lr.w to obtain the original value,
> which sign extends to DI.  RV64 only has DI comparisons, so we also need
> to sign extend the expected value to DI as otherwise the comparison will
> fail when the expected value has the 32nd bit set.
>
> would do it?  Either way
>
> Reviewed-by: Palmer Dabbelt 
>
> as I've managed to convince myself it's correct.  We should probably
> backport this one, the bug has likely been around for a while.
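
The failure mode described in the suggested commit message can be modelled in
portable C.  This is a sketch only: lr_w is a hypothetical stand-in for the
value the instruction leaves in a 64-bit register, not real atomic code, and
the "buggy" variant assumes the expected value arrives zero extended.

```c
#include <stdint.h>

/* Model of lr.w on RV64: the loaded 32-bit word is sign extended to
   64 bits.  (The unsigned-to-signed conversion is implementation-defined
   in ISO C; GCC defines it as modulo 2^32.)  */
static int64_t
lr_w (const uint32_t *p)
{
  return (int64_t) (int32_t) *p;
}

/* Buggy comparison: expected value zero extended into the 64-bit register.  */
static int
cas_matches_zext (const uint32_t *p, uint32_t expected)
{
  return lr_w (p) == (int64_t) (uint64_t) expected;
}

/* Fixed comparison: expected value sign extended, like the lr.w result.  */
static int
cas_matches_sext (const uint32_t *p, uint32_t expected)
{
  return lr_w (p) == (int64_t) (int32_t) expected;
}
```

With memory holding 0x80000000 and the same expected value, the zero-extended
comparison reports a mismatch while the sign-extended one matches; for values
with bit 31 clear the two agree.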
>
> >>
> >>> OK.
> >>
> >> I was looking at the code to try and ask if we have the same bug for
> >> the short inline CAS routines, but I've got to run to some meetings...
> >
> > I don't think subword AMO CAS is impacted.
> >
> > As part of the CAS we mask both the expected value [2] and the retrieved
> > value[1] before comparing.
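
The masking described here can be sketched in C as follows.  The helper names
are hypothetical and the layout assumes a little-endian target, as on RISC-V;
the real expander works on RTL, so this only mirrors the arithmetic (shift
derived from the low address bits, a 0xffff mask moved to that position).

```c
#include <stdint.h>

/* Bit offset of a 16-bit field inside its containing aligned 32-bit word
   (little endian): (address & 3) * 8.  */
static unsigned int
subword_shift (uintptr_t addr)
{
  return (unsigned int) (addr & 3) * 8;
}

/* Compare only the 16-bit field: both the word read from memory and the
   expected value are reduced to the field's bits first.  */
static int
subword_matches (uint32_t word, uint16_t expected, unsigned int shift)
{
  uint32_t mask = (uint32_t) 0xffff << shift;
  uint32_t shifted_expected = ((uint32_t) expected << shift) & mask;
  return (word & mask) == shifted_expected;
}
```

Because both sides are masked, whatever sits in the other half of the word
cannot influence the comparison.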
>
> I'm always a bit lost when it comes to bit arithmetic, but I think it's
> OK.  It smells like it's being a little loose with the
> extensions/comparisons, but just looking at some generated code for this
> simple case:
>
> void foo(uint16_t *p, uint16_t *e, uint16_t *d) {
>     __atomic_compare_exchange(p, e, d, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
> }
>
> foo:
>     lhu     a3,0(a2)
>     lhu     a2,0(a1)
>     andi    a4,a0,3
>     li      a5,65536
>     slliw   a4,a4,3
>     addiw   a5,a5,-1
>     sllw    a5,a5,a4
>     sllw    a3,a3,a4
>     sllw    a7,a2,a4
>     andi    a0,a0,-4
>     and     a3,a3,a5
>     not     t1,a5
>     and     a7,a7,a5
> 1:
>     lr.w    a6, 0(a0)
>     and     t3, a6, a5    // Both a6 (from the lr.w) and a5
>                           // (from the sllw) are sign extended,
>                           // so the result in t3 is sign extended.
>     bne     t3, a7, 1f    // a7 is also sign extended (before
>                           // and after the masking above), so
>                           // it's safe for comparison
>     and     t3, a6, t1
>     or      t3, t3, a3
>     sc.w    t3, t3, 0(a0) // The top bits of t3 end up polluted
>                           // with sign extension, but it doesn't
>                           // matter because of the sc.w.
>     bnez    t3, 1b
> 1:
>     sraw    a6,a6,a4
>     slliw   a2,a2,16
>     slliw   a5,a6,16
>     sraiw   a2,a2,16
>     sraiw   a5,a5,16
>     subw    a5,a5,a2
>     beq     a5,zero,.L1
>     sh      a6,0(a1)
> .L1:
>     ret
>
> So I think we're OK -- that masking of a7 looks redundant here, but I
> don't think we could get away with just
>
> diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
> index 54bb0a66518..15956940032 100644
> --- a/gcc/config/riscv/sync.md
> +++ b/gcc/config/riscv/sync.md
> @@ -456,7 +456,6 @@ (define_expand "atomic_cas_value_strong<mode>"
>    riscv_lshift_subword (<MODE>mode, o, shift, &shifted_o);
>    riscv_lshift_subword (<MODE>mode, n, shift, &shifted_n);
>
> -  emit_move_insn (shifted_o, gen_rtx_AND (SImode, shifted_o, mask));
>    emit_move_insn (shifted_n, gen_rtx_AND (SImode, shifted_n, mask));
>
>    enum memmodel model_success = (enum memmodel) INTVAL (operands[4]);
>
> because we'd need the masking for when we don't know the high bits are
> safe pre-shift.  So maybe some sort of simplify call could help out
> there, but I bet it's not really worth bothering -- the 

Re: [PATCH] Fortran - Error compiling PDT Type-bound Procedures [PR82943/86148/86268]

2024-02-28 Thread Alexander Westbrooks
Hello,

I meant to add a link to the commit to the previous email:

https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=edfe198084338691d0facc86bf8dfa6ede3ca676

Thanks,

Alexander Westbrooks

On Wed, Feb 28, 2024 at 8:24 PM Alexander Westbrooks
 wrote:
>
> Hello,
>
> I've updated the patch with those changes, ran through the gcc-verify
> step and fixed up the commit, and then pushed it to the trunk.
>
> Thank you for your feedback, and I look forward to working on GFortran.
>
> Thanks,
>
> Alexander Westbrooks
>
> On Wed, Feb 28, 2024 at 1:55 PM Harald Anlauf  wrote:
> >
> > Hi Alex,
> >
> > this is now mostly correct, with the following exceptions:
> >
> > First, you should notice that the formatting of the commit message,
> > when checked using "git gcc-verify", needs minor corrections.  You
> > will be guided how to fix this yourself.
> >
> > Second, testcase pdt_37.f03 has an undeclared dummy argument, which
> > can be detected by adding "implicit none" (I usually use that
> > whenever implicit typing is not wanted explicitly).  I would get:
> >
> > pdt_37.f03:33:47:
> >
> > 33 | subroutine assumed_len_param_ptr(this, that)
> >|   1
> > Error: Symbol 'that' at (1) has no IMPLICIT type; did you mean 'this'?
> >
> > I assume you want to uncomment the declaration of dummy 'that'.
> >
> > Third, I still see a - minor - indentation/tabbing/space issue here:
> >
> > diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
> > index 44f89f6afb4..852e0820e6a 100644
> > --- a/gcc/fortran/resolve.cc
> > +++ b/gcc/fortran/resolve.cc
> > [...]
> > +  if ( resolve_bindings_derived->attr.pdt_template
> > + && gfc_pdt_is_instance_of (resolve_bindings_derived,
> > +   CLASS_DATA (me_arg)->ts.u.derived)
> > +  && (me_arg->param_list != NULL)
> > +  && (gfc_spec_list_type (me_arg->param_list,
> > +CLASS_DATA(me_arg)->ts.u.derived)
> > +!= SPEC_ASSUMED))
> >
> > OK with the above fixed.
> >
> > Thanks for the patch!
> >
> > Harald
> >
> > On 2/28/24 07:24, Alexander Westbrooks wrote:
> > > Harald,
> > >
> > > Jerry helped me figure out my editor settings so that I could fix
> > > whitespace and formatting issues in my code. With my editor configured
> > > correctly, I saw that my code was not conforming to coding standards
> > > as I previously thought it was. I have fixed those things and updated
> > > my patch. Thank you for your patience.
> > >
> > > Let me know if this is okay to push to the trunk.
> > >
> > > Thanks,
> > >
> > > Alexander Westbrooks
> > >
> > > On Sun, Feb 25, 2024 at 2:40 PM Alexander Westbrooks
> > >  wrote:
> > >>
> > >> Harald,
> > >>
> > >> Thank you for reviewing my code. I've been doing research and debugging 
> > >> to investigate the error thrown by Intel and NAG for the deferred 
> > >> parameter in the dummy variable declaration. I found where the problem 
> > >> was and added the fix as part of my patch. I've attached the patch as a 
> > >> file, which also includes your feedback and suggested fixes. I've 
> > >> updated the test case pdt_37.f03 to check for the POINTER or ALLOCATABLE 
> > >> error as you suggested.
> > >>
> > >> All regression tests pass, including the new ones, after including the 
> > >> fix for the POINTER or ALLOCATABLE error for CLASS declarations of PDTs 
> > >> when deferred length parameters are used. This was tested on WSL 2, with 
> > >> Ubuntu 20.04 distro.
> > >>
> > >> Is this okay to push to the trunk?
> > >>
> > >> Thanks,
> > >>
> > >> Alexander Westbrooks
> > >>
> > >>
> > >> On Sun, Feb 11, 2024 at 2:11 PM Harald Anlauf  wrote:
> > >>>
> > >>> Hi Alex,
> > >>>
> > >>> I've been unable to apply your patch to my local trunk, likely due to
> > >>> whitespace issues my newsreader handles differently from your site.
> > >>> I see it inline instead of attached.
> > >>>
> > >>> A few general remarks:
> > >>>
> > >>> Please follow the general recommendation regarding style if possible,
> > >>> see https://www.gnu.org/prep/standards/standards.html#Formatting
> > >>> regarding formatting/whitespace use (5.1) and comments (5.2)
> > >>>
> > >>> Also, when an error message text spans multiple lines, please place the
> > >>> whitespace at the end of a line, not at the beginning of the new one:
> > >>>
> > >>>> +  if ( resolve_bindings_derived->attr.pdt_template &&
> > >>>> +   !gfc_pdt_is_instance_of(resolve_bindings_derived,
> > >>>> +   CLASS_DATA(me_arg)->ts.u.derived))
> > >>>> +{
> > >>>> +  gfc_error ("Argument %qs of %qs with PASS(%s) at %L must be of"
> > >>>> +" the parametric derived-type %qs", me_arg->name, proc->name,
> > >>>
> > >>> gfc_error ("Argument %qs of %qs with PASS(%s) at %L must be of "
> > >>>"the parametric derived-type %qs", me_arg->name,
> > >>> proc->name,
> > >>>
> > >>>> +

Re: [PATCH 1/3] Change 'v1' float and int code to fall back to v0

2024-02-28 Thread Andrew Pinski
On Wed, Feb 28, 2024 at 5:35 PM Tom Tromey  wrote:
>
> > "Andrew" == Andrew Pinski  writes:
>
> Andrew> I don't know how to update the script server side after it is
> Andrew> committed in git. the checker script is located in git though:
>
> Thanks, I didn't realize it was there.
>
> Could you check in your patch?
> IMO it seems obvious.

Pushed as r14-9230-g5ff49272bf4eb6
(https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646819.html).
I noticed there should be other cleanup of the bug components there
too but I will leave that for another time.

Thanks,
Andrew Pinski

>
> Tom


[COMMITTED] Add libcc1 to bug components

2024-02-28 Thread Andrew Pinski
As found by Tom Tromey in 
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646807.html
libcc1 is not listed as a bug component even though it exists in Bugzilla.
This fixes that oversight.

Committed as obvious after testing using git gcc-verify on a patch.

contrib/ChangeLog:

* gcc-changelog/git_commit.py (bug_components): Add libcc1.

Signed-off-by: Andrew Pinski 
---
 contrib/gcc-changelog/git_commit.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index 87bec4e00f5..87ecb9e1a17 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -105,6 +105,7 @@ bug_components = {
 'java',
 'jit',
 'libbacktrace',
+'libcc1',
 'libf2c',
 'libffi',
 'libfortran',
-- 
2.34.1



Re: [PATCH] Fortran - Error compiling PDT Type-bound Procedures [PR82943/86148/86268]

2024-02-28 Thread Alexander Westbrooks
Hello,

I've updated the patch with those changes, ran through the gcc-verify
step and fixed up the commit, and then pushed it to the trunk.

Thank you for your feedback, and I look forward to working on GFortran.

Thanks,

Alexander Westbrooks

On Wed, Feb 28, 2024 at 1:55 PM Harald Anlauf  wrote:
>
> Hi Alex,
>
> this is now mostly correct, with the following exceptions:
>
> First, you should notice that the formatting of the commit message,
> when checked using "git gcc-verify", needs minor corrections.  You
> will be guided how to fix this yourself.
>
> Second, testcase pdt_37.f03 has an undeclared dummy argument, which
> can be detected by adding "implicit none" (I usually use that
> whenever implicit typing is not wanted explicitly).  I would get:
>
> pdt_37.f03:33:47:
>
> 33 | subroutine assumed_len_param_ptr(this, that)
>|   1
> Error: Symbol 'that' at (1) has no IMPLICIT type; did you mean 'this'?
>
> I assume you want to uncomment the declaration of dummy 'that'.
>
> Third, I still see a - minor - indentation/tabbing/space issue here:
>
> diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
> index 44f89f6afb4..852e0820e6a 100644
> --- a/gcc/fortran/resolve.cc
> +++ b/gcc/fortran/resolve.cc
> [...]
> +  if ( resolve_bindings_derived->attr.pdt_template
> + && gfc_pdt_is_instance_of (resolve_bindings_derived,
> +   CLASS_DATA (me_arg)->ts.u.derived)
> +  && (me_arg->param_list != NULL)
> +  && (gfc_spec_list_type (me_arg->param_list,
> +CLASS_DATA(me_arg)->ts.u.derived)
> +!= SPEC_ASSUMED))
>
> OK with the above fixed.
>
> Thanks for the patch!
>
> Harald
>
> On 2/28/24 07:24, Alexander Westbrooks wrote:
> > Harald,
> >
> > Jerry helped me figure out my editor settings so that I could fix
> > whitespace and formatting issues in my code. With my editor configured
> > correctly, I saw that my code was not conforming to coding standards
> > as I previously thought it was. I have fixed those things and updated
> > my patch. Thank you for your patience.
> >
> > Let me know if this is okay to push to the trunk.
> >
> > Thanks,
> >
> > Alexander Westbrooks
> >
> > On Sun, Feb 25, 2024 at 2:40 PM Alexander Westbrooks
> >  wrote:
> >>
> >> Harald,
> >>
> >> Thank you for reviewing my code. I've been doing research and debugging to 
> >> investigate the error thrown by Intel and NAG for the deferred parameter 
> >> in the dummy variable declaration. I found where the problem was and added 
> >> the fix as part of my patch. I've attached the patch as a file, which also 
> >> includes your feedback and suggested fixes. I've updated the test case 
> >> pdt_37.f03 to check for the POINTER or ALLOCATABLE error as you suggested.
> >>
> >> All regression tests pass, including the new ones, after including the fix 
> >> for the POINTER or ALLOCATABLE error for CLASS declarations of PDTs when 
> >> deferred length parameters are used. This was tested on WSL 2, with Ubuntu 
> >> 20.04 distro.
> >>
> >> Is this okay to push to the trunk?
> >>
> >> Thanks,
> >>
> >> Alexander Westbrooks
> >>
> >>
> >> On Sun, Feb 11, 2024 at 2:11 PM Harald Anlauf  wrote:
> >>>
> >>> Hi Alex,
> >>>
> >>> I've been unable to apply your patch to my local trunk, likely due to
> >>> whitespace issues my newsreader handles differently from your site.
> >>> I see it inline instead of attached.
> >>>
> >>> A few general remarks:
> >>>
> >>> Please follow the general recommendation regarding style if possible,
> >>> see https://www.gnu.org/prep/standards/standards.html#Formatting
> >>> regarding formatting/whitespace use (5.1) and comments (5.2)
> >>>
> >>> Also, when an error message text spans multiple lines, please place the
> >>> whitespace at the end of a line, not at the beginning of the new one:
> >>>
>  +  if ( resolve_bindings_derived->attr.pdt_template &&
>  +   !gfc_pdt_is_instance_of(resolve_bindings_derived,
>  +   CLASS_DATA(me_arg)->ts.u.derived))
>  +{
>  +  gfc_error ("Argument %qs of %qs with PASS(%s) at %L must be of"
>  +" the parametric derived-type %qs", me_arg->name, proc->name,
> >>>
> >>> gfc_error ("Argument %qs of %qs with PASS(%s) at %L must be of "
> >>>"the parametric derived-type %qs", me_arg->name,
> >>> proc->name,
> >>>
>  +me_arg->name, , resolve_bindings_derived->name);
>  +  goto error;
>  +}
> >>>
> >>> The following change is almost unreadable: the lengthy comment is split
> >>> over three parts and almost hides the code.  Couldn't this be combined
> >>> into one comment before the function?
> >>>
>  diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
>  index fddf68f8398..11f4bac0415 100644
>  --- a/gcc/fortran/symbol.cc
>  +++ b/gcc/fortran/symbol.cc
>  @@ -5172,6 

[PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread mengqinggang
Generate la.tls.desc macro instruction for TLS descriptors model.

la.tls.desc expands to
  pcalau12i $a0, %desc_pc_hi20(a)
  ld.d      $a1, $a0, %desc_ld_pc_lo12(a)
  addi.d    $a0, $a0, %desc_add_pc_lo12(a)
  jirl      $ra, $a1, %desc_call(a)

The default is TLS descriptors, but this can be configured with
-mtls-dialect={desc,trad}.

gcc/ChangeLog:

* config.gcc: Add --with-tls to change the TLS flavor.
* config/loongarch/genopts/loongarch.opt.in: Add -mtls-dialect to
configure TLS flavor.
* config/loongarch/loongarch-opts.h (enum loongarch_tls_type): New.
* config/loongarch/loongarch-protos.h (NUM_SYMBOL_TYPES): New.
* config/loongarch/loongarch.cc (loongarch_symbol_insns): Add
instruction sequence length data for TLS DESC.
(loongarch_legitimize_tls_address): New TLS DESC instruction sequence.
* config/loongarch/loongarch.h (TARGET_TLS_DESC): New.
* config/loongarch/loongarch.md (@got_load_tls_desc): New.
* config/loongarch/loongarch.opt: Regenerated.
---
Changes v1 -> v2:
- Clobber fcc0-fcc7 registers in got_load_tls_desc template.
- Support --with-tls in configure.

 gcc/config.gcc| 15 ++-
 gcc/config/loongarch/genopts/loongarch.opt.in | 14 ++
 gcc/config/loongarch/loongarch-opts.h |  6 +++
 gcc/config/loongarch/loongarch-protos.h   |  3 +-
 gcc/config/loongarch/loongarch.cc | 45 +++
 gcc/config/loongarch/loongarch.h  |  8 
 gcc/config/loongarch/loongarch.md | 36 +++
 gcc/config/loongarch/loongarch.opt| 14 ++
 8 files changed, 130 insertions(+), 11 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index a0f9c672308..72a5e992821 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2546,6 +2546,7 @@ loongarch*-*-linux*)
# Force .init_array support.  The configure script cannot always
# automatically detect that GAS supports it, yet we require it.
gcc_cv_initfini_array=yes
+   with_tls=${with_tls:-desc}
;;
 
 loongarch*-*-elf*)
@@ -4987,7 +4988,7 @@ case "${target}" in
;;
 
loongarch*-*)
-   supported_defaults="abi arch tune fpu simd multilib-default 
strict-align-lib"
+   supported_defaults="abi arch tune fpu simd multilib-default 
strict-align-lib tls"
 
# Local variables
unset \
@@ -5245,6 +5246,18 @@ case "${target}" in
with_multilib_list="${abi_base}/${abi_ext}"
fi
 
+   # Handle --with-tls.
+   case "$with_tls" in
+   "" \
+   | trad | desc)
+   # OK
+   ;;
+   *)
+   echo "Unknown TLS method used in --with-tls=$with_tls" 1>&2
+   exit 1
+   ;;
+   esac
+
# Check if the configured default ABI combination is included in
# ${with_multilib_list}.
loongarch_multilib_list_sane=no
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index 02f918053f5..2cc943ef683 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -262,3 +262,17 @@ default value is 4.
 ; CPUCFG independently, so we use bit flags to specify them.
 TargetVariable
 HOST_WIDE_INT la_isa_evolution = 0
+
+Enum
+Name(tls_type) Type(enum loongarch_tls_type)
+The possible TLS dialects:
+
+EnumValue
+Enum(tls_type) String(trad) Value(TLS_TRADITIONAL)
+
+EnumValue
+Enum(tls_type) String(desc) Value(TLS_DESCRIPTORS)
+
+mtls-dialect=
+Target RejectNegative Joined Enum(tls_type) Var(loongarch_tls_dialect) 
Init(TLS_DESCRIPTORS) Save
+Specify TLS dialect.
diff --git a/gcc/config/loongarch/loongarch-opts.h 
b/gcc/config/loongarch/loongarch-opts.h
index 586e67e65ee..a08ab6fac10 100644
--- a/gcc/config/loongarch/loongarch-opts.h
+++ b/gcc/config/loongarch/loongarch-opts.h
@@ -134,4 +134,10 @@ struct loongarch_flags {
 #define HAVE_AS_TLS_LE_RELAXATION 0
 #endif
 
+/* TLS types.  */
+enum loongarch_tls_type {
+  TLS_TRADITIONAL,
+  TLS_DESCRIPTORS
+};
+
 #endif /* LOONGARCH_OPTS_H */
diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index 1fdfda9af01..6b417a3c371 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -53,8 +53,9 @@ enum loongarch_symbol_type {
   SYMBOL_TLS_LE,
   SYMBOL_TLSGD,
   SYMBOL_TLSLDM,
+  SYMBOL_TLS_DESC,
 };
-#define NUM_SYMBOL_TYPES (SYMBOL_TLSLDM + 1)
+#define NUM_SYMBOL_TYPES (SYMBOL_TLS_DESC + 1)
 
 /* Routines implemented in loongarch.cc.  */
 extern rtx loongarch_emit_move (rtx, rtx);
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 0428b6e65d5..b4e43f1d037 100644
--- a/gcc/config/loongarch/loongarch.cc

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-02-28 Thread Li, Pan2
> So it's going to check if V2SF can be tied to DI and V4QI with SI.  I 
> suspect those are going to fail for RISC-V as those aren't tieable.

Yes, you are right. Modes in different REG_CLASSes are not allowed to be tieable on RISC-V.

static bool
riscv_modes_tieable_p (machine_mode mode1, machine_mode mode2)
{
  /* We don't allow different REG_CLASS modes tieable since it
 will cause ICE in register allocation (RA).
 E.g. V2SI and DI are not tieable.  */
  if (riscv_v_ext_mode_p (mode1) != riscv_v_ext_mode_p (mode2))
    return false;
  return (mode1 == mode2
  || !(GET_MODE_CLASS (mode1) == MODE_FLOAT
   && GET_MODE_CLASS (mode2) == MODE_FLOAT));
}

Pan

-----Original Message-----
From: Jeff Law  
Sent: Thursday, February 29, 2024 1:33 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; rdapp@gmail.com; Liu, Hongtao 

Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val



On 2/27/24 21:51, Li, Pan2 wrote:
>>if (!targetm.modes_tieable_p (src_int_mode, src_mode))
>>  return NULL_RTX;
>>if (!targetm.modes_tieable_p (int_mode, mode))
>>  return NULL_RTX;
> 
> Yes, it will return NULL_RTX at the first if, given src_int_mode is E_DImode
> while src_mode is E_V2SFmode and mode is E_V4QImode. extract_low_bits converts
> the modes E_V2SFmode/E_V4QImode to E_DImode/E_SImode in advance, before the
> tieable check, validate_subreg and gen_lowpart.
>
> Not sure if my understanding is correct, but it looks like extract_low_bits
> cannot take care of vector modes up to a point, because vector modes are
> always untieable to their int modes, and it then returns NULL_RTX.
Well, the code tries to turn the vector mode into a suitable integer 
mode via int_mode_for_mode.  That takes a mode, including vector modes 
and tries to find an integer mode of the exact same size.

So it's going to check if V2SF can be tied to DI and V4QI with SI.  I 
suspect those are going to fail for RISC-V as those aren't tieable.

Jeff
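
To make the two steps discussed above concrete (find a same-size integer
mode, then ask whether the pair is tieable), here is a rough Python model.
It is a sketch under stated assumptions, not GCC code: the mode table is a
toy, treating V2SF and V4QI as RVV vector-extension modes is an assumption,
and the predicate only mirrors the riscv_modes_tieable_p logic quoted in
this thread.

```python
# Toy model: mode name -> (class, size in bits, is vector-extension mode).
# Flagging V2SF and V4QI as vector-extension modes is an assumption.
MODES = {
    "SI":   ("int", 32, False),
    "DI":   ("int", 64, False),
    "SF":   ("float", 32, False),
    "DF":   ("float", 64, False),
    "V4QI": ("vector_int", 32, True),
    "V2SF": ("vector_float", 64, True),
}


def int_mode_for_mode(mode):
    """Return the integer mode with exactly the same size, or None."""
    size = MODES[mode][1]
    for name, (cls, bits, _is_vec) in MODES.items():
        if cls == "int" and bits == size:
            return name
    return None


def modes_tieable_p(mode1, mode2):
    """Model of the quoted predicate: vector and non-vector never tie."""
    if MODES[mode1][2] != MODES[mode2][2]:
        return False
    both_float = MODES[mode1][0] == "float" and MODES[mode2][0] == "float"
    return mode1 == mode2 or not both_float
```

In this model V2SF maps to DI and V4QI maps to SI, and neither pair is
tieable, which is the situation in which extract_low_bits gives up and
returns NULL_RTX.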



Re: [PATCH 1/3] Change 'v1' float and int code to fall back to v0

2024-02-28 Thread Tom Tromey
> "Andrew" == Andrew Pinski  writes:

Andrew> I don't know how to update the script server side after it is
Andrew> committed in git. the checker script is located in git though:

Thanks, I didn't realize it was there.

Could you check in your patch?
IMO it seems obvious.

Tom


RE: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread Li, Pan2
Personally I prefer to remove --param=riscv-autovec-preference=none and only
allow -mrvv-vector-bits, to avoid the (maybe) tricky semantics of the none
preference. However, let's wait for a while in case there are comments from
others.

Pan

From: Kito Cheng 
Sent: Wednesday, February 28, 2024 10:55 PM
To: 钟居哲 
Cc: Li, Pan2 ; gcc-patches ; Wang, 
Yanzhang ; rdapp.gcc ; Jeff Law 

Subject: Re: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for 
RVV

Hmm, maybe only keep --param=riscv-autovec-preference=none and remove the other
two, if we think that might still be useful? But anyway I have no strong opinion
about keeping it; I mean I am OK with removing the whole
--param=riscv-autovec-preference.

钟居哲 <juzhe.zh...@rivai.ai> wrote on Wed, Feb 28, 2024 at 21:59:
I think it makes more sense to remove --param=riscv-autovec-preference and add 
-mrvv-vector-bits


juzhe.zh...@rivai.ai

From: Kito Cheng
Date: 2024-02-28 20:56
To: pan2.li
CC: gcc-patches; 
juzhe.zhong; 
yanzhang.wang; 
rdapp.gcc; jeffreyalaw
Subject: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV
Taking one more look, I think this option should work and integrate with
--param=riscv-autovec-preference=, since they do similar but slightly
different jobs.

We have 3 values for --param=riscv-autovec-preference=: none, scalable
and fixed-vlmax.

-mrvv-vector-bits=scalable works like
--param=riscv-autovec-preference=scalable and
-mrvv-vector-bits=zvl works like
--param=riscv-autovec-preference=fixed-vlmax.

So I think...we need to do some conflict check, like:

-mrvv-vector-bits=zvl can't work with --param=riscv-autovec-preference=scalable
-mrvv-vector-bits=scalable can't work with
--param=riscv-autovec-preference=fixed-vlmax

but it may not be just an alias, since there are some useful combinations like:

-mrvv-vector-bits=zvl with --param=riscv-autovec-preference=none:
NO auto vectorization but intrinsic code still could benefit from the
-mrvv-vector-bits=zvl option.

-mrvv-vector-bits=scalable with --param=riscv-autovec-preference=none
Should still work for VLS code gen, but just disable auto
vectorization per the option semantic.
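
The rules listed above could be summarised in a small table-driven check.
This is only a sketch of the proposal as worded in this mail; the function
name and the result strings are made up for illustration and do not
correspond to anything in the RISC-V back end.

```python
# Option values named in this thread.
VECTOR_BITS = {"zvl", "scalable"}
AUTOVEC_PREFERENCE = {"none", "scalable", "fixed-vlmax"}

# (mrvv-vector-bits, riscv-autovec-preference) pairs proposed as conflicts.
CONFLICTS = {("zvl", "scalable"), ("scalable", "fixed-vlmax")}


def check_options(vector_bits, autovec_preference):
    """Classify a combination as 'conflict', 'no-autovec' or 'ok'."""
    if vector_bits not in VECTOR_BITS:
        raise ValueError(f"unknown -mrvv-vector-bits={vector_bits}")
    if autovec_preference not in AUTOVEC_PREFERENCE:
        raise ValueError(
            f"unknown --param=riscv-autovec-preference={autovec_preference}")
    if (vector_bits, autovec_preference) in CONFLICTS:
        return "conflict"
    if autovec_preference == "none":
        # Auto-vectorization is off, but the vector-bits setting still applies.
        return "no-autovec"
    return "ok"
```

Keeping the conflicts in a set makes it easy to see that "none" is never a
conflict, which is why the two options cannot simply be aliases.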

However, there is something here that needs a fix, since
--param=riscv-autovec-preference=none still disables VLS code gen for
now; you can see an example here:
https://godbolt.org/z/fMTr3eW7K

But I think it's really the right behavior here, this part might need
to be fixed in vls_mode_valid_p and some other places.


Anyway, I think we need to check all use sites of RVV_FIXED_VLMAX and
RVV_SCALABLE, and make sure every use site of RVV_FIXED_VLMAX
also checks RVV_VECTOR_BITS_ZVL.



> -/* Return the VLEN value associated with -march.
> +static int
> +riscv_convert_vector_bits (int min_vlen)

Not sure if we really need this function, it seems it always returns min_vlen?

> +{
> +  int rvv_bits = 0;
> +
> +  switch (rvv_vector_bits)
> +{
> +  case RVV_VECTOR_BITS_ZVL:
> +  case RVV_VECTOR_BITS_SCALABLE:
> +   rvv_bits = min_vlen;
> +   break;
> +  default:
> +   gcc_unreachable ();
> +}
> +
> +  return rvv_bits;
> +}
> +
> +/* Return the VLEN value associated with -march and -mrvv-vector-bits.



Re: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread juzhe.zh...@rivai.ai
I think it makes more sense to remove the whole
--param=riscv-autovec-preference, since we should use
-fno-tree-vectorize instead of --param=riscv-autovec-preference=none; it is
a more reasonable compile option for users.

--param is just an internal testing option that we added before; ideally we
should remove it.



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2024-02-28 22:55
To: 钟居哲
CC: pan2.li; gcc-patches; yanzhang.wang; rdapp.gcc; Jeff Law
Subject: Re: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for 
RVV
Hmm, maybe only keep --param=riscv-autovec-preference=none and remove the other
two, if we think that might still be useful? But anyway I have no strong opinion
about keeping it; I mean I am OK with removing the whole
--param=riscv-autovec-preference.

钟居哲 wrote on Wed, Feb 28, 2024 at 21:59:
I think it makes more sense to remove --param=riscv-autovec-preference and add 
-mrvv-vector-bits



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2024-02-28 20:56
To: pan2.li
CC: gcc-patches; juzhe.zhong; yanzhang.wang; rdapp.gcc; jeffreyalaw
Subject: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV
Taking one more look, I think this option should work and integrate with
--param=riscv-autovec-preference=, since they do similar but slightly
different jobs.

We have 3 values for --param=riscv-autovec-preference=: none, scalable
and fixed-vlmax.

-mrvv-vector-bits=scalable works like
--param=riscv-autovec-preference=scalable and
-mrvv-vector-bits=zvl works like
--param=riscv-autovec-preference=fixed-vlmax.
 
So I think...we need to do some conflict check, like:
 
-mrvv-vector-bits=zvl can't work with --param=riscv-autovec-preference=scalable
-mrvv-vector-bits=scalable can't work with
--param=riscv-autovec-preference=fixed-vlmax
 
but it may not be just an alias, since there are some useful combinations like:
 
-mrvv-vector-bits=zvl with --param=riscv-autovec-preference=none:
NO auto vectorization but intrinsic code still could benefit from the
-mrvv-vector-bits=zvl option.
 
-mrvv-vector-bits=scalable with --param=riscv-autovec-preference=none
Should still work for VLS code gen, but just disable auto
vectorization per the option semantic.
 
However, there is something here that needs a fix, since
--param=riscv-autovec-preference=none still disables VLS code gen for
now; you can see an example here:
https://godbolt.org/z/fMTr3eW7K
 
But I think it's really the right behavior here, this part might need
to be fixed in vls_mode_valid_p and some other places.
 
 
Anyway, I think we need to check all use sites of RVV_FIXED_VLMAX and
RVV_SCALABLE, and make sure every use site of RVV_FIXED_VLMAX
also checks RVV_VECTOR_BITS_ZVL.
 
 
 
> -/* Return the VLEN value associated with -march.
> +static int
> +riscv_convert_vector_bits (int min_vlen)
 
Not sure if we really need this function, it seems it always returns min_vlen?
 
> +{
> +  int rvv_bits = 0;
> +
> +  switch (rvv_vector_bits)
> +{
> +  case RVV_VECTOR_BITS_ZVL:
> +  case RVV_VECTOR_BITS_SCALABLE:
> +   rvv_bits = min_vlen;
> +   break;
> +  default:
> +   gcc_unreachable ();
> +}
> +
> +  return rvv_bits;
> +}
> +
> +/* Return the VLEN value associated with -march and -mrvv-vector-bits.
 


Re: [PATCH 1/3] Change 'v1' float and int code to fall back to v0

2024-02-28 Thread Andrew Pinski
On Wed, Feb 28, 2024 at 3:26 PM Jeff Law  wrote:
>
>
>
> On 2/28/24 15:57, Tom Tromey wrote:
> >> "Jeff" == Jeff Law  writes:
>
> > I could not push this because:
> >
> > remote: *** ChangeLog format failed:
> > remote: *** ERR: invalid PR component in subject: "Fix PR libcc1/113977"
> >
> > I guess this script isn't in sync with the components in bugzilla.
> >
> > I don't know how to fix this.
> Me neither, but I can suggest a hacky workaround.  Change the component
> in bugzilla to something the pre-commit hooks understand, push the fix,
> then change the component back a little while later and adjust the
> ChangeLog after it gets generated overnight.  Ugly as sin.

I don't know how to update the script server side after it is
committed in git. the checker script is located in git though:
```
[apinski@xeond2 contrib]$ git diff gcc-changelog/git_commit.py
diff --git a/contrib/gcc-changelog/git_commit.py
b/contrib/gcc-changelog/git_commit.py
index 87bec4e00f5..4a3720de7fb 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -109,6 +109,7 @@ bug_components = {
 'libffi',
 'libfortran',
 'libgcc',
+'libcc1',
 'libgcj',
 'libgomp',
 'libitm',
```

Thanks,
Andrew Pinski

>
> jeff


Re: [PATCH 1/3] Change 'v1' float and int code to fall back to v0

2024-02-28 Thread Jeff Law




On 2/28/24 15:57, Tom Tromey wrote:
> >> "Jeff" == Jeff Law  writes:
>
> I could not push this because:
>
> remote: *** ChangeLog format failed:
> remote: *** ERR: invalid PR component in subject: "Fix PR libcc1/113977"
>
> I guess this script isn't in sync with the components in bugzilla.
>
> I don't know how to fix this.
Me neither, but I can suggest a hacky workaround.  Change the component 
in bugzilla to something the pre-commit hooks understand, push the fix, 
then change the component back a little while later and adjust the 
ChangeLog after it gets generated overnight.  Ugly as sin.


jeff


Re: [PATCH] c++: -Wuninitialized when binding a ref to uninit DM [PR113987]

2024-02-28 Thread Jason Merrill

On 2/22/24 14:28, Marek Polacek wrote:
> On Thu, Feb 22, 2024 at 08:34:45AM +, Jason Merrill wrote:
>> On 2/20/24 19:15, Marek Polacek wrote:
>>> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>>>
>>> -- >8 --
>>> This PR asks that our -Wuninitialized for mem-initializers does
>>> not warn when binding a reference to an uninitialized data member.
>>> We already check !INDIRECT_TYPE_P in find_uninit_fields_r, but
>>> that won't catch binding a parameter of a reference type to an
>>> uninitialized field, as in:
>>>
>>> struct S { S (int&); };
>>> struct T {
>>>   T() : s(i) {}
>>>   S s;
>>>   int i;
>>> };
>>>
>>> This patch adds a new function to handle this case.
>>
>> For type_build_ctor_call types like S, it's weird that we currently
>> find_uninit_fields before building the initialization.  What if we move the
>> check after the build_aggr_init so we have the actual initializer instead of
>> just the expression?
>
> Thanks.  I've tried but unfortunately I'm not getting anywhere.  One
> problem is that immediately after the find_uninit_fields call we may
> change the TREE_LIST in
>
>    if (init && TREE_CODE (init) == TREE_LIST)
>      //...
>
> so we'd have to cope with that somehow.  Sinking find_uninit_fields
> into one of the conditions below looks like a complication.  Another
> problem is that calling find_uninit_fields on the result of
> build_aggr_init call causes a bogus warning: we create something like
>    E::E (&((struct F *) this)->e, ((struct F *) this)->a)
> and then warn that the this object is uninitialized.  So I'm not sure
> if that fix would be simpler.

Fair enough, the patch is OK.

Jason



Re: [PING] Re: [PATCH 1/2] c-family: -Waddress-of-packed-member and casts

2024-02-28 Thread Jason Merrill

On 2/22/24 03:51, Torbjorn SVENSSON wrote:

Ping!


Hmm, I'm somewhat reluctant to backport a significant behavior change 
like this just to silence a testsuite failure, even though I think the 
change is correct.  Maybe just xfail the tests for GCC 13?


Jason


On 2024-02-07 17:19, Torbjorn SVENSSON wrote:

Hi,

Is it okay to backport b7e4a4c626eeeb32c291d5bbbaa148c5081b6bfd to 
releases/gcc-13?


Without this backport, I see these failures on arm-none-eabi:

FAIL: gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c (test for excess errors)
FAIL: gcc.dg/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c (test for excess errors)


Kind regards,
Torbjörn

On 2023-12-11 08:28, Richard Biener wrote:

On Wed, Nov 22, 2023 at 11:45 PM Jason Merrill  wrote:


Tested x86_64-pc-linux-gnu, OK for trunk?


OK


-- 8< --

-Waddress-of-packed-member, in addition to the documented warning about
taking the address of a packed member, also warns about casting from
a pointer to a TYPE_PACKED type to a pointer to a type with greater
alignment.

This wrongly warns if the source is a pointer to enum when -fshort-enums
is on, since that is also represented by TYPE_PACKED.

And there's already -Wcast-align to catch casting from pointer to less
aligned type (packed or otherwise) to pointer to more aligned type; even
apart from the enum problem, this seems like a somewhat arbitrary subset of
that warning.  Though that isn't currently on by default.

So, this patch removes the undocumented type-based warning from
-Waddress-of-packed-member.  Some of the tests where the warning is
desirable I changed to use -Wcast-align=strict instead.  The ones that
require -Wno-incompatible-pointer-types, I just removed.

gcc/c-family/ChangeLog:

 * c-warn.cc (check_address_or_pointer_of_packed_member):
 Remove warning based on TYPE_PACKED.

gcc/testsuite/ChangeLog:

 * c-c++-common/Waddress-of-packed-member-1.c: Don't expect
 a warning on the cast cases.
 * c-c++-common/pr51628-35.c: Use -Wcast-align=strict.
 * g++.dg/warn/Waddress-of-packed-member3.C: Likewise.
 * gcc.dg/pr88928.c: Likewise.
 * gcc.dg/pr51628-20.c: Removed.
 * gcc.dg/pr51628-21.c: Removed.
 * gcc.dg/pr51628-25.c: Removed.
---
  gcc/c-family/c-warn.cc    | 58 
+--

  .../Waddress-of-packed-member-1.c | 12 ++--
  gcc/testsuite/c-c++-common/pr51628-35.c   |  6 +-
  .../g++.dg/warn/Waddress-of-packed-member3.C  |  8 +--
  gcc/testsuite/gcc.dg/pr51628-20.c | 11 
  gcc/testsuite/gcc.dg/pr51628-21.c | 11 
  gcc/testsuite/gcc.dg/pr51628-25.c |  9 ---
  gcc/testsuite/gcc.dg/pr88928.c    |  6 +-
  8 files changed, 19 insertions(+), 102 deletions(-)
  delete mode 100644 gcc/testsuite/gcc.dg/pr51628-20.c
  delete mode 100644 gcc/testsuite/gcc.dg/pr51628-21.c
  delete mode 100644 gcc/testsuite/gcc.dg/pr51628-25.c

diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index d2938b91043..2a399ba6d14 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -2991,10 +2991,9 @@ check_alignment_of_packed_member (tree type, 
tree field, bool rvalue)

    return NULL_TREE;
  }

-/* Return struct or union type if the right hand value, RHS:
-   1. Is a pointer value which isn't aligned to a pointer type TYPE.
-   2. Is an address which takes the unaligned address of packed member
-  of struct or union when assigning to TYPE.
+/* Return struct or union type if the right hand value, RHS
+   is an address which takes the unaligned address of packed member
+   of struct or union when assigning to TYPE.
 Otherwise, return NULL_TREE.  */

  static tree
@@ -3021,57 +3020,6 @@ check_address_or_pointer_of_packed_member 
(tree type, tree rhs)


    type = TREE_TYPE (type);

-  if (TREE_CODE (rhs) == PARM_DECL
-  || VAR_P (rhs)
-  || TREE_CODE (rhs) == CALL_EXPR)
-    {
-  tree rhstype = TREE_TYPE (rhs);
-  if (TREE_CODE (rhs) == CALL_EXPR)
-   {
- rhs = CALL_EXPR_FN (rhs); /* Pointer expression.  */
- if (rhs == NULL_TREE)
-   return NULL_TREE;
- rhs = TREE_TYPE (rhs);    /* Pointer type.  */
- /* We could be called while processing a template and RHS 
could be

-    a functor.  In that case it's a class, not a pointer.  */
- if (!rhs || !POINTER_TYPE_P (rhs))
-   return NULL_TREE;
- rhs = TREE_TYPE (rhs);    /* Function type.  */
- rhstype = TREE_TYPE (rhs);
- if (!rhstype || !POINTER_TYPE_P (rhstype))
-   return NULL_TREE;
- rvalue = true;
-   }
-  if (rvalue && POINTER_TYPE_P (rhstype))
-   rhstype = TREE_TYPE (rhstype);
-  while (TREE_CODE (rhstype) == ARRAY_TYPE)
-   rhstype = TREE_TYPE (rhstype);
-  if (TYPE_PACKED (rhstype))
-   {
- unsigned int type_align = min_align_of_type (type);
-  

Re: [PATCH v2] c++: implement [[gnu::non_owning]] [PR110358]

2024-02-28 Thread Jason Merrill

On 2/21/24 19:35, Marek Polacek wrote:

On Fri, Jan 26, 2024 at 04:04:35PM -0500, Jason Merrill wrote:

On 1/25/24 20:37, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Since -Wdangling-reference has false positives that can't be
prevented, we should offer an easy way to suppress the warning.
Currently, that is only possible by using a #pragma, either around the
enclosing class or around the call site.  But #pragma GCC diagnostic tend
to be onerous.  A better solution would be to have an attribute.  Such
an attribute should not be tied to this particular warning though.  [*]

The warning bogusly triggers for classes that are like std::span,
std::reference_wrapper, and std::ranges::ref_view.  The common property
seems to be that these classes are only wrappers around some data.  So
I chose the name non_owning, but I'm not attached to it.  I hope that
in the future the attribute can be used for something other than this
diagnostic.


You decided not to pursue Barry's request for a bool argument to the
attribute?


At first I thought it'd be an unnecessary complication but it was actually
pretty easy.  Better to accept the optional argument from the get-go
otherwise people would have to add > GCC 14 checks.
  

Might it be more useful for the attribute to make reference_like_class_p
return true, so that we still warn about a temporary of another type passing
through it?


Good point.  Fixed.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Since -Wdangling-reference has false positives that can't be
prevented, we should offer an easy way to suppress the warning.
Currently, that is only possible by using a #pragma, either around the
enclosing class or around the call site.  But #pragma GCC diagnostic tend
to be onerous.  A better solution would be to have an attribute.  Such
an attribute should not be tied to this particular warning though.

The warning bogusly triggers for classes that are like std::span,
std::reference_wrapper, and std::ranges::ref_view.  The common property
seems to be that these classes are only wrappers around some data.  So
I chose the name non_owning, but I'm not attached to it.  I hope that
in the future the attribute can be used for something other than this
diagnostic.

This attribute takes an optional bool argument to support cases like:

   template <typename T>
   struct [[gnu::non_owning(std::is_reference_v<T>)]] S {
  // ...
   };

PR c++/110358
PR c++/109642

gcc/cp/ChangeLog:

* call.cc (non_owning_p): New.
(reference_like_class_p): Use it.
(do_warn_dangling_reference): Use it.  Don't warn when the function
or its enclosing class has attribute gnu::non_owning.
* tree.cc (cxx_gnu_attributes): Add gnu::non_owning.
(handle_non_owning_attribute): New.

gcc/ChangeLog:

* doc/extend.texi: Document gnu::non_owning.
* doc/invoke.texi: Mention that gnu::non_owning disables
-Wdangling-reference.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-non-owning1.C: New test.
* g++.dg/ext/attr-non-owning2.C: New test.
* g++.dg/ext/attr-non-owning3.C: New test.
* g++.dg/ext/attr-non-owning4.C: New test.
* g++.dg/ext/attr-non-owning5.C: New test.
* g++.dg/ext/attr-non-owning6.C: New test.
* g++.dg/ext/attr-non-owning7.C: New test.
* g++.dg/ext/attr-non-owning8.C: New test.
* g++.dg/ext/attr-non-owning9.C: New test.
---
  gcc/cp/call.cc  | 38 ++--
  gcc/cp/tree.cc  | 26 +
  gcc/doc/extend.texi | 25 
  gcc/doc/invoke.texi | 21 +++
  gcc/testsuite/g++.dg/ext/attr-non-owning1.C | 38 
  gcc/testsuite/g++.dg/ext/attr-non-owning2.C | 29 +
  gcc/testsuite/g++.dg/ext/attr-non-owning3.C | 24 
  gcc/testsuite/g++.dg/ext/attr-non-owning4.C | 14 +
  gcc/testsuite/g++.dg/ext/attr-non-owning5.C | 31 ++
  gcc/testsuite/g++.dg/ext/attr-non-owning6.C | 65 +
  gcc/testsuite/g++.dg/ext/attr-non-owning7.C | 31 ++
  gcc/testsuite/g++.dg/ext/attr-non-owning8.C | 30 ++
  gcc/testsuite/g++.dg/ext/attr-non-owning9.C | 25 
  13 files changed, 391 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-non-owning1.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-non-owning2.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-non-owning3.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-non-owning4.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-non-owning5.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-non-owning6.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-non-owning7.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-non-owning8.C
  create mode 100644 gcc/testsuite/g++.dg/ext/attr-non-owning9.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc

Re: [PATCH 1/3] Change 'v1' float and int code to fall back to v0

2024-02-28 Thread Tom Tromey
> "Jeff" == Jeff Law  writes:

Jeff> Given this is all libcc1 related and thus primarily of interest to
Jeff> gdb, if you're happy with it, then it's OK for the trunk.

Thank you.

I could not push this because:

remote: *** ChangeLog format failed:
remote: *** ERR: invalid PR component in subject: "Fix PR libcc1/113977"

I guess this script isn't in sync with the components in bugzilla.

I don't know how to fix this.

Tom


Re: [PATCH] c++: Fix explicit instantiation of const variable templates after earlier implicit instantiation [PR113976]

2024-02-28 Thread Jason Merrill

On 2/26/24 12:10, Patrick Palka wrote:
> On Tue, 20 Feb 2024, Jakub Jelinek wrote:
>
>> Hi!
>>
>> Already previously instantiated const variable templates had
>> cp_apply_type_quals_to_decl called when they were instantiated,
>> but if they need runtime initialization, their TREE_READONLY flag
>> has been subsequently cleared.
>> Explicit variable template instantiation calls grokdeclarator which
>> calls cp_apply_type_quals_to_decl on them again, setting TREE_READONLY
>> flag again, but nothing clears it afterwards, so we emit such
>> instantiations into rodata sections and segfault when the dynamic
>> initialization attempts to initialize them.
>>
>> The following patch fixes that by not calling cp_apply_type_quals_to_decl
>> on already instantiated variable declarations.
>
> LGTM, this seems like the safest approach for backporting.  Note
> we can't check DECL_EXPLICIT_INSTANTIATION at this point because
> that doesn't get set until later from do_decl_instantiation.

Agreed, OK.



Re: [PATCH] c++: auto(x) partial substitution [PR110025, PR114138]

2024-02-28 Thread Jason Merrill

On 2/27/24 15:48, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk and perhaps 13?

-- >8 --

In r12-6773-g09845ad7569bac we gave CTAD placeholders a level of 0 and
ensured we never replaced them via tsubst.  It turns out that autos
representing an explicit cast need the same treatment and for the same
reason: such autos appear in an expression context and so their level
gets easily messed up after partial substitution, leading to premature
replacement via an incidental tsubst instead of via do_auto_deduction.

This patch fixes this by extending the r12-6773 approach to auto(x) and
auto{x}.

PR c++/110025
PR c++/114138

gcc/cp/ChangeLog:

* cp-tree.h (make_cast_auto): Declare.
* parser.cc (cp_parser_functional_cast): Replace a parsed auto
with a level-less one via make_cast_auto.
* pt.cc (find_parameter_packs_r): Don't treat level-less auto
as a type parameter pack.
(tsubst) <case TEMPLATE_TYPE_PARM>: Generalized CTAD placeholder
handling to all level-less autos.
(make_cast_auto): Define.
(do_auto_deduction): Handle deduction of a level-less non-CTAD
auto.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast16.C: New test.
* g++.dg/cpp23/auto-fncast17.C: New test.
* g++.dg/cpp23/auto-fncast18.C: New test.
---
  gcc/cp/cp-tree.h   |  1 +
  gcc/cp/parser.cc   | 11 
  gcc/cp/pt.cc   | 31 +-
  gcc/testsuite/g++.dg/cpp23/auto-fncast16.C | 12 
  gcc/testsuite/g++.dg/cpp23/auto-fncast17.C | 15 +
  gcc/testsuite/g++.dg/cpp23/auto-fncast18.C | 71 ++
  6 files changed, 138 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast16.C
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast17.C
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast18.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 04c3aa6cd91..6f1da1c7bad 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7476,6 +7476,7 @@ extern tree make_decltype_auto(void);
  extern tree make_constrained_auto (tree, tree);
  extern tree make_constrained_decltype_auto(tree, tree);
  extern tree make_template_placeholder (tree);
+extern tree make_cast_auto (void);
  extern bool template_placeholder_p(tree);
  extern bool ctad_template_p   (tree);
  extern bool unparenthesized_id_or_class_member_access_p (tree);
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 3ee9d49fb8e..1e518e6ef51 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -33314,6 +33314,17 @@ cp_parser_functional_cast (cp_parser* parser, tree 
type)
if (!type)
  type = error_mark_node;
  
+  if (TREE_CODE (type) == TYPE_DECL

+  && is_auto (TREE_TYPE (type)))
+type = TREE_TYPE (type);
+
+  if (is_auto (type)
+  && !AUTO_IS_DECLTYPE (type)
+  && !PLACEHOLDER_TYPE_CONSTRAINTS (type)
+  && !CLASS_PLACEHOLDER_TEMPLATE (type))
+/* auto(x) and auto{x} are represented using a level-less auto.  */
+type = make_cast_auto ();
+
if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
  {
cp_lexer_set_source_position (parser->lexer);
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 2803824d11e..620fe5cdbfa 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -3921,7 +3921,8 @@ find_parameter_packs_r (tree *tp, int *walk_subtrees, 
void* data)
 parameter pack (14.6.3), or the type-specifier-seq of a type-id that
 is a pack expansion, the invented template parameter is a template
 parameter pack.  */
-  if (ppd->type_pack_expansion_p && is_auto (t))
+  if (ppd->type_pack_expansion_p && is_auto (t)
+ && TEMPLATE_TYPE_LEVEL (t) != 0)
TEMPLATE_TYPE_PARAMETER_PACK (t) = true;
if (TEMPLATE_TYPE_PARAMETER_PACK (t))
  parameter_pack_p = true;
@@ -16297,9 +16298,14 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
}
  
  case TEMPLATE_TYPE_PARM:

-  if (template_placeholder_p (t))
+  if (TEMPLATE_TYPE_LEVEL (t) == 0)
{
+ /* Level-less auto must be replaced via do_auto_deduction.  */


This comment could use clarification about the CTAD case.


+ gcc_checking_assert (is_auto (t));
  tree tmpl = CLASS_PLACEHOLDER_TEMPLATE (t);
+ if (!tmpl)
+   return t;
+
  tmpl = tsubst_expr (tmpl, args, complain, in_decl);
  if (TREE_CODE (tmpl) == TEMPLATE_TEMPLATE_PARM)
tmpl = TEMPLATE_TEMPLATE_PARM_TEMPLATE_DECL (tmpl);
@@ -29311,6 +29317,17 @@ template_placeholder_p (tree t)
return is_auto (t) && CLASS_PLACEHOLDER_TEMPLATE (t);
  }
  
+/* Return an auto for an explicit cast, e.g. auto(x) or auto{x}.

+   Like CTAD placeholders, these have level 0 so that they're not
+   accidentally replaced 

[PATCH] contrib/gcc-changelog/git_check_commit.py: Implement --num-commits

2024-02-28 Thread Ken Matsui
This patch implements a --num-commits (-n) flag as shorthand for
the commit range hash~N..hash.

contrib/ChangeLog:

* gcc-changelog/git_check_commit.py: Implement --num-commits.

Signed-off-by: Ken Matsui 
---
 contrib/gcc-changelog/git_check_commit.py | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/contrib/gcc-changelog/git_check_commit.py 
b/contrib/gcc-changelog/git_check_commit.py
index 8cca9f439a5..22e032e8b38 100755
--- a/contrib/gcc-changelog/git_check_commit.py
+++ b/contrib/gcc-changelog/git_check_commit.py
@@ -22,6 +22,12 @@ import argparse
 
 from git_repository import parse_git_revisions
 
+def nonzero_uint(value):
+    ivalue = int(value)
+    if ivalue <= 0:
+        raise argparse.ArgumentTypeError('%s is not a non-zero positive integer' % value)
+    return ivalue
+
 parser = argparse.ArgumentParser(description='Check git ChangeLog format '
  'of a commit')
 parser.add_argument('revisions', default='HEAD', nargs='?',
@@ -33,8 +39,17 @@ parser.add_argument('-p', '--print-changelog', 
action='store_true',
 help='Print final changelog entires')
 parser.add_argument('-v', '--verbose', action='store_true',
 help='Print verbose information')
+parser.add_argument('-n', '--num-commits', type=nonzero_uint, default=1,
+help='Number of commits to check (i.e. shorthand for '
+'hash~N..hash)')
 args = parser.parse_args()
 
+if args.num_commits > 1:
+    if '..' in args.revisions:
+        print('ERR: --num-commits and range of revisions are mutually exclusive')
+        exit(1)
+    args.revisions = '{0}~{1}..{0}'.format(args.revisions, args.num_commits)
+
 retval = 0
 for git_commit in parse_git_revisions(args.git_path, args.revisions):
 res = 'OK' if git_commit.success else 'FAILED'
-- 
2.44.0



Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64

2024-02-28 Thread Palmer Dabbelt

On Wed, 28 Feb 2024 09:36:38 PST (-0800), Patrick O'Neill wrote:


On 2/28/24 07:02, Palmer Dabbelt wrote:

On Wed, 28 Feb 2024 06:57:53 PST (-0800), jeffreya...@gmail.com wrote:



On 2/28/24 05:23, Kito Cheng wrote:

atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic
operation on
RV64, however lr.w is doing sign extend to DI and compare
instruction only have
DI mode on RV64, so the expected value should be sign extend before
compare as
well, so that we can get right compare result.

gcc/ChangeLog:

PR target/114130
* config/riscv/sync.md (atomic_compare_and_swap): Sign
extend the expected value if needed.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr114130.c: New.

Nearly rejected this as I think the description was a bit ambiguous and
I thought you were extending the result of the lr.w.  But it's actually
the other value you're ensuring gets properly extended.


I had the same response, but after reading it I'm not quite sure how
to say it better.


Maybe something like

   atomic_compare_and_swapsi will use lr.w to obtain the original value,
   which sign extends to DI.  RV64 only has DI comparisons, so we also need 
   to sign extend the expected value to DI as otherwise the comparison will 
   fail when the expected value has the 32nd bit set.


would do it?  Either way

Reviewed-by: Palmer Dabbelt 

as I've managed to convince myself it's correct.  We should probably 
backport this one, the bug has likely been around for a while.
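To make the failure mode concrete, here is a minimal C++ sketch of the comparison. It is not the sync.md expansion; the helper names are made up for illustration and only model 64-bit register contents, assuming lr.w sign-extends the loaded word as described above.

```cpp
#include <cstdint>

// Hypothetical model of lr.w on RV64: a 32-bit load whose result is
// sign-extended into a 64-bit register.
constexpr std::int64_t load_word_like_lr_w (std::uint32_t mem)
{
  return static_cast<std::int64_t> (static_cast<std::int32_t> (mem));
}

// Expected value as it sits in a register when it is only zero-extended.
constexpr std::int64_t zero_extended (std::uint32_t v)
{
  return static_cast<std::int64_t> (v);
}

// Expected value after an explicit sign extension to 64 bits.
constexpr std::int64_t sign_extended (std::uint32_t v)
{
  return static_cast<std::int64_t> (static_cast<std::int32_t> (v));
}

// 64-bit comparison against a zero-extended expected value.
constexpr bool compare_unfixed (std::uint32_t mem, std::uint32_t expected)
{
  return load_word_like_lr_w (mem) == zero_extended (expected);
}

// 64-bit comparison once the expected value is sign-extended as well.
constexpr bool compare_fixed (std::uint32_t mem, std::uint32_t expected)
{
  return load_word_like_lr_w (mem) == sign_extended (expected);
}

// Identical 32-bit values with bit 31 set compare unequal unless both
// sides are extended the same way.
static_assert (!compare_unfixed (0x80000000u, 0x80000000u), "");
static_assert (compare_fixed (0x80000000u, 0x80000000u), "");
```

Values with bit 31 clear compare equal either way, which would explain why a bug of this shape can go unnoticed for a long time.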





OK.


I was looking at the code to try and ask if we have the same bug for
the short inline CAS routines, but I've got to run to some meetings...


I don't think subword AMO CAS is impacted.

As part of the CAS we mask both the expected value [2] and the retrieved
value[1] before comparing.
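For readers less familiar with the subword scheme, the mask-then-compare idea can be sketched in portable C++ roughly as follows. This is only an illustration built on std::atomic, not what sync.md emits; the function name and the byte_offset parameter (0 or 2, selecting the low or high halfword of the containing word's value) are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical 16-bit compare-and-swap implemented on top of a 32-bit
// atomic word.  Both the expected value and the value read from the word
// are masked to the halfword lane before they are compared.
bool cas16_within_word (std::atomic<std::uint32_t> &word,
                        unsigned byte_offset,
                        std::uint16_t &expected,
                        std::uint16_t desired)
{
  const unsigned shift = byte_offset * 8;
  const std::uint32_t mask = 0xFFFFu << shift;
  const std::uint32_t shifted_expected
    = (static_cast<std::uint32_t> (expected) << shift) & mask;
  const std::uint32_t shifted_desired
    = (static_cast<std::uint32_t> (desired) << shift) & mask;

  std::uint32_t old = word.load (std::memory_order_relaxed);
  for (;;)
    {
      if ((old & mask) != shifted_expected)
        {
          // Mismatch: report the halfword actually found.
          expected = static_cast<std::uint16_t> ((old & mask) >> shift);
          return false;
        }
      // Keep the neighbouring halfword, replace only the selected lane.
      const std::uint32_t updated = (old & ~mask) | shifted_desired;
      if (word.compare_exchange_weak (old, updated,
                                      std::memory_order_relaxed))
        return true;
      // On failure OLD has been refreshed with the current word; retry.
    }
}
```

Because only masked lanes are compared, whatever sits in the other halfword (or in the upper register bits) cannot affect the outcome.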


I'm always a bit lost when it comes to bit arithmetic, but I think it's 
OK.  It smells like it's being a little loose with the 
extensions/comparisons, but just looking at some generated code for this 
simple case:


   void foo(uint16_t *p, uint16_t *e, uint16_t *d) {
       __atomic_compare_exchange(p, e, d, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
   }

   foo:
   lhu     a3,0(a2)
   lhu     a2,0(a1)
   andi    a4,a0,3
   li      a5,65536
   slliw   a4,a4,3
   addiw   a5,a5,-1
   sllw    a5,a5,a4
   sllw    a3,a3,a4
   sllw    a7,a2,a4
   andi    a0,a0,-4
   and     a3,a3,a5
   not     t1,a5
   and     a7,a7,a5
   1:
   lr.w    a6, 0(a0)
   and     t3, a6, a5    // Both a6 (from the lr.w) and a5
                         // (from the sllw) are sign extended,
                         // so the result in t3 is sign extended.
   bne     t3, a7, 1f    // a7 is also sign extended (before
                         // and after the masking above), so
                         // it's safe for comparison

   and     t3, a6, t1
   or      t3, t3, a3
   sc.w    t3, t3, 0(a0) // The top bits of t3 end up polluted
                         // with sign extension, but it doesn't
                         // matter because of the sc.w.

   bnez    t3, 1b
   1:
   sraw    a6,a6,a4
   slliw   a2,a2,16
   slliw   a5,a6,16
   sraiw   a2,a2,16
   sraiw   a5,a5,16
   subw    a5,a5,a2
   beq     a5,zero,.L1
   sh      a6,0(a1)
   .L1:
   ret

So I think we're OK -- that masking of a7 looks redundant here, but I 
don't think we could get away with just


   diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
   index 54bb0a66518..15956940032 100644
   --- a/gcc/config/riscv/sync.md
   +++ b/gcc/config/riscv/sync.md
   @@ -456,7 +456,6 @@ (define_expand "atomic_cas_value_strong"
  riscv_lshift_subword (mode, o, shift, _o);
  riscv_lshift_subword (mode, n, shift, _n);
   
   -  emit_move_insn (shifted_o, gen_rtx_AND (SImode, shifted_o, mask));

  emit_move_insn (shifted_n, gen_rtx_AND (SImode, shifted_n, mask));
   
  enum memmodel model_success = (enum memmodel) INTVAL (operands[4]);


because we'd need the masking for when we don't know the high bits are 
safe pre-shift.  So maybe some sort of simplify call could help out 
there, but I bet it's not really worth bothering -- the bookkeeping 
doesn't generally matter that much around AMOs.



- Patrick

[1]:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l495
[2]:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l459





Jeff


Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-28 Thread Segher Boessenkool
On Wed, Feb 28, 2024 at 11:58:15AM -0600, Peter Bergner wrote:
> On 2/28/24 8:31 AM, Segher Boessenkool wrote:
> > On Tue, Feb 27, 2024 at 04:50:02PM -0600, Peter Bergner wrote:
> >> So it seems you're not NAKing the use of splat_input_operand, but
> >> just that it needs more explanation in the git log entry, correct?
> > 
> > I NAK the patch.  _Of course_ there needs to be *something* done, there
> > is a bug after all, it needs to be fixed.
> > 
> > But no, there are big questions about if splat_input_operand is correct
> > as well.  This needs to be justified in the patch submission.
> 
> Ok, then Jeevitha, repost the patch with the s/op1/operands[1]/ only change.
> Jeevitha has already bootstrapped and regtested that change and it does
> fix the bug.
> 
> Clearly, the splat_input_operand change needs more discussion and would
> be a follow-on patch...if we decide to do it at all.

It is clear that input_operand is wrong.  It isn't clear that
splat_input_operand is correct though :-(


Segher


Re: [PATCH] Fortran - Error compiling PDT Type-bound Procedures [PR82943/86148/86268]

2024-02-28 Thread Harald Anlauf

Hi Alex,

this is now mostly correct, with the following exceptions:

First, you should notice that the formatting of the commit message,
when checked using "git gcc-verify", needs minor corrections.  You
will be guided how to fix this yourself.

Second, testcase pdt_37.f03 has an undeclared dummy argument, which
can be detected by adding "implicit none" (I usually use that
whenever implicit typing is not wanted explicitly).  I would get:

pdt_37.f03:33:47:

   33 | subroutine assumed_len_param_ptr(this, that)
  |   1
Error: Symbol 'that' at (1) has no IMPLICIT type; did you mean 'this'?

I assume you want to uncomment the declaration of dummy 'that'.

Third, I still see a - minor - indentation/tabbing/space issue here:

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 44f89f6afb4..852e0820e6a 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
[...]
+  if ( resolve_bindings_derived->attr.pdt_template
+ && gfc_pdt_is_instance_of (resolve_bindings_derived,
+   CLASS_DATA (me_arg)->ts.u.derived)
+  && (me_arg->param_list != NULL)
+  && (gfc_spec_list_type (me_arg->param_list,
+CLASS_DATA(me_arg)->ts.u.derived)
+!= SPEC_ASSUMED))

OK with the above fixed.

Thanks for the patch!

Harald

On 2/28/24 07:24, Alexander Westbrooks wrote:

Harald,

Jerry helped me figure out my editor settings so that I could fix
whitespace and formatting issues in my code. With my editor configured
correctly, I saw that my code was not conforming to coding standards
as I previously thought it was. I have fixed those things and updated
my patch. Thank you for your patience.

Let me know if this is okay to push to the trunk.

Thanks,

Alexander Westbrooks

On Sun, Feb 25, 2024 at 2:40 PM Alexander Westbrooks
 wrote:


Harald,

Thank you for reviewing my code. I've been doing research and debugging to 
investigate the error thrown by Intel and NAG for the deferred parameter in the 
dummy variable declaration. I found where the problem was and added the fix as 
part of my patch. I've attached the patch as a file, which also includes your 
feedback and suggested fixes. I've updated the test case pdt_37.f03 to check 
for the POINTER or ALLOCATABLE error as you suggested.

All regression tests pass, including the new ones, after including the fix for 
the POINTER or ALLOCATABLE error for CLASS declarations of PDTs when deferred 
length parameters are used. This was tested on WSL 2, with Ubuntu 20.04 distro.

Is this okay to push to the trunk?

Thanks,

Alexander Westbrooks


On Sun, Feb 11, 2024 at 2:11 PM Harald Anlauf  wrote:


Hi Alex,

I've been unable to apply your patch to my local trunk, likely due to
whitespace issues my newsreader handles differently from your site.
I see it inline instead of attached.

A few general remarks:

Please follow the general recommendation regarding style if possible,
see https://www.gnu.org/prep/standards/standards.html#Formatting
regarding formatting/whitespace use (5.1) and comments (5.2)

Also, when an error message text spans multiple lines, please place the
whitespace at the end of a line, not at the beginning of the new one:


+  if ( resolve_bindings_derived->attr.pdt_template &&
+   !gfc_pdt_is_instance_of(resolve_bindings_derived,
+   CLASS_DATA(me_arg)->ts.u.derived))
+{
+  gfc_error ("Argument %qs of %qs with PASS(%s) at %L must be of"
+" the parametric derived-type %qs", me_arg->name, proc->name,


gfc_error ("Argument %qs of %qs with PASS(%s) at %L must be of "
   "the parametric derived-type %qs", me_arg->name,
proc->name,


+me_arg->name, , resolve_bindings_derived->name);
+  goto error;
+}


The following change is almost unreadable: the lengthy comment is split
over three parts and almost hides the code.  Couldn't this be combined
into one comment before the function?


diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
index fddf68f8398..11f4bac0415 100644
--- a/gcc/fortran/symbol.cc
+++ b/gcc/fortran/symbol.cc
@@ -5172,6 +5172,35 @@ gfc_type_is_extension_of (gfc_symbol *t1, gfc_symbol
*t2)
 return gfc_compare_derived_types (t1, t2);
   }

+/* Check if a parameterized derived type t2 is an instance of a PDT
template t1 */
+
+bool
+gfc_pdt_is_instance_of(gfc_symbol *t1, gfc_symbol *t2)
+{
+  if ( !t1->attr.pdt_template || !t2->attr.pdt_type )
+return false;
+
+  /*
+in decl.cc, gfc_get_pdt_instance, a pdt instance is given a 3
character prefix "Pdt", followed
+by an underscore list of the kind parameters, up to a maximum of 8.
+
+So to check if a PDT Type corresponds to the template, extract the
core derive_type name,
+and then see if it is type compatible by name...
+
+For example:
+
+Pdtf_2_2 -> extract out the 'f' -> see if the derived type 'f' is

[PATCH] OpenMP: warn about iteration var modifications in loop body

2024-02-28 Thread Frederik Harwath

Hi,

this patch implements a warning about (some simple cases of direct)
modifications of iteration variables in OpenMP loops which are forbidden
according to the OpenMP specification. I think this can be helpful,
especially for new OpenMP users. I have implemented this after I
observed some confusion concerning this topic recently.
The check is implemented during gimplification. It reuses the
"loop_iter_var" vector in the "gimplify_omp_ctx" which was previously
only used for "doacross" handling to identify the loop iteration
variables during the gimplification of MODIFY_EXPRs in omp_for bodies.
I have only added a common C/C++ test because I don't see any special
C++ constructs for which a warning *should* be emitted and Fortran
rejects modifications of iteration variables in do loops in general.
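As a rough illustration of the kind of code this is aimed at (a sketch only; the function name is made up and nothing here is taken from the new test file): directly modifying the iteration variable in the loop body, as in the commented-out loop, is the pattern the warning is meant to catch, while moving the stride into the loop header keeps the loop conforming.

```cpp
#include <cassert>
#include <vector>

// Non-conforming variant, shown only as a comment.  The "++i" in the body
// modifies the iteration variable of an OpenMP loop:
//
//   #pragma omp parallel for reduction(+ : total)
//   for (int i = 0; i < n; ++i)
//     {
//       total += v[i];
//       ++i;   /* forbidden modification of the iteration variable */
//     }

// Conforming variant: the stride is expressed in the loop header, so the
// body never writes to the iteration variable.
long sum_every_other (const std::vector<int> &v)
{
  long total = 0;
  const int n = static_cast<int> (v.size ());
#pragma omp parallel for reduction(+ : total)
  for (int i = 0; i < n; i += 2)
    total += v[i];
  return total;
}
```

Without -fopenmp the pragma is simply ignored and the function computes the same sum serially.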

I have run "make check" on x86_64-linux-gnu and not observed any
regressions.

Is it ok to commit this?

Best regards,
Frederik
From 4944a9f94bcda9907e0118e71137ee7e192657c2 Mon Sep 17 00:00:00 2001
From: Frederik Harwath 
Date: Tue, 27 Feb 2024 21:07:00 +
Subject: [PATCH] OpenMP: warn about iteration var modifications in loop body

OpenMP loop iteration variables may not be changed by user code in the
loop body according to the OpenMP specification.  In general, the
compiler cannot enforce this, but nevertheless simple cases in which
the user modifies the iteration variable directly in the loop body
(in contrast to, e.g., modifications through a pointer) can be recognized. A
warning should be useful, for instance, to new users of OpenMP.

This commit implements a warning about forbidden iteration var modifications
during gimplification. It reuses the "loop_iter_var" vector in the
"gimplify_omp_ctx" which was previously only used for "doacross" handling to
identify the loop iteration variables during the gimplification of MODIFY_EXPRs
in omp_for bodies.

gcc/ChangeLog:

	* gimplify.cc (struct gimplify_omp_ctx): Add field "in_omp_for_body" to
	recognize the gimplification state during which the new warning should
	be emitted. Add field "is_doacross" to distinguish the original use of
	"loop_iter_var" from its new use.
	(new_omp_context): Initialize new gimplify_omp_ctx fields.
	(gimplify_modify_expr): Emit warning if iter var is modified.
	(gimplify_omp_for): Make initialization and filling of loop_iter_var
	vector unconditional and adjust new gimplify_omp_ctx fields before
	gimplifying the omp_for body.
	(gimplify_omp_ordered): Check for is_doacross field in addition to
	emptiness check on loop_iter_var vector since the vector is now always
	being filled.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/iter-var-modification.c: New test.

Signed-off-by: Frederik Harwath  
---
 gcc/gimplify.cc   |  54 +++---
 .../c-c++-common/gomp/iter-var-modification.c | 100 ++
 2 files changed, 138 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/iter-var-modification.c

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 7f79b3cc7e6..a74ad987cf7 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -235,6 +235,8 @@ struct gimplify_omp_ctx
   bool order_concurrent;
   bool has_depend;
   bool in_for_exprs;
+  bool in_omp_for_body;
+  bool is_doacross;
   int defaultmap[5];
 };
 
@@ -456,6 +458,10 @@ new_omp_context (enum omp_region_type region_type)
   c->privatized_types = new hash_set;
   c->location = input_location;
   c->region_type = region_type;
+  c->loop_iter_var.create (0);
+  c->in_omp_for_body = false;
+  c->is_doacross = false;
+
   if ((region_type & ORT_TASK) == 0)
 c->default_kind = OMP_CLAUSE_DEFAULT_SHARED;
   else
@@ -6312,6 +6318,18 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
   gcc_assert (TREE_CODE (*expr_p) == MODIFY_EXPR
 	  || TREE_CODE (*expr_p) == INIT_EXPR);
 
+  if (gimplify_omp_ctxp && gimplify_omp_ctxp->in_omp_for_body)
+{
+  size_t num_vars = gimplify_omp_ctxp->loop_iter_var.length () / 2;
+  for (size_t i = 0; i < num_vars; i++)
+	{
+	  if (*to_p == gimplify_omp_ctxp->loop_iter_var[2 * i + 1])
+	warning_at (input_location, OPT_Wopenmp,
+			"forbidden modification of iteration variable %qE in "
+			"OpenMP loop", *to_p);
+	}
+}
+
   /* Trying to simplify a clobber using normal logic doesn't work,
  so handle it here.  */
   if (TREE_CLOBBER_P (*from_p))
@@ -15334,6 +15352,8 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 	  == TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
   gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
 	  == TREE_VEC_LENGTH (OMP_FOR_INCR (for_stmt)));
+  int len = TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt));
+  gimplify_omp_ctxp->loop_iter_var.create (len * 2);
 
   tree c = omp_find_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_ORDERED);
   bool is_doacross = false;
@@ -15342,8 +15362,6 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
 {
   OMP_CLAUSE_ORDERED_DOACROSS (c) = 1;
   

[PATCH v14 23/26] c++: Implement __is_invocable built-in trait

2024-02-28 Thread Ken Matsui
This patch implements the __is_invocable built-in trait for std::is_invocable.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_invocable.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_INVOCABLE.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.
* cp-tree.h (build_invoke): New function.
* method.cc (build_invoke): New function.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_invocable.
* g++.dg/ext/is_invocable1.C: New test.
* g++.dg/ext/is_invocable2.C: New test.
* g++.dg/ext/is_invocable3.C: New test.
* g++.dg/ext/is_invocable4.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |   6 +
 gcc/cp/cp-trait.def  |   1 +
 gcc/cp/cp-tree.h |   2 +
 gcc/cp/method.cc | 132 +
 gcc/cp/semantics.cc  |   4 +
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |   3 +
 gcc/testsuite/g++.dg/ext/is_invocable1.C | 349 +++
 gcc/testsuite/g++.dg/ext/is_invocable2.C | 139 +
 gcc/testsuite/g++.dg/ext/is_invocable3.C |  51 
 gcc/testsuite/g++.dg/ext/is_invocable4.C |  33 +++
 10 files changed, 720 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_invocable1.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_invocable2.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_invocable3.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_invocable4.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 23ea66d9c12..c87b126fdb1 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3791,6 +3791,12 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_FUNCTION:
   inform (loc, "  %qT is not a function", t1);
   break;
+case CPTK_IS_INVOCABLE:
+  if (!t2)
+inform (loc, "  %qT is not invocable", t1);
+  else
+inform (loc, "  %qT is not invocable by %qE", t1, t2);
+  break;
 case CPTK_IS_LAYOUT_COMPATIBLE:
   inform (loc, "  %qT is not layout compatible with %qT", t1, t2);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 85056c8140b..6cb2b55f4ea 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -75,6 +75,7 @@ DEFTRAIT_EXPR (IS_EMPTY, "__is_empty", 1)
 DEFTRAIT_EXPR (IS_ENUM, "__is_enum", 1)
 DEFTRAIT_EXPR (IS_FINAL, "__is_final", 1)
 DEFTRAIT_EXPR (IS_FUNCTION, "__is_function", 1)
+DEFTRAIT_EXPR (IS_INVOCABLE, "__is_invocable", -1)
 DEFTRAIT_EXPR (IS_LAYOUT_COMPATIBLE, "__is_layout_compatible", 2)
 DEFTRAIT_EXPR (IS_LITERAL_TYPE, "__is_literal_type", 1)
 DEFTRAIT_EXPR (IS_MEMBER_FUNCTION_POINTER, "__is_member_function_pointer", 1)
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 334c11396c2..261d3a71faa 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7334,6 +7334,8 @@ extern tree get_copy_assign   (tree);
 extern tree get_default_ctor   (tree);
 extern tree get_dtor   (tree, tsubst_flags_t);
 extern tree build_stub_object  (tree);
+extern tree build_invoke   (tree, const_tree,
+tsubst_flags_t);
 extern tree strip_inheriting_ctors (tree);
 extern tree inherited_ctor_binfo   (tree);
 extern bool base_ctor_omit_inherited_parms (tree);
diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index 98c10e6a8b5..953f1bed6fc 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -1928,6 +1928,138 @@ build_trait_object (tree type)
   return build_stub_object (type);
 }
 
+/* [func.require] Build an expression of INVOKE(FN_TYPE, ARG_TYPES...).  If the
+   given is not invocable, returns error_mark_node.  */
+
+tree
+build_invoke (tree fn_type, const_tree arg_types, tsubst_flags_t complain)
+{
+  if (fn_type == error_mark_node || arg_types == error_mark_node)
+return error_mark_node;
+
+  gcc_assert (TYPE_P (fn_type));
+  gcc_assert (TREE_CODE (arg_types) == TREE_VEC);
+
+  /* Access check is required to determine if the given is invocable.  */
+  deferring_access_check_sentinel acs (dk_no_deferred);
+
+  /* INVOKE is an unevaluated context.  */
+  cp_unevaluated cp_uneval_guard;
+
+  bool is_ptrdatamem;
+  bool is_ptrmemfunc;
+  if (TREE_CODE (fn_type) == REFERENCE_TYPE)
+{
+  tree deref_fn_type = TREE_TYPE (fn_type);
+  is_ptrdatamem = TYPE_PTRDATAMEM_P (deref_fn_type);
+  is_ptrmemfunc = TYPE_PTRMEMFUNC_P (deref_fn_type);
+
+  /* Dereference fn_type if it is a pointer to member.  */
+  if (is_ptrdatamem || is_ptrmemfunc)
+   fn_type = deref_fn_type;
+}
+  else
+{
+  is_ptrdatamem = TYPE_PTRDATAMEM_P (fn_type);
+  is_ptrmemfunc = TYPE_PTRMEMFUNC_P (fn_type);
+}
+
+  if (is_ptrdatamem && TREE_VEC_LENGTH (arg_types) != 1)
+/* Only a pointer to data member with one argument is invocable.  */
+return error_mark_node;

[PATCH v14 26/26] libstdc++: Optimize std::is_nothrow_invocable compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of
std::is_nothrow_invocable by dispatching to the new
__is_nothrow_invocable built-in trait.
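For context, the user-visible behaviour that either implementation has to preserve can be checked with a few static assertions. This is a small sketch against the standard library interface (C++17), with made-up function names; it does not exercise the built-in directly.

```cpp
#include <type_traits>

// Hypothetical declarations; only their types matter here.
int may_throw (int);
int never_throws (int) noexcept;

// Invocable and declared noexcept: nothrow-invocable.
static_assert (std::is_nothrow_invocable<decltype (&never_throws), int>::value,
               "noexcept callee is nothrow-invocable");

// Invocable, but the call may throw: invocable without being nothrow.
static_assert (std::is_invocable<decltype (&may_throw), int>::value,
               "plain callee is invocable");
static_assert (!std::is_nothrow_invocable<decltype (&may_throw), int>::value,
               "plain callee is not nothrow-invocable");

// Not invocable at all with these arguments, hence not nothrow either.
static_assert (!std::is_nothrow_invocable<decltype (&never_throws), void *>::value,
               "no conversion from void* to int");
```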

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_nothrow_invocable): Use
__is_nothrow_invocable built-in trait.
* testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc:
Handle the new error from __is_nothrow_invocable.
* testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc:
Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits  | 4 
 .../20_util/is_nothrow_invocable/incomplete_args_neg.cc   | 1 +
 .../testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc  | 1 +
 3 files changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 9af233bcc75..093d85a51a8 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3265,8 +3265,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// std::is_nothrow_invocable
   template
 struct is_nothrow_invocable
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_nothrow_invocable)
+: public __bool_constant<__is_nothrow_invocable(_Fn, _ArgTypes...)>
+#else
 : __and_<__is_invocable_impl<__invoke_result<_Fn, _ArgTypes...>, void>,
 __call_is_nothrow_<_Fn, _ArgTypes...>>::type
+#endif
 {
   static_assert(std::__is_complete_or_unbounded(__type_identity<_Fn>{}),
"_Fn must be a complete class or an unbounded array");
diff --git 
a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc 
b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc
index 3c225883eaf..3f8542dd366 100644
--- a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-error "must be a complete class" "" { target *-*-* } 0 }
+// { dg-prune-output "invalid use of incomplete type" }
 
 #include 
 
diff --git 
a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc 
b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc
index 5a728bfa03b..d3bdf08448b 100644
--- a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-error "must be a complete class" "" { target *-*-* } 0 }
+// { dg-prune-output "invalid use of incomplete type" }
 
 #include 
 
-- 
2.44.0



[PATCH v14 04/26] libstdc++: Optimize std::is_volatile compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_volatile
by dispatching to the new __is_volatile built-in trait.
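The dispatch pattern can be sketched outside libstdc++ roughly as below. The sketch namespace and the SKETCH_HAS_IS_VOLATILE macro are inventions for this example (the library itself uses _GLIBCXX_USE_BUILTIN_TRAIT), and the fallback is the usual pair of class templates; on a compiler without the built-in only the fallback is compiled.

```cpp
#include <type_traits>

// Hypothetical feature check standing in for _GLIBCXX_USE_BUILTIN_TRAIT.
#ifdef __has_builtin
# if __has_builtin(__is_volatile)
#  define SKETCH_HAS_IS_VOLATILE 1
# endif
#endif

namespace sketch
{
#ifdef SKETCH_HAS_IS_VOLATILE
  // One instantiation per query, answered directly by the compiler.
  template<typename T>
    struct is_volatile
    : std::integral_constant<bool, __is_volatile (T)>
    { };
#else
  // Fallback: primary template plus a partial specialization.
  template<typename>
    struct is_volatile
    : std::false_type { };

  template<typename T>
    struct is_volatile<T volatile>
    : std::true_type { };
#endif
}

static_assert (sketch::is_volatile<volatile int>::value, "");
static_assert (!sketch::is_volatile<int>::value, "");
// Only top-level volatile counts: a pointer to volatile is not volatile.
static_assert (!sketch::is_volatile<volatile int *>::value, "");
static_assert (sketch::is_volatile<int *volatile>::value, "");
```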

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_volatile): Use __is_volatile
built-in trait.
(is_volatile_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 6e9ebfb8a18..60cd22b6f15 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -851,6 +851,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// is_volatile
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_volatile)
+  template
+struct is_volatile
+: public __bool_constant<__is_volatile(_Tp)>
+{ };
+#else
   template
 struct is_volatile
 : public false_type { };
@@ -858,6 +864,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_volatile<_Tp volatile>
 : public true_type { };
+#endif
 
   /// is_trivial
   template
@@ -3356,10 +3363,15 @@ template 
   inline constexpr bool is_function_v<_Tp&&> = false;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_volatile)
+template 
+  inline constexpr bool is_volatile_v = __is_volatile(_Tp);
+#else
 template 
   inline constexpr bool is_volatile_v = false;
 template 
   inline constexpr bool is_volatile_v = true;
+#endif
 
 template 
   inline constexpr bool is_trivial_v = __is_trivial(_Tp);
-- 
2.44.0



[PATCH v14 18/26] libstdc++: Optimize std::add_rvalue_reference compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of
std::add_rvalue_reference by dispatching to the new
__add_rvalue_reference built-in trait.
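As a reminder of what the trait has to compute, here is a sketch of the void_t-style fallback in standalone form, checked against std::add_rvalue_reference. The sketch namespace and helper names are made up for the example; the point is that non-referenceable types such as void come back unchanged and that reference collapsing applies to lvalue references.

```cpp
#include <type_traits>

namespace sketch
{
  // C++11-friendly void_t.
  template<typename...>
    struct make_void { using type = void; };
  template<typename... Ts>
    using void_t = typename make_void<Ts...>::type;

  // Primary template: types for which T&& cannot be formed stay as-is.
  template<typename T, typename = void>
    struct add_rvalue_reference
    { using type = T; };

  // Chosen whenever T&& is a valid type.
  template<typename T>
    struct add_rvalue_reference<T, void_t<T&&>>
    { using type = T&&; };
}

static_assert (std::is_same<sketch::add_rvalue_reference<int>::type,
                            int &&>::value, "");
// Reference collapsing: T& && is T&.
static_assert (std::is_same<sketch::add_rvalue_reference<int &>::type,
                            int &>::value, "");
// void is not referenceable and is returned unchanged.
static_assert (std::is_same<sketch::add_rvalue_reference<void>::type,
                            void>::value, "");
```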

libstdc++-v3/ChangeLog:

* include/std/type_traits (add_rvalue_reference): Use
__add_rvalue_reference built-in trait.
(__add_rvalue_reference_helper): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 17bf47d59d3..18a5e4de2d3 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1185,6 +1185,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// @cond undocumented
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_rvalue_reference)
+  template
+struct __add_rvalue_reference_helper
+{ using type = __add_rvalue_reference(_Tp); };
+#else
   template
 struct __add_rvalue_reference_helper
 { using type = _Tp; };
@@ -1192,6 +1197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct __add_rvalue_reference_helper<_Tp, __void_t<_Tp&&>>
 { using type = _Tp&&; };
+#endif
 
   template
 using __add_rval_ref_t = typename __add_rvalue_reference_helper<_Tp>::type;
@@ -1748,9 +1754,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// add_rvalue_reference
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_rvalue_reference)
+  template<typename _Tp>
+struct add_rvalue_reference
+{ using type = __add_rvalue_reference(_Tp); };
+#else
   template<typename _Tp>
 struct add_rvalue_reference
 { using type = __add_rval_ref_t<_Tp>; };
+#endif
 
 #if __cplusplus > 201103L
   /// Alias template for remove_reference
-- 
2.44.0
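
For readers unfamiliar with the fallback that the built-in replaces, here
is a small standalone sketch of the void_t-based technique visible in the
#else branch above. The demo namespace and the local make_void/void_t
helpers are assumptions for illustration; libstdc++ uses its own internal
__void_t.

```cpp
#include <type_traits>

namespace demo {

// Local void_t (std::void_t needs C++17). The struct form avoids the
// unused-parameter ambiguity addressed by CWG 1558.
template<typename...> struct make_void { using type = void; };
template<typename... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: used when T&& cannot be formed (for example void or
// a cv-qualified function type), so T is left unchanged.
template<typename T, typename = void>
  struct add_rvalue_reference { using type = T; };

// Selected when T&& is well-formed; reference collapsing applies,
// so an lvalue reference stays an lvalue reference.
template<typename T>
  struct add_rvalue_reference<T, void_t<T&&>> { using type = T&&; };

} // namespace demo
```

The built-in presumably reaches the same results without instantiating
the helper and its partial specialization for every distinct type.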



[PATCH v14 24/26] libstdc++: Optimize std::is_invocable compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_invocable
by dispatching to the new __is_invocable built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_invocable): Use __is_invocable
built-in trait.
* testsuite/20_util/is_invocable/incomplete_args_neg.cc: Handle
the new error from __is_invocable.
* testsuite/20_util/is_invocable/incomplete_neg.cc: Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits  | 4 
 .../testsuite/20_util/is_invocable/incomplete_args_neg.cc | 1 +
 libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc | 1 +
 3 files changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 1577042a5b8..9af233bcc75 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3235,7 +3235,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// std::is_invocable
   template<typename _Fn, typename... _ArgTypes>
 struct is_invocable
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_invocable)
+: public __bool_constant<__is_invocable(_Fn, _ArgTypes...)>
+#else
 : __is_invocable_impl<__invoke_result<_Fn, _ArgTypes...>, void>::type
+#endif
 {
   static_assert(std::__is_complete_or_unbounded(__type_identity<_Fn>{}),
"_Fn must be a complete class or an unbounded array");
diff --git a/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_args_neg.cc 
b/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_args_neg.cc
index a575750f9e9..9619129b817 100644
--- a/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_args_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_args_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-error "must be a complete class" "" { target *-*-* } 0 }
+// { dg-prune-output "invalid use of incomplete type" }
 
 #include <type_traits>
 
diff --git a/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc 
b/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc
index 05848603555..b478ebce815 100644
--- a/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-error "must be a complete class" "" { target *-*-* } 0 }
+// { dg-prune-output "invalid use of incomplete type" }
 
 #include <type_traits>
 
-- 
2.44.0
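
To give a feel for what the library-side fallback has to do, the sketch
below detects whether a plain call expression fn(args...) is well-formed.
It is deliberately simplified: unlike std::is_invocable it does not handle
pointers to members, and is_callable, Adder, and the demo namespace are
hypothetical names, not anything from libstdc++.

```cpp
#include <type_traits>
#include <utility>

namespace demo {

template<typename...> struct make_void { using type = void; };
template<typename... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: the call expression is not well-formed.
template<typename Void, typename Fn, typename... Args>
  struct is_callable_impl : std::false_type { };

// Selected only if Fn can be called with Args... using call syntax.
template<typename Fn, typename... Args>
  struct is_callable_impl<
    void_t<decltype(std::declval<Fn>()(std::declval<Args>()...))>,
    Fn, Args...> : std::true_type { };

template<typename Fn, typename... Args>
  struct is_callable : is_callable_impl<void, Fn, Args...> { };

// Small function object used only for the checks.
struct Adder { int operator()(int, int) const; };

} // namespace demo
```

Each query instantiates several templates and evaluates a decltype in a
SFINAE context, which is the kind of work a single built-in can avoid.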



[PATCH v14 19/26] c++: Implement __decay built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::decay.

gcc/cp/ChangeLog:

* cp-trait.def: Define __decay.
* semantics.cc (finish_trait_type): Handle CPTK_DECAY.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __decay.
* g++.dg/ext/decay.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  | 12 
 gcc/testsuite/g++.dg/ext/decay.C | 22 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 4 files changed, 38 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/decay.C

diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 173818adf79..2d1cb7c227c 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -51,6 +51,7 @@
 DEFTRAIT_TYPE (ADD_LVALUE_REFERENCE, "__add_lvalue_reference", 1)
 DEFTRAIT_TYPE (ADD_POINTER, "__add_pointer", 1)
 DEFTRAIT_TYPE (ADD_RVALUE_REFERENCE, "__add_rvalue_reference", 1)
+DEFTRAIT_TYPE (DECAY, "__decay", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_ASSIGN, "__has_nothrow_assign", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_CONSTRUCTOR, "__has_nothrow_constructor", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_COPY, "__has_nothrow_copy", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 19d6f87a9ea..45dc509855a 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12801,6 +12801,18 @@ finish_trait_type (cp_trait_kind kind, tree type1, 
tree type2,
return type1;
   return cp_build_reference_type (type1, /*rval=*/true);
 
+case CPTK_DECAY:
+  if (TYPE_REF_P (type1))
+   type1 = TREE_TYPE (type1);
+
+  if (TREE_CODE (type1) == ARRAY_TYPE)
+   return finish_trait_type (CPTK_ADD_POINTER, TREE_TYPE (type1), type2,
+ complain);
+  else if (TREE_CODE (type1) == FUNCTION_TYPE)
+   return finish_trait_type (CPTK_ADD_POINTER, type1, type2, complain);
+  else
+   return cv_unqualified (type1);
+
 case CPTK_REMOVE_ALL_EXTENTS:
   return strip_array_types (type1);
 
diff --git a/gcc/testsuite/g++.dg/ext/decay.C b/gcc/testsuite/g++.dg/ext/decay.C
new file mode 100644
index 000..8adedfeefe6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/decay.C
@@ -0,0 +1,22 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+// Positive tests.
+using test1_type = __decay(bool);
+SA(__is_same(test1_type, bool));
+
+// NB: DR 705.
+using test2_type = __decay(const int);
+SA(__is_same(test2_type, int));
+
+using test3_type = __decay(int[4]);
+SA(__is_same(test3_type, __remove_extent(int[4])*));
+
+using fn_type = void ();
+using test4_type = __decay(fn_type);
+SA(__is_same(test4_type, __add_pointer(fn_type)));
+
+using cfn_type = void () const;
+using test5_type = __decay(cfn_type);
+SA(__is_same(test5_type, cfn_type));
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index c2503c5d82b..3aca273aad6 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -11,6 +11,9 @@
 #if !__has_builtin (__add_rvalue_reference)
 # error "__has_builtin (__add_rvalue_reference) failed"
 #endif
+#if !__has_builtin (__decay)
+# error "__has_builtin (__decay) failed"
+#endif
 #if !__has_builtin (__builtin_addressof)
 # error "__has_builtin (__builtin_addressof) failed"
 #endif
-- 
2.44.0
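
The CPTK_DECAY case above proceeds in three steps: strip a reference,
turn arrays and functions into pointers, otherwise drop cv-qualifiers.
The following library-level sketch mirrors those steps with standard
traits; it is an illustration under that reading of the diff, and the
demo namespace is an assumption.

```cpp
#include <type_traits>

namespace demo {

template<typename T>
  struct decay
  {
  private:
    // Step 1: strip any reference (TYPE_REF_P in the patch).
    using U = typename std::remove_reference<T>::type;

  public:
    using type = typename std::conditional<
      // Step 2a: arrays become a pointer to their element type.
      std::is_array<U>::value,
      typename std::add_pointer<
        typename std::remove_extent<U>::type>::type,
      typename std::conditional<
        // Step 2b: functions become function pointers; cv- or
        // ref-qualified function types are left alone by add_pointer.
        std::is_function<U>::value,
        typename std::add_pointer<U>::type,
        // Step 3: everything else loses top-level cv-qualifiers.
        typename std::remove_cv<U>::type>::type>::type;
  };

} // namespace demo
```

The checks in decay.C above (bool, const int, int[4], void (), and
void () const) map directly onto these branches.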



[PATCH v14 10/26] libstdc++: Optimize std::add_pointer compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of std::add_pointer
by dispatching to the new __add_pointer built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (add_pointer): Use __add_pointer
built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index c4585a23df9..6346d1daee2 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2149,6 +2149,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { };
 #endif
 
+  /// add_pointer
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_pointer)
+  template<typename _Tp>
+struct add_pointer
+{ using type = __add_pointer(_Tp); };
+#else
   template<typename _Tp, typename = void>
 struct __add_pointer_helper
 { using type = _Tp; };
@@ -2157,7 +2163,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __add_pointer_helper<_Tp, __void_t<_Tp*>>
 { using type = _Tp*; };
 
-  /// add_pointer
   template<typename _Tp>
 struct add_pointer
 : public __add_pointer_helper<_Tp>
@@ -2170,6 +2175,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
 struct add_pointer<_Tp&&>
 { using type = _Tp*; };
+#endif
 
 #if __cplusplus > 201103L
   /// Alias template for remove_pointer
-- 
2.44.0



[PATCH v14 00/26] Optimize more type traits

2024-02-28 Thread Ken Matsui
Hi,

This patch series implements the __is_const, __is_volatile, __is_pointer,
and __is_unbounded_array built-in traits, which were split out of my
previous patch series "Optimize type traits compilation performance"
because they showed performance regressions.  I confirmed that this
patch series does not cause any performance regression.  The main causes
of the earlier regressions were the exhaustiveness of the benchmarks
and the instability of the benchmark results.  This patch series also
includes built-ins for add_pointer, remove_extent, remove_all_extents,
add_lvalue_reference, add_rvalue_reference, decay, rank, is_invocable,
and is_nothrow_invocable.  Here are the benchmark results:

is_const: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_const.md#sat-dec-23-090605-am-pst-2023
time: -4.36603%, peak memory: -0.300891%, total memory: -0.247934%

is_const_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_const_v.md#sat-jun-24-044815-am-pdt-2023
time: -2.86467%, peak memory: -1.0654%, total memory: -1.62369%

is_volatile: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_volatile.md#sun-oct-22-091644-pm-pdt-2023
time: -5.25164%, peak memory: -0.337971%, total memory: -0.247934%

is_volatile_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_volatile_v.md#sat-dec-23-091518-am-pst-2023
time: -4.06816%, peak memory: -0.609298%, total memory: -0.659134%

is_pointer: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_pointer.md#sat-dec-23-124903-pm-pst-2023
time: -2.47124%, peak memory: -2.98207%, total memory: -4.0811%

is_pointer_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_pointer_v.md#sun-oct-22-122257-am-pdt-2023
time: -4.71336%, peak memory: -2.25026%, total memory: -3.125%

is_unbounded_array: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_unbounded_array.md#sun-oct-22-091644-pm-pdt-2023
time: -6.33287%, peak memory: -0.602494%, total memory: -1.56035%

is_unbounded_array_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_unbounded_array_v.md#sat-dec-23-010046-pm-pst-2023
time: -1.50025%, peak memory: -1.07386%, total memory: -2.32394%

add_pointer_t: 
https://github.com/ken-matsui/gcc-bench/blob/main/add_pointer_t.md#wed-feb-28-060044-am-pst-2024
time: -21.6673%, peak memory: -14.%, total memory: -17.4716%

remove_extent_t: 
https://github.com/ken-matsui/gcc-bench/blob/main/remove_extent_t.md#wed-feb-28-063021-am-pst-2024
time: -14.4089%, peak memory: -2.71836%, total memory: -9.87013%

remove_all_extents_t: 
https://github.com/ken-matsui/gcc-bench/blob/main/remove_all_extents_t.md#wed-feb-28-064716-am-pst-2024
time: -28.8941%, peak memory: -16.6981%, total memory: -23.6088%

add_lvalue_reference_t: 
https://github.com/ken-matsui/gcc-bench/blob/main/add_lvalue_reference_t.md#wed-feb-28-070023-am-pst-2024
time: -33.8827%, peak memory: -24.9292%, total memory: -25.3043%

add_rvalue_reference_t: 
https://github.com/ken-matsui/gcc-bench/blob/main/add_rvalue_reference_t.md#wed-feb-28-070701-am-pst-2024
time: -23.9186%, peak memory: -17.1311%, total memory: -19.5891%

decay_t: 
https://github.com/ken-matsui/gcc-bench/blob/main/decay_t.md#wed-feb-28-072330-am-pst-2024
time: -42.4076%, peak memory: -29.2077%, total memory: -33.0914%

rank: 
https://github.com/ken-matsui/gcc-bench/blob/main/rank.md#wed-feb-28-074917-am-pst-2024
time: -33.7312%, peak memory: -27.5885%, total memory: -34.5736%

rank_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/rank_v.md#wed-feb-28-073632-am-pst-2024
time: -40.7174%, peak memory: -16.4653%, total memory: -23.0131%

is_invocable_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_invocable.md#wed-feb-28-111001-am-pst-2024
time: -58.8307%, peak memory: -59.4966%, total memory: -59.8871%
(This benchmark is not exhaustive as my laptop crashed with larger benchmarks)

is_nothrow_invocable_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_nothrow_invocable.md#wed-feb-28-112414-am-pst-2024
time: -70.4102%, peak memory: -62.5516%, total memory: -65.5853%
(This benchmark is not exhaustive as my laptop crashed with larger benchmarks)

Sincerely,
Ken Matsui

Ken Matsui (26):
  c++: Implement __is_const built-in trait
  libstdc++: Optimize std::is_const compilation performance
  c++: Implement __is_volatile built-in trait
  libstdc++: Optimize std::is_volatile compilation performance
  c++: Implement __is_pointer built-in trait
  libstdc++: Optimize std::is_pointer compilation performance
  c++: Implement __is_unbounded_array built-in trait
  libstdc++: Optimize std::is_unbounded_array compilation performance
  c++: Implement __add_pointer built-in trait
  libstdc++: Optimize std::add_pointer compilation performance
  c++: Implement __remove_extent built-in trait
  libstdc++: Optimize std::remove_extent compilation performance
  c++: Implement __remove_all_extents built-in trait
  libstdc++: Optimize std::remove_all_extents compilation performance
  c++: Implement __add_lvalue_reference built-in trait
  

[PATCH v14 08/26] libstdc++: Optimize std::is_unbounded_array compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of
std::is_unbounded_array by dispatching to the new
__is_unbounded_array built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_unbounded_array_v): Use
__is_unbounded_array built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 6407738a726..c4585a23df9 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3706,11 +3706,16 @@ template
   /// True for a type that is an array of unknown bound.
   /// @ingroup variable_templates
   /// @since C++20
+# if _GLIBCXX_USE_BUILTIN_TRAIT(__is_unbounded_array)
+  template<typename _Tp>
+inline constexpr bool is_unbounded_array_v = __is_unbounded_array(_Tp);
+# else
   template<typename _Tp>
 inline constexpr bool is_unbounded_array_v = false;
 
   template<typename _Tp>
 inline constexpr bool is_unbounded_array_v<_Tp[]> = true;
+# endif
 
   /// True for a type that is an array of known bound.
   /// @since C++20
-- 
2.44.0



[PATCH v14 09/26] c++: Implement __add_pointer built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::add_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __add_pointer.
* semantics.cc (finish_trait_type): Handle CPTK_ADD_POINTER.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __add_pointer.
* g++.dg/ext/add_pointer.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  9 ++
 gcc/testsuite/g++.dg/ext/add_pointer.C   | 39 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
 4 files changed, 52 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/add_pointer.C

diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 05514a51c21..63f879287ce 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -48,6 +48,7 @@
 #define DEFTRAIT_TYPE_DEFAULTED
 #endif
 
+DEFTRAIT_TYPE (ADD_POINTER, "__add_pointer", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_ASSIGN, "__has_nothrow_assign", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_CONSTRUCTOR, "__has_nothrow_constructor", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_COPY, "__has_nothrow_copy", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 1794e83baa2..635441a7a90 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12776,6 +12776,15 @@ finish_trait_type (cp_trait_kind kind, tree type1, 
tree type2,
 
   switch (kind)
 {
+case CPTK_ADD_POINTER:
+  if (FUNC_OR_METHOD_TYPE_P (type1)
+ && (type_memfn_quals (type1) != TYPE_UNQUALIFIED
+ || type_memfn_rqual (type1) != REF_QUAL_NONE))
+   return type1;
+  if (TYPE_REF_P (type1))
+   type1 = TREE_TYPE (type1);
+  return build_pointer_type (type1);
+
 case CPTK_REMOVE_CV:
   return cv_unqualified (type1);
 
diff --git a/gcc/testsuite/g++.dg/ext/add_pointer.C 
b/gcc/testsuite/g++.dg/ext/add_pointer.C
new file mode 100644
index 000..c405cdd0feb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/add_pointer.C
@@ -0,0 +1,39 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+
+SA(__is_same(__add_pointer(int), int*));
+SA(__is_same(__add_pointer(int*), int**));
+SA(__is_same(__add_pointer(const int), const int*));
+SA(__is_same(__add_pointer(int&), int*));
+SA(__is_same(__add_pointer(ClassType*), ClassType**));
+SA(__is_same(__add_pointer(ClassType), ClassType*));
+SA(__is_same(__add_pointer(void), void*));
+SA(__is_same(__add_pointer(const void), const void*));
+SA(__is_same(__add_pointer(volatile void), volatile void*));
+SA(__is_same(__add_pointer(const volatile void), const volatile void*));
+
+void f1();
+using f1_type = decltype(f1);
+using pf1_type = decltype(&f1);
+SA(__is_same(__add_pointer(f1_type), pf1_type));
+
+void f2() noexcept; // PR libstdc++/78361
+using f2_type = decltype(f2);
+using pf2_type = decltype(&f2);
+SA(__is_same(__add_pointer(f2_type), pf2_type));
+
+using fn_type = void();
+using pfn_type = void(*)();
+SA(__is_same(__add_pointer(fn_type), pfn_type));
+
+SA(__is_same(__add_pointer(void() &), void() &));
+SA(__is_same(__add_pointer(void() & noexcept), void() & noexcept));
+SA(__is_same(__add_pointer(void() const), void() const));
+SA(__is_same(__add_pointer(void(...) &), void(...) &));
+SA(__is_same(__add_pointer(void(...) & noexcept), void(...) & noexcept));
+SA(__is_same(__add_pointer(void(...) const), void(...) const));
+
+SA(__is_same(__add_pointer(void() __restrict), void() __restrict));
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index b1430e9bd8b..9d861398bae 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -2,6 +2,9 @@
 // { dg-do compile }
 // Verify that __has_builtin gives the correct answer for C++ built-ins.
 
+#if !__has_builtin (__add_pointer)
+# error "__has_builtin (__add_pointer) failed"
+#endif
 #if !__has_builtin (__builtin_addressof)
 # error "__has_builtin (__builtin_addressof) failed"
 #endif
-- 
2.44.0



[PATCH v14 03/26] c++: Implement __is_volatile built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::is_volatile.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_volatile.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_VOLATILE.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_volatile.
* g++.dg/ext/is_volatile.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_volatile.C   | 20 
 5 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_volatile.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index f32a1c78d63..9a7a12629e7 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3861,6 +3861,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_VOLATILE:
+  inform (loc, "  %qT is not a volatile type", t1);
+  break;
 case CPTK_REF_CONSTRUCTS_FROM_TEMPORARY:
   inform (loc, "  %qT is not a reference that binds to a temporary "
  "object of type %qT (direct-initialization)", t1, t2);
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 36faed9c0b3..e9347453829 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -92,6 +92,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 0d08900492b..41c25f43d27 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12532,6 +12532,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
+case CPTK_IS_VOLATILE:
+  return CP_TYPE_VOLATILE_P (type1);
+
 case CPTK_REF_CONSTRUCTS_FROM_TEMPORARY:
   return ref_xes_from_temporary (type1, type2, /*direct_init=*/true);
 
@@ -12702,6 +12705,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
 case CPTK_IS_UNION:
+case CPTK_IS_VOLATILE:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index e3640faeb96..b2e2f2f694d 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -158,6 +158,9 @@
 #if !__has_builtin (__is_union)
 # error "__has_builtin (__is_union) failed"
 #endif
+#if !__has_builtin (__is_volatile)
+# error "__has_builtin (__is_volatile) failed"
+#endif
 #if !__has_builtin (__reference_constructs_from_temporary)
 # error "__has_builtin (__reference_constructs_from_temporary) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_volatile.C 
b/gcc/testsuite/g++.dg/ext/is_volatile.C
new file mode 100644
index 000..80a1cfc880d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_volatile.C
@@ -0,0 +1,20 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+using cClassType = const ClassType;
+using vClassType = volatile ClassType;
+using cvClassType = const volatile ClassType;
+
+// Positive tests.
+SA(__is_volatile(volatile int));
+SA(__is_volatile(const volatile int));
+SA(__is_volatile(vClassType));
+SA(__is_volatile(cvClassType));
+
+// Negative tests.
+SA(!__is_volatile(int));
+SA(!__is_volatile(const int));
+SA(!__is_volatile(ClassType));
+SA(!__is_volatile(cClassType));
-- 
2.44.0



[PATCH v14 13/26] c++: Implement __remove_all_extents built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::remove_all_extents.

gcc/cp/ChangeLog:

* cp-trait.def: Define __remove_all_extents.
* semantics.cc (finish_trait_type): Handle
CPTK_REMOVE_ALL_EXTENTS.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__remove_all_extents.
* g++.dg/ext/remove_all_extents.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  3 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 +++
 gcc/testsuite/g++.dg/ext/remove_all_extents.C | 16 
 4 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/remove_all_extents.C

diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 577c96d579b..933c8bcbe68 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -98,6 +98,7 @@ DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
 DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
+DEFTRAIT_TYPE (REMOVE_ALL_EXTENTS, "__remove_all_extents", 1)
 DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1)
 DEFTRAIT_TYPE (REMOVE_CVREF, "__remove_cvref", 1)
 DEFTRAIT_TYPE (REMOVE_EXTENT, "__remove_extent", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 58696225fc4..078424dac23 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12785,6 +12785,9 @@ finish_trait_type (cp_trait_kind kind, tree type1, tree 
type2,
type1 = TREE_TYPE (type1);
   return build_pointer_type (type1);
 
+case CPTK_REMOVE_ALL_EXTENTS:
+  return strip_array_types (type1);
+
 case CPTK_REMOVE_CV:
   return cv_unqualified (type1);
 
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 5d5cbe3b019..85b74bd676b 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -176,6 +176,9 @@
 #if !__has_builtin (__reference_converts_from_temporary)
 # error "__has_builtin (__reference_converts_from_temporary) failed"
 #endif
+#if !__has_builtin (__remove_all_extents)
+# error "__has_builtin (__remove_all_extents) failed"
+#endif
 #if !__has_builtin (__remove_cv)
 # error "__has_builtin (__remove_cv) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/remove_all_extents.C 
b/gcc/testsuite/g++.dg/ext/remove_all_extents.C
new file mode 100644
index 000..60ade2ade7f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/remove_all_extents.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+
+SA(__is_same(__remove_all_extents(int), int));
+SA(__is_same(__remove_all_extents(int[2]), int));
+SA(__is_same(__remove_all_extents(int[2][3]), int));
+SA(__is_same(__remove_all_extents(int[][3]), int));
+SA(__is_same(__remove_all_extents(const int[2][3]), const int));
+SA(__is_same(__remove_all_extents(ClassType), ClassType));
+SA(__is_same(__remove_all_extents(ClassType[2]), ClassType));
+SA(__is_same(__remove_all_extents(ClassType[2][3]), ClassType));
+SA(__is_same(__remove_all_extents(ClassType[][3]), ClassType));
+SA(__is_same(__remove_all_extents(const ClassType[2][3]), const ClassType));
-- 
2.44.0
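
In the compiler, CPTK_REMOVE_ALL_EXTENTS is a single call to
strip_array_types. For comparison, a library implementation has to
recurse through partial specializations, one instantiation per array
dimension. The sketch below shows that recursive form; the demo namespace
is an assumption, and this is not the libstdc++ source.

```cpp
#include <cstddef>
#include <type_traits>

namespace demo {

// Base case: not an array, nothing to strip.
template<typename T>
  struct remove_all_extents { using type = T; };

// Array of known bound: peel one dimension and recurse.
template<typename T, std::size_t N>
  struct remove_all_extents<T[N]>
  { using type = typename remove_all_extents<T>::type; };

// Array of unknown bound: same, for the outermost "[]" dimension.
template<typename T>
  struct remove_all_extents<T[]>
  { using type = typename remove_all_extents<T>::type; };

} // namespace demo
```

Note that cv-qualifiers on the element type survive, matching the
const int[2][3] case in the test above.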



[PATCH v14 22/26] libstdc++: Optimize std::rank compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of std::rank
by dispatching to the new __rank built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (rank): Use __rank built-in trait.
(rank_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 2f4c8dd3b21..1577042a5b8 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1473,6 +1473,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// rank
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__rank)
+  template
+struct rank
+: public integral_constant { };
+#else
   template
 struct rank
 : public integral_constant { };
@@ -1484,6 +1489,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct rank<_Tp[]>
 : public integral_constant::value> { };
+#endif
 
   /// extent
   template
@@ -3579,12 +3585,17 @@ template 
 template 
   inline constexpr size_t alignment_of_v = alignment_of<_Tp>::value;
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__rank)
+template <typename _Tp>
+  inline constexpr size_t rank_v = __rank(_Tp);
+#else
 template <typename _Tp>
   inline constexpr size_t rank_v = 0;
 template <typename _Tp, size_t _Size>
   inline constexpr size_t rank_v<_Tp[_Size]> = 1 + rank_v<_Tp>;
 template <typename _Tp>
   inline constexpr size_t rank_v<_Tp[]> = 1 + rank_v<_Tp>;
+#endif
 
 template 
   inline constexpr size_t extent_v = 0;
-- 
2.44.0



[PATCH v14 05/26] c++: Implement __is_pointer built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::is_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_pointer.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_POINTER.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_pointer.
* g++.dg/ext/is_pointer.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 ++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
 gcc/testsuite/g++.dg/ext/is_pointer.C| 51 
 5 files changed, 62 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_pointer.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 9a7a12629e7..244070d93c2 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3828,6 +3828,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_POD:
   inform (loc, "  %qT is not a POD type", t1);
   break;
+case CPTK_IS_POINTER:
+  inform (loc, "  %qT is not a pointer", t1);
+  break;
 case CPTK_IS_POLYMORPHIC:
   inform (loc, "  %qT is not a polymorphic type", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index e9347453829..18e2d0f3480 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_NOTHROW_CONVERTIBLE, 
"__is_nothrow_convertible", 2)
 DEFTRAIT_EXPR (IS_OBJECT, "__is_object", 1)
 DEFTRAIT_EXPR (IS_POINTER_INTERCONVERTIBLE_BASE_OF, 
"__is_pointer_interconvertible_base_of", 2)
 DEFTRAIT_EXPR (IS_POD, "__is_pod", 1)
+DEFTRAIT_EXPR (IS_POINTER, "__is_pointer", 1)
 DEFTRAIT_EXPR (IS_POLYMORPHIC, "__is_polymorphic", 1)
 DEFTRAIT_EXPR (IS_REFERENCE, "__is_reference", 1)
 DEFTRAIT_EXPR (IS_SAME, "__is_same", 2)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 41c25f43d27..9dcdb06191a 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12502,6 +12502,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_POD:
   return pod_type_p (type1);
 
+case CPTK_IS_POINTER:
+  return TYPE_PTR_P (type1);
+
 case CPTK_IS_POLYMORPHIC:
   return CLASS_TYPE_P (type1) && TYPE_POLYMORPHIC_P (type1);
 
@@ -12701,6 +12704,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_MEMBER_OBJECT_POINTER:
 case CPTK_IS_MEMBER_POINTER:
 case CPTK_IS_OBJECT:
+case CPTK_IS_POINTER:
 case CPTK_IS_REFERENCE:
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index b2e2f2f694d..96b7a89e4f1 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -125,6 +125,9 @@
 #if !__has_builtin (__is_pod)
 # error "__has_builtin (__is_pod) failed"
 #endif
+#if !__has_builtin (__is_pointer)
+# error "__has_builtin (__is_pointer) failed"
+#endif
 #if !__has_builtin (__is_polymorphic)
 # error "__has_builtin (__is_polymorphic) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_pointer.C 
b/gcc/testsuite/g++.dg/ext/is_pointer.C
new file mode 100644
index 000..d6e39565950
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_pointer.C
@@ -0,0 +1,51 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+SA(!__is_pointer(int));
+SA(__is_pointer(int*));
+SA(__is_pointer(int**));
+
+SA(__is_pointer(const int*));
+SA(__is_pointer(const int**));
+SA(__is_pointer(int* const));
+SA(__is_pointer(int** const));
+SA(__is_pointer(int* const* const));
+
+SA(__is_pointer(volatile int*));
+SA(__is_pointer(volatile int**));
+SA(__is_pointer(int* volatile));
+SA(__is_pointer(int** volatile));
+SA(__is_pointer(int* volatile* volatile));
+
+SA(__is_pointer(const volatile int*));
+SA(__is_pointer(const volatile int**));
+SA(__is_pointer(const int* volatile));
+SA(__is_pointer(volatile int* const));
+SA(__is_pointer(int* const volatile));
+SA(__is_pointer(const int** volatile));
+SA(__is_pointer(volatile int** const));
+SA(__is_pointer(int** const volatile));
+SA(__is_pointer(int* const* const volatile));
+SA(__is_pointer(int* volatile* const volatile));
+SA(__is_pointer(int* const volatile* const volatile));
+
+SA(!__is_pointer(int&));
+SA(!__is_pointer(const int&));
+SA(!__is_pointer(volatile int&));
+SA(!__is_pointer(const volatile int&));
+
+SA(!__is_pointer(int&&));
+SA(!__is_pointer(const int&&));
+SA(!__is_pointer(volatile int&&));
+SA(!__is_pointer(const volatile int&&));
+
+SA(!__is_pointer(int[3]));
+SA(!__is_pointer(const int[3]));
+SA(!__is_pointer(volatile int[3]));
+SA(!__is_pointer(const volatile int[3]));
+
+SA(!__is_pointer(int(int)));
+SA(__is_pointer(int(*const)(int)));
+SA(__is_pointer(int(*volatile)(int)));
+SA(__is_pointer(int(*const 

[PATCH v14 16/26] libstdc++: Optimize std::add_lvalue_reference compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of
std::add_lvalue_reference by dispatching to the new
__add_lvalue_reference built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (add_lvalue_reference): Use
__add_lvalue_reference built-in trait.
(__add_lvalue_reference_helper): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 34475e6279a..17bf47d59d3 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1157,6 +1157,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// @cond undocumented
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_lvalue_reference)
+  template<typename _Tp>
+struct __add_lvalue_reference_helper
+{ using type = __add_lvalue_reference(_Tp); };
+#else
   template
 struct __add_lvalue_reference_helper
 { using type = _Tp; };
@@ -1164,6 +1169,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct __add_lvalue_reference_helper<_Tp, __void_t<_Tp&>>
 { using type = _Tp&; };
+#endif
 
   template
 using __add_lval_ref_t = typename __add_lvalue_reference_helper<_Tp>::type;
@@ -1731,9 +1737,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// add_lvalue_reference
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_lvalue_reference)
+  template<typename _Tp>
+struct add_lvalue_reference
+{ using type = __add_lvalue_reference(_Tp); };
+#else
   template
 struct add_lvalue_reference
 { using type = __add_lval_ref_t<_Tp>; };
+#endif
 
   /// add_rvalue_reference
   template
-- 
2.44.0
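
The dispatch pattern in this patch can be sketched outside libstdc++ roughly as follows. This is an illustration under assumptions, not the library code: the sketch namespace, SKETCH_HAVE_ADD_LREF_BUILTIN, make_void and void_t are hypothetical names, and a plain __has_builtin check stands in for _GLIBCXX_USE_BUILTIN_TRAIT, whose definition is not shown in this thread.

```cpp
#include <type_traits>

namespace sketch {

// C++11-compatible stand-in for std::void_t (hypothetical helper).
template<typename...> struct make_void { using type = void; };
template<typename... Ts> using void_t = typename make_void<Ts...>::type;

// Detect the built-in only where the compiler can be asked about it.
#if defined(__has_builtin)
# if __has_builtin(__add_lvalue_reference)
#  define SKETCH_HAVE_ADD_LREF_BUILTIN 1
# endif
#endif

#ifdef SKETCH_HAVE_ADD_LREF_BUILTIN
// Fast path: one built-in evaluation, no partial specializations.
template<typename T>
struct add_lvalue_reference { using type = __add_lvalue_reference(T); };
#else
// Fallback: T& is formed only when that is a valid type.
template<typename T, typename = void>
struct add_lvalue_reference { using type = T; };

template<typename T>
struct add_lvalue_reference<T, void_t<T&>> { using type = T&; };
#endif

} // namespace sketch
```

Either branch should give the same results; the built-in branch simply avoids instantiating the helper specializations.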



[PATCH v14 17/26] c++: Implement __add_rvalue_reference built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::add_rvalue_reference.

gcc/cp/ChangeLog:

* cp-trait.def: Define __add_rvalue_reference.
* semantics.cc (finish_trait_type): Handle
CPTK_ADD_RVALUE_REFERENCE.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__add_rvalue_reference.
* g++.dg/ext/add_rvalue_reference.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  8 
 .../g++.dg/ext/add_rvalue_reference.C | 20 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 +++
 4 files changed, 32 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/add_rvalue_reference.C

diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 9a27dca4ea3..173818adf79 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -50,6 +50,7 @@
 
 DEFTRAIT_TYPE (ADD_LVALUE_REFERENCE, "__add_lvalue_reference", 1)
 DEFTRAIT_TYPE (ADD_POINTER, "__add_pointer", 1)
+DEFTRAIT_TYPE (ADD_RVALUE_REFERENCE, "__add_rvalue_reference", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_ASSIGN, "__has_nothrow_assign", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_CONSTRUCTOR, "__has_nothrow_constructor", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_COPY, "__has_nothrow_copy", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 05f5b62f9df..19d6f87a9ea 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12793,6 +12793,14 @@ finish_trait_type (cp_trait_kind kind, tree type1, tree type2,
type1 = TREE_TYPE (type1);
   return build_pointer_type (type1);
 
+case CPTK_ADD_RVALUE_REFERENCE:
+  if (VOID_TYPE_P (type1)
+ || (FUNC_OR_METHOD_TYPE_P (type1)
+ && (type_memfn_quals (type1) != TYPE_UNQUALIFIED
+ || type_memfn_rqual (type1) != REF_QUAL_NONE)))
+   return type1;
+  return cp_build_reference_type (type1, /*rval=*/true);
+
 case CPTK_REMOVE_ALL_EXTENTS:
   return strip_array_types (type1);
 
diff --git a/gcc/testsuite/g++.dg/ext/add_rvalue_reference.C b/gcc/testsuite/g++.dg/ext/add_rvalue_reference.C
new file mode 100644
index 000..c92fe6bfa17
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/add_rvalue_reference.C
@@ -0,0 +1,20 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+
+SA(__is_same(__add_rvalue_reference(int), int&&));
+SA(__is_same(__add_rvalue_reference(int&&), int&&));
+SA(__is_same(__add_rvalue_reference(int&), int&));
+SA(__is_same(__add_rvalue_reference(const int), const int&&));
+SA(__is_same(__add_rvalue_reference(int*), int*&&));
+SA(__is_same(__add_rvalue_reference(ClassType&&), ClassType&&));
+SA(__is_same(__add_rvalue_reference(ClassType), ClassType&&));
+SA(__is_same(__add_rvalue_reference(int(int)), int(&&)(int)));
+SA(__is_same(__add_rvalue_reference(void), void));
+SA(__is_same(__add_rvalue_reference(const void), const void));
+SA(__is_same(__add_rvalue_reference(bool(int) const), bool(int) const));
+SA(__is_same(__add_rvalue_reference(bool(int) &), bool(int) &));
+SA(__is_same(__add_rvalue_reference(bool(int) const &&), bool(int) const &&));
+SA(__is_same(__add_rvalue_reference(bool(int)), bool(&&)(int)));
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 3fca9cfabcc..c2503c5d82b 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -8,6 +8,9 @@
 #if !__has_builtin (__add_pointer)
 # error "__has_builtin (__add_pointer) failed"
 #endif
+#if !__has_builtin (__add_rvalue_reference)
+# error "__has_builtin (__add_rvalue_reference) failed"
+#endif
 #if !__has_builtin (__builtin_addressof)
 # error "__has_builtin (__builtin_addressof) failed"
 #endif
-- 
2.44.0
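
The cases in the new test follow the rules of the standard trait, so they can be checked with std::add_rvalue_reference on any C++11 compiler. The snippet below is a sketch; add_rref_t is a hypothetical alias introduced only for brevity.

```cpp
#include <type_traits>

// Hypothetical shorthand for the standard trait.
template<typename T>
using add_rref_t = typename std::add_rvalue_reference<T>::type;

// Object and function types gain &&.
static_assert(std::is_same<add_rref_t<int>, int&&>::value, "int");
static_assert(std::is_same<add_rref_t<bool(int)>, bool(&&)(int)>::value, "function");

// Reference collapsing: & wins over &&.
static_assert(std::is_same<add_rref_t<int&>, int&>::value, "lvalue ref");
static_assert(std::is_same<add_rref_t<int&&>, int&&>::value, "rvalue ref");

// Types that cannot be referenced are returned unchanged.
static_assert(std::is_same<add_rref_t<void>, void>::value, "void");
static_assert(std::is_same<add_rref_t<const void>, const void>::value, "const void");
static_assert(std::is_same<add_rref_t<bool(int) const>, bool(int) const>::value,
              "cv-qualified function type");
```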



[PATCH v14 12/26] libstdc++: Optimize std::remove_extent compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of std::remove_extent
by dispatching to the new __remove_extent built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (remove_extent): Use __remove_extent
built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 6346d1daee2..73ddce351fd 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2092,6 +2092,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Array modifications.
 
   /// remove_extent
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__remove_extent)
+  template<typename _Tp>
+struct remove_extent
+{ using type = __remove_extent(_Tp); };
+#else
   template
 struct remove_extent
 { using type = _Tp; };
@@ -2103,6 +2108,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct remove_extent<_Tp[]>
 { using type = _Tp; };
+#endif
 
   /// remove_all_extents
   template
-- 
2.44.0



[PATCH v14 11/26] c++: Implement __remove_extent built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::remove_extent.

gcc/cp/ChangeLog:

* cp-trait.def: Define __remove_extent.
* semantics.cc (finish_trait_type): Handle CPTK_REMOVE_EXTENT.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __remove_extent.
* g++.dg/ext/remove_extent.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  5 +
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/remove_extent.C | 16 
 4 files changed, 25 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/remove_extent.C

diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 63f879287ce..577c96d579b 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -100,6 +100,7 @@ DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, "__reference_constructs_from_tempo
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, "__reference_converts_from_temporary", 2)
 DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1)
 DEFTRAIT_TYPE (REMOVE_CVREF, "__remove_cvref", 1)
+DEFTRAIT_TYPE (REMOVE_EXTENT, "__remove_extent", 1)
 DEFTRAIT_TYPE (REMOVE_POINTER, "__remove_pointer", 1)
 DEFTRAIT_TYPE (REMOVE_REFERENCE, "__remove_reference", 1)
 DEFTRAIT_TYPE (TYPE_PACK_ELEMENT, "__type_pack_element", -1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 635441a7a90..58696225fc4 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12793,6 +12793,11 @@ finish_trait_type (cp_trait_kind kind, tree type1, tree type2,
type1 = TREE_TYPE (type1);
   return cv_unqualified (type1);
 
+case CPTK_REMOVE_EXTENT:
+  if (TREE_CODE (type1) == ARRAY_TYPE)
+   type1 = TREE_TYPE (type1);
+  return type1;
+
 case CPTK_REMOVE_POINTER:
   if (TYPE_PTR_P (type1))
type1 = TREE_TYPE (type1);
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 9d861398bae..5d5cbe3b019 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -182,6 +182,9 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__remove_extent)
+# error "__has_builtin (__remove_extent) failed"
+#endif
 #if !__has_builtin (__remove_pointer)
 # error "__has_builtin (__remove_pointer) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/remove_extent.C b/gcc/testsuite/g++.dg/ext/remove_extent.C
new file mode 100644
index 000..6183aca5a48
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/remove_extent.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+
+SA(__is_same(__remove_extent(int), int));
+SA(__is_same(__remove_extent(int[2]), int));
+SA(__is_same(__remove_extent(int[2][3]), int[3]));
+SA(__is_same(__remove_extent(int[][3]), int[3]));
+SA(__is_same(__remove_extent(const int[2]), const int));
+SA(__is_same(__remove_extent(ClassType), ClassType));
+SA(__is_same(__remove_extent(ClassType[2]), ClassType));
+SA(__is_same(__remove_extent(ClassType[2][3]), ClassType[3]));
+SA(__is_same(__remove_extent(ClassType[][3]), ClassType[3]));
+SA(__is_same(__remove_extent(const ClassType[2]), const ClassType));
-- 
2.44.0
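
The "strip one array layer if there is one" rule in the semantics.cc hunk matches the standard trait, so the expected results can be reproduced with std::remove_extent. The snippet is a sketch; remove_extent_type is a hypothetical alias.

```cpp
#include <type_traits>

// Hypothetical shorthand for the standard trait.
template<typename T>
using remove_extent_type = typename std::remove_extent<T>::type;

// Only the outermost dimension is removed.
static_assert(std::is_same<remove_extent_type<int[2]>, int>::value, "1-D");
static_assert(std::is_same<remove_extent_type<int[2][3]>, int[3]>::value, "2-D");
static_assert(std::is_same<remove_extent_type<int[][3]>, int[3]>::value, "unknown bound");

// Element cv-qualification is preserved.
static_assert(std::is_same<remove_extent_type<const int[2]>, const int>::value, "const");

// Non-array types, including pointers, pass through unchanged.
static_assert(std::is_same<remove_extent_type<int>, int>::value, "int");
static_assert(std::is_same<remove_extent_type<int*>, int*>::value, "pointer");
```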



[PATCH v14 20/26] libstdc++: Optimize std::decay compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of std::decay
by dispatching to the new __decay built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (decay): Use __decay built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 18a5e4de2d3..2f4c8dd3b21 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2316,6 +2316,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// @cond undocumented
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__decay)
+  template<typename _Tp>
+struct decay
+{ using type = __decay(_Tp); };
+#else
   // Decay trait for arrays and functions, used for perfect forwarding
   // in make_pair, make_tuple, etc.
   template
@@ -2347,6 +2352,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct decay<_Tp&&>
 { using type = typename __decay_selector<_Tp>::type; };
+#endif
 
   /// @cond undocumented
 
-- 
2.44.0
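
As a reminder of what either implementation of std::decay has to produce, here is a small sketch using only the standard trait. decay_type is a hypothetical alias; nothing here depends on the __decay built-in.

```cpp
#include <type_traits>

// Hypothetical shorthand for the standard trait.
template<typename T>
using decay_type = typename std::decay<T>::type;

// References and top-level cv-qualifiers are dropped.
static_assert(std::is_same<decay_type<const int&>, int>::value, "const ref");
static_assert(std::is_same<decay_type<volatile int&&>, int>::value, "volatile rvalue ref");

// Arrays decay to a pointer to their element type.
static_assert(std::is_same<decay_type<int[3]>, int*>::value, "array");
static_assert(std::is_same<decay_type<int(&)[2]>, int*>::value, "array reference");
static_assert(std::is_same<decay_type<const int[2][3]>, const int(*)[3]>::value, "2-D array");

// Functions decay to function pointers.
static_assert(std::is_same<decay_type<int(int)>, int(*)(int)>::value, "function");
```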



[PATCH v14 15/26] c++: Implement __add_lvalue_reference built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::add_lvalue_reference.

gcc/cp/ChangeLog:

* cp-trait.def: Define __add_lvalue_reference.
* semantics.cc (finish_trait_type): Handle
CPTK_ADD_LVALUE_REFERENCE.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__add_lvalue_reference.
* g++.dg/ext/add_lvalue_reference.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  8 +++
 .../g++.dg/ext/add_lvalue_reference.C | 21 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 +++
 4 files changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/add_lvalue_reference.C

diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 933c8bcbe68..9a27dca4ea3 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -48,6 +48,7 @@
 #define DEFTRAIT_TYPE_DEFAULTED
 #endif
 
+DEFTRAIT_TYPE (ADD_LVALUE_REFERENCE, "__add_lvalue_reference", 1)
 DEFTRAIT_TYPE (ADD_POINTER, "__add_pointer", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_ASSIGN, "__has_nothrow_assign", 1)
 DEFTRAIT_EXPR (HAS_NOTHROW_CONSTRUCTOR, "__has_nothrow_constructor", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 078424dac23..05f5b62f9df 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12776,6 +12776,14 @@ finish_trait_type (cp_trait_kind kind, tree type1, tree type2,
 
   switch (kind)
 {
+case CPTK_ADD_LVALUE_REFERENCE:
+  if (VOID_TYPE_P (type1)
+ || (FUNC_OR_METHOD_TYPE_P (type1)
+ && (type_memfn_quals (type1) != TYPE_UNQUALIFIED
+ || type_memfn_rqual (type1) != REF_QUAL_NONE)))
+   return type1;
+  return cp_build_reference_type (type1, /*rval=*/false);
+
 case CPTK_ADD_POINTER:
   if (FUNC_OR_METHOD_TYPE_P (type1)
  && (type_memfn_quals (type1) != TYPE_UNQUALIFIED
diff --git a/gcc/testsuite/g++.dg/ext/add_lvalue_reference.C b/gcc/testsuite/g++.dg/ext/add_lvalue_reference.C
new file mode 100644
index 000..8fe1e0300e5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/add_lvalue_reference.C
@@ -0,0 +1,21 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+
+SA(__is_same(__add_lvalue_reference(int), int&));
+SA(__is_same(__add_lvalue_reference(int&), int&));
+SA(__is_same(__add_lvalue_reference(const int), const int&));
+SA(__is_same(__add_lvalue_reference(int*), int*&));
+SA(__is_same(__add_lvalue_reference(ClassType&), ClassType&));
+SA(__is_same(__add_lvalue_reference(ClassType), ClassType&));
+SA(__is_same(__add_lvalue_reference(int(int)), int(&)(int)));
+SA(__is_same(__add_lvalue_reference(int&&), int&));
+SA(__is_same(__add_lvalue_reference(ClassType&&), ClassType&));
+SA(__is_same(__add_lvalue_reference(void), void));
+SA(__is_same(__add_lvalue_reference(const void), const void));
+SA(__is_same(__add_lvalue_reference(bool(int) const), bool(int) const));
+SA(__is_same(__add_lvalue_reference(bool(int) &), bool(int) &));
+SA(__is_same(__add_lvalue_reference(bool(int) const &&), bool(int) const &&));
+SA(__is_same(__add_lvalue_reference(bool(int)), bool(&)(int)));
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 85b74bd676b..3fca9cfabcc 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -2,6 +2,9 @@
 // { dg-do compile }
 // Verify that __has_builtin gives the correct answer for C++ built-ins.
 
+#if !__has_builtin (__add_lvalue_reference)
+# error "__has_builtin (__add_lvalue_reference) failed"
+#endif
 #if !__has_builtin (__add_pointer)
 # error "__has_builtin (__add_pointer) failed"
 #endif
-- 
2.44.0
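
The early return in the CPTK_ADD_LVALUE_REFERENCE case covers the types to which no reference can be formed: void and function types carrying cv- or ref-qualifiers. A library-level sketch of the same distinction is shown below; is_referenceable, make_void and void_t are hypothetical names, not part of the patch.

```cpp
#include <type_traits>

namespace sketch {

// C++11-compatible stand-in for std::void_t (hypothetical helper).
template<typename...> struct make_void { using type = void; };
template<typename... Ts> using void_t = typename make_void<Ts...>::type;

// True when T& is a valid type.
template<typename T, typename = void>
struct is_referenceable : std::false_type { };

template<typename T>
struct is_referenceable<T, void_t<T&>> : std::true_type { };

} // namespace sketch

// Ordinary object, reference and function types can be referenced.
static_assert(sketch::is_referenceable<int>::value, "int");
static_assert(sketch::is_referenceable<int&&>::value, "reference");
static_assert(sketch::is_referenceable<bool(int)>::value, "plain function type");

// These are the types the compiler hunk returns unchanged.
static_assert(!sketch::is_referenceable<void>::value, "void");
static_assert(!sketch::is_referenceable<bool(int) const>::value, "const-qualified function");
static_assert(!sketch::is_referenceable<bool(int) &>::value, "ref-qualified function");

// The standard trait agrees: unreferenceable types pass through.
static_assert(std::is_same<std::add_lvalue_reference<bool(int) const>::type,
                           bool(int) const>::value, "unchanged");
static_assert(std::is_same<std::add_lvalue_reference<int&&>::type, int&>::value,
              "collapses to an lvalue reference");
```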



Re: [PATCH v2 5/5] bpf: renamed coreout.* files to btfext-out.*.

2024-02-28 Thread Cupertino Miranda


Corrected and pushed.

Thanks,
Cupertino

David Faust writes:

> On 2/27/24 11:04, Cupertino Miranda wrote:
>> gcc/ChangeLog:
>>
>>  * config.gcc (target_gtfiles): Changes coreout to btfext-out.
>>  (extra_objs): Changes coreout to btfext-out.
>>  * config/bpf/coreout.cc: Renamed to btfext-out.cc.
>>  * config/bpf/btfext-out.cc: Added.
>>  * config/bpf/coreout.h: Renamed to btfext-out.h.
>>  * config/bpf/btfext-out.h: Added.
>>  * config/bpf/core-builtins.cc: Changes include.
>>  * config/bpf/core-builtins.h: Changes include.
>>  * config/bpf/t-bpf: Renamed file.
> This last entry is confusing: it sounds like t-bpf is renamed, which it
> isn't. I'd suggest just saying "accommodate renamed files" or so.
>
> Similar to prior patches, there is a mix of present and past tenses here.
> Please stick with the present.
>
> Changes -> Change
> Added -> Add  (or just "New.")
> Renamed -> Rename.
>
> OK with those changes.
> Thanks.
>
>> ---
>>  gcc/config.gcc   | 4 ++--
>>  gcc/config/bpf/{coreout.cc => btfext-out.cc} | 4 ++--
>>  gcc/config/bpf/{coreout.h => btfext-out.h}   | 2 +-
>>  gcc/config/bpf/core-builtins.cc  | 2 +-
>>  gcc/config/bpf/core-builtins.h   | 2 +-
>>  gcc/config/bpf/t-bpf | 4 ++--
>>  6 files changed, 9 insertions(+), 9 deletions(-)
>>  rename gcc/config/bpf/{coreout.cc => btfext-out.cc} (99%)
>>  rename gcc/config/bpf/{coreout.h => btfext-out.h} (98%)
>>
>> diff --git a/gcc/config.gcc b/gcc/config.gcc
>> index a0f9c6723083..1ca033d75b66 100644
>> --- a/gcc/config.gcc
>> +++ b/gcc/config.gcc
>> @@ -1653,8 +1653,8 @@ bpf-*-*)
>>  tmake_file="${tmake_file} bpf/t-bpf"
>>  use_collect2=no
>>  use_gcc_stdint=provide
>> -extra_objs="coreout.o core-builtins.o"
>> -target_gtfiles="$target_gtfiles \$(srcdir)/config/bpf/coreout.cc \$(srcdir)/config/bpf/core-builtins.cc"
>> +extra_objs="btfext-out.o core-builtins.o"
>> +target_gtfiles="$target_gtfiles \$(srcdir)/config/bpf/btfext-out.cc \$(srcdir)/config/bpf/core-builtins.cc"
>>  ;;
>>  cris-*-elf | cris-*-none)
>>  tm_file="elfos.h newlib-stdint.h ${tm_file}"
>> diff --git a/gcc/config/bpf/coreout.cc b/gcc/config/bpf/btfext-out.cc
>> similarity index 99%
>> rename from gcc/config/bpf/coreout.cc
>> rename to gcc/config/bpf/btfext-out.cc
>> index 31b2abc3151b..4281cca83e13 100644
>> --- a/gcc/config/bpf/coreout.cc
>> +++ b/gcc/config/bpf/btfext-out.cc
>> @@ -33,7 +33,7 @@
>>  #include "tree-pretty-print.h"
>>  #include "cgraph.h"
>>
>> -#include "coreout.h"
>> +#include "btfext-out.h"
>>
>>  /* This file contains data structures and routines for construction and output
>> of BPF Compile Once - Run Everywhere (BPF CO-RE) information.
>> @@ -618,4 +618,4 @@ btf_ext_output (void)
>>dw2_asm_output_data (4, 0, "Required padding by libbpf structs");
>>  }
>>
>> -#include "gt-coreout.h"
>> +#include "gt-btfext-out.h"
>> diff --git a/gcc/config/bpf/coreout.h b/gcc/config/bpf/btfext-out.h
>> similarity index 98%
>> rename from gcc/config/bpf/coreout.h
>> rename to gcc/config/bpf/btfext-out.h
>> index 1c26b9274739..b36309475c97 100644
>> --- a/gcc/config/bpf/coreout.h
>> +++ b/gcc/config/bpf/btfext-out.h
>> @@ -1,4 +1,4 @@
>> -/* coreout.h - Declarations and definitions related to
>> +/* btfext-out.h - Declarations and definitions related to
>> BPF Compile Once - Run Everywhere (CO-RE) support.
>> Copyright (C) 2021-2024 Free Software Foundation, Inc.
>>
>> diff --git a/gcc/config/bpf/core-builtins.cc b/gcc/config/bpf/core-builtins.cc
>> index aa75fd68cae6..8d8c54c1fb3d 100644
>> --- a/gcc/config/bpf/core-builtins.cc
>> +++ b/gcc/config/bpf/core-builtins.cc
>> @@ -45,7 +45,7 @@ along with GCC; see the file COPYING3.  If not see
>>
>>  #include "ctfc.h"
>>  #include "btf.h"
>> -#include "coreout.h"
>> +#include "btfext-out.h"
>>  #include "core-builtins.h"
>>
>>  /* BPF CO-RE builtins definition.
>> diff --git a/gcc/config/bpf/core-builtins.h b/gcc/config/bpf/core-builtins.h
>> index c54f6ddac812..e56b55b94e0c 100644
>> --- a/gcc/config/bpf/core-builtins.h
>> +++ b/gcc/config/bpf/core-builtins.h
>> @@ -1,7 +1,7 @@
>>  #ifndef BPF_CORE_BUILTINS_H
>>  #define BPF_CORE_BUILTINS_H
>>
>> -#include "coreout.h"
>> +#include "btfext-out.h"
>>
>>  enum bpf_builtins
>>  {
>> diff --git a/gcc/config/bpf/t-bpf b/gcc/config/bpf/t-bpf
>> index 18f1fa67794d..dc50332350c4 100644
>> --- a/gcc/config/bpf/t-bpf
>> +++ b/gcc/config/bpf/t-bpf
>> @@ -1,7 +1,7 @@
>>
>> -TM_H += $(srcdir)/config/bpf/coreout.h $(srcdir)/config/bpf/core-builtins.h
>> +TM_H += $(srcdir)/config/bpf/btfext-out.h $(srcdir)/config/bpf/core-builtins.h
>>
>> -coreout.o: $(srcdir)/config/bpf/coreout.cc
>> +btfext-out.o: $(srcdir)/config/bpf/btfext-out.cc
>>  $(COMPILE) $<
>>  $(POSTCOMPILE)
>>


[PATCH v14 14/26] libstdc++: Optimize std::remove_all_extents compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of
std::remove_all_extents by dispatching to the new __remove_all_extents
built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (remove_all_extents): Use
__remove_all_extents built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 73ddce351fd..34475e6279a 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2111,6 +2111,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// remove_all_extents
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__remove_all_extents)
+  template<typename _Tp>
+struct remove_all_extents
+{ using type = __remove_all_extents(_Tp); };
+#else
   template
 struct remove_all_extents
 { using type = _Tp; };
@@ -2122,6 +2127,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct remove_all_extents<_Tp[]>
 { using type = typename remove_all_extents<_Tp>::type; };
+#endif
 
 #if __cplusplus > 201103L
   /// Alias template for remove_extent
-- 
2.44.0
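
To show how this trait differs from std::remove_extent, the sketch below compares the two standard traits side by side. one_layer and all_layers are hypothetical aliases introduced for the comparison.

```cpp
#include <type_traits>

// Hypothetical shorthands for the two standard traits.
template<typename T>
using one_layer = typename std::remove_extent<T>::type;
template<typename T>
using all_layers = typename std::remove_all_extents<T>::type;

// remove_extent peels one dimension; remove_all_extents peels every one.
static_assert(std::is_same<one_layer<int[2][3]>, int[3]>::value, "one layer");
static_assert(std::is_same<all_layers<int[2][3]>, int>::value, "all layers");
static_assert(std::is_same<all_layers<int[][3]>, int>::value, "unknown bound");

// Element cv-qualification survives either way.
static_assert(std::is_same<all_layers<const int[2][3]>, const int>::value, "const");

// Non-array types are unchanged by both traits.
static_assert(std::is_same<all_layers<int*>, int*>::value, "pointer");
static_assert(std::is_same<one_layer<int>, all_layers<int>>::value, "int");
```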



[PATCH v14 02/26] libstdc++: Optimize std::is_const compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_const
by dispatching to the new __is_const built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_const): Use __is_const built-in
trait.
(is_const_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 21402fd8c13..6e9ebfb8a18 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -835,6 +835,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Type properties.
 
   /// is_const
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
+  template<typename _Tp>
+struct is_const
+: public __bool_constant<__is_const(_Tp)>
+{ };
+#else
   template
 struct is_const
 : public false_type { };
@@ -842,6 +848,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_const<_Tp const>
 : public true_type { };
+#endif
 
   /// is_volatile
   template
@@ -3327,10 +3334,15 @@ template 
   inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
+template <typename _Tp>
+  inline constexpr bool is_const_v = __is_const(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_const_v = false;
 template <typename _Tp>
   inline constexpr bool is_const_v<const _Tp> = true;
+#endif
 
 #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_function)
 template 
-- 
2.44.0
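
Both branches of the #if must agree that only top-level const counts. A short sketch with the standard trait, independent of the __is_const built-in, illustrates the cases that are easy to get wrong.

```cpp
#include <type_traits>

// Top-level const is detected.
static_assert(std::is_const<const int>::value, "const int");
static_assert(std::is_const<const volatile int>::value, "const volatile int");
static_assert(std::is_const<int* const>::value, "const pointer");

// const on the pointee is not top-level.
static_assert(!std::is_const<const int*>::value, "pointer to const");

// References are never const-qualified themselves.
static_assert(!std::is_const<const int&>::value, "reference to const");

// An array of const elements is itself treated as const-qualified.
static_assert(std::is_const<const int[3]>::value, "array of const");

static_assert(!std::is_const<int>::value, "int");
```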



[PATCH v14 07/26] c++: Implement __is_unbounded_array built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::is_unbounded_array.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_unbounded_array.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_UNBOUNDED_ARRAY.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__is_unbounded_array.
* g++.dg/ext/is_unbounded_array.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc  |  3 ++
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 ++
 gcc/testsuite/g++.dg/ext/is_unbounded_array.C | 37 +++
 5 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_unbounded_array.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 244070d93c2..000df847342 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3861,6 +3861,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_TRIVIALLY_COPYABLE:
   inform (loc, "  %qT is not trivially copyable", t1);
   break;
+case CPTK_IS_UNBOUNDED_ARRAY:
+  inform (loc, "  %qT is not an unbounded array", t1);
+  break;
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 18e2d0f3480..05514a51c21 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -92,6 +92,7 @@ DEFTRAIT_EXPR (IS_TRIVIAL, "__is_trivial", 1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, "__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
+DEFTRAIT_EXPR (IS_UNBOUNDED_ARRAY, "__is_unbounded_array", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
 DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, "__reference_constructs_from_temporary", 2)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 9dcdb06191a..1794e83baa2 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12532,6 +12532,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_TRIVIALLY_COPYABLE:
   return trivially_copyable_p (type1);
 
+case CPTK_IS_UNBOUNDED_ARRAY:
+  return array_of_unknown_bound_p (type1);
+
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
@@ -12708,6 +12711,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_REFERENCE:
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
+case CPTK_IS_UNBOUNDED_ARRAY:
 case CPTK_IS_UNION:
 case CPTK_IS_VOLATILE:
   break;
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 96b7a89e4f1..b1430e9bd8b 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -158,6 +158,9 @@
 #if !__has_builtin (__is_trivially_copyable)
 # error "__has_builtin (__is_trivially_copyable) failed"
 #endif
+#if !__has_builtin (__is_unbounded_array)
+# error "__has_builtin (__is_unbounded_array) failed"
+#endif
 #if !__has_builtin (__is_union)
 # error "__has_builtin (__is_union) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_unbounded_array.C b/gcc/testsuite/g++.dg/ext/is_unbounded_array.C
new file mode 100644
index 000..283a74e1a0a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_unbounded_array.C
@@ -0,0 +1,37 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+class ClassType { };
+class IncompleteClass;
+union IncompleteUnion;
+
+SA_TEST_CATEGORY(__is_unbounded_array, int[2], false);
+SA_TEST_CATEGORY(__is_unbounded_array, int[], true);
+SA_TEST_CATEGORY(__is_unbounded_array, int[2][3], false);
+SA_TEST_CATEGORY(__is_unbounded_array, int[][3], true);
+SA_TEST_CATEGORY(__is_unbounded_array, float*[2], false);
+SA_TEST_CATEGORY(__is_unbounded_array, float*[], true);
+SA_TEST_CATEGORY(__is_unbounded_array, float*[2][3], false);
+SA_TEST_CATEGORY(__is_unbounded_array, float*[][3], true);
+SA_TEST_CATEGORY(__is_unbounded_array, ClassType[2], false);
+SA_TEST_CATEGORY(__is_unbounded_array, ClassType[], true);
+SA_TEST_CATEGORY(__is_unbounded_array, ClassType[2][3], false);
+SA_TEST_CATEGORY(__is_unbounded_array, ClassType[][3], true);
+SA_TEST_CATEGORY(__is_unbounded_array, IncompleteClass[2][3], false);
+SA_TEST_CATEGORY(__is_unbounded_array, IncompleteClass[][3], true);
+SA_TEST_CATEGORY(__is_unbounded_array, 

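The property tested above can be expressed in library code with a single partial specialization, which is roughly what a non-built-in implementation has to instantiate. The sketch below is an illustration only; it defines its own sketch::is_unbounded_array so that it also compiles before C++20, and that name is hypothetical rather than the libstdc++ definition.

```cpp
#include <type_traits>

namespace sketch {

// Primary template: most types are not arrays of unknown bound.
template<typename T>
struct is_unbounded_array : std::false_type { };

// T[] matches only when the outermost dimension has no bound.
template<typename T>
struct is_unbounded_array<T[]> : std::true_type { };

} // namespace sketch

static_assert(sketch::is_unbounded_array<int[]>::value, "int[]");
static_assert(sketch::is_unbounded_array<int[][3]>::value, "int[][3]");
static_assert(sketch::is_unbounded_array<const volatile int[]>::value, "cv int[]");

// Bounded arrays and pointers do not qualify.
static_assert(!sketch::is_unbounded_array<int[2]>::value, "int[2]");
static_assert(!sketch::is_unbounded_array<int[2][3]>::value, "int[2][3]");
static_assert(!sketch::is_unbounded_array<int*>::value, "int*");
```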
[PATCH v14 25/26] c++: Implement __is_nothrow_invocable built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::is_nothrow_invocable.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_nothrow_invocable.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_NOTHROW_INVOCABLE.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__is_nothrow_invocable.
* g++.dg/ext/is_nothrow_invocable.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc  |  6 ++
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 +
 .../g++.dg/ext/is_nothrow_invocable.C | 62 +++
 5 files changed, 76 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_nothrow_invocable.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index c87b126fdb1..43d4f2102d6 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3824,6 +3824,12 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_NOTHROW_CONVERTIBLE:
  inform (loc, "  %qT is not nothrow convertible from %qE", t2, t1);
   break;
+case CPTK_IS_NOTHROW_INVOCABLE:
+   if (!t2)
+ inform (loc, "  %qT is not nothrow invocable", t1);
+   else
+ inform (loc, "  %qT is not nothrow invocable by %qE", t1, t2);
+   break;
 case CPTK_IS_OBJECT:
   inform (loc, "  %qT is not an object type", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 6cb2b55f4ea..a9714921e94 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -84,6 +84,7 @@ DEFTRAIT_EXPR (IS_MEMBER_POINTER, "__is_member_pointer", 1)
 DEFTRAIT_EXPR (IS_NOTHROW_ASSIGNABLE, "__is_nothrow_assignable", 2)
 DEFTRAIT_EXPR (IS_NOTHROW_CONSTRUCTIBLE, "__is_nothrow_constructible", -1)
 DEFTRAIT_EXPR (IS_NOTHROW_CONVERTIBLE, "__is_nothrow_convertible", 2)
+DEFTRAIT_EXPR (IS_NOTHROW_INVOCABLE, "__is_nothrow_invocable", -1)
 DEFTRAIT_EXPR (IS_OBJECT, "__is_object", 1)
 DEFTRAIT_EXPR (IS_POINTER_INTERCONVERTIBLE_BASE_OF, "__is_pointer_interconvertible_base_of", 2)
 DEFTRAIT_EXPR (IS_POD, "__is_pod", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 149c0631d62..dba7b43a109 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12494,6 +12494,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_NOTHROW_CONVERTIBLE:
   return is_nothrow_convertible (type1, type2);
 
+case CPTK_IS_NOTHROW_INVOCABLE:
+  return expr_noexcept_p (build_invoke (type1, type2, tf_none), tf_none);
+
 case CPTK_IS_OBJECT:
   return (type_code1 != FUNCTION_TYPE
  && type_code1 != REFERENCE_TYPE
@@ -12689,6 +12692,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_NOTHROW_ASSIGNABLE:
 case CPTK_IS_NOTHROW_CONSTRUCTIBLE:
 case CPTK_IS_NOTHROW_CONVERTIBLE:
+case CPTK_IS_NOTHROW_INVOCABLE:
 case CPTK_IS_TRIVIALLY_ASSIGNABLE:
 case CPTK_IS_TRIVIALLY_CONSTRUCTIBLE:
 case CPTK_REF_CONSTRUCTS_FROM_TEMPORARY:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index d2a7ebdf25c..624d3525f27 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -131,6 +131,9 @@
 #if !__has_builtin (__is_nothrow_convertible)
 # error "__has_builtin (__is_nothrow_convertible) failed"
 #endif
+#if !__has_builtin (__is_nothrow_invocable)
+# error "__has_builtin (__is_nothrow_invocable) failed"
+#endif
 #if !__has_builtin (__is_object)
 # error "__has_builtin (__is_object) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_nothrow_invocable.C b/gcc/testsuite/g++.dg/ext/is_nothrow_invocable.C
new file mode 100644
index 000..2f9b40e5538
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_nothrow_invocable.C
@@ -0,0 +1,62 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+using func_type = void(*)();
+SA( ! __is_nothrow_invocable(func_type) );
+
+#if __cpp_noexcept_function_type
+using func_type_nt = void(*)() noexcept;
+SA(   __is_nothrow_invocable(func_type_nt) );
+#endif
+
+struct X { };
+using mem_type = int X::*;
+
+SA( ! __is_nothrow_invocable(mem_type) );
+SA( ! __is_nothrow_invocable(mem_type, int) );
+SA( ! __is_nothrow_invocable(mem_type, int&) );
+SA(   __is_nothrow_invocable(mem_type, X&) );
+
+using memfun_type = int (X::*)();
+
+SA( ! __is_nothrow_invocable(memfun_type) );
+SA( ! __is_nothrow_invocable(memfun_type, int) );
+SA( ! __is_nothrow_invocable(memfun_type, int&) );
+SA( ! __is_nothrow_invocable(memfun_type, X&) );
+SA( ! __is_nothrow_invocable(memfun_type, X*) );
+
+#if __cpp_noexcept_function_type
+using memfun_type_nt = int (X::*)() noexcept;
+
+SA( ! __is_nothrow_invocable(memfun_type_nt) );
+SA( ! 

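The semantics.cc hunk above combines two questions: can the call be formed, and is the resulting expression noexcept. A much-reduced library-level sketch of that idea follows. It is an assumption-laden illustration: is_nothrow_callable and nothrow_call_impl are hypothetical names, and only plain call syntax is handled, not the pointer-to-member cases of INVOKE that the real trait covers.

```cpp
#include <type_traits>
#include <utility>

namespace sketch {

// Fallback: the call expression is not well-formed.
template<typename, typename F, typename... Args>
struct nothrow_call_impl : std::false_type { };

// Chosen when F can be called with Args...; the value is whether
// that call expression is noexcept.
template<typename F, typename... Args>
struct nothrow_call_impl<
    decltype(void(std::declval<F>()(std::declval<Args>()...))), F, Args...>
  : std::integral_constant<
        bool, noexcept(std::declval<F>()(std::declval<Args>()...))> { };

template<typename F, typename... Args>
using is_nothrow_callable = nothrow_call_impl<void, F, Args...>;

} // namespace sketch

// Hypothetical callables used to exercise the sketch.
struct NoThrowFn  { void operator()() noexcept { } };
struct ThrowingFn { void operator()() { } };

static_assert(sketch::is_nothrow_callable<NoThrowFn>::value, "noexcept call operator");
static_assert(!sketch::is_nothrow_callable<ThrowingFn>::value, "potentially throwing");
static_assert(!sketch::is_nothrow_callable<NoThrowFn, int>::value, "wrong arity");
static_assert(!sketch::is_nothrow_callable<void (*)()>::value, "plain function pointer");
static_assert(!sketch::is_nothrow_callable<int>::value, "not callable at all");
```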
[PATCH v14 06/26] libstdc++: Optimize std::is_pointer compilation performance

2024-02-28 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_pointer
by dispatching to the new __is_pointer built-in trait.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_pointer): Use
__is_pointer built-in trait.  Optimize its implementation.
* include/std/type_traits (is_pointer): Likewise.
(is_pointer_v): Likewise.

Co-authored-by: Jonathan Wakely 
Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/bits/cpp_type_traits.h | 31 ++-
 libstdc++-v3/include/std/type_traits| 44 +
 2 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/libstdc++-v3/include/bits/cpp_type_traits.h 
b/libstdc++-v3/include/bits/cpp_type_traits.h
index 59f1a1875eb..210a9ea00da 100644
--- a/libstdc++-v3/include/bits/cpp_type_traits.h
+++ b/libstdc++-v3/include/bits/cpp_type_traits.h
@@ -363,6 +363,13 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   //
   // Pointer types
   //
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_pointer)
+  template
+struct __is_pointer : __truth_type<_IsPtr>
+{
+  enum { __value = _IsPtr };
+};
+#else
   template
 struct __is_pointer
 {
@@ -377,6 +384,28 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   typedef __true_type __type;
 };
 
+  template
+struct __is_pointer<_Tp* const>
+{
+  enum { __value = 1 };
+  typedef __true_type __type;
+};
+
+  template
+struct __is_pointer<_Tp* volatile>
+{
+  enum { __value = 1 };
+  typedef __true_type __type;
+};
+
+  template
+struct __is_pointer<_Tp* const volatile>
+{
+  enum { __value = 1 };
+  typedef __true_type __type;
+};
+#endif
+
   //
   // An arithmetic type is an integer type or a floating point type
   //
@@ -387,7 +416,7 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
 
   //
   // A scalar type is an arithmetic type or a pointer type
-  // 
+  //
   template
 struct __is_scalar
 : public __traitor<__is_arithmetic<_Tp>, __is_pointer<_Tp> >
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 60cd22b6f15..6407738a726 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -542,19 +542,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public true_type { };
 #endif
 
-  template
-struct __is_pointer_helper
+  /// is_pointer
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_pointer)
+  template
+struct is_pointer
+: public __bool_constant<__is_pointer(_Tp)>
+{ };
+#else
+  template
+struct is_pointer
 : public false_type { };
 
   template
-struct __is_pointer_helper<_Tp*>
+struct is_pointer<_Tp*>
 : public true_type { };
 
-  /// is_pointer
   template
-struct is_pointer
-: public __is_pointer_helper<__remove_cv_t<_Tp>>::type
-{ };
+struct is_pointer<_Tp* const>
+: public true_type { };
+
+  template
+struct is_pointer<_Tp* volatile>
+: public true_type { };
+
+  template
+struct is_pointer<_Tp* const volatile>
+: public true_type { };
+#endif
 
   /// is_lvalue_reference
   template
@@ -3264,8 +3278,22 @@ template 
   inline constexpr bool is_array_v<_Tp[_Num]> = true;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_pointer)
+template 
+  inline constexpr bool is_pointer_v = __is_pointer(_Tp);
+#else
 template 
-  inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
+  inline constexpr bool is_pointer_v = false;
+template 
+  inline constexpr bool is_pointer_v<_Tp*> = true;
+template 
+  inline constexpr bool is_pointer_v<_Tp* const> = true;
+template 
+  inline constexpr bool is_pointer_v<_Tp* volatile> = true;
+template 
+  inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
+#endif
+
 template 
   inline constexpr bool is_lvalue_reference_v = false;
 template 
-- 
2.44.0



[PATCH v14 21/26] c++: Implement __rank built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::rank.
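
As an illustration only, the rank of a type is the number of array dimensions peeled off before a non-array type is reached. The sketch below expresses that in library code with recursive partial specializations. The names in namespace `sketch` are assumptions made for this example and are not part of GCC or libstdc++.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch {

// Non-array types have rank 0.
template <typename T>
struct rank : std::integral_constant<std::size_t, 0> { };

// Array of unknown bound: one dimension plus the rank of the element type.
template <typename T>
struct rank<T[]>
  : std::integral_constant<std::size_t, rank<T>::value + 1> { };

// Array of known bound: same rule.
template <typename T, std::size_t N>
struct rank<T[N]>
  : std::integral_constant<std::size_t, rank<T>::value + 1> { };

} // namespace sketch

static_assert(sketch::rank<int>::value == 0, "scalar");
static_assert(sketch::rank<int[][4]>::value == 2, "two dimensions");
```

A built-in can answer the same question with a single loop over the type instead of one class template instantiation per dimension.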

gcc/cp/ChangeLog:

* cp-trait.def: Define __rank.
* constraint.cc (diagnose_trait_expr): Handle CPTK_RANK.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __rank.
* g++.dg/ext/rank.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  | 23 ---
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/rank.C  | 24 
 5 files changed, 51 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/rank.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 000df847342..23ea66d9c12 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3870,6 +3870,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_VOLATILE:
   inform (loc, "  %qT is not a volatile type", t1);
   break;
+case CPTK_RANK:
+  inform (loc, "  %qT cannot yield a rank", t1);
+  break;
 case CPTK_REF_CONSTRUCTS_FROM_TEMPORARY:
   inform (loc, "  %qT is not a reference that binds to a temporary "
  "object of type %qT (direct-initialization)", t1, t2);
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 2d1cb7c227c..85056c8140b 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -99,6 +99,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, 
"__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNBOUNDED_ARRAY, "__is_unbounded_array", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
 DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1)
+DEFTRAIT_EXPR (RANK, "__rank", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 DEFTRAIT_TYPE (REMOVE_ALL_EXTENTS, "__remove_all_extents", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 45dc509855a..7242db75248 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12550,6 +12550,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_DEDUCIBLE:
   return type_targs_deducible_from (type1, type2);
 
+/* __rank is handled in finish_trait_expr. */
+case CPTK_RANK:
+
 #define DEFTRAIT_TYPE(CODE, NAME, ARITY) \
 case CPTK_##CODE:
 #include "cp-trait.def"
@@ -12622,7 +12625,10 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
   if (processing_template_decl)
 {
   tree trait_expr = make_node (TRAIT_EXPR);
-  TREE_TYPE (trait_expr) = boolean_type_node;
+  if (kind == CPTK_RANK)
+   TREE_TYPE (trait_expr) = size_type_node;
+  else
+   TREE_TYPE (trait_expr) = boolean_type_node;
   TRAIT_EXPR_TYPE1 (trait_expr) = type1;
   TRAIT_EXPR_TYPE2 (trait_expr) = type2;
   TRAIT_EXPR_KIND (trait_expr) = kind;
@@ -12714,6 +12720,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_UNBOUNDED_ARRAY:
 case CPTK_IS_UNION:
 case CPTK_IS_VOLATILE:
+case CPTK_RANK:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
@@ -12745,8 +12752,18 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
   gcc_unreachable ();
 }
 
-  tree val = (trait_expr_value (kind, type1, type2)
- ? boolean_true_node : boolean_false_node);
+  tree val;
+  if (kind == CPTK_RANK)
+{
+  size_t rank = 0;
+  for (; TREE_CODE (type1) == ARRAY_TYPE; type1 = TREE_TYPE (type1))
+   ++rank;
+  val = build_int_cst (size_type_node, rank);
+}
+  else
+val = (trait_expr_value (kind, type1, type2)
+  ? boolean_true_node : boolean_false_node);
+
   return maybe_wrap_with_location (val, loc);
 }
 
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 3aca273aad6..7f7b27f7aa7 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -179,6 +179,9 @@
 #if !__has_builtin (__is_volatile)
 # error "__has_builtin (__is_volatile) failed"
 #endif
+#if !__has_builtin (__rank)
+# error "__has_builtin (__rank) failed"
+#endif
 #if !__has_builtin (__reference_constructs_from_temporary)
 # error "__has_builtin (__reference_constructs_from_temporary) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/rank.C b/gcc/testsuite/g++.dg/ext/rank.C
new file mode 100644
index 000..28894184387
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/rank.C
@@ -0,0 +1,24 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+
+SA(__rank(int) == 0);
+SA(__rank(int[2]) == 1);
+SA(__rank(int[][4]) == 2);
+SA(__rank(int[2][2][4][4][6][6]) == 6);

[PATCH v14 01/26] c++: Implement __is_const built-in trait

2024-02-28 Thread Ken Matsui
This patch implements a built-in trait for std::is_const.
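
A short sketch of the semantics being mirrored, using the standard std::is_const rather than the new built-in: only a top-level const counts, so a pointer to const is not itself const, and a reference is never const-qualified. The `sketch::is_const` wrapper is a hypothetical helper added only to keep the checks readable.

```cpp
#include <type_traits>

namespace sketch {

// Hypothetical convenience wrapper around the standard trait.
template <typename T>
constexpr bool is_const() { return std::is_const<T>::value; }

} // namespace sketch

// Top-level const is detected.
static_assert(sketch::is_const<const int>(), "const int");
static_assert(sketch::is_const<int* const>(), "const pointer");

// const that is not at the top level is ignored.
static_assert(!sketch::is_const<const int*>(), "pointer to const");
static_assert(!sketch::is_const<const int&>(), "reference to const");
```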

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_const.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_const.
* g++.dg/ext/is_const.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_const.C  | 20 
 5 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 49de3211d4c..f32a1c78d63 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3767,6 +3767,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_CLASS:
   inform (loc, "  %qT is not a class", t1);
   break;
+case CPTK_IS_CONST:
+  inform (loc, "  %qT is not a const type", t1);
+  break;
 case CPTK_IS_CONSTRUCTIBLE:
   if (!t2)
 inform (loc, "  %qT is not default constructible", t1);
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 394f006f20f..36faed9c0b3 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -64,6 +64,7 @@ DEFTRAIT_EXPR (IS_ASSIGNABLE, "__is_assignable", 2)
 DEFTRAIT_EXPR (IS_BASE_OF, "__is_base_of", 2)
 DEFTRAIT_EXPR (IS_BOUNDED_ARRAY, "__is_bounded_array", 1)
 DEFTRAIT_EXPR (IS_CLASS, "__is_class", 1)
+DEFTRAIT_EXPR (IS_CONST, "__is_const", 1)
 DEFTRAIT_EXPR (IS_CONSTRUCTIBLE, "__is_constructible", -1)
 DEFTRAIT_EXPR (IS_CONVERTIBLE, "__is_convertible", 2)
 DEFTRAIT_EXPR (IS_EMPTY, "__is_empty", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 57840176863..0d08900492b 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12446,6 +12446,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_CLASS:
   return NON_UNION_CLASS_TYPE_P (type1);
 
+case CPTK_IS_CONST:
+  return CP_TYPE_CONST_P (type1);
+
 case CPTK_IS_CONSTRUCTIBLE:
   return is_xible (INIT_EXPR, type1, type2);
 
@@ -12688,6 +12691,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ARRAY:
 case CPTK_IS_BOUNDED_ARRAY:
 case CPTK_IS_CLASS:
+case CPTK_IS_CONST:
 case CPTK_IS_ENUM:
 case CPTK_IS_FUNCTION:
 case CPTK_IS_MEMBER_FUNCTION_POINTER:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 02b4b4d745d..e3640faeb96 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -71,6 +71,9 @@
 #if !__has_builtin (__is_class)
 # error "__has_builtin (__is_class) failed"
 #endif
+#if !__has_builtin (__is_const)
+# error "__has_builtin (__is_const) failed"
+#endif
 #if !__has_builtin (__is_constructible)
 # error "__has_builtin (__is_constructible) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_const.C 
b/gcc/testsuite/g++.dg/ext/is_const.C
new file mode 100644
index 000..8a0e8df72a9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_const.C
@@ -0,0 +1,20 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+using cClassType = const ClassType;
+using vClassType = volatile ClassType;
+using cvClassType = const volatile ClassType;
+
+// Positive tests.
+SA(__is_const(const int));
+SA(__is_const(const volatile int));
+SA(__is_const(cClassType));
+SA(__is_const(cvClassType));
+
+// Negative tests.
+SA(!__is_const(int));
+SA(!__is_const(volatile int));
+SA(!__is_const(ClassType));
+SA(!__is_const(vClassType));
-- 
2.44.0



Re: [PATCH v2 4/5] bpf: implementation of func_info in .BTF.ext.

2024-02-28 Thread Cupertino Miranda


Corrected and pushed, with the following small change to resolve a
warning I had missed before:


diff --git a/gcc/config/bpf/btfext-out.cc b/gcc/config/bpf/btfext-out.cc
index 6ebbb54ef73e..00d2501a976b 100644
--- a/gcc/config/bpf/btfext-out.cc
+++ b/gcc/config/bpf/btfext-out.cc
@@ -200,7 +200,7 @@ btfext_info_sec_find_or_add (const char *sec_name, bool add)
   return ret;
 }

-#define SEARCH_NODE_AND_RETURN(TYPE, FIELD, CONDITION) ({ \
+#define SEARCH_NODE_AND_RETURN(TYPE, FIELD, CONDITION) __extension__ ({ \
   TYPE **head = &(FIELD); \
   while (*head != NULL) \
 { \

Thanks,
Cupertino
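
For context on the one-line change above: a `({ ... })` block is a GNU statement expression, and GCC diagnoses it under -Wpedantic. Prefixing the expression with `__extension__` suppresses that pedantic diagnostic. The sketch below assumes this is the warning in question. `SKETCH_FIND_SLOT`, `node`, and the helper functions are hypothetical and are not the code in btfext-out.cc.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {
// Hypothetical singly linked list node.
struct node { int key; node *next; };
}

// Hypothetical macro: walks the list and yields the address of the link that
// either points at the matching node or is the terminating NULL link.
// __extension__ marks the statement expression as an intentional GNU extension.
#define SKETCH_FIND_SLOT(HEAD, KEY) __extension__ ({ \
  sketch::node **slot_ = &(HEAD); \
  while (*slot_ != NULL && (*slot_)->key != (KEY)) \
    slot_ = &(*slot_)->next; \
  slot_; })

namespace sketch {

// True when some node in the list carries the given key.
inline bool contains(node *head, int key)
{
  node **slot = SKETCH_FIND_SLOT(head, key);
  return *slot != NULL;
}

// Builds the list 1 -> 2 -> 3 on the stack and searches it.
inline bool demo_contains(int key)
{
  node c = {3, NULL};
  node b = {2, &c};
  node a = {1, &b};
  return contains(&a, key);
}

} // namespace sketch
```

This only builds with compilers that implement the GNU extension, such as GCC and Clang.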


David Faust writes:

> Hi Cupertino,
>
> On 2/27/24 11:04, Cupertino Miranda wrote:
>> Kernel verifier complains in some particular cases for missing func_info
>> implementation in .BTF.ext. This patch implements it.
>>
>> Strings are cached locally in coreout.cc to avoid adding duplicated
>> strings in the string list. This string deduplication should eventually
>> be moved to the CTFC functions such that this happens widely.
>>
>> With this implementation, the CO-RE relocations information was also
>> simplified and integrated with the FuncInfo structures.
>>
>
> I have just a couple small comments inline in the patch below, but they
> are very minor and only suggestions/nits.
>
> The ChangeLog has the same past/present tense issue as the other patches
> in the series, but apart from that I see no issues. Great work! Thanks
> for implementing this.
>
> Patch is OK with the ChangeLog fixed up, and the inline nits - if
> you agree.
> Thanks!
>
>> gcc/Changelog:
>>
>>  PR target/113453
>>  * config/bpf/bpf.cc (bpf_function_prologue): Defined target
>>  hook.
>>  * config/bpf/coreout.cc (brf_ext_info_section)
>>  (btf_ext_info): Moved from coreout.h
>>  (btf_ext_funcinfo, btf_ext_lineinfo): Added struct.
>>  (bpf_core_reloc): Renamed to btf_ext_core_reloc.
>>  (btf_ext): Added static variable.
>>  (btfext_info_sec_find_or_add, SEARCH_NODE_AND_RETURN)
>>  (bpf_create_or_find_funcinfo, bpt_create_core_reloc)
>>  (btf_ext_add_string, btf_funcinfo_type_callback)
>>  (btf_add_func_info_for, btf_validate_funcinfo)
>>  (btf_ext_info_len, output_btfext_func_info): Added function.
>>  (output_btfext_header, bpf_core_reloc_add)
>>  (output_btfext_core_relocs, btf_ext_init, btf_ext_output):
>>  Changed to support new structs.
>>  * config/bpf/coreout.h (btf_ext_funcinfo, btf_ext_lineinfo):
>>  Moved and changed in coreout.cc.
>>  (btf_add_func_info_for, btf_ext_add_string): Added prototypes.
>>
>> gcc/testsuite/ChangeLog:
>>  PR target/113453
>>  * gcc.target/bpf/btfext-funcinfo-nocore.c: Added.
>>  * gcc.target/bpf/btfext-funcinfo.c: Added.
>>  * gcc.target/bpf/core-attr-5.c: Fixed regexp.
>>  * gcc.target/bpf/core-attr-6.c: Fixed regexp.
>>  * gcc.target/bpf/core-builtin-fieldinfo-offset-1.c: Fixed regexp.
>>  * gcc.target/bpf/core-section-1.c: Fixed regexp
>> ---
>>  gcc/config/bpf/bpf.cc |  12 +
>>  gcc/config/bpf/coreout.cc | 518 +-
>>  gcc/config/bpf/coreout.h  |  20 +-
>>  .../gcc.target/bpf/btfext-funcinfo-nocore.c   |  42 ++
>>  .../gcc.target/bpf/btfext-funcinfo.c  |  46 ++
>>  gcc/testsuite/gcc.target/bpf/core-attr-5.c|   9 +-
>>  gcc/testsuite/gcc.target/bpf/core-attr-6.c|   6 +-
>>  .../bpf/core-builtin-fieldinfo-offset-1.c |  13 +-
>>  gcc/testsuite/gcc.target/bpf/core-section-1.c |   2 +-
>>  9 files changed, 506 insertions(+), 162 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/bpf/btfext-funcinfo-nocore.c
>>  create mode 100644 gcc/testsuite/gcc.target/bpf/btfext-funcinfo.c
>>
>> diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
>> index 4318b26b9cda..ea47e3a8dbfb 100644
>> --- a/gcc/config/bpf/bpf.cc
>> +++ b/gcc/config/bpf/bpf.cc
>> @@ -385,6 +385,18 @@ bpf_compute_frame_layout (void)
>>  #undef TARGET_COMPUTE_FRAME_LAYOUT
>>  #define TARGET_COMPUTE_FRAME_LAYOUT bpf_compute_frame_layout
>>
>> +/* Defined to initialize data for func_info region in .BTF.ext section.  */
>> +
>> +static void
>> +bpf_function_prologue (FILE *f ATTRIBUTE_UNUSED)
>> +{
>> +  if (btf_debuginfo_p ())
>> +btf_add_func_info_for (cfun->decl, current_function_func_begin_label);
>> +}
>> +
>> +#undef TARGET_ASM_FUNCTION_PROLOGUE
>> +#define TARGET_ASM_FUNCTION_PROLOGUE bpf_function_prologue
>> +
>>  /* Expand to the instructions in a function prologue.  This function
>> is called when expanding the 'prologue' pattern in bpf.md.  */
>>
>> diff --git a/gcc/config/bpf/coreout.cc b/gcc/config/bpf/coreout.cc
>> index 2f06ec2a0f29..31b2abc3151b 100644
>> --- a/gcc/config/bpf/coreout.cc
>> +++ b/gcc/config/bpf/coreout.cc
>> @@ -31,6 +31,7 @@
>>  #include "btf.h"
>>  #include "rtl.h"
>>  #include "tree-pretty-print.h"
>> +#include "cgraph.h"
>>
>>  #include "coreout.h"
>>
>> @@ -95,64 

Re: [PATCH v2 3/5] bpf: Always emit .BTF.ext section if generating BTF

2024-02-28 Thread Cupertino Miranda


Corrected and Pushed.

Thanks,
Cupertino

David Faust writes:

> On 2/27/24 11:04, Cupertino Miranda wrote:
>> BPF applications, when generating BTF information should always create a
>> .BTF.ext section.
>> Current implementation was only creating it when -mco-re option was used.
>> This patch makes .BTF.ext always be generated for BPF target objects.
>> The patch also adds conditions around btf_finalize function call
>> such that BTF deallocation happens later for BPF target.
>> For BPF, btf_finalize is only called after .BTF.ext is generated.
>
> Thank you, this version makes it much more clear what the patch does.
>
>>
>> gcc/ChangeLog:
>>
>>  * config/bpf/bpf.cc (bpf_option_override): Make .BTF.ext
>>  enabled by default for BPF.
>>  (bpf_file_end): Call BTF deallocation.
>>  * dwarf2ctf.cc (ctf_debug_finalize): Conditionally execute BTF
>>  deallocation.
>
> You are missing ChangeLog entries for bpf_asm_init_sections and
> ctf_debug_finish.
>
> The script contrib/gcc-changelog/git_check_commit.py may help
> to catch those.
>
> The code changes LGTM, so OK with the ChangeLog fixed.
> Thanks.
>
>> ---
>>  gcc/config/bpf/bpf.cc | 20 +---
>>  gcc/dwarf2ctf.cc  | 12 ++--
>>  2 files changed, 15 insertions(+), 17 deletions(-)
>>
>> diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
>> index d6ca47eeecbe..4318b26b9cda 100644
>> --- a/gcc/config/bpf/bpf.cc
>> +++ b/gcc/config/bpf/bpf.cc
>> @@ -195,10 +195,8 @@ bpf_option_override (void)
>>if (TARGET_BPF_CORE && !btf_debuginfo_p ())
>>  error ("BPF CO-RE requires BTF debugging information, use %<-gbtf%>");
>>
>> -  /* To support the portability needs of BPF CO-RE approach, BTF debug
>> - information includes the BPF CO-RE relocations.  */
>> -  if (TARGET_BPF_CORE)
>> -write_symbols |= BTF_WITH_CORE_DEBUG;
>> +  /* BPF applications always generate .BTF.ext.  */
>> +  write_symbols |= BTF_WITH_CORE_DEBUG;
>>
>>/* Unlike much of the other BTF debug information, the information 
>> necessary
>>   for CO-RE relocations is added to the CTF container by the BPF backend.
>> @@ -218,10 +216,7 @@ bpf_option_override (void)
>>/* -gbtf implies -mcore when using the BPF backend, unless -mno-co-re
>>   is specified.  */
>>if (btf_debuginfo_p () && !(target_flags_explicit & MASK_BPF_CORE))
>> -{
>> -  target_flags |= MASK_BPF_CORE;
>> -  write_symbols |= BTF_WITH_CORE_DEBUG;
>> -}
>> +target_flags |= MASK_BPF_CORE;
>>
>>/* Determine available features from ISA setting (-mcpu=).  */
>>if (bpf_has_jmpext == -1)
>> @@ -267,7 +262,7 @@ bpf_option_override (void)
>>  static void
>>  bpf_asm_init_sections (void)
>>  {
>> -  if (TARGET_BPF_CORE)
>> +  if (btf_debuginfo_p () && btf_with_core_debuginfo_p ())
>>  btf_ext_init ();
>>  }
>>
>> @@ -279,8 +274,11 @@ bpf_asm_init_sections (void)
>>  static void
>>  bpf_file_end (void)
>>  {
>> -  if (TARGET_BPF_CORE)
>> -btf_ext_output ();
>> +  if (btf_debuginfo_p () && btf_with_core_debuginfo_p ())
>> +{
>> +  btf_ext_output ();
>> +  btf_finalize ();
>> +}
>>  }
>>
>>  #undef TARGET_ASM_FILE_END
>> diff --git a/gcc/dwarf2ctf.cc b/gcc/dwarf2ctf.cc
>> index 93e5619933fa..dca86edfffa9 100644
>> --- a/gcc/dwarf2ctf.cc
>> +++ b/gcc/dwarf2ctf.cc
>> @@ -944,7 +944,10 @@ ctf_debug_finalize (const char *filename, bool btf)
>>if (btf)
>>  {
>>btf_output (filename);
>> -  btf_finalize ();
>> +  /* btf_finalize when compiling BPF applciations gets deallocated by 
>> the
>> + BPF target in bpf_file_end.  */
>> +  if (btf_debuginfo_p () && !btf_with_core_debuginfo_p ())
>> +btf_finalize ();
>>  }
>>
>>else
>> @@ -1027,11 +1030,8 @@ ctf_debug_finish (const char * filename)
>>/* Emit BTF debug info here when CO-RE relocations need to be generated.
>>   BTF with CO-RE relocations needs to be generated when CO-RE is in 
>> effect
>>   for the BPF target.  */
>> -  if (btf_with_core_debuginfo_p ())
>> -{
>> -  gcc_assert (btf_debuginfo_p ());
>> -  ctf_debug_finalize (filename, btf_debuginfo_p ());
>> -}
>> +  if (btf_debuginfo_p () && btf_with_core_debuginfo_p ())
>> +ctf_debug_finalize (filename, btf_debuginfo_p ());
>>  }
>>
>>  #include "gt-dwarf2ctf.h"


Re: [PATCH v2 2/5] btf: added KIND_FUNC traversal function.

2024-02-28 Thread Cupertino Miranda


Corrected and Pushed.

Thanks,
Cupertino

David Faust writes:

> Hi Cupertino,
>
> Similar to patch 1, please use present tense to match the style of
> existing commits, in commit message and in ChangeLog.
>
> On 2/27/24 11:04, Cupertino Miranda wrote:
>> Added a traversal function to traverse all BTF_KIND_FUNC nodes with a
>> callback function. Used for .BTF.ext section content creation.
>
> Added -> Add
>
>>
>> gcc/ChangeLog:
>>
>>  * btfout.cc (output_btf_func_types): Use FOR_EACH_VEC_ELT.
>>  (traverse_btf_func_types): Defined function.
>>  * ctfc.h (funcs_traverse_callback): Typedef for function
>>  prototype.
>>  (traverse_btf_func_types): Added prototype.
>
> Mix of present and past tenses here, please stick to the present:
> Defined -> Define
> Added -> Add
>
> The code changes LGTM, so OK with those nits fixed.
> Thanks.
>
>> ---
>>  gcc/btfout.cc | 22 --
>>  gcc/ctfc.h|  3 +++
>>  2 files changed, 23 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/btfout.cc b/gcc/btfout.cc
>> index 7e114e224449..7aabd99f3e7c 100644
>> --- a/gcc/btfout.cc
>> +++ b/gcc/btfout.cc
>> @@ -1276,8 +1276,10 @@ output_btf_types (ctf_container_ref ctfc)
>>  static void
>>  output_btf_func_types (ctf_container_ref ctfc)
>>  {
>> -  for (size_t i = 0; i < vec_safe_length (funcs); i++)
>> -btf_asm_func_type (ctfc, (*funcs)[i], i);
>> +  ctf_dtdef_ref ref;
>> +  unsigned i;
>> +  FOR_EACH_VEC_ELT (*funcs, i, ref)
>> +btf_asm_func_type (ctfc, ref, i);
>>  }
>>
>>  /* Output all BTF_KIND_DATASEC records.  */
>> @@ -1452,4 +1454,20 @@ btf_finalize (void)
>>tu_ctfc = NULL;
>>  }
>>
>> +/* Traversal function for all BTF_KIND_FUNC type records.  */
>> +
>> +bool
>> +traverse_btf_func_types (funcs_traverse_callback callback, void *data)
>> +{
>> +  ctf_dtdef_ref ref;
>> +  unsigned i;
>> +  FOR_EACH_VEC_ELT (*funcs, i, ref)
>> +{
>> +  bool stop = callback (ref, data);
>> +  if (stop == true)
>> +return true;
>> +}
>> +  return false;
>> +}
>> +
>>  #include "gt-btfout.h"
>> diff --git a/gcc/ctfc.h b/gcc/ctfc.h
>> index 7aac57edac55..fa188bf2f5a4 100644
>> --- a/gcc/ctfc.h
>> +++ b/gcc/ctfc.h
>> @@ -441,6 +441,9 @@ extern int ctf_add_variable (ctf_container_ref, const 
>> char *, ctf_id_t,
>>  extern ctf_id_t ctf_lookup_tree_type (ctf_container_ref, const tree);
>>  extern ctf_id_t get_btf_id (ctf_id_t);
>>
>> +typedef bool (*funcs_traverse_callback) (ctf_dtdef_ref, void *);
>> +bool traverse_btf_func_types (funcs_traverse_callback, void *);
>> +
>>  /* CTF section does not emit location information; at this time, location
>> information is needed for BTF CO-RE use-cases.  */
>>
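
The callback contract of traverse_btf_func_types in the patch above (visit each record in order, stop and return true as soon as the callback returns true) can be sketched independently of GCC as follows. `func_record`, `search_state`, and the other names in namespace `sketch` are hypothetical stand-ins, and std::vector replaces GCC's vec type.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Hypothetical stand-in for one BTF_KIND_FUNC record.
struct func_record { int id; };

// Same shape as the callback typedef in the patch: record plus opaque data.
typedef bool (*traverse_callback) (const func_record &, void *);

// Visits records in order; returns true if the callback asked to stop early.
inline bool traverse (const std::vector<func_record> &funcs,
                      traverse_callback callback, void *data)
{
  for (const func_record &rec : funcs)
    if (callback (rec, data))
      return true;
  return false;
}

// State threaded through the opaque pointer.
struct search_state { int wanted; int visited; };

// Counts every visit and requests a stop when the wanted id is seen.
inline bool find_id (const func_record &rec, void *data)
{
  search_state *state = static_cast<search_state *> (data);
  ++state->visited;
  return rec.id == state->wanted;
}

// Returns how many records were visited while searching ids 10, 20, 30.
inline int visited_while_searching (int wanted)
{
  std::vector<func_record> funcs;
  funcs.push_back (func_record{10});
  funcs.push_back (func_record{20});
  funcs.push_back (func_record{30});
  search_state state = {wanted, 0};
  traverse (funcs, find_id, &state);
  return state.visited;
}

} // namespace sketch
```

Returning bool from the callback lets a search stop at the first hit, while a callback that always returns false turns the same function into a plain for-each.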


Re: [PATCH v2 1/5] btf: fixed type id in BTF_KIND_FUNC struct data.

2024-02-28 Thread Cupertino Miranda


Corrected and Pushed.

Thanks,
Cupertino

David Faust writes:

> Hi Cupertino,
>
> Just some nits below. Apologies for incoming pedantry.
>
> On 2/27/24 11:04, Cupertino Miranda wrote:
>> This patch correct the aditition of +1 on the type id, which originally
>> was done in the wrong location and leaded to func_sts->dtd_type for
>> BTF_KIND_FUNCS struct data to contain the type id of the previous entry.
>
> Multiple typos here:
>   correct -> corrects
>   aditition -> addition
>   ...leaded to.. -> ..led to..
>   func_sts -> func_dtd
>   BTF_KIND_FUNCS -> BTF_KIND_FUNC
>
>>
>> gcc/ChangeLog:
>>
>>  * btfout.cc (btf_collect_dataset): Corrected BTF type id.
>
> Please use present tense in the ChangeLog entries, to match GNU style
> guidelines and existing entries,
> i.e. "Correct..." instead of "Corrected..."
>
> The same goes for the commit header, please use present tense to match
> the style of existing commits,
> i.e. "btf: fix type id..." instead of "fixed".
>
> The patch itself LGTM, so OK with above changes.
> Thanks!
>
>> ---
>>  gcc/btfout.cc | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/btfout.cc b/gcc/btfout.cc
>> index dcf751f8fe0d..7e114e224449 100644
>> --- a/gcc/btfout.cc
>> +++ b/gcc/btfout.cc
>> @@ -457,7 +457,8 @@ btf_collect_datasec (ctf_container_ref ctfc)
>>func_dtd->dtd_data.ctti_type = dtd->dtd_type;
>>func_dtd->linkage = dtd->linkage;
>>func_dtd->dtd_name = dtd->dtd_name;
>> -  func_dtd->dtd_type = num_types_added + num_types_created;
>> +  /* +1 for the sentinel type not in the types map.  */
>> +  func_dtd->dtd_type = num_types_added + num_types_created + 1;
>>
>>/* Only the BTF_KIND_FUNC type actually references the name. The
>>   BTF_KIND_FUNC_PROTO is always anonymous.  */
>> @@ -480,8 +481,7 @@ btf_collect_datasec (ctf_container_ref ctfc)
>>
>>struct btf_var_secinfo info;
>>
>> -  /* +1 for the sentinel type not in the types map.  */
>> -  info.type = func_dtd->dtd_type + 1;
>> +  info.type = func_dtd->dtd_type;
>>
>>/* Both zero at compile time.  */
>>info.size = 0;


Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-28 Thread Peter Bergner
On 2/28/24 8:31 AM, Segher Boessenkool wrote:
> On Tue, Feb 27, 2024 at 04:50:02PM -0600, Peter Bergner wrote:
>> So it seems you're not NAKing the use of splat_input_operand, but
>> just that it needs more explanation in the git log entry, correct?
> 
> I NAK the patch.  _Of course_ there needs to be *something* done, there
> is a bug after all, it needs to be fixed.
> 
> But no, there are big questions about if splat_input_operand is correct
> as well.  This needs to be justified in the patch submission.

Ok, then Jeevitha, repost the patch with the s/op1/operands[1]/ only change.
Jeevitha has already bootstrapped and regtested that change and it does
fix the bug.

Clearly, the splat_input_operand change needs more discussion and would
be a follow-on patch...if we decide to do it at all.

Peter



Re: [PATCH] RISC-V: add option -m(no-)autovec-segment

2024-02-28 Thread Jeff Law




On 2/27/24 13:30, Greg McGary wrote:

On 2/27/24 8:25 AM, Jeff Law wrote:




On 2/25/24 21:53, Greg McGary wrote:

Add option -m(no-)autovec-segment to enable/disable autovectorizer
from emitting vector segment load/store instructions. This is useful for
performance experiments.

gcc/ChangeLog:
* config/riscv/autovec.md (vec_mask_len_load_lanes,
  vec_mask_len_store_lanes): Predicate with TARGET_VECTOR_AUTOVEC_SEGMENT
* gcc/config/riscv/riscv-opts.h (TARGET_VECTOR_AUTOVEC_SEGMENT): New macro.
* gcc/config/riscv/riscv.opt (-m(no-)autovec-segment): New option.
* gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent divide-by-zero.
* testsuite/gcc.target/riscv/rvv/autovec/struct/*_noseg*.c,
  testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: New tests.

I don't mind having options to do this kind of selection (we've done
similar things internally for other RVV features).  But I don't think
now is the time to be introducing this stuff.  We're in stage4 of the
development cycle after all.



No problemo. Will you take the simple bugfix?

   gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent divide-by-zero.
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc

index 1dbe1115da4..6303d82d959 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -11521,7 +11521,8 @@ vectorizable_load (vec_info *vinfo,
   - (vec_num * j + i) * nunits);
  /* remain should now be > 0 and < nunits.  */
  unsigned num;
-    if (constant_multiple_p (nunits, remain, &num))
+    if (known_gt (remain, 0)
+    && constant_multiple_p (nunits, remain, &num))
    {
  tree ptype;
  new_vtype


I am unaware of a testcase that triggers it without disabling segmented
load, so LMK if you are cool with the fix without a test case.

We'd really need a testcase and some analysis -- this change will affect
every target, so you'd need to explain why the change is correct.
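
To make the shape of the proposed guard concrete, here is a rough sketch with plain unsigned integers standing in for the values GCC actually uses: the divisor is checked for zero before the multiple test, so the division is never evaluated with a zero operand. `guarded_multiple` and `multiple_result` are hypothetical names, and this says nothing about whether `remain` can really be zero in vectorizable_load, which is the analysis being asked for.

```cpp
namespace sketch {

// Result of the multiple test: whether it held, and the factor if it did.
struct multiple_result { bool ok; unsigned factor; };

// True (with factor a / b) only when b is non-zero and divides a exactly.
// The b > 0 test short-circuits, so a % b is never evaluated for b == 0.
constexpr multiple_result guarded_multiple (unsigned a, unsigned b)
{
  return (b > 0 && a % b == 0) ? multiple_result{true, a / b}
                               : multiple_result{false, 0};
}

} // namespace sketch

static_assert(sketch::guarded_multiple (8, 4).ok, "8 is a multiple of 4");
static_assert(!sketch::guarded_multiple (8, 0).ok, "zero divisor rejected");
```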


jeff



Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64

2024-02-28 Thread Patrick O'Neill



On 2/28/24 07:02, Palmer Dabbelt wrote:

On Wed, 28 Feb 2024 06:57:53 PST (-0800), jeffreya...@gmail.com wrote:



On 2/28/24 05:23, Kito Cheng wrote:
atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic operation on
RV64, however lr.w is doing sign extend to DI and compare instruction only have
DI mode on RV64, so the expected value should be sign extend before compare as
well, so that we can get right compare result.

gcc/ChangeLog:

PR target/114130
* config/riscv/sync.md (atomic_compare_and_swap): Sign
extend the expected value if needed.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr114130.c: New.

Nearly rejected this as I think the description was a bit ambiguous and
I thought you were extending the result of the lr.w.  But it's actually
the other value you're ensuring gets properly extended.
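
A small model of the mismatch may help here. It assumes, for illustration, that the 32-bit expected value reaches the 64-bit compare zero-extended, while the value loaded by lr.w is sign-extended. Every name in namespace `sketch` is hypothetical; this models the arithmetic only, not the sync.md pattern.

```cpp
#include <cstdint>

namespace sketch {

// Sign-extend a 32-bit pattern to 64 bits, as lr.w does on RV64.
constexpr std::int64_t sign_extend32 (std::uint32_t v)
{
  return static_cast<std::int64_t> (v ^ 0x80000000u) - INT64_C (0x80000000);
}

// Zero-extend a 32-bit pattern to 64 bits.
constexpr std::int64_t zero_extend32 (std::uint32_t v)
{
  return static_cast<std::int64_t> (v);
}

// 64-bit compare with the expected value left zero-extended.
constexpr bool compare_unextended (std::uint32_t mem, std::uint32_t expected)
{
  return sign_extend32 (mem) == zero_extend32 (expected);
}

// 64-bit compare after sign-extending the expected value as well.
constexpr bool compare_extended (std::uint32_t mem, std::uint32_t expected)
{
  return sign_extend32 (mem) == sign_extend32 (expected);
}

} // namespace sketch

// Identical 32-bit patterns with bit 31 set compare unequal unless both
// sides are sign-extended.
static_assert(!sketch::compare_unextended (0x80000000u, 0x80000000u), "");
static_assert(sketch::compare_extended (0x80000000u, 0x80000000u), "");
```

Values with bit 31 clear compare equal either way in this model, so the mismatch only shows up for 32-bit patterns that are negative when read as signed.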


I had the same response, but after reading it I'm not quite sure how 
to say it better.



OK.


I was looking at the code to try and ask if we have the same bug for 
the short inline CAS routines, but I've got to run to some meetings...


I don't think subword AMO CAS is impacted.

As part of the CAS we mask both the expected value [2] and the retrieved 
value[1] before comparing.


- Patrick

[1]: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l495
[2]: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l459






Jeff


Re: [PATCH v3] c++/modules: Support lambdas attached to more places in modules [PR111710]

2024-02-28 Thread Jason Merrill

On 2/27/24 23:12, Nathaniel Shead wrote:

On Tue, Feb 27, 2024 at 11:59:46AM -0500, Patrick Palka wrote:

On Fri, 16 Feb 2024, Nathaniel Shead wrote:


On Tue, Feb 13, 2024 at 07:52:01PM -0500, Jason Merrill wrote:

On 2/10/24 17:57, Nathaniel Shead wrote:

The fix for PR107398 weakened the restrictions that lambdas must belong
to namespace scope. However this was not sufficient: we also need to
allow lambdas keyed to FIELD_DECLs or PARM_DECLs.


I wonder about keying such lambdas to the class and function, respectively,
rather than specifically to the field or parameter, but I suppose it doesn't
matter.


I did some more testing and realised my testcase didn't properly
exercise whether I'd properly deduplicated or not, and an improved
testcase proved that actually keying to the field rather than the class
did cause issues. (Parameter vs. function doesn't seem to have mattered
however.)

Here's an updated patch that fixes this, and includes the changes for
lambdas in base classes that I'd had as a separate patch earlier. I've
also added some concepts testcases just in case.

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

The fix for PR107398 weakened the restrictions that lambdas must belong
to namespace scope. However this was not sufficient: we also need to
allow lambdas attached to FIELD_DECLs, PARM_DECLs, and TYPE_DECLs.
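
For reference, these are the kinds of source forms being discussed, written as ordinary non-module C++ so they can be compiled on their own. This sketch does not exercise modules or the keying logic at all, and the names in namespace `sketch` are made up for the example.

```cpp
#include <cassert>

namespace sketch {

// Lambda in a default member initializer: the closure belongs to a field.
struct with_field
{
  int (*twice) (int) = [] (int x) { return 2 * x; };
};

// Lambda in a default argument: the closure belongs to a parameter.
inline int apply (int x, int (*fn) (int) = [] (int v) { return v + 1; })
{
  return fn (x);
}

} // namespace sketch
```

Both lambdas are captureless, so they convert to plain function pointers. The TYPE_DECL and base-class cases need lambdas in unevaluated operands (C++20) and are left out of this sketch.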

For field decls we key the lambda to its class rather than the field
itself. This avoids some errors with deduplicating fields.

Additionally, by [basic.link] p15.2 a lambda defined anywhere in a
class-specifier should not be TU-local, which includes base-class
declarations, so ensure that lambdas declared there are keyed
appropriately as well.

Because this now requires 'DECL_MODULE_KEYED_DECLS_P' to be checked on a
fairly large number of different kinds of DECLs, and that in general
it's safe to just get 'false' as a result of a check on an unexpected
DECL type, this patch also removes the tree checking from the accessor.

Finally, to handle deduplicating templated lambda fields, we need to
ensure that we can determine that two lambdas from different field decls
match. The modules code does not attempt to deduplicate expression
nodes, which causes issues as the LAMBDA_EXPRs are then considered to be
different. However, rather than checking the LAMBDA_EXPR directly we can
instead check its type: the generated RECORD_TYPE for a LAMBDA_EXPR must
also be unique, and /is/ deduplicated on import, so we can just check
for that instead.


We probably should be deduplicating LAMBDA_EXPR on stream-in, perhaps
something like

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index e8eabb1f6f9..1b2ba2e0fa8 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -9183,6 +9183,13 @@ trees_in::tree_value ()
       return NULL_TREE;
     }
 
+  if (TREE_CODE (t) == LAMBDA_EXPR
+      && CLASSTYPE_LAMBDA_EXPR (TREE_TYPE (t)))
+    {
+      existing = CLASSTYPE_LAMBDA_EXPR (TREE_TYPE (t));
+      back_refs[~tag] = existing;
+    }
+
   dump (dumper::TREE) && dump ("Read tree:%d %C:%N", tag, TREE_CODE (t), t);
 
   if (TREE_CODE (existing) == INTEGER_CST && !TREE_OVERFLOW (existing))


would suffice?  If not, we probably need to take inspiration from the
TREE_BINFO streaming and handle LAMBDA_EXPR similarly.



Ah yup, right, that makes more sense. Your suggestion seems to work,
thanks! Here's an updated patch.

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?


With that change, do you still need to key to the class instead of the 
field for dedup to work properly?


OK either way.


-- >8 --

The fix for PR107398 weakened the restrictions that lambdas must belong
to namespace scope. However this was not sufficient: we also need to
allow lambdas attached to FIELD_DECLs, PARM_DECLs, and TYPE_DECLs.

For field decls we key the lambda to its class rather than the field
itself. This avoids some errors with deduplicating fields.

Additionally, by [basic.link] p15.2 a lambda defined anywhere in a
class-specifier should not be TU-local, which includes base-class
declarations, so ensure that lambdas declared there are keyed
appropriately as well.

Because this now requires 'DECL_MODULE_KEYED_DECLS_P' to be checked on a
fairly large number of different kinds of DECLs, and that in general
it's safe to just get 'false' as a result of a check on an unexpected
DECL type, this patch also removes the tree checking from the accessor.

Finally, to handle deduplicating templated lambda fields, we need to
ensure that we can determine that two lambdas from different field decls
match, so we ensure that we deduplicate LAMBDA_EXPRs on stream in.
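
For readers unfamiliar with the cases named above, the following sketch shows lambdas attached to a field and to a parameter in plain C++. It is not taken from the patch or its testcases: the names Widget, apply, widget_call and self_check are made up for illustration, and no module syntax is used, so it only shows the source forms that would need keying when they appear in a module interface.

```cpp
#include <cassert>

// Lambda in a default member initializer: the closure type is attached to
// a field of the class (the FIELD_DECL case discussed above).
struct Widget
{
  int (*callback) (int) = [] (int x) { return x + 1; };
};

// Lambda in a default argument: the closure type is attached to a
// parameter of the function (the PARM_DECL case discussed above).
inline int
apply (int v, int (*fn) (int) = [] (int x) { return x * 2; })
{
  return fn (v);
}

// Helper that exercises the field's default initializer.
inline int
widget_call (int v)
{
  Widget w;
  return w.callback (v);
}

// Small usage check for the two forms above.
inline void
self_check ()
{
  assert (widget_call (41) == 42);
  assert (apply (21) == 42);
}
```

Both lambdas are captureless, so they convert to plain function pointers; that keeps the sketch valid in any C++11 or later mode.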

PR c++/111710

gcc/cp/ChangeLog:

* cp-tree.h (DECL_MODULE_KEYED_DECLS_P): Remove tree checking.
(struct lang_decl_base): Update comments and fix whitespace.
* module.cc (trees_out::lang_decl_bools): Always write
module_keyed_decls_p flag...
(trees_in::lang_decl_bools): ...and 

Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-02-28 Thread Jeff Law




On 2/27/24 21:51, Li, Pan2 wrote:

  if (!targetm.modes_tieable_p (src_int_mode, src_mode))
    return NULL_RTX;
  if (!targetm.modes_tieable_p (int_mode, mode))
    return NULL_RTX;


Yes, it will return NULL_RTX in the first if, given that src_int_mode is
E_DImode while src_mode is E_V2SFmode and mode is E_V4QImode.
extract_low_bits converts the modes E_V2SFmode/E_V4QImode to
E_DImode/E_SImode in advance, before the tieable checks, validate_subreg and
gen_lowpart.

I am not sure my understanding is correct, but it looks like
extract_low_bits cannot fully handle vector modes, because vector modes are
always untieable to their int modes, so it returns NULL_RTX.

Well, the code tries to turn the vector mode into a suitable integer
mode via int_mode_for_mode.  That takes a mode, including vector modes,
and tries to find an integer mode of the exact same size.


So it's going to check if V2SF can be tied to DI and V4QI with SI.  I 
suspect those are going to fail for RISC-V as those aren't tieable.


Jeff



Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-02-28 Thread Andre Vieira (lists)



On 27/02/2024 08:47, Richard Biener wrote:

On Mon, 26 Feb 2024, Andre Vieira (lists) wrote:




On 05/02/2024 09:56, Richard Biener wrote:

On Thu, 1 Feb 2024, Andre Vieira (lists) wrote:




On 01/02/2024 07:19, Richard Biener wrote:

On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:


The patch didn't come with a testcase so it's really hard to tell
what goes wrong now and how it is fixed ...


My bad! I had a testcase locally but never added it...

However... now that I've looked at it and run it past Richard S, the codegen
isn't 'wrong', but it does have the potential to lead to some pretty slow
codegen, especially for inbranch simdclones, where it transforms the SVE
predicate into an Advanced SIMD vector by inserting the elements one at a
time...

An example of which can be seen if you do:

gcc -O3 -march=armv8-a+sve -msve-vector-bits=128  -fopenmp-simd t.c -S

with the following t.c:
#pragma omp declare simd simdlen(4) inbranch
int __attribute__ ((const)) fn5(int);

void fn4 (int *a, int *b, int n)
{
  for (int i = 0; i < n; ++i)
    b[i] = fn5(a[i]);
}

Now I do have to say, for our main usecase of libmvec we won't have any
'inbranch' Advanced SIMD clones, so we avoid that issue... But of course
that doesn't mean user code will.


It seems to use SVE masks with vector(4)  and the
ABI says the mask is vector(4) int.  You say that's because we choose
an Adv SIMD clone for the SVE VLS vector code (it calls _ZGVnM4v_fn5).

The vectorizer creates

_44 = VEC_COND_EXPR ;

and then vector lowering decomposes this.  That means the vectorizer
lacks a check that the target handles this VEC_COND_EXPR.

Of course I would expect that SVE with VLS vectors is able to
code generate this operation, so it's missing patterns in the end.

Richard.



What should we do for GCC-14? Going forward I think the right thing to do is
to add these patterns. But I am not even going to try to do that right now and
even though we can codegen for this, the result doesn't feel like it would
ever be profitable which means I'd rather not vectorize, or well pick a
different vector mode if possible.

This would be achieved with the change to the targethook. If I change the hook
to take modes, using STMT_VINFO_VECTYPE (stmt_vinfo), is that OK for now?


Passing in a mode is OK.  I'm still not fully understanding why the
clone isn't fully specifying 'mode' and if it does not why the
vectorizer itself can not disregard it.



We could check in the vectorizer that the modes of the parameters & return
type are the same as those of the vector operands & result. But then we'd
also want to make sure we don't reject cases where we have simdclones
with compatible modes, i.e. the same element type but a multiple of the
element count.  That is where we'd get in trouble again, I think, because
we'd want to accept V8SI -> 2x V4SI, but not V8SI -> 2x VNx4SI (with VLS and
aarch64_sve_vg = 2), not because it's invalid, but because right now the
codegen is bad. It's easier to do this in the targethook, which we
can technically also use to 'rank' simdclones by setting a
target_badness value, so in the future we could decide to assign some
'badness' to influence the rank of an SVE simdclone for Advanced SIMD loops
vs an Advanced SIMD clone for Advanced SIMD loops.


This does touch on another issue, simdclone costing, which is a larger
issue in general and one we (arm) might want to approach in the future.
It's a complex issue, because the vectorizer doesn't know the performance
impact of a simdclone. We assume (as we should) that it's faster than the
original scalar, though we currently don't record costs for either, but we
don't know by how much. So the vectorizer can't reason about whether it's
beneficial to use a simdclone if it has to do a lot of operand preparation;
we can merely tell it to use the clone or not, and all the other operations
in the loop will determine the costing.




From the past discussion I understood the existing situation isn't
as bad as initially thought and no bad things happen right now?

Nope, I thought the compiler would fall apart, but it seems to be able
to transform the operands from one mode into the other, so without the
targethook it just generates slower loops in certain cases, which we'd
rather avoid given the usecase for simdclones is to speed things up ;)



Attached reworked patch.


This patch adds a machine_mode argument to TARGET_SIMD_CLONE_USABLE so that
the target can reject a simd_clone based on the vector mode it is using.
This is needed because, for VLS SVE vectorization, the vectorizer accepts
Advanced SIMD simd clones when vectorizing using SVE types, since the
simdlens might match; this currently leads to suboptimal codegen.


Other targets do not currently need to use this argument.
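
To make the idea concrete, here is a toy model of a usability check that looks at the vector kind the loop is being vectorized with. It is only a sketch: VecKind, SimdClone, clone_usable and pick_clone are hypothetical stand-ins, not GCC's types or the hook's real signature. It assumes the convention, as I understand the hook documentation, that a negative result rejects the clone and that among non-negative results a smaller value is preferred.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for illustration only; these are NOT GCC's real types.
enum class VecKind { AdvSimd, Sve };

struct SimdClone
{
  VecKind kind;   // ISA the clone was built for
  int simdlen;    // lanes processed per call
};

// Hypothetical usability check.  Returns -1 to reject the clone, otherwise
// a non-negative badness where smaller is preferred.
inline int
clone_usable (const SimdClone &clone, VecKind loop_kind)
{
  if (clone.kind == VecKind::AdvSimd && loop_kind == VecKind::Sve)
    return -1;   // simdlen may match, but operand conversion is slow
  if (clone.kind != loop_kind)
    return 1;    // usable, but ranked below a same-ISA clone
  return 0;
}

// Return the index of the usable clone with the lowest badness, or -1.
inline int
pick_clone (const std::vector<SimdClone> &clones, VecKind loop_kind)
{
  int best = -1;
  int best_badness = 0;
  for (int i = 0; i < static_cast<int> (clones.size ()); ++i)
    {
      int badness = clone_usable (clones[i], loop_kind);
      if (badness < 0)
        continue;
      if (best < 0 || badness < best_badness)
        {
          best = i;
          best_badness = badness;
        }
    }
  return best;
}

// Small usage check: an SVE loop skips the Advanced SIMD clone.
inline void
self_check ()
{
  std::vector<SimdClone> clones = { { VecKind::AdvSimd, 4 },
                                    { VecKind::Sve, 4 } };
  assert (pick_clone (clones, VecKind::Sve) == 1);
  assert (pick_clone (clones, VecKind::AdvSimd) == 0);
}
```

The second branch mirrors the 'badness' ranking mentioned earlier in the thread; whether a target would want to rank rather than reject is a policy choice this sketch does not settle.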

gcc/ChangeLog:

* target.def (TARGET_SIMD_CLONE_USABLE): Add argument.
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass vector_mode
to call 

Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64

2024-02-28 Thread Richard Biener



> On 28.02.2024 at 16:05, Jeff Law wrote:
> 
> 
> 
>> On 2/28/24 03:05, Richard Biener wrote:
>> 
>> Untested fix for targets that cannot handle the original IL below.
>> I'm not convinced that's the way to go here, is it?  Or scrap
>> the testcase?  Or have a cheap way to say "this target doesn't
>> support _any_ vec_cond"?
>> Another way, but for stage1 I think, would be to delay all the
>> vector checking to the "final" maybe_push_res_to_seq, but that
>> requires aggressively cleaning out intermediate folding results
>> no longer needed and also still requires patterns to do the
>> "is the incoming IL supported on the target" as later we do not
>> know what parts we matched (this could in theory be automated
>> as well, of course).  There's also no way to say "don't apply
>> this unless the final simplification result is a constant".
>> While we might be able to add a new '*' modifier saying
>> "if the result is supported by the target" there's again no
>> way of conditionalizing this on the original operation being
>> supported.  OK, maybe on the match allow '*1' and when '*1'
>> is used in the result we fail if any of the '*1' annotated
>> parts in the match were supported but at least one of the '*1'
>> in the result is not.  Supported only for select operations.
>> But it still doesn't help as we'd have to somehow check the
>> re-simplified result and we can't really track where a
>> vec_cond we put in goes.
> So I don't mind completely deferring to the next stage1; I don't consider 
> this a significant quality regression and if we think we can do something 
> cleaner and avoid the pattern explosion I certainly don't mind waiting.
> 
> Seems like we ought to open a bug so it doesn't get lost, particularly if 
> you'd prefer to defer.

I have another idea for getting away with a leaner "not supported" query when
we're before vector lowering.  I'll see how that goes tomorrow.

Richard 

> jeff


Re: [PATCH 01/11] rs6000, Fix __builtin_vsx_cmple* args and documentation, builtins

2024-02-28 Thread Carl Love
Kewen:

Thanks for the review.  From the review, it looks like a few of the built-ins
just need to be replaced with an overloaded version of an existing
PVIPR-documented built-in.  Most of the rest can just be removed.  I will work
on redoing the patch set accordingly.  We can then look at the new patch set
after stage 4 is over.

   Carl 

On 2/20/24 09:55, Carl Love wrote:
> 
> GCC maintainers:
> 
> This patch fixes the arguments and return type for the various 
> __builtin_vsx_cmple* built-ins.  They were defined as signed but should have 
> been defined as unsigned.
> 
> The patch has been tested on Power 10 with no regressions.
> 
> Please let me know if this patch is acceptable for mainline.  Thanks.
> 
>   Carl 
> 
> -
> 
> rs6000, Fix __builtin_vsx_cmple* args and documentation, builtins
> 
> The built-ins __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
> __builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi should take
> unsigned arguments and return an unsigned result.  This patch changes
> the arguments and return type from signed to unsigned.
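
As a scalar illustration of why the element signedness in these prototypes matters, the sketch below compares the same bit patterns as signed and as unsigned 8-bit values. It is not the built-in itself; cmple_as_signed, cmple_as_unsigned and self_check are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Compare two 8-bit patterns as signed values.  The conversion from
// uint8_t to int8_t wraps modulo 2^8 (guaranteed since C++20, and GCC's
// documented behaviour before that).
inline bool
cmple_as_signed (std::uint8_t a_bits, std::uint8_t b_bits)
{
  return static_cast<std::int8_t> (a_bits) <= static_cast<std::int8_t> (b_bits);
}

// Compare the same two patterns as unsigned values.
inline bool
cmple_as_unsigned (std::uint8_t a_bits, std::uint8_t b_bits)
{
  return a_bits <= b_bits;
}

// 0xFF is -1 when read as signed but 255 when read as unsigned, so the
// two interpretations disagree about "0xFF <= 0x01".
inline void
self_check ()
{
  assert (cmple_as_signed (0xFF, 0x01));
  assert (!cmple_as_unsigned (0xFF, 0x01));
}
```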
> 
> The documentation for the signed and unsigned versions of
> __builtin_vsx_cmple is missing from extend.texi.  This patch adds the
> missing documentation.
> 
> Test cases are added for each of the signed and unsigned built-ins.
> 
> gcc/ChangeLog:
>   * config/rs6000/rs6000-builtins.def (__builtin_vsx_cmple_u16qi,
>   __builtin_vsx_cmple_u2di, __builtin_vsx_cmple_u4si): Change
>   arguments and return from signed to unsigned.
>   * doc/extend.texi (__builtin_vsx_cmple_16qi,
>   __builtin_vsx_cmple_8hi, __builtin_vsx_cmple_4si,
>   __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u8hi,
>   __builtin_vsx_cmple_u4si): Add documentation.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.target/powerpc/vsx-cmple.c: New test file.
> ---
>  gcc/config/rs6000/rs6000-builtins.def|  10 +-
>  gcc/doc/extend.texi  |  23 
>  gcc/testsuite/gcc.target/powerpc/vsx-cmple.c | 127 +++
>  3 files changed, 155 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx-cmple.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index 3bc7fed6956..d66a53a0fab 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1349,16 +1349,16 @@
>const vss __builtin_vsx_cmple_8hi (vss, vss);
>  CMPLE_8HI vector_ngtv8hi {}
>  
> -  const vsc __builtin_vsx_cmple_u16qi (vsc, vsc);
> +  const vuc __builtin_vsx_cmple_u16qi (vuc, vuc);
>  CMPLE_U16QI vector_ngtuv16qi {}
>  
> -  const vsll __builtin_vsx_cmple_u2di (vsll, vsll);
> +  const vull __builtin_vsx_cmple_u2di (vull, vull);
>  CMPLE_U2DI vector_ngtuv2di {}
>  
> -  const vsi __builtin_vsx_cmple_u4si (vsi, vsi);
> +  const vui __builtin_vsx_cmple_u4si (vui, vui);
>  CMPLE_U4SI vector_ngtuv4si {}
>  
> -  const vss __builtin_vsx_cmple_u8hi (vss, vss);
> +  const vus __builtin_vsx_cmple_u8hi (vus, vus);
>  CMPLE_U8HI vector_ngtuv8hi {}
>  
>const vd __builtin_vsx_concat_2df (double, double);
> @@ -1769,7 +1769,7 @@
>const vf __builtin_vsx_xvcvuxdsp (vull);
>  XVCVUXDSP vsx_xvcvuxdsp {}
>  
> -  const vd __builtin_vsx_xvcvuxwdp (vsi);
> +  const vd __builtin_vsx_xvcvuxwdp (vui);
>  XVCVUXWDP vsx_xvcvuxwdp {}
>  
>const vf __builtin_vsx_xvcvuxwsp (vsi);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 2b8ba1949bf..4d8610f6aa8 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -22522,6 +22522,29 @@ if the VSX instruction set is available.  The 
> @samp{vec_vsx_ld} and
>  @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
>  @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
>  
> +
> +@smallexample
> +vector signed char __builtin_vsx_cmple_16qi (vector signed char,
> + vector signed char);
> +vector signed short __builtin_vsx_cmple_8hi (vector signed short,
> + vector signed short);
> +vector signed int __builtin_vsx_cmple_4si (vector signed int,
> + vector signed int);
> +vector unsigned char __builtin_vsx_cmple_u16qi (vector unsigned char,
> +vector unsigned char);
> +vector unsigned short __builtin_vsx_cmple_u8hi (vector unsigned short,
> +vector unsigned short);
> +vector unsigned int __builtin_vsx_cmple_u4si (vector unsigned int,
> +  vector unsigned int);
> +@end smallexample
> +
> +The builti-ins @code{__builtin_vsx_cmple_16qi}, 
> @code{__builtin_vsx_cmple_8hi},
> +@code{__builtin_vsx_cmple_4si}, @code{__builtin_vsx_cmple_u16qi},
> 

Re: [PATCH] developer option: -fdump-generic-nodes; initial incorporation

2024-02-28 Thread David Malcolm
On Wed, 2024-02-28 at 08:58 +0100, Richard Biener wrote:
> On Tue, Feb 27, 2024 at 10:20 PM Robert Dubner 
> wrote:
> > 
> > Richard,
> > 
> > Thank you very much for your comments.
> > 
> > When I set out to create the capability, I had a "specification" in
> > mind.
> > 
> > I didn't have a clue how to create a GENERIC tree that could be fed
> > to the
> > middle end in a way that would successfully result in an
> > executable.  And I
> > needed to be able to do that in order to proceed with the project
> > of
> > creating a COBOL front end.
> > 
> > So, I came up with the idea of using GCC to compile simple
> > programs, and to
> > hook into the compiler to examine the trees fed to the middle end,
> > and to
> > display those trees in the human-readable format I needed to
> > understand
> > them.  And that's what I did.
> > 
> > My first incarnation generated pure text files, and I used that to
> > get
> > going.
> > 
> > After a while I realized that when I used the output file, I was
> > spending a
> > lot of time searching through the text files.  And I had the
> > brainstorm!
> > Hyperlinks!  HTML files!  We have the technology!  So, I created
> > the .HTML
> > files as well.
> > 
> > I found this useful to the point of necessity in order to learn how
> > to
> > generate the GENERIC trees.  I believe it would be equally useful
> > to the
> > next developer who, for whatever reason, needs to understand, on a
> > "You need
> > to learn the alphabet before you can learn how to read" level, what
> > the
> > middle end requires from a GENERIC tree generated by a front end.
> > 
> > But I've never used it on a complex program. I've used it only to
> > learn how
> > to create the GENERIC nodes for very particular things, and so I
> > would use
> > the -fdump-generic-nodes feature on a very simple C program that
> > demonstrated, in isolation, the feature I needed.  Once I figured
> > it out, I
> > would create front end C routines or macros that used the
> > tree.h/tree.cc
> > features to build those GENERIC trees, and then I would move on.
> > 
> > > I decided to offer it up here, in order to learn how to create
> > > patches and to get to know the people and the process, as well as
> > > from the desire to share it.
> > And instantly I got the "How about a machine-readable format?"
> > comments.
> > Which are reasonable.  So, because it wasn't hard, I hacked at the
> > existing
> > code to create a JSON output.  (But I remind you that up until now,
> > nobody
> > seems to have needed a JSON representation.)
> > 
> > And your observation that the human readable representation could
> > be made
> > from the JSON representation is totally accurate.
> > 
> > But that wasn't my specification.  My specification was "A tool so
> > that a
> > human being can examine a simple GENERIC tree to learn how it's
> > done."
> > 
> > But it seems to me that we are now moving into the realm of a new
> > specification.
> > 
> > Said another way:  To go from "A human readable representation of a
> > simple
> > GENERIC tree" to "A machine readable JSON representation of an
> > arbitrarily
> > complex GENERIC tree, from which a human readable representation
> > can be
> > created" means, in effect, starting over on a different project
> > that I don't
> > need.  I already *have* a project that I am working on -- the COBOL
> > front
> > end.
> > 
> > The complexity of GENERIC trees is, in my experienced opinion, an
> > obstacle
> > for the creation of front ends.  The GCC Internals document has a
> > lot of
> > information, but to go from it to a front end is like using the
> > maintenance
> > manual for an F16 fighter to try to learn to fly the aircraft.
> > 
> > The program "main(){}" generates a tree with over seventy nodes.  I
> > see no
> > way to document why that's true; it's all arbitrary in the sense
> > that "this
> > is how GCC works".  -fdump-generic-nodes made it possible for me to
> > figure
> > out how those nodes are connected and, thus, how to create a new
> > front end.
> > I figure that other developers might find it useful, as well.
> > 
> > I guess I am saying that I am not, at this time, able to work on a
> > whole
> > different tool.  I think what I have done so far does something
> > useful that
> > doesn't seem to otherwise exist in GCC.
> > 
> > I suppose the question for you is, "Is it useful enough?"
> > 
> > I won't be offended if the answer is "No" and I hope you won't be
> > offended
> > by my not having the bandwidth to address your very thoughtful and
> > valid
> > observations about how it could be better.
> 
> No offense taken - I did realize how useful this was to you (and
> specifically
> the hyper-linking looked very useful even to me!).  I often lament
> the lack
> of domain-specific visualization tools for the various data
> structures GCC
> has - having something for GENERIC would be very welcome.
> 
> We have for example ways to dump graphviz .dot format graphs of the
> CFG
> and 

Re: [r14-9173 Regression] FAIL: gcc.dg/tree-ssa/andnot-2.c scan-tree-dump-not forwprop3 "_expr" on Linux/x86_64

2024-02-28 Thread Jeff Law




On 2/28/24 03:05, Richard Biener wrote:



Untested fix for targets that cannot handle the original IL below.
I'm not convinced that's the way to go here, is it?  Or scrap
the testcase?  Or have a cheap way to say "this target doesn't
support _any_ vec_cond"?

Another way, but for stage1 I think, would be to delay all the
vector checking to the "final" maybe_push_res_to_seq, but that
requires aggressively cleaning out intermediate folding results
no longer needed and also still requires patterns to do the
"is the incoming IL supported on the target" as later we do not
know what parts we matched (this could in theory be automated
as well, of course).  There's also no way to say "don't apply
this unless the final simplification result is a constant".
While we might be able to add a new '*' modifier saying
"if the result is supported by the target" there's again no
way of conditionalizing this on the original operation being
supported.  OK, maybe on the match allow '*1' and when '*1'
is used in the result we fail if any of the '*1' annotated
parts in the match were supported but at least one of the '*1'
in the result is not.  Supported only for select operations.
But it still doesn't help as we'd have to somehow check the
re-simplified result and we can't really track where a
vec_cond we put in goes.
So I don't mind completely deferring to the next stage1; I don't 
consider this a significant quality regression and if we think we can do 
something cleaner and avoid the pattern explosion I certainly don't mind 
waiting.


Seems like we ought to open a bug so it doesn't get lost, particularly 
if you'd prefer to defer.


jeff


[PATCH] ctf: Fix multi-dimentional array types ordering in CTF

2024-02-28 Thread Cupertino Miranda
Hi everyone,

In order to facilitate reviewing, I include a copy of the function in
this email, since the code structure changes are too hard to analyse
in the patch itself.

Looking forward to your comments.

Regards,
Cupertino

=== Function changes ===

/* Generate CTF for an ARRAY_TYPE.
   C argument is used as the iterator for the recursive calls to
   gen_ctf_array_type. C is the current die within the recursion.
   When C is NULL, it means it is the first call to gen_ctf_array_type.
   C should always be NULL when called from other functions.  */

static ctf_id_t
gen_ctf_array_type (ctf_container_ref ctfc,
                    dw_die_ref array_type,
                    dw_die_ref c = NULL)
{
  int vector_type_p = get_AT_flag (array_type, DW_AT_GNU_vector);
  if (vector_type_p)
    return CTF_NULL_TYPEID;

  if (c == dw_get_die_child (array_type))
    {
      dw_die_ref array_elems_type = ctf_get_AT_type (array_type);
      return gen_ctf_type (ctfc, array_elems_type);
    }
  else
    {
      ctf_arinfo_t arinfo;
      ctf_id_t array_node_type_id = CTF_NULL_TYPEID;
      if (c == NULL)
        c = dw_get_die_child (array_type);

      c = dw_get_die_sib (c);

      ctf_id_t child_id = gen_ctf_array_type (ctfc, array_type, c);

      dw_die_ref array_index_type;
      uint32_t array_num_elements;

      if (dw_get_die_tag (c) == DW_TAG_subrange_type)
        {
          dw_attr_node *upper_bound_at;

          array_index_type = ctf_get_AT_type (c);

          /* When DW_AT_upper_bound is used to specify the size of an
             array in DWARF, it is usually an unsigned constant
             specifying the upper bound index of the array.  However,
             for unsized arrays, such as foo[] or bar[0],
             DW_AT_upper_bound is a signed integer constant
             instead.  */

          upper_bound_at = get_AT (c, DW_AT_upper_bound);
          if (upper_bound_at
              && AT_class (upper_bound_at) == dw_val_class_unsigned_const)
            /* This is the upper bound index.  */
            array_num_elements = get_AT_unsigned (c, DW_AT_upper_bound) + 1;
          else if (get_AT (c, DW_AT_count))
            array_num_elements = get_AT_unsigned (c, DW_AT_count);
          else
            {
              /* This is a VLA of some kind.  */
              array_num_elements = 0;
            }
        }
      else if (dw_get_die_tag (c) == DW_TAG_enumeration_type)
        {
          array_index_type = 0;
          array_num_elements = 0;
          /* XXX writeme. */
          gcc_assert (1);
        }
      else
        gcc_unreachable ();

      /* Ok, mount and register the array type.  Note how the array
         type we register here is the type of the elements in
         subsequent "dimensions", if there are any.  */

      arinfo.ctr_nelems = array_num_elements;
      if (array_index_type)
        arinfo.ctr_index = gen_ctf_type (ctfc, array_index_type);
      else
        arinfo.ctr_index = gen_ctf_type (ctfc, ctf_array_index_die);

      arinfo.ctr_contents = child_id;
      if (!ctf_type_exists (ctfc, c, &array_node_type_id))
        array_node_type_id = ctf_add_array (ctfc, CTF_ADD_ROOT, &arinfo,
                                            c);
      return array_node_type_id;
    }
}

=== The patch ===

Multi-dimensional array types would be linked in reverse order to the
expected C standard ordering. In other words, CTF would define the type
of "char [a][b][c]" as if it were an array with "char [c][b][a]"
dimensions.
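
The expected ordering can be checked directly in the language. The sketch below uses example dimensions 2, 3 and 4 (chosen only for illustration) and shows that the leftmost dimension is the outermost array, whose elements carry the remaining dimensions in source order. The names Arr, dimension and self_check are made up for this example.

```cpp
#include <cassert>
#include <type_traits>

// Example dimensions chosen for illustration: a = 2, b = 3, c = 4.
using Arr = char[2][3][4];

// The leftmost dimension is the extent of the outermost array...
static_assert (std::extent<Arr, 0>::value == 2, "outermost dimension is 2");
// ...and each element type carries the remaining dimensions in order.
static_assert (std::is_same<std::remove_extent<Arr>::type,
                            char[3][4]>::value,
               "elements of char[2][3][4] are char[3][4]");
static_assert (std::is_same<std::remove_extent<char[3][4]>::type,
                            char[4]>::value,
               "elements of char[3][4] are char[4]");

// Extent at nesting depth N, counted from the outermost dimension.
template <unsigned N>
constexpr unsigned
dimension ()
{
  return std::extent<Arr, N>::value;
}

// Small usage check: walking outermost to innermost yields 2, 3, 4.
inline void
self_check ()
{
  assert (dimension<0> () == 2);
  assert (dimension<1> () == 3);
  assert (dimension<2> () == 4);
}
```

A type description that links the dimensions in the opposite direction would describe char[4][3][2] instead, which has the same total size but a different element type at every level.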

gcc/ChangeLog:

* dwarf2ctf.cc (gen_ctf_array_type): Correct order in which CTF
multi-dimensional array types are linked.

gcc/testsuite/ChangeLog:
* gcc.dg/debug/ctf/ctf-array-6.c: Add test.
---
 gcc/dwarf2ctf.cc | 67 
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-6.c | 14 
 2 files changed, 41 insertions(+), 40 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-6.c

diff --git a/gcc/dwarf2ctf.cc b/gcc/dwarf2ctf.cc
index 93e5619933fa..95ceca196217 100644
--- a/gcc/dwarf2ctf.cc
+++ b/gcc/dwarf2ctf.cc
@@ -349,42 +349,40 @@ gen_ctf_pointer_type (ctf_container_ref ctfc, dw_die_ref 
ptr_type)
   return ptr_type_id;
 }
 
-/* Generate CTF for an array type.  */
+/* Generate CTF for an ARRAY_TYPE.
+   C argument is used as the iterator for the recursive calls to
+   gen_ctf_array_type. C is the current die within the recursion.
+   When C is NULL, it means it is the first call to gen_ctf_array_type.
+   C should always be NULL when called from other functions.  */
 
 static ctf_id_t
-gen_ctf_array_type (ctf_container_ref ctfc, dw_die_ref array_type)
+gen_ctf_array_type (ctf_container_ref ctfc,
+   dw_die_ref array_type,
+   dw_die_ref c = NULL)
 {
-  dw_die_ref c;
-  ctf_id_t array_elems_type_id = CTF_NULL_TYPEID;
-
   int vector_type_p = get_AT_flag (array_type, DW_AT_GNU_vector);
   if (vector_type_p)
-return array_elems_type_id;
-
-  dw_die_ref 

Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64

2024-02-28 Thread Palmer Dabbelt

On Wed, 28 Feb 2024 06:57:53 PST (-0800), jeffreya...@gmail.com wrote:



On 2/28/24 05:23, Kito Cheng wrote:

atomic_compare_and_swapsi uses lr.w and sc.w to perform the atomic operation
on RV64. However, lr.w sign-extends the loaded value to DI, and the compare
instruction only has DI mode on RV64, so the expected value should be
sign-extended before the compare as well, so that we get the right compare
result.

gcc/ChangeLog:

PR target/114130
* config/riscv/sync.md (atomic_compare_and_swap): Sign
extend the expected value if needed.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr114130.c: New.

Nearly rejected this as I think the description was a bit ambiguous and
I thought you were extending the result of the lr.w.  But it's actually
the other value you're ensuring gets properly extended.


I had the same response, but after reading it I'm not quite sure how to 
say it better.



OK.


I was looking at the code to try and ask if we have the same bug for the 
short inline CAS routines, but I've got to run to some meetings...




Jeff


Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64

2024-02-28 Thread Jeff Law




On 2/28/24 05:23, Kito Cheng wrote:

atomic_compare_and_swapsi uses lr.w and sc.w to perform the atomic operation
on RV64. However, lr.w sign-extends the loaded value to DI, and the compare
instruction only has DI mode on RV64, so the expected value should be
sign-extended before the compare as well, so that we get the right compare
result.

gcc/ChangeLog:

PR target/114130
* config/riscv/sync.md (atomic_compare_and_swap): Sign
extend the expected value if needed.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr114130.c: New.

Nearly rejected this as I think the description was a bit ambiguous and
I thought you were extending the result of the lr.w.  But it's actually
the other value you're ensuring gets properly extended.
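
A small model may help disambiguate which value is being extended. The sketch below is not the patch's code: lr_w, matches_zero_extended, matches_sign_extended and self_check are hypothetical helpers that mimic 64-bit registers with ordinary integers. It assumes, as one way the mismatch could arise, that the expected value sits zero-extended in its register.

```cpp
#include <cassert>
#include <cstdint>

// Model of lr.w on RV64: load 32 bits and sign-extend them to 64 bits.
// The uint32_t -> int32_t conversion wraps modulo 2^32 (guaranteed since
// C++20, and GCC's documented behaviour before that).
inline std::int64_t
lr_w (std::uint32_t mem)
{
  return static_cast<std::int64_t> (static_cast<std::int32_t> (mem));
}

// 64-bit compare against an expected value left zero-extended.
inline bool
matches_zero_extended (std::uint32_t mem, std::uint32_t expected)
{
  std::int64_t reg
    = static_cast<std::int64_t> (static_cast<std::uint64_t> (expected));
  return lr_w (mem) == reg;
}

// 64-bit compare against an expected value that was sign-extended first.
inline bool
matches_sign_extended (std::uint32_t mem, std::uint32_t expected)
{
  std::int64_t reg
    = static_cast<std::int64_t> (static_cast<std::int32_t> (expected));
  return lr_w (mem) == reg;
}

// With bit 31 set, identical 32-bit values only compare equal once the
// expected value is sign-extended too.
inline void
self_check ()
{
  assert (!matches_zero_extended (0x80000000u, 0x80000000u));
  assert (matches_sign_extended (0x80000000u, 0x80000000u));
}
```

Values with bit 31 clear compare equal either way, which would explain why such a problem only shows up for some inputs.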


OK.

Jeff



Re: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread Kito Cheng
Hmm, maybe keep only --param=riscv-autovec-preference=none and remove the
other two, if we think that might still be useful? But anyway, I have no
strong opinion on keeping it; I mean, I am OK with removing
--param=riscv-autovec-preference entirely.

钟居哲 wrote on Wed, Feb 28, 2024 at 21:59:

> I think it makes more sense to remove --param=riscv-autovec-preference and
> add -mrvv-vector-bits
>
> --
> juzhe.zh...@rivai.ai
>
>
> *From:* Kito Cheng 
> *Date:* 2024-02-28 20:56
> *To:* pan2.li 
> *CC:* gcc-patches ; juzhe.zhong
> ; yanzhang.wang ; rdapp.gcc
> ; jeffreyalaw 
> *Subject:* Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits
> for RVV
> Taking one more look, I think this option should work and integrate with
> --param=riscv-autovec-preference=, since they do similar but slightly
> different jobs.
>
> We have 3 values for --param=riscv-autovec-preference=: none, scalable
> and fixed-vlmax.
>
> -mrvv-vector-bits=scalable works like
> --param=riscv-autovec-preference=scalable and
> -mrvv-vector-bits=zvl works like
> --param=riscv-autovec-preference=fixed-vlmax.
>
> So I think...we need to do some conflict check, like:
>
> -mrvv-vector-bits=zvl can't work with
> --param=riscv-autovec-preference=scalable
> -mrvv-vector-bits=scalable can't work with
> --param=riscv-autovec-preference=fixed-vlmax
>
> but it may not just alias since there is some useful combinations like:
>
> -mrvv-vector-bits=zvl with --param=riscv-autovec-preference=none:
> NO auto vectorization but intrinsic code still could benefit from the
> -mrvv-vector-bits=zvl option.
>
> -mrvv-vector-bits=scalable with --param=riscv-autovec-preference=none
> Should still work for VLS code gen, but just disable auto
> vectorization per the option semantic.
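
The conflict rules and allowed combinations listed in the quoted message above can be written down as a small predicate. This is only a sketch of the proposal as stated there: VectorBits, AutovecPref, options_conflict and self_check are hypothetical names, not identifiers from the RISC-V back end.

```cpp
#include <cassert>

// Hypothetical models of the two options' values.
enum class VectorBits { Scalable, Zvl };                // -mrvv-vector-bits=
enum class AutovecPref { None, Scalable, FixedVlmax };  // --param=riscv-autovec-preference=

// True when the two settings contradict each other under the proposed rules.
inline bool
options_conflict (VectorBits bits, AutovecPref pref)
{
  return (bits == VectorBits::Zvl && pref == AutovecPref::Scalable)
         || (bits == VectorBits::Scalable && pref == AutovecPref::FixedVlmax);
}

// Small usage check: "none" combines with either vector-bits setting.
inline void
self_check ()
{
  assert (options_conflict (VectorBits::Zvl, AutovecPref::Scalable));
  assert (options_conflict (VectorBits::Scalable, AutovecPref::FixedVlmax));
  assert (!options_conflict (VectorBits::Zvl, AutovecPref::None));
  assert (!options_conflict (VectorBits::Scalable, AutovecPref::None));
}
```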
>
> However, there is something here that needs fixing, since
> --param=riscv-autovec-preference=none still disables VLS code gen for
> now; you can see an example here:
> https://godbolt.org/z/fMTr3eW7K
>
> But I think it's really the right behavior here, this part might need
> to be fixed in vls_mode_valid_p and some other places.
>
>
> Anyway I think we need to check all use sites of RVV_FIXED_VLMAX and
> RVV_SCALABLE, and need to make sure all use sites of RVV_FIXED_VLMAX
> are also checked against RVV_VECTOR_BITS_ZVL.
>
>
>
> > -/* Return the VLEN value associated with -march.
> > +static int
> > +riscv_convert_vector_bits (int min_vlen)
>
> Not sure if we really need this function, it seems it always returns
> min_vlen?
>
> > +{
> > +  int rvv_bits = 0;
> > +
> > +  switch (rvv_vector_bits)
> > +{
> > +  case RVV_VECTOR_BITS_ZVL:
> > +  case RVV_VECTOR_BITS_SCALABLE:
> > +   rvv_bits = min_vlen;
> > +   break;
> > +  default:
> > +   gcc_unreachable ();
> > +}
> > +
> > +  return rvv_bits;
> > +}
> > +
> > +/* Return the VLEN value associated with -march and -mwrvv-vector-bits.
>
>
>


Re: [committed] libstdc++: Fix noexcept on dtors in [PR114152]

2024-02-28 Thread Jonathan Wakely
Oops, sorry, I CC'd the wrong Victor on this patch (you've both
reported libstdc++ bugs today and I grabbed the email address from the
wrong browser tab).

On Wed, 28 Feb 2024 at 14:53, Jonathan Wakely wrote:
>
> Tested x86_64-linux, pushed to trunk. Backport to gcc-13 to follow.
>
> -- >8 --
>
> The PR points out that the destructors all have incorrect
> noexcept-specifiers.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/114152
> * include/experimental/scope (scope_exit scope_fail): Make
> destructor unconditionally noexcept.
> (scope_success): Fix noexcept-specifier.
> * testsuite/experimental/scopeguard/114152.cc: New test.
> ---
>  libstdc++-v3/include/experimental/scope   |  6 ++---
>  .../experimental/scopeguard/114152.cc | 24 +++
>  2 files changed, 27 insertions(+), 3 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/experimental/scopeguard/114152.cc
>
> diff --git a/libstdc++-v3/include/experimental/scope 
> b/libstdc++-v3/include/experimental/scope
> index 5dbeac14795..ea273e8c095 100644
> --- a/libstdc++-v3/include/experimental/scope
> +++ b/libstdc++-v3/include/experimental/scope
> @@ -97,7 +97,7 @@ namespace experimental::inline fundamentals_v3
>scope_exit& operator=(const scope_exit&) = delete;
>scope_exit& operator=(scope_exit&&) = delete;
>
> -  ~scope_exit() noexcept(noexcept(this->_M_exit_function))
> +  ~scope_exit() noexcept
>{
> if (_M_execute_on_destruction)
>   _M_exit_function();
> @@ -157,7 +157,7 @@ namespace experimental::inline fundamentals_v3
>scope_fail& operator=(const scope_fail&) = delete;
>scope_fail& operator=(scope_fail&&) = delete;
>
> -  ~scope_fail() noexcept(noexcept(this->_M_exit_function))
> +  ~scope_fail() noexcept
>{
> if (std::uncaught_exceptions() > _M_uncaught_init)
>   _M_exit_function();
> @@ -211,7 +211,7 @@ namespace experimental::inline fundamentals_v3
>scope_success& operator=(const scope_success&) = delete;
>scope_success& operator=(scope_success&&) = delete;
>
> -  ~scope_success() noexcept(noexcept(this->_M_exit_function))
> +  ~scope_success() noexcept(noexcept(this->_M_exit_function()))
>{
> if (std::uncaught_exceptions() <= _M_uncaught_init)
>   _M_exit_function();
> diff --git a/libstdc++-v3/testsuite/experimental/scopeguard/114152.cc 
> b/libstdc++-v3/testsuite/experimental/scopeguard/114152.cc
> new file mode 100644
> index 000..63c1f710e9f
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/experimental/scopeguard/114152.cc
> @@ -0,0 +1,24 @@
> +// { dg-do compile { target c++20 } }
> +
> +// PR libstdc++/114152
> +// Wrong exception specifiers for LFTSv3 scope guard destructors
> +
> +#include <experimental/scope>
> +
> +using namespace std::experimental;
> +
> +struct F {
> +  void operator()() noexcept(false);
> +};
> +
> +static_assert( noexcept(std::declval<scope_exit<F>&>().~scope_exit()) );
> +static_assert( noexcept(std::declval<scope_fail<F>&>().~scope_fail()) );
> +static_assert( ! noexcept(std::declval<scope_success<F>&>().~scope_success()) );
> +
> +struct G {
> +  void operator()() noexcept(true);
> +};
> +
> +static_assert( noexcept(std::declval<scope_exit<G>&>().~scope_exit()) );
> +static_assert( noexcept(std::declval<scope_fail<G>&>().~scope_fail()) );
> +static_assert( noexcept(std::declval<scope_success<G>&>().~scope_success()) );
> --
> 2.43.0
>



[committed] libstdc++: Fix noexcept on dtors in [PR114152]

2024-02-28 Thread Jonathan Wakely
Tested x86_64-linux, pushed to trunk. Backport to gcc-13 to follow.

-- >8 --

The PR points out that the destructors all have incorrect
noexcept-specifiers.

libstdc++-v3/ChangeLog:

PR libstdc++/114152
* include/experimental/scope (scope_exit scope_fail): Make
destructor unconditionally noexcept.
(scope_success): Fix noexcept-specifier.
* testsuite/experimental/scopeguard/114152.cc: New test.
---
 libstdc++-v3/include/experimental/scope   |  6 ++---
 .../experimental/scopeguard/114152.cc | 24 +++
 2 files changed, 27 insertions(+), 3 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/experimental/scopeguard/114152.cc

diff --git a/libstdc++-v3/include/experimental/scope 
b/libstdc++-v3/include/experimental/scope
index 5dbeac14795..ea273e8c095 100644
--- a/libstdc++-v3/include/experimental/scope
+++ b/libstdc++-v3/include/experimental/scope
@@ -97,7 +97,7 @@ namespace experimental::inline fundamentals_v3
   scope_exit& operator=(const scope_exit&) = delete;
   scope_exit& operator=(scope_exit&&) = delete;
 
-  ~scope_exit() noexcept(noexcept(this->_M_exit_function))
+  ~scope_exit() noexcept
   {
if (_M_execute_on_destruction)
  _M_exit_function();
@@ -157,7 +157,7 @@ namespace experimental::inline fundamentals_v3
   scope_fail& operator=(const scope_fail&) = delete;
   scope_fail& operator=(scope_fail&&) = delete;
 
-  ~scope_fail() noexcept(noexcept(this->_M_exit_function))
+  ~scope_fail() noexcept
   {
if (std::uncaught_exceptions() > _M_uncaught_init)
  _M_exit_function();
@@ -211,7 +211,7 @@ namespace experimental::inline fundamentals_v3
   scope_success& operator=(const scope_success&) = delete;
   scope_success& operator=(scope_success&&) = delete;
 
-  ~scope_success() noexcept(noexcept(this->_M_exit_function))
+  ~scope_success() noexcept(noexcept(this->_M_exit_function()))
   {
if (std::uncaught_exceptions() <= _M_uncaught_init)
  _M_exit_function();
diff --git a/libstdc++-v3/testsuite/experimental/scopeguard/114152.cc 
b/libstdc++-v3/testsuite/experimental/scopeguard/114152.cc
new file mode 100644
index 000..63c1f710e9f
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/scopeguard/114152.cc
@@ -0,0 +1,24 @@
+// { dg-do compile { target c++20 } }
+
+// PR libstdc++/114152
+// Wrong exception specifiers for LFTSv3 scope guard destructors
+
+#include <experimental/scope>
+
+using namespace std::experimental;
+
+struct F {
+  void operator()() noexcept(false);
+};
+
+static_assert( noexcept(std::declval<scope_exit<F>&>().~scope_exit()) );
+static_assert( noexcept(std::declval<scope_fail<F>&>().~scope_fail()) );
+static_assert( ! noexcept(std::declval<scope_success<F>&>().~scope_success()) );
+
+struct G {
+  void operator()() noexcept(true);
+};
+
+static_assert( noexcept(std::declval<scope_exit<G>&>().~scope_exit()) );
+static_assert( noexcept(std::declval<scope_fail<G>&>().~scope_fail()) );
+static_assert( noexcept(std::declval<scope_success<G>&>().~scope_success()) );
-- 
2.43.0
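
A note on why the removed specifiers were wrong, as a minimal sketch and
not the libstdc++ code: in the old lines the operand of the inner noexcept
only names the member, and naming an object can never throw, so the
specifier evaluated to noexcept(true) whatever the callable did.  Only a
call expression as operand makes the result depend on the callable.  The
names guard_named, guard_called, Throwing and Quiet below are hypothetical
and exist only for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for a scope guard; these are not the libstdc++ classes.
template<typename EF>
struct guard_named
{
  EF fn;
  // The operand merely names the member.  Naming an object cannot throw,
  // so this specifier is noexcept(true) for every EF.
  ~guard_named() noexcept(noexcept(this->fn)) { }
};

template<typename EF>
struct guard_called
{
  EF fn;
  // The operand is a call expression, so the result depends on whether
  // EF::operator() is declared noexcept.
  ~guard_called() noexcept(noexcept(this->fn())) { }
};

struct Throwing { void operator()() noexcept(false) { } };
struct Quiet    { void operator()() noexcept(true) { } };

constexpr bool named_throwing_is_noexcept
  = std::is_nothrow_destructible_v<guard_named<Throwing>>;
constexpr bool called_throwing_is_noexcept
  = std::is_nothrow_destructible_v<guard_called<Throwing>>;
constexpr bool called_quiet_is_noexcept
  = std::is_nothrow_destructible_v<guard_called<Quiet>>;

// The "named" form claims noexcept even for a potentially-throwing callable.
static_assert(named_throwing_is_noexcept);
// The "called" form tracks the callable's own exception specification.
static_assert(!called_throwing_is_noexcept);
static_assert(called_quiet_is_noexcept);
```

This matches the shape of the patch: scope_success keeps a conditional
specifier but with the call in the operand, while scope_exit and scope_fail
drop the condition entirely.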



[committed] libstdc++: Change some URLs in the manual to use https

2024-02-28 Thread Jonathan Wakely
Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* doc/xml/manual/appendix_contributing.xml: Change URLs to use
https.
* doc/html/manual/*: Regenerate.
---
 .../doc/html/manual/appendix_contributing.html   |  8 
 libstdc++-v3/doc/html/manual/source_code_style.html  |  4 ++--
 .../doc/xml/manual/appendix_contributing.xml | 12 ++--
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml 
b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
index 0dcafcb98af..ac607fcfad4 100644
--- a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
+++ b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
@@ -20,7 +20,7 @@
 
   The GNU C++ Library is part of GCC and follows the same development model,
   so the general rules for
-  http://www.w3.org/1999/xlink; 
xlink:href="http://gcc.gnu.org/contribute.html;>contributing
+  http://www.w3.org/1999/xlink; 
xlink:href="https://gcc.gnu.org/contribute.html;>contributing
   to GCC apply. Active
   contributors are assigned maintainership responsibility, and given
   write access to the source repository. First-time contributors
@@ -64,7 +64,7 @@
   

  Peruse
- the http://www.w3.org/1999/xlink; 
xlink:href="http://www.gnu.org/prep/standards/;>GNU
+ the http://www.w3.org/1999/xlink; 
xlink:href="https://www.gnu.org/prep/standards/;>GNU
  Coding Standards, and chuckle when you hit the part
  about Using Languages Other Than C.

@@ -91,7 +91,7 @@
   Assignment
 
 
-  See the http://www.w3.org/1999/xlink; 
xlink:href="http://gcc.gnu.org/contribute.html#legal;>legal 
prerequisites for all GCC contributions.
+  See the http://www.w3.org/1999/xlink; 
xlink:href="https://gcc.gnu.org/contribute.html#legal;>legal 
prerequisites for all GCC contributions.
 
 
 
@@ -155,7 +155,7 @@
  some recent commits for format and content. The
  contrib/mklog.py script can be used to
  generate a ChangeLog template for commit messages. See
- http://www.w3.org/1999/xlink; 
xlink:href="http://gcc.gnu.org/gitwrite.html;>Read-write Git access
+ http://www.w3.org/1999/xlink; 
xlink:href="https://gcc.gnu.org/gitwrite.html;>Read-write Git access
  for scripts and aliases that are useful here.

   
@@ -618,13 +618,13 @@ indicate a place that may require attention for 
multi-thread safety.
   it is intended to precede the recommendations of the GNU Coding
   Standard, which can be referenced in full here:
 
-  http://www.w3.org/1999/xlink; 
xlink:href="http://www.gnu.org/prep/standards/standards.html#Formatting;>http://www.gnu.org/prep/standards/standards.html#Formatting
+  http://www.w3.org/1999/xlink; 
xlink:href="https://www.gnu.org/prep/standards/standards.html#Formatting;>https://www.gnu.org/prep/standards/standards.html#Formatting
 
   The rest of this is also interesting reading, but skip the "Design
   Advice" part.
 
   The GCC coding conventions are here, and are also useful:
-  http://www.w3.org/1999/xlink; 
xlink:href="http://gcc.gnu.org/codingconventions.html;>http://gcc.gnu.org/codingconventions.html
+  http://www.w3.org/1999/xlink; 
xlink:href="https://gcc.gnu.org/codingconventions.html;>https://gcc.gnu.org/codingconventions.html
 
   In addition, because it doesn't seem to be stated explicitly anywhere
   else, there is an 80 column source limit.
-- 
2.43.0



[committed] libstdc++: Update outdated docs on contributing

2024-02-28 Thread Jonathan Wakely
Pushed to trunk, but I've just noticed it should be https not http.

-- >8 --

We don't want a separate ChangeLog submission now.

libstdc++-v3/ChangeLog:

* doc/xml/manual/appendix_contributing.xml: Replace outdated
info on ChangeLog entries.
* doc/html/manual/appendix_contributing.html: Regenerate.
---
 .../doc/html/manual/appendix_contributing.html   | 16 +---
 .../doc/xml/manual/appendix_contributing.xml | 16 +---
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml 
b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
index a9196493adc..0dcafcb98af 100644
--- a/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
+++ b/libstdc++-v3/doc/xml/manual/appendix_contributing.xml
@@ -151,12 +151,12 @@
 
   

- A ChangeLog entry as plain text; see the various
- ChangeLog files for format and content. If you are
- using emacs as your editor, simply position the insertion
- point at the beginning of your change and hit CX-4a to bring
- up the appropriate ChangeLog entry. See--magic! Similar
- functionality also exists for vi.
+ A ChangeLog entry as part of the Git commit message. Check
+ some recent commits for format and content. The
+ contrib/mklog.py script can be used to
+ generate a ChangeLog template for commit messages. See
+ http://www.w3.org/1999/xlink; 
xlink:href="http://gcc.gnu.org/gitwrite.html;>Read-write Git access
+ for scripts and aliases that are useful here.

   
 
@@ -171,7 +171,7 @@
   

  The patch itself. If you are using the Git repository use
- git diff or git format-patch
+ git show or git format-patch
  to produce a patch;
  otherwise, use diff -cp OLD NEW. If your
  version of diff does not support these options, then get the
@@ -186,6 +186,8 @@
  patches and related discussion should be sent to the
  libstdc++ mailing list. In common with the rest of GCC,
  patches should also be sent to the gcc-patches mailing list.
+ So you could send your email To:libstd...@gcc.gnu.org and
+ Cc:gcc-patches@gcc.gnu.org for example.

   
 
-- 
2.43.0



Re: [PATCH V2] rs6000: Don't allow immediate value in the vsx_splat pattern [PR113950]

2024-02-28 Thread Segher Boessenkool
On Tue, Feb 27, 2024 at 04:50:02PM -0600, Peter Bergner wrote:
> On 2/27/24 6:40 AM, Segher Boessenkool wrote:
> > On Tue, Feb 27, 2024 at 02:02:38AM +0530, jeevitha wrote:
> > input_operand allows a lot of things that splat_input_operand does not,
> > not just immediate operands.  NAK.
> > 
> > (For example, *all* memory is okay for input_operand, always).
> > 
> > I'm not saying we do not want to restrict these things, but a commit
> > that doesn't discuss this at all is not okay.  Sorry.
> 
> So it seems you're not NAKing the use of splat_input_operand, but
> just that it needs more explanation in the git log entry, correct?

I NAK the patch.  _Of course_ there needs to be *something* done, there
is a bug after all, it needs to be fixed.

But no, there are big questions about if splat_input_operand is correct
as well.  This needs to be justified in the patch submission.

> Yes, input_operand accepts a lot more things than splat_input_operand
> does, but the multiple define_insns this define_expand feeds, uses
> gpc_reg_operand, memory_operand and splat_input_operand for their
> operands[1] operand (splat_input_operand accepts reg and mem too),
> so it seems to match better what the patterns will be accepting and
> I always thought that using predicates that more accurately reflect
> what the define_insns expect/accept lead to better code gen.

Still, it needs explanation why we allowed it before, but that was a
mistake, or for some reason we do not need it.

Sell your patch!  :-)

> Mike, was it just an oversight to not use splat_input_operand for the
> vsx_splat_ expander or was input_operand a conscious decision?
> 
> If input_operand was used purposely, then we can just fall back to
> the s/op1/operands[1]/ change which we already know fixes the bug.

input_operand allows all inputs for mov insns.  It isn't suitable
for any other instructions.


Segher


Re: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread 钟居哲
I think it makes more sense to remove --param=riscv-autovec-preference and add 
-mrvv-vector-bits



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2024-02-28 20:56
To: pan2.li
CC: gcc-patches; juzhe.zhong; yanzhang.wang; rdapp.gcc; jeffreyalaw
Subject: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV
Take one more look, I think this option should work and integrate with
--param=riscv-autovec-preference= since they have similar jobs but
slightly different.
 
We have 3 value for  --param=riscv-autovec-preference=: none, scalable
and fixed-vlmax
 
-mrvv-vector-bits=scalable is work like
--param=riscv-autovec-preference=scalable and
-mrvv-vector-bits=zvl is work like
--param=riscv-autovec-preference=fixed-vlmax.
 
So I think...we need to do some conflict check, like:
 
-mrvv-vector-bits=zvl can't work with --param=riscv-autovec-preference=scalable
-mrvv-vector-bits=scalable can't work with
--param=riscv-autovec-preference=fixed-vlmax
 
but it may not just alias since there is some useful combinations like:
 
-mrvv-vector-bits=zvl with --param=riscv-autovec-preference=none:
NO auto vectorization but intrinsic code still could benefit from the
-mrvv-vector-bits=zvl option.
 
-mrvv-vector-bits=scalable with --param=riscv-autovec-preference=none
Should still work for VLS code gen, but just disable auto
vectorization per the option semantic.
 
However here is something we need some fix, since
--param=riscv-autovec-preference=none still disable VLS code gen for
now, you can see some example here:
https://godbolt.org/z/fMTr3eW7K
 
But I think it's really the right behavior here, this part might need
to be fixed in vls_mode_valid_p and some other places.
 
 
Anyway I think we need to check all use sites with RVV_FIXED_VLMAX and
RVV_SCALABLE, and need to make sure all use site of RVV_FIXED_VLMAX
also checked with RVV_VECTOR_BITS_ZVL.
 
 
 
> -/* Return the VLEN value associated with -march.
> +static int
> +riscv_convert_vector_bits (int min_vlen)
 
Not sure if we really need this function, it seems it always returns min_vlen?
 
> +{
> +  int rvv_bits = 0;
> +
> +  switch (rvv_vector_bits)
> +{
> +  case RVV_VECTOR_BITS_ZVL:
> +  case RVV_VECTOR_BITS_SCALABLE:
> +   rvv_bits = min_vlen;
> +   break;
> +  default:
> +   gcc_unreachable ();
> +}
> +
> +  return rvv_bits;
> +}
> +
> +/* Return the VLEN value associated with -march and -mwrvv-vector-bits.
 


RE: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread Li, Pan2
Oops, this is more complicated than I originally expected.

Considering that mrvv-vector-bits (zvl or scalable) effectively decides how
we perform auto-vectorization, may I propose combining the two options?

For example, mrvv-vector-bits=zvl would indicate that we auto-vectorize in
the fixed-vlmax way, and mrvv-vector-bits=scalable would indicate that we
perform scalable auto-vectorization. That may make things cleaner and get
rid of the conflict-checking code in many places (maybe).

Please correct me if I have misunderstood anything. Meanwhile, I think this
change is a now-or-never decision: we can only make it before the GCC 14
release, or never, I guess (to avoid breaking changes).

> However here is something we need some fix, since
> --param=riscv-autovec-preference=none still disable VLS code gen for
> now, you can see some example here:
> https://godbolt.org/z/fMTr3eW7K

I am not quite sure what the right behavior is here, when mrvv-vector-bits
is given while riscv-autovec-preference is none...

Pan

-Original Message-
From: Kito Cheng  
Sent: Wednesday, February 28, 2024 8:57 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; rdapp@gmail.com; jeffreya...@gmail.com
Subject: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

Take one more look, I think this option should work and integrate with
--param=riscv-autovec-preference= since they have similar jobs but
slightly different.

We have 3 value for  --param=riscv-autovec-preference=: none, scalable
and fixed-vlmax

-mrvv-vector-bits=scalable is work like
--param=riscv-autovec-preference=scalable and
-mrvv-vector-bits=zvl is work like
--param=riscv-autovec-preference=fixed-vlmax.

So I think...we need to do some conflict check, like:

-mrvv-vector-bits=zvl can't work with --param=riscv-autovec-preference=scalable
-mrvv-vector-bits=scalable can't work with
--param=riscv-autovec-preference=fixed-vlmax

but it may not just alias since there is some useful combinations like:

-mrvv-vector-bits=zvl with --param=riscv-autovec-preference=none:
NO auto vectorization but intrinsic code still could benefit from the
-mrvv-vector-bits=zvl option.

-mrvv-vector-bits=scalable with --param=riscv-autovec-preference=none
Should still work for VLS code gen, but just disable auto
vectorization per the option semantic.

However here is something we need some fix, since
--param=riscv-autovec-preference=none still disable VLS code gen for
now, you can see some example here:
https://godbolt.org/z/fMTr3eW7K

But I think it's really the right behavior here, this part might need
to be fixed in vls_mode_valid_p and some other places.


Anyway I think we need to check all use sites with RVV_FIXED_VLMAX and
RVV_SCALABLE, and need to make sure all use site of RVV_FIXED_VLMAX
also checked with RVV_VECTOR_BITS_ZVL.



> -/* Return the VLEN value associated with -march.
> +static int
> +riscv_convert_vector_bits (int min_vlen)

Not sure if we really need this function, it seems it always returns min_vlen?

> +{
> +  int rvv_bits = 0;
> +
> +  switch (rvv_vector_bits)
> +{
> +  case RVV_VECTOR_BITS_ZVL:
> +  case RVV_VECTOR_BITS_SCALABLE:
> +   rvv_bits = min_vlen;
> +   break;
> +  default:
> +   gcc_unreachable ();
> +}
> +
> +  return rvv_bits;
> +}
> +
> +/* Return the VLEN value associated with -march and -mwrvv-vector-bits.


[PATCH 2/2] tree-optimization/113831 - revert original fix

2024-02-28 Thread Richard Biener
This reverts the original fix for PR113831 which is better fixed by
the PR114121 fix.  I've XFAILed instead of removing the PR108355
testcase again.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/113831
PR tree-optimization/108355
* tree-ssa-sccvn.cc (copy_reference_ops_from_ref): Revert
PR113831 fix.

* gcc.dg/tree-ssa/ssa-fre-104.c: XFAIL.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-104.c |   2 +-
 gcc/tree-ssa-sccvn.cc   | 134 
 2 files changed, 1 insertion(+), 135 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-104.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-104.c
index f0f12ef82b7..425c32dd93c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-104.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-104.c
@@ -21,4 +21,4 @@ int main() {
   *c = 
 }
 
-/* { dg-final { scan-tree-dump-not "foo" "fre1" } } */
+/* { dg-final { scan-tree-dump-not "foo" "fre1" { xfail *-*-* } } } */
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 21123644a5a..0b7fb0663c7 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -912,8 +912,6 @@ copy_reference_ops_from_ref (tree ref, 
vec *result)
 {
   /* For non-calls, store the information that makes up the address.  */
   tree orig = ref;
-  unsigned start = result->length ();
-  bool seen_variable_array_ref = false;
   while (ref)
 {
   vn_reference_op_s temp;
@@ -1000,12 +998,6 @@ copy_reference_ops_from_ref (tree ref, 
vec *result)
tree eltype = TREE_TYPE (TREE_TYPE (TREE_OPERAND (ref, 0)));
/* Record index as operand.  */
temp.op0 = TREE_OPERAND (ref, 1);
-   /* When the index is not constant we have to apply the same
-  logic as get_ref_base_and_extent which eventually uses
-  global ranges to refine the overall ref extent.  Record
-  we've seen such a case, fixup below.  */
-   if (TREE_CODE (temp.op0) == SSA_NAME)
- seen_variable_array_ref = true;
/* Always record lower bounds and element size.  */
temp.op1 = array_ref_low_bound (ref);
/* But record element size in units of the type alignment.  */
@@ -1098,132 +1090,6 @@ copy_reference_ops_from_ref (tree ref, 
vec *result)
   else
ref = NULL_TREE;
 }
-  poly_int64 offset, size, max_size;
-  tree base;
-  bool rev;
-  if (seen_variable_array_ref
-  && handled_component_p (orig)
-  && (base = get_ref_base_and_extent (orig,
- &offset, &size, &max_size, &rev))
-  && known_size_p (max_size)
-  && known_eq (size, max_size))
-{
-  poly_int64 orig_offset = offset;
-  poly_int64 tem;
-  if (TREE_CODE (base) == MEM_REF
- && mem_ref_offset (base).to_shwi (&tem))
-   offset += tem * BITS_PER_UNIT;
-  HOST_WIDE_INT coffset = offset.to_constant ();
-  /* When get_ref_base_and_extent computes an offset constrained to
-a constant position we have to fixup variable array indexes in
-the ref to avoid the situation where based on context we'd have
-to value-number the same vn_reference ops differently.  Make
-the vn_reference ops differ by adjusting those indexes to
-appropriate constants.  */
-  poly_int64 off = 0;
-  bool oob_index = false;
-  for (unsigned i = result->length (); i > start; --i)
-   {
- auto &op = (*result)[i-1];
- if (flag_checking
- && op.opcode == ARRAY_REF
- && TREE_CODE (op.op0) == INTEGER_CST)
-   {
- /* The verifier below chokes on inconsistencies of handling
-out-of-bound accesses so disable it in that case.  */
- tree atype = (*result)[i].type;
- if (TREE_CODE (atype) == ARRAY_TYPE)
-   if (tree dom = TYPE_DOMAIN (atype))
- if ((TYPE_MIN_VALUE (dom)
-  && TREE_CODE (TYPE_MIN_VALUE (dom)) == INTEGER_CST
-  && (wi::to_widest (op.op0)
-  < wi::to_widest (TYPE_MIN_VALUE (dom))))
- || (TYPE_MAX_VALUE (dom)
- && TREE_CODE (TYPE_MAX_VALUE (dom)) == INTEGER_CST
- && (wi::to_widest (op.op0)
- > wi::to_widest (TYPE_MAX_VALUE (dom)))))
-   oob_index = true;
-   }
- if ((op.opcode == ARRAY_REF
-  || op.opcode == ARRAY_RANGE_REF)
- && TREE_CODE (op.op0) == SSA_NAME)
-   {
- /* There's a single constant index that get's 'off' closer
-to 'offset'.  */
- unsigned HOST_WIDE_INT elsz
-   = tree_to_uhwi (op.op2) * vn_ref_op_align_unit ();
- unsigned HOST_WIDE_INT idx
-   = (coffset - off.to_constant ()) / BITS_PER_UNIT / elsz;
- if (idx == 0)
-   op.op0 = 

[PATCH 1/2] tree-optimization/114121 - wrong VN with context sensitive range info

2024-02-28 Thread Richard Biener
When VN ends up exploiting range-info specifying the ao_ref offset
and max_size we have to make sure to reflect this in the hashtable
entry for the recorded expression.  The PR113831 fix handled the
case where we can encode this in the operands themselves but this
bug shows the issue is more widespread.

So instead of altering the operands the following instead records
this extra info that's possibly used, only throwing it away when
the value-numbering didn't come up with a non-VARYING value which
is an important detail to preserve CSE as opposed to constant
folding which is where all cases currently known popped up.

With this the original PR113831 fix can be reverted, see 2/2.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
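
The idea described above, leaving the range-derived offset/max_size out of
the hash while still comparing them on lookup, can be modelled roughly as
follows.  This is a heavily simplified sketch under stated assumptions, not
the tree-ssa-sccvn.cc code: ref_key, hash_ref, records_no_range_info and
usable_for are hypothetical names, and encoding "don't know" as offset 0
with max_size -1 is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical, heavily simplified stand-in for a recorded reference.
struct ref_key
{
  int ops;        // stands in for the vector of reference operands
  long offset;    // range-derived offset assumed when the value was computed
  long max_size;  // range-derived extent; -1 models "unknown" in this sketch
};

// Hash only the operands.  Entries that differ just in offset/max_size
// collide on purpose, so the comparison below gets to see them.
inline std::size_t hash_ref(const ref_key &k)
{
  return std::hash<int>{}(k.ops);
}

// Assumed encoding of the "don't know" state: nothing was derived from ranges.
inline bool records_no_range_info(const ref_key &k)
{
  return k.offset == 0 && k.max_size == -1;
}

// A recorded entry may answer a lookup if the operands match and either the
// range-derived info is identical or the entry did not rely on any.
inline bool usable_for(const ref_key &recorded, const ref_key &lookup)
{
  if (recorded.ops != lookup.ops)
    return false;
  if (recorded.offset == lookup.offset && recorded.max_size == lookup.max_size)
    return true;
  return records_no_range_info(recorded);
}
```

In this model an entry computed under one range context is never reused
under another, while an entry that made no range assumptions stays
available to every lookup, which is what keeps plain CSE working.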

PR tree-optimization/114121
* tree-ssa-sccvn.h (vn_reference_s::offset,
vn_reference_s::max_size): New fields.
(vn_reference_insert_pieces): Adjust prototype.
* tree-ssa-pre.cc (phi_translate_1): Preserve offset/max_size.
* tree-ssa-sccvn.cc (vn_reference_eq): Compare offset and
size, allow using "don't know" state.
(vn_walk_cb_data::finish): Pass along offset/max_size.
(vn_reference_lookup_or_insert_for_pieces): Take offset and
max_size as argument and use it.
(vn_reference_lookup_3): Properly adjust offset and max_size
according to the adjusted ao_ref.
(vn_reference_lookup_pieces): Initialize offset and max_size.
(vn_reference_lookup): Likewise.
(vn_reference_lookup_call): Likewise.
(vn_reference_insert): Likewise.
(visit_reference_op_call): Likewise.
(vn_reference_insert_pieces): Take offset and max_size
as argument and use it.

* gcc.dg/torture/pr114121.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr114121.c | 35 
 gcc/tree-ssa-pre.cc |  5 ++-
 gcc/tree-ssa-sccvn.cc   | 55 ++---
 gcc/tree-ssa-sccvn.h|  3 ++
 4 files changed, 91 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr114121.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr114121.c 
b/gcc/testsuite/gcc.dg/torture/pr114121.c
new file mode 100644
index 000..9a6ddf2957e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr114121.c
@@ -0,0 +1,35 @@
+/* { dg-do run { target bitint } } */
+
+#if __BITINT_MAXWIDTH__ >= 256
+unsigned a, b, c, d, e;
+unsigned _BitInt(256) f;
+
+__attribute__((noipa)) unsigned short
+bswap16 (int t)
+{
+  return __builtin_bswap16 (t);
+}
+
+void
+foo (unsigned z, unsigned _BitInt(512) y, unsigned *r)
+{
+  unsigned t = __builtin_sub_overflow_p (0, y << 509, f);
+  z *= bswap16 (t);
+  d = __builtin_sub_overflow_p (c, 3, (unsigned _BitInt(512)) 0);
+  unsigned q = z + c + b;
+  unsigned short n = q >> (8 + a);
+  *r = b + e + n;
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 256
+  unsigned x;
+  foo (8, 2, &x);
+  if (x != 8)
+__builtin_abort ();
+#endif
+  return 0;
+}
diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc
index d29214d04f8..75217f5cde1 100644
--- a/gcc/tree-ssa-pre.cc
+++ b/gcc/tree-ssa-pre.cc
@@ -1666,8 +1666,9 @@ phi_translate_1 (bitmap_set_t dest,
if (!newoperands.exists ())
  newoperands = operands.copy ();
newref = vn_reference_insert_pieces (newvuse, ref->set,
-ref->base_set, ref->type,
-newoperands,
+ref->base_set,
+ref->offset, ref->max_size,
+ref->type, newoperands,
 result, new_val_id);
newoperands = vNULL;
  }
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index c20ad85c743..21123644a5a 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -438,7 +438,7 @@ static void init_vn_nary_op_from_pieces (vn_nary_op_t, 
unsigned int,
 enum tree_code, tree, tree *);
 static tree vn_lookup_simplify_result (gimple_match_op *);
 static vn_reference_t vn_reference_lookup_or_insert_for_pieces
- (tree, alias_set_type, alias_set_type, tree,
+ (tree, alias_set_type, alias_set_type, poly_int64, poly_int64, tree,
   vec, tree);
 
 /* Return whether there is value numbering information for a given SSA name.  
*/
@@ -748,6 +748,8 @@ vn_reference_compute_hash (const vn_reference_t vr1)
vn_reference_op_compute_hash (vro, hstate);
}
 }
+  /* Do not hash vr1->offset or vr1->max_size, we want to get collisions
+ to be able to identify compatible results.  */
   result = hstate.end ();
   /* ??? We would ICE later if we hash instead of adding that in. */
   if (vr1->vuse)
@@ -772,6 

Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread Kito Cheng
Taking one more look, I think this option should work and integrate with
--param=riscv-autovec-preference=, since the two have similar but
slightly different jobs.

We have 3 values for --param=riscv-autovec-preference=: none, scalable
and fixed-vlmax.

-mrvv-vector-bits=scalable works like
--param=riscv-autovec-preference=scalable and
-mrvv-vector-bits=zvl works like
--param=riscv-autovec-preference=fixed-vlmax.

So I think we need to do some conflict checks, like:

-mrvv-vector-bits=zvl can't work with --param=riscv-autovec-preference=scalable
-mrvv-vector-bits=scalable can't work with
--param=riscv-autovec-preference=fixed-vlmax

but it may not be simply an alias, since there are some useful
combinations, like:

-mrvv-vector-bits=zvl with --param=riscv-autovec-preference=none:
NO auto-vectorization, but intrinsic code could still benefit from the
-mrvv-vector-bits=zvl option.

-mrvv-vector-bits=scalable with --param=riscv-autovec-preference=none:
should still work for VLS code gen, but just disables auto-vectorization
per the option's semantics.

However, there is something here that needs a fix, since
--param=riscv-autovec-preference=none still disables VLS code gen for
now; you can see an example here:
https://godbolt.org/z/fMTr3eW7K

But I think it's really the right behavior here, this part might need
to be fixed in vls_mode_valid_p and some other places.


Anyway, I think we need to check all use sites of RVV_FIXED_VLMAX and
RVV_SCALABLE, and make sure every use site of RVV_FIXED_VLMAX
also checks RVV_VECTOR_BITS_ZVL.
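
The conflict rules listed above could be sketched like this.  It is only an
illustration of the proposed check, not code from the patch: the enum types
VectorBits and AutovecPreference and the function options_conflict are
hypothetical names, and where such a check would live in the RISC-V backend
is left open.

```cpp
#include <cassert>

// Hypothetical models of the two options; not the enums used in GCC.
enum class VectorBits { Scalable, Zvl };                  // -mrvv-vector-bits=
enum class AutovecPreference { None, Scalable, FixedVlmax }; // --param=riscv-autovec-preference=

// Returns true for the combinations the message proposes to reject.
constexpr bool options_conflict(VectorBits bits, AutovecPreference pref)
{
  switch (pref)
    {
    case AutovecPreference::None:
      // Both "none" combinations are listed as useful, so never a conflict.
      return false;
    case AutovecPreference::Scalable:
      // zvl can't work with scalable auto-vectorization.
      return bits == VectorBits::Zvl;
    case AutovecPreference::FixedVlmax:
      // scalable can't work with fixed-vlmax auto-vectorization.
      return bits == VectorBits::Scalable;
    }
  return false;
}

static_assert(options_conflict(VectorBits::Zvl, AutovecPreference::Scalable));
static_assert(options_conflict(VectorBits::Scalable, AutovecPreference::FixedVlmax));
static_assert(!options_conflict(VectorBits::Zvl, AutovecPreference::None));
static_assert(!options_conflict(VectorBits::Scalable, AutovecPreference::None));
```

Writing it as a table over both options makes it visible that the two
settings are not aliases: four of the six combinations remain valid.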



> -/* Return the VLEN value associated with -march.
> +static int
> +riscv_convert_vector_bits (int min_vlen)

Not sure if we really need this function, it seems it always returns min_vlen?

> +{
> +  int rvv_bits = 0;
> +
> +  switch (rvv_vector_bits)
> +{
> +  case RVV_VECTOR_BITS_ZVL:
> +  case RVV_VECTOR_BITS_SCALABLE:
> +   rvv_bits = min_vlen;
> +   break;
> +  default:
> +   gcc_unreachable ();
> +}
> +
> +  return rvv_bits;
> +}
> +
> +/* Return the VLEN value associated with -march and -mwrvv-vector-bits.


[PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64

2024-02-28 Thread Kito Cheng
atomic_compare_and_swapsi uses lr.w and sc.w to do the atomic operation on
RV64.  However, lr.w sign-extends the loaded value to DI, and the compare
instruction only has DI mode on RV64, so the expected value must be
sign-extended before the compare as well in order to get the right result.

gcc/ChangeLog:

PR target/114130
* config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
extend the expected value if needed.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr114130.c: New.
---
 gcc/config/riscv/sync.md  |  9 +
 gcc/testsuite/gcc.target/riscv/pr114130.c | 12 
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr114130.c

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 54bb0a66518..6f0b5aae08d 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -353,6 +353,15 @@
(match_operand:SI 7 "const_int_operand" "")] ;; mod_f
   "TARGET_ATOMIC"
 {
+  if (word_mode != <MODE>mode && operands[3] != const0_rtx)
+{
+  /* We don't have SI mode compare on RV64, so we need to make sure expected
+value is sign-extended.  */
+  rtx tmp0 = gen_reg_rtx (word_mode);
+  emit_insn (gen_extend_insn (tmp0, operands[3], word_mode, <MODE>mode, 0));
+  operands[3] = simplify_gen_subreg (<MODE>mode, tmp0, word_mode, 0);
+}
+
   emit_insn (gen_atomic_cas_value_strong (operands[1], operands[2],
operands[3], operands[4],
operands[6], operands[7]));
diff --git a/gcc/testsuite/gcc.target/riscv/pr114130.c b/gcc/testsuite/gcc.target/riscv/pr114130.c
new file mode 100644
index 000..647e27dab32
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr114130.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64" } */
+#include <stdint.h>
+
+void foo(uint32_t *p) {
+uintptr_t x = *(uintptr_t *)p;
+uint32_t e = !p ? 0 : (uintptr_t)p >> 1;
+uint32_t d = (uintptr_t)x;
+__atomic_compare_exchange(p, &e, &d, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+}
+
+/* { dg-final { scan-assembler-bound {sext.w\t} >= 1 } } */
-- 
2.34.1



Re: [wwwdocs] Add anchor for contrib/gcc-git-customization.sh docs

2024-02-28 Thread Jakub Jelinek
On Wed, Feb 28, 2024 at 11:15:10AM +, Jonathan Wakely wrote:
> I'd like to be able to link directly to this part of the page from other
> docs.
> 
> OK for wwwdocs?

LGTM.

> diff --git a/htdocs/gitwrite.html b/htdocs/gitwrite.html
> index c89cdb8f..54f8005a 100644
> --- a/htdocs/gitwrite.html
> +++ b/htdocs/gitwrite.html
> @@ -374,7 +374,7 @@ collaborators pull from there.
>  
>  Scripts exist in the contrib directory to help manage these spaces.
>  
> -contrib/gcc-git-customization.sh
> +contrib/gcc-git-customization.sh
>  
>  This script will help set up your personal area.  It will also define
>  some aliases that might be useful when developing GCC.  The script will
> -- 
> 2.43.0

Jakub



Re: [PATCH 2/2] aarch64: Add support for _BitInt

2024-02-28 Thread Jakub Jelinek
On Tue, Feb 27, 2024 at 01:40:09PM +, Andre Vieira (lists) wrote:
> Dropped the first patch and dealt with the comments above, hopefully I
> didn't miss any this time.
> 
> --
> 
> This patch adds support for C23's _BitInt for the AArch64 port when
> compiling for little endianness.  Big endianness requires further
> target-agnostic support and we therefore disable it for now.
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64.cc (TARGET_C_BITINT_TYPE_INFO): Declare MACRO.
>   (aarch64_bitint_type_info): New function.
>   (aarch64_return_in_memory_1): Return large _BitInt's in memory.
>   (aarch64_function_arg_alignment): Adapt to correctly return the ABI
>   mandated alignment of _BitInt(N) where N > 128 as the alignment of
>   TImode.
>   (aarch64_composite_type_p): Return true for _BitInt(N), where N > 128.
> 
> libgcc/ChangeLog:
> 
>   * config/aarch64/t-softfp (softfp_extras): Add floatbitinthf,
>   floatbitintbf, floatbitinttf and fixtfbitint.
>   * config/aarch64/libgcc-softfp.ver (GCC_14.0.0): Add __floatbitinthf,
>   __floatbitintbf, __floatbitinttf and __fixtfbitint.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/aarch64/bitint-alignments.c: New test.
>   * gcc.target/aarch64/bitint-args.c: New test.
>   * gcc.target/aarch64/bitint-sizes.c: New test.

LGTM, but as this is mostly aarch64 specific, I'll defer the final ack
to Richard or Kyrylo.

Jakub



[wwwdocs] Add anchor for contrib/gcc-git-customization.sh docs

2024-02-28 Thread Jonathan Wakely
I'd like to be able to link directly to this part of the page from other
docs.

OK for wwwdocs?

---
 htdocs/gitwrite.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gitwrite.html b/htdocs/gitwrite.html
index c89cdb8f..54f8005a 100644
--- a/htdocs/gitwrite.html
+++ b/htdocs/gitwrite.html
@@ -374,7 +374,7 @@ collaborators pull from there.
 
 Scripts exist in the contrib directory to help manage these spaces.
 
-contrib/gcc-git-customization.sh
+contrib/gcc-git-customization.sh
 
 This script will help set up your personal area.  It will also define
 some aliases that might be useful when developing GCC.  The script will
-- 
2.43.0



[committed] testsuite: XFAIL ssa-sink-18.c also on powerpc64 [PR111462]

2024-02-28 Thread Jakub Jelinek
Hi!

powerpc64-linux apparently (not very surprisingly) behaves the same
way as powerpc64le-linux and has 4 sunk statements rather than 5,
so we should xfail it on powerpc64*-*-* rather than just powerpc64le-*-*.
powerpc-linux has 3 sunk statements, but the scan pattern is done for
lp64 only as the comment explains.

Tested in a cross to powerpc64-linux with
make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} tree-ssa.exp=ssa-sink-18.c'
and committed to trunk as obvious.

2024-02-28  Jakub Jelinek  

PR testsuite/111462
* gcc.dg/tree-ssa/ssa-sink-18.c: XFAIL also on powerpc64.

--- gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-18.c.jj  2024-01-09 09:22:57.685124089 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-18.c 2024-02-28 12:05:38.040579565 +0100
@@ -213,6 +213,6 @@ compute_on_bytes (uint8_t *in_data, int
 expected, so this case is restricted to lp64 only so far.  This different
 ivopts choice affects riscv64 as well, probably because it also lacks
 base+index addressing modes, so the ip[len] address computation can't be
-made from the IV computation above.  powerpc64le similarly is affected.  */
+made from the IV computation above.  powerpc64{,le} similarly is affected.  */
 
- /* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink2" { target lp64 xfail { riscv64-*-* powerpc64le-*-* hppa*64*-*-* } } } } */
+ /* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink2" { target lp64 xfail { riscv64-*-* powerpc64*-*-* hppa*64*-*-* } } } } */

Jakub


