date:20240511

[to-be-committed][RISC-V] Improve usage of slli.uw in constant synthesis

2024-05-11 Thread Jeff Law


And an improvement to using slli.uw...

I recently added the ability to use slli.uw in the synthesis path.  That 
code was conditional on the right justified constant being a LUI_OPERAND 
after sign extending from bit 31 to bit 63.


That code is working fine, but could be improved.  Specifically there's 
no reason it shouldn't work for LUI+ADDI under the same circumstances. 
So rather than testing that the sign-extended, right-justified constant 
is a LUI_OPERAND, we can just test that the right-justified constant has 
precisely 32 leading zeros.
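As a rough illustration of the condition described above, here is a minimal C sketch. It is not the code from riscv.cc: the helper names are hypothetical, `__builtin_clzll` stands in for GCC's internal `clz_hwi`, and `slli_uw` only models the instruction's semantics (zero extend the low 32 bits, then shift left).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when bits 32..63 are clear and bit 31 is set, i.e. the value has
   exactly 32 leading zeros.  __builtin_clzll is undefined for 0, so zero
   is rejected first.  */
static bool
has_32_leading_zeros (uint64_t x)
{
  return x != 0 && __builtin_clzll (x) == 32;
}

/* Sign extend from bit 31 to bit 63.  The narrowing conversion to int32_t
   wraps on GCC and Clang (implementation-defined behavior in ISO C).  */
static uint64_t
sext32 (uint64_t x)
{
  return (uint64_t) (int64_t) (int32_t) (uint32_t) x;
}

/* Model of slli.uw: zero extend the low 32 bits of RS1, then shift left.  */
static uint64_t
slli_uw (uint64_t rs1, unsigned shamt)
{
  assert (shamt < 64);
  return (uint64_t) (uint32_t) rs1 << shamt;
}
```

For a constant such as 0x80180001000, the right-justified value is 0x80180001. Its sign-extended form is what a LUI+ADDI pair leaves in the register, and the "uw" shift then drops the unwanted upper bits while moving the value into position.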



Waiting on CI to finish, expecting to commit after it's successful.

Jeff
gcc/
* config/riscv/riscv.cc (riscv_build_integer_1): Use slli.uw more.

gcc/testsuite
* gcc.target/riscv/synthesis-5.c: New test.

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 049f8f8cb9f..a1e5a014bed 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -819,13 +819,14 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
  & ~HOST_WIDE_INT_C (0x8000)
shift -= IMM_BITS, x <<= IMM_BITS;
 
-  /* Adjust X if it isn't a LUI operand in isolation, but we can use
-a subsequent "uw" instruction form to mask off the undesirable
-bits.  */
+  /* If X has bits 32..63 clear and bit 31 set, then go ahead and mark
+it as desiring a "uw" operation for the shift.  That way we can have
+LUI+ADDI to generate the constant, then shift it into position
+clearing out the undesirable bits.  */
   if (!LUI_OPERAND (x)
  && TARGET_64BIT
  && TARGET_ZBA
- && LUI_OPERAND (x & ~HOST_WIDE_INT_C (0x8000UL)))
+ && clz_hwi (x) == 32)
{
  x = sext_hwi (x, 32);
  use_uw = true;
diff --git a/gcc/testsuite/gcc.target/riscv/synthesis-5.c b/gcc/testsuite/gcc.target/riscv/synthesis-5.c
new file mode 100644
index 000..4d81565b563
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/synthesis-5.c
@@ -0,0 +1,294 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+
+/* We aggressively skip as we really just need to test the basic synthesis
+   which shouldn't vary based on the optimization level.  -O1 seems to work
+   and eliminates the usual sources of extraneous dead code that would throw
+   off the counts.  */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-O2" "-O3" "-Os" "-Oz" "-flto" } } */
+/* { dg-options "-march=rv64gc_zba_zbb_zbs" } */
+
+/* Rather than test for a specific synthesis of all these constants or
+   having thousands of tests each testing one variant, we just test the
+   total number of instructions.
+
+   This isn't expected to change much and any change is worthy of a look.  */
+/* { dg-final { scan-assembler-times "\\t(add|addi|bseti|li|ret|sh1add|sh2add|sh3add|slli)" 556 } } */
+
+unsigned long foo_0x80180001000(void) { return 0x80180001000UL; }
+
+unsigned long foo_0x80280001000(void) { return 0x80280001000UL; }
+
+unsigned long foo_0x80480001000(void) { return 0x80480001000UL; }
+
+unsigned long foo_0x80880001000(void) { return 0x80880001000UL; }
+
+unsigned long foo_0x81080001000(void) { return 0x81080001000UL; }
+
+unsigned long foo_0x82080001000(void) { return 0x82080001000UL; }
+
+unsigned long foo_0x84080001000(void) { return 0x84080001000UL; }
+
+unsigned long foo_0x88080001000(void) { return 0x88080001000UL; }
+
+unsigned long foo_0x90080001000(void) { return 0x90080001000UL; }
+
+unsigned long foo_0xa0080001000(void) { return 0xa0080001000UL; }
+
+unsigned long foo_0x8031000(void) { return 0x8031000UL; }
+
+unsigned long foo_0x8051000(void) { return 0x8051000UL; }
+
+unsigned long foo_0x8091000(void) { return 0x8091000UL; }
+
+unsigned long foo_0x8111000(void) { return 0x8111000UL; }
+
+unsigned long foo_0x8211000(void) { return 0x8211000UL; }
+
+unsigned long foo_0x8411000(void) { return 0x8411000UL; }
+
+unsigned long foo_0x8811000(void) { return 0x8811000UL; }
+
+unsigned long foo_0x9011000(void) { return 0x9011000UL; }
+
+unsigned long foo_0xa011000(void) { return 0xa011000UL; }
+
+unsigned long foo_0xc011000(void) { return 0xc011000UL; }
+
+unsigned long foo_0x8061000(void) { return 0x8061000UL; }
+
+unsigned long foo_0x80a1000(void) { return 0x80a1000UL; }
+
+unsigned long foo_0x8121000(void) { return 0x8121000UL; }
+
+unsigned long foo_0x8221000(void) { return 0x8221000UL; }
+
+unsigned long foo_0x8421000(void) { return 0x8421000UL; }
+
+unsigned long foo_0x8821000(void) { return 0x8821000UL; }
+
+unsigned long foo_0xa021000(void) { return 0xa021000UL; }
+
+unsigned long foo_0xc021000(void) { return 0xc021000UL; }
+
+unsigned long foo_0x80c1000(void) { return 0x80c1000UL; }
+
+unsigned long foo_0x8141000(void) { return 0x8141000UL; }
+
+unsigned long foo_0

[to-be-committed] RISC-V Fix minor regression in synthesis WRT bseti usage

2024-05-11 Thread Jeff Law

Overnight testing showed a small number of cases where constant 
synthesis was doing something dumb.  Specifically generating more 
instructions than the number of bits set in the constant.


It was a minor goof in the recent bseti code.  In the code to first 
figure out what bits LUI could set, I included one bit outside the range 
LUI operates on.  For some dumb reason I kept thinking in terms of 11 low 
bits belonging to addi, but it's actually 12 bits.  The net result is that 
what we thought should be a single LUI for costing turned into LUI+ADDI.


I didn't let the test run to completion, but over the course of 12 hours 
it found 9 cases.  Given we know that the triggers all have 0x800 set, I 
bet we could likely find more, but I doubt it's that critical to cover 
every possible constant that regressed.


This has run in my tester (rv64gc, rv32gcv), but I'll wait for the CI 
tester as it covers the bitmanip extensions much better.



Jeff

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 9c98b1da035..049f8f8cb9f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -921,12 +921,12 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
 
   /* First handle any bits set by LUI.  Be careful of the
 SImode sign bit!.  */
-  if (value & 0x7800)
+  if (value & 0x7000)
{
  alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
- alt_codes[i].value = value & 0x7800;
+ alt_codes[i].value = value & 0x7000;
  alt_codes[i].use_uw = false;
- value &= ~0x7800;
+ value &= ~0x7000;
   i++;
}
 
diff --git a/gcc/testsuite/gcc.target/riscv/synthesis-4.c b/gcc/testsuite/gcc.target/riscv/synthesis-4.c
new file mode 100644
index 000..328a55b9e6e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/synthesis-4.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* We aggressively skip as we really just need to test the basic synthesis
+   which shouldn't vary based on the optimization level.  -O1 seems to work
+   and eliminates the usual sources of extraneous dead code that would throw
+   off the counts.  */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-O2" "-O3" "-Os" "-Oz" "-flto" } } */
+/* { dg-options "-march=rv64gc_zba_zbb_zbs" } */
+
+/* Rather than test for a specific synthesis of all these constants or
+   having thousands of tests each testing one variant, we just test the
+   total number of instructions. 
+
+   This isn't expected to change much and any change is worthy of a look.  */
+/* { dg-final { scan-assembler-times "\\t(add|addi|bseti|li|ret|sh1add|sh2add|sh3add|slli)" 45 } } */
+
+
+unsigned long foo_0x640800(void) { return 0x640800UL; }
+
+unsigned long foo_0xc40800(void) { return 0xc40800UL; }
+
+unsigned long foo_0x1840800(void) { return 0x1840800UL; }
+
+unsigned long foo_0x3040800(void) { return 0x3040800UL; }
+
+unsigned long foo_0x6040800(void) { return 0x6040800UL; }
+
+unsigned long foo_0xc040800(void) { return 0xc040800UL; }
+
+unsigned long foo_0x18040800(void) { return 0x18040800UL; }
+
+unsigned long foo_0x30040800(void) { return 0x30040800UL; }
+
+unsigned long foo_0x60040800(void) { return 0x60040800UL; }

Re: [PATCH] c++: lvalueness of non-dependent assignment [PR114994]

2024-05-11 Thread Patrick Palka

On Fri, 10 May 2024, Jason Merrill wrote:

> On 5/9/24 16:23, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
> > OK for trunk/14?  For trunk as a follow-up I can implement the
> > mentioned representation change to use CALL_EXPR instead of
> > MODOP_EXPR for a non-dependent simple assignment expression that
> > resolved to an operator= overload.
> > 
> > -- >8 --
> > 
> > r14-4111 made us check non-dependent assignment expressions ahead of
> > time, as well as give them a type.  Unlike for compound assignment
> > expressions however, if a simple assignment resolves to an operator
> > overload we still represent it as a (typed) MODOP_EXPR instead of a
> > CALL_EXPR to the selected overload.  This, I reckoned, was just a
> > pessimization (since we'll have to repeat overload resolution at
> > instantiation time) but should be harmless.  (And it should be
> > easily fixable by giving cp_build_modify_expr an 'overload' parameter).
> > 
> > But it breaks the below testcase ultimately because MODOP_EXPR (of
> > non-reference type) is always treated as an lvalue according to
> > lvalue_kind, which is incorrect for the MODOP_EXPR representing x=42.
> > 
> > We can fix this by representing such assignment expressions as CALL_EXPRs
> > matching that of compound assignments, but that turns out to
> > require some tweaking of our -Wparentheses warning logic which seems
> > unsuitable for backporting.
> > 
> > So this patch instead more conservatively fixes this by refining
> > lvalue_kind to consider the type of a (simple) MODOP_EXPR as we
> > already do for COND_EXPR.
> > 
> > PR c++/114994
> > 
> > gcc/cp/ChangeLog:
> > 
> > * tree.cc (lvalue_kind) : Consider the
> > type of a simple assignment expression.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/template/non-dependent32.C: New test.
> > ---
> >   gcc/cp/tree.cc |  7 +++
> >   .../g++.dg/template/non-dependent32.C  | 18 ++
> >   2 files changed, 25 insertions(+)
> >   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent32.C
> > 
> > diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
> > index f1a23ffe817..0b97b789aab 100644
> > --- a/gcc/cp/tree.cc
> > +++ b/gcc/cp/tree.cc
> > @@ -275,6 +275,13 @@ lvalue_kind (const_tree ref)
> > /* We expect to see unlowered MODOP_EXPRs only during
> >  template processing.  */
> > gcc_assert (processing_template_decl);
> > +  if (TREE_CODE (TREE_OPERAND (ref, 1)) == NOP_EXPR
> > + && CLASS_TYPE_P (TREE_TYPE (TREE_OPERAND (ref, 0))))
> > +   /* As in the COND_EXPR case, but for non-dependent assignment
> > +  expressions created by build_x_modify_expr.  */
> > +   goto default_;
> 
> This seems overly specific, I'd think the same thing would apply to += and
> such?

We shouldn't see += etc of class type here since we already represent
those as CALL_EXPR to the selected operator=, but indeed it could
otherwise apply to +=.  Like so?

-- >8 -- 

Subject: [PATCH] c++: lvalueness of non-dependent assignment expr [PR114994]

PR c++/114994

gcc/cp/ChangeLog:

* tree.cc (lvalue_kind) : Consider the
type of a class assignment expression.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent32.C: New test.
---
 gcc/cp/tree.cc |  5 -
 .../g++.dg/template/non-dependent32.C  | 18 ++
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/template/non-dependent32.C

diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index f1a23ffe817..9d37d255d8d 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -275,7 +275,10 @@ lvalue_kind (const_tree ref)
   /* We expect to see unlowered MODOP_EXPRs only during
 template processing.  */
   gcc_assert (processing_template_decl);
-  return clk_ordinary;
+  if (CLASS_TYPE_P (TREE_TYPE (TREE_OPERAND (ref, 0))))
+   goto default_;
+  else
+   return clk_ordinary;
 
 case MODIFY_EXPR:
 case TYPEID_EXPR:
diff --git a/gcc/testsuite/g++.dg/template/non-dependent32.C b/gcc/testsuite/g++.dg/template/non-dependent32.C
new file mode 100644
index 000..54252c7dfaf
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/non-dependent32.C
@@ -0,0 +1,18 @@
+// PR c++/114994
+// { dg-do compile { target c++11 } }
+
+struct udl_arg {
+  udl_arg operator=(int);
+};
+
+void f(udl_arg&&);
+
+template
+void g() {
+  udl_arg x;
+  f(x=42); // { dg-bogus "cannot bind" }
+}
+
+int main() {
+  g();
+}
-- 
2.45.0.119.g0f3415f1f8


... and here's the updated patch to make us represent simple class
assignment as CALL_EXPR as well (should the first patch go in for 14/trunk,
and this second patch for trunk only?):

Subject: [PATCH] c++: lvalueness of non-dependent assignment expr [PR114994]

PR c++/114994

gcc/cp/ChangeLog:

* call.cc (build_new_op): Pass 'overload' to
cp_

Re: [PATCH v2 1/3] diagnostics: Enable escape sequence processing on windows consoles

2024-05-11 Thread Peter0x44


9 May 2024 6:02:34 pm Peter Damianov :

Since windows 10 release v1511, the windows console has had support for VT100
escape sequences. We should try to enable this, and utilize it where possible.


gcc/ChangeLog:
    * diagnostic-color.cc (should_colorize): Enable processing of VT100
    escape sequences on windows consoles

Signed-off-by: Peter Damianov 
---

Forgot to add -v2 to git send-email the first time I sent. Sorry for the
spam.


gcc/diagnostic-color.cc | 21 -
1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/gcc/diagnostic-color.cc b/gcc/diagnostic-color.cc
index f01a0fc2e37..3af198654af 100644
--- a/gcc/diagnostic-color.cc
+++ b/gcc/diagnostic-color.cc
@@ -213,12 +213,23 @@ should_colorize (void)
  pp_write_text_to_stream() in pretty-print.cc calls fputs() on
  that stream.  However, the code below for non-Windows doesn't seem
  to care about it either...  */
-  HANDLE h;
-  DWORD m;
+  HANDLE handle;
+  DWORD mode;
+  BOOL isconsole = false;

-  h = GetStdHandle (STD_ERROR_HANDLE);
-  return (h != INVALID_HANDLE_VALUE) && (h != NULL)
- && GetConsoleMode (h, &m);
+  handle = GetStdHandle (STD_ERROR_HANDLE);
+
+  if ((handle != INVALID_HANDLE_VALUE) && (handle != NULL))
+    isconsole = GetConsoleMode (handle, &mode);
+
+  if (isconsole)
+    {
+  /* Try to enable processing of VT100 escape sequences */
+  mode |= ENABLE_PROCESSED_OUTPUT | ENABLE_VIRTUAL_TERMINAL_PROCESSING;
+  SetConsoleMode (handle, mode);
+    }
+
+  return isconsole;
#else
   char const *t = getenv ("TERM");
   /* emacs M-x shell sets TERM="dumb".  */
--
2.39.2

I asked a windows terminal maintainer to review the patches here:
https://github.com/microsoft/terminal/discussions/17219#discussioncomment-9375044
And got an "LGTM".

I tested the patches with windows terminal, conhost.exe, and conhost.exe 
with the "use legacy console" box checked, and they all worked correctly.


I think this is okay for trunk.

[PATCH v2] driver: Output to a temp file; rename upon success [PR80182]

2024-05-11 Thread Peter Damianov

Currently, commands like:
gcc -o file.c -lm
will delete the user's code.

This patch makes the linker write executables to a temp file, and then renames
the temp file if successful. This fixes the case above, but has limitations.
The source file will still get overwritten if the link "succeeds", such as the
case of: gcc -o file.c -lm -r

It's not perfect, but it should hopefully stop some people from ruining their
day.
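The general "write to a temporary file, rename on success" pattern can be sketched in plain POSIX C as below. This is only an illustration under stated assumptions, not the driver code: `write_then_rename` is a hypothetical helper, the `step_ok` flag stands in for "the linker ran successfully", and the temporary file is created next to the destination (a choice made for this sketch so that `rename` stays on one filesystem).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write DATA to a temporary file next to FINAL_PATH
   and rename it over FINAL_PATH only if the write worked and STEP_OK is
   true.  Returns 0 on success, -1 otherwise.  On failure FINAL_PATH is
   left untouched and the temporary file is removed.  */
static int
write_then_rename (const char *final_path, const char *data, bool step_ok)
{
  assert (final_path != NULL && data != NULL);

  char tmp[4096];
  int n = snprintf (tmp, sizeof tmp, "%s.XXXXXX", final_path);
  if (n < 0 || (size_t) n >= sizeof tmp)
    return -1;

  /* mkstemp replaces the XXXXXX suffix and creates the file.  */
  int fd = mkstemp (tmp);
  if (fd < 0)
    return -1;

  size_t len = strlen (data);
  bool ok = write (fd, data, len) == (ssize_t) len;
  if (close (fd) != 0)
    ok = false;

  /* Only a fully successful run replaces the destination.  */
  if (ok && step_ok && rename (tmp, final_path) == 0)
    return 0;

  unlink (tmp);
  return -1;
}
```

If the producing step fails, an existing file at the destination (for example a source file named by mistake after -o) keeps its contents.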

gcc/ChangeLog:
PR driver/80182
* gcc.cc (output_file_temp): New global variable
(driver_handle_option): Create temp file for executable output
(driver::maybe_run_linker): Rename output_file_temp to output_file if
the linker ran successfully

Signed-off-by: Peter Damianov 
---

v2: Make sure temp file is removed in the case of a compiler error

 gcc/gcc.cc | 51 +++
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 830a4700a87..fabb83c43ee 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -2138,6 +2138,11 @@ static int have_E = 0;
 /* Pointer to output file name passed in with -o. */
 static const char *output_file = 0;
 
+/* We write the output file to a temp file, and rename it if linking
+   is successful. This is to prevent mistakes like: gcc -o file.c -lm from
+   deleting the user's code.  */
+static const char *output_file_temp = 0;
+
 /* Pointer to input file name passed in with -truncate.
This file should be truncated after linking. */
 static const char *totruncate_file = 0;
@@ -4610,10 +4615,16 @@ driver_handle_option (struct gcc_options *opts,
 #if defined(HAVE_TARGET_EXECUTABLE_SUFFIX) || defined(HAVE_TARGET_OBJECT_SUFFIX)
   arg = convert_filename (arg, ! have_c, 0);
 #endif
-  output_file = arg;
+  output_file_temp = output_file = arg;
+  /* If creating an executable, output to a temp file.  */
+  if (!have_c)
+   {
+ output_file_temp = make_temp_file("");
+ record_temp_file(output_file_temp, false, true);
+   }
   /* On some systems, ld cannot handle "-o" without a space.  So
 split the option from its argument.  */
-  save_switch ("-o", 1, &arg, validated, true);
+  save_switch ("-o", 1, &output_file_temp, validated, true);
   return true;
 
 case OPT_pie:
@@ -9266,22 +9277,30 @@ driver::maybe_run_linker (const char *argv0) const
   linker_was_run = (tmp != execution_count);
 }
 
-  /* If options said don't run linker,
- complain about input files to be given to the linker.  */
-
-  if (! linker_was_run && !seen_error ())
-for (i = 0; (int) i < n_infiles; i++)
-  if (explicit_link_files[i]
- && !(infiles[i].language && infiles[i].language[0] == '*'))
+  if (!seen_error ())
+{
+  if (linker_was_run)
+   /* If the linker finished without errors, rename the output from the
+  temporary file to the real output name.  */
+   rename(output_file_temp, output_file);
+  else
{
- warning (0, "%s: linker input file unused because linking not done",
-  outfiles[i]);
- if (access (outfiles[i], F_OK) < 0)
-   /* This is can be an indication the user specifed an errorneous
-  separated option value, (or used the wrong prefix for an
-  option).  */
-   error ("%s: linker input file not found: %m", outfiles[i]);
+ /* If options said don't run linker,
+complain about input files to be given to the linker.  */
+ for (i = 0; (int) i < n_infiles; i++)
+   if (explicit_link_files[i]
+   && !(infiles[i].language && infiles[i].language[0] == '*'))
+ {
+   warning (0, "%s: linker input file unused because linking not done",
+outfiles[i]);
+   if (access (outfiles[i], F_OK) < 0)
+ /* This is can be an indication the user specifed an errorneous
+separated option value, (or used the wrong prefix for an
+option).  */
+ error ("%s: linker input file not found: %m", outfiles[i]);
+ }
}
+}
 }
 
 /* The end of "main".  */
-- 
2.39.2

Re: [Patch, fortran] PR84006 [11/12/13/14/15 Regression] ICE in storage_size() with CLASS entity

2024-05-11 Thread Harald Anlauf


Hi Paul,

On 11.05.24 at 08:20, Paul Richard Thomas wrote:

Hi Harald,

Thanks for the review. The attached resubmission fixes all the invalid
accesses, memory leaks and puts right the incorrect result.

In the course of fixing the fix, I found that deferred character length
MOLDs gave an ICE because reallocation on assign was using 'dest_word_len'
before it was defined. This is fixed by not fixing 'dest_word_len' for
these MOLDs. Unfortunately, the same did not work for unlimited polymorphic
MOLD expressions and so I added a TODO error in iresolve.cc since it
results in all manner of memory errors in runtime. I will return to this
another day.

A resubmission of the patch for PR113363 will follow since it depends on
this one to fix all the memory problems.

OK for mainline?


this is OK from my side.

One minor nit: the updated testcase transfer_class_4.f90 has

  if (sz /= storage_size (real32)/8) stop 1

I think you meant either  storage_size (r)  or  storage_size (1._real32)
instead of checking the storage size of the integer real32 here...

Thanks for the patch!

Harald


Regards

Paul

On Thu, 9 May 2024 at 08:52, Paul Richard Thomas <
paul.richard.tho...@gmail.com> wrote:


Hi Harald,

The Linaro people caught that as well. Thanks.

Interestingly, I was about to re-submit the patch for PR113363, in which
all the invalid accesses and memory leaks are fixed but requires this patch
to do so. The final transfer was thrown in because it seemed to be working
out of the box but should be checked anyway.

Inserting your print statements, my test shows the difference in
size(chr_a) but prints chr_a as "abcdefgjj" no matter what I do. Needless
to say, the latter was the only check that I did. The problem, I suspect,
lies somewhere in the murky depths of
trans-array.cc(gfc_alloc_allocatable_for_assignment) or in the array part
of intrinsic_transfer, untouched by either patch, and is present in 13- and
14-branches.

I am onto it.

Cheers

Paul


On Wed, 8 May 2024 at 22:06, Harald Anlauf  wrote:


Hi Paul,

this looks mostly good, but the new testcase transfer_class_4.f90
does exhibit a problem with your patch.  Run it with valgrind,
or with -fcheck=bounds, or with -fsanitize=address, or add the
following around the final transfer:

print *, storage_size (star_a), storage_size (chr_a), size (chr_a), len (chr_a)
chr_a = transfer (star_a, chr_a)
print *, storage_size (star_a), storage_size (chr_a), size (chr_a), len (chr_a)
print *, ">", chr_a, "<"

This prints for me:

40  40   2   5$
40  40   4   5$
   >abcdefghij^@^@^@^@^@^@^@^@^@^@<$

So since the physical representation of chr_a is sufficient
to hold star_a (F2023:16.9.212), no reallocation with a wrong
calculated size should happen.  (Intel and NAG get this right.)

Can you check again?

Thanks,
Harald

Re: [COMMITTED] Remove obsolete Solaris 11.3 support

2024-05-11 Thread John Paul Adrian Glaubitz

Hi Peter,

On Fri, 2024-05-10 at 12:07 +0100, Peter Tribble wrote:
> Tribblix is built from the last commit that worked (November 2021), with any
> relevant changes since cherry-picked on top. So in terms of timeline Tribblix
> is contemporary with 11.4, with hardware support matching the original
> Solaris 11 release.

Thanks, good to know! And thanks a lot for your efforts!

> But we're well into the second decade since the fork, so there's enough
> divergence that illumos and Solaris are really different, even if some of
> what you see looks very similar.
> 
> (And illumos on SPARC uses gcc4 to build the kernel [!], although
> applications on Tribblix use gcc7. Given the target market, having the
> latest and greatest toolchains isn't the highest priority.)

Well, at some point you will run into code that won't build with that old 
toolchain anymore.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

Re: [PATCH 03/21] testsuite: Add more allocation size tests for conjured svalues [PR110014]

2024-05-11 Thread NightStrike

On Thu, May 9, 2024 at 1:47 PM David Malcolm  wrote:
>
> From: Tim Lange 
>
> This patch adds the reproducers reported in PR 110014 as test cases. The
> false positives in those cases are already fixed with PR 109577.
>
> 2023-06-09  Tim Lange  
>
> PR analyzer/110014
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/analyzer/realloc-pr110014.c: New tests.
>



> diff --git a/gcc/testsuite/gcc.dg/analyzer/realloc-pr110014.c b/gcc/testsuite/gcc.dg/analyzer/realloc-pr110014.c
> new file mode 100644
> index 000..d76b8781413
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/realloc-pr110014.c
> @@ -0,0 +1,25 @@
> +void *realloc (void *, unsigned long)
> +  __attribute__((__nothrow__, __leaf__))
> +  __attribute__((__warn_unused_result__)) __attribute__((__alloc_size__ (2)));

This change missed my comment about the wrong type for realloc from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110014#c3

Can you please fix this on all branches?
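For readers without the Bugzilla comment at hand: the sketch below assumes the concern is the size parameter being declared as unsigned long, which is narrower than size_t on LLP64 targets such as 64-bit Windows. That reading is an assumption, and the names `demo_size_t` and `grow_ints` are hypothetical. `__SIZE_TYPE__` is a macro predefined by GCC and Clang, which lets a test declare the allocator without including any header.

```c
#include <assert.h>
#include <stddef.h>

/* __SIZE_TYPE__ expands to the target's size_t type on GCC and Clang, so
   this declaration matches the C library on both LP64 and LLP64.  */
typedef __SIZE_TYPE__ demo_size_t;

void *realloc (void *, demo_size_t);
void free (void *);

/* Hypothetical helper: resize P to hold COUNT ints.  */
static int *
grow_ints (int *p, demo_size_t count)
{
  /* Guard against overflow in the size computation.  */
  assert (count <= (demo_size_t) -1 / sizeof (int));
  return realloc (p, count * sizeof (int));
}
```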

Re: [PATCH 02/21] analyzer: Fix allocation size false positive on conjured svalue [PR109577]

2024-05-11 Thread NightStrike

On Thu, May 9, 2024 at 1:45 PM David Malcolm  wrote:
>
> From: Tim Lange 
>
> Currently, the analyzer tries to prove that the allocation size is a
> multiple of the pointee's type size.  This patch reverses the behavior
> to try to prove that the expression is not a multiple of the pointee's
> type size.  With this change, each unhandled case should be gracefully
> considered as correct.  This fixes the bug reported in PR 109577 by
> Paul Eggert.
>



> diff --git a/gcc/testsuite/gcc.dg/analyzer/pr109577.c b/gcc/testsuite/gcc.dg/analyzer/pr109577.c
> new file mode 100644
> index 000..a6af6f7019f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/pr109577.c
> @@ -0,0 +1,16 @@
> +void *malloc (unsigned long);

This change missed my comment here describing this mistake:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109577#c5

Can you please fix this on all branches?

[PATCH] jit: Ensure ssize_t is defined.

2024-05-11 Thread FX Coudert

Hi,

On some targets it seems that ssize_t is not defined by any of the headers 
transitively included by . This leads to a bootstrap failure when jit is 
enabled. The attached patch fixes it by including . Other headers in 
GCC treat  as available on all targets, so we include it 
unconditionally.
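The header names did not survive in the message as archived, so the sketch below is only an assumption about the general technique: POSIX specifies that <sys/types.h> defines ssize_t, and including it directly removes the reliance on transitive includes. The function `find_byte` is a hypothetical example of an interface that needs the type.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>   /* POSIX: defines ssize_t */

/* Hypothetical example of an API that returns an index or -1, which is
   the usual reason a header needs ssize_t to be visible.  */
static ssize_t
find_byte (const unsigned char *buf, size_t len, unsigned char b)
{
  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if (buf[i] == b)
      return (ssize_t) i;
  return -1;
}
```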

Tested on x86_64-darwin and x86_64-linux. OK to push?

FX



0001-jit-Ensure-ssize_t-is-defined.patch
Description: Binary data

[PATCH] driver: Output to a temp file; rename upon success [PR80182]

2024-05-11 Thread Peter Damianov

Currently, commands like:
gcc -o file.c -lm
will delete the user's code.

This patch makes the linker write executables to a temp file, and then renames
the temp file if successful. This fixes the case above, but has limitations.
The source file will still get overwritten if the link "succeeds", such as the
case of: gcc -o file.c -lm -r

It's not perfect, but it should hopefully stop some people from ruining their
day.

gcc/ChangeLog:
PR driver/80182
* gcc.cc (output_file_temp): New global variable
(driver_handle_option): Create temp file for executable output
(driver::maybe_run_linker): Rename output_file_temp to output_file if
the linker ran successfully

Signed-off-by: Peter Damianov 
---
 gcc/gcc.cc | 50 +-
 1 file changed, 33 insertions(+), 17 deletions(-)

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 830a4700a87..cd796d190de 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -2138,6 +2138,11 @@ static int have_E = 0;
 /* Pointer to output file name passed in with -o. */
 static const char *output_file = 0;
 
+/* We write the output file to a temp file, and rename it if linking
+   is successful. This is to prevent mistakes like: gcc -o file.c -lm from
+   deleting the user's code.  */
+static const char *output_file_temp = 0;
+
 /* Pointer to input file name passed in with -truncate.
This file should be truncated after linking. */
 static const char *totruncate_file = 0;
@@ -2294,7 +2299,7 @@ close_at_file (void)
 
   record_temp_file (temp_file, !save_temps_flag, !save_temps_flag);
 }
-
+
 /* Load specs from a file name named FILENAME, replacing occurrences of
various different types of line-endings, \r\n, \n\r and just \r, with
a single \n.  */
@@ -4610,10 +4615,13 @@ driver_handle_option (struct gcc_options *opts,
 #if defined(HAVE_TARGET_EXECUTABLE_SUFFIX) || defined(HAVE_TARGET_OBJECT_SUFFIX)
   arg = convert_filename (arg, ! have_c, 0);
 #endif
-  output_file = arg;
+  output_file_temp = output_file = arg;
+  /* If creating an executable, output to a temp file.  */
+  if (!have_c)
+   output_file_temp = make_temp_file("");
   /* On some systems, ld cannot handle "-o" without a space.  So
 split the option from its argument.  */
-  save_switch ("-o", 1, &arg, validated, true);
+  save_switch ("-o", 1, &output_file_temp, validated, true);
   return true;
 
 case OPT_pie:
@@ -9266,22 +9274,30 @@ driver::maybe_run_linker (const char *argv0) const
   linker_was_run = (tmp != execution_count);
 }
 
-  /* If options said don't run linker,
- complain about input files to be given to the linker.  */
-
-  if (! linker_was_run && !seen_error ())
-for (i = 0; (int) i < n_infiles; i++)
-  if (explicit_link_files[i]
- && !(infiles[i].language && infiles[i].language[0] == '*'))
+  if (!seen_error ())
+{
+  if (linker_was_run)
+   /* If the linker finished without errors, rename the output from the
+  temporary file to the real output name.  */
+   rename(output_file_temp, output_file);
+  else
{
- warning (0, "%s: linker input file unused because linking not done",
-  outfiles[i]);
- if (access (outfiles[i], F_OK) < 0)
-   /* This is can be an indication the user specifed an errorneous
-  separated option value, (or used the wrong prefix for an
-  option).  */
-   error ("%s: linker input file not found: %m", outfiles[i]);
+ /* If options said don't run linker,
+complain about input files to be given to the linker.  */
+ for (i = 0; (int) i < n_infiles; i++)
+   if (explicit_link_files[i]
+   && !(infiles[i].language && infiles[i].language[0] == '*'))
+ {
+   warning (0, "%s: linker input file unused because linking not done",
+outfiles[i]);
+   if (access (outfiles[i], F_OK) < 0)
+ /* This is can be an indication the user specifed an errorneous
+separated option value, (or used the wrong prefix for an
+option).  */
+ error ("%s: linker input file not found: %m", outfiles[i]);
+ }
}
+}
 }
 
 /* The end of "main".  */
-- 
2.39.2

Re: [PATCH v2 1/4] Support for CodeView debugging format

2024-05-11 Thread Jeff Law





On 10/30/23 6:28 PM, Mark Harmstone wrote:

This patch and the following add initial support for Microsoft's
CodeView debugging format, as used by MSVC, to mingw targets.

Note that you will need a recent version of binutils for this to be
useful. The best way to view the output is to run Microsoft's
cvdump.exe, found in their microsoft-pdb repo on GitHub, against the
object files.

So I'd hoped to have these wrapped up last year in time for gcc-14, but 
life got in the way.


The patches are fine for the trunk, though they are missing ChangeLog 
entries.  I'll cobble those together and push the series to the trunk.


Thanks for your patience.

jeff

[SUBREG V3 2/4] DF: Add DF_LIVE_SUBREG problem

2024-05-11 Thread Juzhe-Zhong

This patch adds a new DF problem, named DF_LIVE_SUBREG. This problem
extends the DF_LR problem and supports tracking the subreg liveness
of multireg pseudos that satisfy the following conditions:

  1. the mode size is greater than its REGMODE_NATURAL_SIZE.
  2. the reg is used in insns via a subreg pattern.

The main methods are as follows:

  1. split the bitmap in/out/def/use fields into full_in/out/def/use and
     partial_in/out/def/use. If a pseudo needs its subreg liveness
     tracked, it is recorded in the partial_in/out/def/use fields.
     In addition, there are range_in/out/def/use fields which record the
     live range of each tracked pseudo.
  2. in the df_live_subreg_finalize function, we move a tracked pseudo from
     partial_in/out/def/use to full_in/out/def/use if the pseudo's live
     range is full.
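The idea in the two lists above can be modelled with a very small C sketch. This is not the data structure from the patch: the struct, the one-bit-per-block encoding and all function names are hypothetical, chosen only to show "track per-block liveness, and treat the pseudo as fully live once every block is live".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a pseudo spanning NBLOCKS natural-size registers,
   with one liveness bit per block.  */
struct subreg_range
{
  unsigned nblocks;
  uint32_t live;
};

/* Mask with the low NBLOCKS bits set.  */
static uint32_t
full_mask (unsigned nblocks)
{
  assert (nblocks >= 1 && nblocks <= 32);
  return nblocks == 32 ? UINT32_MAX : (UINT32_C (1) << nblocks) - 1;
}

/* The two conditions: wider than the natural size, and accessed through
   a subreg somewhere.  */
static bool
wants_subreg_tracking (unsigned mode_size, unsigned natural_size,
                       bool used_via_subreg)
{
  return mode_size > natural_size && used_via_subreg;
}

/* A use makes blocks FIRST .. FIRST+COUNT-1 live.  */
static void
mark_use (struct subreg_range *r, unsigned first, unsigned count)
{
  assert (count >= 1 && first + count <= r->nblocks);
  r->live |= full_mask (count) << first;
}

/* A def kills blocks FIRST .. FIRST+COUNT-1.  */
static void
mark_def (struct subreg_range *r, unsigned first, unsigned count)
{
  assert (count >= 1 && first + count <= r->nblocks);
  r->live &= ~(full_mask (count) << first);
}

/* When every block is live the pseudo can be treated as fully live,
   which corresponds to moving it from the partial set to the full set.  */
static bool
range_full_p (const struct subreg_range *r)
{
  return r->live == full_mask (r->nblocks);
}
```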

Co-authored-by: Lehua Ding 

gcc/ChangeLog:

* Makefile.in: Add subreg-live-range object file.
* df-problems.cc (struct df_live_subreg_problem_data): New struct.
(df_live_subreg_get_bb_info): New function.
(get_live_subreg_local_bb_info): Ditto.
(multireg_p): Ditto.
(need_track_subreg_p): Ditto.
(init_range): Ditto.
(remove_subreg_range): Ditto.
(add_subreg_range_to_def): Ditto.
(add_subreg_range_to_use): Ditto.
(df_live_subreg_free_bb_info): Ditto.
(df_live_subreg_alloc): Ditto.
(df_live_subreg_reset): Ditto.
(df_live_subreg_bb_local_compute): Ditto.
(df_live_subreg_local_compute): Ditto.
(df_live_subreg_init): Ditto.
(df_live_subreg_check_result): Ditto.
(df_live_subreg_confluence_0): Ditto.
(df_live_subreg_confluence_n): Ditto.
(df_live_subreg_transfer_function): Ditto.
(df_live_subreg_finalize): Ditto.
(df_live_subreg_free): Ditto.
(df_live_subreg_top_dump): Ditto.
(df_live_subreg_bottom_dump): Ditto.
(df_live_subreg_add_problem): Ditto.
* df.h (enum df_problem_id): New enum.
(class subregs_live): New class.
(class df_live_subreg_local_bb_info): Ditto.
(class df_live_subreg_bb_info): Ditto.
(df_live_subreg): New function.
(df_live_subreg_add_problem): Ditto.
(df_live_subreg_finalize): Ditto.
(df_live_subreg_check_result): Ditto.
(multireg_p): Ditto.
(init_range): Ditto.
(add_subreg_range_to_def): Ditto.
(add_subreg_range_to_use): Ditto.
(remove_subreg_range): Ditto.
(df_get_subreg_live_in): Ditto.
(df_get_subreg_live_out): Ditto.
(df_get_subreg_live_full_in): Ditto.
(df_get_subreg_live_full_out): Ditto.
(df_get_subreg_live_partial_in): Ditto.
(df_get_subreg_live_partial_out): Ditto.
(df_get_subreg_live_range_in): Ditto.
(df_get_subreg_live_range_out): Ditto.
* regs.h (get_nblocks): New macro.
* sbitmap.cc (bitmap_full_p): New function.
(bitmap_same_p): Ditto.
(test_full): Ditto.
(test_same): Ditto.
(sbitmap_cc_tests): Ditto.
* sbitmap.h (bitmap_full_p): Ditto.
(bitmap_same_p): Ditto.
* timevar.def (TV_DF_LIVE_SUBREG): New timer stat.
* subreg-live-range.cc: New file.
* subreg-live-range.h: New file.

---
 gcc/Makefile.in  |   1 +
 gcc/df-problems.cc   | 886 ++-
 gcc/df.h | 159 +++
 gcc/regs.h   |   5 +
 gcc/sbitmap.cc   |  98 +
 gcc/sbitmap.h|   2 +
 gcc/subreg-live-range.cc |  53 +++
 gcc/subreg-live-range.h  | 206 +
 gcc/timevar.def  |   1 +
 9 files changed, 1410 insertions(+), 1 deletion(-)
 create mode 100644 gcc/subreg-live-range.cc
 create mode 100644 gcc/subreg-live-range.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ecd51146357..11722506018 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1683,6 +1683,7 @@ OBJS = \
store-motion.o \
streamer-hooks.o \
stringpool.o \
+   subreg-live-range.o \
substring-locations.o \
target-globals.o \
targhooks.o \
diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
index 88ee0dd67fc..01f1f850925 100644
--- a/gcc/df-problems.cc
+++ b/gcc/df-problems.cc
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "target.h"
 #include "rtl.h"
 #include "df.h"
+#include "subreg-live-range.h"
 #include "memmodel.h"
 #include "tm_p.h"
 #include "insn-config.h"
@@ -1344,8 +1345,891 @@ df_lr_verify_transfer_functions (void)
   bitmap_clear (&all_blocks);
 }
 
+/*
+   REGISTER AND SUBREGS LIVES
+   Like DF_LR, but include tracking subreg liveness.  Currently used to provide
+   subreg liveness related information to the register allocator.  The subreg
+   information is currently tracked for registers that satisf

[SUBREG V3 0/4] Add DF_LIVE_SUBREG data and apply to IRA and LRA

2024-05-11 Thread Juzhe-Zhong

V3: Address comments from Dimitar Dimitrov

These patches add a new data flow problem, DF_LIVE_SUBREG,
which tracks subreg liveness, and then apply it to the IRA and LRA
passes (enabled via -O3 or -ftrack-subreg-liveness). These patches
are for GCC 15, and the code has been pushed to the devel/subreg-coalesce
branch. In addition, my colleague Shuo Chen will also be involved in some
of the remaining work; thank you for your support.

These patches are separated from the subreg-coalesce patches submitted
a few months ago. I refactored the code according to comments. The next
patches will support subreg coalescing based on them. Here are some data
about the build time of SPEC INT 2017 (x86-64 target):

                          baseline   baseline (+track-subreg-liveness)
  specint2017 build time: 1892s      1883s

Regarding build times, I ran the build a few times, and the runs with
subreg liveness tracking all seem to take slightly less time. Since the
difference is small, it may just be a change in environment. But a real
improvement is theoretically possible, since tracking subreg liveness
could reduce the number of live regs.

For memory usage, I tried PR 69609 with valgrind; peak memory size grew from
2003910656 to 2003947520, a very small increase.
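
To put the figures above in proportion, here is a small worked calculation.
It is only a sketch: it assumes the valgrind numbers are bytes, and the
constant names are made up for illustration. The build time difference is
9 s, about 0.48% of the baseline, and the peak memory difference is 36864
bytes, about 0.002% of the baseline.

```cpp
#include <cstdint>

// Figures quoted in the message above.
constexpr std::int64_t build_base_s = 1892;      // seconds, baseline
constexpr std::int64_t build_track_s = 1883;     // seconds, with tracking
constexpr std::int64_t peak_base = 2003910656;   // assumed bytes, baseline
constexpr std::int64_t peak_track = 2003947520;  // assumed bytes, with tracking

// Absolute differences.
constexpr std::int64_t build_delta_s = build_base_s - build_track_s;
constexpr std::int64_t peak_delta = peak_track - peak_base;

// Relative differences in parts per million, using integer division.
constexpr std::int64_t build_delta_ppm
  = build_delta_s * 1000000 / build_base_s;
constexpr std::int64_t peak_delta_ppm = peak_delta * 1000000 / peak_base;
```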

Note that these patches don't enable register coalescing with subreg
liveness in IRA/LRA, so no performance change is expected.

We will enable register coalescing with subreg liveness tracking in the
follow-up patches.

Bootstrapped and regtested on x86-64 with no regressions.

Co-authored-by: Lehua Ding 

Juzhe-Zhong (4):
  DF: Add -ftrack-subreg-liveness option
  DF: Add DF_LIVE_SUBREG problem
  IRA: Add DF_LIVE_SUBREG problem
  LRA: Apply DF_LIVE_SUBREG data

 gcc/Makefile.in  |   1 +
 gcc/common.opt   |   4 +
 gcc/common.opt.urls  |   3 +
 gcc/df-problems.cc   | 886 ++-
 gcc/df.h | 159 +++
 gcc/doc/invoke.texi  |   8 +
 gcc/ira-build.cc |   7 +-
 gcc/ira-color.cc |   8 +-
 gcc/ira-emit.cc  |  12 +-
 gcc/ira-lives.cc |   7 +-
 gcc/ira.cc   |  19 +-
 gcc/lra-coalesce.cc  |  27 +-
 gcc/lra-constraints.cc   | 109 -
 gcc/lra-int.h|   4 +
 gcc/lra-lives.cc | 357 
 gcc/lra-remat.cc |   8 +-
 gcc/lra-spills.cc|  27 +-
 gcc/lra.cc   |  10 +-
 gcc/opts.cc  |   1 +
 gcc/regs.h   |   5 +
 gcc/sbitmap.cc   |  98 +
 gcc/sbitmap.h|   2 +
 gcc/subreg-live-range.cc |  53 +++
 gcc/subreg-live-range.h  | 206 +
 gcc/timevar.def  |   1 +
 25 files changed, 1886 insertions(+), 136 deletions(-)
 create mode 100644 gcc/subreg-live-range.cc
 create mode 100644 gcc/subreg-live-range.h

-- 
2.36.3

[SUBREG V3 4/4] LRA: Apply DF_LIVE_SUBREG data

2024-05-11 Thread Juzhe-Zhong

This patch applies DF_LIVE_SUBREG to the LRA pass. More changes were made
to LRA than to IRA, since LRA modifies the DF data directly.
The main changes are centered on the lra-lives.cc file.

Co-authored-by: Lehua Ding 

gcc/ChangeLog:

* lra-coalesce.cc (update_live_info): Apply DF_LIVE_SUBREG data.
(lra_coalesce): Ditto.
* lra-constraints.cc (update_ebb_live_info): Ditto.
(get_live_on_other_edges): Ditto.
(inherit_in_ebb): Ditto.
(lra_inheritance): Ditto.
(fix_bb_live_info): Ditto.
(remove_inheritance_pseudos): Ditto.
* lra-int.h (GCC_LRA_INT_H): Ditto.
(struct lra_insn_reg): Ditto.
* lra-lives.cc (class bb_data_pseudos): Ditto.
(need_track_subreg_p): New function.
(make_hard_regno_live): Ditto.
(make_hard_regno_dead): Ditto.
(mark_regno_live): Apply DF_LIVE_SUBREG data.
(mark_regno_dead): Ditto.
(live_trans_fun): Ditto.
(live_con_fun_0): Ditto.
(live_con_fun_n): Ditto.
(initiate_live_solver): Ditto.
(finish_live_solver): Ditto.
(process_bb_lives): Ditto.
(lra_create_live_ranges_1): Ditto.
* lra-remat.cc (dump_candidates_and_remat_bb_data): Ditto.
(calculate_livein_cands): Ditto.
(do_remat): Ditto.
* lra-spills.cc (spill_pseudos): Ditto.
* lra.cc (new_insn_reg): Ditto.
(add_regs_to_insn_regno_info): Ditto.

---
 gcc/lra-coalesce.cc|  27 +++-
 gcc/lra-constraints.cc | 109 ++---
 gcc/lra-int.h  |   4 +
 gcc/lra-lives.cc   | 357 -
 gcc/lra-remat.cc   |   8 +-
 gcc/lra-spills.cc  |  27 +++-
 gcc/lra.cc |  10 +-
 7 files changed, 430 insertions(+), 112 deletions(-)

diff --git a/gcc/lra-coalesce.cc b/gcc/lra-coalesce.cc
index a9b5b51cb3f..9416775a009 100644
--- a/gcc/lra-coalesce.cc
+++ b/gcc/lra-coalesce.cc
@@ -186,19 +186,28 @@ static bitmap_head used_pseudos_bitmap;
 /* Set up USED_PSEUDOS_BITMAP, and update LR_BITMAP (a BB live info
bitmap).  */
 static void
-update_live_info (bitmap lr_bitmap)
+update_live_info (bitmap all, bitmap full, bitmap partial)
 {
   unsigned int j;
   bitmap_iterator bi;
 
   bitmap_clear (&used_pseudos_bitmap);
-  EXECUTE_IF_AND_IN_BITMAP (&coalesced_pseudos_bitmap, lr_bitmap,
+  EXECUTE_IF_AND_IN_BITMAP (&coalesced_pseudos_bitmap, all,
FIRST_PSEUDO_REGISTER, j, bi)
 bitmap_set_bit (&used_pseudos_bitmap, first_coalesced_pseudo[j]);
-  if (! bitmap_empty_p (&used_pseudos_bitmap))
+  if (!bitmap_empty_p (&used_pseudos_bitmap))
 {
-  bitmap_and_compl_into (lr_bitmap, &coalesced_pseudos_bitmap);
-  bitmap_ior_into (lr_bitmap, &used_pseudos_bitmap);
+  bitmap_and_compl_into (all, &coalesced_pseudos_bitmap);
+  bitmap_ior_into (all, &used_pseudos_bitmap);
+
+  if (flag_track_subreg_liveness)
+   {
+ bitmap_and_compl_into (full, &coalesced_pseudos_bitmap);
+ bitmap_ior_and_compl_into (full, &used_pseudos_bitmap, partial);
+
+ bitmap_and_compl_into (partial, &coalesced_pseudos_bitmap);
+ bitmap_ior_and_compl_into (partial, &used_pseudos_bitmap, full);
+   }
 }
 }
 
@@ -301,8 +310,12 @@ lra_coalesce (void)
   bitmap_initialize (&used_pseudos_bitmap, &reg_obstack);
   FOR_EACH_BB_FN (bb, cfun)
 {
-  update_live_info (df_get_live_in (bb));
-  update_live_info (df_get_live_out (bb));
+  update_live_info (df_get_subreg_live_in (bb),
+   df_get_subreg_live_full_in (bb),
+   df_get_subreg_live_partial_in (bb));
+  update_live_info (df_get_subreg_live_out (bb),
+   df_get_subreg_live_full_out (bb),
+   df_get_subreg_live_partial_out (bb));
   FOR_BB_INSNS_SAFE (bb, insn, next)
if (INSN_P (insn)
&& bitmap_bit_p (&involved_insns_bitmap, INSN_UID (insn)))
diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 5b78fd0b7e5..effb5d8484c 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -6554,34 +6554,86 @@ update_ebb_live_info (rtx_insn *head, rtx_insn *tail)
{
  if (prev_bb != NULL)
{
- /* Update df_get_live_in (prev_bb):  */
+ /* Update subreg live (prev_bb):  */
+ bitmap subreg_all_in = df_get_subreg_live_in (prev_bb);
+ bitmap subreg_full_in = df_get_subreg_live_full_in (prev_bb);
+ bitmap subreg_partial_in = df_get_subreg_live_partial_in 
(prev_bb);
+ subregs_live *range_in = df_get_subreg_live_range_in (prev_bb);
  EXECUTE_IF_SET_IN_BITMAP (&check_only_regs, 0, j, bi)
if (bitmap_bit_p (&live_regs, j))
- bitmap_set_bit (df_get_live_in (prev_bb), j);
-   else
- bitmap_clear_bit (df_get_live_in (prev_bb), j);
+ {
+   bitmap_

[SUBREG V3 3/4] IRA: Add DF_LIVE_SUBREG problem

2024-05-11 Thread Juzhe-Zhong

This patch simply replaces df_get_live_in with df_get_subreg_live_in
and df_get_live_out with df_get_subreg_live_out.

Co-authored-by: Lehua Ding 

gcc/ChangeLog:

* ira-build.cc (create_bb_allocnos): Apply DF_LIVE_SUBREG data.
(create_loop_allocnos): Ditto.
* ira-color.cc (ira_loop_edge_freq): Ditto.
* ira-emit.cc (generate_edge_moves): Ditto.
(add_ranges_and_copies): Ditto.
* ira-lives.cc (process_out_of_region_eh_regs): Ditto.
(add_conflict_from_region_landing_pads): Ditto.
(process_bb_node_lives): Ditto.
* ira.cc (find_moveable_pseudos): Ditto.
(interesting_dest_for_shprep_1): Ditto.
(allocate_initial_values): Ditto.
(ira): Ditto.

---
 gcc/ira-build.cc |  7 ---
 gcc/ira-color.cc |  8 
 gcc/ira-emit.cc  | 12 ++--
 gcc/ira-lives.cc |  7 ---
 gcc/ira.cc   | 19 ---
 5 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/gcc/ira-build.cc b/gcc/ira-build.cc
index ea593d5a087..283ff36d3dd 100644
--- a/gcc/ira-build.cc
+++ b/gcc/ira-build.cc
@@ -1921,7 +1921,8 @@ create_bb_allocnos (ira_loop_tree_node_t bb_node)
   create_insn_allocnos (PATTERN (insn), NULL, false);
   /* It might be a allocno living through from one subloop to
  another.  */
-  EXECUTE_IF_SET_IN_REG_SET (df_get_live_in (bb), FIRST_PSEUDO_REGISTER, i, bi)
+  EXECUTE_IF_SET_IN_REG_SET (df_get_subreg_live_in (bb), FIRST_PSEUDO_REGISTER,
+i, bi)
 if (ira_curr_regno_allocno_map[i] == NULL)
   ira_create_allocno (i, false, ira_curr_loop_tree_node);
 }
@@ -1937,9 +1938,9 @@ create_loop_allocnos (edge e)
   bitmap_iterator bi;
   ira_loop_tree_node_t parent;
 
-  live_in_regs = df_get_live_in (e->dest);
+  live_in_regs = df_get_subreg_live_in (e->dest);
   border_allocnos = ira_curr_loop_tree_node->border_allocnos;
-  EXECUTE_IF_SET_IN_REG_SET (df_get_live_out (e->src),
+  EXECUTE_IF_SET_IN_REG_SET (df_get_subreg_live_out (e->src),
 FIRST_PSEUDO_REGISTER, i, bi)
 if (bitmap_bit_p (live_in_regs, i))
   {
diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
index b9ae32d1b4d..bfebc48ef83 100644
--- a/gcc/ira-color.cc
+++ b/gcc/ira-color.cc
@@ -2786,8 +2786,8 @@ ira_loop_edge_freq (ira_loop_tree_node_t loop_node, int 
regno, bool exit_p)
   FOR_EACH_EDGE (e, ei, loop_node->loop->header->preds)
if (e->src != loop_node->loop->latch
&& (regno < 0
-   || (bitmap_bit_p (df_get_live_out (e->src), regno)
-   && bitmap_bit_p (df_get_live_in (e->dest), regno))))
+   || (bitmap_bit_p (df_get_subreg_live_out (e->src), regno)
+   && bitmap_bit_p (df_get_subreg_live_in (e->dest), regno))))
  freq += EDGE_FREQUENCY (e);
 }
   else
@@ -2795,8 +2795,8 @@ ira_loop_edge_freq (ira_loop_tree_node_t loop_node, int 
regno, bool exit_p)
   auto_vec<edge> edges = get_loop_exit_edges (loop_node->loop);
   FOR_EACH_VEC_ELT (edges, i, e)
if (regno < 0
-   || (bitmap_bit_p (df_get_live_out (e->src), regno)
-   && bitmap_bit_p (df_get_live_in (e->dest), regno)))
+   || (bitmap_bit_p (df_get_subreg_live_out (e->src), regno)
+   && bitmap_bit_p (df_get_subreg_live_in (e->dest), regno)))
  freq += EDGE_FREQUENCY (e);
 }
 
diff --git a/gcc/ira-emit.cc b/gcc/ira-emit.cc
index d347f11fa02..8075b082e36 100644
--- a/gcc/ira-emit.cc
+++ b/gcc/ira-emit.cc
@@ -510,8 +510,8 @@ generate_edge_moves (edge e)
 return;
   src_map = src_loop_node->regno_allocno_map;
   dest_map = dest_loop_node->regno_allocno_map;
-  regs_live_in_dest = df_get_live_in (e->dest);
-  regs_live_out_src = df_get_live_out (e->src);
+  regs_live_in_dest = df_get_subreg_live_in (e->dest);
+  regs_live_out_src = df_get_subreg_live_out (e->src);
   EXECUTE_IF_SET_IN_REG_SET (regs_live_in_dest,
 FIRST_PSEUDO_REGISTER, regno, bi)
 if (bitmap_bit_p (regs_live_out_src, regno))
@@ -1229,16 +1229,16 @@ add_ranges_and_copies (void)
 destination block) to use for searching allocnos by their
 regnos because of subsequent IR flattening.  */
   node = IRA_BB_NODE (bb)->parent;
-  bitmap_copy (live_through, df_get_live_in (bb));
+  bitmap_copy (live_through, df_get_subreg_live_in (bb));
   add_range_and_copies_from_move_list
(at_bb_start[bb->index], node, live_through, REG_FREQ_FROM_BB (bb));
-  bitmap_copy (live_through, df_get_live_out (bb));
+  bitmap_copy (live_through, df_get_subreg_live_out (bb));
   add_range_and_copies_from_move_list
(at_bb_end[bb->index], node, live_through, REG_FREQ_FROM_BB (bb));
   FOR_EACH_EDGE (e, ei, bb->succs)
{
- bitmap_and (live_through,
- df_get_live_in (e->dest), df_get_live_out (bb));
+ bitmap_and (live_through, df_get_subreg_live_in (e->dest),
+ df_get_subreg_l

[SUBREG V3 1/4] DF: Add -ftrack-subreg-liveness option

2024-05-11 Thread Juzhe-Zhong

Add a new flag, -ftrack-subreg-liveness, to enable subreg liveness tracking.
The flag is enabled at -O3 and -Ofast.
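
How the default and the command line interact can be sketched as follows.
This is a hedged model, not GCC's option-handling code: `setting` and
`track_subreg_liveness_enabled` are hypothetical names, and the model
assumes the usual behaviour that an explicit -f or -fno- form overrides the
optimization-level default, with -Ofast counted as level 3.

```cpp
// Hypothetical model of the option's effective value; the names below are
// illustrative and are not GCC's option-handling API.
enum class setting { unset, off, on };

constexpr bool
track_subreg_liveness_enabled (int opt_level, setting explicit_flag)
{
  // An explicit -ftrack-subreg-liveness or -fno-track-subreg-liveness wins.
  if (explicit_flag != setting::unset)
    return explicit_flag == setting::on;
  // Otherwise the option starts at Init(0) and is raised to 1 by the
  // table entry for -O3 and above.
  return opt_level >= 3;
}
```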

Co-authored-by: Lehua Ding 

gcc/ChangeLog:

* common.opt: Add -ftrack-subreg-liveness option.
* common.opt.urls: Ditto.
* doc/invoke.texi: Ditto.
* opts.cc: Ditto.

---
 gcc/common.opt  | 4 
 gcc/common.opt.urls | 3 +++
 gcc/doc/invoke.texi | 8 
 gcc/opts.cc | 1 +
 4 files changed, 16 insertions(+)

diff --git a/gcc/common.opt b/gcc/common.opt
index 40cab3cb36a..5710e817abe 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2163,6 +2163,10 @@ fira-share-spill-slots
 Common Var(flag_ira_share_spill_slots) Init(1) Optimization
 Share stack slots for spilled pseudo-registers.
 
+ftrack-subreg-liveness
+Common Var(flag_track_subreg_liveness) Init(0) Optimization
+Track subreg liveness information.
+
 fira-verbose=
 Common RejectNegative Joined UInteger Var(flag_ira_verbose) Init(5)
 -fira-verbose= Control IRA's level of diagnostic messages.
diff --git a/gcc/common.opt.urls b/gcc/common.opt.urls
index f71ed80a34b..59f27a6f7c6 100644
--- a/gcc/common.opt.urls
+++ b/gcc/common.opt.urls
@@ -880,6 +880,9 @@ 
UrlSuffix(gcc/Optimize-Options.html#index-fira-share-save-slots)
 fira-share-spill-slots
 UrlSuffix(gcc/Optimize-Options.html#index-fira-share-spill-slots)
 
+ftrack-subreg-liveness
+UrlSuffix(gcc/Optimize-Options.html#index-ftrack-subreg-liveness)
+
 fira-verbose=
 UrlSuffix(gcc/Developer-Options.html#index-fira-verbose)
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ddcd5213f06..fbcde8aa745 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13188,6 +13188,14 @@ Disable sharing of stack slots allocated for 
pseudo-registers.  Each
 pseudo-register that does not get a hard register gets a separate
 stack slot, and as a result function stack frames are larger.
 
+@opindex ftrack-subreg-liveness
+@item -ftrack-subreg-liveness
+Enable tracking subreg liveness information. This information allows IRA
+and LRA to support subreg coalesce feature which can improve the quality
+of register allocation.
+
+This option is enabled at level @option{-O3} for all targets.
+
 @opindex flra-remat
 @item -flra-remat
 Enable CFG-sensitive rematerialization in LRA.  Instead of loading
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 600e0ea..50c0b62c5af 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -689,6 +689,7 @@ static const struct default_options default_options_table[] 
=
 { OPT_LEVELS_3_PLUS, OPT_funswitch_loops, NULL, 1 },
 { OPT_LEVELS_3_PLUS, OPT_fvect_cost_model_, NULL, VECT_COST_MODEL_DYNAMIC 
},
 { OPT_LEVELS_3_PLUS, OPT_fversion_loops_for_strides, NULL, 1 },
+{ OPT_LEVELS_3_PLUS, OPT_ftrack_subreg_liveness, NULL, 1 },
 
 /* -O3 parameters.  */
 { OPT_LEVELS_3_PLUS, OPT__param_max_inline_insns_auto_, NULL, 30 },
-- 
2.36.3

[PATCH] c++: Strengthen checks on 'main'

2024-05-11 Thread Nathaniel Shead

I wasn't entirely sure what to do with the 'abi/main.C' testcase here;
is this OK, or should I e.g. lower the linkage error to a pedwarn for
the purposes of this test?

Bootstrapped and regtested on x86_64-pc-linux-gnu, and lightly checked
with a MinGW32 cross.  OK for trunk?

-- >8 --

This patch adds some missing requirements for legal main declarations,
according to [basic.start.main] p2.
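
As a rough illustration of the cases involved (a sketch written for this
summary, not taken from the new tests), the declarations the patch is meant
to reject are shown as comments so the snippet still compiles. The `app`
and `n` namespaces and the module name `m` are made-up names.

```cpp
// Declarations the new checks are meant to diagnose, kept as comments:
//
//   extern "C" int main ();               // ::main with a linkage-specification
//   namespace n { extern "C" int main; }  // entity named main with C linkage
//   export module m;
//   int main ();                          // ::main attached to a named module
//
// The name main is otherwise not reserved, so a function with C++ linkage
// called main inside a namespace remains valid:
namespace app
{
  constexpr int main () { return 0; }
}
```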

gcc/cp/ChangeLog:

* decl.cc (grokfndecl): Check for main functions with language
linkage or module attachment.
(grokvardecl): Check for extern "C" entities named main.

gcc/testsuite/ChangeLog:

* g++.dg/abi/main.C: Rework to avoid declaring main with
language linkage.
* g++.dg/modules/contracts-1_b.C: Don't declare main in named
module.
* g++.dg/modules/contracts-3_b.C: Likewise.
* g++.dg/modules/contracts-4_d.C: Likewise.
* g++.dg/modules/horcrux-1_a.C: Export declarations, so that...
* g++.dg/modules/horcrux-1_b.C: Don't declare main in named
module.
* g++.dg/modules/main-1.C: New test.
* g++.dg/parse/linkage5.C: New test.
* g++.dg/parse/linkage6.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/decl.cc   | 19 ++---
 gcc/testsuite/g++.dg/abi/main.C  | 29 
 gcc/testsuite/g++.dg/modules/contracts-1_b.C |  4 ---
 gcc/testsuite/g++.dg/modules/contracts-3_b.C |  4 ---
 gcc/testsuite/g++.dg/modules/contracts-4_d.C |  2 --
 gcc/testsuite/g++.dg/modules/horcrux-1_a.C   |  3 ++
 gcc/testsuite/g++.dg/modules/horcrux-1_b.C   |  2 +-
 gcc/testsuite/g++.dg/modules/main-1.C|  5 
 gcc/testsuite/g++.dg/parse/linkage5.C| 14 ++
 gcc/testsuite/g++.dg/parse/linkage6.C| 13 +
 10 files changed, 63 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/main-1.C
 create mode 100644 gcc/testsuite/g++.dg/parse/linkage5.C
 create mode 100644 gcc/testsuite/g++.dg/parse/linkage6.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index e02562466a7..5ab3b787d6f 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -10781,6 +10781,11 @@ grokfndecl (tree ctype,
  "cannot declare %<::main%> to be %qs", "consteval");
   if (!publicp)
error_at (location, "cannot declare %<::main%> to be static");
+  if (current_lang_depth () != 0)
+   error_at (location, "cannot declare %<::main%> with a"
+ " linkage specification");
+  if (module_attach_p ())
+   error_at (location, "cannot attach %<::main%> to a named module");
   inlinep = 0;
   publicp = 1;
 }
@@ -11280,10 +11285,16 @@ grokvardecl (tree type,
 DECL_INTERFACE_KNOWN (decl) = 1;
 
   if (DECL_NAME (decl)
-  && MAIN_NAME_P (DECL_NAME (decl))
-  && scope == global_namespace)
-error_at (DECL_SOURCE_LOCATION (decl),
- "cannot declare %<::main%> to be a global variable");
+  && MAIN_NAME_P (DECL_NAME (decl)))
+{
+  if (scope == global_namespace)
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "cannot declare %<::main%> to be a global variable");
+  else if (DECL_EXTERN_C_P (decl))
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "an entity named %<main%> cannot be declared with "
+ "C language linkage");
+}
 
   /* Check that the variable can be safely declared as a concept.
  Note that this also forbids explicit specializations.  */
diff --git a/gcc/testsuite/g++.dg/abi/main.C b/gcc/testsuite/g++.dg/abi/main.C
index 4c5f1ea213c..f3882d70612 100644
--- a/gcc/testsuite/g++.dg/abi/main.C
+++ b/gcc/testsuite/g++.dg/abi/main.C
@@ -1,24 +1,19 @@
 /* { dg-do compile } */
 
-/* Check if entry points get implicit C linkage. If they don't, compiler will
- * error on incompatible declarations */
+/* Check if entry points get implicit C linkage. Determined by checking that
+   the names are unmangled */
 
-int main();
-extern "C" int main();
+int main() {}
 
 #ifdef __MINGW32__
-
-int wmain();
-extern "C" int wmain();
-
-int DllMain();
-extern "C" int DllMain();
-
-int WinMain();
-extern "C" int WinMain();
-
-int wWinMain();
-extern "C" int wWinMain();
-
+int wmain() {}
+int DllMain() {}
+int WinMain() {}
+int wWinMain() {}
 #endif
 
+// { dg-final { scan-assembler-not "_Z4mainv" } }
+// { dg-final { scan-assembler-not "_Z5wmainv" { target *-*-mingw32 } } }
+// { dg-final { scan-assembler-not "_Z7DllMainv" { target *-*-mingw32 } } }
+// { dg-final { scan-assembler-not "_Z7WinMainv" { target *-*-mingw32 } } }
+// { dg-final { scan-assembler-not "_Z8wWinMainv" { target *-*-mingw32 } } }
diff --git a/gcc/testsuite/g++.dg/modules/contracts-1_b.C 
b/gcc/testsuite/g++.dg/modules/contracts-1_b.C
index 30c15f6928b..aa36c8d6b1b 100644
--- a/gcc/testsuite/g++.dg/modules/contracts-1_b.C
+++ b/gcc/testsuite/g++.dg/modules/contracts-1_b.C
@@ -1,15 +1,11 @@
 // { dg-module-do run }
 // { dg-additional-optio

[PING] Re: [PATCH 0/13] rs6000, built-in cleanup patch series

2024-05-11 Thread Carl Love

Ping, just wondering if anyone has had a chance to look at the patch series.

Thanks.

  Carl  

On 4/19/24 14:04, Carl Love wrote:
> GCC maintainers:
> 
> The following patch series removes duplicate built-ins.  There are patches to 
> extend an existing overloaded built-in to cover additional input types.  The 
> final patch removes built-ins to set and initialize vectors.  The code 
> generated by these built-ins with the default optimization is efficient than 
> the code generated by using straight C code.  The assembly code for the 
> built-in and straight C code is the same with -O3
> optimizations.  In this case, the built-ins are removed as they add no 
> additional value.
> 
> The patches have all been tested on Power 10 LE.  The last patch was also 
> tested on Power 8 BE.
> 
> No regressions were seen.
> 
> Please let me know if the patches are acceptable for mainline.  Thanks.
> 
>Carl 
>

RE: [PATCH] tree-optimization/114760 - check variants of >> and << in loop-niter

2024-05-11 Thread Di Zhao OS

Fixed the problems and committed to trunk.

Thanks,
Di Zhao
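
For readers following the thread, the idiom variants named in the quoted
patch description can be sketched as below. This is an illustration only,
not the testcase from the patch; the function names and `prec` are chosen
here. For unsigned x, x * 2 and x << 1 agree because unsigned arithmetic
wraps, and x / 2 agrees with x >> 1. For signed values the division rounds
toward zero (-1 / 2 is 0), which is why the signed case is excluded.

```cpp
#include <limits>

constexpr int prec = std::numeric_limits<unsigned>::digits;

// Count trailing zeros by doubling until the value wraps to zero.
constexpr int
ntz_mul (unsigned x)
{
  int n = prec;
  while (x != 0)
    {
      n = n - 1;
      x = x * 2;
    }
  return n;
}

// Same loop written with an explicit shift.
constexpr int
ntz_shift (unsigned x)
{
  int n = prec;
  while (x != 0)
    {
      n = n - 1;
      x = x << 1;
    }
  return n;
}

// Count leading zeros by halving until the value reaches zero.
constexpr int
nlz_div (unsigned b)
{
  int c = prec;
  while (b != 0)
    {
      b /= 2;
      c--;
    }
  return c;
}

// Same loop written with an explicit shift.
constexpr int
nlz_shift (unsigned b)
{
  int c = prec;
  while (b != 0)
    {
      b >>= 1;
      c--;
    }
  return c;
}
```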

> -Original Message-
> From: Richard Biener 
> Sent: Friday, May 10, 2024 8:56 PM
> To: Di Zhao OS 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] tree-optimization/114760 - check variants of >> and << in
> loop-niter
> 
> On Fri, May 10, 2024 at 12:55 PM Di Zhao OS
>  wrote:
> >
> > This patch tries to fix pr114760 by checking for the
> > variants explicitly. When recognizing bit counting idiom,
> > include pattern "x * 2" for "x << 1", and "x / 2" for
> > "x >> 1" (given x is unsigned).
> >
> > Bootstrapped and tested on x86_64-linux-gnu.
> >
> > Thanks,
> > Di Zhao
> >
> > ---
> >
> > gcc/ChangeLog:
> > PR tree-optimization/114760
> > * tree-ssa-loop-niter.cc (is_lshift_by_1): New function
> > to check if STMT is equivalent to x << 1.
> > (is_rshift_by_1): New function to check if STMT is
> > equivalent to x >> 1.
> > (number_of_iterations_cltz): Enhance the identification
> > of logical shift by one.
> > (number_of_iterations_cltz_complement): Enhance the
> > identification of logical shift by one.
> >
> > gcc/testsuite/ChangeLog:
> > PR tree-optimization/114760
> > * gcc.dg/tree-ssa/pr114760-1.c: New test.
> > * gcc.dg/tree-ssa/pr114760-2.c: New test.
> > ---
> >  gcc/testsuite/gcc.dg/tree-ssa/pr114760-1.c | 69 ++
> >  gcc/testsuite/gcc.dg/tree-ssa/pr114760-2.c | 20 +++
> >  gcc/tree-ssa-loop-niter.cc | 56 +-
> >  3 files changed, 131 insertions(+), 14 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr114760-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr114760-2.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114760-1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/pr114760-1.c
> > new file mode 100644
> > index 000..9f10ccc3b51
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114760-1.c
> > @@ -0,0 +1,69 @@
> > +/* PR tree-optimization/114760 */
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target clz } */
> > +/* { dg-require-effective-target ctz } */
> > +/* { dg-options "-O3 -fdump-tree-optimized" } */
> > +
> > +unsigned
> > +ntz32_1 (unsigned x)
> > +{
> > +  int n = 32;
> > +  while (x != 0)
> > +{
> > +  n = n - 1;
> > +  x = x * 2;
> > +}
> > +  return n;
> > +}
> > +
> > +unsigned
> > +ntz32_2 (unsigned x)
> > +{
> > +  int n = 32;
> > +  while (x != 0)
> > +{
> > +  n = n - 1;
> > +  x = x + x;
> > +}
> > +  return n;
> > +}
> > +
> > +unsigned
> > +ntz32_3 (unsigned x)
> > +{
> > +  int n = 32;
> > +  while (x != 0)
> > +{
> > +  n = n - 1;
> > +  x = x << 1;
> > +}
> > +  return n;
> > +}
> > +
> > +#define PREC (__CHAR_BIT__ * __SIZEOF_INT__)
> > +int
> > +nlz32_1 (unsigned int b) {
> > +int c = PREC;
> > +
> > +while (b != 0) {
> > +   b >>= 1;
> > +   c --;
> > +}
> > +
> > +return c;
> > +}
> > +
> > +int
> > +nlz32_2 (unsigned int b) {
> > +int c = PREC;
> > +
> > +while (b != 0) {
> > +   b /= 2;
> > +   c --;
> > +}
> > +
> > +return c;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-times "__builtin_ctz|\\.CTZ" 3
> "optimized" } } */
> > +/* { dg-final { scan-tree-dump-times "__builtin_clz|\\.CLZ" 2
> "optimized" } } */
> > \ No newline at end of file
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr114760-2.c
> b/gcc/testsuite/gcc.dg/tree-ssa/pr114760-2.c
> > new file mode 100644
> > index 000..e1b4c4b1338
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr114760-2.c
> > @@ -0,0 +1,20 @@
> > +/* PR tree-optimization/114760 */
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target clz } */
> > +/* { dg-options "-O3 -fdump-tree-optimized" } */
> > +
> > +// Check that for signed type, there's no CLZ.
> > +#define PREC (__CHAR_BIT__ * __SIZEOF_INT__)
> > +int
> > +no_nlz32 (int b) {
> > +int c = PREC;
> > +
> > +while (b != 0) {
> > +   b /= 2;
> > +   c --;
> > +}
> > +
> > +return c;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-not "__builtin_ctz|\\.CLZ" "optimized" } }
> */
> > \ No newline at end of file
> > diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
> > index 0fde07e626f..1d99264949b 100644
> > --- a/gcc/tree-ssa-loop-niter.cc
> > +++ b/gcc/tree-ssa-loop-niter.cc
> > @@ -2303,6 +2303,38 @@ build_cltz_expr (tree src, bool leading, bool
> define_at_zero)
> >return call;
> >  }
> >
> > +/* Returns true if STMT is equivalent to x << 1.  */
> > +
> > +static bool
> > +is_lshift_by_1 (gimple *stmt)
> 
> You are checking for gimple-assign before calling these so please
> use a 'gassign *' typed argument.
> 
> > +{
> > +  if (gimple_assign_rhs_code (stmt) == LSHIFT_EXPR
> > +  && integer_onep (gimple_assign_rhs2 (stmt)))
> > +return true;
> > +  if (gimple_assign_rhs_code (stmt) == MULT_EXPR
> > +  && TREE_CODE

Re: [PATCH] Adjust range type of calls into fold_range for IPA passes [PR114985]

2024-05-11 Thread Aldy Hernandez

I have pushed a few cleanups to make it easier to move forward without
disturbing passes which are affected by IPA's mixing up the range
types.  As I explained in my previous patch, this restores the default
behavior of silently returning VARYING when a range operator is
unsupported in either a particular operator, or in the dispatch code.

I would like to re-enable prange support, as IPA was already broken
before the prange work, and the debugging trap can be turned on for
analysis (#define TRAP_ON_UNHANDLED_POINTER_OPERATORS 1).

I have re-tested the effects of re-enabling prange in current trunk:

1. x86-64/32 bootstraps with no regressions with and without the trap.
2. ppc64le bootstraps with no regressions, but fails with the trap.
3. aarch64 bootstraps, but fails with the trap (no space on compile
farm to run tests)
4. sparc: bootstrap already broken, so I can't test.

So, for the above 4 architectures things work as before, and we have a
PR to track the IPA problem, which doesn't seem to affect either
bootstrap or tests.

Does this sound reasonable?

Aldy

On Fri, May 10, 2024 at 12:26 PM Richard Biener
 wrote:
>
> On Fri, May 10, 2024 at 11:24 AM Aldy Hernandez  wrote:
> >
> > There are various calls into fold_range() that have the wrong type
> > associated with the range temporary used to hold the result.  This
> > used to work, because we could store either integers or pointers in a
> > Value_Range, but is no longer the case with prange's.  Now you must
> > explicitly state which type of range the temporary will hold before
> > storing into it.  You can change this at a later time with set_type(),
> > but you must always have a type before using the temporary, and it
> > must match what fold_range() returns.
> >
> > This patch adjusts the IPA code to restore the previous functionality,
> > so I can re-enable the prange code, but I do question whether the
> > previous code was correct.  I have added appropriate comments to help
> > the maintainers, but someone with more knowledge should revamp this
> > going forward.
> >
> > The basic problem is that pointer comparisons return a boolean, but
> > the IPA code is initializing the resulting range as a pointer.  This
> > wasn't a problem, because fold_range() would previously happily force
> > the range into an integer one, and everything would work.  But now we
> > must initialize the range to an integer before calling into
> > fold_range.  The thing is, that the failing case sets the result back
> > into a pointer, which is just weird but existing behavior.  I have
> > documented this in the code.
> >
> >   if (!handler
> >   || !op_res.supports_type_p (vr_type)
> >   || !handler.fold_range (op_res, vr_type, srcvr, op_vr))
> > /* For comparison operators, the type here may be
> >different than the range type used in fold_range above.
> >For example, vr_type may be a pointer, whereas the type
> >returned by fold_range will always be a boolean.
> >
> >This shouldn't cause any problems, as the set_varying
> >below will happily change the type of the range in
> >op_res, and then the cast operation in
> >ipa_vr_operation_and_type_effects will ultimately leave
> >things in the desired type, but it is confusing.
> >
> >Perhaps the original intent was to use the type of
> >op_res here?  */
> > op_res.set_varying (vr_type);
> >
> > BTW, this is not to say that the original gimple IR was wrong, but that
> > IPA is setting the range type of the result of fold_range() to the type of
> > the operands, which does not necessarily match in the case of a
> > comparison.
> >
> > I am just restoring previous behavior here, but I do question whether it
> > was right to begin with.
> >
> > Testing currently in progress on x86-64 and ppc64le with prange enabled.
> >
> > OK pending tests?
>
> I think this "intermediate" patch is unnecessary and instead the code should
> be fixed correctly, avoiding missed-optimization regressions.
>
> Richard.
>
> > gcc/ChangeLog:
> >
> > PR tree-optimization/114985
> > * ipa-cp.cc (ipa_value_range_from_jfunc): Adjust type of op_res.
> > (propagate_vr_across_jump_function): Same.
> > * ipa-fnsummary.cc (evaluate_conditions_for_known_args): Adjust
> > type for res.
> > * ipa-prop.h (ipa_type_for_fold_range): New.
> > ---
> >  gcc/ipa-cp.cc| 18 --
> >  gcc/ipa-fnsummary.cc |  6 +-
> >  gcc/ipa-prop.h   | 13 +
> >  3 files changed, 34 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
> > index 5781f50c854..3c395632364 100644
> > --- a/gcc/ipa-cp.cc
> > +++ b/gcc/ipa-cp.cc
> > @@ -1730,7 +1730,7 @@ ipa_value_range_from_jfunc (vrange &vr,
> > }
> >else
> > {
> > - Value_Range op_

Re: [COMMITTED] [prange] Do not trap by default on range dispatch mismatches.

2024-05-11 Thread Aldy Hernandez

For the record, we have always returned false (VARYING) for
unsupported range operators.  This patch just restores the behavior
we've always had, while adding a knob for further analysis (for
example, of IPA, which is getting its range types mixed up).

Aldy

On Sat, May 11, 2024 at 11:28 AM Aldy Hernandez  wrote:
>
> The trap in the range-op dispatch code is really an internal debugging
> aid, and only a temporary one for a few weeks while the dust settles.
> This patch turns it off by default, allowing problematic passes to
> turn it on for analysis.
>
> gcc/ChangeLog:
>
> * range-op.cc (TRAP_ON_UNHANDLED_POINTER_OPERATORS): New
> (range_op_handler::fold_range): Use it.
> (range_op_handler::op1_range): Same.
> (range_op_handler::op2_range): Same.
> (range_op_handler::lhs_op1_relation): Same.
> (range_op_handler::lhs_op2_relation): Same.
> (range_op_handler::op1_op2_relation): Same.
> ---
>  gcc/range-op.cc | 23 +--
>  1 file changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/range-op.cc b/gcc/range-op.cc
> index a134af68141..6a410ff656c 100644
> --- a/gcc/range-op.cc
> +++ b/gcc/range-op.cc
> @@ -49,6 +49,11 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssa-ccp.h"
>  #include "range-op-mixed.h"
>
> +// Set to 1 to trap on range-op entries that cannot handle the pointer
> +// combination being requested.  This is a temporary sanity check to
> +// aid in debugging, and will be removed later in the release cycle.
> +#define TRAP_ON_UNHANDLED_POINTER_OPERATORS 0
> +
>  // Instantiate the operators which apply to multiple types here.
>
>  operator_equal op_equal;
> @@ -233,7 +238,8 @@ range_op_handler::fold_range (vrange &r, tree type,
>  #if CHECKING_P
>if (!lh.undefined_p () && !rh.undefined_p ())
>  gcc_assert (m_operator->operand_check_p (type, lh.type (), rh.type ()));
> -  if (has_pointer_operand_p (r, lh, rh)
> +  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
> +  && has_pointer_operand_p (r, lh, rh)
>&& !m_operator->pointers_handled_p (DISPATCH_FOLD_RANGE,
>   dispatch_kind (r, lh, rh)))
>  discriminator_fail (r, lh, rh);
> @@ -299,7 +305,8 @@ range_op_handler::op1_range (vrange &r, tree type,
>  #if CHECKING_P
>if (!op2.undefined_p ())
>  gcc_assert (m_operator->operand_check_p (lhs.type (), type, op2.type ()));
> -  if (has_pointer_operand_p (r, lhs, op2)
> +  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
> +  && has_pointer_operand_p (r, lhs, op2)
>&& !m_operator->pointers_handled_p (DISPATCH_OP1_RANGE,
>   dispatch_kind (r, lhs, op2)))
>  discriminator_fail (r, lhs, op2);
> @@ -353,7 +360,8 @@ range_op_handler::op2_range (vrange &r, tree type,
>  #if CHECKING_P
>if (!op1.undefined_p ())
>  gcc_assert (m_operator->operand_check_p (lhs.type (), op1.type (), type));
> -  if (has_pointer_operand_p (r, lhs, op1)
> +  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
> +  && has_pointer_operand_p (r, lhs, op1)
>&& !m_operator->pointers_handled_p (DISPATCH_OP2_RANGE,
>   dispatch_kind (r, lhs, op1)))
>  discriminator_fail (r, lhs, op1);
> @@ -395,7 +403,8 @@ range_op_handler::lhs_op1_relation (const vrange &lhs,
>  {
>gcc_checking_assert (m_operator);
>  #if CHECKING_P
> -  if (has_pointer_operand_p (lhs, op1, op2)
> +  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
> +  && has_pointer_operand_p (lhs, op1, op2)
>&& !m_operator->pointers_handled_p (DISPATCH_LHS_OP1_RELATION,
>   dispatch_kind (lhs, op1, op2)))
>  discriminator_fail (lhs, op1, op2);
> @@ -442,7 +451,8 @@ range_op_handler::lhs_op2_relation (const vrange &lhs,
>  {
>gcc_checking_assert (m_operator);
>  #if CHECKING_P
> -  if (has_pointer_operand_p (lhs, op1, op2)
> +  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
> +  && has_pointer_operand_p (lhs, op1, op2)
>&& !m_operator->pointers_handled_p (DISPATCH_LHS_OP2_RELATION,
>   dispatch_kind (lhs, op1, op2)))
>  discriminator_fail (lhs, op1, op2);
> @@ -475,7 +485,8 @@ range_op_handler::op1_op2_relation (const vrange &lhs,
>  {
>gcc_checking_assert (m_operator);
>  #if CHECKING_P
> -  if (has_pointer_operand_p (lhs, op1, op2)
> +  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
> +  && has_pointer_operand_p (lhs, op1, op2)
>&& !m_operator->pointers_handled_p (DISPATCH_OP1_OP2_RELATION,
>   dispatch_kind (lhs, op1, op2)))
>  discriminator_fail (lhs, op1, op2);
> --
> 2.45.0
>

[COMMITTED] [prange] Default unimplemented prange operators to false.

2024-05-11 Thread Aldy Hernandez

The canonical way to indicate that a range operator is unsupported is
to return false, which has the semantic meaning of VARYING.  This patch
cleans up a few default virtuals that were trying harder to set
VARYING manually.

gcc/ChangeLog:

* range-op-ptr.cc (range_operator::fold_range): Return false.
---
 gcc/range-op-ptr.cc | 55 +
 1 file changed, 15 insertions(+), 40 deletions(-)

diff --git a/gcc/range-op-ptr.cc b/gcc/range-op-ptr.cc
index 466edc6bf74..65cca65103a 100644
--- a/gcc/range-op-ptr.cc
+++ b/gcc/range-op-ptr.cc
@@ -62,63 +62,38 @@ range_operator::pointers_handled_p (range_op_dispatch_type ATTRIBUTE_UNUSED,
 }
 
 bool
-range_operator::fold_range (prange &r, tree type,
-   const prange &op1,
-   const prange &op2,
-   relation_trio trio) const
+range_operator::fold_range (prange &, tree, const prange &, const prange &,
+   relation_trio) const
 {
-  relation_kind rel = trio.op1_op2 ();
-  r.set_varying (type);
-  op1_op2_relation_effect (r, type, op1, op2, rel);
-  return true;
+  return false;
 }
 
 bool
-range_operator::fold_range (prange &r, tree type,
-   const prange &op1,
-   const irange &op2,
-   relation_trio trio) const
+range_operator::fold_range (prange &, tree, const prange &, const irange &,
+   relation_trio) const
 {
-  relation_kind rel = trio.op1_op2 ();
-  r.set_varying (type);
-  op1_op2_relation_effect (r, type, op1, op2, rel);
-  return true;
+  return false;
 }
 
 bool
-range_operator::fold_range (irange &r, tree type,
-   const prange &op1,
-   const prange &op2,
-   relation_trio trio) const
+range_operator::fold_range (irange &, tree, const prange &, const prange &,
+   relation_trio) const
 {
-  relation_kind rel = trio.op1_op2 ();
-  r.set_varying (type);
-  op1_op2_relation_effect (r, type, op1, op2, rel);
-  return true;
+  return false;
 }
 
 bool
-range_operator::fold_range (prange &r, tree type,
-   const irange &op1,
-   const prange &op2,
-   relation_trio trio) const
+range_operator::fold_range (prange &, tree, const irange &, const prange &,
+   relation_trio) const
 {
-  relation_kind rel = trio.op1_op2 ();
-  r.set_varying (type);
-  op1_op2_relation_effect (r, type, op1, op2, rel);
-  return true;
+  return false;
 }
 
 bool
-range_operator::fold_range (irange &r, tree type,
-   const prange &op1,
-   const irange &op2,
-   relation_trio trio) const
+range_operator::fold_range (irange &, tree, const prange &, const irange &,
+   relation_trio) const
 {
-  relation_kind rel = trio.op1_op2 ();
-  r.set_varying (type);
-  op1_op2_relation_effect (r, type, op1, op2, rel);
-  return true;
+  return false;
 }
 
 bool
-- 
2.45.0
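
The convention this patch relies on, that a false return from a fold routine is read by the caller as VARYING, can be sketched in isolation. Everything below (`ToyRange`, `toy_varying`, `toy_unimplemented_fold`, `toy_fold_or_varying`) is hypothetical and is not GCC's vrange/prange API; it only models the control flow, assuming C++14 or later.

```cpp
#include <climits>

// Hypothetical stand-in for a range; not GCC's vrange or prange.
struct ToyRange
{
  long lo;
  long hi;
};

// The widest possible range plays the role of VARYING.
constexpr ToyRange toy_varying () { return ToyRange{LONG_MIN, LONG_MAX}; }

// A default "operator" that does not implement the fold: it leaves the
// result untouched and reports failure, as the patched defaults do.
constexpr bool
toy_unimplemented_fold (ToyRange &, const ToyRange &, const ToyRange &)
{
  return false;
}

// Caller-side convention: a false return is treated as VARYING.
constexpr ToyRange
toy_fold_or_varying (const ToyRange &op1, const ToyRange &op2)
{
  ToyRange r{0, 0};
  if (!toy_unimplemented_fold (r, op1, op2))
    r = toy_varying ();
  return r;
}
```

In this toy model the callee no longer has to know how to build a VARYING value for the result type; that knowledge lives in one place on the caller side.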

[COMMITTED] [prange] Do not trap by default on range dispatch mismatches.

2024-05-11 Thread Aldy Hernandez

The trap in the range-op dispatch code is really an internal debugging
aid, and only a temporary one for a few weeks while the dust settles.
This patch turns it off by default, allowing problematic passes to
turn it on for analysis.

gcc/ChangeLog:

* range-op.cc (TRAP_ON_UNHANDLED_POINTER_OPERATORS): New
(range_op_handler::fold_range): Use it.
(range_op_handler::op1_range): Same.
(range_op_handler::op2_range): Same.
(range_op_handler::lhs_op1_relation): Same.
(range_op_handler::lhs_op2_relation): Same.
(range_op_handler::op1_op2_relation): Same.
---
 gcc/range-op.cc | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index a134af68141..6a410ff656c 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -49,6 +49,11 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-ccp.h"
 #include "range-op-mixed.h"
 
+// Set to 1 to trap on range-op entries that cannot handle the pointer
+// combination being requested.  This is a temporary sanity check to
+// aid in debugging, and will be removed later in the release cycle.
+#define TRAP_ON_UNHANDLED_POINTER_OPERATORS 0
+
 // Instantiate the operators which apply to multiple types here.
 
 operator_equal op_equal;
@@ -233,7 +238,8 @@ range_op_handler::fold_range (vrange &r, tree type,
 #if CHECKING_P
   if (!lh.undefined_p () && !rh.undefined_p ())
 gcc_assert (m_operator->operand_check_p (type, lh.type (), rh.type ()));
-  if (has_pointer_operand_p (r, lh, rh)
+  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
+  && has_pointer_operand_p (r, lh, rh)
   && !m_operator->pointers_handled_p (DISPATCH_FOLD_RANGE,
  dispatch_kind (r, lh, rh)))
 discriminator_fail (r, lh, rh);
@@ -299,7 +305,8 @@ range_op_handler::op1_range (vrange &r, tree type,
 #if CHECKING_P
   if (!op2.undefined_p ())
 gcc_assert (m_operator->operand_check_p (lhs.type (), type, op2.type ()));
-  if (has_pointer_operand_p (r, lhs, op2)
+  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
+  && has_pointer_operand_p (r, lhs, op2)
   && !m_operator->pointers_handled_p (DISPATCH_OP1_RANGE,
  dispatch_kind (r, lhs, op2)))
 discriminator_fail (r, lhs, op2);
@@ -353,7 +360,8 @@ range_op_handler::op2_range (vrange &r, tree type,
 #if CHECKING_P
   if (!op1.undefined_p ())
 gcc_assert (m_operator->operand_check_p (lhs.type (), op1.type (), type));
-  if (has_pointer_operand_p (r, lhs, op1)
+  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
+  && has_pointer_operand_p (r, lhs, op1)
   && !m_operator->pointers_handled_p (DISPATCH_OP2_RANGE,
  dispatch_kind (r, lhs, op1)))
 discriminator_fail (r, lhs, op1);
@@ -395,7 +403,8 @@ range_op_handler::lhs_op1_relation (const vrange &lhs,
 {
   gcc_checking_assert (m_operator);
 #if CHECKING_P
-  if (has_pointer_operand_p (lhs, op1, op2)
+  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
+  && has_pointer_operand_p (lhs, op1, op2)
   && !m_operator->pointers_handled_p (DISPATCH_LHS_OP1_RELATION,
  dispatch_kind (lhs, op1, op2)))
 discriminator_fail (lhs, op1, op2);
@@ -442,7 +451,8 @@ range_op_handler::lhs_op2_relation (const vrange &lhs,
 {
   gcc_checking_assert (m_operator);
 #if CHECKING_P
-  if (has_pointer_operand_p (lhs, op1, op2)
+  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
+  && has_pointer_operand_p (lhs, op1, op2)
   && !m_operator->pointers_handled_p (DISPATCH_LHS_OP2_RELATION,
  dispatch_kind (lhs, op1, op2)))
 discriminator_fail (lhs, op1, op2);
@@ -475,7 +485,8 @@ range_op_handler::op1_op2_relation (const vrange &lhs,
 {
   gcc_checking_assert (m_operator);
 #if CHECKING_P
-  if (has_pointer_operand_p (lhs, op1, op2)
+  if (TRAP_ON_UNHANDLED_POINTER_OPERATORS
+  && has_pointer_operand_p (lhs, op1, op2)
   && !m_operator->pointers_handled_p (DISPATCH_OP1_OP2_RELATION,
  dispatch_kind (lhs, op1, op2)))
 discriminator_fail (lhs, op1, op2);
-- 
2.45.0
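
The shape of the guard added above, a constant macro placed first in the condition, can be shown with a small stand-alone sketch. The names `TOY_TRAP_ON_UNHANDLED` and `toy_should_trap` are hypothetical and only mirror the pattern; they are not part of range-op.cc.

```cpp
// Hypothetical knob mirroring TRAP_ON_UNHANDLED_POINTER_OPERATORS.
#define TOY_TRAP_ON_UNHANDLED 0

// Returns true when the sanity check would fire.
constexpr bool
toy_should_trap (bool has_pointer_operand, bool handled)
{
  // With the macro set to 0 the whole condition folds to false, yet the
  // remaining operands are still parsed and type-checked by the compiler,
  // so the disabled check cannot silently bit-rot.
  return TOY_TRAP_ON_UNHANDLED && has_pointer_operand && !handled;
}
```

Compared with wrapping the check in `#if`, this keeps the disabled code compiled, at the cost of relying on the optimizer to drop it.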

[PATCH v26 10/13] libstdc++: Optimize std::decay compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of std::decay
by dispatching to the new __decay built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (decay): Use __decay built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 824cad90a25..4cc587d4e08 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2316,6 +2316,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// @cond undocumented
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__decay)
+  template<typename _Tp>
+    struct decay
+    { using type = __decay(_Tp); };
+#else
   // Decay trait for arrays and functions, used for perfect forwarding
   // in make_pair, make_tuple, etc.
   template
@@ -2347,6 +2352,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct decay<_Tp&&>
     { using type = typename __decay_selector<_Tp>::type; };
+#endif
 
   /// @cond undocumented
 
-- 
2.44.0
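
Whichever branch of the `#if` is selected, the observable behavior of std::decay has to stay the same. A few compile-time checks of that behavior, written as a sketch against the standard trait only (nothing here depends on the `__decay` built-in being available):

```cpp
#include <type_traits>

// Arrays decay to pointers to their element type.
static_assert (std::is_same<std::decay<int[3]>::type, int *>::value,
	       "array to pointer");
// Function types decay to function pointers.
static_assert (std::is_same<std::decay<int (char)>::type,
			    int (*) (char)>::value,
	       "function to pointer");
// References and top-level cv-qualifiers are stripped.
static_assert (std::is_same<std::decay<const int &>::type, int>::value,
	       "cv and reference removed");
// An rvalue reference to an array behaves like the array itself.
static_assert (std::is_same<std::decay<int (&&)[2]>::type, int *>::value,
	       "rvalue reference to array");
```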

[PATCH v26 09/13] libstdc++: Optimize std::add_rvalue_reference compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of
std::add_rvalue_reference by dispatching to the new
__add_rvalue_reference built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (add_rvalue_reference): Use
__add_rvalue_reference built-in trait.
(__add_rvalue_reference_helper): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index effa3fbcb75..824cad90a25 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1185,6 +1185,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// @cond undocumented
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_rvalue_reference)
+  template<typename _Tp>
+    struct __add_rvalue_reference_helper
+    { using type = __add_rvalue_reference(_Tp); };
+#else
   template<typename _Tp, typename = void>
     struct __add_rvalue_reference_helper
     { using type = _Tp; };
@@ -1192,6 +1197,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct __add_rvalue_reference_helper<_Tp, __void_t<_Tp&&>>
     { using type = _Tp&&; };
+#endif
 
   template<typename _Tp>
     using __add_rval_ref_t = typename __add_rvalue_reference_helper<_Tp>::type;
@@ -1748,9 +1754,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// add_rvalue_reference
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_rvalue_reference)
+  template<typename _Tp>
+    struct add_rvalue_reference
+    { using type = __add_rvalue_reference(_Tp); };
+#else
   template<typename _Tp>
     struct add_rvalue_reference
     { using type = __add_rval_ref_t<_Tp>; };
+#endif
 
 #if __cplusplus > 201103L
   /// Alias template for remove_reference
-- 
2.44.0
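
The cases that the fallback handles through `__void_t<_Tp&&>` are the ones worth keeping in mind when reading this patch: non-referenceable types are left alone, and reference collapsing applies. A short sketch using only the standard trait:

```cpp
#include <type_traits>

// Object types gain an rvalue reference.
static_assert (std::is_same<std::add_rvalue_reference<int>::type,
			    int &&>::value,
	       "object type");
// Reference collapsing: T& && is T&.
static_assert (std::is_same<std::add_rvalue_reference<int &>::type,
			    int &>::value,
	       "lvalue reference is preserved");
// void is not referenceable, so it is returned unchanged.
static_assert (std::is_same<std::add_rvalue_reference<void>::type,
			    void>::value,
	       "void is unchanged");
```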

[PATCH v26 06/13] libstdc++: Optimize std::remove_extent compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of std::remove_extent
by dispatching to the new __remove_extent built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (remove_extent): Use __remove_extent
built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 6313fde94c2..58281c2e38c 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2092,6 +2092,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Array modifications.
 
   /// remove_extent
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__remove_extent)
+  template<typename _Tp>
+    struct remove_extent
+    { using type = __remove_extent(_Tp); };
+#else
   template<typename _Tp>
     struct remove_extent
     { using type = _Tp; };
@@ -2103,6 +2108,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct remove_extent<_Tp[]>
     { using type = _Tp; };
+#endif
 
   /// remove_all_extents
   template<typename _Tp>
-- 
2.44.0

[PATCH v26 07/13] libstdc++: Optimize std::remove_all_extents compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of
std::remove_all_extents by dispatching to the new
__remove_all_extents built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (remove_all_extents): Use
__remove_all_extents built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 58281c2e38c..5b74e44d0a6 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2111,6 +2111,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// remove_all_extents
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__remove_all_extents)
+  template<typename _Tp>
+    struct remove_all_extents
+    { using type = __remove_all_extents(_Tp); };
+#else
   template<typename _Tp>
     struct remove_all_extents
     { using type = _Tp; };
@@ -2122,6 +2127,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct remove_all_extents<_Tp[]>
     { using type = typename remove_all_extents<_Tp>::type; };
+#endif
 
 #if __cplusplus > 201103L
   /// Alias template for remove_extent
-- 
2.44.0
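
The difference between the two array traits touched by patches 06/13 and 07/13 is easiest to see on a multidimensional array: remove_extent peels one dimension, while remove_all_extents recurses to the element type. A sketch using only the standard traits:

```cpp
#include <type_traits>

// remove_extent strips only the outermost dimension.
static_assert (std::is_same<std::remove_extent<int[2][3]>::type,
			    int[3]>::value,
	       "one dimension removed");
// remove_all_extents strips every dimension.
static_assert (std::is_same<std::remove_all_extents<int[2][3]>::type,
			    int>::value,
	       "all dimensions removed");
// Arrays of unknown bound are handled as well.
static_assert (std::is_same<std::remove_extent<int[]>::type, int>::value,
	       "unknown bound");
// Non-array types pass through unchanged.
static_assert (std::is_same<std::remove_all_extents<int *>::type,
			    int *>::value,
	       "non-array unchanged");
```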

[PATCH v26 05/13] libstdc++: Optimize std::add_pointer compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of std::add_pointer
by dispatching to the new __add_pointer built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (add_pointer): Use __add_pointer
built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 1562757f886..6313fde94c2 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2149,6 +2149,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { };
 #endif
 
+  /// add_pointer
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_pointer)
+  template<typename _Tp>
+    struct add_pointer
+    { using type = __add_pointer(_Tp); };
+#else
   template<typename _Tp, typename = void>
     struct __add_pointer_helper
     { using type = _Tp; };
@@ -2157,7 +2163,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     struct __add_pointer_helper<_Tp, __void_t<_Tp*>>
     { using type = _Tp*; };
 
-  /// add_pointer
   template<typename _Tp>
     struct add_pointer
     : public __add_pointer_helper<_Tp>
@@ -2170,6 +2175,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct add_pointer<_Tp&&>
     { using type = _Tp*; };
+#endif
 
 #if __cplusplus > 201103L
   /// Alias template for remove_pointer
-- 
2.44.0
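
The fallback above needs separate specializations for references because a pointer to a reference cannot be formed; add_pointer removes the reference first. A few checks of the standard trait's results, as a sketch that does not depend on the `__add_pointer` built-in:

```cpp
#include <type_traits>

// Object types gain a pointer.
static_assert (std::is_same<std::add_pointer<int>::type, int *>::value,
	       "object type");
// References are removed before the pointer is added.
static_assert (std::is_same<std::add_pointer<int &>::type, int *>::value,
	       "lvalue reference");
static_assert (std::is_same<std::add_pointer<int &&>::type, int *>::value,
	       "rvalue reference");
// void and array types are handled like any other type.
static_assert (std::is_same<std::add_pointer<void>::type, void *>::value,
	       "void");
static_assert (std::is_same<std::add_pointer<int[3]>::type,
			    int (*)[3]>::value,
	       "pointer to array");
```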

[PATCH v26 04/13] libstdc++: Optimize std::is_unbounded_array compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of
std::is_unbounded_array by dispatching to the new
__is_unbounded_array built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_unbounded_array_v): Use
__is_unbounded_array built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index ea013b4b7bc..1562757f886 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3710,11 +3710,16 @@ template
   /// True for a type that is an array of unknown bound.
   /// @ingroup variable_templates
   /// @since C++20
+# if _GLIBCXX_USE_BUILTIN_TRAIT(__is_unbounded_array)
+  template<typename _Tp>
+    inline constexpr bool is_unbounded_array_v = __is_unbounded_array(_Tp);
+# else
   template<typename _Tp>
     inline constexpr bool is_unbounded_array_v = false;
 
   template<typename _Tp>
     inline constexpr bool is_unbounded_array_v<_Tp[]> = true;
+# endif
 
   /// True for a type that is an array of known bound.
   /// @since C++20
-- 
2.44.0

[PATCH v26 03/13] libstdc++: Optimize std::is_pointer compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of std::is_pointer
by dispatching to the new __is_pointer built-in trait.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_pointer): Use
__is_pointer built-in trait.  Optimize its implementation.
* include/std/type_traits (is_pointer): Likewise.
(is_pointer_v): Likewise.

Co-authored-by: Jonathan Wakely 
Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/bits/cpp_type_traits.h | 31 ++-
 libstdc++-v3/include/std/type_traits| 44 +
 2 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/libstdc++-v3/include/bits/cpp_type_traits.h b/libstdc++-v3/include/bits/cpp_type_traits.h
index 59f1a1875eb..210a9ea00da 100644
--- a/libstdc++-v3/include/bits/cpp_type_traits.h
+++ b/libstdc++-v3/include/bits/cpp_type_traits.h
@@ -363,6 +363,13 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   //
   // Pointer types
   //
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_pointer)
+  template<typename _Tp, bool _IsPtr = __is_pointer(_Tp)>
+    struct __is_pointer : __truth_type<_IsPtr>
+    {
+      enum { __value = _IsPtr };
+    };
+#else
   template<typename _Tp>
     struct __is_pointer
     {
@@ -377,6 +384,28 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   typedef __true_type __type;
 };
 
+  template<typename _Tp>
+    struct __is_pointer<_Tp* const>
+    {
+      enum { __value = 1 };
+      typedef __true_type __type;
+    };
+
+  template<typename _Tp>
+    struct __is_pointer<_Tp* volatile>
+    {
+      enum { __value = 1 };
+      typedef __true_type __type;
+    };
+
+  template<typename _Tp>
+    struct __is_pointer<_Tp* const volatile>
+    {
+      enum { __value = 1 };
+      typedef __true_type __type;
+    };
+#endif
+
   //
   // An arithmetic type is an integer type or a floating point type
   //
@@ -387,7 +416,7 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
 
   //
   // A scalar type is an arithmetic type or a pointer type
-  // 
+  //
   template<typename _Tp>
     struct __is_scalar
     : public __traitor<__is_arithmetic<_Tp>, __is_pointer<_Tp> >
diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 748fa186881..ea013b4b7bc 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -542,19 +542,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public true_type { };
 #endif
 
-  template<typename>
-    struct __is_pointer_helper
+  /// is_pointer
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_pointer)
+  template<typename _Tp>
+    struct is_pointer
+    : public __bool_constant<__is_pointer(_Tp)>
+    { };
+#else
+  template<typename _Tp>
+    struct is_pointer
     : public false_type { };
 
   template<typename _Tp>
-    struct __is_pointer_helper<_Tp*>
+    struct is_pointer<_Tp*>
     : public true_type { };
 
-  /// is_pointer
   template<typename _Tp>
-    struct is_pointer
-    : public __is_pointer_helper<__remove_cv_t<_Tp>>::type
-    { };
+    struct is_pointer<_Tp* const>
+    : public true_type { };
+
+  template<typename _Tp>
+    struct is_pointer<_Tp* volatile>
+    : public true_type { };
+
+  template<typename _Tp>
+    struct is_pointer<_Tp* const volatile>
+    : public true_type { };
+#endif
 
   /// is_lvalue_reference
   template
@@ -3268,8 +3282,22 @@ template 
   inline constexpr bool is_array_v<_Tp[_Num]> = true;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_pointer)
+template <typename _Tp>
+  inline constexpr bool is_pointer_v = __is_pointer(_Tp);
+#else
 template <typename _Tp>
-  inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
+  inline constexpr bool is_pointer_v = false;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp*> = true;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp* const> = true;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp* volatile> = true;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
+#endif
+
 template <typename _Tp>
   inline constexpr bool is_lvalue_reference_v = false;
 template <typename _Tp>
-- 
2.44.0
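
The reworked fallback spells out one specialization per cv-qualification instead of going through `__remove_cv_t`, so cv-qualified pointers must still be recognized. The checks below sketch the expected results for the standard trait; `Widget` is a hypothetical class used only to form a pointer-to-member type.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical class, present only so that a member pointer can be named.
struct Widget { int field; };

static_assert (std::is_pointer<int *>::value, "plain pointer");
static_assert (std::is_pointer<int *const volatile>::value,
	       "cv-qualified pointer");
static_assert (std::is_pointer<void (*) (int)>::value, "function pointer");
static_assert (!std::is_pointer<int &>::value, "reference");
static_assert (!std::is_pointer<std::nullptr_t>::value,
	       "nullptr_t is not a pointer type");
static_assert (!std::is_pointer<int Widget::*>::value,
	       "pointer to member is not a pointer");
```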

[PATCH v26 01/13] libstdc++: Optimize std::is_const compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of std::is_const
by dispatching to the new __is_const built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_const): Use __is_const built-in
trait.
(is_const_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index b441bf9908f..8df0cf3ac3b 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -835,6 +835,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Type properties.
 
   /// is_const
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
+  template<typename _Tp>
+    struct is_const
+    : public __bool_constant<__is_const(_Tp)>
+    { };
+#else
   template<typename>
     struct is_const
     : public false_type { };
@@ -842,6 +848,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct is_const<_Tp const>
     : public true_type { };
+#endif
 
   /// is_volatile
   template
@@ -3331,10 +3338,15 @@ template 
   inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
+template <typename _Tp>
+  inline constexpr bool is_const_v = __is_const(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_const_v = false;
 template <typename _Tp>
   inline constexpr bool is_const_v<const _Tp> = true;
+#endif
 
 #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_function)
 template 
-- 
2.44.0
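
Both the built-in path and the `_Tp const` specialization look only at the top-level qualifier, which is the detail most often misread. A sketch of the expected results for the standard trait:

```cpp
#include <type_traits>

static_assert (std::is_const<const int>::value, "top-level const");
// The pointer itself is const here.
static_assert (std::is_const<int *const>::value, "const pointer");
// Here only the pointee is const; the pointer is not.
static_assert (!std::is_const<const int *>::value, "pointer to const");
// References are never const-qualified themselves.
static_assert (!std::is_const<const int &>::value, "reference to const");
```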

[PATCH v26 13/13] libstdc++: Optimize std::is_nothrow_invocable compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of
std::is_nothrow_invocable by dispatching to the new
__is_nothrow_invocable built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_nothrow_invocable): Use
__is_nothrow_invocable built-in trait.
* testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc:
Handle the new error from __is_nothrow_invocable.
* testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc:
Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits  | 4 
 .../20_util/is_nothrow_invocable/incomplete_args_neg.cc   | 1 +
 .../testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc  | 1 +
 3 files changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 502032787bd..a046560c178 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3269,8 +3269,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// std::is_nothrow_invocable
   template<typename _Fn, typename... _ArgTypes>
 struct is_nothrow_invocable
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_nothrow_invocable)
+: public __bool_constant<__is_nothrow_invocable(_Fn, _ArgTypes...)>
+#else
 : __and_<__is_invocable_impl<__invoke_result<_Fn, _ArgTypes...>, void>,
 __call_is_nothrow_<_Fn, _ArgTypes...>>::type
+#endif
 {
   static_assert(std::__is_complete_or_unbounded(__type_identity<_Fn>{}),
"_Fn must be a complete class or an unbounded array");
diff --git a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc
index 3c225883eaf..3f8542dd366 100644
--- a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_args_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-error "must be a complete class" "" { target *-*-* } 0 }
+// { dg-prune-output "invalid use of incomplete type" }
 
 #include 
 
diff --git a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc
index 5a728bfa03b..d3bdf08448b 100644
--- a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/incomplete_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-error "must be a complete class" "" { target *-*-* } 0 }
+// { dg-prune-output "invalid use of incomplete type" }
 
 #include 
 
-- 
2.44.0
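
What separates is_nothrow_invocable from is_invocable is only the noexcept status of the selected call. The sketch below assumes C++17; `MayThrow` and `NeverThrow` are hypothetical callables declared for illustration and never actually called.

```cpp
#include <type_traits>

// Hypothetical callables; the call operators are declared but not defined
// because the traits only inspect the declarations.
struct MayThrow   { int operator() (int) const; };
struct NeverThrow { int operator() (int) const noexcept; };

// Callable, but the call is not known to be non-throwing.
static_assert (!std::is_nothrow_invocable<MayThrow, int>::value,
	       "potentially throwing call");
// Callable and declared noexcept.
static_assert (std::is_nothrow_invocable<NeverThrow, int>::value,
	       "noexcept call");
// Not callable at all implies not nothrow-callable.
static_assert (!std::is_nothrow_invocable<NeverThrow, int, int>::value,
	       "wrong arity");
```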

[PATCH v26 12/13] libstdc++: Optimize std::is_invocable compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of std::is_invocable
by dispatching to the new __is_invocable built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_invocable): Use __is_invocable
built-in trait.
* testsuite/20_util/is_invocable/incomplete_args_neg.cc: Handle
the new error from __is_invocable.
* testsuite/20_util/is_invocable/incomplete_neg.cc: Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits  | 4 
 .../testsuite/20_util/is_invocable/incomplete_args_neg.cc | 1 +
 libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc | 1 +
 3 files changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index e9313205550..502032787bd 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3239,7 +3239,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// std::is_invocable
   template<typename _Fn, typename... _ArgTypes>
 struct is_invocable
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_invocable)
+: public __bool_constant<__is_invocable(_Fn, _ArgTypes...)>
+#else
 : __is_invocable_impl<__invoke_result<_Fn, _ArgTypes...>, void>::type
+#endif
 {
   static_assert(std::__is_complete_or_unbounded(__type_identity<_Fn>{}),
"_Fn must be a complete class or an unbounded array");
diff --git a/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_args_neg.cc b/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_args_neg.cc
index a575750f9e9..9619129b817 100644
--- a/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_args_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_args_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-error "must be a complete class" "" { target *-*-* } 0 }
+// { dg-prune-output "invalid use of incomplete type" }
 
 #include 
 
diff --git a/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc b/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc
index 05848603555..b478ebce815 100644
--- a/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_invocable/incomplete_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-error "must be a complete class" "" { target *-*-* } 0 }
+// { dg-prune-output "invalid use of incomplete type" }
 
 #include 
 
-- 
2.44.0
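
For reference, is_invocable answers whether an INVOKE expression with the given argument types is well-formed, including implicit conversions of the arguments. A sketch assuming C++17; `Adder` and `free_function` are hypothetical names introduced only for this example.

```cpp
#include <type_traits>

// Hypothetical callables declared only so the trait has something to test.
struct Adder { int operator() (int, int) const; };
double free_function (double);

static_assert (std::is_invocable<Adder, int, int>::value, "exact match");
// Arguments may be implicitly converted (short to int).
static_assert (std::is_invocable<Adder, short, short>::value,
	       "convertible arguments");
static_assert (!std::is_invocable<Adder, int>::value, "too few arguments");
// Function pointers are callable too.
static_assert (std::is_invocable<decltype (&free_function), int>::value,
	       "function pointer");
```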

[PATCH v26 02/13] libstdc++: Optimize std::is_volatile compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of std::is_volatile
by dispatching to the new __is_volatile built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_volatile): Use __is_volatile
built-in trait.
(is_volatile_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 8df0cf3ac3b..748fa186881 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -851,6 +851,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// is_volatile
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_volatile)
+  template<typename _Tp>
+    struct is_volatile
+    : public __bool_constant<__is_volatile(_Tp)>
+    { };
+#else
   template<typename>
     struct is_volatile
     : public false_type { };
@@ -858,6 +864,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct is_volatile<_Tp volatile>
     : public true_type { };
+#endif
 
   /// is_trivial
   template
@@ -3360,10 +3367,15 @@ template 
   inline constexpr bool is_function_v<_Tp&&> = false;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_volatile)
+template <typename _Tp>
+  inline constexpr bool is_volatile_v = __is_volatile(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_volatile_v = false;
 template <typename _Tp>
   inline constexpr bool is_volatile_v<volatile _Tp> = true;
+#endif
 
 template <typename _Tp>
   inline constexpr bool is_trivial_v = __is_trivial(_Tp);
-- 
2.44.0
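
As with is_const, only the top-level qualifier counts for is_volatile. A brief sketch of the expected results for the standard trait, independent of whether the `__is_volatile` built-in is used:

```cpp
#include <type_traits>

static_assert (std::is_volatile<volatile int>::value, "top-level volatile");
static_assert (std::is_volatile<const volatile int>::value,
	       "const volatile");
// The pointer object itself is volatile.
static_assert (std::is_volatile<int *volatile>::value, "volatile pointer");
// Only the pointee is volatile here.
static_assert (!std::is_volatile<volatile int *>::value,
	       "pointer to volatile");
```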

[PATCH v26 11/13] libstdc++: Optimize std::rank compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of std::rank
by dispatching to the new __array_rank built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (rank): Use __array_rank built-in
trait.
(rank_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 4cc587d4e08..e9313205550 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1473,6 +1473,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// rank
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__array_rank)
+  template<typename _Tp>
+    struct rank
+    : public integral_constant<std::size_t, __array_rank(_Tp)> { };
+#else
   template<typename>
     struct rank
     : public integral_constant<std::size_t, 0> { };
@@ -1484,6 +1489,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct rank<_Tp[]>
     : public integral_constant<std::size_t, 1 + rank<_Tp>::value> { };
+#endif
 
   /// extent
   template
@@ -3583,12 +3589,17 @@ template 
 template 
   inline constexpr size_t alignment_of_v = alignment_of<_Tp>::value;
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__array_rank)
+template <typename _Tp>
+  inline constexpr size_t rank_v = __array_rank(_Tp);
+#else
 template <typename _Tp>
   inline constexpr size_t rank_v = 0;
 template <typename _Tp, size_t _Size>
   inline constexpr size_t rank_v<_Tp[_Size]> = 1 + rank_v<_Tp>;
 template <typename _Tp>
   inline constexpr size_t rank_v<_Tp[]> = 1 + rank_v<_Tp>;
+#endif
 
 template <typename _Tp, unsigned _Idx = 0>
   inline constexpr size_t extent_v = 0;
-- 
2.44.0

[PATCH v26 08/13] libstdc++: Optimize std::add_lvalue_reference compilation performance

2024-05-11 Thread Ken Matsui

This patch optimizes the compilation performance of
std::add_lvalue_reference by dispatching to the new
__add_lvalue_reference built-in trait.
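
As an illustration only (not part of the patch), the sketch below re-creates
the detection-based fallback that the built-in replaces.  The names
my_add_lref and my_add_lref_t are hypothetical; the point is that T& is chosen
only when that type is well-formed, so void and cv-qualified function types
pass through unchanged.

```cpp
#include <type_traits>

// Hypothetical re-creation of the fallback helper: the primary template
// leaves T alone; the partial specialization is selected only when T& is
// a valid type.
template <typename T, typename = void>
struct my_add_lref { using type = T; };
template <typename T>
struct my_add_lref<T, std::void_t<T&>> { using type = T&; };

template <typename T>
using my_add_lref_t = typename my_add_lref<T>::type;

static_assert(std::is_same_v<my_add_lref_t<int>, int&>);
static_assert(std::is_same_v<my_add_lref_t<int&>, int&>);
// Reference collapsing: (int&&)& is int&.
static_assert(std::is_same_v<my_add_lref_t<int&&>, int&>);
// No reference to void exists, so the primary template is used.
static_assert(std::is_same_v<my_add_lref_t<void>, void>);
// Same for a cv-qualified function type.
static_assert(std::is_same_v<my_add_lref_t<int() const>, int() const>);
// Agrees with the standard trait.
static_assert(std::is_same_v<my_add_lref_t<int[3]>,
                             std::add_lvalue_reference_t<int[3]>>);
```

Every check here is a static_assert, so compiling the file as C++17 is enough
to run it.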

libstdc++-v3/ChangeLog:

* include/std/type_traits (add_lvalue_reference): Use
__add_lvalue_reference built-in trait.
(__add_lvalue_reference_helper): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 5b74e44d0a6..effa3fbcb75 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1157,6 +1157,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// @cond undocumented
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_lvalue_reference)
+  template<typename _Tp>
+    struct __add_lvalue_reference_helper
+    { using type = __add_lvalue_reference(_Tp); };
+#else
   template<typename _Tp, typename = void>
     struct __add_lvalue_reference_helper
     { using type = _Tp; };
@@ -1164,6 +1169,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct __add_lvalue_reference_helper<_Tp, __void_t<_Tp&>>
     { using type = _Tp&; };
+#endif
 
   template<typename _Tp>
     using __add_lval_ref_t = typename __add_lvalue_reference_helper<_Tp>::type;
@@ -1731,9 +1737,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// add_lvalue_reference
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__add_lvalue_reference)
+  template<typename _Tp>
+    struct add_lvalue_reference
+    { using type = __add_lvalue_reference(_Tp); };
+#else
   template<typename _Tp>
     struct add_lvalue_reference
     { using type = __add_lval_ref_t<_Tp>; };
+#endif
 
   /// add_rvalue_reference
   template<typename _Tp>
-- 
2.44.0

Re: [PATCH v1] RISC-V: Bugfix ICE for RVV intrinsic vfw on _Float16 scalar

2024-05-11 Thread juzhe.zh...@rivai.ai

LGTM from my side. Let's wait for Kito to chime in.



juzhe.zh...@rivai.ai
 

[PATCH v1] RISC-V: Bugfix ICE for RVV intrinsic vfw on _Float16 scalar

2024-05-11 Thread pan2 . li

From: Pan Li 

For the vfw vx-format RVV intrinsics, the scalar type _Float16 also
requires the zvfh extension.  Unfortunately, we only check the
vector tree type and miss the check on the scalar _Float16 type.  For
example:

vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, size_t vl)
{
  return __riscv_vfwsub_wf_f32mf2(vs2, rs1, vl);
}

This should report an error saying that the zvfh extension is required,
rather than an ICE on an unrecognizable insn.

This patch adds that validation for _Float16 in the RVV intrinsic API.
It reports an error like the one below when zvfh is not enabled.

error: built-in function '__riscv_vfwsub_wf_f32mf2(vs2,  rs1,  vl)'
  requires the zvfhmin or zvfh ISA extension
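
As an illustration only (not GCC code), the check can be sketched as a bitmask
comparison: each type carries a mask of required features, and the first
required bit that is not enabled yields the diagnostic.  All names below
(REQUIRE_*, first_missing) are hypothetical stand-ins for the macros and
function used in the patch.

```cpp
#include <cstdint>

// Hypothetical feature bits; GCC's real definitions differ.
enum : std::uint64_t {
  REQUIRE_FP16   = 1u << 0,
  REQUIRE_FP32   = 1u << 1,
  REQUIRE_FP64   = 1u << 2,
  REQUIRE_ELEN64 = 1u << 3,
};

// Returns the extension list to name in the diagnostic for the first
// requirement that is not enabled, or nullptr when all are satisfied.
constexpr const char *
first_missing (std::uint64_t required, std::uint64_t enabled)
{
  if ((required & REQUIRE_FP16) && !(enabled & REQUIRE_FP16))
    return "zvfhmin or zvfh";
  if ((required & REQUIRE_FP32) && !(enabled & REQUIRE_FP32))
    return "zve32f, zve64f, zve64d or v";
  if ((required & REQUIRE_FP64) && !(enabled & REQUIRE_FP64))
    return "zve64d or v";
  if ((required & REQUIRE_ELEN64) && !(enabled & REQUIRE_ELEN64))
    return "zve64x, zve64f, zve64d or v";
  return nullptr;
}

// A type needing FP16 on a target with only FP32 enabled is rejected.
static_assert (first_missing (REQUIRE_FP16, REQUIRE_FP32) != nullptr);
// Nothing is reported once every required bit is enabled.
static_assert (first_missing (REQUIRE_FP16 | REQUIRE_FP32,
                              REQUIRE_FP16 | REQUIRE_FP32) == nullptr);
```

Returning at the first unmet requirement mirrors the early "return false" in
the patch, so at most one error is reported per call.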

PR target/114988

Passed the full rv64gcv regression tests, including C/C++/Fortran.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc
(validate_instance_type_required_extensions): New function to
validate the intrinsic function type operands.
(expand_builtin): Validate instance type before expand.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr114988-1.c: New test.
* gcc.target/riscv/rvv/base/pr114988-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-vector-builtins.cc | 51 +++
 .../gcc.target/riscv/rvv/base/pr114988-1.c|  9 
 .../gcc.target/riscv/rvv/base/pr114988-2.c|  9 
 3 files changed, 69 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-2.c

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 192a6c230d1..3fdb4400d70 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4632,6 +4632,54 @@ gimple_fold_builtin (unsigned int code, gimple_stmt_iterator *gsi, gcall *stmt)
   return gimple_folder (rfn.instance, rfn.decl, gsi, stmt).fold ();
 }
 
+static bool
+validate_instance_type_required_extensions (const rvv_type_info type,
+                                            tree exp)
+{
+  uint64_t exts = type.required_extensions;
+
+  if ((exts & RVV_REQUIRE_ELEN_FP_16) &&
+      !TARGET_VECTOR_ELEN_FP_16_P (riscv_vector_elen_flags))
+    {
+      error_at (EXPR_LOCATION (exp),
+                "built-in function %qE requires the "
+                "zvfhmin or zvfh ISA extension",
+                exp);
+      return false;
+    }
+
+  if ((exts & RVV_REQUIRE_ELEN_FP_32) &&
+      !TARGET_VECTOR_ELEN_FP_32_P (riscv_vector_elen_flags))
+    {
+      error_at (EXPR_LOCATION (exp),
+                "built-in function %qE requires the "
+                "zve32f, zve64f, zve64d or v ISA extension",
+                exp);
+      return false;
+    }
+
+  if ((exts & RVV_REQUIRE_ELEN_FP_64) &&
+      !TARGET_VECTOR_ELEN_FP_64_P (riscv_vector_elen_flags))
+    {
+      error_at (EXPR_LOCATION (exp),
+                "built-in function %qE requires the zve64d or v ISA extension",
+                exp);
+      return false;
+    }
+
+  if ((exts & RVV_REQUIRE_ELEN_64) &&
+      !TARGET_VECTOR_ELEN_64_P (riscv_vector_elen_flags))
+    {
+      error_at (EXPR_LOCATION (exp),
+                "built-in function %qE requires the "
+                "zve64x, zve64f, zve64d or v ISA extension",
+                exp);
+      return false;
+    }
+
+  return true;
+}
+
 /* Expand a call to the RVV function with subcode CODE.  EXP is the call
    expression and TARGET is the preferred location for the result.
    Return the value of the lhs.  */
@@ -4649,6 +4697,9 @@ expand_builtin (unsigned int code, tree exp, rtx target)
       return target;
     }
 
+  if (!validate_instance_type_required_extensions (rfn.instance.type, exp))
+    return target;
+
   return function_expander (rfn.instance, rfn.decl, exp, target).expand ();
 }
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c
new file mode 100644
index 000..b8474804c88
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, size_t vl)
+{
+  return __riscv_vfwsub_wf_f32mf2(vs2, rs1, vl); /* { dg-error {built-in function '__riscv_vfwsub_wf_f32mf2\(vs2,  rs1,  vl\)' requires the zvfhmin or zvfh ISA extension} } */
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-2.c
new file mode 100644
index 000..49aa3141af3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+vfloat32mf2_t test_vfwadd_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, siz
