date:20231202

Re: [committed] Fix gnu23-builtins-no-dfp

2023-12-02 Thread Florian Weimer

* Jeff Law:

> Anyway, this test was the one I was most concerned about.  Basically 
> we're testing that on a !dfp target that the builtins are not available. 
>   It expects a warning, but gets an error by default now.  I just 
> changed the test to use -fpermissive, so that the test behaves as it did 
> previously.

In these ambiguous cases, I cloned tests into -fpermissive and error
variants.  This might be appropriate here as well (or I should remove
the clones again if those are the wrong thing to do).

[committed] Fix gnu23-builtins-no-dfp

2023-12-02 Thread Jeff Law

Last patch for the night.  There's still a bit of minor fallout left in 
GCC (loongarch testsuite for example).  But things are looking good on 
the targets I test.  The plan is to start submitting the various 
newlib/libgloss fixes tomorrow.


Anyway, this test was the one I was most concerned about.  Basically 
we're testing that, on a !dfp target, the builtins are not available. 
It expects a warning, but gets an error by default now.  I just 
changed the test to use -fpermissive, so that the test behaves as it did 
previously.


Pushed to the trunk.
Jeff

commit f37744662cbc74efcceb790b99dcd6521c51a578
Author: Jeff Law 
Date:   Sat Dec 2 22:54:46 2023 -0700

[committed] Fix gnu23-builtins-no-dfp

Last patch for the night.  There's still a bit of minor fallout left in GCC
(loongarch testsuite for example).  But things are looking good on the targets
I test.  The plan is to start submitting the various newlib/libgloss fixes
tomorrow.

Anyway, this test was the one I was most concerned about.  Basically we're
testing that, on a !dfp target, the builtins are not available.  It expects
a warning, but gets an error by default now.  I just changed the test to use
-fpermissive, so that the test behaves as it did previously.

Pushed to the trunk.

gcc/testsuite
* gcc.dg/gnu23-builtins-no-dfp-1.c: Add -fpermissive.

diff --git a/gcc/testsuite/gcc.dg/gnu23-builtins-no-dfp-1.c 
b/gcc/testsuite/gcc.dg/gnu23-builtins-no-dfp-1.c
index 9fa25f0dd13..8fe4efbdd98 100644
--- a/gcc/testsuite/gcc.dg/gnu23-builtins-no-dfp-1.c
+++ b/gcc/testsuite/gcc.dg/gnu23-builtins-no-dfp-1.c
@@ -1,7 +1,7 @@
 /* Test C23 built-in functions: test DFP built-in functions are not
available when no DFP support.  Bug 91985.  */
 /* { dg-do compile { target { ! dfp } } } */
-/* { dg-options "-std=gnu23" } */
+/* { dg-options "-std=gnu23 -fpermissive" } */
 
 int fabsd32 (void);
 int fabsd64 (void);

[committed] Fix build of libgcc on ports using FDPIC

2023-12-02 Thread Jeff Law



read_encoded_value_with_base has an ifdef'd code path conditional on 
__FDPIC__ which was calling _Unwind_gnu_Find_got without a prototype. 
This naturally caused various build failures.


This adds a suitable prototype.
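
As a rough illustration of the failure mode (not the actual libgcc code): a helper defined in a header calls a function that has not been declared yet.  With implicit function declarations now rejected by default, the call needs a visible prototype first.  The names demo_ptr, demo_find_got, and demo_adjust are hypothetical stand-ins for _Unwind_Ptr, _Unwind_gnu_Find_got, and the FDPIC path in read_encoded_value_with_base.

```c
#include <stdint.h>

/* Hypothetical stand-in for _Unwind_Ptr.  */
typedef uintptr_t demo_ptr;

/* The prototype must be visible before the first call; without it the
   call below would be an implicit function declaration.  */
extern demo_ptr demo_find_got (demo_ptr);

/* Stand-in for the header-defined helper that needs the GOT address.  */
static inline demo_ptr
demo_adjust (demo_ptr value)
{
  return value + demo_find_got (value);
}

/* A toy definition so the sketch links; in the sketch the "GOT" is just
   a fixed offset.  */
demo_ptr
demo_find_got (demo_ptr p)
{
  (void) p;
  return 0x1000;
}
```

Removing the extern line from the sketch should reproduce the kind of error described above on a compiler that rejects implicit declarations.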

Pushed to the trunk.
commit 4cef6daf40f4aefd748245a720955d4e52d1a81e
Author: Jeff Law 
Date:   Sat Dec 2 22:45:48 2023 -0700

[committed] Fix build of libgcc on ports using FDPIC

read_encoded_value_with_base has an ifdef'd code path conditional on 
__FDPIC__
which was calling _Unwind_gnu_Find_got without a prototype.  This naturally
caused various build failures.

This adds a suitable prototype.

Pushed to the trunk.

libgcc

* unwind-pe.h (_Unwind_gnu_Find_got): Add prototype.

diff --git a/libgcc/unwind-pe.h b/libgcc/unwind-pe.h
index 3f98c93589a..d714a27a935 100644
--- a/libgcc/unwind-pe.h
+++ b/libgcc/unwind-pe.h
@@ -173,6 +173,8 @@ read_sleb128 (const unsigned char *p, _sleb128_t *val)
   return p;
 }
 
+extern _Unwind_Ptr _Unwind_gnu_Find_got (_Unwind_Ptr);
+
 /* Load an encoded value from memory at P.  The value is returned in VAL;
The function returns P incremented past the value.  BASE is as given
by base_of_encoded_value for this encoding in the appropriate context.  */

[committed] Fix pr65369.c

2023-12-02 Thread Jeff Law



There's a caller/callee type mismatch in this test that shows up on 
targets where ints are something other than 32 bit types.


Based on reviewing the original bug report, the fix and the part of the 
test this fixes, I'm reasonably confident this hasn't compromised the test.
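
A minimal sketch of why the parameter type matters, assuming the caller hands over a uint32_t buffer (as the fix suggests).  unsigned int is only guaranteed to be at least 16 bits wide, so on a 16 bit target "const unsigned int *" and "const uint32_t *" are different pointer types.  All demo_ names are made up for illustration; this is not the test itself.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* uint32_t is exactly 32 bits wherever it is provided; unsigned int is
   only guaranteed to be at least 16 bits.  */
_Static_assert (sizeof (uint32_t) * CHAR_BIT == 32, "uint32_t is 32 bits");

static const char demo_data[] = "1234567890123456";

/* Callee declared with the same element type the caller uses.  */
static int
demo_matches (const uint32_t *buf)
{
  return memcmp (buf, demo_data, 16) == 0;
}

/* Caller: fills a uint32_t buffer and passes it without any cast.  */
int
demo_roundtrip (void)
{
  uint32_t buf[4];
  memcpy (buf, demo_data, sizeof buf);
  return demo_matches (buf);
}
```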


Pushed to the trunk.

Jeff

commit 3da08ffa6df2634092a6292b045568fc326e28e6
Author: Jeff Law 
Date:   Sat Dec 2 22:40:41 2023 -0700

[committed] Fix pr65369.c

There's a caller/callee type mismatch in this test that shows up on targets
where ints are something other than 32 bit types.

Based on reviewing the original bug report, the fix and the part of the test
this fixes, I'm reasonably confident this hasn't compromised the test.

gcc/testsuite
* gcc.c-torture/execute/pr65369.c: Fix type mismatch.

diff --git a/gcc/testsuite/gcc.c-torture/execute/pr65369.c 
b/gcc/testsuite/gcc.c-torture/execute/pr65369.c
index 017fe1b01ce..548b48fa43f 100644
--- a/gcc/testsuite/gcc.c-torture/execute/pr65369.c
+++ b/gcc/testsuite/gcc.c-torture/execute/pr65369.c
@@ -6,7 +6,7 @@ static const char data[] =
   "123456789012345678901234567890";
 
 __attribute__ ((noinline))
-static void foo (const unsigned int *buf)
+static void foo (const uint32_t *buf)
 {
   if (__builtin_memcmp (buf, data, 64))
 __builtin_abort ();

[committed] Fix comp-goto-1.c on 16 bit targets

2023-12-02 Thread Jeff Law



I don't remember what port triggered this, but it's obvious that 
comp-goto-1.c needs to be fixed.


Basically the test has two implementations.  One is just a dummy whose 
main() has no declared return type, which triggers the new errors.


Pushed to the trunk.

Jeff

commit d5c823b033bb6409bbcd115b318093126f5a674f
Author: Jeff Law 
Date:   Sat Dec 2 22:32:22 2023 -0700

[committed] Fix comp-goto-1.c on 16 bit targets

I don't remember what port triggered this, but it's obvious that
comp-goto-1.c needs to be fixed.

Basically the test has two implementations.  One is just a dummy whose main()
has no declared return type, which triggers the new errors.

gcc/testsuite
* gcc.c-torture/execute/comp-goto-1.c: Fix return value of main for
16 bit targets.

diff --git a/gcc/testsuite/gcc.c-torture/execute/comp-goto-1.c 
b/gcc/testsuite/gcc.c-torture/execute/comp-goto-1.c
index 4379fe70e9c..6be63c097ac 100644
--- a/gcc/testsuite/gcc.c-torture/execute/comp-goto-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/comp-goto-1.c
@@ -163,5 +163,5 @@ main ()
   exit (0);
 }
 #else
-main(){ exit (0); }
+int main(){ exit (0); }
 #endif

[committed] Fix a few arc tests

2023-12-02 Thread Jeff Law



Similar to others.  Where it's easy to fix the implicit types or add 
prototypes I did.  One was just ugly and I didn't want to think too 
hard, so I just added -fpermissive.
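
For illustration, the kind of cleanup described above can be sketched as follows.  This is not one of the arc tests; demo_count, demo_sink, demo_drain, and demo_run are made-up names.  The old style relied on implicit int for variables and functions and on implicit function declarations; the fixed form spells each type out.

```c
/* Before: "demo_count;" relied on implicit int.  */
int demo_count;

/* Before: no declaration at all; the first call declared it implicitly.  */
void demo_sink (char);

/* Before: "demo_drain() { ... }" relied on an implicit int return type.  */
void
demo_drain (void)
{
  for (; demo_count > 0; demo_count--)
    demo_sink ('x');
}

/* Toy definition so the sketch is complete.  */
static int demo_seen;

void
demo_sink (char c)
{
  if (c == 'x')
    demo_seen++;
}

/* Helper for experimenting: run the loop N times and report the calls.  */
int
demo_run (int n)
{
  demo_seen = 0;
  demo_count = n;
  demo_drain ();
  return demo_seen;
}
```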


Pushed to the trunk.

Jeff

commit 595c695216e72c8491bf20d30e5298e2064caa73
Author: Jeff Law 
Date:   Sat Dec 2 22:16:33 2023 -0700

[committed] Fix a few arc tests

Similar to others.  Where it's easy to fix the implicit types or add
prototypes I did.  One was just ugly and I didn't want to think too hard, so
I just added -fpermissive.

Pushed to the trunk.

gcc/testsuite
* gcc.target/arc/lra-1.c: Fix missing prototypes and implicit
types in variable definitions.
* gcc.target/arc/pic-1.c: Similarly.
* gcc.target/arc/pr9001191897.c: Similarly.
* gcc.target/arc/pr9001195952.c: Add -fpermissive.

diff --git a/gcc/testsuite/gcc.target/arc/lra-1.c 
b/gcc/testsuite/gcc.target/arc/lra-1.c
index 27336d1a6af..3c936453663 100644
--- a/gcc/testsuite/gcc.target/arc/lra-1.c
+++ b/gcc/testsuite/gcc.target/arc/lra-1.c
@@ -4,12 +4,16 @@
 /* ap is replaced with an address like base+offset by lra,
where offset is larger than s9, resulting into an ICE.  */
 
-typedef struct { char a[500] } b;
-c;
+typedef struct { char a[500]; } b;
+int c;
 struct d {
   short e;
-  b f
-} g(int h, int i, int j, int k, char l, int m, int n, char *p) {
+  b f;
+};
+
+int q (struct d);
+
+struct d g(int h, int i, int j, int k, char l, int m, int n, char *p) {
 again:;
   struct d o;
   *p = c = ({ q(o); });
diff --git a/gcc/testsuite/gcc.target/arc/pic-1.c 
b/gcc/testsuite/gcc.target/arc/pic-1.c
index ab24763b67f..ed1e4d3765e 100644
--- a/gcc/testsuite/gcc.target/arc/pic-1.c
+++ b/gcc/testsuite/gcc.target/arc/pic-1.c
@@ -3,6 +3,9 @@
 /* { dg-skip-if "PIC not available for ARC6xx" { arc6xx } } */
 /* { dg-options "-mno-sdata -w -Os -fpic" } */
 
+void e (char);
+
+void 
 a() {
   char *b = "";
   char c;
diff --git a/gcc/testsuite/gcc.target/arc/pr9001191897.c 
b/gcc/testsuite/gcc.target/arc/pr9001191897.c
index fc3642629d3..d51b0429044 100644
--- a/gcc/testsuite/gcc.target/arc/pr9001191897.c
+++ b/gcc/testsuite/gcc.target/arc/pr9001191897.c
@@ -1,7 +1,8 @@
 /* { dg-do compile } */
 /* { dg-skip-if "" { ! { clmcpu } } } */
 /* { dg-options "-mcpu=archs -Os -fpic -mno-sdata -mno-indexed-loads -w" } */
-a;
+int a;
+void
 c() {
   static char b[25];
   for (; a >= 0; a--)
diff --git a/gcc/testsuite/gcc.target/arc/pr9001195952.c 
b/gcc/testsuite/gcc.target/arc/pr9001195952.c
index 252438d8d78..f820960d5e3 100644
--- a/gcc/testsuite/gcc.target/arc/pr9001195952.c
+++ b/gcc/testsuite/gcc.target/arc/pr9001195952.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-skip-if "" { ! { clmcpu } } } */
-/* { dg-options "-mcpu=archs -Os -w -fpic" } */
+/* { dg-options "-mcpu=archs -Os -w -fpic -fpermissive" } */
 
 /* tst_movb split pattern is wrong for anything else than NPS
chip.  */

[committed] Fix nios2 tests

2023-12-02 Thread Jeff Law



The nios2 port has two tests that are affected by the recent changes. 
In cdx-ldstwm-1.c it was easiest to just add -fpermissive.  For 
cdx-ldstwm-2.c, adding prototypes for exit and abort is all that's needed.


Pushed to the trunk,
Jeff
commit 2280317c3771a28e9288b7f4c4c23aa4b0ac31dd
Author: Jeff Law 
Date:   Sat Dec 2 22:12:55 2023 -0700

[committed] Fix nios2 tests

The nios2 port has two tests that are affected by the recent changes.  In
cdx-ldstwm-1.c it was easiest to just add -fpermissive.  For cdx-ldstwm-2.c,
adding prototypes for exit and abort is all that's needed.

gcc/testsuite
* gcc.target/nios2/cdx-ldstwm-1.c: Add -fpermissive.
* gcc.target/nios2/cdx-ldstwm-2.c: Add prototypes for abort and exit.

diff --git a/gcc/testsuite/gcc.target/nios2/cdx-ldstwm-1.c 
b/gcc/testsuite/gcc.target/nios2/cdx-ldstwm-1.c
index 7beeea15e2f..6b7a7d0e7f4 100644
--- a/gcc/testsuite/gcc.target/nios2/cdx-ldstwm-1.c
+++ b/gcc/testsuite/gcc.target/nios2/cdx-ldstwm-1.c
@@ -1,5 +1,5 @@
 /* { dg-do assemble } */
-/* { dg-options "-O3 -fomit-frame-pointer -funroll-all-loops 
-finline-functions -march=r2 -mcdx -w" } */
+/* { dg-options "-O3 -fomit-frame-pointer -funroll-all-loops 
-finline-functions -march=r2 -mcdx -w -fpermissive" } */
 
 /* Based on gcc.c-torture/compile/920501-23.c.
This test used to result in assembler errors with R2 CDX because of
diff --git a/gcc/testsuite/gcc.target/nios2/cdx-ldstwm-2.c 
b/gcc/testsuite/gcc.target/nios2/cdx-ldstwm-2.c
index 0e69534dcc1..eb273bbc5ec 100644
--- a/gcc/testsuite/gcc.target/nios2/cdx-ldstwm-2.c
+++ b/gcc/testsuite/gcc.target/nios2/cdx-ldstwm-2.c
@@ -1,6 +1,9 @@
 /* { dg-do assemble } */
 /* { dg-options "-O3 -fomit-frame-pointer -funroll-loops -march=r2 -mcdx -w" } 
*/
 
+extern void abort (void);
+extern int exit (int);
+
 /* Based on gcc.c-torture/execute/20021120-1.c.
This test used to result in assembler errors with R2 CDX because of
a bug in regrename; it wasn't re-validating insns after renaming, so

[committed] Fix rx build failure in libgcc

2023-12-02 Thread Jeff Law



The rx port has a bunch of what I presume are ABI compatibility 
functions in libgcc.  Those compatibility functions call routines such as 
__eqdf2 from libgcc, but without a prototype.  This patch adds the 
missing prototypes.
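
A rough sketch of the wrapper pattern, using a hypothetical three-way comparison demo_cmpdf2 in place of the libgcc routines so that it builds on any host.  The helper is declared before the wrappers that call it.  The -1/0/1 return convention is an assumption of this sketch (and it ignores NaNs), chosen to match the "== -1", "== 1", and "!= 1" checks in the rx file.

```c
/* Hypothetical stand-in for a libgcc comparison routine: returns -1, 0,
   or 1 for less-than, equal, or greater-than.  NaNs are not handled.  */
extern int demo_cmpdf2 (double, double);

/* Wrappers in the style of _COM_CMPLTd and friends.  */
int demo_cmplt (double a, double b) { return demo_cmpdf2 (a, b) == -1; }
int demo_cmpgt (double a, double b) { return demo_cmpdf2 (a, b) == 1; }
int demo_cmple (double a, double b) { return demo_cmpdf2 (a, b) != 1; }

/* Toy definition so the sketch links on any host.  */
int
demo_cmpdf2 (double a, double b)
{
  return (a > b) - (a < b);
}
```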


Pushed to the trunk,
Jeff
commit f1fdd2839ccbc1028b59fbaa7f342e41e3bef848
Author: Jeff Law 
Date:   Sat Dec 2 22:07:59 2023 -0700

[committed] Fix rx build failure in libgcc

The rx port has a bunch of what I presume are ABI compatibility functions in
libgcc.  Those compatibility functions call routines such as __eqdf2 from
libgcc, but without a prototype.  This patch adds the missing prototypes.

libgcc/
* config/rx/rx-abi-functions.c (__ltdf2, __gtdf2): Add prototype.
(__ledf2, __gedf2, __eqdf2, __nedf2): Likewise.
(__ltsf2, __gtsf2, __lesf2, __gesf2, __eqsf2, __nesf2): Likewise.

diff --git a/libgcc/config/rx/rx-abi-functions.c 
b/libgcc/config/rx/rx-abi-functions.c
index f505d7d24c3..fc96e4ff171 100644
--- a/libgcc/config/rx/rx-abi-functions.c
+++ b/libgcc/config/rx/rx-abi-functions.c
@@ -33,6 +33,13 @@
 
 #ifdef __RX_64BIT_DOUBLES__
 
+extern int __ltdf2 (double, double);
+extern int __gtdf2 (double, double);
+extern int __ledf2 (double, double);
+extern int __gedf2 (double, double);
+extern int __eqdf2 (double, double);
+extern int __nedf2 (double, double);
+
 int _COM_CMPLTd (double a, double b) { return __ltdf2 (a, b) == -1; }
 int _COM_CMPGTd (double a, double b) { return __gtdf2 (a, b) == 1; }
 int _COM_CMPLEd (double a, double b) { return __ledf2 (a, b) != 1; }
@@ -49,6 +56,13 @@ int _COM_CMPNEf (double, double) __attribute__ ((weak, alias 
("_COM_CMPNEd")));
 
 #else /* 32-bit doubles.  */
 
+extern int __ltsf2 (float, float);
+extern int __gtsf2 (float, float);
+extern int __lesf2 (float, float);
+extern int __gesf2 (float, float);
+extern int __eqsf2 (float, float);
+extern int __nesf2 (float, float);
+
 double _COM_CONVfd (float a) { return a; }
 float  _COM_CONVdf (double a) { return a; }

[committed] Fix minor testsuite problems on H8 after C99 changes

2023-12-02 Thread Jeff Law

Two minor regressions on the H8 were triggered by the C99 changes. 
First pr58400.c has several functions without prototypes.  I just added 
-fpermissive to that test.  Second pr17306-2.c has a single call to an 
unprototyped function for which I added the prototype.


These are both H8 specific tests.

Pushed to the trunk,
Jeff
commit 622c5356676caec1dc970869d6671244703f0559
Author: Jeff Law 
Date:   Sat Dec 2 22:03:28 2023 -0700

[committed] Fix minor testsuite problems on H8 after C99 changes

Two minor regressions on the H8 were triggered by the C99 changes.  First
pr58400.c has several functions without prototypes.  I just added -fpermissive
to that test.  Second pr17306-2.c has a single call to an unprototyped
function for which I added the prototype.

These are both H8 specific tests.

gcc/testsuite
* gcc.target/h8300/pr58400.c: Add -fpermissive.
* gcc.target/h8300/pr17306-2.c: Add missing prototype.

diff --git a/gcc/testsuite/gcc.target/h8300/pr17306-2.c 
b/gcc/testsuite/gcc.target/h8300/pr17306-2.c
index a407c74b4cd..8c79f31b9db 100644
--- a/gcc/testsuite/gcc.target/h8300/pr17306-2.c
+++ b/gcc/testsuite/gcc.target/h8300/pr17306-2.c
@@ -8,6 +8,8 @@ struct x {
   char y;
 };
 
+void oof (void);
+
 struct x __attribute__ ((eightbit_data)) foo;
 
 int bar ()
diff --git a/gcc/testsuite/gcc.target/h8300/pr58400.c 
b/gcc/testsuite/gcc.target/h8300/pr58400.c
index 496626f4b48..9d1ad7a2202 100644
--- a/gcc/testsuite/gcc.target/h8300/pr58400.c
+++ b/gcc/testsuite/gcc.target/h8300/pr58400.c
@@ -1,5 +1,5 @@
 /* { dg-do compile }  */
-/* { dg-options "-Os -mh -mint32 -w" }  */
+/* { dg-options "-Os -mh -mint32 -w -fpermissive" }  */
 
  typedef unsigned short __u16;
  typedef __signed__ int __s32;

[committed] Fix frv build after C99 changes

2023-12-02 Thread Jeff Law

Two issues prevent the frv-elf port from building after the C99 changes. 
First the trampoline code emitted into libgcc has calls to exit, but 
no prototype.  Adding a trivial prototype for exit() into the macro 
fixes that little goof.


Second, frvbegin.c has a call to atexit, so a quick prototype is added 
into frvbegin.c to fix that problem.
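
To illustrate the first fix: TRANSFER_FROM_TRAMPOLINE is a multi-line macro that expands to declarations plus a function definition, so any prototype the emitted function needs has to be part of the macro text itself.  A minimal sketch of that shape, with made-up names (DEMO_EMIT_SETUP, demo_noop, demo_setup) and atexit standing in for the callee:

```c
/* Everything the emitted function needs is declared inside the macro,
   because the file that expands it may not include <stdlib.h>.  */
#define DEMO_EMIT_SETUP                                 \
extern int atexit (void (*) (void));                    \
                                                        \
static void                                             \
demo_noop (void)                                        \
{                                                       \
}                                                       \
                                                        \
int                                                     \
demo_setup (void)                                       \
{                                                       \
  return atexit (demo_noop);                            \
}

/* Expanding the macro emits the prototype and both functions.  */
DEMO_EMIT_SETUP
```

atexit returns zero when the handler was registered, so demo_setup () yields 0 on success.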


That's enough to get the compiler building again.

Pushed to the trunk,
Jeff

commit 870b63fe71607b94c0e5b0c6e61cd807e0216ddd
Author: Jeff Law 
Date:   Sat Dec 2 21:54:36 2023 -0700

[committed] Fix frv build after C99 changes

Two issues prevent the frv-elf port from building after the C99 changes.
First the trampoline code emitted into libgcc has calls to exit, but no
prototype.  Adding a trivial prototype for exit() into the macro fixes that
little goof.

Second, frvbegin.c has a call to atexit, so a quick prototype is added into
frvbegin.c to fix that problem.

That's enough to get the compiler building again.

gcc/
* config/frv/frv.h (TRANSFER_FROM_TRAMPOLINE): Add prototype for exit.

libgcc/
* config/frv/frvbegin.c (atexit): Add prototype.

diff --git a/gcc/config/frv/frv.h b/gcc/config/frv/frv.h
index 979561126f8..93a7c6d0fcc 100644
--- a/gcc/config/frv/frv.h
+++ b/gcc/config/frv/frv.h
@@ -1241,6 +1241,7 @@ typedef struct frv_stack {
 #if ! __FRV_FDPIC__
 #define TRANSFER_FROM_TRAMPOLINE   \
 extern int Twrite (int, const void *, unsigned);   \
+extern void exit (int);
\
\
 void   \
 __trampoline_setup (short * addr, int size, int fnaddr, int sc)
\
@@ -1284,6 +1285,7 @@ __asm__("\n"  
\
 #else
 #define TRANSFER_FROM_TRAMPOLINE   \
 extern int Twrite (int, const void *, unsigned);   \
+extern void exit (int);
\
\
 void   \
 __trampoline_setup (addr, size, fnaddr, sc)\
diff --git a/libgcc/config/frv/frvbegin.c b/libgcc/config/frv/frvbegin.c
index 76b40ec46c6..24ea06b1ae7 100644
--- a/libgcc/config/frv/frvbegin.c
+++ b/libgcc/config/frv/frvbegin.c
@@ -119,6 +119,7 @@ __do_global_dtors (void)
 }
 }
 
+int atexit (void (*)(void));
 /* Run the global constructors.  */
 void
 __do_global_ctors (void)

Re: [PATCH] gcc/doc: spelling mistakes and example

2023-12-02 Thread Xi Ruoyao

On Sun, 2023-12-03 at 00:17 +, Jonny Grant wrote:
> @@ -733,7 +733,7 @@ To configure GCC:
>  @smallexample
>  % mkdir @var{objdir}
>  % cd @var{objdir}
> -% @var{srcdir}/configure [@var{options}] [@var{target}]
> +% ../@var{srcdir}/configure [@var{options}] [@var{target}]
>  @end smallexample

No, this is definitely incorrect.  srcdir is the path (it may be
relative or absolute) to the GCC source tree.  It does not need to be
located in the parent directory of objdir.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

[PATCH v3 3/3] amdgcn, libgomp: low-latency allocator

2023-12-02 Thread Andrew Stubbs


This implements the OpenMP low-latency memory allocator for AMD GCN using the
small per-team LDS memory (Local Data Store).

Since addresses can now refer to LDS space, the "Global" address space is
no longer compatible.  This patch therefore switches the backend to use
entirely "Flat" addressing (which supports both memories).  A future patch
will re-enable "global" instructions for cases where it is known to be safe
to do so.

gcc/ChangeLog:

* config/gcn/gcn-builtins.def (DISPATCH_PTR): New built-in.
* config/gcn/gcn.cc (gcn_init_machine_status): Disable global
addressing.
(gcn_expand_builtin_1): Implement GCN_BUILTIN_DISPATCH_PTR.

libgomp/ChangeLog:

* config/gcn/libgomp-gcn.h (TEAM_ARENA_START): Move to here.
(TEAM_ARENA_FREE): Likewise.
(TEAM_ARENA_END): Likewise.
(GCN_LOWLAT_HEAP): New.
* config/gcn/team.c (LITTLEENDIAN_CPU): New, and import hsa.h.
(__gcn_lowlat_init): New prototype.
(gomp_gcn_enter_kernel): Initialize the low-latency heap.
* libgomp.h (TEAM_ARENA_START): Move to libgomp.h.
(TEAM_ARENA_FREE): Likewise.
(TEAM_ARENA_END): Likewise.
* plugin/plugin-gcn.c (lowlat_size): New variable.
(print_kernel_dispatch): Label the group_segment_size purpose.
(init_environment_variables): Read GOMP_GCN_LOWLAT_POOL.
(create_kernel_dispatch): Pass low-latency head allocation to kernel.
(run_kernel): Use shadow; don't assume values.
* testsuite/libgomp.c/omp_alloc-traits.c: Enable for amdgcn.
* config/gcn/allocator.c: New file.
* libgomp.texi: Document low-latency implementation details.
---
 gcc/config/gcn/gcn-builtins.def   |   2 +
 gcc/config/gcn/gcn.cc |  16 ++-
 libgomp/config/gcn/allocator.c| 127 ++
 libgomp/config/gcn/libgomp-gcn.h  |   6 +
 libgomp/config/gcn/team.c |  12 ++
 libgomp/libgomp.h |   3 -
 libgomp/libgomp.texi  |  13 ++
 libgomp/plugin/plugin-gcn.c   |  35 -
 .../testsuite/libgomp.c/omp_alloc-traits.c|   2 +-
 9 files changed, 205 insertions(+), 11 deletions(-)
 create mode 100644 libgomp/config/gcn/allocator.c

diff --git a/gcc/config/gcn/gcn-builtins.def b/gcc/config/gcn/gcn-builtins.def
index 636a8e7a1a9..471457d7c23 100644
--- a/gcc/config/gcn/gcn-builtins.def
+++ b/gcc/config/gcn/gcn-builtins.def
@@ -164,6 +164,8 @@ DEF_BUILTIN (FIRST_CALL_THIS_THREAD_P, -1, "first_call_this_thread_p", B_INSN,
 	 _A1 (GCN_BTI_BOOL), gcn_expand_builtin_1)
 DEF_BUILTIN (KERNARG_PTR, -1, "kernarg_ptr", B_INSN, _A1 (GCN_BTI_VOIDPTR),
 	 gcn_expand_builtin_1)
+DEF_BUILTIN (DISPATCH_PTR, -1, "dispatch_ptr", B_INSN, _A1 (GCN_BTI_VOIDPTR),
+	 gcn_expand_builtin_1)
 DEF_BUILTIN (GET_STACK_LIMIT, -1, "get_stack_limit", B_INSN,
 	 _A1 (GCN_BTI_VOIDPTR), gcn_expand_builtin_1)
 
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 22d2b6ebf6d..d70238820dd 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -110,7 +110,8 @@ gcn_init_machine_status (void)
 
   f = ggc_cleared_alloc ();
 
-  if (TARGET_GCN3)
+  // FIXME: re-enable global addressing with safety for LDS-flat addresses
+  //if (TARGET_GCN3)
 f->use_flat_addressing = true;
 
   return f;
@@ -4881,6 +4882,19 @@ gcn_expand_builtin_1 (tree exp, rtx target, rtx /*subtarget */ ,
 	  }
 	return ptr;
   }
+case GCN_BUILTIN_DISPATCH_PTR:
+  {
+	rtx ptr;
+	if (cfun->machine->args.reg[DISPATCH_PTR_ARG] >= 0)
+	   ptr = gen_rtx_REG (DImode,
+			  cfun->machine->args.reg[DISPATCH_PTR_ARG]);
+	else
+	  {
+	ptr = gen_reg_rtx (DImode);
+	emit_move_insn (ptr, const0_rtx);
+	  }
+	return ptr;
+  }
 case GCN_BUILTIN_FIRST_CALL_THIS_THREAD_P:
   {
 	/* Stash a marker in the unused upper 16 bits of s[0:1] to indicate
diff --git a/libgomp/config/gcn/allocator.c b/libgomp/config/gcn/allocator.c
new file mode 100644
index 000..e9a95d683f9
--- /dev/null
+++ b/libgomp/config/gcn/allocator.c
@@ -0,0 +1,127 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should

[PATCH v3 1/3] libgomp, nvptx: low-latency memory allocator

2023-12-02 Thread Andrew Stubbs


This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU devices, via omp_low_lat_mem_space and omp_alloc.  The memory
can be allocated, reallocated, and freed using a basic but fast algorithm
that is thread safe, and the size of the low-latency heap can be configured
using the GOMP_NVPTX_LOWLAT_POOL environment variable.

The use of the PTX dynamic_smem_size feature means that the low-latency
allocator will not work with the PTX 3.1 multilib.

For now, the omp_low_lat_mem_alloc allocator also works, but that will change
when I implement the access traits.
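
The default MEMSPACE_* macros in this patch fall back to plain malloc and free, and use the comma operator to evaluate and discard the arguments the default does not need, which keeps unused-variable diagnostics quiet.  A small sketch of the same trick with hypothetical DEMO_* names (not the libgomp macros themselves):

```c
#include <stdlib.h>

/* (void)(SPACE) evaluates and discards SPACE; the comma expression then
   yields SIZE, which is what malloc actually receives.  */
#define DEMO_ALLOC(SPACE, SIZE) \
  malloc (((void) (SPACE), (SIZE)))

/* Same idea for free: SPACE and SIZE are referenced but ignored.  */
#define DEMO_FREE(SPACE, ADDR, SIZE) \
  free (((void) (SPACE), (void) (SIZE), (ADDR)))

/* Allocate and release N bytes (N must be nonzero); returns 1 on
   success and 0 if the allocation failed.  */
int
demo_alloc_roundtrip (size_t n)
{
  int space = 0;  /* Only referenced through the macros.  */
  unsigned char *p = DEMO_ALLOC (space, n);
  if (p == NULL)
    return 0;
  p[0] = 1;
  DEMO_FREE (space, p, n);
  return 1;
}
```

A target-specific override can then define the same macro names with real uses of every argument, without touching the callers.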

libgomp/ChangeLog:

* allocator.c (MEMSPACE_ALLOC): New macro.
(MEMSPACE_CALLOC): New macro.
(MEMSPACE_REALLOC): New macro.
(MEMSPACE_FREE): New macro.
(predefined_alloc_mapping): New array.  Add _Static_assert to match.
(ARRAY_SIZE): New macro.
(omp_aligned_alloc): Use MEMSPACE_ALLOC.
Implement fall-backs for predefined allocators.  Simplify existing
fall-backs.
(omp_free): Use MEMSPACE_FREE.
(omp_calloc): Use MEMSPACE_CALLOC. Implement fall-backs for
predefined allocators.  Simplify existing fall-backs.
(omp_realloc): Use MEMSPACE_REALLOC, MEMSPACE_ALLOC, and MEMSPACE_FREE.
Implement fall-backs for predefined allocators.  Simplify existing
fall-backs.
* config/nvptx/team.c (__nvptx_lowlat_pool): New asm variable.
(__nvptx_lowlat_init): New prototype.
(gomp_nvptx_main): Call __nvptx_lowlat_init.
* libgomp.texi: Update memory space table.
* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
* basic-allocator.c: New file.
* config/nvptx/allocator.c: New file.
* testsuite/libgomp.c/omp_alloc-1.c: New test.
* testsuite/libgomp.c/omp_alloc-2.c: New test.
* testsuite/libgomp.c/omp_alloc-3.c: New test.
* testsuite/libgomp.c/omp_alloc-4.c: New test.
* testsuite/libgomp.c/omp_alloc-5.c: New test.
* testsuite/libgomp.c/omp_alloc-6.c: New test.

Co-authored-by: Kwok Cheung Yeung  
Co-Authored-By: Thomas Schwinge 
---
 libgomp/allocator.c   | 246 --
 libgomp/basic-allocator.c | 380 ++
 libgomp/config/nvptx/allocator.c  | 120 +++
 libgomp/config/nvptx/team.c   |  18 +
 libgomp/libgomp.texi  |   9 +-
 libgomp/plugin/plugin-nvptx.c |  23 +-
 libgomp/testsuite/libgomp.c/omp_alloc-1.c |  56 
 libgomp/testsuite/libgomp.c/omp_alloc-2.c |  64 
 libgomp/testsuite/libgomp.c/omp_alloc-3.c |  42 +++
 libgomp/testsuite/libgomp.c/omp_alloc-4.c | 196 +++
 libgomp/testsuite/libgomp.c/omp_alloc-5.c |  63 
 libgomp/testsuite/libgomp.c/omp_alloc-6.c | 117 +++
 12 files changed, 1231 insertions(+), 103 deletions(-)
 create mode 100644 libgomp/basic-allocator.c
 create mode 100644 libgomp/config/nvptx/allocator.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-2.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-6.c

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index b4e50e2ad72..fa398128368 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -37,6 +37,47 @@
 
 #define omp_max_predefined_alloc omp_thread_mem_alloc
 
+/* These macros may be overridden in config//allocator.c.
+   The following definitions (ab)use comma operators to avoid unused
+   variable errors.  */
+#ifndef MEMSPACE_ALLOC
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
+  malloc (((void)(MEMSPACE), (SIZE)))
+#endif
+#ifndef MEMSPACE_CALLOC
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
+  calloc (1, (((void)(MEMSPACE), (SIZE))))
+#endif
+#ifndef MEMSPACE_REALLOC
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
+  realloc (ADDR, (((void)(MEMSPACE), (void)(OLDSIZE), (SIZE))))
+#endif
+#ifndef MEMSPACE_FREE
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
+  free (((void)(MEMSPACE), (void)(SIZE), (ADDR)))
+#endif
+
+/* Map the predefined allocators to the correct memory space.
+   The index to this table is the omp_allocator_handle_t enum value.
+   When the user calls omp_alloc with a predefined allocator this
+   table determines what memory they get.  */
+static const omp_memspace_handle_t predefined_alloc_mapping[] = {
+  omp_default_mem_space,   /* omp_null_allocator doesn't actually use this. */
+  omp_default_mem_space,   /* omp_default_mem_alloc. */
+  omp_large_cap_mem_space, /* omp_large_cap_mem_alloc. */
+  omp_const_mem_space, /* omp_const_mem_alloc. */
+  omp_high_bw_mem_space,   /* omp_high_bw_mem_alloc. */

[PATCH v3 0/3] libgomp: OpenMP low-latency omp_alloc

2023-12-02 Thread Andrew Stubbs

This patch series is a rework of the patch series posted in August.

https://patchwork.sourceware.org/project/gcc/list/?series=23045=%2A=both

The series implements device-specific allocators and adds a low-latency
allocator for both GPU architectures.

This time the omp_low_lat_mem_alloc does not work because the default
traits are incompatible (GPU low-latency memory is not accessible to
other teams).  I've also included documentation and addressed the
comments from Tobias's review.

Andrew

Andrew Stubbs (3):
  libgomp, nvptx: low-latency memory allocator
  openmp, nvptx: low-lat memory access traits
  amdgcn, libgomp: low-latency allocator

 gcc/config/gcn/gcn-builtins.def   |   2 +
 gcc/config/gcn/gcn.cc |  16 +-
 libgomp/allocator.c   | 266 +++-
 libgomp/basic-allocator.c | 380 ++
 libgomp/config/gcn/allocator.c| 127 ++
 libgomp/config/gcn/libgomp-gcn.h  |   6 +
 libgomp/config/gcn/team.c |  12 +
 libgomp/config/nvptx/allocator.c  | 141 +++
 libgomp/config/nvptx/team.c   |  18 +
 libgomp/libgomp.h |   3 -
 libgomp/libgomp.texi  |  40 +-
 libgomp/plugin/plugin-gcn.c   |  35 +-
 libgomp/plugin/plugin-nvptx.c |  23 +-
 libgomp/testsuite/libgomp.c/omp_alloc-1.c |  66 +++
 libgomp/testsuite/libgomp.c/omp_alloc-2.c |  72 
 libgomp/testsuite/libgomp.c/omp_alloc-3.c |  49 +++
 libgomp/testsuite/libgomp.c/omp_alloc-4.c | 197 +
 libgomp/testsuite/libgomp.c/omp_alloc-5.c |  71 
 libgomp/testsuite/libgomp.c/omp_alloc-6.c | 118 ++
 .../testsuite/libgomp.c/omp_alloc-traits.c|  66 +++
 20 files changed, 1595 insertions(+), 113 deletions(-)
 create mode 100644 libgomp/basic-allocator.c
 create mode 100644 libgomp/config/gcn/allocator.c
 create mode 100644 libgomp/config/nvptx/allocator.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-2.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-6.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-traits.c

-- 
2.41.0

[PATCH v3 2/3] openmp, nvptx: low-lat memory access traits

2023-12-02 Thread Andrew Stubbs


The NVPTX low latency memory is not accessible outside the team that allocates
it, and therefore should be unavailable for allocators with the access trait
"all".  This change means that the omp_low_lat_mem_alloc predefined
allocator no longer works (but omp_cgroup_mem_alloc still does).
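
The rule can be sketched in isolation.  The enums below are hypothetical stand-ins for the OpenMP memory-space and access-trait values, and demo_validate only mirrors the shape of the check in this patch; it is not the libgomp code.

```c
/* Hypothetical stand-ins for omp_memspace_handle_t values.  */
enum demo_memspace { DEMO_DEFAULT_SPACE, DEMO_LOW_LAT_SPACE };

/* Hypothetical stand-ins for the access trait values.  */
enum demo_access { DEMO_ACCESS_ALL, DEMO_ACCESS_CGROUP };

/* Return nonzero if an allocator with this memory space and access trait
   can be honoured.  Team-local low-latency memory cannot serve an
   allocator that promises access from all threads.  */
int
demo_validate (enum demo_memspace space, enum demo_access access)
{
  return space != DEMO_LOW_LAT_SPACE || access != DEMO_ACCESS_ALL;
}
```

A request that fails this check gets the null allocator back, as in the omp_init_allocator hunk.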

libgomp/ChangeLog:

* allocator.c (MEMSPACE_VALIDATE): New macro.
(omp_init_allocator): Use MEMSPACE_VALIDATE.
(omp_aligned_alloc): Use OMP_LOW_LAT_MEM_ALLOC_INVALID.
(omp_aligned_calloc): Likewise.
(omp_realloc): Likewise.
* config/nvptx/allocator.c (nvptx_memspace_validate): New function.
(MEMSPACE_VALIDATE): New macro.
(OMP_LOW_LAT_MEM_ALLOC_INVALID): New define.
* libgomp.texi: Document low-latency implementation details.
* testsuite/libgomp.c/omp_alloc-1.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-2.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-3.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-4.c (main): Add access trait.
* testsuite/libgomp.c/omp_alloc-5.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-6.c (main): Add access trait.
* testsuite/libgomp.c/omp_alloc-traits.c: New test.
---
 libgomp/allocator.c   | 20 ++
 libgomp/config/nvptx/allocator.c  | 21 ++
 libgomp/libgomp.texi  | 18 +
 libgomp/testsuite/libgomp.c/omp_alloc-1.c | 10 +++
 libgomp/testsuite/libgomp.c/omp_alloc-2.c |  8 +++
 libgomp/testsuite/libgomp.c/omp_alloc-3.c |  7 ++
 libgomp/testsuite/libgomp.c/omp_alloc-4.c |  7 +-
 libgomp/testsuite/libgomp.c/omp_alloc-5.c |  8 +++
 libgomp/testsuite/libgomp.c/omp_alloc-6.c |  7 +-
 .../testsuite/libgomp.c/omp_alloc-traits.c| 66 +++
 10 files changed, 166 insertions(+), 6 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-traits.c

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index fa398128368..a8a80f8028d 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -56,6 +56,10 @@
 #define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
   free (((void)(MEMSPACE), (void)(SIZE), (ADDR)))
 #endif
+#ifndef MEMSPACE_VALIDATE
+#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS) \
+  (((void)(MEMSPACE), (void)(ACCESS), 1))
+#endif
 
 /* Map the predefined allocators to the correct memory space.
The index to this table is the omp_allocator_handle_t enum value.
@@ -439,6 +443,10 @@ omp_init_allocator (omp_memspace_handle_t memspace, int ntraits,
   if (data.pinned)
 return omp_null_allocator;
 
+  /* Reject unsupported memory spaces.  */
+  if (!MEMSPACE_VALIDATE (data.memspace, data.access))
+return omp_null_allocator;
+
   ret = gomp_malloc (sizeof (struct omp_allocator_data));
   *ret = data;
 #ifndef HAVE_SYNC_BUILTINS
@@ -522,6 +530,10 @@ retry:
 new_size += new_alignment - sizeof (void *);
   if (__builtin_add_overflow (size, new_size, &new_size))
 goto fail;
+#ifdef OMP_LOW_LAT_MEM_ALLOC_INVALID
+  if (allocator == omp_low_lat_mem_alloc)
+goto fail;
+#endif
 
   if (__builtin_expect (allocator_data
 			&& allocator_data->pool_size < ~(uintptr_t) 0, 0))
@@ -820,6 +832,10 @@ retry:
 goto fail;
   if (__builtin_add_overflow (size_temp, new_size, &new_size))
 goto fail;
+#ifdef OMP_LOW_LAT_MEM_ALLOC_INVALID
+  if (allocator == omp_low_lat_mem_alloc)
+goto fail;
+#endif
 
   if (__builtin_expect (allocator_data
 			&& allocator_data->pool_size < ~(uintptr_t) 0, 0))
@@ -1054,6 +1070,10 @@ retry:
   if (__builtin_add_overflow (size, new_size, &new_size))
 goto fail;
   old_size = data->size;
+#ifdef OMP_LOW_LAT_MEM_ALLOC_INVALID
+  if (allocator == omp_low_lat_mem_alloc)
+goto fail;
+#endif
 
   if (__builtin_expect (allocator_data
 			&& allocator_data->pool_size < ~(uintptr_t) 0, 0))
diff --git a/libgomp/config/nvptx/allocator.c b/libgomp/config/nvptx/allocator.c
index 6014fba177f..a3302411bcb 100644
--- a/libgomp/config/nvptx/allocator.c
+++ b/libgomp/config/nvptx/allocator.c
@@ -108,6 +108,21 @@ nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
 return realloc (addr, size);
 }
 
+static inline int
+nvptx_memspace_validate (omp_memspace_handle_t memspace, unsigned access)
+{
+#if __PTX_ISA_VERSION_MAJOR__ > 4 \
+    || (__PTX_ISA_VERSION_MAJOR__ == 4 && __PTX_ISA_VERSION_MINOR__ >= 1)
+  /* Disallow use of low-latency memory when it must be accessible by
+ all threads.  */
+  return (memspace != omp_low_lat_mem_space
+	  || access != omp_atv_all);
+#else
+  /* Low-latency memory is not available before PTX 4.1.  */
+  return (memspace != omp_low_lat_mem_space);
+#endif
+}
+
 #define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
   nvptx_memspace_alloc (MEMSPACE, SIZE)
 #define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
@@ -116,5 +131,11 @@ nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
   nvptx_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE)
 #define

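For readers who want to see the validation hook in isolation, here is a
minimal sketch of the pattern used above: a generic fallback macro that
accepts everything, and a target that overrides it before the fallback is
seen.  This is not the libgomp code.  The enum names and the helper
can_init_allocator are hypothetical stand-ins for the OpenMP handles and
for the check in omp_init_allocator, and only the newer-PTX branch of
nvptx_memspace_validate is modelled.

```c
/* Hypothetical stand-ins for the OpenMP memory space and access trait.  */
enum memspace { DEFAULT_MEM_SPACE, LOW_LAT_MEM_SPACE };
enum access_trait { ACCESS_ALL, ACCESS_CGROUP, ACCESS_PTEAM, ACCESS_THREAD };

/* Target-specific check, modelled on nvptx_memspace_validate: low-latency
   memory is rejected when it must be accessible by all threads.  */
int
target_memspace_validate (enum memspace memspace, enum access_trait access)
{
  return memspace != LOW_LAT_MEM_SPACE || access != ACCESS_ALL;
}

/* A target that defines the macro first overrides the generic fallback.  */
#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS) \
  target_memspace_validate (MEMSPACE, ACCESS)

/* Generic fallback in the style of libgomp/allocator.c: accept everything.
   It is skipped here because the target already defined the macro.  */
#ifndef MEMSPACE_VALIDATE
#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS) \
  (((void)(MEMSPACE), (void)(ACCESS), 1))
#endif

/* Returns nonzero when an allocator may be created for this combination.  */
int
can_init_allocator (enum memspace memspace, enum access_trait access)
{
  return MEMSPACE_VALIDATE (memspace, access) != 0;
}
```

With these definitions, can_init_allocator (LOW_LAT_MEM_SPACE, ACCESS_ALL)
is rejected, while the same memory space with a narrower access trait is
accepted.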
[PATCH] wwwdocs: spelling mistakes

2023-12-02 Thread Jonny Grant

2023-12-03  Jonathan Grant  

htdocs/
* bugs/management.html: adition spelling.
* codingrationale.html: suprises spelling.
* contribute.html: elipsis leter spelling.
* gcc-14/changes.html: modifed spelling.
* gccmission.html: groundrules spelling.
* projects/cli.html: backnend  spelling.
* projects/cfg.html: nowday spelling.
* projects/cxx-reflection/index.html: fonctionalities spelling.
* projects/optimize.html: blowup -> increase, indepentent spelling.
* projects/tree-profiling.html: Optimizaion spelling.
* testing/index.html: runing spelling.



From 97f197c3f8218df2362a053d47548f27cd19d81a Mon Sep 17 00:00:00 2001
From: Jonathan Grant 
Date: Sun, 3 Dec 2023 00:29:43 +
Subject: [PATCH] wwwdocs: spelling mistake corrections

---
 htdocs/bugs/management.html   | 2 +-
 htdocs/codingrationale.html   | 2 +-
 htdocs/contribute.html| 4 ++--
 htdocs/gcc-14/changes.html| 2 +-
 htdocs/gccmission.html| 2 +-
 htdocs/projects/cfg.html  | 2 +-
 htdocs/projects/cli.html  | 2 +-
 htdocs/projects/cxx-reflection/index.html | 2 +-
 htdocs/projects/optimize.html | 6 +++---
 htdocs/projects/tree-profiling.html   | 2 +-
 htdocs/testing/index.html | 2 +-
 11 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/htdocs/bugs/management.html b/htdocs/bugs/management.html
index 28dfa76a..b2bb740e 100644
--- a/htdocs/bugs/management.html
+++ b/htdocs/bugs/management.html
@@ -64,7 +64,7 @@ perspective, these are the relevant ones and what their 
values mean:
 The status and resolution fields define and track the life cycle of a
 bug.  In addition to their https://gcc.gnu.org/bugzilla/page.cgi?id=fields.html;>regular
-descriptions, we also use two adition status values:
+descriptions, we also use two additional status values:
 
 
 
diff --git a/htdocs/codingrationale.html b/htdocs/codingrationale.html
index 6cc76885..c51c9da4 100644
--- a/htdocs/codingrationale.html
+++ b/htdocs/codingrationale.html
@@ -155,7 +155,7 @@ Wide use of implicit conversion can cause some very 
surprising results.
 
 
 C++03 has no explicit conversion operators,
-and hence using them cannot avoid suprises.
+and hence using them cannot avoid surprises.
 Wait for C++11.
 
 
diff --git a/htdocs/contribute.html b/htdocs/contribute.html
index fbe5b39c..152675fa 100644
--- a/htdocs/contribute.html
+++ b/htdocs/contribute.html
@@ -329,7 +329,7 @@ the commit message so that Bugzilla will correctly notice 
the
 commit.  If your patch relates to two bugs, then write
 [PRn, PRm].  For multiple
 bugs, just cite the most relevant one in the summary and use an
-elipsis instead of the second, or subsequent PR numbers; list all the
+ellipsis instead of the second, or subsequent PR numbers; list all the
 related PRs in the body of the commit message in the normal way.
 
 It is not necessary to cite bugs that are closed as duplicates of
@@ -354,7 +354,7 @@ together.
 If you submit a new version of a patch series, then you should
 start a new email thread (don't reply to the original patch series).
 This avoids email threads becoming confused between discussions of the
-first and subsequent revisions of the patch set.  Your cover leter
+first and subsequent revisions of the patch set.  Your cover letter
 (0/nnn) should explain clearly what has been changed between
 the two patch series.  Also state if some of the patches are unchanged
 between revisions; this saves maintainers having to re-review the
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 5a453437..bd51ecb4 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -34,7 +34,7 @@ a work-in-progress.
   another structure, is deprecated. Refer to
   https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html;>
   Zero Length Arrays.
-  Any code relying on this extension should be modifed to ensure that
+  Any code relying on this extension should be modified to ensure that
   C99 flexible array members only end up at the ends of structures.
   Please use the warning option
   https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wflex-array-member-not-at-end;>-Wflex-array-member-not-at-end
 to
diff --git a/htdocs/gccmission.html b/htdocs/gccmission.html
index 58a12755..1124fe9f 100644
--- a/htdocs/gccmission.html
+++ b/htdocs/gccmission.html
@@ -55,7 +55,7 @@ GCC.
  Patches will be considered equally based on their
  technical merits.
  All individuals and companies are welcome to contribute
- as long as they accept the groundrules.
+ as long as they accept the ground rules.
  
 Open mailing lists.
 Developer friendly tools and procedures (i.e. [version control], multiple
diff --git a/htdocs/projects/cfg.html b/htdocs/projects/cfg.html
index b1ee1f34..b695766e 100644
---

[PATCH] gcc/doc: spelling mistakes and example

2023-12-02 Thread Jonny Grant



2023-12-03  Jonathan Grant  

gcc/doc
* install.texi: Show ../ back from the objdir in the example invoking
configure.  Correct spelling of support, arithmetic.


This page is what is generated from install.texi
https://gcc.gnu.org/install/configure.html



From c9fec3796600cc44c0839d0471935482612e4596 Mon Sep 17 00:00:00 2001
From: Jonathan Grant 
Date: Sun, 3 Dec 2023 00:15:12 +
Subject: [PATCH]  gcc/doc: spelling mistakes and example

---
 gcc/doc/install.texi | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index c1ccb8ba02d..96a65aa5080 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -733,7 +733,7 @@ To configure GCC:
 @smallexample
 % mkdir @var{objdir}
 % cd @var{objdir}
-% @var{srcdir}/configure [@var{options}] [@var{target}]
+% ../@var{srcdir}/configure [@var{options}] [@var{target}]
 @end smallexample
 
 @heading Distributor options
@@ -1449,23 +1449,23 @@ for riscv*-*-elf*.  The accepted values and meanings 
are given below.
 Every config is constructed with four components: architecture string, ABI,
 reuse rule with architecture string and reuse rule with sub-extension.
 
-Example 1: Add multi-lib suppport for rv32i with ilp32.
+Example 1: Add multi-lib support for rv32i with ilp32.
 @smallexample
 rv32i-ilp32--
 @end smallexample
 
-Example 2: Add multi-lib suppport for rv32i with ilp32 and rv32imafd with ilp32.
+Example 2: Add multi-lib support for rv32i with ilp32 and rv32imafd with ilp32.
 @smallexample
 rv32i-ilp32--;rv32imafd-ilp32--
 @end smallexample
 
-Example 3: Add multi-lib suppport for rv32i with ilp32; rv32im with ilp32 and
+Example 3: Add multi-lib support for rv32i with ilp32; rv32im with ilp32 and
 rv32ic with ilp32 will reuse this multi-lib set.
 @smallexample
 rv32i-ilp32-rv32im-c
 @end smallexample
 
-Example 4: Add multi-lib suppport for rv64ima with lp64; rv64imaf with lp64,
+Example 4: Add multi-lib support for rv64ima with lp64; rv64imaf with lp64,
 rv64imac with lp64 and rv64imafc with lp64 will reuse this multi-lib set.
 @smallexample
 rv64ima-lp64--f,c,fc
@@ -1476,13 +1476,13 @@ rv64ima-lp64--f,c,fc
 config options, @var{val} is a comma separated list of possible code model,
 currently we support medlow and medany.
 
-Example 5: Add multi-lib suppport for rv64ima with lp64; rv64ima with lp64 and
+Example 5: Add multi-lib support for rv64ima with lp64; rv64ima with lp64 and
 medlow code model
 @smallexample
 rv64ima-lp64--;--cmodel=medlow
 @end smallexample
 
-Example 6: Add multi-lib suppport for rv64ima with lp64; rv64ima with lp64 and
+Example 6: Add multi-lib support for rv64ima with lp64; rv64ima with lp64 and
 medlow code model; rv64ima with lp64 and medany code model
 @smallexample
 rv64ima-lp64--;--cmodel=medlow,medany
@@ -1607,7 +1607,7 @@ libraries.  This option is only supported on Epiphany 
targets.
 
 @item --with-fpmath=@var{isa}
 This options sets @option{-mfpmath=sse} by default and specifies the default
-ISA for floating-point arithmetics.  You can select either @samp{sse} which
+ISA for floating-point arithmetic.  You can select either @samp{sse} which
 enables @option{-msse2} or @samp{avx} which enables @option{-mavx} by default.
 This option is only supported on i386 and x86-64 targets.
 
-- 
2.40.1

Re: [PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760]

2023-12-02 Thread Jakub Jelinek

On Sat, Dec 02, 2023 at 11:04:04AM +, Richard Sandiford wrote:
> I still maintain that so much stuff relies on the lack of false-positive
> REG_UNUSED notes that (whatever the intention might have been) we need
> to prevent the false positive.  Like Andrew says, any use of single_set
> is suspect if there's a REG_UNUSED note for something that is in fact used.

The false positive REG_UNUSED in that case comes from
(insn 15 14 35 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:DI 0 ax [111])
(reg:DI 1 dx [112]))) "pr112760.c":11:22 12 {*cmpdi_1}
 (expr_list:REG_UNUSED (reg:CCZ 17 flags)
(nil)))
(insn 35 15 36 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:DI 0 ax [111])
(reg:DI 1 dx [112]))) "pr112760.c":11:22 12 {*cmpdi_1}
 (expr_list:REG_DEAD (reg:DI 1 dx [112])
(expr_list:REG_DEAD (reg:DI 0 ax [111])
(nil))))
...
use of flags
Haven't verified what causes the redundant comparison, but postreload cse
then does:
      if (!count && cselib_redundant_set_p (body))
        {
          if (check_for_inc_dec (insn))
            delete_insn_and_edges (insn);
          /* We're done with this insn.  */
          goto done;
        }
So, we'd in such cases need to look up what instruction was the earlier
setter and if it has REG_UNUSED note, drop it.

Jakub
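
The idea in the message above (when a redundant set is deleted, find the
earlier setter and drop its REG_UNUSED note) can be illustrated with a toy
model.  This is only a sketch under strong simplifying assumptions: struct
toy_insn and toy_delete_redundant_set are hypothetical, each instruction
sets exactly one register, and nothing here uses GCC's real RTL or
dataflow APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction: sets one register and may carry a "result unused" note.
   This is a simplified stand-in, not GCC's rtx_insn.  */
struct toy_insn
{
  int dest_reg;          /* Register written by the instruction.  */
  bool reg_unused_note;  /* Models a REG_UNUSED note for dest_reg.  */
  bool deleted;          /* Set once the instruction has been removed.  */
};

/* Delete the redundant setter at index IDX.  Later uses of its destination
   now read the value produced by the nearest earlier live setter, so a
   REG_UNUSED note on that setter would be a false positive: drop it.  */
void
toy_delete_redundant_set (struct toy_insn *insns, size_t idx)
{
  int reg = insns[idx].dest_reg;

  insns[idx].deleted = true;

  /* Walk backwards from IDX - 1 down to 0.  */
  for (size_t i = idx; i-- > 0;)
    if (!insns[i].deleted && insns[i].dest_reg == reg)
      {
        insns[i].reg_unused_note = false;
        break;
      }
}
```

In the PR112760 dump this would correspond to deleting insn 35 and
clearing the note on insn 15, since the later use of flags then depends on
insn 15.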

[PATCH] driver: Fix memory leak.

2023-12-02 Thread Costas Argyris

Use std::vector instead of malloc'd pointer
to get automatic freeing of memory.

Result was verified by valgrind, which showed
one less loss record.

I think Jonathan is/was working on this transition
but on a larger scale.


0001-driver-Fix-memory-leak.patch
Description: Binary data

[PATCH] RISC-V: Document optimization parameter riscv-strcmp-inline-limit

2023-12-02 Thread Christoph Müllner

This patch documents the optimization parameter
riscv-strcmp-inline-limit, which can be used to tweak the behaviour
of -minline-strcmp and -minline-strncmp.

gcc/ChangeLog:

PR target/112650
* doc/invoke.texi: Document riscv-strcmp-inline-limit.

Signed-off-by: Christoph Müllner 
---
 gcc/doc/invoke.texi | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2fab4c5d71f..ba2d843b484 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -29846,6 +29846,10 @@ Inlining will only be done if the strings are properly 
aligned
 and instructions for accelerated processing are available.
 The default is to not inline strcmp calls.
 
+The @option{--param riscv-strcmp-inline-limit=@var{n}} parameter controls
+the maximum number of bytes compared by the inlined code.
+The default value is 64.
+
 @opindex minline-strncmp
 @item -minline-strncmp
 @itemx -mno-inline-strncmp
@@ -29854,6 +29858,10 @@ Inlining will only be done if the strings are properly 
aligned
 and instructions for accelerated processing are available.
 The default is to not inline strncmp calls.
 
+The @option{--param riscv-strcmp-inline-limit=@var{n}} parameter controls
+the maximum number of bytes compared by the inlined code.
+The default value is 64.
+
 @opindex mshorten-memrefs
 @item -mshorten-memrefs
 @itemx -mno-shorten-memrefs
-- 
2.41.0
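
As an illustration of where the documented parameter would apply, here is
a small C function whose strcmp call is a candidate for inline expansion.
This is a sketch only: the function name same_key, the file name key.c,
and the value 32 are made up for the example, and whether the call is
actually expanded inline depends on the alignment and ISA conditions
described in the documentation above.

```c
#include <string.h>

/* Possible build line (file name and limit value are illustrative):
     gcc -O2 -minline-strcmp --param riscv-strcmp-inline-limit=32 -c key.c
   The parameter caps how many bytes the inlined comparison handles.  */

/* Returns nonzero when the two strings compare equal.  */
int
same_key (const char *a, const char *b)
{
  return strcmp (a, b) == 0;
}
```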

Re: [PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760]

2023-12-02 Thread Richard Biener




> On 02.12.2023 at 14:15, Richard Sandiford wrote:
> 
> Eric Botcazou  writes:
>>> So sorry to be awkward, but I don't think this is the way to go.  I think
>>> we'll just end up playing whack-a-mole and adding df_note_add_problem to
>>> lots of passes.
>> 
>> We have been doing that for the past 15 years though, so what has changed?
> 
> Off-hand, I couldn't remember a case where we'd done this specifically
> for false-positive REG_UNUSED notes.  (There probably were cases though.)
> 
>>> (FTR, I'm not saying passes have to avoid false negatives, just false
>>> positives.  If a pass updates an instruction with a REG_UNUSED note,
>>> and the pass is no longer sure whether the register is unused or not,
>>> the pass can just delete the note.)
>> 
>> Reintroducing the manual management of such notes would be a step backward.
> 
> I think false-positive REG_UNUSED notes are fundamentally different
> from the other cases though.  If a register is unused then its natural
> state is to remain unused.  The register will only become used if something
> goes out of its way to add new uses of an instruction result that "just
> happens to be there".  That's a deliberate decision and needs some
> analysis to prove that it's safe.  Requiring the pass to clear REG_UNUSED
> notes too doesn't seem like a significant extra burden.
> 
> Trying to reduce false-negative REG_UNUSED notes is different,
> since deleting any instruction could in principle make a register
> go from used to unused.  Same for REG_DEAD notes: if a pass deletes
> an instruction with a REG_DEAD note then it shouldn't have to figure
> out where the new death occurs.
> 
> Not sure how representative this is, but I tried the hack below
> to flag cases where single_set is used in passes that don't have
> up-to-date notes, then ran it on execute.exp.  The checking fired
> for every version of every test.  The collected passes were:
> 
> single_set: bbro
> single_set: cc_fusion
> single_set: ce1
> single_set: ce2
> single_set: ce3
> single_set: cmpelim
> single_set: combine
> single_set: compgotos
> single_set: cprop
> single_set: dwarf2
> single_set: fold_mem_offsets
> single_set: fwprop1
> single_set: fwprop2
> single_set: gcse2
> single_set: hoist
> single_set: init-regs
> single_set: ira
> single_set: jump2
> single_set: jump_after_combine
> single_set: loop2_done
> single_set: loop2_invariant
> single_set: postreload
> single_set: pro_and_epilogue
> single_set: ree
> single_set: reload
> single_set: rtl_dce
> single_set: rtl pre
> single_set: sched1
> single_set: sched2
> single_set: sched_fusion
> single_set: sms
> single_set: split1
> single_set: split2
> single_set: split3
> single_set: split5
> single_set: subreg3
> single_set: ud_dce
> single_set: vartrack
> single_set: web
> 
> which is a lot of passes :)
> 
> Some of the calls might be OK in context, due to call-specific
> circumstances.  But I think it's generally easier to see/show that
> a pass is adding new uses of existing defs than it is to prove that
> a use of single_set is safe even when notes aren't up-to-date.

I think this asks for a verify_notes that checks if notes are conservatively 
correct as to our definition then?  Not sure if doable for equal/equiv notes 
though.

Richard 

> Thanks,
> Richard
> 
> 
> diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
> index d2cfaf7f50f..ece49f041e0 100644
> --- a/gcc/df-problems.cc
> +++ b/gcc/df-problems.cc
> @@ -3782,6 +3782,8 @@ void
> df_note_add_problem (void)
> {
>   df_add_problem (&problem_NOTE);
> +  extern bool single_set_ok;
> +  single_set_ok = true;
> }
> 
> 
> diff --git a/gcc/passes.cc b/gcc/passes.cc
> index 6f894a41d22..d8e12ea2512 100644
> --- a/gcc/passes.cc
> +++ b/gcc/passes.cc
> @@ -2637,9 +2637,14 @@ execute_one_pass (opt_pass *pass)
> do_per_function (verify_curr_properties,
> (void *)(size_t)pass->properties_required);
> 
> +  extern bool single_set_ok;
> +  single_set_ok = !df;
> +
>   /* Do it!  */
>   todo_after = pass->execute (cfun);
> 
> +  single_set_ok = !df;
> +
>   if (todo_after & TODO_discard_function)
> {
>   /* Stop timevar.  */
> diff --git a/gcc/rtl.h b/gcc/rtl.h
> index e4b6cc0dbb5..af3bd1b7cfa 100644
> --- a/gcc/rtl.h
> +++ b/gcc/rtl.h
> @@ -3607,8 +3607,11 @@ extern void add_auto_inc_notes (rtx_insn *, rtx);
> 
> /* Handle the cheap and common cases inline for performance.  */
> 
> +extern void check_single_set ();
> inline rtx single_set (const rtx_insn *insn)
> {
> +  check_single_set ();
> +
>   if (!INSN_P (insn))
> return NULL_RTX;
> 
> diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
> index 87267ee3b88..e0207e2753a 100644
> --- a/gcc/rtlanal.cc
> +++ b/gcc/rtlanal.cc
> @@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "rtl-iter.h"
> #include "hard-reg-set.h"
> #include "function-abi.h"
> +#include "tree-pass.h"
> 
> /* Forward declarations */
> static void set_of_1 (rtx, const_rtx, void *);
> @@ -1543,6 +1544,20 @@ record_hard_reg_uses (rtx

Re: [PATCH v2 3/7] aarch64: Add eh_return compile tests

2023-12-02 Thread Andrew Pinski

On Fri, Nov 3, 2023 at 8:37 AM Szabolcs Nagy  wrote:
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/eh_return-2.c: New test.
> * gcc.target/aarch64/eh_return-3.c: New test.


gcc.target/aarch64/eh_return-3.c fails when running the testsuite with
`-march=armv9-a+sve`. I think it is a good idea to keep the testsuite
clean even when running with different -march=/-mcpu= options. I know
there are many failures due to -march=/-mcpu= options right now, but
this is a new testcase.

Thanks,
Andrew

>
> ---
> v2: check-function-bodies in eh_return-3.c
> (this is not very robust, but easier to read)
> ---
>  .../gcc.target/aarch64/eh_return-2.c  |  9 ++
>  .../gcc.target/aarch64/eh_return-3.c  | 30 +++
>  2 files changed, 39 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/eh_return-2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/eh_return-3.c
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/eh_return-2.c 
> b/gcc/testsuite/gcc.target/aarch64/eh_return-2.c
> new file mode 100644
> index 000..4a9d124e891
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/eh_return-2.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-final { scan-assembler "add\tsp, sp, x5" } } */
> +/* { dg-final { scan-assembler "br\tx6" } } */
> +
> +void
> +foo (unsigned long off, void *handler)
> +{
> +  __builtin_eh_return (off, handler);
> +}
> diff --git a/gcc/testsuite/gcc.target/aarch64/eh_return-3.c 
> b/gcc/testsuite/gcc.target/aarch64/eh_return-3.c
> new file mode 100644
> index 000..bfbe92af427
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/eh_return-3.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mbranch-protection=pac-ret+leaf" } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +/*
> +**foo:
> +** hint25 // paciasp
> +** stp x0, x1, .*
> +** stp x2, x3, .*
> +** cbz w2, .*
> +** mov x4, 0
> +** ldp x2, x3, .*
> +** ldp x0, x1, .*
> +** cbz x4, .*
> +** add sp, sp, x5
> +** br  x6
> +** hint29 // autiasp
> +** ret
> +** mov x5, x0
> +** mov x6, x1
> +** mov x4, 1
> +** b   .*
> +*/
> +void
> +foo (unsigned long off, void *handler, int c)
> +{
> +  if (c)
> +return;
> +  __builtin_eh_return (off, handler);
> +}
> --
> 2.25.1
>

[PATCH] driver: Call finalize method from main.

2023-12-02 Thread Costas Argyris

Calling the driver::finalize() method before returning from main
seems to be reducing some memory leaks of the driver (PR93019).

$ head -n20 before_patch.txt
==385521== Memcheck, a memory error detector
==385521== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==385521== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==385521== Command: /home/cargyris/gcc/install/bin/gcc -g -O2 t1.c
==385521== Parent PID: 165153
==385521==
==385521==
==385521== HEAP SUMMARY:
==385521== in use at exit: 173,136 bytes in 115 blocks
==385521==   total heap usage: 497 allocs, 382 frees, 229,696 bytes allocated
==385521==
==385521== 1 bytes in 1 blocks are still reachable in loss record 1 of 99

$ head -n20 after_patch.txt
==397575== Memcheck, a memory error detector
==397575== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==397575== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==397575== Command: /home/cargyris/gcc/install/bin/gcc -g -O2 t1.c
==397575== Parent PID: 165153
==397575==
==397575==
==397575== HEAP SUMMARY:
==397575== in use at exit: 146,573 bytes in 77 blocks
==397575==   total heap usage: 497 allocs, 420 frees, 229,696 bytes allocated
==397575==
==397575== 3 bytes in 1 blocks are indirectly lost in loss record 1 of 61

This makes some sense because driver::finalize() performs many
memory deallocations, among other things.  It was introduced here:

https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9376dd63e6a2d94823f6faf8212c9f37bef5a656

Is there any reason for not wanting to call it before returning from main?

Also, the driver constructor's can_finalize boolean looks more like a
can_restore_env boolean, as it only affects the environment manager
and nothing else in the driver (no?).  By making the env.restore() call
in driver::finalize() conditional on can_restore_env, it is possible to
always call driver::finalize(), even for can_restore_env == false, and
still do everything else that finalize() does, including freeing memory.

Is there anything wrong with calling d.finalize() from main without
restoring the environment?  The next step is returning from main
anyway, and the program ends, so I can't think of any reason not
to do this, given that it helps with the memory leaks.  Is there?


0001-driver-Call-finalize-method-from-main.patch
Description: Binary data
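
Since the patch itself is only available as an attachment, the proposal
can be illustrated with a toy model: cleanup always runs, and only the
environment restore is gated on the flag.  This is a sketch under stated
assumptions, written in C rather than the driver's C++.  struct toy_driver
and its functions are hypothetical, and the env_restored field merely
records where the real code would call env.restore().

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much simplified model of the driver object.  */
struct toy_driver
{
  bool can_restore_env;  /* Proposed replacement for can_finalize.  */
  char *scratch;         /* Stands in for memory the driver owns.  */
  bool env_restored;     /* Records whether the environment was restored.  */
};

void
toy_driver_init (struct toy_driver *d, bool can_restore_env)
{
  d->can_restore_env = can_restore_env;
  d->scratch = malloc (16);
  if (d->scratch)
    strcpy (d->scratch, "spec data");
  d->env_restored = false;
}

/* Always release memory; restore the environment only when allowed.  */
void
toy_driver_finalize (struct toy_driver *d)
{
  if (d->can_restore_env)
    d->env_restored = true;  /* Placeholder for env.restore ().  */
  free (d->scratch);
  d->scratch = NULL;
}
```

Calling toy_driver_finalize just before returning from main frees the
owned memory in both configurations, which is the effect the valgrind
summaries above are measuring.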

[PATCH] download_prerequisites: add --only-gettext

2023-12-02 Thread Arsen Arsenović

contrib/ChangeLog:

* download_prerequisites: Parse --only-gettext.
(echo_archives): Check only_gettext and stop early if true.
(helptext): Document --only-gettext.
---
Afternoon,

This patch adds a --only-gettext option to download_prerequisites for
when the only useful dependency to download is gettext (which will
restore a gcc source tree to a similar 'intlness' as before the
externalization of gettext-runtime).

For context, see
https://inbox.sourceware.org/CAFiYyc2-JxH358GUcZfR4iBMq5qj6Nf4W=7lyoqyw6b-u8d...@mail.gmail.com/

OK for trunk?

TIA, have a lovely day!

 contrib/download_prerequisites | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites
index 9568091c0dba..30ff0cc9491a 100755
--- a/contrib/download_prerequisites
+++ b/contrib/download_prerequisites
@@ -36,16 +36,18 @@ gettext='gettext-0.22.tar.gz'
 base_url='http://gcc.gnu.org/pub/gcc/infrastructure/'
 
 echo_archives() {
+echo "${gettext}"
+if "${only_gettext}"; then return; fi
 echo "${gmp}"
 echo "${mpfr}"
 echo "${mpc}"
-echo "${gettext}"
 if [ ${graphite} -gt 0 ]; then echo "${isl}"; fi
 }
 
 graphite=1
 verify=1
 force=0
+only_gettext=false
 OS=$(uname)
 
 if type wget > /dev/null ; then
@@ -74,6 +76,7 @@ The following options are available:
  --no-verify  don't verify package integrity
  --sha512 use SHA512 checksum to verify package integrity (default)
  --md5use MD5 checksum to verify package integrity
+ --only-gettext   inhibit downloading any package but gettext
  --help   show this text and exit
  --versionshow version information and exit
 "
@@ -159,6 +162,9 @@ do
 chksum_extension='md5'
 verify=1
 ;;
+--only-gettext)
+only_gettext=true
+;;
 -*)
 die "unknown option: ${arg}"
 ;;
-- 
2.43.0

Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-02 Thread Jeff Law





On 12/1/23 22:47, Sam James wrote:


Jeff Law  writes:


On 12/1/23 18:13, Sam James wrote:

钟居哲  writes:


Hi, this patch causes an error when building newlib/glibc/musl on the RISC-V port:

/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_access.c:8:40: error: passing argument 3 of 'syscall_errno' makes integer from pointer without a cast [-Wint-conversion]
    8 |   return syscall_errno (SYS_access, 2, file, mode, 0, 0, 0, 0);
      |                                        ^~~~
      |                                        |
      |                                        const char *

This looks like an issue in newlib. We expect broken code to be broken
by the recent changes. Can you investigate it on the newlib side?

A ton of stuff in newlib/libgloss is broken due to the compiler
changes.   But that's not a big surprise -- much of the
newlib/libgloss code is c89 and clearly wrong for c99 and newer.


Yeah, it's probably a reasonable candidate for -fpermissive to start
with until it's cleaned up.
Perhaps.  Particularly if it can be confined to libgloss as that's where 
the bulk of the problems are.   It'd be even better if we could 
constrain it per-port, but I suspect putting all that in place would be 
more work than just fixing this stuff.






(Also, sorry, I didn't mean my comment to appear glib. I just meant to
say "yes, this looks expected".)
No worries, I didn't take it that way at all.  I fully agree this looks 
expected and while annoying it's not a big deal IMHO.  We fix and move on.


jeff
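
For readers unfamiliar with the diagnostic, the kind of source fix being
discussed for libgloss can be sketched as follows.  This is a minimal
illustration, not the newlib code: toy_syscall and toy_access are
hypothetical stand-ins for syscall_errno and the access wrapper, and the
syscall number 33 is arbitrary.

```c
#include <stdint.h>

/* Hypothetical stand-in for libgloss's syscall_errno: every argument is
   passed as a long.  */
long
toy_syscall (long number, long arg0, long arg1)
{
  (void) number;
  (void) arg1;
  return arg0;  /* Echo the first argument so the conversion is observable.  */
}

/* Passing FILE directly, as in
     return toy_syscall (33, file, mode);
   used to draw only a -Wint-conversion warning and is now an error by
   default.  Making the pointer-to-integer conversion explicit keeps the
   call valid without -fpermissive.  */
long
toy_access (const char *file, int mode)
{
  return toy_syscall (33, (long) (intptr_t) file, mode);
}
```

The alternative mentioned in the thread, building the affected
directories with -fpermissive, avoids touching the sources but leaves the
implicit conversion in place.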

Re: [PATCH v6 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-12-02 Thread Jason Merrill


On 12/1/23 20:31, waffl3x wrote:

On Friday, December 1st, 2023 at 9:52 AM, Jason Merrill  
wrote:

On 12/1/23 01:02, waffl3x wrote:


I ran into another issue while devising tests for redeclarations of
xobj member functions as static member functions and vice versa. I am
pretty sure by the literal wording of the standard, this is well formed.

template<typename T>
concept Constrain = true;

struct S {
void f(this auto, Constrain auto) {};
static void f(Constrain auto) {};

void g(this auto const&, Constrain auto) {};
static void g(Constrain auto) {};

void h(this auto&&, Constrain auto) {};
static void h(Constrain auto) {};
};

And also,

struct S{
void f(this auto) {};
static void f() {};

void g(this auto const&) {};
static void g() {};

void h(this auto&&) {};
static void h() {};
};

I wrote these tests expecting them to be ill-formed, and found what I
thought was a bug when they were not diagnosed as redeclarations.
However, given how the code for resolving overloads and determining
redeclarations looks, I believe this is actually well formed on a
technicality. I can't find the passages in the standard that specify
this so I can't be sure.



I think the relevant section is
https://eel.is/c++draft/basic.scope.scope


Anyway, the template parameter list differs because of the deduced
object parameter. Now here is the question, you are required to ignore
the object parameter when determining if these are redeclarations or
not, but what about the template parameters associated with the object
parameter? Am I just missing the passage that specifies this or is this
an actual defect in the standard?



I think that since they differ in template parameters, they don't
correspond under https://eel.is/c++draft/basic.scope.scope#4.5 so they
can be overloaded.

This is specified in terms of the template-head grammar non-terminal,
but elsewhere we say that abbreviated templates are equivalent to
writing out the template parameters explicitly.


The annoying thing is, even if this was brought up, I think the only
solution is to ratify these examples as well formed.


Yes.


I can't get over that I feel like this goes against the spirit of the
specification. Just because an object argument is deduced should not
suddenly mean we take it into account. Too bad there's no good solution.


Yep.  Note that it's normal for a template to overload with a non-template:

struct A
{
  void f();
  template <class T> void f();  // OK
};


I especially don't like that the following case is ambiguous. I
understand why, but I don't like it.

template<typename T>
concept Constrain = true;

struct S {
   int f(this auto, Constrain auto) {};
   static f(auto) {};
};
main() {
   S{}.f(0);
}

I would like to see this changed honestly. When an ambiguity is
encountered, the more constrained function should be taken into account
even if they normally can't be considered. Is there some pitfall with
this line of thinking that kept it out of the standard? Is it just a
case of "too hard to specify" or is there some reason it's impossible
to do in all but the simplest of cases?


I would actually expect the static function to be chosen as more 
specialized before we get to considering constraints, just as with


void f(auto, Constrain auto) = delete;
void f(const S&, auto) {}
int main() { f(S{},0); } // OK

Though it looks like [temp.func.order] needs to be adjusted for explicit 
object parameters.  And more_specialized_fn in gcc still has an outdated 
handling of object parameters that just skips them, from before the 
clearer specification in C++11 and later; this is PR53499.  No need to 
address that preexisting bug in this patch.


Jason

[PATCH] gettext: disable install, docs targets, libasprintf, threads

2023-12-02 Thread Arsen Arsenović

This fixes issues reported by David Edelsohn and by Eric Gallager.

ChangeLog:

* Makefile.def (gettext): Disable (via missing)
{install-,}{pdf,html,info,dvi} and TAGS targets.  Set no_install
to true.  Add --disable-threads --disable-libasprintf.  Drop the
lib_path (as there are no shared libs).
---
Afternoon,

This patch disables various targets and features on the gettext module
to fix problems reported by David Edelsohn and Eric Gallager in
https://inbox.sourceware.org/CAGWvnynmWgNjup4cAwSbsy1vw_MJLQqSULwM=kth_+lt+_s...@mail.gmail.com/
and followups and on IRC, respectively.

The gettext module does not actually require any of these to be usable
for the purposes of the toolchain, so disabling them seems to be a
decent workaround.

This seemed to fix the respective issues for both Eric and David,
though, I could not get GDB to build on AIX with or without this patch
applied (I needed to disable sim, gdb and gnulib modules).

It is possible I am missing something.  Due to some unfortunate
circumstances, it's taken more time than anticipated to actually get
this change tested, and I've had to context swap quite a few bits.  Such
a process has quite a lot of room for error.

Tested on x86_64-unknown-freebsd13.2.

 Makefile.def |  13 +++-
 Makefile.in  | 202 ---
 [removed regenerated file from the patch below]
 2 files changed, 40 insertions(+), 175 deletions(-)

diff --git a/Makefile.def b/Makefile.def
index 792f81447e1b..ba89d46b2495 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -80,8 +80,17 @@ host_modules= { module= gettext; bootstrap=true; no_install=true;
// need it in some configuratons, which is determined via nontrivial tests.
// Always enabling pic seems to make sense for something tied to
// user-facing output.
-extra_configure_flags='--disable-shared --disable-java --disable-csharp --with-pic';
-lib_path=intl/.libs; };
+   extra_configure_flags='--disable-shared --disable-threads --disable-java --disable-csharp --with-pic --disable-libasprintf';
+   missing= pdf;
+   missing= html;
+   missing= info;
+   missing= dvi;
+   missing= install-pdf;
+   missing= install-html;
+   missing= install-info;
+   missing= install-dvi;
+   missing= TAGS;
+   no_install= true; };
 host_modules= { module= tcl;
 missing=mostlyclean; };
 host_modules= { module= itcl; };
diff --git a/Makefile.in b/Makefile.in
index da2344b3f3dc..3bd7d37e9605 100644
 
-- 
2.43.0

Re: [PATCH] Tweak language choice in config-list.mk

2023-12-02 Thread Richard Sandiford

Richard Biener  writes:
> On Thu, Sep 7, 2023 at 11:30 AM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> When I tried to use config-list.mk, the build for every triple except
>> the build machine's failed for m2.  This is because, unlike other
>> languages, m2 builds target objects during all-gcc.  The build will
>> therefore fail unless you have access to an appropriate binutils
>> (or an equivalent).  That's quite a big ask for over 100 targets. :)
>>
>> This patch therefore makes m2 an optional inclusion.
>>
>> Doing that wasn't entirely straightforward though.  The current
>> configure line includes "--enable-languages=all,...", which means
>> that the "..." can only force languages to be added that otherwise
>> wouldn't have been.  (I.e. the only effect of the "..." is to
>> override configure autodetection.)
>>
>> The choice of all,ada and:
>>
>>   # Make sure you have a recent enough gcc (with ada support) in your path so
>>   # that --enable-werror-always will work.
>>
>> make it clear that lack of GNAT should be a build failure rather than
>> silently ignored.  This predates the D frontend, which requires GDC
>> in the same way that Ada requires GNAT.  I don't know of a reason
>> why D should be treated differently.
>>
>> The patch therefore expands the "all" into a specific list of
>> languages.
>>
>> That in turn meant that Fortran had to be handled specially,
>> since bpf and mmix don't support Fortran.
>>
>> Perhaps there's an argument that m2 shouldn't build target objects
>> during all-gcc,
>
> Yes, I think that's unfortunate - can you open a bugreport for this?

For the record, I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112825

>> but (a) it works for practical usage and (b) the
>> patch is an easy workaround.  I'd be happy for the patch to be
>> reverted if the build system changes.
>>
>> OK to install?
>
> OK.

Thanks.  Now belatedly pushed after using it to retest the attribute
namespace patch (thanks for reviewing that too).

Richard

>> Richard
>>
>>
>> gcc/
>> * contrib/config-list.mk (OPT_IN_LANGUAGES): New variable.
>> ($(LIST)): Replace --enable-languages=all with a specific list.
>> Disable fortran on bpf and mmix.  Enable the languages in
>> OPT_IN_LANGUAGES.
>> ---
>>  contrib/config-list.mk | 17 ++---
>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/contrib/config-list.mk b/contrib/config-list.mk
>> index e570b13c71b..50ecb014bc0 100644
>> --- a/contrib/config-list.mk
>> +++ b/contrib/config-list.mk
>> @@ -12,6 +12,11 @@ TEST=all-gcc
>>  # supply an absolute path.
>>  GCC_SRC_DIR=../../gcc
>>
>> +# Define this to ,m2 if you want to build Modula-2.  Modula-2 builds target
>> +# objects during all-gcc, so it can only be included if you've installed
>> +# binutils (or an equivalent) for each target.
>> +OPT_IN_LANGUAGES=
>> +
>>  # Use -j / -l make arguments and nice to assure a smooth resource-efficient
>>  # load on the build machine, e.g. for 24 cores:
>>  # svn co svn://gcc.gnu.org/svn/gcc/branches/foo-branch gcc
>> @@ -126,17 +131,23 @@ $(LIST): make-log-dir
>>         TGT=`echo $@ | awk 'BEGIN { FS = "OPT" }; { print $$1 }'` && \
>>         TGT=`$(GCC_SRC_DIR)/config.sub $$TGT` && \
>>         case $$TGT in \
>> -         *-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix* | bpf-*-*) \
>> +         bpf-*-*) \
>>             ADDITIONAL_LANGUAGES=""; \
>>             ;; \
>> -         *) \
>> +         *-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix* | bpf-*-*) \
>> +           ADDITIONAL_LANGUAGES=",fortran"; \
>> +           ;; \
>> +         mmix-*-*) \
>>             ADDITIONAL_LANGUAGES=",go"; \
>>             ;; \
>> +         *) \
>> +           ADDITIONAL_LANGUAGES=",fortran,go"; \
>> +           ;; \
>> esac &&

Re: [PATCH] Tweak language choice in config-list.mk

2023-12-02 Thread Richard Sandiford

Christophe Lyon  writes:
> Hi!
>
>
> On Thu, 7 Sept 2023 at 11:30, Richard Sandiford via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
>> When I tried to use config-list.mk, the build for every triple except
>> the build machine's failed for m2.  This is because, unlike other
>> languages, m2 builds target objects during all-gcc.  The build will
>> therefore fail unless you have access to an appropriate binutils
>> (or an equivalent).  That's quite a big ask for over 100 targets. :)
>>
>> This patch therefore makes m2 an optional inclusion.
>>
>> Doing that wasn't entirely straightforward though.  The current
>> configure line includes "--enable-languages=all,...", which means
>> that the "..." can only force languages to be added that otherwise
>> wouldn't have been.  (I.e. the only effect of the "..." is to
>> override configure autodetection.)
>>
>> The choice of all,ada and:
>>
>>   # Make sure you have a recent enough gcc (with ada support) in your path so
>>   # that --enable-werror-always will work.
>>
>> make it clear that lack of GNAT should be a build failure rather than
>> silently ignored.  This predates the D frontend, which requires GDC
>> in the same way that Ada requires GNAT.  I don't know of a reason
>> why D should be treated differently.
>>
>> The patch therefore expands the "all" into a specific list of
>> languages.
>>
>> That in turn meant that Fortran had to be handled specially,
>> since bpf and mmix don't support Fortran.
>>
>> Perhaps there's an argument that m2 shouldn't build target objects
>> during all-gcc, but (a) it works for practical usage and (b) the
>> patch is an easy workaround.  I'd be happy for the patch to be
>> reverted if the build system changes.
>>
>> OK to install?
>>
>> Richard
>>
>>
>> gcc/
>> * contrib/config-list.mk (OPT_IN_LANGUAGES): New variable.
>> ($(LIST)): Replace --enable-languages=all with a specific list.
>> Disable fortran on bpf and mmix.  Enable the languages in
>> OPT_IN_LANGUAGES.
>> ---
>>  contrib/config-list.mk | 17 ++---
>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/contrib/config-list.mk b/contrib/config-list.mk
>> index e570b13c71b..50ecb014bc0 100644
>> --- a/contrib/config-list.mk
>> +++ b/contrib/config-list.mk
>> @@ -12,6 +12,11 @@ TEST=all-gcc
>>  # supply an absolute path.
>>  GCC_SRC_DIR=../../gcc
>>
>> +# Define this to ,m2 if you want to build Modula-2.  Modula-2 builds target
>> +# objects during all-gcc, so it can only be included if you've installed
>> +# binutils (or an equivalent) for each target.
>> +OPT_IN_LANGUAGES=
>> +
>>  # Use -j / -l make arguments and nice to assure a smooth resource-efficient
>>  # load on the build machine, e.g. for 24 cores:
>>  # svn co svn://gcc.gnu.org/svn/gcc/branches/foo-branch gcc
>> @@ -126,17 +131,23 @@ $(LIST): make-log-dir
>>         TGT=`echo $@ | awk 'BEGIN { FS = "OPT" }; { print $$1 }'` && \
>>         TGT=`$(GCC_SRC_DIR)/config.sub $$TGT` && \
>>         case $$TGT in \
>> -         *-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix* | bpf-*-*) \
>> +         bpf-*-*) \
>>             ADDITIONAL_LANGUAGES=""; \
>>             ;; \
>> -         *) \
>> +         *-*-darwin* | *-*-cygwin* | *-*-mingw* | *-*-aix* | bpf-*-*) \
>>
> Am I misreading, or are you matching bpf both here and above? From your commit
> message, I think bpf should appear either only above (defining
> ADDITIONAL_LANGUAGES as "") or here along with mmix (if it supports go)?

Thanks for the catch.  I forgot to remove bpf from the old list when
adding the new case.  I've now (finally!) pushed the patch with the
redundant bpf removed.
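
For readers following the case logic, here is a minimal C sketch of the selection as it reads once the redundant bpf alternative is dropped.  The helper name additional_languages is hypothetical and not part of config-list.mk; it only mirrors the shell case statement, using POSIX fnmatch for the glob patterns.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Hypothetical helper (not part of config-list.mk): returns the suffix
   that the shell `case $$TGT in ...` would store in ADDITIONAL_LANGUAGES
   for a canonical target triple, with the redundant bpf-*-* alternative
   removed from the second arm.  */
static const char *
additional_languages (const char *tgt)
{
  /* Targets that get Fortran added but not Go.  */
  static const char *const fortran_only[] = {
    "*-*-darwin*", "*-*-cygwin*", "*-*-mingw*", "*-*-aix*"
  };

  /* bpf gets neither Fortran nor Go added.  */
  if (fnmatch ("bpf-*-*", tgt, 0) == 0)
    return "";

  for (size_t i = 0; i < sizeof fortran_only / sizeof fortran_only[0]; i++)
    if (fnmatch (fortran_only[i], tgt, 0) == 0)
      return ",fortran";

  /* mmix gets Go added but not Fortran.  */
  if (fnmatch ("mmix-*-*", tgt, 0) == 0)
    return ",go";

  /* Everything else gets both.  */
  return ",fortran,go";
}
```

As in the shell version, the order of the tests matters: the first matching pattern wins, which is why a second bpf-*-* alternative further down could never be reached.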

Richard

>> +           ADDITIONAL_LANGUAGES=",fortran"; \
>> +           ;; \
>> +         mmix-*-*) \
>>             ADDITIONAL_LANGUAGES=",go"; \
>>             ;; \
>> +         *) \
>> +           ADDITIONAL_LANGUAGES=",fortran,go"; \
>> +           ;; \
>>         esac && \
>>         $(GCC_SRC_DIR)/configure \
>>           --target=$(subst SCRIPTS,`pwd`/../scripts/,$(subst OPT,$(empty) -,$@)) \
>>           --enable-werror-always ${host_options} \
>> -   --enable-languages=all,ada$$ADDITIONAL_LANGUAGES;
>>

Re: [PATCH v1 1/2] LoongArch: Switch loongarch-def from C to C++ to make it possible.

2023-12-02 Thread Xi Ruoyao

On Sat, 2023-12-02 at 20:44 +0800, chenglulu wrote:
> > > @@ -657,12 +658,18 @@ abi_str (struct loongarch_abi abi)
> > >        strlen (loongarch_abi_base_strings[abi.base]));
> > >      else
> > >    {
> > > +  /* This situation has not yet occurred, so in order to avoid the
> > > +  -Warray-bounds warning during C++ syntax checking, this part
> > > +  of the code is commented first.*/
> > > +  /*
> > Just put a "gcc_unreachable ();" here?
> Um, I just thought that the code can't go here, I will add a prompt
> message here.:-(

If I read the code correctly, this is indeed unreachable so we can just
put gcc_unreachable() here.  But maybe I'm wrong.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

Re: [PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760]

2023-12-02 Thread Richard Sandiford

Eric Botcazou  writes:
>> So sorry to be awkward, but I don't think this is the way to go.  I think
>> we'll just end up playing whack-a-mole and adding df_note_add_problem to
>> lots of passes.
>
> We have been doing that for the past 15 years though, so what has changed?

Off-hand, I couldn't remember a case where we'd done this specifically
for false-positive REG_UNUSED notes.  (There probably were cases though.)

>> (FTR, I'm not saying passes have to avoid false negatives, just false
>> positives.  If a pass updates an instruction with a REG_UNUSED note,
>> and the pass is no longer sure whether the register is unused or not,
>> the pass can just delete the note.)
>
> Reintroducing the manual management of such notes would be a step backward.

I think false-positive REG_UNUSED notes are fundamentally different
from the other cases though.  If a register is unused then its natural
state is to remain unused.  The register will only become used if something
goes out of its way to add new uses of an instruction result that "just
happens to be there".  That's a deliberate decision and needs some
analysis to prove that it's safe.  Requiring the pass to clear REG_UNUSED
notes too doesn't seem like a significant extra burden.

Trying to reduce false-negative REG_UNUSED notes is different,
since deleting any instruction could in principle make a register
go from used to unused.  Same for REG_DEAD notes: if a pass deletes
an instruction with a REG_DEAD note then it shouldn't have to figure
out where the new death occurs.
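
The asymmetry argued above can be sketched with a toy model in C.  This is only an illustration under stated assumptions: struct toy_insn and both helpers are hypothetical names, not GCC's rtx_insn or any real GCC API.  A pass that adds a new use of an existing result drops the note, and a consumer that trusts the note then leaves the instruction alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an instruction that sets one register.  This is NOT
   GCC's rtx_insn; every name in this sketch is hypothetical.  */
struct toy_insn
{
  int dest_reg;      /* register written by the instruction */
  bool unused_note;  /* models a REG_UNUSED note for dest_reg */
  bool deleted;      /* set once a consumer has removed the insn */
};

/* A pass that starts reading DEF's result can no longer vouch for the
   note, so it simply drops it.  A missing note (false negative) is
   harmless; a stale note (false positive) is not.  */
static void
toy_add_use_of_result (struct toy_insn *def)
{
  def->unused_note = false;
}

/* A consumer that trusts the note: it deletes the instruction only when
   the note says the result is unused.  */
static bool
toy_delete_if_unused (struct toy_insn *insn)
{
  if (!insn->unused_note)
    return false;
  insn->deleted = true;
  return true;
}
```

In this model, forgetting the call to toy_add_use_of_result is exactly the false-positive case: the consumer would delete an instruction whose result is now read.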

Not sure how representative this is, but I tried the hack below
to flag cases where single_set is used in passes that don't have
up-to-date notes, then ran it on execute.exp.  The checking fired
for every version of every test.  The collected passes were:

single_set: bbro
single_set: cc_fusion
single_set: ce1
single_set: ce2
single_set: ce3
single_set: cmpelim
single_set: combine
single_set: compgotos
single_set: cprop
single_set: dwarf2
single_set: fold_mem_offsets
single_set: fwprop1
single_set: fwprop2
single_set: gcse2
single_set: hoist
single_set: init-regs
single_set: ira
single_set: jump2
single_set: jump_after_combine
single_set: loop2_done
single_set: loop2_invariant
single_set: postreload
single_set: pro_and_epilogue
single_set: ree
single_set: reload
single_set: rtl_dce
single_set: rtl pre
single_set: sched1
single_set: sched2
single_set: sched_fusion
single_set: sms
single_set: split1
single_set: split2
single_set: split3
single_set: split5
single_set: subreg3
single_set: ud_dce
single_set: vartrack
single_set: web

which is a lot of passes :)

Some of the calls might be OK in context, due to call-specific
circumstances.  But I think it's generally easier to see/show that
a pass is adding new uses of existing defs than it is to prove that
a use of single_set is safe even when notes aren't up-to-date.

Thanks,
Richard

diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
index d2cfaf7f50f..ece49f041e0 100644
--- a/gcc/df-problems.cc
+++ b/gcc/df-problems.cc
@@ -3782,6 +3782,8 @@ void
 df_note_add_problem (void)
 {
   df_add_problem (&problem_NOTE);
+  extern bool single_set_ok;
+  single_set_ok = true;
 }

diff --git a/gcc/passes.cc b/gcc/passes.cc
index 6f894a41d22..d8e12ea2512 100644
--- a/gcc/passes.cc
+++ b/gcc/passes.cc
@@ -2637,9 +2637,14 @@ execute_one_pass (opt_pass *pass)
 do_per_function (verify_curr_properties,
 (void *)(size_t)pass->properties_required);

+  extern bool single_set_ok;
+  single_set_ok = !df;
+
   /* Do it!  */
   todo_after = pass->execute (cfun);

+  single_set_ok = !df;
+
   if (todo_after & TODO_discard_function)
 {
   /* Stop timevar.  */
diff --git a/gcc/rtl.h b/gcc/rtl.h
index e4b6cc0dbb5..af3bd1b7cfa 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3607,8 +3607,11 @@ extern void add_auto_inc_notes (rtx_insn *, rtx);

 /* Handle the cheap and common cases inline for performance.  */

+extern void check_single_set ();
 inline rtx single_set (const rtx_insn *insn)
 {
+  check_single_set ();
+
   if (!INSN_P (insn))
 return NULL_RTX;

diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 87267ee3b88..e0207e2753a 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl-iter.h"
 #include "hard-reg-set.h"
 #include "function-abi.h"
+#include "tree-pass.h"

 /* Forward declarations */
 static void set_of_1 (rtx, const_rtx, void *);
@@ -1543,6 +1544,20 @@ record_hard_reg_uses (rtx *px, void *data)
It may also have CLOBBERs, USEs, or SET whose output
will not be used, which we ignore.  */

+bool single_set_ok;
+void 
+check_single_set ()
+{
+  static opt_pass *last_pass;
+  if (!single_set_ok
+  && current_pass
+  && last_pass != current_pass)
+{
+  last_pass = current_pass;
+  fprintf (stderr, "single_set: %s\n", current_pass->name);
+}
+}
+
 rtx
 single_set_2 (const rtx_insn *insn, const_rtx pat)
 {

Re: [PATCH v1 1/2] LoongArch: Switch loongarch-def from C to C++ to make it possible.

2023-12-02 Thread chenglulu




On 2023/12/2 at 6:15 PM, Xi Ruoyao wrote:

On Sat, 2023-12-02 at 16:14 +0800, Lulu Cheng wrote:

/* snip */


diff --git a/gcc/config/loongarch/loongarch-opts.cc b/gcc/config/loongarch/loongarch-opts.cc
index b5836f198c0..6861642a98d 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -163,6 +163,7 @@ loongarch_config_target (struct loongarch_target *target,
     int follow_multilib_list_p)
  {
    struct loongarch_target t;
+
    if (!target)
  return;
  
@@ -657,12 +658,18 @@ abi_str (struct loongarch_abi abi)

     strlen (loongarch_abi_base_strings[abi.base]));
    else
  {
+  /* This situation has not yet occurred, so in order to avoid the
+-Warray-bounds warning during C++ syntax checking, this part
+of the code is commented first.*/
+  /*

Just put a "gcc_unreachable ();" here?

Um, I just thought that the code can't go here, I will add a prompt
message here.:-(

    APPEND_STRING (loongarch_abi_base_strings[abi.base])
    APPEND1 ('/')
    APPEND_STRING (loongarch_abi_ext_strings[abi.ext])
    APPEND1 ('\0')
  
    return XOBFINISH (_obstack, const char *);

+  */
+  gcc_unreachable ();
  }
  }
  
diff --git a/gcc/config/loongarch/loongarch-opts.h b/gcc/config/loongarch/loongarch-opts.h
index fa3773223bc..7a644c86d48 100644
--- a/gcc/config/loongarch/loongarch-opts.h
+++ b/gcc/config/loongarch/loongarch-opts.h
@@ -21,7 +21,10 @@ along with GCC; see the file COPYING3.  If not see
  #ifndef LOONGARCH_OPTS_H
  #define LOONGARCH_OPTS_H
  
+/* This is a C++ header and it shouldn't be used by target libraries.  */

+#if !defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)
  #include "loongarch-def.h"
+#endif

With this change we can revert r14-5634 (remove the #if
!defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)
guards in loongarch-def.h as they'll be unneeded).



Ok!

Thanks!

[PATCH] libgcov: Call __builtin_fork instead of fork

2023-12-02 Thread Florian Weimer

Some targets do not provide a prototype for fork, and compilation now
fails with an implicit-function-declaration error.

libgcc/

* libgcov-interface.c (__gcov_fork): Call __builtin_fork instead of fork.

Generated code is the same on x86_64-linux-gnu.  Okay for trunk?

Thanks,
Florian
---
 libgcc/libgcov-interface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgcc/libgcov-interface.c b/libgcc/libgcov-interface.c
index b2ee9308641..d166e98510d 100644
--- a/libgcc/libgcov-interface.c
+++ b/libgcc/libgcov-interface.c
@@ -182,7 +182,7 @@ pid_t
 __gcov_fork (void)
 {
   pid_t pid;
-  pid = fork ();
+  pid = __builtin_fork ();
   if (pid == 0)
 {
   __GTHREAD_MUTEX_INIT_FUNCTION (&__gcov_mx);

base-commit: 193ef02a7f4f3e5349ad9cf8d3d4df466a99b677

Re: [PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760]

2023-12-02 Thread Eric Botcazou

> So sorry to be awkward, but I don't think this is the way to go.  I think
> we'll just end up playing whack-a-mole and adding df_note_add_problem to
> lots of passes.

We have been doing that for the past 15 years though, so what has changed?

> (FTR, I'm not saying passes have to avoid false negatives, just false
> positives.  If a pass updates an instruction with a REG_UNUSED note,
> and the pass is no longer sure whether the register is unused or not,
> the pass can just delete the note.)

Reintroducing the manual management of such notes would be a step backward.

-- 
Eric Botcazou

[PATCH] htdocs/contribute.html: correct disctinct->distinct spelling

2023-12-02 Thread Jonny Grant

Correct a spelling mistake on this page:
https://gcc.gnu.org/contribute.html


From 5bfcc97cafb195c68d3110247dabeaf464b8d367 Mon Sep 17 00:00:00 2001
From: Jonathan Grant 
Date: Sat, 2 Dec 2023 11:55:40 +
Subject: [PATCH] htdocs/contribute.html: correct disctinct->distinct spelling

---
 htdocs/contribute.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/contribute.html b/htdocs/contribute.html
index 7c1ae323..fbe5b39c 100644
--- a/htdocs/contribute.html
+++ b/htdocs/contribute.html
@@ -299,7 +299,7 @@ followed by a colon.  For example,
 
 
 Some large components may be subdivided into sub-components.  If
-the subcomponent name is not disctinct in its own right, you can use the
+the subcomponent name is not distinct in its own right, you can use the
 form component/sub-component:.
 
 Series identifier
-- 
2.40.1

Re: [PATCH] lower-bitint: Fix up lower_addsub_overflow [PR112807]

2023-12-02 Thread Richard Biener




> On 02.12.2023 at 12:05, Jakub Jelinek wrote:
> 
> Hi!
> 
> lower_addsub_overflow uses handle_cast or handle_operand to extract the current
> limb from the operands.  Both of those functions heavily assume that they
> return a large or huge BITINT_TYPE.  The problem in the testcase is that
> this is violated.  Normally, lower_addsub_overflow isn't even called if
> neither the return type's element type nor any of the operands is a large/huge
> BITINT_TYPE (on x86_64 129+ bits); for middle BITINT_TYPE (on x86_64 65-128
> bits) some other code casts such operands to {,unsigned }__int128.
> In the testcase the result is complex unsigned, so small, but one of the
> arguments is _BitInt(256), so lower_addsub_overflow is called.  But
> range_for_prec asks the ranger for ranges of the operands and in this
> case the first argument has [0, 0xffffffff] range and the second [-2, 1], so
> unsigned 32-bit and signed 2-bit.  In such a case the code would, for
> handle_operand/handle_cast purposes, use the _BitInt(256) type for the
> first operand (ok), but because prec3, i.e. the maximum of the result precision
> and the precisions of the VRP-computed argument ranges, is 32, it would use a
> cast to a 32-bit BITINT_TYPE, which is why it didn't work correctly.
> The following patch ensures that in such cases we use handle_cast to the
> type of the other argument.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok

> Perhaps incrementally, we could try to optimize this in an earlier phase,
> see that while the .{ADD,SUB}_OVERFLOW has large/huge _BitInt argument, as
> ranger says it fits into a smaller type, add a cast of the larger argument
> to the smaller precision type in which it fits.  Either in
> gimple_lower_bitint, or match.pd.  An argument for the latter is that e.g.
> complex unsigned .ADD_OVERFLOW (unsigned_long_long_arg, unsigned_arg)
> where ranger says unsigned_long_long_arg fits into unsigned 32-bit could
> be also more efficient as
> .ADD_OVERFLOW ((unsigned) unsigned_long_long_arg, unsigned_arg)

Sounds reasonable.

Richard 

> 2023-12-02  Jakub Jelinek  
> 
>    PR middle-end/112807
>    * gimple-lower-bitint.cc (bitint_large_huge::lower_addsub_overflow):
>    When choosing type0 and type1 types, if prec3 has small/middle bitint
>    kind, use maximum of type0 and type1's precision instead of prec3.
> 
>    * gcc.dg/bitint-46.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj    2023-12-01 10:56:45.535228688 +0100
> +++ gcc/gimple-lower-bitint.cc    2023-12-01 18:38:24.633663667 +0100
> @@ -3911,15 +3911,18 @@ bitint_large_huge::lower_addsub_overflow
> 
>   tree type0 = TREE_TYPE (arg0);
>   tree type1 = TREE_TYPE (arg1);
> -  if (TYPE_PRECISION (type0) < prec3)
> +  int prec5 = prec3;
> +  if (bitint_precision_kind (prec5) < bitint_prec_large)
> +prec5 = MAX (TYPE_PRECISION (type0), TYPE_PRECISION (type1));
> +  if (TYPE_PRECISION (type0) < prec5)
> {
> -  type0 = build_bitint_type (prec3, TYPE_UNSIGNED (type0));
> +  type0 = build_bitint_type (prec5, TYPE_UNSIGNED (type0));
>   if (TREE_CODE (arg0) == INTEGER_CST)
>arg0 = fold_convert (type0, arg0);
> }
> -  if (TYPE_PRECISION (type1) < prec3)
> +  if (TYPE_PRECISION (type1) < prec5)
> {
> -  type1 = build_bitint_type (prec3, TYPE_UNSIGNED (type1));
> +  type1 = build_bitint_type (prec5, TYPE_UNSIGNED (type1));
>   if (TREE_CODE (arg1) == INTEGER_CST)
>arg1 = fold_convert (type1, arg1);
> }
> --- gcc/testsuite/gcc.dg/bitint-46.c.jj    2023-12-01 18:47:12.460245617 +0100
> +++ gcc/testsuite/gcc.dg/bitint-46.c    2023-12-01 18:46:41.297683578 +0100
> @@ -0,0 +1,32 @@
> +/* PR middle-end/112807 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=gnu23 -O2" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 256
> +__attribute__((noipa)) int
> +foo (_BitInt (256) a, _BitInt (2) b)
> +{
> +  if (a < 0 || a > ~0U)
> +return -1;
> +  return __builtin_sub_overflow_p (a, b, 0);
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#if __BITINT_MAXWIDTH__ >= 256
> +  if (foo (-5wb, 1wb) != -1
> +  || foo (1 + (_BitInt (256)) ~0U, -2) != -1
> +  || foo (0, 0) != 0
> +  || foo (0, 1) != 0
> +  || foo (0, -1) != 0
> +  || foo (~0U, 0) != 1
> +  || foo (__INT_MAX__, 0) != 0
> +  || foo (__INT_MAX__, -1) != 1
> +  || foo (1 + (_BitInt (256)) __INT_MAX__, 0) != 1
> +  || foo (1 + (_BitInt (256)) __INT_MAX__, 1) != 0
> +  || foo (1 + (_BitInt (256)) __INT_MAX__, -2) != 1)
> +__builtin_abort ();
> +#endif
> +}
> 
>Jakub
>

[PATCH] lower-bitint: Fix up lower_addsub_overflow [PR112807]

2023-12-02 Thread Jakub Jelinek

Hi!

lower_addsub_overflow uses handle_cast or handle_operand to extract the current
limb from the operands.  Both of those functions heavily assume that they
return a large or huge BITINT_TYPE.  The problem in the testcase is that
this is violated.  Normally, lower_addsub_overflow isn't even called if
neither the return type's element type nor any of the operands is a large/huge
BITINT_TYPE (on x86_64 129+ bits); for middle BITINT_TYPE (on x86_64 65-128
bits) some other code casts such operands to {,unsigned }__int128.
In the testcase the result is complex unsigned, so small, but one of the
arguments is _BitInt(256), so lower_addsub_overflow is called.  But
range_for_prec asks the ranger for ranges of the operands and in this
case the first argument has [0, 0xffffffff] range and the second [-2, 1], so
unsigned 32-bit and signed 2-bit.  In such a case the code would, for
handle_operand/handle_cast purposes, use the _BitInt(256) type for the
first operand (ok), but because prec3, i.e. the maximum of the result precision
and the precisions of the VRP-computed argument ranges, is 32, it would use a
cast to a 32-bit BITINT_TYPE, which is why it didn't work correctly.
The following patch ensures that in such cases we use handle_cast to the
type of the other argument.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Perhaps incrementally, we could try to optimize this in an earlier phase,
see that while the .{ADD,SUB}_OVERFLOW has large/huge _BitInt argument, as
ranger says it fits into a smaller type, add a cast of the larger argument
to the smaller precision type in which it fits.  Either in
gimple_lower_bitint, or match.pd.  An argument for the latter is that e.g.
complex unsigned .ADD_OVERFLOW (unsigned_long_long_arg, unsigned_arg)
where ranger says unsigned_long_long_arg fits into unsigned 32-bit could
be also more efficient as
.ADD_OVERFLOW ((unsigned) unsigned_long_long_arg, unsigned_arg)
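
The equivalence behind that suggestion can be checked at the source level with a small C sketch.  It uses the documented __builtin_add_overflow builtin rather than the internal .ADD_OVERFLOW function; the helper names are hypothetical, and the sketch assumes unsigned long long is wider than unsigned.  The two forms agree whenever the wide operand is known to fit into unsigned, and can differ otherwise, which is why the range information from the ranger is needed before narrowing.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wide form: first operand is unsigned long long, result is unsigned.  */
static bool
add_overflow_wide (unsigned long long a, unsigned b, unsigned *res)
{
  return __builtin_add_overflow (a, b, res);
}

/* Narrow form: only equivalent when the caller knows a <= UINT_MAX.  */
static bool
add_overflow_narrow (unsigned long long a, unsigned b, unsigned *res)
{
  return __builtin_add_overflow ((unsigned) a, b, res);
}

/* Returns nonzero if both forms give the same overflow flag and the same
   wrapped result for the given operands.  */
static int
forms_agree (unsigned long long a, unsigned b)
{
  unsigned r1, r2;
  bool o1 = add_overflow_wide (a, b, &r1);
  bool o2 = add_overflow_narrow (a, b, &r2);
  return o1 == o2 && r1 == r2;
}
```

For example, forms_agree (UINT_MAX, 1) holds because both forms report overflow with a wrapped result of 0, while an operand of UINT_MAX + 1ULL (outside the assumed range) makes the wide form report overflow and the narrow form not.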

2023-12-02  Jakub Jelinek  

PR middle-end/112807
* gimple-lower-bitint.cc (bitint_large_huge::lower_addsub_overflow):
When choosing type0 and type1 types, if prec3 has small/middle bitint
kind, use maximum of type0 and type1's precision instead of prec3.

* gcc.dg/bitint-46.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2023-12-01 10:56:45.535228688 +0100
+++ gcc/gimple-lower-bitint.cc  2023-12-01 18:38:24.633663667 +0100
@@ -3911,15 +3911,18 @@ bitint_large_huge::lower_addsub_overflow
 
   tree type0 = TREE_TYPE (arg0);
   tree type1 = TREE_TYPE (arg1);
-  if (TYPE_PRECISION (type0) < prec3)
+  int prec5 = prec3;
+  if (bitint_precision_kind (prec5) < bitint_prec_large)
+prec5 = MAX (TYPE_PRECISION (type0), TYPE_PRECISION (type1));
+  if (TYPE_PRECISION (type0) < prec5)
 {
-  type0 = build_bitint_type (prec3, TYPE_UNSIGNED (type0));
+  type0 = build_bitint_type (prec5, TYPE_UNSIGNED (type0));
   if (TREE_CODE (arg0) == INTEGER_CST)
arg0 = fold_convert (type0, arg0);
 }
-  if (TYPE_PRECISION (type1) < prec3)
+  if (TYPE_PRECISION (type1) < prec5)
 {
-  type1 = build_bitint_type (prec3, TYPE_UNSIGNED (type1));
+  type1 = build_bitint_type (prec5, TYPE_UNSIGNED (type1));
   if (TREE_CODE (arg1) == INTEGER_CST)
arg1 = fold_convert (type1, arg1);
 }
--- gcc/testsuite/gcc.dg/bitint-46.c.jj 2023-12-01 18:47:12.460245617 +0100
+++ gcc/testsuite/gcc.dg/bitint-46.c    2023-12-01 18:46:41.297683578 +0100
@@ -0,0 +1,32 @@
+/* PR middle-end/112807 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-std=gnu23 -O2" } */
+
+#if __BITINT_MAXWIDTH__ >= 256
+__attribute__((noipa)) int
+foo (_BitInt (256) a, _BitInt (2) b)
+{
+  if (a < 0 || a > ~0U)
+return -1;
+  return __builtin_sub_overflow_p (a, b, 0);
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 256
+  if (foo (-5wb, 1wb) != -1
+  || foo (1 + (_BitInt (256)) ~0U, -2) != -1
+  || foo (0, 0) != 0
+  || foo (0, 1) != 0
+  || foo (0, -1) != 0
+  || foo (~0U, 0) != 1
+  || foo (__INT_MAX__, 0) != 0
+  || foo (__INT_MAX__, -1) != 1
+  || foo (1 + (_BitInt (256)) __INT_MAX__, 0) != 1
+  || foo (1 + (_BitInt (256)) __INT_MAX__, 1) != 0
+  || foo (1 + (_BitInt (256)) __INT_MAX__, -2) != 1)
+__builtin_abort ();
+#endif
+}

Jakub

Re: [PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760]

2023-12-02 Thread Richard Sandiford

Jakub Jelinek  writes:
> Hi!
>
> The following testcase ICEs on x86_64-linux since df_note_add_problem ()
> call has been added to mode switching.
> The problem is that the pro_and_epilogue pass in
> prepare_shrink_wrap -> copyprop_hardreg_forward_bb_without_debug_insn
> uses regcprop.cc infrastructure which relies on accurate REG_DEAD/REG_UNUSED
> notes.  E.g. regcprop.cc
>   /* We need accurate notes.  Earlier passes such as if-conversion may
>  leave notes in an inconsistent state.  */
>   df_note_add_problem ();
> documents that.  On the testcase below it is in particular the
>   /* Detect obviously dead sets (via REG_UNUSED notes) and remove them.  */
>   if (set
>   && !RTX_FRAME_RELATED_P (insn)
>   && NONJUMP_INSN_P (insn)
>   && !may_trap_p (set)
>   && find_reg_note (insn, REG_UNUSED, SET_DEST (set))
>   && !side_effects_p (SET_SRC (set))
>   && !side_effects_p (SET_DEST (set)))
> {
>   bool last = insn == BB_END (bb);
>   delete_insn (insn);
>   if (last)
> break;
>   continue;
> } 
> case where a stale REG_UNUSED note breaks stuff up (added in vzeroupper
> pass, redundant insn after it deleted later).
>
> The following patch makes sure the notes are not stale if shrink wrapping
> is enabled.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

I still maintain that so much stuff relies on the lack of false-positive
REG_UNUSED notes that (whatever the intention might have been) we need
to prevent the false positive.  Like Andrew says, any use of single_set
is suspect if there's a REG_UNUSED note for something that is in fact used.

So sorry to be awkward, but I don't think this is the way to go.  I think
we'll just end up playing whack-a-mole and adding df_note_add_problem to
lots of passes.

(FTR, I'm not saying passes have to avoid false negatives, just false
positives.  If a pass updates an instruction with a REG_UNUSED note,
and the pass is no longer sure whether the register is unused or not,
the pass can just delete the note.)

Richard

> 2023-12-02  Jakub Jelinek  
>
>   PR rtl-optimization/112760
>   * function.cc (thread_prologue_and_epilogue_insns): If
>   SHRINK_WRAPPING_ENABLED, call df_note_add_problem before calling
>   df_analyze.
>
>   * gcc.dg/pr112760.c: New test.
>
> --- gcc/function.cc.jj    2023-11-07 08:32:01.699254744 +0100
> +++ gcc/function.cc   2023-12-01 13:42:51.885189341 +0100
> @@ -6036,6 +6036,11 @@ make_epilogue_seq (void)
>  void
>  thread_prologue_and_epilogue_insns (void)
>  {
> +  /* prepare_shrink_wrap uses copyprop_hardreg_forward_bb_without_debug_insn
> + which uses regcprop.cc functions which rely on accurate REG_UNUSED
> + and REG_DEAD notes.  */
> +  if (SHRINK_WRAPPING_ENABLED)
> +df_note_add_problem ();
>    df_analyze ();
>  
>    /* Can't deal with multiple successors of the entry block at the
> --- gcc/testsuite/gcc.dg/pr112760.c.jj    2023-12-01 13:46:57.444746529 +0100
> +++ gcc/testsuite/gcc.dg/pr112760.c   2023-12-01 13:46:36.729036971 +0100
> @@ -0,0 +1,22 @@
> +/* PR rtl-optimization/112760 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fno-dce -fno-guess-branch-probability --param=max-cse-insns=0" } */
> +/* { dg-additional-options "-m8bit-idiv -mavx" { target i?86-*-* x86_64-*-* } } */
> +
> +unsigned g;
> +
> +__attribute__((__noipa__)) unsigned short
> +foo (unsigned short a, unsigned short b)
> +{
> +  unsigned short x = __builtin_add_overflow_p (a, g, (unsigned short) 0);
> +  g -= g / b;
> +  return x;
> +}
> +
> +int
> +main ()
> +{
> +  unsigned short x = foo (40, 6);
> +  if (x != 0)
> +__builtin_abort ();
> +}
>
>   Jakub

[PATCH] c++: #pragma GCC unroll C++ fixes [PR112795]

2023-12-02 Thread Jakub Jelinek

Hi!

foo in the unroll-5.C testcase ICEs because cp_parser_pragma_unroll
during parsing calls maybe_constant_value unconditionally, which is
fine if !processing_template_decl, but can ICE otherwise.

While just calling fold_non_dependent_expr there instead could be enough
to fix the ICE (and I guess the right thing to do for backports if any),
I don't see a reason why we couldn't handle a dependent #pragma GCC unroll
argument as well, the unrolling isn't done in the FE and all the middle-end
cares about is that ANNOTATE_EXPR has a 1..65534 last operand when it is
annot_expr_unroll_kind.

So, the following patch changes all the unsigned short unroll arguments
to tree unroll (and thus avoids the tree -> unsigned short -> tree
conversions), does the type and value checking during parsing only if
the argument isn't dependent and repeats it during instantiation.
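As a rough illustration of the kind of code this is meant to accept, here is a sketch with a value-dependent unroll count.  The names `fill` and `check_fill` and the version guard are assumptions made for this example, not part of the patch; compilers without the change are expected to reject the dependent form, hence the guard.

```cpp
#include <cassert>

// Hypothetical example: N is a template parameter, so the unroll count is
// value-dependent while the template is parsed and only becomes a constant
// at instantiation time.
template <int N>
void
fill (int (&a)[N], int b)
{
  // Guarded on the assumption that only GCC 14 and later accept a
  // dependent argument here; other compilers simply skip the pragma.
#if defined (__GNUC__) && !defined (__clang__) && __GNUC__ >= 14
#pragma GCC unroll N
#endif
  for (int i = 0; i < N; i++)
    a[i] = b;
}

// Small self-check: the pragma must not change what the loop computes.
inline void
check_fill ()
{
  int a[4] = { 0, 0, 0, 0 };
  fill (a, 7);
  for (int v : a)
    assert (v == 7);
}
```

The pragma is only a hint to the middle-end, so the observable behaviour of `fill` is the same with or without it.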

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-12-02  Jakub Jelinek  

PR c++/112795
gcc/cp/
* cp-tree.h (cp_convert_range_for): Change UNROLL type from
unsigned short to tree.
(finish_while_stmt_cond, finish_do_stmt, finish_for_cond): Likewise.
* parser.cc (cp_parser_statement): Pass NULL_TREE rather than 0 to
cp_parser_iteration_statement UNROLL argument.
(cp_parser_for, cp_parser_c_for): Change UNROLL type from
unsigned short to tree.
(cp_parser_range_for): Likewise.  Set RANGE_FOR_UNROLL to just UNROLL
rather than build_int_cst from it.
(cp_convert_range_for, cp_parser_iteration_statement): Change UNROLL
type from unsigned short to tree.
(cp_parser_omp_loop_nest): Pass NULL_TREE rather than 0 to
cp_parser_range_for UNROLL argument.
(cp_parser_pragma_unroll): Return tree rather than unsigned short.
If parsed expression is type dependent, just return it, don't diagnose
issues with value if it is value dependent.
(cp_parser_pragma): Change UNROLL type from unsigned short to tree.
* semantics.cc (finish_while_stmt_cond): Change UNROLL type from
unsigned short to tree.  Build ANNOTATE_EXPR with UNROLL as its last
operand rather than build_int_cst from it.
(finish_do_stmt, finish_for_cond): Likewise.
* pt.cc (tsubst_stmt) : Change UNROLL type from
unsigned short to tree and set it to RECUR on RANGE_FOR_UNROLL (t).
(tsubst_expr) : For annot_expr_unroll_kind repeat
checks on UNROLL value from cp_parser_pragma_unroll.
gcc/testsuite/
* g++.dg/ext/unroll-5.C: New test.
* g++.dg/ext/unroll-6.C: New test.

--- gcc/cp/cp-tree.h.jj 2023-12-01 08:10:42.707324577 +0100
+++ gcc/cp/cp-tree.h2023-12-01 16:08:20.152165244 +0100
@@ -7371,7 +7371,7 @@ extern bool maybe_clone_body  (tree);
 
 /* In parser.cc */
 extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool,
- unsigned short, bool);
+ tree, bool);
 extern void cp_convert_omp_range_for (tree &, tree &, tree &,
  tree &, tree &, tree &, tree &, tree &);
 extern void cp_finish_omp_range_for (tree, tree);
@@ -7692,19 +7692,16 @@ extern void begin_else_clause   (tree);
 extern void finish_else_clause (tree);
 extern void finish_if_stmt (tree);
 extern tree begin_while_stmt   (void);
-extern void finish_while_stmt_cond (tree, tree, bool, unsigned short,
-bool);
+extern void finish_while_stmt_cond (tree, tree, bool, tree, bool);
 extern void finish_while_stmt  (tree);
 extern tree begin_do_stmt  (void);
 extern void finish_do_body (tree);
-extern void finish_do_stmt (tree, tree, bool, unsigned short,
-bool);
+extern void finish_do_stmt (tree, tree, bool, tree, bool);
 extern tree finish_return_stmt (tree);
 extern tree begin_for_scope(tree *);
 extern tree begin_for_stmt (tree, tree);
 extern void finish_init_stmt   (tree);
-extern void finish_for_cond(tree, tree, bool, unsigned short,
-bool);
+extern void finish_for_cond(tree, tree, bool, tree, bool);
 extern void finish_for_expr(tree, tree);
 extern void finish_for_stmt(tree);
 extern tree begin_range_for_stmt   (tree, tree);
--- gcc/cp/parser.cc.jj 2023-12-01 08:10:42.800323262 +0100
+++ gcc/cp/parser.cc2023-12-02 08:52:45.254387503 +0100
@@ -2391,15 +2391,15 @@ static tree cp_parser_selection_statemen
 static tree cp_parser_condition
   (cp_parser *);
 static tree cp_parser_iteration_statement
-  (cp_parser *, bool *, bool, unsigned short, bool);
+  (cp_parser

[PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760]

2023-12-02 Thread Jakub Jelinek

Hi!

The following testcase ICEs on x86_64-linux since a df_note_add_problem ()
call was added to mode switching.
The problem is that the pro_and_epilogue pass in
prepare_shrink_wrap -> copyprop_hardreg_forward_bb_without_debug_insn
uses regcprop.cc infrastructure which relies on accurate REG_DEAD/REG_UNUSED
notes.  E.g. regcprop.cc
  /* We need accurate notes.  Earlier passes such as if-conversion may
 leave notes in an inconsistent state.  */
  df_note_add_problem ();
documents that.  On the testcase below it is in particular the
  /* Detect obviously dead sets (via REG_UNUSED notes) and remove them.  */
  if (set
      && !RTX_FRAME_RELATED_P (insn)
      && NONJUMP_INSN_P (insn)
      && !may_trap_p (set)
      && find_reg_note (insn, REG_UNUSED, SET_DEST (set))
      && !side_effects_p (SET_SRC (set))
      && !side_effects_p (SET_DEST (set)))
    {
      bool last = insn == BB_END (bb);
      delete_insn (insn);
      if (last)
        break;
      continue;
    }
case where a stale REG_UNUSED note breaks stuff up (added in vzeroupper
pass, redundant insn after it deleted later).

The following patch makes sure the notes are not stale if shrink wrapping
is enabled.
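To make the failure mode concrete, here is a toy model in plain C.  It is not GCC code: `toy_insn`, `toy_recompute_notes` and `toy_delete_dead_sets` are hypothetical names, and the "note" is just a cached boolean standing in for REG_UNUSED.  It only sketches why a pass that trusts cached notes needs them recomputed first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code: each "insn" writes one register, may read one,
   and may carry a cached note claiming its result is never read.  */
struct toy_insn
{
  int dest;          /* register written by this insn */
  int src;           /* register read by this insn, or -1 for none */
  bool unused_note;  /* cached claim: DEST is not read afterwards */
  bool deleted;
};

/* Recompute every note from scratch.  LIVE_OUT is the one register
   assumed to be live at the end of the sequence.  */
static void
toy_recompute_notes (struct toy_insn *insns, int n, int live_out)
{
  for (int i = 0; i < n; i++)
    {
      bool used = insns[i].dest == live_out;
      for (int j = i + 1; j < n; j++)
        {
          if (insns[j].deleted)
            continue;
          if (insns[j].src == insns[i].dest)
            {
              used = true;
              break;
            }
          if (insns[j].dest == insns[i].dest)
            {
              used = false;  /* overwritten before any read */
              break;
            }
        }
      insns[i].unused_note = !used;
    }
}

/* Delete every insn whose note says the result is unused; return the
   number of deletions.  The notes are trusted blindly here.  */
static int
toy_delete_dead_sets (struct toy_insn *insns, int n)
{
  int count = 0;
  for (int i = 0; i < n; i++)
    if (!insns[i].deleted && insns[i].unused_note)
      {
        insns[i].deleted = true;
        count++;
      }
  return count;
}

static void
toy_demo (void)
{
  /* r1 = ...; r2 = f (r1).  The first insn carries a stale note.  */
  struct toy_insn stale[2] = { { 1, -1, true, false }, { 2, 1, false, false } };
  /* Trusting the stale note deletes a set that is still read.  */
  assert (toy_delete_dead_sets (stale, 2) == 1 && stale[0].deleted);

  struct toy_insn fresh[2] = { { 1, -1, true, false }, { 2, 1, false, false } };
  toy_recompute_notes (fresh, 2, 2);
  assert (toy_delete_dead_sets (fresh, 2) == 0);
}
```

In the model, recomputing before deleting plays the role that requesting the note problem ahead of df_analyze is meant to play in the patch.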

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-12-02  Jakub Jelinek  

PR rtl-optimization/112760
* function.cc (thread_prologue_and_epilogue_insns): If
SHRINK_WRAPPING_ENABLED, call df_note_add_problem before calling
df_analyze.

* gcc.dg/pr112760.c: New test.

--- gcc/function.cc.jj  2023-11-07 08:32:01.699254744 +0100
+++ gcc/function.cc 2023-12-01 13:42:51.885189341 +0100
@@ -6036,6 +6036,11 @@ make_epilogue_seq (void)
 void
 thread_prologue_and_epilogue_insns (void)
 {
+  /* prepare_shrink_wrap uses copyprop_hardreg_forward_bb_without_debug_insn
+     which uses regcprop.cc functions which rely on accurate REG_UNUSED
+     and REG_DEAD notes.  */
+  if (SHRINK_WRAPPING_ENABLED)
+    df_note_add_problem ();
   df_analyze ();
 
   /* Can't deal with multiple successors of the entry block at the
--- gcc/testsuite/gcc.dg/pr112760.c.jj  2023-12-01 13:46:57.444746529 +0100
+++ gcc/testsuite/gcc.dg/pr112760.c 2023-12-01 13:46:36.729036971 +0100
@@ -0,0 +1,22 @@
+/* PR rtl-optimization/112760 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-dce -fno-guess-branch-probability --param=max-cse-insns=0" } */
+/* { dg-additional-options "-m8bit-idiv -mavx" { target i?86-*-* x86_64-*-* } } */
+
+unsigned g;
+
+__attribute__((__noipa__)) unsigned short
+foo (unsigned short a, unsigned short b)
+{
+  unsigned short x = __builtin_add_overflow_p (a, g, (unsigned short) 0);
+  g -= g / b;
+  return x;
+}
+
+int
+main ()
+{
+  unsigned short x = foo (40, 6);
+  if (x != 0)
+    __builtin_abort ();
+}

Jakub

Re: [PATCH] gcc: Disallow trampolines when -fhardened

2023-12-02 Thread Iain Sandoe




> On 2 Dec 2023, at 09:42, Martin Uecker  wrote:
> 
> 
>> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>> 
>> -- >8 --
>> It came up that a good hardening strategy is to disable trampolines
>> which may require executable stack.  Therefore the following patch
>> adds -Werror=trampolines to -fhardened.
> 
> This would add a warning about specific code (where it is then
> unclear whether rewriting it is feasible or even an improvement),
> which seems different to all the other flags -fhardened has
> now.
> 
> GCC now has an option to allocate trampolines on the heap,
> which would seem to be a better fit.

Indeed, I was thinking of mentioning this.

>  On the other hand,
> it does not work with longjmp which may be a limitation.

I suspect that we can make this work using handlers and forced unwind,
but unfortunately do not have time to work on it at the moment.

Iain

> 
> Martin
> 
> 
>> 
>> gcc/ChangeLog:
>> 
>>  * common.opt (Wtrampolines): Enable by -fhardened.
>>  * doc/invoke.texi: Reflect that -fhardened enables -Werror=trampolines.
>>  * opts.cc (print_help_hardened): Add -Werror=trampolines.
>>  * toplev.cc (process_options): Enable -Werror=trampolines for
>>  -fhardened.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.dg/fhardened-1.c: New test.
>>  * gcc.dg/fhardened-2.c: New test.
>>  * gcc.dg/fhardened-3.c: New test.
>>  * gcc.dg/fhardened-4.c: New test.
>>  * gcc.dg/fhardened-5.c: New test.
>> ---
>> gcc/common.opt |  2 +-
>> gcc/doc/invoke.texi|  1 +
>> gcc/opts.cc|  1 +
>> gcc/testsuite/gcc.dg/fhardened-1.c | 27 +++
>> gcc/testsuite/gcc.dg/fhardened-2.c | 25 +
>> gcc/testsuite/gcc.dg/fhardened-3.c | 25 +
>> gcc/testsuite/gcc.dg/fhardened-4.c | 25 +
>> gcc/testsuite/gcc.dg/fhardened-5.c | 27 +++
>> gcc/toplev.cc  |  8 +++-
>> 9 files changed, 139 insertions(+), 2 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.dg/fhardened-1.c
>> create mode 100644 gcc/testsuite/gcc.dg/fhardened-2.c
>> create mode 100644 gcc/testsuite/gcc.dg/fhardened-3.c
>> create mode 100644 gcc/testsuite/gcc.dg/fhardened-4.c
>> create mode 100644 gcc/testsuite/gcc.dg/fhardened-5.c
>> 
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 161a035d736..9b09c7cb3df 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -807,7 +807,7 @@ Common Var(warn_system_headers) Warning
>> Do not suppress warnings from system headers.
>> 
>> Wtrampolines
>> -Common Var(warn_trampolines) Warning
>> +Common Var(warn_trampolines) Warning EnabledBy(fhardened)
>> Warn whenever a trampoline is generated.
>> 
>> Wtrivial-auto-var-init
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 2fab4c5d71f..c1664a1a0f1 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -17745,6 +17745,7 @@ may change between major releases of GCC, but are 
>> currently:
>> -fstack-protector-strong
>> -fstack-clash-protection
>> -fcf-protection=full @r{(x86 GNU/Linux only)}
>> +-Werror=trampolines
>> }
>> 
>> The list of options enabled by @option{-fhardened} can be generated using
>> diff --git a/gcc/opts.cc b/gcc/opts.cc
>> index 5d5efaf1b9e..aa062b87cef 100644
>> --- a/gcc/opts.cc
>> +++ b/gcc/opts.cc
>> @@ -2517,6 +2517,7 @@ print_help_hardened ()
>>   printf ("  %s\n", "-fstack-protector-strong");
>>   printf ("  %s\n", "-fstack-clash-protection");
>>   printf ("  %s\n", "-fcf-protection=full");
>> +  printf ("  %s\n", "-Werror=trampolines");
>>   putchar ('\n');
>> }
>> 
>> diff --git a/gcc/testsuite/gcc.dg/fhardened-1.c 
>> b/gcc/testsuite/gcc.dg/fhardened-1.c
>> new file mode 100644
>> index 000..8710959b6f1
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/fhardened-1.c
>> @@ -0,0 +1,27 @@
>> +/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
>> +/* { dg-require-effective-target trampolines } */
>> +/* { dg-options "-fhardened -O" } */
>> +
>> +static void
>> +baz (int (*bar) (void))
>> +{
>> +  bar ();
>> +}
>> +
>> +int
>> +main (void)
>> +{
>> +  int a = 6;
>> +
>> +  int
>> +  bar (void)// { dg-error "trampoline" }
>> +  {
>> +return a;
>> +  }
>> +
>> +  baz (bar);
>> +
>> +  return 0;
>> +}
>> +
>> +/* { dg-prune-output "some warnings being treated as errors" } */
>> diff --git a/gcc/testsuite/gcc.dg/fhardened-2.c 
>> b/gcc/testsuite/gcc.dg/fhardened-2.c
>> new file mode 100644
>> index 000..d47512aa47f
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/fhardened-2.c
>> @@ -0,0 +1,25 @@
>> +/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
>> +/* { dg-require-effective-target trampolines } */
>> +/* { dg-options "-fhardened -O -Wno-trampolines" } */
>> +
>> +static void
>> +baz (int (*bar) (void))
>> +{
>> +  bar ();
>> +}
>> +
>> +int
>> +main (void)
>> +{
>> +  int a = 6;
>> +
>> +  int
>> +  bar (void)// {

Re: [PATCH v1 1/2] LoongArch: Switch loongarch-def from C to C++ to make it possible.

2023-12-02 Thread Xi Ruoyao

On Sat, 2023-12-02 at 16:14 +0800, Lulu Cheng wrote:

/* snip */

> diff --git a/gcc/config/loongarch/loongarch-opts.cc
> b/gcc/config/loongarch/loongarch-opts.cc
> index b5836f198c0..6861642a98d 100644
> --- a/gcc/config/loongarch/loongarch-opts.cc
> +++ b/gcc/config/loongarch/loongarch-opts.cc
> @@ -163,6 +163,7 @@ loongarch_config_target (struct loongarch_target
> *target,
>    int follow_multilib_list_p)
>  {
>    struct loongarch_target t;
> +
>    if (!target)
>  return;
>  
> @@ -657,12 +658,18 @@ abi_str (struct loongarch_abi abi)
>    strlen (loongarch_abi_base_strings[abi.base]));
>    else
>  {
> +  /* This situation has not yet occurred, so in order to avoid
> the
> +  -Warray-bounds warning during C++ syntax checking, this part
> +  of the code is commented first.*/
> +  /*

Just put a "gcc_unreachable ();" here?

>    APPEND_STRING (loongarch_abi_base_strings[abi.base])
>    APPEND1 ('/')
>    APPEND_STRING (loongarch_abi_ext_strings[abi.ext])
>    APPEND1 ('\0')
>  
>    return XOBFINISH (_obstack, const char *);
> +  */
> +  gcc_unreachable ();
>  }
>  }
>  
> diff --git a/gcc/config/loongarch/loongarch-opts.h
> b/gcc/config/loongarch/loongarch-opts.h
> index fa3773223bc..7a644c86d48 100644
> --- a/gcc/config/loongarch/loongarch-opts.h
> +++ b/gcc/config/loongarch/loongarch-opts.h
> @@ -21,7 +21,10 @@ along with GCC; see the file COPYING3.  If not see
>  #ifndef LOONGARCH_OPTS_H
>  #define LOONGARCH_OPTS_H
>  
> +/* This is a C++ header and it shouldn't be used by target libraries.  */
> +#if !defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)
>  #include "loongarch-def.h"
> +#endif

With this change we can revert r14-5634 (remove the #if
!defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)
guards in loongarch-def.h as they'll be unneeded).


-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

Re: [PATCH, v3] Fortran: deferred-length character optional dummy arguments [PR93762,PR100651]

2023-12-02 Thread FX Coudert

Hi,

> this patch extends the previous version by adding further code testing
> the presence of an optional deferred-length character argument also
> in the function initialization code.  This allows re-enabling a
> commented-out test from v2.

Nice, that sounds logical.

> Regtested on x86_64-pc-linux-gnu.  OK for mainline?

OK.

FX

Re: [PATCH] gcc: Disallow trampolines when -fhardened

2023-12-02 Thread Martin Uecker



> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> -- >8 --
> It came up that a good hardening strategy is to disable trampolines
> which may require executable stack.  Therefore the following patch
> adds -Werror=trampolines to -fhardened.

This would add a warning about specific code (where it is then
unclear whether rewriting it is feasible or even an improvement),
which seems different to all the other flags -fhardened has
now.

GCC now has an option to allocate trampolines on the heap,
which would seem to be a better fit.  On the other hand,
it does not work with longjmp which may be a limitation.

Martin


> 
> gcc/ChangeLog:
> 
>   * common.opt (Wtrampolines): Enable by -fhardened.
>   * doc/invoke.texi: Reflect that -fhardened enables -Werror=trampolines.
>   * opts.cc (print_help_hardened): Add -Werror=trampolines.
>   * toplev.cc (process_options): Enable -Werror=trampolines for
>   -fhardened.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/fhardened-1.c: New test.
>   * gcc.dg/fhardened-2.c: New test.
>   * gcc.dg/fhardened-3.c: New test.
>   * gcc.dg/fhardened-4.c: New test.
>   * gcc.dg/fhardened-5.c: New test.
> ---
>  gcc/common.opt |  2 +-
>  gcc/doc/invoke.texi|  1 +
>  gcc/opts.cc|  1 +
>  gcc/testsuite/gcc.dg/fhardened-1.c | 27 +++
>  gcc/testsuite/gcc.dg/fhardened-2.c | 25 +
>  gcc/testsuite/gcc.dg/fhardened-3.c | 25 +
>  gcc/testsuite/gcc.dg/fhardened-4.c | 25 +
>  gcc/testsuite/gcc.dg/fhardened-5.c | 27 +++
>  gcc/toplev.cc  |  8 +++-
>  9 files changed, 139 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-4.c
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-5.c
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 161a035d736..9b09c7cb3df 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -807,7 +807,7 @@ Common Var(warn_system_headers) Warning
>  Do not suppress warnings from system headers.
>  
>  Wtrampolines
> -Common Var(warn_trampolines) Warning
> +Common Var(warn_trampolines) Warning EnabledBy(fhardened)
>  Warn whenever a trampoline is generated.
>  
>  Wtrivial-auto-var-init
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 2fab4c5d71f..c1664a1a0f1 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -17745,6 +17745,7 @@ may change between major releases of GCC, but are 
> currently:
>  -fstack-protector-strong
>  -fstack-clash-protection
>  -fcf-protection=full @r{(x86 GNU/Linux only)}
> +-Werror=trampolines
>  }
>  
>  The list of options enabled by @option{-fhardened} can be generated using
> diff --git a/gcc/opts.cc b/gcc/opts.cc
> index 5d5efaf1b9e..aa062b87cef 100644
> --- a/gcc/opts.cc
> +++ b/gcc/opts.cc
> @@ -2517,6 +2517,7 @@ print_help_hardened ()
>printf ("  %s\n", "-fstack-protector-strong");
>printf ("  %s\n", "-fstack-clash-protection");
>printf ("  %s\n", "-fcf-protection=full");
> +  printf ("  %s\n", "-Werror=trampolines");
>putchar ('\n');
>  }
>  
> diff --git a/gcc/testsuite/gcc.dg/fhardened-1.c 
> b/gcc/testsuite/gcc.dg/fhardened-1.c
> new file mode 100644
> index 000..8710959b6f1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/fhardened-1.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
> +/* { dg-require-effective-target trampolines } */
> +/* { dg-options "-fhardened -O" } */
> +
> +static void
> +baz (int (*bar) (void))
> +{
> +  bar ();
> +}
> +
> +int
> +main (void)
> +{
> +  int a = 6;
> +
> +  int
> +  bar (void) // { dg-error "trampoline" }
> +  {
> +return a;
> +  }
> +
> +  baz (bar);
> +
> +  return 0;
> +}
> +
> +/* { dg-prune-output "some warnings being treated as errors" } */
> diff --git a/gcc/testsuite/gcc.dg/fhardened-2.c 
> b/gcc/testsuite/gcc.dg/fhardened-2.c
> new file mode 100644
> index 000..d47512aa47f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/fhardened-2.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
> +/* { dg-require-effective-target trampolines } */
> +/* { dg-options "-fhardened -O -Wno-trampolines" } */
> +
> +static void
> +baz (int (*bar) (void))
> +{
> +  bar ();
> +}
> +
> +int
> +main (void)
> +{
> +  int a = 6;
> +
> +  int
> +  bar (void) // { dg-bogus "trampoline" }
> +  {
> +return a;
> +  }
> +
> +  baz (bar);
> +
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/fhardened-3.c 
> b/gcc/testsuite/gcc.dg/fhardened-3.c
> new file mode 100644
> index 000..cebae13d8be
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/fhardened-3.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile { target *-*-linux* *-*-gnu* } } */

Re: [11 PATCH] libiberty, Darwin: Fix a build warning. [PR112823]

2023-12-02 Thread FX Coudert

> Thanks. I can't push it myself - could you do that for me?

Pushed.

FX

Re: [r14-5930 Regression] FAIL: gcc.c-torture/compile/libcall-2.c -Os (test for excess errors) on Linux/x86_64

2023-12-02 Thread FX Coudert

> mcmodel=large is not supported (yet) on any Darwin arch [PR90698], so the test 
> needs skipping or xfailing, I think (either way with a reference to the PR).

Pushed as 
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=b74981b5cf32ebf4bfffd25e7174b5c80243447a

FX

Re: [pushed][PATCH v1 1/2] LoongArch: Accelerate optimization of scalar signed/unsigned popcount.

2023-12-02 Thread chenglulu


Pushed to r14-6072.

On 2023/11/28 at 3:38 PM, Li Wei wrote:

LoongArch has instructions for vector popcount but none for scalar
popcount.  Currently the scalar popcount is computed with a loop, which
needs several iterations for any value that is not a power of two, so
use the vector popcount instruction instead.
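For reference, a minimal sketch of the two scalar forms being contrasted: a loop that clears one set bit per iteration, and the GCC builtin that a target may expand to a shorter sequence.  The function names are made up for this example, and nothing here is LoongArch-specific.

```c
#include <assert.h>

/* Loop form: each iteration clears the lowest set bit, so a value with
   several bits set needs several iterations.  */
static int
popcount_loop (unsigned long long b)
{
  int c = 0;
  while (b)
    {
      b &= b - 1;
      c++;
    }
  return c;
}

/* Builtin form: the compiler is free to pick the best expansion the
   target offers.  */
static int
popcount_builtin (unsigned long long b)
{
  return __builtin_popcountll (b);
}

/* Both forms must agree on every input.  */
static void
popcount_check (void)
{
  static const unsigned long long samples[] =
    { 0ULL, 1ULL, 0xffULL, 0x8000000000000001ULL, ~0ULL };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (popcount_loop (samples[i]) == popcount_builtin (samples[i]));
  assert (popcount_loop (0xffULL) == 8);
}
```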

gcc/ChangeLog:

* config/loongarch/loongarch.md (v2di): Used to simplify the
  following templates.
(popcount2): New.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/popcnt.c: New test.
* gcc.target/loongarch/popcount.c: New test.
---
  gcc/config/loongarch/loongarch.md | 27 +++-
  gcc/testsuite/gcc.target/loongarch/popcnt.c   | 41 +++
  gcc/testsuite/gcc.target/loongarch/popcount.c | 17 
  3 files changed, 83 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/popcnt.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/popcount.c

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index cd4ed495697..c440d9c348f 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -1515,7 +1515,30 @@ (define_insn "truncdfsf2"
 (set_attr "cnv_mode" "D2S")
 (set_attr "mode" "SF")])
  
-

+;; In vector registers, popcount can be implemented directly through
+;; the vector instruction [X]VPCNT.  For GP registers, we can implement
+;; it through the following method.  Compared with loop implementation
+;; of popcount, the following method has better performance.
+
+;; This attribute used for get connection of scalar mode and corresponding
+;; vector mode.
+(define_mode_attr cntmap [(SI "v4si") (DI "v2di")])
+
+(define_expand "popcount2"
+  [(set (match_operand:GPR 0 "register_operand")
+   (popcount:GPR (match_operand:GPR 1 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  rtx in = operands[1];
+  rtx out = operands[0];
+  rtx vreg = mode == SImode ? gen_reg_rtx (V4SImode) :
+   gen_reg_rtx (V2DImode);
+  emit_insn (gen_lsx_vinsgr2vr_ (vreg, in, vreg, GEN_INT (1)));
+  emit_insn (gen_popcount2 (vreg, vreg));
+  emit_insn (gen_lsx_vpickve2gr_ (out, vreg, GEN_INT (0)));
+  DONE;
+})
+
  ;;
  ;;  
  ;;
@@ -3882,7 +3905,7 @@ (define_peephole2
   (any_extend:SI (match_dup 3)))])]
"")
  
-

+
  
  (define_mode_iterator QHSD [QI HI SI DI])
  
diff --git a/gcc/testsuite/gcc.target/loongarch/popcnt.c b/gcc/testsuite/gcc.target/loongarch/popcnt.c

new file mode 100644
index 000..a10fca42092
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/popcnt.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlsx" } */
+/* { dg-final { scan-assembler-not {popcount} } } */
+/* { dg-final { scan-assembler-times "vpcnt.d" 2 { target { loongarch64*-*-* } } } } */
+/* { dg-final { scan-assembler-times "vpcnt.w" 4 { target { loongarch64*-*-* } } } } */
+
+int
+foo (int x)
+{
+  return __builtin_popcount (x);
+}
+
+long
+foo1 (long x)
+{
+  return __builtin_popcountl (x);
+}
+
+long long
+foo2 (long long x)
+{
+  return __builtin_popcountll (x);
+}
+
+int
+foo3 (int *p)
+{
+  return __builtin_popcount (*p);
+}
+
+unsigned
+foo4 (int x)
+{
+  return __builtin_popcount (x);
+}
+
+unsigned long
+foo5 (int x)
+{
+  return __builtin_popcount (x);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/popcount.c 
b/gcc/testsuite/gcc.target/loongarch/popcount.c
new file mode 100644
index 000..390ff067617
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/popcount.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlsx -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times "__builtin_popcount|\\.POPCOUNT" 1 "optimized" } } */
+
+int
+PopCount (long b)
+{
+  int c = 0;
+
+  while (b)
+{
+  b &= b - 1;
+  c++;
+}
+
+  return c;
+}

Re: [pushed][PATCH v1 2/2] LoongArch: Optimize vector constant extract-{even/odd} permutation.

2023-12-02 Thread chenglulu




On 2023/11/29 at 5:44 PM, Xi Ruoyao wrote:

On Tue, 2023-11-28 at 15:39 +0800, Li Wei wrote:

For vector constant extract-{even/odd} permutation, replace the default
[x]vshuf instruction combination with the [x]vilv{l/h} instruction, which
reduces the instruction count and improves performance.

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_is_odd_extraction):
  Supplementary function prototype.

 ^^
These two white spaces should be removed.  And I'd suggest "New forward
declaration".

Otherwise LGTM.


Pushed to r14-6073.  The indentation problem has been fixed as well.


Thanks.




(loongarch_is_even_extraction): Adjust.
(loongarch_try_expand_lsx_vshuf_const): Adjust.
(loongarch_is_extraction_permutation): Adjust.
(loongarch_expand_vec_perm_const_2): Adjust.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/lasx-extract-even_odd-opt.c: New test.

Re:[pushed] [PATCH v1] LoongArch: Remove duplicate definition of CLZ_DEFINED_VALUE_AT_ZERO.

2023-12-02 Thread chenglulu


Pushed to r14-6070.

On 2023/11/29 at 9:53 AM, Xi Ruoyao wrote:

On Tue, 2023-11-28 at 15:56 +0800, Li Wei wrote:

In the r14-5547 commit, C[LT]Z_DEFINED_VALUE_AT_ZERO were defined at
the same time, but in fact, CLZ_DEFINED_VALUE_AT_ZERO has already been
defined, so remove the duplicate definition.

gcc/ChangeLog:

* config/loongarch/loongarch.h (CTZ_DEFINED_VALUE_AT_ZERO): Add
  description.
(CLZ_DEFINED_VALUE_AT_ZERO): Remove duplicate definition.

LGTM.

Interestingly, the compiler does not give any warning when a macro is
redefined with exactly the same definition.
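This matches the C preprocessor rules: a function-like macro may be defined again as long as the parameter names and the replacement list are identical.  A small sketch with a made-up macro name (`DEMO_VALUE_AT_ZERO` is not a GCC macro):

```c
#include <assert.h>

#define DEMO_VALUE_AT_ZERO(BITS, VALUE) ((VALUE) = (BITS), 2)
/* Same parameters, same replacement list: a benign redefinition that the
   preprocessor accepts without a diagnostic.  */
#define DEMO_VALUE_AT_ZERO(BITS, VALUE) ((VALUE) = (BITS), 2)

/* Wrap the macro so its two effects are easy to observe: it stores BITS
   through VALUE and evaluates to 2.  */
static int
demo_value_at_zero (int bits, int *value)
{
  return DEMO_VALUE_AT_ZERO (bits, *value);
}

static void
demo_redefinition_check (void)
{
  int v = 0;
  assert (demo_value_at_zero (32, &v) == 2);
  assert (v == 32);
}
```

Changing either copy, even just a parameter name, would make the redefinition ill-formed and draw a diagnostic.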


---
  gcc/config/loongarch/loongarch.h | 9 +++--
  1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index 115222e70fd..fa8a3f5582f 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -288,10 +288,12 @@ along with GCC; see the file COPYING3.  If not see
  /* Define if loading short immediate values into registers sign extends.  */
  #define SHORT_IMMEDIATES_SIGN_EXTEND 1
  
-/* The clz.{w/d} instructions have the natural values at 0.  */

+/* The clz.{w/d}, ctz.{w/d} instructions have the natural values at 0.  */
  
  #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \

    ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
+#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
+  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
  
  /* Standard register usage.  */
  
@@ -1239,8 +1241,3 @@ struct GTY (()) machine_function
  
  #define TARGET_EXPLICIT_RELOCS \

    (la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS)
-
-#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
-  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
-#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
-  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)

[PATCH v1 1/2] LoongArch: Switch loongarch-def from C to C++ to make it possible.

2023-12-02 Thread Lulu Cheng

From: Xi Ruoyao 

We'll use HOST_WIDE_INT in LoongArch static properties in the following patches.
Switch loongarch-def from C to C++ to make it possible.

To keep the same readability as C99 designated initializers, create a
std::array like data structure with position setter function, and add
field setter functions for structs used in loongarch-def.cc.
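A minimal sketch of the idea, assuming a simplified stand-in class: `demo_def_array`, the `DEMO_*` enumerators and `demo_issue_rate` are hypothetical names for this example and are not the definitions from the patch.  Chained set () calls take the place of C99 designated initializers.

```cpp
#include <cassert>

// Hypothetical stand-in for a std::array like type with a position setter.
template <class T, int N>
class demo_def_array
{
  T arr[N];

public:
  // Value-initialize, so entries that are never set read as zero.
  demo_def_array () : arr{} {}

  T &operator[] (int n) { return arr[n]; }
  const T &operator[] (int n) const { return arr[n]; }

  // Store VALUE at IDX and return a copy so calls can be chained
  // inside an initializer expression.
  demo_def_array set (int idx, T value)
  {
    arr[idx] = value;
    return *this;
  }
};

enum { DEMO_CPU_A = 0, DEMO_CPU_B = 2, DEMO_N_CPUS = 3 };

// Reads much like { [DEMO_CPU_A] = 1, [DEMO_CPU_B] = 4 } would in C99.
static const demo_def_array<int, DEMO_N_CPUS> demo_issue_rate
  = demo_def_array<int, DEMO_N_CPUS> ()
      .set (DEMO_CPU_A, 1)
      .set (DEMO_CPU_B, 4);

inline void
demo_def_array_check ()
{
  assert (demo_issue_rate[DEMO_CPU_A] == 1);
  assert (demo_issue_rate[1] == 0);
  assert (demo_issue_rate[DEMO_CPU_B] == 4);
}
```

Returning by value keeps each table entry tied to a named index, at the cost of copies that only happen once during static initialization.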

gcc/ChangeLog:

* config/loongarch/loongarch-def.h: Remove extern "C".
(loongarch_isa_base_strings): Declare as loongarch_def_array
instead of plain array.
(loongarch_isa_ext_strings): Likewise.
(loongarch_abi_base_strings): Likewise.
(loongarch_abi_ext_strings): Likewise.
(loongarch_cmodel_strings): Likewise.
(loongarch_cpu_strings): Likewise.
(loongarch_cpu_default_isa): Likewise.
(loongarch_cpu_issue_rate): Likewise.
(loongarch_cpu_multipass_dfa_lookahead): Likewise.
(loongarch_cpu_cache): Likewise.
(loongarch_cpu_align): Likewise.
(loongarch_cpu_rtx_cost_data): Likewise.
(loongarch_isa): Add a constructor and field setter functions.
* config/loongarch/loongarch-opts.h (loongarch-defs.h): Do not
include for target libraries.
* config/loongarch/loongarch-tune.h (LOONGARCH_TUNE_H): Likewise.
(struct loongarch_rtx_cost_data): Likewise.
(struct loongarch_cache): Likewise.
(struct loongarch_align): Likewise.
* config/loongarch/t-loongarch: Compile loongarch-def.cc with the
C++ compiler.
* config/loongarch/loongarch-def-array.h: New file for a
std::array like data structure with position setter function.
* config/loongarch/loongarch-def.c: Rename to ...
* config/loongarch/loongarch-def.cc: ... here.
(loongarch_cpu_strings): Define as loongarch_def_array instead
of plain array.
(loongarch_cpu_default_isa): Likewise.
(loongarch_cpu_cache): Likewise.
(loongarch_cpu_align): Likewise.
(loongarch_cpu_rtx_cost_data): Likewise.
(loongarch_cpu_issue_rate): Likewise.
(loongarch_cpu_multipass_dfa_lookahead): Likewise.
(loongarch_isa_base_strings): Likewise.
(loongarch_isa_ext_strings): Likewise.
(loongarch_abi_base_strings): Likewise.
(loongarch_abi_ext_strings): Likewise.
(loongarch_cmodel_strings): Likewise.
(abi_minimal_isa): Likewise.
(loongarch_rtx_cost_optimize_size): Use field setter functions
instead of designated initializers.
(loongarch_rtx_cost_data): Implement default constructor.
---
 gcc/config/loongarch/loongarch-def-array.h |  40 
 gcc/config/loongarch/loongarch-def.c   | 227 -
 gcc/config/loongarch/loongarch-def.cc  | 187 +
 gcc/config/loongarch/loongarch-def.h   |  51 +++--
 gcc/config/loongarch/loongarch-opts.cc |   7 +
 gcc/config/loongarch/loongarch-opts.h  |   3 +
 gcc/config/loongarch/loongarch-tune.h  | 123 ++-
 gcc/config/loongarch/t-loongarch   |   4 +-
 8 files changed, 389 insertions(+), 253 deletions(-)
 create mode 100644 gcc/config/loongarch/loongarch-def-array.h
 delete mode 100644 gcc/config/loongarch/loongarch-def.c
 create mode 100644 gcc/config/loongarch/loongarch-def.cc

diff --git a/gcc/config/loongarch/loongarch-def-array.h 
b/gcc/config/loongarch/loongarch-def-array.h
new file mode 100644
index 000..bdb3e9c6a2b
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-def-array.h
@@ -0,0 +1,40 @@
+/* A std::array like data structure for LoongArch static properties.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#ifndef _LOONGARCH_DEF_ARRAY_H
+#define _LOONGARCH_DEF_ARRAY_H 1
+
+template <class T, int N>
+class loongarch_def_array {
+private:
+  T arr[N];
+public:
+  loongarch_def_array () : arr{} {}
+
+  T &operator[] (int n) { return arr[n]; }
+  const T &operator[] (int n) const { return arr[n]; }
+
+  loongarch_def_array set (int idx, T &&value)
+  {
+(*this)[idx] = value;
+return *this;
+  }
+};
+
+#endif
diff --git a/gcc/config/loongarch/loongarch-def.c 
b/gcc/config/loongarch/loongarch-def.c
deleted file mode 100644
index f22d488acb2..000
--- a/gcc/config/loongarch/loongarch-def.c
+++ /dev/null
@@ -1,227 +0,0 @@
-/* LoongArch static properties.
-   Copyright (C) 2021-2023 Free Software

[PATCH v1 0/2] Delete ISA_BASE_LA64V110 related definitions.

2023-12-02 Thread Lulu Cheng

1. Rebase Xi Ruoyao's patch to the latest commit.
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636798.html

2. Described in LoongArch Reference Manual v1.1:
The new functional subsets in each new version have independent identification
bits in the return value of the CPUCFG instruction. It is recommended that the
software determines the running process based on this information rather than
the version number of the Loongson architecture.

So delete the ISA_BASE_LA64V110 related definitions here.

Lulu Cheng (1):
  LoongArch: Remove the definition of ISA_BASE_LA64V110 from the code.

Xi Ruoyao (1):
  LoongArch: Switch loongarch-def from C to C++ to make it possible.

 .../loongarch/genopts/loongarch-strings   |   1 -
 gcc/config/loongarch/genopts/loongarch.opt.in |   3 -
 gcc/config/loongarch/loongarch-cpu.cc |  23 +-
 gcc/config/loongarch/loongarch-def-array.h|  40 +++
 gcc/config/loongarch/loongarch-def.c  | 227 --
 gcc/config/loongarch/loongarch-def.cc | 193 +++
 gcc/config/loongarch/loongarch-def.h  |  63 ++---
 gcc/config/loongarch/loongarch-opts.cc|  10 +-
 gcc/config/loongarch/loongarch-opts.h |   7 +-
 gcc/config/loongarch/loongarch-str.h  |   1 -
 gcc/config/loongarch/loongarch-tune.h | 123 +-
 gcc/config/loongarch/loongarch.opt|   3 -
 gcc/config/loongarch/t-loongarch  |   4 +-
 13 files changed, 404 insertions(+), 294 deletions(-)
 create mode 100644 gcc/config/loongarch/loongarch-def-array.h
 delete mode 100644 gcc/config/loongarch/loongarch-def.c
 create mode 100644 gcc/config/loongarch/loongarch-def.cc

-- 
2.31.1

[PATCH v1 2/2] LoongArch: Remove the definition of ISA_BASE_LA64V110 from the code.

2023-12-02 Thread Lulu Cheng

The instructions defined in LoongArch Reference Manual v1.1 do not form a
complete "instruction set v1.1". A CPU defined later may support only some
of the instructions in LoongArch Reference Manual v1.1. Therefore, the macro
ISA_BASE_LA64V110 and related definitions are removed here.
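
Because a later CPU may implement only part of these additions, support is
naturally expressed as a set of feature bits, and "does this CPU have
everything I expect" becomes a subset test on two masks. A minimal sketch of
that test follows; the function names are hypothetical and are not part of
the patch.

```cpp
#include <cassert>
#include <cstdint>

// Return the bits of REQUIRED that are absent from AVAILABLE.
constexpr int64_t
demo_missing_features (int64_t available, int64_t required)
{
  return required & ~available;
}

// True when every bit of REQUIRED is also set in AVAILABLE.  This is the
// positive form of the "(a & b) != b" mismatch test.
constexpr bool
demo_has_all_features (int64_t available, int64_t required)
{
  return (available & required) == required;
}
```

demo_missing_features can be used to report which bits differ when
demo_has_all_features returns false.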

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Delete 
STR_ISA_BASE_LA64V110.
* config/loongarch/genopts/loongarch.opt.in: Likewise.
* config/loongarch/loongarch-cpu.cc (ISA_BASE_LA64V110_FEATURES): 
Delete macro.
(fill_native_cpu_config): Define a new variable hw_isa_evolution to
record the extended instruction set support read from cpucfg.
* config/loongarch/loongarch-def.cc: Set evolution at initialization.
* config/loongarch/loongarch-def.h (ISA_BASE_LA64V100): Delete.
(ISA_BASE_LA64V110): Likewise.
(N_ISA_BASE_TYPES): Likewise.
(defined): Likewise.
* config/loongarch/loongarch-opts.cc: Likewise.
* config/loongarch/loongarch-opts.h (TARGET_64BIT): Likewise.
(ISA_BASE_IS_LA64V110): Likewise.
* config/loongarch/loongarch-str.h (STR_ISA_BASE_LA64V110): Likewise.
* config/loongarch/loongarch.opt: Regenerate.
---
 .../loongarch/genopts/loongarch-strings   |  1 -
 gcc/config/loongarch/genopts/loongarch.opt.in |  3 ---
 gcc/config/loongarch/loongarch-cpu.cc | 23 +--
 gcc/config/loongarch/loongarch-def.cc | 14 +++
 gcc/config/loongarch/loongarch-def.h  | 12 ++
 gcc/config/loongarch/loongarch-opts.cc|  3 ---
 gcc/config/loongarch/loongarch-opts.h |  4 +---
 gcc/config/loongarch/loongarch-str.h  |  1 -
 gcc/config/loongarch/loongarch.opt|  3 ---
 9 files changed, 19 insertions(+), 45 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings b/gcc/config/loongarch/genopts/loongarch-strings
index b2070c83ed0..7bc4824007e 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -30,7 +30,6 @@ STR_CPU_LA664   la664
 
 # Base architecture
 STR_ISA_BASE_LA64V100 la64
-STR_ISA_BASE_LA64V110 la64v1.1
 
 # -mfpu
 OPTSTR_ISA_EXT_FPUfpu
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in
index 8af6cc6f532..483b185b059 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -32,9 +32,6 @@ Basic ISAs of LoongArch:
 EnumValue
 Enum(isa_base) String(@@STR_ISA_BASE_LA64V100@@) Value(ISA_BASE_LA64V100)
 
-EnumValue
-Enum(isa_base) String(@@STR_ISA_BASE_LA64V110@@) Value(ISA_BASE_LA64V110)
-
 ;; ISA extensions / adjustments
 Enum
 Name(isa_ext_fpu) Type(int)
diff --git a/gcc/config/loongarch/loongarch-cpu.cc b/gcc/config/loongarch/loongarch-cpu.cc
index 622df47916f..4033320d0e1 100644
--- a/gcc/config/loongarch/loongarch-cpu.cc
+++ b/gcc/config/loongarch/loongarch-cpu.cc
@@ -23,7 +23,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
-#include "tm.h"
 #include "diagnostic-core.h"
 
 #include "loongarch-def.h"
@@ -32,19 +31,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "loongarch-cpucfg-map.h"
 #include "loongarch-str.h"
 
-/* loongarch_isa_base_features defined here instead of loongarch-def.c
-   because we need to use options.h.  Pay attention on the order of elements
-   in the initializer becaue ISO C++ does not allow C99 designated
-   initializers!  */
-
-#define ISA_BASE_LA64V110_FEATURES \
-  (OPTION_MASK_ISA_DIV32 | OPTION_MASK_ISA_LD_SEQ_SA \
-   | OPTION_MASK_ISA_LAM_BH | OPTION_MASK_ISA_LAMCAS)
-
-int64_t loongarch_isa_base_features[N_ISA_BASE_TYPES] = {
-  /* [ISA_BASE_LA64V100] = */ 0,
-  /* [ISA_BASE_LA64V110] = */ ISA_BASE_LA64V110_FEATURES,
-};
 
 /* Native CPU detection with "cpucfg" */
 static uint32_t cpucfg_cache[N_CPUCFG_WORDS] = { 0 };
@@ -235,18 +221,20 @@ fill_native_cpu_config (struct loongarch_target *tgt)
   /* Use the native value anyways.  */
   preset.simd = tmp;
 
+
+  int64_t hw_isa_evolution = 0;
+
   /* Features added during ISA evolution.  */
   for (const auto &entry : cpucfg_map)
if (cpucfg_cache[entry.cpucfg_word] & entry.cpucfg_bit)
- preset.evolution |= entry.isa_evolution_bit;
+ hw_isa_evolution |= entry.isa_evolution_bit;
 
   if (native_cpu_type != CPU_NATIVE)
{
  /* Check if the local CPU really supports the features of the base
 ISA of probed native_cpu_type.  If any feature is not detected,
 either GCC or the hardware is buggy.  */
- auto base_isa_feature = loongarch_isa_base_features[preset.base];
- if ((preset.evolution & base_isa_feature) != base_isa_feature)
+ if ((preset.evolution & hw_isa_evolution) != hw_isa_evolution)
warning (0,
 "detected base architecture
