Re: [PATCH 1/2] LoongArch: Define ISA versions

2024-04-19 Thread Lulu Cheng


On 2024/4/19 at 10:27 PM, Xi Ruoyao wrote:

On Fri, 2024-04-19 at 19:04 +0800, Yang Yujie wrote:

  @table @samp
  @item native
-This selects the CPU to generate code for at compilation time by determining
-the processor type of the compiling machine.  Using @option{-march=native}
-enables all instruction subsets supported by the local machine (hence
-the result might not run on different machines).  Using @option{-mtune=native}
-produces code optimized for the local machine under the constraints
-of the selected instruction set.
+Local processor type detected by the native compiler.
  @item loongarch64
-A generic CPU with 64-bit extensions.
+Generic LoongArch 64-bit processor.
  @item la464
-LoongArch LA464 CPU with LBT, LSX, LASX, LVZ.
+LoongArch LA464-based processor with LSX, LASX.
+@item la664
+LoongArch LA664-based processor with LSX, LASX and all LoongArch v1.1 features.

One LoongArch v1.1 feature "Hardware Page Table Walker" is not
implemented by LA664.  Maybe "all LoongArch v1.1 **unprivileged**
features"?

The description of *-march* is "Generate instructions for the machine
type @var{arch-type}.", so perhaps there is no need to word it like this here?



+@item la64v1.0
+LoongArch64 ISA version 1.0.
+@item la64v1.1
+LoongArch64 ISA version 1.1.

IMO it's better to use wording like that for LA664, i.e. "a CPU implementing
all LoongArch v1.1 unprivileged features" (emphasising "all", as the
v1.1 manual allows implementing only a subset of v1.1 features).



RE: [PATCH v1] RISC-V: Add xfail test case for wv insn register overlap

2024-04-19 Thread Li, Pan2
Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Saturday, April 20, 2024 9:20 AM
To: Li, Pan2 ; gcc-patches 
Cc: kito.cheng ; Robin Dapp ; Li, 
Pan2 
Subject: Re: [PATCH v1] RISC-V: Add xfail test case for wv insn register overlap

LGTM.


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2024-04-20 09:04
To: gcc-patches
CC: juzhe.zhong; 
kito.cheng; rdapp.gcc; 
Pan Li
Subject: [PATCH v1] RISC-V: Add xfail test case for wv insn register overlap
From: Pan Li <pan2...@intel.com>

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-42.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
.../gcc.target/riscv/rvv/base/pr112431-42.c   | 30 +++
1 file changed, 30 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c
new file mode 100644
index 000..fa5dac58a20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ffast-math" } */
+
+#include 
+
+int64_t
+reduc_plus_int (int *__restrict a, int n)
+{
+  int64_t r = 0;
+  for (int i = 0; i < n; ++i)
+    r += a[i];
+  return r;
+}
+
+double
+reduc_plus_float (float *__restrict a, int n)
+{
+  double r = 0;
+  for (int i = 0; i < n; ++i)
+    r += a[i];
+  return r;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} { xfail riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-assembler-times {vwadd\.wv} 1 } } */
+/* { dg-final { scan-assembler-times {vfwadd\.wv} 1 } } */
--
2.34.1




Re: [PATCH v1] RISC-V: Add xfail test case for wv insn register overlap

2024-04-19 Thread juzhe.zh...@rivai.ai
LGTM.



juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2024-04-20 09:04
To: gcc-patches
CC: juzhe.zhong; kito.cheng; rdapp.gcc; Pan Li
Subject: [PATCH v1] RISC-V: Add xfail test case for wv insn register overlap
From: Pan Li 
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/pr112431-42.c: New test.
 
Signed-off-by: Pan Li 
---
.../gcc.target/riscv/rvv/base/pr112431-42.c   | 30 +++
1 file changed, 30 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c
new file mode 100644
index 000..fa5dac58a20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ffast-math" } */
+
+#include 
+
+int64_t
+reduc_plus_int (int *__restrict a, int n)
+{
+  int64_t r = 0;
+  for (int i = 0; i < n; ++i)
+    r += a[i];
+  return r;
+}
+
+double
+reduc_plus_float (float *__restrict a, int n)
+{
+  double r = 0;
+  for (int i = 0; i < n; ++i)
+    r += a[i];
+  return r;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} { xfail riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-assembler-times {vwadd\.wv} 1 } } */
+/* { dg-final { scan-assembler-times {vfwadd\.wv} 1 } } */
-- 
2.34.1
 
 


[PATCH v2] RISC-V: Add xfail test case for wv insn register overlap

2024-04-19 Thread pan2 . li
From: Pan Li 

We reverted the patch below for wv insn overlap; add the related wv
insn test and mark it as xfail.  We will remove the xfail once
register overlap is supported in GCC 15.

b3b2799b872 RISC-V: Support one more overlap for wv instructions
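
For orientation, the kernels in the new test are widening reductions:
narrow elements are accumulated into a wider accumulator, which is the
pattern the vwadd.wv and vfwadd.wv checks look for.  A minimal scalar
sketch of the same arithmetic follows.  The function names are
hypothetical and <stdint.h> is assumed as the source of int64_t; this
is not the test file itself.

```c
#include <assert.h>
#include <stdint.h>

/* Sum 32-bit ints into a 64-bit accumulator (integer widening add).  */
static int64_t
sum_widen_int (const int *a, int n)
{
  assert (n >= 0);
  int64_t r = 0;
  for (int i = 0; i < n; ++i)
    r += a[i];   /* each int is widened to int64_t before the add */
  return r;
}

/* Sum floats into a double accumulator (floating-point widening add).  */
static double
sum_widen_float (const float *a, int n)
{
  assert (n >= 0);
  double r = 0;
  for (int i = 0; i < n; ++i)
    r += a[i];   /* each float is widened to double before the add */
  return r;
}
```

Because the accumulator is wider than the elements, a sum such as
INT_MAX + INT_MAX does not overflow in the integer variant.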

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-42.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/base/pr112431-42.c   | 30 +++
 1 file changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c
new file mode 100644
index 000..fa5dac58a20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ffast-math" } */
+
+#include 
+
+int64_t
+reduc_plus_int (int *__restrict a, int n)
+{
+  int64_t r = 0;
+  for (int i = 0; i < n; ++i)
+    r += a[i];
+  return r;
+}
+
+double
+reduc_plus_float (float *__restrict a, int n)
+{
+  double r = 0;
+  for (int i = 0; i < n; ++i)
+    r += a[i];
+  return r;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} { xfail riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-assembler-times {vwadd\.wv} 1 } } */
+/* { dg-final { scan-assembler-times {vfwadd\.wv} 1 } } */
-- 
2.34.1



[PATCH v1] RISC-V: Add xfail test case for wv insn register overlap

2024-04-19 Thread pan2 . li
From: Pan Li 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-42.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/base/pr112431-42.c   | 30 +++
 1 file changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c
new file mode 100644
index 000..fa5dac58a20
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-42.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ffast-math" } */
+
+#include 
+
+int64_t
+reduc_plus_int (int *__restrict a, int n)
+{
+  int64_t r = 0;
+  for (int i = 0; i < n; ++i)
+    r += a[i];
+  return r;
+}
+
+double
+reduc_plus_float (float *__restrict a, int n)
+{
+  double r = 0;
+  for (int i = 0; i < n; ++i)
+    r += a[i];
+  return r;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} { xfail riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-assembler-times {vwadd\.wv} 1 } } */
+/* { dg-final { scan-assembler-times {vfwadd\.wv} 1 } } */
-- 
2.34.1



RE: [PATCH v1] RISC-V: Revert RVV wv instructions overlap and xfail tests

2024-04-19 Thread Li, Pan2
> I'm not sure I'm following.  Did we miss something that should have been
> covered?  Like only an overlap on the srcs but not the dest?
> Are there testcases that fail?  If so we should definitely have one.

> If something is broken then indeed we should revert it.

Yes, we may need to support these in gcc-15.

> ... why not just revert everything and xfail all the tests in a
> follow up?  Your patch is essentially a revert but doesn't look like
> it.  I'd rather we let a revert be a revert and adjust the tests
> separately so it becomes clear.

Sure, will revert b3b2799b872 and then file the patch for the xfail tests.

Pan

-Original Message-
From: Robin Dapp  
Sent: Friday, April 19, 2024 10:54 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai; kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Revert RVV wv instructions overlap and xfail 
tests

Hi Pan,

> The RVV register overlap requires both the dest and src operands.
> Thus the register filter in the constraint cannot cover the full semantics
> of the vector register overlap.

I'm not sure I'm following.  Did we miss something that should have been
covered?  Like only an overlap on the srcs but not the dest?
Are there testcases that fail?  If so we should definitely have one.

If something is broken then indeed we should revert it.

But...

> Thus, revert this list of overlap patches and xfail the related test
> cases.  This patch reverts *b3b2799b872*, and the full picture of the
> related series is listed below.

... why not just revert everything and xfail all the tests in a
follow up?  Your patch is essentially a revert but doesn't look like
it.  I'd rather we let a revert be a revert and adjust the tests
separately so it becomes clear. 

Regards
 Robin



[PATCH] Add rvalue::get_name method (and its C equivalent)

2024-04-19 Thread Guillaume Gomez
Hi,

I just encountered the need to retrieve the name of an `rvalue` (if
there is one) while working on the Rust GCC backend.

This patch adds a getter to retrieve the information.

Cordially.
From d2ddeec950f23533e5e18bc0c10c4b49eef3cda3 Mon Sep 17 00:00:00 2001
From: Guillaume Gomez 
Date: Sat, 20 Apr 2024 01:02:20 +0200
Subject: [PATCH] [PATCH] Add rvalue::get_name method

gcc/jit/ChangeLog:

	* jit-recording.h: Add rvalue::get_name method
	* libgccjit.cc (gcc_jit_rvalue_get_name): Likewise
	* libgccjit.h (gcc_jit_rvalue_get_name): Likewise
	* libgccjit.map: Likewise

gcc/testsuite/ChangeLog:

	* jit.dg/test-tls.c: Add test for gcc_jit_rvalue_get_name
---
 gcc/jit/jit-recording.h |  8 
 gcc/jit/libgccjit.cc| 16 
 gcc/jit/libgccjit.h |  4 
 gcc/jit/libgccjit.map   |  5 +
 gcc/testsuite/jit.dg/test-tls.c |  3 +++
 5 files changed, 36 insertions(+)

diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index d8d16f4fe29..3ae87c146ac 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -1213,6 +1213,8 @@ public:
   virtual bool is_constant () const { return false; }
   virtual bool get_wide_int (wide_int *) const { return false; }
 
+  virtual string * get_name () { return NULL; }
+
 private:
   virtual enum precedence get_precedence () const = 0;
 
@@ -1305,6 +1307,8 @@ public:
   const char *access_as_rvalue (reproducer &r) final override;
   const char *access_as_lvalue (reproducer &r) final override;
 
+  string * get_name () final override { return m_name; }
+
 private:
   string * make_debug_string () final override { return m_name; }
   void write_reproducer (reproducer &r) final override;
@@ -1558,6 +1562,8 @@ public:
 
   void set_rvalue_init (rvalue *val) { m_rvalue_init = val; }
 
+  string * get_name () final override { return m_name; }
+
 private:
   string * make_debug_string () final override { return m_name; }
   template 
@@ -2148,6 +2154,8 @@ public:
 
   void write_to_dump (dump &d) final override;
 
+  string * get_name () final override { return m_name; }
+
 private:
   string * make_debug_string () final override { return m_name; }
   void write_reproducer (reproducer &r) final override;
diff --git a/gcc/jit/libgccjit.cc b/gcc/jit/libgccjit.cc
index 445c0d0e0e3..2b8706dc7fd 100644
--- a/gcc/jit/libgccjit.cc
+++ b/gcc/jit/libgccjit.cc
@@ -4377,3 +4377,19 @@ gcc_jit_context_add_top_level_asm (gcc_jit_context *ctxt,
   RETURN_IF_FAIL (asm_stmts, ctxt, NULL, "NULL asm_stmts");
   ctxt->add_top_level_asm (loc, asm_stmts);
 }
+
+/* Public entrypoint.  See description in libgccjit.h.
+
+   After error-checking, this calls the trivial
+   gcc::jit::recording::rvalue::get_name method, in jit-recording.h.  */
+
+extern const char *
+gcc_jit_rvalue_get_name (gcc_jit_rvalue *rvalue)
+{
+  RETURN_NULL_IF_FAIL (rvalue, NULL, NULL, "NULL rvalue");
+  auto name = rvalue->get_name ();
+
+  if (!name)
+    return NULL;
+  return name->c_str ();
+}
diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
index 74e847b2dec..d4094610a16 100644
--- a/gcc/jit/libgccjit.h
+++ b/gcc/jit/libgccjit.h
@@ -2066,6 +2066,10 @@ gcc_jit_lvalue_add_string_attribute (gcc_jit_lvalue *variable,
  enum gcc_jit_variable_attribute attribute,
  const char* value);
 
+/* Returns the name of the `rvalue`, if any. Returns NULL otherwise.  */
+extern const char *
+gcc_jit_rvalue_get_name (gcc_jit_rvalue *rvalue);
+
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */
diff --git a/gcc/jit/libgccjit.map b/gcc/jit/libgccjit.map
index 99aa5970be1..bbed8024263 100644
--- a/gcc/jit/libgccjit.map
+++ b/gcc/jit/libgccjit.map
@@ -289,3 +289,8 @@ LIBGCCJIT_ABI_27 {
   global:
 gcc_jit_context_new_sizeof;
 } LIBGCCJIT_ABI_26;
+
+LIBGCCJIT_ABI_28 {
+  global:
+gcc_jit_rvalue_get_name;
+} LIBGCCJIT_ABI_27;
diff --git a/gcc/testsuite/jit.dg/test-tls.c b/gcc/testsuite/jit.dg/test-tls.c
index 3b20182ac10..b651eb09b44 100644
--- a/gcc/testsuite/jit.dg/test-tls.c
+++ b/gcc/testsuite/jit.dg/test-tls.c
@@ -28,6 +28,9 @@ create_code (gcc_jit_context *ctxt, void *user_data)
   ctxt, NULL, GCC_JIT_GLOBAL_EXPORTED, int_type, "foo");
   gcc_jit_lvalue_set_tls_model (foo, GCC_JIT_TLS_MODEL_GLOBAL_DYNAMIC);
 
+  CHECK_STRING_VALUE (
+gcc_jit_rvalue_get_name (gcc_jit_lvalue_as_rvalue (foo)), "foo");
+
   /* Build the test_fn.  */
   gcc_jit_function *test_fn =
 gcc_jit_context_new_function (ctxt, NULL,
-- 
2.24.1.2762.gfe2e4819b8



[committed] i386: Fix up *avx2_eq3 constraints [PR114783]

2024-04-19 Thread Jakub Jelinek
Hi!

The r14-4456 change (part of APX EGPR support) seems to have mistakenly
changed in the
@@ -16831,7 +16831,7 @@ (define_insn "*avx2_eq3"
   [(set (match_operand:VI_256 0 "register_operand" "=x")
(eq:VI_256
  (match_operand:VI_256 1 "nonimmediate_operand" "%x")
- (match_operand:VI_256 2 "nonimmediate_operand" "xm")))]
+ (match_operand:VI_256 2 "nonimmediate_operand" "jm")))]
   "TARGET_AVX2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "vpcmpeq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecmp")
hunk the xm constraint to jm, while in many other spots it correctly changed
xm to xjm.  The instruction doesn't require the last operand to be in
memory; it can handle three 256-bit registers just fine.  It is just a VEX-only
encoded instruction and so can't allow APX EGPR regs in the memory operand.

The following patch fixes it, so that we don't force one of the == operands
into memory all the time.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk as
obvious.

2024-04-20  Jakub Jelinek  

PR target/114783
* config/i386/sse.md (*avx2_eq3): Change last operand's
constraint from "jm" to "xjm".

* gcc.target/i386/avx2-pr114783.c: New test.

--- gcc/config/i386/sse.md.jj   2024-04-10 09:55:05.849877708 +0200
+++ gcc/config/i386/sse.md  2024-04-19 19:37:34.110320387 +0200
@@ -17029,7 +17029,7 @@ (define_insn "*avx2_eq3"
   [(set (match_operand:VI_256 0 "register_operand" "=x")
(eq:VI_256
  (match_operand:VI_256 1 "nonimmediate_operand" "%x")
- (match_operand:VI_256 2 "nonimmediate_operand" "jm")))]
+ (match_operand:VI_256 2 "nonimmediate_operand" "xjm")))]
   "TARGET_AVX2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
   "vpcmpeq\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "ssecmp")
--- gcc/testsuite/gcc.target/i386/avx2-pr114783.c.jj  2024-04-19 19:28:50.216644029 +0200
+++ gcc/testsuite/gcc.target/i386/avx2-pr114783.c  2024-04-19 19:45:02.943054262 +0200
@@ -0,0 +1,12 @@
+/* PR target/114783 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx2 -mno-avx512f -masm=att" } */
+/* { dg-final { scan-assembler "vpcmpeqd\[ \\t\]+%ymm\[01\], %ymm\[01\], %ymm0" } } */
+
+typedef int V __attribute__((vector_size (32)));
+
+V
+foo (V x, V y)
+{
+  return x == y;
+}

Jakub



Re: [RFC][PATCH v1 0/4] Allow flexible array members in unions and alone in structures [PR53548]

2024-04-19 Thread Kees Cook
On Fri, Apr 19, 2024 at 06:43:13PM +, Qing Zhao wrote:
> Therefore, GCC needs to explicitly allow such extensions directly for C99
> flexible arrays, since flexible array members in unions or alone in structs
> are common code patterns in active use by the Linux kernel (and other 
> projects).

Thank you for fixing this! :) This will make conversions much much
easier for the Linux kernel (and future userspace programs).

I've tested these patches and everything behaves like I'd expect.
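
As background for the quoted rationale, here is a minimal sketch of a
standard C99 flexible array member, with comments showing the two forms
the RFC proposes to accept.  The type and function names are
hypothetical, and the commented-out declarations only illustrate the
pattern; they are not taken from the patches.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Standard C99: a flexible array member must be last and must follow
   at least one other named member.  */
struct packet
{
  size_t len;
  unsigned char data[];
};

/* Forms that C99 rejects and that the RFC would allow as extensions
   (kept as comments so this sketch compiles with any C99 compiler):

     union u { int tag; unsigned char bytes[]; };   -- FAM in a union
     struct s { unsigned char bytes[]; };           -- FAM alone in a struct
*/

/* Allocate a packet with room for LEN payload bytes and copy SRC in.
   Returns NULL on allocation failure; the caller frees the result.  */
static struct packet *
packet_new (const void *src, size_t len)
{
  assert (src != NULL || len == 0);
  struct packet *p = malloc (sizeof *p + len);
  if (!p)
    return NULL;
  p->len = len;
  if (len)
    memcpy (p->data, src, len);
  return p;
}
```

The single allocation of sizeof (struct packet) + len is the usual
reason for the idiom: header and payload live in one block.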

-Kees

-- 
Kees Cook


[PATCH] testsuite: prune -freport-bug output

2024-04-19 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
I can defer to 15 if needed, of course.

-- >8 --
When the compiler defaults to -freport-bug, a few dg-ice tests fail
with:

Excess errors:
Preprocessed source stored into /tmp/cc6hldZ0.out file, please attach this to your bugreport.

We could add -fno-report-bug to those tests.  But it seems to me that a
better fix would be to prune the "Preprocessed source stored..." message
in prune_gcc_output.

gcc/testsuite/ChangeLog:

* lib/prune.exp (prune_gcc_output): Also prune -freport-bug output.
---
 gcc/testsuite/lib/prune.exp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index f3d3c99fbcb..d00d37f015f 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -51,6 +51,7 @@ proc prune_gcc_output { text } {
 regsub -all "(^|\n)\[^\n\]*: re(compiling|linking)\[^\n\]*" $text "" text
 regsub -all "(^|\n)Please submit.*instructions\[^\n\]*" $text "" text
 regsub -all "(^|\n)\[0-9\]\[0-9\]* errors\." $text "" text
+regsub -all "(^|\n)Preprocessed.*bugreport\[^\n\]*" $text "" text
 
 # Diagnostic inclusion stack
 regsub -all "(^|\n)(In file)?\[ \]+included from \[^\n\]*" $text "" text

base-commit: d86472a6f041ccf3d1be0cf6bb15d1e0ad8f6dbe
-- 
2.44.0



[PATCH 13/13] rs6000, remove vector set and vector init built-ins.

2024-04-19 Thread Carl Love
rs6000, remove vector set and vector init built-ins.

The vector init built-ins:

  __builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
  __builtin_vec_init_v4si, __builtin_vec_init_v4sf,
  __builtin_vec_init_v2di, __builtin_vec_init_v2df,
  __builtin_vec_init_v1ti

perform the same operation as initializing the vector in C code.  For
example:

  result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
  result_v4si = {1, 2, 3, 4};

These two constructs were tested and verified to generate identical
assembly instructions both with no optimization and with -O3.

The vector set built-ins:

  __builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
  __builtin_vec_set_v4si, __builtin_vec_set_v4sf

perform the same operation as setting a specific element in the vector in
C code.  For example:

  src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
  src_v4si[index] = int_val;

With no optimization the built-in actually generates more instructions
than the inline C code; with -O3 the generated code is identical.

All of the above built-ins that are removed do not have test cases and
are not documented.

The built-ins __builtin_vec_set_v1ti, __builtin_vec_set_v2di and
__builtin_vec_set_v2df are not removed as they are used in function
resolve_vec_insert() in file rs6000-c.cc.

The built-ins are removed as they don't provide any benefit over just
using C code.
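
The equivalence described above can be sketched with GCC's generic
vector extensions, which is the plain C form the message refers to.
This is a minimal, target-independent sketch: the v4si typedef and the
helper names are hypothetical, and it does not call the removed rs6000
built-ins.

```c
#include <assert.h>

/* Hypothetical typedef: four 32-bit ints in one 16-byte vector
   (GCC/Clang generic vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Plain C counterpart of a vec_init-style built-in:
   brace initialization.  */
static v4si
make_v4si (int a, int b, int c, int d)
{
  v4si v = { a, b, c, d };
  return v;
}

/* Plain C counterpart of a vec_set-style built-in:
   subscript assignment on a copy of the vector.  */
static v4si
set_elem_v4si (v4si v, int val, int index)
{
  assert (index >= 0 && index < 4);
  v[index] = val;
  return v;
}
```

Both helpers rely only on vector subscripting and initialization, so no
target-specific built-in is involved.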

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vec_init_v16qi,
__builtin_vec_init_v8hi, __builtin_vec_init_v4si,
__builtin_vec_init_v4sf, __builtin_vec_init_v2di,
__builtin_vec_init_v2df, __builtin_vec_init_v1ti,
__builtin_vec_set_v16qi, __builtin_vec_set_v8hi,
__builtin_vec_set_v4si, __builtin_vec_set_v4sf): Remove built-in
definitions.
---
 gcc/config/rs6000/rs6000-builtins.def | 42 ++-
 1 file changed, 2 insertions(+), 40 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 19d05b8043a..d04ad4ce7e5 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1115,37 +1115,6 @@
   const signed short __builtin_vec_ext_v8hi (vss, signed int);
 VEC_EXT_V8HI nothing {extract}
 
-  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, \
-signed char, signed char, signed char, signed char, signed char, \
-signed char, signed char, signed char, signed char, signed char, \
-signed char, signed char, signed char);
-VEC_INIT_V16QI nothing {init}
-
-  const vf __builtin_vec_init_v4sf (float, float, float, float);
-VEC_INIT_V4SF nothing {init}
-
-  const vsi __builtin_vec_init_v4si (signed int, signed int, signed int, \
- signed int);
-VEC_INIT_V4SI nothing {init}
-
-  const vss __builtin_vec_init_v8hi (signed short, signed short, signed short,\
- signed short, signed short, signed short, signed short, \
- signed short);
-VEC_INIT_V8HI nothing {init}
-
-  const vsc __builtin_vec_set_v16qi (vsc, signed char, const int<4>);
-VEC_SET_V16QI nothing {set}
-
-  const vf __builtin_vec_set_v4sf (vf, float, const int<2>);
-VEC_SET_V4SF nothing {set}
-
-  const vsi __builtin_vec_set_v4si (vsi, signed int, const int<2>);
-VEC_SET_V4SI nothing {set}
-
-  const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
-VEC_SET_V8HI nothing {set}
-
-
 ; Cell builtins.
 [cell]
   pure vsc __builtin_altivec_lvlx (signed long, const void *);
@@ -1292,15 +1261,8 @@
   const signed long long __builtin_vec_ext_v2di (vsll, signed int);
 VEC_EXT_V2DI nothing {extract}
 
-  const vsq __builtin_vec_init_v1ti (signed __int128);
-VEC_INIT_V1TI nothing {init}
-
-  const vd __builtin_vec_init_v2df (double, double);
-VEC_INIT_V2DF nothing {init}
-
-  const vsll __builtin_vec_init_v2di (signed long long, signed long long);
-VEC_INIT_V2DI nothing {init}
-
+;; VEC_SET_V1TI, VEC_SET_V2DF and VEC_SET_V2DI are used in
+;; resolve_vec_insert(), rs6000-c.cc
   const vsq __builtin_vec_set_v1ti (vsq, signed __int128, const int<0,0>);
 VEC_SET_V1TI nothing {set}
 
-- 
2.44.0



[PATCH 12/13] rs6000, remove __builtin_vsx_xvcmpeqsp built-in

2024-04-19 Thread Carl Love
rs6000, remove __builtin_vsx_xvcmpeqsp built-in

The built-in __builtin_vsx_xvcmpeqsp is a duplicate of the overloaded
vec_cmpeq built-in.  The built-in is undocumented.  The built-in and
the test cases are removed.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcmpeqsp):
Remove built-in definition.

gcc/testsuite/ChangeLog:
* vsx-builtin-3.c (do_cmp): Remove test case for
__builtin_vsx_xvcmpeqsp.
---
 gcc/config/rs6000/rs6000-builtins.def| 3 ---
 gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c | 2 --
 2 files changed, 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 2f6149edd5f..19d05b8043a 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1613,9 +1613,6 @@
   const signed int __builtin_vsx_xvcmpeqdp_p (signed int, vd, vd);
 XVCMPEQDP_P vector_eq_v2df_p {pred}
 
-  const vf __builtin_vsx_xvcmpeqsp (vf, vf);
-XVCMPEQSP vector_eqv4sf {}
-
   const vd __builtin_vsx_xvcmpgedp (vd, vd);
 XVCMPGEDP vector_gev2df {}
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
index 35ea31b2616..245893dc0e3 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
@@ -27,7 +27,6 @@
 /* { dg-final { scan-assembler "xvcmpeqdp" } } */
 /* { dg-final { scan-assembler "xvcmpgtdp" } } */
 /* { dg-final { scan-assembler "xvcmpgedp" } } */
-/* { dg-final { scan-assembler "xvcmpeqsp" } } */
 /* { dg-final { scan-assembler "xvcmpgtsp" } } */
 /* { dg-final { scan-assembler "xvcmpgesp" } } */
 /* { dg-final { scan-assembler "xxsldwi" } } */
@@ -112,7 +111,6 @@ int do_cmp (void)
   d[i][0] = __builtin_vsx_xvcmpgtdp (d[i][1], d[i][2]); i++;
   d[i][0] = __builtin_vsx_xvcmpgedp (d[i][1], d[i][2]); i++;
 
-  f[i][0] = __builtin_vsx_xvcmpeqsp (f[i][1], f[i][2]); i++;
   f[i][0] = __builtin_vsx_xvcmpgtsp (f[i][1], f[i][2]); i++;
   f[i][0] = __builtin_vsx_xvcmpgesp (f[i][1], f[i][2]); i++;
   return i;
-- 
2.44.0



[PATCH 8/13] rs6000, remove __builtin_vsx_vperm_* built-ins

2024-04-19 Thread Carl Love
rs6000, remove __builtin_vsx_vperm_* built-ins

The undocumented built-ins:
  __builtin_vsx_vperm_16qi_uns,
  __builtin_vsx_vperm_1ti,
  __builtin_vsx_vperm_1ti_uns,
  __builtin_vsx_vperm_2df,
  __builtin_vsx_vperm_2di,
  __builtin_vsx_vperm_2di_uns,
  __builtin_vsx_vperm_4sf,
  __builtin_vsx_vperm_4si,
  __builtin_vsx_vperm_4si_uns

are duplicates of the __builtin_altivec_* builtins that are used by
the overloaded vec_perm built-in that is documented in the PVIPR.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_vperm_16qi_uns,
__builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns): Remove
built-in definitions and comments.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_vperm_16qi_uns,
 __builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns): Remove
test cases.
---
 gcc/config/rs6000/rs6000-builtins.def | 33 ---
 .../gcc.target/powerpc/vsx-builtin-3.c| 20 ---
 2 files changed, 53 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 3c409d729ea..f33564d3d9c 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1529,39 +1529,6 @@
   const vf __builtin_vsx_uns_floato_v2di (vsll);
 UNS_FLOATO_V2DI unsfloatov2di {}
 
-; These are duplicates of __builtin_altivec_* counterparts, and are being
-; kept for backwards compatibility.  The reason for their existence is
-; unclear.  TODO: Consider deprecation/removal at some point.
-  const vsc __builtin_vsx_vperm_16qi (vsc, vsc, vuc);
-VPERM_16QI_X altivec_vperm_v16qi {}
-
-  const vuc __builtin_vsx_vperm_16qi_uns (vuc, vuc, vuc);
-VPERM_16QI_UNS_X altivec_vperm_v16qi_uns {}
-
-  const vsq __builtin_vsx_vperm_1ti (vsq, vsq, vsc);
-VPERM_1TI_X altivec_vperm_v1ti {}
-
-  const vsq __builtin_vsx_vperm_1ti_uns (vsq, vsq, vsc);
-VPERM_1TI_UNS_X altivec_vperm_v1ti_uns {}
-
-  const vd __builtin_vsx_vperm_2df (vd, vd, vuc);
-VPERM_2DF_X altivec_vperm_v2df {}
-
-  const vsll __builtin_vsx_vperm_2di (vsll, vsll, vuc);
-VPERM_2DI_X altivec_vperm_v2di {}
-
-  const vull __builtin_vsx_vperm_2di_uns (vull, vull, vuc);
-VPERM_2DI_UNS_X altivec_vperm_v2di_uns {}
-
-  const vf __builtin_vsx_vperm_4sf (vf, vf, vuc);
-VPERM_4SF_X altivec_vperm_v4sf {}
-
-  const vsi __builtin_vsx_vperm_4si (vsi, vsi, vuc);
-VPERM_4SI_X altivec_vperm_v4si {}
-
-  const vui __builtin_vsx_vperm_4si_uns (vui, vui, vuc);
-VPERM_4SI_UNS_X altivec_vperm_v4si_uns {}
-
   const vss __builtin_vsx_vperm_8hi (vss, vss, vuc);
 VPERM_8HI_X altivec_vperm_v8hi {}
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
index 01f35dad713..35ea31b2616 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
@@ -2,7 +2,6 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O2 -mdejagnu-cpu=power7" } */
-/* { dg-final { scan-assembler "vperm" } } */
 /* { dg-final { scan-assembler "xvrdpi" } } */
 /* { dg-final { scan-assembler "xvrdpic" } } */
 /* { dg-final { scan-assembler "xvrdpim" } } */
@@ -56,25 +55,6 @@ extern __vector unsigned long long ull[][4];
 extern __vector __bool long bl[][4];
 #endif
 
-int do_perm(void)
-{
-  int i = 0;
-
-  si[i][0] = __builtin_vsx_vperm_4si (si[i][1], si[i][2], uc[i][3]); i++;
-  ss[i][0] = __builtin_vsx_vperm_8hi (ss[i][1], ss[i][2], uc[i][3]); i++;
-  sc[i][0] = __builtin_vsx_vperm_16qi (sc[i][1], sc[i][2], uc[i][3]); i++;
-  f[i][0] = __builtin_vsx_vperm_4sf (f[i][1], f[i][2], uc[i][3]); i++;
-  d[i][0] = __builtin_vsx_vperm_2df (d[i][1], d[i][2], uc[i][3]); i++;
-
-  si[i][0] = __builtin_vsx_vperm (si[i][1], si[i][2], uc[i][3]); i++;
-  ss[i][0] = __builtin_vsx_vperm (ss[i][1], ss[i][2], uc[i][3]); i++;
-  sc[i][0] = __builtin_vsx_vperm (sc[i][1], sc[i][2], uc[i][3]); i++;
-  f[i][0] = __builtin_vsx_vperm (f[i][1], f[i][2], uc[i][3]); i++;
-  d[i][0] = __builtin_vsx_vperm (d[i][1], d[i][2], uc[i][3]); i++;
-
-  return i;
-}
-
 int do_xxperm (void)
 {
   int i = 0;
-- 
2.44.0



[PATCH 7/13] rs6000, remove the vec_xxsel built-ins, they are duplicates

2024-04-19 Thread Carl Love
rs6000, remove the vec_xxsel built-ins, they are duplicates

The following undocumented built-ins are covered by the existing overloaded
vec_sel built-in definitions.

  const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
same as vsc __builtin_vec_sel (vsc, vsc, vuc);  (overloaded vec_sel)

  const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
same as vuc __builtin_vec_sel (vuc, vuc, vuc);  (overloaded vec_sel)

  const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
same as  vd __builtin_vec_sel (vd, vd, vull);   (overloaded vec_sel)

  const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
same as vsll __builtin_vec_sel (vsll, vsll, vsll);  (overloaded vec_sel)

  const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
same as vull __builtin_vec_sel (vull, vull, vsll);  (overloaded vec_sel)

  const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
same as vf __builtin_vec_sel (vf, vf, vsi)  (overloaded vec_sel)

  const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
same as vsi __builtin_vec_sel (vsi, vsi, vbi);  (overloaded vec_sel)

  const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
same as vui __builtin_vec_sel (vui, vui, vui);  (overloaded vec_sel)

  const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
same as vss __builtin_vec_sel (vss, vss, vbs);  (overloaded vec_sel)

  const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
same as vus __builtin_vec_sel (vus, vus, vus);  (overloaded vec_sel)

This patch removes the duplicate built-in definitions so users will only
use the documented vec_sel built-in.  The __builtin_vsx_xxsel_[4si, 8hi,
16qi, 4sf, 2df] tests are also removed.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxmrglw_4si,
__builtin_vsx_xxsel_16qi, __builtin_vsx_xxsel_16qi_uns,
__builtin_vsx_xxsel_2df, __builtin_vsx_xxsel_2di,
__builtin_vsx_xxsel_2di_uns, __builtin_vsx_xxsel_4sf,
__builtin_vsx_xxsel_4si, __builtin_vsx_xxsel_4si_uns,
__builtin_vsx_xxsel_8hi, __builtin_vsx_xxsel_8hi_uns): Remove
built-in definitions.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_xxsel_4si,
__builtin_vsx_xxsel_8hi, __builtin_vsx_xxsel_16qi,
__builtin_vsx_xxsel_4sf, __builtin_vsx_xxsel_2df): Remove test
cases for removed built-ins.
---
 gcc/config/rs6000/rs6000-builtins.def | 30 ---
 .../gcc.target/powerpc/vsx-builtin-3.c| 26 
 2 files changed, 56 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 46d2ae7b7cb..3c409d729ea 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1925,36 +1925,6 @@
   const vss __builtin_vsx_xxpermdi_8hi (vss, vss, const int<2>);
 XXPERMDI_8HI vsx_xxpermdi_v8hi {}
 
-  const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
-XXSEL_16QI vector_select_v16qi {}
-
-  const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
-XXSEL_16QI_UNS vector_select_v16qi_uns {}
-
-  const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
-XXSEL_2DF vector_select_v2df {}
-
-  const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
-XXSEL_2DI vector_select_v2di {}
-
-  const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
-XXSEL_2DI_UNS vector_select_v2di_uns {}
-
-  const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
-XXSEL_4SF vector_select_v4sf {}
-
-  const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
-XXSEL_4SI vector_select_v4si {}
-
-  const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
-XXSEL_4SI_UNS vector_select_v4si_uns {}
-
-  const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
-XXSEL_8HI vector_select_v8hi {}
-
-  const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
-XXSEL_8HI_UNS vector_select_v8hi_uns {}
-
   const vsc __builtin_vsx_xxsldwi_16qi (vsc, vsc, const int<2>);
 XXSLDWI_16QI vsx_xxsldwi_v16qi {}
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c 
b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
index ff875c55304..01f35dad713 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c
@@ -2,7 +2,6 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-options "-O2 -mdejagnu-cpu=power7" } */
-/* { dg-final { scan-assembler "xxsel" } } */
 /* { dg-final { scan-assembler "vperm" } } */
 /* { dg-final { scan-assembler "xvrdpi" } } */
 /* { dg-final { scan-assembler "xvrdpic" } } */
@@ -57,31 +56,6 @@ extern __vector unsigned long long ull[][4];
 extern __vector __bool long bl[][4];
 #endif
 
-int do_sel(void)
-{
-  int i = 0;
-
-  si[i][0] = __builtin_vsx_xxsel_4si (si[i][1], si[i][2], si[i][3]); i++;
-  ss[i][0] = __builtin_vsx_xxsel_8hi (ss[i][1], ss[i][2], ss[i][3]); i++;
-  sc[i][0] = __builtin_vsx_xxsel_16qi (sc[i][1], sc[i][2], sc[i][3]); i++;
-  f[i][0] = __builtin_vsx_xxsel_4sf (f[i][1], f[i][2], 

[PATCH 11/13] rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in

2024-04-19 Thread Carl Love
rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in

The built-in __builtin_vsx_xvcmpeqsp_p is a duplicate of the overloaded
__builtin_altivec_vcmpeqfp_p built-in.  The built-in is undocumented and
there are no test cases for it.  The patch removes built-in
__builtin_vsx_xvcmpeqsp_p.

gcc/ChangeLog:
* config/rs6000/rs6000-builtin.cc (case RS6000_BIF_RSQRT):
Remove case statement.
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcmpeqsp_p):
Remove built-in definition.
---
 gcc/config/rs6000/rs6000-builtin.cc   | 6 --
 gcc/config/rs6000/rs6000-builtins.def | 6 --
 2 files changed, 12 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index f83d65b06ef..74ed8fc1805 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -269,12 +269,6 @@ rs6000_builtin_md_vectorized_function (tree fndecl, tree 
type_out,
 = (enum rs6000_gen_builtins) DECL_MD_FUNCTION_CODE (fndecl);
   switch (fn)
 {
-case RS6000_BIF_RSQRTF:
-  if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)
- && out_mode == SFmode && out_n == 4
- && in_mode == SFmode && in_n == 4)
-   return rs6000_builtin_decls[RS6000_BIF_VRSQRTFP];
-  break;
 case RS6000_BIF_RSQRT:
   if (VECTOR_UNIT_VSX_P (V2DFmode)
  && out_mode == DFmode && out_n == 2
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index d65c858ac0c..2f6149edd5f 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -917,9 +917,6 @@
   fpmath vf __builtin_altivec_vrsqrtefp (vf);
 VRSQRTEFP rsqrtev4sf2 {}
 
-  fpmath vf __builtin_altivec_vrsqrtfp (vf);
-VRSQRTFP rsqrtv4sf2 {}
-
   const vsc __builtin_altivec_vsel_16qi (vsc, vsc, vuc);
 VSEL_16QI vector_select_v16qi {}
 
@@ -1619,9 +1616,6 @@
   const vf __builtin_vsx_xvcmpeqsp (vf, vf);
 XVCMPEQSP vector_eqv4sf {}
 
-  const signed int __builtin_vsx_xvcmpeqsp_p (signed int, vf, vf);
-XVCMPEQSP_P vector_eq_v4sf_p {pred}
-
   const vd __builtin_vsx_xvcmpgedp (vd, vd);
 XVCMPGEDP vector_gev2df {}
 
-- 
2.44.0



[PATCH 5/13] rs6000, remove duplicated built-ins of vec_mergel and vec_mergeh

2024-04-19 Thread Carl Love
rs6000, remove duplicated built-ins of vec_mergel and vec_mergeh

The following undocumented built-ins are the same as existing documented
overloaded built-ins.

  const vf __builtin_vsx_xxmrghw (vf, vf);
same as  vf __builtin_vec_mergeh (vf, vf);  (overloaded vec_mergeh)

  const vsi __builtin_vsx_xxmrghw_4si (vsi, vsi);
same as vsi __builtin_vec_mergeh (vsi, vsi);   (overloaded vec_mergeh)

  const vf __builtin_vsx_xxmrglw (vf, vf);
same as vf __builtin_vec_mergel (vf, vf);  (overloaded vec_mergel)

  const vsi __builtin_vsx_xxmrglw_4si (vsi, vsi);
same as vsi __builtin_vec_mergel (vsi, vsi);   (overloaded vec_mergel)

This patch removes the duplicate built-in definitions so only the
documented built-ins will be available for use.  The case statements in
rs6000_gimple_fold_builtin are removed as they are no longer needed.  The
patch removes the now unused define_expands for vsx_xxmrghw_ and
vsx_xxmrglw_.
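
For readers who do not have the merge semantics in their head, here is a
minimal portable C sketch of what the high and low word merges select.  It is
not PowerPC code: merge_model and merge_selfcheck are hypothetical names, the
"vectors" are plain four-element arrays, and endianness is ignored.  The two
selector lists are the ones that appear in the removed define_expands.

```c
#include <assert.h>
#include <stdint.h>

/* Selector lists copied from the removed define_expands: indices into the
   eight-element concatenation {a0, a1, a2, a3, b0, b1, b2, b3}.  */
static const int merge_high_sel[4] = {0, 4, 1, 5};
static const int merge_low_sel[4] = {2, 6, 3, 7};

/* Hypothetical helper: pick four elements out of the concatenation of
   A and B according to SEL.  Array order stands in for element order.  */
void
merge_model (int32_t out[4], const int32_t a[4], const int32_t b[4],
             const int sel[4])
{
  int32_t concat[8];

  for (int i = 0; i < 4; i++)
    {
      concat[i] = a[i];
      concat[i + 4] = b[i];
    }
  for (int i = 0; i < 4; i++)
    out[i] = concat[sel[i]];
}

/* Small self-check of the two selections.  */
void
merge_selfcheck (void)
{
  const int32_t a[4] = {0, 1, 2, 3};
  const int32_t b[4] = {10, 11, 12, 13};
  int32_t hi[4], lo[4];

  merge_model (hi, a, b, merge_high_sel);
  merge_model (lo, a, b, merge_low_sel);

  /* High merge interleaves the first halves: {a0, b0, a1, b1}.  */
  assert (hi[0] == 0 && hi[1] == 10 && hi[2] == 1 && hi[3] == 11);
  /* Low merge interleaves the second halves: {a2, b2, a3, b3}.  */
  assert (lo[0] == 2 && lo[1] == 12 && lo[2] == 3 && lo[3] == 13);
}
```

The removed built-ins and the overloaded vec_mergeh/vec_mergel built-ins are
described above as equivalent, so the same model should apply to both names.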

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si, __builtin_vsx_xxmrglw,
__builtin_vsx_xxmrglw_4si, __builtin_vsx_xxsel_16qi): Remove
built-in definition.
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin):
remove case entries RS6000_BIF_XXMRGLW_4SI,
RS6000_BIF_XXMRGLW_4SF, RS6000_BIF_XXMRGHW_4SI,
RS6000_BIF_XXMRGHW_4SF.
* config/rs6000/vsx.md (vsx_xxmrghw_, vsx_xxmrglw_):
Remove unused define_expands.
---
 gcc/config/rs6000/rs6000-builtin.cc   |  4 ---
 gcc/config/rs6000/rs6000-builtins.def | 12 
 gcc/config/rs6000/vsx.md  | 41 ---
 3 files changed, 57 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index ac9f16fe51a..f83d65b06ef 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -2097,20 +2097,16 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 /* vec_mergel (integrals).  */
 case RS6000_BIF_VMRGLH:
 case RS6000_BIF_VMRGLW:
-case RS6000_BIF_XXMRGLW_4SI:
 case RS6000_BIF_VMRGLB:
 case RS6000_BIF_VEC_MERGEL_V2DI:
-case RS6000_BIF_XXMRGLW_4SF:
 case RS6000_BIF_VEC_MERGEL_V2DF:
   fold_mergehl_helper (gsi, stmt, 1);
   return true;
 /* vec_mergeh (integrals).  */
 case RS6000_BIF_VMRGHH:
 case RS6000_BIF_VMRGHW:
-case RS6000_BIF_XXMRGHW_4SI:
 case RS6000_BIF_VMRGHB:
 case RS6000_BIF_VEC_MERGEH_V2DI:
-case RS6000_BIF_XXMRGHW_4SF:
 case RS6000_BIF_VEC_MERGEH_V2DF:
   fold_mergehl_helper (gsi, stmt, 0);
   return true;
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 5b7237a2327..d09e21a9151 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1904,18 +1904,6 @@
   const signed int __builtin_vsx_xvtsqrtsp_fg (vf);
 XVTSQRTSP_FG vsx_tsqrtv4sf2_fg {}
 
-  const vf __builtin_vsx_xxmrghw (vf, vf);
-XXMRGHW_4SF vsx_xxmrghw_v4sf {}
-
-  const vsi __builtin_vsx_xxmrghw_4si (vsi, vsi);
-XXMRGHW_4SI vsx_xxmrghw_v4si {}
-
-  const vf __builtin_vsx_xxmrglw (vf, vf);
-XXMRGLW_4SF vsx_xxmrglw_v4sf {}
-
-  const vsi __builtin_vsx_xxmrglw_4si (vsi, vsi);
-XXMRGLW_4SI vsx_xxmrglw_v4si {}
-
   const vsc __builtin_vsx_xxpermdi_16qi (vsc, vsc, const int<2>);
 XXPERMDI_16QI vsx_xxpermdi_v16qi {}
 
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 3d39ae7995f..26560ecc38a 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -4810,47 +4810,6 @@
 }
   [(set_attr "type" "vecperm")])
 
-;; V4SF/V4SI interleave
-(define_expand "vsx_xxmrghw_"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
-(vec_select:VSX_W
- (vec_concat:
-   (match_operand:VSX_W 1 "vsx_register_operand" "wa")
-   (match_operand:VSX_W 2 "vsx_register_operand" "wa"))
- (parallel [(const_int 0) (const_int 4)
-(const_int 1) (const_int 5)])))]
-  "VECTOR_MEM_VSX_P (mode)"
-{
-  rtx (*fun) (rtx, rtx, rtx);
-  fun = BYTES_BIG_ENDIAN ? gen_altivec_vmrghw_direct_
-: gen_altivec_vmrglw_direct_;
-  if (!BYTES_BIG_ENDIAN)
-std::swap (operands[1], operands[2]);
-  emit_insn (fun (operands[0], operands[1], operands[2]));
-  DONE;
-}
-  [(set_attr "type" "vecperm")])
-
-(define_expand "vsx_xxmrglw_"
-  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
-   (vec_select:VSX_W
- (vec_concat:
-   (match_operand:VSX_W 1 "vsx_register_operand" "wa")
-   (match_operand:VSX_W 2 "vsx_register_operand" "wa"))
- (parallel [(const_int 2) (const_int 6)
-(const_int 3) (const_int 7)])))]
-  "VECTOR_MEM_VSX_P (mode)"
-{
-  rtx (*fun) (rtx, rtx, rtx);
-  fun = BYTES_BIG_ENDIAN ? gen_altivec_vmrglw_direct_
-: gen_altivec_vmrghw_direct_;
-  if (!BYTES_BIG_ENDIAN)
-std::swap 

[PATCH 4/13] rs6000, extend the current vec_{un,}signed{e,o} built-ins

2024-04-19 Thread Carl Love
rs6000, extend the current vec_{un,}signed{e,o} built-ins

The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
convert a vector of floats to signed/unsigned long long ints.  Extend the
existing vec_{un,}signed{e,o} built-ins to accept a vector of floats and
return the even/odd elements as signed/unsigned long long integers.

Add testcases and update documentation.
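
As a rough illustration of the even/odd selection, the following portable C
sketch models the signed case on plain arrays.  It is not PowerPC code and it
makes assumptions: convert_even_odd_model is a hypothetical name, array order
stands in for vector element order, and inputs are assumed to be in range for
long long so the C cast (which truncates toward zero) is well defined.

```c
#include <assert.h>

/* Hypothetical scalar model: PARITY 0 converts elements 0 and 2 (the even
   elements), PARITY 1 converts elements 1 and 3 (the odd elements).
   Inputs must be finite and representable as long long after truncation.  */
void
convert_even_odd_model (long long out[2], const float in[4], int parity)
{
  out[0] = (long long) in[parity];
  out[1] = (long long) in[2 + parity];
}

/* Small self-check using values that are exactly representable as float.  */
void
convert_even_odd_selfcheck (void)
{
  const float in[4] = {1.5f, -2.5f, 3.75f, 4.25f};
  long long even[2], odd[2];

  convert_even_odd_model (even, in, 0);
  convert_even_odd_model (odd, in, 1);

  assert (even[0] == 1 && even[1] == 3);
  assert (odd[0] == -2 && odd[1] == 4);
}
```

Two of the four floats are consumed per call because each 64-bit result
occupies the space of two 32-bit inputs.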

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds_low,
__builtin_vsx_xvcvspuxds_low): New built-in definitions.
* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo):
Add new overloaded specifications.
* config/rs6000/vsx.md (vsx_xvcvspxds_low): New define_expand.
* doc/extend.texi (vec_signedo, vec_signede): Add documentation.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-3-runnable.c: New tests for the added
overloaded built-ins.
---
 gcc/config/rs6000/rs6000-builtins.def |  6 ++
 gcc/config/rs6000/rs6000-overload.def |  8 
 gcc/config/rs6000/vsx.md  | 23 +++
 gcc/doc/extend.texi   | 13 +
 4 files changed, 50 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index bf9a0ae22fc..5b7237a2327 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1709,9 +1709,15 @@
   const vsll __builtin_vsx_xvcvspsxds (vf);
 XVCVSPSXDS vsx_xvcvspsxds {}
 
+  const vsll __builtin_vsx_xvcvspsxds_low (vf);
+XVCVSPSXDSO vsx_xvcvspsxds_low {}
+
   const vsll __builtin_vsx_xvcvspuxds (vf);
 XVCVSPUXDS vsx_xvcvspuxds {}
 
+  const vsll __builtin_vsx_xvcvspuxds_low (vf);
+XVCVSPUXDSO vsx_xvcvspuxds_low {}
+
   const vsi __builtin_vsx_xvcvspuxws (vf);
 XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
 
diff --git a/gcc/config/rs6000/rs6000-overload.def 
b/gcc/config/rs6000/rs6000-overload.def
index 84bd9ae6554..68501c05289 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3307,10 +3307,14 @@
 [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
   vsi __builtin_vec_vsignede (vd);
 VEC_VSIGNEDE_V2DF
+  vsll __builtin_vec_vsignede (vf);
+XVCVSPSXDS
 
 [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
   vsi __builtin_vec_vsignedo (vd);
 VEC_VSIGNEDO_V2DF
+  vsll __builtin_vec_vsignedo (vf);
+XVCVSPSXDSO
 
 [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
   vsi __builtin_vec_signexti (vsc);
@@ -4433,10 +4437,14 @@
 [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
   vui __builtin_vec_vunsignede (vd);
 VEC_VUNSIGNEDE_V2DF
+  vull __builtin_vec_vunsignede (vf);
+XVCVSPUXDS
 
 [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
   vui __builtin_vec_vunsignedo (vd);
 VEC_VUNSIGNEDO_V2DF
+  vull __builtin_vec_vunsignedo (vf);
+XVCVSPUXDSO
 
 [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
   vui __builtin_vec_extract_exp (vf);
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f135fa079bd..3d39ae7995f 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -2704,6 +2704,29 @@
   DONE;
 })
 
+;; Convert low vector elements of 32-bit floating point numbers to vector of
+;; 64-bit signed/unsigned integers.
+(define_expand "vsx_xvcvspxds_low"
+  [(match_operand:V2DI 0 "vsx_register_operand")
+   (match_operand:V4SF 1 "vsx_register_operand")
+   (any_fix (pc))]
+  "VECTOR_UNIT_VSX_P (V2DFmode)"
+{
+  /* Shift left one word to put even word in correct location */
+  rtx rtx_tmp;
+  rtx rtx_val = GEN_INT (4);
+  rtx_tmp = gen_reg_rtx (V4SFmode);
+  emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
+  rtx_val));
+
+  if (BYTES_BIG_ENDIAN)
+emit_insn (gen_vsx_xvcvspxds_be (operands[0], rtx_tmp));
+  else
+emit_insn (gen_vsx_xvcvspxds_le (operands[0], rtx_tmp));
+
+  DONE;
+})
+
 ;; Generate float2 double
 ;; convert two double to float
 (define_expand "float2_v2df"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7b54a241a7b..64a43b55e2d 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -22552,6 +22552,19 @@ can use @var{vector long} instead of @var{vector long 
long},
 @var{vector bool long} instead of @var{vector bool long long}, and
 @var{vector unsigned long} instead of @var{vector unsigned long long}.
 
+@smallexample
+vector signed signed long long vec_signedo (vector float);
+vector signed signed long long vec_signede (vector float);
+vector unsigned signed long long vec_signedo (vector float);
+vector unsigned signed long long vec_signede (vector float);
+@end smallexample
+
+The overloaded built-ins @code{vec_signedo} and @code{vec_signede} convert the
+even/odd input vector elements to signed/unsigned long long integer values in
+addition to the supported arguments and return types documented in the PVIPR.
+Negative input values are returned as zero for the 

[PATCH 10/13] rs6000, extend vec_xxpermdi built-in for __int128 args

2024-04-19 Thread Carl Love
rs6000, extend vec_xxpermdi built-in for __int128 args

Add a new overloaded instance for vec_xxpermdi

   __int128 vec_xxpermdi (__int128, __int128, const int);

Update the documentation to include a reference to the new built-in
instance.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (vec_xxpermdi): Add new
overloaded built-in instance.
---
 gcc/config/rs6000/rs6000-overload.def | 2 ++
 gcc/doc/extend.texi   | 1 +
 2 files changed, 3 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-overload.def 
b/gcc/config/rs6000/rs6000-overload.def
index 5912c9452f4..49962e2f2a2 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -4932,6 +4932,8 @@
 XXPERMDI_4SF  XXPERMDI_VF
   vd __builtin_vsx_xxpermdi (vd, vd, const int);
 XXPERMDI_2DF  XXPERMDI_VD
+  vsq __builtin_vsx_xxpermdi (vsq, vsq, const int);
+XXPERMDI_1TI  XXPERMDI_1TI
 
 [VEC_XXSLDWI, vec_xxsldwi, __builtin_vsx_xxsldwi]
   vsc __builtin_vsx_xxsldwi (vsc, vsc, const int);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 86b8e536dbe..47cf2f3bc8b 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -22505,6 +22505,7 @@ void vec_vsx_st (vector bool char, int, vector bool 
char *);
 void vec_vsx_st (vector bool char, int, unsigned char *);
 void vec_vsx_st (vector bool char, int, signed char *);
 
+vector __int128 vec_xxpermdi (vector __int128, vector __int128, const int);
 vector double vec_xxpermdi (vector double, vector double, const int);
 vector float vec_xxpermdi (vector float, vector float, const int);
 vector long long vec_xxpermdi (vector long long, vector long long, const int);
-- 
2.44.0



[PATCH 9/13] rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins

2024-04-19 Thread Carl Love
rs6000, remove __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp built-ins

The undocumented __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp are
redundant.  The overloaded vec_neg built-in provides the same
functionality.  The two built-ins are not documented, nor are there any
test cases for them.

Remove the definitions so users will use the overloaded vec_neg built-in
which is documented in the PVIPR.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvnegdp,
__builtin_vsx_xvnegsp): Remove built-in definitions.
---
 gcc/config/rs6000/rs6000-builtins.def | 6 --
 1 file changed, 6 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index f33564d3d9c..d65c858ac0c 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1763,12 +1763,6 @@
   const vf __builtin_vsx_xvnabssp (vf);
 XVNABSSP vsx_nabsv4sf2 {}
 
-  const vd __builtin_vsx_xvnegdp (vd);
-XVNEGDP negv2df2 {}
-
-  const vf __builtin_vsx_xvnegsp (vf);
-XVNEGSP negv4sf2 {}
-
   const vd __builtin_vsx_xvnmadddp (vd, vd, vd);
 XVNMADDDP nfmav2df4 {}
 
-- 
2.44.0



[PATCH 3/13] rs6000, fix error in unsigned vector float to unsigned int built-in definitions

2024-04-19 Thread Carl Love
rs6000, fix error in unsigned vector float to unsigned int built-in definitions

The built-ins __builtin_vsx_vunsigned_v2df and __builtin_vsx_vunsigned_v4sf
are supposed to take a vector of floats and return a vector of unsigned
long long ints.  The definitions use the signed versions of the
instructions rather than the unsigned versions.  The results should also be
unsigned.  The built-ins are used by the overloaded vec_unsigned built-in,
which has an unsigned result.

Similarly, the built-ins __builtin_vsx_vunsignede_v2df and
__builtin_vsx_vunsignedo_v2df are supposed to return an unsigned result.
If the floating point argument is negative, the unsigned result is zero.
The built-ins are used in the overloaded built-ins vec_unsignede and
vec_unsignedo, respectively.

Add test cases with negative floating point arguments for each of the
above built-ins.
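
To make the difference visible without PowerPC hardware, here is a small
portable C sketch.  It is only a scalar model under stated assumptions:
to_unsigned_model and signed_then_reinterpret are hypothetical names, inputs
are assumed to be finite and in range for the target type, and the second
function is one plausible way a signed conversion could show up in an
unsigned result, not a claim about the exact hardware behavior.

```c
#include <assert.h>

/* Model of the intended unsigned conversion: negative inputs give zero,
   other inputs are truncated toward zero.
   Precondition: x is not NaN and is below the unsigned int maximum.  */
unsigned int
to_unsigned_model (double x)
{
  if (x < 0.0)
    return 0u;
  return (unsigned int) x;
}

/* Model of a signed conversion whose result is then viewed as unsigned.
   Precondition: the truncated value of x fits in int.  */
unsigned int
signed_then_reinterpret (double x)
{
  return (unsigned int) (int) x;
}

/* Small self-check using input values taken from the test cases in the
   patch.  */
void
unsigned_selfcheck (void)
{
  assert (to_unsigned_model (-14.930) == 0u);
  assert (to_unsigned_model (124.930) == 124u);

  /* For non-negative inputs the two models agree ...  */
  assert (signed_then_reinterpret (124.930) == 124u);
  /* ... but a negative input yields a large value instead of zero.  */
  assert (signed_then_reinterpret (-14.930) == (unsigned int) -14);
}
```

This is why tests with only positive inputs could not catch the problem.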

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_vunsigned_v2df,
__builtin_vsx_vunsigned_v4sf, __builtin_vsx_vunsignede_v2df,
__builtin_vsx_vunsignedo_v2df): Change the result type to unsigned.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-3-runnable.c: Add tests for
vec_unsignede and vec_unsignedo with negative arguments.
---
 gcc/config/rs6000/rs6000-builtins.def | 12 +-
 .../gcc.target/powerpc/builtins-3-runnable.c  | 23 ---
 2 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index c6d2ea1bc39..bf9a0ae22fc 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1580,16 +1580,16 @@
   const vsi __builtin_vsx_vsignedo_v2df (vd);
 VEC_VSIGNEDO_V2DF vsignedo_v2df {}
 
-  const vsll __builtin_vsx_vunsigned_v2df (vd);
-VEC_VUNSIGNED_V2DF vsx_xvcvdpsxds {}
+  const vull __builtin_vsx_vunsigned_v2df (vd);
+VEC_VUNSIGNED_V2DF vsx_xvcvdpuxds {}
 
-  const vsi __builtin_vsx_vunsigned_v4sf (vf);
-VEC_VUNSIGNED_V4SF vsx_xvcvspsxws {}
+  const vui __builtin_vsx_vunsigned_v4sf (vf);
+VEC_VUNSIGNED_V4SF vsx_xvcvspuxws {}
 
-  const vsi __builtin_vsx_vunsignede_v2df (vd);
+  const vui __builtin_vsx_vunsignede_v2df (vd);
 VEC_VUNSIGNEDE_V2DF vunsignede_v2df {}
 
-  const vsi __builtin_vsx_vunsignedo_v2df (vd);
+  const vui __builtin_vsx_vunsignedo_v2df (vd);
 VEC_VUNSIGNEDO_V2DF vunsignedo_v2df {}
 
   const vf __builtin_vsx_xscvdpsp (double);
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c 
b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
index 0231a1fd086..6d4fe84c8a1 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
@@ -313,6 +313,15 @@ int main()
test_unsigned_int_result (ALL, vec_uns_int_result,
  vec_uns_int_expected);
 
+   /* Convert single precision float to  unsigned int.  Negative
+  arguments
+*/
+   vec_flt0 = (vector float){-14.930, -834.49, -3.3, -5.4};
+   vec_uns_int_expected = (vector unsigned int){0, 0, 0, 0};
+   vec_uns_int_result = vec_unsigned (vec_flt0);
+   test_unsigned_int_result (ALL, vec_uns_int_result,
+ vec_uns_int_expected);
+
/* Convert double precision float to long long unsigned int */
vec_dble0 = (vector double){124.930, 8134.49};
vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
@@ -321,9 +330,9 @@ int main()
 vec_ll_uns_int_expected);
 
/* Convert double precision vector float to vector unsigned int,
-  even words */
-   vec_dble0 = (vector double){3124.930, 8234.49};
-   vec_uns_int_expected = (vector unsigned int){3124, 0, 8234, 0};
+  even words.  Negative arguments */
+   vec_dble0 = (vector double){-124.930, -234.49};
+   vec_uns_int_expected = (vector unsigned int){0, 0, 0, 0};
vec_uns_int_result = vec_unsignede (vec_dble0);
test_unsigned_int_result (EVEN, vec_uns_int_result,
  vec_uns_int_expected);
@@ -335,5 +344,13 @@ int main()
vec_uns_int_result = vec_unsignedo (vec_dble0);
test_unsigned_int_result (ODD, vec_uns_int_result,
  vec_uns_int_expected);
+
+   /* Convert double precision vector float to vector unsigned int,
+  odd words.  Negative arguments.  */
+   vec_dble0 = (vector double){-924.930, -1234.49};
+   vec_uns_int_expected = (vector unsigned int){0, 0, 0, 0};
+   vec_uns_int_result = vec_unsignedo (vec_dble0);
+   test_unsigned_int_result (ODD, vec_uns_int_result,
+ vec_uns_int_expected);
 }
 
-- 
2.44.0



[PATCH 6/13] rs6000, add overloaded vec_sel with int128 arguments

2024-04-19 Thread Carl Love
rs6000, add overloaded vec_sel with int128 arguments

Extend the vec_sel built-in to take three signed/unsigned int128 arguments
and return a signed/unsigned int128 result.

Extending the vec_sel built-in makes the existing built-ins
__builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete.  The
patch removes these built-ins.

The patch adds documentation and test cases for the new overloaded vec_sel
built-ins.
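
For reference, a minimal portable C sketch of a 128-bit bitwise select
follows.  It is a scalar model, not the built-in itself: u128, make_u128 and
sel_u128_model are hypothetical names, and it assumes a compiler that
provides the unsigned __int128 extension (GCC and Clang do on 64-bit
targets).  Each result bit is taken from the second source where the mask bit
is 1 and from the first source where it is 0.

```c
#include <assert.h>
#include <stdint.h>

/* Assumes the compiler provides the __int128 extension.  */
typedef unsigned __int128 u128;

/* Hypothetical helper: build a 128-bit value from two 64-bit halves.  */
u128
make_u128 (uint64_t hi, uint64_t lo)
{
  return ((u128) hi << 64) | lo;
}

/* Scalar model of a bitwise select: take bits of B where MASK is 1 and
   bits of A where MASK is 0.  */
u128
sel_u128_model (u128 a, u128 b, u128 mask)
{
  return (a & ~mask) | (b & mask);
}

/* Small self-check: select the low 64 bits from B, the high 64 from A.  */
void
sel_u128_selfcheck (void)
{
  u128 a = make_u128 (0x1111111111111111ull, 0x2222222222222222ull);
  u128 b = make_u128 (0xAAAAAAAAAAAAAAAAull, 0xBBBBBBBBBBBBBBBBull);
  u128 mask = make_u128 (0, 0xFFFFFFFFFFFFFFFFull);
  u128 r = sel_u128_model (a, b, mask);

  assert ((uint64_t) (r >> 64) == 0x1111111111111111ull);
  assert ((uint64_t) r == 0xBBBBBBBBBBBBBBBBull);
}
```

Because the operation is purely bitwise, the signed and unsigned variants
differ only in the declared types, not in the bits produced.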

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
__builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
* config/rs6000/rs6000-overload.def (vec_sel): Add new overloaded
definitions.
* doc/extend.texi: Add documentation for new vec_sel arguments.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vec-sel-runnable-i128.c: New test file.
---
 gcc/config/rs6000/rs6000-builtins.def |  6 --
 gcc/config/rs6000/rs6000-overload.def |  4 +
 gcc/doc/extend.texi   | 14 
 .../powerpc/vec-sel-runnable-i128.c   | 84 +++
 4 files changed, 102 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index d09e21a9151..46d2ae7b7cb 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1931,12 +1931,6 @@
   const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
 XXSEL_16QI_UNS vector_select_v16qi_uns {}
 
-  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
-XXSEL_1TI vector_select_v1ti {}
-
-  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
-XXSEL_1TI_UNS vector_select_v1ti_uns {}
-
   const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
 XXSEL_2DF vector_select_v2df {}
 
diff --git a/gcc/config/rs6000/rs6000-overload.def 
b/gcc/config/rs6000/rs6000-overload.def
index 68501c05289..5912c9452f4 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3274,6 +3274,10 @@
 VSEL_2DF  VSEL_2DF_B
   vd __builtin_vec_sel (vd, vd, vull);
 VSEL_2DF  VSEL_2DF_U
+  vsq __builtin_vec_sel (vsq, vsq, vsq);
+VSEL_1TI  VSEL_1TI_S
+  vuq __builtin_vec_sel (vuq, vuq, vuq);
+VSEL_1TI_UNS  VSEL_1TI_U
 ; The following variants are deprecated.
   vsll __builtin_vec_sel (vsll, vsll, vsll);
 VSEL_2DI_B  VSEL_2DI_S
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 64a43b55e2d..86b8e536dbe 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -23358,6 +23358,20 @@ The programmer is responsible for understanding the 
endianness issues involved
 with the first argument and the result.
 @findex vec_replace_unaligned
 
+Vector select
+
+@smallexample
+vector signed __int128 vec_sel (vector signed __int128,
+   vector signed __int128, vector signed __int128);
+vector unsigned __int128 vec_sel (vector unsigned __int128,
+   vector unsigned __int128, vector unsigned __int128);
+@end smallexample
+
+The overloaded built-in @code{vec_sel} with vector signed/unsigned __int128
+arguments and returns a vector selecting bits from the two source vectors based
+on the values of the third input vector.  This built-in is an extension of the
+@code{vec_sel} built-in documented in the PVIPR.
+
 Vector Shift Left Double Bit Immediate
 @smallexample
 @exdent vector signed char vec_sldb (vector signed char, vector signed char,
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c 
b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
new file mode 100644
index 000..58eb383e8c3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-sel-runnable-i128.c
@@ -0,0 +1,84 @@
+/* { dg-do run  { target power10_hw }} */
+/* { dg-require-effective-target int128 } */
+/* { dg-require-effective-target power10_hw } */
+/* { dg-options "-mdejagnu-cpu=power10 -save-temps" } */
+
+
+#include 
+
+
+#define DEBUG 0
+
+#if DEBUG
+#include 
+void print_i128 (unsigned __int128 val)
+{
+  printf(" 0x%016llx%016llx",
+ (unsigned long long)(val >> 64),
+ (unsigned long long)(val & 0x));
+}
+#endif
+
+extern void abort (void);
+
+int
+main (int argc, char *argv [])
+{
+  vector signed __int128 src_va_s128;
+  vector signed __int128 src_vb_s128;
+  vector signed __int128 src_vc_s128;
+  vector signed __int128 vresult_s128;
+  vector signed __int128 expected_vresult_s128;
+
+  vector unsigned __int128 src_va_u128;
+  vector unsigned __int128 src_vb_u128;
+  vector unsigned __int128 src_vc_u128;
+  vector unsigned __int128 vresult_u128;
+  vector unsigned __int128 expected_vresult_u128;
+
+  src_va_s128 = (vector signed __int128) {0x123456789ABCDEF0};
+  src_vb_s128 = (vector signed __int128) {0xFEDCBA9876543210};
+  src_vc_s128 = (vector signed __int128) {0x};
+  expected_vresult_s128 = (vector signed __int128) {0x32147658ba9cfed0};
+
+  /* Signed arguments.  

[PATCH 2/13] rs6000, Remove __builtin_vsx_xvcvspsxws built-in

2024-04-19 Thread Carl Love
rs6000, Remove __builtin_vsx_xvcvspsxws built-in

The built-in __builtin_vsx_xvcvspsxws is a duplicate of the vec_signed
built-in that is documented in the PVIPR.  The __builtin_vsx_xvcvspsxws
built-in is not documented and there are no test cases for it.

This patch removes the redundant built-in.

gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxws):
Remove built-in definition.
---
 gcc/config/rs6000/rs6000-builtins.def | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 7c36976a089..c6d2ea1bc39 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1709,9 +1709,6 @@
   const vsll __builtin_vsx_xvcvspsxds (vf);
 XVCVSPSXDS vsx_xvcvspsxds {}
 
-  const vsi __builtin_vsx_xvcvspsxws (vf);
-XVCVSPSXWS vsx_fix_truncv4sfv4si2 {}
-
   const vsll __builtin_vsx_xvcvspuxds (vf);
 XVCVSPUXDS vsx_xvcvspuxds {}
 
-- 
2.44.0



[PATCH 1/13] rs6000, Remove __builtin_vsx_cmple* builtins

2024-04-19 Thread Carl Love


rs6000, Remove __builtin_vsx_cmple* builtins

The built-ins __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
__builtin_vsx_cmple_u4si and __builtin_vsx_cmple_u8hi should take
unsigned arguments and return an unsigned result.  The current definitions
take signed arguments and return signed results which is incorrect.
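
A short portable C sketch shows why the signedness of the arguments matters
for a less-than-or-equal compare.  It is a scalar model only: cmple_s32_model
and cmple_u32_model are hypothetical names, and the all-ones/all-zeros return
value is an assumption that mirrors the usual shape of a vector compare
result.

```c
#include <assert.h>
#include <stdint.h>

/* Signed element compare: all ones if a <= b, otherwise zero.  */
uint32_t
cmple_s32_model (int32_t a, int32_t b)
{
  return a <= b ? 0xFFFFFFFFu : 0u;
}

/* Unsigned element compare: all ones if a <= b, otherwise zero.  */
uint32_t
cmple_u32_model (uint32_t a, uint32_t b)
{
  return a <= b ? 0xFFFFFFFFu : 0u;
}

/* Small self-check: the same bit pattern compares differently depending
   on whether it is read as -1 or as the largest unsigned value.  */
void
cmple_selfcheck (void)
{
  assert (cmple_s32_model (-1, 1) == 0xFFFFFFFFu);
  assert (cmple_u32_model (UINT32_MAX, 1u) == 0u);

  /* Small non-negative values give the same answer either way.  */
  assert (cmple_s32_model (3, 7) == cmple_u32_model (3u, 7u));
}
```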

The signed and unsigned versions of __builtin_vsx_cmple* are not
documented in extend.texi.  Also there are no test cases for the
built-ins.

Users can use the existing vec_cmple built-in, as defined in the PVIPR,
instead of __builtin_vsx_cmple_u16qi, __builtin_vsx_cmple_u2di,
__builtin_vsx_cmple_u4si, __builtin_vsx_cmple_u8hi,
__builtin_vsx_cmple_16qi, __builtin_vsx_cmple_2di,
__builtin_vsx_cmple_4si, __builtin_vsx_cmple_8hi,
__builtin_altivec_cmple_1ti and __builtin_altivec_cmple_u1ti.

Hence these built-ins are redundant and are removed by this patch.

gcc/ChangeLog:
* config/rs6000/rs6000-builtin.cc (RS6000_BIF_CMPLE_16QI,
RS6000_BIF_CMPLE_U16QI, RS6000_BIF_CMPLE_8HI,
RS6000_BIF_CMPLE_U8HI, RS6000_BIF_CMPLE_4SI, RS6000_BIF_CMPLE_U4SI,
RS6000_BIF_CMPLE_2DI, RS6000_BIF_CMPLE_U2DI, RS6000_BIF_CMPLE_1TI,
RS6000_BIF_CMPLE_U1TI): Remove case statements.
* config/rs6000/rs6000-builtins.def (__builtin_vsx_cmple_16qi,
__builtin_vsx_cmple_2di, __builtin_vsx_cmple_4si,
__builtin_vsx_cmple_8hi, __builtin_vsx_cmple_u16qi,
__builtin_vsx_cmple_u2di, __builtin_vsx_cmple_u4si,
__builtin_vsx_cmple_u8hi): Remove built-in definitions.
---
 gcc/config/rs6000/rs6000-builtin.cc   | 13 
 gcc/config/rs6000/rs6000-builtins.def | 30 ---
 2 files changed, 43 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 320affd79e3..ac9f16fe51a 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -2027,19 +2027,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   fold_compare_helper (gsi, GT_EXPR, stmt);
   return true;
 
-case RS6000_BIF_CMPLE_16QI:
-case RS6000_BIF_CMPLE_U16QI:
-case RS6000_BIF_CMPLE_8HI:
-case RS6000_BIF_CMPLE_U8HI:
-case RS6000_BIF_CMPLE_4SI:
-case RS6000_BIF_CMPLE_U4SI:
-case RS6000_BIF_CMPLE_2DI:
-case RS6000_BIF_CMPLE_U2DI:
-case RS6000_BIF_CMPLE_1TI:
-case RS6000_BIF_CMPLE_U1TI:
-  fold_compare_helper (gsi, LE_EXPR, stmt);
-  return true;
-
 /* flavors of vec_splat_[us]{8,16,32}.  */
 case RS6000_BIF_VSPLTISB:
 case RS6000_BIF_VSPLTISH:
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 3bc7fed6956..7c36976a089 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1337,30 +1337,6 @@
   const vss __builtin_vsx_cmpge_u8hi (vus, vus);
 CMPGE_U8HI vector_nltuv8hi {}
 
-  const vsc __builtin_vsx_cmple_16qi (vsc, vsc);
-CMPLE_16QI vector_ngtv16qi {}
-
-  const vsll __builtin_vsx_cmple_2di (vsll, vsll);
-CMPLE_2DI vector_ngtv2di {}
-
-  const vsi __builtin_vsx_cmple_4si (vsi, vsi);
-CMPLE_4SI vector_ngtv4si {}
-
-  const vss __builtin_vsx_cmple_8hi (vss, vss);
-CMPLE_8HI vector_ngtv8hi {}
-
-  const vsc __builtin_vsx_cmple_u16qi (vsc, vsc);
-CMPLE_U16QI vector_ngtuv16qi {}
-
-  const vsll __builtin_vsx_cmple_u2di (vsll, vsll);
-CMPLE_U2DI vector_ngtuv2di {}
-
-  const vsi __builtin_vsx_cmple_u4si (vsi, vsi);
-CMPLE_U4SI vector_ngtuv4si {}
-
-  const vss __builtin_vsx_cmple_u8hi (vss, vss);
-CMPLE_U8HI vector_ngtuv8hi {}
-
   const vd __builtin_vsx_concat_2df (double, double);
 CONCAT_2DF vsx_concat_v2df {}
 
@@ -3117,12 +3093,6 @@
   const vbq __builtin_altivec_cmpge_u1ti (vuq, vuq);
 CMPGE_U1TI vector_nltuv1ti {}
 
-  const vbq __builtin_altivec_cmple_1ti (vsq, vsq);
-CMPLE_1TI vector_ngtv1ti {}
-
-  const vbq __builtin_altivec_cmple_u1ti (vuq, vuq);
-CMPLE_U1TI vector_ngtuv1ti {}
-
   const unsigned long long __builtin_altivec_cntmbb (vuc, const int<1>);
 VCNTMBB vec_cntmb_v16qi {}
 
-- 
2.44.0



[PATCH 0/13] rs6000, built-in cleanup patch series

2024-04-19 Thread Carl Love
GCC maintainers:

The following patch series removes duplicate built-ins.  There are patches to 
extend an existing overloaded built-in to cover additional input types.  The 
final patch removes built-ins to set and initialize vectors.  The code 
generated by these built-ins with the default optimization is no more efficient 
than the code generated by using straight C code.  The assembly code for the 
built-in and straight C code is the same with -O3 optimization.  In this case, 
the built-ins are removed as they add no additional value.

The patches have all been tested on Power 10 LE.  The last patch was also 
tested on Power 8 BE.

No regressions were seen.

Please let me know if the patches are acceptable for mainline.  Thanks.

   Carl 



Re: [PATCH] c-family: Allow arguments with NULLPTR_TYPE as sentinels [PR114780]

2024-04-19 Thread Joseph Myers
On Fri, 19 Apr 2024, Jakub Jelinek wrote:

> Ok for trunk and later 13.3 if it passes bootstrap/regtest (so far just
> checked on the sentinel related C/C++ tests)?
> 
> 2024-04-19  Jakub Jelinek  
> 
>   PR c/114780
>   * c-common.cc (check_function_sentinel): Allow as sentinel any
>   argument of NULLPTR_TYPE.
> 
>   * gcc.dg/format/sentinel-2.c: New test.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [RFC][PATCH v1 1/4] Documentation change

2024-04-19 Thread Tom Tromey
> Qing Zhao  writes:

> +The size of the union is as if the flexiable array member were omitted
> +except that it may have more trailing padding than the omission would imply.
> +
> +If all the members of a union are flexiable array member, the size of 

There's a couple of spots that say "flexiable" which should say "flexible".

thanks,
Tom


Re: [PATCH] c, v3: Fix ICE with -g and -std=c23 related to incomplete types [PR114361]

2024-04-19 Thread Joseph Myers
On Mon, 15 Apr 2024, Jakub Jelinek wrote:

> 2024-04-15  Martin Uecker  
>   Jakub Jelinek  
> 
>   PR lto/114574
>   PR c/114361
> gcc/c/
>   * c-decl.cc (shadow_tag_warned): For flag_isoc23 and code not
>   ENUMERAL_TYPE use SET_TYPE_STRUCTURAL_EQUALITY.
>   (parser_xref_tag): Likewise.
>   (start_struct): For flag_isoc23 use SET_TYPE_STRUCTURAL_EQUALITY.
>   (c_update_type_canonical): New function.
>   (finish_struct): Put NULL as second == operand rather than first.
>   Assert TYPE_STRUCTURAL_EQUALITY_P.  Call c_update_type_canonical.
>   * c-typeck.cc (composite_type_internal): Use
>   SET_TYPE_STRUCTURAL_EQUALITY.  Formatting fix.
> gcc/testsuite/
>   * gcc.dg/pr114574-1.c: New test.
>   * gcc.dg/pr114574-2.c: New test.
>   * gcc.dg/pr114361.c: New test.
>   * gcc.dg/c23-tag-incomplete-1.c: New test.
>   * gcc.dg/c23-tag-incomplete-2.c: New test.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



[committed] libstdc++: Simplify constraints on <=> for std::reference_wrapper

2024-04-19 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

Instead of constraining these overloads in terms of synth-three-way we
can just check that the value_type is less-than-comparable, which is
what synth-three-way's constraints check.

The reason that I implemented these with constraints has now been filed
as LWG 4071, so add a comment about that too.
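
As a rough illustration of the simplified constraint, the check "the value type is less-than-comparable" can be approximated in C++17 with the detection idiom. This is only a sketch under stated assumptions: the names toy_less_than_comparable, NoLess and WithLess are hypothetical, the real patch uses a C++20 requires-expression with __detail::__boolean_testable (which is stricter than the plain conversion to bool tested here), and none of this is the libstdc++ implementation.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical trait: true when `t < t` is well-formed for const T
// and its result can be converted to bool.  This only approximates
// the boolean-testable requirement used in the patch.
template <typename T, typename = void>
struct toy_less_than_comparable : std::false_type {};

template <typename T>
struct toy_less_than_comparable<
    T,
    std::void_t<decltype(static_cast<bool>(
        std::declval<const T&>() < std::declval<const T&>()))>>
    : std::true_type {};

// A type with no operator< at all.
struct NoLess {};

// A type that provides operator< returning bool.
struct WithLess {
  int v;
  friend bool operator<(const WithLess& a, const WithLess& b)
  { return a.v < b.v; }
};
```

A constraint of this shape names only the value type, so checking it does not require instantiating the comparison on the wrapper itself.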

libstdc++-v3/ChangeLog:

* include/bits/refwrap.h (operator<=>): Simplify constraints.
---
 libstdc++-v3/include/bits/refwrap.h | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/include/bits/refwrap.h 
b/libstdc++-v3/include/bits/refwrap.h
index fd1cc2b63e6..71ec2b297b7 100644
--- a/libstdc++-v3/include/bits/refwrap.h
+++ b/libstdc++-v3/include/bits/refwrap.h
@@ -384,23 +384,29 @@ _GLIBCXX_MEM_FN_TRAITS(&& noexcept, false_type, true_type)
&& requires { { __x.get() == __y.get() } -> convertible_to<bool>; }
   { return __x.get() == __y.get(); }
 
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 4071. reference_wrapper comparisons are not SFINAE-friendly
+
   [[nodiscard]]
   friend constexpr auto
-  operator<=>(reference_wrapper __x, reference_wrapper<_Tp> __y)
-  requires requires { __detail::__synth3way(__x.get(), __y.get()); }
+  operator<=>(reference_wrapper __x, reference_wrapper __y)
+  requires requires (const _Tp __t) {
+   { __t < __t } -> __detail::__boolean_testable;
+  }
   { return __detail::__synth3way(__x.get(), __y.get()); }
 
   [[nodiscard]]
   friend constexpr auto
   operator<=>(reference_wrapper __x, const _Tp& __y)
-  requires requires { __detail::__synth3way(__x.get(), __y); }
+  requires requires { { __y < __y } -> __detail::__boolean_testable; }
   { return __detail::__synth3way(__x.get(), __y); }
 
   [[nodiscard]]
   friend constexpr auto
   operator<=>(reference_wrapper __x, reference_wrapper<const _Tp> __y)
-  requires (!is_const_v<_Tp>)
-   && requires { __detail::__synth3way(__x.get(), __y.get()); }
+  requires (!is_const_v<_Tp>) && requires (const _Tp __t) {
+   { __t < __t } -> __detail::__boolean_testable;
+  }
   { return __detail::__synth3way(__x.get(), __y.get()); }
 #endif
 };
-- 
2.44.0



Re: [PATCH] libstdc++: Support link chains in std::chrono::tzdb::locate_zone [PR114770]

2024-04-19 Thread Jonathan Wakely
On Fri, 19 Apr 2024 at 10:08, Jonathan Wakely  wrote:
>
> On Fri, 19 Apr 2024 at 07:14, Richard Biener  
> wrote:
> >
> > On Thu, Apr 18, 2024 at 6:34 PM Jonathan Wakely  wrote:
> > >
> > > > This would fix the bug, how do people feel about it this close to the
> > > gcc-14 release?
> >
> > Guess we'll have to fix it anyway, so why not now ...
>
> Yeah, I don't think Debian is going to stop using this feature, and it
> might get used more widely in future (it's currently part of the
> "vanguard" format for tzdata, but might move to "main" one day and
> then all distros would have chained links). So it needs to be
> backported to gcc-13 too.
>
> > (what could go wrong..)
>
> Well the risk is that my new code doesn't correctly detect cycles, and
> so could go into an infinite loop when trying to follow chained links.
> The current code on trunk will just fail to find a time_zone and throw
> an exception, which is not ideal, but predictable and easily
> understood. Attempting to handle chained links adds complexity.
>
> I think my new code is correct so that it won't get stuck in a loop,
> and there are tests which should cover it sufficiently. And for
> correct tzdata.zi there will never be cycles anyway, so even if I
> messed the code up, it shouldn't matter unless the application
> provides a custom tzdata.zi with invalid links.
>
> So I guess I'll push it, and backport to gcc-13 soon.


I've pushed the attached, which is the same as the earlier patch
except for adding a new function to the testsuite/std/time/tzdb/1.cc
test.
commit eed7fb1b2fe72150cd6af10dd3b8f7fc4f0a4da1
Author: Jonathan Wakely 
Date:   Thu Apr 18 12:14:41 2024

libstdc++: Support link chains in std::chrono::tzdb::locate_zone [PR114770]

Since 2022 the TZif format defined in the zic(8) man page has said that
links can refer to other links, rather than only referring to a zone.
This isn't supported by the C++20 spec, which assumes that the target()
for a chrono::time_zone_link always names a chrono::time_zone, not
another chrono::time_zone_link.

This hasn't been a problem until now, because there are no entries in
the tzdata file that chain links together. However, Debian Sid has
changed the target of the Asia/Chungking link from the Asia/Shanghai
zone to the Asia/Chongqing link, creating a link chain. The libstdc++
code is unable to handle this, so chrono::locate_zone("Asia/Chungking")
will fail with the tzdata.zi file from Debian Sid.

It seems likely that the C++ spec will need a change to allow link
chains, so that the original structure of the IANA database can be fully
represented by chrono::tzdb. The alternative would be for chrono::tzdb
to flatten all chains when loading the data, so that a link's target is
always a zone, but this means throwing away information present in the
tzdata.zi input file.

In anticipation of a change to the spec, this commit adds support for
chained links to libstdc++. When a name is found to be a link, we try to
find its target in the list of zones as before, but now if the target
isn't the name of a zone we don't fail. Instead we look for another link
with that name, and keep doing that until we reach the end of the chain
of links, and then look up the last target as a zone.

This new logic would get stuck in a loop if the tzdata.zi file is buggy
and defines a link chain that contains a cycle, e.g. two links that
refer to each other. To deal with that unlikely case, we use the
tortoise and hare algorithm to detect cycles in link chains, and throw
an exception if we detect a cycle. Cycles in links should never happen,
and it is expected that link chains will be short (if they occur at all)
and so the code is optimized for short chains without cycles. Longer
chains (four or more links) and cycles will do more work, but won't fail
to resolve a chain or get stuck in a loop.
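
The chain-following logic described above can be sketched as follows. This is a toy model under stated assumptions, not the code in tzdb.cc: ToyDb, next_link and resolve are hypothetical names, zones and links are reduced to plain strings, and the short-chain optimization mentioned in the commit message is not reproduced. It only shows how a half-speed "tortoise" detects a cycle while the "hare" walks the chain.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the tzdb tables: a set of zone names and
// a map from link name to target name.
struct ToyDb {
  std::set<std::string> zones;
  std::map<std::string, std::string> links;
};

// Return the target of the link called `name`, or nullopt if there is
// no such link.
std::optional<std::string>
next_link(const ToyDb& db, const std::string& name)
{
  auto it = db.links.find(name);
  if (it == db.links.end())
    return std::nullopt;
  return it->second;
}

// Resolve `name` to a zone name, following chained links.
// Returns nullopt if the chain ends at a name that is neither a zone
// nor a link; throws std::runtime_error if the chain contains a cycle.
std::optional<std::string>
resolve(const ToyDb& db, const std::string& name)
{
  if (db.zones.count(name))
    return name;

  std::string hare = name;      // advances one link per iteration
  std::string tortoise = name;  // advances one link every other iteration
  bool advance_tortoise = false;

  while (auto target = next_link(db, hare))
    {
      hare = *target;
      if (db.zones.count(hare))
        return hare;            // reached the end of the chain
      if (advance_tortoise)
        tortoise = *next_link(db, tortoise);  // always a link: hare passed it
      advance_tortoise = !advance_tortoise;
      if (hare == tortoise)
        throw std::runtime_error("cycle in link chain for " + name);
    }
  return std::nullopt;          // broken link: target is not in the database
}
```

With zones = {"Asia/Shanghai"} and links Asia/Chongqing -> Asia/Shanghai and Asia/Chungking -> Asia/Chongqing, resolve(db, "Asia/Chungking") yields "Asia/Shanghai"; two links that name each other make it throw instead of looping forever.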

The new test file checks various forms of broken links and cycles.

Also add a new check in the testsuite that every element in the
get_tzdb().zones and get_tzdb().links sequences can be successfully
found using locate_zone.

libstdc++-v3/ChangeLog:

PR libstdc++/114770
* src/c++20/tzdb.cc (do_locate_zone): Support links that have
another link as their target.
* testsuite/std/time/tzdb/1.cc: Check that all zones and links
can be found by locate_zone.
* testsuite/std/time/tzdb/links.cc: New test.

diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc
index 639d1c440ba..c7c7cc9deee 100644
--- a/libstdc++-v3/src/c++20/tzdb.cc
+++ b/libstdc++-v3/src/c++20/tzdb.cc
@@ -1599,7 +1599,7 @@ namespace std::chrono
 const time_zone*
 do_locate_zone(const vector<time_zone>& zones,
   const vector<time_zone_link>& links,
-  string_view tz_name) noexcept
+ 

[RFC][PATCH v1 4/4] Adjust testcases for flexible array member in union and alone in structure extension.

2024-04-19 Thread Qing Zhao
gcc/testsuite/ChangeLog:

* c-c++-common/builtin-clear-padding-3.c: Adjust testcase.
* g++.dg/ext/flexary12.C: Likewise.
* g++.dg/ext/flexary19.C: Likewise.
* g++.dg/ext/flexary2.C: Likewise.
* g++.dg/ext/flexary3.C: Likewise.
* g++.dg/ext/flexary36.C: Likewise.
* g++.dg/ext/flexary4.C: Likewise.
* g++.dg/ext/flexary5.C: Likewise.
* g++.dg/ext/flexary8.C: Likewise.
* g++.dg/torture/pr64280.C: Likewise.
* gcc.dg/20050620-1.c: Likewise.
* gcc.dg/940510-1.c: Likewise.
---
 .../c-c++-common/builtin-clear-padding-3.c| 10 ++--
 gcc/testsuite/g++.dg/ext/flexary12.C  |  6 +--
 gcc/testsuite/g++.dg/ext/flexary19.C  | 42 +++
 gcc/testsuite/g++.dg/ext/flexary2.C   |  2 +-
 gcc/testsuite/g++.dg/ext/flexary3.C   |  2 +-
 gcc/testsuite/g++.dg/ext/flexary36.C  |  2 +-
 gcc/testsuite/g++.dg/ext/flexary4.C   | 54 +--
 gcc/testsuite/g++.dg/ext/flexary5.C   |  4 +-
 gcc/testsuite/g++.dg/ext/flexary8.C   |  8 +--
 gcc/testsuite/g++.dg/torture/pr64280.C|  2 +-
 gcc/testsuite/gcc.dg/20050620-1.c |  2 +-
 gcc/testsuite/gcc.dg/940510-1.c   |  4 +-
 12 files changed, 68 insertions(+), 70 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/builtin-clear-padding-3.c 
b/gcc/testsuite/c-c++-common/builtin-clear-padding-3.c
index d16cc6aad05f..a4f49f26db14 100644
--- a/gcc/testsuite/c-c++-common/builtin-clear-padding-3.c
+++ b/gcc/testsuite/c-c++-common/builtin-clear-padding-3.c
@@ -2,14 +2,12 @@
 /* { dg-do compile } */
 /* { dg-options "" } */
 
-union U { int a; char b[] __attribute__((aligned (2 * sizeof (int)))); };  /* { dg-error "flexible array member in union" } */
+union U { int a; char b[] __attribute__((aligned (2 * sizeof (int)))); };
 struct V { int a; union U b; };
-struct W { int a; union U b; int c; };
 
 void
-foo (union U *u, struct V *v, struct W *w)
+foo (union U *u, struct V *v)
 {
-  __builtin_clear_padding (u);
-  __builtin_clear_padding (v);
-  __builtin_clear_padding (w);
+  __builtin_clear_padding (u); /* { dg-error "flexible array member" "does not 
have well defined padding bits" } */
+  __builtin_clear_padding (v); /* { dg-error "flexible array member" "does not 
have well defined padding bits" } */
 }
diff --git a/gcc/testsuite/g++.dg/ext/flexary12.C 
b/gcc/testsuite/g++.dg/ext/flexary12.C
index b0964948731d..6ba4b6417135 100644
--- a/gcc/testsuite/g++.dg/ext/flexary12.C
+++ b/gcc/testsuite/g++.dg/ext/flexary12.C
@@ -6,7 +6,7 @@
 // { dg-options "-Wno-pedantic" }
 
 struct A {
-  int a [];  // { dg-error "flexible array member .A::a. in an otherwise empty 
.struct A." }
+  int a [];
 };
 
 void f1 ()
@@ -40,7 +40,7 @@ void f2 ()
 }
 
 struct D {
-  int a [];  // { dg-error "flexible array member .D::a. in an otherwise empty 
.struct D." }
+  int a [];
   D ();
 };
 
@@ -52,7 +52,7 @@ D::D ():// { dg-error "initializer for flexible array 
member" }
 
 template 
 struct C {
-  T a [];  // { dg-error "flexible array member" }
+  T a [];
 };
 
 void f3 ()
diff --git a/gcc/testsuite/g++.dg/ext/flexary19.C 
b/gcc/testsuite/g++.dg/ext/flexary19.C
index abfbc43028af..9a06f9ca758f 100644
--- a/gcc/testsuite/g++.dg/ext/flexary19.C
+++ b/gcc/testsuite/g++.dg/ext/flexary19.C
@@ -12,7 +12,7 @@ struct S1
   // The following declares a named data member of an unnamed struct
   // (i.e., it is not an anonymous struct).
   struct {
-int a[];// { dg-error "in an otherwise empty" }
+int a[];// { dg-warning "in an otherwise empty" }
   } s;
 };
 
@@ -21,7 +21,7 @@ struct S2
   int i;
 
   struct {
-int a[];// { dg-error "in an otherwise empty" }
+int a[];// { dg-warning "in an otherwise empty" }
   } s[1];
 };
 
@@ -30,7 +30,7 @@ struct S3
   int i;
 
   struct {
-int a[];// { dg-error "in an otherwise empty" }
+int a[];// { dg-warning "in an otherwise empty" }
   } s[];
 };
 
@@ -39,7 +39,7 @@ struct S4
   int i;
 
   struct {
-int a[];// { dg-error "in an otherwise empty" }
+int a[];// { dg-warning "in an otherwise empty" }
   } s[2];
 };
 
@@ -48,7 +48,7 @@ struct S5
   int i;
 
   struct {
-int a[];// { dg-error "in an otherwise empty" }
+int a[];// { dg-warning "in an otherwise empty" }
   } s[1][2];
 };
 
@@ -57,7 +57,7 @@ struct S6
   int i;
 
   struct {
-int a[];// { dg-error "in an otherwise empty" }
+int a[];// { dg-warning "in an otherwise empty" }
   } s[][2];
 };
 
@@ -66,7 +66,7 @@ struct S7
   int i;
 
   struct {
-int a[];// { dg-error "in an otherwise empty" }
+int a[];// { dg-warning "in an otherwise empty" }
   } *s;
 };
 
@@ -75,7 +75,7 @@ struct S8
   int i;
 
   struct {
-int a[];// { dg-error "in an otherwise empty" }
+int a[];// { dg-warning "in an otherwise empty" }
   } **s;
 };
 

[RFC][PATCH v1 3/4] Add testing cases for flexible array members in unions and alone in structures.

2024-04-19 Thread Qing Zhao
gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-in-union-1.c: New test.
* gcc.dg/flex-array-in-union-2.c: New test.
---
 gcc/testsuite/gcc.dg/flex-array-in-union-1.c | 37 +
 gcc/testsuite/gcc.dg/flex-array-in-union-2.c | 42 
 2 files changed, 79 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-in-union-1.c
 create mode 100644 gcc/testsuite/gcc.dg/flex-array-in-union-2.c

diff --git a/gcc/testsuite/gcc.dg/flex-array-in-union-1.c 
b/gcc/testsuite/gcc.dg/flex-array-in-union-1.c
new file mode 100644
index ..2a532d77c1dd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-in-union-1.c
@@ -0,0 +1,37 @@
+/* testing the correct usage of flexible array members in unions 
+   and alone in structure.  */
+/* { dg-do run} */
+/* { dg-options "-O2 -Wpedantic" } */
+
+union with_fam_1 {
+  int a;
+  int b[];  /* { dg-warning "flexible array member in union is a GCC 
extension" } */
+};
+
+union with_fam_2 {
+  char a;
+  int b[];  /* { dg-warning "flexible array member in union is a GCC 
extension" } */
+};
+
+union with_fam_3 {
+  char a[];  /* { dg-warning " flexible array member in union is a GCC 
extension" } */
+  int b[];  /* { dg-warning "flexible array member in union is a GCC 
extension" } */
+};
+
+struct only_fam {
+  int b[];  /* { dg-warning "flexible array member in a struct with no named 
members is a GCC extension" } */
+};
+
+int main ()
+{
+  if (sizeof (union with_fam_1) != sizeof (int))
+__builtin_abort ();
+  if (sizeof (union with_fam_2) != __alignof__ (int))
+__builtin_abort ();
+  if (sizeof (union with_fam_3) != 0)
+__builtin_abort ();
+  if (sizeof (struct only_fam) != 0)
+__builtin_abort ();
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.dg/flex-array-in-union-2.c 
b/gcc/testsuite/gcc.dg/flex-array-in-union-2.c
new file mode 100644
index ..130124bbe653
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/flex-array-in-union-2.c
@@ -0,0 +1,42 @@
+/* testing the correct usage of flexible array members in unions 
+   and alone in structure: initialization  */
+/* { dg-do run} */
+/* { dg-options "-O2" } */
+
+union with_fam_1 {
+  int a;
+  int b[]; 
+} with_fam_1_v = {.b = {1, 2, 3, 4}};
+
+union with_fam_2 {
+  int a;
+  char b[];  
+} with_fam_2_v = {.a = 0x1f2f3f4f};
+
+union with_fam_3 {
+  char a[];  
+  int b[];  
+} with_fam_3_v = {.b = {0x1f2f3f4f, 0x5f6f7f7f}};
+
+struct only_fam {
+  int b[]; 
+} only_fam_v = {{7, 11}};
+
+int main ()
+{
+  if (with_fam_1_v.b[3] != 4
+  || with_fam_1_v.b[0] != 1)
+__builtin_abort ();
+  if (with_fam_2_v.b[3] != 0x1f
+  || with_fam_2_v.b[0] != 0x4f)
+__builtin_abort ();
+  if (with_fam_3_v.a[0] != 0x4f
+  || with_fam_3_v.a[7] != 0x5f)
+__builtin_abort ();
+  if (only_fam_v.b[0] != 7
+  || only_fam_v.b[1] != 11)
+__builtin_abort ();
+
+  return 0;
+}
+
-- 
2.31.1



[RFC][PATCH v1 1/4] Documentation change

2024-04-19 Thread Qing Zhao
to allow flexible array members in unions and alone in structures [PR53548]

The request for GCC to accept a C99 flexible array member in a union or
alone in a structure was made long ago, around 2012, to support several
practical cases, including glibc.

A GCC PR was opened for this request at that time:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53548

However, this PR was closed as WONTFIX around 2015 due to the following reason:

"there is an existing extension that makes the requested functionality possible",
i.e., GCC has long fully supported zero-length arrays in unions and alone in
structures (though I didn't find any official documentation of this
extension).

It was reasonable to close PR53548 at that time, since the zero-length array
extension could be used for this purpose.

However, since GCC 13, in order to improve C/C++ security, we have introduced
-fstrict-flex-arrays=n to gradually eliminate "fake flexible array"
usages from C/C++ source code. As a result, zero-length arrays will eventually
be replaced completely by C99 flexible array members.

Therefore, GCC needs to explicitly allow such extensions directly for C99
flexible arrays, since flexible array members in unions or alone in structs
are common code patterns in active use by the Linux kernel (and other projects).

For example, these do not error by default with GCC:

union one {
  int a;
  int b[0];
};

union two {
  int a;
  struct {
struct { } __empty;
int b[];
  };
};

But these do:

union three {
  int a;
  int b[];
};

struct four {
  int b[];
};

Clang has supported such extensions since March, 2024
https://github.com/llvm/llvm-project/pull/84428

GCC should also support such extensions. This will allow for
a seamless transition for code bases away from zero-length arrays without
losing existing code patterns.

gcc/ChangeLog:

* doc/extend.texi: Add documentation for Flexible Array Members in
Unions and Flexible Array Members alone in Structures.
---
 gcc/doc/extend.texi | 37 +
 1 file changed, 37 insertions(+)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7b54a241a7bf..b12ce5fb9b87 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -42,6 +42,8 @@ extensions, accepted by GCC in C90 mode and in C++.
 * Named Address Spaces::Named address spaces.
 * Zero Length:: Zero-length arrays.
 * Empty Structures::Structures with no members.
+* Flexible Array Members in Unions::  Unions with Flexible Array Members.
+* Flexible Array Members alone in Structures::  Structures with only Flexible 
Array Members.
 * Variable Length:: Arrays whose length is computed at run time.
 * Variadic Macros:: Macros with a variable number of arguments.
 * Escaped Newlines::Slightly looser rules for escaped newlines.
@@ -1873,6 +1875,41 @@ The structure has size zero.  In C++, empty structures 
are part
 of the language.  G++ treats empty structures as if they had a single
 member of type @code{char}.
 
+@node Flexible Array Members in Unions
+@section Unions with Flexible Array Members
+@cindex unions with flexible array members
+@cindex unions with FAMs
+
+GCC permits a C99 flexible array member (FAM) to be in a union:
+
+@smallexample
+union with_fam @{
+  int a;
+  int b[];
+@};
+@end smallexample
+
+The size of the union is as if the flexiable array member were omitted
+except that it may have more trailing padding than the omission would imply.
+
+If all the members of a union are flexiable array member, the size of 
+such union is zero.
+
+@node Flexible Array Members alone in Structures
+@section Structures with only Flexible Array Members
+@cindex structures with only flexible array members
+@cindex structures with only FAMs
+
+GCC permits a C99 flexible array member (FAM) to be alone in a structure:
+
+@smallexample
+struct only_fam @{
+  int b[];
+@};
+@end smallexample
+
+The size of such structure gives the size zero.
+
 @node Variable Length
 @section Arrays of Variable Length
 @cindex variable-length arrays
-- 
2.31.1



[RFC][PATCH v1 2/4] C and C++ FE changes to support flexible array members in unions and alone in structures.

2024-04-19 Thread Qing Zhao
gcc/c/ChangeLog:

* c-decl.cc (finish_struct): Change errors to pedwarns for the cases of
flexible array members in unions or alone in structures.

gcc/cp/ChangeLog:

* class.cc (diagnose_flexarrays): Change error to pedwarn for the case of
flexible array members alone in structures.
* decl.cc (grokdeclarator): Change error to pedwarn for the case of
flexible array members in unions.

gcc/ChangeLog:

* stor-layout.cc (place_union_field): Use zero sizes for flexible array
member fields.
---
 gcc/c/c-decl.cc| 16 +---
 gcc/cp/class.cc| 11 ---
 gcc/cp/decl.cc |  7 +++
 gcc/stor-layout.cc |  9 +++--
 4 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 345090dae38b..947f3cd589eb 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9471,11 +9471,8 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   if (flexible_array_member_type_p (TREE_TYPE (x)))
{
  if (TREE_CODE (t) == UNION_TYPE)
-   {
- error_at (DECL_SOURCE_LOCATION (x),
-   "flexible array member in union");
- TREE_TYPE (x) = error_mark_node;
-   }
+   pedwarn (DECL_SOURCE_LOCATION (x), OPT_Wpedantic,
+"flexible array member in union is a GCC extension");
  else if (!is_last_field)
{
  error_at (DECL_SOURCE_LOCATION (x),
@@ -9483,12 +9480,9 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
  TREE_TYPE (x) = error_mark_node;
}
  else if (!saw_named_field)
-   {
- error_at (DECL_SOURCE_LOCATION (x),
-   "flexible array member in a struct with no named "
-   "members");
- TREE_TYPE (x) = error_mark_node;
-   }
+   pedwarn (DECL_SOURCE_LOCATION (x), OPT_Wpedantic,
+"flexible array member in a struct with no named "
+"members is a GCC extension");
}
 
   if (pedantic && TREE_CODE (t) == RECORD_TYPE
diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index 5f258729940b..0c8afb72550f 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -7624,6 +7624,7 @@ diagnose_flexarrays (tree t, const flexmems_t *fmem)
   bool diagd = false;
 
   const char *msg = 0;
+  const char *msg_fam = 0;
 
   if (TYPE_DOMAIN (TREE_TYPE (fmem->array)))
 {
@@ -7649,15 +7650,19 @@ diagnose_flexarrays (tree t, const flexmems_t *fmem)
   if (fmem->after[0])
msg = G_("flexible array member %qD not at end of %q#T");
   else if (!fmem->first)
-   msg = G_("flexible array member %qD in an otherwise empty %q#T");
+   msg_fam = G_("flexible array member %qD in an otherwise"
+" empty %q#T is a GCC extension");
 
-  if (msg)
+  if (msg || msg_fam)
{
  location_t loc = DECL_SOURCE_LOCATION (fmem->array);
  diagd = true;
 
  auto_diagnostic_group d;
- error_at (loc, msg, fmem->array, t);
+ if (msg)
+   error_at (loc, msg, fmem->array, t);
+ else
+   pedwarn (loc, OPT_Wpedantic, msg_fam, fmem->array, t);
 
  /* In the unlikely event that the member following the flexible
 array member is declared in a different class, or the member
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 65ab64885ff8..9a91c6f80da1 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -14566,10 +14566,9 @@ grokdeclarator (const cp_declarator *declarator,
if (ctype
&& (TREE_CODE (ctype) == UNION_TYPE
|| TREE_CODE (ctype) == QUAL_UNION_TYPE))
- {
-   error_at (id_loc, "flexible array member in union");
-   type = error_mark_node;
- }
+ pedwarn (id_loc, OPT_Wpedantic,
+  "flexible array member in union is a GCC extension");
+
else
  {
/* Array is a flexible member.  */
diff --git a/gcc/stor-layout.cc b/gcc/stor-layout.cc
index e34be19689c0..10c0809914cd 100644
--- a/gcc/stor-layout.cc
+++ b/gcc/stor-layout.cc
@@ -1245,13 +1245,18 @@ place_union_field (record_layout_info rli, tree field)
   && TYPE_TYPELESS_STORAGE (TREE_TYPE (field)))
 TYPE_TYPELESS_STORAGE (rli->t) = 1;
 
+  /* We might see a flexible array member field (with no DECL_SIZE_UNIT), use
+ zero size for such field.  */
+  tree field_size_unit = DECL_SIZE_UNIT (field)
+? DECL_SIZE_UNIT (field)
+: build_int_cst (sizetype, 0);
   /* We assume the union's size will be a multiple of a byte so we don't
  bother with BITPOS.  */
   if (TREE_CODE (rli->t) == UNION_TYPE)
-rli->offset = size_binop (MAX_EXPR, rli->offset, DECL_SIZE_UNIT (field));
+rli->offset = size_binop (MAX_EXPR, rli->offset, 
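
The effect of treating a flexible array member as a zero-sized union field can be illustrated with a small toy model. This is a sketch under stated assumptions, not GCC's stor-layout code: ToyField and toy_union_size are hypothetical names, a missing size stands in for a field with no DECL_SIZE_UNIT, and the final rounding up to the union's alignment is assumed to happen as for any other union.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical description of one union member.
struct ToyField {
  std::optional<std::size_t> size_bytes;  // nullopt models a flexible array member
  std::size_t align_bytes;                // alignment the member requires
};

// Size of a union in this toy model: the maximum member size, with
// flexible array members counted as zero, rounded up to the largest
// member alignment.
std::size_t
toy_union_size(const std::vector<ToyField>& fields)
{
  std::size_t size = 0;
  std::size_t align = 1;
  for (const ToyField& f : fields)
    {
      size = std::max(size, f.size_bytes.value_or(0));
      align = std::max(align, f.align_bytes);
    }
  return (size + align - 1) / align * align;
}
```

Assuming a 4-byte int with 4-byte alignment, a union of int a and int b[] has size 4, a union of char a and int b[] is padded up to 4, and a union containing only flexible array members has size 0, which matches the expectations in the flex-array-in-union-1.c test from this series.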

[RFC][PATCH v1 0/4] Allow flexible array members in unions and alone in structures [PR53548]

2024-04-19 Thread Qing Zhao
Hi,

The request for GCC to accept a C99 flexible array member in a union or
alone in a struct was made long ago, around 2012, to support several
practical cases, including glibc.

A GCC PR was opened for this request at that time:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53548

However, this PR was closed as WONTFIX around 2015 due to the following reason:

"there is an existing extension that makes the requested functionality possible",
i.e., GCC has long fully supported zero-length arrays in unions and alone in
structs (though I didn't find any official documentation of this extension).

It was reasonable to close PR53548 at that time, since the zero-length array
extension could be used for this purpose.

However, since GCC 13, in order to improve C/C++ security, we have introduced
-fstrict-flex-arrays=n to gradually eliminate "fake flexible array"
usages from C/C++ source code. As a result, zero-length arrays will eventually
be replaced completely by C99 flexible array members.

Therefore, GCC needs to explicitly allow such extensions directly for C99
flexible arrays, since flexible array members in unions or alone in structs
are common code patterns in active use by the Linux kernel (and other projects).

For example, these do not error by default with GCC:

union one {
int a;
int b[0];
};

union two {
int a;
struct {
struct { } __empty;
int b[];
};
};

But these do:

union three {
int a;
int b[];
};

struct four {
int b[];
};

Clang has supported such extensions since March, 2024
https://github.com/llvm/llvm-project/pull/84428

GCC should also support such extensions. This will allow for
a seamless transition for code bases away from zero-length arrays without
losing existing code patterns. 

The patch set includes:

  1. Documentation change.
 Allow flexible array members in unions and alone in structures
 [PR53548]
  2. C and C++ FE changes to support flexible array members in unions and
alone in structures.
  3. Add testing cases for flexible array members in unions and alone in
structures.
  4. Adjust testcases for flexible array member in union and alone in
structure extension.



Re: [Patch, fortran] PR103471 - [11/12/13/14 Regression] ICE in gfc_typenode_for_spec, at fortran/trans-types.c:1114

2024-04-19 Thread Harald Anlauf

Hi Paul,

the patch is OK, but I had to manually fix it.  I wonder how you managed
to produce:

diff --git a/gcc/testsuite/gfortran.dg/pr93484.f90
b/gcc/testsuite/gfortran.dg/pr93484.f90
new file mode 100644
index 000..4dcad47e8da
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr103471.f90
@@ -0,0 +1,13 @@

A minor comment on the error message and the testcase.
Take for example:

subroutine sub
  implicit none
  real, external :: x
  real   :: y(10)
  integer :: kk
  print *, [real(x(k))]
!  print *, [real(y(k))]
end

The original testcase in the PR would - without implicit none -
resemble the function invocation x(k) here and emit the error:

Fatal Error: k at (1) has no default type
compilation terminated.

while commenting the first print and uncommenting the second
would emit the message

Error: Symbol 'k' at (1) has no IMPLICIT type; did you mean 'kk'?

Thus I have the impression that the testcase tests something different
on the one hand, and on the other I wonder if we would want to change
the error message and replace "no default type" with "no IMPLICIT type".
It still would not hit the fuzzy check, but that is something that
might not be important now.

Thanks,
Harald


On 4/19/24 18:52, Paul Richard Thomas wrote:

Hi All,

This is a more or less obvious patch. The action is in resolve.cc. The
chunk in symbol.cc is a tidy up of a diagnostic marker to distinguish where
the 'no IMPLICIT type' error was coming from and the chunk in trans-decl.cc
follows from discussion with Harald on the PR.

Regtests fine. OK for mainline and backporting in a couple of weeks?

Paul

Fortran: Detect 'no implicit type' error in right place [PR103471]

2024-04-19  Paul Thomas  

gcc/fortran
PR fortran/103471
* resolve.cc (gfc_resolve_index_1): Block index expressions of
unknown type from being converted to default integer, avoiding
the fatal error in trans-decl.cc.
* symbol.cc (gfc_set_default_type): Remove '(symbol)' from the
'no IMPLICIT type' error message.
* trans-decl.cc (gfc_get_symbol_decl): Change fatal error locus
to that of the symbol declaration.
(gfc_trans_deferred_vars): Remove two trailing tabs.

gcc/testsuite/
PR fortran/103471
* gfortran.dg/pr103471.f90: New test.





Re: Request for testing on non-Linux targets; remove special casing of /usr/lib and /lib from the driver

2024-04-19 Thread David Edelsohn
The patch does not cause failures on AIX.  Is it removing explicit
references to /lib and /usr/lib?

It seems more appropriate for GCC 15.

Thanks for alerting me to the patch to test on AIX.  AIX is in CFarm.

Thanks David

On Tue, Apr 16, 2024 at 7:49 PM Andrew Pinski (QUIC) <
quic_apin...@quicinc.com> wrote:

> Hi all,
>   The driver currently removes "/lib" and "/usr/lib" from the library
> path that gets passed to the linker because it considers them paths that
> the linker will already know to search. But this is not true for newer
> linkers; mold and lld, for example, don't have a default search path.
> This patch removes the special casing to fix FreeBSD building, where lld is
> used by default, and also to fix riscv-linux-gnu when used in combination
> with mold.
> I have tested it on x86_64-linux-gnu and it works there, but since the code
> in the driver has been around since 1992, I request some folks to test it
> on AIX, Mac OS (Darwin) and Solaris, where the ld is not GNU bfd ld, as I
> don't have access to those targets currently.
>
> Thanks,
> Andrew Pinski
>


[Patch, fortran] PR103471 - [11/12/13/14 Regression] ICE in gfc_typenode_for_spec, at fortran/trans-types.c:1114

2024-04-19 Thread Paul Richard Thomas
Hi All,

This is a more or less obvious patch. The action is in resolve.cc. The
chunk in symbol.cc is a tidy up of a diagnostic marker to distinguish where
the 'no IMPLICIT type' error was coming from and the chunk in trans-decl.cc
follows from discussion with Harald on the PR.

Regtests fine. OK for mainline and backporting in a couple of weeks?

Paul

Fortran: Detect 'no implicit type' error in right place [PR103471]

2024-04-19  Paul Thomas  

gcc/fortran
PR fortran/103471
* resolve.cc (gfc_resolve_index_1): Block index expressions of
unknown type from being converted to default integer, avoiding
the fatal error in trans-decl.cc.
* symbol.cc (gfc_set_default_type): Remove '(symbol)' from the
'no IMPLICIT type' error message.
* trans-decl.cc (gfc_get_symbol_decl): Change fatal error locus
to that of the symbol declaration.
(gfc_trans_deferred_vars): Remove two trailing tabs.

gcc/testsuite/
PR fortran/103471
* gfortran.dg/pr103471.f90: New test.
diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 6b3e5ba4fcb..9b7fabd3707 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -5001,7 +5001,8 @@ gfc_resolve_index_1 (gfc_expr *index, int check_scalar,

   if ((index->ts.kind != gfc_index_integer_kind
&& force_index_integer_kind)
-  || index->ts.type != BT_INTEGER)
+  || (index->ts.type != BT_INTEGER
+	  && index->ts.type != BT_UNKNOWN))
 {
   gfc_clear_ts (&ts);
   ts.type = BT_INTEGER;
diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
index 3a3b6de5cec..8f7deac1d1e 100644
--- a/gcc/fortran/symbol.cc
+++ b/gcc/fortran/symbol.cc
@@ -320,7 +320,7 @@ gfc_set_default_type (gfc_symbol *sym, int error_flag, gfc_namespace *ns)
 		   "; did you mean %qs?",
 		   sym->name, &sym->declared_at, guessed);
 	  else
-	gfc_error ("Symbol %qs at %L has no IMPLICIT type(symbol)",
+	gfc_error ("Symbol %qs at %L has no IMPLICIT type",
 		   sym->name, &sym->declared_at);
 	  sym->attr.untyped = 1; /* Ensure we only give an error once.  */
 	}
diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index e160c5c98c1..301439baaf5 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc
@@ -1797,7 +1797,8 @@ gfc_get_symbol_decl (gfc_symbol * sym)
 }

   if (sym->ts.type == BT_UNKNOWN)
-gfc_fatal_error ("%s at %C has no default type", sym->name);
+gfc_fatal_error ("%s at %L has no default type", sym->name,
+		 &sym->declared_at);

   if (sym->attr.intrinsic)
 gfc_internal_error ("intrinsic variable which isn't a procedure");
@@ -5214,8 +5215,8 @@ gfc_trans_deferred_vars (gfc_symbol * proc_sym, gfc_wrapped_block * block)
 	tree tmp = lookup_attribute ("omp allocate",
  DECL_ATTRIBUTES (n->sym->backend_decl));
 	tmp = TREE_VALUE (tmp);
-	TREE_PURPOSE (tmp) = se.expr;
-	TREE_VALUE (tmp) = align;
+	TREE_PURPOSE (tmp) = se.expr;
+	TREE_VALUE (tmp) = align;
 	TREE_PURPOSE (TREE_CHAIN (tmp)) = init_stmtlist;
 	TREE_VALUE (TREE_CHAIN (tmp)) = cleanup_stmtlist;
   }
diff --git a/gcc/testsuite/gfortran.dg/pr93484.f90 b/gcc/testsuite/gfortran.dg/pr93484.f90
new file mode 100644
index 000..4dcad47e8da
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr103471.f90
@@ -0,0 +1,13 @@
+! { dg-do compile }
+! Test the fix for PR103471 in which, rather than giving a "no IMPLICIT type"
+! message, gfortran took to ICEing. The fuzzy symbol check for 'kk' demonstrates
+! that the error is being detected in the right place.
+!
+! Contributed by Gerhard Steinmetz  
+!
+program p
+   implicit none
+   integer, parameter :: x(4) = [1,2,3,4]
+   integer :: kk
+   print *, [real(x(k))] ! { dg-error "has no IMPLICIT type; did you mean .kk.\\?" }
+end


[PATCH v2] [testsuite] [arm] add effective target and options for pacbti tests

2024-04-19 Thread Alexandre Oliva
Hello, Richard,

Thanks, your response was very informative.

Here's a revised patch.

arm pac and bti tests that use -march=armv8.1-m.main get an implicit
-mthumb, which is incompatible with vxworks kernel mode.  Declaring the
requirement for an 8.1-m.main-compatible toolchain is enough to avoid
those fails, because the toolchain feature test fails in kernel mode,
but taking the -march options from the standardized arch tests, after
testing for support for the corresponding effective target, makes it
generally safer, and enables us to drop skip directives and extraneous
option variants.

Tested all 6 modified testcases with an x86_64-linux-gnu-x-arm-eabi
uberbaum build.  Ok to install?


for  gcc/testsuite/ChangeLog

* gcc.target/arm/bti-1.c: Require arch, use its opts, drop skip.
* gcc.target/arm/bti-2.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-11.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-12.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.
* g++.target/arm/pac-1.C: Likewise.  Drop +mve.
---
 gcc/testsuite/g++.target/arm/pac-1.C   |5 +++--
 .../gcc.target/arm/acle/pacbti-m-predef-11.c   |4 ++--
 .../gcc.target/arm/acle/pacbti-m-predef-12.c   |5 +++--
 .../gcc.target/arm/acle/pacbti-m-predef-7.c|5 +++--
 gcc/testsuite/gcc.target/arm/bti-1.c   |5 +++--
 gcc/testsuite/gcc.target/arm/bti-2.c   |5 +++--
 6 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/gcc/testsuite/g++.target/arm/pac-1.C 
b/gcc/testsuite/g++.target/arm/pac-1.C
index f671a27b048c6..ac15ae18197ca 100644
--- a/gcc/testsuite/g++.target/arm/pac-1.C
+++ b/gcc/testsuite/g++.target/arm/pac-1.C
@@ -1,7 +1,8 @@
 /* Check that GCC does .save and .cfi_offset directives with RA_AUTH_CODE 
pseudo hard-register.  */
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
-/* { dg-options "-march=armv8.1-m.main+mve+pacbti -mbranch-protection=pac-ret 
-mthumb -mfloat-abi=hard -g -O0" } */
+/* { dg-require-effective-target arm_arch_v8_1m_main_pacbti_ok } */
+/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
+/* { dg-additional-options "-mbranch-protection=pac-ret -mfloat-abi=hard -g 
-O0" } */
 
 __attribute__((noinline)) void
 fn1 (int a, int b, int c)
diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c 
b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
index 6a5ae92c567f3..c9c40f44027d4 100644
--- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
+++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" "-mfloat-abi=*" } } */
-/* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
+/* { dg-require-effective-target arm_arch_v8_1m_main_pacbti_ok } */
+/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
 
 #if (__ARM_FEATURE_BTI != 1)
 #error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be defined to 1."
diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c 
b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
index db40b17c3b030..c26051347a2cc 100644
--- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
+++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
-/* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */
+/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
+/* { dg-add-options arm_arch_v8_1m_main } */
+/* { dg-additional-options "-mfloat-abi=softfp" } */
 
 #if defined (__ARM_FEATURE_BTI)
 #error "Feature test macro __ARM_FEATURE_BTI should not be defined."
diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c 
b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
index 1b25907635e24..92f500c1449b3 100644
--- a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
+++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" 
"-mcpu=*" } } */
-/* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
-mfloat-abi=hard" } */
+/* { dg-require-effective-target arm_arch_v8_1m_main_pacbti_ok } */
+/* { dg-add-options arm_arch_v8_1m_main_pacbti } */
+/* { dg-additional-options "--save-temps -mfloat-abi=hard" } */
 
 #if defined (__ARM_FEATURE_BTI_DEFAULT)
 #error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be undefined."
diff --git a/gcc/testsuite/gcc.target/arm/bti-1.c 
b/gcc/testsuite/gcc.target/arm/bti-1.c
index 79dd8010d2dab..a34bb0842b632 100644
--- a/gcc/testsuite/gcc.target/arm/bti-1.c
+++ b/gcc/testsuite/gcc.target/arm/bti-1.c
@@ -1,7 +1,8 @@
 /* Check that GCC does bti instruction.  */
 /* { dg-do compile } */
-/* { 

[PATCH v3 2/2] c++: Fix instantiation of imported temploid friends [PR114275]

2024-04-19 Thread Nathaniel Shead
On Fri, Apr 19, 2024 at 12:14:06PM +1000, Nathaniel Shead wrote:
> On Wed, Apr 17, 2024 at 02:02:21PM -0400, Patrick Palka wrote:
> > On Mon, 15 Apr 2024, Nathaniel Shead wrote:
> > 
> > > I'm not a huge fan of always streaming 'imported_temploid_friends' for
> > > all decls, but I don't think it adds much performance cost over adding a
> > > new flag to categorise decls that might be marked as such.
> > 
> > IIUC this value is going to be almost always null which is encoded as a
> > single 0 byte, which should be fast to stream.  But I wonder how much
> > larger  gets?  Can we get away with streaming this value
> > only for TEMPLATE_DECLs?
> 
> Yes, it should either just be a 0 byte or an additional backref
> somewhere, which will likely also be small. On my system it increases
> the size by 0.26%, from 31186800 bytes to 31268672.
> 
> But I've just found that this patch has a bug anyway, in that it doesn't
> correctly dedup if the friend types are instantiated in two separate
> modules that are then both imported.  I'll see what I need to do to fix
> this which may influence what we need to stream here.
> 

Here's an updated version of the patch that fixes this. Also changed to
only stream when 'inner' is either TYPE_DECL or FUNCTION_DECL, which
cuts the size of  down a bit to 31246992 (0.19% growth).

Another alternative would be to add another boolean flag at the top of
'decl_value' and branch on that; that would make use of the bitpacking
logic and probably cut down on the size further.  (I haven't measured
this yet though.)

Bootstrapped and regtested (so far just dg.exp and modules.exp) on
x86_64-pc-linux-gnu, OK for trunk if full regtest succeeds?

-- >8 --

This patch fixes a number of issues with the handling of temploid friend
declarations.

The primary issue is that instantiations of friend declarations should
attach the declaration to the same module as the befriending class, by
[module.unit] p7.1 and [temp.friend] p2; this could be a different
module from the current TU, and so needs special handling.

The other main issue here is that we can't assume that a hidden template
class doesn't exist just because name lookup didn't find a definition for
it: it could be a non-exported entity that we've nevertheless streamed in
from an imported module.  We need to ensure
that when instantiating friend classes that we return the same TYPE_DECL
that we got from our imports, otherwise we will get later issues with
'duplicate_decls' (rightfully) complaining that they're different.

This doesn't appear necessary for functions due to the existing name
lookup handling already finding these hidden declarations.
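
For anyone unfamiliar with the term, a minimal single-file sketch of friends declared inside a class template (a friend function and a friend class template) might look like the following.  It deliberately uses no modules, so it cannot show the attachment problem itself; Box, peek and Inspector are names made up for the example.

```cpp
template <typename T>
class Box {
  T value_;

public:
  explicit Box(T v) : value_(v) {}

  // Friend function defined inside a class template: each instantiation
  // Box<T> brings its own peek (const Box<T>&), found via ADL.
  friend T peek(const Box& b) { return b.value_; }

  // Friend class template first declared here; it stays hidden from
  // ordinary name lookup until it is declared at namespace scope.
  template <typename U> friend struct Inspector;
};

// The later namespace-scope definition must refer to the same entity that
// the friend declaration introduced.
template <typename U>
struct Inspector {
  static U read(const Box<U>& b) { return b.value_; }
};
```

In a modules build the interesting case is when Box lives in one module and the instantiation happens in another TU: per the description above, the instantiated friend should be attached to Box's module rather than to the instantiating one.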

PR c++/105320
PR c++/114275

gcc/cp/ChangeLog:

* cp-tree.h (propagate_defining_module): Declare.
(lookup_imported_hidden_friend): Declare.
* decl.cc (duplicate_decls): Also check if hidden declarations
can be redeclared in this module.
* module.cc (imported_temploid_friends): New map.
(init_modules): Initialize it.
(trees_out::decl_value): Write it; don't consider imported
temploid friends as attached to this module.
(trees_in::decl_value): Read it.
(depset::hash::add_specializations): Don't treat instantiations
of a friend type as a specialisation.
(get_originating_module_decl): Follow the owning decl for an
imported temploid friend.
(propagate_defining_module): New function.
* name-lookup.cc (lookup_imported_hidden_friend): New function.
* pt.cc (tsubst_friend_function): Propagate defining module for
new friend functions.
(tsubst_friend_class): Lookup imported hidden friends. Check
for valid redeclaration. Propagate defining module for new
friend classes.

gcc/testsuite/ChangeLog:

* g++.dg/modules/tpl-friend-10_a.C: New test.
* g++.dg/modules/tpl-friend-10_b.C: New test.
* g++.dg/modules/tpl-friend-10_c.C: New test.
* g++.dg/modules/tpl-friend-11_a.C: New test.
* g++.dg/modules/tpl-friend-11_b.C: New test.
* g++.dg/modules/tpl-friend-12_a.C: New test.
* g++.dg/modules/tpl-friend-12_b.C: New test.
* g++.dg/modules/tpl-friend-12_c.C: New test.
* g++.dg/modules/tpl-friend-12_d.C: New test.
* g++.dg/modules/tpl-friend-12_e.C: New test.
* g++.dg/modules/tpl-friend-12_f.C: New test.
* g++.dg/modules/tpl-friend-13_a.C: New test.
* g++.dg/modules/tpl-friend-13_b.C: New test.
* g++.dg/modules/tpl-friend-13_c.C: New test.
* g++.dg/modules/tpl-friend-13_d.C: New test.
* g++.dg/modules/tpl-friend-13_e.C: New test.
* g++.dg/modules/tpl-friend-13_f.C: New test.
* g++.dg/modules/tpl-friend-13_g.C: New test.
* g++.dg/modules/tpl-friend-14_a.C: New test.
* g++.dg/modules/tpl-friend-14_b.C: New test.
* g++.dg/modules/tpl-friend-14_c.C: New test.
* 

Re: [PATCH v2 1/2] c++: Standardise errors for module_may_redeclare

2024-04-19 Thread Nathaniel Shead
On Mon, Apr 15, 2024 at 02:49:35PM +1000, Nathaniel Shead wrote:
> I took another look at this patch and have split it into two, one (this
> one) to standardise the error messages used and prepare
> 'module_may_redeclare' for use with temploid friends, and another
> followup patch to actually handle them correctly.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> 
> -- >8 --
> 
> Currently different places calling 'module_may_redeclare' all emit very
> similar but slightly different error messages, and handle different
> kinds of declarations differently.  This patch makes the function
> perform its own error messages so that they're all in one place, and
> prepares it for use with temploid friends (PR c++/114275).
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (module_may_redeclare): Add default parameter.
>   * decl.cc (duplicate_decls): Don't emit errors for failed
>   module_may_redeclare.
>   (xref_tag): Likewise.
>   (start_enum): Likewise.
>   * semantics.cc (begin_class_definition): Likewise.
>   * module.cc (module_may_redeclare): Clean up logic. Emit error
>   messages on failure.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/modules/enum-12.C: Update error message.
>   * g++.dg/modules/friend-5_b.C: Likewise.
>   * g++.dg/modules/shadow-1_b.C: Likewise.
> 
> Signed-off-by: Nathaniel Shead 
> ---
>  gcc/cp/cp-tree.h  |   2 +-
>  gcc/cp/decl.cc|  28 +
>  gcc/cp/module.cc  | 120 ++
>  gcc/cp/semantics.cc   |   6 +-
>  gcc/testsuite/g++.dg/modules/enum-12.C|   2 +-
>  gcc/testsuite/g++.dg/modules/friend-5_b.C |   2 +-
>  gcc/testsuite/g++.dg/modules/shadow-1_b.C |   5 +-
>  7 files changed, 89 insertions(+), 76 deletions(-)
> 
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index 1dbb577a38d..faa7a0052a5 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -7401,7 +7401,7 @@ inline bool module_exporting_p ()
>  
>  extern module_state *get_module (tree name, module_state *parent = NULL,
>bool partition = false);
> -extern bool module_may_redeclare (tree decl);
> +extern bool module_may_redeclare (tree olddecl, tree newdecl = NULL);
>  
>  extern bool module_global_init_needed ();
>  extern bool module_determine_import_inits ();
> diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
> index 65ab64885ff..aa66da4829d 100644
> --- a/gcc/cp/decl.cc
> +++ b/gcc/cp/decl.cc
> @@ -2279,18 +2279,8 @@ duplicate_decls (tree newdecl, tree olddecl, bool 
> hiding, bool was_hidden)
>&& TREE_CODE (olddecl) != NAMESPACE_DECL
>&& !hiding)
>  {
> -  if (!module_may_redeclare (olddecl))
> - {
> -   if (DECL_ARTIFICIAL (olddecl))
> - error ("declaration %qD conflicts with builtin", newdecl);
> -   else
> - {
> -   error ("declaration %qD conflicts with import", newdecl);
> -   inform (olddecl_loc, "import declared %q#D here", olddecl);
> - }
> -
> -   return error_mark_node;
> - }
> +  if (!module_may_redeclare (olddecl, newdecl))
> + return error_mark_node;
>  
>tree not_tmpl = STRIP_TEMPLATE (olddecl);
>if (DECL_LANG_SPECIFIC (not_tmpl)
> @@ -16620,12 +16610,7 @@ xref_tag (enum tag_types tag_code, tree name,
>   {
> tree decl = TYPE_NAME (t);
> if (!module_may_redeclare (decl))
> - {
> -   auto_diagnostic_group d;
> -   error ("cannot declare %qD in a different module", decl);
> -   inform (DECL_SOURCE_LOCATION (decl), "previously declared here");
> -   return error_mark_node;
> - }
> + return error_mark_node;
>  
> tree not_tmpl = STRIP_TEMPLATE (decl);
> if (DECL_LANG_SPECIFIC (not_tmpl)
> @@ -16973,12 +16958,7 @@ start_enum (tree name, tree enumtype, tree 
> underlying_type,
>   {
> tree decl = TYPE_NAME (enumtype);
> if (!module_may_redeclare (decl))
> - {
> -   auto_diagnostic_group d;
> -   error ("cannot declare %qD in different module", decl);
> -   inform (DECL_SOURCE_LOCATION (decl), "previously declared here");
> -   enumtype = error_mark_node;
> - }
> + enumtype = error_mark_node;
> else
>   set_instantiating_module (decl);
>   }
> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> index 001430a4a8f..e2d2910ae48 100644
> --- a/gcc/cp/module.cc
> +++ b/gcc/cp/module.cc
> @@ -18992,11 +18992,15 @@ get_importing_module (tree decl, bool flexible)
>return module->mod;
>  }
>  
> -/* Is it permissible to redeclare DECL.  */
> +/* Is it permissible to redeclare OLDDECL with NEWDECL.
> +
> +   If NEWDECL is NULL, assumes that OLDDECL will be redeclared using
> +   the current scope's module and attachment.  */
>  
>  bool
> -module_may_redeclare (tree decl)
> +module_may_redeclare (tree olddecl, 

[committed] internal-fn: Fix up expand_arith_overflow [PR114753]

2024-04-19 Thread Jakub Jelinek
Hi!

During backporting I've noticed I've missed one return spot for the
restoration of the original flag_trapv flag value.
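
The class of bug is easy to reproduce in isolation.  The sketch below uses a made-up global (g_flag) in place of flag_trapv and shows how an early return skips a manual restore.  The RAII guard is an alternative added purely for illustration; the committed fix simply adds the one missing explicit restore.

```cpp
// Hypothetical stand-in for a global compiler flag such as flag_trapv.
int g_flag = 1;

// Manual save/restore: the early return forgets to restore the flag.
void expand_manual_buggy(bool early) {
  int save = g_flag;
  g_flag = 0;
  if (early)
    return;  // bug: g_flag stays 0 on this path
  g_flag = save;
}

// RAII guard (illustrative alternative): restores on every exit path.
class FlagRestorer {
  int& flag_;
  int saved_;

public:
  FlagRestorer(int& flag, int temporary) : flag_(flag), saved_(flag) {
    flag_ = temporary;
  }
  ~FlagRestorer() { flag_ = saved_; }
  FlagRestorer(const FlagRestorer&) = delete;
  FlagRestorer& operator=(const FlagRestorer&) = delete;
};

void expand_with_guard(bool early) {
  FlagRestorer guard(g_flag, 0);
  if (early)
    return;  // the guard's destructor restores g_flag here too
}
```

Calling expand_manual_buggy (true) leaves g_flag at 0, while expand_with_guard (true) leaves it at its original value.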

Tested on x86_64-linux, committed to trunk as obvious.  Sorry.

2024-04-19  Jakub Jelinek  

PR middle-end/114753
* internal-fn.cc (expand_arith_overflow): Add one missing restore
of flag_trapv before return.

--- gcc/internal-fn.cc.jj   2024-04-18 09:45:08.079695875 +0200
+++ gcc/internal-fn.cc  2024-04-19 18:11:51.202204402 +0200
@@ -2793,6 +2793,7 @@ expand_arith_overflow (enum tree_code co
case PLUS_EXPR:
  expand_addsub_overflow (loc, code, lhs, arg0, arg1, unsr_p,
  unsr_p, unsr_p, false, NULL);
+ flag_trapv = save_flag_trapv;
  return;
case MULT_EXPR:
  expand_mul_overflow (loc, lhs, arg0, arg1, unsr_p,

Jakub



Re: [PATCH v1] RISC-V: Revert RVV wv instructions overlap and xfail tests

2024-04-19 Thread Robin Dapp
Hi Pan,

> The RVV register overlap check requires both the dest and src operands.
> Thus the register filter in the constraint cannot cover the full semantics
> of the vector register overlap.

I'm not sure I'm following.  Did we miss something that should have been
covered?  Like only an overlap on the srcs but not the dest?
Are there testcases that fail?  If so we should definitely have one.

If something is broken then indeed we should revert it.

But...

> Thus, revert this list of overlap patches and xfail the related test
> cases.  This patch reverts *b3b2799b872*, and the full
> picture of the related series is listed below.

... why not just revert everything and xfail all the tests in a
follow up?  Your patch is essentially a revert but doesn't look like
it.  I'd rather we let a revert be a revert and adjust the tests
separately so it becomes clear. 

Regards
 Robin



[PATCH v1] RISC-V: Revert RVV wv instructions overlap and xfail tests

2024-04-19 Thread pan2 . li
From: Pan Li 

The RVV register overlap check requires both the dest and src operands.
Thus the register filter in the constraint cannot cover the full semantics
of the vector register overlap.
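
To see why both operands are needed, here is a small sketch of the widening overlap rule quoted from the RVV ISA later in this patch.  The struct and function names are invented for the example, and reading "highest-numbered part" as "the source group ends on the same register as the destination group" is my interpretation, so treat it as an assumption rather than the backend's logic.

```cpp
// A register group: `count` consecutive vector registers starting at `base`.
struct RegGroup {
  int base;
  int count;
};

bool groups_overlap(RegGroup a, RegGroup b) {
  return a.base < b.base + b.count && b.base < a.base + a.count;
}

// Widening case (destination EEW > source EEW): an overlap is accepted only
// if the source EMUL is at least 1 and the source occupies the
// highest-numbered registers of the destination group.
bool widening_overlap_ok(RegGroup dest, RegGroup src,
                         bool src_emul_at_least_1) {
  if (!groups_overlap(dest, src))
    return true;  // no overlap, nothing to check
  if (!src_emul_at_least_1)
    return false;
  return src.base >= dest.base
         && src.base + src.count == dest.base + dest.count;
}
```

For the ISA's own example (LMUL=8, vzext.vf4 v0, v6) the destination is v0-v7 and the source is v6-v7, which passes; a source of v0, v2 or v4 fails.  The answer depends on the register numbers of both operands at once, which a per-operand register filter cannot express.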

Thus, revert this list of overlap patches and xfail the related test
cases.  This patch reverts *b3b2799b872*, and the full
picture of the related series is listed below.

[P] b3b2799b872 RISC-V: Support one more overlap for wv instructions
[N] 7e854b58084 RISC-V: Support highest overlap for wv instructions
[N] 018ba3ac952 RISC-V: Fix overlap group incorrect overlap on v0
[N] 27fde325d64 RISC-V: Support highest-number regno overlap for widen ternary
[N] a23415d7572 RISC-V: Support highpart register overlap for widen vx/vf 
instructions
[N] 4418d55bcd1 RISC-V: Support highpart overlap for indexed load with SRC EEW 
< DEST EEW
[N] 303195e2a6b RISC-V: Support widening register overlap for vf4/vf8
[N] 8614cbb2534 RISC-V: Support highpart overlap for floating-point widen 
instructions
[N] e65aaf8efe1 RISC-V: Rename vconstraint into group_overlap
[N] 62685890d88 RISC-V: Support highpart overlap for vext.vf
[N] bdad036da32 RISC-V: Support highpart register overlap for vwcvt
[N] 1a0af6e5a99 RISC-V: Allow dest operand and accumulator operand overlap of 
widen reduction instruction[PR112327]

Indicator:
[D]: Done, aka this patch has been reverted already.
[P]: Patched, aka the revert patch is sent but not merged.
[N]: None, aka not started yet.

The test suites below pass with this patch.
* The riscv rv64gcv full regression test.
* The riscv rv64gc full regression test.

gcc/ChangeLog:

* config/riscv/riscv.md (none,W21,W42,W84,W43,W86,W87,W0): Remove W0.
(none,W21,W42,W84,W43,W86,W87): Ditto.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112431-42.c: Xfail vmv1r asm check.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.md | 14 +---
 gcc/config/riscv/vector.md| 84 +--
 .../gcc.target/riscv/rvv/base/pr112431-42.c   |  2 +-
 3 files changed, 47 insertions(+), 53 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c2b4323c53a..f0928398698 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -541,7 +541,7 @@ (define_attr "fp_vector_disabled" "no,yes"
 ;; Widening instructions have group-overlap constraints.  Those are only
 ;; valid for certain register-group sizes.  This attribute marks the
 ;; alternatives not matching the required register-group size as disabled.
-(define_attr "group_overlap" "none,W21,W42,W84,W43,W86,W87,W0"
+(define_attr "group_overlap" "none,W21,W42,W84,W43,W86,W87"
   (const_string "none"))
 
 (define_attr "group_overlap_valid" "no,yes"
@@ -562,9 +562,9 @@ (define_attr "group_overlap_valid" "no,yes"
 
  ;; According to RVV ISA:
  ;; The destination EEW is greater than the source EEW, the source 
EMUL is at least 1,
- ;; and the overlap is in the highest-numbered part of the destination 
register group
- ;; (e.g., when LMUL=8, vzext.vf4 v0, v6 is legal, but a source of v0, 
v2, or v4 is not).
- ;; So the source operand should have LMUL >= 1.
+;; and the overlap is in the highest-numbered part of the destination 
register group
+;; (e.g., when LMUL=8, vzext.vf4 v0, v6 is legal, but a source of v0, 
v2, or v4 is not).
+;; So the source operand should have LMUL >= 1.
  (and (eq_attr "group_overlap" "W43")
  (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) 
!= 4
   && riscv_get_v_regno_alignment (GET_MODE 
(operands[3])) >= 1"))
@@ -574,12 +574,6 @@ (define_attr "group_overlap_valid" "no,yes"
  (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) 
!= 8
   && riscv_get_v_regno_alignment (GET_MODE 
(operands[3])) >= 1"))
 (const_string "no")
-
- ;; W21 supports highest-number overlap for source LMUL = 1.
- ;; For 'wv' variant, we can also allow wide source operand overlaps 
dest operand.
- (and (eq_attr "group_overlap" "W0")
- (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) 
> 1"))
-(const_string "no")
 ]
(const_string "yes")))
 
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 8b1c24c5d79..8298a72b771 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -3842,48 +3842,48 @@ (define_insn 
"@pred_dual_widen__scal
(set_attr "group_overlap" 
"W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")])
 
 (define_insn "@pred_single_widen_sub"
-  [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, 
vd, vr, vd, vr, vd, vr, vd, vr, vd, vr,  ,  , ?, ?")
+  [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, vd, 
vr, vd, vr, vd, vr, vd, vr, vd, vr, ?, ?")

Re: [PATCH 1/2] LoongArch: Define ISA versions

2024-04-19 Thread Xi Ruoyao
On Fri, 2024-04-19 at 19:04 +0800, Yang Yujie wrote:
>  @table @samp
>  @item native
> -This selects the CPU to generate code for at compilation time by determining
> -the processor type of the compiling machine.  Using @option{-march=native}
> -enables all instruction subsets supported by the local machine (hence
> -the result might not run on different machines).  Using 
> @option{-mtune=native}
> -produces code optimized for the local machine under the constraints
> -of the selected instruction set.
> +Local processor type detected by the native compiler.
>  @item loongarch64
> -A generic CPU with 64-bit extensions.
> +Generic LoongArch 64-bit processor.
>  @item la464
> -LoongArch LA464 CPU with LBT, LSX, LASX, LVZ.
> +LoongArch LA464-based processor with LSX, LASX.
> +@item la664
> +LoongArch LA664-based processor with LSX, LASX and all LoongArch v1.1 
> features.

One LoongArch v1.1 feature "Hardware Page Table Walker" is not
implemented by LA664.  Maybe "all LoongArch v1.1 **unprivileged**
features"?

> +@item la64v1.0
> +LoongArch64 ISA version 1.0.
> +@item la64v1.1
> +LoongArch64 ISA version 1.1.

IMO it's better to use a wording like LA664, i.e. "a CPU implementing
all LoongArch v1.1 unprivileged features" (emphasising "all", as the
v1.1 manual allows implementing only a subset of v1.1 features).

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH]middle-end: refactory vect_recog_absolute_difference to simplify flow [PR114769]

2024-04-19 Thread Richard Biener
On Fri, Apr 19, 2024 at 3:29 PM Tamar Christina  wrote:
>
> Hi All,
>
> As the reporter in PR114769 points out, the control flow for the abd detection
> is hard to follow.  This is because vect_recog_absolute_difference has two
> different ways it can return true.
>
> 1. It can return true when the widening operation is matched, in which case
>    unprom is set, half_type is not NULL and diff_stmt is not set.
>
> 2. It can return true when the widening operation is not matched, but the stmt
>    being checked is a minus.  In this case unprom is not set, half_type is set
>    to NULL and diff_stmt is set.  This is because to get to diff_stmt you have
>    to dig through the abs statement and any possible promotions.
>
> This however leads to complicated uses of the function at the call sites, as
> the exact semantics need to be known to use it safely.
>
> vect_recog_absolute_difference has two callers:
>
> 1. vect_recog_sad_pattern, where if you return true with unprom not set, then
>    *half_type will be NULL.  The call to vect_supportable_direct_optab_p will
>    always reject it since there's no vector mode for NULL.  Note that when
>    looking at the dump files, the convention has always been that we first
>    indicate that a pattern could possibly be recognized and then check that
>    it's supported.
>
>    This change somewhat incorrectly makes the diagnostic message get printed
>    for "invalid" patterns.
>
> 2. vect_recog_abd_pattern, where if half_type is NULL, it then uses diff_stmt
>    to set them.
>
> So while the note in the dump file is misleading, the code is safe.
>
> This refactors the code: it now has only one success condition, and diff_stmt
> is always set to the minus statement in the abs if there is one.
>
> The function now only returns success if the widening minus is found, in which
> case unprom and half_type are set.
>
> This then leaves it up to the caller to decide if they want to do anything
> with diff_stmt.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?

OK.

Thanks,
Richard.

> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> PR tree-optimization/114769
> * tree-vect-patterns.cc:
> (vect_recog_absolute_difference): Have only one success condition.
> (vect_recog_abd_pattern): Handle further checks if
> vect_recog_absolute_difference fails.
>
> ---
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 
> 4f491c6b8336f8710c3519dec1fa7e0f49387d2b..87c2acff386d91d22a3b2d6e6443d1f2f2326ea6
>  100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -797,8 +797,7 @@ vect_split_statement (vec_info *vinfo, stmt_vec_info 
> stmt2_info, tree new_rhs,
> HALF_TYPE and UNPROM will be set should the statement be found to
> be a widened operation.
> DIFF_STMT will be set to the MINUS_EXPR
> -   statement that precedes the ABS_STMT unless vect_widened_op_tree
> -   succeeds.
> +   statement that precedes the ABS_STMT if it is a MINUS_EXPR..
>   */
>  static bool
>  vect_recog_absolute_difference (vec_info *vinfo, gassign *abs_stmt,
> @@ -843,6 +842,12 @@ vect_recog_absolute_difference (vec_info *vinfo, gassign 
> *abs_stmt,
>if (!diff_stmt_vinfo)
>  return false;
>
> +  gassign *diff = dyn_cast <gassign *> (STMT_VINFO_STMT (diff_stmt_vinfo));
> +  if (diff_stmt && diff
> +  && gimple_assign_rhs_code (diff) == MINUS_EXPR
> +  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd)))
> +*diff_stmt = diff;
> +
>/* FORNOW.  Can continue analyzing the def-use chain when this stmt in a 
> phi
>   inside the loop (in case we are analyzing an outer-loop).  */
>if (vect_widened_op_tree (vinfo, diff_stmt_vinfo,
> @@ -850,17 +855,6 @@ vect_recog_absolute_difference (vec_info *vinfo, gassign 
> *abs_stmt,
> false, 2, unprom, half_type))
>  return true;
>
> -  /* Failed to find a widen operation so we check for a regular MINUS_EXPR.  
> */
> -  gassign *diff = dyn_cast <gassign *> (STMT_VINFO_STMT (diff_stmt_vinfo));
> -  if (diff_stmt && diff
> -  && gimple_assign_rhs_code (diff) == MINUS_EXPR
> -  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd)))
> -{
> -  *diff_stmt = diff;
> -  *half_type = NULL_TREE;
> -  return true;
> -}
> -
>return false;
>  }
>
> @@ -1499,27 +1493,22 @@ vect_recog_abd_pattern (vec_info *vinfo,
>tree out_type = TREE_TYPE (gimple_assign_lhs (last_stmt));
>
>vect_unpromoted_value unprom[2];
> -  gassign *diff_stmt;
> -  tree half_type;
> -  if (!vect_recog_absolute_difference (vinfo, last_stmt, &half_type,
> +  gassign *diff_stmt = NULL;
> +  tree abd_in_type;
> +  if (!vect_recog_absolute_difference (vinfo, last_stmt, _in_type,
>unprom, _stmt))
> -return NULL;
> -
> -  tree abd_in_type, abd_out_type;
> -
> -  if (half_type)
> -{
> -  abd_in_type = half_type;
> -  abd_out_type = abd_in_type;
> -

[PATCH]middle-end: refactory vect_recog_absolute_difference to simplify flow [PR114769]

2024-04-19 Thread Tamar Christina
Hi All,

As the reporter in PR114769 points out the control flow for the abd detection
is hard to follow.  This is because vect_recog_absolute_difference has two
different ways it can return true.

1. It can return true when the widening operation is matched, in which case
   unprom is set, half_type is not NULL and diff_stmt is not set.

2. It can return true when the widening operation is not matched, but the stmt
   being checked is a minus.  In this case unprom is not set, half_type is set
   to NULL and diff_stmt is set.  This is because to get to diff_stmt you have to
   dig through the abs statement and any possible promotions.

This however leads to complicated uses of the function at the call sites as the
exact semantic needs to be known to use it safely.

vect_recog_absolute_difference has two callers:

1. vect_recog_sad_pattern where if you return true with unprom not set, then
   *half_type will be NULL.  The call to vect_supportable_direct_optab_p will
   always reject it since there's no vector mode for NULL.  Note that if looking
   at the dump files, the convention in the dump files has always been that we
   first indicate that a pattern could possibly be recognized and then check that
   it's supported.

   This change somewhat incorrectly makes the diagnostic message get printed for
   "invalid" patterns.

2. vect_recog_abd_pattern, where if half_type is NULL, it then uses diff_stmt to
   set them.

So while the note in the dump file is misleading, the code is safe.

This refactors the code, it now only has 1 success condition, and diff_stmt is
always set to the minus statement in the abs if there is one.

The function now only returns success if the widening minus is found, in which
case unprom and half_type are set.

This then leaves it up to the caller to decide if they want to do anything with
diff_stmt.
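
To make the new contract concrete, here is a minimal standalone sketch, not GCC code.  All names in it (op_kind, struct stmt, recog_abs_diff, abd_input_bits) are hypothetical stand-ins for the vectorizer's types and functions.  It models a recognizer with a single success condition that also reports the minus statement through an out-parameter, and a caller in the style of the abd pattern that falls back to that statement when recognition fails.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the vectorizer's data; not GCC code.  */
enum op_kind { OP_WIDENING_MINUS, OP_PLAIN_MINUS, OP_OTHER };

struct stmt
{
  enum op_kind kind;
  int narrow_bits;	/* Only meaningful for OP_WIDENING_MINUS.  */
};

/* Return true only when the widening form is matched, in which case
   *HALF_BITS is set.  Independently of the result, *DIFF is set to STMT
   when STMT is a minus of either kind; otherwise *DIFF is untouched.  */
static bool
recog_abs_diff (const struct stmt *stmt, int *half_bits,
		const struct stmt **diff)
{
  if (diff && stmt->kind != OP_OTHER)
    *diff = stmt;

  if (stmt->kind == OP_WIDENING_MINUS)
    {
      *half_bits = stmt->narrow_bits;
      return true;
    }

  return false;
}

/* Caller in the style of the abd pattern: on failure, try again with the
   plain minus if one was reported, else give up (-1).  */
static int
abd_input_bits (const struct stmt *stmt, int out_bits)
{
  const struct stmt *diff = NULL;	/* Must start out NULL.  */
  int half_bits = 0;

  if (!recog_abs_diff (stmt, &half_bits, &diff))
    {
      /* We cannot try further without having a non-widening minus.  */
      if (!diff)
	return -1;
      return out_bits;
    }

  return half_bits;
}
```

The point of the sketch is that the caller has to initialise the out-parameter to NULL itself, since the recognizer leaves it untouched when no minus is found.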

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/114769
* tree-vect-patterns.cc:
(vect_recog_absolute_difference): Have only one success condition.
(vect_recog_abd_pattern): Handle further checks if
vect_recog_absolute_difference fails.

---
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 
4f491c6b8336f8710c3519dec1fa7e0f49387d2b..87c2acff386d91d22a3b2d6e6443d1f2f2326ea6
 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -797,8 +797,7 @@ vect_split_statement (vec_info *vinfo, stmt_vec_info 
stmt2_info, tree new_rhs,
HALF_TYPE and UNPROM will be set should the statement be found to
be a widened operation.
DIFF_STMT will be set to the MINUS_EXPR
-   statement that precedes the ABS_STMT unless vect_widened_op_tree
-   succeeds.
+   statement that precedes the ABS_STMT if it is a MINUS_EXPR..
  */
 static bool
 vect_recog_absolute_difference (vec_info *vinfo, gassign *abs_stmt,
@@ -843,6 +842,12 @@ vect_recog_absolute_difference (vec_info *vinfo, gassign 
*abs_stmt,
   if (!diff_stmt_vinfo)
 return false;
 
+  gassign *diff = dyn_cast <gassign *> (STMT_VINFO_STMT (diff_stmt_vinfo));
+  if (diff_stmt && diff
+  && gimple_assign_rhs_code (diff) == MINUS_EXPR
+  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd)))
+*diff_stmt = diff;
+
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
  inside the loop (in case we are analyzing an outer-loop).  */
   if (vect_widened_op_tree (vinfo, diff_stmt_vinfo,
@@ -850,17 +855,6 @@ vect_recog_absolute_difference (vec_info *vinfo, gassign 
*abs_stmt,
false, 2, unprom, half_type))
 return true;
 
-  /* Failed to find a widen operation so we check for a regular MINUS_EXPR.  */
-  gassign *diff = dyn_cast <gassign *> (STMT_VINFO_STMT (diff_stmt_vinfo));
-  if (diff_stmt && diff
-  && gimple_assign_rhs_code (diff) == MINUS_EXPR
-  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd)))
-{
-  *diff_stmt = diff;
-  *half_type = NULL_TREE;
-  return true;
-}
-
   return false;
 }
 
@@ -1499,27 +1493,22 @@ vect_recog_abd_pattern (vec_info *vinfo,
   tree out_type = TREE_TYPE (gimple_assign_lhs (last_stmt));
 
   vect_unpromoted_value unprom[2];
-  gassign *diff_stmt;
-  tree half_type;
-  if (!vect_recog_absolute_difference (vinfo, last_stmt, &half_type,
+  gassign *diff_stmt = NULL;
+  tree abd_in_type;
+  if (!vect_recog_absolute_difference (vinfo, last_stmt, &abd_in_type,
   unprom, &diff_stmt))
-return NULL;
-
-  tree abd_in_type, abd_out_type;
-
-  if (half_type)
-{
-  abd_in_type = half_type;
-  abd_out_type = abd_in_type;
-}
-  else
 {
+  /* We cannot try further without having a non-widening MINUS.  */
+  if (!diff_stmt)
+   return NULL;
+
   unprom[0].op = gimple_assign_rhs1 (diff_stmt);
   unprom[1].op = gimple_assign_rhs2 (diff_stmt);
   abd_in_type = signed_type_for (out_type);
-  abd_out_type = abd_in_type;
 }
 
+ 

Re: [PATCH] [testsuite] [arm] require arm_v8_1m_main for pacbti tests

2024-04-19 Thread Richard Earnshaw (lists)
On 19/04/2024 13:45, Alexandre Oliva wrote:
> On Apr 16, 2024, "Richard Earnshaw (lists)"  wrote:
> 
>> The require-effective-target flags test whether a specific set of
>> flags will make the compilation work, so they need to be used in
>> conjunction with the corresponding dg-add-options flags that then
>> apply those options.
> 
> *nod*, that's the theory.  Problem is the architectures supported by
> [add_options_for_]arm_arch_*[_ok] do not match exactly those expected by
> the tests, and I can't quite tell whether the subtle changes they would
> introduce would change what they intend to test, or even whether the
> differences are irrelevant, or would be sensible to add as variants to
> the dg machinery.  I think it would take someone more familiar than I am
> with all of the ARM variants to do this correctly.  I don't even know
> how these changes would need to be tested to be sure they remain
> correct.

It's ok to add additional variations to the table of variants in 
target-supports.exp, but we should avoid writing new specific run-time 
functions unless we really want an executable test.

I started doing some cleanup of the Arm tests infrastructure during stage 3, 
but stopped during stage 4 as I wanted to minimise the changes being made now.  
I plan to go back and work on it some more once stage 1 re-opens.

> 
> Would you be willing to take it from here, or would you accept the patch
> as an incremental yet imperfect improvement, or would you prefer to
> guide me in making it correct, and in verifying it (there are questions
> below)?  I don't have a lot of cycles to put into this (we've already
> worked around the testsuite bugs we ran into), but it would be desirable
> to get a fix into GCC as well, if we can converge on one without
> unreasonably burdening anyone.
> 
> 
>   v8_1m_main "-march=armv8.1-m.main+fp -mthumb" __ARM_ARCH_8M_MAIN__
>   v8_1m_main_pacbti "-march=armv8.1-m.main+pacbti+fp -mthumb"
>   "__ARM_ARCH_8M_MAIN__ && __ARM_FEATURE_BTI && 
> __ARM_FEATURE_PAUTH
> 
> Why do these have +fp in -march but not in the v8_1m* arch name?

It's ... complicated :)

The +fp is there because, with the move to having -mfpu=auto as the default, we 
need to avoid problems when the compiler has been configured with 
--with-float=hard, which requires the extension register set (fp or vector 
support) even if the test code itself doesn't care.  The best way to handle 
this in most cases is to give the architecture strings a default FPU 
specification (ie +fp). 

> 
> 
> gcc/testsuite/g++.target/arm/pac-1.C:
> /* { dg-options "-march=armv8.1-m.main+mve+pacbti -mbranch-protection=pac-ret 
> -mthumb -mfloat-abi=hard -g -O0" } */
> 
> v8_1m_main_pacbti plus +mve minus +fp.
> Do we need a dg arch for that?

I'd be inclined to drop +mve from this one; there's nothing I can see in the 
test that would generate mve instructions, so I think it's irrelevant.  We can 
use the existing v8_1m_main_pacbti operations.

> 
> 
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c:
> /* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
> -mfloat-abi=hard" } */
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c:
> /* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */
> 
> v8_1m_main_pacbti minus -mthumb.
> AFAICT the -mthumb is redundant.

Nearly, but not quite.  Although the gcc driver knows that m-profile 
architectures require thumb, that's not enough to override an explicit -marm 
from a testsuite configuration run, so if your site.exp file adds -marm in a 
test configuration we need to override that or the test will fail.  But the 
table based list of options will do that for you.

> 
> 
> gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c:
> /* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */
> 
> v8_1m_main minus -mthumb.
> AFAICT the -mthumb is redundant.

As above

> 
> 
> gcc/testsuite/gcc.target/arm/bti-1.c:
> /* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
> -mbranch-protection=bti --save-temps" } */
> gcc/testsuite/gcc.target/arm/bti-2.c:
> /* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
> -mbranch-protection=bti --save-temps" } */
> 
> v8_1m_main minus +fp.
> 
> Can these be bumped to +fp, or do we need an extra dg arch?
> 
> Are these missing +pacbti?

The tests themselves do not require fp, but if we use the effective-target 
rules (arm_arch_v8_1m_main), we can remove the -march, -mthumb and -mfloat-abi 
flags from these tests.

These tests for BTI should NOT have +pacbti: they're testing that the compiler 
generates the right nop-based implementation that is backwards compatible with 
CPUs that do not have this feature.[1]
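
As a rough illustration of the table-driven approach, a BTI test header could look like the sketch below.  The effective-target and option names follow the arm_arch_*_ok / add_options_for_arm_arch_* convention discussed in this thread; the function body is only a placeholder, and this is not the actual contents of bti-1.c or bti-2.c.

```c
/* { dg-do compile } */
/* Check that the flags from the v8_1m_main table entry are usable...  */
/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
/* ...and apply them, instead of spelling out -march, -mthumb and
   -mfloat-abi by hand.  */
/* { dg-add-options arm_arch_v8_1m_main } */
/* Only the options specific to this test remain explicit.  */
/* { dg-additional-options "-mbranch-protection=bti --save-temps" } */

/* Placeholder function; a real test would add scan-assembler checks
   for the expected BTI landing pads.  */
int
add_one (int x)
{
  return x + 1;
}
```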

R.

> 
> 
> Thanks,
> 

[1] The PAC and BTI features define the behaviour of some architectural NOP 
instructions: on older CPUs these instructions have no effect (they are NOPs), 
but on newer CPUs these NOPs take on a new behaviour that implements the 
feature (PAC or BTI).


Re: [PATCH 1/2] LoongArch: Define ISA versions

2024-04-19 Thread Yang Yujie
On Fri, Apr 19, 2024 at 07:34:33PM +0800, Xi Ruoyao wrote:
> On Fri, 2024-04-19 at 19:04 +0800, Yang Yujie wrote:
> > These ISA versions are defined as -march= parameters and
> > are recommended for building binaries for distribution.
> > 
> > Detailed description of these definitions can be found at
> > https://github.com/loongson/la-toolchain-conventions, which
> > the LoongArch GCC port aims to conform to.
> 
> The link seems broken.  Do you mean la-softdev-convention? 
> 
> -- 
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University

Yes, it's not accessible now, but we will publish that repository really soon.

It contains an updated version of the "LoongArch toolchain conventions"
document from the original (now archived) LoongArch-Documentations repo.

Yujie



Re: [PATCH] [testsuite] [arm] require arm_v8_1m_main for pacbti tests

2024-04-19 Thread Alexandre Oliva
On Apr 16, 2024, "Richard Earnshaw (lists)"  wrote:

> The require-effective-target flags test whether a specific set of
> flags will make the compilation work, so they need to be used in
> conjunction with the corresponding dg-add-options flags that then
> apply those options.

*nod*, that's the theory.  Problem is the architectures supported by
[add_options_for_]arm_arch_*[_ok] do not match exactly those expected by
the tests, and I can't quite tell whether the subtle changes they would
introduce would change what they intend to test, or even whether the
differences are irrelevant, or would be sensible to add as variants to
the dg machinery.  I think it would take someone more familiar than I am
with all of the ARM variants to do this correctly.  I don't even know
how these changes would need to be tested to be sure they remain
correct.

Would you be willing to take it from here, or would you accept the patch
as an incremental yet imperfect improvement, or would you prefer to
guide me in making it correct, and in verifying it (there are questions
below)?  I don't have a lot of cycles to put into this (we've already
worked around the testsuite bugs we ran into), but it would be desirable
to get a fix into GCC as well, if we can converge on one without
unreasonably burdening anyone.


v8_1m_main "-march=armv8.1-m.main+fp -mthumb" __ARM_ARCH_8M_MAIN__
v8_1m_main_pacbti "-march=armv8.1-m.main+pacbti+fp -mthumb"
"__ARM_ARCH_8M_MAIN__ && __ARM_FEATURE_BTI && 
__ARM_FEATURE_PAUTH

Why do these have +fp in -march but not in the v8_1m* arch name?


gcc/testsuite/g++.target/arm/pac-1.C:
/* { dg-options "-march=armv8.1-m.main+mve+pacbti -mbranch-protection=pac-ret 
-mthumb -mfloat-abi=hard -g -O0" } */

v8_1m_main_pacbti plus +mve minus +fp.
Do we need a dg arch for that?


gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-7.c:
/* { dg-additional-options "-march=armv8.1-m.main+pacbti+fp --save-temps 
-mfloat-abi=hard" } */
gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c:
/* { dg-options "-march=armv8.1-m.main+fp+pacbti" } */

v8_1m_main_pacbti minus -mthumb.
AFAICT the -mthumb is redundant.


gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-12.c:
/* { dg-options "-march=armv8-m.main+fp -mfloat-abi=softfp" } */

v8_1m_main minus -mthumb.
AFAICT the -mthumb is redundant.


gcc/testsuite/gcc.target/arm/bti-1.c:
/* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
-mbranch-protection=bti --save-temps" } */
gcc/testsuite/gcc.target/arm/bti-2.c:
/* { dg-options "-march=armv8.1-m.main -mthumb -mfloat-abi=softfp 
-mbranch-protection=bti --save-temps" } */

v8_1m_main minus +fp.

Can these be bumped to +fp, or do we need an extra dg arch?

Are these missing +pacbti?


Thanks,

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH] libgccjit: Add support for machine-dependent builtins

2024-04-19 Thread Antoni Boucher

David: Ping.

On 2024-02-29 at 10:34, Antoni Boucher wrote:

David: Ping.

On Thu, 2024-02-15 at 09:32 -0500, Antoni Boucher wrote:

David: Ping

On Thu, 2024-02-08 at 08:59 -0500, Antoni Boucher wrote:

David: Ping.

On Wed, 2024-01-10 at 18:58 -0500, Antoni Boucher wrote:

Here it is: https://gcc.gnu.org/pipermail/jit/2023q4/001725.html

On Wed, 2024-01-10 at 18:44 -0500, David Malcolm wrote:

On Wed, 2024-01-10 at 18:29 -0500, Antoni Boucher wrote:

David: Ping in case you missed this patch.


For some reason it's not showing up in patchwork (or, at least, I can't
find it there).  Do you have a URL for it there?

Sorry about this
Dave



On Sat, 2023-02-11 at 17:37 -0800, Andrew Pinski wrote:

On Sat, Feb 11, 2023 at 4:31 PM Antoni Boucher via Gcc-patches wrote:


Hi.
This patch adds support for machine-dependent builtins in libgccjit
(bug 108762).

There are two things I don't like in this patch:

  1. There are a few functions copied from the C frontend
     (common_mark_addressable_vec and a few others).

  2. Getting a target builtin only works from the second compilation,
     since the type information is recorded at the first compilation.
     I couldn't find a way to get the builtin data without using the
     langhook.  It is necessary to get the type information for type
     checking and introspection.

Any idea how to fix these issues?


Seems like you should do this patch in a few steps; that is, split it up.
Definitely split out GCC_JIT_TYPE_BFLOAT16 support.
I also think the vector support should be in a different patch too.

Splitting out these parts would definitely make it easier for review
and make incremental improvements.
Thanks,
Andrew Pinski





Thanks for the review.


Re: Frontend access to target features (was Re: [PATCH] libgccjit: Add ability to get CPU features)

2024-04-19 Thread Antoni Boucher

David: Ping.

On 2024-04-09 at 09:21, Antoni Boucher wrote:

David: Ping.

On 2024-04-01 at 08:20, Antoni Boucher wrote:

David: Ping.

On 2024-03-19 at 07:03, Arthur Cohen wrote:

Hi,

On 3/5/24 16:09, David Malcolm wrote:

On Thu, 2023-11-09 at 19:33 -0500, Antoni Boucher wrote:

Hi.
See answers below.

On Thu, 2023-11-09 at 18:04 -0500, David Malcolm wrote:

On Thu, 2023-11-09 at 17:27 -0500, Antoni Boucher wrote:

Hi.
This patch adds support for getting the CPU features in libgccjit
(bug 112466)

There's a TODO in the test:
I'm not sure how to test that gcc_jit_target_info_arch returns the
correct value since it is dependent on the CPU.
Any idea on how to improve this?

Also, I created a CStringHash to be able to have a
std::unordered_set. Is there any built-in way of doing this?


Thanks for the patch.

Some high-level questions:

Is this specifically about detecting capabilities of the host that
libgccjit is currently running on? or how the target was configured
when libgccjit was built?


I'm less sure about this part. I'll need to do more tests.



One of the benefits of libgccjit is that, in theory, we support all of
the targets that GCC already supports.  Does this patch change that, or
is this more about giving client code the ability to determine
capabilities of the specific host being compiled for?


This should not change that. If it does, this is a bug.



I'm nervous about having per-target jit code.  Presumably there's a
reason that we can't reuse existing target logic here - can you please
describe what the problem is.  I see that the ChangeLog has:


 * config/i386/i386-jit.cc: New file.


where i386-jit.cc has almost 200 lines of nontrivial code.  Where did
this come from?  Did you base it on existing code in our source tree,
making modifications to fit the new internal API, or did you write it
from scratch?  In either case, how onerous would this be for other
targets?


This was mostly copied from the same code done for the Rust and D
frontends.
See this commit and the following:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b1c06fd9723453dd2b2ec306684cb806dc2b4fbb
The equivalent to i386-jit.cc is there:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=22e3557e2d52f129f2bbfdc98688b945dba28dc9


[CCing Iain and Arthur re those patches; for reference, the patch being
discussed is attached to :
https://gcc.gnu.org/pipermail/jit/2024q1/001792.html ]

One of my concerns about this patch is that we seem to be gaining code
that's per-(frontend x config) which seems to be copied and pasted with
a search and replace, which could lead to an M*N explosion.


I think this is definitely already the case, and it would be worth 
investigating if C/C++/Rust/jit can reuse a similar set of target 
files, or how to factor them together. I imagine that all of these 
components share similar needs for the targets they support.




Is there any real difference between the per-config code for the
different frontends, or should there be a general "enumerate all
features of the target" hook that's independent of the frontend? (but
perhaps calls into it).

Am I right in thinking that (rustc with default LLVM backend) has some
set of feature strings that both (rustc with rustc_codegen_gcc) and
gccrs are trying to emulate?  If so, is it presumably a goal that
libgccjit gives identical results to gccrs?  If so, would it be crazy
for libgccjit to consume e.g. config/i386/i386-rust.cc ?
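
To illustrate what a frontend-independent "enumerate all features of the target" hook could look like, here is a minimal sketch.  Every name in it (feature_cb, demo_features, enumerate_target_features, struct query, query_cb) is hypothetical, and the feature strings are placeholders; nothing here is an existing GCC hook or the design proposed in the patch.  The idea is one enumeration routine per target, with each frontend supplying only a callback.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: invoked once per target feature name.  */
typedef void (*feature_cb) (const char *name, void *data);

/* Hypothetical per-target table.  A real hook would derive this from the
   target's option state rather than from a fixed list.  */
static const char *const demo_features[] = { "sse2", "avx", "avx2" };

/* Frontend-independent enumeration: one implementation per target.  */
static void
enumerate_target_features (feature_cb cb, void *data)
{
  size_t n = sizeof demo_features / sizeof demo_features[0];
  for (size_t i = 0; i < n; i++)
    cb (demo_features[i], data);
}

/* Example frontend-side consumer: count the features and remember
   whether a particular one is present.  */
struct query
{
  const char *wanted;
  int found;
  int total;
};

static void
query_cb (const char *name, void *data)
{
  struct query *q = data;
  q->total++;
  if (strcmp (name, q->wanted) == 0)
    q->found = 1;
}
```

With this shape the per-target code is written once and the per-frontend code is written once, so the cost would be M + N rather than M * N.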


I think this would definitely make sense, and it could probably be 
extended to other frontends. For the time being I think it makes 
sense to try it out for gccrs and jit. But finding a fitting name 
will be hard :)


Best,

Arthur



Dave





I'm not an expert at target hooks (or at the i386 backend), so if we do
go with this approach I'd want someone else to review those parts of
the patch.

Have you verified that GCC builds with this patch with jit *not*
enabled in the enabled languages?


I will do.



[...snip...]

A nitpick:


+.. function:: const char * \
+  gcc_jit_target_info_arch (gcc_jit_target_info *info)
+
+   Get the architecture of the currently running CPU.


What does this string look like?
How long does the pointer remain valid?


It's the march string, like "znver2", for instance.
It remains valid until we free the gcc_jit_target_info object.



Thanks again; hope the above makes sense
Dave







Re: [PATCH 1/2] LoongArch: Define ISA versions

2024-04-19 Thread Xi Ruoyao
On Fri, 2024-04-19 at 19:04 +0800, Yang Yujie wrote:
> These ISA versions are defined as -march= parameters and
> are recommended for building binaries for distribution.
> 
> Detailed description of these definitions can be found at
> https://github.com/loongson/la-toolchain-conventions, which
> the LoongArch GCC port aims to conform to.

The link seems broken.  Do you mean la-softdev-convention? 

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] c-family: Allow arguments with NULLPTR_TYPE as sentinels [PR114780]

2024-04-19 Thread Jakub Jelinek
Hi!

While in C++ the ellipsis argument conversions include
"An argument that has type cv std::nullptr_t is converted to type void*"
in C23 a nullptr_t argument is not promoted in any way, but va_arg
description says:
"the type of the next argument is nullptr_t and type is a pointer type that has 
the same
representation and alignment requirements as a pointer to a character type."
So, while in C++ check_function_sentinel will never see NULLPTR_TYPE, for
C23 it can see that and currently we incorrectly warn about those.

The only question is whether we should warn on any argument with
nullptr_t type or just about nullptr (nullptr_t argument with integer_zerop
value).  Through undefined behavior guess one could pass non-NULL pointer
that way, say by union { void *p; nullptr_t q; } u; u.p = 
and pass u.q to ..., but valid code should always pass something that will
read as (char *) 0 when read using va_arg (ap, char *), so I think it is
better not to warn rather than warn in those cases.

Note, clang seems to pass (void *)0 rather than expression of nullptr_t
type to ellipsis in C23 mode as if it did the C++ ellipsis argument
conversions, in that case guess not warning about that would be even safer,
but what GCC does I think follows the spec more closely, even when in a
valid program one shouldn't be able to observe the difference.
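
For readers less familiar with sentinel-terminated calls, here is a small self-contained sketch of the callee side.  The function name count_strings is made up for illustration, and the code uses a plain (char *) 0 sentinel so that it also builds in pre-C23 modes.  It shows the va_arg (ap, char *) read that the argument above relies on: whatever the caller passes as the last argument must read back as a null char pointer.

```c
#include <stdarg.h>
#include <stddef.h>

/* The sentinel attribute (a GNU extension) asks the compiler to check
   that calls end with a null pointer argument.  */
static int count_strings (const char *first, ...)
  __attribute__ ((sentinel));

/* Count the string arguments up to, but not including, the terminating
   null pointer.  */
static int
count_strings (const char *first, ...)
{
  va_list ap;
  int n = 0;

  va_start (ap, first);
  /* Each variadic argument is read back as a char *; the loop stops at
     the first one that compares equal to a null pointer.  */
  for (const char *s = first; s != NULL; s = va_arg (ap, char *))
    n++;
  va_end (ap);

  return n;
}
```

A call such as count_strings ("a", "b", (char *) 0) terminates the list explicitly; per the reasoning above, a C23 caller passing nullptr as the last argument should be read the same way by va_arg.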

Ok for trunk and later 13.3 if it passes bootstrap/regtest (so far just
checked on the sentinel related C/C++ tests)?

2024-04-19  Jakub Jelinek  

PR c/114780
* c-common.cc (check_function_sentinel): Allow as sentinel any
argument of NULLPTR_TYPE.

* gcc.dg/format/sentinel-2.c: New test.

--- gcc/c-family/c-common.cc.jj 2024-03-27 19:39:17.968676626 +0100
+++ gcc/c-family/c-common.cc2024-04-19 12:58:01.577985800 +0200
@@ -5783,6 +5783,7 @@ check_function_sentinel (const_tree fnty
   sentinel = fold_for_warn (argarray[nargs - 1 - pos]);
   if ((!POINTER_TYPE_P (TREE_TYPE (sentinel))
   || !integer_zerop (sentinel))
+ && TREE_CODE (TREE_TYPE (sentinel)) != NULLPTR_TYPE
  /* Although __null (in C++) is only an integer we allow it
 nevertheless, as we are guaranteed that it's exactly
 as wide as a pointer, and we don't want to force
--- gcc/testsuite/gcc.dg/format/sentinel-2.c.jj 2024-04-19 12:57:57.431043948 
+0200
+++ gcc/testsuite/gcc.dg/format/sentinel-2.c2024-04-19 12:58:39.020460785 
+0200
@@ -0,0 +1,21 @@
+/* PR c/114780 */
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -Wformat" } */
+
+#include <stddef.h>
+
+[[gnu::sentinel]] void foo (int, ...);
+[[gnu::sentinel]] void bar (...);
+
+void
+baz (nullptr_t p)
+{
+  foo (1, 2, nullptr);
+  foo (3, 4, 5, p);
+  bar (nullptr);
+  bar (p);
+  foo (6, 7, 0);   // { dg-warning "missing sentinel in function call" }
+  bar (0); // { dg-warning "missing sentinel in function call" }
+  foo (8, 9, NULL);
+  bar (NULL);
+}

Jakub



[PATCH 1/2] LoongArch: Define ISA versions

2024-04-19 Thread Yang Yujie
These ISA versions are defined as -march= parameters and
are recommended for building binaries for distribution.

Detailed description of these definitions can be found at
https://github.com/loongson/la-toolchain-conventions, which
the LoongArch GCC port aims to conform to.

gcc/ChangeLog:

* config.gcc: Make la64v1.0 the default ISA preset of the lp64d ABI.
* config/loongarch/genopts/loongarch-strings: Define la64v1.0, la64v1.1.
* config/loongarch/genopts/loongarch.opt.in: Likewise.
* config/loongarch/loongarch-c.cc (LARCH_CPP_SET_PROCESSOR): Likewise.
(loongarch_cpu_cpp_builtins): Likewise.
* config/loongarch/loongarch-cpu.cc (get_native_prid): Likewise.
(fill_native_cpu_config): Likewise.
* config/loongarch/loongarch-def.cc (array_tune): Likewise.
* config/loongarch/loongarch-def.h: Likewise.
* config/loongarch/loongarch-driver.cc (driver_set_m_parm): Likewise.
(driver_get_normalized_m_opts): Likewise.
* config/loongarch/loongarch-opts.cc (default_tune_for_arch): Likewise.
(TUNE_FOR_ARCH): Likewise.
(arch_str): Likewise.
(loongarch_target_option_override): Likewise.
* config/loongarch/loongarch-opts.h (TARGET_uARCH_LA464): Likewise.
(TARGET_uARCH_LA664): Likewise.
* config/loongarch/loongarch-str.h (STR_CPU_ABI_DEFAULT): Likewise.
(STR_ARCH_ABI_DEFAULT): Likewise.
(STR_TUNE_GENERIC): Likewise.
(STR_ARCH_LA64V1_0): Likewise.
(STR_ARCH_LA64V1_1): Likewise.
* config/loongarch/loongarch.cc 
(loongarch_cpu_sched_reassociation_width): Likewise.
(loongarch_asm_code_end): Likewise.
* config/loongarch/loongarch.opt: Likewise.
* doc/invoke.texi: Likewise.
---
 gcc/config.gcc| 32 +++
 .../loongarch/genopts/loongarch-strings   |  5 +-
 gcc/config/loongarch/genopts/loongarch.opt.in | 43 --
 gcc/config/loongarch/loongarch-c.cc   | 37 +++--
 gcc/config/loongarch/loongarch-cpu.cc | 35 
 gcc/config/loongarch/loongarch-def.cc | 83 +--
 gcc/config/loongarch/loongarch-def.h  | 37 ++---
 gcc/config/loongarch/loongarch-driver.cc  |  8 +-
 gcc/config/loongarch/loongarch-opts.cc| 66 +++
 gcc/config/loongarch/loongarch-opts.h |  4 +-
 gcc/config/loongarch/loongarch-str.h  |  5 +-
 gcc/config/loongarch/loongarch.cc | 11 +--
 gcc/config/loongarch/loongarch.opt| 43 --
 gcc/doc/invoke.texi   | 56 -
 14 files changed, 298 insertions(+), 167 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 5df3c52f8e9..d1fdba38eed 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -5072,7 +5072,7 @@ case "${target}" in
 
# Perform initial sanity checks on --with-* options.
case ${with_arch} in
-   "" | abi-default | loongarch64 | la[46]64) ;; # OK, append here.
+   "" | la64v1.[01] | abi-default | loongarch64 | la[46]64) ;; # 
OK, append here.
native)
if test x${host} != x${target}; then
echo "--with-arch=native is illegal for 
cross-compiler." 1>&2
@@ -5121,8 +5121,16 @@ case "${target}" in
case ${abi_base}/${abi_ext} in
lp64*/base)
# architectures that support lp64* ABI
-   arch_pattern="native|abi-default|loongarch64|la[46]64"
-   # default architecture for lp64* ABI
+   
arch_pattern="native|abi-default|la64v1.[01]|loongarch64|la[46]64"
+
+   # default architecture for lp64d ABI
+   arch_default="la64v1.0"
+   ;;
+   lp64[fs]/base)
+   # architectures that support lp64* ABI
+   
arch_pattern="native|abi-default|la64v1.[01]|loongarch64|la[46]64"
+
+   # default architecture for lp64[fs] ABI
arch_default="abi-default"
;;
*)
@@ -5194,15 +5202,7 @@ case "${target}" in
 
 
# Check default with_tune configuration using with_arch.
-   case ${with_arch} in
-   loongarch64)
-   tune_pattern="native|abi-default|loongarch64|la[46]64"
-   ;;
-   *)
-   # By default, $with_tune == $with_arch
-   tune_pattern="*"
-   ;;
-   esac
+   tune_pattern="native|generic|loongarch64|la[46]64"
 
case ${with_tune} in
"") ;; # OK
@@ -5252,7 +5252,7 @@ case "${target}" in
# Fixed: use the default gcc 
configuration for all multilib
   

[PATCH 2/2] LoongArch: Define builtin macros for ISA evolutions

2024-04-19 Thread Yang Yujie
Detailed description of these definitions can be found at
https://github.com/loongson/la-toolchain-conventions, which
the LoongArch GCC port aims to conform to.

gcc/ChangeLog:

* config.gcc: Add loongarch-evolution.o.
* config/loongarch/genopts/genstr.sh: Enable generation of
loongarch-evolution.[cc,h].
* config/loongarch/t-loongarch: Likewise.
* config/loongarch/genopts/gen-evolution.awk: New file.
* config/loongarch/genopts/isa-evolution.in: Mark ISA version
of introduction for each ISA evolution feature.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins):
Define builtin macros for enabled ISA evolutions and the ISA
version.
* config/loongarch/loongarch-cpu.cc: Use loongarch-evolution.h.
* config/loongarch/loongarch.h: Likewise.
* config/loongarch/loongarch-cpucfg-map.h: Delete.
* config/loongarch/loongarch-evolution.cc: New file.
* config/loongarch/loongarch-evolution.h: New file.
* config/loongarch/loongarch-opts.h (ISA_HAS_FRECIPE): Define.
(ISA_HAS_DIV32): Likewise.
(ISA_HAS_LAM_BH): Likewise.
(ISA_HAS_LAMCAS): Likewise.
(ISA_HAS_LD_SEQ_SA): Likewise.
---
 gcc/config.gcc|   2 +-
 .../loongarch/genopts/gen-evolution.awk   | 224 ++
 gcc/config/loongarch/genopts/genstr.sh|  82 ++-
 gcc/config/loongarch/genopts/isa-evolution.in |  10 +-
 gcc/config/loongarch/loongarch-c.cc   |  20 ++
 gcc/config/loongarch/loongarch-cpu.cc |   2 +-
 gcc/config/loongarch/loongarch-evolution.cc   |  58 +
 ...rch-cpucfg-map.h => loongarch-evolution.h} |  42 +++-
 gcc/config/loongarch/loongarch-opts.h |  11 -
 gcc/config/loongarch/loongarch.h  |   1 +
 gcc/config/loongarch/t-loongarch  |  26 +-
 11 files changed, 383 insertions(+), 95 deletions(-)
 create mode 100644 gcc/config/loongarch/genopts/gen-evolution.awk
 create mode 100644 gcc/config/loongarch/loongarch-evolution.cc
 rename gcc/config/loongarch/{loongarch-cpucfg-map.h => loongarch-evolution.h} 
(54%)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d1fdba38eed..36abe5dbc09 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -485,7 +485,7 @@ loongarch*-*-*)
cpu_type=loongarch
d_target_objs="loongarch-d.o"
extra_headers="larchintrin.h lsxintrin.h lasxintrin.h"
-   extra_objs="loongarch-c.o loongarch-builtins.o loongarch-cpu.o 
loongarch-opts.o loongarch-def.o"
+   extra_objs="loongarch-c.o loongarch-builtins.o loongarch-cpu.o 
loongarch-opts.o loongarch-def.o loongarch-evolution.o"
extra_gcc_objs="loongarch-driver.o loongarch-cpu.o loongarch-opts.o 
loongarch-def.o"
extra_options="${extra_options} g.opt fused-madd.opt"
;;
diff --git a/gcc/config/loongarch/genopts/gen-evolution.awk 
b/gcc/config/loongarch/genopts/gen-evolution.awk
new file mode 100644
index 000..26512834092
--- /dev/null
+++ b/gcc/config/loongarch/genopts/gen-evolution.awk
@@ -0,0 +1,224 @@
+#!/usr/bin/gawk
+#
+# A simple script that generates loongarch-evolution.h
+# from genopts/isa-evolution.in
+#
+# Copyright (C) 2021-2024 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 3, or (at your option) any later
+# version.
+#
+# GCC is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+# or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+# License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+BEGIN {
+# isa_version_major[]
+# isa_version_minor[]
+# cpucfg_word[]
+# cpucfg_bit_in_word[]
+# name_capitalized[]
+# comment[]
+}
+
+{
+cpucfg_word[NR] = $1
+cpucfg_bit_in_word[NR] = $2
+name[NR] = gensub(/-/, "_", "g", $3)
+name_capitalized[NR] = toupper(name[NR])
+isa_version_major[NR] = gensub(/^([1-9][0-9]*)\.([0-9]+)$/, "\\1", 1, $4)
+isa_version_minor[NR] = gensub(/^([1-9][0-9]*)\.([0-9]+)$/, "\\2", 1, $4)
+
+$1 = $2 = $3 = $4 = ""
+sub (/^\s*/, "")
+comment[NR] = $0
+}
+
+function copyright_header(from_year,to_year)
+{
+print "   Copyright (C) " from_year "-" to_year \
+  " Free Software Foundation, Inc."
+print ""
+print "This file is part of GCC."
+print ""
+print "GCC is free software; you can redistribute it and/or modify"
+print "it under the terms of the GNU General Public License as published by"
+print "the Free Software Foundation; either version 3, or (at your option)"
+print "any later version."
+print ""
+print "GCC 

Enable 'gcc.dg/pr114768.c' for nvptx target [PR114768] (was: [PATCH] rtlanal: Fix set_noop_p for volatile loads or stores [PR114768])

2024-04-19 Thread Thomas Schwinge
Hi!

On 2024-04-19T12:30:25+0200, Jakub Jelinek  wrote:
> On Fri, Apr 19, 2024 at 12:23:03PM +0200, Thomas Schwinge wrote:
>> On 2024-04-19T08:24:03+0200, Jakub Jelinek  wrote:
>> > --- gcc/testsuite/gcc.dg/pr114768.c.jj 2024-04-18 15:37:49.139433678 
>> > +0200
>> > +++ gcc/testsuite/gcc.dg/pr114768.c2024-04-18 15:43:30.389730365 
>> > +0200
>> > @@ -0,0 +1,10 @@
>> > +/* PR rtl-optimization/114768 */
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-O2 -fdump-rtl-final" } */
>> > +/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { 
>> > nvptx*-*-* } } } } } */
>> > +
>> > +void
>> > +foo (int *p)
>> > +{
>> > +  *p = *(volatile int *) p;
>> > +}
>> 
>> Why exclude nvptx target here?  As far as I can see, it does behave in
>> exactly the same way as expected; see 'diff' of before vs. after the
>> 'gcc/rtlanal.cc' code changes:
>
> I wasn't sure if the non-RA targets (for which we don't have an effective
> target) even have final dump.
> If they do as you show, then guess the target guard can go.

ACK.  Pushed to trunk branch in
commit 9451b6c0a941dc44ca6f14ff8565d74fe56cca59
"Enable 'gcc.dg/pr114768.c' for nvptx target [PR114768]", see attached.


Grüße
 Thomas


From 9451b6c0a941dc44ca6f14ff8565d74fe56cca59 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 19 Apr 2024 12:32:03 +0200
Subject: [PATCH] Enable 'gcc.dg/pr114768.c' for nvptx target [PR114768]

Follow-up to commit 9f295847a9c32081bdd0fe908ffba58e830a24fb
"rtlanal: Fix set_noop_p for volatile loads or stores [PR114768]": nvptx does
behave in exactly the same way as expected; see 'diff' of before vs. after the
'gcc/rtlanal.cc' code changes:

PASS: gcc.dg/pr114768.c (test for excess errors)
[-FAIL:-]{+PASS:+} gcc.dg/pr114768.c scan-rtl-dump final "\\(mem/v:"

--- 0/pr114768.c.347r.final	2024-04-19 11:34:34.577037596 +0200
+++ ./pr114768.c.347r.final	2024-04-19 12:08:00.118312524 +0200
@@ -13,15 +13,27 @@
 ;;  entry block defs 	 1 [%stack] 2 [%frame] 3 [%args]
 ;;  exit block uses 	 1 [%stack] 2 [%frame]
 ;;  regs ever live
-;;  ref usage 	r1={1d,2u} r2={1d,2u} r3={1d,1u}
-;;total ref usage 8{3d,5u,0e} in 1{1 regular + 0 call} insns.
+;;  ref usage 	r1={1d,3u} r2={1d,3u} r3={1d,2u} r22={1d,1u} r23={1d,2u}
+;;total ref usage 16{5d,11u,0e} in 4{4 regular + 0 call} insns.
 (note 1 0 4 NOTE_INSN_DELETED)
 (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
-(note 2 4 3 2 NOTE_INSN_DELETED)
+(insn 2 4 3 2 (set (reg/v/f:DI 23 [ p ])
+(unspec:DI [
+(const_int 0 [0])
+] UNSPEC_ARG_REG)) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":8:1 14 {load_arg_regdi}
+ (nil))
 (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
-(note 6 3 10 2 NOTE_INSN_DELETED)
-(note 10 6 11 2 NOTE_INSN_EPILOGUE_BEG)
-(jump_insn 11 10 12 2 (return) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":10:1 289 {return}
+(insn 6 3 7 2 (set (reg:SI 22 [ _1 ])
+(mem/v:SI (reg/v/f:DI 23 [ p ]) [1 MEM[(volatile int *)p_3(D)]+0 S4 A32])) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":9:8 6 {*movsi_insn}
+ (nil))
+(insn 7 6 10 2 (set (mem:SI (reg/v/f:DI 23 [ p ]) [1 *p_3(D)+0 S4 A32])
+(reg:SI 22 [ _1 ])) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":9:6 6 {*movsi_insn}
+ (expr_list:REG_DEAD (reg/v/f:DI 23 [ p ])
+(expr_list:REG_DEAD (reg:SI 22 [ _1 ])
+(nil
+(note 10 7 13 2 NOTE_INSN_EPILOGUE_BEG)
+(note 13 10 11 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
+(jump_insn 11 13 12 3 (return) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":10:1 289 {return}
	  (nil)
  -> return)
 (barrier 12 11 0)

--- 0/pr114768.s	2024-04-19 11:34:34.577037596 +0200
+++ ./pr114768.s	2024-04-19 12:08:00.118312524 +0200
@@ -13,5 +13,10 @@
 {
	.reg.u64 %ar0;
	ld.param.u64 %ar0, [%in_ar0];
+	.reg.u32 %r22;
+	.reg.u64 %r23;
+		mov.u64	%r23, %ar0;
+		ld.u32	%r22, [%r23];
+		st.u32	[%r23], %r22;
	ret;
 }

	PR testsuite/114768
	gcc/testsuite/
	* gcc.dg/pr114768.c: Enable for nvptx target.
---
 gcc/testsuite/gcc.dg/pr114768.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr114768.c b/gcc/testsuite/gcc.dg/pr114768.c
index 2075f0d6b82..ffe3b368638 100644
--- a/gcc/testsuite/gcc.dg/pr114768.c
+++ b/gcc/testsuite/gcc.dg/pr114768.c
@@ -1,7 +1,7 @@
 /* PR rtl-optimization/114768 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-rtl-final" } */
-/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { nvptx*-*-* } } } } } */
+/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" } } */
 
 void
 foo (int *p)
-- 
2.34.1



Re: [PATCH] rtlanal: Fix set_noop_p for volatile loads or stores [PR114768]

2024-04-19 Thread Jakub Jelinek
On Fri, Apr 19, 2024 at 12:23:03PM +0200, Thomas Schwinge wrote:
> On 2024-04-19T08:24:03+0200, Jakub Jelinek  wrote:
> > --- gcc/testsuite/gcc.dg/pr114768.c.jj  2024-04-18 15:37:49.139433678 
> > +0200
> > +++ gcc/testsuite/gcc.dg/pr114768.c 2024-04-18 15:43:30.389730365 +0200
> > @@ -0,0 +1,10 @@
> > +/* PR rtl-optimization/114768 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-rtl-final" } */
> > +/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { 
> > nvptx*-*-* } } } } } */
> > +
> > +void
> > +foo (int *p)
> > +{
> > +  *p = *(volatile int *) p;
> > +}
> 
> Why exclude nvptx target here?  As far as I can see, it does behave in
> exactly the same way as expected; see 'diff' of before vs. after the
> 'gcc/rtlanal.cc' code changes:

I wasn't sure if the non-RA targets (for which we don't have an effective
target) even have final dump.
If they do as you show, then guess the target guard can go.
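
For anyone reading the thread without the test file at hand, the property being checked can be shown in isolation. The snippet below is only an illustrative sketch that reuses the body of foo () from gcc.dg/pr114768.c: the assignment stores back the value it has just loaded, so it looks like a no-op, but the load goes through a volatile-qualified lvalue and therefore has to be kept.

```cpp
// Same body as foo () in gcc.dg/pr114768.c.  The store writes back the
// value that was just read, but the read is a volatile access, so the
// compiler is not allowed to treat the whole statement as a no-op.
void foo(int *p)
{
  *p = *(volatile int *) p;
}
```

At run time the pointee is unchanged after the call; the difference is only visible in the generated code, which is why the test scans the final RTL dump for "(mem/v:".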

Jakub



Re: [PATCH] rtlanal: Fix set_noop_p for volatile loads or stores [PR114768]

2024-04-19 Thread Thomas Schwinge
Hi Jakub!

On 2024-04-19T08:24:03+0200, Jakub Jelinek  wrote:
> --- gcc/testsuite/gcc.dg/pr114768.c.jj2024-04-18 15:37:49.139433678 
> +0200
> +++ gcc/testsuite/gcc.dg/pr114768.c   2024-04-18 15:43:30.389730365 +0200
> @@ -0,0 +1,10 @@
> +/* PR rtl-optimization/114768 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-rtl-final" } */
> +/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { nvptx*-*-* 
> } } } } } */
> +
> +void
> +foo (int *p)
> +{
> +  *p = *(volatile int *) p;
> +}

Why exclude nvptx target here?  As far as I can see, it does behave in
exactly the same way as expected; see 'diff' of before vs. after the
'gcc/rtlanal.cc' code changes:

PASS: gcc.dg/pr114768.c (test for excess errors)
[-FAIL:-]{+PASS:+} gcc.dg/pr114768.c scan-rtl-dump final "\\(mem/v:"

--- 0/pr114768.c.347r.final 2024-04-19 11:34:34.577037596 +0200
+++ ./pr114768.c.347r.final 2024-04-19 12:08:00.118312524 +0200
@@ -13,15 +13,27 @@
 ;;  entry block defs1 [%stack] 2 [%frame] 3 [%args]
 ;;  exit block uses 1 [%stack] 2 [%frame]
 ;;  regs ever live 
-;;  ref usage  r1={1d,2u} r2={1d,2u} r3={1d,1u} 
-;;total ref usage 8{3d,5u,0e} in 1{1 regular + 0 call} insns.
+;;  ref usage  r1={1d,3u} r2={1d,3u} r3={1d,2u} r22={1d,1u} 
r23={1d,2u} 
+;;total ref usage 16{5d,11u,0e} in 4{4 regular + 0 call} insns.
 (note 1 0 4 NOTE_INSN_DELETED)
 (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
-(note 2 4 3 2 NOTE_INSN_DELETED)
+(insn 2 4 3 2 (set (reg/v/f:DI 23 [ p ])
+(unspec:DI [
+(const_int 0 [0])
+] UNSPEC_ARG_REG)) 
"source-gcc/gcc/testsuite/gcc.dg/pr114768.c":8:1 14 {load_arg_regdi}
+ (nil))
 (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
-(note 6 3 10 2 NOTE_INSN_DELETED)
-(note 10 6 11 2 NOTE_INSN_EPILOGUE_BEG)
-(jump_insn 11 10 12 2 (return) 
"source-gcc/gcc/testsuite/gcc.dg/pr114768.c":10:1 289 {return}
+(insn 6 3 7 2 (set (reg:SI 22 [ _1 ])
+(mem/v:SI (reg/v/f:DI 23 [ p ]) [1 MEM[(volatile int *)p_3(D)]+0 
S4 A32])) "source-gcc/gcc/testsuite/gcc.dg/pr114768.c":9:8 6 {*movsi_insn}
+ (nil))
+(insn 7 6 10 2 (set (mem:SI (reg/v/f:DI 23 [ p ]) [1 *p_3(D)+0 S4 A32])
+(reg:SI 22 [ _1 ])) 
"source-gcc/gcc/testsuite/gcc.dg/pr114768.c":9:6 6 {*movsi_insn}
+ (expr_list:REG_DEAD (reg/v/f:DI 23 [ p ])
+(expr_list:REG_DEAD (reg:SI 22 [ _1 ])
+(nil
+(note 10 7 13 2 NOTE_INSN_EPILOGUE_BEG)
+(note 13 10 11 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
+(jump_insn 11 13 12 3 (return) 
"source-gcc/gcc/testsuite/gcc.dg/pr114768.c":10:1 289 {return}
  (nil)
  -> return)
 (barrier 12 11 0)

--- 0/pr114768.s2024-04-19 11:34:34.577037596 +0200
+++ ./pr114768.s2024-04-19 12:08:00.118312524 +0200
@@ -13,5 +13,10 @@
 {
.reg.u64 %ar0;
ld.param.u64 %ar0, [%in_ar0];
+   .reg.u32 %r22;
+   .reg.u64 %r23;
+   mov.u64 %r23, %ar0;
+   ld.u32  %r22, [%r23];
+   st.u32  [%r23], %r22;
ret;
 }


Grüße
 Thomas


Re: [PATCH v3] bpf: remove huge memory waste with string allocation.

2024-04-19 Thread Cupertino Miranda


David Faust writes:

> Hi Cupertino,
>
> On 4/18/24 13:58, Cupertino Miranda wrote:
>> Hi David, everyone,
>>
>> Following David's last review I decided to properly detect error cases,
>> as suggested.
>> The error, however, should be reported earlier in compilation, in the
>> pack_enum_value function, where all the errors are reported.
>>
>> Thanks for the quick and detailed reviews.
>>
>> Regards,
>> Cupertino
>
> Thanks for taking the time on this.
> This version is nice, just one little comment:
>
>>
>> The BPF backend was allocating an unnecessarily large string when
>> constructing CO-RE relocations for enum types.
>> This patch further verifies if an enumerator is valid for CO-RE
>> representability and returns an error in those cases.
>
> The second sentence is a little awkward and seems to imply the error is
> returned when the enumerator is valid :)
> Perhaps "...verifies that an enumerator is valid for CO-RE, and returns
> an error if it is not" or similar would be more clear?
Thanks for all the suggestions.
>
> Otherwise, OK.
> Thanks!
Pushed!

>
>
>>
>> gcc/ChangeLog:
>>  * config/bpf/core-builtins.cc (get_index_for_enum_value): Create
>>  function.
>>  (pack_enum_value): Check for enumerator and error out.
>>  (process_enum_value): Correct string allocation.
>> ---
>>  gcc/config/bpf/core-builtins.cc | 57 ++---
>>  1 file changed, 38 insertions(+), 19 deletions(-)
>>
>> diff --git a/gcc/config/bpf/core-builtins.cc 
>> b/gcc/config/bpf/core-builtins.cc
>> index e03e986e2c1..829acea98f7 100644
>> --- a/gcc/config/bpf/core-builtins.cc
>> +++ b/gcc/config/bpf/core-builtins.cc
>> @@ -795,6 +795,23 @@ process_field_expr (struct cr_builtins *data)
>>  static GTY(()) hash_map *bpf_enum_mappings;
>>  tree enum_value_type = NULL_TREE;
>>
>> +static int
>> +get_index_for_enum_value (tree type, tree expr)
>> +{
>> +  gcc_assert (TREE_CODE (expr) == CONST_DECL
>> +  && TREE_CODE (type) == ENUMERAL_TYPE);
>> +
>> +  unsigned int index = 0;
>> +  for (tree l = TYPE_VALUES (type); l; l = TREE_CHAIN (l))
>> +{
>> +  gcc_assert (index < (1 << 16));
>> +  if (TREE_VALUE (l) == expr)
>> +return index;
>> +  index++;
>> +}
>> +  return -1;
>> +}
>> +
>>  /* Pack helper for the __builtin_preserve_enum_value.  */
>>
>>  static struct cr_local
>> @@ -846,6 +863,16 @@ pack_enum_value_fail:
>>  ret.reloc_data.default_value = integer_one_node;
>>  }
>>
>> +  if (ret.fail == false )
>> +{
>> +  int index = get_index_for_enum_value (type, tmp);
>> +  if (index == -1 || index >= (1 << 16))
>> +{
>> +  bpf_error ("enum value in CO-RE builtin cannot be represented");
>> +  ret.fail = true;
>> +}
>> +}
>> +
>>ret.reloc_data.type = type;
>>ret.reloc_data.kind = kind;
>>return ret;
>> @@ -864,25 +891,17 @@ process_enum_value (struct cr_builtins *data)
>>
>>struct cr_final ret = { NULL, type, data->kind };
>>
>> -  if (TREE_CODE (expr) == CONST_DECL
>> - && TREE_CODE (type) == ENUMERAL_TYPE)
>> -{
>> -  unsigned int index = 0;
>> -  for (tree l = TYPE_VALUES (type); l; l = TREE_CHAIN (l))
>> -{
>> -  if (TREE_VALUE (l) == expr)
>> -{
>> -  char *tmp = (char *) ggc_alloc_atomic ((index / 10) + 1);
>> -  sprintf (tmp, "%d", index);
>> -  ret.str = (const char *) tmp;
>> -
>> -  break;
>> -}
>> -  index++;
>> -}
>> -}
>> -  else
>> -gcc_unreachable ();
>> +  gcc_assert (TREE_CODE (expr) == CONST_DECL
>> +  && TREE_CODE (type) == ENUMERAL_TYPE);
>> +
>> +  int index = get_index_for_enum_value (type, expr);
>> +  gcc_assert (index != -1 && index < (1 << 16));
>> +
>> +  /* Index can only be a value up to 2^16.  Should always fit
>> + in 6 chars.  */
>> +  char tmp[6];
>> +  sprintf (tmp, "%u", index);
>> +  ret.str = CONST_CAST (char *, ggc_strdup(tmp));
>>
>>return ret;
>>  }
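
The buffer sizing in the last hunk can be checked in isolation. The following is a standalone sketch, not the GCC code: enumerator_index and index_to_string are hypothetical names, and a plain vector of names stands in for the TYPE_VALUES chain. It assumes, as the patch does, that a valid index is always below 2^16.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for walking TYPE_VALUES: return the position of
// `name` in `enumerators`, or -1 if it is absent or its position would
// not fit in 16 bits.
int enumerator_index(const std::vector<std::string>& enumerators,
                     const std::string& name)
{
  for (std::size_t i = 0; i < enumerators.size() && i < (1u << 16); ++i)
    if (enumerators[i] == name)
      return static_cast<int>(i);
  return -1;
}

// An index below 2^16 has at most 5 decimal digits, so 6 bytes
// (digits plus the terminating NUL) are always enough.  snprintf is
// used here so that an out-of-range value would be truncated rather
// than overflow the buffer.
std::string index_to_string(int index)
{
  char tmp[6];
  std::snprintf(tmp, sizeof tmp, "%d", index);
  return std::string(tmp);
}
```

The largest representable index, 65535, produces the five characters "65535", which together with the terminator fills the six-byte buffer exactly.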


Re: [PATCH 1/3] bpf: support more instructions to match CO-RE relocations

2024-04-19 Thread Cupertino Miranda


Thanks! Pushed!

Jose E. Marchesi writes:

> Hi Cupertino.
> OK for master.
> Thanks!
>
>> BPF supports multiple instructions to be CO-RE relocatable regardless of
>> the position of the immediate field in the encoding.
>> In particular, not only the MOV instruction allows a CO-RE
>> relocation of its immediate operand, but the LD and ST instructions can
>> have a CO-RE relocation happening to their offset immediate operand,
>> even though those operands are encoded in different encoding bits.
>> This patch moves matching from a more traditional matching of the
>> UNSPEC_CORE_RELOC pattern within a define_insn to a match within the
>> constraints of both immediate and address operands in the more generic
>> mov define_insn rule.
>>
>> gcc/ChangeLog:
>>  * config/bpf/bpf-protos.h (bpf_add_core_reloc): Renamed function
>>  to bpf_output_move.
>>  * config/bpf/bpf.cc (bpf_legitimate_address_p): Allow
>>  UNSPEC_CORE_RELOC to match an address.
>>  (bpf_insn_cost): Make UNSPEC_CORE_RELOC immediate moves
>>  expensive to prioritize loads and stores.
>>  (TARGET_INSN_COST): Add hook.
>>  (bpf_output_move): Wrapper to call bpf_output_core_reloc.
>>  (bpf_print_operand): Add support to print immediate operands
>>  specified with the UNSPEC_CORE_RELOC.
>>  (bpf_print_operand_address): Likewise, but to support
>>  UNSPEC_CORE_RELOC in addresses.
>>  (bpf_init_builtins): Flag BPF_BUILTIN_CORE_RELOC as NOTHROW.
>>  * config/bpf/bpf.md: Wrap patterns for MOV, LD and ST
>>  instruction with bpf_output_move call.
>>  (mov_reloc_core): Remove now spurious define_insn.
>>  * config/bpf/constraints.md: Added "c" and "C" constraints to
>>  match immediates represented with UNSPEC_CORE_RELOC.
>>  * config/bpf/core-builtins.cc (bpf_add_core_reloc): Remove.
>>  (bpf_output_core_reloc): Add function to create the CO-RE
>>  relocations based on new matching rules.
>>  * config/bpf/core-builtins.h (bpf_output_core_reloc): Add
>>  prototype.
>>  * config/bpf/predicates.md (core_imm_operand): Add predicate.
>>  (mov_src_operand): Add match for core_imm_operand.
>>
>> gcc/testsuite/ChangeLog:
>>  * gcc.target/bpf/btfext-funcinfo.c: Updated to changes.
>>  * gcc.target/bpf/core-builtin-fieldinfo-const-elimination.c:
>>  Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-existence-1.c: Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-lshift-1-be.c: Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-lshift-1-le.c: Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-lshift-2.c: Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-offset-1.c: Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-rshift-1.c: Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-rshift-2.c: Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-sign-1.c: Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-sign-2.c: Likewise.
>>  * gcc.target/bpf/core-builtin-fieldinfo-size-1.c: Likewise.
>> ---
>>  gcc/config/bpf/bpf-protos.h   |  2 +-
>>  gcc/config/bpf/bpf.cc | 54 +-
>>  gcc/config/bpf/bpf.md | 56 ++-
>>  gcc/config/bpf/constraints.md | 20 ++
>>  gcc/config/bpf/core-builtins.cc   | 71 ++-
>>  gcc/config/bpf/core-builtins.h|  2 +
>>  gcc/config/bpf/predicates.md  |  7 +-
>>  .../gcc.target/bpf/btfext-funcinfo.c  |  2 -
>>  ...core-builtin-fieldinfo-const-elimination.c |  2 +-
>>  .../bpf/core-builtin-fieldinfo-existence-1.c  |  2 +-
>>  .../bpf/core-builtin-fieldinfo-lshift-1-be.c  |  8 +--
>>  .../bpf/core-builtin-fieldinfo-lshift-1-le.c  |  8 +--
>>  .../bpf/core-builtin-fieldinfo-lshift-2.c |  6 +-
>>  .../bpf/core-builtin-fieldinfo-offset-1.c | 12 ++--
>>  .../bpf/core-builtin-fieldinfo-rshift-1.c |  8 +--
>>  .../bpf/core-builtin-fieldinfo-rshift-2.c |  4 +-
>>  .../bpf/core-builtin-fieldinfo-sign-1.c   |  4 +-
>>  .../bpf/core-builtin-fieldinfo-sign-2.c   |  4 +-
>>  .../bpf/core-builtin-fieldinfo-size-1.c   |  8 +--
>>  19 files changed, 189 insertions(+), 91 deletions(-)
>>
>> diff --git a/gcc/config/bpf/bpf-protos.h b/gcc/config/bpf/bpf-protos.h
>> index ac0c2f4038f..b4866d34209 100644
>> --- a/gcc/config/bpf/bpf-protos.h
>> +++ b/gcc/config/bpf/bpf-protos.h
>> @@ -30,7 +30,7 @@ extern void bpf_print_operand_address (FILE *, rtx);
>>  extern void bpf_expand_prologue (void);
>>  extern void bpf_expand_epilogue (void);
>>  extern void bpf_expand_cbranch (machine_mode, rtx *);
>> -const char *bpf_add_core_reloc (rtx *operands, const char *templ);
>> +const char *bpf_output_move (rtx *operands, const char *templ);
>>
>>  class gimple_opt_pass;
>>  gimple_opt_pass *make_pass_lower_bpf_core (gcc::context *ctxt);
>> diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
>> index 

[PATCH] rust: Do not link with libdl and libpthread unconditionally

2024-04-19 Thread Arthur Cohen
Hi everyone,

This patch checks for the presence of dlopen and pthread_create in libc. If
that is not the case, we check for the existence of -ldl and -lpthread, as
these libraries are required to link the Rust runtime to our Rust frontend.

If these libs are not present on the system, then we disable the Rust frontend.

This was tested on x86_64, in an environment with a recent GLIBC and in a
container with GLIBC 2.27.

Apologies for sending it in so late.

ChangeLog:

* Makefile.tpl: Add CRAB1_LIBS variable.
* Makefile.in: Regenerate.
* configure: Regenerate.
* configure.ac: Check if -ldl and -lpthread are needed, and if so, add
them to CRAB1_LIBS.

gcc/rust/ChangeLog:

* Make-lang.in: Remove overzealous LIBS = -ldl -lpthread line, link
crab1 against CRAB1_LIBS.
---
 Makefile.in   |   3 +
 Makefile.tpl  |   3 +
 configure | 157 ++
 configure.ac  |  94 +
 gcc/rust/Make-lang.in |   2 +-
 5 files changed, 258 insertions(+), 1 deletion(-)

diff --git a/Makefile.in b/Makefile.in
index db4fa6c6260..34c5550beca 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -197,6 +197,7 @@ HOST_EXPORTS = \
$(BASE_EXPORTS) \
CC="$(CC)"; export CC; \
ADA_CFLAGS="$(ADA_CFLAGS)"; export ADA_CFLAGS; \
+   CRAB1_LIBS="$(CRAB1_LIBS)"; export CRAB1_LIBS; \
CFLAGS="$(CFLAGS)"; export CFLAGS; \
CONFIG_SHELL="$(SHELL)"; export CONFIG_SHELL; \
CXX="$(CXX)"; export CXX; \
@@ -450,6 +451,8 @@ GOCFLAGS = $(CFLAGS)
 GDCFLAGS = @GDCFLAGS@
 GM2FLAGS = $(CFLAGS)
 
+CRAB1_LIBS = @CRAB1_LIBS@
+
 PKG_CONFIG_PATH = @PKG_CONFIG_PATH@
 
 GUILE = guile
diff --git a/Makefile.tpl b/Makefile.tpl
index 1d5813cd569..8f4bf297918 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -200,6 +200,7 @@ HOST_EXPORTS = \
$(BASE_EXPORTS) \
CC="$(CC)"; export CC; \
ADA_CFLAGS="$(ADA_CFLAGS)"; export ADA_CFLAGS; \
+   CRAB1_LIBS="$(CRAB1_LIBS)"; export CRAB1_LIBS; \
CFLAGS="$(CFLAGS)"; export CFLAGS; \
CONFIG_SHELL="$(SHELL)"; export CONFIG_SHELL; \
CXX="$(CXX)"; export CXX; \
@@ -453,6 +454,8 @@ GOCFLAGS = $(CFLAGS)
 GDCFLAGS = @GDCFLAGS@
 GM2FLAGS = $(CFLAGS)
 
+CRAB1_LIBS = @CRAB1_LIBS@
+
 PKG_CONFIG_PATH = @PKG_CONFIG_PATH@
 
 GUILE = guile
diff --git a/configure b/configure
index 3b0abeb8b2e..75b489a5f57 100755
--- a/configure
+++ b/configure
@@ -690,6 +690,7 @@ extra_host_zlib_configure_flags
 extra_host_libiberty_configure_flags
 stage1_languages
 host_libs_picflag
+CRAB1_LIBS
 PICFLAG
 host_shared
 gcc_host_pie
@@ -8875,6 +8876,142 @@ fi
 
 
 
+# Rust requires -ldl and -lpthread if you are using an old glibc that does not include them by
+# default, so we check for them here
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking if libc includes libdl and libpthread" >&5
+$as_echo_n "checking if libc includes libdl and libpthread... " >&6; }
+
+ac_ext=c
+ac_cpp='$CPP $CPPFLAGS'
+ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5'
+ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
+ac_compiler_gnu=$ac_cv_c_compiler_gnu
+
+
+requires_ldl=no
+requires_lpthread=no
+missing_rust_dynlibs=none
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include 
+int
+main ()
+{
+dlopen(0,0);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+
+else
+  requires_ldl=yes
+
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+
+if test $requires_ldl = yes; then
+tmp_LIBS=$LIBS
+LIBS="$LIBS -ldl"
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include 
+int
+main ()
+{
+dlopen(0,0);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  CRAB1_LIBS="$CRAB1_LIBS -ldl"
+else
+  missing_rust_dynlibs="libdl"
+
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+
+LIBS=$tmp_LIBS
+fi
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include 
+int
+main ()
+{
+pthread_create(NULL,NULL,NULL,NULL);
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+
+else
+  requires_lpthread=yes
+
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+
+if test $requires_lpthread = yes; then
+tmp_LIBS=$LIBS
+LIBS="$LIBS -lpthread"
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include 
+int
+main ()
+{
+pthread_create(NULL,NULL,NULL,NULL);
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  CRAB1_LIBS="$CRAB1_LIBS -lpthread"
+else
+  missing_rust_dynlibs="$missing_rust_dynlibs, libpthread"
+
+fi
+rm -f core conftest.err conftest.$ac_objext \
+conftest$ac_exeext conftest.$ac_ext
+
+LIBS=$tmp_LIBS
+fi
+
+if test "$missing_rust_dynlibs" = "none"; then
+  if test $requires_ldl = yes -a 

Re: [PATCH] libstdc++: Fix std::ranges::iota is not included in numeric [PR108760]

2024-04-19 Thread Jonathan Wakely
On Thu, 18 Apr 2024 at 22:59, Patrick Palka  wrote:
>
> On Wed, 17 Apr 2024, Michael Levine (BLOOMBERG/ 919 3RD A) wrote:
>
> > This patch fixes GCC Bug 108760: 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108760
> > Before this patch, using std::ranges::iota required including  
> > when it should have been sufficient to only include .
> >
> > When the patch is applied, the following code will compile: 
> > https://godbolt.org/z/33EPeqd1b
> >
> > I added a test case for this change as well.
> >
> > I built my local version of gcc using the following configuration: $ 
> > ../gcc/configure --disable-bootstrap --prefix="$(pwd)/_pfx/" 
> > --enable-languages=c,c++,lto
> >
> > and I tested my changes by running: $ make check-c++ -jN -k
>
> Nice, thanks for the patch!
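
As a quick illustration of what the fix is meant to allow, here is a small sketch that includes only the numeric header. The helper name first_n is made up for this example. It assumes the library advertises the feature through the standard __cpp_lib_ranges_iota macro and falls back to the classic std::iota otherwise, so it should build with older toolchains as well.

```cpp
#include <numeric>
#include <vector>

// Fill a vector with 1, 2, ..., n using only the numeric header.
std::vector<int> first_n(int n)
{
  std::vector<int> v(static_cast<std::vector<int>::size_type>(n));
#if defined(__cpp_lib_ranges_iota) && __cpp_lib_ranges_iota >= 202202L
  std::ranges::iota(v, 1);           // C++23 range overload
#else
  std::iota(v.begin(), v.end(), 1);  // iterator-pair fallback
#endif
  return v;
}
```

Both branches produce the same sequence, so callers see no difference other than which overload was selected.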
>
> >
> > I ran this on the following OS:
> >
> > Virtualization: wsl
> > Operating System: Ubuntu 20.04.6 LTS
> > Kernel: Linux 5.15.146.1-microsoft-standard-WSL2
> > Architecture: x86-64
>
> > From bd04070c281572ed7a3b48e3d33543e25b8c8afe Mon Sep 17 00:00:00 2001
> > From: Michael Levine 
> > Date: Fri, 23 Feb 2024 14:13:13 -0500
> > Subject: [PATCH 1/2] Fix the bug
> >
> > Signed-off-by: Michael Levine 
> > ---
> >  libstdc++-v3/include/bits/ranges_algo.h | 52 --
> >  libstdc++-v3/include/bits/stl_numeric.h | 57 -
> >  2 files changed, 56 insertions(+), 53 deletions(-)
> >
> > diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
> > b/libstdc++-v3/include/bits/ranges_algo.h
> > index 62faff173bd..d258be0b93f 100644
> > --- a/libstdc++-v3/include/bits/ranges_algo.h
> > +++ b/libstdc++-v3/include/bits/ranges_algo.h
> > @@ -3521,58 +3521,6 @@ namespace ranges
> >
> >  #endif // __glibcxx_ranges_contains
> >
> > -#if __glibcxx_ranges_iota >= 202202L // C++ >= 23
> > -
> > -  template
> > -struct out_value_result
> > -{
> > -  [[no_unique_address]] _Out out;
> > -  [[no_unique_address]] _Tp value;
> > -
> > -  template
> > - requires convertible_to
> > -   && convertible_to
> > - constexpr
> > - operator out_value_result<_Out2, _Tp2>() const &
> > - { return {out, value}; }
> > -
> > -  template
> > - requires convertible_to<_Out, _Out2>
> > -   && convertible_to<_Tp, _Tp2>
> > - constexpr
> > - operator out_value_result<_Out2, _Tp2>() &&
> > - { return {std::move(out), std::move(value)}; }
> > -};
> > -
> > -  template
> > -using iota_result = out_value_result<_Out, _Tp>;
> > -
> > -  struct __iota_fn
> > -  {
> > -template _Sent, 
> > weakly_incrementable _Tp>
> > -  requires indirectly_writable<_Out, const _Tp&>
> > -  constexpr iota_result<_Out, _Tp>
> > -  operator()(_Out __first, _Sent __last, _Tp __value) const
> > -  {
> > - while (__first != __last)
> > -   {
> > - *__first = static_cast(__value);
> > - ++__first;
> > - ++__value;
> > -   }
> > - return {std::move(__first), std::move(__value)};
> > -  }
> > -
> > -template _Range>
> > -  constexpr iota_result, _Tp>
> > -  operator()(_Range&& __r, _Tp __value) const
> > -  { return (*this)(ranges::begin(__r), ranges::end(__r), 
> > std::move(__value)); }
> > -  };
> > -
> > -  inline constexpr __iota_fn iota{};
> > -
> > -#endif // __glibcxx_ranges_iota
> > -
> >  #if __glibcxx_ranges_find_last >= 202207L // C++ >= 23
> >
> >struct __find_last_fn
> > diff --git a/libstdc++-v3/include/bits/stl_numeric.h 
> > b/libstdc++-v3/include/bits/stl_numeric.h
> > index fe911154ab7..1b06c26dc02 100644
> > --- a/libstdc++-v3/include/bits/stl_numeric.h
> > +++ b/libstdc++-v3/include/bits/stl_numeric.h
> > @@ -59,7 +59,7 @@
> >  #include 
> >  #include 
> >  #include  // For _GLIBCXX_MOVE
> > -
> > +#include  // For _Range as used by std::ranges::iota
> >
> >  namespace std _GLIBCXX_VISIBILITY(default)
> >  {
> > @@ -102,6 +102,61 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >  }
> >  #endif
> >
> > +namespace ranges
> > +{
> > +#if __glibcxx_ranges_iota >= 202202L // C++ >= 23
> > +
> > +  template
> > +struct out_value_result
> > +{
> > +  [[no_unique_address]] _Out out;
> > +  [[no_unique_address]] _Tp value;
> > +
> > +  template
> > + requires convertible_to
> > +   && convertible_to
> > + constexpr
> > + operator out_value_result<_Out2, _Tp2>() const &
> > + { return {out, value}; }
> > +
> > +  template
> > + requires convertible_to<_Out, _Out2>
> > +   && convertible_to<_Tp, _Tp2>
> > + constexpr
> > + operator out_value_result<_Out2, _Tp2>() &&
> > + { return {std::move(out), std::move(value)}; }
> > +};
>
> IIUC out_value_result should continue to be available from , so we
> probably don't want to move it to  (or one of its internal headers).
> Better would be to move it to  I think.
>
> > +
> > +  template
> > +using iota_result = out_value_result<_Out, _Tp>;
> > +
> > +  struct __iota_fn
> > +  {
> > +template _Sent, 
> 

[committed] d: Fix ICE in build_deref, at d/d-codegen.cc:1650 [PR111650]

2024-04-19 Thread Iain Buclaw
Hi,

This regression in the D front-end was caused by the hidden closure
parameter type being generated too early for nested functions in some
cases.  It is better to update the type after the local closure/frame
type has been completed.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-m64,
committed to mainline, and backporting to releases/gcc-13.

Regards,
Iain.

---
PR d/111650

gcc/d/ChangeLog:

* decl.cc (get_fndecl_arguments): Move generation of frame type to ...
(DeclVisitor::visit (FuncDeclaration *)): ... here, after the call to
build_closure.

gcc/testsuite/ChangeLog:

* gdc.dg/pr111650.d: New test.
---
 gcc/d/decl.cc   | 20 ++--
 gcc/testsuite/gdc.dg/pr111650.d | 21 +
 2 files changed, 31 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr111650.d

diff --git a/gcc/d/decl.cc b/gcc/d/decl.cc
index 3b7627d3dfa..0a87c85ae2e 100644
--- a/gcc/d/decl.cc
+++ b/gcc/d/decl.cc
@@ -163,16 +163,6 @@ get_fndecl_arguments (FuncDeclaration *decl)
  tree parm_decl = get_symbol_decl (decl->vthis);
  DECL_ARTIFICIAL (parm_decl) = 1;
  TREE_READONLY (parm_decl) = 1;
-
- if (decl->vthis->type == Type::tvoidptr)
-   {
- /* Replace generic pointer with back-end closure type
-(this wins for gdb).  */
- tree frame_type = FRAMEINFO_TYPE (get_frameinfo (decl));
- gcc_assert (frame_type != NULL_TREE);
- TREE_TYPE (parm_decl) = build_pointer_type (frame_type);
-   }
-
  param_list = chainon (param_list, parm_decl);
}
 
@@ -1072,6 +1062,16 @@ public:
 /* May change cfun->static_chain.  */
 build_closure (d);
 
+/* Replace generic pointer with back-end closure type
+   (this wins for gdb).  */
+if (d->vthis && d->vthis->type == Type::tvoidptr)
+  {
+   tree frame_type = FRAMEINFO_TYPE (get_frameinfo (d));
+   gcc_assert (frame_type != NULL_TREE);
+   tree parm_decl = get_symbol_decl (d->vthis);
+   TREE_TYPE (parm_decl) = build_pointer_type (frame_type);
+  }
+
 if (d->vresult)
   declare_local_var (d->vresult);
 
diff --git a/gcc/testsuite/gdc.dg/pr111650.d b/gcc/testsuite/gdc.dg/pr111650.d
new file mode 100644
index 000..4298a76d38f
--- /dev/null
+++ b/gcc/testsuite/gdc.dg/pr111650.d
@@ -0,0 +1,21 @@
+// { dg-do compile }
+ref V require(K, V)(ref V[K] aa, K key, lazy V value);
+
+struct Root
+{
+ulong[3] f;
+}
+
+Root[ulong] roots;
+
+Root getRoot(int fd, ulong rootID)
+{
+return roots.require(rootID,
+{
+Root result;
+inoLookup(fd, () => result);
+return result;
+}());
+}
+
+void inoLookup(int, scope Root delegate()) { }
-- 
2.40.1



Re: [PATCH] libstdc++: Support link chains in std::chrono::tzdb::locate_zone [PR114770]

2024-04-19 Thread Jonathan Wakely
On Fri, 19 Apr 2024 at 07:14, Richard Biener  wrote:
>
> On Thu, Apr 18, 2024 at 6:34 PM Jonathan Wakely  wrote:
> >
> > This would fix the bug; how do people feel about it this close to the
> > gcc-14 release?
>
> Guess we'll have to fix it anyway, so why not now ...

Yeah, I don't think Debian is going to stop using this feature, and it
might get used more widely in future (it's currently part of the
"vanguard" format for tzdata, but might move to "main" one day and
then all distros would have chained links). So it needs to be
backported to gcc-13 too.

> (what could go wrong..)

Well the risk is that my new code doesn't correctly detect cycles, and
so could go into an infinite loop when trying to follow chained links.
The current code on trunk will just fail to find a time_zone and throw
an exception, which is not ideal, but predictable and easily
understood. Attempting to handle chained links adds complexity.

I think my new code is correct so that it won't get stuck in a loop,
and there are tests which should cover it sufficiently. And for
correctly tzdata.zi there will never be cycles anyway, so even if I
messed the code up, it shouldn't matter unless the application
provides a custom tzdata.zi with invalid links.

So I guess I'll push it, and backport to gcc-13 soon.
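
To make the cycle-detection idea concrete, here is a minimal sketch of following a link chain with the tortoise and hare technique. It is not the code in tzdb.cc: resolve_link and the map/set tables are invented for illustration, and the data model (link name to target name, plus a set of zone names) is a simplifying assumption.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical model: `links` maps a link name to its target name and
// `zones` holds the names of real zones.  Returns the zone that `name`
// ultimately refers to, or throws if the chain dangles or loops.
std::string resolve_link(const std::map<std::string, std::string>& links,
                         const std::set<std::string>& zones,
                         const std::string& name)
{
  if (zones.count(name))
    return name;

  // Tortoise and hare: `fast` follows two links per iteration, `slow`
  // follows one.  If they ever compare equal, the chain contains a cycle.
  std::string slow = name;
  std::string fast = name;
  for (;;)
    {
      for (int step = 0; step < 2; ++step)
        {
          auto it = links.find(fast);
          if (it == links.end())
            throw std::runtime_error("not a zone or link: " + fast);
          fast = it->second;
          if (zones.count(fast))
            return fast;
        }
      slow = links.at(slow);  // always valid: slow trails fast on the chain
      if (slow == fast)
        throw std::runtime_error("cycle in link chain starting at " + name);
    }
}
```

With a table where Asia/Chungking points at Asia/Chongqing and Asia/Chongqing points at the Asia/Shanghai zone (the second entry is assumed here for the sake of the example), the lookup returns Asia/Shanghai after two steps, while two links that refer to each other end in an exception rather than an infinite loop.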


>
> Richard.
>
> > Tested x86_64-linux.
> >
> > -- >8 --
> >
> > Since 2022 the TZif format defined in the zic(8) man page has said that
> > links can refer to other links, rather than only referring to a zone.
> > This isn't supported by the C++20 spec, which assumes that the target()
> > for a chrono::time_zone_link always names a chrono::time_zone, not
> > another chrono::time_zone_link.
> >
> > This hasn't been a problem until now, because there are no entries in
> > the tzdata file that chain links together. However, Debian Sid has
> > changed the target of the Asia/Chungking link from the Asia/Shanghai
> > zone to the Asia/Chongqing link, creating a link chain. The libstdc++
> > code is unable to handle this, so chrono::locate_zone("Asia/Chungking")
> > will fail with the tzdata.zi file from Debian Sid.
> >
> > It seems likely that the C++ spec will need a change to allow link
> > chains, so that the original structure of the IANA database can be fully
> > represented by chrono::tzdb. The alternative would be for chrono::tzdb
> > to flatten all chains when loading the data, so that a link's target is
> > always a zone, but this means throwing away information present in the
> > tzdata.zi input file.
> >
> > In anticipation of a change to the spec, this commit adds support for
> > chained links to libstdc++. When a name is found to be a link, we try to
> > find its target in the list of zones as before, but now if the target
> > isn't the name of a zone we don't fail. Instead we look for another link
> > with that name, and keep doing that until we reach the end of the chain
> > of links, and then look up the last target as a zone.
> >
> > This new logic would get stuck in a loop if the tzdata.zi file is buggy
> > and defines a link chain that contains a cycle, e.g. two links that
> > refer to each other. To deal with that unlikely case, we use the
> > tortoise and hare algorithm to detect cycles in link chains, and throw
> > an exception if we detect a cycle. Cycles in links should never happen,
> > and it is expected that link chains will be short (if they occur at all)
> > and so the code is optimized for short chains without cycles. Longer
> > chains (four or more links) and cycles will do more work, but won't fail
> > to resolve a chain or get stuck in a loop.
> >
> > libstdc++-v3/ChangeLog:
> >
> > PR libstdc++/114770
> > * src/c++20/tzdb.cc (do_locate_zone): Support links that have
> > another link as their target.
> > * testsuite/std/time/tzdb/links.cc: New test.
> > ---
> >  libstdc++-v3/src/c++20/tzdb.cc|  57 -
> >  libstdc++-v3/testsuite/std/time/tzdb/links.cc | 215 ++
> >  2 files changed, 268 insertions(+), 4 deletions(-)
> >  create mode 100644 libstdc++-v3/testsuite/std/time/tzdb/links.cc
> >
> > diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc
> > index 639d1c440ba..c7c7cc9deee 100644
> > --- a/libstdc++-v3/src/c++20/tzdb.cc
> > +++ b/libstdc++-v3/src/c++20/tzdb.cc
> > @@ -1599,7 +1599,7 @@ namespace std::chrono
> >  const time_zone*
> >  do_locate_zone(const vector& zones,
> >const vector& links,
> > -  string_view tz_name) noexcept
> > +  string_view tz_name)
> >  {
> >// Lambda mangling changed between -fabi-version=2 and -fabi-version=18
> >auto search = [](const Vec& v, string_view name) {
> > @@ -1610,13 +1610,62 @@ namespace std::chrono
> > return ptr;
> >};
> >
> > +  // Search zones first.
> >if (auto tz = search(zones, tz_name))
> > return tz;
> >
> > +  // Search links second.
> >  

Re: [PATCH] contrib: Add autoregen.py

2024-04-19 Thread Christophe Lyon
On Fri, 19 Apr 2024 at 11:03, Christophe Lyon
 wrote:
>
> This script is a copy of the current script used by Sourceware's
> autoregen buildbots.
>
> It is intended as a helper to regenerate files managed by autotools
> (autoconf, automake, aclocal, ), as well as the toplevel
> Makefile.in which is created by autogen.
>
> Other files can be updated when using maintainer-mode, but this is not
> covered by this script.
>
> 2024-04-19  Christophe Lyon  
>
> contrib/
> * autogen.py: New script.
Of course this should read "autoregen.py" :-)

> ---
>  contrib/autoregen.py | 221 +++
>  1 file changed, 221 insertions(+)
>  create mode 100755 contrib/autoregen.py
>
> diff --git a/contrib/autoregen.py b/contrib/autoregen.py
> new file mode 100755
> index 000..faffc88c5bd
> --- /dev/null
> +++ b/contrib/autoregen.py
> @@ -0,0 +1,221 @@
> +#!/usr/bin/env python3
> +
> +# This script helps to regenerate files managed by autotools and
> +# autogen in binutils-gdb and gcc repositories.
> +
> +# It can be used by buildbots to check that the current repository
> +# contents has been updated correctly, and by developers to update
> +# such files as expected.
> +
> +import os
> +import shutil
> +import subprocess
> +from pathlib import Path
> +
> +
> +# On Gentoo, vanilla unpatched autotools are packaged separately.
> +# We place the vanilla names first as we want to prefer those if both exist.
> +AUTOCONF_NAMES = ["autoconf-vanilla-2.69", "autoconf-2.69", "autoconf"]
> +AUTOMAKE_NAMES = ["automake-vanilla-1.15", "automake-1.15.1", "automake"]
> +ACLOCAL_NAMES = ["aclocal-vanilla-1.15", "aclocal-1.15.1", "aclocal"]
> +AUTOHEADER_NAMES = ["autoheader-vanilla-2.69", "autoheader-2.69", "autoheader"]
> +AUTORECONF_NAMES = ["autoreconf-vanilla-2.69", "autoreconf-2.69", "autoreconf"]
> +
> +# Pick the first for each list that exists on this system.
> +AUTOCONF_BIN = next(name for name in AUTOCONF_NAMES if shutil.which(name))
> +AUTOMAKE_BIN = next(name for name in AUTOMAKE_NAMES if shutil.which(name))
> +ACLOCAL_BIN = next(name for name in ACLOCAL_NAMES if shutil.which(name))
> +AUTOHEADER_BIN = next(name for name in AUTOHEADER_NAMES if shutil.which(name))
> +AUTORECONF_BIN = next(name for name in AUTORECONF_NAMES if shutil.which(name))
> +
> +AUTOGEN_BIN = "autogen"
> +
> +# autoconf-wrapper and automake-wrapper from Gentoo look at this environment variable.
> +# It's harmless to set it on other systems though.
> +EXTRA_ENV = {
> +"WANT_AUTOCONF": AUTOCONF_BIN.split("-", 1)[1] if "-" in AUTOCONF_BIN else "",
> +"WANT_AUTOMAKE": AUTOMAKE_BIN.split("-", 1)[1] if "-" in AUTOMAKE_BIN else "",
> +"AUTOCONF": AUTOCONF_BIN,
> +"ACLOCAL": ACLOCAL_BIN,
> +"AUTOMAKE": AUTOMAKE_BIN,
> +"AUTOGEN": AUTOGEN_BIN,
> +}
> +ENV = os.environ.copy()
> +ENV.update(EXTRA_ENV)
> +
> +
> +# Directories we should skip entirely because they're vendored or have different
> +# autotools versions.
> +SKIP_DIRS = [
> +# readline and minizip are maintained with different autotools versions
> +"readline",
> +"minizip",
> +]
> +
> +# these directories are known to be re-generatable with a simple autoreconf
> +# without special -I flags
> +# Entries commented out (and directories not listed) are handled by
> +# regenerate_manually().
> +AUTORECONF_DIRS = [
> +# subdirs common to binutils-gdb and gcc
> +"libbacktrace",
> +"libdecnumber", # No Makefile.am
> +"libiberty", # No Makefile.am
> +"zlib",
> +
> +# binutils-gdb subdirs
> +"bfd",
> +"binutils",
> +"etc",
> +"gas",
> +"gdb",
> +"gdbserver",
> +"gdbsupport",
> +"gnulib",
> +"gold",
> +"gprof",
> +"gprofng",
> +"gprofng/libcollector",
> +"ld",
> +"libctf",
> +"libsframe",
> +"opcodes",
> +"sim",
> +
> +# gcc subdirs
> +"c++tools", # No aclocal.m4
> +"gcc", # No Makefile.am
> +#"fixincludes", # autoreconf complains about GCC_AC_FUNC_MMAP_BLACKLIST
> +"gnattools", # No aclocal.m4
> +"gotools",
> +"libada", # No aclocal.m4
> +"libatomic",
> +"libcc1",
> +"libcody", # No aclocal.m4
> +"libcpp", # No Makefile.am
> +"libffi",
> +"libgcc", # No aclocal.m4
> +"libgfortran",
> +# Hack: ACLOCAL_AMFLAGS = -I .. -I ../config in Makefile.in but we
> +# apply -I../config -I.. otherwise we do not match the current
> +# contents
> +#"libgm2",
> +"libgo",
> +"libgomp",
> +"libgrust",
> +"libitm",
> +"libobjc", # No Makefile.am
> +"libphobos",
> +"libquadmath",
> +"libsanitizer",
> +"libssp",
> +"libstdc++-v3",
> +# This does not cover libvtv/testsuite/other-tests/Makefile.in
> +"libvtv",
> +"lto-plugin",
> +]
> +
> +
> +# Run the shell command CMD.
> +#
> +# Print the command on stdout prior to running it.
> +def run_shell(cmd: str):
> +print(f"+ {cmd}", flush=True)
> +res = 

[PATCH] contrib: Add autoregen.py

2024-04-19 Thread Christophe Lyon
This script is a copy of the current script used by Sourceware's
autoregen buildbots.

It is intended as a helper to regenerate files managed by autotools
(autoconf, automake, aclocal, ), as well as the toplevel
Makefile.in which is created by autogen.

Other files can be updated when using maintainer-mode, but this is not
covered by this script.
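
As a side note on the tool lookup in the script: `next()` over a generator raises a bare StopIteration when no candidate is found in PATH. A minimal sketch of the same lookup with an explicit error is shown below; the helper names `first_available` and `want_version` are hypothetical and are not part of the patch.

```python
import shutil


def first_available(names, which=shutil.which):
    """Return the first entry of NAMES that WHICH can resolve.

    WHICH is injectable so the lookup can be exercised without the
    real tools installed (an assumption made for this sketch).
    """
    for name in names:
        if which(name):
            return name
    raise FileNotFoundError(f"none of {list(names)} found in PATH")


def want_version(tool_bin):
    """Mirror the script's WANT_* expression: text after the first '-'."""
    return tool_bin.split("-", 1)[1] if "-" in tool_bin else ""


# Pretend that only the plain versioned binary is installed.
fake_which = lambda name: "/usr/bin/" + name if name == "autoconf-2.69" else None
chosen = first_available(
    ["autoconf-vanilla-2.69", "autoconf-2.69", "autoconf"], which=fake_which
)
```

The list order still decides the preference, so the vanilla names keep priority whenever they exist.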

2024-04-19  Christophe Lyon  

contrib/
* autogen.py: New script.
---
 contrib/autoregen.py | 221 +++
 1 file changed, 221 insertions(+)
 create mode 100755 contrib/autoregen.py

diff --git a/contrib/autoregen.py b/contrib/autoregen.py
new file mode 100755
index 000..faffc88c5bd
--- /dev/null
+++ b/contrib/autoregen.py
@@ -0,0 +1,221 @@
+#!/usr/bin/env python3
+
+# This script helps to regenerate files managed by autotools and
+# autogen in binutils-gdb and gcc repositories.
+
+# It can be used by buildbots to check that the current repository
+# contents has been updated correctly, and by developers to update
+# such files as expected.
+
+import os
+import shutil
+import subprocess
+from pathlib import Path
+
+
+# On Gentoo, vanilla unpatched autotools are packaged separately.
+# We place the vanilla names first as we want to prefer those if both exist.
+AUTOCONF_NAMES = ["autoconf-vanilla-2.69", "autoconf-2.69", "autoconf"]
+AUTOMAKE_NAMES = ["automake-vanilla-1.15", "automake-1.15.1", "automake"]
+ACLOCAL_NAMES = ["aclocal-vanilla-1.15", "aclocal-1.15.1", "aclocal"]
+AUTOHEADER_NAMES = ["autoheader-vanilla-2.69", "autoheader-2.69", "autoheader"]
+AUTORECONF_NAMES = ["autoreconf-vanilla-2.69", "autoreconf-2.69", "autoreconf"]
+
+# Pick the first for each list that exists on this system.
+AUTOCONF_BIN = next(name for name in AUTOCONF_NAMES if shutil.which(name))
+AUTOMAKE_BIN = next(name for name in AUTOMAKE_NAMES if shutil.which(name))
+ACLOCAL_BIN = next(name for name in ACLOCAL_NAMES if shutil.which(name))
+AUTOHEADER_BIN = next(name for name in AUTOHEADER_NAMES if shutil.which(name))
+AUTORECONF_BIN = next(name for name in AUTORECONF_NAMES if shutil.which(name))
+
+AUTOGEN_BIN = "autogen"
+
+# autoconf-wrapper and automake-wrapper from Gentoo look at this environment variable.
+# It's harmless to set it on other systems though.
+EXTRA_ENV = {
+"WANT_AUTOCONF": AUTOCONF_BIN.split("-", 1)[1] if "-" in AUTOCONF_BIN else "",
+"WANT_AUTOMAKE": AUTOMAKE_BIN.split("-", 1)[1] if "-" in AUTOMAKE_BIN else "",
+"AUTOCONF": AUTOCONF_BIN,
+"ACLOCAL": ACLOCAL_BIN,
+"AUTOMAKE": AUTOMAKE_BIN,
+"AUTOGEN": AUTOGEN_BIN,
+}
+ENV = os.environ.copy()
+ENV.update(EXTRA_ENV)
+
+
+# Directories we should skip entirely because they're vendored or have different
+# autotools versions.
+SKIP_DIRS = [
+# readline and minizip are maintained with different autotools versions
+"readline",
+"minizip",
+]
+
+# these directories are known to be re-generatable with a simple autoreconf
+# without special -I flags
+# Entries commented out (and directories not listed) are handled by
+# regenerate_manually().
+AUTORECONF_DIRS = [
+# subdirs common to binutils-gdb and gcc
+"libbacktrace",
+"libdecnumber", # No Makefile.am
+"libiberty", # No Makefile.am
+"zlib",
+
+# binutils-gdb subdirs
+"bfd",
+"binutils",
+"etc",
+"gas",
+"gdb",
+"gdbserver",
+"gdbsupport",
+"gnulib",
+"gold",
+"gprof",
+"gprofng",
+"gprofng/libcollector",
+"ld",
+"libctf",
+"libsframe",
+"opcodes",
+"sim",
+
+# gcc subdirs
+"c++tools", # No aclocal.m4
+"gcc", # No Makefile.am
+#"fixincludes", # autoreconf complains about GCC_AC_FUNC_MMAP_BLACKLIST
+"gnattools", # No aclocal.m4
+"gotools",
+"libada", # No aclocal.m4
+"libatomic",
+"libcc1",
+"libcody", # No aclocal.m4
+"libcpp", # No Makefile.am
+"libffi",
+"libgcc", # No aclocal.m4
+"libgfortran",
+# Hack: ACLOCAL_AMFLAGS = -I .. -I ../config in Makefile.in but we
+# apply -I../config -I.. otherwise we do not match the current
+# contents
+#"libgm2",
+"libgo",
+"libgomp",
+"libgrust",
+"libitm",
+"libobjc", # No Makefile.am
+"libphobos",
+"libquadmath",
+"libsanitizer",
+"libssp",
+"libstdc++-v3",
+# This does not cover libvtv/testsuite/other-tests/Makefile.in
+"libvtv",
+"lto-plugin",
+]
+
+
+# Run the shell command CMD.
+#
+# Print the command on stdout prior to running it.
+def run_shell(cmd: str):
+print(f"+ {cmd}", flush=True)
+res = subprocess.run(
+f"{cmd}",
+shell=True,
+encoding="utf8",
+env=ENV,
+)
+res.check_returncode()
+
+
+def regenerate_with_autoreconf():
+run_shell(f"{AUTORECONF_BIN} -f")
+
+def regenerate_with_autogen():
+run_shell(f"{AUTOGEN_BIN} Makefile.def")
+
+def regenerate_manually():
+configure_lines = open("configure.ac").read().splitlines()
+if folder.stem == 

Re: [PATCH 2/2] ARC: Use intrinsics for __builtin_sub_overflow*()

2024-04-19 Thread Shahab Vahedi
Hi Claudiu,

On 9/7/23 12:15, Claudiu Zissulescu Ianculescu wrote:
> OK,
> 
> Thank you for your contribution,
> Claudiu

Could you commit this patch?
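
For readers skimming the thread, the semantics the two new patterns are meant to implement can be modelled roughly as follows. This is a Python sketch of what __builtin_sub_overflow reports for 32-bit operands, not ARC code; the helper names are made up, and mapping the signed case to the V flag and the unsigned case to the C flag is taken from the b.v / b.c branches quoted below.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Wrap an arbitrary Python int to a two's-complement 32-bit value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def sub_overflow_signed(a, b):
    """Return (wrapped result, overflow) for signed 32-bit a - b."""
    exact = a - b
    wrapped = to_signed32(exact)
    return wrapped, wrapped != exact   # overflow <-> V flag (assumed mapping)


def sub_overflow_unsigned(a, b):
    """Return (wrapped result, borrow) for unsigned 32-bit a - b."""
    exact = a - b
    return exact & MASK32, exact < 0   # borrow <-> C flag (assumed mapping)


signed_case = sub_overflow_signed(-2**31, 1)    # INT32_MIN - 1 wraps around
unsigned_case = sub_overflow_unsigned(0, 1)     # 0 - 1 borrows
```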

> 
> On Wed, Sep 6, 2023 at 3:50 PM Shahab Vahedi  
> wrote:
>>
>> This patch covers signed and unsigned subtractions.  The generated code
>> would be something along these lines:
>>
>> signed:
>>   sub.f   r0, r1, r2
>>   b.v @label
>>
>> unsigned:
>>   sub.f   r0, r1, r2
>>   b.c @label
>>
>> gcc/ChangeLog:
>>
>> * config/arc/arc.md (subsi3_v): New insn.
>> (subvsi4): New expand.
>> (subsi3_c): New insn.
>> (usubvsi4): New expand.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/arc/overflow-2.c: New.
>>
>> Signed-off-by: Shahab Vahedi 
>> ---
>>  gcc/config/arc/arc.md | 48 +++
>>  gcc/testsuite/gcc.target/arc/overflow-2.c | 97 +++
>>  2 files changed, 145 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.target/arc/overflow-2.c
>>
>> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
>> index 9d011f6b4a9..34e9e1a7f1d 100644
>> --- a/gcc/config/arc/arc.md
>> +++ b/gcc/config/arc/arc.md
>> @@ -2973,6 +2973,54 @@ archs4x, archs4xd"
>>(set_attr "cpu_facility" "*,cd,*,*,*,*,*,*,*,*")
>>])
>>
>> +(define_insn "subsi3_v"
>> +  [(set (match_operand:SI  0 "register_operand"  "=r,r,r,  r")
>> +   (minus:SI (match_operand:SI 1 "register_operand"   "r,r,0,  r")
>> + (match_operand:SI 2 "nonmemory_operand"  "r,L,I,C32")))
>> +   (set (reg:CC_V CC_REG)
>> +   (compare:CC_V (sign_extend:DI (minus:SI (match_dup 1)
>> +   (match_dup 2)))
>> + (minus:DI (sign_extend:DI (match_dup 1))
>> +   (sign_extend:DI (match_dup 2)]
>> +   ""
>> +   "sub.f\\t%0,%1,%2"
>> +   [(set_attr "cond"   "set")
>> +(set_attr "type"   "compare")
>> +(set_attr "length" "4,4,4,8")])
>> +
>> +(define_expand "subvsi4"
>> + [(match_operand:SI 0 "register_operand")
>> +  (match_operand:SI 1 "register_operand")
>> +  (match_operand:SI 2 "nonmemory_operand")
>> +  (label_ref (match_operand 3 "" ""))]
>> +  ""
>> +  "emit_insn (gen_subsi3_v (operands[0], operands[1], operands[2]));
>> +   arc_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]);
>> +   DONE;")
>> +
>> +(define_insn "subsi3_c"
>> +  [(set (match_operand:SI  0 "register_operand"  "=r,r,r,  r")
>> +   (minus:SI (match_operand:SI 1 "register_operand"   "r,r,0,  r")
>> + (match_operand:SI 2 "nonmemory_operand"  "r,L,I,C32")))
>> +   (set (reg:CC_C CC_REG)
>> +   (compare:CC_C (match_dup 1)
>> + (match_dup 2)))]
>> +   ""
>> +   "sub.f\\t%0,%1,%2"
>> +   [(set_attr "cond"   "set")
>> +(set_attr "type"   "compare")
>> +(set_attr "length" "4,4,4,8")])
>> +
>> +(define_expand "usubvsi4"
>> +  [(match_operand:SI 0 "register_operand")
>> +   (match_operand:SI 1 "register_operand")
>> +   (match_operand:SI 2 "nonmemory_operand")
>> +   (label_ref (match_operand 3 "" ""))]
>> +   ""
>> +   "emit_insn (gen_subsi3_c (operands[0], operands[1], operands[2]));
>> +arc_gen_unlikely_cbranch (LTU, CC_Cmode, operands[3]);
>> +DONE;")
>> +
>>  (define_expand "subdi3"
>>[(set (match_operand:DI 0 "register_operand" "")
>> (minus:DI (match_operand:DI 1 "register_operand" "")
>> diff --git a/gcc/testsuite/gcc.target/arc/overflow-2.c b/gcc/testsuite/gcc.target/arc/overflow-2.c
>> new file mode 100644
>> index 000..b4de8c03b22
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arc/overflow-2.c
>> @@ -0,0 +1,97 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O1" } */
>> +
>> +#include 
>> +#include 
>> +
>> +/*
>> + * sub.f  r0,r0,r1
>> + * st_s   r0,[r2]
>> + * mov_s  r0,1
>> + * j_s.d  [blink]
>> + * mov.nv r0,0
>> + */
>> +bool sub_overflow (int32_t a, int32_t b, int32_t *res)
>> +{
>> +  return __builtin_sub_overflow (a, b, res);
>> +}
>> +
>> +/*
>> + * sub.f  r0,r0,-1234
>> + * st_s   r0,[r1]
>> + * mov_s  r0,1
>> + * j_s.d  [blink]
>> + * mov.nv r0,0
>> + */
>> +bool subi_overflow (int32_t a, int32_t *res)
>> +{
>> +  return __builtin_sub_overflow (a, -1234, res);
>> +}
>> +
>> +/*
>> + * sub.f  r3,r0,r1
>> + * st_s   r3,[r2]
>> + * j_s.d  [blink]
>> + * setlo  r0,r0,r1
>> + */
>> +bool usub_overflow (uint32_t a, uint32_t b, uint32_t *res)
>> +{
>> +  return __builtin_sub_overflow (a, b, res);
>> +}
>> +
>> +/*
>> + * sub.f  r2,r0,4321
>> + * seths  r0,4320,r0
>> + * j_s.d  [blink]
>> + * st_s   r2,[r1]
>> + */
>> +bool usubi_overflow (uint32_t a, uint32_t *res)
>> +{
>> +  return __builtin_sub_overflow (a, 4321, res);
>> +}
>> +
>> +/*
>> + * sub.f  r0,r0,r1
>> + * mov_s  r0,1
>> + * j_s.d  [blink]
>> + * mov.nv r0,0
>> + */
>> +bool sub_overflow_p (int32_t a, int32_t b, int32_t res)
>> +{
>> +  return __builtin_sub_overflow_p (a, b, res);
>> +}
>> +
>> +/*
>> + * sub.f  r0,r0,-1000
>> + * mov_s  r0,1
>> + * j_s.d  

Re: [PATCH 1/2] ARC: Use intrinsics for __builtin_add_overflow*()

2024-04-19 Thread Shahab Vahedi
Hi Claudiu,

On 9/7/23 12:15, Claudiu Zissulescu Ianculescu wrote:
> Ok.
> 
> Thank you for your contribution,
> Claudiu

Could you commit this patch?
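
As a rough illustration of the flags the new CC_V and CC_C modes track, here is a Python model of a flag-setting 32-bit add. The function name and the tuple layout are assumptions for this sketch; it describes ordinary two's-complement arithmetic rather than the exact hardware definition of add.f.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def add_with_flags(a, b):
    """Return (result, carry, overflow) for a 32-bit addition."""
    a &= MASK32
    b &= MASK32
    total = a + b
    result = total & MASK32
    # Carry: the unsigned sum does not fit in 32 bits.
    carry = total > MASK32
    # Signed overflow: both operands share a sign that the result lacks.
    overflow = bool(~(a ^ b) & (a ^ result) & SIGN32)
    return result, carry, overflow


signed_wrap = add_with_flags(0x7FFFFFFF, 1)     # INT32_MAX + 1
unsigned_wrap = add_with_flags(0xFFFFFFFF, 1)   # UINT32_MAX + 1, i.e. -1 + 1
```

The first case overflows only in the signed sense and the second only in the unsigned sense, which is why the signed and unsigned expanders need different condition-code modes.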

> 
> On Wed, Sep 6, 2023 at 3:50 PM Shahab Vahedi  
> wrote:
>>
>> This patch covers signed and unsigned additions.  The generated code
>> would be something along these lines:
>>
>> signed:
>>   add.f   r0, r1, r2
>>   b.v @label
>>
>> unsigned:
>>   add.f   r0, r1, r2
>>   b.c @label
>>
>> gcc/ChangeLog:
>>
>> * config/arc/arc-modes.def: Add CC_V mode.
>> * config/arc/predicates.md (proper_comparison_operator): Handle
>> E_CC_Vmode.
>> (equality_comparison_operator): Exclude CC_Vmode from eq/ne.
>> (cc_set_register): Handle CC_Vmode.
>> (cc_use_register): Likewise.
>> * config/arc/arc.md (addsi3_v): New insn.
>> (addvsi4): New expand.
>> (addsi3_c): New insn.
>> (uaddvsi4): New expand.
>> * config/arc/arc-protos.h (arc_gen_unlikely_cbranch): New.
>> * config/arc/arc.cc (arc_gen_unlikely_cbranch): New.
>> (get_arc_condition_code): Handle E_CC_Vmode.
>> (arc_init_reg_tables): Handle CC_Vmode.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/arc/overflow-1.c: New.
>>
>> Signed-off-by: Shahab Vahedi 
>> ---
>>  gcc/config/arc/arc-modes.def  |   1 +
>>  gcc/config/arc/arc-protos.h   |   1 +
>>  gcc/config/arc/arc.cc |  26 +-
>>  gcc/config/arc/arc.md |  49 +++
>>  gcc/config/arc/predicates.md  |  14 ++-
>>  gcc/testsuite/gcc.target/arc/overflow-1.c | 100 ++
>>  6 files changed, 187 insertions(+), 4 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/arc/overflow-1.c
>>
>> diff --git a/gcc/config/arc/arc-modes.def b/gcc/config/arc/arc-modes.def
>> index 763e880317d..69eeec5935a 100644
>> --- a/gcc/config/arc/arc-modes.def
>> +++ b/gcc/config/arc/arc-modes.def
>> @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
>>
>>  CC_MODE (CC_ZN);
>>  CC_MODE (CC_Z);
>> +CC_MODE (CC_V);
>>  CC_MODE (CC_C);
>>  CC_MODE (CC_FP_GT);
>>  CC_MODE (CC_FP_GE);
>> diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
>> index 4f2db7ffb59..bc78fb0b370 100644
>> --- a/gcc/config/arc/arc-protos.h
>> +++ b/gcc/config/arc/arc-protos.h
>> @@ -50,6 +50,7 @@ extern bool arc_check_mov_const (HOST_WIDE_INT );
>>  extern bool arc_split_mov_const (rtx *);
>>  extern bool arc_can_use_return_insn (void);
>>  extern bool arc_split_move_p (rtx *);
>> +extern void arc_gen_unlikely_cbranch (enum rtx_code, machine_mode, rtx);
>>  #endif /* RTX_CODE */
>>
>>  extern bool arc_ccfsm_branch_deleted_p (void);
>> diff --git a/gcc/config/arc/arc.cc b/gcc/config/arc/arc.cc
>> index f8c9bf17e2c..ec93d40aeb9 100644
>> --- a/gcc/config/arc/arc.cc
>> +++ b/gcc/config/arc/arc.cc
>> @@ -1538,6 +1538,13 @@ get_arc_condition_code (rtx comparison)
>> case GEU : return ARC_CC_NC;
>> default : gcc_unreachable ();
>> }
>> +case E_CC_Vmode:
>> +  switch (GET_CODE (comparison))
>> +   {
>> +   case EQ : return ARC_CC_NV;
>> +   case NE : return ARC_CC_V;
>> +   default : gcc_unreachable ();
>> +   }
>>  case E_CC_FP_GTmode:
>>if (TARGET_ARGONAUT_SET && TARGET_SPFP)
>> switch (GET_CODE (comparison))
>> @@ -1868,7 +1875,7 @@ arc_init_reg_tables (void)
>>   /* mode_class hasn't been initialized yet for EXTRA_CC_MODES, so
>>  we must explicitly check for them here.  */
>>   if (i == (int) CCmode || i == (int) CC_ZNmode || i == (int) CC_Zmode
>> - || i == (int) CC_Cmode
>> + || i == (int) CC_Cmode || i == (int) CC_Vmode
>>   || i == CC_FP_GTmode || i == CC_FP_GEmode || i == CC_FP_ORDmode
>>   || i == CC_FPUmode || i == CC_FPUEmode || i == CC_FPU_UNEQmode)
>> arc_mode_class[i] = 1 << (int) C_MODE;
>> @@ -11852,6 +11859,23 @@ arc_libm_function_max_error (unsigned cfn, machine_mode mode,
>>return default_libm_function_max_error (cfn, mode, boundary_p);
>>  }
>>
>> +/* Generate RTL for conditional branch with rtx comparison CODE in mode
>> +   CC_MODE.  */
>> +
>> +void
>> +arc_gen_unlikely_cbranch (enum rtx_code cmp, machine_mode cc_mode, rtx label)
>> +{
>> +  rtx cc_reg, x;
>> +
>> +  cc_reg = gen_rtx_REG (cc_mode, CC_REG);
>> +  label = gen_rtx_LABEL_REF (VOIDmode, label);
>> +
>> +  x = gen_rtx_fmt_ee (cmp, VOIDmode, cc_reg, const0_rtx);
>> +  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x, label, pc_rtx);
>> +
>> +  emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
>> +}
>> +
>>  #undef TARGET_USE_ANCHORS_FOR_SYMBOL_P
>>  #define TARGET_USE_ANCHORS_FOR_SYMBOL_P arc_use_anchors_for_symbol_p
>>
>> diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
>> index d37ecbf4292..9d011f6b4a9 100644
>> --- a/gcc/config/arc/arc.md
>> +++ b/gcc/config/arc/arc.md
>> @@ -2725,6 +2725,55 @@ 

Re: [PATCH] rtlanal: Fix set_noop_p for volatile loads or stores [PR114768]

2024-04-19 Thread Richard Biener
On Fri, 19 Apr 2024, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase, combine propagates the mem/v load into mem store
> with the same address and then removes it, because noop_move_p says it is a
> no-op move.  If it was the other way around, i.e. mem/v store and mem load,
> or if both were mem/v, it would be kept.
> The problem is that rtx_equal_p never checks any kind of flags on the rtxes
> (and I think it would be quite dangerous to change it at this point), and
> set_noop_p checks side_effects_p on just one of the operands, not both.
> In the MEM <- MEM set, it only checks it on the destination, in
> store to ZERO_EXTRACT only checks it on the source.
> 
> The following patch adds the missing side_effects_p checks.
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2024-04-19  Jakub Jelinek  
> 
>   PR rtl-optimization/114768
>   * rtlanal.cc (set_noop_p): Don't return true for MEM <- MEM
>   sets if src has side-effects or for stores into ZERO_EXTRACT
>   if ZERO_EXTRACT operand has side-effects.
> 
>   * gcc.dg/pr114768.c: New test.
> 
> --- gcc/rtlanal.cc.jj 2024-02-24 12:45:28.674249100 +0100
> +++ gcc/rtlanal.cc    2024-04-18 15:09:55.199499083 +0200
> @@ -1637,12 +1637,15 @@ set_noop_p (const_rtx set)
>  return true;
>  
>if (MEM_P (dst) && MEM_P (src))
> -return rtx_equal_p (dst, src) && !side_effects_p (dst);
> +return (rtx_equal_p (dst, src)
> + && !side_effects_p (dst)
> + && !side_effects_p (src));
>  
>if (GET_CODE (dst) == ZERO_EXTRACT)
> -return rtx_equal_p (XEXP (dst, 0), src)
> -&& !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
> -&& !side_effects_p (src);
> +return (rtx_equal_p (XEXP (dst, 0), src)
> + && !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
> + && !side_effects_p (src)
> + && !side_effects_p (XEXP (dst, 0)));
>  
>if (GET_CODE (dst) == STRICT_LOW_PART)
>  dst = XEXP (dst, 0);
> --- gcc/testsuite/gcc.dg/pr114768.c.jj   2024-04-18 15:37:49.139433678 +0200
> +++ gcc/testsuite/gcc.dg/pr114768.c   2024-04-18 15:43:30.389730365 +0200
> @@ -0,0 +1,10 @@
> +/* PR rtl-optimization/114768 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-rtl-final" } */
> +/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { nvptx*-*-* } } } } } */
> +
> +void
> +foo (int *p)
> +{
> +  *p = *(volatile int *) p;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] libgcc: Another __divmodbitint4 bug fix [PR114762]

2024-04-19 Thread Richard Biener
On Fri, 19 Apr 2024, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled because the code to decrement
> vn on negative value with all ones in most significant limb (even partial)
> and 0 in most significant bit of the second most significant limb doesn't
> take into account the case where all bits below the most significant limb
> are zero.  This has been a problem both in the version before yesterday's
> commit where it has been done only if un was one shorter than vn before this
> decrement, and is now a problem even more often when it is done earlier.
> When we decrement vn in such case and negate it, we end up with all 0s in
> the v2 value, so have both the problems with UB on __builtin_clz* and the
> expectations of the algorithm that the divisor has most significant bit set
> after shifting, plus when the decremented vn is 1 it can SIGFPE on division
> by zero even when it is not division by zero etc.  Other values shouldn't
> get 0 in the new most significant limb after negation, because the
> bitint_reduce_prec canonicalization should reduce prec if the second most
> significant limb is all ones and if that limb is all zeros, if at least
> one limb below it is non-zero, carry in will make it non-zero.
> 
> The following patch fixes it by checking if at least one bit below the
> most significant limb is non-zero, in that case it decrements, otherwise
> it will do nothing (but e.g. for the un < vn case that also means the
> divisor is large enough that the result should be q 0 r u).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2024-04-19  Jakub Jelinek  
> 
>   PR libgcc/114762
>   * libgcc2.c (__divmodbitint4): Don't decrement vn if all bits
>   below the most significant limb are zero.
> 
>   * gcc.dg/torture/bitint-70.c: New test.
> 
> --- libgcc/libgcc2.c.jj   2024-04-18 09:48:55.172538667 +0200
> +++ libgcc/libgcc2.c  2024-04-18 12:17:28.893616007 +0200
> @@ -1715,11 +1715,18 @@ __divmodbitint4 (UBILtype *q, SItype qpr
>&& vn > 1
>&& (Wtype) v[BITINT_END (1, vn - 2)] >= 0)
>  {
> -  vp = 0;
> -  --vn;
> +  /* Unless all bits below the most significant limb are zero.  */
> +  SItype vn2;
> +  for (vn2 = vn - 2; vn2 >= 0; --vn2)
> + if (v[BITINT_END (vn - 1 - vn2, vn2)])
> +   {
> + vp = 0;
> + --vn;
>  #if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
> -  ++v;
> + ++v;
>  #endif
> + break;
> +   }
>  }
>if (__builtin_expect (un < vn, 0))
>  {
> --- gcc/testsuite/gcc.dg/torture/bitint-70.c.jj   2024-04-18 12:26:09.406383158 +0200
> +++ gcc/testsuite/gcc.dg/torture/bitint-70.c  2024-04-18 12:26:57.253718287 +0200
> @@ -0,0 +1,22 @@
> +/* PR libgcc/114762 */
> +/* { dg-do run { target bitint } } */
> +/* { dg-options "-std=c23" } */
> +/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
> +/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
> +
> +#if __BITINT_MAXWIDTH__ >= 255
> +__attribute__((__noipa__)) signed _BitInt(255)
> +foo (signed _BitInt(255) a, signed _BitInt(65) b)
> +{
> +  return a / b;
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#if __BITINT_MAXWIDTH__ >= 255
> +  if (foo (1, -0xwb - 1wb))
> +__builtin_abort ();
> +#endif
> +}
> 
>   Jakub
> 
> 



[PATCH] gcc-13/changes.html (LoongArch): Fix link.

2024-04-19 Thread Lulu Cheng
---
 htdocs/gcc-13/changes.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 4384c329..15a309d6 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -625,7 +625,7 @@ You may also want to check out our
   The new command-line option -mdirect-extern-access can 
be used
  to prevent accessing external symbols through GOT.
   
-  The new variable attribute https://gcc.gnu.org/onlinedocs/gcc/LoongArch-Variable-Attributes.html#LoongArch-Variable-Attributes;>model
+  The new variable attribute https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/LoongArch-Variable-Attributes.html#LoongArch-Variable-Attributes;>model
   has been added.
   
 
-- 
2.39.3



[PATCH] rtlanal: Fix set_noop_p for volatile loads or stores [PR114768]

2024-04-19 Thread Jakub Jelinek
Hi!

On the following testcase, combine propagates the mem/v load into mem store
with the same address and then removes it, because noop_move_p says it is a
no-op move.  If it was the other way around, i.e. mem/v store and mem load,
or if both were mem/v, it would be kept.
The problem is that rtx_equal_p never checks any kind of flags on the rtxes
(and I think it would be quite dangerous to change it at this point), and
set_noop_p checks side_effects_p on just one of the operands, not both.
In the MEM <- MEM set, it only checks it on the destination, in
store to ZERO_EXTRACT only checks it on the source.
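
To make the MEM <- MEM case concrete, here is a toy Python model of the check before and after the fix. It is only a sketch: the Mem class and the helper names are invented for illustration, side_effects_p is reduced to "the volatile flag is set", and structural equality ignores that flag in the way rtx_equal_p is described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mem:
    address: str
    volatile: bool = False   # stands in for the /v flag on a MEM


def rtx_equal(a, b):
    """Compare structure only; flags such as volatile are ignored."""
    return a.address == b.address


def side_effects(x):
    """Greatly simplified: only volatility counts as a side effect here."""
    return x.volatile


def set_noop_old(dst, src):
    # Old check: only the destination is tested for side effects.
    return rtx_equal(dst, src) and not side_effects(dst)


def set_noop_new(dst, src):
    # Fixed check: neither operand may have side effects.
    return (rtx_equal(dst, src)
            and not side_effects(dst)
            and not side_effects(src))


# *p = *(volatile int *) p;  plain store, volatile load, same address
dst, src = Mem("p"), Mem("p", volatile=True)
old_verdict = set_noop_old(dst, src)   # wrongly treated as a no-op
new_verdict = set_noop_new(dst, src)   # kept, the volatile load survives
```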

The following patch adds the missing side_effects_p checks.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-19  Jakub Jelinek  

PR rtl-optimization/114768
* rtlanal.cc (set_noop_p): Don't return true for MEM <- MEM
sets if src has side-effects or for stores into ZERO_EXTRACT
if ZERO_EXTRACT operand has side-effects.

* gcc.dg/pr114768.c: New test.

--- gcc/rtlanal.cc.jj   2024-02-24 12:45:28.674249100 +0100
+++ gcc/rtlanal.cc  2024-04-18 15:09:55.199499083 +0200
@@ -1637,12 +1637,15 @@ set_noop_p (const_rtx set)
 return true;
 
   if (MEM_P (dst) && MEM_P (src))
-return rtx_equal_p (dst, src) && !side_effects_p (dst);
+return (rtx_equal_p (dst, src)
+   && !side_effects_p (dst)
+   && !side_effects_p (src));
 
   if (GET_CODE (dst) == ZERO_EXTRACT)
-return rtx_equal_p (XEXP (dst, 0), src)
-  && !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
-  && !side_effects_p (src);
+return (rtx_equal_p (XEXP (dst, 0), src)
+   && !BITS_BIG_ENDIAN && XEXP (dst, 2) == const0_rtx
+   && !side_effects_p (src)
+   && !side_effects_p (XEXP (dst, 0)));
 
   if (GET_CODE (dst) == STRICT_LOW_PART)
 dst = XEXP (dst, 0);
--- gcc/testsuite/gcc.dg/pr114768.c.jj  2024-04-18 15:37:49.139433678 +0200
+++ gcc/testsuite/gcc.dg/pr114768.c 2024-04-18 15:43:30.389730365 +0200
@@ -0,0 +1,10 @@
+/* PR rtl-optimization/114768 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-final" } */
+/* { dg-final { scan-rtl-dump "\\\(mem/v:" "final" { target { ! { nvptx*-*-* } } } } } */
+
+void
+foo (int *p)
+{
+  *p = *(volatile int *) p;
+}

Jakub



[PATCH] libgcc: Another __divmodbitint4 bug fix [PR114762]

2024-04-19 Thread Jakub Jelinek
Hi!

The following testcase is miscompiled because the code to decrement
vn on negative value with all ones in most significant limb (even partial)
and 0 in most significant bit of the second most significant limb doesn't
take into account the case where all bits below the most significant limb
are zero.  This has been a problem both in the version before yesterday's
commit where it has been done only if un was one shorter than vn before this
decrement, and is now a problem even more often when it is done earlier.
When we decrement vn in such case and negate it, we end up with all 0s in
the v2 value, so have both the problems with UB on __builtin_clz* and the
expectations of the algorithm that the divisor has most significant bit set
after shifting, plus when the decremented vn is 1 it can SIGFPE on division
by zero even when it is not division by zero etc.  Other values shouldn't
get 0 in the new most significant limb after negation, because the
bitint_reduce_prec canonicalization should reduce prec if the second most
significant limb is all ones and if that limb is all zeros, if at least
one limb below it is non-zero, carry in will make it non-zero.
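
To make the failure mode concrete, here is a minimal sketch of two's
complement negation on limbs.  It is not the libgcc code: the Limbs alias and
the negate_limbs helper are hypothetical, and the sketch assumes full 64-bit
limbs stored least significant first rather than the BITINT_END indexing used
in libgcc2.c.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical representation: limbs[0] is the least significant limb.
using Limbs = std::vector<std::uint64_t>;

// Two's complement negation: invert every limb and add one, propagating
// the carry from the least significant limb upwards.
Limbs negate_limbs(Limbs v)
{
  assert(!v.empty());
  std::uint64_t carry = 1;
  for (std::uint64_t& limb : v)
    {
      limb = ~limb + carry;
      // The carry survives only if the addition wrapped around to zero.
      carry = (carry != 0 && limb == 0) ? 1 : 0;
    }
  return v;
}
```

With { 5, ~0 } (low limb 5, top limb all ones) the negated top limb is 0, so
the magnitude fits in one limb.  With { 0, ~0 } the carry reaches the top limb
and the result is { 0, 1 }, so dropping the top limb would leave only zeros.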

The following patch fixes it by checking whether at least one bit below the
most significant limb is non-zero.  If so, it decrements vn as before;
otherwise it does nothing (but e.g. for the un < vn case that also means the
divisor is large enough that the result should be quotient 0, remainder u).
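
The combined condition can be sketched in isolation as follows.  The helper
name can_drop_top_limb is hypothetical, and the sketch assumes full 64-bit
limbs stored least significant first, so the partial-limb handling and the
BITINT_END indexing of the real code are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: v[0] is the least significant limb of a negative
// value.  Returns true when the top limb carries only sign bits and the
// negated value is known to fit in one limb fewer.
bool can_drop_top_limb(const std::vector<std::uint64_t>& v)
{
  const std::size_t n = v.size();
  if (n < 2)
    return false;
  // The top limb must be all ones.
  if (v[n - 1] != ~std::uint64_t{0})
    return false;
  // The second most significant limb must have its top bit clear.
  if ((v[n - 2] >> 63) != 0)
    return false;
  // The condition the patch adds: at least one bit below the top limb is
  // set.  Otherwise negation carries into the top limb.
  return std::any_of(v.begin(), v.end() - 1,
                     [](std::uint64_t limb) { return limb != 0; });
}
```

For { 0, ~0 }, which is -2^64 in this layout, the helper returns false, while
{ 5, ~0 } still allows the shorter form.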

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-04-19  Jakub Jelinek  

PR libgcc/114762
* libgcc2.c (__divmodbitint4): Don't decrement vn if all bits
below the most significant limb are zero.

* gcc.dg/torture/bitint-70.c: New test.

--- libgcc/libgcc2.c.jj 2024-04-18 09:48:55.172538667 +0200
+++ libgcc/libgcc2.c 2024-04-18 12:17:28.893616007 +0200
@@ -1715,11 +1715,18 @@ __divmodbitint4 (UBILtype *q, SItype qpr
   && vn > 1
   && (Wtype) v[BITINT_END (1, vn - 2)] >= 0)
 {
-  vp = 0;
-  --vn;
+  /* Unless all bits below the most significant limb are zero.  */
+  SItype vn2;
+  for (vn2 = vn - 2; vn2 >= 0; --vn2)
+   if (v[BITINT_END (vn - 1 - vn2, vn2)])
+ {
+   vp = 0;
+   --vn;
 #if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
-  ++v;
+   ++v;
 #endif
+   break;
+ }
 }
   if (__builtin_expect (un < vn, 0))
 {
--- gcc/testsuite/gcc.dg/torture/bitint-70.c.jj 2024-04-18 12:26:09.406383158 +0200
+++ gcc/testsuite/gcc.dg/torture/bitint-70.c 2024-04-18 12:26:57.253718287 +0200
@@ -0,0 +1,22 @@
+/* PR libgcc/114762 */
+/* { dg-do run { target bitint } } */
+/* { dg-options "-std=c23" } */
+/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
+/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
+
+#if __BITINT_MAXWIDTH__ >= 255
+__attribute__((__noipa__)) signed _BitInt(255)
+foo (signed _BitInt(255) a, signed _BitInt(65) b)
+{
+  return a / b;
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 255
+  if (foo (1, -0xwb - 1wb))
+__builtin_abort ();
+#endif
+}

Jakub



Re: [PATCH] libstdc++: Support link chains in std::chrono::tzdb::locate_zone [PR114770]

2024-04-19 Thread Richard Biener
On Thu, Apr 18, 2024 at 6:34 PM Jonathan Wakely  wrote:
>
> This would fix the bug, how do people feel about it this close to the
> gcc-14 release?

Guess we'll have to fix it anyway, so why not now ... (what could go wrong..)

Richard.

> Tested x86_64-linux.
>
> -- >8 --
>
> Since 2022 the TZif format defined in the zic(8) man page has said that
> links can refer to other links, rather than only referring to a zone.
> This isn't supported by the C++20 spec, which assumes that the target()
> for a chrono::time_zone_link always names a chrono::time_zone, not
> another chrono::time_zone_link.
>
> This hasn't been a problem until now, because there are no entries in
> the tzdata file that chain links together. However, Debian Sid has
> changed the target of the Asia/Chungking link from the Asia/Shanghai
> zone to the Asia/Chongqing link, creating a link chain. The libstdc++
> code is unable to handle this, so chrono::locate_zone("Asia/Chungking")
> will fail with the tzdata.zi file from Debian Sid.
>
> It seems likely that the C++ spec will need a change to allow link
> chains, so that the original structure of the IANA database can be fully
> represented by chrono::tzdb. The alternative would be for chrono::tzdb
> to flatten all chains when loading the data, so that a link's target is
> always a zone, but this means throwing away information present in the
> tzdata.zi input file.
>
> In anticipation of a change to the spec, this commit adds support for
> chained links to libstdc++. When a name is found to be a link, we try to
> find its target in the list of zones as before, but now if the target
> isn't the name of a zone we don't fail. Instead we look for another link
> with that name, and keep doing that until we reach the end of the chain
> of links, and then look up the last target as a zone.
>
> This new logic would get stuck in a loop if the tzdata.zi file is buggy
> and defines a link chain that contains a cycle, e.g. two links that
> refer to each other. To deal with that unlikely case, we use the
> tortoise and hare algorithm to detect cycles in link chains, and throw
> an exception if we detect a cycle. Cycles in links should never happen,
> and it is expected that link chains will be short (if they occur at all)
> and so the code is optimized for short chains without cycles. Longer
> chains (four or more links) and cycles will do more work, but won't fail
> to resolve a chain or get stuck in a loop.
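
A minimal sketch of tortoise-and-hare cycle detection on a link chain follows.
It is a simplified illustration, not the code in the patch: the names
ZoneSet, LinkMap, next_target and resolve_checked are hypothetical, links are
plain strings, and the sketch does not try to avoid duplicate lookups for
short chains the way the patch does.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <stdexcept>
#include <string>

using ZoneSet = std::set<std::string>;
using LinkMap = std::map<std::string, std::string>; // link name -> target

// Target of the link called `name`, or nullptr if there is no such link.
const std::string* next_target(const LinkMap& links, const std::string& name)
{
  auto it = links.find(name);
  return it == links.end() ? nullptr : &it->second;
}

// Resolves `name` to a zone name; throws std::runtime_error on a cycle.
std::optional<std::string>
resolve_checked(const ZoneSet& zones, const LinkMap& links,
                const std::string& name)
{
  if (zones.count(name) != 0)
    return name;
  const std::string* tortoise = next_target(links, name);
  if (tortoise == nullptr)
    return std::nullopt;
  const std::string* hare = tortoise;
  for (;;)
    {
      // The hare advances two steps, checking for the end of the chain
      // before each step.
      for (int step = 0; step < 2; ++step)
        {
          if (zones.count(*hare) != 0)
            return *hare;
          hare = next_target(links, *hare);
          if (hare == nullptr)
            return std::nullopt; // dangling target
        }
      // The tortoise advances one step.  It cannot run off the chain,
      // because the hare has already visited everything ahead of it.
      tortoise = next_target(links, *tortoise);
      if (*tortoise == *hare)
        throw std::runtime_error("cycle in time zone links");
    }
}
```

Two links that refer to each other make the hare meet the tortoise on the
second pass through the loop, while an acyclic chain returns as soon as the
hare reaches a zone.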
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/114770
> * src/c++20/tzdb.cc (do_locate_zone): Support links that have
> another link as their target.
> * testsuite/std/time/tzdb/links.cc: New test.
> ---
>  libstdc++-v3/src/c++20/tzdb.cc                |  57 -
>  libstdc++-v3/testsuite/std/time/tzdb/links.cc | 215 ++
>  2 files changed, 268 insertions(+), 4 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/std/time/tzdb/links.cc
>
> diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc
> index 639d1c440ba..c7c7cc9deee 100644
> --- a/libstdc++-v3/src/c++20/tzdb.cc
> +++ b/libstdc++-v3/src/c++20/tzdb.cc
> @@ -1599,7 +1599,7 @@ namespace std::chrono
>  const time_zone*
>  do_locate_zone(const vector<time_zone>& zones,
>const vector<time_zone_link>& links,
> -  string_view tz_name) noexcept
> +  string_view tz_name)
>  {
>// Lambda mangling changed between -fabi-version=2 and -fabi-version=18
>auto search = []<class Vec>(const Vec& v, string_view name) {
> @@ -1610,13 +1610,62 @@ namespace std::chrono
> return ptr;
>};
>
> +  // Search zones first.
>if (auto tz = search(zones, tz_name))
> return tz;
>
> +  // Search links second.
>if (auto tz_l = search(links, tz_name))
> -   return search(zones, tz_l->target());
> +   {
> + // Handle the common case of a link that has a zone as the target.
> + if (auto tz = search(zones, tz_l->target())) [[likely]]
> +   return tz;
>
> -  return nullptr;
> + // Either tz_l->target() doesn't exist, or we have a chain of links.
> + // Use Floyd's cycle-finding algorithm to avoid infinite loops,
> + // at the cost of extra lookups. In the common case we expect a
> + // chain of links to be short so the loop won't run many times.
> + // In particular, the duplicate lookups to move the tortoise
> + // never happen unless the chain has four or more links.
> + // When a chain contains a cycle we do multiple duplicate lookups,
> + // but that case should never happen with correct tzdata.zi,
> + // so there's no need to optimize cycle detection.
> +
> + const time_zone_link* tortoise = tz_l;
> + const time_zone_link* hare = search(links, tz_l->target());
> + while (hare)
> +   {
> + // Chains should be short, so first check if it ends here:
> + if