Re: [11 PATCH] libiberty, Darwin: Fix a build warning. [PR112823]

2023-12-01 Thread Sam James


Iain Sandoe  writes:

> Hi Sam,

Hi Iain,

>
> I think this qualifies as obvious (it’s on my list, but I did not get to it yet,
> so go ahead).

Thanks. I can't push it myself - could you do that for me?

thanks again,
sam

>
> Iain
>
>> On 2 Dec 2023, at 05:30, Sam James  wrote:
>> 
>> From: Iain Sandoe 
>> 
>> r12-3005-g220c410162ebece4f missed a cast for the set_32 call.
>> Fixed thus.
>> 
>> Signed-off-by: Iain Sandoe 
>> Signed-off-by: Sam James 
>> 
>> libiberty/ChangeLog:
>>  PR other/112823
>>  * simple-object-mach-o.c (simple_object_mach_o_write_segment):
>>  Cast the first argument to set_32 as needed.
>> 
>> (cherry picked from commit 38757aa88735ab2e511bc428e2407a5a5e9fa0be)
>> ---
>> libiberty/simple-object-mach-o.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/libiberty/simple-object-mach-o.c b/libiberty/simple-object-mach-o.c
>> index 72b69d19c216..a8869e7c6395 100644
>> --- a/libiberty/simple-object-mach-o.c
>> +++ b/libiberty/simple-object-mach-o.c
>> @@ -1228,7 +1228,7 @@ simple_object_mach_o_write_segment (simple_object_write *sobj, int descriptor,
>>   /* Swap the indices, if required.  */
>> 
>>   for (i = 0; i < (nsects_in * 4); ++i)
>> -set_32 (&index[i], index[i]);
>> +set_32 ((unsigned char *) &index[i], index[i]);
>> 
>>   sechdr_offset += sechdrsize;
>> 
>> -- 
>> 2.43.0
>> 



Re: [11 PATCH] libiberty, Darwin: Fix a build warning. [PR112823]

2023-12-01 Thread Iain Sandoe
Hi Sam,

I think this qualifies as obvious (it’s on my list, but I did not get to it yet,
so go ahead).

Iain

> On 2 Dec 2023, at 05:30, Sam James  wrote:
> 
> From: Iain Sandoe 
> 
> r12-3005-g220c410162ebece4f missed a cast for the set_32 call.
> Fixed thus.
> 
> Signed-off-by: Iain Sandoe 
> Signed-off-by: Sam James 
> 
> libiberty/ChangeLog:
>   PR other/112823
>   * simple-object-mach-o.c (simple_object_mach_o_write_segment):
>   Cast the first argument to set_32 as needed.
> 
> (cherry picked from commit 38757aa88735ab2e511bc428e2407a5a5e9fa0be)
> ---
> libiberty/simple-object-mach-o.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libiberty/simple-object-mach-o.c b/libiberty/simple-object-mach-o.c
> index 72b69d19c216..a8869e7c6395 100644
> --- a/libiberty/simple-object-mach-o.c
> +++ b/libiberty/simple-object-mach-o.c
> @@ -1228,7 +1228,7 @@ simple_object_mach_o_write_segment (simple_object_write *sobj, int descriptor,
>   /* Swap the indices, if required.  */
> 
>   for (i = 0; i < (nsects_in * 4); ++i)
> - set_32 (&index[i], index[i]);
> + set_32 ((unsigned char *) &index[i], index[i]);
> 
>   sechdr_offset += sechdrsize;
> 
> -- 
> 2.43.0
> 



[PATCH 3/3] MATCH: (convert)(zero_one !=/== 0/1) for outer type and zero_one type are the same

2023-12-01 Thread Andrew Pinski
When I moved two_value to match.pd, I removed the check for the {0,+-1}
case, as I had placed it after the {0,+-1} case for cond in match.pd.
For {0,+-1} with a non-boolean type, we previously optimized those
cases to just `(convert)a`, but afterwards we would get `(convert)(a != 0)`,
which was not simplified any further to `(convert)a`.
So this adds a pattern to match `(convert)(zeroone != 0)` and simplify it
to `(convert)zeroone`.

This also optimizes (convert)(zeroone == 0) into (zeroone^1) if the
types match. This can only be done at the gimple level because, if zeroone
was defined by (a&1), fold will convert (a&1)^1 back into
`(convert)(zeroone == 0)` and an infinite loop will happen.
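
To illustrate the two rewrites above, here is a minimal C sketch. The helper
names are made up for this note and are not part of the patch. It uses
`a & 1` as a stand-in for any value known to be 0 or 1, and checks that each
source form agrees with the form the pattern is meant to produce:

```c
#include <assert.h>

/* Hypothetical helpers, not part of the patch.  `a & 1` is always 0 or 1,
   so it plays the role of a zero_one_valued_p operand.  */

/* (int)((a & 1) != 0): the form before the simplification.  */
static int ne_zero_before (int a) { return (a & 1) != 0; }

/* (int)(a & 1): the form the pattern is meant to produce.  */
static int ne_zero_after (int a) { return a & 1; }

/* (int)((a & 1) == 0): the form before the simplification.  */
static int eq_zero_before (int a) { return (a & 1) == 0; }

/* (a & 1) ^ 1: the form the pattern is meant to produce.  */
static int eq_zero_after (int a) { return (a & 1) ^ 1; }

/* Assert that both pairs agree for every value in [lo, hi].  */
static void
check_forms (int lo, int hi)
{
  for (int a = lo; a <= hi; ++a)
    {
      assert (ne_zero_before (a) == ne_zero_after (a));
      assert (eq_zero_before (a) == eq_zero_after (a));
    }
}
```

Calling `check_forms (-1000, 1000)` exercises both rewrites on odd, even and
negative inputs.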

Note the testcase pr69270.c needed a slight update because a scan pattern
no longer matched exactly. The update makes it more robust: it matches both
before and after this patch, and should survive other changes in this area too.

Note the testcase gcc.target/i386/pr110790-2.c needs a slight update
for better code generation in LP64 mode.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

PR tree-optimization/111972
PR tree-optimization/110637
* match.pd (`(convert)(zeroone !=/== CST)`): Match
and simplify to ((convert)zeroone){,^1}.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr110637-1.c: New test.
* gcc.dg/tree-ssa/pr110637-2.c: New test.
* gcc.dg/tree-ssa/pr110637-3.c: New test.
* gcc.dg/tree-ssa/pr111972-1.c: New test.
* gcc.dg/tree-ssa/pr69270.c: Update testcase.
* gcc.target/i386/pr110790-2.c: Update testcase.

Signed-off-by: Andrew Pinski 
---
 gcc/match.pd   | 21 +
 gcc/testsuite/gcc.dg/tree-ssa/pr110637-1.c | 10 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr110637-2.c | 13 +
 gcc/testsuite/gcc.dg/tree-ssa/pr110637-3.c | 14 +
 gcc/testsuite/gcc.dg/tree-ssa/pr111972-1.c | 34 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr69270.c|  4 +--
 gcc/testsuite/gcc.target/i386/pr110790-2.c | 16 --
 7 files changed, 108 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110637-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110637-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110637-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr111972-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 4d554ba4721..656b2c9edda 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3332,6 +3332,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) || POINTER_TYPE_P (TREE_TYPE (@0)))
 (rcmp @0 @1
 
+/* (type)([0,1]@a != 0) -> (type)a
+   (type)([0,1]@a == 1) -> (type)a
+   (type)([0,1]@a == 0) -> a ^ 1
+   (type)([0,1]@a != 1) -> a ^ 1.  */
+(for eqne (eq ne)
+ (simplify
+  (convert (eqne zero_one_valued_p@0 INTEGER_CST@1))
+  (if ((integer_zerop (@1) || integer_onep (@1)))
+   (if ((eqne == EQ_EXPR) ^ integer_zerop (@1))
+(convert @0)
+   /* a^1 can only be produced for gimple as
+  fold has the exact opposite transformation
+  for `(X & 1) ^ 1`.
+  See `Fold ~X & 1 as (X & 1) == 0.`
+  and `Fold (X ^ 1) & 1 as (X & 1) == 0.` in fold-const.cc.
+  Only do this if the types match as (type)(a == 0) is
+  canonical form normally, while `a ^ 1` is canonical when
+  there is no type change. */
+   (if (GIMPLE && types_match (type, TREE_TYPE (@0)))
+(bit_xor @0 { build_one_cst (type); } ))
+
 /* We can't reassociate at all for saturating types.  */
 (if (!TYPE_SATURATING (type))
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr110637-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr110637-1.c
new file mode 100644
index 000..3d03b0992a4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr110637-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized" } */
+int f(int a)
+{
+int b = (a & 1)!=0;
+return b;
+}
+
+/* This should be optimized to just return (a & 1); */
+/* { dg-final { scan-tree-dump-not " == " "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr110637-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr110637-2.c
new file mode 100644
index 000..f1c5b90353a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr110637-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized" } */
+int f(int a)
+{
+int b = a & 1;
+int c = b == 0;
+return c;
+}
+
+/* This should be optimized to just return `(a&1) ^ 1` or `(~a) & 1`. */
+/* { dg-final { scan-tree-dump-not " == " "optimized"} } */
+/* { dg-final { scan-tree-dump-times "~a" 1 "optimized"} } */
+/* { dg-final { scan-tree-dump-times " & 1" 1 "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr110637-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr110637-3.c
new file mode 100644
index 000..ce80146d9df
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr110637-3.c
@@ 

[PATCH 0/3] Fix PR 111972

2023-12-01 Thread Andrew Pinski
This patch set fixes PR 111972 and the fallout from it.

The first patch is a fix to zero_one_valued_p's convert pattern
which I hit while running the testsuite with the last patch.

The second patch is a fix for -fanalyzer which I had implemented with
a different version of the 3rd patch while I was working at Marvell.

And the 3rd patch fixes the issue by having the following as
canonical forms:
* `a ^ 1` is the canonical form of `(convert_back)(zero_one == 0)` (and
  `(convert_back)(zero_one != 1)`).
* `(convert)a` is the canonical form of `(convert)(zero_one != 0)` (and
  `(convert)(zero_one == 1)`).

Signed-off-by: Andrew Pinski 


Andrew Pinski (3):
  MATCH: Fix zero_one_valued_p's convert pattern
  Remove check of unsigned_char in
maybe_undo_optimize_bit_field_compare.
  MATCH: (convert)(zero_one !=/== 0/1) for outer type and zero_one type
are the same

 gcc/analyzer/region-model-manager.cc   |  3 --
 gcc/match.pd   | 24 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr110637-1.c | 10 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr110637-2.c | 13 +
 gcc/testsuite/gcc.dg/tree-ssa/pr110637-3.c | 14 +
 gcc/testsuite/gcc.dg/tree-ssa/pr111972-1.c | 34 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr69270.c|  4 +--
 gcc/testsuite/gcc.target/i386/pr110790-2.c | 16 --
 8 files changed, 111 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110637-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110637-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110637-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr111972-1.c

-- 
2.34.1



[PATCH 2/3] Remove check of unsigned_char in maybe_undo_optimize_bit_field_compare.

2023-12-01 Thread Andrew Pinski
From: Andrew Pinski 

The check for the type seems unnecessary and sometimes gets in the way.
Also, with a patch I am working on for match.pd, it causes a failure.
Before my patch the IR was:
  _1 = BIT_FIELD_REF ;
  _2 = _1 & 1;
  _3 = _2 != 0;
  _4 = (int) _3;
  __analyzer_eval (_4);

Where _2 had an unsigned char type.
After my patch we have:
  _1 = BIT_FIELD_REF ;
  _2 = (int) _1;
  _3 = _2 & 1;
  __analyzer_eval (_3);

In this case, however, the BIT_AND_EXPR has an int type.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/analyzer/ChangeLog:

* region-model-manager.cc (maybe_undo_optimize_bit_field_compare):
Remove the check for type being unsigned_char_type_node.
---
 gcc/analyzer/region-model-manager.cc | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc
index 921edc55868..9a17b9d2878 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -586,9 +586,6 @@ maybe_undo_optimize_bit_field_compare (tree type,
   tree cst,
   const svalue *arg1)
 {
-  if (type != unsigned_char_type_node)
-return NULL;
-
   const binding_map &map = compound_sval->get_map ();
   unsigned HOST_WIDE_INT mask = TREE_INT_CST_LOW (cst);
   /* If "mask" is a contiguous range of set bits, see if the
-- 
2.39.3



[PATCH 1/3] MATCH: Fix zero_one_valued_p's convert pattern

2023-12-01 Thread Andrew Pinski
While working on PR 111972, I was getting a regression
because zero_one_valued_p's convert pattern matched a conversion
to a signed 1-bit integer. This patch fixes that by checking
the outer type too.
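
As background on why a signed 1-bit integer has to be excluded, here is a
small C sketch using bit-fields. The struct and helper names are invented for
illustration, and the sketch assumes two's complement representation. A signed
1-bit field can hold only 0 and -1, so a conversion to such a type does not
yield a {0, 1} value:

```c
#include <assert.h>

/* Illustrative only: two 1-bit fields that differ in signedness.  */
struct one_bit
{
  signed int s : 1;    /* representable values: -1 and 0 */
  unsigned int u : 1;  /* representable values: 0 and 1 */
};

/* Put the signed field in its "set" or "clear" state and widen it to int.  */
static int
signed_bit_as_int (int set)
{
  struct one_bit b = { 0, 0 };
  b.s = set ? -1 : 0;
  return b.s;
}

/* Same for the unsigned field.  */
static int
unsigned_bit_as_int (int set)
{
  struct one_bit b = { 0, 0 };
  b.u = set ? 1u : 0u;
  return b.u;
}

/* The "set" state widens to -1 when signed and to 1 when unsigned.  */
static void
check_one_bit_ranges (void)
{
  assert (signed_bit_as_int (1) == -1);
  assert (signed_bit_as_int (0) == 0);
  assert (unsigned_bit_as_int (1) == 1);
  assert (unsigned_bit_as_int (0) == 0);
}
```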

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* match.pd (zero_one_valued_p): For convert
make sure type is not a signed 1-bit integer.

Signed-off-by: Andrew Pinski 
---
 gcc/match.pd | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 26383e55767..4d554ba4721 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2247,6 +2247,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
   && (TYPE_UNSIGNED (TREE_TYPE (@1))
  || TYPE_PRECISION (TREE_TYPE (@1)) > 1)
+  && INTEGRAL_TYPE_P (type)
+  && (TYPE_UNSIGNED (type)
+ || TYPE_PRECISION (type) > 1)
   && wi::leu_p (tree_nonzero_bits (@1), 1
 
 /* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 }.  */
-- 
2.39.3



Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-01 Thread Sam James


Jeff Law  writes:

> On 12/1/23 18:13, Sam James wrote:
>> 钟居哲  writes:
>> 
>>> Hi, This patch causes errors when building newlib/glibc/musl on the RISC-V port:
>>>
>>> /work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_access.c:8:40:
>>> error: passing argument 3 of 'syscall_errno' makes integer from pointer 
>>> without a cast [-Wint-conversion]
>>>  8 |   return syscall_errno (SYS_access, 2, file, mode, 0, 0, 0, 0);
>>>|^~~~
>>>||
>>>|const char *
>> This looks like an issue in newlib. We expect broken code to be broken
>> by the recent changes. Can you investigate it on the newlib side?
> A ton of stuff in newlib/libgloss is broken due to the compiler
> changes.   But that's not a big surprise -- much of the
> newlib/libgloss code is c89 and clearly wrong for c99 and newer.

Yeah, it's probably a reasonable candidate for -fpermissive to start
with until it's cleaned up.

(Also, sorry, I didn't mean my comment to appear glib. I just meant to
say "yes, this looks expected".)

>
> Jeff

thanks,
sam


[11 PATCH] libiberty, Darwin: Fix a build warning. [PR112823]

2023-12-01 Thread Sam James
From: Iain Sandoe 

r12-3005-g220c410162ebece4f missed a cast for the set_32 call.
Fixed thus.

Signed-off-by: Iain Sandoe 
Signed-off-by: Sam James 

libiberty/ChangeLog:
PR other/112823
* simple-object-mach-o.c (simple_object_mach_o_write_segment):
Cast the first argument to set_32 as needed.

(cherry picked from commit 38757aa88735ab2e511bc428e2407a5a5e9fa0be)
---
 libiberty/simple-object-mach-o.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libiberty/simple-object-mach-o.c b/libiberty/simple-object-mach-o.c
index 72b69d19c216..a8869e7c6395 100644
--- a/libiberty/simple-object-mach-o.c
+++ b/libiberty/simple-object-mach-o.c
@@ -1228,7 +1228,7 @@ simple_object_mach_o_write_segment (simple_object_write *sobj, int descriptor,
   /* Swap the indices, if required.  */
 
   for (i = 0; i < (nsects_in * 4); ++i)
-   set_32 (&index[i], index[i]);
+   set_32 ((unsigned char *) &index[i], index[i]);
 
   sechdr_offset += sechdrsize;
 
-- 
2.43.0
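
For readers unfamiliar with the warning, here is a minimal sketch of why the
cast is needed. It assumes, as the cast in the patch suggests, that set_32
takes an `unsigned char *` destination while `index` is an array of
`unsigned int`. The functions `store_be32`, `swap_indices` and
`byte_after_swap` are hypothetical stand-ins, not the libiberty functions:

```c
#include <assert.h>

/* Hypothetical stand-in for a big-endian 32-bit store routine that,
   like the callee in the patch, takes a byte pointer.  */
static void
store_be32 (unsigned char *buf, unsigned int val)
{
  buf[0] = (val >> 24) & 0xff;
  buf[1] = (val >> 16) & 0xff;
  buf[2] = (val >> 8) & 0xff;
  buf[3] = val & 0xff;
}

/* Rewrite each element of INDEX in place in big-endian byte order.
   Without the cast, passing `unsigned int *` where `unsigned char *`
   is expected is an incompatible pointer type.  */
static void
swap_indices (unsigned int *index, unsigned int count)
{
  /* This sketch assumes a 32-bit unsigned int.  */
  assert (sizeof (unsigned int) == 4);
  for (unsigned int i = 0; i < count; ++i)
    store_be32 ((unsigned char *) &index[i], index[i]);
}

/* Return byte N of VALUE as laid out in memory after the swap.  */
static unsigned char
byte_after_swap (unsigned int value, unsigned int n)
{
  unsigned int tmp[1] = { value };
  swap_indices (tmp, 1);
  return ((unsigned char *) tmp)[n];
}
```

The value is read before the store writes any bytes (it is passed by value),
so converting in place is safe.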



Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-01 Thread Jeff Law




On 12/1/23 18:13, Sam James wrote:
> 钟居哲  writes:
>
>> Hi, This patch causes errors when building newlib/glibc/musl on the RISC-V port:
>>
>> /work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_access.c:8:40:
>> error: passing argument 3 of 'syscall_errno' makes integer from pointer without
>> a cast [-Wint-conversion]
>>  8 |   return syscall_errno (SYS_access, 2, file, mode, 0, 0, 0, 0);
>>    |^~~~
>>    ||
>>    |const char *
>
> This looks like an issue in newlib. We expect broken code to be broken
> by the recent changes. Can you investigate it on the newlib side?

A ton of stuff in newlib/libgloss is broken due to the compiler changes.
But that's not a big surprise -- much of the newlib/libgloss code is
c89 and clearly wrong for c99 and newer.


Jeff


Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-01 Thread Patrick O'Neill

That failure is due to newlib files:
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../*newlib*/newlib/libm/complex/ccoshl.c: 
In function 'ccoshl':


To build gcc w/ glibc with riscv-gnu-toolchain, run make linux.

A temporary fix for newlib is here:
https://github.com/patrick-rivos/riscv-gnu-toolchain/tree/35d8e8c486bd2f6e3e2e673db8d2b979309a6de4/fixups/newlib

On 12/1/23 17:53, 钟居哲 wrote:

No. GLIBC 2.37 also failed:

make[4]: Leaving directory 
'/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/build-newlib/riscv64-unknown-elf/newlib'

  CC       libm/complex/libm_a-casinhl.o
make[3]: *** [Makefile:5283: all] Error 2
make[3]: Leaving directory 
'/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/build-newlib/riscv64-unknown-elf/newlib'

make[2]: *** [Makefile:8492: all-target-newlib] Error 2
make[2]: Leaving directory 
'/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/build-newlib'

make[1]: *** [Makefile:879: all] Error 2
make[1]: Leaving directory 
'/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/build-newlib'

make: *** [Makefile:624: stamps/build-newlib] Error 2
make: *** Waiting for unfinished jobs
  CC       libm/complex/libm_a-csinhl.o
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c: 
In function 'ccoshl':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c:43:13: 
error: implicit declaration of function 'coshl'; did you mean 'coshf'? 
[-Wimplicit-function-declaration]

   43 |         w = coshl(x) * cosl(y) + (sinhl(x) * sinl(y)) * I;
      |             ^
      |             coshf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c:43:24: 
error: implicit declaration of function 'cosl'; did you mean 'cosf'? 
[-Wimplicit-function-declaration]

   43 |         w = coshl(x) * cosl(y) + (sinhl(x) * sinl(y)) * I;
      |                        ^~~~
      |                        cosf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/clogl.c: 
In function 'clogl':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c:43:35: 
error: implicit declaration of function 'sinhl'; did you mean 'sinhf'? 
[-Wimplicit-function-declaration]

   43 |         w = coshl(x) * cosl(y) + (sinhl(x) * sinl(y)) * I;
      |                                   ^
      |                                   sinhf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/clogl.c:42:13: 
error: implicit declaration of function 'logl'; did you mean 'logf'? 
[-Wimplicit-function-declaration]

   42 |         p = logl(rr);
      |             ^~~~
      |             logf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c:43:46: 
error: implicit declaration of function 'sinl'; did you mean 'sinf'? 
[-Wimplicit-function-declaration]

   43 |         w = coshl(x) * cosl(y) + (sinhl(x) * sinl(y)) * I;
      |                                              ^~~~
      |                                              sinf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/clogl.c:43:14: 
error: implicit declaration of function 'atan2l'; did you mean 
'atan2f'? [-Wimplicit-function-declaration]

   43 |         rr = atan2l(cimagl(z), creall(z));
      |              ^~
      |              atan2f
  CC       libm/complex/libm_a-csinl.o
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/cexpl.c: 
In function 'cexpl':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/cexpl.c:43:13: 
error: implicit declaration of function 'expl'; did you mean 'expf'? 
[-Wimplicit-function-declaration]

   43 |         r = expl(x);
      |             ^~~~
      |             expf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/cexpl.c:44:17: 
error: implicit declaration of function 'cosl'; did you mean 'cosf'? 
[-Wimplicit-function-declaration]

   44 |         w = r * cosl(y) + r * sinl(y) * I;
      |                 ^~~~
      |                 cosf
/work/home/jzzhong/work/toolchain/riscv/buil

RE: [PATCH v4] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-12-01 Thread Li, Pan2
Committed, thanks Juzhe.

Pan

From: 钟居哲 
Sent: Saturday, December 2, 2023 9:10 AM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v4] RISC-V: Bugfix for legitimize move when get vec mode in 
zve32f

LGTM


juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-12-02 08:59
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v4] RISC-V: Bugfix for legitimize move when get vec mode in 
zve32f
From: Pan Li <pan2...@intel.com>

If we want to extract a 64-bit value but ELEN < 64, we use an RVV
vector mode with EEW = 32 to extract the highpart and lowpart.
However, this approach does not handle DFmode in the movdf pattern
for ZVE32F, and so results in an ICE for zve32f.

This patch reuses that approach with some additional handling.
Because taking the lowpart bits is meaningless for an FP mode, we need
one integer register as a bridge here. For example:

rtx tmp = gen_reg_rtx (DImode)
reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
...
perform the extract for high and low parts
...
reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done
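
The same idea can be sketched in plain C, as an analogy only: the patch
itself works on RTL, and the function names below are made up for this note.
Two 32-bit elements are joined with a shift and an OR into a 64-bit integer,
and the bits are then reinterpreted as a double. `memcpy` stands in for the
integer-to-FP register move, and the sketch assumes an IEEE 754 binary64
`double`:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rebuild a 64-bit integer from two 32-bit elements, mirroring the
   shift-by-32 and OR step used for the high part.  */
static uint64_t
join_halves (uint32_t low, uint32_t high)
{
  return ((uint64_t) high << 32) | low;
}

/* Reinterpret the 64 bits as a double without changing them.  This
   stands in for moving the integer register back to an FP register.  */
static double
bits_to_double (uint64_t bits)
{
  double d;
  assert (sizeof d == sizeof bits);
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* Extract a double whose low and high words were read separately.  */
static double
halves_to_double (uint32_t low, uint32_t high)
{
  return bits_to_double (join_halves (low, high));
}
```

Doing the join in an integer type first matters because shifting and OR-ing
are not meaningful operations on a floating-point value.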

PR target/112743

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Use the
exists (U *mode) overload and handle DFmode like DImode when EEW is
32 bits for ZVE32F.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv.cc | 63 +--
.../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
2 files changed, 95 insertions(+), 20 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..84512dcdc68 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,64 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
   unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 1;
   scalar_mode smode = as_a (mode);
   unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-  unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  unsigned int num = known_eq (GET_MODE_SIZE (smode), 8)
+ && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  bool need_int_reg_p = false;
   if (num == 2)
{
  /* If we want to extract 64bit value but ELEN < 64,
 we use RVV vector mode with EEW = 32 to extract
 the highpart and lowpart.  */
+   need_int_reg_p = smode == DFmode;
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-  for (unsigned int i = 0; i < num; i++)
+  if (riscv_vector::get_vector_mode (smode, nunits).exists (&vmode))
{
-   rtx result;
-   if (num == 1)
- result = dest;
-   else if (i == 0)
- result = gen_lowpart (smode, dest);
-   else
- result = gen_reg_rtx (smode);
-   riscv_vector::emit_vec_extract (result, v, index + i);
+   rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+   rtx int_reg = dest;
-   if (i == 1)
+   if (need_int_reg_p)
{
-   rtx tmp
- = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
- gen_int_mode (32, Pmode), NULL_RTX, 0,
- OPTAB_DIRECT);
-   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-OPTAB_DIRECT);
-   emit_move_insn (dest, tmp2);
+   int_reg = gen_reg_rtx (DImode);
+   emit_move_insn (int_reg, gen_lowpart (GET_MODE (int_reg), dest));
}
+
+   for (unsigned int i = 0; i < num; i++)
+ {
+   rtx result;
+   if (num == 1)
+ result = int_reg;
+   else if (i == 0)
+ result = gen_lowpart (smode, int_reg);
+   else
+ result = gen_reg_rtx (smode);
+
+   riscv_vector::emit_vec_extract (result, v, index + i);
+
+   if (i == 1)
+ {
+   rtx tmp = expand_binop (Pmode, ashl_optab,
+   gen_lowpart (Pmode, result),
+   gen_int_mode (32, Pmode), NULL_RTX, 0,
+   OPTAB_DIRECT);
+   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg,
+NULL_RTX, 0,
+OPTAB_DIRECT);
+   emit_move_insn (int_reg, tmp2);
+ }
+ }
+
+   if (need_int_reg_p)
+ emit_move_insn (dest, gen_lowpart (GET_MODE (dest), int_reg));
+   else
+ emit_move_insn (dest, int_reg);
}
+  else
+ gcc_unreachable ();
+
   return true;
 }
   /* Expand
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-

Re: Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-01 Thread 钟居哲
No. GLIBC 2.37 also failed:

make[4]: Leaving directory 
'/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/build-newlib/riscv64-unknown-elf/newlib'
  CC   libm/complex/libm_a-casinhl.o
make[3]: *** [Makefile:5283: all] Error 2
make[3]: Leaving directory 
'/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/build-newlib/riscv64-unknown-elf/newlib'
make[2]: *** [Makefile:8492: all-target-newlib] Error 2
make[2]: Leaving directory 
'/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/build-newlib'
make[1]: *** [Makefile:879: all] Error 2
make[1]: Leaving directory 
'/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/build-newlib'
make: *** [Makefile:624: stamps/build-newlib] Error 2
make: *** Waiting for unfinished jobs
  CC   libm/complex/libm_a-csinhl.o
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c:
 In function 'ccoshl':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c:43:13:
 error: implicit declaration of function 'coshl'; did you mean 'coshf'? 
[-Wimplicit-function-declaration]
   43 | w = coshl(x) * cosl(y) + (sinhl(x) * sinl(y)) * I;
  | ^
  | coshf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c:43:24:
 error: implicit declaration of function 'cosl'; did you mean 'cosf'? 
[-Wimplicit-function-declaration]
   43 | w = coshl(x) * cosl(y) + (sinhl(x) * sinl(y)) * I;
  |^~~~
  |cosf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/clogl.c:
 In function 'clogl':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c:43:35:
 error: implicit declaration of function 'sinhl'; did you mean 'sinhf'? 
[-Wimplicit-function-declaration]
   43 | w = coshl(x) * cosl(y) + (sinhl(x) * sinl(y)) * I;
  |   ^
  |   sinhf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/clogl.c:42:13:
 error: implicit declaration of function 'logl'; did you mean 'logf'? 
[-Wimplicit-function-declaration]
   42 | p = logl(rr);
  | ^~~~
  | logf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/ccoshl.c:43:46:
 error: implicit declaration of function 'sinl'; did you mean 'sinf'? 
[-Wimplicit-function-declaration]
   43 | w = coshl(x) * cosl(y) + (sinhl(x) * sinl(y)) * I;
  |  ^~~~
  |  sinf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/clogl.c:43:14:
 error: implicit declaration of function 'atan2l'; did you mean 'atan2f'? 
[-Wimplicit-function-declaration]
   43 | rr = atan2l(cimagl(z), creall(z));
  |  ^~
  |  atan2f
  CC   libm/complex/libm_a-csinl.o
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/cexpl.c:
 In function 'cexpl':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/cexpl.c:43:13:
 error: implicit declaration of function 'expl'; did you mean 'expf'? 
[-Wimplicit-function-declaration]
   43 | r = expl(x);
  | ^~~~
  | expf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/cexpl.c:44:17:
 error: implicit declaration of function 'cosl'; did you mean 'cosf'? 
[-Wimplicit-function-declaration]
   44 | w = r * cosl(y) + r * sinl(y) * I;
  | ^~~~
  | cosf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm/complex/cexpl.c:44:31:
 error: implicit declaration of function 'sinl'; did you mean 'sinf'? 
[-Wimplicit-function-declaration]
   44 | w = r * cosl(y) + r * sinl(y) * I;
  |   ^~~~
  |   sinf
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-linux-spike-debug/../../newlib/newlib/libm

Re: [PATCH v6 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-12-01 Thread waffl3x






On Friday, December 1st, 2023 at 9:52 AM, Jason Merrill  
wrote:


> 
> 
> On 12/1/23 01:02, waffl3x wrote:
> 
> > I ran into another issue while devising tests for redeclarations of
> > xobj member functions as static member functions and vice versa. I am
> > pretty sure by the literal wording of the standard, this is well formed.
> > 
> > template <typename T>
> > concept Constrain = true;
> > 
> > struct S {
> > void f(this auto, Constrain auto) {};
> > static void f(Constrain auto) {};
> > 
> > void g(this auto const&, Constrain auto) {};
> > static void g(Constrain auto) {};
> > 
> > void h(this auto&&, Constrain auto) {};
> > static void h(Constrain auto) {};
> > };
> > 
> > And also,
> > 
> > struct S{
> > void f(this auto) {};
> > static void f() {};
> > 
> > void g(this auto const&) {};
> > static void g() {};
> > 
> > void h(this auto&&) {};
> > static void h() {};
> > };
> > 
> > I wrote these tests expecting them to be ill-formed, and found what I
> > thought was a bug when they were not diagnosed as redeclarations.
> > However, given how the code for resolving overloads and determining
> > redeclarations looks, I believe this is actually well formed on a
> > technicality. I can't find the passages in the standard that specify
> > this so I can't be sure.
> 
> 
> I think the relevant section is
> https://eel.is/c++draft/basic.scope.scope
> 
> > Anyway, the template parameter list differs because of the deduced
> > object parameter. Now here is the question, you are required to ignore
> > the object parameter when determining if these are redeclarations or
> > not, but what about the template parameters associated with the object
> > parameter? Am I just missing the passage that specifies this or is this
> > an actual defect in the standard?
> 
> 
> I think that since they differ in template parameters, they don't
> correspond under https://eel.is/c++draft/basic.scope.scope#4.5 so they
> can be overloaded.
> 
> This is specified in terms of the template-head grammar non-terminal,
> but elsewhere we say that abbreviated templates are equivalent to
> writing out the template parameters explicitly.
> 
> > The annoying thing is, even if this was brought up, I think the only
> > solution is to ratify these examples as well formed.
> 
> 
> Yes.
> 
> Jason

I can't get over the feeling that this goes against the spirit of the
specification. Just because an object argument is deduced should not
suddenly mean we take it into account. Too bad there's no good solution.

I especially don't like that the following case is ambiguous. I
understand why, but I don't like it.

template <typename T>
concept Constrain = true;

struct S {
  int f(this auto, Constrain auto) {};
  static int f(auto) {};
};
int main() {
  S{}.f(0);
}

I would like to see this changed honestly. When an ambiguity is
encountered, the more constrained function should be taken into account
even if they normally can't be considered. Is there some pitfall with
this line of thinking that kept it out of the standard? Is it just a
case of "too hard to specify" or is there some reason it's impossible
to do in all but the simplest of cases?

Anyway while I do think this behavior is bad (not wrong according to
the standard, but bad imo), I recognize I don't have time to think
about it right now so I'll go back to working on the patch for the time
being.

Alex


Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-01 Thread Sam James


钟居哲  writes:

> Hi, This patch causes an error when building newlib/glibc/musl on the RISC-V port:
>
> /work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_access.c:8:40:
> error: passing argument 3 of 'syscall_errno' makes integer from pointer 
> without a cast [-Wint-conversion]
> 8 |   return syscall_errno (SYS_access, 2, file, mode, 0, 0, 0, 0);
>   |^~~~
>   ||
>   |const char *

This looks like an issue in newlib. We expect broken code to be broken
by the recent changes. Can you investigate it on the newlib side?

Thanks.
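
For illustration only, the kind of change the diagnostic asks for might
look like the minimal sketch below. syscall_errno_sketch and
access_sketch are hypothetical stand-ins, not the newlib functions; the
sketch only shows a pointer argument being converted with an explicit
cast instead of implicitly, and it assumes `long` is wide enough to hold
a pointer, as on the RISC-V ABIs mentioned in the log.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a syscall helper whose arguments are all plain
// integers, like the prototype quoted in the diagnostic.
static long
syscall_errno_sketch (long n, int argc, long a0, long a1)
{
  (void) n;
  (void) argc;
  (void) a1;
  return a0; // Echo the first argument so a caller can inspect what was passed.
}

// Hypothetical wrapper: the pointer is converted with an explicit cast
// rather than relying on an implicit pointer-to-integer conversion.
// Assumption: long can represent the pointer value on the target ABI.
static long
access_sketch (const char *file, int mode)
{
  return syscall_errno_sketch (0, 2, (long) (std::intptr_t) file, mode);
}
```

Whether a cast at each call site or a change to the helper's prototype is
the better fix is a decision for the newlib maintainers.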



Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-01 Thread Patrick O'Neill

Hi Juzhe,

I can confirm the failure on Newlib.
I'm not seeing any issues on glibc 2.37.
I haven't tried to build musl.

Since this patch promotes warnings to errors, breakages were probably 
expected.

The fix may require changes to newlib to remove the errors.
I've hacked together a series of patches on top of newlib 4.3.0 that 
resolves these issues (but I think they'd need more work to be 
upstream-able):

https://github.com/patrick-rivos/riscv-gnu-toolchain/tree/35d8e8c486bd2f6e3e2e673db8d2b979309a6de4/fixups/newlib

@Thomas @Florian am I right in assuming that breakages were expected and 
that the fix should come from fixing the warnings?


Thanks,
Patrick

On 12/1/23 16:33, 钟居哲 wrote:

Hi, This patch causes an error when building newlib/glibc/musl on the RISC-V port:

/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_access.c:8:40: 
error: passing argument 3 of 'syscall_errno' makes integer from 
pointer without a cast [-Wint-conversion]

    8 |   return syscall_errno (SYS_access, 2, file, mode, 0, 0, 0, 0);
      |                                        ^~~~
      |                                        |
      |                                        const char *
In file included from 
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_access.c:2:
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/internal_syscall.h:66:38: 
note: expected 'long int' but argument is of type 'const char *'
   66 | syscall_errno(long n, int argc, long _a0, long _a1, long _a2, 
long _a3, long _a4, long _a5)

      |                                 ~^~~
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_utime.c:5:39: 
warning: 'struct utimbuf' declared inside parameter list will not be 
visible outside of this definition or declaration

    5 | _utime(const char *path, const struct utimbuf *times)
      |                                       ^~~
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_faccessat.c: 
In function '_faccessat':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_faccessat.c:7:50: 
error: passing argument 4 of 'syscall_errno' makes integer from 
pointer without a cast [-Wint-conversion]
    7 |   return syscall_errno (SYS_faccessat, 4, dirfd, file, mode, 
flags, 0, 0);

      | ^~~~
      |                                                  |
      | const char *
In file included from 
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_faccessat.c:2:
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/internal_syscall.h:66:48: 
note: expected 'long int' but argument is of type 'const char *'
   66 | syscall_errno(long n, int argc, long _a0, long _a1, long _a2, 
long _a3, long _a4, long _a5)

      |                                           ~^~~
make[5]: *** [Makefile:3315: riscv/riscv_libgloss_a-sys_access.o] Error 1
make[5]: *** Waiting for unfinished jobs
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_open.c: 
In function '_open':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_open.c:8:38: 
error: passing argument 3 of 'syscall_errno' makes integer from 
pointer without a cast [-Wint-conversion]

    8 |   return syscall_errno (SYS_open, 3, name, flags, mode, 0, 0, 0);
      |                                      ^~~~
      |                                      |
      |                                      const char *
In file included from 
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_open.c:2:
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/internal_syscall.h:66:38: 
note: expected 'long int' but argument is of type 'const char *'
   66 | syscall_errno(long n, int argc, long _a0, long _a1, long _a2, 
long _a3, long _a4, long _a5)

      |                                 ~^~~
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_openat.c: 
In function '_openat':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_openat.c:7:47: 
error: passing argument 4 of 'syscall_errno' makes integer fro

Re: [PATCH v4] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-12-01 Thread 钟居哲
LGTM



juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2023-12-02 08:59
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v4] RISC-V: Bugfix for legitimize move when get vec mode in 
zve32f
From: Pan Li 
 
If we want to extract a 64-bit value but ELEN < 64, we use an RVV
vector mode with EEW = 32 to extract the highpart and lowpart.
However, this approach doesn't honor DFmode in the movdf pattern
for ZVE32f, which results in an ICE.
 
This patch reuses the approach with some additional handling.
Since taking the lowpart bits is meaningless for an FP mode, we need
one int reg as a bridge here. For example:
 
rtx tmp = gen_reg_rtx (DImode)
reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
...
perform the extract for high and low parts
...
reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done
 
PR target/112743
 
gcc/ChangeLog:
 
* config/riscv/riscv.cc (riscv_legitimize_move): Take the
exist (U *mode) and handle DFmode like DImode when EEW is
32bits for ZVE32F.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/pr112743-2.c: New test.
 
Signed-off-by: Pan Li 
---
gcc/config/riscv/riscv.cc | 63 +--
.../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
2 files changed, 95 insertions(+), 20 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..84512dcdc68 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,64 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
   unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 
1;
   scalar_mode smode = as_a <scalar_mode> (mode);
   unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-  unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  unsigned int num = known_eq (GET_MODE_SIZE (smode), 8)
+ && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  bool need_int_reg_p = false;
   if (num == 2)
{
  /* If we want to extract 64bit value but ELEN < 64,
 we use RVV vector mode with EEW = 32 to extract
 the highpart and lowpart.  */
+   need_int_reg_p = smode == DFmode;
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-  for (unsigned int i = 0; i < num; i++)
+  if (riscv_vector::get_vector_mode (smode, nunits).exists (&vmode))
{
-   rtx result;
-   if (num == 1)
- result = dest;
-   else if (i == 0)
- result = gen_lowpart (smode, dest);
-   else
- result = gen_reg_rtx (smode);
-   riscv_vector::emit_vec_extract (result, v, index + i);
+   rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+   rtx int_reg = dest;
-   if (i == 1)
+   if (need_int_reg_p)
{
-   rtx tmp
- = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
- gen_int_mode (32, Pmode), NULL_RTX, 0,
- OPTAB_DIRECT);
-   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-OPTAB_DIRECT);
-   emit_move_insn (dest, tmp2);
+   int_reg = gen_reg_rtx (DImode);
+   emit_move_insn (int_reg, gen_lowpart (GET_MODE (int_reg), dest));
}
+
+   for (unsigned int i = 0; i < num; i++)
+ {
+   rtx result;
+   if (num == 1)
+ result = int_reg;
+   else if (i == 0)
+ result = gen_lowpart (smode, int_reg);
+   else
+ result = gen_reg_rtx (smode);
+
+   riscv_vector::emit_vec_extract (result, v, index + i);
+
+   if (i == 1)
+ {
+   rtx tmp = expand_binop (Pmode, ashl_optab,
+   gen_lowpart (Pmode, result),
+   gen_int_mode (32, Pmode), NULL_RTX, 0,
+   OPTAB_DIRECT);
+   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg,
+NULL_RTX, 0,
+OPTAB_DIRECT);
+   emit_move_insn (int_reg, tmp2);
+ }
+ }
+
+   if (need_int_reg_p)
+ emit_move_insn (dest, gen_lowpart (GET_MODE (dest), int_reg));
+   else
+ emit_move_insn (dest, int_reg);
}
+  else
+ gcc_unreachable ();
+
   return true;
 }
   /* Expand
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zve32f_zvfh_zfh -mabi=lp64 -O2" } */
+
+#include 
+
+union double_union
+{
+  double d;
+  __uint32_t i[2];
+};
+
+#define word0(x)  (x.i[1])
+#define word1(x)  (x.i[0])
+
+#define P 53
+#define Exp_shift 20
+#define Exp_msk1  ((__uint32_t)0x10L)
+#define Exp_mask  ((__uint32_t)0x7ff0L)
+
+double ulp (double _x)
+{
+  union double_union x, a;
+  register int L;
+
+  x.d = _x;
+  L = (word0 (x) & Exp_mask) - (P - 1) * Exp_msk1;
+
+  if (L > 0)
+{
+  L |= Exp_msk1 >> 4;
+  word0 

[PATCH v4] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-12-01 Thread pan2 . li
From: Pan Li 

If we want to extract a 64-bit value but ELEN < 64, we use an RVV
vector mode with EEW = 32 to extract the highpart and lowpart.
However, this approach doesn't honor DFmode in the movdf pattern
for ZVE32f, which results in an ICE.

This patch reuses the approach with some additional handling.
Since taking the lowpart bits is meaningless for an FP mode, we need
one int reg as a bridge here. For example:

rtx tmp = gen_reg_rtx (DImode)
reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
...
perform the extract for high and low parts
...
reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done
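
As a rough host-side analogy (not the RTL the patch emits), the bridge
idea can be sketched in portable C++: the 64 bits of a double are treated
as an integer, rebuilt from two 32-bit elements with a shift and an OR,
and then moved back to the floating-point type. The helper names
double_to_bits, bits_to_double and combine_halves are hypothetical and
exist only for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert (sizeof (double) == sizeof (std::uint64_t),
               "sketch assumes a 64-bit double");

// Reinterpret the bits of a double as a 64-bit integer (the DF-to-DI step).
static std::uint64_t
double_to_bits (double d)
{
  std::uint64_t bits;
  std::memcpy (&bits, &d, sizeof bits);
  return bits;
}

// Reinterpret a 64-bit integer as a double (the DI-to-DF step).
static double
bits_to_double (std::uint64_t bits)
{
  double d;
  std::memcpy (&d, &bits, sizeof d);
  return d;
}

// Rebuild a double from two 32-bit elements: the low element goes in first,
// then the high element is shifted left by 32 and ORed in, all in an integer
// temporary that plays the role of the int reg bridge.
static double
combine_halves (std::uint32_t low, std::uint32_t high)
{
  std::uint64_t int_reg = low;
  int_reg |= static_cast<std::uint64_t> (high) << 32;
  return bits_to_double (int_reg);
}
```

The integer temporary matters because the shift and OR are only
meaningful on an integer representation, not on the floating-point value
itself.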

PR target/112743

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_move): Take the
exist (U *mode) and handle DFmode like DImode when EEW is
32bits for ZVE32F.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr112743-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv.cc | 63 +--
 .../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
 2 files changed, 95 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..84512dcdc68 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,64 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
   unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 
1;
   scalar_mode smode = as_a <scalar_mode> (mode);
   unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-  unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  unsigned int num = known_eq (GET_MODE_SIZE (smode), 8)
+   && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  bool need_int_reg_p = false;
 
   if (num == 2)
{
  /* If we want to extract 64bit value but ELEN < 64,
 we use RVV vector mode with EEW = 32 to extract
 the highpart and lowpart.  */
+ need_int_reg_p = smode == DFmode;
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
 
-  for (unsigned int i = 0; i < num; i++)
+  if (riscv_vector::get_vector_mode (smode, nunits).exists (&vmode))
{
- rtx result;
- if (num == 1)
-   result = dest;
- else if (i == 0)
-   result = gen_lowpart (smode, dest);
- else
-   result = gen_reg_rtx (smode);
- riscv_vector::emit_vec_extract (result, v, index + i);
+ rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+ rtx int_reg = dest;
 
- if (i == 1)
+ if (need_int_reg_p)
{
- rtx tmp
-   = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
-   gen_int_mode (32, Pmode), NULL_RTX, 0,
-   OPTAB_DIRECT);
- rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-  OPTAB_DIRECT);
- emit_move_insn (dest, tmp2);
+ int_reg = gen_reg_rtx (DImode);
+ emit_move_insn (int_reg, gen_lowpart (GET_MODE (int_reg), dest));
}
+
+ for (unsigned int i = 0; i < num; i++)
+   {
+ rtx result;
+ if (num == 1)
+   result = int_reg;
+ else if (i == 0)
+   result = gen_lowpart (smode, int_reg);
+ else
+   result = gen_reg_rtx (smode);
+
+ riscv_vector::emit_vec_extract (result, v, index + i);
+
+ if (i == 1)
+   {
+ rtx tmp = expand_binop (Pmode, ashl_optab,
+ gen_lowpart (Pmode, result),
+ gen_int_mode (32, Pmode), NULL_RTX, 0,
+ OPTAB_DIRECT);
+ rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg,
+  NULL_RTX, 0,
+  OPTAB_DIRECT);
+ emit_move_insn (int_reg, tmp2);
+   }
+   }
+
+ if (need_int_reg_p)
+   emit_move_insn (dest, gen_lowpart (GET_MODE (dest), int_reg));
+ else
+   emit_move_insn (dest, int_reg);
}
+  else
+   gcc_unreachable ();
+
   return true;
 }
   /* Expand
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g

Re: [PATCH v6] c++: implement P2564, consteval needs to propagate up [PR107687]

2023-12-01 Thread Jason Merrill

On 12/1/23 18:37, Marek Polacek wrote:

On Thu, Nov 30, 2023 at 06:34:01PM -0500, Jason Merrill wrote:

On 11/23/23 11:46, Marek Polacek wrote:

v5 greatly simplifies the code.


Indeed, it's much cleaner now.


I still need a new ff_ flag to signal that we can return immediately
after seeing an i-e expr.


That's still not clear to me:


+  /* In turn, maybe promote the function we find ourselves in...  */
+  if ((data->flags & ff_find_escalating_expr)
+ && DECL_IMMEDIATE_FUNCTION_P (decl)
+ /* ...but not if the call to DECL was constant; that is the
+"an immediate invocation that is not a constant expression"
+case.  */
+ && (e = cxx_constant_value (stmt, tf_none), e == error_mark_node))
+   {
+ /* Since we had to set DECL_ESCALATION_CHECKED_P before the walk,
+we call promote_function_to_consteval directly which doesn't
+check unchecked_immediate_escalating_function_p.  */
+ if (current_function_decl)
+   promote_function_to_consteval (current_function_decl);
+ *walk_subtrees = 0;
+ return stmt;
+   }


This is the one use of ff_find_escalating_expr, and it seems redundant with
the code immediately below, where we use complain (derived from
ff_mce_false) to decide whether to return immediately.  Can we remove this
hunk and the flag, and merge find_escalating_expr with cp_fold_immediate?


Ah, that works!  Hopefully done now.
  

I think you want to walk the function body for three-ish reasons:
1) at EOF, to check for escalation
2) at EOF, to check for errors
3) at error time, to explain escalation

It's not clear to me that we need a flag to distinguish between them. When
we encounter an immediate-escalating expression E:

A) if we're in an immediate-escalating function, escalate and return E (#1,
#3).
B) otherwise, if we're diagnosing, error and continue (#2).
C) otherwise, return E (individual expression mce_unknown walk from
constexpr.cc).
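
To make the three cases concrete, here is a minimal sketch of the
decision as a standalone function. It is not the cp-gimplify.cc code;
walk_action, classify_escalating_expr and both parameters are
hypothetical names chosen only to mirror cases A, B and C above.

```cpp
#include <cassert>

// Hypothetical result of visiting one immediate-escalating expression E.
enum class walk_action
{
  escalate_and_return, // (A) promote the enclosing function and return E
  error_and_continue,  // (B) diagnose E and keep walking
  return_expr          // (C) just return E to the caller
};

// Decide what to do with E from two facts about the current walk.
static walk_action
classify_escalating_expr (bool in_escalating_function, bool diagnosing)
{
  if (in_escalating_function)
    return walk_action::escalate_and_return;
  if (diagnosing)
    return walk_action::error_and_continue;
  return walk_action::return_expr;
}
```

Written this way, no third input is needed to tell the cases apart, which
is the point being argued about the extra flag.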


@@ -1178,11 +1388,19 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_
)
   *walk_subtrees = 0;
   /* Don't return yet, still need the cp_fold below.  */
 }
-  cp_fold_immediate_r (stmt_p, walk_subtrees, data);
+  else
+   cp_fold_immediate_r (stmt_p, walk_subtrees, data);
  }
*stmt_p = stmt = cp_fold (*stmt_p, data->flags);
+  /* For certain trees, like +foo(), the cp_fold below will remove the +,


s/below/above/?


Fixed.
  

+/* We've stashed immediate-escalating functions.  Now see if they indeed
+   ought to be promoted to consteval.  */
+
+void
+process_pending_immediate_escalating_fns ()
+{
+  /* This will be null for -fno-immediate-escalation.  */
+  if (!deferred_escalating_exprs)
+return;
+
+  for (auto e : *deferred_escalating_exprs)
+if (TREE_CODE (e) == FUNCTION_DECL && !DECL_ESCALATION_CHECKED_P (e))
+  cp_fold_immediate (&DECL_SAVED_TREE (e), mce_false, e);
+}
+
+/* We've escalated every function that could have been promoted to
+   consteval.  Check that we are not taking the address of a consteval
+   function.  */
+
+void
+check_immediate_escalating_refs ()
+{
+  /* This will be null for -fno-immediate-escalation.  */
+  if (!deferred_escalating_exprs)
+return;
+
+  for (auto ref : *deferred_escalating_exprs)
+{
+  if (TREE_CODE (ref) == FUNCTION_DECL)
+   continue;
+  tree decl = (TREE_CODE (ref) == PTRMEM_CST
+  ? PTRMEM_CST_MEMBER (ref)
+  : TREE_OPERAND (ref, 0));
+  if (DECL_IMMEDIATE_FUNCTION_P (decl))
+   taking_address_of_imm_fn_error (ref, decl);
+}
+
+  deferred_escalating_exprs = nullptr;
  }


Could these be merged, so you do a single loop of cp_fold_immediate over
function bodies or non-function expressions?  I'd expect that to work.


We seem to walk the hash table in a random order so I can't use one loop,
otherwise we could hit &f before escalating f.


Is that a problem, since we recurse if we see a function that is still 
unchecked?



@@ -1045,90 +1191,138 @@ cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, 
void *data_)
   /* The purpose of this is not to emit errors for mce_unknown.  */
   const tsubst_flags_t complain = (data->flags & ff_mce_false
   ? tf_error : tf_none);
+  const tree_code code = TREE_CODE (stmt);
 
   /* No need to look into types or unevaluated operands.

  NB: This affects cp_fold_r as well.  */
-  if (TYPE_P (stmt) || unevaluated_p (TREE_CODE (stmt)))
+  if (TYPE_P (stmt) || unevaluated_p (code) || cp_unevaluated_operand)


Maybe check in_immediate_context here instead?


+  /* [expr.const]p16 "An expression or conversion is immediate-escalating if
+ it is not initially in an immediate function context and it is either
+ -- an immediate invocation that is not a constant expression and is not
+ a subexpression of an immediate invocation."
 
+ If we are in an immediate-escalating function, t

[PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-01 Thread 钟居哲
Hi, This patch causes an error when building newlib/glibc/musl on the RISC-V port:

/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_access.c:8:40:
 error: passing argument 3 of 'syscall_errno' makes integer from pointer 
without a cast [-Wint-conversion]
8 |   return syscall_errno (SYS_access, 2, file, mode, 0, 0, 0, 0);
  |^~~~
  ||
  |const char *
In file included from 
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_access.c:2:
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/internal_syscall.h:66:38:
 note: expected 'long int' but argument is of type 'const char *'
   66 | syscall_errno(long n, int argc, long _a0, long _a1, long _a2, long _a3, 
long _a4, long _a5)
  | ~^~~
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_utime.c:5:39:
 warning: 'struct utimbuf' declared inside parameter list will not be visible 
outside of this definition or declaration
5 | _utime(const char *path, const struct utimbuf *times)
  |   ^~~
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_faccessat.c:
 In function '_faccessat':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_faccessat.c:7:50:
 error: passing argument 4 of 'syscall_errno' makes integer from pointer 
without a cast [-Wint-conversion]
7 |   return syscall_errno (SYS_faccessat, 4, dirfd, file, mode, flags, 0, 
0);
  |  ^~~~
  |  |
  |  const char *
In file included from 
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_faccessat.c:2:
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/internal_syscall.h:66:48:
 note: expected 'long int' but argument is of type 'const char *'
   66 | syscall_errno(long n, int argc, long _a0, long _a1, long _a2, long _a3, 
long _a4, long _a5)
  |   ~^~~
make[5]: *** [Makefile:3315: riscv/riscv_libgloss_a-sys_access.o] Error 1
make[5]: *** Waiting for unfinished jobs
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_open.c:
 In function '_open':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_open.c:8:38:
 error: passing argument 3 of 'syscall_errno' makes integer from pointer 
without a cast [-Wint-conversion]
8 |   return syscall_errno (SYS_open, 3, name, flags, mode, 0, 0, 0);
  |  ^~~~
  |  |
  |  const char *
In file included from 
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_open.c:2:
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/internal_syscall.h:66:38:
 note: expected 'long int' but argument is of type 'const char *'
   66 | syscall_errno(long n, int argc, long _a0, long _a1, long _a2, long _a3, 
long _a4, long _a5)
  | ~^~~
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_openat.c:
 In function '_openat':
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_openat.c:7:47:
 error: passing argument 4 of 'syscall_errno' makes integer from pointer 
without a cast [-Wint-conversion]
7 |   return syscall_errno (SYS_openat, 4, dirfd, name, flags, mode, 0, 0);
  |   ^~~~
  |   |
  |   const char *
In file included from 
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/sys_openat.c:2:
/work/home/jzzhong/work/toolchain/riscv/build/dev-rv64gcv_zvfh_zfh-lp64d-medany-newlib-spike-debug/../../newlib/libgloss/riscv/in

[PATCH v6] c++: implement P2564, consteval needs to propagate up [PR107687]

2023-12-01 Thread Marek Polacek
On Thu, Nov 30, 2023 at 06:34:01PM -0500, Jason Merrill wrote:
> On 11/23/23 11:46, Marek Polacek wrote:
> > v5 greatly simplifies the code.
> 
> Indeed, it's much cleaner now.
> 
> > I still need a new ff_ flag to signal that we can return immediately
> > after seeing an i-e expr.
> 
> That's still not clear to me:
> 
> > +  /* In turn, maybe promote the function we find ourselves in...  */
> > +  if ((data->flags & ff_find_escalating_expr)
> > + && DECL_IMMEDIATE_FUNCTION_P (decl)
> > + /* ...but not if the call to DECL was constant; that is the
> > +"an immediate invocation that is not a constant expression"
> > +case.  */
> > + && (e = cxx_constant_value (stmt, tf_none), e == error_mark_node))
> > +   {
> > + /* Since we had to set DECL_ESCALATION_CHECKED_P before the walk,
> > +we call promote_function_to_consteval directly which doesn't
> > +check unchecked_immediate_escalating_function_p.  */
> > + if (current_function_decl)
> > +   promote_function_to_consteval (current_function_decl);
> > + *walk_subtrees = 0;
> > + return stmt;
> > +   }
> 
> This is the one use of ff_find_escalating_expr, and it seems redundant with
> the code immediately below, where we use complain (derived from
> ff_mce_false) to decide whether to return immediately.  Can we remove this
> hunk and the flag, and merge find_escalating_expr with cp_fold_immediate?

Ah, that works!  Hopefully done now.
 
> I think you want to walk the function body for three-ish reasons:
> 1) at EOF, to check for escalation
> 2) at EOF, to check for errors
> 3) at error time, to explain escalation
> 
> It's not clear to me that we need a flag to distinguish between them. When
> we encounter an immediate-escalating expression E:
> 
> A) if we're in an immediate-escalating function, escalate and return E (#1,
> #3).
> B) otherwise, if we're diagnosing, error and continue (#2).
> C) otherwise, return E (individual expression mce_unknown walk from
> constexpr.cc).
> 
> > @@ -1178,11 +1388,19 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void 
> > *data_
> > )
> >   *walk_subtrees = 0;
> >   /* Don't return yet, still need the cp_fold below.  */
> > }
> > -  cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> > +  else
> > +   cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> >  }
> >*stmt_p = stmt = cp_fold (*stmt_p, data->flags);
> > +  /* For certain trees, like +foo(), the cp_fold below will remove the +,
> 
> s/below/above/?

Fixed.
 
> > +/* We've stashed immediate-escalating functions.  Now see if they indeed
> > +   ought to be promoted to consteval.  */
> > +
> > +void
> > +process_pending_immediate_escalating_fns ()
> > +{
> > +  /* This will be null for -fno-immediate-escalation.  */
> > +  if (!deferred_escalating_exprs)
> > +return;
> > +
> > +  for (auto e : *deferred_escalating_exprs)
> > +if (TREE_CODE (e) == FUNCTION_DECL && !DECL_ESCALATION_CHECKED_P (e))
> > +  cp_fold_immediate (&DECL_SAVED_TREE (e), mce_false, e);
> > +}
> > +
> > +/* We've escalated every function that could have been promoted to
> > +   consteval.  Check that we are not taking the address of a consteval
> > +   function.  */
> > +
> > +void
> > +check_immediate_escalating_refs ()
> > +{
> > +  /* This will be null for -fno-immediate-escalation.  */
> > +  if (!deferred_escalating_exprs)
> > +return;
> > +
> > +  for (auto ref : *deferred_escalating_exprs)
> > +{
> > +  if (TREE_CODE (ref) == FUNCTION_DECL)
> > +   continue;
> > +  tree decl = (TREE_CODE (ref) == PTRMEM_CST
> > +  ? PTRMEM_CST_MEMBER (ref)
> > +  : TREE_OPERAND (ref, 0));
> > +  if (DECL_IMMEDIATE_FUNCTION_P (decl))
> > +   taking_address_of_imm_fn_error (ref, decl);
> > +}
> > +
> > +  deferred_escalating_exprs = nullptr;
> >  }
> 
> Could these be merged, so you do a single loop of cp_fold_immediate over
> function bodies or non-function expressions?  I'd expect that to work.

We seem to walk the hash table in a random order so I can't use one loop,
otherwise we could hit &f before escalating f.  But there's no need for
two functions, so I've merged them into
process_and_check_pending_immediate_escalating_fns.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This patch implements P2564, described at , whereby
certain functions are promoted to consteval.  For example:

  consteval int id(int i) { return i; }

  template <typename T>
  constexpr int f(T t)
  {
    return t + id(t); // id causes f to be promoted to consteval
  }

  void g(int i)
  {
    f (3);
  }

now compiles.  Previously the code was ill-formed: we would complain
that 't' in 'f' is not a constant expression.  Since 'f' is now
consteval, it means that the call to id(t) is in an immediate context,
so doesn't have to produce a constant -- this is how we allow consteval

Re: [PATCH] testsuite: scev: expect fail on ilp32

2023-12-01 Thread Hans-Peter Nilsson
> Date: Fri, 1 Dec 2023 08:07:14 +0100 (CET)
> From: Richard Biener 

> On Fri, 1 Dec 2023, Hans-Peter Nilsson wrote:
> 
> > > From: Hans-Peter Nilsson 
> > > Date: Thu, 30 Nov 2023 18:09:10 +0100
> > 
> > Richard B.:
> > > > > In the end we might need to move/duplicate the test to some
> > > > > gcc.target/* dir and restrict it to a specific tuning.
> > > 
> > > I intend to post two alternative patches to get this
> > > resolved:
> > > 1: Move the tests to gcc.target/i386/scev-[3-5].c
> > 
> > Subject: [PATCH 1/2] testsuite: Fix XPASS for gcc.dg/tree-ssa/scev-3.c, 
> > -4.c and -5.c [PR112786]
> > 
> > This is the first alternative, perhaps the more appropriate one.
> > 
> > Tested cris-elf, arm-eabi (default), x86_64-linux, ditto -m32,
> > h8300-elf and shle-linux; xpassing, skipped and passing as
> > applicable and intended.
> > 
> > Ok to commit?
> 
> Digging in history reveals the testcases were added by
> Jiangning Liu , not for any
> particular bugreport but "The problem is originally from a real benchmark,
> and the test case only tries to detect the GIMPLE level changes."
> 
> I'm not sure we can infer the testcase should be moved to
> gcc.target/arm/ because of that, but it does seem plausible.

It's been so long and so many changes since these tests were
regression guards, that the original target has lost
importance.  Heck, it was even xfail lp64 at one time!
According to my git dig, it's been adjusted for pass
changes, including reordering and dump output changes.  But
you know that; you've been instrumental in many of those
changes. :)

I'd say gcc.target/arm/ is the one target that's *not*
plausible since, according to Alex, the result differs between
subtargets.

> I read from your messages that the testcases pass on arm*-*-*?

Yes: they pass (currently XPASS) on arm-eabi and
arm-unknown-linux-gnueabi, default configurations.  But
scev-3 and -5 fail with, for example, -mcpu=cortex-r5.

brgds, H-P
PS. I'm open to just reverting the patch.  The s/ia32/ilp32/ is proven wrong.


Re: [PATCH] RISC-V: Add vectorized strcmp.

2023-12-01 Thread 钟居哲
lgtm



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-12-01 23:23
To: gcc-patches; palmer; Kito Cheng; jeffreyalaw; juzhe.zh...@rivai.ai
CC: rdapp.gcc
Subject: [PATCH] RISC-V: Add vectorized strcmp.
Hi,
 
this patch adds a vectorized strcmp implementation and tests.  Similar
to strlen, expansion is still guarded by -minline-strcmp.  I just
realized I forgot to make it a series but this one is actually
dependent on the NFC patch and the rawmemchr fix before.
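
As a rough scalar model of a chunk-at-a-time string compare (an
assumption made for illustration, not the RVV sequence this patch
expands to), the sketch below handles a fixed number of bytes per outer
iteration and stops at the first mismatch or terminator.
strcmp_chunked_sketch and the chunk size of 16 are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Scalar model of a chunked strcmp.  Each outer iteration stands in for one
// vector iteration over CHUNK bytes.
static int
strcmp_chunked_sketch (const char *a, const char *b)
{
  const std::size_t chunk = 16; // Hypothetical bytes per iteration.
  for (;;)
    {
      for (std::size_t i = 0; i < chunk; ++i)
        {
          unsigned char ca = static_cast<unsigned char> (a[i]);
          unsigned char cb = static_cast<unsigned char> (b[i]);
          // Stop at the first mismatch or at the terminator, whichever
          // comes first, so nothing past that byte is ever read.
          if (ca != cb || ca == 0)
            return ca - cb;
        }
      a += chunk;
      b += chunk;
    }
}
```

The result follows the usual strcmp convention: zero for equal strings,
negative or positive according to the first differing byte compared as
unsigned char.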
 
Regards
Robin
 
gcc/ChangeLog:
 
* config/riscv/riscv-protos.h (expand_strcmp): Declare.
* config/riscv/riscv-string.cc (riscv_expand_strcmp): Add
strategy handling and delegation to scalar and vector expanders.
(expand_strcmp): Vectorized implementation.
* config/riscv/riscv.md: Add TARGET_VECTOR to strcmp expander.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strcmp.c: New test.
---
gcc/config/riscv/riscv-protos.h   |   1 +
gcc/config/riscv/riscv-string.cc  | 161 +-
gcc/config/riscv/riscv.md |   3 +-
.../riscv/rvv/autovec/builtin/strcmp-run.c|  32 
.../riscv/rvv/autovec/builtin/strcmp.c|  13 ++
5 files changed, 206 insertions(+), 4 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp.c
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index c94c82a9973..5878a674413 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -558,6 +558,7 @@ void expand_cond_binop (unsigned, rtx *);
void expand_cond_ternop (unsigned, rtx *);
void expand_popcount (rtx *);
void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false);
+bool expand_strcmp (rtx, rtx, rtx, rtx, unsigned HOST_WIDE_INT, bool);
void emit_vec_extract (rtx, rtx, poly_int64);
/* Rounding mode bitfield for fixed point VXRM.  */
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 6cde1bf89a0..11c1f74d0b3 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -511,12 +511,19 @@ riscv_expand_strcmp (rtx result, rtx src1, rtx src2,
 return false;
   alignment = UINTVAL (align_rtx);
-  if (TARGET_ZBB || TARGET_XTHEADBB)
+  if (TARGET_VECTOR && stringop_strategy & STRATEGY_VECTOR)
 {
-  return riscv_expand_strcmp_scalar (result, src1, src2, nbytes, alignment,
- ncompare);
+  bool ok = riscv_vector::expand_strcmp (result, src1, src2,
+  bytes_rtx, alignment,
+  ncompare);
+  if (ok)
+ return true;
 }
+  if ((TARGET_ZBB || TARGET_XTHEADBB) && stringop_strategy & STRATEGY_SCALAR)
+return riscv_expand_strcmp_scalar (result, src1, src2, nbytes, alignment,
+ncompare);
+
   return false;
}
@@ -1092,4 +1099,152 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx haystack, rtx needle,
 }
}
+/* Implement cmpstr using vector instructions.  The ALIGNMENT and
+   NCOMPARE parameters are unused for now.  */
+
+bool
+expand_strcmp (rtx result, rtx src1, rtx src2, rtx nbytes,
+unsigned HOST_WIDE_INT, bool)
+{
+  gcc_assert (TARGET_VECTOR);
+
+  /* We don't support big endian.  */
+  if (BYTES_BIG_ENDIAN)
+return false;
+
+  bool with_length = nbytes != NULL_RTX;
+
+  if (with_length
+  && (!REG_P (nbytes) && !SUBREG_P (nbytes) && !CONST_INT_P (nbytes)))
+return false;
+
+  if (with_length && CONST_INT_P (nbytes))
+nbytes = force_reg (Pmode, nbytes);
+
+  machine_mode mode = E_QImode;
+  unsigned int isize = GET_MODE_SIZE (mode).to_constant ();
+  int lmul = TARGET_MAX_LMUL;
+  poly_int64 nunits = exact_div (BYTES_PER_RISCV_VECTOR * lmul, isize);
+
+  machine_mode vmode;
+  if (!riscv_vector::get_vector_mode (GET_MODE_INNER (mode), nunits)
+ .exists (&vmode))
+gcc_unreachable ();
+
+  machine_mode mask_mode = riscv_vector::get_mask_mode (vmode);
+
+  /* Prepare addresses.  */
+  rtx src_addr1 = copy_addr_to_reg (XEXP (src1, 0));
+  rtx vsrc1 = change_address (src1, vmode, src_addr1);
+
+  rtx src_addr2 = copy_addr_to_reg (XEXP (src2, 0));
+  rtx vsrc2 = change_address (src2, vmode, src_addr2);
+
+  /* Set initial pointer bump to 0.  */
+  rtx cnt = gen_reg_rtx (Pmode);
+  emit_move_insn (cnt, CONST0_RTX (Pmode));
+
+  rtx sub = gen_reg_rtx (Pmode);
+  emit_move_insn (sub, CONST0_RTX (Pmode));
+
+  /* Create source vectors.  */
+  rtx vec1 = gen_reg_rtx (vmode);
+  rtx vec2 = gen_reg_rtx (vmode);
+
+  rtx done = gen_label_rtx ();
+  rtx loop = gen_label_rtx ();
+  emit_label (loop);
+
+  /* Bump the pointers.  */
+  emit_insn (gen_rtx_SET (src_addr1, gen_rtx_PLUS (Pmode, src_addr1, cnt)));
+  emit_insn (gen_rtx_SET (src_addr2, gen_rtx_PLUS (Pmode, src_addr2, cnt)));
+
+  rtx vlops1[] = {vec1, vsrc1};
+  rtx vlops2[] = {vec2, vsrc2};
+
+  if (!with_length)
+{
+  emit_vlmax_insn (code_for_pred_fault_load (vmode),
+riscv

Re: [PATCH] RISC-V: Add vectorized strlen.

2023-12-01 Thread 钟居哲
LGTM.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-12-01 23:21
To: gcc-patches; palmer; Kito Cheng; jeffreyalaw; juzhe.zh...@rivai.ai
CC: rdapp.gcc
Subject: [PATCH] RISC-V: Add vectorized strlen.
Hi,
 
this patch implements a vectorized strlen by re-using and slightly
adjusting the rawmemchr implementation.  Rawmemchr returns the address
of the needle while strlen returns the difference between needle address
and start address.
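
To illustrate the relationship described above, here is a minimal scalar
sketch in C++.  It is not the expander code: rawmemchr_model, strlen_model
and check_strlen_model are hypothetical names, and the scan is byte-by-byte
rather than vectorized.  It only shows that strlen is the same search for
'\0' with the start address subtracted from the result.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical scalar stand-in for the vector expander: scan for NEEDLE,
// assuming it is present in the haystack (as rawmemchr requires).
static const char *
rawmemchr_model (const char *haystack, char needle)
{
  const char *p = haystack;
  while (*p != needle)
    ++p;
  return p;
}

// strlen is the same scan for '\0', but it returns the distance from the
// start address instead of the address of the match.
static std::size_t
strlen_model (const char *s)
{
  const char *start = s;
  return static_cast<std::size_t> (rawmemchr_model (s, '\0') - start);
}

// Small usage check for the two helpers above.
static void
check_strlen_model ()
{
  assert (strlen_model ("") == 0);
  assert (strlen_model ("abc") == 3);
  assert (*rawmemchr_model ("hello", 'l') == 'l');
}
```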
 
As before, strlen expansion is guarded by -minline-strlen.
 
While testing with -minline-strlen I encountered a vsetvl problem in
memcpy-chk.c where we didn't insert a vsetvl at the proper spot (after
a setjmp).  This needs to be fixed separately and I figured I'd post
this patch as-is.
 
Regards
Robin
 
gcc/ChangeLog:
 
* config/riscv/riscv-protos.h (expand_rawmemchr): Add strlen
parameter.
* config/riscv/riscv-string.cc (riscv_expand_strlen): Call
rawmemchr.
(expand_rawmemchr): Add strlen handling.
* config/riscv/riscv.md: Add TARGET_VECTOR to strlen expander.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/builtin/strlen-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strlen.c: New test.
---
gcc/config/riscv/riscv-protos.h   |  2 +-
gcc/config/riscv/riscv-string.cc  | 41 ++-
gcc/config/riscv/riscv.md |  8 +---
.../riscv/rvv/autovec/builtin/strlen-run.c| 37 +
.../riscv/rvv/autovec/builtin/strlen.c| 12 ++
5 files changed, 83 insertions(+), 17 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 695ee24ad6f..c94c82a9973 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -557,7 +557,7 @@ void expand_cond_unop (unsigned, rtx *);
void expand_cond_binop (unsigned, rtx *);
void expand_cond_ternop (unsigned, rtx *);
void expand_popcount (rtx *);
-void expand_rawmemchr (machine_mode, rtx, rtx, rtx);
+void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false);
void emit_vec_extract (rtx, rtx, poly_int64);
/* Rounding mode bitfield for fixed point VXRM.  */
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 594ff49fc5a..6cde1bf89a0 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -588,9 +588,16 @@ riscv_expand_strlen_scalar (rtx result, rtx src, rtx align)
bool
riscv_expand_strlen (rtx result, rtx src, rtx search_char, rtx align)
{
+  if (TARGET_VECTOR && stringop_strategy & STRATEGY_VECTOR)
+{
+  riscv_vector::expand_rawmemchr (E_QImode, result, src, search_char,
+   /* strlen */ true);
+  return true;
+}
+
   gcc_assert (search_char == const0_rtx);
-  if (TARGET_ZBB || TARGET_XTHEADBB)
+  if ((TARGET_ZBB || TARGET_XTHEADBB) && stringop_strategy & STRATEGY_SCALAR)
 return riscv_expand_strlen_scalar (result, src, align);
   return false;
@@ -979,12 +986,13 @@ expand_block_move (rtx dst_in, rtx src_in, rtx length_in)
}
-/* Implement rawmemchr using vector instructions.
+/* Implement rawmemchr and strlen using vector instructions.
It can be assumed that the needle is in the haystack, otherwise the
behavior is undefined.  */
void
-expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
+expand_rawmemchr (machine_mode mode, rtx dst, rtx haystack, rtx needle,
+   bool strlen)
{
   /*
 rawmemchr:
@@ -1005,6 +1013,9 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   */
   gcc_assert (TARGET_VECTOR);
+  if (strlen)
+gcc_assert (mode == E_QImode);
+
   unsigned int isize = GET_MODE_SIZE (mode).to_constant ();
   int lmul = TARGET_MAX_LMUL;
   poly_int64 nunits = exact_div (BYTES_PER_RISCV_VECTOR * lmul, isize);
@@ -1028,12 +1039,13 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
  return a pointer to the matching byte.  */
   unsigned int shift = exact_log2 (GET_MODE_SIZE (mode).to_constant ());
-  rtx src_addr = copy_addr_to_reg (XEXP (src, 0));
+  rtx src_addr = copy_addr_to_reg (XEXP (haystack, 0));
+  rtx start_addr = copy_addr_to_reg (XEXP (haystack, 0));
   rtx loop = gen_label_rtx ();
   emit_label (loop);
-  rtx vsrc = change_address (src, vmode, src_addr);
+  rtx vsrc = change_address (haystack, vmode, src_addr);
   /* Bump the pointer.  */
   rtx step = gen_reg_rtx (Pmode);
@@ -1052,8 +1064,8 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
 emit_insn (gen_read_vldi_zero_extend (cnt));
   /* Compare needle with haystack and store in a mask.  */
-  rtx eq = gen_rtx_EQ (mask_mode, gen_const_vec_duplicate (vmode, pat), vec);
-  rtx vmsops[] = {mask, eq, vec, pat};
+  rtx eq = gen_rtx_EQ (mask_mode, gen_const_vec_duplicate (vmode, needle), vec);
+  rtx vmsops[] = {mask, eq, vec, needle};
   emit_nonvlmax_insn (code_for_pred_eqne_scalar (vm

Re: [PATCH] RISC-V: Rename and unify stringop strategy handling [NFC].

2023-12-01 Thread 钟居哲
LGTM



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-12-01 23:21
To: gcc-patches; palmer; Kito Cheng; jeffreyalaw; juzhe.zh...@rivai.ai
CC: rdapp.gcc
Subject: [PATCH] RISC-V: Rename and unify stringop strategy handling [NFC].
Hi,
 
now split into multiple patches.
 
In preparation for the vectorized strlen and strcmp support this NFC
patch unifies the stringop strategy handling a bit.  The "auto"
strategy now is a combination of scalar and vector and an expander
should try the strategies in their preferred order.
 
For the block_move expander this patch does just that.
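
As a rough sketch of what "try the strategies in their preferred order"
can look like, the following C++ models the dispatch with plain booleans.
The bit values match the enum in this patch; pick_expansion, the expansion
enum and the boolean parameters are hypothetical and only stand in for the
target checks and for whether the vector expander succeeded.  The
vector-then-scalar order is an assumption taken from the strcmp expander
posted in this series.

```cpp
#include <cassert>

// Bit values as in the patch; STRATEGY_AUTO is the union of scalar and vector.
enum stringop_strategy_enum
{
  STRATEGY_LIBCALL = 1,
  STRATEGY_SCALAR = 2,
  STRATEGY_VECTOR = 4,
  STRATEGY_AUTO = STRATEGY_SCALAR | STRATEGY_VECTOR
};

// Hypothetical result type: which expansion was chosen, if any.
enum expansion { EXPAND_NONE, EXPAND_SCALAR, EXPAND_VECTOR };

// Try the vector expansion first, then the scalar one.  EXPAND_NONE means
// "no inline expansion", i.e. the caller falls back to a library call.
static expansion
pick_expansion (int strategy, bool have_vector, bool vector_ok,
                bool have_scalar)
{
  if (have_vector && (strategy & STRATEGY_VECTOR) && vector_ok)
    return EXPAND_VECTOR;
  if (have_scalar && (strategy & STRATEGY_SCALAR))
    return EXPAND_SCALAR;
  return EXPAND_NONE;
}

// Small usage check: "auto" falls through to scalar when vector fails.
static void
check_pick_expansion ()
{
  assert (pick_expansion (STRATEGY_AUTO, true, true, true) == EXPAND_VECTOR);
  assert (pick_expansion (STRATEGY_AUTO, true, false, true) == EXPAND_SCALAR);
  assert (pick_expansion (STRATEGY_LIBCALL, true, true, true) == EXPAND_NONE);
}
```

Because the strategies are bits rather than mutually exclusive values, each
expander can test for the one it implements with a single mask.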
 
Regards
Robin
 
gcc/ChangeLog:
 
* config/riscv/riscv-opts.h (enum riscv_stringop_strategy_enum):
Rename...
(enum stringop_strategy_enum): ... to this.
* config/riscv/riscv-string.cc (riscv_expand_block_move): New
wrapper expander handling the strategies and delegation.
(riscv_expand_block_move_scalar): Rename function and make
static.
(expand_block_move): Remove strategy handling.
* config/riscv/riscv.md: Call expander wrapper.
* config/riscv/riscv.opt: Rename.
---
gcc/config/riscv/riscv-opts.h | 18 ++--
gcc/config/riscv/riscv-string.cc  | 92 +++
gcc/config/riscv/riscv.md |  4 +-
gcc/config/riscv/riscv.opt| 18 ++--
.../riscv/rvv/base/cpymem-strategy-1.c|  2 +-
.../riscv/rvv/base/cpymem-strategy-2.c|  2 +-
.../riscv/rvv/base/cpymem-strategy-3.c|  2 +-
.../riscv/rvv/base/cpymem-strategy-4.c|  2 +-
.../riscv/rvv/base/cpymem-strategy-5.c|  2 +-
9 files changed, 78 insertions(+), 64 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index e6e55ad7071..30efebbf07b 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -104,15 +104,15 @@ enum riscv_entity
};
/* RISC-V stringop strategy. */
-enum riscv_stringop_strategy_enum {
-  /* Use scalar or vector instructions. */
-  USE_AUTO,
-  /* Always use a library call. */
-  USE_LIBCALL,
-  /* Only use scalar instructions. */
-  USE_SCALAR,
-  /* Only use vector instructions. */
-  USE_VECTOR
+enum stringop_strategy_enum {
+  /* No expansion. */
+  STRATEGY_LIBCALL = 1,
+  /* Use scalar expansion if possible. */
+  STRATEGY_SCALAR = 2,
+  /* Only vector expansion if possible. */
+  STRATEGY_VECTOR = 4,
+  /* Use any. */
+  STRATEGY_AUTO = STRATEGY_SCALAR | STRATEGY_VECTOR
};
#define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && TARGET_64BIT))
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 80e3b5981af..f3a4d3ddd47 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -707,51 +707,68 @@ riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
/* Expand a cpymemsi instruction, which copies LENGTH bytes from
memory reference SRC to memory reference DEST.  */
-bool
-riscv_expand_block_move (rtx dest, rtx src, rtx length)
+static bool
+riscv_expand_block_move_scalar (rtx dest, rtx src, rtx length)
{
-  if (riscv_memcpy_strategy == USE_LIBCALL
-  || riscv_memcpy_strategy == USE_VECTOR)
+  if (!CONST_INT_P (length))
 return false;
-  if (CONST_INT_P (length))
-{
-  unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
-  unsigned HOST_WIDE_INT factor, align;
+  unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
+  unsigned HOST_WIDE_INT factor, align;
-  align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
-  factor = BITS_PER_WORD / align;
+  align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
+  factor = BITS_PER_WORD / align;
-  if (optimize_function_for_size_p (cfun)
-   && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false))
- return false;
+  if (optimize_function_for_size_p (cfun)
+  && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false))
+return false;
-  if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
+  if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
+{
+  riscv_block_move_straight (dest, src, INTVAL (length));
+  return true;
+}
+  else if (optimize && align >= BITS_PER_WORD)
+{
+  unsigned min_iter_words
+ = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD;
+  unsigned iter_words = min_iter_words;
+  unsigned HOST_WIDE_INT bytes = hwi_length;
+  unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD;
+
+  /* Lengthen the loop body if it shortens the tail.  */
+  for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++)
{
-   riscv_block_move_straight (dest, src, INTVAL (length));
-   return true;
+   unsigned cur_cost = iter_words + words % iter_words;
+   unsigned new_cost = i + words % i;
+   if (new_cost <= cur_cost)
+ iter_words = i;
}
-  else if (optimize && align >= BITS_PER_WORD)
- {
-   unsigned min_iter_words
- = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD;
-   unsigned iter_words = min_iter_wo

Re: [PATCH] RISC-V: Fix rawmemchr implementation.

2023-12-01 Thread 钟居哲
LGTM.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-12-01 23:20
To: gcc-patches; palmer; Kito Cheng; jeffreyalaw; juzhe.zh...@rivai.ai
CC: rdapp.gcc
Subject: [PATCH] RISC-V: Fix rawmemchr implementation.
Hi,
 
this fixes a bug in the rawmemchr implementation by incrementing the
source address by vl * element_size instead of just vl.
 
This is normally harmless as we will just scan the same region more than
once but, in combination with an older qemu version, would lead to
an execution failure in SPEC2017.
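
A scalar model of the stride computation, as a rough sketch only: the
function below scans a buffer of T in chunks of VL elements and advances
the byte address by the number of elements processed times the element
size.  chunked_find, check_chunked_find and the fixed chunk size are
hypothetical and much simpler than the real first-fault load loop; as with
rawmemchr, the needle is assumed to be present.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the loop: returns the element index of the
// first match.  The needle must be present, otherwise behavior is undefined.
template <typename T>
static std::size_t
chunked_find (const T *src, T needle, std::size_t vl)
{
  const unsigned char *addr = reinterpret_cast<const unsigned char *> (src);
  std::size_t cnt = 0; // elements processed in the previous iteration
  for (;;)
    {
      // Bump the byte address by cnt * element_size, not just by cnt.
      addr += cnt * sizeof (T);
      const T *chunk = reinterpret_cast<const T *> (addr);
      for (cnt = 0; cnt < vl; ++cnt)
        if (chunk[cnt] == needle)
          return static_cast<std::size_t> (chunk + cnt - src);
    }
}

// Small usage check with 4-byte and 1-byte elements.
static void
check_chunked_find ()
{
  const std::uint32_t words[] = {1, 2, 3, 4, 5, 6, 7, 42};
  assert (chunked_find<std::uint32_t> (words, 42u, 3) == 7);
  const unsigned char bytes[] = {'a', 'b', 'c', 0};
  assert (chunked_find<unsigned char> (bytes, 0, 2) == 3);
}
```

With 1-byte elements the two bumps coincide, which is why the problem only
shows up for wider element modes.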
 
Regards
Robin
 
 
gcc/ChangeLog:
 
* config/riscv/riscv-string.cc (expand_rawmemchr): Increment
source address by vl * element_size.
---
gcc/config/riscv/riscv-string.cc | 13 +++--
1 file changed, 7 insertions(+), 6 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index f3a4d3ddd47..594ff49fc5a 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -1017,6 +1017,8 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   machine_mode mask_mode = riscv_vector::get_mask_mode (vmode);
   rtx cnt = gen_reg_rtx (Pmode);
+  emit_move_insn (cnt, CONST0_RTX (Pmode));
+
   rtx end = gen_reg_rtx (Pmode);
   rtx vec = gen_reg_rtx (vmode);
   rtx mask = gen_reg_rtx (mask_mode);
@@ -1033,6 +1035,11 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   rtx vsrc = change_address (src, vmode, src_addr);
+  /* Bump the pointer.  */
+  rtx step = gen_reg_rtx (Pmode);
+  emit_insn (gen_rtx_SET (step, gen_rtx_ASHIFT (Pmode, cnt, GEN_INT (shift))));
+  emit_insn (gen_rtx_SET (src_addr, gen_rtx_PLUS (Pmode, src_addr, step)));
+
   /* Emit a first-fault load.  */
   rtx vlops[] = {vec, vsrc};
   emit_vlmax_insn (code_for_pred_fault_load (vmode),
@@ -1055,16 +1062,10 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   emit_nonvlmax_insn (code_for_pred_ffs (mask_mode, Pmode),
  riscv_vector::CPOP_OP, vfops, cnt);
-  /* Bump the pointer.  */
-  emit_insn (gen_rtx_SET (src_addr, gen_rtx_PLUS (Pmode, src_addr, cnt)));
-
   /* Emit the loop condition.  */
   rtx test = gen_rtx_LT (VOIDmode, end, const0_rtx);
   emit_jump_insn (gen_cbranch4 (Pmode, test, end, const0_rtx, loop));
-  /*  We overran by CNT, subtract it.  */
-  emit_insn (gen_rtx_SET (src_addr, gen_rtx_MINUS (Pmode, src_addr, cnt)));
-
   /*  We found something at SRC + END * [1,2,4,8].  */
   emit_insn (gen_rtx_SET (end, gen_rtx_ASHIFT (Pmode, end, GEN_INT (shift))));
   emit_insn (gen_rtx_SET (dst, gen_rtx_PLUS (Pmode, src_addr, end)));
-- 
2.43.0
 
 


Re: [PATCH] c++: decltype of (non-captured variable) [PR83167]

2023-12-01 Thread Patrick Palka
On Fri, 1 Dec 2023, Jason Merrill wrote:

> On 12/1/23 12:32, Patrick Palka wrote:
> > On Tue, 14 Nov 2023, Jason Merrill wrote:
> > 
> > > On 11/14/23 11:10, Patrick Palka wrote:
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > > trunk?
> > > > 
> > > > -- >8 --
> > > > 
> > > > For decltype((x)) within a lambda where x is not captured, we dubiously
> > > > require that the lambda has a capture default, unlike for decltype(x).
> > > > This patch fixes this inconsistency; I couldn't find a justification for
> > > > it in the standard.
> > > 
> > > The relevant passage seems to be
> > > 
> > > https://eel.is/c++draft/expr.prim#id.unqual-3
> > > 
> > > "If naming the entity from outside of an unevaluated operand within S
> > > would
> > > refer to an entity captured by copy in some intervening lambda-expression,
> > > then let E be the innermost such lambda-expression.
> > > 
> > > If there is such a lambda-expression and if P is in E's function parameter
> > > scope but not its parameter-declaration-clause, then the type of the
> > > expression is the type of a class member access expression ([expr.ref])
> > > naming
> > > the non-static data member that would be declared for such a capture in
> > > the
> > > object parameter ([dcl.fct]) of the function call operator of E."
> > > 
> > > In this case I guess there is no such lambda-expression because naming x
> > > won't
> > > refer to a capture by copy if the lambda doesn't capture anything, so we
> > > ignore the lambda.
> > > 
> > > Maybe refer to that in a comment?  OK with that change.
> > > 
> > > I'm surprised that it refers specifically to capture by copy, but I guess
> > > a
> > > capture by reference should have the same decltype as the captured
> > > variable?
> > 
> > Ah, seems like it.  So maybe we should get rid of the redundant
> > by-reference capture-default handling, to more closely mirror the
> > standard?
> > 
> > Also now that r14-6026-g73e2bdbf9bed48 made capture_decltype return
> > NULL_TREE to mean the capture is dependent, it seems we should just
> > inline capture_decltype into finish_decltype_type rather than
> > introducing another special return value to mean "fall back to ordinary
> > handling".
> > 
> > How does the following look?  Bootstrapped and regtested on
> > x86_64-pc-linux-gnu.
> > 
> > -- >8 --
> > 
> > PR c++/83167
> > 
> > gcc/cp/ChangeLog:
> > 
> > * semantics.cc (capture_decltype): Inline into its only caller ...
> > (finish_decltype_type): ... here.  Update nearby comment to refer
> > to recent standard.  Restrict uncaptured variable handling to just
> > lambdas with a by-copy capture-default.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp0x/lambda/lambda-decltype4.C: New test.
> > ---
> >   gcc/cp/semantics.cc   | 107 +++---
> >   .../g++.dg/cpp0x/lambda/lambda-decltype4.C|  15 +++
> >   2 files changed, 55 insertions(+), 67 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype4.C
> > 
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index fbbc18336a0..fb4c3992e34 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -53,7 +53,6 @@ along with GCC; see the file COPYING3.  If not see
> > static tree maybe_convert_cond (tree);
> >   static tree finalize_nrv_r (tree *, int *, void *);
> > -static tree capture_decltype (tree);
> > /* Used for OpenMP non-static data member privatization.  */
> >   @@ -11856,21 +11855,48 @@ finish_decltype_type (tree expr, bool
> > id_expression_or_member_access_p,
> >   }
> > else
> >   {
> > -  /* Within a lambda-expression:
> > -
> > -Every occurrence of decltype((x)) where x is a possibly
> > -parenthesized id-expression that names an entity of
> > -automatic storage duration is treated as if x were
> > -transformed into an access to a corresponding data member
> > -of the closure type that would have been declared if x
> > -were a use of the denoted entity.  */
> > if (outer_automatic_var_p (STRIP_REFERENCE_REF (expr))
> >   && current_function_decl
> >   && LAMBDA_FUNCTION_P (current_function_decl))
> > {
> > - type = capture_decltype (STRIP_REFERENCE_REF (expr));
> > - if (!type)
> > -   goto dependent;
> > + /* [expr.prim.id.unqual]/3: If naming the entity from outside of an
> > +unevaluated operand within S would refer to an entity captured by
> > +copy in some intervening lambda-expression, then let E be the
> > +innermost such lambda-expression.
> > +
> > +If there is such a lambda-expression and if P is in E's function
> > +parameter scope but not its parameter-declaration-clause, then
> > the
> > +type of the expression is the type of a class member access
> > +expression naming the non-static data member that would be
> > declared
> > +for such a capt

[PATCH, v3] Fortran: deferred-length character optional dummy arguments [PR93762,PR100651]

2023-12-01 Thread Harald Anlauf

Dear all,

this patch extends the previous version by adding further code that also
tests for the presence of an optional deferred-length character argument
in the function initialization code.  This allows re-enabling a test
that was commented out in v2.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald


On 11/28/23 20:56, Harald Anlauf wrote:

Hi FX,

On 11/28/23 18:07, FX Coudert wrote:

Hi Harald,

The patch looks OK to me. Probably wait a bit for another opinion,
since I’m not that active and I may have missed something.

Thanks,
FX


thanks for having a look.

In the meantime I got an automated mail from the Linaro testers.
According to it there is a runtime failure of the testcase on
aarch64.  I couldn't see any useful traceback or anything else.

I tried the testcase on x86 with different options and found
an unexpected result only with -fsanitize=undefined and only
for the case of a rank-1 dummy when there is no actual argument
and the dummy is then passed to another subroutine.  (valgrind is happy.)

Reduced reproducer:

! this fails with -fsanitize=undefined
program main
   call test_rank1 ()
contains
   subroutine test_rank1 (msg1)
     character(:), optional, allocatable :: msg1(:)
     if (present (msg1)) stop 77
     call assert_rank1 ()    ! <- no problem here
     call assert_rank1 (msg1)    ! <- problematic code path
   end

   subroutine assert_rank1 (msg2)
     character(:), optional, allocatable :: msg2(:)
     if (present (msg2)) stop 99 ! <- no problem if commented
   end
end


As far as I can tell, this could be a pre-existing (latent)
issue.  By looking at the tree-dump, the only thing that
appears fishy has been there before.  But then I am only
guessing that this is the problem observed on aarch64.

I have disabled the related call in the testcase of the
attached revised version.  As I do not see anything else,
I wonder if one could proceed with the current version
but open a PR for the reduced case above, unless someone
can pinpoint the place that is responsible for the above
failure.  (Is it the caller, or rather the function entry
code in the callee?)

Cheers,
Harald

From b0a169bd70c9cd175c25b4a9543b24058596bf5e Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Fri, 1 Dec 2023 22:44:30 +0100
Subject: [PATCH] Fortran: deferred-length character optional dummy arguments
 [PR93762,PR100651]

gcc/fortran/ChangeLog:

	PR fortran/93762
	PR fortran/100651
	* trans-array.cc (gfc_trans_deferred_array): Add presence check
	for optional deferred-length character dummy arguments.
	* trans-expr.cc (gfc_conv_missing_dummy): The character length for
	deferred-length dummy arguments is passed by reference, so that
	its value can be returned.  Adjust handling for optional dummies.

gcc/testsuite/ChangeLog:

	PR fortran/93762
	PR fortran/100651
	* gfortran.dg/optional_deferred_char_1.f90: New test.
---
 gcc/fortran/trans-array.cc|   9 ++
 gcc/fortran/trans-expr.cc |  22 +++-
 .../gfortran.dg/optional_deferred_char_1.f90  | 100 ++
 3 files changed, 127 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/optional_deferred_char_1.f90

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index bbb81f40aa9..82f60a656f3 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -11430,6 +11430,15 @@ gfc_trans_deferred_array (gfc_symbol * sym, gfc_wrapped_block * block)
 {
   gfc_conv_string_length (sym->ts.u.cl, NULL, &init);
   gfc_trans_vla_type_sizes (sym, &init);
+
+  /* Presence check of optional deferred-length character dummy.  */
+  if (sym->ts.deferred && sym->attr.dummy && sym->attr.optional)
+	{
+	  tmp = gfc_finish_block (&init);
+	  tmp = build3_v (COND_EXPR, gfc_conv_expr_present (sym),
+			  tmp, build_empty_stmt (input_location));
+	  gfc_add_expr_to_block (&init, tmp);
+	}
 }
 
   /* Dummy, use associated and result variables don't need anything special.  */
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 6a47af39c57..ea087294249 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -2125,10 +2125,24 @@ gfc_conv_missing_dummy (gfc_se * se, gfc_expr * arg, gfc_typespec ts, int kind)
 
   if (ts.type == BT_CHARACTER)
 {
-  tmp = build_int_cst (gfc_charlen_type_node, 0);
-  tmp = fold_build3_loc (input_location, COND_EXPR, gfc_charlen_type_node,
-			 present, se->string_length, tmp);
-  tmp = gfc_evaluate_now (tmp, &se->pre);
+  /* Handle deferred-length dummies that pass the character length by
+	 reference so that the value can be returned.  */
+  if (ts.deferred && INDIRECT_REF_P (se->string_length))
+	{
+	  tmp = gfc_build_addr_expr (NULL_TREE, se->string_length);
+	  tmp = fold_build3_loc (input_location, COND_EXPR, TREE_TYPE (tmp),
+ present, tmp, null_pointer_node);
+	  tmp = gfc_evaluate_now (tmp, &se->pre);
+	  tmp = build_fold_indirect_ref_loc (input_location, tmp);
+	}
+  

Re: [PATCH] Fortran: copy-out for possibly missing OPTIONAL CLASS arguments [PR112772]

2023-12-01 Thread Harald Anlauf

Hi Mikael,

On 12/1/23 21:24, Mikael Morin wrote:

Hello,

Le 30/11/2023 à 22:06, Harald Anlauf a écrit :

the attached rather obvious patch fixes the first testcase of pr112772:
we unconditionally generated copy-out code for optional class arguments,
while the copy-in properly checked the presence of arguments.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?


Looks good.
Thanks.


ok, will commit.


(The second testcase is a different issue.)


Maybe use a separate PR?

Mikael



I just found a fix that is regtesting, and which will allow
re-enabling the test failing with ASAN in the patch for PR100651.
Will merge that fix into the previous patch and submit a v3 later.

Thanks,
Harald



Re: [PATCH] gcc: Disallow trampolines when -fhardened

2023-12-01 Thread Jakub Jelinek
On Fri, Dec 01, 2023 at 03:53:14PM -0500, Marek Polacek wrote:
> On Fri, Dec 01, 2023 at 11:44:28AM -0800, Andrew Pinski wrote:
> > On Fri, Dec 1, 2023, 11:36 Marek Polacek  wrote:
> > 
> > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > >
> > > -- >8 --
> > > It came up that a good hardening strategy is to disable trampolines
> > > which may require executable stack.  Therefore the following patch
> > > adds -Werror=trampolines to -fhardened.
> > >
> > 
> > It might make sense to add a fortran testcase too. Especially when that and
> > Ada are 2 biggest users of trampolines.
> 
> I don't know either of these languages to write a test, and I don't see
> anything that mentions the word trampoline in gfortran.dg/.  Ada has

program nesting
  integer :: i
  procedure(), pointer :: p
  p => foo
  i = 5
  call p
  if (i.ne.6) stop 1
contains
  subroutine foo
    i = i + 1
  end subroutine
end program

(obviously at -O0 only)?

Jakub



Re: [PATCH] gcc: Disallow trampolines when -fhardened

2023-12-01 Thread Marek Polacek
On Fri, Dec 01, 2023 at 11:44:28AM -0800, Andrew Pinski wrote:
> On Fri, Dec 1, 2023, 11:36 Marek Polacek  wrote:
> 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> >
> > -- >8 --
> > It came up that a good hardening strategy is to disable trampolines
> > which may require executable stack.  Therefore the following patch
> > adds -Werror=trampolines to -fhardened.
> >
> 
> It might make sense to add a fortran testcase too. Especially when that and
> Ada are 2 biggest users of trampolines.

I don't know either of these languages to write a test, and I don't see
anything that mentions the word trampoline in gfortran.dg/.  Ada has
gnat.dg/trampoline3.adb but:

$ gcc -c -Wtrampolines trampoline3.adb
trampoline3.adb:6:03: warning: variable "A" is read but never assigned [-gnatwv]

so there is no warning.

Marek



Re: [PATCH] libsupc++: try cxa_thread_atexit_impl at runtime

2023-12-01 Thread Jason Merrill

On 12/1/23 15:40, Alexandre Oliva wrote:

On Nov  9, 2023, Jonathan Wakely  wrote:


On Thu, 9 Nov 2023 at 01:56, Alexandre Oliva  wrote:



g++.dg/tls/thread_local-order2.C fails when the toolchain is built for
a platform that lacks __cxa_thread_atexit_impl, even if the program is
built and run using that toolchain on a (later) platform that offers
__cxa_thread_atexit_impl.

This patch adds runtime testing for __cxa_thread_atexit_impl on
platforms that support weak symbols.

Regstrapped on x86_64-linux-gnu, also tested with gcc-13 on i686- and
x86_64-, and with ac_cv_func___cxa_thread_atexit_impl=no, that, on a
distro that lacks __cxa_thread_atexit in libc, forces the newly-added
code to be exercised, and that enabled thread_local-order2.C to pass
where the runtime libc has __cxa_thread_atexit_impl.  Ok to install?
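
For readers unfamiliar with the technique, a minimal sketch of a runtime
check through a weak symbol follows.  It is not the libsupc++ code: it
assumes an ELF target with GNU-style weak symbols, and have_thread_atexit_impl,
pick_thread_dtor_path, the dtor_path enum and check_thread_dtor_path are
hypothetical names used only for illustration.

```cpp
#include <cassert>

// Weak declaration: if the C library used at run time does not provide the
// symbol, its address is null instead of the link failing.
extern "C" int __cxa_thread_atexit_impl (void (*) (void *), void *, void *)
  __attribute__ ((weak));

// Hypothetical helper: true if the running libc offers the function.
static bool
have_thread_atexit_impl ()
{
  return __cxa_thread_atexit_impl != nullptr;
}

// Hypothetical dispatcher: which registration path would be used.
enum dtor_path { PATH_LIBC, PATH_FALLBACK };

static dtor_path
pick_thread_dtor_path ()
{
  return have_thread_atexit_impl () ? PATH_LIBC : PATH_FALLBACK;
}

// Small usage check: the two helpers must agree with each other.
static void
check_thread_dtor_path ()
{
  assert (have_thread_atexit_impl ()
          == (pick_thread_dtor_path () == PATH_LIBC));
}
```

The decision is made when the program runs rather than when the toolchain
is configured, which is the point of the change described above.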



Seems fine to me. Any objections, Jason?


Jason, ping?
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635750.html


OK by me.

Jason



Re: [PATCH] libsupc++: try cxa_thread_atexit_impl at runtime

2023-12-01 Thread Alexandre Oliva
On Nov  9, 2023, Jonathan Wakely  wrote:

> On Thu, 9 Nov 2023 at 01:56, Alexandre Oliva  wrote:

>> g++.dg/tls/thread_local-order2.C fails when the toolchain is built for
>> a platform that lacks __cxa_thread_atexit_impl, even if the program is
>> built and run using that toolchain on a (later) platform that offers
>> __cxa_thread_atexit_impl.
>> 
>> This patch adds runtime testing for __cxa_thread_atexit_impl on
>> platforms that support weak symbols.
>> 
>> Regstrapped on x86_64-linux-gnu, also tested with gcc-13 on i686- and
>> x86_64-, and with ac_cv_func___cxa_thread_atexit_impl=no, that, on a
>> distro that lacks __cxa_thread_atexit in libc, forces the newly-added
>> code to be exercised, and that enabled thread_local-order2.C to pass
>> where the runtime libc has __cxa_thread_atexit_impl.  Ok to install?

> Seems fine to me. Any objections, Jason?

Jason, ping?
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635750.html

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH] c++: decltype of (non-captured variable) [PR83167]

2023-12-01 Thread Jason Merrill

On 12/1/23 12:32, Patrick Palka wrote:

On Tue, 14 Nov 2023, Jason Merrill wrote:


On 11/14/23 11:10, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

For decltype((x)) within a lambda where x is not captured, we dubiously
require that the lambda has a capture default, unlike for decltype(x).
This patch fixes this inconsistency; I couldn't find a justification for
it in the standard.


The relevant passage seems to be

https://eel.is/c++draft/expr.prim#id.unqual-3

"If naming the entity from outside of an unevaluated operand within S would
refer to an entity captured by copy in some intervening lambda-expression,
then let E be the innermost such lambda-expression.

If there is such a lambda-expression and if P is in E's function parameter
scope but not its parameter-declaration-clause, then the type of the
expression is the type of a class member access expression ([expr.ref]) naming
the non-static data member that would be declared for such a capture in the
object parameter ([dcl.fct]) of the function call operator of E."

In this case I guess there is no such lambda-expression because naming x won't
refer to a capture by copy if the lambda doesn't capture anything, so we
ignore the lambda.

Maybe refer to that in a comment?  OK with that change.

I'm surprised that it refers specifically to capture by copy, but I guess a
capture by reference should have the same decltype as the captured variable?


Ah, seems like it.  So maybe we should get rid of the redundant
by-reference capture-default handling, to more closely mirror the
standard?
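
As a small illustration of the cases discussed above (not part of the patch), the following sketch checks the long-standing behaviour for lambdas that do have a capture-default. The function name decltype_in_lambda_demo is made up. The case this PR is about, decltype((x)) in a lambda with no capture-default, is deliberately left out because compilers without the fix reject it.

```cpp
#include <type_traits>

// Returns true; the interesting checks are the static_asserts, which are
// evaluated at compile time.
inline bool decltype_in_lambda_demo() {
  int x = 0;

  auto by_copy = [=] {
    // decltype(x) is always the declared type of the variable.
    static_assert(std::is_same<decltype(x), int>::value, "declared type");
    // decltype((x)) is the type of an access to the would-be closure
    // member; operator() is const here, so the access is a const lvalue.
    static_assert(std::is_same<decltype((x)), const int&>::value,
                  "by-copy, non-mutable");
  };

  auto by_copy_mutable = [=]() mutable {
    // A mutable lambda has a non-const operator(), so no const is added.
    static_assert(std::is_same<decltype((x)), int&>::value,
                  "by-copy, mutable");
  };

  auto by_ref = [&] {
    // A by-reference capture has the same decltype as the variable itself.
    static_assert(std::is_same<decltype((x)), int&>::value, "by-reference");
  };

  by_copy();
  by_copy_mutable();
  by_ref();
  (void)x;
  return true;
}
```

Note that x is never odr-used inside the lambdas, so nothing is actually captured; the types are those that a capture would have had.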

Also now that r14-6026-g73e2bdbf9bed48 made capture_decltype return
NULL_TREE to mean the capture is dependent, it seems we should just
inline capture_decltype into finish_decltype_type rather than
introducing another special return value to mean "fall back to ordinary
handling".

How does the following look?  Bootstrapped and regtested on
x86_64-pc-linux-gnu.

-- >8 --

PR c++/83167

gcc/cp/ChangeLog:

* semantics.cc (capture_decltype): Inline into its only caller ...
(finish_decltype_type): ... here.  Update nearby comment to refer
to recent standard.  Restrict uncaptured variable handling to just
lambdas with a by-copy capture-default.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-decltype4.C: New test.
---
  gcc/cp/semantics.cc   | 107 +++---
  .../g++.dg/cpp0x/lambda/lambda-decltype4.C|  15 +++
  2 files changed, 55 insertions(+), 67 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype4.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index fbbc18336a0..fb4c3992e34 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -53,7 +53,6 @@ along with GCC; see the file COPYING3.  If not see
  
  static tree maybe_convert_cond (tree);

  static tree finalize_nrv_r (tree *, int *, void *);
-static tree capture_decltype (tree);
  
  /* Used for OpenMP non-static data member privatization.  */
  
@@ -11856,21 +11855,48 @@ finish_decltype_type (tree expr, bool id_expression_or_member_access_p,

  }
else
  {
-  /* Within a lambda-expression:
-
-Every occurrence of decltype((x)) where x is a possibly
-parenthesized id-expression that names an entity of
-automatic storage duration is treated as if x were
-transformed into an access to a corresponding data member
-of the closure type that would have been declared if x
-were a use of the denoted entity.  */
if (outer_automatic_var_p (STRIP_REFERENCE_REF (expr))
  && current_function_decl
  && LAMBDA_FUNCTION_P (current_function_decl))
{
- type = capture_decltype (STRIP_REFERENCE_REF (expr));
- if (!type)
-   goto dependent;
+ /* [expr.prim.id.unqual]/3: If naming the entity from outside of an
+unevaluated operand within S would refer to an entity captured by
+copy in some intervening lambda-expression, then let E be the
+innermost such lambda-expression.
+
+If there is such a lambda-expression and if P is in E's function
+parameter scope but not its parameter-declaration-clause, then the
+type of the expression is the type of a class member access
+expression naming the non-static data member that would be declared
+for such a capture in the object parameter of the function call
+operator of E."  */


Hmm, looks like this code is only checking the innermost lambda, it 
needs to check all containing lambdas for one that would capture it by copy.



+ tree decl = STRIP_REFERENCE_REF (expr);
+ tree lam = CLASSTYPE_LAMBDA_EXPR (DECL_CONTEXT 
(current_function_decl));
+ tree cap = lookup_name (DECL_NAME (decl), LOOK_where::BLOCK,
+ LOOK_want::HI

Re: [PATCH] Fortran: copy-out for possibly missing OPTIONAL CLASS arguments [PR112772]

2023-12-01 Thread Mikael Morin

Hello,

Le 30/11/2023 à 22:06, Harald Anlauf a écrit :

the attached rather obvious patch fixes the first testcase of pr112772:
we unconditionally generated copy-out code for optional class arguments,
while the copy-in properly checked the presence of arguments.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?


Looks good.
Thanks.


(The second testcase is a different issue.)


Maybe use a separate PR?

Mikael


Re: [PATCH] gcc: Disallow trampolines when -fhardened

2023-12-01 Thread Andrew Pinski
On Fri, Dec 1, 2023, 11:36 Marek Polacek  wrote:

> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>
> -- >8 --
> It came up that a good hardening strategy is to disable trampolines
> which may require executable stack.  Therefore the following patch
> adds -Werror=trampolines to -fhardened.
>

It might make sense to add a Fortran testcase too, especially since Fortran
and Ada are the two biggest users of trampolines.

Thanks,
Andrew




> gcc/ChangeLog:
>
> * common.opt (Wtrampolines): Enable by -fhardened.
> * doc/invoke.texi: Reflect that -fhardened enables
> -Werror=trampolines.
> * opts.cc (print_help_hardened): Add -Werror=trampolines.
> * toplev.cc (process_options): Enable -Werror=trampolines for
> -fhardened.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/fhardened-1.c: New test.
> * gcc.dg/fhardened-2.c: New test.
> * gcc.dg/fhardened-3.c: New test.
> * gcc.dg/fhardened-4.c: New test.
> * gcc.dg/fhardened-5.c: New test.
> ---
>  gcc/common.opt |  2 +-
>  gcc/doc/invoke.texi|  1 +
>  gcc/opts.cc|  1 +
>  gcc/testsuite/gcc.dg/fhardened-1.c | 27 +++
>  gcc/testsuite/gcc.dg/fhardened-2.c | 25 +
>  gcc/testsuite/gcc.dg/fhardened-3.c | 25 +
>  gcc/testsuite/gcc.dg/fhardened-4.c | 25 +
>  gcc/testsuite/gcc.dg/fhardened-5.c | 27 +++
>  gcc/toplev.cc  |  8 +++-
>  9 files changed, 139 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-4.c
>  create mode 100644 gcc/testsuite/gcc.dg/fhardened-5.c
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 161a035d736..9b09c7cb3df 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -807,7 +807,7 @@ Common Var(warn_system_headers) Warning
>  Do not suppress warnings from system headers.
>
>  Wtrampolines
> -Common Var(warn_trampolines) Warning
> +Common Var(warn_trampolines) Warning EnabledBy(fhardened)
>  Warn whenever a trampoline is generated.
>
>  Wtrivial-auto-var-init
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 2fab4c5d71f..c1664a1a0f1 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -17745,6 +17745,7 @@ may change between major releases of GCC, but are
> currently:
>  -fstack-protector-strong
>  -fstack-clash-protection
>  -fcf-protection=full @r{(x86 GNU/Linux only)}
> +-Werror=trampolines
>  }
>
>  The list of options enabled by @option{-fhardened} can be generated using
> diff --git a/gcc/opts.cc b/gcc/opts.cc
> index 5d5efaf1b9e..aa062b87cef 100644
> --- a/gcc/opts.cc
> +++ b/gcc/opts.cc
> @@ -2517,6 +2517,7 @@ print_help_hardened ()
>printf ("  %s\n", "-fstack-protector-strong");
>printf ("  %s\n", "-fstack-clash-protection");
>printf ("  %s\n", "-fcf-protection=full");
> +  printf ("  %s\n", "-Werror=trampolines");
>putchar ('\n');
>  }
>
> diff --git a/gcc/testsuite/gcc.dg/fhardened-1.c
> b/gcc/testsuite/gcc.dg/fhardened-1.c
> new file mode 100644
> index 000..8710959b6f1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/fhardened-1.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
> +/* { dg-require-effective-target trampolines } */
> +/* { dg-options "-fhardened -O" } */
> +
> +static void
> +baz (int (*bar) (void))
> +{
> +  bar ();
> +}
> +
> +int
> +main (void)
> +{
> +  int a = 6;
> +
> +  int
> +  bar (void)   // { dg-error "trampoline" }
> +  {
> +return a;
> +  }
> +
> +  baz (bar);
> +
> +  return 0;
> +}
> +
> +/* { dg-prune-output "some warnings being treated as errors" } */
> diff --git a/gcc/testsuite/gcc.dg/fhardened-2.c
> b/gcc/testsuite/gcc.dg/fhardened-2.c
> new file mode 100644
> index 000..d47512aa47f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/fhardened-2.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
> +/* { dg-require-effective-target trampolines } */
> +/* { dg-options "-fhardened -O -Wno-trampolines" } */
> +
> +static void
> +baz (int (*bar) (void))
> +{
> +  bar ();
> +}
> +
> +int
> +main (void)
> +{
> +  int a = 6;
> +
> +  int
> +  bar (void)   // { dg-bogus "trampoline" }
> +  {
> +return a;
> +  }
> +
> +  baz (bar);
> +
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/fhardened-3.c
> b/gcc/testsuite/gcc.dg/fhardened-3.c
> new file mode 100644
> index 000..cebae13d8be
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/fhardened-3.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
> +/* { dg-require-effective-target trampolines } */
> +/* { dg-options "-fhardened -O -Wno-error" } */
> +
> +static void
> +baz (int (*bar) (void))
> +{
> +  bar ();
> +}
> +
> +int
> +

[PATCH] gcc: Disallow trampolines when -fhardened

2023-12-01 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
It came up that a good hardening strategy is to disable trampolines
which may require executable stack.  Therefore the following patch
adds -Werror=trampolines to -fhardened.

gcc/ChangeLog:

* common.opt (Wtrampolines): Enable by -fhardened.
* doc/invoke.texi: Reflect that -fhardened enables -Werror=trampolines.
* opts.cc (print_help_hardened): Add -Werror=trampolines.
* toplev.cc (process_options): Enable -Werror=trampolines for
-fhardened.

gcc/testsuite/ChangeLog:

* gcc.dg/fhardened-1.c: New test.
* gcc.dg/fhardened-2.c: New test.
* gcc.dg/fhardened-3.c: New test.
* gcc.dg/fhardened-4.c: New test.
* gcc.dg/fhardened-5.c: New test.
---
 gcc/common.opt |  2 +-
 gcc/doc/invoke.texi|  1 +
 gcc/opts.cc|  1 +
 gcc/testsuite/gcc.dg/fhardened-1.c | 27 +++
 gcc/testsuite/gcc.dg/fhardened-2.c | 25 +
 gcc/testsuite/gcc.dg/fhardened-3.c | 25 +
 gcc/testsuite/gcc.dg/fhardened-4.c | 25 +
 gcc/testsuite/gcc.dg/fhardened-5.c | 27 +++
 gcc/toplev.cc  |  8 +++-
 9 files changed, 139 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/fhardened-1.c
 create mode 100644 gcc/testsuite/gcc.dg/fhardened-2.c
 create mode 100644 gcc/testsuite/gcc.dg/fhardened-3.c
 create mode 100644 gcc/testsuite/gcc.dg/fhardened-4.c
 create mode 100644 gcc/testsuite/gcc.dg/fhardened-5.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 161a035d736..9b09c7cb3df 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -807,7 +807,7 @@ Common Var(warn_system_headers) Warning
 Do not suppress warnings from system headers.
 
 Wtrampolines
-Common Var(warn_trampolines) Warning
+Common Var(warn_trampolines) Warning EnabledBy(fhardened)
 Warn whenever a trampoline is generated.
 
 Wtrivial-auto-var-init
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 2fab4c5d71f..c1664a1a0f1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -17745,6 +17745,7 @@ may change between major releases of GCC, but are 
currently:
 -fstack-protector-strong
 -fstack-clash-protection
 -fcf-protection=full @r{(x86 GNU/Linux only)}
+-Werror=trampolines
 }
 
 The list of options enabled by @option{-fhardened} can be generated using
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 5d5efaf1b9e..aa062b87cef 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -2517,6 +2517,7 @@ print_help_hardened ()
   printf ("  %s\n", "-fstack-protector-strong");
   printf ("  %s\n", "-fstack-clash-protection");
   printf ("  %s\n", "-fcf-protection=full");
+  printf ("  %s\n", "-Werror=trampolines");
   putchar ('\n');
 }
 
diff --git a/gcc/testsuite/gcc.dg/fhardened-1.c 
b/gcc/testsuite/gcc.dg/fhardened-1.c
new file mode 100644
index 000..8710959b6f1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fhardened-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
+/* { dg-require-effective-target trampolines } */
+/* { dg-options "-fhardened -O" } */
+
+static void
+baz (int (*bar) (void))
+{
+  bar ();
+}
+
+int
+main (void)
+{
+  int a = 6;
+
+  int
+  bar (void)   // { dg-error "trampoline" }
+  {
+return a;
+  }
+
+  baz (bar);
+
+  return 0;
+}
+
+/* { dg-prune-output "some warnings being treated as errors" } */
diff --git a/gcc/testsuite/gcc.dg/fhardened-2.c 
b/gcc/testsuite/gcc.dg/fhardened-2.c
new file mode 100644
index 000..d47512aa47f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fhardened-2.c
@@ -0,0 +1,25 @@
+/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
+/* { dg-require-effective-target trampolines } */
+/* { dg-options "-fhardened -O -Wno-trampolines" } */
+
+static void
+baz (int (*bar) (void))
+{
+  bar ();
+}
+
+int
+main (void)
+{
+  int a = 6;
+
+  int
+  bar (void)   // { dg-bogus "trampoline" }
+  {
+return a;
+  }
+
+  baz (bar);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/fhardened-3.c 
b/gcc/testsuite/gcc.dg/fhardened-3.c
new file mode 100644
index 000..cebae13d8be
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fhardened-3.c
@@ -0,0 +1,25 @@
+/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
+/* { dg-require-effective-target trampolines } */
+/* { dg-options "-fhardened -O -Wno-error" } */
+
+static void
+baz (int (*bar) (void))
+{
+  bar ();
+}
+
+int
+main (void)
+{
+  int a = 6;
+
+  int
+  bar (void)   // { dg-warning "trampoline" }
+  {
+return a;
+  }
+
+  baz (bar);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/fhardened-4.c 
b/gcc/testsuite/gcc.dg/fhardened-4.c
new file mode 100644
index 000..7e62ed3385d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fhardened-4.c
@@ -0,0 +1,25 @@
+/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
+/* { dg-require-effective-target trampolines } */
+/* { dg-options "-fhardened -O -W

[COMMITTED] Use range_compatible_p in check_operands_p.

2023-12-01 Thread Andrew MacLeod
Comments in PR 112788 correctly brought up that the new 
check_operands_p() routine is directly checking precision rather than 
calling range_compatible_p().


Most earlier iterations of the original patch had ranges as arguments, 
and it wasn't primarily a CHECKING_P only call then...   Regardless, it 
makes total sense to call range_compatible_p so this patch does exactly 
that.  It required moving range_compatible_p() into value-range.h and 
then adjusting each check_operands_p() routine.


Now range type compatibility is centralized again :-P

Bootstraps on x86_64-pc-linux-gnu with no new regressions.

Andrew
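
To make the difference between the two checks concrete, here is a toy sketch. It does not use GCC's tree types: ToyType, precision_only_check and range_compatible_sketch are stand-ins invented for illustration, modelling only the precision and sign that range_compatible_p compares.

```cpp
// Toy stand-in for a GCC type node: only the two properties that matter here.
struct ToyType {
  unsigned precision;  // number of value bits
  bool is_unsigned;    // signedness
};

// What the operand checks did before: compare precision only.
constexpr bool precision_only_check(ToyType a, ToyType b) {
  return a.precision == b.precision;
}

// What a range compatibility test requires: same precision AND same sign.
constexpr bool range_compatible_sketch(ToyType a, ToyType b) {
  return a.precision == b.precision && a.is_unsigned == b.is_unsigned;
}

constexpr ToyType toy_int32{32, false};
constexpr ToyType toy_uint32{32, true};
constexpr ToyType toy_int64{64, false};

// Same precision but different sign: only the stricter check rejects it.
static_assert(precision_only_check(toy_int32, toy_uint32), "passes old check");
static_assert(!range_compatible_sketch(toy_int32, toy_uint32), "fails new check");
```

In this model a signed and an unsigned 32-bit operand pass the precision-only test but are not range compatible, which is the gap the centralized check closes.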

From c6bb413eeb9d13412e8101e3029099d7fd746708 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Fri, 1 Dec 2023 11:15:33 -0500
Subject: [PATCH] Use range_compatible_p in check_operands_p.

Instead of directly checking type precision, check_operands_p should
invoke range_compatible_p to keep the range checking centralized.

	* gimple-range-fold.h (range_compatible_p): Relocate.
	* value-range.h (range_compatible_p): Here.
	* range-op-mixed.h (operand_equal::operand_check_p): Call
	range_compatible_p rather than comparing precision.
	(operand_not_equal::operand_check_p): Ditto.
	(operand_not_lt::operand_check_p): Ditto.
	(operand_not_le::operand_check_p): Ditto.
	(operand_not_gt::operand_check_p): Ditto.
	(operand_not_ge::operand_check_p): Ditto.
	(operand_plus::operand_check_p): Ditto.
	(operand_abs::operand_check_p): Ditto.
	(operand_minus::operand_check_p): Ditto.
	(operand_negate::operand_check_p): Ditto.
	(operand_mult::operand_check_p): Ditto.
	(operand_bitwise_not::operand_check_p): Ditto.
	(operand_bitwise_xor::operand_check_p): Ditto.
	(operand_bitwise_and::operand_check_p): Ditto.
	(operand_bitwise_or::operand_check_p): Ditto.
	(operand_min::operand_check_p): Ditto.
	(operand_max::operand_check_p): Ditto.
	* range-op.cc (operand_lshift::operand_check_p): Ditto.
	(operand_rshift::operand_check_p): Ditto.
	(operand_logical_and::operand_check_p): Ditto.
	(operand_logical_or::operand_check_p): Ditto.
	(operand_logical_not::operand_check_p): Ditto.
---
 gcc/gimple-range-fold.h | 12 
 gcc/range-op-mixed.h| 43 -
 gcc/range-op.cc | 12 +---
 gcc/value-range.h   | 11 +++
 4 files changed, 33 insertions(+), 45 deletions(-)

diff --git a/gcc/gimple-range-fold.h b/gcc/gimple-range-fold.h
index fcbe1626790..0094b4e3f35 100644
--- a/gcc/gimple-range-fold.h
+++ b/gcc/gimple-range-fold.h
@@ -89,18 +89,6 @@ gimple_range_ssa_p (tree exp)
   return NULL_TREE;
 }
 
-// Return true if TYPE1 and TYPE2 are compatible range types.
-
-inline bool
-range_compatible_p (tree type1, tree type2)
-{
-  // types_compatible_p requires conversion in both directions to be useless.
-  // GIMPLE only requires a cast one way in order to be compatible.
-  // Ranges really only need the sign and precision to be the same.
-  return (TYPE_PRECISION (type1) == TYPE_PRECISION (type2)
-	  && TYPE_SIGN (type1) == TYPE_SIGN (type2));
-}
-
 // Source of all operands for fold_using_range and gori_compute.
 // It abstracts out the source of an operand so it can come from a stmt or
 // and edge or anywhere a derived class of fur_source wants.
diff --git a/gcc/range-op-mixed.h b/gcc/range-op-mixed.h
index 4386a68e946..7e3ee17ccbd 100644
--- a/gcc/range-op-mixed.h
+++ b/gcc/range-op-mixed.h
@@ -140,7 +140,7 @@ public:
 		   const irange &rh) const final override;
   // Check op1 and op2 for compatibility.
   bool operand_check_p (tree, tree t1, tree t2) const final override
-{ return TYPE_PRECISION (t1) == TYPE_PRECISION (t2); }
+{ return range_compatible_p (t1, t2); }
 };
 
 class operator_not_equal : public range_operator
@@ -179,7 +179,7 @@ public:
 		   const irange &rh) const final override;
   // Check op1 and op2 for compatibility.
   bool operand_check_p (tree, tree t1, tree t2) const final override
-{ return TYPE_PRECISION (t1) == TYPE_PRECISION (t2); }
+{ return range_compatible_p (t1, t2); }
 };
 
 class operator_lt :  public range_operator
@@ -215,7 +215,7 @@ public:
 		   const irange &rh) const final override;
   // Check op1 and op2 for compatibility.
   bool operand_check_p (tree, tree t1, tree t2) const final override
-{ return TYPE_PRECISION (t1) == TYPE_PRECISION (t2); }
+{ return range_compatible_p (t1, t2); }
 };
 
 class operator_le :  public range_operator
@@ -254,7 +254,7 @@ public:
 		   const irange &rh) const final override;
   // Check op1 and op2 for compatibility.
   bool operand_check_p (tree, tree t1, tree t2) const final override
-{ return TYPE_PRECISION (t1) == TYPE_PRECISION (t2); }
+{ return range_compatible_p (t1, t2); }
 };
 
 class operator_gt :  public range_operator
@@ -292,7 +292,7 @@ public:
 		   const irange &rh) const final override;
   // Check op1 and op2 for compatibility.
   bool operand_check_p (tree, tree t1, tree t2) const final override
-{ return T

Re: [pushed] c++: remove LAMBDA_EXPR_MUTABLE_P

2023-12-01 Thread Patrick Palka
On Thu, 30 Nov 2023, Jason Merrill wrote:

> Tested x86_64-pc-linux-gnu, applying to trunk.
> 
> -- 8< --
> 
> In review of the deducing 'this' patch it came up that LAMBDA_EXPR_MUTABLE_P
> doesn't make sense for a lambda with an explicit object parameter.  And it
> was never necessary, so let's remove it.
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (LAMBDA_EXPR_MUTABLE_P): Remove.
>   * cp-tree.def: Remove documentation.
>   * lambda.cc (build_lambda_expr): Remove reference.
>   * parser.cc (cp_parser_lambda_declarator_opt): Likewise.
>   * pt.cc (tsubst_lambda_expr): Likewise.
>   * ptree.cc (cxx_print_lambda_node): Likewise.
>   * semantics.cc (capture_decltype): Get the object quals
>   from the object instead.
> ---
>  gcc/cp/cp-tree.h| 5 -
>  gcc/cp/lambda.cc| 1 -
>  gcc/cp/parser.cc| 1 -
>  gcc/cp/pt.cc| 1 -
>  gcc/cp/ptree.cc | 2 --
>  gcc/cp/semantics.cc | 9 ++---
>  gcc/cp/cp-tree.def  | 3 +--
>  7 files changed, 7 insertions(+), 15 deletions(-)
> 
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index 5614b71eed4..964af1ddd85 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -461,7 +461,6 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
>TYPENAME_IS_CLASS_P (in TYPENAME_TYPE)
>STMT_IS_FULL_EXPR_P (in _STMT)
>TARGET_EXPR_LIST_INIT_P (in TARGET_EXPR)
> -  LAMBDA_EXPR_MUTABLE_P (in LAMBDA_EXPR)
>DECL_FINAL_P (in FUNCTION_DECL)
>QUALIFIED_NAME_IS_TEMPLATE (in SCOPE_REF)
>CONSTRUCTOR_IS_DEPENDENT (in CONSTRUCTOR)
> @@ -1478,10 +1477,6 @@ enum cp_lambda_default_capture_mode_type {
>  #define LAMBDA_EXPR_CAPTURES_THIS_P(NODE) \
>LAMBDA_EXPR_THIS_CAPTURE(NODE)
>  
> -/* Predicate tracking whether the lambda was declared 'mutable'.  */
> -#define LAMBDA_EXPR_MUTABLE_P(NODE) \
> -  TREE_LANG_FLAG_1 (LAMBDA_EXPR_CHECK (NODE))
> -
>  /* True iff uses of a const variable capture were optimized away.  */
>  #define LAMBDA_EXPR_CAPTURE_OPTIMIZED(NODE) \
>TREE_LANG_FLAG_2 (LAMBDA_EXPR_CHECK (NODE))
> diff --git a/gcc/cp/lambda.cc b/gcc/cp/lambda.cc
> index 34d0190a89b..be8d240944d 100644
> --- a/gcc/cp/lambda.cc
> +++ b/gcc/cp/lambda.cc
> @@ -44,7 +44,6 @@ build_lambda_expr (void)
>LAMBDA_EXPR_THIS_CAPTURE (lambda) = NULL_TREE;
>LAMBDA_EXPR_REGEN_INFO   (lambda) = NULL_TREE;
>LAMBDA_EXPR_PENDING_PROXIES  (lambda) = NULL;
> -  LAMBDA_EXPR_MUTABLE_P(lambda) = false;
>return lambda;
>  }
>  
> diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> index 2464d1a0783..1826b6175f5 100644
> --- a/gcc/cp/parser.cc
> +++ b/gcc/cp/parser.cc
> @@ -11770,7 +11770,6 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, 
> tree lambda_expr)
>  
>if (lambda_specs.storage_class == sc_mutable)
>  {
> -  LAMBDA_EXPR_MUTABLE_P (lambda_expr) = 1;
>quals = TYPE_UNQUALIFIED;
>  }
>else if (lambda_specs.storage_class == sc_static)
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index c18718b319d..00a808bf323 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -19341,7 +19341,6 @@ tsubst_lambda_expr (tree t, tree args, tsubst_flags_t 
> complain, tree in_decl)
>  = LAMBDA_EXPR_LOCATION (t);
>LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (r)
>  = LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (t);
> -  LAMBDA_EXPR_MUTABLE_P (r) = LAMBDA_EXPR_MUTABLE_P (t);
>if (tree ti = LAMBDA_EXPR_REGEN_INFO (t))
>  LAMBDA_EXPR_REGEN_INFO (r)
>= build_template_info (t, add_to_template_args (TI_ARGS (ti),
> diff --git a/gcc/cp/ptree.cc b/gcc/cp/ptree.cc
> index 32c5b5280dc..d1f58921fab 100644
> --- a/gcc/cp/ptree.cc
> +++ b/gcc/cp/ptree.cc
> @@ -265,8 +265,6 @@ cxx_print_identifier (FILE *file, tree node, int indent)
>  void
>  cxx_print_lambda_node (FILE *file, tree node, int indent)
>  {
> -  if (LAMBDA_EXPR_MUTABLE_P (node))
> -fprintf (file, " /mutable");
>fprintf (file, " default_capture_mode=[");
>switch (LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (node))
>  {
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 04b0540599a..36b57ac9524 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12792,9 +12792,12 @@ capture_decltype (tree decl)
>  
>if (!TYPE_REF_P (type))
>  {
> -  if (!LAMBDA_EXPR_MUTABLE_P (lam))
> - type = cp_build_qualified_type (type, (cp_type_quals (type)
> -|TYPE_QUAL_CONST));
> +  int quals = cp_type_quals (type);
> +  tree obtype = TREE_TYPE (DECL_ARGUMENTS (current_function_decl));
> +  gcc_checking_assert (!WILDCARD_TYPE_P (non_reference (obtype)));
> +  if (INDIRECT_TYPE_P (obtype))
> + quals |= cp_type_quals (TREE_TYPE (obtype));

Shouldn't we propagate cv-quals of a by-value object parameter as well?

> +  type = cp_build_qualified_type (type, quals);
>type = build_reference_type (type);
>  }
>return type;
> diff --git a/gcc/cp/cp-tree.def b/gcc/cp/cp-tree.def
> index bf3bcd1bf13..

[pushed][PR112445][LRA]: Fix "unable to find a register to spill" error

2023-12-01 Thread Vladimir Makarov

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112445

The patch was successfully bootstrapped and tested on x86-64, aarch64, 
ppc64le.
commit 1390bf52c17a71834a1766c0222e4f8a74efb162
Author: Vladimir N. Makarov 
Date:   Fri Dec 1 11:46:37 2023 -0500

[PR112445][LRA]: Fix "unable to find a register to spill" error

PR112445 is a very complicated bug arising from the interaction of the
constraint subpass, inheritance, and hard reg live range splitting.
It is hard to debug this PR from the standard LRA logs alone.  Therefore I
added dumping of all func insns at the end of complicated sub-passes
(constraint, inheritance, undoing inheritance, hard reg live range
splitting, and rematerialization).  As such output can be quite big,
it is switched on only at level 7 of the -fira-verbose value.  The reason
for the bug is that live-range splitting of a hard reg (dx) was skipped on
the 1st live range splitting subpass.  Splitting is done for reload
pseudos around an original insn and its reload insns, but the subpass
did not recognize such an insn pattern because the previous inheritance and
undoing inheritance subpasses slightly extended the reload pseudo live range.
Although we undid the inheritance in question, the resulting code was a bit
different from the code before the corresponding inheritance pass.  The
following fixes the bug by restoring the exact code from before the
inheritance.

gcc/ChangeLog:

PR target/112445
* lra.h (lra): Add one more arg.
* lra-int.h (lra_verbose, lra_dump_insns): New externals.
(lra_dump_insns_if_possible): Ditto.
* lra.cc (lra_dump_insns): Dump all insns.
(lra_dump_insns_if_possible):  Dump all insns for lra_verbose >= 7.
(lra_verbose): New global.
(lra): Add new arg.  Setup lra_verbose from its value.
* lra-assigns.cc (lra_split_hard_reg_for): Dump insns if rtl
was changed.
* lra-remat.cc (lra_remat): Dump insns if rtl was changed.
* lra-constraints.cc (lra_inheritance): Dump insns.
(lra_constraints, lra_undo_inheritance): Dump insns if rtl
was changed.
(remove_inheritance_pseudos): Use restore reg if it is set up.
* ira.cc: (lra): Pass internal_flag_ira_verbose.

gcc/testsuite/ChangeLog:

PR target/112445
* gcc.target/i386/pr112445.c: New test.

diff --git a/gcc/ira.cc b/gcc/ira.cc
index d7530f01380..b5c4c0e4af7 100644
--- a/gcc/ira.cc
+++ b/gcc/ira.cc
@@ -5970,7 +5970,7 @@ do_reload (void)
 
   ira_destroy ();
 
-  lra (ira_dump_file);
+  lra (ira_dump_file, internal_flag_ira_verbose);
   /* ???!!! Move it before lra () when we use ira_reg_equiv in
 	 LRA.  */
   vec_free (reg_equivs);
diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc
index d2ebcfd5056..7aa210e986f 100644
--- a/gcc/lra-assigns.cc
+++ b/gcc/lra-assigns.cc
@@ -1835,6 +1835,7 @@ lra_split_hard_reg_for (void)
   if (spill_p)
 {
   bitmap_clear (&failed_reload_pseudos);
+  lra_dump_insns_if_possible ("changed func after splitting hard regs");
   return true;
 }
   bitmap_clear (&non_reload_pseudos);
diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 9b6a2af5b75..177c765ca13 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -5537,6 +5537,8 @@ lra_constraints (bool first_p)
 	  lra_assert (df_regs_ever_live_p (hard_regno + j));
 	  }
 }
+  if (changed_p)
+lra_dump_insns_if_possible ("changed func after local");
   return changed_p;
 }
 
@@ -7277,7 +7279,7 @@ lra_inheritance (void)
   bitmap_release (&invalid_invariant_regs);
   bitmap_release (&check_only_regs);
   free (usage_insns);
-
+  lra_dump_insns_if_possible ("func after inheritance");
   timevar_pop (TV_LRA_INHERITANCE);
 }
 
@@ -7477,13 +7479,16 @@ remove_inheritance_pseudos (bitmap remove_pseudos)
 			   == get_regno (lra_reg_info[prev_sregno].restore_rtx
 		  && ! bitmap_bit_p (remove_pseudos, prev_sregno))
 		{
+		  int restore_regno = get_regno (lra_reg_info[sregno].restore_rtx);
+		  if (restore_regno < 0)
+			restore_regno = prev_sregno;
 		  lra_assert (GET_MODE (SET_SRC (prev_set))
-  == GET_MODE (regno_reg_rtx[sregno]));
+  == GET_MODE (regno_reg_rtx[restore_regno]));
 		  /* Although we have a single set, the insn can
 			 contain more one sregno register occurrence
 			 as a source.  Change all occurrences.  */
 		  lra_substitute_pseudo_within_insn (curr_insn, sregno,
-			 SET_SRC (prev_set),
+			 regno_reg_rtx[restore_regno],
 			 false);
 		  /* As we are finishing with processing the insn
 			 here, check the destination too as it might
@@ -7745,5 +7750,7 @@ lra_undo_inheritance (void)
   EXECUTE_IF_SET_IN_BITMAP (&lra_split_regs, 0, regno, bi)
 lra_reg_info[regno].restore_rtx = NULL_RTX;
   change_p = undo_optional_rel

ping: [PATCH v5] libstdc++: Remove UB from month and weekday additions and subtractions.

2023-12-01 Thread Cassio Neri
ping.

On Sat, 25 Nov 2023 at 13:52, Cassio Neri  wrote:

> The following invoke signed integer overflow (UB) [1]:
>
>   month   + months{MAX} // where MAX is the maximum value of months::rep
>   month   + months{MIN} // where MIN is the minimum value of months::rep
>   month   - months{MIN} // where MIN is the minimum value of months::rep
>   weekday + days  {MAX} // where MAX is the maximum value of days::rep
>   weekday - days  {MIN} // where MIN is the minimum value of days::rep
>
> For the additions of MAX, the crux of the problem is that, in libstdc++,
> months::rep and days::rep are int64_t. Other implementations use int32_t,
> cast operands to int64_t and perform arithmetic operations without risk of
> overflowing.
>
> For month + months{MIN}, the implementation follows the Standard's "returns
> clause" and evaluates:
>
>    modulo(static_cast<long long>(unsigned{__x}) + (__y.count() - 1), 12);
>
> Overflow occurs when MIN - 1 is evaluated. Casting to a larger type could
> help but, unfortunately again, this is not possible for libstdc++.
>
> For the subtraction of MIN, the problem is that -MIN is not representable.
>
> It's fair to say that the intention is for these additions/subtractions to
> be performed in modular (modulo 12 or 7) arithmetic so that no overflow is
> expected.
>
> To fix these UB, this patch implements:
>
>   template <unsigned __d, typename _T>
>   unsigned __add_modulo(unsigned __x, _T __y);
>
>   template <unsigned __d, typename _T>
>   unsigned __sub_modulo(unsigned __x, _T __y);
>
> which, respectively, return the remainder of the Euclidean division of
> __x + __y and __x - __y by __d without overflowing. These functions replace
>
>   constexpr unsigned __modulo(long long __n, unsigned __d);
>
> which also calculates the remainder of __n, where __n is the result of the
> addition or subtraction. Hence, these operations might invoke UB before
> __modulo is called, and thus __modulo can't do anything to remedy the issue.
>
> In addition to solving the UB issues, __add_modulo and __sub_modulo allow
> better codegen (shorter and branchless) on x86-64 and ARM [2].
>
> [1] https://godbolt.org/z/a9YfWdn57
> [2] https://godbolt.org/z/Gh36cr7E4
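
The idea of reducing before adding can be sketched as follows. This is deliberately NOT the patch's implementation, which uses an unsigned offset trick to get branchless code. It is a simpler variant that reduces the signed operand to its Euclidean remainder first, so no intermediate value can overflow. The names euclid_rem_sketch, add_modulo_sketch and sub_modulo_sketch are made up, and the sketch assumes a 64-bit signed count and a small modulus (12 or 7 in the cases above).

```cpp
#include <cstdint>

// Euclidean remainder of y by D: always in [0, D-1], even for negative y.
// y % D cannot overflow because the quotient y / D is representable for
// every y when D >= 1.
template <unsigned D>
constexpr unsigned euclid_rem_sketch(std::int64_t y) {
  static_assert(D > 0 && D <= 255, "sketch assumes a small positive modulus");
  return static_cast<unsigned>(
      (y % static_cast<std::int64_t>(D) + static_cast<std::int64_t>(D))
      % static_cast<std::int64_t>(D));
}

// (x + y) mod D without ever forming x + y.
template <unsigned D>
constexpr unsigned add_modulo_sketch(unsigned x, std::int64_t y) {
  return (x % D + euclid_rem_sketch<D>(y)) % D;
}

// (x - y) mod D without ever forming -y or x - y.
template <unsigned D>
constexpr unsigned sub_modulo_sketch(unsigned x, std::int64_t y) {
  return (x % D + D - euclid_rem_sketch<D>(y)) % D;
}

// The extreme values that overflow in the naive formulation are fine here.
static_assert(add_modulo_sketch<12>(0, INT64_MAX) == 7, "2^63 - 1 mod 12");
static_assert(add_modulo_sketch<12>(0, INT64_MIN) == 4, "-2^63 mod 12");
static_assert(sub_modulo_sketch<12>(0, INT64_MIN) == 8, "2^63 mod 12");
```

The price of this simpler form is an extra remainder operation per call, which is part of what the patch's offset-based version avoids.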
>
> libstdc++-v3/ChangeLog:
>
> * include/std/chrono: Fix + and - for months and weekdays.
> * testsuite/std/time/month/1.cc: Add constexpr tests against
> overflow.
> * testsuite/std/time/month/2.cc: New test for extreme values.
> * testsuite/std/time/weekday/1.cc: Add constexpr tests against
> overflow.
> * testsuite/std/time/weekday/2.cc: New test for extreme values.
> ---
> Good for trunk?
>
> Changes with respect to previous versions:
>   v5: Fix typos in commit message.
>   v4: Remove UB also from operator-.
>   v3: Fix screwed up email send with v2.
>   v2: Replace _T with _Tp and _U with _Up. Remove copyright+license from
> test.
>
>  libstdc++-v3/include/std/chrono  | 79 +---
>  libstdc++-v3/testsuite/std/time/month/1.cc   | 19 +
>  libstdc++-v3/testsuite/std/time/month/2.cc   | 32 
>  libstdc++-v3/testsuite/std/time/weekday/1.cc | 16 +++-
>  libstdc++-v3/testsuite/std/time/weekday/2.cc | 32 
>  5 files changed, 151 insertions(+), 27 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/std/time/month/2.cc
>  create mode 100644 libstdc++-v3/testsuite/std/time/weekday/2.cc
>
> diff --git a/libstdc++-v3/include/std/chrono
> b/libstdc++-v3/include/std/chrono
> index e4ba6eafceb..1b7cdb96e1c 100644
> --- a/libstdc++-v3/include/std/chrono
> +++ b/libstdc++-v3/include/std/chrono
> @@ -501,18 +501,47 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>  namespace __detail
>  {
> -  // Compute the remainder of the Euclidean division of __n divided
> by __d.
> -  // Euclidean division truncates toward negative infinity and always
> -  // produces a remainder in the range of [0,__d-1] (whereas standard
> -  // division truncates toward zero and yields a nonpositive remainder
> -  // for negative __n).
> +  // Helper to __add_modulo and __sub_modulo.
> +  template 
> +  constexpr auto
> +  __modulo_offset()
> +  {
> +   using _Up = make_unsigned_t<_Tp>;
> +   auto constexpr __a = _Up(-1) - _Up(255 + __d - 2);
> +   auto constexpr __b = _Up(__d * (__a / __d) - 1);
> +   // Notice: b <= a - 1 <= _Up(-1) - (255 + d - 1) and b % d = d - 1.
> +   return _Up(-1) - __b; // >= 255 + d - 1
> +  }
> +
> +  // Compute the remainder of the Euclidean division of __x + __y
> divided by
> +  // __d without overflowing.  Typically, __x <= 255 + d - 1 is sum of
> +  // weekday/month with a shift in [0, d - 1] and __y is a duration
> count.
> +  template 
> +  constexpr unsigned
> +  __add_modulo(unsigned __x, _Tp __y)
> +  {
> +   using _Up = make_unsigned_t<_Tp>;
> +   // For __y >= 0, _Up(__y) has the same mathematical value as __y
> and
> +   // this function simply returns (__x + _Up(__y)) % d.  Typically,
> this
> +   // doesn'

Re: [PATCH] libgccjit: Fix GGC segfault when using -flto

2023-12-01 Thread David Malcolm
On Thu, 2023-11-30 at 17:13 -0500, Antoni Boucher wrote:
> Here's the updated patch.
> The failure was due to the test being in the test array while it
> should
> not have been there since it changes the context.

Thanks for the updated patch.

Did you do a full bootstrap and regression test with this one, or do
you want me to?

Dave



Re: [PATCH] libgccjit: Add support for the type bfloat16

2023-12-01 Thread David Malcolm
On Thu, 2023-11-16 at 17:20 -0500, Antoni Boucher wrote:
> I forgot to attach the patch.
> 
> On Thu, 2023-11-16 at 17:19 -0500, Antoni Boucher wrote:
> > Hi.
> > This patch adds support for the bfloat16 type (bug 112574).
> > 
> > I was asked to split this out of another patch sent here:
> > https://gcc.gnu.org/pipermail/jit/2023q1/001607.html
> > 
> > Thanks for the review.
> 

Thanks for the patch.

> diff --git a/gcc/jit/jit-playback.cc b/gcc/jit/jit-playback.cc
> index 18cc4da25b8..7e1c97a4638 100644
> --- a/gcc/jit/jit-playback.cc
> +++ b/gcc/jit/jit-playback.cc
> @@ -280,6 +280,8 @@ get_tree_node_for_type (enum gcc_jit_types type_)
>  
>  case GCC_JIT_TYPE_FLOAT:
>return float_type_node;
> +case GCC_JIT_TYPE_BFLOAT16:
> +  return bfloat16_type_node;

The code to create bfloat16_type_node (in build_common_tree_nodes) is
guarded by #ifdef HAVE_BFmode, so we should probably have a test for
this in case GCC_JIT_TYPE_BFLOAT16 to at least add an error message
when it's NULL_TREE, rather than silently returning NULL_TREE and
crashing.

[...]

> diff --git a/gcc/testsuite/jit.dg/test-bfloat16.c 
> b/gcc/testsuite/jit.dg/test-bfloat16.c
> new file mode 100644
> index 000..6aed3920351
> --- /dev/null
> +++ b/gcc/testsuite/jit.dg/test-bfloat16.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile { target x86_64-*-* } } */
> +
> +#include 
> +#include 
> +
> +#include "libgccjit.h"
> +
> +/* We don't want set_options() in harness.h to set -O3 so our little local
> +   is optimized away. */
> +#define TEST_ESCHEWS_SET_OPTIONS
> +static void set_options (gcc_jit_context *ctxt, const char *argv0)
> +{
> +}


Please add a comment to all-non-failing-tests.h noting the exclusion of
this test case from the array.

[...]

> diff --git a/gcc/testsuite/jit.dg/test-types.c 
> b/gcc/testsuite/jit.dg/test-types.c
> index a01944e35fa..9e7c4f3e046 100644
> --- a/gcc/testsuite/jit.dg/test-types.c
> +++ b/gcc/testsuite/jit.dg/test-types.c
> @@ -1,3 +1,4 @@
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -492,4 +493,5 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result 
> *result)
>  
>CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type (ctxt, 
> GCC_JIT_TYPE_FLOAT)), sizeof (float));
>CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type (ctxt, 
> GCC_JIT_TYPE_DOUBLE)), sizeof (double));
> +  CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type (ctxt, 
> GCC_JIT_TYPE_BFLOAT16)), sizeof (__bfloat16));


This is only going to work on targets which #ifdef HAVE_BFmode, so this
CHECK_VALUE needs to be conditionalized somehow, to avoid having this,
test-combination, and test-threads from bailing out early on targets
without BFmode.

Dave



Re: [PATCH] extend.texi: Fix up defbuiltin* with spaces in return type

2023-12-01 Thread Sandra Loosemore

On 12/1/23 10:33, Jakub Jelinek wrote:
> On Fri, Dec 01, 2023 at 10:04:38AM -0700, Sandra Loosemore wrote:
>> Thanks, this looks good to me.  I think I also noticed this weird formatting
>> in passing recently when I was looking for something else and did not have
>> time to track it down myself.
>
> There is another question.  In many cases we just specify types for the
> builtin arguments, in other cases types and names with @var{name} syntax,
> and in other cases with just the name.
>
> @defbuiltin{int __builtin_fpclassify (int, int, int, int, int, ...)}
> vs.
> @defbuiltin{size_t __builtin_object_size (const void * @var{ptr}, int
> @var{type})}
> vs.
> @defbuiltinx{bool __builtin_umull_overflow (unsigned long int a, unsigned long
> int b, unsigned long int *res)}
> and in some cases even just name the arguments and don't specify type:
> @defbuiltin{void __builtin_clear_padding (@var{ptr})}
> @defbuiltin{@var{type} __builtin_choose_expr (@var{const_exp}, @var{exp1},
> @var{exp2})}
>
> Shall we tweak that somehow?  If the argument names are unimportant, perhaps
> it is fine to leave that out, but shouldn't we always use @var{...} around
> the parameter names when specified?

Yup.  The Texinfo manual says:  "When using @deftypefn command and variations,
you should mark parameter names with @var to distinguish these from data type
names, keywords, and other parts of the literal syntax of the programming
language."

> And avoid leaving out the types, use something like
> __builtin_clear_padding (@var{type} *@var{ptr})
> or
> __builtin_choose_expr (@var{type1} @var{const_exp}, @var{type2} @var{exp1},
> @var{type3} @var{exp2})
> ?

That would probably be good too.

-Sandra





Re: [PATCH] hardcfr: make builtin_return tests more portable [PR112334]

2023-12-01 Thread Alexandre Oliva
On Dec  1, 2023, Alexandre Oliva  wrote:

> diff --git a/gcc/testsuite/c-c++-common/torture/harden-cfr-noret.c 
> b/gcc/testsuite/c-c++-common/torture/harden-cfr-noret.c
> index fdd803109a4ae..8c7cc01c0ce68 100644
> --- a/gcc/testsuite/c-c++-common/torture/harden-cfr-noret.c
> +++ b/gcc/testsuite/c-c++-common/torture/harden-cfr-noret.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-fharden-control-flow-redundancy 
> -fhardcfr-check-noreturn-calls=always -fdump-tree-hardcfr -ffat-lto-objects" 
> } */
> +/* { dg-options "-fharden-control-flow-redundancy 
> -fhardcfr-check-noreturn-calls=always -fno-exceptions -fdump-tree-hardcfr 
> -ffat-lto-objects" } */

This one was *not* meant to be modified, it was changed by accident in
my tree and integrated by mistake with the posted patch.  Fortunately
the ChangeLog verifier caught the mistake, and I dropped it from the
patch I've just pushed.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH] testsuite: require avx_runtime for some tests

2023-12-01 Thread Mike Stump
On Nov 6, 2023, at 2:59 AM, Marc Poulhiès  wrote:
> 
> These 3 tests fail to parse the 'vect' dump when not using -mavx. Make
> the dependency explicit.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-ifcvt-18.c: Add dep on avx_runtime.
>   * gcc.dg/vect/vect-simd-clone-16f.c: Likewise.
>   * gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
> ---
> Tested on x86_64-linux and x86_64-elf.
> 
> Ok for master?

Ok.

Re: [PATCH] extend.texi: Fix up defbuiltin* with spaces in return type

2023-12-01 Thread Jakub Jelinek
On Fri, Dec 01, 2023 at 10:04:38AM -0700, Sandra Loosemore wrote:
> Thanks, this looks good to me.  I think I also noticed this weird formatting
> in passing recently when I was looking for something else and did not have
> time to track it down myself.

There is another question.  In many cases we just specify types for the
builtin arguments, in other cases types and names with @var{name} syntax,
and in other cases with just the name.

@defbuiltin{int __builtin_fpclassify (int, int, int, int, int, ...)}
vs.
@defbuiltin{size_t __builtin_object_size (const void * @var{ptr}, int 
@var{type})}
vs.
@defbuiltinx{bool __builtin_umull_overflow (unsigned long int a, unsigned long 
int b, unsigned long int *res)}
and in some cases even just name the arguments and don't specify type:
@defbuiltin{void __builtin_clear_padding (@var{ptr})}
@defbuiltin{@var{type} __builtin_choose_expr (@var{const_exp}, @var{exp1}, 
@var{exp2})}

Shall we tweak that somehow?  If the argument names are unimportant, perhaps
it is fine to leave that out, but shouldn't we always use @var{...} around
the parameter names when specified?
And avoid leaving out the types, use something like
__builtin_clear_padding (@var{type} *@var{ptr})
or
__builtin_choose_expr (@var{type1} @var{const_exp}, @var{type2} @var{exp1}, 
@var{type3} @var{exp2})
?

Jakub



Re: [PATCH] c++: decltype of (non-captured variable) [PR83167]

2023-12-01 Thread Patrick Palka
On Tue, 14 Nov 2023, Jason Merrill wrote:

> On 11/14/23 11:10, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > -- >8 --
> > 
> > For decltype((x)) within a lambda where x is not captured, we dubiously
> > require that the lambda has a capture default, unlike for decltype(x).
> > This patch fixes this inconsistency; I couldn't find a justification for
> > it in the standard.
> 
> The relevant passage seems to be
> 
> https://eel.is/c++draft/expr.prim#id.unqual-3
> 
> "If naming the entity from outside of an unevaluated operand within S would
> refer to an entity captured by copy in some intervening lambda-expression,
> then let E be the innermost such lambda-expression.
> 
> If there is such a lambda-expression and if P is in E's function parameter
> scope but not its parameter-declaration-clause, then the type of the
> expression is the type of a class member access expression ([expr.ref]) naming
> the non-static data member that would be declared for such a capture in the
> object parameter ([dcl.fct]) of the function call operator of E."
> 
> In this case I guess there is no such lambda-expression because naming x won't
> refer to a capture by copy if the lambda doesn't capture anything, so we
> ignore the lambda.
> 
> Maybe refer to that in a comment?  OK with that change.
> 
> I'm surprised that it refers specifically to capture by copy, but I guess a
> capture by reference should have the same decltype as the captured variable?

Ah, seems like it.  So maybe we should get rid of the redundant
by-reference capture-default handling, to more closely mirror the
standard?
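
For readers following along, the types being discussed can be checked with a few static_asserts. This is only a sketch, with a hypothetical function name; it covers the cases that have a capture-default and deliberately leaves the no-capture-default case from the PR as a comment, since compilers without the fix reject it.

```cpp
#include <cassert>
#include <type_traits>

// Returns true; the interesting checks are the static_asserts, which
// only inspect types and never odr-use (and hence never capture) x.
inline bool decltype_in_lambda_demo()
{
  int x = 0;
  (void) x;

  auto by_copy = [=] {
    // decltype(x) names the variable itself.
    static_assert(std::is_same_v<decltype(x), int>);
    // decltype((x)) acts as if x were the by-copy member of the closure,
    // accessed through a const call operator.
    static_assert(std::is_same_v<decltype((x)), const int &>);
  };

  auto by_copy_mutable = [=]() mutable {
    // A mutable call operator drops the const.
    static_assert(std::is_same_v<decltype((x)), int &>);
  };

  auto by_reference = [&] {
    // Same type as decltype((x)) outside the lambda.
    static_assert(std::is_same_v<decltype((x)), int &>);
  };

  // With no capture-default, [] { decltype((x)) ... } is the case under
  // discussion in PR83167; it is omitted here on purpose.
  (void) by_copy;
  (void) by_copy_mutable;
  (void) by_reference;
  return true;
}
```

The by-reference case yields the same type as the variable would outside the lambda, which matches the guess quoted above.
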

Also now that r14-6026-g73e2bdbf9bed48 made capture_decltype return
NULL_TREE to mean the capture is dependent, it seems we should just
inline capture_decltype into finish_decltype_type rather than
introducing another special return value to mean "fall back to ordinary
handling".

How does the following look?  Bootstrapped and regtested on
x86_64-pc-linux-gnu.

-- >8 --

PR c++/83167

gcc/cp/ChangeLog:

* semantics.cc (capture_decltype): Inline into its only caller ...
(finish_decltype_type): ... here.  Update nearby comment to refer
to recent standard.  Restrict uncaptured variable handling to just
lambdas with a by-copy capture-default.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-decltype4.C: New test.
---
 gcc/cp/semantics.cc   | 107 +++---
 .../g++.dg/cpp0x/lambda/lambda-decltype4.C|  15 +++
 2 files changed, 55 insertions(+), 67 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-decltype4.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index fbbc18336a0..fb4c3992e34 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -53,7 +53,6 @@ along with GCC; see the file COPYING3.  If not see
 
 static tree maybe_convert_cond (tree);
 static tree finalize_nrv_r (tree *, int *, void *);
-static tree capture_decltype (tree);
 
 /* Used for OpenMP non-static data member privatization.  */
 
@@ -11856,21 +11855,48 @@ finish_decltype_type (tree expr, bool 
id_expression_or_member_access_p,
 }
   else
 {
-  /* Within a lambda-expression:
-
-Every occurrence of decltype((x)) where x is a possibly
-parenthesized id-expression that names an entity of
-automatic storage duration is treated as if x were
-transformed into an access to a corresponding data member
-of the closure type that would have been declared if x
-were a use of the denoted entity.  */
   if (outer_automatic_var_p (STRIP_REFERENCE_REF (expr))
  && current_function_decl
  && LAMBDA_FUNCTION_P (current_function_decl))
{
- type = capture_decltype (STRIP_REFERENCE_REF (expr));
- if (!type)
-   goto dependent;
+ /* [expr.prim.id.unqual]/3: If naming the entity from outside of an
+unevaluated operand within S would refer to an entity captured by
+copy in some intervening lambda-expression, then let E be the
+innermost such lambda-expression.
+
+If there is such a lambda-expression and if P is in E's function
+parameter scope but not its parameter-declaration-clause, then the
+type of the expression is the type of a class member access
+expression naming the non-static data member that would be declared
+for such a capture in the object parameter of the function call
+operator of E."  */
+ tree decl = STRIP_REFERENCE_REF (expr);
+ tree lam = CLASSTYPE_LAMBDA_EXPR (DECL_CONTEXT 
(current_function_decl));
+ tree cap = lookup_name (DECL_NAME (decl), LOOK_where::BLOCK,
+ LOOK_want::HIDDEN_LAMBDA);
+
+ if (cap && is_capture_proxy (cap))
+   type = TREE_TYPE (cap);
+ else if (LAMBDA_EXPR

[pushed] wwwdocs: benchmarks: Remove http://annwm.lbl.gov/bench/

2023-12-01 Thread Gerald Pfeifer
This has been unreachable for months (at least).

If any of you is aware of some other link to add to
https://gcc.gnu.org/benchmarks/ please let me know.

Gerald
---
 htdocs/benchmarks/index.html | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/htdocs/benchmarks/index.html b/htdocs/benchmarks/index.html
index 6ccee355..ee3a2161 100644
--- a/htdocs/benchmarks/index.html
+++ b/htdocs/benchmarks/index.html
@@ -47,13 +47,6 @@ Environment (CSiBE), along with the testbed and 
measurement scripts.
 
 Related benchmarks and scripts
 
-
-Charles Leggett runs several benchmarks (Bench++, Haney, Stepanov, OOPACK)
-comparing various versions of GCC and KAI KCC with several optimization
-levels.  Results can be found at http://annwm.lbl.gov/bench/";>
-http://annwm.lbl.gov/bench/.
-
-
 
 SUSE runs various other C++ benchmarks and Polyhedron.
 Results can be found at https://gcc.opensuse.org";
-- 
2.43.0


[pushed] wwwdocs: conduct: Change further creativecommons.org links to https

2023-12-01 Thread Gerald Pfeifer
This feels a bit like whack-a-mole to me, though I am hopeful to now have
caught and covered all cases where our CoC pages link to addresses that
permanently redirect.

Famous last words. :-)

(The redirects work; avoiding them primarily helps us to find "odd" cases.
The extra turnaround should be hardly noticeable for most.)

Gerald

---
 htdocs/conduct-faq.html  | 2 +-
 htdocs/conduct-report.html   | 2 +-
 htdocs/conduct-response.html | 2 +-
 htdocs/conduct.html  | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/htdocs/conduct-faq.html b/htdocs/conduct-faq.html
index 181a7674..ffa7bd20 100644
--- a/htdocs/conduct-faq.html
+++ b/htdocs/conduct-faq.html
@@ -63,7 +63,7 @@ your question either,
 email mailto:cond...@gcc.gnu.org";>cond...@gcc.gnu.org with any
 additional questions or feedback.
 
-http://creativecommons.org/licenses/by-sa/4.0/";>
+https://creativecommons.org/licenses/by-sa/4.0/";>
 https://licensebuttons.net/l/by-sa/4.0/88x31.png";>
 This work is licensed under a
diff --git a/htdocs/conduct-report.html b/htdocs/conduct-report.html
index 139cd6c7..097309ae 100644
--- a/htdocs/conduct-report.html
+++ b/htdocs/conduct-report.html
@@ -113,7 +113,7 @@ directly to another member, or to a member of the Steering 
Committee.
 of the committee's decision. To make such a request, contact a member of the
 Steering Committee with your request and motivation.
 
-http://creativecommons.org/licenses/by-sa/4.0/";>
+https://creativecommons.org/licenses/by-sa/4.0/";>
 https://licensebuttons.net/l/by-sa/4.0/88x31.png";>
 This work is licensed under a
diff --git a/htdocs/conduct-response.html b/htdocs/conduct-response.html
index d646b2e7..fb2c26f3 100644
--- a/htdocs/conduct-response.html
+++ b/htdocs/conduct-response.html
@@ -132,7 +132,7 @@ excluded from the response process. For these cases, anyone 
can make a report
 directly to any of the committee members, as documented in the reporting
 guidelines.
 
-http://creativecommons.org/licenses/by-sa/4.0/";>
+https://creativecommons.org/licenses/by-sa/4.0/";>
 https://licensebuttons.net/l/by-sa/4.0/88x31.png";>
 This work is licensed under a
diff --git a/htdocs/conduct.html b/htdocs/conduct.html
index 2e7628af..a4cd186f 100644
--- a/htdocs/conduct.html
+++ b/htdocs/conduct.html
@@ -114,7 +114,7 @@ email mailto:cond...@gcc.gnu.org";>cond...@gcc.gnu.org.
 that doesn't answer your questions, feel free
 to mailto:cond...@gcc.gnu.org";>contact us.
 
-http://creativecommons.org/licenses/by-sa/4.0/";>
+https://creativecommons.org/licenses/by-sa/4.0/";>
 https://licensebuttons.net/l/by-sa/4.0/88x31.png";>
 This work is licensed under a
-- 
2.43.0


Re: [PATCH] testsuite: refine gcc.dg/analyzer/fd-4.c test for newlib

2023-12-01 Thread Mike Stump
On Nov 6, 2023, at 3:01 AM, Marc Poulhiès  wrote:
> 
> Contrary to glibc, including stdio.h from newlib defines mode_t which
> conflicts with the test's type definition.
> 
> .../gcc/testsuite/gcc.dg/analyzer/fd-4.c:19:3: error: redefinition of typedef 
> 'mode_t' with different type
> ...
> .../include/sys/types.h:189:25: note: previous declaration of 'mode_t' with 
> type 'mode_t' {aka 'unsigned int'}
> 
> Defining _MODE_T_DECLARED skips the type definition.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/analyzer/fd-4.c: Fix for newlib.
> ---
> Tested on x86_64-linux and x86_64-elf.
> 
> Ok for master?

Ok,


Re: [PATCH] testsuite: skip gcc.target/i386/pr106910-1.c test when using newlib

2023-12-01 Thread Mike Stump
On Nov 6, 2023, at 2:57 AM, Marc Poulhiès  wrote:
> 
> Using newlib produces a different codegen because the support for c99
> differs (see libc_has_function hook).
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/i386/pr106910-1.c: Disable for newlib.
> ---
> Tested on x86_64-linux and x86_64-elf.
> 
> OK for master?

Ok, but nix the first blank line?

> gcc/testsuite/gcc.target/i386/pr106910-1.c | 2 ++
> 1 file changed, 2 insertions(+)
> 
> diff --git a/gcc/testsuite/gcc.target/i386/pr106910-1.c 
> b/gcc/testsuite/gcc.target/i386/pr106910-1.c
> index c7685a32183..00c93f444b6 100644
> --- a/gcc/testsuite/gcc.target/i386/pr106910-1.c
> +++ b/gcc/testsuite/gcc.target/i386/pr106910-1.c
> @@ -1,4 +1,6 @@
> +

This one.

> /* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-skip-if "newlib libc math causes different codegen" { newlib } } */
> /* { dg-options "-msse4.1 -O2 -Ofast" } */
> /* { dg-final { scan-assembler-times "roundps" 9 } } */
> /* { dg-final { scan-assembler-times "cvtps2dq" 1 } } */


Re: [PATCH] extend.texi: Fix up defbuiltin* with spaces in return type

2023-12-01 Thread Sandra Loosemore

On 12/1/23 03:26, Jakub Jelinek wrote:
> Hi!
>
> In
> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fstdc_005fbit_005ffloor
> I've noticed that while e.g. __builtin_stdc_bit_floor builtin is properly
> rendered in bold and bigger size, for the __builtin_stdc_bit_width builtin
> it is not the builtin name which is marked like that, but the keyword int
> before it.  Also, seems such builtins are missing from the index.
>
> I've read the texinfo docs and they seem to suggest in
> https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Line-Macros.html
> that return types of functions with spaces in the return type should be
> wrapped with {}s and we already use that e.g. in
> @defbuiltin{{void *} __builtin_thread_pointer (void)}
>
> The following patch adjusts builtins I found which contained one or two
> spaces in the return type name (plus two spots which used 2 spaces after
> single keyword return type instead of 1 which triggered my search regex as
> well).
>
> Tested on x86_64-linux, ok for trunk?

Thanks, this looks good to me.  I think I also noticed this weird formatting in
passing recently when I was looking for something else and did not have time to
track it down myself.

-Sandra




Re: [PATCH v3] AArch64: Add inline memmove expansion

2023-12-01 Thread Wilco Dijkstra
Hi Richard,

> +  rtx load[max_ops], store[max_ops];
>
> Please either add a comment explaining why 40 is guaranteed to be
> enough, or (my preference) use:
>
>  auto_vec<std::pair<rtx, rtx>, ...> ops;

I've changed to using auto_vec since that should help reduce conflicts
with Alex' LDP changes. I double-checked the maximum number of instructions:
with a minor tweak to handle AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS,
it can now be limited to 12 if you also select -mstrict-align.

v3: update after review, use auto_vec, tweak max_copy_size, add another test.

Add support for inline memmove expansions.  The generated code is identical
to that for memcpy, except that all loads are emitted before stores rather than
being interleaved.  The maximum size is 256 bytes which requires at most 16
registers.
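
Why the load/store ordering matters can be seen with a small byte-level sketch. This is not the aarch64 expansion, which works on wider blocks and RTL; the function names are hypothetical and a local array stands in for the registers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Interleaved load/store: fine for memcpy, where the buffers must not
// overlap, but an early store can clobber bytes that are loaded later.
template <std::size_t N>
void copy_interleaved(unsigned char *dst, const unsigned char *src)
{
  for (std::size_t i = 0; i < N; ++i)
    dst[i] = src[i];
}

// All loads first, then all stores: the source is read completely
// before any destination byte is written, which gives memmove
// semantics for overlapping buffers in either direction.
template <std::size_t N>
void move_loads_first(unsigned char *dst, const unsigned char *src)
{
  std::array<unsigned char, N> regs;  // stands in for the registers
  for (std::size_t i = 0; i < N; ++i)
    regs[i] = src[i];
  for (std::size_t i = 0; i < N; ++i)
    dst[i] = regs[i];
}

// Applies COPY to an overlapping 4-byte move inside "abcdefgh"
// (destination two bytes past the source) and reports whether the
// buffer then equals EXPECTED.
inline bool overlap_result_is(void (*copy)(unsigned char *,
                                           const unsigned char *),
                              const char *expected)
{
  unsigned char buf[9] = "abcdefgh";
  copy(buf + 2, buf);
  return std::memcmp(buf, expected, 8) == 0;
}
```

With the destination overlapping the tail of the source, the loads-first variant produces "ababcdgh" while the interleaved variant produces "abababgh".
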

Passes regress/bootstrap, OK for commit?

gcc/ChangeLog:
* config/aarch64/aarch64.opt (aarch64_mops_memmove_size_threshold):
Change default.
* config/aarch64/aarch64.md (cpymemdi): Add a parameter.
(movmemdi): Call aarch64_expand_cpymem.
* config/aarch64/aarch64.cc (aarch64_copy_one_block): Rename function,
simplify, support storing generated loads/stores. 
(aarch64_expand_cpymem): Support expansion of memmove.
* config/aarch64/aarch64-protos.h (aarch64_expand_cpymem): Add bool arg.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/memmove.c: Add new test.
* gcc.target/aarch64/memmove.c: Likewise.

---

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
d2718cc87b306e9673b166cc40e0af2ba72aa17b..d958b181d79440ab1b4f274cc188559edc04c628
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -769,7 +769,7 @@ bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
 tree aarch64_vector_load_decl (tree);
 void aarch64_expand_call (rtx, rtx, rtx, bool);
 bool aarch64_expand_cpymem_mops (rtx *, bool);
-bool aarch64_expand_cpymem (rtx *);
+bool aarch64_expand_cpymem (rtx *, bool);
 bool aarch64_expand_setmem (rtx *);
 bool aarch64_float_const_zero_rtx_p (rtx);
 bool aarch64_float_const_rtx_p (rtx);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
748b313092c5af452e9526a0c6747c51e598e4ca..26d1485ff6b977caeeb780dfaee739069742e638
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -23058,51 +23058,41 @@ aarch64_progress_pointer (rtx pointer)
   return aarch64_move_pointer (pointer, GET_MODE_SIZE (GET_MODE (pointer)));
 }
 
+typedef auto_vec<std::pair<rtx, rtx>, 12> copy_ops;
+
 /* Copy one block of size MODE from SRC to DST at offset OFFSET.  */
 
 static void
-aarch64_copy_one_block_and_progress_pointers (rtx *src, rtx *dst,
- machine_mode mode)
+aarch64_copy_one_block (copy_ops &ops, rtx src, rtx dst,
+   int offset, machine_mode mode)
 {
-  /* Handle 256-bit memcpy separately.  We do this by making 2 adjacent memory
- address copies using V4SImode so that we can use Q registers.  */
-  if (known_eq (GET_MODE_BITSIZE (mode), 256))
+  /* Emit explict load/store pair instructions for 32-byte copies.  */
+  if (known_eq (GET_MODE_SIZE (mode), 32))
 {
   mode = V4SImode;
+  rtx src1 = adjust_address (src, mode, offset);
+  rtx src2 = adjust_address (src, mode, offset + 16);
+  rtx dst1 = adjust_address (dst, mode, offset);
+  rtx dst2 = adjust_address (dst, mode, offset + 16);
   rtx reg1 = gen_reg_rtx (mode);
   rtx reg2 = gen_reg_rtx (mode);
-  /* "Cast" the pointers to the correct mode.  */
-  *src = adjust_address (*src, mode, 0);
-  *dst = adjust_address (*dst, mode, 0);
-  /* Emit the memcpy.  */
-  emit_insn (aarch64_gen_load_pair (mode, reg1, *src, reg2,
-   aarch64_progress_pointer (*src)));
-  emit_insn (aarch64_gen_store_pair (mode, *dst, reg1,
-aarch64_progress_pointer (*dst), 
reg2));
-  /* Move the pointers forward.  */
-  *src = aarch64_move_pointer (*src, 32);
-  *dst = aarch64_move_pointer (*dst, 32);
+  rtx load = aarch64_gen_load_pair (mode, reg1, src1, reg2, src2);
+  rtx store = aarch64_gen_store_pair (mode, dst1, reg1, dst2, reg2);
+  ops.safe_push ({ load, store });
   return;
 }
 
   rtx reg = gen_reg_rtx (mode);
-
-  /* "Cast" the pointers to the correct mode.  */
-  *src = adjust_address (*src, mode, 0);
-  *dst = adjust_address (*dst, mode, 0);
-  /* Emit the memcpy.  */
-  emit_move_insn (reg, *src);
-  emit_move_insn (*dst, reg);
-  /* Move the pointers forward.  */
-  *src = aarch64_progress_pointer (*src);
-  *dst = aarch64_progress_pointer (*dst);
+  rtx load = gen_move_insn (reg, adjust_address (src, mode, offset));
+  rtx store = gen_move_insn (adjust_address (dst, mode, offset), reg);
+  ops.safe_push ({ load, store });
 }
 
 /* Expand a cpymem/movmem using the MOPS extension.  OPERANDS are taken

Re: [PATCH v6 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-12-01 Thread Jason Merrill

On 12/1/23 01:02, waffl3x wrote:
> I ran into another issue while devising tests for redeclarations of
> xobj member functions as static member functions and vice versa. I am
> pretty sure by the literal wording of the standard, this is well formed.
>
> template<typename T>
> concept Constrain = true;
>
> struct S {
>    void f(this auto, Constrain auto) {};
>    static void f(Constrain auto) {};
>
>    void g(this auto const&, Constrain auto) {};
>    static void g(Constrain auto) {};
>
>    void h(this auto&&, Constrain auto) {};
>    static void h(Constrain auto) {};
> };
>
> And also,
>
> struct S{
>    void f(this auto) {};
>    static void f() {};
>
>    void g(this auto const&) {};
>    static void g() {};
>
>    void h(this auto&&) {};
>    static void h() {};
> };
>
> I wrote these tests expecting them to be ill-formed, and found what I
> thought was a bug when they were not diagnosed as redeclarations.
> However, given how the code for resolving overloads and determining
> redeclarations looks, I believe this is actually well formed on a
> technicality. I can't find the passages in the standard that specify
> this so I can't be sure.

I think the relevant section is
https://eel.is/c++draft/basic.scope.scope

> Anyway, the template parameter list differs because of the deduced
> object parameter. Now here is the question, you are required to ignore
> the object parameter when determining if these are redeclarations or
> not, but what about the template parameters associated with the object
> parameter? Am I just missing the passage that specifies this or is this
> an actual defect in the standard?

I think that since they differ in template parameters, they don't
correspond under https://eel.is/c++draft/basic.scope.scope#4.5 so they
can be overloaded.

This is specified in terms of the template-head grammar non-terminal,
but elsewhere we say that abbreviated templates are equivalent to
writing out the template parameters explicitly.

> The annoying thing is, even if this was brought up, I think the only
> solution is to ratify these examples as well formed.

Yes.

Jason



Re: [PATCH v2] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Saurabh Jha

On 12/1/2023 2:10 PM, Richard Earnshaw (lists) wrote:
> On 01/12/2023 13:45, Christophe Lyon wrote:
>> On Fri, 1 Dec 2023 at 13:44, Richard Earnshaw (lists)
>>  wrote:
>>> On 01/12/2023 11:28, Saurabh Jha wrote:
>>>> Hey,
>>>>
>>>> I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in this
>>>> commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an error which I
>>>> unfortunately missed. This patch fixes that test.
>>>>
>>>> Did regression testing on arm-none-eabi and found no regressions. Output of
>>>> running gcc/contrib/compare_tests is this:
>>>>
>>>> """
>>>> Tests that now work, but didn't before (2 tests):
>>>>
>>>> arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp:
>>>> gcc.target/arm/mve/pr112337.c (test for excess errors)
>>>> arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
>>>>  gcc.target/arm/mve/pr112337.c (test for excess errors)
>>>> """
>>>>
>>>> Ok for trunk? I don't have commit access so could someone please commit on my
>>>> behalf?
>>>>
>>>> Regards,
>>>> Saurabh
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>  * gcc.target/arm/mve/pr112337.c: Fix the testcase
>>>
>>> Hmm, could this be related to the changes Christophe made recently to change
>>> the way MVE vector types were set up internally?  If so, this might indicate an
>>> issue that's going to affect real users with existing code.
>>
>> My change was only about vector types, here the problem is with a
>> pointer to a scalar.
>> Anyway, I ran the test with my commit reverted and it still fails in
>> the same way, so I think this patch is needed.
>>
>> Thanks,
>>
>> Christophe
>>
>>> Christophe?
>>>
>>> R.
>
> Ok, thanks for checking.  In that case, Saurabh, your patch is OK, but please
> change 'Fix testcase' to 'Use int32_t instead of int.'
>
> Note that ChangeLog entries end with a full stop.
>
> R.

Thank you for the feedback. Please find the updated ChangeLog below.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/pr112337.c: Use int32_t instead of int.

From 2365aae84de030bbb006edac18c9314812fc657b Mon Sep 17 00:00:00 2001
From: Saurabh Jha 
Date: Tue, 28 Nov 2023 13:05:58 +
Subject: [PATCH] testsuite: Fix up pr112337.c test

---
 gcc/testsuite/gcc.target/arm/mve/pr112337.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/pr112337.c 
b/gcc/testsuite/gcc.target/arm/mve/pr112337.c
index 8f491990088..d1a075ecd0e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/pr112337.c
+++ b/gcc/testsuite/gcc.target/arm/mve/pr112337.c
@@ -5,8 +5,8 @@
 #include 
 
 void g(int32x4_t);
-void f(int, int, int, short, int *p) {
-  int *bias = p;
+void f(int, int, int, short, int32_t *p) {
+  int32_t *bias = p;
   for (;;) {
 int32x4_t d = vldrwq_s32 (p);
 bias += 4;
-- 
2.34.1
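
The thread does not spell out why plain int failed here. One plausible reading, stated as an assumption rather than something confirmed above, is that on arm-none-eabi int32_t is a typedef for long, so an int * does not convert to the const int32_t * that vldrwq_s32 expects. The sketch below checks that relationship for whichever target it is compiled on; the variable names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// True when int and std::int32_t are the same type on this target.
constexpr bool int_is_int32 = std::is_same_v<int, std::int32_t>;

// True when an int * argument can be passed to a parameter declared as
// const std::int32_t *, the shape of the pointer parameter in question.
constexpr bool int_ptr_converts =
  std::is_convertible_v<int *, const std::int32_t *>;

// The pointer conversion is available exactly when the two types match.
static_assert(int_ptr_converts == int_is_int32,
              "int * converts only if int32_t is int");

// Using int32_t for the pointer works on every target.
static_assert(std::is_convertible_v<std::int32_t *, const std::int32_t *>,
              "int32_t * always converts to const int32_t *");
```

On targets where int32_t is int the original test compiles either way, which would explain how the problem was missed initially.
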



Re: [PATCH 8/8] aarch64: Add SVE support for simd clones [PR 96342]

2023-12-01 Thread Andre Vieira (lists)




On 29/11/2023 17:01, Richard Sandiford wrote:

"Andre Vieira (lists)"  writes:

Rebased, no major changes, still needs review.

On 30/08/2023 10:19, Andre Vieira (lists) via Gcc-patches wrote:

This patch finalizes adding support for the generation of SVE simd
clones when no simdlen is provided, following the ABI rules where the
widest data type determines the minimum amount of elements in a length
agnostic vector.

gcc/ChangeLog:

      * config/aarch64/aarch64-protos.h (add_sve_type_attribute):
Declare.
  * config/aarch64/aarch64-sve-builtins.cc (add_sve_type_attribute):
Make
  visibility global.
  * config/aarch64/aarch64.cc (aarch64_fntype_abi): Ensure SVE ABI is
  chosen over SIMD ABI if a SVE type is used in return or arguments.
  (aarch64_simd_clone_compute_vecsize_and_simdlen): Create VLA simd
clone
  when no simdlen is provided, according to ABI rules.
  (aarch64_simd_clone_adjust): Add '+sve' attribute to SVE simd clones.
  (aarch64_simd_clone_adjust_ret_or_param): New.
  (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): Define.
  * omp-simd-clone.cc (simd_clone_mangle): Print 'x' for VLA simdlen.
  (simd_clone_adjust): Adapt safelen check to be compatible with VLA
  simdlen.

gcc/testsuite/ChangeLog:

  * c-c++-common/gomp/declare-variant-14.c: Adapt aarch64 scan.
  * gfortran.dg/gomp/declare-variant-14.f90: Likewise.
  * gcc.target/aarch64/declare-simd-1.c: Remove warning checks where no
  longer necessary.
  * gcc.target/aarch64/declare-simd-2.c: Add SVE clone scan.


diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
60a55f4bc1956786ea687fc7cad7ec9e4a84e1f0..769d637f63724a7f0044f48f3dd683e0fb46049c
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1005,6 +1005,8 @@ namespace aarch64_sve {
  #ifdef GCC_TARGET_H
bool verify_type_context (location_t, type_context_kind, const_tree, bool);
  #endif
+ void add_sve_type_attribute (tree, unsigned int, unsigned int,
+ const char *, const char *);
  }
  
  extern void aarch64_split_combinev16qi (rtx operands[3]);

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 
161a14edde7c9fb1b13b146cf50463e2d78db264..6f99c438d10daa91b7e3b623c995489f1a8a0f4c
 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -569,14 +569,16 @@ static bool reported_missing_registers_p;
  /* Record that TYPE is an ABI-defined SVE type that contains NUM_ZR SVE 
vectors
 and NUM_PR SVE predicates.  MANGLED_NAME, if nonnull, is the ABI-defined
 mangling of the type.  ACLE_NAME is the  name of the type.  */
-static void
+void
  add_sve_type_attribute (tree type, unsigned int num_zr, unsigned int num_pr,
const char *mangled_name, const char *acle_name)
  {
tree mangled_name_tree
  = (mangled_name ? get_identifier (mangled_name) : NULL_TREE);
+  tree acle_name_tree
+= (acle_name ? get_identifier (acle_name) : NULL_TREE);
  
-  tree value = tree_cons (NULL_TREE, get_identifier (acle_name), NULL_TREE);

+  tree value = tree_cons (NULL_TREE, acle_name_tree, NULL_TREE);
value = tree_cons (NULL_TREE, mangled_name_tree, value);
value = tree_cons (NULL_TREE, size_int (num_pr), value);
value = tree_cons (NULL_TREE, size_int (num_zr), value);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
37507f091c2a6154fa944c3a9fad6a655ab5d5a1..cb0947b18c6a611d55579b5b08d93f6a4a9c3b2c
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -4080,13 +4080,13 @@ aarch64_takes_arguments_in_sve_regs_p (const_tree 
fntype)
  static const predefined_function_abi &
  aarch64_fntype_abi (const_tree fntype)
  {
-  if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype)))
-return aarch64_simd_abi ();
-
if (aarch64_returns_value_in_sve_regs_p (fntype)
|| aarch64_takes_arguments_in_sve_regs_p (fntype))
  return aarch64_sve_abi ();
  
+  if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype)))

+return aarch64_simd_abi ();
+
return default_function_abi;
  }
  


I think we discussed this off-list later, but the change above shouldn't
be necessary.  aarch64_vector_pcs must not be attached to SVE PCS functions,
so the two cases should be mutually exclusive.


Yeah I had made the changes locally, but not updated the patch yet.



@@ -27467,7 +27467,7 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct 
cgraph_node *node,
int num, bool explicit_p)
  {
tree t, ret_type;
-  unsigned int nds_elt_bits;
+  unsigned int nds_elt_bits, wds_elt_bits;
int count;
unsigned HOST_WIDE_INT const_simdlen;
  
@@ -27513,10 +27513,14 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,

if (TREE

Re: [PATCH] btf: fix PR debug/112768

2023-12-01 Thread David Faust


On 11/30/23 14:17, Indu Bhagat wrote:
> PR debug/112768 - btf: fix asm comment output for BTF_KIND_FUNC* kinds
> 
> The patch adds a small function to abstract out the detail and return
> the name of the type.  The patch also fixes the issue of BTF_KIND_FUNC
> appearing in the comments with a 'null' string.

OK, thanks.
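
A toy model of the helper described above, for readers who do not have btfout.cc at hand. The struct and function names (toy_dtdef, toy_type_name) are stand-ins and only mirror the two fields the quoted patch touches; this is a sketch, not the GCC data structure.

```c
#include <stdint.h>

/* Stand-in for the two fields of interest: the string-table offset
   that gets emitted, and the name string kept in the type object.  */
struct toy_dtdef
{
  uint32_t ctti_name;    /* 0 means the emitted type is anonymous.  */
  const char *dtd_name;  /* May still hold the original name.  */
};

/* Return the name to print in the asm comment: empty when the emitted
   offset says "anonymous", even if dtd_name was never cleared.  */
static const char *
toy_type_name (const struct toy_dtdef *dtd)
{
  return dtd->ctti_name ? dtd->dtd_name : "";
}
```

The point of routing every comment through one function is that the emitted offset and the printed name can no longer disagree.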

> 
> For btf-function-6.c testcase, after the patch:
> 
> .long   0   # TYPE 2 BTF_KIND_FUNC_PROTO ''
> .long   0xd02   # btt_info: kind=13, kflag=0, vlen=2
> .long   0x1 # btt_type: (BTF_KIND_INT 'int')
> .long   0   # farg_name
> .long   0x1 # farg_type: (BTF_KIND_INT 'int')
> .long   0   # farg_name
> .long   0x1 # farg_type: (BTF_KIND_INT 'int')
> .long   0   # TYPE 3 BTF_KIND_FUNC_PROTO ''
> .long   0xd01   # btt_info: kind=13, kflag=0, vlen=1
> .long   0x1 # btt_type: (BTF_KIND_INT 'int')
> .long   0x68# farg_name
> .long   0x1 # farg_type: (BTF_KIND_INT 'int')
> .long   0x5 # TYPE 4 BTF_KIND_FUNC 'extfunc'
> .long   0xc02   # btt_info: kind=12, kflag=0, linkage=2
> .long   0x2 # btt_type: (BTF_KIND_FUNC_PROTO '')
> .long   0xd # TYPE 5 BTF_KIND_FUNC 'foo'
> .long   0xc01   # btt_info: kind=12, kflag=0, linkage=1
> .long   0x3 # btt_type: (BTF_KIND_FUNC_PROTO '')
> 
> gcc/ChangeLog:
> 
>   PR debug/112768
>   * btfout.cc (get_btf_type_name): New definition.
>   (btf_collect_datasec): Update dtd_name to the original type name
>   string.
>   (btf_asm_type_ref): Use the new get_btf_type_name function
>   instead.
>   (btf_asm_type): Likewise.
>   (btf_asm_func_type): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR debug/112768
>   * gcc.dg/debug/btf/btf-function-6.c: Empty string expected with
>   BTF_KIND_FUNC_PROTO.
> 
> Testing notes:
>   - bootstrapped and reg tested on x86_64
>   - No regressions in btf.exp on BPF target
> 
> ---
>  gcc/btfout.cc | 22 +++
>  .../gcc.dg/debug/btf/btf-function-6.c |  4 ++--
>  2 files changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/btfout.cc b/gcc/btfout.cc
> index 5f2e99ce472..1c25404b2c0 100644
> --- a/gcc/btfout.cc
> +++ b/gcc/btfout.cc
> @@ -158,6 +158,19 @@ get_btf_kind (uint32_t ctf_kind)
>return BTF_KIND_UNKN;
>  }
>  
> +/* Some BTF types, like BTF_KIND_FUNC_PROTO, are anonymous.  The machinery
> +   in btfout to emit BTF, may reset dtd_data->ctti_name, but does not update
> +   the name in the ctf_dtdef_ref type object (deliberate choice).  This
> +   interface helps abstract out that state of affairs, while giving access to
> +   the name of the type as intended.  */
> +
> +static const char *
> +get_btf_type_name (ctf_dtdef_ref dtd)
> +{
> +  const char *anon = "";
> +  return (dtd->dtd_data.ctti_name) ? dtd->dtd_name : anon;
> +}
> +
>  /* Helper routines to map between 'relative' and 'absolute' IDs.
>  
> In BTF all records (including variables) are output in one long list, and 
> all
> @@ -425,6 +438,7 @@ btf_collect_datasec (ctf_container_ref ctfc)
>func_dtd->dtd_data = dtd->dtd_data;
>func_dtd->dtd_data.ctti_type = dtd->dtd_type;
>func_dtd->linkage = dtd->linkage;
> +  func_dtd->dtd_name = dtd->dtd_name;
>func_dtd->dtd_type = num_types_added + num_types_created;
>  
>/* Only the BTF_KIND_FUNC type actually references the name. The
> @@ -722,7 +736,7 @@ btf_asm_type_ref (const char *prefix, ctf_container_ref 
> ctfc, ctf_id_t ref_id)
>size_t func_id = btf_relative_func_id (ref_id);
>ctf_dtdef_ref ref_type = (*funcs)[func_id];
>dw2_asm_output_data (4, ref_id, "%s: (BTF_KIND_FUNC '%s')",
> -prefix, ref_type->dtd_name);
> +prefix, get_btf_type_name (ref_type));
>  }
>else
>  {
> @@ -733,7 +747,7 @@ btf_asm_type_ref (const char *prefix, ctf_container_ref 
> ctfc, ctf_id_t ref_id)
>  
>dw2_asm_output_data (4, ref_id, "%s: (BTF_KIND_%s '%s')",
>  prefix, btf_kind_name (ref_kind),
> -ref_type->dtd_name);
> +get_btf_type_name (ref_type));
>  }
>  }
>  
> @@ -809,7 +823,7 @@ btf_asm_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
>dw2_asm_output_data (4, dtd->dtd_data.ctti_name,
>  "TYPE %" PRIu64 " BTF_KIND_%s '%s'",
>  get_btf_id (dtd->dtd_type), btf_kind_name (btf_kind),
> -dtd->dtd_name);
> +get_btf_type_name (dtd));
>dw2_asm_output_data (4, BTF_TYPE_INFO (btf_kind, btf_kflag, btf_vlen),
>  "btt_info: kind=%u, kflag=%u, vlen=%u",
>  btf_kind, btf_kflag, btf_vlen);
> @@ -950,7 +964,7 @@ btf_asm_func_type (ctf_container_ref ctfc, ctf_dtdef_ref 
> dtd, ct

Re: [PATCH] btf: fix PR debug/112656

2023-12-01 Thread David Faust
Hi Indu,

On 11/30/23 14:18, Indu Bhagat wrote:
> PR debug/112656 - btf: function prototypes generated with name
> 
> With this patch, all BTF_KIND_FUNC_PROTO will appear anonymous in the
> generated BTF section.
> 
> As noted in the discussion in the bugzilla, the number of
> BTF_KIND_FUNC_PROTO types output varies across targets (BPF with -mco-re
> vs non-BPF targets).  Hence the check in the test case merely checks
> that all BTF_KIND_FUNC_PROTO appear anonymous.

Looks good to me. OK to apply.
Thanks!

> 
> gcc/ChangeLog:
> 
>   PR debug/112656
> * btfout.cc (btf_asm_type): Fixup ctti_name for all
>   BTF types of kind BTF_KIND_FUNC_PROTO.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR debug/112656
> * gcc.dg/debug/btf/btf-function-7.c: New test.
> 
> 
> Testing notes:
>   - bootstrapped and reg tested on x86_64
>   - No regressions in btf.exp on BPF target
> 
> ---
>  gcc/btfout.cc |  4 
>  .../gcc.dg/debug/btf/btf-function-7.c | 19 +++
>  2 files changed, 23 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-function-7.c
> 
> diff --git a/gcc/btfout.cc b/gcc/btfout.cc
> index 1c25404b2c0..a5e0d640e19 100644
> --- a/gcc/btfout.cc
> +++ b/gcc/btfout.cc
> @@ -820,6 +820,10 @@ btf_asm_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd)
>   btf_kind = BTF_KIND_ENUM64;
> }
>  
> +  /* PR debug/112656.  BTF_KIND_FUNC_PROTO is always anonymous.  */
> +  if (btf_kind == BTF_KIND_FUNC_PROTO)
> +dtd->dtd_data.ctti_name = 0;
> +
>dw2_asm_output_data (4, dtd->dtd_data.ctti_name,
>  "TYPE %" PRIu64 " BTF_KIND_%s '%s'",
>  get_btf_id (dtd->dtd_type), btf_kind_name (btf_kind),
> diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-function-7.c 
> b/gcc/testsuite/gcc.dg/debug/btf/btf-function-7.c
> new file mode 100644
> index 000..b560dc75650
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/btf/btf-function-7.c
> @@ -0,0 +1,19 @@
> +/* Test BTF for inlined functions.
> +
> +   See PR/112656 - btf: function prototypes generated with name
> +   BTF_KIND_FUNC_PROTO must be anonymous.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -gbtf -dA" } */
> +
> +/* { dg-final { scan-assembler-times "BTF_KIND_FUNC_PROTO 
> ''\\(\[0-9a-z\]*\\)'" 0 } } */
> +
> +static int log_event(const char *event_name, void *dev_ptr)
> +{
> +  return 666;
> +}
> +
> +int foo ()
> +{
> +  return log_event ("foobar", ((void *)0));
> +}


Re: [PATCH] testsuite: skip gcc.target/i386/pr106910-1.c test when using newlib

2023-12-01 Thread Marc Poulhiès


Marc Poulhiès  writes:

> Using newlib produces a different codegen because the support for c99
> differs (see libc_has_function hook).
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/i386/pr106910-1.c: Disable for newlib.
> ---
> Tested on x86_64-linux and x86_64-elf.
>
> OK for master?
>
>  gcc/testsuite/gcc.target/i386/pr106910-1.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr106910-1.c 
> b/gcc/testsuite/gcc.target/i386/pr106910-1.c
> index c7685a32183..00c93f444b6 100644
> --- a/gcc/testsuite/gcc.target/i386/pr106910-1.c
> +++ b/gcc/testsuite/gcc.target/i386/pr106910-1.c
> @@ -1,4 +1,6 @@
> +
>  /* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-skip-if "newlib libc math causes different codegen" { newlib } } */
>  /* { dg-options "-msse4.1 -O2 -Ofast" } */
>  /* { dg-final { scan-assembler-times "roundps" 9 } } */
>  /* { dg-final { scan-assembler-times "cvtps2dq" 1 } } */

Ping.

Thanks,
Marc


Re: [PATCH] testsuite: require avx_runtime for some tests

2023-12-01 Thread Marc Poulhiès


Marc Poulhiès  writes:

> These 3 tests fail parsing the 'vect' dump when not using -mavx. Make
> the dependency explicit.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.dg/vect/vect-ifcvt-18.c: Add dep on avx_runtime.
>   * gcc.dg/vect/vect-simd-clone-16f.c: Likewise.
>   * gcc.dg/vect/vect-simd-clone-18f.c: Likewise.
> ---
> Tested on x86_64-linux and x86_64-elf.
>
> Ok for master?
>
>  gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c   | 3 ++-
>  gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c | 4 ++--
>  gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c | 4 ++--
>  3 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c 
> b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
> index c1d3c27d819..607194496e9 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
> @@ -1,6 +1,7 @@
>  /* { dg-require-effective-target vect_condition } */
>  /* { dg-require-effective-target vect_float } */
> -/* { dg-additional-options "-Ofast -mavx" { target avx_runtime } } */
> +/* { dg-require-effective-target avx_runtime } */
> +/* { dg-additional-options "-Ofast -mavx" } */
>
>
>  int A0[4] = {36,39,42,45};
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c 
> b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c
> index 7cd29e894d0..c6615dc626d 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c
> @@ -1,6 +1,6 @@
>  /* { dg-require-effective-target vect_simd_clones } */
> -/* { dg-additional-options "-fopenmp-simd --param vect-epilogues-nomask=0" } 
> */
> -/* { dg-additional-options "-mavx" { target avx_runtime } } */
> +/* { dg-additional-options "-fopenmp-simd --param vect-epilogues-nomask=0 
> -mavx" } */
> +/* { dg-require-effective-target avx_runtime } */
>  /* { dg-additional-options "-mno-avx512f" { target { { i?86*-*-* x86_64-*-* 
> } && { ! lp64 } } } } */
>
>  #define TYPE __INT64_TYPE__
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c 
> b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c
> index 4dd51381d73..787b918d0c4 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c
> @@ -1,6 +1,6 @@
>  /* { dg-require-effective-target vect_simd_clones } */
> -/* { dg-additional-options "-fopenmp-simd --param vect-epilogues-nomask=0" } 
> */
> -/* { dg-additional-options "-mavx" { target avx_runtime } } */
> +/* { dg-additional-options "-fopenmp-simd --param vect-epilogues-nomask=0 
> -mavx" } */
> +/* { dg-require-effective-target  avx_runtime } */
>  /* { dg-additional-options "-mno-avx512f" { target { { i?86*-*-* x86_64-*-* 
> } && { ! lp64 } } } } */
>
>  #define TYPE __INT64_TYPE__

Ping.

Thanks,
Marc
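
For anyone skimming the diff, a sketch of the directive pattern the patch moves to. The function sum4 is a made-up stand-in for the real test body; only the dg- directives mirror the quoted change.

```c
/* Mark the whole test UNSUPPORTED unless AVX code can actually run,
   instead of silently dropping -mavx and then failing the dump scan.  */
/* { dg-require-effective-target avx_runtime } */
/* The option can be unconditional because the line above guards it.  */
/* { dg-additional-options "-Ofast -mavx" } */

/* Hypothetical loop standing in for the vectorizable test body.  */
int
sum4 (const int *a)
{
  int s = 0;
  for (int i = 0; i < 4; i++)
    s += a[i];
  return s;
}
```

With the old form, '{ target avx_runtime }' on dg-additional-options only made the flag conditional; the test still ran without it and the 'vect' dump expectations failed.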


Re: [PATCH] testsuite: refine gcc.dg/analyzer/fd-4.c test for newlib

2023-12-01 Thread Marc Poulhiès


Marc Poulhiès  writes:

> Contrary to glibc, including stdio.h from newlib defines mode_t which
> conflicts with the test's type definition.
>
> .../gcc/testsuite/gcc.dg/analyzer/fd-4.c:19:3: error: redefinition of typedef 
> 'mode_t' with different type
> ...
> .../include/sys/types.h:189:25: note: previous declaration of 'mode_t' with 
> type 'mode_t' {aka 'unsigned int'}
>
> Defining _MODE_T_DECLARED skips the type definition.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.dg/analyzer/fd-4.c: Fix for newlib.
> ---
> Tested on x86_64-linux and x86_64-elf.
>
> Ok for master?
>
>  gcc/testsuite/gcc.dg/analyzer/fd-4.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-4.c 
> b/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> index 994bad84342..e4a834ade30 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> @@ -1,3 +1,4 @@
> +/* { dg-additional-options "-D_MODE_T_DECLARED=1" { target newlib } } */
>  #ifdef _AIX
>  #define _MODE_T
>  #endif

Ping ?

Thanks,
Marc
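
The guard-macro mechanism the patch relies on can be sketched in isolation. All names below (TOY_MODE_T_DECLARED, toy_mode_t) are illustrative stand-ins, and the choice of int for the test's own type is an assumption for the example.

```c
/* Plays the role of -D_MODE_T_DECLARED=1 on the command line.  */
#define TOY_MODE_T_DECLARED 1

/* Stand-in for the libc header: it defines the type only if nobody
   has claimed it already, so this block is skipped here.  */
#ifndef TOY_MODE_T_DECLARED
typedef unsigned int toy_mode_t;
#define TOY_MODE_T_DECLARED
#endif

/* The test's own definition.  Without the guard being pre-defined this
   would be a redefinition of the typedef with a different type.  */
typedef int toy_mode_t;

/* Returns 1 when the test's (signed) definition is the one in effect.  */
static int
toy_mode_is_signed (void)
{
  return (toy_mode_t) -1 < 0;
}
```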


Re: [RFA] New pass for sign/zero extension elimination

2023-12-01 Thread Hans-Peter Nilsson
> Date: Fri, 1 Dec 2023 08:09:08 -0700
> From: Jeff Law 

> On 11/30/23 18:08, Hans-Peter Nilsson wrote:
> >> Date: Sun, 19 Nov 2023 17:47:56 -0700
> >> From: Jeff Law 
> > 
> >> Locally we have had this enabled at -O1 and above to encourage testing,
> >> but I'm thinking that for the trunk enabling at -O2 and above is the
> >> right thing to do.
> > 
> > Yes.
> > 
> >> Thoughts, comments, recommendations?
> > 
> > Sounds great!
> > 
> > It'd be nice if its framework can be re-used for
> > target-specific passes, doing quirky sign- or zero-extend-
> > related optimizations (those that are not just sign- or
> > zero-extend removal).  Perhaps most of those opportunities
> > can be implemented as target hooks in this pass.  Definitely
> > not asking for a change, just imagining future improvements.
> > 
> > Also, I haven't followed the thread and its branches, just
> > offering a word of encouragement.
> What kind of quirky target things did you have in mind?  If there's 
> overlap with things we need I might be able to find someone to take it 
> on.  Or might be able to suggest how they can be handled.

Sorry, I was hoping I'd not need to substantiate that part
outside the "not just sign- or zero-extend removal". :)

But perhaps: somewhat trivial would be where the
sign/zero-extension is hidden in an unspec, so the target
needs to be consulted regarding possible elimination and how
to do it.

If that doesn't do it, just ignore that part of the comment.
I have nothing substantial besides this pass sounding like
it'd be a great stepping-stone.  I'm having trouble making
up hypothetical cases here, and it probably wouldn't help.

I hope I'll find time to try the latest version but
definitely no promises.

brgds, H-P


c: Turn -Wimplicit-function-declaration into a permerror: Fix 'gcc.dg/gnu23-builtins-no-dfp-1.c' (was: [PATCH v3 06/11] c: Turn -Wimplicit-function-declaration into a permerror)

2023-12-01 Thread Thomas Schwinge
Hi!

On 2023-11-20T10:56:16+0100, Florian Weimer  wrote:
> --- a/gcc/c/c-decl.cc
> +++ b/gcc/c/c-decl.cc

> @@ -3515,14 +3515,14 @@ implicit_decl_warning (location_t loc, tree id, tree 
> olddecl)
>   {
> gcc_rich_location richloc (loc);
> richloc.add_fixit_replace (suggestion);
> -   warned = pedwarn (&richloc, OPT_Wimplicit_function_declaration,
> - "implicit declaration of function %qE;"
> - " did you mean %qs?",
> - id, suggestion);
> +   warned = permerror_opt (&richloc, OPT_Wimplicit_function_declaration,
> +   "implicit declaration of function %qE;"
> +   " did you mean %qs?",
> +   id, suggestion);
>   }
>else
> - warned = pedwarn (loc, OPT_Wimplicit_function_declaration,
> -   "implicit declaration of function %qE", id);
> + warned = permerror_opt (loc, OPT_Wimplicit_function_declaration,
> + "implicit declaration of function %qE", id);
>  }

There's one more test case to adjust, for the more limited back ends
(here: 'target { ! dfp }').  OK to push the attached
"c: Turn -Wimplicit-function-declaration into a permerror: Fix 
'gcc.dg/gnu23-builtins-no-dfp-1.c'"?


Grüße
 Thomas


From 35ecfb7264bc93dbf2a5b2ee0a2d23a881fd6ded Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 1 Dec 2023 16:52:06 +0100
Subject: [PATCH] c: Turn -Wimplicit-function-declaration into a permerror: Fix
 'gcc.dg/gnu23-builtins-no-dfp-1.c'

With recent commit 55e94561e97ed0bce4774aa1c6b5d5d82209a379
"c: Turn -Wimplicit-function-declaration into a permerror", this test
case, added in 2019 commit 5b8d9367684f266c30c280b4d3c98830a88c70ab
"Prevent all uses of DFP when unsupported (PR c/91985)" started FAILing
(for applicable configurations):

[-PASS:-]{+FAIL:+} gcc.dg/gnu23-builtins-no-dfp-1.c  (test for warnings, line 13)
[-PASS:-]{+FAIL:+} gcc.dg/gnu23-builtins-no-dfp-1.c  (test for warnings, line 14)
[-PASS:-]{+FAIL:+} gcc.dg/gnu23-builtins-no-dfp-1.c  (test for warnings, line 15)
[-PASS:-]{+FAIL:+} gcc.dg/gnu23-builtins-no-dfp-1.c  (test for warnings, line 16)
[-PASS:-]{+FAIL:+} gcc.dg/gnu23-builtins-no-dfp-1.c  (test for warnings, line 17)
[-PASS:-]{+FAIL:+} gcc.dg/gnu23-builtins-no-dfp-1.c  (test for warnings, line 18)
[-PASS:-]{+FAIL:+} gcc.dg/gnu23-builtins-no-dfp-1.c (test for excess errors)

This is due to:

[...]/gcc.dg/gnu23-builtins-no-dfp-1.c:13:13: error: implicit declaration of function '__builtin_fabsd32'; did you mean '__builtin_fabsf32'? [-Wimplicit-function-declaration]
[...]

Adjust as obvious.

	gcc/testsuite/
	* gcc.dg/gnu23-builtins-no-dfp-1.c: 'dg-error "implicit"' instead
	of 'dg-warning "implicit"'.
---
 gcc/testsuite/gcc.dg/gnu23-builtins-no-dfp-1.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/gnu23-builtins-no-dfp-1.c b/gcc/testsuite/gcc.dg/gnu23-builtins-no-dfp-1.c
index 9fa25f0dd13..7cb200fce6a 100644
--- a/gcc/testsuite/gcc.dg/gnu23-builtins-no-dfp-1.c
+++ b/gcc/testsuite/gcc.dg/gnu23-builtins-no-dfp-1.c
@@ -10,9 +10,9 @@ int nand32 (void);
 int nand64 (void);
 int nand128 (void);
 
-__typeof__ (__builtin_fabsd32 (0)) d32; /* { dg-warning "implicit" } */
-__typeof__ (__builtin_fabsd64 (0)) d64; /* { dg-warning "implicit" } */
-__typeof__ (__builtin_fabsd128 (0)) d128; /* { dg-warning "implicit" } */
-__typeof__ (__builtin_nand32 (0)) d32n; /* { dg-warning "implicit" } */
-__typeof__ (__builtin_nand64 (0)) d64n; /* { dg-warning "implicit" } */
-__typeof__ (__builtin_nand128 (0)) d128n; /* { dg-warning "implicit" } */
+__typeof__ (__builtin_fabsd32 (0)) d32; /* { dg-error "implicit" } */
+__typeof__ (__builtin_fabsd64 (0)) d64; /* { dg-error "implicit" } */
+__typeof__ (__builtin_fabsd128 (0)) d128; /* { dg-error "implicit" } */
+__typeof__ (__builtin_nand32 (0)) d32n; /* { dg-error "implicit" } */
+__typeof__ (__builtin_nand64 (0)) d64n; /* { dg-error "implicit" } */
+__typeof__ (__builtin_nand128 (0)) d128n; /* { dg-error "implicit" } */
-- 
2.34.1



Re: [PATCH] ada: Fix Ada bootstrap on macOS

2023-12-01 Thread Marc Poulhiès


Rainer Orth  writes:

> The recent warning changes broke Ada bootstrap on macOS:
>
> adaint.c: In function '__gnat_copy_attribs':
> adaint.c:3336:10: error: implicit declaration of function 'utimes'; did you 
> mean 'utime'? [-Wimplicit-function-declaration]
>  3336 |  if (utimes (to, tbuf) == -1) {
>   |  ^~
>   |  utime
> adaint.c: In function '__gnat_kill':
> adaint.c:3597:3: error: implicit declaration of function 'kill' 
> [-Wimplicit-function-declaration]
>  3597 |   kill (pid, sig);
>   |   ^~~~
> terminals.c: In function 'allocate_pty_desc':
> terminals.c:1196:12: error: implicit declaration of function 'openpty'; did 
> you mean 'openat'? [-Wimplicit-function-declaration]
>  1196 |   status = openpty (&master_fd, &slave_fd, NULL, NULL, NULL);
>   |^~~
>   |openat
> terminals.c: In function '__gnat_setup_winsize':
> terminals.c:1392:6: error: implicit declaration of function 'kill' 
> [-Wimplicit-function-declaration]
>  1392 |  kill (desc->child_pid, SIGWINCH);
>   |  ^~~~
>
> This patch fixes this by including the necessary headers: 
> for utimes,  for kill, and  for openpty.  With those
> changes, the build completed on x86_64-apple-darwin2[0-3] (make check
> still running).
>
> Ok for trunk?

Ok!

Thanks,
Marc
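
As background on this class of failure: once implicit function declarations are errors, every libc call needs its declaring header in scope. A small sketch using kill, which POSIX declares in <signal.h>; the helper name can_signal_self is invented for the example and has nothing to do with adaint.c.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>     /* kill */
#include <sys/types.h>  /* pid_t */
#include <unistd.h>     /* getpid */

/* Return 1 if the calling process may signal itself.  Signal 0 only
   performs the existence and permission checks; nothing is delivered.  */
static int
can_signal_self (void)
{
  return kill (getpid (), 0) == 0;
}
```

Dropping the <signal.h> include from this file reproduces the "implicit declaration of function 'kill'" diagnostic quoted above on compilers that treat it as an error.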


PING^3 Re: [i386 PATCH] A minor code clean-up: Use NULL_RTX instead of nullptr

2023-12-01 Thread Bernhard Reutner-Fischer
On Wed, 14 Jun 2023 21:14:02 +0200
Bernhard Reutner-Fischer  wrote:

> plonk.

ping^3

patch at
https://inbox.sourceware.org/gcc-patches/20230526103151.3a7f6...@nbbrfq.loc/

I would regenerate it for rtx and/or tree, though, whatever you deem
desirable?

thanks
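
To make the "typed null" argument from the quoted thread concrete, here is a C11 analogue. The argument there is about C++ overloads and templates; _Generic is used below only because it shows the same effect in plain C. The toy_ types and macros are simplified stand-ins, not GCC's definitions.

```c
/* Opaque stand-ins for GCC's rtx and tree pointer types.  */
typedef struct toy_rtx_def *toy_rtx;
typedef union toy_tree_node *toy_tree;

/* Typed null values, in the spirit of NULL_RTX and NULL_TREE.  */
#define TOY_NULL_RTX ((toy_rtx) 0)
#define TOY_NULL_TREE ((toy_tree) 0)

/* Select a string based on the static type of X.  */
#define NULL_KIND(x) \
  _Generic ((x), toy_rtx: "rtx", toy_tree: "tree", default: "untyped")
```

A typed null carries enough information to pick the right branch, while a bare '(void *) 0' falls through to the default.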

> 
> On 26 May 2023 10:31:51 CEST, Bernhard Reutner-Fischer 
>  wrote:
> >On Thu, 25 May 2023 18:58:04 +0200
> >Bernhard Reutner-Fischer  wrote:
> >  
> >> On Wed, 24 May 2023 18:54:06 +0100
> >> "Roger Sayle"  wrote:
> >>   
> >> > My understanding is that GCC's preferred null value for rtx is NULL_RTX
> >> > (and for tree is NULL_TREE), and by being typed allows strict type 
> >> > checking,
> >> > and use with function polymorphism and template instantiation.
> >> > C++'s nullptr is preferred over NULL and 0 for pointer types that don't
> >> > have a defined null of the correct type.
> >> > 
> >> > This minor clean-up uses NULL_RTX consistently in i386-expand.cc.
> >> 
> >> Oh. Well, i can't resist cleanups :)  
> >  
> >> (and handle nullptr too, and the same game for tree)  
> >
> > so like the attached. And
> > sed -e 's/RTX/TREE/g' -e 's/rtx/tree/g' \
> >  < ~/coccinelle/gcc-rtx-null.0.cocci \  
> >  > ~/coccinelle/gcc-tree-null.0.cocci  
> >
> > I do not know if we want to shorten explicit NULL comparisons.
> > foo == NULL => !foo and foo != NULL => foo
> > Left them alone in the form they were written.
> >
> > See the attached result of the rtx hunks, someone would have to build  
> 
> I've bootstrapped and regtested the hunks for rtx as cited up-thread without 
> regressions (as expected).
> 
> I know everybody is busy, but I'd like to know if I should swap these out 
> completely,
> or postpone this until start of stage3 or next stage 1 or something.
> I can easily keep these local to my personal pre-configure stage for my own 
> amusement.
> 
> thanks,
> 
> >it and hack git-commit-mklog.py --changelog 'Use NULL_RTX.'
> >to print("{}.".format(random.choice(['Ditto', 'Same', 'Likewise']))) ;)
> >  
> >> 
> >> Just a thought..  
> >
> >cheers,  
> 



Re: [PATCH v2] doc: Update the status of build directory not fully separated

2023-12-01 Thread Eric Gallager
Please cross-reference against issue 37210 if/when merging, if it
hasn't already been:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37210

On Fri, Dec 1, 2023 at 2:15 AM Richard Biener
 wrote:
>
> On Thu, Nov 30, 2023 at 2:42 PM Xi Ruoyao  wrote:
> >
> > Recently some people tried building GCC with srcdir == objdir and
> > the attempts just failed [1].  So stop saying "it should work".  OTOH
> > objdir as a subdirectory of srcdir works: we've built GCC in LFS [2]
> > and BLFS [3] this way for decades and this is confirmed during the
> > review of a previous version of this patch [4].
> >
> > [1]: https://gcc.gnu.org/pipermail/gcc-help/2023-November/143068.html
> > [2]: https://www.linuxfromscratch.org/lfs/view/12.0/chapter08/gcc.html
> > [3]: https://www.linuxfromscratch.org/blfs/view/12.0/general/gcc.html
> > [4]: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638760.html
>
> > gcc/ChangeLog:
> >
> > * doc/install.texi: Deem srcdir == objdir broken, but objdir
> > as a subdirectory of srcdir fine.
> > ---
> >
> > Supersedes
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638728.html.
> >
> > Ok for trunk?
>
> OK.
>
> Thanks,
> Richard.
>
> >  gcc/doc/install.texi | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> > index c1ccb8ba02d..c1128d9274c 100644
> > --- a/gcc/doc/install.texi
> > +++ b/gcc/doc/install.texi
> > @@ -697,9 +697,8 @@ phases.
> >  First, we @strong{highly} recommend that GCC be built into a
> >  separate directory from the sources which does @strong{not} reside
> >  within the source tree.  This is how we generally build GCC; building
> > -where @var{srcdir} == @var{objdir} should still work, but doesn't
> > -get extensive testing; building where @var{objdir} is a subdirectory
> > -of @var{srcdir} is unsupported.
> > +where @var{objdir} is a subdirectory of @var{srcdir} should work as well;
> > +building where @var{objdir} == @var{srcdir} is unsupported.
> >
> >  If you have previously built GCC in the same directory for a
> >  different target machine, do @samp{make distclean} to delete all files
> > --
> > 2.43.0
> >


Re: [PATCH] ada: Fix Ada bootstrap on macOS

2023-12-01 Thread Iain Sandoe



> On 1 Dec 2023, at 15:14, Rainer Orth  wrote:
> 
> The recent warning changes broke Ada bootstrap on macOS:
> 
> adaint.c: In function '__gnat_copy_attribs':
> adaint.c:3336:10: error: implicit declaration of function 'utimes'; did you 
> mean 'utime'? [-Wimplicit-function-declaration]
> 3336 |  if (utimes (to, tbuf) == -1) {
>  |  ^~
>  |  utime
> adaint.c: In function '__gnat_kill':
> adaint.c:3597:3: error: implicit declaration of function 'kill' 
> [-Wimplicit-function-declaration]
> 3597 |   kill (pid, sig);
>  |   ^~~~
> terminals.c: In function 'allocate_pty_desc':
> terminals.c:1196:12: error: implicit declaration of function 'openpty'; did 
> you mean 'openat'? [-Wimplicit-function-declaration]
> 1196 |   status = openpty (&master_fd, &slave_fd, NULL, NULL, NULL);
>  |^~~
>  |openat
> terminals.c: In function '__gnat_setup_winsize':
> terminals.c:1392:6: error: implicit declaration of function 'kill' 
> [-Wimplicit-function-declaration]
> 1392 |  kill (desc->child_pid, SIGWINCH);
>  |  ^~~~
> 
> This patch fixes this by including the necessary headers: 
> for utimes,  for kill, and  for openpty.  With those
> changes, the build completed on x86_64-apple-darwin2[0-3] (make check
> still running).
> 
> Ok for trunk?

OK from the Darwin side.
Iain

> 
>   Rainer
> 
> -- 
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
> 
> 
> 2023-12-01  Rainer Orth  
> 
>   gcc/ada:
>   * adaint.c [__APPLE__]: Include , .
>   * terminals.c [!_WIN32]: Include .
>   [__APPLE__]: Include .
>   Fix typos.
> 
> diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
> --- a/gcc/ada/adaint.c
> +++ b/gcc/ada/adaint.c
> @@ -85,6 +85,8 @@
> 
> #if defined (__APPLE__)
> #include 
> +#include 
> +#include 
> #include 
> #endif
> 
> diff --git a/gcc/ada/terminals.c b/gcc/ada/terminals.c
> --- a/gcc/ada/terminals.c
> +++ b/gcc/ada/terminals.c
> @@ -31,7 +31,7 @@
> 
> #define ATTRIBUTE_UNUSED __attribute__((unused))
> 
> -/* First all usupported platforms. Add stubs for exported routines. */
> +/* First all unsupported platforms. Add stubs for exported routines. */
> 
> #if defined (VMS) || defined (__vxworks) || defined (__Lynx__) \
>   || defined (__ANDROID__) || defined (__PikeOS__) || defined(__DJGPP__)
> @@ -1089,7 +1089,7 @@ void
> {
> }
> 
> -#else /* defined(_WIN32, implementatin for all UNIXes */
> +#else /* defined(_WIN32, implementation for all UNIXes */
> 
> /* First defined some macro to identify easily some systems */
> #if defined (__FreeBSD__) \
> @@ -1104,6 +1104,7 @@ void
> #include 
> #include 
> #include 
> +#include 
> #include 
> #include 
> #include 
> @@ -1121,6 +1122,9 @@ void
> #if defined (__hpux__)
> #   include 
> #endif
> +#if defined (__APPLE__)
> +#   include 
> +#endif
> 
> #define CDISABLE _POSIX_VDISABLE
> 



Re: [PATCH] RISC-V: Vectorized str(n)cmp and strlen.

2023-12-01 Thread Robin Dapp
Split it into four separate patches now.

Regards
 Robin



[PATCH] RISC-V: Add vectorized strcmp.

2023-12-01 Thread Robin Dapp
Hi,

this patch adds a vectorized strcmp implementation and tests.  Similar
to strlen, expansion is still guarded by -minline-strcmp.  I just
realized I forgot to make it a series but this one is actually
dependent on the NFC patch and the rawmemchr fix before.

Regards
 Robin
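
The strategy handling in riscv_expand_strcmp (see the diff below) boils down to "try the vector expander first, then fall back to the scalar one". A stand-alone sketch of that control flow follows; every identifier in it is hypothetical, including the flag values, and the vector_ok parameter merely models the vector expander reporting success.

```c
#include <stdbool.h>

/* Hypothetical strategy bits and outcomes, for illustration only.  */
enum toy_strategy { TOY_SCALAR = 1, TOY_VECTOR = 2 };
enum toy_expander { TOY_NONE, TOY_USED_SCALAR, TOY_USED_VECTOR };

/* Pick an expander: vector first when available, allowed by STRATEGY,
   and successful; otherwise scalar when the bit-manipulation
   extensions are present and allowed; otherwise give up so the
   caller can emit a library call.  */
static enum toy_expander
toy_pick_expander (bool have_vector, bool have_bitmanip,
		   unsigned int strategy, bool vector_ok)
{
  if (have_vector && (strategy & TOY_VECTOR) && vector_ok)
    return TOY_USED_VECTOR;

  if (have_bitmanip && (strategy & TOY_SCALAR))
    return TOY_USED_SCALAR;

  return TOY_NONE;
}
```

Note that a failed vector attempt does not end the search: the scalar path is still considered, which matches the 'if (ok) return true;' shape in the patch.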

gcc/ChangeLog:

* config/riscv/riscv-protos.h (expand_strcmp): Declare.
* config/riscv/riscv-string.cc (riscv_expand_strcmp): Add
strategy handling and delegation to scalar and vector expanders.
(expand_strcmp): Vectorized implementation.
* config/riscv/riscv.md: Add TARGET_VECTOR to strcmp expander.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strcmp.c: New test.
---
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-string.cc  | 161 +-
 gcc/config/riscv/riscv.md |   3 +-
 .../riscv/rvv/autovec/builtin/strcmp-run.c|  32 
 .../riscv/rvv/autovec/builtin/strcmp.c|  13 ++
 5 files changed, 206 insertions(+), 4 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strcmp.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index c94c82a9973..5878a674413 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -558,6 +558,7 @@ void expand_cond_binop (unsigned, rtx *);
 void expand_cond_ternop (unsigned, rtx *);
 void expand_popcount (rtx *);
 void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false);
+bool expand_strcmp (rtx, rtx, rtx, rtx, unsigned HOST_WIDE_INT, bool);
 void emit_vec_extract (rtx, rtx, poly_int64);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 6cde1bf89a0..11c1f74d0b3 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -511,12 +511,19 @@ riscv_expand_strcmp (rtx result, rtx src1, rtx src2,
 return false;
   alignment = UINTVAL (align_rtx);
 
-  if (TARGET_ZBB || TARGET_XTHEADBB)
+  if (TARGET_VECTOR && stringop_strategy & STRATEGY_VECTOR)
 {
-  return riscv_expand_strcmp_scalar (result, src1, src2, nbytes, alignment,
-ncompare);
+  bool ok = riscv_vector::expand_strcmp (result, src1, src2,
+bytes_rtx, alignment,
+ncompare);
+  if (ok)
+   return true;
 }
 
+  if ((TARGET_ZBB || TARGET_XTHEADBB) && stringop_strategy & STRATEGY_SCALAR)
+return riscv_expand_strcmp_scalar (result, src1, src2, nbytes, alignment,
+  ncompare);
+
   return false;
 }
 
@@ -1092,4 +1099,152 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx 
haystack, rtx needle,
 }
 }
 
+/* Implement cmpstr using vector instructions.  The ALIGNMENT and
+   NCOMPARE parameters are unused for now.  */
+
+bool
+expand_strcmp (rtx result, rtx src1, rtx src2, rtx nbytes,
+  unsigned HOST_WIDE_INT, bool)
+{
+  gcc_assert (TARGET_VECTOR);
+
+  /* We don't support big endian.  */
+  if (BYTES_BIG_ENDIAN)
+return false;
+
+  bool with_length = nbytes != NULL_RTX;
+
+  if (with_length
+  && (!REG_P (nbytes) && !SUBREG_P (nbytes) && !CONST_INT_P (nbytes)))
+return false;
+
+  if (with_length && CONST_INT_P (nbytes))
+nbytes = force_reg (Pmode, nbytes);
+
+  machine_mode mode = E_QImode;
+  unsigned int isize = GET_MODE_SIZE (mode).to_constant ();
+  int lmul = TARGET_MAX_LMUL;
+  poly_int64 nunits = exact_div (BYTES_PER_RISCV_VECTOR * lmul, isize);
+
+  machine_mode vmode;
+  if (!riscv_vector::get_vector_mode (GET_MODE_INNER (mode), nunits)
+.exists (&vmode))
+gcc_unreachable ();
+
+  machine_mode mask_mode = riscv_vector::get_mask_mode (vmode);
+
+  /* Prepare addresses.  */
+  rtx src_addr1 = copy_addr_to_reg (XEXP (src1, 0));
+  rtx vsrc1 = change_address (src1, vmode, src_addr1);
+
+  rtx src_addr2 = copy_addr_to_reg (XEXP (src2, 0));
+  rtx vsrc2 = change_address (src2, vmode, src_addr2);
+
+  /* Set initial pointer bump to 0.  */
+  rtx cnt = gen_reg_rtx (Pmode);
+  emit_move_insn (cnt, CONST0_RTX (Pmode));
+
+  rtx sub = gen_reg_rtx (Pmode);
+  emit_move_insn (sub, CONST0_RTX (Pmode));
+
+  /* Create source vectors.  */
+  rtx vec1 = gen_reg_rtx (vmode);
+  rtx vec2 = gen_reg_rtx (vmode);
+
+  rtx done = gen_label_rtx ();
+  rtx loop = gen_label_rtx ();
+  emit_label (loop);
+
+  /* Bump the pointers.  */
+  emit_insn (gen_rtx_SET (src_addr1, gen_rtx_PLUS (Pmode, src_addr1, cnt)));
+  emit_insn (gen_rtx_SET (src_addr2, gen_rtx_PLUS (Pmode, src_addr2, cnt)));
+
+  rtx vlops1[] = {vec1, vsrc1};
+  rtx vlops2[] = {vec2, vsrc2};
+
+  if (!with_length)
+{
+  emit_vlmax_insn (code_for_pred

[PATCH] RISC-V: Add vectorized strlen.

2023-12-01 Thread Robin Dapp
Hi,

this patch implements a vectorized strlen by re-using and slightly
adjusting the rawmemchr implementation.  Rawmemchr returns the address
of the needle while strlen returns the difference between needle address
and start address.
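
The relationship between the two can be sketched in plain C.  This is only
a scalar model, not the vector expander from the patch; model_rawmemchr and
model_strlen are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar stand-in for rawmemchr: scan until NEEDLE is
   found.  As with the real function, the needle is assumed to be
   present in the haystack.  */
static const char *
model_rawmemchr (const char *haystack, char needle)
{
  assert (haystack != NULL);
  const char *p = haystack;
  while (*p != needle)
    p++;
  return p;
}

/* strlen is the distance between the start address and the address of
   the terminating NUL, i.e. rawmemchr's result minus the start.  */
static size_t
model_strlen (const char *s)
{
  return (size_t) (model_rawmemchr (s, '\0') - s);
}
```

For example, model_strlen ("riscv") finds the NUL five bytes past the start
and therefore yields 5.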

As before, strlen expansion is guarded by -minline-strlen.

While testing with -minline-strlen I encountered a vsetvl problem in
memcpy-chk.c where we didn't insert a vsetvl at the proper spot (after
a setjmp).  This needs to be fixed separately and I figured I'd post
this patch as-is.

Regards
 Robin

gcc/ChangeLog:

* config/riscv/riscv-protos.h (expand_rawmemchr): Add strlen
parameter.
* config/riscv/riscv-string.cc (riscv_expand_strlen): Call
rawmemchr.
(expand_rawmemchr): Add strlen handling.
* config/riscv/riscv.md: Add TARGET_VECTOR to strlen expander.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/builtin/strlen-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strlen.c: New test.
---
 gcc/config/riscv/riscv-protos.h   |  2 +-
 gcc/config/riscv/riscv-string.cc  | 41 ++-
 gcc/config/riscv/riscv.md |  8 +---
 .../riscv/rvv/autovec/builtin/strlen-run.c| 37 +
 .../riscv/rvv/autovec/builtin/strlen.c| 12 ++
 5 files changed, 83 insertions(+), 17 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 695ee24ad6f..c94c82a9973 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -557,7 +557,7 @@ void expand_cond_unop (unsigned, rtx *);
 void expand_cond_binop (unsigned, rtx *);
 void expand_cond_ternop (unsigned, rtx *);
 void expand_popcount (rtx *);
-void expand_rawmemchr (machine_mode, rtx, rtx, rtx);
+void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false);
 void emit_vec_extract (rtx, rtx, poly_int64);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 594ff49fc5a..6cde1bf89a0 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -588,9 +588,16 @@ riscv_expand_strlen_scalar (rtx result, rtx src, rtx align)
 bool
 riscv_expand_strlen (rtx result, rtx src, rtx search_char, rtx align)
 {
+  if (TARGET_VECTOR && stringop_strategy & STRATEGY_VECTOR)
+{
+  riscv_vector::expand_rawmemchr (E_QImode, result, src, search_char,
+ /* strlen */ true);
+  return true;
+}
+
   gcc_assert (search_char == const0_rtx);
 
-  if (TARGET_ZBB || TARGET_XTHEADBB)
+  if ((TARGET_ZBB || TARGET_XTHEADBB) && stringop_strategy & STRATEGY_SCALAR)
 return riscv_expand_strlen_scalar (result, src, align);
 
   return false;
@@ -979,12 +986,13 @@ expand_block_move (rtx dst_in, rtx src_in, rtx length_in)
 }
 
 
-/* Implement rawmemchr using vector instructions.
+/* Implement rawmemchr and strlen using vector instructions.
It can be assumed that the needle is in the haystack, otherwise the
behavior is undefined.  */
 
 void
-expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
+expand_rawmemchr (machine_mode mode, rtx dst, rtx haystack, rtx needle,
+ bool strlen)
 {
   /*
 rawmemchr:
@@ -1005,6 +1013,9 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, 
rtx pat)
   */
   gcc_assert (TARGET_VECTOR);
 
+  if (strlen)
+gcc_assert (mode == E_QImode);
+
   unsigned int isize = GET_MODE_SIZE (mode).to_constant ();
   int lmul = TARGET_MAX_LMUL;
   poly_int64 nunits = exact_div (BYTES_PER_RISCV_VECTOR * lmul, isize);
@@ -1028,12 +1039,13 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, 
rtx pat)
  return a pointer to the matching byte.  */
   unsigned int shift = exact_log2 (GET_MODE_SIZE (mode).to_constant ());
 
-  rtx src_addr = copy_addr_to_reg (XEXP (src, 0));
+  rtx src_addr = copy_addr_to_reg (XEXP (haystack, 0));
+  rtx start_addr = copy_addr_to_reg (XEXP (haystack, 0));
 
   rtx loop = gen_label_rtx ();
   emit_label (loop);
 
-  rtx vsrc = change_address (src, vmode, src_addr);
+  rtx vsrc = change_address (haystack, vmode, src_addr);
 
   /* Bump the pointer.  */
   rtx step = gen_reg_rtx (Pmode);
@@ -1052,8 +1064,8 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, 
rtx pat)
 emit_insn (gen_read_vldi_zero_extend (cnt));
 
   /* Compare needle with haystack and store in a mask.  */
-  rtx eq = gen_rtx_EQ (mask_mode, gen_const_vec_duplicate (vmode, pat), vec);
-  rtx vmsops[] = {mask, eq, vec, pat};
+  rtx eq = gen_rtx_EQ (mask_mode, gen_const_vec_duplicate (vmode, needle), 
vec);
+  rtx vmsops[] = {mask, eq, vec, needle};
   emit_nonvlmax_insn (code_for_pred_eqne_scalar (vmode),
  riscv_vector::COMPARE_OP, vmsops, 

[PATCH] RISC-V: Rename and unify stringop strategy handling [NFC].

2023-12-01 Thread Robin Dapp
Hi,

now split into multiple patches.

In preparation for the vectorized strlen and strcmp support this NFC
patch unifies the stringop strategy handling a bit.  The "auto"
strategy now is a combination of scalar and vector and an expander
should try the strategies in their preferred order.

For the block_move expander this patch does just that.
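
A minimal sketch of how a bit-flag strategy can be tested by an expander
wrapper follows.  The enum values are the ones from the patch; the
choose_expansion helper, its parameters and the vector-first order are
assumptions made for illustration (the order matches the strcmp and strlen
expanders posted in this thread).

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror the enum in the patch; everything else here is
   hypothetical.  */
enum stringop_strategy_enum
{
  STRATEGY_LIBCALL = 1,
  STRATEGY_SCALAR = 2,
  STRATEGY_VECTOR = 4,
  STRATEGY_AUTO = STRATEGY_SCALAR | STRATEGY_VECTOR
};

enum expansion_kind { EXPAND_NONE, EXPAND_SCALAR, EXPAND_VECTOR };

/* Hypothetical dispatcher: try the vector expansion first, then the
   scalar one.  EXPAND_NONE means the caller falls back to a libcall.
   VECTOR_OK models a vector expander that may decline.  */
static enum expansion_kind
choose_expansion (int strategy, bool have_vector, bool vector_ok,
		  bool have_scalar)
{
  /* Only the three known strategy bits may be set.  */
  assert ((strategy & ~7) == 0);

  if (have_vector && (strategy & STRATEGY_VECTOR) && vector_ok)
    return EXPAND_VECTOR;
  if (have_scalar && (strategy & STRATEGY_SCALAR))
    return EXPAND_SCALAR;
  return EXPAND_NONE;
}
```

With STRATEGY_AUTO both bits are set, so a declined vector expansion still
leaves the scalar path available, whereas STRATEGY_VECTOR alone does not.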

Regards
 Robin

gcc/ChangeLog:

* config/riscv/riscv-opts.h (enum riscv_stringop_strategy_enum):
Rename...
(enum stringop_strategy_enum): ... to this.
* config/riscv/riscv-string.cc (riscv_expand_block_move): New
wrapper expander handling the strategies and delegation.
(riscv_expand_block_move_scalar): Rename function and make
static.
(expand_block_move): Remove strategy handling.
* config/riscv/riscv.md: Call expander wrapper.
* config/riscv/riscv.opt: Rename.
---
 gcc/config/riscv/riscv-opts.h | 18 ++--
 gcc/config/riscv/riscv-string.cc  | 92 +++
 gcc/config/riscv/riscv.md |  4 +-
 gcc/config/riscv/riscv.opt| 18 ++--
 .../riscv/rvv/base/cpymem-strategy-1.c|  2 +-
 .../riscv/rvv/base/cpymem-strategy-2.c|  2 +-
 .../riscv/rvv/base/cpymem-strategy-3.c|  2 +-
 .../riscv/rvv/base/cpymem-strategy-4.c|  2 +-
 .../riscv/rvv/base/cpymem-strategy-5.c|  2 +-
 9 files changed, 78 insertions(+), 64 deletions(-)

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index e6e55ad7071..30efebbf07b 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -104,15 +104,15 @@ enum riscv_entity
 };
 
 /* RISC-V stringop strategy. */
-enum riscv_stringop_strategy_enum {
-  /* Use scalar or vector instructions. */
-  USE_AUTO,
-  /* Always use a library call. */
-  USE_LIBCALL,
-  /* Only use scalar instructions. */
-  USE_SCALAR,
-  /* Only use vector instructions. */
-  USE_VECTOR
+enum stringop_strategy_enum {
+  /* No expansion. */
+  STRATEGY_LIBCALL = 1,
+  /* Use scalar expansion if possible. */
+  STRATEGY_SCALAR = 2,
+  /* Only vector expansion if possible. */
+  STRATEGY_VECTOR = 4,
+  /* Use any. */
+  STRATEGY_AUTO = STRATEGY_SCALAR | STRATEGY_VECTOR
 };
 
 #define TARGET_ZICOND_LIKE (TARGET_ZICOND || (TARGET_XVENTANACONDOPS && 
TARGET_64BIT))
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 80e3b5981af..f3a4d3ddd47 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -707,51 +707,68 @@ riscv_block_move_loop (rtx dest, rtx src, unsigned 
HOST_WIDE_INT length,
 /* Expand a cpymemsi instruction, which copies LENGTH bytes from
memory reference SRC to memory reference DEST.  */
 
-bool
-riscv_expand_block_move (rtx dest, rtx src, rtx length)
+static bool
+riscv_expand_block_move_scalar (rtx dest, rtx src, rtx length)
 {
-  if (riscv_memcpy_strategy == USE_LIBCALL
-  || riscv_memcpy_strategy == USE_VECTOR)
+  if (!CONST_INT_P (length))
 return false;
 
-  if (CONST_INT_P (length))
-{
-  unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
-  unsigned HOST_WIDE_INT factor, align;
+  unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
+  unsigned HOST_WIDE_INT factor, align;
 
-  align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
-  factor = BITS_PER_WORD / align;
+  align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
+  factor = BITS_PER_WORD / align;
 
-  if (optimize_function_for_size_p (cfun)
- && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false))
-   return false;
+  if (optimize_function_for_size_p (cfun)
+  && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false))
+return false;
 
-  if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
+  if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
+{
+  riscv_block_move_straight (dest, src, INTVAL (length));
+  return true;
+}
+  else if (optimize && align >= BITS_PER_WORD)
+{
+  unsigned min_iter_words
+   = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD;
+  unsigned iter_words = min_iter_words;
+  unsigned HOST_WIDE_INT bytes = hwi_length;
+  unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD;
+
+  /* Lengthen the loop body if it shortens the tail.  */
+  for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++)
{
- riscv_block_move_straight (dest, src, INTVAL (length));
- return true;
+ unsigned cur_cost = iter_words + words % iter_words;
+ unsigned new_cost = i + words % i;
+ if (new_cost <= cur_cost)
+   iter_words = i;
}
-  else if (optimize && align >= BITS_PER_WORD)
-   {
- unsigned min_iter_words
-   = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD;
- unsigned iter_words = min_iter_words;
- unsigned HOST_WIDE_

[PATCH] RISC-V: Fix rawmemchr implementation.

2023-12-01 Thread Robin Dapp
Hi,

this fixes a bug in the rawmemchr implementation by incrementing the
source address by vl * element_size instead of just vl.

This is normally harmless as we will just scan the same region more than
once but, in combination with an older qemu version, would lead to
an execution failure in SPEC2017.
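
To illustrate the address arithmetic, here is a scalar model of the chunked
scan for 4-byte elements.  It is a simplification under stated assumptions:
find_u32_offset is a hypothetical name, VL is fixed rather than taken from a
first-fault load, and the needle is assumed to be present.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: return the byte offset of the first 4-byte
   element equal to NEEDLE, examining VL elements per iteration.  */
static size_t
find_u32_offset (const unsigned char *base, uint32_t needle, size_t vl)
{
  const unsigned shift = 2;	/* log2 (sizeof (uint32_t)).  */
  size_t addr = 0;		/* Byte offset of the current chunk.  */

  assert (vl > 0);
  for (;;)
    {
      for (size_t i = 0; i < vl; i++)
	{
	  uint32_t elt;
	  memcpy (&elt, base + addr + (i << shift), sizeof elt);
	  if (elt == needle)
	    return addr + (i << shift);
	}
      /* Advance by VL elements, i.e. VL << SHIFT bytes.  Adding only
	 VL would move 1/4 of a chunk and scan the same data again.  */
      addr += vl << shift;
    }
}
```

With vl = 4 and the needle at element index 5, the first chunk covers bytes
0..15, the pointer moves to byte 16, and the match is reported at byte 20.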

Regards
 Robin


gcc/ChangeLog:

* config/riscv/riscv-string.cc (expand_rawmemchr): Increment
source address by vl * element_size.
---
 gcc/config/riscv/riscv-string.cc | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index f3a4d3ddd47..594ff49fc5a 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -1017,6 +1017,8 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, 
rtx pat)
   machine_mode mask_mode = riscv_vector::get_mask_mode (vmode);
 
   rtx cnt = gen_reg_rtx (Pmode);
+  emit_move_insn (cnt, CONST0_RTX (Pmode));
+
   rtx end = gen_reg_rtx (Pmode);
   rtx vec = gen_reg_rtx (vmode);
   rtx mask = gen_reg_rtx (mask_mode);
@@ -1033,6 +1035,11 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, 
rtx pat)
 
   rtx vsrc = change_address (src, vmode, src_addr);
 
+  /* Bump the pointer.  */
+  rtx step = gen_reg_rtx (Pmode);
+  emit_insn (gen_rtx_SET (step, gen_rtx_ASHIFT (Pmode, cnt, GEN_INT (shift))));
+  emit_insn (gen_rtx_SET (src_addr, gen_rtx_PLUS (Pmode, src_addr, step)));
+
   /* Emit a first-fault load.  */
   rtx vlops[] = {vec, vsrc};
   emit_vlmax_insn (code_for_pred_fault_load (vmode),
@@ -1055,16 +1062,10 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, 
rtx pat)
   emit_nonvlmax_insn (code_for_pred_ffs (mask_mode, Pmode),
  riscv_vector::CPOP_OP, vfops, cnt);
 
-  /* Bump the pointer.  */
-  emit_insn (gen_rtx_SET (src_addr, gen_rtx_PLUS (Pmode, src_addr, cnt)));
-
   /* Emit the loop condition.  */
   rtx test = gen_rtx_LT (VOIDmode, end, const0_rtx);
   emit_jump_insn (gen_cbranch4 (Pmode, test, end, const0_rtx, loop));
 
-  /*  We overran by CNT, subtract it.  */
-  emit_insn (gen_rtx_SET (src_addr, gen_rtx_MINUS (Pmode, src_addr, cnt)));
-
   /*  We found something at SRC + END * [1,2,4,8].  */
   emit_insn (gen_rtx_SET (end, gen_rtx_ASHIFT (Pmode, end, GEN_INT (shift))));
   emit_insn (gen_rtx_SET (dst, gen_rtx_PLUS (Pmode, src_addr, end)));
-- 
2.43.0



[PATCH] ada: Fix Ada bootstrap on macOS

2023-12-01 Thread Rainer Orth
The recent warning changes broke Ada bootstrap on macOS:

adaint.c: In function '__gnat_copy_attribs':
adaint.c:3336:10: error: implicit declaration of function 'utimes'; did you 
mean 'utime'? [-Wimplicit-function-declaration]
 3336 |  if (utimes (to, tbuf) == -1) {
  |  ^~
  |  utime
adaint.c: In function '__gnat_kill':
adaint.c:3597:3: error: implicit declaration of function 'kill' 
[-Wimplicit-function-declaration]
 3597 |   kill (pid, sig);
  |   ^~~~
terminals.c: In function 'allocate_pty_desc':
terminals.c:1196:12: error: implicit declaration of function 'openpty'; did you 
mean 'openat'? [-Wimplicit-function-declaration]
 1196 |   status = openpty (&master_fd, &slave_fd, NULL, NULL, NULL);
  |^~~
  |openat
terminals.c: In function '__gnat_setup_winsize':
terminals.c:1392:6: error: implicit declaration of function 'kill' 
[-Wimplicit-function-declaration]
 1392 |  kill (desc->child_pid, SIGWINCH);
  |  ^~~~

This patch fixes this by including the necessary headers: <sys/time.h>
for utimes, <signal.h> for kill, and <util.h> for openpty.  With those
changes, the build completed on x86_64-apple-darwin2[0-3] (make check
still running).
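
As a small standalone illustration (not part of the patch), the two POSIX
functions compile without an implicit declaration once their headers are
included.  The helper names are hypothetical, and the _XOPEN_SOURCE
definition is an assumption so that the declarations stay visible under
strict -std= modes; openpty is left out because its header differs between
platforms.

```c
#define _XOPEN_SOURCE 700	/* Assumption: request POSIX declarations.  */

#include <assert.h>
#include <signal.h>	/* kill */
#include <sys/time.h>	/* utimes, struct timeval */
#include <sys/types.h>	/* pid_t */
#include <unistd.h>	/* getpid */

/* Hypothetical helper: signal 0 performs only the existence and
   permission check, no signal is delivered.  */
static int
process_exists (pid_t pid)
{
  return kill (pid, 0) == 0;
}

/* Hypothetical helper: set the access and modification times of PATH
   to SECS seconds since the epoch.  Returns 0 on success, -1 on error.  */
static int
set_file_times (const char *path, long secs)
{
  assert (path != NULL);
  struct timeval tbuf[2] = {
    { .tv_sec = secs, .tv_usec = 0 },
    { .tv_sec = secs, .tv_usec = 0 }
  };
  return utimes (path, tbuf);
}
```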

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2023-12-01  Rainer Orth  

gcc/ada:
* adaint.c [__APPLE__]: Include <signal.h>, <sys/time.h>.
* terminals.c [!_WIN32]: Include <signal.h>.
[__APPLE__]: Include <util.h>.
Fix typos.

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -85,6 +85,8 @@
 
 #if defined (__APPLE__)
 #include 
+#include 
+#include 
 #include 
 #endif
 
diff --git a/gcc/ada/terminals.c b/gcc/ada/terminals.c
--- a/gcc/ada/terminals.c
+++ b/gcc/ada/terminals.c
@@ -31,7 +31,7 @@
 
 #define ATTRIBUTE_UNUSED __attribute__((unused))
 
-/* First all usupported platforms. Add stubs for exported routines. */
+/* First all unsupported platforms. Add stubs for exported routines. */
 
 #if defined (VMS) || defined (__vxworks) || defined (__Lynx__) \
   || defined (__ANDROID__) || defined (__PikeOS__) || defined(__DJGPP__)
@@ -1089,7 +1089,7 @@ void
 {
 }
 
-#else /* defined(_WIN32, implementatin for all UNIXes */
+#else /* defined(_WIN32, implementation for all UNIXes */
 
 /* First defined some macro to identify easily some systems */
 #if defined (__FreeBSD__) \
@@ -1104,6 +1104,7 @@ void
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1121,6 +1122,9 @@ void
 #if defined (__hpux__)
 #   include 
 #endif
+#if defined (__APPLE__)
+#   include 
+#endif
 
 #define CDISABLE _POSIX_VDISABLE
 


[PATCH] untyped calls: enable target switching [PR112334]

2023-12-01 Thread Alexandre Oliva
On Dec  1, 2023, Alexandre Oliva  wrote:

> Also tested on arm-eabi, but it's *not* enough (or needed) to fix the
> PR, there's another bug lurking there, with a separate patch coming
> up.

Here it is.



The computation of apply_args_size and apply_result_size is saved in a
static variable, so that the corresponding _mode arrays are
initialized only once.  That is not compatible with switchable
targets, and ARM's arm_set_current_function, by saving and restoring
target globals, exercises this problem with a testcase such as that in
the PR, in which more than one function in the translation unit calls
__builtin_apply or __builtin_return, respectively.

This patch moves the _size statics into the target_builtins array,
with a bit of ugliness over _plus_one so that zero initialization of
the struct does the right thing.
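
The _plus_one idea can be sketched independently of GCC.  In the following
model, struct target_state, targets, this_target and compute_size are all
hypothetical stand-ins; only the encoding (stored value is size + 1, so a
zeroed struct reads back as -1, meaning "not computed") follows the patch.

```c
#include <assert.h>

/* Hypothetical per-target state, zero-initialized like the real
   target_builtins struct.  */
struct target_state
{
  int size_plus_one;	/* 0 means "not computed yet".  */
};

static struct target_state targets[2];
static struct target_state *this_target = &targets[0];
static int compute_calls;

static int
get_size (void)
{
  return this_target->size_plus_one - 1;
}

static void
set_size (int x)
{
  assert (x >= 0);
  this_target->size_plus_one = 1 + x;
}

/* Hypothetical expensive computation whose result depends on the
   currently selected target.  */
static int
compute_size (void)
{
  compute_calls++;
  return this_target == &targets[0] ? 16 : 32;
}

/* Cache per target rather than in a function-local static.  */
static int
cached_size (void)
{
  int size = get_size ();
  if (size < 0)
    {
      size = compute_size ();
      set_size (size);
    }
  return size;
}
```

Switching this_target to a fresh, zeroed entry makes cached_size recompute,
which a single function-local static would not do.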

Regstrapped on x86_64-linux-gnu, tested on arm-eabi with and without the
upthread patch.  It fixes the hardcfr fails either way.  As for the
ugliness, there's a follow up patch below that attempts to alleviate it
a little (also regstrapped and tested), but I'm not sure we want to go
down that path.  WDYT?


for  gcc/ChangeLog

PR target/112334
* builtins.h (target_builtins): Add fields for apply_args_size
and apply_result_size.
* builtins.cc (apply_args_size, apply_result_size): Cache
results in fields rather than in static variables.
(get_apply_args_size, set_apply_args_size): New.
(get_apply_result_size, set_apply_result_size): New.
---
 gcc/builtins.cc |   16 ++--
 gcc/builtins.h  |7 +++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 4fc58a0bda9b8..039bb5e997a2c 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -1398,8 +1398,16 @@ get_memory_rtx (tree exp, tree len)
 
 /* Built-in functions to perform an untyped call and return.  */
 
+#define set_apply_args_size(x) \
+  (this_target_builtins->x_apply_args_size_plus_one = 1 + (x))
+#define get_apply_args_size() \
+  (this_target_builtins->x_apply_args_size_plus_one - 1)
 #define apply_args_mode \
   (this_target_builtins->x_apply_args_mode)
+#define set_apply_result_size(x) \
+  (this_target_builtins->x_apply_result_size_plus_one = 1 + (x))
+#define get_apply_result_size() \
+  (this_target_builtins->x_apply_result_size_plus_one - 1)
 #define apply_result_mode \
   (this_target_builtins->x_apply_result_mode)
 
@@ -1409,7 +1417,7 @@ get_memory_rtx (tree exp, tree len)
 static int
 apply_args_size (void)
 {
-  static int size = -1;
+  int size = get_apply_args_size ();
   int align;
   unsigned int regno;
 
@@ -1442,6 +1450,8 @@ apply_args_size (void)
  }
else
   apply_args_mode[regno] = as_a <fixed_size_mode_pod> (VOIDmode);
+
+  set_apply_args_size (size);
 }
   return size;
 }
@@ -1452,7 +1462,7 @@ apply_args_size (void)
 static int
 apply_result_size (void)
 {
-  static int size = -1;
+  int size = get_apply_result_size ();
   int align, regno;
 
   /* The values computed by this function never change.  */
@@ -1484,6 +1494,8 @@ apply_result_size (void)
 #ifdef APPLY_RESULT_SIZE
   size = APPLY_RESULT_SIZE;
 #endif
+
+  set_apply_result_size (size);
 }
   return size;
 }
diff --git a/gcc/builtins.h b/gcc/builtins.h
index 88a26d70cd5a8..1a26fc63a6d10 100644
--- a/gcc/builtins.h
+++ b/gcc/builtins.h
@@ -37,6 +37,13 @@ struct target_builtins {
  register windows, this gives only the outbound registers.
  INCOMING_REGNO gives the corresponding inbound register.  */
   fixed_size_mode_pod x_apply_result_mode[FIRST_PSEUDO_REGISTER];
+
+  /* Nonzero iff the arrays above have been initialized.  The _plus_one suffix
+ is for zero initialization to make it an unreasonable size, used to signal
+ that the size and the corresponding mode array has not been
+ initialized.  */
+  int x_apply_args_size_plus_one;
+  int x_apply_result_size_plus_one;
 };
 
 extern struct target_builtins default_target_builtins;




untyped calls: use wrapper class type for implicit plus_one

Instead of get and set macros to apply a delta, use a single macro
that resorts to a temporary wrapper class to apply it.

To be combined (or not) with the previous patch.
---
 gcc/builtins.cc |   32 
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 039bb5e997a2c..a60e0a7084513 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -1398,16 +1398,24 @@ get_memory_rtx (tree exp, tree len)
 
 /* Built-in functions to perform an untyped call and return.  */
 
-#define set_apply_args_size(x) \
-  (this_target_builtins->x_apply_args_size_plus_one = 1 + (x))
-#define get_apply_args_size() \
-  (this_target_builtins->x_apply_args_size_plus_one - 1)
+/* Wrapper that implicitly applies a delta when getting or setting the
+   enclosed value.  */
+template <typename T>
+class delta_type
+{
+  T &value; T const delta;
+public:
+  delta

Re: [RFA] New pass for sign/zero extension elimination

2023-12-01 Thread Jeff Law




On 11/30/23 18:08, Hans-Peter Nilsson wrote:

Date: Sun, 19 Nov 2023 17:47:56 -0700
From: Jeff Law 



Locally we have had this enabled at -O1 and above to encourage testing,
but I'm thinking that for the trunk enabling at -O2 and above is the
right thing to do.


Yes.


Thoughts, comments, recommendations?


Sounds great!

It'd be nice if its framework can be re-used for
target-specific passes, doing quirky sign- or zero-extend-
related optimizations (those that are not just sign- or
zero-extend removal).  Perhaps most of those opportunities
can be implemented as target hooks in this pass.  Definitely
not asking for a change, just imagining future improvements.

Also, I haven't followed the thread and its branches, just
offering a word of encouragement.
What kind of quirky target things did you have in mind?  If there's 
overlap with things we need I might be able to find someone to take it 
on.  Or might be able to suggest how they can be handled.



jeff


Re: [PATCH] extend.texi: Fix up defbuiltin* with spaces in return type

2023-12-01 Thread Jeff Law




On 12/1/23 03:26, Jakub Jelinek wrote:

Hi!

In 
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-_005f_005fbuiltin_005fstdc_005fbit_005ffloor
I've noticed that while e.g. __builtin_stdc_bit_floor builtin is properly
rendered in bold and bigger size, for the __builtin_stdc_bit_width builtin
it is not the builtin name which is marked like that, but the keyword int
before it.  Also, seems such builtins are missing from the index.

I've read the texinfo docs and they seem to suggest in
https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Line-Macros.html
that return types of functions with spaces in the return type should be
wrapped with {}s and we already use that e.g. in
@defbuiltin{{void *} __builtin_thread_pointer (void)}

The following patch adjusts builtins I found which contained one or two
spaces in the return type name (plus two spots which used 2 spaces after
single keyword return type instead of 1 which triggered my search regex as
well).

Tested on x86_64-linux, ok for trunk?

2023-12-01  Jakub Jelinek  

* doc/extend.texi (__builtin_addc, __builtin_addcl, __builtin_addcll,
__builtin_subc, __builtin_subcl, __builtin_subcll,
__builtin_stdc_bit_width, __builtin_stdc_count_ones,
__builtin_stdc_count_zeros, __builtin_stdc_first_leading_one,
__builtin_stdc_first_leading_zero, __builtin_stdc_first_trailing_one,
__builtin_stdc_first_trailing_zero, __builtin_stdc_has_single_bit,
__builtin_stdc_leading_ones, __builtin_stdc_leading_zeros,
__builtin_stdc_trailing_ones, __builtin_stdc_trailing_zeros,
__builtin_nvptx_brev, __builtin_nvptx_brevll, __builtin_darn,
__builtin_darn_raw, __builtin_ia32_vec_ext_v2di,
__builtin_ia32_crc32qi, __builtin_ia32_crc32hi,
__builtin_ia32_crc32si, __builtin_ia32_crc32di): Put {}s around
return type with spaces in it.
(__builtin_rx_mvfachi, __builtin_rx_mvfacmi): Remove superfluous
whitespace.

OK
jeff


Re: Ping: [PATCH] Fix PR112419

2023-12-01 Thread Jeff Law




On 11/30/23 10:27, Hans-Peter Nilsson wrote:




My plan was to split up the test case in one which is for
-Wstringop-overflow and one which is for -Wnonnull and then
one could turn off the -Wstringop-overflow for the tests
which are actually for -Wnonnull.  But adding the dg-blah
would certainly be simpler.


Sort-of-mid-week ping (only because status quo isn't clear):
Jeff, are you content with Martin U's reply (i.e. patch ok)
or are you waiting for a follow-up?

Perhaps it's already in your overflowing queue, then please
ignore this, I'll just ping in a week. ;-)

IMHO, after looking at the test-case I'd expect *more*
warnings for ilp32 targets; i.e. it was a bug that it didn't
show up before.

Sorry, not waiting on anything.  Just crazy busy with some personal stuff.

I think we can go with the patch as-is.

Jeff


Re: [PATCH] libgccjit: Add support for the type bfloat16

2023-12-01 Thread Antoni Boucher
David: ping.

On Thu, 2023-11-16 at 17:20 -0500, Antoni Boucher wrote:
> I forgot to attach the patch.
> 
> On Thu, 2023-11-16 at 17:19 -0500, Antoni Boucher wrote:
> > Hi.
> > This patch adds the support for the type bfloat16 (bug 112574).
> > 
> > This was asked to be split from another patch sent here:
> > https://gcc.gnu.org/pipermail/jit/2023q1/001607.html
> > 
> > Thanks for the review.
> 



Re: [r14-5930 Regression] FAIL: gcc.c-torture/compile/libcall-2.c -Os (test for excess errors) on Linux/x86_64

2023-12-01 Thread Iain Sandoe
Hi FX,

> On 1 Dec 2023, at 13:55, FX Coudert  wrote:
> 
> That commit makes gcc.target/i386/libcall-1.c fail on darwin:
> 
> FAIL: gcc.target/i386/libcall-1.c scan-assembler globl\t__divti3
> 
> because the pattern is not found, the only mention of divti3 in the generated 
> assembly is:
> 
> LCFI0:
>movabsq $_b@GOTOFF, %rdx
>movabsq $___divti3@PLTOFF, %rax
>leaqL2(%rip), %r15
>pushq   %rbx
> 
> 
> The source code is:
> 
> ---
> /* Make sure that external references for libcalls are generated even for
>   indirect calls.  */
> 
> /* { dg-do compile { target int128 } } */
> /* { dg-options "-O2 -mcmodel=large" } */

mcmodel=large is not supported (yet) on any Darwin arch [PR90698], so the test 
needs skipping or xfailing, I think (either way with a reference to the PR).

Iain

> /* { dg-final { scan-assembler "globl\t__divti3" } } */
> 
> __int128 a, b; void foo () { a = a / b; }
> ---
> 
> Looking at, for example, gcc.target/i386/falign-functions-3.c it seems that 
> test avoids scanning for global references on darwin. Probably the new test 
> needs the same exception.
> 
> FX



Re: [PATCH] hardcfr: make builtin_return tests more portable [PR112334]

2023-12-01 Thread Richard Biener
On Fri, Dec 1, 2023 at 1:58 PM Alexandre Oliva  wrote:
>
>
> Rework __builtin_return tests to explicitly call __builtin_apply and
> use its return value rather than anything else.  Also require
> untyped_assembly.  Avoid the noise out of exceptions escaping the
> builtin-applied function, but add a test to cover their effects as
> well.
>
> Regstrapped on x86_64-linux-gnu.  Also tested on arm-eabi, but it's
> *not* enough (or needed) to fix the PR, there's another bug lurking
> there, with a separate patch coming up.  Regardless, is this ok to
> install?

OK.

>
> for  gcc/testsuite/ChangeLog
>
> PR target/112334
> * c-c++-common/torture/harden-cfr-bret.c: Rework for stricter
> untyped_return requirements.  Require untyped_assembly.
> * c-c++-common/torture/harden-cfr-bret-except.c: New.
> * c-c++-common/torture/harden-cfr-bret-always.c: Require
> untyped_assembly.
> * c-c++-common/torture/harden-cfr-bret-never.c: Likewise.
> * c-c++-common/torture/harden-cfr-bret-noopt.c: Likewise.
> * c-c++-common/torture/harden-cfr-bret-noret.c: Likewise.
> * c-c++-common/torture/harden-cfr-bret-no-xthrow.c: Likewise.
> * c-c++-common/torture/harden-cfr-bret-nothrow.c: Likewise.
> * c-c++-common/torture/harden-cfr-bret-retcl.c: Likewise.
> ---
>  .../c-c++-common/torture/harden-cfr-bret-always.c  |3 ++-
>  .../c-c++-common/torture/harden-cfr-bret-except.c  |   17 +++
>  .../c-c++-common/torture/harden-cfr-bret-never.c   |3 ++-
>  .../torture/harden-cfr-bret-no-xthrow.c|3 ++-
>  .../c-c++-common/torture/harden-cfr-bret-noopt.c   |3 ++-
>  .../c-c++-common/torture/harden-cfr-bret-noret.c   |3 ++-
>  .../c-c++-common/torture/harden-cfr-bret-nothrow.c |3 ++-
>  .../c-c++-common/torture/harden-cfr-bret-retcl.c   |3 ++-
>  .../c-c++-common/torture/harden-cfr-bret.c |   23 
> 
>  .../c-c++-common/torture/harden-cfr-noret.c|2 +-
>  10 files changed, 50 insertions(+), 13 deletions(-)
>  create mode 100644 
> gcc/testsuite/c-c++-common/torture/harden-cfr-bret-except.c
>
> diff --git a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-always.c 
> b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-always.c
> index 779896c60e846..3406c4e6ef9dc 100644
> --- a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-always.c
> +++ b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-always.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
> -/* { dg-options "-fharden-control-flow-redundancy 
> -fhardcfr-check-noreturn-calls=always -fdump-tree-hardcfr -ffat-lto-objects" 
> } */
> +/* { dg-options "-fharden-control-flow-redundancy 
> -fhardcfr-check-noreturn-calls=always -fno-exceptions -fdump-tree-hardcfr 
> -ffat-lto-objects" } */
> +/* { dg-require-effective-target untyped_assembly } */
>
>  /* Check that, even enabling all checks before noreturn calls (leaving
> returning calls enabled), we get checks before __builtin_return without
> diff --git a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-except.c 
> b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-except.c
> new file mode 100644
> index 0..3acb61cb75a77
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-except.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fharden-control-flow-redundancy -fexceptions 
> -fdump-tree-hardcfr -ffat-lto-objects" } */
> +/* { dg-require-effective-target untyped_assembly } */
> +
> +/* Check that, with exceptions enabled, even in C, the calls initiated by
> +   builtin_apply are enclosed in cleanup handlers that add extra checks.
> +   Unfortunately, declaring foobar as nothrow is not enough to avoid the
> +   handler around the builtin_apply call, so the other bret tests all use
> +   -fno-exceptions.  */
> +
> +#include "harden-cfr-bret.c"
> +
> +/* With exceptions, we get an extra check per function, to check before
> +   propagating exceptions, so it's 3 in f and 2 in g.  */
> +/* { dg-final { scan-tree-dump-times "__hardcfr_check" 5 "hardcfr" } } */
> +/* The extra check in g also removes the possibility of inlining the check.  
> */
> +/* { dg-final { scan-tree-dump-times "__builtin_trap" 0 "hardcfr" } } */
> diff --git a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-never.c 
> b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-never.c
> index 49ce17f5b937c..7f8fb64138df1 100644
> --- a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-never.c
> +++ b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-never.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
> -/* { dg-options "-fharden-control-flow-redundancy 
> -fhardcfr-check-noreturn-calls=never -fdump-tree-hardcfr -ffat-lto-objects" } 
> */
> +/* { dg-options "-fharden-control-flow-redundancy 
> -fhardcfr-check-noreturn-calls=never -fno-exceptions -fdump-tree-hardcfr 
> -ffat-lto-objects" } */
> +/* { dg-require-effective-target untyped_a

Re: [PATCH] Take register pressure into account for vec_construct/scalar_to_vec when the components are not loaded from memory.

2023-12-01 Thread Richard Biener
On Fri, Dec 1, 2023 at 3:39 AM liuhongt  wrote:
>
> > Hmm, I would suggest you put reg_needed into the class and accumulate
> > over all vec_construct, with your patch you pessimize a single v32qi
> > over two separate v16qi for example.  Also currently the whole block is
> > gated with INTEGRAL_TYPE_P but register pressure would be also
> > a concern for floating point vectors.  finish_cost would then apply an
> > adjustment.
>
> Changed.
>
> > 'target_avail_regs' is for GENERAL_REGS, does that include APX regs?
> > I don't see anything similar for FP regs, but I guess the target should know
> > or maybe there's a #regs in regclass query already.
> Haven't seen any, so I use the setting below.
>
> unsigned target_avail_sse = TARGET_64BIT ? (TARGET_AVX512F ? 32 : 16) : 8;
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> No big impact on SPEC2017.
> Observed one big improvement on another benchmark from avoiding vectorization
> with a vec_construct v32qi, which caused lots of spills.
>
> Ok for trunk?

LGTM, let's see what x86 maintainers think.

Richard.

> For vec_construct, the components must be live at the same time if
> they're not loaded from memory; when the number of those components
> exceeds the available registers, spills happen. Try to account for that
> with a rough estimation.
> ??? Ideally, we should have an overall estimation of register pressure
> if we know the live range of all variables.
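
To make the rough estimation concrete, here is a minimal sketch in plain C. It is not the code from the patch (the relevant function is cut off in this message). The names avail_sse_regs and estimated_spills are hypothetical, and charging one spill per component beyond the available registers is an assumption made only for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: number of SSE registers assumed to be available,
   mirroring the setting quoted earlier in this thread
   (64-bit with AVX512F: 32, plain 64-bit: 16, 32-bit: 8).  */
unsigned
avail_sse_regs (bool is_64bit, bool has_avx512f)
{
  return is_64bit ? (has_avx512f ? 32 : 16) : 8;
}

/* Hypothetical rough estimate: every component that must be live at the
   same time beyond the available registers is assumed to cost one spill.  */
unsigned
estimated_spills (unsigned needed, unsigned avail)
{
  return needed > avail ? needed - avail : 0;
}
```

With these assumptions, a v32qi built from 32 live scalars on a 16-register target would be charged 16 spills, while two separate v16qi constructs would be charged none, which is why accumulating the counts over the whole block matters.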
>
> gcc/ChangeLog:
>
> * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
> Count sse_reg/gpr_regs for components not loaded from memory.
> (ix86_vector_costs::ix86_vector_costs): New constructor.
> (ix86_vector_costs::m_num_gpr_needed[3]): New private member.
> (ix86_vector_costs::m_num_sse_needed[3]): Ditto.
> (ix86_vector_costs::finish_cost): Estimate overall register
> pressure cost.
> (ix86_vector_costs::ix86_vect_estimate_reg_pressure): New
> function.
> ---
>  gcc/config/i386/i386.cc | 54 ++---
>  1 file changed, 50 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 9390f525b99..dcaea6c2096 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -24562,15 +24562,34 @@ ix86_noce_conversion_profitable_p (rtx_insn *seq, 
> struct noce_if_info *if_info)
>  /* x86-specific vector costs.  */
>  class ix86_vector_costs : public vector_costs
>  {
> -  using vector_costs::vector_costs;
> +public:
> +  ix86_vector_costs (vec_info *, bool);
>
>unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
>   stmt_vec_info stmt_info, slp_tree node,
>   tree vectype, int misalign,
>   vect_cost_model_location where) override;
>void finish_cost (const vector_costs *) override;
> +
> +private:
> +
> +  /* Estimate register pressure of the vectorized code.  */
> +  void ix86_vect_estimate_reg_pressure ();
> +  /* Number of GENERAL_REGS/SSE_REGS used in the vectorizer, it's used for
> + estimation of register pressure.
> + ??? Currently it's only used by vec_construct/scalar_to_vec
> + where we know it's not loaded from memory.  */
> +  unsigned m_num_gpr_needed[3];
> +  unsigned m_num_sse_needed[3];
>  };
>
> +ix86_vector_costs::ix86_vector_costs (vec_info* vinfo, bool 
> costing_for_scalar)
> +  : vector_costs (vinfo, costing_for_scalar),
> +m_num_gpr_needed (),
> +m_num_sse_needed ()
> +{
> +}
> +
>  /* Implement targetm.vectorize.create_costs.  */
>
>  static vector_costs *
> @@ -24748,8 +24767,7 @@ ix86_vector_costs::add_stmt_cost (int count, 
> vect_cost_for_stmt kind,
>  }
>else if ((kind == vec_construct || kind == scalar_to_vec)
>&& node
> -  && SLP_TREE_DEF_TYPE (node) == vect_external_def
> -  && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
> +  && SLP_TREE_DEF_TYPE (node) == vect_external_def)
>  {
>stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>unsigned i;
> @@ -24785,7 +24803,15 @@ ix86_vector_costs::add_stmt_cost (int count, 
> vect_cost_for_stmt kind,
>   && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
>   || !VECTOR_TYPE_P (TREE_TYPE
> (TREE_OPERAND (gimple_assign_rhs1 (def), 
> 0))
> -   stmt_cost += ix86_cost->sse_to_integer;
> +   {
> + if (fp)
> +   m_num_sse_needed[where]++;
> + else
> +   {
> + m_num_gpr_needed[where]++;
> + stmt_cost += ix86_cost->sse_to_integer;
> +   }
> +   }
> }
>FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
> if (TREE_CODE (op) == SSA_NAME)
> @@ -24821,6 +24847,24 @@ ix86_vector_costs::add_stmt_cost (int count, 
> vect_cost_for_stmt kind,
>return retval;
>  }
>
> +void
> +ix86_vec

Re: [PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Richard Earnshaw (lists)
On 01/12/2023 13:45, Christophe Lyon wrote:
> On Fri, 1 Dec 2023 at 13:44, Richard Earnshaw (lists)
>  wrote:
>>
>> On 01/12/2023 11:28, Saurabh Jha wrote:
>>> Hey,
>>>
>>> I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in 
>>> this commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an 
>>> error which I unfortunately missed. This patch fixes that test.
>>>
>>> Did regression testing on arm-none-eabi and found no regressions. Output of 
>>> running gcc/contrib/compare_tests is this:
>>>
>>> """
>>> Tests that now work, but didn't before (2 tests):
>>>
>>> arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
>>> gcc.target/arm/mve/pr112337.c (test for excess errors)
>>> arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
>>>  gcc.target/arm/mve/pr112337.c (test for excess errors)
>>> """
>>>
>>> Ok for trunk? I don't have commit access so could someone please commit on 
>>> my behalf?
>>>
>>> Regards,
>>> Saurabh
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/arm/mve/pr112337.c: Fix the testcase
>>
>>
>> Hmm, could this be related to the changes Christophe made recently to change 
>> the way MVE vector types were set up internally?  If so, this might indicate 
>> an issue that's going to affect real users with existing code.
>>
> 
> My change was only about vector types, here the problem is with a
> pointer to a scalar.
> Anyway, I ran the test with my commit reverted and it still fails in
> the same way, so I think this patch is needed.
> 
> Thanks,
> 
> Christophe
> 
>> Christophe?
>>
>> R.

Ok, thanks for checking.  In that case, Saurabh, your patch is OK, but please 
change 'Fix testcase' to 'Use int32_t instead of int.'

Note that ChangeLog entries end with a full stop.
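
For context, a minimal sketch of why the two spellings can differ. It assumes (this is not stated in the thread) that on the failing configurations int32_t is a typedef for long rather than int, as is common for newlib-based arm-none-eabi toolchains; in that case int * and int32_t * are incompatible pointer types even though both point to 32-bit objects. The function names are hypothetical.

```c
#include <stdint.h>

/* Report which underlying type this toolchain uses for int32_t:
   'i' for int, 'l' for long, '?' for anything else.  */
char
int32_base_type (void)
{
  return _Generic ((int32_t) 0, int: 'i', long: 'l', default: '?');
}

/* Hypothetical callee standing in for an interface that takes a pointer
   to int32_t.  */
int32_t
load_first (const int32_t *p)
{
  return p[0];
}

/* Declaring the buffer as int32_t matches the parameter type on every
   target, whichever base type int32_t happens to have.  */
int32_t
example (void)
{
  int32_t buf[4] = { 7, 8, 9, 10 };
  return load_first (buf);
}
```

Had buf been declared as int, the call would still compile cleanly where int32_t is int, but would draw an incompatible-pointer diagnostic where it is long.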

R.


[pushed] diagnostics, analyzer: add optional per-diagnostic property bags to SARIF

2023-12-01 Thread David Malcolm
I've found it useful in debugging the analyzer for the SARIF output to
contain extra analyzer-specific data in each diagnostic.

This patch:
* adds a way for a diagnostic_metadata to populate a property
bag within a SARIF "result" object based on a new vfunc
* reworks how diagnostics are emitted within the analyzer so
that a custom diagnostic_metadata subclass is used, which populates
the property bag with information from the saved_diagnostic, and with
a vfunc hook allowing for per-pending_diagnostic-subclass extra
properties.

Doing so makes it trivial to go from the SARIF output back to
pertinent parts of the analyzer's internals (e.g. the index of
the diagnostic within the ana::diagnostic_manager, the index of
the ana::exploded_node, etc).

It also replaces a lot of boilerplate in the "emit" implementations
in the various pending_diagnostics subclasses.  In particular, doing
so fixes missing CWE metadata for -Wanalyzer-fd-phase-mismatch (where
sm-fd.cc's fd_phase_mismatch::emit was failing to use its
diagnostic_metadata instance).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.

Pushed to trunk as r14-6057-g12b67d1e13b3cf.

gcc/analyzer/ChangeLog:
* analyzer.h (class saved_diagnostic): New forward decl.
* bounds-checking.cc: Update for changes to
pending_diagnostic::emit.
* call-details.cc: Likewise.
* diagnostic-manager.cc: Include "diagnostic-format-sarif.h".
(saved_diagnostic::maybe_add_sarif_properties): New.
(class pending_diagnostic_metadata): New.
(diagnostic_manager::emit_saved_diagnostic): Create a
pending_diagnostic_metadata and a diagnostic_emission_context.
Pass the latter to the pending_diagnostic::emit vfunc.
* diagnostic-manager.h
(saved_diagnostic::maybe_add_sarif_properties): New decl.
* engine.cc: Update for changes to pending_diagnostic::emit.
* infinite-loop.cc: Likewise.
* infinite-recursion.cc: Likewise.
* kf-analyzer.cc: Likewise.
* kf.cc: Likewise.
* pending-diagnostic.cc
(diagnostic_emission_context::get_pending_diagnostic): New.
(diagnostic_emission_context::warn): New.
(diagnostic_emission_context::inform): New.
* pending-diagnostic.h (class diagnostic_emission_context): New.
(pending_diagnostic::emit): Update params.
(pending_diagnostic::maybe_add_sarif_properties): New vfunc.
* region.cc: Don't include "diagnostic-metadata.h".
* region-model.cc: Include "diagnostic-format-sarif.h".  Update
for changes to pending_diagnostic::emit.
(exposure_through_uninit_copy::maybe_add_sarif_properties): New.
* sm-fd.cc: Update for changes to pending_diagnostic::emit.
* sm-file.cc: Likewise.
* sm-malloc.cc: Likewise.
* sm-pattern-test.cc: Likewise.
* sm-sensitive.cc: Likewise.
* sm-signal.cc: Likewise.
* sm-taint.cc: Likewise.
* store.cc: Don't include "diagnostic-metadata.h".
* varargs.cc: Update for changes to pending_diagnostic::emit.

gcc/ChangeLog:
* diagnostic-core.h (emit_diagnostic_valist): New overload decl.
* diagnostic-format-sarif.cc (sarif_builder::make_result_object):
When we have metadata, call its maybe_add_sarif_properties vfunc.
* diagnostic-metadata.h (class sarif_object): Forward decl.
(diagnostic_metadata::~diagnostic_metadata): New.
(diagnostic_metadata::maybe_add_sarif_properties): New vfunc.
* diagnostic.cc (emit_diagnostic_valist): New overload.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fd-accept.c: Update for fix to missing CWE
metadata for -Wanalyzer-fd-phase-mismatch.
* gcc.dg/analyzer/fd-bind.c: Likewise.
* gcc.dg/analyzer/fd-socket-misuse.c: Likewise.
* gcc.dg/plugin/analyzer_cpython_plugin.c: Update for changes to
pending_diagnostic::emit.
* gcc.dg/plugin/analyzer_gil_plugin.c: Likewise.
---
 gcc/analyzer/analyzer.h   |   1 +
 gcc/analyzer/bounds-checking.cc   | 130 +--
 gcc/analyzer/call-details.cc  |   8 +-
 gcc/analyzer/diagnostic-manager.cc|  53 -
 gcc/analyzer/diagnostic-manager.h |   2 +
 gcc/analyzer/engine.cc|  15 +-
 gcc/analyzer/infinite-loop.cc |   9 +-
 gcc/analyzer/infinite-recursion.cc|   9 +-
 gcc/analyzer/kf-analyzer.cc   |   4 +-
 gcc/analyzer/kf.cc|  32 ++-
 gcc/analyzer/pending-diagnostic.cc|  45 
 gcc/analyzer/pending-diagnostic.h |  56 -
 gcc/analyzer/region-model.cc  | 123 +-
 gcc/analyzer/region.cc|   1 -
 gcc/analyzer/sm-fd.cc |  75 +++
 gcc/analyzer/sm-file.cc

[pushed] docs: remove stray reference to -fanalyzer-checker=taint [PR103533]

2023-12-01 Thread David Malcolm
I missed this one in r14-5464-gcfaaa8b11b8429.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-6056-g83b210d55b2846.

gcc/ChangeLog:
PR analyzer/103533
* doc/extend.texi: Remove stray reference to
-fanalyzer-checker=taint.
---
 gcc/doc/extend.texi | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 1ae589aeb29..6004c5699ff 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -4204,9 +4204,8 @@ pointers.  In the latter case, any function used as an 
initializer of
 such a callback field will be treated as being called with tainted
 arguments.
 
-The analyzer will pay particular attention to such functions when both
-@option{-fanalyzer} and @option{-fanalyzer-checker=taint} are supplied,
-potentially issuing warnings guarded by
+The analyzer will pay particular attention to such functions when
+@option{-fanalyzer} is supplied, potentially issuing warnings guarded by
 @option{-Wanalyzer-tainted-allocation-size},
 @option{-Wanalyzer-tainted-array-index},
 @option{-Wanalyzer-tainted-divisor},
-- 
2.26.3



Re: [r14-5930 Regression] FAIL: gcc.c-torture/compile/libcall-2.c -Os (test for excess errors) on Linux/x86_64

2023-12-01 Thread FX Coudert
That commit makes gcc.target/i386/libcall-1.c fail on darwin:

FAIL: gcc.target/i386/libcall-1.c scan-assembler globl\t__divti3

because the pattern is not found, the only mention of divti3 in the generated 
assembly is:

LCFI0:
movabsq $_b@GOTOFF, %rdx
movabsq $___divti3@PLTOFF, %rax
leaq    L2(%rip), %r15
pushq   %rbx


The source code is:

---
/* Make sure that external refences for libcalls are generated even for
   indirect calls.  */

/* { dg-do compile { target int128 } } */
/* { dg-options "-O2 -mcmodel=large" } */
/* { dg-final { scan-assembler "globl\t__divti3" } } */

__int128 a, b; void foo () { a = a / b; }
---

Looking at, for example, gcc.target/i386/falign-functions-3.c it seems that 
test avoids scanning for global references on darwin. Probably the new test 
needs the same exception.
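
One possible form of such an exception, as a sketch only: it attaches a target selector to the scan-assembler directive so that the scan is skipped on Darwin. Whether this exact selector is the right one for the test is an assumption and has not been checked against falign-functions-3.c.

```c
/* Make sure that external references for libcalls are generated even for
   indirect calls.  */

/* { dg-do compile { target int128 } } */
/* { dg-options "-O2 -mcmodel=large" } */
/* Assumed change: skip the .globl scan on Darwin targets.  */
/* { dg-final { scan-assembler "globl\t__divti3" { target { ! *-*-darwin* } } } } */

__int128 a, b; void foo () { a = a / b; }
```

The test body itself is unchanged; only the dg-final line gains the selector.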

FX

Re: [PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Christophe Lyon
On Fri, 1 Dec 2023 at 13:44, Richard Earnshaw (lists)
 wrote:
>
> On 01/12/2023 11:28, Saurabh Jha wrote:
> > Hey,
> >
> > I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in 
> > this commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an 
> > error which I unfortunately missed. This patch fixes that test.
> >
> > Did regression testing on arm-none-eabi and found no regressions. Output of 
> > running gcc/contrib/compare_tests is this:
> >
> > """
> > Tests that now work, but didn't before (2 tests):
> >
> > arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
> > gcc.target/arm/mve/pr112337.c (test for excess errors)
> > arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
> >  gcc.target/arm/mve/pr112337.c (test for excess errors)
> > """
> >
> > Ok for trunk? I don't have commit access so could someone please commit on 
> > my behalf?
> >
> > Regards,
> > Saurabh
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/arm/mve/pr112337.c: Fix the testcase
>
>
> Hmm, could this be related to the changes Christophe made recently to change 
> the way MVE vector types were set up internally?  If so, this might indicate 
> an issue that's going to affect real users with existing code.
>

My change was only about vector types, here the problem is with a
pointer to a scalar.
Anyway, I ran the test with my commit reverted and it still fails in
the same way, so I think this patch is needed.

Thanks,

Christophe

> Christophe?
>
> R.


[PATCH] hardcfr: make builtin_return tests more portable [PR112334]

2023-12-01 Thread Alexandre Oliva


Rework __builtin_return tests to explicitly call __builtin_apply and
use its return value rather than anything else.  Also require
untyped_assembly.  Avoid the noise out of exceptions escaping the
builtin-applied function, but add a test to cover their effects as
well.

Regstrapped on x86_64-linux-gnu.  Also tested on arm-eabi, but it's
*not* enough (or needed) to fix the PR, there's another bug lurking
there, with a separate patch coming up.  Regardless, is this ok to
install?


for  gcc/testsuite/ChangeLog

PR target/112334
* c-c++-common/torture/harden-cfr-bret.c: Rework for stricter
untyped_return requirements.  Require untyped_assembly.
* c-c++-common/torture/harden-cfr-bret-except.c: New.
* c-c++-common/torture/harden-cfr-bret-always.c: Require
untyped_assembly.
* c-c++-common/torture/harden-cfr-bret-never.c: Likewise.
* c-c++-common/torture/harden-cfr-bret-noopt.c: Likewise.
* c-c++-common/torture/harden-cfr-bret-noret.c: Likewise.
* c-c++-common/torture/harden-cfr-bret-no-xthrow.c: Likewise.
* c-c++-common/torture/harden-cfr-bret-nothrow.c: Likewise.
* c-c++-common/torture/harden-cfr-bret-retcl.c: Likewise.
---
 .../c-c++-common/torture/harden-cfr-bret-always.c  |3 ++-
 .../c-c++-common/torture/harden-cfr-bret-except.c  |   17 +++
 .../c-c++-common/torture/harden-cfr-bret-never.c   |3 ++-
 .../torture/harden-cfr-bret-no-xthrow.c|3 ++-
 .../c-c++-common/torture/harden-cfr-bret-noopt.c   |3 ++-
 .../c-c++-common/torture/harden-cfr-bret-noret.c   |3 ++-
 .../c-c++-common/torture/harden-cfr-bret-nothrow.c |3 ++-
 .../c-c++-common/torture/harden-cfr-bret-retcl.c   |3 ++-
 .../c-c++-common/torture/harden-cfr-bret.c |   23 
 .../c-c++-common/torture/harden-cfr-noret.c|2 +-
 10 files changed, 50 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/torture/harden-cfr-bret-except.c

diff --git a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-always.c 
b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-always.c
index 779896c60e846..3406c4e6ef9dc 100644
--- a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-always.c
+++ b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-always.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-fharden-control-flow-redundancy 
-fhardcfr-check-noreturn-calls=always -fdump-tree-hardcfr -ffat-lto-objects" } 
*/
+/* { dg-options "-fharden-control-flow-redundancy 
-fhardcfr-check-noreturn-calls=always -fno-exceptions -fdump-tree-hardcfr 
-ffat-lto-objects" } */
+/* { dg-require-effective-target untyped_assembly } */
 
 /* Check that, even enabling all checks before noreturn calls (leaving
returning calls enabled), we get checks before __builtin_return without
diff --git a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-except.c 
b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-except.c
new file mode 100644
index 0..3acb61cb75a77
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-except.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-fharden-control-flow-redundancy -fexceptions 
-fdump-tree-hardcfr -ffat-lto-objects" } */
+/* { dg-require-effective-target untyped_assembly } */
+
+/* Check that, with exceptions enabled, even in C, the calls initiated by
+   builtin_apply are enclosed in cleanup handlers that add extra checks.
+   Unfortunately, declaring foobar as nothrow is not enough to avoid the
+   handler around the builtin_apply call, so the other bret tests all use
+   -fno-exceptions.  */
+
+#include "harden-cfr-bret.c"
+
+/* With exceptions, we get an extra check per function, to check before
+   propagating exceptions, so it's 3 in f and 2 in g.  */
+/* { dg-final { scan-tree-dump-times "__hardcfr_check" 5 "hardcfr" } } */
+/* The extra check in g also removes the possibility of inlining the check.  */
+/* { dg-final { scan-tree-dump-times "__builtin_trap" 0 "hardcfr" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-never.c 
b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-never.c
index 49ce17f5b937c..7f8fb64138df1 100644
--- a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-never.c
+++ b/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-never.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-fharden-control-flow-redundancy 
-fhardcfr-check-noreturn-calls=never -fdump-tree-hardcfr -ffat-lto-objects" } */
+/* { dg-options "-fharden-control-flow-redundancy 
-fhardcfr-check-noreturn-calls=never -fno-exceptions -fdump-tree-hardcfr 
-ffat-lto-objects" } */
+/* { dg-require-effective-target untyped_assembly } */
 
 /* Check that, even enabling checks before never noreturn calls (leaving
returning calls enabled), we get checks before __builtin_return without
diff --git a/gcc/testsuite/c-c++-common/torture/harden-cfr-bret-no-xthrow.c 
b/gcc/testsuit

Re: [PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Richard Earnshaw (lists)
On 01/12/2023 11:28, Saurabh Jha wrote:
> Hey,
> 
> I introduced this test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in this 
> commit 2365aae84de030bbb006edac18c9314812fc657b before. This had an error 
> which I unfortunately missed. This patch fixes that test.
> 
> Did regression testing on arm-none-eabi and found no regressions. Output of 
> running gcc/contrib/compare_tests is this:
> 
> """
> Tests that now work, but didn't before (2 tests):
> 
> arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
> gcc.target/arm/mve/pr112337.c (test for excess errors)
> arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
>  gcc.target/arm/mve/pr112337.c (test for excess errors)
> """
> 
> Ok for trunk? I don't have commit access so could someone please commit on my 
> behalf?
> 
> Regards,
> Saurabh
> 
> gcc/testsuite/ChangeLog:
> 
>     * gcc.target/arm/mve/pr112337.c: Fix the testcase


Hmm, could this be related to the changes Christophe made recently to change 
the way MVE vector types were set up internally?  If so, this might indicate an 
issue that's going to affect real users with existing code.

Christophe?

R.


[PATCH] RISC-V: Fix incorrect combine of extended scalar pattern

2023-12-01 Thread Juzhe-Zhong
Background:
For RVV ISA vx instructions such as vadd.vx, when EEW = 64 on RV32
we can't use vadd.vx directly.
Instead, we need to use:

sw
sw
vlse
vadd.vv

However, there is a special case in which we can still use vadd.vx
directly for EEW = 64 on RV32:

when the scalar is a known CONST_INT value that fits in a sign-extended
32-bit value.
So we have a dedicated pattern for this case:

...
(sign_extend: (match_operand: 3 "register_operand"  " r,  
r,  r,  r")).
...

We first force_reg such a CONST_INT (within the 32-bit range) into an SImode reg,
then use these special patterns.
Patterns with this operand match should only be valid on !TARGET_64BIT.
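
The condition on the scalar can be written as a small C check. This is a sketch for illustration only; the name fits_in_sext32 is hypothetical and this is not how the backend tests the constant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if VALUE can be held in a 32-bit register and recovered
   exactly by sign-extending it back to 64 bits.  */
bool
fits_in_sext32 (int64_t value)
{
  return value >= INT32_MIN && value <= INT32_MAX;
}
```

For example, -1 and 2147483647 qualify, while 2147483648 does not, so only the first two could take the extended-scalar path on RV32.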

In PR112801, combine incorrectly matches such patterns on RV64 (those patterns
should only be valid on RV32).

This is the bug:

andi    a2,a2,2
vsetivli        zero,2,e64,m1,ta,ma
sext.w  a3,a4
vmv.v.x v1,a2
vslide1down.vx  v1,v1,a4        -> it should be a3 instead of a4.
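
The difference between a3 and a4 in the listing above matters whenever the upper 32 bits of a4 are not a sign extension of its low half. A minimal C model of sext.w, written for illustration (the function name is made up):

```c
#include <stdint.h>

/* Model of RISC-V sext.w: take the low 32 bits of a 64-bit register and
   sign-extend them to 64 bits.  */
int64_t
sext_w (uint64_t reg)
{
  uint64_t low = reg & 0xffffffffu;
  /* If bit 31 is set the result is negative: subtract 2^32.  */
  return (low & 0x80000000u) ? (int64_t) low - (INT64_C (1) << 32)
                             : (int64_t) low;
}
```

With a4 = 0x00000001ffffffff, sext_w gives -1, whereas using a4 directly feeds 0x1ffffffff into the vector element, which is the wrong value.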

Such incorrect codegen is caused by 
...
(sign_extend:DI (subreg:SI (reg:DI 135 [ f.0_3 ]) 0))
] UNSPEC_VSLIDE1DOWN)) 16935 {*pred_slide1downv2di_extended}
...

combine incorrectly matches these patterns, which should not be valid on an RV64 system.

So add !TARGET_64BIT to all patterns of the same type, which fixes the issue and
also makes vector.md more robust.

PR target/112801

gcc/ChangeLog:

* config/riscv/vector.md: Add !TARGET_64BIT.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr112801.c: New test.

---
 gcc/config/riscv/vector.md| 52 +--
 .../gcc.target/riscv/rvv/autovec/pr112801.c   | 36 +
 2 files changed, 62 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112801.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 09e8a63af07..acb812593a0 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1913,7 +1913,7 @@
 (match_operand:V_VLSI_D 2 "register_operand" " vr,vr")
(match_operand: 4 "register_operand" " vm,vm"))
   (match_operand:V_VLSI_D 1 "vector_merge_operand"   " vu, 0")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && !TARGET_64BIT"
   "vmerge.vxm\t%0,%2,%3,%4"
   [(set_attr "type" "vimerge")
(set_attr "mode" "")])
@@ -2091,7 +2091,7 @@
(sign_extend:
  (match_operand: 3 "register_operand"  " r,  r,  
r,  r")))
  (match_operand:V_VLSI_D 2 "vector_merge_operand"  "vu,  0, 
vu,  0")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && !TARGET_64BIT"
   "@
vmv.v.x\t%0,%3
vmv.v.x\t%0,%3
@@ -2677,7 +2677,7 @@
(match_operand: 4 "reg_or_0_operand" "rJ,rJ, rJ, rJ")))
(match_operand:V_VLSI_D 3 "register_operand" "vr,vr, vr, 
vr"))
  (match_operand:V_VLSI_D 2 "vector_merge_operand"   "vu, 0, vu,  
0")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && !TARGET_64BIT"
   "v.vx\t%0,%3,%z4%p1"
   [(set_attr "type" "")
(set_attr "mode" "")])
@@ -2753,7 +2753,7 @@
  (sign_extend:
(match_operand: 4 "reg_or_0_operand" "rJ,rJ, rJ, rJ"
  (match_operand:V_VLSI_D 2 "vector_merge_operand"   "vu, 0, vu,  
0")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && !TARGET_64BIT"
   "v.vx\t%0,%3,%z4%p1"
   [(set_attr "type" "")
(set_attr "mode" "")])
@@ -2829,7 +2829,7 @@
(match_operand: 4 "reg_or_0_operand" "rJ,rJ, rJ, rJ")))
(match_operand:V_VLSI_D 3 "register_operand" "vr,vr, vr, 
vr"))
  (match_operand:V_VLSI_D 2 "vector_merge_operand"   "vu, 0, vu,  
0")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && !TARGET_64BIT"
   "vrsub.vx\t%0,%3,%z4%p1"
   [(set_attr "type" "vialu")
(set_attr "mode" "")])
@@ -2947,7 +2947,7 @@
 (match_operand: 4 "reg_or_0_operand" "rJ,rJ, rJ, rJ")))
 (match_operand:VFULLI_D 3 "register_operand" "vr,vr, vr, vr")] 
VMULH)
  (match_operand:VFULLI_D 2 "vector_merge_operand""vu, 0, vu,  
0")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && !TARGET_64BIT"
   "vmulh.vx\t%0,%3,%z4%p1"
   [(set_attr "type" "vimul")
(set_attr "mode" "")])
@@ -3126,7 +3126,7 @@
(match_operand:VI_D 2 "register_operand" "vr,vr"))
  (match_operand: 4 "register_operand"   "vm,vm")] 
UNSPEC_VADC)
  (match_operand:VI_D 1 "vector_merge_operand"   "vu, 0")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && !TARGET_64BIT"
   "vadc.vxm\t%0,%2,%z3,%4"
   [(set_attr "type" "vicalu")
(set_attr "mode" "")
@@ -3210,7 +3210,7 @@
(match_operand: 3 "reg_or_0_operand" "rJ,rJ"
  (match_operand: 4 "register_operand"   "vm,vm")] 
UNSPEC_VSBC)
  (match_operand:VI_D 1 "vector_merge_operand"   "vu, 0")))]
-  "TARGET_VECTOR"
+  "TARGET_VECTOR && !TARGET_64BIT"
   "vsbc.vxm\t%0,%2,%z3,%4"
   [(set_attr "type" "vicalu")
(set_a

Re: [PATCH v3] RISC-V: Bugfix for legitimize move when get vec mode in zve32f

2023-12-01 Thread juzhe.zh...@rivai.ai
One more comment:

+  unsigned int num = (smode == DImode || smode == DFmode)
+ && !TARGET_VECTOR_ELEN_64 ? 2 : 1;

change it into:

unsigned int num = known_eq (GET_MODE_SIZE (smode), 8) && 
!TARGET_VECTOR_ELEN_64 ? 2 : 1;



juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2023-12-01 18:11
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v3] RISC-V: Bugfix for legitimize move when get vec mode in 
zve32f
From: Pan Li 
 
If we want to extract a 64-bit value but ELEN < 64, we use an RVV
vector mode with EEW = 32 to extract the highpart and lowpart.
However, this approach doesn't honor DFmode in the movdf pattern
under ZVE32F, and of course results in an ICE with zve32f.
 
This patch reuses the approach with some additional
handling. Since the lowpart bits are meaningless for an FP mode, we need
one int reg as a bridge here. For example:
 
rtx tmp = gen_reg_rtx (DImode)
reg:DI = reg:DF (fmv.x.d) // Move DF reg to DI
...
perform the extract for high and low parts
...
reg:DF = reg:DI (fmv.d.x) // Move DI reg back to DF after all done
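
The same idea can be sketched in portable C, purely as an illustration of the data flow and not of the RTL that the patch emits: the double's bit pattern is moved into a 64-bit integer, handled as two 32-bit halves, reassembled with a shift and an OR, and moved back. All function names here are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Move the bit pattern of a double into a 64-bit integer (the role of
   the DF -> DI move).  */
uint64_t
double_to_bits (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Move a 64-bit integer bit pattern back into a double (DI -> DF).  */
double
bits_to_double (uint64_t bits)
{
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* Rebuild the 64-bit value from two 32-bit extracts: the high part is
   shifted left by 32 and ORed with the low part.  */
uint64_t
combine_halves (uint32_t lo, uint32_t hi)
{
  return ((uint64_t) hi << 32) | lo;
}

/* Split a double into 32-bit halves and put it back together.  */
double
roundtrip (double d)
{
  uint64_t bits = double_to_bits (d);
  uint32_t lo = (uint32_t) bits;
  uint32_t hi = (uint32_t) (bits >> 32);
  return bits_to_double (combine_halves (lo, hi));
}
```

Neither half is a meaningful floating-point value on its own, which is why the integer register is needed as a bridge.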
 
PR target/112743
 
gcc/ChangeLog:
 
* config/riscv/riscv.cc (riscv_legitimize_move): Take the
exist (U *mode) and handle DFmode like DImode when EEW is
32bits for ZVE32F.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/pr112743-2.c: New test.
 
Signed-off-by: Pan Li 
---
gcc/config/riscv/riscv.cc | 63 +--
.../gcc.target/riscv/rvv/base/pr112743-2.c| 52 +++
2 files changed, 95 insertions(+), 20 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index a4fc858fb50..2fbaaf01078 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2605,41 +2605,64 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx 
src)
   unsigned int nunits = vmode_size > mode_size ? vmode_size / mode_size : 
1;
   scalar_mode smode = as_a (mode);
   unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
-  unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  unsigned int num = (smode == DImode || smode == DFmode)
+ && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+  bool need_int_reg_p = false;
   if (num == 2)
{
  /* If we want to extract 64bit value but ELEN < 64,
 we use RVV vector mode with EEW = 32 to extract
 the highpart and lowpart.  */
+   need_int_reg_p = smode == DFmode;
  smode = SImode;
  nunits = nunits * 2;
}
-  vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
-  rtx v = gen_lowpart (vmode, SUBREG_REG (src));
-  for (unsigned int i = 0; i < num; i++)
+  if (riscv_vector::get_vector_mode (smode, nunits).exists (&vmode))
{
-   rtx result;
-   if (num == 1)
- result = dest;
-   else if (i == 0)
- result = gen_lowpart (smode, dest);
-   else
- result = gen_reg_rtx (smode);
-   riscv_vector::emit_vec_extract (result, v, index + i);
+   rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+   rtx int_reg = dest;
-   if (i == 1)
+   if (need_int_reg_p)
{
-   rtx tmp
- = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
- gen_int_mode (32, Pmode), NULL_RTX, 0,
- OPTAB_DIRECT);
-   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
-OPTAB_DIRECT);
-   emit_move_insn (dest, tmp2);
+   int_reg = gen_reg_rtx (DImode);
+   emit_move_insn (int_reg, gen_lowpart (GET_MODE (int_reg), dest));
}
+
+   for (unsigned int i = 0; i < num; i++)
+ {
+   rtx result;
+   if (num == 1)
+ result = int_reg;
+   else if (i == 0)
+ result = gen_lowpart (smode, int_reg);
+   else
+ result = gen_reg_rtx (smode);
+
+   riscv_vector::emit_vec_extract (result, v, index + i);
+
+   if (i == 1)
+ {
+   rtx tmp = expand_binop (Pmode, ashl_optab,
+   gen_lowpart (Pmode, result),
+   gen_int_mode (32, Pmode), NULL_RTX, 0,
+   OPTAB_DIRECT);
+   rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, int_reg,
+NULL_RTX, 0,
+OPTAB_DIRECT);
+   emit_move_insn (int_reg, tmp2);
+ }
+ }
+
+   if (need_int_reg_p)
+ emit_move_insn (dest, gen_lowpart (GET_MODE (dest), int_reg));
+   else
+ emit_move_insn (dest, int_reg);
}
+  else
+ gcc_unreachable ();
+
   return true;
 }
   /* Expand
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
new file mode 100644
index 000..fdb35fd70f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112743-2.c
@@ -0,0 +1,52 @@
+/* Test that we do not have ice when compile */
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zve32f_zvfh_zfh -mabi=lp64 -O2" } */
+
+#include 
+
+union double_union
+{
+  double d;
+  __uint32_t i[2];
+};
+
+#define word0(x)  (x.i[1])
+#define word1(x)  (x.i[0])
+
+#define P 53
+#define Exp_shift 20
+#define Exp_msk1  ((__uint32_t)0x10L)
+#define Exp_mask  ((__uint

Re: [PATCH] Fix ambiguity between vect_get_vec_defs with/without vectype

2023-12-01 Thread Richard Biener
On Fri, 1 Dec 2023, Richard Sandiford wrote:

> Richard Biener  writes:
> > When querying a single set of vector defs with the overloaded
> > vect_get_vec_defs API then when you try to use the overload with
> > the vector type specified the call will be ambiguous with the
> > variant without the vector type.  The following fixes this by
> > re-ordering the vector type argument to come before the output
> > def vector argument.
> >
> > I've changed vectorizable_conversion as that triggered this
> > so it has coverage showing this works.  The motivation is to
> > reduce the number of (redundant) get_vectype_for_scalar_type
> > calls.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >
> > OK for trunk or shall I defer to stage1?
> 
> LGTM.  Think it's probably worth having now, in case we need it for
> other fixes.

That's what I thought as well, pushed now.

Thanks,
Richard.

> Thanks,
> Richard
> 
> >
> > Thanks,
> > Richard.
> >
> > * tree-vectorizer.h (vect_get_vec_defs): Re-order arguments.
> > * tree-vect-stmts.cc (vect_get_vec_defs): Likewise.
> > (vectorizable_condition): Update caller.
> > (vectorizable_comparison_1): Likewise.
> > (vectorizable_conversion): Specify the vector type to be
> > used for invariant/external defs.
> > * tree-vect-loop.cc (vect_transform_reduction): Update caller.
> > ---
> >  gcc/tree-vect-loop.cc  |  6 +++---
> >  gcc/tree-vect-stmts.cc | 42 +-
> >  gcc/tree-vectorizer.h  |  8 
> >  3 files changed, 28 insertions(+), 28 deletions(-)
> >
> > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > index 3df020d2228..dd584ab4a42 100644
> > --- a/gcc/tree-vect-loop.cc
> > +++ b/gcc/tree-vect-loop.cc
> > @@ -8504,11 +8504,11 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
> >gcc_assert (single_defuse_cycle
> >   && (reduc_index == 1 || reduc_index == 2));
> >vect_get_vec_defs (loop_vinfo, stmt_info, slp_node, ncopies,
> > -op.ops[0], &vec_oprnds0, truth_type_for (vectype_in),
> > +op.ops[0], truth_type_for (vectype_in), &vec_oprnds0,
> >  reduc_index == 1 ? NULL_TREE : op.ops[1],
> > -&vec_oprnds1, NULL_TREE,
> > +NULL_TREE, &vec_oprnds1,
> >  reduc_index == 2 ? NULL_TREE : op.ops[2],
> > -&vec_oprnds2, NULL_TREE);
> > +NULL_TREE, &vec_oprnds2);
> >  }
> >  
> >/* For single def-use cycles get one copy of the vectorized reduction
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index bf8c99779ae..067abac3917 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -1267,10 +1267,10 @@ vect_get_vec_defs_for_operand (vec_info *vinfo, 
> > stmt_vec_info stmt_vinfo,
> >  void
> >  vect_get_vec_defs (vec_info *vinfo, stmt_vec_info stmt_info, slp_tree 
> > slp_node,
> >unsigned ncopies,
> > -  tree op0, vec *vec_oprnds0, tree vectype0,
> > -  tree op1, vec *vec_oprnds1, tree vectype1,
> > -  tree op2, vec *vec_oprnds2, tree vectype2,
> > -  tree op3, vec *vec_oprnds3, tree vectype3)
> > +  tree op0, tree vectype0, vec *vec_oprnds0,
> > +  tree op1, tree vectype1, vec *vec_oprnds1,
> > +  tree op2, tree vectype2, vec *vec_oprnds2,
> > +  tree op3, tree vectype3, vec *vec_oprnds3)
> >  {
> >if (slp_node)
> >  {
> > @@ -1309,10 +1309,10 @@ vect_get_vec_defs (vec_info *vinfo, stmt_vec_info 
> > stmt_info, slp_tree slp_node,
> >tree op3, vec *vec_oprnds3)
> >  {
> >vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
> > -op0, vec_oprnds0, NULL_TREE,
> > -op1, vec_oprnds1, NULL_TREE,
> > -op2, vec_oprnds2, NULL_TREE,
> > -op3, vec_oprnds3, NULL_TREE);
> > +op0, NULL_TREE, vec_oprnds0,
> > +op1, NULL_TREE, vec_oprnds1,
> > +op2, NULL_TREE, vec_oprnds2,
> > +op3, NULL_TREE, vec_oprnds3);
> >  }
> >  
> >  /* Helper function called by vect_finish_replace_stmt and
> > @@ -5657,7 +5657,7 @@ vectorizable_conversion (vec_info *vinfo,
> >  {
> >  case NONE:
> >vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
> > -op0, &vec_oprnds0);
> > +op0, vectype_in, &vec_oprnds0);
> >/* vec_dest is intermediate type operand when multi_step_cvt.  */
> >if (multi_step_cvt)
> > {
> > @@ -5696,9 +5696,9 @@ vectorizable_conversion (vec_info *vinfo,
> >  generate more than one vector stmt - i.e - we need to "unroll"
> >  the vector stmt by a factor VF/nunits.  */
> >vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies * ninputs,
> > -op0, &vec_oprnds0,
> > +op0, vectype_in

Re: [PATCH] htdocs/git.html: correct spelling and use git in example

2023-12-01 Thread Jonny Grant



On 30/11/2023 23:56, Joseph Myers wrote:
> On Thu, 30 Nov 2023, Jonny Grant wrote:
> 
>> ChangeLog:
>>
>>  htdocs/git.html: change example to use git:// and correct
>>  spelling repostiory -> repository .
> 
> git:// (unencrypted / unauthenticated) is pretty widely considered 
> obsolescent, I'm not sure adding a use of it (as opposed to changing any 
> existing examples to use a secure connection mechanism) is a good idea.
> 

Hi Joseph

Thank you for your review.

Good point. I changed the ssh:// example because it doesn't work with 
anonymous access.
How about changing both to https:// ?

Otherwise, could the spelling correction be applied, or would it be better 
for me to submit a new patch with only that?

Kind regards
Jonny


Re: Re: [RISC-V PATCH] Improve style to work around PR 60994 in host compiler.

2023-12-01 Thread juzhe.zh...@rivai.ai
LGTM.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-12-01 17:27
To: Roger Sayle; gcc-patches
CC: rdapp.gcc; juzhe.zh...@rivai.ai
Subject: Re: [RISC-V PATCH] Improve style to work around PR 60994 in host 
compiler.
Yes, OK, thanks for that.  CC'ing Juzhe as this is his pass.
 
Regards
Robin
 


Re: [PATCH] Fix ambiguity between vect_get_vec_defs with/without vectype

2023-12-01 Thread Richard Sandiford
Richard Biener  writes:
> When querying a single set of vector defs with the overloaded
> vect_get_vec_defs API then when you try to use the overload with
> the vector type specified the call will be ambiguous with the
> variant without the vector type.  The following fixes this by
> re-ordering the vector type argument to come before the output
> def vector argument.
>
> I've changed vectorizable_conversion as that triggered this
> so it has coverage showing this works.  The motivation is to
> reduce the number of (redundant) get_vectype_for_scalar_type
> calls.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> OK for trunk or shall I defer to stage1?
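
A minimal sketch of the ambiguity described above, not the actual GCC
declarations. It assumes that the later operands of both overloads have
default arguments; tree_node, def_vec and get_defs are hypothetical
stand-ins for tree, vec<tree> and vect_get_vec_defs. With the old order
(operand, output vector, vector type) a three-argument call fits both
overloads, because the vector type can equally bind to the second operand
of the variant without vector types. With the type placed before the
output vector, only one overload is viable:

```cpp
#include <vector>

// Hypothetical stand-ins for GCC's tree and vec<tree>.
struct tree_node {};
typedef tree_node *tree;
typedef std::vector<tree> def_vec;

// Variant without vector types: (operand, output vector) pairs.
inline int
get_defs (tree /* op0 */, def_vec * /* out0 */,
	  tree /* op1 */ = nullptr, def_vec * /* out1 */ = nullptr)
{
  return 0;
}

// Old order of the variant with vector types, kept as a comment because
// declaring it makes get_defs (op, &defs, vectype) ambiguous: the third
// argument matches vectype0 here and op1 in the variant above.
//   int get_defs (tree op0, def_vec *out0, tree vectype0,
//                 tree op1 = nullptr, def_vec *out1 = nullptr,
//                 tree vectype1 = nullptr);

// New order: (operand, vector type, output vector).  A tree cannot convert
// to def_vec *, so the variant above is no longer viable for such a call.
inline int
get_defs (tree /* op0 */, tree /* vectype0 */, def_vec * /* out0 */,
	  tree /* op1 */ = nullptr, tree /* vectype1 */ = nullptr,
	  def_vec * /* out1 */ = nullptr)
{
  return 1;
}
```

In this sketch get_defs (op, &defs) selects the first overload and
get_defs (op, vectype, &defs) selects the second, with no ambiguity.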

LGTM.  Think it's probably worth having now, in case we need it for
other fixes.

Thanks,
Richard

>
> Thanks,
> Richard.
>
>   * tree-vectorizer.h (vect_get_vec_defs): Re-order arguments.
>   * tree-vect-stmts.cc (vect_get_vec_defs): Likewise.
>   (vectorizable_condition): Update caller.
>   (vectorizable_comparison_1): Likewise.
>   (vectorizable_conversion): Specify the vector type to be
>   used for invariant/external defs.
>   * tree-vect-loop.cc (vect_transform_reduction): Update caller.
> ---
>  gcc/tree-vect-loop.cc  |  6 +++---
>  gcc/tree-vect-stmts.cc | 42 +-
>  gcc/tree-vectorizer.h  |  8 
>  3 files changed, 28 insertions(+), 28 deletions(-)
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 3df020d2228..dd584ab4a42 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -8504,11 +8504,11 @@ vect_transform_reduction (loop_vec_info loop_vinfo,
>gcc_assert (single_defuse_cycle
> && (reduc_index == 1 || reduc_index == 2));
>vect_get_vec_defs (loop_vinfo, stmt_info, slp_node, ncopies,
> -  op.ops[0], &vec_oprnds0, truth_type_for (vectype_in),
> +  op.ops[0], truth_type_for (vectype_in), &vec_oprnds0,
>reduc_index == 1 ? NULL_TREE : op.ops[1],
> -  &vec_oprnds1, NULL_TREE,
> +  NULL_TREE, &vec_oprnds1,
>reduc_index == 2 ? NULL_TREE : op.ops[2],
> -  &vec_oprnds2, NULL_TREE);
> +  NULL_TREE, &vec_oprnds2);
>  }
>  
>/* For single def-use cycles get one copy of the vectorized reduction
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index bf8c99779ae..067abac3917 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -1267,10 +1267,10 @@ vect_get_vec_defs_for_operand (vec_info *vinfo, 
> stmt_vec_info stmt_vinfo,
>  void
>  vect_get_vec_defs (vec_info *vinfo, stmt_vec_info stmt_info, slp_tree 
> slp_node,
>  unsigned ncopies,
> -tree op0, vec *vec_oprnds0, tree vectype0,
> -tree op1, vec *vec_oprnds1, tree vectype1,
> -tree op2, vec *vec_oprnds2, tree vectype2,
> -tree op3, vec *vec_oprnds3, tree vectype3)
> +tree op0, tree vectype0, vec *vec_oprnds0,
> +tree op1, tree vectype1, vec *vec_oprnds1,
> +tree op2, tree vectype2, vec *vec_oprnds2,
> +tree op3, tree vectype3, vec *vec_oprnds3)
>  {
>if (slp_node)
>  {
> @@ -1309,10 +1309,10 @@ vect_get_vec_defs (vec_info *vinfo, stmt_vec_info 
> stmt_info, slp_tree slp_node,
>  tree op3, vec *vec_oprnds3)
>  {
>vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
> -  op0, vec_oprnds0, NULL_TREE,
> -  op1, vec_oprnds1, NULL_TREE,
> -  op2, vec_oprnds2, NULL_TREE,
> -  op3, vec_oprnds3, NULL_TREE);
> +  op0, NULL_TREE, vec_oprnds0,
> +  op1, NULL_TREE, vec_oprnds1,
> +  op2, NULL_TREE, vec_oprnds2,
> +  op3, NULL_TREE, vec_oprnds3);
>  }
>  
>  /* Helper function called by vect_finish_replace_stmt and
> @@ -5657,7 +5657,7 @@ vectorizable_conversion (vec_info *vinfo,
>  {
>  case NONE:
>vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
> -  op0, &vec_oprnds0);
> +  op0, vectype_in, &vec_oprnds0);
>/* vec_dest is intermediate type operand when multi_step_cvt.  */
>if (multi_step_cvt)
>   {
> @@ -5696,9 +5696,9 @@ vectorizable_conversion (vec_info *vinfo,
>generate more than one vector stmt - i.e - we need to "unroll"
>the vector stmt by a factor VF/nunits.  */
>vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies * ninputs,
> -  op0, &vec_oprnds0,
> +  op0, vectype_in, &vec_oprnds0,
>code == WIDEN_LSHIFT_EXPR ? NULL_TREE : op1,
> -  &vec_oprnds1);
> +  vectype_in, &vec_oprnds1);
>if (code == WIDEN_LSHIFT_EXPR)
>   {
> int oprn

Re: [PATCH v3] aarch64: New RTL optimization pass avoid-store-forwarding.

2023-12-01 Thread Richard Sandiford
Manos Anagnostakis  writes:
> This is an RTL pass that detects store forwarding from stores to larger loads 
> (load pairs).
>
> This optimization is SPEC2017-driven and was found to be beneficial for some 
> benchmarks,
> through testing on ampere1/ampere1a machines.
>
> For example, it can transform cases like
>
> str  d5, [sp, #320]
> fmul d5, d31, d29
> ldp  d31, d17, [sp, #312] # Large load from small store
>
> to
>
> str  d5, [sp, #320]
> fmul d5, d31, d29
> ldr  d31, [sp, #312]
> ldr  d17, [sp, #320]
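
The "large load from small store" situation in the example above can be
sketched as a plain range check. This is only an illustration under
assumed names (mem_access and store_feeds_wider_load are hypothetical);
it is not the pass's logic, which works on RTL and also applies the
distance threshold named in the ChangeLog.

```cpp
// Hypothetical description of one memory access relative to a common base
// register: byte offset and access size in bytes.
struct mem_access
{
  long offset;
  long size;
};

// Return true if STORE writes only part of the bytes that LOAD reads,
// i.e. the store lies entirely inside the load's range and is narrower
// than the load.
inline bool
store_feeds_wider_load (const mem_access &store, const mem_access &load)
{
  bool inside = store.offset >= load.offset
		&& store.offset + store.size <= load.offset + load.size;
  return inside && store.size < load.size;
}
```

For the 8-byte str at #320 and the 16-byte ldp at #312 the check holds.
After the split, the ldr at #312 does not overlap the store and the ldr at
#320 has the same width as the store, so neither is flagged.
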
>
> Currently, the pass is disabled by default on all architectures and enabled 
> by a target-specific option.
>
> If deemed beneficial enough for a default, it will be enabled on 
> ampere1/ampere1a,
> or other architectures as well, without needing to be turned on by this 
> option.
>
> Bootstrapped and regtested on aarch64-linux.
>
> gcc/ChangeLog:
>
> * config.gcc: Add aarch64-store-forwarding.o to extra_objs.
> * config/aarch64/aarch64-passes.def (INSERT_PASS_AFTER): New pass.
> * config/aarch64/aarch64-protos.h (make_pass_avoid_store_forwarding): 
> Declare.
> * config/aarch64/aarch64.opt (mavoid-store-forwarding): New option.
>   (aarch64-store-forwarding-threshold): New param.
> * config/aarch64/t-aarch64: Add aarch64-store-forwarding.o
> * doc/invoke.texi: Document new option and new param.
> * config/aarch64/aarch64-store-forwarding.cc: New file.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/ldp_ssll_no_overlap_address.c: New test.
> * gcc.target/aarch64/ldp_ssll_no_overlap_offset.c: New test.
> * gcc.target/aarch64/ldp_ssll_overlap.c: New test.
>
> Signed-off-by: Manos Anagnostakis 
> Co-Authored-By: Manolis Tsamis 
> Co-Authored-By: Philipp Tomsich 
> ---
> Changes in v3:
>   - Removed obvious zext/sext check in SET_DEST.
>   - Replaced map/vector with list/struct.
>   - Replaced current threshold check algorithm with the
>   suggested, more efficient one (actually an inline check).
>   - Added usage of cselib_subst_to_values, which works well
>   with cselib_process_insn.
>   - Removed canon_rtx, but cannot remove init/end_alias_analysis,
>   as this is required for cselib to work properly (see cselib_init
>   in gcc/cselib.cc).

Ah, right.  Thanks for giving it a go.

>   - Adjusted comments according to the new changes.
>
>  gcc/config.gcc|   1 +
>  gcc/config/aarch64/aarch64-passes.def |   1 +
>  gcc/config/aarch64/aarch64-protos.h   |   1 +
>  .../aarch64/aarch64-store-forwarding.cc   | 319 ++
>  gcc/config/aarch64/aarch64.opt|   9 +
>  gcc/config/aarch64/t-aarch64  |  10 +
>  gcc/doc/invoke.texi   |  12 +-
>  .../aarch64/ldp_ssll_no_overlap_address.c |  33 ++
>  .../aarch64/ldp_ssll_no_overlap_offset.c  |  33 ++
>  .../gcc.target/aarch64/ldp_ssll_overlap.c |  33 ++
>  10 files changed, 451 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/config/aarch64/aarch64-store-forwarding.cc
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_address.c
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_offset.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_ssll_overlap.c
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 748430194f3..2ee3b61c4fa 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -350,6 +350,7 @@ aarch64*-*-*)
>   cxx_target_objs="aarch64-c.o"
>   d_target_objs="aarch64-d.o"
>   extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o 
> aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o 
> aarch64-sve-builtins-sve2.o cortex-a57-fma-steering.o aarch64-speculation.o 
> falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o"
> + extra_objs="${extra_objs} aarch64-store-forwarding.o"
>   target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.cc 
> \$(srcdir)/config/aarch64/aarch64-sve-builtins.h 
> \$(srcdir)/config/aarch64/aarch64-sve-builtins.cc"
>   target_has_targetm_common=yes
>   ;;
> diff --git a/gcc/config/aarch64/aarch64-passes.def 
> b/gcc/config/aarch64/aarch64-passes.def
> index 6ace797b738..fa79e8adca8 100644
> --- a/gcc/config/aarch64/aarch64-passes.def
> +++ b/gcc/config/aarch64/aarch64-passes.def
> @@ -23,3 +23,4 @@ INSERT_PASS_BEFORE (pass_reorder_blocks, 1, 
> pass_track_speculation);
>  INSERT_PASS_AFTER (pass_machine_reorg, 1, pass_tag_collision_avoidance);
>  INSERT_PASS_BEFORE (pass_shorten_branches, 1, pass_insert_bti);
>  INSERT_PASS_AFTER (pass_if_after_combine, 1, pass_cc_fusion);
> +INSERT_PASS_AFTER (pass_peephole2, 1, pass_avoid_store_forwarding);
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index d2718cc87b3..7d9dfa06af9 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarc

Re: [PATCH] testsuite: Tweak some further tests for modern C changes

2023-12-01 Thread Florian Weimer
* Jakub Jelinek:

> Hi!
>
> On IRC Richi mentioned some FAILs in gcc.target/x86_64 and in pr83126.c.
>
> The following patch fixes the former ones (they need recent binutils to
> be enabled), for pr83126.c because I didn't have graphite configured I've
> just verified that the test compiles (didn't without the patch) and that
> the gimple dump is identical with one from yesterday's gcc (as it was a
> tree-parloops.cc ICE, I guess identical gimple is all we care about
> and no need to verify it further).
>
> Ok for trunk?
>
> 2023-12-01  Jakub Jelinek  
>
>   * gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_m512.c
>   (fun_check_passing_m512_8_values, fun_check_passing_m512h_8_values):
>   Add missing void return type.
>   * gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_m256.c
>   (fun_check_passing_m256_8_values, fun_check_passing_m256h_8_values):
>   Likewise.
>   * gcc.dg/graphite/pr83126.c (ew): Add missing casts to __INTPTR_TYPE__
>   and then to int *.

Looks fine.  Sorry, I totally forgot to upgrade binutils and install
isl.

Thanks,
Florian



Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-01 Thread Florian Weimer
* Thomas Schwinge:

> I'm not proposing a patch as I don't know whether you'd like to just
> silence the diagnostic, fix (?) the test case, and/or add another
> 'dg-error'-checking test case?  (I've not yet looked at the history of
> the test case.)

Jakub just posted a patch:

  [PATCH] testsuite: Tweak some further tests for modern C changes
  

Thanks,
Florian



Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-12-01 Thread Thomas Schwinge
Hi Florian!

On 2023-11-13T14:10:34+0100, Florian Weimer  wrote:
> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc

> @@ -7616,27 +7639,28 @@ convert_for_assignment (location_t location, 
> location_t expr_loc, tree type,

> case ic_assign:
> - pedwarn (location, OPT_Wint_conversion,
> -  "assignment to %qT from %qT makes pointer from integer "
> -  "without a cast", type, rhstype);
> + pedpermerror (location, OPT_Wint_conversion,
> +   "assignment to %qT from %qT makes pointer from "
> +   "integer without a cast", type, rhstype);
>   break;

In addition to the many ones that you've fixed, there's one more:

[-PASS:-]{+FAIL:+} gcc.dg/graphite/pr83126.c (test for excess errors)

[...]/testsuite/gcc.dg/graphite/pr83126.c: In function 'ew':
[...]/testsuite/gcc.dg/graphite/pr83126.c:15:10: error: assignment to 'int 
*' from 'int' makes pointer from integer without a cast [-Wint-conversion]

Presumably your testing didn't have Graphite enabled due to missing
host libisl?  However, you should be able to reproduce without that:

$ build-gcc/gcc/xgcc -Bbuild-gcc/gcc/ 
source-gcc/gcc/testsuite/gcc.dg/graphite/pr83126.c
source-gcc/gcc/testsuite/gcc.dg/graphite/pr83126.c: In function ‘ew’:
source-gcc/gcc/testsuite/gcc.dg/graphite/pr83126.c:15:10: error: assignment 
to ‘int *’ from ‘int’ makes pointer from integer without a cast 
[-Wint-conversion]
   15 |   fd = *fd;
  |  ^

I'm not proposing a patch as I don't know whether you'd like to just
silence the diagnostic, fix (?) the test case, and/or add another
'dg-error'-checking test case?  (I've not yet looked at the history of
the test case.)


Grüße
 Thomas


[PATCH] testsuite, arm: Fix up pr112337.c test

2023-12-01 Thread Saurabh Jha

Hey,

I introduced the test "gcc/testsuite/gcc.target/arm/mve/pr112337.c" in 
commit 2365aae84de030bbb006edac18c9314812fc657b. It contained an error that I 
unfortunately missed; this patch fixes the test.

Regression tested on arm-none-eabi with no regressions. The output of 
gcc/contrib/compare_tests is:

"""
Tests that now work, but didn't before (2 tests):

arm-eabi-aem/-marm/-march=armv7-a/-mfpu=vfpv3-d16/-mfloat-abi=softfp: 
gcc.target/arm/mve/pr112337.c (test for excess errors)
arm-eabi-aem/-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard:
 gcc.target/arm/mve/pr112337.c (test for excess errors)
"""

Ok for trunk? I don't have commit access so could someone please commit on my 
behalf?

Regards,
Saurabh

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/pr112337.c: Fix the testcase
From 2365aae84de030bbb006edac18c9314812fc657b Mon Sep 17 00:00:00 2001
From: Saurabh Jha 
Date: Tue, 28 Nov 2023 13:05:58 +
Subject: [PATCH] testsuite: Fix up pr112337.c test

---
 gcc/testsuite/gcc.target/arm/mve/pr112337.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/pr112337.c 
b/gcc/testsuite/gcc.target/arm/mve/pr112337.c
index 8f491990088..d1a075ecd0e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/pr112337.c
+++ b/gcc/testsuite/gcc.target/arm/mve/pr112337.c
@@ -5,8 +5,8 @@
 #include 
 
 void g(int32x4_t);
-void f(int, int, int, short, int *p) {
-  int *bias = p;
+void f(int, int, int, short, int32_t *p) {
+  int32_t *bias = p;
   for (;;) {
 int32x4_t d = vldrwq_s32 (p);
 bias += 4;
-- 
2.34.1



Re: [PATCH] testsuite: Tweak some further tests for modern C changes

2023-12-01 Thread Richard Biener
On Fri, 1 Dec 2023, Jakub Jelinek wrote:

> Hi!
> 
> On IRC Richi mentioned some FAILs in gcc.target/x86_64 and in pr83126.c.
> 
> The following patch fixes the former ones (they need recent binutils to
> be enabled), for pr83126.c because I didn't have graphite configured I've
> just verified that the test compiles (didn't without the patch) and that
> the gimple dump is identical with one from yesterday's gcc (as it was a
> tree-parloops.cc ICE, I guess identical gimple is all we care about
> and no need to verify it further).
> 
> Ok for trunk?

OK.

Richard.

> 2023-12-01  Jakub Jelinek  
> 
>   * gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_m512.c
>   (fun_check_passing_m512_8_values, fun_check_passing_m512h_8_values):
>   Add missing void return type.
>   * gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_m256.c
>   (fun_check_passing_m256_8_values, fun_check_passing_m256h_8_values):
>   Likewise.
>   * gcc.dg/graphite/pr83126.c (ew): Add missing casts to __INTPTR_TYPE__
>   and then to int *.
> 
> --- 
> gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_m512.c.jj   
> 2021-12-30 15:12:43.747143127 +0100
> +++ gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_m512.c  
> 2023-12-01 11:56:10.708574470 +0100
> @@ -25,6 +25,7 @@ int failed = 0;
>assert (memcmp (&X1, &X2, sizeof (T)) == 0); \
>  } while (0)
>  
> +void
>  fun_check_passing_m512_8_values (__m512 i0 ATTRIBUTE_UNUSED,
>__m512 i1 ATTRIBUTE_UNUSED,
>__m512 i2 ATTRIBUTE_UNUSED,
> @@ -45,6 +46,7 @@ fun_check_passing_m512_8_values (__m512
>compare (values.i7, i7, __m512);
>  }
>  
> +void
>  fun_check_passing_m512h_8_values (__m512h i0 ATTRIBUTE_UNUSED,
> __m512h i1 ATTRIBUTE_UNUSED,
> __m512h i2 ATTRIBUTE_UNUSED,
> --- 
> gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_m256.c.jj   
> 2021-12-30 15:12:43.746143141 +0100
> +++ gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_m256.c  
> 2023-12-01 11:55:56.770772491 +0100
> @@ -25,6 +25,7 @@ int failed = 0;
>assert (memcmp (&X1, &X2, sizeof (T)) == 0); \
>  } while (0)
>  
> +void
>  fun_check_passing_m256_8_values (__m256 i0 ATTRIBUTE_UNUSED,
>__m256 i1 ATTRIBUTE_UNUSED,
>__m256 i2 ATTRIBUTE_UNUSED,
> @@ -45,6 +46,7 @@ fun_check_passing_m256_8_values (__m256
>compare (values.i7, i7, __m256);
>  }
>  
> +void
>  fun_check_passing_m256h_8_values (__m256h i0 ATTRIBUTE_UNUSED,
> __m256h i1 ATTRIBUTE_UNUSED,
> __m256h i2 ATTRIBUTE_UNUSED,
> --- gcc/testsuite/gcc.dg/graphite/pr83126.c.jj2020-01-12 
> 11:54:37.438397944 +0100
> +++ gcc/testsuite/gcc.dg/graphite/pr83126.c   2023-12-01 12:20:42.045695863 
> +0100
> @@ -12,7 +12,7 @@ ew (unsigned short int c9, int stuff)
>int *fd = &stuff;
>  
>*fd = c9;
> -  fd = *fd;
> +  fd = (int *) (__INTPTR_TYPE__) *fd;
>if (*fd != 0)
>   for (*by = 0; *by < 2; ++*by)
> c9 *= e1;
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

