Re: [PATCH V2] RISC-V: Support combine cond extend and reduce sum to widen reduce sum

2023-09-20 Thread Lehua Ding

Hi Robin and Juzhe,

I changed back to the original method; please see V3 below:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631076.html

On 2023/9/20 17:51, Robin Dapp wrote:

So, IMHO, a complicated pattern which combines the initial 0 value + extension + 
reduction + vmerge may be more reasonable.


If that works I would also prefer that.

Regards
  Robin



--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai



[PATCH V3] RISC-V: Support combine cond extend and reduce sum to widen reduce sum

2023-09-20 Thread Lehua Ding
V3 Change: Back to the original method.

This patch supports combining cond extend and reduce_sum into cond widen reduce_sum,
for example combining the following three insns:
   (set (reg:RVVM2HI 149)
(if_then_else:RVVM2HI
  (unspec:RVVMF8BI [
(const_vector:RVVMF8BI repeat [
  (const_int 1 [0x1])
])
(reg:DI 146)
(const_int 2 [0x2]) repeated x2
(const_int 1 [0x1])
(reg:SI 66 vl)
(reg:SI 67 vtype)
  ] UNSPEC_VPREDICATE)
 (const_vector:RVVM2HI repeat [
   (const_int 0 [0])
 ])
 (unspec:RVVM2HI [
   (reg:SI 0 zero)
 ] UNSPEC_VUNDEF)))
  (set (reg:RVVM2HI 138)
(if_then_else:RVVM2HI
  (reg:RVVMF8BI 135)
  (reg:RVVM2HI 148)
  (reg:RVVM2HI 149)))
  (set (reg:HI 150)
(unspec:HI [
  (reg:RVVM2HI 138)
] UNSPEC_REDUC_SUM))
into one insn:
  (set (reg:SI 147)
(unspec:SI [
  (if_then_else:RVVM2SI
(reg:RVVMF16BI 135)
(sign_extend:RVVM2SI (reg:RVVM1HI 136))
(if_then_else:RVVM2HI
  (unspec:RVVMF8BI [
(const_vector:RVVMF8BI repeat [
  (const_int 1 [0x1])
])
(reg:DI 146)
(const_int 2 [0x2]) repeated x2
(const_int 1 [0x1])
(reg:SI 66 vl)
(reg:SI 67 vtype)
  ] UNSPEC_VPREDICATE)
 (const_vector:RVVM2HI repeat [
   (const_int 0 [0])
 ])
 (unspec:RVVM2HI [
   (reg:SI 0 zero)
 ] UNSPEC_VUNDEF)))
] UNSPEC_REDUC_SUM))

Consider the following C code:

int16_t foo (int8_t *restrict a, int8_t *restrict pred)
{
  int16_t sum = 0;
  for (int i = 0; i < 16; i += 1)
    if (pred[i])
      sum += a[i];
  return sum;
}

assembly before this patch:

foo:
vsetivli zero,16,e16,m2,ta,ma
li  a5,0
vmv.v.i v2,0
vsetvli zero,zero,e8,m1,ta,ma
vl1re8.v v0,0(a1)
vmsne.vi v0,v0,0
vsetvli zero,zero,e16,m2,ta,mu
vle8.v  v4,0(a0),v0.t
vmv.s.x v1,a5
vsext.vf2   v2,v4,v0.t
vredsum.vs  v2,v2,v1
vmv.x.s a0,v2
slliw   a0,a0,16
sraiw   a0,a0,16
ret

assembly after this patch:

foo:
li  a5,0
vsetivli zero,16,e16,m1,ta,ma
vmv.s.x v3,a5
vsetivli zero,16,e8,m1,ta,ma
vl1re8.v v0,0(a1)
vmsne.vi v0,v0,0
vle8.v  v2,0(a0),v0.t
vwredsum.vs v1,v2,v3,v0.t
vsetivli zero,0,e16,m1,ta,ma
vmv.x.s a0,v1
slliw   a0,a0,16
sraiw   a0,a0,16
ret
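
As a scalar reference for what the combined pattern is meant to compute, here is a minimal sketch that is not taken from the patch. Inactive lanes contribute nothing (equivalent to merging with a zero vector), and active lanes are sign-extended from 8 to 16 bits before being accumulated. The name cond_widen_sum is an illustrative assumption.

```c
#include <stdint.h>

/* Scalar model of a masked widening reduce-sum: each active 8-bit lane is
   sign-extended to 16 bits and accumulated.  Inactive lanes add nothing,
   which matches merging them with a zero vector before the reduction.
   cond_widen_sum is a hypothetical name used only for this sketch.  */
static int16_t
cond_widen_sum (const int8_t *a, const int8_t *pred, int n)
{
  int16_t sum = 0;
  for (int i = 0; i < n; i++)
    if (pred[i])
      sum = (int16_t) (sum + (int16_t) a[i]);
  return sum;
}
```

With a[i] = 100 + i for 16 elements and only the odd lanes active, the function returns 864, a value that does not fit in int8_t, which is why the reduction has to widen.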

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*cond_widen_reduc_plus_scal_):
New combine patterns.
* config/riscv/riscv-protos.h (enum insn_type): New insn_type.
(enum avl_type): New avl_type for VLS mode.
* config/riscv/riscv-v.cc: Add VLS avl_type for VLS mode.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c: New test.

---
 gcc/config/riscv/autovec-opt.md   | 72 +++
 gcc/config/riscv/riscv-protos.h   |  6 +-
 gcc/config/riscv/riscv-v.cc   |  9 ++-
 .../rvv/autovec/cond/cond_widen_reduc-1.c | 30 
 .../rvv/autovec/cond/cond_widen_reduc-2.c | 30 
 .../rvv/autovec/cond/cond_widen_reduc_run-1.c | 28 
 .../rvv/autovec/cond/cond_widen_reduc_run-2.c | 28 
 7 files changed, 198 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index a97a095691c..ed9c0777eb9 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1119,6 +1119,78 @@
   }
   [(set_attr "type" "vfwmuladd")])

+;; Combine mask_extend + vredsum to mask_vwredsum[u]
+;; where the merge of mask_extend is vector const 0
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+(unspec: [
+  (if_then_else:
+(match_operand: 1 "register_operand")
+(any_extend:
+  (match_operand:VI_QHS_NO_M8 2 "register_operand"))
+(if_then_else:
+  (unspec: [
+(match_operand: 3 "vector_all_trues_mask_operand")
+(match_operand 6 

Re: [PATCH] check undefine_p for one more vr

2023-09-20 Thread Jiufu Guo


Hi,

Richard Biener  writes:

>> Am 21.09.2023 um 05:10 schrieb Jiufu Guo :
>> 
>> Hi,
>> 
>> The root cause of PR111355 and PR111482 is a missing check for whether vr0
>> is undefined_p before calling vr0.lower_bound.
>> 
>> In the pattern "(X + C) / N",
>> 
>>(if (INTEGRAL_TYPE_P (type)
>> && get_range_query (cfun)->range_of_expr (vr0, @0))
>> (if (...) 
>>   (plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
>>   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0 ...
>>&& wi::geu_p (vr0.lower_bound (), -c))
>> 
>> In "(if (...)", there is code that guards against vr0 being undefined_p,
>> but in the "else" part, vr0's undefined_p is not checked before
>> "wi::geu_p (vr0.lower_bound (), -c)".
>> 
>> Bootstrap & regtest pass on ppc64{,le}.
>> Is this ok for trunk?
>
> Ok

Thanks! Committed via r14-4192.
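
The shape of the fix can be illustrated with a small sketch. This is not GCC's value_range class; struct range and sub_cannot_underflow are hypothetical names, and the only point is that the undefined check must come before any read of the lower bound.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a value range; GCC's real class is richer.  */
struct range
{
  bool undefined;   /* no known values, so the bounds are meaningless */
  uint32_t lo, hi;  /* valid only when !undefined */
};

/* Returns true when X - NEG_C cannot wrap below zero for every X in R,
   i.e. when the lower bound is at least NEG_C.  The undefined check is
   evaluated first, so the bounds are never read for an undefined range.  */
static bool
sub_cannot_underflow (const struct range *r, uint32_t neg_c)
{
  return !r->undefined && r->lo >= neg_c;
}
```

Without the guard, an undefined range with stale bounds could be accepted by accident; with it, the transformation is simply skipped.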

BR,
Jeff (Jiufu Guo)

>
> Richard 
>
>> BR,
>> Jeff (Jiufu Guo)
>> 
>> 
>>PR tree-optimization/111355
>> 
>> gcc/ChangeLog:
>> 
>>* match.pd ((X + C) / N): Update pattern.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>* gcc.dg/pr111355.c: New test.
>> 
>> ---
>> gcc/match.pd| 2 +-
>> gcc/testsuite/gcc.dg/pr111355.c | 8 
>> 2 files changed, 9 insertions(+), 1 deletion(-)
>> create mode 100644 gcc/testsuite/gcc.dg/pr111355.c
>> 
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 39c9c81966a..5fdfba14d47 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1033,7 +1033,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>  || (vr0.nonnegative_p () && vr3.nonnegative_p ())
>>  || (vr0.nonpositive_p () && vr3.nonpositive_p (
>>(plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
>> -   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0
>> +   (if (!vr0.undefined_p () && TYPE_UNSIGNED (type) && c.sign_mask () < 0
>>&& exact_mod (-c)
>>/* unsigned "X-(-C)" doesn't underflow.  */
>>&& wi::geu_p (vr0.lower_bound (), -c))
>> diff --git a/gcc/testsuite/gcc.dg/pr111355.c 
>> b/gcc/testsuite/gcc.dg/pr111355.c
>> new file mode 100644
>> index 000..8bacbc69d31
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr111355.c
>> @@ -0,0 +1,8 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3 -Wno-div-by-zero" } */
>> +
>> +/* Make sure no ICE. */
>> +int main() {
>> +  unsigned b;
>> +  return b ? 1 << --b / 0 : 0;
>> +}
>> -- 
>> 2.25.1
>> 


Re: [PATCH 1/2] using overflow_free_p to simplify pattern

2023-09-20 Thread Jiufu Guo


Hi,

Richard Biener  writes:

> On Tue, 19 Sep 2023, Jiufu Guo wrote:
>
>> Hi,
>> 
>> In r14-3582, an "overflow_free_p" interface is added.
>> The pattern of "(t * 2) / 2" in match.pd can be simplified
>> by using this interface.
>> 
>> Bootstrap & regtest pass on ppc64{,le} and x86_64.
>> Is this ok for trunk?
>> 
>> BR,
>> Jeff (Jiufu)
>> 
>> gcc/ChangeLog:
>> 
>>  * match.pd ((t * 2) / 2): Update to use overflow_free_p.
>> 
>> ---
>>  gcc/match.pd | 37 +++--
>>  1 file changed, 7 insertions(+), 30 deletions(-)
>> 
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 87edf0e75c3..8bba7056000 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -926,36 +926,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> (if (TYPE_OVERFLOW_UNDEFINED (type))
>>  @0
>>  #if GIMPLE
>> -(with
>> - {
>> -   bool overflowed = true;
>> -   value_range vr0, vr1;
>> -   if (INTEGRAL_TYPE_P (type)
>> -   && get_range_query (cfun)->range_of_expr (vr0, @0)
>> -   && get_range_query (cfun)->range_of_expr (vr1, @1)
>> -   && !vr0.varying_p () && !vr0.undefined_p ()
>> -   && !vr1.varying_p () && !vr1.undefined_p ())
>> - {
>> -   wide_int wmin0 = vr0.lower_bound ();
>> -   wide_int wmax0 = vr0.upper_bound ();
>> -   wide_int wmin1 = vr1.lower_bound ();
>> -   wide_int wmax1 = vr1.upper_bound ();
>> -   /* If the multiplication can't overflow/wrap around, then
>> -  it can be optimized too.  */
>> -   wi::overflow_type min_ovf, max_ovf;
>> -   wi::mul (wmin0, wmin1, TYPE_SIGN (type), &min_ovf);
>> -   wi::mul (wmax0, wmax1, TYPE_SIGN (type), &max_ovf);
>> -   if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
>> - {
>> -   wi::mul (wmin0, wmax1, TYPE_SIGN (type), &min_ovf);
>> -   wi::mul (wmax0, wmin1, TYPE_SIGN (type), &max_ovf);
>> -   if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
>> - overflowed = false;
>> - }
>> - }
>> - }
>> -(if (!overflowed)
>> - @0))
>> +(with {value_range vr0, vr1;}
>> + (if (INTEGRAL_TYPE_P (type)
>> +  && get_range_query (cfun)->range_of_expr (vr0, @0)
>> +  && get_range_query (cfun)->range_of_expr (vr1, @1)
>> +  && !vr0.varying_p () && !vr1.varying_p ()
>
> From your other uses checking !varying_p doesn't seem necessary?

Thanks for pointing this out!
Yes, !varying_p is not needed; overflow_free_p already covers it.

Committed via r14-4191.
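
For readers who want the idea behind the removed hand-written check, here is a rough sketch. It is not the implementation of overflow_free_p: struct irange_sketch and mul_overflow_free are hypothetical names, and __builtin_mul_overflow assumes GCC or Clang. The product of two intervals takes its extreme values at the corners, so testing the four corner products is sufficient.

```c
#include <stdbool.h>
#include <limits.h>

/* Hypothetical closed interval [lo, hi] of int values.  */
struct irange_sketch { int lo, hi; };

/* True when x * y cannot overflow int for any x in A and any y in B.
   The extreme products of two intervals occur at the corners, so the
   four corner products are the only ones that need checking.
   __builtin_mul_overflow is a GCC/Clang builtin.  */
static bool
mul_overflow_free (struct irange_sketch a, struct irange_sketch b)
{
  int tmp;
  return !__builtin_mul_overflow (a.lo, b.lo, &tmp)
         && !__builtin_mul_overflow (a.lo, b.hi, &tmp)
         && !__builtin_mul_overflow (a.hi, b.lo, &tmp)
         && !__builtin_mul_overflow (a.hi, b.hi, &tmp);
}
```

In this model a range spanning the whole type, multiplied by 2, fails the corner test on its own, which is consistent with a separate varying check being redundant.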

BR,
Jeff (Jiufu Guo)

>
> OK with omitting.
>
> Richard.
>
>> +  && range_op_handler (MULT_EXPR).overflow_free_p (vr0, vr1))
>> +  @0))
>>  #endif
>> 
>>  
>> 


[Bug tree-optimization/111355] [14 Regression] ICE on valid code at -O1 and above: in lower_bound, at value-range.h:1078

2023-09-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111355

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Jiu Fu Guo :

https://gcc.gnu.org/g:4d80863d7f93c0a839d1fe5dc59be83153e89110

commit r14-4192-g4d80863d7f93c0a839d1fe5dc59be83153e89110
Author: Jiufu Guo 
Date:   Wed Sep 20 11:11:58 2023 +0800

check undefine_p for one more vr

The root cause of PR111355 and PR111482 is a missing check for whether vr0
is undefined_p before calling vr0.lower_bound.

In the pattern "(X + C) / N",

(if (INTEGRAL_TYPE_P (type)
 && get_range_query (cfun)->range_of_expr (vr0, @0))
 (if (...)
   (plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0 ...
&& wi::geu_p (vr0.lower_bound (), -c))

In "(if (...)", there is code that guards against vr0 being undefined_p,
but in the "else" part, vr0's undefined_p is not checked before
"wi::geu_p (vr0.lower_bound (), -c)".

PR tree-optimization/111355

gcc/ChangeLog:

* match.pd ((X + C) / N): Update pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/pr111355.c: New test.

[Bug tree-optimization/111495] [14 regression] ICE in lower_bound, at value-range.h:1078 when building LLVM 17.0.1

2023-09-20 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111495

--- Comment #1 from Sam James  ---
Created attachment 55956
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55956&action=edit
reduced.ii

Attached minimised reproducer fails with -O3, works with -O2, per original.

Re: [PATCH] check undefine_p for one more vr

2023-09-20 Thread Richard Biener



> Am 21.09.2023 um 05:10 schrieb Jiufu Guo :
> 
> Hi,
> 
> The root cause of PR111355 and PR111482 is a missing check for whether vr0
> is undefined_p before calling vr0.lower_bound.
> 
> In the pattern "(X + C) / N",
> 
>(if (INTEGRAL_TYPE_P (type)
> && get_range_query (cfun)->range_of_expr (vr0, @0))
> (if (...) 
>   (plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
>   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0 ...
>&& wi::geu_p (vr0.lower_bound (), -c))
> 
> In "(if (...)", there is code that guards against vr0 being undefined_p,
> but in the "else" part, vr0's undefined_p is not checked before
> "wi::geu_p (vr0.lower_bound (), -c)".
> 
> Bootstrap & regtest pass on ppc64{,le}.
> Is this ok for trunk?

Ok

Richard 

> BR,
> Jeff (Jiufu Guo)
> 
> 
>PR tree-optimization/111355
> 
> gcc/ChangeLog:
> 
>* match.pd ((X + C) / N): Update pattern.
> 
> gcc/testsuite/ChangeLog:
> 
>* gcc.dg/pr111355.c: New test.
> 
> ---
> gcc/match.pd| 2 +-
> gcc/testsuite/gcc.dg/pr111355.c | 8 
> 2 files changed, 9 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.dg/pr111355.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 39c9c81966a..5fdfba14d47 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1033,7 +1033,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  || (vr0.nonnegative_p () && vr3.nonnegative_p ())
>  || (vr0.nonpositive_p () && vr3.nonpositive_p (
>(plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
> -   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0
> +   (if (!vr0.undefined_p () && TYPE_UNSIGNED (type) && c.sign_mask () < 0
>&& exact_mod (-c)
>/* unsigned "X-(-C)" doesn't underflow.  */
>&& wi::geu_p (vr0.lower_bound (), -c))
> diff --git a/gcc/testsuite/gcc.dg/pr111355.c b/gcc/testsuite/gcc.dg/pr111355.c
> new file mode 100644
> index 000..8bacbc69d31
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr111355.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -Wno-div-by-zero" } */
> +
> +/* Make sure no ICE. */
> +int main() {
> +  unsigned b;
> +  return b ? 1 << --b / 0 : 0;
> +}
> -- 
> 2.25.1
> 


Re: Attempt to fix g++.dg tests failures in gnu-versioned-namespace mode

2023-09-20 Thread François Dumont via Gcc
Thanks for the feedback, but that seems a little too complicated for what I'm 
trying to achieve.


For now I'm testing this patch, which is also how libstdc++ 
handles this small problem.


I'll submit a proper patch once I've confirmed that the tests are fixed.

François

On 20/09/2023 09:22, Thomas Schwinge wrote:

Hi!

On 2023-09-20T07:08:25+0200, François Dumont via Gcc  wrote:

I've configured libstdc++ with --enable-symvers=gnu-versioned-namespace

I can't comment on that option...


and run make check-c++.

A number of failures are like this one:

/home/fdumont/dev/gcc/git/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C:
In function 'coro1 f()':
/home/fdumont/dev/gcc/git/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C:9:1:
error: 'operator new' is provided by
'std::__8::__n4861::__coroutine_traits_impl::promise_type'
{aka 'coro1::promise_type'} but is not usable with the function signature
'coro1 f()'
compiler exited with status 1
FAIL: g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C  (test for
errors, line 9)
FAIL: g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C (test for excess
errors)
Excess errors:
/home/fdumont/dev/gcc/git/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C:9:1:
error: 'operator new' is provided by
'std::__n4861::__coroutine_traits_impl::promise_type' {aka
'coro1::promise_type'} but is not usable with the function signature
'coro1 f()'

The '__8' is messing with expected output.

So I've added:

  # Ignore optional version namespace from libstdc++.
  regsub -all "std::__8::" $text "std::" text

..., and whether that's conceptually the correct solution...


in testsuite/lib/prune.exp prune_gcc_output.

But it had no impact, same failures.

What am I missing ?

..., but I can answer that one: pruning happens after scanning for
'dg-error' etc. (..., which are captured in 'dg-messages').  See DejaGnu
'dg.exp:dg-test':

 [...]
 set results [${tool}-dg-test $prog [lindex ${dg-do-what} 0] "$tool_flags ${dg-extra-tool-flags}"]

 set comp_output [lindex $results 0]
 set output_file [lindex $results 1]

 foreach i ${dg-messages} {
 verbose "Scanning for message: $i" 4

 # Remove all error messages for the line [lindex $i 0]
 # in the source file.  If we find any, success!
 [...]
 }

 # Remove messages from the tool that we can ignore.
 set comp_output [prune_warnings $comp_output]
 [...]
 if {$comp_output ne ""} {
 fail "$name (test for excess errors)"
 send_log "Excess errors:\n$comp_output\n"
 } else {
 pass "$name (test for excess errors)"
 }
 [...]

So you'll have to have your 's%std::__8::%std::' work on 'dg-messages', I
suppose?


Grüße
  Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
diff --git a/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C b/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C
index 4706deebf4e..928e0c974e1 100644
--- a/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C
+++ b/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-00-bad-op-new.C
@@ -6,7 +6,7 @@
 #include "coro1-allocators.h"
 
 struct coro1
-f ()  /* { dg-error {'operator new' is provided by 'std::__n4861::__coroutine_traits_impl::promise_type' \{aka 'coro1::promise_type'\} but is not usable with the function signature 'coro1 f\(\)'} } */
+f ()  /* { dg-error {'operator new' is provided by 'std::(__8::)?__n4861::__coroutine_traits_impl::promise_type' \{aka 'coro1::promise_type'\} but is not usable with the function signature 'coro1 f\(\)'} } */
 {
   co_return;
 }
diff --git a/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-01-bad-op-del.C b/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-01-bad-op-del.C
index 252cb5e442c..fc2afcf5e0e 100644
--- a/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-01-bad-op-del.C
+++ b/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-01-bad-op-del.C
@@ -6,7 +6,7 @@
 #include "coro1-allocators.h"
 
 struct coro1
-f ()  /* { dg-error {'operator delete' is provided by 'std::__n4861::__coroutine_traits_impl::promise_type' \{aka 'coro1::promise_type'\} but is not usable with the function signature 'coro1 f\(\)'} } */
+f ()  /* { dg-error {'operator delete' is provided by 'std::(__8::)?__n4861::__coroutine_traits_impl::promise_type' \{aka 'coro1::promise_type'\} but is not usable with the function signature 'coro1 f\(\)'} } */
 {
   co_return;
 }
diff --git a/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-02-no-op-new-nt.C b/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-02-no-op-new-nt.C
index 89972b60945..0a545fed0e3 100644
--- a/gcc/testsuite/g++.dg/coroutines/coro-bad-alloc-02-no-op-new-nt.C
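
The optional group used in the first hunk, "std::(__8::)?__n4861::", is meant to accept diagnostics with or without the inline version namespace. The sketch below checks that behaviour with POSIX extended regular expressions. That is an assumption made for illustration only, since DejaGnu matches with Tcl regexps, and mentions_n4861 is a hypothetical helper name.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when TEXT names the coroutine traits namespace, with or
   without the inline version namespace "__8::".  POSIX extended regexps
   are used here only as a stand-in for the Tcl regexps DejaGnu uses.  */
static bool
mentions_n4861 (const char *text)
{
  regex_t re;
  if (regcomp (&re, "std::(__8::)?__n4861::", REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  bool found = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}
```

Note that the "::" has to sit inside the optional group; a pattern such as "std::(__8)?__n4861" would not match "std::__8::__n4861".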

Re: [PATCH][_GLIBCXX_INLINE_VERSION] Fix

2023-09-20 Thread François Dumont

Tests were successful, ok to commit ?

On 20/09/2023 19:51, François Dumont wrote:
libstdc++: [_GLIBCXX_INLINE_VERSION] Add handle_contract_violation symbol alias

libstdc++-v3/ChangeLog:

    * src/experimental/contract.cc
    [_GLIBCXX_INLINE_VERSION](handle_contract_violation): Provide symbol alias
    without version namespace decoration for gcc.

Here is what I'm testing eventually, ok to commit if successful ?

François

On 20/09/2023 11:32, Jonathan Wakely wrote:

On Wed, 20 Sept 2023 at 05:51, François Dumont via Libstdc++
 wrote:

libstdc++: Remove std::constract_violation from versioned namespace

Spelling mistake in contract_violation, and it's not
std::contract_violation, it's std::experimental::contract_violation


GCC expects this type to be in std namespace directly.

Again, it's in std::experimental not in std directly.

Will this change cause problems when including another experimental
header, which does put experimental below std::__8?

I think std::__8::experimental and std::experimental will become 
ambiguous.


Maybe we do want to remove the inline __8 namespace from all
experimental headers. That needs a bit more thought though.


libstdc++-v3/ChangeLog:

  * include/experimental/contract:
  Remove 
_GLIBCXX_BEGIN_NAMESPACE_VERSION/_GLIBCXX_END_NAMESPACE_VERSION.

This line is too long for the changelog.


It does fix 29 g++.dg/contracts in gcc testsuite.

Ok to commit ?

François


[Bug modula2/111510] New: Modula-2 runtime ICE on arm-linux-gnueabihf: iso/RTentity.mod:245:in findChildAndParent has caused internal runtime error, RTentity is either corrupt or the module storage ha

2023-09-20 Thread doko at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111510

Bug ID: 111510
   Summary: Modula-2 runtime ICE on arm-linux-gnueabihf:
iso/RTentity.mod:245:in findChildAndParent has caused
internal runtime error, RTentity is either corrupt or
the module storage has not been initialized yet
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: modula2
  Assignee: gaius at gcc dot gnu.org
  Reporter: doko at gcc dot gnu.org
  Target Milestone: ---

Seen with the current gcc-13 branch; this used to work with GCC 12 (although
M2 was not yet merged at that time). Running a simple HelloWorld.mod on
arm-linux-gnueabihf fails with:

2746s autopkgtest [17:26:10]: test libgm2-link: [---
2748s build: OK
2748s   libm2cor.so.18 => /lib/arm-linux-gnueabihf/libm2cor.so.18 (0xf7eec000)
2748s   libm2pim.so.18 => /lib/arm-linux-gnueabihf/libm2pim.so.18 (0xf7ec8000)
2748s   libm2iso.so.18 => /lib/arm-linux-gnueabihf/libm2iso.so.18 (0xf7ea1000)
2748s   libstdc++.so.6 => /lib/arm-linux-gnueabihf/libstdc++.so.6 (0xf7cf6000)
2748s   libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0xf7cdc000)
2748s   libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0xf7bac000)
2748s   /lib/ld-linux-armhf.so.3 (0xf7eff000)
2748s   libm.so.6 => /lib/arm-linux-gnueabihf/libm.so.6 (0xf7b6a000)
2748s
../../../../src/libgm2/libm2iso/../../gcc/m2/gm2-libs-iso/RTentity.mod:245:in
findChildAndParent has caused internal runtime error, RTentity is either
corrupt or the module storage has not been initialized yet
2749s autopkgtest [17:26:13]: test libgm2-link: ---]

Re: Re: [PATCH] RISC-V: Optimized for strided load/store with stride == element width[PR111450]

2023-09-20 Thread Li Xu
Committed, thanks Juzhe.
--
Li Xu
>Thanks a lot. LGTM.
>
>
>
>juzhe.zh...@rivai.ai
>
>From: Li Xu
>Date: 2023-09-21 11:12
>To: gcc-patches
>CC: kito.cheng; palmer; juzhe.zhong; xuli
>Subject: [PATCH] RISC-V: Optimized for strided load/store with stride == 
>element width[PR111450]
>From: xuli 
>
>When stride == element width, vlsse should be optimized into vle.v.
>vsse should be optimized into vse.v.
>
>PR target/111450
>
>gcc/ChangeLog:
>
>* config/riscv/constraints.md (c01): const_int 1.
>(c02): const_int 2.
>(c04): const_int 4.
>(c08): const_int 8.
>* config/riscv/predicates.md (vector_eew8_stride_operand): New predicate for 
>stride operand.
>(vector_eew16_stride_operand): Ditto.
>(vector_eew32_stride_operand): Ditto.
>(vector_eew64_stride_operand): Ditto.
>* config/riscv/vector-iterators.md: New iterator for stride operand.
>* config/riscv/vector.md: Add stride = element width constraint.
>
>gcc/testsuite/ChangeLog:
>
>* gcc.target/riscv/rvv/base/pr111450.c: New test.
>---
>gcc/config/riscv/constraints.md   |  20 
>gcc/config/riscv/predicates.md    |  18 
>gcc/config/riscv/vector-iterators.md  |  87 +++
>gcc/config/riscv/vector.md    |  42 +---
>.../gcc.target/riscv/rvv/base/pr111450.c  | 100 ++
>5 files changed, 250 insertions(+), 17 deletions(-)
>create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111450.c
>
>diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
>index 3f52bc76f67..964fdd450c9 100644
>--- a/gcc/config/riscv/constraints.md
>+++ b/gcc/config/riscv/constraints.md
>@@ -45,6 +45,26 @@
>   (and (match_code "const_int")
>    (match_test "ival == 0")))
>+(define_constraint "c01"
>+  "Constant value 1."
>+  (and (match_code "const_int")
>+   (match_test "ival == 1")))
>+
>+(define_constraint "c02"
>+  "Constant value 2"
>+  (and (match_code "const_int")
>+   (match_test "ival == 2")))
>+
>+(define_constraint "c04"
>+  "Constant value 4"
>+  (and (match_code "const_int")
>+   (match_test "ival == 4")))
>+
>+(define_constraint "c08"
>+  "Constant value 8"
>+  (and (match_code "const_int")
>+   (match_test "ival == 8")))
>+
>(define_constraint "K"
>   "A 5-bit unsigned immediate for CSR access instructions."
>   (and (match_code "const_int")
>diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
>index 4bc7ff2c9d8..7845998e430 100644
>--- a/gcc/config/riscv/predicates.md
>+++ b/gcc/config/riscv/predicates.md
>@@ -514,6 +514,24 @@
>   (ior (match_operand 0 "const_0_operand")
>    (match_operand 0 "pmode_register_operand")))
>+;; [1, 2, 4, 8] means strided load/store with stride == element width
>+(define_special_predicate "vector_eew8_stride_operand"
>+  (ior (match_operand 0 "pmode_register_operand")
>+   (and (match_code "const_int")
>+    (match_test "INTVAL (op) == 1 || INTVAL (op) == 0"))))
>+(define_special_predicate "vector_eew16_stride_operand"
>+  (ior (match_operand 0 "pmode_register_operand")
>+   (and (match_code "const_int")
>+    (match_test "INTVAL (op) == 2 || INTVAL (op) == 0"))))
>+(define_special_predicate "vector_eew32_stride_operand"
>+  (ior (match_operand 0 "pmode_register_operand")
>+   (and (match_code "const_int")
>+    (match_test "INTVAL (op) == 4 || INTVAL (op) == 0"))))
>+(define_special_predicate "vector_eew64_stride_operand"
>+  (ior (match_operand 0 "pmode_register_operand")
>+   (and (match_code "const_int")
>+    (match_test "INTVAL (op) == 8 || INTVAL (op) == 0"))))
>+
>;; A special predicate that doesn't match a particular mode.
>(define_special_predicate "vector_any_register_operand"
>   (match_code "reg"))
>diff --git a/gcc/config/riscv/vector-iterators.md 
>b/gcc/config/riscv/vector-iterators.md
>index 73df55a69c8..f85d1cc80d1 100644
>--- a/gcc/config/riscv/vector-iterators.md
>+++ b/gcc/config/riscv/vector-iterators.md
>@@ -2596,6 +2596,93 @@
>   (V512DI "V512BI")
>])
>+(define_mode_attr stride_predicate [
>+  (RVVM8QI "vector_eew8_stride_operand") (RVVM4QI 
>"vector_eew8_stride_operand")
>+  (RVVM2QI "vector_eew8_stride_operand") (RVVM1QI 
>"vector_eew8_stride_operand")
>+  (RVVMF2QI "vector_eew8_stride_operand") (RVVMF4QI 
>"vector_eew8_stride_operand")
>+  (RVVMF8QI "vector_eew8_stride_operand")
>+
>+  (RVVM8HI "vector_eew16_stride_operand") (RVVM4HI 
>"vector_eew16_stride_operand")
>+  (RVVM2HI "vector_eew16_stride_operand") (RVVM1HI 
>"vector_eew16_stride_operand")
>+  (RVVMF2HI "vector_eew16_stride_operand") (RVVMF4HI 
>"vector_eew16_stride_operand")
>+
>+  (RVVM8HF "vector_eew16_stride_operand") (RVVM4HF 
>"vector_eew16_stride_operand")
>+  (RVVM2HF "vector_eew16_stride_operand") (RVVM1HF 
>"vector_eew16_stride_operand")
>+  (RVVMF2HF "vector_eew16_stride_operand") (RVVMF4HF 
>"vector_eew16_stride_operand")
>+
>+  (RVVM8SI "vector_eew32_stride_operand") (RVVM4SI 
>"vector_eew32_stride_operand")
>+  (RVVM2SI 

[Bug target/111450] RISC-V: Missed optimized for strided load/store with stride = element width

2023-09-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111450

--- Comment #1 from CVS Commits  ---
The master branch has been updated by Li Xu :

https://gcc.gnu.org/g:47065ff360292c683670efb96df4b61f57dc1d9a

commit r14-4190-g47065ff360292c683670efb96df4b61f57dc1d9a
Author: xuli 
Date:   Thu Sep 21 03:04:56 2023 +

RISC-V: Optimized for strided load/store with stride == element
width[PR111450]

When stride == element width, vlsse should be optimized into vle.v.
vsse should be optimized into vse.v.

PR target/111450

gcc/ChangeLog:

* config/riscv/constraints.md (c01): const_int 1.
(c02): const_int 2.
(c04): const_int 4.
(c08): const_int 8.
* config/riscv/predicates.md (vector_eew8_stride_operand): New
predicate for stride operand.
(vector_eew16_stride_operand): Ditto.
(vector_eew32_stride_operand): Ditto.
(vector_eew64_stride_operand): Ditto.
* config/riscv/vector-iterators.md: New iterator for stride
operand.
* config/riscv/vector.md: Add stride = element width constraint.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111450.c: New test.
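
The equivalence this optimization relies on can be shown with a scalar model. The sketch below is an illustration under assumed names (strided_load_i32 and unit_stride_load_i32 are hypothetical), not code from the patch: when the byte stride equals the element size, a strided load reads exactly the same bytes as a contiguous load.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a strided load: element i is read from
   base + i * stride_bytes.  */
static void
strided_load_i32 (int32_t *dst, const unsigned char *base,
                  ptrdiff_t stride_bytes, size_t n)
{
  for (size_t i = 0; i < n; i++)
    memcpy (&dst[i], base + (ptrdiff_t) i * stride_bytes, sizeof (int32_t));
}

/* Scalar model of a unit-stride load: n consecutive elements.  */
static void
unit_stride_load_i32 (int32_t *dst, const unsigned char *base, size_t n)
{
  memcpy (dst, base, n * sizeof (int32_t));
}
```

Calling strided_load_i32 with stride_bytes equal to sizeof (int32_t) fills dst with the same values as unit_stride_load_i32, which is the case where the simpler unit-stride instruction can be used instead.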

Re: [PATCH] RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more generic names

2023-09-20 Thread Lehua Ding

Committed, thanks Juzhe.

On 2023/9/21 11:45, juzhe.zh...@rivai.ai wrote:

LGTM


juzhe.zh...@rivai.ai

*From:* Lehua Ding
*Date:* 2023-09-21 11:44
*To:* gcc-patches
*CC:* juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding
*Subject:* [PATCH] RISC-V: Rename predicate
vector_gs_scale_operand_16/32 to more generic names
This small patch renames vector_gs_scale_operand_16/32 to the more generic names
const_1_or_2/4_operand, so they are easier to understand when reused
elsewhere.
gcc/ChangeLog:
* config/riscv/predicates.md (const_1_or_2_operand): Rename.
(const_1_or_4_operand): Ditto.
(vector_gs_scale_operand_16): Ditto.
(vector_gs_scale_operand_32): Ditto.
* config/riscv/vector-iterators.md: Adjust.
---
gcc/config/riscv/predicates.md   | 16 
gcc/config/riscv/vector-iterators.md | 16 
2 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/gcc/config/riscv/predicates.md
b/gcc/config/riscv/predicates.md
index 4bc7ff2c9d8..a4f03242f2c 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -70,6 +70,14 @@
    (and (match_code "const_int,const_wide_int,const_vector")
     (match_test "op == CONST1_RTX (GET_MODE (op))")))
+(define_predicate "const_1_or_2_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
+
+(define_predicate "const_1_or_4_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
+
(define_predicate "reg_or_0_operand"
    (ior (match_operand 0 "const_0_operand")
     (match_operand 0 "register_operand")))
@@ -463,14 +471,6 @@
    (ior (match_operand 0 "register_operand")
     (match_code "const_vector")))
-(define_predicate "vector_gs_scale_operand_16"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
-
-(define_predicate "vector_gs_scale_operand_32"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
-
(define_predicate "vector_gs_scale_operand_64"
    (and (match_code "const_int")
     (match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode
== DImode)")))
diff --git a/gcc/config/riscv/vector-iterators.md
b/gcc/config/riscv/vector-iterators.md
index 053d84c0c7d..a32d7e8d4e9 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -2723,18 +2723,18 @@
    (RVVMF4QI "const_1_operand") (RVVMF8QI "const_1_operand")
    (RVVM8HI "const_1_operand") (RVVM4HI
"vector_gs_scale_operand_16_rv32")
-  (RVVM2HI "vector_gs_scale_operand_16") (RVVM1HI
"vector_gs_scale_operand_16")
-  (RVVMF2HI "vector_gs_scale_operand_16") (RVVMF4HI
"vector_gs_scale_operand_16")
+  (RVVM2HI "const_1_or_2_operand") (RVVM1HI "const_1_or_2_operand")
+  (RVVMF2HI "const_1_or_2_operand") (RVVMF4HI "const_1_or_2_operand")
    (RVVM8HF "const_1_operand") (RVVM4HF
"vector_gs_scale_operand_16_rv32")
-  (RVVM2HF "vector_gs_scale_operand_16") (RVVM1HF
"vector_gs_scale_operand_16")
-  (RVVMF2HF "vector_gs_scale_operand_16") (RVVMF4HF
"vector_gs_scale_operand_16")
+  (RVVM2HF "const_1_or_2_operand") (RVVM1HF "const_1_or_2_operand")
+  (RVVMF2HF "const_1_or_2_operand") (RVVMF4HF "const_1_or_2_operand")
-  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI
"vector_gs_scale_operand_32") (RVVM2SI "vector_gs_scale_operand_32")
-  (RVVM1SI "vector_gs_scale_operand_32") (RVVMF2SI
"vector_gs_scale_operand_32")
+  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI
"const_1_or_4_operand") (RVVM2SI "const_1_or_4_operand")
+  (RVVM1SI "const_1_or_4_operand") (RVVMF2SI "const_1_or_4_operand")
-  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF
"vector_gs_scale_operand_32") (RVVM2SF "vector_gs_scale_operand_32")
-  (RVVM1SF "vector_gs_scale_operand_32") (RVVMF2SF
"vector_gs_scale_operand_32")
+  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF
"const_1_or_4_operand") (RVVM2SF "const_1_or_4_operand")
+  (RVVM1SF "const_1_or_4_operand") (RVVMF2SF "const_1_or_4_operand")
    (RVVM8DI "vector_gs_scale_operand_64") (RVVM4DI
"vector_gs_scale_operand_64")
    (RVVM2DI "vector_gs_scale_operand_64") (RVVM1DI
"vector_gs_scale_operand_64")
--
2.36.3



--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai



Re: [PATCH] RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more generic names

2023-09-20 Thread juzhe.zh...@rivai.ai
LGTM



juzhe.zh...@rivai.ai
 
From: Lehua Ding
Date: 2023-09-21 11:44
To: gcc-patches
CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding
Subject: [PATCH] RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more generic names


[PATCH] RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more generic names

2023-09-20 Thread Lehua Ding
This patch renames vector_gs_scale_operand_16/32 to the more generic names
const_1_or_2/4_operand, so that they are easier to understand when reused
elsewhere.

gcc/ChangeLog:

* config/riscv/predicates.md (const_1_or_2_operand): Rename.
(const_1_or_4_operand): Ditto.
(vector_gs_scale_operand_16): Ditto.
(vector_gs_scale_operand_32): Ditto.
* config/riscv/vector-iterators.md: Adjust.

---
 gcc/config/riscv/predicates.md   | 16 
 gcc/config/riscv/vector-iterators.md | 16 
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 4bc7ff2c9d8..a4f03242f2c 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -70,6 +70,14 @@
   (and (match_code "const_int,const_wide_int,const_vector")
(match_test "op == CONST1_RTX (GET_MODE (op))")))

+(define_predicate "const_1_or_2_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
+
+(define_predicate "const_1_or_4_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
+
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "const_0_operand")
(match_operand 0 "register_operand")))
@@ -463,14 +471,6 @@
   (ior (match_operand 0 "register_operand")
(match_code "const_vector")))

-(define_predicate "vector_gs_scale_operand_16"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
-
-(define_predicate "vector_gs_scale_operand_32"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
-
 (define_predicate "vector_gs_scale_operand_64"
   (and (match_code "const_int")
(match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode == DImode)")))
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 053d84c0c7d..a32d7e8d4e9 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -2723,18 +2723,18 @@
   (RVVMF4QI "const_1_operand") (RVVMF8QI "const_1_operand")

   (RVVM8HI "const_1_operand") (RVVM4HI "vector_gs_scale_operand_16_rv32")
-  (RVVM2HI "vector_gs_scale_operand_16") (RVVM1HI "vector_gs_scale_operand_16")
-  (RVVMF2HI "vector_gs_scale_operand_16") (RVVMF4HI "vector_gs_scale_operand_16")
+  (RVVM2HI "const_1_or_2_operand") (RVVM1HI "const_1_or_2_operand")
+  (RVVMF2HI "const_1_or_2_operand") (RVVMF4HI "const_1_or_2_operand")

   (RVVM8HF "const_1_operand") (RVVM4HF "vector_gs_scale_operand_16_rv32")
-  (RVVM2HF "vector_gs_scale_operand_16") (RVVM1HF "vector_gs_scale_operand_16")
-  (RVVMF2HF "vector_gs_scale_operand_16") (RVVMF4HF "vector_gs_scale_operand_16")
+  (RVVM2HF "const_1_or_2_operand") (RVVM1HF "const_1_or_2_operand")
+  (RVVMF2HF "const_1_or_2_operand") (RVVMF4HF "const_1_or_2_operand")

-  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI "vector_gs_scale_operand_32") (RVVM2SI "vector_gs_scale_operand_32")
-  (RVVM1SI "vector_gs_scale_operand_32") (RVVMF2SI "vector_gs_scale_operand_32")
+  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI "const_1_or_4_operand") (RVVM2SI "const_1_or_4_operand")
+  (RVVM1SI "const_1_or_4_operand") (RVVMF2SI "const_1_or_4_operand")

-  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF "vector_gs_scale_operand_32") (RVVM2SF "vector_gs_scale_operand_32")
-  (RVVM1SF "vector_gs_scale_operand_32") (RVVMF2SF "vector_gs_scale_operand_32")
+  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF "const_1_or_4_operand") (RVVM2SF "const_1_or_4_operand")
+  (RVVM1SF "const_1_or_4_operand") (RVVMF2SF "const_1_or_4_operand")

   (RVVM8DI "vector_gs_scale_operand_64") (RVVM4DI "vector_gs_scale_operand_64")
   (RVVM2DI "vector_gs_scale_operand_64") (RVVM1DI "vector_gs_scale_operand_64")
--
2.36.3
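
For readers less familiar with .md predicates, the two renamed predicates can be read as the plain C checks below. This is only an illustrative sketch: the function names are hypothetical C stand-ins rather than GCC code, and they assume the operand has already been reduced to an integer value (the real predicates additionally require a const_int rtx).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C mirror of (define_predicate "const_1_or_2_operand"):
   accepts only the constants 1 and 2.  */
bool const_1_or_2 (int64_t v)
{
  return v == 1 || v == 2;
}

/* Hypothetical C mirror of (define_predicate "const_1_or_4_operand"):
   accepts only the constants 1 and 4.  */
bool const_1_or_4 (int64_t v)
{
  return v == 1 || v == 4;
}

/* Sanity checks for the values named in the two predicates.  */
void check_scale_predicates (void)
{
  assert (const_1_or_2 (1) && const_1_or_2 (2) && !const_1_or_2 (4));
  assert (const_1_or_4 (1) && const_1_or_4 (4) && !const_1_or_4 (2));
}
```

The new names describe the accepted constants directly, which is presumably why they read better outside the gather/scatter scale context.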



Re: [PATCH] RISC-V: Optimized for strided load/store with stride == element width[PR111450]

2023-09-20 Thread juzhe.zh...@rivai.ai
Thanks a lot. LGTM.



juzhe.zh...@rivai.ai
 
From: Li Xu
Date: 2023-09-21 11:12
To: gcc-patches
CC: kito.cheng; palmer; juzhe.zhong; xuli
Subject: [PATCH] RISC-V: Optimized for strided load/store with stride == element width[PR111450]

[PATCH] RISC-V: Optimized for strided load/store with stride == element width[PR111450]

2023-09-20 Thread Li Xu
From: xuli 

When the stride equals the element width, vlse should be optimized into vle.v
and vsse should be optimized into vse.v.

PR target/111450

gcc/ChangeLog:

* config/riscv/constraints.md (c01): const_int 1.
(c02): const_int 2.
(c04): const_int 4.
(c08): const_int 8.
* config/riscv/predicates.md (vector_eew8_stride_operand): New 
predicate for stride operand.
(vector_eew16_stride_operand): Ditto.
(vector_eew32_stride_operand): Ditto.
(vector_eew64_stride_operand): Ditto.
* config/riscv/vector-iterators.md: New iterator for stride operand.
* config/riscv/vector.md: Add stride = element width constraint.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111450.c: New test.
---
 gcc/config/riscv/constraints.md   |  20 
 gcc/config/riscv/predicates.md|  18 
 gcc/config/riscv/vector-iterators.md  |  87 +++
 gcc/config/riscv/vector.md|  42 +---
 .../gcc.target/riscv/rvv/base/pr111450.c  | 100 ++
 5 files changed, 250 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111450.c

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 3f52bc76f67..964fdd450c9 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -45,6 +45,26 @@
   (and (match_code "const_int")
(match_test "ival == 0")))
 
+(define_constraint "c01"
+  "Constant value 1."
+  (and (match_code "const_int")
+   (match_test "ival == 1")))
+
+(define_constraint "c02"
+  "Constant value 2"
+  (and (match_code "const_int")
+   (match_test "ival == 2")))
+
+(define_constraint "c04"
+  "Constant value 4"
+  (and (match_code "const_int")
+   (match_test "ival == 4")))
+
+(define_constraint "c08"
+  "Constant value 8"
+  (and (match_code "const_int")
+   (match_test "ival == 8")))
+
 (define_constraint "K"
   "A 5-bit unsigned immediate for CSR access instructions."
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 4bc7ff2c9d8..7845998e430 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -514,6 +514,24 @@
   (ior (match_operand 0 "const_0_operand")
(match_operand 0 "pmode_register_operand")))
 
+;; [1, 2, 4, 8] means strided load/store with stride == element width
+(define_special_predicate "vector_eew8_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 1 || INTVAL (op) == 0"))))
+(define_special_predicate "vector_eew16_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 2 || INTVAL (op) == 0"))))
+(define_special_predicate "vector_eew32_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 4 || INTVAL (op) == 0"))))
+(define_special_predicate "vector_eew64_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 8 || INTVAL (op) == 0"))))
+
 ;; A special predicate that doesn't match a particular mode.
 (define_special_predicate "vector_any_register_operand"
   (match_code "reg"))
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 73df55a69c8..f85d1cc80d1 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -2596,6 +2596,93 @@
   (V512DI "V512BI")
 ])
 
+(define_mode_attr stride_predicate [
+  (RVVM8QI "vector_eew8_stride_operand") (RVVM4QI "vector_eew8_stride_operand")
+  (RVVM2QI "vector_eew8_stride_operand") (RVVM1QI "vector_eew8_stride_operand")
+  (RVVMF2QI "vector_eew8_stride_operand") (RVVMF4QI "vector_eew8_stride_operand")
+  (RVVMF8QI "vector_eew8_stride_operand")
+
+  (RVVM8HI "vector_eew16_stride_operand") (RVVM4HI "vector_eew16_stride_operand")
+  (RVVM2HI "vector_eew16_stride_operand") (RVVM1HI "vector_eew16_stride_operand")
+  (RVVMF2HI "vector_eew16_stride_operand") (RVVMF4HI "vector_eew16_stride_operand")
+
+  (RVVM8HF "vector_eew16_stride_operand") (RVVM4HF "vector_eew16_stride_operand")
+  (RVVM2HF "vector_eew16_stride_operand") (RVVM1HF "vector_eew16_stride_operand")
+  (RVVMF2HF "vector_eew16_stride_operand") (RVVMF4HF "vector_eew16_stride_operand")
+
+  (RVVM8SI "vector_eew32_stride_operand") (RVVM4SI "vector_eew32_stride_operand")
+  (RVVM2SI "vector_eew32_stride_operand") (RVVM1SI "vector_eew32_stride_operand")
+  (RVVMF2SI "vector_eew32_stride_operand")
+
+  (RVVM8SF "vector_eew32_stride_operand") (RVVM4SF "vector_eew32_stride_operand")
+  (RVVM2SF "vector_eew32_stride_operand") (RVVM1SF "vector_eew32_stride_operand")
+  (RVVMF2SF 
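
The equivalence this patch relies on (a strided access whose byte stride equals the element width touches the same addresses as a unit-stride access) can be sketched with a scalar model in plain C. The helpers below (strided_load_i32, unit_load_i32, loads_agree) are hypothetical and model only the addressing, base + i * stride in bytes; they are not RVV intrinsics and say nothing about the code GCC actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of a strided load: element i is read from
   base + i * stride_bytes.  */
void strided_load_i32 (int32_t *dst, const void *base,
                       ptrdiff_t stride_bytes, size_t vl)
{
  const unsigned char *p = base;
  for (size_t i = 0; i < vl; i++)
    memcpy (&dst[i], p + (ptrdiff_t) i * stride_bytes, sizeof (int32_t));
}

/* Scalar model of a unit-stride load: vl consecutive elements.  */
void unit_load_i32 (int32_t *dst, const int32_t *base, size_t vl)
{
  memcpy (dst, base, vl * sizeof (int32_t));
}

/* Returns 1 when both models load the same vl elements from src.
   The caller must make sure every strided access stays inside src.  */
int loads_agree (const int32_t *src, ptrdiff_t stride_bytes, size_t vl)
{
  int32_t a[16], b[16];
  assert (vl <= 16);
  strided_load_i32 (a, src, stride_bytes, vl);
  unit_load_i32 (b, src, vl);
  return memcmp (a, b, vl * sizeof (int32_t)) == 0;
}
```

With a 16-element source array of distinct values, loads_agree (src, sizeof (int32_t), 8) holds while a stride of 8 bytes does not, which is the case distinction the new stride predicates encode per element width.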

[Bug tree-optimization/111456] [14 Regression] Dead Code Elimination Regression since r14-3719-gb34f3736356

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111456

Andrew Pinski  changed:

   What    |Removed |Added

   URL     |        |https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631065.html
   Keywords|        |patch

--- Comment #6 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631065.html

[PATCH] MATCH: Simplify `(A ==/!= B) &/| (((cast)A) CMP C)`

2023-09-20 Thread Andrew Pinski
This patch adds support to the pattern for `(A == B) &/| (A CMP C)`
where the second A may be cast to a different type.
Some cases were already handled correctly when written as separate `if`
statements but not when combined with BIT_AND/BIT_IOR.
In the case of pr111456-1.c, the testcase would pass only if
`--param=logical-op-non-short-circuit=0` was used, but it can now
always be optimized.

OK? Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/106164
PR tree-optimization/111456

gcc/ChangeLog:

* match.pd (`(A ==/!= B) & (A CMP C)`):
Support an optional cast on the second A.
(`(A ==/!= B) | (A CMP C)`): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/cmpbit-6.c: New test.
* gcc.dg/tree-ssa/cmpbit-7.c: New test.
* gcc.dg/tree-ssa/pr111456-1.c: New test.
---
 gcc/match.pd   | 76 +-
 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-6.c   | 22 +++
 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-7.c   | 28 
 gcc/testsuite/gcc.dg/tree-ssa/pr111456-1.c | 43 
 4 files changed, 139 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-6.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-7.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr111456-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index a37af05f873..0bf91bde486 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2973,7 +2973,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
   (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }
 
-/* Convert (X == CST1) && (X OP2 CST2) to a known value
+/* Convert (X == CST1) && ((other)X OP2 CST2) to a known value
based on CST1 OP2 CST2.  Similarly for (X != CST1).  */
 /* Convert (X == Y) && (X OP2 Y) to a known value if X is an integral type.
Similarly for (X != Y).  */
@@ -2981,26 +2981,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for code1 (eq ne)
  (for code2 (eq ne lt gt le ge)
   (simplify
-   (bit_and:c (code1@3 @0 @1) (code2@4 @0 @2))
+   (bit_and:c (code1:c@3 @0 @1) (code2:c@4 (convert?@c0 @0) @2))
(if ((TREE_CODE (@1) == INTEGER_CST
 && TREE_CODE (@2) == INTEGER_CST)
|| ((INTEGRAL_TYPE_P (TREE_TYPE (@1))
 || POINTER_TYPE_P (TREE_TYPE (@1)))
-   && operand_equal_p (@1, @2)))
+   && bitwise_equal_p (@1, @2)))
 (with
  {
   bool one_before = false;
   bool one_after = false;
   int cmp = 0;
+  bool allbits = true;
   if (TREE_CODE (@1) == INTEGER_CST
  && TREE_CODE (@2) == INTEGER_CST)
{
- cmp = tree_int_cst_compare (@1, @2);
+ allbits = TYPE_PRECISION (TREE_TYPE (@1)) <= TYPE_PRECISION (TREE_TYPE (@2));
+ auto t1 = wi::to_wide (fold_convert (TREE_TYPE (@2), @1));
+ auto t2 = wi::to_wide (@2);
+ cmp = wi::cmp (t1, t2, TYPE_SIGN (TREE_TYPE (@2)));
  if (cmp < 0
- && wi::to_wide (@1) == wi::to_wide (@2) - 1)
+ && t1 == t2 - 1)
one_before = true;
  if (cmp > 0
- && wi::to_wide (@1) == wi::to_wide (@2) + 1)
+ && t1 == t2 + 1)
one_after = true;
}
   bool val;
@@ -3018,25 +3022,29 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (switch
   (if (code1 == EQ_EXPR && val) @3)
   (if (code1 == EQ_EXPR && !val) { constant_boolean_node (false, type); })
-  (if (code1 == NE_EXPR && !val) @4)
+  (if (code1 == NE_EXPR && !val && allbits) @4)
   (if (code1 == NE_EXPR
&& code2 == GE_EXPR
-  && cmp == 0)
-   (gt @0 @1))
+  && cmp == 0
+  && allbits)
+   (gt @c0 (convert @1)))
   (if (code1 == NE_EXPR
&& code2 == LE_EXPR
-  && cmp == 0)
-   (lt @0 @1))
+  && cmp == 0
+  && allbits)
+   (lt @c0 (convert @1)))
   /* (a != (b+1)) & (a > b) -> a > (b+1) */
   (if (code1 == NE_EXPR
&& code2 == GT_EXPR
-  && one_after)
-   (gt @0 @1))
+  && one_after
+  && allbits)
+   (gt @c0 (convert @1)))
   /* (a != (b-1)) & (a < b) -> a < (b-1) */
   (if (code1 == NE_EXPR
&& code2 == LT_EXPR
-  && one_before)
-   (lt @0 @1))
+  && one_before
+  && allbits)
+   (lt @c0 (convert @1)))
  )
 )
)
@@ -3100,26 +3108,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for code1 (eq ne)
  (for code2 (eq ne lt gt le ge)
   (simplify
-   (bit_ior:c (code1@3 @0 @1) (code2@4 @0 @2))
+   (bit_ior:c (code1:c@3 @0 @1) (code2:c@4 (convert?@c0 @0) @2))
(if ((TREE_CODE (@1) == INTEGER_CST
 && TREE_CODE (@2) == INTEGER_CST)
|| ((INTEGRAL_TYPE_P (TREE_TYPE (@1))
|| POINTER_TYPE_P (TREE_TYPE (@1)))
-   && operand_equal_p (@1, @2)))
+   && bitwise_equal_p (@1, @2)))
 (with
  {
   bool one_before = false;
   bool one_after = false;
   int cmp = 0;
+  
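
As a rough illustration of what the extended pattern is meant to catch, and of why the patch tracks `allbits`, the C sketch below compares one instance of the `(A != CST) & ((cast)A >= CST)` shape before and after the rewrite. The function names are hypothetical, and the narrowing counterexample is my reading of the `allbits` condition in the diff rather than something stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Widening cast: (a != 5) & ((int64_t) a >= 5).  */
int before_widen (int32_t a)
{
  return (a != 5) & ((int64_t) a >= 5);
}

/* The two tests collapse into a single comparison.  */
int after_widen (int32_t a)
{
  return (int64_t) a > 5;
}

/* Narrowing cast: the cast drops bits, so a != 5 no longer implies
   (uint8_t) a != 5 and the same rewrite would be wrong.  */
int before_narrow (uint32_t a)
{
  return (a != 5u) & ((uint8_t) a >= 5);
}

int after_narrow_wrong (uint32_t a)
{
  return (uint8_t) a > 5;
}

/* Counts inputs in [lo, hi] where the widening forms disagree.
   Requires lo <= hi and hi < INT32_MAX.  */
int count_widen_mismatches (int32_t lo, int32_t hi)
{
  int n = 0;
  for (int32_t a = lo; a <= hi; a++)
    n += before_widen (a) != after_widen (a);
  return n;
}

/* 261 is 0x105, so its low byte is 5 although the full value is not.  */
void check_cast_rewrite (void)
{
  assert (count_widen_mismatches (-1000, 1000) == 0);
  assert (before_narrow (261u) != after_narrow_wrong (261u));
}
```

The widening forms agree on every tested input, while the value 261 separates the narrowing forms, which matches the patch restricting the NE_EXPR rewrites to the `allbits` case.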

[PATCH] check undefine_p for one more vr

2023-09-20 Thread Jiufu Guo
Hi,

The root cause of PR111355 and PR111482 is a missing check of whether vr0
is undefined_p before calling vr0.lower_bound.

In the pattern "(X + C) / N",

(if (INTEGRAL_TYPE_P (type)
 && get_range_query (cfun)->range_of_expr (vr0, @0))
 (if (...) 
   (plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0 ...
&& wi::geu_p (vr0.lower_bound (), -c))

The first "(if (...)" contains code that guards against an undefined vr0,
but in the "else" part, vr0's undefined_p is not checked before
"wi::geu_p (vr0.lower_bound (), -c)".

Bootstrap & regtest pass on ppc64{,le}.
Is this ok for trunk?

BR,
Jeff (Jiufu Guo)


PR tree-optimization/111355

gcc/ChangeLog:

* match.pd ((X + C) / N): Update pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/pr111355.c: New test.

---
 gcc/match.pd| 2 +-
 gcc/testsuite/gcc.dg/pr111355.c | 8 
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr111355.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 39c9c81966a..5fdfba14d47 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1033,7 +1033,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  || (vr0.nonnegative_p () && vr3.nonnegative_p ())
  || (vr0.nonpositive_p () && vr3.nonpositive_p (
(plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
-   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0
+   (if (!vr0.undefined_p () && TYPE_UNSIGNED (type) && c.sign_mask () < 0
&& exact_mod (-c)
/* unsigned "X-(-C)" doesn't underflow.  */
&& wi::geu_p (vr0.lower_bound (), -c))
diff --git a/gcc/testsuite/gcc.dg/pr111355.c b/gcc/testsuite/gcc.dg/pr111355.c
new file mode 100644
index 000..8bacbc69d31
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111355.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wno-div-by-zero" } */
+
+/* Make sure no ICE. */
+int main() {
+  unsigned b;
+  return b ? 1 << --b / 0 : 0;
+}
-- 
2.25.1
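
The shape of the fix, testing for an undefined range before reading its lower bound, can be sketched in plain C. Everything below is a toy model: struct toy_range and both functions are hypothetical and are not GCC's value_range API. The identity checked in count_split_mismatches is my reading of the 'unsigned "X-(-C)" doesn't underflow' comment in the hunk, with k standing for -C and n for N.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a value range; lo and hi are meaningful only when
   undefined is false.  */
struct toy_range
{
  bool undefined;
  uint32_t lo, hi;
};

/* Mirrors the order of the patched condition: the undefined test comes
   first, so lo is never read from an undefined range.  */
bool can_split_unsigned (const struct toy_range *vr, uint32_t k, uint32_t n)
{
  return !vr->undefined && n != 0 && k % n == 0 && vr->lo >= k;
}

/* Counts x in [lo, hi] where (x - k) / n differs from x / n - k / n.
   Only valid when can_split_unsigned holds.  */
int count_split_mismatches (const struct toy_range *vr, uint32_t k,
                            uint32_t n)
{
  assert (can_split_unsigned (vr, k, n) && vr->lo <= vr->hi);
  int bad = 0;
  for (uint32_t x = vr->lo;; x++)
    {
      bad += (x - k) / n != x / n - k / n;
      if (x == vr->hi)
        break;
    }
  return bad;
}
```

For a defined range such as [8, 200] with k = 8 and n = 4 the guard passes and no mismatch is found; for an undefined range the guard returns false before any bound is inspected.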



[Bug ipa/108007] [11/12/13/14 Regression] wrong code at -Os and above with "-fno-dce -fno-tree-dce" on x86_64-linux-gnu

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108007

--- Comment #14 from Andrew Pinski  ---
*** Bug 111507 has been marked as a duplicate of this bug. ***

[Bug c/111507] Floating point exception with '-O3'

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111507

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
  _13 = removed_return.111_34(D) % _12;
  _14 = (long int) _13;
  _15 = -2446 % _14;


Same issue as PR 108007.

*** This bug has been marked as a duplicate of bug 108007 ***

[Committed] RISC-V: Support VLS INT <-> FP conversions

2023-09-20 Thread Juzhe-Zhong
Support INT <-> FP VLS auto-vectorization patterns.

Regression passed.
Committed.

gcc/ChangeLog:

* config/riscv/autovec.md: Extend VLS modes.
* config/riscv/vector-iterators.md: Ditto.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/convert-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-9.c: New test.

---
 gcc/config/riscv/autovec.md   |  12 +-
 gcc/config/riscv/vector-iterators.md  | 202 ++
 gcc/config/riscv/vector.md|  20 +-
 .../riscv/rvv/autovec/vls/convert-1.c |  74 +++
 .../riscv/rvv/autovec/vls/convert-10.c|  80 +++
 .../riscv/rvv/autovec/vls/convert-11.c|  54 +
 .../riscv/rvv/autovec/vls/convert-12.c|  36 
 .../riscv/rvv/autovec/vls/convert-2.c |  74 +++
 .../riscv/rvv/autovec/vls/convert-3.c |  58 +
 .../riscv/rvv/autovec/vls/convert-4.c |  36 
 .../riscv/rvv/autovec/vls/convert-5.c |  80 +++
 .../riscv/rvv/autovec/vls/convert-6.c |  55 +
 .../riscv/rvv/autovec/vls/convert-7.c |  37 
 .../riscv/rvv/autovec/vls/convert-8.c |  58 +
 .../riscv/rvv/autovec/vls/convert-9.c |  22 ++
 15 files changed, 882 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-9.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 75ed7ae4f2e..55c0a04df3b 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -847,7 +847,7 @@
 (define_insn_and_split "2"
   [(set (match_operand: 0 "register_operand")
(any_fix:
- (match_operand:VF 1 "register_operand")))]
+ (match_operand:V_VLSF 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
@@ -868,8 +868,8 @@
 ;; -
 
 (define_insn_and_split "2"
-  [(set (match_operand:VF 0 "register_operand")
-   (any_float:VF
+  [(set (match_operand:V_VLSF 0 "register_operand")
+   (any_float:V_VLSF
  (match_operand: 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
@@ -916,8 +916,8 @@
 ;; - vfwcvt.f.x.v
 ;; -
 (define_insn_and_split "2"
-  [(set (match_operand:VF 0 "register_operand")
-   (any_float:VF
+  [(set (match_operand:V_VLSF 0 "register_operand")
+   (any_float:V_VLSF
  (match_operand: 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
@@ -940,7 +940,7 @@
 (define_insn_and_split "2"
   [(set (match_operand: 0 "register_operand")
(any_fix:
- (match_operand:VF 1 "register_operand")))]
+ (match_operand:V_VLSF 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 053d84c0c7d..19f3ec3ef74 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -1037,6 +1037,28 @@
   (RVVM4DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
   (RVVM2DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
   (RVVM1DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
+
+  (V1SI "TARGET_VECTOR_VLS && TARGET_ZVFH")
+  (V2SI 

[Bug ipa/108007] [11/12/13/14 Regression] wrong code at -Os and above with "-fno-dce -fno-tree-dce" on x86_64-linux-gnu

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108007

--- Comment #13 from Andrew Pinski  ---
*** Bug 111509 has been marked as a duplicate of this bug. ***

[Bug c/111509] Floating point exception with '-O3 -fno-dce -fno-inline-small-functions -fno-tree-dce -fno-tree-dominator-opts -fno-tree-dse'

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111509

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
  _145 = (unsigned int) removed_return.91_186(D);
  _146 = 1365587247 % _145;

Same issue as PR 108007.

*** This bug has been marked as a duplicate of bug 108007 ***

[Bug ipa/108007] [11/12/13/14 Regression] wrong code at -Os and above with "-fno-dce -fno-tree-dce" on x86_64-linux-gnu

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108007

--- Comment #12 from Andrew Pinski  ---
*** Bug 111508 has been marked as a duplicate of this bug. ***

[Bug c/111508] Floating point exception with '-O3 -fno-dce -fno-early-inlining -fno-tree-dce -fno-tree-dse'

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111508

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Andrew Pinski  ---
Same issue as PR 108007.

*** This bug has been marked as a duplicate of bug 108007 ***

[Bug c/111509] Floating point exception with '-O3 -fno-dce -fno-inline-small-functions -fno-tree-dce -fno-tree-dominator-opts -fno-tree-dse'

2023-09-20 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111509

--- Comment #1 from CTC <19373742 at buaa dot edu.cn> ---
Created attachment 55955
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55955&action=edit
The compiler output

[Bug c/111509] New: Floating point exception with '-O3 -fno-dce -fno-inline-small-functions -fno-tree-dce -fno-tree-dominator-opts -fno-tree-dse'

2023-09-20 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111509

Bug ID: 111509
   Summary: Floating point exception with '-O3 -fno-dce
-fno-inline-small-functions -fno-tree-dce
-fno-tree-dominator-opts -fno-tree-dse'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 19373742 at buaa dot edu.cn
  Target Milestone: ---

Created attachment 55954
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55954&action=edit
The preprocessed file

***
OS and Platform:
Ubuntu 20.04.4 LTS
***
gcc version:
$ gcc -v
Using built-in specs.
COLLECT_GCC=/home/cuisk/ctc/gcc-releases/gcc-14/bin/gcc
COLLECT_LTO_WRAPPER=/home/cuisk/ctc/gcc-releases/gcc-14/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=/home/cuisk/ctc/gcc-releases/gcc-14
--disable-multilib --enable-language=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230917 (experimental) (GCC) 
***
Command Lines:
$ gcc -I /home/csmith/include/csmith-2.3.0 -O3 -fno-dce
-fno-inline-small-functions -fno-tree-dce -fno-tree-dominator-opts
-fno-tree-dse a.c -o work 2>
ge.out

$ ./work

Floating point exception (core dumped)

[Bug c/111508] Floating point exception with '-O3 -fno-dce -fno-early-inlining -fno-tree-dce -fno-tree-dse'

2023-09-20 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111508

--- Comment #1 from CTC <19373742 at buaa dot edu.cn> ---
Created attachment 55953
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55953&action=edit
The compiler output

[Bug c/111508] New: Floating point exception with '-O3 -fno-dce -fno-early-inlining -fno-tree-dce -fno-tree-dse'

2023-09-20 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111508

Bug ID: 111508
   Summary: Floating point exception with '-O3 -fno-dce
-fno-early-inlining -fno-tree-dce -fno-tree-dse'
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 19373742 at buaa dot edu.cn
  Target Milestone: ---

Created attachment 55952
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55952&action=edit
The preprocessed file

***
OS and Platform:
Ubuntu 20.04.4 LTS
***
gcc version:
$ gcc -v
Using built-in specs.
COLLECT_GCC=/home/cuisk/ctc/gcc-releases/gcc-14/bin/gcc
COLLECT_LTO_WRAPPER=/home/cuisk/ctc/gcc-releases/gcc-14/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=/home/cuisk/ctc/gcc-releases/gcc-14
--disable-multilib --enable-language=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230917 (experimental) (GCC) 
***
Command Lines:
$ gcc -I /home/csmith/include/csmith-2.3.0 -O3 -fno-dce -fno-early-inlining
-fno-tree-dce -fno-tree-dse -save-temps a.c -o fails2 2>
fe.out

$ ./fails2

Floating point exception (core dumped)

[Bug c/111507] Floating point exception with '-O3'

2023-09-20 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111507

--- Comment #1 from CTC <19373742 at buaa dot edu.cn> ---
Created attachment 55951
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55951&action=edit
The compiler output

[Bug c/111507] New: Floating point exception with '-O3'

2023-09-20 Thread 19373742 at buaa dot edu.cn via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111507

Bug ID: 111507
   Summary: Floating point exception with '-O3'
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: 19373742 at buaa dot edu.cn
  Target Milestone: ---

Created attachment 55950
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55950&action=edit
The preprocessed file

***
OS and Platform:
Ubuntu 20.04.4 LTS
***
gcc version:
$ gcc -v

Using built-in specs.
COLLECT_GCC=/home/cuisk/ctc/gcc-releases/gcc-13/bin/gcc
COLLECT_LTO_WRAPPER=/home/cuisk/ctc/gcc-releases/gcc-13/libexec/gcc/x86_64-pc-linux-gnu/13.2.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ./configure --prefix=/home/cuisk/ctc/gcc-releases/gcc-13
--disable-multilib --enable-language=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.2.1 20230916 (GCC) 
***
Command Lines:

$ gcc -I /home/ctc/csmith/include/csmith-2.3.0 -O3 -fno-dce
-fno-forward-propagate -fno-inline-functions-called-once
-fno-inline-small-functions -fno-ipa-reference-addressable
-fno-rerun-cse-after-loop -fno-tree-dce -fno-tree-dse a.c -o fails 2>tmp.out

$ ./fails

Floating point exception (core dumped)

[Bug c/111506] New: RISC-V: Failed to vectorize conversion from INT64 -> _Float16

2023-09-20 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111506

Bug ID: 111506
   Summary: RISC-V: Failed to vectorize conversion from INT64 ->
_Float16
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

#include <stdint.h>

void foo (int64_t *__restrict a, _Float16 * b, int n)
{
for (int i = 0; i < n; i++) {
b[i] = (_Float16)a[i];
}
}

-march=rv64gcv_zvfh_zfh -O3  -fno-trapping-math -fopt-info-vec-missed
-march=rv64gcv_zvfh_zfh -O3  -ffast-math -fopt-info-vec-missed

<source>:5:23: missed: couldn't vectorize loop
<source>:6:27: missed: not vectorized: no vectype for stmt: _4 = *_3;
 scalar_type: int64_t
Compiler returned: 0

https://godbolt.org/z/orevoq7E1

Note that LLVM can vectorize this loop with -fno-trapping-math,
but not with -ftrapping-math.

So we need an explicit pattern for the INT64 -> _Float16 conversion,
enabled when !flag_trapping_math.

[PATCH] LoongArch: Optimizations of vector construction.

2023-09-20 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_vecinit_merge_): New
pattern for vector construction.
(vec_set_internal): Ditto.
(lasx_xvinsgr2vr__internal): Ditto.
(lasx_xvilvl__internal): Ditto.
* config/loongarch/loongarch.cc (loongarch_expand_vector_init):
Optimized the implementation of vector construction.
(loongarch_expand_vector_init_same): New function.
* config/loongarch/lsx.md (lsx_vilvl__internal): New
pattern for vector construction.
(lsx_vreplvei_mirror_): New pattern for vector
construction.
(vec_concatv2df): Ditto.
(vec_concatv4sf): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c: New test.
* gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c: New test.
---
 gcc/config/loongarch/lasx.md  |  69 ++
 gcc/config/loongarch/loongarch.cc | 716 +-
 gcc/config/loongarch/lsx.md   | 134 
 .../vector/lasx/lasx-vec-construct-opt.c  | 102 +++
 .../vector/lsx/lsx-vec-construct-opt.c|  85 +++
 5 files changed, 732 insertions(+), 374 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 8111c8bb79a..2bc5d47ed4a 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -186,6 +186,9 @@ (define_c_enum "unspec" [
   UNSPEC_LASX_XVLDI
   UNSPEC_LASX_XVLDX
   UNSPEC_LASX_XVSTX
+  UNSPEC_LASX_VECINIT_MERGE
+  UNSPEC_LASX_VEC_SET_INTERNAL
+  UNSPEC_LASX_XVILVL_INTERNAL
 ])
 
 ;; All vector modes with 256 bits.
@@ -255,6 +258,15 @@ (define_mode_attr VFHMODE256
[(V8SF "V4SF")
(V4DF "V2DF")])
 
+;; The attribute gives half int/float modes for vector modes.
+(define_mode_attr VHMODE256_ALL
+  [(V32QI "V16QI")
+   (V16HI "V8HI")
+   (V8SI "V4SI")
+   (V4DI "V2DI")
+   (V8SF "V4SF")
+   (V4DF "V2DF")])
+
 ;; The attribute gives double modes for vector modes in LASX.
 (define_mode_attr VDMODE256
   [(V8SI "V4DI")
@@ -312,6 +324,11 @@ (define_mode_attr mode256_f
(V4DI "v4df")
(V8SI "v8sf")])
 
+;; This attribute gives V32QI mode and V16HI mode with half size.
+(define_mode_attr mode256_i_half
+  [(V32QI "v16qi")
+   (V16HI "v8hi")])
+
  ;; This attribute gives suffix for LASX instructions.  HOW?
 (define_mode_attr lasxfmt
   [(V4DF "d")
@@ -756,6 +773,20 @@ (define_insn "lasx_xvpermi_q_"
   [(set_attr "type" "simd_splat")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Support a LSX-mode input op2.
+(define_insn "lasx_vecinit_merge_"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+   (unspec:LASX
+ [(match_operand:LASX 1 "register_operand" "0")
+  (match_operand: 2 "register_operand" "f")
+  (match_operand 3 "const_uimm8_operand")]
+  UNSPEC_LASX_VECINIT_MERGE))]
+  "ISA_HAS_LASX"
+  "xvpermi.q\t%u0,%u2,%3"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "")])
+
 (define_insn "lasx_xvpickve2gr_d"
   [(set (match_operand:DI 0 "register_operand" "=r")
(any_extend:DI
@@ -779,6 +810,33 @@ (define_expand "vec_set"
   DONE;
 })
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Simulate missing instructions xvinsgr2vr.b and xvinsgr2vr.h.
+(define_expand "vec_set_internal"
+  [(match_operand:ILASX_HB 0 "register_operand")
+   (match_operand: 1 "reg_or_0_operand")
+   (match_operand 2 "const__operand")]
+  "ISA_HAS_LASX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lasx_xvinsgr2vr__internal
+(operands[0], operands[1], operands[0], index));
+  DONE;
+})
+
+(define_insn "lasx_xvinsgr2vr__internal"
+  [(set (match_operand:ILASX_HB 0 "register_operand" "=f")
+   (unspec:ILASX_HB [(match_operand: 1 "reg_or_0_operand" "rJ")
+ (match_operand:ILASX_HB 2 "register_operand" "0")
+ (match_operand 3 "const__operand" "")]
+UNSPEC_LASX_VEC_SET_INTERNAL))]
+  "ISA_HAS_LASX"
+{
+  return "vinsgr2vr.\t%w0,%z1,%y3";
+}
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "")])
+
 (define_expand "vec_set"
   [(match_operand:FLASX 0 "register_operand")
(match_operand: 1 "reg_or_0_operand")
@@ -1567,6 +1625,17 @@ (define_insn "logb2"
   [(set_attr "type" "simd_flog2")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Merge two scalar floating-point op1 and op2 into a LASX op0.
+(define_insn "lasx_xvilvl__internal"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+   (unspec:FLASX [(match_operand: 1 "register_operand" "f")
+  (match_operand: 2 "register_operand" "f")]
+ UNSPEC_LASX_XVILVL_INTERNAL))]
+  

Re: Re: [Committed] RISC-V: Fix Demand comparison bug[VSETVL PASS]

2023-09-20 Thread juzhe.zh...@rivai.ai
Yes. We could wait a few more days before backporting.



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2023-09-21 00:41
To: Juzhe-Zhong
CC: GCC Patches; Kito Cheng; Jeff Law; Robin Dapp
Subject: Re: [Committed] RISC-V: Fix Demand comparison bug[VSETVL PASS]
Does it also happen on the GCC 13 branch? If so, please backport :)

On Wed, Sep 20, 2023 at 11:09, Juzhe-Zhong wrote:
This bug is exposed when we support VLS integer conversion patterns.

FAIL: c-c++-common/torture/pr53505.c execution.

This is caused by incorrect vsetvl elimination in Phase 4:

   10318:   0d207057    vsetvli zero,zero,e32,m4,ta,ma
   1031c:   5e003e57    vmv.v.i v28,0
   .:   missed e8,m1 vsetvl
   10320:   7b07b057    vmsgtu.vi   v0,v16,15
   10324:   03083157    vadd.vi v2,v16,-16

Regression testing on the release version of GCC shows no unexpected differences.

Committed.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vector_insn_info::operator==): Fix bug.

---
 gcc/config/riscv/riscv-vsetvl.cc | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index df980b6770e..e0f61148ef3 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1799,10 +1799,11 @@ vector_insn_info::operator== (const vector_insn_info &other) const
 if (m_demands[i] != other.demand_p ((enum demand_type) i))
   return false;

-  if (vector_config_insn_p (m_insn->rtl ())
-  || vector_config_insn_p (other.get_insn ()->rtl ()))
-if (m_insn != other.get_insn ())
-  return false;
+  /* We should consider different INSN demands as different
+ expression.  Otherwise, we will be doing incorrect vsetvl
+ elimination.  */
+  if (m_insn != other.get_insn ())
+return false;

   if (!same_avl_p (other))
 return false;
-- 
2.36.3



[Bug tree-optimization/106164] (a > b) & (a >= b) does not get optimized until reassoc1

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106164

--- Comment #21 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #18)
> I think the only thing left is supporting floating point.

Another testcase for integer but with a nop_conversion:
```
int f(int a)
{
unsigned b = a;
int t = (b >= 3);
return t | (a == 0);
}
```

Without the nop_conversion, the pattern:
/* y == XXX_MIN || x < y --> x <= y - 1 */
(simplify
 (bit_ior:c (eq:s @1 min_value) (lt:cs @0 @1))
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@1))
   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
  (le @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }))))

Hits.

I will look into supporting the case where there is a nop_conversion for the @0
in the lt ...
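
To make the intended fold concrete, here is a small C sketch. It assumes the fold would look through the conversion and treat a == 0 as b == 0, which is the minimum of the unsigned type. f is the testcase above; f_folded and forms_agree are hypothetical names, and f_folded is a hand-written guess at the `x <= y - 1` form (with x = 2, y = b), not compiler output:

```c
#include <limits.h>
#include <stddef.h>

/* Testcase from above: b is a no-op conversion of a.  */
int f(int a)
{
  unsigned b = a;
  int t = (b >= 3);
  return t | (a == 0);
}

/* Hand-written folded form: b == 0 || 2 < b  -->  2 <= b - 1,
   relying on unsigned wrap-around when b is 0.  */
int f_folded(int a)
{
  unsigned b = a;
  return 2u <= b - 1u;
}

/* Returns 1 if both forms agree on a few boundary inputs.  */
int forms_agree(void)
{
  static const int samples[] =
    { INT_MIN, -3, -1, 0, 1, 2, 3, 4, INT_MAX };
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (f(samples[i]) != f_folded(samples[i]))
      return 0;
  return 1;
}
```

The b == 0 case is the interesting one: b - 1 wraps to UINT_MAX, so the single comparison still yields 1.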

[Bug tree-optimization/111456] [14 Regression] Dead Code Elimination Regression since r14-3719-gb34f3736356

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111456

--- Comment #5 from Andrew Pinski  ---
Just a quick note: with --param=logical-op-non-short-circuit=0 the
optimization is no longer missed.

Re: Re: [Committed] RISC-V: Support VLS unary floating-point patterns

2023-09-20 Thread 钟居哲
>> On a more general note, are we expecting #include <math.h> to cause a
>> testcase to fail?

Well, actually I am not familiar with this stuff.
We include math.h because we need it, for example for CEIL, FLOOR, etc.
I don't know how to avoid those bogus failures.


juzhe.zh...@rivai.ai
 
From: Patrick O'Neill
Date: 2023-09-21 01:47
To: juzhe.zh...@rivai.ai
CC: Robin Dapp; gcc-patches; Kito.cheng; jeffreyalaw; palmer; Edwin Lu; 
joern.rennecke; jeremy.bennett; gnu-toolchain; Kito Cheng
Subject: Re: [Committed] RISC-V: Support VLS unary floating-point patterns
Juzhe,

On a more general note, are we expecting #include <math.h> to cause a
testcase to fail?

My motivation is to make the testsuite less noisy when checking for
regressions. For example, a patch like this one:
https://patchwork.sourceware.org/project/gcc/patch/20230920023059.1728132-1-pan2...@intel.com/
is showing 4 new failures on rv32gcv from the {dg-do compile} testcases
that #include <math.h>. I might be wrong, but those don't look like real
failures to me [1][2][3].
On glibc rv64gcv I'm seeing tests like:
gcc.target/riscv/rvv/autovec/unop/vnot-rv32gcv.c
fail with similar missing stubs-ilp32d.h errors.

I want to sanity-check with other people that they are seeing similar
errors and that these errors indicate something wrong with the testsuite.
If nobody else is seeing these errors, I'd like to hear how you're
running the testsuite so I can debug the riscv-gnu-toolchain repo.

Patrick

[1]: 
Executing on host: 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc
 
-B/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/
  
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
  -march=rv32gcv -mabi=ilp32d -mcmodel=medlow   -fdiagnostics-plain-output  -O3 
-ftree-vectorize -march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2 -S   
-o math-ceil-1.s(timeout = 600)
spawn -ignore SIGHUP 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc
 
-B/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/
 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
 -march=rv32gcv -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output -O3 
-ftree-vectorize -march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2 -S -o 
math-ceil-1.s
In file included from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/features.h:515,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/bits/libc-header-start.h:33,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/math.h:27,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h:1,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c:5:
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/gnu/stubs.h:17:11:
 fatal error: gnu/stubs-lp64d.h: No such file or directory
compilation terminated.
compiler exited with status 1
FAIL: gcc.target/riscv/rvv/autovec/math-ceil-1.c -O3 -ftree-vectorize (test for 
excess errors)

[2]:
https://github.com/ewlu/riscv-gnu-toolchain/issues/170

[3]:
This also extends beyond math.h. I'm seeing similar failures for
testcases like 
gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c that
#include .


On 9/19/23 18:12, Patrick O'Neill wrote:
I'll let it run overnight and see if this helps. Even before this patch,
I was seeing 233 stubs related failures for rv32gcv and 7 for rv64gcv so
this won't fix all the issues.

It's easily replicated using upstream riscv-gnu-toolchain
git clone https://github.com/riscv-collab/riscv-gnu-toolchain
cd riscv-gnu-toolchain
git submodule update --init gcc
cd gcc
git pull origin master
cd ..
mkdir build
cd build
../configure --prefix=$(pwd) --with-arch=rv32gcv --with-abi=ilp32d
make report-linux -j32

Then search for "stubs" in the debug logs 
(/build-gcc-linux-stage2/gcc/testsuite/*.log)

Patrick
On 9/19/23 17:54, juzhe.zh...@rivai.ai wrote:
I think we could remove math.h.

Hi, @Patrick. Could you verify it?

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
index 2292372d7a3..674098e9ba6 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
@@ -1,5 +1,4 

[Bug c++/111504] compare operator not defined for recursive data types on C++20

2023-09-20 Thread xgao at nvidia dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111504

--- Comment #2 from Xiang Gao  ---
(In reply to Andrew Pinski from comment #1)
> Fails for the same reason with clang (both with libstdc++ and libc++) 
> 
> Are you sure this is valid C++ 20 code?

I am not 100% sure, but my understanding is, for the SFINAE in
  inline constexpr bool operator<(const DT& x, const DT& y)
The first condition
  hasLessThan::value
is just hasLessThan::value, which should be true, and (true
|| anything) is always true.

So the condition in SFINAE should be easily evaluated as true. So the operator<
for DynamicType should be defined.

And if operator< is defined, then the operator<=> of std::vector
should also be defined. This can be validated by changing the definition of
operator< by only keeping the first condition:

template <
typename DT,
typename = std::enable_if_t<
(hasLessThan::value)>>
inline constexpr bool operator<(const DT& x, const DT& y) {
  // implementation omitted
  return true;
}

and then both g++ and clang++ will pass.

I do observe the same error on clang, but this https://cpp.sh/ seems to compile
without problem on C++20.

Re: [PATCH v2 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-09-20 Thread Jason Merrill

On 9/19/23 20:30, waffl3x wrote:

Thank you, this is great!


Thanks!


One legal hurdle to start with: our DCO policy
(https://gcc.gnu.org/dco.html) requires real names in the sign-off, not
pseudonyms. If you would prefer to contribute under this pseudonym, I
encourage you to file a copyright assignment with the FSF, who are set
up to handle that.


I will get on that right away.


+/* These need to moved to somewhere appropriate. */


This isn't a bad spot for these macros, but you could also move them
down lower, maybe near DECL_THIS_STATIC and DECL_ARRAY_PARAMETER_P for
some thematic connection.


Sounds good, I will move them down.


+/* The flag is a member of base, but the value is meaningless for other
+ decl types so checking is still justified I imagine. */


Absolutely, we often reuse bits for other purposes if they're disjoint
from the use they were added for.


Would it be more appropriate to give it a general name in base instead
then? If so, I can also change that.


That would make sense.


+/* Not a lang_decl field, but still specific to c++. */
+#define DECL_PARM_XOBJ_FLAG(NODE) \
+ (PARM_DECL_CHECK (NODE)->decl_common.decl_flag_3)


Better to use a DECL_LANG_FLAG than claim one of the
language-independent flags for C++.

There's a list at the top of cp-tree.h of the uses of LANG_FLAG on
various kinds of tree node. DECL_LANG_FLAG_4 seems free on PARM_DECL.


Okay, I will switch to that instead, I didn't like using such a general
purpose flag for what is only relevant until the FUNC_DECL is created
and then never again.


That's a good point, but the flag you chose seems even more general purpose.

A better option might be, instead of putting this flag on the PARM_DECL, 
to put it on the short-lived TREE_LIST which is only used for 
communication between cp_parser_parameter_declaration_list and 
grokparms, and have grokdeclarator grab it from 
declarator->u.function.parameters?



If you don't mind answering right now, what are the consequences of
claiming language-independent flags for C++? Or to phrase it
differently, why would this be claiming it for C++? My guess was that
those flags could be used by any front ends and there wouldn't be any
conflicts, as you can't really have crossover between two front ends at
the same time. Or is that the thing, that kind of cross-over is
actually viable and claiming a language independent flag inhibits that
possibility? Like I alluded to, this is kinda off topic from the patch
so feel free to defer the answer to someone else but I just want to
clear up my understanding for the future.


Generally the flags that aren't specifically specified to be 
language-specific are reserved for language-independent uses; even if 
only one front-end actually uses the feature, it should be for 
communication to language-independent code rather than communication 
within the particular front-end.  The patch modified tree-core.h to 
refer to a macro in cp-tree.h.



Yeah, I separated all the diagnostics out into the second patch. This
patch was meant to include the bare minimum of what was necessary to
get the feature functional. As for the diagnostics patch, I'm not happy
with how scattered about the code base it is, but you'll be able to
judge for yourself when I resubmit that patch, hopefully later today.
So not to worry, I didn't neglect diagnostics, it's just in a follow
up. The v1 of it was submitted on August 31st if you want to find it,
but I wouldn't recommend it. I misunderstood how some things were to be
formatted so it's probably best you just wait for me to finish a v2 of
it.


Ah, oops, I assumed that v2 completely replaced v1.


One last thing, I assume I should clean up the comments and replace
them with more typical ones right? I'm going to go forward with that
assumption in v3, I just want to mention it in advance just in case I
have the wrong idea.


Yes, please.


I will get started on v3 of this patch and v2 of the diagnostic patch
as soon as I have the ball rolling on legal stuff. I should have it all
finished tonight. Thanks for the detailed response, it cleared up a lot
of my doubts.


Sounds good!

Jason



Re: Question on -fwrapv and -fwrapv-pointer

2023-09-20 Thread Martin Uecker via Gcc
On Wednesday, 20.09.2023 at 13:40 -0700, Kees Cook via Gcc wrote:
> On Sat, Sep 16, 2023 at 10:36:52AM +0200, Martin Uecker wrote:
> > > On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote:
> > > > On Fri, Sep 15, 2023 at 8:12 AM Qing Zhao  wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao  wrote:
> > > > > > 
> > > > > > > On Thu, 2023-09-14 at 21:41 +0000, Qing Zhao wrote:
> > > > > > > > > > CLANG already provided -fsanitize=unsigned-integer-overflow. GCC
> > > > > > > > > > might need to do the same.
> > > > > > > > > 
> > > > > > > > > NO. There is no such thing as unsigned integer overflow. That option
> > > > > > > > > is badly designed and the GCC community has rejected a few times now
> > > > > > > > > having that sanitizer before. It is bad form to have a sanitizer for
> > > > > > > > > well defined code.
> > > > > > > 
> > > > > > > Even though unsigned integer overflow is well defined, it might be
> > > > > > > unintentional, shall we warn user about this?
> > > > > > 
> > > > > > > *Everything* could be unintentional and should be warned then.  GCC is a
> > > > > > > compiler, not an advanced AI educating the programmers.
> > > > > 
> > > > > Well, you are right in some sense. -:)
> > > > > 
> > > > > > However, overflow is one important source for security flaws, it's
> > > > > > important for compilers to detect overflows in the programs in general.
> > > > 
> > > > Except it is NOT an overflow. Rather it is wrapping. That is a big
> > > > point here. unsigned wraps and does NOT overflow. Yes there is a major
> > > > difference.
> > > 
> > > Right, yes. I will try to pick my language very carefully. :)
> > > 
> > > The practical problem I am trying to solve in the 30 million lines of
> > > Linux kernel code is that of catching arithmetic wrap-around. The
> > > problem is one of evolving the code -- I can't just drop -fwrapv and
> > > -fwrapv-pointer because it's not possible to fix all the cases at once.
> > > (And we really don't want to reintroduce undefined behavior.)
> > > 
> > > So, for signed, pointer, and unsigned types, we need:
> > > 
> > > a) No arithmetic UB -- everything needs to have deterministic behavior.
> > >The current solution here is "-fno-strict-overflow", which eliminates
> > >the UB and makes sure everything wraps.
> > > 
> > > b) A way to run-time warn/trap on overflow/underflow/wrap-around. This
> > >would work with -fsanitize=[signed-integer|pointer]-overflow except
> > >due to "a)" we always wrap. And there isn't currently coverage like
> > >this for unsigned (in GCC).
> > > 
> > > Our problem is that the kernel is filled with a mix of places where there
> > > is intended wrap-around and unintended wrap-around. We can chip away at
> > > fixing the intended wrap-around that we can find with static analyzers,
> > > etc, but at the end of the day there is a long tail of finding the places
> > > where intended wrap-around is hiding. But when the refactoring is
> > > sufficiently complete, we can move the wrap-around warning to a trap,
> > > and the kernel will no longer have this class of security flaw.
> > > 
> > > As a real-world example, here is a bug where a u8 wraps around causing
> > > an under-allocation that allowed for a heap overwrite:
> > > 
> > > https://git.kernel.org/linus/6311071a0562
> > > https://elixir.bootlin.com/linux/v6.5/source/net/wireless/nl80211.c#L5422
> > > 
> > > If there were more than 255 elements in a linked list, the allocation
> > > would be too small, and the second loop would write past the end of the
> > > allocation. This is a pretty classic allocation underflow and linear
> > > heap write overflow security flaw. (And it would be trivially stopped by
> > > trapping on the u8 wrap around.)
> > > 
> > > So, I want to be able to catch that at run-time. But we also have code
> > > doing things like "if (ulong + offset < ulong) { ... }":
> > > 
> > > https://elixir.bootlin.com/linux/v6.5/source/drivers/crypto/axis/artpec6_crypto.c#L1187
> > > 
> > > This is easy for a static analyzer to find and we can replace it with a
> > > non-wrapping test (e.g. __builtin_add_overflow()), but we'll not find
> > > them all immediately, especially for the signed and pointer cases.
> > > 
> > > So, I need to retain the "everything wraps" behavior while still being
> > > able to detect when it happens.
> > 
> > 
> > Hi Kees,
> > 
> > I have a couple of questions:
> > 
> > Currently, my thinking was that you would use signed integers
> > if you want the usual integer arithmetic rules we know from
> > elementary school and if you overflow this is clearly a bug 
> > you can diagnose with UBsan.
> > 
> > There are people who think that signed overflow should be
> > defined to wrap, but I think this would be a severe
> > mistake because then code would start to rely on it, which
> > makes it then difficult to 

[Bug middle-end/111505] Asan (address-sanitizer) bootstrap fails since r14-4003-geaa8e8541349df

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111505

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
   Keywords||GC
  Component|bootstrap   |middle-end

--- Comment #1 from Andrew Pinski  ---
Hmm:
from gtype-desc.cc:
  {
&int_n_trees[0].signed_type,
1 * (NUM_INT_N_ENTS),
sizeof (int_n_trees[0]),
&gt_ggc_mx_tree_node,
&gt_pch_nx_tree_node
  },
  {
&int_n_trees[0].unsigned_type,
1 * (NUM_INT_N_ENTS),
sizeof (int_n_trees[0]),
&gt_ggc_mx_tree_node,
&gt_pch_nx_tree_node
  },

That looks wrong ...

[Bug bootstrap/111505] New: Asan (address-sanitizer) bootstrap fails since r14-4003-geaa8e8541349df

2023-09-20 Thread jamborm at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111505

Bug ID: 111505
   Summary: Asan (address-sanitizer) bootstrap fails since
r14-4003-geaa8e8541349df
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jamborm at gcc dot gnu.org
CC: dmalcolm at gcc dot gnu.org, fkastl at suse dot cz
Blocks: 86656
  Target Milestone: ---
  Host: x86_64-linux-gnu
Target: x86_64-linux-gnu

Bootstrapping with active address sanitizer fails at the beginning of
stage 3 since r14-4003-geaa8e8541349df (ggc, jit: forcibly clear GTY
roots in jit).

To reproduce, use --with-build-config=bootstrap-asan at configure
time, for example:

../src/configure --prefix=/home/user/install/prefix --enable-languages=c,c++
--enable-checking=release --enable-host-shared --disable-multilib
--with-build-config=bootstrap-asan

and run make (and wait).

At least one failure happens during configure script run of libiberty,
which fails with "C compiler cannot create executables" and the
corresponding config.log contains the following ASAN errors:

configure:3470:  /home/mjambor/gcc/mine/b-obj/./prev-gcc/xgcc
-B/home/mjambor/gcc/mine/b-obj/./prev-gcc/
-B/home/mjambor/gcc/mine/inst/x86_64-pc-linux-gnu/bin/
-B/home/mjambor/gcc/mine/inst/x86_64-pc-linux-gnu/bin/
-B/home/mjambor/gcc/mine/inst/x86_64-pc-linux-gnu/lib/ -isystem
/home/mjambor/gcc/mine/inst/x86_64-pc-linux-gnu/include -isystem
/home/mjambor/gcc/mine/inst/x86_64-pc-linux-gnu/sys-include   -fchecking=1 -o
conftest -g -O2 -fchecking=1 -fsanitize=address  -static-libstdc++
-static-libgcc -fsanitize=address -static-libasan
-B/home/mjambor/gcc/mine/b-obj/prev-x86_64-pc-linux-gnu/libsanitizer/
-B/home/mjambor/gcc/mine/b-obj/prev-x86_64-pc-linux-gnu/libsanitizer/asan/
-B/home/mjambor/gcc/mine/b-obj/prev-x86_64-pc-linux-gnu/libsanitizer/asan/.libs
 conftest.c  >&5
=
==2683==ERROR: AddressSanitizer: global-buffer-overflow on address
0x0718d4d0 at pc 0x007cd234 bp 0x7ffdc15756e0 sp 0x7ffdc1574ea0
WRITE of size 16 at 0x0718d4d0 thread T0
#0 0x7cd233 in __interceptor_memset
/home/mjambor/gcc/mine/src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:847
#1 0x12151ab in ggc_common_finalize()
/home/mjambor/gcc/mine/src/gcc/ggc-common.cc:1311
#2 0x1dad8ef in toplev::finalize()
/home/mjambor/gcc/mine/src/gcc/toplev.cc:2354
#3 0x796732 in main /home/mjambor/gcc/mine/src/gcc/main.cc:42
#4 0x7f74182281af in __libc_start_call_main (/lib64/libc.so.6+0x281af)
(BuildId: 7729cbd8376d2b42276cc2cc10693449ff810847)
#5 0x7f7418228278 in __libc_start_main@@GLIBC_2.34
(/lib64/libc.so.6+0x28278) (BuildId: 7729cbd8376d2b42276cc2cc10693449ff810847)
#6 0x797e84 in _start ../sysdeps/x86_64/start.S:115

0x0718d4d0 is located 48 bytes before global variable 'int_n_enabled_p'
defined in '/home/mjambor/gcc/mine/src/gcc/tree.cc:234:6' (0x718d500) of size 1
0x0718d4d0 is located 0 bytes after global variable 'int_n_trees' defined
in '/home/mjambor/gcc/mine/src/gcc/tree.cc:235:22' (0x718d4c0) of size 16
SUMMARY: AddressSanitizer: global-buffer-overflow
/home/mjambor/gcc/mine/src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:847
in __interceptor_memset
Shadow bytes around the buggy address:
  0x0718d200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0718d280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0718d300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0718d380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0718d400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0718d480: 00 00 00 00 f9 f9 f9 f9 00 00[f9]f9 f9 f9 f9 f9
  0x0718d500: 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0718d580: 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9 f9
  0x0718d600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0718d680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0718d700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:   00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:   fa
  Freed heap region:   fd
  Stack left redzone:  f1
  Stack mid redzone:   f2
  Stack right redzone: f3
  Stack after return:  f5
  Stack use after scope:   f8
  Global redzone:  f9
  Global init order:   f6
  Poisoned by user:f7
  Container overflow:  fc
  Array cookie:ac
  Intra object redzone:bb
  ASan internal:   fe
  Left alloca redzone: ca
  Right alloca redzone:cb
==2683==ABORTING

[...]

configure:3708:  /home/mjambor/gcc/mine/b-obj/./prev-gcc/xgcc

[Bug c++/111504] compare operator not defined for recursive data types on C++20

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111504

--- Comment #1 from Andrew Pinski  ---
Fails for the same reason with clang (both with libstdc++ and libc++) 

Are you sure this is valid C++ 20 code?

Re: [PATCH] c++: missing SFINAE in grok_array_decl [PR111493]

2023-09-20 Thread Jason Merrill

On 9/20/23 11:03, Patrick Palka wrote:

On Wed, 20 Sep 2023, Patrick Palka wrote:


Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

This fixes some missed SFINAE in grok_array_decl when checking a C++23
multidimensional subscript operator expression.

Note the existing pedwarn code paths are a backward compatibility fallback
for treating invalid a[x, y, z] as a[(x, y, z)], but this should only be
done outside of a SFINAE context I think.

PR c++/111493

gcc/cp/ChangeLog:

* decl2.cc (grok_array_decl): Guard errors with tf_error.
In the pedwarn code paths, return error_mark_node when in
a SFINAE context.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/subscript15.C: New test.
---
  gcc/cp/decl2.cc  | 36 +++-
  gcc/testsuite/g++.dg/cpp23/subscript15.C | 24 
  2 files changed, 47 insertions(+), 13 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/subscript15.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index b402befba6d..6eb6d8c57d6 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -477,7 +477,9 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
{
  /* If it would be valid albeit deprecated expression in
 C++20, just pedwarn on it and treat it as if wrapped
-in ().  */
+in () unless we're in a SFINAE context.  */
+ if (!(complain & tf_error))
+   return error_mark_node;


It occurred to me that we could check for tf_error much earlier, before
we call build_x_compound_expr_from_vec and build_new_op, since they're
only used here to implement the backward compatibility fallback.  Perhaps
the following is better, then:


This version is OK.


-- >8 --

Subject: [PATCH] c++: missing SFINAE in grok_array_decl [PR111493]

We should guard both the diagnostic and backward compatibility fallback
code with tf_error, so that in a SFINAE context we don't issue any
diagnostics and correctly recognize ill-formed C++23 multidimensional
subscript operator expressions.

PR c++/111493

gcc/cp/ChangeLog:

* decl2.cc (grok_array_decl): Guard diagnostic and backward
compatibility fallback code paths with tf_error.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/subscript15.C: New test.
---
  gcc/cp/decl2.cc  | 15 +++---
  gcc/testsuite/g++.dg/cpp23/subscript15.C | 25 
  2 files changed, 37 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/subscript15.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index b402befba6d..6ac27cbc15f 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -459,7 +459,10 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
{
  expr = build_op_subscript (loc, array_expr, index_exp_list,
 , complain & tf_decltype);
- if (expr == error_mark_node)
+ if (expr == error_mark_node
+ /* Don't do the backward compatibility fallback in a SFINAE
+context.   */
+ && (complain & tf_error))
{
  tree idx = build_x_compound_expr_from_vec (*index_exp_list, NULL,
 tf_none);
@@ -510,6 +513,11 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
  
if (index_exp == NULL_TREE)

{
+ if (!(complain & tf_error))
+   /* Don't do the backward compatibility fallback in a SFINAE
+  context.  */
+   return error_mark_node;
+
  if ((*index_exp_list)->is_empty ())
{
  error_at (loc, "built-in subscript operator without expression "
@@ -561,8 +569,9 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
swapped = true, array_expr = p2, index_exp = i1;
else
{
- error_at (loc, "invalid types %<%T[%T]%> for array subscript",
-   type, TREE_TYPE (index_exp));
+ if (complain & tf_error)
+   error_at (loc, "invalid types %<%T[%T]%> for array subscript",
+ type, TREE_TYPE (index_exp));
  return error_mark_node;
}
  
diff --git a/gcc/testsuite/g++.dg/cpp23/subscript15.C b/gcc/testsuite/g++.dg/cpp23/subscript15.C

new file mode 100644
index 000..fece96be96b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/subscript15.C
@@ -0,0 +1,25 @@
+// PR c++/111493
+// { dg-do compile { target c++23 } }
+
+template
+concept CartesianIndexable = requires(T t, Ts... ts) { t[ts...]; };
+
+static_assert(!CartesianIndexable);
+static_assert(!CartesianIndexable);
+static_assert(!CartesianIndexable);
+
+static_assert(!CartesianIndexable);
+static_assert(CartesianIndexable);
+static_assert(!CartesianIndexable);

[Bug preprocessor/90400] _Pragma not always expanded in the right location within macros

2023-09-20 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90400

Lewis Hyatt  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Lewis Hyatt  ---
Marking it fixed now that the testcase is added.

[Bug preprocessor/61474] ICE (segfault) in preprocessor

2023-09-20 Thread lhyatt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61474

Lewis Hyatt  changed:

   What|Removed |Added

 CC||lhyatt at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |14.0

--- Comment #4 from Lewis Hyatt  ---
Thanks for the testcase, this is fixed for GCC 14.

Re: [PATCH] c++: constraint rewriting during ttp coercion [PR111485]

2023-09-20 Thread Jason Merrill

On 9/20/23 13:10, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps backports?


OK for trunk and 13.


-- >8 --

In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters.  The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2.  This patch
fixes this by including the outer template arguments in this substitution,
which ought to match the depth of the ttp.

The second testcase demonstrates that it's sometimes necessary to
substitute the concrete outer template arguments instead of generic
ones, because a ttp's constraints could depend on outer arguments.

PR c++/111485

gcc/cp/ChangeLog:

* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.
---
  gcc/cp/pt.cc   |  5 +++--
  gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C | 24 ++
  gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C | 17 +++
  3 files changed, 44 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 8758e218ce4..f47887291a6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -8360,7 +8360,7 @@ canonicalize_expr_argument (tree arg, tsubst_flags_t 
complain)
 constrained than the parameter.  */
  
  static bool

-is_compatible_template_arg (tree parm, tree arg)
+is_compatible_template_arg (tree parm, tree arg, tree args)
  {
tree parm_cons = get_constraints (parm);
  
@@ -8381,6 +8381,7 @@ is_compatible_template_arg (tree parm, tree arg)

  {
tree aparms = DECL_INNERMOST_TEMPLATE_PARMS (arg);
new_args = template_parms_level_to_args (aparms);
+  new_args = add_to_template_args (args, new_args);
++processing_template_decl;
parm_cons = tsubst_constraint_info (parm_cons, new_args,
  tf_none, NULL_TREE);
@@ -8635,7 +8636,7 @@ convert_template_argument (tree parm,
// Check that the constraints are compatible before allowing the
// substitution.
if (val != error_mark_node)
-if (!is_compatible_template_arg (parm, arg))
+   if (!is_compatible_template_arg (parm, arg, args))
{
if (in_decl && (complain & tf_error))
{
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
new file mode 100644
index 000..4129e9e1303
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
@@ -0,0 +1,24 @@
+// PR c++/111485
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+
+template concept C = always_true;
+template concept D = C || true;
+
+template class TT> struct example { };
+template class UU> using example_t = example;
+
+template
+struct A {
+  template class TT> struct example { };
+
+  template class UU> using example_t = example;
+
+  template
+  struct B {
+template class UU> using example_t = example;
+  };
+};
+
+template struct A::B;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C
new file mode 100644
index 000..7832cabc7d8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C
@@ -0,0 +1,17 @@
+// PR c++/111485
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+
+template concept C = always_true;
+
+template requires C class TT>
+void f();
+
+template requires C
+struct A;
+
+int main() {
+  f();
+  f(); // { dg-error "no match|constraint" }
+}




[Bug preprocessor/90400] _Pragma not always expanded in the right location within macros

2023-09-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90400

--- Comment #9 from CVS Commits  ---
The master branch has been updated by Lewis Hyatt :

https://gcc.gnu.org/g:d8e08ba9396b1f7da50011468f260250b7afaab7

commit r14-4186-gd8e08ba9396b1f7da50011468f260250b7afaab7
Author: Lewis Hyatt 
Date:   Fri Aug 25 15:57:19 2023 -0400

testsuite: Add test for already-fixed issue with _Pragma expansion
[PR90400]

The PR was fixed by r12-5454. Since the fix was somewhat incidental,
although related, add a testcase from PR90400 too before closing it out.

gcc/testsuite/ChangeLog:

PR preprocessor/90400
* c-c++-common/cpp/pr90400.c: New test.

[Bug preprocessor/61474] ICE (segfault) in preprocessor

2023-09-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61474

--- Comment #3 from CVS Commits  ---
The master branch has been updated by Lewis Hyatt :

https://gcc.gnu.org/g:601dbf2a799f691688dfe78250b5bea2717b5b5e

commit r14-4185-g601dbf2a799f691688dfe78250b5bea2717b5b5e
Author: Lewis Hyatt 
Date:   Fri Sep 15 13:31:51 2023 -0400

libcpp: Fix ICE on #include after a line marker directive [PR61474]

As noted in the PR, GCC will segfault if a file name is first seen in a
linemarker directive, and then later seen in a normal #include.  This is
because the fake include process adds the file to the cache with a null
PATH
member. The normal #include finds this file in the cache and then attempts
to use the null PATH.  Resolve by adding the file to the cache with a
unique
starting directory, so that the fake entry will only be found by a
subsequent fake include, not by a real one.

libcpp/ChangeLog:

PR preprocessor/61474
* files.cc (_cpp_find_file): Set DONT_READ to TRUE for fake
include files.
(_cpp_fake_include): Pass a unique cpp_dir* address so
the fake file will not be found when looked up for real.

gcc/testsuite/ChangeLog:

PR preprocessor/61474
* c-c++-common/cpp/pr61474-2.h: New test.
* c-c++-common/cpp/pr61474.c: New test.
* c-c++-common/cpp/pr61474.h: New test.

Re: Question on -fwrapv and -fwrapv-pointer

2023-09-20 Thread Kees Cook via Gcc
On Sat, Sep 16, 2023 at 10:36:52AM +0200, Martin Uecker wrote:
> > On Fri, Sep 15, 2023 at 08:18:28AM -0700, Andrew Pinski wrote:
> > > On Fri, Sep 15, 2023 at 8:12 AM Qing Zhao  wrote:
> > > >
> > > >
> > > >
> > > > > On Sep 15, 2023, at 3:43 AM, Xi Ruoyao  wrote:
> > > > >
> > > > > On Thu, 2023-09-14 at 21:41 +, Qing Zhao wrote:
> > > >  CLANG already provided -fsanitize=unsigned-integer-overflow. GCC
> > > >  might need to do the same.
> > > > >>>
> > > > >>> NO. There is no such thing as unsigned integer overflow. That option
> > > > >>> is badly designed and the GCC community has rejected a few times now
> > > > >>> having that sanitizer before. It is bad form to have a sanitizer for
> > > > >>> well defined code.
> > > > >>
> > > > >> Even though unsigned integer overflow is well defined, it might be
> > > > >> unintentional, shall we warn user about this?
> > > > >
> > > > > *Everything* could be unintentional and should be warned then.  GCC 
> > > > > is a
> > > > > compiler, not an advanced AI educating the programmers.
> > > >
> > > > Well, you are right in some sense. -:)
> > > >
> > > > However, overflow is one important source for security flaws, it’s 
> > > > important  for compilers to detect
> > > > overflows in the programs in general.
> > > 
> > > Except it is NOT an overflow. Rather it is wrapping. That is a big
> > > point here. unsigned wraps and does NOT overflow. Yes there is a major
> > > difference.
> > 
> > Right, yes. I will try to pick my language very carefully. :)
> > 
> > The practical problem I am trying to solve in the 30 million lines of
> > Linux kernel code is that of catching arithmetic wrap-around. The
> > problem is one of evolving the code -- I can't just drop -fwrapv and
> > -fwrapv-pointer because it's not possible to fix all the cases at once.
> > (And we really don't want to reintroduce undefined behavior.)
> > 
> > So, for signed, pointer, and unsigned types, we need:
> > 
> > a) No arithmetic UB -- everything needs to have deterministic behavior.
> >The current solution here is "-fno-strict-overflow", which eliminates
> >the UB and makes sure everything wraps.
> > 
> > b) A way to run-time warn/trap on overflow/underflow/wrap-around. This
> >would work with -fsanitize=[signed-integer|pointer]-overflow except
> >due to "a)" we always wrap. And there isn't currently coverage like
> >this for unsigned (in GCC).
> > 
> > Our problem is that the kernel is filled with a mix of places where there
> > is intended wrap-around and unintended wrap-around. We can chip away at
> > fixing the intended wrap-around that we can find with static analyzers,
> > etc, but at the end of the day there is a long tail of finding the places
> > where intended wrap-around is hiding. But when the refactoring is
> > sufficiently complete, we can move the wrap-around warning to a trap,
> > and the kernel will no longer have this class of security flaw.
> > 
> > As a real-world example, here is a bug where a u8 wraps around causing
> > an under-allocation that allowed for a heap overwrite:
> > 
> > https://git.kernel.org/linus/6311071a0562
> > https://elixir.bootlin.com/linux/v6.5/source/net/wireless/nl80211.c#L5422
> > 
> > If there were more than 255 elements in a linked list, the allocation
> > would be too small, and the second loop would write past the end of the
> > allocation. This is a pretty classic allocation underflow and linear
> > heap write overflow security flaw. (And it would be trivially stopped by
> > trapping on the u8 wrap around.)
> > 
> > So, I want to be able to catch that at run-time. But we also have code
> > doing things like "if (ulong + offset < ulong) { ... }":
> > 
> > https://elixir.bootlin.com/linux/v6.5/source/drivers/crypto/axis/artpec6_crypto.c#L1187
> > 
> > This is easy for a static analyzer to find and we can replace it with a
> > non-wrapping test (e.g. __builtin_add_overflow()), but we'll not find
> > them all immediately, especially for the signed and pointer cases.
> > 
> > So, I need to retain the "everything wraps" behavior while still being
> > able to detect when it happens.
> 
> 
> Hi Kees,
> 
> I have a couple of questions:
> 
> Currently, my thinking was that you would use signed integers
> if you want the usual integer arithmetic rules we know from
> elementary school and if you overflow this is clearly a bug 
> you can diagnose with UBsan.
> 
> There are people who think that signed overflow should be
> defined to wrap, but I think this would be a severe
> mistake because then code would start to rely on it, which
> makes it then difficult to differentiate between bugs and
> intended uses (e.g. the unfortunate situation you have 
> with the kernel).

Right -- my goal is to migrate the kernel codebase into using unambiguous
arithmetic. Doing that evolution, though, is the hard part.  :)
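
As a rough illustration of the two patterns quoted above, here is a minimal sketch in C++. The function names are hypothetical and do not come from the kernel sources; `__builtin_add_overflow` is the GCC/Clang builtin mentioned in the quoted text. The first pair contrasts the wrap-dependent test with an explicit check, and the second pair reduces the u8 counting problem to a loop, assuming the count is later used to size an allocation.

```cpp
#include <cstddef>
#include <cstdint>

// Wrap-dependent idiom: relies on unsigned arithmetic wrapping modulo 2^N.
// It is well defined, but at run time it looks the same as an accidental wrap.
bool end_wraps_idiom(unsigned long base, unsigned long offset)
{
    return base + offset < base;
}

// Explicit form: the builtin reports that the sum does not fit, so the
// caller never depends on the wrapped value.
bool end_wraps_builtin(unsigned long base, unsigned long offset)
{
    unsigned long end;
    return __builtin_add_overflow(base, offset, &end);
}

// Hypothetical reduction of the u8 problem: counting n elements in an
// 8-bit counter silently wraps once n exceeds 255, so a buffer sized
// from the result would be too small.
std::uint8_t count_in_u8(std::size_t n)
{
    std::uint8_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        ++count;  // wraps from 255 back to 0
    return count;
}

// Same count with the wrap made visible: returns false if the count
// does not fit in 8 bits, otherwise stores it in *out.
bool count_checked(std::size_t n, std::uint8_t *out)
{
    std::uint8_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (__builtin_add_overflow(count, std::uint8_t{1}, &count))
            return false;
    *out = count;
    return true;
}
```

With 300 elements, `count_in_u8` returns the wrapped value while `count_checked` reports failure; that difference is what a run-time wrap-around trap would surface automatically.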

At present, the kernel treats all signed and pointer arithmetic as
wrapping, as that makes sure that there is no UB, 

[Bug c++/111504] New: compare operator not defined for recursive data types on C++20

2023-09-20 Thread xgao at nvidia dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111504

Bug ID: 111504
   Summary: compare operator not defined for recursive data types
on C++20
   Product: gcc
   Version: 13.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xgao at nvidia dot com
  Target Milestone: ---

Related bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111316

The following code works on C++17 but not C++20:

#include 
#include 
#include 

template 
static auto hasLessThanHelper(int)
-> decltype(std::declval() < std::declval(), std::true_type{});

template 
static auto hasLessThanHelper(long) -> std::false_type;

template 
struct hasLessThan : decltype(hasLessThanHelper(0)) {};

struct DynamicType {
  using T1 = int64_t;
  using T2 = std::vector;
};

template <
typename DT,
typename = std::enable_if_t<
(hasLessThan::value ||
 hasLessThan::value ||
 hasLessThan::value ||
 hasLessThan::value)>>
inline constexpr bool operator<(const DT& x, const DT& y) {
  // implementation omitted
  return true;
}

int main() {
  using DT = DynamicType;
  // This assert works on C++17, but fails on C++20
  static_assert(hasLessThan, std::vector>::value);
}

[Bug middle-end/111502] Suboptimal unaligned 2/4-byte memcpy on strict-align targets

2023-09-20 Thread andrew at sifive dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111502

--- Comment #6 from Andrew Waterman  ---
Ack, I misunderstood your earlier message.  You're of course right that the
load/load/shift/or sequence is preferable to the load/load/store/store/load
sequence, on just about any practical implementation.  That the memcpy version
is optimized less optimally does seem to be disjoint from the issue Andrew
mentioned.

[Bug fortran/111503] New: Issues with POINTER, OPTIONAL, CONTIGUOUS dummy arguments

2023-09-20 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111503

Bug ID: 111503
   Summary: Issues with POINTER, OPTIONAL, CONTIGUOUS dummy
arguments
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: anlauf at gcc dot gnu.org
  Target Milestone: ---

The following code shows an issue that popped up while looking at pr55978:

program test
  implicit none
  integer, pointer, contiguous :: p(:) => null()
  print *, is_contiguous (p)
  call one (p)   ! accepted
  call one ()! accepted (but see pr55978 comment#19)
  call one (null())  ! rejected, but accepted by NAG, Intel
  call one (null(p)) ! rejected by NAG, accepted by Intel
contains
  subroutine one (x)
integer, pointer, optional, contiguous, intent(in) :: x(:)
!   integer,  optional, contiguous, intent(in) :: x(:) ! accepted
  end subroutine one
end

Intel accepts the code as-is.  NAG complains that NULL(p) is not contiguous:

Error: pr11.f90, line 8: Argument X (no. 1) of ONE is a CONTIGUOUS pointer,
but the actual argument NULL(P) is not simply contiguous

I guess that NAG is correct here, and Intel does not properly diagnose.

gfortran is maybe overly "cautious" and gives:

pr11.f90:7:12:

7 |   call one (null())  ! rejected, but accepted by NAG, Intel
  |1
Error: Actual argument to contiguous pointer dummy 'x' at (1) must be simply
contiguous
pr11.f90:8:12:

8 |   call one (null(p)) ! rejected by NAG, accepted by Intel
  |1
Error: Actual argument to contiguous pointer dummy 'x' at (1) must be simply
contiguous

As - in the current context - null() is equivalent to an absent actual
argument, I wonder whether we should relax our checks and follow NAG and Intel.

[Bug middle-end/111502] Suboptimal unaligned 2/4-byte memcpy on strict-align targets

2023-09-20 Thread lasse.collin at tukaani dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111502

--- Comment #5 from Lasse Collin  ---
If I understood correctly, PR 50417 is about wishing that GCC would infer that
a pointer given to memcpy has alignment higher than one. In my examples the
alignment of the uint8_t *b argument is one and thus byte-by-byte access is
needed (if the target processor doesn't have fast unaligned access, determined
from -mtune and -mno-strict-align).

My report is about the instruction sequence used for the byte-by-byte access.

Omitting the stack pointer manipulation and return instruction, this is
bytes16:

lbu a5,1(a0)
lbu a0,0(a0)
slli a5,a5,8
or  a0,a5,a0

And copy16:

lbu a4,0(a0)
lbu a5,1(a0)
sb  a4,14(sp)
sb  a5,15(sp)
lhu a0,14(sp)

Is the latter as good code as the former? If yes, then this report might be
invalid and I apologize for the noise.
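
For reference, the two functions being compared presumably look something like the sketch below. This is an assumption reconstructed from the assembly above, not the exact source from the report: the `const` qualifier and the little-endian byte order are guesses that match the RISC-V output.

```cpp
#include <cstdint>
#include <cstring>

// Assemble a little-endian 16-bit value one byte at a time.
// Only byte loads are needed, so the alignment of b does not matter.
std::uint16_t bytes16(const std::uint8_t *b)
{
    return static_cast<std::uint16_t>(b[0] | (b[1] << 8));
}

// The same load written as a 2-byte memcpy into a local variable.
// On a little-endian target this returns the same value as bytes16.
std::uint16_t copy16(const std::uint8_t *b)
{
    std::uint16_t v;
    std::memcpy(&v, b, sizeof v);
    return v;
}
```

Both forms are valid for a pointer with alignment one; the question in this report is only which instruction sequence the compiler picks for each.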

PR 50417 includes a case where a memcpy(a, b, 4) generates an actual call to
memcpy, so that is the same detail as the -Os case in my first message. Calling
memcpy instead of expanding it inline saves six bytes in RV64C. On ARM64 with
-Os -mstrict-align the call doesn't save space:

bytes32:
ldrb w1, [x0]
ldrb w2, [x0, 1]
orr x2, x1, x2, lsl 8
ldrb w1, [x0, 2]
ldrb w0, [x0, 3]
orr x1, x2, x1, lsl 16
orr w0, w1, w0, lsl 24
ret

copy32:
stp x29, x30, [sp, -32]!
mov x1, x0
mov x2, 4
mov x29, sp
add x0, sp, 28
bl  memcpy
ldr w0, [sp, 28]
ldp x29, x30, [sp], 32
ret

And on ARM64 with -O2 -mstrict-align, shuffling via the stack is longer too:

bytes32:
ldrb w4, [x0]
ldrb w2, [x0, 1]
ldrb w1, [x0, 2]
ldrb w3, [x0, 3]
orr x2, x4, x2, lsl 8
orr x0, x2, x1, lsl 16
orr w0, w0, w3, lsl 24
ret

copy32:
sub sp, sp, #16
ldrb w3, [x0]
ldrb w2, [x0, 1]
ldrb w1, [x0, 2]
ldrb w0, [x0, 3]
strb w3, [sp, 12]
strb w2, [sp, 13]
strb w1, [sp, 14]
strb w0, [sp, 15]
ldr w0, [sp, 12]
add sp, sp, 16
ret

ARM64 with -mstrict-align might be a contrived example in practice though.

[Bug fortran/107716] Getting negative values with NINT when using doubleprecision values in range on i386

2023-09-20 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107716

--- Comment #6 from Steve Kargl  ---
On Wed, Sep 20, 2023 at 07:07:37PM +, mikael at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107716
> 
> --- Comment #5 from Mikael Morin  ---
> (In reply to kargl from comment #4)
> > What the heck does "RESOLVED MOVED"?
> 
> I think it means this PR is not a gcc bug and the problem is tracked on some
> other project's bug tracker (it is "moved" there).
> 
> Not sure where else the problem is tracked in this case.
> 

Thanks, Mikael.  After a bit of duck-duck-go, I've come to a
similar conclusion.

<$0.02>
If a url/pointer to where the bug is actually tracked is not available,
this status should not be used.  It is useless.  I'll also note that if
one clicks on the 'status:' label a list of stati (statuses?) and their
meaning can be found.  MOVED is not among the list.


Re: Ping: [PATCH] testsuite: Add test for already-fixed issue with _Pragma expansion [PR90400]

2023-09-20 Thread Richard Sandiford
Lewis Hyatt via Gcc-patches  writes:
> Hello-
>
> May I please ping this one? It's adding a testcase prior to closing
> the PR. Thanks!
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628488.html

OK, thanks.  (Not really my area, but someone would probably have
objected by now if they were going to.)

Richard

>
> -Lewis
>
> On Fri, Aug 25, 2023 at 4:46 PM Lewis Hyatt  wrote:
>>
>> Hello-
>>
>> This is adding a testcase for a PR that was already incidentally fixed. OK
>> to commit please? Thanks...
>>
>> -Lewis
>>
>> -- >8 --
>>
>> The PR was fixed by r12-5454. Since the fix was somewhat incidental,
>> although related, add a testcase from PR90400 too before closing it out.
>>
>> gcc/testsuite/ChangeLog:
>>
>> PR preprocessor/90400
>> * c-c++-common/cpp/pr90400.c: New test.
>> ---
>>  gcc/testsuite/c-c++-common/cpp/pr90400.c | 14 ++
>>  1 file changed, 14 insertions(+)
>>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr90400.c
>>
>> diff --git a/gcc/testsuite/c-c++-common/cpp/pr90400.c 
>> b/gcc/testsuite/c-c++-common/cpp/pr90400.c
>> new file mode 100644
>> index 000..4f2cab8d6ab
>> --- /dev/null
>> +++ b/gcc/testsuite/c-c++-common/cpp/pr90400.c
>> @@ -0,0 +1,14 @@
>> +/* { dg-do compile } */
>> +/* { dg-additional-options "-save-temps" } */
>> +/* PR preprocessor/90400 */
>> +
>> +#define OUTER(x) x
>> +#define FOR(x) _Pragma ("GCC unroll 0") for (x)
>> +void f ()
>> +{
>> +/* If the pragma were to be seen prior to the expansion of FOR, as was
>> +   the case before r12-5454, then the unroll pragma would complain
>> +   because the immediately following statement would be ";" rather than
>> +   a loop.  */
>> +OUTER (; FOR (int i = 0; i != 1; ++i);) /* { dg-bogus {statement 
>> expected before ';' token} } */
>> +}


[COMMITTED] Tweak ssa_cache::merge_range API.

2023-09-20 Thread Andrew MacLeod
merge_range used to return TRUE if there was already a range in the
cache.  This patch changes the meaning of the return value such that
it returns TRUE if the range in the cache changes, i.e., it either sets a
range where there wasn't one before, or updates an existing range when
intersecting the old one with the new one results in a different range.
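
The new return-value convention can be sketched outside GCC roughly as follows. This is only an illustration under stated assumptions: `Interval` and `ToyCache` are hypothetical stand-ins for `vrange` and `ssa_cache`, names are plain integers rather than SSA names, and empty intersections are not modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <map>

// Hypothetical stand-in for a value range: the closed interval [lo, hi].
struct Interval
{
  int lo;
  int hi;
};

// Hypothetical stand-in for ssa_cache, keyed by an integer version number.
class ToyCache
{
public:
  // If NAME has a range, intersect it with R, otherwise set it to R.
  // Return true if the range is new or changes, false if it is unchanged.
  bool merge_range (int name, Interval r)
  {
    auto it = m_tab.find (name);
    if (it == m_tab.end ())
      {
        m_tab.emplace (name, r);
        return true;
      }
    Interval &curr = it->second;
    // Intersection of two closed intervals (assumed non-empty here).
    Interval merged = { std::max (curr.lo, r.lo), std::min (curr.hi, r.hi) };
    if (merged.lo == curr.lo && merged.hi == curr.hi)
      return false;
    curr = merged;
    return true;
  }

  Interval get (int name) const { return m_tab.at (name); }

private:
  std::map<int, Interval> m_tab;
};

// Small usage example exercising all three outcomes.
inline void toy_cache_demo ()
{
  ToyCache cache;
  assert (cache.merge_range (1, {0, 10}));   // new entry
  assert (!cache.merge_range (1, {0, 20}));  // intersection is still [0, 10]
  assert (cache.merge_range (1, {5, 20}));   // narrows to [5, 10]
  assert (cache.get (1).lo == 5 && cache.get (1).hi == 10);
}
```

A "did anything change" result of this kind is convenient for callers that only need to redo work when the stored range actually moved.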


It also tweaks the debug output for the cache to no longer output the
header text "non-varying Global Ranges" in the class, as the class is
now used for other purposes as well.  The text is now printed only when the
dump is actually from a global table.


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew
commit 0885e96272f1335c324f99fd2d1e9b0b3da9090c
Author: Andrew MacLeod 
Date:   Wed Sep 20 12:53:04 2023 -0400

Tweak merge_range API.

merge_range used to return TRUE if there was already a range.  Now it
returns TRUE if it adds a new range, OR updates an existing range
with a new value.  FALSE is returned when the range already matches.

* gimple-range-cache.cc (ssa_cache::merge_range): Change meaning
of the return value.
(ssa_cache::dump): Don't print GLOBAL RANGE header.
(ssa_lazy_cache::merge_range): Adjust return value meaning.
(ranger_cache::dump): Print GLOBAL RANGE header.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 5b74681b61a..3c819933c4e 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -606,7 +606,7 @@ ssa_cache::set_range (tree name, const vrange &r)
 }
 
 // If NAME has a range, intersect it with R, otherwise set it to R.
-// Return TRUE if there was already a range set, otherwise false.
+// Return TRUE if the range is new or changes.
 
 bool
 ssa_cache::merge_range (tree name, const vrange &r)
@@ -616,19 +616,23 @@ ssa_cache::merge_range (tree name, const vrange &r)
 m_tab.safe_grow_cleared (num_ssa_names + 1);
 
   vrange_storage *m = m_tab[v];
-  if (m)
+  // Check if this is a new value.
+  if (!m)
+m_tab[v] = m_range_allocator->clone (r);
+  else
 {
   Value_Range curr (TREE_TYPE (name));
   m->get_vrange (curr, TREE_TYPE (name));
-  curr.intersect (r);
+  // If there is no change, return false.
+  if (!curr.intersect (r))
+   return false;
+
   if (m->fits_p (curr))
m->set_vrange (curr);
   else
m_tab[v] = m_range_allocator->clone (curr);
 }
-  else
-m_tab[v] = m_range_allocator->clone (r);
-  return m != NULL;
+  return true;
 }
 
 // Set the range for NAME to R in the ssa cache.
@@ -656,27 +660,14 @@ ssa_cache::clear ()
 void
 ssa_cache::dump (FILE *f)
 {
-  /* Cleared after the table header has been printed.  */
-  bool print_header = true;
   for (unsigned x = 1; x < num_ssa_names; x++)
 {
   if (!gimple_range_ssa_p (ssa_name (x)))
continue;
   Value_Range r (TREE_TYPE (ssa_name (x)));
-  // Invoke dump_range_query which is a private virtual version of
-  // get_range.   This avoids performance impacts on general queries,
-  // but allows sharing of the dump routine.
+  // Dump all non-varying ranges.
   if (get_range (r, ssa_name (x)) && !r.varying_p ())
{
- if (print_header)
-   {
- /* Print the header only when there's something else
-to print below.  */
- fprintf (f, "Non-varying global ranges:\n");
- fprintf (f, "=:\n");
- print_header = false;
-   }
-
  print_generic_expr (f, ssa_name (x), TDF_NONE);
  fprintf (f, "  : ");
  r.dump (f);
@@ -684,8 +675,6 @@ ssa_cache::dump (FILE *f)
}
 }
 
-  if (!print_header)
-fputc ('\n', f);
 }
 
 // Return true if NAME has an active range in the cache.
@@ -716,7 +705,7 @@ ssa_lazy_cache::set_range (tree name, const vrange &r)
 }
 
 // If NAME has a range, intersect it with R, otherwise set it to R.
-// Return TRUE if there was already a range set, otherwise false.
+// Return TRUE if the range is new or changes.
 
 bool
 ssa_lazy_cache::merge_range (tree name, const vrange &r)
@@ -731,7 +720,7 @@ ssa_lazy_cache::merge_range (tree name, const vrange &r)
   if (v >= m_tab.length ())
 m_tab.safe_grow (num_ssa_names + 1);
   m_tab[v] = m_range_allocator->clone (r);
-  return false;
+  return true;
 }
 
 // Return TRUE if NAME has a range, and return it in R.
@@ -996,6 +985,8 @@ ranger_cache::~ranger_cache ()
 void
 ranger_cache::dump (FILE *f)
 {
+  fprintf (f, "Non-varying global ranges:\n");
+  fprintf (f, "=:\n");
   m_globals.dump (f);
   fprintf (f, "\n");
 }


Re: [PATCH v2] c++: Catch indirect change of active union member in constexpr [PR101631]

2023-09-20 Thread Jason Merrill

On 9/19/23 20:55, Nathaniel Shead wrote:

On Tue, Sep 19, 2023 at 05:25:20PM -0400, Jason Merrill wrote:

On 9/1/23 08:22, Nathaniel Shead wrote:

On Wed, Aug 30, 2023 at 04:28:18PM -0400, Jason Merrill wrote:

On 8/29/23 09:35, Nathaniel Shead wrote:

This is an attempt to improve the constexpr machinery's handling of
union lifetime by catching more cases that cause UB. Is this approach
OK?

I'd also like some feedback on a couple of pain points with this
implementation; in particular, is there a good way to detect if a type
has a non-deleted trivial constructor? I've used 'is_trivially_xible' in
this patch, but that also checks for a trivial destructor which by my
reading of [class.union.general]p5 is possibly incorrect. Checking for a
trivial default constructor doesn't seem too hard but I couldn't find a
good way of checking if that constructor is deleted.


I guess the simplest would be

(TYPE_HAS_TRIVIAL_DFLT (t) && locate_ctor (t))

because locate_ctor returns null for a deleted default ctor.  It would be
good to make this a separate predicate.
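
The pain point about a combined triviality check also looking at the destructor can be illustrated with the standard library trait. This is a hedged sketch, not GCC-internal code: it assumes `std::is_trivially_default_constructible` behaves like the internal check being discussed, which is the case on GCC and Clang, where the trait also involves destruction (LWG 2116). The struct names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Trivial default constructor and trivial destructor.
struct Plain { int x; };

// Default constructor is still trivial, but the destructor is user-provided.
struct NonTrivialDtor { int x; ~NonTrivialDtor () {} };

// Default constructor is deleted.
struct DeletedCtor { DeletedCtor () = delete; int x; };

// Wrapper around the library trait, used as a stand-in for a combined
// "trivially constructible" query.
template <typename T>
constexpr bool combined_check ()
{
  return std::is_trivially_default_constructible<T>::value;
}

static_assert (combined_check<Plain> (), "trivial ctor and dtor");
static_assert (!combined_check<DeletedCtor> (), "deleted ctor is rejected");

// On GCC and Clang the trait is also false here, although only the
// destructor is non-trivial; that is why a separate predicate looking
// only at the default constructor is wanted.
inline bool dtor_affects_check ()
{
  return !combined_check<NonTrivialDtor> ();
}
```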


I'm also generally unsatisfied with the additional complexity with the
third 'refs' argument in 'cxx_eval_store_expression' being pushed and
popped; would it be better to replace this with a vector of some
specific structure type for the data that needs to be passed on?


Perhaps, but what you have here is fine.  Another possibility would be to
just have a vec of the refs and extract the index from the ref later as
needed.

Jason



Thanks for the feedback. I've kept the refs as-is for now. I've also
cleaned up a couple of other typos I'd had with comments and diagnostics.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

@@ -6192,10 +6197,16 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
 type = reftype;
-  if (code == UNION_TYPE && CONSTRUCTOR_NELTS (*valp)
- && CONSTRUCTOR_ELT (*valp, 0)->index != index)
+  if (code == UNION_TYPE
+ && TREE_CODE (t) == MODIFY_EXPR
+ && (CONSTRUCTOR_NELTS (*valp) == 0
+ || CONSTRUCTOR_ELT (*valp, 0)->index != index))
{
- if (cxx_dialect < cxx20)
+ /* We changed the active member of a union. Ensure that this is
+valid.  */
+ bool has_active_member = CONSTRUCTOR_NELTS (*valp) != 0;
+ tree inner = strip_array_types (reftype);
+ if (has_active_member && cxx_dialect < cxx20)
{
  if (!ctx->quiet)
error_at (cp_expr_loc_or_input_loc (t),


While we're looking at this area, this error message should really mention
that it's allowed in C++20.


@@ -6205,8 +6216,36 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
  index);
  *non_constant_p = true;
}
- else if (TREE_CODE (t) == MODIFY_EXPR
-  && CONSTRUCTOR_NO_CLEARING (*valp))
+ else if (!is_access_expr
+  || (CLASS_TYPE_P (inner)
+  && !type_has_non_deleted_trivial_default_ctor (inner)))
+   {
+ /* Diagnose changing active union member after initialisation
+without a valid member access expression, as described in
+[class.union.general] p5.  */
+ if (!ctx->quiet)
+   {
+ if (has_active_member)
+   error_at (cp_expr_loc_or_input_loc (t),
+ "accessing %qD member instead of initialized "
+ "%qD member in constant expression",
+ index, CONSTRUCTOR_ELT (*valp, 0)->index);
+ else
+   error_at (cp_expr_loc_or_input_loc (t),
+ "accessing uninitialized member %qD",
+ index);
+ if (is_access_expr)
+   {
+ inform (DECL_SOURCE_LOCATION (index),
+ "%qD does not implicitly begin its lifetime "
+ "because %qT does not have a non-deleted "
+ "trivial default constructor",
+ index, inner);
+   }


The !is_access_expr case could also use an explanatory message.


Thanks for the review, I've updated these messages and will send through
an updated patch once bootstrap/regtest is complete.


Also, I notice that this testcase crashes with the patch:

union U { int i; float f; };
constexpr auto g (U u) { return (u.i = 42); }
static_assert (g({.f = 3.14}) == 42);


This appears to segfault even without the patch since GCC 13.1.
https://godbolt.org/z/45sPh8WaK

I haven't done a bisect yet to work out what commit exactly caused this.
Should I aim to fix this first before coming back with this patch?


Ah, I was just assuming it was related, never mind.  I'll fix it.

Jason



[Bug target/111500] [arm-none-eabi-gcc] / suboptimal optimization

2023-09-20 Thread gnu.arne at wgboome dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111500

--- Comment #2 from Luke  ---
(In reply to Andrew Pinski from comment #1)
> Can you attach (compilable) example code for each issue? Really these
> should be filed separately too.

Do you mean I should file 3 further bug reports?

I'll try examples for artifact #1 first...
Is it right like this?

example code for artifact #1a:
// "cmp" follows the "subs" immediately
__attribute((noinline)) void artiSUBS() {
for (int i=100; i>0; i--)
200017ec:   2364        movs    r3, #100    ; 0x64
*(volatile int*)0xE000E014 = i;
200017ee:   4a03        ldr     r2, [pc, #12]   ; (200017fc)
200017f0:   6013        str     r3, [r2, #0]
for (int i=100; i>0; i--)
200017f2:   3b01        subs    r3, #1
200017f4:   2b00        cmp     r3, #0
200017f6:   d1fb        bne.n   200017f0
}
200017f8:   4770        bx      lr
200017fa:   46c0        nop
200017fc:   e000e014

example code for artifact #1b:
// gcc tends to do the "i--" too early; "ldr" does not touch flags
// maybe because it hopes to get beneficial pipeline effects at loop entry?
__attribute((noinline)) void artiSUBS() {
200017ec:   2364        movs    r3, #100    ; 0x64
for (int i=100; i>0; i--)
(void) *(volatile int*)0xE000E014;
200017ee:   4a03        ldr     r2, [pc, #12]   ; (200017fc)
200017f0:   3b01        subs    r3, #1
200017f2:   6811        ldr     r1, [r2, #0]
for (int i=100; i>0; i--)
200017f4:   2b00        cmp     r3, #0
200017f6:   d1fb        bne.n   200017f0
}
200017f8:   4770        bx      lr
200017fa:   46c0        nop
200017fc:   e000e014

Re: [PATCH] [frange] Remove special casing from unordered operators.

2023-09-20 Thread Aldy Hernandez




On 9/20/23 11:12, Aldy Hernandez wrote:

In coming up with testcases for the unordered folders, I realized that
we were already handling them correctly, even in the absence of my
work in this area lately.

All of the unordered fold_range() methods try to fold with the ordered
variants first, and if they return TRUE, we are guaranteed to be able
to fold, even in the presence of NANs.  For example:

if (x_5 >= y_8)
   if (x_5 __UNLE y_8)

On the true side of the first conditional we know that either x_5 < y_8
or that one or more operands is a NAN.  Since UNLE_EXPR returns true
for precisely this scenario, we can fold as true.


Ugh, that should've been the false edge of the first conditional, thus:

if (x_5 >= y_8)
  {
  }
else
  {
// Relation at this point is: x_5 < y_8
// or either x_5 or y_8 is a NAN.
if (x_5 __UNLE y_8)
  link_error();
  }

The second conditional is foldable because LT U NAN is a subset of 
__UNLE (which is LE U NAN).
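
The subset argument can be checked with a small stand-alone sketch. It assumes that `__UNLE` means "less than or equal, or unordered", modelled here with the quiet comparison `std::isgreater`; the function names are hypothetical and this is not the range-op code itself.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// UNLE: true if x <= y or if either operand is a NaN.
inline bool unle (double x, double y)
{
  return !std::isgreater (x, y);
}

// On the false edge of (x >= y) we know x < y or an operand is a NaN.
// Verify over a few sample values that unle then always holds.
inline bool false_edge_implies_unle ()
{
  const double nan = std::numeric_limits<double>::quiet_NaN ();
  const double inf = std::numeric_limits<double>::infinity ();
  const double vals[] = { -inf, -1.0, 0.0, 1.0, inf, nan };
  for (double x : vals)
    for (double y : vals)
      if (!(x >= y) && !unle (x, y))
        return false;
  return true;
}
```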


The patch still stands though :).

Aldy



[Bug fortran/107716] Getting negative values with NINT when using doubleprecision values in range on i386

2023-09-20 Thread mikael at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107716

Mikael Morin  changed:

   What|Removed |Added

 CC||mikael at gcc dot gnu.org

--- Comment #5 from Mikael Morin  ---
(In reply to kargl from comment #4)
> What the heck does "RESOLVED MOVED" mean?

I think it means this PR is not a gcc bug and the problem is tracked on some
other project's bug tracker (it is "moved" there).

Not sure where else the problem is tracked in this case.

[Bug middle-end/111502] Suboptimal unaligned 2/4-byte memcpy on strict-align targets

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111502

Andrew Pinski  changed:

   What|Removed |Added

 Depends on||50417

--- Comment #4 from Andrew Pinski  ---
See pr 50417


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50417
[Bug 50417] [11/12/13/14 regression]: memcpy with known alignment

[Bug middle-end/111502] Suboptimal unaligned 2/4-byte memcpy on strict-align targets

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111502

Andrew Pinski  changed:

   What|Removed |Added

  Component|target  |middle-end

--- Comment #3 from Andrew Pinski  ---
This is a dup of this bug. Basically memcpy is not changed into an unaligned
load ..

Re: [PATCH] aarch64: Ensure const and sign correctness

2023-09-20 Thread Richard Sandiford
Pekka Seppänen  writes:
> Be const and sign correct by using a matching CIE augmentation type.
> Use a builtin instead of relying on <string.h> being included.
>
> libgcc/ChangeLog:
>
>   * config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key):
>   Use const unsigned type and a builtin.

Thanks for the patch, pushed to trunk.

Richard

> Signed-off-by: Pekka Seppänen 
> ---
>  libgcc/config/aarch64/aarch64-unwind.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/libgcc/config/aarch64/aarch64-unwind.h 
> b/libgcc/config/aarch64/aarch64-unwind.h
> index 3ad2f8239ed..d669edd671b 100644
> --- a/libgcc/config/aarch64/aarch64-unwind.h
> +++ b/libgcc/config/aarch64/aarch64-unwind.h
> @@ -40,8 +40,9 @@ aarch64_cie_signed_with_b_key (struct _Unwind_Context 
> *context)
>const struct dwarf_cie *cie = get_cie (fde);
>if (cie != NULL)
>   {
> -   char *aug_str = cie->augmentation;
> -   return strchr (aug_str, 'B') == NULL ? 0 : 1;
> +   const unsigned char *aug_str = cie->augmentation;
> +   return __builtin_strchr ((const char *) aug_str,
> +'B') == NULL ? 0 : 1;
>   }
>  }
>return 0;
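
The pattern in the patch can be tried in isolation with the sketch below. It is only an approximation: `augmentation_has_b_key` is a hypothetical helper taking the augmentation string directly instead of a `dwarf_cie`, it is written in C++ rather than C, and it assumes a compiler that provides `__builtin_strchr` (GCC or Clang).

```cpp
#include <cassert>

// Hypothetical helper, not the libgcc function: report whether the
// augmentation string contains 'B'.  The string is unsigned char, so it
// is cast explicitly and searched with the builtin, which avoids any
// dependence on a string header.
inline int augmentation_has_b_key (const unsigned char *aug_str)
{
  return __builtin_strchr (reinterpret_cast<const char *> (aug_str),
                           'B') == nullptr ? 0 : 1;
}

// Small usage example with two made-up augmentation strings.
inline void augmentation_demo ()
{
  const unsigned char with_b[] = { 'z', 'R', 'B', 0 };
  const unsigned char without_b[] = { 'z', 'R', 0 };
  assert (augmentation_has_b_key (with_b) == 1);
  assert (augmentation_has_b_key (without_b) == 0);
}
```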


[Bug target/111502] Suboptimal unaligned 2/4-byte memcpy on strict-align targets

2023-09-20 Thread lasse.collin at tukaani dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111502

--- Comment #2 from Lasse Collin  ---
Byte access by default is good when the compiler doesn't know if unaligned is
fast on the target processor. There is no disagreement here.

What I suspect is a bug is the instruction sequence used for byte access in
copy16 and copy32 cases. copy16 uses 2 * lbu + 2 * sb + 1 * lhu, that is, five
memory operations to load an unaligned 16-bit integer. copy32 uses 4 * lbu + 4
* sb + 1 * lw, that is, nine memory operations to load a 32-bit integer.

bytes16 needs two memory operations and bytes32 needs four. Clang generates
this kind of code from both bytesxx and copyxx.

[Bug fortran/107716] Getting negative values with NINT when using doubleprecision values in range on i386

2023-09-20 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107716

--- Comment #4 from kargl at gcc dot gnu.org ---
What the heck does "RESOLVED MOVED" mean?

[Bug tree-optimization/111502] Suboptimal unaligned 2/4-byte memcpy on strict-align targets

2023-09-20 Thread andrew at sifive dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111502

Andrew Waterman  changed:

   What|Removed |Added

 CC||andrew at sifive dot com

--- Comment #1 from Andrew Waterman  ---
This isn't actually a bug.  Quoting the RVA profile spec, "misaligned loads and
stores might execute extremely slowly"--which is code for the possibility that
they might be trapped and emulated, taking hundreds of clock cycles apiece.  So
the default behavior of emitting byte accesses is best when generating generic
code.  (Of course, when tuning for a particular microarchitecture, the shorter
code sequence may be emitted.)

Re: [PATCH][_GLIBCXX_INLINE_VERSION] Fix

2023-09-20 Thread François Dumont
libstdc++: [_GLIBCXX_INLINE_VERSION] Add handle_contract_violation 
symbol alias


libstdc++-v3/ChangeLog:

    * src/experimental/contract.cc
    [_GLIBCXX_INLINE_VERSION](handle_contract_violation): Provide 
    symbol alias without version namespace decoration for gcc.

Here is what I'm testing eventually, ok to commit if successful ?

François

On 20/09/2023 11:32, Jonathan Wakely wrote:

On Wed, 20 Sept 2023 at 05:51, François Dumont via Libstdc++
 wrote:

libstdc++: Remove std::constract_violation from versioned namespace

Spelling mistake in contract_violation, and it's not
std::contract_violation, it's std::experimental::contract_violation


GCC expects this type to be in std namespace directly.

Again, it's in std::experimental not in std directly.

Will this change cause problems when including another experimental
header, which does put experimental below std::__8?

I think std::__8::experimental and std::experimental will become ambiguous.

Maybe we do want to remove the inline __8 namespace from all
experimental headers. That needs a bit more thought though.


libstdc++-v3/ChangeLog:

  * include/experimental/contract:
  Remove _GLIBCXX_BEGIN_NAMESPACE_VERSION/_GLIBCXX_END_NAMESPACE_VERSION.

This line is too long for the changelog.


It does fix 29 g++.dg/contracts in gcc testsuite.

Ok to commit ?

François

diff --git a/libstdc++-v3/src/experimental/contract.cc b/libstdc++-v3/src/experimental/contract.cc
index 504a6c041f1..17daa3312ca 100644
--- a/libstdc++-v3/src/experimental/contract.cc
+++ b/libstdc++-v3/src/experimental/contract.cc
@@ -67,3 +67,14 @@ handle_contract_violation (const std::experimental::contract_violation 
   std::cerr << std::endl;
 #endif
 }
+
+#if _GLIBCXX_INLINE_VERSION
+// Provide symbol alias without version namespace decoration for gcc.
+extern "C"
+void _Z25handle_contract_violationRKNSt12experimental18contract_violationE
(const std::experimental::contract_violation &)
+__attribute__ (
+(alias
+ ("_Z25handle_contract_violationRKNSt3__812experimental18contract_violationE"),
+ weak));
+#endif
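
The weak-alias mechanism used in the patch can be sketched in a reduced form. This assumes an ELF target and a GCC-compatible compiler; `decorated_impl` and `undecorated_alias` are hypothetical names, and `extern "C"` is used so that no mangled name has to be spelled out as in the patch.

```cpp
#include <cassert>

// Hypothetical function standing in for the definition whose symbol
// carries the versioned-namespace decoration.
extern "C" int decorated_impl (int x)
{
  return x + 1;
}

// Second symbol name for the same code.  It is weak, so a different
// strong definition of this symbol elsewhere would take precedence.
extern "C" int undecorated_alias (int)
  __attribute__ ((alias ("decorated_impl"), weak));

// Usage: both names reach the same function.
inline void alias_demo ()
{
  assert (decorated_impl (41) == 42);
  assert (undecorated_alias (41) == 42);
}
```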


[Committed] RISC-V: Remove math.h import to resolve missing stubs failures

2023-09-20 Thread Patrick O'Neill

Committed. Thanks!

On 9/20/23 10:19, Kito Cheng wrote:

LGTM

Patrick O'Neill  於 2023年9月20日 週三 18:07 寫道:

Resolves some of the missing stubs failures:
fatal error: gnu/stubs-lp64d.h: No such file or directory
compilation terminated.

2023-09-20 Juzhe Zhong 

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/vls/def.h: Remove unneeded
        math.h import.

Tested-by: Patrick O'Neill 
---
Tested using 590a8bec3ed92118e084b0a1897d3314a666170e
glibc rv64gcv
glibc rv32gcv

glibc rv64gcv
Resolved failures:
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-4.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-6.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)

glibc rv32gcv
Resolved failures:
FAIL: gcc.target/riscv/rvv/autovec/vls/and-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/and-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/and-3.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-3.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-4.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-5.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-6.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-3.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-4.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-5.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/div-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-3.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-4.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-5.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-6.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-7.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/extract-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/extract-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: 

[Bug tree-optimization/111502] New: Suboptimal unaligned 2/4-byte memcpy on strict-align targets

2023-09-20 Thread lasse.collin at tukaani dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111502

Bug ID: 111502
   Summary: Suboptimal unaligned 2/4-byte memcpy on strict-align
targets
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lasse.collin at tukaani dot org
  Target Milestone: ---

I was playing with RISC-V GCC 12.2.0 from Arch Linux. I noticed
inefficient-looking assembly output in code that uses memcpy to access 32-bit
unaligned integers. I tried Godbolt with 16/32-bit integers and seems that the
same weirdness happens with RV32 & RV64 with GCC 13.2.0 and trunk, and also on
a few other targets. (Clang's output looks OK.)

For a little endian target:

#include <stdint.h>
#include <string.h>

uint32_t bytes16(const uint8_t *b)
{
return (uint32_t)b[0]
| ((uint32_t)b[1] << 8);
}

uint32_t copy16(const uint8_t *b)
{
uint16_t v;
memcpy(&v, b, sizeof(v));
return v;
}

riscv64-linux-gnu-gcc -march=rv64gc -O2 -mtune=size

bytes16:
lhu a0,0(a0)
ret

copy16:
lhu a0,0(a0)
ret

That looks good because -mno-strict-align is the default.

After omitting -mtune=size, unaligned access isn't used (the output is the same
as with -mstrict-align):

riscv64-linux-gnu-gcc -march=rv64gc -O2

bytes16:
lbu     a5,1(a0)
lbu     a0,0(a0)
slli    a5,a5,8
or      a0,a5,a0
ret

copy16:
lbu     a4,0(a0)
lbu     a5,1(a0)
addi    sp,sp,-16
sb      a4,14(sp)
sb      a5,15(sp)
lhu     a0,14(sp)
addi    sp,sp,16
jr      ra

bytes16 looks good but copy16 is weird: the bytes are copied to an aligned
location on stack and then loaded back.

On Godbolt it happens with GCC 13.2.0 on RV32, RV64, ARM64 (but only if using
-mstrict-align), MIPS64EL, and SPARC & SPARC64 (comparison needs big endian
bytes16). For ARM64 and MIPS64EL the oldest GCC on Godbolt is GCC 5.4 and the
same thing happens with that too.

32-bit reads with -O2 behave similarly. With -Os a call to memcpy is emitted
for copy32 but not for bytes32.

#include <stdint.h>
#include <string.h>

uint32_t bytes32(const uint8_t *b)
{
return (uint32_t)b[0]
| ((uint32_t)b[1] << 8)
| ((uint32_t)b[2] << 16)
| ((uint32_t)b[3] << 24);
}

uint32_t copy32(const uint8_t *b)
{
uint32_t v;
memcpy(&v, b, sizeof(v));
return v;
}

riscv64-linux-gnu-gcc -march=rv64gc -O2

bytes32:
lbu     a4,1(a0)
lbu     a3,0(a0)
lbu     a5,2(a0)
lbu     a0,3(a0)
slli    a4,a4,8
or      a4,a4,a3
slli    a5,a5,16
or      a5,a5,a4
slli    a0,a0,24
or      a0,a0,a5
sext.w  a0,a0
ret

copy32:
lbu     a2,0(a0)
lbu     a3,1(a0)
lbu     a4,2(a0)
lbu     a5,3(a0)
addi    sp,sp,-16
sb      a2,12(sp)
sb      a3,13(sp)
sb      a4,14(sp)
sb      a5,15(sp)
lw      a0,12(sp)
addi    sp,sp,16
jr      ra

riscv64-linux-gnu-gcc -march=rv64gc -Os

bytes32:
lbu     a4,1(a0)
lbu     a5,0(a0)
slli    a4,a4,8
or      a4,a4,a5
lbu     a5,2(a0)
lbu     a0,3(a0)
slli    a5,a5,16
or      a5,a5,a4
slli    a0,a0,24
or      a0,a0,a5
sext.w  a0,a0
ret

copy32:
addi    sp,sp,-32
mv      a1,a0
li      a2,4
addi    a0,sp,12
sd      ra,24(sp)
call    memcpy@plt
ld      ra,24(sp)
lw      a0,12(sp)
addi    sp,sp,32
jr      ra

I probably cannot test any proposed fixes but I hope this report is still
useful. Thanks!

[Bug c++/111493] [concepts] multidimensional subscript operator inside requires is broken

2023-09-20 Thread elrodc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111493

--- Comment #2 from Chris Elrod  ---
Note that it also shows up in gcc-13. I put gcc-14 as the version to indicate
that I confirmed it is still a problem on latest trunk. Not sure what the
policy is on which version we should report.

Re: [Committed] RISC-V: Support VLS unary floating-point patterns

2023-09-20 Thread Patrick O'Neill

Juzhe,

On a more general note, are we expecting #include <math.h> to cause a
testcase to fail?

My motivation is to make the testsuite less noisy when checking for
regressions. For example, a patch like this one:
https://patchwork.sourceware.org/project/gcc/patch/20230920023059.1728132-1-pan2...@intel.com/
is showing 4 new failures on rv32gcv from the {dg-do compile} testcases
that #include <math.h>. I might be wrong, but those don't look like real
failures to me [1][2][3].

On glibc rv64gcv I'm seeing tests like:
gcc.target/riscv/rvv/autovec/unop/vnot-rv32gcv.c
fail with similar missing stubs-ilp32d.h errors.

I want to sanity-check with other people that they are seeing similar
errors and that these errors indicate something wrong with the testsuite.
If nobody else is seeing these errors, I'd like to hear how you're
running the testsuite so I can debug the riscv-gnu-toolchain repo.

Patrick

[1]:
Executing on host: 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc 
-B/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/ 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c 
-march=rv32gcv -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output  
-O3 -ftree-vectorize -march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns 
-fno-schedule-insns2 -S   -o math-ceil-1.s (timeout = 600)
spawn -ignore SIGHUP 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc 
-B/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/ 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c 
-march=rv32gcv -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output 
-O3 -ftree-vectorize -march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns 
-fno-schedule-insns2 -S -o math-ceil-1.s
In file included from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/features.h:515,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/bits/libc-header-start.h:33,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/math.h:27,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h:1,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c:5:
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/gnu/stubs.h:17:11: 
fatal error: gnu/stubs-lp64d.h: No such file or directory

compilation terminated.
compiler exited with status 1
FAIL: gcc.target/riscv/rvv/autovec/math-ceil-1.c -O3 -ftree-vectorize 
(test for excess errors)


[2]:
https://github.com/ewlu/riscv-gnu-toolchain/issues/170

[3]:
This also extends beyond math.h. I'm seeing similar failures for
testcases like
gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c that
#include .


On 9/19/23 18:12, Patrick O'Neill wrote:


I'll let it run overnight and see if this helps. Even before this patch,
I was seeing 233 stubs related failures for rv32gcv and 7 for rv64gcv so
this won't fix all the issues.

It's easily replicated using upstream riscv-gnu-toolchain
git clone https://github.com/riscv-collab/riscv-gnu-toolchain
cd riscv-gnu-toolchain
git submodule update --init gcc
cd gcc
git pull master
cd ..
mkdir build
cd build
../configure --prefix=$(pwd) --with-arch=rv32gcv --with-abi=ilp32d
make report-linux -j32

Then search for "stubs" in the debug logs 
(/build-gcc-linux-stage2/gcc/testsuite/*.log)


Patrick

On 9/19/23 17:54, juzhe.zh...@rivai.ai wrote:

I think we could remove math.h.

Hi, @Patrick. Could you verify it?

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h

index 2292372d7a3..674098e9ba6 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
@@ -1,5 +1,4 @@
 #include 
-#include <math.h>

and commit it.

Thanks.


Re: [Patch, fortran] PR68155 - ICE on initializing character array in type (len_lhs <> len_rhs)

2023-09-20 Thread Harald Anlauf

Hi Paul,

On 9/20/23 09:03, Paul Richard Thomas wrote:

Hi All,

This is a straightforward patch that is adequately explained by the ChangeLog.

Regtests fine - OK for trunk?


this looks good to me.  OK for trunk.

As it is an almost obvious fix for sort of wrong code, I'd consider
it backportable if you have intentions in that direction.

Thanks,
Harald


Cheers

Paul

Fortran: Pad mismatched charlens in component initializers [PR68155]

2023-09-20  Paul Thomas  

gcc/fortran
PR fortran/68155
* decl.cc (fix_initializer_charlen): New function broken out of
add_init_expr_to_sym.
(add_init_expr_to_sym, build_struct): Call the new function.

gcc/testsuite/
PR fortran/68155
* gfortran.dg/pr68155.f90: New test.




Re: [PATCH] RISC-V: Remove math.h import to resolve missing stubs failures

2023-09-20 Thread Kito Cheng
LGTM

Patrick O'Neill wrote on Wed, Sep 20, 2023 at 18:07:

> Resolves some of the missing stubs failures:
> fatal error: gnu/stubs-lp64d.h: No such file or directory
> compilation terminated.
>
> 2023-09-20 Juzhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/vls/def.h: Remove unneeded math.h
> import.
>
> Tested-by: Patrick O'Neill 
> ---
> Tested using 590a8bec3ed92118e084b0a1897d3314a666170e
> glibc rv64gcv
> glibc rv32gcv
>
> glibc rv64gcv
> Resolved failures:
> FAIL: gcc.target/riscv/rvv/autovec/vls/mov-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/mov-4.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/mov-6.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
>
> glibc rv32gcv
> Resolved failures:
> FAIL: gcc.target/riscv/rvv/autovec/vls/and-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/and-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/and-3.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-3.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-4.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-5.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-6.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-3.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-4.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-5.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/div-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-3.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-4.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-5.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-6.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-7.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/extract-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/extract-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-1.c -O3
> -ftree-vectorize --param riscv-autovec-preference=scalable (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c -O3
> -ftree-vectorize --param riscv-autovec-preference=scalable (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-3.c -O3
> -ftree-vectorize --param riscv-autovec-preference=scalable (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-1.c -O3
> -ftree-vectorize --param riscv-autovec-preference=scalable (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-2.c 

[PATCH] c++: constraint rewriting during ttp coercion [PR111485]

2023-09-20 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps backports?

-- >8 --

In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters.  The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2.  This patch
fixes this by including the outer template arguments in this substitution,
which ought to match the depth of the ttp.

The second testcase demonstrates that it's sometimes necessary to
substitute the concrete outer template arguments instead of generic
ones, because a ttp's constraints could depend on outer arguments.

PR c++/111485

gcc/cp/ChangeLog:

* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.
---
 gcc/cp/pt.cc   |  5 +++--
 gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C | 24 ++
 gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C | 17 +++
 3 files changed, 44 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 8758e218ce4..f47887291a6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -8360,7 +8360,7 @@ canonicalize_expr_argument (tree arg, tsubst_flags_t 
complain)
constrained than the parameter.  */
 
 static bool
-is_compatible_template_arg (tree parm, tree arg)
+is_compatible_template_arg (tree parm, tree arg, tree args)
 {
   tree parm_cons = get_constraints (parm);
 
@@ -8381,6 +8381,7 @@ is_compatible_template_arg (tree parm, tree arg)
 {
   tree aparms = DECL_INNERMOST_TEMPLATE_PARMS (arg);
   new_args = template_parms_level_to_args (aparms);
+  new_args = add_to_template_args (args, new_args);
   ++processing_template_decl;
   parm_cons = tsubst_constraint_info (parm_cons, new_args,
  tf_none, NULL_TREE);
@@ -8635,7 +8636,7 @@ convert_template_argument (tree parm,
   // Check that the constraints are compatible before allowing the
   // substitution.
   if (val != error_mark_node)
-if (!is_compatible_template_arg (parm, arg))
+   if (!is_compatible_template_arg (parm, arg, args))
   {
if (in_decl && (complain & tf_error))
   {
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
new file mode 100644
index 000..4129e9e1303
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
@@ -0,0 +1,24 @@
+// PR c++/111485
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+
+template concept C = always_true;
+template concept D = C || true;
+
+template class TT> struct example { };
+template class UU> using example_t = example;
+
+template
+struct A {
+  template class TT> struct example { };
+
+  template class UU> using example_t = example;
+
+  template
+  struct B {
+template class UU> using example_t = example;
+  };
+};
+
+template struct A::B;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C
new file mode 100644
index 000..7832cabc7d8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C
@@ -0,0 +1,17 @@
+// PR c++/111485
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+
+template concept C = always_true;
+
+template requires C class TT>
+void f();
+
+template requires C
+struct A;
+
+int main() {
+  f();
+  f(); // { dg-error "no match|constraint" }
+}
-- 
2.42.0.216.gbda494f404



Re: [PATCH] AArch64: Fix strict-align cpymem/setmem [PR103100]

2023-09-20 Thread Wilco Dijkstra
Hi Richard,

> * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.

> Shouldn't this be a separate patch?  It's not immediately obvious that this 
> is a necessary part of this change.

You mean this?

@@ -1627,7 +1627,7 @@ (define_expand "cpymemdi"
(match_operand:BLK 1 "memory_operand")
(match_operand:DI 2 "general_operand")
(match_operand:DI 3 "immediate_operand")]
-   "!STRICT_ALIGNMENT || TARGET_MOPS"
+   ""

Yes that's necessary since that is the bug.

> +  unsigned align = INTVAL (operands[3]);
>
>This should read the value with UINTVAL.  Given the useful range of the 
>alignment, it should be OK that we're not using unsigned HWI.

I'll fix that.

> +  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_cpymem_mops (operands);
>
> So what about align=4 and copying, for example, 8 or 12 bytes; wouldn't we 
> want a sequence of LDR/STR in that case?  Doesn't this fall back to MOPS too 
> eagerly?

The goal was to fix the issue in a way that is both obvious and can be easily
backported.
Further improvements can be made to handle other alignments, but it is
slightly tricky (eg. align == 4 won't emit LDP/STP directly using current code
and thus would need additional work to generalize the LDP path).
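
To make the trade-off in this exchange concrete, here is a toy model of the quoted condition, written as plain C rather than GCC internals. The function and parameter names are hypothetical; only the boolean expression is taken from the patch hunk quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the quoted condition
     !CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16)
   All names here are hypothetical; this is not GCC code.  */
bool falls_back_to_mops(bool size_is_constant, bool strict_alignment,
                        unsigned align)
{
    return !size_is_constant || (strict_alignment && align < 16);
}
```

Under this model, a constant-size copy with align == 4 takes the fallback path whenever strict alignment is in force, regardless of the copy size, which is the case the review comment asks about.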
  
>> +  unsigned max_mops_size = aarch64_mops_memcpy_size_threshold;
>
>I find this name slightly confusing.  Surely it's min_mops_size (since above 
>that we want to use MOPS rather than inlined loads/stores).  But why not just 
>use aarch64_mops_memcpy_size_threshold directly in the one place it's used?

The reason is that in a follow-on patch I check 
aarch64_mops_memcpy_size_threshold
too, so for now this acts as a shortcut for the ridiculously long name.

> Are there any additional tests for this?

There are existing tests that check the expansion which fail if you completely
block expansions with STRICT_ALIGNMENT.

Cheers,
Wilco

[PATCH] RISC-V: Remove math.h import to resolve missing stubs failures

2023-09-20 Thread Patrick O'Neill
Resolves some of the missing stubs failures:
fatal error: gnu/stubs-lp64d.h: No such file or directory
compilation terminated.

2023-09-20 Juzhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Remove unneeded math.h
import.

Tested-by: Patrick O'Neill 
---
Tested using 590a8bec3ed92118e084b0a1897d3314a666170e
glibc rv64gcv
glibc rv32gcv

glibc rv64gcv
Resolved failures:
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-4.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-6.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)

glibc rv32gcv
Resolved failures:
FAIL: gcc.target/riscv/rvv/autovec/vls/and-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/and-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/and-3.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-3.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-4.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-5.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-6.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-3.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-4.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-5.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/div-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-3.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-4.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-5.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-6.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-7.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/extract-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/extract-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-1.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-3.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-1.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-2.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-3.c -O3 
-ftree-vectorize --param 

[Bug middle-end/111483] [14 Regression] ICE in to_sreal, at profile-count.cc:472

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111483

Andrew Pinski  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 Status|NEW |RESOLVED

--- Comment #3 from Andrew Pinski  ---
Dup of bug 111054.

*** This bug has been marked as a duplicate of bug 111054 ***

[Bug middle-end/111054] [14 Regression] ICE: in to_sreal, at profile-count.cc:472 with -O3 -fno-guess-branch-probability

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111054

Andrew Pinski  changed:

   What|Removed |Added

 CC||19373742 at buaa dot edu.cn

--- Comment #3 from Andrew Pinski  ---
*** Bug 111483 has been marked as a duplicate of this bug. ***

Re: [Committed] RISC-V: Fix Demand comparison bug[VSETVL PASS]

2023-09-20 Thread Kito Cheng
Does it also happen on the gcc 13 branch? If so, please backport :)

Juzhe-Zhong wrote on Wed, Sep 20, 2023 at 11:09:

> This bug is exposed when we support VLS integer conversion patterns.
>
> FAIL: c-c++-common/torture/pr53505.c execution.
>
> This is caused by incorrect vsetvl elimination in Phase 4:
>
>    10318:   0d207057    vsetvli zero,zero,e32,m4,ta,ma
>    1031c:   5e003e57    vmv.v.i v28,0
>    .:   missed e8,m1 vsetvl
>    10320:   7b07b057    vmsgtu.vi   v0,v16,15
>    10324:   03083157    vadd.vi v2,v16,-16
>
> Regression testing on the release version of GCC shows no surprising differences.
>
> Committed.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc (vector_insn_info::operator==): Fix
> bug.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc
> b/gcc/config/riscv/riscv-vsetvl.cc
> index df980b6770e..e0f61148ef3 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -1799,10 +1799,11 @@ vector_insn_info::operator== (const
> vector_insn_info &other) const
>  if (m_demands[i] != other.demand_p ((enum demand_type) i))
>return false;
>
> -  if (vector_config_insn_p (m_insn->rtl ())
> -  || vector_config_insn_p (other.get_insn ()->rtl ()))
> -if (m_insn != other.get_insn ())
> -  return false;
> +  /* We should consider different INSN demands as different
> + expression.  Otherwise, we will be doing incorrect vsetvl
> + elimination.  */
> +  if (m_insn != other.get_insn ())
> +return false;
>
>if (!same_avl_p (other))
>  return false;
> --
> 2.36.3
>
>


[Bug c++/103524] [meta-bug] modules issue

2023-09-20 Thread johelegp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103524
Bug 103524 depends on bug 104993, which changed state.

Bug 104993 Summary: [modules] Missing diagnostic when exporting using-directive
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104993

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

[Bug c++/104993] [modules] Missing diagnostic when exporting using-directive

2023-09-20 Thread johelegp at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104993

Johel Ernesto Guerrero Peña  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #1 from Johel Ernesto Guerrero Peña  ---
> [module.interface]p3s1 says
> > An exported declaration that is not a module-import-declaration shall 
> > declare at least one name.

Removed by "DR20: Meaningful exports P2615R0".

[Bug target/111501] RISC-V: non-optimal casting when shifting

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111501

Andrew Pinski  changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu.org

--- Comment #2 from Andrew Pinski  ---
I think there might be a dup of this bug already.

Also with zbb, GCC gets:

srli    a0,a0,32
zext.h  a0,a0
ret

[Bug target/111501] RISC-V: non-optimal casting when shifting

2023-09-20 Thread palmer at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111501

palmer at gcc dot gnu.org changed:

   What|Removed |Added

   Last reconfirmed||2023-09-20
   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW
  Component|c   |target
 Ever confirmed|0   |1
 CC||palmer at gcc dot gnu.org,
   ||vineetg at gcc dot gnu.org

--- Comment #1 from palmer at gcc dot gnu.org ---
Adding Vineet.

[Bug c++/111496] Optimizer issue when reinitializing an object of a standard-layout class with a trivial copy constructor and a trivial destructor but with padding bytes

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111496

Andrew Pinski  changed:

   What|Removed |Added

Summary|Optimizer issue when|Optimizer issue when
   |reinitializing an object of |reinitializing an object of
   |a standard-layout class |a standard-layout class
   |with a trivial copy |with a trivial copy
   |constructor and a trivial   |constructor and a trivial
   |destructor  |destructor but with padding
   ||bytes

--- Comment #1 from Andrew Pinski  ---
So your class examples all have padding in them (which you can get a warning
about with -Wpadded).
Once you remove the padding (via adding another field at the end):
```
int b[4] = {};
char* p = {};
int x = {};
int t = {};
```

The code gets better.

So the big question is whether padding bytes should be copied or not ...
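
A minimal C sketch of the padding point, assuming a typical ABI where `int` is 4 bytes. The struct names and helper functions are hypothetical, and the default member initializers from the snippet above are dropped because this is C rather than C++.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the classes in the report; only the member
   list comes from the comment above.  */
struct with_padding {
    int b[4];
    char *p;
    int x;
};

struct without_padding {
    int b[4];
    char *p;
    int x;
    int t;   /* extra field that occupies the former tail padding */
};

/* Bytes after the last member that belong to the object but to no member.  */
size_t tail_padding_with(void)
{
    return sizeof(struct with_padding)
           - (offsetof(struct with_padding, x) + sizeof(int));
}

size_t tail_padding_without(void)
{
    return sizeof(struct without_padding)
           - (offsetof(struct without_padding, t) + sizeof(int));
}
```

On an LP64 target with 8-byte-aligned pointers the first struct ends with 4 bytes of tail padding and the second with none; on targets with 4-byte pointers neither struct has tail padding.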

[Bug target/110751] RISC-V: Suport undefined value that allows VSETVL PASS use TA/MA

2023-09-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

--- Comment #42 from CVS Commits  ---
The master branch has been updated by Robin Dapp :

https://gcc.gnu.org/g:27282dc0931484c31fa391772499d878afcc746a

commit r14-4179-g27282dc0931484c31fa391772499d878afcc746a
Author: Juzhe-Zhong 
Date:   Wed Sep 20 22:58:49 2023 +0800

internal-fn: Support undefined rtx for uninitialized SSA_NAME[PR110751]

According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

As Richard and Richi suggested, we recognize uninitialized SSA_NAME and
convert it
into SCRATCH rtx if the target predicate allows SCRATCH.

It can help to reduce redundant data move instructions of targets like
RISC-V.

Bootstrap and Regression on x86 passed.

gcc/ChangeLog:
PR target/110751

* internal-fn.cc (expand_fn_using_insn): Support undefined rtx
value.
* optabs.cc (maybe_legitimize_operand): Ditto.
(can_reuse_operands_p): Ditto.
* optabs.h (enum expand_operand_type): Ditto.
(create_undefined_input_operand): Ditto.

[Bug c/111501] New: RISC-V: non-optimal casting when shifting

2023-09-20 Thread charlie at rivosinc dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111501

Bug ID: 111501
   Summary: RISC-V: non-optimal casting when shifting
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: charlie at rivosinc dot com
  Target Milestone: ---

Created attachment 55949
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55949&action=edit
tar file of -save-temps output

I would expect the first to be able to compile into the second:

unsigned int do_shift(unsigned long csum)
{
return (unsigned short)(csum >> 32);
}

unsigned int do_shift2(unsigned long csum)
{
return (csum << 16) >> 48;
}

However, the asm output is instead:

do_shift:
        srli    a0,a0,32
        slli    a0,a0,48
        srli    a0,a0,48
        ret
do_shift2:
        slli    a0,a0,16
        srli    a0,a0,48
        ret

These are the same so the first should be able to be compiled into the second.
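
The equivalence claimed above can be spot-checked on a host machine with a small sketch. It assumes a 16-bit `unsigned short` and uses `uint64_t` in place of the report's `unsigned long` (which is 64 bits on rv64); `check_samples` is a hypothetical helper, not part of the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* uint64_t stands in for the report's `unsigned long` on rv64.  */
unsigned int do_shift(uint64_t csum)
{
    return (unsigned short)(csum >> 32);
}

unsigned int do_shift2(uint64_t csum)
{
    return (unsigned int)((csum << 16) >> 48);
}

/* Hypothetical helper: both functions extract bits 32..47, so they must
   agree on every input; check a handful of values.  */
void check_samples(void)
{
    const uint64_t samples[] = {
        0, 1, 0xffffffffull, 0x0000123400000000ull,
        0xdeadbeefcafef00dull, UINT64_MAX,
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(do_shift(samples[i]) == do_shift2(samples[i]));
}
```

Both functions keep only bits 32 to 47 of the input, which is why the two-instruction shift pair is a valid replacement for the three-instruction sequence.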

[Bug target/111500] [arm-none-eabi-gcc] / suboptimal optimization

2023-09-20 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111500

Andrew Pinski  changed:

   What|Removed |Added

  Component|c   |target
 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1
   Keywords||missed-optimization
 Target||arm
   Last reconfirmed||2023-09-20

--- Comment #1 from Andrew Pinski  ---
Can you attach (compilable) example code for each issue? Really these should
be filed separately too.

[Bug c++/111471] Incorrect NTTP printing in the error messages

2023-09-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111471

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:75c4b0cde4835b45350da0a5cd82f1d1a0a7a2f1

commit r14-4178-g75c4b0cde4835b45350da0a5cd82f1d1a0a7a2f1
Author: Patrick Palka 
Date:   Wed Sep 20 12:09:36 2023 -0400

c++: improve class NTTP object pretty printing [PR111471]

1. Move class NTTP object pretty printing to a more general spot in
   the pretty printer, so that we always print its value instead of
   its (mangled) name even when it appears outside of a template
   argument list.
2. Print the type of a class NTTP object alongside its CONSTRUCTOR
   value, like dump_expr would have done.
3. Don't print const VIEW_CONVERT_EXPR wrappers for class NTTPs.

PR c++/111471

gcc/cp/ChangeLog:

* cxx-pretty-print.cc (cxx_pretty_printer::expression)
: Handle class NTTP objects by printing
their type and value.
: Strip const VIEW_CONVERT_EXPR
wrappers for class NTTPs.
(pp_cxx_template_argument_list): Don't handle class NTTP
objects here.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/diagnostic19.C: New test.

[Bug target/111481] MacOS, linker issues with Xcode 15

2023-09-20 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111481

Iain Sandoe  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Iain Sandoe  ---
(In reply to Eric Gallager from comment #3)
> (In reply to simon from comment #2)

(this is not a GCC bug AFAIK - but for the sake of information)

> > A fix for the Ada issue is to link with the classic linker:
> > 
> > $ gnatmake hello -largs -Wl,-ld_classic
> > gcc -c hello.adb
> > gnatbind -x hello.ali
> > gnatlink hello.ali -Wl,-ld_classic
> > $

you really need to configure the toolchain with "--with-ld=/path/to/ld-classic"

NOTE: AFAIU, the ld bug is reportedly fixed "upstream" and will appear in some
future dot release of Xcode 15.  At present, I'd just suggest sticking with
14.3.

> This is annoying how Apple is reusing the name `ld_classic` for something
> new, after it previously referred to something else...

Indeed, it is for those of us supporting older Darwin.

[PATCH v2] AArch64: Fix memmove operand corruption [PR111121]

2023-09-20 Thread Wilco Dijkstra
A MOPS memmove may corrupt registers since there is no copy of the input
operands to temporary registers.  Fix this by calling
aarch64_expand_cpymem_mops.
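
For illustration, the kind of source that exposes the problem looks like the move1 testcase added further down in this patch: the destination pointer is used again after the move. This is a sketch using the standard `memmove` rather than `__builtin_memmove`, and the function name is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Mirrors the shape of the move1 testcase in this patch: the destination
   pointer is still needed after the move, so the expansion must leave the
   register holding it intact.  */
void move_and_keep(int *dst, const int *src, size_t bytes, int **res)
{
    memmove(dst, src, bytes);
    *res = dst;
}
```

The expected assembly in that testcase copies x0 to a temporary before the cpyp/cpym/cpye sequence, so the original pointer in x0 is still available for the final store.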

Passes regress/bootstrap, OK for commit?

gcc/ChangeLog/
PR target/111121
* config/aarch64/aarch64.md (aarch64_movmemdi): Add new expander.
(movmemdi): Call aarch64_expand_cpymem_mops for correct expansion.
* config/aarch64/aarch64.cc (aarch64_expand_cpymem_mops): Add support
for memmove.
* config/aarch64/aarch64-protos.h (aarch64_expand_cpymem_mops): Add new
function.

gcc/testsuite/ChangeLog/
PR target/111121
* gcc.target/aarch64/mops_4.c: Add memmove testcases.

---

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
70303d6fd953e0c397b9138ede8858c2db2e53db..e8d91cba30e32e03c4794ccc24254691d135f2dd
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -765,6 +765,7 @@ bool aarch64_emit_approx_div (rtx, rtx, rtx);
 bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
 tree aarch64_vector_load_decl (tree);
 void aarch64_expand_call (rtx, rtx, rtx, bool);
+bool aarch64_expand_cpymem_mops (rtx *, bool);
 bool aarch64_expand_cpymem (rtx *);
 bool aarch64_expand_setmem (rtx *);
 bool aarch64_float_const_zero_rtx_p (rtx);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
219c4ee6d4cd7522f6ad634c794485841e5d08fa..dd6874d13a75f20d10a244578afc355b25c73da2
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25228,10 +25228,11 @@ aarch64_copy_one_block_and_progress_pointers (rtx 
*src, rtx *dst,
   *dst = aarch64_progress_pointer (*dst);
 }
 
-/* Expand a cpymem using the MOPS extension.  OPERANDS are taken
-   from the cpymem pattern.  Return true iff we succeeded.  */
-static bool
-aarch64_expand_cpymem_mops (rtx *operands)
+/* Expand a cpymem/movmem using the MOPS extension.  OPERANDS are taken
+   from the cpymem/movmem pattern.  IS_MEMMOVE is true if this is a memmove
+   rather than memcpy.  Return true iff we succeeded.  */
+bool
+aarch64_expand_cpymem_mops (rtx *operands, bool is_memmove = false)
 {
   if (!TARGET_MOPS)
 return false;
@@ -25243,8 +25244,10 @@ aarch64_expand_cpymem_mops (rtx *operands)
   rtx dst_mem = replace_equiv_address (operands[0], dst_addr);
   rtx src_mem = replace_equiv_address (operands[1], src_addr);
   rtx sz_reg = copy_to_mode_reg (DImode, operands[2]);
-  emit_insn (gen_aarch64_cpymemdi (dst_mem, src_mem, sz_reg));
-
+  if (is_memmove)
+emit_insn (gen_aarch64_movmemdi (dst_mem, src_mem, sz_reg));
+  else
+emit_insn (gen_aarch64_cpymemdi (dst_mem, src_mem, sz_reg));
   return true;
 }
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 
60133b541e9289610ce58116b0258a61f29bdc00..6d0f072a9dd6d094e8764a513222a9129d8296fa
 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1635,7 +1635,22 @@ (define_expand "cpymemdi"
 }
 )
 
-(define_insn "aarch64_movmemdi"
+(define_expand "aarch64_movmemdi"
+  [(parallel
+ [(set (match_operand 2) (const_int 0))
+  (clobber (match_dup 3))
+  (clobber (match_dup 4))
+  (clobber (reg:CC CC_REGNUM))
+  (set (match_operand 0)
+  (unspec:BLK [(match_operand 1) (match_dup 2)] UNSPEC_MOVMEM))])]
+  "TARGET_MOPS"
+  {
+operands[3] = XEXP (operands[0], 0);
+operands[4] = XEXP (operands[1], 0);
+  }
+)
+
+(define_insn "*aarch64_movmemdi"
   [(parallel [
(set (match_operand:DI 2 "register_operand" "+") (const_int 0))
(clobber (match_operand:DI 0 "register_operand" "+"))
@@ -1668,17 +1683,9 @@ (define_expand "movmemdi"
&& INTVAL (sz_reg) < aarch64_mops_memmove_size_threshold)
  FAIL;
 
-   rtx addr_dst = XEXP (operands[0], 0);
-   rtx addr_src = XEXP (operands[1], 0);
-
-   if (!REG_P (sz_reg))
- sz_reg = force_reg (DImode, sz_reg);
-   if (!REG_P (addr_dst))
- addr_dst = force_reg (DImode, addr_dst);
-   if (!REG_P (addr_src))
- addr_src = force_reg (DImode, addr_src);
-   emit_insn (gen_aarch64_movmemdi (addr_dst, addr_src, sz_reg));
-   DONE;
+  if (aarch64_expand_cpymem_mops (operands, true))
+DONE;
+  FAIL;
 }
 )
 
diff --git a/gcc/testsuite/gcc.target/aarch64/mops_4.c 
b/gcc/testsuite/gcc.target/aarch64/mops_4.c
index 
1b87759cb5e8bbcbb58cf63404d1d579d44b2818..dd796115cb4093251964d881e93bf4b98ade0c32
 100644
--- a/gcc/testsuite/gcc.target/aarch64/mops_4.c
+++ b/gcc/testsuite/gcc.target/aarch64/mops_4.c
@@ -50,6 +50,54 @@ copy3 (int *x, int *y, long z, long *res)
   *res = z;
 }
 
+/*
+** move1:
+** mov (x[0-9]+), x0
+** cpyp\[\1\]!, \[x1\]!, x2!
+** cpym\[\1\]!, \[x1\]!, x2!
+** cpye\[\1\]!, \[x1\]!, x2!
+** str x0, \[x3\]
+** ret
+*/
+void
+move1 (int *x, int *y, long z, int **res)
+{
+  __builtin_memmove (x, y, z);
+  *res = x;
+}
+
+/*
+** 

Re: [PATCH] c, c++, v3: Accept __builtin_classify_type (typename)

2023-09-20 Thread Joseph Myers
On Wed, 20 Sep 2023, Jakub Jelinek wrote:

> On Mon, Sep 18, 2023 at 09:25:19PM +, Joseph Myers wrote:
> > > I'd like to ping this patch.
> > > The C++ FE part has been approved by Jason already with a minor change
> > > I've made in my copy.
> > > Are the remaining parts ok for trunk?
> > 
> > In the C front-end changes, since you end up discarding any side effects 
> > from the type, I'd expect use of in_alignof to be more appropriate than 
> > in_typeof (and thus not needing to use pop_maybe_used).
> 
> So like this?  Bootstrapped/regtested again on x86_64-linux and i686-linux.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[Bug target/111481] MacOS, linker issues with Xcode 15

2023-09-20 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111481

Eric Gallager  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |egallager at gcc dot gnu.org,
                   |                            |iains at gcc dot gnu.org

--- Comment #3 from Eric Gallager  ---
(In reply to simon from comment #2)
> A fix for the Ada issue is to link with the classic linker:
> 
> $ gnatmake hello -largs -Wl,-ld_classic
> gcc -c hello.adb
> gnatbind -x hello.ali
> gnatlink hello.ali -Wl,-ld_classic
> $

It is annoying that Apple is reusing the name `ld_classic` for something new,
after it previously referred to something else...

Re: [PATCH] AArch64: Fix strict-align cpymem/setmem [PR103100]

2023-09-20 Thread Richard Earnshaw (lists)
On 20/09/2023 14:50, Wilco Dijkstra wrote:
> 
> The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
> Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
> Clean up the condition when to use MOPS.
> 
> Passes regress/bootstrap, OK for commit?
> 
> gcc/ChangeLog/
> PR target/103100
> * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.

Shouldn't this be a separate patch?  It's not immediately obvious that this is 
a necessary part of this change.

> (setmemdi): Likewise.
> * config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
> strict-align.  Cleanup condition for using MOPS.
> (aarch64_expand_setmem): Likewise.
> 
> ---
> 
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index dd6874d13a75f20d10a244578afc355b25c73da2..8f3bfb91c0f4ec43f37fe9289a66092a29a47e4d 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -25261,27 +25261,23 @@ aarch64_expand_cpymem (rtx *operands)
>int mode_bits;
>rtx dst = operands[0];
>rtx src = operands[1];
> +  unsigned align = INTVAL (operands[3]);

This should read the value with UINTVAL.  Given the useful range of the 
alignment, it should be OK that we're not using unsigned HWI.

>rtx base;
>machine_mode cur_mode = BLKmode;
> +  bool size_p = optimize_function_for_size_p (cfun);
>  
> -  /* Variable-sized memcpy can go through the MOPS expansion if available.  */
> -  if (!CONST_INT_P (operands[2]))
> +  /* Variable-sized or strict-align copies may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_cpymem_mops (operands);

So what about align=4 and copying, for example, 8 or 12 bytes; wouldn't we want 
a sequence of LDR/STR in that case?  Doesn't this fall back to MOPS too eagerly?
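
To make the concern concrete, here is a minimal C sketch of the early-exit test 
in the quoted hunk.  It is a standalone model, not GCC code: 
`takes_mops_path` and its parameters are hypothetical names standing in for the 
`CONST_INT_P (operands[2])`, `STRICT_ALIGNMENT` and `align` checks.

```c
#include <stdbool.h>

/* Hypothetical model of the first test in the patched aarch64_expand_cpymem.
   size_is_const stands in for CONST_INT_P (operands[2]), strict_align for
   STRICT_ALIGNMENT, and align for the value read from operands[3].
   Note that the copy size does not appear in the condition at all.  */
bool
takes_mops_path (bool size_is_const, bool strict_align, unsigned align)
{
  return !size_is_const || (strict_align && align < 16);
}
```

In this model, strict_align with align == 4 returns true whether the copy is 8, 
12 or 200 bytes, so a small constant-sized copy never reaches the inline 
load/store expansion.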


>  
>unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
>  
> -  /* Try to inline up to 256 bytes or use the MOPS threshold if available.  */
> -  unsigned HOST_WIDE_INT max_copy_size
> -= TARGET_MOPS ? aarch64_mops_memcpy_size_threshold : 256;
> -
> -  bool size_p = optimize_function_for_size_p (cfun);
> +  /* Try to inline up to 256 bytes.  */
> +  unsigned max_copy_size = 256;
> +  unsigned max_mops_size = aarch64_mops_memcpy_size_threshold;

I find this name slightly confusing.  Surely it's min_mops_size (since above 
that we want to use MOPS rather than inlined loads/stores).  But why not just 
use aarch64_mops_memcpy_size_threshold directly in the one place it's used?
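
As a rough illustration of that suggestion, the sketch below models the size 
cut-off with the threshold passed straight into the comparison.  It is a 
standalone model under assumptions, not GCC code: `size_leaves_inline_path`, 
`have_mops` (for `TARGET_MOPS`) and `mops_threshold` (for 
`aarch64_mops_memcpy_size_threshold`) are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical model of the size cut-off in the patched hunk, with the
   threshold used directly instead of being copied into max_mops_size.
   Returns true when the copy is handed to the MOPS/libcall path rather
   than expanded inline.  */
bool
size_leaves_inline_path (unsigned long long size, bool have_mops,
                         unsigned long long mops_threshold)
{
  const unsigned long long max_copy_size = 256;
  return size > max_copy_size || (have_mops && size > mops_threshold);
}
```

Read this way, the threshold is the size above which MOPS is preferred, that 
is, a lower bound on the MOPS range rather than a maximum.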

>  
> -  /* Large constant-sized cpymem should go through MOPS when possible.
> - It should be a win even for size optimization in the general case.
> - For speed optimization the choice between MOPS and the SIMD sequence
> - depends on the size of the copy, rather than number of instructions,
> - alignment etc.  */
> -  if (size > max_copy_size)
> +  /* Large copies use MOPS when available or a library call.  */
> +  if (size > max_copy_size || (TARGET_MOPS && size > max_mops_size))
>  return aarch64_expand_cpymem_mops (operands);
>  
>int copy_bits = 256;
> @@ -25445,12 +25441,13 @@ aarch64_expand_setmem (rtx *operands)

Similar comments apply to this code as well.

>unsigned HOST_WIDE_INT len;
>rtx dst = operands[0];
>rtx val = operands[2], src;
> +  unsigned align = INTVAL (operands[3]);
>rtx base;
>machine_mode cur_mode = BLKmode, next_mode;
>  
> -  /* If we don't have SIMD registers or the size is variable use the MOPS
> - inlined sequence if possible.  */
> -  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD)
> +  /* Variable-sized or strict-align memset may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD
> +  || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_setmem_mops (operands);
>  
>bool size_p = optimize_function_for_size_p (cfun);
> @@ -25458,10 +25455,13 @@ aarch64_expand_setmem (rtx *operands)

And here.

>/* Default the maximum to 256-bytes when considering only libcall vs
>   SIMD broadcast sequence.  */
>unsigned max_set_size = 256;
> +  unsigned max_mops_size = aarch64_mops_memset_size_threshold;
>  
>len = INTVAL (operands[1]);
> -  if (len > max_set_size && !TARGET_MOPS)
> -return false;
> +
> +  /* Large memset uses MOPS when available or a library call.  */
> +  if (len > max_set_size || (TARGET_MOPS && len > max_mops_size))
> +return aarch64_expand_setmem_mops (operands);
>  
>int cst_val = !!(CONST_INT_P (val) && (INTVAL (val) != 0));
>/* The MOPS sequence takes:
> @@ -25474,12 +25474,6 @@ aarch64_expand_setmem (rtx *operands)
>   the arguments + 1 for the call.  */
>unsigned libcall_cost = 4;
>  
> -  /* Upper bound check.  For large constant-sized setmem use the MOPS sequence
> - when available.  */
> -  if (TARGET_MOPS
> -  && len >= 

[Bug c++/111499] std::vector less operator< doesn't compile with optimization due to __builtin_memcmp

2023-09-20 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111499

--- Comment #4 from Jonathan Wakely  ---
(In reply to Michal Lachowicz from comment #2)
> N.B. Bug exists on current release gcc 13.2

Yes, but you're still not using the devel/omp/gcc-12 branch from Git.

[Bug c++/111499] std::vector less operator< doesn't compile with optimization due to __builtin_memcmp

2023-09-20 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111499

--- Comment #3 from Jonathan Wakely  ---
The only reason it doesn't compile is because you explicitly told the compiler
to make it not compile. This is just a warning, not an error.

[Bug c++/111499] std::vector less operator< doesn't compile with optimization due to __builtin_memcmp

2023-09-20 Thread mlachowicz at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111499

--- Comment #2 from Michal Lachowicz  ---
N.B. Bug exists on current release gcc 13.2

https://godbolt.org/z/zeKbhPY4r
