date:20230920

Re: [PATCH V2] RISC-V: Support combine cond extend and reduce sum to widen reduce sum

2023-09-20 Thread Lehua Ding


Hi Robin and Juzhe,

I have gone back to the original method; please see V3 below:
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631076.html

On 2023/9/20 17:51, Robin Dapp wrote:

So, IMHO, a more complicated pattern that combines the initial 0 value + extension + 
reduction + vmerge may be more reasonable.


If that works I would also prefer that.

Regards
  Robin



--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai

[PATCH V3] RISC-V: Support combine cond extend and reduce sum to widen reduce sum

2023-09-20 Thread Lehua Ding

V3 Change: Back to the original method.

This patch supports combining cond extend and reduce_sum into cond widen reduce_sum,
i.e. combining the following three insns:
   (set (reg:RVVM2HI 149)
(if_then_else:RVVM2HI
  (unspec:RVVMF8BI [
(const_vector:RVVMF8BI repeat [
  (const_int 1 [0x1])
])
(reg:DI 146)
(const_int 2 [0x2]) repeated x2
(const_int 1 [0x1])
(reg:SI 66 vl)
(reg:SI 67 vtype)
  ] UNSPEC_VPREDICATE)
 (const_vector:RVVM2HI repeat [
   (const_int 0 [0])
 ])
 (unspec:RVVM2HI [
   (reg:SI 0 zero)
 ] UNSPEC_VUNDEF)))
  (set (reg:RVVM2HI 138)
(if_then_else:RVVM2HI
  (reg:RVVMF8BI 135)
  (reg:RVVM2HI 148)
  (reg:RVVM2HI 149)))
  (set (reg:HI 150)
(unspec:HI [
  (reg:RVVM2HI 138)
] UNSPEC_REDUC_SUM))
into one insn:
  (set (reg:SI 147)
(unspec:SI [
  (if_then_else:RVVM2SI
(reg:RVVMF16BI 135)
(sign_extend:RVVM2SI (reg:RVVM1HI 136))
(if_then_else:RVVM2HI
  (unspec:RVVMF8BI [
(const_vector:RVVMF8BI repeat [
  (const_int 1 [0x1])
])
(reg:DI 146)
(const_int 2 [0x2]) repeated x2
(const_int 1 [0x1])
(reg:SI 66 vl)
(reg:SI 67 vtype)
  ] UNSPEC_VPREDICATE)
 (const_vector:RVVM2HI repeat [
   (const_int 0 [0])
 ])
 (unspec:RVVM2HI [
   (reg:SI 0 zero)
 ] UNSPEC_VUNDEF)))
] UNSPEC_REDUC_SUM))

Consider the following C code:

int16_t foo (int8_t *restrict a, int8_t *restrict pred)
{
  int16_t sum = 0;
  for (int i = 0; i < 16; i += 1)
if (pred[i])
  sum += a[i];
  return sum;
}

assembly before this patch:

foo:
vsetivli zero,16,e16,m2,ta,ma
li  a5,0
vmv.v.i v2,0
vsetvli zero,zero,e8,m1,ta,ma
vl1re8.v v0,0(a1)
vmsne.vi v0,v0,0
vsetvli zero,zero,e16,m2,ta,mu
vle8.v  v4,0(a0),v0.t
vmv.s.x v1,a5
vsext.vf2   v2,v4,v0.t
vredsum.vs  v2,v2,v1
vmv.x.s a0,v2
slliw   a0,a0,16
sraiw   a0,a0,16
ret

assembly after this patch:

foo:
li  a5,0
vsetivli zero,16,e16,m1,ta,ma
vmv.s.x v3,a5
vsetivli zero,16,e8,m1,ta,ma
vl1re8.v v0,0(a1)
vmsne.vi v0,v0,0
vle8.v  v2,0(a0),v0.t
vwredsum.vs v1,v2,v3,v0.t
vsetivli zero,0,e16,m1,ta,ma
vmv.x.s a0,v1
slliw   a0,a0,16
sraiw   a0,a0,16
ret
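
As an illustration of what the fused instruction computes, here is a minimal scalar sketch in C. It assumes the usual reading of a masked widening sum reduction: inactive elements are skipped, active elements are sign-extended to the wider type, and the result is added to a scalar start value. The names foo_ref and cond_widen_reduc_sum are hypothetical, and the trip count is a parameter here rather than the fixed 16 of the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference version: the loop from the patch description, with the
   trip count turned into a parameter.  */
int16_t
foo_ref (const int8_t *a, const int8_t *pred, size_t n)
{
  int16_t sum = 0;
  for (size_t i = 0; i < n; i++)
    if (pred[i])
      sum += a[i];
  return sum;
}

/* Hypothetical scalar model of the combined operation: build the mask
   from PRED (like vmsne.vi), then do a masked widening sum reduction
   that starts from INIT (like vwredsum.vs with v0.t).  */
int16_t
cond_widen_reduc_sum (const int8_t *a, const int8_t *pred, size_t n,
		      int16_t init)
{
  assert (a != NULL && pred != NULL);
  int16_t acc = init;
  for (size_t i = 0; i < n; i++)
    {
      int active = pred[i] != 0;	/* Mask bit for element i.  */
      if (active)
	/* Sign-extend the 8-bit element to 16 bits, then accumulate.  */
	acc = (int16_t) (acc + (int16_t) a[i]);
    }
  return acc;
}
```

With a start value of 0 the two functions agree for any input, which is the equivalence the combine pattern relies on; no separate zero-filled vector or vsext.vf2 step is needed in the model.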

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*cond_widen_reduc_plus_scal_):
New combine patterns.
* config/riscv/riscv-protos.h (enum insn_type): New insn_type.
(enum avl_type): New avl_type for VLS mode.
* config/riscv/riscv-v.cc: Add VLS avl_type for VLS mode.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c: New test.

---
 gcc/config/riscv/autovec-opt.md   | 72 +++
 gcc/config/riscv/riscv-protos.h   |  6 +-
 gcc/config/riscv/riscv-v.cc   |  9 ++-
 .../rvv/autovec/cond/cond_widen_reduc-1.c | 30 
 .../rvv/autovec/cond/cond_widen_reduc-2.c | 30 
 .../rvv/autovec/cond/cond_widen_reduc_run-1.c | 28 
 .../rvv/autovec/cond/cond_widen_reduc_run-2.c | 28 
 7 files changed, 198 insertions(+), 5 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index a97a095691c..ed9c0777eb9 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1119,6 +1119,78 @@
   }
   [(set_attr "type" "vfwmuladd")])

+;; Combine mask_extend + vredsum to mask_vwredsum[u]
+;; where the merge of mask_extend is vector const 0
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+(unspec: [
+  (if_then_else:
+(match_operand: 1 "register_operand")
+(any_extend:
+  (match_operand:VI_QHS_NO_M8 2 "register_operand"))
+(if_then_else:
+  (unspec: [
+(match_operand: 3 "vector_all_trues_mask_operand")
+(match_operand 6

Re: [PATCH] check undefine_p for one more vr

2023-09-20 Thread Jiufu Guo



Hi,

Richard Biener  writes:

>> Am 21.09.2023 um 05:10 schrieb Jiufu Guo :
>> 
>> Hi,
>> 
>> The root cause of PR111355 and PR111482 is a missing check for whether vr0
>> is undefined_p before calling vr0.lower_bound.
>> 
>> In the pattern "(X + C) / N",
>> 
>>(if (INTEGRAL_TYPE_P (type)
>> && get_range_query (cfun)->range_of_expr (vr0, @0))
>> (if (...) 
>>   (plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
>>   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0 ...
>>&& wi::geu_p (vr0.lower_bound (), -c))
>> 
>> In "(if (...)", there is code that guards against vr0 being undefined_p,
>> but in the "else" part, vr0's undefined_p is not checked before
>> "wi::geu_p (vr0.lower_bound (), -c)".
>> 
>> Bootstrap & regtest pass on ppc64{,le}.
>> Is this ok for trunk?
>
> Ok

Thanks! Committed via r14-4192.

BR,
Jeff (Jiufu Guo)

>
> Richard 
>
>> BR,
>> Jeff (Jiufu Guo)
>> 
>> 
>>PR tree-optimization/111355
>> 
>> gcc/ChangeLog:
>> 
>>* match.pd ((X + C) / N): Update pattern.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>* gcc.dg/pr111355.c: New test.
>> 
>> ---
>> gcc/match.pd| 2 +-
>> gcc/testsuite/gcc.dg/pr111355.c | 8 
>> 2 files changed, 9 insertions(+), 1 deletion(-)
>> create mode 100644 gcc/testsuite/gcc.dg/pr111355.c
>> 
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 39c9c81966a..5fdfba14d47 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1033,7 +1033,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>  || (vr0.nonnegative_p () && vr3.nonnegative_p ())
>>  || (vr0.nonpositive_p () && vr3.nonpositive_p (
>>(plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
>> -   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0
>> +   (if (!vr0.undefined_p () && TYPE_UNSIGNED (type) && c.sign_mask () < 0
>>&& exact_mod (-c)
>>/* unsigned "X-(-C)" doesn't underflow.  */
>>&& wi::geu_p (vr0.lower_bound (), -c))
>> diff --git a/gcc/testsuite/gcc.dg/pr111355.c b/gcc/testsuite/gcc.dg/pr111355.c
>> new file mode 100644
>> index 000..8bacbc69d31
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr111355.c
>> @@ -0,0 +1,8 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3 -Wno-div-by-zero" } */
>> +
>> +/* Make sure no ICE. */
>> +int main() {
>> +  unsigned b;
>> +  return b ? 1 << --b / 0 : 0;
>> +}
>> -- 
>> 2.25.1
>>
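
The guard added by the match.pd hunk quoted above can be illustrated with a small scalar model in C. This is only a sketch: toy_range, toy_lower_bound and sub_cannot_underflow_p are hypothetical stand-ins, not GCC's value_range API. The point is the ordering: the undefined check comes first, so the short-circuit `&&` keeps the bound accessor from ever being called on an undefined range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a value range; not GCC's value_range.  */
struct toy_range
{
  bool undefined;	/* True when the range has no bounds at all.  */
  uint32_t lo, hi;
};

/* Reading a bound of an undefined range is a bug, so trap it here,
   much like the ICE reported in the PR.  */
uint32_t
toy_lower_bound (const struct toy_range *vr)
{
  assert (!vr->undefined);
  return vr->lo;
}

/* Model of the fixed condition: for unsigned X and a constant NEG_C
   (standing for -C), X - NEG_C cannot underflow when the lower bound
   of X is at least NEG_C.  The undefined check must come first.  */
bool
sub_cannot_underflow_p (const struct toy_range *vr, uint32_t neg_c)
{
  return !vr->undefined && toy_lower_bound (vr) >= neg_c;
}
```

Swapping the two operands of `&&` in sub_cannot_underflow_p would reintroduce the failure for undefined ranges, which is the situation the patch addresses.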

Re: [PATCH 1/2] using overflow_free_p to simplify pattern

2023-09-20 Thread Jiufu Guo



Hi,

Richard Biener  writes:

> On Tue, 19 Sep 2023, Jiufu Guo wrote:
>
>> Hi,
>> 
>> In r14-3582, an "overflow_free_p" interface was added.
>> The "(t * 2) / 2" pattern in match.pd can be simplified
>> by using this interface.
>> 
>> Bootstrap & regtest pass on ppc64{,le} and x86_64.
>> Is this ok for trunk?
>> 
>> BR,
>> Jeff (Jiufu)
>> 
>> gcc/ChangeLog:
>> 
>>  * match.pd ((t * 2) / 2): Update to use overflow_free_p.
>> 
>> ---
>>  gcc/match.pd | 37 +++--
>>  1 file changed, 7 insertions(+), 30 deletions(-)
>> 
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 87edf0e75c3..8bba7056000 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -926,36 +926,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> (if (TYPE_OVERFLOW_UNDEFINED (type))
>>  @0
>>  #if GIMPLE
>> -(with
>> - {
>> -   bool overflowed = true;
>> -   value_range vr0, vr1;
>> -   if (INTEGRAL_TYPE_P (type)
>> -   && get_range_query (cfun)->range_of_expr (vr0, @0)
>> -   && get_range_query (cfun)->range_of_expr (vr1, @1)
>> -   && !vr0.varying_p () && !vr0.undefined_p ()
>> -   && !vr1.varying_p () && !vr1.undefined_p ())
>> - {
>> -   wide_int wmin0 = vr0.lower_bound ();
>> -   wide_int wmax0 = vr0.upper_bound ();
>> -   wide_int wmin1 = vr1.lower_bound ();
>> -   wide_int wmax1 = vr1.upper_bound ();
>> -   /* If the multiplication can't overflow/wrap around, then
>> -  it can be optimized too.  */
>> -   wi::overflow_type min_ovf, max_ovf;
>> -   wi::mul (wmin0, wmin1, TYPE_SIGN (type), &min_ovf);
>> -   wi::mul (wmax0, wmax1, TYPE_SIGN (type), &max_ovf);
>> -   if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
>> - {
>> -   wi::mul (wmin0, wmax1, TYPE_SIGN (type), &min_ovf);
>> -   wi::mul (wmax0, wmin1, TYPE_SIGN (type), &max_ovf);
>> -   if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
>> - overflowed = false;
>> - }
>> - }
>> - }
>> -(if (!overflowed)
>> - @0))
>> +(with {value_range vr0, vr1;}
>> + (if (INTEGRAL_TYPE_P (type)
>> +  && get_range_query (cfun)->range_of_expr (vr0, @0)
>> +  && get_range_query (cfun)->range_of_expr (vr1, @1)
>> +  && !vr0.varying_p () && !vr1.varying_p ()
>
> From your other uses checking !varying_p doesn't seem necessary?

Thanks for pointing this out!
Yes, !varying_p is not needed; overflow_free_p covers it.

Committed via r14-4191.

BR,
Jeff (Jiufu Guo)

>
> OK with omitting.
>
> Richard.
>
>> +  && range_op_handler (MULT_EXPR).overflow_free_p (vr0, vr1))
>> +  @0))
>>  #endif
>> 
>>  
>>
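
For reference, the hand-written check that the patch removes multiplies the four corner combinations of the two ranges and accepts only if none of them overflows. A minimal C sketch of that idea follows. It models the removed code only, not the implementation behind overflow_free_p; range32 and mul_overflow_free_p are hypothetical names, and __builtin_mul_overflow is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical closed range [lo, hi] of a signed 32-bit value.  */
struct range32
{
  int32_t lo, hi;
};

/* Return true when x * y cannot overflow int32_t for any x in A and
   any y in B.  The product over a box of values takes its extremes at
   the corners, so checking the four corner products is sufficient.  */
bool
mul_overflow_free_p (struct range32 a, struct range32 b)
{
  assert (a.lo <= a.hi && b.lo <= b.hi);
  int32_t tmp;
  return !__builtin_mul_overflow (a.lo, b.lo, &tmp)
	 && !__builtin_mul_overflow (a.hi, b.hi, &tmp)
	 && !__builtin_mul_overflow (a.lo, b.hi, &tmp)
	 && !__builtin_mul_overflow (a.hi, b.lo, &tmp);
}
```

A full-width (varying) range fails this test on its own corners whenever the other range contains a value of magnitude 2 or more, which is in line with the review comment that a separate !varying_p check looks unnecessary.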

Re: [PATCH] check undefine_p for one more vr

2023-09-20 Thread Richard Biener




> Am 21.09.2023 um 05:10 schrieb Jiufu Guo :
> 
> Hi,
> 
> The root cause of PR111355 and PR111482 is a missing check for whether vr0
> is undefined_p before calling vr0.lower_bound.
> 
> In the pattern "(X + C) / N",
> 
>(if (INTEGRAL_TYPE_P (type)
> && get_range_query (cfun)->range_of_expr (vr0, @0))
> (if (...) 
>   (plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
>   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0 ...
>&& wi::geu_p (vr0.lower_bound (), -c))
> 
> In "(if (...)", there is code that guards against vr0 being undefined_p,
> but in the "else" part, vr0's undefined_p is not checked before
> "wi::geu_p (vr0.lower_bound (), -c)".
> 
> Bootstrap & regtest pass on ppc64{,le}.
> Is this ok for trunk?

Ok

Richard 

> BR,
> Jeff (Jiufu Guo)
> 
> 
>PR tree-optimization/111355
> 
> gcc/ChangeLog:
> 
>* match.pd ((X + C) / N): Update pattern.
> 
> gcc/testsuite/ChangeLog:
> 
>* gcc.dg/pr111355.c: New test.
> 
> ---
> gcc/match.pd| 2 +-
> gcc/testsuite/gcc.dg/pr111355.c | 8 
> 2 files changed, 9 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.dg/pr111355.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 39c9c81966a..5fdfba14d47 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1033,7 +1033,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  || (vr0.nonnegative_p () && vr3.nonnegative_p ())
>  || (vr0.nonpositive_p () && vr3.nonpositive_p (
>(plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
> -   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0
> +   (if (!vr0.undefined_p () && TYPE_UNSIGNED (type) && c.sign_mask () < 0
>&& exact_mod (-c)
>/* unsigned "X-(-C)" doesn't underflow.  */
>&& wi::geu_p (vr0.lower_bound (), -c))
> diff --git a/gcc/testsuite/gcc.dg/pr111355.c b/gcc/testsuite/gcc.dg/pr111355.c
> new file mode 100644
> index 000..8bacbc69d31
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr111355.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -Wno-div-by-zero" } */
> +
> +/* Make sure no ICE. */
> +int main() {
> +  unsigned b;
> +  return b ? 1 << --b / 0 : 0;
> +}
> -- 
> 2.25.1
>

Re: [PATCH][_GLIBCXX_INLINE_VERSION] Fix

2023-09-20 Thread François Dumont


Tests were successful; OK to commit?

On 20/09/2023 19:51, François Dumont wrote:
libstdc++: [_GLIBCXX_INLINE_VERSION] Add handle_contract_violation symbol alias

libstdc++-v3/ChangeLog:

    * src/experimental/contract.cc
    [_GLIBCXX_INLINE_VERSION](handle_contract_violation): Provide symbol alias
    without version namespace decoration for gcc.

Here is what I'm eventually testing; OK to commit if successful?

François

On 20/09/2023 11:32, Jonathan Wakely wrote:

On Wed, 20 Sept 2023 at 05:51, François Dumont via Libstdc++
 wrote:

libstdc++: Remove std::constract_violation from versioned namespace

Spelling mistake in contract_violation, and it's not
std::contract_violation, it's std::experimental::contract_violation


GCC expects this type to be in std namespace directly.

Again, it's in std::experimental not in std directly.

Will this change cause problems when including another experimental
header, which does put experimental below std::__8?

I think std::__8::experimental and std::experimental will become 
ambiguous.


Maybe we do want to remove the inline __8 namespace from all
experimental headers. That needs a bit more thought though.


libstdc++-v3/ChangeLog:

  * include/experimental/contract:
  Remove 
_GLIBCXX_BEGIN_NAMESPACE_VERSION/_GLIBCXX_END_NAMESPACE_VERSION.

This line is too long for the changelog.


It does fix 29 g++.dg/contracts in gcc testsuite.

Ok to commit ?

François

Re: Re: [PATCH] RISC-V: Optimized for strided load/store with stride == element width[PR111450]

2023-09-20 Thread Li Xu

Committed, thanks Juzhe.
--
Li Xu
>Thanks a lot. LGTM.
>
>
>
>juzhe.zh...@rivai.ai
>
>From: Li Xu
>Date: 2023-09-21 11:12
>To: gcc-patches
>CC: kito.cheng; palmer; juzhe.zhong; xuli
>Subject: [PATCH] RISC-V: Optimized for strided load/store with stride == 
>element width[PR111450]
>From: xuli 
>
>When stride == element width, vlsse should be optimized into vle.v.
>vsse should be optimized into vse.v.
>
>PR target/111450
>
>gcc/ChangeLog:
>
>* config/riscv/constraints.md (c01): const_int 1.
>(c02): const_int 2.
>(c04): const_int 4.
>(c08): const_int 8.
>* config/riscv/predicates.md (vector_eew8_stride_operand): New predicate for 
>stride operand.
>(vector_eew16_stride_operand): Ditto.
>(vector_eew32_stride_operand): Ditto.
>(vector_eew64_stride_operand): Ditto.
>* config/riscv/vector-iterators.md: New iterator for stride operand.
>* config/riscv/vector.md: Add stride = element width constraint.
>
>gcc/testsuite/ChangeLog:
>
>* gcc.target/riscv/rvv/base/pr111450.c: New test.
>---
>gcc/config/riscv/constraints.md   |  20 
>gcc/config/riscv/predicates.md    |  18 
>gcc/config/riscv/vector-iterators.md  |  87 +++
>gcc/config/riscv/vector.md    |  42 +---
>.../gcc.target/riscv/rvv/base/pr111450.c  | 100 ++
>5 files changed, 250 insertions(+), 17 deletions(-)
>create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111450.c
>
>diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
>index 3f52bc76f67..964fdd450c9 100644
>--- a/gcc/config/riscv/constraints.md
>+++ b/gcc/config/riscv/constraints.md
>@@ -45,6 +45,26 @@
>   (and (match_code "const_int")
>    (match_test "ival == 0")))
>+(define_constraint "c01"
>+  "Constant value 1."
>+  (and (match_code "const_int")
>+   (match_test "ival == 1")))
>+
>+(define_constraint "c02"
>+  "Constant value 2"
>+  (and (match_code "const_int")
>+   (match_test "ival == 2")))
>+
>+(define_constraint "c04"
>+  "Constant value 4"
>+  (and (match_code "const_int")
>+   (match_test "ival == 4")))
>+
>+(define_constraint "c08"
>+  "Constant value 8"
>+  (and (match_code "const_int")
>+   (match_test "ival == 8")))
>+
>(define_constraint "K"
>   "A 5-bit unsigned immediate for CSR access instructions."
>   (and (match_code "const_int")
>diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
>index 4bc7ff2c9d8..7845998e430 100644
>--- a/gcc/config/riscv/predicates.md
>+++ b/gcc/config/riscv/predicates.md
>@@ -514,6 +514,24 @@
>   (ior (match_operand 0 "const_0_operand")
>    (match_operand 0 "pmode_register_operand")))
>+;; [1, 2, 4, 8] means strided load/store with stride == element width
>+(define_special_predicate "vector_eew8_stride_operand"
>+  (ior (match_operand 0 "pmode_register_operand")
>+   (and (match_code "const_int")
>+    (match_test "INTVAL (op) == 1 || INTVAL (op) == 0"))))
>+(define_special_predicate "vector_eew16_stride_operand"
>+  (ior (match_operand 0 "pmode_register_operand")
>+   (and (match_code "const_int")
>+    (match_test "INTVAL (op) == 2 || INTVAL (op) == 0"))))
>+(define_special_predicate "vector_eew32_stride_operand"
>+  (ior (match_operand 0 "pmode_register_operand")
>+   (and (match_code "const_int")
>+    (match_test "INTVAL (op) == 4 || INTVAL (op) == 0"))))
>+(define_special_predicate "vector_eew64_stride_operand"
>+  (ior (match_operand 0 "pmode_register_operand")
>+   (and (match_code "const_int")
>+    (match_test "INTVAL (op) == 8 || INTVAL (op) == 0"))))
>+
>;; A special predicate that doesn't match a particular mode.
>(define_special_predicate "vector_any_register_operand"
>   (match_code "reg"))
>diff --git a/gcc/config/riscv/vector-iterators.md 
>b/gcc/config/riscv/vector-iterators.md
>index 73df55a69c8..f85d1cc80d1 100644
>--- a/gcc/config/riscv/vector-iterators.md
>+++ b/gcc/config/riscv/vector-iterators.md
>@@ -2596,6 +2596,93 @@
>   (V512DI "V512BI")
>])
>+(define_mode_attr stride_predicate [
>+  (RVVM8QI "vector_eew8_stride_operand") (RVVM4QI 
>"vector_eew8_stride_operand")
>+  (RVVM2QI "vector_eew8_stride_operand") (RVVM1QI 
>"vector_eew8_stride_operand")
>+  (RVVMF2QI "vector_eew8_stride_operand") (RVVMF4QI 
>"vector_eew8_stride_operand")
>+  (RVVMF8QI "vector_eew8_stride_operand")
>+
>+  (RVVM8HI "vector_eew16_stride_operand") (RVVM4HI 
>"vector_eew16_stride_operand")
>+  (RVVM2HI "vector_eew16_stride_operand") (RVVM1HI 
>"vector_eew16_stride_operand")
>+  (RVVMF2HI "vector_eew16_stride_operand") (RVVMF4HI 
>"vector_eew16_stride_operand")
>+
>+  (RVVM8HF "vector_eew16_stride_operand") (RVVM4HF 
>"vector_eew16_stride_operand")
>+  (RVVM2HF "vector_eew16_stride_operand") (RVVM1HF 
>"vector_eew16_stride_operand")
>+  (RVVMF2HF "vector_eew16_stride_operand") (RVVMF4HF 
>"vector_eew16_stride_operand")
>+
>+  (RVVM8SI "vector_eew32_stride_operand") (RVVM4SI 
>"vector_eew32_stride_operand")
>+  (RVVM2SI

Re: [PATCH] RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more generic names

2023-09-20 Thread Lehua Ding


Committed, thanks Juzhe.

On 2023/9/21 11:45, juzhe.zh...@rivai.ai wrote:

LGTM


juzhe.zh...@rivai.ai

*From:* Lehua Ding 
*Date:* 2023-09-21 11:44
*To:* gcc-patches 
*CC:* juzhe.zhong ; kito.cheng
; rdapp.gcc
; palmer ;
jeffreyalaw ; lehua.ding

*Subject:* [PATCH] RISC-V: Rename predicate
vector_gs_scale_operand_16/32 to more generic names
This small patch renames vector_gs_scale_operand_16/32 to the more generic
names const_1_or_2/4_operand, so that they are easier to understand when
used elsewhere.
gcc/ChangeLog:
* config/riscv/predicates.md (const_1_or_2_operand): Rename.
(const_1_or_4_operand): Ditto.
(vector_gs_scale_operand_16): Ditto.
(vector_gs_scale_operand_32): Ditto.
* config/riscv/vector-iterators.md: Adjust.
---
gcc/config/riscv/predicates.md   | 16 
gcc/config/riscv/vector-iterators.md | 16 
2 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/gcc/config/riscv/predicates.md
b/gcc/config/riscv/predicates.md
index 4bc7ff2c9d8..a4f03242f2c 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -70,6 +70,14 @@
    (and (match_code "const_int,const_wide_int,const_vector")
     (match_test "op == CONST1_RTX (GET_MODE (op))")))
+(define_predicate "const_1_or_2_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
+
+(define_predicate "const_1_or_4_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
+
(define_predicate "reg_or_0_operand"
    (ior (match_operand 0 "const_0_operand")
     (match_operand 0 "register_operand")))
@@ -463,14 +471,6 @@
    (ior (match_operand 0 "register_operand")
     (match_code "const_vector")))
-(define_predicate "vector_gs_scale_operand_16"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
-
-(define_predicate "vector_gs_scale_operand_32"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
-
(define_predicate "vector_gs_scale_operand_64"
    (and (match_code "const_int")
     (match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode
== DImode)")))
diff --git a/gcc/config/riscv/vector-iterators.md
b/gcc/config/riscv/vector-iterators.md
index 053d84c0c7d..a32d7e8d4e9 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -2723,18 +2723,18 @@
    (RVVMF4QI "const_1_operand") (RVVMF8QI "const_1_operand")
    (RVVM8HI "const_1_operand") (RVVM4HI
"vector_gs_scale_operand_16_rv32")
-  (RVVM2HI "vector_gs_scale_operand_16") (RVVM1HI
"vector_gs_scale_operand_16")
-  (RVVMF2HI "vector_gs_scale_operand_16") (RVVMF4HI
"vector_gs_scale_operand_16")
+  (RVVM2HI "const_1_or_2_operand") (RVVM1HI "const_1_or_2_operand")
+  (RVVMF2HI "const_1_or_2_operand") (RVVMF4HI "const_1_or_2_operand")
    (RVVM8HF "const_1_operand") (RVVM4HF
"vector_gs_scale_operand_16_rv32")
-  (RVVM2HF "vector_gs_scale_operand_16") (RVVM1HF
"vector_gs_scale_operand_16")
-  (RVVMF2HF "vector_gs_scale_operand_16") (RVVMF4HF
"vector_gs_scale_operand_16")
+  (RVVM2HF "const_1_or_2_operand") (RVVM1HF "const_1_or_2_operand")
+  (RVVMF2HF "const_1_or_2_operand") (RVVMF4HF "const_1_or_2_operand")
-  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI
"vector_gs_scale_operand_32") (RVVM2SI "vector_gs_scale_operand_32")
-  (RVVM1SI "vector_gs_scale_operand_32") (RVVMF2SI
"vector_gs_scale_operand_32")
+  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI
"const_1_or_4_operand") (RVVM2SI "const_1_or_4_operand")
+  (RVVM1SI "const_1_or_4_operand") (RVVMF2SI "const_1_or_4_operand")
-  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF
"vector_gs_scale_operand_32") (RVVM2SF "vector_gs_scale_operand_32")
-  (RVVM1SF "vector_gs_scale_operand_32") (RVVMF2SF
"vector_gs_scale_operand_32")
+  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF
"const_1_or_4_operand") (RVVM2SF "const_1_or_4_operand")
+  (RVVM1SF "const_1_or_4_operand") (RVVMF2SF "const_1_or_4_operand")
    (RVVM8DI "vector_gs_scale_operand_64") (RVVM4DI
"vector_gs_scale_operand_64")
    (RVVM2DI "vector_gs_scale_operand_64") (RVVM1DI
"vector_gs_scale_operand_64")
--
2.36.3



--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai

Re: [PATCH] RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more generic names

2023-09-20 Thread juzhe.zh...@rivai.ai

LGTM



juzhe.zh...@rivai.ai
 
From: Lehua Ding
Date: 2023-09-21 11:44
To: gcc-patches
CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding
Subject: [PATCH] RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more 
generic names
This small patch renames vector_gs_scale_operand_16/32 to the more generic
names const_1_or_2/4_operand, so that they are easier to understand when
used elsewhere.
 
gcc/ChangeLog:
 
* config/riscv/predicates.md (const_1_or_2_operand): Rename.
(const_1_or_4_operand): Ditto.
(vector_gs_scale_operand_16): Ditto.
(vector_gs_scale_operand_32): Ditto.
* config/riscv/vector-iterators.md: Adjust.
 
---
gcc/config/riscv/predicates.md   | 16 
gcc/config/riscv/vector-iterators.md | 16 
2 files changed, 16 insertions(+), 16 deletions(-)
 
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 4bc7ff2c9d8..a4f03242f2c 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -70,6 +70,14 @@
   (and (match_code "const_int,const_wide_int,const_vector")
(match_test "op == CONST1_RTX (GET_MODE (op))")))
 
+(define_predicate "const_1_or_2_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
+
+(define_predicate "const_1_or_4_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
+
(define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "const_0_operand")
(match_operand 0 "register_operand")))
@@ -463,14 +471,6 @@
   (ior (match_operand 0 "register_operand")
(match_code "const_vector")))
 
-(define_predicate "vector_gs_scale_operand_16"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
-
-(define_predicate "vector_gs_scale_operand_32"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
-
(define_predicate "vector_gs_scale_operand_64"
   (and (match_code "const_int")
(match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode == 
DImode)")))
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 053d84c0c7d..a32d7e8d4e9 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -2723,18 +2723,18 @@
   (RVVMF4QI "const_1_operand") (RVVMF8QI "const_1_operand")
 
   (RVVM8HI "const_1_operand") (RVVM4HI "vector_gs_scale_operand_16_rv32")
-  (RVVM2HI "vector_gs_scale_operand_16") (RVVM1HI "vector_gs_scale_operand_16")
-  (RVVMF2HI "vector_gs_scale_operand_16") (RVVMF4HI 
"vector_gs_scale_operand_16")
+  (RVVM2HI "const_1_or_2_operand") (RVVM1HI "const_1_or_2_operand")
+  (RVVMF2HI "const_1_or_2_operand") (RVVMF4HI "const_1_or_2_operand")
 
   (RVVM8HF "const_1_operand") (RVVM4HF "vector_gs_scale_operand_16_rv32")
-  (RVVM2HF "vector_gs_scale_operand_16") (RVVM1HF "vector_gs_scale_operand_16")
-  (RVVMF2HF "vector_gs_scale_operand_16") (RVVMF4HF 
"vector_gs_scale_operand_16")
+  (RVVM2HF "const_1_or_2_operand") (RVVM1HF "const_1_or_2_operand")
+  (RVVMF2HF "const_1_or_2_operand") (RVVMF4HF "const_1_or_2_operand")
 
-  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI 
"vector_gs_scale_operand_32") (RVVM2SI "vector_gs_scale_operand_32")
-  (RVVM1SI "vector_gs_scale_operand_32") (RVVMF2SI 
"vector_gs_scale_operand_32")
+  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI "const_1_or_4_operand") 
(RVVM2SI "const_1_or_4_operand")
+  (RVVM1SI "const_1_or_4_operand") (RVVMF2SI "const_1_or_4_operand")
 
-  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF 
"vector_gs_scale_operand_32") (RVVM2SF "vector_gs_scale_operand_32")
-  (RVVM1SF "vector_gs_scale_operand_32") (RVVMF2SF 
"vector_gs_scale_operand_32")
+  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF "const_1_or_4_operand") 
(RVVM2SF "const_1_or_4_operand")
+  (RVVM1SF "const_1_or_4_operand") (RVVMF2SF "const_1_or_4_operand")
 
   (RVVM8DI "vector_gs_scale_operand_64") (RVVM4DI "vector_gs_scale_operand_64")
   (RVVM2DI "vector_gs_scale_operand_64") (RVVM1DI "vector_gs_scale_operand_64")
--
2.36.3

[PATCH] RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more generic names

2023-09-20 Thread Lehua Ding

This small patch renames vector_gs_scale_operand_16/32 to the more generic
names const_1_or_2/4_operand, so that they are easier to understand when
used elsewhere.

gcc/ChangeLog:

* config/riscv/predicates.md (const_1_or_2_operand): Rename.
(const_1_or_4_operand): Ditto.
(vector_gs_scale_operand_16): Ditto.
(vector_gs_scale_operand_32): Ditto.
* config/riscv/vector-iterators.md: Adjust.

---
 gcc/config/riscv/predicates.md   | 16 
 gcc/config/riscv/vector-iterators.md | 16 
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 4bc7ff2c9d8..a4f03242f2c 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -70,6 +70,14 @@
   (and (match_code "const_int,const_wide_int,const_vector")
(match_test "op == CONST1_RTX (GET_MODE (op))")))

+(define_predicate "const_1_or_2_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
+
+(define_predicate "const_1_or_4_operand"
+  (and (match_code "const_int")
+   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
+
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "const_0_operand")
(match_operand 0 "register_operand")))
@@ -463,14 +471,6 @@
   (ior (match_operand 0 "register_operand")
(match_code "const_vector")))

-(define_predicate "vector_gs_scale_operand_16"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
-
-(define_predicate "vector_gs_scale_operand_32"
-  (and (match_code "const_int")
-   (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
-
 (define_predicate "vector_gs_scale_operand_64"
   (and (match_code "const_int")
(match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode == 
DImode)")))
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 053d84c0c7d..a32d7e8d4e9 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -2723,18 +2723,18 @@
   (RVVMF4QI "const_1_operand") (RVVMF8QI "const_1_operand")

   (RVVM8HI "const_1_operand") (RVVM4HI "vector_gs_scale_operand_16_rv32")
-  (RVVM2HI "vector_gs_scale_operand_16") (RVVM1HI "vector_gs_scale_operand_16")
-  (RVVMF2HI "vector_gs_scale_operand_16") (RVVMF4HI 
"vector_gs_scale_operand_16")
+  (RVVM2HI "const_1_or_2_operand") (RVVM1HI "const_1_or_2_operand")
+  (RVVMF2HI "const_1_or_2_operand") (RVVMF4HI "const_1_or_2_operand")

   (RVVM8HF "const_1_operand") (RVVM4HF "vector_gs_scale_operand_16_rv32")
-  (RVVM2HF "vector_gs_scale_operand_16") (RVVM1HF "vector_gs_scale_operand_16")
-  (RVVMF2HF "vector_gs_scale_operand_16") (RVVMF4HF 
"vector_gs_scale_operand_16")
+  (RVVM2HF "const_1_or_2_operand") (RVVM1HF "const_1_or_2_operand")
+  (RVVMF2HF "const_1_or_2_operand") (RVVMF4HF "const_1_or_2_operand")

-  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI 
"vector_gs_scale_operand_32") (RVVM2SI "vector_gs_scale_operand_32")
-  (RVVM1SI "vector_gs_scale_operand_32") (RVVMF2SI 
"vector_gs_scale_operand_32")
+  (RVVM8SI "vector_gs_scale_operand_32_rv32") (RVVM4SI "const_1_or_4_operand") 
(RVVM2SI "const_1_or_4_operand")
+  (RVVM1SI "const_1_or_4_operand") (RVVMF2SI "const_1_or_4_operand")

-  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF 
"vector_gs_scale_operand_32") (RVVM2SF "vector_gs_scale_operand_32")
-  (RVVM1SF "vector_gs_scale_operand_32") (RVVMF2SF 
"vector_gs_scale_operand_32")
+  (RVVM8SF "vector_gs_scale_operand_32_rv32") (RVVM4SF "const_1_or_4_operand") 
(RVVM2SF "const_1_or_4_operand")
+  (RVVM1SF "const_1_or_4_operand") (RVVMF2SF "const_1_or_4_operand")

   (RVVM8DI "vector_gs_scale_operand_64") (RVVM4DI "vector_gs_scale_operand_64")
   (RVVM2DI "vector_gs_scale_operand_64") (RVVM1DI "vector_gs_scale_operand_64")
--
2.36.3

Re: [PATCH] RISC-V: Optimized for strided load/store with stride == element width[PR111450]

2023-09-20 Thread juzhe.zh...@rivai.ai

Thanks a lot. LGTM.



juzhe.zh...@rivai.ai
 
From: Li Xu
Date: 2023-09-21 11:12
To: gcc-patches
CC: kito.cheng; palmer; juzhe.zhong; xuli
Subject: [PATCH] RISC-V: Optimized for strided load/store with stride == 
element width[PR111450]
From: xuli 
 
When stride == element width, vlsse should be optimized into vle.v.
vsse should be optimized into vse.v.
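
The equivalence behind this optimization can be sketched with a scalar model in C, assuming the usual meaning of a strided load: element i is read from base + i * stride, with the stride given in bytes. The function names below are hypothetical and only model 32-bit elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a strided load of 32-bit elements:
   element i comes from BASE + i * STRIDE_BYTES.  */
void
strided_load_i32 (int32_t *dst, const unsigned char *base,
		  ptrdiff_t stride_bytes, size_t vl)
{
  assert (dst != NULL && base != NULL);
  for (size_t i = 0; i < vl; i++)
    memcpy (&dst[i], base + (ptrdiff_t) i * stride_bytes, sizeof (int32_t));
}

/* Hypothetical model of a unit-stride load: VL contiguous elements.  */
void
unit_stride_load_i32 (int32_t *dst, const unsigned char *base, size_t vl)
{
  assert (dst != NULL && base != NULL);
  memcpy (dst, base, vl * sizeof (int32_t));
}
```

When stride_bytes equals sizeof (int32_t), both functions read exactly the same bytes, so the strided form can be replaced by the unit-stride one; the same argument applies to stores.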
 
PR target/111450
 
gcc/ChangeLog:
 
* config/riscv/constraints.md (c01): const_int 1.
(c02): const_int 2.
(c04): const_int 4.
(c08): const_int 8.
* config/riscv/predicates.md (vector_eew8_stride_operand): New predicate for 
stride operand.
(vector_eew16_stride_operand): Ditto.
(vector_eew32_stride_operand): Ditto.
(vector_eew64_stride_operand): Ditto.
* config/riscv/vector-iterators.md: New iterator for stride operand.
* config/riscv/vector.md: Add stride = element width constraint.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/pr111450.c: New test.
---
gcc/config/riscv/constraints.md   |  20 
gcc/config/riscv/predicates.md|  18 
gcc/config/riscv/vector-iterators.md  |  87 +++
gcc/config/riscv/vector.md|  42 +---
.../gcc.target/riscv/rvv/base/pr111450.c  | 100 ++
5 files changed, 250 insertions(+), 17 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111450.c
 
diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 3f52bc76f67..964fdd450c9 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -45,6 +45,26 @@
   (and (match_code "const_int")
(match_test "ival == 0")))
+(define_constraint "c01"
+  "Constant value 1."
+  (and (match_code "const_int")
+   (match_test "ival == 1")))
+
+(define_constraint "c02"
+  "Constant value 2"
+  (and (match_code "const_int")
+   (match_test "ival == 2")))
+
+(define_constraint "c04"
+  "Constant value 4"
+  (and (match_code "const_int")
+   (match_test "ival == 4")))
+
+(define_constraint "c08"
+  "Constant value 8"
+  (and (match_code "const_int")
+   (match_test "ival == 8")))
+
(define_constraint "K"
   "A 5-bit unsigned immediate for CSR access instructions."
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 4bc7ff2c9d8..7845998e430 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -514,6 +514,24 @@
   (ior (match_operand 0 "const_0_operand")
(match_operand 0 "pmode_register_operand")))
+;; [1, 2, 4, 8] means strided load/store with stride == element width
+(define_special_predicate "vector_eew8_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 1 || INTVAL (op) == 0"))))
+(define_special_predicate "vector_eew16_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 2 || INTVAL (op) == 0"))))
+(define_special_predicate "vector_eew32_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 4 || INTVAL (op) == 0"))))
+(define_special_predicate "vector_eew64_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 8 || INTVAL (op) == 0"))))
+
;; A special predicate that doesn't match a particular mode.
(define_special_predicate "vector_any_register_operand"
   (match_code "reg"))
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 73df55a69c8..f85d1cc80d1 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -2596,6 +2596,93 @@
   (V512DI "V512BI")
])
+(define_mode_attr stride_predicate [
+  (RVVM8QI "vector_eew8_stride_operand") (RVVM4QI "vector_eew8_stride_operand")
+  (RVVM2QI "vector_eew8_stride_operand") (RVVM1QI "vector_eew8_stride_operand")
+  (RVVMF2QI "vector_eew8_stride_operand") (RVVMF4QI 
"vector_eew8_stride_operand")
+  (RVVMF8QI "vector_eew8_stride_operand")
+
+  (RVVM8HI "vector_eew16_stride_operand") (RVVM4HI 
"vector_eew16_stride_operand")
+  (RVVM2HI "vector_eew16_stride_operand") (RVVM1HI 
"vector_eew16_stride_operand")
+  (RVVMF2HI "vector_eew16_stride_operand") (RVVMF4HI 
"vector_eew16_stride_operand")
+
+  (RVVM8HF "vector_eew16_stride_operand") (RVVM4HF 
"vector_eew16_stride_operand")
+  (RVVM2HF "vector_eew16_stride_operand") (RVVM1HF 
"vector_eew16_stride_operand")
+  (RVVMF2HF "vector_eew16_stride_operand") (RVVMF4HF 
"vector_eew16_stride_operand")
+
+  (RVVM8SI "vector_eew32_stride_operand") (RVVM4SI 
"vector_eew32_stride_operand")
+  (RVVM2SI "vector_eew32_stride_operand") (RVVM1SI 
"vector_eew32_stride_operand")
+  (RVVMF2SI "vector_eew32_stride_operand")
+
+  (RVVM8SF "vector_eew32_stride_operand") (RVVM4SF

[PATCH] RISC-V: Optimized for strided load/store with stride == element width[PR111450]

2023-09-20 Thread Li Xu

From: xuli 

When stride == element width, vlsse should be optimized into vle.v.
vsse should be optimized into vse.v.
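
As a rough illustration of why this rewrite is safe, here is a scalar C model (not part of the patch). The helper names are hypothetical and the loops only mimic the addressing of a strided and a unit-stride vector load. When the byte stride equals the element size, both forms read exactly the same elements:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a strided load: element i is read
   from base + i * stride_bytes.  */
static void
strided_load_i32 (int32_t *dst, const void *base, ptrdiff_t stride_bytes,
                  size_t n)
{
  assert (dst != NULL && base != NULL);
  const unsigned char *p = base;
  for (size_t i = 0; i < n; i++)
    memcpy (&dst[i], p + (ptrdiff_t) i * stride_bytes, sizeof (int32_t));
}

/* Hypothetical scalar model of a unit-stride load: elements are
   contiguous in memory.  */
static void
unit_stride_load_i32 (int32_t *dst, const int32_t *base, size_t n)
{
  memcpy (dst, base, n * sizeof (int32_t));
}

/* Returns 1 when both models produce the same data for a stride
   equal to the element width (4 bytes for int32_t).  */
static int
stride_equals_width_matches (void)
{
  int32_t src[8] = { 10, 11, 12, 13, 14, 15, 16, 17 };
  int32_t a[8], b[8];
  strided_load_i32 (a, src, sizeof (int32_t), 8);
  unit_stride_load_i32 (b, src, 8);
  return memcmp (a, b, sizeof a) == 0;
}
```

The same reasoning applies to stores, with the copy direction reversed.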

PR target/111450

gcc/ChangeLog:

* config/riscv/constraints.md (c01): const_int 1.
(c02): const_int 2.
(c04): const_int 4.
(c08): const_int 8.
* config/riscv/predicates.md (vector_eew8_stride_operand): New 
predicate for stride operand.
(vector_eew16_stride_operand): Ditto.
(vector_eew32_stride_operand): Ditto.
(vector_eew64_stride_operand): Ditto.
* config/riscv/vector-iterators.md: New iterator for stride operand.
* config/riscv/vector.md: Add stride = element width constraint.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111450.c: New test.
---
 gcc/config/riscv/constraints.md   |  20 
 gcc/config/riscv/predicates.md|  18 
 gcc/config/riscv/vector-iterators.md  |  87 +++
 gcc/config/riscv/vector.md|  42 +---
 .../gcc.target/riscv/rvv/base/pr111450.c  | 100 ++
 5 files changed, 250 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111450.c

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 3f52bc76f67..964fdd450c9 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -45,6 +45,26 @@
   (and (match_code "const_int")
(match_test "ival == 0")))
 
+(define_constraint "c01"
+  "Constant value 1."
+  (and (match_code "const_int")
+   (match_test "ival == 1")))
+
+(define_constraint "c02"
+  "Constant value 2"
+  (and (match_code "const_int")
+   (match_test "ival == 2")))
+
+(define_constraint "c04"
+  "Constant value 4"
+  (and (match_code "const_int")
+   (match_test "ival == 4")))
+
+(define_constraint "c08"
+  "Constant value 8"
+  (and (match_code "const_int")
+   (match_test "ival == 8")))
+
 (define_constraint "K"
   "A 5-bit unsigned immediate for CSR access instructions."
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 4bc7ff2c9d8..7845998e430 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -514,6 +514,24 @@
   (ior (match_operand 0 "const_0_operand")
(match_operand 0 "pmode_register_operand")))
 
+;; [1, 2, 4, 8] means strided load/store with stride == element width
+(define_special_predicate "vector_eew8_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 1 || INTVAL (op) == 0"))))
+(define_special_predicate "vector_eew16_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 2 || INTVAL (op) == 0"))))
+(define_special_predicate "vector_eew32_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 4 || INTVAL (op) == 0"))))
+(define_special_predicate "vector_eew64_stride_operand"
+  (ior (match_operand 0 "pmode_register_operand")
+   (and (match_code "const_int")
+(match_test "INTVAL (op) == 8 || INTVAL (op) == 0"))))
+
 ;; A special predicate that doesn't match a particular mode.
 (define_special_predicate "vector_any_register_operand"
   (match_code "reg"))
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 73df55a69c8..f85d1cc80d1 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -2596,6 +2596,93 @@
   (V512DI "V512BI")
 ])
 
+(define_mode_attr stride_predicate [
+  (RVVM8QI "vector_eew8_stride_operand") (RVVM4QI "vector_eew8_stride_operand")
+  (RVVM2QI "vector_eew8_stride_operand") (RVVM1QI "vector_eew8_stride_operand")
+  (RVVMF2QI "vector_eew8_stride_operand") (RVVMF4QI 
"vector_eew8_stride_operand")
+  (RVVMF8QI "vector_eew8_stride_operand")
+
+  (RVVM8HI "vector_eew16_stride_operand") (RVVM4HI 
"vector_eew16_stride_operand")
+  (RVVM2HI "vector_eew16_stride_operand") (RVVM1HI 
"vector_eew16_stride_operand")
+  (RVVMF2HI "vector_eew16_stride_operand") (RVVMF4HI 
"vector_eew16_stride_operand")
+
+  (RVVM8HF "vector_eew16_stride_operand") (RVVM4HF 
"vector_eew16_stride_operand")
+  (RVVM2HF "vector_eew16_stride_operand") (RVVM1HF 
"vector_eew16_stride_operand")
+  (RVVMF2HF "vector_eew16_stride_operand") (RVVMF4HF 
"vector_eew16_stride_operand")
+
+  (RVVM8SI "vector_eew32_stride_operand") (RVVM4SI 
"vector_eew32_stride_operand")
+  (RVVM2SI "vector_eew32_stride_operand") (RVVM1SI 
"vector_eew32_stride_operand")
+  (RVVMF2SI "vector_eew32_stride_operand")
+
+  (RVVM8SF "vector_eew32_stride_operand") (RVVM4SF 
"vector_eew32_stride_operand")
+  (RVVM2SF "vector_eew32_stride_operand") (RVVM1SF 
"vector_eew32_stride_operand")
+  (RVVMF2SF

[PATCH] MATCH: Simplify `(A ==/!= B) &/| (((cast)A) CMP C)`

2023-09-20 Thread Andrew Pinski

This patch extends the pattern for `(A == B) &/| (A CMP C)` to the
case where the second A is cast to a different type.
Some of these cases were already handled correctly when written as
separate `if` statements, but not when combined with BIT_AND/BIT_IOR.
In the case of pr111456-1.c, the testcase already passed when
`--param=logical-op-non-short-circuit=0` was used; it can now
be optimized in all cases.
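
To make the shape of the pattern concrete, here is a small C sketch (not taken from the patch or its testcases; the function names are made up). It shows one instance with a widening cast, `(a != 4) & ((long long) a >= 4)`, next to the single comparison it is equivalent to. It only checks the equivalence and makes no claim about what a particular GCC version emits:

```c
#include <assert.h>

/* Input form: an inequality on A combined with a comparison on a
   cast of A.  */
static int
before (int a)
{
  return (a != 4) & ((long long) a >= 4);
}

/* Equivalent single comparison on the cast value.  */
static int
after (int a)
{
  return (long long) a > 4;
}

/* Returns 1 if both forms agree for every a in [lo, hi].  */
static int
forms_agree (int lo, int hi)
{
  assert (lo <= hi);
  for (int a = lo; a <= hi; a++)
    if (before (a) != after (a))
      return 0;
  return 1;
}
```

The cast here widens, so every bit of A takes part in the second comparison. The `allbits` flag in the diff below appears to restrict the NE_EXPR rewrites to that situation.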

OK? Bootstrapped and tested on x86_64-linux-gnu.

PR tree-optimization/106164
PR tree-optimization/111456

gcc/ChangeLog:

* match.pd (`(A ==/!= B) & (A CMP C)`):
Support an optional cast on the second A.
(`(A ==/!= B) | (A CMP C)`): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/cmpbit-6.c: New test.
* gcc.dg/tree-ssa/cmpbit-7.c: New test.
* gcc.dg/tree-ssa/pr111456-1.c: New test.
---
 gcc/match.pd   | 76 +-
 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-6.c   | 22 +++
 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-7.c   | 28 
 gcc/testsuite/gcc.dg/tree-ssa/pr111456-1.c | 43 
 4 files changed, 139 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-6.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpbit-7.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr111456-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index a37af05f873..0bf91bde486 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2973,7 +2973,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
   (gt @0 (minus @1 { build_int_cst (TREE_TYPE (@1), 1); }
 
-/* Convert (X == CST1) && (X OP2 CST2) to a known value
+/* Convert (X == CST1) && ((other)X OP2 CST2) to a known value
based on CST1 OP2 CST2.  Similarly for (X != CST1).  */
 /* Convert (X == Y) && (X OP2 Y) to a known value if X is an integral type.
Similarly for (X != Y).  */
@@ -2981,26 +2981,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for code1 (eq ne)
  (for code2 (eq ne lt gt le ge)
   (simplify
-   (bit_and:c (code1@3 @0 @1) (code2@4 @0 @2))
+   (bit_and:c (code1:c@3 @0 @1) (code2:c@4 (convert?@c0 @0) @2))
(if ((TREE_CODE (@1) == INTEGER_CST
 && TREE_CODE (@2) == INTEGER_CST)
|| ((INTEGRAL_TYPE_P (TREE_TYPE (@1))
 || POINTER_TYPE_P (TREE_TYPE (@1)))
-   && operand_equal_p (@1, @2)))
+   && bitwise_equal_p (@1, @2)))
 (with
  {
   bool one_before = false;
   bool one_after = false;
   int cmp = 0;
+  bool allbits = true;
   if (TREE_CODE (@1) == INTEGER_CST
  && TREE_CODE (@2) == INTEGER_CST)
{
- cmp = tree_int_cst_compare (@1, @2);
+ allbits = TYPE_PRECISION (TREE_TYPE (@1)) <= TYPE_PRECISION 
(TREE_TYPE (@2));
+ auto t1 = wi::to_wide (fold_convert (TREE_TYPE (@2), @1));
+ auto t2 = wi::to_wide (@2);
+ cmp = wi::cmp (t1, t2, TYPE_SIGN (TREE_TYPE (@2)));
  if (cmp < 0
- && wi::to_wide (@1) == wi::to_wide (@2) - 1)
+ && t1 == t2 - 1)
one_before = true;
  if (cmp > 0
- && wi::to_wide (@1) == wi::to_wide (@2) + 1)
+ && t1 == t2 + 1)
one_after = true;
}
   bool val;
@@ -3018,25 +3022,29 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (switch
   (if (code1 == EQ_EXPR && val) @3)
   (if (code1 == EQ_EXPR && !val) { constant_boolean_node (false, type); })
-  (if (code1 == NE_EXPR && !val) @4)
+  (if (code1 == NE_EXPR && !val && allbits) @4)
   (if (code1 == NE_EXPR
&& code2 == GE_EXPR
-  && cmp == 0)
-   (gt @0 @1))
+  && cmp == 0
+  && allbits)
+   (gt @c0 (convert @1)))
   (if (code1 == NE_EXPR
&& code2 == LE_EXPR
-  && cmp == 0)
-   (lt @0 @1))
+  && cmp == 0
+  && allbits)
+   (lt @c0 (convert @1)))
   /* (a != (b+1)) & (a > b) -> a > (b+1) */
   (if (code1 == NE_EXPR
&& code2 == GT_EXPR
-  && one_after)
-   (gt @0 @1))
+  && one_after
+  && allbits)
+   (gt @c0 (convert @1)))
   /* (a != (b-1)) & (a < b) -> a < (b-1) */
   (if (code1 == NE_EXPR
&& code2 == LT_EXPR
-  && one_before)
-   (lt @0 @1))
+  && one_before
+  && allbits)
+   (lt @c0 (convert @1)))
  )
 )
)
@@ -3100,26 +3108,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (for code1 (eq ne)
  (for code2 (eq ne lt gt le ge)
   (simplify
-   (bit_ior:c (code1@3 @0 @1) (code2@4 @0 @2))
+   (bit_ior:c (code1:c@3 @0 @1) (code2:c@4 (convert?@c0 @0) @2))
(if ((TREE_CODE (@1) == INTEGER_CST
 && TREE_CODE (@2) == INTEGER_CST)
|| ((INTEGRAL_TYPE_P (TREE_TYPE (@1))
|| POINTER_TYPE_P (TREE_TYPE (@1)))
-   && operand_equal_p (@1, @2)))
+   && bitwise_equal_p (@1, @2)))
 (with
  {
   bool one_before = false;
   bool one_after = false;
   int cmp = 0;
+

[PATCH] check undefine_p for one more vr

2023-09-20 Thread Jiufu Guo

Hi,

The root cause of PR111355 and PR111482 is a missing check for whether
vr0 is undefined_p before calling vr0.lower_bound.

In the pattern "(X + C) / N",

(if (INTEGRAL_TYPE_P (type)
 && get_range_query (cfun)->range_of_expr (vr0, @0))
 (if (...) 
   (plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0 ...
&& wi::geu_p (vr0.lower_bound (), -c))

In "(if (...)", there is code that guards against vr0 being undefined_p,
but in the "else" part, vr0's undefined_p is not checked before
"wi::geu_p (vr0.lower_bound (), -c)".

Bootstrap & regtest pass on ppc64{,le}.
Is this ok for trunk?
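
For readers less familiar with this pattern, the arithmetic behind the "else" branch can be sketched in plain C. This is an illustration under assumptions, not the match.pd code: `k` stands for -C, and the `undefined`/`lower` parameters are hypothetical stand-ins for vr0.undefined_p () and vr0.lower_bound ().

```c
#include <assert.h>

/* For unsigned x and n dividing k, (x - k) / n == x / n - k / n holds
   only while x - k does not wrap around, i.e. while x >= k.
   Returns 1 if the identity holds for the given values.  */
static int
identity_holds (unsigned x, unsigned k, unsigned n)
{
  assert (n != 0 && k % n == 0);
  return (x - k) / n == x / n - k / n;
}

/* Model of the guarded condition: test 'undefined' first, because a
   lower bound is meaningless for an undefined range.  */
static int
may_split_division (int undefined, unsigned lower, unsigned k, unsigned n)
{
  return !undefined && n != 0 && k % n == 0 && lower >= k;
}
```

With x = 20, k = 8, n = 4 both sides are 3. With x = 3 the subtraction wraps and the two sides differ, which is why the lower bound has to be known before the split is applied.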

BR,
Jeff (Jiufu Guo)


PR tree-optimization/111355

gcc/ChangeLog:

* match.pd ((X + C) / N): Update pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/pr111355.c: New test.

---
 gcc/match.pd| 2 +-
 gcc/testsuite/gcc.dg/pr111355.c | 8 
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr111355.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 39c9c81966a..5fdfba14d47 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1033,7 +1033,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  || (vr0.nonnegative_p () && vr3.nonnegative_p ())
  || (vr0.nonpositive_p () && vr3.nonpositive_p (
(plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); })
-   (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0
+   (if (!vr0.undefined_p () && TYPE_UNSIGNED (type) && c.sign_mask () < 0
&& exact_mod (-c)
/* unsigned "X-(-C)" doesn't underflow.  */
&& wi::geu_p (vr0.lower_bound (), -c))
diff --git a/gcc/testsuite/gcc.dg/pr111355.c b/gcc/testsuite/gcc.dg/pr111355.c
new file mode 100644
index 000..8bacbc69d31
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111355.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -Wno-div-by-zero" } */
+
+/* Make sure no ICE. */
+int main() {
+  unsigned b;
+  return b ? 1 << --b / 0 : 0;
+}
-- 
2.25.1

[Committed] RISC-V: Support VLS INT <-> FP conversions

2023-09-20 Thread Juzhe-Zhong

Support INT <-> FP VLS auto-vectorization patterns.

Regression passed.
Committed.

gcc/ChangeLog:

* config/riscv/autovec.md: Extend VLS modes.
* config/riscv/vector-iterators.md: Ditto.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/convert-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/convert-9.c: New test.

---
 gcc/config/riscv/autovec.md   |  12 +-
 gcc/config/riscv/vector-iterators.md  | 202 ++
 gcc/config/riscv/vector.md|  20 +-
 .../riscv/rvv/autovec/vls/convert-1.c |  74 +++
 .../riscv/rvv/autovec/vls/convert-10.c|  80 +++
 .../riscv/rvv/autovec/vls/convert-11.c|  54 +
 .../riscv/rvv/autovec/vls/convert-12.c|  36 
 .../riscv/rvv/autovec/vls/convert-2.c |  74 +++
 .../riscv/rvv/autovec/vls/convert-3.c |  58 +
 .../riscv/rvv/autovec/vls/convert-4.c |  36 
 .../riscv/rvv/autovec/vls/convert-5.c |  80 +++
 .../riscv/rvv/autovec/vls/convert-6.c |  55 +
 .../riscv/rvv/autovec/vls/convert-7.c |  37 
 .../riscv/rvv/autovec/vls/convert-8.c |  58 +
 .../riscv/rvv/autovec/vls/convert-9.c |  22 ++
 15 files changed, 882 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/convert-9.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 75ed7ae4f2e..55c0a04df3b 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -847,7 +847,7 @@
 (define_insn_and_split "2"
   [(set (match_operand: 0 "register_operand")
(any_fix:
- (match_operand:VF 1 "register_operand")))]
+ (match_operand:V_VLSF 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
@@ -868,8 +868,8 @@
 ;; -
 
 (define_insn_and_split "2"
-  [(set (match_operand:VF 0 "register_operand")
-   (any_float:VF
+  [(set (match_operand:V_VLSF 0 "register_operand")
+   (any_float:V_VLSF
  (match_operand: 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
@@ -916,8 +916,8 @@
 ;; - vfwcvt.f.x.v
 ;; -
 (define_insn_and_split "2"
-  [(set (match_operand:VF 0 "register_operand")
-   (any_float:VF
+  [(set (match_operand:V_VLSF 0 "register_operand")
+   (any_float:V_VLSF
  (match_operand: 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
@@ -940,7 +940,7 @@
 (define_insn_and_split "2"
   [(set (match_operand: 0 "register_operand")
(any_fix:
- (match_operand:VF 1 "register_operand")))]
+ (match_operand:V_VLSF 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 053d84c0c7d..19f3ec3ef74 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -1037,6 +1037,28 @@
   (RVVM4DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
   (RVVM2DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
   (RVVM1DI "TARGET_VECTOR_ELEN_64 && TARGET_VECTOR_ELEN_FP_32")
+
+  (V1SI "TARGET_VECTOR_VLS && TARGET_ZVFH")
+  (V2SI

[PATCH] LoongArch: Optimizations of vector construction.

2023-09-20 Thread Guo Jie

gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_vecinit_merge_): New
pattern for vector construction.
(vec_set_internal): Ditto.
(lasx_xvinsgr2vr__internal): Ditto.
(lasx_xvilvl__internal): Ditto.
* config/loongarch/loongarch.cc (loongarch_expand_vector_init):
Optimized the implementation of vector construction.
(loongarch_expand_vector_init_same): New function.
* config/loongarch/lsx.md (lsx_vilvl__internal): New
pattern for vector construction.
(lsx_vreplvei_mirror_): New pattern for vector
construction.
(vec_concatv2df): Ditto.
(vec_concatv4sf): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c: New test.
* gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c: New test.
---
 gcc/config/loongarch/lasx.md  |  69 ++
 gcc/config/loongarch/loongarch.cc | 716 +-
 gcc/config/loongarch/lsx.md   | 134 
 .../vector/lasx/lasx-vec-construct-opt.c  | 102 +++
 .../vector/lsx/lsx-vec-construct-opt.c|  85 +++
 5 files changed, 732 insertions(+), 374 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 8111c8bb79a..2bc5d47ed4a 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -186,6 +186,9 @@ (define_c_enum "unspec" [
   UNSPEC_LASX_XVLDI
   UNSPEC_LASX_XVLDX
   UNSPEC_LASX_XVSTX
+  UNSPEC_LASX_VECINIT_MERGE
+  UNSPEC_LASX_VEC_SET_INTERNAL
+  UNSPEC_LASX_XVILVL_INTERNAL
 ])
 
 ;; All vector modes with 256 bits.
@@ -255,6 +258,15 @@ (define_mode_attr VFHMODE256
[(V8SF "V4SF")
(V4DF "V2DF")])
 
+;; The attribute gives half int/float modes for vector modes.
+(define_mode_attr VHMODE256_ALL
+  [(V32QI "V16QI")
+   (V16HI "V8HI")
+   (V8SI "V4SI")
+   (V4DI "V2DI")
+   (V8SF "V4SF")
+   (V4DF "V2DF")])
+
 ;; The attribute gives double modes for vector modes in LASX.
 (define_mode_attr VDMODE256
   [(V8SI "V4DI")
@@ -312,6 +324,11 @@ (define_mode_attr mode256_f
(V4DI "v4df")
(V8SI "v8sf")])
 
+;; This attribute gives V32QI mode and V16HI mode with half size.
+(define_mode_attr mode256_i_half
+  [(V32QI "v16qi")
+   (V16HI "v8hi")])
+
  ;; This attribute gives suffix for LASX instructions.  HOW?
 (define_mode_attr lasxfmt
   [(V4DF "d")
@@ -756,6 +773,20 @@ (define_insn "lasx_xvpermi_q_"
   [(set_attr "type" "simd_splat")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Support a LSX-mode input op2.
+(define_insn "lasx_vecinit_merge_"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+   (unspec:LASX
+ [(match_operand:LASX 1 "register_operand" "0")
+  (match_operand: 2 "register_operand" "f")
+  (match_operand 3 "const_uimm8_operand")]
+  UNSPEC_LASX_VECINIT_MERGE))]
+  "ISA_HAS_LASX"
+  "xvpermi.q\t%u0,%u2,%3"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "")])
+
 (define_insn "lasx_xvpickve2gr_d"
   [(set (match_operand:DI 0 "register_operand" "=r")
(any_extend:DI
@@ -779,6 +810,33 @@ (define_expand "vec_set"
   DONE;
 })
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Simulate missing instructions xvinsgr2vr.b and xvinsgr2vr.h.
+(define_expand "vec_set_internal"
+  [(match_operand:ILASX_HB 0 "register_operand")
+   (match_operand: 1 "reg_or_0_operand")
+   (match_operand 2 "const__operand")]
+  "ISA_HAS_LASX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lasx_xvinsgr2vr__internal
+(operands[0], operands[1], operands[0], index));
+  DONE;
+})
+
+(define_insn "lasx_xvinsgr2vr__internal"
+  [(set (match_operand:ILASX_HB 0 "register_operand" "=f")
+   (unspec:ILASX_HB [(match_operand: 1 "reg_or_0_operand" "rJ")
+ (match_operand:ILASX_HB 2 "register_operand" "0")
+ (match_operand 3 "const__operand" "")]
+UNSPEC_LASX_VEC_SET_INTERNAL))]
+  "ISA_HAS_LASX"
+{
+  return "vinsgr2vr.\t%w0,%z1,%y3";
+}
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "")])
+
 (define_expand "vec_set"
   [(match_operand:FLASX 0 "register_operand")
(match_operand: 1 "reg_or_0_operand")
@@ -1567,6 +1625,17 @@ (define_insn "logb2"
   [(set_attr "type" "simd_flog2")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Merge two scalar floating-point op1 and op2 into a LASX op0.
+(define_insn "lasx_xvilvl__internal"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+   (unspec:FLASX [(match_operand: 1 "register_operand" "f")
+  (match_operand: 2 "register_operand" "f")]
+ UNSPEC_LASX_XVILVL_INTERNAL))]
+

[PATCH] LoongArch: Optimizations of vector construction.

2023-09-20 Thread Guo Jie

Change-Id: I327f68ab482b94073974e672c71d25c98b35a080

gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_vecinit_merge_): New
pattern for vector construction.
(vec_set_internal): Ditto.
(lasx_xvinsgr2vr__internal): Ditto.
(lasx_xvilvl__internal): Ditto.
* config/loongarch/loongarch.cc (loongarch_expand_vector_init):
Optimized the implementation of vector construction.
(loongarch_expand_vector_init_same): New function.
* config/loongarch/lsx.md (lsx_vilvl__internal): New
pattern for vector construction.
(lsx_vreplvei_mirror_): New pattern for vector
construction.
(vec_concatv2df): Ditto.
(vec_concatv4sf): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c: New test.
* gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c: New test.
---
 gcc/config/loongarch/lasx.md  |  69 ++
 gcc/config/loongarch/loongarch.cc | 716 +-
 gcc/config/loongarch/lsx.md   | 134 
 .../vector/lasx/lasx-vec-construct-opt.c  | 102 +++
 .../vector/lsx/lsx-vec-construct-opt.c|  85 +++
 5 files changed, 732 insertions(+), 374 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 8111c8bb79a..2bc5d47ed4a 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -186,6 +186,9 @@ (define_c_enum "unspec" [
   UNSPEC_LASX_XVLDI
   UNSPEC_LASX_XVLDX
   UNSPEC_LASX_XVSTX
+  UNSPEC_LASX_VECINIT_MERGE
+  UNSPEC_LASX_VEC_SET_INTERNAL
+  UNSPEC_LASX_XVILVL_INTERNAL
 ])
 
 ;; All vector modes with 256 bits.
@@ -255,6 +258,15 @@ (define_mode_attr VFHMODE256
[(V8SF "V4SF")
(V4DF "V2DF")])
 
+;; The attribute gives half int/float modes for vector modes.
+(define_mode_attr VHMODE256_ALL
+  [(V32QI "V16QI")
+   (V16HI "V8HI")
+   (V8SI "V4SI")
+   (V4DI "V2DI")
+   (V8SF "V4SF")
+   (V4DF "V2DF")])
+
 ;; The attribute gives double modes for vector modes in LASX.
 (define_mode_attr VDMODE256
   [(V8SI "V4DI")
@@ -312,6 +324,11 @@ (define_mode_attr mode256_f
(V4DI "v4df")
(V8SI "v8sf")])
 
+;; This attribute gives V32QI mode and V16HI mode with half size.
+(define_mode_attr mode256_i_half
+  [(V32QI "v16qi")
+   (V16HI "v8hi")])
+
  ;; This attribute gives suffix for LASX instructions.  HOW?
 (define_mode_attr lasxfmt
   [(V4DF "d")
@@ -756,6 +773,20 @@ (define_insn "lasx_xvpermi_q_"
   [(set_attr "type" "simd_splat")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Support a LSX-mode input op2.
+(define_insn "lasx_vecinit_merge_"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+   (unspec:LASX
+ [(match_operand:LASX 1 "register_operand" "0")
+  (match_operand: 2 "register_operand" "f")
+  (match_operand 3 "const_uimm8_operand")]
+  UNSPEC_LASX_VECINIT_MERGE))]
+  "ISA_HAS_LASX"
+  "xvpermi.q\t%u0,%u2,%3"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "")])
+
 (define_insn "lasx_xvpickve2gr_d"
   [(set (match_operand:DI 0 "register_operand" "=r")
(any_extend:DI
@@ -779,6 +810,33 @@ (define_expand "vec_set"
   DONE;
 })
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Simulate missing instructions xvinsgr2vr.b and xvinsgr2vr.h.
+(define_expand "vec_set_internal"
+  [(match_operand:ILASX_HB 0 "register_operand")
+   (match_operand: 1 "reg_or_0_operand")
+   (match_operand 2 "const__operand")]
+  "ISA_HAS_LASX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lasx_xvinsgr2vr__internal
+(operands[0], operands[1], operands[0], index));
+  DONE;
+})
+
+(define_insn "lasx_xvinsgr2vr__internal"
+  [(set (match_operand:ILASX_HB 0 "register_operand" "=f")
+   (unspec:ILASX_HB [(match_operand: 1 "reg_or_0_operand" "rJ")
+ (match_operand:ILASX_HB 2 "register_operand" "0")
+ (match_operand 3 "const__operand" "")]
+UNSPEC_LASX_VEC_SET_INTERNAL))]
+  "ISA_HAS_LASX"
+{
+  return "vinsgr2vr.\t%w0,%z1,%y3";
+}
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "")])
+
 (define_expand "vec_set"
   [(match_operand:FLASX 0 "register_operand")
(match_operand: 1 "reg_or_0_operand")
@@ -1567,6 +1625,17 @@ (define_insn "logb2"
   [(set_attr "type" "simd_flog2")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Merge two scalar floating-point op1 and op2 into a LASX op0.
+(define_insn "lasx_xvilvl__internal"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+   (unspec:FLASX [(match_operand: 1 "register_operand" "f")
+  (match_operand: 2 "register_operand" "f")]
+

Re: Re: [Committed] RISC-V: Fix Demand comparison bug[VSETVL PASS]

2023-09-20 Thread juzhe.zh...@rivai.ai

Yes. We could wait a few more days before backporting.

juzhe.zh...@rivai.ai

From: Kito Cheng
Date: 2023-09-21 00:41
To: Juzhe-Zhong
CC: GCC Patches; Kito Cheng; Jeff Law; Robin Dapp
Subject: Re: [Committed] RISC-V: Fix Demand comparison bug[VSETVL PASS]
Does it also happen on the GCC 13 branch? If so, please backport :)

On Wed, Sep 20, 2023 at 11:09, Juzhe-Zhong wrote:
This bug is exposed when we support VLS integer conversion patterns.

FAIL: c-c++-common/torture/pr53505.c execution.

This is caused by incorrect vsetvl elimination in Phase 4:

   10318:   0d207057vsetvli zero,zero,e32,m4,ta,ma
   1031c:   5e003e57vmv.v.i v28,0
   .:   missed e8,m1 vsetvl
   10320:   7b07b057vmsgtu.vi   v0,v16,15
   10324:   03083157vadd.vi v2,v16,-16

Regression testing on the release version of GCC shows no unexpected differences.

Committed.
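
The change in the diff below can be modeled with a small C sketch. Everything here is a hypothetical simplification: the struct, the `insn_id` field (standing in for the insn pointer) and the fixed demand count are illustrative, and the real operator== compares more state (for example the AVL). The old rule compared instruction identity only when one side was a vsetvl config insn; the new rule always compares it.

```c
#include <assert.h>
#include <stdbool.h>

enum { NUM_DEMANDS = 4 };  /* illustrative count, not GCC's */

struct insn_info_model
{
  bool demands[NUM_DEMANDS];
  int insn_id;     /* stands in for the insn pointer */
  bool is_vsetvl;  /* stands in for vector_config_insn_p */
};

static bool
same_demands (const struct insn_info_model *a,
              const struct insn_info_model *b)
{
  assert (a != 0 && b != 0);
  for (int i = 0; i < NUM_DEMANDS; i++)
    if (a->demands[i] != b->demands[i])
      return false;
  return true;
}

/* Old rule: identity matters only if either side is a vsetvl.  */
static bool
equal_old (const struct insn_info_model *a, const struct insn_info_model *b)
{
  if (!same_demands (a, b))
    return false;
  if (a->is_vsetvl || b->is_vsetvl)
    if (a->insn_id != b->insn_id)
      return false;
  return true;
}

/* New rule: different insns are always different expressions.  */
static bool
equal_new (const struct insn_info_model *a, const struct insn_info_model *b)
{
  return same_demands (a, b) && a->insn_id == b->insn_id;
}

/* Two distinct non-vsetvl insns with identical demand sets.  */
static int
old_compare_plain_insns (void)
{
  struct insn_info_model a = { { true, false, true, false }, 1, false };
  struct insn_info_model b = { { true, false, true, false }, 2, false };
  return equal_old (&a, &b);
}

static int
new_compare_plain_insns (void)
{
  struct insn_info_model a = { { true, false, true, false }, 1, false };
  struct insn_info_model b = { { true, false, true, false }, 2, false };
  return equal_new (&a, &b);
}
```

In this model the old rule treats the two distinct instructions as the same expression, while the new rule keeps them apart.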

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vector_insn_info::operator==): Fix bug.

---
 gcc/config/riscv/riscv-vsetvl.cc | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index df980b6770e..e0f61148ef3 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1799,10 +1799,11 @@ vector_insn_info::operator== (const vector_insn_info &other) const
 if (m_demands[i] != other.demand_p ((enum demand_type) i))
   return false;

-  if (vector_config_insn_p (m_insn->rtl ())
-  || vector_config_insn_p (other.get_insn ()->rtl ()))
-if (m_insn != other.get_insn ())
-  return false;
+  /* We should consider different INSN demands as different
+ expression.  Otherwise, we will be doing incorrect vsetvl
+ elimination.  */
+  if (m_insn != other.get_insn ())
+return false;

   if (!same_avl_p (other))
 return false;
-- 
2.36.3

Re: Re: [Committed] RISC-V: Support VLS unary floating-point patterns

2023-09-20 Thread 钟居哲

>> On a more general note, are we expecting #include <math.h> to cause a
>> testcase to fail?

Well, actually I am not familiar with this area.
We include math.h because we need it,
for example for CEIL/FLOOR, etc.
I don't know how to avoid those bogus failures.

juzhe.zh...@rivai.ai

From: Patrick O'Neill
Date: 2023-09-21 01:47
To: juzhe.zh...@rivai.ai
CC: Robin Dapp; gcc-patches; Kito.cheng; jeffreyalaw; palmer; Edwin Lu; 
joern.rennecke; jeremy.bennett; gnu-toolchain; Kito Cheng
Subject: Re: [Committed] RISC-V: Support VLS unary floating-point patterns
Juzhe,

On a more general note, are we expecting #include <math.h> to cause a
testcase to fail?

My motivation is to make the testsuite less noisy when checking for
regressions. For example, a patch like this one:
https://patchwork.sourceware.org/project/gcc/patch/20230920023059.1728132-1-pan2...@intel.com/
is showing 4 new failures on rv32gcv from the {dg-do compile} testcases
that #include <math.h>. I might be wrong, but those don't look like real
failures to me [1][2][3].

On glibc rv64gcv I'm seeing tests like:
gcc.target/riscv/rvv/autovec/unop/vnot-rv32gcv.c
fail with similar missing stubs-ilp32d.h errors.

I want to sanity-check with other people that they are seeing similar
errors and that these errors indicate something wrong with the testsuite.
If nobody else is seeing these errors, I'd like to hear how you're
running the testsuite so I can debug the riscv-gnu-toolchain repo.

Patrick

[1]: 
Executing on host: 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc

-B/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/

/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
  -march=rv32gcv -mabi=ilp32d -mcmodel=medlow   -fdiagnostics-plain-output  -O3 
-ftree-vectorize -march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2 -S   
-o math-ceil-1.s(timeout = 600)
spawn -ignore SIGHUP 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc

-B/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/

/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
 -march=rv32gcv -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output -O3 
-ftree-vectorize -march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2 -S -o 
math-ceil-1.s
In file included from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/features.h:515,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/bits/libc-header-start.h:33,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/math.h:27,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h:1,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c:5:
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/gnu/stubs.h:17:11:
 fatal error: gnu/stubs-lp64d.h: No such file or directory
compilation terminated.
compiler exited with status 1
FAIL: gcc.target/riscv/rvv/autovec/math-ceil-1.c -O3 -ftree-vectorize (test for 
excess errors)

[2]:
https://github.com/ewlu/riscv-gnu-toolchain/issues/170

[3]:
This also extends beyond math.h. I'm seeing similar failures for
testcases like 
gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c that
#include .

On 9/19/23 18:12, Patrick O'Neill wrote:
I'll let it run overnight and see if this helps. Even before this patch,
I was seeing 233 stubs related failures for rv32gcv and 7 for rv64gcv so
this won't fix all the issues.

It's easily replicated using upstream riscv-gnu-toolchain
git clone https://github.com/riscv-collab/riscv-gnu-toolchain
cd riscv-gnu-toolchain
git submodule update --init gcc
cd gcc
git pull origin master
cd ..
mkdir build
cd build
../configure --prefix=$(pwd) --with-arch=rv32gcv --with-abi=ilp32d
make report-linux -j32

Then search for "stubs" in the debug logs 
(/build-gcc-linux-stage2/gcc/testsuite/*.log)

Patrick
On 9/19/23 17:54, juzhe.zh...@rivai.ai wrote:
I think we could remove math.h.

Hi, @Patrick. Could you verify it?

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
index 2292372d7a3..674098e9ba6 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
@@ -1,5 +1,4

Re: [PATCH v2 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-09-20 Thread Jason Merrill


On 9/19/23 20:30, waffl3x wrote:

Thank you, this is great!


Thanks!


One legal hurdle to start with: our DCO policy
(https://gcc.gnu.org/dco.html) requires real names in the sign-off, not
pseudonyms. If you would prefer to contribute under this pseudonym, I
encourage you to file a copyright assignment with the FSF, who are set
up to handle that.


I will get on that right away.


+/* These need to moved to somewhere appropriate. */


This isn't a bad spot for these macros, but you could also move them
down lower, maybe near DECL_THIS_STATIC and DECL_ARRAY_PARAMETER_P for
some thematic connection.


Sounds good, I will move them down.


+/* The flag is a member of base, but the value is meaningless for other
+ decl types so checking is still justified I imagine. */


Absolutely, we often reuse bits for other purposes if they're disjoint
from the use they were added for.


Would it be more appropriate to give it a general name in base instead
then? If so, I can also change that.


That would make sense.


+/* Not a lang_decl field, but still specific to c++. */
+#define DECL_PARM_XOBJ_FLAG(NODE) \
+ (PARM_DECL_CHECK (NODE)->decl_common.decl_flag_3)


Better to use a DECL_LANG_FLAG than claim one of the
language-independent flags for C++.

There's a list at the top of cp-tree.h of the uses of LANG_FLAG on
various kinds of tree node. DECL_LANG_FLAG_4 seems free on PARM_DECL.


Okay, I will switch to that instead, I didn't like using such a general
purpose flag for what is only relevant until the FUNC_DECL is created
and then never again.


That's a good point, but the flag you chose seems even more general purpose.

A better option might be, instead of putting this flag on the PARM_DECL, 
to put it on the short-lived TREE_LIST which is only used for 
communication between cp_parser_parameter_declaration_list and 
grokparms, and have grokdeclarator grab it from 
declarator->u.function.parameters?



If you don't mind answering right now, what are the consequences of
claiming language-independent flags for C++? Or to phrase it
differently, why would this be claiming it for C++? My guess was that
those flags could be used by any front ends and there wouldn't be any
conflicts, as you can't really have crossover between two front ends at
the same time. Or is that the thing, that kind of cross-over is
actually viable and claiming a language independent flag inhibits that
possibility? Like I alluded to, this is kinda off topic from the patch
so feel free to defer the answer to someone else but I just want to
clear up my understanding for the future.


Generally the flags that aren't specifically specified to be 
language-specific are reserved for language-independent uses; even if 
only one front-end actually uses the feature, it should be for 
communication to language-independent code rather than communication 
within the particular front-end.  The patch modified tree-core.h to 
refer to a macro in cp-tree.h.



Yeah, I separated all the diagnostics out into the second patch. This
patch was meant to include the bare minimum of what was necessary to
get the feature functional. As for the diagnostics patch, I'm not happy
with how scattered about the code base it is, but you'll be able to
judge for yourself when I resubmit that patch, hopefully later today.
So not to worry, I didn't neglect diagnostics, it's just in a follow
up. The v1 of it was submitted on August 31st if you want to find it,
but I wouldn't recommend it. I misunderstood how some things were to be
formatted so it's probably best you just wait for me to finish a v2 of
it.


Ah, oops, I assumed that v2 completely replaced v1.


One last thing, I assume I should clean up the comments and replace
them with more typical ones right? I'm going to go forward with that
assumption in v3, I just want to mention it in advance just in case I
have the wrong idea.


Yes, please.


I will get started on v3 of this patch and v2 of the diagnostic patch
as soon as I have the ball rolling on legal stuff. I should have it all
finished tonight. Thanks for the detailed response, it cleared up a lot
of my doubts.


Sounds good!

Jason

Re: [PATCH] c++: missing SFINAE in grok_array_decl [PR111493]

2023-09-20 Thread Jason Merrill


On 9/20/23 11:03, Patrick Palka wrote:

On Wed, 20 Sep 2023, Patrick Palka wrote:


Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

This fixes some missed SFINAE in grok_array_decl when checking a C++23
multidimensional subscript operator expression.

Note the existing pedwarn code paths are a backward compatibility fallback
for treating invalid a[x, y, z] as a[(x, y, z)], but this should only be
done outside of a SFINAE context I think.

PR c++/111493

gcc/cp/ChangeLog:

* decl2.cc (grok_array_decl): Guard errors with tf_error.
In the pedwarn code paths, return error_mark_node when in
a SFINAE context.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/subscript15.C: New test.
---
  gcc/cp/decl2.cc  | 36 +++-
  gcc/testsuite/g++.dg/cpp23/subscript15.C | 24 
  2 files changed, 47 insertions(+), 13 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/subscript15.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index b402befba6d..6eb6d8c57d6 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -477,7 +477,9 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
{
  /* If it would be valid albeit deprecated expression in
 C++20, just pedwarn on it and treat it as if wrapped
-in ().  */
+in () unless we're in a SFINAE context.  */
+ if (!(complain & tf_error))
+   return error_mark_node;


It occurred to me that we could check for tf_error much earlier, before
we call build_x_compound_expr_from_vec and build_new_op, since they're
only used here to implement the backward compatibility fallback.  Perhaps
the following is better, then:


This version is OK.


-- >8 --

Subject: [PATCH] c++: missing SFINAE in grok_array_decl [PR111493]

We should guard both the diagnostic and backward compatibility fallback
code with tf_error, so that in a SFINAE context we don't issue any
diagnostics and correctly recognize ill-formed C++23 multidimensional
subscript operator expressions.

PR c++/111493

gcc/cp/ChangeLog:

* decl2.cc (grok_array_decl): Guard diagnostic and backward
compatibility fallback code paths with tf_error.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/subscript15.C: New test.
---
  gcc/cp/decl2.cc  | 15 +++---
  gcc/testsuite/g++.dg/cpp23/subscript15.C | 25 
  2 files changed, 37 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/subscript15.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index b402befba6d..6ac27cbc15f 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -459,7 +459,10 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
{
  expr = build_op_subscript (loc, array_expr, index_exp_list,
 , complain & tf_decltype);
- if (expr == error_mark_node)
+ if (expr == error_mark_node
+ /* Don't do the backward compatibility fallback in a SFINAE
+context.   */
+ && (complain & tf_error))
{
  tree idx = build_x_compound_expr_from_vec (*index_exp_list, NULL,
 tf_none);
@@ -510,6 +513,11 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
  
if (index_exp == NULL_TREE)

{
+ if (!(complain & tf_error))
+   /* Don't do the backward compatibility fallback in a SFINAE
+  context.  */
+   return error_mark_node;
+
  if ((*index_exp_list)->is_empty ())
{
  error_at (loc, "built-in subscript operator without expression "
@@ -561,8 +569,9 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
swapped = true, array_expr = p2, index_exp = i1;
else
{
- error_at (loc, "invalid types %<%T[%T]%> for array subscript",
-   type, TREE_TYPE (index_exp));
+ if (complain & tf_error)
+   error_at (loc, "invalid types %<%T[%T]%> for array subscript",
+ type, TREE_TYPE (index_exp));
  return error_mark_node;
}
  
diff --git a/gcc/testsuite/g++.dg/cpp23/subscript15.C b/gcc/testsuite/g++.dg/cpp23/subscript15.C

new file mode 100644
index 000..fece96be96b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/subscript15.C
@@ -0,0 +1,25 @@
+// PR c++/111493
+// { dg-do compile { target c++23 } }
+
+template
+concept CartesianIndexable = requires(T t, Ts... ts) { t[ts...]; };
+
+static_assert(!CartesianIndexable);
+static_assert(!CartesianIndexable);
+static_assert(!CartesianIndexable);
+
+static_assert(!CartesianIndexable);
+static_assert(CartesianIndexable);
+static_assert(!CartesianIndexable);
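
For readers less familiar with what "SFINAE context" means for a subscript
expression, here is a minimal sketch using the pre-C++23 detection idiom.
It is not part of the patch; the names is_indexable, NoSubscript and
WithSubscript are made up for illustration.  The point is only that an
ill-formed t[i] inside decltype should quietly select the fallback instead
of producing a diagnostic.

```cpp
#include <cassert>
#include <utility>

// Hand-rolled void_t so the sketch also builds as C++11/14.
template <typename...> struct make_void { typedef void type; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: chosen when T[I] is ill-formed (substitution failure).
template <typename T, typename I, typename = void>
struct is_indexable { static const bool value = false; };

// Partial specialization: viable only when the subscript expression is valid.
template <typename T, typename I>
struct is_indexable<T, I,
  void_t<decltype (std::declval<T &> ()[std::declval<I> ()])>>
{ static const bool value = true; };

// Hypothetical test types.
struct NoSubscript {};
struct WithSubscript { int operator[] (int) const { return 0; } };
```

With these definitions, is_indexable<WithSubscript, int>::value and
is_indexable<int *, int>::value are true, while
is_indexable<NoSubscript, int>::value is false and no error is emitted.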

Re: [PATCH] c++: constraint rewriting during ttp coercion [PR111485]

2023-09-20 Thread Jason Merrill


On 9/20/23 13:10, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps backports?


OK for trunk and 13.


-- >8 --

In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters.  The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2.  This patch
fixes this by including the outer template arguments in this substitution,
which ought to match the depth of the ttp.

The second testcase demonstrates that it's sometimes necessary to
substitute the concrete outer template arguments instead of generic
ones, because a ttp's constraints could depend on outer arguments.

PR c++/111485

gcc/cp/ChangeLog:

* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.
---
  gcc/cp/pt.cc   |  5 +++--
  gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C | 24 ++
  gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C | 17 +++
  3 files changed, 44 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 8758e218ce4..f47887291a6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -8360,7 +8360,7 @@ canonicalize_expr_argument (tree arg, tsubst_flags_t 
complain)
 constrained than the parameter.  */
  
  static bool

-is_compatible_template_arg (tree parm, tree arg)
+is_compatible_template_arg (tree parm, tree arg, tree args)
  {
tree parm_cons = get_constraints (parm);
  
@@ -8381,6 +8381,7 @@ is_compatible_template_arg (tree parm, tree arg)

  {
tree aparms = DECL_INNERMOST_TEMPLATE_PARMS (arg);
new_args = template_parms_level_to_args (aparms);
+  new_args = add_to_template_args (args, new_args);
++processing_template_decl;
parm_cons = tsubst_constraint_info (parm_cons, new_args,
  tf_none, NULL_TREE);
@@ -8635,7 +8636,7 @@ convert_template_argument (tree parm,
// Check that the constraints are compatible before allowing the
// substitution.
if (val != error_mark_node)
-if (!is_compatible_template_arg (parm, arg))
+   if (!is_compatible_template_arg (parm, arg, args))
{
if (in_decl && (complain & tf_error))
{
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
new file mode 100644
index 000..4129e9e1303
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
@@ -0,0 +1,24 @@
+// PR c++/111485
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+
+template concept C = always_true;
+template concept D = C || true;
+
+template class TT> struct example { };
+template class UU> using example_t = example;
+
+template
+struct A {
+  template class TT> struct example { };
+
+  template class UU> using example_t = example;
+
+  template
+  struct B {
+template class UU> using example_t = example;
+  };
+};
+
+template struct A::B;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C
new file mode 100644
index 000..7832cabc7d8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C
@@ -0,0 +1,17 @@
+// PR c++/111485
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+
+template concept C = always_true;
+
+template requires C class TT>
+void f();
+
+template requires C
+struct A;
+
+int main() {
+  f();
+  f(); // { dg-error "no match|constraint" }
+}

Re: Ping: [PATCH] testsuite: Add test for already-fixed issue with _Pragma expansion [PR90400]

2023-09-20 Thread Richard Sandiford

Lewis Hyatt via Gcc-patches  writes:
> Hello-
>
> May I please ping this one? It's adding a testcase prior to closing
> the PR. Thanks!
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628488.html

OK, thanks.  (Not really my area, but someone would probably have
objected by now if they were going to.)

Richard

>
> -Lewis
>
> On Fri, Aug 25, 2023 at 4:46 PM Lewis Hyatt  wrote:
>>
>> Hello-
>>
>> This is adding a testcase for a PR that was already incidentally fixed. OK
>> to commit please? Thanks...
>>
>> -Lewis
>>
>> -- >8 --
>>
>> The PR was fixed by r12-5454. Since the fix was somewhat incidental,
>> although related, add a testcase from PR90400 too before closing it out.
>>
>> gcc/testsuite/ChangeLog:
>>
>> PR preprocessor/90400
>> * c-c++-common/cpp/pr90400.c: New test.
>> ---
>>  gcc/testsuite/c-c++-common/cpp/pr90400.c | 14 ++
>>  1 file changed, 14 insertions(+)
>>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr90400.c
>>
>> diff --git a/gcc/testsuite/c-c++-common/cpp/pr90400.c 
>> b/gcc/testsuite/c-c++-common/cpp/pr90400.c
>> new file mode 100644
>> index 000..4f2cab8d6ab
>> --- /dev/null
>> +++ b/gcc/testsuite/c-c++-common/cpp/pr90400.c
>> @@ -0,0 +1,14 @@
>> +/* { dg-do compile } */
>> +/* { dg-additional-options "-save-temps" } */
>> +/* PR preprocessor/90400 */
>> +
>> +#define OUTER(x) x
>> +#define FOR(x) _Pragma ("GCC unroll 0") for (x)
>> +void f ()
>> +{
>> +/* If the pragma were to be seen prior to the expansion of FOR, as was
>> +   the case before r12-5454, then the unroll pragma would complain
>> +   because the immediately following statement would be ";" rather than
>> +   a loop.  */
>> +OUTER (; FOR (int i = 0; i != 1; ++i);) /* { dg-bogus {statement 
>> expected before ';' token} } */
>> +}

[COMMITTED] Tweak ssa_cache::merge_range API.

2023-09-20 Thread Andrew MacLeod

merge_range used to return TRUE if there was already a range in the
cache.  This patch changes the meaning of the return value such that
it returns TRUE if the range in the cache changes, i.e., it either sets a
range where there wasn't one before, or updates an existing range when
intersecting the old one with the new one results in a different range.
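
As a rough illustration of the new return-value contract (not the GCC
code), here is a toy model.  Interval and ToyCache are hypothetical
stand-ins for vrange and ssa_cache, and ranges are plain closed integer
intervals.

```cpp
#include <algorithm>
#include <cassert>
#include <map>

// Hypothetical stand-in for a value range: a closed interval [lo, hi].
struct Interval {
  long lo, hi;
  bool operator== (const Interval &o) const
  { return lo == o.lo && hi == o.hi; }
};

// Toy cache keyed by an SSA-name index; not GCC's ssa_cache.
class ToyCache {
  std::map<unsigned, Interval> m_tab;
public:
  // Return true if NAME gets a new range or its range changes,
  // false if the cached range already matches the intersection.
  bool merge_range (unsigned name, const Interval &r)
  {
    auto it = m_tab.find (name);
    if (it == m_tab.end ())
      {
        m_tab.emplace (name, r);
        return true;
      }
    // Intersect the cached range with R (an empty result would show up
    // as lo > hi; this sketch does not treat that case specially).
    Interval merged = { std::max (it->second.lo, r.lo),
                        std::min (it->second.hi, r.hi) };
    if (merged == it->second)
      return false;
    it->second = merged;
    return true;
  }

  Interval get (unsigned name) const { return m_tab.at (name); }
};
```

Merging [0, 10] into an empty slot returns true; merging [0, 10] or the
wider [-5, 20] again returns false; merging [5, 20] returns true and leaves
[5, 10] in the cache.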


It also tweaks the debug output for the cache to no longer output the 
header text "non-varying Global Ranges" in the class, as the class is 
now used for other purposes as well.   The text is moved to when the 
dump is actually from a global table.


Bootstraps on x86_64-pc-linux-gnu with no regressions.   Pushed.

Andrew
commit 0885e96272f1335c324f99fd2d1e9b0b3da9090c
Author: Andrew MacLeod 
Date:   Wed Sep 20 12:53:04 2023 -0400

Tweak merge_range API.

merge_range used to return TRUE if there was already a range.  Now it
returns TRUE if it adds a new range, OR updates an existing range
with a new value.  FALSE is returned when the range already matches.

* gimple-range-cache.cc (ssa_cache::merge_range): Change meaning
of the return value.
(ssa_cache::dump): Don't print GLOBAL RANGE header.
(ssa_lazy_cache::merge_range): Adjust return value meaning.
(ranger_cache::dump): Print GLOBAL RANGE header.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 5b74681b61a..3c819933c4e 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -606,7 +606,7 @@ ssa_cache::set_range (tree name, const vrange &r)
 }
 
 // If NAME has a range, intersect it with R, otherwise set it to R.
-// Return TRUE if there was already a range set, otherwise false.
+// Return TRUE if the range is new or changes.
 
 bool
 ssa_cache::merge_range (tree name, const vrange &r)
@@ -616,19 +616,23 @@ ssa_cache::merge_range (tree name, const vrange &r)
 m_tab.safe_grow_cleared (num_ssa_names + 1);
 
   vrange_storage *m = m_tab[v];
-  if (m)
+  // Check if this is a new value.
+  if (!m)
+m_tab[v] = m_range_allocator->clone (r);
+  else
 {
   Value_Range curr (TREE_TYPE (name));
   m->get_vrange (curr, TREE_TYPE (name));
-  curr.intersect (r);
+  // If there is no change, return false.
+  if (!curr.intersect (r))
+   return false;
+
   if (m->fits_p (curr))
m->set_vrange (curr);
   else
m_tab[v] = m_range_allocator->clone (curr);
 }
-  else
-m_tab[v] = m_range_allocator->clone (r);
-  return m != NULL;
+  return true;
 }
 
 // Set the range for NAME to R in the ssa cache.
@@ -656,27 +660,14 @@ ssa_cache::clear ()
 void
 ssa_cache::dump (FILE *f)
 {
-  /* Cleared after the table header has been printed.  */
-  bool print_header = true;
   for (unsigned x = 1; x < num_ssa_names; x++)
 {
   if (!gimple_range_ssa_p (ssa_name (x)))
continue;
   Value_Range r (TREE_TYPE (ssa_name (x)));
-  // Invoke dump_range_query which is a private virtual version of
-  // get_range.   This avoids performance impacts on general queries,
-  // but allows sharing of the dump routine.
+  // Dump all non-varying ranges.
   if (get_range (r, ssa_name (x)) && !r.varying_p ())
{
- if (print_header)
-   {
- /* Print the header only when there's something else
-to print below.  */
- fprintf (f, "Non-varying global ranges:\n");
- fprintf (f, "=:\n");
- print_header = false;
-   }
-
  print_generic_expr (f, ssa_name (x), TDF_NONE);
  fprintf (f, "  : ");
  r.dump (f);
@@ -684,8 +675,6 @@ ssa_cache::dump (FILE *f)
}
 }
 
-  if (!print_header)
-fputc ('\n', f);
 }
 
 // Return true if NAME has an active range in the cache.
@@ -716,7 +705,7 @@ ssa_lazy_cache::set_range (tree name, const vrange &r)
 }
 
 // If NAME has a range, intersect it with R, otherwise set it to R.
-// Return TRUE if there was already a range set, otherwise false.
+// Return TRUE if the range is new or changes.
 
 bool
 ssa_lazy_cache::merge_range (tree name, const vrange &r)
@@ -731,7 +720,7 @@ ssa_lazy_cache::merge_range (tree name, const vrange &r)
   if (v >= m_tab.length ())
 m_tab.safe_grow (num_ssa_names + 1);
   m_tab[v] = m_range_allocator->clone (r);
-  return false;
+  return true;
 }
 
 // Return TRUE if NAME has a range, and return it in R.
@@ -996,6 +985,8 @@ ranger_cache::~ranger_cache ()
 void
 ranger_cache::dump (FILE *f)
 {
+  fprintf (f, "Non-varying global ranges:\n");
+  fprintf (f, "=:\n");
   m_globals.dump (f);
   fprintf (f, "\n");
 }

Re: [PATCH v2] c++: Catch indirect change of active union member in constexpr [PR101631]

2023-09-20 Thread Jason Merrill


On 9/19/23 20:55, Nathaniel Shead wrote:

On Tue, Sep 19, 2023 at 05:25:20PM -0400, Jason Merrill wrote:

On 9/1/23 08:22, Nathaniel Shead wrote:

On Wed, Aug 30, 2023 at 04:28:18PM -0400, Jason Merrill wrote:

On 8/29/23 09:35, Nathaniel Shead wrote:

This is an attempt to improve the constexpr machinery's handling of
union lifetime by catching more cases that cause UB. Is this approach
OK?

I'd also like some feedback on a couple of pain points with this
implementation; in particular, is there a good way to detect if a type
has a non-deleted trivial constructor? I've used 'is_trivially_xible' in
this patch, but that also checks for a trivial destructor which by my
reading of [class.union.general]p5 is possibly incorrect. Checking for a
trivial default constructor doesn't seem too hard but I couldn't find a
good way of checking if that constructor is deleted.


I guess the simplest would be

(TYPE_HAS_TRIVIAL_DFLT (t) && locate_ctor (t))

because locate_ctor returns null for a deleted default ctor.  It would be
good to make this a separate predicate.
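
A user-level analogue of such a predicate, as a sketch only: the compiler
works on trees via TYPE_HAS_TRIVIAL_DFLT and locate_ctor, whereas the code
below uses a standard library trait.  The struct names and
has_usable_trivial_default_ctor are hypothetical, and the library trait may
also take the destructor into account, which is the concern raised above
about is_trivially_xible.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical example types.
struct TrivialCtor { int x; };                          // implicit, trivial
struct DeletedCtor { DeletedCtor () = delete; int x; }; // deleted
struct UserCtor { UserCtor () : x (0) {} int x; };      // user-provided

// True only when T has a default constructor that is both usable
// (not deleted) and trivial, as reported by the library trait.
template <typename T>
constexpr bool has_usable_trivial_default_ctor ()
{
  return std::is_trivially_default_constructible<T>::value;
}
```

Here only TrivialCtor satisfies the predicate; a deleted or user-provided
default constructor makes it false.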


I'm also generally unsatisfied with the additional complexity with the
third 'refs' argument in 'cxx_eval_store_expression' being pushed and
popped; would it be better to replace this with a vector of some
specific structure type for the data that needs to be passed on?


Perhaps, but what you have here is fine.  Another possibility would be to
just have a vec of the refs and extract the index from the ref later as
needed.

Jason



Thanks for the feedback. I've kept the refs as-is for now. I've also
cleaned up a couple of other typos I'd had with comments and diagnostics.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

@@ -6192,10 +6197,16 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
 type = reftype;
-  if (code == UNION_TYPE && CONSTRUCTOR_NELTS (*valp)
- && CONSTRUCTOR_ELT (*valp, 0)->index != index)
+  if (code == UNION_TYPE
+ && TREE_CODE (t) == MODIFY_EXPR
+ && (CONSTRUCTOR_NELTS (*valp) == 0
+ || CONSTRUCTOR_ELT (*valp, 0)->index != index))
{
- if (cxx_dialect < cxx20)
+ /* We changed the active member of a union. Ensure that this is
+valid.  */
+ bool has_active_member = CONSTRUCTOR_NELTS (*valp) != 0;
+ tree inner = strip_array_types (reftype);
+ if (has_active_member && cxx_dialect < cxx20)
{
  if (!ctx->quiet)
error_at (cp_expr_loc_or_input_loc (t),


While we're looking at this area, this error message should really mention
that it's allowed in C++20.


@@ -6205,8 +6216,36 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
  index);
  *non_constant_p = true;
}
- else if (TREE_CODE (t) == MODIFY_EXPR
-  && CONSTRUCTOR_NO_CLEARING (*valp))
+ else if (!is_access_expr
+  || (CLASS_TYPE_P (inner)
+  && !type_has_non_deleted_trivial_default_ctor (inner)))
+   {
+ /* Diagnose changing active union member after initialisation
+without a valid member access expression, as described in
+[class.union.general] p5.  */
+ if (!ctx->quiet)
+   {
+ if (has_active_member)
+   error_at (cp_expr_loc_or_input_loc (t),
+ "accessing %qD member instead of initialized "
+ "%qD member in constant expression",
+ index, CONSTRUCTOR_ELT (*valp, 0)->index);
+ else
+   error_at (cp_expr_loc_or_input_loc (t),
+ "accessing uninitialized member %qD",
+ index);
+ if (is_access_expr)
+   {
+ inform (DECL_SOURCE_LOCATION (index),
+ "%qD does not implicitly begin its lifetime "
+ "because %qT does not have a non-deleted "
+ "trivial default constructor",
+ index, inner);
+   }


The !is_access_expr case could also use an explanatory message.


Thanks for the review, I've updated these messages and will send through
an updated patch once bootstrap/regtest is complete.


Also, I notice that this testcase crashes with the patch:

union U { int i; float f; };
constexpr auto g (U u) { return (u.i = 42); }
static_assert (g({.f = 3.14}) == 42);


This appears to segfault even without the patch since GCC 13.1.
https://godbolt.org/z/45sPh8WaK

I haven't done a bisect yet to work out what commit exactly caused this.
Should I aim to fix this first before coming back with this patch?


Ah, I was just assuming it was related, never mind.  I'll fix it.

Jason

Re: [PATCH] [frange] Remove special casing from unordered operators.

2023-09-20 Thread Aldy Hernandez





On 9/20/23 11:12, Aldy Hernandez wrote:

In coming up with testcases for the unordered folders, I realized that
we were already handling them correctly, even in the absence of my
work in this area lately.

All of the unordered fold_range() methods try to fold with the ordered
variants first, and if they return TRUE, we are guaranteed to be able
to fold, even in the presence of NANs.  For example:

if (x_5 >= y_8)
   if (x_5 __UNLE y_8)

On the true side of the first conditional we know that either x_5 < y_8
or that one or more operands is a NAN.  Since UNLE_EXPR returns true
for precisely this scenario, we can fold as true.


Ugh, that should've been the false edge of the first conditional, thus:

if (x_5 >= y_8)
  {
  }
else
  {
// Relation at this point is: x_5 < y_8
// or either x_5 or y_8 is a NAN.
if (x_5 __UNLE y_8)
  link_error();
  }

The second conditional is foldable because LT U NAN is a subset of 
__UNLE (which is LE U NAN).
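
The subset argument can be checked at the source level with a small
sketch.  This is not ranger code; unle and false_edge_implies_unle are
hypothetical helpers, and the sample values are arbitrary.  It assumes the
code is built without -ffast-math so NaNs behave per IEEE.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// UNLE: true if the operands are unordered (either is a NaN) or x <= y.
bool unle (double x, double y)
{
  return std::isunordered (x, y) || x <= y;
}

// Over a small sample that includes infinities and a NaN, verify that
// whenever (x >= y) is false, UNLE (x, y) is true.
bool false_edge_implies_unle ()
{
  const double nan = std::numeric_limits<double>::quiet_NaN ();
  const double inf = std::numeric_limits<double>::infinity ();
  const double vals[] = { -inf, -1.0, 0.0, 1.0, inf, nan };
  for (double x : vals)
    for (double y : vals)
      if (!(x >= y) && !unle (x, y))
        return false;
  return true;
}
```

On the false edge of x >= y the operands are either ordered with x < y or
unordered, and unle holds in both cases, so false_edge_implies_unle
returns true.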


The patch still stands though :).

Aldy

Re: [PATCH] aarch64: Ensure const and sign correctness

2023-09-20 Thread Richard Sandiford

Pekka Seppänen  writes:
> Be const and sign correct by using a matching CIE augmentation type.
> Use a builtin instead of relying on <string.h> being included.
>
> libgcc/ChangeLog:
>
>   * config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key):
>   Use const unsigned type and a builtin.

Thanks for the patch, pushed to trunk.

Richard

> Signed-off-by: Pekka Seppänen 
> ---
>  libgcc/config/aarch64/aarch64-unwind.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/libgcc/config/aarch64/aarch64-unwind.h 
> b/libgcc/config/aarch64/aarch64-unwind.h
> index 3ad2f8239ed..d669edd671b 100644
> --- a/libgcc/config/aarch64/aarch64-unwind.h
> +++ b/libgcc/config/aarch64/aarch64-unwind.h
> @@ -40,8 +40,9 @@ aarch64_cie_signed_with_b_key (struct _Unwind_Context 
> *context)
>const struct dwarf_cie *cie = get_cie (fde);
>if (cie != NULL)
>   {
> -   char *aug_str = cie->augmentation;
> -   return strchr (aug_str, 'B') == NULL ? 0 : 1;
> +   const unsigned char *aug_str = cie->augmentation;
> +   return __builtin_strchr ((const char *) aug_str,
> +'B') == NULL ? 0 : 1;
>   }
>  }
>return 0;
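
A standalone sketch of the same pattern, for illustration only.
augmentation_has_b_key is a hypothetical helper, not the libgcc function,
and it assumes a GCC-compatible compiler that provides __builtin_strchr.

```cpp
#include <cassert>

// Return 1 if the augmentation string contains 'B', otherwise 0.
// The pointer stays const unsigned char * to match the field type; the
// cast is needed only because the builtin takes const char *.  No
// <string.h> include is required for the builtin.
int augmentation_has_b_key (const unsigned char *aug_str)
{
  return __builtin_strchr ((const char *) aug_str, 'B') == nullptr ? 0 : 1;
}
```

For example, an augmentation string "zRB" yields 1 and "zR" yields 0.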

Re: [PATCH][_GLIBCXX_INLINE_VERSION] Fix

2023-09-20 Thread François Dumont

libstdc++: [_GLIBCXX_INLINE_VERSION] Add handle_contract_violation 
symbol alias


libstdc++-v3/ChangeLog:

    * src/experimental/contract.cc
    [_GLIBCXX_INLINE_VERSION](handle_contract_violation): Provide
    symbol alias without version namespace decoration for gcc.

Here is what I'm testing eventually, ok to commit if successful ?

François

On 20/09/2023 11:32, Jonathan Wakely wrote:

On Wed, 20 Sept 2023 at 05:51, François Dumont via Libstdc++
 wrote:

libstdc++: Remove std::constract_violation from versioned namespace

Spelling mistake in contract_violation, and it's not
std::contract_violation, it's std::experimental::contract_violation


GCC expects this type to be in std namespace directly.

Again, it's in std::experimental not in std directly.

Will this change cause problems when including another experimental
header, which does put experimental below std::__8?

I think std::__8::experimental and std::experimental will become ambiguous.

Maybe we do want to remove the inline __8 namespace from all
experimental headers. That needs a bit more thought though.


libstdc++-v3/ChangeLog:

  * include/experimental/contract:
  Remove _GLIBCXX_BEGIN_NAMESPACE_VERSION/_GLIBCXX_END_NAMESPACE_VERSION.

This line is too long for the changelog.


It does fix 29 g++.dg/contracts in gcc testsuite.

Ok to commit ?

François

diff --git a/libstdc++-v3/src/experimental/contract.cc b/libstdc++-v3/src/experimental/contract.cc
index 504a6c041f1..17daa3312ca 100644
--- a/libstdc++-v3/src/experimental/contract.cc
+++ b/libstdc++-v3/src/experimental/contract.cc
@@ -67,3 +67,14 @@ handle_contract_violation (const std::experimental::contract_violation 
   std::cerr << std::endl;
 #endif
 }
+
+#if _GLIBCXX_INLINE_VERSION
+// Provide symbol alias without version namespace decoration for gcc.
+extern "C"
+void _Z25handle_contract_violationRKNSt12experimental18contract_violationE
+(const std::experimental::contract_violation )
+__attribute__ (
+(alias
+ ("_Z25handle_contract_violationRKNSt3__812experimental18contract_violationE"),
+ weak));
+#endif
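
For anyone unfamiliar with the alias attribute, a reduced sketch of the
mechanism follows.  It uses made-up extern "C" names (toy_handler and
toy_handler_versioned) so that no mangled names are needed, and it assumes
an ELF target with a GCC-compatible compiler; it is not the libstdc++
code.

```cpp
#include <cassert>

// The "real" definition, standing in for the symbol that carries the
// version namespace decoration.
extern "C" int toy_handler_versioned (int code) { return code + 1; }

// A second, weak symbol name that resolves to the same definition.
// The alias target must be defined in this translation unit.
extern "C" int toy_handler (int code)
  __attribute__ ((alias ("toy_handler_versioned"), weak));
```

Calling toy_handler (41) runs the body of toy_handler_versioned and
returns 42; both names refer to the same code in the object file.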

[Committed] RISC-V: Remove math.h import to resolve missing stubs failures

2023-09-20 Thread Patrick O'Neill


Committed. Thanks!

On 9/20/23 10:19, Kito Cheng wrote:

LGTM

Patrick O'Neill wrote on Wed, Sep 20, 2023 at 18:07:

Resolves some of the missing stubs failures:
fatal error: gnu/stubs-lp64d.h: No such file or directory
compilation terminated.

2023-09-20 Juzhe Zhong 

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/vls/def.h: Remove unneeded
        math.h import.

Tested-by: Patrick O'Neill 
---
Tested using 590a8bec3ed92118e084b0a1897d3314a666170e
glibc rv64gcv
glibc rv32gcv

glibc rv64gcv
Resolved failures:
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-4.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-6.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)

glibc rv32gcv
Resolved failures:
FAIL: gcc.target/riscv/rvv/autovec/vls/and-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/and-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/and-3.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-3.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-4.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-5.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-6.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-3.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-4.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-5.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/div-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-3.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-4.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-5.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-6.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-7.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/extract-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/extract-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-1.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c -O3
-ftree-vectorize --param riscv-autovec-preference=scalable (test
for excess errors)
FAIL:

Re: [Committed] RISC-V: Support VLS unary floating-point patterns

2023-09-20 Thread Patrick O'Neill


Juzhe,

On a more general note, are we expecting #include <math.h> to cause a
testcase to fail?

My motivation is to make the testsuite less noisy when checking for
regressions. For example, a patch like this one:
https://patchwork.sourceware.org/project/gcc/patch/20230920023059.1728132-1-pan2...@intel.com/
is showing 4 new failures on rv32gcv from the {dg-do compile} testcases
that #include <math.h>. I might be wrong, but those don't look like real
failures to me [1][2][3].

On glibc rv64gcv I'm seeing tests like:
gcc.target/riscv/rvv/autovec/unop/vnot-rv32gcv.c
fail with similar missing stubs-ilp32d.h errors.

I want to sanity-check with other people that they are seeing similar
errors and that these errors indicate something wrong with the testsuite.
If nobody else is seeing these errors, I'd like to hear how you're
running the testsuite so I can debug the riscv-gnu-toolchain repo.

Patrick

[1]:
Executing on host: 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc 
-B/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/ 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c 
-march=rv32gcv -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output  
-O3 -ftree-vectorize -march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns 
-fno-schedule-insns2 -S   -o math-ceil-1.s (timeout = 600)
spawn -ignore SIGHUP 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc 
-B/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/ 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c 
-march=rv32gcv -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output 
-O3 -ftree-vectorize -march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns 
-fno-schedule-insns2 -S -o math-ceil-1.s
In file included from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/features.h:515,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/bits/libc-header-start.h:33,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/math.h:27,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h:1,
 from 
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c:5:
/github/ewlu-runner-2/_work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/gnu/stubs.h:17:11: 
fatal error: gnu/stubs-lp64d.h: No such file or directory

compilation terminated.
compiler exited with status 1
FAIL: gcc.target/riscv/rvv/autovec/math-ceil-1.c -O3 -ftree-vectorize 
(test for excess errors)


[2]:
https://github.com/ewlu/riscv-gnu-toolchain/issues/170

[3]:
This also extends beyond math.h. I'm seeing similar failures for
testcases like
gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c that
#include .


On 9/19/23 18:12, Patrick O'Neill wrote:


I'll let it run overnight and see if this helps. Even before this patch,
I was seeing 233 stubs related failures for rv32gcv and 7 for rv64gcv so
this won't fix all the issues.

It's easily replicated using upstream riscv-gnu-toolchain
git clone https://github.com/riscv-collab/riscv-gnu-toolchain
cd riscv-gnu-toolchain
git submodule update --init gcc
cd gcc
git pull master
cd ..
mkdir build
cd build
../configure --prefix=$(pwd) --with-arch=rv32gcv --with-abi=ilp32d
make report-linux -j32

Then search for "stubs" in the debug logs 
(/build-gcc-linux-stage2/gcc/testsuite/*.log)


Patrick

On 9/19/23 17:54, juzhe.zh...@rivai.ai wrote:

I think we could remove math.h.

Hi, @Patrick. Could you verify it?

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h

index 2292372d7a3..674098e9ba6 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
@@ -1,5 +1,4 @@
 #include 
-#include <math.h>

and commit it.

Thanks.

Re: [Patch, fortran] PR68155 - ICE on initializing character array in type (len_lhs <> len_rhs)

2023-09-20 Thread Harald Anlauf


Hi Paul,

On 9/20/23 09:03, Paul Richard Thomas wrote:

Hi All,

This is a straightforward patch that is adequately explained by the ChangeLog.

Regtests fine - OK for trunk?


this looks good to me.  OK for trunk.

As it is an almost obvious fix for sort of wrong code, I'd consider
it backportable if you have intentions in that direction.

Thanks,
Harald


Cheers

Paul

Fortran: Pad mismatched charlens in component initializers [PR68155]

2023-09-20  Paul Thomas  

gcc/fortran
PR fortran/68155
* decl.cc (fix_initializer_charlen): New function broken out of
add_init_expr_to_sym.
(add_init_expr_to_sym, build_struct): Call the new function.

gcc/testsuite/
PR fortran/68155
* gfortran.dg/pr68155.f90: New test.

Re: [PATCH] RISC-V: Remove math.h import to resolve missing stubs failures

2023-09-20 Thread Kito Cheng

LGTM

Patrick O'Neill wrote on Wed, Sep 20, 2023 at 18:07:

> Resolves some of the missing stubs failures:
> fatal error: gnu/stubs-lp64d.h: No such file or directory
> compilation terminated.
>
> 2023-09-20 Juzhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/vls/def.h: Remove unneeded math.h
> import.
>
> Tested-by: Patrick O'Neill 
> ---
> Tested using 590a8bec3ed92118e084b0a1897d3314a666170e
> glibc rv64gcv
> glibc rv32gcv
>
> glibc rv64gcv
> Resolved failures:
> FAIL: gcc.target/riscv/rvv/autovec/vls/mov-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/mov-4.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/mov-6.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
>
> glibc rv32gcv
> Resolved failures:
> FAIL: gcc.target/riscv/rvv/autovec/vls/and-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/and-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/and-3.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-3.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-4.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-5.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-6.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-3.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-4.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/const-5.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/div-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-3.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-4.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-5.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-6.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/dup-7.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/extract-1.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/extract-2.c -O3 -ftree-vectorize
> --param riscv-autovec-preference=scalable (test for excess errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-1.c -O3
> -ftree-vectorize --param riscv-autovec-preference=scalable (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c -O3
> -ftree-vectorize --param riscv-autovec-preference=scalable (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-3.c -O3
> -ftree-vectorize --param riscv-autovec-preference=scalable (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-1.c -O3
> -ftree-vectorize --param riscv-autovec-preference=scalable (test for excess
> errors)
> FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-2.c

[PATCH] c++: constraint rewriting during ttp coercion [PR111485]

2023-09-20 Thread Patrick Palka

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps backports?

-- >8 --

In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters.  The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2.  This patch
fixes this by including the outer template arguments in this substitution,
which ought to match the depth of the ttp.

The second testcase demonstrates that it's sometimes necessary to
substitute the concrete outer template arguments instead of generic
ones, because a ttp's constraints could depend on outer arguments.

PR c++/111485

gcc/cp/ChangeLog:

* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.
---
 gcc/cp/pt.cc   |  5 +++--
 gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C | 24 ++
 gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C | 17 +++
 3 files changed, 44 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 8758e218ce4..f47887291a6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -8360,7 +8360,7 @@ canonicalize_expr_argument (tree arg, tsubst_flags_t complain)
constrained than the parameter.  */
 
 static bool
-is_compatible_template_arg (tree parm, tree arg)
+is_compatible_template_arg (tree parm, tree arg, tree args)
 {
   tree parm_cons = get_constraints (parm);
 
@@ -8381,6 +8381,7 @@ is_compatible_template_arg (tree parm, tree arg)
 {
   tree aparms = DECL_INNERMOST_TEMPLATE_PARMS (arg);
   new_args = template_parms_level_to_args (aparms);
+  new_args = add_to_template_args (args, new_args);
   ++processing_template_decl;
   parm_cons = tsubst_constraint_info (parm_cons, new_args,
  tf_none, NULL_TREE);
@@ -8635,7 +8636,7 @@ convert_template_argument (tree parm,
   // Check that the constraints are compatible before allowing the
   // substitution.
   if (val != error_mark_node)
-if (!is_compatible_template_arg (parm, arg))
+   if (!is_compatible_template_arg (parm, arg, args))
   {
if (in_decl && (complain & tf_error))
   {
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
new file mode 100644
index 00000000000..4129e9e1303
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp5.C
@@ -0,0 +1,24 @@
+// PR c++/111485
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+
+template concept C = always_true;
+template concept D = C || true;
+
+template class TT> struct example { };
+template class UU> using example_t = example;
+
+template
+struct A {
+  template class TT> struct example { };
+
+  template class UU> using example_t = example;
+
+  template
+  struct B {
+template class UU> using example_t = example;
+  };
+};
+
+template struct A::B;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C
new file mode 100644
index 00000000000..7832cabc7d8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp6.C
@@ -0,0 +1,17 @@
+// PR c++/111485
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+
+template concept C = always_true;
+
+template requires C class TT>
+void f();
+
+template requires C
+struct A;
+
+int main() {
+  f();
+  f(); // { dg-error "no match|constraint" }
+}
-- 
2.42.0.216.gbda494f404

Re: [PATCH] AArch64: Fix strict-align cpymem/setmem [PR103100]

2023-09-20 Thread Wilco Dijkstra

Hi Richard,

> * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.

> Shouldn't this be a separate patch?  It's not immediately obvious that this 
> is a necessary part of this change.

You mean this?

@@ -1627,7 +1627,7 @@ (define_expand "cpymemdi"
(match_operand:BLK 1 "memory_operand")
(match_operand:DI 2 "general_operand")
(match_operand:DI 3 "immediate_operand")]
-   "!STRICT_ALIGNMENT || TARGET_MOPS"
+   ""

Yes that's necessary since that is the bug.

> +  unsigned align = INTVAL (operands[3]);
>
>This should read the value with UINTVAL.  Given the useful range of the 
>alignment, it should be OK that we're not using unsigned HWI.

I'll fix that.

> +  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_cpymem_mops (operands);
>
> So what about align=4 and copying, for example, 8 or 12 bytes; wouldn't we 
> want a sequence of LDR/STR in that case?  Doesn't this fall back to MOPS too 
> eagerly?

The goal was to fix the issue in a way that is both obvious and easy to
backport.
Further improvements can be made to handle other alignments, but it is
slightly tricky (e.g. align == 4 won't emit LDP/STP directly using the current
code and thus would need additional work to generalize the LDP path).
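
As a rough illustration of the condition under discussion, the following C sketch models the early exit in aarch64_expand_cpymem from the quoted hunk. The function name and its parameters are hypothetical stand-ins for CONST_INT_P (operands[2]), STRICT_ALIGNMENT and align; it is not GCC code.

```c
#include <stdbool.h>

/* Hypothetical model of the quoted check:
     !CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16)
   Returns true when the inline expansion is skipped and the MOPS path
   (which itself gives up without TARGET_MOPS) is tried instead.  */
bool
skips_inline_expansion (bool size_is_constant, bool strict_alignment,
                        unsigned align)
{
  return !size_is_constant || (strict_alignment && align < 16);
}
```

Under this model a constant-size copy with align == 4 skips the inline path whenever strict alignment is in force, which is the eagerness Richard asks about.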
  
>> +  unsigned max_mops_size = aarch64_mops_memcpy_size_threshold;
>
>I find this name slightly confusing.  Surely it's min_mops_size (since above 
>that we want to use MOPS rather than inlined loads/stores).  But why not just 
>use aarch64_mops_memcpy_size_threshold directly in the one place it's used?

The reason is that in a follow-on patch I check aarch64_mops_memcpy_size_threshold
too, so for now this acts as a shortcut for the ridiculously long name.

> Are there any additional tests for this?

There are existing tests that check the expansion which fail if you completely
block expansions with STRICT_ALIGNMENT.

Cheers,
Wilco

[PATCH] RISC-V: Remove math.h import to resolve missing stubs failures

2023-09-20 Thread Patrick O'Neill

Resolves some of the missing stubs failures:
fatal error: gnu/stubs-lp64d.h: No such file or directory
compilation terminated.

2023-09-20 Juzhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Remove unneeded math.h
import.

Tested-by: Patrick O'Neill 
---
Tested using 590a8bec3ed92118e084b0a1897d3314a666170e
glibc rv64gcv
glibc rv32gcv

glibc rv64gcv
Resolved failures:
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-4.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/mov-6.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)

glibc rv32gcv
Resolved failures:
FAIL: gcc.target/riscv/rvv/autovec/vls/and-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/and-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/and-3.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-3.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-4.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-5.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/cmp-6.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-3.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-4.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/const-5.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/div-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-3.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-4.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-5.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-6.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/dup-7.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/extract-1.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/extract-2.c -O3 -ftree-vectorize --param 
riscv-autovec-preference=scalable (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-1.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-add-3.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-1.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-2.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for excess 
errors)
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-div-3.c -O3 
-ftree-vectorize --param

Re: [Committed] RISC-V: Fix Demand comparison bug[VSETVL PASS]

2023-09-20 Thread Kito Cheng

Does it also happen on the GCC 13 branch? If so, please backport :)

Juzhe-Zhong wrote on Wed, Sep 20, 2023 at 11:09:

> This bug is exposed when we support VLS integer conversion patterns.
>
> FAIL: c-c++-common/torture/pr53505.c execution.
>
> This is caused by an incorrect vsetvl elimination in Phase 4:
>
>    10318:   0d207057    vsetvli zero,zero,e32,m4,ta,ma
>    1031c:   5e003e57    vmv.v.i v28,0
>    .:   missed e8,m1 vsetvl
>    10320:   7b07b057    vmsgtu.vi   v0,v16,15
>    10324:   03083157    vadd.vi v2,v16,-16
>
> Regression testing on the GCC release version shows no unexpected differences.
>
> Committed.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc (vector_insn_info::operator==): Fix
> bug.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc
> b/gcc/config/riscv/riscv-vsetvl.cc
> index df980b6770e..e0f61148ef3 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -1799,10 +1799,11 @@ vector_insn_info::operator== (const vector_insn_info &other) const
>  if (m_demands[i] != other.demand_p ((enum demand_type) i))
>return false;
>
> -  if (vector_config_insn_p (m_insn->rtl ())
> -  || vector_config_insn_p (other.get_insn ()->rtl ()))
> -if (m_insn != other.get_insn ())
> -  return false;
> +  /* We should consider different INSN demands as different
> + expression.  Otherwise, we will be doing incorrect vsetvl
> + elimination.  */
> +  if (m_insn != other.get_insn ())
> +return false;
>
>if (!same_avl_p (other))
>  return false;
> --
> 2.36.3
>
>
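
The effect of the operator== change quoted above can be modelled in a few lines of C. The struct and both functions are hypothetical simplifications (one bit mask for the demands, an integer for the instruction identity, and no AVL or other checks); they are not the riscv-vsetvl.cc data structures.

```c
#include <stdbool.h>

/* Hypothetical stand-in for vector_insn_info.  */
struct vinfo
{
  unsigned demands;     /* bit mask of demanded fields */
  int insn_id;          /* identity of the originating insn */
  bool is_config_insn;  /* true for a vsetvl-style config insn */
};

/* Before the fix: insn identity only mattered for config insns.  */
bool
equal_before (const struct vinfo *a, const struct vinfo *b)
{
  if (a->demands != b->demands)
    return false;
  if (a->is_config_insn || b->is_config_insn)
    if (a->insn_id != b->insn_id)
      return false;
  return true;
}

/* After the fix: different insns never compare equal.  */
bool
equal_after (const struct vinfo *a, const struct vinfo *b)
{
  if (a->demands != b->demands)
    return false;
  return a->insn_id == b->insn_id;
}
```

In this model, two distinct non-config insns with identical demands compare equal before the change and unequal after it, which is the case the commit message ties to the incorrect elimination.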

[PATCH v2] AArch64: Fix memmove operand corruption [PR111121]

2023-09-20 Thread Wilco Dijkstra

A MOPS memmove may corrupt registers since there is no copy of the input
operands to temporary registers.  Fix this by calling
aarch64_expand_cpymem_mops.
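
The property at stake can be written as a small C check, modelled on the move1 testcase in this patch: after the memmove, the caller's original destination pointer must still be intact. This is a plain C sketch of the expected behaviour, not a reproducer for the MOPS expansion, and `move_and_report` is a hypothetical name.

```c
#include <string.h>

/* Copy n bytes from src to dst (the regions may overlap) and store the
   original dst pointer in *res, as move1 does in mops_4.c.  */
void
move_and_report (char *dst, const char *src, size_t n, char **res)
{
  memmove (dst, src, n);
  /* dst must still hold its incoming value here; an expansion that
     updated the address register in place would break this.  */
  *res = dst;
}
```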

Passes regress/bootstrap, OK for commit?

gcc/ChangeLog/
PR target/111121
* config/aarch64/aarch64.md (aarch64_movmemdi): Add new expander.
(movmemdi): Call aarch64_expand_cpymem_mops for correct expansion.
* config/aarch64/aarch64.cc (aarch64_expand_cpymem_mops): Add support
for memmove.
* config/aarch64/aarch64-protos.h (aarch64_expand_cpymem_mops): Add new
function.

gcc/testsuite/ChangeLog/
PR target/111121
* gcc.target/aarch64/mops_4.c: Add memmove testcases.

---

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 70303d6fd953e0c397b9138ede8858c2db2e53db..e8d91cba30e32e03c4794ccc24254691d135f2dd 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -765,6 +765,7 @@ bool aarch64_emit_approx_div (rtx, rtx, rtx);
 bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
 tree aarch64_vector_load_decl (tree);
 void aarch64_expand_call (rtx, rtx, rtx, bool);
+bool aarch64_expand_cpymem_mops (rtx *, bool);
 bool aarch64_expand_cpymem (rtx *);
 bool aarch64_expand_setmem (rtx *);
 bool aarch64_float_const_zero_rtx_p (rtx);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 219c4ee6d4cd7522f6ad634c794485841e5d08fa..dd6874d13a75f20d10a244578afc355b25c73da2 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25228,10 +25228,11 @@ aarch64_copy_one_block_and_progress_pointers (rtx *src, rtx *dst,
   *dst = aarch64_progress_pointer (*dst);
 }
 
-/* Expand a cpymem using the MOPS extension.  OPERANDS are taken
-   from the cpymem pattern.  Return true iff we succeeded.  */
-static bool
-aarch64_expand_cpymem_mops (rtx *operands)
+/* Expand a cpymem/movmem using the MOPS extension.  OPERANDS are taken
+   from the cpymem/movmem pattern.  IS_MEMMOVE is true if this is a memmove
+   rather than memcpy.  Return true iff we succeeded.  */
+bool
+aarch64_expand_cpymem_mops (rtx *operands, bool is_memmove = false)
 {
   if (!TARGET_MOPS)
 return false;
@@ -25243,8 +25244,10 @@ aarch64_expand_cpymem_mops (rtx *operands)
   rtx dst_mem = replace_equiv_address (operands[0], dst_addr);
   rtx src_mem = replace_equiv_address (operands[1], src_addr);
   rtx sz_reg = copy_to_mode_reg (DImode, operands[2]);
-  emit_insn (gen_aarch64_cpymemdi (dst_mem, src_mem, sz_reg));
-
+  if (is_memmove)
+emit_insn (gen_aarch64_movmemdi (dst_mem, src_mem, sz_reg));
+  else
+emit_insn (gen_aarch64_cpymemdi (dst_mem, src_mem, sz_reg));
   return true;
 }
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 60133b541e9289610ce58116b0258a61f29bdc00..6d0f072a9dd6d094e8764a513222a9129d8296fa 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1635,7 +1635,22 @@ (define_expand "cpymemdi"
 }
 )
 
-(define_insn "aarch64_movmemdi"
+(define_expand "aarch64_movmemdi"
+  [(parallel
+ [(set (match_operand 2) (const_int 0))
+  (clobber (match_dup 3))
+  (clobber (match_dup 4))
+  (clobber (reg:CC CC_REGNUM))
+  (set (match_operand 0)
+  (unspec:BLK [(match_operand 1) (match_dup 2)] UNSPEC_MOVMEM))])]
+  "TARGET_MOPS"
+  {
+operands[3] = XEXP (operands[0], 0);
+operands[4] = XEXP (operands[1], 0);
+  }
+)
+
+(define_insn "*aarch64_movmemdi"
   [(parallel [
(set (match_operand:DI 2 "register_operand" "+") (const_int 0))
(clobber (match_operand:DI 0 "register_operand" "+"))
@@ -1668,17 +1683,9 @@ (define_expand "movmemdi"
&& INTVAL (sz_reg) < aarch64_mops_memmove_size_threshold)
  FAIL;
 
-   rtx addr_dst = XEXP (operands[0], 0);
-   rtx addr_src = XEXP (operands[1], 0);
-
-   if (!REG_P (sz_reg))
- sz_reg = force_reg (DImode, sz_reg);
-   if (!REG_P (addr_dst))
- addr_dst = force_reg (DImode, addr_dst);
-   if (!REG_P (addr_src))
- addr_src = force_reg (DImode, addr_src);
-   emit_insn (gen_aarch64_movmemdi (addr_dst, addr_src, sz_reg));
-   DONE;
+  if (aarch64_expand_cpymem_mops (operands, true))
+DONE;
+  FAIL;
 }
 )
 
diff --git a/gcc/testsuite/gcc.target/aarch64/mops_4.c b/gcc/testsuite/gcc.target/aarch64/mops_4.c
index 1b87759cb5e8bbcbb58cf63404d1d579d44b2818..dd796115cb4093251964d881e93bf4b98ade0c32 100644
--- a/gcc/testsuite/gcc.target/aarch64/mops_4.c
+++ b/gcc/testsuite/gcc.target/aarch64/mops_4.c
@@ -50,6 +50,54 @@ copy3 (int *x, int *y, long z, long *res)
   *res = z;
 }
 
+/*
+** move1:
+** mov (x[0-9]+), x0
+** cpyp\[\1\]!, \[x1\]!, x2!
+** cpym\[\1\]!, \[x1\]!, x2!
+** cpye\[\1\]!, \[x1\]!, x2!
+** str x0, \[x3\]
+** ret
+*/
+void
+move1 (int *x, int *y, long z, int **res)
+{
+  __builtin_memmove (x, y, z);
+  *res = x;
+}
+
+/*
+**

Re: [PATCH] c, c++, v3: Accept __builtin_classify_type (typename)

2023-09-20 Thread Joseph Myers

On Wed, 20 Sep 2023, Jakub Jelinek wrote:

> On Mon, Sep 18, 2023 at 09:25:19PM +, Joseph Myers wrote:
> > > I'd like to ping this patch.
> > > The C++ FE part has been approved by Jason already with a minor change
> > > I've made in my copy.
> > > Are the remaining parts ok for trunk?
> > 
> > In the C front-end changes, since you end up discarding any side effects 
> > from the type, I'd expect use of in_alignof to be more appropriate than 
> > in_typeof (and thus not needing to use pop_maybe_used).
> 
> So like this?  Bootstrapped/regtested again on x86_64-linux and i686-linux.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] AArch64: Fix strict-align cpymem/setmem [PR103100]

2023-09-20 Thread Richard Earnshaw (lists)

On 20/09/2023 14:50, Wilco Dijkstra wrote:
> 
> The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
> Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
> Clean up the condition when to use MOPS.
> 
> Passes regress/bootstrap, OK for commit?
> 
> gcc/ChangeLog/
> PR target/103100
> * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.

Shouldn't this be a separate patch?  It's not immediately obvious that this is 
a necessary part of this change.

> (setmemdi): Likewise.
> * config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
> strict-align.  Cleanup condition for using MOPS.
> (aarch64_expand_setmem): Likewise.
> 
> ---
> 
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> dd6874d13a75f20d10a244578afc355b25c73da2..8f3bfb91c0f4ec43f37fe9289a66092a29a47e4d
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -25261,27 +25261,23 @@ aarch64_expand_cpymem (rtx *operands)
>int mode_bits;
>rtx dst = operands[0];
>rtx src = operands[1];
> +  unsigned align = INTVAL (operands[3]);

This should read the value with UINTVAL.  Given the useful range of the 
alignment, it should be OK that we're not using unsigned HWI.

>rtx base;
>machine_mode cur_mode = BLKmode;
> +  bool size_p = optimize_function_for_size_p (cfun);
>  
> -  /* Variable-sized memcpy can go through the MOPS expansion if available.  
> */
> -  if (!CONST_INT_P (operands[2]))
> +  /* Variable-sized or strict-align copies may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_cpymem_mops (operands);

So what about align=4 and copying, for example, 8 or 12 bytes; wouldn't we want 
a sequence of LDR/STR in that case?  Doesn't this fall back to MOPS too eagerly?
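To make the concern concrete, here is a rough sketch of the two early-exit tests from the quoted hunk, written as a standalone C++ model. It is not GCC code: the names Expansion, CopyRequest, Target and choose_expansion are made up for illustration, and "align" simply uses the same units as the patch's "align < 16" test.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the checks in the quoted hunk; not the real
// aarch64_expand_cpymem.
enum class Expansion { InlineLoadsStores, MopsOrLibcall };

struct CopyRequest {
  bool constant_size;  // models CONST_INT_P (operands[2])
  std::uint64_t size;  // only meaningful when constant_size is true
  unsigned align;      // models operands[3], same units as "align < 16"
};

struct Target {
  bool strict_alignment;    // models STRICT_ALIGNMENT
  bool has_mops;            // models TARGET_MOPS
  unsigned mops_threshold;  // stands in for aarch64_mops_memcpy_size_threshold
};

Expansion choose_expansion(const CopyRequest &req, const Target &tgt) {
  // Variable-sized or under-aligned strict-align copies leave the inline path.
  if (!req.constant_size || (tgt.strict_alignment && req.align < 16))
    return Expansion::MopsOrLibcall;

  // Large copies use MOPS when available or a library call.
  const unsigned max_copy_size = 256;
  if (req.size > max_copy_size
      || (tgt.has_mops && req.size > tgt.mops_threshold))
    return Expansion::MopsOrLibcall;

  return Expansion::InlineLoadsStores;
}
```

Under this model an 8-byte copy with align=4 on a strict-align target never reaches the inline path, which is the case the question above is about.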


>  
>unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
>  
> -  /* Try to inline up to 256 bytes or use the MOPS threshold if available.  
> */
> -  unsigned HOST_WIDE_INT max_copy_size
> -= TARGET_MOPS ? aarch64_mops_memcpy_size_threshold : 256;
> -
> -  bool size_p = optimize_function_for_size_p (cfun);
> +  /* Try to inline up to 256 bytes.  */
> +  unsigned max_copy_size = 256;
> +  unsigned max_mops_size = aarch64_mops_memcpy_size_threshold;

I find this name slightly confusing.  Surely it's min_mops_size (since above 
that we want to use MOPS rather than inlined loads/stores).  But why not just 
use aarch64_mops_memcpy_size_threshold directly in the one place it's used?

>  
> -  /* Large constant-sized cpymem should go through MOPS when possible.
> - It should be a win even for size optimization in the general case.
> - For speed optimization the choice between MOPS and the SIMD sequence
> - depends on the size of the copy, rather than number of instructions,
> - alignment etc.  */
> -  if (size > max_copy_size)
> +  /* Large copies use MOPS when available or a library call.  */
> +  if (size > max_copy_size || (TARGET_MOPS && size > max_mops_size))
>  return aarch64_expand_cpymem_mops (operands);
>  
>int copy_bits = 256;
> @@ -25445,12 +25441,13 @@ aarch64_expand_setmem (rtx *operands)

Similar comments apply to this code as well.

>unsigned HOST_WIDE_INT len;
>rtx dst = operands[0];
>rtx val = operands[2], src;
> +  unsigned align = INTVAL (operands[3]);
>rtx base;
>machine_mode cur_mode = BLKmode, next_mode;
>  
> -  /* If we don't have SIMD registers or the size is variable use the MOPS
> - inlined sequence if possible.  */
> -  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD)
> +  /* Variable-sized or strict-align memset may use the MOPS expansion.  */
> +  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD
> +  || (STRICT_ALIGNMENT && align < 16))
>  return aarch64_expand_setmem_mops (operands);
>  
>bool size_p = optimize_function_for_size_p (cfun);
> @@ -25458,10 +25455,13 @@ aarch64_expand_setmem (rtx *operands)

And here.

>/* Default the maximum to 256-bytes when considering only libcall vs
>   SIMD broadcast sequence.  */
>unsigned max_set_size = 256;
> +  unsigned max_mops_size = aarch64_mops_memset_size_threshold;
>  
>len = INTVAL (operands[1]);
> -  if (len > max_set_size && !TARGET_MOPS)
> -return false;
> +
> +  /* Large memset uses MOPS when available or a library call.  */
> +  if (len > max_set_size || (TARGET_MOPS && len > max_mops_size))
> +return aarch64_expand_setmem_mops (operands);
>  
>int cst_val = !!(CONST_INT_P (val) && (INTVAL (val) != 0));
>/* The MOPS sequence takes:
> @@ -25474,12 +25474,6 @@ aarch64_expand_setmem (rtx *operands)
>   the arguments + 1 for the call.  */
>unsigned libcall_cost = 4;
>  
> -  /* Upper bound check.  For large constant-sized setmem use the MOPS 
> sequence
> - when available.  */
> -  if (TARGET_MOPS
> -  && len >=

[PATCH] [frange] Remove special casing from unordered operators.

2023-09-20 Thread Aldy Hernandez

In coming up with testcases for the unordered folders, I realized that
we were already handling them correctly, even in the absence of my
work in this area lately.

All of the unordered fold_range() methods try to fold with the ordered
variants first, and if they return TRUE, we are guaranteed to be able
to fold, even in the presence of NANs.  For example:

if (x_5 >= y_8)
  ...
else if (x_5 __UNLE y_8)
  ...

On the false side of the first conditional we know that either x_5 < y_8
or that one or more operands is a NAN.  Since UNLE_EXPR returns true
for precisely this scenario, we can fold as true.

This is handled in the fold_range() methods as follows:

if (!range_op_handler (LE_EXPR).fold_range (r, type, op1_no_nan,
op2_no_nan, trio))
  return false;
// The result is the same as the ordered version when the
// comparison is true or when the operands cannot be NANs.
if (!maybe_isnan (op1, op2) || r == range_true (type))
  return true;

This code has been there since the last release, and makes the special
casing I am deleting obsolete.  I have added tests to make sure we
keep track of this behavior.
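The property this relies on can be checked outside the compiler. The sketch below is plain C++ rather than GIMPLE, and the helper names unle and implication_holds are invented for the illustration: whenever x >= y does not hold, either x < y or an operand is a NaN, and UNLE is true in both cases.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// UNLE semantics: true when x <= y or when either operand is a NaN.
bool unle(double x, double y) {
  return std::isunordered(x, y) || x <= y;
}

// True when "!(x >= y) implies UNLE (x, y)" holds for this pair of values.
bool implication_holds(double x, double y) {
  return (x >= y) || unle(x, y);
}
```

The implication holds for ordered and unordered inputs alike, which is why the ordered fold already gives the right answer once the relation is known.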

I will commit this pending tests.

gcc/ChangeLog:

* range-op-float.cc (foperator_unordered_ge::fold_range): Remove
special casing.
(foperator_unordered_gt::fold_range): Same.
(foperator_unordered_lt::fold_range): Same.
(foperator_unordered_le::fold_range): Same.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp-float-relations-5.c: New test.
* gcc.dg/tree-ssa/vrp-float-relations-6.c: New test.
---
 gcc/range-op-float.cc | 20 ++-
 .../gcc.dg/tree-ssa/vrp-float-relations-5.c   | 54 +++
 .../gcc.dg/tree-ssa/vrp-float-relations-6.c   | 54 +++
 3 files changed, 112 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-relations-5.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-relations-6.c

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index 399deee5d8a..0951bd385a9 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -1644,10 +1644,7 @@ public:
   const frange &op1, const frange &op2,
   relation_trio trio = TRIO_VARYING) const final override
   {
-relation_kind rel = trio.op1_op2 ();
-
-if (op1.known_isnan () || op2.known_isnan ()
-   || rel == VREL_LT)
+if (op1.known_isnan () || op2.known_isnan ())
   {
r = range_true (type);
return true;
@@ -1759,10 +1756,7 @@ public:
   const frange &op1, const frange &op2,
   relation_trio trio = TRIO_VARYING) const final override
   {
-relation_kind rel = trio.op1_op2 ();
-
-if (op1.known_isnan () || op2.known_isnan ()
-   || rel == VREL_LE)
+if (op1.known_isnan () || op2.known_isnan ())
   {
r = range_true (type);
return true;
@@ -1870,10 +1864,7 @@ public:
   const frange &op1, const frange &op2,
   relation_trio trio = TRIO_VARYING) const final override
   {
-relation_kind rel = trio.op1_op2 ();
-
-if (op1.known_isnan () || op2.known_isnan ()
-   || rel == VREL_GT)
+if (op1.known_isnan () || op2.known_isnan ())
   {
r = range_true (type);
return true;
@@ -1985,10 +1976,7 @@ public:
   const frange &op1, const frange &op2,
   relation_trio trio = TRIO_VARYING) const final override
   {
-relation_kind rel = trio.op1_op2 ();
-
-if (op1.known_isnan () || op2.known_isnan ()
-   || rel == VREL_GE)
+if (op1.known_isnan () || op2.known_isnan ())
   {
r = range_true (type);
return true;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-relations-5.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-relations-5.c
new file mode 100644
index 000..2bd06c6fbf7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-relations-5.c
@@ -0,0 +1,54 @@
+// { dg-do compile }
+// { dg-options "-O2 -fgimple -fdump-tree-evrp" }
+
+void link_error();
+
+void __GIMPLE (ssa,startwith("evrp"))
+foo1 (float x, float y)
+{
+  __BB(2):
+  if (x_4(D) <= y_5(D))
+goto __BB5;
+  else
+goto __BB3;
+
+  __BB(3):
+  // Relation at this point is VREL_GT.
+  if (x_4(D) __UNGE y_5(D))
+goto __BB5;
+  else
+goto __BB4;
+
+  __BB(4):
+  link_error ();
+  goto __BB5;
+
+  __BB(5):
+  return;
+}
+
+void __GIMPLE (ssa,startwith("evrp"))
+foo2 (float x, float y)
+{
+  __BB(2):
+  if (x_4(D) <= y_5(D))
+goto __BB5;
+  else
+goto __BB3;
+
+  __BB(3):
+  // Relation at this point is VREL_GT.
+  if (x_4(D) __UNGT y_5(D))
+goto __BB5;
+  else
+goto __BB4;
+
+  __BB(4):
+  link_error ();
+  goto __BB5;
+
+  __BB(5):
+  return;
+}
+
+// { dg-final { scan-tree-dump-not "link_error" "evrp" } }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-relations-6.c

Re: [Committed V4] internal-fn: Support undefined rtx for uninitialized SSA_NAME[PR110751]

2023-09-20 Thread Palmer Dabbelt


On Wed, 20 Sep 2023 07:58:49 PDT (-0700), juzhe.zh...@rivai.ai wrote:

According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

As Richard and Richi suggested, we recognize uninitialized SSA_NAME and convert
it into SCRATCH rtx if the target predicate allows SCRATCH.

It can help to reduce redundant data move instructions of targets like RISC-V.

Bootstrap and Regression on x86 passed.

gcc/ChangeLog:

* internal-fn.cc (expand_fn_using_insn): Support undefined rtx value.
* optabs.cc (maybe_legitimize_operand): Ditto.
(can_reuse_operands_p): Ditto.
* optabs.h (enum expand_operand_type): Ditto.
(create_undefined_input_operand): Ditto.


It's somewhat common to put the PR at the top of the ChangeLog (though I 
pretty frequently forget as well).




---
 gcc/internal-fn.cc |  4 
 gcc/optabs.cc  | 13 -
 gcc/optabs.h   | 13 -
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 0fd34359247..61d5a9e4772 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -247,6 +247,10 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, unsigned int noutputs,
create_convert_operand_from ([opno], rhs_rtx,
 TYPE_MODE (rhs_type),
 TYPE_UNSIGNED (rhs_type));
+  else if (TREE_CODE (rhs) == SSA_NAME
+  && SSA_NAME_IS_DEFAULT_DEF (rhs)
+  && VAR_P (SSA_NAME_VAR (rhs)))
+   create_undefined_input_operand ([opno], TYPE_MODE (rhs_type));
   else
create_input_operand ([opno], rhs_rtx, TYPE_MODE (rhs_type));
   opno += 1;
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 32ff379ffc3..8b96f23aec0 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -8102,6 +8102,16 @@ maybe_legitimize_operand (enum insn_code icode, unsigned int opno,
  goto input;
}
   break;
+
+case EXPAND_UNDEFINED_INPUT:
+  /* See if the predicate accepts a SCRATCH rtx, which in this context
+indicates an undefined value.  Use an uninitialized register if not. */
+  if (!insn_operand_matches (icode, opno, op->value))
+   {
+ op->value = gen_reg_rtx (op->mode);
+ goto input;
+   }
+  return true;
 }
   return insn_operand_matches (icode, opno, op->value);
 }
@@ -8140,7 +8150,8 @@ can_reuse_operands_p (enum insn_code icode,
   switch (op1->type)
 {
 case EXPAND_OUTPUT:
-  /* Outputs must remain distinct.  */
+case EXPAND_UNDEFINED_INPUT:
+  /* Outputs and undefined inputs must remain distinct.  */
   return false;

 case EXPAND_FIXED:
diff --git a/gcc/optabs.h b/gcc/optabs.h
index c80b7f4dc1b..9b78d40a46c 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -37,7 +37,8 @@ enum expand_operand_type {
   EXPAND_CONVERT_TO,
   EXPAND_CONVERT_FROM,
   EXPAND_ADDRESS,
-  EXPAND_INTEGER
+  EXPAND_INTEGER,
+  EXPAND_UNDEFINED_INPUT
 };

 /* Information about an operand for instruction expansion.  */
@@ -117,6 +118,16 @@ create_input_operand (class expand_operand *op, rtx value,
   create_expand_operand (op, EXPAND_INPUT, value, mode, false);
 }

+/* Make OP describe an undefined input operand of mode MODE.  MODE cannot
+   be null.  */
+
+inline void
+create_undefined_input_operand (class expand_operand *op, machine_mode mode)
+{
+  create_expand_operand (op, EXPAND_UNDEFINED_INPUT, gen_rtx_SCRATCH (mode),
+mode, false);
+}
+
 /* Like create_input_operand, except that VALUE must first be converted
to mode MODE.  UNSIGNED_P says whether VALUE is unsigned.  */

Re: [PATCH] c++: missing SFINAE in grok_array_decl [PR111493]

2023-09-20 Thread Patrick Palka

On Wed, 20 Sep 2023, Patrick Palka wrote:

> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?
> 
> -- >8 --
> 
> This fixes some missed SFINAE in grok_array_decl when checking a C++23
> multidimensional subscript operator expression.
> 
> Note the existing pedwarn code paths are a backward compatibility fallback
> for treating invalid a[x, y, z] as a[(x, y, z)], but this should only be
> done outside of a SFINAE context I think.
> 
>   PR c++/111493
> 
> gcc/cp/ChangeLog:
> 
>   * decl2.cc (grok_array_decl): Guard errors with tf_error.
>   In the pedwarn code paths, return error_mark_node when in
>   a SFINAE context.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp23/subscript15.C: New test.
> ---
>  gcc/cp/decl2.cc  | 36 +++-
>  gcc/testsuite/g++.dg/cpp23/subscript15.C | 24 
>  2 files changed, 47 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp23/subscript15.C
> 
> diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> index b402befba6d..6eb6d8c57d6 100644
> --- a/gcc/cp/decl2.cc
> +++ b/gcc/cp/decl2.cc
> @@ -477,7 +477,9 @@ grok_array_decl (location_t loc, tree array_expr, tree 
> index_exp,
>   {
> /* If it would be valid albeit deprecated expression in
>C++20, just pedwarn on it and treat it as if wrapped
> -  in ().  */
> +  in () unless we're in a SFINAE context.  */
> +   if (!(complain & tf_error))
> + return error_mark_node;

It occurred to me that we could check for tf_error much earlier, before
we call build_x_compound_expr_from_vec and build_new_op, since they're
only used here to implement the backward compatibility fallback.  Perhaps
the following is better, then:

-- >8 --

Subject: [PATCH] c++: missing SFINAE in grok_array_decl [PR111493]

We should guard both the diagnostic and backward compatibility fallback
code with tf_error, so that in a SFINAE context we don't issue any
diagnostics and correctly recognize ill-formed C++23 multidimensional
subscript operator expressions.
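For readers less familiar with what a SFINAE context demands of the front end, here is a small detection trait, offered only as an illustration. It uses a single-index subscript and a hand-rolled void_t so that it builds in C++11 and later (the testcase in this patch uses C++23 concepts instead), and the names is_subscriptable and make_void are made up for the sketch. The key point is that an ill-formed t[i] inside decltype must quietly select the fallback instead of producing a diagnostic.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Local void_t so the sketch does not depend on C++17's std::void_t.
template <typename...> struct make_void { typedef void type; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// Primary template: chosen when T()[I()] is ill-formed.
template <typename T, typename I, typename = void>
struct is_subscriptable : std::false_type {};

// Partial specialization: viable only when the subscript expression is valid.
template <typename T, typename I>
struct is_subscriptable<
    T, I, void_t<decltype(std::declval<T>()[std::declval<I>()])>>
    : std::true_type {};

static_assert(is_subscriptable<int *, int>::value, "pointer[int] is valid");
static_assert(!is_subscriptable<int, int>::value, "int[int] is ill-formed");
```

If the compiler emitted an error or fell back to a different meaning while evaluating the decltype, the second assertion could not be checked silently, which is the class of problem the tf_error guards address.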

PR c++/111493

gcc/cp/ChangeLog:

* decl2.cc (grok_array_decl): Guard diagnostic and backward
compatibility fallback code paths with tf_error.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/subscript15.C: New test.
---
 gcc/cp/decl2.cc  | 15 +++---
 gcc/testsuite/g++.dg/cpp23/subscript15.C | 25 
 2 files changed, 37 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/subscript15.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index b402befba6d..6ac27cbc15f 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -459,7 +459,10 @@ grok_array_decl (location_t loc, tree array_expr, tree index_exp,
{
  expr = build_op_subscript (loc, array_expr, index_exp_list,
 , complain & tf_decltype);
- if (expr == error_mark_node)
+ if (expr == error_mark_node
+ /* Don't do the backward compatibility fallback in a SFINAE
+context.   */
+ && (complain & tf_error))
{
  tree idx = build_x_compound_expr_from_vec (*index_exp_list, NULL,
 tf_none);
@@ -510,6 +513,11 @@ grok_array_decl (location_t loc, tree array_expr, tree index_exp,
 
   if (index_exp == NULL_TREE)
{
+ if (!(complain & tf_error))
+   /* Don't do the backward compatibility fallback in a SFINAE
+  context.  */
+   return error_mark_node;
+
  if ((*index_exp_list)->is_empty ())
{
  error_at (loc, "built-in subscript operator without expression "
@@ -561,8 +569,9 @@ grok_array_decl (location_t loc, tree array_expr, tree index_exp,
swapped = true, array_expr = p2, index_exp = i1;
   else
{
- error_at (loc, "invalid types %<%T[%T]%> for array subscript",
-   type, TREE_TYPE (index_exp));
+ if (complain & tf_error)
+   error_at (loc, "invalid types %<%T[%T]%> for array subscript",
+ type, TREE_TYPE (index_exp));
  return error_mark_node;
}
 
diff --git a/gcc/testsuite/g++.dg/cpp23/subscript15.C b/gcc/testsuite/g++.dg/cpp23/subscript15.C
new file mode 100644
index 000..fece96be96b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/subscript15.C
@@ -0,0 +1,25 @@
+// PR c++/111493
+// { dg-do compile { target c++23 } }
+
+template
+concept CartesianIndexable = requires(T t, Ts... ts) { t[ts...]; };
+
+static_assert(!CartesianIndexable);
+static_assert(!CartesianIndexable);
+static_assert(!CartesianIndexable);
+
+static_assert(!CartesianIndexable);
+static_assert(CartesianIndexable);
+static_assert(!CartesianIndexable);
+static_assert(!CartesianIndexable);
+
+template

[Committed V4] internal-fn: Support undefined rtx for uninitialized SSA_NAME[PR110751]

2023-09-20 Thread Juzhe-Zhong

According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751

As Richard and Richi suggested, we recognize uninitialized SSA_NAME and convert
it into SCRATCH rtx if the target predicate allows SCRATCH.

It can help to reduce redundant data move instructions of targets like RISC-V.
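As a very reduced picture of the fallback that the EXPAND_UNDEFINED_INPUT case implements, the toy model below uses invented stand-ins (OperandValue, legitimize_undefined_input and the boolean predicate_accepts_scratch are not GCC names). It only captures the decision itself: keep the SCRATCH when the operand predicate accepts it, otherwise fall back to a fresh, uninitialized pseudo register.

```cpp
#include <cassert>

// Toy stand-ins; none of these are GCC types.
enum class OperandValue { Scratch, FreshPseudo };

// predicate_accepts_scratch plays the role of the insn_operand_matches
// check on the SCRATCH rtx.
OperandValue legitimize_undefined_input(bool predicate_accepts_scratch) {
  if (!predicate_accepts_scratch)
    return OperandValue::FreshPseudo;  // models gen_reg_rtx (op->mode)
  return OperandValue::Scratch;        // the undefined value is kept as-is
}
```

A target whose pattern accepts SCRATCH can then treat the operand as "don't care" instead of copying an uninitialized value into place.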

Bootstrap and Regression on x86 passed.

gcc/ChangeLog:

* internal-fn.cc (expand_fn_using_insn): Support undefined rtx value.
* optabs.cc (maybe_legitimize_operand): Ditto.
(can_reuse_operands_p): Ditto.
* optabs.h (enum expand_operand_type): Ditto.
(create_undefined_input_operand): Ditto.

---
 gcc/internal-fn.cc |  4 
 gcc/optabs.cc  | 13 -
 gcc/optabs.h   | 13 -
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 0fd34359247..61d5a9e4772 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -247,6 +247,10 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, unsigned int noutputs,
create_convert_operand_from ([opno], rhs_rtx,
 TYPE_MODE (rhs_type),
 TYPE_UNSIGNED (rhs_type));
+  else if (TREE_CODE (rhs) == SSA_NAME
+  && SSA_NAME_IS_DEFAULT_DEF (rhs)
+  && VAR_P (SSA_NAME_VAR (rhs)))
+   create_undefined_input_operand ([opno], TYPE_MODE (rhs_type));
   else
create_input_operand ([opno], rhs_rtx, TYPE_MODE (rhs_type));
   opno += 1;
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index 32ff379ffc3..8b96f23aec0 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -8102,6 +8102,16 @@ maybe_legitimize_operand (enum insn_code icode, unsigned int opno,
  goto input;
}
   break;
+
+case EXPAND_UNDEFINED_INPUT:
+  /* See if the predicate accepts a SCRATCH rtx, which in this context
+indicates an undefined value.  Use an uninitialized register if not. */
+  if (!insn_operand_matches (icode, opno, op->value))
+   {
+ op->value = gen_reg_rtx (op->mode);
+ goto input;
+   }
+  return true;
 }
   return insn_operand_matches (icode, opno, op->value);
 }
@@ -8140,7 +8150,8 @@ can_reuse_operands_p (enum insn_code icode,
   switch (op1->type)
 {
 case EXPAND_OUTPUT:
-  /* Outputs must remain distinct.  */
+case EXPAND_UNDEFINED_INPUT:
+  /* Outputs and undefined inputs must remain distinct.  */
   return false;
 
 case EXPAND_FIXED:
diff --git a/gcc/optabs.h b/gcc/optabs.h
index c80b7f4dc1b..9b78d40a46c 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -37,7 +37,8 @@ enum expand_operand_type {
   EXPAND_CONVERT_TO,
   EXPAND_CONVERT_FROM,
   EXPAND_ADDRESS,
-  EXPAND_INTEGER
+  EXPAND_INTEGER,
+  EXPAND_UNDEFINED_INPUT
 };
 
 /* Information about an operand for instruction expansion.  */
@@ -117,6 +118,16 @@ create_input_operand (class expand_operand *op, rtx value,
   create_expand_operand (op, EXPAND_INPUT, value, mode, false);
 }
 
+/* Make OP describe an undefined input operand of mode MODE.  MODE cannot
+   be null.  */
+
+inline void
+create_undefined_input_operand (class expand_operand *op, machine_mode mode)
+{
+  create_expand_operand (op, EXPAND_UNDEFINED_INPUT, gen_rtx_SCRATCH (mode),
+mode, false);
+}
+
 /* Like create_input_operand, except that VALUE must first be converted
to mode MODE.  UNSIGNED_P says whether VALUE is unsigned.  */
 
-- 
2.36.3

Re: [PATCH][RFC] middle-end/106811 - document GENERIC/GIMPLE undefined behavior

2023-09-20 Thread Richard Sandiford

Richard Biener  writes:
> On Wed, 20 Sep 2023, Richard Sandiford wrote:
>
>> Thanks for doing this.  Question below...
>> 
>> Richard Biener via Gcc-patches  writes:
>> > The following attempts to provide a set of conditions GENERIC/GIMPLE
>> > considers invoking undefined behavior, leaning on the C standards
>> > Annex J, as to provide portability guidance to language frontend
>> > developers.
>> >
>> > I've both tried to remember cases we exploit undefined behavior
>> > and went over C2x Annex J to catch more stuff.  I'd be grateful
>> > if people could point out obvious omissions or cases where the
>> > wording isn't clear.  I plan to check/amend the individual operator
>> > documentations as well, but not everything fits there.
>> >
>> > I've put this into generic.texi because it applies to GENERIC as
>> > the frontend interface.  All constraints apply to GIMPLE as well.
>> > I plan to add a section to gimple.texi as to how to deal with
>> > undefined behavior.
>> >
>> > As said, every comment is welcome.
>> >
>> > For testing I've built doc and inspected the resulting pdf.
>> >
>> >PR middle-end/106811
>> >* doc/generic.texi: Add portability section with
>> >subsection on undefined behavior.
>> > ---
>> >  gcc/doc/generic.texi | 87 
>> >  1 file changed, 87 insertions(+)
>> >
>> > diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
>> > index 6534c354b7a..0969f881146 100644
>> > --- a/gcc/doc/generic.texi
>> > +++ b/gcc/doc/generic.texi
>> > @@ -43,6 +43,7 @@ seems inelegant.
>> >  * Functions:: Function bodies, linkage, and other aspects.
>> >  * Language-dependent trees::Topics and trees specific to language 
>> > front ends.
>> >  * C and C++ Trees::   Trees specific to C and C++.
>> > +* Portability issues::  Portability summary for languages.
>> >  @end menu
>> >  
>> >  @c -
>> > @@ -3733,3 +3734,89 @@ In either case, the expression is void.
>> >  
>> >  
>> >  @end table
>> > +
>> > +
>> > +@node Portability issues
>> > +@section Portability issues
>> > +
>> > +This section summarizes portability issues when translating source 
>> > languages
>> > +to GENERIC.  Everything written here also applies to GIMPLE.  This section
>> > +heavily relies on interpretation according to the C standard.
>> > +
>> > +@menu
>> > +* Undefined behavior::  Undefined behavior.
>> > +@end menu
>> > +
>> > +@node Undefined behavior
>> > +@subsection Undefined behavior
>> > +
>> > +The following is a list of circumstances that invoke undefined behavior.
>> > +
>> > +@itemize @bullet
>> > +@item
>> > +When the result of negation, addition, subtraction or division of two 
>> > signed
>> > +integers or signed integer vectors not subject to @option{-fwrapv} cannot 
>> > be
>> > +represented in the type.
>> 
>> Couldn't tell: is the omission of multiplication deliberate?
>
> No.  Fixed.  Do you by chance remember/know anything about RTL 'div'
> and behavior on overflow (INT_MIN/-1), in particular with -fwrapv?

No, sorry.  I thought SDIV was allowed (but not required) to trap
on overflow, but I don't know off-hand what effect -fwrapv has
on the way that we use it.
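For reference, the overflow cases under discussion can be spelled out at the source level. This is only a sketch with made-up helper names (div_overflows, mul_overflows); it assumes long long is wider than int, which holds on common targets but is not guaranteed by the language, and it says nothing about what RTL 'div' does.

```cpp
#include <cassert>
#include <climits>

// True when a / b has no representable int result (or divides by zero).
// INT_MIN / -1 is the only overflowing quotient in two's complement.
bool div_overflows(int a, int b) {
  return b == 0 || (a == INT_MIN && b == -1);
}

// True when a * b is not representable in int.  Assumes long long is
// strictly wider than int so the widened product cannot itself overflow.
bool mul_overflows(int a, int b) {
  long long wide = static_cast<long long>(a) * b;
  return wide > INT_MAX || wide < INT_MIN;
}
```

A front end that wants defined results for these inputs has to test for them before emitting the operation, since without -fwrapv the operation itself is undefined.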

Richard

Re: [PATCH v8 0/4] P1689R5 support

2023-09-20 Thread Ben Boeckel

On Tue, Sep 19, 2023 at 17:33:34 -0400, Jason Merrill wrote:
> Pushed, thanks!

Thanks!

Is there a process I can use to backport this to GCC 13?

--Ben

Re: [PATCH V3] internal-fn: Support undefined rtx for uninitialized SSA_NAME

2023-09-20 Thread Richard Sandiford

Juzhe-Zhong  writes:
> According to PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110751
>
> As Richard and Richi suggested, we recognize uninitialized SSA_NAME and 
> convert it
> into SCRATCH rtx if the target predicate allows SCRATCH.
>
> It can help to reduce redundant data move instructions of targets like RISC-V.
>
> Bootstrap and Regression on x86 passed.
>
> gcc/ChangeLog:
>
>   * internal-fn.cc (expand_fn_using_insn): Support undefined rtx value.
>   * optabs.cc (maybe_legitimize_operand): Ditto.
>   (can_reuse_operands_p): Ditto.
>   * optabs.h (enum expand_operand_type): Ditto.
>   (create_undefined_input_operand): Ditto.
>
> ---
>  gcc/internal-fn.cc |  4 
>  gcc/optabs.cc  | 14 +-
>  gcc/optabs.h   | 14 +-
>  3 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 0fd34359247..61d5a9e4772 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -247,6 +247,10 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, 
> unsigned int noutputs,
>   create_convert_operand_from ([opno], rhs_rtx,
>TYPE_MODE (rhs_type),
>TYPE_UNSIGNED (rhs_type));
> +  else if (TREE_CODE (rhs) == SSA_NAME
> +&& SSA_NAME_IS_DEFAULT_DEF (rhs)
> +&& VAR_P (SSA_NAME_VAR (rhs)))
> + create_undefined_input_operand ([opno], TYPE_MODE (rhs_type));
>else
>   create_input_operand ([opno], rhs_rtx, TYPE_MODE (rhs_type));
>opno += 1;
> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> index 32ff379ffc3..75b1d54ad7c 100644
> --- a/gcc/optabs.cc
> +++ b/gcc/optabs.cc
> @@ -8102,6 +8102,17 @@ maybe_legitimize_operand (enum insn_code icode, 
> unsigned int opno,
> goto input;
>   }
>break;
> +
> +case EXPAND_UNDEFINED_INPUT:
> +  /* For SCRATCH rtx which is converted from uninitialized
> +  SSA, we convert it as fresh pseudo when target doesn't
> +  allow scratch rtx in predicate. Otherwise, return true.  */

I don't think we should assume that default SSA definitions are the
only source of uninitialised RTL operands.  How about:

/* See if the predicate accepts a SCRATCH rtx, which in this context
   indicates an undefined value.  Use an uninitialized register if not.  */

> +  if (!insn_operand_matches (icode, opno, op->value))
> + {
> +   op->value = gen_reg_rtx (op->mode);
> +   goto input;
> + }
> +  return true;
>  }
>return insn_operand_matches (icode, opno, op->value);
>  }
> @@ -8140,7 +8151,8 @@ can_reuse_operands_p (enum insn_code icode,
>switch (op1->type)
>  {
>  case EXPAND_OUTPUT:
> -  /* Outputs must remain distinct.  */
> +case EXPAND_UNDEFINED_INPUT:
> +  /* Outputs and undefined inputs must remain distinct.  */
>return false;
>  
>  case EXPAND_FIXED:
> diff --git a/gcc/optabs.h b/gcc/optabs.h
> index c80b7f4dc1b..6faebf7cb63 100644
> --- a/gcc/optabs.h
> +++ b/gcc/optabs.h
> @@ -37,7 +37,8 @@ enum expand_operand_type {
>EXPAND_CONVERT_TO,
>EXPAND_CONVERT_FROM,
>EXPAND_ADDRESS,
> -  EXPAND_INTEGER
> +  EXPAND_INTEGER,
> +  EXPAND_UNDEFINED_INPUT
>  };
>  
>  /* Information about an operand for instruction expansion.  */
> @@ -117,6 +118,17 @@ create_input_operand (class expand_operand *op, rtx 
> value,
>create_expand_operand (op, EXPAND_INPUT, value, mode, false);
>  }
>  
> +/* Make OP describe an undefined input operand for uninitialized
> +   SSA.  It's the scratch operand with mode MODE; MODE cannot be
> +   VOIDmode.  */

How about:

/* Make OP describe an undefined input operand of mode MODE.  MODE cannot
   be null.  */

Preapproved with those changes, thanks.  No need for another review.

Richard

> +
> +inline void
> +create_undefined_input_operand (class expand_operand *op, machine_mode mode)
> +{
> +  create_expand_operand (op, EXPAND_UNDEFINED_INPUT, gen_rtx_SCRATCH (mode),
> +  mode, false);
> +}
> +
>  /* Like create_input_operand, except that VALUE must first be converted
> to mode MODE.  UNSIGNED_P says whether VALUE is unsigned.  */

Re: [PATCH] c++: missing SFINAE in grok_array_decl [PR111493]

2023-09-20 Thread Patrick Palka

On Wed, 20 Sep 2023, Patrick Palka wrote:

> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?

... and perhaps 13?

> 
> -- >8 --
> 
> This fixes some missed SFINAE in grok_array_decl when checking a C++23
> multidimensional subscript operator expression.
> 
> Note the existing pedwarn code paths are a backward compatibility fallback
> for treating invalid a[x, y, z] as a[(x, y, z)], but this should only be
> done outside of a SFINAE context I think.
> 
>   PR c++/111493
> 
> gcc/cp/ChangeLog:
> 
>   * decl2.cc (grok_array_decl): Guard errors with tf_error.
>   In the pedwarn code paths, return error_mark_node when in
>   a SFINAE context.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp23/subscript15.C: New test.
> ---
>  gcc/cp/decl2.cc  | 36 +++-
>  gcc/testsuite/g++.dg/cpp23/subscript15.C | 24 
>  2 files changed, 47 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp23/subscript15.C
> 
> diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> index b402befba6d..6eb6d8c57d6 100644
> --- a/gcc/cp/decl2.cc
> +++ b/gcc/cp/decl2.cc
> @@ -477,7 +477,9 @@ grok_array_decl (location_t loc, tree array_expr, tree 
> index_exp,
>   {
> /* If it would be valid albeit deprecated expression in
>C++20, just pedwarn on it and treat it as if wrapped
> -  in ().  */
> +  in () unless we're in a SFINAE context.  */
> +   if (!(complain & tf_error))
> + return error_mark_node;
> pedwarn (loc, OPT_Wcomma_subscript,
>  "top-level comma expression in array subscript "
>  "changed meaning in C++23");
> @@ -487,7 +489,7 @@ grok_array_decl (location_t loc, tree array_expr, tree 
> index_exp,
>   = build_x_compound_expr_from_vec (orig_index_exp_list,
> NULL, complain);
> if (orig_index_exp == error_mark_node)
> - expr = error_mark_node;
> + return error_mark_node;
> release_tree_vector (orig_index_exp_list);
>   }
>   }
> @@ -512,22 +514,29 @@ grok_array_decl (location_t loc, tree array_expr, tree 
> index_exp,
>   {
> if ((*index_exp_list)->is_empty ())
>   {
> -   error_at (loc, "built-in subscript operator without expression "
> -  "list");
> +   if (complain & tf_error)
> + error_at (loc, "built-in subscript operator without expression "
> +"list");
> return error_mark_node;
>   }
> tree idx = build_x_compound_expr_from_vec (*index_exp_list, NULL,
>tf_none);
> if (idx != error_mark_node)
> - /* If it would be valid albeit deprecated expression in C++20,
> -just pedwarn on it and treat it as if wrapped in ().  */
> - pedwarn (loc, OPT_Wcomma_subscript,
> -  "top-level comma expression in array subscript "
> -  "changed meaning in C++23");
> + {
> +   /* If it would be valid albeit deprecated expression in C++20,
> +  just pedwarn on it and treat it as if wrapped in () unless
> +  we're in a SFINAE context.  */
> +   if (!(complain & tf_error))
> + return error_mark_node;
> +   pedwarn (loc, OPT_Wcomma_subscript,
> +"top-level comma expression in array subscript "
> +"changed meaning in C++23");
> + }
> else
>   {
> -   error_at (loc, "built-in subscript operator with more than one "
> -  "expression in expression list");
> +   if (complain & tf_error)
> + error_at (loc, "built-in subscript operator with more than one "
> +"expression in expression list");
> return error_mark_node;
>   }
> index_exp = idx;
> @@ -561,8 +570,9 @@ grok_array_decl (location_t loc, tree array_expr, tree 
> index_exp,
>   swapped = true, array_expr = p2, index_exp = i1;
>else
>   {
> -   error_at (loc, "invalid types %<%T[%T]%> for array subscript",
> - type, TREE_TYPE (index_exp));
> +   if (complain & tf_error)
> + error_at (loc, "invalid types %<%T[%T]%> for array subscript",
> +   type, TREE_TYPE (index_exp));
> return error_mark_node;
>   }
>  
> diff --git a/gcc/testsuite/g++.dg/cpp23/subscript15.C 
> b/gcc/testsuite/g++.dg/cpp23/subscript15.C
> new file mode 100644
> index 000..1528ee71306
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp23/subscript15.C
> @@ -0,0 +1,24 @@
> +// PR c++/111493
> +// {

[PATCH] c++: missing SFINAE in grok_array_decl [PR111493]

2023-09-20 Thread Patrick Palka

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

This fixes some missed SFINAE in grok_array_decl when checking a C++23
multidimensional subscript operator expression.

Note the existing pedwarn code paths are a backward compatibility fallback
for treating invalid a[x, y, z] as a[(x, y, z)], but this should only be
done outside of a SFINAE context I think.
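
A minimal sketch of the guard described above, assuming a simplified model in which a "complain" bitmask decides whether a failed check may emit a diagnostic. The names sketch_complain, sketch_result, sketch_diagnostics and sketch_check_subscript are hypothetical stand-ins for illustration and are not GCC's real types or functions; only the shape of the "complain & tf_error" test is taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not GCC's real types.
enum sketch_complain : unsigned { sk_none = 0, sk_error = 1u << 0 };

struct sketch_result {
  bool ok;  // false plays the role of error_mark_node
};

// Collected diagnostics, so the effect of the guard is observable.
static std::vector<std::string> sketch_diagnostics;

// Validate a built-in subscript that has n_indices index expressions.
// A diagnostic is recorded only when complain has sk_error set; the
// failure result is returned either way, which is what a SFINAE
// context needs: a quiet failure instead of a hard error.
sketch_result sketch_check_subscript(int n_indices, unsigned complain) {
  assert(n_indices >= 0);
  if (n_indices == 1)
    return {true};
  if (complain & sk_error)
    sketch_diagnostics.push_back(
        n_indices == 0
            ? "built-in subscript operator without expression list"
            : "built-in subscript operator with more than one expression");
  return {false};
}
```

With sk_none the function fails silently, so a caller probing validity (as a requires-expression does) sees only the failure result and no diagnostic.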

PR c++/111493

gcc/cp/ChangeLog:

* decl2.cc (grok_array_decl): Guard errors with tf_error.
In the pedwarn code paths, return error_mark_node when in
a SFINAE context.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/subscript15.C: New test.
---
 gcc/cp/decl2.cc  | 36 +++-
 gcc/testsuite/g++.dg/cpp23/subscript15.C | 24 
 2 files changed, 47 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/subscript15.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index b402befba6d..6eb6d8c57d6 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -477,7 +477,9 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
{
  /* If it would be valid albeit deprecated expression in
 C++20, just pedwarn on it and treat it as if wrapped
-in ().  */
+in () unless we're in a SFINAE context.  */
+ if (!(complain & tf_error))
+   return error_mark_node;
  pedwarn (loc, OPT_Wcomma_subscript,
   "top-level comma expression in array subscript "
   "changed meaning in C++23");
@@ -487,7 +489,7 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
= build_x_compound_expr_from_vec (orig_index_exp_list,
  NULL, complain);
  if (orig_index_exp == error_mark_node)
-   expr = error_mark_node;
+   return error_mark_node;
  release_tree_vector (orig_index_exp_list);
}
}
@@ -512,22 +514,29 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
{
  if ((*index_exp_list)->is_empty ())
{
- error_at (loc, "built-in subscript operator without expression "
-"list");
+ if (complain & tf_error)
+   error_at (loc, "built-in subscript operator without expression "
+  "list");
  return error_mark_node;
}
  tree idx = build_x_compound_expr_from_vec (*index_exp_list, NULL,
 tf_none);
  if (idx != error_mark_node)
-   /* If it would be valid albeit deprecated expression in C++20,
-  just pedwarn on it and treat it as if wrapped in ().  */
-   pedwarn (loc, OPT_Wcomma_subscript,
-"top-level comma expression in array subscript "
-"changed meaning in C++23");
+   {
+ /* If it would be valid albeit deprecated expression in C++20,
+just pedwarn on it and treat it as if wrapped in () unless
+we're in a SFINAE context.  */
+ if (!(complain & tf_error))
+   return error_mark_node;
+ pedwarn (loc, OPT_Wcomma_subscript,
+  "top-level comma expression in array subscript "
+  "changed meaning in C++23");
+   }
  else
{
- error_at (loc, "built-in subscript operator with more than one "
-"expression in expression list");
+ if (complain & tf_error)
+   error_at (loc, "built-in subscript operator with more than one "
+  "expression in expression list");
  return error_mark_node;
}
  index_exp = idx;
@@ -561,8 +570,9 @@ grok_array_decl (location_t loc, tree array_expr, tree 
index_exp,
swapped = true, array_expr = p2, index_exp = i1;
   else
{
- error_at (loc, "invalid types %<%T[%T]%> for array subscript",
-   type, TREE_TYPE (index_exp));
+ if (complain & tf_error)
+   error_at (loc, "invalid types %<%T[%T]%> for array subscript",
+ type, TREE_TYPE (index_exp));
  return error_mark_node;
}
 
diff --git a/gcc/testsuite/g++.dg/cpp23/subscript15.C 
b/gcc/testsuite/g++.dg/cpp23/subscript15.C
new file mode 100644
index 000..1528ee71306
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/subscript15.C
@@ -0,0 +1,24 @@
+// PR c++/111493
+// { dg-do compile { target c++23 } }
+
+template
+concept CartesianIndexable = requires(T t, Ts... ts) { t[ts...]; };
+
+static_assert(!CartesianIndexable);

Re: [PATCH] c++: improve class NTTP object pretty printing [PR111471]

2023-09-20 Thread Patrick Palka

On Tue, 19 Sep 2023, Patrick Palka wrote:

> On Tue, 19 Sep 2023, Jason Merrill wrote:
> 
> > On 9/19/23 12:40, Patrick Palka wrote:
> > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk/13?
> > 
> > OK for trunk.  What's your argument for backporting?
> 
> Thanks.  I don't feel strongly about it, but I was thinking that since
> we typically backport C++20-only correctness fixes to the most recent
> release branch, C++20-only diagnostic improvements might be suitable
> too?
> 
> > 
> > > -- >8 --
> > > 
> > > 1. Move class NTTP object pretty printing to a more general spot in
> > > the pretty printer.

FWIW this first change isn't just a refactoring, it means we now pretty
print an NTTP object that appears elsewhere besides in a template
argument list, e.g. in a parameter mapping:

Before:

diagnostic19.C:8:15: note: the expression ‘((const A)V).value [with V = 
_ZTAXtl1AEE]’ evaluated to ‘false’

After:

diagnostic19.C:8:15: note: the expression ‘(V).value [with V = A{false}]’ 
evaluated to ‘false’

> > > 2. Print the type of a class NTTP object alongside its CONSTRUCTOR
> > > value, like dump_expr would have done.
> > > 3. Don't pretty print const VIEW_CONVERT_EXPR wrapping class NTTPs.
> > > 
> > >   PR c++/111471
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > >   * cxx-pretty-print.cc (cxx_pretty_printer::expression)
> > >   : Print the value of a class NTTP object.
> > >   : Strip const VIEW_CONVERT_EXPR
> > >   wrappers for class NTTPs.
> > >   (pp_cxx_template_argument_list): Don't handle the class
> > >   NTTP objects here.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   * g++.dg/concepts/diagnostic19.C: New test.
> > > ---
> > >   gcc/cp/cxx-pretty-print.cc   | 19 +--
> > >   gcc/testsuite/g++.dg/concepts/diagnostic19.C | 20 
> > >   2 files changed, 37 insertions(+), 2 deletions(-)
> > >   create mode 100644 gcc/testsuite/g++.dg/concepts/diagnostic19.C
> > > 
> > > diff --git a/gcc/cp/cxx-pretty-print.cc b/gcc/cp/cxx-pretty-print.cc
> > > index 909a9dc917f..7cd43151592 100644
> > > --- a/gcc/cp/cxx-pretty-print.cc
> > > +++ b/gcc/cp/cxx-pretty-print.cc
> > > @@ -1121,6 +1121,15 @@ cxx_pretty_printer::expression (tree t)
> > > t = OVL_FIRST (t);
> > > /* FALLTHRU */
> > >   case VAR_DECL:
> > > +  if (DECL_NTTP_OBJECT_P (t))
> > > + {
> > > +   /* Print the type followed by the CONSTRUCTOR value of an
> > > +  NTTP object.  */
> > > +   simple_type_specifier (cv_unqualified (TREE_TYPE (t)));
> > > +   expression (DECL_INITIAL (t));
> > > +   break;
> > > + }
> > > +  /* FALLTHRU */
> > >   case PARM_DECL:
> > >   case FIELD_DECL:
> > >   case CONST_DECL:
> > > @@ -1261,6 +1270,14 @@ cxx_pretty_printer::expression (tree t)
> > > pp_cxx_right_paren (this);
> > > break;
> > >   +case VIEW_CONVERT_EXPR:
> > > +  if (TREE_CODE (TREE_OPERAND (t, 0)) == TEMPLATE_PARM_INDEX)
> > > + {
> > > +   /* Strip const VIEW_CONVERT_EXPR wrappers for class NTTPs.  */
> > > +   expression (TREE_OPERAND (t, 0));
> > > +   break;
> > > + }
> > > +  /* FALLTHRU */
> > >   default:
> > > c_pretty_printer::expression (t);
> > > break;
> > > @@ -1966,8 +1983,6 @@ pp_cxx_template_argument_list (cxx_pretty_printer 
> > > *pp,
> > > tree t)
> > > if (TYPE_P (arg) || (TREE_CODE (arg) == TEMPLATE_DECL
> > >  && TYPE_P (DECL_TEMPLATE_RESULT (arg
> > >   pp->type_id (arg);
> > > -   else if (VAR_P (arg) && DECL_NTTP_OBJECT_P (arg))
> > > - pp->expression (DECL_INITIAL (arg));
> > > else
> > >   pp->expression (arg);
> > >   }
> > > diff --git a/gcc/testsuite/g++.dg/concepts/diagnostic19.C
> > > b/gcc/testsuite/g++.dg/concepts/diagnostic19.C
> > > new file mode 100644
> > > index 000..20cdb63380b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/g++.dg/concepts/diagnostic19.C
> > > @@ -0,0 +1,20 @@
> > > +// Verify our pretty printing of class NTTP objects.
> > > +// PR c++/111471
> > > +// { dg-do compile { target c++20 } }
> > > +
> > > +struct A { bool value; };
> > > +
> > > +template
> > > +  requires (V.value) // { dg-message {'\(V\).value \[with V =
> > > A\{false\}\]'} }
> > > +void f();
> > > +
> > > +template struct B { static constexpr auto value = V.value; };
> > > +
> > > +template
> > > +  requires T::value // { dg-message {'T::value \[with T = 
> > > B\]'}
> > > }
> > > +void g();
> > > +
> > > +int main() {
> > > +  f(); // { dg-error "no match" }
> > > +  g>(); // { dg-error "no match" }
> > > +}
> > 
> > 
>

Re: [PATCH][RFC] middle-end/106811 - document GENERIC/GIMPLE undefined behavior

2023-09-20 Thread Alexander Monakov



On Fri, 15 Sep 2023, Richard Biener via Gcc-patches wrote:

> +@itemize @bullet
> +@item
> +When the result of negation, addition, subtraction or division of two signed
> +integers or signed integer vectors not subject to @option{-fwrapv} cannot be
> +represented in the type.

It would be a bit awkward to add 'or vectors' everywhere it applies, perhaps
say something general about elementwise vector operations up front?

> +
> +@item
> +The value of the second operand of any of the division or modulo operators
> +is zero.
> +
> +@item
> +When incrementing or decrementing a pointer not subject to
> +@option{-fwrapv-pointer} wraps around zero.
> +
> +@item
> +An expression is shifted by a negative number or by an amount greater
> +than or equal to the width of the shifted operand.
> +
> +@item
> +Pointers that do not point to the same object are compared using
> +relational operators.

This does not apply to '==' and '!='. Maybe say

  Ordered comparison operators are applied to pointers
  that do not point to the same object.
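
To illustrate the distinction drawn above at the source level, here is a small C++ sketch. Equality comparison is fine for any two object pointers, while an ordering of pointers into different objects needs a facility that guarantees one. The sketch uses std::less, which the C++ standard library specifies as a strict total order over pointers. The helper names are hypothetical, and this says nothing about how GENERIC itself represents such comparisons.

```cpp
#include <cassert>
#include <functional>

// Equality is well defined for any two pointers to objects.
bool sketch_same_object(const int* p, const int* q) {
  return p == q;
}

// std::less yields a strict total order even when p and q point into
// different objects, unlike the built-in "<" applied to such pointers.
bool sketch_ordered_before(const int* p, const int* q) {
  assert(p != nullptr && q != nullptr);
  return std::less<const int*>{}(p, q);
}
```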

> +
> +@item
> +An object which has been modified is accessed through a restrict-qualified
> +pointer and another pointer that are not both based on the same object.
> +
> +@item
> +The @} that terminates a function is reached, and the value of the function
> +call is used by the caller.
> +
> +@item
> +When program execution reaches __builtin_unreachable.
> +
> +@item
> +When an object has its stored value accessed by an lvalue that
> +does not have one of the following types:
> +@itemize @minus
> +@item
> +a (qualified) type compatible with the effective type of the object
> +@item
> +a type that is the (qualified) signed or unsigned type corresponding to
> +the effective type of the object
> +@item
> +a character type, a ref-all qualified type or a type subject to
> +@option{-fno-strict-aliasing}
> +@item
> +a pointer to void with the same level of indirection as the accessed
> +pointer object
> +@end itemize

This list seems to miss a clause that allows aliasing between
scalar types and their vector counterparts?

Thanks.
Alexander

Re: [PATCH] Testsuite, DWARF2: adjust regexp to match darwin output

2023-09-20 Thread FX Coudert

ping**2


> Hi,
> 
> This was a painful one to fix, because I hate regexps, especially when they 
> are quoted. On darwin, we have this failure:
> 
>FAIL: gcc.dg/debug/dwarf2/inline4.c scan-assembler 
> DW_TAG_inlined_subroutine[^(]*([^)]*)[^(]*(DIE 
> (0x[0-9a-f]*) DW_TAG_formal_parameter[^(]*(DIE 
> (0x[0-9a-f]*) DW_TAG_variable
> 
> That hideous regexp is trying to match (generated on Linux):
> 
>>.uleb128 0x4# (DIE (0x5c) DW_TAG_inlined_subroutine)
>>.long   0xa0# DW_AT_abstract_origin
>>.quad   .LBI4   # DW_AT_entry_pc
>>.byte   .LVU2   # DW_AT_GNU_entry_view
>>.quad   .LBB4   # DW_AT_low_pc
>>.quad   .LBE4-.LBB4 # DW_AT_high_pc
>>.byte   0x1 # DW_AT_call_file (u.c)
>>.byte   0xf # DW_AT_call_line
>>.byte   0x14# DW_AT_call_column
>>.uleb128 0x5# (DIE (0x7d) DW_TAG_formal_parameter)
>>.long   0xad# DW_AT_abstract_origin
>>.long   .LLST0  # DW_AT_location
>>.long   .LVUS0  # DW_AT_GNU_locviews
>>.uleb128 0x6# (DIE (0x8a) DW_TAG_variable)
> 
> It is using the parentheses to check what is between  
> DW_TAG_inlined_subroutine, DW_TAG_formal_parameter and DW_TAG_variable. 
> There’s only one block of parentheses in the middle, that "(u.c)". However, 
> on darwin, the generated output is more compact:
> 
>>.uleb128 0x4; (DIE (0x188) DW_TAG_inlined_subroutine)
>>.long   0x1b8   ; DW_AT_abstract_origin
>>.quad   LBB4; DW_AT_low_pc
>>.quad   LBE4; DW_AT_high_pc
>>.uleb128 0x5; (DIE (0x19d) DW_TAG_formal_parameter)
>>.long   0x1c6   ; DW_AT_abstract_origin
>>.uleb128 0x6; (DIE (0x1a2) DW_TAG_variable)
> 
> I think that’s valid as well, and the test should pass (what the test really 
> wants to check is that there is no DW_TAG_lexical_block emitted there, see 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37801 for its origin). It could 
> be achieved in two ways:
> 
> 1. making darwin emit the DW_AT_call_file
> 2. adjusting the regexp to match, making the internal block of parentheses 
> optional 
> 
> I chose the second approach. It makes the test pass on darwin. If someone can 
> test it on linux, it’d be appreciated :) I don’t have ready access to such a 
> system right now.
> 
> Once that passes, OK to commit?
> FX
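
The second approach from the quoted message (making the inner block of parentheses optional) can be sketched as follows. This is an illustration using C++ std::regex, not the Tcl regexp from the attached patch, which is not shown in this message. The pattern is a hypothetical simplification that stops at DW_TAG_formal_parameter, and the function name is made up for the example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical, simplified pattern: the parenthesised block that may sit
// between the two DIEs (such as "(u.c)") is wrapped in an optional group,
// so both the longer and the more compact assembly listings match.
bool sketch_matches_inline_die(const std::string& asm_text) {
  assert(!asm_text.empty());
  static const std::regex pattern(
      R"re(DW_TAG_inlined_subroutine[^(]*)re"
      R"re((\([^)]*\)[^(]*)?)re"
      R"re(\(DIE \(0x[0-9a-f]*\) DW_TAG_formal_parameter)re");
  return std::regex_search(asm_text, pattern);
}
```

Making the group optional also makes the pattern more permissive, so this is only an illustration of the idea and not a drop-in replacement for the testsuite directive.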



0001-Testsuite-DWARF2-adjust-regexp-to-match-darwin-outpu.patch
Description: Binary data

[PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-09-20 Thread Robin Dapp

Hi,

as described in PR111401 we currently emit a COND and a PLUS expression
for conditional reductions.  This makes it difficult to combine both
into a masked reduction statement later.
This patch improves that by directly emitting a COND_ADD during ifcvt and
adjusting some vectorizer code to handle it.

It also makes neutral_op_for_reduction return -0 if HONOR_SIGNED_ZEROS
is true.
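
As a rough scalar illustration of why the neutral element matters when signed zeros are honored, consider the sketch below. It models a conditional sum where inactive elements contribute a "neutral" value. The function name and signature are hypothetical and only mimic the shape of the loop in the new test; it assumes default IEEE round-to-nearest arithmetic without -ffast-math.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar model of a conditional sum reduction: elements whose condition
// is false contribute `neutral` instead of a[i].
double sketch_cond_sum(const std::vector<double>& a,
                       const std::vector<bool>& cond,
                       double init, double neutral) {
  assert(a.size() == cond.size());
  double res = init;
  for (std::size_t i = 0; i < a.size(); ++i)
    res += cond[i] ? a[i] : neutral;
  return res;
}
```

With init = -0.0 and no active elements, a neutral value of +0.0 turns the result into +0.0 (since -0.0 + +0.0 is +0.0 under round-to-nearest), whereas a neutral value of -0.0 preserves the sign of the initial value.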

Related question/change: We only allow PLUS_EXPR in fold_left_reduction_fn
but have code to handle MINUS_EXPR in vectorize_fold_left_reduction.  I
suppose that's intentional but it "just works" on riscv and the testsuite
doesn't change when allowing MINUS_EXPR so I went ahead and did that.

Bootstrapped and regtested on x86 and aarch64.

Regards
 Robin

gcc/ChangeLog:

PR middle-end/111401
* internal-fn.cc (cond_fn_p): New function.
* internal-fn.h (cond_fn_p): Define.
* tree-if-conv.cc (convert_scalar_cond_reduction): Emit COND_ADD
if supported.
(predicate_scalar_phi): Add whitespace.
* tree-vect-loop.cc (fold_left_reduction_fn): Add IFN_COND_ADD.
(neutral_op_for_reduction): Return -0 for PLUS.
(vect_is_simple_reduction): Don't count else operand in
COND_ADD.
(vectorize_fold_left_reduction): Add COND_ADD handling.
(vectorizable_reduction): Don't count else operand in COND_ADD.
(vect_transform_reduction): Add COND_ADD handling.
* tree-vectorizer.h (neutral_op_for_reduction): Add default
parameter.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c: New test.
* gcc.target/riscv/rvv/autovec/cond/pr111401.c: New test.
---
 gcc/internal-fn.cc|  38 +
 gcc/internal-fn.h |   1 +
 .../vect-cond-reduc-in-order-2-signed-zero.c  | 141 ++
 .../riscv/rvv/autovec/cond/pr111401.c |  61 
 gcc/tree-if-conv.cc   |  63 ++--
 gcc/tree-vect-loop.cc | 130 
 gcc/tree-vectorizer.h |   2 +-
 7 files changed, 394 insertions(+), 42 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/pr111401.c

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 0fd34359247..77939890f5a 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -4241,6 +4241,44 @@ first_commutative_argument (internal_fn fn)
 }
 }
 
+/* Return true if this CODE describes a conditional (masked) internal_fn.  */
+
+bool
+cond_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+{
+#undef DEF_INTERNAL_COND_FN
+#define DEF_INTERNAL_COND_FN(NAME, F, O, T)  \
+case IFN_COND_##NAME:\
+case IFN_COND_LEN_##NAME:\
+  return true;
+#include "internal-fn.def"
+#undef DEF_INTERNAL_COND_FN
+
+#undef DEF_INTERNAL_SIGNED_COND_FN
+#define DEF_INTERNAL_SIGNED_COND_FN(NAME, F, S, SO, UO, T)   \
+case IFN_COND_##NAME:\
+case IFN_COND_LEN_##NAME:\
+  return true;
+#include "internal-fn.def"
+#undef DEF_INTERNAL_SIGNED_COND_FN
+
+default:
+  return false;
+}
+
+  return false;
+}
+
+
 /* Return true if this CODE describes an internal_fn that returns a vector with
elements twice as wide as the element size of the input vectors.  */
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 99de13a0199..f1cc9db29c0 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -219,6 +219,7 @@ extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
 extern bool widening_fn_p (code_helper);
+extern bool cond_fn_p (code_helper code);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c 
b/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
new file mode 100644
index 000..57c600838ee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
@@ -0,0 +1,141 @@
+/* Make sure a -0 stays -0 when we perform a conditional reduction.  */
+/* { dg-do run } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-add-options ieee } */
+/* { dg-additional-options "-std=c99 -fno-fast-math" } */
+
+#include "tree-vect.h"
+
+#include 
+
+#define N (VECTOR_BITS * 17)
+
+double __attribute__ ((noinline, noclone))
+reduc_plus_double (double *restrict a, double init, int *cond, int n)
+{
+  double res = init;
+  for (int i = 0; i

RE: [PATCH] RISC-V: Support simplifying x/(-1) to neg for vector.

2023-09-20 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-Original Message-
From: Jeff Law  
Sent: Wednesday, September 20, 2023 9:39 PM
To: Wang, Yanzhang ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2 
Subject: Re: [PATCH] RISC-V: Support simplifying x/(-1) to neg for vector.



On 9/19/23 21:36, yanzhang.w...@intel.com wrote:
> From: Yanzhang Wang 
> 
> gcc/ChangeLog:
> 
>   * simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
>  support simplifying vector int not only scalar int.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/base/simplify-vdiv.c: New test.
> 
> Signed-off-by: Yanzhang Wang 
OK
jeff

[PATCH] AArch64: Fix strict-align cpymem/setmem [PR103100]

2023-09-20 Thread Wilco Dijkstra


The cpymemdi/setmemdi implementation doesn't fully support strict alignment.
Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT.
Clean up the condition when to use MOPS.
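
The decision order for cpymem after this change can be paraphrased with the following sketch. It mirrors the two conditions visible in the diff below (variable size or under-aligned strict-align copies first, then the size limits). The type and function names are hypothetical, and "mops_or_libcall" stands for handing the operands to the MOPS expander, which is assumed to fall back to a library call when MOPS is unavailable.

```cpp
#include <cassert>

// Hypothetical model of the expansion choice; not the real GCC interfaces.
enum class sketch_strategy { mops_or_libcall, inline_sequence };

struct sketch_target {
  bool strict_alignment;         // models STRICT_ALIGNMENT
  bool has_mops;                 // models TARGET_MOPS
  unsigned long mops_threshold;  // models the MOPS memcpy size threshold
};

sketch_strategy sketch_choose_cpymem(bool size_is_constant, unsigned long size,
                                     unsigned align, const sketch_target& t) {
  assert(align > 0);
  const unsigned long max_inline_size = 256;  // inline limit used in the patch

  // Variable-sized or under-aligned strict-align copies are not inlined.
  if (!size_is_constant || (t.strict_alignment && align < 16))
    return sketch_strategy::mops_or_libcall;

  // Large copies use MOPS when available, otherwise a library call.
  if (size > max_inline_size || (t.has_mops && size > t.mops_threshold))
    return sketch_strategy::mops_or_libcall;

  return sketch_strategy::inline_sequence;
}
```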

Passes regress/bootstrap, OK for commit?

gcc/ChangeLog:
PR target/103100
* config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.
(setmemdi): Likewise.
* config/aarch64/aarch64.cc (aarch64_expand_cpymem): Support
strict-align.  Cleanup condition for using MOPS.
(aarch64_expand_setmem): Likewise.

---

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
dd6874d13a75f20d10a244578afc355b25c73da2..8f3bfb91c0f4ec43f37fe9289a66092a29a47e4d
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25261,27 +25261,23 @@ aarch64_expand_cpymem (rtx *operands)
   int mode_bits;
   rtx dst = operands[0];
   rtx src = operands[1];
+  unsigned align = INTVAL (operands[3]);
   rtx base;
   machine_mode cur_mode = BLKmode;
+  bool size_p = optimize_function_for_size_p (cfun);
 
-  /* Variable-sized memcpy can go through the MOPS expansion if available.  */
-  if (!CONST_INT_P (operands[2]))
+  /* Variable-sized or strict-align copies may use the MOPS expansion.  */
+  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
 return aarch64_expand_cpymem_mops (operands);
 
   unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
 
-  /* Try to inline up to 256 bytes or use the MOPS threshold if available.  */
-  unsigned HOST_WIDE_INT max_copy_size
-= TARGET_MOPS ? aarch64_mops_memcpy_size_threshold : 256;
-
-  bool size_p = optimize_function_for_size_p (cfun);
+  /* Try to inline up to 256 bytes.  */
+  unsigned max_copy_size = 256;
+  unsigned max_mops_size = aarch64_mops_memcpy_size_threshold;
 
-  /* Large constant-sized cpymem should go through MOPS when possible.
- It should be a win even for size optimization in the general case.
- For speed optimization the choice between MOPS and the SIMD sequence
- depends on the size of the copy, rather than number of instructions,
- alignment etc.  */
-  if (size > max_copy_size)
+  /* Large copies use MOPS when available or a library call.  */
+  if (size > max_copy_size || (TARGET_MOPS && size > max_mops_size))
 return aarch64_expand_cpymem_mops (operands);
 
   int copy_bits = 256;
@@ -25445,12 +25441,13 @@ aarch64_expand_setmem (rtx *operands)
   unsigned HOST_WIDE_INT len;
   rtx dst = operands[0];
   rtx val = operands[2], src;
+  unsigned align = INTVAL (operands[3]);
   rtx base;
   machine_mode cur_mode = BLKmode, next_mode;
 
-  /* If we don't have SIMD registers or the size is variable use the MOPS
- inlined sequence if possible.  */
-  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD)
+  /* Variable-sized or strict-align memset may use the MOPS expansion.  */
+  if (!CONST_INT_P (operands[1]) || !TARGET_SIMD
+  || (STRICT_ALIGNMENT && align < 16))
 return aarch64_expand_setmem_mops (operands);
 
   bool size_p = optimize_function_for_size_p (cfun);
@@ -25458,10 +25455,13 @@ aarch64_expand_setmem (rtx *operands)
   /* Default the maximum to 256-bytes when considering only libcall vs
  SIMD broadcast sequence.  */
   unsigned max_set_size = 256;
+  unsigned max_mops_size = aarch64_mops_memset_size_threshold;
 
   len = INTVAL (operands[1]);
-  if (len > max_set_size && !TARGET_MOPS)
-return false;
+
+  /* Large memset uses MOPS when available or a library call.  */
+  if (len > max_set_size || (TARGET_MOPS && len > max_mops_size))
+return aarch64_expand_setmem_mops (operands);
 
   int cst_val = !!(CONST_INT_P (val) && (INTVAL (val) != 0));
   /* The MOPS sequence takes:
@@ -25474,12 +25474,6 @@ aarch64_expand_setmem (rtx *operands)
  the arguments + 1 for the call.  */
   unsigned libcall_cost = 4;
 
-  /* Upper bound check.  For large constant-sized setmem use the MOPS sequence
- when available.  */
-  if (TARGET_MOPS
-  && len >= (unsigned HOST_WIDE_INT) aarch64_mops_memset_size_threshold)
-return aarch64_expand_setmem_mops (operands);
-
   /* Attempt a sequence with a vector broadcast followed by stores.
  Count the number of operations involved to see if it's worth it
  against the alternatives.  A simple counter simd_ops on the
@@ -25521,10 +25515,8 @@ aarch64_expand_setmem (rtx *operands)
   simd_ops++;
   n -= mode_bits;
 
-  /* Do certain trailing copies as overlapping if it's going to be
-cheaper.  i.e. less instructions to do so.  For instance doing a 15
-byte copy it's more efficient to do two overlapping 8 byte copies than
-8 + 4 + 2 + 1.  Only do this when -mstrict-align is not supplied.  */
+  /* Emit trailing writes using overlapping unaligned accesses
+   (when !STRICT_ALIGNMENT) - this is smaller and faster.  */
   if (n > 0 && n < copy_limit / 2 && !STRICT_ALIGNMENT)
{
  next_mode =

Re: [PATCH][RFC] middle-end/106811 - document GENERIC/GIMPLE undefined behavior

2023-09-20 Thread Richard Biener

On Wed, 20 Sep 2023, Richard Sandiford wrote:

> Thanks for doing this.  Question below...
> 
> Richard Biener via Gcc-patches  writes:
> > The following attempts to provide a set of conditions GENERIC/GIMPLE
> > considers invoking undefined behavior, leaning on the C standards
> > Annex J, as to provide portability guidance to language frontend
> > developers.
> >
> > I've both tried to remember cases we exploit undefined behavior
> > and went over C2x Annex J to catch more stuff.  I'd be grateful
> > if people could point out obvious omissions or cases where the
> > wording isn't clear.  I plan to check/amend the individual operator
> > documentations as well, but not everything fits there.
> >
> > I've put this into generic.texi because it applies to GENERIC as
> > the frontend interface.  All constraints apply to GIMPLE as well.
> > I plan to add a section to gimple.texi as to how to deal with
> > undefined behavior.
> >
> > As said, every comment is welcome.
> >
> > For testing I've built doc and inspected the resulting pdf.
> >
> > PR middle-end/106811
> > * doc/generic.texi: Add portability section with
> > subsection on undefined behavior.
> > ---
> >  gcc/doc/generic.texi | 87 
> >  1 file changed, 87 insertions(+)
> >
> > diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
> > index 6534c354b7a..0969f881146 100644
> > --- a/gcc/doc/generic.texi
> > +++ b/gcc/doc/generic.texi
> > @@ -43,6 +43,7 @@ seems inelegant.
> >  * Functions::  Function bodies, linkage, and other aspects.
> >  * Language-dependent trees::Topics and trees specific to language 
> > front ends.
> >  * C and C++ Trees::Trees specific to C and C++.
> > +* Portability issues::  Portability summary for languages.
> >  @end menu
> >  
> >  @c -
> > @@ -3733,3 +3734,89 @@ In either case, the expression is void.
> >  
> >  
> >  @end table
> > +
> > +
> > +@node Portability issues
> > +@section Portability issues
> > +
> > +This section summarizes portability issues when translating source 
> > languages
> > +to GENERIC.  Everything written here also applies to GIMPLE.  This section
> > +heavily relies on interpretation according to the C standard.
> > +
> > +@menu
> > +* Undefined behavior::  Undefined behavior.
> > +@end menu
> > +
> > +@node Undefined behavior
> > +@subsection Undefined behavior
> > +
> > +The following is a list of circumstances that invoke undefined behavior.
> > +
> > +@itemize @bullet
> > +@item
> > +When the result of negation, addition, subtraction or division of two 
> > signed
> > +integers or signed integer vectors not subject to @option{-fwrapv} cannot 
> > be
> > +represented in the type.
> 
> Couldn't tell: is the omission of multiplication deliberate?

No.  Fixed.  Do you by chance remember/know anything about RTL 'div'
and behavior on overflow (INT_MIN/-1), in particular with -fwrapv?
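
For reference, the two signed-division cases under discussion (a zero divisor and INT_MIN / -1) look like this at the source level. This is a hypothetical helper that rejects both before dividing; it does not describe what RTL 'div' does on overflow, which is exactly the open question.

```cpp
#include <cassert>
#include <climits>

// Returns false instead of dividing when the operation would be
// undefined: a zero divisor, or INT_MIN / -1 whose mathematical result
// is not representable in int.
bool sketch_checked_div(int a, int b, int* quotient) {
  assert(quotient != nullptr);
  if (b == 0)
    return false;
  if (a == INT_MIN && b == -1)
    return false;
  *quotient = a / b;
  return true;
}
```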

Richard.

> Richard
> 
> > +
> > +@item
> > +The value of the second operand of any of the division or modulo operators
> > +is zero.
> > +
> > +@item
> > +When incrementing or decrementing a pointer not subject to
> > +@option{-fwrapv-pointer} wraps around zero.
> > +
> > +@item
> > +An expression is shifted by a negative number or by an amount greater
> > +than or equal to the width of the shifted operand.
> > +
> > +@item
> > +Pointers that do not point to the same object are compared using
> > +relational operators.
> > +
> > +@item
> > +An object which has been modified is accessed through a restrict-qualified
> > +pointer and another pointer that are not both based on the same object.
> > +
> > +@item
> > +The @} that terminates a function is reached, and the value of the function
> > +call is used by the caller.
> > +
> > +@item
> > +When program execution reaches __builtin_unreachable.
> > +
> > +@item
> > +When an object has its stored value accessed by an lvalue that
> > +does not have one of the following types:
> > +@itemize @minus
> > +@item
> > +a (qualified) type compatible with the effective type of the object
> > +@item
> > +a type that is the (qualified) signed or unsigned type corresponding to
> > +the effective type of the object
> > +@item
> > +a character type, a ref-all qualified type or a type subject to
> > +@option{-fno-strict-aliasing}
> > +@item
> > +a pointer to void with the same level of indirection as the accessed
> > +pointer object
> > +@end itemize
> > +
> > +@item
> > +Addition or subtraction of a pointer into, or just beyond, an object
> > +and an integer type produces a result that does not point into, or just
> > +beyond when not dereferenced, the same object.
> > +
> > +@item
> > +Pointers that do not point into, or just beyond, the same object are
> > +subtracted.
> > +
> > +@item
> > +When a pointer not pointing to actual storage is dereferenced.
> > +
> > +@item
> > +An array subscript is out of range, even if

PING * 2: [V3][PATCH 3/3] Use the counted_by attribute information in bound sanitizer[PR108896]

2023-09-20 Thread Qing Zhao

Hi, 

I’d like to ping this patch set one more time.

Thanks

Qing

> On Aug 25, 2023, at 11:24 AM, Qing Zhao  wrote:
> 
> Use the counted_by attribute information in bound sanitizer.
> 
> gcc/c-family/ChangeLog:
> 
>   PR C/108896
>   * c-ubsan.cc (ubsan_instrument_bounds): Use counted_by attribute
>   information.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR C/108896
>   * gcc.dg/ubsan/flex-array-counted-by-bounds.c: New test.
>   * gcc.dg/ubsan/flex-array-counted-by-bounds-2.c: New test.
> ---
> gcc/c-family/c-ubsan.cc   | 16 +++
> .../ubsan/flex-array-counted-by-bounds-2.c| 27 +++
> .../ubsan/flex-array-counted-by-bounds.c  | 46 +++
> 3 files changed, 89 insertions(+)
> create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
> create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
> 
> diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc
> index 51aa83a378d2..a99e8433069f 100644
> --- a/gcc/c-family/c-ubsan.cc
> +++ b/gcc/c-family/c-ubsan.cc
> @@ -362,6 +362,10 @@ ubsan_instrument_bounds (location_t loc, tree array, 
> tree *index,
> {
>   tree type = TREE_TYPE (array);
>   tree domain = TYPE_DOMAIN (type);
> +  /* whether the array ref is a flexible array member with valid counted_by
> + attribute.  */
> +  bool fam_has_count_attr = false;
> +  tree counted_by = NULL_TREE;
> 
>   if (domain == NULL_TREE)
> return NULL_TREE;
> @@ -375,6 +379,17 @@ ubsan_instrument_bounds (location_t loc, tree array, 
> tree *index,
> && COMPLETE_TYPE_P (type)
> && integer_zerop (TYPE_SIZE (type)))
>   bound = build_int_cst (TREE_TYPE (TYPE_MIN_VALUE (domain)), -1);
> +  /* If the array ref is to flexible array member field which has
> +  counted_by attribute.  We can use the information from the
> +  attribute as the bound to instrument the reference.  */
> +  else if ((counted_by = component_ref_get_counted_by (array))
> + != NULL_TREE)
> + {
> +   fam_has_count_attr = true;
> +   bound = fold_build2 (MINUS_EXPR, TREE_TYPE (counted_by),
> +counted_by,
> +build_int_cst (TREE_TYPE (counted_by), 1));
> + }
>   else
>   return NULL_TREE;
> }
> @@ -387,6 +402,7 @@ ubsan_instrument_bounds (location_t loc, tree array, tree 
> *index,
>  -fsanitize=bounds-strict.  */
>   tree base = get_base_address (array);
>   if (!sanitize_flags_p (SANITIZE_BOUNDS_STRICT)
> +  && !fam_has_count_attr
>   && TREE_CODE (array) == COMPONENT_REF
>   && base && (INDIRECT_REF_P (base) || TREE_CODE (base) == MEM_REF))
> {
> diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c 
> b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
> new file mode 100644
> index ..77ec333509d0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
> @@ -0,0 +1,27 @@
> +/* test the attribute counted_by and its usage in
> +   bounds sanitizer combined with VLA.  */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds" } */
> +
> +#include 
> +
> +void __attribute__((__noinline__)) setup_and_test_vla (int n, int m)
> +{
> +   struct foo {
> +   int n;
> +   int p[][n] __attribute__((counted_by(n)));
> +   } *f;
> +
> +   f = (struct foo *) malloc (sizeof(struct foo) + m*sizeof(int[n]));
> +   f->n = m;
> +   f->p[m][n-1]=1;
> +   return;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +  setup_and_test_vla (10, 11);
> +  return 0;
> +}
> +
> +/* { dg-output "17:8: runtime error: index 11 out of bounds for type" } */
> diff --git a/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c 
> b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
> new file mode 100644
> index ..81eaeb3f2681
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
> @@ -0,0 +1,46 @@
> +/* test the attribute counted_by and its usage in
> +   bounds sanitizer.  */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=bounds" } */
> +
> +#include <stdlib.h>
> +
> +struct flex {
> +  int b;
> +  int c[];
> +} *array_flex;
> +
> +struct annotated {
> +  int b;
> +  int c[] __attribute__ ((counted_by (b)));
> +} *array_annotated;
> +
> +void __attribute__((__noinline__)) setup (int normal_count, int annotated_count)
> +{
> +  array_flex
> += (struct flex *)malloc (sizeof (struct flex)
> +  + normal_count *  sizeof (int));
> +  array_flex->b = normal_count;
> +
> +  array_annotated
> += (struct annotated *)malloc (sizeof (struct annotated)
> +   + annotated_count *  sizeof (int));
> +  array_annotated->b = annotated_count;
> +
> +  return;
> +}
> +
> +void __attribute__((__noinline__)) test (int normal_index, int annotated_index)
> +{
> +  array_flex->c[normal_index] = 1;
> +  array_annotated->c[annotated_index] = 2;
> +}
>

PING *2: [V3][PATCH 2/3] Use the counted_by attribute info in builtin object size [PR108896]

2023-09-20 Thread Qing Zhao

Hi, 

I’d like to ping this patch set one more time.

Thanks

Qing

> On Aug 25, 2023, at 11:24 AM, Qing Zhao  wrote:
> 
> Use the counted_by attribute info in builtin object size to compute the
> subobject size for flexible array members.
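
To illustrate the description above, here is a minimal sketch of querying the subobject size of an annotated flexible array member. The macros COUNTED_BY and OBJECT_SIZE and the helpers annotated_alloc and annotated_array_size are hypothetical names introduced only for this example; they are not part of the patch. The macros fall back to nothing (or to __builtin_object_size) on compilers that lack the attribute or the dynamic builtin, in which case the reported size may simply be unknown.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: use counted_by only where the compiler knows it;
   elsewhere it expands to nothing and the struct is a plain FAM struct.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical wrapper: prefer the dynamic builtin, else the static one.  */
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
#  define OBJECT_SIZE(ptr, type) __builtin_dynamic_object_size (ptr, type)
# endif
#endif
#ifndef OBJECT_SIZE
# define OBJECT_SIZE(ptr, type) __builtin_object_size (ptr, type)
#endif

struct annotated {
  int b;
  int c[] COUNTED_BY (b);
};

/* Allocate room for COUNT elements and record COUNT in the counter field.  */
struct annotated *annotated_alloc (int count)
{
  struct annotated *p
    = malloc (sizeof (struct annotated) + count * sizeof (int));
  if (p)
    p->b = count;
  return p;
}

/* Subobject size of p->c as the compiler sees it, or (size_t) -1 when
   the size is unknown to the compiler.  */
size_t annotated_array_size (struct annotated *p)
{
  return OBJECT_SIZE (p->c, 1);
}
```

With the attribute in effect, the intent of the patch is that the result equals p->b * sizeof (int); without it, callers should be prepared for (size_t) -1.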
> 
> gcc/ChangeLog:
> 
>   PR C/108896
>   * tree-object-size.cc (addr_object_size): Use the counted_by
>   attribute info.
>   * tree.cc (component_ref_has_counted_by_p): New function.
>   (component_ref_get_counted_by): New function.
>   * tree.h (component_ref_has_counted_by_p): New prototype.
>   (component_ref_get_counted_by): New prototype.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR C/108896
>   * gcc.dg/flex-array-counted-by-2.c: New test.
>   * gcc.dg/flex-array-counted-by-3.c: New test.
> ---
> .../gcc.dg/flex-array-counted-by-2.c  |  74 ++
> .../gcc.dg/flex-array-counted-by-3.c  | 210 ++
> gcc/tree-object-size.cc   |  37 ++-
> gcc/tree.cc   |  95 +++-
> gcc/tree.h|  10 +
> 5 files changed, 418 insertions(+), 8 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
> create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
> 
> diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c 
> b/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
> new file mode 100644
> index ..ec580c1f1f01
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
> @@ -0,0 +1,74 @@
> +/* test the attribute counted_by and its usage in
> + * __builtin_dynamic_object_size.  */ 
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +#include "builtin-object-size-common.h"
> +
> +#define expect(p, _v) do { \
> +size_t v = _v; \
> +if (p == v) \
> + __builtin_printf ("ok:  %s == %zd\n", #p, p); \
> +else \
> + {  \
> +   __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
> +   FAIL (); \
> + } \
> +} while (0);
> +
> +struct flex {
> +  int b;
> +  int c[];
> +} *array_flex;
> +
> +struct annotated {
> +  int b;
> +  int c[] __attribute__ ((counted_by (b)));
> +} *array_annotated;
> +
> +struct nested_annotated {
> +  struct {
> +union {
> +  int b;
> +  float f;   
> +};
> +int n;
> +  };
> +  int c[] __attribute__ ((counted_by (b)));
> +} *array_nested_annotated;
> +
> +void __attribute__((__noinline__)) setup (int normal_count, int attr_count)
> +{
> +  array_flex
> += (struct flex *)malloc (sizeof (struct flex)
> +  + normal_count *  sizeof (int));
> +  array_flex->b = normal_count;
> +
> +  array_annotated
> += (struct annotated *)malloc (sizeof (struct annotated)
> +   + attr_count *  sizeof (int));
> +  array_annotated->b = attr_count;
> +
> +  array_nested_annotated
> += (struct nested_annotated *)malloc (sizeof (struct nested_annotated)
> +  + attr_count *  sizeof (int));
> +  array_nested_annotated->b = attr_count;
> +
> +  return;
> +}
> +
> +void __attribute__((__noinline__)) test ()
> +{
> +expect(__builtin_dynamic_object_size(array_flex->c, 1), -1);
> +expect(__builtin_dynamic_object_size(array_annotated->c, 1),
> +array_annotated->b * sizeof (int));
> +expect(__builtin_dynamic_object_size(array_nested_annotated->c, 1),
> +array_nested_annotated->b * sizeof (int));
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +  setup (10,10);   
> +  test ();
> +  DONE ();
> +}
> diff --git a/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c 
> b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
> new file mode 100644
> index ..a0c3cb88ec71
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
> @@ -0,0 +1,210 @@
> +/* test the attribute counted_by and its usage in
> +__builtin_dynamic_object_size: what's the correct behavior when the
> +allocation size mismatched with the value of counted_by attribute?  */
> +/* { dg-do run } */
> +/* { dg-options "-O -fstrict-flex-arrays=3" } */
> +
> +#include "builtin-object-size-common.h"
> +
> +struct annotated {
> +  size_t foo;
> +  char others;
> +  char array[] __attribute__((counted_by (foo)));
> +};
> +
> +#define expect(p, _v) do { \
> +size_t v = _v; \
> +if (p == v) \
> +__builtin_printf ("ok:  %s == %zd\n", #p, p); \
> +else \
> +{  \
> +  __builtin_printf ("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
> +   FAIL (); \
> +} \
> +} while (0);
> +
> +#define noinline __attribute__((__noinline__))
> +#define SIZE_BUMP 10 
> +#define MAX(a, b) ((a) > (b) ? (a) : (b))
> +#define MIN(a, b) ((a) < (b) ? (a) : (b))
> +
> +/* In general, due to type casting, the type for the pointee of a pointer
> +   does not say anything about the object it points to,
> +   so __builtin_object_size cannot directly use the type of the pointee
> +   to decide the size

Ping * 2: [V3][PATCH 1/3] Provide counted_by attribute to flexible array member field (PR108896)

2023-09-20 Thread Qing Zhao

Hi, 

I’d like to ping this patch set one more time.

Thanks

Qing

> On Aug 25, 2023, at 11:24 AM, Qing Zhao  wrote:
> 
> Provide a new counted_by attribute to flexible array member field.
> 
> 'counted_by (COUNT)'
> The 'counted_by' attribute may be attached to the flexible array
> member of a structure.  It indicates that the number of the
> elements of the array is given by the field named "COUNT" in the
> same structure as the flexible array member.  GCC uses this
> information to improve the results of the array bound sanitizer and
> the '__builtin_dynamic_object_size'.
> 
> For instance, the following code:
> 
>  struct P {
>size_t count;
>char other;
>char array[] __attribute__ ((counted_by (count)));
>  } *p;
> 
> specifies that the 'array' is a flexible array member whose number
> of elements is given by the field 'count' in the same structure.
> 
> The field that represents the number of the elements should have an
> integer type.  An explicit 'counted_by' annotation defines a
> relationship between two objects, 'p->array' and 'p->count', that
> 'p->array' has _at least_ 'p->count' number of elements available.
> This relationship must hold even after any of these related objects
> are updated.  It is the user's responsibility to make sure this
> relationship holds at all times.  Otherwise the results of the
> array bound sanitizer and the '__builtin_dynamic_object_size' might
> be incorrect.
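
As a rough sketch of the relationship described above (the array must have at least 'count' elements available), the following keeps the published count at or below the allocated capacity. The COUNTED_BY macro and the helpers p_alloc and p_store are hypothetical names used only for illustration; the macro expands to nothing on compilers without the attribute.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: apply counted_by only where it is supported.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

#define MAX(a, b) ((a) > (b) ? (a) : (b))

struct P {
  size_t count;
  char other;
  char array[] COUNTED_BY (count);
};

/* Allocate NELEMS + SLACK elements but publish only NELEMS in count, so
   the array always has at least count elements available.  */
struct P *p_alloc (size_t nelems, size_t slack)
{
  struct P *p = malloc (MAX (sizeof (struct P),
                             offsetof (struct P, array[0])
                             + (nelems + slack) * sizeof (char)));
  if (p)
    p->count = nelems;
  return p;
}

/* Checked store: returns 0 on success, -1 when INDEX is not below count.  */
int p_store (struct P *p, size_t index, char value)
{
  if (index >= p->count)
    return -1;
  p->array[index] = value;
  return 0;
}
```

Allocating more than count elements is fine under this rule; raising count above the allocation is the case the documentation calls a user error.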
> 
> For instance, in the following example, the allocated array has
> fewer elements than specified by 'sbuf->count'; this is
> a user error.  As a result, out-of-bounds access to the array
> might not be detected.
> 
>  #define SIZE_BUMP 10
>  struct P *sbuf;
>  void alloc_buf (size_t nelems)
>  {
>sbuf = (struct P *) malloc (MAX (sizeof (struct P),
>   (offsetof (struct P, array[0])
>    + nelems * sizeof (char))));
>sbuf->count = nelems + SIZE_BUMP;
>/* This is invalid when the sbuf->array has less than sbuf->count
>   elements.  */
>  }
> 
> In the following example, the second update to the field 'sbuf->count'
> of the above structure will permit out-of-bounds access to the
> array 'sbuf->array' as well.
> 
>  #define SIZE_BUMP 10
>  struct P *sbuf;
>  void alloc_buf (size_t nelems)
>  {
>sbuf = (struct P *) malloc (MAX (sizeof (struct P),
>   (offsetof (struct P, array[0])
>    + (nelems + SIZE_BUMP) * sizeof (char))));
>sbuf->count = nelems;
>/* This is valid when the sbuf->array has at least sbuf->count
>   elements.  */
>  }
>  void use_buf (int index)
>  {
>sbuf->count = sbuf->count + SIZE_BUMP + 1;
>/* Now the value of sbuf->count is larger than the number
>   of elements of sbuf->array.  */
>sbuf->array[index] = 0;
>/* then the out-of-bound access to this array
>   might not be detected.  */
>  }
> 
> gcc/c-family/ChangeLog:
> 
>   PR C/108896
>   * c-attribs.cc (handle_counted_by_attribute): New function.
>   (attribute_takes_identifier_p): Add counted_by attribute to the list.
>   * c-common.cc (c_flexible_array_member_type_p): ...To this.
>   * c-common.h (c_flexible_array_member_type_p): New prototype.
> 
> gcc/c/ChangeLog:
> 
>   PR C/108896
>   * c-decl.cc (flexible_array_member_type_p): Renamed and moved to...
>   (add_flexible_array_elts_to_size): Use renamed function.
>   (is_flexible_array_member_p): Use renamed function.
>   (verify_counted_by_attribute): New function.
>   (finish_struct): Use renamed function and verify counted_by
>   attribute.
> 
> gcc/ChangeLog:
> 
>   PR C/108896
>   * doc/extend.texi: Document attribute counted_by.
>   * tree.cc (get_named_field): New function.
>   * tree.h (get_named_field): New prototype.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR C/108896
>   * gcc.dg/flex-array-counted-by.c: New test.
> ---
> gcc/c-family/c-attribs.cc| 54 -
> gcc/c-family/c-common.cc | 13 
> gcc/c-family/c-common.h  |  1 +
> gcc/c/c-decl.cc  | 79 +++-
> gcc/doc/extend.texi  | 77 +++
> gcc/testsuite/gcc.dg/flex-array-counted-by.c | 40 ++
> gcc/tree.cc  | 40 ++
> gcc/tree.h   |  5 ++
> 8 files changed, 291 insertions(+), 18 deletions(-)
> create mode 100644

PING * 2: [V3][PATCH 0/3] New attribute "counted_by" to annotate bounds for C99 FAM(PR108896)

2023-09-20 Thread Qing Zhao

Hi,

I’d like to ping this patch set one more time!

Thanks.

Qing

> On Aug 25, 2023, at 11:24 AM, Qing Zhao  wrote:
> 
> This is the 3rd version of the patch, per our discussion based on the
> review comments for the 1st and 2nd version, the major changes in this
> version are:
> 
> ***Against 1st version:
> 1. change the name "element_count" to "counted_by";
> 2. change the parameter for the attribute from a STRING to an
> Identifier;
> 3. Add logic and testing cases to handle anonymous structure/unions;
> 4. Clarify documentation to permit the situation when the allocation
> size is larger than what's specified by "counted_by", at the same time,
> it's user's error if allocation size is smaller than what's specified by
> "counted_by";
> 5. Add a complete testing case for using counted_by attribute in
> __builtin_dynamic_object_size when there is mismatch between the
> allocation size and the value of "counted_by", with the expected behavior
> for each case and an explanation of why in the comments.
> 
> ***Against 2nd version:
> 1. Identified a tree node sharing issue and fixed it in the routine
>   "component_ref_get_counted_by" of tree.cc;
> 2. Update the documentation and testing cases with the clear usage
>   of the formula to compute the allocation size:
> MAX (sizeof (struct A), offsetof (struct A, array[0]) + counted_by * sizeof(element))
>   (the algorithm used in tree-object-size.cc is correct).
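
The formula above can be sketched as a small helper. The struct layout mirrors the documentation example, and struct_a_alloc_size is a hypothetical name used only here. The MAX term matters because trailing padding can make sizeof (struct A) larger than the offset of the array, so for small counts the exact sum would fall short of the size of the struct itself.

```c
#include <stddef.h>

struct A {
  size_t count;
  char other;
  char array[];
};

/* Allocation size for a struct A holding COUNT array elements:
   MAX (sizeof (struct A),
        offsetof (struct A, array[0]) + COUNT * sizeof (element)).  */
size_t struct_a_alloc_size (size_t count)
{
  size_t header = sizeof (struct A);
  size_t exact = offsetof (struct A, array) + count * sizeof (char);
  return exact > header ? exact : header;
}
```

For a count of zero this returns sizeof (struct A); for large counts it returns the offset of the array plus the space for the elements.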
> 
> In this set of patches, the major functionality provided is:
> 
> 1. a new attribute "counted_by";
> 2. use this new attribute in bound sanitizer;
> 3. use this new attribute in dynamic object size for subobject size;
> 
> As discussed, I plan to add two more separate patch sets after this initial
> patch set is approved and committed.
> 
> set 1. A new warning option and a new sanitizer option for the user error
>  when the allocation size is smaller than the value of "counted_by".
> set 2. An improvement to __builtin_dynamic_object_size for whole-object
>  size of the structure with FAM annotated with counted_by.
> 
> There are also some existing bugs in tree-object-size.cc identified
> during the study, and PRs were filed to record them. These bugs will
> be fixed separately with individual patches:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111030
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111040
> 
> Bootstrapped and regression tested on both aarch64 and X86, no issue.
> 
> Please see more details on the description of this work on:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619708.html
> 
> and more discussions on
> https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626376.html
> 
> Okay for committing?
> 
> thanks.
> 
> Qing
> 
> Qing Zhao (3):
>  Provide counted_by attribute to flexible array member field (PR108896)
>  Use the counted_by attribute info in builtin object size [PR108896]
>  Use the counted_by attribute information in bound sanitizer[PR108896]
> 
> gcc/c-family/c-attribs.cc |  54 -
> gcc/c-family/c-common.cc  |  13 ++
> gcc/c-family/c-common.h   |   1 +
> gcc/c-family/c-ubsan.cc   |  16 ++
> gcc/c/c-decl.cc   |  79 +--
> gcc/doc/extend.texi   |  77 +++
> .../gcc.dg/flex-array-counted-by-2.c  |  74 ++
> .../gcc.dg/flex-array-counted-by-3.c  | 210 ++
> gcc/testsuite/gcc.dg/flex-array-counted-by.c  |  40 
> .../ubsan/flex-array-counted-by-bounds-2.c|  27 +++
> .../ubsan/flex-array-counted-by-bounds.c  |  46 
> gcc/tree-object-size.cc   |  37 ++-
> gcc/tree.cc   | 133 +++
> gcc/tree.h|  15 ++
> 14 files changed, 797 insertions(+), 25 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-2.c
> create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by-3.c
> create mode 100644 gcc/testsuite/gcc.dg/flex-array-counted-by.c
> create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds-2.c
> create mode 100644 gcc/testsuite/gcc.dg/ubsan/flex-array-counted-by-bounds.c
> 
> -- 
> 2.31.1
>

Re: [PATCH] RISC-V: Support simplifying x/(-1) to neg for vector.

2023-09-20 Thread Jeff Law





On 9/19/23 21:36, yanzhang.w...@intel.com wrote:

From: Yanzhang Wang 

gcc/ChangeLog:

* simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
 Support simplifying vector int, not only scalar int.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/simplify-vdiv.c: New test.

Signed-off-by: Yanzhang Wang 

OK
jeff
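
For readers following along, the arithmetic identity behind this simplification can be shown with two scalar loops. This is only an illustration of x / (-1) == -x per element; the function names are hypothetical, and whether a given compiler vectorizes the first loop into a vector negate is what the new test checks, not something this snippet demonstrates. INT32_MIN is left out because both forms overflow for it.

```c
#include <stddef.h>
#include <stdint.h>

/* Element-wise division by -1.  */
void div_by_minus_one (int32_t *dst, const int32_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] / -1;
}

/* Element-wise negation; same results for every input except INT32_MIN,
   where both operations overflow.  */
void negate (int32_t *dst, const int32_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = -src[i];
}
```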

[PATCH 2/2] RISC-V: Add assert of the number of vmerge in autovec cond testcases

2023-09-20 Thread Lehua Ding

This patch makes the cond autovec testcase checks stricter.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_arith-1.c:
Assert of the number of vmerge.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_arith-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2float-rv64-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_float2int-rv64-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2float-rv64-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv32-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-1.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_convert_int2int-rv64-2.c:
Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_logical_min_max-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_shift-3.c: Ditto.
*

[PATCH 1/2] match.pd: Support combine cond_len_op + vec_cond similar to cond_op

2023-09-20 Thread Lehua Ding

This patch adds support for combining cond_len_op and vec_cond into
cond_len_op, as is already done for cond_op.

gcc/ChangeLog:

* gimple-match.h (gimple_match_op::gimple_match_op):
Add interfaces for more arguments.
(gimple_match_op::set_op): Add interfaces for more arguments.
* match.pd: Add support of combining cond_len_op + vec_cond
---
 gcc/gimple-match.h | 72 ++
 gcc/match.pd   | 39 +
 2 files changed, 111 insertions(+)

diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h
index bec3ff42e3e..9892c142285 100644
--- a/gcc/gimple-match.h
+++ b/gcc/gimple-match.h
@@ -92,6 +92,10 @@ public:
   code_helper, tree, tree, tree, tree, tree);
   gimple_match_op (const gimple_match_cond &,
   code_helper, tree, tree, tree, tree, tree, tree);
+  gimple_match_op (const gimple_match_cond &,
+  code_helper, tree, tree, tree, tree, tree, tree, tree);
+  gimple_match_op (const gimple_match_cond &,
+  code_helper, tree, tree, tree, tree, tree, tree, tree, tree);
 
   void set_op (code_helper, tree, unsigned int);
   void set_op (code_helper, tree, tree);
@@ -100,6 +104,8 @@ public:
   void set_op (code_helper, tree, tree, tree, tree, bool);
   void set_op (code_helper, tree, tree, tree, tree, tree);
   void set_op (code_helper, tree, tree, tree, tree, tree, tree);
+  void set_op (code_helper, tree, tree, tree, tree, tree, tree, tree);
+  void set_op (code_helper, tree, tree, tree, tree, tree, tree, tree, tree);
   void set_value (tree);
 
   tree op_or_null (unsigned int) const;
@@ -212,6 +218,39 @@ gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
   ops[4] = op4;
 }
 
+inline
+gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
+ code_helper code_in, tree type_in,
+ tree op0, tree op1, tree op2, tree op3,
+ tree op4, tree op5)
+  : cond (cond_in), code (code_in), type (type_in), reverse (false),
+num_ops (6)
+{
+  ops[0] = op0;
+  ops[1] = op1;
+  ops[2] = op2;
+  ops[3] = op3;
+  ops[4] = op4;
+  ops[5] = op5;
+}
+
+inline
+gimple_match_op::gimple_match_op (const gimple_match_cond &cond_in,
+ code_helper code_in, tree type_in,
+ tree op0, tree op1, tree op2, tree op3,
+ tree op4, tree op5, tree op6)
+  : cond (cond_in), code (code_in), type (type_in), reverse (false),
+num_ops (7)
+{
+  ops[0] = op0;
+  ops[1] = op1;
+  ops[2] = op2;
+  ops[3] = op3;
+  ops[4] = op4;
+  ops[5] = op5;
+  ops[6] = op6;
+}
+
 /* Change the operation performed to CODE_IN, the type of the result to
TYPE_IN, and the number of operands to NUM_OPS_IN.  The caller needs
to set the operands itself.  */
@@ -299,6 +338,39 @@ gimple_match_op::set_op (code_helper code_in, tree type_in,
   ops[4] = op4;
 }
 
+inline void
+gimple_match_op::set_op (code_helper code_in, tree type_in,
+tree op0, tree op1, tree op2, tree op3, tree op4,
+tree op5)
+{
+  code = code_in;
+  type = type_in;
+  num_ops = 6;
+  ops[0] = op0;
+  ops[1] = op1;
+  ops[2] = op2;
+  ops[3] = op3;
+  ops[4] = op4;
+  ops[5] = op5;
+}
+
+inline void
+gimple_match_op::set_op (code_helper code_in, tree type_in,
+tree op0, tree op1, tree op2, tree op3, tree op4,
+tree op5, tree op6)
+{
+  code = code_in;
+  type = type_in;
+  num_ops = 7;
+  ops[0] = op0;
+  ops[1] = op1;
+  ops[2] = op2;
+  ops[3] = op3;
+  ops[4] = op4;
+  ops[5] = op5;
+  ops[6] = op6;
+}
+
 /* Set the "operation" to be the single value VALUE, such as a constant
or SSA_NAME.  */
 
diff --git a/gcc/match.pd b/gcc/match.pd
index a37af05f873..75b7e100120 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -103,12 +103,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   IFN_COND_FMIN IFN_COND_FMAX
   IFN_COND_AND IFN_COND_IOR IFN_COND_XOR
   IFN_COND_SHL IFN_COND_SHR)
+(define_operator_list COND_LEN_BINARY
+  IFN_COND_LEN_ADD IFN_COND_LEN_SUB
+  IFN_COND_LEN_MUL IFN_COND_LEN_DIV
+  IFN_COND_LEN_MOD IFN_COND_LEN_RDIV
+  IFN_COND_LEN_MIN IFN_COND_LEN_MAX
+  IFN_COND_LEN_FMIN IFN_COND_LEN_FMAX
+  IFN_COND_LEN_AND IFN_COND_LEN_IOR IFN_COND_LEN_XOR
+  IFN_COND_LEN_SHL IFN_COND_LEN_SHR)
 
 /* Same for ternary operations.  */
 (define_operator_list UNCOND_TERNARY
   IFN_FMA IFN_FMS IFN_FNMA IFN_FNMS)
 (define_operator_list COND_TERNARY
   IFN_COND_FMA IFN_COND_FMS IFN_COND_FNMA IFN_COND_FNMS)
+(define_operator_list COND_LEN_TERNARY
+  IFN_COND_LEN_FMA IFN_COND_LEN_FMS IFN_COND_LEN_FNMA IFN_COND_LEN_FNMS)
 
 /* __atomic_fetch_or_*, __atomic_fetch_xor_*, __atomic_xor_fetch_*  */
 (define_operator_list ATOMIC_FETCH_OR_XOR_N
@@ -8861,6 +8871,35 @@ and,
 && element_precision (type) == element_precision (op_type))
 (view_convert (cond_op @2 @3 @4 @5 (view_convert:op_type

Re: [Patch] OpenMP: Add ME support for 'omp allocate' stack variables

2023-09-20 Thread Jakub Jelinek

On Mon, Sep 18, 2023 at 02:22:50PM +0200, Tobias Burnus wrote:
> The attached patch now actually adds GOMP_alloc/free calls for 'omp allocate'.
> 
> Besides the addition of the calls and the value expression, it also had to deal with
> (implicit) mapping/privatization - such that 'default(none)' did not wrongly trigger
> for the value expression (and categorizes the vars correctly for default/defaultmap)
> and that mapping/privatization is handled correctly.
> 
> Build and regtested on x86-64-gnu-linux (w/o offloading configured, but I tested
> separately the libgomp.c/allocate-*.c with nvptx offloading).
> 
> Comments, suggestions, remarks?

LGTM.

Jakub

Re: [PATCH] Add a GCC Security policy

2023-09-20 Thread Siddhesh Poyarekar


On 2023-09-20 08:29, Jakub Jelinek wrote:

I just noticed (ENOCOFFEE) that the line (after removing libvtv) is:

 Support libraries such as libiberty, libcc1 and libcpp have been
 developed separately to share code with other tools such as binutils
 and gdb.

Does that address your concern Jakub?


I believe that is the case just for libiberty.
libcpp is I think solely used by gcc itself (several frontends use it
though, plus some build utilities in gcc).
libcc1 is code for gdb with gcc implementation details.


How about:

Libraries that are not distributed for runtime language support such as 
libiberty, libcc1 and libcpp have similar challenges to compiler 
drivers.  While they are expected to be robust against arbitrary input, 
they should only be used with trusted inputs.


Thanks,
Sid

[Committed] RISC-V: Support VLS floating-point extend/truncate

2023-09-20 Thread Juzhe-Zhong

Regression passed.

Committed.

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Extend VLS floating-point.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/widen/widen-10.c: Adapt test.
* gcc.target/riscv/rvv/autovec/widen/widen-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/widen-complicate-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/ext-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/ext-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trunc-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trunc-5.c: New test.

---
 gcc/config/riscv/vector-iterators.md  | 95 +++
 .../gcc.target/riscv/rvv/autovec/vls/ext-4.c  | 35 +++
 .../gcc.target/riscv/rvv/autovec/vls/ext-5.c  | 27 ++
 .../riscv/rvv/autovec/vls/trunc-4.c   | 35 +++
 .../riscv/rvv/autovec/vls/trunc-5.c   | 27 ++
 .../riscv/rvv/autovec/widen/widen-10.c|  2 +-
 .../riscv/rvv/autovec/widen/widen-11.c|  2 +-
 .../riscv/rvv/autovec/widen/widen-12.c|  2 +-
 .../rvv/autovec/widen/widen-complicate-7.c|  2 +-
 .../rvv/autovec/widen/widen-complicate-8.c|  2 +-
 .../rvv/autovec/widen/widen-complicate-9.c|  2 +-
 11 files changed, 225 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/ext-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/ext-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/trunc-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/trunc-5.c

diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 73df55a69c8..053d84c0c7d 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -994,6 +994,28 @@
 
   (RVVM8DF "TARGET_VECTOR_ELEN_FP_64") (RVVM4DF "TARGET_VECTOR_ELEN_FP_64")
   (RVVM2DF "TARGET_VECTOR_ELEN_FP_64") (RVVM1DF "TARGET_VECTOR_ELEN_FP_64")
+
+  (V1SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32")
+  (V2SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32")
+  (V4SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32")
+  (V8SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32")
+  (V16SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 64")
+  (V32SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
+  (V64SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 256")
+  (V128SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 512")
+  (V256SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 1024")
+  (V512SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 2048")
+  (V1024SF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 4096")
+  (V1DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64")
+  (V2DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64")
+  (V4DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64")
+  (V8DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 64")
+  (V16DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
+  (V32DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 256")
+  (V64DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 512")
+  (V128DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 1024")
+  (V256DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 2048")
+  (V512DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 4096")
 ])
 
 (define_mode_iterator VWEXTF [
@@ -1049,6 +1071,17 @@
 (define_mode_iterator VQEXTF [
   (RVVM8DF "TARGET_VECTOR_ELEN_FP_64") (RVVM4DF "TARGET_VECTOR_ELEN_FP_64")
   (RVVM2DF "TARGET_VECTOR_ELEN_FP_64") (RVVM1DF "TARGET_VECTOR_ELEN_FP_64")
+
+  (V1DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64")
+  (V2DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64")
+  (V4DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64")
+  (V8DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 64")
+  (V16DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
+  (V32DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 256")
+  (V64DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 512")
+  (V128DF "TARGET_VECTOR_VLS && TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 1024")

Re: [PATCH][RFC] middle-end/106811 - document GENERIC/GIMPLE undefined behavior

2023-09-20 Thread Richard Sandiford

Thanks for doing this.  Question below...

Richard Biener via Gcc-patches  writes:
> The following attempts to provide a set of conditions GENERIC/GIMPLE
> considers invoking undefined behavior, leaning on the C standard's
> Annex J, so as to provide portability guidance to language frontend
> developers.
>
> I've both tried to remember cases we exploit undefined behavior
> and went over C2x Annex J to catch more stuff.  I'd be grateful
> if people could point out obvious omissions or cases where the
> wording isn't clear.  I plan to check/amend the individual operator
> documentations as well, but not everything fits there.
>
> I've put this into generic.texi because it applies to GENERIC as
> the frontend interface.  All constraints apply to GIMPLE as well.
> I plan to add a section to gimple.texi as to how to deal with
> undefined behavior.
>
> As said, every comment is welcome.
>
> For testing I've built doc and inspected the resulting pdf.
>
>   PR middle-end/106811
>   * doc/generic.texi: Add portability section with
>   subsection on undefined behavior.
> ---
>  gcc/doc/generic.texi | 87 
>  1 file changed, 87 insertions(+)
>
> diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
> index 6534c354b7a..0969f881146 100644
> --- a/gcc/doc/generic.texi
> +++ b/gcc/doc/generic.texi
> @@ -43,6 +43,7 @@ seems inelegant.
>  * Functions::Function bodies, linkage, and other aspects.
>  * Language-dependent trees::Topics and trees specific to language front ends.
>  * C and C++ Trees::  Trees specific to C and C++.
> +* Portability issues::  Portability summary for languages.
>  @end menu
>  
>  @c -
> @@ -3733,3 +3734,89 @@ In either case, the expression is void.
>  
>  
>  @end table
> +
> +
> +@node Portability issues
> +@section Portability issues
> +
> +This section summarizes portability issues when translating source languages
> +to GENERIC.  Everything written here also applies to GIMPLE.  This section
> +heavily relies on interpretation according to the C standard.
> +
> +@menu
> +* Undefined behavior::  Undefined behavior.
> +@end menu
> +
> +@node Undefined behavior
> +@subsection Undefined behavior
> +
> +The following is a list of circumstances that invoke undefined behavior.
> +
> +@itemize @bullet
> +@item
> +When the result of negation, addition, subtraction or division of two signed
> +integers or signed integer vectors not subject to @option{-fwrapv} cannot be
> +represented in the type.

Couldn't tell: is the omission of multiplication deliberate?

Richard

> +
> +@item
> +The value of the second operand of any of the division or modulo operators
> +is zero.
> +
> +@item
> +When incrementing or decrementing a pointer not subject to
> +@option{-fwrapv-pointer} wraps around zero.
> +
> +@item
> +An expression is shifted by a negative number or by an amount greater
> +than or equal to the width of the shifted operand.
> +
> +@item
> +Pointers that do not point to the same object are compared using
> +relational operators.
> +
> +@item
> +An object which has been modified is accessed through a restrict-qualified
> +pointer and another pointer that are not both based on the same object.
> +
> +@item
> +The @} that terminates a function is reached, and the value of the function
> +call is used by the caller.
> +
> +@item
> +When program execution reaches __builtin_unreachable.
> +
> +@item
> +When an object has its stored value accessed by an lvalue that
> +does not have one of the following types:
> +@itemize @minus
> +@item
> +a (qualified) type compatible with the effective type of the object
> +@item
> +a type that is the (qualified) signed or unsigned type corresponding to
> +the effective type of the object
> +@item
> +a character type, a ref-all qualified type or a type subject to
> +@option{-fno-strict-aliasing}
> +@item
> +a pointer to void with the same level of indirection as the accessed
> +pointer object
> +@end itemize
> +
> +@item
> +Addition or subtraction of a pointer into, or just beyond, an object
> +and an integer type produces a result that does not point into, or just
> +beyond when not dereferenced, the same object.
> +
> +@item
> +Pointers that do not point into, or just beyond, the same object are
> +subtracted.
> +
> +@item
> +When a pointer not pointing to actual storage is dereferenced.
> +
> +@item
> +An array subscript is out of range, even if an object is apparently 
> accessible
> +with the given subscript (as in the lvalue expression a[1][7] given the
> +declaration int a[4][5]).
> +
> +@end itemize
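
As an illustration of the first and fourth items in the list above (and of the question about multiplication), here is a minimal C sketch of how a frontend or user might test for the overflowing or out-of-range case before performing the operation. It assumes GCC's `__builtin_add_overflow` and `__builtin_mul_overflow` builtins; the helper names `checked_add`, `checked_mul` and `checked_shl` are hypothetical and are not part of the patch.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores a + b in *out when the sum is representable
   in int.  On overflow it returns false; *out then holds the wrapped
   value and should be ignored.  */
bool checked_add(int a, int b, int *out)
{
  return !__builtin_add_overflow(a, b, out);
}

/* Same contract as checked_add, for multiplication.  */
bool checked_mul(int a, int b, int *out)
{
  return !__builtin_mul_overflow(a, b, out);
}

/* A shift is only performed when 0 <= amount < width of the operand;
   anything else is rejected instead of invoking undefined behavior.  */
bool checked_shl(unsigned int value, int amount, unsigned int *out)
{
  if (amount < 0 || amount >= (int) (sizeof value * CHAR_BIT))
    return false;
  *out = value << amount;
  return true;
}
```

The checks are done before the operation because, without `-fwrapv`, the compiler may assume the overflowing case never happens and remove a test that is performed on the result afterwards.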

Re: [PATCH] Add a GCC Security policy

2023-09-20 Thread Jakub Jelinek

On Wed, Sep 20, 2023 at 08:23:32AM -0400, Siddhesh Poyarekar wrote:
> On 2023-09-20 07:58, Siddhesh Poyarekar wrote:
> > On 2023-09-20 07:55, Jakub Jelinek wrote:
> > > On Wed, Sep 20, 2023 at 07:50:43AM -0400, Siddhesh Poyarekar wrote:
> > > > +    Support libraries such as libiberty, libcc1 libvtv and libcpp have
> > > 
> > > Missing comma before libvtv.  But more importantly, libvtv is not
> > > support library like libiberty, libcpp, it is more like the sanitizer
> > > libraries runtime library for -fvtable-verify= .
> > 
> > Ack, I'll move libvtv out.
> > 
> > > And, libcc1 also isn't a compiler support library, but support library
> > > for a GDB plugin.
> > > 
> > 
> > Isn't that like libiberty then, which also gets used by other toolchain
> > projects?  Maybe calling it "Toolchain support libraries" would make it
> > more explicit?
> 
> I just noticed (ENOCOFFEE) that the line (after removing libvtv) is:
> 
> Support libraries such as libiberty, libcc1 and libcpp have been
> developed separately to share code with other tools such as binutils
> and gdb.
> 
> Does that address your concern Jakub?

I believe that is the case just for libiberty.
libcpp is I think solely used by gcc itself (several frontends use it
though, plus some build utilities in gcc).
libcc1 is code for gdb with gcc implementation details.

Jakub

Re: [PATCH] Add a GCC Security policy

2023-09-20 Thread Jakub Jelinek

On Wed, Sep 20, 2023 at 07:58:04AM -0400, Siddhesh Poyarekar wrote:
> On 2023-09-20 07:55, Jakub Jelinek wrote:
> > On Wed, Sep 20, 2023 at 07:50:43AM -0400, Siddhesh Poyarekar wrote:
> > > +Support libraries such as libiberty, libcc1 libvtv and libcpp have
> > 
> > Missing comma before libvtv.  But more importantly, libvtv is not
> > support library like libiberty, libcpp, it is more like the sanitizer
> > libraries runtime library for -fvtable-verify= .
> 
> Ack, I'll move libvtv out.
> 
> > And, libcc1 also isn't a compiler support library, but support library
> > for a GDB plugin.
> > 
> 
> Isn't that like libiberty then, which also gets used by other toolchain
> projects?  Maybe calling it "Toolchain support libraries" would make it more
> explicit?

Not really.  libiberty is a static only library with some useful routines
for all the projects which each of them links in.  libcc1 is a shared
library which gdb uses to implement the eval command (or how is it called).

Jakub

Re: [PATCH] Add a GCC Security policy

2023-09-20 Thread Siddhesh Poyarekar


On 2023-09-20 07:58, Siddhesh Poyarekar wrote:

On 2023-09-20 07:55, Jakub Jelinek wrote:

On Wed, Sep 20, 2023 at 07:50:43AM -0400, Siddhesh Poyarekar wrote:

+    Support libraries such as libiberty, libcc1 libvtv and libcpp have


Missing comma before libvtv.  But more importantly, libvtv is not
support library like libiberty, libcpp, it is more like the sanitizer
libraries runtime library for -fvtable-verify= .


Ack, I'll move libvtv out.


And, libcc1 also isn't a compiler support library, but support library
for a GDB plugin.



Isn't that like libiberty then, which also gets used by other toolchain 
projects?  Maybe calling it "Toolchain support libraries" would make it 
more explicit?


I just noticed (ENOCOFFEE) that the line (after removing libvtv) is:

Support libraries such as libiberty, libcc1 and libcpp have been
developed separately to share code with other tools such as binutils
and gdb.

Does that address your concern Jakub?

Thanks,
Sid

[PATCH 2/3] build: Add libgrust as compilation modules

2023-09-20 Thread Arthur Cohen

From: Pierre-Emmanuel Patry 

Define the libgrust directory as a host compilation module as well as
for targets.

ChangeLog:

* Makefile.def: Add libgrust as host & target module.
* configure.ac: Add libgrust to host tools list.

gcc/rust/ChangeLog:

* config-lang.in: Add libgrust as a target module for the rust
language.

Signed-off-by: Pierre-Emmanuel Patry 
---
 Makefile.def| 2 ++
 configure.ac| 3 ++-
 gcc/rust/config-lang.in | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/Makefile.def b/Makefile.def
index 870150183b9..3df3fc18d14 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -149,6 +149,7 @@ host_modules= { module= libcc1; 
extra_configure_flags=--enable-shared; };
 host_modules= { module= gotools; };
 host_modules= { module= libctf; bootstrap=true; };
 host_modules= { module= libsframe; bootstrap=true; };
+host_modules= { module= libgrust; };
 
 target_modules = { module= libstdc++-v3;
   bootstrap=true;
@@ -192,6 +193,7 @@ target_modules = { module= libgm2; lib_path=.libs; };
 target_modules = { module= libgomp; bootstrap= true; lib_path=.libs; };
 target_modules = { module= libitm; lib_path=.libs; };
 target_modules = { module= libatomic; bootstrap=true; lib_path=.libs; };
+target_modules = { module= libgrust; };
 
 // These are (some of) the make targets to be done in each subdirectory.
 // Not all; these are the ones which don't have special options.
diff --git a/configure.ac b/configure.ac
index 1d16530140a..036e5945905 100644
--- a/configure.ac
+++ b/configure.ac
@@ -133,7 +133,7 @@ build_tools="build-texinfo build-flex build-bison build-m4 
build-fixincludes"
 
 # these libraries are used by various programs built for the host environment
 #f
-host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib 
libbacktrace libcpp libcody libdecnumber gmp mpfr mpc isl libiconv libctf 
libsframe"
+host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib 
libbacktrace libcpp libcody libdecnumber gmp mpfr mpc isl libiconv libctf 
libsframe libgrust "
 
 # these tools are built for the host environment
 # Note, the powerpc-eabi build depends on sim occurring before gdb in order to
@@ -164,6 +164,7 @@ target_libraries="target-libgcc \
target-libada \
target-libgm2 \
target-libgo \
+   target-libgrust \
target-libphobos \
target-zlib"
 
diff --git a/gcc/rust/config-lang.in b/gcc/rust/config-lang.in
index aac66c9b962..8f071dcb0bf 100644
--- a/gcc/rust/config-lang.in
+++ b/gcc/rust/config-lang.in
@@ -29,4 +29,6 @@ compilers="rust1\$(exeext)"
 
 build_by_default="no"
 
+target_libs="target-libffi target-libbacktrace target-libgrust"
+
 gtfiles="\$(srcdir)/rust/rust-lang.cc"
-- 
2.42.0

[PATCH 1/3] librust: Add libproc_macro and build system

2023-09-20 Thread Arthur Cohen

From: Pierre-Emmanuel Patry 

This patch series adds the build system changes to allow the Rust
frontend to develop and distribute its own libraries. The first library
we have been working on is the `proc_macro` library, comprised of a C++
library as well as a user-facing Rust library.

Follow-up commits containing the actual library code will come later.
Should I submit patches to the MAINTAINERS file to allow Philip and me to
commit to this folder without prior approval?

This first commit adds a simple `libgrust` folder on top of which the
full library will be built.

All the best,

Arthur

-

Add some dummy files in libproc_macro along with its build system.

ChangeLog:

* libgrust/Makefile.am: New file.
* libgrust/configure.ac: New file.
* libgrust/libproc_macro/Makefile.am: New file.
* libgrust/libproc_macro/proc_macro.cc: New file.
* libgrust/libproc_macro/proc_macro.h: New file.

Signed-off-by: Pierre-Emmanuel Patry 
---
 libgrust/Makefile.am |  68 
 libgrust/configure.ac| 113 +++
 libgrust/libproc_macro/Makefile.am   |  58 ++
 libgrust/libproc_macro/proc_macro.cc |   7 ++
 libgrust/libproc_macro/proc_macro.h  |   7 ++
 5 files changed, 253 insertions(+)
 create mode 100644 libgrust/Makefile.am
 create mode 100644 libgrust/configure.ac
 create mode 100644 libgrust/libproc_macro/Makefile.am
 create mode 100644 libgrust/libproc_macro/proc_macro.cc
 create mode 100644 libgrust/libproc_macro/proc_macro.h

diff --git a/libgrust/Makefile.am b/libgrust/Makefile.am
new file mode 100644
index 000..8e5274922c5
--- /dev/null
+++ b/libgrust/Makefile.am
@@ -0,0 +1,68 @@
+AUTOMAKE_OPTIONS = 1.8 foreign
+
+SUFFIXES = .c .rs .def .o .lo .a
+
+ACLOCAL_AMFLAGS = -I . -I .. -I ../config
+
+AM_CFLAGS = -I $(srcdir)/../libgcc -I $(MULTIBUILDTOP)../../gcc/include
+
+TOP_GCCDIR := $(shell cd $(top_srcdir) && cd .. && pwd)
+
+GCC_DIR = $(TOP_GCCDIR)/gcc
+RUST_SRC = $(GCC_DIR)/rust
+
+toolexeclibdir=@toolexeclibdir@
+toolexecdir=@toolexecdir@
+
+SUBDIRS = libproc_macro
+
+RUST_BUILDDIR := $(shell pwd)
+
+# Work around what appears to be a GNU make bug handling MAKEFLAGS
+# values defined in terms of make variables, as is the case for CC and
+# friends when we are called from the top level Makefile.
+AM_MAKEFLAGS = \
+"GCC_DIR=$(GCC_DIR)" \
+"RUST_SRC=$(RUST_SRC)" \
+   "AR_FLAGS=$(AR_FLAGS)" \
+   "CC_FOR_BUILD=$(CC_FOR_BUILD)" \
+   "CC_FOR_TARGET=$(CC_FOR_TARGET)" \
+   "RUST_FOR_TARGET=$(RUST_FOR_TARGET)" \
+   "CFLAGS=$(CFLAGS)" \
+   "CXXFLAGS=$(CXXFLAGS)" \
+   "CFLAGS_FOR_BUILD=$(CFLAGS_FOR_BUILD)" \
+   "CFLAGS_FOR_TARGET=$(CFLAGS_FOR_TARGET)" \
+   "INSTALL=$(INSTALL)" \
+   "INSTALL_DATA=$(INSTALL_DATA)" \
+   "INSTALL_PROGRAM=$(INSTALL_PROGRAM)" \
+   "INSTALL_SCRIPT=$(INSTALL_SCRIPT)" \
+   "LDFLAGS=$(LDFLAGS)" \
+   "LIBCFLAGS=$(LIBCFLAGS)" \
+   "LIBCFLAGS_FOR_TARGET=$(LIBCFLAGS_FOR_TARGET)" \
+   "MAKE=$(MAKE)" \
+   "MAKEINFO=$(MAKEINFO) $(MAKEINFOFLAGS)" \
+   "PICFLAG=$(PICFLAG)" \
+   "PICFLAG_FOR_TARGET=$(PICFLAG_FOR_TARGET)" \
+   "SHELL=$(SHELL)" \
+   "RUNTESTFLAGS=$(RUNTESTFLAGS)" \
+   "exec_prefix=$(exec_prefix)" \
+   "infodir=$(infodir)" \
+   "libdir=$(libdir)" \
+   "includedir=$(includedir)" \
+   "prefix=$(prefix)" \
+   "tooldir=$(tooldir)" \
+   "gxx_include_dir=$(gxx_include_dir)" \
+   "AR=$(AR)" \
+   "AS=$(AS)" \
+   "LD=$(LD)" \
+   "RANLIB=$(RANLIB)" \
+   "NM=$(NM)" \
+   "NM_FOR_BUILD=$(NM_FOR_BUILD)" \
+   "NM_FOR_TARGET=$(NM_FOR_TARGET)" \
+   "DESTDIR=$(DESTDIR)" \
+   "WERROR=$(WERROR)" \
+"TARGET_LIB_PATH=$(TARGET_LIB_PATH)" \
+"TARGET_LIB_PATH_librust=$(TARGET_LIB_PATH_librust)" \
+   "LIBTOOL=$(RUST_BUILDDIR)/libtool"
+
+include $(top_srcdir)/../multilib.am
diff --git a/libgrust/configure.ac b/libgrust/configure.ac
new file mode 100644
index 000..7aed489a643
--- /dev/null
+++ b/libgrust/configure.ac
@@ -0,0 +1,113 @@
+AC_INIT([libgrust], version-unused,,librust)
+AC_CONFIG_SRCDIR(Makefile.am)
+AC_CONFIG_FILES([Makefile])
+
+# AM_ENABLE_MULTILIB(, ..)
+
+# Do not delete or change the following two lines.  For why, see
+# http://gcc.gnu.org/ml/libstdc++/2003-07/msg00451.html
+AC_CANONICAL_SYSTEM
+target_alias=${target_alias-$host_alias}
+AC_SUBST(target_alias)
+
+# Automake should never attempt to rebuild configure
+AM_MAINTAINER_MODE
+
+AM_INIT_AUTOMAKE([1.15.1 foreign no-dist -Wall])
+
+# Make sure we don't test executables when making cross-tools.
+GCC_NO_EXECUTABLES
+
+
+# Add the ability to change LIBTOOL directory
+GCC_WITH_TOOLEXECLIBDIR
+
+# Use system specific extensions
+AC_USE_SYSTEM_EXTENSIONS
+
+
+# Checks for header files.
+AC_HEADER_STDC
+AC_HEADER_SYS_WAIT
+AC_CHECK_HEADERS(limits.h stddef.h string.h strings.h stdlib.h \
+

Re: [PATCH] Add a GCC Security policy

2023-09-20 Thread Siddhesh Poyarekar


On 2023-09-20 07:55, Jakub Jelinek wrote:

On Wed, Sep 20, 2023 at 07:50:43AM -0400, Siddhesh Poyarekar wrote:

+Support libraries such as libiberty, libcc1 libvtv and libcpp have


Missing comma before libvtv.  But more importantly, libvtv is not
support library like libiberty, libcpp, it is more like the sanitizer
libraries runtime library for -fvtable-verify= .


Ack, I'll move libvtv out.


And, libcc1 also isn't a compiler support library, but support library
for a GDB plugin.



Isn't that like libiberty then, which also gets used by other toolchain 
projects?  Maybe calling it "Toolchain support libraries" would make it 
more explicit?


Thanks,
Sid

Re: [PATCH] Add a GCC Security policy

2023-09-20 Thread Jakub Jelinek

On Wed, Sep 20, 2023 at 07:50:43AM -0400, Siddhesh Poyarekar wrote:
> +Support libraries such as libiberty, libcc1 libvtv and libcpp have

Missing comma before libvtv.  But more importantly, libvtv is not
support library like libiberty, libcpp, it is more like the sanitizer
libraries runtime library for -fvtable-verify= .
And, libcc1 also isn't a compiler support library, but support library
for a GDB plugin.

Jakub

[PATCH] Add a GCC Security policy

2023-09-20 Thread Siddhesh Poyarekar

Define a security process and exclusions to security issues for GCC and
all components it ships.

Signed-off-by: Siddhesh Poyarekar 
---

Sending as a proper patch since there have been no further comments on
the RFC.  I toyed with the idea of making the distinction of
"exploitable vulnerability" vs "missed hardening" more explicit near the
top of the document but decided against further tinkering in the end
since we already have a proper section dealing with it.  Instead I made
the language in the hardening section a bit more explicit, clarifying
that missed hardening is not an *exploitable vulnerability*, which
hopefully resolves the contradiction of a bug in a security feature not
being a security bug.

I also added the AdaCore security contact at Arnaud's request.

Thanks,
Sid

 SECURITY.txt | 202 +++
 1 file changed, 202 insertions(+)
 create mode 100644 SECURITY.txt

diff --git a/SECURITY.txt b/SECURITY.txt
new file mode 100644
index 000..d2161f03bf5
--- /dev/null
+++ b/SECURITY.txt
@@ -0,0 +1,202 @@
+What is a GCC security bug?
+===
+
+A security bug is one that threatens the security of a system or
+network, or might compromise the security of data stored on it.
+In the context of GCC there are multiple ways in which this might
+happen and they're detailed below.
+
+Compiler drivers, programs, libgccjit and support libraries
+---
+
+The compiler driver processes source code, invokes other programs
+such as the assembler and linker and generates the output result,
+which may be assembly code or machine code.  Compiling untrusted
+sources can result in arbitrary code execution and unconstrained
+resource consumption in the compiler. As a result, compilation of
+such code should be done inside a sandboxed environment to ensure
+that it does not compromise the development environment.
+
+The libgccjit library can, despite the name, be used both for
+ahead-of-time compilation and for just-in-time compilation.  In both
+cases it can be used to translate input representations (such as
+source code) in the application context; in the latter case the
+generated code is also run in the application context.
+
+Limitations that apply to the compiler driver, apply here too in
+terms of sanitizing inputs and it is recommended that both the
+compilation *and* execution context of the code are appropriately
+sandboxed to contain the effects of any bugs in libgccjit, the
+application code using it, or its generated code to the sandboxed
+environment.
+
+Support libraries such as libiberty, libcc1 libvtv and libcpp have
+been developed separately to share code with other tools such as
+binutils and gdb.  These libraries again have similar challenges to
+compiler drivers.  While they are expected to be robust against
+arbitrary input, they should only be used with trusted inputs.
+
+Libraries such as zlib that are bundled into GCC to build it will be
+treated the same as the compiler drivers and programs as far as
+security coverage is concerned.  However if you find an issue in
+these libraries independent of their use in GCC, you should reach
+out to their upstream projects to report them.
+
+As a result, the only case for a potential security issue in the
+compiler is when it generates vulnerable application code for
+trusted input source code that is conforming to the relevant
+programming standard or extensions documented as supported by GCC
+and the algorithm expressed in the source code does not have the
+vulnerability.  The output application code could be considered
+vulnerable if it produces an actual vulnerability in the target
+application, specifically in the following cases:
+
+- The application dereferences an invalid memory location despite
+  the application sources being valid.
+- The application reads from or writes to a valid but incorrect
+  memory location, resulting in an information integrity issue or an
+  information leak.
+- The application ends up running in an infinite loop or with
+  severe degradation in performance despite the input sources having
+  no such issue, resulting in a Denial of Service.  Note that
+  correct but non-performant code is not a security issue candidate,
+  this only applies to incorrect code that may result in performance
+  degradation severe enough to amount to a denial of service.
+- The application crashes due to the generated incorrect code,
+  resulting in a Denial of Service.
+
+Language runtime libraries
+--
+
+GCC also builds and distributes libraries that are intended to be
+used widely to implement runtime support for various programming
+languages.  These include the following:
+

[PATCH] OpenMP: Support accelerated 2D/3D memory copies for AMD GCN

2023-09-20 Thread Julian Brown

This patch adds support for 2D/3D memory copies for omp_target_memcpy_rect
and "target update", using AMD extensions to the HSA API.  I've just
committed a version of this patch to the og13 branch, but this is the
mainline version.

Support is also added for 1-dimensional strided accesses: these are
treated as a special case of 2-dimensional transfers, where the innermost
dimension is formed from the stride length (in bytes).
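
One way to read the strided case, shown here as a host-only C sketch: each selected element becomes a one-element "row", and the distance between rows (the pitch) is the stride expressed in bytes. The functions `copy_rect_2d` and `copy_strided_1d` are hypothetical stand-ins built on `memcpy`; they are not the plugin's `GOMP_OFFLOAD_memcpy2d` and do not touch the HSA API.

```c
#include <stddef.h>
#include <string.h>

/* Host-only stand-in for a rectangular copy: copy `rows` rows of
   `row_bytes` bytes each, where consecutive rows start `dst_pitch`
   and `src_pitch` bytes apart.  */
void copy_rect_2d(void *dst, size_t dst_pitch,
                  const void *src, size_t src_pitch,
                  size_t row_bytes, size_t rows)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t r = 0; r < rows; r++)
    memcpy(d + r * dst_pitch, s + r * src_pitch, row_bytes);
}

/* Copy `count` elements of `elem_size` bytes, taking every
   `stride`-th element on both sides.  Each element is treated as a
   row of width elem_size whose pitch is the stride in bytes.  */
void copy_strided_1d(void *dst, const void *src, size_t elem_size,
                     size_t count, size_t stride)
{
  size_t pitch = stride * elem_size;
  copy_rect_2d(dst, pitch, src, pitch, elem_size, count);
}
```

For example, calling `copy_strided_1d (dst, src, sizeof (int), 4, 2)` on two 8-element `int` arrays copies elements 0, 2, 4 and 6 and leaves the odd-indexed destination elements untouched.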

This patch has (somewhat awkwardly from a review perspective) been merged
on top of the following list of in-review series:

"OpenMP/OpenACC: map clause and OMP gimplify rework":
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627895.html

"OpenMP: lvalue parsing and "declare mapper" support":
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629363.html

"OpenMP: Array-shaping operator and strided/rectangular 'target update'
support":
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629422.html

"OpenMP: Enable 'declare mapper' mappers for 'target update' directives":
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629432.html

Though it only depends directly on parts of that work (regarding
strided/rectangular updates).  A stand-alone version that just works
for the OpenMP API routine omp_target_memcpy_rect could be prepared to
apply separately, if preferable.

This version has been re-tested and bootstrapped.  OK?

2023-09-20  Julian Brown  

libgomp/
* plugin/plugin-gcn.c (hsa_runtime_fn_info): Add
hsa_amd_memory_lock_fn, hsa_amd_memory_unlock_fn,
hsa_amd_memory_async_copy_rect_fn function pointers.
(init_hsa_runtime_functions): Add above functions, with
DLSYM_OPT_FN.
(GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New functions.
* target.c (omp_target_memcpy_rect_worker): Add 1D strided transfer
support.
---
 libgomp/plugin/plugin-gcn.c | 359 
 libgomp/target.c|  31 
 2 files changed, 390 insertions(+)

diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index ef22d48da79..95c0a57e792 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -196,6 +196,16 @@ struct hsa_runtime_fn_info
   hsa_status_t (*hsa_code_object_deserialize_fn)
 (void *serialized_code_object, size_t serialized_code_object_size,
  const char *options, hsa_code_object_t *code_object);
+  hsa_status_t (*hsa_amd_memory_lock_fn)
+(void *host_ptr, size_t size, hsa_agent_t *agents, int num_agent,
+ void **agent_ptr);
+  hsa_status_t (*hsa_amd_memory_unlock_fn) (void *host_ptr);
+  hsa_status_t (*hsa_amd_memory_async_copy_rect_fn)
+(const hsa_pitched_ptr_t *dst, const hsa_dim3_t *dst_offset,
+ const hsa_pitched_ptr_t *src, const hsa_dim3_t *src_offset,
+ const hsa_dim3_t *range, hsa_agent_t copy_agent,
+ hsa_amd_copy_direction_t dir, uint32_t num_dep_signals,
+ const hsa_signal_t *dep_signals, hsa_signal_t completion_signal);
 };
 
 /* Structure describing the run-time and grid properties of an HSA kernel
@@ -1398,6 +1408,9 @@ init_hsa_runtime_functions (void)
   DLSYM_FN (hsa_signal_load_acquire)
   DLSYM_FN (hsa_queue_destroy)
   DLSYM_FN (hsa_code_object_deserialize)
+  DLSYM_OPT_FN (hsa_amd_memory_lock)
+  DLSYM_OPT_FN (hsa_amd_memory_unlock)
+  DLSYM_OPT_FN (hsa_amd_memory_async_copy_rect)
   return true;
 #undef DLSYM_FN
 }
@@ -3790,6 +3803,352 @@ GOMP_OFFLOAD_dev2dev (int device, void *dst, const void 
*src, size_t n)
   return true;
 }
 
+/* Here <quantity>_size refers to <quantity> multiplied by size -- i.e.
+   measured in bytes.  So we have:
+
+   dim1_size: number of bytes to copy on innermost dimension ("row")
+   dim0_len: number of rows to copy
+   dst: base pointer for destination of copy
+   dst_offset1_size: innermost row offset (for dest), in bytes
+   dst_offset0_len: offset, number of rows (for dest)
+   dst_dim1_size: whole-array dest row length, in bytes (pitch)
+   src: base pointer for source of copy
+   src_offset1_size: innermost row offset (for source), in bytes
+   src_offset0_len: offset, number of rows (for source)
+   src_dim1_size: whole-array source row length, in bytes (pitch)
+*/
+
+int
+GOMP_OFFLOAD_memcpy2d (int dst_ord, int src_ord, size_t dim1_size,
+  size_t dim0_len, void *dst, size_t dst_offset1_size,
+  size_t dst_offset0_len, size_t dst_dim1_size,
+  const void *src, size_t src_offset1_size,
+  size_t src_offset0_len, size_t src_dim1_size)
+{
+  if (!hsa_fns.hsa_amd_memory_lock_fn
+  || !hsa_fns.hsa_amd_memory_unlock_fn
+  || !hsa_fns.hsa_amd_memory_async_copy_rect_fn)
+return -1;
+
+  /* GCN hardware requires 4-byte alignment for base addresses & pitches.  Bail
+ out quietly if we have anything oddly-aligned rather than letting the
+ driver raise an error.  */
+  if ((((uintptr_t) dst) & 3) != 0 || (((uintptr_t) src) & 3) != 0)
+return -1;
+
+  if ((dst_dim1_size & 3)

Re: RFC: Introduce -fhardened to enable security-related flags

2023-09-20 Thread jvoisin

I'd like to provide some data-points on hardening-related flags, as I've
spent some time with Sam documenting their usage across various
distributions here[1]. I also attached the relevant file to this email
for archiving purposes.

tl;dr: the suggested flag selection for `-fhardened` is not only sound
but great. The only issue I see is `-fcf-protection=full`, since musl
libc doesn't support it for now, so it might cause runtime issues.

1. https://github.com/jvoisin/compiler-flags-distro/blob/main/README.md

-- 
Julien (jvoisin) Voisin
GPG: 04D041E8171901CC
dustri.org

# Usage of enabled-by-default hardening-related compiler flags across Linux distributions

|.| Alpine | Debian | Fedora| Gentoo Hardened | Ubuntu | OpenSUSE | ArchLinux | OpenBSD | Chimera Linux | Android |
|---|---|---|---|---|---|---|---|---|---|---|
|`-D_FORTIFY_SOURCE=2`|[yes](https://gitlab.alpinelinux.org/alpine/tsc/-/issues/64)|[2011](https://github.com/guillemj/dpkg/commit/f3bb7d4939ae95cf44c89e8f599e7ed5da431e57)|[2007](https://listman.redhat.com/archives/fedora-devel-announce/2007-September/msg00015.html)|superseded|[2008](https://wiki.ubuntu.com/ToolChain/CompilerFlags#A-D_FORTIFY_SOURCE.3D2)|[2005](https://en.opensuse.org/openSUSE:Security_Features)|[2021](https://gitlab.archlinux.org/archlinux/packaging/packages/pacman/-/commit/f409a72342bf37017f190021970efaaeac1bb619)|?|[yes](https://github.com/chimera-linux/cports/commit/9b78e55067f024b8dbf9fbceb472e8705f84ed5d)|[2017](https://android-developers.googleblog.com/2019/10/introducing-ndk-r21-our-first-long-term.html)|
|`-D_FORTIFY_SOURCE=3`|no  |[no](https://wiki.debian.org/Hardening)|[2023](https://fedoraproject.org/wiki/Changes/Add_FORTIFY_SOURCE%3D3_to_distribution_build_flags)|[2022](https://bugs.gentoo.org/876893)|[no](https://bugs.launchpad.net/ubuntu/+source/gcc-12/+bug/2012440)|[2023](https://en.opensuse.org/openSUSE:Security_Features)|[not](https://gitlab.archlinux.org/archlinux/rfcs/-/merge_requests/17) [yet](https://gitlab.archlinux.org/archlinux/devtools/-/merge_requests/191)|?|no|[no](https://android.googlesource.com/platform/bionic.git/+/HEAD/docs/status.md#fortify)|
|`-D_GLIBCXX_ASSERTIONS`  |[2023](https://gitlab.alpinelinux.org/alpine/abuild/-/commit/44c933da5d8e364d6cd755071f629c05444191df)|no|[2018](https://fedoraproject.org/wiki/Changes/HardeningFlags28)|[2022](https://bugs.gentoo.org/876895)|[no](https://bugs.launchpad.net/ubuntu/+source/gcc-12/+bug/2016042)|yes|[2021](https://gitlab.archlinux.org/archlinux/packaging/packages/pacman/-/commit/f409a72342bf37017f190021970efaaeac1bb619)|no|no|no|
|`-D_LIBCPP_ENABLE_HARDENED_MODE` (llvm17) |[not yet](https://gitlab.alpinelinux.org/alpine/abuild/-/commit/65b5d578b2d9e3f170bc9d31dcd23f0014cfc36e)[^1]|no|no|[2023](https://bugs.gentoo.org/85)|no|no|no|?|?|no|
|`-D_LIBCXX_ENABLE_ASSERTIONS` (llvm16) |no|no|no|superseded|no|no|no|?|[yes](https://github.com/search?q=repo%3Achimera-linux%2Fcports+DLIBCXX_ENABLE_ASSERTIONS=code)|?|
|`-Wformat -Wformat-security`/`-Wformat=2` |[2023](https://gitlab.alpinelinux.org/alpine/abuild/-/commit/ca8375f0e9d1715e38c14c918c675d6774f1eabc)|[2011](https://salsa.debian.org/toolchain-team/gcc/-/blob/master/debian/patches/gcc-distro-specs.diff)|[2013](https://fedoraproject.org/wiki/Changes/FormatSecurity)|[2009](https://bugs.gentoo.org/259417)|[2008](https://wiki.ubuntu.com/ToolChain/CompilerFlags)|yes|[2021](https://gitlab.archlinux.org/archlinux/packaging/packages/pacman/-/commit/f409a72342bf37017f190021970efaaeac1bb619)|?|[2023](https://github.com/chimera-linux/cports/commit/ad898a6b645b11dee989f4504e89577f5395ba24)|[2010](https://source.android.com/docs/security/enhancements/enhancements41)|
|`-Wl,-z,noexecstack` |yes|yes|yes|yes|yes|yes|yes|yes|yes|yes|
|`-Wl,-z,relro`/`-Wl,-z,now`  |[yes](https://gitlab.alpinelinux.org/alpine/tsc/-/issues/64)|[yes](https://salsa.debian.org/toolchain-team/gcc/-/blob/master/debian/patches/gcc-distro-specs.diff)|[2015](https://fedoraproject.org/wiki/Security_Features_Matrix#Built_as_PIE)|[yes](https://wiki.gentoo.org/wiki/Hardened/Toolchain)|[2008](https://wiki.ubuntu.com/ToolChain/CompilerFlags)|[2006](https://en.opensuse.org/openSUSE:Security_Features)|[2017](https://gitlab.archlinux.org/archlinux/packaging/packages/pacman/-/commit/b4b2bb56174493ea2e60b1eecc0085db421908cc)|?|[yes](https://github.com/chimera-linux/cports/commit/9b78e55067f024b8dbf9fbceb472e8705f84ed5d)|[2013](https://source.android.com/docs/security/enhancements/enhancements43)|
|`-fPIE`/`-fPIC`/…

[PATCH 3/3] [og13] OpenMP: Support accelerated 2D/3D memory copies for AMD GCN

2023-09-20 Thread Julian Brown

This patch adds support for 2D/3D memory copies for omp_target_memcpy_rect
and "target update", using AMD extensions to the HSA API.  This follows
Tobias's similar patch for NVPTX (already present on mainline and higher
up this patch series for og13).

Support is also added for 1-dimensional strided accesses: these are
treated as a special case of 2-dimensional transfers, where the innermost
dimension is formed from the stride length (in bytes).

2023-09-19  Julian Brown  

libgomp/
* plugin/plugin-gcn.c (hsa_runtime_fn_info): Add hsa_amd_memory_lock_fn,
hsa_amd_memory_unlock_fn, hsa_amd_memory_async_copy_rect_fn function
pointers.
(init_hsa_runtime_functions): Add above functions, with DLSYM_OPT_FN.
(GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New functions.
* target.c (omp_target_memcpy_rect_worker): Add 1D strided transfer
support.
---
 libgomp/plugin/plugin-gcn.c | 359 
 libgomp/target.c|  31 
 2 files changed, 390 insertions(+)

diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 3df9fbe8d80..b8b95d75214 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -211,6 +211,16 @@ struct hsa_runtime_fn_info
   hsa_status_t (*hsa_amd_svm_attributes_set_fn)
 (void* ptr, size_t size, hsa_amd_svm_attribute_pair_t* attribute_list,
  size_t attribute_count);
+  hsa_status_t (*hsa_amd_memory_lock_fn)
+(void *host_ptr, size_t size, hsa_agent_t *agents, int num_agent,
+ void **agent_ptr);
+  hsa_status_t (*hsa_amd_memory_unlock_fn) (void *host_ptr);
+  hsa_status_t (*hsa_amd_memory_async_copy_rect_fn)
+(const hsa_pitched_ptr_t *dst, const hsa_dim3_t *dst_offset,
+ const hsa_pitched_ptr_t *src, const hsa_dim3_t *src_offset,
+ const hsa_dim3_t *range, hsa_agent_t copy_agent,
+ hsa_amd_copy_direction_t dir, uint32_t num_dep_signals,
+ const hsa_signal_t *dep_signals, hsa_signal_t completion_signal);
 };
 
 /* Structure describing the run-time and grid properties of an HSA kernel
@@ -1439,6 +1449,9 @@ init_hsa_runtime_functions (void)
   DLSYM_FN (hsa_queue_destroy)
   DLSYM_FN (hsa_code_object_deserialize)
   DLSYM_OPT_FN (hsa_amd_svm_attributes_set)
+  DLSYM_OPT_FN (hsa_amd_memory_lock)
+  DLSYM_OPT_FN (hsa_amd_memory_unlock)
+  DLSYM_OPT_FN (hsa_amd_memory_async_copy_rect)
   return true;
 #undef DLSYM_FN
 }
@@ -3948,6 +3961,352 @@ GOMP_OFFLOAD_dev2dev (int device, void *dst, const void 
*src, size_t n)
   return true;
 }
 
+/* Here <quantity>_size refers to <quantity> multiplied by size -- i.e.
+   measured in bytes.  So we have:
+
+   dim1_size: number of bytes to copy on innermost dimension ("row")
+   dim0_len: number of rows to copy
+   dst: base pointer for destination of copy
+   dst_offset1_size: innermost row offset (for dest), in bytes
+   dst_offset0_len: offset, number of rows (for dest)
+   dst_dim1_size: whole-array dest row length, in bytes (pitch)
+   src: base pointer for source of copy
+   src_offset1_size: innermost row offset (for source), in bytes
+   src_offset0_len: offset, number of rows (for source)
+   src_dim1_size: whole-array source row length, in bytes (pitch)
+*/
+
+int
+GOMP_OFFLOAD_memcpy2d (int dst_ord, int src_ord, size_t dim1_size,
+  size_t dim0_len, void *dst, size_t dst_offset1_size,
+  size_t dst_offset0_len, size_t dst_dim1_size,
+  const void *src, size_t src_offset1_size,
+  size_t src_offset0_len, size_t src_dim1_size)
+{
+  if (!hsa_fns.hsa_amd_memory_lock_fn
+  || !hsa_fns.hsa_amd_memory_unlock_fn
+  || !hsa_fns.hsa_amd_memory_async_copy_rect_fn)
+return -1;
+
+  /* GCN hardware requires 4-byte alignment for base addresses & pitches.  Bail
+ out quietly if we have anything oddly-aligned rather than letting the
+ driver raise an error.  */
+  if ((((uintptr_t) dst) & 3) != 0 || (((uintptr_t) src) & 3) != 0)
+return -1;
+
+  if ((dst_dim1_size & 3) != 0 || (src_dim1_size & 3) != 0)
+return -1;
+
+  /* Only handle host to device or device to host transfers here.  */
+  if ((dst_ord == -1 && src_ord == -1)
+  || (dst_ord != -1 && src_ord != -1))
+return -1;
+
+  hsa_amd_copy_direction_t dir
+= (src_ord == -1) ? hsaHostToDevice : hsaDeviceToHost;
+  hsa_agent_t copy_agent;
+
+  /* We need to pin (lock) host memory before we start the transfer.  Try to
+ lock the minimum size necessary, i.e. using partial first/last rows of the
+ whole array.  Something like this:
+
+rows -->
+..
+ c | ..###+ <- first row apart from {src,dst}_offset1_size
+ o | ++###+ <- whole row
+ l | ++###+ <- "
+ s v ++###. <- last row apart from trailing remainder
+..
+
+ We could split very large transfers into several rectangular copies, but
+ that is unimplemented for now.  */
+
+  size_t

[PATCH 1/3] [og13] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect

2023-09-20 Thread Julian Brown

From: Tobias Burnus 

This is a version of Tobias's mainline patch of the same name,
merged to og13 and with the followup patch "libgomp: cuda.h and
omp_target_memcpy_rect cleanup" folded in.  A couple of merge conflicts
have also been resolved, mostly regarding "gomp_update".  Tobias's
original log message follows.

When copying a 2D or 3D rectangular memory block, the performance is
better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the
data one by one. That's what this commit does.

Additionally, it permits device-to-device copies, if necessary using a
temporary variable on the host.
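
For readers unfamiliar with pitched rectangular copies, here is a minimal host-only sketch of what such a copy does, row by row.  It is not the libgomp or CUDA implementation: copy_rect_2d and all of its parameter names are hypothetical, chosen to mirror the pitch/offset terminology used in this series.  A driver routine like cuMemcpy2D performs the equivalent transfer in a single call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libgomp): copy HEIGHT rows of
   WIDTH_BYTES bytes each between two pitched 2D arrays in host memory.
   A "pitch" is the byte length of one whole row of the enclosing array.  */
void
copy_rect_2d (void *dst, size_t dst_pitch, size_t dst_x_bytes, size_t dst_y,
	      const void *src, size_t src_pitch, size_t src_x_bytes,
	      size_t src_y, size_t width_bytes, size_t height)
{
  /* The copied row segment must fit inside one row on both sides.  */
  assert (dst_x_bytes + width_bytes <= dst_pitch);
  assert (src_x_bytes + width_bytes <= src_pitch);

  /* Address of the first byte to transfer on each side.  */
  char *d = (char *) dst + dst_y * dst_pitch + dst_x_bytes;
  const char *s = (const char *) src + src_y * src_pitch + src_x_bytes;

  /* One memcpy per row; this is the "one by one" style of copying.  */
  for (size_t row = 0; row < height; row++)
    memcpy (d + row * dst_pitch, s + row * src_pitch, width_bytes);
}
```

Each loop iteration here corresponds to a separate transfer when the data lives on a device, which is why handing the whole rectangle to the driver at once tends to be faster.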

2023-09-19  Tobias Burnus  
Julian Brown  

include/
* cuda/cuda.h (CUresult): Add CUDA_ERROR_NOT_INITIALIZED,
CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE.
(CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D,
CUDA_MEMCPY3D_PEER): New typedefs.
(cuMemcpyPeer, cuMemcpyPeerAsync, cuMemcpy2D, cuMemcpy2DAsync,
cuMemcpy2DUnaligned, cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer,
cuMemcpy3DPeerAsync): New prototypes.

libgomp/
* libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d,
GOMP_OFFLOAD_memcpy3d): New prototypes.
* libgomp.h (struct gomp_device_descr): Add memcpy2d_func
and memcpy3d_func.
* libgomp.texi (nvptx): Document when cuMemcpy2D/cuMemcpy3D is used.
* oacc-host.c (memcpy2d_func, .memcpy3d_func): Init with NULL.
* plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned,
cuMemcpy3D): Invoke via CUDA_ONE_CALL.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
GOMP_OFFLOAD_memcpy3d): New.
* target.c (omp_target_memcpy_rect_worker): Update prototype.
(omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy):
Permit all device-to-device copies; invoke new plugins for
2D and 3D copying when available.
(gomp_update): Update calls to omp_target_memcpy_rect_worker.  Ensure
that tmp space is not allocated here.
(gomp_load_plugin_for_device): DLSYM the new plugin functions.
* testsuite/libgomp.c/target-12.c: Fix dimension bug.
* testsuite/libgomp.fortran/target-12.f90: Likewise.
* testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test.
---
 include/cuda/cuda.h   |  87 +++
 libgomp/libgomp-plugin.h  |   7 +
 libgomp/libgomp.h |   2 +
 libgomp/libgomp.texi  |   5 +
 libgomp/oacc-host.c   |   2 +
 libgomp/plugin/cuda-lib.def   |   3 +
 libgomp/plugin/plugin-nvptx.c | 118 
 libgomp/target.c  | 133 -
 libgomp/testsuite/libgomp.c/target-12.c   |   6 +-
 .../testsuite/libgomp.fortran/target-12.f90   |   6 +-
 .../libgomp.fortran/target-memcpy-rect-1.f90  | 531 ++
 11 files changed, 875 insertions(+), 25 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.fortran/target-memcpy-rect-1.f90

diff --git a/include/cuda/cuda.h b/include/cuda/cuda.h
index 3bfe0bab234..6a48a14a5fc 100644
--- a/include/cuda/cuda.h
+++ b/include/cuda/cuda.h
@@ -47,6 +47,7 @@ typedef void *CUevent;
 typedef void *CUfunction;
 typedef void *CUlinkState;
 typedef void *CUmodule;
+typedef void *CUarray;
 typedef size_t (*CUoccupancyB2DSize)(int);
 typedef void *CUstream;
 
@@ -54,7 +55,10 @@ typedef enum {
   CUDA_SUCCESS = 0,
   CUDA_ERROR_INVALID_VALUE = 1,
   CUDA_ERROR_OUT_OF_MEMORY = 2,
+  CUDA_ERROR_NOT_INITIALIZED = 3,
+  CUDA_ERROR_DEINITIALIZED = 4,
   CUDA_ERROR_INVALID_CONTEXT = 201,
+  CUDA_ERROR_INVALID_HANDLE = 400,
   CUDA_ERROR_NOT_FOUND = 500,
   CUDA_ERROR_NOT_READY = 600,
   CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED = 712,
@@ -141,6 +145,75 @@ typedef enum {
   CU_LIMIT_MALLOC_HEAP_SIZE = 0x02,
 } CUlimit;
 
+typedef enum {
+  CU_MEMORYTYPE_HOST = 0x01,
+  CU_MEMORYTYPE_DEVICE = 0x02,
+  CU_MEMORYTYPE_ARRAY = 0x03,
+  CU_MEMORYTYPE_UNIFIED = 0x04
+} CUmemorytype;
+
+typedef struct {
+  size_t srcXInBytes, srcY;
+  CUmemorytype srcMemoryType;
+  const void *srcHost;
+  CUdeviceptr srcDevice;
+  CUarray srcArray;
+  size_t srcPitch;
+
+  size_t dstXInBytes, dstY;
+  CUmemorytype dstMemoryType;
+  void *dstHost;
+  CUdeviceptr dstDevice;
+  CUarray dstArray;
+  size_t dstPitch;
+
+  size_t WidthInBytes, Height;
+} CUDA_MEMCPY2D;
+
+typedef struct {
+  size_t srcXInBytes, srcY, srcZ;
+  size_t srcLOD;
+  CUmemorytype srcMemoryType;
+  const void *srcHost;
+  CUdeviceptr srcDevice;
+  CUarray srcArray;
+  void *reserved0;
+  size_t srcPitch, srcHeight;
+
+  size_t dstXInBytes, dstY, dstZ;
+  size_t dstLOD;
+  CUmemorytype dstMemoryType;
+  void *dstHost;
+  CUdeviceptr dstDevice;
+  CUarray dstArray;
+  void *reserved1;
+  size_t dstPitch, dstHeight;
+
+  size_t WidthInBytes, Height, Depth;
+} CUDA_MEMCPY3D;
+
+typedef struct {
+  size_t srcXInBytes, srcY, srcZ;
+  size_t srcLOD;
+  CUmemorytype

[PATCH 2/3] [og13] OpenMP, NVPTX: memcpy[23]D bias correction

2023-09-20 Thread Julian Brown

This patch works around behaviour of the 2D and 3D memcpy operations in
the CUDA driver runtime.  Particularly in Fortran, the "base pointer"
of an array (used for either source or destination of a host/device copy)
may lie outside of data that is actually stored on the device.  The fix
is to make sure that we use the first element of data to be transferred
instead, and adjust parameters accordingly.

This is a merge of the patch previously posted for mainline to the
og13 branch.
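
The adjustment described above can be sketched in plain C as follows.  This is only an illustration under assumed names (struct pitched_origin, element_addr and fold_offsets are hypothetical, not libgomp or CUDA identifiers): the row and byte offsets are folded into the base pointer, so the base refers to the first transferred element while every element address stays the same.

```c
#include <stddef.h>

/* Hypothetical description of one side of a 2D copy.  */
struct pitched_origin
{
  const char *base;	/* Base pointer of the array.  */
  size_t x_bytes;	/* Offset within a row, in bytes.  */
  size_t y;		/* Offset in rows.  */
  size_t pitch;		/* Whole-array row length, in bytes.  */
};

/* Address of byte COL_BYTES in row ROW of the region to transfer.  */
const char *
element_addr (const struct pitched_origin *o, size_t row, size_t col_bytes)
{
  return o->base + (o->y + row) * o->pitch + o->x_bytes + col_bytes;
}

/* Fold the offsets into the base pointer and zero them, so the base
   points at data that is actually part of the transfer.  */
void
fold_offsets (struct pitched_origin *o)
{
  if (o->x_bytes != 0 || o->y != 0)
    {
      o->base += o->y * o->pitch + o->x_bytes;
      o->x_bytes = 0;
      o->y = 0;
    }
}
```

Calling element_addr with the same row and column before and after fold_offsets yields the same address; only the pointer handed to the copy routine changes.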

2023-09-19  Julian Brown  

libgomp/
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d): Adjust parameters to
avoid out-of-bounds array checks in CUDA runtime.
(GOMP_OFFLOAD_memcpy3d): Likewise.
---
 libgomp/plugin/plugin-nvptx.c | 67 +++
 1 file changed, 67 insertions(+)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index bc232f9f81f..dd8c56b8f58 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -2460,6 +2460,35 @@ GOMP_OFFLOAD_memcpy2d (int dst_ord, int src_ord, size_t 
dim1_size,
   data.srcXInBytes = src_offset1_size;
   data.srcY = src_offset0_len;
 
+  if (data.srcXInBytes != 0 || data.srcY != 0)
+{
+  /* Adjust origin to the actual array data, else the CUDA 2D memory
+copy API calls below may fail to validate source/dest pointers
+correctly (especially for Fortran where the "virtual origin" of an
+array is often outside the stored data).  */
+  if (src_ord == -1)
+   data.srcHost = (const void *) ((const char *) data.srcHost
+ + data.srcY * data.srcPitch
+ + data.srcXInBytes);
+  else
+   data.srcDevice += data.srcY * data.srcPitch + data.srcXInBytes;
+  data.srcXInBytes = 0;
+  data.srcY = 0;
+}
+
+  if (data.dstXInBytes != 0 || data.dstY != 0)
+{
+  /* As above.  */
+  if (dst_ord == -1)
+   data.dstHost = (void *) ((char *) data.dstHost
++ data.dstY * data.dstPitch
++ data.dstXInBytes);
+  else
+   data.dstDevice += data.dstY * data.dstPitch + data.dstXInBytes;
+  data.dstXInBytes = 0;
+  data.dstY = 0;
+}
+
   CUresult res = CUDA_CALL_NOCHECK (cuMemcpy2D, &data);
   if (res == CUDA_ERROR_INVALID_VALUE)
 /* If pitch > CU_DEVICE_ATTRIBUTE_MAX_PITCH or for device-to-device
@@ -2528,6 +2557,44 @@ GOMP_OFFLOAD_memcpy3d (int dst_ord, int src_ord, size_t 
dim2_size,
   data.srcY = src_offset1_len;
   data.srcZ = src_offset0_len;
 
+  if (data.srcXInBytes != 0 || data.srcY != 0 || data.srcZ != 0)
+{
+  /* Adjust origin to the actual array data, else the CUDA 3D memory
+copy API call below may fail to validate source/dest pointers
+correctly (especially for Fortran where the "virtual origin" of an
+array is often outside the stored data).  */
+  if (src_ord == -1)
+   data.srcHost
+ = (const void *) ((const char *) data.srcHost
+   + (data.srcZ * data.srcHeight + data.srcY)
+ * data.srcPitch
+   + data.srcXInBytes);
+  else
+   data.srcDevice
+ += (data.srcZ * data.srcHeight + data.srcY) * data.srcPitch
++ data.srcXInBytes;
+  data.srcXInBytes = 0;
+  data.srcY = 0;
+  data.srcZ = 0;
+}
+
+  if (data.dstXInBytes != 0 || data.dstY != 0 || data.dstZ != 0)
+{
+  /* As above.  */
+  if (dst_ord == -1)
+   data.dstHost = (void *) ((char *) data.dstHost
++ (data.dstZ * data.dstHeight + data.dstY)
+  * data.dstPitch
++ data.dstXInBytes);
+  else
+   data.dstDevice
+ += (data.dstZ * data.dstHeight + data.dstY) * data.dstPitch
++ data.dstXInBytes;
+  data.dstXInBytes = 0;
+  data.dstY = 0;
+  data.dstZ = 0;
+}
+
   CUDA_CALL (cuMemcpy3D, &data);
   return true;
 }
-- 
2.25.1

[PATCH 0/3] [og13] OpenMP: Accelerated 2D/3D host<->target memory copies

2023-09-20 Thread Julian Brown

This series consists of support for accelerated 2D/3D memory copies for
omp_target_memcpy_rect and "target update" directives, using underlying
API-provided routines (CUDA for nVidia, or via an AMD-specific extension
for HSA).

One of the patches (by Tobias) is already on mainline, so this is a
backport, and another is a bug fix that has been submitted for mainline
but not yet reviewed.  The final patch is new (the AMD support for
2D/3D copies).

Tested with offloading to both NVPTX and AMD GCN.  I will push (to the
og13 branch) shortly.

Julian Brown (2):
  [og13] OpenMP, NVPTX: memcpy[23]D bias correction
  [og13] OpenMP: Support accelerated 2D/3D memory copies for AMD GCN

Tobias Burnus (1):
  [og13] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for
omp_target_memcpy_rect

 include/cuda/cuda.h   |  87 +++
 libgomp/libgomp-plugin.h  |   7 +
 libgomp/libgomp.h |   2 +
 libgomp/libgomp.texi  |   5 +
 libgomp/oacc-host.c   |   2 +
 libgomp/plugin/cuda-lib.def   |   3 +
 libgomp/plugin/plugin-gcn.c   | 359 
 libgomp/plugin/plugin-nvptx.c | 185 ++
 libgomp/target.c  | 164 +-
 libgomp/testsuite/libgomp.c/target-12.c   |   6 +-
 .../testsuite/libgomp.fortran/target-12.f90   |   6 +-
 .../libgomp.fortran/target-memcpy-rect-1.f90  | 531 ++
 12 files changed, 1332 insertions(+), 25 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.fortran/target-memcpy-rect-1.f90

-- 
2.25.1

[Committed] RISC-V: Fix Demand comparison bug[VSETVL PASS]

2023-09-20 Thread Juzhe-Zhong

This bug is exposed when we support VLS integer conversion patterns.

FAIL: c-c++-common/torture/pr53505.c execution.

This is caused by incorrect vsetvl elimination in Phase 4:

   10318:   0d207057    vsetvli zero,zero,e32,m4,ta,ma
   1031c:   5e003e57    vmv.v.i v28,0
   .:   missed e8,m1 vsetvl
   10320:   7b07b057    vmsgtu.vi   v0,v16,15
   10324:   03083157    vadd.vi v2,v16,-16

Regression testing on the release version of GCC shows no unexpected differences.

Committed.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vector_insn_info::operator==): Fix bug.

---
 gcc/config/riscv/riscv-vsetvl.cc | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index df980b6770e..e0f61148ef3 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1799,10 +1799,11 @@ vector_insn_info::operator== (const vector_insn_info &other) const
 if (m_demands[i] != other.demand_p ((enum demand_type) i))
   return false;
 
-  if (vector_config_insn_p (m_insn->rtl ())
-  || vector_config_insn_p (other.get_insn ()->rtl ()))
-if (m_insn != other.get_insn ())
-  return false;
+  /* We should consider different INSN demands as different
+ expression.  Otherwise, we will be doing incorrect vsetvl
+ elimination.  */
+  if (m_insn != other.get_insn ())
+return false;
 
   if (!same_avl_p (other))
 return false;
-- 
2.36.3

[pushed] Darwin: Move checking of the 'shared' driver spec.

2023-09-20 Thread Iain Sandoe

Tested on x86_64-darwin21, pushed to trunk, thanks
Iain

--- 8< ---

This avoids a bunch of irrelevant diagnostics if the user passes '-shared' to
gnatmake.  Currently, we push '-dynamiclib' back onto the command line (since
that is the Darwin spelling of 'shared') but this is not handled by gnat1,
leading to a diagnostic for every character after the '-d'.

'-shared' has no effect on gnatmake (it needs to be passed to gnatbind).

This moves the handling of '-shared' to leaf specs so that we do not need to
push 'dynamiclib' onto the command line.

gcc/ChangeLog:

* config/darwin.h:
(SUBTARGET_DRIVER_SELF_SPECS): Move handling of 'shared' into the same
specs as 'dynamiclib'. (STARTFILE_SPEC): Handle 'shared'.
---
 gcc/config/darwin.h | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index b7cfab607db..61e46f76b22 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -133,10 +133,9 @@ extern GTY(()) int darwin_ms_struct;
cases where these driver opts are used multiple times, or to control
operations on more than one command (e.g. dynamiclib).  These are handled
specially and we then add %= 10.7 mmacosx-version-min= -no_pie) }"
 
 #define DARWIN_CC1_SPEC
\
-  "%

Re: [PATCH V2] RISC-V: Support combine cond extend and reduce sum to widen reduce sum

2023-09-20 Thread Robin Dapp

> So, IMHO, a complicate pattern which combine initial 0 value + extension + 
> reduction + vmerge may be more reasonable.

If that works I would also prefer that.

Regards
 Robin

Re: Re: [PATCH V2] RISC-V: Support combine cond extend and reduce sum to widen reduce sum

2023-09-20 Thread juzhe.zh...@rivai.ai

I think both approaches look weird to me.

Lehua is adding a const 0 move pattern that is only used by widen reduction, 
which is not ideal.
Also, I don't like changing the abs/vcond_mask predicate.

So, IMHO, a complicated pattern which combines initial 0 value + extension + 
reduction + vmerge may be more reasonable.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-09-20 17:14
To: Lehua Ding; gcc-patches
CC: rdapp.gcc; juzhe.zhong; kito.cheng; palmer; jeffreyalaw
Subject: Re: [PATCH V2] RISC-V: Support combine cond extend and reduce sum to 
widen reduce sum
Hi Lehua,
 
I think this is better but still a bit weird :D  Allowing constants
and forcing them into registers unconditionally is slightly dubious as
well, though.  One thing that always sticks out is - how is 0 special?
Wouldn't we want other constants as well?
 
For reductions I think the vectorizer always starts accumulating
with the initial neutral value 0 and adds any other scalar
initial value later.  But that could change?
 
For reference, attached is what I tried.  This gives me no regressions
and your tests work.  Your approach is more generic in case we want to
match future zero constants in other patterns (that we still needed
to adjust with force reg otherwise) but the force-reg thing appears
more "natural".
 
All in all, I would prefer the force-reg approach slightly but could also
live with this v2 despite some minor "usability" concerns.  Going to leave
the decision to you, either one is OK.
 
Regards
Robin
 
From 3be4cf4403a584d560c3923207a9c4da8dafee49 Mon Sep 17 00:00:00 2001
From: Robin Dapp 
Date: Wed, 20 Sep 2023 10:15:36 +0200
Subject: [PATCH] lehua
 
---
gcc/config/riscv/autovec-opt.md | 52 -
gcc/config/riscv/autovec.md |  4 ++-
gcc/config/riscv/riscv-protos.h |  1 +
3 files changed, 55 insertions(+), 2 deletions(-)
 
diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index a97a095691c..8d4ee2ae37f 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -103,12 +103,14 @@ (define_insn_and_split "*cond_abs"
 (if_then_else:VF
   (match_operand: 3 "register_operand")
   (abs:VF (match_operand:VF 1 "nonmemory_operand"))
-  (match_operand:VF 2 "register_operand")))]
+  (match_operand:VF 2 "nonmemory_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
   [(const_int 0)]
{
+  if (!REG_P (operands[2]))
+operands[2] = force_reg (mode, operands[2]);
   emit_insn (gen_cond_len_abs (operands[0], operands[3], operands[1],
 operands[2],
 gen_int_mode (GET_MODE_NUNITS (mode), Pmode),
@@ -1176,3 +1178,51 @@ (define_insn_and_split "*n"
 DONE;
   }
   [(set_attr "type" "vmalu")])
+
+;; Combine mask extend + vredsum to mask vwredsum[u]
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+(unspec: [
+  (if_then_else:
+(match_operand: 1 "register_operand")
+(any_extend:
+  (match_operand:VI_QHS_NO_M8 2 "register_operand"))
+(match_operand: 3 "vector_const_0_operand"))
+] UNSPEC_REDUC_SUM))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx ops[] = {operands[0], operands[2], operands[1],
+   gen_int_mode (GET_MODE_NUNITS (mode), Pmode)};
+  riscv_vector::expand_reduction (,
+  riscv_vector::REDUCE_OP_M,
+  ops, CONST0_RTX (mode));
+  DONE;
+}
+[(set_attr "type" "vector")])
+
+;; Combine mask extend + vfredsum to mask vfwredusum
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+(unspec: [
+  (if_then_else:
+(match_operand: 1 "register_operand")
+(float_extend:
+  (match_operand:VF_HS_NO_M8 2 "register_operand"))
+(match_operand: 3 "vector_const_0_operand"))
+] UNSPEC_REDUC_SUM_UNORDERED))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx ops[] = {operands[0], operands[2], operands[1],
+   gen_int_mode (GET_MODE_NUNITS (mode), Pmode)};
+  riscv_vector::expand_reduction (UNSPEC_WREDUC_SUM_UNORDERED,
+  riscv_vector::REDUCE_OP_M_FRM_DYN,
+  ops, CONST0_RTX (mode));
+  DONE;
+}
+[(set_attr "type" "vector")])
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 75ed7ae4f2e..1c10e841692 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -550,13 +550,15 @@ (define_insn_and_split "vcond_mask_"
 (if_then_else:V_VLS
   (match_operand: 3 "register_operand")
   (match_operand:V_VLS 1 "nonmemory_operand")
-  (match_operand:V_VLS 2 "register_operand")))]
+  (match_operand:V_VLS 2 "nonmemory_operand")))]
   "TARGET_VECTOR &&

Re: [PATCH][_GLIBCXX_INLINE_VERSION] Fix

2023-09-20 Thread Jonathan Wakely

On Wed, 20 Sept 2023 at 05:51, François Dumont via Libstdc++
 wrote:
>
> libstdc++: Remove std::constract_violation from versioned namespace

Spelling mistake in contract_violation, and it's not
std::contract_violation, it's std::experimental::contract_violation

>
> GCC expects this type to be in std namespace directly.

Again, it's in std::experimental not in std directly.

Will this change cause problems when including another experimental
header, which does put experimental below std::__8?

I think std::__8::experimental and std::experimental will become ambiguous.

Maybe we do want to remove the inline __8 namespace from all
experimental headers. That needs a bit more thought though.

>
> libstdc++-v3/ChangeLog:
>
>  * include/experimental/contract:
>  Remove _GLIBCXX_BEGIN_NAMESPACE_VERSION/_GLIBCXX_END_NAMESPACE_VERSION.

This line is too long for the changelog.

>
> It does fix 29 g++.dg/contracts in gcc testsuite.
>
> Ok to commit ?
>
> François

[PATCH 2/2] tree-optimization/111489 - raise --param uninit-max-chain-len to 8

2023-09-20 Thread Richard Biener

This raises --param uninit-max-chain-len to avoid a bogus diagnostic
for the large testcase in PR111489.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/111489
* params.opt (-param uninit-max-chain-len=): Raise default to 8.

* gcc.dg/uninit-pr111489.c: New testcase.
---
 gcc/params.opt |   2 +-
 gcc/testsuite/gcc.dg/uninit-pr111489.c | 112 +
 2 files changed, 113 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/uninit-pr111489.c

diff --git a/gcc/params.opt b/gcc/params.opt
index a67778e456b..fffa8b1bc64 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1107,7 +1107,7 @@ Common Joined UInteger 
Var(param_uninit_control_dep_attempts) Init(1000) Integer
 Maximum number of nested calls to search for control dependencies during 
uninitialized variable analysis.
 
 -param=uninit-max-chain-len=
-Common Joined UInteger Var(param_uninit_max_chain_len) Init(5) IntegerRange(1, 
128) Param Optimization
+Common Joined UInteger Var(param_uninit_max_chain_len) Init(8) IntegerRange(1, 
128) Param Optimization
 Maximum number of predicates anded for each predicate ored in the normalized
 predicate chain.
 
diff --git a/gcc/testsuite/gcc.dg/uninit-pr111489.c 
b/gcc/testsuite/gcc.dg/uninit-pr111489.c
new file mode 100644
index 000..1b3eb652b25
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/uninit-pr111489.c
@@ -0,0 +1,112 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wuninitialized" } */
+
+#include 
+#include 
+
+typedef struct
+{
+uint16_t pgv1_size;
+uint16_t pgv1_flags1;
+uint16_t pgv1_flags2;
+uint16_t pgv1_flags3;
+uint16_t pgv1_nslots;
+uint32_t pgv1_generic;
+} page_v1_t;
+
+typedef struct
+{
+page_v1_t pgv2_hdr;
+int64_t pgv2_next;
+int64_t pgv2_prev;
+} page_v2_t;
+
+typedef struct
+{
+union
+{
+struct
+{
+uint16_t pgv0_size;
+uint16_t pgv0_flags;
+};
+uint32_t pgv0_nslots;
+};
+uint32_t pgv0_mode;
+uint32_t pgv0_next;
+} page_v0_t;
+
+typedef struct
+{
+uint16_t sl4_off;
+uint16_t sl4_len;
+} slot4_t;
+
+typedef struct
+{
+uint32_t sl8_off;
+uint32_t sl8_len;
+} slot8_t;
+extern int64_t cur_rowid;
+
+extern uint8_t *pg_alloc(size_t pg_size);
+extern void mem_move(void *dst, const void *src, size_t len);
+extern void pg_expand(uint8_t *row_in, uint8_t *row_out, uint16_t pg_version);
+extern void pg_reorg(uint32_t flags, uint8_t *pg_old, uint16_t slotnum, 
uint8_t *row_data, uint64_t cur_partp);
+
+void pg_reorg(uint32_t flags, uint8_t *pg_old, uint16_t slotnum, uint8_t 
*row_data, uint64_t cur_partp)
+{
+uint16_t pg_version;
+size_t pg_size = page_v1_t *)(pg_old))->pgv1_flags3 == 3) ? 
(((page_v1_t *)(pg_old))->pgv1_size*1024) : (uint16_t)(0x0800 + (((page_v0_t 
*)(pg_old))->pgv0_size & 0xF800)));
+uint8_t *pg_new = pg_alloc(pg_size);
+uint8_t *old_slot = ((page_v1_t *)(pg_old))->pgv1_flags3 == 3) ? 
(((page_v1_t *)(pg_old))->pgv1_size*1024) : (uint16_t)(0x0800 + (((page_v0_t 
*)(pg_old))->pgv0_size & 0xF800))) > 0x4000) ? (&(pg_old)[page_v1_t 
*)(pg_old))->pgv1_flags3 == 3) ? (((page_v1_t *)(pg_old))->pgv1_size*1024) : 
(uint16_t)(0x0800 + (((page_v0_t *)(pg_old))->pgv0_size & 0xF800))) - 
sizeof(int64_t) - (1*sizeof(slot8_t))]) : page_v0_t *)(pg_old))->pgv0_mode 
& 1) ? (&(pg_old)[page_v1_t *)(pg_old))->pgv1_flags3 == 3) ? (((page_v1_t 
*)(pg_old))->pgv1_size*1024) : (uint16_t)(0x0800 + (((page_v0_t 
*)(pg_old))->pgv0_size & 0xF800))) - sizeof(int64_t) - (1*sizeof(slot4_t))]) : 
(&(pg_old)[page_v1_t *)(pg_old))->pgv1_flags3 == 3) ? (((page_v1_t 
*)(pg_old))->pgv1_size*1024) : (uint16_t)(0x0800 + (((page_v0_t 
*)(pg_old))->pgv0_size & 0xF800))) - sizeof(int32_t) - (1*sizeof(slot4_t))])));
+uint8_t *new_slot = ((page_v1_t *)(pg_new))->pgv1_flags3 == 3) ? 
(((page_v1_t *)(pg_new))->pgv1_size*1024) : (uint16_t)(0x0800 + (((page_v0_t 
*)(pg_new))->pgv0_size & 0xF800))) > 0x4000) ? (&(pg_new)[page_v1_t 
*)(pg_new))->pgv1_flags3 == 3) ? (((page_v1_t *)(pg_new))->pgv1_size*1024) : 
(uint16_t)(0x0800 + (((page_v0_t *)(pg_new))->pgv0_size & 0xF800))) - 
sizeof(int64_t) - (1*sizeof(slot8_t))]) : page_v0_t *)(pg_new))->pgv0_mode 
& 1) ? (&(pg_new)[page_v1_t *)(pg_new))->pgv1_flags3 == 3) ? (((page_v1_t 
*)(pg_new))->pgv1_size*1024) : (uint16_t)(0x0800 + (((page_v0_t 
*)(pg_new))->pgv0_size & 0xF800))) - sizeof(int64_t) - (1*sizeof(slot4_t))]) : 
(&(pg_new)[page_v1_t *)(pg_new))->pgv1_flags3 == 3) ? (((page_v1_t 
*)(pg_new))->pgv1_size*1024) : (uint16_t)(0x0800 + (((page_v0_t 
*)(pg_new))->pgv0_size & 0xF800))) - sizeof(int32_t) - (1*sizeof(slot4_t))])));
+
+if (flags == 2)
+{
+
+pg_version = page_v1_t *)(pg_old))->pgv1_flags3 == 3) ? 
(((page_v1_t *)(pg_old))->pgv1_generic >> 24) : ((uint32_t)((page_v0_t 
*)(pg_old))->pgv0_next >> 24));
+}
+
+int64_t sav_rowid =

[PATCH 1/2] tree-optimization/111489 - turn uninit limits to params

2023-09-20 Thread Richard Biener

The following turns MAX_NUM_CHAINS and MAX_CHAIN_LEN into params, which
allows experimenting with raising them.  For the testcase in PR111489,
raising MAX_CHAIN_LEN from 5 to 8 avoids the bogus diagnostics
at -O2; at -O3 we need a MAX_CHAIN_LEN of 6.
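
One consequence of making such limits run-time parameters is that storage sized by them can no longer be a fixed-size array.  The following is a rough C sketch of that idea only; it is not the GCC code (which is C++ and uses auto_vec), and struct pred_chain, alloc_chains, chain_push and free_chains are hypothetical names.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of one AND-chain of predicates.  */
struct pred_chain
{
  unsigned len;		/* Number of predicates currently stored.  */
  int *preds;		/* Storage for up to max_chain_len predicates.  */
};

/* Release N chains previously set up by alloc_chains.  */
void
free_chains (struct pred_chain *chains, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    free (chains[i].preds);
  free (chains);
}

/* Allocate MAX_NUM_CHAINS chains on the heap; both limits are run-time
   values here, so a fixed-size stack array is not an option.  */
struct pred_chain *
alloc_chains (unsigned max_num_chains, unsigned max_chain_len)
{
  struct pred_chain *chains = calloc (max_num_chains, sizeof *chains);
  if (!chains)
    return NULL;
  for (unsigned i = 0; i < max_num_chains; i++)
    {
      chains[i].preds = calloc (max_chain_len, sizeof (int));
      if (!chains[i].preds)
	{
	  free_chains (chains, i);
	  return NULL;
	}
    }
  return chains;
}

/* Append PRED to CHAIN; return false when the length limit is reached,
   which is where an analysis would give up on the chain.  */
bool
chain_push (struct pred_chain *chain, int pred, unsigned max_chain_len)
{
  if (chain->len >= max_chain_len)
    return false;
  chain->preds[chain->len++] = pred;
  return true;
}
```

With a limit of 3, the fourth chain_push on the same chain returns false; raising the limit lets longer chains through at the cost of more analysis work.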

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/111489
* doc/invoke.texi (--param uninit-max-chain-len): Document.
(--param uninit-max-num-chains): Likewise.
* params.opt (-param=uninit-max-chain-len=): New.
(-param=uninit-max-num-chains=): Likewise.
* gimple-predicate-analysis.cc (MAX_NUM_CHAINS): Define to
param_uninit_max_num_chains.
(MAX_CHAIN_LEN): Define to param_uninit_max_chain_len.
(uninit_analysis::init_use_preds): Avoid VLA.
(uninit_analysis::init_from_phi_def): Likewise.
(compute_control_dep_chain): Avoid using MAX_CHAIN_LEN in
template parameter.
---
 gcc/doc/invoke.texi  |  7 +++
 gcc/gimple-predicate-analysis.cc | 13 -
 gcc/params.opt   |  9 +
 3 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 09b2b92ad84..ba7984bcb7e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16348,6 +16348,13 @@ crossing a loop backedge when comparing to
 Maximum number of nested calls to search for control dependencies
 during uninitialized variable analysis.
 
+@item uninit-max-chain-len
+Maximum number of predicates anded for each predicate ored in the normalized
+predicate chain.
+
+@item uninit-max-num-chains
+Maximum number of predicates ored in the normalized predicate chain.
+
 @item sched-autopref-queue-depth
 Hardware autoprefetcher scheduler model control flag.
 Number of lookahead cycles the model looks into; at '
diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index 373163ba9c8..ad2c35524ce 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -50,8 +50,8 @@
 
 /* In our predicate normal form we have MAX_NUM_CHAINS or predicates
and in those MAX_CHAIN_LEN (inverted) and predicates.  */
-#define MAX_NUM_CHAINS 8
-#define MAX_CHAIN_LEN 5
+#define MAX_NUM_CHAINS (unsigned)param_uninit_max_num_chains
+#define MAX_CHAIN_LEN (unsigned)param_uninit_max_chain_len
 
 /* Return true if X1 is the negation of X2.  */
 
@@ -1163,11 +1163,12 @@ compute_control_dep_chain (basic_block dom_bb, 
const_basic_block dep_bb,
   vec<edge> cd_chains[], unsigned *num_chains,
   unsigned in_region = 0)
 {
-  auto_vec<edge, MAX_CHAIN_LEN + 1> cur_cd_chain;
+  auto_vec<edge> cur_cd_chain;
   unsigned num_calls = 0;
   unsigned depth = 0;
   bool complete_p = true;
   /* Walk the post-dominator chain.  */
+  cur_cd_chain.reserve (MAX_CHAIN_LEN + 1);
   compute_control_dep_chain_pdom (dom_bb, dep_bb, NULL, cd_chains,
  num_chains, cur_cd_chain, &num_calls,
  in_region, depth, &complete_p);
@@ -2035,7 +2036,7 @@ uninit_analysis::init_use_preds (predicate &use_preds, 
basic_block def_bb,
  are logical conjunctions.  Together, the DEP_CHAINS vector is
  used below to initialize an OR expression of the conjunctions.  */
   unsigned num_chains = 0;
-  auto_vec<edge> dep_chains[MAX_NUM_CHAINS];
+  auto_vec<edge> *dep_chains = new auto_vec<edge>[MAX_NUM_CHAINS];
 
   if (!dfs_mark_dominating_region (use_bb, cd_root, in_region, region)
   || !compute_control_dep_chain (cd_root, use_bb, dep_chains, &num_chains,
@@ -2060,6 +2061,7 @@ uninit_analysis::init_use_preds (predicate &use_preds, 
basic_block def_bb,
  Each OR subexpression is represented by one element of DEP_CHAINS,
  where each element consists of a series of AND subexpressions.  */
   use_preds.init_from_control_deps (dep_chains, num_chains, true);
+  delete[] dep_chains;
   return !use_preds.is_empty ();
 }
 
@@ -2144,7 +2146,7 @@ uninit_analysis::init_from_phi_def (gphi *phi)
   break;
 
   unsigned num_chains = 0;
-  auto_vec<edge> dep_chains[MAX_NUM_CHAINS];
+  auto_vec<edge> *dep_chains = new auto_vec<edge>[MAX_NUM_CHAINS];
   for (unsigned i = 0; i < nedges; i++)
 {
   edge e = def_edges[i];
@@ -2175,6 +2177,7 @@ uninit_analysis::init_from_phi_def (gphi *phi)
  which the PHI operands are defined to values for which M_EVAL is
  false.  */
   m_phi_def_preds.init_from_control_deps (dep_chains, num_chains, false);
+  delete[] dep_chains;
   return !m_phi_def_preds.is_empty ();
 }
 
diff --git a/gcc/params.opt b/gcc/params.opt
index 70cfb495e3a..a67778e456b 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -1106,6 +1106,15 @@ Emit instrumentation calls to __tsan_func_entry() and 
__tsan_func_exit().
 Common Joined UInteger Var(param_uninit_control_dep_attempts) Init(1000) 
IntegerRange(1, 65536) Param Optimization
 Maximum number of nested calls to search for control dependencies during 
uninitialized variable analysis.
 
+-param=uninit-max-chain-len=

Re: [PATCH V2] RISC-V: Support combine cond extend and reduce sum to widen reduce sum

2023-09-20 Thread Robin Dapp

Hi Lehua,

I think this is better but still a bit weird :D  Allowing constants
and forcing them into registers unconditionally is slightly dubious as
well, though.  One thing that always sticks out is - how is 0 special?
Wouldn't we want other constants as well?

For reductions I think the vectorizer always starts accumulating
with the initial neutral value 0 and adds any other scalar
initial value later.  But that could change?
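
To make the role of the neutral value concrete, here is a scalar C model of a conditional widening sum.  It is only a sketch of the semantics under discussion, not what the vectorizer or the RISC-V backend emits, and cond_widen_sum and its parameters are hypothetical names: masked-off lanes contribute 0, active lanes are extended to the wider type before being added, and the scalar initial value is added at the end.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model: sum the int8_t elements of A whose
   corresponding PRED entry is nonzero, accumulating in int16_t.  */
int16_t
cond_widen_sum (const int8_t *a, const int8_t *pred, size_t n, int16_t init)
{
  /* Accumulation starts from the neutral value of addition.  */
  int16_t acc = 0;

  for (size_t i = 0; i < n; i++)
    {
      /* Inactive lanes are replaced by 0, so they do not change the sum.
	 Active lanes are sign-extended to the wider type first.  */
      int16_t lane = pred[i] ? (int16_t) a[i] : 0;
      acc = (int16_t) (acc + lane);
    }

  /* Any other scalar initial value is added afterwards.  */
  return (int16_t) (acc + init);
}
```

Zero is special here only because it is the identity for addition; a different reduction operation would need its own neutral element for the masked-off lanes.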

For reference, attached is what I tried.  This gives me no regressions
and your tests work.  Your approach is more generic in case we want to
match future zero constants in other patterns (that we still needed
to adjust with force reg otherwise) but the force-reg thing appears
more "natural".

All in all, I would prefer the force-reg approach slightly but could also
live with this v2 despite some minor "usability" concerns.  Going to leave
the decision to you, either one is OK.

Regards
 Robin

From 3be4cf4403a584d560c3923207a9c4da8dafee49 Mon Sep 17 00:00:00 2001
From: Robin Dapp 
Date: Wed, 20 Sep 2023 10:15:36 +0200
Subject: [PATCH] lehua

---
 gcc/config/riscv/autovec-opt.md | 52 -
 gcc/config/riscv/autovec.md |  4 ++-
 gcc/config/riscv/riscv-protos.h |  1 +
 3 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index a97a095691c..8d4ee2ae37f 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -103,12 +103,14 @@ (define_insn_and_split "*cond_abs"
 (if_then_else:VF
   (match_operand: 3 "register_operand")
   (abs:VF (match_operand:VF 1 "nonmemory_operand"))
-  (match_operand:VF 2 "register_operand")))]
+  (match_operand:VF 2 "nonmemory_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
   [(const_int 0)]
 {
+  if (!REG_P (operands[2]))
+operands[2] = force_reg (mode, operands[2]);
   emit_insn (gen_cond_len_abs (operands[0], operands[3], operands[1],
 operands[2],
 gen_int_mode (GET_MODE_NUNITS 
(mode), Pmode),
@@ -1176,3 +1178,51 @@ (define_insn_and_split "*n"
 DONE;
   }
   [(set_attr "type" "vmalu")])
+
+;; Combine mask extend + vredsum to mask vwredsum[u]
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+(unspec: [
+  (if_then_else:
+(match_operand: 1 "register_operand")
+(any_extend:
+  (match_operand:VI_QHS_NO_M8 2 "register_operand"))
+(match_operand: 3 "vector_const_0_operand"))
+] UNSPEC_REDUC_SUM))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx ops[] = {operands[0], operands[2], operands[1],
+   gen_int_mode (GET_MODE_NUNITS (mode), Pmode)};
+  riscv_vector::expand_reduction (,
+  riscv_vector::REDUCE_OP_M,
+  ops, CONST0_RTX (mode));
+  DONE;
+}
+[(set_attr "type" "vector")])
+
+;; Combine mask extend + vfredsum to mask vfwredusum
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+(unspec: [
+  (if_then_else:
+(match_operand: 1 "register_operand")
+(float_extend:
+  (match_operand:VF_HS_NO_M8 2 "register_operand"))
+(match_operand: 3 "vector_const_0_operand"))
+] UNSPEC_REDUC_SUM_UNORDERED))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx ops[] = {operands[0], operands[2], operands[1],
+   gen_int_mode (GET_MODE_NUNITS (mode), Pmode)};
+  riscv_vector::expand_reduction (UNSPEC_WREDUC_SUM_UNORDERED,
+  riscv_vector::REDUCE_OP_M_FRM_DYN,
+  ops, CONST0_RTX (mode));
+  DONE;
+}
+[(set_attr "type" "vector")])
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 75ed7ae4f2e..1c10e841692 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -550,13 +550,15 @@ (define_insn_and_split "vcond_mask_"
 (if_then_else:V_VLS
   (match_operand: 3 "register_operand")
   (match_operand:V_VLS 1 "nonmemory_operand")
-  (match_operand:V_VLS 2 "register_operand")))]
+  (match_operand:V_VLS 2 "nonmemory_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
   [(const_int 0)]
   {
 /* The order of vcond_mask is opposite to pred_merge.  */
+if (!REG_P (operands[2]))
+  operands[2] = force_reg (mode, operands[2]);
 std::swap (operands[1], operands[2]);
 riscv_vector::emit_vlmax_insn (code_for_pred_merge (mode),
riscv_vector::MERGE_OP, operands);
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 9ea0bcf15d3..a75b0b485b4

[PATCH, rs6000] Enable vector compare for 16-byte memory equality compare [PR111449]

2023-09-20 Thread HAO CHEN GUI

Hi,
  This patch enables vector compare for 16-byte memory equality compare.
The 16-byte memory equality compare can be implemented efficiently with
the "vcmpequb." instruction.  It saves one branch and one compare
compared with a sequence of two 8-byte compares.
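
For orientation, a scalar sketch of the two shapes being compared (an
illustration under assumptions, not part of the patch; the helper names
are hypothetical).  The first function is the form used in the new test,
the second models the two 8-byte compare sequence:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The form used in the new test: one 16-byte equality compare.  */
static int
equal16 (const void *p, const void *q)
{
  return memcmp (p, q, 16) == 0;
}

/* Scalar model of the sequence of two 8-byte compares.  */
static int
equal16_two_words (const void *p, const void *q)
{
  uint64_t a[2], b[2];
  memcpy (a, p, sizeof a);
  memcpy (b, q, sizeof b);
  if (a[0] != b[0])		/* First compare and branch.  */
    return 0;
  return a[1] == b[1];		/* Second compare.  */
}

/* Hypothetical demo: returns 1 if both forms agree on an equal pair
   and on a pair that differs in the last byte.  */
static int
equal16_demo (void)
{
  unsigned char x[16], y[16];
  memset (x, 0x5a, sizeof x);
  memcpy (y, x, sizeof y);
  int same = equal16 (x, y);
  assert (same == equal16_two_words (x, y));
  y[15] ^= 1;
  int diff = equal16 (x, y);
  assert (diff == equal16_two_words (x, y));
  return same && !diff;
}
```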

  16-byte vector compare is not enabled on 32-bit sub-targets, as TImode
is not well supported there.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.

Thanks
Gui Haochen

ChangeLog
rs6000: Enable vector compare for 16-byte memory equality compare

gcc/
PR target/111449
* config/rs6000/altivec.md (cbranchti4): New expand pattern.
* config/rs6000/rs6000.cc (rs6000_generate_compare): Generate insn
sequence for TImode vector equality compare.
* config/rs6000/rs6000.h (MOVE_MAX_PIECES): Define.
(COMPARE_MAX_PIECES): Define.

gcc/testsuite/
PR target/111449
* gcc.target/powerpc/pr111449.c: New.

patch.diff
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index e8a596fb7e9..99264235cbe 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2605,6 +2605,24 @@ (define_insn "altivec_vupklpx"
 }
   [(set_attr "type" "vecperm")])

+(define_expand "cbranchti4"
+  [(use (match_operator 0 "equality_operator"
+   [(match_operand:TI 1 "memory_operand")
+(match_operand:TI 2 "memory_operand")]))
+   (use (match_operand 3))]
+  "VECTOR_UNIT_ALTIVEC_P (V16QImode)"
+{
+  rtx op1 = simplify_subreg (V16QImode, operands[1], TImode, 0);
+  rtx op2 = simplify_subreg (V16QImode, operands[2], TImode, 0);
+  operands[1] = force_reg (V16QImode, op1);
+  operands[2] = force_reg (V16QImode, op2);
+  rtx_code code = GET_CODE (operands[0]);
+  operands[0] = gen_rtx_fmt_ee (code, V16QImode, operands[1],
+   operands[2]);
+  rs6000_emit_cbranch (TImode, operands);
+  DONE;
+})
+
 ;; Compare vectors producing a vector result and a predicate, setting CR6 to
 ;; indicate a combined status
 (define_insn "altivec_vcmpequ_p"
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index efe9adce1f8..c6b935a64e7 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -15264,6 +15264,15 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
  else
emit_insn (gen_stack_protect_testsi (compare_result, op0, op1b));
}
+  else if (mode == TImode)
+   {
+ gcc_assert (code == EQ || code == NE);
+
+ rtx result_vector = gen_reg_rtx (V16QImode);
+ compare_result = gen_rtx_REG (CCmode, CR6_REGNO);
+ emit_insn (gen_altivec_vcmpequb_p (result_vector, op0, op1));
+ code = (code == NE) ? GE : LT;
+   }
   else
emit_insn (gen_rtx_SET (compare_result,
gen_rtx_COMPARE (comp_mode, op0, op1)));
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 3503614efbd..dc33bca0802 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1730,6 +1730,8 @@ typedef struct rs6000_args
in one reasonably fast instruction.  */
 #define MOVE_MAX (! TARGET_POWERPC64 ? 4 : 8)
 #define MAX_MOVE_MAX 8
+#define MOVE_MAX_PIECES (!TARGET_POWERPC64 ? 4 : 16)
+#define COMPARE_MAX_PIECES (!TARGET_POWERPC64 ? 4 : 16)

 /* Nonzero if access to memory by bytes is no faster than for words.
Also nonzero if doing byte operations (specifically shifts) in registers
diff --git a/gcc/testsuite/gcc.target/powerpc/pr111449.c b/gcc/testsuite/gcc.target/powerpc/pr111449.c
new file mode 100644
index 000..ab9583f47bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr111449.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-maltivec -O2" } */
+/* { dg-require-effective-target has_arch_ppc64 } */
+
+/* Ensure vector comparison is used for 16-byte memory equality compare.  */
+
+int compare (const char* s1, const char* s2)
+{
+  return __builtin_memcmp (s1, s2, 16) == 0;
+}
+
+/* { dg-final { scan-assembler-times {\mvcmpequb\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mcmpd\M} } } */

Re: [PATCH] RISC-V: Support combine cond extend and reduce sum to cond widen reduce sum

2023-09-20 Thread Lehua Ding


Hi Robin,

I have posted a V2 patch to implement the method I mentioned. I wonder 
if that makes it a little easier to understand?


V2 patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630985.html


On 2023/9/20 12:02, Lehua Ding wrote:



On 2023/9/20 6:02, Robin Dapp wrote:

Hi Lehua,

thanks for the explanation.


My current method is still to keep the operand 2 of vcond_mask as a
register, but the pattern of mov_vec_const_0 is simplified, so that
the corresponding combine pattern can be more simple. That's the only
reason I split the vcond_mask into three patterns.


My "problem" with the separate split is that it really sticks out
and everybody seeing it would wonder why we need it.  It's not that
bad of course but it appears as if we messed up somewhere else.


I couldn't agree more.


I checked and I don't see additional FAILs with the vmask pattern
that additionally allows a const0 operand (that is forced into a
register) and a force_reg in abs:VF.

Would you mind re-checking if we can avoid the extra
"vec_duplicate_const_0" by changing the other affected patterns
in a similar manner?  I really didn't verify in-depth so if we needed
to add a force_reg to every pattern we might need to reconsider.
Still, I'd be unsure if I preferred the "vec_dup_const_0" over
additional force_regs ;)


I think that's a little weird, too. I'd prefer to add a single
define_insn_and_split move_const_0 pattern, as in the following diff.
How do you feel about that?


diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index f4dab9fceb8..c0ab96ae8ab 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -973,7 +973,11 @@ expand_const_vector (rtx target, rtx src)
    rtx tmp = register_operand (target, mode) ? target : gen_reg_rtx 
(mode);

    /* Element in range -16 ~ 15 integer or 0.0 floating-point,
   we use vmv.v.i instruction.  */
-  if (satisfies_constraint_vi (src) || satisfies_constraint_Wc0 (src))
+  /* For const int or float 0, we keep the simple pattern before split1
+ pass. */
+  if (can_create_pseudo_p () && satisfies_constraint_Wc0 (src))
+    emit_insn (gen_mov_vec_const_0 (mode, tmp, src));
+  else if (satisfies_constraint_vi (src))
  {
    rtx ops[] = {tmp, src};
    emit_vlmax_insn (code_for_pred_mov (mode), UNARY_OP, ops);

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index f66ffebba24..b4973125d04 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1640,6 +1640,19 @@
    "TARGET_VECTOR"
    {})

+(define_insn_and_split "@mov_vec_const_0"
+  [(set (match_operand:V_VLS 0 "register_operand")
+    (match_operand:V_VLS 1 "vector_const_0_operand"))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    emit_move_insn (operands[0], operands[1]);
+    DONE;
+  }
+  [(set_attr "type" "vector")])
+
  ;; vle.v/vse.v,vmv.v.v
  (define_insn_and_split "*pred_mov"
    [(set (match_operand:V_VLS 0 "nonimmediate_operand"    "=vr, 
    vr,    vd, m,    vr,    vr")




--
Best,
Lehua

[PATCH V2] RISC-V: Support combine cond extend and reduce sum to widen reduce sum

2023-09-20 Thread Lehua Ding

V2 Change: Use a new method that keeps the move of const 0 into a vector as a simple pattern.

This patch supports combining cond extend and reduce_sum into cond widen
reduce_sum, i.e. combining the following three insns:
  (set (reg:RVVM2HI 149)
   (const_vector:RVVM2HI repeat [
  (const_int 0)
   ]))
  (set (reg:RVVM2HI 138)
(if_then_else:RVVM2HI
  (reg:RVVMF8BI 135)
  (reg:RVVM2HI 148)
  (reg:RVVM2HI 149)))
  (set (reg:HI 150)
(unspec:HI [
  (reg:RVVM2HI 138)
] UNSPEC_REDUC_SUM))
into one insn:
  (set (reg:SI 147)
(unspec:SI [
  (if_then_else:RVVM2SI
(reg:RVVMF16BI 135)
(sign_extend:RVVM2SI (reg:RVVM1HI 136))
(const_vector:RVVM2SI repeat [
  (const_int 0)
]))
] UNSPEC_REDUC_SUM))

Consider the following C code:

int16_t foo (int8_t *restrict a, int8_t *restrict pred)
{
  int16_t sum = 0;
  for (int i = 0; i < 16; i += 1)
if (pred[i])
  sum += a[i];
  return sum;
}

assembly before this patch:

foo:
        vsetivli        zero,16,e16,m2,ta,ma
        li      a5,0
        vmv.v.i v2,0
        vsetvli zero,zero,e8,m1,ta,ma
        vl1re8.v        v0,0(a1)
        vmsne.vi        v0,v0,0
        vsetvli zero,zero,e16,m2,ta,mu
        vle8.v  v4,0(a0),v0.t
        vmv.s.x v1,a5
        vsext.vf2       v2,v4,v0.t
        vredsum.vs      v2,v2,v1
        vmv.x.s a0,v2
        slliw   a0,a0,16
        sraiw   a0,a0,16
        ret

assembly after this patch:

foo:
        li      a5,0
        vsetivli        zero,16,e16,m1,ta,ma
        vmv.s.x v3,a5
        vsetivli        zero,16,e8,m1,ta,ma
        vl1re8.v        v0,0(a1)
        vmsne.vi        v0,v0,0
        vle8.v  v2,0(a0),v0.t
        vwredsum.vs     v1,v2,v3,v0.t
        vsetivli        zero,0,e16,m1,ta,ma
        vmv.x.s a0,v1
        slliw   a0,a0,16
        sraiw   a0,a0,16
        ret
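
The two listings can be read as two ways of computing the same value.
The following scalar sketch (an illustration only; function names and
demo data are hypothetical) models "extend, select against 0, then
reduce" next to "masked widening reduce" for the 16-element loop above:

```c
#include <assert.h>
#include <stdint.h>

#define N 16

/* Form 1: sign-extend every element, replace inactive lanes by 0 in the
   wide type, then sum the wide vector (extend + merge + reduce).  */
static int16_t
sum_extend_select_reduce (const int8_t *a, const int8_t *pred)
{
  int16_t wide[N];
  for (int i = 0; i < N; i++)
    wide[i] = pred[i] ? (int16_t) a[i] : 0;
  int16_t sum = 0;
  for (int i = 0; i < N; i++)
    sum += wide[i];
  return sum;
}

/* Form 2: accumulate only the active narrow elements into a wide
   accumulator (masked widening reduce).  */
static int16_t
sum_masked_widen_reduce (const int8_t *a, const int8_t *pred)
{
  int16_t sum = 0;
  for (int i = 0; i < N; i++)
    if (pred[i])
      sum += a[i];
  return sum;
}

/* Hypothetical demo data: a[i] = i - 8, every third lane active.  */
static int16_t
widen_reduce_demo (void)
{
  int8_t a[N], pred[N];
  for (int i = 0; i < N; i++)
    {
      a[i] = (int8_t) (i - 8);
      pred[i] = (i % 3 == 0);
    }
  int16_t r = sum_masked_widen_reduce (a, pred);
  assert (r == sum_extend_select_reduce (a, pred));
  return r;
}
```

With 16 int8_t inputs the sum always fits in int16_t, so the two forms
cannot differ through overflow in this sketch.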

gcc/ChangeLog:

* config/riscv/autovec-opt.md (@mov_vec_const_0):
New helper pattern.
(*cond_widen_reduc_plus_scal_): New combine pattern.
* config/riscv/riscv-protos.h (enum insn_type): Ditto.
* config/riscv/riscv-v.cc (expand_const_vector): Gen new pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c: New test.
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c: New test.

---
 gcc/config/riscv/autovec-opt.md   | 64 +++
 gcc/config/riscv/riscv-protos.h   |  1 +
 gcc/config/riscv/riscv-v.cc   |  7 +-
 .../rvv/autovec/cond/cond_widen_reduc-1.c | 30 +
 .../rvv/autovec/cond/cond_widen_reduc_run-1.c | 28 
 5 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 66c77ad6ebb..5cc13c85fe5 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -185,6 +185,22 @@
   [(set_attr "type" "vimovvx")
(set_attr "mode" "")])

+;; Let the mov pattern move 0 to vector remain simple pattern before split1.
+;; This simple pattern will let more patterns be made to combine successfully.
+(define_insn_and_split "@mov_vec_const_0"
+  [(set (match_operand:V_VLS 0 "register_operand")
+(match_operand:V_VLS 1 "vector_const_0_operand"))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+riscv_vector::emit_vlmax_insn (code_for_pred_mov (mode),
+   riscv_vector::UNARY_OP, operands);
+DONE;
+  }
+  [(set_attr "type" "vimov")])
+
 ;; 
=
 ;; All combine patterns for combine pass.
 ;; 
=
@@ -1175,6 +1191,54 @@
   }
   [(set_attr "type" "vfwmuladd")])

+;; Combine mask_extend + vredsum to mask_vwredsum[u]
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+(unspec: [
+  (if_then_else:
+(match_operand: 1 "register_operand")
+(any_extend:
+  (match_operand:VI_QHS_NO_M8 2 "register_operand"))
+(match_operand: 3 "vector_const_0_operand"))
+] UNSPEC_REDUC_SUM))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx ops[] = {operands[0], operands[2], operands[1],
+   gen_int_mode (GET_MODE_NUNITS (mode), Pmode)};
+  riscv_vector::expand_reduction (,
+  riscv_vector::REDUCE_OP_M,
+  ops, CONST0_RTX (mode));
+  DONE;
+}
+[(set_attr "type" "vector")])
+
+;; Combine mask_extend + vfredsum to mask_vfwredusum
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+

Re: [PATCH] middle-end: use MAX_FIXED_MODE_SIZE instead of precidion of TImode/DImode

2023-09-20 Thread Richard Biener

On Wed, 20 Sep 2023, Jakub Jelinek wrote:

> Hi!
> 
> On Tue, Sep 19, 2023 at 05:50:59PM +0100, Richard Sandiford wrote:
> > How about using MAX_FIXED_MODE_SIZE for things like this?
> 
> Seems like a good idea.
> 
> The following patch does that.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2023-09-20  Jakub Jelinek  
> 
>   * match.pd ((x << c) >> c): Use MAX_FIXED_MODE_SIZE instead of
>   GET_MODE_PRECISION of TImode or DImode depending on whether
>   TImode is supported scalar mode.
>   * gimple-lower-bitint.cc (bitint_precision_kind): Likewise.
>   * expr.cc (expand_expr_real_1): Likewise.
>   * tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt): Likewise.
>   * ubsan.cc (ubsan_encode_value, ubsan_type_descriptor): Likewise.
> 
> --- gcc/match.pd.jj   2023-09-19 19:47:23.969430272 +0200
> +++ gcc/match.pd  2023-09-19 20:08:17.341559409 +0200
> @@ -4124,9 +4124,7 @@ (define_operator_list SYNC_FETCH_AND_AND
>  (with {
>int width = element_precision (type) - tree_to_uhwi (@1);
>tree stype = NULL_TREE;
> -  scalar_int_mode mode = (targetm.scalar_mode_supported_p (TImode)
> -   ? TImode : DImode);
> -  if (width <= GET_MODE_PRECISION (mode))
> +  if (width <= MAX_FIXED_MODE_SIZE)
>   stype = build_nonstandard_integer_type (width, 0);
>   }
>   (if (stype && (width == 1 || type_has_mode_precision_p (stype)))
> --- gcc/gimple-lower-bitint.cc.jj 2023-09-08 11:29:20.105768005 +0200
> +++ gcc/gimple-lower-bitint.cc2023-09-19 20:01:50.927782331 +0200
> @@ -100,21 +100,19 @@ bitint_precision_kind (int prec)
>small_max_prec = prec;
>return bitint_prec_small;
>  }
> -  scalar_int_mode arith_mode = (targetm.scalar_mode_supported_p (TImode)
> - ? TImode : DImode);
>if (!large_min_prec
> -  && GET_MODE_PRECISION (arith_mode) > GET_MODE_PRECISION (limb_mode))
> -large_min_prec = GET_MODE_PRECISION (arith_mode) + 1;
> +  && GET_MODE_PRECISION (limb_mode) < MAX_FIXED_MODE_SIZE)
> +large_min_prec = MAX_FIXED_MODE_SIZE + 1;
>if (!limb_prec)
>  limb_prec = GET_MODE_PRECISION (limb_mode);
>if (!huge_min_prec)
>  {
> -  if (4 * limb_prec >= GET_MODE_PRECISION (arith_mode))
> +  if (4 * limb_prec >= MAX_FIXED_MODE_SIZE)
>   huge_min_prec = 4 * limb_prec;
>else
> - huge_min_prec = GET_MODE_PRECISION (arith_mode) + 1;
> + huge_min_prec = MAX_FIXED_MODE_SIZE + 1;
>  }
> -  if (prec <= GET_MODE_PRECISION (arith_mode))
> +  if (prec <= MAX_FIXED_MODE_SIZE)
>  {
>if (!mid_min_prec || prec < mid_min_prec)
>   mid_min_prec = prec;
> --- gcc/expr.cc.jj2023-09-08 11:29:20.101768059 +0200
> +++ gcc/expr.cc   2023-09-19 20:00:12.788108832 +0200
> @@ -11044,17 +11044,11 @@ expand_expr_real_1 (tree exp, rtx target
>   scalar_int_mode limb_mode
> = as_a  (info.limb_mode);
>   unsigned int limb_prec = GET_MODE_PRECISION (limb_mode);
> - if (prec > limb_prec)
> + if (prec > limb_prec && prec > MAX_FIXED_MODE_SIZE)
> {
> - scalar_int_mode arith_mode
> -   = (targetm.scalar_mode_supported_p (TImode)
> -  ? TImode : DImode);
> - if (prec > GET_MODE_PRECISION (arith_mode))
> -   {
> - /* Emit large/huge _BitInt INTEGER_CSTs into memory.  */
> - exp = tree_output_constant_def (exp);
> - return expand_expr (exp, target, VOIDmode, modifier);
> -   }
> + /* Emit large/huge _BitInt INTEGER_CSTs into memory.  */
> + exp = tree_output_constant_def (exp);
> + return expand_expr (exp, target, VOIDmode, modifier);
> }
> }
>  
> --- gcc/tree-ssa-sccvn.cc.jj  2023-09-18 15:14:48.987358112 +0200
> +++ gcc/tree-ssa-sccvn.cc 2023-09-19 20:02:53.160941163 +0200
> @@ -7004,10 +7004,7 @@ eliminate_dom_walker::eliminate_stmt (ba
> && !type_has_mode_precision_p (TREE_TYPE (lhs)))
>   {
> if (TREE_CODE (TREE_TYPE (lhs)) == BITINT_TYPE
> -   && (TYPE_PRECISION (TREE_TYPE (lhs))
> -   > (targetm.scalar_mode_supported_p (TImode)
> -  ? GET_MODE_PRECISION (TImode)
> -  : GET_MODE_PRECISION (DImode
> +   && TYPE_PRECISION (TREE_TYPE (lhs)) > MAX_FIXED_MODE_SIZE)
>   lookup_lhs = NULL_TREE;
> else if (TREE_CODE (lhs) == COMPONENT_REF
>  || TREE_CODE (lhs) == MEM_REF)
> --- gcc/ubsan.cc.jj   2023-09-08 11:29:20.136767581 +0200
> +++ gcc/ubsan.cc  2023-09-19 20:06:56.118657251 +0200
> @@ -136,13 +136,10 @@ ubsan_encode_value (tree t, enum ubsan_e
>   }
>else
>   {
> -   scalar_int_mode arith_mode
> - = (targetm.scalar_mode_supported_p (TImode) ? TImode : DImode);
> -   if (TYPE_PRECISION (type) >

Re: [RFC] GCC Security policy

2023-09-20 Thread Arnaud Charlet

This is a great initiative I think.

See reference to AdaCore's security email below (among Debian, Red Hat,
SUSE)

On Mon, Aug 7, 2023 at 7:30 PM David Edelsohn via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> FOSS Best Practices recommends that projects have an official Security
> policy stated in a SECURITY.md or SECURITY.txt file at the root of the
> repository.  GLIBC and Binutils have added such documents.
>
> Appended is a prototype for a Security policy file for GCC based on the
> Binutils document because GCC seems to have more affinity with Binutils as
> a tool. Do the runtime libraries distributed with GCC, especially libgcc,
> require additional security policies?
>
> [ ] Is it appropriate to use the Binutils SECURITY.txt as the starting
> point or should GCC use GLIBC SECURITY.md as the starting point for the GCC
> Security policy?
>
> [ ] Does GCC, or some components of GCC, require additional care because of
> runtime libraries like libgcc and libstdc++, and because of gcov and
> profile-directed feedback?
>
> Thoughts?
>
> Thanks, David
>
> GCC Security Process
> 
>
> What is a GCC security bug?
> ===
>
> A security bug is one that threatens the security of a system or
> network, or might compromise the security of data stored on it.
> In the context of GCC there are two ways in which such
> bugs might occur.  In the first, the programs themselves might be
> tricked into a direct compromise of security.  In the second, the
> tools might introduce a vulnerability in the generated output that
> was not already present in the files used as input.
>
> Other than that, all other bugs will be treated as non-security
> issues.  This does not mean that they will be ignored, just that
> they will not be given the priority that is given to security bugs.
>
> This stance applies to the creation tools in the GCC (e.g.,
> gcc, g++, gfortran, gccgo, gccrs, gnat, cpp, gcov, etc.) and the
> libraries that they use.
>
> Notes:
> ==
>
> None of the programs in GCC need elevated privileges to operate and
> it is recommended that users do not use them from accounts where such
> privileges are automatically available.
>
> Reporting private security bugs
> 
>
>*All bugs reported in the GCC Bugzilla are public.*
>
>In order to report a private security bug that is not immediately
>public, please contact one of the downstream distributions with
>security teams.  The following teams have volunteered to handle
>such bugs:
>
>   Debian:  secur...@debian.org
>   Red Hat: secal...@redhat.com
>   SUSE:secur...@suse.de


Can you also please add:

AdaCore:  product-secur...@adacore.com


>
>Please report the bug to just one of these teams.  It will be shared
>with other teams as necessary.
>
>The team contacted will take care of details such as vulnerability
>rating and CVE assignment (http://cve.mitre.org/about/).  It is likely
>that the team will ask to file a public bug because the issue is
>sufficiently minor and does not warrant an embargo.  An embargo is not
>a requirement for being credited with the discovery of a security
>vulnerability.
>
> Reporting public security bugs
> ==
>
>It is expected that critical security bugs will be rare, and that most
>security bugs can be reported in GCC, thus making
>them public immediately.  The system can be found here:
>
>   https://gcc.gnu.org/bugzilla/
>

[PATCH] middle-end: use MAX_FIXED_MODE_SIZE instead of precidion of TImode/DImode

2023-09-20 Thread Jakub Jelinek

Hi!

On Tue, Sep 19, 2023 at 05:50:59PM +0100, Richard Sandiford wrote:
> How about using MAX_FIXED_MODE_SIZE for things like this?

Seems like a good idea.

The following patch does that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-09-20  Jakub Jelinek  

* match.pd ((x << c) >> c): Use MAX_FIXED_MODE_SIZE instead of
GET_MODE_PRECISION of TImode or DImode depending on whether
TImode is supported scalar mode.
* gimple-lower-bitint.cc (bitint_precision_kind): Likewise.
* expr.cc (expand_expr_real_1): Likewise.
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt): Likewise.
* ubsan.cc (ubsan_encode_value, ubsan_type_descriptor): Likewise.
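
For readers unfamiliar with the match.pd hunk below: its `width' is the
type precision minus the shift count.  A minimal sketch of the arithmetic
behind `(x << c) >> c' for the unsigned 32-bit case only (an illustration
with hypothetical helper names; it does not model the signed case or the
MAX_FIXED_MODE_SIZE bound itself):

```c
#include <assert.h>
#include <stdint.h>

/* (x << c) >> c on a 32-bit unsigned value; valid for 0 < c < 32.  */
static uint32_t
low_bits_via_shifts (uint32_t x, unsigned c)
{
  return (x << c) >> c;
}

/* Same result: keep only the low WIDTH = 32 - c bits.  */
static uint32_t
low_bits_via_mask (uint32_t x, unsigned c)
{
  unsigned width = 32 - c;
  return x & ((UINT32_C (1) << width) - 1);
}

/* Hypothetical demo: checks every valid shift count and returns the
   low 8 bits of a sample value.  */
static uint32_t
shift_pair_demo (void)
{
  const uint32_t x = UINT32_C (0xDEADBEEF);
  for (unsigned c = 1; c < 32; c++)
    assert (low_bits_via_shifts (x, c) == low_bits_via_mask (x, c));
  return low_bits_via_shifts (x, 24);
}
```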

--- gcc/match.pd.jj 2023-09-19 19:47:23.969430272 +0200
+++ gcc/match.pd2023-09-19 20:08:17.341559409 +0200
@@ -4124,9 +4124,7 @@ (define_operator_list SYNC_FETCH_AND_AND
 (with {
   int width = element_precision (type) - tree_to_uhwi (@1);
   tree stype = NULL_TREE;
-  scalar_int_mode mode = (targetm.scalar_mode_supported_p (TImode)
- ? TImode : DImode);
-  if (width <= GET_MODE_PRECISION (mode))
+  if (width <= MAX_FIXED_MODE_SIZE)
stype = build_nonstandard_integer_type (width, 0);
  }
  (if (stype && (width == 1 || type_has_mode_precision_p (stype)))
--- gcc/gimple-lower-bitint.cc.jj   2023-09-08 11:29:20.105768005 +0200
+++ gcc/gimple-lower-bitint.cc  2023-09-19 20:01:50.927782331 +0200
@@ -100,21 +100,19 @@ bitint_precision_kind (int prec)
   small_max_prec = prec;
   return bitint_prec_small;
 }
-  scalar_int_mode arith_mode = (targetm.scalar_mode_supported_p (TImode)
-   ? TImode : DImode);
   if (!large_min_prec
-  && GET_MODE_PRECISION (arith_mode) > GET_MODE_PRECISION (limb_mode))
-large_min_prec = GET_MODE_PRECISION (arith_mode) + 1;
+  && GET_MODE_PRECISION (limb_mode) < MAX_FIXED_MODE_SIZE)
+large_min_prec = MAX_FIXED_MODE_SIZE + 1;
   if (!limb_prec)
 limb_prec = GET_MODE_PRECISION (limb_mode);
   if (!huge_min_prec)
 {
-  if (4 * limb_prec >= GET_MODE_PRECISION (arith_mode))
+  if (4 * limb_prec >= MAX_FIXED_MODE_SIZE)
huge_min_prec = 4 * limb_prec;
   else
-   huge_min_prec = GET_MODE_PRECISION (arith_mode) + 1;
+   huge_min_prec = MAX_FIXED_MODE_SIZE + 1;
 }
-  if (prec <= GET_MODE_PRECISION (arith_mode))
+  if (prec <= MAX_FIXED_MODE_SIZE)
 {
   if (!mid_min_prec || prec < mid_min_prec)
mid_min_prec = prec;
--- gcc/expr.cc.jj  2023-09-08 11:29:20.101768059 +0200
+++ gcc/expr.cc 2023-09-19 20:00:12.788108832 +0200
@@ -11044,17 +11044,11 @@ expand_expr_real_1 (tree exp, rtx target
scalar_int_mode limb_mode
  = as_a  (info.limb_mode);
unsigned int limb_prec = GET_MODE_PRECISION (limb_mode);
-   if (prec > limb_prec)
+   if (prec > limb_prec && prec > MAX_FIXED_MODE_SIZE)
  {
-   scalar_int_mode arith_mode
- = (targetm.scalar_mode_supported_p (TImode)
-? TImode : DImode);
-   if (prec > GET_MODE_PRECISION (arith_mode))
- {
-   /* Emit large/huge _BitInt INTEGER_CSTs into memory.  */
-   exp = tree_output_constant_def (exp);
-   return expand_expr (exp, target, VOIDmode, modifier);
- }
+   /* Emit large/huge _BitInt INTEGER_CSTs into memory.  */
+   exp = tree_output_constant_def (exp);
+   return expand_expr (exp, target, VOIDmode, modifier);
  }
  }
 
--- gcc/tree-ssa-sccvn.cc.jj2023-09-18 15:14:48.987358112 +0200
+++ gcc/tree-ssa-sccvn.cc   2023-09-19 20:02:53.160941163 +0200
@@ -7004,10 +7004,7 @@ eliminate_dom_walker::eliminate_stmt (ba
  && !type_has_mode_precision_p (TREE_TYPE (lhs)))
{
  if (TREE_CODE (TREE_TYPE (lhs)) == BITINT_TYPE
- && (TYPE_PRECISION (TREE_TYPE (lhs))
- > (targetm.scalar_mode_supported_p (TImode)
-? GET_MODE_PRECISION (TImode)
-: GET_MODE_PRECISION (DImode
+ && TYPE_PRECISION (TREE_TYPE (lhs)) > MAX_FIXED_MODE_SIZE)
lookup_lhs = NULL_TREE;
  else if (TREE_CODE (lhs) == COMPONENT_REF
   || TREE_CODE (lhs) == MEM_REF)
--- gcc/ubsan.cc.jj 2023-09-08 11:29:20.136767581 +0200
+++ gcc/ubsan.cc2023-09-19 20:06:56.118657251 +0200
@@ -136,13 +136,10 @@ ubsan_encode_value (tree t, enum ubsan_e
}
   else
{
- scalar_int_mode arith_mode
-   = (targetm.scalar_mode_supported_p (TImode) ? TImode : DImode);
- if (TYPE_PRECISION (type) > GET_MODE_PRECISION (arith_mode))
+ if (TYPE_PRECISION (type) > MAX_FIXED_MODE_SIZE)
return build_zero_cst (pointer_sized_int_node);
- type

Re: [PATCH] RISC-V: Reorganize and rename combine patterns in autovec-opt.md

2023-09-20 Thread Lehua Ding


Committed, thanks Juzhe and Robin.

On 2023/9/20 15:14, Robin Dapp wrote:

Hi Lehua,

this LGTM.

Regards
  Robin



--
Best,
Lehua

[PATCH] c, c++, v3: Accept __builtin_classify_type (typename)

2023-09-20 Thread Jakub Jelinek

On Mon, Sep 18, 2023 at 09:25:19PM +, Joseph Myers wrote:
> > I'd like to ping this patch.
> > The C++ FE part has been approved by Jason already with a minor change
> > I've made in my copy.
> > Are the remaining parts ok for trunk?
> 
> In the C front-end changes, since you end up discarding any side effects 
> from the type, I'd expect use of in_alignof to be more appropriate than 
> in_typeof (and thus not needing to use pop_maybe_used).

So like this?  Bootstrapped/regtested again on x86_64-linux and i686-linux.

2023-09-20  Jakub Jelinek  

gcc/
* builtins.h (type_to_class): Declare.
* builtins.cc (type_to_class): No longer static.  Return
int rather than enum.
* doc/extend.texi (__builtin_classify_type): Document.
gcc/c/
* c-parser.cc (c_parser_postfix_expression_after_primary): Parse
__builtin_classify_type call with typename as argument.
gcc/cp/
* parser.cc (cp_parser_postfix_expression): Parse
__builtin_classify_type call with typename as argument.
* pt.cc (tsubst_copy_and_build): Handle __builtin_classify_type
with dependent typename as argument.
gcc/testsuite/
* c-c++-common/builtin-classify-type-1.c: New test.
* g++.dg/ext/builtin-classify-type-1.C: New test.
* g++.dg/ext/builtin-classify-type-2.C: New test.
* gcc.dg/builtin-classify-type-1.c: New test.
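
A small sketch of the existing expression form, which the documentation
hunk below contrasts with the new typename form (an illustration only;
it assumes GCC or a compatible compiler, and the function names are
hypothetical).  The typename form is left out because it needs a
compiler that contains this patch:

```c
#include <assert.h>

/* An array passed as an expression is converted to a pointer before it
   is classified, so it lands in the same class as a pointer.  */
static int
array_expr_matches_pointer (void)
{
  int a[2] = { 0, 0 };
  void *p = a;
  return __builtin_classify_type (a) == __builtin_classify_type (p);
}

/* An int and a pointer fall into different classes.  */
static int
int_matches_pointer (void)
{
  int i = 0;
  void *p = &i;
  return __builtin_classify_type (i) == __builtin_classify_type (p);
}

/* Hypothetical demo: returns 1 when both expectations hold.  */
static int
classify_demo (void)
{
  assert (array_expr_matches_pointer ());
  assert (!int_matches_pointer ());
  return 1;
}
```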

--- gcc/builtins.h.jj   2023-01-03 00:20:34.856089856 +0100
+++ gcc/builtins.h  2023-06-12 09:35:20.841902572 +0200
@@ -156,5 +156,6 @@ extern internal_fn associated_internal_f
 extern internal_fn replacement_internal_fn (gcall *);
 
 extern bool builtin_with_linkage_p (tree);
+extern int type_to_class (tree);
 
 #endif /* GCC_BUILTINS_H */
--- gcc/builtins.cc.jj  2023-05-20 15:31:09.03352 +0200
+++ gcc/builtins.cc 2023-06-12 09:35:31.709751296 +0200
@@ -113,7 +113,6 @@ static rtx expand_builtin_apply_args (vo
 static rtx expand_builtin_apply_args_1 (void);
 static rtx expand_builtin_apply (rtx, rtx, rtx);
 static void expand_builtin_return (rtx);
-static enum type_class type_to_class (tree);
 static rtx expand_builtin_classify_type (tree);
 static rtx expand_builtin_mathfn_3 (tree, rtx, rtx);
 static rtx expand_builtin_mathfn_ternary (tree, rtx, rtx);
@@ -1852,7 +1851,7 @@ expand_builtin_return (rtx result)
 
 /* Used by expand_builtin_classify_type and fold_builtin_classify_type.  */
 
-static enum type_class
+int
 type_to_class (tree type)
 {
   switch (TREE_CODE (type))
--- gcc/doc/extend.texi.jj  2023-06-10 19:58:26.197478291 +0200
+++ gcc/doc/extend.texi 2023-06-12 18:06:24.629413024 +0200
@@ -14354,6 +14354,30 @@ need not be a constant.  @xref{Object Si
 description of the function.
 @enddefbuiltin
 
+@defbuiltin{int __builtin_classify_type (@var{arg})}
+@defbuiltinx{int __builtin_classify_type (@var{type})}
+The @code{__builtin_classify_type} returns a small integer with a category
+of @var{arg} argument's type, like void type, integer type, enumeral type,
+boolean type, pointer type, reference type, offset type, real type, complex
+type, function type, method type, record type, union type, array type,
+string type, etc.  When the argument is an expression, for
+backwards compatibility reason the argument is promoted like arguments
+passed to @code{...} in varargs function, so some classes are never returned
+in certain languages.  Alternatively, the argument of the built-in
+function can be a typename, such as the @code{typeof} specifier.
+
+@smallexample
+int a[2];
+__builtin_classify_type (a) == __builtin_classify_type (int[5]);
+__builtin_classify_type (a) == __builtin_classify_type (void*);
+__builtin_classify_type (typeof (a)) == __builtin_classify_type (int[5]);
+@end smallexample
+
+The first comparison will never be true, as @var{a} is implicitly converted
+to pointer.  The last two comparisons will be true as they classify
+pointers in the second case and arrays in the last case.
+@enddefbuiltin
+
 @defbuiltin{double __builtin_huge_val (void)}
 Returns a positive infinity, if supported by the floating-point format,
 else @code{DBL_MAX}.  This function is suitable for implementing the
--- gcc/c/c-parser.cc.jj2023-06-10 19:22:15.577205685 +0200
+++ gcc/c/c-parser.cc   2023-06-12 17:32:31.007413019 +0200
@@ -11213,6 +11213,29 @@ c_parser_postfix_expression_after_primar
literal_zero_mask = 0;
if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
  exprlist = NULL;
+   else if (TREE_CODE (expr.value) == FUNCTION_DECL
+&& fndecl_built_in_p (expr.value, BUILT_IN_CLASSIFY_TYPE)
+&& c_parser_next_tokens_start_typename (parser,
+cla_prefer_id))
+ {
+   /* __builtin_classify_type (type)  */
+   c_inhibit_evaluation_warnings++;
+   in_alignof++;
+   struct c_type_name *type = c_parser_type_name

Re: [PATCH] RISC-V: Reorganize and rename combine patterns in autovec-opt.md

2023-09-20 Thread Robin Dapp

Hi Lehua,

this LGTM.

Regards
 Robin

Re: [PATCH] RISC-V: Reorganize and rename combine patterns in autovec-opt.md

2023-09-20 Thread juzhe.zh...@rivai.ai

LGTM.



juzhe.zh...@rivai.ai
 
From: Lehua Ding
Date: 2023-09-20 15:03
To: gcc-patches
CC: juzhe.zhong; kito.cheng; rdapp.gcc; palmer; jeffreyalaw; lehua.ding
Subject: [PATCH] RISC-V: Reorganize and rename combine patterns in autovec-opt.md
This patch reorganizes and renames the combine patterns in autovec-opt.md
by category. There shouldn't be any functional changes.
The current classification includes the following categories:
 
- Combine op + vmerge to cond_op
- Combine binop + trunc to narrow_binop
- Combine extend + binop to widen_binop
- Combine extend + ternop to widen_ternop
- Misc combine patterns
 
gcc/ChangeLog:
 
* config/riscv/autovec-opt.md (*not): Move and rename.
(*n): Ditto.
(*vtrunc): Ditto.
(*trunc): Ditto.
(*narrow_): Ditto.
(*narrow__scalar): Ditto.
(*single_widen_mult): Ditto.
(*single_widen_mul): Ditto.
(*single_widen_mult): Ditto.
(*single_widen_mul): Ditto.
(*dual_widen_fma): Ditto.
(*dual_widen_fma): Ditto.
(*single_widen_fma): Ditto.
(*single_widen_fma): Ditto.
(*dual_fma): Ditto.
(*single_fma): Ditto.
(*dual_fnma): Ditto.
(*dual_widen_fnma): Ditto.
(*single_fnma): Ditto.
(*single_widen_fnma): Ditto.
(*dual_fms): Ditto.
(*dual_widen_fms): Ditto.
(*single_fms): Ditto.
(*single_widen_fms): Ditto.
(*dual_fnms): Ditto.
(*dual_widen_fnms): Ditto.
(*single_fnms): Ditto.
(*single_widen_fnms): Ditto.
 
---
gcc/config/riscv/autovec-opt.md | 203 ++--
1 file changed, 91 insertions(+), 112 deletions(-)
 
diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 66c77ad6ebb..46a344407c7 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -58,104 +58,6 @@
   }
)
 
-;; -
-;;  [BOOL] Binary logical operations (inverted second input)
-;; -
-;; Includes:
-;; - vmandnot.mm
-;; - vmornot.mm
-;; -
-
-(define_insn_and_split "*not"
-  [(set (match_operand:VB_VLS 0 "register_operand"   "=vr")
- (bitmanip_bitwise:VB_VLS
-   (not:VB_VLS (match_operand:VB_VLS 2 "register_operand" " vr"))
-   (match_operand:VB_VLS 1 "register_operand" " vr")))]
-  "TARGET_VECTOR && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(const_int 0)]
-  {
-insn_code icode = code_for_pred_not (, mode);
-riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_MASK_OP, operands);
-DONE;
-  }
-  [(set_attr "type" "vmalu")
-   (set_attr "mode" "")])
-
-;; -
-;;  [BOOL] Binary logical operations (inverted result)
-;; -
-;; Includes:
-;; - vmnand.mm
-;; - vmnor.mm
-;; - vmxnor.mm
-;; -
-
-(define_insn_and_split "*n"
-  [(set (match_operand:VB_VLS 0 "register_operand" "=vr")
- (not:VB_VLS
-   (any_bitwise:VB_VLS
- (match_operand:VB_VLS 1 "register_operand" " vr")
- (match_operand:VB_VLS 2 "register_operand" " vr"]
-  "TARGET_VECTOR && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(const_int 0)]
-  {
-insn_code icode = code_for_pred_n (, mode);
-riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_MASK_OP, operands);
-DONE;
-  }
-  [(set_attr "type" "vmalu")
-   (set_attr "mode" "")])
-
-;; -
-;;  [INT] Binary narrow shifts.
-;; -
-;; Includes:
-;; - vnsrl.wv/vnsrl.wx/vnsrl.wi
-;; - vnsra.wv/vnsra.wx/vnsra.wi
-;; -
-
-(define_insn_and_split "*vtrunc"
-  [(set (match_operand: 0 "register_operand"   "=vr,vr")
-(truncate:
-  (any_shiftrt:VWEXTI
-(match_operand:VWEXTI 1 "register_operand" " vr,vr")
- (any_extend:VWEXTI
-  (match_operand: 2 "vector_shift_operand" " vr,vk")]
-  "TARGET_VECTOR && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(const_int 0)]
-{
-  insn_code icode = code_for_pred_narrow (, mode);
-  riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
-  DONE;
-}
- [(set_attr "type" "vnshift")
-  (set_attr "mode" "")])
-
-(define_insn_and_split "*trunc"
-  [(set (match_operand: 0 "register_operand" "=vr")
-(truncate:
-  (any_shiftrt:VWEXTI
-(match_operand:VWEXTI 1 "register_operand"   " vr")
- (match_operand: 2 "csr_operand" " rK"]
-  "TARGET_VECTOR && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(const_int 0)]
-{
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  insn_code icode = code_for_pred_narrow_scalar (, mode);
-  riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
-  DONE;
-}
- [(set_attr

[Patch, fortran] PR68155 - ICE on initializing character array in type (len_lhs <> len_rhs)

2023-09-20 Thread Paul Richard Thomas

Hi All,

This is a straightforward patch that is adequately explained by the ChangeLog.

Regtests fine - OK for trunk?

Cheers

Paul

Fortran: Pad mismatched charlens in component initializers [PR68155]

2023-09-20  Paul Thomas  

gcc/fortran
PR fortran/68155
* decl.cc (fix_initializer_charlen): New function broken out of
add_init_expr_to_sym.
(add_init_expr_to_sym, build_struct): Call the new function.

gcc/testsuite/
PR fortran/68155
* gfortran.dg/pr68155.f90: New test.
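
For readers less familiar with Fortran character semantics, here is a
minimal standalone sketch in C of what "fixing" an initializer's length
means: the initializer is made exactly as long as the declared length,
blank-padded if it is too short. It is not GCC code. The helper name
pad_to_len is hypothetical, and the truncation branch is an assumption
based on the usual Fortran assignment rule rather than something stated
in the ChangeLog.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): write SRC into DST as exactly
   LEN characters, padding with blanks when SRC is shorter and
   truncating when it is longer.  DST must have room for LEN + 1 bytes;
   a NUL terminator is added only to make the result easy to inspect.  */
void
pad_to_len (char *dst, size_t len, const char *src)
{
  size_t n = strlen (src);

  if (n > len)
    n = len;                        /* Over-long initializer: truncate.  */

  memcpy (dst, src, n);             /* Copy the characters we keep.  */
  memset (dst + n, ' ', len - n);   /* Blank-pad up to the declared length.  */
  dst[len] = '\0';
}
```

With a declared length of 3, the initializer 'b' becomes "b" followed by
two blanks, which is the padding the new test checks for in components
such as c3.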
diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index 8182ef29f43..4a3c5b86de0 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -1960,6 +1960,45 @@ gfc_free_enum_history (void)
 }
 
 
+/* Function to fix initializer character length if the length of the
+   symbol or component is constant.  */
+
+static bool
+fix_initializer_charlen (gfc_typespec *ts, gfc_expr *init)
+{
+  if (!gfc_specification_expr (ts->u.cl->length))
+return false;
+
+  int k = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind, false);
+
+  /* resolve_charlen will complain later on if the length
+ is too large.  Just skip the initialization in that case.  */
+  if (mpz_cmp (ts->u.cl->length->value.integer,
+	   gfc_integer_kinds[k].huge) <= 0)
+{
+  HOST_WIDE_INT len
+		= gfc_mpz_get_hwi (ts->u.cl->length->value.integer);
+
+  if (init->expr_type == EXPR_CONSTANT)
+	gfc_set_constant_character_len (len, init, -1);
+  else if (init->expr_type == EXPR_ARRAY)
+	{
+	  gfc_constructor *cons;
+
+	  /* Build a new charlen to prevent simplification from
+	 deleting the length before it is resolved.  */
+	  init->ts.u.cl = gfc_new_charlen (gfc_current_ns, NULL);
+	  init->ts.u.cl->length = gfc_copy_expr (ts->u.cl->length);
+	  cons = gfc_constructor_first (init->value.constructor);
+	  for (; cons; cons = gfc_constructor_next (cons))
+	gfc_set_constant_character_len (len, cons->expr, -1);
+	}
+}
+
+  return true;
+}
+
+
 /* Function called by variable_decl() that adds an initialization
expression to a symbol.  */
 
@@ -2073,40 +2112,10 @@ add_init_expr_to_sym (const char *name, gfc_expr **initp, locus *var_locus)
 gfc_copy_expr (init->ts.u.cl->length);
 		}
 	}
-	  /* Update initializer character length according symbol.  */
-	  else if (sym->ts.u.cl->length->expr_type == EXPR_CONSTANT)
-	{
-	  if (!gfc_specification_expr (sym->ts.u.cl->length))
-		return false;
-
-	  int k = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind,
-	 false);
-	  /* resolve_charlen will complain later on if the length
-		 is too large.  Just skeep the initialization in that case.  */
-	  if (mpz_cmp (sym->ts.u.cl->length->value.integer,
-			   gfc_integer_kinds[k].huge) <= 0)
-		{
-		  HOST_WIDE_INT len
-		= gfc_mpz_get_hwi (sym->ts.u.cl->length->value.integer);
-
-		  if (init->expr_type == EXPR_CONSTANT)
-		gfc_set_constant_character_len (len, init, -1);
-		  else if (init->expr_type == EXPR_ARRAY)
-		{
-		  gfc_constructor *c;
-
-		  /* Build a new charlen to prevent simplification from
-			 deleting the length before it is resolved.  */
-		  init->ts.u.cl = gfc_new_charlen (gfc_current_ns, NULL);
-		  init->ts.u.cl->length
-			= gfc_copy_expr (sym->ts.u.cl->length);
-
-		  for (c = gfc_constructor_first (init->value.constructor);
-			   c; c = gfc_constructor_next (c))
-			gfc_set_constant_character_len (len, c->expr, -1);
-		}
-		}
-	}
+	  /* Update initializer character length according to symbol.  */
+	  else if (sym->ts.u.cl->length->expr_type == EXPR_CONSTANT
+		   && !fix_initializer_charlen (&sym->ts, init))
+	return false;
 	}
 
   if (sym->attr.flavor == FL_PARAMETER && sym->attr.dimension && sym->as
@@ -2369,6 +2378,13 @@ build_struct (const char *name, gfc_charlen *cl, gfc_expr **init,
   c->initializer = *init;
   *init = NULL;
 
+  /* Update initializer character length according to component.  */
+  if (c->ts.type == BT_CHARACTER && c->ts.u.cl->length
+  && c->ts.u.cl->length->expr_type == EXPR_CONSTANT
+  && c->initializer && c->initializer->ts.type == BT_CHARACTER
+  && !fix_initializer_charlen (&c->ts, c->initializer))
+return false;
+
   c->as = *as;
   if (c->as != NULL)
 {
! { dg-do run }
!
! Fix for PR68155 in which initializers of constant length, character
! components of derived types were not being padded if they were too short.
! Originally, mismatched lengths caused ICEs. This seems to have been fixed
! in 9-branch.
!
! Contributed by Gerhard Steinmetz  
!
program p
  implicit none
  type t
character(3) :: c1(2) = [ 'b', 'c']  ! OK
character(3) :: c2(2) = [ character(1) :: 'b', 'c'] // ""! OK
character(3) :: c3(2) = [ 'b', 'c'] // ""! was not padded
character(3) :: c4(2) = [ '' , '' ] // ""! was not padded
character(3) :: c5(2) = [ 'b', 'c'] // 'a'   ! was not padded
character(3) :: c6(2) = [

[PATCH] RISC-V: Reorganize and rename combine patterns in autovec-opt.md

2023-09-20 Thread Lehua Ding

This patch reorganizes and renames the combine patterns in autovec-opt.md
by category. There should be no functional changes.
The current classification includes the following categories:

- Combine op + vmerge to cond_op
- Combine binop + trunc to narrow_binop
- Combine extend + binop to widen_binop
- Combine extend + ternop to widen_ternop
- Misc combine patterns

gcc/ChangeLog:

* config/riscv/autovec-opt.md (*not): Move and rename.
(*n): Ditto.
(*vtrunc): Ditto.
(*trunc): Ditto.
(*narrow_): Ditto.
(*narrow__scalar): Ditto.
(*single_widen_mult): Ditto.
(*single_widen_mul): Ditto.
(*single_widen_mult): Ditto.
(*single_widen_mul): Ditto.
(*dual_widen_fma): Ditto.
(*dual_widen_fma): Ditto.
(*single_widen_fma): Ditto.
(*single_widen_fma): Ditto.
(*dual_fma): Ditto.
(*single_fma): Ditto.
(*dual_fnma): Ditto.
(*dual_widen_fnma): Ditto.
(*single_fnma): Ditto.
(*single_widen_fnma): Ditto.
(*dual_fms): Ditto.
(*dual_widen_fms): Ditto.
(*single_fms): Ditto.
(*single_widen_fms): Ditto.
(*dual_fnms): Ditto.
(*dual_widen_fnms): Ditto.
(*single_fnms): Ditto.
(*single_widen_fnms): Ditto.

---
 gcc/config/riscv/autovec-opt.md | 203 ++--
 1 file changed, 91 insertions(+), 112 deletions(-)

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 66c77ad6ebb..46a344407c7 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -58,104 +58,6 @@
   }
 )

-;; -
-;;  [BOOL] Binary logical operations (inverted second input)
-;; -
-;; Includes:
-;; - vmandnot.mm
-;; - vmornot.mm
-;; -
-
-(define_insn_and_split "*not"
-  [(set (match_operand:VB_VLS 0 "register_operand"   "=vr")
-   (bitmanip_bitwise:VB_VLS
- (not:VB_VLS (match_operand:VB_VLS 2 "register_operand" " vr"))
- (match_operand:VB_VLS 1 "register_operand" " vr")))]
-  "TARGET_VECTOR && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(const_int 0)]
-  {
-insn_code icode = code_for_pred_not (, mode);
-riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_MASK_OP, operands);
-DONE;
-  }
-  [(set_attr "type" "vmalu")
-   (set_attr "mode" "")])
-
-;; -
-;;  [BOOL] Binary logical operations (inverted result)
-;; -
-;; Includes:
-;; - vmnand.mm
-;; - vmnor.mm
-;; - vmxnor.mm
-;; -
-
-(define_insn_and_split "*n"
-  [(set (match_operand:VB_VLS 0 "register_operand" "=vr")
-   (not:VB_VLS
- (any_bitwise:VB_VLS
-   (match_operand:VB_VLS 1 "register_operand" " vr")
-   (match_operand:VB_VLS 2 "register_operand" " vr"]
-  "TARGET_VECTOR && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(const_int 0)]
-  {
-insn_code icode = code_for_pred_n (, mode);
-riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_MASK_OP, operands);
-DONE;
-  }
-  [(set_attr "type" "vmalu")
-   (set_attr "mode" "")])
-
-;; -
-;;  [INT] Binary narrow shifts.
-;; -
-;; Includes:
-;; - vnsrl.wv/vnsrl.wx/vnsrl.wi
-;; - vnsra.wv/vnsra.wx/vnsra.wi
-;; -
-
-(define_insn_and_split "*vtrunc"
-  [(set (match_operand: 0 "register_operand"   "=vr,vr")
-(truncate:
-  (any_shiftrt:VWEXTI
-(match_operand:VWEXTI 1 "register_operand" " vr,vr")
-   (any_extend:VWEXTI
-  (match_operand: 2 "vector_shift_operand" " vr,vk")]
-  "TARGET_VECTOR && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(const_int 0)]
-{
-  insn_code icode = code_for_pred_narrow (, mode);
-  riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
-  DONE;
-}
- [(set_attr "type" "vnshift")
-  (set_attr "mode" "")])
-
-(define_insn_and_split "*trunc"
-  [(set (match_operand: 0 "register_operand" "=vr")
-(truncate:
-  (any_shiftrt:VWEXTI
-(match_operand:VWEXTI 1 "register_operand"   " vr")
-   (match_operand: 2 "csr_operand" " rK"]
-  "TARGET_VECTOR && can_create_pseudo_p ()"
-  "#"
-  "&& 1"
-  [(const_int 0)]
-{
-  operands[2] = gen_lowpart (Pmode, operands[2]);
-  insn_code icode = code_for_pred_narrow_scalar (, mode);
-  riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP,
