[Bug fortran/82774] [10/11/12/13/14 Regression] Structure constructor and deferred type parameter character component

2023-05-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82774

Paul Thomas  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #12 from Paul Thomas  ---
Hi Steve,

Fixed on trunk. I will back port to 13-branch in 2-3 weeks time.

Thanks for the report - it's a pity that it took so long to fix :-(

Paul

[Bug fortran/104429] [10/11/12/13/14 Regression] ICE in gfc_conv_variable, at fortran/trans-expr.cc:3056 since r9-2664-g1312bb902382cb48

2023-05-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104429

Paul Thomas  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #6 from Paul Thomas  ---
Fixed on trunk. I will back port to 13-branch in 2-3 weeks time.

Thanks for the report

Paul

[Bug fortran/103389] [10/11/12/13/14 Regression] ICE in estimate_move_cost, at tree-inline.c:4191 since r9-5784-ga3df90b9672562d0

2023-05-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103389

Paul Thomas  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Paul Thomas  ---
Fixed on trunk. I will back port to 13-branch in 2-3 weeks time.

Thanks for the report

Paul

[Bug fortran/87946] [10/11/12/13/14 Regression] ICE in gfc_walk_array_ref, at fortran/trans-array.c:10506

2023-05-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87946

Paul Thomas  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #10 from Paul Thomas  ---
Fixed on trunk. I will back port to 13-branch in 2-3 weeks time.

Thanks for the report

Paul

[Bug target/87496] ICE in aggregate_value_p at gcc/function.c:2046

2023-05-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87496

Paul Thomas  changed:

   What|Removed |Added

 CC||pault at gcc dot gnu.org

--- Comment #18 from Paul Thomas  ---
Sorry for the noise - a bit of a slip occurred in the ChangeLog.

Cheers

Paul

[Bug fortran/100193] [10/11/12/13/14 Regression] ICE in alloc_scalar_allocatable_for_assignment, at fortran/trans-expr.c:10837

2023-05-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100193

Paul Thomas  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Paul Thomas  ---
Fixed on trunk. I will back port to 13-branch in 2-3 weeks time.

Thanks for the report

Paul

[Bug fortran/105152] [10/11/12/13/14 Regression] ICE in gfc_conv_gfc_desc_to_cfi_desc, at fortran/trans-expr.cc:5647 since r9-5372-gbbf18dc5d248a79a

2023-05-15 Thread pault at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105152

Paul Thomas  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|NEW |RESOLVED

--- Comment #6 from Paul Thomas  ---
Fixed on trunk. I will back port to 13-branch in 2-3 weeks time.

Thanks for the report

Paul

[Bug fortran/103389] [10/11/12/13/14 Regression] ICE in estimate_move_cost, at tree-inline.c:4191 since r9-5784-ga3df90b9672562d0

2023-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103389

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:6c95fe9bc0553743098eeaa739f14b885050fa42

commit r14-870-g6c95fe9bc0553743098eeaa739f14b885050fa42
Author: Paul Thomas 
Date:   Tue May 16 06:35:40 2023 +0100

Fortran: Fix an assortment of bugs

2023-05-16  Paul Thomas  

gcc/fortran
PR fortran/105152
* interface.cc (gfc_compare_actual_formal): Emit an error if an
unlimited polymorphic actual is not matched either to an
unlimited or assumed type formal argument.

PR fortran/100193
* resolve.cc (resolve_ordinary_assign): Emit an error if the
var expression of an ordinary assignment is a proc pointer
component.

PR fortran/87496
* trans-array.cc (gfc_walk_array_ref): Provide assumed shape
arrays coming from interface mapping with a viable arrayspec.

PR fortran/103389
* trans-expr.cc (gfc_conv_intrinsic_to_class): Tidy up flagging
of unlimited polymorphic 'class_ts'.
(gfc_conv_gfc_desc_to_cfi_desc): Assumed type is unlimited
polymorphic and should accept any actual type.

PR fortran/104429
(gfc_conv_procedure_call): Replace dreadful kludge with a call
to gfc_finalize_tree_expr. Avoid dereferencing a void pointer
by giving it the pointer type of the actual argument.

PR fortran/82774
(alloc_scalar_allocatable_subcomponent): Shorten the function
name and replace the symbol argument with the se string length.
If a deferred length character length is either not present or
is not a variable, give the typespec a variable and assign the
string length to that. Use gfc_deferred_strlen to find the
hidden string length component.
(gfc_trans_subcomponent_assign): Convert the expression before
the call to alloc_scalar_allocatable_subcomponent so that a
good string length is provided.
(gfc_trans_structure_assign): Remove the unneeded derived type
symbol from calls to gfc_trans_subcomponent_assign.

gcc/testsuite/
PR fortran/105152
* gfortran.dg/pr105152.f90 : New test

PR fortran/100193
* gfortran.dg/pr100193.f90 : New test

PR fortran/87946
* gfortran.dg/pr87946.f90 : New test

PR fortran/103389
* gfortran.dg/pr103389.f90 : New test

PR fortran/104429
* gfortran.dg/pr104429.f90 : New test

PR fortran/82774
* gfortran.dg/pr82774.f90 : New test

[Bug fortran/87946] [10/11/12/13/14 Regression] ICE in gfc_walk_array_ref, at fortran/trans-array.c:10506

2023-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87946

--- Comment #9 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:6c95fe9bc0553743098eeaa739f14b885050fa42


[Bug target/87496] ICE in aggregate_value_p at gcc/function.c:2046

2023-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87496

--- Comment #17 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:6c95fe9bc0553743098eeaa739f14b885050fa42


[Bug fortran/82774] [10/11/12/13/14 Regression] Structure constructor and deferred type parameter character component

2023-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82774

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:6c95fe9bc0553743098eeaa739f14b885050fa42


[Bug fortran/104429] [10/11/12/13/14 Regression] ICE in gfc_conv_variable, at fortran/trans-expr.cc:3056 since r9-2664-g1312bb902382cb48

2023-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104429

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:6c95fe9bc0553743098eeaa739f14b885050fa42


[Bug fortran/105152] [10/11/12/13/14 Regression] ICE in gfc_conv_gfc_desc_to_cfi_desc, at fortran/trans-expr.cc:5647 since r9-5372-gbbf18dc5d248a79a

2023-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105152

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:6c95fe9bc0553743098eeaa739f14b885050fa42


[Bug fortran/100193] [10/11/12/13/14 Regression] ICE in alloc_scalar_allocatable_for_assignment, at fortran/trans-expr.c:10837

2023-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100193

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Paul Thomas :

https://gcc.gnu.org/g:6c95fe9bc0553743098eeaa739f14b885050fa42


Re: [Testsuite] Skip -fdelete-null-pointer-check tests if target keeps_null_pointer_checks

2023-05-15 Thread Jeff Law via Gcc-patches




On 5/14/23 23:06, SenthilKumar.Selvaraj--- via Gcc-patches wrote:

Hi,

When running regression tests related to 
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616772.html,
I noticed a bunch of failures because some tests explicitly pass in
-fdelete-null-pointer-checks, even if the target is configured to keep them.

This patch skips such failing tests by adding a dg-skip-if for 
keeps_null_pointer_checks.
Ok to commit?
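
For illustration, a minimal sketch of a test header using the directive described above. Only the dg-skip-if line reflects the change in this patch; the function name and body are hypothetical and are not taken from any of the listed tests.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -fdelete-null-pointer-checks" } */
/* Skip on targets that are configured to keep null pointer checks.  */
/* { dg-skip-if "" keeps_null_pointer_checks } */

/* Hypothetical test body: the dereference comes first, so with
   -fdelete-null-pointer-checks the compiler is allowed to treat the
   later null check as dead code.  */
int
deref_then_check (int *p)
{
  int v = *p;
  if (p == 0)
    return -1;
  return v;
}
```

On a target that keeps null pointer checks, the check above would survive and a scan for its removal would fail, which is why such tests are skipped rather than adjusted.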

Regards
Senthil

gcc/testsuite/ChangeLog:

* gcc.dg/attr-returns-nonnull.c: Skip if
keeps_null_pointer_checks.
* gcc.dg/init-compare-1.c: Likewise.
* gcc.dg/ipa/pr85734.c: Likewise.
* gcc.dg/ipa/propmalloc-1.c: Likewise.
* gcc.dg/ipa/propmalloc-2.c: Likewise.
* gcc.dg/ipa/propmalloc-3.c: Likewise.
* gcc.dg/ipa/propmalloc-4.c: Likewise.
* gcc.dg/tree-ssa/evrp11.c: Likewise.
* gcc.dg/tree-ssa/pr83648.c: Likewise.

OK.
jeff


Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-15 Thread juzhe.zh...@rivai.ai
>> The examples are good, but this one made me wonder: why is the
>> adjustment made to the limit (namely 16, the gap between _39 and _41)
>> different from the limits imposed by the MIN_EXPR (32)?  And I think
>> the answer is that:

>> - _47 counts the number of elements processed by the loop in total,
>>   including the vectors under the control of _44

>> - _44 counts the number of elements controlled by _47 in the next
>>   iteration of the vector loop (if there is one)

>> And that's needed to allow the IVs to be updated independently.

>> The difficulty with this is that the len_load* and len_store*
>> optabs currently say that the behaviour is undefined if the
>> length argument is greater than the length of a vector.
>> So I think using these values of _47 and _44 in the .LEN_STOREs
>> is relying on undefined behaviour.

>> Haven't had time to think about the consequences of that yet,
>> but wanted to send something out sooner rather than later.

Hi, Richard. I totally understand your concern now. I think the undefined
behavior is more appropriate for RVV, since we have the vsetvli instruction,
which guarantees that this will not cause potential issues. However, for some
other targets, we may need to use an additional MIN_EXPR to ensure that the
length never exceeds VF. I think that can be addressed in the future when it
is needed.
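
To make the length discussion concrete, here is a minimal scalar sketch in C. It is not the vectorizer's code; the function names and the scalar inner loop are assumptions for illustration only. It shows a decrementing counter whose per-iteration length is capped with a MIN, and the way one total length can be split across several controls, following the comment in vect_adjust_loop_lens quoted below.

```c
#include <stddef.h>

/* Hypothetical helper standing in for MIN_EXPR.  */
static size_t
min_size (size_t a, size_t b)
{
  return a < b ? a : b;
}

/* Add 1 to N elements, at most VF per iteration, using a decrementing
   counter.  VF must be nonzero.  Returns the number of iterations.  */
size_t
add_one_decrementing (int *data, size_t n, size_t vf)
{
  size_t remaining = n;
  size_t iters = 0;
  while (remaining > 0)
    {
      /* The MIN guard keeps the length within one vector.  */
      size_t len = min_size (remaining, vf);
      /* Stands in for a length-controlled vector operation.  */
      for (size_t i = 0; i < len; ++i)
        data[i] += 1;
      data += len;
      remaining -= len;
      ++iters;
    }
  return iters;
}

/* Split TOTAL into NCTRL lengths, each capped at PER_CTRL:
   first = MIN (total, per_ctrl), second = MIN (total - first, per_ctrl),
   and so on.  */
void
split_loop_len (size_t total, size_t per_ctrl, size_t nctrl, size_t *lens)
{
  size_t rest = total;
  for (size_t i = 0; i < nctrl; ++i)
    {
      lens[i] = min_size (rest, per_ctrl);
      rest -= lens[i];
    }
}
```

With n = 10 and vf = 4 the loop runs three times with lengths 4, 4 and 2. A target whose length instruction clamps the value in hardware could drop the explicit MIN, which is the case being discussed for RVV.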

For now, is the V9 patch OK for trunk?
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618638.html
It addresses the comments you made.

Besides, we are going to add more patterns that take a length:
len_mask_load/len_mask_store, len_mask_gather_load, len_cond..., etc.
They all have undefined behavior for a length larger than the vector length.

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 03:44
To: juzhe.zhong
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
juzhe.zh...@rivai.ai writes:
> From: Juzhe-Zhong 
>
> This patch implements a decrement IV for the length approach in loop control.
>
> Address the comment from Kewen to incorporate the implementation inside
> "vect_set_loop_controls_directly" instead of a standalone function.
>
> Address the comment from Richard to use MIN_EXPR to handle the following
> 3 cases:
> 1. single rgroup.
> 2. multiple rgroup for SLP.
> 3. multiple rgroup for non-SLP (tested on vec_pack_trunc).
 
Thanks, this looks pretty reasonable to me FWIW, but some comments below:
 
> Bootstrapped and regression-tested on x86.
>
> Ok for trunk ?
>
> gcc/ChangeLog:
>
> * tree-vect-loop-manip.cc (vect_adjust_loop_lens): New function.
> (vect_set_loop_controls_directly): Add decrement IV support.
> (vect_set_loop_condition_partial_vectors): Ditto.
> * tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Add a new 
> variable.
> (vect_get_loop_len): Add decrement IV support.
> * tree-vect-stmts.cc (vectorizable_store): Ditto.
> (vectorizable_load): Ditto.
> * tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.
> (vect_get_loop_len): Add decrement IV support.
>
> ---
>  gcc/tree-vect-loop-manip.cc | 177 +++-
>  gcc/tree-vect-loop.cc   |  38 +++-
>  gcc/tree-vect-stmts.cc  |   9 +-
>  gcc/tree-vectorizer.h   |  13 ++-
>  4 files changed, 224 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index ff6159e08d5..1baac7b1b52 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -385,6 +385,58 @@ vect_maybe_permute_loop_masks (gimple_seq *seq, 
> rgroup_controls *dest_rgm,
>return false;
>  }
>  
> +/* Try to use adjust loop lens for non-SLP multiple-rgroups.
> +
> + _36 = MIN_EXPR ;
> +
> + First length (MIN (X, VF/N)):
> +   loop_len_15 = MIN_EXPR <_36, VF/N>;
> +
> + Second length:
> +   tmp = _36 - loop_len_15;
> +   loop_len_16 = MIN (tmp, VF/N);
> +
> + Third length:
> +   tmp2 = tmp - loop_len_16;
> +   loop_len_17 = MIN (tmp2, VF/N);
> +
> + Forth length:
> +   tmp3 = tmp2 - loop_len_17;
> +   loop_len_18 = MIN (tmp3, VF/N);  */
> +
> +static void
> +vect_adjust_loop_lens (tree iv_type, gimple_seq *seq, rgroup_controls 
> *dest_rgm,
> +rgroup_controls *src_rgm)
> +{
> +  tree ctrl_type = dest_rgm->type;
> +  poly_uint64 nitems_per_ctrl
> += TYPE_VECTOR_SUBPARTS (ctrl_type) * dest_rgm->factor;
> +
> +  for (unsigned int i = 0; i < dest_rgm->controls.length (); ++i)
> +{
> +  tree src = src_rgm->controls[i / dest_rgm->controls.length ()];
> +  tree dest = dest_rgm->controls[i];
> +  tree length_limit = build_int_cst (iv_type, nitems_per_ctrl);
> +  gassign *stmt;
> +  if (i == 0)
> + {
> +   /* MIN (X, VF*I/N) capped to the range [0, VF/N].  */
> +   stmt = gimple_build_assign (dest, MIN_EXPR, src, length_limit);
> +   gimple_seq_add_stmt (seq, stmt);
> + }
> +  else
> + {
> +   /* (MIN (remain, 

Re: [gcc13 backport] RISCV: Inline subword atomic ops

2023-05-15 Thread Jeff Law via Gcc-patches




On 5/9/23 10:01, Patrick O'Neill wrote:

Ping.

OK for backporting.  Sorry for the delay.

jeff


[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #13 from Sam James  ---
the OOB read seems to go away with --enable-checking=yes,rtl,extra (previously
had --enable-checking=release)...? (at least for 13)

Re: [PATCH] riscv: Add autovectorization tests for binary integer

2023-05-15 Thread Jeff Law via Gcc-patches




On 5/15/23 03:15, juzhe.zh...@rivai.ai wrote:

I think it is the issue of include file.

Kito may know a better solution than changing stdint.h into
stdint-gcc.h.

I think that's the only solution right now.  I'm not keen to open up the
multilib can of worms.


Consider a patch that changes stdint.h -> stdint-gcc.h in the RVV 
testsuite pre-approved.


jeff


Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2

2023-05-15 Thread Jeff Law via Gcc-patches




On 5/15/23 07:16, Jin Ma wrote:

This patch adds the 'Zfa' extension for riscv, which is based on:
https://github.com/riscv/riscv-isa-manual/commits/zfb

The binutils-gdb for 'Zfa' extension:
https://sourceware.org/pipermail/binutils/2023-April/127060.html

What needs special explanation is:
1, The immediate operand of the FLI.H/S/D instructions is represented in
   assembly as a floating-point value, in scientific notation when rs1 is
   2 or 3 and in decimal notation for the rest.

   Related llvm link:
 https://reviews.llvm.org/D145645
   Related discussion link:
 https://github.com/riscv/riscv-isa-manual/issues/980

2, According to riscv-spec, "The FCVTMOD.W.D instruction was added
   principally to accelerate the processing of JavaScript Numbers.", so it
   seems that no implementation is required.

3, The instructions FMINM and FMAXM correspond to the C23 library functions
   fminimum and fmaximum. Therefore, this patch has simply implemented the
   patterns fminm3 and fmaxm3 to prepare for later.
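
As a rough illustration of point 1 and of the index lookup that riscv_float_const_rtx_index_for_fli is described as performing, here is a standalone C sketch. It is not the patch's code: the function name, the use of float, and above all the table contents are assumptions based on the draft Zfa specification and should be checked against it.

```c
#include <float.h>
#include <math.h>

#define FLI_TABLE_SIZE 32

/* Assumed FLI.S constant table, indexed by the rs1 field.  Entries 2
   and 3 are the small powers of two that the assembly syntax prints
   in scientific notation.  */
static const float fli_s_table[FLI_TABLE_SIZE] = {
  -1.0f, FLT_MIN, 0x1p-16f, 0x1p-15f, 0x1p-8f, 0x1p-7f, 0.0625f, 0.125f,
  0.25f, 0.3125f, 0.375f, 0.4375f, 0.5f, 0.625f, 0.75f, 0.875f,
  1.0f, 1.25f, 1.5f, 1.75f, 2.0f, 2.5f, 3.0f, 4.0f,
  8.0f, 16.0f, 128.0f, 256.0f, 0x1p15f, 0x1p16f, INFINITY, NAN
};

/* Return the table index if X can be loaded with a single FLI,
   or -1 if it cannot.  NaN payloads are ignored in this sketch.  */
int
fli_index_for (float x)
{
  if (x != x)   /* NaN never compares equal, so handle it first.  */
    return FLI_TABLE_SIZE - 1;
  for (int i = 0; i < FLI_TABLE_SIZE - 1; ++i)
    if (x == fli_s_table[i])
      return i;
  return -1;
}
```

A constant that is not in the table, such as 0.1f, yields -1 and would have to be materialized some other way, for example from memory.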

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add zfa extension version.
* config/riscv/constraints.md (zfli): Constrain the floating point 
number that the
instructions FLI.H/S/D can load.
* config/riscv/iterators.md (ceil): New.
(rup): New.
* config/riscv/riscv-opts.h (MASK_ZFA): New.
(TARGET_ZFA): New.
* config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): 
New.
* config/riscv/riscv.cc (riscv_float_const_rtx_index_for_fli): New.
(riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, 
memory is not applicable.
(riscv_const_insns): Likewise.
(riscv_legitimize_const_move): Likewise.
(riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no 
split is required.
(riscv_split_doubleword_move): Likewise.
(riscv_output_move): Output the mov instructions in zfa extension.
(riscv_print_operand): Output the floating-point value of the FLI.H/S/D 
immediate in assembly
(riscv_secondary_memory_needed): Likewise.
* config/riscv/riscv.md (fminm3): New.
(fmaxm3): New.
(movsidf2_low_rv32): New.
(movsidf2_high_rv32): New.
(movdfsisi3_rv32): New.
(f_quiet4_zfa): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
* gcc.target/riscv/zfa-fleq-fltq.c: New test.
* gcc.target/riscv/zfa-fli-rv32.c: New test.
* gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
* gcc.target/riscv/zfa-fli-zfh.c: New test.
* gcc.target/riscv/zfa-fli.c: New test.
* gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
* gcc.target/riscv/zfa-fround-rv32.c: New test.
* gcc.target/riscv/zfa-fround.c: New test.
---
  gcc/common/config/riscv/riscv-common.cc   |   4 +
  gcc/config/riscv/constraints.md   |  21 +-
  gcc/config/riscv/iterators.md |   5 +
  gcc/config/riscv/riscv-opts.h |   3 +
  gcc/config/riscv/riscv-protos.h   |   1 +
  gcc/config/riscv/riscv.cc | 204 +-
  gcc/config/riscv/riscv.md | 145 +++--
  .../gcc.target/riscv/zfa-fleq-fltq-rv32.c |  19 ++
  .../gcc.target/riscv/zfa-fleq-fltq.c  |  19 ++
  gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 +++
  .../gcc.target/riscv/zfa-fli-zfh-rv32.c   |  41 
  gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 
  gcc/testsuite/gcc.target/riscv/zfa-fli.c  |  79 +++
  .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 +
  .../gcc.target/riscv/zfa-fround-rv32.c|  42 
  gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 
  16 files changed, 719 insertions(+), 36 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c





+
+/* Return index of the FLI instruction table if rtx X is an immediate constant 
that can
+   be moved using a single FLI instruction in zfa extension. Return -1 if not 
found.  */
+
+int
+riscv_float_const_rtx_index_for_fli (rtx x)
+{
+  unsigned HOST_WIDE_INT *fli_value_array;
+
+  machine_mode mode = GET_MODE (x);
+
+  if (!TARGET_ZFA
+  || !CONST_DOUBLE_P(x)
+  || mode == VOIDmode
+  || (mode == HFmode && !TARGET_ZFH)
+  || (mode == SFmode && !TARGET_HARD_FLOAT)
+  || (mode == 

Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2

2023-05-15 Thread Jeff Law via Gcc-patches




On 5/15/23 07:30, jinma wrote:

According to Jeff's review feedback, the issues regarding UNSPEC's 
implementation of round, ceil, nearbyint, etc. still need to be determined:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617706.html

source:
https://github.com/majin2020/gcc-mirror/commit/93d7a2d995cee588d494d1839f56e8151c6cb057
After double-checking I was incorrect.  We have named patterns for those 
operations, but the RTL for them are UNSPECs.  So this is a non-issue 
for this patch.


jeff


Re: [PATCH v8] RISC-V: Add the 'zfa' extension, version 0.2.

2023-05-15 Thread Jeff Law via Gcc-patches




On 5/6/23 06:53, jinma wrote:

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 9b767038452..c81b08e3cc5 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -288,3 +288,8 @@ (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
 (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
 (define_int_attr QUIET_PATTERN [(UNSPEC_FLT_QUIET "LT") (UNSPEC_FLE_QUIET "LE")])
 
+(define_int_iterator ROUND [UNSPEC_ROUND UNSPEC_FLOOR UNSPEC_CEIL UNSPEC_BTRUNC UNSPEC_ROUNDEVEN UNSPEC_NEARBYINT])
+(define_int_attr round_pattern [(UNSPEC_ROUND "round") (UNSPEC_FLOOR "floor") (UNSPEC_CEIL "ceil")
+   (UNSPEC_BTRUNC "btrunc") (UNSPEC_ROUNDEVEN "roundeven") (UNSPEC_NEARBYINT "nearbyint")])
+(define_int_attr round_rm [(UNSPEC_ROUND "rmm") (UNSPEC_FLOOR "rdn") (UNSPEC_CEIL "rup")
+  (UNSPEC_BTRUNC "rtz") (UNSPEC_ROUNDEVEN "rne") (UNSPEC_NEARBYINT "dyn")])

Do we really need to use unspecs for all these cases?  I would expect
some correspond to the trunc, round, ceil, nearbyint, etc well known RTX
codes.

In general, we should try to avoid unspecs when there is a clear
semantic match between the instruction and GCC's RTX opcodes.  So please
review the existing RTX code semantics to see if any match the new
instructions.  If there are matches, use those RTX codes rather than
UNSPECs.


I'll try, thanks.



I encountered some confusion about this. I checked gcc's documents and
found no RTX codes that can correspond to round, ceil, nearbyint, etc.
Only "(fix:m x)" seems to correspond to trunc, which can be expressed
as rounding towards zero, while others have not yet been found.
You're largely correct.  My bad.  There are named patterns for round to 
integer, nearbyint, etc, but no RTX codes.  So they need to be handled 
as unspecs.  Sorry for the confusion.


Jeff
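
For readers following the thread, the pairing expressed by the round_pattern and round_rm attributes quoted above can be sketched as a plain lookup table. This is only an illustration in C: the enum, struct, and function names are hypothetical and are not part of the patch; only the pattern names and rounding-mode strings are taken from the quoted iterators.md hunk.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the UNSPEC_* members of the ROUND iterator.  */
enum round_op
{
  OP_ROUND,
  OP_FLOOR,
  OP_CEIL,
  OP_BTRUNC,
  OP_ROUNDEVEN,
  OP_NEARBYINT,
  OP_COUNT
};

/* One row per operation: named pattern and rounding-mode suffix.  */
struct round_entry
{
  const char *pattern;
  const char *rm;
};

static const struct round_entry round_table[OP_COUNT] = {
  [OP_ROUND]     = { "round",     "rmm" },
  [OP_FLOOR]     = { "floor",     "rdn" },
  [OP_CEIL]      = { "ceil",      "rup" },
  [OP_BTRUNC]    = { "btrunc",    "rtz" },
  [OP_ROUNDEVEN] = { "roundeven", "rne" },
  [OP_NEARBYINT] = { "nearbyint", "dyn" },
};

/* Return the pattern name for OP, e.g. "floor".  */
const char *
round_pattern_name (enum round_op op)
{
  return round_table[op].pattern;
}

/* Return the rounding-mode string for OP, e.g. "rdn".  */
const char *
round_rm_name (enum round_op op)
{
  return round_table[op].rm;
}
```

Each operation differs only in the static rounding mode it selects, which is why a single iterator with two attributes is enough to describe all six.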


Re: [PATCH] RISC-V: Add rounding mode operand for floating point instructions

2023-05-15 Thread Jeff Law via Gcc-patches




On 5/15/23 07:54, 钟居哲 wrote:
I don't know why we should not add frm vfsqrt.v since I saw topper (LLVM 
maintainer) said we should not add frm into vsqrt.v. Maybe kito knows the
reason?
I'm pretty sure this is referring to the estimator.   The documentation 
is very clear that the sqrt estimator is independent of the rounding mode.


While it's not as explicit in the RV manual, a real sqrt instruction 
must round to be usable in an IEEE compliant way.  If it didn't honor 
rounding modes we would largely be unable to use the [v]fsqrt instructions.


Jeff


[Bug tree-optimization/101805] Max -> bool0 | bool1 Min -> a & b

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101805

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Andrew Pinski  ---
Fixed by r14-868-gb06cfb62229f17eca59fa4aabf853d7e17e2327b (I typed the wrong
bug # in the commit message).

[Bug tree-optimization/109424] ~((x > y) ? x : y) produces two not instructions

2023-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109424

--- Comment #7 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:b06cfb62229f17eca59fa4aabf853d7e17e2327b

commit r14-868-gb06cfb62229f17eca59fa4aabf853d7e17e2327b
Author: Andrew Pinski 
Date:   Mon May 15 21:44:27 2023 +

MATCH: [PR109424] Simplify min/max of boolean arguments

This is version 2 of
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577394.html
which does not depend on adding gimple_truth_valued_p at this point.
Instead will use zero_one_valued_p which is already used for mult
simplifications
to make sure that we only have [0,1] rather than having the mistake of maybe
having [-1,0]
as the range for signed bools.

This shows up in a few places in GCC itself but only at -O1, we miss the
min/max conversion
because of PR 107888 (which I will be testing separately).

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

PR tree-optimization/109424

gcc/ChangeLog:

* match.pd: Add patterns for min/max of zero_one_valued
values to `&`/`|`.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-12.c: New test.
* gcc.dg/tree-ssa/bool-13.c: New test.
* gcc.dg/tree-ssa/minmax-20.c: New test.
* gcc.dg/tree-ssa/minmax-21.c: New test.
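
The equivalence behind this match.pd change can be checked exhaustively with a small C sketch. The helper names below are made up for illustration; the point is only that max/min agree with `|`/`&` when both operands are in [0,1], and stop agreeing for the [-1,0] range mentioned in the commit message.

```c
#include <assert.h>

static int
max_int (int a, int b)
{
  return a > b ? a : b;
}

static int
min_int (int a, int b)
{
  return a < b ? a : b;
}

/* Return 1 if max(a,b) == (a | b) and min(a,b) == (a & b) hold for
   every pair of values a, b in the inclusive range [lo, hi];
   return 0 as soon as one pair disagrees.  */
int
minmax_matches_logic (int lo, int hi)
{
  for (int a = lo; a <= hi; a++)
    for (int b = lo; b <= hi; b++)
      if (max_int (a, b) != (a | b) || min_int (a, b) != (a & b))
        return 0;
  return 1;
}
```

Calling `minmax_matches_logic (0, 1)` succeeds, while `minmax_matches_logic (-1, 0)` fails on the pair (-1, 0), where the maximum is 0 but `-1 | 0` is -1. That is the case the zero_one_valued_p guard is there to exclude.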

[Bug tree-optimization/107888] [12/13/14 Regression] Missed min/max transformation in phiopt due to VRP

2023-05-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107888

--- Comment #11 from CVS Commits  ---
The trunk branch has been updated by Andrew Pinski :

https://gcc.gnu.org/g:b06cfb62229f17eca59fa4aabf853d7e17e2327b

commit r14-868-gb06cfb62229f17eca59fa4aabf853d7e17e2327b
Author: Andrew Pinski 
Date:   Mon May 15 21:44:27 2023 +

MATCH: [PR109424] Simplify min/max of boolean arguments

This is version 2 of
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577394.html
which does not depend on adding gimple_truth_valued_p at this point.
Instead will use zero_one_valued_p which is already used for mult
simplifications
to make sure that we only have [0,1] rather than having the mistake of maybe
having [-1,0]
as the range for signed bools.

This shows up in a few places in GCC itself but only at -O1, we miss the
min/max conversion
because of PR 107888 (which I will be testing separately).

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

PR tree-optimization/109424

gcc/ChangeLog:

* match.pd: Add patterns for min/max of zero_one_valued
values to `&`/`|`.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-12.c: New test.
* gcc.dg/tree-ssa/bool-13.c: New test.
* gcc.dg/tree-ssa/minmax-20.c: New test.
* gcc.dg/tree-ssa/minmax-21.c: New test.

[Bug c++/109870] Miscomputation of return type of unevaluated lambda in type alias in template context

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109870

--- Comment #1 from Andrew Pinski  ---
Most likely a dup of a bug that PR 107430 depends on.

RE: [PATCH V2] RISC-V: Add FRM and rounding mode operand into floating point intrinsics

2023-05-15 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Jeff Law via Gcc-patches
Sent: Tuesday, May 16, 2023 11:27 AM
To: juzhe.zh...@rivai.ai; gcc-patches 
Cc: Kito.cheng ; palmer ; Robin 
Dapp 
Subject: Re: [PATCH V2] RISC-V: Add FRM and rounding mode operand into floating 
point intrinsics



On 5/15/23 19:02, juzhe.zh...@rivai.ai wrote:
> Ping.
> 
> Is it OK for trunk? I have double-checked which floating-point 
> instructions need FRM.
Yes, this is OK for the trunk.

Thanks,
jeff


[Bug c++/109870] New: Miscomputation of return type of unevaluated lambda in type alias in template context

2023-05-15 Thread ed at catmur dot uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109870

Bug ID: 109870
   Summary: Miscomputation of return type of unevaluated lambda in
type alias in template context
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ed at catmur dot uk
  Target Milestone: ---

template struct X {};
template
struct M {
using R = decltype([] { return 1; }());
template struct S { X p; };
};
M::S s;

9.1 through 14.0.0 trunk reject with:

: In instantiation of 'struct M::S':
:7:17:   required from here
:4:36: error: return-statement with a value, in function returning
'void' [-fpermissive]
4 | using R = decltype([] { return 1; }());
  |^

It appears the `U` type parameter in `M::S` is overwriting the lambda
deduced return type.

Possibly related to #92707, #103569

Re: [PATCH V2] RISC-V: Add FRM and rounding mode operand into floating point intrinsics

2023-05-15 Thread Jeff Law via Gcc-patches




On 5/15/23 19:02, juzhe.zh...@rivai.ai wrote:

Ping.

Is it OK for trunk? I have double-checked which floating-point 
instructions need FRM.

Yes, this is OK for the trunk.

Thanks,
jeff


Re: [PATCH] MATCH: [PR109424] Simplify min/max of boolean arguments

2023-05-15 Thread Jeff Law via Gcc-patches




On 5/15/23 19:36, Andrew Pinski via Gcc-patches wrote:

This is version 2 of 
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577394.html
which does not depend on adding gimple_truth_valued_p at this point.
Instead will use zero_one_valued_p which is already used for mult 
simplifications
to make sure that we only have [0,1] rather than having the mistake of maybe having 
[-1,0]
as the range for signed bools.

This shows up in a few places in GCC itself but only at -O1, we miss the 
min/max conversion
because of PR 107888 (which I will be testing separately).

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

PR tree-optimization/109424

gcc/ChangeLog:

* match.pd: Add patterns for min/max of zero_one_valued
values to `&`/`|`.
Not sure it buys us a whole lot.  I guess the strongest argument is 
probably that turning it into a logical helps on targets without min/max 
support.


OK.

jeff


[Bug rtl-optimization/109858] [14 Regression] r14-172 caused some SPEC2017 bmk to degrade on Power

2023-05-15 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109858

--- Comment #6 from Kewen Lin  ---
(In reply to Hongtao.liu from comment #5)
> (In reply to Kewen Lin from comment #3)
> > (In reply to Hongtao.liu from comment #2)
> > > Does https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617431.html help?
> > 
> > Sorry, I just measured those degraded bmks with this fix, the results showed
> > it didn't help.
> 
> Sorry for the inconvenience, could you try again with the attached patch?
> The patch will still use GENERAL_REGS when hard_regno_mode_ok for mode and
> GENERAL_REGS (which is the case in PR109610); I hope it can also fix this
> regression.

Thanks for the prompt fix, I'll do the perf evaluation once the perf boxes get
released (they are used by others now) and get back to you.

[Bug tree-optimization/90087] Suboptimal codegen for x < 0 ? x - INT_MIN : x

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90087

--- Comment #3 from Andrew Pinski  ---
This way, with type_min and type_max filled out correctly.

(simplify
 (cond (lt @0 integer_zero_p) (minus @0 INTEGER_CST@1) @0)
 (if (TYPE_SIGNED (type) && wi::to_widest(@0) == type_min(@0))
  (bit_ior @0 { build_int_cst (type_max(type), type); } )))

Note I think TYPE_PRECISION needs to be a power of 2 but I could be wrong ...
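
As a quick arithmetic check of the expression in the bug title, the following C sketch compares `x < 0 ? x - INT_MIN : x` with a version that simply clears the sign bit. The function names are hypothetical, and a two's complement `int` is assumed. Under that reading the replacement is a bitwise AND with the type's maximum value; this is an observation from the arithmetic only, not a statement about the final form of the match.pd pattern.

```c
#include <assert.h>
#include <limits.h>

/* The conditional form from the bug title.  For negative x the
   subtraction yields a value in [0, INT_MAX], so it cannot overflow.  */
int
cond_form (int x)
{
  return x < 0 ? x - INT_MIN : x;
}

/* Assumed equivalent: clear the sign bit (two's complement int).  */
int
mask_form (int x)
{
  return x & INT_MAX;
}
```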

[Bug rtl-optimization/109858] [14 Regression] r14-172 caused some SPEC2017 bmk to degrade on Power

2023-05-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109858

--- Comment #5 from Hongtao.liu  ---
(In reply to Kewen Lin from comment #3)
> (In reply to Hongtao.liu from comment #2)
> > Does https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617431.html help?
> 
> Sorry, I just measured those degraded bmks with this fix, the results showed
> it didn't help.

Sorry for the inconvenience, could you try again with the attached patch?
The patch will still use GENERAL_REGS when hard_regno_mode_ok for mode and
GENERAL_REGS (which is the case in PR109610); I hope it can also fix this
regression.

[Bug rtl-optimization/109858] [14 Regression] r14-172 caused some SPEC2017 bmk to degrade on Power

2023-05-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109858

--- Comment #4 from Hongtao.liu  ---
Created attachment 55091
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55091&action=edit
Only use NO_REGS in cost calculation when !hard_regno_mode_ok for GENERAL_REGS
and mode.

[Bug tree-optimization/109869] New: comparing SCHAR_MIN and SCHAR_MAX but with widden type could be optimized better

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109869

Bug ID: 109869
   Summary: comparing SCHAR_MIN and SCHAR_MAX but with widden type
could be optimized better
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take (for most targets):
```
bool f1 (signed char i)
{
  unsigned long long _1 = i;
  bool _5 = _1 == 127;
  bool _6 = _1 == 18446744073709551488ull;
  return _5 | _6;
}
bool f2 (signed char i)
{
  return i == -128 || i == 127;
  __SIZE_TYPE__ _1 = i;
}
```

These two functions should produce the same code.

I noticed this while looking at PR 77899 .
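
That the two forms agree for every input can be confirmed with a brute-force check. The sketch below is illustrative only: the function names are hypothetical, and it assumes an 8-bit `signed char` and a 64-bit `unsigned long long`, matching the "for most targets" caveat above.

```c
#include <assert.h>
#include <stdbool.h>

/* Widened form: the signed char is sign-extended and converted
   modulo 2^64, so -128 becomes 18446744073709551488.  */
bool
wide_compare (signed char i)
{
  unsigned long long w = i;
  return w == 127ull || w == 18446744073709551488ull;
}

/* Narrow form: compare against the extremes directly.  */
bool
narrow_compare (signed char i)
{
  return i == -128 || i == 127;
}

/* Count inputs in [-128, 127] on which the two forms disagree.  */
int
count_mismatches (void)
{
  int n = 0;
  for (int v = -128; v <= 127; v++)
    if (wide_compare ((signed char) v) != narrow_compare ((signed char) v))
      n++;
  return n;
}
```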

[Bug c/109863] RFE: more consistent flex array initialization: lift static storage requirement in gnu2x

2023-05-15 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109863

--- Comment #2 from Xi Ruoyao  ---
(In reply to Xi Ruoyao from comment #1)
> Note that the entire "initializing a flexible array member" thing is a GNU
> extension and not supported by the standard.  So GCC is free to support the
> constexpr case but reject other non-static cases.

Or, "it's an extension after all, so it's not related to whether the standard
says constexpr implies static or not".

[Bug c/109863] RFE: more consistent flex array initialization: lift static storage requirement in gnu2x

2023-05-15 Thread xry111 at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109863

Xi Ruoyao  changed:

   What|Removed |Added

 CC||xry111 at gcc dot gnu.org

--- Comment #1 from Xi Ruoyao  ---
Note that the entire "initializing a flexible array member" thing is a GNU
extension and not supported by the standard.  So GCC is free to support the
constexpr case but reject other non-static cases.
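
For context, the extension under discussion looks roughly like the following. This is a minimal sketch with made-up type and variable names; it relies on the GNU extension that permits initializing a flexible array member of an object with static storage duration, so it is not strictly conforming ISO C and may draw a warning under -pedantic.

```c
#include <assert.h>

struct packet
{
  int len;
  int data[];   /* flexible array member */
};

/* GNU extension: static-storage object with an initialized
   flexible array member.  */
struct packet example_packet = { 3, { 10, 20, 30 } };

/* Sum the first p->len elements of the flexible array.  */
int
packet_sum (const struct packet *p)
{
  int sum = 0;
  for (int i = 0; i < p->len; i++)
    sum += p->data[i];
  return sum;
}
```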

[Bug tree-optimization/77899] incorrect VR_RANGE for a signed char function argument

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77899

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=107699

--- Comment #13 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #12)
> Even this should be folded but is not currently:
> void f (signed char i)
> {
>   char d [260];
> 
>   const char *p = &d[130];
> 
>   p += i;
> 
>   if (p == d + 2 || d + 257 == p)
> __builtin_abort ();
> }

That was handled by r13-4555-g892e8c520be37d (aka PR 107699).

The non eq/ne ones are not handled though; I think there might be another bug
about that ...
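
The reasoning behind the test case quoted above is that a `signed char` offset added to `&d[130]` can only reach a bounded set of indices. A small sketch, with hypothetical function names and assuming an 8-bit `signed char`, computes that set's endpoints:

```c
#include <assert.h>

/* Index into d[] reached by: p = &d[130]; p += i;  */
static int
reached_index (signed char i)
{
  return 130 + i;
}

/* Smallest index reachable over all signed char offsets.  */
int
min_reached_index (void)
{
  int lo = reached_index (0);
  for (int v = -128; v <= 127; v++)
    if (reached_index ((signed char) v) < lo)
      lo = reached_index ((signed char) v);
  return lo;
}

/* Largest index reachable over all signed char offsets.  */
int
max_reached_index (void)
{
  int hi = reached_index (0);
  for (int v = -128; v <= 127; v++)
    if (reached_index ((signed char) v) > hi)
      hi = reached_index ((signed char) v);
  return hi;
}
```

The endpoints are 2 and 257, so `p == d + 2` and `p == d + 257` correspond to `i == -128` and `i == 127` respectively.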

[Bug tree-optimization/101805] Max -> bool0 | bool1 Min -> a & b

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101805

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2023-May/618
   ||645.html

--- Comment #5 from Andrew Pinski  ---
Updated patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618645.html

[PATCH] MATCH: [PR109424] Simplify min/max of boolean arguments

2023-05-15 Thread Andrew Pinski via Gcc-patches
This is version 2 of 
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577394.html
which does not depend on adding gimple_truth_valued_p at this point.
Instead will use zero_one_valued_p which is already used for mult 
simplifications
to make sure that we only have [0,1] rather than having the mistake of maybe having 
[-1,0]
as the range for signed bools.

This shows up in a few places in GCC itself but only at -O1, we miss the 
min/max conversion
because of PR 107888 (which I will be testing separately).

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Thanks,
Andrew Pinski

PR tree-optimization/109424

gcc/ChangeLog:

* match.pd: Add patterns for min/max of zero_one_valued
values to `&`/`|`.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bool-12.c: New test.
* gcc.dg/tree-ssa/bool-13.c: New test.
* gcc.dg/tree-ssa/minmax-20.c: New test.
* gcc.dg/tree-ssa/minmax-21.c: New test.
---
 gcc/match.pd  |  8 +
 gcc/testsuite/gcc.dg/tree-ssa/bool-12.c   | 44 +++
 gcc/testsuite/gcc.dg/tree-ssa/bool-13.c   | 38 
 gcc/testsuite/gcc.dg/tree-ssa/minmax-20.c | 27 ++
 gcc/testsuite/gcc.dg/tree-ssa/minmax-21.c | 28 +++
 5 files changed, 145 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-12.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bool-13.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmax-20.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/minmax-21.c

diff --git a/gcc/match.pd b/gcc/match.pd
index b025fb8facf..30ffdfcf8bb 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7439,6 +7439,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& TREE_CODE (@0) != INTEGER_CST)
(op @0 (ext @1 @2)
 
+/* Max -> bool0 | bool1
+   Min -> bool0 & bool1 */
+(for op(max min)
+ logic (bit_ior bit_and)
+ (simplify
+  (op zero_one_valued_p@0 zero_one_valued_p@1)
+  (logic @0 @1)))
+
 /* signbit(x) != 0 ? -x : x -> abs(x)
signbit(x) == 0 ? -x : x -> -abs(x) */
 (for sign (SIGNBIT)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-12.c b/gcc/testsuite/gcc.dg/tree-ssa/bool-12.c
new file mode 100644
index 000..e62594e1dad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-12.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized -fdump-tree-original -fdump-tree-phiopt1 -fdump-tree-forwprop2" } */
+#define bool _Bool
+int maxbool(bool ab, bool bb)
+{
+  int a = ab;
+  int b = bb;
+  int c;
+  if (a > b)
+c = a;
+  else
+c = b;
+  return c;
+}
+int minbool(bool ab, bool bb)
+{
+  int a = ab;
+  int b = bb;
+  int c;
+  if (a < b)
+c = a;
+  else
+c = b;
+  return c;
+}
+/* In Original, we should still have the if form as that is what is written. */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "original" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "original" } } */
+/* { dg-final { scan-tree-dump-times "if " 2 "original" } } */
+
+/* PHI-OPT1 should have converted it into min/max */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "if " 0 "phiopt1" } } */
+
+/* Forwprop2 (after ccp) will convert it into &\| */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "forwprop2" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "forwprop2" } } */
+/* { dg-final { scan-tree-dump-times "if " 0 "forwprop2" } } */
+
+/* By optimize there should be no min/max nor if  */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "if " 0 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c b/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c
new file mode 100644
index 000..438f15a484a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bool-13.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized -fdump-tree-original -fdump-tree-phiopt1 -fdump-tree-forwprop2" } */
+#define bool _Bool
+int maxbool(bool ab, bool bb)
+{
+  int a = ab;
+  int b = bb;
+  int c;
+  c = a > b ? a : b;
+  return c;
+}
+int minbool(bool ab, bool bb)
+{
+  int a = ab;
+  int b = bb;
+  int c;
+  c = a < b ? a : b;
+  return c;
+}
+/* In Original, we should still have the min/max form as that is what is written. */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "original" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "original" } } */
+/* { dg-final { scan-tree-dump-times "if " 0 "original" } } */
+
+/* PHI-OPT1 should have kept it as min/max. */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "if " 0 "phiopt1" } } */
+
+/* Forwprop2 

RE: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-15 Thread Li, Pan2 via Gcc-patches
Kindly ping for this PATCH v3.

Pan

-Original Message-
From: Li, Pan2  
Sent: Saturday, May 13, 2023 9:13 PM
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2 ; 
Wang, Yanzhang ; jeffreya...@gmail.com; 
rguent...@suse.de; richard.sandif...@arm.com
Subject: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

From: Pan Li 

We are running out of machine_mode values (8 bits) in the RISC-V backend. Thus 
we would like to extend the machine_mode bit size from 8 to 16 bits.
However, growing the memory size of common structures like tree or rtx is 
costly. This patch therefore extends machine_mode to 16 bits by reclaiming 
bits elsewhere:

* Swap the bit size of code and machine mode in rtx_def.
* Adjust the machine_mode location and spare bits in tree.

The memory impact of this patch on the related structures is shown below:

+-------------------+----------+---------+------+
| struct/bytes      | upstream | patched | diff |
+-------------------+----------+---------+------+
| rtx_obj_reference |        8 |      12 |   +4 |
| ext_modified      |        2 |       4 |   +2 |
| ira_allocno       |      192 |     184 |   -8 |
| qty_table_elem    |       40 |      40 |    0 |
| reg_stat_type     |       64 |      64 |    0 |
| rtx_def           |       40 |      40 |    0 |
| table_elt         |       80 |      80 |    0 |
| tree_decl_common  |      112 |     112 |    0 |
| tree_type_common  |      128 |     128 |    0 |
+-------------------+----------+---------+------+

The tree and rtx related structs do not change in size with this patch, while 
machine_mode is now 16 bits wide.
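
The "swap the bit sizes" idea can be illustrated with a simplified stand-in for the rtx_def header. The struct and function names below are hypothetical, the field layout is reduced to three bit-fields, and the size equality is an expectation for mainstream ABIs rather than a language guarantee.

```c
#include <assert.h>

/* Simplified old layout: 16-bit code, 8-bit mode, 8 flag bits.  */
struct header_old
{
  unsigned int code : 16;
  unsigned int mode : 8;
  unsigned int flags : 8;
};

/* Simplified new layout: the widths of code and mode are swapped,
   so the total stays at 32 bits.  */
struct header_new
{
  unsigned int code : 8;
  unsigned int mode : 16;
  unsigned int flags : 8;
};

/* Store M in the 8-bit mode field and read it back; values above
   255 are reduced modulo 256.  */
unsigned int
store_mode_old (unsigned int m)
{
  struct header_old h = { 0, 0, 0 };
  h.mode = m;
  return h.mode;
}

/* Store M in the 16-bit mode field and read it back.  */
unsigned int
store_mode_new (unsigned int m)
{
  struct header_new h = { 0, 0, 0 };
  h.mode = m;
  return h.mode;
}
```

With these definitions a mode number such as 300 survives a round trip through the new layout but not the old one, while `sizeof` is the same for both structs on typical targets.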

Signed-off-by: Pan Li 
Co-authored-by: Ju-Zhe Zhong 
Co-authored-by: Kito Cheng 
Co-Authored-By: Richard Biener 
Co-Authored-By: Richard Sandiford 

gcc/ChangeLog:

* combine.cc (struct reg_stat_type): Extend machine_mode to 16 bits.
* cse.cc (struct qty_table_elem): Extend machine_mode to 16 bits
(struct table_elt): Extend machine_mode to 16 bits.
(struct set): Ditto.
* genmodes.cc (emit_mode_wider): Extend type from char to short.
(emit_mode_complex): Ditto.
(emit_mode_inner): Ditto.
(emit_class_narrowest_mode): Ditto.
* genopinit.cc (main): Extend the machine_mode limit.
* ira-int.h (struct ira_allocno): Extend machine_mode to 16 bits and
re-ordered the struct fields for padding.
* machmode.h (MACHINE_MODE_BITSIZE): New macro.
(GET_MODE_2XWIDER_MODE): Extend type from char to short.
(get_mode_alignment): Extend type from char to short.
* ree.cc (struct ext_modified): Extend machine_mode to 16 bits and
removed the ATTRIBUTE_PACKED.
* rtl-ssa/accesses.h: Extend machine_mode to 16 bits.
* rtl.h (RTX_CODE_BITSIZE): New macro.
(struct rtx_def): Swap both the bit size and location between the
rtx_code and the machine_mode.
(subreg_shape::unique_id): Extend the machine_mode limit.
* rtlanal.h: Extend machine_mode to 16 bits.
* tree-core.h (struct tree_type_common): Extend machine_mode to 16
bits and re-ordered the struct fields for padding.
(struct tree_decl_common): Extend machine_mode to 16 bits.
---
 gcc/combine.cc |  4 +--
 gcc/cse.cc | 16 
 gcc/genmodes.cc| 16 ++--
 gcc/genopinit.cc   |  3 ++-
 gcc/ira-int.h  | 56 +-
 gcc/machmode.h | 27 +++-
 gcc/ree.cc |  4 +--
 gcc/rtl-ssa/accesses.h |  2 +-
 gcc/rtl.h  | 12 +
 gcc/rtlanal.h  |  2 +-
 gcc/tree-core.h|  9 ---
 11 files changed, 82 insertions(+), 69 deletions(-)

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 5aa0ec5c45a..a23caeed96f 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -200,7 +200,7 @@ struct reg_stat_type {
 
   unsigned HOST_WIDE_INT   last_set_nonzero_bits;
   char last_set_sign_bit_copies;
-  ENUM_BITFIELD(machine_mode)  last_set_mode : 8;
+  ENUM_BITFIELD(machine_mode)  last_set_mode : MACHINE_MODE_BITSIZE;
 
   /* Set nonzero if references to register n in expressions should not be
  used.  last_set_invalid is set nonzero when this register is being
@@ -235,7 +235,7 @@ struct reg_stat_type {
  truncation if we know that value already contains a truncated
  value.  */
 
-  ENUM_BITFIELD(machine_mode)  truncated_to_mode : 8;
+  ENUM_BITFIELD(machine_mode)  truncated_to_mode : MACHINE_MODE_BITSIZE;
 };
 
 
diff --git a/gcc/cse.cc b/gcc/cse.cc
index b10c9b0c94d..86403b95938 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -248,10 +248,8 @@ struct qty_table_elem
   rtx comparison_const;
   int comparison_qty;
   unsigned int first_reg, last_reg;
-  /* The sizes of these fields should match the sizes of the
- code and mode fields of struct rtx_def (see rtl.h).  */
-  ENUM_BITFIELD(rtx_code) 

RE: [PATCH] RISC-V: Support RVV VREINTERPRET from v{u}int*_t to vbool1_t

2023-05-15 Thread Li, Pan2 via Gcc-patches
Kindly ping for this PATCH.

Pan

From: Li, Pan2
Sent: Monday, May 15, 2023 11:25 AM
To: juzhe.zh...@rivai.ai; gcc-patches 
Cc: Kito.cheng ; Wang, Yanzhang 
Subject: RE: [PATCH] RISC-V: Support RVV VREINTERPRET from v{u}int*_t to 
vbool1_t

Thanks Juzhe. Let’s wait for Kito’s suggestion.

Pan

From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
Sent: Monday, May 15, 2023 11:20 AM
To: Li, Pan2 <pan2...@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Kito.cheng <kito.ch...@sifive.com>; Li, Pan2 <pan2...@intel.com>; 
Wang, Yanzhang <yanzhang.w...@intel.com>
Subject: Re: [PATCH] RISC-V: Support RVV VREINTERPRET from v{u}int*_t to 
vbool1_t

The implementation LGTM.
But I am not sure about the testcase, since we don't include any intrinsic API 
testcases in the GCC testsuite.
I think it needs Kito's decision.

Thanks.

juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-05-15 11:14
To: gcc-patches
CC: juzhe.zhong; 
kito.cheng; pan2.li; 
yanzhang.wang
Subject: [PATCH] RISC-V: Support RVV VREINTERPRET from v{u}int*_t to vbool1_t
From: Pan Li <pan2...@intel.com>

This patch supports the RVV VREINTERPRET from the int to the vbool1_t.  Aka:

vbool1_t __riscv_vreinterpret_xx_xx(v{u}int[8|16|32|64]_t);

These APIs help the users to convert vector LMUL=1 integer to vbool1_t.
According to the RVV intrinsic SPEC as below, the reinterpret intrinsics
only change the types of the underlying contents.

https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#reinterpret-vbool-o-vintm1

For example, given the code below:
vbool1_t test_vreinterpret_v_i8m1_b1(vint8m1_t src) {
  return __riscv_vreinterpret_v_i8m1_b1(src);
}

It will generate assembly code similar to the following:
vsetvli a5,zero,e8,m8,ta,ma
vlm.v   v1,0(a1)
vsm.v   v1,0(a0)
ret

The intrinsic APIs for the remaining bool sizes will be prepared in other patches.

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/ChangeLog:

* config/riscv/genrvv-type-indexer.cc (BOOL_SIZE_LIST): New
  macro.
(main): Add bool1 to the type indexer.
* config/riscv/riscv-vector-builtins-functions.def
(vreinterpret): Register vbool1 interpret function.
* config/riscv/riscv-vector-builtins-types.def
(DEF_RVV_BOOL1_INTERPRET_OPS): New macro.
(vint8m1_t): Add the type to bool1_interpret_ops.
(vint16m1_t): Ditto.
(vint32m1_t): Ditto.
(vint64m1_t): Ditto.
(vuint8m1_t): Ditto.
(vuint16m1_t): Ditto.
(vuint32m1_t): Ditto.
(vuint64m1_t): Ditto.
* config/riscv/riscv-vector-builtins.cc
(DEF_RVV_BOOL1_INTERPRET_OPS): New macro.
(required_extensions_p): Add bool1 interpret case.
* config/riscv/riscv-vector-builtins.def
(bool1_interpret): Add bool1 interpret to base type.
* config/riscv/vector.md (@vreinterpret): Add new expand
with VB dest for vreinterpret.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: New test.
---
gcc/config/riscv/genrvv-type-indexer.cc   | 19 ++
.../riscv/riscv-vector-builtins-functions.def |  1 +
.../riscv/riscv-vector-builtins-types.def | 17 +
gcc/config/riscv/riscv-vector-builtins.cc | 18 +
gcc/config/riscv/riscv-vector-builtins.def|  2 +
gcc/config/riscv/vector.md| 10 +
.../rvv/base/misc_vreinterpret_vbool_vint.c   | 38 +++
7 files changed, 105 insertions(+)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c

diff --git a/gcc/config/riscv/genrvv-type-indexer.cc 
b/gcc/config/riscv/genrvv-type-indexer.cc
index 9bf6a82601d..2f0375568a8 100644
--- a/gcc/config/riscv/genrvv-type-indexer.cc
+++ b/gcc/config/riscv/genrvv-type-indexer.cc
@@ -23,6 +23,8 @@ along with GCC; see the file COPYING3.  If not see
#include 
#include 
+#define BOOL_SIZE_LIST {1}
+
std::string
to_lmul (int lmul_log2)
{
@@ -218,6 +220,9 @@ main (int argc, const char **argv)
   for (unsigned eew : {8, 16, 32, 64})
fprintf (fp, "  /*EEW%d_INTERPRET*/ INVALID,\n", eew);
+  for (unsigned boolsize : BOOL_SIZE_LIST)
+ fprintf (fp, "  /*BOOL%d_INTERPRET*/ INVALID,\n", boolsize);
+
   for (unsigned lmul_log2_offset : {1, 2, 3, 4, 5, 6})
{
  unsigned multiple_of_lmul = 1 << lmul_log2_offset;
@@ -297,6 +302,16 @@ main (int argc, const char **argv)
   inttype (eew, lmul_log2, unsigned_p).c_str ());
  }
+ for (unsigned boolsize : BOOL_SIZE_LIST)
+   {
+ std::stringstream mode;
+ mode << "vbool" << boolsize << "_t";
+
+ fprintf (fp, "  /*BOOL%d_INTERPRET*/ %s,\n", boolsize,
+ nf == 1 && lmul_log2 == 0 ? mode.str ().c_str ()
+: "INVALID");
+   }
+
for (unsigned lmul_log2_offset : {1, 2, 3, 4, 5, 6})
  {
unsigned multiple_of_lmul = 1 << lmul_log2_offset;
@@ -355,6 +370,10 @@ main (int argc, const char **argv)
   floattype 

Re: [PATCH V2] RISC-V: Add FRM and rounding mode operand into floating point intrinsics

2023-05-15 Thread juzhe.zh...@rivai.ai
Ping.

Is it OK for trunk? I have double-checked which floating-point instructions 
need FRM.

Thanks.


juzhe.zh...@rivai.ai
 
From: juzhe.zhong
Date: 2023-05-15 22:53
To: gcc-patches
CC: kito.cheng; palmer; rdapp.gcc; jeffreyalaw; Juzhe-Zhong
Subject: [PATCH V2] RISC-V: Add FRM and rounding mode operand into floating 
point intrinsics
From: Juzhe-Zhong 
 
This patch adds a rounding mode operand and an FRM_REGNUM dependency
to floating-point instructions.
 
The floating-point instructions to which we added FRM and a rounding mode operand:
1. vfadd/vfsub
2. vfwadd/vfwsub
3. vfmul
4. vfdiv
5. vfwmul
6. vfwmacc/vfwnmacc/vfwmsac/vfwnmsac
7. vfsqrt
8. floating-point conversions.
9. floating-point reductions.
10. floating-point ternary.
 
The floating-point instructions to which we did NOT add FRM and a rounding mode operand:
1. vfabs/vfneg/vfsqrt7/vfrec7
2. vfmin/vfmax
3. comparisons
4. vfclass
5. vfsgnj/vfsgnjn/vfsgnjx
6. vfmerge
7. vfmv.v.f
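
For reference, the rounding-mode encodings that this patch introduces as frm_field_enum (see the riscv-protos.h hunk further down) can be decoded with a small helper. This is a sketch only: the enum below restates those values in plain decimal, and the names `frm_value`, `FRM_DYN`, and `frm_name` are hypothetical, not identifiers from the patch.

```c
#include <assert.h>
#include <string.h>

/* Rounding-mode field values, restated from frm_field_enum.  */
enum frm_value
{
  FRM_RNE = 0,  /* round to nearest, ties to even */
  FRM_RTZ = 1,  /* round towards zero */
  FRM_RDN = 2,  /* round down */
  FRM_RUP = 3,  /* round up */
  FRM_RMM = 4,  /* round to nearest, ties to max magnitude */
  FRM_DYN = 7   /* use the dynamic rounding mode */
};

/* Map the low three bits of BITS to a rounding-mode mnemonic;
   encodings not listed above are reported as "reserved".  */
const char *
frm_name (unsigned int bits)
{
  switch (bits & 7u)
    {
    case FRM_RNE: return "rne";
    case FRM_RTZ: return "rtz";
    case FRM_RDN: return "rdn";
    case FRM_RUP: return "rup";
    case FRM_RMM: return "rmm";
    case FRM_DYN: return "dyn";
    default:      return "reserved";
    }
}
```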
 
gcc/ChangeLog:
 
* config/riscv/riscv-protos.h (enum frm_field_enum): New enum.
* config/riscv/riscv-vector-builtins.cc 
(function_expander::use_ternop_insn): Add default rounding mode.
(function_expander::use_widen_ternop_insn): Ditto.
* config/riscv/riscv.cc (riscv_hard_regno_nregs): Add FRM REGNUM.
(riscv_hard_regno_mode_ok): Ditto.
(riscv_conditional_register_usage): Ditto.
* config/riscv/riscv.h (DWARF_FRAME_REGNUM): Ditto.
(FRM_REG_P): Ditto.
(RISCV_DWARF_FRM): Ditto.
* config/riscv/riscv.md: Ditto.
* config/riscv/vector-iterators.md: Split operations with and without FRM.
* config/riscv/vector.md (@pred__scalar): New pattern.
(@pred_): Ditto.
 
---
gcc/config/riscv/riscv-protos.h   |  10 +
gcc/config/riscv/riscv-vector-builtins.cc |  14 ++
gcc/config/riscv/riscv.cc |   7 +-
gcc/config/riscv/riscv.h  |   7 +-
gcc/config/riscv/riscv.md |   1 +
gcc/config/riscv/vector-iterators.md  |   9 +-
gcc/config/riscv/vector.md| 258 ++
7 files changed, 251 insertions(+), 55 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 835bb802fc6..12634d0ac1a 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -231,6 +231,16 @@ enum vxrm_field_enum
   VXRM_RDN,
   VXRM_ROD
};
+/* Rounding mode bitfield for floating point FRM.  */
+enum frm_field_enum
+{
+  FRM_RNE = 0b000,
+  FRM_RTZ = 0b001,
+  FRM_RDN = 0b010,
+  FRM_RUP = 0b011,
+  FRM_RMM = 0b100,
+  DYN = 0b111
+};
}
/* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 1de075fb90d..b7458aaace6 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3460,6 +3460,13 @@ function_expander::use_ternop_insn (bool vd_accum_p, 
insn_code icode)
   add_input_operand (Pmode, get_tail_policy_for_pred (pred));
   add_input_operand (Pmode, get_mask_policy_for_pred (pred));
   add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
+
+  /* TODO: Currently, we don't support intrinsic that is modeling rounding 
mode.
+ We add default rounding mode for the intrinsics that didn't model rounding
+ mode yet.  */
+  if (opno != insn_data[icode].n_generator_args)
+add_input_operand (Pmode, const0_rtx);
+
   return generate_insn (icode);
}
@@ -3482,6 +3489,13 @@ function_expander::use_widen_ternop_insn (insn_code 
icode)
   add_input_operand (Pmode, get_tail_policy_for_pred (pred));
   add_input_operand (Pmode, get_mask_policy_for_pred (pred));
   add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
+
+  /* TODO: Currently, we don't support intrinsic that is modeling rounding 
mode.
+ We add default rounding mode for the intrinsics that didn't model rounding
+ mode yet.  */
+  if (opno != insn_data[icode].n_generator_args)
+add_input_operand (Pmode, const0_rtx);
+
   return generate_insn (icode);
}
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index b52e613c629..de5b87b1a87 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6082,7 +6082,8 @@ riscv_hard_regno_nregs (unsigned int regno, machine_mode 
mode)
   /* mode for VL or VTYPE are just a marker, not holding value,
  so it always consume one register.  */
-  if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno))
+  if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno)
+  || FRM_REG_P (regno))
 return 1;
   /* Assume every valid non-vector mode fits in one vector register.  */
@@ -6150,7 +6151,8 @@ riscv_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
   if (lmul != 1)
return ((regno % lmul) == 0);
 }
-  else if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno))
+  else if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno)
+

[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #12 from Andrew Pinski  ---
Jakub, assign this to me if you think we should go down that route unless you
want to take the patch further.

[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #11 from Andrew Pinski  ---
Created attachment 55090
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55090&action=edit
Patch which I came up with

This patch adds back zero_sized_field_decl but keeps the call to is_empty_type
too.

[committed] c: Ignore _Atomic on function return type for C2x

2023-05-15 Thread Joseph Myers
For C2x it was decided that _Atomic would be completely ignored on
function return types (just as was done for qualifiers in C11 DR#423),
to eliminate the potential for an rvalue returned by a function having
_Atomic-qualified type when an rvalue resulting from lvalue-to-rvalue
conversion could not have such a type.  Implement this for GCC.
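
The rule can be sketched as a small standalone function that mirrors the quals_used logic in the c-decl.cc hunk of this patch. This is only an illustration: the QUAL_* bit values, the function name return_quals_used, and the two boolean parameters standing in for flag_isoc11 and flag_isoc2x are assumptions of the sketch, not GCC's interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Qualifier bits; the values are assumed for this sketch only.  */
enum
{
  QUAL_CONST    = 1 << 0,
  QUAL_VOLATILE = 1 << 1,
  QUAL_RESTRICT = 1 << 2,
  QUAL_ATOMIC   = 1 << 3
};

/* Returns the qualifiers that still count as "used" on a function return
   type.  The C2x case is tested first, in the same order as the patch.  */
int
return_quals_used (int type_quals, bool isoc11, bool isoc2x)
{
  if (isoc2x)
    return 0;                         /* C2x: every qualifier is ignored.  */
  if (isoc11)
    return type_quals & QUAL_ATOMIC;  /* C11 DR#423: only _Atomic remains.  */
  return type_quals;                  /* Before C11: all are kept.  */
}
```

With both flags set, return_quals_used (QUAL_CONST | QUAL_ATOMIC, true, true) is 0, whereas the C11-only case keeps just QUAL_ATOMIC.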

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-decl.cc (grokdeclarator): Ignore _Atomic on function return
type for C2x.

gcc/testsuite/
* gcc.dg/qual-return-9.c, gcc.dg/qual-return-10.c: New tests.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 1b53f2d0785..90d7cd27cd5 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -7412,9 +7412,12 @@ grokdeclarator (const struct c_declarator *declarator,
   them for noreturn functions.  The resolution of C11
   DR#423 means qualifiers (other than _Atomic) are
   actually removed from the return type when
-  determining the function type.  */
+  determining the function type.  For C2X, _Atomic is
+  removed as well.  */
int quals_used = type_quals;
-   if (flag_isoc11)
+   if (flag_isoc2x)
+ quals_used = 0;
+   else if (flag_isoc11)
  quals_used &= TYPE_QUAL_ATOMIC;
if (quals_used && VOID_TYPE_P (type) && really_funcdef)
  pedwarn (specs_loc, 0,
diff --git a/gcc/testsuite/gcc.dg/qual-return-10.c 
b/gcc/testsuite/gcc.dg/qual-return-10.c
new file mode 100644
index 000..c7dd6adc4c6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/qual-return-10.c
@@ -0,0 +1,12 @@
+/* Test qualifiers on function return types in C2X (C2X version of
+   qual-return-6.c): those qualifiers are now ignored for all purposes,
+   including _Atomic, but should still get warnings.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x -Wignored-qualifiers" } */
+
+const int f1 (void); /* { dg-warning "qualifiers ignored" } */
+volatile int f2 (void) { return 0; } /* { dg-warning "qualifiers ignored" } */
+const volatile void f3 (void) { } /* { dg-warning "qualifiers ignored" } */
+const void f4 (void); /* { dg-warning "qualifiers ignored" } */
+_Atomic int f5 (void); /* { dg-warning "qualifiers ignored" } */
+_Atomic int f6 (void) { return 0; } /* { dg-warning "qualifiers ignored" } */
diff --git a/gcc/testsuite/gcc.dg/qual-return-9.c 
b/gcc/testsuite/gcc.dg/qual-return-9.c
new file mode 100644
index 000..7762782edf0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/qual-return-9.c
@@ -0,0 +1,32 @@
+/* Test qualifiers on function return types in C2X (C2X version of
+   qual-return-5.c): those qualifiers are now ignored for all purposes,
+   including _Atomic.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x -pedantic-errors" } */
+
+int f1 (void);
+const int f1 (void);
+volatile int f1 (void) { return 0; }
+
+int *restrict f2 (void) { return 0; }
+int *f2 (void);
+
+const volatile long f3 (void);
+long f3 (void);
+
+const volatile void f4 (void) { }
+void f4 (void);
+
+_Atomic int f5 (void);
+int f5 (void);
+
+int f6 (void);
+_Atomic int f6 (void) { return 0; }
+
+/* The standard seems unclear regarding the case where restrict is
+   applied to a function return type that may not be
+   restrict-qualified; assume here that it is disallowed.  */
+restrict int f7 (void); /* { dg-error "restrict" } */
+
+typedef void FT (void);
+FT *restrict f8 (void); /* { dg-error "restrict" } */

-- 
Joseph S. Myers
jos...@codesourcery.com


[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #10 from Sam James  ---
fwiw, on glibc, I don't get the oob read w/ valgrind but still the ICE as
you've already found.

[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #9 from Jakub Jelinek  ---
The ICE started with r13-436-gaf34279921f4bb95b07c0be, but the undesirable
store has been there since r12-2975-g32c3a75390623a0470df52.

[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #8 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #5)
> That might have been caused by r12-1150-g34aae6b561871d . I will look into
> it soon because we should not be emitting an assignment here ...

Yes, it was introduced by that revision, specifically by the change from
zero_sized_field_decl to is_empty_type. We used to check whether DECL_SIZE was
zero; now we check whether the type is empty, but is_empty_type does not
return true for bitfield types of size 0 ...
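
For readers less familiar with the construct behind this bug, here is a minimal sketch of what an unnamed zero-width bitfield does at the language level: it holds no value and only ends the current bitfield allocation unit, which is why a store to it has nothing to write. The struct and function names are made up for this illustration, and the exact sizes are ABI-dependent, so the sketch only compares sizes rather than asserting particular numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the reduced test case: a member followed by "int : 0".  */
struct timespec_like { long tv_nsec; int : 0; };
struct plain         { long tv_nsec; };

/* The zero-width field forces b into a new allocation unit.  */
struct packed_bits { unsigned a : 4; unsigned b : 4; };
struct split_bits  { unsigned a : 4; unsigned : 0; unsigned b : 4; };

/* Bytes added by the trailing zero-width bitfield (0 on common ABIs).  */
size_t
zero_width_extra_bytes (void)
{
  return sizeof (struct timespec_like) - sizeof (struct plain);
}

/* Nonzero when the split layout is larger than the packed one.  */
int
zero_width_splits_unit (void)
{
  return sizeof (struct split_bits) > sizeof (struct packed_bits);
}
```

On x86_64-linux-gnu one would expect zero_width_extra_bytes () to be 0 and zero_width_splits_unit () to be 1, but other ABIs may lay these structs out differently.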

[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 CC||jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
I'll have a look tomorrow^H^H^H^H^Hlater today.

[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #6 from Andrew Pinski  ---
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109806#c15 .

[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #5 from Andrew Pinski  ---
A little more reduced:
```
struct ClockImpl  {
  virtual void addRef();
  long tv_nsec;
  int : 0;
};
void f() { ClockImpl b{}; }
```

So maybe this is a gimplifier issue producing the assignment to the zero-sized
bitfield:

  b.D.2780 = 0;

What is interesting is GCC 11 didn't produce the assignment but GCC 12 does.
That might have been caused by r12-1150-g34aae6b561871d . I will look into it
soon because we should not be emitting an assignment here ...

[Bug tree-optimization/109806] [13/14 Regression] 13.1.0 cc1plus stack smashing crash with C array of complex structs

2023-05-15 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109806

--- Comment #16 from Sam James  ---
Filed my musl one as PR109868, sorry for clogging up this one!

[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #4 from Andrew Pinski  ---
Hmm:
  D.2948._startTime.D.2792 = 0;

That seems wrong.

Reduced further:
```
struct SimpleRefCounted {
  virtual void addRef();
};
struct ClockImpl : SimpleRefCounted {
  long tv_nsec;
  int : 0;
};
void f() { ClockImpl b{}; }
```
And yes it is the zero sized bitfield causing issues ...

[Bug tree-optimization/109868] [13/14 regression] ICE: segmentation fault or ICE in min_value with zero sized bitfield

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

Andrew Pinski  changed:

   What|Removed |Added

  Component|c++ |tree-optimization
  Known to fail||13.1.0, 14.0
  Known to work||12.3.0
 CC||pinskia at gcc dot gnu.org
   Target Milestone|--- |13.2
   Last reconfirmed||2023-05-15
Summary|[13/14 regression] ICE: |[13/14 regression] ICE:
   |segmentation fault when |segmentation fault or ICE
   |building small C++ program  |in min_value with zero
   ||sized bitfield
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
   Keywords||ice-on-valid-code

--- Comment #3 from Andrew Pinski  ---
Confirmed. I can reproduce it also on normal x86_64-linux-gnu and
aarch64-linux-gnu even.

[Bug tree-optimization/109806] [13/14 Regression] 13.1.0 cc1plus stack smashing crash with C array of complex structs

2023-05-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109806

--- Comment #15 from Jakub Jelinek  ---
That ICE is because layout_class_type calls c_build_bitfield_integer_type with
width of 0 and that type is then seen by ranger for some reason:
#7  0x00c4eee1 in layout_class_type (t=, virtuals_p=0x7fffd4c8)
at ../../gcc/cp/class.cc:6858
6853          tree ftype = TREE_TYPE (field);
6854          width = tree_to_uhwi (DECL_SIZE (field));
6855          if (width != TYPE_PRECISION (ftype))
6856            {
6857              TREE_TYPE (field)
6858                = c_build_bitfield_integer_type (width,
6859                                                 TYPE_UNSIGNED (ftype));
6860              TREE_TYPE (field)
6861                = cp_build_qualified_type (TREE_TYPE (field),
6862                                           cp_type_quals (ftype));
I think unnamed bitfields are just padding and shouldn't have this called.

[Bug c++/109868] [13/14 regression] ICE: segmentation fault when building small C++ program

2023-05-15 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #2 from Sam James  ---
Created attachment 55089
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55089&action=edit
clock.ii (reduced)

[Bug c++/109868] [13/14 regression] ICE: segmentation fault when building small C++ program

2023-05-15 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

--- Comment #1 from Sam James  ---
Created attachment 55088
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55088&action=edit
clock.ii.orig

[Bug c++/109868] New: [13/14 regression] ICE: segmentation fault when building small C++ program

2023-05-15 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109868

Bug ID: 109868
   Summary: [13/14 regression] ICE: segmentation fault when
building small C++ program
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
CC: amonakov at gcc dot gnu.org
  Target Milestone: ---

I originally reported this at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109806#c12.

For me, this crashes on x86_64-gentoo-linux-musl:
```
struct SimpleRefCounted {
  virtual void addRef();
};
struct timespec {
  long tv_nsec;
  int : 0;
};
struct ClockImpl : SimpleRefCounted {
  timespec _startTime;
};
struct Clock {
  Clock();
};
Clock::Clock() { ClockImpl(); }
```

with:
```
# g++ /tmp/foo.cxx -O2 -wrapper valgrind
==1239523== Memcheck, a memory error detector
==1239523== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==1239523== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==1239523== Command: /usr/libexec/gcc/x86_64-gentoo-linux-musl/13/cc1plus
-quiet -D_GNU_SOURCE /tmp/foo.cxx -quiet -dumpdir a- -dumpbase foo.cxx
-dumpbase-ext .cxx -mtune=generic -march=x86-64 -O2 -fcf-protection -o
/tmp/ccigHfiN.s
==1239523==
==1239523== Invalid read of size 1
==1239523==at 0x97844C: to_wide (tree.h:6257)
==1239523==by 0x97844C: irange::set_varying(tree_node*) (value-range.h:959)
==1239523==by 0x10C1A45: range_query::get_tree_range(vrange&, tree_node*,
gimple*) (value-query.cc:252)
==1239523==by 0x1B52256: gimple_ranger::range_of_stmt(vrange&, gimple*,
tree_node*) (gimple-range.cc:298)
==1239523==by 0x1B52778: gimple_ranger::register_inferred_ranges(gimple*)
(gimple-range.cc:474)
==1239523==by 0x109FB19: rvrp_folder::fold_stmt(gimple_stmt_iterator*)
(tree-vrp.cc:1079)
==1239523==by 0xFA9ED3:
substitute_and_fold_dom_walker::before_dom_children(basic_block_def*)
(tree-ssa-propagate.cc:848)
==1239523==by 0x1B24C2E: dom_walker::walk(basic_block_def*)
(domwalk.cc:311)
==1239523==by 0xFA9312:
substitute_and_fold_engine::substitute_and_fold(basic_block_def*)
(tree-ssa-propagate.cc:971)
==1239523==by 0x109DB80: execute_ranger_vrp(function*, bool, bool)
(tree-vrp.cc:1107)
==1239523==by 0xD3A0EA: execute_one_pass(opt_pass*) (passes.cc:2651)
==1239523==by 0xD3A9AF: execute_pass_list_1(opt_pass*) (passes.cc:2760)
==1239523==by 0xD3A9C1: execute_pass_list_1(opt_pass*) (passes.cc:2761)
==1239523==  Address 0x4 is not stack'd, malloc'd or (recently) free'd
==1239523==
during GIMPLE pass: evrp
/tmp/foo.cxx: In constructor 'Clock::Clock()':
/tmp/foo.cxx:14:31: internal compiler error: Segmentation fault
   14 | Clock::Clock() { ClockImpl(); }
  |   ^

0xe10df3 crash_signal
   
/usr/src/debug/sys-devel/gcc-13.1.1_p20230513/gcc-13-20230513/gcc/toplev.cc:314
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
```

It crashes without valgrind too, just less informative.

[Bug tree-optimization/109806] [13/14 Regression] 13.1.0 cc1plus stack smashing crash with C array of complex structs

2023-05-15 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109806

--- Comment #14 from Sam James  ---
(In reply to Alexander Monakov from comment #13)
> The 128KB stack size is for *secondary* threads on musl (i.e. those created
> via pthread_create). The main thread has the same stack as on glibc (GCC
> extends it to 128MB unless there's a hard limit).

To be fair, I didn't know if gcc did everything in a single thread. Thanks.

> 
> This doesn't look like a stack exhaustion and should be a separate bug.

I asked on IRC and was pointed here rather than file a dupe, but I'll file one
now in that case.

[Bug tree-optimization/109806] [13/14 Regression] 13.1.0 cc1plus stack smashing crash with C array of complex structs

2023-05-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109806

Alexander Monakov  changed:

   What|Removed |Added

 CC||amonakov at gcc dot gnu.org

--- Comment #13 from Alexander Monakov  ---
The 128KB stack size is for *secondary* threads on musl (i.e. those created via
pthread_create). The main thread has the same stack as on glibc (GCC extends it
to 128MB unless there's a hard limit).

This doesn't look like a stack exhaustion and should be a separate bug.
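
To check what the main thread's stack limit actually is on a given system, something like the following sketch can be used. It assumes a POSIX system with getrlimit; the helper name query_stack_limit is made up here, and the code only reads the limits, it does not reproduce anything GCC does at startup.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft and hard RLIMIT_STACK values, which
   govern the main thread's stack.  Returns 0 on success, -1 on failure.  */
int
query_stack_limit (rlim_t *soft, rlim_t *hard)
{
  struct rlimit rl;

  if (getrlimit (RLIMIT_STACK, &rl) != 0)
    return -1;
  *soft = rl.rlim_cur;  /* current limit; may be RLIM_INFINITY */
  *hard = rl.rlim_max;  /* ceiling up to which the soft limit can be raised */
  return 0;
}
```

Stacks of threads created with pthread_create are sized separately (via thread attributes or the libc default), so this query says nothing about secondary threads.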

[Bug fortran/109865] different results when routine moved inside the contains statement

2023-05-15 Thread Gary.White at ColoState dot edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865

--- Comment #8 from GARY.WHITE at ColoState dot edu  ---
I just tried to send you a zip file with all the code and instructions (see
below), but it is over 6Mb in size, and was rejected.  Where can I put it that
you can access it?

I have put the file test_case.zip on my Onedrive account at

https://1drv.ms/u/s!Ak8uiHyJ2kc2iqIPdvZKUGDak3CZ9A?e=yFcRJZ

Gary


Gary C. White, CWB(r)
Professor Emeritus
Department of Fish, Wildlife, and Conservation Biology
10 Wagar
Colorado State University
Fort Collins, CO 80523
(515)450-2768 Mobile
gary.wh...@colostate.edu
https://sites.warnercnr.colostate.edu/gwhite/
he/him/his

See where we are!

"Leadership is a privilege to better the lives of others. It is not an
opportunity to satisfy personal greed." Mwai Kibaki

[Bug fortran/109865] different results when routine moved inside the contains statement

2023-05-15 Thread Gary.White at ColoState dot edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865

--- Comment #7 from GARY.WHITE at ColoState dot edu  ---
Sorry I can't simplify this down to a nice compact piece of code, but ...

In the attached test_case.zip file are all the *.f90 files, makefile, and some
library files that work on ubuntu with gfortran-12.  I can provide Windows
libraries if that is easier.

  Create the executable file, mark64,  by a  simple
 make
or
 make type=mark64

Right now, the makefile does not have an -O0 on the va09ad.f90 compile line. 
As we found out, over-riding -O3 on va09ad.f90 compilation produces correct
code.

Execute the test case with

 ./mark64 i=dipper.inp o=dipper.out

I've included 2 output files, dipper_correct.out and dipper_incorrect.out so
you can see what correct and incorrect outputs look like.

Hopefully this all works out.

Thanks.

Gary

-Original Message-
From: kargl at gcc dot gnu.org 
Sent: Monday, May 15, 2023 2:42 PM
To: White,Gary 
Subject: [Bug fortran/109865] different results when routine moved inside the
contains statement

** Caution: EXTERNAL Sender **

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865

--- Comment #6 from kargl at gcc dot gnu.org --- (In reply to
gary.wh...@colostate.edu from comment #5)
> (In reply to Steve Kargl from comment #4)

> > I assume you've also tried with -fcheck=all.
> > Your report states you're using og12.  If it supports the sanitizer,
> > can you add -fsanitize=undefined to the options?
>
> -fcheck=all does not generate any warnings.
> -fsanitize=undefined returns pages when loading of:
>
> undefined reference to `__ubsan_handle_pointer_overflow'
>
> which makes no sense to me

Hmmm.  Thanks for checking.  Either your version of gcc is not built with
--enable-libsanitizer or gfortran cannot find the library.  At this point, it
seems we're going to need a complete testcase.

--
You are receiving this mail because:
You reported the bug.

[Bug tree-optimization/101805] Max -> bool0 | bool1 Min -> a & b

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101805

Andrew Pinski  changed:

   What|Removed |Added

   Keywords|patch   |
URL|https://gcc.gnu.org/piperma |
   |il/gcc-patches/2021-August/ |
   |577394.html |

--- Comment #4 from Andrew Pinski  ---
About to submit an updated version of this patch.

[Bug c++/109867] New: -Wswitch-default reports missing default in coroutine

2023-05-15 Thread lukaslang.bugtracker at outlook dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109867

Bug ID: 109867
   Summary: -Wswitch-default reports missing default in coroutine
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: lukaslang.bugtracker at outlook dot com
  Target Milestone: ---

Consider the following implementation of a simple coroutine
(https://godbolt.org/z/rcevTd5f6):

#include <coroutine>

struct task
{
struct promise_type
{
std::suspend_always initial_suspend();
std::suspend_always final_suspend() noexcept;
void unhandled_exception();
task get_return_object();
void return_value(int);
};
};

int main()
{
auto t = []() -> task
{
co_return 2;
}();
} 

Compiling this with -std=c++20 -Wswitch-default -Werror results in an error at
the end of the coroutine body:

:20:5: error: switch missing default case [-Werror=switch-default]
   20 | }();

Since I can get neither clang nor msvc to complain, I assume this is a bug, or
am I missing something? If this is indeed a bug, can I work around this without
having to disable the warning?

[Bug tree-optimization/109720] -Wmaybe-uninitialized triggering when I can see no path that would allow it

2023-05-15 Thread psmith at gnu dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109720

--- Comment #7 from Paul Smith  ---
Just to note this code also throws this warning in GCC 12.3 but it doesn't
complain in GCC 11.3 which is what I was using before.

[Bug tree-optimization/109806] [13/14 Regression] 13.1.0 cc1plus stack smashing crash with C array of complex structs

2023-05-15 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109806

Sam James  changed:

   What|Removed |Added

 CC||aldyh at gcc dot gnu.org
   See Also||https://bugs.gentoo.org/sho
   ||w_bug.cgi?id=906380

--- Comment #12 from Sam James  ---
I think I'm hitting this on musl too. Reported in Gentoo at
https://bugs.gentoo.org/906380.

For me, this crashes on x86_64-gentoo-linux-musl:
```
struct SimpleRefCounted {
  virtual void addRef();
};
struct timespec {
  long tv_nsec;
  int : 0;
};
struct ClockImpl : SimpleRefCounted {
  timespec _startTime;
};
struct Clock {
  Clock();
};
Clock::Clock() { ClockImpl(); }
```

with:
```
# g++ /tmp/foo.cxx -O2 -wrapper valgrind
==1239523== Memcheck, a memory error detector
==1239523== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==1239523== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==1239523== Command: /usr/libexec/gcc/x86_64-gentoo-linux-musl/13/cc1plus
-quiet -D_GNU_SOURCE /tmp/foo.cxx -quiet -dumpdir a- -dumpbase foo.cxx
-dumpbase-ext .cxx -mtune=generic -march=x86-64 -O2 -fcf-protection -o
/tmp/ccigHfiN.s
==1239523==
==1239523== Invalid read of size 1
==1239523==at 0x97844C: to_wide (tree.h:6257)
==1239523==by 0x97844C: irange::set_varying(tree_node*) (value-range.h:959)
==1239523==by 0x10C1A45: range_query::get_tree_range(vrange&, tree_node*,
gimple*) (value-query.cc:252)
==1239523==by 0x1B52256: gimple_ranger::range_of_stmt(vrange&, gimple*,
tree_node*) (gimple-range.cc:298)
==1239523==by 0x1B52778: gimple_ranger::register_inferred_ranges(gimple*)
(gimple-range.cc:474)
==1239523==by 0x109FB19: rvrp_folder::fold_stmt(gimple_stmt_iterator*)
(tree-vrp.cc:1079)
==1239523==by 0xFA9ED3:
substitute_and_fold_dom_walker::before_dom_children(basic_block_def*)
(tree-ssa-propagate.cc:848)
==1239523==by 0x1B24C2E: dom_walker::walk(basic_block_def*)
(domwalk.cc:311)
==1239523==by 0xFA9312:
substitute_and_fold_engine::substitute_and_fold(basic_block_def*)
(tree-ssa-propagate.cc:971)
==1239523==by 0x109DB80: execute_ranger_vrp(function*, bool, bool)
(tree-vrp.cc:1107)
==1239523==by 0xD3A0EA: execute_one_pass(opt_pass*) (passes.cc:2651)
==1239523==by 0xD3A9AF: execute_pass_list_1(opt_pass*) (passes.cc:2760)
==1239523==by 0xD3A9C1: execute_pass_list_1(opt_pass*) (passes.cc:2761)
==1239523==  Address 0x4 is not stack'd, malloc'd or (recently) free'd
==1239523==
during GIMPLE pass: evrp
/tmp/foo.cxx: In constructor 'Clock::Clock()':
/tmp/foo.cxx:14:31: internal compiler error: Segmentation fault
   14 | Clock::Clock() { ClockImpl(); }
  |   ^

0xe10df3 crash_signal
   
/usr/src/debug/sys-devel/gcc-13.1.1_p20230513/gcc-13-20230513/gcc/toplev.cc:314
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.
```

(Obviously crashes w/o valgrind too, just the output is way less helpful.)

Note that musl has a small default stack size, as I mentioned at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109695#c18.

[committed] c: Update __has_c_attribute values for C2x

2023-05-15 Thread Joseph Myers
WG14 decided that __has_c_attribute should return the same value
(equal to the intended __STDC_VERSION__ value) for all standard
attributes in C2x, with values associated with when an attribute was
added to the working draft (or had semantics added or changed in the
working draft) only being used in earlier stages of development of
that draft.  The intent is that the values for existing attributes
increase in future standard versions only if there are new features /
semantic changes for those attributes.  Implement this change for GCC.
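
One practical consequence for user code is that feature tests should compare the result of __has_c_attribute with >= rather than ==, since the reported value can move forward as it does with this change. A minimal sketch, in which the NODISCARD_SUPPORT macro and the nodiscard_support function are names invented for this example:

```c
#include <assert.h>

/* Nested #if so that preprocessors without __has_c_attribute never see
   the function-like use.  */
#if defined(__has_c_attribute)
#  if __has_c_attribute(nodiscard) >= 202003L
#    define NODISCARD_SUPPORT 1
#  else
#    define NODISCARD_SUPPORT 0
#  endif
#else
#  define NODISCARD_SUPPORT 0
#endif

/* Returns 1 when the compiler reports a nodiscard level of at least
   202003L, and 0 otherwise.  */
int
nodiscard_support (void)
{
  return NODISCARD_SUPPORT;
}
```

A test written as "== 202003L" would have matched before this change and would stop matching once the compiler reports 202311L, while the >= form keeps working in both cases.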

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c-family/
* c-lex.cc (c_common_has_attribute): Use 202311 as
__has_c_attribute return for all C2x attributes.

gcc/testsuite/
* gcc.dg/c2x-has-c-attribute-2.c: Expect 202311L return value from
__has_c_attribute for all C2x attributes.

diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index 6eb0fae2f53..dcd061c7cb1 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -392,17 +392,13 @@ c_common_has_attribute (cpp_reader *pfile, bool 
std_syntax)
}
  else
{
- if (is_attribute_p ("deprecated", attr_name))
-   result = 201904;
- else if (is_attribute_p ("fallthrough", attr_name))
-   result = 201910;
- else if (is_attribute_p ("nodiscard", attr_name))
-   result = 202003;
- else if (is_attribute_p ("maybe_unused", attr_name))
-   result = 202106;
- else if (is_attribute_p ("noreturn", attr_name)
-  || is_attribute_p ("_Noreturn", attr_name))
-   result = 202202;
+ if (is_attribute_p ("deprecated", attr_name)
+ || is_attribute_p ("fallthrough", attr_name)
+ || is_attribute_p ("maybe_unused", attr_name)
+ || is_attribute_p ("nodiscard", attr_name)
+ || is_attribute_p ("noreturn", attr_name)
+ || is_attribute_p ("_Noreturn", attr_name))
+   result = 202311;
}
  if (result)
attr_name = NULL_TREE;
diff --git a/gcc/testsuite/gcc.dg/c2x-has-c-attribute-2.c 
b/gcc/testsuite/gcc.dg/c2x-has-c-attribute-2.c
index 3c34ab6cbd9..dc92b95e907 100644
--- a/gcc/testsuite/gcc.dg/c2x-has-c-attribute-2.c
+++ b/gcc/testsuite/gcc.dg/c2x-has-c-attribute-2.c
@@ -2,56 +2,56 @@
 /* { dg-do preprocess } */
 /* { dg-options "-std=c2x -pedantic-errors" } */
 
-#if __has_c_attribute ( nodiscard ) != 202003L
+#if __has_c_attribute ( nodiscard ) != 202311L
 #error "bad result for nodiscard"
 #endif
 
-#if __has_c_attribute ( __nodiscard__ ) != 202003L
+#if __has_c_attribute ( __nodiscard__ ) != 202311L
 #error "bad result for __nodiscard__"
 #endif
 
-#if __has_c_attribute(maybe_unused) != 202106L
+#if __has_c_attribute(maybe_unused) != 202311L
 #error "bad result for maybe_unused"
 #endif
 
-#if __has_c_attribute(__maybe_unused__) != 202106L
+#if __has_c_attribute(__maybe_unused__) != 202311L
 #error "bad result for __maybe_unused__"
 #endif
 
-#if __has_c_attribute (deprecated) != 201904L
+#if __has_c_attribute (deprecated) != 202311L
 #error "bad result for deprecated"
 #endif
 
-#if __has_c_attribute (__deprecated__) != 201904L
+#if __has_c_attribute (__deprecated__) != 202311L
 #error "bad result for __deprecated__"
 #endif
 
-#if __has_c_attribute (fallthrough) != 201910L
+#if __has_c_attribute (fallthrough) != 202311L
 #error "bad result for fallthrough"
 #endif
 
-#if __has_c_attribute (__fallthrough__) != 201910L
+#if __has_c_attribute (__fallthrough__) != 202311L
 #error "bad result for __fallthrough__"
 #endif
 
-#if __has_c_attribute (noreturn) != 202202L
+#if __has_c_attribute (noreturn) != 202311L
 #error "bad result for noreturn"
 #endif
 
-#if __has_c_attribute (__noreturn__) != 202202L
+#if __has_c_attribute (__noreturn__) != 202311L
 #error "bad result for __noreturn__"
 #endif
 
-#if __has_c_attribute (_Noreturn) != 202202L
+#if __has_c_attribute (_Noreturn) != 202311L
 #error "bad result for _Noreturn"
 #endif
 
-#if __has_c_attribute (___Noreturn__) != 202202L
+#if __has_c_attribute (___Noreturn__) != 202311L
 #error "bad result for ___Noreturn__"
 #endif
   
 /* Macros in the attribute name are expanded.  */
 #define foo deprecated
-#if __has_c_attribute (foo) != 201904L
+#if __has_c_attribute (foo) != 202311L
 #error "bad result for foo"
 #endif

-- 
Joseph S. Myers
jos...@codesourcery.com
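
For readers wondering what the unified value means for user code, here is a minimal sketch of a feature test written against `__has_c_attribute`. The macro `HAVE_C23_NODISCARD` and the helper `nodiscard_is_c23` are hypothetical names, not part of the patch, and the fallback definition for compilers that lack the operator is an assumption of this sketch.

```c
#include <assert.h>

/* Assumed fallback: compilers without __has_c_attribute report every
   attribute as unsupported.  */
#ifndef __has_c_attribute
# define __has_c_attribute(x) 0
#endif

/* With the patch applied, all standard C2x attributes report 202311L,
   so one threshold covers every one of them.  */
#if __has_c_attribute (nodiscard) >= 202311L
# define HAVE_C23_NODISCARD 1
#else
# define HAVE_C23_NODISCARD 0
#endif

/* Hypothetical helper: 1 if nodiscard is reported with the C23 value,
   0 otherwise (older compiler, or attribute not supported).  */
int
nodiscard_is_c23 (void)
{
  /* Sanity check: the preprocessor test can only yield 0 or 1.  */
  assert (HAVE_C23_NODISCARD == 0 || HAVE_C23_NODISCARD == 1);
  return HAVE_C23_NODISCARD;
}
```

Code that compared against the old per-attribute values (for example `!= 202003L` for nodiscard) would need a threshold test like the one above to keep working across compiler versions.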


[Bug fortran/109861] Optimization is marking uninitialized C_PTR being passed to a C function, causes segfault.

2023-05-15 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109861

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org

--- Comment #5 from kargl at gcc dot gnu.org ---
(In reply to Scot Breitenfeld from comment #3)
> I see the same issue with NAG, regardless of the optimization level. Our CI
> testing had missed it because this was a parallel test, and we don't test
> parallel with NAG.
> 
> I guess the issue is whether marking TYPE(C_PTR) as CLOBBER is correct. I
> looked through the 2018 standard and could not locate anything that
> addresses this use case. Are you interpreting the possibility that a
> TYPE(C_PTR) should not be declared INTENT(OUT)?

Fortran 2023, 8.5.10

The INTENT (OUT) attribute for a nonpointer dummy argument specifies
that the dummy argument becomes undefined on invocation of the procedure,
except for any subcomponents that are default-initialized (7.5.4.6).

[Bug fortran/109861] Optimization is marking uninitialized C_PTR being passed to a C function, causes segfault.

2023-05-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109861

--- Comment #4 from anlauf at gcc dot gnu.org ---
(In reply to Scot Breitenfeld from comment #3)
> I guess the issue is whether marking TYPE(C_PTR) as CLOBBER is correct. I
> looked through the 2018 standard and could not locate anything that
> addresses this use case. Are you interpreting the possibility that a
> TYPE(C_PTR) should not be declared INTENT(OUT)?

Maybe I am missing your intention, but I interpret your code that you
want to pass a (C) pointer to variable attr_rdata0 to return your result.
But that needs to be intent(in).  Your subroutine is not really supposed
to return a pointer but a result in the location the pointer dereferences.

Feel free to correct me.

> I can instead change the subroutine to declare buf as
> 
> INTEGER(C_INT), INTENT(OUT), TARGET :: buf
> 
> and f_ptr = C_LOC(buf) and there is no issue.

I cannot confirm this with my gcc installations, and there is no reason
that this should make a difference.

> So it seems to depend on the
> TYPE of the argument being passed.

There are cases where no clobber is currently generated.  For example,
if the dummy variable is a Fortran pointer, which has a completely
different semantics from TYPE(C_PTR).

Still I don't understand why you don't use INTENT(IN) for the pointer.
In that case you could do things in the main like:

  CALL H5Aread_async_f(C_LOC(attr_rdata0))

which appears to represent what I am guessing, and which gets rejected for
INTENT /= IN with a possibly more helpful error message.
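
The C side of this pattern can be sketched as follows. The function name `read_attr` and the value it stores are hypothetical stand-ins for the C routine behind the reporter's call, which is not shown in the report. The point is only that the pointer itself is an input, while the result is stored in the memory it designates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C callee.  BUF is the address obtained from C_LOC on the
   Fortran side; the pointer is an input, the pointee receives the result.  */
void
read_attr (void *buf)
{
  assert (buf != NULL);   /* an undefined or clobbered pointer is invalid here */
  *(int *) buf = 42;      /* placeholder for the value actually read */
}
```

Under this reading the Fortran dummy that carries the address never needs to be INTENT(OUT), because the callee does not change which location the pointer refers to.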

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-15 Thread 钟居哲
Hi, Richard.

>> Easier to read as:

 >>  _41 = _40 - 16

>> (which might not be valid gimple, but pseudocode is good enough).

OK.


>> The difficulty with this is that the len_load* and len_store*
>>optabs currently say that the behaviour is undefined if the
>>length argument is greater than the length of a vector.
>>So I think using these values of _47 and _44 in the .LEN_STOREs
>>is relying on undefined behaviour.

>>Haven't had time to think about the consequences of that yet,
>>but wanted to send something out sooner rather than later.

Yes, we have tail agnostic (TA) in vsetvli, which makes the tail elements
undefined values. The current optabs behavior matches the RVV specification.
I think we can leave this to be solved carefully in the future. So far,
I don't see any issue.

>>It would be better to use known_le here, without checking whether the
>>VF is constant.
Ok

Thank you so much for your patience helping this patch.
I have sent V8 patch with fixes as you suggested:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618638.html 

Can I merge this patch?

I am gonna post the next patch with select_vl included.

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 03:44
To: juzhe.zhong
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
juzhe.zh...@rivai.ai writes:
> From: Juzhe-Zhong 
>
> This patch implement decrement IV for length approach in loop control.
>
> Address comment from kewen that incorporate the implementation inside
> "vect_set_loop_controls_directly" instead of a standalone function.
>
> Address comment from Richard using MIN_EXPR to handle these 3 following
> cases
> 1. single rgroup.
> 2. multiple rgroup for SLP.
> 3. multiple rgroup for non-SLP (tested on vec_pack_trunc).
 
Thanks, this looks pretty reasonable to me FWIW, but some comments below:
 
> Bootstraped && Regression on x86.
>
> Ok for trunk ?
>
> gcc/ChangeLog:
>
> * tree-vect-loop-manip.cc (vect_adjust_loop_lens): New function.
> (vect_set_loop_controls_directly): Add decrement IV support.
> (vect_set_loop_condition_partial_vectors): Ditto.
> * tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Add a new variable.
> (vect_get_loop_len): Add decrement IV support.
> * tree-vect-stmts.cc (vectorizable_store): Ditto.
> (vectorizable_load): Ditto.
> * tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.
> (vect_get_loop_len): Add decrement IV support.
>
> ---
>  gcc/tree-vect-loop-manip.cc | 177 +++-
>  gcc/tree-vect-loop.cc   |  38 +++-
>  gcc/tree-vect-stmts.cc  |   9 +-
>  gcc/tree-vectorizer.h   |  13 ++-
>  4 files changed, 224 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index ff6159e08d5..1baac7b1b52 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -385,6 +385,58 @@ vect_maybe_permute_loop_masks (gimple_seq *seq, rgroup_controls *dest_rgm,
>return false;
>  }
>  
> +/* Try to use adjust loop lens for non-SLP multiple-rgroups.
> +
> + _36 = MIN_EXPR ;
> +
> + First length (MIN (X, VF/N)):
> +   loop_len_15 = MIN_EXPR <_36, VF/N>;
> +
> + Second length:
> +   tmp = _36 - loop_len_15;
> +   loop_len_16 = MIN (tmp, VF/N);
> +
> + Third length:
> +   tmp2 = tmp - loop_len_16;
> +   loop_len_17 = MIN (tmp2, VF/N);
> +
> + Forth length:
> +   tmp3 = tmp2 - loop_len_17;
> +   loop_len_18 = MIN (tmp3, VF/N);  */
> +
> +static void
> +vect_adjust_loop_lens (tree iv_type, gimple_seq *seq, rgroup_controls *dest_rgm,
> +rgroup_controls *src_rgm)
> +{
> +  tree ctrl_type = dest_rgm->type;
> +  poly_uint64 nitems_per_ctrl
> += TYPE_VECTOR_SUBPARTS (ctrl_type) * dest_rgm->factor;
> +
> +  for (unsigned int i = 0; i < dest_rgm->controls.length (); ++i)
> +{
> +  tree src = src_rgm->controls[i / dest_rgm->controls.length ()];
> +  tree dest = dest_rgm->controls[i];
> +  tree length_limit = build_int_cst (iv_type, nitems_per_ctrl);
> +  gassign *stmt;
> +  if (i == 0)
> + {
> +   /* MIN (X, VF*I/N) capped to the range [0, VF/N].  */
> +   stmt = gimple_build_assign (dest, MIN_EXPR, src, length_limit);
> +   gimple_seq_add_stmt (seq, stmt);
> + }
> +  else
> + {
> +   /* (MIN (remain, VF*I/N)) capped to the range [0, VF/N].  */
> +   tree temp = make_ssa_name (iv_type);
> +   stmt = gimple_build_assign (temp, MINUS_EXPR, src,
> +   dest_rgm->controls[i - 1]);
> +   gimple_seq_add_stmt (seq, stmt);
> +   stmt = gimple_build_assign (dest, MIN_EXPR, temp, length_limit);
> +   gimple_seq_add_stmt (seq, stmt);
> + }
> +}
> +}
> +
>  /* Helper for vect_set_loop_condition_partial_vectors.  Generate definitions
> for all the rgroup controls in RGC and return a control that is nonzero
> when the loop needs to 

[PATCH V9] VECT: Add decrement IV support in Loop Vectorizer

2023-05-15 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This patch implements decrement IV for length approach in loop control.
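
As a rough scalar model of what a decrementing IV means for a length-controlled loop, consider the sketch below. It is not the vectorizer's output: the function `add_arrays` and the parameter `vf` (the number of elements one vector iteration can process) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of a length-controlled loop.  The induction variable
   counts the remaining elements down to zero instead of counting the
   processed elements up.  */
void
add_arrays (int *dst, const int *a, const int *b, size_t n, size_t vf)
{
  assert (vf > 0);
  size_t remain = n;                          /* decrementing IV */
  while (remain > 0)
    {
      /* Length for this iteration: MIN (remain, vf).  */
      size_t len = remain < vf ? remain : vf;
      for (size_t i = 0; i < len; ++i)        /* stands in for one vector op */
        dst[i] = a[i] + b[i];
      dst += len;
      a += len;
      b += len;
      remain -= len;                          /* decrement by the length used */
    }
}
```

In this form the loop exit test is simply whether anything remains, and the per-iteration length never exceeds `vf`.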

Address comment from kewen that incorporate the implementation inside
"vect_set_loop_controls_directly" instead of a standalone function.

Address comment from Richard using MIN_EXPR to handle these 3 following
cases
1. single rgroup.
2. multiple rgroup for SLP.
3. multiple rgroup for non-SLP (tested on vec_pack_trunc).
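
For case 3, the comment in the patch describes deriving each control length from what remains after the previous one. A minimal scalar sketch of that arithmetic follows; `split_loop_lens`, `total` and `limit` are hypothetical names, with `limit` playing the role of VF/N in the comment.

```c
#include <assert.h>
#include <stddef.h>

/* Split TOTAL active elements across NCTRLS controls, each holding at
   most LIMIT elements: first = MIN (total, limit), then each following
   length is MIN (what is left, limit).  */
void
split_loop_lens (unsigned total, unsigned limit, unsigned *lens,
                 size_t nctrls)
{
  assert (lens != NULL || nctrls == 0);
  unsigned remain = total;
  for (size_t i = 0; i < nctrls; ++i)
    {
      lens[i] = remain < limit ? remain : limit;
      remain -= lens[i];
    }
}
```

With 10 active elements, a limit of 4 and four controls, this yields the lengths 4, 4, 2 and 0.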

Bootstrapped and regression-tested on x86.

Ok for trunk ?

gcc/ChangeLog:

* tree-vect-loop-manip.cc (vect_adjust_loop_lens): New function.
(vect_set_loop_controls_directly): Add decrement IV support.
(vect_set_loop_condition_partial_vectors): Ditto.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Add a new variable.
(vect_get_loop_len): Add decrement IV support.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.
(vect_get_loop_len): Add decrement IV support.

---
 gcc/tree-vect-loop-manip.cc | 177 +++-
 gcc/tree-vect-loop.cc   |  37 +++-
 gcc/tree-vect-stmts.cc  |   9 +-
 gcc/tree-vectorizer.h   |  13 ++-
 4 files changed, 223 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index ff6159e08d5..aae2e122b1a 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -385,6 +385,58 @@ vect_maybe_permute_loop_masks (gimple_seq *seq, rgroup_controls *dest_rgm,
   return false;
 }
 
+/* Try to use adjust loop lens for non-SLP multiple-rgroups.
+
+ _36 = MIN_EXPR ;
+
+ First length (MIN (X, VF/N)):
+   loop_len_15 = MIN_EXPR <_36, VF/N>;
+
+ Second length:
+   tmp = _36 - loop_len_15;
+   loop_len_16 = MIN (tmp, VF/N);
+
+ Third length:
+   tmp2 = tmp - loop_len_16;
+   loop_len_17 = MIN (tmp2, VF/N);
+
+ Forth length:
+   tmp3 = tmp2 - loop_len_17;
+   loop_len_18 = MIN (tmp3, VF/N);  */
+
+static void
+vect_adjust_loop_lens (tree iv_type, gimple_seq *seq, rgroup_controls *dest_rgm,
+                       rgroup_controls *src_rgm)
+{
+  tree ctrl_type = dest_rgm->type;
+  poly_uint64 nitems_per_ctrl
+    = TYPE_VECTOR_SUBPARTS (ctrl_type) * dest_rgm->factor;
+
+  for (unsigned int i = 0; i < dest_rgm->controls.length (); ++i)
+    {
+      tree src = src_rgm->controls[i / dest_rgm->controls.length ()];
+      tree dest = dest_rgm->controls[i];
+      tree length_limit = build_int_cst (iv_type, nitems_per_ctrl);
+      gassign *stmt;
+      if (i == 0)
+        {
+          /* MIN (X, VF*I/N) capped to the range [0, VF/N].  */
+          stmt = gimple_build_assign (dest, MIN_EXPR, src, length_limit);
+          gimple_seq_add_stmt (seq, stmt);
+        }
+      else
+        {
+          /* (MIN (remain, VF*I/N)) capped to the range [0, VF/N].  */
+          tree temp = make_ssa_name (iv_type);
+          stmt = gimple_build_assign (temp, MINUS_EXPR, src,
+                                      dest_rgm->controls[i - 1]);
+          gimple_seq_add_stmt (seq, stmt);
+          stmt = gimple_build_assign (dest, MIN_EXPR, temp, length_limit);
+          gimple_seq_add_stmt (seq, stmt);
+        }
+    }
+}
+
 /* Helper for vect_set_loop_condition_partial_vectors.  Generate definitions
for all the rgroup controls in RGC and return a control that is nonzero
when the loop needs to iterate.  Add any new preheader statements to
@@ -468,9 +520,10 @@ vect_set_loop_controls_directly (class loop *loop, loop_vec_info loop_vinfo,
   gimple_stmt_iterator incr_gsi;
   bool insert_after;
   standard_iv_increment_position (loop, &incr_gsi, &insert_after);
-  create_iv (build_int_cst (iv_type, 0), PLUS_EXPR, nitems_step, NULL_TREE,
-             loop, &incr_gsi, insert_after, &index_before_incr,
-             &index_after_incr);
+  if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
+    create_iv (build_int_cst (iv_type, 0), PLUS_EXPR, nitems_step, NULL_TREE,
+               loop, &incr_gsi, insert_after, &index_before_incr,
+               &index_after_incr);
 
   tree zero_index = build_int_cst (compare_type, 0);
   tree test_index, test_limit, first_limit;
@@ -552,8 +605,13 @@ vect_set_loop_controls_directly (class loop *loop, loop_vec_info loop_vinfo,
   /* Convert the IV value to the comparison type (either a no-op or
      a demotion).  */
   gimple_seq test_seq = NULL;
-  test_index = gimple_convert (&test_seq, compare_type, test_index);
-  gsi_insert_seq_before (test_gsi, test_seq, GSI_SAME_STMT);
+  if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
+    test_limit = gimple_convert (preheader_seq, iv_type, nitems_total);
+  else
+    {
+      test_index = gimple_convert (&test_seq, compare_type, test_index);
+      gsi_insert_seq_before (test_gsi, test_seq, GSI_SAME_STMT);
+    }
 
   /* Provide a definition of each control in the group.  */
   tree next_ctrl = NULL_TREE;
@@ -587,6 +645,101 @@ vect_set_loop_controls_directly (class loop 

Re: [PATCH] Turn on LRA on all targets

2023-05-15 Thread Sam James via Gcc-patches

"Maciej W. Rozycki"  writes:

> On Sun, 23 Apr 2023, Segher Boessenkool wrote:
>
>> >  There are extra ICEs in regression testing and code quality is poor; cf. 
>> > .  
>> 
>> Do you have something you can show for this?  Maybe in a PR?
>
>  I have filed no PRs as I didn't assess the collateral damage at the time 
> I looked at it.  I only ran regression-testing with `-mlra' shortly after 
> I completed MODE_CC conversion and added the option, to see what lies 
> beyond.  And I only added `-mlra' and made minimal changes to make the 
> compiler build again just to make it easier to proceed towards LRA.

I think before moving forward with the plan in general, a PR is ideally
needed for each target anyway. Not all machine maintainers actively watch the
MLs.




Re: [PATCH] Turn on LRA on all targets

2023-05-15 Thread Maciej W. Rozycki
On Sun, 23 Apr 2023, Segher Boessenkool wrote:

> >  There are extra ICEs in regression testing and code quality is poor; cf. 
> > .  
> 
> Do you have something you can show for this?  Maybe in a PR?

 I have filed no PRs as I didn't assess the collateral damage at the time 
I looked at it.  I only ran regression-testing with `-mlra' shortly after 
I completed MODE_CC conversion and added the option, to see what lies 
beyond.  And I only added `-mlra' and made minimal changes to make the 
compiler build again just to make it easier to proceed towards LRA.

> And, are the ICEs in the generic code, or something vax-specific?

 At least some were in generic code, e.g.:

during RTL pass: combine
.../gcc/testsuite/gcc.c-torture/compile/pr101562.c: In function 'foo':
.../gcc/testsuite/gcc.c-torture/compile/pr101562.c:12:1: internal compiler 
error: in insert, at wide-int.cc:682
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
compiler exited with status 1
FAIL: gcc.c-torture/compile/pr101562.c   -O1  (internal compiler error)
FAIL: gcc.c-torture/compile/pr101562.c   -O1  (test for excess errors)

(coming from `gcc_checking_assert (precision >= width)'), or:

In file included from .../gcc/testsuite/g++.dg/modules/xtreme-header-2.h:10,
 from .../gcc/testsuite/g++.dg/modules/xtreme-header-2_a.H:4:
.../vax-netbsdelf/libstdc++-v3/include/regex:42: internal compiler error: in 
set_filename, at cp/module.cc:19134
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.
compiler exited with status 1
FAIL: g++.dg/modules/xtreme-header-2_a.H -std=c++2b (internal compiler error)
FAIL: g++.dg/modules/xtreme-header-2_a.H -std=c++2b (test for excess errors)

(from `gcc_checking_assert (!filename)').  As I say, I did not assess this 
at all back then and the logs are dated Nov 2021 (I had to chase them).

 Also I'm not going to dedicate any time now to switch the VAX backend to 
LRA, because old reload continues working while we have a non-functional 
exception unwinder that never ever worked, as I have recently discovered, 
which breaks lots of C++ code, including in particular native VAX/NetBSD 
GDB and `gdbserver' (my newly-ported implementation of), which is a bit of 
a problem (native VAX/NetBSD GCC has been spared owing to the decision not 
to use exceptions).

 And fixing the unwinder is going to be a major effort due to how the VAX 
CALLS machine instruction works and the stack frame has been consequently 
structured; it is unlike any other ELF target, and even if it can be 
expressed in DWARF terms (which I'm not entirely sure about), it is going 
to require a dedicated handler like with ARM or IA64.

 I may choose to implement a non-DWARF unwinder instead, as the VAX stack 
frame is always fully described by the hardware and there is never ever a 
need for debug information to be able to decode any VAX stack frame (the 
RET machine instruction uses the stack frame information to restore the 
previous PC, FP, SP, AP and any static registers saved by CALLS).

 So implementing a working exception unwinder has to take precedence over 
LRA and I do hope to complete it during this release cycle, but I may not 
have any time left for LRA.

 Please keep this in mind with any plans to drop old reload.  I'll highly 
appreciate that and I do keep LRA on my radar as the next item to address 
after the unwinder, by any means it's not been lost.

  Maciej


[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #14 from Thomas Schwinge  ---
(In reply to Eric Gallager from comment #12)
> Note that there's a gnulib module for flock:
> https://www.gnu.org/software/gnulib/manual/html_node/flock.html

I'd seen that one -- but it also says: "the replacement function does not really
work", so I don't think that's useful?

(In reply to Jakub Jelinek from comment #13)
> And fcntl in tclx.

Seen that, too -- but is TclX something that people actually have
available/installed?  (Rainer?)

> Anyway, I think choosing between flock(1) and some
> python file locking would be better than using perl which is only needed in
> maintainer mode and not otherwise.

Rainer, would a 'python3' variant work for you?

[Bug fortran/109865] different results when routine moved inside the contains statement

2023-05-15 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865

--- Comment #6 from kargl at gcc dot gnu.org ---
(In reply to gary.wh...@colostate.edu from comment #5)
> (In reply to Steve Kargl from comment #4)

> > I assume you've also tried with -fcheck=all.
> > Your report states you're using og12.  If 
> > it supports the sanitizer, can you add 
> > -fsanitize=undefined to the options?
> 
> -fcheck=all does not generate any warnings.
> -fsanitize=undefined returns pages when loading of:
> 
> undefined reference to `__ubsan_handle_pointer_overflow'
> 
> which makes no sense to me

Hmmm.  Thanks for checking.  Either your version of
gcc is not built with --enable-libsanitizer or 
gfortran cannot find the library.  At this point,
it seems we're going to need a complete testcase.

[Bug fortran/109865] different results when routine moved inside the contains statement

2023-05-15 Thread Gary.White at ColoState dot edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865

--- Comment #5 from GARY.WHITE at ColoState dot edu  ---
(In reply to Steve Kargl from comment #4)
> On Mon, May 15, 2023 at 07:11:17PM +, Gary.White at ColoState dot edu
> wrote:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865
> > (In reply to kargl from comment #2)
> > > (In reply to gary.wh...@colostate.edu from comment #0)
> > > 
> > > > Options being used to compile the code:
> > > > COPTIONS = -cpp -std=f2018 -c -D ieee -D dbleprecision -m64
> > > > -fsignaling-nans -ffpe-summary='invalid','zero','overflow','underflow' 
> > > > -O3
> > > > -funroll-loops -ffast-math 
> > > 
> > > What happens if you remove -ffast-math and use -O0 or -O1?
> > 
> > > -O0 generates correct code with or without -ffast-math, -O1 does not generate
> > correct code.
> 
> I assume you've also tried with -fcheck=all.
> Your report states you're using og12.  If 
> it supports the sanitizer, can you add 
> -fsanitize=undefined to the options?

-fcheck=all does not generate any warnings.
-fsanitize=undefined produces pages of the following at link time:

undefined reference to `__ubsan_handle_pointer_overflow'

which makes no sense to me

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #13 from Jakub Jelinek  ---
And fcntl in tclx.  Anyway, I think choosing between flock(1) and some python
file locking would be better than using perl which is only needed in maintainer
mode and not otherwise.
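
For reference, the kind of advisory locking being discussed looks like this at the C level. This is only a sketch using the BSD/Linux `flock` call from `<sys/file.h>`, which is not part of POSIX; the helper names `lock_file` and `unlock_file` are hypothetical, and the thread itself is about doing the equivalent from the testsuite driver rather than from C.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/* Open (creating if needed) PATH and take an exclusive advisory lock.
   Blocks until the lock is available.  Returns the descriptor holding
   the lock, or -1 on failure.  */
int
lock_file (const char *path)
{
  assert (path != NULL);
  int fd = open (path, O_CREAT | O_RDWR, 0644);
  if (fd < 0)
    return -1;
  if (flock (fd, LOCK_EX) != 0)
    {
      close (fd);
      return -1;
    }
  return fd;
}

/* Release the lock taken by lock_file and close the descriptor.
   Returns 0 on success.  */
int
unlock_file (int fd)
{
  int ret = flock (fd, LOCK_UN);
  close (fd);
  return ret;
}
```

A caller would bracket the serialized section with `lock_file` and `unlock_file` on an agreed lock file.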

Re: [PATCH 0/3] Refactor memory block operations

2023-05-15 Thread Andreas Krebbel via Gcc-patches
On 5/15/23 09:17, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested.  Ok for mainline?
> 
> Stefan Schulze Frielinghaus (3):
>   s390: Refactor block operation cpymem
>   s390: Add block operation movmem
>   s390: Refactor block operation setmem
> 
>  gcc/config/s390/s390-protos.h|   5 +-
>  gcc/config/s390/s390.cc  | 301 ---
>  gcc/config/s390/s390.md  |  61 -
>  gcc/testsuite/gcc.target/s390/memset-1.c |   7 +-
>  4 files changed, 331 insertions(+), 43 deletions(-)
> 

Ok. Thanks!

Andreas



Re: More C type errors by default for GCC 14

2023-05-15 Thread Eric Gallager via Gcc
On 5/15/23, Richard Earnshaw (lists) via Gcc  wrote:
> On 10/05/2023 03:38, Eli Zaretskii via Gcc wrote:
>>> From: Arsen Arsenović 
>>> Cc: Eli Zaretskii , Jakub Jelinek ,
>>>   jwakely@gmail.com, gcc@gcc.gnu.org
>>> Date: Tue, 09 May 2023 22:21:03 +0200
>>>
>>>> The concern is using the good will of the GNU Toolchain brand as the tip
>>>> of the spear or battering ram to motivate software packages to fix their
>>>> problems. It's using GCC as leverage in a manner that is difficult for
>>>> package maintainers to avoid.  Maybe that's a necessary approach, but
>>>> we should be clear about the reasoning.  Again, I'm not objecting, but
>>>> let's clarify why we are choosing this approach.
>>>
>>> Both the GNU Toolchain and the GNU Toolchain users will benefit from a
>>> stricter toolchain.
>>>
>>> People can and have stopped using the GNU Toolchain due to lackluster
>>> and non-strict defaults.  This is certainly not positive for the brand,
>>> and I doubt it buys it much good will.
>>
>> It is not GCC's business to force developers of packages to get their
>> act together.  It is the business of those package developers
>> themselves.  GCC should give those developers effective and convenient
>> means of detecting any unsafe and dubious code and of correcting it as
>> they see fit.  Which GCC already does by emitting warnings.  GCC
>> should only error out if it is completely unable to produce valid
>> code, which is not the case here, since it has been producing valid
>> code for ages.
>>
>> It is a disservice to GCC users if a program that compiled yesterday
>> and worked perfectly well suddenly cannot be built because GCC was
>> upgraded, perhaps due to completely unrelated reasons.  It would be a
>> grave mistake on the part of GCC to decide that part of its mission is
>> to teach package developers how to write their code and when and how
>> to modify it.
>
> That argument doesn't really wash.  We already upgrade the 'default'
> language version (-std=...) from time to time and that can impact
> existing programs (eg we changed from gnu-inline to std-inline model).
>
> If this really isn't legal C, then my suggestion would be to tie this to
> a setting of -std, so -std=c2 would default to being more
> aggressive in enforcing this (via changing the warning to -werror=) and
> then -std=gnu2 might follow a bit behind that.
> Furthermore, we can trail this aggressively in release notes so that
> nobody can really claim to be surprised.
>

I support this plan for using -Werror= and having it be split based on
whether -std= is set to a strict ANSI option or a GNU option; is there
a way to do that in the optfiles, or would it have to be handled at
the specs level instead?
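
As a concrete illustration of the kind of code at stake, consider implicit function declarations. That this diagnostic is among those meant is an assumption here, since the quoted messages do not list them. The sketch below is the conforming form; the helper name `dup_string` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>   /* declares malloc; omitting it makes the call an
                         implicit declaration, which older C allowed */
#include <string.h>   /* declares strlen and memcpy */

/* Return a heap copy of S, or NULL if allocation fails.  */
char *
dup_string (const char *s)
{
  assert (s != NULL);
  size_t len = strlen (s) + 1;
  char *copy = malloc (len);
  if (copy != NULL)
    memcpy (copy, s, len);
  return copy;
}
```

Without the `<stdlib.h>` include, a compiler that still accepts implicit declarations assumes `malloc` returns `int`, which can truncate the pointer on 64-bit targets. Building with `-Werror=implicit-function-declaration` turns that case into a hard error today.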

> At some point that std setting will become the default and the overall
> goal is achieved.
>
> R.
>


Re: [PATCH] aarch64: Add SVE instruction types

2023-05-15 Thread Evandro Menezes via Gcc-patches
Hi, Kyrill.

I wasn’t aware of your previous patch.  Could you clarify why you considered 
creating an SVE specific type attribute instead of reusing the common one?  I 
really liked the iterators that you created; I’d like to use them.

Do you have specific examples which you might want to mention with regards to 
granularity?

Yes, my intent for this patch is to enable modeling the SVE instructions on N1.
The patch that implements it brings some performance improvements, but it’s
mostly flat, as expected.

Thank you,

-- 
Evandro Menezes



> On May 15, 2023, at 04:49, Kyrylo Tkachov wrote:
> 
> 
> 
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Monday, May 15, 2023 10:01 AM
>> To: Evandro Menezes via Gcc-patches
>> Cc: evandro+...@gcc.gnu.org; Evandro Menezes <ebah...@icloud.com>;
>> Kyrylo Tkachov <kyrylo.tkac...@arm.com>;
>> Tamar Christina <tamar.christ...@arm.com>
>> Subject: Re: [PATCH] aarch64: Add SVE instruction types
>> 
>> Evandro Menezes via Gcc-patches  writes:
>>> This patch adds the attribute `type` to most SVE1 instructions, as in the
>> other
>>> instructions.
>> 
>> Thanks for doing this.
>> 
>> Could you say what criteria you used for picking the granularity?  Other
>> maintainers might disagree, but personally I'd prefer to distinguish two
>> instructions only if:
>> 
>> (a) a scheduling description really needs to distinguish them or
>> (b) grouping them together would be very artificial (because they're
>>logically unrelated)
>> 
>> It's always possible to split types later if new scheduling descriptions
>> require it.  Because of that, I don't think we should try to predict ahead
>> of time what future scheduling descriptions will need.
>> 
>> Of course, this depends on having results that show that scheduling
>> makes a significant difference on an SVE core.  I think one of the
>> problems here is that, when a different scheduling model changes the
>> performance of a particular test, it's difficult to tell whether
>> the gain/loss is caused by the model being more/less accurate than
>> the previous one, or if it's due to important "secondary" effects
>> on register live ranges.  Instinctively, I'd have expected these
>> secondary effects to dominate on OoO cores.
> 
> I agree with Richard on these points. The key here is getting the granularity
> right without having to maintain too many types that aren't useful in the
> models.
> FWIW I had posted 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607101.html in 
> November. It adds annotations to SVE2 patterns as well as for base SVE.
> Feel free to reuse it if you'd like.
> I see you had posted a Neoverse V1 scheduling model. Does that give an 
> improvement on SVE code when combined with the scheduling attributes somehow?
> Thanks,
> Kyrill



[Bug fortran/109861] Optimization is marking uninitialized C_PTR being passed to a C function, causes segfault.

2023-05-15 Thread brtnfld at hdfgroup dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109861

--- Comment #3 from Scot Breitenfeld  ---
I see the same issue with NAG, regardless of the optimization level. Our CI
testing had missed it because this was a parallel test, and we don't test
parallel with NAG.

I guess the issue is whether marking TYPE(C_PTR) as CLOBBER is correct. I
looked through the 2018 standard and could not locate anything that addresses
this use case. Are you interpreting the possibility that a TYPE(C_PTR) should
not be declared INTENT(OUT)?

I can instead change the subroutine to declare buf as

INTEGER(C_INT), INTENT(OUT), TARGET :: buf

and f_ptr = C_LOC(buf) and there is no issue. So it seems to depend on the TYPE
of the argument being passed.

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #12 from Eric Gallager  ---
Note that there's a gnulib module for flock:
https://www.gnu.org/software/gnulib/manual/html_node/flock.html

Re: [PATCH] aarch64: Add SVE instruction types

2023-05-15 Thread Evandro Menezes via Gcc-patches
Hi, Richard.

My criteria were very much (a).  In some cases though, a particular instruction
could have variations that others in its natural group didn’t, so it seemed
sensible to create a specific description for this instruction, even if its
base form shares resources with other instructions in its group.

Do you have specific instances in mind?

Thank you,

-- 
Evandro Menezes



> On May 15, 2023, at 04:00, Richard Sandiford wrote:
> 
> Evandro Menezes via Gcc-patches  writes:
>> This patch adds the attribute `type` to most SVE1 instructions, as in the 
>> other
>> instructions.
> 
> Thanks for doing this.
> 
> Could you say what criteria you used for picking the granularity?  Other
> maintainers might disagree, but personally I'd prefer to distinguish two
> instructions only if:
> 
> (a) a scheduling description really needs to distinguish them or
> (b) grouping them together would be very artificial (because they're
>logically unrelated)
> 
> It's always possible to split types later if new scheduling descriptions
> require it.  Because of that, I don't think we should try to predict ahead
> of time what future scheduling descriptions will need.
> 
> Of course, this depends on having results that show that scheduling
> makes a significant difference on an SVE core.  I think one of the
> problems here is that, when a different scheduling model changes the
> performance of a particular test, it's difficult to tell whether
> the gain/loss is caused by the model being more/less accurate than
> the previous one, or if it's due to important "secondary" effects
> on register live ranges.  Instinctively, I'd have expected these
> secondary effects to dominate on OoO cores.
> 
> Richard


-- 
Evandro Menezes ◊ evan...@yahoo.com ◊ Austin, TX
Άγιος ο Θεός ⁂ ܩܕܝܫܐ ܐܢ̱ܬ ܠܐ ܡܝܘܬܐ ⁂ Sanctus Deus





Re: [wish] Flexible array members in unions

2023-05-15 Thread Qing Zhao via Gcc


> On May 12, 2023, at 2:16 AM, Richard Biener via Gcc  wrote:
> 
> On Thu, May 11, 2023 at 11:14 PM Kees Cook via Gcc  wrote:
>> 
>> On Thu, May 11, 2023 at 08:53:52PM +, Joseph Myers wrote:
>>> On Thu, 11 May 2023, Kees Cook via Gcc wrote:
>>> 
>>>> On Thu, May 11, 2023 at 06:29:10PM +0200, Alejandro Colomar wrote:
>>>>> On 5/11/23 18:07, Alejandro Colomar wrote:
>>>>> [...]
>>>>>> Would you allow flexible array members in unions?  Is there any
>>>>>> strong reason to disallow them?
>>>>
>>>> Yes please!! And alone in a struct, too.
>>>>
>>>> AFAICT, there is no mechanical/architectural reason to disallow them
>>>> (especially since they _can_ be constructed with some fancy tricks,
>>>> and they behave as expected.) My understanding is that it's disallowed
>>>> due to an overly strict reading of the very terse language that created
>>>> flexible arrays in C99.
>>> 
>>> Standard C has no such thing as a zero-size object or type, which would
>>> lead to problems with a struct or union that only contains a flexible
>>> array member there.
>> 
>> Ah-ha, okay. That root cause makes sense now.
> 
> Hmm. but then the workaround
> 
> struct X {
>   int n;
>   union u {
>     char at_least_size_one;
>     int iarr[];
>     short sarr[];
>   };
> };
> 
> doesn't work either.  We could make that a GNU extension without
> adverse effects?

I think this might be a very nice extension: it addresses Standard C's
restriction on zero-size objects and can also meet the kernel's need.
(And perhaps other users' similar programming needs?)
Maybe it would also be possible to add such an extension to Standard C later?

As with flexible array members in Standard C, we should require such a union
to be the last field of another structure.  (Basically, this union can be
treated as a flexible array member.)
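
For reference, the existing Standard C rule mentioned above (a flexible array member must be the last member of a struct that has at least one other named member) can be sketched as follows. This is a minimal illustration in plain C99, not the proposed extension; `struct packet`, `packet_new`, and `packet_sum` are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example type: the flexible array member must come last,
   and the struct needs at least one other named member (C99 6.7.2.1).  */
struct packet {
  size_t n;        /* number of elements stored in data[] */
  int data[];      /* flexible array member: contributes no elements to sizeof */
};

/* Allocate a packet with room for N ints, filled with 0..N-1.
   Returns NULL on allocation failure.  */
struct packet *
packet_new (size_t n)
{
  struct packet *p = malloc (sizeof *p + n * sizeof p->data[0]);
  if (p != NULL)
    {
      p->n = n;
      for (size_t i = 0; i < n; i++)
        p->data[i] = (int) i;
    }
  return p;
}

/* Sum the stored elements.  */
long
packet_sum (const struct packet *p)
{
  long sum = 0;
  for (size_t i = 0; i < p->n; i++)
    sum += p->data[i];
  return sum;
}
```

A union of flexible arrays placed last in a struct, as discussed in this thread, would presumably be allocated the same way, with the trailing storage sized for whichever array type is in use.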

Qing

> 
> Richard.
> 
>> Why are zero-sized objects missing in Standard C? Or, perhaps, the better
>> question is: what's needed to support the idea of a zero-sized object?
>> 
>> --
>> Kees Cook



Re: [PATCH v2] libstdc++: Do not use pthread_mutex_clocklock with ThreadSanitizer

2023-05-15 Thread Thomas Rodgers via Gcc-patches
On Thu, May 11, 2023 at 1:52 PM Jonathan Wakely  wrote:

> On Thu, 11 May 2023 at 13:42, Jonathan Wakely  wrote:
>
>>
>>
>> On Thu, 11 May 2023 at 13:19, Mike Crowe  wrote:
>>
>>> However, ...
>>>
>>> > > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
>>> > > index 89e7f5f5f45..e2700b05ec3 100644
>>> > > --- a/libstdc++-v3/acinclude.m4
>>> > > +++ b/libstdc++-v3/acinclude.m4
>>> > > @@ -4284,7 +4284,7 @@
>>> AC_DEFUN([GLIBCXX_CHECK_PTHREAD_COND_CLOCKWAIT], [
>>> > >[glibcxx_cv_PTHREAD_COND_CLOCKWAIT=no])
>>> > >])
>>> > >if test $glibcxx_cv_PTHREAD_COND_CLOCKWAIT = yes; then
>>> > > -AC_DEFINE(_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT, 1, [Define if
>>> > > pthread_cond_clockwait is available in .])
>>> > > +AC_DEFINE(_GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT,
>>> (_GLIBCXX_TSAN==0),
>>> > > [Define if pthread_cond_clockwait is available in .])
>>> > >fi
>>>
>>> TSan does appear to have an interceptor for pthread_cond_clockwait, even
>>> if
>>> it lacks the others. Does this mean that this part is unnecessary?
>>>
>>
>> Ah good point, thanks. I grepped for clocklock but not clockwait.
>>
>
> In fact it seems like we don't need to change
> _GLIBCXX_USE_PTHREAD_RWLOCK_CLOCKLOCK either, because I don't get any tsan
> warnings for that. It doesn't have interceptors for
> pthread_rwlock_{rd,wr}lock, but it doesn't complain anyway (maybe it's
> simply not instrumenting the rwlock functions at all?!)
>
> So I'm now retesting with this version of the patch, which only touches
> the USE_PTHREAD_MUTEX_CLOCKLOCK macro.
>
> Please take another look, thanks.
>
LGTM.


Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-15 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes:
> From: Juzhe-Zhong 
>
> This patch implements decrement IV for the length approach in loop control.
>
> Address comment from kewen to incorporate the implementation inside
> "vect_set_loop_controls_directly" instead of a standalone function.
>
> Address comment from Richard to use MIN_EXPR to handle the following
> 3 cases:
> 1. single rgroup.
> 2. multiple rgroup for SLP.
> 3. multiple rgroup for non-SLP (tested on vec_pack_trunc).

Thanks, this looks pretty reasonable to me FWIW, but some comments below:

> Bootstrapped and regression-tested on x86.
>
> Ok for trunk ?
>
> gcc/ChangeLog:
>
> * tree-vect-loop-manip.cc (vect_adjust_loop_lens): New function.
> (vect_set_loop_controls_directly): Add decrement IV support.
> (vect_set_loop_condition_partial_vectors): Ditto.
> * tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): Add a new 
> variable.
> (vect_get_loop_len): Add decrement IV support.
> * tree-vect-stmts.cc (vectorizable_store): Ditto.
> (vectorizable_load): Ditto.
> * tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.
> (vect_get_loop_len): Add decrement IV support.
>
> ---
>  gcc/tree-vect-loop-manip.cc | 177 +++-
>  gcc/tree-vect-loop.cc   |  38 +++-
>  gcc/tree-vect-stmts.cc  |   9 +-
>  gcc/tree-vectorizer.h   |  13 ++-
>  4 files changed, 224 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index ff6159e08d5..1baac7b1b52 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -385,6 +385,58 @@ vect_maybe_permute_loop_masks (gimple_seq *seq, 
> rgroup_controls *dest_rgm,
>return false;
>  }
>  
> +/* Try to use adjust loop lens for non-SLP multiple-rgroups.
> +
> + _36 = MIN_EXPR ;
> +
> + First length (MIN (X, VF/N)):
> +   loop_len_15 = MIN_EXPR <_36, VF/N>;
> +
> + Second length:
> +   tmp = _36 - loop_len_15;
> +   loop_len_16 = MIN (tmp, VF/N);
> +
> + Third length:
> +   tmp2 = tmp - loop_len_16;
> +   loop_len_17 = MIN (tmp2, VF/N);
> +
> + Fourth length:
> +   tmp3 = tmp2 - loop_len_17;
> +   loop_len_18 = MIN (tmp3, VF/N);  */
> +
> +static void
> +vect_adjust_loop_lens (tree iv_type, gimple_seq *seq, rgroup_controls 
> *dest_rgm,
> +rgroup_controls *src_rgm)
> +{
> +  tree ctrl_type = dest_rgm->type;
> +  poly_uint64 nitems_per_ctrl
> += TYPE_VECTOR_SUBPARTS (ctrl_type) * dest_rgm->factor;
> +
> +  for (unsigned int i = 0; i < dest_rgm->controls.length (); ++i)
> +{
> +  tree src = src_rgm->controls[i / dest_rgm->controls.length ()];
> +  tree dest = dest_rgm->controls[i];
> +  tree length_limit = build_int_cst (iv_type, nitems_per_ctrl);
> +  gassign *stmt;
> +  if (i == 0)
> + {
> +   /* MIN (X, VF*I/N) capped to the range [0, VF/N].  */
> +   stmt = gimple_build_assign (dest, MIN_EXPR, src, length_limit);
> +   gimple_seq_add_stmt (seq, stmt);
> + }
> +  else
> + {
> +   /* (MIN (remain, VF*I/N)) capped to the range [0, VF/N].  */
> +   tree temp = make_ssa_name (iv_type);
> +   stmt = gimple_build_assign (temp, MINUS_EXPR, src,
> +   dest_rgm->controls[i - 1]);
> +   gimple_seq_add_stmt (seq, stmt);
> +   stmt = gimple_build_assign (dest, MIN_EXPR, temp, length_limit);
> +   gimple_seq_add_stmt (seq, stmt);
> + }
> +}
> +}
> +
>  /* Helper for vect_set_loop_condition_partial_vectors.  Generate definitions
> for all the rgroup controls in RGC and return a control that is nonzero
> when the loop needs to iterate.  Add any new preheader statements to
> @@ -468,9 +520,10 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
>gimple_stmt_iterator incr_gsi;
>bool insert_after;
>standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> -  create_iv (build_int_cst (iv_type, 0), PLUS_EXPR, nitems_step, NULL_TREE,
> -  loop, &incr_gsi, insert_after, &index_before_incr,
> -  &index_after_incr);
> +  if (!LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> +create_iv (build_int_cst (iv_type, 0), PLUS_EXPR, nitems_step, NULL_TREE,
> +loop, &incr_gsi, insert_after, &index_before_incr,
> +&index_after_incr);
>  
>tree zero_index = build_int_cst (compare_type, 0);
>tree test_index, test_limit, first_limit;
> @@ -552,8 +605,13 @@ vect_set_loop_controls_directly (class loop *loop, 
> loop_vec_info loop_vinfo,
>/* Convert the IV value to the comparison type (either a no-op or
>   a demotion).  */
>gimple_seq test_seq = NULL;
> -  test_index = gimple_convert (&test_seq, compare_type, test_index);
> -  gsi_insert_seq_before (test_gsi, test_seq, GSI_SAME_STMT);
> +  if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> +test_limit = gimple_convert (preheader_seq, iv_type, nitems_total);
> +  else
> +  
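
The length splitting described in the comment above `vect_adjust_loop_lens` can be illustrated with a small standalone C sketch. It follows the arithmetic in that comment (each length is the minimum of what remains and VF/N), not the GIMPLE-building code in the patch; `split_loop_len`, `min_u`, and their parameters are hypothetical names, and the vectorization factor is assumed to be a plain integer for simplicity.

```c
#include <stddef.h>

/* Return the smaller of two unsigned values.  */
static unsigned
min_u (unsigned a, unsigned b)
{
  return a < b ? a : b;
}

/* Split TOTAL (the number of items to process in this iteration, assumed
   to be already capped to VF) into N per-control lengths, each capped to
   VF / N.  Mirrors the comment: len[0] = MIN (total, VF/N), and each
   later length is MIN (remaining, VF/N).  */
void
split_loop_len (unsigned total, unsigned vf, size_t n, unsigned *len)
{
  unsigned limit = vf / (unsigned) n;
  unsigned remain = total;
  for (size_t i = 0; i < n; i++)
    {
      len[i] = min_u (remain, limit);
      remain -= len[i];
    }
}
```

For example, with VF = 16, N = 4 and 10 items remaining, the four lengths come out as 4, 4, 2 and 0.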

[Bug fortran/109865] different results when routine moved inside the contains statement

2023-05-15 Thread sgk at troutmask dot apl.washington.edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865

--- Comment #4 from Steve Kargl  ---
On Mon, May 15, 2023 at 07:11:17PM +, Gary.White at ColoState dot edu
wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865
> (In reply to kargl from comment #2)
> > (In reply to gary.wh...@colostate.edu from comment #0)
> > 
> > > Options being used to compile the code:
> > >   COPTIONS = -cpp -std=f2018 -c -D ieee -D dbleprecision -m64
> > > -fsignaling-nans -ffpe-summary='invalid','zero','overflow','underflow' -O3
> > > -funroll-loops -ffast-math 
> > 
> > What happens if you remove -ffast-math and use -O0 or -O1?
> 
> -O0 generates correct code with or without -ffast-math, -O1 does not generate
> correct code.

I assume you've also tried with -fcheck=all.
Your report states you're using og12.  If 
it supports the sanitizer, can you add 
-fsanitize=undefined to the options?

[Bug fortran/109865] different results when routine moved inside the contains statement

2023-05-15 Thread Gary.White at ColoState dot edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865

--- Comment #3 from GARY.WHITE at ColoState dot edu  ---
(In reply to kargl from comment #2)
> (In reply to gary.wh...@colostate.edu from comment #0)
> > Created attachment 55087 [details]
> > set of subroutines where moving mc11ad inside the contains statement
> > produces incorrect results
> > 
> > In the following code, when the subroutine mc11ad is moved inside the
> > contains statement, incorrect results are produced.
> 
> "Produces wrong results" is meaningless, as you haven't said what the
> correct results and wrong results are.  A difference in the 7th
> decimal place for REAL may be entirely possible due to floating-point
> round-off.
> 
> > Options being used to compile the code:
> > COPTIONS = -cpp -std=f2018 -c -D ieee -D dbleprecision -m64
> > -fsignaling-nans -ffpe-summary='invalid','zero','overflow','underflow' -O3
> > -funroll-loops -ffast-math 
> 
> What happens if you remove -ffast-math and use -O0 or -O1?

-O0 generates correct code with or without -ffast-math, -O1 does not generate
correct code.

[Bug bootstrap/82856] --enable-maintainter-mode broken by incompatiblity of gcc's required automake and modern Perl

2023-05-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82856

Thomas Schwinge  changed:

   What|Removed |Added

 CC||redi at gcc dot gnu.org,
   ||tschwinge at gcc dot gnu.org

--- Comment #14 from Thomas Schwinge  ---
(In reply to Jonathan Wakely from comment #13)
> (In reply to Thomas Koenig from comment #7)
> > Author: tkoenig
> > Date: Thu Nov 16 20:24:00 2017
> > New Revision: 254845
> > 
> > URL: https://gcc.gnu.org/viewcvs?rev=254845&root=gcc&view=rev
> > Log:
> > 2017-11-16  Thomas Koenig  
> > 
> > PR bootstrap/82856
> > * doc/install.texi: Document incompatibility of Perl >=5.6.26
> > with the required version of automake 1.11.6.
> > 
> > 
> > Modified:
> > trunk/gcc/ChangeLog
> > trunk/gcc/doc/install.texi
> 
> Thomas, this patch refers to a non-existent 5.6.26 version (did you mean
> 5.26.0?)
> 
> But is this even still relevant now that we've updated the automake version?
> Can we revert it to just say 5.6.1 or later?

Indeed; 
"Back to requiring "Perl version 5.6.1 (or later)" [PR82856]".

Back to requiring "Perl version 5.6.1 (or later)" [PR82856] (was: Update GCC to autoconf 2.69, automake 1.15.1)

2023-05-15 Thread Thomas Schwinge
Hi!

On 2018-10-31T17:04:46+, Joseph Myers  wrote:
> On Wed, 31 Oct 2018, Thomas Koenig wrote:
>> On 31.10.18 at 04:26, Joseph Myers wrote:
>> > This patch (diffs to generated files omitted below) updates GCC to use
>> > autoconf 2.69 and automake 1.15.1.
>>
>> I think this should fix PR 82856.  Maybe you could confirm that this
>> restores automake functionality with perl 5.6.26, and mention the PR
>> in the ChangeLog.

(Perl 5.26, not 5.6.26, is what was meant there; see
.
I remember well, as I once chased down that patch...)

> At least, the warnings I saw with an older perl version and automake
> 1.11.x are gone when using 1.15.1.

ACK.

> I've committed this revised patch version

> A reference to PR
> bootstrap/82856 has been added.  gcc/doc/install.texi has been updated
> to mention the new versions required.

..., but not removed the Perl "5.6.25" 5.26 requirement.  OK to push the
attached "Back to requiring "Perl version 5.6.1 (or later)" [PR82856]"?
(Later then to be backported to all relevant release branches, too.)


Regards,
 Thomas


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; limited liability company; Managing Directors: Thomas Heurung,
Frank Thürauf; Registered office: München; Registration court: München,
HRB 106955
>From 18153b349eb0062d73e3b2da3a2721dd44884b94 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 15 May 2023 20:55:11 +0200
Subject: [PATCH] Back to requiring "Perl version 5.6.1 (or later)" [PR82856]

With Subversion r265695 (Git commit 22e052725189a472e4e86ebb6595278a49f4bcdd)
"Update GCC to autoconf 2.69, automake 1.15.1 (PR bootstrap/82856)" we're back
to normal; per Automake 1.15.1 'configure.ac' still "[...] perl 5.6 or better
is required [...]".

	PR bootstrap/82856
	gcc/
	* doc/install.texi (Perl): Back to requiring "Perl version 5.6.1 (or
	later)".
---
 gcc/doc/install.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index fa91ce1953d..dfab47dac96 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -366,7 +366,7 @@ Necessary (only on some platforms) to untar the source code.  Many
 systems' @command{tar} programs will also work, only try GNU
 @command{tar} if you have problems.
 
-@item Perl version between 5.6.1 and 5.6.24
+@item Perl version 5.6.1 (or later)
 
 Necessary when targeting Darwin, building @samp{libstdc++},
 and not using @option{--disable-symvers}.
-- 
2.34.1



Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-15 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni  writes:
> Hi Richard,
> After committing the interleave+zip1 patch for vector initialization,
> it seems to regress the s32 case for this patch:
>
> int32x4_t f_s32(int32_t x)
> {
>   return (int32x4_t) { x, x, x, 1 };
> }
>
> code-gen:
> f_s32:
>         movi    v30.2s, 0x1
>         fmov    s31, w0
>         dup     v0.2s, v31.s[0]
>         ins     v30.s[0], v31.s[0]
>         zip1    v0.4s, v0.4s, v30.4s
>         ret
>
> instead of expected code-gen:
> f_s32:
>         movi    v31.2s, 0x1
>         dup     v0.4s, w0
>         ins     v0.s[3], v31.s[0]
>         ret
>
> Cost for fallback sequence: 16
> Cost for interleave and zip sequence: 12
>
> For the above case, the cost for interleave+zip1 sequence is computed as:
> halves[0]:
> (set (reg:V2SI 96)
> (vec_duplicate:V2SI (reg/v:SI 93 [ x ])))
> cost = 8
>
> halves[1]:
> (set (reg:V2SI 97)
> (const_vector:V2SI [
> (const_int 1 [0x1]) repeated x2
> ]))
> (set (reg:V2SI 97)
> (vec_merge:V2SI (vec_duplicate:V2SI (reg/v:SI 93 [ x ]))
> (reg:V2SI 97)
> (const_int 1 [0x1])))
> cost = 8
>
> followed by:
> (set (reg:V4SI 95)
> (unspec:V4SI [
> (subreg:V4SI (reg:V2SI 96) 0)
> (subreg:V4SI (reg:V2SI 97) 0)
> ] UNSPEC_ZIP1))
> cost = 4
>
> So the total cost becomes
> max(costs[0], costs[1]) + zip1_insn_cost
> = max(8, 8) + 4
> = 12
>
> While the fallback rtl sequence is:
> (set (reg:V4SI 95)
> (vec_duplicate:V4SI (reg/v:SI 93 [ x ])))
> cost = 8
> (set (reg:SI 98)
> (const_int 1 [0x1]))
> cost = 4
> (set (reg:V4SI 95)
> (vec_merge:V4SI (vec_duplicate:V4SI (reg:SI 98))
> (reg:V4SI 95)
> (const_int 8 [0x8])))
> cost = 4
>
> So total cost = 8 + 4 + 4 = 16, and we choose the interleave+zip1 sequence.
>
> I think the issue is probably that for the interleave+zip1 sequence we take
> max(costs[0], costs[1]) to reflect that both halves are interleaved,
> but for the fallback seq we use seq_cost, which assumes serial execution
> of insns in the sequence.
> For the above fallback sequence,
> (set (reg:V4SI 95)
> (vec_duplicate:V4SI (reg/v:SI 93 [ x ])))
> and
> (set (reg:SI 98)
> (const_int 1 [0x1]))
> could be executed in parallel, which would make its cost max(8, 4) + 4 = 12.

Agreed.

A good-enough substitute for this might be to ignore scalar moves
(for both alternatives) when costing for speed.
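
The cost arithmetic in the quoted analysis can be reproduced with a short C sketch. The per-insn costs (8, 8, 4 for interleave+zip1; 8, 4, 4 for the fallback) are taken from the message above; the helper names are hypothetical, and dropping the scalar move is only one possible reading of the suggestion, not the actual aarch64 costing code.

```c
/* Return the larger of two costs.  */
static int
max_cost (int a, int b)
{
  return a > b ? a : b;
}

/* Interleave+zip1: the two halves are assumed to be built in parallel,
   so only the more expensive half counts, followed by the zip1.  */
int
interleave_zip1_cost (int half0, int half1, int zip1)
{
  return max_cost (half0, half1) + zip1;
}

/* Fallback costed serially: dup, scalar move, then insert.  */
int
fallback_serial_cost (int dup, int scalar_move, int insert)
{
  return dup + scalar_move + insert;
}

/* Fallback with the dup and the scalar move treated as parallel.  */
int
fallback_parallel_cost (int dup, int scalar_move, int insert)
{
  return max_cost (dup, scalar_move) + insert;
}

/* Fallback with the scalar move ignored when costing for speed.  */
int
fallback_no_scalar_move_cost (int dup, int insert)
{
  return dup + insert;
}
```

With the numbers above, `interleave_zip1_cost (8, 8, 4)` gives 12 and `fallback_serial_cost (8, 4, 4)` gives 16, while both `fallback_parallel_cost (8, 4, 4)` and `fallback_no_scalar_move_cost (8, 4)` give 12, so under either adjustment the two alternatives would tie instead of interleave+zip1 winning outright.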

> I was wondering if we should make the cost for the interleave+zip1 sequence
> more conservative by not taking the max, but summing costs[0] + costs[1]
> even for speed?  For this case, that would be 8 + 8 + 4 = 20.
>
> It generates the fallback sequence for other cases (s8, s16, s64) from
> the test-case.

What does it do for the tests in the interleave+zip1 patch?  If it doesn't
make a difference there then it sounds like we don't have enough tests. :)

Summing is only conservative if the fallback sequence is somehow "safer".
But I don't think it is.   Building an N-element vector from N scalars
can be done using N instructions in the fallback case and N+1 instructions
in the interleave+zip1 case.  But the interleave+zip1 case is still
better (speedwise) for N==16.

Thanks,
Richard


[Bug fortran/109865] different results when routine moved inside the contains statement

2023-05-15 Thread kargl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865

kargl at gcc dot gnu.org changed:

   What|Removed |Added

 CC||kargl at gcc dot gnu.org

--- Comment #2 from kargl at gcc dot gnu.org ---
(In reply to gary.wh...@colostate.edu from comment #0)
> Created attachment 55087 [details]
> set of subroutines where moving mc11ad inside the contains statement
> produces incorrect results
> 
> In the following code, when the subroutine mc11ad is moved inside the
> contains statement, incorrect results are produced.

"Produces wrong results" is meaningless, as you haven't said what the
correct results and wrong results are.  A difference in the 7th
decimal place for REAL may be entirely possible due to floating-point
round-off.

> Options being used to compile the code:
>   COPTIONS = -cpp -std=f2018 -c -D ieee -D dbleprecision -m64
> -fsignaling-nans -ffpe-summary='invalid','zero','overflow','underflow' -O3
> -funroll-loops -ffast-math 

What happens if you remove -ffast-math and use -O0 or -O1?

[Bug fortran/109861] Optimization is marking uninitialized C_PTR being passed to a C function, causes segfault.

2023-05-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109861

--- Comment #2 from anlauf at gcc dot gnu.org ---
(In reply to anlauf from comment #1)
> I think that you might want to cross-check your testcase with the NAG
> compiler, or some other compiler which provides a means to initialize
> INTENT(OUT) arguments to detect such code.

I've just checked: NAG behaves similarly to gfortran; the code crashes
with INTENT(OUT) and works with INTENT(INOUT) as well as INTENT(IN).

[Bug rtl-optimization/109866] New: Sometimes using sub/test instead just test

2023-05-15 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109866

Bug ID: 109866
   Summary: Sometimes using sub/test instead just test
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: enhancement
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---

Take:
```
int g(void); int h(void); int t(void);
int f(int a, int b)
{
  int c = a - b;
  if (c == 0)
    return g();
  if (c > 0)
    return h();
  return t();
}
```
This is reduced from bzip2 in spec 2006, though I am not so sure any more.
On x86_64 GCC produces:
```
        subl    %esi, %edi
        testl   %edi, %edi
        je      .L5
        jle     .L3
        jmp     h()
.L3:
        jmp     t()
.L5:
        jmp     g()
```
But GCC should produce (like clang/LLVM does):
```
        cmpl    %esi, %edi
        je      .L5
        jle     .L3
        jmp     h()
.L3:
        jmp     t()
.L5:
        jmp     g()
```

Note a similar thing happens with aarch64 target too.
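
At the source level, the report amounts to observing that the two functions below behave the same whenever `a - b` does not overflow (signed overflow is undefined behavior in C, so a compiler may assume it does not happen). This is only a sketch: `f_sub` repeats the function from the report under a different name, `f_cmp` is a hypothetical rewrite, and `g`, `h`, `t` are given stub bodies here purely so the example is self-contained.

```c
/* Stub callees so the example links; the report only declares them.  */
int g (void) { return 0; }
int h (void) { return 1; }
int t (void) { return -1; }

/* The function from the report: subtract, then test the difference.  */
int
f_sub (int a, int b)
{
  int c = a - b;
  if (c == 0)
    return g ();
  if (c > 0)
    return h ();
  return t ();
}

/* Hypothetical rewrite: compare the operands directly.  */
int
f_cmp (int a, int b)
{
  if (a == b)
    return g ();
  if (a > b)
    return h ();
  return t ();
}
```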

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #11 from Thomas Schwinge  ---
(In reply to myself from comment #10)
> Could we easily build a portable 'flock'-like using 'fcntl' locking
> primitives?

(, for
example.)

> (I've not yet looked.)


But simpler, is it OK to require Perl (Ick!) for parallelized
'check-target-libgomp'?  There's ,
and I've got that implemented as a fallback 'flock'.  (It's certainly not,
after two decades or so, my desire to write something in Perl, but I suppose
it's available "almost everywhere" and the fallback 'flock' is simple to
implement.)
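
For the `fcntl` idea raised in comment #10, a minimal C sketch of whole-file locking with POSIX record locks might look like this. `lock_whole_file`, `unlock_whole_file`, and `set_whole_file_lock` are hypothetical helper names; note that, unlike `flock`, `fcntl` locks belong to the process, not to the open file description, so this is not a drop-in replacement.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Apply TYPE (F_WRLCK, F_RDLCK or F_UNLCK) to the whole file behind FD.
   F_SETLKW blocks until the lock can be acquired.  Returns 0 on success,
   -1 on error with errno set.  */
static int
set_whole_file_lock (int fd, short type)
{
  struct flock fl;
  memset (&fl, 0, sizeof fl);
  fl.l_type = type;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;   /* zero length means "through end of file" */
  return fcntl (fd, F_SETLKW, &fl);
}

/* Take an exclusive lock; FD must be open for writing.  */
int
lock_whole_file (int fd)
{
  return set_whole_file_lock (fd, F_WRLCK);
}

/* Release any lock this process holds on the file.  */
int
unlock_whole_file (int fd)
{
  return set_whole_file_lock (fd, F_UNLCK);
}
```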

[Bug fortran/109865] different results when routine moved inside the contains statement

2023-05-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109865

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

   Last reconfirmed||2023-05-15
 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1

--- Comment #1 from anlauf at gcc dot gnu.org ---
Please provide a compilable, self-contained testcase.  I get:

varmat.F90:3:11:

3 |   use status_module
  |   1
Fatal Error: Cannot open module file 'status_module.mod' for reading at (1): No
such file or directory
compilation terminated.

[Bug fortran/109861] Optimization is marking uninitialized C_PTR being passed to a C function, causes segfault.

2023-05-15 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109861

anlauf at gcc dot gnu.org changed:

   What|Removed |Added

   Last reconfirmed||2023-05-15
 Status|UNCONFIRMED |WAITING
 Ever confirmed|0   |1
 CC||anlauf at gcc dot gnu.org

--- Comment #1 from anlauf at gcc dot gnu.org ---
I believe the sample code is misleading, and the behavior is expected:

  SUBROUTINE h5aread_async_f(buf)
TYPE(C_PTR), INTENT(OUT) :: buf

You can see what happens if you specify the flag -fdump-tree-original as
part of F90FLAGS.

Now compare the dump-tree for INTENT(INOUT) vs. INTENT(OUT):

--- fcode.F90.005t.original.inout   2023-05-15 20:03:07.292148948 +0200
+++ fcode.F90.005t.original.out 2023-05-15 20:03:36.292208016 +0200
@@ -19,6 +19,7 @@
 D.4223 = (void *) _rdata0;
 f_ptr = D.4223;
   }
+  f_ptr = {CLOBBER};
   h5aread_async_f (_ptr);
   {
 struct __st_parameter_dt dt_parm.0;

When the dummy argument buf is declared with INTENT(OUT), we mark the
actual argument in the caller with CLOBBER, which means that the optimizer
may throw away previous calculations and assignments as they do not matter.

If you add a line

print '(Z16.16)', buf

into that subroutine, you'll see that the clobber annotation serves its
purpose once optimization is enabled.

I think that you might want to cross-check your testcase with the NAG
compiler, or some other compiler which provides a means to initialize
INTENT(OUT) arguments to detect such code.

Ping: [PATCH V5] PR target/105325: Fix constraint issue with power10 fusion

2023-05-15 Thread Michael Meissner via Gcc-patches
Ping both patches:

Patch #1, rewrite genfusion.pl's code for load and compare immediate fusion to
be more readable.  This patch produces the same output as the current sources.

| Date: Wed, 10 May 2023 11:38:55 -0400
| Subject: Re: [PATCH V5, 1/2] PR target/105325: Rewrite genfusion.pl's gen_ld_cmpi_p10 function.
| Message-ID: 

Patch #2, implement the fix for PR target/105325:

| Date: Wed, 10 May 2023 11:40:00 -0400
| Subject: [PATCH V5, 2/2] PR target/105325: Fix memory constraints for power10 fusion.
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

