Re: [PATCH 1/4] rs6000: Make all 128 bit scalar FP modes have 128 bit precision [PR112993]

2024-05-13 Thread Richard Biener
On Mon, May 13, 2024 at 3:39 AM Kewen.Lin wrote:
>
> Hi Joseph and Richi,
>
> Thanks for the suggestions and comments!
>
> on 2024/5/10 14:31, Richard Biener wrote:
> > On Thu, May 9, 2024 at 9:12 PM Joseph Myers wrote:
> >>
> >> On Wed, 8 May 2024, Kewen.Lin wrote:
> >>
> >>> to widen IFmode to TFmode.  To make build_common_tree_nodes
> >>> be able to find the correct mode for long double type node,
> >>> it introduces one hook mode_for_longdouble to offer target
> >>> a way to specify the mode used for long double type node.
> >>
> >> I don't really like layering a hook on top of the old target macro as a
> >> way to address a deficiency in the design of that target macro (floating
> >> types should have their mode, not a poorly defined precision value,
> >> specified directly by the target).
>
> Good point!
>
> >
> > Seconded.
> >
> >> A better hook design might be something like mode_for_floating_type (enum
> >> tree_index), where the argument is TI_FLOAT_TYPE, TI_DOUBLE_TYPE or
> >> TI_LONG_DOUBLE_TYPE, replacing all definitions and uses of
> >> FLOAT_TYPE_SIZE, DOUBLE_TYPE_SIZE and LONG_DOUBLE_TYPE_SIZE with the
> >> single new hook and appropriate definitions for each target (with a
> >> default definition that uses SFmode for float and DFmode for double and
> >> long double, which would be suitable for many targets).
> >
>
> The originally proposed hook was meant to leave the other ports unaffected,
> but I agree that introducing such a hook would be clearer.
>
> > In fact replacing all of X_TYPE_SIZE with a single hook might be worthwhile
> > though this removes the "convenient" defaulting, requiring each target to
> > enumerate all standard C ABI type modes.  But that might be also a good 
> > thing.
> >
>
> I guess the main value of extending from floating-point types to all types is to
> unify them?  (Assuming that, except for floating types, the others would
> not have multiple possible representations like what we face with 128-bit FP.)
>
> > The most pragmatic solution would be to do
> > s/LONG_DOUBLE_TYPE_SIZE/LONG_DOUBLE_TYPE_MODE/
>
> Yeah, this beats my proposed hook (assuming the default is VOIDmode too).
>
> So it seems we have three alternatives here:
>   1) s/LONG_DOUBLE_TYPE_SIZE/LONG_DOUBLE_TYPE_MODE/
>   2) mode_for_floating_type
>   3) mode_for_abi_type
>
> Since 1) would make long double type special (different from the other types
> having _TYPE_SIZE), personally I'm inclined to 3): implement 2) first, get
> this patch series landed, extend to all.
>
> Do you have any preference?

Maybe do 3) but have the default hook implementation look at
*_TYPE_SIZE when the target doesn't implement the hook?  That would
force you to transition rs6000 away from *_TYPE_SIZE completely
but this would also prove the design.

Btw, for .c.mode_for_abi_type I'd exclude ADA_LONG_TYPE_SIZE.

Joseph, do you agree with this?  I'd not touch the target macros like
PTRDIFF_TYPE (those evaluating to a string) at this point though
they could be handled with a common target hook as well (not sure
if we'd want to have a unified hook for both?).

Thanks,
Richard.

>
> BR,
> Kewen
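
A rough sketch of option 3) with the fallback suggested above, written in plain C rather than with GCC's internal types. Every name in it (fp_mode, abi_type, legacy_type_size, default_mode_for_floating_type, example_target_hook) is hypothetical; it only illustrates a default hook that consults the old *_TYPE_SIZE values when a target does not supply its own implementation, and it is not GCC code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for machine modes.  MODE_IF and MODE_TF are both
   128 bits wide, which is why a bare size cannot tell them apart.  */
enum fp_mode { MODE_VOID, MODE_SF, MODE_DF, MODE_IF, MODE_TF };

/* Hypothetical stand-ins for TI_FLOAT_TYPE, TI_DOUBLE_TYPE and
   TI_LONG_DOUBLE_TYPE.  */
enum abi_type { ABI_FLOAT, ABI_DOUBLE, ABI_LONG_DOUBLE, ABI_TYPE_COUNT };

/* Plays the role of the legacy *_TYPE_SIZE macros (sizes in bits).  */
static const unsigned legacy_type_size[ABI_TYPE_COUNT] = { 32, 64, 128 };

typedef enum fp_mode (*mode_hook) (enum abi_type);

/* Default hook: derive the mode from the legacy size only.  */
static enum fp_mode
default_mode_for_floating_type (enum abi_type t)
{
  assert ((unsigned) t < (unsigned) ABI_TYPE_COUNT);
  switch (legacy_type_size[t])
    {
    case 32:  return MODE_SF;
    case 64:  return MODE_DF;
    case 128: return MODE_TF;
    default:  return MODE_VOID;
    }
}

/* Hypothetical target hook: names the long double mode directly and
   defers to the default for everything else.  */
static enum fp_mode
example_target_hook (enum abi_type t)
{
  if (t == ABI_LONG_DOUBLE)
    return MODE_IF;
  return default_mode_for_floating_type (t);
}

/* Dispatch: use the target hook when present, else the default.  */
static enum fp_mode
mode_for_floating_type (mode_hook target_hook, enum abi_type t)
{
  return target_hook ? target_hook (t) : default_mode_for_floating_type (t);
}
```

With no target hook the 128-bit size maps to a single fixed mode; a target that has several 128-bit formats has to implement the hook, which is the transition away from *_TYPE_SIZE mentioned above.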


Re: [PATCH] Fortran: improve attribute conflict checking [PR93635]

2024-05-13 Thread Mikael Morin

On 10/05/2024 at 21:56, Harald Anlauf wrote:

On 10.05.24 at 21:48, Harald Anlauf wrote:

Hi Mikael,

On 10.05.24 at 11:45, Mikael Morin wrote:

On 09/05/2024 at 22:30, Harald Anlauf wrote:

I'll stop here...


Thanks. Go figure, I have no problem reproducing today.
It's PR99798 (and there is even a patch for it).


this patch has bit-rotted a little: the type of gfc_release_symbol
has changed to bool; this can be fixed.

Unfortunately, applying the patch does not remove the ICEs here...


Oops, I take that back!  There was an error on my side applying the
patch; and now it does fix the ICEs after correcting that hiccup.

Now the PR99798 patch is ready to be pushed, but I won't be available 
for a few days.  We can finish our discussion on this topic afterwards.





We currently do not recover well from errors, and the prevention
of corrupted namespaces is apparently a goal we aim at.


Yes, and we are not there yet. But at least there is a sensible error
message before the crash.


True.  But having a sensible error before ICEing does not improve
user experience either.

Are you planning to work again on PR99798?

Cheers,
Harald


Cheers,
Harald


The patch therefore does not require any testsuite update and
should not give any other surprises, so it should be very safe.
The plan is also to leave the PR open for the time being.

Regtesting on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald


Re: [PATCH] Don't reduce estimated unrolled size for innermost loop.

2024-05-13 Thread Richard Biener
On Mon, May 13, 2024 at 4:29 AM liuhongt wrote:
>
> As the testcase in the PR shows, cunrolli at -O3 may prevent vectorization of the
> innermost loop and increase register pressure.
> The patch removes the 1/3 reduction of unr_insns for the innermost loop for UL_ALL.
> The ul != UL_ALL check is needed since complete unrolling of some small loops at -O2
> relies on the reduction.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> No big impact for SPEC2017.
> Ok for trunk?

This removes the 1/3 reduction when unrolling a loop nest (the case I was
concerned about).  Unrolling of a nest is done by iterating in
tree_unroll_loops_completely, so the to-be-unrolled loop appears innermost.
So I think you need a new parameter on tree_unroll_loops_completely_1
indicating whether we're in the first iteration (or whether to assume
innermost loops will "simplify").
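
To make the discussion concrete, here is a simplified model of the size estimate with the reduction made conditional. It is only a sketch based on the description in this thread: the struct, its field names and the assume_simplification flag are assumptions for illustration, not the code in tree-ssa-loop-ivcanon.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative per-loop size summary (all counts in insns).  */
struct loop_size_model
{
  long overall;                              /* one iteration */
  long eliminated_by_peeling;                /* expected to fold away per copy */
  long last_iteration;                       /* the final, partial iteration */
  long last_iteration_eliminated_by_peeling;
};

/* Estimate the size after complete unrolling.  When ASSUME_SIMPLIFICATION
   is true, take off one third on the expectation that later passes will
   simplify the unrolled body; the caller decides when that is realistic.  */
static long
estimate_unrolled_size (const struct loop_size_model *size,
                        unsigned long nunroll, bool assume_simplification)
{
  assert (size != NULL);
  long unr_insns = (long) nunroll
                   * (size->overall - size->eliminated_by_peeling);
  unr_insns += size->last_iteration
               - size->last_iteration_eliminated_by_peeling;
  if (assume_simplification)
    unr_insns = unr_insns * 2 / 3;
  if (unr_insns <= 0)
    unr_insns = 1;
  return unr_insns;
}
```

In this model the open question from the review is simply who computes the flag: the loop's position in the nest alone is not enough if a nest is unrolled by repeated iteration, because each loop looks innermost once its children are gone.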

Few comments below

> gcc/ChangeLog:
>
> PR tree-optimization/112325
> * tree-ssa-loop-ivcanon.cc (estimated_unrolled_size): Add 2
> new parameters: loop and ul, and remove unr_insns reduction
> for innermost loop.
> (try_unroll_loop_completely): Pass loop and ul to
> estimated_unrolled_size.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr112325.c: New test.
> * gcc.dg/vect/pr69783.c: Add extra option --param
> max-completely-peeled-insns=300.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr112325.c | 57 
>  gcc/testsuite/gcc.dg/vect/pr69783.c  |  2 +-
>  gcc/tree-ssa-loop-ivcanon.cc | 16 +--
>  3 files changed, 71 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr112325.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr112325.c b/gcc/testsuite/gcc.dg/tree-ssa/pr112325.c
> new file mode 100644
> index 000..14208b3e7f8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr112325.c
> @@ -0,0 +1,57 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cunrolli-details" } */
> +
> +typedef unsigned short ggml_fp16_t;
> +static float table_f32_f16[1 << 16];
> +
> +inline static float ggml_lookup_fp16_to_fp32(ggml_fp16_t f) {
> +unsigned short s;
> +__builtin_memcpy(&s, &f, sizeof(unsigned short));
> +return table_f32_f16[s];
> +}
> +
> +typedef struct {
> +ggml_fp16_t d;
> +ggml_fp16_t m;
> +unsigned char qh[4];
> +unsigned char qs[32 / 2];
> +} block_q5_1;
> +
> +typedef struct {
> +float d;
> +float s;
> +char qs[32];
> +} block_q8_1;
> +
> +void ggml_vec_dot_q5_1_q8_1(const int n, float * restrict s, const void * restrict vx, const void * restrict vy) {
> +const int qk = 32;
> +const int nb = n / qk;
> +
> +const block_q5_1 * restrict x = vx;
> +const block_q8_1 * restrict y = vy;
> +
> +float sumf = 0.0;
> +
> +for (int i = 0; i < nb; i++) {
> +unsigned qh;
> +__builtin_memcpy(&qh, x[i].qh, sizeof(qh));
> +
> +int sumi = 0;
> +
> +for (int j = 0; j < qk/2; ++j) {
> +const unsigned char xh_0 = ((qh >> (j + 0)) << 4) & 0x10;
> +const unsigned char xh_1 = ((qh >> (j + 12)) ) & 0x10;
> +
> +const int x0 = (x[i].qs[j] & 0xF) | xh_0;
> +const int x1 = (x[i].qs[j] >> 4) | xh_1;
> +
> +sumi += (x0 * y[i].qs[j]) + (x1 * y[i].qs[j + qk/2]);
> +}
> +
> +sumf += (ggml_lookup_fp16_to_fp32(x[i].d)*y[i].d)*sumi + ggml_lookup_fp16_to_fp32(x[i].m)*y[i].s;
> +}
> +
> +*s = sumf;
> +}
> +
> +/* { dg-final { scan-tree-dump {(?n)Not unrolling loop [1-9] \(--param max-completely-peel-times limit reached} "cunrolli"} } */
> diff --git a/gcc/testsuite/gcc.dg/vect/pr69783.c b/gcc/testsuite/gcc.dg/vect/pr69783.c
> index 5df95d0ce4e..a1f75514d72 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr69783.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr69783.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-require-effective-target vect_float } */
> -/* { dg-additional-options "-Ofast -funroll-loops" } */
> +/* { dg-additional-options "-Ofast -funroll-loops --param max-completely-peeled-insns=300" } */

If we rely on unrolling of a loop, can you put #pragma unroll [N]
before the respective loop instead?
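
For reference, a small standalone illustration of requesting unrolling per loop rather than through a --param. GCC spells the directive `#pragma GCC unroll n` and it must appear immediately before the loop; the function name, array length and unroll factor here are made up for the example and say nothing about which loop in pr69783.c would need it.

```c
#include <assert.h>
#include <stddef.h>

#define NELEMS 8

/* Sum a fixed-size array; the pragma asks GCC to unroll this loop
   8 times regardless of the global unrolling heuristics.  */
static float
sum_fixed (const float *a)
{
  assert (a != NULL);
  float sum = 0.0f;
#pragma GCC unroll 8
  for (size_t i = 0; i < NELEMS; i++)
    sum += a[i];
  return sum;
}
```

The advantage for a testcase is that the request travels with the loop it is meant for, so the test no longer depends on the exact value of a size limit.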

>  #define NXX 516
>  #define NYY 516
> diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc
> index bf017137260..5e0eca647a1 100644
> --- a/gcc/tree-ssa-loop-ivcanon.cc
> +++ b/gcc/tree-ssa-loop-ivcanon.cc
> @@ -444,7 +444,9 @@ tree_estimate_loop_size (class loop *loop, edge exit, edge edge_to_cancel,
>
>  static unsigned HOST_WIDE_INT
>  estimated_unrolled_size (struct loop_size *size,
> -unsigned HOST_WIDE_INT nunroll)
> +unsigned HOST_WIDE_INT nunroll,
> +enum unroll_level ul,
> +class loop* loop)
>  {
>HOST_WIDE_INT unr_insns = ((nunroll)
>  * (HOST_WIDE_INT) (size->overall
> @@ -453,7 +455,15 @@ estimated_u

Re: [PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-13 Thread Richard Biener
On Mon, May 13, 2024 at 8:18 AM Robin Dapp wrote:
>
> > How does this make a difference in the end?  I'd expect say forwprop to
> > fix things?
>
> In general we try to only add the masking "boilerplate" of our
> instructions at split time so fwprop, combine et al. can do their
> work uninhibited of it (and we don't need numerous
> (if_then_else ... (if_then_else) ...) combinations in our patterns).
> A vec constant we expand directly to a masked representation, though
> which makes further simplification difficult.  I can experiment with
> changing that if preferred.
>
> My thinking was, however, that for other operations like binops we
> directly emit the right variant via expand_operands without
> forcing to a reg and don't even need to fwprop so I wanted to
> imitate that.

Ah, so yeah, it probably makes sense for constants.  Btw,
there's prepare_operand which I think might be better for
its CONST_INT handling?  I can also see we usually do not
bother with force_reg, the force_reg was added with the
initial r6-4696-ga414c77f2a30bb already.

What happens if we simply remove all of the force_reg here?

Thanks,
Richard.

> Regards
>  Robin
>


[PATCH] tree-ssa-math-opts: Pattern recognize yet another .ADD_OVERFLOW pattern [PR113982]

2024-05-13 Thread Jakub Jelinek
Hi!

We already pattern recognize many different patterns, and the closest to the
requested one is
   yc = (type) y;
   zc = (type) z;
   x = yc + zc;
   w = (typeof_y) x;
   if (x > max)
where y/z has the same unsigned type and type is a wider unsigned type
and max is maximum value of the narrower unsigned type.
But apparently people are creative in writing this in different ways;
this PR requests
   yc = (type) y;
   zc = (type) z;
   x = yc + zc;
   w = (typeof_y) x;
   if (x >> narrower_type_bits)

The following patch implements that.
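
At the source level the two spellings, and the builtin they are meant to be equivalent to, look roughly like this for a 32-bit narrow type and a 64-bit wide type. The function names are invented for the illustration; `__builtin_add_overflow` is the existing GCC/Clang builtin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Already recognized form: compare the wide sum against the narrow maximum.  */
static bool
add_overflow_cmp (uint32_t y, uint32_t z, uint32_t *w)
{
  assert (w != NULL);
  uint64_t x = (uint64_t) y + (uint64_t) z;
  *w = (uint32_t) x;
  return x > UINT32_MAX;
}

/* Newly requested form: shift the wide sum right by the narrow precision.  */
static bool
add_overflow_shift (uint32_t y, uint32_t z, uint32_t *w)
{
  assert (w != NULL);
  uint64_t x = (uint64_t) y + (uint64_t) z;
  *w = (uint32_t) x;
  return (x >> 32) != 0;
}

/* What both forms compute, written with the builtin directly.  */
static bool
add_overflow_builtin (uint32_t y, uint32_t z, uint32_t *w)
{
  assert (w != NULL);
  return __builtin_add_overflow (y, z, w);
}
```

Since the sum of two 32-bit values fits in 33 bits, the shifted value is either 0 or 1, so testing it against zero and comparing the sum against the maximum give the same answer.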

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-05-13  Jakub Jelinek  

PR middle-end/113982
* tree-ssa-math-opts.cc (arith_overflow_check_p): Also return 1
for RSHIFT_EXPR by precision of maxval if shift result is only
used in a cast or comparison against zero.
(match_arith_overflow): Handle the RSHIFT_EXPR use case.

* gcc.dg/pr113982.c: New test.

--- gcc/tree-ssa-math-opts.cc.jj   2024-04-11 09:26:36.318369218 +0200
+++ gcc/tree-ssa-math-opts.cc   2024-05-10 18:17:08.795744811 +0200
@@ -3947,6 +3947,66 @@ arith_overflow_check_p (gimple *stmt, gi
   else
 return 0;
 
+  if (maxval
+  && ccode == RSHIFT_EXPR
+  && crhs1 == lhs
+  && TREE_CODE (crhs2) == INTEGER_CST
+  && wi::to_widest (crhs2) == TYPE_PRECISION (TREE_TYPE (maxval)))
+{
+  tree shiftlhs = gimple_assign_lhs (use_stmt);
+  if (!shiftlhs)
+   return 0;
+  use_operand_p use;
+  if (!single_imm_use (shiftlhs, &use, &cur_use_stmt))
+   return 0;
+  if (gimple_code (cur_use_stmt) == GIMPLE_COND)
+   {
+ ccode = gimple_cond_code (cur_use_stmt);
+ crhs1 = gimple_cond_lhs (cur_use_stmt);
+ crhs2 = gimple_cond_rhs (cur_use_stmt);
+   }
+  else if (is_gimple_assign (cur_use_stmt))
+   {
+ if (gimple_assign_rhs_class (cur_use_stmt) == GIMPLE_BINARY_RHS)
+   {
+ ccode = gimple_assign_rhs_code (cur_use_stmt);
+ crhs1 = gimple_assign_rhs1 (cur_use_stmt);
+ crhs2 = gimple_assign_rhs2 (cur_use_stmt);
+   }
+ else if (gimple_assign_rhs_code (cur_use_stmt) == COND_EXPR)
+   {
+ tree cond = gimple_assign_rhs1 (cur_use_stmt);
+ if (COMPARISON_CLASS_P (cond))
+   {
+ ccode = TREE_CODE (cond);
+ crhs1 = TREE_OPERAND (cond, 0);
+ crhs2 = TREE_OPERAND (cond, 1);
+   }
+ else
+   return 0;
+   }
+ else
+   {
+ enum tree_code sc = gimple_assign_rhs_code (cur_use_stmt);
+ tree castlhs = gimple_assign_lhs (cur_use_stmt);
+ if (!CONVERT_EXPR_CODE_P (sc)
+ || !castlhs
+ || !INTEGRAL_TYPE_P (TREE_TYPE (castlhs))
+ || (TYPE_PRECISION (TREE_TYPE (castlhs))
+ > TYPE_PRECISION (TREE_TYPE (maxval))))
+   return 0;
+ return 1;
+   }
+   }
+  else
+   return 0;
+  if ((ccode != EQ_EXPR && ccode != NE_EXPR)
+ || crhs1 != shiftlhs
+ || !integer_zerop (crhs2))
+   return 0;
+  return 1;
+}
+
   if (TREE_CODE_CLASS (ccode) != tcc_comparison)
 return 0;
 
@@ -4049,6 +4109,7 @@ arith_overflow_check_p (gimple *stmt, gi
_8 = IMAGPART_EXPR <_7>;
if (_8)
and replace (utype) x with _9.
+   Or with x >> popcount (max) instead of x > max.
 
Also recognize:
x = ~z;
@@ -4481,10 +4542,62 @@ match_arith_overflow (gimple_stmt_iterat
  gcc_checking_assert (is_gimple_assign (use_stmt));
  if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
{
- gimple_assign_set_rhs1 (use_stmt, ovf);
- gimple_assign_set_rhs2 (use_stmt, build_int_cst (type, 0));
- gimple_assign_set_rhs_code (use_stmt,
- ovf_use == 1 ? NE_EXPR : EQ_EXPR);
+ if (gimple_assign_rhs_code (use_stmt) == RSHIFT_EXPR)
+   {
+ g2 = gimple_build_assign (make_ssa_name (boolean_type_node),
+   ovf_use == 1 ? NE_EXPR : EQ_EXPR,
+   ovf, build_int_cst (type, 0));
+ gimple_stmt_iterator gsiu = gsi_for_stmt (use_stmt);
+ gsi_insert_before (&gsiu, g2, GSI_SAME_STMT);
+ gimple_assign_set_rhs_with_ops (&gsiu, NOP_EXPR,
+ gimple_assign_lhs (g2));
+ update_stmt (use_stmt);
+ use_operand_p use;
+ single_imm_use (gimple_assign_lhs (use_stmt), &use,
+ &use_stmt);
+ if (gimple_code (use_stmt) == GIMPLE_COND)
+   {
+ gcond *cond_stmt = as_a <gcond *> (use_stmt);
+ gimple_cond_set_lhs 

Re: Fix gnu versioned namespace mode 00/03

2024-05-13 Thread Jonathan Wakely
On Mon, 13 May 2024, 07:30 Iain Sandoe wrote:

>
>
> > On 13 May 2024, at 06:06, François Dumont  wrote:
> >
> >
> > On 07/05/2024 18:15, Iain Sandoe wrote:
> >> Hi François
> >>
> >>> On 4 May 2024, at 22:11, François Dumont  wrote:
> >>>
> >>> Here is the list of patches to restore gnu versioned namespace mode.
> >>>
> >>> 1/3: Bump gnu version namespace
> >>>
> >>> This is important to be done first so that once build of gnu versioned
> >>> namespace is fixed there is no chance to have another build of '__8'
> >>> version with a different abi than last successful '__8' build.
>


The versioned namespace build is not expected to be ABI compatible though,
so nobody should be expecting compatibility with previous builds.
Especially not on the gcc-15 trunk, a week or two after entering stage 1!


> >>>
> >>> 2/3: Fix build using cxx11 abi for versioned namespace
> >>>
> >>> 3/3: Proposal to default to "new" abi when dual abi is disabled and
> >>> accept any default-libstdcxx-abi whether dual abi is enabled or not.
> >>>
> >>> All testsuite run for following configs:
> >>>
> >>> - dual abi
> >>>
> >>> - gcc4-compatible only abi
> >>>
> >>> - new only abi
> >>>
> >>> - versioned namespace abi
> >> At the risk of delaying this (a bit) - I think we should also consider
> >> items like once_call that have broken impls.
> > Do you have any pointer to this once_call problem, sorry I'm not aware
> > about it (apart from your messages).
>
> (although this mentions one specific target, it applies more widely).
>

I've removed the "on ppc64le" part from the summary.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146
>
> Also, AFAICT, any nested once_call is a problem (not just exceptions).
>

Could you update the bug with that info please?


> >>  in the current library - and at least get proposed replacements
> >> available behind the versioned namespace; rather than using up a namespace
> >> version with the current broken code.
> >
> > I'm not proposing to fix all library bugs on all platforms with this
> > patch, just fix the versioned namespace mode.
>
> Sorry, I was not intending to suggest that (although perhaps my comments
> read that way).
>
> I was trying to suggest that, in the case where we have proposed fixes
> that are blocked because they are ABI breaks, those could be put
> behind the versioned namespace (it was not an intention to suggest that such
> additions should be part of this patch series).
>
> > As to do so I also need to adopt cxx11 abi in versioned mode it already
> > justify a bump of version.
>
> I see - it's just a bit strange that we are bumping a version for a mode
> that does not currently work;  however, I guess someone might have deployed
> it even so.
>

It does work though, doesn't it?
It's known to fail on powerpc64 due to conflicts with the ieee128 stuff,
but it should work elsewhere.
It doesn't work with --with-default-libstdcxx-abi=cxx11 but that's just a
"this doesn't work and isn't supported" limitation.

The point of the patch series is to change it so the versioned namespace
always uses the cxx11 ABI, which does seem worth bumping the version (even
though the versioned namespace is explicitly not a stable ABI and not
backwards compatible).


>
> > The reason I'm proposing to integrate this patch this early in gcc 15
> > stage is to have time to integrate any other library fix/optimization that
> > could make use of it. I already have 1 on my side for the hashtable
> > implementation
>
> Ah, then I think we are aiming for the same thing.
>
> > . I hope your once_call fix also have time to be ready for gcc 15, no ?
>
> Yes; if we put it behind the versioned namespace - there are (I think)
> several proposed solutions to that specific issue.
>
> thanks
> Iain
>
> >
> > François
>
>


[COMMITTED] ada: Refactor GNAT.Directory_Operations.Read to minimise runtime checks

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Array assignments are likely more efficient than element-by-element
copying; in particular, they avoid constraints checks in every iteration
of a loop (when the runtime is compiled with checks enabled).

A cleanup and improvement opportunity spotted while working on improved
detection of uninitialised local scalar objects.

gcc/ada/

* libgnat/g-dirope.adb (Read): Use null-excluding,
access-to-constant type; replace element-by-element copy with
array assignments.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/g-dirope.adb | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/libgnat/g-dirope.adb b/gcc/ada/libgnat/g-dirope.adb
index c23aa68b700..428d27d9e8d 100644
--- a/gcc/ada/libgnat/g-dirope.adb
+++ b/gcc/ada/libgnat/g-dirope.adb
@@ -676,13 +676,9 @@ package body GNAT.Directory_Operations is
  return;
   end if;
 
-  Last :=
-(if Str'Length > Filename_Len then Str'First + Filename_Len - 1
- else Str'Last);
-
   declare
  subtype Path_String is String (1 .. Filename_Len);
- type Path_String_Access is access Path_String;
+ type Path_String_Access is not null access constant Path_String;
 
  function Address_To_Access is new
Ada.Unchecked_Conversion
@@ -693,9 +689,13 @@ package body GNAT.Directory_Operations is
  Address_To_Access (Filename_Addr);
 
   begin
- for J in Str'First .. Last loop
-Str (J) := Path_Access (J - Str'First + 1);
- end loop;
+ if Str'Length > Filename_Len then
+Last := Str'First + Filename_Len - 1;
+Str (Str'First .. Last) := Path_Access.all;
+ else
+Last := Str'Last;
+Str := Path_Access (1 .. Str'Length);
+ end if;
   end;
end Read;
 
-- 
2.43.2
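
For readers who do not write Ada, the shape of the change in GNAT.Directory_Operations.Read can be approximated in C as replacing a per-element loop with one bulk copy of the shorter of the two lengths. This is only a structural analogy with made-up names: C has no runtime index checks, so the saving described in the commit message (no constraint check per iteration) has no direct C counterpart.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Element-by-element copy, analogous to the loop that was removed.
   Returns the number of characters stored.  */
static size_t
read_name_loop (char *str, size_t str_len,
                const char *name, size_t name_len)
{
  assert (str != NULL && name != NULL);
  size_t n = str_len > name_len ? name_len : str_len;
  for (size_t i = 0; i < n; i++)
    str[i] = name[i];
  return n;
}

/* Bulk copy, analogous to the array (slice) assignments that replaced it.
   Returns the number of characters stored.  */
static size_t
read_name_bulk (char *str, size_t str_len,
                const char *name, size_t name_len)
{
  assert (str != NULL && name != NULL);
  size_t n = str_len > name_len ? name_len : str_len;
  memcpy (str, name, n);
  return n;
}
```

Both variants truncate when the destination is shorter than the file name and leave the tail of the destination untouched when it is longer, which mirrors the two branches of the new Ada code.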



[COMMITTED] ada: Compiler crash on nonstatic container aggregates for Doubly_Linked_Lists

2024-05-13 Thread Marc Poulhiès
From: Gary Dismukes 

The compiler was crashing on container aggregates for the List type
coming from an instantiation of Ada.Containers.Doubly_Linked_Lists
when the aggregate has more than one iterated_element_association
with nonstatic range bounds. As part of addressing this, it was
noticed that there were also somewhat related problems with container
aggregates associated with the Ada.Containers.Bounded_Doubly_Linked_Lists
generic (and likely others like it) and mishandling of certain cases of
indexed aggregates, and those are also addressed by this set of changes.
In the case of container aggregates with at least one association with
a nonstatic range, the total length of the aggregate is determined by
expansion actions of Aggregate_Size.
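
The new Aggregate_Size logic described here can be pictured as a two-pass scan over the associations. The following C model is an illustration only: the struct, its fields and the function name are invented, and the real implementation works on GNAT tree nodes in exp_aggr.adb.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of one component association.  */
struct association
{
  int is_iterated;   /* nonzero for the iterated forms */
  long static_len;   /* number of elements, or -1 if not statically known */
};

/* Return the total number of elements if every association has a
   static length, or -1 as soon as any iterated association does not.  */
static long
aggregate_size (const struct association *assocs, size_t count)
{
  assert (assocs != NULL || count == 0);

  /* First pass: any nonstatic iterated association makes the whole
     aggregate nonstatic, however many associations there are.  */
  for (size_t i = 0; i < count; i++)
    if (assocs[i].is_iterated && assocs[i].static_len == -1)
      return -1;

  /* Second pass: everything is static, so add up the lengths.  */
  long total = 0;
  for (size_t i = 0; i < count; i++)
    {
      assert (assocs[i].static_len >= 0);
      total += assocs[i].static_len;
    }
  return total;
}
```

The difference from the previous behaviour, as described above, is that the nonstatic case is no longer limited to aggregates with exactly one association.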

gcc/ada/

* exp_aggr.adb (Expand_Container_Aggregate): Move determination of
whether the aggregate is an indexed aggregate earlier in the
procedure. Test Is_Indexed_Aggregate as a criterion for generating
a call to the container type's New_Indexed function, add proper
computation of bounds to pass in to the function, and remove later
code for generating such a call. Add and improve comments.
(Aggregate_Size): Remove special treatment of case where there is
exactly one component association, and instead loop over all
component associations to determine whether any of them have a
nonstatic length. If there is at least one such nonstatic
association, return -1.
(Build_Siz_Exp): Accumulate a sum of the sizes of each of the
component associations in Siz_Exp (which will only be used if
there are any associations that are of Nkind
N_Iterated_Component_Association with a nonstatic range).
(Expand_Range_Component): Fix typos in the procedure's spec
comment and block comment.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_aggr.adb | 247 ++-
 1 file changed, 149 insertions(+), 98 deletions(-)

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index 950f310b58c..c82bd07aedc 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -6662,6 +6662,8 @@ package body Exp_Aggr is
 end if;
  end Add_Range_Size;
 
+  --  Start of processing for Aggregate_Size
+
   begin
  --  Aggregate is either all positional or all named
 
@@ -6669,23 +6671,39 @@ package body Exp_Aggr is
 
  if Present (Component_Associations (N)) then
 Comp := First (Component_Associations (N));
- --  If there is a single component association it can be
- --  an iterated component with dynamic bounds or an element
- --  iterator over an iterable object. If it is an array
- --  we can use the attribute Length to get its size;
- --  for a predefined container the function Length plays
- --  the same role. There is no available mechanism for
- --  user-defined containers. For now we treat all of these
- --  as dynamic.
-
-if List_Length (Component_Associations (N)) = 1
-  and then Nkind (Comp) in N_Iterated_Component_Association |
-   N_Iterated_Element_Association
-then
-   return Build_Siz_Exp (Comp);
-end if;
 
- --  Otherwise all associations must specify static sizes.
+--  If one or more of the associations is one of the iterated
+--  forms, and is either an association with nonstatic bounds
+--  or is an iterator over an iterable object, then treat the
+--  whole container aggregate as having a nonstatic number of
+--  elements.
+
+declare
+   Has_Nonstatic_Length : Boolean := False;
+
+begin
+   while Present (Comp) loop
+  if Nkind (Comp) in N_Iterated_Component_Association |
+ N_Iterated_Element_Association
+and then Build_Siz_Exp (Comp) = -1
+  then
+ Has_Nonstatic_Length := True;
+  end if;
+
+  Next (Comp);
+   end loop;
+
+   if Has_Nonstatic_Length then
+  return -1;
+   end if;
+end;
+
+--  Otherwise, the aggregate must have associations where all
+--  choices and bounds are statically known, and we compute
+--  the number of elements statically by adding up the number
+--  of elements in each association.
+
+Comp := First (Component_Associations (N));
 
 while Present (Comp) loop
Choice := First (Choice_List (Comp));
@@ -6731,7 +6749,9 @@ package body Exp_Aggr is
   ---
 
   function Build_Siz_Exp (Comp : Node_Id) return Int is
-   

[COMMITTED] ada: Small cleanup in the BIP machinery

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

This avoids creating Null nodes when they are not used in the end and makes
the implementation of Add_Finalization_Master_Actual_To_Build_In_Place_Call
more consistent with that of its sibling routines.  No functional changes.

gcc/ada/

* exp_ch6.adb (Add_Unconstrained_Actuals_To_Build_In_Place_Call):
Rename Pool_Actual into Pool_Exp and use Empty as default value.
(Add_Finalization_Master_Actual_To_Build_In_Place_Call): Change the
names of the first two parameters and use a simpler code structure.
(Make_Build_In_Place_Call_In_Allocator): Rename the local variable
for the pool actual and set it to Empty if it is not used.
(Make_Build_In_Place_Call_In_Object_Declaration): Rename the local
variable for the master actual.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch6.adb | 192 ++--
 1 file changed, 98 insertions(+), 94 deletions(-)

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index 9e1844aa08e..0ab6c0080bf 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -157,22 +157,22 @@ package body Exp_Ch6 is
   Function_Id: Entity_Id;
   Alloc_Form : BIP_Allocation_Form := Unspecified;
   Alloc_Form_Exp : Node_Id := Empty;
-  Pool_Actual: Node_Id := Make_Null (No_Location));
+  Pool_Exp   : Node_Id := Empty);
--  Ada 2005 (AI-318-02): If the result type of a build-in-place call needs
--  them, add the actuals parameters BIP_Alloc_Form and BIP_Storage_Pool.
--  If Alloc_Form_Exp is present, then pass it for the first parameter,
--  otherwise pass a literal corresponding to the Alloc_Form parameter
-   --  (which must not be Unspecified in that case). Pool_Actual is the
-   --  parameter to pass to BIP_Storage_Pool.
+   --  (which must not be Unspecified in that case). If Pool_Exp is present,
+   --  then use it for BIP_Storage_Pool, otherwise pass "null".
 
procedure Add_Finalization_Master_Actual_To_Build_In_Place_Call
- (Func_Call  : Node_Id;
-  Func_Id: Entity_Id;
-  Ptr_Typ: Entity_Id := Empty;
-  Master_Exp : Node_Id   := Empty);
+ (Function_Call : Node_Id;
+  Function_Id   : Entity_Id;
+  Ptr_Typ   : Entity_Id := Empty;
+  Master_Exp: Node_Id   := Empty);
--  Ada 2005 (AI-318-02): If the result type of a build-in-place call needs
--  finalization actions, add an actual parameter which is a pointer to the
-   --  finalization master of the caller. If Master_Exp is not Empty, then that
+   --  finalization master of the caller. If Master_Exp is present, then that
--  will be passed as the actual. Otherwise, if Ptr_Typ is left Empty, this
--  will result in an automatic "null" value for the actual.
 
@@ -424,13 +424,12 @@ package body Exp_Ch6 is
   Function_Id: Entity_Id;
   Alloc_Form : BIP_Allocation_Form := Unspecified;
   Alloc_Form_Exp : Node_Id := Empty;
-  Pool_Actual: Node_Id := Make_Null (No_Location))
+  Pool_Exp   : Node_Id := Empty)
is
   Loc : constant Source_Ptr := Sloc (Function_Call);
 
   Alloc_Form_Actual : Node_Id;
   Alloc_Form_Formal : Node_Id;
-  Pool_Formal   : Node_Id;
 
begin
   --  Nothing to do when the size of the object is known, and the caller is
@@ -472,10 +471,16 @@ package body Exp_Ch6 is
   --  those targets do not support pools.
 
   if RTE_Available (RE_Root_Storage_Pool_Ptr) then
- Pool_Formal := Build_In_Place_Formal (Function_Id, BIP_Storage_Pool);
- Analyze_And_Resolve (Pool_Actual, Etype (Pool_Formal));
- Add_Extra_Actual_To_Call
-   (Function_Call, Pool_Formal, Pool_Actual);
+ declare
+Pool_Actual : constant Node_Id :=
+  (if Present (Pool_Exp) then Pool_Exp else Make_Null (Loc));
+Pool_Formal : constant Node_Id :=
+  Build_In_Place_Formal (Function_Id, BIP_Storage_Pool);
+
+ begin
+Analyze_And_Resolve (Pool_Actual, Etype (Pool_Formal));
+Add_Extra_Actual_To_Call (Function_Call, Pool_Formal, Pool_Actual);
+ end;
   end if;
end Add_Unconstrained_Actuals_To_Build_In_Place_Call;
 
@@ -484,92 +489,88 @@ package body Exp_Ch6 is
---
 
procedure Add_Finalization_Master_Actual_To_Build_In_Place_Call
- (Func_Call  : Node_Id;
-  Func_Id: Entity_Id;
-  Ptr_Typ: Entity_Id := Empty;
-  Master_Exp : Node_Id   := Empty)
+ (Function_Call : Node_Id;
+  Function_Id   : Entity_Id;
+  Ptr_Typ   : Entity_Id := Empty;
+  Master_Exp: Node_Id   := Empty)
is
+  Loc : constant Source_Ptr := Sloc (Function_Call);
+
+  Actual: Node_Id;
+  Formal: Node_Id;
+  Desig_Typ : Entity_Id;
+
begin
- 

[COMMITTED] ada: Fix internal error with Put_Image aspect on access-to-class-wide type

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

This occurs with an instantiation of Ada.Containers.Vectors in a nested
package on an access-to-class-wide type declared with the Put_Image aspect
because of too late a freezing for the internal renaming generated for the
Put_Image procedure.

The change freezes this renaming immediately in this particular case; this
is similar to a trick used in Build_Array_Put_Image_Procedure.

gcc/ada/

* sem_ch13.adb (New_Put_Image_Subprogram): In the nondeferred case
coming from an aspect and for a type with delayed freezing, also
freeze the subprogram immediately.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch13.adb | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
index 1f3f8277294..f3212f25dcc 100644
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -15112,6 +15112,11 @@ package body Sem_Ch13 is
   then
  Append_Freeze_Action (Ent, Subp_Decl);
 
+ --  We may freeze Subp_Id immediately since Ent has just been frozen.
+ --  This will help to shield us from potential late freezing issues.
+
+ Set_Is_Frozen (Subp_Id);
+
   else
  Insert_Action (N, Subp_Decl);
  Set_Entity (N, Subp_Id);
-- 
2.43.2



[COMMITTED] ada: Enable casing on composite via -X0 instead of -X

2024-05-13 Thread Marc Poulhiès
From: Steve Baird 

Move case statement pattern matching out of the curated language extension
set and into the extended set.

gcc/ada/

* sem_case.adb: Replace all tests of Core_Extensions_Allowed with
corresponding tests of All_Extensions_Allowed.
* sem_ch5.adb: Likewise.
* doc/gnat_rm/gnat_language_extensions.rst: update documentation.
* gnat_rm.texi: Regenerate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 .../doc/gnat_rm/gnat_language_extensions.rst  | 236 +++---
 gcc/ada/gnat_rm.texi  | 292 +-
 gcc/ada/sem_case.adb  |   4 +-
 gcc/ada/sem_ch5.adb   |   6 +-
 4 files changed, 269 insertions(+), 269 deletions(-)

diff --git a/gcc/ada/doc/gnat_rm/gnat_language_extensions.rst 
b/gcc/ada/doc/gnat_rm/gnat_language_extensions.rst
index 42d64133989..c703e1c7e3f 100644
--- a/gcc/ada/doc/gnat_rm/gnat_language_extensions.rst
+++ b/gcc/ada/doc/gnat_rm/gnat_language_extensions.rst
@@ -137,124 +137,6 @@ An exception message can also be added:
 Link to the original RFC:
 
https://github.com/AdaCore/ada-spark-rfcs/blob/master/prototyped/rfc-conditional-when-constructs.rst
 
-Case pattern matching
--
-
-The selector for a case statement (but not yet for a case expression) may be of a composite type, subject to
-some restrictions (described below). Aggregate syntax is used for choices
-of such a case statement; however, in cases where a "normal" aggregate would
-require a discrete value, a discrete subtype may be used instead; box
-notation can also be used to match all values.
-
-Consider this example:
-
-.. code-block:: ada
-
-  type Rec is record
- F1, F2 : Integer;
-  end record;
-
-  procedure Caser_1 (X : Rec) is
-  begin
- case X is
-when (F1 => Positive, F2 => Positive) =>
-   Do_This;
-when (F1 => Natural, F2 => <>) | (F1 => <>, F2 => Natural) =>
-   Do_That;
-when others =>
-Do_The_Other_Thing;
- end case;
-  end Caser_1;
-
-If ``Caser_1`` is called and both components of X are positive, then
-``Do_This`` will be called; otherwise, if either component is nonnegative
-then ``Do_That`` will be called; otherwise, ``Do_The_Other_Thing`` will be
-called.
-
-In addition, pattern bindings are supported. This is a mechanism
-for binding a name to a component of a matching value for use within
-an alternative of a case statement. For a component association
-that occurs within a case choice, the expression may be followed by
-``is <identifier>``. In the special case of a "box" component association,
-the identifier may instead be provided within the box. Either of these
-indicates that the given identifier denotes (a constant view of) the matching
-subcomponent of the case selector.
-
-.. attention:: Binding is not yet supported for arrays or subcomponents
-   thereof.
-
-Consider this example (which uses type ``Rec`` from the previous example):
-
-.. code-block:: ada
-
-  procedure Caser_2 (X : Rec) is
-  begin
- case X is
-when (F1 => Positive is Abc, F2 => Positive) =>
-   Do_This (Abc)
-when (F1 => Natural is N1, F2 => <N2>) |
- (F1 => <N2>, F2 => Natural is N1) =>
-   Do_That (Param_1 => N1, Param_2 => N2);
-when others =>
-   Do_The_Other_Thing;
- end case;
-  end Caser_2;
-
-This example is the same as the previous one with respect to determining
-whether ``Do_This``, ``Do_That``, or ``Do_The_Other_Thing`` will be called. But
-for this version, ``Do_This`` takes a parameter and ``Do_That`` takes two
-parameters. If ``Do_This`` is called, the actual parameter in the call will be
-``X.F1``.
-
-If ``Do_That`` is called, the situation is more complex because there are two
-choices for that alternative. If ``Do_That`` is called because the first choice
-matched (i.e., because ``X.F1`` is nonnegative and either ``X.F1`` or ``X.F2``
-is zero or negative), then the actual parameters of the call will be (in order)
-``X.F1`` and ``X.F2``. If ``Do_That`` is called because the second choice
-matched (and the first one did not), then the actual parameters will be
-reversed.
-
-Within the choice list for single alternative, each choice must define the same
-set of bindings and the component subtypes for for a given identifer must all
-statically match. Currently, the case of a binding for a nondiscrete component
-is not implemented.
-
-If the set of values that match the choice(s) of an earlier alternative
-overlaps the corresponding set of a later alternative, then the first set shall
-be a proper subset of the second (and the later alternative will not be
-executed if the earlier alternative "matches"). All possible values of the
-composite type shall be covered. The composite type of the selector shall be an
-array or record type that is neither limited nor class-wide. Currently, a "when
-others =>" case choice is required; it is intended that 

[COMMITTED] ada: Simplify uses of readdir_gnat with object overlay

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Code cleanup; behavior is unaffected.

gcc/ada/

* libgnat/a-direct.adb (Start_Search_Internal): Combine subtype
and object declaration.
* libgnat/g-dirope.adb (Read): Replace convoluted unchecked
conversion with an overlay.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/a-direct.adb |  4 +---
 gcc/ada/libgnat/g-dirope.adb | 18 --
 2 files changed, 5 insertions(+), 17 deletions(-)

diff --git a/gcc/ada/libgnat/a-direct.adb b/gcc/ada/libgnat/a-direct.adb
index 9e399c1003e..32e020c48c3 100644
--- a/gcc/ada/libgnat/a-direct.adb
+++ b/gcc/ada/libgnat/a-direct.adb
@@ -1367,9 +1367,7 @@ package body Ada.Directories is
  --  the Filter add it to our search vector.
 
  declare
-subtype File_Name_String is String (1 .. File_Name_Len);
-
-File_Name : constant File_Name_String
+File_Name : constant String (1 .. File_Name_Len)
   with Import, Address => File_Name_Addr;
 
  begin
diff --git a/gcc/ada/libgnat/g-dirope.adb b/gcc/ada/libgnat/g-dirope.adb
index 428d27d9e8d..d8ac0ec06f8 100644
--- a/gcc/ada/libgnat/g-dirope.adb
+++ b/gcc/ada/libgnat/g-dirope.adb
@@ -34,7 +34,6 @@ with Ada.Characters.Handling;
 with Ada.Strings.Fixed;
 
 with Ada.Unchecked_Deallocation;
-with Ada.Unchecked_Conversion;
 
 with System;  use System;
 with System.CRTL; use System.CRTL;
@@ -677,24 +676,15 @@ package body GNAT.Directory_Operations is
   end if;
 
   declare
- subtype Path_String is String (1 .. Filename_Len);
- type Path_String_Access is not null access constant Path_String;
-
- function Address_To_Access is new
-   Ada.Unchecked_Conversion
- (Source => Address,
-  Target => Path_String_Access);
-
- Path_Access : constant Path_String_Access :=
- Address_To_Access (Filename_Addr);
-
+ Filename : constant String (1 .. Filename_Len)
+   with Import, Address => Filename_Addr;
   begin
  if Str'Length > Filename_Len then
 Last := Str'First + Filename_Len - 1;
-Str (Str'First .. Last) := Path_Access.all;
+Str (Str'First .. Last) := Filename;
  else
 Last := Str'Last;
-Str := Path_Access (1 .. Str'Length);
+Str := Filename (1 .. Str'Length);
  end if;
   end;
end Read;
-- 
2.43.2
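
For readers unfamiliar with the idiom, the overlay used in this patch can be
tried in isolation. The following is only a minimal sketch: the names
Overlay_Demo, Buffer, Addr, Len and Name are made up for illustration and
do not appear in the runtime sources. It declares a constant String view on
top of memory designated by an address, with the same aspects as the patched
code (Import to suppress initialization, Address to place the object).

```ada
with Ada.Text_IO; use Ada.Text_IO;
with System;      use System;

procedure Overlay_Demo is

   --  Stand-in for the C buffer returned by readdir_gnat (hypothetical)
   Buffer : aliased constant String := "readdir";

   Addr : constant Address := Buffer'Address;
   Len  : constant Natural := Buffer'Length;

   --  Overlay: a constant String view of the Len characters at Addr.
   --  Import means no initialization is generated for Name.
   Name : constant String (1 .. Len)
     with Import, Address => Addr;

begin
   Put_Line (Name);
end Overlay_Demo;
```

Compared with instantiating Ada.Unchecked_Conversion from Address to an
access type, the overlay needs neither a named subtype nor an access type,
which is where the shorter declarations in the diff come from.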



[COMMITTED] ada: Rewrite Append_Entity_Name; skip irrelevant names

2024-05-13 Thread Marc Poulhiès
From: Bob Duff 

This patch rewrites Append_Entity_Name, both for maintainability and to
improve user messages. The main issue was that the recursion stopped
when the enclosing scope is the wrapper created in case of
postconditions with 'Old. This caused different results depending
on the enabling/disabling of assertions. Instead of stopping,
we now skip things that the user shouldn't see; there is useful
information in more-outer scope names.

Simplify the code. We had a nested procedure, which called itself
recursively, and also was mutually recursive with the outer procedure.
Avoid testing Is_Internal_Name of the Chars, which seems too fragile.
'R' is used for subprogram instances, but for example "SR" is used for
TSS_Stream_Read, so removing 'R' works only by accident.
Instead, base the test for subprogram instances on normal Einfo
queries.

The new version of Append_Entity_Name produces different (and better)
results in many cases, but this fact is not apparent in most test cases,
because they don't raise unhandled exceptions or do other things that
involve printing the entity name.

The comment:

--  Otherwise nothing to output (happens in unnamed block statements)

is removed; there are many cases other than block statements that
reached that part of the code.

gcc/ada/

* sem_util.ads (Append_Entity_Name): Fix comment to reflect new
semantics. The comment said, "The qualification stops at an
enclosing scope has no source name (block or loop)." There seems
to be no reason for stopping; instead, we should SKIP things with
no source name. And the "loop" part was wrong.
* sem_util.adb (Append_Entity_Name): Do not stop the recursion;
skip to next-outer scope instead. Misc cleanup/simplification.

Tested on x86_64-pc-linux-gnu, committed on master.
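
The "skip instead of stop" behavior described above can be illustrated with
a toy model. This is only a sketch under stated assumptions: Entity,
Entity_Ptr, From_Source and Append_Name are hypothetical stand-ins, not the
compiler's Entity_Id, Comes_From_Source or Append_Entity_Name, and the
special case for subprogram instances is left out. The scope is printed
recursively first, and an entity with no source name contributes nothing
while its outer scopes are still printed.

```ada
with Ada.Text_IO;           use Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Qualified_Name_Demo is

   --  Toy entity: a name, a "comes from source" flag and an outer scope
   type Entity;
   type Entity_Ptr is access constant Entity;
   type Entity is record
      Name        : Unbounded_String;
      From_Source : Boolean;
      Scope       : Entity_Ptr;
   end record;

   Pkg : aliased constant Entity :=
     (Name        => To_Unbounded_String ("Pkg"),
      From_Source => True,
      Scope       => null);

   --  Internal wrapper, such as one created for a postcondition
   Wrapper : aliased constant Entity :=
     (Name        => To_Unbounded_String ("_wrapper"),
      From_Source => False,
      Scope       => Pkg'Access);

   Proc : aliased constant Entity :=
     (Name        => To_Unbounded_String ("Proc"),
      From_Source => True,
      Scope       => Wrapper'Access);

   procedure Append_Name (Buf : in out Unbounded_String; E : Entity_Ptr) is
   begin
      if E = null then
         return;
      end if;

      --  Print the enclosing scopes first
      Append_Name (Buf, E.Scope);

      --  Skip names the user should not see, but keep what was printed
      if E.From_Source then
         if Length (Buf) > 0 then
            Append (Buf, '.');
         end if;
         Append (Buf, E.Name);
      end if;
   end Append_Name;

   Result : Unbounded_String;

begin
   Append_Name (Result, Proc'Access);
   Put_Line (To_String (Result));  --  prints "Pkg.Proc"
end Qualified_Name_Demo;
```

With a "stop at the first non-source scope" rule, the same input would lose
the Pkg prefix, which matches the assertion-dependent results mentioned in
the description.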

---
 gcc/ada/sem_util.adb | 128 +--
 gcc/ada/sem_util.ads |   7 +--
 2 files changed, 54 insertions(+), 81 deletions(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 6350524874c..b30cbcd57e9 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -645,96 +645,70 @@ package body Sem_Util is
-- Append_Entity_Name --

 
-   procedure Append_Entity_Name (Buf : in out Bounded_String; E : Entity_Id) is
-  Temp : Bounded_String;
-
-  procedure Inner (E : Entity_Id);
-  --  Inner recursive routine, keep outer routine nonrecursive to ease
-  --  debugging when we get strange results from this routine.
-
-  ---
-  -- Inner --
-  ---
-
-  procedure Inner (E : Entity_Id) is
- Scop : Node_Id;
-
-  begin
- --  If entity has an internal name, skip by it, and print its scope.
- --  Note that we strip a final R from the name before the test; this
- --  is needed for some cases of instantiations.
-
- declare
-E_Name : Bounded_String;
-
- begin
-Append (E_Name, Chars (E));
-
-if E_Name.Chars (E_Name.Length) = 'R' then
-   E_Name.Length := E_Name.Length - 1;
-end if;
-
-if Is_Internal_Name (E_Name) then
-   Inner (Scope (E));
-   return;
-end if;
- end;
-
- Scop := Scope (E);
-
- --  Just print entity name if its scope is at the outer level
-
- if Scop = Standard_Standard then
+   procedure Append_Entity_Name
+ (Buf : in out Bounded_String; E : Entity_Id)
+   is
+  Scop : constant Node_Id := Scope (E);
+  --  We recursively print the scope to Buf, and then print the simple
+  --  name, along with some special cases (see below). So for A.B.C.D,
+  --  recursively print A.B.C, then print D.
+   begin
+  --  If E is not a source entity, then skip the simple name and just
+  --  recursively print its scope. However, subprogram instances have
+  --  Comes_From_Source = False, but we do want to print the simple name
+  --  of the instance.
+
+  if not Comes_From_Source (E) then
+ if Is_Generic_Instance (E)
+   and then Ekind (E) in E_Function | E_Procedure
+ then
 null;
+ else
+Append_Entity_Name (Buf, Scope (E));
+return;
+ end if;
+  end if;
 
- --  If scope comes from source, write scope and entity
-
- elsif Comes_From_Source (Scop) then
-Append_Entity_Name (Temp, Scop);
-Append (Temp, '.');
-
- --  If in wrapper package skip past it
-
- elsif Present (Scop) and then Is_Wrapper_Package (Scop) then
-Append_Entity_Name (Temp, Scope (Scop));
-Append (Temp, '.');
+  --  Just print entity name if its scope is at the outer level
 
- --  Otherwise nothing to output (happens in unnamed block statements)
+  if No (Scop) or Scop = Standard_Standard then
+ null;
 
- else
-  

[COMMITTED] ada: Recognize pragma Lock_Free as specific to GNAT

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Pragma Lock_Free must be recognized as implementation-defined.

gcc/ada/

* sem_prag.adb (Analyze_Pragma): When processing pragma
Lock_Free, check if restriction No_Implementation_Pragmas is
enabled.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_prag.adb | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
index ff02ae9a7af..9e0e41c3dad 100644
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -19950,6 +19950,7 @@ package body Sem_Prag is
 Val : Boolean;
 
  begin
+GNAT_Pragma;
 Check_No_Identifiers;
 Check_At_Most_N_Arguments (1);
 
-- 
2.43.2



[COMMITTED] ada: Fix pragma Compile_Time_Error for alignment of array types

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

The pragma is consistently rejected for the alignment of array types because
Eval_Attribute does not evaluate it even if it is known.

gcc/ada/

* sem_attr.adb (Eval_Attribute): Treat Alignment like Component_Size
for array types.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_attr.adb | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
index e80a144ebb2..65442d45a85 100644
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -8729,14 +8729,15 @@ package body Sem_Attr is
   --  Unconstrained_Array are again exceptions, because they apply as well
   --  to unconstrained types.
 
+  --  Folding can also be done for Preelaborable_Initialization based on
+  --  whether the prefix type has preelaborable initialization, even though
+  --  the attribute is nonstatic.
+
   --  In addition Component_Size is an exception since it is possibly
   --  foldable, even though it is never static, and it does apply to
   --  unconstrained arrays. Furthermore, it is essential to fold this
   --  in the packed case, since otherwise the value will be incorrect.
-
-  --  Folding can also be done for Preelaborable_Initialization based on
-  --  whether the prefix type has preelaborable initialization, even though
-  --  the attribute is nonstatic.
+  --  Moreover, the exact same reasoning can be applied to Alignment.
 
   elsif Id = Attribute_Atomic_Always_Lock_Free  or else
 Id = Attribute_Definite or else
@@ -8747,7 +8748,8 @@ package body Sem_Attr is
 Id = Attribute_Preelaborable_Initialization or else
 Id = Attribute_Type_Class   or else
 Id = Attribute_Unconstrained_Array  or else
-Id = Attribute_Component_Size
+Id = Attribute_Component_Size   or else
+Id = Attribute_Alignment
   then
  Static := False;
  Set_Is_Static_Expression (N, False);
-- 
2.43.2
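
As an illustration of the kind of source affected, here is a small
compile-only sketch. The package and type names are invented for this note
and it has not been checked against a particular compiler build. The
expectation, based on the description above, is that a compiler without the
fix rejects the pragma because the condition is not known at compile time,
while a compiler with the fix accepts the unit silently since the condition
evaluates to False.

```ada
package Alignment_Check is

   --  Hypothetical array type with an explicit alignment, so that the
   --  value of the attribute is known to the compiler.
   type Word_Array is array (1 .. 4) of Integer
     with Alignment => 8;

   --  The condition is False here, so no error is expected once
   --  Word_Array'Alignment can be evaluated at compile time.
   pragma Compile_Time_Error
     (Word_Array'Alignment /= 8,
      "Word_Array must be 8-byte aligned");

end Alignment_Check;
```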



[COMMITTED] ada: Remove deprecated VxWorks interrupt connection API

2024-05-13 Thread Marc Poulhiès
From: Ashley Gay 

The VxWorks 7 API to use hardware interrupts is the VxBus subsystem.
GNAT API still provides bindings for the deprecated (VxWorks 6) routines.
A direct consequence of this change is that Attach_Handler cannot be
used anymore (the VxBus subsystem should be used instead).

This patch removes all the functions that are not supported by VxWorks 7
anymore.
To warn about uses of Attach_Handler, it adds the 'Obsolescent'
pragma to this routine so the compiler will advise the user if this
function is called directly or through a pragma.

gcc/ada/

* Makefile.rtl: remove i-vxinco.* from the build
* doc/gnat_rm/the_gnat_library.rst: Remove i-vxinco.ads from
the units documentation.
* impunit.adb: Remove i-vxinco from the list of available units
in GNATstudio.
* libgnarl/i-vxinco.adb: Remove.
* libgnarl/i-vxinco.ads: Ditto.
* libgnarl/s-interr__vxworks.adb: enrich comment
* libgnarl/s-vxwext__kernel.ads: fix comment
* libgnat/i-vxwork.ads: Remove deprecated interrupt connections
API, as well as an example.
* libgnat/i-vxwork__x86.ads: Ditto and add the pragma
Obsolescent to Attach_Handler
* gnat_rm.texi: Regenerate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/Makefile.rtl |   8 +-
 gcc/ada/doc/gnat_rm/the_gnat_library.rst |  17 --
 gcc/ada/gnat_rm.texi | 196 ++-
 gcc/ada/impunit.adb  |   1 -
 gcc/ada/libgnarl/i-vxinco.adb|  48 --
 gcc/ada/libgnarl/i-vxinco.ads|  56 ---
 gcc/ada/libgnarl/s-interr__vxworks.adb   |   5 +-
 gcc/ada/libgnarl/s-vxwext__kernel.ads|   2 +-
 gcc/ada/libgnat/i-vxwork.ads | 115 +
 gcc/ada/libgnat/i-vxwork__x86.ads| 109 -
 10 files changed, 98 insertions(+), 459 deletions(-)
 delete mode 100644 gcc/ada/libgnarl/i-vxinco.adb
 delete mode 100644 gcc/ada/libgnarl/i-vxinco.ads

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 3721a70ffcc..ad3e6380a52 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -1162,7 +1162,7 @@ ifeq ($(strip $(filter-out powerpc% wrs vxworks vxworks7%, $(target_cpu) $(targe
   RTSERR = $(error NO SUCH RUNTIME)
 endif
   endif
-  EXTRA_GNATRTL_NONTASKING_OBJS+=i-vxinco.o i-vxwork.o i-vxwoio.o
+  EXTRA_GNATRTL_NONTASKING_OBJS+=i-vxwork.o i-vxwoio.o
 endif
   endif
 
@@ -1279,7 +1279,7 @@ ifeq ($(strip $(filter-out %86 x86_64 wrs vxworks7%, $(target_cpu) $(target_vend
   RTSERR = $(error NO SUCH RUNTIME)
 endif
 
-EXTRA_GNATRTL_NONTASKING_OBJS += i-vxinco.o i-vxwork.o i-vxwoio.o
+EXTRA_GNATRTL_NONTASKING_OBJS += i-vxwork.o i-vxwoio.o
   endif
 
   EXTRA_GNATRTL_NONTASKING_OBJS += s-stchop.o
@@ -1371,7 +1371,7 @@ ifeq ($(strip $(filter-out aarch64 arm wrs vxworks7%, $(target_cpu) $(target_ven
   endif
 
   EXTRA_GNATRTL_NONTASKING_OBJS += i-vxwork.o i-vxwoio.o s-stchop.o
-  EXTRA_GNATRTL_TASKING_OBJS += i-vxinco.o s-vxwork.o s-vxwext.o
+  EXTRA_GNATRTL_TASKING_OBJS += s-vxwork.o s-vxwext.o
 
   EXTRA_LIBGNAT_OBJS+=vx_stack_info.o
 
@@ -2890,7 +2890,7 @@ ADA_EXCLUDE_SRCS =\
   g-allein.ads g-alleve.adb g-alleve.ads g-altcon.adb g-altcon.ads \
   g-altive.ads g-alveop.adb g-alveop.ads g-alvety.ads g-alvevi.ads \
   g-intpri.ads g-regist.adb g-regist.ads g-sse.ads g-ssvety.ads \
-  i-vxinco.adb i-vxinco.ads i-vxwoio.adb i-vxwoio.ads i-vxwork.ads \
+  i-vxwoio.adb i-vxwoio.ads i-vxwork.ads \
   s-linux.ads  s-vxwext.adb s-vxwext.ads s-win32.ads  s-winext.ads \
   s-stchop.ads s-stchop.adb \
   s-strcom.adb s-strcom.ads s-thread.ads \
diff --git a/gcc/ada/doc/gnat_rm/the_gnat_library.rst b/gcc/ada/doc/gnat_rm/the_gnat_library.rst
index 3aae70a4409..88204d4cfe7 100644
--- a/gcc/ada/doc/gnat_rm/the_gnat_library.rst
+++ b/gcc/ada/doc/gnat_rm/the_gnat_library.rst
@@ -1915,23 +1915,6 @@ mainframes.
 .. index:: VxWorks, interfacing
 
 This package provides a limited binding to the VxWorks API.
-In particular, it interfaces with the
-VxWorks hardware interrupt facilities.
-
-.. _`Interfaces.VxWorks.Int_Connection_(i-vxinco.ads)`:
-
-``Interfaces.VxWorks.Int_Connection`` (:file:`i-vxinco.ads`)
-
-
-.. index:: Interfaces.VxWorks.Int_Connection (i-vxinco.ads)
-
-.. index:: Interfacing to VxWorks
-
-.. index:: VxWorks, interfacing
-
-This package provides a way for users to replace the use of
-intConnect() with a custom routine for installing interrupt
-handlers.
 
 .. _`Interfaces.VxWorks.IO_(i-vxwoio.ads)`:
 
diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi
index f6b14cf61b9..f0e95bec1e5 100644
--- a/gcc/ada/gnat_rm.texi
+++ b/gcc/ada/gnat_rm.texi
@@ -833,7 +833,6 @@ The GNAT Library
 * Interfaces.C.Streams (i-cstrea.ads): Interfaces C Streams i-cstrea ads. 
 * Interfaces.Packed_Decimal (i-pacdec.ads): Interfaces Packed_Decimal i-

[COMMITTED] ada: Couple of comment tweaks to latest change

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

This replaces a few remaining references to "master" by "collection" and
makes a couple of additional tweaks in comments.

gcc/ada/

* libgnat/s-finpri.adb (Finalize): Replace "master" by "collection"
in comments and add a comment about the form of the loop.
* libgnat/s-stposu.adb (Allocate_Any_Controlled): Tweak comment.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/s-finpri.adb | 20 
 gcc/ada/libgnat/s-stposu.adb |  9 -
 2 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/gcc/ada/libgnat/s-finpri.adb b/gcc/ada/libgnat/s-finpri.adb
index 2abc9f49403..731c913b2e7 100644
--- a/gcc/ada/libgnat/s-finpri.adb
+++ b/gcc/ada/libgnat/s-finpri.adb
@@ -174,18 +174,16 @@ package body System.Finalization_Primitives is
   if Collection.Finalization_Started then
  Unlock_Task.all;
 
- --  Double finalization may occur during the handling of stand alone
- --  libraries or the finalization of a pool with subpools. Due to the
- --  potential aliasing of masters in these two cases, do not process
- --  the same master twice.
+ --  Double finalization may occur during the handling of stand-alone
+ --  libraries or the finalization of a pool with subpools.
 
  return;
   end if;
 
-  --  Lock the master to prevent any allocations while the objects are
-  --  being finalized. The master remains locked because either the master
-  --  is explicitly deallocated or the associated access type is about to
-  --  go out of scope.
+  --  Lock the collection to prevent any allocation while the objects are
+  --  being finalized. The collection remains locked because either it is
+  --  explicitly deallocated or the associated access type is about to go
+  --  out of scope.
 
   --  Synchronization:
   --Read  - allocation, finalization
@@ -193,6 +191,12 @@ package body System.Finalization_Primitives is
 
   Collection.Finalization_Started := True;
 
+  --  Note that we cannot walk the list while finalizing its elements
+  --  because the finalization of one may call Unchecked_Deallocation
+  --  on another and, therefore, detach it from anywhere on the list.
+  --  Instead, we empty the list by repeatedly finalizing the first
+  --  element (after the dummy head) and detaching it from the list.
+
   while not Is_Empty_List (Collection.Head'Unchecked_Access) loop
  Curr_Ptr := Collection.Head.Next;
 
diff --git a/gcc/ada/libgnat/s-stposu.adb b/gcc/ada/libgnat/s-stposu.adb
index ebbd3e4d72a..8d232fa0d61 100644
--- a/gcc/ada/libgnat/s-stposu.adb
+++ b/gcc/ada/libgnat/s-stposu.adb
@@ -196,17 +196,16 @@ package body System.Storage_Pools.Subpools is
   --  object or a record with controlled components.
 
   if Is_Controlled then
-
- --  Synchronization:
- --Read  - allocation, finalization
- --Write - finalization
-
  Lock_Taken := True;
  Lock_Task.all;
 
  --  Do not allow the allocation of controlled objects while the
  --  associated collection is being finalized.
 
+ --  Synchronization:
+ --Read  - allocation, finalization
+ --Write - finalization
+
  if Finalization_Started (Collection.all) then
 raise Program_Error with "allocation after finalization started";
  end if;
-- 
2.43.2
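
The comment added to Finalize in s-finpri.adb explains why the loop drains
the list from the front instead of walking it. A self-contained sketch of
that pattern follows. It is a toy model, not the runtime code: Node, Attach,
Detach and Finalize_Node are hypothetical names, and the exact order of
detaching and finalizing in the real loop is not shown in the hunk above.
Finalizing node 1 detaches node 3, which is the situation (one element's
finalization deallocating another) that makes a plain traversal unsafe.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Drain_Demo is

   type Node;
   type Node_Ptr is access all Node;
   type Node is record
      Id   : Natural := 0;
      Prev : Node_Ptr;
      Next : Node_Ptr;
   end record;

   --  Circular doubly linked list with a dummy head
   Head     : aliased Node;
   Head_Ptr : constant Node_Ptr := Head'Access;

   N1 : aliased Node := (Id => 1, Prev => null, Next => null);
   N2 : aliased Node := (Id => 2, Prev => null, Next => null);
   N3 : aliased Node := (Id => 3, Prev => null, Next => null);

   Curr : Node_Ptr;

   --  Insert N right after the dummy head
   procedure Attach (N : Node_Ptr) is
   begin
      N.Next := Head.Next;
      N.Prev := Head_Ptr;
      Head.Next.Prev := N;
      Head.Next := N;
   end Attach;

   --  Unlink N from wherever it is on the list (no-op if already detached)
   procedure Detach (N : Node_Ptr) is
   begin
      if N.Prev /= null then
         N.Prev.Next := N.Next;
         N.Next.Prev := N.Prev;
         N.Prev := null;
         N.Next := null;
      end if;
   end Detach;

   procedure Finalize_Node (N : Node_Ptr) is
   begin
      Put_Line ("finalizing node" & Natural'Image (N.Id));

      --  Simulate a Finalize that deallocates another element
      if N.Id = 1 then
         Detach (N3'Access);
      end if;
   end Finalize_Node;

begin
   Head.Next := Head_Ptr;
   Head.Prev := Head_Ptr;

   Attach (N3'Access);
   Attach (N2'Access);
   Attach (N1'Access);  --  list is now Head -> N1 -> N2 -> N3

   --  Empty the list by repeatedly taking the first element
   while Head.Next /= Head_Ptr loop
      Curr := Head.Next;
      Detach (Curr);
      Finalize_Node (Curr);
   end loop;
end Drain_Demo;
```

Only nodes 1 and 2 are finalized by the loop, and the loop never holds a
pointer to the "next" element across a call that may remove it.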



[COMMITTED] ada: Restore fix for controlled dynamic allocation with BIP function call

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

The resolution made some time ago had been that a dynamic allocation for
a limited type that needs finalization with a function call as expression
always needs to be done in the called function, even if the limited type
has a known size.  But the fix implementing this resolution was dropped
inadvertently at some point.

The change also contains a small tweak for Expand_N_Object_Declaration
and a small related cleanup in the finalization machinery.

gcc/ada/

* exp_ch3.adb (Expand_N_Object_Declaration): In the case of a
return object of a BIP function that needs finalization, save
the assignment statement made to initialize it, if any.
* exp_ch6.ads (BIP_Formal_Kind): Adjust description.
* exp_ch6.adb (Make_Build_In_Place_Call_In_Allocator): Make a
couple of adjustments to the commentary.
(Needs_BIP_Alloc_Form): Also return true if the function needs
a BIP_Finalization_Master parameter.
* exp_ch7.adb (Build_BIP_Cleanup_Stmts): Remove now always true
test on Needs_BIP_Alloc_Form.
(Attach_Object_To_Master_Node): Remove duplication in comment.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch3.adb |  8 +
 gcc/ada/exp_ch6.adb | 34 ++--
 gcc/ada/exp_ch6.ads | 22 +++--
 gcc/ada/exp_ch7.adb | 75 -
 4 files changed, 64 insertions(+), 75 deletions(-)

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index f934dbfddaa..4ebc7b977e9 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -8746,6 +8746,14 @@ package body Exp_Ch3 is
 Initialize_Return_Object
   (Tag_Assign, Adj_Call, Expr_Q, Init_Stmt, Init_After);
 
+--  Save the assignment statement when returning a controlled
+--  object. This reference is used later by the finalization
+--  machinery to mark the object as successfully initialized.
+
+if Present (Init_Stmt) and then Needs_Finalization (Typ) then
+   Set_Last_Aggregate_Assignment (Def_Id, Init_Stmt);
+end if;
+
 --  Replace the return object declaration with a renaming of a
 --  dereference of the access value designating the return object.
 
diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index a89c9af0bb2..9e1844aa08e 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -158,9 +158,9 @@ package body Exp_Ch6 is
   Alloc_Form : BIP_Allocation_Form := Unspecified;
   Alloc_Form_Exp : Node_Id := Empty;
   Pool_Actual: Node_Id := Make_Null (No_Location));
-   --  Ada 2005 (AI-318-02): Add the actuals needed for a build-in-place
-   --  function call that returns a caller-unknown-size result (BIP_Alloc_Form
-   --  and BIP_Storage_Pool). If Alloc_Form_Exp is present, then use it,
+   --  Ada 2005 (AI-318-02): If the result type of a build-in-place call needs
+   --  them, add the actuals parameters BIP_Alloc_Form and BIP_Storage_Pool.
+   --  If Alloc_Form_Exp is present, then pass it for the first parameter,
--  otherwise pass a literal corresponding to the Alloc_Form parameter
--  (which must not be Unspecified in that case). Pool_Actual is the
--  parameter to pass to BIP_Storage_Pool.
@@ -8328,9 +8328,11 @@ package body Exp_Ch6 is
   Set_Can_Never_Be_Null (Acc_Type, False);
   --  It gets initialized to null, so we can't have that
 
-  --  When the result subtype is constrained, the return object is created
-  --  on the caller side, and access to it is passed to the function. This
-  --  optimization is disabled when the result subtype needs finalization
+  --  When the result subtype is returned on the secondary stack or is
+  --  tagged, the called function itself must perform the allocation of
+  --  the return object, so we pass parameters indicating that.
+
+  --  But that's also the case when the result subtype needs finalization
   --  actions because the caller side allocation may result in undesirable
   --  finalization. Consider the following example:
   --
@@ -8351,11 +8353,6 @@ package body Exp_Ch6 is
   --  will be finalized when access type Lim_Ctrl_Ptr goes out of scope
   --  since it is already attached on the related finalization master.
 
-  --  Here and in related routines, we must examine the full view of the
-  --  type, because the view at the point of call may differ from the
-  --  one in the function body, and the expansion mechanism depends on
-  --  the characteristics of the full view.
-
   if Needs_BIP_Alloc_Form (Function_Id) then
  Temp_Init := Empty;
 
@@ -8386,6 +8383,10 @@ package body Exp_Ch6 is
 
  Return_Obj_Actual := Empty;
 
+  --  When the result subtype neither is returned on the secondary stack
+  --  nor is tagged, the return object is created on the caller

[COMMITTED] ada: Replace finalization masters with finalization collections

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

This change replaces finalization masters with finalization collections in
most cases, that is to say, when they implement a list of objects created
by allocators of a given access type; indeed the moniker is overloaded in
the front-end, e.g. Sem_Util.Is_Master determines if a node "constitutes
a finalization master" but is not affected by the change.

This is mostly a renaming at this stage, toward something more in keeping
with the terminology used in the RM 7.6.1 clause and no functional changes:
although it gets rid of the rest of the System.Finalization_Masters unit,
the functionalities are reimplemented in the System.Finalization_Primitives
unit in terms of collections with only minor adjustments.

gcc/ada/

* Makefile.rtl (GNATRTL_NONTASKING_OBJS): Remove s-finmas$(objext).
* einfo.ads (Anonymous_Masters): Rename into Anonymous_Collections.
(Finalization_Master): Rename into Finalization_Collection.
* gen_il-fields.ads (Opt_Field_Enum): Replace Anonymous_Masters
with Anonymous_Collections; and Finalization_Master with
Finalization_Collection.
* gen_il-gen-gen_entities.adb (Access_Kind): Likewise.
(E_Function): Likewise.
(E_Procedure): Likewise.
(E_Package): Likewise.
(E_Subprogram_Body): Likewise.
* exp_ch3.adb (Build_Heap_Or_Pool_Allocator): Adjust to renamings.
(Freeze_Type): Likewise.
(Stream_Operation_OK): Remove obsolete test.
* exp_ch4.adb (Expand_Allocator_Expression): Adjust to renamings.
(Expand_N_Allocator): Likewise.
* exp_ch6.ads (BIP_Formal_Kind): Replace BIP_Finalization_Master
with BIP_Collection.
(Needs_BIP_Finalization_Master): Rename into...
(Needs_BIP_Collection): ...this.
* exp_ch6.adb (BIP_Finalization_Master_Suffix): Delete.
(BIP_Collection_Suffix): New constant string.
(Add_Finalization_Master_Actual_To_Build_In_Place_Call): Rename to
(Add_Collection_Actual_To_Build_In_Place_Call): ...this and adjust.
(BIP_Formal_Suffix): Replace BIP_Finalization_Master alternative
with BIP_Collection alternative.
(BIP_Suffix_Kind): Replace test on BIP_Finalization_Master_Suffix
with test on BIP_Collection_Suffix.
(Is_Build_In_Place_Entity): Likewise.
(Make_Build_In_Place_Call_In_Allocator): Call Needs_BIP_Collection
and Add_Collection_Actual_To_Build_In_Place_Call.
(Make_Build_In_Place_Call_In_Anonymous_Context): Likewise.
(Make_Build_In_Place_Call_In_Assignment): Likewise.
(Make_Build_In_Place_Call_In_Object_Declaration): Likewise.
(Needs_BIP_Finalization_Master): Rename into...
(Needs_BIP_Collection): ...this.
(Needs_BIP_Alloc_Form): Call Needs_BIP_Collection.
* exp_ch7.ads (Build_Anonymous_Master): Rename into...
(Build_Anonymous_Collection): ...this.
(Build_Finalization_Master): Rename into...
(Build_Finalization_Collection): ...this.
* exp_ch7.adb (Allows_Finalization_Master): Rename into...
(Allows_Finalization_Collection): ...this.
(Build_BIP_Cleanup_Stmts): Adjust to renamings.
(Build_Anonymous_Master): Rename into...
(Build_Anonymous_Collection): ...this.  Adjust to renamings.
(Build_Finalization_Master): Rename into...
(Build_Finalization_Collection): ...this.  Adjust to renamings.
(Build_Finalizer): Adjust comment to renamings.
* exp_ch13.adb (Expand_N_Free_Statement): Adjust to renamings.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Likewise.
(Requires_Cleanup_Actions): Adjust comment to renamings.
* freeze.adb (Freeze_All): Likewise.
* rtsfind.ads (RTU_Id): Remove System_Finalization_Masters.
(RE_Id): Remove RE_Finalization_Master & RE_Finalization_Master_Ptr,
add RE_Finalization_Collection & RE_Finalization_Collection_Ptr.
Adjust RE_Add_Offset_To_Address and RE_Finalization_Scope_Master.
(RE_Unit_Table): Remove entries for RE_Finalization_Master &
RE_Finalization_Master_Ptr, add ones for RE_Finalization_Collection
& RE_Finalization_Collection_Ptr.  Also adjust those of
RE_Add_Offset_To_Address and RE_Finalization_Scope_Master.
* sem_ch3.adb (Access_Type_Declaration): Adjust to renamings.
* sem_ch6.adb (Create_Extra_Formals): Likewise.
* sem_util.adb (Designated_Subtype_Mark): Likewise.
* libgnat/s-finpri.ads: Add clauses for Ada.Finalization and
System.Storage_Elements.
(Finalization_Collection): New limited controlled type.
(Finalization_Collection_Ptr): Likewise.
(Initialize): New overriding procedure.
(Finalize): Likewise.
(Finalization_Started): Likewise.
(Collection_Node): New type.
(Collection_Node_Ptr): Likewise.
(Attach_Node_To_Collection): New procedure.
(D

[COMMITTED] ada: Rename finalization scope masters into finalization masters

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

Now that what was previously called "finalization master" has been renamed
into "finalization collection" in the front-end, we can also rename what was
initially called "finalization scope master" into "finalization master".

These entities indeed drive the finalization of all the objects that require
it, directly for (statically) declared objects or indirectly for dynamically
allocated objects (that is to say, through finalization collections).

gcc/ada/

* exp_ch7.adb: Adjust the description of finalization management.
(Build_Finalizer): Rename scope master into master throughout.
* rtsfind.ads (RE_Id): Replace RE_Finalization_Scope_Master with
RE_Finalization_Master.
(RE_Unit_Table): Replace entry for RE_Finalization_Scope_Master with
entry for RE_Finalization_Master.
* libgnat/s-finpri.ads (Finalization_Scope_Master): Rename into...
(Finalization_Master): ...this.
(Attach_Object_To_Master): Adjust to above renaming.
(Chain_Node_To_Master): Likewise.
(Finalize_Master): Likewise.
* libgnat/s-finpri.adb (Attach_Object_To_Master): Likewise.
(Chain_Node_To_Master): Likewise.
(Finalize_Master): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch7.adb  | 64 +---
 gcc/ada/libgnat/s-finpri.adb |  6 ++--
 gcc/ada/libgnat/s-finpri.ads | 12 +++
 gcc/ada/rtsfind.ads  |  4 +--
 4 files changed, 42 insertions(+), 44 deletions(-)

diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
index 50d5359e04d..a62c7441a48 100644
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -85,10 +85,9 @@ package body Exp_Ch7 is
 
--  Initialize calls: they are generated for either declarations or dynamic
--  allocations of controlled objects with no initial value. They are always
-   --  followed by an attachment to the current finalization chain. For the
-   --  dynamic allocation case, this is the chain attached to the scope of the
-   --  access type definition; otherwise, this is the chain of the current
-   --  scope.
+   --  followed by an attachment to some finalization chain. For the dynamic
+   --  dynamic allocation case, this is the collection attached to the access
+   --  type definition; otherwise, this is the master of the current scope.
 
--  Adjust calls: they are generated on two occasions: (1) for declarations
--  or dynamic allocations of controlled objects with an initial value (with
@@ -122,7 +121,7 @@ package body Exp_Ch7 is
--  is expanded into:
--
--declare
-   --   Mnn : System.Finalization_Primitives.Finalization_Scope_Master;
+   --   Mnn : System.Finalization_Primitives.Finalization_Master;
 
--   XMN : aliased System.Finalization_Primitives.Master_Node;
--   X : Ctrl;
@@ -203,8 +202,8 @@ package body Exp_Ch7 is
--at end
--   _Finalizer;
 
-   --  In the case of a block containing a single controlled object, the scope
-   --  master degenerates into a single master node:
+   --  In the case of a block containing a single controlled object, the master
+   --  degenerates into a single master node:
 
--declare
--   X : Ctrl := Init;
@@ -268,7 +267,7 @@ package body Exp_Ch7 is
 
--  These direct actions must be signalled to the post-processing machinery
--  and this is achieved through the handling of Master_Node objects, which
-   --  are the items actually chained in finalization chains of scope masters.
+   --  are the items actually chained in the finalization chains of masters.
--  With the default processing, they are created by Build_Finalizer for the
--  controlled objects spotted by Requires_Cleanup_Actions. But when direct
--  actions are carried out, they are generated by these actions and later
@@ -1702,8 +1701,8 @@ package body Exp_Ch7 is
   Finalizer_Decls : List_Id := No_List;
   --  Local variable declarations
 
-  Finalization_Scope_Master : Entity_Id;
-  --  The Finalization Scope Master object
+  Finalization_Master : Entity_Id;
+  --  The Finalization Master object
 
   Finalizer_Stmts : List_Id := No_List;
   --  The statement list of the finalizer body
@@ -1774,33 +1773,33 @@ package body Exp_Ch7 is
   --
 
   procedure Build_Components is
- Constraints   : List_Id;
- Scope_Master_Decl : Node_Id;
- Scope_Master_Name : Name_Id;
+ Constraints : List_Id;
+ Master_Decl : Node_Id;
+ Master_Name : Name_Id;
 
   begin
  pragma Assert (Present (Decls));
 
  --  If the context contains controlled objects, then we create the
- --  finalization scope master, unless there is a single such object;
- --  in this common case, we'll directly finalize the object.
+ --  finalization master, unless there is a single such object: 

[COMMITTED] ada: Decouple finalization masters from storage pools

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

The coupling came from the build-in-place protocol but is now unnecessary
because the storage pool reference is always passed along with the master
reference in this protocol.  No functional changes.

gcc/ada/

* exp_ch3.adb (Build_Heap_Or_Pool_Allocator): Use the BIPstoragepool
formal parameter to retrieve the pool in the presence of a master.
* exp_ch6.adb (Make_Build_In_Place_Call_In_Allocator): Always pass
a pool reference along with the master reference.
(Make_Build_In_Place_Call_In_Object_Declaration): Likewise.
* exp_ch7.adb (Build_BIP_Cleanup_Stmts): Use the BIPstoragepool
formal parameter to retrieve the pool in the presence of a master.
(Create_Anonymous_Master): Do not call Set_Base_Pool.
(Build_Finalization_Master): Likewise.
* rtsfind.ads (RE_Id): Remove RE_Base_Pool and RE_Set_Base_Pool.
(RE_Unit_Table): Remove associated entries.
* libgnat/s-finmas.ads: Remove clause for System.Storage_Pools.
(Any_Storage_Pool_Ptr): Delete.
(Finalization_Master): Remove Base_Pool component.
(Base_Pool): Delete.
(Set_Base_Pool): Likewise.
* libgnat/s-finmas.adb (Base_Pool): Likewise.
(Set_Base_Pool): Likewise.
(Print_Master): Do not print Base_Pool.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch3.adb  | 49 +++---
 gcc/ada/exp_ch6.adb  | 33 ---
 gcc/ada/exp_ch7.adb  | 79 
 gcc/ada/libgnat/s-finmas.adb | 30 --
 gcc/ada/libgnat/s-finmas.ads | 22 --
 gcc/ada/rtsfind.ads  |  4 --
 6 files changed, 76 insertions(+), 141 deletions(-)

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index 4ebc7b977e9..f8d41b1bfc0 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -6254,8 +6254,7 @@ package body Exp_Ch3 is
   --   else
   --  declare
   -- type Ptr_Typ is access Ret_Typ;
-  -- for Ptr_Typ'Storage_Pool use
-  --   Base_Pool (BIPfinalizationmaster.all).all;
+  -- for Ptr_Typ'Storage_Pool use BIPstoragepool.all;
   -- Local : Ptr_Typ;
   --
   --  begin
@@ -6497,25 +6496,27 @@ package body Exp_Ch3 is
 
 begin
--  Generate:
-   --Pool_Id renames Base_Pool (BIPfinalizationmaster.all).all;
-
-   Pool_Id := Make_Temporary (Loc, 'P');
-
-   Append_To (Decls,
- Make_Object_Renaming_Declaration (Loc,
-   Defining_Identifier => Pool_Id,
-   Subtype_Mark=>
- New_Occurrence_Of (RTE (RE_Root_Storage_Pool), Loc),
-   Name=>
- Make_Explicit_Dereference (Loc,
-   Prefix =>
- Make_Function_Call (Loc,
-   Name   =>
- New_Occurrence_Of (RTE (RE_Base_Pool), Loc),
-   Parameter_Associations => New_List (
- Make_Explicit_Dereference (Loc,
-   Prefix =>
- New_Occurrence_Of (Fin_Mas_Id, Loc)));
+   --Pool_Id renames BIPstoragepool.all;
+
+   --  This formal is not added on ZFP as those targets do not
+   --  support pools.
+
+   if RTE_Available (RE_Root_Storage_Pool_Ptr) then
+  Pool_Id := Make_Temporary (Loc, 'P');
+
+  Append_To (Decls,
+Make_Object_Renaming_Declaration (Loc,
+  Defining_Identifier => Pool_Id,
+  Subtype_Mark=>
+New_Occurrence_Of (RTE (RE_Root_Storage_Pool), Loc),
+  Name=>
+Make_Explicit_Dereference (Loc,
+  New_Occurrence_Of
+(Build_In_Place_Formal
+   (Func_Id, BIP_Storage_Pool), Loc;
+   else
+  Pool_Id := Empty;
+   end if;
 
--  Create an access type which uses the storage pool of the
--  caller's master. This additional type is necessary because
@@ -6572,10 +6573,8 @@ package body Exp_Ch3 is
  Unchecked_Convert_To (Temp_Typ,
New_Occurrence_Of (Local_Id, Loc;
 
-   --  Wrap the allocation in a block. This is further conditioned
-   --  by checking the caller finalization master at runtime. A
-   --  null value indicates a non-existent master, most likely due
-   --  to a Finalize_Storage_Only allocation.
+   --  Wrap the allocation in a block to make it conditioned

[COMMITTED] ada: Fix style in comments

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Code cleanup.

gcc/ada/

* contracts.adb (Inherit_Subprogram_Contract): Fix style.
* sem_ch5.adb (Analyze_Iterator_Specification): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/contracts.adb | 2 +-
 gcc/ada/sem_ch5.adb   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/contracts.adb b/gcc/ada/contracts.adb
index c440053bb78..c04d850b532 100644
--- a/gcc/ada/contracts.adb
+++ b/gcc/ada/contracts.adb
@@ -3620,7 +3620,7 @@ package body Contracts is
  end if;
   end Inherit_Pragma;
 
-   --   Start of processing for Inherit_Subprogram_Contract
+   --  Start of processing for Inherit_Subprogram_Contract
 
begin
   --  Inheritance is carried out only when both entities are subprograms
diff --git a/gcc/ada/sem_ch5.adb b/gcc/ada/sem_ch5.adb
index dc9524b0891..2677a2c5a1c 100644
--- a/gcc/ada/sem_ch5.adb
+++ b/gcc/ada/sem_ch5.adb
@@ -2158,7 +2158,7 @@ package body Sem_Ch5 is
  return Etype (Ent);
   end Get_Cursor_Type;
 
-   --   Start of processing for Analyze_Iterator_Specification
+   --  Start of processing for Analyze_Iterator_Specification
 
begin
   Enter_Name (Def_Id);
-- 
2.43.2



[COMMITTED] ada: Avoid crash on illegal constrained type declarations

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Fix crash on ACATS test B38003B introduced by a recent cleanup of
per-object constraints.

gcc/ada/

* sem_util.adb (Get_Index_Bounds): Guard against missing Entity,
which happens on illegal constrained type declaration.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 5f44b4c26fe..579172515df 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -10542,7 +10542,10 @@ package body Sem_Util is
 H := High_Bound (Range_Expression (Constraint (N)));
  end if;
 
-  elsif Is_Entity_Name (N) and then Is_Type (Entity (N)) then
+  elsif Is_Entity_Name (N)
+and then Present (Entity (N))
+and then Is_Type (Entity (N))
+  then
  Rng := Scalar_Range_Of_Type (Entity (N));
 
  if Error_Posted (Rng) then
-- 
2.43.2



[COMMITTED] ada: Complete implementation of Ada 2022 aspect Exclusive_Functions

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Extend implementation of RM 9.5.1(7/4), which now also applies to
protected functions if the protected type has aspect Exclusive_Functions.
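
For readers unfamiliar with the aspect, a minimal sketch of a protected type affected by this change follows; it is an assumption-laden illustration, not code from the patch or the testsuite. The names Exclusive_Demo, Counter, Current, Increment and Wait_Nonzero are invented, and the unit needs an Ada 2022 compilation mode (-gnat2022 with GNAT).

```ada
procedure Exclusive_Demo is

   protected type Counter
     with Exclusive_Functions
   is
      function Current return Natural;
      --  With Exclusive_Functions, calls to this function are exclusive
      --  protected actions, like calls to Increment.

      procedure Increment;

      entry Wait_Nonzero;
   private
      Value : Natural := 0;
   end Counter;

   protected body Counter is

      function Current return Natural is
      begin
         return Value;
      end Current;

      procedure Increment is
      begin
         Value := Value + 1;
      end Increment;

      entry Wait_Nonzero when Value > 0 is
      begin
         null;
      end Wait_Nonzero;

   end Counter;

   C : Counter;

begin
   C.Increment;
   C.Wait_Nonzero;
   pragma Assert (C.Current = 1);
end Exclusive_Demo;
```

Because Counter has an entry and the aspect is present, the cleanup generated after a call to Current is, per the description above, expected to service the entry queues in the same way as the cleanup after a call to Increment.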

gcc/ada/

* exp_ch9.adb (Build_Protected_Subprogram_Call_Cleanup): If
aspect Exclusive_Functions is present then the cleanup of a
protected function now services queued entries, just like the
cleanup of a protected procedure.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch9.adb | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/exp_ch9.adb b/gcc/ada/exp_ch9.adb
index 17d997b9f60..1b231b8bf2c 100644
--- a/gcc/ada/exp_ch9.adb
+++ b/gcc/ada/exp_ch9.adb
@@ -4032,12 +4032,25 @@ package body Exp_Ch9 is
   Nam : Node_Id;
 
begin
-  --  If the associated protected object has entries, a protected
-  --  procedure has to service entry queues. In this case generate:
+  --  If the associated protected object has entries, the expanded
+  --  exclusive protected operation has to service entry queues. In
+  --  this case generate:
 
   --Service_Entries (_object._object'Access);
 
-  if Nkind (Op_Spec) = N_Procedure_Specification
+  if (Nkind (Op_Spec) = N_Procedure_Specification
+or else
+  (Nkind (Op_Spec) = N_Function_Specification
+ and then Has_Aspect (Conc_Typ, Aspect_Exclusive_Functions)
+ and then
+   (No
+ (Find_Value_Of_Aspect (Conc_Typ,
+Aspect_Exclusive_Functions))
+  or else
+Is_True
+  (Static_Boolean
+ (Find_Value_Of_Aspect
+(Conc_Typ, Aspect_Exclusive_Functions))
 and then Has_Entries (Conc_Typ)
   then
  case Corresponding_Runtime_Package (Conc_Typ) is
-- 
2.43.2



[COMMITTED] ada: Deconstruct flag Split_PPC since splitting now is done in expansion

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Remove flag Split_PPC and all its uses.

gcc/ada/

* contracts.adb (Append_Enabled_Item): Remove use of Split_PPC;
simplify.
* gen_il-fields.ads (Opt_Field_Enum): Remove flag definition.
* gen_il-gen-gen_nodes.adb (N_Aspect_Specification, N_Pragma):
Remove Split_PPC flags.
* gen_il-internals.adb (Image): Remove use of Split_PPC.
* par_sco.adb (Traverse_Aspects): Likewise.
* sem_ch13.adb (Make_Aitem_Pragma): Likewise.
* sem_ch6.adb (List_Inherited_Pre_Post_Aspects): Likewise.
* sem_prag.adb (Analyze_Pre_Post_Condition, Analyze_Pragma,
Find_Related_Declaration_Or_Body): Likewise.
* sem_util.adb (Applied_On_Conjunct): Likewise.
* sinfo.ads: Remove flag documentation.
* treepr.adb (Image): Remove use of Split_PPC.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/contracts.adb| 17 +---
 gcc/ada/gen_il-fields.ads|  1 -
 gcc/ada/gen_il-gen-gen_nodes.adb |  2 -
 gcc/ada/gen_il-internals.adb |  2 -
 gcc/ada/par_sco.adb  |  5 ---
 gcc/ada/sem_ch13.adb | 15 +++
 gcc/ada/sem_ch6.adb  |  4 +-
 gcc/ada/sem_prag.adb |  8 +---
 gcc/ada/sem_util.adb | 73 ++--
 gcc/ada/sinfo.ads| 19 +
 gcc/ada/treepr.adb   |  2 -
 11 files changed, 25 insertions(+), 123 deletions(-)

diff --git a/gcc/ada/contracts.adb b/gcc/ada/contracts.adb
index c04d850b532..97f38735662 100644
--- a/gcc/ada/contracts.adb
+++ b/gcc/ada/contracts.adb
@@ -2714,22 +2714,7 @@ package body Contracts is
  --  Otherwise, add the item
 
  else
-if No (List) then
-   List := New_List;
-end if;
-
---  If the pragma is a conjunct in a composite postcondition, it
---  has been processed in reverse order. In the postcondition body
---  it must appear before the others.
-
-if Nkind (Item) = N_Pragma
-  and then From_Aspect_Specification (Item)
-  and then Split_PPC (Item)
-then
-   Prepend (Item, List);
-else
-   Append (Item, List);
-end if;
+Append_New (Item, List);
  end if;
   end Append_Enabled_Item;
 
diff --git a/gcc/ada/gen_il-fields.ads b/gcc/ada/gen_il-fields.ads
index 67074c60250..54a5703d1a5 100644
--- a/gcc/ada/gen_il-fields.ads
+++ b/gcc/ada/gen_il-fields.ads
@@ -386,7 +386,6 @@ package Gen_IL.Fields is
   Shift_Count_OK,
   Source_Type,
   Specification,
-  Split_PPC,
   Statements,
   Storage_Pool,
   Subpool_Handle_Name,
diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb
index 3a78ffb2009..f3dc215673a 100644
--- a/gcc/ada/gen_il-gen-gen_nodes.adb
+++ b/gcc/ada/gen_il-gen-gen_nodes.adb
@@ -1251,7 +1251,6 @@ begin -- Gen_IL.Gen.Gen_Nodes
(Sy (Identifier, Node_Id, Default_Empty),
 Sy (Expression, Node_Id, Default_Empty),
 Sy (Class_Present, Flag),
-Sy (Split_PPC, Flag),
 Sm (Aspect_On_Partial_View, Flag),
 Sm (Aspect_Rep_Item, Node_Id),
 Sm (Entity_Or_Associated_Node, Node_Id), -- just Entity
@@ -1556,7 +1555,6 @@ begin -- Gen_IL.Gen.Gen_Nodes
(Sy (Pragma_Argument_Associations, List_Id, Default_No_List),
 Sy (Pragma_Identifier, Node_Id),
 Sy (Class_Present, Flag),
-Sy (Split_PPC, Flag),
 Sm (Corresponding_Aspect, Node_Id),
 Sm (From_Aspect_Specification, Flag),
 Sm (Import_Interface_Present, Flag),
diff --git a/gcc/ada/gen_il-internals.adb b/gcc/ada/gen_il-internals.adb
index a0f55d39a42..e08397f7d4e 100644
--- a/gcc/ada/gen_il-internals.adb
+++ b/gcc/ada/gen_il-internals.adb
@@ -339,8 +339,6 @@ package body Gen_IL.Internals is
 return "SPARK_Pragma";
  when SPARK_Pragma_Inherited =>
 return "SPARK_Pragma_Inherited";
- when Split_PPC =>
-return "Split_PPC";
  when SSO_Set_High_By_Default =>
 return "SSO_Set_High_By_Default";
  when SSO_Set_Low_By_Default =>
diff --git a/gcc/ada/par_sco.adb b/gcc/ada/par_sco.adb
index 144c1382369..83c1d001ee5 100644
--- a/gcc/ada/par_sco.adb
+++ b/gcc/ada/par_sco.adb
@@ -1704,11 +1704,6 @@ package body Par_SCO is
  while Present (AN) loop
 AE := Expression (AN);
 
---  SCOs are generated before semantic analysis/expansion:
---  PPCs are not split yet.
-
-pragma Assert (not Split_PPC (AN));
-
 C1 := ASCII.NUL;
 
 case Get_Aspect_Id (AN) is
diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
index efbc67f3c5d..0470ce10ac7 100644
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -1776,12 +1776,12 @@ package body Sem_Ch13 is
Pragma_Name  : Name_Id) retur

[COMMITTED] ada: Move splitting of pre/post aspect expressions to expansion

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

We split expressions of pre/post aspects into individual conjuncts and
emit messages with their precise location when they fail at runtime.

This was done when processing the aspects and caused inefficiency when
the original expression had to be recovered to detect uses of 'Old that
changed in Ada 2022. This patch moves splitting to expansion.

Conceptually, splitting in expansion is easy, but we need to take care
of locations for inherited pre/post contracts. Previously the location
string was generated while splitting the aspect into pragmas and then
it was manipulated when inheriting the pragmas. Now the location string
is built when installing the Pre'Class check and when splitting the
expression in expansion.
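
To make the user-visible effect concrete, here is a small sketch, written for this note rather than taken from the patch, of a precondition made of two conjuncts. The names Pre_Demo and Store are hypothetical. When the second conjunct fails, the exception message is expected to point at that conjunct rather than at the whole expression; the exact wording of the message is not reproduced here.

```ada
with Ada.Assertions;
with Ada.Exceptions;
with Ada.Text_IO; use Ada.Text_IO;

procedure Pre_Demo is

   pragma Assertion_Policy (Check);
   --  Make sure the precondition below is checked at run time

   procedure Store (Index : Integer; Value : Integer)
     with Pre => Index in 1 .. 10
                   and then Value >= 0;
   --  Two conjuncts: each one gets its own check and location

   procedure Store (Index : Integer; Value : Integer) is
   begin
      Put_Line
        ("Storing" & Integer'Image (Value)
         & " at" & Integer'Image (Index));
   end Store;

begin
   --  The first conjunct holds, the second one fails

   Store (Index => 3, Value => -1);

exception
   when E : Ada.Assertions.Assertion_Error =>
      Put_Line (Ada.Exceptions.Exception_Message (E));
end Pre_Demo;
```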

gcc/ada/

* exp_ch6.adb (Append_Message): Build the location string from
scratch and not rely on the one produced while splitting the
aspect into pragmas.
* exp_prag.adb (Expand_Pragma_Check): Split pre/post checks in
expansion.
* sem_ch13.adb (Analyze_Aspect_Specification): Don't split
pre/post expressions into conjuncts; don't add message with
location to the corresponding pragma.
* sem_prag.adb (Build_Pragma_Check_Equivalent): Inherited
pragmas no longer have messages that would need to be updated.
* sinput.adb (Build_Location_String): Adjust to keep the previous
messages when used with inherited pragmas.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch6.adb  |  45 +++
 gcc/ada/exp_prag.adb | 279 +--
 gcc/ada/sem_ch13.adb |  52 
 gcc/ada/sem_prag.adb |  18 ---
 gcc/ada/sinput.adb   |  21 +++-
 5 files changed, 224 insertions(+), 191 deletions(-)

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index 1ed83255a6d..97be99d6661 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -78,7 +78,6 @@ with Sinfo.Utils;use Sinfo.Utils;
 with Sinput; use Sinput;
 with Snames; use Snames;
 with Stand;  use Stand;
-with Stringt;use Stringt;
 with Tbuild; use Tbuild;
 with Uintp;  use Uintp;
 with Validsw;use Validsw;
@@ -7677,47 +7676,37 @@ package body Exp_Ch6 is
(Id   : Entity_Id;
 Is_First : in out Boolean)
  is
-Prag   : constant Node_Id := Get_Class_Wide_Pragma (Id,
- Pragma_Precondition);
-Msg: Node_Id;
-Str_Id : String_Id;
+Prag : constant Node_Id :=
+  Get_Class_Wide_Pragma (Id, Pragma_Precondition);
 
  begin
 if No (Prag) or else Is_Ignored (Prag) then
return;
 end if;
 
-Msg:= Expression (Last (Pragma_Argument_Associations (Prag)));
-Str_Id := Strval (Msg);
-
 if Is_First then
Is_First := False;
 
-   Append (Global_Name_Buffer, Strval (Msg));
-
-   if Id /= Subp_Id
- and then Name_Buffer (1 .. 19) = "failed precondition"
-   then
-  Insert_Str_In_Name_Buffer ("inherited ", 8);
+   if Id /= Subp_Id then
+  Append
+(Global_Name_Buffer, "failed inherited precondition ");
+   else
+  Append (Global_Name_Buffer, "failed precondition ");
end if;
 
 else
-   declare
-  Str  : constant String := To_String (Str_Id);
-  From_Idx : Integer;
+   Append (Global_Name_Buffer, ASCII.LF);
+   Append (Global_Name_Buffer, "  or ");
 
-   begin
-  Append (Global_Name_Buffer, ASCII.LF);
-  Append (Global_Name_Buffer, "  or ");
-
-  From_Idx := Name_Len;
-  Append (Global_Name_Buffer, Str_Id);
-
-  if Str (1 .. 19) = "failed precondition" then
- Insert_Str_In_Name_Buffer ("inherited ", From_Idx + 8);
-  end if;
-   end;
+   Append (Global_Name_Buffer, "failed inherited precondition ");
 end if;
+
+Append (Global_Name_Buffer, "from " &
+  Build_Location_String
+(Sloc
+  (First_Node
+ (Expression
+(First (Pragma_Argument_Associations (Prag)));
  end Append_Message;
 
  --  Local variables
diff --git a/gcc/ada/exp_prag.adb b/gcc/ada/exp_prag.adb
index 78490dcbf45..a9379025a6b 100644
--- a/gcc/ada/exp_prag.adb
+++ b/gcc/ada/exp_prag.adb
@@ -284,24 +284,6 @@ package body Exp_Prag is
--
 
procedure Expand_Pragma_Check (N : Node_Id) is
-  Cond : constant Node_Id := Arg_N (N, 2);
-  Nam  : constant Name_Id := Chars (Arg_N (N, 1));
-  Msg  : Node_Id;
-
-  

[COMMITTED] ada: Move Init_Proc_Level_Formal from Exp_Ch3 to Exp_Util

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

This makes it possible to remove clauses from the Accessibility package.

gcc/ada/

* accessibility.adb: Remove clauses for Exp_Ch3.
* exp_ch3.ads (Init_Proc_Level_Formal): Move declaration to...
* exp_ch3.adb (Init_Proc_Level_Formal): Move body to...
* exp_util.ads (Init_Proc_Level_Formal): ...here.
(Inside_Init_Proc): Alphabetize.
* exp_util.adb (Init_Proc_Level_Formal): ...here.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/accessibility.adb |  1 -
 gcc/ada/exp_ch3.adb   | 25 -
 gcc/ada/exp_ch3.ads   |  5 -
 gcc/ada/exp_util.adb  | 26 ++
 gcc/ada/exp_util.ads  | 10 +++---
 5 files changed, 33 insertions(+), 34 deletions(-)

diff --git a/gcc/ada/accessibility.adb b/gcc/ada/accessibility.adb
index 75ab9667436..bb81ae49f41 100644
--- a/gcc/ada/accessibility.adb
+++ b/gcc/ada/accessibility.adb
@@ -32,7 +32,6 @@ with Elists; use Elists;
 with Errout; use Errout;
 with Einfo.Utils;use Einfo.Utils;
 with Exp_Atag;   use Exp_Atag;
-with Exp_Ch3;use Exp_Ch3;
 with Exp_Ch7;use Exp_Ch7;
 with Exp_Tss;use Exp_Tss;
 with Exp_Util;   use Exp_Util;
diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index f9989373a62..2477a221c96 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -1462,31 +1462,6 @@ package body Exp_Ch3 is
   return Agg;
end Build_Equivalent_Record_Aggregate;
 
-   
-   -- Init_Proc_Level_Formal --
-   
-
-   function Init_Proc_Level_Formal (Proc : Entity_Id) return Entity_Id is
-  Form : Entity_Id;
-   begin
-  --  Move through the formals of the initialization procedure Proc to find
-  --  the extra accessibility level parameter associated with the object
-  --  being initialized.
-
-  Form := First_Formal (Proc);
-  while Present (Form) loop
- if Chars (Form) = Name_uInit_Level then
-return Form;
- end if;
-
- Next_Formal (Form);
-  end loop;
-
-  --  No formal was found, return Empty
-
-  return Empty;
-   end Init_Proc_Level_Formal;
-
---
-- Build_Initialization_Call --
---
diff --git a/gcc/ada/exp_ch3.ads b/gcc/ada/exp_ch3.ads
index 5a4b1133916..1e0f76ae18f 100644
--- a/gcc/ada/exp_ch3.ads
+++ b/gcc/ada/exp_ch3.ads
@@ -146,11 +146,6 @@ package Exp_Ch3 is
--  type is valid only when Normalize_Scalars or Initialize_Scalars is
--  active, or if N is the node for a 'Invalid_Value attribute node.
 
-   function Init_Proc_Level_Formal (Proc : Entity_Id) return Entity_Id;
-   --  Fetch the extra formal from an initalization procedure "proc"
-   --  corresponding to the level of the object being initialized. When none
-   --  is present Empty is returned.
-
procedure Init_Secondary_Tags
  (Typ: Entity_Id;
   Target : Node_Id;
diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
index efc9ef0ed38..1dcfb61b333 100644
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -7267,6 +7267,32 @@ package body Exp_Util is
   return False;
end In_Unconditional_Context;
 
+   
+   -- Init_Proc_Level_Formal --
+   
+
+   function Init_Proc_Level_Formal (Proc : Entity_Id) return Entity_Id is
+  Form : Entity_Id;
+
+   begin
+  --  Go through the formals of the initialization procedure Proc to find
+  --  the extra accessibility level parameter associated with the object
+  --  being initialized.
+
+  Form := First_Formal (Proc);
+  while Present (Form) loop
+ if Chars (Form) = Name_uInit_Level then
+return Form;
+ end if;
+
+ Next_Formal (Form);
+  end loop;
+
+  --  No formal was found, return Empty
+
+  return Empty;
+   end Init_Proc_Level_Formal;
+
---
-- Insert_Action --
---
diff --git a/gcc/ada/exp_util.ads b/gcc/ada/exp_util.ads
index b968f448bba..3fd3a151ddb 100644
--- a/gcc/ada/exp_util.ads
+++ b/gcc/ada/exp_util.ads
@@ -724,9 +724,6 @@ package Exp_Util is
--  chain, counting only entries in the current scope. If an entity is not
--  overloaded, the returned number will be one.
 
-   function Inside_Init_Proc return Boolean;
-   --  Returns True if current scope is within an init proc
-
function In_Library_Level_Package_Body (Id : Entity_Id) return Boolean;
--  Given an arbitrary entity, determine whether it appears at the library
--  level of a package body.
@@ -737,6 +734,13 @@ package Exp_Util is
--  unconditionally executed, i.e. it is not within a loop or a conditional
--  or a case statement etc.
 
+   function Init_Proc_Level_Formal (Proc : Entity_Id) return Entity_Id;
+   --  Return the extra formal of an initialization proce

[COMMITTED] ada: Deconstruct unused flag Is_Expanded_Contract

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Flag Is_Expanded_Contract was introduced together with N_Contract field
(when implementing freezing of contracts), but was never actually used.

gcc/ada/

* gen_il-fields.ads (Opt_Field_Enum):
Remove Is_Expanded_Contract from the list of flags.
* gen_il-gen-gen_nodes.adb (N_Contract): Remove
Is_Expanded_Contract from the list of N_Contract fields.
* sinfo.ads (Is_Expanded_Contract): Remove comments for the flag
and its single occurrence in N_Contract.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gen_il-fields.ads| 1 -
 gcc/ada/gen_il-gen-gen_nodes.adb | 1 -
 gcc/ada/sinfo.ads| 5 -
 3 files changed, 7 deletions(-)

diff --git a/gcc/ada/gen_il-fields.ads b/gcc/ada/gen_il-fields.ads
index 594aeb68819..67074c60250 100644
--- a/gcc/ada/gen_il-fields.ads
+++ b/gcc/ada/gen_il-fields.ads
@@ -254,7 +254,6 @@ package Gen_IL.Fields is
   Is_Elsif,
   Is_Entry_Barrier_Function,
   Is_Expanded_Build_In_Place_Call,
-  Is_Expanded_Contract,
   Is_Folded_In_Parser,
   Is_Generic_Contract_Pragma,
   Is_Homogeneous_Aggregate,
diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb
index fb00993a95e..3a78ffb2009 100644
--- a/gcc/ada/gen_il-gen-gen_nodes.adb
+++ b/gcc/ada/gen_il-gen-gen_nodes.adb
@@ -1330,7 +1330,6 @@ begin -- Gen_IL.Gen.Gen_Nodes
Cc (N_Contract, Node_Kind,
(Sm (Classifications, Node_Id),
 Sm (Contract_Test_Cases, Node_Id),
-Sm (Is_Expanded_Contract, Flag),
 Sm (Pre_Post_Conditions, Node_Id)));
 
Cc (N_Derived_Type_Definition, Node_Kind,
diff --git a/gcc/ada/sinfo.ads b/gcc/ada/sinfo.ads
index 06b9ad0884e..bee4491efde 100644
--- a/gcc/ada/sinfo.ads
+++ b/gcc/ada/sinfo.ads
@@ -1720,10 +1720,6 @@ package Sinfo is
--actuals to support a build-in-place style of call have been added to
--the call.
 
-   --  Is_Expanded_Contract
-   --Present in N_Contract nodes. Set if the contract has already undergone
-   --expansion activities.
-
--  Is_Generic_Contract_Pragma
--This flag is present in N_Pragma nodes. It is set when the pragma is
--a source construct, applies to a generic unit or its body, and denotes
@@ -7959,7 +7955,6 @@ package Sinfo is
   --  Pre_Post_Conditions (set to Empty if none)
   --  Contract_Test_Cases (set to Empty if none)
   --  Classifications (set to Empty if none)
-  --  Is_Expanded_Contract
 
   --  Pre_Post_Conditions contains a collection of pragmas that correspond
   --  to pre- and postconditions associated with an entry or a subprogram
-- 
2.43.2



[COMMITTED] ada: Remove guards against traversal of empty list of aspects

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

When iterating over Aspect_Specifications, we can use First/Next
directly even if Aspect_Specifications returns No_List or
the list has no items.

Code cleanup.

gcc/ada/

* aspects.adb (Copy_Aspects): Style fix.
* contracts.adb (Analyze_Contracts): Style fix.
(Save_Global_References_In_Contract): Remove extra guards.
* par_sco.adb (Traverse_Aspects): Move guard to the caller and
make it consistent with Save_Global_References_In_Contract.
* sem_ch12.adb (Has_Contracts): Remove extra guards.
* sem_ch3.adb (Delayed_Aspect_Present, Get_Partial_View_Aspect,
Check_Duplicate_Aspects): Likewise.
* sem_disp.adb (Check_Dispatching_Operation): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/aspects.adb   |  1 -
 gcc/ada/contracts.adb |  5 +--
 gcc/ada/par_sco.adb   |  8 ++--
 gcc/ada/sem_ch12.adb  | 22 +--
 gcc/ada/sem_ch3.adb   | 91 ---
 gcc/ada/sem_disp.adb  | 22 +--
 6 files changed, 65 insertions(+), 84 deletions(-)

diff --git a/gcc/ada/aspects.adb b/gcc/ada/aspects.adb
index 696ee672acd..b7262c56f3f 100644
--- a/gcc/ada/aspects.adb
+++ b/gcc/ada/aspects.adb
@@ -433,7 +433,6 @@ package body Aspects is
---
 
procedure Copy_Aspects (From : Node_Id; To : Node_Id) is
-
begin
   if not Has_Aspects (From) then
  return;
diff --git a/gcc/ada/contracts.adb b/gcc/ada/contracts.adb
index 97f38735662..810b360fb94 100644
--- a/gcc/ada/contracts.adb
+++ b/gcc/ada/contracts.adb
@@ -512,7 +512,6 @@ package body Contracts is
if Present (It) then
   Validate_Iterable_Aspect (E, It);
end if;
-
if Present (I_Lit) then
   Validate_Literal_Aspect (E, I_Lit);
end if;
@@ -4980,9 +4979,7 @@ package body Contracts is
 
   Push_Scope (Gen_Id);
 
-  if Permits_Aspect_Specifications (Templ)
-and then Has_Aspects (Templ)
-  then
+  if Permits_Aspect_Specifications (Templ) then
  Save_Global_References_In_Aspects (Templ);
   end if;
 
diff --git a/gcc/ada/par_sco.adb b/gcc/ada/par_sco.adb
index 83c1d001ee5..0b750a6f8de 100644
--- a/gcc/ada/par_sco.adb
+++ b/gcc/ada/par_sco.adb
@@ -1696,10 +1696,6 @@ package body Par_SCO is
  C1 : Character;
 
   begin
- if not Has_Aspects (N) then
-return;
- end if;
-
  AN := First (Aspect_Specifications (N));
  while Present (AN) loop
 AE := Expression (AN);
@@ -2414,7 +2410,9 @@ package body Par_SCO is
end if;
  end case;
 
- Traverse_Aspects (N);
+ if Permits_Aspect_Specifications (N) then
+Traverse_Aspects (N);
+ end if;
   end Traverse_One;
 
--  Start of processing for Traverse_Declarations_Or_Statements
diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
index e7b759c4e88..cb05a71e96f 100644
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -9663,21 +9663,17 @@ package body Sem_Ch12 is
   A_Spec : Node_Id;
   A_Id   : Aspect_Id;
begin
-  if No (A_List) then
- return False;
-  else
- A_Spec := First (A_List);
- while Present (A_Spec) loop
-A_Id := Get_Aspect_Id (A_Spec);
-if A_Id = Aspect_Pre or else A_Id = Aspect_Post then
-   return True;
-end if;
+  A_Spec := First (A_List);
+  while Present (A_Spec) loop
+ A_Id := Get_Aspect_Id (A_Spec);
+ if A_Id = Aspect_Pre or else A_Id = Aspect_Post then
+return True;
+ end if;
 
-Next (A_Spec);
- end loop;
+ Next (A_Spec);
+  end loop;
 
- return False;
-  end if;
+  return False;
end Has_Contracts;
 
--
diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
index 1d95b12ff44..2bff0bb6307 100644
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -4153,24 +4153,22 @@ package body Sem_Ch3 is
  A_Id : Aspect_Id;
 
   begin
- if Present (Aspect_Specifications (N)) then
-A := First (Aspect_Specifications (N));
+ A := First (Aspect_Specifications (N));
 
-while Present (A) loop
-   A_Id := Get_Aspect_Id (Chars (Identifier (A)));
+ while Present (A) loop
+A_Id := Get_Aspect_Id (Chars (Identifier (A)));
 
-   if A_Id = Aspect_Address then
+if A_Id = Aspect_Address then
 
-  --  Set flag on object entity, for later processing at
-  --  the freeze point.
+   --  Set flag on object entity, for later processing at the
+   --  freeze point.
 
-  Set_Has_Delayed_Aspects (Id);
-  return True;
-   end if;
+   Set_Has_Delayed_Aspects (Id);
+   return T

[COMMITTED] ada: Remove dynamic frame in System.Image_D and document it in System.Image_F

2024-05-13 Thread Marc Poulhiès
From: Eric Botcazou 

The former can easily be removed while the latter cannot.
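
As a rough illustration of the idea for System.Image_D (a sketch under assumptions, not the library code): a string initialized from Int'Image has bounds that are only known at run time, whereas a buffer sized by Int'Width has a length fixed by the type. Set_Negative_Digits below is a hypothetical stand-in for the real Set_Image_Integer, and Image_Buffer_Demo and Dyn_Digs are invented names.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Image_Buffer_Demo is

   type Int is range -2 ** 31 .. 2 ** 31 - 1;

   V : constant Int := -1234;

   Dyn_Digs : constant String := Int'Image (V);
   --  Old style: the bounds come from the function result

   Maxdigs : constant Natural := Int'Width;
   --  Maximum length needed for Image of an Int

   Digs  : String (1 .. Maxdigs);
   Ndigs : Natural := 0;
   --  New style: buffer whose size depends only on the type

   procedure Set_Negative_Digits (T : Int);
   --  Hypothetical helper: append the digits of T (always <= 0) to Digs

   procedure Set_Negative_Digits (T : Int) is
   begin
      if T <= -10 then
         Set_Negative_Digits (T / 10);
      end if;

      Ndigs := Ndigs + 1;
      Digs (Ndigs) :=
        Character'Val (Character'Pos ('0') - Integer (T rem 10));
   end Set_Negative_Digits;

begin
   --  Set the first character like Image, then the digits

   Ndigs := 1;

   if V >= 0 then
      Digs (1) := ' ';
      Set_Negative_Digits (-V);
   else
      Digs (1) := '-';
      Set_Negative_Digits (V);
   end if;

   pragma Assert (Digs (1 .. Ndigs) = Dyn_Digs);
   Put_Line (Digs (1 .. Ndigs));
end Image_Buffer_Demo;
```

Working on non-positive values avoids overflow for Int'First, which has no positive counterpart in the type.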

gcc/ada/

* libgnat/s-imaged.ads (System.Image_D): Add Uns formal parameter.
* libgnat/s-imaged.adb: Add with clauses for System.Image_I,
System.Value_I_Spec and System.Value_U_Spec.
(Uns_Spec): New instance of System.Value_U_Spec.
(Int_Spec): New instance of System.Value_I_Spec.
(Image_I): New instance of System.Image_I.
(Set_Image_Integer): New renaming.
(Set_Image_Decimal): Replace 'Image with call to Set_Image_Integer.
* libgnat/s-imde32.ads (Uns32): New subtype.
(Impl): Pass Uns32 as second actual parameter to Image_D.
* libgnat/s-imde64.ads (Uns64): New subtype.
(Impl): Pass Uns64 as second actual parameter to Image_D.
* libgnat/s-imde128.ads (Uns128): New subtype.
(Impl): Pass Uns128 as second actual parameter to Image_D.
* libgnat/s-imagef.adb (Set_Image_Fixed): Document bounds for the
A, D and AF local constants.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/s-imaged.adb  | 55 +--
 gcc/ada/libgnat/s-imaged.ads  |  5 ++--
 gcc/ada/libgnat/s-imagef.adb  |  9 ++
 gcc/ada/libgnat/s-imde128.ads |  3 +-
 gcc/ada/libgnat/s-imde32.ads  |  3 +-
 gcc/ada/libgnat/s-imde64.ads  |  3 +-
 6 files changed, 70 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/libgnat/s-imaged.adb b/gcc/ada/libgnat/s-imaged.adb
index 800a8e421cd..3a3b34960ca 100644
--- a/gcc/ada/libgnat/s-imaged.adb
+++ b/gcc/ada/libgnat/s-imaged.adb
@@ -29,10 +29,42 @@
 --  --
 --
 
+with System.Image_I;
 with System.Img_Util; use System.Img_Util;
+with System.Value_I_Spec;
+with System.Value_U_Spec;
 
 package body System.Image_D is
 
+   --  Contracts, ghost code, loop invariants and assertions in this unit are
+   --  meant for analysis only, not for run-time checking, as it would be too
+   --  costly otherwise. This is enforced by setting the assertion policy to
+   --  Ignore.
+
+   pragma Assertion_Policy (Assert => Ignore,
+Assert_And_Cut => Ignore,
+Contract_Cases => Ignore,
+Ghost  => Ignore,
+Loop_Invariant => Ignore,
+Pre=> Ignore,
+Post   => Ignore,
+Subprogram_Variant => Ignore);
+
+   package Uns_Spec is new System.Value_U_Spec (Uns);
+   package Int_Spec is new System.Value_I_Spec (Int, Uns, Uns_Spec);
+
+   package Image_I is new System.Image_I
+ (Int=> Int,
+  Uns=> Uns,
+  U_Spec => Uns_Spec,
+  I_Spec => Int_Spec);
+
+   procedure Set_Image_Integer
+ (V : Int;
+  S : in out String;
+  P : in out Natural)
+ renames Image_I.Set_Image_Integer;
+
---
-- Image_Decimal --
---
@@ -71,11 +103,28 @@ package body System.Image_D is
   Aft   : Natural;
   Exp   : Natural)
is
-  Digs : String := Int'Image (V);
-  --  Sign and digits of decimal value
+  Maxdigs : constant Natural := Int'Width;
+  --  Maximum length needed for Image of an Int
+
+  Digs  : String (1 .. Maxdigs);
+  Ndigs : Natural;
+  --  Buffer for the image of the integer value
 
begin
-  Set_Decimal_Digits (Digs, Digs'Length, S, P, Scale, Fore, Aft, Exp);
+  --  Set the first character like Image
+
+  if V >= 0 then
+ Digs (1) := ' ';
+ Ndigs := 1;
+  else
+ Ndigs := 0;
+  end if;
+
+  Set_Image_Integer (V, Digs, Ndigs);
+
+  pragma Assert (1 <= Ndigs and then Ndigs <= Maxdigs);
+
+  Set_Decimal_Digits (Digs, Ndigs, S, P, Scale, Fore, Aft, Exp);
end Set_Image_Decimal;
 
 end System.Image_D;
diff --git a/gcc/ada/libgnat/s-imaged.ads b/gcc/ada/libgnat/s-imaged.ads
index 5fe8f82fa17..927ea50e769 100644
--- a/gcc/ada/libgnat/s-imaged.ads
+++ b/gcc/ada/libgnat/s-imaged.ads
@@ -36,6 +36,7 @@
 generic
 
type Int is range <>;
+   type Uns is mod <>;
 
 package System.Image_D is
 
@@ -46,8 +47,8 @@ package System.Image_D is
   Scale : Integer);
--  Computes fixed_type'Image (V), where V is the integer value (in units of
--  delta) of a decimal type whose Scale is as given and stores the result
-   --  S (1 .. P), updating P to the value of L. The image is given by the
-   --  rules in RM 3.5(34) for fixed-point type image functions. The caller
+   --  S (1 .. P), updating P on return. The result is computed according to
+   --  the rules for image for fixed-point types (RM 3.5(34)). The caller
--  guarantees that S is long enough to hold the result and has a lower
--  bound of 1.
 
diff --git a/gcc/ada/lib

[COMMITTED] ada: Refactor repeated code for querying Boolean-valued aspects

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Code cleanup following a fix for aspect Exclusive_Functions; semantics
is unaffected.
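
The rule that Has_Enabled_Aspect factors out is a three-way decision: aspect absent, aspect present without an expression, aspect present with a static Boolean expression. A minimal C sketch of that decision, assuming a toy aspect_spec record that is purely hypothetical and unrelated to the compiler's real node types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an aspect specification node.  */
struct aspect_spec
{
  bool has_expression;    /* was "=> expr" written after the aspect name?  */
  bool expression_value;  /* static value of that expression, if any  */
};

/* ASP is NULL when the entity does not have the aspect at all.  */
static bool
has_enabled_aspect (const struct aspect_spec *asp)
{
  if (asp == NULL)
    return false;                  /* aspect not present  */
  if (asp->has_expression)
    return asp->expression_value;  /* e.g. "=> True" or "=> False"  */
  return true;                     /* bare aspect name means True  */
}
```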

gcc/ada/

* exp_ch9.adb (Build_Protected_Subprogram_Body,
Build_Protected_Subprogram_Call_Cleanup): Reuse refactored
routine.
* sem_util.adb
(Has_Enabled_Aspect): Refactored repeated code.
(Is_Static_Function): Reuse refactored routine.
* sem_util.ads (Has_Enabled_Aspect):
New query routine refactored from repeated code.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch9.adb  | 19 +++
 gcc/ada/sem_util.adb | 28 +++-
 gcc/ada/sem_util.ads |  6 ++
 3 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/gcc/ada/exp_ch9.adb b/gcc/ada/exp_ch9.adb
index 1b231b8bf2c..a89e3247647 100644
--- a/gcc/ada/exp_ch9.adb
+++ b/gcc/ada/exp_ch9.adb
@@ -3835,13 +3835,7 @@ package body Exp_Ch9 is
 Expression => New_Occurrence_Of (R, Loc));
  end if;
 
- if Has_Aspect (Pid, Aspect_Exclusive_Functions)
-   and then
- (No (Find_Value_Of_Aspect (Pid, Aspect_Exclusive_Functions))
-   or else
- Is_True (Static_Boolean (Find_Value_Of_Aspect
-   (Pid, Aspect_Exclusive_Functions))))
- then
+ if Has_Enabled_Aspect (Pid, Aspect_Exclusive_Functions) then
 Lock_Kind := RE_Lock;
  else
 Lock_Kind := RE_Lock_Read_Only;
@@ -4041,16 +4035,9 @@ package body Exp_Ch9 is
   if (Nkind (Op_Spec) = N_Procedure_Specification
 or else
   (Nkind (Op_Spec) = N_Function_Specification
- and then Has_Aspect (Conc_Typ, Aspect_Exclusive_Functions)
  and then
-   (No
- (Find_Value_Of_Aspect (Conc_Typ,
-Aspect_Exclusive_Functions))
-  or else
-Is_True
-  (Static_Boolean
- (Find_Value_Of_Aspect
-(Conc_Typ, Aspect_Exclusive_Functions))
+   Has_Enabled_Aspect
+ (Conc_Typ, Aspect_Exclusive_Functions)))
 and then Has_Entries (Conc_Typ)
   then
  case Corresponding_Runtime_Package (Conc_Typ) is
diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index b30cbcd57e9..e9ab6650dac 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -13484,6 +13484,28 @@ package body Sem_Util is
   return False;
end Has_Effectively_Volatile_Component;
 
+   
+   -- Has_Enabled_Aspect --
+   
+
+   function Has_Enabled_Aspect
+ (Id : Entity_Id;
+  A  : Aspect_Id)
+  return Boolean
+   is
+  Asp : constant Node_Id := Find_Aspect (Id, A);
+   begin
+  if Present (Asp) then
+ if Present (Expression (Asp)) then
+return Is_True (Static_Boolean (Expression (Asp)));
+ else
+return True;
+ end if;
+  else
+ return False;
+  end if;
+   end Has_Enabled_Aspect;
+

-- Has_Volatile_Component --

@@ -20356,11 +20378,7 @@ package body Sem_Util is
   --  for efficiency.
 
   return Ada_Version >= Ada_2022
-and then Has_Aspect (Subp, Aspect_Static)
-and then
-  (No (Find_Value_Of_Aspect (Subp, Aspect_Static))
-or else Is_True (Static_Boolean
-   (Find_Value_Of_Aspect (Subp, Aspect_Static)))));
+and then Has_Enabled_Aspect (Subp, Aspect_Static);
end Is_Static_Function;
 
-
diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
index a5eb1ecd7c1..527b1075c3f 100644
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -1559,6 +1559,12 @@ package Sem_Util is
--  Given arbitrary type Typ, determine whether it contains at least one
--  effectively volatile component.
 
+   function Has_Enabled_Aspect (Id : Entity_Id; A : Aspect_Id) return Boolean
+ with Pre => A in Boolean_Aspects;
+   --  Returns True if a Boolean-valued aspect is enabled on entity Id; i.e. it
+   --  is present and either has no aspect definition or its aspect definition
+   --  statically evaluates to True.
+
function Has_Volatile_Component (Typ : Entity_Id) return Boolean;
--  Given arbitrary type Typ, determine whether it contains at least one
--  volatile component.
-- 
2.43.2



[COMMITTED] ada: Revert recent change for Put_Image and Object_Size attributes

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Recent change for attribute Object_Size caused spurious errors when
restriction No_Implementation_Attributes is active and attribute
Object_Size is introduced by expansion of dispatching operations.

Temporarily revert that change for further investigation.

gcc/ada/

* sem_attr.adb (Attribute_22): Remove Put_Image and Object_Size.
* sem_attr.ads (Attribute_Imp_Def): Restore Object_Size.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_attr.adb |  4 +---
 gcc/ada/sem_attr.ads | 11 +++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
index b979ffdf0b1..65442d45a85 100644
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -181,9 +181,7 @@ package body Sem_Attr is
  (Attribute_Enum_Rep |
   Attribute_Enum_Val |
   Attribute_Index|
-  Attribute_Object_Size  |
-  Attribute_Preelaborable_Initialization |
-  Attribute_Put_Image=> True,
+  Attribute_Preelaborable_Initialization => True,
   others => False);
 
--  The following array contains all attributes that imply a modification
diff --git a/gcc/ada/sem_attr.ads b/gcc/ada/sem_attr.ads
index 65b7b534711..4c9f27043c6 100644
--- a/gcc/ada/sem_attr.ads
+++ b/gcc/ada/sem_attr.ads
@@ -373,6 +373,17 @@ package Sem_Attr is
   --  other composite object passed by reference, there is no other way
   --  of specifying that a zero address should be passed.
 
+  -
+  -- Object_Size --
+  -
+
+  Attribute_Object_Size => True,
+  --  Type'Object_Size is the same as Type'Size for all types except
+  --  fixed-point types and discrete types. For fixed-point types and
+  --  discrete types, this attribute gives the size used for default
+  --  allocation of objects and components of the size. See section in
+  --  Einfo ("Handling of Type'Size values") for further details.
+
   -
   -- Passed_By_Reference --
   -
-- 
2.43.2



[COMMITTED] ada: Remove code that expected pre/post being split into conjuncts

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

The removed code is no longer needed (and causes assertion failures).
Most likely it should have been using the Split_PPC flag.

gcc/ada/

* sem_util.adb (Is_Potentially_Unevaluated): Remove code for
recovering the original structure of expressions with AND THEN.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 29 ++---
 1 file changed, 2 insertions(+), 27 deletions(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 1166c68b972..b5c33638b35 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -19582,39 +19582,14 @@ package body Sem_Util is
 
   --  Local variables
 
-  Par  : Node_Id;
   Expr : Node_Id;
+  Par  : Node_Id;
 
--  Start of processing for Is_Potentially_Unevaluated
 
begin
   Expr := N;
-  Par  := N;
-
-  --  A postcondition whose expression is a short-circuit is broken down
-  --  into individual aspects for better exception reporting. The original
-  --  short-circuit expression is rewritten as the second operand, and an
-  --  occurrence of 'Old in that operand is potentially unevaluated.
-  --  See sem_ch13.adb for details of this transformation. The reference
-  --  to 'Old may appear within an expression, so we must look for the
-  --  enclosing pragma argument in the tree that contains the reference.
-
-  while Present (Par)
-and then Nkind (Par) /= N_Pragma_Argument_Association
-  loop
- if Is_Rewrite_Substitution (Par)
-   and then Nkind (Original_Node (Par)) = N_And_Then
- then
-return True;
- end if;
-
- Par := Parent (Par);
-  end loop;
-
-  --  Other cases; 'Old appears within other expression (not the top-level
-  --  conjunct in a postcondition) with a potentially unevaluated operand.
-
-  Par := Parent (Expr);
+  Par  := Parent (Expr);
 
   while Present (Par)
 and then Nkind (Par) /= N_Pragma_Argument_Association
-- 
2.43.2



Re: [PATCH] tree-ssa-math-opts: Pattern recognize yet another .ADD_OVERFLOW pattern [PR113982]

2024-05-13 Thread Richard Biener
On Mon, 13 May 2024, Jakub Jelinek wrote:

> Hi!
> 
> We pattern recognize already many different patterns, and closest to the
> requested one also
>yc = (type) y;
>zc = (type) z;
>x = yc + zc;
>w = (typeof_y) x;
>if (x > max)
> where y/z has the same unsigned type and type is a wider unsigned type
> and max is maximum value of the narrower unsigned type.
> But apparently people are creative in writing this in different ways,
> this requests
>yc = (type) y;
>zc = (type) z;
>x = yc + zc;
>w = (typeof_y) x;
>if (x >> narrower_type_bits)
> 
> The following patch implements that.
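
A minimal sketch of the two source-level forms being discussed, assuming uint32_t as the narrower type and uint64_t as the wider one (the function names are invented for illustration and do not appear in the patch or its testcase). Both return the same answer for every input, which is why the second form can be treated like the first:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Already-recognized form: compare the wide sum against the narrow max.  */
static bool
add_overflows_cmp (uint32_t y, uint32_t z)
{
  uint64_t x = (uint64_t) y + (uint64_t) z;
  return x > UINT32_MAX;
}

/* Newly requested form: shift the wide sum right by the narrow precision
   and test the remaining high bits.  */
static bool
add_overflows_shift (uint32_t y, uint32_t z)
{
  uint64_t x = (uint64_t) y + (uint64_t) z;
  return (x >> 32) != 0;
}
```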
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.  Seeing the large matching code, I wonder if using a match
in match.pd might be easier to maintain (eh, and I'd still
like to somehow see "inline" match patterns in source files, not
sure how, but requiring some gen* program extracting them).

Thanks,
Richard.

> 2024-05-13  Jakub Jelinek  
> 
>   PR middle-end/113982
>   * tree-ssa-math-opts.cc (arith_overflow_check_p): Also return 1
>   for RSHIFT_EXPR by precision of maxval if shift result is only
>   used in a cast or comparison against zero.
>   (match_arith_overflow): Handle the RSHIFT_EXPR use case.
> 
>   * gcc.dg/pr113982.c: New test.
> 
> --- gcc/tree-ssa-math-opts.cc.jj  2024-04-11 09:26:36.318369218 +0200
> +++ gcc/tree-ssa-math-opts.cc 2024-05-10 18:17:08.795744811 +0200
> @@ -3947,6 +3947,66 @@ arith_overflow_check_p (gimple *stmt, gi
>else
>  return 0;
>  
> +  if (maxval
> +  && ccode == RSHIFT_EXPR
> +  && crhs1 == lhs
> +  && TREE_CODE (crhs2) == INTEGER_CST
> +  && wi::to_widest (crhs2) == TYPE_PRECISION (TREE_TYPE (maxval)))
> +{
> +  tree shiftlhs = gimple_assign_lhs (use_stmt);
> +  if (!shiftlhs)
> + return 0;
> +  use_operand_p use;
> +  if (!single_imm_use (shiftlhs, &use, &cur_use_stmt))
> + return 0;
> +  if (gimple_code (cur_use_stmt) == GIMPLE_COND)
> + {
> +   ccode = gimple_cond_code (cur_use_stmt);
> +   crhs1 = gimple_cond_lhs (cur_use_stmt);
> +   crhs2 = gimple_cond_rhs (cur_use_stmt);
> + }
> +  else if (is_gimple_assign (cur_use_stmt))
> + {
> +   if (gimple_assign_rhs_class (cur_use_stmt) == GIMPLE_BINARY_RHS)
> + {
> +   ccode = gimple_assign_rhs_code (cur_use_stmt);
> +   crhs1 = gimple_assign_rhs1 (cur_use_stmt);
> +   crhs2 = gimple_assign_rhs2 (cur_use_stmt);
> + }
> +   else if (gimple_assign_rhs_code (cur_use_stmt) == COND_EXPR)
> + {
> +   tree cond = gimple_assign_rhs1 (cur_use_stmt);
> +   if (COMPARISON_CLASS_P (cond))
> + {
> +   ccode = TREE_CODE (cond);
> +   crhs1 = TREE_OPERAND (cond, 0);
> +   crhs2 = TREE_OPERAND (cond, 1);
> + }
> +   else
> + return 0;
> + }
> +   else
> + {
> +   enum tree_code sc = gimple_assign_rhs_code (cur_use_stmt);
> +   tree castlhs = gimple_assign_lhs (cur_use_stmt);
> +   if (!CONVERT_EXPR_CODE_P (sc)
> +   || !castlhs
> +   || !INTEGRAL_TYPE_P (TREE_TYPE (castlhs))
> +   || (TYPE_PRECISION (TREE_TYPE (castlhs))
> +   > TYPE_PRECISION (TREE_TYPE (maxval))))
> + return 0;
> +   return 1;
> + }
> + }
> +  else
> + return 0;
> +  if ((ccode != EQ_EXPR && ccode != NE_EXPR)
> +   || crhs1 != shiftlhs
> +   || !integer_zerop (crhs2))
> + return 0;
> +  return 1;
> +}
> +
>if (TREE_CODE_CLASS (ccode) != tcc_comparison)
>  return 0;
>  
> @@ -4049,6 +4109,7 @@ arith_overflow_check_p (gimple *stmt, gi
> _8 = IMAGPART_EXPR <_7>;
> if (_8)
> and replace (utype) x with _9.
> +   Or with x >> popcount (max) instead of x > max.
>  
> Also recognize:
> x = ~z;
> @@ -4481,10 +4542,62 @@ match_arith_overflow (gimple_stmt_iterat
> gcc_checking_assert (is_gimple_assign (use_stmt));
> if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
>   {
> -   gimple_assign_set_rhs1 (use_stmt, ovf);
> -   gimple_assign_set_rhs2 (use_stmt, build_int_cst (type, 0));
> -   gimple_assign_set_rhs_code (use_stmt,
> -   ovf_use == 1 ? NE_EXPR : EQ_EXPR);
> +   if (gimple_assign_rhs_code (use_stmt) == RSHIFT_EXPR)
> + {
> +   g2 = gimple_build_assign (make_ssa_name (boolean_type_node),
> + ovf_use == 1 ? NE_EXPR : EQ_EXPR,
> + ovf, build_int_cst (type, 0));
> +   gimple_stmt_iterator gsiu = gsi_for_stmt (use_stmt);
> +   gsi_insert_before (&gsiu, g2, GSI_SAME_STMT);
> +   gimple_assign_set_rhs_with_ops (&gsiu, NOP_EXPR,
> +  

[COMMITTED] ada: Refine type of a local variable

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Code cleanup; semantics is unaffected.

gcc/ada/

* sem_util.adb (Has_No_Output): Iteration with
First_Formal/Next_Formal involves Entity_Ids.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index e9ab6650dac..03055039a1f 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -4203,7 +4203,7 @@ package body Sem_Util is
 ---
 
 function Has_No_Output (Subp : Entity_Id) return Boolean is
-   Param : Node_Id;
+   Param : Entity_Id;
 
 begin
--  A function has its result as output
-- 
2.43.2



[COMMITTED] ada: Fix crash on Compile_Time_Warning in dead code

2024-05-13 Thread Marc Poulhiès
From: Bob Duff 

If a pragma Compile_Time_Warning triggers, and the pragma
is later removed because it is dead code, then the compiler
can return a bad exit code. This causes gprbuild to report
"*** compilation phase failed".

This is because Total_Errors_Detected, which is declared as Nat,
goes negative, causing Constraint_Error. In assertions-off mode,
the Constraint_Error is not detected, but the compiler nonetheless
reports a bad exit code.

This patch prevents that negative count.
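
The arithmetic behind the fix can be sketched in C as follows. This is only an illustration of the clamping, with plain int standing in for the compiler's Int/Nat types and a made-up function name:

```c
#include <assert.h>

/* Recompute the error total when warnings are treated as errors.
   A Compile_Time_Warning in dead code is counted in
   COMPILE_TIME_PRAGMA_WARNINGS but not in WARNINGS, so the raw
   difference can drop below zero; clamp it like Int'Max (Total, 0).  */
static int
clamped_total_errors (int errors, int warnings, int info_messages,
                      int compile_time_pragma_warnings)
{
  int total = errors + warnings - info_messages
              - compile_time_pragma_warnings;
  return total > 0 ? total : 0;
}
```

With no errors, no counted warnings and one dead-code pragma warning, the unclamped total would be -1; the clamp yields 0.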

gcc/ada/

* errout.adb (Output_Messages): Protect against the total going
negative.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/errout.adb | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/errout.adb b/gcc/ada/errout.adb
index d28a410f47b..c4761bd1bc9 100644
--- a/gcc/ada/errout.adb
+++ b/gcc/ada/errout.adb
@@ -3399,11 +3399,16 @@ package body Errout is
 
   if Warning_Mode = Treat_As_Error then
  declare
-Compile_Time_Pragma_Warnings : constant Int :=
+Compile_Time_Pragma_Warnings : constant Nat :=
Count_Compile_Time_Pragma_Warnings;
- begin
-Total_Errors_Detected := Total_Errors_Detected + Warnings_Detected
+Total : constant Int := Total_Errors_Detected + Warnings_Detected
- Warning_Info_Messages - Compile_Time_Pragma_Warnings;
+--  We need to protect against a negative Total here, because
+--  if a pragma Compile_Time_Warning occurs in dead code, it
+--  gets counted in Compile_Time_Pragma_Warnings but not in
+--  Warnings_Detected.
+ begin
+Total_Errors_Detected := Int'Max (Total, 0);
 Warnings_Detected :=
Warning_Info_Messages + Compile_Time_Pragma_Warnings;
  end;
-- 
2.43.2



[COMMITTED] ada: Attributes Put_Image and Object_Size are defined by Ada 2022

2024-05-13 Thread Marc Poulhiès
From: Piotr Trojanek 

Recognize references to attributes Put_Image and Object_Size as
language-defined in Ada 2022 and implementation-defined in earlier
versions of Ada. Other attributes listed in Ada 2022 RM, K.2 and
currently implemented in GNAT are correctly categorized.

This change only affects code with restriction
No_Implementation_Attributes.

gcc/ada/

* sem_attr.adb (Attribute_22): Add Put_Image and Object_Size.
* sem_attr.ads (Attribute_Imp_Def): Remove Object_Size.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_attr.adb |  4 +++-
 gcc/ada/sem_attr.ads | 11 ---
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
index 65442d45a85..b979ffdf0b1 100644
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -181,7 +181,9 @@ package body Sem_Attr is
  (Attribute_Enum_Rep |
   Attribute_Enum_Val |
   Attribute_Index|
-  Attribute_Preelaborable_Initialization => True,
+  Attribute_Object_Size  |
+  Attribute_Preelaborable_Initialization |
+  Attribute_Put_Image=> True,
   others => False);
 
--  The following array contains all attributes that imply a modification
diff --git a/gcc/ada/sem_attr.ads b/gcc/ada/sem_attr.ads
index 4c9f27043c6..65b7b534711 100644
--- a/gcc/ada/sem_attr.ads
+++ b/gcc/ada/sem_attr.ads
@@ -373,17 +373,6 @@ package Sem_Attr is
   --  other composite object passed by reference, there is no other way
   --  of specifying that a zero address should be passed.
 
-  -
-  -- Object_Size --
-  -
-
-  Attribute_Object_Size => True,
-  --  Type'Object_Size is the same as Type'Size for all types except
-  --  fixed-point types and discrete types. For fixed-point types and
-  --  discrete types, this attribute gives the size used for default
-  --  allocation of objects and components of the size. See section in
-  --  Einfo ("Handling of Type'Size values") for further details.
-
   -
   -- Passed_By_Reference --
   -
-- 
2.43.2



[PATCH] testsuite: c++: Allow for std::printf in g++.dg/modules/stdio-1_a.H [PR98529]

2024-05-13 Thread Rainer Orth
g++.dg/modules/stdio-1_a.H currently FAILs on Solaris:

FAIL: g++.dg/modules/stdio-1_a.H -std=c++17  scan-lang-dump module "Depset:0 
decl entity:[0-9]* function_decl:'::printf'"
FAIL: g++.dg/modules/stdio-1_a.H -std=c++2a  scan-lang-dump module "Depset:0 
decl entity:[0-9]* function_decl:'::printf'"
FAIL: g++.dg/modules/stdio-1_a.H -std=c++2b  scan-lang-dump module "Depset:0 
decl entity:[0-9]* function_decl:'::printf'"

The problem is that the module file doesn't contain

 Depset:0 decl entity:95 function_decl:'::printf'

as expected by the test, but

 Depset:0 decl entity:26 function_decl:'::std::printf'

This happens because Solaris <stdio.h> declares printf in namespace std
as allowed by C++11, Annex D, D.5.

This patch allows for both forms.
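
To illustrate what the relaxed pattern accepts, here is a small sketch that applies the same expression with POSIX regcomp/regexec. That is an assumption made for the example only: the testsuite evaluates the pattern with Tcl regexps, not POSIX EREs, and matches_printf_depset is a made-up helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if LINE contains the printf depset entry, with or without
   the ::std qualification.  */
static bool
matches_printf_depset (const char *line)
{
  static const char pattern[] =
    "Depset:0 decl entity:[0-9]* function_decl:'(::std)?::printf'";
  regex_t re;
  bool ok;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;
  ok = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}
```

Both dump lines quoted above match, while a printf in some other namespace does not.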

Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
x86_64-pc-linux-gnu.

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-05-13  Rainer Orth  

gcc/testsuite:
PR c++/98529
* g++.dg/modules/stdio-1_a.H (scan-lang-dump): Allow for
::std::printf.

diff --git a/gcc/testsuite/g++.dg/modules/stdio-1_a.H b/gcc/testsuite/g++.dg/modules/stdio-1_a.H
--- a/gcc/testsuite/g++.dg/modules/stdio-1_a.H
+++ b/gcc/testsuite/g++.dg/modules/stdio-1_a.H
@@ -10,5 +10,5 @@
 #endif
 // There should be *lots* of depsets (209 for glibc today)
 // { dg-final { scan-lang-dump {Writing section:60 } module } }
-// { dg-final { scan-lang-dump {Depset:0 decl entity:[0-9]* function_decl:'::printf'} module } }
+// { dg-final { scan-lang-dump {Depset:0 decl entity:[0-9]* function_decl:'(::std)?::printf'} module } }
 // { dg-final { scan-lang-dump {Depset:1 binding namespace_decl:'::printf'} module } }


RE: [EXTERNAL] [COMMITTED] Regenerate cygming.opt.urls and mingw.opt.urls

2024-05-13 Thread Evgeny Karpov
Sunday, May 12, 2024
Mark Wielaard  wrote:

> The new cygming.opt.urls and mingw.opt.urls in the
> gcc/config/mingw/cygming.opt.urls directory need to be generated by make
> regenerate-opt-urls in the gcc subdirectory. They still contained references 
> to
> the gcc/config/i386 directory from which they were copied.
> 
> Fixes: 1f05dfc131c7 ("Reuse MinGW from i386 for AArch64")
> Fixes: e8d003736e6c ("Rename "x86 Windows Options" to "Cygwin and
> MinGW Options"")
> 
> gcc/ChangeLog:
> 
>   * config/mingw/cygming.opt.urls: Regenerate.
>   * config/mingw/mingw.opt.urls: Likewise.
> ---

Hello Mark, 
Thank you for reviewing our changes related to the refactoring of extracting 
the MinGW implementation from ix64. 

It was expected to move the MinGW-related files without changes in this commit 
("Reuse MinGW from i386 for AArch64") and apply the renaming in a follow-up 
commit, which has been done in 'Rename "x86 Windows Options" to "Cygwin and 
MinGW Options"'. 

The script to update opt.urls files has been used.

>  gcc/config/mingw/cygming.opt.urls | 7 +++
>  gcc/config/mingw/mingw.opt.urls   | 2 +-
>  2 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/config/mingw/cygming.opt.urls
> b/gcc/config/mingw/cygming.opt.urls
> index c624e22e4427..af11c4997609 100644
> --- a/gcc/config/mingw/cygming.opt.urls
> +++ b/gcc/config/mingw/cygming.opt.urls
> @@ -1,4 +1,4 @@

> -; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/cygming.opt
> and generated HTML
> +; Autogenerated by regenerate-opt-urls.py from
> +gcc/config/mingw/cygming.opt and generated HTML

I am not sure why this comment was not updated. Is it critical, or could it
be updated the next time it is needed?

>
>  mconsole
>  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mconsole)
> @@ -9,9 +9,8 @@ UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-
> mdll)
>  mnop-fun-dllimport
>  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mnop-fun-dllimport)
> 
> -; skipping UrlSuffix for 'mthreads' due to multiple URLs:
> -;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1'
> -;   duplicate: 'gcc/x86-Options.html#index-mthreads'
> +mthreads
> +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1)

mthreads had the same issue before applying the changes. Has something been
changed recently?
This is the change made in the patch series by the 'Rename "x86 Windows
Options" to "Cygwin and MinGW Options"' commit.

; skipping UrlSuffix for 'mthreads' due to multiple URLs:
+;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1'
 ;   duplicate: 'gcc/x86-Options.html#index-mthreads'
-;   duplicate: 'gcc/x86-Windows-Options.html#index-mthreads-1'

Regards,
Evgeny

>  mwin32
>  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mwin32)
> diff --git a/gcc/config/mingw/mingw.opt.urls
> b/gcc/config/mingw/mingw.opt.urls index f8ee5be6a535..40fb086606b2
> 100644

> --- a/gcc/config/mingw/mingw.opt.urls
> +++ b/gcc/config/mingw/mingw.opt.urls
> @@ -1,4 +1,4 @@
> -; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/mingw.opt
> and generated HTML
> +; Autogenerated by regenerate-opt-urls.py from
> +gcc/config/mingw/mingw.opt and generated HTML
> 
>  mcrtdll=
>  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mcrtdll)
> --
> 2.39.3



Re: [pushed 00/21] Various backports to gcc 13 (analyzer, jit, diagnostics)

2024-05-13 Thread Jakub Jelinek
On Thu, May 09, 2024 at 01:42:15PM -0400, David Malcolm wrote:
> I've pushed the following changes to releases/gcc-13
> as r13-8741-g89feb3557a0188 through r13-8761-gb7a2697733d19a.

Unfortunately many of the commits contained git commit message wording
that update_git_version can't cope with.
Wording like
(cherry picked from commit r14-1664-gfe9771b59f576f)
is wrong,
(cherry picked from commit .)
is reserved solely for what one gets from git cherry-pick -x
(i.e. the full commit hash without anything extra).

I had to ignore the following commits in the ChangeLog generation
because of this:

89feb3557a018893cfe50c2e07f91559bd3cde2b
ccf8d3e3d26c6ba3d5e11fffeed8d64018e9c060
e0c52905f666e3d23881f82dbf39466a24f009f4
b38472ffc1e631bd357573b44d956ce16d94e666
a0b13d0860848dd5f2876897ada1e22e4e681e91
b8c772cae97b54386f7853edf0f9897012bfa90b
810d35a7e054bcbb5b66d2e5924428e445f5fba9
0df1ee083434ac00ecb19582b1e5b25e105981b2
2c688f6afce4cbb414f5baab1199cd525f309fca
60dcb710b6b4aa22ea96abc8df6dfe9067f3d7fe
44968a0e00f656e9bb3e504bb2fa1a8282002015

Can you please add the ChangeLog entries for these by hand
(commits which only touch ChangeLog files are allowed and shouldn't
contain ChangeLog style entry in the commit message)?

Thanks.

Jakub



RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-13 Thread Tamar Christina
Hi Pan,

> -Original Message-
> From: pan2...@intel.com 
> Sent: Monday, May 6, 2024 3:48 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com;
> hongtao@intel.com; Pan Li 
> Subject: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned 
> scalar
> int
> 
> From: Pan Li 
> 
> This patch adds the middle-end representation for saturating add,
> i.e. the result of the add is set to the maximum value on overflow.
> It matches a pattern similar to the one below.
> 
> SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
> 
> Take uint8_t as example, we will have:
> 
> * SAT_ADD (1, 254)   => 255.
> * SAT_ADD (1, 255)   => 255.
> * SAT_ADD (2, 255)   => 255.
> * SAT_ADD (255, 255) => 255.
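
As a minimal sketch (not part of the patch), the formula above can be instantiated for uint8_t in plain C. The function name sat_add_u8 is made up for illustration; the body follows the quoted expression, with explicit casts added because C promotes uint8_t operands to int:

```c
#include <assert.h>
#include <stdint.h>

/* Branchless unsigned saturating add, following
   (x + y) | (-(TYPE)((TYPE)(x + y) < x)).  */
static uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);      /* wraps modulo 256  */
  uint8_t mask = (uint8_t) -(sum < x);  /* 0xff if the add wrapped, else 0  */
  return (uint8_t) (sum | mask);
}
```

The wrapped sum is smaller than x exactly when the addition overflowed, so the mask forces all bits to one in that case and leaves the sum untouched otherwise.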
> 
> Given the example below for the unsigned scalar integer uint64_t:
> 
> uint64_t sat_add_u64 (uint64_t x, uint64_t y)
> {
>   return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
> }
> 
> Before this patch:
> uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> {
>   long unsigned int _1;
>   _Bool _2;
>   long unsigned int _3;
>   long unsigned int _4;
>   uint64_t _7;
>   long unsigned int _10;
>   __complex__ long unsigned int _11;
> 
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
>   _1 = REALPART_EXPR <_11>;
>   _10 = IMAGPART_EXPR <_11>;
>   _2 = _10 != 0;
>   _3 = (long unsigned int) _2;
>   _4 = -_3;
>   _7 = _1 | _4;
>   return _7;
> ;;succ:   EXIT
> 
> }
> 
> After this patch:
> uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> {
>   uint64_t _7;
> 
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
>   return _7;
> ;;succ:   EXIT
> }
> 
> We perform the transform during widen_mult because the sub-expr of
> SAT_ADD will be optimized to .ADD_OVERFLOW.  We need to try the .SAT_ADD
> pattern first and then .ADD_OVERFLOW, or we may never catch the pattern
> .SAT_ADD.  Meanwhile, the isel pass is after widen_mult and then we
> cannot perform the .SAT_ADD pattern match as the sub-expr will be
> optimized to .ADD_OVERFLOW first.
> 
> The tests below pass with this patch:
> 1. The riscv full regression tests.
> 2. The aarch64 full regression tests.
> 3. The x86 bootstrap tests.
> 4. The x86 full regression tests.
> 
>   PR target/51492
>   PR target/112600
> 
> gcc/ChangeLog:
> 
>   * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
>   to the return true switch case(s).
>   * internal-fn.def (SAT_ADD):  Add new signed optab SAT_ADD.
>   * match.pd: Add unsigned SAT_ADD match.
>   * optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd.
>   * tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_add): New extern
>   func decl generated in match.pd match.
>   (match_saturation_arith): New func impl to match the saturation arith.
>   (math_opts_dom_walker::after_dom_children): Try match saturation
>   arith.
> 
> Signed-off-by: Pan Li 
> ---
>  gcc/internal-fn.cc|  1 +
>  gcc/internal-fn.def   |  2 ++
>  gcc/match.pd  | 28 
>  gcc/optabs.def|  4 ++--
>  gcc/tree-ssa-math-opts.cc | 46
> +++
>  5 files changed, 79 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 0a7053c2286..73045ca8c8c 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -4202,6 +4202,7 @@ commutative_binary_fn_p (internal_fn fn)
>  case IFN_UBSAN_CHECK_MUL:
>  case IFN_ADD_OVERFLOW:
>  case IFN_MUL_OVERFLOW:
> +case IFN_SAT_ADD:
>  case IFN_VEC_WIDEN_PLUS:
>  case IFN_VEC_WIDEN_PLUS_LO:
>  case IFN_VEC_WIDEN_PLUS_HI:
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 848bb9dbff3..25badbb86e5 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -275,6 +275,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST
> | ECF_NOTHROW, first,
>  DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW,
> first,
> smulhrs, umulhrs, binary)
> 
> +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd,
> binary)
> +
>  DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
>  DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
>  DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary)
> diff --git a/gcc/match.pd b/gcc/match.pd
> index d401e7503e6..7058e4cbe29 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3043,6 +3043,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> || POINTER_TYPE_P (itype))
>&& wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))
> 
> +/* Unsigned Saturation Add */
> +(match (usadd_left_part @0 @1)
> + (plus:c @0 @1)
> + (if (INTEGRAL_TYPE_P (type)
> +  && TYPE_UNSIGNED (TREE_TYPE (@0))
> +  && types_match (type, TREE_TYPE (@0))
> +  && types_match (type, TREE

RE: [PATCH v4 2/3] VECT: Support new IFN SAT_ADD for unsigned vector int

2024-05-13 Thread Tamar Christina
Hi Pan,

> -Original Message-
> From: pan2...@intel.com 
> Sent: Monday, May 6, 2024 3:49 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com;
> hongtao@intel.com; Pan Li 
> Subject: [PATCH v4 2/3] VECT: Support new IFN SAT_ADD for unsigned vector int
> 
> From: Pan Li 
> 
> This patch depends on below scalar enabling patch:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html
> 
> For vectorization, we leverage the existing vect pattern recog to find
> the pattern similar to the scalar one and let the vectorizer perform
> the rest for the standard name usadd<mode>3 in vector mode.
> The riscv vector backend has the insn "Vector Single-Width Saturating
> Add and Subtract", which can be leveraged when expanding usadd<mode>3
> in vector mode.  For example:
> 
> void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
> {
>   unsigned i;
> 
>   for (i = 0; i < n; i++)
> out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
> }
> 
> Before this patch:
> void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
> {
>   ...
>   _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]);
>   ivtmp_58 = _80 * 8;
>   vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0);
>   vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0);
>   vect__7.11_66 = vect__4.7_61 + vect__6.10_65;
>   mask__8.12_67 = vect__4.7_61 > vect__7.11_66;
>   vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615,
> ... }, vect__7.11_66);
>   .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72);
>   vectp_x.5_60 = vectp_x.5_59 + ivtmp_58;
>   vectp_y.8_64 = vectp_y.8_63 + ivtmp_58;
>   vectp_out.16_75 = vectp_out.16_74 + ivtmp_58;
>   ivtmp_79 = ivtmp_78 - _80;
>   ...
> }
> 
> After this patch:
> void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
> {
>   ...
>   _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]);
>   ivtmp_46 = _62 * 8;
>   vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0);
>   vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0);
>   vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53);
>   .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54);
>   ...
> }
> 
> The test suites below pass with this patch.
> * The riscv full regression tests.
> * The aarch64 full regression tests.
> * The x86 bootstrap tests.
> * The x86 full regression tests.
> 
>   PR target/51492
>   PR target/112600
> 
> gcc/ChangeLog:
> 
>   * tree-vect-patterns.cc (gimple_unsigned_integer_sat_add): New func
>   decl generated by match.pd match.
>   (vect_recog_sat_add_pattern): New func impl to recog the pattern
>   for unsigned SAT_ADD.
> 
> Signed-off-by: Pan Li 

Patch looks good to me, but I cannot approve so I'll pass it on to Richi.

Cheers,
Tamar
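
[Editorial sketch, not part of the patch.]  The scalar idiom that the
pattern recognizer matches can be exercised on its own.  The code below is a
minimal illustration under stated assumptions: the helper names sat_add_u64,
sat_add_u64_ref and check_sat_add are made up here, and only the branchless
expression and the vec_sat_add_u64 loop shape come from the commit message.

```cpp
#include <cassert>
#include <stdint.h>

// Branchless unsigned saturating add, the form quoted in the commit message.
// If x + y wraps around, (sum < x) is 1, so -(uint64_t)1 is all ones and the
// OR saturates the result to UINT64_MAX; otherwise the mask is 0.
static inline uint64_t sat_add_u64 (uint64_t x, uint64_t y)
{
  uint64_t sum = x + y;
  return sum | (-(uint64_t)(sum < x));
}

// Reference version with an explicit branch (hypothetical helper, used only
// to cross-check the branchless form).
static inline uint64_t sat_add_u64_ref (uint64_t x, uint64_t y)
{
  return (y > UINT64_MAX - x) ? UINT64_MAX : x + y;
}

// Loop form, the shape a vectorizer could turn into a vector SAT_ADD when
// the target supports it.
void vec_sat_add_u64 (uint64_t *out, const uint64_t *x, const uint64_t *y,
                      unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = sat_add_u64 (x[i], y[i]);
}

// Self-check: both forms agree on a handful of edge values.
void check_sat_add ()
{
  const uint64_t samples[] = { 0, 1, 2, UINT64_MAX - 1, UINT64_MAX };
  for (uint64_t x : samples)
    for (uint64_t y : samples)
      assert (sat_add_u64 (x, y) == sat_add_u64_ref (x, y));
}
```

Whether a given compiler actually emits .SAT_ADD for this loop depends on
the target and on the patches discussed in this thread; the sketch only
shows the arithmetic.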

> ---
>  gcc/tree-vect-patterns.cc | 51 +++
>  1 file changed, 51 insertions(+)
> 
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 87c2acff386..8ffcaf71d5c 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -4487,6 +4487,56 @@ vect_recog_mult_pattern (vec_info *vinfo,
>return pattern_stmt;
>  }
> 
> +extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
> +
> +/*
> + * Try to detect saturation add pattern (SAT_ADD), aka below gimple:
> + *   _7 = _4 + _6;
> + *   _8 = _4 > _7;
> + *   _9 = (long unsigned int) _8;
> + *   _10 = -_9;
> + *   _12 = _7 | _10;
> + *
> + * And then simplified to
> + *   _12 = .SAT_ADD (_4, _6);
> + */
> +
> +static gimple *
> +vect_recog_sat_add_pattern (vec_info *vinfo, stmt_vec_info stmt_vinfo,
> + tree *type_out)
> +{
> +  gimple *last_stmt = STMT_VINFO_STMT (stmt_vinfo);
> +
> +  if (!is_gimple_assign (last_stmt))
> +return NULL;
> +
> +  tree res_ops[2];
> +  tree lhs = gimple_assign_lhs (last_stmt);
> +
> +  if (gimple_unsigned_integer_sat_add (lhs, res_ops, NULL))
> +{
> +  tree itype = TREE_TYPE (res_ops[0]);
> +  tree vtype = get_vectype_for_scalar_type (vinfo, itype);
> +
> +  if (vtype != NULL_TREE && direct_internal_fn_supported_p (
> + IFN_SAT_ADD, vtype, OPTIMIZE_FOR_SPEED))
> + {
> +   *type_out = vtype;
> +   gcall *call = gimple_build_call_internal (IFN_SAT_ADD, 2, res_ops[0],
> + res_ops[1]);
> +
> +   gimple_call_set_lhs (call, vect_recog_temp_ssa_var (itype, NULL));
> +   gimple_call_set_nothrow (call, /* nothrow_p */ false);
> +   gimple_set_location (call, gimple_location (last_stmt));
> +
> +   vect_pattern_detected ("vect_recog_sat_add_pattern", last_stmt);
> +   return call;
> + }
> +}
> +
> +  return NULL;
> +}
> +
>  /* Detect a signed division by a constant that wouldn't be
> otherwi

Re: [EXTERNAL] [COMMITTED] Regenerate cygming.opt.urls and mingw.opt.urls

2024-05-13 Thread Mark Wielaard
Hi Evgeny,

Adding David to the CC, who might know the details.

On Mon, May 13, 2024 at 08:44:12AM +, Evgeny Karpov wrote:
> Sunday, May 12, 2024
>
> Thank you for reviewing our changes related to the refactoring of
> extracting the MinGW implementation from ix64.
>
> It was expected to move the MinGW-related files without changes in
> this commit ("Reuse MinGW from i386 for AArch64") and apply the
> renaming in a follow-up commit, which has been done in 'Rename "x86
> Windows Options" to "Cygwin and MinGW Options"'.
>
> The script to update opt.urls files has been used.
> 
> > diff --git a/gcc/config/mingw/cygming.opt.urls
> > b/gcc/config/mingw/cygming.opt.urls
> > index c624e22e4427..af11c4997609 100644
> > --- a/gcc/config/mingw/cygming.opt.urls
> > +++ b/gcc/config/mingw/cygming.opt.urls
> > @@ -1,4 +1,4 @@
> 
> > -; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/cygming.opt
> > and generated HTML
> > +; Autogenerated by regenerate-opt-urls.py from
> > +gcc/config/mingw/cygming.opt and generated HTML
> 
> I am not sure why this comment has not been updated. Is it critical
> or it could be updated next time when it is needed?

Odd that the script didn't update this comment, it really should have.
It might be that running the script through make regenerate-opt-urls
inside the gcc build subdir invokes regenerate-opt-urls.py slightly
differently so that this line is updated.

> >  mconsole
> >  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mconsole)
> > @@ -9,9 +9,8 @@ UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-
> > mdll)
> >  mnop-fun-dllimport
> >  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mnop-fun-dllimport)
> > 
> > -; skipping UrlSuffix for 'mthreads' due to multiple URLs:
> > -;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1'
> > -;   duplicate: 'gcc/x86-Options.html#index-mthreads'
> > +mthreads
> > +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1)
> 
> mthreads has the same issue before applying changes. Has something been 
> changed recently?
> This is the change in patch series in 'Rename "x86 Windows Options" to 
> "Cygwin and MinGW Options"' commit.
> 
> ; skipping UrlSuffix for 'mthreads' due to multiple URLs:
> +;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1'
>  ;   duplicate: 'gcc/x86-Options.html#index-mthreads'
> -;   duplicate: 'gcc/x86-Windows-Options.html#index-mthreads-1'

Again, this might be caused by invoking the script by hand rather than via
make regenerate-opt-urls.  I believe that with the make target it will
renumber the suffixes, making sure the URLs are unique.

BTW. There is a CI buildbot that tries to regenerate all generated
files, which is how I spotted this:
https://builder.sourceware.org/buildbot/#/builders/gcc-autoregen
(It should also send email to the author of the patch on failure.)

Cheers,

Mark


RE: [PATCH] Allow patterns in SLP reductions

2024-05-13 Thread Tamar Christina
> -Original Message-
> From: Richard Biener 
> Sent: Friday, May 10, 2024 2:07 PM
> To: Richard Biener 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Allow patterns in SLP reductions
> 
> On Fri, Mar 1, 2024 at 10:21 AM Richard Biener  wrote:
> >
> > The following removes the over-broad rejection of patterns for SLP
> > reductions which is done by removing them from LOOP_VINFO_REDUCTIONS
> > during pattern detection.  That's also insufficient in case the
> > pattern only appears on the reduction path.  Instead this implements
> > the proper correctness check in vectorizable_reduction and guides
> > SLP discovery to heuristically avoid forming later invalid groups.
> >
> > I also couldn't find any testcase that FAILs when allowing the SLP
> > reductions to form so I've added one.
> >
> > I came across this for single-lane SLP reductions with the all-SLP
> > work where we rely on patterns to properly vectorize COND_EXPR
> > reductions.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.
> 
> Re-bootstrapped/tested, r15-361-g52d4691294c847

Awesome!

Does this now allow us to write new reductions using patterns? i.e. widening 
reductions?

Cheers,
Tamar
> 
> Richard.
> 
> > Richard.
> >
> > * tree-vect-patterns.cc (vect_pattern_recog_1): Do not
> > remove reductions involving patterns.
> > * tree-vect-loop.cc (vectorizable_reduction): Reject SLP
> > reduction groups with multiple lane-reducing reductions.
> > * tree-vect-slp.cc (vect_analyze_slp_instance): When discovering
> > SLP reduction groups avoid including lane-reducing ones.
> >
> > * gcc.dg/vect/vect-reduc-sad-9.c: New testcase.
> > ---
> >  gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c | 68 
> >  gcc/tree-vect-loop.cc| 15 +
> >  gcc/tree-vect-patterns.cc| 13 
> >  gcc/tree-vect-slp.cc | 26 +---
> >  4 files changed, 101 insertions(+), 21 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c
> >
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c
> b/gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c
> > new file mode 100644
> > index 000..3c6af4510f4
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c
> > @@ -0,0 +1,68 @@
> > +/* Disabling epilogues until we find a better way to deal with scans.  */
> > +/* { dg-additional-options "--param vect-epilogues-nomask=0" } */
> > +/* { dg-additional-options "-msse4.2" { target { x86_64-*-* i?86-*-* } } } 
> > */
> > +/* { dg-require-effective-target vect_usad_char } */
> > +
> > +#include 
> > +#include "tree-vect.h"
> > +
> > +#define N 64
> > +
> > +unsigned char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> > +unsigned char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> > +int abs (int);
> > +
> > +/* Sum of absolute differences between arrays of unsigned char types.
> > +   Detected as a sad pattern.
> > +   Vectorized on targets that support sad for unsigned chars.  */
> > +
> > +__attribute__ ((noinline)) int
> > +foo (int len, int *res2)
> > +{
> > +  int i;
> > +  int result = 0;
> > +  int result2 = 0;
> > +
> > +  for (i = 0; i < len; i++)
> > +{
> > +  /* Make sure we are not using an SLP reduction for this.  */
> > +  result += abs (X[2*i] - Y[2*i]);
> > +  result2 += abs (X[2*i + 1] - Y[2*i + 1]);
> > +}
> > +
> > +  *res2 = result2;
> > +  return result;
> > +}
> > +
> > +
> > +int
> > +main (void)
> > +{
> > +  int i;
> > +  int sad;
> > +
> > +  check_vect ();
> > +
> > +  for (i = 0; i < N/2; i++)
> > +{
> > +  X[2*i] = i;
> > +  Y[2*i] = N/2 - i;
> > +  X[2*i+1] = i;
> > +  Y[2*i+1] = 0;
> > +  __asm__ volatile ("");
> > +}
> > +
> > +
> > +  int sad2;
> > +  sad = foo (N/2, &sad2);
> > +  if (sad != (N/2)*(N/4))
> > +abort ();
> > +  if (sad2 != (N/2-1)*(N/2)/2)
> > +abort ();
> > +
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } 
> > } */
> > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> > +
> > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > index 35f1f8c7d42..13dcdba403a 100644
> > --- a/gcc/tree-vect-loop.cc
> > +++ b/gcc/tree-vect-loop.cc
> > @@ -7703,6 +7703,21 @@ vectorizable_reduction (loop_vec_info loop_vinfo,
> >return false;
> >  }
> >
> > +  /* Lane-reducing ops also never can be used in a SLP reduction group
> > + since we'll mix lanes belonging to different reductions.  But it's
> > + OK to use them in a reduction chain or when the reduction group
> > + has just one element.  */
> > +  if (lane_reduc_code_p
> > +  && slp_node
> > +  && !REDUC_GROUP_FIRST_ELEMENT (stmt_info)
> > +  && SLP_TREE_LANES (slp_node) > 1)
> > +{
> > +  if (dump_enabled_p ())
> > +   dump_printf_loc (MSG_MISS

Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]

2024-05-13 Thread Jakub Jelinek
On Fri, May 10, 2024 at 03:59:25PM -0400, Jason Merrill wrote:
> > 2024-05-09  Jakub Jelinek  
> > Jason Merrill  
> > 
> > PR lto/113208
> > * cp-tree.h (maybe_optimize_cdtor): Remove.
> > * decl2.cc (tentative_decl_linkage): Call maybe_make_one_only
> > for implicit instantiations of maybe in charge ctors/dtors
> > declared inline.
> > (import_export_decl): Don't call maybe_optimize_cdtor.
> > (c_parse_final_cleanups): Formatting fixes.
> > * optimize.cc (can_alias_cdtor): Adjust condition, for
> > HAVE_COMDAT_GROUP && DECL_ONE_ONLY && DECL_WEAK return true even
> > if not DECL_INTERFACE_KNOWN.
> 
> > --- gcc/cp/optimize.cc.jj   2024-04-25 20:33:30.771858912 +0200
> > +++ gcc/cp/optimize.cc  2024-05-09 17:10:23.920478922 +0200
> > @@ -220,10 +220,8 @@ can_alias_cdtor (tree fn)
> > gcc_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
> > /* Don't use aliases for weak/linkonce definitions unless we can put 
> > both
> >symbols in the same COMDAT group.  */
> > -  return (DECL_INTERFACE_KNOWN (fn)
> > - && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
> > - && (!DECL_ONE_ONLY (fn)
> > - || (HAVE_COMDAT_GROUP && DECL_WEAK (fn;
> > +  return (DECL_WEAK (fn) ? (HAVE_COMDAT_GROUP && DECL_ONE_ONLY (fn))
> > +: (DECL_INTERFACE_KNOWN (fn) && !DECL_ONE_ONLY (fn)));
> 
> Hmm, would
> 
> (DECL_ONE_ONLY (fn) ? HAVE_COMDAT_GROUP
>  : (DECL_INTERFACE_KNOWN (fn) && !DECL_WEAK (fn)))
> 
> make sense instead?  I don't think DECL_WEAK is necessary for COMDAT.

I think it isn't indeed necessary for COMDAT, although e.g. comdat_linkage
will not call make_decl_one_only if !flag_weak.

But I think it is absolutely required for the alias cdtor optimization
in question, because otherwise it would be an ABI change.
Consider an older version of GCC or some other compiler emitting
_ZN6vectorI12QualityValueEC1ERKS1_
and
_ZN6vectorI12QualityValueEC2ERKS1_
symbols not as aliases, each in their own comdat groups, so
.text._ZN6vectorI12QualityValueEC1ERKS1_ in _ZN6vectorI12QualityValueEC1ERKS1_
comdat group and
.text._ZN6vectorI12QualityValueEC2ERKS1_ in _ZN6vectorI12QualityValueEC2ERKS1_
comdat group.  And then comes GCC with the above patch without the DECL_WEAK
check in there, and decides to use an alias, so
_ZN6vectorI12QualityValueEC1ERKS1_ is an alias to
_ZN6vectorI12QualityValueEC2ERKS1_ and both live in
.text._ZN6vectorI12QualityValueEC2ERKS1_ section in
_ZN6vectorI12QualityValueEC5ERKS1_ comdat group.  If you mix TUs with this,
the linker can keep one of the section sets from the
_ZN6vectorI12QualityValueEC1ERKS1_ and _ZN6vectorI12QualityValueEC2ERKS1_
and _ZN6vectorI12QualityValueEC5ERKS1_ comdat groups.  If there is no .weak
for the symbols, this will fail to link: one can emit it either the old way
or the new way but never both, as it is part of the ABI.
While with .weak, mixing it is possible, worst case one gets some unused
code in the linked binary or shared library.  Of course the desirable case
is that there is no mixing and there is no unused code, but if it happens,
no big deal.  Without .weak it is a big deal.

Jakub
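
[Editorial sketch, not GCC code.]  To see exactly where the two candidate
conditions differ, they can be modelled with plain booleans and compared over
all flag combinations.  Everything below is an assumption made for
illustration: ToyDecl and the function names are hypothetical, and the bool
fields merely stand in for DECL_WEAK, DECL_ONE_ONLY, DECL_INTERFACE_KNOWN and
HAVE_COMDAT_GROUP.

```cpp
#include <cassert>

// Toy stand-in for the relevant properties of a function decl.
struct ToyDecl
{
  bool weak;             // models DECL_WEAK
  bool one_only;         // models DECL_ONE_ONLY
  bool interface_known;  // models DECL_INTERFACE_KNOWN
};

// Condition as written in the patch under review.
bool can_alias_patch (const ToyDecl &d, bool have_comdat_group)
{
  return d.weak ? (have_comdat_group && d.one_only)
                : (d.interface_known && !d.one_only);
}

// Alternative condition suggested in the review.
bool can_alias_alternative (const ToyDecl &d, bool have_comdat_group)
{
  return d.one_only ? have_comdat_group
                    : (d.interface_known && !d.weak);
}

// Count the flag combinations on which the two conditions disagree.
int count_disagreements (bool have_comdat_group)
{
  int n = 0;
  for (int bits = 0; bits < 8; bits++)
    {
      ToyDecl d = { (bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0 };
      if (can_alias_patch (d, have_comdat_group)
          != can_alias_alternative (d, have_comdat_group))
        {
          // Every disagreement is a one-only function that is not weak.
          assert (d.one_only && !d.weak);
          n++;
        }
    }
  return n;
}
```

In this toy model the two conditions disagree only for one-only, non-weak
functions when comdat groups are available, which is the case the ABI
argument above is about.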



RE: [PATCH] Allow patterns in SLP reductions

2024-05-13 Thread Richard Biener
On Mon, 13 May 2024, Tamar Christina wrote:

> > -Original Message-
> > From: Richard Biener 
> > Sent: Friday, May 10, 2024 2:07 PM
> > To: Richard Biener 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [PATCH] Allow patterns in SLP reductions
> > 
> > On Fri, Mar 1, 2024 at 10:21 AM Richard Biener  wrote:
> > >
> > > The following removes the over-broad rejection of patterns for SLP
> > > reductions which is done by removing them from LOOP_VINFO_REDUCTIONS
> > > during pattern detection.  That's also insufficient in case the
> > > pattern only appears on the reduction path.  Instead this implements
> > > the proper correctness check in vectorizable_reduction and guides
> > > SLP discovery to heuristically avoid forming later invalid groups.
> > >
> > > I also couldn't find any testcase that FAILs when allowing the SLP
> > > reductions to form so I've added one.
> > >
> > > I came across this for single-lane SLP reductions with the all-SLP
> > > work where we rely on patterns to properly vectorize COND_EXPR
> > > reductions.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.
> > 
> > Re-bootstrapped/tested, r15-361-g52d4691294c847
> 
> Awesome!
> 
> Does this now allow us to write new reductions using patterns? i.e. 
> widening reductions?

Yes (SLP reductions, that is).  This is really only for SLP reductions
(not SLP reduction chains, not non-SLP reductions).  So it's just
a corner case, but since with SLP-only the non-SLP reductions become
SLP reductions with a single lane, that was important to fix ;)

Richard.

> Cheers,
> Tamar
> > 
> > Richard.
> > 
> > > Richard.
> > >
> > > * tree-vect-patterns.cc (vect_pattern_recog_1): Do not
> > > remove reductions involving patterns.
> > > * tree-vect-loop.cc (vectorizable_reduction): Reject SLP
> > > reduction groups with multiple lane-reducing reductions.
> > > * tree-vect-slp.cc (vect_analyze_slp_instance): When discovering
> > > SLP reduction groups avoid including lane-reducing ones.
> > >
> > > * gcc.dg/vect/vect-reduc-sad-9.c: New testcase.
> > > ---
> > >  gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c | 68 
> > >  gcc/tree-vect-loop.cc| 15 +
> > >  gcc/tree-vect-patterns.cc| 13 
> > >  gcc/tree-vect-slp.cc | 26 +---
> > >  4 files changed, 101 insertions(+), 21 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c
> > >
> > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c
> > b/gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c
> > > new file mode 100644
> > > index 000..3c6af4510f4
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-sad-9.c
> > > @@ -0,0 +1,68 @@
> > > +/* Disabling epilogues until we find a better way to deal with scans.  */
> > > +/* { dg-additional-options "--param vect-epilogues-nomask=0" } */
> > > +/* { dg-additional-options "-msse4.2" { target { x86_64-*-* i?86-*-* } } 
> > > } */
> > > +/* { dg-require-effective-target vect_usad_char } */
> > > +
> > > +#include 
> > > +#include "tree-vect.h"
> > > +
> > > +#define N 64
> > > +
> > > +unsigned char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> > > +unsigned char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> > > +int abs (int);
> > > +
> > > +/* Sum of absolute differences between arrays of unsigned char types.
> > > +   Detected as a sad pattern.
> > > +   Vectorized on targets that support sad for unsigned chars.  */
> > > +
> > > +__attribute__ ((noinline)) int
> > > +foo (int len, int *res2)
> > > +{
> > > +  int i;
> > > +  int result = 0;
> > > +  int result2 = 0;
> > > +
> > > +  for (i = 0; i < len; i++)
> > > +{
> > > +  /* Make sure we are not using an SLP reduction for this.  */
> > > +  result += abs (X[2*i] - Y[2*i]);
> > > +  result2 += abs (X[2*i + 1] - Y[2*i + 1]);
> > > +}
> > > +
> > > +  *res2 = result2;
> > > +  return result;
> > > +}
> > > +
> > > +
> > > +int
> > > +main (void)
> > > +{
> > > +  int i;
> > > +  int sad;
> > > +
> > > +  check_vect ();
> > > +
> > > +  for (i = 0; i < N/2; i++)
> > > +{
> > > +  X[2*i] = i;
> > > +  Y[2*i] = N/2 - i;
> > > +  X[2*i+1] = i;
> > > +  Y[2*i+1] = 0;
> > > +  __asm__ volatile ("");
> > > +}
> > > +
> > > +
> > > +  int sad2;
> > > +  sad = foo (N/2, &sad2);
> > > +  if (sad != (N/2)*(N/4))
> > > +abort ();
> > > +  if (sad2 != (N/2-1)*(N/2)/2)
> > > +abort ();
> > > +
> > > +  return 0;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" 
> > > } } */
> > > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> > > +
> > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > > index 35f1f8c7d42..13dcdba403a 100644
> > > --- a/gcc/tree-vect-loop.cc
> > > +++ b/gcc/tree-vect-loop.cc
> > > @@ -7703,6 +7703

[PATCH] Refactor SLP reduction group discovery

2024-05-13 Thread Richard Biener
The following slightly refactors how we perform SLP reduction group
discovery, possibly making it easier to have multiple reduction
groups later, especially with single-lane SLP.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

* tree-vect-slp.cc (vect_analyze_slp_instance): Remove
slp_inst_kind_reduc_group handling.
(vect_analyze_slp): Add the meat here.
---
 gcc/tree-vect-slp.cc | 67 ++--
 1 file changed, 34 insertions(+), 33 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 8c18f5308e2..f34ed54a70b 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -3586,7 +3586,6 @@ vect_analyze_slp_instance (vec_info *vinfo,
   slp_instance_kind kind,
   unsigned max_tree_size, unsigned *limit)
 {
-  unsigned int i;
   vec<stmt_vec_info> scalar_stmts;
 
   if (is_a <bb_vec_info> (vinfo))
@@ -3620,35 +3619,6 @@ vect_analyze_slp_instance (vec_info *vinfo,
   STMT_VINFO_REDUC_DEF (vect_orig_stmt (stmt_info))
= STMT_VINFO_REDUC_DEF (vect_orig_stmt (scalar_stmts.last ()));
 }
-  else if (kind == slp_inst_kind_reduc_group)
-{
-  /* Collect reduction statements.  */
-  const vec<stmt_vec_info> &reductions
-   = as_a <loop_vec_info> (vinfo)->reductions;
-  scalar_stmts.create (reductions.length ());
-  for (i = 0; reductions.iterate (i, &next_info); i++)
-   {
- gassign *g;
- next_info = vect_stmt_to_vectorize (next_info);
- if ((STMT_VINFO_RELEVANT_P (next_info)
-  || STMT_VINFO_LIVE_P (next_info))
- /* ???  Make sure we didn't skip a conversion around a reduction
-path.  In that case we'd have to reverse engineer that
-conversion stmt following the chain using reduc_idx and from
-the PHI using reduc_def.  */
- && STMT_VINFO_DEF_TYPE (next_info) == vect_reduction_def
- /* Do not discover SLP reductions for lane-reducing ops, that
-will fail later.  */
- && (!(g = dyn_cast <gassign *> (STMT_VINFO_STMT (next_info)))
- || (gimple_assign_rhs_code (g) != DOT_PROD_EXPR
- && gimple_assign_rhs_code (g) != WIDEN_SUM_EXPR
- && gimple_assign_rhs_code (g) != SAD_EXPR)))
-   scalar_stmts.quick_push (next_info);
-   }
-  /* If less than two were relevant/live there's nothing to SLP.  */
-  if (scalar_stmts.length () < 2)
-   return false;
-}
   else
 gcc_unreachable ();
 
@@ -3740,9 +3710,40 @@ vect_analyze_slp (vec_info *vinfo, unsigned 
max_tree_size)
 
   /* Find SLP sequences starting from groups of reductions.  */
   if (loop_vinfo->reductions.length () > 1)
-   vect_analyze_slp_instance (vinfo, bst_map, loop_vinfo->reductions[0],
-  slp_inst_kind_reduc_group, max_tree_size,
-  &limit);
+   {
+ /* Collect reduction statements.  */
+ vec<stmt_vec_info> scalar_stmts;
+ scalar_stmts.create (loop_vinfo->reductions.length ());
+ for (auto next_info : loop_vinfo->reductions)
+   {
+ gassign *g;
+ next_info = vect_stmt_to_vectorize (next_info);
+ if ((STMT_VINFO_RELEVANT_P (next_info)
+  || STMT_VINFO_LIVE_P (next_info))
+ /* ???  Make sure we didn't skip a conversion around a
+reduction path.  In that case we'd have to reverse
+engineer that conversion stmt following the chain using
+reduc_idx and from the PHI using reduc_def.  */
+ && STMT_VINFO_DEF_TYPE (next_info) == vect_reduction_def
+ /* Do not discover SLP reductions for lane-reducing ops, that
+will fail later.  */
+ && (!(g = dyn_cast <gassign *> (STMT_VINFO_STMT (next_info)))
+ || (gimple_assign_rhs_code (g) != DOT_PROD_EXPR
+ && gimple_assign_rhs_code (g) != WIDEN_SUM_EXPR
+ && gimple_assign_rhs_code (g) != SAD_EXPR)))
+   scalar_stmts.quick_push (next_info);
+   }
+ if (scalar_stmts.length () > 1)
+   {
+ vec<stmt_vec_info> roots = vNULL;
+ vec<tree> remain = vNULL;
+ vect_build_slp_instance (loop_vinfo, slp_inst_kind_reduc_group,
+  scalar_stmts, roots, remain,
+  max_tree_size, &limit, bst_map, NULL);
+   }
+ else
+   scalar_stmts.release ();
+   }
 }
 
   hash_set visited_patterns;
-- 
2.35.3


Re: [PATCH] libstdc++: Use __builtin_shufflevector for simd split and concat

2024-05-13 Thread Jonathan Wakely
On Tue, 7 May 2024 at 14:42, Matthias Kretz  wrote:
>
> Tested on x86_64-linux-gnu and aarch64-linux-gnu and with Clang 18 on
> x86_64-linux-gnu.
>
> OK for trunk and backport(s)?

OK for all.


>
> -- 8< 
>
> Signed-off-by: Matthias Kretz 
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/114958
> * include/experimental/bits/simd.h (__as_vector): Return scalar
> simd as one-element vector. Return vector from single-vector
> fixed_size simd.
> (__vec_shuffle): New.
> (__extract_part): Adjust return type signature.
> (split): Use __extract_part for any split into non-fixed_size
> simds.
> (concat): If the return type stores a single vector, use
> __vec_shuffle (which calls __builtin_shufflevector) to produce
> the return value.
> * include/experimental/bits/simd_builtin.h
> (__shift_elements_right): Removed.
> (__extract_part): Return single elements directly. Use
> __vec_shuffle (which calls __builtin_shufflevector) for all
> non-trivial cases.
> * include/experimental/bits/simd_fixed_size.h (__extract_part):
> Return single elements directly.
> * testsuite/experimental/simd/pr114958.cc: New test.
> ---
>  libstdc++-v3/include/experimental/bits/simd.h | 161 +-
>  .../include/experimental/bits/simd_builtin.h  | 152 +
>  .../experimental/bits/simd_fixed_size.h   |   4 +-
>  .../testsuite/experimental/simd/pr114958.cc   |  20 +++
>  4 files changed, 145 insertions(+), 192 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/experimental/simd/pr114958.cc
>
>
> --
> ──
>  Dr. Matthias Kretz   https://mattkretz.github.io
>  GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
>  stdₓ::simd
> ──



Re: [PATCH] report message for operator %a on unaddressible exp

2024-05-13 Thread Segher Boessenkool
Hi!

On Mon, May 13, 2024 at 10:57:12AM +0800, Jiufu Guo wrote:
> For PR96866, when gcc print asm code for modifier "%a" which requires
> an address operand,

It requires a *memory* operand, and it outputs its address.  This is a
generic modifier btw (not rs6000).

> while the operand is with the constraint "X" which
> allow non-address form.  An error message would be reported to indicate
> the invalid asm operands.

"non-address form"?  Every mem has an address.

But 'X' is not memory.  What is it at all?  Why do we use that when you
*have to* have mem here?

The code you add that tests for address_operand looks wrong.  I would
expect it to test the operand is memory, instead :-)


Segher


Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-13 Thread Aldy Hernandez
On Thu, May 9, 2024 at 10:05 AM Mikael Morin  wrote:
>
> Hello,
>
> Le 07/05/2024 à 04:37, HAO CHEN GUI a écrit :
> > Hi,
> >The former patch adds isfinite optab for __builtin_isfinite.
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649339.html
> >
> >Thus the builtin might not be folded at front end. The range op for
> > isfinite is needed for value range analysis. This patch adds them.
> >
> >Compared to last version, this version fixes a typo.
> >
> >Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> > regressions. Is it OK for the trunk?
> >
> > Thanks
> > Gui Haochen
> >
> > ChangeLog
> > Value Range: Add range op for builtin isfinite
> >
> > The former patch adds optab for builtin isfinite. Thus builtin isfinite 
> > might
> > not be folded at front end.  So the range op for isfinite is needed for 
> > value
> > range analysis.  This patch adds range op for builtin isfinite.
> >
> > gcc/
> >   * gimple-range-op.cc (class cfn_isfinite): New.
> >   (op_cfn_finite): New variables.
> >   (gimple_range_op_handler::maybe_builtin_call): Handle
> >   CFN_BUILT_IN_ISFINITE.
> >
> > gcc/testsuite/
> >   * gcc/testsuite/gcc.dg/tree-ssa/range-isfinite.c: New test.
> >
> > patch.diff
> > diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
> > index 9de130b4022..99c511728d3 100644
> > --- a/gcc/gimple-range-op.cc
> > +++ b/gcc/gimple-range-op.cc
> > @@ -1192,6 +1192,56 @@ public:
> > }
> >   } op_cfn_isinf;
> >
> > +//Implement range operator for CFN_BUILT_IN_ISFINITE
> > +class cfn_isfinite : public range_operator
> > +{
> > +public:
> > +  using range_operator::fold_range;
> > +  using range_operator::op1_range;
> > +  virtual bool fold_range (irange &r, tree type, const frange &op1,
> > +const irange &, relation_trio) const override
> > +  {
> > +if (op1.undefined_p ())
> > +  return false;
> > +
> > +if (op1.known_isfinite ())
> > +  {
> > + r.set_nonzero (type);
> > + return true;
> > +  }
> > +
> > +if (op1.known_isnan ()
> > + || op1.known_isinf ())
> > +  {
> > + r.set_zero (type);
> > + return true;
> > +  }
> > +
> > +return false;
> I think the canonical API behaviour sets R to varying and returns true
> instead of just returning false if nothing is known about the range.

Correct.  If we know it's varying, we just set varying and return
true.  Returning false is usually reserved for "I have no idea".
However, every caller of fold_range() should know to ignore a return
of false, so you should be safe.

>
> I'm not sure whether it makes any difference; Aldy can probably tell.
> But if the type is bool, varying is [0,1] which is better than unknown
> range.

Also, I see you're setting zero/nonzero.  Is the return type known to
be boolean?  If so, we usually prefer one of:

r = range_true ();
r = range_false ();
r = range_true_and_false ();

It doesn't matter either way, but it's probably best to use these as
they force boolean_type_node automatically.

I don't have a problem with this patch, but I would prefer the
floating point savvy people to review this, as there are no members of
the ranger team that are floating point experts :).

Also, I see you mention in your original post that this patch was
needed as a follow-up to this one:

https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649339.html

I don't see the above patch in the source tree currently:

Thanks.
Aldy
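
[Editorial sketch, not the ranger API.]  The convention discussed above
(return false only for "no idea", otherwise set a result, which may be
varying, and return true) can be modelled with a toy three-valued fold.
All names below (ToyFloatRange, ToyBoolRange, toy_fold_isfinite) are
hypothetical and deliberately do not use frange, irange or range_operator.
Unlike the patch, the toy also answers "false" when the operand may be either
NaN or infinite but never finite.

```cpp
#include <cassert>

// Toy description of which classes of values a floating-point operand
// may take.  This is an assumption for illustration, not GCC's frange.
struct ToyFloatRange
{
  bool may_be_finite;
  bool may_be_inf;
  bool may_be_nan;

  bool undefined () const
  { return !may_be_finite && !may_be_inf && !may_be_nan; }
};

// Result of folding a boolean predicate over a range.
enum class ToyBoolRange { False, True, Varying };

// Fold isfinite (op1).  Returns false only when nothing can be said at all
// (undefined operand); otherwise sets R, possibly to Varying, and returns
// true.
bool toy_fold_isfinite (ToyBoolRange &r, const ToyFloatRange &op1)
{
  if (op1.undefined ())
    return false;

  if (!op1.may_be_inf && !op1.may_be_nan)
    r = ToyBoolRange::True;      // always finite
  else if (!op1.may_be_finite)
    r = ToyBoolRange::False;     // only Inf and/or NaN
  else
    r = ToyBoolRange::Varying;   // could be either, i.e. [0, 1]
  return true;
}

// Small self-check of the "varying rather than failure" convention.
void check_toy_fold ()
{
  ToyBoolRange r = ToyBoolRange::False;
  ToyFloatRange anything = { true, true, true };
  assert (toy_fold_isfinite (r, anything));
  assert (r == ToyBoolRange::Varying);
}
```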

>
> > +  }
> > +  virtual bool op1_range (frange &r, tree type, const irange &lhs,
> > +   const frange &, relation_trio) const override
> > +  {
> > +if (lhs.zero_p ())
> > +  {
> > + // The range is [-INF,-INF][+INF,+INF] NAN, but it can't be 
> > represented.
> > + // Set range to varying
> > + r.set_varying (type);
> > + return true;
> > +  }
> > +
> > +if (!range_includes_zero_p (&lhs))
> > +  {
> > + nan_state nan (false);
> > + r.set (type, real_min_representable (type),
> > +real_max_representable (type), nan);
> > + return true;
> > +  }
> > +
> > +return false;
> Same here.
>
> > +  }
> > +} op_cfn_isfinite;
> > +
> >   // Implement range operator for CFN_BUILT_IN_
> >   class cfn_parity : public range_operator
> >   {
>



Re: [PATCH] testsuite: c++: Allow for std::printf in g++.dg/modules/stdio-1_a.H [PR98529]

2024-05-13 Thread Nathaniel Shead
On Mon, May 13, 2024 at 10:40:30AM +0200, Rainer Orth wrote:
> g++.dg/modules/stdio-1_a.H currently FAILs on Solaris:
> 
> FAIL: g++.dg/modules/stdio-1_a.H -std=c++17  scan-lang-dump module "Depset:0 
> decl entity:[0-9]* function_decl:'::printf'"
> FAIL: g++.dg/modules/stdio-1_a.H -std=c++2a  scan-lang-dump module "Depset:0 
> decl entity:[0-9]* function_decl:'::printf'"
> FAIL: g++.dg/modules/stdio-1_a.H -std=c++2b  scan-lang-dump module "Depset:0 
> decl entity:[0-9]* function_decl:'::printf'"
> 
> The problem is that the module file doesn't contain
> 
>  Depset:0 decl entity:95 function_decl:'::printf'
> 
> as expected by the test, but
> 
>  Depset:0 decl entity:26 function_decl:'::std::printf'
> 
> This happens because Solaris <stdio.h> declares printf in namespace std
> as allowed by C++11, Annex D, D.5.
> 
> This patch allows for both forms.
> 
> Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
> x86_64-pc-linux-gnu.
> 
> Ok for trunk?
> 
>   Rainer

There are a couple of other tests that appear to potentially have a
similar issue:

global-2_a.C
21:// { dg-final { scan-lang-dump-not {Reachable GMF '::printf[^\n']*' added} module } }

global-3_a.C
15:// { dg-final { scan-lang-dump-not {Reachable GMF '::printf[^'\n]*' added} module } }

I suppose these should also be updated in the same way; I guess
they don't fail on Solaris because they aren't actually testing
what they are meant to test.

Otherwise LGTM.

Nathaniel
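
[Editorial sketch, not part of the patch.]  The effect of the optional
"(::std)?" group can be checked outside the testsuite.  The snippet below
uses std::regex (ECMAScript syntax) rather than the Tcl regular expressions
that DejaGnu uses; for this particular pattern the two are assumed to behave
the same.  The function names are hypothetical, and the second pattern is
only one possible way of applying the same change to the "Reachable GMF"
checks, not something taken from the thread.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Pattern from the patched dg-final line: accepts both '::printf' and
// '::std::printf' as the function_decl name.
bool matches_printf_depset (const std::string &line)
{
  static const std::regex re
    ("Depset:0 decl entity:[0-9]* function_decl:'(::std)?::printf'");
  return std::regex_search (line, re);
}

// Hypothetical equivalent for the scan-lang-dump-not checks mentioned
// above, with the same optional '::std' prefix.
bool matches_reachable_gmf_printf (const std::string &line)
{
  static const std::regex re
    ("Reachable GMF '(::std)?::printf[^\\n']*' added");
  return std::regex_search (line, re);
}

// Self-check using the two dump lines quoted in the original report.
void check_printf_patterns ()
{
  assert (matches_printf_depset
            (" Depset:0 decl entity:95 function_decl:'::printf'"));
  assert (matches_printf_depset
            (" Depset:0 decl entity:26 function_decl:'::std::printf'"));
}
```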

> 
> -- 
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
> 
> 
> 2024-05-13  Rainer Orth  
> 
>   gcc/testsuite:
>   PR c++/98529
>   * g++.dg/modules/stdio-1_a.H (scan-lang-dump): Allow for
>   ::std::printf.
> 

> diff --git a/gcc/testsuite/g++.dg/modules/stdio-1_a.H 
> b/gcc/testsuite/g++.dg/modules/stdio-1_a.H
> --- a/gcc/testsuite/g++.dg/modules/stdio-1_a.H
> +++ b/gcc/testsuite/g++.dg/modules/stdio-1_a.H
> @@ -10,5 +10,5 @@
>  #endif
>  // There should be *lots* of depsets (209 for glibc today)
>  // { dg-final { scan-lang-dump {Writing section:60 } module } }
> -// { dg-final { scan-lang-dump {Depset:0 decl entity:[0-9]* 
> function_decl:'::printf'} module } }
> +// { dg-final { scan-lang-dump {Depset:0 decl entity:[0-9]* 
> function_decl:'(::std)?::printf'} module } }
>  // { dg-final { scan-lang-dump {Depset:1 binding namespace_decl:'::printf'} 
> module } }



[PATCH][14 backport] c++: Fix instantiation of imported temploid friends [PR114275]

2024-05-13 Thread Nathaniel Shead
> > @@ -11751,9 +11767,16 @@ tsubst_friend_class (tree friend_tmpl, tree args)
> > if (tmpl != error_mark_node)
> > {
> >   /* The new TMPL is not an instantiation of anything, so we
> > -forget its origins.  We don't reset CLASSTYPE_TI_TEMPLATE
> > +forget its origins.  It is also not a specialization of
> > +anything.  We don't reset CLASSTYPE_TI_TEMPLATE
> >  for the new type because that is supposed to be the
> >  corresponding template decl, i.e., TMPL.  */
> > + spec_entry elt;
> > + elt.tmpl = friend_tmpl;
> > + elt.args = CLASSTYPE_TI_ARGS (TREE_TYPE (tmpl));
> > + elt.spec = TREE_TYPE (tmpl);
> > + type_specializations->remove_elt (&elt);
> 
> For GCC 14.2 let's guard this with if (modules_p ()); for GCC 15 it can be
> unconditional.  OK.
> 
> Jason
> 

I'm looking to backport this patch to GCC 14 now that it's been on trunk
some time.  Here's the patch I'm aiming to add (squashed with the
changes from r15-220-gec2365e07537e8) after cherrypicking the
prerequisite commit r15-58-g2faf040335f9b4; is this OK?

Or should I keep it as two separate commits to make the cherrypicking
more obvious? Not entirely sure on the etiquette around this.

Bootstrapped and regtested on x86_64-pc-linux-gnu on top of the
releases/gcc-14 branch.

-- >8 --

This patch fixes a number of issues with the handling of temploid friend
declarations.

The primary issue is that instantiations of friend declarations should
attach the declaration to the same module as the befriending class, by
[module.unit] p7.1 and [temp.friend] p2; this could be a different
module from the current TU, and so needs special handling.

The other main issue here is that we can't assume that just because name
lookup didn't find a definition for a hidden class template, that it
doesn't exist at all: it could be a non-exported entity that we've
nevertheless streamed in from an imported module.  We need to ensure
that when instantiating template friend classes that we return the same
TEMPLATE_DECL that we got from our imports, otherwise we will get later
issues with 'duplicate_decls' (rightfully) complaining that they're
different when trying to merge.

This doesn't appear necessary for function templates due to the existing
name lookup handling already finding these hidden declarations.

PR c++/105320
PR c++/114275

gcc/cp/ChangeLog:

* cp-tree.h (propagate_defining_module): Declare.
(remove_defining_module): Declare.
(lookup_imported_hidden_friend): Declare.
* decl.cc (duplicate_decls): Also check if hidden decls can be
redeclared in this module. Call remove_defining_module on
to-be-freed newdecl.
* module.cc (imported_temploid_friends): New.
(init_modules): Initialize it.
(trees_out::decl_value): Write it; don't consider imported
temploid friends as attached to a module.
(trees_in::decl_value): Read it for non-discarded decls.
(get_originating_module_decl): Follow the owning decl for an
imported temploid friend.
(propagate_defining_module): New.
(remove_defining_module): New.
* name-lookup.cc (get_mergeable_namespace_binding): New.
(lookup_imported_hidden_friend): New.
* pt.cc (tsubst_friend_function): Propagate defining module for
new friend functions.
(tsubst_friend_class): Lookup imported hidden friends.  Check
for valid module attachment of existing names.  Propagate
defining module for new classes.

gcc/testsuite/ChangeLog:

* g++.dg/modules/tpl-friend-10_a.C: New test.
* g++.dg/modules/tpl-friend-10_b.C: New test.
* g++.dg/modules/tpl-friend-10_c.C: New test.
* g++.dg/modules/tpl-friend-10_d.C: New test.
* g++.dg/modules/tpl-friend-11_a.C: New test.
* g++.dg/modules/tpl-friend-11_b.C: New test.
* g++.dg/modules/tpl-friend-12_a.C: New test.
* g++.dg/modules/tpl-friend-12_b.C: New test.
* g++.dg/modules/tpl-friend-12_c.C: New test.
* g++.dg/modules/tpl-friend-12_d.C: New test.
* g++.dg/modules/tpl-friend-12_e.C: New test.
* g++.dg/modules/tpl-friend-12_f.C: New test.
* g++.dg/modules/tpl-friend-13_a.C: New test.
* g++.dg/modules/tpl-friend-13_b.C: New test.
* g++.dg/modules/tpl-friend-13_c.C: New test.
* g++.dg/modules/tpl-friend-13_d.C: New test.
* g++.dg/modules/tpl-friend-13_e.C: New test.
* g++.dg/modules/tpl-friend-13_f.C: New test.
* g++.dg/modules/tpl-friend-13_g.C: New test.
* g++.dg/modules/tpl-friend-14_a.C: New test.
* g++.dg/modules/tpl-friend-14_b.C: New test.
* g++.dg/modules/tpl-friend-14_c.C: New test.
* g++.dg/modules/tpl-friend-14_d.C: New test.
* g++.dg/modules/tpl-friend-9.C: New test.

Signed-off-by: Nathaniel Shead 
Reviewed-by: Jason Merrill 
Reviewed-by: Patrick Palka 
---

Re: [PATCH] testsuite: c++: Allow for std::printf in g++.dg/modules/stdio-1_a.H [PR98529]

2024-05-13 Thread Rainer Orth
Hi Nathaniel,

> On Mon, May 13, 2024 at 10:40:30AM +0200, Rainer Orth wrote:
>> g++.dg/modules/stdio-1_a.H currently FAILs on Solaris:
>> 
>> FAIL: g++.dg/modules/stdio-1_a.H -std=c++17  scan-lang-dump module "Depset:0 
>> decl entity:[0-9]* function_decl:'::printf'"
>> FAIL: g++.dg/modules/stdio-1_a.H -std=c++2a  scan-lang-dump module "Depset:0 
>> decl entity:[0-9]* function_decl:'::printf'"
>> FAIL: g++.dg/modules/stdio-1_a.H -std=c++2b  scan-lang-dump module "Depset:0 
>> decl entity:[0-9]* function_decl:'::printf'"
>> 
>> The problem is that the module file doesn't contain
>> 
>>  Depset:0 decl entity:95 function_decl:'::printf'
>> 
>> as expected by the test, but
>> 
>>  Depset:0 decl entity:26 function_decl:'::std::printf'
>> 
>> This happens because Solaris <stdio.h> declares printf in namespace std
>> as allowed by C++11, Annex D, D.5.
>> 
>> This patch allows for both forms.
>> 
>> Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
>> x86_64-pc-linux-gnu.
>> 
>> Ok for trunk?
>> 
>>  Rainer
>
> There are a couple of other tests that appear to potentially have a
> similar issue:
>
> global-2_a.C
> 21:// { dg-final { scan-lang-dump-not {Reachable GMF '::printf[^\n']*'
> added} module } }
>
> global-3_a.C
> 15:// { dg-final { scan-lang-dump-not {Reachable GMF '::printf[^'\n]*'
> added} module } }

neither module file contains "Reachable GMF" at all, with ::printf or
otherwise.

> Which I suppose maybe also should be updated in the same way; I guess
> they don't fail on Solaris because they aren't actually correctly
> testing what they think they are.

Perhaps, but it would be useful to first understand what those tests are
supposed to look like.  WRT global-3_a.C, printf doesn't occur at all,
so this may just be a case of copy-and-paste.

Maybe Nathan, who authored the tests, can shed some light.

> Otherwise LGTM.

Thanks.  I'll go ahead and commit the patch as is, adjusting the other
two once it's become clear what they should look like.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] testsuite: c++: Allow for std::printf in g++.dg/modules/stdio-1_a.H [PR98529]

2024-05-13 Thread Nathaniel Shead
On Mon, May 13, 2024 at 01:59:51PM +0200, Rainer Orth wrote:
> Hi Nathaniel,
> 
> > On Mon, May 13, 2024 at 10:40:30AM +0200, Rainer Orth wrote:
> >> g++.dg/modules/stdio-1_a.H currently FAILs on Solaris:
> >> 
> >> FAIL: g++.dg/modules/stdio-1_a.H -std=c++17  scan-lang-dump module 
> >> "Depset:0 decl entity:[0-9]* function_decl:'::printf'"
> >> FAIL: g++.dg/modules/stdio-1_a.H -std=c++2a  scan-lang-dump module 
> >> "Depset:0 decl entity:[0-9]* function_decl:'::printf'"
> >> FAIL: g++.dg/modules/stdio-1_a.H -std=c++2b  scan-lang-dump module 
> >> "Depset:0 decl entity:[0-9]* function_decl:'::printf'"
> >> 
> >> The problem is that the module file doesn't contain
> >> 
> >>  Depset:0 decl entity:95 function_decl:'::printf'
> >> 
> >> as expected by the test, but
> >> 
> >>  Depset:0 decl entity:26 function_decl:'::std::printf'
> >> 
> >> This happens because Solaris <stdio.h> declares printf in namespace std
> >> as allowed by C++11, Annex D, D.5.
> >> 
> >> This patch allows for both forms.
> >> 
> >> Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and
> >> x86_64-pc-linux-gnu.
> >> 
> >> Ok for trunk?
> >> 
> >>Rainer
> >
> > There are a couple of other tests that appear to potentially have a
> > similar issue:
> >
> > global-2_a.C
> > 21:// { dg-final { scan-lang-dump-not {Reachable GMF '::printf[^\n']*'
> > added} module } }
> >
> > global-3_a.C
> > 15:// { dg-final { scan-lang-dump-not {Reachable GMF '::printf[^'\n]*'
> > added} module } }
> 
> neither module file contains "Reachable GMF" at all, with ::printf or
> otherwise.
> 

Yes, I think the test is aiming to check that such a declaration is not
added at all, and so that's correct. But if for some reason on some
system it did add "::std::printf" that would be a bug that would not be
caught by this test.
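
To make that concrete, here is a minimal sketch of the gap.  It assumes
POSIX extended regexes (via <regex.h>) are close enough to the Tcl
regexes that the testsuite uses for these simple patterns; the helper
name line_matches and both dump lines are made up for illustration and
do not come from a real module dump.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX extended regex PATTERN matches somewhere in LINE,
   0 if it does not, and -1 if PATTERN fails to compile.  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int found = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return found;
}

/* Patterns modelled on the directives quoted in this thread.  The C
   compiler turns \n into a real newline inside the bracket expression.  */
const char gmf_global_only[] = "Reachable GMF '::printf[^\n']*' added";
const char gmf_either[] = "Reachable GMF '(::std)?::printf[^\n']*' added";

/* Hypothetical dump lines (assumptions, not taken from any dump).  */
const char dump_global[] = "Reachable GMF '::printf' added";
const char dump_std[] = "Reachable GMF '::std::printf' added";
```

With these definitions, gmf_global_only does not match dump_std, so a
scan-lang-dump-not directive using it would still pass even though the
unwanted declaration was added; gmf_either matches both lines.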

> > Which I suppose maybe also should be updated in the same way; I guess
> > they don't fail on Solaris because they aren't actually correctly
> > testing what they think they are.
> 
> Perhaps, but it would be useful to first understand what those tests are
> supposed to look like.  WRT global-3_a.C, printf doesn't occur at all,
> so this may just be a case of copy-and-paste.
> 
> Maybe Nathan, who authored the tests, can shed some light.
> 
> > Otherwise LGTM.
> 
> Thanks.  I'll go ahead and commit the patch as is, adjusting the other
> two once it's become clear what they should look like.
> 

Ah, I should have been clearer: I'm not sure I can approve, but I've
CC'd Jason in.

>   Rainer
> 
> -- 
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] report message for operator %a on unaddressible exp

2024-05-13 Thread Kewen.Lin
Hi,

on 2024/5/13 10:57, Jiufu Guo wrote:
> Hi,
> 
> For PR96866, when GCC prints asm code for the modifier "%a", which
> requires an address operand, the operand may carry the constraint "X",
> which allows non-address forms.  With this patch, an error message is
> reported to indicate the invalid asm operands.
> 
> Bootstrap & regtest pass on ppc64{,le}.
> Is this ok for trunk?
> 
> BR,
> Jeff(Jiufu Guo)
> 
>   PR target/96866
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (print_operand_address):
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pr96866-1.c: New test.
>   * gcc.target/powerpc/pr96866-2.c: New test.
> 
> ---
>  gcc/config/rs6000/rs6000.cc  |  6 ++
>  gcc/testsuite/gcc.target/powerpc/pr96866-1.c | 15 +++
>  gcc/testsuite/gcc.target/powerpc/pr96866-2.c | 10 ++
>  3 files changed, 31 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr96866-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr96866-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 117999613d8..50943d76f79 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -14659,6 +14659,12 @@ print_operand_address (FILE *file, rtx x)
>else if (SYMBOL_REF_P (x) || GET_CODE (x) == CONST
>  || GET_CODE (x) == LABEL_REF)
>  {
> +  if (this_is_asm_operands && !address_operand (x, VOIDmode))

Do we really need this_is_asm_operands here?

> + {
> +   output_operand_lossage ("invalid expression as operand");
> +   return;
> + }
> +
>output_addr_const (file, x);
>if (small_data_operand (x, GET_MODE (x)))
>   fprintf (file, "@%s(%s)", SMALL_DATA_RELOC,
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96866-1.c 
> b/gcc/testsuite/gcc.target/powerpc/pr96866-1.c
> new file mode 100644
> index 000..6554a472a11
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96866-1.c
> @@ -0,0 +1,15 @@
> +/* It's to verify no ICE here, ignore error messages about invalid 'asm'.  */
> +/* { dg-excess-errors "pr96866-2.c" } */
> +/* { dg-options "-fPIC -O2" } */

Nit: If these two options are required, it would be good to have a comment
explaining it a bit when it's not obvious.

> +
> +int x[2];
> +
> +int __attribute__ ((noipa))
> +f1 (void)
> +{
> +  int n;
> +  int *p = x;
> +  *p++;
> +  __asm__ volatile("ld %0, %a1" : "=r"(n) : "X"(p));
> +  return n;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96866-2.c 
> b/gcc/testsuite/gcc.target/powerpc/pr96866-2.c
> new file mode 100644
> index 000..a5ec96f29dd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96866-2.c
> @@ -0,0 +1,10 @@
> +/* It's to verify no ICE here, ignore error messages about invalid 'asm'.  */
> +/* { dg-excess-errors "pr96866-2.c" } */
> +/* { dg-options "-fPIC -O2" } */

Ditto.

BR,
Kewen

> +
> +void
> +f (void)
> +{
> +  extern int x;
> +  __asm__ volatile("#%a0" ::"X"(&x));
> +}



Re: [PATCH] testsuite: c++: Allow for std::printf in g++.dg/modules/stdio-1_a.H [PR98529]

2024-05-13 Thread Rainer Orth
Hi Nathaniel,

>> > There are a couple of other tests that appear to potentially have a
>> > similar issue:
>> >
>> > global-2_a.C
>> > 21:// { dg-final { scan-lang-dump-not {Reachable GMF '::printf[^\n']*'
>> > added} module } }
>> >
>> > global-3_a.C
>> > 15:// { dg-final { scan-lang-dump-not {Reachable GMF '::printf[^'\n]*'
>> > added} module } }
>> 
>> neither module file contains "Reachable GMF" at all, with ::printf or
>> otherwise.
>> 
>
> Yes, I think the test is aiming to check that such a declaration is not
> added at all, and so that's correct. But if for some reason on some
> system it did add "::std::printf" that would be a bug that would not be
> caught by this test.

Understood.  However, the question remains about global-3_a.C, which
contains no printf at all.

>> > Which I suppose maybe also should be updated in the same way; I guess
>> > they don't fail on Solaris because they aren't actually correctly
>> > testing what they think they are.
>> 
>> Perhaps, but it would be useful to first understand what those tests are
>> supposed to look like.  WRT global-3_a.C, printf doesn't occur at all,
>> so this may just be a case of copy-and-paste.
>> 
>> Maybe Nathan, who authored the tests, can shed some light.
>> 
>> > Otherwise LGTM.
>> 
>> Thanks.  I'll go ahead and commit the patch as is, adjusting the other
>> two once it's become clear what they should look like.
>> 
>
> Ah, I should have been clearer: I'm not sure I can approve, but I've
> CC'd Jason in.

Sorry, I already committed the patch.  I can revert, of course, if
that's inappropriate.  OTOH, it could be considered obvious ;-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH v2 2/3] diagnostics: Don't hardcode auto_enable_urls to false for mingw hosts

2024-05-13 Thread NightStrike
On Thu, May 9, 2024 at 1:03 PM Peter Damianov  wrote:
>
> Windows terminal and mintty both have support for link escape sequences, and
> so auto_enable_urls shouldn't be hardcoded to false. For older versions of the
> windows console, mingw_ansi_fputs's console API translation logic does mangle
> these sequences, but there's nothing useful it could do even if this weren't
> the case, so check if the ansi escape sequences are supported at all.
>
> conhost.exe doesn't support link escape sequences, but printing them does not
> cause any problems.

Are there any issues when running under the Wine console, such as when
running the testsuite?


Re: [PATCH v2 2/3] diagnostics: Don't hardcode auto_enable_urls to false for mingw hosts

2024-05-13 Thread Peter0x44

13 May 2024 1:30:28 pm NightStrike:

> On Thu, May 9, 2024 at 1:03 PM Peter Damianov wrote:
>>
>> Windows terminal and mintty both have support for link escape sequences,
>> and so auto_enable_urls shouldn't be hardcoded to false. For older
>> versions of the windows console, mingw_ansi_fputs's console API
>> translation logic does mangle these sequences, but there's nothing useful
>> it could do even if this weren't the case, so check if the ansi escape
>> sequences are supported at all.
>>
>> conhost.exe doesn't support link escape sequences, but printing them does
>> not cause any problems.
>
> Are there any issues when running under the Wine console, such as when
> running the testsuite?


I did not try this. There shouldn't be problems if wine implements 
ENABLE_VIRTUAL_TERMINAL_PROCESSING correctly, but I agree it would be 
good to check. Are there instructions anywhere for running the testsuite 
with wine? Anything specific I need to do?


[PATCH] PR60276 fix for single-lane SLP

2024-05-13 Thread Richard Biener
When enabling single-lane SLP and not splitting groups, the fix for
PR60276 is no longer effective since it, for an unknown reason, exempted
pure SLP.  The following removes this exemption, making
gcc.dg/vect/pr60276.c PASS even with --param vect-single-lane-slp=1.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/60276
* tree-vect-stmts.cc (vectorizable_load): Do not exempt
pure_slp grouped loads from the STMT_VINFO_MIN_NEG_DIST
restriction.
---
 gcc/tree-vect-stmts.cc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 21e8fe98e44..b8a71605f1b 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9995,8 +9995,7 @@ vectorizable_load (vec_info *vinfo,
 
   /* Invalidate assumptions made by dependence analysis when vectorization
 on the unrolled body effectively re-orders stmts.  */
-  if (!PURE_SLP_STMT (stmt_info)
- && STMT_VINFO_MIN_NEG_DIST (stmt_info) != 0
+  if (STMT_VINFO_MIN_NEG_DIST (stmt_info) != 0
  && maybe_gt (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
   STMT_VINFO_MIN_NEG_DIST (stmt_info)))
{
-- 
2.35.3


Re: [PATCH 1/4] rs6000: Make all 128 bit scalar FP modes have 128 bit precision [PR112993]

2024-05-13 Thread Joseph Myers
On Mon, 13 May 2024, Kewen.Lin wrote:

> > In fact replacing all of X_TYPE_SIZE with a single hook might be worthwhile
> > though this removes the "convenient" defaulting, requiring each target to
> > enumerate all standard C ABI type modes.  But that might be also a good 
> > thing.
> > 
> 
> I guess the main value by extending from floating point types to all is to
> unify them?  (Assuming that, except for floating types, the others would
> not have multiple possible representations like what we face with 128-bit fp).

For integer types, giving the number of bits makes sense as an interface - 
there isn't an issue with different modes.

So I think it's appropriate for floating and integer types to have 
separate hooks - with the one for floating types returning a mode, and the 
one for integer types returning a number of bits.  (And also keep the 
existing separate hook for _FloatN / _FloatNx modes.)

That may also make for more convenient defaults (whether a target has long 
double wider than double is largely independent of what sizes it uses for 
integer types).
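
As a rough illustration of that split, here is a toy sketch.  Every
name in it (the toy_* enums and functions) is a hypothetical stand-in
and not GCC's actual enums, machine modes or target hook signatures,
and the integer sizes are an assumed LP64-style layout.  The floating
default follows what was suggested earlier in the thread: SFmode for
float, DFmode for double and long double.

```c
/* Toy model only: every name below is a hypothetical stand-in, not GCC's
   real enums, modes or target hooks.  */
enum toy_float_type { TOY_FLOAT, TOY_DOUBLE, TOY_LONG_DOUBLE };
enum toy_int_type { TOY_INT, TOY_LONG, TOY_LONG_LONG };
enum toy_mode { TOY_SFmode, TOY_DFmode, TOY_IFmode, TOY_TFmode };

/* Floating types: return a mode, because a bit count alone cannot tell
   two 128-bit formats apart.  Default: SFmode for float, DFmode for
   double and long double.  */
enum toy_mode
toy_default_mode_for_floating_type (enum toy_float_type t)
{
  return t == TOY_FLOAT ? TOY_SFmode : TOY_DFmode;
}

/* A hypothetical target whose long double uses one particular 128-bit
   format; it overrides only that entry and defers to the default.  */
enum toy_mode
toy_target_mode_for_floating_type (enum toy_float_type t)
{
  if (t == TOY_LONG_DOUBLE)
    return TOY_TFmode;
  return toy_default_mode_for_floating_type (t);
}

/* Integer types: a number of bits is enough.  The sizes are an assumed
   LP64-style layout, chosen purely for illustration.  */
unsigned int
toy_default_bits_for_int_type (enum toy_int_type t)
{
  return t == TOY_INT ? 32 : 64;
}
```

The point of the sketch is only that the two hooks vary independently:
the hypothetical target changes its long double mode without touching
any integer size.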

-- 
Joseph S. Myers
josmy...@redhat.com



RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-13 Thread Li, Pan2
Thanks Tamar for the comments.

> I think OPTIMIZE_FOR_BOTH is better here, since this is a win also when 
> optimizing for size.

Sure thing, let me update it in v5.

> Hmm why do you iterate independently over the statements? The block below
> already visits every statement, doesn't it?

Because otherwise the .ADD_OVERFLOW match fires first and changes the shape,
so SAT_ADD would never match.  Or shall we put it in the previous pass?

> The root of your match is a BIT_IOR_EXPR expression, so I think you just need 
> to change the entry below to:
>
>   case BIT_IOR_EXPR:
> match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
> /* fall-through */
>   case BIT_XOR_EXPR:
> match_uaddc_usubc (&gsi, stmt, code);
> break;

There are other shapes of SAT_ADD (not covered in this patch), such as the
branch version below, so IOR is only one of the possible roots.  That is why
I didn't add a case here.  Alternatively, shall we add a case for each shape
here?  Both work for me.

#define SAT_ADD_U_1(T) \
T sat_add_u_1_##T(T x, T y) \
{ \
  return (T)(x + y) >= x ? (x + y) : -1; \
}

SAT_ADD_U_1(uint32_t)
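
For comparison, the two shapes can be put side by side.  This is only a
minimal sketch specialised to uint8_t; the function names are made up
for illustration and are not part of the patch.

```c
#include <stdint.h>

/* Branchless shape from the patch description:
   SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)).  */
uint8_t
sat_add_u8_branchless (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);           /* wraps modulo 256 */
  uint8_t mask = (uint8_t) -(sum < x);       /* 0xff on overflow, else 0 */
  return sum | mask;
}

/* Branch shape, same structure as the SAT_ADD_U_1 macro above.  */
uint8_t
sat_add_u8_branch (uint8_t x, uint8_t y)
{
  return (uint8_t) (x + y) >= x ? (uint8_t) (x + y) : (uint8_t) -1;
}
```

Both functions return the wrapped sum when it does not overflow and 255
otherwise, which is why the root of the expression differs (an IOR in
one, a conditional in the other) while the semantics are the same.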

Pan


-Original Message-
From: Tamar Christina  
Sent: Monday, May 13, 2024 5:10 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Liu, Hongtao 
Subject: RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned 
scalar int

Hi Pan,

> -Original Message-
> From: pan2...@intel.com 
> Sent: Monday, May 6, 2024 3:48 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> ; richard.guent...@gmail.com;
> hongtao@intel.com; Pan Li 
> Subject: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned 
> scalar
> int
> 
> From: Pan Li 
> 
> This patch adds the middle-end representation for the saturating add,
> i.e. the result of the add is set to the max value on overflow.
> It matches a pattern similar to the one below.
> 
> SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
> 
> Take uint8_t as example, we will have:
> 
> * SAT_ADD (1, 254)   => 255.
> * SAT_ADD (1, 255)   => 255.
> * SAT_ADD (2, 255)   => 255.
> * SAT_ADD (255, 255) => 255.
> 
> Given the example below for the unsigned scalar integer uint64_t:
> 
> uint64_t sat_add_u64 (uint64_t x, uint64_t y)
> {
>   return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
> }
> 
> Before this patch:
> uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> {
>   long unsigned int _1;
>   _Bool _2;
>   long unsigned int _3;
>   long unsigned int _4;
>   uint64_t _7;
>   long unsigned int _10;
>   __complex__ long unsigned int _11;
> 
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
>   _1 = REALPART_EXPR <_11>;
>   _10 = IMAGPART_EXPR <_11>;
>   _2 = _10 != 0;
>   _3 = (long unsigned int) _2;
>   _4 = -_3;
>   _7 = _1 | _4;
>   return _7;
> ;;succ:   EXIT
> 
> }
> 
> After this patch:
> uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> {
>   uint64_t _7;
> 
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
>   return _7;
> ;;succ:   EXIT
> }
> 
> We perform the transform during widen_mult because the sub-expr of
> SAT_ADD will be optimized to .ADD_OVERFLOW.  We need to try the .SAT_ADD
> pattern first and then .ADD_OVERFLOW, or we may never catch the
> .SAT_ADD pattern.  Meanwhile, the isel pass runs after widen_mult, so we
> cannot perform the .SAT_ADD pattern match there, as the sub-expr will
> already have been optimized to .ADD_OVERFLOW.
> 
> The tests below passed for this patch:
> 1. The riscv full regression tests.
> 2. The aarch64 full regression tests.
> 3. The x86 bootstrap tests.
> 4. The x86 full regression tests.
> 
>   PR target/51492
>   PR target/112600
> 
> gcc/ChangeLog:
> 
>   * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
>   to the return true switch case(s).
>   * internal-fn.def (SAT_ADD):  Add new signed optab SAT_ADD.
>   * match.pd: Add unsigned SAT_ADD match.
>   * optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd.
>   * tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_add): New extern
>   func decl generated in match.pd match.
>   (match_saturation_arith): New func impl to match the saturation arith.
>   (math_opts_dom_walker::after_dom_children): Try match saturation
>   arith.
> 
> Signed-off-by: Pan Li 
> ---
>  gcc/internal-fn.cc|  1 +
>  gcc/internal-fn.def   |  2 ++
>  gcc/match.pd  | 28 
>  gcc/optabs.def|  4 ++--
>  gcc/tree-ssa-math-opts.cc | 46 +++
>  5 files changed, 79 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 0a7053c2286..73045ca8c8c 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -4202,6 +42

Re: [EXTERNAL] [COMMITTED] Regenerate cygming.opt.urls and mingw.opt.urls

2024-05-13 Thread David Malcolm
On Mon, 2024-05-13 at 11:14 +0200, Mark Wielaard wrote:
> Hi Evgeny,
> 
> Adding David to the CC, who might know the details.
> 
> On Mon, May 13, 2024 at 08:44:12AM +, Evgeny Karpov wrote:
> > Sunday, May 12, 2024
> > 
> > Thank you for reviewing our changes related to the refactoring of
> > extracting the MinGW implementation from ix64.
> > 
> > It was expected to move the MinGW-related files without changes in
> > this commit ("Reuse MinGW from i386 for AArch64") and apply the
> > renaming in a follow-up commit, which has been done in 'Rename "x86
> > Windows Options" to "Cygwin and MinGW Options"'.
> > 
> > The script to update opt.urls files has been used.
> > 
> > > diff --git a/gcc/config/mingw/cygming.opt.urls
> > > b/gcc/config/mingw/cygming.opt.urls
> > > index c624e22e4427..af11c4997609 100644
> > > --- a/gcc/config/mingw/cygming.opt.urls
> > > +++ b/gcc/config/mingw/cygming.opt.urls
> > > @@ -1,4 +1,4 @@
> > 
> > > -; Autogenerated by regenerate-opt-urls.py from
> > > gcc/config/i386/cygming.opt
> > > and generated HTML
> > > +; Autogenerated by regenerate-opt-urls.py from
> > > +gcc/config/mingw/cygming.opt and generated HTML
> > 
> > I am not sure why this comment has not been updated. Is it critical
> > or it could be updated next time when it is needed?
> 
> Odd that the script didn't update this comment, it really should
> have.
> It might be that running the script through make regenerate-opt-urls
> inside the gcc build subdir invokes regenerate-opt-urls.py slightly
> differently so that this line is updated.

It might be a "make" dependencies issue:
"make regenerate-opt-urls" has dependencies on OPT_URLS_HTML_DEPS which
is currently defined as:
OPT_URLS_HTML_DEPS = $(build_htmldir)/gcc/Option-Index.html \
$(build_htmldir)/gdc/Option-Index.html \
$(build_htmldir)/gfortran/Option-Index.html
which might not be enough for the doc changes when moving things around
that affect other generated html files.

So when the CI runs "make regenerate-opt-urls" in a pristine build it
will forcibly rerun texinfo to regenerate the docs first, whereas if
you manually run the script in a build directory, you might not be
seeing the latest version of the HTML (especially in the presence of
file moves).

So I think the Makefile as currently written handles most cases, but
can get it slightly wrong for the case you ran into here (sorry); fully
refreshing the built docs ought to fix such cases.

That's my theory of what happened here, anyway.

Dave

> 
> > >  mconsole
> > >  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mconsole)
> > > @@ -9,9 +9,8 @@ UrlSuffix(gcc/Cygwin-and-MinGW-
> > > Options.html#index-
> > > mdll)
> > >  mnop-fun-dllimport
> > >  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mnop-fun-
> > > dllimport)
> > > 
> > > -; skipping UrlSuffix for 'mthreads' due to multiple URLs:
> > > -;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-
> > > mthreads-1'
> > > -;   duplicate: 'gcc/x86-Options.html#index-mthreads'
> > > +mthreads
> > > +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1)
> > 
> > mthreads has the same issue before applying changes. Has something
> > been changed recently?
> > This is the change in patch series in 'Rename "x86 Windows Options"
> > to "Cygwin and MinGW Options"' commit.
> > 
> > ; skipping UrlSuffix for 'mthreads' due to multiple URLs:
> > +;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-mthreads-
> > 1'
> >  ;   duplicate: 'gcc/x86-Options.html#index-mthreads'
> > -;   duplicate: 'gcc/x86-Windows-Options.html#index-mthreads-1'
> 
> Again, it might be caused by invoking the script by hand vs with make
> regenerate-opt-urls.py. I believe with the make option it will
> renumber the suffixes making sure the urls are unique.
> 
> BTW. There is a CI buildbot that tries to regenerate all generated
> files, which is how I spotted this:
> https://builder.sourceware.org/buildbot/#/builders/gcc-autoregen
> (It should also send email to the author of the patch on failure.)
> 
> Cheers,
> 
> Mark
> 



Re: [EXTERNAL] [COMMITTED] Regenerate cygming.opt.urls and mingw.opt.urls

2024-05-13 Thread David Malcolm
On Mon, 2024-05-13 at 09:42 -0400, David Malcolm wrote:
> On Mon, 2024-05-13 at 11:14 +0200, Mark Wielaard wrote:
> > Hi Evgeny,
> > 
> > Adding David to the CC, who might know the details.
> > 
> > On Mon, May 13, 2024 at 08:44:12AM +, Evgeny Karpov wrote:
> > > Sunday, May 12, 2024
> > > 
> > > Thank you for reviewing our changes related to the refactoring of
> > > extracting the MinGW implementation from ix64.
> > > 
> > > It was expected to move the MinGW-related files without changes
> > > in
> > > this commit ("Reuse MinGW from i386 for AArch64") and apply the
> > > renaming in a follow-up commit, which has been done in 'Rename
> > > "x86
> > > Windows Options" to "Cygwin and MinGW Options"'.
> > > 
> > > The script to update opt.urls files has been used.
> > > 
> > > > diff --git a/gcc/config/mingw/cygming.opt.urls
> > > > b/gcc/config/mingw/cygming.opt.urls
> > > > index c624e22e4427..af11c4997609 100644
> > > > --- a/gcc/config/mingw/cygming.opt.urls
> > > > +++ b/gcc/config/mingw/cygming.opt.urls
> > > > @@ -1,4 +1,4 @@
> > > 
> > > > -; Autogenerated by regenerate-opt-urls.py from
> > > > gcc/config/i386/cygming.opt
> > > > and generated HTML
> > > > +; Autogenerated by regenerate-opt-urls.py from
> > > > +gcc/config/mingw/cygming.opt and generated HTML
> > > 
> > > I am not sure why this comment has not been updated. Is it
> > > critical
> > > or it could be updated next time when it is needed?
> > 
> > Odd that the script didn't update this comment, it really should
> > have.
> > It might be that running the script through make regenerate-opt-
> > urls
> > inside the gcc build subdir invokes regenerate-opt-urls.py slightly
> > differently so that this line is updated.
> 
> It might be a "make" dependencies issue:
> "make regenerate-opt-urls" has dependencies on OPT_URLS_HTML_DEPS
> which
> is currently defined as:
> OPT_URLS_HTML_DEPS = $(build_htmldir)/gcc/Option-Index.html \
> $(build_htmldir)/gdc/Option-Index.html \
> $(build_htmldir)/gfortran/Option-Index.html
> which might not be enough for the doc changes when moving things
> around
> that affect other generated html files.
> 
> So when the CI runs "make regenerate-opt-urls" in a pristine build it
> will forcibly rerun texinfo to regenerate the docs first, whereas if
> you manually run the script in a build directory, you might not be
> seeing the latest version of the HTML (especially in the presence of
> file moves).
> 
> So I think the Makefile as currently written handles most cases, but
> can get it slightly wrong for the case you ran into here (sorry);
> fully
> refreshing the built docs ought to fix such cases.

Specifically, if you have some generated .html files in the
$(build_htmldir) from a file that has gone away (due to a move), then I
suspect these .html files stick around until you fully delete the
$(build_htmldir), and in the meantime they get found by
regenerate-opt-urls.py and lead to duplicate entries, leading to
differences against a pristine build dir.

Dave

> 
> That's my theory of what happened here, anyway.
> 
> Dave
> 
> > 
> > > >  mconsole
> > > >  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mconsole)
> > > > @@ -9,9 +9,8 @@ UrlSuffix(gcc/Cygwin-and-MinGW-
> > > > Options.html#index-
> > > > mdll)
> > > >  mnop-fun-dllimport
> > > >  UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mnop-fun-
> > > > dllimport)
> > > > 
> > > > -; skipping UrlSuffix for 'mthreads' due to multiple URLs:
> > > > -;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-
> > > > mthreads-1'
> > > > -;   duplicate: 'gcc/x86-Options.html#index-mthreads'
> > > > +mthreads
> > > > +UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1)
> > > 
> > > mthreads has the same issue before applying changes. Has
> > > something
> > > been changed recently?
> > > This is the change in patch series in 'Rename "x86 Windows
> > > Options"
> > > to "Cygwin and MinGW Options"' commit.
> > > 
> > > ; skipping UrlSuffix for 'mthreads' due to multiple URLs:
> > > +;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-
> > > mthreads-
> > > 1'
> > >  ;   duplicate: 'gcc/x86-Options.html#index-mthreads'
> > > -;   duplicate: 'gcc/x86-Windows-Options.html#index-mthreads-1'
> > 
> > Again, it might be caused by invoking the script by hand vs with make
> > regenerate-opt-urls.py. I believe with the make option it will
> > renumber the suffixes making sure the urls are unique.
> > 
> > BTW. There is a CI buildbot that tries to regenerate all generated
> > files, which is how I spotted this:
> > https://builder.sourceware.org/buildbot/#/builders/gcc-autoregen
> > (It should also send an email to the author of the patch on failure.)
> > 
> > Cheers,
> > 
> > Mark
> > 
> 



Re: [PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 scalar

2024-05-13 Thread Kito Cheng
LGTM as well :)

On Sat, May 11, 2024 at 3:58 PM juzhe.zh...@rivai.ai
 wrote:
>
> LGTM from my side. Waiting for Kito to chime in.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: pan2.li
> Date: 2024-05-11 15:54
> To: gcc-patches
> CC: juzhe.zhong; kito.cheng; Pan Li
> Subject: [PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 
> scalar
> From: Pan Li 
>
> For the vfw vx format RVV intrinsic, the scalar type _Float16 also
> requires the zvfh extension.  Unfortunately,  we only check the
> vector tree type and miss the scalar _Float16 type checking.  For
> example:
>
> vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, size_t 
> vl)
> {
>   return __riscv_vfwsub_wf_f32mf2(vs2, rs1, vl);
> }
>
> It should report an error message such as "zvfh extension is required"
> instead of an ICE for an unrecognizable insn.
>
> This patch adds this kind of validation for _Float16 in the RVV
> intrinsic API.  It will report an error like the one below when zvfh
> is not enabled.
>
> error: built-in function '__riscv_vfwsub_wf_f32mf2(vs2,  rs1,  vl)'
>   requires the zvfhmin or zvfh ISA extension
>
> PR target/114988
>
> Passed the full rv64gcv regression tests, including C/C++/Fortran.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins.cc
> (validate_instance_type_required_extensions): New func impl to
> validate the intrinisc func type ops.
> (expand_builtin): Validate instance type before expand.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr114988-1.c: New test.
> * gcc.target/riscv/rvv/base/pr114988-2.c: New test.
>
> Signed-off-by: Pan Li 
> ---
> gcc/config/riscv/riscv-vector-builtins.cc | 51 +++
> .../gcc.target/riscv/rvv/base/pr114988-1.c|  9 
> .../gcc.target/riscv/rvv/base/pr114988-2.c|  9 
> 3 files changed, 69 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-2.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
> b/gcc/config/riscv/riscv-vector-builtins.cc
> index 192a6c230d1..3fdb4400d70 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins.cc
> @@ -4632,6 +4632,54 @@ gimple_fold_builtin (unsigned int code, 
> gimple_stmt_iterator *gsi, gcall *stmt)
>return gimple_folder (rfn.instance, rfn.decl, gsi, stmt).fold ();
> }
> +static bool
> +validate_instance_type_required_extensions (const rvv_type_info type,
> + tree exp)
> +{
> +  uint64_t exts = type.required_extensions;
> +
> +  if ((exts & RVV_REQUIRE_ELEN_FP_16) &&
> +!TARGET_VECTOR_ELEN_FP_16_P (riscv_vector_elen_flags))
> +{
> +  error_at (EXPR_LOCATION (exp),
> + "built-in function %qE requires the "
> + "zvfhmin or zvfh ISA extension",
> + exp);
> +  return false;
> +}
> +
> +  if ((exts & RVV_REQUIRE_ELEN_FP_32) &&
> +!TARGET_VECTOR_ELEN_FP_32_P (riscv_vector_elen_flags))
> +{
> +  error_at (EXPR_LOCATION (exp),
> + "built-in function %qE requires the "
> + "zve32f, zve64f, zve64d or v ISA extension",
> + exp);
> +  return false;
> +}
> +
> +  if ((exts & RVV_REQUIRE_ELEN_FP_64) &&
> +!TARGET_VECTOR_ELEN_FP_64_P (riscv_vector_elen_flags))
> +{
> +  error_at (EXPR_LOCATION (exp),
> + "built-in function %qE requires the zve64d or v ISA extension",
> + exp);
> +  return false;
> +}
> +
> +  if ((exts & RVV_REQUIRE_ELEN_64) &&
> +!TARGET_VECTOR_ELEN_64_P (riscv_vector_elen_flags))
> +{
> +  error_at (EXPR_LOCATION (exp),
> + "built-in function %qE requires the "
> + "zve64x, zve64f, zve64d or v ISA extension",
> + exp);
> +  return false;
> +}
> +
> +  return true;
> +}
> +
> /* Expand a call to the RVV function with subcode CODE.  EXP is the call
> expression and TARGET is the preferred location for the result.
> Return the value of the lhs.  */
> @@ -4649,6 +4697,9 @@ expand_builtin (unsigned int code, tree exp, rtx target)
>return target;
>  }
> +  if (!validate_instance_type_required_extensions (rfn.instance.type, exp))
> +return target;
> +
>return function_expander (rfn.instance, rfn.decl, exp, target).expand ();
> }
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c
> new file mode 100644
> index 000..b8474804c88
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, size_t 
> vl)
> +{
> +  return __riscv_vfwsub_wf_f32mf2(vs2, rs1, vl); /* { dg-error {built-in 
> function '__riscv_vfwsub_wf_f32mf2\(vs2,  rs1,  vl\)' requires the zvfhmin or 
> zvfh ISA extension} } */
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-2.c 
> b/gcc/tes

Re: [PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-13 Thread Robin Dapp
> What happens if we simply remove all of the force_reg here?

On x86 I bootstrapped and tested the attached without fallout
(gcc188, so it's no avx512-native machine and therefore limited
coverage).  riscv regtest is unchanged.
For aarch64 I would have to rely on the pre-commit CI to pick it
up (does that work on sub-threads?).

Regards
 Robin


gcc/ChangeLog:

PR middle-end/113474

* internal-fn.cc (expand_vec_cond_mask_optab_fn):  Remove
force_regs.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr113474.c: New test.
---
 gcc/internal-fn.cc  |  3 ---
 .../gcc.target/riscv/rvv/autovec/pr113474.c | 13 +
 2 files changed, 13 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 2c764441cde..4d226c478b4 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3163,9 +3163,6 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, 
convert_optab optab)
   rtx_op1 = expand_normal (op1);
   rtx_op2 = expand_normal (op2);
 
-  mask = force_reg (mask_mode, mask);
-  rtx_op1 = force_reg (mode, rtx_op1);
-
   rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   create_output_operand (&ops[0], target, mode);
   create_input_operand (&ops[1], rtx_op1, mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
new file mode 100644
index 000..0364bf9f5e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target riscv_v } }  */
+/* { dg-additional-options "-std=c99" }  */
+
+void
+foo (int n, int **a)
+{
+  int b;
+  for (b = 0; b < n; b++)
+for (long e = 8; e > 0; e--)
+  a[b][e] = a[b][e] == 15;
+}
+
+/* { dg-final { scan-assembler "vmerge.vim" } }  */
-- 
2.45.0



Re: [PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-13 Thread Richard Biener
On Mon, May 13, 2024 at 4:14 PM Robin Dapp  wrote:
>
> > What happens if we simply remove all of the force_reg here?
>
> On x86 I bootstrapped and tested the attached without fallout
> (gcc188, so it's no avx512-native machine and therefore limited
> coverage).  riscv regtest is unchanged.
> For aarch64 I would have to rely on the pre-commit CI to pick it
> up (does that work on sub-threads?).

OK if that pre-commit CI works out.

Richard.

> Regards
>  Robin
>
>
> gcc/ChangeLog:
>
> PR middle-end/113474
>
> * internal-fn.cc (expand_vec_cond_mask_optab_fn):  Remove
> force_regs.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/autovec/pr113474.c: New test.
> ---
>  gcc/internal-fn.cc  |  3 ---
>  .../gcc.target/riscv/rvv/autovec/pr113474.c | 13 +
>  2 files changed, 13 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 2c764441cde..4d226c478b4 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -3163,9 +3163,6 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall 
> *stmt, convert_optab optab)
>rtx_op1 = expand_normal (op1);
>rtx_op2 = expand_normal (op2);
>
> -  mask = force_reg (mask_mode, mask);
> -  rtx_op1 = force_reg (mode, rtx_op1);
> -
>rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>create_output_operand (&ops[0], target, mode);
>create_input_operand (&ops[1], rtx_op1, mode);
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
> new file mode 100644
> index 000..0364bf9f5e3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile { target riscv_v } }  */
> +/* { dg-additional-options "-std=c99" }  */
> +
> +void
> +foo (int n, int **a)
> +{
> +  int b;
> +  for (b = 0; b < n; b++)
> +for (long e = 8; e > 0; e--)
> +  a[b][e] = a[b][e] == 15;
> +}
> +
> +/* { dg-final { scan-assembler "vmerge.vim" } }  */
> --
> 2.45.0
>


Re: [PATCH v2 2/3] diagnostics: Don't hardcode auto_enable_urls to false for mingw hosts

2024-05-13 Thread Peter0x44

13 May 2024 1:30:28 pm NightStrike :

> On Thu, May 9, 2024 at 1:03 PM Peter Damianov wrote:
> >
> > Windows terminal and mintty both have support for link escape sequences,
> > and so auto_enable_urls shouldn't be hardcoded to false. For older
> > versions of the windows console, mingw_ansi_fputs's console API
> > translation logic does mangle these sequences, but there's nothing
> > useful it could do even if this weren't the case, so check if the ansi
> > escape sequences are supported at all.
> >
> > conhost.exe doesn't support link escape sequences, but printing them
> > does not cause any problems.
>
> Are there any issues when running under the Wine console, such as when
> running the testsuite?


I installed wine and tried compiling a file that emits a warning.
Unfortunately, yes, gcc emits mangled warnings here. Even simply running
this patch under wine causes problems; it's not just wine's conhost.exe.

I'm not sure whether it's my fault or wine's. I've attached two
screenshots demonstrating exactly what happens. (I think???) wine should
only be advertising that it supports those settings regarding escape
sequences if it actually does. Also, on this machine, wine is near
unusably slow; I'm talking multiple seconds to react to a keypress
through the wine conhost. I will not be attempting to run the testsuite,
as I severely doubt it will work.

[PATCH v1 1/3] Vect: Support loop len in vectorizable early exit

2024-05-13 Thread pan2 . li
From: Pan Li 

This patch adds early break auto-vectorization support for targets that
use length-based partial vectorization.  Consider the following example:

unsigned vect_a[802];
unsigned vect_b[802];

void test (unsigned x, int n)
{
  for (int i = 0; i < n; i++)
  {
    vect_b[i] = x + i;

    if (vect_a[i] > x)
      break;

    vect_a[i] = x;
  }
}

We use VCOND_MASK_LEN to generate the equivalent of
(mask && i < len + bias).  The resulting IR for RVV looks like this:

  ...
  _87 = .SELECT_VL (ivtmp_85, POLY_INT_CST [32, 32]);
  _55 = (int) _87;
  ...
  mask_patt_6.13_69 = vect_cst__62 < vect__3.12_67;
  vec_len_mask_72 = .VCOND_MASK_LEN (mask_patt_6.13_69, { -1, ... }, \
{0, ... }, _87, 0);
  if (vec_len_mask_72 != { 0, ... })
    goto ; [5.50%]
  else
    goto ; [94.50%]
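
For readers following along, the per-lane effect of the .VCOND_MASK_LEN call
in the IR above can be modelled in scalar C.  This is only an illustrative
sketch and not code from the patch: the function names and the uint8_t lane
type are made up here, and the "ones"/"zeros" operands are fixed to all-ones
and zero, as they are in the IR.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of .VCOND_MASK_LEN (mask, ones, zeros, len, bias) for a
   vector of NUNITS lanes.  Hypothetical helper, for illustration only.  */
static void
vcond_mask_len_model (uint8_t *out, const uint8_t *mask, size_t nunits,
                      size_t len, int bias)
{
  for (size_t i = 0; i < nunits; i++)
    /* A lane survives only if the compare mask is set and the lane is
       active, i.e. its index is below len + bias.  */
    out[i] = (mask[i] && (long) i < (long) len + bias) ? 0xff : 0;
}

/* The early exit is taken when any surviving lane is nonzero.  */
static int
any_lane_set (const uint8_t *v, size_t nunits)
{
  for (size_t i = 0; i < nunits; i++)
    if (v[i])
      return 1;
  return 0;
}
```

With a compare mask of {0, 0, 1, 1} and len = 2, no active lane matches, so
the exit is not taken; with len = 3, lane 2 becomes active and the exit fires.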

The following tests pass with this patch:
1. The full riscv regression tests.
2. The full aarch64 regression tests.
3. The x86 bootstrap tests.
4. The full x86 regression tests.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_early_exit): Add loop len
handling for one or multiple stmt.

Signed-off-by: Pan Li 
---
 gcc/tree-vect-stmts.cc | 47 --
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 21e8fe98e44..bfd9d66568f 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12896,7 +12896,9 @@ vectorizable_early_exit (vec_info *vinfo, stmt_vec_info 
stmt_info,
 ncopies = vect_get_num_copies (loop_vinfo, vectype);
 
   vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
+  vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);
   bool masked_loop_p = LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
+  bool len_loop_p = LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo);
 
   /* Now build the new conditional.  Pattern gimple_conds get dropped during
  codegen so we must replace the original insn.  */
@@ -12960,12 +12962,11 @@ vectorizable_early_exit (vec_info *vinfo, 
stmt_vec_info stmt_info,
{
  if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
  OPTIMIZE_FOR_SPEED))
-   return false;
+   vect_record_loop_len (loop_vinfo, lens, ncopies, vectype, 1);
  else
vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype, NULL);
}
 
-
   return true;
 }
 
@@ -13018,6 +13019,25 @@ vectorizable_early_exit (vec_info *vinfo, 
stmt_vec_info stmt_info,
  stmts[i], &cond_gsi);
workset.quick_push (stmt_mask);
  }
+  else if (len_loop_p)
+   for (unsigned i = 0; i < stmts.length (); i++)
+ {
+   tree all_ones_mask = build_all_ones_cst (vectype);
+   tree all_zero_mask = build_zero_cst (vectype);
+   tree len = vect_get_loop_len (loop_vinfo, gsi, lens, ncopies,
+ vectype, i, 1);
+   signed char cst = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo);
+   tree bias = build_int_cst (intQI_type_node, cst);
+   tree len_mask
+ = make_temp_ssa_name (TREE_TYPE (stmts[i]), NULL, "vec_len_mask");
+   gcall *call = gimple_build_call_internal (IFN_VCOND_MASK_LEN, 5,
+ stmts[i], all_ones_mask,
+ all_zero_mask, len, bias);
+   gimple_call_set_lhs (call, len_mask);
+   gsi_insert_before (&cond_gsi, call, GSI_SAME_STMT);
+
+   workset.quick_push (len_mask);
+ }
   else
workset.splice (stmts);
 
@@ -13042,6 +13062,29 @@ vectorizable_early_exit (vec_info *vinfo, 
stmt_vec_info stmt_info,
  new_temp = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask,
   new_temp, &cond_gsi);
}
+  else if (len_loop_p)
+   {
+ /* len_mask = VCOND_MASK_LEN (compare_mask, ones, zero, len, bias)
+
+which is equivalent to:
+
+len_mask = compare_mask mask && i < len ? 1 : 0
+ */
+ tree all_ones_mask = build_all_ones_cst (vectype);
+ tree all_zero_mask = build_zero_cst (vectype);
+ tree len
+   = vect_get_loop_len (loop_vinfo, gsi, lens, ncopies, vectype, 0, 1);
+ signed char biasval = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS (loop_vinfo);
+ tree bias = build_int_cst (intQI_type_node, biasval);
+ tree len_mask
+   = make_temp_ssa_name (TREE_TYPE (new_temp), NULL, "vec_len_mask");
+ gcall *call = gimple_build_call_internal (IFN_VCOND_MASK_LEN, 5,
+   new_temp, all_ones_mask,
+   all_zero_mask, len, bias);
+ gimple_call_set_lhs (call, len_mask);
+ gsi_insert_before (&cond_gsi, call, GSI_SAME_STMT);
+ new_temp = len_mask;
+   }
   

[PATCH v1 2/3] RISC-V: Implement vectorizable early exit with vcond_mask_len

2024-05-13 Thread pan2 . li
From: Pan Li 

This patch depends on the middle-end implementation below.

https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651459.html

Now that the vectorizer supports loop lens for vectorizable early exit, we
would like to implement the feature for the RISC-V target.  Given the
example below:

unsigned vect_a[1923];
unsigned vect_b[1923];

unsigned test (unsigned limit, int n)
{
  unsigned ret = 0;

  for (int i = 0; i < n; i++)
    {
      vect_b[i] = limit + i;

      if (vect_a[i] > limit)
        {
          ret = vect_b[i];
          return ret;
        }

      vect_a[i] = limit;
    }

  return ret;
}

Before this patch:
  ...
.L8:
  sw    a3,0(a5)
  addiw a0,a0,1
  addi  a4,a4,4
  addi  a5,a5,4
  beq   a1,a0,.L2
.L4:
  sw    a0,0(a4)
  lw    a2,0(a5)
  bleu  a2,a3,.L8
  ret

After this patch:
  ...
.L5:
  vsetvli   a5,a3,e8,mf4,ta,ma
  vmv1r.v   v4,v2
  vsetvli   t4,zero,e32,m1,ta,ma
  vmv.v.x   v1,a5
  vadd.vv   v2,v2,v1
  vsetvli   zero,a5,e32,m1,ta,ma
  vadd.vv   v5,v4,v3
  slli  a6,a5,2
  vle32.v   v1,0(t1)
  vmsltu.vv v1,v3,v1
  vcpop.m   t4,v1
  beq   t4,zero,.L4
  vmv.x.s   a4,v4
.L3:
  ...

The following tests pass with this patch:
1. The full riscv regression tests.

gcc/ChangeLog:

* config/riscv/autovec-opt.md 
(*vcond_mask_len_popcount_):
New pattern of vcond_mask_len_popcount for vector bool mode.
* config/riscv/autovec.md (vcond_mask_len_): New pattern
of vcond_mask_len for vector bool mode.
(cbranch4): New pattern for vector bool mode.
* config/riscv/vector-iterators.md: Add new unspec UNSPEC_SELECT_MASK.
* config/riscv/vector.md (@pred_popcount): Add
VLS mode to popcount pattern.
(@pred_popcount): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/early-break-1.c: New test.
* gcc.target/riscv/rvv/autovec/early-break-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec-opt.md   | 33 ++
 gcc/config/riscv/autovec.md   | 60 +++
 gcc/config/riscv/vector-iterators.md  |  1 +
 gcc/config/riscv/vector.md| 18 +++---
 .../riscv/rvv/autovec/early-break-1.c | 34 +++
 .../riscv/rvv/autovec/early-break-2.c | 37 
 6 files changed, 174 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/early-break-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/early-break-2.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 645dc53d868..04f85d8e455 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1436,3 +1436,36 @@ (define_insn_and_split "*n"
 DONE;
   }
   [(set_attr "type" "vmalu")])
+
+;; Optimization pattern for early break auto-vectorization
+;; vcond_mask_len (mask, ones, zeros, len, bias) + vlmax popcount
+;; -> non vlmax popcount (mask, len)
+(define_insn_and_split "*vcond_mask_len_popcount_"
+  [(set (match_operand:P 0 "register_operand")
+(popcount:P
+ (unspec:VB_VLS [
+  (unspec:VB_VLS [
+   (match_operand:VB_VLS 1 "register_operand")
+   (match_operand:VB_VLS 2 "const_1_operand")
+   (match_operand:VB_VLS 3 "const_0_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")] UNSPEC_SELECT_MASK)
+  (match_operand 6 "autovec_length_operand")
+  (const_int 1)
+  (reg:SI VL_REGNUM)
+  (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)))]
+  "TARGET_VECTOR
+   && can_create_pseudo_p ()
+   && riscv_vector::get_vector_mode (Pmode, GET_MODE_NUNITS 
(mode)).exists ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+riscv_vector::emit_nonvlmax_insn (
+   code_for_pred_popcount (mode, Pmode),
+   riscv_vector::CPOP_OP,
+   operands, operands[4]);
+DONE;
+  }
+  [(set_attr "type" "vector")]
+)
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index aa1ae0fe075..dfa58b8af69 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2612,3 +2612,63 @@ (define_expand "rawmemchr"
 DONE;
   }
 )
+
+;; =
+;; == Early break auto-vectorization patterns
+;; =
+
+;; vcond_mask_len
+(define_insn_and_split "vcond_mask_len_"
+  [(set (match_operand:VB 0 "register_operand")
+(unspec: VB [
+ (match_operand:VB 1 "register_operand")
+ (match_operand:VB 2 "const_1_operand")
+ (match_operand:VB 3 "const_0_operand")
+ (match_operand 4 "autovec_length_operand")
+ (match_operand 5 "const_0_operand")] UNSPEC_SELECT_MASK))]
+  "TARGET_VECTOR
+   && can_create_pseudo_p ()
+   && riscv_vector::get_vector_mode (Pmode, GET_MODE_NUNITS 
(mode)).exists ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+machine_mode mode = riscv_vector::get_vector_mode (Pmode,
+   GET_MO

[PATCH v1 3/3] RISC-V: Enable vectorizable early exit test

2024-05-13 Thread pan2 . li
From: Pan Li 

This patch depends on the 2 patches below.

https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651459.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651460.html

Now that vectorizable early exit is supported on RISC-V, we would like to
enable the gcc vect tests for vectorizable early exit.

vect-early-break_124-pr114403.c fails to vectorize for now, because the
8-byte __builtin_memcpy is not folded into an int64 assignment during
ccp1.  We will improve that first and mark this test as xfail for RISC-V.

The following tests pass with this patch:
1. The full riscv regression tests.
2. The full aarch64 regression tests.
3. The x86 bootstrap tests.
4. The full x86 regression tests.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/slp-mask-store-1.c: Add pragma novector as it will
have 2 times LOOP VECTORIZED in RISC-V.
* gcc.dg/vect/vect-early-break_124-pr114403.c: Xfail for the
riscv backend.
* lib/target-supports.exp: Add RISC-V backend.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.dg/vect/slp-mask-store-1.c  | 2 ++
 gcc/testsuite/gcc.dg/vect/vect-early-break_124-pr114403.c | 2 +-
 gcc/testsuite/lib/target-supports.exp | 2 ++
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/slp-mask-store-1.c 
b/gcc/testsuite/gcc.dg/vect/slp-mask-store-1.c
index fdd9032da98..2f80bf89e5e 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-mask-store-1.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-mask-store-1.c
@@ -28,6 +28,8 @@ main ()
 
   if (__builtin_memcmp (x, res, sizeof (x)) != 0)
 abort ();
+
+#pragma GCC novector
   for (int i = 0; i < 32; ++i)
 if (flag[i] != 0 && flag[i] != 1)
   abort ();
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_124-pr114403.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_124-pr114403.c
index 51abf245ccb..101ae1e0eaa 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_124-pr114403.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_124-pr114403.c
@@ -2,7 +2,7 @@
 /* { dg-require-effective-target vect_early_break_hw } */
 /* { dg-require-effective-target vect_long_long } */
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { xfail riscv*-*-* } } 
} */
 
 #include "tree-vect.h"
 
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 6f5d477b128..adaa5912588 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -4099,6 +4099,7 @@ proc check_effective_target_vect_early_break { } {
|| [check_effective_target_arm_v8_neon_ok]
|| [check_effective_target_sse4]
|| [istarget amdgcn-*-*]
+   || [check_effective_target_riscv_v]
}}]
 }
 
@@ -4114,6 +4115,7 @@ proc check_effective_target_vect_early_break_hw { } {
|| [check_effective_target_arm_v8_neon_hw]
|| [check_sse4_hw_available]
|| [istarget amdgcn-*-*]
+   || [check_effective_target_riscv_v]
}}]
 }
 
-- 
2.34.1



RE: [PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 scalar

2024-05-13 Thread Li, Pan2
Committed, thanks Juzhe and Kito. Let's wait for a while before backporting to 14.

Pan

-Original Message-
From: Kito Cheng  
Sent: Monday, May 13, 2024 10:11 PM
To: juzhe.zh...@rivai.ai
Cc: Li, Pan2 ; gcc-patches 
Subject: Re: [PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 
scalar

LGTM as well :)

On Sat, May 11, 2024 at 3:58 PM juzhe.zh...@rivai.ai
 wrote:
>
> LGTM from my side. Waiting for Kito to chime in.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: pan2.li
> Date: 2024-05-11 15:54
> To: gcc-patches
> CC: juzhe.zhong; kito.cheng; Pan Li
> Subject: [PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 
> scalar
> From: Pan Li 
>
> For the vfw vx format RVV intrinsic, the scalar type _Float16 also
> requires the zvfh extension.  Unfortunately,  we only check the
> vector tree type and miss the scalar _Float16 type checking.  For
> example:
>
> vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, size_t 
> vl)
> {
>   return __riscv_vfwsub_wf_f32mf2(vs2, rs1, vl);
> }
>
> It should report an error message such as "zvfh extension is required"
> instead of an ICE for an unrecognizable insn.
>
> This patch adds this kind of validation for _Float16 in the RVV
> intrinsic API.  It will report an error like the one below when zvfh
> is not enabled.
>
> error: built-in function '__riscv_vfwsub_wf_f32mf2(vs2,  rs1,  vl)'
>   requires the zvfhmin or zvfh ISA extension
>
> PR target/114988
>
> Passed the full rv64gcv regression tests, including C/C++/Fortran.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins.cc
> (validate_instance_type_required_extensions): New func impl to
> validate the intrinisc func type ops.
> (expand_builtin): Validate instance type before expand.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr114988-1.c: New test.
> * gcc.target/riscv/rvv/base/pr114988-2.c: New test.
>
> Signed-off-by: Pan Li 
> ---
> gcc/config/riscv/riscv-vector-builtins.cc | 51 +++
> .../gcc.target/riscv/rvv/base/pr114988-1.c|  9 
> .../gcc.target/riscv/rvv/base/pr114988-2.c|  9 
> 3 files changed, 69 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-2.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
> b/gcc/config/riscv/riscv-vector-builtins.cc
> index 192a6c230d1..3fdb4400d70 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins.cc
> @@ -4632,6 +4632,54 @@ gimple_fold_builtin (unsigned int code, 
> gimple_stmt_iterator *gsi, gcall *stmt)
>return gimple_folder (rfn.instance, rfn.decl, gsi, stmt).fold ();
> }
> +static bool
> +validate_instance_type_required_extensions (const rvv_type_info type,
> + tree exp)
> +{
> +  uint64_t exts = type.required_extensions;
> +
> +  if ((exts & RVV_REQUIRE_ELEN_FP_16) &&
> +!TARGET_VECTOR_ELEN_FP_16_P (riscv_vector_elen_flags))
> +{
> +  error_at (EXPR_LOCATION (exp),
> + "built-in function %qE requires the "
> + "zvfhmin or zvfh ISA extension",
> + exp);
> +  return false;
> +}
> +
> +  if ((exts & RVV_REQUIRE_ELEN_FP_32) &&
> +!TARGET_VECTOR_ELEN_FP_32_P (riscv_vector_elen_flags))
> +{
> +  error_at (EXPR_LOCATION (exp),
> + "built-in function %qE requires the "
> + "zve32f, zve64f, zve64d or v ISA extension",
> + exp);
> +  return false;
> +}
> +
> +  if ((exts & RVV_REQUIRE_ELEN_FP_64) &&
> +!TARGET_VECTOR_ELEN_FP_64_P (riscv_vector_elen_flags))
> +{
> +  error_at (EXPR_LOCATION (exp),
> + "built-in function %qE requires the zve64d or v ISA extension",
> + exp);
> +  return false;
> +}
> +
> +  if ((exts & RVV_REQUIRE_ELEN_64) &&
> +!TARGET_VECTOR_ELEN_64_P (riscv_vector_elen_flags))
> +{
> +  error_at (EXPR_LOCATION (exp),
> + "built-in function %qE requires the "
> + "zve64x, zve64f, zve64d or v ISA extension",
> + exp);
> +  return false;
> +}
> +
> +  return true;
> +}
> +
> /* Expand a call to the RVV function with subcode CODE.  EXP is the call
> expression and TARGET is the preferred location for the result.
> Return the value of the lhs.  */
> @@ -4649,6 +4697,9 @@ expand_builtin (unsigned int code, tree exp, rtx target)
>return target;
>  }
> +  if (!validate_instance_type_required_extensions (rfn.instance.type, exp))
> +return target;
> +
>return function_expander (rfn.instance, rfn.decl, exp, target).expand ();
> }
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c
> new file mode 100644
> index 000..b8474804c88
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114988-1.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t v

RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-13 Thread Tamar Christina
> 
> Thanks Tamar for the comments.
> 
> > I think OPTIMIZE_FOR_BOTH is better here, since this is a win also when
> > optimizing for size.
>
> Sure thing, let me update it in v5.
>
> > Hmm why do you iterate independently over the statements? The block below
> > already visits every statement, doesn't it?
>
> Because it will hit .ADD_OVERFLOW first, and then it will never hit SAT_ADD
> as the shape has changed. Or shall we put it in the previous pass?
> 

That's just a matter of matching the overflow as an additional case, no?
I.e. you can add an overload for unsigned_integer_sat_add matching
IFN_ADD_OVERFLOW and using the realpart and imagpart helpers.

I think that would be better, as it avoids visiting all the statements twice
and also extends the matching to some __builtin_add_overflow uses, and it
should be fairly simple.
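
As a quick sanity check that the shapes mentioned in this thread really
compute the same thing, here is a small C sketch for uint8_t.  The function
names are made up for illustration and nothing here is taken from the patch;
__builtin_add_overflow is the existing GCC/Clang builtin.

```c
#include <stdint.h>

/* Branchless shape: (x + y) | -(TYPE)((TYPE)(x + y) < x).  */
static uint8_t
sat_add_u8_branchless (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);
  return sum | (uint8_t) -(uint8_t) (sum < x);
}

/* Branch shape: (T)(x + y) >= x ? (x + y) : -1.  */
static uint8_t
sat_add_u8_branch (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);
  return sum >= x ? sum : (uint8_t) -1;
}

/* Shape written directly with the overflow builtin.  */
static uint8_t
sat_add_u8_overflow (uint8_t x, uint8_t y)
{
  uint8_t sum;
  return __builtin_add_overflow (x, y, &sum) ? (uint8_t) -1 : sum;
}
```

All three return the wrapped sum when it does not overflow and the maximum
value of the type otherwise, which is why matching any one shape as a root
should lead to the same .SAT_ADD.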

> > The root of your match is a BIT_IOR_EXPR expression, so I think you just
> > need to change the entry below to:
> >
> > case BIT_IOR_EXPR:
> >   match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
> >   /* fall-through */
> > case BIT_XOR_EXPR:
> >   match_uaddc_usubc (&gsi, stmt, code);
> >   break;
>
> There are other shapes of SAT_ADD (not covered in this patch), like the
> branch version below, and the IOR should be only one of the roots.  That is
> why I didn't add a case here.  Then, shall we add a case for each shape
> here?  Both work for me.
>
> 

Yeah, I think that's better than iterating over the statements twice.  It also
fits better in the existing code.

Tamar.

> #define SAT_ADD_U_1(T) \
> T sat_add_u_1_##T(T x, T y) \
> { \
>   return (T)(x + y) >= x ? (x + y) : -1; \
> }
> 
> SAT_ADD_U_1(uint32_t)
> 
> Pan
> 
> 
> -Original Message-
> From: Tamar Christina 
> Sent: Monday, May 13, 2024 5:10 PM
> To: Li, Pan2 ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com;
> Liu, Hongtao 
> Subject: RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned
> scalar int
> 
> Hi Pan,
> 
> > -Original Message-
> > From: pan2...@intel.com 
> > Sent: Monday, May 6, 2024 3:48 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina
> > ; richard.guent...@gmail.com;
> > hongtao@intel.com; Pan Li 
> > Subject: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned
> > scalar int
> >
> > From: Pan Li 
> >
> > This patch adds the middle-end representation for saturating add,
> > i.e. the result of the add is set to the max value on overflow.
> > It matches a pattern similar to the one below.
> >
> > SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
> >
> > Take uint8_t as example, we will have:
> >
> > * SAT_ADD (1, 254)   => 255.
> > * SAT_ADD (1, 255)   => 255.
> > * SAT_ADD (2, 255)   => 255.
> > * SAT_ADD (255, 255) => 255.
> >
> > Given below example for the unsigned scalar integer uint64_t:
> >
> > uint64_t sat_add_u64 (uint64_t x, uint64_t y)
> > {
> >   return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
> > }
> >
> > Before this patch:
> > uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> > {
> >   long unsigned int _1;
> >   _Bool _2;
> >   long unsigned int _3;
> >   long unsigned int _4;
> >   uint64_t _7;
> >   long unsigned int _10;
> >   __complex__ long unsigned int _11;
> >
> > ;;   basic block 2, loop depth 0
> > ;;pred:   ENTRY
> >   _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
> >   _1 = REALPART_EXPR <_11>;
> >   _10 = IMAGPART_EXPR <_11>;
> >   _2 = _10 != 0;
> >   _3 = (long unsigned int) _2;
> >   _4 = -_3;
> >   _7 = _1 | _4;
> >   return _7;
> > ;;succ:   EXIT
> >
> > }
> >
> > After this patch:
> > uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> > {
> >   uint64_t _7;
> >
> > ;;   basic block 2, loop depth 0
> > ;;pred:   ENTRY
> >   _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
> >   return _7;
> > ;;succ:   EXIT
> > }
> >
> > We perform the transform during widen_mult because the sub-expr of
> > SAT_ADD will be optimized to .ADD_OVERFLOW.  We need to try the .SAT_ADD
> > pattern first and then .ADD_OVERFLOW, or we may never catch the
> > .SAT_ADD pattern.  Meanwhile, the isel pass runs after widen_mult, so we
> > cannot perform the .SAT_ADD pattern match there, as the sub-expr will
> > already have been optimized to .ADD_OVERFLOW.
> >
> > The following tests pass with this patch:
> > 1. The riscv full regression tests.
> > 2. The aarch64 full regression tests.
> > 3. The x86 bootstrap tests.
> > 4. The x86 full regression tests.
> >
> > PR target/51492
> > PR target/112600
> >
> > gcc/ChangeLog:
> >
> > * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
> > to the return true switch case(s).
> > * internal-fn.def (SAT_ADD):  Add new signed optab SAT_ADD.
> > * match.pd: Add unsigned SAT_ADD match.
> > * optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd.
> > * tree-ssa-math-opts.cc (gimple_unsigned_in
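
For readers following the thread, here is a minimal sketch of the branchless pattern quoted above, specialized to uint8_t so that the four listed examples can be checked directly.  The names sat_add_u8 and check_sat_add_examples are illustrative and do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Branchless unsigned saturating add, following the quoted pattern:
//   SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
constexpr std::uint8_t sat_add_u8 (std::uint8_t x, std::uint8_t y)
{
  // Truncate the sum back to 8 bits; a wrap-around shows up as sum < x.
  std::uint8_t sum = static_cast<std::uint8_t> (x + y);
  // 0x00 when there was no overflow, 0xff when there was.
  std::uint8_t mask = static_cast<std::uint8_t> (-static_cast<int> (sum < x));
  return static_cast<std::uint8_t> (sum | mask);
}

// The uint8_t examples listed in the commit message.
void check_sat_add_examples ()
{
  assert (sat_add_u8 (1, 254) == 255);
  assert (sat_add_u8 (1, 255) == 255);
  assert (sat_add_u8 (2, 255) == 255);
  assert (sat_add_u8 (255, 255) == 255);
}
```

The OR with an all-ones mask is what forces the result to the maximum value; when no overflow happens the mask is zero and the plain sum passes through.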

RE: [PATCH v1 1/3] Vect: Support loop len in vectorizable early exit

2024-05-13 Thread Tamar Christina
> -Original Message-
> From: pan2...@intel.com 
> Sent: Monday, May 13, 2024 3:54 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com;
> Tamar Christina ; Richard Sandiford
> ; Pan Li 
> Subject: [PATCH v1 1/3] Vect: Support loop len in vectorizable early exit
> 
> From: Pan Li 
> 
> This patch adds early break auto-vectorization support for targets which
> use length on partial vectorization.  Consider the following example:
> 
> unsigned vect_a[802];
> unsigned vect_b[802];
> 
> void test (unsigned x, int n)
> {
>   for (int i = 0; i < n; i++)
>   {
>     vect_b[i] = x + i;
> 
>     if (vect_a[i] > x)
>       break;
> 
>     vect_a[i] = x;
>   }
> }
> 
> We use VCOND_MASK_LEN to generate the equivalent of
> (mask && i < len + bias).  The resulting IR for RVV then looks like below:
> 
>   ...
>   _87 = .SELECT_VL (ivtmp_85, POLY_INT_CST [32, 32]);
>   _55 = (int) _87;
>   ...
>   mask_patt_6.13_69 = vect_cst__62 < vect__3.12_67;
>   vec_len_mask_72 = .VCOND_MASK_LEN (mask_patt_6.13_69, { -1, ... }, \
> {0, ... }, _87, 0);
>   if (vec_len_mask_72 != { 0, ... })
>     goto ; [5.50%]
>   else
>     goto ; [94.50%]
> 
> The following tests pass with this patch:
> 1. The riscv full regression tests.
> 2. The aarch64 full regression tests.
> 3. The x86 bootstrap tests.
> 4. The x86 full regression tests.
> 
> gcc/ChangeLog:
> 
>   * tree-vect-stmts.cc (vectorizable_early_exit): Add loop len
>   handling for one or multiple stmt.
> 
> Signed-off-by: Pan Li 
> ---
>  gcc/tree-vect-stmts.cc | 47 -
> -
>  1 file changed, 45 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 21e8fe98e44..bfd9d66568f 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -12896,7 +12896,9 @@ vectorizable_early_exit (vec_info *vinfo,
> stmt_vec_info stmt_info,
>  ncopies = vect_get_num_copies (loop_vinfo, vectype);
> 
>vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> +  vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);
>bool masked_loop_p = LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
> +  bool len_loop_p = LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo);
> 
>/* Now build the new conditional.  Pattern gimple_conds get dropped during
>   codegen so we must replace the original insn.  */
> @@ -12960,12 +12962,11 @@ vectorizable_early_exit (vec_info *vinfo,
> stmt_vec_info stmt_info,
>   {
> if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
> OPTIMIZE_FOR_SPEED))
> - return false;
> + vect_record_loop_len (loop_vinfo, lens, ncopies, vectype, 1);
> else
>   vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype, NULL);
>   }
> 
> -
>return true;
>  }
> 
> @@ -13018,6 +13019,25 @@ vectorizable_early_exit (vec_info *vinfo,
> stmt_vec_info stmt_info,
> stmts[i], &cond_gsi);
>   workset.quick_push (stmt_mask);
> }
> +  else if (len_loop_p)
> + for (unsigned i = 0; i < stmts.length (); i++)
> +   {
> + tree all_ones_mask = build_all_ones_cst (vectype);
> + tree all_zero_mask = build_zero_cst (vectype);
> + tree len = vect_get_loop_len (loop_vinfo, gsi, lens, ncopies,
> +   vectype, i, 1);
> + signed char cst = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS
> (loop_vinfo);
> + tree bias = build_int_cst (intQI_type_node, cst);
> + tree len_mask
> +   = make_temp_ssa_name (TREE_TYPE (stmts[i]), NULL,
> "vec_len_mask");
> + gcall *call = gimple_build_call_internal (IFN_VCOND_MASK_LEN, 5,
> +   stmts[i], all_ones_mask,
> +   all_zero_mask, len, bias);
> + gimple_call_set_lhs (call, len_mask);
> + gsi_insert_before (&cond_gsi, call, GSI_SAME_STMT);
> +
> + workset.quick_push (len_mask);
> +   }
>else
>   workset.splice (stmts);
> 
> @@ -13042,6 +13062,29 @@ vectorizable_early_exit (vec_info *vinfo,
> stmt_vec_info stmt_info,
> new_temp = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask,
>  new_temp, &cond_gsi);
>   }
> +  else if (len_loop_p)
> + {
> +   /* len_mask = VCOND_MASK_LEN (compare_mask, ones, zero, len, bias)
> +
> +  which is equivalent to:
> +
> +  len_mask = compare_mask mask && i < len ? 1 : 0
> +   */
> +   tree all_ones_mask = build_all_ones_cst (vectype);
> +   tree all_zero_mask = build_zero_cst (vectype);
> +   tree len
> + = vect_get_loop_len (loop_vinfo, gsi, lens, ncopies, vectype, 0, 1);
> +   signed char biasval = LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS
> (loop_vinfo);
> +   tree bias = build_int_cst (intQI_type_node, bi
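
As a rough scalar model of what the quoted description says VCOND_MASK_LEN computes, (mask && i < len + bias), the following sketch may help.  It is not GCC code: vcond_mask_len_model and any_lane_set are hypothetical helpers, and all vectors are assumed to have the same number of lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-lane select controlled by both a mask and a length:
//   result[i] = (mask[i] && i < len + bias) ? if_true[i] : if_false[i]
std::vector<int> vcond_mask_len_model (const std::vector<bool> &mask,
				       const std::vector<int> &if_true,
				       const std::vector<int> &if_false,
				       long len, long bias)
{
  std::vector<int> result (mask.size ());
  for (std::size_t i = 0; i < mask.size (); i++)
    {
      // Lanes at or beyond len + bias are inactive in this iteration.
      bool active = static_cast<long> (i) < len + bias;
      result[i] = (mask[i] && active) ? if_true[i] : if_false[i];
    }
  return result;
}

// Early-exit test: true if any lane of the combined mask is non-zero.
bool any_lane_set (const std::vector<int> &lanes)
{
  for (int lane : lanes)
    if (lane != 0)
      return true;
  return false;
}

void check_vcond_mask_len_model ()
{
  std::vector<bool> mask = { false, false, false, true };
  std::vector<int> ones (4, -1), zeros (4, 0);
  // The only set lane is lane 3, which is outside a length of 3.
  assert (!any_lane_set (vcond_mask_len_model (mask, ones, zeros, 3, 0)));
  assert (any_lane_set (vcond_mask_len_model (mask, ones, zeros, 4, 0)));
}
```

With all-ones and all-zeros as the two select inputs, the result is simply the compare mask with the inactive tail lanes cleared, which is why the break condition can test it against { 0, ... }.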

Re: [PATCH] c++: Avoid using __array_rank as a variable name [PR115061]

2024-05-13 Thread Marek Polacek
On Sun, May 12, 2024 at 11:48:07PM -0700, Ken Matsui wrote:
> This patch fixes a compilation error when building GCC using Clang.
> Since __array_rank is used as a built-in trait name, use rank instead.

I think you can go ahead and push this patch as obvious, thanks.
 
>   PR c++/115061
> 
> gcc/cp/ChangeLog:
> 
>   * semantics.cc (finish_trait_expr): Use rank instead of
>   __array_rank.
> 
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/semantics.cc | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 43b175f92fd..df62e2d80db 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12914,10 +12914,10 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>tree val;
>if (kind == CPTK_RANK)
>  {
> -  size_t __array_rank = 0;
> +  size_t rank = 0;
>for (; TREE_CODE (type1) == ARRAY_TYPE; type1 = TREE_TYPE (type1))
> - ++__array_rank;
> -  val = build_int_cst (size_type_node, __array_rank);
> + ++rank;
> +  val = build_int_cst (size_type_node, rank);
>  }
>else
>  val = (trait_expr_value (kind, type1, type2)
> -- 
> 2.44.0
> 

Marek



[pushed][PR115013][LRA]: Modify register starvation recognition

2024-05-13 Thread Vladimir Makarov

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115013

Successfully tested and bootstrapped on x86-64.
commit 44430ef3d8ba75692efff5f6969d5610134566d3
Author: Vladimir N. Makarov 
Date:   Mon May 13 10:12:11 2024 -0400

[PR115013][LRA]: Modify register starvation recognition

  My recent patch to recognize reg starvation resulted in a few GCC test
failures.  The following patch fixes this by using a more accurate
starvation calculation and ignoring small reg classes.

gcc/ChangeLog:

PR rtl-optimization/115013
* lra-constraints.cc (process_alt_operands): Update all_used_nregs
only for winreg.  Ignore reg starvation for small reg classes.

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index e945a4da451..92b343fa99a 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -2674,8 +2674,9 @@ process_alt_operands (int only_alternative)
 	  if (early_clobber_p
 		  || curr_static_id->operand[nop].type != OP_OUT)
 		{
-		  all_used_nregs
-		+= ira_reg_class_min_nregs[this_alternative][mode];
+		  if (winreg)
+		all_used_nregs
+		  += ira_reg_class_min_nregs[this_alternative][mode];
 		  all_this_alternative
 		= (reg_class_subunion
 		   [all_this_alternative][this_alternative]);
@@ -3250,6 +3251,7 @@ process_alt_operands (int only_alternative)
 	  overall += LRA_MAX_REJECT;
 	}
   if (all_this_alternative != NO_REGS
+	  && !SMALL_REGISTER_CLASS_P (all_this_alternative)
 	  && all_used_nregs != 0 && all_reload_nregs != 0
 	  && (all_used_nregs + all_reload_nregs + 1
 	  >= ira_class_hard_regs_num[all_this_alternative]))
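
The final condition in the second hunk can be read in isolation as the sketch below.  This is only a model under stated assumptions: alt_summary and reg_starvation_p are hypothetical names, and each field stands in for the LRA variable named in its comment.

```cpp
#include <cassert>

// Hypothetical summary of one alternative; not an LRA data structure.
struct alt_summary
{
  bool has_class;	// models all_this_alternative != NO_REGS
  bool small_class;	// models SMALL_REGISTER_CLASS_P (all_this_alternative)
  int used_nregs;	// models all_used_nregs
  int reload_nregs;	// models all_reload_nregs
  int class_hard_regs;	// models ira_class_hard_regs_num[all_this_alternative]
};

// True when the registers already used plus those needing reloads, plus one,
// reach the number of hard registers in the class.  Small classes are
// skipped, which is the part added by the patch.
bool reg_starvation_p (const alt_summary &s)
{
  return (s.has_class
	  && !s.small_class
	  && s.used_nregs != 0 && s.reload_nregs != 0
	  && s.used_nregs + s.reload_nregs + 1 >= s.class_hard_regs);
}

void check_reg_starvation ()
{
  alt_summary tight = { true, false, 3, 2, 6 };
  alt_summary tight_but_small = { true, true, 3, 2, 6 };
  assert (reg_starvation_p (tight));
  assert (!reg_starvation_p (tight_but_small));
}
```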


Re: [Patch, aarch64] v3: Preparatory patch to place target independent and,dependent changed code in one file

2024-05-13 Thread Alex Coplan
Hi Ajit,

Why did you send three mails for this revision of the patch?  If you're
going to send a new revision of the patch you should increment the
version number and outline the changes / reasons for the new revision.

Mostly the comments below are just style nits and things you missed from
the last review(s) (please try not to miss so many in the future).

On 09/05/2024 17:06, Ajit Agarwal wrote:
> Hello Alex/Richard:
> 
> All review comments are addressed.
> 
> Common infrastructure of load store pair fusion is divided into target
> independent and target dependent changed code.
> 
> Target independent code is the generic code, with pure virtual functions
> to interface between target independent and dependent code.
> 
> Target dependent code is the implementation of pure virtual function for
> aarch64 target and the call to target independent code.
> 
> Bootstrapped on aarch64-linux-gnu.
> 
> Thanks & Regards
> Ajit
> 
> 
> 
> aarch64: Preparatory patch to place target independent and
> dependent changed code in one file
> 
> Common infrastructure of load store pair fusion is divided into target
> independent and target dependent changed code.
> 
> Target independent code is the generic code, with pure virtual functions
> to interface between target independent and dependent code.
> 
> Target dependent code is the implementation of pure virtual function for
> aarch64 target and the call to target independent code.
> 
> 2024-05-09  Ajit Kumar Agarwal  
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-ldp-fusion.cc: Place target
>   independent and dependent changed code.
> ---
>  gcc/config/aarch64/aarch64-ldp-fusion.cc | 542 +++
>  1 file changed, 363 insertions(+), 179 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
> b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> index 1d9caeab05d..217790e111a 100644
> --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
> +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> @@ -138,6 +138,224 @@ struct alt_base
>poly_int64 offset;
>  };
>  
> +// Virtual base class for load/store walkers used in alias analysis.
> +struct alias_walker
> +{
> +  virtual bool conflict_p (int &budget) const = 0;
> +  virtual insn_info *insn () const = 0;
> +  virtual bool valid () const = 0;
> +  virtual void advance () = 0;
> +};
> +
> +enum class writeback{

You missed a nit here.  Space before '{'.

> +  ALL,
> +  EXISTING
> +};

You also missed adding comments for the enum, please see the review for v2:
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651074.html

> +
> +struct pair_fusion {
> +  pair_fusion ()
> +  {
> +calculate_dominance_info (CDI_DOMINATORS);
> +df_analyze ();
> +crtl->ssa = new rtl_ssa::function_info (cfun);
> +  };
> +
> +  // Given:
> +  // - an rtx REG_OP, the non-memory operand in a load/store insn,
> +  // - a machine_mode MEM_MODE, the mode of the MEM in that insn, and
> +  // - a boolean LOAD_P (true iff the insn is a load), then:
> +  // return true if the access should be considered an FP/SIMD access.
> +  // Such accesses are segregated from GPR accesses, since we only want
> +  // to form pairs for accesses that use the same register file.
> +  virtual bool fpsimd_op_p (rtx, machine_mode, bool)
> +  {
> +return false;
> +  }
> +
> +  // Return true if we should consider forming ldp/stp insns from memory
> +  // accesses with operand mode MODE at this stage in compilation.
> +  virtual bool pair_operand_mode_ok_p (machine_mode mode) = 0;
> +
> +  // Return true iff REG_OP is a suitable register operand for a paired
> +  // memory access, where LOAD_P is true if we're asking about loads and
> +  // false for stores.  MEM_MODE gives the mode of the operand.
> +  virtual bool pair_reg_operand_ok_p (bool load_p, rtx reg_op,
> +   machine_mode mode) = 0;

The comment needs updating since we changed the name of the last param,
i.e. s/MEM_MODE/MODE/.

> +
> +  // Return alias check limit.
> +  // This is needed to avoid unbounded quadratic behaviour when
> +  // performing alias analysis.
> +  virtual int pair_mem_alias_check_limit () = 0;
> +
> +  // Returns true if we should try to handle writeback opportunities
> +  // (not whether there are any).
> +  virtual bool handle_writeback_opportunities (enum writeback which) = 0 ;

Heh, the bit in parens from the v2 review probably doesn't need to go
into the comment here.

Also you should describe WHICH in the comment.

> +
> +  // Given BASE_MEM, the mem from the lower candidate access for a pair,
> +  // and LOAD_P (true if the access is a load), check if we should proceed
> +  // to form the pair given the target's code generation policy on
> +  // paired accesses.
> +  virtual bool pair_mem_ok_with_policy (rtx first_mem, bool load_p,
> + machine_mode mode) = 0;

The name of the first param needs updating in the prototype, i.e.
s/first_mem/base_mem/.  I think you missed the bit a

[PATCH] Match: optimize `a == CST & unary(a)` [PR111487]

2024-05-13 Thread Andrew Pinski
This expands the `a == CST & a` optimization to handle
more than just casts.  It adds the optimization for
unary operators.
The patch for binary operators will come later.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/111487
gcc/ChangeLog:

* match.pd (tcc_int_unary): New operator list.
(`a == CST & unary(a)`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/and-unary-1.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/match.pd| 12 
 gcc/testsuite/gcc.dg/tree-ssa/and-unary-1.c | 61 +
 2 files changed, 73 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/and-unary-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 07e743ae464..3ee28a3d8fc 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -57,6 +57,10 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "cfn-operators.pd"
 
+/* integer unary operators that return the same type. */
+(define_operator_list tcc_int_unary
+ abs absu negate bit_not BSWAP POPCOUNT CTZ CLZ PARITY)
+
 /* Define operand lists for math rounding functions {,i,l,ll}FN,
where the versions prefixed with "i" return an int, those prefixed with
"l" return a long and those prefixed with "ll" return a long long.
@@ -5451,6 +5455,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   @2
   { build_zero_cst (type); }))
 
+/* `(a == CST) & unary(a)` can be simplified to `(a == CST) & unary(CST)`. */
+(simplify
+ (bit_and:c (convert@2 (eq @0 INTEGER_CST@1))
+(convert? (tcc_int_unary @3)))
+ (if (bitwise_equal_p (@0, @3))
+  (with { tree  inner_type = TREE_TYPE (@3); }
+   (bit_and @2 (convert (tcc_int_unary (convert:inner_type @1)))
+
 /* Optimize
# x_5 in range [cst1, cst2] where cst2 = cst1 + 1
x_5 == cstN ? cst4 : cst3
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/and-unary-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/and-unary-1.c
new file mode 100644
index 000..c157bc11b00
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/and-unary-1.c
@@ -0,0 +1,61 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-forwprop1-raw -fdump-tree-optimized-raw" } */
+/* unary part of PR tree-optimization/111487 */
+
+int abs1(int a)
+{
+  int b = __builtin_abs(a);
+  return (a == 1) & b;
+}
+int absu1(int a)
+{
+  int b;
+  b = a > 0 ? -a:a;
+  b = -b;
+return (a == 1) & b;
+}
+
+int bswap1(int a)
+{
+  int b = __builtin_bswap32(a);
+  return (a == 1) & b;
+}
+
+int ctz1(int a)
+{
+  int b = __builtin_ctz(a);
+  return (a == 1) & b;
+}
+int pop1(int a)
+{
+  int b = __builtin_popcount(a);
+  return (a == 1) & b;
+}
+int neg1(int a)
+{
+  int b = -(a);
+  return (a == 1) & b;
+}
+int not1(int a)
+{
+  int b = ~(a);
+  return (a == 1) & b;
+}
+int partity1(int a)
+{
+  int b = __builtin_parity(a);
+  return (a == 1) & b;
+}
+
+
+/* We should optimize out the unary operator for each.
+   For ctz we can optimize directly to `return 0`.
+   For bswap1 and not1, we can do the same but not until after forwprop1.  */
+/* { dg-final { scan-tree-dump-times "eq_expr, " 7 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "eq_expr, " 5 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "abs_expr, "  "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "absu_expr, "  "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "bit_not_expr, "  "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "negate_expr, "  "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "gimple_call <"  "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "bit_and_expr,  "  "forwprop1" } } */
-- 
2.34.1
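
To see why the rewrite is safe, note that when `a == CST` is false both forms are zero, and when it is true `a` and `CST` are interchangeable.  The sketch below checks this by brute force over a small input range; the helper names are illustrative and the lambdas in the check stand in for the negate and bit_not cases of the new test.

```cpp
#include <cassert>

// Original form: the unary operator is applied to the variable.
template <typename Unary>
int and_unary_before (int a, int cst, Unary op)
{
  return (a == cst) & op (a);
}

// Simplified form: the unary operator is applied to the constant, so the
// compiler can fold op (cst) at compile time.
template <typename Unary>
int and_unary_after (int a, int cst, Unary op)
{
  return (a == cst) & op (cst);
}

// Compare both forms for every a in [-64, 64].
template <typename Unary>
bool forms_agree (int cst, Unary op)
{
  for (int a = -64; a <= 64; a++)
    if (and_unary_before (a, cst, op) != and_unary_after (a, cst, op))
      return false;
  return true;
}

void check_and_unary ()
{
  assert (forms_agree (1, [] (int v) { return -v; }));
  assert (forms_agree (1, [] (int v) { return ~v; }));
}
```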



Re: [PATCH v1 3/3] RISC-V: Enable vectorizable early exit test

2024-05-13 Thread Robin Dapp
Hi Pan,

>  
> @@ -4114,6 +4115,7 @@ proc check_effective_target_vect_early_break_hw { } {
>   || [check_effective_target_arm_v8_neon_hw]
>   || [check_sse4_hw_available]
>   || [istarget amdgcn-*-*]
> + || [check_effective_target_riscv_v]
>   }}]
>  }

I believe this should be riscv_v_ok.  riscv_v only checks if we can
compile.  OK with that changed after 2/3 is in.

Regards
 Robin


[r15-429 Regression] FAIL: experimental/simd/pr109261_constexpr_simd.cc -msse2 -O2 -Wno-psabi (test for excess errors) on Linux/x86_64

2024-05-13 Thread haochen.jiang
On Linux/x86_64,

fb1649f8b4ad5043dd0e65e4e3a643a0ced018a9 is the first bad commit
commit fb1649f8b4ad5043dd0e65e4e3a643a0ced018a9
Author: Matthias Kretz 
Date:   Mon May 6 12:13:55 2024 +0200

libstdc++: Use __builtin_shufflevector for simd split and concat

caused

FAIL: experimental/simd/pr109261_constexpr_simd.cc -msse2 -O2 -Wno-psabi (test 
for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r15-429/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check 
RUNTESTFLAGS="conformance.exp=experimental/simd/pr109261_constexpr_simd.cc 
--target_board='unix{-m32}'"

(Please do not reply to this email; for questions about this report, contact me 
at haochen dot jiang at intel.com.)
(If you run into cascadelake-related problems, disabling AVX512F on the command 
line might help.)
(However, please make sure that there are no potential problems with AVX512.)


Re: [PATCH] c++: Avoid using __array_rank as a variable name [PR115061]

2024-05-13 Thread Ken Matsui
On Mon, May 13, 2024 at 8:19 AM Marek Polacek  wrote:
>
> On Sun, May 12, 2024 at 11:48:07PM -0700, Ken Matsui wrote:
> > This patch fixes a compilation error when building GCC using Clang.
> > Since __array_rank is used as a built-in trait name, use rank instead.
>
> I think you can go ahead and push this patch as obvious, thanks.

Oh, I see.  Thank you for letting me know!

>
> >   PR c++/115061
> >
> > gcc/cp/ChangeLog:
> >
> >   * semantics.cc (finish_trait_expr): Use rank instead of
> >   __array_rank.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  gcc/cp/semantics.cc | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 43b175f92fd..df62e2d80db 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -12914,10 +12914,10 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> > kind, tree type1, tree type2)
> >tree val;
> >if (kind == CPTK_RANK)
> >  {
> > -  size_t __array_rank = 0;
> > +  size_t rank = 0;
> >for (; TREE_CODE (type1) == ARRAY_TYPE; type1 = TREE_TYPE (type1))
> > - ++__array_rank;
> > -  val = build_int_cst (size_type_node, __array_rank);
> > + ++rank;
> > +  val = build_int_cst (size_type_node, rank);
> >  }
> >else
> >  val = (trait_expr_value (kind, type1, type2)
> > --
> > 2.44.0
> >
>
> Marek
>


[COMMITTED] c++: Avoid using __array_rank as a variable name [PR115061]

2024-05-13 Thread Ken Matsui
Pushed as obvious.

-- >8 --

This patch fixes a compilation error when building GCC using Clang.
Since __array_rank is used as a built-in trait name, use rank instead.

PR c++/115061

gcc/cp/ChangeLog:

* semantics.cc (finish_trait_expr): Use rank instead of
__array_rank.

Signed-off-by: Ken Matsui 
---
 gcc/cp/semantics.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 43b175f92fd..df62e2d80db 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12914,10 +12914,10 @@ finish_trait_expr (location_t loc, cp_trait_kind 
kind, tree type1, tree type2)
   tree val;
   if (kind == CPTK_RANK)
 {
-  size_t __array_rank = 0;
+  size_t rank = 0;
   for (; TREE_CODE (type1) == ARRAY_TYPE; type1 = TREE_TYPE (type1))
-   ++__array_rank;
-  val = build_int_cst (size_type_node, __array_rank);
+   ++rank;
+  val = build_int_cst (size_type_node, rank);
 }
   else
 val = (trait_expr_value (kind, type1, type2)
-- 
2.44.0
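
The loop touched by this patch counts array dimensions by repeatedly stepping from an array type to its element type.  A standalone sketch of the same idea at the C++ type level is shown below, assuming C++17; array_rank_of is an illustrative name and is not part of GCC or libstdc++.  Identifiers containing a double underscore are reserved for the implementation, which is why a plain name such as rank is the safer choice for a local variable.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Peel one array extent per step until a non-array type is reached.
template <typename T>
constexpr std::size_t array_rank_of ()
{
  if constexpr (std::is_array_v<T>)
    return 1 + array_rank_of<std::remove_extent_t<T>> ();
  else
    return 0;
}

void check_array_rank ()
{
  assert (array_rank_of<int> () == 0);
  assert (array_rank_of<int[3]> () == 1);
  // Cross-check against the standard library trait.
  assert (array_rank_of<int[2][3][4]> () == std::rank_v<int[2][3][4]>);
}
```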



[COMMITTED][GCC12] Backport of 111009 patch.

2024-05-13 Thread Andrew MacLeod

Same patch for gcc12.

bootstraps and passes all tests on x86_64-pc-linux-gnu

On 5/9/24 10:32, Andrew MacLeod wrote:
As requested, backported the patch for 111009 to resolve incorrect 
ranges from addr_expr and committed to GCC 13 branch.


bootstraps and passes all tests on x86_64-pc-linux-gnu

Andrew

commit b5d079c37e9eee15c0bfe34ffcae31e551192777
Author: Andrew MacLeod 
Date:   Fri May 10 13:56:01 2024 -0400

Fix range-ops operator_addr.

Lack of symbolic information prevents op1_range from being able to draw
the same conclusions as fold_range can.

PR tree-optimization/111009
gcc/
* range-op.cc (operator_addr_expr::op1_range): Be more restrictive.
* value-range.h (contains_zero_p): New.

gcc/testsuite/
* gcc.dg/pr111009.c: New.

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index bf95f5fbaa1..2e0d67b70b6 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -3825,7 +3825,17 @@ operator_addr_expr::op1_range (irange &r, tree type,
 			   const irange &op2,
 			   relation_kind rel ATTRIBUTE_UNUSED) const
 {
-  return operator_addr_expr::fold_range (r, type, lhs, op2);
+   if (empty_range_varying (r, type, lhs, op2))
+return true;
+
+  // Return a non-null pointer of the LHS type (passed in op2), but only
+  // if we cant overflow, eitherwise a no-zero offset could wrap to zero.
+  // See PR 111009.
+  if (!contains_zero_p (lhs) && TYPE_OVERFLOW_UNDEFINED (type))
+r = range_nonzero (type);
+  else
+r.set_varying (type);
+  return true;
 }
 
 
diff --git a/gcc/testsuite/gcc.dg/pr111009.c b/gcc/testsuite/gcc.dg/pr111009.c
new file mode 100644
index 000..3accd9ac063
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111009.c
@@ -0,0 +1,38 @@
+/* PR tree-optimization/111009 */
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-strict-overflow" } */
+
+struct dso {
+ struct dso * next;
+ int maj;
+};
+
+__attribute__((noipa)) static void __dso_id__cmp_(void) {}
+
+__attribute__((noipa))
+static int bug(struct dso * d, struct dso *dso)
+{
+ struct dso **p = &d;
+ struct dso *curr = 0;
+
+ while (*p) {
+  curr = *p;
+  // prevent null deref below
+  if (!dso) return 1;
+  if (dso == curr) return 1;
+
+  int *a = &dso->maj;
+  // null deref
+  if (!(a && *a)) __dso_id__cmp_();
+
+  p = &curr->next;
+ }
+ return 0;
+}
+
+__attribute__((noipa))
+int main(void) {
+struct dso d = { 0, 0, };
+bug(&d, 0);
+}
+
diff --git a/gcc/value-range.h b/gcc/value-range.h
index d4cba22d540..22f5fc68d7c 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -605,6 +605,16 @@ irange::normalize_kind ()
 }
 }
 
+inline bool
+contains_zero_p (const irange &r)
+{
+  if (r.undefined_p ())
+return false;
+
+  tree zero = build_zero_cst (r.type ());
+  return r.contains_p (zero);
+}
+
 // Return the maximum value for TYPE.
 
 inline tree
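
The decision made by the patched op1_range can be modeled with a toy interval type, as sketched below.  None of this is GCC's irange API: toy_range, toy_contains_zero_p and toy_addr_op1_range are hypothetical, and the early empty_range_varying exit from the patch is not modeled, so the lhs is assumed to be a defined range.

```cpp
#include <cassert>

// Toy closed interval [lo, hi]; "undefined" models an empty range.
struct toy_range
{
  bool undefined;
  long lo;
  long hi;
};

// Same shape as the new contains_zero_p: an undefined range contains nothing.
bool toy_contains_zero_p (const toy_range &r)
{
  if (r.undefined)
    return false;
  return r.lo <= 0 && 0 <= r.hi;
}

enum class op1_result { nonzero, varying };

// Only conclude that the operand is non-null when the result cannot be zero
// and overflow is undefined; otherwise a non-zero offset could wrap to zero.
op1_result toy_addr_op1_range (const toy_range &lhs, bool overflow_undefined)
{
  if (!toy_contains_zero_p (lhs) && overflow_undefined)
    return op1_result::nonzero;
  return op1_result::varying;
}

void check_toy_addr_op1_range ()
{
  toy_range positive = { false, 1, 10 };
  assert (toy_addr_op1_range (positive, true) == op1_result::nonzero);
  // Wrapping allowed, as with the -fno-strict-overflow option in the test.
  assert (toy_addr_op1_range (positive, false) == op1_result::varying);
}
```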


Re: [PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 scalar

2024-05-13 Thread Jeff Law




On 5/13/24 9:00 AM, Li, Pan2 wrote:

Committed, thanks Juzhe and Kito. Let's wait for a while before backport to 14.

Could you fix the formatting nits caught by the CI linter?

=== ERROR type #1: trailing operator (4 error(s)) ===
gcc/config/riscv/riscv-vector-builtins.cc:4641:39:  if ((exts & 
RVV_REQUIRE_ELEN_FP_16) &&
gcc/config/riscv/riscv-vector-builtins.cc:4651:39:  if ((exts & 
RVV_REQUIRE_ELEN_FP_32) &&
gcc/config/riscv/riscv-vector-builtins.cc:4661:39:  if ((exts & 
RVV_REQUIRE_ELEN_FP_64) &&
gcc/config/riscv/riscv-vector-builtins.cc:4670:36:  if ((exts & 
RVV_REQUIRE_ELEN_64) &&



The "&&" needs to come down to the next line, indented like

  if ((exts & RVV_REQUIRE_ELEN_FP_16)
      && !TARGET_VECTOR_.)

I.e., the "&&" indents just inside the first open paren.  It looks like 
all the conditions in validate_instance_type_required_extensions need to 
be fixed in a similar manner.


Given this is NFC, just post it for the archiver.  No need to wait on 
review.


Jeff
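
A compilable illustration of the layout described above follows.  The flag constants and the boolean parameters are hypothetical stand-ins for layout purposes only; they are not the names used in riscv-vector-builtins.cc.

```cpp
#include <cassert>

// Hypothetical requirement bits, for illustration only.
constexpr unsigned REQUIRE_FP_16 = 1u << 0;
constexpr unsigned REQUIRE_FP_32 = 1u << 1;

// Returns true when a required extension is not available.
bool missing_extension_p (unsigned exts, bool have_fp16, bool have_fp32)
{
  // Break before the operator: "&&" starts the continuation line and sits
  // just inside the opening parenthesis of the condition.
  if ((exts & REQUIRE_FP_16)
      && !have_fp16)
    return true;

  if ((exts & REQUIRE_FP_32)
      && !have_fp32)
    return true;

  return false;
}

void check_missing_extension ()
{
  assert (missing_extension_p (REQUIRE_FP_16, false, true));
  assert (!missing_extension_p (REQUIRE_FP_16, true, false));
}
```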




Re: [COMMITTED 2/5] Fix ranger when called from SCEV.

2024-05-13 Thread Jan-Benedict Glaw
On Tue, 2024-04-30 17:24:15 -0400, Andrew MacLeod  wrote:
> Bootstrapped on x86_64-pc-linux-gnu with no regressions.  pushed.

Starting with this patch (upstream as
e8ae56a7dc46e39a48017bb5159e4dc672ec7fad, can still be reproduced with
0c585c8d0dd85601a8d116ada99126a48c8ce9fd as of May 13th), my CI builds fail for
csky-elf in all-target-libgcc by falling into an infinite loop:

../gcc/configure '--with-pkgversion=basepoints/gcc-15-432-g0c585c8d0dd, built 
at 1715608899'\
--prefix=/tmp/gcc-csky-elf --enable-werror-always 
--enable-languages=all\
--disable-gcov --disable-shared --disable-threads --target=csky-elf 
--without-headers
make V=1 all-gcc
make V=1 install-strip-gcc
make V=1 all-target-libgcc

(gdb) bt
#0  0x0098f1df in bitmap_list_find_element (head=0x38f2e18, indx=5001) 
at ../../gcc/gcc/bitmap.cc:375
#1  bitmap_set_bit (head=0x38f2e18, bit=640244) at ../../gcc/gcc/bitmap.cc:962
#2  0x00d39cd1 in process_bb_lives (bb=, 
curr_point=@0x7ffe062c1b2c: 3039473, dead_insn_p=) at 
../../gcc/gcc/lra-lives.cc:889
#3  lra_create_live_ranges_1 (all_p=all_p@entry=true, dead_insn_p=) at ../../gcc/gcc/lra-lives.cc:1416
#4  0x00d3b810 in lra_create_live_ranges (all_p=all_p@entry=true, 
dead_insn_p=) at ../../gcc/gcc/lra-lives.cc:1486
#5  0x00d1a8bd in lra (f=, verbose=) at 
../../gcc/gcc/lra.cc:2482
#6  0x00cd0e18 in do_reload () at ../../gcc/gcc/ira.cc:5973
#7  (anonymous namespace)::pass_reload::execute (this=) at 
../../gcc/gcc/ira.cc:6161
#8  0x00de6368 in execute_one_pass (pass=pass@entry=0x367c490) at 
../../gcc/gcc/passes.cc:2647
#9  0x00de6c00 in execute_pass_list_1 (pass=0x367c490) at 
../../gcc/gcc/passes.cc:2756
#10 0x00de6c12 in execute_pass_list_1 (pass=0x367b2f0) at 
../../gcc/gcc/passes.cc:2757
#11 0x00de6c39 in execute_pass_list (fn=0x7f24a1c06240, pass=) at ../../gcc/gcc/passes.cc:2767
#12 0x00a188c6 in cgraph_node::expand (this=0x7f24a1bfaaa0) at 
../../gcc/gcc/context.h:48
#13 cgraph_node::expand (this=0x7f24a1bfaaa0) at 
../../gcc/gcc/cgraphunit.cc:1798
#14 0x00a1a69b in expand_all_functions () at 
../../gcc/gcc/cgraphunit.cc:2028
#15 symbol_table::compile (this=0x7f24a205b000) at 
../../gcc/gcc/cgraphunit.cc:2404
#16 0x00a1ccb8 in symbol_table::compile (this=0x7f24a205b000) at 
../../gcc/gcc/cgraphunit.cc:2315
#17 symbol_table::finalize_compilation_unit (this=0x7f24a205b000) at 
../../gcc/gcc/cgraphunit.cc:2589
#18 0x00f0932d in compile_file () at ../../gcc/gcc/toplev.cc:476
#19 0x00839648 in do_compile () at ../../gcc/gcc/toplev.cc:2158
#20 toplev::main (this=this@entry=0x7ffe062c1f2e, argc=, 
argc@entry=78, argv=, argv@entry=0x7ffe062c2058) at 
../../gcc/gcc/toplev.cc:2314
#21 0x0083ad9e in main (argc=78, argv=0x7ffe062c2058) at 
../../gcc/gcc/main.cc:39

(The loop is in process_bb_lives(), in the
FOR_BB_INSNS_REVERSE_SAFE (bb, curr_insn, next) block starting at
about line 696.)

MfG, JBG

-- 




Re: [COMMITTED 2/5] Fix ranger when called from SCEV.

2024-05-13 Thread Jan-Benedict Glaw
On Mon, 2024-05-13 20:19:42 +0200, Jan-Benedict Glaw  wrote:
> On Tue, 2024-04-30 17:24:15 -0400, Andrew MacLeod  wrote:
> > Bootstrapped on x86_64-pc-linux-gnu with no regressions.  pushed.
> 
> Starting with this patch (upstream as
> e8ae56a7dc46e39a48017bb5159e4dc672ec7fad, can still be reproduced with
> 0c585c8d0dd85601a8d116ada99126a48c8ce9fd as of May 13th), my CI builds fail for
> csky-elf in all-target-libgcc by falling into an infinite loop:
> 
> ../gcc/configure '--with-pkgversion=basepoints/gcc-15-432-g0c585c8d0dd, built 
> at 1715608899'  \
>   --prefix=/tmp/gcc-csky-elf --enable-werror-always 
> --enable-languages=all\
>   --disable-gcov --disable-shared --disable-threads --target=csky-elf 
> --without-headers
> make V=1 all-gcc
> make V=1 install-strip-gcc
> make V=1 all-target-libgcc

Just to add:

/var/lib/laminar/run/gcc-csky-elf/65/toolchain-build/./gcc/cc1 -quiet   
\
-I . -I . -I ../../.././gcc -I ../../../../gcc/libgcc   
\
-I ../../../../gcc/libgcc/. -I ../../../../gcc/libgcc/../gcc
\
-I ../../../../gcc/libgcc/../include -imultilib ck801   
\
-iprefix 
/var/lib/laminar/run/gcc-csky-elf/65/toolchain-build/gcc/../lib/gcc/csky-elf/15.0.0/
   \
-isystem 
/var/lib/laminar/run/gcc-csky-elf/65/toolchain-build/./gcc/include \
-isystem 
/var/lib/laminar/run/gcc-csky-elf/65/toolchain-build/./gcc/include-fixed   \
-MD unwind-dw2-fde.d -MF unwind-dw2-fde.dep -MP -MT unwind-dw2-fde.o
\
-D IN_GCC -D CROSS_DIRECTORY_STRUCTURE -D IN_LIBGCC2 -D inhibit_libc
\
-D HAVE_CC_TLS -D USE_EMUTLS -D HIDE_EXPORTS
\
-isystem /tmp/gcc-csky-elf/csky-elf/include 
\
-isystem /tmp/gcc-csky-elf/csky-elf/sys-include 
\
-isystem ./include ../../../../gcc/libgcc/unwind-dw2-fde.c -quiet   
\
-dumpbase unwind-dw2-fde.c -dumpbase-ext .c -mcpu=ck801 -g -g -g -O2 
-O2 -O2\
-Wextra -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual 
-Wstrict-prototypes\
-Wmissing-prototypes -Wold-style-definition -fbuilding-libgcc 
-fno-stack-protector  \
-fexceptions -fvisibility=hidden -o /tmp/cc3SHedS.s

> (gdb) bt
> #0  0x0098f1df in bitmap_list_find_element (head=0x38f2e18, 
> indx=5001) at ../../gcc/gcc/bitmap.cc:375
> #1  bitmap_set_bit (head=0x38f2e18, bit=640244) at ../../gcc/gcc/bitmap.cc:962
> #2  0x00d39cd1 in process_bb_lives (bb=, 
> curr_point=@0x7ffe062c1b2c: 3039473, dead_insn_p=) at 
> ../../gcc/gcc/lra-lives.cc:889
> #3  lra_create_live_ranges_1 (all_p=all_p@entry=true, dead_insn_p= out>) at ../../gcc/gcc/lra-lives.cc:1416
> #4  0x00d3b810 in lra_create_live_ranges (all_p=all_p@entry=true, 
> dead_insn_p=) at ../../gcc/gcc/lra-lives.cc:1486
> #5  0x00d1a8bd in lra (f=, verbose=) at 
> ../../gcc/gcc/lra.cc:2482
> #6  0x00cd0e18 in do_reload () at ../../gcc/gcc/ira.cc:5973
> #7  (anonymous namespace)::pass_reload::execute (this=) at 
> ../../gcc/gcc/ira.cc:6161
> #8  0x00de6368 in execute_one_pass (pass=pass@entry=0x367c490) at 
> ../../gcc/gcc/passes.cc:2647
> #9  0x00de6c00 in execute_pass_list_1 (pass=0x367c490) at 
> ../../gcc/gcc/passes.cc:2756
> #10 0x00de6c12 in execute_pass_list_1 (pass=0x367b2f0) at 
> ../../gcc/gcc/passes.cc:2757
> #11 0x00de6c39 in execute_pass_list (fn=0x7f24a1c06240, 
> pass=) at ../../gcc/gcc/passes.cc:2767
> #12 0x00a188c6 in cgraph_node::expand (this=0x7f24a1bfaaa0) at 
> ../../gcc/gcc/context.h:48
> #13 cgraph_node::expand (this=0x7f24a1bfaaa0) at 
> ../../gcc/gcc/cgraphunit.cc:1798
> #14 0x00a1a69b in expand_all_functions () at 
> ../../gcc/gcc/cgraphunit.cc:2028
> #15 symbol_table::compile (this=0x7f24a205b000) at 
> ../../gcc/gcc/cgraphunit.cc:2404
> #16 0x00a1ccb8 in symbol_table::compile (this=0x7f24a205b000) at 
> ../../gcc/gcc/cgraphunit.cc:2315
> #17 symbol_table::finalize_compilation_unit (this=0x7f24a205b000) at 
> ../../gcc/gcc/cgraphunit.cc:2589
> #18 0x00f0932d in compile_file () at ../../gcc/gcc/toplev.cc:476
> #19 0x00839648 in do_compile () at ../../gcc/gcc/toplev.cc:2158
> #20 toplev::main (this=this@entry=0x7ffe062c1f2e, argc=, 
> argc@entry=78, argv=, argv@entry=0x7ffe062c2058) at 
> ../../gcc/gcc/toplev.cc:2314
> #21 0x0083ad9e in main (argc=78, argv=0x7ffe062c2058) at 
> ../../gcc/gcc/main.cc:39
> 
> (Loop is based in process_bb_lives(), looping in the
> FOR_BB_INSNS_REVERSE_SAFE (bb, curr_insn, next) block starting at
> about line 696.)

MfG, JBG

-- 




Re: [PATCH] rs6000: Enable overlapped by-pieces operations

2024-05-13 Thread Kewen.Lin
Hi,

on 2024/5/9 15:35, HAO CHEN GUI wrote:
> Hi Kewen,
>   Thanks for your comments.
> 
> On 2024/5/9 13:44, Kewen.Lin wrote:
>> Hi,
>>
>> on 2024/5/8 14:47, HAO CHEN GUI wrote:
>>> Hi,
>>>   This patch enables overlapped by-piece operations. On rs6000, default
>>> move/set/clear ratio is 2. So the overlap is only enabled with compare
>>> by-pieces.
>>
>> Thanks for enabling this, did you evaluate if it can help some benchmark?
> 
> Tested it with SPEC2017. No obvious performance impact. I think memory
> compare might not be hot enough.
> 
> Tested it with my micro benchmark. 5-10% performance gain when compare
> length is 7.

Nice!

> 
>>
>>>
>>>   Bootstrapped and tested on powerpc64-linux BE and LE with no
>>> regressions. Is it OK for the trunk?
>>>
>>> Thanks
>>> Gui Haochen
>>>
>>> ChangeLog
>>> rs6000: Enable overlapped by-pieces operations
>>>
>>> This patch enables overlapped by-piece operations by defining
>>> TARGET_OVERLAP_OP_BY_PIECES_P to true.  On rs6000, default move/set/clear
>>> ratio is 2.  So the overlap is only enabled with compare by-pieces.
>>>
>>> gcc/
>>> * config/rs6000/rs6000.cc (TARGET_OVERLAP_OP_BY_PIECES_P): Define.
>>>
>>> gcc/testsuite/
>>> * gcc.target/powerpc/block-cmp-9.c: New.
>>>
>>>
>>> patch.diff
>>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>>> index 6b9a40fcc66..2b5f5cf1d86 100644
>>> --- a/gcc/config/rs6000/rs6000.cc
>>> +++ b/gcc/config/rs6000/rs6000.cc
>>> @@ -1774,6 +1774,9 @@ static const scoped_attribute_specs *const 
>>> rs6000_attribute_table[] =
>>>  #undef TARGET_CONST_ANCHOR
>>>  #define TARGET_CONST_ANCHOR 0x8000
>>>
>>> +#undef TARGET_OVERLAP_OP_BY_PIECES_P
>>> +#define TARGET_OVERLAP_OP_BY_PIECES_P hook_bool_void_true
>>> +
>>>  
>>>
>>>  /* Processor table.  */
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/block-cmp-9.c 
>>> b/gcc/testsuite/gcc.target/powerpc/block-cmp-9.c
>>> new file mode 100644
>>> index 000..b5f51affbb7
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/block-cmp-9.c
>>> @@ -0,0 +1,11 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
>>
>> Why does it need power8 forced here?
> 
> I just wanted to exclude P7 LE, as targetm.slow_unaligned_access returns false
> for it and the cmpmemsi expander won't be invoked.

> I thought it over. It's not needed. For the sub-targets where the library is
> called, l[hb]z won't be generated either.

Thanks for checking, OK with dropping this forced power8.

BR,
Kewen

> 
>>
>> BR,
>> Kewen
>>
>>> +/* { dg-final { scan-assembler-not {\ml[hb]z\M} } } */
>>> +
>>> +/* Test if by-piece overlap compare is enabled and following case is
>>> +   implemented by two overlap word loads and compares.  */
>>> +
>>> +int foo (const char* s1, const char* s2)
>>> +{
>>> +  return __builtin_memcmp (s1, s2, 7) == 0;
>>> +}
>>
> 
> Thanks
> Gui Haochen
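
As a rough illustration of what an "overlapped" by-pieces compare means for
the 7-byte memcmp in the test case above, here is a minimal C sketch. It
assumes 4-byte pieces; the helper name eq7_overlapped is made up for this
example, and this is not the RTL that GCC actually emits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): equality test for two
   7-byte buffers using two overlapping 4-byte loads per buffer instead
   of a 4-byte + 2-byte + 1-byte sequence.  */
static int
eq7_overlapped (const char *s1, const char *s2)
{
  uint32_t a_lo, b_lo, a_hi, b_hi;

  /* Bytes 0..3.  memcpy keeps the unaligned access well-defined in C.  */
  memcpy (&a_lo, s1, 4);
  memcpy (&b_lo, s2, 4);

  /* Bytes 3..6: this piece overlaps the first one by one byte.  */
  memcpy (&a_hi, s1 + 3, 4);
  memcpy (&b_hi, s2 + 3, 4);

  /* Equal only if both pieces match.  */
  return ((a_lo ^ b_lo) | (a_hi ^ b_hi)) == 0;
}
```

Without overlap, the tail of the buffer would need halfword and byte loads,
which is what the scan-assembler-not check for l[hb]z in the test guards
against.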



[PATCH v2 0/2] RISC-V improve stack/array access by constant mat tweak

2024-05-13 Thread Vineet Gupta
Hi,

This set of patches helps improve stack/array accesses by improving
constant materialization. Details are in the respective patches.

The first patch is the main change which improves SPEC cactu by 10%.

As discussed/agreed for v1 [1], I've dropped the splitter variant for
stack accesses.

I also have a few follow-ups which I will come back to separately.

Thx,
-Vineet

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647874.html

Vineet Gupta (2):
  RISC-V: avoid LUI based const materialization ... [part of PR/106265]
  RISC-V: avoid LUI based const mat in prologue/epilogue expansion
[PR/105733]

 gcc/config/riscv/constraints.md   |  6 ++
 gcc/config/riscv/predicates.md|  6 ++
 gcc/config/riscv/riscv-protos.h   |  3 +
 gcc/config/riscv/riscv.cc | 85 +--
 gcc/config/riscv/riscv.h  | 22 +
 gcc/config/riscv/riscv.md | 40 +
 gcc/testsuite/gcc.target/riscv/pr105733.c | 15 
 .../riscv/rvv/autovec/vls/spill-1.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-2.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-3.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-4.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-5.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-6.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-7.c   |  4 +-
 .../gcc.target/riscv/sum-of-two-s12-const-1.c | 45 ++
 .../gcc.target/riscv/sum-of-two-s12-const-2.c | 15 
 .../gcc.target/riscv/sum-of-two-s12-const-3.c | 22 +
 17 files changed, 266 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr105733.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sum-of-two-s12-const-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sum-of-two-s12-const-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/sum-of-two-s12-const-3.c

-- 
2.34.1



[PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265]

2024-05-13 Thread Vineet Gupta
Apologies for the delay in getting this out. I needed to fix one ICE
with the glibc build and do a fresh round of testing: both testsuite and SPEC
runs (which are similar to v1 in terms of Cactu gains, but with some more minor
regressions elsewhere, such as in gcc). Again, those seem so small that IMHO
this should still go in.

I'll investigate those next, as well as an existing weirdness in glibc tempnam
which I spotted during the debugging.

Changes since v1 [1]
 - Tighten the main condition to avoid stack regs as destination
   (to avoid making them potentially unaligned with -2047 addend:
   this might be OK execution/ABI wise, but still undesirable/ugly,
   especially when coming from compiler codegen).
 - Ensure that first alternative is always split
 - Remove "&& 1" from split condition. That was tripping up glibc build
   with illegal operands `add s0, s0, 2048`.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647877.html

---

... if the constant can be represented as sum of two S12 values.
The two S12 values could instead be fused with subsequent ADD insn.
This helps
 - avoid an additional LUI insn
 - side benefits of not clobbering a reg

e.g.
                        w/o patch         | w/ patch
long                  |                   |
plus(unsigned long i) | li   a5,4096      |
{                     | addi a5,a5,-2032  | addi a0, a0, 2047
   return i + 2064;   | add  a0,a0,a5     | addi a0, a0, 17
}                     | ret               | ret

NOTE: In theory not having const in a standalone reg might seem less
  CSE friendly, but for workloads in consideration these mat are
  from very late LRA reloads and follow on GCSE is not doing much
  currently.
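
To make the "sum of two S12 values" idea concrete, here is a minimal C
sketch of how a constant such as 2064 can be split into the two addi
immediates shown above (2047 + 17). The helper names are hypothetical and
this is only an illustration under the plain S12 range of -2048 to 2047; it
is not the predicate or splitter from the patch.

```c
#include <stdbool.h>

#define S12_MIN (-2048L)
#define S12_MAX 2047L

/* True if V fits the signed 12-bit immediate of a single addi.  */
static bool
fits_s12 (long v)
{
  return v >= S12_MIN && v <= S12_MAX;
}

/* Hypothetical helper: if VAL does not fit one S12 immediate but is the
   sum of two, store the two addends and return true.  */
static bool
split_two_s12 (long val, long *first, long *second)
{
  if (fits_s12 (val))
    return false;   /* One addi is enough; nothing to split.  */

  if (val > S12_MAX && val <= 2 * S12_MAX)
    *first = S12_MAX;
  else if (val < S12_MIN && val >= 2 * S12_MIN)
    *first = S12_MIN;
  else
    return false;   /* Out of range: needs real constant materialization.  */

  *second = val - *first;
  return true;
}
```

For val = 2064 this yields 2047 and 17, matching the two addi instructions
in the "w/ patch" column.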

The real benefit however is seen in base+offset computation for array
accesses and especially for stack accesses which are finalized late in
optim pipeline, during LRA register allocation. Often the finalized
offsets trigger LRA reloads resulting in mind boggling repetition of
exact same insn sequence including LUI based constant materialization.

This shaves off 290 billion dynamic instructions (QEMU icounts) in the
SPEC 2017 Cactu benchmark, which is over 10% of the workload. In the rest of
the suite, an additional 10 billion are shaved, with both gains and losses
in individual workloads, as is usual with compiler changes.

 500.perlbench_r-0 |  1,214,534,029,025 | 1,212,887,959,387 |
 500.perlbench_r-1 |740,383,419,739 |   739,280,308,163 |
 500.perlbench_r-2 |692,074,638,817 |   691,118,734,547 |
 502.gcc_r-0   |190,820,141,435 |   190,857,065,988 |
 502.gcc_r-1   |225,747,660,839 |   225,809,444,357 | <- -0.02%
 502.gcc_r-2   |220,370,089,641 |   220,406,367,876 | <- -0.03%
 502.gcc_r-3   |179,111,460,458 |   179,135,609,723 | <- -0.02%
 502.gcc_r-4   |219,301,546,340 |   219,320,416,956 | <- -0.01%
 503.bwaves_r-0|278,733,324,691 |   278,733,323,575 | <- -0.01%
 503.bwaves_r-1|442,397,521,282 |   442,397,519,616 |
 503.bwaves_r-2|344,112,218,206 |   344,112,216,760 |
 503.bwaves_r-3|417,561,469,153 |   417,561,467,597 |
 505.mcf_r |669,319,257,525 |   669,318,763,084 |
 507.cactuBSSN_r   |  2,852,767,394,456 | 2,564,736,063,742 | <+ 10.10%
 508.namd_r|  1,855,884,342,110 | 1,855,881,110,934 |
 510.parest_r  |  1,654,525,521,053 | 1,654,402,859,174 |
 511.povray_r  |  2,990,146,655,619 | 2,990,060,324,589 |
 519.lbm_r |  1,158,337,294,525 | 1,158,337,294,529 |
 520.omnetpp_r |  1,021,765,791,283 | 1,026,165,661,394 |
 521.wrf_r |  1,715,955,652,503 | 1,714,352,737,385 |
 523.xalancbmk_r   |849,846,008,075 |   849,836,851,752 |
 525.x264_r-0  |277,801,762,763 |   277,488,776,427 |
 525.x264_r-1  |927,281,789,540 |   926,751,516,742 |
 525.x264_r-2  |915,352,631,375 |   914,667,785,953 |
 526.blender_r |  1,652,839,180,887 | 1,653,260,825,512 |
 527.cam4_r|  1,487,053,494,925 | 1,484,526,670,770 |
 531.deepsjeng_r   |  1,641,969,526,837 | 1,642,126,598,866 |
 538.imagick_r |  2,098,016,546,691 | 2,097,997,929,125 |
 541.leela_r   |  1,983,557,323,877 | 1,983,531,314,526 |
 544.nab_r |  1,516,061,611,233 | 1,516,061,407,715 |
 548.exchange2_r   |  2,072,594,330,215 | 2,072,591,648,318 |
 549.fotonik3d_r   |  1,001,499,307,366 | 1,001,478,944,189 |
 554.roms_r|  1,028,799,739,111 | 1,028,780,904,061 |
 557.xz_r-0|363,827,039,684 |   363,057,014,260 |
 557.xz_r-1|906,649,112,601 |   905,928,888,732 |
 557.xz_r-2|509,023,898,187 |   508,140,356,932 |
 997.specrand_fr   |402,535,577 |   403,052,561 |
 999.specrand_ir   |402,535,577 |   403,052,561 |
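
The Cactu percentage can be re-derived from the two icount columns of the
507.cactuBSSN_r row: 2,852,767,394,456 - 2,564,736,063,742 =
288,031,330,714, which is about 10.10% of the baseline. A small sketch of
that arithmetic (the helper name is illustrative only):

```c
/* Hypothetical helper: percentage of dynamic instructions removed,
   relative to the baseline count.  Both counts are well below 2^53,
   so they are represented exactly as doubles.  */
static double
percent_saved (double before, double after)
{
  return (before - after) / before * 100.0;
}
```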

This should still be considered damage control, as the real/deeper fix
would be to reduce the number of LRA reloads or CSE/anchor those during
LRA constraint sub-pass (re)runs (that's a different PR/114729).

Implementation Details (for posterity)

[PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-13 Thread Vineet Gupta
If the constant used for a stack offset can be expressed as the sum of two S12
values, the constant need not be materialized (in a reg); instead the
two S12 values can be added by the instructions involved with the frame pointer.
This avoids burning a register and, more importantly, can often get the
sequence down to 2 insns vs. 3.

The previous patches to generally avoid LUI based const materialization didn't
fix this PR, which needs this directed fix in function prologue/epilogue
expansion.

This fix doesn't move the needle for SPEC at all, but it is still a
win considering gcc generates one insn fewer than llvm for the test ;-)

   gcc-13.1 release   |  gcc 230823         |                     |
                      |  g6619b3d4c15c      |  This patch         |  clang/llvm
----------------------+---------------------+---------------------+------------------
li   t0,-4096         | li   t0,-4096       | addi sp,sp,-2048    | addi sp,sp,-2048
addi t0,t0,2016       | addi t0,t0,2032     | add  sp,sp,-16      | addi sp,sp,-32
li   a4,4096          | add  sp,sp,t0       | add  a5,sp,a0       | add  a1,sp,16
add  sp,sp,t0         | addi a5,sp,-2032    | sb   zero,0(a5)     | add  a0,a0,a1
li   a5,-4096         | add  a0,a5,a0       | addi sp,sp,2032     | sb   zero,0(a0)
addi a4,a4,-2032      | li   t0, 4096       | addi sp,sp,32       | addi sp,sp,2032
add  a4,a4,a5         | sb   zero,2032(a0)  | ret                 | addi sp,sp,48
addi a5,sp,16         | addi t0,t0,-2032    |                     | ret
add  a5,a4,a5         | add  sp,sp,t0       |
add  a0,a5,a0         | ret                 |
li   t0,4096          |
sd   a5,8(sp)         |
sb   zero,2032(a0)    |
addi t0,t0,-2016      |
add  sp,sp,t0         |
ret                   |
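
The stack adjustments in the "This patch" column (-2048 then -16 on entry,
2032 then 32 on exit) can be reproduced with a small C sketch. This is a
hypothetical stand-in, not the patch's riscv_split_sum_of_two_s12: the
positive branch of that function is cut off in this archive, so the 2032
base used here is an assumption inferred from the "-2048 to 2032" range in
its comment and from the table above.

```c
/* Hypothetical stand-in for the stack-offset split.  Assumes VAL is
   already known to be a sum of two values in the conservative range
   -2048..2032, so that both pieces keep sp 16-byte aligned.  */
static void
split_stack_adjust (long val, long *base, long *off)
{
  if (val < 0)
    *base = -2048;   /* Lowest S12 value, already 16-byte aligned.  */
  else
    *base = 2032;    /* Assumed: largest 16-byte-aligned S12 value.  */

  *off = val - *base;
}
```

For a 2064-byte frame this gives -2048 and -16 for the prologue and 2032
and 32 for the epilogue, so each intermediate sp value stays a multiple
of 16.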

gcc/ChangeLog:
PR target/105733
* config/riscv/riscv.h: New macros for with aligned offsets.
* config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
function to split a sum of two s12 values into constituents.
(riscv_expand_prologue): Handle offset being sum of two S12.
(riscv_expand_epilogue): Ditto.
* config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr105733.c: New Test.
* gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
expect LUI 4096.
* gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.

Signed-off-by: Vineet Gupta 
---
 gcc/config/riscv/riscv-protos.h   |  2 +
 gcc/config/riscv/riscv.cc | 74 +--
 gcc/config/riscv/riscv.h  |  7 ++
 gcc/testsuite/gcc.target/riscv/pr105733.c | 15 
 .../riscv/rvv/autovec/vls/spill-1.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-2.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-3.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-4.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-5.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-6.c   |  4 +-
 .../riscv/rvv/autovec/vls/spill-7.c   |  4 +-
 11 files changed, 105 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr105733.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 706dc204e643..6da6ae4d041f 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -166,6 +166,8 @@ extern void riscv_subword_address (rtx, rtx *, rtx *, rtx 
*, rtx *);
 extern void riscv_lshift_subword (machine_mode, rtx, rtx, rtx *);
 extern enum memmodel riscv_union_memmodels (enum memmodel, enum memmodel);
 extern bool riscv_reg_frame_related (rtx);
+extern void riscv_split_sum_of_two_s12 (HOST_WIDE_INT, HOST_WIDE_INT *,
+   HOST_WIDE_INT *);
 
 /* Routines implemented in riscv-c.cc.  */
 void riscv_cpu_cpp_builtins (cpp_reader *);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 4067505270e1..4b742489b272 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4063,6 +4063,32 @@ riscv_split_doubleword_move (rtx dest, rtx src)
riscv_emit_move (riscv_subword (dest, true), riscv_subword (src, true));
  }
 }
+
+/* Constant VAL is known to be sum of two S12 constants.  Break it into
+   comprising BASE and OFF.
+   Numerically S12 is -2048 to 2047, however it uses the more conservative
+   range -2048 to 2032 as offsets pertain to stack related registers.  */
+
+void
+riscv_split_sum_of_two_s12 (HOST_WIDE_INT val, HOST_WIDE_INT *base,
+   HOST_WIDE_INT *off)
+{
+  if (SUM_OF_TWO_S12_N (val))
+{
+  *base = -2048;
+  *off = val - (-2048);
+}
+  else if (SUM_OF_TWO_S12_P_ALGN (val))
+{
+  *base = 203

[to-be-committed][RISC-V] Improve AND with some constants

2024-05-13 Thread Jeff Law


If we have an AND with a constant operand and the constant operand 
requires synthesis, then we may be able to generate more efficient code 
than we do now.


Essentially the need for constant synthesis gives us a budget for
alternative ways to clear bits, which zext.w can do for bits 32..63
trivially.  So if we clear 32..63 via zext.w, the constant for the
remaining bits to clear may be simple enough to use with andi or bclri.
That will save us an instruction.
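
The equivalence this relies on can be checked with a small C model. The
function names below are made up for illustration; zext_w models the zext.w
instruction and sext32 models the 32->64 sign extension applied to the
constant. It assumes, as the splitter condition requires, that the upper
half of the mask is zero.

```c
#include <stdint.h>

/* Models zext.w: clear bits 32..63.  */
static uint64_t
zext_w (uint64_t x)
{
  return x & 0xffffffffu;
}

/* Sign-extend the low 32 bits of MASK to 64 bits.  */
static uint64_t
sext32 (uint64_t mask)
{
  uint64_t low = mask & 0xffffffffu;
  return (low & 0x80000000u) ? (low | 0xffffffff00000000ull) : low;
}

/* Hypothetical model of the split sequence: zext.w, then an AND with
   the adjusted (sign-extended) constant.  */
static uint64_t
and_via_zext (uint64_t x, uint64_t mask)
{
  return zext_w (x) & sext32 (mask);
}
```

For the mask 0xfffffffe (from w32 & ~(1U << 0)) the adjusted constant is -2,
which fits a simm12 and can be used with andi. For 0xbfffffff (from
~(1U << 30)) the adjusted constant has a single zero bit, which is the
bclri case.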


This has been tested in Ventana's CI system as well as my own.  I'll wait for
the upstream CI tester to report success before committing.


Jeff
gcc/
* config/riscv/bitmanip.md: Add new splitter for AND with
a constant that masks off bits 32..63 and needs synthesis.

gcc/testsuite/

* gcc.target/riscv/zba_zbs_and-1.c: New test.

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 724511b6df3..8769a6b818b 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -843,6 +843,40 @@ (define_insn_and_split "*andi_extrabit"
 }
 [(set_attr "type" "bitmanip")])
 
+;; If we have the ZBA extension, then we can clear the upper half of a 64
+;; bit object with a zext.w.  So if we have AND where the constant would
+;; require synthesis of two or more instructions, but 32->64 sign extension
+;; of the constant is a simm12, then we can use zext.w+andi.  If the adjusted
+;; constant is a single bit constant, then we can use zext.w+bclri
+;;
+;; With the mvconst_internal pattern claiming a single insn to synthesize
+;; constants, this must be a define_insn_and_split.
+(define_insn_and_split ""
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (and:DI (match_operand:DI 1 "register_operand" "r")
+   (match_operand 2 "const_int_operand" "n")))]
+  "TARGET_64BIT
+   && TARGET_ZBA
+   && !paradoxical_subreg_p (operands[1])
+   /* Only profitable if synthesis takes more than one insn.  */
+   && riscv_const_insns (operands[2]) != 1
+   /* We need the upper half to be zero.  */
+   && (INTVAL (operands[2]) & HOST_WIDE_INT_C (0x)) == 0
+   /* And the the adjusted constant must either be something we can
+  implement with andi or bclri.  */
+   && ((SMALL_OPERAND (sext_hwi (INTVAL (operands[2]), 32))
+|| (TARGET_ZBS && popcount_hwi (INTVAL (operands[2])) == 31))
+   && INTVAL (operands[2]) != 0x7fff)"
+  "#"
+  "&& 1"
+  [(set (match_dup 0) (zero_extend:DI (match_dup 3)))
+   (set (match_dup 0) (and:DI (match_dup 0) (match_dup 2)))]
+  "{
+ operands[3] = gen_lowpart (SImode, operands[1]);
+ operands[2] = GEN_INT (sext_hwi (INTVAL (operands[2]), 32));
+   }"
+  [(set_attr "type" "bitmanip")])
+
 ;; IF_THEN_ELSE: test for 2 bits of opposite polarity
 (define_insn_and_split "*branch_mask_twobits_equals_singlebit"
   [(set (pc)
diff --git a/gcc/testsuite/gcc.target/riscv/zba_zbs_and-1.c 
b/gcc/testsuite/gcc.target/riscv/zba_zbs_and-1.c
new file mode 100644
index 000..23fd769449e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zba_zbs_and-1.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zba_zbb_zbs -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+
+unsigned long long w32mem_1(unsigned long long w32)
+{
+return w32 & ~(1U << 0);
+}
+
+unsigned long long w32mem_2(unsigned long long w32)
+{
+return w32 & ~(1U << 30);
+}
+
+unsigned long long w32mem_3(unsigned long long w32)
+{
+return w32 & ~(1U << 31);
+}
+
+/* If we do synthesis, then we'd see an addi.  */
+/* { dg-final { scan-assembler-not "addi\t" } } */


[RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-13 Thread Qing Zhao
-Warray-bounds is an important option for helping the Linux kernel keep
array out-of-bounds errors out of the source tree.

However, due to the false positive warnings reported in PR109071
(-Warray-bounds false positive warnings due to code duplication from
jump threading), -Warray-bounds=1 cannot be enabled by default.

Although it's impossible to eliminate all the false positive warnings
from -Warray-bounds=1 (see PR104355, Misleading -Warray-bounds
documentation says "always out of bounds"), we should minimize the
false positive warnings in -Warray-bounds=1.

The root cause of the false positive warnings reported in PR109071 is:

When the thread jump optimization tries to reduce the number of branches
inside the routine, it sometimes needs to duplicate code and
split it into two conditional paths. For example:

The original code:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
  if (index >= 4)
    warn ();
  *ptr = 0;
  *val = sg->vals[index];
  if (index >= 4)
    warn ();
  *ptr = *val;

  return;
}

With the thread jump, the above becomes:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
  if (index >= 4)
    {
      warn ();
      *ptr = 0;                 // Code duplication, since "warn" does return;
      *val = sg->vals[index];   // same for this line.
                                // In this path, since it's under the condition
                                // "index >= 4", the compiler knows the value
                                // of "index" is at least 4, hence the
                                // out-of-bounds warning.
      warn ();
    }
  else
    {
      *ptr = 0;
      *val = sg->vals[index];
    }
  *ptr = *val;
  return;
}

We can see that, after the thread jump optimization, the number of branches
inside the routine "sparx5_set" is reduced from 2 to 1. However, due to the
code duplication (which is needed for the correctness of the code), we
get a false positive out-of-bounds warning.

In order to eliminate such false positive out-of-bounds warnings:

A. Add one more flag for GIMPLE: is_splitted.
B. During the thread jump optimization, when the basic blocks are
   duplicated, mark all the STMTs inside the original and duplicated
   basic blocks as "is_splitted".
C. Inside the array bound checker, add the following new heuristic:

If
   1. the stmt is duplicated and split into two conditional paths;
   2. the warning level < 2;
   3. the current block does not dominate the exit block,
then do not report the warning.
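
The three-part condition can be summarised as a single predicate. The
sketch below is only a restatement of the heuristic in plain C; the
function and parameter names are hypothetical and do not correspond to the
actual changes in gimple-array-bounds.cc.

```c
#include <stdbool.h>

/* Hypothetical restatement of the heuristic: return true when an
   out-of-bounds warning for a statement should be suppressed.

   stmt_is_splitted   - the stmt was marked during jump threading
   warn_level         - the N in -Warray-bounds=N
   bb_dominates_exit  - the stmt's block dominates the exit block  */
static bool
suppress_array_bounds_warning (bool stmt_is_splitted, int warn_level,
                               bool bb_dominates_exit)
{
  return stmt_is_splitted && warn_level < 2 && !bb_dominates_exit;
}
```

Under this reading, a duplicated statement still warns at
-Warray-bounds=2, or when its block dominates the exit block and so is
reached on every path that returns normally.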

The false positive warnings are moved from -Warray-bounds=1 to
 -Warray-bounds=2 now.

Bootstrapped and regression tested on both x86 and aarch64. Adjusted
 -Warray-bounds-61.c due to the false positive warnings.

Let me know if you have any comments and suggestions.

Thanks.

Qing


PR tree-optimization/109071

gcc/ChangeLog:

* gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two new
arguments for the new heuristic to not issue warnings.
(array_bounds_checker::check_array_ref): Call the new prototype of the
routine check_out_of_bounds_and_warn.
(array_bounds_checker::check_mem_ref): Add one new argument for the
new heuristic to not issue warnings.
(array_bounds_checker::check_addr_expr): Call the new prototype of the
routine check_mem_ref, add new heuristic for not issue warnings.
(array_bounds_checker::check_array_bounds): Call the new prototype of
the routine check_mem_ref.
* gimple-array-bounds.h: New prototype of check_mem_ref.
* gimple.h (struct GTY): Add one new flag is_splitted for gimple.
(gimple_is_splitted_p): New function.
(gimple_set_is_splitted): New function.
* tree-ssa-threadupdate.cc (set_stmts_in_bb_is_splitted): New function.
(back_jt_path_registry::duplicate_thread_path): Mark all the stmts in
both original and copied blocks as IS_SPLITTED.

gcc/testsuite/ChangeLog:

* gcc.dg/Warray-bounds-61.c: Adjust testing case.
* gcc.dg/pr109071-1.c: New test.
* gcc.dg/pr109071.c: New test.
---
 gcc/gimple-array-bounds.cc  | 46 +
 gcc/gimple-array-bounds.h   |  2 +-
 gcc/gimple.h| 21 +--
 gcc/testsuite/gcc.dg/Warray-bounds-61.c |  6 ++--
 gcc/testsuite/gcc.dg/pr109071-1.c   | 22 
 gcc/testsuite/gcc.dg/pr109071.c | 22 
 gcc/tree-ssa-threadupdate.cc| 15 
 7 files changed, 122 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr109071-1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr109071.c

diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 008071cd5464..4a2975623bc1 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -264,7 +264,9 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
  tree up_bound, tree up_bound_p1,
  

[wwwdocs] cxx-dr-status: Update from C++ Core Language Issue TOC, Revision 114

2024-05-13 Thread Marek Polacek
Pushed.

commit 06c46c88cc02e0dff5f65b41754178fb25fb939e
Author: Marek Polacek 
Date:   Mon May 13 16:09:05 2024 -0400

cxx-dr-status: Update from C++ Core Language Issue TOC, Revision 114

diff --git a/htdocs/projects/cxx-dr-status.html 
b/htdocs/projects/cxx-dr-status.html
index a5f45359..2a61cfbd 100644
--- a/htdocs/projects/cxx-dr-status.html
+++ b/htdocs/projects/cxx-dr-status.html
@@ -15,7 +15,7 @@
 
   This table tracks the implementation status of C++ defect reports in GCC.
   It is based on C++ Standard Core Language Issue Table of Contents, Revision
-  113 (https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_toc.html";>here).
+  114 (https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_toc.html";>here).
 
   
 
@@ -1652,7 +1652,7 @@
 
 
   https://wg21.link/cwg233";>233
-  drafting
+  review
   References vs pointers in UDC overload resolution
   No
   https://gcc.gnu.org/PR114697";>PR114697
@@ -3196,7 +3196,7 @@
 
 
   https://wg21.link/cwg453";>453
-  tentatively ready
+  DR
   References may only bind to "valid" objects
   ?
   
@@ -7031,11 +7031,11 @@
   ?
   
 
-
+
   https://wg21.link/cwg1001";>1001
-  drafting
+  review
   Parameter type adjustment in dependent parameter types
-  -
+  ?
   https://gcc.gnu.org/PR51851";>PR51851
 
 
@@ -7292,7 +7292,7 @@
 
 
   https://wg21.link/cwg1038";>1038
-  DR
+  DRWP
   Overload resolution of &x.static_func
   ?
   
@@ -8624,6 +8624,7 @@
   https://wg21.link/cwg1228";>1228
   NAD
   Copy-list-initialization and explicit constructors
+
   No
   https://gcc.gnu.org/PR113300";>PR113300
 
@@ -11916,7 +11917,7 @@
 
 
   https://wg21.link/cwg1698";>1698
-  DR
+  DRWP
   Files ending in \
   ?
   
@@ -12075,11 +12076,11 @@
   ?
   
 
-
+
   https://wg21.link/cwg1721";>1721
-  drafting
+  review
   Diagnosing ODR violations for static data members
-  -
+  ?
   
 
 
@@ -13454,11 +13455,11 @@
   N/A
   
 
-
+
   https://wg21.link/cwg1918";>1918
-  open
+  CD5
   friend templates with dependent scopes
-  -
+  ?
   
 
 
@@ -13644,11 +13645,11 @@
   -
   
 
-
+
   https://wg21.link/cwg1945";>1945
-  open
+  CD5
   Friend declarations naming members of class templates in 
non-templates
-  -
+  ?
   
 
 
@@ -13709,7 +13710,7 @@
 
 
   https://wg21.link/cwg1954";>1954
-  tentatively ready
+  DR
   typeid null dereference check in subexpressions
   ?
   
@@ -14373,11 +14374,11 @@
   -
   
 
-
+
   https://wg21.link/cwg2049";>2049
-  drafting
+  DRWP
   List initializer in non-type template default argument
-  -
+  ?
   
 
 
@@ -14410,7 +14411,7 @@
 
 
   https://wg21.link/cwg2054";>2054
-  DR
+  DRWP
   Missing description of class SFINAE
   ?
   
@@ -14746,7 +14747,7 @@
 
 
   https://wg21.link/cwg2102";>2102
-  DR
+  DRWP
   Constructor checking in new-expression
   ?
   
@@ -15797,7 +15798,7 @@
 
 
   https://wg21.link/cwg2252";>2252
-  DR
+  DRWP
   Enumeration list-initialization from the same type
   ?
   
@@ -17069,11 +17070,11 @@
   ?
   
 
-
+
   https://wg21.link/cwg2434";>2434
-  open
+  review
   Mandatory copy elision vs non-class objects
-  -
+  ?
   
 
 
@@ -17183,7 +17184,7 @@
 
 
   https://wg21.link/cwg2450";>2450
-  review
+  DRWP
   braced-init-list as a template-argument
   11
   
@@ -17244,12 +17245,12 @@
   ?
   
 
-
+
   https://wg21.link/cwg2459";>2459
-  drafting
+  DRWP
   Template parameter initialization
-  -
-  
+  ?
+  https://gcc.gnu.org/PR113800";>PR113800
 
 
   https://wg21.link/cwg2460";>2460
@@ -17365,7 +17366,7 @@
 
 
   https://wg21.link/cwg2476";>2476
-  tentatively ready
+  DR
   placeholder-type-specifiers and function declarators
   ?
   
@@ -17561,7 +17562,7 @@
 
 
   https://wg21.link/cwg2504";>2504
-  DR
+  DRWP
   Inheriting constructors from virtual base classes
   ?
   
@@ -17750,7 +17751,7 @@
 
 
   https://wg21.link/cwg2531";>2531
-  DR
+  DRWP
   Static data members redeclared as constexpr
   ?
   
@@ -17764,7 +17765,7 @@
 
 
   https://wg21.link/cwg2533";>2533
-  review
+  DR
   Storage duration of implicitly created objects
   ?
   
@@ -17855,14 +17856,14 @@
 
 
   https://wg21.link/cwg2546";>2546
-  tentatively ready
+  DR
   Defaulted secondary comparison operators

[PATCH] RISC-V: Do not allow v0 as dest when merging [PR115068].

2024-05-13 Thread Robin Dapp
Hi,

this patch splits the vfw...wf pattern so we do not emit
e.g. vfwadd.wf v0,v8,fa5,v0.t anymore.

Regtested on rv64gcv_zvfh.

Regards
 Robin

gcc/ChangeLog:

PR target/115068

* config/riscv/vector.md:  Split vfw.wf pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr115068-run.c: New test.
* gcc.target/riscv/rvv/base/pr115068.c: New test.
---
 gcc/config/riscv/vector.md| 20 ++---
 .../gcc.target/riscv/rvv/base/pr115068-run.c  | 28 ++
 .../gcc.target/riscv/rvv/base/pr115068.c  | 29 +++
 3 files changed, 67 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr115068-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr115068.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 2a54f78df8e..e408baa809c 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7178,24 +7178,24 @@ (define_insn "@pred_single_widen_sub"
(symbol_ref "riscv_vector::get_frm_mode (operands[9])"))])
 
 (define_insn "@pred_single_widen__scalar"
-  [(set (match_operand:VWEXTF 0 "register_operand"   "=vr,   
vr")
+  [(set (match_operand:VWEXTF 0 "register_operand""=vd, vd, 
vr, vr")
(if_then_else:VWEXTF
  (unspec:
-   [(match_operand: 1 "vector_mask_operand"   
"vmWc1,vmWc1")
-(match_operand 5 "vector_length_operand"  "   rK,   
rK")
-(match_operand 6 "const_int_operand"  "i,
i")
-(match_operand 7 "const_int_operand"  "i,
i")
-(match_operand 8 "const_int_operand"  "i,
i")
-(match_operand 9 "const_int_operand"  "i,
i")
+   [(match_operand: 1 "vector_mask_operand"  " vm, 
vm,Wc1,Wc1")
+(match_operand 5 "vector_length_operand" " rK, rK, rK, 
rK")
+(match_operand 6 "const_int_operand" "  i,  i,  i, 
 i")
+(match_operand 7 "const_int_operand" "  i,  i,  i, 
 i")
+(match_operand 8 "const_int_operand" "  i,  i,  i, 
 i")
+(match_operand 9 "const_int_operand" "  i,  i,  i, 
 i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)
 (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
  (plus_minus:VWEXTF
-   (match_operand:VWEXTF 3 "register_operand" "   vr,   
vr")
+   (match_operand:VWEXTF 3 "register_operand"" vr, vr, vr, 
vr")
(float_extend:VWEXTF
  (vec_duplicate:
-   (match_operand: 4 "register_operand"   "f,
f"
- (match_operand:VWEXTF 2 "vector_merge_operand"   "   vu,
0")))]
+   (match_operand: 4 "register_operand"  "  f,  f,  f, 
 f"
+ (match_operand:VWEXTF 2 "vector_merge_operand"  " vu,  0, vu, 
 0")))]
   "TARGET_VECTOR"
   "vfw.wf\t%0,%3,%4%p1"
   [(set_attr "type" "vf")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068-run.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068-run.c
new file mode 100644
index 000..95ec8e06021
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068-run.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-require-effective-target riscv_v_ok } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-std=gnu99" } */
+
+#include 
+#include 
+
+vfloat64m8_t
+test_vfwadd_wf_f64m8_m (vbool8_t vm, vfloat64m8_t vs2, float rs1, size_t vl)
+{
+  return __riscv_vfwadd_wf_f64m8_m (vm, vs2, rs1, vl);
+}
+
+char global_memory[1024];
+void *fake_memory = (void *) global_memory;
+
+int
+main ()
+{
+  asm volatile ("fence" ::: "memory");
+  vfloat64m8_t vfwadd_wf_f64m8_m_vd = test_vfwadd_wf_f64m8_m (
+__riscv_vreinterpret_v_i8m1_b8 (__riscv_vundefined_i8m1 ()),
+__riscv_vundefined_f64m8 (), 1.0, __riscv_vsetvlmax_e64m8 ());
+  asm volatile ("" ::"vr"(vfwadd_wf_f64m8_m_vd) : "memory");
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068.c
new file mode 100644
index 000..6d680037aa1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr115068.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-std=gnu99" } */
+
+#include 
+#include 
+
+vfloat64m8_t
+test_vfwadd_wf_f64m8_m (vbool8_t vm, vfloat64m8_t vs2, float rs1, size_t vl)
+{
+  return __riscv_vfwadd_wf_f64m8_m (vm, vs2, rs1, vl);
+}
+
+char global_memory[1024];
+void *fake_memory = (void *) global_memory;
+
+int
+main ()
+{
+  asm volatile ("fence" ::: "memory");
+  vfloat64m8_t vfwadd_wf_f64m8_m_vd = test_vfwadd_wf_f64m8_m (
+__riscv_vreinterpret_v_i8m1_b8 (__riscv_vundefined_i8m1 ()),
+__riscv_vundef

Re: [PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265]

2024-05-13 Thread Jeff Law




On 5/13/24 12:49 PM, Vineet Gupta wrote:

Apologies for the delay in getting this out. Needed to fix one ICE
with glibc build and fresh round of testing: both testsuite and SPEC
runs (which are similar to v1 in terms of Cactu gains, but some more minor
regressions elsewhere gcc). Again those seem so small that IMHO this
should still go in.

I'll investigate those next as well as an existing weirdnes in glibc tempnam
which I spotted during the debugging.

Changes since v1 [1]
  - Tighten the main condition to avoid stack regs as destination
    (to avoid making them potentially unaligned with -2047 addend:
     this might be OK execution/ABI wise, but undesirable/ugly still
     especially when coming from compiler codegen).
  - Ensure that first alternative is always split
  - Remove "&& 1" from split condition. That was tripping up glibc build
with illegal operands `add s0, s0, 2048`.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647877.html

  
+;; Special case of adding a reg and constant if latter is sum of two S12
+;; values (in range -2048 to 2047). Avoid materializing the const and fuse
+;; into the add (with an additional add for 2nd value). Makes a 3 insn
+;; sequence into 2 insn.
+
+(define_insn_and_split "*add3_const_sum_of_two_s12"
+  [(set (match_operand:P 0 "register_operand" "=r,r")
+   (plus:P (match_operand:P 1 "register_operand" " r,r")
+   (match_operand:P 2 "const_two_s12" " MiG,r")))]
+  "!riscv_reg_frame_related (operands[0])"

So that !riscv_reg_frame_related is my only concern with this patch. 
It's a destination, so it *may* be OK.


If it were a source operand, then we'd have to worry about cases where 
it was a pseudo with the same value as sp/fp/argp and subsequent copy 
propagation replacing the pseudo with sp/fp/argp causing the insn to no 
longer match.


Similarly if it were a source operand we'd have to worry about cases 
where the pseudo had a registered (or discoverable) equivalence to 
sp/fp/argp plus an offset.  IRA/LRA can replace the use with its 
equivalence in some of those cases which would have potentially caused 
headaches.


But as a destination we really just have to worry about generation in 
the prologue/epilogue and for alloca calls.  Those should be the only 
places that set one of those special registers.  They're constrained 
enough that I think we'll be OK.


I'm very slightly worried about hard register cprop, but I think it 
should be safe these days WRT those special registers in the unlikely 
event it found an opportunity to propagate them.


So a tentative OK.  If we find this tidbit is problematical in the 
future, then what I would suggest is we allow those special registers 
and dial-back the aggressiveness on the range of allowed constants. 
That would allow the first instruction in the sequence to never create a 
mis-aligned sp.  But again, that's only if we need to revisit.
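
For readers following the thread, the constant splitting that the quoted pattern comment describes can be sketched in C. This is only an illustration under stated assumptions: the names `fits_s12`, `is_sum_of_two_s12` and `split_two_s12` are hypothetical and are not taken from the patch, and the choice of the first addend (the extreme S12 value of the matching sign) is one possible policy, not necessarily the one the patch uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not the patch's helpers.  */
#define S12_MIN (-2048L)
#define S12_MAX (2047L)

/* True if V fits a single signed 12-bit immediate.  */
static bool
fits_s12 (long v)
{
  return v >= S12_MIN && v <= S12_MAX;
}

/* True if V does not fit one S12 immediate but is the sum of two.  */
static bool
is_sum_of_two_s12 (long v)
{
  return !fits_s12 (v) && v >= 2 * S12_MIN && v <= 2 * S12_MAX;
}

/* Split V into FIRST + SECOND, both S12.  Returns false if V is not
   a candidate (either a single immediate suffices or V is too large).  */
static bool
split_two_s12 (long v, long *first, long *second)
{
  if (!is_sum_of_two_s12 (v))
    return false;
  /* Assumed policy: take the extreme S12 value of the same sign first.  */
  *first = v < 0 ? S12_MIN : S12_MAX;
  *second = v - *first;
  assert (fits_s12 (*second));
  return true;
}
```

With this policy a positive constant such as 2048 splits into 2047 + 1, so an intermediate result is generally not 16-byte aligned; restricting the allowed first addend is the kind of dial-back mentioned above.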


Please wait for CI to report back sane results :-)

Jeff


[PATCH] Fortran: fix bounds check for assignment, class component [PR86100]

2024-05-13 Thread Harald Anlauf
Dear all,

the attached patch does two things:

- it fixes a bogus array bounds check when deep-copying a class component
  of a derived type and the class component has rank > 1, the reason being
  that the previous code compared the full size of one side with the size
  of the first dimension of the other

- the bounds-check error message that was generated e.g. by an allocate
  statement with conflicting sizes in the allocation and the source-expr
  will now use an improved abbreviated name pointing to the component
  involved, which was introduced in 14-development.
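
As a rough illustration of the first point (a sketch in C, not the gfortran implementation), the difference between the previous check and a per-dimension check can be modelled on plain arrays of extents. The function names and the extent-array representation are assumptions made for this example only.

```c
#include <assert.h>

/* Model of the previous check as described above: the total element
   count of one side is compared with the extent of dimension 1 of the
   other side.  Returns nonzero when a mismatch would be reported.  */
static int
old_style_mismatch (const long *to_ext, const long *from_ext, int rank)
{
  assert (rank > 0);
  long to_total = 1;
  for (int d = 0; d < rank; d++)
    to_total *= to_ext[d];
  return to_total != from_ext[0];
}

/* Per-dimension check: returns the 1-based index of the first
   dimension whose extents differ, or 0 if all extents agree.  */
static int
first_mismatched_dim (const long *to_ext, const long *from_ext, int rank)
{
  assert (rank > 0);
  for (int d = 0; d < rank; d++)
    if (to_ext[d] != from_ext[d])
      return d + 1;
  return 0;
}
```

For two conforming 2x3 arrays the old-style comparison sees 6 versus 2 and reports a bogus mismatch, while the per-dimension loop reports none.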

What I could not resolve: a deep copy may still create no useful array
name in the error message (which I am now unable to trigger).  If someone
sees how to extract it reliably from the tree, please let me know.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

I would like to backport this to 14-branch after a decent delay.

Thanks,
Harald

From e187285dfd83da2f69cfd50854c701744dc8acc5 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Mon, 13 May 2024 22:06:33 +0200
Subject: [PATCH] Fortran: fix bounds check for assignment, class component
 [PR86100]

gcc/fortran/ChangeLog:

	PR fortran/86100
	* trans-array.cc (gfc_conv_ss_startstride): Use abridged_ref_name
	to generate a more user-friendly name for bounds-check messages.
	* trans-expr.cc (gfc_copy_class_to_class): Fix bounds check for
	rank>1 by looping over the dimensions.

gcc/testsuite/ChangeLog:

	PR fortran/86100
	* gfortran.dg/bounds_check_25.f90: New test.
---
 gcc/fortran/trans-array.cc|  7 +++-
 gcc/fortran/trans-expr.cc | 40 ++-
 gcc/testsuite/gfortran.dg/bounds_check_25.f90 | 32 +++
 3 files changed, 60 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/bounds_check_25.f90

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index c5b56f4e273..eec62c296ff 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -4911,6 +4911,7 @@ done:
 	  gfc_expr *expr;
 	  locus *expr_loc;
 	  const char *expr_name;
+	  char *ref_name = NULL;

 	  ss_info = ss->info;
 	  if (ss_info->type != GFC_SS_SECTION)
@@ -4922,7 +4923,10 @@ done:

 	  expr = ss_info->expr;
 	  expr_loc = &expr->where;
-	  expr_name = expr->symtree->name;
+	  if (expr->ref)
+	expr_name = ref_name = abridged_ref_name (expr, NULL);
+	  else
+	expr_name = expr->symtree->name;

 	  gfc_start_block (&inner);

@@ -5134,6 +5138,7 @@ done:

 	  gfc_add_expr_to_block (&block, tmp);

+	  free (ref_name);
 	}

   tmp = gfc_finish_block (&block);
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index e315e2d3370..dfc5b8e9b4a 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -1520,7 +1520,6 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
   stmtblock_t body;
   stmtblock_t ifbody;
   gfc_loopinfo loop;
-  tree orig_nelems = nelems; /* Needed for bounds check.  */

   gfc_init_block (&body);
   tmp = fold_build2_loc (input_location, MINUS_EXPR,
@@ -1552,27 +1551,32 @@ gfc_copy_class_to_class (tree from, tree to, tree nelems, bool unlimited)
   /* Add bounds check.  */
   if ((gfc_option.rtcheck & GFC_RTCHECK_BOUNDS) > 0 && is_from_desc)
 	{
-	  char *msg;
 	  const char *name = "<>";
-	  tree from_len;
+	  int dim, rank;

 	  if (DECL_P (to))
-	name = (const char *)(DECL_NAME (to)->identifier.id.str);
-
-	  from_len = gfc_conv_descriptor_size (from_data, 1);
-	  from_len = fold_convert (TREE_TYPE (orig_nelems), from_len);
-	  tmp = fold_build2_loc (input_location, NE_EXPR,
-  logical_type_node, from_len, orig_nelems);
-	  msg = xasprintf ("Array bound mismatch for dimension %d "
-			   "of array '%s' (%%ld/%%ld)",
-			   1, name);
-
-	  gfc_trans_runtime_check (true, false, tmp, &body,
-   &gfc_current_locus, msg,
-			 fold_convert (long_integer_type_node, orig_nelems),
-			   fold_convert (long_integer_type_node, from_len));
+	name = IDENTIFIER_POINTER (DECL_NAME (to));

-	  free (msg);
+	  rank = GFC_TYPE_ARRAY_RANK (TREE_TYPE (from_data));
+	  for (dim = 1; dim <= rank; dim++)
+	{
+	  tree from_len, to_len, cond;
+	  char *msg;
+
+	  from_len = gfc_conv_descriptor_size (from_data, dim);
+	  from_len = fold_convert (long_integer_type_node, from_len);
+	  to_len = gfc_conv_descriptor_size (to_data, dim);
+	  to_len = fold_convert (long_integer_type_node, to_len);
+	  msg = xasprintf ("Array bound mismatch for dimension %d "
+			   "of array '%s' (%%ld/%%ld)",
+			   dim, name);
+	  cond = fold_build2_loc (input_location, NE_EXPR,
+  logical_type_node, from_len, to_len);
+	  gfc_trans_runtime_check (true, false, cond, &body,
+   &gfc_current_locus, msg,
+   to_len, from_len);
+	  free (msg);
+	}
 	}

   tmp = build_call_vec (fcn_type, fcn, args);
diff --git a/gcc/testsuite/gfortr

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-13 Thread Jeff Law




On 5/13/24 12:49 PM, Vineet Gupta wrote:

If the constant used for stack offset can be expressed as sum of two S12
values, the constant need not be materialized (in a reg) and instead the
two S12 bits can be added to instructions involved with frame pointer.
This avoids burning a register and more importantly can often get down
to be 2 insn vs. 3.
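
A minimal sketch of the idea in C, assuming a 16-byte stack alignment and using the hypothetical name `split_stack_adjust` (this is not the patch's riscv_split_sum_of_two_s12, whose exact policy is not shown in this message): a stack adjustment that does not fit one S12 immediate is split into two immediates that are each 16-byte aligned, so sp stays aligned after the first add.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sketch.  Split a 16-byte-aligned stack adjustment ADJ
   into FIRST + SECOND, where both parts are multiples of 16 and fit a
   signed 12-bit immediate.  Returns false if ADJ is unaligned, fits a
   single immediate, or is too large for two.  */
static bool
split_stack_adjust (long adj, long *first, long *second)
{
  const long lo = -2048;  /* smallest S12 value, already 16-aligned */
  const long hi = 2032;   /* largest multiple of 16 that is <= 2047 */

  if (adj % 16 != 0)
    return false;
  if (adj >= lo && adj <= hi)
    return false;          /* a single addi is enough */

  *first = adj < 0 ? lo : hi;
  *second = adj - *first;
  if (*second < lo || *second > hi)
    return false;

  assert (*first % 16 == 0 && *second % 16 == 0);
  return true;
}
```

For a 2064-byte frame this gives -2048 and -16 on entry and 2032 and 32 on exit, the same immediates that appear in the "This patch" column of the comparison in this message.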

The prev patches to generally avoid LUI based const materialization didn't
fix this PR and need this directed fix in function prologue/epilogue
expansion.

This fix doesn't move the needle for SPEC, at all, but it is still a
win considering gcc generates one insn fewer than llvm for the test ;-)

gcc-13.1 release   | gcc 230823         | This patch         | clang/llvm
                   | g6619b3d4c15c      |                    |
-------------------+--------------------+--------------------+------------------
li   t0,-4096      | li   t0,-4096      | addi sp,sp,-2048   | addi sp,sp,-2048
addi t0,t0,2016    | addi t0,t0,2032    | add  sp,sp,-16     | addi sp,sp,-32
li   a4,4096       | add  sp,sp,t0      | add  a5,sp,a0      | add  a1,sp,16
add  sp,sp,t0      | addi a5,sp,-2032   | sb   zero,0(a5)    | add  a0,a0,a1
li   a5,-4096      | add  a0,a5,a0      | addi sp,sp,2032    | sb   zero,0(a0)
addi a4,a4,-2032   | li   t0, 4096      | addi sp,sp,32      | addi sp,sp,2032
add  a4,a4,a5      | sb   zero,2032(a0) | ret                | addi sp,sp,48
addi a5,sp,16      | addi t0,t0,-2032   |                    | ret
add  a5,a4,a5      | add  sp,sp,t0      |                    |
add  a0,a5,a0      | ret                |                    |
li   t0,4096       |                    |                    |
sd   a5,8(sp)      |                    |                    |
sb   zero,2032(a0) |                    |                    |
addi t0,t0,-2016   |                    |                    |
add  sp,sp,t0      |                    |                    |
ret                |                    |                    |

gcc/ChangeLog:
PR target/105733
* config/riscv/riscv.h: New macros for with aligned offsets.
* config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
function to split a sum of two s12 values into constituents.
(riscv_expand_prologue): Handle offset being sum of two S12.
(riscv_expand_epilogue): Ditto.
* config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr105733.c: New Test.
* gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
expect LUI 4096.
* gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.





@@ -8074,14 +8111,26 @@ riscv_expand_epilogue (int style)
}
else
{
- if (!SMALL_OPERAND (adjust_offset.to_constant ()))
+ HOST_WIDE_INT adj_off_value = adjust_offset.to_constant ();
+ if (SMALL_OPERAND (adj_off_value))
+   {
+ adjust = GEN_INT (adj_off_value);
+   }
+ else if (SUM_OF_TWO_S12_ALGN (adj_off_value))
+   {
+ HOST_WIDE_INT base, off;
+ riscv_split_sum_of_two_s12 (adj_off_value, &base, &off);
+ insn = gen_add3_insn (stack_pointer_rtx, hard_frame_pointer_rtx,
+   GEN_INT (base));
+ RTX_FRAME_RELATED_P (insn) = 1;
+ adjust = GEN_INT (off);
+   }
So this was the hunk that we identified internally as causing problems 
with libgomp's testsuite.  We never fully chased it down as this hunk 
didn't seem terribly important performance wise -- we just set it aside. 
 The thing is it looked basically correct to me.  So the failure was 
certainly unexpected, but it was consistent.


So I think the question is whether or not the CI system runs the libgomp 
testsuite, particularly in the rv64 linux configuration.  If it does, 
and it passes, then we're good.  I'm still finding my way around the 
configuration, so I don't know if the CI system Edwin & Patrick have 
built tests libgomp or not.


If it isn't run, then we'll need to do a run to test that.  I'm set up 
here to do that if needed.   I can just drop this version into our 
internal tree, trigger an internal CI run and see if it complains :-)


If it does complain, then we know where to start investigations.




Jeff



Re: [PATCH] RISC-V: add option -m(no-)autovec-segment

2024-05-13 Thread Vineet Gupta


On 2/27/24 07:25, Jeff Law wrote:
> On 2/25/24 21:53, Greg McGary wrote:
>> Add option -m(no-)autovec-segment to enable/disable autovectorizer
>> from emitting vector segment load/store instructions. This is useful for
>> performance experiments.
>>
>> gcc/ChangeLog:
>>  * config/riscv/autovec.md (vec_mask_len_load_lanes, 
>> vec_mask_len_store_lanes):
>>Predicate with TARGET_VECTOR_AUTOVEC_SEGMENT
>>  * gcc/config/riscv/riscv-opts.h (TARGET_VECTOR_AUTOVEC_SEGMENT): New 
>> macro.
>>  * gcc/config/riscv/riscv.opt (-m(no-)autovec-segment): New option.
>>  * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent 
>> divide-by-zero.
>>  * testsuite/gcc.target/riscv/rvv/autovec/struct/*_noseg*.c,
>>  testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: New tests.
> I don't mind having options to do this kind of selection (we've done 
> similar things internally for other RVV features).  But I don't think 
> now is the time to be introducing this stuff.  We're in stage4 of the 
> development cycle after all.

Ping ! now that we are back in stage1

Thx,
-Vineet


Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-13 Thread Jeff Law




On 5/13/24 1:48 PM, Qing Zhao wrote:

-Warray-bounds is an important option that helps the Linux kernel keep
array out-of-bounds errors out of the source tree.

However, due to the false positive warnings reported in PR109071
(-Warray-bounds false positive warnings due to code duplication from
jump threading), -Warray-bounds=1 cannot be turned on by default.

Although it's impossible to eliminate all the false positive warnings
from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
documentation says "always out of bounds"), we should minimize the
false positive warnings in -Warray-bounds=1.

The root reason for the false positive warnings reported in PR109071 is:

When the thread jump optimization tries to reduce the # of branches
inside the routine, sometimes it needs to duplicate the code and
split it into two conditional paths. For example:

The original code:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
   if (index >= 4)
 warn ();
   *ptr = 0;
   *val = sg->vals[index];
   if (index >= 4)
 warn ();
   *ptr = *val;

   return;
}

With the thread jump, the above becomes:

void sparx5_set (int * ptr, struct nums * sg, int index)
{
   if (index >= 4)
 {
   warn ();
   *ptr = 0;                 // Code duplication since "warn" does return;
   *val = sg->vals[index];   // same for this line.
                             // In this path, since it's under the condition
                             // "index >= 4", the compiler knows the value
                             // of "index" is at least 4, hence the
                             // out-of-bound warning.
   warn ();
 }
   else
 {
   *ptr = 0;
   *val = sg->vals[index];
 }
   *ptr = *val;
   return;
}

We can see, after the thread jump optimization, the # of branches inside
the routine "sparx5_set" is reduced from 2 to 1, however,  due to the
code duplication (which is needed for the correctness of the code), we
got a false positive out-of-bound warning.
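
The quoted fragments leave `val` and `struct nums` undeclared, so the following is a self-contained variant for experimentation, under loudly stated assumptions: `val` is taken to be an `int *` parameter, `vals` is taken to have four elements, and `warn` is a local stub that only counts calls. It shows the original shape next to a hand-threaded shape; the two are intended to behave the same for every in-bounds index, and in both the access `sg->vals[index]` is out of bounds whenever `index >= 4`.

```c
#include <assert.h>

/* Assumed layout: four elements, so valid indices are 0..3.  */
struct nums { int vals[4]; };

static int warn_calls;

/* Stub standing in for the thread's warn (); it returns normally.  */
static void
warn (void)
{
  warn_calls++;
}

/* Original shape: the same condition is tested twice.  */
static void
set_original (int *ptr, int *val, const struct nums *sg, int index)
{
  if (index >= 4)
    warn ();
  *ptr = 0;
  *val = sg->vals[index];   /* out of bounds whenever index >= 4 */
  if (index >= 4)
    warn ();
  *ptr = *val;
}

/* Hand-threaded shape: one test, middle statements duplicated.  */
static void
set_threaded (int *ptr, int *val, const struct nums *sg, int index)
{
  if (index >= 4)
    {
      warn ();
      *ptr = 0;
      *val = sg->vals[index];   /* same access, now on a path where
                                   index >= 4 is known */
      warn ();
    }
  else
    {
      *ptr = 0;
      *val = sg->vals[index];
    }
  *ptr = *val;
}
```

Only in-bounds indices should be passed when running this; calling either function with `index >= 4` performs the out-of-bounds read.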

In order to eliminate such false positive out-of-bound warning,

A. Add one more flag for GIMPLE: is_splitted.
B. During the thread jump optimization, when the basic blocks are
duplicated, mark all the STMTs inside the original and duplicated
basic blocks as "is_splitted";
C. Inside the array bound checker, add the following new heuristic:

If
1. the stmt is duplicated and split into two conditional paths;
2. the warning level < 2;
3. the current block is not dominating the exit block
then do not report the warning.

The false positive warnings are moved from -Warray-bounds=1 to
  -Warray-bounds=2 now.

Bootstrapped and regression tested on both x86 and aarch64. adjusted
  -Warray-bounds-61.c due to the false positive warnings.

Let me know if you have any comments and suggestions.

This sounds horribly wrong.   In the code above, the warning is correct.

Jeff


Follow up #1 (was Re: [PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265])

2024-05-13 Thread Vineet Gupta
On 5/13/24 11:49, Vineet Gupta wrote:
>  500.perlbench_r-0 |  1,214,534,029,025 | 1,212,887,959,387 |
>  500.perlbench_r-1 |740,383,419,739 |   739,280,308,163 |
>  500.perlbench_r-2 |692,074,638,817 |   691,118,734,547 |
>  502.gcc_r-0   |190,820,141,435 |   190,857,065,988 |
>  502.gcc_r-1   |225,747,660,839 |   225,809,444,357 | <- -0.02%
>  502.gcc_r-2   |220,370,089,641 |   220,406,367,876 | <- -0.03%
>  502.gcc_r-3   |179,111,460,458 |   179,135,609,723 | <- -0.02%
>  502.gcc_r-4   |219,301,546,340 |   219,320,416,956 | <- -0.01%
>  503.bwaves_r-0|278,733,324,691 |   278,733,323,575 | <- -0.01%
>  503.bwaves_r-1|442,397,521,282 |   442,397,519,616 |
>  503.bwaves_r-2|344,112,218,206 |   344,112,216,760 |
>  503.bwaves_r-3|417,561,469,153 |   417,561,467,597 |
>  505.mcf_r |669,319,257,525 |   669,318,763,084 |
>  507.cactuBSSN_r   |  2,852,767,394,456 | 2,564,736,063,742 | <+ 10.10%
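
The percentage on the last row can be checked with a few lines of C. This is a sketch that assumes the first numeric column is the baseline and the second is the count with the patch applied; the function name `pct_reduction` is made up for the example.

```c
#include <assert.h>

/* Relative reduction of AFTER versus BEFORE, in percent.
   A positive result means fewer dynamic instructions after.  */
static double
pct_reduction (double before, double after)
{
  assert (before > 0.0);
  return (before - after) / before * 100.0;
}
```

For 507.cactuBSSN_r, `pct_reduction (2852767394456.0, 2564736063742.0)` comes out at roughly 10.1, in line with the figure quoted in the table.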

The small gcc regression seems like a tooling issue of some sort.
Looking at the topblocks, the insn sequences are exactly the same, only
the counts differ and it's not obvious why.
Here's for gcc_r-1.


> Block 0 @ 0x170ca, 12 insns, 87854493 times, 0.47%:

000170ca :
   170ca:    7179        add    sp,sp,-48
   170cc:    ec26        sd    s1,24(sp)
   170ce:    e84a        sd    s2,16(sp)
   170d0:    e44e        sd    s3,8(sp)
   170d2:    f406        sd    ra,40(sp)
   170d4:    f022        sd    s0,32(sp)
   170d6:    84aa        mv    s1,a0
   170d8:    03200913      li    s2,50
   170dc:    03d00993      li    s3,61
   170e0:    8526        mv    a0,s1
   170e2:    001cd097      auipc    ra,0x1cd
   170e6:    bac080e7      jalr    -1108(ra) # 1e3c8e


> Block 1 @ 0x706d0a, 3 insns, 274713936 times, 0.37%:
>  Block 2 @ 0x1e3c8e, 9 insns, 88507109 times, 0.35%:
...

< Block 0 @ 0x170ca, 12 insns, 87869602 times, 0.47%:
< Block 1 @ 0x706d42, 3 insns, 274608893 times, 0.36%:
< Block 2 @ 0x1e3c94, 9 insns, 88526354 times, 0.35%:


FWIW, Greg internally has been looking at some of this and found some
issues in the bbv tooling, but I wish all of this was  shared/upstream
(QEMU bbv plugin) for people to compare notes and not discover/fix the
same issues over and again.

Thx,
-Vineet


Re: [PATCH v1 2/3] RISC-V: Implement vectorizable early exit with vcond_mask_len

2024-05-13 Thread Robin Dapp
Hi Pan,

thanks for working on this.

In general the patch looks reasonable to me but I'd rather
have some more comments about the high-level idea.
E.g. cbranch is implemented like aarch64 by xor'ing the
bitmasks and comparing the result against zero (so we branch
based on mask equality).
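
As a scalar illustration of that idea (a sketch only, not the RVV expansion): if a vector mask is modelled as one bit per lane in a 64-bit word, which is an assumption made purely for this example, then branching on mask inequality reduces to an XOR followed by a comparison against zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model: bit i of the word is the mask bit of lane i (assumed).  */
static bool
masks_differ (uint64_t a, uint64_t b)
{
  /* XOR leaves a bit set exactly where the two masks disagree.  */
  return (a ^ b) != 0;
}

/* Early-exit style test: is any lane active?  This is the special
   case of comparing the mask against the all-zeros mask.  */
static bool
any_lane_set (uint64_t mask)
{
  return masks_differ (mask, 0);
}
```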

> +;; vcond_mask_len

High-level description here instead please.

> +(define_insn_and_split "vcond_mask_len_"
> +  [(set (match_operand:VB 0 "register_operand")

> +(unspec: VB [
> + (match_operand:VB 1 "register_operand")
> + (match_operand:VB 2 "const_1_operand")

I guess it works like that because operand[2] is just implicitly
used anyway but shouldn't that rather be an all_ones_operand?

> +   && riscv_vector::get_vector_mode (Pmode, GET_MODE_NUNITS 
> (mode)).exists ()"

Seems a bit odd on first sight.  If all we want to do is to
select between two masks why do we need a large Pmode mode?

> +rtx ops[] = {operands[0], operands[1], operands[1], cmp, reg, 
> operands[4]};

So that's basically a mask-move with length?  Can't this be done
differently?  If not, please describe, maybe this is already
the shortest way.

Regards
 Robin


