Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2

2023-05-16 Thread jinma via Gcc-patches
On 5/15/23 07:16, Jin Ma wrote:
> > This patch adds the 'Zfa' extension for riscv, which is based on:
> > https://github.com/riscv/riscv-isa-manual/commits/zfb
> > 
> > The binutils-gdb for 'Zfa' extension:
> > https://sourceware.org/pipermail/binutils/2023-April/127060.html
> > 
> > What needs special explanation is:
> > 1, The immediate number of the instructions FLI.H/S/D is represented in the 
> > assembly as a
> >floating-point value, with scientific notation when rs1 is 2 or 3, and 
> > decimal numbers for
> >the rest.
> > 
> >Related llvm link:
> >  https://reviews.llvm.org/D145645
> >Related discussion link:
> >  https://github.com/riscv/riscv-isa-manual/issues/980
> > 
> > 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added 
> > principally to
> >accelerate the processing of JavaScript Numbers.", so it seems that no 
> > implementation
> >is required.
> > 
> > 3, The instructions FMINM and FMAXM correspond to C23 library function 
> > fminimum and fmaximum.
> >Therefore, this patch has simply implemented the pattern of 
> > fminm3 and
> >fmaxm3 to prepare for later.
> > 
> > gcc/ChangeLog:
> > 
> >  * common/config/riscv/riscv-common.cc: Add zfa extension version.
> >  * config/riscv/constraints.md (zfli): Constrain the floating point number 
> > that the
> >  instructions FLI.H/S/D can load.
> >  * config/riscv/iterators.md (ceil): New.
> >  (rup): New.
> >  * config/riscv/riscv-opts.h (MASK_ZFA): New.
> >  (TARGET_ZFA): New.
> >  * config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
> >  * config/riscv/riscv.cc (riscv_float_const_rtx_index_for_fli): New.
> >  (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, 
> > memory is not applicable.
> >  (riscv_const_insns): Likewise.
> >  (riscv_legitimize_const_move): Likewise.
> >  (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no split 
> > is required.
> >  (riscv_split_doubleword_move): Likewise.
> >  (riscv_output_move): Output the mov instructions in zfa extension.
> >  (riscv_print_operand): Output the floating-point value of the FLI.H/S/D 
> > immediate in assembly
> >  (riscv_secondary_memory_needed): Likewise.
> >  * config/riscv/riscv.md (fminm3): New.
> >  (fmaxm3): New.
> >  (movsidf2_low_rv32): New.
> >  (movsidf2_high_rv32): New.
> >  (movdfsisi3_rv32): New.
> >  (f_quiet4_zfa): Likewise.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> >  * gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fleq-fltq.c: New test.
> >  * gcc.target/riscv/zfa-fli-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fli-zfh.c: New test.
> >  * gcc.target/riscv/zfa-fli.c: New test.
> >  * gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fround-rv32.c: New test.
> >  * gcc.target/riscv/zfa-fround.c: New test.
> > ---
> >   gcc/common/config/riscv/riscv-common.cc   |   4 +
> >   gcc/config/riscv/constraints.md   |  21 +-
> >   gcc/config/riscv/iterators.md |   5 +
> >   gcc/config/riscv/riscv-opts.h |   3 +
> >   gcc/config/riscv/riscv-protos.h   |   1 +
> >   gcc/config/riscv/riscv.cc | 204 +-
> >   gcc/config/riscv/riscv.md | 145 +++--
> >   .../gcc.target/riscv/zfa-fleq-fltq-rv32.c |  19 ++
> >   .../gcc.target/riscv/zfa-fleq-fltq.c  |  19 ++
> >   gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 +++
> >   .../gcc.target/riscv/zfa-fli-zfh-rv32.c   |  41 
> >   gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 
> >   gcc/testsuite/gcc.target/riscv/zfa-fli.c  |  79 +++
> >   .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 +
> >   .../gcc.target/riscv/zfa-fround-rv32.c|  42 
> >   gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 
> >   16 files changed, 719 insertions(+), 36 deletions(-)
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround-rv32.c
> >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fround.c
> > 
> 
> 
> > +
> > +/* Return index of the FLI instruction table if rtx X is an immediate 
> > constant that can
> > +   be moved using a single FLI instruction in zfa extension. Return -1 if 
> > not found.  */
> > +
> > +int
> > +riscv_float_const_rtx_index_for_fli (rtx x)
> > +{
> > +  unsigned HOST_WIDE_INT *fli_value_array;
> > +
> > +  machin

Re: [PATCH v4] libgfortran: Replace mutex with rwlock

2023-05-16 Thread Zhu, Lipeng via Gcc-patches




On 5/9/2023 10:32 AM, Zhu, Lipeng wrote:



On 1/1/1970 8:00 AM, Bernhard Reutner-Fischer wrote:

On Mon,  8 May 2023 17:44:43 +0800
Lipeng Zhu  wrote:


This patch introduces an rwlock and splits the read and write accesses to
the unit_root tree and unit_cache, replacing the mutex, to increase CPU
efficiency. In the get_gfc_unit function, only around 30% of calls step
into the insert_unit function; in most instances, we can get the unit
while reading the unit_cache or unit_root tree. So splitting the
read/write phases with an rwlock is an approach to make it more parallel.

BTW, IPC improves by around 9x on our test server with 220
cores. The benchmark we used is https://github.com/rwesson/NEAT


See commentary typos below.
You did not state if you regression tested the patch?
I use valgrind --tool=helgrind or --tool=drd to test 'make 
check-fortran'. Is it necessary to add an additional unit test for this 
patch?



Other than that it LGTM but i cannot approve it.
Thank you for your kind help for this patch, is there anything that I 
can do or can you help to push this patch forward?



Hi Bernhard,

Is there any other refinement that I need to do for this patch?

Thanks.





diff --git a/libgfortran/io/async.h b/libgfortran/io/async.h
index ad226c8e856..0033cc74252 100644
--- a/libgfortran/io/async.h
+++ b/libgfortran/io/async.h
@@ -210,6 +210,128 @@
  DEBUG_PRINTF ("%s" DEBUG_RED "ACQ:" DEBUG_NORM " %-30s %78p\n", 
aio_prefix, #mutex,

mutex); \

Thanks, corrected in Patch v5.


    } while (0)
+#ifdef __GTHREAD_RWLOCK_INIT
+#define RWLOCK_DEBUG_ADD(rwlock) do {    \
+    aio_rwlock_debug *n;    \
+    n = xmalloc (sizeof(aio_rwlock_debug));    \


Missing space before the open parenthesis: sizeof (


Thanks, corrected in Patch v5.


diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index 82664dc5f98..62f1db21d34 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -33,34 +33,36 @@ see the files COPYING3 and COPYING.RUNTIME
respectively.  If not, see
  /* IO locking rules:
-   UNIT_LOCK is a master lock, protecting UNIT_ROOT tree and 
UNIT_CACHE.
+   UNIT_RWLOCK is a master lock, protecting UNIT_ROOT tree and 
UNIT_CACHE.

+   And use the rwlock to spilt read and write phase to UNIT_ROOT tree
+   and UNIT_CACHE to increase CPU efficiency.


s/spilt/split. Maybe:

Using an rwlock improves efficiency by allowing us to separate readers 
and writers of both UNIT_ROOT

and UNIT_CACHE.


Thanks, corrected in Patch v5.


@@ -350,6 +356,17 @@ retry:
    if (c == 0)
  break;
  }
+  /* We did not find a unit in the cache nor in the unit list, 
create a new

+    (locked) unit and insert into the unit list and cache.
+    Manipulating either or both the unit list and the unit cache 
requires to

+    hold a write-lock [for obvious reasons]:
+    1. By separating the read/write lock, it will greatly reduce the 
contention
+   at the read part, while write part is not always necessary or 
most

+   unlikely once the unit hit in cache.


+    By separating the read/write lock, we will greatly reduce the 
contention
+    on the read part, while the write part is unlikely once the unit 
hits

+    the cache.

+    2. We try to balance the implementation complexity and the 
performance

+   gains that fit into current cases we observed by just using a
+   pthread_rwlock. */


Let's drop 2.


Got it, thanks!

thanks,


Re: [PATCH] RISC-V: Adjust stdint.h to stdint-gcc.h for rvv tests

2023-05-16 Thread Robin Dapp via Gcc-patches
> This patch would like to align the stdint.h to the stdint-gcc.h for all
> the RVV test files. Aka:
> 
> stdint.h => stdint-gcc.h

Looks good.  Jeff already pre-approved so you can go ahead and install
this on the trunk.

Regards
 Robin


RE: [PATCH] RISC-V: Adjust stdint.h to stdint-gcc.h for rvv tests

2023-05-16 Thread Li, Pan2 via Gcc-patches
Got it, thanks. I will commit to trunk once the RVV tests pass.

Pan

-Original Message-
From: Robin Dapp  
Sent: Tuesday, May 16, 2023 3:11 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang ; jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V: Adjust stdint.h to stdint-gcc.h for rvv tests

> This patch would like to align the stdint.h to the stdint-gcc.h for 
> all the RVV test files. Aka:
> 
> stdint.h => stdint-gcc.h

Looks good.  Jeff already pre-approved so you can go ahead and install this on 
the trunk.

Regards
 Robin


Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard.

>> But we can't generate (vector) gimple that has undefined behaviour from
>> (scalar) gimple that had defined behaviour.  So something needs to change.
>> Either we need to generate a different sequence, or we need to define
>> what the behaviour of len_load/store/etc. are when the length is out of
>> range (perhaps under a target hook?).

To be safe for all targets, we need to add a "MIN" so that the length never
exceeds the vector length.
So case 2 will look like this:

_44 = MIN_EXPR ;
_47 = MIN_EXPR ;
...
_44_2 = MIN_EXPR <_44, 16>;  ->>> add this MIN
LEN_STORE (_6, 8B, _44_2, ...);
...
I suddenly realize that it's better to add a MIN for it. But I am not sure
whether we can have a better gimple IR than this.
>> We also need to be consistent.  If case 2 is allowed to use length
>> parameters that are greater than the vector length, then there's no
>> reason for case 1 to use the result of the MIN_EXPR as the length
>> parameter.  It could just use the loop IV directly.  (I realise the
>> select_vl patch will change case 1 for RVV anyway.  But the principle
>>  still holds.)

Oh, thanks for catching this. After thinking about your comments,
I suddenly realize that allowing the length to be larger than VF will create
potential issues for RVV.
 
>> What does the riscv backend's implementation of the len_load and
>> len_store guarantee?  Is any length greater than the vector length
>> capped to the vector length?  Or is it more complicated than that?

For RVV, it is more complicated than that...

For RVV, we will emit vsetvli instruction first for len_load/len_store.

For example for case 1:
loop:
min a5, a4,16
vsetvli zero, a5...
load.
add.
store
a4 = a4 - a5...

Since we have min a5,a4,16, the length is always <= vf.
So it works fine for RVV.

However, if we do something like this (relying on the undefined behavior of
len_load/len_store):
loop:
min a5, a4,16
vsetvli zero, a4...
load.
add.
store
a4 = a4 - a5...

Notice the difference here:
vsetvli zero, a4..., here we use "a4" instead of "a5", which creates an
issue.
According to the RVV ISA, for the vsetvli instruction:
vsetvli zero, a4...,  if a4 <= vf, then the length = a4
if vf < a4 <= 2 * vf, the length can be any value in [a4/2, vf], depending
on the hardware design.
Since our current data reference pointer IV is incremented by VF (in bytes) by
default, this will be an issue.

So, maybe for case 2, as you said, we should not involve undefined behavior
in len_load/len_store;
instead, we should handle the loop control properly by adding "MIN (16)"?
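
To make the vsetvli point concrete, here is a small C model of the range of vector lengths that the RVV rule allows for a requested length. It is only a sketch: rvv_vl_lower_bound, rvv_vl_upper_bound and clamped_len are hypothetical helper names, and vlmax plays the role of "vf" in the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest vl the RVV rule permits for a requested length AVL when the
   hardware maximum is VLMAX.  */
size_t
rvv_vl_lower_bound (size_t avl, size_t vlmax)
{
  assert (vlmax > 0);
  if (avl <= vlmax)
    return avl;                 /* vl = AVL exactly.  */
  if (avl < 2 * vlmax)
    return (avl + 1) / 2;       /* ceil (AVL / 2) <= vl <= VLMAX.  */
  return vlmax;                 /* vl = VLMAX.  */
}

/* Largest vl permitted for the same request.  */
size_t
rvv_vl_upper_bound (size_t avl, size_t vlmax)
{
  assert (vlmax > 0);
  return avl < vlmax ? avl : vlmax;
}

/* Length after the "min a5, a4, 16" clamp from case 1.  */
size_t
clamped_len (size_t remaining, size_t vf)
{
  return remaining < vf ? remaining : vf;
}
```

With vlmax = 16, a request of 24 may yield any vl from 12 to 16, so the pointer IV that advances by a full VF can skip elements that were never processed. Passing the clamped length instead makes the lower and upper bounds coincide, which is why the MIN keeps the loop well defined.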

Thanks.



juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 14:57
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai"  writes:
>>> The examples are good, but this one made me wonder: why is the
>>> adjustment made to the limit (namely 16, the gap between _39 and _41)
>>> different from the limits imposed by the MIN_EXPR (32)?  And I think
>>> the answer is that:
>
>>> - _47 counts the number of elements processed by the loop in total,
>>>   including the vectors under the control of _44
>
>>> - _44 counts the number of elements controlled by _47 in the next
>>>   iteration of the vector loop (if there is one)
>
>>> And that's needed to allow the IVs to be updated independently.
>
>>> The difficulty with this is that the len_load* and len_store*
>>> optabs currently say that the behaviour is undefined if the
>>> length argument is greater than the length of a vector.
>>> So I think using these values of _47 and _44 in the .LEN_STOREs
>>> is relying on undefined behaviour.
>
>>> Haven't had time to think about the consequences of that yet,
>>> but wanted to send something out sooner rather than later.
>
> Hi, Richard. I totally understand your concern now. I think the undefined 
> behavior is more
> appropriate for RVV since we have vsetvli instruction that guarantees this will 
> cause potential
> issues. However, for some other target, we may need to use additional 
> MIN_EXPR to guard
> the length never over VF. I think it can be addressed in the future when it 
> is needed.
 
But we can't generate (vector) gimple that has undefined behaviour from
(scalar) gimple that had defined behaviour.  So something needs to change.
Either we need to generate a different sequence, or we need to define
what the behaviour of len_load/store/etc. are when the length is out of
range (perhaps under a target hook?).
 
We also need to be consistent.  If case 2 is allowed to use length
parameters that are greater than the vector length, then there's no
reason for case 1 to use the result of the MIN_EXPR as the length
parameter.  It could just use the loop IV directly.  (I realise the
select_vl patch will change case 1 for RVV anyway.  But the principle
still holds.)
 
What does the riscv backend's implementation of the len_load and
len_store guarantee?  Is any length greater than the vector length
capp

Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Richard Sandiford via Gcc-patches
"Li, Pan2"  writes:
> Kindly ping for this PATCH v3.

The patch was sent on Saturday, so this is effectively pinging after
one working day in most of Europe and America.  That's too soon and
comes across as aggressive.

I realise you and others are working intensively on this.  But in a
sense that's part of the reason why reviews might seem slow.  The volume
of RVV patches recently has been pretty high, so it's been difficult to
keep up.  There have also been many other non-RVV patches that have
been "unlocked" by stage 1 opening, so there's a high volume from that
as well.

Also, please bear in mind that most people active in the GCC community
have their own work to do and can only dedicate a certain amount of
the day to reviews.  And reviewing patches can be time-consuming in
itself.

So sometimes a patch will get a review within the day.  Sometimes it
will take a bit longer.  The fact that a patch doesn't get a response
within one working day doesn't mean that it's been forgotten.

Thanks,
Richard


RE: [PATCH] RISC-V: Adjust stdint.h to stdint-gcc.h for rvv tests

2023-05-16 Thread Li, Pan2 via Gcc-patches
Committed, thanks all.

Pan

-Original Message-
From: Li, Pan2 
Sent: Tuesday, May 16, 2023 3:17 PM
To: Robin Dapp ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, Yanzhang 
; jeffreya...@gmail.com
Subject: RE: [PATCH] RISC-V: Adjust stdint.h to stdint-gcc.h for rvv tests

Got it, thanks. I will commit to trunk once the RVV tests pass.

Pan

-Original Message-
From: Robin Dapp 
Sent: Tuesday, May 16, 2023 3:11 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang ; jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V: Adjust stdint.h to stdint-gcc.h for rvv tests

> This patch would like to align the stdint.h to the stdint-gcc.h for 
> all the RVV test files. Aka:
> 
> stdint.h => stdint-gcc.h

Looks good.  Jeff already pre-approved so you can go ahead and install this on 
the trunk.

Regards
 Robin


Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard.
For case 2, I come up with this idea:

+Case 2 (SLP multiple rgroup):
+   ...
+   _38 = (unsigned long) n_12(D);
+   _39 = _38 * 2;
+   _40 = MAX_EXPR <_39, 16>;
+   _41 = _40 - 16;
+   ...
+   # ivtmp_42 = PHI 
+   # ivtmp_45 = PHI 
+   ...
+   _44 = MIN_EXPR ;
+   _47 = MIN_EXPR ;
+   _47_2 = MIN_EXPR <_47, 16>;  >add
+   _47_3 = _47 - _47_2 ; > add
+   ...
+   .LEN_STORE (_6, 8B, _47_2, ...);
+   ...
+   .LEN_STORE (_25, 8B, _47_3, ...);
+   _33 = _47_2 / 2;
+   ...
+   .LEN_STORE (_8, 16B, _33, ...);
+   _36 = _47_3 / 2;
+   ...
+   .LEN_STORE (_15, 16B, _36, ...);
+   ivtmp_46 = ivtmp_45 - _47;
+   ivtmp_43 = ivtmp_42 - _44;
+   ...
+   if (ivtmp_46 != 0)
+ goto ; [83.33%]
+   else
+ goto ; [16.67%]
Is it reasonable? Or do you have a better idea for it?

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 14:57
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai"  writes:
>>> The examples are good, but this one made me wonder: why is the
>>> adjustment made to the limit (namely 16, the gap between _39 and _41)
>>> different from the limits imposed by the MIN_EXPR (32)?  And I think
>>> the answer is that:
>
>>> - _47 counts the number of elements processed by the loop in total,
>>>   including the vectors under the control of _44
>
>>> - _44 counts the number of elements controlled by _47 in the next
>>>   iteration of the vector loop (if there is one)
>
>>> And that's needed to allow the IVs to be updated independently.
>
>>> The difficulty with this is that the len_load* and len_store*
>>> optabs currently say that the behaviour is undefined if the
>>> length argument is greater than the length of a vector.
>>> So I think using these values of _47 and _44 in the .LEN_STOREs
>>> is relying on undefined behaviour.
>
>>> Haven't had time to think about the consequences of that yet,
>>> but wanted to send something out sooner rather than later.
>
> Hi, Richard. I totally understand your concern now. I think the undefined 
> behavior is more
> appropriate for RVV since we have vsetvli instruction that guarantees this will 
> cause potential
> issues. However, for some other target, we may need to use additional 
> MIN_EXPR to guard
> the length never over VF. I think it can be addressed in the future when it 
> is needed.
 
But we can't generate (vector) gimple that has undefined behaviour from
(scalar) gimple that had defined behaviour.  So something needs to change.
Either we need to generate a different sequence, or we need to define
what the behaviour of len_load/store/etc. are when the length is out of
range (perhaps under a target hook?).
 
We also need to be consistent.  If case 2 is allowed to use length
parameters that are greater than the vector length, then there's no
reason for case 1 to use the result of the MIN_EXPR as the length
parameter.  It could just use the loop IV directly.  (I realise the
select_vl patch will change case 1 for RVV anyway.  But the principle
still holds.)
 
What does the riscv backend's implementation of the len_load and
len_store guarantee?  Is any length greater than the vector length
capped to the vector length?  Or is it more complicated than that?
 
Thanks,
Richard
 


Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Oh, I am sorry for the typos in the last email; here is the corrected version:

Hi, Richard.
For case 2, I come up with this idea:
+Case 2 (SLP multiple rgroup):
+   ...
+   _38 = (unsigned long) n_12(D);
+   _39 = _38 * 2;
+   _40 = MAX_EXPR <_39, 16>;   ->remove
+   _41 = _40 - 16; ->remove

+   ...
+   # ivtmp_42 = PHI   ->remove

+   # ivtmp_45 = PHI 
+   ...
+   _44 = MIN_EXPR ;  ->remove

+   _47 = MIN_EXPR ;
+   _47_2 = MIN_EXPR <_47, 16>;  >add
+   _47_3 = _47 - _47_2 ; > add
+   ...
+   .LEN_STORE (_6, 8B, _47_2, ...);
+   ...
+   .LEN_STORE (_25, 8B, _47_3, ...);
+   _33 = _47_2 / 2;
+   ...
+   .LEN_STORE (_8, 16B, _33, ...);
+   _36 = _47_3 / 2;
+   ...
+   .LEN_STORE (_15, 16B, _36, ...);
+   ivtmp_46 = ivtmp_45 - _47;
+   ivtmp_43 = ivtmp_42 - _44;  ->remove

+   ...
+   if (ivtmp_46 != 0)
+ goto ; [83.33%]
+   else
+ goto ; [16.67%]
Is it reasonable? Or do you have a better idea for it?
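
The splitting idea for case 2 can be sketched in C as below. This is an illustration only: struct rgroup_lens, split_len and process_all are hypothetical names, and the per-iteration limit of 32 (2 * 16) is taken from the limit mentioned in Richard's quoted reply, since the MIN_EXPR operands are missing from the text above.

```c
#include <assert.h>

struct rgroup_lens
{
  unsigned long first;   /* Plays the role of _47_2.  */
  unsigned long second;  /* Plays the role of _47_3.  */
};

/* Split TOTAL (the role of _47) elements across two vectors that each
   hold at most PER_VECTOR elements.  */
struct rgroup_lens
split_len (unsigned long total, unsigned long per_vector)
{
  assert (total <= 2 * per_vector);
  struct rgroup_lens r;
  r.first = total < per_vector ? total : per_vector;
  r.second = total - r.first;
  return r;
}

/* Walk N elements with a single decrementing IV (the role of ivtmp_45)
   and return how many elements the two stores covered in total.  */
unsigned long
process_all (unsigned long n, unsigned long per_vector)
{
  unsigned long remaining = n;
  unsigned long covered = 0;
  while (remaining != 0)
    {
      unsigned long limit = 2 * per_vector;
      unsigned long total = remaining < limit ? remaining : limit;
      struct rgroup_lens r = split_len (total, per_vector);
      /* Neither length may exceed the vector length.  */
      assert (r.first <= per_vector && r.second <= per_vector);
      covered += r.first + r.second;
      remaining -= total;
    }
  return covered;
}
```

Because both lengths are derived from one value, a single IV is enough and neither .LEN_STORE ever sees a length above 16, which avoids relying on the undefined out-of-range behaviour.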
Thanks.



juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 14:57
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai"  writes:
>>> The examples are good, but this one made me wonder: why is the
>>> adjustment made to the limit (namely 16, the gap between _39 and _41)
>>> different from the limits imposed by the MIN_EXPR (32)?  And I think
>>> the answer is that:
>
>>> - _47 counts the number of elements processed by the loop in total,
>>>   including the vectors under the control of _44
>
>>> - _44 counts the number of elements controlled by _47 in the next
>>>   iteration of the vector loop (if there is one)
>
>>> And that's needed to allow the IVs to be updated independently.
>
>>> The difficulty with this is that the len_load* and len_store*
>>> optabs currently say that the behaviour is undefined if the
>>> length argument is greater than the length of a vector.
>>> So I think using these values of _47 and _44 in the .LEN_STOREs
>>> is relying on undefined behaviour.
>
>>> Haven't had time to think about the consequences of that yet,
>>> but wanted to send something out sooner rather than later.
>
> Hi, Richard. I totally understand your concern now. I think the undefined 
> behavior is more
> appropriate for RVV since we have vsetvli instruction that guarantees this will 
> cause potential
> issues. However, for some other target, we may need to use additional 
> MIN_EXPR to guard
> the length never over VF. I think it can be addressed in the future when it 
> is needed.
 
But we can't generate (vector) gimple that has undefined behaviour from
(scalar) gimple that had defined behaviour.  So something needs to change.
Either we need to generate a different sequence, or we need to define
what the behaviour of len_load/store/etc. are when the length is out of
range (perhaps under a target hook?).
 
We also need to be consistent.  If case 2 is allowed to use length
parameters that are greater than the vector length, then there's no
reason for case 1 to use the result of the MIN_EXPR as the length
parameter.  It could just use the loop IV directly.  (I realise the
select_vl patch will change case 1 for RVV anyway.  But the principle
still holds.)
 
What does the riscv backend's implementation of the len_load and
len_store guarantee?  Is any length greater than the vector length
capped to the vector length?  Or is it more complicated than that?
 
Thanks,
Richard
 


Re: [PATCH v9] RISC-V: Add the 'zfa' extension, version 0.2

2023-05-16 Thread Kito Cheng via Gcc-patches
zfa requires/depends on f, which means zfa implies f in the current toolchain
implementation. Could you add that to riscv-common.cc?

Also that means zfa is exclusive with Z[FDH]INX.

Ref: https://github.com/riscv/riscv-isa-manual/issues/1020
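
As a rough illustration of the two rules requested above (zfa implies f, and zfa cannot be combined with Z[FDH]INX), here is a self-contained C sketch. It is not the riscv-common.cc code: struct implied_ext, implied_table, ext_implies and conflicts_with_zfa are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified implication table: each entry says that
   enabling EXT also enables IMPLIED.  */
struct implied_ext
{
  const char *ext;
  const char *implied;
};

static const struct implied_ext implied_table[] = {
  { "d", "f" },
  { "zfa", "f" },   /* The rule requested in the review.  */
};

/* Return true if enabling EXT directly implies IMPLIED.  */
bool
ext_implies (const char *ext, const char *implied)
{
  size_t n = sizeof (implied_table) / sizeof (implied_table[0]);
  assert (ext != NULL && implied != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (implied_table[i].ext, ext) == 0
        && strcmp (implied_table[i].implied, implied) == 0)
      return true;
  return false;
}

/* Return true if EXT is one of the Z[FDH]INX extensions, which cannot
   be combined with zfa.  */
bool
conflicts_with_zfa (const char *ext)
{
  assert (ext != NULL);
  return strcmp (ext, "zfinx") == 0
         || strcmp (ext, "zdinx") == 0
         || strcmp (ext, "zhinx") == 0;
}
```

A table keeps the implication data separate from the lookup logic, so adding a new rule is a one-line change.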

On Tue, May 16, 2023 at 3:06 PM jinma  wrote:
>
> On 5/15/23 07:16, Jin Ma wrote:
> > > This patch adds the 'Zfa' extension for riscv, which is based on:
> > > https://github.com/riscv/riscv-isa-manual/commits/zfb
> > >
> > > The binutils-gdb for 'Zfa' extension:
> > > https://sourceware.org/pipermail/binutils/2023-April/127060.html
> > >
> > > What needs special explanation is:
> > > 1, The immediate number of the instructions FLI.H/S/D is represented in 
> > > the assembly as a
> > >floating-point value, with scientific notation when rs1 is 2 or 3, and 
> > > decimal numbers for
> > >the rest.
> > >
> > >Related llvm link:
> > >  https://reviews.llvm.org/D145645
> > >Related discussion link:
> > >  https://github.com/riscv/riscv-isa-manual/issues/980
> > >
> > > 2, According to riscv-spec, "The FCVTMOD.W.D instruction was added 
> > > principally to
> > >accelerate the processing of JavaScript Numbers.", so it seems that no 
> > > implementation
> > >is required.
> > >
> > > 3, The instructions FMINM and FMAXM correspond to C23 library function 
> > > fminimum and fmaximum.
> > >Therefore, this patch has simply implemented the pattern of 
> > > fminm3 and
> > >fmaxm3 to prepare for later.
> > >
> > > gcc/ChangeLog:
> > >
> > >  * common/config/riscv/riscv-common.cc: Add zfa extension version.
> > >  * config/riscv/constraints.md (zfli): Constrain the floating point 
> > > number that the
> > >  instructions FLI.H/S/D can load.
> > >  * config/riscv/iterators.md (ceil): New.
> > >  (rup): New.
> > >  * config/riscv/riscv-opts.h (MASK_ZFA): New.
> > >  (TARGET_ZFA): New.
> > >  * config/riscv/riscv-protos.h (riscv_float_const_rtx_index_for_fli): New.
> > >  * config/riscv/riscv.cc (riscv_float_const_rtx_index_for_fli): New.
> > >  (riscv_cannot_force_const_mem): If instruction FLI.H/S/D can be used, 
> > > memory is not applicable.
> > >  (riscv_const_insns): Likewise.
> > >  (riscv_legitimize_const_move): Likewise.
> > >  (riscv_split_64bit_move_p): If instruction FLI.H/S/D can be used, no 
> > > split is required.
> > >  (riscv_split_doubleword_move): Likewise.
> > >  (riscv_output_move): Output the mov instructions in zfa extension.
> > >  (riscv_print_operand): Output the floating-point value of the FLI.H/S/D 
> > > immediate in assembly
> > >  (riscv_secondary_memory_needed): Likewise.
> > >  * config/riscv/riscv.md (fminm3): New.
> > >  (fmaxm3): New.
> > >  (movsidf2_low_rv32): New.
> > >  (movsidf2_high_rv32): New.
> > >  (movdfsisi3_rv32): New.
> > >  (f_quiet4_zfa): Likewise.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >  * gcc.target/riscv/zfa-fleq-fltq-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fleq-fltq.c: New test.
> > >  * gcc.target/riscv/zfa-fli-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fli-zfh-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fli-zfh.c: New test.
> > >  * gcc.target/riscv/zfa-fli.c: New test.
> > >  * gcc.target/riscv/zfa-fmovh-fmovp-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fround-rv32.c: New test.
> > >  * gcc.target/riscv/zfa-fround.c: New test.
> > > ---
> > >   gcc/common/config/riscv/riscv-common.cc   |   4 +
> > >   gcc/config/riscv/constraints.md   |  21 +-
> > >   gcc/config/riscv/iterators.md |   5 +
> > >   gcc/config/riscv/riscv-opts.h |   3 +
> > >   gcc/config/riscv/riscv-protos.h   |   1 +
> > >   gcc/config/riscv/riscv.cc | 204 +-
> > >   gcc/config/riscv/riscv.md | 145 +++--
> > >   .../gcc.target/riscv/zfa-fleq-fltq-rv32.c |  19 ++
> > >   .../gcc.target/riscv/zfa-fleq-fltq.c  |  19 ++
> > >   gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c |  79 +++
> > >   .../gcc.target/riscv/zfa-fli-zfh-rv32.c   |  41 
> > >   gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c  |  41 
> > >   gcc/testsuite/gcc.target/riscv/zfa-fli.c  |  79 +++
> > >   .../gcc.target/riscv/zfa-fmovh-fmovp-rv32.c   |  10 +
> > >   .../gcc.target/riscv/zfa-fround-rv32.c|  42 
> > >   gcc/testsuite/gcc.target/riscv/zfa-fround.c   |  42 
> > >   16 files changed, 719 insertions(+), 36 deletions(-)
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq-rv32.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fleq-fltq.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-rv32.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh-rv32.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli-zfh.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fli.c
> > >   create mode 100644 gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-rv32.c
> > >   create mode 100644 

RE: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Li, Pan2 via Gcc-patches
I see, thanks Richard for the reminder. I am sorry, I failed to find any place
(docs or elsewhere) that mentions such a convention for pings; I will follow
the convention below in future.

Pan

-Original Message-
From: Richard Sandiford  
Sent: Tuesday, May 16, 2023 3:30 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang ; jeffreya...@gmail.com; rguent...@suse.de
Subject: Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

"Li, Pan2"  writes:
> Kindly ping for this PATCH v3.

The patch was sent on Saturday, so this is effectively pinging after one 
working day in most of Europe and America.  That's too soon and comes across as 
aggressive.

I realise you and others are working intensively on this.  But in a sense 
that's part of the reason why reviews might seem slow.  The volume of RVV 
patches recently has been pretty high, so it's been difficult to keep up.  
There have also been many other non-RVV patches that have been "unlocked" 
by stage 1 opening, so there's a high volume from that as well.

Also, please bear in mind that most people active in the GCC community have 
their own work to do and can only dedicate a certain amount of the day to 
reviews.  And reviewing patches can be time-consuming in itself.

So sometimes a patch will get a review within the day.  Sometimes it will take 
a bit longer.  The fact that a patch doesn't get a response within one working 
day doesn't mean that it's been forgotten.

Thanks,
Richard


Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]

2023-05-16 Thread Tejas Belagod via Gcc-patches
Thanks for your comments, Richard.

From: Richard Sandiford 
Date: Friday, May 12, 2023 at 1:02 AM
To: Tejas Belagod 
Cc: gcc-patches@gcc.gnu.org , Tejas Belagod 

Subject: Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]
Tejas Belagod  writes:
> From: Tejas Belagod 
>
>   This PR optimizes an SVE intrinsics sequence where
> svlasta (svptrue_pat_b8 (SV_VL1), x)
>   a scalar is selected based on a constant predicate and a variable vector.
>   This sequence is optimized to return the corresponding element of a NEON
>   vector. For example,
> svlasta (svptrue_pat_b8 (SV_VL1), x)
>   returns
> umov    w0, v0.b[1]
>   Likewise,
> svlastb (svptrue_pat_b8 (SV_VL1), x)
>   returns
> umov    w0, v0.b[0]
>   This optimization only works provided the constant predicate maps to a range
>   that is within the bounds of a 128-bit NEON register.
>
> gcc/ChangeLog:
>
>PR target/96339
>* config/aarch64/aarch64-sve-builtins-base.cc (svlast_impl::fold): 
> Fold sve
>calls that have a constant input predicate vector.
>(svlast_impl::is_lasta): Query to check if intrinsic is svlasta.
>(svlast_impl::is_lastb): Query to check if intrinsic is svlastb.
>(svlast_impl::vect_all_same): Check if all vector elements are equal.
>
> gcc/testsuite/ChangeLog:
>
>PR target/96339
>* gcc.target/aarch64/sve/acle/general-c/svlast.c: New.
>* gcc.target/aarch64/sve/acle/general-c/svlast128_run.c: New.
>* gcc.target/aarch64/sve/acle/general-c/svlast256_run.c: New.
>* gcc.target/aarch64/sve/pcs/return_4.c (caller_bf16): Fix asm
>to expect optimized code for function body.
>* gcc.target/aarch64/sve/pcs/return_4_128.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_4_256.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_4_512.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_4_1024.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_4_2048.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_5.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_5_128.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_5_256.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_5_512.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_5_1024.c (caller_bf16): Likewise.
>* gcc.target/aarch64/sve/pcs/return_5_2048.c (caller_bf16): Likewise.
> ---
>  .../aarch64/aarch64-sve-builtins-base.cc  | 124 +++
>  .../aarch64/sve/acle/general-c/svlast.c   |  63 
>  .../sve/acle/general-c/svlast128_run.c| 313 +
>  .../sve/acle/general-c/svlast256_run.c| 314 ++
>  .../gcc.target/aarch64/sve/pcs/return_4.c |   2 -
>  .../aarch64/sve/pcs/return_4_1024.c   |   2 -
>  .../gcc.target/aarch64/sve/pcs/return_4_128.c |   2 -
>  .../aarch64/sve/pcs/return_4_2048.c   |   2 -
>  .../gcc.target/aarch64/sve/pcs/return_4_256.c |   2 -
>  .../gcc.target/aarch64/sve/pcs/return_4_512.c |   2 -
>  .../gcc.target/aarch64/sve/pcs/return_5.c |   2 -
>  .../aarch64/sve/pcs/return_5_1024.c   |   2 -
>  .../gcc.target/aarch64/sve/pcs/return_5_128.c |   2 -
>  .../aarch64/sve/pcs/return_5_2048.c   |   2 -
>  .../gcc.target/aarch64/sve/pcs/return_5_256.c |   2 -
>  .../gcc.target/aarch64/sve/pcs/return_5_512.c |   2 -
>  16 files changed, 814 insertions(+), 24 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/svlast.c
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/svlast128_run.c
>  create mode 100644 
> gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/svlast256_run.c
>
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> index cd9cace3c9b..db2b4dcaac9 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> @@ -1056,6 +1056,130 @@ class svlast_impl : public quiet
>  public:
>CONSTEXPR svlast_impl (int unspec) : m_unspec (unspec) {}
>
> +  bool is_lasta () const { return m_unspec == UNSPEC_LASTA; }
> +  bool is_lastb () const { return m_unspec == UNSPEC_LASTB; }
> +
> +  bool vect_all_same (tree v , int step) const

Nit: stray space after "v".

> +  {
> +int i;
> +int nelts = vector_cst_encoded_nelts (v);
> +int first_el = 0;
> +
> +for (i = first_el; i < nelts; i += step)
> +  if (VECTOR_CST_ENCODED_ELT (v, i) != VECTOR_CST_ENCODED_ELT (v, 
> first_el))

I think this should use !operand_equal_p (..., ..., 0).

Oops! I wonder why I thought VECTOR_CST_ENCODED_ELT returned a constant! Thanks 
for spotting that. Also, should the flags here be OEP_ONLY_CONST ?
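
The difference between comparing nodes by identity and comparing them by value can be illustrated outside GCC. The Python sketch below mirrors the stepped loop of vect_all_same; the `Const` class is a hypothetical stand-in for a constant tree node, and `==` plays the role that operand_equal_p would play in the real code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    """Hypothetical stand-in for a constant tree node."""
    value: int


def vect_all_same(elts, step):
    """True if every step-th element, starting at 0, equals the first by value."""
    first = elts[0]
    for i in range(0, len(elts), step):
        if elts[i] != first:  # value comparison, not identity
            return False
    return True


# Two distinct objects holding the same value: equal, but not identical.
a, b = Const(1), Const(1)
print(a is b, a == b)

print(vect_all_same([Const(1), Const(0), Const(1), Const(0)], 2))
print(vect_all_same([Const(1), Const(0), Const(2), Const(0)], 2))
```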


> + return false;
> +
> +return true;
> +  }
> +
> +  /* Fold a svlast{a/b} call with constant predicate to a BIT_FI

Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Xi Ruoyao via Gcc-patches
On Tue, 2023-05-16 at 07:55 +, Li, Pan2 via Gcc-patches wrote:
> I see, thanks Richard for the reminder. I am sorry, I failed to find
> anything (docs or elsewhere) that mentions such a convention about pinging,

https://gcc.gnu.org/contribute.html suggests two weeks.

> will follow the convention below in future.
> 
> Pan
> 
> -Original Message-
> From: Richard Sandiford  
> Sent: Tuesday, May 16, 2023 3:30 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai;
> kito.ch...@sifive.com; Wang, Yanzhang ;
> jeffreya...@gmail.com; rguent...@suse.de
> Subject: Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16
> bits
> 
> "Li, Pan2"  writes:
> > Kindly ping for this PATCH v3.
> 
> The patch was sent on Saturday, so this is effectively pinging after
> one working day in most of Europe and America.  That's too soon and
> comes across as aggressive.
> 
> I realise you and others are working intensively on this.  But in a
> sense that's part of the reason why reviews might seem slow.  The
> volume of RVV patches recently has been pretty high, so it's been
> difficult to keep up.  There have also been many other non-RVV
> patches that have been "unlocked" by stage 1 opening, so there's a
> high volume from that as well.
> 
> Also, please bear in mind that most people active in the GCC community
> have their own work to do and can only dedicate a certain amount of
> the day to reviews.  And reviewing patches can be time-consuming in
> itself.
> 
> So sometimes a patch will get a review within the day.  Sometimes it
> will take a bit longer.  The fact that a patch doesn't get a response
> within one working day doesn't mean that it's been forgotten.
> 
> Thanks,
> Richard

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


RE: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Li, Pan2 via Gcc-patches
Got it. Thank you!

Pan



Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai"  writes:
> Oh, I am sorry for the typos in the last email. Here is the corrected
> version:
>
> Hi, Richard.
> For case 2, I came up with this idea:
> +  Case 2 (SLP multiple rgroup):
> + ...
> + _38 = (unsigned long) n_12(D);
> + _39 = _38 * 2;
> + _40 = MAX_EXPR <_39, 16>;   ->remove
> + _41 = _40 - 16; ->remove
>
> + ...
> + # ivtmp_42 = PHI   ->remove
>
> + # ivtmp_45 = PHI 
> + ...
> + _44 = MIN_EXPR ;  ->remove
>
> + _47 = MIN_EXPR ;
> + _47_2 = MIN_EXPR <_47, 16>;  ->add
> + _47_3 = _47 - _47_2;  ->add
> + ...
> + .LEN_STORE (_6, 8B, _47_2, ...);
> + ...
> + .LEN_STORE (_25, 8B, _47_3, ...);
> + _33 = _47_2 / 2;
> + ...
> + .LEN_STORE (_8, 16B, _33, ...);
> + _36 = _47_3 / 2;
> + ...
> + .LEN_STORE (_15, 16B, _36, ...);
> + ivtmp_46 = ivtmp_45 - _47;
> + ivtmp_43 = ivtmp_42 - _44;  ->remove
>
> + ...
> + if (ivtmp_46 != 0)
> +   goto ; [83.33%]
> + else
> +   goto ; [16.67%]
> Is this reasonable? Or do you have a better idea for it?

Yeah, this makes sense, and I think it makes case 2 very similar
(equivalent?) to case 3.  If so, it would be nice if they could be
combined.

Of course, this loses the nice property that the original had: that each
IV was independent, and so the dependency chains were shorter.  With the
above approach, the second length parameter instead depends on a
three-instruction chain.  But that might be OK (up to you).
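
To make the trade-off concrete, here is a small Python model of the scheme quoted above: a single decrementing IV yields a per-iteration length, which is then split between two length controls. The MIN_EXPR operands are not fully visible in the quoted dump, so the per-iteration cap of twice the per-control limit is an assumption, as are all the names used here.

```python
def rgroup_lengths(total, per_control=16):
    """Yield (first_len, second_len) for each iteration of the modelled loop.

    Assumes total > 0, since the quoted loop tests the IV only at the bottom.
    """
    remaining = total
    while True:
        step = min(remaining, 2 * per_control)  # models _47 (cap is assumed)
        first = min(step, per_control)          # models _47_2
        second = step - first                   # models _47_3
        yield first, second
        remaining -= step                       # models ivtmp_46
        if remaining == 0:
            break


for first, second in rgroup_lengths(40):
    print(first, second)
```

In this model `second` can only be computed once `step` and `first` are known, which corresponds to the three-instruction chain mentioned above.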

How much of the riscv backend infrastructure is in place now?  The reason
I ask is that it would be good if the patch had some tests.  AIUI, the
patch is an optimisation on top of what the current len_load/store code does,
rather than something that is needed for correctness.  So it seems like
the necessary patterns could be added and tested using the current approach,
then this patch could be applied on top, with its own tests for the new
approach.

Thanks,
Richard


Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard.

Status of the RVV infrastructure in the RISC-V backend:
1. All RVV instruction patterns related to intrinsics are finished (they
will be used not only by intrinsics but also by autovec in the future).
2. For autovec, we have finished:
   len_load/len_store (these are used temporarily and will be removed once I
   support len_mask_load/len_mask_store in the middle-end).
   binary integer autovec patterns.
   vec_init pattern.
   That's all we have so far.

As for testing this patch, I have multiple rgroup testcases locally. Do you
mean you want me to post them together with this patch?
Since I am going to put them in the RISC-V backend testsuite, I was planning to
post them after this patch is finished and merged into trunk.
What do you suggest?

Thanks.


juzhe.zh...@rivai.ai
 


Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai"  writes:
> Hi, Richard.
>
> Status of the RVV infrastructure in the RISC-V backend:
> 1. All RVV instruction patterns related to intrinsics are finished (they
> will be used not only by intrinsics but also by autovec in the future).
> 2. For autovec, we have finished:
>    len_load/len_store (these are used temporarily and will be removed once I
>    support len_mask_load/len_mask_store in the middle-end).
>    binary integer autovec patterns.
>    vec_init pattern.
>    That's all we have so far.

Thanks.

> As for testing this patch, I have multiple rgroup testcases locally. Do you
> mean you want me to post them together with this patch?
> Since I am going to put them in the RISC-V backend testsuite, I was planning
> to post them after this patch is finished and merged into trunk.
> What do you suggest?

It would be useful to include the tests with the patch itself (as a patch
to the testsuite).  It doesn't matter that the tests are riscv-specific.

Obviously it would be more appropriate for the riscv maintainers to
review the riscv tests.  But keeping the tests with the patch helps when
reviewing the code, and also ensures that code isn't committed and then
never tested later.

Richard


RE: [PATCH] aarch64: Add SVE instruction types

2023-05-16 Thread Kyrylo Tkachov via Gcc-patches
Hi Evandro,

I created a new attribute so I didn’t have to extend the “type” attribute that 
lives in config/arm/types.md. As that attribute and file live in the arm 
backend but SVE is AArch64-only, I didn’t want to add logic to the arm backend 
as it’s not truly shared.
The granularity has been somewhat subjective. I had looked at the Software 
Optimisation guides for various SVE and SVE2-capable cores from Arm on 
developer.arm.com and tried to glean commonalities between different 
instruction groups.
I did try writing a model for Neoverse V1 using that classification, but I 
couldn’t spend much time on it, and the resulting model didn’t give me much 
improvement and gave some regressions instead.
I think that was more down to my rushed model than anything else, though.

Thanks,
Kyrill

From: Evandro Menezes 
Sent: Monday, May 15, 2023 9:13 PM
To: Kyrylo Tkachov 
Cc: Richard Sandiford ; Evandro Menezes via 
Gcc-patches ; evandro+...@gcc.gnu.org; Tamar Christina 

Subject: Re: [PATCH] aarch64: Add SVE instruction types

Hi, Kyrill.

I wasn’t aware of your previous patch.  Could you clarify why you considered 
creating an SVE specific type attribute instead of reusing the common one?  I 
really liked the iterators that you created; I’d like to use them.

Do you have specific examples which you might want to mention with regards to 
granularity?

Yes, my intent for this patch is to enable modeling the SVE instructions on N1. 
 The patch that implements it brings up some performance improvements, but it’s 
mostly flat, as expected.

Thank you,

--
Evandro Menezes




On May 15, 2023, at 04:49, Kyrylo Tkachov <kyrylo.tkac...@arm.com> wrote:




-Original Message-
From: Richard Sandiford <richard.sandif...@arm.com>
Sent: Monday, May 15, 2023 10:01 AM
To: Evandro Menezes via Gcc-patches <gcc-patches@gcc.gnu.org>
Cc: evandro+...@gcc.gnu.org; Evandro Menezes <ebah...@icloud.com>;
Kyrylo Tkachov <kyrylo.tkac...@arm.com>; Tamar Christina
<tamar.christ...@arm.com>
Subject: Re: [PATCH] aarch64: Add SVE instruction types

Evandro Menezes via Gcc-patches <gcc-patches@gcc.gnu.org> writes:

This patch adds the attribute `type` to most SVE1 instructions, as in the
other instructions.

Thanks for doing this.

Could you say what criteria you used for picking the granularity?  Other
maintainers might disagree, but personally I'd prefer to distinguish two
instructions only if:

(a) a scheduling description really needs to distinguish them or
(b) grouping them together would be very artificial (because they're
   logically unrelated)

It's always possible to split types later if new scheduling descriptions
require it.  Because of that, I don't think we should try to predict ahead
of time what future scheduling descriptions will need.

Of course, this depends on having results that show that scheduling
makes a significant difference on an SVE core.  I think one of the
problems here is that, when a different scheduling model changes the
performance of a particular test, it's difficult to tell whether
the gain/loss is caused by the model being more/less accurate than
the previous one, or if it's due to important "secondary" effects
on register live ranges.  Instinctively, I'd have expected these
secondary effects to dominate on OoO cores.

I agree with Richard on these points. The key here is getting the granularity 
right without having to maintain too many types that aren't useful in the 
models.
FWIW I had posted 
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607101.html in 
November. It adds annotations to SVE2 patterns as well as for base SVE.
Feel free to reuse it if you'd like.
I see you had posted a Neoverse V1 scheduling model. Does that give an 
improvement on SVE code when combined with the scheduling attributes somehow?
Thanks,
Kyrill



[COMMITTED] ada: Restore proof of System.Arith_Double

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Yannick Moy 

Use Assert_And_Cut to simplify proof of second part of the Scaled_Divide.
Add intermediate assertions and simplify where necessary.

gcc/ada/

* libgnat/s-aridou.adb:
(Big3): Remove override made useless.
(Lemma_Quot_Rem): Add new lemma and justify it, as no prover
manages to prove it.
(Lemma_Div_Pow2): Use new lemma Lemma_Quot_Rem.
(Prove_Scaled_Mult_Decomposition_Regroup3): Retype for
simplification.
(Scaled_Divide): Remove useless assertions. Decompose some
assertions with cut operations. Use Assert_And_Cut for second
half. Add assertions.

Tested on x86_64-pc-linux-gnu, committed on master.
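
The identity behind Lemma_Div_Pow2, and the role Lemma_Quot_Rem plays in it, can be spot-checked numerically. The Python sketch below is not a proof and uses unbounded integers rather than Double_Uns; the 64-bit sample width and the helper name `check_div_pow2` are assumptions made for illustration.

```python
import random


def check_div_pow2(x, i):
    """Check X / 2**I / 2 = X / 2**(I + 1) via a quotient/remainder split."""
    div1, div2 = 2 ** i, 2
    left = x // div1 // div2
    r2 = x // div1 - left * div2  # remainder of the second division
    r1 = x - x // div1 * div1     # remainder of the first division
    assert r2 < div2 and r1 < div1
    # X decomposes as quotient * divisor + remainder, with remainder < divisor,
    assert x == left * (div1 * div2) + r2 * div1 + r1
    assert r2 * div1 + r1 < div1 * div2
    # so Left must be the quotient of X by Div1 * Div2 (the Lemma_Quot_Rem step).
    return left == x // (div1 * div2)


random.seed(0)
samples = [(random.getrandbits(64), random.randrange(0, 63)) for _ in range(1000)]
print(all(check_div_pow2(x, i) for x, i in samples))
```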

---
 gcc/ada/libgnat/s-aridou.adb | 150 +++
 1 file changed, 119 insertions(+), 31 deletions(-)

diff --git a/gcc/ada/libgnat/s-aridou.adb b/gcc/ada/libgnat/s-aridou.adb
index 67ebdd44a0c..15f87646563 100644
--- a/gcc/ada/libgnat/s-aridou.adb
+++ b/gcc/ada/libgnat/s-aridou.adb
@@ -45,7 +45,8 @@ is
 Contract_Cases => Ignore,
 Ghost  => Ignore,
 Loop_Invariant => Ignore,
-Assert => Ignore);
+Assert => Ignore,
+Assert_And_Cut => Ignore);
 
pragma Suppress (Overflow_Check);
pragma Suppress (Range_Check);
@@ -141,13 +142,6 @@ is
with Ghost;
--  X1&X2&X3 as a big integer
 
-   function Big3 (X1, X2, X3 : Big_Integer) return Big_Integer is
- (Big_2xxSingle * Big_2xxSingle * X1
-+ Big_2xxSingle * X2
-+ X3)
-   with Ghost;
-   --  Version of Big3 on big integers
-
function Le3 (X1, X2, X3, Y1, Y2, Y3 : Single_Uns) return Boolean
with
  Post => Le3'Result = (Big3 (X1, X2, X3) <= Big3 (Y1, Y2, Y3));
@@ -1543,15 +1537,36 @@ is
 Post => X / Double_Uns'(2) ** I / Double_Uns'(2)
   = X / Double_Uns'(2) ** (I + 1);
 
+  procedure Lemma_Quot_Rem (X, Div, Q, R : Double_Uns)
+  with
+Ghost,
+Pre  => Div /= 0
+  and then X = Q * Div + R
+  and then Q <= Double_Uns'Last / Div
+  and then R <= Double_Uns'Last - Q * Div
+  and then R < Div,
+Post => Q = X / Div;
+  pragma Annotate (GNATprove, False_Positive, "postcondition might fail",
+   "Q is the quotient of X by Div");
+
   procedure Lemma_Div_Pow2 (X : Double_Uns; I : Natural) is
  Div1 : constant Double_Uns := Double_Uns'(2) ** I;
  Div2 : constant Double_Uns := Double_Uns'(2);
  Left : constant Double_Uns := X / Div1 / Div2;
+ R2   : constant Double_Uns := X / Div1 - Left * Div2;
+ pragma Assert (R2 < Div2);
+ R1   : constant Double_Uns := X - X / Div1 * Div1;
+ pragma Assert (R1 < Div1);
   begin
+ pragma Assert (X = Left * (Div1 * Div2) + R2 * Div1 + R1);
+ pragma Assert (R2 * Div1 + R1 < Div1 * Div2);
+ Lemma_Quot_Rem (X, Div1 * Div2, Left, R2 * Div1 + R1);
  pragma Assert (Left = X / (Div1 * Div2));
  pragma Assert (Div1 * Div2 = Double_Uns'(2) ** (I + 1));
   end Lemma_Div_Pow2;
 
+  procedure Lemma_Quot_Rem (X, Div, Q, R : Double_Uns) is null;
+
   XX : Double_Uns := X;
 
begin
@@ -2115,12 +2130,15 @@ is
   --  fourth component.
 
   procedure Prove_Scaled_Mult_Decomposition_Regroup3
-(D1, D2, D3, D4 : Big_Integer)
+(D1, D2, D3, D4 : Single_Uns)
   with
 Ghost,
 Pre  => Scale < Double_Size
-  and then Is_Scaled_Mult_Decomposition (D1, D2, D3, D4),
-Post => Is_Scaled_Mult_Decomposition (0, 0, Big3 (D1, D2, D3), D4);
+  and then Is_Scaled_Mult_Decomposition
+(Big (Double_Uns (D1)), Big (Double_Uns (D2)),
+ Big (Double_Uns (D3)), Big (Double_Uns (D4))),
+Post => Is_Scaled_Mult_Decomposition (0, 0, Big3 (D1, D2, D3),
+  Big (Double_Uns (D4)));
   --  Proves scaled decomposition of Mult after regrouping on third
   --  component.
 
@@ -2492,7 +2510,7 @@ is
   --
 
   procedure Prove_Scaled_Mult_Decomposition_Regroup3
-(D1, D2, D3, D4 : Big_Integer)
+(D1, D2, D3, D4 : Single_Uns)
   is null;
 
   --
@@ -2825,9 +2843,6 @@ is
Big_2xxSingle * Big (T2) =
  Big_2xxSingle *
(Big (Double_Uns (Lo (T1))) + Big (Double_Uns (D (3));
-Lemma_Mult_Distribution (Big_2xxSingle,
- Big (Double_Uns (D (3))),
- Big (Double_Uns (Lo (T1;
 Lemma_Hi_Lo (T2, Hi (T2), Lo (T2));
 
 D (3) := Lo (T2);
@@ -2840,11 +2855,20 @@ is
   (Big (Double_Uns (Hi (T1))) + Big (Doubl

[COMMITTED] ada: Set Loop_Variant assertion policy to Ignore in both

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Yannick Moy 

Set Loop_Variant assertion policy to Ignore in both.

gcc/ada/

* libgnat/a-strsup.adb: Set assertion policy for Loop_Variant.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/a-strsup.adb | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/libgnat/a-strsup.adb b/gcc/ada/libgnat/a-strsup.adb
index 70aa4f8bcf3..25a843153f2 100644
--- a/gcc/ada/libgnat/a-strsup.adb
+++ b/gcc/ada/libgnat/a-strsup.adb
@@ -29,12 +29,13 @@
 --  --
 --
 
---  Ghost code, loop invariants and assertions in this unit are meant for
+--  Ghost code, loop (in)variants and assertions in this unit are meant for
 --  analysis only, not for run-time checking, as it would be too costly
 --  otherwise. This is enforced by setting the assertion policy to Ignore.
 
 pragma Assertion_Policy (Ghost  => Ignore,
  Loop_Invariant => Ignore,
+ Loop_Variant   => Ignore,
  Assert => Ignore);
 
 with Ada.Strings.Maps; use Ada.Strings.Maps;
-- 
2.40.0



[COMMITTED] ada: Trivial refactoring in Instantiate_*_Body

2023-05-16 Thread Marc Poulhiès via Gcc-patches
Factor out Par_Vis/Install_Parent/Par_Installed in Instantiate_Package_Body
and Instantiate_Subprogram_Body.

gcc/ada/

* sem_ch12.adb (Instantiate_Package_Body): Simplify if/then/else.
(Instantiate_Subprogram_Body): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.
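
The shape of this refactoring, reduced to its essentials: each branch only selects the parent entity, and the shared installation step runs once afterwards if one was selected. The Python sketch below uses hypothetical names and optional arguments standing in for the conditions in Sem_Ch12; it is not a translation of the Ada code.

```python
def select_parent(expanded_name_parent, instance_parent, child_unit_parent):
    """Return the first applicable parent entity, or None (hypothetical inputs)."""
    if expanded_name_parent is not None:
        return expanded_name_parent
    elif instance_parent is not None:
        return instance_parent
    elif child_unit_parent is not None:
        return child_unit_parent
    return None


def instantiate(expanded_name_parent=None, instance_parent=None,
                child_unit_parent=None):
    """Return the list of parents installed for this (modelled) instantiation."""
    installed = []
    par_ent = select_parent(expanded_name_parent, instance_parent,
                            child_unit_parent)
    # The installation logic now appears once instead of in every branch.
    if par_ent is not None:
        installed.append(par_ent)
    return installed


print(instantiate(child_unit_parent="Parent_Pkg"))
print(instantiate())
```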

---
 gcc/ada/sem_ch12.adb | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
index 39ceaf7c16f..c4cc641c68c 100644
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -12175,9 +12175,6 @@ package body Sem_Ch12 is
and then Nkind (Gen_Id) = N_Expanded_Name
  then
 Par_Ent := Entity (Prefix (Gen_Id));
-Par_Vis := Is_Immediately_Visible (Par_Ent);
-Install_Parent (Par_Ent, In_Body => True);
-Par_Installed := True;
 
  elsif Ekind (Scope (Gen_Unit)) = E_Generic_Package
and then Ekind (Scope (Act_Decl_Id)) = E_Package
@@ -12189,12 +12186,12 @@ package body Sem_Ch12 is
 Par_Ent := Entity
   (Prefix (Name (Get_Unit_Instantiation_Node
(Scope (Act_Decl_Id);
-Par_Vis := Is_Immediately_Visible (Par_Ent);
-Install_Parent (Par_Ent, In_Body => True);
-Par_Installed := True;
 
  elsif Is_Child_Unit (Gen_Unit) then
 Par_Ent := Scope (Gen_Unit);
+ end if;
+
+ if Present (Par_Ent) then
 Par_Vis := Is_Immediately_Visible (Par_Ent);
 Install_Parent (Par_Ent, In_Body => True);
 Par_Installed := True;
@@ -12611,12 +12608,11 @@ package body Sem_Ch12 is
and then Nkind (Gen_Id) = N_Expanded_Name
  then
 Par_Ent := Entity (Prefix (Gen_Id));
-Par_Vis := Is_Immediately_Visible (Par_Ent);
-Install_Parent (Par_Ent, In_Body => True);
-Par_Installed := True;
-
  elsif Is_Child_Unit (Gen_Unit) then
 Par_Ent := Scope (Gen_Unit);
+ end if;
+
+ if Present (Par_Ent) then
 Par_Vis := Is_Immediately_Visible (Par_Ent);
 Install_Parent (Par_Ent, In_Body => True);
 Par_Installed := True;
-- 
2.40.0



[COMMITTED] ada: Missing dependency with -gnatc

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Arnaud Charlet 

When using -gnatc, dependencies on preprocessor and config files
were not recorded.

gcc/ada/

* gnat1drv.adb: Ensure all dependencies are recorded even when not
generating code.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gnat1drv.adb | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/ada/gnat1drv.adb b/gcc/ada/gnat1drv.adb
index 238618468e1..e74036e506a 100644
--- a/gcc/ada/gnat1drv.adb
+++ b/gcc/ada/gnat1drv.adb
@@ -1396,6 +1396,17 @@ begin
  Back_End_Mode := Skip;
   end if;
 
+  --  Ensure that we properly register a dependency on system.ads, since
+  --  even if we do not semantically depend on this, Targparm has read
+  --  system parameters from the system.ads file.
+
+  Lib.Writ.Ensure_System_Dependency;
+
+  --  Add dependencies, if any, on preprocessing data file and on
+  --  preprocessing definition file(s).
+
+  Prepcomp.Add_Dependencies;
+
   --  At this stage Back_End_Mode is set to indicate if the backend should
   --  be called to generate code. If it is Skip, then code generation has
   --  been turned off, even though code was requested by the original
@@ -1542,17 +1553,6 @@ begin
  return;
   end if;
 
-  --  Ensure that we properly register a dependency on system.ads, since
-  --  even if we do not semantically depend on this, Targparm has read
-  --  system parameters from the system.ads file.
-
-  Lib.Writ.Ensure_System_Dependency;
-
-  --  Add dependencies, if any, on preprocessing data file and on
-  --  preprocessing definition file(s).
-
-  Prepcomp.Add_Dependencies;
-
   if GNATprove_Mode then
 
  --  In GNATprove mode we're writing the ALI much earlier than usual
-- 
2.40.0



[COMMITTED] ada: Introduce Cannot_Be_Superflat flag on N_Range nodes

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The support of superflat arrays in the language generates an overhead that
the code generator attempts to minimize, but it cannot handle too complex
cases and it would be helpful if the front-end could lend a hand.

This change introduces the Cannot_Be_Superflat flag on N_Range nodes for
this purpose, and sets it on the result of string concatenations when it
is guaranteed to be nonnull.
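
For readers unfamiliar with the term, the sketch below assumes that a range is "superflat" when its high bound is less than its low bound minus one; that definition is an assumption here and is not stated in this message. Under it, knowing that a range cannot be superflat lets the length be computed without clamping at zero. The function names are hypothetical and Python is used only for illustration.

```python
def is_superflat(low, high):
    """Assumed definition: high bound below low bound minus one."""
    return high < low - 1


def length_general(low, high):
    """Length of a possibly superflat range: needs a clamp at zero."""
    return max(0, high - low + 1)


def length_not_superflat(low, high):
    """Length when the range is known not to be superflat: no clamp needed."""
    assert not is_superflat(low, high)
    return high - low + 1


print(length_general(1, -5))       # superflat range, clamped
print(length_not_superflat(1, 0))  # ordinary null range
print(length_not_superflat(1, 10))
```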

gcc/ada/

* gen_il-fields.ads (Opt_Field_Enum): Add Cannot_Be_Superflat.
* gen_il-gen-gen_nodes.adb (N_Range): Add Cannot_Be_Superflat as
semantical flag and change Includes_Infinities to semantical.
* sinfo.ads (Cannot_Be_Superflat): Document it for N_Range.
* exp_ch4.adb (Expand_Concatenate): Set Cannot_Be_Superflat on the
range of the result if the result cannot be null.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch4.adb  | 20 
 gcc/ada/gen_il-fields.ads|  1 +
 gcc/ada/gen_il-gen-gen_nodes.adb |  3 ++-
 gcc/ada/sinfo.ads|  7 +++
 4 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
index c1fe02d60c1..9558596ffa0 100644
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -2536,7 +2536,7 @@ package body Exp_Ch4 is
   --  Reset to False if at least one operand is encountered which is known
   --  at compile time to be non-null. Used for handling the special case
   --  of setting the high bound to the last operand high bound for a null
-  --  result, thus ensuring a proper high bound in the super-flat case.
+  --  result, thus ensuring a proper high bound in the superflat case.
 
   N : constant Nat := List_Length (Opnds);
   --  Number of concatenation operands including possibly null operands
@@ -2726,8 +2726,9 @@ package body Exp_Ch4 is
   --  Local Declarations
 
   Opnd_Typ   : Entity_Id;
-  Slice_Rng  : Entity_Id;
-  Subtyp_Ind : Entity_Id;
+  Slice_Rng  : Node_Id;
+  Subtyp_Ind : Node_Id;
+  Subtyp_Rng : Node_Id;
   Ent: Entity_Id;
   Len: Unat;
   J  : Nat;
@@ -3184,7 +3185,7 @@ package body Exp_Ch4 is
 
   --  Handle the exceptional case where the result is null, in which case
   --  case the bounds come from the last operand (so that we get the proper
-  --  bounds if the last operand is super-flat).
+  --  bounds if the last operand is superflat).
 
   if Result_May_Be_Null then
  Low_Bound :=
@@ -3239,6 +3240,12 @@ package body Exp_Ch4 is
  Slice_Rng := Empty;
   end if;
 
+  Subtyp_Rng := Make_Range (Loc, Low_Bound, High_Bound);
+
+  --  If the result cannot be null then the range cannot be superflat
+
+  Set_Cannot_Be_Superflat (Subtyp_Rng, not Result_May_Be_Null);
+
   --  Now we construct an array object with appropriate bounds. We mark
   --  the target as internal to prevent useless initialization when
   --  Initialize_Scalars is enabled. Also since this is the actual result
@@ -3249,10 +3256,7 @@ package body Exp_Ch4 is
   Subtype_Mark => New_Occurrence_Of (Atyp, Loc),
   Constraint   =>
 Make_Index_Or_Discriminant_Constraint (Loc,
-  Constraints => New_List (
-Make_Range (Loc,
-  Low_Bound  => Low_Bound,
-  High_Bound => High_Bound;
+  Constraints => New_List (Subtyp_Rng)));
 
   Ent := Make_Temporary (Loc, 'S');
   Set_Is_Internal   (Ent);
diff --git a/gcc/ada/gen_il-fields.ads b/gcc/ada/gen_il-fields.ads
index 458219c6853..582837cb7ec 100644
--- a/gcc/ada/gen_il-fields.ads
+++ b/gcc/ada/gen_il-fields.ads
@@ -87,6 +87,7 @@ package Gen_IL.Fields is
   Body_Required,
   Body_To_Inline,
   Box_Present,
+  Cannot_Be_Superflat,
   Char_Literal_Value,
   Chars,
   Check_Address_Alignment,
diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb
index 44da1d1d924..a330f6913c5 100644
--- a/gcc/ada/gen_il-gen-gen_nodes.adb
+++ b/gcc/ada/gen_il-gen-gen_nodes.adb
@@ -531,7 +531,8 @@ begin -- Gen_IL.Gen.Gen_Nodes
Cc (N_Range, N_Subexpr,
(Sy (Low_Bound, Node_Id),
 Sy (High_Bound, Node_Id),
-Sy (Includes_Infinities, Flag)));
+Sm (Cannot_Be_Superflat, Flag),
+Sm (Includes_Infinities, Flag)));
 
Cc (N_Reference, N_Subexpr,
(Sy (Prefix, Node_Id)));
diff --git a/gcc/ada/sinfo.ads b/gcc/ada/sinfo.ads
index c25db08bc96..6cacebe7775 100644
--- a/gcc/ada/sinfo.ads
+++ b/gcc/ada/sinfo.ads
@@ -932,6 +932,12 @@ package Sinfo is
--a pragma Import or Interface applies, in which case no body is
--permitted (in Ada 83 or Ada 95).
 
+   --  Cannot_Be_Superflat
+   --This flag is present in N_Range nodes. It is set if the range is of a
+   --discrete type and cannot be superflat, i.e. it is guaranteed that the
+   --ineq

[COMMITTED] ada: Simplify dramatically ghost code for proof of System.Arith_Double

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Yannick Moy 

Using Inline_For_Proof annotation on key expression functions makes
it possible to remove hundreds of lines of ghost code that were
previously needed to guide provers.

gcc/ada/

* libgnat/s-aridou.adb (Big3, Is_Mult_Decomposition)
(Is_Scaled_Mult_Decomposition): Add annotation for inlining.
(Double_Divide, Scaled_Divide): Simplify and remove ghost code.
(Prove_Multiplication): Add calls to lemmas to make proof go
through.
* libgnat/s-aridou.ads (Big, In_Double_Int_Range): Add annotation
for inlining.

Tested on x86_64-pc-linux-gnu, committed on master.
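
As an illustration of what an expression function such as Big3 denotes, the Python sketch below evaluates the three-digit value X1&X2&X3 and checks it against a digit-wise comparison, in the spirit of the postcondition of Le3. The 32-bit Single_Uns width and the lexicographic implementation of `le3` are assumptions; the names are chosen only to echo the Ada ones.

```python
import itertools

SINGLE_SIZE = 32  # assumed width of Single_Uns
BIG_2XX_SINGLE = 2 ** SINGLE_SIZE


def big3(x1, x2, x3):
    """X1&X2&X3 read as one big integer, most significant digit first."""
    return BIG_2XX_SINGLE * BIG_2XX_SINGLE * x1 + BIG_2XX_SINGLE * x2 + x3


def le3(x1, x2, x3, y1, y2, y3):
    """Digit-wise (lexicographic) comparison of two three-digit numbers."""
    return (x1, x2, x3) <= (y1, y2, y3)


digits = [0, 1, BIG_2XX_SINGLE - 1]
ok = all(
    le3(*x, *y) == (big3(*x) <= big3(*y))
    for x in itertools.product(digits, repeat=3)
    for y in itertools.product(digits, repeat=3)
)
print(ok)
```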

---
 gcc/ada/libgnat/s-aridou.adb | 428 ---
 gcc/ada/libgnat/s-aridou.ads |  12 +-
 2 files changed, 56 insertions(+), 384 deletions(-)

diff --git a/gcc/ada/libgnat/s-aridou.adb b/gcc/ada/libgnat/s-aridou.adb
index 15f87646563..dbf0f42cd49 100644
--- a/gcc/ada/libgnat/s-aridou.adb
+++ b/gcc/ada/libgnat/s-aridou.adb
@@ -139,7 +139,9 @@ is
  (Big_2xxSingle * Big_2xxSingle * Big (Double_Uns (X1))
 + Big_2xxSingle * Big (Double_Uns (X2))
 + Big (Double_Uns (X3)))
-   with Ghost;
+   with
+ Ghost,
+ Annotate => (GNATprove, Inline_For_Proof);
--  X1&X2&X3 as a big integer
 
function Le3 (X1, X2, X3, Y1, Y2, Y3 : Single_Uns) return Boolean
@@ -1063,17 +1065,10 @@ is
 
   T1 := Ylo * Zlo;
 
-  pragma Assert (Big (T2) = Big (Double_Uns'(Yhi * Zlo))
-  + Big (Double_Uns'(Ylo * Zhi)));
   Lemma_Mult_Distribution (Big_2xxSingle,
Big (Double_Uns'(Yhi * Zlo)),
Big (Double_Uns'(Ylo * Zhi)));
-  pragma Assert (Mult = Big_2xxSingle * Big (T2) + Big (T1));
   Lemma_Hi_Lo (T1, Hi (T1), Lo (T1));
-  pragma Assert
-(Mult = Big_2xxSingle * Big (T2)
-  + Big_2xxSingle * Big (Double_Uns (Hi (T1)))
-  + Big (Double_Uns (Lo (T1))));
   Lemma_Mult_Distribution (Big_2xxSingle,
Big (T2),
Big (Double_Uns (Hi (T1))));
@@ -1081,17 +1076,11 @@ is
 
   T2 := T2 + Hi (T1);
 
-  pragma Assert
-(Mult = Big_2xxSingle * Big (T2) + Big (Double_Uns (Lo (T1))));
   Lemma_Hi_Lo (T2, Hi (T2), Lo (T2));
   Lemma_Mult_Distribution (Big_2xxSingle,
Big (Double_Uns (Hi (T2))),
Big (Double_Uns (Lo (T2))));
   Lemma_Double_Big_2xxSingle;
-  pragma Assert
-(Mult = Big_2xxDouble * Big (Double_Uns (Hi (T2)))
-  + Big_2xxSingle * Big (Double_Uns (Lo (T2)))
-  + Big (Double_Uns (Lo (T1))));
 
   if Hi (T2) /= 0 then
  R := X;
@@ -1947,7 +1936,9 @@ is
   + Big_2xxSingle * Big_2xxSingle * D2
   + Big_2xxSingle * D3
   + D4)
-  with Ghost;
+  with
+Ghost,
+Annotate => (GNATprove, Inline_For_Proof);
 
   function Is_Scaled_Mult_Decomposition
 (D1, D2, D3, D4 : Big_Integer)
@@ -1960,7 +1951,8 @@ is
+ D4)
   with
 Ghost,
-Pre  => Scale < Double_Size;
+Annotate => (GNATprove, Inline_For_Proof),
+Pre => Scale < Double_Size;
 
   --  Local lemmas
 
@@ -2239,17 +2231,8 @@ is
  pragma Assert (Big_D3 = Big_T2);
  pragma Assert (Big_2xxSingle * Big_D3 = Big_2xxSingle * Big_T2);
  Lemma_Mult_Commutation (2 ** Scale, Double_Uns (D (4)), T3);
- pragma Assert (Big_D4 = Big_T3);
  pragma Assert
-   (By (Is_Scaled_Mult_Decomposition (0, Big_T1, Big_T2, Big_T3),
-By (Big_2xxSingle * Big_2xxSingle * Big_D12 =
-Big_2xxSingle * Big_2xxSingle * Big_T1,
-Big_D12 = Big_T1)
-  and then
-By (Big_2xxSingle * Big_D3  = Big_2xxSingle * Big_T2,
-Big_D3 = Big_T2)
-  and then
-Big_D4 = Big_T3));
+   (Is_Scaled_Mult_Decomposition (0, Big_T1, Big_T2, Big_T3));
  Lemma_Hi_Lo (T1, Hi (T1), Lo (T1));
  Lemma_Hi_Lo (T2, Hi (T2), Lo (T2));
  Lemma_Hi_Lo (T3, Hi (T3), Lo (T3));
@@ -2265,60 +2248,6 @@ is
  Lemma_Mult_Distribution (Big_2xxSingle,
   Big (Double_Uns (Lo (T2))),
   Big (Double_Uns (Hi (T3))));
- pragma Assert
-   (By (Is_Scaled_Mult_Decomposition
-  (Big (Double_Uns (Hi (T1))),
-   Big (Double_Uns (Lo (T1))) + Big (Double_Uns (Hi (T2))),
-   Big (Double_Uns (Lo (T2))) + Big (Double_Uns (Hi (T3))),
-   Big (Double_Uns (Lo (T3)))),
---  Start from stating equality between the expanded va

[COMMITTED] ada: Add intermediate assertions for proof of Super_Tail

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Yannick Moy 

Proof of Superbounded internal unit requires a little more help.
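
The kind of intermediate assertion added here can be sketched on a
stand-alone example. The names below (Tail_Pad_Demo, Data, Temp and the
constants) are hypothetical and only loosely modeled on the hunk in
a-strsup.adb; the sketch uses Ada 2012 parenthesized aggregates rather
than the bracket aggregates of the patch.

```ada
--  Sketch only: restates, right after two slice assignments, what each
--  slice now contains, which is the style of hint added by the patch.
procedure Tail_Pad_Demo is

   Max_Length : constant := 8;
   Npad       : constant := 3;
   Pad        : constant Character := '*';

   Temp : constant String (1 .. Max_Length) := "abcdefgh";
   Data : String (1 .. Max_Length) := Temp;

begin
   Data (1 .. Npad) := (others => Pad);
   Data (Npad + 1 .. Max_Length) := Temp (1 .. Max_Length - Npad);

   --  Intermediate facts, one per slice
   pragma Assert (Data (1 .. Npad) = (1 .. Npad => Pad));
   pragma Assert
     (Data (Npad + 1 .. Max_Length) = Temp (1 .. Max_Length - Npad));

   --  The combined result that follows from the two facts above
   pragma Assert (Data = "***abcde");
end Tail_Pad_Demo;
```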

gcc/ada/

* libgnat/a-strsup.adb: Add intermediate assertions.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/a-strsup.adb | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/ada/libgnat/a-strsup.adb b/gcc/ada/libgnat/a-strsup.adb
index 25a843153f2..c727575ee6b 100644
--- a/gcc/ada/libgnat/a-strsup.adb
+++ b/gcc/ada/libgnat/a-strsup.adb
@@ -1788,6 +1788,12 @@ package body Ada.Strings.Superbounded with SPARK_Mode is
   Source.Data (1 .. Npad) := [others => Pad];
   Source.Data (Npad + 1 .. Max_Length) :=
 Temp (1 .. Max_Length - Npad);
+
+  pragma Assert
+(Source.Data (1 .. Npad) = [1 .. Npad => Pad]);
+  pragma Assert
+(Source.Data (Npad + 1 .. Max_Length)
+ = Temp (1 .. Max_Length - Npad));
end if;
 
 when Strings.Left =>
-- 
2.40.0



[COMMITTED] ada: Fix Ada representation of r_debug and link_map types

2023-05-16 Thread Marc Poulhiès via Gcc-patches
Both record types need to have their components 'aliased' to match their
C version. The mismatch could be observed when using LTO:

  warning: type of 'r_debug' does not match original declaration
   [-Wlto-type-mismatch]

  /usr/include/link.h:66:23: note: type 'struct r_debug' should match
  type 'struct  system__traceback__symbolic__module_name__build_...
   ...cache_for_all_modules__r_debug_type'

gcc/ada/

* libgnat/s-tsmona__linux.adb (link_map, r_debug_type): Add
'aliased' on all components.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/s-tsmona__linux.adb | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/libgnat/s-tsmona__linux.adb b/gcc/ada/libgnat/s-tsmona__linux.adb
index 7e1b493c991..6b539f13c16 100644
--- a/gcc/ada/libgnat/s-tsmona__linux.adb
+++ b/gcc/ada/libgnat/s-tsmona__linux.adb
@@ -93,23 +93,30 @@ package body Module_Name is
   pragma Convention (C, link_map_acc);
 
   type link_map is record
- l_addr : Address;
+ l_addr : aliased Address;
  --  Base address of the shared object
 
- l_name : Address;
+ l_name : aliased Address;
  --  Null-terminated absolute file name
 
- l_ld   : Address;
+ l_ld   : aliased Address;
  --  Dynamic section
 
- l_next, l_prev : link_map_acc;
+ l_next, l_prev : aliased link_map_acc;
  --  Chain
   end record;
   pragma Convention (C, link_map);
 
+  type r_debug_state is (RT_CONSISTENT, RT_ADD, RT_DELETE);
+  pragma Convention (C, r_debug_state);
+  pragma Unreferenced (RT_CONSISTENT, RT_ADD, RT_DELETE);
+
   type r_debug_type is record
- r_version : Integer;
- r_map : link_map_acc;
+ r_version : aliased int;
+ r_map : aliased link_map_acc;
+ r_brk : aliased Address;
+ r_state   : aliased r_debug_state;
+ r_ldbase  : aliased Address;
   end record;
   pragma Convention (C, r_debug_type);
 
-- 
2.40.0



[COMMITTED] ada: Get name from entity if that's what's passed to Subprogram_Name

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Richard Kenner 

gcc/ada/

* sem_util.adb (Subprogram_Name): If what's passed is already an
entity, use that for the name.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index eb0d08a1851..8bce0229867 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -28095,6 +28095,9 @@ package body Sem_Util is
Ent := Defining_Identifier (Ent);
exit;
 
+when N_Entity =>
+   exit;
+
 when others =>
null;
  end case;
-- 
2.40.0



[COMMITTED] ada: Change Present_Expr field type to Uint

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Richard Kenner 

We want the field to be initialized to No_Uint because we want to be
able to test in GNAT LLVM whether we've already set it so we can be
sure we only set it once.
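
The "set only once" idea can be sketched with a sentinel value. This is
an illustration under stated assumptions: Count, No_Count and
Set_Present are hypothetical stand-ins, not GNAT's Uint or No_Uint, and
the sketch says nothing about how GNAT LLVM actually performs the test.

```ada
--  Sketch only. Count, No_Count and Set_Present are hypothetical
--  stand-ins for a field whose initial value means "not set yet".
procedure Set_Once_Demo is

   type Count is range -1 .. 1_000;

   --  Sentinel playing the role of No_Uint: no value stored yet
   No_Count : constant Count := -1;

   Present : Count := No_Count;

   procedure Set_Present (V : Count) is
   begin
      --  Only legal while the field still holds the sentinel
      pragma Assert (Present = No_Count, "field already set");
      Present := V;
   end Set_Present;

begin
   Set_Present (3);
   pragma Assert (Present = 3);
end Set_Once_Demo;
```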

gcc/ada/

* gen_il-gen-gen_nodes.adb (Present_Expr): Type is now Uint.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gen_il-gen-gen_nodes.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb
index 389c9a0f005..44da1d1d924 100644
--- a/gcc/ada/gen_il-gen-gen_nodes.adb
+++ b/gcc/ada/gen_il-gen-gen_nodes.adb
@@ -1604,7 +1604,7 @@ begin -- Gen_IL.Gen.Gen_Nodes
 Sm (Dcheck_Function, Node_Id),
 Sm (Enclosing_Variant, Node_Id),
 Sm (Has_SP_Choice, Flag),
-Sm (Present_Expr, Valid_Uint)));
+Sm (Present_Expr, Uint)));
 
Cc (N_Variant_Part, Node_Kind,
(Sy (Name, Node_Id, Default_Empty),
-- 
2.40.0



[COMMITTED] ada: Implement inheritance of user-defined literal aspects for untagged types

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

In Ada 2022, user-defined literal aspects are nonoverridable but the named
subprograms present in them can be overridden, including for untagged types.

gcc/ada/

* sem_res.adb (Has_Applicable_User_Defined_Literal): Apply the
same processing for derived untagged types as for tagged types.
* sem_util.ads (Corresponding_Primitive_Op): Adjust description.
* sem_util.adb (Corresponding_Primitive_Op): Handle untagged
types.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_res.adb  |  1 -
 gcc/ada/sem_util.adb | 39 +++
 gcc/ada/sem_util.ads |  6 +++---
 3 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
index df9ccb18468..f6634da42a7 100644
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -492,7 +492,6 @@ package body Sem_Res is
  Name := Make_Identifier (Loc, Chars (Callee));
 
  if Is_Derived_Type (Typ)
-   and then Is_Tagged_Type (Typ)
and then Base_Type (Etype (Callee)) /= Base_Type (Typ)
  then
 Callee :=
diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 38dc654f7be..1d8d4fc30f8 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -6483,9 +6483,8 @@ package body Sem_Util is
  (Ancestor_Op : Entity_Id;
   Descendant_Type : Entity_Id) return Entity_Id
is
-  Typ  : constant Entity_Id := Find_Dispatching_Type (Ancestor_Op);
-  Elmt : Elmt_Id;
-  Subp : Entity_Id;
+  function Find_Untagged_Type_Of (Prim : Entity_Id) return Entity_Id;
+  --  Search for the untagged type of the primitive operation Prim.
 
   function Profile_Matches_Ancestor (S : Entity_Id) return Boolean;
   --  Returns True if subprogram S has the proper profile for an
@@ -6493,6 +6492,34 @@ package body Sem_Util is
   --  have the same type, or are corresponding controlling formals,
   --  and similarly for result types).
 
+  ---------------------------
+  -- Find_Untagged_Type_Of --
+  ---------------------------
+
+  function Find_Untagged_Type_Of (Prim : Entity_Id) return Entity_Id is
+ E : Entity_Id := First_Entity (Scope (Prim));
+
+  begin
+ while Present (E) and then E /= Prim loop
+if not Is_Tagged_Type (E)
+  and then Present (Direct_Primitive_Operations (E))
+  and then Contains (Direct_Primitive_Operations (E), Prim)
+then
+   return E;
+end if;
+
+Next_Entity (E);
+ end loop;
+
+ pragma Assert (False);
+ return Empty;
+  end Find_Untagged_Type_Of;
+
+  Typ  : constant Entity_Id :=
+   (if Is_Dispatching_Operation (Ancestor_Op)
+ then Find_Dispatching_Type (Ancestor_Op)
+ else Find_Untagged_Type_Of (Ancestor_Op));
+
   ------------------------------
   -- Profile_Matches_Ancestor --
   ------------------------------
@@ -6529,10 +6556,14 @@ package body Sem_Util is
   or else Is_Ancestor (Typ, Etype (S)));
   end Profile_Matches_Ancestor;
 
+  --  Local variables
+
+  Elmt : Elmt_Id;
+  Subp : Entity_Id;
+
--  Start of processing for Corresponding_Primitive_Op
 
begin
-  pragma Assert (Is_Dispatching_Operation (Ancestor_Op));
   pragma Assert (Is_Ancestor (Typ, Descendant_Type)
   or else Is_Progenitor (Typ, Descendant_Type));
 
diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
index f98e05615fd..42c6d249e2f 100644
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -618,9 +618,9 @@ package Sem_Util is
--  Possible optimization???
 
function Corresponding_Primitive_Op
-   (Ancestor_Op : Entity_Id;
-Descendant_Type : Entity_Id) return Entity_Id;
-   --  Given a primitive subprogram of a tagged type and a (distinct)
+ (Ancestor_Op : Entity_Id;
+  Descendant_Type : Entity_Id) return Entity_Id;
+   --  Given a primitive subprogram of a first type and a (distinct)
--  descendant type of that type, find the corresponding primitive
--  subprogram of the descendant type.
 
-- 
2.40.0



[COMMITTED] ada: Fix typo in "pattern"

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Tom Tromey 

I found a couple of spots using the typo "patterm" rather than the
correct "pattern".

gcc/ada/

* doc/gnat_ugn/building_executable_programs_with_gnat.rst
(Switches_for_gnatbind): Fix typo.
* libgnat/g-spipat.ads: Fix typo.
* gnat_ugn.texi: Regenerate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 .../doc/gnat_ugn/building_executable_programs_with_gnat.rst   | 2 +-
 gcc/ada/gnat_ugn.texi | 4 ++--
 gcc/ada/libgnat/g-spipat.ads  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
index 79da3c2cbcc..7968073a985 100644
--- a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
+++ b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
@@ -6806,7 +6806,7 @@ be presented in subsequent sections.
 
 The underlying scalar is set to a value consisting of repeated bytes, whose
 value corresponds to the given value. For example if ``BF`` is given,
-then a 32-bit scalar value will be set to the bit patterm ``16#BFBFBFBF#``.
+then a 32-bit scalar value will be set to the bit pattern ``16#BFBFBFBF#``.
 
   .. index:: GNAT_INIT_SCALARS
 
diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
index bd2cb3e5629..b95519a8295 100644
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -16030,7 +16030,7 @@ one bits. For floating-point, a large value is set
 
 The underlying scalar is set to a value consisting of repeated bytes, whose
 value corresponds to the given value. For example if @code{BF} is given,
-then a 32-bit scalar value will be set to the bit patterm @code{16#BFBFBFBF#}.
+then a 32-bit scalar value will be set to the bit pattern @code{16#BFBFBFBF#}.
 @end itemize
 
 @geindex GNAT_INIT_SCALARS
@@ -29466,8 +29466,8 @@ to permit their use in free software.
 
 @printindex ge
 
-@anchor{cf}@w{  }
 @anchor{gnat_ugn/gnat_utility_programs switches-related-to-project-files}@w{   
   }
+@anchor{cf}@w{  }
 
 @c %**end of body
 @bye
diff --git a/gcc/ada/libgnat/g-spipat.ads b/gcc/ada/libgnat/g-spipat.ads
index 5766b3af686..297afbf5dee 100644
--- a/gcc/ada/libgnat/g-spipat.ads
+++ b/gcc/ada/libgnat/g-spipat.ads
@@ -58,7 +58,7 @@
 --   stored in a binary compatible manner.
 
 -- GNAT.Spitbol.Patterns (files g-spipat.ads/g-spipat.adb)
---   This is a completely general patterm matching package based on the
+--   This is a completely general pattern matching package based on the
 --   pattern language of SNOBOL4, as implemented in SPITBOL. The pattern
 --   language is modeled on context free grammars, with context sensitive
 --   extensions that provide full (type 0) computational capabilities.
-- 
2.40.0



[COMMITTED] ada: Add tags on style messages

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Arnaud Charlet 

Similar to tags on warnings [-gnatwx], we add tags on style messages
[-gnatyx] when -gnatw.d is enabled.
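
As a reading aid, the tag selection can be exercised on its own. The
sketch below copies the branch structure of Get_Warning_Option from the
erroutc.adb hunk of this patch into a stand-alone function; the names
Tag_Demo and Option, and the idea of passing Warn, Style and Warn_Chr as
parameters instead of reading them from Errors.Table, are assumptions
made for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

--  Sketch only: Option is a hypothetical stand-alone copy of the
--  decision logic, not the compiler's own function.
procedure Tag_Demo is

   subtype Chr2 is String (1 .. 2);

   function Option
     (Warn, Style : Boolean;
      Warn_Chr    : Chr2) return String is
   begin
      if (Warn or Style)
        and then Warn_Chr /= "  "
        and then Warn_Chr (1) /= '?'
      then
         if Warn_Chr = "$ " then
            return "-gnatel";
         elsif Style then
            --  Style messages use the -gnaty prefix
            return "-gnaty" & Warn_Chr (1);
         elsif Warn_Chr (2) = ' ' then
            return "-gnatw" & Warn_Chr (1);
         else
            return "-gnatw" & Warn_Chr;
         end if;
      end if;

      return "";
   end Option;

begin
   --  Prints -gnatyM
   Put_Line (Option (Warn => False, Style => True, Warn_Chr => "M "));
   --  Prints -gnatwx
   Put_Line (Option (Warn => True, Style => False, Warn_Chr => "x "));
   --  Prints -gnatw.d
   Put_Line (Option (Warn => True, Style => False, Warn_Chr => ".d"));
end Tag_Demo;
```

Style switches such as M are upper case, which is why the same patch
widens the accepted insertion characters to include 'A' .. 'Z' and
'0' .. '9'.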

gcc/ada/

* errout.ads: Update comment.
* errout.adb (Skip_Msg_Insertion_Warning): Update to take e.g.
-gnatyM into account.
* erroutc.adb (Get_Warning_Option, Get_Warning_Tag)
(Prescan_Message): Add support for Style tags.
* par-ch5.adb, par-ch6.adb, par-ch7.adb, par-endh.adb,
par-util.adb, style.adb, styleg.adb: Set tag on all style
messages.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/errout.adb   |  3 ++-
 gcc/ada/errout.ads   |  6 ++---
 gcc/ada/erroutc.adb  | 37 +--
 gcc/ada/par-ch5.adb  |  4 +--
 gcc/ada/par-ch6.adb  |  2 +-
 gcc/ada/par-ch7.adb  |  2 +-
 gcc/ada/par-endh.adb |  2 +-
 gcc/ada/par-util.adb |  4 +--
 gcc/ada/style.adb| 18 +++---
 gcc/ada/styleg.adb   | 59 ++--
 10 files changed, 75 insertions(+), 62 deletions(-)

diff --git a/gcc/ada/errout.adb b/gcc/ada/errout.adb
index 96b56ffc57a..49281fdb05f 100644
--- a/gcc/ada/errout.adb
+++ b/gcc/ada/errout.adb
@@ -3976,7 +3976,8 @@ package body Errout is
 P := P + 1;
 
  elsif P < Text'Last and then Text (P + 1) = C
-   and then Text (P) in 'a' .. 'z' | '*' | '$'
+   and then Text (P) in 'a' .. 'z' | 'A' .. 'Z' |
+'0' .. '9' | '*' | '$'
  then
 P := P + 2;
 
diff --git a/gcc/ada/errout.ads b/gcc/ada/errout.ads
index 1e099614325..f152839678d 100644
--- a/gcc/ada/errout.ads
+++ b/gcc/ada/errout.ads
@@ -307,9 +307,9 @@ package Errout is
--Insertion character ?x? ?.x? ?_x? (warning with switch)
--  "x" is a (lower-case) warning switch character.
--  Like ??, but if the flag Warn_Doc_Switch is True, adds the string
-   --  "[-gnatwx]", "[-gnatw.x]", or "[-gnatw_x]", at the end of the
-   --  warning message. For continuations, use this on each continuation
-   --  message.
+   --  "[-gnatwx]", "[-gnatw.x]", "[-gnatw_x]", or "[-gnatyx]" (for style
+   --  messages), at the end of the warning message. For continuations, use
+   --  this on each continuation message.
 
--Insertion character ?*? (restriction warning)
--  Like ?, but if the flag Warn_Doc_Switch is True, adds the string
diff --git a/gcc/ada/erroutc.adb b/gcc/ada/erroutc.adb
index 291a340ef6e..e5caeba6802 100644
--- a/gcc/ada/erroutc.adb
+++ b/gcc/ada/erroutc.adb
@@ -367,17 +367,25 @@ package body Erroutc is
 
function Get_Warning_Option (Id : Error_Msg_Id) return String is
   Warn : constant Boolean := Errors.Table (Id).Warn;
+  Style: constant Boolean := Errors.Table (Id).Style;
   Warn_Chr : constant String (1 .. 2) := Errors.Table (Id).Warn_Chr;
+
begin
-  if Warn and then Warn_Chr /= "  " and then Warn_Chr (1) /= '?' then
+  if (Warn or Style)
+and then Warn_Chr /= "  "
+and then Warn_Chr (1) /= '?'
+  then
  if Warn_Chr = "$ " then
 return "-gnatel";
+ elsif Style then
+return "-gnaty" & Warn_Chr (1);
  elsif Warn_Chr (2) = ' ' then
 return "-gnatw" & Warn_Chr (1);
  else
 return "-gnatw" & Warn_Chr;
  end if;
   end if;
+
   return "";
end Get_Warning_Option;
 
@@ -387,10 +395,12 @@ package body Erroutc is
 
function Get_Warning_Tag (Id : Error_Msg_Id) return String is
   Warn : constant Boolean := Errors.Table (Id).Warn;
+  Style: constant Boolean := Errors.Table (Id).Style;
   Warn_Chr : constant String (1 .. 2) := Errors.Table (Id).Warn_Chr;
   Option   : constant String  := Get_Warning_Option (Id);
+
begin
-  if Warn then
+  if Warn or Style then
  if Warn_Chr = "? " then
 return "[enabled by default]";
  elsif Warn_Chr = "* " then
@@ -880,7 +890,7 @@ package body Erroutc is
 J := J + 1;
 
  elsif J < Msg'Last and then Msg (J + 1) = C
-   and then Msg (J) in 'a' .. 'z' | '*' | '$'
+   and then Msg (J) in 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '*' | '$'
  then
 Message_Class := Msg (J) & " ";
 J := J + 2;
@@ -964,19 +974,20 @@ package body Erroutc is
  --  Warning message (? or < insertion sequence)
 
  elsif Msg (J) = '?' or else Msg (J) = '<' then
-Is_Warning_Msg := Msg (J) = '?' or else Error_Msg_Warn;
-J := J + 1;
-
-if Is_Warning_Msg then
+if Msg (J) = '?' or else Error_Msg_Warn then
+   Is_Warning_Msg := not Is_Style_Msg;
+   J := J + 1;
Warning_Msg_Char := Parse_Message_Class;
-end if;
 
---  Bomb if untagged warning message. This code can be uncommented
---  for debugging 

[COMMITTED] ada: Bad handling of ASCII with -gnatyn

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Arnaud Charlet 

ASCII is special-cased, but the check did not take into account all
cases, such as Standard.ASCII.

gcc/ada/

* snames.ads-tmpl (Name_ASCII): New.
* style.adb (Check_Identifier): Fix handling of ASCII.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/snames.ads-tmpl | 1 +
 gcc/ada/style.adb   | 5 ++---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/snames.ads-tmpl b/gcc/ada/snames.ads-tmpl
index 8f71ad98db3..afe7508ac28 100644
--- a/gcc/ada/snames.ads-tmpl
+++ b/gcc/ada/snames.ads-tmpl
@@ -260,6 +260,7 @@ package Snames is
 
--  Some miscellaneous names used for error detection/recovery
 
+   Name_ASCII  : constant Name_Id := N + $;
Name_Const  : constant Name_Id := N + $;
Name_Error  : constant Name_Id := N + $;
Name_False  : constant Name_Id := N + $;
diff --git a/gcc/ada/style.adb b/gcc/ada/style.adb
index 3014359acba..dda5cd47c06 100644
--- a/gcc/ada/style.adb
+++ b/gcc/ada/style.adb
@@ -35,9 +35,8 @@ with Nlists; use Nlists;
 with Opt;use Opt;
 with Sinfo;  use Sinfo;
 with Sinfo.Nodes;use Sinfo.Nodes;
-with Sinfo.Utils;use Sinfo.Utils;
 with Sinput; use Sinput;
-with Stand;  use Stand;
+with Snames; use Snames;
 with Stylesw;use Stylesw;
 
 package body Style is
@@ -201,7 +200,7 @@ package body Style is
 else
--  ASCII is all upper case
 
-   if Entity (Ref) = Standard_ASCII then
+   if Chars (Ref) = Name_ASCII then
   Cas := All_Upper_Case;
 
--  Special handling for names in package ASCII
-- 
2.40.0



[COMMITTED] ada: Fix missing warning on aggregate with iterated component

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This happens when the iterated component does not really iterate.

gcc/ada/

* exp_aggr.adb (Expand_Array_Aggregate): Do not set Warnings_Off on
the temporary created when in-place expansion is not possible.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_aggr.adb | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index cf8bac0f4bf..fe61e0ec90b 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -7068,7 +7068,6 @@ package body Exp_Aggr is
  Defining_Identifier => Tmp,
  Object_Definition   => New_Occurrence_Of (Typ, Loc));
  Set_No_Initialization (Tmp_Decl, True);
- Set_Warnings_Off (Tmp);
 
  --  If we are within a loop, the temporary will be pushed on the
  --  stack at each iteration. If the aggregate is the expression
-- 
2.40.0



[COMMITTED] ada: Enable Support_Atomic_Primitives on PPC Linux

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Johannes Kliemann 

gcc/ada/

* libgnat/system-linux-ppc.ads: Add Support_Atomic_Primitives.
* libgnat/s-atopri__32.ads: Add 32 bit version of s-atopri.ads.
* Makefile.rtl: Use s-atopri__32.ads for ppc-linux.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/Makefile.rtl |   1 +
 gcc/ada/libgnat/s-atopri__32.ads | 149 +++
 gcc/ada/libgnat/system-linux-ppc.ads |   1 +
 3 files changed, 151 insertions(+)
 create mode 100644 gcc/ada/libgnat/s-atopri__32.ads

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 96306f8cc9a..2cfdd8dc613 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2185,6 +2185,7 @@ ifeq ($(strip $(filter-out powerpc% linux%,$(target_cpu) $(target_os))),)
   EXTRA_GNATRTL_NONTASKING_OBJS += $(GNATRTL_128BIT_OBJS)
 endif
   else
+LIBGNAT_TARGET_PAIRS += s-atopri.adshttp://www.gnu.org/licenses/>.  --
+--  --
+-- GNAT was originally developed  by the GNAT team at  New York University. --
+-- Extensive contributions were provided by Ada Core Technologies Inc.  --
+--  --
+--
+
+--  This package contains both atomic primitives defined from GCC built-in
+--  functions and operations used by the compiler to generate the lock-free
+--  implementation of protected objects.
+--  This is the version that only contains primitives available on 32 bit
+--  platforms.
+
+with Interfaces.C;
+
+package System.Atomic_Primitives is
+   pragma Pure;
+
+   type uint is mod 2 ** Long_Integer'Size;
+
+   type uint8  is mod 2**8
+ with Size => 8;
+
+   type uint16 is mod 2**16
+ with Size => 16;
+
+   type uint32 is mod 2**32
+ with Size => 32;
+
+   Relaxed : constant := 0;
+   Consume : constant := 1;
+   Acquire : constant := 2;
+   Release : constant := 3;
+   Acq_Rel : constant := 4;
+   Seq_Cst : constant := 5;
+   Last: constant := 6;
+
+   subtype Mem_Model is Integer range Relaxed .. Last;
+
+   ------------------------------------
+   -- GCC built-in atomic primitives --
+   ------------------------------------
+
+   generic
+  type Atomic_Type is mod <>;
+   function Atomic_Load
+ (Ptr   : Address;
+  Model : Mem_Model := Seq_Cst) return Atomic_Type;
+   pragma Import (Intrinsic, Atomic_Load, "__atomic_load_n");
+
+   function Atomic_Load_8  is new Atomic_Load (uint8);
+   function Atomic_Load_16 is new Atomic_Load (uint16);
+   function Atomic_Load_32 is new Atomic_Load (uint32);
+
+   generic
+  type Atomic_Type is mod <>;
+   function Atomic_Compare_Exchange
+ (Ptr   : Address;
+  Expected  : Address;
+  Desired   : Atomic_Type;
+  Weak  : Boolean   := False;
+  Success_Model : Mem_Model := Seq_Cst;
+  Failure_Model : Mem_Model := Seq_Cst) return Boolean;
+   pragma Import
+ (Intrinsic, Atomic_Compare_Exchange, "__atomic_compare_exchange_n");
+
+   function Atomic_Compare_Exchange_8  is new Atomic_Compare_Exchange (uint8);
+   function Atomic_Compare_Exchange_16 is new Atomic_Compare_Exchange (uint16);
+   function Atomic_Compare_Exchange_32 is new Atomic_Compare_Exchange (uint32);
+
+   function Atomic_Test_And_Set
+ (Ptr   : System.Address;
+  Model : Mem_Model := Seq_Cst) return Boolean;
+   pragma Import (Intrinsic, Atomic_Test_And_Set, "__atomic_test_and_set");
+
+   procedure Atomic_Clear
+ (Ptr   : System.Address;
+  Model : Mem_Model := Seq_Cst);
+   pragma Import (Intrinsic, Atomic_Clear, "__atomic_clear");
+
+   function Atomic_Always_Lock_Free
+ (Size : Interfaces.C.size_t;
+  Ptr  : System.Address := System.Null_Address) return Boolean;
+   pragma Import
+ (Intrinsic, Atomic_Always_Lock_Free, "__atomic_always_lock_free");
+
+   --------------------------
+   -- Lock-free operations --
+   --------------------------
+
+   --  The lock-free implementation uses two atomic instructions for the
+   --  expansion of protected operations:
+
+   --  * Lock_Free_Read atomically loads the value contained in Ptr (with the
+   --Acquire synchronization mode).
+
+   --  * Lock_Free_Try_Write atomically tries to write the Desired value into
+   --Ptr if Ptr contains the Expected value. It returns true if the value
+   --in Ptr was changed, or False if it was not, in which case Expected is
+   --updated to the unexpected value in Ptr. Note that it does nothing and
+   --returns true if Desired and Expected are equal.
+
+   generic
+  type Atomic_Type is mod <>;
+   function Lock_Free_Read (Ptr : Address) return Atomic_Type;
+
+   function Lock_Free_Read_8  is new Lock_Free_Read (uint8);
+   function Lock_Free_Read_16 is new Lock_Free_Read (uint16);
+   function Lock_Free_Read_32 is ne

[COMMITTED] ada: usage.adb: document -gnatyD switch

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Ghjuvan Lacambre 

-gnatyD was documented in the user guide but not in `gnat --help-ada`.

gcc/ada/

* usage.adb (Usage): Document -gnatyD.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/usage.adb | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/ada/usage.adb b/gcc/ada/usage.adb
index 4a2fa019013..97cedbb9a2d 100644
--- a/gcc/ada/usage.adb
+++ b/gcc/ada/usage.adb
@@ -655,6 +655,7 @@ begin
Write_Line ("ccheck comment format (two spaces)");
Write_Line ("Ccheck comment format (one space)");
Write_Line ("dcheck no DOS line terminators");
+   Write_Line ("Dcheck declared identifiers in mixed case");
Write_Line ("echeck end/exit labels present");
Write_Line ("fcheck no form feeds/vertical tabs in source");
Write_Line ("gcheck standard GNAT style rules, same as ydISux");
-- 
2.40.0



[COMMITTED] ada: Document examples of No_Dependence restriction for code generation

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

gcc/ada/

* doc/gnat_rm/standard_and_implementation_defined_restrictions.rst
(No_Dependence): Give examples of new No_Dependence restrictions.
* gnat_rm.texi: Regenerate.
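
For context, a restriction of this kind is written as a configuration
pragma. The fragment below is a sketch that assumes the pragmas are
placed in gnat.adc, GNAT's default configuration pragma file; the unit
names are two of the examples listed in this patch.

```ada
--  gnat.adc: illustrative configuration pragmas only

--  Reject constructs that would make the compiler call into the heap
--  allocation routines of the runtime
pragma Restrictions (No_Dependence => System.Memory);

--  Reject constructs that need the 64-bit arithmetic support unit
--  (per the documentation text, relevant on 32-bit platforms)
pragma Restrictions (No_Dependence => System.Arith_64);
```

With such a file in effect, a construct that implicitly depends on one
of these units should be flagged at compile time instead of silently
pulling in the runtime unit.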

Tested on x86_64-pc-linux-gnu, committed on master.

---
 ...nd_implementation_defined_restrictions.rst | 12 ++-
 gcc/ada/gnat_rm.texi  | 33 ++-
 2 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/doc/gnat_rm/standard_and_implementation_defined_restrictions.rst b/gcc/ada/doc/gnat_rm/standard_and_implementation_defined_restrictions.rst
index f8e2a58595f..275b46c3712 100644
--- a/gcc/ada/doc/gnat_rm/standard_and_implementation_defined_restrictions.rst
+++ b/gcc/ada/doc/gnat_rm/standard_and_implementation_defined_restrictions.rst
@@ -186,7 +186,17 @@ No_Dependence
 [RM 13.12.1] This restriction ensures at compile time that there are no
 dependences on a library unit. For GNAT, this includes implicit implementation
 dependences on units of the runtime library that are created by the compiler
-to support specific constructs of the language.
+to support specific constructs of the language. Here are some examples:
+
+* ``System.Arith_64``: 64-bit arithmetics for 32-bit platforms,
+* ``System.Arith_128``: 128-bit arithmetics for 64-bit platforms,
+* ``System.Memory``: heap memory allocation routines,
+* ``System.Memory_Compare``: memory comparison routine (aka ``memcmp`` for C),
+* ``System.Memory_Copy``: memory copy routine (aka ``memcpy`` for C),
+* ``System.Memory_Move``: memoy move routine (aka ``memmove`` for C),
+* ``System.Memory_Set``: memory set routine (aka ``memset`` for C),
+* ``System.Stack_Checking[.Operations]``: stack checking without MMU,
+* ``System.GCC``: support routines from the GCC library.
 
 No_Direct_Boolean_Operators
 ---------------------------
diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi
index 5e05287d6d8..3818f22414a 100644
--- a/gcc/ada/gnat_rm.texi
+++ b/gcc/ada/gnat_rm.texi
@@ -12727,7 +12727,38 @@ delay statements and no semantic dependences on package Calendar.
 [RM 13.12.1] This restriction ensures at compile time that there are no
 dependences on a library unit. For GNAT, this includes implicit implementation
 dependences on units of the runtime library that are created by the compiler
-to support specific constructs of the language.
+to support specific constructs of the language. Here are some examples:
+
+
+@itemize *
+
+@item 
+@code{System.Arith_64}: 64-bit arithmetics for 32-bit platforms,
+
+@item 
+@code{System.Arith_128}: 128-bit arithmetics for 64-bit platforms,
+
+@item 
+@code{System.Memory}: heap memory allocation routines,
+
+@item 
+@code{System.Memory_Compare}: memory comparison routine (aka @code{memcmp} for C),
+
+@item 
+@code{System.Memory_Copy}: memory copy routine (aka @code{memcpy} for C),
+
+@item 
+@code{System.Memory_Move}: memoy move routine (aka @code{memmove} for C),
+
+@item 
+@code{System.Memory_Set}: memory set routine (aka @code{memset} for C),
+
+@item 
+@code{System.Stack_Checking[.Operations]}: stack checking without MMU,
+
+@item 
+@code{System.GCC}: support routines from the GCC library.
+@end itemize
 
 @node No_Direct_Boolean_Operators,No_Dispatch,No_Dependence,Partition-Wide Restrictions
 @anchor{gnat_rm/standard_and_implementation_defined_restrictions no-direct-boolean-operators}@anchor{1ca}
-- 
2.40.0



[COMMITTED] ada: Apply range checks to preanalyzed aggregate expressions

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

When preanalyzing expressions in GNATprove mode, e.g. Pre/Post
contracts, we apply checks, because these expressions will never
be expanded. This didn't happen for aggregate expressions, most
likely because of an oversight.
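
The effect of the condition change can be checked in isolation. The
sketch below copies the old and new early-exit conditions from the
sem_util.adb hunk of this patch into two expression functions and
evaluates them for the case described above (preanalysis in GNATprove
mode, outside a generic). Early_Exit_Demo, Old_Exit and New_Exit are
hypothetical names, and the four flags are passed as plain parameters
rather than read from the compiler's global state.

```ada
with Ada.Text_IO; use Ada.Text_IO;

--  Sketch only: compares the two guard conditions as pure functions
procedure Early_Exit_Demo is

   function Old_Exit
     (Expander_Active, Inside_A_Generic,
      Full_Analysis, GNATprove_Mode : Boolean) return Boolean
   is
     (not Expander_Active
      and (Inside_A_Generic or not Full_Analysis or not GNATprove_Mode));

   function New_Exit
     (Expander_Active, Inside_A_Generic,
      GNATprove_Mode : Boolean) return Boolean
   is
     (not Expander_Active
      and not (GNATprove_Mode and not Inside_A_Generic));

begin
   --  Preanalysis (Full_Analysis = False) in GNATprove mode, not in a
   --  generic, with the expander inactive.

   --  Prints "old exits early: TRUE", i.e. no checks were applied
   Put_Line
     ("old exits early: "
      & Boolean'Image
          (Old_Exit
             (Expander_Active  => False,
              Inside_A_Generic => False,
              Full_Analysis    => False,
              GNATprove_Mode   => True)));

   --  Prints "new exits early: FALSE", i.e. checks are now applied
   Put_Line
     ("new exits early: "
      & Boolean'Image
          (New_Exit
             (Expander_Active  => False,
              Inside_A_Generic => False,
              GNATprove_Mode   => True)));
end Early_Exit_Demo;
```

The new condition no longer looks at Full_Analysis at all, which is what
lets the preanalysis case through.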

gcc/ada/

* sem_util.adb (Aggregate_Constraint_Checks): Don't exit early
when preanalysing in GNATprove mode. Now the condition is
consistent with other similar conditions in other code.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index ad74de6b6f6..38dc654f7be 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -477,7 +477,7 @@ package body Sem_Util is
   --  this breaks the name resolution mechanism for generic instances.
 
   if not Expander_Active
-and (Inside_A_Generic or not Full_Analysis or not GNATprove_Mode)
+and not (GNATprove_Mode and not Inside_A_Generic)
   then
  return;
   end if;
-- 
2.40.0



[COMMITTED] ada: Spurious error analyzing 'old or 'result in class-wide conditions

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Javier Miranda 

gcc/ada/

* sem_attr.adb
(Analyze_Attribute_Old_Result): When preanalyzing a class-wide
condition, search in the scopes stack for the subprogram that has
the condition. This is required because returning the current
scope causes reporting spurious errors when the occurrence of the
attribute is found, for example, in a quantified expression.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_attr.adb | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
index 452aabdd436..a07e91b839d 100644
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -1366,8 +1366,27 @@ package body Sem_Attr is
  --  yet on its definite context.
 
  if Inside_Class_Condition_Preanalysis then
-Legal   := True;
-Spec_Id := Current_Scope;
+Legal := True;
+
+--  Search for the subprogram that has this class-wide condition;
+--  required to avoid reporting spurious errors since the current
+--  scope may not be appropriate because the attribute may be
+--  referenced from the inner scope of, for example, quantified
+--  expressions.
+
+--  Although the expression is not installed on its definite
+--  context, we know that the subprogram has been placed in the
+--  scope stack by Preanalyze_Condition; we also know that it is
+--  not a generic subprogram since class-wide pre/postconditions
+--  can only be applied for primitive operations of tagged types.
+
+if Is_Subprogram (Current_Scope) then
+   Spec_Id := Current_Scope;
+else
+   Spec_Id := Enclosing_Subprogram (Current_Scope);
+end if;
+
+pragma Assert (Is_Dispatching_Operation (Spec_Id));
 return;
  end if;
 
-- 
2.40.0



[COMMITTED] ada: Build invariant procedure while freezing in GNATprove mode

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Invariant procedure bodies are created either by expansion of freezing
nodes (but only in ordinary compilation mode) or at the end of package
private declarations (but not for types with private types in their
derivation chain).

In GNATprove mode we didn't create invariant procedure bodies in
lightweight expansion, so we didn't create them at all when there were
private types in the type derivation chain.

This patch copies the relevant freezing part from ordinary to
lightweight expansion. This obviously involves code duplication,
but it seems better to duplicate whole sections that work properly
instead of small pieces that are incomplete. There are other pieces
of freezing that are similarly duplicated, so this patch doesn't make
the code substantially worse.

gcc/ada/

* exp_spark.adb (SPARK_Freeze_Type): Copy whole handling of DIC
and Type_Invariant from Freeze_Type.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_spark.adb | 54 ---
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/exp_spark.adb b/gcc/ada/exp_spark.adb
index efa5c2cd8da..c344dc1e706 100644
--- a/gcc/ada/exp_spark.adb
+++ b/gcc/ada/exp_spark.adb
@@ -101,7 +101,7 @@ package body Exp_SPARK is
--  expanded body would compare the _parent component, which is
--  intentionally not generated in the GNATprove mode.
--
-   --  We build the DIC procedure body here as well.
+   --  We build the DIC and Type_Invariant procedure bodies here as well.
 
--
-- Expand_SPARK --
@@ -920,15 +920,53 @@ package body Exp_SPARK is
 
   Set_Ghost_Mode (Typ);
 
-  --  When a DIC is inherited by a tagged type, it may need to be
-  --  specialized to the descendant type, hence build a separate DIC
-  --  procedure for it as done during regular expansion for compilation.
+  --  Generate the [spec and] body of the invariant procedure tasked with
+  --  the runtime verification of all invariants that pertain to the type.
+  --  This includes invariants on the partial and full view, inherited
+  --  class-wide invariants from parent types or interfaces, and invariants
+  --  on array elements or record components. But skip internal types.
 
-  if Has_DIC (Typ) and then Is_Tagged_Type (Typ) then
- --  Why is this needed for DIC, but not for other aspects (such as
- --  Type_Invariant)???
+  if Is_Itype (Typ) then
+ null;
+
+  elsif Is_Interface (Typ) then
+
+ --  Interfaces are treated as the partial view of a private type in
+ --  order to achieve uniformity with the general case. As a result, an
+ --  interface receives only a "partial" invariant procedure which is
+ --  never called.
+
+ if Has_Own_Invariants (Typ) then
+Build_Invariant_Procedure_Body
+  (Typ   => Typ,
+   Partial_Invariant => Is_Interface (Typ));
+ end if;
+
+  --  Non-interface types
 
- Build_DIC_Procedure_Body (Typ);
+  --  Do not generate invariant procedure within other assertion
+  --  subprograms, which may involve local declarations of local
+  --  subtypes to which these checks do not apply.
+
+  else
+ if Has_Invariants (Typ) then
+if not Predicate_Check_In_Scope (Typ)
+  or else (Ekind (Current_Scope) = E_Function
+and then Is_Predicate_Function (Current_Scope))
+then
+   null;
+else
+   Build_Invariant_Procedure_Body (Typ);
+end if;
+ end if;
+
+ --  Generate the [spec and] body of the procedure tasked with the
+ --  run-time verification of pragma Default_Initial_Condition's
+ --  expression.
+
+ if Has_DIC (Typ) then
+Build_DIC_Procedure_Body (Typ);
+ end if;
   end if;
 
   if Ekind (Typ) = E_Record_Type
-- 
2.40.0



[COMMITTED] ada: Spurious error on function returning CPP type

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Javier Miranda 

gcc/ada/

* exp_ch6.adb
(Needs_BIP_Alloc_Form): Return False for functions with foreign
convention since we never use build-in-place for such functions.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch6.adb | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index af7f75342fa..b8e5a720a7c 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -9435,9 +9435,14 @@ package body Exp_Ch6 is
   --  types, and those can be used to call primitives, so the formal needs
   --  to be passed to all such build-in-place functions, primitive or not.
 
+  --  We never use build-in-place if the function has foreign convention,
+  --  but note that it is OK for a build-in-place function to return a
+  --  type with a foreign convention because the machinery ensures there
+  --  is no copying.
+
   return not Restriction_Active (No_Secondary_Stack)
 and then (Needs_Secondary_Stack (Typ) or else Is_Tagged_Type (Typ))
-and then not Has_Foreign_Convention (Typ);
+and then not Has_Foreign_Convention (Func_Id);
end Needs_BIP_Alloc_Form;
 
-
-- 
2.40.0



[COMMITTED] ada: Use accumulator type in expansion of 'Reduce attribute

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The current expansion of the 'Reduce attribute uses the resolution type of
the expression for the accumulator. Now this type can be unresolved or set
to a universal type, for example if it is itself the prefix of the 'Image
attribute, and this may yield a spurious type mismatch error in that case.

This changes the expansion to use the accumulator type instead as defined
by the RM 4.5.10 clause, albeit only in the prefixed case for now.

gcc/ada/

* exp_attr.adb (Expand_N_Attribute_Reference) :
Use the canonical accumulator type as the type of the accumulator
in the prefixed case.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_attr.adb | 72 ++--
 1 file changed, 62 insertions(+), 10 deletions(-)

diff --git a/gcc/ada/exp_attr.adb b/gcc/ada/exp_attr.adb
index aababd516d5..7e71422eba3 100644
--- a/gcc/ada/exp_attr.adb
+++ b/gcc/ada/exp_attr.adb
@@ -5978,27 +5978,30 @@ package body Exp_Attr is
   when Attribute_Reduce =>
  declare
 Loc : constant Source_Ptr := Sloc (N);
-E1  : constant Node_Id := First (Expressions (N));
-E2  : constant Node_Id := Next (E1);
-Bnn : constant Entity_Id := Make_Temporary (Loc, 'B', N);
-Typ : constant Entity_Id := Etype (N);
+E1  : constant Node_Id:= First (Expressions (N));
+E2  : constant Node_Id:= Next (E1);
+Bnn : constant Entity_Id  := Make_Temporary (Loc, 'B', N);
 
-New_Loop : Node_Id;
-Stat : Node_Id;
+Accum_Typ : Entity_Id;
+New_Loop  : Node_Id;
 
 function Build_Stat (Comp : Node_Id) return Node_Id;
 --  The reducer can be a function, a procedure whose first
 --  parameter is in-out, or an attribute that is a function,
 --  which (for now) can only be Min/Max. This subprogram
---  builds the corresponding computation for the generated loop.
+--  builds the corresponding computation for the generated loop
+--  and retrieves the accumulator type as per RM 4.5.10(19/5).
 
 
 -- Build_Stat --
 
 
 function Build_Stat (Comp : Node_Id) return Node_Id is
+   Stat : Node_Id;
+
 begin
if Nkind (E1) = N_Attribute_Reference then
+  Accum_Typ := Entity (Prefix (E1));
   Stat := Make_Assignment_Statement (Loc,
 Name => New_Occurrence_Of (Bnn, Loc),
 Expression => Make_Attribute_Reference (Loc,
@@ -6009,12 +6012,14 @@ package body Exp_Attr is
 Comp)));
 
elsif Ekind (Entity (E1)) = E_Procedure then
+  Accum_Typ := Etype (First_Formal (Entity (E1)));
   Stat := Make_Procedure_Call_Statement (Loc,
 Name => New_Occurrence_Of (Entity (E1), Loc),
Parameter_Associations => New_List (
  New_Occurrence_Of (Bnn, Loc),
  Comp));
else
+  Accum_Typ := Etype (Entity (E1));
   Stat := Make_Assignment_Statement (Loc,
 Name => New_Occurrence_Of (Bnn, Loc),
 Expression => Make_Function_Call (Loc,
@@ -6074,6 +6079,13 @@ package body Exp_Attr is
   End_Label => Empty,
   Statements =>
 New_List (Build_Stat (Relocate_Node (Expr;
+
+  --  If the reducer subprogram is a universal operator, then
+  --  we still look at the context to find the type for now.
+
+  if Is_Universal_Numeric_Type (Accum_Typ) then
+ Accum_Typ := Etype (N);
+  end if;
end;
 
 else
@@ -6082,9 +6094,10 @@ package body Exp_Attr is
--  a container with the proper aspects.
 
declare
-  Iter : Node_Id;
   Elem : constant Entity_Id := Make_Temporary (Loc, 'E', N);
 
+  Iter : Node_Id;
+
begin
   Iter :=
 Make_Iterator_Specification (Loc,
@@ -6101,6 +6114,44 @@ package body Exp_Attr is
   End_Label => Empty,
   Statements => New_List (
 Build_Stat (New_Occurrence_Of (Elem, Loc;
+
+  --  If the reducer subprogram is a universal operator, then
+  --  we need to look at the prefix to find the type. This is
+  --  modeled on Analyze_Iterator_Specification in Sem_Ch5.
+
+  if Is_Universal_Numeric_Type (Accum_Typ) then
+ declare

[COMMITTED] ada: Adjust semantics and implementation of storage models

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This makes the following adjustments to the semantics and implementation of
storage models in the compiler:

  1. By-copy semantics in subprogram calls: when an object accessed with a
 nonnative storage model is passed as an actual parameter in a call to
 a subprogram, an intermediate copy made on the host is passed instead.

  2. More generally, any additional temporary required on the host by the
 semantics of nonnative storage models is now created by the front-end
 instead of the code generator.

  3. All the temporaries created on the host for nonnative storage models
 are allocated on the secondary stack instead of the primary stack.

As a result, this should simplify the implementation in code generators.
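
The by-copy semantics of point 1 can be pictured with a small model. This is a
sketch in C++ under loud assumptions: DeviceBuffer, copy_from, copy_to and
call_with_host_copy are hypothetical names, not GNAT entities, and the
copy-back step is assumed for an in-out style parameter. The secondary-stack
allocation of point 3 is not modelled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an object accessed through a nonnative
// storage model: it can only be read or written via explicit copies.
struct DeviceBuffer
{
  std::vector<int> storage;

  std::vector<int> copy_from () const { return storage; }
  void copy_to (const std::vector<int> &host) { storage = host; }
};

// An ordinary subprogram that only knows about host memory.
void increment_all (std::vector<int> &host_values)
{
  for (int &v : host_values)
    ++v;
}

// What the caller conceptually does: the actual parameter is replaced
// by an intermediate host copy, which is written back after the call.
void call_with_host_copy (DeviceBuffer &dev)
{
  std::vector<int> tmp = dev.copy_from ();   // copy-in to a host temporary
  increment_all (tmp);                       // callee sees host memory only
  assert (tmp.size () == dev.storage.size ());
  dev.copy_to (tmp);                         // copy-out (assumed for in-out)
}
```

The point of the model is that the callee never touches the nonnative storage
directly; only the caller-side copies do.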

gcc/ada/

* exp_aggr.adb (Build_Assignment_With_Temporary): Adjust comment
and fix type of second parameter. Create the temporary on the
secondary stack by calling Build_Temporary_On_Secondary_Stack.
(Convert_Array_Aggr_In_Allocator): Adjust formatting.
(Expand_Array_Aggregate): Likewise.
* exp_ch4.adb (Expand_N_Allocator): Set Actual_Designated_Subtype
on the dereference in the initialization for all composite types.
* exp_ch5.adb (Expand_N_Assignment_Statement): Create a temporary
on the host for an assignment between nonnative storage models.
Suppress more checks when Suppress_Assignment_Checks is set.
* exp_ch6.adb (Add_Simple_Call_By_Copy_Code): Deal with actuals
that are dereferences with an Actual_Designated_Subtype. Add
support for nonnative storage models.
(Expand_Actuals): Create a copy if the actual is a dereference
with a nonnative storage model.
* exp_util.ads (Build_Temporary_On_Secondary_Stack): Declare.
* exp_util.adb (Build_Temporary_On_Secondary_Stack): New function.
* sem_ch5.adb (Analyze_Assignment.Set_Assignment_Type): Do not
build an actual subtype for dereferences with an
Actual_Designated_Subtype.
* sinfo.ads (Actual_Designated_Subtype): Adjust documentation.
(Suppress_Assignment_Checks): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_aggr.adb |  51 +-
 gcc/ada/exp_ch4.adb  |  52 +--
 gcc/ada/exp_ch5.adb  |  58 +++--
 gcc/ada/exp_ch6.adb  | 121 ---
 gcc/ada/exp_util.adb |  49 ++
 gcc/ada/exp_util.ads |  12 +
 gcc/ada/sem_ch5.adb  |   9 ++--
 gcc/ada/sinfo.ads|   4 +-
 8 files changed, 274 insertions(+), 82 deletions(-)

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index f1cbbfc3155..cf8bac0f4bf 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -62,7 +62,7 @@ with Sem_Eval;   use Sem_Eval;
 with Sem_Mech;   use Sem_Mech;
 with Sem_Res;use Sem_Res;
 with Sem_Util;   use Sem_Util;
-use Sem_Util.Storage_Model_Support;
+ use Sem_Util.Storage_Model_Support;
 with Sinfo;  use Sinfo;
 with Sinfo.Nodes;use Sinfo.Nodes;
 with Sinfo.Utils;use Sinfo.Utils;
@@ -78,12 +78,10 @@ package body Exp_Aggr is
 
function Build_Assignment_With_Temporary
  (Target : Node_Id;
-  Typ: Node_Id;
+  Typ: Entity_Id;
   Source : Node_Id) return List_Id;
--  Returns a list of actions to assign Source to Target of type Typ using
-   --  an extra temporary:
-   --   Tmp := Source;
-   --   Target := Tmp;
+   --  an extra temporary, which can potentially be large.
 
type Case_Bounds is record
  Choice_Lo   : Node_Id;
@@ -2524,33 +2522,33 @@ package body Exp_Aggr is
 
function Build_Assignment_With_Temporary
  (Target : Node_Id;
-  Typ: Node_Id;
+  Typ: Entity_Id;
   Source : Node_Id) return List_Id
is
   Loc : constant Source_Ptr := Sloc (Source);
 
   Aggr_Code : List_Id;
   Tmp   : Entity_Id;
-  Tmp_Decl  : Node_Id;
 
begin
-  Tmp := Make_Temporary (Loc, 'A', Source);
-  Tmp_Decl :=
-Make_Object_Declaration (Loc,
-  Defining_Identifier => Tmp,
-  Object_Definition   => New_Occurrence_Of (Typ, Loc));
-  Set_No_Initialization (Tmp_Decl, True);
+  Aggr_Code := New_List;
+
+  Tmp := Build_Temporary_On_Secondary_Stack (Loc, Typ, Aggr_Code);
 
-  Aggr_Code := New_List (Tmp_Decl);
   Append_To (Aggr_Code,
 Make_OK_Assignment_Statement (Loc,
-  Name   => New_Occurrence_Of (Tmp, Loc),
+  Name   =>
+Make_Explicit_Dereference (Loc,
+  Prefix => New_Occurrence_Of (Tmp, Loc)),
   Expression => Source));
 
   Append_To (Aggr_Code,
 Make_OK_Assignment_Statement (Loc,
   Name   => Target,
-  Expression => New_Occurrence_Of (Tmp, Loc)));
+  Expression =>
+Make_Explicit_Dereference (Loc,
+  Prefix => New_Occurrence_O

[COMMITTED] ada: Fix internal error on chain of predicated record types

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The preanalysis of a predicate set on one of the record types was causing
premature freezing of another record type.

gcc/ada/

* sem_ch13.adb: Add with and use clauses for Expander.
(Resolve_Aspect_Expressions) : Emulate a
bona-fide preanalysis setup before calling
Resolve_Aspect_Expression.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch13.adb | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
index 1c757228241..a4a5084793e 100644
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -38,6 +38,7 @@ with Exp_Ch3;  use Exp_Ch3;
 with Exp_Disp; use Exp_Disp;
 with Exp_Tss;  use Exp_Tss;
 with Exp_Util; use Exp_Util;
+with Expander; use Expander;
 with Freeze;   use Freeze;
 with Ghost;use Ghost;
 with Lib;  use Lib;
@@ -15625,15 +15626,29 @@ package body Sem_Ch13 is
  --  Preanalyze expression after type replacement to catch
  --  name resolution errors if the predicate function has
  --  not been built yet.
+
  --  Note that we cannot use Preanalyze_Spec_Expression
- --  because of the special handling required for
- --  quantifiers, see comments on Resolve_Aspect_Expression
- --  above.
+ --  directly because of the special handling required for
+ --  quantifiers (see comments on Resolve_Aspect_Expression
+ --  above) but we need to emulate it properly.
 
  if No (Predicate_Function (E)) then
-Push_Type (E);
-Resolve_Aspect_Expression (Expr);
-Pop_Type (E);
+declare
+   Save_In_Spec_Expression : constant Boolean :=
+   In_Spec_Expression;
+   Save_Full_Analysis : constant Boolean :=
+  Full_Analysis;
+begin
+   In_Spec_Expression := True;
+   Full_Analysis := False;
+   Expander_Mode_Save_And_Set (False);
+   Push_Type (E);
+   Resolve_Aspect_Expression (Expr);
+   Pop_Type (E);
+   Expander_Mode_Restore;
+   Full_Analysis := Save_Full_Analysis;
+   In_Spec_Expression := Save_In_Spec_Expression;
+end;
  end if;
 
   when Pre_Post_Aspects =>
-- 
2.40.0



Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]

2023-05-16 Thread Richard Sandiford via Gcc-patches
Tejas Belagod  writes:
>> +  {
>> +int i;
>> +int nelts = vector_cst_encoded_nelts (v);
>> +int first_el = 0;
>> +
>> +for (i = first_el; i < nelts; i += step)
>> +  if (VECTOR_CST_ENCODED_ELT (v, i) != VECTOR_CST_ENCODED_ELT (v,
> first_el))
>
> I think this should use !operand_equal_p (..., ..., 0).
>
>
> Oops! I wonder why I thought VECTOR_CST_ENCODED_ELT returned a constant! 
> Thanks
> for spotting that.

It does only return a constant.  But there can be multiple trees with
the same constant value, through things like TREE_OVERFLOW (not sure
where things stand on expunging that from gimple) and the fact that
gimple does not maintain a distinction between different types that
have the same mode and signedness.  (E.g. on ILP32 hosts, gimple does
not maintain a distinction between int and long, even though int 0 and
long 0 are different trees.)

> Also, should the flags here be OEP_ONLY_CONST ?

Nah, just 0 should be fine.

>> + return false;
>> +
>> +return true;
>> +  }
>> +
>> +  /* Fold a svlast{a/b} call with constant predicate to a BIT_FIELD_REF.
>> + BIT_FIELD_REF lowers to a NEON element extract, so we have to make sure
>> + the index of the element being accessed is in the range of a NEON
> vector
>> + width.  */
>
> s/NEON/Advanced SIMD/.  Same in later comments
>
>> +  gimple *fold (gimple_folder & f) const override
>> +  {
>> +tree pred = gimple_call_arg (f.call, 0);
>> +tree val = gimple_call_arg (f.call, 1);
>> +
>> +if (TREE_CODE (pred) == VECTOR_CST)
>> +  {
>> + HOST_WIDE_INT pos;
>> + unsigned int const_vg;
>> + int i = 0;
>> + int step = f.type_suffix (0).element_bytes;
>> + int step_1 = gcd (step, VECTOR_CST_NPATTERNS (pred));
>> + int npats = VECTOR_CST_NPATTERNS (pred);
>> + unsigned HOST_WIDE_INT nelts = vector_cst_encoded_nelts (pred);
>> + tree b = NULL_TREE;
>> + bool const_vl = aarch64_sve_vg.is_constant (&const_vg);
>
> I think this might be left over from previous versions, but:
> const_vg isn't used and const_vl is only used once, so I think it
> would be better to remove them.
>
>> +
>> + /* We can optimize 2 cases common to variable and fixed-length cases
>> +without a linear search of the predicate vector:
>> +1.  LASTA if predicate is all true, return element 0.
>> +2.  LASTA if predicate all false, return element 0.  */
>> + if (is_lasta () && vect_all_same (pred, step_1))
>> +   {
>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>> + bitsize_int (step * BITS_PER_UNIT), bitsize_int (0));
>> + return gimple_build_assign (f.lhs, b);
>> +   }
>> +
>> + /* Handle the all-false case for LASTB where SVE VL == 128b -
>> +return the highest numbered element.  */
>> + if (is_lastb () && known_eq (BYTES_PER_SVE_VECTOR, 16)
>> + && vect_all_same (pred, step_1)
>> + && integer_zerop (VECTOR_CST_ENCODED_ELT (pred, 0)))
>
> Formatting nit: one condition per line once one line isn't enough.
>
>> +   {
>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>> + bitsize_int (step * BITS_PER_UNIT),
>> + bitsize_int ((16 - step) * BITS_PER_UNIT));
>> +
>> + return gimple_build_assign (f.lhs, b);
>> +   }
>> +
>> + /* If VECTOR_CST_NELTS_PER_PATTERN (pred) == 2 and every multiple of
>> +'step_1' in
>> +[VECTOR_CST_NPATTERNS .. VECTOR_CST_ENCODED_NELTS - 1]
>> +is zero, then we can treat the vector as VECTOR_CST_NPATTERNS
>> +elements followed by all inactive elements.  */
>> + if (!const_vl && VECTOR_CST_NELTS_PER_PATTERN (pred) == 2)
>
> Following on from the above, maybe use:
>
>   !VECTOR_CST_NELTS (pred).is_constant ()
>
> instead of !const_vl here.
>
> I have a horrible suspicion that I'm contradicting our earlier discussion
> here, sorry, but: I think we have to return null if NELTS_PER_PATTERN != 2.
>
>  
>
> IIUC, the NPATTERNS .. ENCODED_ELTS represent the repeated part of the encoded
> constant. This means the repetition occurs if NELTS_PER_PATTERN == 2, IOW the
> base1 repeats in the encoding. This loop is checking this condition and looks
> for a 1 in the repeated part of the NELTS_PER_PATTERN == 2 in a VL vector.
> Please correct me if I’m misunderstanding here.

NELTS_PER_PATTERN == 1 is also a repeating pattern: it means that the
entire sequence is repeated to fill a vector.  So if an NELTS_PER_PATTERN
== 1 constant has elements {0, 1, 0, 0}, the vector is:

   {0, 1, 0, 0, 0, 1, 0, 0, ...}

and the optimisation can't handle that.  NELTS_PER_PATTERN == 3 isn't
likely to occur for predicates, but in principle it has the same problem.
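
To make the encoding rules above concrete, here is a minimal stand-alone
sketch in C++ that expands an encoded constant into a full vector. It is not
GCC code: expand_encoding is a hypothetical helper that works on plain
integers rather than trees, and it only follows the usual reading of the
encoding (NPATTERNS interleaved patterns, each with 1, 2 or 3 encoded
elements).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expand a VECTOR_CST-style encoding into FULL_NELTS elements.
// ENCODED holds NPATTERNS * NELTS_PER_PATTERN values, in vector order.
std::vector<long>
expand_encoding (const std::vector<long> &encoded, std::size_t npatterns,
                 std::size_t nelts_per_pattern, std::size_t full_nelts)
{
  assert (npatterns > 0);
  assert (nelts_per_pattern >= 1 && nelts_per_pattern <= 3);
  assert (encoded.size () == npatterns * nelts_per_pattern);

  std::vector<long> out (full_nelts);
  for (std::size_t i = 0; i < full_nelts; ++i)
    {
      std::size_t pattern = i % npatterns;  // which interleaved pattern
      std::size_t index = i / npatterns;    // position within that pattern

      if (index < nelts_per_pattern)
        // Directly encoded element.
        out[i] = encoded[index * npatterns + pattern];
      else if (nelts_per_pattern == 1)
        // The single element of each pattern repeats forever.
        out[i] = encoded[pattern];
      else if (nelts_per_pattern == 2)
        // The second element (base1) of each pattern repeats forever.
        out[i] = encoded[npatterns + pattern];
      else
        {
          // Stepped pattern: base1, base2, base2 + step, ...
          long base1 = encoded[npatterns + pattern];
          long base2 = encoded[2 * npatterns + pattern];
          out[i] = base1 + static_cast<long> (index - 1) * (base2 - base1);
        }
    }
  return out;
}
```

With encoded = {0, 1, 0, 0}, NPATTERNS = 4 and NELTS_PER_PATTERN = 1, the
first eight elements come out as {0, 1, 0, 0, 0, 1, 0, 0}: the active element
recurs in every block. With encoded = {1, 1, 0, 0}, NPATTERNS = 2 and
NELTS_PER_PATTERN = 2, they come out as {1, 1, 0, 0, 0, 0, 0, 0}, i.e.
NPATTERNS leading elements followed only by inactive ones.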

Thanks,
Richard


[COMMITTED] ada: Follow-up improvement to implementation of storage models

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This avoids recreating an actual subtype for an explicit dereference.

gcc/ada/

* sem_util.adb (Get_Actual_Subtype): For an explicit dereference,
return the Actual_Designated_Subtype if it is present.
(Get_Actual_Subtype_If_Available): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 16 
 1 file changed, 16 insertions(+)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 8bce0229867..ad74de6b6f6 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -10017,6 +10017,14 @@ package body Sem_Util is
   then
  return Actual_Subtype (Entity (N));
 
+  --  Similarly, if we have an explicit dereference, then we get the
+  --  actual subtype from the node itself if one has been built.
+
+  elsif Nkind (N) = N_Explicit_Dereference
+and then Present (Actual_Designated_Subtype (N))
+  then
+ return Actual_Designated_Subtype (N);
+
   --  Actual subtype of unchecked union is always itself. We never need
   --  the "real" actual subtype. If we did, we couldn't get it anyway
   --  because the discriminant is not available. The restrictions on
@@ -10130,6 +10138,14 @@ package body Sem_Util is
   then
  return Actual_Subtype (Entity (N));
 
+  --  Similarly, if we have an explicit dereference, then we get the
+  --  actual subtype from the node itself if one has been built.
+
+  elsif Nkind (N) = N_Explicit_Dereference
+and then Present (Actual_Designated_Subtype (N))
+  then
+ return Actual_Designated_Subtype (N);
+
   --  Otherwise the Etype of N is returned unchanged
 
   else
-- 
2.40.0



[COMMITTED] ada: Add "gnat --help-ada" text for new switches.

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Steve Baird 

The output generated by "gnat --help-ada" should include descriptions for
the newly added -gnatw_s and -gnatw_S switches.

gcc/ada/

* usage.adb: Generate output text describing the -gnatw_s switch
(and the corresponding -gnatw_S switch).

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/usage.adb | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/ada/usage.adb b/gcc/ada/usage.adb
index 97cedbb9a2d..9e2aa019573 100644
--- a/gcc/ada/usage.adb
+++ b/gcc/ada/usage.adb
@@ -580,6 +580,10 @@ begin
Write_Line ("ssuppress all info/warnings");
Write_Line (".s   turn on warnings for overridden size clause");
Write_Line (".S*  turn off warnings for overridden size clause");
+   Write_Line ("_s+  turn on warnings for ineffective predicate " &
+  "tests");
+   Write_Line ("_S*  turn off warnings for ineffective predicate " &
+   "tests");
Write_Line ("tturn on warnings for tracking deleted code");
Write_Line ("T*   turn off warnings for tracking deleted code");
Write_Line (".t*+ turn on warnings for suspicious contract");
-- 
2.40.0



[COMMITTED] ada: Update proof of runtime units

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Yannick Moy 

Following changes in GNATprove, proofs need to be amended.

gcc/ada/

* libgnat/s-aridou.adb (Lemma_Div_Pow2): Add assertion.
* libgnat/s-arit32.adb (Lemma_Abs_Div_Commutation): Simplify.
* libgnat/s-expmod.adb (Lemma_Exp_Mod): Add assertions.
(Lemma_Euclidean_Mod): Add body to lemma.
(Lemma_Mult_Mod): Add assertion.
* libgnat/s-valueu.adb (Scan_Raw_Unsigned): Modify assertion.
* libgnat/s-vauspe.ads (Raw_Unsigned_Last_Ghost): Add
postcondition.
* libgnat/s-widthi.adb: Use more precise types.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/s-aridou.adb |  2 +-
 gcc/ada/libgnat/s-arit32.adb | 33 +
 gcc/ada/libgnat/s-expmod.adb | 20 ++--
 gcc/ada/libgnat/s-valueu.adb | 12 ++--
 gcc/ada/libgnat/s-vauspe.ads |  3 ++-
 gcc/ada/libgnat/s-widthi.adb |  6 +++---
 6 files changed, 31 insertions(+), 45 deletions(-)

diff --git a/gcc/ada/libgnat/s-aridou.adb b/gcc/ada/libgnat/s-aridou.adb
index dbf0f42cd49..041478538a7 100644
--- a/gcc/ada/libgnat/s-aridou.adb
+++ b/gcc/ada/libgnat/s-aridou.adb
@@ -1543,7 +1543,7 @@ is
  Div2 : constant Double_Uns := Double_Uns'(2);
  Left : constant Double_Uns := X / Div1 / Div2;
  R2   : constant Double_Uns := X / Div1 - Left * Div2;
- pragma Assert (R2 < Div2);
+ pragma Assert (R2 <= Div2 - 1);
  R1   : constant Double_Uns := X - X / Div1 * Div1;
  pragma Assert (R1 < Div1);
   begin
diff --git a/gcc/ada/libgnat/s-arit32.adb b/gcc/ada/libgnat/s-arit32.adb
index bd316c1bc20..219523b00f2 100644
--- a/gcc/ada/libgnat/s-arit32.adb
+++ b/gcc/ada/libgnat/s-arit32.adb
@@ -195,12 +195,6 @@ is
or else (X >= Big_0 and then Y <= Big_0),
  Post => X * Y <= Big_0;
 
-   procedure Lemma_Neg_Div (X, Y : Big_Integer)
-   with
- Ghost,
- Pre  => Y /= 0,
- Post => X / Y = (-X) / (-Y);
-
procedure Lemma_Neg_Rem (X, Y : Big_Integer)
with
  Ghost,
@@ -223,6 +217,7 @@ is
-
 
procedure Lemma_Abs_Commutation (X : Int32) is null;
+   procedure Lemma_Abs_Div_Commutation (X, Y : Big_Integer) is null;
procedure Lemma_Abs_Mult_Commutation (X, Y : Big_Integer) is null;
procedure Lemma_Div_Commutation (X, Y : Uns64) is null;
procedure Lemma_Div_Ge (X, Y, Z : Big_Integer) is null;
@@ -234,22 +229,6 @@ is
procedure Lemma_Not_In_Range_Big2xx32 is null;
procedure Lemma_Rem_Commutation (X, Y : Uns64) is null;
 
-   ---
-   -- Lemma_Abs_Div_Commutation --
-   ---
-
-   procedure Lemma_Abs_Div_Commutation (X, Y : Big_Integer) is
-   begin
-  if Y < 0 then
- if X < 0 then
-pragma Assert (abs (X / Y) = abs (X / (-Y)));
- else
-Lemma_Neg_Div (X, Y);
-pragma Assert (abs (X / Y) = abs ((-X) / (-Y)));
- end if;
-  end if;
-   end Lemma_Abs_Div_Commutation;
-
---
-- Lemma_Abs_Rem_Commutation --
---
@@ -277,16 +256,6 @@ is
   pragma Assert (Uns64 (Xlo) = Xu mod 2 ** 32);
end Lemma_Hi_Lo;
 
-   ---
-   -- Lemma_Neg_Div --
-   ---
-
-   procedure Lemma_Neg_Div (X, Y : Big_Integer) is
-   begin
-  pragma Assert ((-X) / (-Y) = -(X / (-Y)));
-  pragma Assert (X / (-Y) = -(X / Y));
-   end Lemma_Neg_Div;
-
-
-- Raise_Error --
-
diff --git a/gcc/ada/libgnat/s-expmod.adb b/gcc/ada/libgnat/s-expmod.adb
index 0682589d352..aa6e9b4c361 100644
--- a/gcc/ada/libgnat/s-expmod.adb
+++ b/gcc/ada/libgnat/s-expmod.adb
@@ -109,9 +109,21 @@ is
 
   procedure Lemma_Euclidean_Mod (Q, F, R : Big_Natural) with
 Pre  => F /= 0,
-Post => (Q * F + R) mod F = R mod F;
+Post => (Q * F + R) mod F = R mod F,
+Subprogram_Variant => (Decreases => Q);
 
-  procedure Lemma_Euclidean_Mod (Q, F, R : Big_Natural) is null;
+  -
+  -- Lemma_Euclidean_Mod --
+  -
+
+  procedure Lemma_Euclidean_Mod (Q, F, R : Big_Natural) is
+  begin
+ if Q > 0 then
+Lemma_Euclidean_Mod (Q - 1, F, R);
+ end if;
+  end Lemma_Euclidean_Mod;
+
+  --  Local variables
 
   Left  : constant Big_Natural := (X + Y) mod B;
   Right : constant Big_Natural := ((X mod B) + (Y mod B)) mod B;
@@ -164,6 +176,9 @@ is
 Lemma_Mod_Mod (A, B);
 Lemma_Exp_Mod (A, Exp - 1, B);
 Lemma_Mult_Mod (A, A ** (Exp - 1), B);
+pragma Assert
+  ((A mod B) * (A mod B) ** (Exp - 1) = (A mod B) ** Exp);
+pragma Assert (A * A ** (Exp - 1) = A ** Exp);
 pragma Assert (Left = Right);
  end;
   end if;
@@ -190,6 +205,7 @@ is
 pragma Assert (Left = Right);
  else

[COMMITTED] ada: Fix internal error on 'Image applied to array component

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This happens because the array component depends on a discriminant.

gcc/ada/

* exp_imgv.adb (Rewrite_Object_Image): If the prefix is a component
that depends on a discriminant, create an actual subtype for it.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_imgv.adb | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/exp_imgv.adb b/gcc/ada/exp_imgv.adb
index 93fdb70306f..257f65badd0 100644
--- a/gcc/ada/exp_imgv.adb
+++ b/gcc/ada/exp_imgv.adb
@@ -2498,12 +2498,31 @@ package body Exp_Imgv is
   Attr_Name : Name_Id;
   Str_Typ   : Entity_Id)
is
+  Ptyp : Entity_Id;
+
begin
+  Ptyp := Etype (Pref);
+
+  --  If the prefix is a component that depends on a discriminant, then
+  --  create an actual subtype for it.
+
+  if Nkind (Pref) = N_Selected_Component then
+ declare
+Decl : constant Node_Id :=
+ Build_Actual_Subtype_Of_Component (Ptyp, Pref);
+ begin
+if Present (Decl) then
+   Insert_Action (N, Decl);
+   Ptyp := Defining_Identifier (Decl);
+end if;
+ end;
+  end if;
+
   Rewrite (N,
 Make_Attribute_Reference (Sloc (N),
-  Prefix => New_Occurrence_Of (Etype (Pref), Sloc (N)),
+  Prefix => New_Occurrence_Of (Ptyp, Sloc (N)),
   Attribute_Name => Attr_Name,
-  Expressions=> New_List (Relocate_Node (Pref;
+  Expressions=> New_List (Unchecked_Convert_To (Ptyp, Pref;
 
   Analyze_And_Resolve (N, Str_Typ);
end Rewrite_Object_Image;
-- 
2.40.0



[COMMITTED] ada: Fix crash on iterated component in expression function

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The problem is that the freeze node generated for the type of a static
subexpression present in the expression function is incorrectly placed
inside instead of outside the function.

gcc/ada/

* freeze.adb (Freeze_Expression): When the freezing is to be done
outside the current scope, skip any scope that is an internal loop.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/freeze.adb | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
index 86622003b97..f54ae0503a1 100644
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -8712,17 +8712,19 @@ package body Freeze is
 
 --  The current scope may be that of a constrained component of
 --  an enclosing record declaration, or of a loop of an enclosing
---  quantified expression, which is above the current scope in the
---  scope stack. Indeed in the context of a quantified expression,
---  a scope is created and pushed above the current scope in order
---  to emulate the loop-like behavior of the quantified expression.
+--  quantified expression or aggregate with an iterated component
+--  in Ada 2022, which is above the current scope in the scope
+--  stack. Indeed in the context of a quantified expression or
+--  an aggregate with an iterated component, an internal scope is
+--  created and pushed above the current scope in order to emulate
+--  the loop-like behavior of the construct.
 --  If the expression is within a top-level pragma, as for a pre-
 --  condition on a library-level subprogram, nothing to do.
 
 if not Is_Compilation_Unit (Current_Scope)
   and then (Is_Record_Type (Scope (Current_Scope))
- or else Nkind (Parent (Current_Scope)) =
- N_Quantified_Expression)
+ or else (Ekind (Current_Scope) = E_Loop
+   and then Is_Internal (Current_Scope)))
 then
Pos := Pos - 1;
 end if;
-- 
2.40.0



[patch,avr] PR105753: Fix ICE in add_clobbers.

2023-05-16 Thread Georg-Johann Lay

This patch removes the superfluous parallel in the [u]divmod patterns
in the AVR backend.  The effect of the extra parallel is that add_clobbers
reaches gcc_unreachable() because the clobbers for [u]divmod are
missing.  The parallel around the parts of an insn pattern is implicit
when the pattern has multiple parts such as clobbers, so the extra
parallel should be removed.

Ok to apply?

Johann

--

gcc/
PR target/105753
* config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi):
Remove superfluous "parallel" in insn pattern.
([u]divmod4): Tidy code.  Use gcc_unreachable() instead of
printing error text to assembly.

gcc/testsuite/
PR target/105753
* gcc.target/avr/torture/pr105753.c: New test.

diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index 43b75046384..a79c6824fad 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -3705,17 +3705,17 @@ (define_insn "*mulohisi3_call"
 ;;CSE has problems to operate on hard regs.
 ;;
 (define_insn_and_split "divmodqi4"
-  [(set (match_operand:QI 0 "pseudo_register_operand" "")
-(div:QI (match_operand:QI 1 "pseudo_register_operand" "")
-(match_operand:QI 2 "pseudo_register_operand" "")))
-   (set (match_operand:QI 3 "pseudo_register_operand" "")
+  [(set (match_operand:QI 0 "pseudo_register_operand")
+(div:QI (match_operand:QI 1 "pseudo_register_operand")
+(match_operand:QI 2 "pseudo_register_operand")))
+   (set (match_operand:QI 3 "pseudo_register_operand")
 (mod:QI (match_dup 1) (match_dup 2)))
(clobber (reg:QI 22))
(clobber (reg:QI 23))
(clobber (reg:QI 24))
(clobber (reg:QI 25))]
   ""
-  "this divmodqi4 pattern should have been splitted;"
+  { gcc_unreachable(); }
   ""
   [(set (reg:QI 24) (match_dup 1))
(set (reg:QI 22) (match_dup 2))
@@ -3751,17 +3751,17 @@ (define_insn "*divmodqi4_call"
   [(set_attr "type" "xcall")])
 
 (define_insn_and_split "udivmodqi4"
- [(set (match_operand:QI 0 "pseudo_register_operand" "")
-   (udiv:QI (match_operand:QI 1 "pseudo_register_operand" "")
-(match_operand:QI 2 "pseudo_register_operand" "")))
-   (set (match_operand:QI 3 "pseudo_register_operand" "")
-(umod:QI (match_dup 1) (match_dup 2)))
-   (clobber (reg:QI 22))
-   (clobber (reg:QI 23))
-   (clobber (reg:QI 24))
-   (clobber (reg:QI 25))]
-  ""
-  "this udivmodqi4 pattern should have been splitted;"
+ [(set (match_operand:QI 0 "pseudo_register_operand")
+   (udiv:QI (match_operand:QI 1 "pseudo_register_operand")
+(match_operand:QI 2 "pseudo_register_operand")))
+  (set (match_operand:QI 3 "pseudo_register_operand")
+   (umod:QI (match_dup 1) (match_dup 2)))
+  (clobber (reg:QI 22))
+  (clobber (reg:QI 23))
+  (clobber (reg:QI 24))
+  (clobber (reg:QI 25))]
+  ""
+  { gcc_unreachable(); }
   ""
   [(set (reg:QI 24) (match_dup 1))
(set (reg:QI 22) (match_dup 2))
@@ -3793,17 +3793,17 @@ (define_insn "*udivmodqi4_call"
   [(set_attr "type" "xcall")])
 
 (define_insn_and_split "divmodhi4"
-  [(set (match_operand:HI 0 "pseudo_register_operand" "")
-(div:HI (match_operand:HI 1 "pseudo_register_operand" "")
-(match_operand:HI 2 "pseudo_register_operand" "")))
-   (set (match_operand:HI 3 "pseudo_register_operand" "")
+  [(set (match_operand:HI 0 "pseudo_register_operand")
+(div:HI (match_operand:HI 1 "pseudo_register_operand")
+(match_operand:HI 2 "pseudo_register_operand")))
+   (set (match_operand:HI 3 "pseudo_register_operand")
 (mod:HI (match_dup 1) (match_dup 2)))
(clobber (reg:QI 21))
(clobber (reg:HI 22))
(clobber (reg:HI 24))
(clobber (reg:HI 26))]
   ""
-  "this should have been splitted;"
+  { gcc_unreachable(); }
   ""
   [(set (reg:HI 24) (match_dup 1))
(set (reg:HI 22) (match_dup 2))
@@ -3839,17 +3839,17 @@ (define_insn "*divmodhi4_call"
   [(set_attr "type" "xcall")])
 
 (define_insn_and_split "udivmodhi4"
-  [(set (match_operand:HI 0 "pseudo_register_operand" "")
-(udiv:HI (match_operand:HI 1 "pseudo_register_operand" "")
- (match_operand:HI 2 "pseudo_register_operand" "")))
-   (set (match_operand:HI 3 "pseudo_register_operand" "")
+  [(set (match_operand:HI 0 "pseudo_register_operand")
+(udiv:HI (match_operand:HI 1 "pseudo_register_operand")
+ (match_operand:HI 2 "pseudo_register_operand")))
+   (set (match_operand:HI 3 "pseudo_register_operand")
 (umod:HI (match_dup 1) (match_dup 2)))
(clobber (reg:QI 21))
(clobber (reg:HI 22))
(clobber (reg:HI 24))
(clobber (reg:HI 26))]
   ""
-  "this udivmodhi4 pattern should have been splitted.;"
+  { gcc_unreachable(); }
   ""
   [(set (reg:HI 24) (match_dup 1))
(set (reg:HI 22) (match_dup 2))
@@ -4090,14 +4090,14 @@ (define_insn "*mulpsi3.libgcc"
 ;; implementation works the other way round.
 
 (define_insn_and_split "divmodpsi4"
-  [(parallel [(set (match_ope

[committed] libstdc++: Do not use pthread_mutex_clocklock with ThreadSanitizer

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

As noted in https://github.com/llvm/llvm-project/issues/62623 there are
no tsan interceptors for some of the new POSIX-1:202x APIs added by
https://austingroupbugs.net/view.php?id=1216 so tsan gives false
positive warnings for try_lock_for on timed mutexes.

Disable the uses of the new pthread_mutex_clocklock API when tsan is
active. This changes the semantics of the try_lock_for functions,
because it can change which clock is used for the wait. This means those
functions might be affected by system clock adjustments when tsan is
used, when they would not be affected otherwise.
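
A minimal sketch of how a feature macro defined as an expression behaves
in a preprocessor conditional. DEMO_TSAN and DEMO_USE_CLOCKLOCK are
hypothetical stand-ins for _GLIBCXX_TSAN and
_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK; this is an illustration, not the
libstdc++ code:

```cpp
// Hypothetical stand-in for _GLIBCXX_TSAN: 1 when the sanitizer is active.
#ifndef DEMO_TSAN
# define DEMO_TSAN 0
#endif

// Defined as an expression rather than a literal 1, so the value seen by
// #if follows DEMO_TSAN at the point of use.
#define DEMO_USE_CLOCKLOCK (DEMO_TSAN == 0)

// Reports which path a timed lock would take in this translation unit.
inline bool uses_clocklock()
{
#if DEMO_USE_CLOCKLOCK
  return true;   // clock-aware API, selected when tsan is off
#else
  return false;  // fallback path, selected when tsan is on
#endif
}
```

Building the same file with -DDEMO_TSAN=1 would flip the result, which is
the reason for defining the macro in terms of another macro.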

Reviewed-by: Thomas Rodgers 
Reviewed-by: Mike Crowe 

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_CHECK_PTHREAD_MUTEX_CLOCKLOCK): Define
_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK in terms of _GLIBCXX_TSAN.
* configure: Regenerate.
---
 libstdc++-v3/acinclude.m4 | 2 +-
 libstdc++-v3/configure| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 42a8e7a775e..0ce3b8b5b31 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4314,7 +4314,7 @@ AC_DEFUN([GLIBCXX_CHECK_PTHREAD_MUTEX_CLOCKLOCK], [
   [glibcxx_cv_PTHREAD_MUTEX_CLOCKLOCK=no])
   ])
   if test $glibcxx_cv_PTHREAD_MUTEX_CLOCKLOCK = yes; then
-AC_DEFINE(_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK, 1, [Define if 
pthread_mutex_clocklock is available in .])
+AC_DEFINE(_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK, (_GLIBCXX_TSAN==0), 
[Define if pthread_mutex_clocklock is available in .])
   fi
 
   CXXFLAGS="$ac_save_CXXFLAGS"
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index d4286b67a73..c1faebd54f2 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -21364,7 +21364,7 @@ fi
 $as_echo "$glibcxx_cv_PTHREAD_MUTEX_CLOCKLOCK" >&6; }
   if test $glibcxx_cv_PTHREAD_MUTEX_CLOCKLOCK = yes; then
 
-$as_echo "#define _GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK 1" >>confdefs.h
+$as_echo "#define _GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK (_GLIBCXX_TSAN==0)" 
>>confdefs.h
 
   fi
 
-- 
2.40.1



[committed] libstdc++: Require tzdb support for chrono::zoned_time printer test

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* testsuite/libstdc++-prettyprinters/chrono.cc: Only test
printer for chrono::zoned_time for cxx11 ABI and tzdb effective
target.
---
 libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc 
b/libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc
index 01a46169393..b5314e025cc 100644
--- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc
+++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc
@@ -1,5 +1,6 @@
 // { dg-options "-g -O0 -std=gnu++2a" }
 // { dg-do run { target c++2a } }
+// { dg-additional-options "-DTEST_ZONED_TIME" { target tzdb } }
 
 // Copyright The GNU Toolchain Authors.
 //
@@ -38,7 +39,7 @@ main()
   utc_time utc(467664h);
   // { dg-final { note-test utc {std::chrono::utc_time = { 467664h }} } }
 
-#if _GLIBCXX_USE_CXX11_ABI
+#if _GLIBCXX_USE_CXX11_ABI && defined TEST_ZONED_TIME
   zoned_time zt("Europe/London", half_past_epoch);
   // { dg-final { note-test zt {std::chrono::zoned_time = { "Europe/London" 
180ms [1970-01-01 00:30:00] }} { target cxx11_abi } } }
 #endif
-- 
2.40.1



[committed] libstdc++: Add assertion to debug_allocator test

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* testsuite/ext/debug_allocator/check_deallocate_null.cc: Add
assertion to ensure expected exception is thrown.
---
 .../testsuite/ext/debug_allocator/check_deallocate_null.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git 
a/libstdc++-v3/testsuite/ext/debug_allocator/check_deallocate_null.cc 
b/libstdc++-v3/testsuite/ext/debug_allocator/check_deallocate_null.cc
index 1f0a9eb0b61..c5bcafb04e9 100644
--- a/libstdc++-v3/testsuite/ext/debug_allocator/check_deallocate_null.cc
+++ b/libstdc++-v3/testsuite/ext/debug_allocator/check_deallocate_null.cc
@@ -31,7 +31,8 @@ int main()
 
   try
 {
-  __gnu_test::check_deallocate_null(); 
+  __gnu_test::check_deallocate_null();
+  VERIFY(false);
 }
   catch (std::runtime_error& obj)
 {
-- 
2.40.1



Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Richard Sandiford via Gcc-patches
pan2...@intel.com writes:
> diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
> index c5180b9308a..38b4d6160c2 100644
> --- a/gcc/rtl-ssa/accesses.h
> +++ b/gcc/rtl-ssa/accesses.h
> @@ -254,7 +254,7 @@ private:
>unsigned int m_spare : 2;
>  
>// The value returned by the accessor above.
> -  machine_mode m_mode : 8;
> +  machine_mode m_mode : MACHINE_MODE_BITSIZE;
>  };
>  
>  // A contiguous array of access_info pointers.  Used to represent a

This structure (access_info) isn't mentioned in the table in the patch
description.  The structure is currently 1 LP64 word and is very
size-sensitive.  I think we should:

- Put the mode after m_regno
- Reduce m_kind to 2 bits
- Remove m_spare

I *think* that will keep the current size, but please check.
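
As a rough illustration of the size check being asked for, here is a
sketch of a bit-field layout that fits one 64-bit word. The struct and
its field widths are hypothetical, not the real rtl_ssa::access_info, and
bit-field packing is ABI-dependent, so the static_assert is the part that
matters:

```cpp
#include <cstdint>

// Hypothetical layout, assuming a 16-bit mode field placed directly
// after a 32-bit register number.
struct packed_access
{
  std::uint32_t regno;        // occupies the first 32 bits
  std::uint32_t mode  : 16;   // widened mode field
  std::uint32_t kind  : 2;    // kind reduced to 2 bits
  std::uint32_t flags : 14;   // remaining bits, with no spare field
};

// Fails to compile if the layout grows past one 64-bit word.
static_assert(sizeof(packed_access) == 8,
              "packed_access must stay within one 64-bit word");
```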

LGTM otherwise.

Thanks,
Richard


[committed 1/3] libstdc++: Stop using _GLIBCXX_USE_C99_COMPLEX_TR1 in <complex>

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

The _GLIBCXX_USE_C99_COMPLEX_TR1 macro (and the comments about it in
acinclude.m4 and config.h) are misleading when it is also used for
<complex>, not only <tr1/complex>. It is also wrong, because the
configure checks for TR1 use -std=c++98 and a target might define cacos
etc. for C++11 but not for C++98.

Add a separate configure check for the inverse trigonometric functions
that are covered by _GLIBCXX_USE_C99_COMPLEX_TR1, but using -std=c++11
for the checks. Use the result of that separate check in <complex>.
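
For context, a small user-level sketch of the kind of C++11 call affected
by this check. It uses the standard std::acos overload for std::complex;
the wrapper name complex_arccos is hypothetical, and whether the library
forwards to a C99 function or uses a generic fallback is an internal
detail that this example does not observe:

```cpp
#include <cmath>
#include <complex>

// Calls the C++11 std::acos overload for std::complex<double>.
inline std::complex<double> complex_arccos(std::complex<double> z)
{
  return std::acos(z);
}
```

For a real argument in [-1, 1] the real part of the result should agree
with the scalar std::acos.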

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_USE_C99): Check for complex inverse trig
functions in C++11 mode and define _GLIBCXX_USE_C99_COMPLEX_ARC.
* config.h.in: Regenerate.
* configure: Regenerate.
* doc/doxygen/user.cfg.in (PREDEFINED): Add new macro.
* include/std/complex: Check _GLIBCXX_USE_C99_COMPLEX_ARC
instead of _GLIBCXX_USE_C99_COMPLEX_TR1.
---
 libstdc++-v3/acinclude.m4| 37 +++
 libstdc++-v3/config.h.in |  5 +++
 libstdc++-v3/configure   | 53 
 libstdc++-v3/doc/doxygen/user.cfg.in |  1 +
 libstdc++-v3/include/std/complex | 14 
 5 files changed, 103 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 0ce3b8b5b31..84b12adbc24 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1200,6 +1200,43 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
 requires corresponding C99 library functions to be present.])
 fi
 
+# Check for the existence of  complex inverse trigonometric
+# math functions used by  for C++11 and later.
+ac_c99_complex_arc=no;
+if test x"$ac_has_complex_h" = x"yes"; then
+  AC_MSG_CHECKING([for ISO C99 support for inverse trig functions in 
])
+  AC_TRY_COMPILE([#include ],
+[typedef __complex__ float float_type; float_type tmpf;
+ cacosf(tmpf);
+ casinf(tmpf);
+ catanf(tmpf);
+ cacoshf(tmpf);
+ casinhf(tmpf);
+ catanhf(tmpf);
+ typedef __complex__ double double_type; double_type tmpd;
+ cacos(tmpd);
+ casin(tmpd);
+ catan(tmpd);
+ cacosh(tmpd);
+ casinh(tmpd);
+ catanh(tmpd);
+ typedef __complex__ long double ld_type; ld_type tmpld;
+ cacosl(tmpld);
+ casinl(tmpld);
+ catanl(tmpld);
+ cacoshl(tmpld);
+ casinhl(tmpld);
+ catanhl(tmpld);
+],[ac_c99_complex_arc=yes], [ac_c99_complex_arc=no])
+fi
+AC_MSG_RESULT($ac_c99_complex_arc)
+if test x"$ac_c99_complex_arc" = x"yes"; then
+  AC_DEFINE(_GLIBCXX_USE_C99_COMPLEX_ARC, 1,
+   [Define if C99 inverse trig functions in  should be
+   used in . Using compiler builtins for these functions
+   requires corresponding C99 library functions to be present.])
+fi
+
 # Check for the existence in  of vscanf, et. al.
 AC_CACHE_CHECK([for ISO C99 support in  for C++11],
   glibcxx_cv_c99_stdio_cxx11, [
diff --git a/libstdc++-v3/doc/doxygen/user.cfg.in 
b/libstdc++-v3/doc/doxygen/user.cfg.in
index 14981c96f95..210e13400b9 100644
--- a/libstdc++-v3/doc/doxygen/user.cfg.in
+++ b/libstdc++-v3/doc/doxygen/user.cfg.in
@@ -2352,6 +2352,7 @@ PREDEFINED = __cplusplus=202002L \
  _GLIBCXX_USE_NOEXCEPT=noexcept \
  _GLIBCXX_USE_WCHAR_T \
  _GLIBCXX_USE_LONG_LONG \
+_GLIBCXX_USE_C99_COMPLEX_ARC \
  _GLIBCXX_USE_C99_STDINT_TR1 \
  _GLIBCXX_USE_SCHED_YIELD \
  _GLIBCXX_USE_NANOSLEEP \
diff --git a/libstdc++-v3/include/std/complex b/libstdc++-v3/include/std/complex
index 0f5f14c3ddb..40fc062e53d 100644
--- a/libstdc++-v3/include/std/complex
+++ b/libstdc++-v3/include/std/complex
@@ -2021,7 +2021,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return std::complex<_Tp>(__pi_2 - __t.real(), -__t.imag());
 }
 
-#if _GLIBCXX_USE_C99_COMPLEX_TR1
+#if _GLIBCXX_USE_C99_COMPLEX_ARC
 #if defined(__STDCPP_FLOAT16_T__) && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
   inline __complex__ _Float16
   __complex_acos(__complex__ _Float16 __z)
@@ -2177,7 +2177,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 #endif
 
-#if _GLIBCXX_USE_C99_COMPLEX_TR1
+#if _GLIBCXX_USE_C99_COMPLEX_ARC
   inline __complex__ float
   __complex_acos(__complex__ float __z)
   { return __builtin_cacosf(__z); }
@@ -2213,7 +2213,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return std::complex<_Tp>(__t.imag(), -__t.real());
 }
 
-#if _GLIBCXX_USE_C99_COM

[committed 3/3] libstdc++: Stop using TR1 macros in <cctype> and <cfenv>

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

As with the two commits before this, the _GLIBCXX_USE_C99_CTYPE_TR1 and
_GLIBCXX_USE_C99_FENV_TR1 macros are misleading when they are also used
for <cctype> and <cfenv>, not only for TR1 headers. It is also wrong,
because the configure checks for TR1 use -std=c++98 and a target might
define the C99 features for C++11 but not for C++98.

Add separate configure checks for the <ctype.h> and <fenv.h> features using
-std=c++11 for the checks. Use the new macros defined by those checks in the
C++11-specific parts of <cfenv>, <fenv.h>, and <cctype>.
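
A short sketch of C++11 code that relies on the declarations these checks
guard. std::isblank and std::fegetround are standard library functions;
the wrapper names are hypothetical and only serve the illustration:

```cpp
#include <cctype>
#include <cfenv>

// True for space and horizontal tab in the default "C" locale.
inline bool is_blank_char(char c)
{
  // Cast first: passing a negative char to std::isblank is undefined.
  return std::isblank(static_cast<unsigned char>(c)) != 0;
}

// Returns the current rounding mode, or a negative value if unknown.
inline int current_rounding_mode()
{
  return std::fegetround();
}
```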

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_USE_C99): Check for isblank in C++11
mode and define _GLIBCXX_USE_C99_CTYPE. Check for <fenv.h>
functions in C++11 mode and define _GLIBCXX_USE_C99_FENV.
* config.h.in: Regenerate.
* configure: Regenerate.
* include/c_compatibility/fenv.h: Check _GLIBCXX_USE_C99_FENV
instead of _GLIBCXX_USE_C99_FENV_TR1.
* include/c_global/cfenv: Likewise.
* include/c_global/cctype: Check _GLIBCXX_USE_C99_CTYPE instead
of _GLIBCXX_USE_C99_CTYPE_TR1.
---
 libstdc++-v3/acinclude.m4   | 46 ++
 libstdc++-v3/config.h.in|  8 ++
 libstdc++-v3/configure  | 97 +
 libstdc++-v3/include/c_compatibility/fenv.h |  4 +-
 libstdc++-v3/include/c_global/cctype|  4 +-
 libstdc++-v3/include/c_global/cfenv |  4 +-
 6 files changed, 157 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 0c01b526ebf..988c532c4e2 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1476,6 +1476,52 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
   fi
 fi
 
+# Check for the existence of  functions.
+AC_CACHE_CHECK([for ISO C99 support for C++11 in ],
+glibcxx_cv_c99_ctype, [
+AC_TRY_COMPILE([#include ],
+  [int ch;
+   int ret;
+   ret = isblank(ch);
+  ],[glibcxx_cv_c99_ctype=yes],
+[glibcxx_cv_c99_ctype=no])
+])
+if test x"$glibcxx_cv_c99_ctype" = x"yes"; then
+  AC_DEFINE(_GLIBCXX_USE_C99_CTYPE, 1,
+   [Define if C99 functions in  should be imported in
+in namespace std for C++11.])
+fi
+
+# Check for the existence of  functions.
+AC_CHECK_HEADERS(fenv.h, ac_has_fenv_h=yes, ac_has_fenv_h=no)
+ac_c99_fenv=no;
+if test x"$ac_has_fenv_h" = x"yes"; then
+  AC_MSG_CHECKING([for ISO C99 support for C++11 in ])
+  AC_TRY_COMPILE([#include ],
+[int except, mode;
+ fexcept_t* pflag;
+ fenv_t* penv;
+ int ret;
+ ret = feclearexcept(except);
+ ret = fegetexceptflag(pflag, except);
+ ret = feraiseexcept(except);
+ ret = fesetexceptflag(pflag, except);
+ ret = fetestexcept(except);
+ ret = fegetround();
+ ret = fesetround(mode);
+ ret = fegetenv(penv);
+ ret = feholdexcept(penv);
+ ret = fesetenv(penv);
+ ret = feupdateenv(penv);
+],[ac_c99_fenv=yes], [ac_c99_fenv=no])
+  AC_MSG_RESULT($ac_c99_fenv)
+fi
+if test x"$ac_c99_fenv" = x"yes"; then
+  AC_DEFINE(_GLIBCXX_USE_C99_FENV, 1,
+   [Define if C99 functions in  should be imported in
+in namespace std for C++11.])
+fi
+
 gcc_no_link="$ac_save_gcc_no_link"
 LIBS="$ac_save_LIBS"
 CXXFLAGS="$ac_save_CXXFLAGS"
diff --git a/libstdc++-v3/include/c_compatibility/fenv.h 
b/libstdc++-v3/include/c_compatibility/fenv.h
index 70ce3f834f4..83e930f12d1 100644
--- a/libstdc++-v3/include/c_compatibility/fenv.h
+++ b/libstdc++-v3/include/c_compatibility/fenv.h
@@ -38,7 +38,7 @@
 
 #if __cplusplus >= 201103L
 
-#if _GLIBCXX_USE_C99_FENV_TR1
+#if _GLIBCXX_USE_C99_FENV
 
 #undef feclearexcept
 #undef fegetexceptflag
@@ -74,7 +74,7 @@ namespace std
   using ::feupdateenv;
 } // namespace
 
-#endif // _GLIBCXX_USE_C99_FENV_TR1
+#endif // _GLIBCXX_USE_C99_FENV
 
 #endif // C++11
 
diff --git a/libstdc++-v3/include/c_global/cctype 
b/libstdc++-v3/include/c_global/cctype
index bd667fba15d..e6ff1204df6 100644
--- a/libstdc++-v3/include/c_global/cctype
+++ b/libstdc++-v3/include/c_global/cctype
@@ -78,7 +78,7 @@ namespace std
 
 #if __cplusplus >= 201103L
 
-#ifdef _GLIBCXX_USE_C99_CTYPE_TR1
+#ifdef _GLIBCXX_USE_C99_CTYPE
 
 #undef isblank
 
@@ -87,7 +87,7 @@ namespace std
   using ::isblank;
 } // namespace std
 
-#endif // _GLIBCXX_USE_C99_CTYPE_TR1
+#endif // _GLIBCXX_USE_C99_CTYPE
 
 #endif // C++11
 
diff --git a/libstdc++-v3/include/c_global/cfenv 
b/libstdc++-v3/include/c_global/cfenv
index 6704dc5423e..3a1d9c4a6aa 100644
--- a/libstdc++-v3/include/c_global/cfenv
+++ b/libstdc++-

[committed 2/3] libstdc++: Stop using _GLIBCXX_USE_C99_STDINT_TR1 in <cstdint>

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

The _GLIBCXX_USE_C99_STDINT_TR1 macro (and the comments about it in
acinclude.m4 and config.h) are misleading when it is also used for
<cstdint>, not only <tr1/cstdint>. It is also wrong, because the
configure checks for TR1 use -std=c++98 and a target might define
uint32_t etc. for C++11 but not for C++98.

Add a separate configure check for the <stdint.h> types using -std=c++11
for the checks. Use the result of that separate check in <cstdint> and
most other places that still depend on the macro (many uses of that
macro have been removed already). The remaining uses of the STDINT_TR1
macro are really for TR1, or are in the src/c++11/compatibility-*.cc
files, where we don't want/need to change the condition they depend on
(if those symbols were only exported when <stdint.h> types were
available for -std=c++98, then that's the condition we should continue
to use for whether to export the compat symbols now).

Make similar changes for the related _GLIBCXX_USE_C99_INTTYPES_TR1 and
_GLIBCXX_USE_C99_INTTYPES_WCHAR_T_TR1 macros, adding new macros for
non-TR1 uses.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_USE_C99): Check for <stdint.h> types in
C++11 mode and define _GLIBCXX_USE_C99_STDINT. Check for
<inttypes.h> features in C++11 mode and define
_GLIBCXX_USE_C99_INTTYPES and _GLIBCXX_USE_C99_INTTYPES_WCHAR_T.
* config.h.in: Regenerate.
* configure: Regenerate.
* doc/doxygen/user.cfg.in (PREDEFINED): Add new macros.
* include/bits/chrono.h: Check _GLIBCXX_USE_C99_STDINT instead
of _GLIBCXX_USE_C99_STDINT_TR1.
* include/c_compatibility/inttypes.h: Check
_GLIBCXX_USE_C99_INTTYPES and _GLIBCXX_USE_C99_INTTYPES_WCHAR_T
instead of _GLIBCXX_USE_C99_INTTYPES_TR1 and
_GLIBCXX_USE_C99_INTTYPES_WCHAR_T_TR1.
* include/c_compatibility/stdatomic.h: Check
_GLIBCXX_USE_C99_STDINT instead of _GLIBCXX_USE_C99_STDINT_TR1.
* include/c_compatibility/stdint.h: Likewise.
* include/c_global/cinttypes: Check _GLIBCXX_USE_C99_INTTYPES
and _GLIBCXX_USE_C99_INTTYPES_WCHAR_T instead of
_GLIBCXX_USE_C99_INTTYPES_TR1 and
_GLIBCXX_USE_C99_INTTYPES_WCHAR_T_TR1.
* include/c_global/cstdint: Check _GLIBCXX_USE_C99_STDINT
instead of _GLIBCXX_USE_C99_STDINT_TR1.
* include/std/atomic: Likewise.
* src/c++11/cow-stdexcept.cc: Likewise.
* testsuite/29_atomics/headers/stdatomic.h/c_compat.cc:
Likewise.
* testsuite/lib/libstdc++.exp (check_v3_target_cstdint):
Likewise.
---
 libstdc++-v3/acinclude.m4 | 142 +
 libstdc++-v3/config.h.in  |  12 ++
 libstdc++-v3/configure| 196 ++
 libstdc++-v3/doc/doxygen/user.cfg.in  |   3 +
 libstdc++-v3/include/bits/chrono.h|   2 +-
 .../include/c_compatibility/inttypes.h|   6 +-
 .../include/c_compatibility/stdatomic.h   |   4 +-
 libstdc++-v3/include/c_compatibility/stdint.h |   4 +-
 libstdc++-v3/include/c_global/cinttypes   |   6 +-
 libstdc++-v3/include/c_global/cstdint |   6 +-
 libstdc++-v3/include/std/atomic   |   2 +-
 libstdc++-v3/src/c++11/cow-stdexcept.cc   |   4 +-
 .../headers/stdatomic.h/c_compat.cc   |   2 +-
 libstdc++-v3/testsuite/lib/libstdc++.exp  |   2 +-
 14 files changed, 372 insertions(+), 19 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 84b12adbc24..0c01b526ebf 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1103,6 +1103,148 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
   ])
 fi
 
+# Check for the existence of  types.
+AC_CACHE_CHECK([for ISO C99 support in  for C++11],
+glibcxx_cv_c99_stdint, [
+AC_TRY_COMPILE([#define __STDC_LIMIT_MACROS
+   #define __STDC_CONSTANT_MACROS
+   #include ],
+  [typedef int8_t  my_int8_t;
+   my_int8_t   i8 = INT8_MIN;
+   i8 = INT8_MAX;
+   typedef int16_t my_int16_t;
+   my_int16_t  i16 = INT16_MIN;
+   i16 = INT16_MAX;
+   typedef int32_t my_int32_t;
+   my_int32_t  i32 = INT32_MIN;
+   i32 = INT32_MAX;
+   typedef int64_t my_int64_t;
+   my_int64_t  i64 = INT64_MIN;
+   i64 = INT64_MAX;
+   typedef int_fast8_t my_int_fast8_t;
+   my_int_fast8_t  if8 = INT_FAST8_MIN;
+   if8 = INT_FAST8_MAX;
+   typedef int_fast16_tmy_int_fast16_t;
+   my_int_fast16_t if16 = INT_FAST16_MIN;
+   if16 = INT_FAST16_MAX;
+   typedef int_fast32_tmy_int_fast32_t;
+   my_int_fast32_t  

[PATCH V10] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This patch implements decrementing IV support for the length approach in
loop control.

Address the comment from Kewen to incorporate the implementation into
"vect_set_loop_controls_directly" instead of a standalone function.

Address the comment from Richard to use MIN_EXPR to handle the following 3
cases:
1. single rgroup.
2. multiple rgroup for SLP.
3. multiple rgroup for non-SLP (tested on vec_pack_trunc).
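
A scalar sketch of the idea for the single-rgroup case only, under the
assumption that each iteration handles min(remaining, vf) elements and
the IV counts down by that amount. The function name and the vf parameter
are hypothetical; this models the control flow and is not the
vectorizer's code:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns the per-iteration lengths for n elements with vector factor vf.
// `remaining` plays the role of the decrementing IV.
inline std::vector<std::size_t> chunk_lengths(std::size_t n, std::size_t vf)
{
  std::vector<std::size_t> lens;
  if (vf == 0)
    return lens;  // no progress would be possible
  for (std::size_t remaining = n; remaining > 0; )
    {
      // The MIN selects a full vector or the final partial one.
      std::size_t len = std::min(remaining, vf);
      lens.push_back(len);
      remaining -= len;
    }
  return lens;
}
```

With n = 10 and vf = 4 the lengths are 4, 4, 2, and the loop exits when
the IV reaches zero.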


gcc/ChangeLog:

* tree-vect-loop-manip.cc (vect_adjust_loop_lens): New function.
(vect_set_loop_controls_directly): Add decrement IV support.
(vect_set_loop_condition_partial_vectors): Ditto.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): New variable.
(vect_get_loop_len): Add decrement IV support.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.
(vect_get_loop_len): Add decrement IV support.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New 
test.

---
 .../rvv/autovec/partial/multiple_rgroup-1.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-1.h   | 304 ++
 .../rvv/autovec/partial/multiple_rgroup-2.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-2.h   | 546 ++
 .../autovec/partial/multiple_rgroup_run-1.c   |  19 +
 .../autovec/partial/multiple_rgroup_run-2.c   |  19 +
 gcc/tree-vect-loop-manip.cc   | 203 ++-
 gcc/tree-vect-loop.cc |  37 +-
 gcc/tree-vect-stmts.cc|   9 +-
 gcc/tree-vectorizer.h |  13 +-
 10 files changed, 1148 insertions(+), 14 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
new file mode 100644
index 000..69cc3be78f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param 
riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-1.h"
+
+TEST_ALL (test_1)
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
new file mode 100644
index 000..fbc49f4855d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
@@ -0,0 +1,304 @@
+#include 
+#include 
+
+#define test_1(TYPE1, TYPE2)   
\
+  void __attribute__ ((noinline, noclone)) 
\
+  test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x,   
\
+ TYPE1 x2, TYPE2 y, int n)\
+  {
\
+for (int i = 0; i < n; ++i)
\
+  {
\
+   f[i * 2 + 0] = x;  \
+   f[i * 2 + 1] = x2; \
+   d[i] = y;  \
+  }
\
+  }
+
+#define run_1(TYPE1, TYPE2)
\
+  int n_1_##TYPE1_##TYPE2 = 1; 
\
+  TYPE1 x_1_##TYPE1 = 117; 
\
+  TYPE1 x2_1_##TYPE1 = 232;
\
+  TYPE2 y_1_##TYPE2 = 9762;
\
+  TYPE1 f_1_##TYPE1[2 * 2 + 1] = {0};  
\
+  TYPE2 d_1_##TYPE2[2] = {0};   

Re: [PATCH] Add auto-resizing capability to irange's [PR109695]

2023-05-16 Thread Aldy Hernandez via Gcc-patches



On 5/15/23 20:14, Aldy Hernandez wrote:

On 5/15/23 17:07, Aldy Hernandez wrote:



On 5/15/23 12:42, Jakub Jelinek wrote:

On Mon, May 15, 2023 at 12:35:23PM +0200, Aldy Hernandez wrote:

gcc/ChangeLog:

PR tree-optimization/109695
* value-range.cc (irange::operator=): Resize range.
(irange::union_): Same.
(irange::intersect): Same.
(irange::invert): Same.
(int_range_max): Default to 3 sub-ranges and resize as needed.
* value-range.h (irange::maybe_resize): New.
(~int_range): New.
(int_range::int_range): Adjust for resizing.
(int_range::operator=): Same.


LGTM.

One question is if we shouldn't do it for GCC13/GCC12 as well, perhaps
changing it to some larger number than 3 when the members aren't wide_ints
in there but just trees.  Sure, in 13/12 the problem is 10x less severe
than in current trunk, but still we have some cases where we run out of
stack because of it on some hosts.


Sure, but that would require messing around with the gt_* GTY 
functions, and making sure we're allocating the trees from a sensible 
place, etc etc.  I'm less confident in my ability to mess with GTY 
stuff this late in the game.


Hmmm, maybe backporting this isn't too bad.  The only time we'd have a 
chunk on the heap is for int_range_max, which will never live in GC 
space.  So I don't think we need to worry about GC at all.


Although, legacy mode in GCC13 does get in the way a bit.  Sigh.


I've adapted the patch to GCC13 and tested it on x86-64 Linux.  Please 
look over the new[] I do for trees to make sure I did things right.


int_range_max on GCC13 is currently 4112 bytes.  Here are the numbers 
for various defaults:


< 2> =  64 bytes, 3.02% for VRP.
< 3> =  80 bytes, 2.67% for VRP.
< 8> = 160 bytes, 2.46% for VRP.
<16> = 288 bytes, 2.40% for VRP.

Note that we don't have any runway on GCC13, so this would be a net loss 
in performance for VRP.  Threading shows about half as much of a drop
as VRP.  Overall compilation is within 0.2%, so not noticeable.


I'm surprised 2 sub-ranges doesn't incur a bigger penalty, but 3 seems 
to be the happy medium.  Anything more than that, and there's no difference.


The patch defaults to 3 sub-ranges.  I must say, 80 bytes looks mighty 
nice.  It's up to you what to do with the patch.  I'm chicken shit at 
heart and hate touching release compilers :).


Aldy

From 777aa930b106fea2dd6ed9fe22b42a2717f1472d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Mon, 15 May 2023 12:25:58 +0200
Subject: [PATCH] [GCC13] Add auto-resizing capability to irange's [PR109695]

Backport the following from trunk.

	Note that the patch has been adapted to trees.

	The numbers for various sub-ranges on GCC13 are:
		< 2> =  64 bytes, -3.02% for VRP.
		< 3> =  80 bytes, -2.67% for VRP.
		< 8> = 160 bytes, -2.46% for VRP.
		<16> = 288 bytes, -2.40% for VRP.


We can now have int_range<N, RESIZABLE=false> for automatically
resizable ranges.  int_range_max is now int_range<3, true>
for a 69X reduction in size from current trunk, and 6.9X reduction from
GCC12.  This incurs a 5% performance penalty for VRP that is more than
covered by our > 13% improvements recently.


int_range_max is the temporary range object we use in the ranger for
integers.  With the conversion to wide_int, this structure bloated up
significantly because wide_ints are huge (80 bytes a piece) and are
about 10 times as big as a plain tree.  Since the temporary object
requires 255 sub-ranges, that's 255 * 80 * 2, plus the control word.
This means the structure grew from 4112 bytes to 40912 bytes.

This patch adds the ability to resize ranges as needed, defaulting to
no resizing, while int_range_max now defaults to 3 sub-ranges (instead
of 255) and grows to 255 when the range being calculated does not fit.

For example:

int_range<1> foo;	// 1 sub-range with no resizing.
int_range<5> foo;	// 5 sub-ranges with no resizing.
int_range<5, true> foo;	// 5 sub-ranges with resizing.

I ran some tests and found that 3 sub-ranges cover 99% of cases, so
I've set the int_range_max default to that:

	typedef int_range<3, /*RESIZABLE=*/true> int_range_max;

We don't bother growing incrementally, since the default covers most
cases and we have a 255 hard-limit.  This hard limit could be reduced
to 128, since my tests never saw a range needing more than 124, but we
could do that as a follow-up if needed.
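
To make the growth policy concrete, here is a simplified model of it:
N inline slots, and a single jump to the 255 hard limit when a resizable
object overflows. The class small_ranges and its members are
hypothetical and use plain long pairs; this is not GCC's irange or
int_range implementation:

```cpp
#include <cstddef>
#include <memory>
#include <utility>

template <std::size_t N, bool Resizable = false>
class small_ranges
{
public:
  static constexpr std::size_t hard_max = 255;

  std::size_t capacity() const { return m_heap ? hard_max : N; }
  std::size_t size() const { return m_size; }

  // Appends one [lo, hi] sub-range; returns false if it cannot fit.
  bool push(long lo, long hi)
  {
    if (m_size == capacity() && !grow())
      return false;
    data()[m_size++] = std::make_pair(lo, hi);
    return true;
  }

private:
  using pair_t = std::pair<long, long>;

  pair_t* data() { return m_heap ? m_heap.get() : m_inline; }

  // Grows once, straight to the hard limit, rather than incrementally.
  bool grow()
  {
    if (!Resizable || m_heap)
      return false;
    std::unique_ptr<pair_t[]> bigger(new pair_t[hard_max]);
    for (std::size_t i = 0; i < m_size; ++i)
      bigger[i] = m_inline[i];
    m_heap = std::move(bigger);
    return true;
  }

  pair_t m_inline[N];                 // storage used in the common case
  std::size_t m_size = 0;
  std::unique_ptr<pair_t[]> m_heap;   // allocated only after overflow
};
```

A small_ranges<3, true> stays on its inline storage until a fourth
sub-range is pushed, while a small_ranges<3> rejects the fourth push.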

With 3-subranges, int_range_max is now 592 bytes versus 40912 for
trunk, and versus 4112 bytes for GCC12!  The penalty is 5.04% for VRP
and 3.02% for threading, with no noticeable change in overall
compilation (0.27%).  This is more than covered by our 13.26%
improvements for the legacy removal + wide_int conversion.

I think this approach is a good alternative, while providing us with
flexibility going forward.  For example, we could try defaulting to
8 sub-ranges for a noticeable improvement in VRP.  We could also use
large sub-ranges for switch analys

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard.
I have sent V10:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618718.html 

I can't combine the implementations of Case 2 and Case 3. In Case 2, each
control (len) comes from the same rgc, but in Case 3, each control (len)
comes from a different rgc.
Can you help me with that?
Also, I have appended my testcases to this patch.
Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 16:30
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai"  writes:
> Hi, Richard.
>
> RVV infrastructure in RISC-V backend status:
> 1. All RVV instructions pattern related to intrinsics are all finished (They 
> will be called not only by intrinsics but also autovec in the future).
> 2. In case of autovec, we finished len_load/len_store (They are temporary 
> used and will be removed after I support len_mask_load/len_mask_store in the 
> middle-end).
>binary integer autovec patterns.
>vec_init pattern.
>That's all we have so far.
 
Thanks.
 
> In case of testing of this patch, I have multiple rgroup testcases in local, 
> you mean you want me to post them together with this patch?
> Since I am gonna to put them in RISC-V backend testsuite, I was planning to 
> post them after this patch is finished and merged into trunk.
> What do you suggest ?
 
It would be useful to include the tests with the patch itself (as a patch
to the testsuite).  It doesn't matter that the tests are riscv-specific.
 
Obviously it would be more appropriate for the riscv maintainers to
review the riscv tests.  But keeping the tests with the patch helps when
reviewing the code, and also ensures that code is committed and never
later tested.
 
Richard
 


Re: [PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-05-16 Thread Frederik Harwath via Gcc-patches

Hi Jakub,

On 15.05.23 12:19, Jakub Jelinek wrote:

> On Fri, Mar 24, 2023 at 04:30:38PM +0100, Frederik Harwath wrote:
> > this patch series implements the OpenMP 5.1 "unroll" and "tile"
> > constructs.  It includes changes to the C, C++, and Fortran front ends
> > for parsing the new constructs and a new middle-end
> > "omp_transform_loops" pass which implements the transformations in a
> > source language agnostic way.
>
> I'm afraid we can't do it this way, at least not completely.
>
> The OpenMP requirements and what is being discussed for further loop
> transformations pretty much requires parts of it to be done as soon as possible.
> My understanding is that that is where other implementations implement that
> too and would also prefer GCC not to be the only implementation that takes
> significantly different decision in that case from other implementations


The place where different compilers implement the loop transformations
was discussed in an OpenMP loop transformation meeting last year. Two 
compilers (another one and GCC with this patch series) transformed the 
loops in the middle end after the handling of data sharing, one planned 
to do so. Yet another vendor had not yet decided where it will be 
implemented. Clang currently does everything in the front end, but it 
was mentioned that this might change in the future e.g. for code sharing 
with Flang. Implementing the loop transformations late could potentially
complicate the implementation of transformations which require 
adjustments of the data sharing clauses, but this is known and 
consequentially, no such transformations are planned for OpenMP 6.0. In 
particular, the "apply" clause therefore only permits loop-transforming 
constructs to be applied to the loops generated from other loop
transformations in TR11.


> The normal loop constructs (OMP_FOR, OMP_SIMD, OMP_DISTRIBUTE, OMP_LOOP)
> already need to know given their collapse/ordered how many loops they are
> actually associated with and the loop transformation constructs can change
> that.
> So, I think we need to do the loop transformations in the FEs, that doesn't
> mean we need to write everything 3 times, once for each frontend.
> Already now, e.g. various stuff is shared between C and C++ FEs in c-family,
> though how much can be shared between c-family and Fortran is to be
> discovered.
> Or at least partially, to the extent that we compute how many canonical
> loops the loop transformations result in, what artificial iterators they
> will use etc., so that during gimplification we can take all that into
> account and then can do the actual transformations later.


The patches in this patch series already do compute how many canonical
loop nests result from the loop transformations in the front end.
This is necessary to represent the loop nest that is affected by the
loop transformations by a single OMP_FOR to meet the expectations
of all later OpenMP code transformations. This is also the major
reason why the loop transformations are represented by clauses
instead of representing them as  "OMP_UNROLL/OMP_TILE as
GENERIC constructs like OMP_FOR" as you suggest below. Since the
loop transformations may also appear on inner loops of a collapsed
loop nest (i.e. within the collapsed depth), representing the
transformation by OMP_FOR-like constructs would imply that a collapsed
loop nest would have to be broken apart into single loops. Perhaps this
could be handled somehow, but the collapsed loop nest would have to be
re-assembled to meet the expectations of e.g. gimplification.
The clause representation is also much better suited for the upcoming
OpenMP "apply" clause where the transformations will not appear
as directives in front of actual loops but inside of other clauses.
In fact, the loop transformation clauses in the implementation already
specify the level of a loop nest to which they apply and it could
be possible to re-use this handling for "apply".

My initial reaction also was to implement the loop transformations
as OMP_FOR-like constructs and the patch actually introduces an
OMP_LOOP_TRANS construct which is used to represent loops that
are not going to be associated with another OpenMP directive after
the transformation, e.g.

void foo () {
  #pragma omp tile sizes (4, 8, 16)
  for (int i = 0; i < 64; ++i)
  {
...
  }

}
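
As a rough illustration of why a tile construct changes the number of loops that later passes see, a single loop tiled by hand becomes an outer loop over tiles and an inner loop within each tile.  This is only a sketch: the function names are hypothetical, a single tile size is assumed, and it does not show what the omp_transform_loops pass actually generates.

```cpp
#include <cassert>
#include <vector>

// Iteration order of the original loop: for (i = 0; i < n; ++i).
std::vector<int> plain_order (int n)
{
  std::vector<int> order;
  for (int i = 0; i < n; ++i)
    order.push_back (i);
  return order;
}

// Hand-tiled equivalent: one source loop becomes two generated loops,
// an outer loop stepping by the tile size and an inner intra-tile loop.
std::vector<int> tiled_order (int n, int tile)
{
  assert (tile > 0);
  std::vector<int> order;
  for (int t = 0; t < n; t += tile)
    for (int i = t; i < t + tile && i < n; ++i)
      order.push_back (i);
  return order;
}
```

For a one-dimensional loop the visiting order is unchanged; the relevant point for the discussion is that a directive associated with the result has to know it now sees two loops instead of one.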

You suggest to implement the loop transformations during gimplification.
I am not sure if gimplification is actually well-suited to implement the 
depth-first evaluation of the loop transformations. I also believe that 
gimplification already handles too many things which conceptually are 
not related to the translation to GIMPLE. Having a separate pass seems 
to be the right move to achieve a better separation of concerns. I think 
this will be even more important in the future as the size of the loop 
transformation implementation keeps growing. As you mention below, 
several new constructs are already planned.



> For C, I think the lowering of loop transformation constructs or at least
> determining w

[PATCH V2] RISC-V: Add FRM and rounding mode operand into floating point intrinsics

2023-05-16 Thread juzhe . zhong
From: Juzhe-Zhong 

This patch adds a rounding mode operand and an FRM_REGNUM dependency
to floating-point instructions.

The floating-point instructions that gain the FRM and rounding mode operand:
1. vfadd/vfsub
2. vfwadd/vfwsub
3. vfmul
4. vfdiv
5. vfwmul
6. vfwmacc/vfwnmacc/vfwmsac/vfwnmsac
7. vfsqrt
8. floating-point conversions.
9. floating-point reductions.
10. floating-point ternary.

The floating-point instructions that do NOT gain the FRM and rounding mode operand:
1. vfabs/vfneg/vfsqrt7/vfrec7
2. vfmin/vfmax
3. comparisons
4. vfclass
5. vfsgnj/vfsgnjn/vfsgnjx
6. vfmerge
7. vfmv.v.f
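
For reference, the rounding mode operand carries a 3-bit frm encoding.  The sketch below decodes it using the same values that the patch's frm_field_enum lists; `toy_frm`, `FRM_DYN`, and `frm_name` are hypothetical names for illustration (the patch itself spells the last enumerator DYN), and the two unlisted encodings are treated as reserved.

```cpp
#include <cassert>
#include <string>

// Encodings as listed in the patch's frm_field_enum.
enum toy_frm : unsigned
{
  FRM_RNE = 0b000,  // round to nearest, ties to even
  FRM_RTZ = 0b001,  // round towards zero
  FRM_RDN = 0b010,  // round down
  FRM_RUP = 0b011,  // round up
  FRM_RMM = 0b100,  // round to nearest, ties to max magnitude
  FRM_DYN = 0b111   // use the dynamic rounding mode
};

// Map a raw field value to a mnemonic; anything else is reserved/invalid.
std::string frm_name (unsigned bits)
{
  switch (bits)
    {
    case FRM_RNE: return "rne";
    case FRM_RTZ: return "rtz";
    case FRM_RDN: return "rdn";
    case FRM_RUP: return "rup";
    case FRM_RMM: return "rmm";
    case FRM_DYN: return "dyn";
    default: return "reserved";
    }
}
```

Under this mapping, the const0_rtx that the patch adds as the default rounding mode operand corresponds to the 0b000 encoding.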

gcc/ChangeLog:

* config/riscv/riscv-protos.h (enum frm_field_enum): New enum.
* config/riscv/riscv-vector-builtins.cc 
(function_expander::use_ternop_insn): Add default rounding mode.
(function_expander::use_widen_ternop_insn): Ditto.
* config/riscv/riscv.cc (riscv_hard_regno_nregs): Add FRM REGNUM.
(riscv_hard_regno_mode_ok): Ditto.
(riscv_conditional_register_usage): Ditto.
* config/riscv/riscv.h (DWARF_FRAME_REGNUM): Ditto.
(FRM_REG_P): Ditto.
(RISCV_DWARF_FRM): Ditto.
* config/riscv/riscv.md: Ditto.
* config/riscv/vector-iterators.md: split no frm and has frm operations.
* config/riscv/vector.md (@pred__scalar): New pattern.
(@pred_): Ditto.

---
 gcc/config/riscv/riscv-protos.h   |  10 +
 gcc/config/riscv/riscv-vector-builtins.cc |  14 ++
 gcc/config/riscv/riscv.cc |   7 +-
 gcc/config/riscv/riscv.h  |   7 +-
 gcc/config/riscv/riscv.md |   1 +
 gcc/config/riscv/vector-iterators.md  |   9 +-
 gcc/config/riscv/vector.md| 258 ++
 7 files changed, 251 insertions(+), 55 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 835bb802fc6..12634d0ac1a 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -231,6 +231,16 @@ enum vxrm_field_enum
   VXRM_RDN,
   VXRM_ROD
 };
+/* Rounding mode bitfield for floating point FRM.  */
+enum frm_field_enum
+{
+  FRM_RNE = 0b000,
+  FRM_RTZ = 0b001,
+  FRM_RDN = 0b010,
+  FRM_RUP = 0b011,
+  FRM_RMM = 0b100,
+  DYN = 0b111
+};
 }
 
 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 1de075fb90d..b7458aaace6 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3460,6 +3460,13 @@ function_expander::use_ternop_insn (bool vd_accum_p, 
insn_code icode)
   add_input_operand (Pmode, get_tail_policy_for_pred (pred));
   add_input_operand (Pmode, get_mask_policy_for_pred (pred));
   add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
+
+  /* TODO: Currently, we don't support intrinsic that is modeling rounding 
mode.
+ We add default rounding mode for the intrinsics that didn't model rounding
+ mode yet.  */
+  if (opno != insn_data[icode].n_generator_args)
+add_input_operand (Pmode, const0_rtx);
+
   return generate_insn (icode);
 }
 
@@ -3482,6 +3489,13 @@ function_expander::use_widen_ternop_insn (insn_code 
icode)
   add_input_operand (Pmode, get_tail_policy_for_pred (pred));
   add_input_operand (Pmode, get_mask_policy_for_pred (pred));
   add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
+
+  /* TODO: Currently, we don't support intrinsic that is modeling rounding 
mode.
+ We add default rounding mode for the intrinsics that didn't model rounding
+ mode yet.  */
+  if (opno != insn_data[icode].n_generator_args)
+add_input_operand (Pmode, const0_rtx);
+
   return generate_insn (icode);
 }
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index b52e613c629..de5b87b1a87 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6082,7 +6082,8 @@ riscv_hard_regno_nregs (unsigned int regno, machine_mode 
mode)
 
   /* mode for VL or VTYPE are just a marker, not holding value,
  so it always consume one register.  */
-  if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno))
+  if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno)
+  || FRM_REG_P (regno))
 return 1;
 
   /* Assume every valid non-vector mode fits in one vector register.  */
@@ -6150,7 +6151,8 @@ riscv_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
   if (lmul != 1)
return ((regno % lmul) == 0);
 }
-  else if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno))
+  else if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno)
+  || FRM_REG_P (regno))
 return true;
   else
 return false;
@@ -6587,6 +6589,7 @@ riscv_conditional_register_usage (void)
   fixed_regs[VTYPE_REGNUM] = call_used_regs[VTYPE_REGNUM] = 1;
   fixed_regs[VL_REGNUM] = call_used_regs[VL_REGNUM] = 1;
   fixed_regs[VXRM_REGNUM] = call_used_regs[VXR

[PATCH V11] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This patch implements decrement IV for the length approach in loop control.

Address the comment from kewen: incorporate the implementation inside
"vect_set_loop_controls_directly" instead of a standalone function.

Address the comment from Richard: use MIN_EXPR to handle the following 3
cases:
1. single rgroup.
2. multiple rgroup for SLP.
3. multiple rgroup for non-SLP (tested on vec_pack_trunc).
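
For the single-rgroup case, the idea can be sketched in scalar form: the IV counts the remaining elements down, and each iteration's length is the minimum of what remains and the vectorization factor.  This is an illustration only, not the GIMPLE the vectorizer emits; `lens_per_iteration` and `vf` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Return the length used by each vector iteration when processing
// n elements with at most vf elements per iteration.
std::vector<std::size_t> lens_per_iteration (std::size_t n, std::size_t vf)
{
  assert (vf > 0);
  std::vector<std::size_t> lens;
  for (std::size_t remaining = n; remaining > 0;)
    {
      std::size_t len = std::min (remaining, vf);  // plays the role of MIN_EXPR
      lens.push_back (len);
      remaining -= len;  // the decrementing IV
    }
  return lens;
}
```

For example, 10 elements with a factor of 4 give lengths 4, 4, 2, and the loop exits when the counter reaches zero.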


gcc/ChangeLog:

* tree-vect-loop-manip.cc (vect_adjust_loop_lens): New function.
(vect_set_loop_controls_directly): Add decrement IV support.
(vect_set_loop_condition_partial_vectors): Ditto.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): New variable.
(vect_get_loop_len): Add decrement IV support.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.
(vect_get_loop_len): Add decrement IV support.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New 
test.

---
 .../rvv/autovec/partial/multiple_rgroup-1.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-1.h   | 304 ++
 .../rvv/autovec/partial/multiple_rgroup-2.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-2.h   | 546 ++
 .../autovec/partial/multiple_rgroup_run-1.c   |  19 +
 .../autovec/partial/multiple_rgroup_run-2.c   |  19 +
 gcc/tree-vect-loop-manip.cc   | 184 +-
 gcc/tree-vect-loop.cc |  37 +-
 gcc/tree-vect-stmts.cc|   9 +-
 gcc/tree-vectorizer.h |  13 +-
 10 files changed, 1132 insertions(+), 11 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
new file mode 100644
index 000..69cc3be78f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param 
riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-1.h"
+
+TEST_ALL (test_1)
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
new file mode 100644
index 000..fbc49f4855d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
@@ -0,0 +1,304 @@
+#include 
+#include 
+
+#define test_1(TYPE1, TYPE2) \
+  void __attribute__ ((noinline, noclone)) \
+  test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x, \
+                          TYPE1 x2, TYPE2 y, int n) \
+  { \
+    for (int i = 0; i < n; ++i) \
+      { \
+        f[i * 2 + 0] = x; \
+        f[i * 2 + 1] = x2; \
+        d[i] = y; \
+      } \
+  }
+
+#define run_1(TYPE1, TYPE2) \
+  int n_1_##TYPE1_##TYPE2 = 1; \
+  TYPE1 x_1_##TYPE1 = 117; \
+  TYPE1 x2_1_##TYPE1 = 232; \
+  TYPE2 y_1_##TYPE2 = 9762; \
+  TYPE1 f_1_##TYPE1[2 * 2 + 1] = {0}; \
+  TYPE2 d_1_##TYPE2[2] = {0};

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard and Richi.
I am so sorry for sending you garbage patches (my mistake: sending RISC-V
patches to you).

I finally realized that Case 2 and Case 3 are exactly the same sequence!
I have combined them into a single function called "vect_adjust_loop_lens_control".

I have sent the V11 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618724.html 

I think this patch is reasonable now!
Could you take a look at it?

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 16:30
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai"  writes:
> Hi, Richard.
>
> RVV infrastructure in RISC-V backend status:
> 1. All RVV instructions pattern related to intrinsics are all finished (They 
> will be called not only by intrinsics but also autovec in the future).
> 2. In case of autovec, we finished len_load/len_store (They are temporary 
> used and will be removed after I support len_mask_load/len_mask_store in the 
> middle-end).
>binary integer autovec patterns.
>vec_init pattern.
>That's all we have so far.
 
Thanks.
 
> In case of testing of this patch, I have multiple rgroup testcases in local, 
> you mean you want me to post them together with this patch?
> Since I am gonna to put them in RISC-V backend testsuite, I was planning to 
> post them after this patch is finished and merged into trunk.
> What do you suggest ?
 
It would be useful to include the tests with the patch itself (as a patch
to the testsuite).  It doesn't matter that the tests are riscv-specific.
 
Obviously it would be more appropriate for the riscv maintainers to
review the riscv tests.  But keeping the tests with the patch helps when
reviewing the code, and also ensures that code is committed and never
later tested.
 
Richard
 


[PATCH] aarch64: Allow moves after tied-register intrinsics (2nd edition)

2023-05-16 Thread Richard Sandiford via Gcc-patches
I missed these two in g:4ff89f10ca0d41f9cfa76 because I was
testing on a system that didn't support big-endian compilation.
Testing on aarch64_be-elf shows no other related failures
(although the overall results are worse than for little-endian).

Tested on aarch64_be-elf & pushed.

Richard


gcc/testsuite/
* gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c: Allow moves
to occur after the intrinsic instruction, rather than requiring
them to happen before.
* gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c: Likewise.
---
 .../gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c| 10 ++
 .../gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   | 10 ++
 2 files changed, 20 insertions(+)

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
index ae0a953f7b4..9975edb8fdb 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
@@ -70,8 +70,13 @@ float32x4_t ufooq_lane(float32x4_t r, bfloat16x8_t x, 
bfloat16x4_t y)
 
 /*
 **ufoo_untied:
+** (
 ** mov v0.8b, v1.8b
 ** bfdot   v0.2s, (v2.4h, v3.4h|v3.4h, v2.4h)
+** |
+** bfdot   v1.2s, (v2.4h, v3.4h|v3.4h, v2.4h)
+** mov v0.8b, v1.8b
+** )
 ** ret
 */
 float32x2_t ufoo_untied(float32x4_t unused, float32x2_t r, bfloat16x4_t x, 
bfloat16x4_t y)
@@ -81,8 +86,13 @@ float32x2_t ufoo_untied(float32x4_t unused, float32x2_t r, 
bfloat16x4_t x, bfloa
 
 /*
 **ufooq_lane_untied:
+** (
 ** mov v0.16b, v1.16b
 ** bfdot   v0.4s, v2.8h, v3.2h\[1\]
+** |
+** bfdot   v1.4s, v2.8h, v3.2h\[1\]
+** mov v0.16b, v1.16b
+** )
 ** ret
 */
 float32x4_t ufooq_lane_untied(float32x4_t unused, float32x4_t r, bfloat16x8_t 
x, bfloat16x4_t y)
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c 
b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
index 61c7c51f5ec..76787f6bedd 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
@@ -115,8 +115,13 @@ int32x4_t sfooq_laneq (int32x4_t r, int8x16_t x, 
uint8x16_t y)
 
 /*
 **ufoo_untied:
+** (
 ** mov v0\.8b, v1\.8b
 ** usdot   v0\.2s, v2\.8b, v3\.8b
+** |
+** usdot   v1\.2s, v2\.8b, v3\.8b
+** mov v0\.8b, v1\.8b
+** )
 ** ret
 */
 int32x2_t ufoo_untied (int32x2_t unused, int32x2_t r, uint8x8_t x, int8x8_t y)
@@ -126,8 +131,13 @@ int32x2_t ufoo_untied (int32x2_t unused, int32x2_t r, 
uint8x8_t x, int8x8_t y)
 
 /*
 **ufooq_laneq_untied:
+** (
 ** mov v0\.16b, v1\.16b
 ** usdot   v0\.4s, v2\.16b, v3\.4b\[3\]
+** |
+** usdot   v1\.4s, v2\.16b, v3\.4b\[3\]
+** mov v0\.16b, v1\.16b
+** )
 ** ret
 */
 int32x4_t ufooq_laneq_untied (int32x2_t unused, int32x4_t r, uint8x16_t x, 
int8x16_t y)
-- 
2.25.1



Re: [PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-05-16 Thread Jakub Jelinek via Gcc-patches
On Tue, May 16, 2023 at 11:45:16AM +0200, Frederik Harwath wrote:
> The place where different compilers implement the loop transformations
> was discussed in an OpenMP loop transformation meeting last year. Two
> compilers (another one and GCC with this patch series) transformed the loops
> in the middle end after the handling of data sharing, one planned to do so.
> Yet another vendor had not yet decided where it will be implemented. Clang
> currently does everything in the front end, but it was mentioned that this
> might change in the future e.g. for code sharing with Flang. Implementing
> the loop transformations late could potentially
> complicate the implementation of transformations which require adjustments
> of the data sharing clauses, but this is known and consequentially, no such

When already in the FE we determine how many canonical loops a particular
loop transformation creates, I think the primary changes I'd like to see is
really have OMP_UNROLL/OMP_TILE GENERIC statements (see below) and consider
where is the best spot to lower it.  I believe for data sharing it is best
done during gimplification before the containing loops are handled, it is
already shared code among all the FEs, I think will make it easier to handle
data sharing right and gimplification is also where doacross processing is
done.  While there is restriction that ordered clause is incompatible with
generated loops from tile construct, there isn't one for unroll (unless
"The ordered clause must not appear on a worksharing-loop directive if the 
associated loops
include the generated loops of a tile directive."
means unroll partial implicitly because partial unroll tiles the loop, but
it doesn't say it acts as if it was a tile construct), so we'd have to handle
#pragma omp for ordered(2)
for (int i = 0; i < 64; i++)
  #pragma omp unroll partial(4)
  for (int j = 0; j < 64; j++)
{
  #pragma omp ordered depend (sink: i - 1, j - 2)
  #pragma omp ordered depend (source)
}
and I think handling it after gimplification is going to be increasingly
harder.  Of course another possibility is ask lang committee to clarify
unless it has been clarified already in 6.0 (but in TR11 it is not).
Also, I think creating temporaries is easier to be done during
gimplification than later.
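
To illustrate why a partially unrolled loop still counts as a loop for the enclosing ordered(2), here is a hand-written sketch of unrolling by 4: the result is one loop with stride 4 plus a remainder loop.  The function names are hypothetical and this is not what GCC generates for the directive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The original loop.
long sum_plain (const std::vector<int> &v)
{
  long s = 0;
  for (std::size_t j = 0; j < v.size (); ++j)
    s += v[j];
  return s;
}

// Hand-written partial unroll by 4: a stride-4 loop with four copies of
// the body, followed by a remainder loop for the leftover iterations.
long sum_unrolled4 (const std::vector<int> &v)
{
  long s = 0;
  std::size_t j = 0;
  for (; j + 4 <= v.size (); j += 4)
    {
      s += v[j];
      s += v[j + 1];
      s += v[j + 2];
      s += v[j + 3];
    }
  for (; j < v.size (); ++j)
    s += v[j];
  return s;
}
```

Any sink offsets such as `j - 2` in the example above would have to be re-expressed in terms of the stride-4 loop, which is part of what makes the late handling awkward.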

Another option is as you implemented a separate pre-omp-lowering pass,
and another one would be do it in the omplower pass, which has actually
several subpasses internally, do it in the scan phase.  Disadvantage of
a completely separate pass is that we have to walk the whole IL again,
while doing it in the scan phase means we avoid that cost.  We already
do there similar transformations, scan_omp_simd transforms simd constructs
into if (...) simd else simt and then we process it with normal scan_omp_for
on what we've created.  So, if you insist doing it after gimplification
perhaps for compatibility with other non-LLVM compilers, I'd prefer to
do it there rather than in a completely separate pass.

> transformations are planned for OpenMP 6.0. In particular, the "apply"
> clause therefore only permits loop-transforming constructs to be applied to
> the loops generated from other loop
> transformations in TR11.
> 
> > The normal loop constructs (OMP_FOR, OMP_SIMD, OMP_DISTRIBUTE, OMP_LOOP)
> > already need to know given their collapse/ordered how many loops they are
> > actually associated with and the loop transformation constructs can change
> > that.
> > So, I think we need to do the loop transformations in the FEs, that doesn't
> > mean we need to write everything 3 times, once for each frontend.
> > Already now, e.g. various stuff is shared between C and C++ FEs in c-family,
> > though how much can be shared between c-family and Fortran is to be
> > discovered.
> > Or at least partially, to the extent that we compute how many canonical
> > loops the loop transformations result in, what artificial iterators they
> > will use etc., so that during gimplification we can take all that into
> > account and then can do the actual transformations later.
> 
> The patches in this patch series already do compute how many canonical
> loop nests result from the loop transformations in the front end.

Good.

> This is necessary to represent the loop nest that is affected by the
> loop transformations by a single OMP_FOR to meet the expectations
> of all later OpenMP code transformations. This is also the major
> reason why the loop transformations are represented by clauses
> instead of representing them as  "OMP_UNROLL/OMP_TILE as
> GENERIC constructs like OMP_FOR" as you suggest below. Since the

I really don't see why.  We try to represent what we see in the source
as OpenMP constructs as those constructs.  We already have a precedent
with composite loop constructs, where for the combined constructs which
aren't innermost we temporarily use NULL OMP_FOR_{INIT,COND,INCR,ORIG_DECLS}
vectors to stand for this will be some loop, but the details for it aren

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard. Forget about the V10 patch. Just go directly to the V11 patch.
I am so sorry that I sent V10; I originally did not notice that Case 2 and Case
3 are exactly the same.
I apologize for that. I have reviewed the V11 patch twice, and it seems much
more reasonable and easier to understand than before.

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 16:30
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai"  writes:
> Hi, Richard.
>
> RVV infrastructure in RISC-V backend status:
> 1. All RVV instructions pattern related to intrinsics are all finished (They 
> will be called not only by intrinsics but also autovec in the future).
> 2. In case of autovec, we finished len_load/len_store (They are temporary 
> used and will be removed after I support len_mask_load/len_mask_store in the 
> middle-end).
>binary integer autovec patterns.
>vec_init pattern.
>That's all we have so far.
 
Thanks.
 
> In case of testing of this patch, I have multiple rgroup testcases in local, 
> you mean you want me to post them together with this patch?
> Since I am gonna to put them in RISC-V backend testsuite, I was planning to 
> post them after this patch is finished and merged into trunk.
> What do you suggest ?
 
It would be useful to include the tests with the patch itself (as a patch
to the testsuite).  It doesn't matter that the tests are riscv-specific.
 
Obviously it would be more appropriate for the riscv maintainers to
review the riscv tests.  But keeping the tests with the patch helps when
reviewing the code, and also ensures that code is committed and never
later tested.
 
Richard
 


Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]

2023-05-16 Thread Tejas Belagod via Gcc-patches



From: Richard Sandiford 
Date: Tuesday, May 16, 2023 at 2:15 PM
To: Tejas Belagod 
Cc: gcc-patches@gcc.gnu.org 
Subject: Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]
Tejas Belagod  writes:
>> +  {
>> +int i;
>> +int nelts = vector_cst_encoded_nelts (v);
>> +int first_el = 0;
>> +
>> +for (i = first_el; i < nelts; i += step)
>> +  if (VECTOR_CST_ENCODED_ELT (v, i) != VECTOR_CST_ENCODED_ELT (v, first_el))
>
> I think this should use !operand_equal_p (..., ..., 0).
>
>
> Oops! I wonder why I thought VECTOR_CST_ENCODED_ELT returned a constant! 
> Thanks
> for spotting that.

It does only return a constant.  But there can be multiple trees with
the same constant value, through things like TREE_OVERFLOW (not sure
where things stand on expunging that from gimple) and the fact that
gimple does not maintain a distinction between different types that
have the same mode and signedness.  (E.g. on ILP32 hosts, gimple does
not maintain a distinction between int and long, even though int 0 and
long 0 are different trees.)
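
The difference between comparing node pointers and comparing constant values can be illustrated outside GCC with a toy model. This is only a sketch: `ToyType`, `ToyCst`, `toy_same_node` and `toy_operand_equal_p` are hypothetical names made up here, and the value comparison is only a loose stand-in for what `operand_equal_p (a, b, 0)` is being asked to do in the review.

```cpp
#include <cassert>

// Hypothetical stand-in for a type: a name plus the properties that the
// explanation above says gimple does not distinguish further.
struct ToyType {
  const char *name;
  unsigned precision;
  bool is_unsigned;
};

// Hypothetical stand-in for a constant tree node.
struct ToyCst {
  const ToyType *type;
  long value;
};

// Pointer comparison: true only if both operands are the very same node.
inline bool toy_same_node(const ToyCst *a, const ToyCst *b) {
  return a == b;
}

// Value comparison: equal if the types agree in precision and signedness
// and the constant values match, even when the nodes are distinct objects.
inline bool toy_operand_equal_p(const ToyCst *a, const ToyCst *b) {
  assert(a != nullptr && b != nullptr);
  return a->type->precision == b->type->precision
         && a->type->is_unsigned == b->type->is_unsigned
         && a->value == b->value;
}
```

With an ILP32-style `int` and `long` (both 32-bit and signed in this model), two zero constants are different nodes, so a `!=` on the pointers reports a mismatch while the value comparison does not.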

> Also, should the flags here be OEP_ONLY_CONST ?

Nah, just 0 should be fine.

>> + return false;
>> +
>> +return true;
>> +  }
>> +
>> +  /* Fold a svlast{a/b} call with constant predicate to a BIT_FIELD_REF.
>> + BIT_FIELD_REF lowers to a NEON element extract, so we have to make sure
>> + the index of the element being accessed is in the range of a NEON vector
>> + width.  */
>
> s/NEON/Advanced SIMD/.  Same in later comments
>
>> +  gimple *fold (gimple_folder & f) const override
>> +  {
>> +tree pred = gimple_call_arg (f.call, 0);
>> +tree val = gimple_call_arg (f.call, 1);
>> +
>> +if (TREE_CODE (pred) == VECTOR_CST)
>> +  {
>> + HOST_WIDE_INT pos;
>> + unsigned int const_vg;
>> + int i = 0;
>> + int step = f.type_suffix (0).element_bytes;
>> + int step_1 = gcd (step, VECTOR_CST_NPATTERNS (pred));
>> + int npats = VECTOR_CST_NPATTERNS (pred);
>> + unsigned HOST_WIDE_INT nelts = vector_cst_encoded_nelts (pred);
>> + tree b = NULL_TREE;
>> + bool const_vl = aarch64_sve_vg.is_constant (&const_vg);
>
> I think this might be left over from previous versions, but:
> const_vg isn't used and const_vl is only used once, so I think it
> would be better to remove them.
>
>> +
>> + /* We can optimize 2 cases common to variable and fixed-length cases
>> +without a linear search of the predicate vector:
>> +1.  LASTA if predicate is all true, return element 0.
>> +2.  LASTA if predicate all false, return element 0.  */
>> + if (is_lasta () && vect_all_same (pred, step_1))
>> +   {
>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>> + bitsize_int (step * BITS_PER_UNIT), bitsize_int (0));
>> + return gimple_build_assign (f.lhs, b);
>> +   }
>> +
>> + /* Handle the all-false case for LASTB where SVE VL == 128b -
>> +return the highest numbered element.  */
>> + if (is_lastb () && known_eq (BYTES_PER_SVE_VECTOR, 16)
>> + && vect_all_same (pred, step_1)
>> + && integer_zerop (VECTOR_CST_ENCODED_ELT (pred, 0)))
>
> Formatting nit: one condition per line once one line isn't enough.
>
>> +   {
>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>> + bitsize_int (step * BITS_PER_UNIT),
>> + bitsize_int ((16 - step) * BITS_PER_UNIT));
>> +
>> + return gimple_build_assign (f.lhs, b);
>> +   }
>> +
>> + /* If VECTOR_CST_NELTS_PER_PATTERN (pred) == 2 and every multiple of
>> +'step_1' in
>> +[VECTOR_CST_NPATTERNS .. VECTOR_CST_ENCODED_NELTS - 1]
>> +is zero, then we can treat the vector as VECTOR_CST_NPATTERNS
>> +elements followed by all inactive elements.  */
>> + if (!const_vl && VECTOR_CST_NELTS_PER_PATTERN (pred) == 2)
>
> Following on from the above, maybe use:
>
>   !VECTOR_CST_NELTS (pred).is_constant ()
>
> instead of !const_vl here.
>
> I have a horrible suspicion that I'm contradicting our earlier discussion
> here, sorry, but: I think we have to return null if NELTS_PER_PATTERN != 2.
>
>
>
> IIUC, the NPATTERNS .. ENCODED_ELTS represent the repeated part of the encoded
> constant. This means the repetition occurs if NELTS_PER_PATTERN == 2, IOW the
> base1 repeats in the encoding. This loop is checking this condition and looks
> for a 1 in the repeated part of the NELTS_PER_PATTERN == 2 in a VL vector.
> Please correct me if I’m misunderstanding here.

NELTS_PER_PATTERN == 1 is also a repeating pattern: it means that the
entire sequence is repeated to fill a vector.  So if an NELTS_PER_PATTERN
== 1 constant has elements {0, 1, 0, 0}, the vector is:

   {0, 1, 0, 0, 0, 1, 0, 0, ...}

Wouldn’t the vect_all_same(pred, step) cover this case for a given value of 
step?

and the optimisation can't handle that.  NELTS_PER_PATTERN == 3 isn't
likely to occur for predicates, but in principle it has t

Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]

2023-05-16 Thread Richard Sandiford via Gcc-patches
Tejas Belagod  writes:
>>> +   {
>>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>>> + bitsize_int (step * BITS_PER_UNIT),
>>> + bitsize_int ((16 - step) * BITS_PER_UNIT));
>>> +
>>> + return gimple_build_assign (f.lhs, b);
>>> +   }
>>> +
>>> + /* If VECTOR_CST_NELTS_PER_PATTERN (pred) == 2 and every multiple of
>>> +'step_1' in
>>> +[VECTOR_CST_NPATTERNS .. VECTOR_CST_ENCODED_NELTS - 1]
>>> +is zero, then we can treat the vector as VECTOR_CST_NPATTERNS
>>> +elements followed by all inactive elements.  */
>>> + if (!const_vl && VECTOR_CST_NELTS_PER_PATTERN (pred) == 2)
>>
>> Following on from the above, maybe use:
>>
>>   !VECTOR_CST_NELTS (pred).is_constant ()
>>
>> instead of !const_vl here.
>>
>> I have a horrible suspicion that I'm contradicting our earlier discussion
>> here, sorry, but: I think we have to return null if NELTS_PER_PATTERN != 2.
>>
>> 
>>
>> IIUC, the NPATTERNS .. ENCODED_ELTS represent the repeated part of the encoded
>> constant. This means the repetition occurs if NELTS_PER_PATTERN == 2, IOW the
>> base1 repeats in the encoding. This loop is checking this condition and looks
>> for a 1 in the repeated part of the NELTS_PER_PATTERN == 2 in a VL vector.
>> Please correct me if I’m misunderstanding here.
>
> NELTS_PER_PATTERN == 1 is also a repeating pattern: it means that the
> entire sequence is repeated to fill a vector.  So if an NELTS_PER_PATTERN
> == 1 constant has elements {0, 1, 0, 0}, the vector is:
>
>{0, 1, 0, 0, 0, 1, 0, 0, ...}
>
>
> Wouldn’t the vect_all_same(pred, step) cover this case for a given value of
> step?
>
>
> and the optimisation can't handle that.  NELTS_PER_PATTERN == 3 isn't
> likely to occur for predicates, but in principle it has the same problem.
>
>  
>
> OK, I had misunderstood the encoding to always make base1 the repeating value
> by adjusting the NPATTERNS accordingly – I didn’t know you could also have the
> base2 value and beyond encoding the repeat value. In this case could I just
> remove NELTS_PER_PATTERN == 2 condition and the enclosed loop would check for a
> repeating ‘1’ in the repeated part of the encoded pattern?

But for NELTS_PER_PATTERN==1, the whole encoded sequence repeats.
So you would have to start the check at element 0 rather than
NPATTERNS.  And then (for NELTS_PER_PATTERN==1) the loop would reject
any constant that has a nonzero element.  But all valid zero-vector
cases have been handled by this point, so the effect wouldn't be useful.

It should never be the case that all elements from NPATTERNS
onwards are zero for NELTS_PER_PATTERN==3; that case should be
canonicalised to NELTS_PER_PATTERN==2 instead.

So in practice it's simpler and more obviously correct to punt
when NELTS_PER_PATTERN != 2.
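
The three NELTS_PER_PATTERN cases discussed above can be sketched with a small stand-alone model. This is not GCC code: `ToyVectorEncoding` and `toy_vector_elt` are hypothetical names, elements are plain integers, and the expansion rule is written from the description of the encoding (interleaved patterns; one encoded element per pattern repeats, two means the second repeats, three means a linear series).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a VECTOR_CST-style encoding for integer elements.
// `encoded` holds npatterns * nelts_per_pattern values, interleaved:
// encoded[k * npatterns + p] is the k-th encoded element of pattern p.
struct ToyVectorEncoding {
  unsigned npatterns;
  unsigned nelts_per_pattern;  // 1, 2 or 3
  std::vector<long> encoded;
};

// Return element I of the full (conceptually unbounded) vector.
inline long toy_vector_elt(const ToyVectorEncoding &v, std::size_t i) {
  assert(v.nelts_per_pattern >= 1 && v.nelts_per_pattern <= 3);
  assert(v.encoded.size()
         == static_cast<std::size_t>(v.npatterns) * v.nelts_per_pattern);

  // Elements that are encoded explicitly are returned as-is.
  if (i < v.encoded.size())
    return v.encoded[i];

  // Otherwise extend the pattern that element I belongs to.
  std::size_t pattern = i % v.npatterns;
  std::size_t count = i / v.npatterns;
  std::size_t final_idx = v.encoded.size() - v.npatterns + pattern;
  long last = v.encoded[final_idx];

  // One or two encoded elements per pattern: the last one repeats.
  if (v.nelts_per_pattern < 3)
    return last;

  // Three encoded elements per pattern: continue the linear series.
  long prev = v.encoded[final_idx - v.npatterns];
  return last + static_cast<long>(count - 2) * (last - prev);
}
```

In this model, a NELTS_PER_PATTERN == 1 constant with encoded elements {0, 1, 0, 0} keeps producing a 1 in every fourth position, which is why treating "everything from NPATTERNS onwards" as inactive is only sound for NELTS_PER_PATTERN == 2.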

Thanks,
Richard


RE: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Li, Pan2 via Gcc-patches
Thanks Richard Sandiford for review.

Yes, currently the class access_info will be extended from 8 bytes to 12 bytes, 
which is missing from the table. With the adjustment you suggested it will stay at 8 
bytes, but unfortunately the change to m_kind may trigger an ICE in some test 
case(s).

I will take a look into it and keep you posted.

Pan

-----Original Message-----
From: Richard Sandiford  
Sent: Tuesday, May 16, 2023 5:09 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang ; jeffreya...@gmail.com; rguent...@suse.de
Subject: Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

pan2...@intel.com writes:
> diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
> index c5180b9308a..38b4d6160c2 100644
> --- a/gcc/rtl-ssa/accesses.h
> +++ b/gcc/rtl-ssa/accesses.h
> @@ -254,7 +254,7 @@ private:
>unsigned int m_spare : 2;
>  
>// The value returned by the accessor above.
> -  machine_mode m_mode : 8;
> +  machine_mode m_mode : MACHINE_MODE_BITSIZE;
>  };
>  
>  // A contiguous array of access_info pointers.  Used to represent a

This structure (access_info) isn't mentioned in the table in the patch 
description.  The structure is currently 1 LP64 word and is very 
size-sensitive.  I think we should:

- Put the mode after m_regno
- Reduce m_kind to 2 bits
- Remove m_spare

I *think* that will keep the current size, but please check.
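
The size claim can be sanity-checked with a toy layout before touching the real class. The struct below is hypothetical and is NOT the real rtl_ssa::access_info: the field names, the 14 flag bits and the helper `toy_make_access` are assumptions made for illustration, and bit-field layout is implementation-defined, so the check only holds on common ABIs such as LP64 Linux.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: a 32-bit register number followed by bit-fields
// that together fill exactly one more 32-bit unit.
struct ToyAccessInfo {
  std::uint32_t regno;
  std::uint32_t mode : 16;   // widened mode field, placed right after regno
  std::uint32_t kind : 2;    // narrowed kind field
  std::uint32_t flags : 14;  // remaining flag bits (illustrative count)
};

// 16 + 2 + 14 == 32, so the struct is expected to stay at one 64-bit word.
static_assert(sizeof(ToyAccessInfo) == 8,
              "expected 8 bytes on common ABIs");

// Build a value, checking that the inputs fit in their bit-fields.
inline ToyAccessInfo toy_make_access(std::uint32_t regno, unsigned mode,
                                     unsigned kind) {
  assert(mode < (1u << 16));
  assert(kind < (1u << 2));
  ToyAccessInfo info{};
  info.regno = regno;
  info.mode = mode;
  info.kind = kind;
  info.flags = 0;
  return info;
}
```

A `static_assert` of this kind next to the real definition would make any accidental growth of a size-sensitive structure a compile-time error rather than something noticed later.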

LGTM otherwise.

Thanks,
Richard


Re: [PATCH] rtl: AArch64: New RTL for ABD

2023-05-16 Thread Richard Sandiford via Gcc-patches
Sorry for the slow reply.

Oluwatamilore Adebayo  writes:
> From afa416dab831795f7e1114da2fb9e94ea3b8c519 Mon Sep 17 00:00:00 2001
> From: oluade01 
> Date: Fri, 14 Apr 2023 15:10:07 +0100
> Subject: [PATCH 2/4] AArch64: New RTL for ABD
>
> This patch adds new RTL and tests for sabd and uabd
>
> PR tree-optimization/109156
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-simd-builtins.def (sabd, uabd):
> Change the mode to 3.
> * config/aarch64/aarch64-simd.md (aarch64_abd):
> Rename to abd3.
> * config/aarch64/aarch64-sve.md (abd_3): Rename
> to abd3.

Thanks.  These changes look good, once the vectoriser part is sorted,
but I have some comments about the tests:

> diff --git a/gcc/testsuite/gcc.target/aarch64/abd.h 
> b/gcc/testsuite/gcc.target/aarch64/abd.h
> new file mode 100644
> index 
> ..bc38e8508056cf2623cddd6053bf1cec3fa4ece4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/abd.h
> @@ -0,0 +1,62 @@
> +#ifdef ABD_IDIOM
> +
> +#define TEST1(S, TYPE) \
> +void fn_##S##_##TYPE (S TYPE * restrict a, \
> + S TYPE * restrict b,  \
> + S TYPE * restrict out) {  \
> +  for (int i = 0; i < N; i++) {\
> +signed TYPE diff = b[i] - a[i];\
> +out[i] = diff > 0 ? diff : -diff;  \
> +} }
> +
> +#define TEST2(S, TYPE1, TYPE2) \
> +void fn_##S##_##TYPE1##_##TYPE1##_##TYPE2  \
> +(S TYPE1 * restrict a, \
> + S TYPE1 * restrict b, \
> + S TYPE2 * restrict out) { \
> +  for (int i = 0; i < N; i++) {\
> +signed TYPE2 diff = b[i] - a[i];   \
> +out[i] = diff > 0 ? diff : -diff;  \
> +} }
> +
> +#define TEST3(S, TYPE1, TYPE2, TYPE3)  \
> +void fn_##S##_##TYPE1##_##TYPE2##_##TYPE3  \
> +(S TYPE1 * restrict a, \
> + S TYPE2 * restrict b, \
> + S TYPE3 * restrict out) { \
> +  for (int i = 0; i < N; i++) {\
> +signed TYPE3 diff = b[i] - a[i];   \
> +out[i] = diff > 0 ? diff : -diff;  \
> +} }
> +
> +#endif
> +
> +#ifdef ABD_ABS
> +
> +#define TEST1(S, TYPE) \
> +void fn_##S##_##TYPE (S TYPE * restrict a, \
> + S TYPE * restrict b,  \
> + S TYPE * restrict out) {  \
> +  for (int i = 0; i < N; i++)  \
> +out[i] = __builtin_abs(a[i] - b[i]);   \
> +}
> +
> +#define TEST2(S, TYPE1, TYPE2) \
> +void fn_##S##_##TYPE1##_##TYPE1##_##TYPE2  \
> +(S TYPE1 * restrict a, \
> + S TYPE1 * restrict b, \
> + S TYPE2 * restrict out) { \
> +  for (int i = 0; i < N; i++)  \
> +out[i] = __builtin_abs(a[i] - b[i]);   \
> +}
> +
> +#define TEST3(S, TYPE1, TYPE2, TYPE3)  \
> +void fn_##S##_##TYPE1##_##TYPE2##_##TYPE3  \
> +(S TYPE1 * restrict a, \
> + S TYPE2 * restrict b, \
> + S TYPE3 * restrict out) { \
> +  for (int i = 0; i < N; i++)  \
> +out[i] = __builtin_abs(a[i] - b[i]);   \
> +}
> +
> +#endif

It would be good to mark all of these functions with __attribute__((noipa)),
since I think interprocedural optimisations might otherwise defeat the
runtime test in abd_run_1.c (in the sense that we might end up folding
things at compile time and not testing the vector versions of the functions).
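
As a rough illustration of the suggestion, here is a stand-alone kernel in the style of TEST1 (signed, int) with the attribute applied. It is a sketch rather than the patch's header: it is written as C++ (hence `__restrict` instead of `restrict`), and `TOY_NOIPA` and `toy_abd_int` are names invented here. The macro falls back to nothing on compilers that are not GCC, on the assumption that they may not know `noipa`.

```cpp
#include <cassert>
#include <cstdlib>

// noipa is a GCC function attribute; expand to nothing elsewhere.
#if defined(__GNUC__) && !defined(__clang__)
#define TOY_NOIPA __attribute__((noipa))
#else
#define TOY_NOIPA
#endif

// Absolute difference of two int arrays.  With noipa, a caller in the same
// translation unit cannot have this call inlined, cloned or constant-folded
// away, so a runtime test really exercises the (possibly vectorised) body.
TOY_NOIPA void toy_abd_int(const int *__restrict a, const int *__restrict b,
                           int *__restrict out, int n) {
  assert(n >= 0);
  for (int i = 0; i < n; i++)
    out[i] = std::abs(a[i] - b[i]);
}
```

In the header itself the attribute would go on each function generated by the TEST1/TEST2/TEST3 macros.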

> diff --git a/gcc/testsuite/gcc.target/aarch64/abd_2.c 
> b/gcc/testsuite/gcc.target/aarch64/abd_2.c
> new file mode 100644
> index 
> ..45bcfabe05a395f6775f78f28c73eb536ba5654e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/abd_2.c
> @@ -0,0 +1,34 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +#pragma GCC target "+nosve"
> +#define N 1024
> +
> +#define ABD_ABS
> +#include "abd.h"
> +
> +TEST1(signed, int)
> +TEST1(signed, short)
> +TEST1(signed, char)
> +
> +TEST2(signed, char, int)
> +TEST2(signed, char, short)
> +
> +TEST3(signed, char, int, short)
> +TEST3(signed, char, short, int)
> +
> +TEST1(unsigned, int)
> +TEST1(unsigned, short)
> +TEST1(unsigned, char)
> +
> +TEST2(unsigned, char, int)
> +TEST2(unsigned, char, short)
> +
> +TEST3(unsigned, char, int, short)
> +TEST3(unsigned, char, short, int)
> +
> +/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, 
> v\[0-9\]+\.4s" 2 } } */
> +/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, 
> v\[0-9\]+\.8h" 1 } } */
> +/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.16b, v\[0-9\]+\.16b, 
> v\[0-9\]+\.16b" 1 } } */
> +/* { dg-final { scan-assembler-times "uabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, 
> v\

Re: [PATCH v4 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces.

2023-05-16 Thread Ajit Agarwal via Gcc-patches



On 29/04/23 5:03 am, Jeff Law wrote:
> 
> 
> On 4/28/23 16:42, Hans-Peter Nilsson wrote:
>> On Sat, 22 Apr 2023, Ajit Agarwal via Gcc-patches wrote:
>>
>>> Hello All:
>>>
>>> This new version of patch 4 improves the ree pass for the rs6000 target using 
>>> defined ABI interfaces.
>>> Bootstrapped and regtested on power64-linux-gnu.
>>>
>>> Thanks & Regards
>>> Ajit
>>>
>>>
>>> ree: Improve ree pass for rs6000 target using defined abi interfaces
>>>
>>>  For rs6000 target we see redundant zero and sign
>>>  extension and done to improve ree pass to eliminate
>>>  such redundant zero and sign extension using defines
>>>  ABI interfaces.
>>>
>>>  2023-04-22  Ajit Kumar Agarwal  
>>>
>>> gcc/ChangeLog:
>>>
>>>  * ree.cc (combine_reaching_defs): Add zero_extend
>>>  using defined abi interfaces.
>>>  (add_removable_extension): use of defined abi interfaces
>>>  for no reaching defs.
>>>  (abi_extension_candidate_return_reg_p): New defined ABI function.
>>>  (abi_extension_candidate_p): New defined ABI function.
>>>  (abi_extension_candidate_argno_p): New defined ABI function.
>>>  (abi_handle_regs_without_defs_p): New defined ABI function.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>  * g++.target/powerpc/zext-elim-3.C
>>> ---
>>>   gcc/ree.cc    | 176 +++---
>>>   .../g++.target/powerpc/zext-elim-3.C  |  16 ++
>>>   2 files changed, 162 insertions(+), 30 deletions(-)
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C
>>>
>>> diff --git a/gcc/ree.cc b/gcc/ree.cc
>>> index 413aec7c8eb..0de96b1ece1 100644
>>> --- a/gcc/ree.cc
>>> +++ b/gcc/ree.cc
>>> @@ -473,7 +473,8 @@ get_defs (rtx_insn *insn, rtx reg, vec 
>>> *dest)
>>>   break;
>>>   }
>>>   -  gcc_assert (use != NULL);
>>> +  if (use == NULL)
>>> +    return NULL;
>>>       ref_chain = DF_REF_CHAIN (use);
>>>   @@ -514,7 +515,8 @@ get_uses (rtx_insn *insn, rtx reg)
>>>   if (REGNO (DF_REF_REG (def)) == REGNO (reg))
>>>     break;
>>>   -  gcc_assert (def != NULL);
>>> +  if (def == NULL)
>>> +    return NULL;
>>>       ref_chain = DF_REF_CHAIN (def);
>>>   @@ -750,6 +752,103 @@ get_extended_src_reg (rtx src)
>>>     return src;
>>>   }
>>>   +/* Return TRUE if the candidate insn is zero extend and regno is
>>> +   an return  registers.  */
>>> +
>>> +static bool
>>> +abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
>>> +{
>>> +  rtx set = single_set (insn);
>>> +
>>> +  if (GET_CODE (SET_SRC (set)) !=  ZERO_EXTEND)
>>> +    return false;
>>> +
>>> +  if (FUNCTION_VALUE_REGNO_P (regno))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>>> +
>>> +/* Return TRUE if reg source operand of zero_extend is argument registers
>>> +   and not return registers and source and destination operand are same
>>> +   and mode of source and destination operand are not same.  */
>>> +
>>> +static bool
>>> +abi_extension_candidate_p (rtx_insn *insn)
>>> +{
>>> +  rtx set = single_set (insn);
>>> +
>>> +  if (GET_CODE (SET_SRC (set)) !=  ZERO_EXTEND)
>>> +    return false;
>>> +
>>> +  machine_mode ext_dst_mode = GET_MODE (SET_DEST (set));
>>> +  rtx orig_src = XEXP (SET_SRC (set),0);
>>> +
>>> +  bool copy_needed
>>> +    = (REGNO (SET_DEST (set)) != REGNO (XEXP (SET_SRC (set), 0)));
>>> +
>>> +  if (!copy_needed && ext_dst_mode != GET_MODE (orig_src)
>>> +  && FUNCTION_ARG_REGNO_P (REGNO (orig_src))
>>> +  && !abi_extension_candidate_return_reg_p (insn, REGNO (orig_src)))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>>> +
>>> +/* Return TRUE if the candidate insn is zero extend and regno is
>>> +   an argument registers.  */
>>> +
>>> +static bool
>>> +abi_extension_candidate_argno_p (rtx_code code, int regno)
>>> +{
>>> +  if (code !=  ZERO_EXTEND)
>>> +    return false;
>>> +
>>> +  if (FUNCTION_ARG_REGNO_P (regno))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>>
>> I don't see anything in those functions that checks if
>> ZERO_EXTEND is actually a feature of the ABI, e.g. as opposed to
>> no extension or SIGN_EXTEND.  Do I miss something?
> I don't think you missed anything.  That was one of the points I was making 
> last week.  Somewhere, somehow we need to describe what the ABI mandates and 
> guarantees.
> 
> So while what Ajit has done is a step forward, at some point the actual 
> details of the ABI need to be described in a way that can be checked and 
> consumed by REE.


The ABI information we need for the ree pass is the set of argument registers and 
return registers. Based on that I have described the interfaces that we need. Other 
than that we don't need any other ABI hooks. I have used the FUNCTION_VALUE_REGNO_P 
and FUNCTION_ARG_REGNO_P ABI hooks.

Thanks & Regards
Ajit
> 
> Jeff


[GCC12 backport] arm: MVE testsuite and backend bugfixes

2023-05-16 Thread Stamatis Markianos-Wright via Gcc-patches

Hi all,

We've recently sent up a lot of patches overhauling the testsuite of the 
Arm MVE backend.
With these changes, we've also identified and fixed a number of bugs 
(some backend bugs and many to do with the polymorphism of intrinsics in 
the MVE header file).

These would all be relevant to backport to GCC12.
The list is as follows (in the order they all apply on top of each other):

* This patch series: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606552.html 
(commits 9a79b522e0663a202a288db56ebcbdcdb48bdaca to 
f2b54e5b796b00f0072b61f9cd6a964c66ead29b)

* ecc363971aeac52481d92de8b37521f6cc2d38e6 arm: Fix MVE testsuite fallouts
* 06aa66af7d0dacc1b247d9e38175e789ef159191 arm: Add missing early 
clobber to MVE vrev64q_m patterns
* c09663eabfb84ac56ddd8d44abcab3f4902c83bd testsuite: [arm] Relax 
expected register names in MVE tests
* 330d665ce6dcc63ed0bd78d807e69bbfc55255b6 arm: [MVE] Add missing 
length=8 attribute
* 8d4f007398bc3f8fea812fb8cff4d7d0556d12f1 arm: fix mve intrinsics scan 
body tests for C++
* This patch series 
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/610312.html 
(commits dd4424ef898608321b60610c4f3c98737ace3680 to 
267f01a493ab8a0bec9325ce3386b946c46f2e98)
* 8a1360e72d6c6056606aa5edd8c906c50f26de59 arm: Split up MVE _Generic 
associations to prevent type clashes [PR107515]

* 3f0ca7a3e4431534bff3b8eb73709cc822e489b0 arm: Fix vcreate definition
* c1093923733a1072a237f112e3239b5ebd88eadd arm: Make MVE masked stores 
read memory operand [PR 108177]
* f54e31ddefe3ea7146624eabcb75b1c90dc59f1a arm: fix __arm_vld1q_z* and 
__arm_vst1q_p* intrinsics [PR108442]
* 1d509f190393627cdf0afffc427b25dd21c2 arm: remove unused variables 
from test


-- up to this point everything applied cleanly. The final two need minor 
rebasing changes --


* This patch series: 
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/617008.html (Not 
pushed to trunk yet, but has been approved. For trunk we do now need to 
resolve some merge conflicts, since Christophe has started merging the 
MVE Intrinsic Restructuring, but these are trivial. I will also backport 
to GCC13 where this patch series applies cleanly)
* cfa118fc089e38a94ec60ccf5b667aea015e5f60 [arm] complete vmsr/vmrs 
blank and case adjustments.


The final one is a commit from Alexandre Oliva that is needed to ensure 
that we don't accidentally regress the test due to the tabs vs spaces 
and capitalisation on the vmrs/vmsr instructions :)


After all that, no regressions on baremetal arm-none-eabi in a bunch of 
configurations (-marm, thumb1, thumb2, MVE, MVE.FP, softfp and hardfp).


Thanks,
Stam



Re: [PATCH] RFC: New compact syntax for insn and insn_split in Machine Descriptions

2023-05-16 Thread Richard Earnshaw (lists) via Gcc-patches

On 24/04/2023 09:33, Richard Sandiford via Gcc-patches wrote:

Richard Sandiford  writes:

Tamar Christina  writes:

Hi All,

This patch adds support for a compact syntax for specifying constraints in
instruction patterns. Credit for the idea goes to Richard Earnshaw.

I am sending up this RFC to get feedback for its inclusion in GCC 14.
With this new syntax we want a clean break from the current limitations to make
something that is hopefully easier to use and maintain.

The idea behind this compact syntax is that it is often quite hard to
correlate the entries in the constraints list, attributes and instruction lists.

One has to count, and this is often tedious.  Additionally, when changing a single
line in the insn, multiple lines in a diff change, making it harder to see what's
going on.

This new syntax takes into account many of the common things that are done in MD
files.   It's also worth saying that this version is intended to deal with the
common case of string-based alternatives.   For C chunks we have some ideas
but those are not intended to be addressed here.

It's easiest to explain with an example:

normal syntax:

(define_insn_and_split "*movsi_aarch64"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r,w, m, m,  r,  r,  r, w,r,w, w")
(match_operand:SI 1 "aarch64_mov_operand"  " r,r,k,M,n,Usv,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
   "@
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %1
#
* return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
ldr\\t%w0, %1
ldr\\t%s0, %1
str\\t%w1, %0
str\\t%s1, %0
adrp\\t%x0, %A1\;ldr\\t%w0, [%x0, %L1]
adr\\t%x0, %c1
adrp\\t%x0, %A1
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1
* return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);"
   "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)
 && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
[(const_int 0)]
"{
aarch64_expand_mov_immediate (operands[0], operands[1]);
DONE;
 }"
   ;; The "mov_imm" type for CNT is just a placeholder.
   [(set_attr "type" "mov_reg,mov_reg,mov_reg,mov_imm,mov_imm,mov_imm,load_4,
load_4,store_4,store_4,load_4,adr,adr,f_mcr,f_mrc,fmov,neon_move")
(set_attr "arch"   "*,*,*,*,*,sve,*,fp,*,fp,*,*,*,fp,fp,fp,simd")
(set_attr "length" "4,4,4,4,*,  4,4, 4,4, 4,8,4,4, 4, 4, 4,   4")
]
)

New syntax:

(define_insn_and_split "*movsi_aarch64"
   [(set (match_operand:SI 0 "nonimmediate_operand")
(match_operand:SI 1 "aarch64_mov_operand"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
   "@@ (cons: 0 1; attrs: type arch length)
[=r, r  ; mov_reg  , *   , 4] mov\t%w0, %w1
[k , r  ; mov_reg  , *   , 4] ^
[r , k  ; mov_reg  , *   , 4] ^
[r , M  ; mov_imm  , *   , 4] mov\t%w0, %1
[r , n  ; mov_imm  , *   , *] #
[r , Usv; mov_imm  , sve , 4] << aarch64_output_sve_cnt_immediate ('cnt', '%x0', operands[1]);
[r , m  ; load_4   , *   , 4] ldr\t%w0, %1
[w , m  ; load_4   , fp  , 4] ldr\t%s0, %1
[m , rZ ; store_4  , *   , 4] str\t%w1, %0
[m , w  ; store_4  , fp  , 4] str\t%s1, %0
[r , Usw; load_4   , *   , 8] adrp\t%x0, %A1;ldr\t%w0, [%x0, %L1]
[r , Usa; adr  , *   , 4] adr\t%x0, %c1
[r , Ush; adr  , *   , 4] adrp\t%x0, %A1
[w , rZ ; f_mcr, fp  , 4] fmov\t%s0, %w1
[r , w  ; f_mrc, fp  , 4] fmov\t%w0, %s1
[w , w  ; fmov , fp  , 4] fmov\t%s0, %s1
[w , Ds ; neon_move, simd, 4] << aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);"
   "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), SImode)
 && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
   [(const_int 0)]
   {
 aarch64_expand_mov_immediate (operands[0], operands[1]);
 DONE;
   }
   ;; The "mov_imm" type for CNT is just a placeholder.
)

The patch contains some more rewritten examples for both Arm and AArch64.  I
have included them for examples in this RFC but the final version posted in
GCC 14 will have these split out.

The main syntax rules are as follows (See docs for full rules):
   - Template must start with "@@" to use the new syntax.
   - "@@" is followed by a layout in parentheses which is "cons:" followed by
 a list of match_operand/match_scratch IDs, then a semicolon, then the
 same for attributes ("attrs:"). Both sections are optional (so you can
 use only cons, or only attrs, or both), and cons must come before attrs
 if present.
   - Each alternative begins with any amount of whitespace.
   - Following the whitespace is a comma-separated list of constraints and/or
 attributes within brackets [], with sections separated by a semicolon.
   - Following the closing ']' is any amount of whit

Remove stale Autoconf checks for Perl

2023-05-16 Thread Thomas Schwinge
Hi!

OK to push the attached "Remove stale Autoconf checks for Perl"?


For avoidance of doubt, there still exist a few instances of Perl usage
in the GCC build process (like, when 'contrib/make_sunver.pl' is used),
but those always directly invoke 'perl'.  As this, apparently, is working
fine, I'm not proposing changing those to now use Autoconf-determined
Perl.


Regards
 Thomas


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
Munich; limited liability company (GmbH); Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: Munich; Commercial register: 
Munich, HRB 106955
From e86eabae296a9153a1d02b1ed8cafda1b70485a6 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 16 May 2023 12:00:37 +0200
Subject: [PATCH] Remove stale Autoconf checks for Perl

Subversion r110220 (Git commit 03b8fe495d716c004f5491eb2347537f115ab2d8) for
PR25884 "libgomp should not require perl to compile" removed all '$(PERL)'
usage from libgomp -- but didn't remove the then-unused Autoconf Perl check
itself.  Later, this Autoconf Perl check appears to have been copied from
libgomp into other GCC libraries, likewise unused.

	libgomp/
	* configure.ac (PERL): Remove.
	* configure: Regenerate.
	* Makefile.in: Likewise.
	* testsuite/Makefile.in: Likewise.
	libatomic/
	* configure.ac (PERL): Remove.
	* configure: Regenerate.
	* Makefile.in: Likewise.
	* testsuite/Makefile.in: Likewise.
	libgm2/
	* configure.ac (PERL): Remove.
	* configure: Regenerate.
	* Makefile.in: Likewise.
	* libm2cor/Makefile.in: Likewise.
	* libm2iso/Makefile.in: Likewise.
	* libm2log/Makefile.in: Likewise.
	* libm2min/Makefile.in: Likewise.
	* libm2pim/Makefile.in: Likewise.
	libitm/
	* configure.ac (PERL): Remove.
	* configure: Regenerate.
	* Makefile.in: Likewise.
	* testsuite/Makefile.in: Likewise.
---
 libatomic/Makefile.in   |  1 -
 libatomic/configure | 46 ++---
 libatomic/configure.ac  |  1 -
 libatomic/testsuite/Makefile.in |  1 -
 libgm2/Makefile.in  |  1 -
 libgm2/configure| 46 ++---
 libgm2/configure.ac |  1 -
 libgm2/libm2cor/Makefile.in |  1 -
 libgm2/libm2iso/Makefile.in |  1 -
 libgm2/libm2log/Makefile.in |  1 -
 libgm2/libm2min/Makefile.in |  1 -
 libgm2/libm2pim/Makefile.in |  1 -
 libgomp/Makefile.in |  1 -
 libgomp/configure   | 46 ++---
 libgomp/configure.ac|  1 -
 libgomp/testsuite/Makefile.in   |  1 -
 libitm/Makefile.in  |  1 -
 libitm/configure| 46 ++---
 libitm/configure.ac |  1 -
 libitm/testsuite/Makefile.in|  1 -
 20 files changed, 8 insertions(+), 192 deletions(-)

diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index a0fa3dfc8cc..83efe7d2694 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -321,7 +321,6 @@ PACKAGE_TARNAME = @PACKAGE_TARNAME@
 PACKAGE_URL = @PACKAGE_URL@
 PACKAGE_VERSION = @PACKAGE_VERSION@
 PATH_SEPARATOR = @PATH_SEPARATOR@
-PERL = @PERL@
 RANLIB = @RANLIB@
 SECTION_LDFLAGS = @SECTION_LDFLAGS@
 SED = @SED@
diff --git a/libatomic/configure b/libatomic/configure
index e47d2d7fb35..1994662b7c5 100755
--- a/libatomic/configure
+++ b/libatomic/configure
@@ -680,7 +680,6 @@ EGREP
 GREP
 SED
 LIBTOOL
-PERL
 RANLIB
 NM
 AR
@@ -4869,47 +4868,6 @@ else
   RANLIB="$ac_cv_prog_RANLIB"
 fi
 
-# Extract the first word of "perl", so it can be a program name with args.
-set dummy perl; ac_word=$2
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
-$as_echo_n "checking for $ac_word... " >&6; }
-if ${ac_cv_path_PERL+:} false; then :
-  $as_echo_n "(cached) " >&6
-else
-  case $PERL in
-  [\\/]* | ?:[\\/]*)
-  ac_cv_path_PERL="$PERL" # Let the user override the test with a path.
-  ;;
-  *)
-  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
-for as_dir in $PATH
-do
-  IFS=$as_save_IFS
-  test -z "$as_dir" && as_dir=.
-for ac_exec_ext in '' $ac_executable_extensions; do
-  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
-ac_cv_path_PERL="$as_dir/$ac_word$ac_exec_ext"
-$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
-break 2
-  fi
-done
-  done
-IFS=$as_save_IFS
-
-  test -z "$ac_cv_path_PERL" && ac_cv_path_PERL="perl-not-found-in-path-error"
-  ;;
-esac
-fi
-PERL=$ac_cv_path_PERL
-if test -n "$PERL"; then
-  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PERL" >&5
-$as_echo "$PERL" >&6; }
-else
-  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
-$as_echo "no" >&6; }
-fi
-
-
 
 
 # Configure libtool
@@ -11406,7 +11364,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11409 "configure"
+#line 11367 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11512,7 +11470,7 @@ else
   lt_dlunknown=0; lt_

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

2023-05-16 Thread Thomas Schwinge
Hi!

On 2023-05-05T10:59:31+0200, I wrote:
> On 2023-05-05T10:55:41+0200, I wrote:
>> [Putting Bernhard, Honza, Segher in CC, as they are eager to test this,
>> based on recent comments on IRC.]  ;-P


>> First, establish the parallel testing infrastructure -- while still
>> hard-coding the number of parallel slots to one.

>> "Support parallel testing in libgomp, part I [PR66005]"

> On top of that, second, enable parallel testing.

> implemented what I'd described in
> :
>
> | [...] parallelize *all* compilation, while just allowing for *one*
> | execution test job slot.  That will require some GCC DejaGnu test
> | harness hackery which I've [now] gotten to look into.  That is, enable
> | the usual GCC/DejaGnu parallel testing, but also have some kind of
> | mutex for the execution test invocation.  This has to play nicely with
> | DejaGnu timeout handling, etc.

> Subject: [PATCH] Support parallel testing in libgomp, part II [PR66005]
>
> ..., and enable if 'flock' is available for serializing execution testing.

OK to push the attached
"Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]"?

Per the PR66005 discussion, if 'flock' is not available, having a
fallback Perl 'flock' for parallelizing 'check-target-libgomp' wasn't met
with the greatest of all enthusiasm -- but in my opinion it's still
better than continued all-serial 'check-target-libgomp'?

We may then proceed working on a more integrated solution, using TCL or
shell features.
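
As a rough illustration of the serialization idea (this is not the Perl script from the attached patch), here is a minimal C++ sketch that runs one job while holding an exclusive POSIX 'flock' lock on a lock file. The helper name 'run_serialized' and the lock-file path in the usage note are assumptions made up for this example.

```cpp
#include <cassert>
#include <fcntl.h>
#include <functional>
#include <stdexcept>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical helper: run JOB while holding an exclusive lock on LOCK_PATH.
// Other processes calling this with the same path block until we are done,
// so only one "execution test" runs at a time.
int
run_serialized (const char *lock_path, const std::function<int ()> &job)
{
  assert (lock_path != nullptr);

  int fd = open (lock_path, O_RDWR | O_CREAT, 0600);
  if (fd < 0)
    throw std::runtime_error ("cannot open lock file");

  if (flock (fd, LOCK_EX) != 0)
    {
      close (fd);
      throw std::runtime_error ("cannot take exclusive lock");
    }

  int status;
  try
    {
      status = job ();
    }
  catch (...)
    {
      // Release the lock even if the job throws.
      flock (fd, LOCK_UN);
      close (fd);
      throw;
    }

  flock (fd, LOCK_UN);
  close (fd);
  return status;
}
```

For example, 'run_serialized ("/tmp/example.lock", [] { return 0; })' would run the lambda under the lock; compilation jobs would simply not go through this helper and so stay parallel.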


Grüße
 Thomas


>From c62858bf888fec2f61febafcd6afe2dc8c3f679b Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 15 May 2023 20:00:07 +0200
Subject: [PATCH] Support parallel testing in libgomp: fallback Perl 'flock'
 [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
> $ uname -srvi
> Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
> $ grep '^model name' < /proc/cpuinfo | uniq -c
>  12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
> $ nvidia-smi -L
> GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

> $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
> 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
> 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
> -j12 GCC_TEST_PARALLEL_SLOTS=12
> 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
> 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata 505216maxresident)k
2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata 505216maxresident)k

	PR testsuite/66005
	gcc/
	* doc/install.texi: Document (optional) Perl usage for parallel
	testing of libgomp.
	libgomp/
	* testsuite/lib/libgomp.exp: 'flock' through stdout.
	* testsuite/flock: New.
	* configure.ac (FLOCK): Point to that if no 'flock' available, but
	'perl' is.
	* configure: Regenerate.
---
 gcc/doc/install.texi  |  3 +++
 libgomp/configure | 42 +++
 libgomp/configure.ac  |  5 
 libgomp/testsuite/flock   | 17 +
 libgomp/testsuite/lib/libgomp.exp |  4 ++-
 5 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100755 libgomp/testsuite/flock

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index dfab47dac96..fe4a972980f 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -382,6 +382,9 @@ tables.
 
 Used by @command{automake}.
 
+If available, enables parallel testing of @samp{libgomp} in case that
+@command{flock} is not available.
+
 @end table
 
 Several support libraries are necessary to build GCC, some are required,
diff --git a/libgomp/configure b/libgomp/configure
index 2b45acd08c6..a280ca9238a 100755
--- a/libgomp/configure
+++ b/libgomp/configure
@@ -16457,6 +16457,8 @@ $as_echo "unable to detect (assuming 1)" >&6; }
 fi
 
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for flock implementation" >&5
+$as_echo "$as_me: checking for flock implementation" >&6;}
 for ac_prog in flock
 do
   # Extract the first word of "$ac_prog", so it c

[PATCH] OpenMP: Array shaping operator and strided "target update" for C

2023-05-16 Thread Julian Brown
Following the similar support for C++ and Fortran, here is the
C implementation for the OpenMP 5.0 array-shaping operator, and for
strided and rectangular updates for "target update" directives.

Much of the implementation is shared with the previously-posted C++
support:

  https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613788.html

Some details of parsing necessarily differ for C, but the general ideas
are the same.

This patch is intended to be applied on top of the following series:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609031.html

(with followup:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609566.html)

and (the series supporting the C++ patch in the first link above):

  https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613785.html

and (Fortran support):

  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616921.html

Tested with offloading to NVPTX, and bootstrapped. OK?

Thanks,

Julian

2023-05-16  Julian Brown  

gcc/c/
* c-parser.cc (c_parser_braced_init): Disallow array-shaping operator
in braced init.
(c_parser_conditional_expression): Disallow array-shaping operator in
conditional expression.
(c_parser_cast_expression): Add array-shaping operator support.
(c_parser_postfix_expression): Disallow array-shaping operator in
statement expressions.
(c_parser_postfix_expression_after_primary): Add OpenMP array section
stride support.
(c_parser_expr_list): Disallow array-shaping operator in expression
lists.
(c_array_type_nelts_top, c_array_type_nelts_total): New functions.
(c_parser_omp_variable_list): Support array-shaping operator.
(c_parser_omp_clause_to, c_parser_omp_clause_from): Allow generalised
lvalue parsing in "to" and "from" clauses.
(c_parser_omp_target_update): Recognize GOMP_MAP_TO_GRID and
GOMP_MAP_FROM_GRID map kinds as well as OMP_CLAUSE_TO/OMP_CLAUSE_FROM.
* c-tree.h (c_omp_array_shaping_op_p, c_omp_has_array_shape_p): New
extern declarations.
(create_omp_arrayshape_type): Add prototype.
* c-typeck.cc (c_omp_array_shaping_op_p, c_omp_has_array_shape_p): New
globals.
(build_omp_array_section): Permit integral types, not just integer
constants, when creating array types for array sections.
(create_omp_arrayshape_type): New function.
(handle_omp_array_sections_1): Add DISCONTIGUOUS parameter.  Add
strided/rectangular array section support.
(omp_array_section_low_bound): New function.
(handle_omp_array_sections): Add DISCONTIGUOUS parameter.  Add
strided/rectangular array section support.
(c_finish_omp_clauses): Update calls to handle_omp_array_sections.
Handle discontiguous updates.

gcc/testsuite/
* gcc.dg/gomp/bad-array-shaping-c-1.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-2.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-3.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-4.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-5.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-6.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-7.c: New test.

libgomp/
* testsuite/libgomp.c/array-shaping-1.c: New test.
* testsuite/libgomp.c/array-shaping-2.c: New test.
* testsuite/libgomp.c/array-shaping-3.c: New test.
* testsuite/libgomp.c/array-shaping-4.c: New test.
* testsuite/libgomp.c/array-shaping-5.c: New test.
* testsuite/libgomp.c/array-shaping-6.c: New test.
---
 gcc/c/c-parser.cc | 305 +-
 gcc/c/c-tree.h|   4 +
 gcc/c/c-typeck.cc | 241 --
 .../gcc.dg/gomp/bad-array-shaping-c-1.c   |  26 ++
 .../gcc.dg/gomp/bad-array-shaping-c-2.c   |  24 ++
 .../gcc.dg/gomp/bad-array-shaping-c-3.c   |  30 ++
 .../gcc.dg/gomp/bad-array-shaping-c-4.c   |  27 ++
 .../gcc.dg/gomp/bad-array-shaping-c-5.c   |  17 +
 .../gcc.dg/gomp/bad-array-shaping-c-6.c   |  26 ++
 .../gcc.dg/gomp/bad-array-shaping-c-7.c   |  15 +
 libgomp/testsuite/libgomp.c/array-shaping-1.c | 236 ++
 libgomp/testsuite/libgomp.c/array-shaping-2.c |  39 +++
 libgomp/testsuite/libgomp.c/array-shaping-3.c |  42 +++
 libgomp/testsuite/libgomp.c/array-shaping-4.c |  36 +++
 libgomp/testsuite/libgomp.c/array-shaping-5.c |  38 +++
 libgomp/testsuite/libgomp.c/array-shaping-6.c |  45 +++
 16 files changed, 1101 insertions(+), 50 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-1.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-2.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-3.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-4.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-5.c
 create mode 10064

Re: [PATCH v5 1/4] rs6000: Enable REE pass by default

2023-05-16 Thread Segher Boessenkool
Hi!

On Tue, May 16, 2023 at 11:45:28AM +0530, Ajit Agarwal wrote:
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -12455,8 +12455,8 @@ Attempt to remove redundant extension instructions.  
> This is especially
>  helpful for the x86-64 architecture, which implicitly zero-extends in 64-bit
>  registers after writing to their lower 32-bit half.
>  
> -Enabled for Alpha, AArch64 and x86 at levels @option{-O2},
> -@option{-O3}, @option{-Os}.
> +Enabled for Alpha, AArch64, RS/6000, RISC-V, SPARC, h83000 and x86 at levels 
> +@option{-O2}, @option{-O3}, @option{-Os}.

Please don't mention RS/6000, we don't support that anymore.  The
architecture we do support is called Power or PowerPC; the target
triplets are powerpc*-*-*.  rs6000-*-* might still somewhat work, but
no one should use it anymore, and we probably should delete it.

Please say PowerPC here.

With that the patch is okay for trunk.  Thank you!


Segher


Re: [PATCH] configure: Implement --enable-host-pie

2023-05-16 Thread Marek Polacek via Gcc-patches
Ping.

On Tue, May 09, 2023 at 03:41:58PM -0400, Marek Polacek via Gcc-patches wrote:
> [ This is my third attempt to add this configure option.  The first
> version was approved but it came too late in the development cycle.
> The second version was also approved, but I had to revert it:
> .
> I've fixed the problem (by moving $(PICFLAG) from INTERNAL_CFLAGS to
> ALL_COMPILERFLAGS).  Another change is that since r13-4536 I no longer
> need to touch Makefile.def, so this patch is simplified. ]
> 
> This patch implements the --enable-host-pie configure option which
> makes the compiler executables PIE.  This can be used to enhance
> protection against ROP attacks, and can be viewed as part of a wider
> trend to harden binaries.
> 
> It is similar to the option --enable-host-shared, except that --e-h-s
> won't add -shared to the linker flags whereas --e-h-p will add -pie.
> It is different from --enable-default-pie because that option just
> adds an implicit -fPIE/-pie when the compiler is invoked, but the
> compiler itself isn't PIE.
> 
> Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH
> regressions.
> 
> When building the compiler, the build process may use various in-tree
> libraries; these need to be built with -fPIE so that it's possible to
> use them when building a PIE.  For instance, when --with-included-gettext
> is in effect, intl object files must be compiled with -fPIE.  Similarly,
> when building in-tree gmp, isl, mpfr and mpc, they must be compiled with
> -fPIE.
> 
> With this patch and --enable-host-pie used to configure gcc:
> 
> $ file gcc/cc1{,plus,obj} gcc/f951 gcc/lto1 gcc/cpp
> gcc/cc1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/cc1plus: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/f951:ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/cc1obj:  ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/lto1:ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/cpp: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> 
> I plan to add an option to link with -Wl,-z,now.
> 
> Bootstrapped on x86_64-pc-linux-gnu with --with-included-gettext
> --enable-host-pie as well as without --enable-host-pie.  Also tested
> on a Debian system where the system gcc was configured with
> --enable-default-pie.
> 
> ChangeLog:
> 
>   * configure.ac (--enable-host-pie): New check.  Set PICFLAG after this
>   check.
>   * configure: Regenerate.
> 
> c++tools/ChangeLog:
> 
>   * Makefile.in: Rename PIEFLAG to PICFLAG.  Set LD_PICFLAG.  Use it.
>   Use pic/libiberty.a if PICFLAG is set.
>   * configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG.
>   (--enable-host-pie): New check.
>   * configure: Regenerate.
> 
> fixincludes/ChangeLog:
> 
>   * Makefile.in: Set and use PICFLAG and LD_PICFLAG.  Use the "pic"
>   build of libiberty if PICFLAG is set.
>   * configure.ac:
>   * configure: Regenerate.
> 
> gcc/ChangeLog:
> 
>   * Makefile.in: Set LD_PICFLAG.  Use it.  Set enable_host_pie.
>   Remove NO_PIE_CFLAGS and NO_PIE_FLAG.  Pass LD_PICFLAG to
>   ALL_LINKERFLAGS.  Use the "pic" build of libiberty if --enable-host-pie.
>   * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>   (--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
>   check.
>   * configure: Regenerate.
>   * doc/install.texi: Document --enable-host-pie.
> 
> gcc/d/ChangeLog:
> 
>   * Make-lang.in: Remove NO_PIE_CFLAGS.
> 
> intl/ChangeLog:
> 
>   * Makefile.in: Use @PICFLAG@ in COMPILE as well.
>   * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>   (--enable-host-pie): New check.  Set PICFLAG after this check.
>   * configure: Regenerate.
> 
> libcody/ChangeLog:
> 
>   * Makefile.in: Pass LD_PICFLAG to LDFLAGS.
>   * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>   (--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
>   check.
>   * configure: Regenerate.
> 
> libcpp/ChangeLog:
> 
>   * configure.ac (--enable-host-shared): Don't s

[committed] RISC-V: Fix wrong select_kind in riscv_compute_multilib

2023-05-16 Thread Kito Cheng via Gcc-patches
Seems like I screwed up bare-metal toolchain multilib selection while
fixing Linux multilib selection...

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_compute_multilib):
Fix wrong select_kind...
---
 gcc/common/config/riscv/riscv-common.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 3a285dfbff0e..fb2635eb5599 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1777,11 +1777,11 @@ riscv_compute_multilib (
   switch (select_kind)
 {
 case select_by_abi:
-  return riscv_select_multilib (riscv_current_abi_str, subset_list,
-   switches, n_switches, multilib_infos);
-case select_by_abi_arch_cmodel:
   return riscv_select_multilib_by_abi (riscv_current_abi_str,
   multilib_infos);
+case select_by_abi_arch_cmodel:
+  return riscv_select_multilib (riscv_current_abi_str, subset_list,
+   switches, n_switches, multilib_infos);
 case select_by_builtin:
   gcc_unreachable ();
 default:
-- 
2.39.2
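
For readers skimming the diff: the fix swaps the bodies of two 'case' labels so that each selection kind calls the selector meant for it. A minimal C++ sketch of the corrected dispatch follows; the enum, the two selector functions and the strings they return are simplified stand-ins invented for this example, not the real signatures from riscv-common.cc.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the selection kinds seen in the diff.
enum class select_kind { by_builtin, by_abi_arch_cmodel, by_abi };

// Stand-in for the ABI-only selector.
std::string
select_by_abi_only (const std::string &abi)
{
  return "abi:" + abi;
}

// Stand-in for the selector that also considers the arch subset list.
std::string
select_by_full_match (const std::string &abi, const std::string &arch)
{
  return "abi+arch:" + abi + "/" + arch;
}

// Corrected dispatch: each kind reaches its own selector.
std::string
compute_multilib (select_kind kind, const std::string &abi,
                  const std::string &arch)
{
  switch (kind)
    {
    case select_kind::by_abi:
      return select_by_abi_only (abi);
    case select_kind::by_abi_arch_cmodel:
      return select_by_full_match (abi, arch);
    case select_kind::by_builtin:
      break;
    }
  // The builtin kind is not expected to reach this function.
  assert (false && "unreachable selection kind");
  return {};
}
```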



[PATCH] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Pan Li via Gcc-patches
From: Pan Li 

We are running out of machine_mode values (8 bits) in the RISC-V backend,
so we would like to extend machine_mode from 8 to 16 bits.  However,
growing common structures such as tree or rtx is memory-sensitive.  This
patch therefore makes room for a 16-bit machine_mode by shrinking or
rearranging neighbouring fields:

* Swap the bit sizes of code and machine_mode in rtx_def.
* Adjust the machine_mode location and the spare bits in tree.

The memory impact of this patch on the affected structures is shown below:

+-------------------+----------+---------+------+
| struct/bytes      | upstream | patched | diff |
+-------------------+----------+---------+------+
| rtx_obj_reference |        8 |      12 |   +4 |
| ext_modified      |        2 |       4 |   +2 |
| ira_allocno       |      192 |     184 |   -8 |
| qty_table_elem    |       40 |      40 |    0 |
| reg_stat_type     |       64 |      64 |    0 |
| rtx_def           |       40 |      40 |    0 |
| table_elt         |       80 |      80 |    0 |
| tree_decl_common  |      112 |     112 |    0 |
| tree_type_common  |      128 |     128 |    0 |
| access_info       |        8 |       8 |    0 |
+-------------------+----------+---------+------+

The tree- and rtx-related structs do not change in size with this patch,
while machine_mode is now 16 bits wide.
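
The "swap the bit sizes" idea can be sketched in a few lines of C++. The two structs below are simplified stand-ins invented for this example (the real rtx_def has more fields); they only show that exchanging a 16-bit and an 8-bit bit-field leaves the struct size unchanged while widening the range of the mode field.

```cpp
#include <cassert>

// Hypothetical "before" layout: 16-bit code, 8-bit mode.
struct header_before
{
  unsigned int code : 16;
  unsigned int mode : 8;
  unsigned int flags : 8;
};

// Hypothetical "after" layout: the two widths are swapped.
struct header_after
{
  unsigned int mode : 16;
  unsigned int code : 8;
  unsigned int flags : 8;
};

// Both layouts use the same 32 bits, so the size does not change.
static_assert (sizeof (header_before) == sizeof (header_after),
               "swapping the widths must not grow the struct");

// Store M in the 8-bit mode field; values above 255 wrap modulo 256.
unsigned int
store_mode_before (unsigned int m)
{
  header_before h{};
  h.mode = m;
  return h.mode;
}

// Store M in the 16-bit mode field; values up to 65535 are preserved.
unsigned int
store_mode_after (unsigned int m)
{
  assert (m < (1u << 16));
  header_after h{};
  h.mode = m;
  return h.mode;
}
```

With these definitions a mode number such as 300 survives in the "after" layout but is truncated in the "before" layout, which is the limit the RISC-V backend is running into.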

Signed-off-by: Pan Li 
Co-authored-by: Ju-Zhe Zhong 
Co-authored-by: Kito Cheng 
Co-Authored-By: Richard Biener 
Co-Authored-By: Richard Sandiford 

gcc/ChangeLog:

* combine.cc (struct reg_stat_type): Extend machine_mode to 16 bits.
* cse.cc (struct qty_table_elem): Extend machine_mode to 16 bits
(struct table_elt): Extend machine_mode to 16 bits.
(struct set): Ditto.
* genmodes.cc (emit_mode_wider): Extend type from char to short.
(emit_mode_complex): Ditto.
(emit_mode_inner): Ditto.
(emit_class_narrowest_mode): Ditto.
* genopinit.cc (main): Extend the machine_mode limit.
* ira-int.h (struct ira_allocno): Extend machine_mode to 16 bits and
re-ordered the struct fields for padding.
* machmode.h (MACHINE_MODE_BITSIZE): New macro.
(GET_MODE_2XWIDER_MODE): Extend type from char to short.
(get_mode_alignment): Extend type from char to short.
* ree.cc (struct ext_modified): Extend machine_mode to 16 bits and
removed the ATTRIBUTE_PACKED.
* rtl-ssa/accesses.h: Extend machine_mode to 16 bits, narrow
m_kind to 2 bits and remove m_spare.
* rtl.h (RTX_CODE_BITSIZE): New macro.
(struct rtx_def): Swap both the bit size and location between the
rtx_code and the machine_mode.
(subreg_shape::unique_id): Extend the machine_mode limit.
* rtlanal.h: Extend machine_mode to 16 bits.
* tree-core.h (struct tree_type_common): Extend machine_mode to 16
bits and re-ordered the struct fields for padding.
(struct tree_decl_common): Extend machine_mode to 16 bits.
* internals.inl (rtl_ssa::access_info): Adjust the assignment.
---
 gcc/combine.cc|  4 +--
 gcc/cse.cc| 16 ---
 gcc/genmodes.cc   | 16 +--
 gcc/genopinit.cc  |  3 ++-
 gcc/ira-int.h | 56 +++
 gcc/machmode.h| 27 ++-
 gcc/ree.cc|  4 +--
 gcc/rtl-ssa/accesses.h| 12 -
 gcc/rtl-ssa/internals.inl |  5 ++--
 gcc/rtl.h | 12 +
 gcc/rtlanal.h |  2 +-
 gcc/tree-core.h   |  9 ---
 12 files changed, 88 insertions(+), 78 deletions(-)

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 5aa0ec5c45a..a23caeed96f 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -200,7 +200,7 @@ struct reg_stat_type {
 
   unsigned HOST_WIDE_INT   last_set_nonzero_bits;
   char last_set_sign_bit_copies;
-  ENUM_BITFIELD(machine_mode)  last_set_mode : 8;
+  ENUM_BITFIELD(machine_mode)  last_set_mode : MACHINE_MODE_BITSIZE;
 
   /* Set nonzero if references to register n in expressions should not be
  used.  last_set_invalid is set nonzero when this register is being
@@ -235,7 +235,7 @@ struct reg_stat_type {
  truncation if we know that value already contains a truncated
  value.  */
 
-  ENUM_BITFIELD(machine_mode)  truncated_to_mode : 8;
+  ENUM_BITFIELD(machine_mode)  truncated_to_mode : MACHINE_MODE_BITSIZE;
 };
 
 
diff --git a/gcc/cse.cc b/gcc/cse.cc
index b10c9b0c94d..86403b95938 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -248,10 +248,8 @@ struct qty_table_elem
   rtx comparison_const;
   int comparison_qty;
   unsigned int first_reg, last_reg;
-  /* The sizes of these fields should match the sizes of the
- code and mode fields of struct rtx_def (see rtl.h).  */
-  ENUM_BITFIELD(rtx_code) comparison_code : 16;
-  ENUM_BITFIELD(machine_mode) mode : 8;
+  ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE;
+  ENUM_BIT

[PATCH] configure: Implement --enable-host-bind-now

2023-05-16 Thread Marek Polacek via Gcc-patches
As promised in the --enable-host-pie patch, this patch adds another
configure option, --enable-host-bind-now, which adds -z now when linking
the compiler executables in order to extend hardening.  BIND_NOW with RELRO
allows the GOT to be marked RO; this prevents GOT modification attacks.

This option does not affect linking of target libraries; you can use
LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW.

With this patch:
$ readelf -Wd cc1{,plus} | grep FLAGS
 0x001e (FLAGS)  BIND_NOW
 0x6ffb (FLAGS_1)Flags: NOW PIE
 0x001e (FLAGS)  BIND_NOW
 0x6ffb (FLAGS_1)Flags: NOW PIE

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

c++tools/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.
* configure: Regenerate.

gcc/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Add
-Wl,-z,now to LD_PICFLAG if --enable-host-bind-now.
* configure: Regenerate.
* doc/install.texi: Document --enable-host-bind-now.

lto-plugin/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Link with
-z,now.
* configure: Regenerate.

diff --git a/c++tools/configure b/c++tools/configure
index 88087009383..006efe07b35 100755
--- a/c++tools/configure
+++ b/c++tools/configure
@@ -628,6 +628,7 @@ EGREP
 GREP
 CXXCPP
 LD_PICFLAG
+enable_host_bind_now
 PICFLAG
 MAINTAINER
 CXX_AUX_TOOLS
@@ -702,6 +703,7 @@ enable_maintainer_mode
 enable_checking
 enable_default_pie
 enable_host_pie
+enable_host_bind_now
 with_gcc_major_version_only
 '
   ac_precious_vars='build_alias
@@ -1336,6 +1338,7 @@ Optional Features:
   yes,no,all,none,release.
   --enable-default-pieenable Position Independent Executable as default
   --enable-host-pie   build host code as PIE
+  --enable-host-bind-now  link host code as BIND_NOW
 
 Optional Packages:
   --with-PACKAGE[=ARG]use PACKAGE [ARG=yes]
@@ -3007,6 +3010,14 @@ fi
 
 
 
+# Enable --enable-host-bind-now
+# Check whether --enable-host-bind-now was given.
+if test "${enable_host_bind_now+set}" = set; then :
+  enableval=$enable_host_bind_now; LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"
+fi
+
+
+
 
 # Check if O_CLOEXEC is defined by fcntl
 
diff --git a/c++tools/configure.ac b/c++tools/configure.ac
index 44dfaccbbfa..c2a16601425 100644
--- a/c++tools/configure.ac
+++ b/c++tools/configure.ac
@@ -110,6 +110,13 @@ AC_ARG_ENABLE(host-pie,
[build host code as PIE])],
 [PICFLAG=-fPIE; LD_PICFLAG=-pie], [])
 AC_SUBST(PICFLAG)
+
+# Enable --enable-host-bind-now
+AC_ARG_ENABLE(host-bind-now,
+[AS_HELP_STRING([--enable-host-bind-now],
+   [link host code as BIND_NOW])],
+[LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"], [])
+AC_SUBST(enable_host_bind_now)
 AC_SUBST(LD_PICFLAG)
 
 # Check if O_CLOEXEC is defined by fcntl
diff --git a/gcc/configure b/gcc/configure
index 629446ecf3b..6d847c60024 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -635,6 +635,7 @@ CET_HOST_FLAGS
 LD_PICFLAG
 PICFLAG
 enable_default_pie
+enable_host_bind_now
 enable_host_pie
 enable_host_shared
 enable_plugin
@@ -1031,6 +1032,7 @@ enable_version_specific_runtime_libs
 enable_plugin
 enable_host_shared
 enable_host_pie
+enable_host_bind_now
 enable_libquadmath_support
 with_linker_hash_style
 with_diagnostics_color
@@ -1794,6 +1796,7 @@ Optional Features:
   --enable-plugin enable plugin support
   --enable-host-sharedbuild host code as shared libraries
   --enable-host-pie   build host code as PIE
+  --enable-host-bind-now  link host code as BIND_NOW
   --disable-libquadmath-support
   disable libquadmath support for Fortran
   --enable-default-pieenable Position Independent Executable as default
@@ -19852,7 +19855,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19867 "configure"
+#line 19870 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19958,7 +19961,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19973 "configure"
+#line 19976 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -32105,6 +32108,14 @@ fi
 
 
 
+# Enable --enable-host-bind-now
+# Check whether --enable-host-bind-now was given.
+if test "${enable_host_bind_now+set}" = set; then :
+  enableval=$enable_host_bind_now;
+fi
+
+
+
 # Check whether --enable-libquadmath-support was given.
 if test "${enable_libquadmath_support+set}" = set; then :
   enableval=$enable_libquadmath_support; ENABLE_LIBQUADMATH_SUPPORT=$enableval
@@ -32291,6 +32302,8 @@ else
   PICFLAG=
 fi
 
+
+
 if test x$enable_host_pie = xyes; then
   LD_PICFLAG=-pie
 elif test x$gcc_cv_no_pie = xyes; then
@@ -32299,6 +32312,9 @@ else
   LD_PICFLAG=
 fi
 
+if test x$enable_host_bind_now = xyes; then
+  LD_PICFLAG="$LD_PICFL

[PATCH] c++: desig init in presence of list ctor [PR109871]

2023-05-16 Thread Patrick Palka via Gcc-patches
add_list_candidates has logic to reject designated initialization of a
non-aggregate type, but this is inadvertently being suppressed if the type
has a list constructor due to the order of case analysis, which in the
below testcase leads to us incorrectly treating the list initializer as
an ordinary non-designated one.  This patch fixes this by making us check
for invalid designated initialization sooner.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 13?  IIUC desig init is C++20 but we also accept it
with a pedwarn in earlier dialects, so not sure if this'd be suitable
for backporting.
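
To make the rule being enforced concrete: designated initializers only apply to aggregate initialization, so a class with user-declared constructors (list constructor or not) should be rejected. The sketch below, assuming C++17 or later, uses 'std::is_aggregate_v' to tell the two cases apart. The type names 'point' and 'vec_like' and the helper 'designated_init_allowed' are made up for this example, and the trait is only an approximation of what the front end's own aggregate check does.

```cpp
#include <initializer_list>
#include <type_traits>

// An aggregate: no user-declared constructors, so 'point{.x = 1, .y = 2}'
// is valid designated initialization (C++20, or earlier as an extension).
struct point
{
  int x;
  int y;
};

// Not an aggregate: it has user-declared constructors, one of them a
// list constructor, mirroring the shape of the testcase.
struct vec_like
{
  vec_like (std::initializer_list<int>) {}
  vec_like (int) {}
};

// Hypothetical helper: designated initializers make sense only for
// aggregate types.
template <typename T>
constexpr bool
designated_init_allowed ()
{
  return std::is_aggregate_v<T>;
}

static_assert (designated_init_allowed<point> (),
               "aggregates accept designated initializers");
static_assert (!designated_init_allowed<vec_like> (),
               "a class with constructors is not an aggregate");
```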

PR c++/109871

gcc/cp/ChangeLog:

* call.cc (add_list_candidates): Check for invalid
designated initialization sooner, even for types that have
a list constructor.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/desig6.C: New test.
---
 gcc/cp/call.cc  | 16 
 gcc/testsuite/g++.dg/cpp0x/desig6.C | 16 
 2 files changed, 24 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/desig6.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 48611bb16a3..908374a43c9 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -4129,6 +4129,14 @@ add_list_candidates (tree fns, tree first_arg,
   if (CONSTRUCTOR_NELTS (init_list) == 0
   && TYPE_HAS_DEFAULT_CONSTRUCTOR (totype))
 ;
+  else if (CONSTRUCTOR_IS_DESIGNATED_INIT (init_list)
+  && !CP_AGGREGATE_TYPE_P (totype))
+{
+  if (complain & tf_error)
+   error ("designated initializers cannot be used with a "
+  "non-aggregate type %qT", totype);
+  return;
+}
   /* If the class has a list ctor, try passing the list as a single
  argument first, but only consider list ctors.  */
   else if (TYPE_HAS_LIST_CTOR (totype))
@@ -4140,14 +4148,6 @@ add_list_candidates (tree fns, tree first_arg,
   if (any_strictly_viable (*candidates))
return;
 }
-  else if (CONSTRUCTOR_IS_DESIGNATED_INIT (init_list)
-  && !CP_AGGREGATE_TYPE_P (totype))
-{
-  if (complain & tf_error)
-   error ("designated initializers cannot be used with a "
-  "non-aggregate type %qT", totype);
-  return;
-}
 
   /* Expand the CONSTRUCTOR into a new argument vec.  */
   vec *new_args;
diff --git a/gcc/testsuite/g++.dg/cpp0x/desig6.C b/gcc/testsuite/g++.dg/cpp0x/desig6.C
new file mode 100644
index 000..8d4cf483176
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/desig6.C
@@ -0,0 +1,16 @@
+// PR c++/109871
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+#include <initializer_list>
+
+struct vector {
+  vector(std::initializer_list); // #1
+  vector(int); // #2
+};
+
+void f(vector);
+
+int main() {
+  f({.blah = 42}); // { dg-error "designated" } previously incorrectly selected #2
+}
-- 
2.40.1.552.g91428f078b



RE: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Li, Pan2 via Gcc-patches
Updated to PATCH v4 (sorry, I missed the v4 in the subject) as below; the x86
bootstrap test passed.

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618742.html

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Li, Pan2 via Gcc-patches
Sent: Tuesday, May 16, 2023 8:17 PM
To: Richard Sandiford 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang ; jeffreya...@gmail.com; rguent...@suse.de
Subject: RE: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

Thanks Richard Sandiford for review.

Yes, currently the class access_info grows from 8 bytes to 12 bytes, which is
missing from the table. With the adjustment you suggested it stays at 8 bytes,
but unfortunately the change to m_kind may trigger an ICE in some test
case(s).

I will take a look into it and keep you posted.

Pan

-Original Message-
From: Richard Sandiford  
Sent: Tuesday, May 16, 2023 5:09 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang ; jeffreya...@gmail.com; rguent...@suse.de
Subject: Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

pan2...@intel.com writes:
> diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h index 
> c5180b9308a..38b4d6160c2 100644
> --- a/gcc/rtl-ssa/accesses.h
> +++ b/gcc/rtl-ssa/accesses.h
> @@ -254,7 +254,7 @@ private:
>unsigned int m_spare : 2;
>  
>// The value returned by the accessor above.
> -  machine_mode m_mode : 8;
> +  machine_mode m_mode : MACHINE_MODE_BITSIZE;
>  };
>  
>  // A contiguous array of access_info pointers.  Used to represent a

This structure (access_info) isn't mentioned in the table in the patch 
description.  The structure is currently 1 LP64 word and is very 
size-sensitive.  I think we should:

- Put the mode after m_regno
- Reduce m_kind to 2 bits
- Remove m_spare

I *think* that will keep the current size, but please check.
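
One way to check a suggestion like this is a compile-time size assertion. The struct below is a hypothetical simplification written for this example (the real access_info has different members and flag bits); it only shows a 32-bit register number followed by a 16-bit mode, a 2-bit kind and the remaining bits, checked to fit in 8 bytes on targets where 'std::uint32_t' is available.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the proposed layout, not the real class.
struct access_info_sketch
{
  std::uint32_t m_regno;       // full 32-bit register number
  std::uint32_t m_mode : 16;   // widened mode, placed right after m_regno
  std::uint32_t m_kind : 2;    // narrowed from a wider field
  std::uint32_t m_flags : 14;  // stand-in for the remaining flag bits
};

// The whole point of the rearrangement: still one LP64 word.
static_assert (sizeof (access_info_sketch) == 8,
               "layout is expected to stay at 8 bytes");

// Build a value, checking that the narrowed fields can hold the inputs.
access_info_sketch
make_access (std::uint32_t regno, std::uint32_t mode, std::uint32_t kind)
{
  assert (mode < (1u << 16));
  assert (kind < (1u << 2));
  access_info_sketch a{};
  a.m_regno = regno;
  a.m_mode = mode;
  a.m_kind = kind;
  return a;
}
```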

LGTM otherwise.

Thanks,
Richard


Re: [PATCH] c++: desig init in presence of list ctor [PR109871]

2023-05-16 Thread Jason Merrill via Gcc-patches

On 5/16/23 11:38, Patrick Palka wrote:

add_list_candidates has logic to reject designated initialization of a
non-aggregate type, but this is inadvertently being suppressed if the type
has a list constructor due to the order of case analysis, which in the
below testcase leads to us incorrectly treating the list initializer as
an ordinary non-designated one.  This patch fixes this by making us check
for invalid designated initialization sooner.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 13?  IIUC desig init is C++20 but we also accept it
with a pedwarn in earlier dialects, so not sure if this'd be suitable
for backporting.


OK.


PR c++/109871

gcc/cp/ChangeLog:

* call.cc (add_list_candidates): Check for invalid
designated initialization sooner, even for types that have
a list constructor.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/desig6.C: New test.
---
  gcc/cp/call.cc  | 16 
  gcc/testsuite/g++.dg/cpp0x/desig6.C | 16 
  2 files changed, 24 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/desig6.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 48611bb16a3..908374a43c9 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -4129,6 +4129,14 @@ add_list_candidates (tree fns, tree first_arg,
if (CONSTRUCTOR_NELTS (init_list) == 0
&& TYPE_HAS_DEFAULT_CONSTRUCTOR (totype))
  ;
+  else if (CONSTRUCTOR_IS_DESIGNATED_INIT (init_list)
+  && !CP_AGGREGATE_TYPE_P (totype))
+{
+  if (complain & tf_error)
+   error ("designated initializers cannot be used with a "
+  "non-aggregate type %qT", totype);
+  return;
+}
/* If the class has a list ctor, try passing the list as a single
   argument first, but only consider list ctors.  */
else if (TYPE_HAS_LIST_CTOR (totype))
@@ -4140,14 +4148,6 @@ add_list_candidates (tree fns, tree first_arg,
if (any_strictly_viable (*candidates))
return;
  }
-  else if (CONSTRUCTOR_IS_DESIGNATED_INIT (init_list)
-  && !CP_AGGREGATE_TYPE_P (totype))
-{
-  if (complain & tf_error)
-   error ("designated initializers cannot be used with a "
-  "non-aggregate type %qT", totype);
-  return;
-}
  
/* Expand the CONSTRUCTOR into a new argument vec.  */

vec *new_args;
diff --git a/gcc/testsuite/g++.dg/cpp0x/desig6.C b/gcc/testsuite/g++.dg/cpp0x/desig6.C
new file mode 100644
index 000..8d4cf483176
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/desig6.C
@@ -0,0 +1,16 @@
+// PR c++/109871
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+#include <initializer_list>
+
+struct vector {
+  vector(std::initializer_list); // #1
+  vector(int); // #2
+};
+
+void f(vector);
+
+int main() {
+  f({.blah = 42}); // { dg-error "designated" } previously incorrectly selected #2
+}




[committed gcc13 backport] RISCV: Inline subword atomic ops

2023-05-16 Thread Patrick O'Neill

On 5/15/23 21:32, Jeff Law wrote:
> On 5/9/23 10:01, Patrick O'Neill wrote:
>> Ping.
>
> OK for backporting.  Sorry for the delay.
>
> jeff

Committed.

Thanks,
Patrick



[committed] rs6000: Enable REE pass by default

2023-05-16 Thread Ajit Agarwal via Gcc-patches
rs6000: Enable REE pass by default

Add ree pass as a default pass for rs6000 target for
O2 and above.

2023-05-16  Ajit Kumar Agarwal  

gcc/ChangeLog:

* common/config/rs6000/rs6000-common.cc: Add REE pass as a
default rs6000 target pass for O2 and above.
* doc/invoke.texi: Document -free
---
 gcc/common/config/rs6000/rs6000-common.cc | 2 ++
 gcc/doc/invoke.texi   | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/common/config/rs6000/rs6000-common.cc b/gcc/common/config/rs6000/rs6000-common.cc
index 2140c442ba9..968db215028 100644
--- a/gcc/common/config/rs6000/rs6000-common.cc
+++ b/gcc/common/config/rs6000/rs6000-common.cc
@@ -34,6 +34,8 @@ static const struct default_options rs6000_option_optimization_table[] =
 { OPT_LEVELS_ALL, OPT_fsplit_wide_types_early, NULL, 1 },
 /* Enable -fsched-pressure for first pass instruction scheduling.  */
 { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
+/* Enable -free for zero extension and sign extension elimination.*/
+{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
 /* Enable -munroll-only-small-loops with -funroll-loops to unroll small
loops at -O2 and above by default.  */
 { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_funroll_loops, NULL, 1 },
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b92b8576027..2c525762171 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12455,8 +12455,8 @@ Attempt to remove redundant extension instructions.  This is especially
 helpful for the x86-64 architecture, which implicitly zero-extends in 64-bit
 registers after writing to their lower 32-bit half.

-Enabled for Alpha, AArch64 and x86 at levels @option{-O2},
-@option{-O3}, @option{-Os}.
+Enabled for Alpha, AArch64, PowerPC, RISC-V, SPARC, h83000 and x86 at levels
+@option{-O2}, @option{-O3}, @option{-Os}.

 @opindex fno-lifetime-dse
 @opindex flifetime-dse
-- 
2.31.1
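
As a rough illustration of the kind of source that produces extension instructions, the sketch below widens 32-bit values to 64 bits inside a loop. The function name `widen_sum` is invented for this example. Whether the REE pass actually removes or merges any extension here depends on the target, the GCC version and the surrounding code, so treat this only as the general shape of the input.

```cpp
#include <cassert>
#include <cstdint>

// Sums 32-bit values into a 64-bit accumulator.  Each element has to be
// sign-extended to 64 bits before the add; extensions like this are what
// -free (the REE pass) looks at.
std::int64_t widen_sum(const std::int32_t* values, int count) {
  assert(count >= 0);
  std::int64_t total = 0;
  for (int i = 0; i < count; ++i)
    total += values[i];  // implicit int32 -> int64 sign extension
  return total;
}
```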


[PATCH] libstdc++: Disable embedded tzdata for all 16-bit targets

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Builds OK for avr too.

Roger, does this work for xstormy16?


-- >8 --

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ZONEINFO_DIR): Extend logic for avr and
msp430 to all 16-bit targets.
* configure: Regenerate.
---
 libstdc++-v3/acinclude.m4 | 15 +--
 libstdc++-v3/configure| 18 --
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 8129373e9dd..eb30c4f00a5 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -5426,12 +5426,15 @@ AC_DEFUN([GLIBCXX_ZONEINFO_DIR], [
zoneinfo_dir=none
;;
 esac
-case "$host" in
-  avr-*-* | msp430-*-* ) embed_zoneinfo=no ;;
-  *)
-   # Also embed a copy of the tzdata.zi file as a static string.
-   embed_zoneinfo=yes ;;
-esac
+
+AC_COMPUTE_INT(glibcxx_cv_at_least_32bit, [sizeof(void*) >= 4])
+if test "$glibcxx_cv_at_least_32bit" -ne 0; then
+  # Also embed a copy of the tzdata.zi file as a static string.
+  embed_zoneinfo=yes
+else
+  # The embedded data is too large for 16-bit targets.
+  embed_zoneinfo=no
+fi
   elif test "x${with_libstdcxx_zoneinfo}" = xno; then
 # Disable tzdb support completely.
 zoneinfo_dir=none
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index 188be08d716..345ba5721a8 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -71903,12 +71903,18 @@ fi
zoneinfo_dir=none
;;
 esac
-case "$host" in
-  avr-*-* | msp430-*-* ) embed_zoneinfo=no ;;
-  *)
-   # Also embed a copy of the tzdata.zi file as a static string.
-   embed_zoneinfo=yes ;;
-esac
+
+if ac_fn_c_compute_int "$LINENO" "sizeof(void*) >= 4" "glibcxx_cv_at_least_32bit" ""; then :
+
+fi
+
+if test "$glibcxx_cv_at_least_32bit" -ne 0; then
+  # Also embed a copy of the tzdata.zi file as a static string.
+  embed_zoneinfo=yes
+else
+  # The embedded data is too large for 16-bit targets.
+  embed_zoneinfo=no
+fi
   elif test "x${with_libstdcxx_zoneinfo}" = xno; then
 # Disable tzdb support completely.
 zoneinfo_dir=none
-- 
2.40.1
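
The decision the new configure logic makes can be modelled in a few lines. This is only a sketch of the rule "embed tzdata when pointers are at least 4 bytes wide"; the function `embed_zoneinfo` and its parameter are hypothetical and exist nowhere in libstdc++, where the check runs in configure through AC_COMPUTE_INT.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors the configure test "sizeof(void*) >= 4": targets with narrower
// pointers are treated as 16-bit and do not get the embedded tzdata.zi.
bool embed_zoneinfo(std::size_t pointer_bytes) {
  assert(pointer_bytes > 0);
  return pointer_bytes >= 4;
}
```

Calling `embed_zoneinfo(sizeof(void*))` gives the answer for the machine the code is compiled for; passing 2 models a 16-bit target such as avr or msp430.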



[committed] libstdc++: Disable cacheline alignment for DJGPP [PR109741]

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Builds OK on djgpp too.

Pushed to trunk.

-- >8 --

DJGPP (and maybe other targets) uses MAX_OFILE_ALIGNMENT=16 which means
that globals (and static objects) can't have alignment greater than 16.
This causes an error for the locks defined in src/c++11/shared_ptr.cc
because we try to align them to the cacheline size, to avoid false
sharing.

Add a configure check for the increased alignment, and live with false
sharing where we can't increase the alignment.
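
A minimal sketch of that idea, assuming a configure-style macro decides whether the lock table may be over-aligned. `DEMO_CAN_ALIGNAS_CACHELINE`, `DemoLock`, `demo_get_lock` and the fallback value 64 are placeholders for this example; the real patch uses _GLIBCXX_CAN_ALIGNAS_DESTRUCTIVE_SIZE and the code in src/c++11/shared_ptr.cc, which is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the configure result.
#ifndef DEMO_CAN_ALIGNAS_CACHELINE
# define DEMO_CAN_ALIGNAS_CACHELINE 1
#endif

#ifdef __GCC_DESTRUCTIVE_SIZE
constexpr std::size_t demo_cacheline = __GCC_DESTRUCTIVE_SIZE;
#else
constexpr std::size_t demo_cacheline = 64;  // assumed fallback
#endif

#if DEMO_CAN_ALIGNAS_CACHELINE
// Each lock gets its own cacheline, which avoids false sharing.
struct alignas(demo_cacheline) DemoLock { std::mutex m; };
#else
// Over-alignment is unavailable: accept possible false sharing.
struct DemoLock { std::mutex m; };
#endif

constexpr std::size_t demo_table_size = 16;

// Picks a lock from a static table based on the object's address.
DemoLock& demo_get_lock(const void* p) {
  assert(p != nullptr);
  static DemoLock table[demo_table_size];
  auto key = reinterpret_cast<std::uintptr_t>(p);
  return table[(key >> 4) % demo_table_size];
}
```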

libstdc++-v3/ChangeLog:

PR libstdc++/109741
* acinclude.m4 (GLIBCXX_CHECK_ALIGNAS_CACHELINE): Define.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use GLIBCXX_CHECK_ALIGNAS_CACHELINE.
* src/c++11/shared_ptr.cc (__gnu_internal::get_mutex): Do not
align lock table if not supported. Use __GCC_DESTRUCTIVE_SIZE
instead of hardcoded 64.
---
 libstdc++-v3/acinclude.m4| 25 +++
 libstdc++-v3/config.h.in |  4 +++
 libstdc++-v3/configure   | 48 
 libstdc++-v3/configure.ac|  3 ++
 libstdc++-v3/src/c++11/shared_ptr.cc |  8 +++--
 5 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 988c532c4e2..8129373e9dd 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -5471,6 +5471,31 @@ AC_DEFUN([GLIBCXX_ZONEINFO_DIR], [
   fi
 ])
 
+dnl
+dnl Check whether lock tables can be aligned to avoid false sharing.
+dnl
+dnl Defines:
+dnl  _GLIBCXX_CAN_ALIGNAS_DESTRUCTIVE_SIZE if objects with static storage
+dnl    duration can be aligned to std::hardware_destructive_interference_size.
+dnl
+AC_DEFUN([GLIBCXX_CHECK_ALIGNAS_CACHELINE], [
+  AC_LANG_SAVE
+  AC_LANG_CPLUSPLUS
+
+  AC_MSG_CHECKING([whether static objects can be aligned to the cacheline size])
+  AC_TRY_COMPILE(, [struct alignas(__GCC_DESTRUCTIVE_SIZE) Aligned { };
+   alignas(Aligned) static char buf[sizeof(Aligned) * 16];
+], [ac_alignas_cacheline=yes], [ac_alignas_cacheline=no])
+  if test "$ac_alignas_cacheline" = yes; then
+AC_DEFINE_UNQUOTED(_GLIBCXX_CAN_ALIGNAS_DESTRUCTIVE_SIZE, 1,
+  [Define if global objects can be aligned to
+   std::hardware_destructive_interference_size.])
+  fi
+  AC_MSG_RESULT($ac_alignas_cacheline)
+
+  AC_LANG_RESTORE
+])
+
 # Macros from the top-level gcc directory.
 m4_include([../config/gc++filt.m4])
 m4_include([../config/tls.m4])
diff --git a/libstdc++-v3/config.h.in b/libstdc++-v3/config.h.in
index f91f7eb9097..bbb2613ff69 100644
--- a/libstdc++-v3/config.h.in
+++ b/libstdc++-v3/config.h.in
@@ -819,6 +819,10 @@
 /* Define if the compiler supports C++11 atomics. */
 #undef _GLIBCXX_ATOMIC_BUILTINS
 
+/* Define if global objects can be aligned to
+   std::hardware_destructive_interference_size. */
+#undef _GLIBCXX_CAN_ALIGNAS_DESTRUCTIVE_SIZE
+
 /* Define to use concept checking code from the boost libraries. */
 #undef _GLIBCXX_CONCEPT_CHECKS
 
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index a9589d882e6..188be08d716 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -71957,6 +71957,54 @@ _ACEOF
   fi
 
 
+
+
+  ac_ext=cpp
+ac_cpp='$CXXCPP $CPPFLAGS'
+ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext >&5'
+ac_link='$CXX -o conftest$ac_exeext $CXXFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
+ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
+
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether static objects can be aligned to the cacheline size" >&5
+$as_echo_n "checking whether static objects can be aligned to the cacheline size... " >&6; }
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+struct alignas(__GCC_DESTRUCTIVE_SIZE) Aligned { };
+   alignas(Aligned) static char buf[sizeof(Aligned) * 16];
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_compile "$LINENO"; then :
+  ac_alignas_cacheline=yes
+else
+  ac_alignas_cacheline=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+  if test "$ac_alignas_cacheline" = yes; then
+
+cat >>confdefs.h <<_ACEOF
+#define _GLIBCXX_CAN_ALIGNAS_DESTRUCTIVE_SIZE 1
+_ACEOF
+
+  fi
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_alignas_cacheline" >&5
+$as_echo "$ac_alignas_cacheline" >&6; }
+
+  ac_ext=c
+ac_cpp='$CPP $CPPFLAGS'
+ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5'
+ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
+ac_compiler_gnu=$ac_cv_c_compiler_gnu
+
+
+
 # Define documentation rules conditionally.
 
 # See if makeinfo has been installed and is modern enough
diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index 0dd550a4b4b..df01f58bd83 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -538,6 +538,9 @@ GLIBCXX_EMERGENCY_EH_ALLOC
 # For src/c++20/tzdb.cc defaults.
 GLIBCX

[PATCH] c++: -Wdangling-reference not suppressed in template [PR109774]

2023-05-16 Thread Marek Polacek via Gcc-patches
In check_return_expr, we suppress the -Wdangling-reference warning when
we're sure it would be a false positive.  It wasn't working in a
template, though, because the suppress_warning call was never reached.
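
The situation can be sketched as follows: both functions take a reference parameter but return a reference to a variable with static storage duration, so binding the result to a local reference cannot dangle. The names below (`global_value`, `get_global`, `get_global_plain`, `read_both`) are invented for this illustration and differ from the testcase in the patch.

```cpp
#include <cassert>

int global_value = 7;  // static storage duration

// Template version: the case where the warning was not being suppressed.
template <typename T>
int& get_global(const T&) {
  return global_value;
}

// Non-template version: suppression already worked here.
int& get_global_plain(const char&) {
  return global_value;
}

int read_both() {
  // The temporaries bound to the parameters die at the end of each
  // full-expression, but h and k refer to global_value, not to them.
  const int& h = get_global<char>('a');
  const int& k = get_global_plain('b');
  assert(&h == &k);
  return h + k;
}
```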

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 13.2?

PR c++/109774

gcc/cp/ChangeLog:

* typeck.cc (check_return_expr): In a template, return only after
suppressing -Wdangling-reference.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference13.C: New test.
---
 gcc/cp/typeck.cc  |  6 ++---
 .../g++.dg/warn/Wdangling-reference13.C   | 23 +++
 2 files changed, 26 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference13.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 53ac925a092..c225c4e2423 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -11236,9 +11236,6 @@ check_return_expr (tree retval, bool *no_warning)
 build_zero_cst (TREE_TYPE (retval)));
 }
 
-  if (processing_template_decl)
-return saved_retval;
-
   /* A naive attempt to reduce the number of -Wdangling-reference false
  positives: if we know that this function can return a variable with
  static storage duration rather than one of its parameters, suppress
@@ -11250,6 +11247,9 @@ check_return_expr (tree retval, bool *no_warning)
   && TREE_STATIC (bare_retval))
 suppress_warning (current_function_decl, OPT_Wdangling_reference);
 
+  if (processing_template_decl)
+return saved_retval;
+
   /* Actually copy the value returned into the appropriate location.  */
   if (retval && retval != result)
 {
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference13.C b/gcc/testsuite/g++.dg/warn/Wdangling-reference13.C
new file mode 100644
index 000..bc09fbae22b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference13.C
@@ -0,0 +1,23 @@
+// PR c++/109774
+// { dg-do compile }
+// { dg-options "-Wdangling-reference" }
+
+int y;
+
+template
+int& get(const char& )
+{
+return y;
+}
+
+int& get2(const char&)
+{
+return y;
+}
+
+int stuff(void)
+{
+const int &h = get(0); // { dg-bogus "dangling reference" }
+const int &k = get2(0); // { dg-bogus "dangling reference" }
+return h+k;
+}

base-commit: 94a311abf783de754f0f1b2d4c1f00a9788e795b
-- 
2.40.1



[PATCH] c++: Don't try to initialize zero width bitfields in zero initialization [PR109868]

2023-05-16 Thread Jakub Jelinek via Gcc-patches
Hi!

My GCC 12 change to avoid removing zero-sized bitfields as they are
important for ABI and are needed for layout compatibility traits
apparently causes zero sized bitfields to be initialized in the IL,
which at least in 13+ results in ICEs in the ranger which is upset
about zero precision types.

I think we could even avoid initializing other unnamed bitfields, but
unfortunately !CONSTRUCTOR_NO_CLEARING doesn't mean in the middle-end
clearing of padding bits and until we have some new flag that represents
the request to clear padding bits, I think it is better to keep zeroing
non-zero sized unnamed bitfields.

In addition to skipping those fields, I have changed the logic how
UNION_TYPEs are handled, the current code was a little bit weird in that
e.g. if first non-static data member had error_mark_node type, we'd happily
zero initialize the second non-static data member, etc.
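
To make the two cases concrete, here is a small sketch at the source level: a struct containing a zero-width bitfield (which only affects layout and has no value to initialize) and a union whose first member is the one that gets zeroed. The type and function names are made up for this example; it does not exercise the ICE from the PR.

```cpp
#include <cassert>

// The unnamed zero-width bitfield forces b into a new allocation unit.
struct Split {
  unsigned a : 4;
  int : 0;
  unsigned b : 4;
};

struct Dense {
  unsigned a : 4;
  unsigned b : 4;
};

static_assert(sizeof(Split) >= sizeof(Dense),
              "the zero-width bitfield never shrinks the layout");

union Either {
  int i;    // first member: the one zero-initialization targets
  float f;
};

int zeroed_fields() {
  Split s = Split();    // value-initialization, hence zero-initialization
  Either e = Either();  // zero-initializes the first member
  assert(s.a == 0 && s.b == 0);
  return static_cast<int>(s.a) + static_cast<int>(s.b) + e.i;
}
```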

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/13,
perhaps even 12?

2023-05-16  Jakub Jelinek  

PR c++/109868
* init.cc (build_zero_init_1): Don't initialize zero-width bitfields.
For unions only initialize the first FIELD_DECL.

* g++.dg/init/pr109868.C: New test.

--- gcc/cp/init.cc.jj   2023-05-01 23:07:05.147417750 +0200
+++ gcc/cp/init.cc  2023-05-16 10:01:14.512489727 +0200
@@ -189,15 +189,21 @@ build_zero_init_1 (tree type, tree nelts
 init = build_zero_cst (type);
   else if (RECORD_OR_UNION_CODE_P (TREE_CODE (type)))
 {
-  tree field;
+  tree field, next;
   vec<constructor_elt, va_gc> *v = NULL;
 
   /* Iterate over the fields, building initializations.  */
-  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+  for (field = TYPE_FIELDS (type); field; field = next)
{
+ next = DECL_CHAIN (field);
+
  if (TREE_CODE (field) != FIELD_DECL)
continue;
 
+ /* For unions, only the first field is initialized.  */
+ if (TREE_CODE (type) == UNION_TYPE)
+   next = NULL_TREE;
+
  if (TREE_TYPE (field) == error_mark_node)
continue;
 
@@ -212,6 +218,11 @@ build_zero_init_1 (tree type, tree nelts
continue;
}
 
+ /* Don't add zero width bitfields.  */
+ if (DECL_C_BIT_FIELD (field)
+ && integer_zerop (DECL_SIZE (field)))
+   continue;
+
  /* Note that for class types there will be FIELD_DECLs
 corresponding to base classes as well.  Thus, iterating
 over TYPE_FIELDs will result in correct initialization of
@@ -230,10 +241,6 @@ build_zero_init_1 (tree type, tree nelts
  if (value)
CONSTRUCTOR_APPEND_ELT(v, field, value);
}
-
- /* For unions, only the first field is initialized.  */
- if (TREE_CODE (type) == UNION_TYPE)
-   break;
}
 
   /* Build a constructor to contain the initializations.  */
--- gcc/testsuite/g++.dg/init/pr109868.C.jj   2023-05-16 09:43:54.706278293 +0200
+++ gcc/testsuite/g++.dg/init/pr109868.C  2023-05-16 09:44:16.581966894 +0200
@@ -0,0 +1,13 @@
+// PR c++/109868
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct A { virtual void foo (); };
+struct B { long b; int : 0; };
+struct C : A { B c; };
+
+void
+bar (C *p)
+{
+  *p = C ();
+}

Jakub



Re: [PATCH] c++: -Wdangling-reference not suppressed in template [PR109774]

2023-05-16 Thread Jason Merrill via Gcc-patches

On 5/16/23 15:13, Marek Polacek wrote:
> In check_return_expr, we suppress the -Wdangling-reference warning when
> we're sure it would be a false positive.  It wasn't working in a
> template, though, because the suppress_warning call was never reached.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 13.2?

OK.






Re: [PATCH] c++: Don't try to initialize zero width bitfields in zero initialization [PR109868]

2023-05-16 Thread Jason Merrill via Gcc-patches

On 5/16/23 15:34, Jakub Jelinek wrote:
> Hi!
>
> My GCC 12 change to avoid removing zero-sized bitfields as they are
> important for ABI and are needed for layout compatibility traits
> apparently causes zero sized bitfields to be initialized in the IL,
> which at least in 13+ results in ICEs in the ranger which is upset
> about zero precision types.
>
> I think we could even avoid initializing other unnamed bitfields, but
> unfortunately !CONSTRUCTOR_NO_CLEARING doesn't mean in the middle-end
> clearing of padding bits and until we have some new flag that represents
> the request to clear padding bits, I think it is better to keep zeroing
> non-zero sized unnamed bitfields.
>
> In addition to skipping those fields, I have changed the logic how
> UNION_TYPEs are handled, the current code was a little bit weird in that
> e.g. if first non-static data member had error_mark_node type, we'd happily
> zero initialize the second non-static data member, etc.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/13,
> perhaps even 12?

OK back to 12, I think.







Re: [PATCH] s390: Implement TARGET_ATOMIC_ALIGN_FOR_MODE

2023-05-16 Thread Andreas Krebbel via Gcc-patches
On 5/16/23 08:43, Stefan Schulze Frielinghaus wrote:
> So far atomic objects are aligned according to their default alignment.
> For 128 bit scalar types like int128 or long double this results in an
> 8 byte alignment which is wrong and must be 16 byte.
> 
> libstdc++ already computes a correct alignment, though, still adding a
> test case in order to make sure that both implementations are
> compatible.
> 
> Bootstrapped and regtested.  Ok for mainline?  Since this is an ABI
> break, is a backport to GCC 13 reasonable?

Ok for mainline.

I would also like to have it in GCC 13. It is an ABI breakage but on the other
hand it also fixes an ABI inconsistency between C and C++ which we should fix
asap I think.

Andreas
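
One way to look at the alignment in question from the C++ side is to query `alignof(std::atomic<T>)` directly. The helpers below are a sketch with invented names (`atomic_alignment`, `atomic_at_least_as_aligned`); they check only the portable property that an atomic is at least as aligned as its value type, not the s390-specific 16-byte result the patch is about.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Alignment the library chose for std::atomic<T> on this target.
template <typename T>
constexpr std::size_t atomic_alignment() {
  return alignof(std::atomic<T>);
}

// True if std::atomic<T> is aligned at least as strictly as plain T.
template <typename T>
bool atomic_at_least_as_aligned() {
  // Alignments are always powers of two.
  assert((atomic_alignment<T>() & (atomic_alignment<T>() - 1)) == 0);
  return atomic_alignment<T>() >= alignof(T);
}
```

On a target with 128-bit support, comparing `atomic_alignment<__int128>()` against the alignment the C front end gives `_Atomic __int128` is the consistency the new tests are meant to pin down.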


> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (TARGET_ATOMIC_ALIGN_FOR_MODE):
>   New.
>   (s390_atomic_align_for_mode): New.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.target/s390/atomic-align-1.C: New test.
>   * gcc.target/s390/atomic-align-1.c: New test.
>   * gcc.target/s390/atomic-align-2.c: New test.
> ---
>  gcc/config/s390/s390.cc   |  8 ++
>  .../g++.target/s390/atomic-align-1.C  | 25 +++
>  .../gcc.target/s390/atomic-align-1.c  | 23 +
>  .../gcc.target/s390/atomic-align-2.c  | 18 +
>  4 files changed, 74 insertions(+)
>  create mode 100644 gcc/testsuite/g++.target/s390/atomic-align-1.C
>  create mode 100644 gcc/testsuite/gcc.target/s390/atomic-align-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/atomic-align-2.c
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index 505de995da8..4813bf91dc4 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -450,6 +450,14 @@ s390_preserve_fpr_arg_p (int regno)
> && regno >= FPR0_REGNUM);
>  }
>  
> +#undef TARGET_ATOMIC_ALIGN_FOR_MODE
> +#define TARGET_ATOMIC_ALIGN_FOR_MODE s390_atomic_align_for_mode
> +static unsigned int
> +s390_atomic_align_for_mode (machine_mode mode)
> +{
> +  return GET_MODE_BITSIZE (mode);
> +}
> +
>  /* A couple of shortcuts.  */
>  #define CONST_OK_FOR_J(x) \
>   CONST_OK_FOR_CONSTRAINT_P((x), 'J', "J")
> diff --git a/gcc/testsuite/g++.target/s390/atomic-align-1.C b/gcc/testsuite/g++.target/s390/atomic-align-1.C
> new file mode 100644
> index 000..43aa0bc39ed
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/s390/atomic-align-1.C
> @@ -0,0 +1,25 @@
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-std=c++11" } */
> +/* { dg-final { scan-assembler-times {\.align\t2} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t4} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t8} 3 } } */
> +/* { dg-final { scan-assembler-times {\.align\t16} 2 } } */
> +
> +#include <atomic>
> +
> +// 2
> +std::atomic<char> var_char;
> +std::atomic<short> var_short;
> +// 4
> +std::atomic<int> var_int;
> +// 8
> +std::atomic<long> var_long;
> +std::atomic<long long> var_long_long;
> +// 16
> +std::atomic<__int128> var_int128;
> +// 4
> +std::atomic<float> var_float;
> +// 8
> +std::atomic<double> var_double;
> +// 16
> +std::atomic<long double> var_long_double;
> diff --git a/gcc/testsuite/gcc.target/s390/atomic-align-1.c b/gcc/testsuite/gcc.target/s390/atomic-align-1.c
> new file mode 100644
> index 000..b2e1233e3ee
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/atomic-align-1.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-std=c11" } */
> +/* { dg-final { scan-assembler-times {\.align\t2} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t4} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t8} 3 } } */
> +/* { dg-final { scan-assembler-times {\.align\t16} 2 } } */
> +
> +// 2
> +_Atomic char var_char;
> +_Atomic short var_short;
> +// 4
> +_Atomic int var_int;
> +// 8
> +_Atomic long var_long;
> +_Atomic long long var_long_long;
> +// 16
> +_Atomic __int128 var_int128;
> +// 4
> +_Atomic float var_float;
> +// 8
> +_Atomic double var_double;
> +// 16
> +_Atomic long double var_long_double;
> diff --git a/gcc/testsuite/gcc.target/s390/atomic-align-2.c b/gcc/testsuite/gcc.target/s390/atomic-align-2.c
> new file mode 100644
> index 000..0bf17341bf8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/atomic-align-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-O -std=c11" } */
> +/* { dg-final { scan-assembler-not {abort} } } */
> +
> +/* The stack is 8 byte aligned which means GCC has to manually align a 16 byte
> +   aligned object.  This is done by allocating not 16 but rather 24 bytes for
> +   variable X and then manually aligning a pointer inside the memory block.
> +   Validate this by ensuring that the if-statement is optimized out.  */
> +
> +void bar (_Atomic unsigned __int128 *ptr);
> +
> +void foo (void) {
> +  _Atomic unsigned __int128 x;
> +  unsigned long n = (unsigned long)&x;
> +  if (n % 16 != 0)
> +__builtin_abort ();
> +  bar (&x);
> +}



Re: [PATCH] configure: Implement --enable-host-pie

2023-05-16 Thread Iain Sandoe
Hi Marek,

> On 16 May 2023, at 16:29, Marek Polacek via Gcc-patches wrote:
> 
> Ping.

I’m trying this on Darwin (since I have a local patch to do this for modern
[darwin20+] versions, which do not allow non-PIE).

I think you are missing a hunk to deal with Ada.

Thanks for the patch,
Iain

> 
> On Tue, May 09, 2023 at 03:41:58PM -0400, Marek Polacek via Gcc-patches wrote:
>> [ This is my third attempt to add this configure option.  The first
>> version was approved but it came too late in the development cycle.
>> The second version was also approved, but I had to revert it:
>> .
>> I've fixed the problem (by moving $(PICFLAG) from INTERNAL_CFLAGS to
>> ALL_COMPILERFLAGS).  Another change is that since r13-4536 I no longer
>> need to touch Makefile.def, so this patch is simplified. ]
>> 
>> This patch implements the --enable-host-pie configure option which
>> makes the compiler executables PIE.  This can be used to enhance
>> protection against ROP attacks, and can be viewed as part of a wider
>> trend to harden binaries.
>> 
>> It is similar to the option --enable-host-shared, except that --e-h-s
>> won't add -shared to the linker flags whereas --e-h-p will add -pie.
>> It is different from --enable-default-pie because that option just
>> adds an implicit -fPIE/-pie when the compiler is invoked, but the
>> compiler itself isn't PIE.
>> 
>> Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH
>> regressions.
>> 
>> When building the compiler, the build process may use various in-tree
>> libraries; these need to be built with -fPIE so that it's possible to
>> use them when building a PIE.  For instance, when --with-included-gettext
>> is in effect, intl object files must be compiled with -fPIE.  Similarly,
>> when building in-tree gmp, isl, mpfr and mpc, they must be compiled with
>> -fPIE.
>> 
>> With this patch and --enable-host-pie used to configure gcc:
>> 
>> $ file gcc/cc1{,plus,obj} gcc/f951 gcc/lto1 gcc/cpp
>> gcc/cc1:     ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, with debug_info, not stripped
>> gcc/cc1plus: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, with debug_info, not stripped
>> gcc/f951:    ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, with debug_info, not stripped
>> gcc/cc1obj:  ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, with debug_info, not stripped
>> gcc/lto1:    ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, with debug_info, not stripped
>> gcc/cpp:     ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, with debug_info, not stripped
>> 
>> I plan to add an option to link with -Wl,-z,now.
>> 
>> Bootstrapped on x86_64-pc-linux-gnu with --with-included-gettext
>> --enable-host-pie as well as without --enable-host-pie.  Also tested
>> on a Debian system where the system gcc was configured with
>> --enable-default-pie.
>> 
>> ChangeLog:
>> 
>>  * configure.ac (--enable-host-pie): New check.  Set PICFLAG after this
>>  check.
>>  * configure: Regenerate.
>> 
>> c++tools/ChangeLog:
>> 
>>  * Makefile.in: Rename PIEFLAG to PICFLAG.  Set LD_PICFLAG.  Use it.
>>  Use pic/libiberty.a if PICFLAG is set.
>>  * configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG.
>>  (--enable-host-pie): New check.
>>  * configure: Regenerate.
>> 
>> fixincludes/ChangeLog:
>> 
>>  * Makefile.in: Set and use PICFLAG and LD_PICFLAG.  Use the "pic"
>>  build of libiberty if PICFLAG is set.
>>  * configure.ac:
>>  * configure: Regenerate.
>> 
>> gcc/ChangeLog:
>> 
>>  * Makefile.in: Set LD_PICFLAG.  Use it.  Set enable_host_pie.
>>  Remove NO_PIE_CFLAGS and NO_PIE_FLAG.  Pass LD_PICFLAG to
>>  ALL_LINKERFLAGS.  Use the "pic" build of libiberty if --enable-host-pie.
>>  * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>>  (--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
>>  check.
>>  * configure: Regenerate.
>>  * doc/install.texi: Document --enable-host-pie.
>> 
>> gcc/d/ChangeLog:
>> 
>>  * Make-lang.in: Remove NO_PIE_CFLAGS.
>> 
>> intl/ChangeLog:
>> 
>>  * Makefile.in: Use @PICFLAG@ in COMPILE as well.
>>  * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>>  (--enable-host-pie): New check.  Set PICFLAG after this check.
>>  * configure

Re: [PATCH] aarch64: Add SVE instruction types

2023-05-16 Thread Evandro Menezes via Gcc-patches
Hi, Kyrill.

It makes sense.  I could add the classification to a different attribute, as
you did, and keep it in aarch64 as well.

I took the same approach, gleaning from several optimization guides for Arm
processors supporting SVE and figuring out the smallest number of types that
could cover most variations of the resources used.  Methinks that the
classification in this patch is close to that goal, but feedback is
appreciated.

I did observe a meaningful gain in performance.  Of course, wide machines like
the V1 can handle most instruction sequences thrown at them, but there’s still
some efficiency left on the table without tailored scheduling, especially
when recovering from cache or branch misses, when it’s important to quickly
fill the pipeline back up to its steady state, even though umpteen transistors
are dedicated to making sure that misses do not happen often.

Thank you,

-- 
Evandro Menezes



> On 16 May 2023, at 03:36, Kyrylo Tkachov wrote:
> 
> Hi Evandro,
>  
> I created a new attribute so I didn’t have to extend the “type” attribute
> that lives in config/arm/types.md. As that attribute and file live in the
> arm backend but SVE is AArch64-only, I didn’t want to add logic to the arm
> backend, as it’s not truly shared.
> The granularity has been somewhat subjective. I had looked at the Software
> Optimisation guides for various SVE and SVE2-capable cores from Arm on
> developer.arm.com and tried to glean commonalities between different
> instruction groups.
> I did try writing a model for Neoverse V1 using that classification, but I
> couldn’t spend much time on it, and the resulting model didn’t give me much
> improvement and gave some regressions instead.
> I think that was more down to my rushed model than anything else, though.
>  
> Thanks,
> Kyrill
>  
> From: Evandro Menezes
> Sent: Monday, May 15, 2023 9:13 PM
> To: Kyrylo Tkachov
> Cc: Richard Sandiford; Evandro Menezes via Gcc-patches;
> evandro+...@gcc.gnu.org; Tamar Christina
> Subject: Re: [PATCH] aarch64: Add SVE instruction types
>  
> Hi, Kyrill.
>  
> I wasn’t aware of your previous patch.  Could you clarify why you considered 
> creating an SVE specific type attribute instead of reusing the common one?  I 
> really liked the iterators that you created; I’d like to use them.
>  
> Do you have specific examples which you might want to mention with regards to 
> granularity?
>  
> Yes, my intent for this patch is to enable modeling the SVE instructions on 
> N1.  The patch that implements it brings up some performance improvements, 
> but it’s mostly flat, as expected.
>  
> Thank you,
> 
> -- 
> Evandro Menezes
>  
>  
> 
> 
> On May 15, 2023, at 04:49, Kyrylo Tkachov wrote:
>  
> 
> 
> 
> -----Original Message-----
> From: Richard Sandiford
> Sent: Monday, May 15, 2023 10:01 AM
> To: Evandro Menezes via Gcc-patches
> Cc: evandro+...@gcc.gnu.org; Evandro Menezes <ebah...@icloud.com>;
> Kyrylo Tkachov <kyrylo.tkac...@arm.com>;
> Tamar Christina <tamar.christ...@arm.com>
> Subject: Re: [PATCH] aarch64: Add SVE instruction types
> 
> Evandro Menezes via Gcc-patches writes:
> 
> This patch adds the attribute `type` to most SVE1 instructions, as in the
> other instructions.
> 
> Thanks for doing this.
> 
> Could you say what criteria you used for picking the granularity?  Other
> maintainers might disagree, but personally I'd prefer to distinguish two
> instructions only if:
> 
> (a) a scheduling description really needs to distinguish them or
> (b) grouping them together would be very artificial (because they're
>logically unrelated)
> 
> It's always possible to split types later if new scheduling descriptions
> require it.  Because of that, I don't think we should try to predict ahead
> of time what future scheduling descriptions will need.
> 
> Of course, this depends on having results that show that scheduling
> makes a significant difference on an SVE core.  I think one of the
> problems here is that, when a different scheduling model changes the
> performance of a particular test, it's difficult to tell whether
> the gain/loss is caused by the model being more/less accurate than
> the previous one, or if it's due to important "secondary" effects
> on register live ranges.  Instinctively, I'd have expected these
> secondary effects to dominate on OoO cores.
> 
> I agree with Richard on these points. The key here is getting the granularity 
> right without having to maintain too many types that aren't useful in the 
> models.
> FWIW I had posted 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607101.html in 
> November. It adds annotations to SVE2 patterns as well as for base SVE.
> Feel free to reuse it if you'd like.
> I see you had post

RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

A few of us were talking about test-related issues in the patchwork meeting
this morning.  I bumped to trunk and did a full rebuild, I'm getting the
following (it's in riscv-systems-ci/riscv-gnu-toolchain).  This is about what I
remember seeing last time I ran the tests, which was a week or so ago.  I
figured it'd be best to just blast the lists, as Jeff said his test running had
been hanging so there might be some issue preventing folks from seeing the
failures.

I guess I didn't get time to look last time and I doubt things are looking any
better right now.  I'll try and take a look at some point, but any help would
of course be appreciated.

$ cat toolchain/report
make[1]: Entering directory '/scratch/merges/rgt-gcc-trunk/toolchain'
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/scripts/testsuite-filter gcc glibc 
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/test/allowlist `find 
build-gcc-linux-stage2/gcc/testsuite/ -name *.sum |paste -sd "," -`
=== g++: Unexpected fails for rv64imac lp64 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
=== g++: Unexpected fails for rv32imac ilp32 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
=== g++: Unexpected fails for rv64imafdc lp64d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
=== g++: Unexpected fails for rv32imafdc ilp32d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
=== g++: Unexpected fails for rv64imafdcv lp64d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
FAIL: g++.target/riscv/rvv/base/bug-14.C execution test
FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
FAIL: g++.target/riscv/rvv/base/bug-9.C execution test
=== g++: Unexpected fails for rv32imafdcv ilp32d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
FAIL: g++.target/riscv/rvv/base/bug-14.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
FAIL: g++.target/riscv/rvv/base/bug-18.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-19.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
FAIL: g++.target/riscv/rvv/base/bug-20.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-21.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-22.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
FAIL: g++.target/riscv/rvv/base/bug-9.C (test for excess errors)
=== g++: Unexpected fails for rv64gczba_zbb_zbc_zbs lp64d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
=== gcc: Unexpected fails for rv64imac lp64 medlow ===
ERROR: tcl error sourcing 
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp.
ERROR: torture-init: torture_without_loops is not empty as expected
ERROR: tcl error sourcing 
/scratch/merges/rgt-gcc-trunk/risc

[committed] c: Remove restrictions on declarations in 'for' loops for C2X

2023-05-16 Thread Joseph Myers
C2X removes a restriction that the only declarations in the
declaration part of a 'for' loop are declarations of objects with
storage class auto or register.  Implement this change, making the
diagnostics into pedwarn_c11 calls instead of errors (as usual for
features added in a new standard version that were invalid code in a
previous version), so now pedwarn-if-pedantic for older standards and
diagnosed also with -Wc11-c2x-compat.
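
For illustration only, here is a minimal sketch of the kind of code this
change affects.  It is not taken from the patch or its tests; the macro
USE_C2X_FOR_DECLS and both function names are hypothetical.  The guarded
branches use declarations that were constraint violations before C2X, so
they should only be enabled with a compiler that includes this change.

```c
/* Sum 1..10 with the loop counter declared in the for-init clause.  */
int
sum_to_ten (void)
{
  int j = 0;
#ifdef USE_C2X_FOR_DECLS
  /* Hypothetical opt-in macro.  A static counter in the declaration part
     was rejected before C2X.  Being static, i is initialized only once,
     so a second call would start from i == 11 and add nothing.  */
  for (static int i = 1; i <= 10; i++)
    j += i;
#else
  /* Form valid since C99: an object with automatic storage.  */
  for (int i = 1; i <= 10; i++)
    j += i;
#endif
  return j;
}

/* Count the iterations from FOO (0) up to, but not including, 10.  */
int
count_enum_steps (void)
{
  int n = 0;
#ifdef USE_C2X_FOR_DECLS
  /* Declaring an enum type and its constant in the loop header is
     accepted once the constraint is removed.  */
  for (enum { FOO } i = FOO; i < 10; i++)
    n++;
#else
  /* Equivalent form for older standards: declare before the loop.  */
  enum { FOO } i;
  for (i = FOO; i < 10; i++)
    n++;
#endif
  return n;
}
```

With the macro defined, a compiler carrying this patch would be expected to
reject the code under -std=c11 -pedantic-errors (as the dg-error directives
in c11-fordecl-1.c check) and accept it under -std=c2x.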

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-decl.cc (check_for_loop_decls): Use pedwarn_c11 for
diagnostics.

gcc/testsuite/
* gcc.dg/c11-fordecl-1.c, gcc.dg/c11-fordecl-2.c,
gcc.dg/c11-fordecl-3.c, gcc.dg/c11-fordecl-4.c,
gcc.dg/c2x-fordecl-1.c, gcc.dg/c2x-fordecl-2.c,
gcc.dg/c2x-fordecl-3.c, gcc.dg/c2x-fordecl-4.c: New tests.
* gcc.dg/c99-fordecl-2.c: Test diagnostic for typedef declaration
in for loop here.
* gcc.dg/pr67784-2.c, gcc.dg/pr68320.c, objc.dg/foreach-7.m: Do
not expect errors for typedef declaration in for loop.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 90d7cd27cd5..f8ede362bfd 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -11032,7 +11032,9 @@ check_for_loop_decls (location_t loc, bool turn_off_iso_c99_error)
  only applies to those that are.  (A question on this in comp.std.c
  in November 2000 received no answer.)  We implement the strictest
  interpretation, to avoid creating an extension which later causes
- problems.  */
+ problems.
+
+ This constraint was removed in C2X.  */
 
   for (b = current_scope->bindings; b; b = b->prev)
 {
@@ -11048,33 +11050,35 @@ check_for_loop_decls (location_t loc, bool turn_off_iso_c99_error)
  {
location_t decl_loc = DECL_SOURCE_LOCATION (decl);
if (TREE_STATIC (decl))
- error_at (decl_loc,
-   "declaration of static variable %qD in %<for%> loop "
-   "initial declaration", decl);
+ pedwarn_c11 (decl_loc, OPT_Wpedantic,
+  "declaration of static variable %qD in %<for%> "
+  "loop initial declaration", decl);
else if (DECL_EXTERNAL (decl))
- error_at (decl_loc,
-   "declaration of %<extern%> variable %qD in %<for%> loop "
-   "initial declaration", decl);
+ pedwarn_c11 (decl_loc, OPT_Wpedantic,
+  "declaration of %<extern%> variable %qD in %<for%> "
+  "loop initial declaration", decl);
  }
  break;
 
case RECORD_TYPE:
- error_at (loc,
-   "%<struct %E%> declared in %<for%> loop initial "
-   "declaration", id);
+ pedwarn_c11 (loc, OPT_Wpedantic,
+  "%<struct %E%> declared in %<for%> loop initial "
+  "declaration", id);
  break;
case UNION_TYPE:
- error_at (loc,
-   "%<union %E%> declared in %<for%> loop initial declaration",
-   id);
+ pedwarn_c11 (loc, OPT_Wpedantic,
+  "%<union %E%> declared in %<for%> loop initial "
+  "declaration",
+  id);
  break;
case ENUMERAL_TYPE:
- error_at (loc, "%<enum %E%> declared in %<for%> loop "
-   "initial declaration", id);
+ pedwarn_c11 (loc, OPT_Wpedantic,
+  "%<enum %E%> declared in %<for%> loop "
+  "initial declaration", id);
  break;
default:
- error_at (loc, "declaration of non-variable "
-   "%qD in %<for%> loop initial declaration", decl);
+ pedwarn_c11 (loc, OPT_Wpedantic, "declaration of non-variable "
+  "%qD in %<for%> loop initial declaration", decl);
}
 
   n_decls++;
diff --git a/gcc/testsuite/gcc.dg/c11-fordecl-1.c b/gcc/testsuite/gcc.dg/c11-fordecl-1.c
new file mode 100644
index 000..4aceb335e18
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-fordecl-1.c
@@ -0,0 +1,27 @@
+/* Test for C99 declarations in for loops.  Test constraints are diagnosed for
+   C11.  Based on c99-fordecl-2.c.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+void
+foo (void)
+{
+  int j = 0;
+  for (int i = 1, bar (void); i <= 10; i++) /* { dg-error "bar" } */
+j += i;
+
+  for (static int i = 1; i <= 10; i++) /* /* { dg-error "static" } */
+j += i;
+
+  for (extern int i; j <= 500; j++) /* { dg-error "extern" } */
+j += 5;
+
+  for (enum { FOO } i = FOO; i < 10; i++) /* { dg-error "FOO" } */
+j += i;
+
+  for (enum BAR { FOO } i = FOO; i < 10; i++) /* { dg-error "FOO" } */
+/* { dg-error "BAR" "enum tag in for loop" { target *-*-* } .-1 } */
+j += i;
+  for (typedef int T;;) /* { dg-error "non-variable" } */
+;
+}
diff --git a/gcc/testsuite/gcc.dg/c11-fordecl-2.c b/gcc/testsuite/gcc.dg/c11-fordecl-2.c
new file mode 100644
index 000..0be1a0d13fa
--- /dev/null
+

Re: RISC-V Test Errors and Failures

2023-05-16 Thread Vineet Gupta

On 5/16/23 16:06, Palmer Dabbelt wrote:
> A few of us were talking about test-related issues in the patchwork meeting
> this morning.  I bumped to trunk and did a full rebuild, I'm getting the
> following (it's in riscv-systems-ci/riscv-gnu-toolchain).  This is about what I
> remember seeing last time I ran the tests, which was a week or so ago.  I
> figured it'd be best to just blast the lists, as Jeff said his test running had
> been hanging so there might be some issue preventing folks from seeing the
> failures.
>
> I guess I didn't get time to look last time and I doubt things are looking any
> better right now.  I'll try and take a look at some point, but any help would
> of course be appreciated.


Yes, I was seeing similar tcl errors and such, and in my case an even 
higher count.

Also, for posterity, what was your configure command line? Multilibs or not?
We really need to add some CI around RV toolchains to trip on these sooner!



> $ cat toolchain/report
> make[1]: Entering directory '/scratch/merges/rgt-gcc-trunk/toolchain'
> /scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/scripts/testsuite-filter gcc glibc
> /scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/test/allowlist `find
> build-gcc-linux-stage2/gcc/testsuite/ -name *.sum |paste -sd "," -`
> === g++: Unexpected fails for rv64imac lp64 medlow ===
> FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
> === g++: Unexpected fails for rv32imac ilp32 medlow ===
> FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
> FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
> FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
> === g++: Unexpected fails for rv64imafdc lp64d medlow ===
> FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
> === g++: Unexpected fails for rv32imafdc ilp32d medlow ===
> FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
> FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
> FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
> === g++: Unexpected fails for rv64imafdcv lp64d  ===
> FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
> FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-14.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-9.C execution test
> === g++: Unexpected fails for rv32imafdcv ilp32d  ===
> FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
> FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
> FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
> FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-14.C (test for excess errors)
> FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-18.C (test for excess errors)
> FAIL: g++.target/riscv/rvv/base/bug-19.C (test for excess errors)
> FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-20.C (test for excess errors)
> FAIL: g++.target/riscv/rvv/base/bug-21.C (test for excess errors)
> FAIL: g++.target/riscv/rvv/base/bug-22.C (test for excess errors)
> FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
> FAIL: g++.target/riscv/rvv/base/bug-9.C (test for excess errors)
> === g++: Unexpected fails for rv64gczba_zbb_zbc_zbs lp64d ===
> FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
> === gcc: Unexpected fails for rv64imac lp64 medlow ===
> ERROR: tcl error sourcing
> /scratch/
