[PATCH] RISC-V: Introduce rounding mode operand into fixed-point intrinsics

2023-05-16 Thread juzhe . zhong
From: Juzhe-Zhong 

According to the upcoming fixed-point API:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222

Introduce vxrm argument:
- vint32m1_t __riscv_vsadd_vv_i32m1 (vint32m1_t op1, vint32m1_t op2, size_t vl);
+ vint32m1_t __riscv_vsadd_vv_i32m1 (vint32m1_t op1, vint32m1_t op2, size_t vxrm, size_t vl);

This patch doesn't insert the vxrm csrw configuration instruction yet.
Automatic insertion of the csrw vxrm instruction will be supported in the next patch.

This patch does the following:
1. Only extends the intrinsics with the vxrm argument.
2. Checks whether the vxrm argument is a valid immediate and reports an error 
message if it is invalid.
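
As an illustration only, the immediate check in item 2 could be modelled in plain C roughly as below. This is a sketch, not the code in this patch: the helper names are hypothetical, the VXRM_* names mirror the enum proposed elsewhere in this thread, and the numeric encodings are the ones the RVV specification assigns to the vxrm field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* vxrm encodings (RVV specification).  The enumerator names are an
   assumption for this sketch, not something this patch defines.  */
enum vxrm_mode { VXRM_RNU = 0, VXRM_RNE = 1, VXRM_RDN = 2, VXRM_ROD = 3 };

/* Hypothetical helper: true if VXRM is one of the four valid
   rounding-mode immediates.  */
static bool
vxrm_immediate_valid_p (size_t vxrm)
{
  return vxrm <= (size_t) VXRM_ROD;
}

/* Hypothetical helper: assembler-style name of a valid rounding mode.  */
static const char *
vxrm_name (size_t vxrm)
{
  static const char *const names[] = { "rnu", "rne", "rdn", "rod" };
  assert (vxrm_immediate_valid_p (vxrm));
  return names[vxrm];
}
```

A compiler-side check would additionally require the argument to be a compile-time constant; that part is not modelled here.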

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Introduce rounding mode.
* config/riscv/riscv-vector-builtins-shapes.cc (struct alu_def): Ditto.
(struct narrow_alu_def): Ditto.
* config/riscv/riscv-vector-builtins.cc 
(function_builder::apply_predication): Ditto.
(function_expander::use_exact_insn): Ditto.
* config/riscv/riscv-vector-builtins.h (function_checker::arg_num): New 
function.
(function_base::has_rounding_mode_operand_p): New function.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/bug-11.C: Adapt testcase.
* g++.target/riscv/rvv/base/bug-12.C: Ditto.
* g++.target/riscv/rvv/base/bug-14.C: Ditto.
* g++.target/riscv/rvv/base/bug-15.C: Ditto.
* g++.target/riscv/rvv/base/bug-16.C: Ditto.
* g++.target/riscv/rvv/base/bug-17.C: Ditto.
* g++.target/riscv/rvv/base/bug-18.C: Ditto.
* g++.target/riscv/rvv/base/bug-19.C: Ditto.
* g++.target/riscv/rvv/base/bug-20.C: Ditto.
* g++.target/riscv/rvv/base/bug-21.C: Ditto.
* g++.target/riscv/rvv/base/bug-22.C: Ditto.
* g++.target/riscv/rvv/base/bug-23.C: Ditto.
* g++.target/riscv/rvv/base/bug-3.C: Ditto.
* g++.target/riscv/rvv/base/bug-5.C: Ditto.
* g++.target/riscv/rvv/base/bug-6.C: Ditto.
* g++.target/riscv/rvv/base/bug-8.C: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-100.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-101.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-102.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-103.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-104.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-105.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-106.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-107.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-108.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-109.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-110.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-111.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-112.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-113.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-114.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-115.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-116.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-117.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-118.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-119.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-122.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-97.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-98.c: Ditto.
* gcc.target/riscv/rvv/base/merge_constraint-1.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-6.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-7.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-8.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-9.c: Ditto.
* gcc.target/riscv/rvv/base/vxrm-2.c: New test.
* gcc.target/riscv/rvv/base/vxrm-3.c: New test.
* gcc.target/riscv/rvv/base/vxrm-4.c: New test.
* gcc.target/riscv/rvv/base/vxrm-5.c: New test.

---
 .../riscv/riscv-vector-builtins-bases.cc  |  10 ++
 .../riscv/riscv-vector-builtins-shapes.cc |  26 +++
 gcc/config/riscv/riscv-vector-builtins.cc |  19 +-
 gcc/config/riscv/riscv-vector-builtins.h  |  18 ++
 .../g++.target/riscv/rvv/base/bug-11.C|   2 +-
 .../g++.target/riscv/rvv/base/bug-12.C|   8 +-
 .../g++.target/riscv/rvv/base/bug-14.C|   4 +-
 .../g++.target/riscv/rvv/base/bug-15.C|   2 +-
 .../g++.target/riscv/rvv/base/bug-16.C|   8 +-
 .../g++.target/riscv/rvv/base/bug-17.C|   2 +-
 .../g++.target/riscv/rvv/base/bug-18.C|   2 +-
 .../g++.target/riscv/rvv/base/bug-19.C|   2 +-
 .../g++.target/riscv/rvv/base/bug-20.C|   2 +-
 .../g++.target/riscv/rvv/base/bug-21.C|   2 +-
 

Re: RISC-V: Remove masking third operand of rotate instructions

2023-05-16 Thread Jeff Law via Gcc-patches




On 5/10/23 09:50, Jivan Hakobyan via Gcc-patches wrote:

Subject:
RISC-V: Remove masking third operand of rotate instructions
From:
Jivan Hakobyan via Gcc-patches 
Date:
5/10/23, 09:50

To:
gcc-patches@gcc.gnu.org


Rotate instructions do not need to mask the third operand.
For example, on RV64 the following code:

unsigned long foo1(unsigned long rs1, unsigned long rs2)
{
 long shamt = rs2 & (64 - 1);
 return (rs1 << shamt) | (rs1 >> ((64 - shamt) & (64 - 1)));
}

Compiles to:
foo1:
 andi a1,a1,63
 rol a0,a0,a1
 ret

This patch removes unnecessary masking.
Besides, I have merged masking insns for shifts that were written before.
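
To see why the andi is redundant, here is a small C model of the instruction. It is only a sketch: rol64 and check_mask_is_redundant are hypothetical names, and the model assumes (as the Zbb specification states for RV64) that rol uses only the low 6 bits of rs2 as the shift amount.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the RV64 rol instruction: only the low 6 bits of RS2 are used.  */
static uint64_t
rol64 (uint64_t rs1, uint64_t rs2)
{
  unsigned shamt = rs2 & 63;
  /* The right-shift count is masked as well, so shamt == 0 is well defined.  */
  return (rs1 << shamt) | (rs1 >> ((64 - shamt) & 63));
}

/* foo1 from the message above, expressed on top of the model.  */
static uint64_t
foo1 (uint64_t rs1, uint64_t rs2)
{
  return rol64 (rs1, rs2 & (64 - 1));
}

/* Masking RS2 before the rotate never changes the result.  */
static void
check_mask_is_redundant (uint64_t rs1, uint64_t rs2)
{
  assert (foo1 (rs1, rs2) == rol64 (rs1, rs2));
}
```

Since the explicit mask and the implicit one select the same 6 bits, dropping the andi cannot change the rotate result.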


gcc/ChangeLog:
 * config/riscv/riscv.md: Merged
 * config/riscv/bitmanip.md: New insns
 * config/riscv/iterators.md: New iterator and optab items
 * config/riscv/predicates.md: New predicates
Usually we try to mention the patterns that got changed.  So something 
like this


* config/riscv/riscv.md (*3_mask):  New pattern,
combined from
(*si3_mask, *di3_mask): Here.

Similarly for the other patterns in riscv.md that you combined.

For the bitmanip ChangeLog it might look like

* config/riscv/bitmanip.md (*3_mask): New
pattern.
(*si3_sext_mask): Likewise.


* config/riscv/iterators.md (shiftm1): Generalize to handle more
masking constants.
(bitmanip_rotate): New iterator.
(bitmanip_optab): Add rotates.

* config/riscv/predicates.md (const_si_mask_operand): Renamed
from const31_operand.  Generalize to handle more mask constants.
(const_di_mask_operand): Similarly.





-- With the best regards Jivan Hakobyan


rotate_mask.patch

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index a27fc3e34a1..0fd0cbdeb04 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -351,6 +351,42 @@
"rolw\t%0,%1,%2"
[(set_attr "type" "bitmanip")])
  
+(define_insn_and_split "*3_mask"

+  [(set (match_operand:X 0 "register_operand" "= r")
+(bitmanip_rotate:X
+(match_operand:X 1 "register_operand" "  r")
+(match_operator 4 "subreg_lowpart_operator"
+ [(and:X
+   (match_operand:X 2 "register_operand"  "r")
+   (match_operand 3 "" ""))])))]
+  "TARGET_ZBB || TARGET_ZBKB"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+(bitmanip_rotate:X (match_dup 1)
+   (match_dup 2)))]
+  "operands[2] = gen_lowpart (QImode, operands[2]);"
+  [(set_attr "type" "bitmanip")
+   (set_attr "mode" "")])
It's worth noting that by using a subreg_lowpart_operator in this manner 
we can match any narrowing subreg lowpart rather than being restricted 
to QImode.  Clever use of iterators for the predicate and constraints on 
operand 3 as well.








diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c508ee3ad89..777d9468efa 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2010,44 +2010,23 @@
[(set_attr "type" "shift")
 (set_attr "mode" "SI")])
  
-(define_insn_and_split "*si3_mask"

-  [(set (match_operand:SI 0 "register_operand" "= r")
-   (any_shift:SI
-   (match_operand:SI 1 "register_operand" "  r")
+(define_insn_and_split "*3_mask"
+  [(set (match_operand:X 0 "register_operand" "= r")
+   (any_shift:X
+   (match_operand:X 1 "register_operand" "  r")
(match_operator 4 "subreg_lowpart_operator"
 [(and:SI
   (match_operand:SI 2 "register_operand"  "r")
-  (match_operand 3 "const_int_operand"))])))]

Shouldn't the mode of operand 2 change to "X" as well?






-(define_insn_and_split "*di3_mask_1"
-  [(set (match_operand:DI 0 "register_operand" "= r")
-   (any_shift:DI
-   (match_operand:DI 1 "register_operand" "  r")
+(define_insn_and_split "*3_mask_1"
+  [(set (match_operand:GPR 0 "register_operand" "= r")
+   (any_shift:GPR
+   (match_operand:GPR 1 "register_operand" "  r")
(match_operator 4 "subreg_lowpart_operator"
 [(and:DI
   (match_operand:DI 2 "register_operand"  "r")
-  (match_operand 3 "const_int_operand"))])))]
Presumably we use GPR here because for TARGET_64BIT we can match both 
the 32bit and 64bit opcodes?  I was wondering if we should do the same 
for the rotate patterns -- use the GPR iterator rather than the X 
iterator to match rol[w] and ror[w].


Overall it looks really good.  Both in terms of improving code 
generation for the rotates and cleaning up the shift patterns a bit too. 
 Just a couple questions/cleanups and an improved ChangeLog and this 
should be good to go.


jeff






Re: RISC-V Test Errors and Failures

2023-05-16 Thread Jeff Law via Gcc-patches




On 5/16/23 20:39, Palmer Dabbelt wrote:



By "chroot environment" you mean something like a 
debootstrap-into-chroot with qemu-user/binfmt-misc?  I don't have that 
setup right now, but it wouldn't be a big lift.
Essentially, yes.  I actually have home-built ones for the various 
targets.  There was a time when they needed to be <100M so they could be 
stored on github.  Going forward I just want them to be a docker 
container for Fedora or Ubuntu.  With binfmt you can run non-native 
containers trivially.


jeff


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Jeff Law via Gcc-patches




On 5/16/23 21:33, Kito Cheng via Gcc-patches wrote:

diff --git a/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run 
b/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
index 94d6ec5..efc3a80 100755
--- a/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
+++ b/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
@@ -12,4 +12,4 @@ done

  xlen="$(readelf -h $1 | grep 'Class' | cut -d: -f 2 | xargs echo | sed 
's/^ELF//')"

-qemu-riscv$xlen -r 5.10 "${qemu_args[@]}" -L ${RISC_V_SYSROOT} -cpu 
rv$xlen,zba=on,zbb=on,zbc=on,zbs=on "$@"
+qemu-riscv$xlen -r 5.10 "${qemu_args[@]}" -L ${RISC_V_SYSROOT} -cpu 
rv$xlen,zba=on,zbb=on,zbc=on,zbs=on,v=on "$@"


This does not work when you are testing some combinations, e.g. Z*inx and zve*,
but anyway I guess those configurations don't matter for you guys :P
What you could do is install a suitable binfmt handler, then you don't 
need the wrappers at all.  That's how we're handling this stuff in 
Ventana.  It also means you don't need magic dejagnu baseboard files or 
anything like that.  In fact from dejagnu's standpoint it looks native.


jeff



Re: Re: [PATCH V5] RISC-V: Using merge approach to optimize repeating sequence in vec_init

2023-05-16 Thread Kito Cheng via Gcc-patches
On Wed, May 17, 2023 at 11:36 AM juzhe.zh...@rivai.ai
 wrote:
>
> >> Does this mean we assume inner_int_mode is DImode (because of sizeof
> >> (uint64_t))?
> >> Or should it be something like `for (unsigned int i = 0; i <
> >> (GET_MODE_SIZE(inner_int_mode ()) * 8 / npatterns ()); i++)`?
> No, sizeof (uint64_t) means uint64_t mask = 0;

+  return gen_int_mode (mask, inner_int_mode ());
And we expect the uint64_t mask can always be put into inner_int_mode ()?
If not, why do we fill up all 64 bits?
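
For reference, the mask construction under discussion can be modelled in standalone C as follows. This is a sketch under assumptions: merge_mask_bitfield is a hypothetical stand-in for rvv_builder::get_merge_mask_bitfield, and the elt_bits parameter models my understanding that gen_int_mode truncates the value to the width of the requested mode.

```c
#include <assert.h>
#include <stdint.h>

/* Build the repeating mask for pattern INDEX out of NPATTERNS patterns,
   then keep only the low ELT_BITS bits (modelling truncation to the
   inner integer mode).  */
static uint64_t
merge_mask_bitfield (unsigned index, unsigned npatterns, unsigned elt_bits)
{
  assert (npatterns > 0 && npatterns <= 64 && index < npatterns);
  assert (elt_bits > 0 && elt_bits <= 64);

  uint64_t base_mask = 1ULL << index;
  uint64_t mask = 0;
  /* Always fills all 64 bits, independent of ELT_BITS.  */
  for (unsigned i = 0; i < 64 / npatterns; i++)
    mask |= base_mask << (i * npatterns);

  if (elt_bits < 64)
    mask &= (1ULL << elt_bits) - 1;
  return mask;
}
```

With two patterns, index 1 yields 0xAAAAAAAAAAAAAAAA for a 64-bit element and 0xAA for an 8-bit one, so under the truncation assumption the upper bits are simply discarded.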

>
> >> Do you mind adding more comments about this? What does it check and what does it do?
> We use known_gt (GET_MODE_SIZE (dup_mode), BYTES_PER_RISCV_VECTOR)
> because we use a vector integer mode to generate the mask. For example,
> to generate a 0b01010101010101 mask, we use a scalar register
> holding value = 0b010101010...,
> then vmv.v.x it into a vector, and this vector is then used as a mask.
>
> >> Why is this hidden only in the else branch? I guess I have this question because I
> >> don't fully understand the logic of the if condition.
>
> Because we can't use a vector floating-point instruction to generate a mask.

I don't get why it's not something like below?

if (known_gt (GET_MODE_SIZE (dup_mode), BYTES_PER_RISCV_VECTOR))
{
...
}
if (FLOAT_MODE_P (dup_mode))
{
...
}



>
> >> nit: builder.inner_mode () rather than GET_MODE_INNER (dup_mode)?
>
> They are the same. I can change it using GET_MODE_INNER
>
> >> And I would like to have more comments explaining why we need force_reg here.
> Because otherwise it will cause an ICE.

But why? And why can it be resolved by force_reg? You need a few more
comments in the code.


Re: Re: [PATCH V5] RISC-V: Using merge approach to optimize repeating sequence in vec_init

2023-05-16 Thread juzhe.zh...@rivai.ai
>> Does this mean we assume inner_int_mode is DImode (because of sizeof (uint64_t))?
>> Or should it be something like `for (unsigned int i = 0; i <
>> (GET_MODE_SIZE(inner_int_mode ()) * 8 / npatterns ()); i++)`?
No, sizeof (uint64_t) means uint64_t mask = 0;

>> Do you mind adding more comments about this? What does it check and what does it do?
We use known_gt (GET_MODE_SIZE (dup_mode), BYTES_PER_RISCV_VECTOR)
because we use a vector integer mode to generate the mask. For example,
to generate a 0b01010101010101 mask, we use a scalar register holding 
value = 0b010101010...,
then vmv.v.x it into a vector, and this vector is then used as a mask.

>> Why is this hidden only in the else branch? I guess I have this question because I
>> don't fully understand the logic of the if condition.

Because we can't use a vector floating-point instruction to generate a mask.

>> nit: builder.inner_mode () rather than GET_MODE_INNER (dup_mode)?

They are the same. I can change it using GET_MODE_INNER

>> And I would like to have more comments explaining why we need force_reg here.
Because otherwise it will cause an ICE.




juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2023-05-17 11:21
To: juzhe.zhong
CC: gcc-patches; palmer; jeffreyalaw
Subject: Re: [PATCH V5] RISC-V: Using merge approach to optimize repeating 
sequence in vec_init
> +
> +/* Get the mask for merge approach.
> +
> + Consider such following case:
> +   {a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b}
> + To merge "a", the mask should be 1010
> + To merge "b", the mask should be 0101
> +*/
> +rtx
> +rvv_builder::get_merge_mask_bitfield (unsigned int index) const
> +{
> +  uint64_t base_mask = (1ULL << index);
> +  uint64_t mask = 0;
> +  for (unsigned int i = 0; i < (sizeof (uint64_t) * 8 / npatterns ()); i++)
> +mask |= base_mask << (i * npatterns ());
> +  return gen_int_mode (mask, inner_int_mode ());
 
Does this mean we assume inner_int_mode is DImode (because of sizeof (uint64_t))?
Or should it be something like `for (unsigned int i = 0; i <
(GET_MODE_SIZE(inner_int_mode ()) * 8 / npatterns ()); i++)`?
 
> +}
> +
>  /* Subroutine of riscv_vector_expand_vector_init.
> Works as follows:
> (a) Initialize TARGET by broadcasting element NELTS_REQD - 1 of BUILDER.
> @@ -1226,6 +1307,107 @@ expand_vector_init_insert_elems (rtx target, const 
> rvv_builder ,
>  }
>  }
>
> +/* Emit vmv.s.x instruction.  */
> +
> +static void
> +emit_scalar_move_op (rtx dest, rtx src, machine_mode mask_mode)
> +{
> +  insn_expander<8> e;
> +  machine_mode mode = GET_MODE (dest);
> +  rtx scalar_move_mask = gen_scalar_move_mask (mask_mode);
> +  e.set_dest_and_mask (scalar_move_mask, dest, mask_mode);
> +  e.add_input_operand (src, GET_MODE_INNER (mode));
> +  e.set_len_and_policy (const1_rtx, false);
> +  e.expand (code_for_pred_broadcast (mode), false);
> +}
> +
> +/* Emit merge instruction.  */
> +
> +static void
> +emit_merge_op (rtx dest, rtx src1, rtx src2, rtx mask)
> +{
> +  insn_expander<8> e;
> +  machine_mode mode = GET_MODE (dest);
> +  e.set_dest_merge (dest);
> +  e.add_input_operand (src1, mode);
> +  if (VECTOR_MODE_P (GET_MODE (src2)))
> +e.add_input_operand (src2, mode);
> +  else
> +e.add_input_operand (src2, GET_MODE_INNER (mode));
> +
> +  e.add_input_operand (mask, GET_MODE (mask));
> +  e.set_len_and_policy (NULL_RTX, true, true, false);
> +  if (VECTOR_MODE_P (GET_MODE (src2)))
> +e.expand (code_for_pred_merge (mode), false);
> +  else
> +e.expand (code_for_pred_merge_scalar (mode), false);
> +}
> +
> +/* Use merge approach to initialize the vector with repeating sequence.
> + v = {a, b, a, b, a, b, a, b}.
> +
> + v = broadcast (a).
> + mask = 0b01010101
> + v = merge (v, b, mask)
> +*/
> +static void
> +expand_vector_init_merge_repeating_sequence (rtx target,
> +const rvv_builder )
> +{
> +  machine_mode mask_mode = get_mask_mode (builder.mode ()).require ();
> +  machine_mode dup_mode = builder.mode ();
> +  if (known_gt (GET_MODE_SIZE (dup_mode), BYTES_PER_RISCV_VECTOR))
> +{
> +  poly_uint64 nunits
> +   = exact_div (BYTES_PER_RISCV_VECTOR, builder.inner_units ());
> +  dup_mode = get_vector_mode (builder.inner_int_mode (), nunits).require 
> ();
> +}
 
Do you mind adding more comments about this? What does it check and what does it do?
 
> +  else
> +{
> +  if (FLOAT_MODE_P (dup_mode))
> +   {
> + poly_uint64 nunits = GET_MODE_NUNITS (dup_mode);
> + dup_mode
> +   = get_vector_mode (builder.inner_int_mode (), nunits).require ();
> +   }
 
Why is this hidden only in the else branch? I guess I have this question because I
don't fully understand the logic of the if condition.
 
> +}
> +
> +  machine_mode dup_mask_mode = get_mask_mode (dup_mode).require ();
> +
> +  /* Step 1: Broadcast the 1st-pattern.  */
> +  emit_len_op (code_for_pred_broadcast (builder.mode ()), target,
> +  force_reg (builder.inner_mode (), 

Re: RISC-V Test Errors and Failures

2023-05-16 Thread Kito Cheng via Gcc-patches
> diff --git a/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run 
> b/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
> index 94d6ec5..efc3a80 100755
> --- a/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
> +++ b/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
> @@ -12,4 +12,4 @@ done
>
>  xlen="$(readelf -h $1 | grep 'Class' | cut -d: -f 2 | xargs echo | sed 
> 's/^ELF//')"
>
> -qemu-riscv$xlen -r 5.10 "${qemu_args[@]}" -L ${RISC_V_SYSROOT} -cpu 
> rv$xlen,zba=on,zbb=on,zbc=on,zbs=on "$@"
> +qemu-riscv$xlen -r 5.10 "${qemu_args[@]}" -L ${RISC_V_SYSROOT} -cpu 
> rv$xlen,zba=on,zbb=on,zbc=on,zbs=on,v=on "$@"

This does not work when you are testing some combinations, e.g. Z*inx and zve*,
but anyway I guess those configurations don't matter for you guys :P

>
> for now.  I'm going to throw together hwprobe for qemu-user, from looking at
> the AVX stuff it should be pretty easy to plumb that into DG and then get the
> detection going.


Re: [PATCH V5] RISC-V: Using merge approach to optimize repeating sequence in vec_init

2023-05-16 Thread Kito Cheng via Gcc-patches
> +
> +/* Get the mask for merge approach.
> +
> + Consider such following case:
> +   {a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b}
> + To merge "a", the mask should be 1010
> + To merge "b", the mask should be 0101
> +*/
> +rtx
> +rvv_builder::get_merge_mask_bitfield (unsigned int index) const
> +{
> +  uint64_t base_mask = (1ULL << index);
> +  uint64_t mask = 0;
> +  for (unsigned int i = 0; i < (sizeof (uint64_t) * 8 / npatterns ()); i++)
> +mask |= base_mask << (i * npatterns ());
> +  return gen_int_mode (mask, inner_int_mode ());

Does this mean we assume inner_int_mode is DImode (because of sizeof (uint64_t))?
Or should it be something like `for (unsigned int i = 0; i <
(GET_MODE_SIZE(inner_int_mode ()) * 8 / npatterns ()); i++)`?

> +}
> +
>  /* Subroutine of riscv_vector_expand_vector_init.
> Works as follows:
> (a) Initialize TARGET by broadcasting element NELTS_REQD - 1 of BUILDER.
> @@ -1226,6 +1307,107 @@ expand_vector_init_insert_elems (rtx target, const 
> rvv_builder ,
>  }
>  }
>
> +/* Emit vmv.s.x instruction.  */
> +
> +static void
> +emit_scalar_move_op (rtx dest, rtx src, machine_mode mask_mode)
> +{
> +  insn_expander<8> e;
> +  machine_mode mode = GET_MODE (dest);
> +  rtx scalar_move_mask = gen_scalar_move_mask (mask_mode);
> +  e.set_dest_and_mask (scalar_move_mask, dest, mask_mode);
> +  e.add_input_operand (src, GET_MODE_INNER (mode));
> +  e.set_len_and_policy (const1_rtx, false);
> +  e.expand (code_for_pred_broadcast (mode), false);
> +}
> +
> +/* Emit merge instruction.  */
> +
> +static void
> +emit_merge_op (rtx dest, rtx src1, rtx src2, rtx mask)
> +{
> +  insn_expander<8> e;
> +  machine_mode mode = GET_MODE (dest);
> +  e.set_dest_merge (dest);
> +  e.add_input_operand (src1, mode);
> +  if (VECTOR_MODE_P (GET_MODE (src2)))
> +e.add_input_operand (src2, mode);
> +  else
> +e.add_input_operand (src2, GET_MODE_INNER (mode));
> +
> +  e.add_input_operand (mask, GET_MODE (mask));
> +  e.set_len_and_policy (NULL_RTX, true, true, false);
> +  if (VECTOR_MODE_P (GET_MODE (src2)))
> +e.expand (code_for_pred_merge (mode), false);
> +  else
> +e.expand (code_for_pred_merge_scalar (mode), false);
> +}
> +
> +/* Use merge approach to initialize the vector with repeating sequence.
> + v = {a, b, a, b, a, b, a, b}.
> +
> + v = broadcast (a).
> + mask = 0b01010101
> + v = merge (v, b, mask)
> +*/
> +static void
> +expand_vector_init_merge_repeating_sequence (rtx target,
> +const rvv_builder )
> +{
> +  machine_mode mask_mode = get_mask_mode (builder.mode ()).require ();
> +  machine_mode dup_mode = builder.mode ();
> +  if (known_gt (GET_MODE_SIZE (dup_mode), BYTES_PER_RISCV_VECTOR))
> +{
> +  poly_uint64 nunits
> +   = exact_div (BYTES_PER_RISCV_VECTOR, builder.inner_units ());
> +  dup_mode = get_vector_mode (builder.inner_int_mode (), nunits).require 
> ();
> +}

Do you mind adding more comments about this? What does it check and what does it do?

> +  else
> +{
> +  if (FLOAT_MODE_P (dup_mode))
> +   {
> + poly_uint64 nunits = GET_MODE_NUNITS (dup_mode);
> + dup_mode
> +   = get_vector_mode (builder.inner_int_mode (), nunits).require ();
> +   }

Why is this hidden only in the else branch? I guess I have this question because I
don't fully understand the logic of the if condition.

> +}
> +
> +  machine_mode dup_mask_mode = get_mask_mode (dup_mode).require ();
> +
> +  /* Step 1: Broadcast the 1st-pattern.  */
> +  emit_len_op (code_for_pred_broadcast (builder.mode ()), target,
> +  force_reg (builder.inner_mode (), builder.elt (0)), NULL_RTX,
> +  mask_mode);
> +
> +  /* Step 2: Merge each non 1st pattern.  */
> +  for (unsigned int i = 1; i < builder.npatterns (); i++)
> +{
> +  /* Step 2-1: Generate mask register v0 for each merge.  */
> +  rtx mask_bitfield = builder.get_merge_mask_bitfield (i);
> +  rtx mask = gen_reg_rtx (mask_mode);
> +  rtx dup = gen_reg_rtx (dup_mode);
> +  if (builder.inner_size () >= builder.full_nelts ().to_constant ())
> +   {
> + /* Use vmv.s.x.  */
> + emit_scalar_move_op (dup, mask_bitfield, dup_mask_mode);
> +   }
> +  else
> +   {
> + /* Use vmv.v.x.  */
> + unsigned int mask_num = CEIL (builder.full_nelts ().to_constant (),
> +   builder.inner_size ());
> + rtx vl = gen_int_mode (mask_num, Pmode);
> + emit_len_op (code_for_pred_broadcast (dup_mode), dup,
> +  force_reg (GET_MODE_INNER (dup_mode), mask_bitfield), 
> vl,

nit: builder.inner_mode () rather than GET_MODE_INNER (dup_mode)?

And I would like to have more comments explaining why we need force_reg here.

I guess it corresponds to FLOAT_MODE_P, but it's not easy to
understand at first glance without a comment.
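
For readers following the review, the broadcast-then-merge idea from the patch comment can be modelled on plain arrays as below. This is only a scalar sketch with hypothetical names; it ignores vector modes, mask registers and vl entirely and shows just the data flow of Step 1 and Step 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill V[0..NELTS) with PATTERNS repeated, using broadcast + merge.  */
static void
init_repeating_sequence (int32_t *v, size_t nelts,
                         const int32_t *patterns, unsigned npatterns)
{
  assert (npatterns > 0);

  /* Step 1: broadcast the first pattern to every element.  */
  for (size_t i = 0; i < nelts; i++)
    v[i] = patterns[0];

  /* Step 2: merge each remaining pattern under its own mask.  */
  for (unsigned p = 1; p < npatterns; p++)
    for (size_t i = 0; i < nelts; i++)
      {
        /* Mask bit i is set when element i belongs to pattern p.  */
        int mask_bit = (i % npatterns) == p;
        if (mask_bit)
          v[i] = patterns[p];
      }
}
```

For {a, b} and eight elements this gives {a, b, a, b, a, b, a, b}: one broadcast plus one merge, instead of inserting the elements one by one.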


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 20:08:26 PDT (-0700), Vineet Gupta wrote:


On 5/16/23 19:53, Palmer Dabbelt wrote:


Probably, I'll go try and bump stuff and see if it works...


Word of caution: Best to not disturb your existing setup, and try a fresh
checkout first


Even easier, I think I can get away with just

diff --git a/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run 
b/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
index 94d6ec5..efc3a80 100755
--- a/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
+++ b/scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run
@@ -12,4 +12,4 @@ done

xlen="$(readelf -h $1 | grep 'Class' | cut -d: -f 2 | xargs echo | sed 
's/^ELF//')"

-qemu-riscv$xlen -r 5.10 "${qemu_args[@]}" -L ${RISC_V_SYSROOT} -cpu 
rv$xlen,zba=on,zbb=on,zbc=on,zbs=on "$@"
+qemu-riscv$xlen -r 5.10 "${qemu_args[@]}" -L ${RISC_V_SYSROOT} -cpu 
rv$xlen,zba=on,zbb=on,zbc=on,zbs=on,v=on "$@"

for now.  I'm going to throw together hwprobe for qemu-user, from looking at
the AVX stuff it should be pretty easy to plumb that into DG and then get the
detection going.


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Vineet Gupta



On 5/16/23 19:53, Palmer Dabbelt wrote:


Probably, I'll go try and bump stuff and see if it works... 


Word of caution: Best to not disturb your existing setup, and try a fresh 
checkout first


Re: [PATCH] RISC-V: Add rounding mode enum for fixed-point intrinsics

2023-05-16 Thread Kito Cheng via Gcc-patches
I would like to defer this until the PR has been updated.

On Wed, May 17, 2023 at 9:52 AM  wrote:
>
> From: Juzhe-Zhong 
>
> Hi, since fixed-point intrinsics that model the rounding mode are coming:
> https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222
>
> I am first exposing the vxrm rounding mode enum to users, before the intrinsic API itself.
>
> This patch is simple && obvious.
>
> Ok for trunk ?
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins.cc (register_vxrm): New function.
> (DEF_RVV_VXRM_ENUM): New macro.
> (handle_pragma_vector): Add vxrm enum register.
> * config/riscv/riscv-vector-builtins.def (DEF_RVV_VXRM_ENUM): New 
> macro.
> (RNU): Ditto.
> (RNE): Ditto.
> (RDN): Ditto.
> (ROD): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/vxrm-1.c: New test.
>
> ---
>  gcc/config/riscv/riscv-vector-builtins.cc | 16 ++
>  gcc/config/riscv/riscv-vector-builtins.def| 11 +++
>  .../gcc.target/riscv/rvv/base/vxrm-1.c| 29 +++
>  3 files changed, 56 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
> b/gcc/config/riscv/riscv-vector-builtins.cc
> index b7458aaace6..bcabf1ea1a6 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins.cc
> @@ -3740,6 +3740,19 @@ verify_type_context (location_t loc, type_context_kind 
> context, const_tree type,
>gcc_unreachable ();
>  }
>
> +/* Register the vxrm enum.  */
> +static void
> +register_vxrm ()
> +{
> +  auto_vec values;
> +#define DEF_RVV_VXRM_ENUM(NAME, VALUE)   
>\
> +  values.quick_push (string_int_pair ("VXRM_" #NAME, VALUE));
> +#include "riscv-vector-builtins.def"
> +#undef DEF_RVV_VXRM_ENUM
> +
> +  lang_hooks.types.simulate_enum_decl (input_location, "RVV_VXRM", );
> +}
> +
>  /* Implement #pragma riscv intrinsic vector.  */
>  void
>  handle_pragma_vector ()
> @@ -3755,6 +3768,9 @@ handle_pragma_vector ()
>for (unsigned int type_i = 0; type_i < NUM_VECTOR_TYPES; ++type_i)
>  register_vector_type ((enum vector_type_index) type_i);
>
> +  /* Define the enums.  */
> +  register_vxrm ();
> +
>/* Define the functions.  */
>function_table = new hash_table (1023);
>function_builder builder;
> diff --git a/gcc/config/riscv/riscv-vector-builtins.def 
> b/gcc/config/riscv/riscv-vector-builtins.def
> index 0a387fd1617..2a1a9dbc903 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.def
> +++ b/gcc/config/riscv/riscv-vector-builtins.def
> @@ -83,6 +83,11 @@ along with GCC; see the file COPYING3.  If not see
>X64_VLMUL_EXT, TUPLE_SUBPART)
>  #endif
>
> +/* Define RVV_VXRM rounding mode enum for fixed-point intrinsics.  */
> +#ifndef DEF_RVV_VXRM_ENUM
> +#define DEF_RVV_VXRM_ENUM(NAME, VALUE)
> +#endif
> +
>  /* SEW/LMUL = 64:
> Only enable when TARGET_MIN_VLEN > 32.
> Machine mode = VNx1BImode when TARGET_MIN_VLEN < 128.
> @@ -643,6 +648,11 @@ DEF_RVV_BASE_TYPE (vlmul_ext_x64, get_vector_type 
> (type_idx))
>  DEF_RVV_BASE_TYPE (size_ptr, build_pointer_type (size_type_node))
>  DEF_RVV_BASE_TYPE (tuple_subpart, get_tuple_subpart_type (type_idx))
>
> +DEF_RVV_VXRM_ENUM (RNU, VXRM_RNU)
> +DEF_RVV_VXRM_ENUM (RNE, VXRM_RNE)
> +DEF_RVV_VXRM_ENUM (RDN, VXRM_RDN)
> +DEF_RVV_VXRM_ENUM (ROD, VXRM_ROD)
> +
>  #include "riscv-vector-type-indexer.gen.def"
>
>  #undef DEF_RVV_PRED_TYPE
> @@ -651,3 +661,4 @@ DEF_RVV_BASE_TYPE (tuple_subpart, get_tuple_subpart_type 
> (type_idx))
>  #undef DEF_RVV_TUPLE_TYPE
>  #undef DEF_RVV_BASE_TYPE
>  #undef DEF_RVV_TYPE_INDEX
> +#undef DEF_RVV_VXRM_ENUM
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c
> new file mode 100644
> index 000..0d364787ad0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +size_t f0 ()
> +{
> +  return VXRM_RNU;
> +}
> +
> +size_t f1 ()
> +{
> +  return VXRM_RNE;
> +}
> +
> +size_t f2 ()
> +{
> +  return VXRM_RDN;
> +}
> +
> +size_t f3 ()
> +{
> +  return VXRM_ROD;
> +}
> +
> +/* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*0} 1} } */
> +/* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*1} 1} } */
> +/* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*2} 1} } */
> +/* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*3} 1} } */
> --
> 2.36.3
>
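
As background on what the four enumerators select, the fixed-point rounding increment described in the RVV specification can be sketched in scalar C as below. This is an illustrative model, not part of the patch: roundoff_unsigned is a hypothetical helper, and the encodings 0..3 for RNU, RNE, RDN and ROD match the values the vxrm-1.c test above scans for.

```c
#include <assert.h>
#include <stdint.h>

enum vxrm_mode { VXRM_RNU = 0, VXRM_RNE = 1, VXRM_RDN = 2, VXRM_ROD = 3 };

/* Shift V right by D bits (D < 64), rounding according to MODE.  */
static uint64_t
roundoff_unsigned (uint64_t v, unsigned d, enum vxrm_mode mode)
{
  assert (d < 64);
  if (d == 0)
    return v;

  uint64_t lsb = (v >> d) & 1;                        /* v[d] */
  uint64_t half = (v >> (d - 1)) & 1;                 /* v[d-1] */
  uint64_t below_half = v & ((1ULL << (d - 1)) - 1);  /* v[d-2:0] */
  uint64_t r = 0;

  switch (mode)
    {
    case VXRM_RNU: /* Round to nearest, ties up.  */
      r = half;
      break;
    case VXRM_RNE: /* Round to nearest, ties to even.  */
      r = half & (below_half != 0 || lsb);
      break;
    case VXRM_RDN: /* Round down (truncate).  */
      r = 0;
      break;
    case VXRM_ROD: /* Round to odd ("jam").  */
      r = !lsb && (half || below_half != 0);
      break;
    }
  return (v >> d) + r;
}
```

For example, shifting 5 right by one bit (2.5) gives 3 under RNU, 2 under RNE and RDN, and 3 under ROD.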


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:51:48 PDT (-0700), Patrick O'Neill wrote:


On 5/16/23 19:47, Palmer Dabbelt wrote:

On Tue, 16 May 2023 19:46:28 PDT (-0700), Vineet Gupta wrote:

On 5/16/23 19:21, Kito Cheng wrote:

Palmer:

For short-term, this should help your internal test:
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1233


That only helps if using bleeding edge toolchain scripts (which I
regularly do and so did Patrick).

Palmer has a fork of toolchain scripts and I'm assuming he hasn't caught
up to that point ;-)


I'm fine dropping the fork if the bugs have been fixed.  IIRC last
week we were still waiting for them to merge something?

The testsuite was broken last week, but was fixed by
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1247 which was
merged last Friday.

That might be the thing you were thinking about?


Probably, I'll go try and bump stuff and see if it works...

Thanks!


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Patrick O'Neill



On 5/16/23 19:47, Palmer Dabbelt wrote:

On Tue, 16 May 2023 19:46:28 PDT (-0700), Vineet Gupta wrote:

On 5/16/23 19:21, Kito Cheng wrote:

Palmer:

For short-term, this should help your internal test:
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1233


That only helps if using bleeding edge toolchain scripts (which I
regularly do and so did Patrick).

Palmer has a fork of toolchain scripts and I'm assuming he hasn't caught
up to that point ;-)


I'm fine dropping the fork if the bugs have been fixed.  IIRC last 
week we were still waiting for them to merge something?
The testsuite was broken last week, but was fixed by 
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1247 which was 
merged last Friday.


That might be the thing you were thinking about?


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:46:28 PDT (-0700), Vineet Gupta wrote:

On 5/16/23 19:21, Kito Cheng wrote:

Palmer:

For short-term, this should help your internal test:
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1233


That only helps if using bleeding edge toolchain scripts (which I
regularly do and so did Patrick).

Palmer has a fork of toolchain scripts and I'm assuming he hasn't caught
up to that point ;-)


I'm fine dropping the fork if the bugs have been fixed.  IIRC last week 
we were still waiting for them to merge something?



-Vineet


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Vineet Gupta

On 5/16/23 19:21, Kito Cheng wrote:

Palmer:

For short-term, this should help your internal test:
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1233


That only helps if using bleeding edge toolchain scripts (which I 
regularly do and so did Patrick).


Palmer has a fork of toolchain scripts and I'm assuming he hasn't caught 
up to that point ;-)


-Vineet


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:32:21 PDT (-0700), jeffreya...@gmail.com wrote:



On 5/16/23 20:05, Palmer Dabbelt wrote:

On Tue, 16 May 2023 19:00:12 PDT (-0700), Jeff Law wrote:



On 5/16/23 19:29, Palmer Dabbelt wrote:




I think the most pressing need is bleeding edge gcc regression
tracking.
  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.

Correct.  More precisely, the riscv64 builds hang.  Not sure if it's
stage2 or stage3 of the bootstrap.  Been happening for the last couple
weeks.  I suspect some codegen bug in the riscv port.  I'll have to
bisect it which will be quite painful.


Can anyone else do it?  If the only blocker for having an upstream
regression CI thing is just sorting out why it broke over the last few
weeks then I'm happy to try and trick someone around here into doing
some work...

Probably easiest for me unless someone else has a chroot environment
handy.  It's not hard to do the bisection, it just involves a lot of
waiting.


By "chroot environment" you mean something like a 
debootstrap-into-chroot with qemu-user/binfmt-misc?  I don't have that 
setup right now, but it wouldn't be a big lift.



I've just about got my problem from earlier today under control,
then I can probably start bisection.


That's fine with me, I have plenty of other stuff to do ;)


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Jeff Law via Gcc-patches




On 5/16/23 20:05, Palmer Dabbelt wrote:

On Tue, 16 May 2023 19:00:12 PDT (-0700), Jeff Law wrote:



On 5/16/23 19:29, Palmer Dabbelt wrote:



I think the most pressing need is bleeding edge gcc regression 
tracking.

  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.

Correct.  More precisely, the riscv64 builds hang.  Not sure if it's
stage2 or stage3 of the bootstrap.  Been happening for the last couple
weeks.  I suspect some codegen bug in the riscv port.  I'll have to
bisect it which will be quite painful.


Can anyone else do it?  If the only blocker for having an upstream 
regression CI thing is just sorting out why it broke over the last few 
weeks then I'm happy to try and trick someone around here into doing 
some work...

Probably easiest for me unless someone else has a chroot environment
handy.  It's not hard to do the bisection, it just involves a lot of
waiting.


I've just about got my problem from earlier today under control,
then I can probably start bisection.


Jeff


Re: Re: RISC-V Test Errors and Failures

2023-05-16 Thread Kito Cheng via Gcc-patches
Palmer:

For short-term, this should help your internal test:
https://github.com/riscv-collab/riscv-gnu-toolchain/pull/1233

On Wed, May 17, 2023 at 10:20 AM Kito Cheng  wrote:
>
> Currently we are highly rely on simulator can setup correctly by ELF
> attribute or -march setting, but seems not true for everyone, for
> longer term we need something like
> check_effective_target_aarch64_sve_hw, but as Palmer point out, we
> might need...bunch of that for different extensions
>
> On Wed, May 17, 2023 at 10:13 AM Palmer Dabbelt  wrote:
> >
> > On Tue, 16 May 2023 19:07:01 PDT (-0700), juzhe.zh...@rivai.ai wrote:
> > > Oh, I see. Kito has add /* { dg-do run { target { riscv_vector } } } */
> > > But not all RVV tests has use this and I not sure whether it can work.
> > > I think Kito can answer it.
> > > If yes, I think we should add all of them.
> >
> > Unless I'm missing something, it looks like that only checks if GCC is
> > compiling for V.  Nothing appears to be checking if the system the tests
> > are running on supports V.
> >
> > # Return 1 if the target has RISC-V vector extension, 0 otherwise.
> > # Cache the result.
> >
> > proc check_effective_target_riscv_vector { } {
> > # Check that we are compiling for v by checking the __riscv_v marco.
> > return [check_no_compiler_messages riscv_vector assembly {
> >#if !defined(__riscv_v)
> >#error "__riscv_v not defined!"
> >#endif
> > }]
> > }
> >
> > Those are really just two different things.
> >
> > It seems pretty reasonably to me to just avoid running the tests when
> > the DUT lacks V, but I'm never great with DG.  We should probably add
> > similar checks for the other ISA extensions, there's going to be a bunch
> > of this.
> >
> > >
> > > Thanks.
> > >
> > >
> > > juzhe.zh...@rivai.ai
> > >
> > > From: Andrew Pinski
> > > Date: 2023-05-17 10:02
> > > To: juzhe.zh...@rivai.ai
> > > CC: gcc-patches; palmer; Kito.cheng
> > > Subject: Re: RISC-V Test Errors and Failures
> > > On Tue, May 16, 2023 at 6:58 PM juzhe.zh...@rivai.ai
> > >  wrote:
> > >>
> > >> Hi, Palmer.
> > >> I saw your patch showed there are a lot of run time fail (execution 
> > >> fail) of C++.
> > >> bug-*.C
> > >>
> > >> These tests are RVV api intrinsics tests coming from Kito's that I have 
> > >> already fixed all of them.
> > >> I just double checked again they all passed.
> > >> I think it may be your regression environment does not set up simulator 
> > >> (QEMU or SPIKE or GEM5) correctly.
> > >> For example, did not enable vector extension in simulator, I don't you 
> > >> may try.
> > >
> > > So on x86_64, we test to see if you have the right vector unit before
> > > running those tests? The same thing was true on powerpc (and I think
> > > aarch64 does the same for SVE now too). The reason why I am asking is
> > > that I would need to run the testsuite using the simulator as setup
> > > for the RISCV ISA I am using rather than the one with everything on.
> > > So does the RVV runtime testsuite tests to see if you can run RVV
> > > before running them (or running them and return they passed)?
> > >
> > > Thanks,
> > > Andrew Pinski
> > >
> > >>
> > >> Thanks.
> > >>
> > >>
> > >> juzhe.zh...@rivai.ai
> > >


Re: Re: RISC-V Test Errors and Failures

2023-05-16 Thread Kito Cheng via Gcc-patches
Currently we rely heavily on the simulator being set up correctly from the
ELF attributes or the -march setting, but that seems not to be true for
everyone. For the longer term we need something like
check_effective_target_aarch64_sve_hw but, as Palmer points out, we
might need a bunch of those for different extensions.

On Wed, May 17, 2023 at 10:13 AM Palmer Dabbelt  wrote:
>
> On Tue, 16 May 2023 19:07:01 PDT (-0700), juzhe.zh...@rivai.ai wrote:
> > Oh, I see. Kito has add /* { dg-do run { target { riscv_vector } } } */
> > But not all RVV tests has use this and I not sure whether it can work.
> > I think Kito can answer it.
> > If yes, I think we should add all of them.
>
> Unless I'm missing something, it looks like that only checks if GCC is
> compiling for V.  Nothing appears to be checking if the system the tests
> are running on supports V.
>
> # Return 1 if the target has RISC-V vector extension, 0 otherwise.
> # Cache the result.
>
> proc check_effective_target_riscv_vector { } {
> # Check that we are compiling for v by checking the __riscv_v marco.
> return [check_no_compiler_messages riscv_vector assembly {
>#if !defined(__riscv_v)
>#error "__riscv_v not defined!"
>#endif
> }]
> }
>
> Those are really just two different things.
>
> It seems pretty reasonably to me to just avoid running the tests when
> the DUT lacks V, but I'm never great with DG.  We should probably add
> similar checks for the other ISA extensions, there's going to be a bunch
> of this.
>
> >
> > Thanks.
> >
> >
> > juzhe.zh...@rivai.ai
> >
> > From: Andrew Pinski
> > Date: 2023-05-17 10:02
> > To: juzhe.zh...@rivai.ai
> > CC: gcc-patches; palmer; Kito.cheng
> > Subject: Re: RISC-V Test Errors and Failures
> > On Tue, May 16, 2023 at 6:58 PM juzhe.zh...@rivai.ai
> >  wrote:
> >>
> >> Hi, Palmer.
> >> I saw your patch showed there are a lot of run time fail (execution fail) 
> >> of C++.
> >> bug-*.C
> >>
> >> These tests are RVV api intrinsics tests coming from Kito's that I have 
> >> already fixed all of them.
> >> I just double checked again they all passed.
> >> I think it may be your regression environment does not set up simulator 
> >> (QEMU or SPIKE or GEM5) correctly.
> >> For example, did not enable vector extension in simulator, I don't you may 
> >> try.
> >
> > So on x86_64, we test to see if you have the right vector unit before
> > running those tests? The same thing was true on powerpc (and I think
> > aarch64 does the same for SVE now too). The reason why I am asking is
> > that I would need to run the testsuite using the simulator as setup
> > for the RISCV ISA I am using rather than the one with everything on.
> > So does the RVV runtime testsuite tests to see if you can run RVV
> > before running them (or running them and return they passed)?
> >
> > Thanks,
> > Andrew Pinski
> >
> >>
> >> Thanks.
> >>
> >>
> >> juzhe.zh...@rivai.ai
> >


Re: [PATCH V4 2/2] rs6000: use li;x?oris to build constant

2023-05-16 Thread guojiufu via Gcc-patches

Hi,

On 2023-05-15 14:53, Kewen.Lin wrote:

Hi Jeff,

on 2022/12/12 09:38, Jiufu Guo wrote:

Hi,

For constant C:
If '(c & 0xULL) == 0x' or say:
32(1) || 1(0) || 15(x) || 16(0), we could use "lis; xoris" to build.

Here N(M) means N continuous bit M, x for M means it is ok for either
1 or 0; '||' means concatenation.

This patch updates rs6000_emit_set_long_const to support those
constants.
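
To make the bit layout above concrete, here is a small scalar model in C of what the "lis; xoris" pair computes. It is only a sketch: the mask and compare values are derived from the layout "32(1) || 1(0) || 15(x) || 16(0)" rather than copied from the patch (the hex literals in the archived message are truncated), and the helper names fits_lis_xoris, model_lis, model_xoris and build_lis_xoris are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if c has the layout 32(1) || 1(0) || 15(x) || 16(0):
   upper 32 bits all ones, bit 31 clear, bits 30..16 free, low 16 zero.  */
static int
fits_lis_xoris (uint64_t c)
{
  return (c & 0xFFFFFFFF8000FFFFULL) == 0xFFFFFFFF00000000ULL;
}

/* Model of "lis rD, imm16": imm16 << 16, sign-extended from 32 bits.  */
static uint64_t
model_lis (uint16_t imm)
{
  return (uint64_t) (int64_t) (int32_t) ((uint32_t) imm << 16);
}

/* Model of "xoris rD, rS, imm16": XOR with the zero-extended imm16 << 16.  */
static uint64_t
model_xoris (uint64_t v, uint16_t imm)
{
  return v ^ ((uint64_t) imm << 16);
}

/* Build c in two steps: load the second halfword with its top bit forced
   on, so the sign extension fills the upper 32 bits with ones, then flip
   that bit back off.  */
static uint64_t
build_lis_xoris (uint64_t c)
{
  uint16_t ud2 = (uint16_t) (c >> 16);
  return model_xoris (model_lis ((uint16_t) (ud2 | 0x8000)), 0x8000);
}
```

For a constant whose second halfword is 0x6543, the model loads 0xe543 and then XORs with 0x8000 in the shifted position, which matches the immediates the new test scans for.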


Compare with previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607618.htm
This patch fixes conflicts with trunk.

Bootstrap and regtest pass on ppc64{,le}.

Is this ok for trunk?


OK for trunk, thanks for improving this.

btw, the test case needs to be updated a bit as the function names in
the context changed upstream, please ensure it's tested well before
committing,

thanks!


Yeap! Retested and verified.
Thanks so much for your always insightful reviews and helpful comments!

Committed via r14-923-g5eb7d560626e42.

BR,
Jeff (Jiufu)





BR,
Jeff (Jiufu)


PR target/106708

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add to build
constants through "lis; xoris".


Maybe s/Add to build/Support building/

Yes :)



BR,
Kewen



gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106708.c: Add test function.

---
 gcc/config/rs6000/rs6000.cc |  7 +++
 gcc/testsuite/gcc.target/powerpc/pr106708.c | 10 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 8c1192a10c8..1138d5e8cd4 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -10251,6 +10251,13 @@ rs6000_emit_set_long_const (rtx dest, 
HOST_WIDE_INT c)

   if (ud1 != 0)
emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
 }
+  else if (ud4 == 0x && ud3 == 0x && !(ud2 & 0x8000) && ud1 == 0)
+{
+  /* lis; xoris */
+  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
+  emit_move_insn (temp, GEN_INT (sext_hwi ((ud2 | 0x8000) << 16, 32)));
+  emit_move_insn (dest, gen_rtx_XOR (DImode, temp, GEN_INT (0x8000)));
+}
   else if (ud4 == 0x && ud3 == 0x && (ud1 & 0x8000))
 {
   /* li; xoris */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106708.c 
b/gcc/testsuite/gcc.target/powerpc/pr106708.c

index dc9ceda8367..a015c71e630 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr106708.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr106708.c
@@ -4,7 +4,7 @@
 /* { dg-require-effective-target has_arch_ppc64 } */

 long long arr[]
-  = {0x7cdeab55LL, 0x98765432LL, 0xabcdLL};
+  = {0x7cdeab55LL, 0x98765432LL, 0xabcdLL, 0x6543LL};


 void __attribute__ ((__noipa__)) lixoris (long long *arg)
 {
@@ -27,6 +27,13 @@ void __attribute__ ((__noipa__)) lisrldicl (long 
long *arg)

 /* { dg-final { scan-assembler-times {\mlis .*,0xabcd\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mrldicl .*,0,32\M} 1 } } */

+void __attribute__ ((__noipa__)) lisxoris (long long *arg)
+{
+  *arg = 0x6543LL;
+}
+/* { dg-final { scan-assembler-times {\mlis .*,0xe543\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxoris .*0x8000\M} 1 } } */
+
 int
 main ()
 {
@@ -35,6 +42,7 @@ main ()
   lixoris (a);
   lioris (a + 1);
   lisrldicl (a + 2);
+  lisxoris (a + 3);
   if (__builtin_memcmp (a, arr, sizeof (arr)) != 0)
 __builtin_abort ();
   return 0;


Re: Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:07:01 PDT (-0700), juzhe.zh...@rivai.ai wrote:

Oh, I see. Kito has add /* { dg-do run { target { riscv_vector } } } */
But not all RVV tests has use this and I not sure whether it can work.
I think Kito can answer it.
If yes, I think we should add all of them.


Unless I'm missing something, it looks like that only checks if GCC is 
compiling for V.  Nothing appears to be checking if the system the tests 
are running on supports V.


   # Return 1 if the target has RISC-V vector extension, 0 otherwise.
   # Cache the result.

   proc check_effective_target_riscv_vector { } {
       # Check that we are compiling for v by checking the __riscv_v macro.
       return [check_no_compiler_messages riscv_vector assembly {
           #if !defined(__riscv_v)
           #error "__riscv_v not defined!"
           #endif
       }]
   }

Those are really just two different things.

It seems pretty reasonable to me to just avoid running the tests when
the DUT lacks V, but I'm never great with DG.  We should probably add
similar checks for the other ISA extensions, there's going to be a bunch
of this.
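
To illustrate the difference between "compiling for V" and "running on something that has V", here is a rough sketch in C of the kind of probe a run-time check could compile and execute, in the spirit of check_effective_target_aarch64_sve_hw. Nothing like this exists in the tree as far as this thread shows; the function name rvv_probe and its return convention are hypothetical, and the probe assumes the RVV 1.0 assembler syntax for vsetvli.

```c
/* Hypothetical probe.  Returns 0 after executing one vector instruction,
   or 1 if this translation unit was not compiled for V at all.  */
int
rvv_probe (void)
{
#if defined(__riscv_v)
  /* vsetvli raises an illegal-instruction exception when the hart or
     simulator has no usable vector unit, so merely reaching the return
     below shows that the device under test can execute V code.  */
  unsigned long vl;
  __asm__ volatile ("vsetvli %0, zero, e8, m1, ta, ma" : "=r" (vl));
  (void) vl;
  return 0;
#else
  return 1;
#endif
}
```

A DejaGnu check built around such a probe would run it on the target and treat a crash as "no V hardware", whereas the existing riscv_vector check only inspects the compiler's predefined macro.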




Thanks.


juzhe.zh...@rivai.ai
 
From: Andrew Pinski

Date: 2023-05-17 10:02
To: juzhe.zh...@rivai.ai
CC: gcc-patches; palmer; Kito.cheng
Subject: Re: RISC-V Test Errors and Failures
On Tue, May 16, 2023 at 6:58 PM juzhe.zh...@rivai.ai
 wrote:


Hi, Palmer.
I saw your patch showed there are a lot of run time fail (execution fail) of 
C++.
bug-*.C

These tests are RVV api intrinsics tests coming from Kito's that I have already 
fixed all of them.
I just double checked again they all passed.
I think it may be your regression environment does not set up simulator (QEMU 
or SPIKE or GEM5) correctly.
For example, did not enable vector extension in simulator, I don't you may try.
 
So on x86_64, we test to see if you have the right vector unit before

running those tests? The same thing was true on powerpc (and I think
aarch64 does the same for SVE now too). The reason why I am asking is
that I would need to run the testsuite using the simulator as setup
for the RISCV ISA I am using rather than the one with everything on.
So does the RVV runtime testsuite tests to see if you can run RVV
before running them (or running them and return they passed)?
 
Thanks,

Andrew Pinski
 


Thanks.


juzhe.zh...@rivai.ai
 


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Jeff Law




On 5/16/23 20:08, Vineet Gupta wrote:




I think the most pressing need is bleeding edge gcc regression tracking.
  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.


Jeff it seems has his own test infra. I was ask

Mine is the closest we've got for project-wide testing.  Various orgs
have their own servers/bots.


jeff



Re: RISC-V Test Errors and Failures

2023-05-16 Thread Vineet Gupta




On 5/16/23 18:29, Palmer Dabbelt wrote:

On Tue, 16 May 2023 18:04:37 PDT (-0700), Vineet Gupta wrote:

+ Christoph, Jiawei

On 5/16/23 17:20, Palmer Dabbelt wrote:

We really need to add some CI around RV toolchains to trip on these
sooner !


Sounds like you're volunteering to set one up?


Patrick's github CI patch seems to be a great start. Lets wait for it to
get merged, that will at least catch rv toolchain snafus: although the
granularity of testing is not ideal (tc changes are not so frequent)


You mean riscv-gnu-toolchain changes?  That's not super useful for GCC 
development, they're on a fork.


Well, they are still useful to catch various snafus in the toolchain plumbing
itself - I've run into 2 of those and Patrick 2 himself, when trying to
use the latest upstream toolchain scripts. But sure, they are not testing
bleeding edge gcc.






I think the most pressing need is bleeding edge gcc regression tracking.
  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.


Jeff, it seems, has his own test infra. I was asking whether sourceware (or
whoever the custodian of the gcc project is) has something.
I'd be really surprised if primary arches such as x86/aarch64 don't have
any test bots there?





FWIW rivos gitlab CI (not public) has capability to track upstream gcc
(Kevin almost has it working), but there is no easy way to publish it
for rest of the world and I'd rather that be done in a public infra.


+Kevin

At least having the failure lists public would be a must-have, and I 
think that's tricky to do with gitlab.


Yep.

Bjorn and Conor have something glued to the kernel patchwork that
uploads test results to github as snippets, but IIRC we're trying to
replace it with something more directly visible.



Didn't ISCAS/PLCT have such infra - sorry Kito asked the same question
this morning, but I was not fully awoke so don't remember what Jiawei
replied.


I didn't even remember he asked ;)




Re: Re: RISC-V Test Errors and Failures

2023-05-16 Thread juzhe.zh...@rivai.ai
Oh, I see. Kito has added /* { dg-do run { target { riscv_vector } } } */
But not all RVV tests use this, and I am not sure whether it works.
I think Kito can answer that.
If it does, I think we should add it to all of them.

Thanks.


juzhe.zh...@rivai.ai
 
From: Andrew Pinski
Date: 2023-05-17 10:02
To: juzhe.zh...@rivai.ai
CC: gcc-patches; palmer; Kito.cheng
Subject: Re: RISC-V Test Errors and Failures
On Tue, May 16, 2023 at 6:58 PM juzhe.zh...@rivai.ai
 wrote:
>
> Hi, Palmer.
> I saw your patch showed there are a lot of run time fail (execution fail) of 
> C++.
> bug-*.C
>
> These tests are RVV api intrinsics tests coming from Kito's that I have 
> already fixed all of them.
> I just double checked again they all passed.
> I think it may be your regression environment does not set up simulator (QEMU 
> or SPIKE or GEM5) correctly.
> For example, did not enable vector extension in simulator, I don't you may 
> try.
 
So on x86_64, we test to see if you have the right vector unit before
running those tests? The same thing was true on powerpc (and I think
aarch64 does the same for SVE now too). The reason why I am asking is
that I would need to run the testsuite using the simulator as setup
for the RISCV ISA I am using rather than the one with everything on.
So does the RVV runtime testsuite tests to see if you can run RVV
before running them (or running them and return they passed)?
 
Thanks,
Andrew Pinski
 
>
> Thanks.
>
>
> juzhe.zh...@rivai.ai
 


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 19:00:12 PDT (-0700), Jeff Law wrote:



On 5/16/23 19:29, Palmer Dabbelt wrote:




I think the most pressing need is bleeding edge gcc regression tracking.
  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.

Correct.  More precisely, the riscv64 builds hang.  Not sure if it's
stage2 or stage3 of the bootstrap.  Been happening for the last couple
weeks.  I suspect some codegen bug in the riscv port.  I'll have to
bisect it which will be quite painful.


Can anyone else do it?  If the only blocker for having an upstream 
regression CI thing is just sorting out why it broke over the last few 
weeks then I'm happy to try and trick someone around here into doing 
some work...


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Andrew Pinski via Gcc-patches
On Tue, May 16, 2023 at 6:58 PM juzhe.zh...@rivai.ai
 wrote:
>
> Hi, Palmer.
> I saw your patch showed there are a lot of run time fail (execution fail) of 
> C++.
> bug-*.C
>
> These tests are RVV api intrinsics tests coming from Kito's that I have 
> already fixed all of them.
> I just double checked again they all passed.
> I think it may be your regression environment does not set up simulator (QEMU 
> or SPIKE or GEM5) correctly.
> For example, did not enable vector extension in simulator, I don't you may 
> try.

So on x86_64, we test to see if you have the right vector unit before
running those tests? The same thing was true on powerpc (and I think
aarch64 does the same for SVE now too). The reason why I am asking is
that I would need to run the testsuite using the simulator as setup
for the RISCV ISA I am using rather than the one with everything on.
So does the RVV runtime testsuite test whether you can run RVV
before running them (or does it run them and report that they passed)?

Thanks,
Andrew Pinski

>
> Thanks.
>
>
> juzhe.zh...@rivai.ai


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Jeff Law




On 5/16/23 19:29, Palmer Dabbelt wrote:




I think the most pressing need is bleeding edge gcc regression tracking.
  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.
Correct.  More precisely, the riscv64 builds hang.  Not sure if it's 
stage2 or stage3 of the bootstrap.  Been happening for the last couple 
weeks.  I suspect some codegen bug in the riscv port.  I'll have to 
bisect it which will be quite painful.


jeff


RISC-V Test Errors and Failures

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Palmer.
I saw your patch showed a lot of run-time failures (execution failures) in
the C++ tests:
bug-*.C

These are RVV intrinsic API tests that came from Kito, and I have already
fixed all of them.
I just double-checked and they all pass.
I think your regression environment may not be setting up the simulator (QEMU,
SPIKE or GEM5) correctly.
For example, it may not enable the vector extension in the simulator; you may
want to try that.

Thanks.


juzhe.zh...@rivai.ai


[PATCH] RISC-V: Add rounding mode enum for fixed-point intrinsics

2023-05-16 Thread juzhe . zhong
From: Juzhe-Zhong 

Hi, since fixed-point intrinsics that model the rounding mode are coming:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/222

I am first exposing the vxrm rounding mode enum to users, before adding the
intrinsic API itself.

This patch is simple and obvious.

Ok for trunk ?

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc (register_vxrm): New function.
(DEF_RVV_VXRM_ENUM): New macro.
(handle_pragma_vector): Add vxrm enum register.
* config/riscv/riscv-vector-builtins.def (DEF_RVV_VXRM_ENUM): New macro.
(RNU): Ditto.
(RNE): Ditto.
(RDN): Ditto.
(ROD): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/vxrm-1.c: New test.

---
 gcc/config/riscv/riscv-vector-builtins.cc | 16 ++
 gcc/config/riscv/riscv-vector-builtins.def| 11 +++
 .../gcc.target/riscv/rvv/base/vxrm-1.c| 29 +++
 3 files changed, 56 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index b7458aaace6..bcabf1ea1a6 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3740,6 +3740,19 @@ verify_type_context (location_t loc, type_context_kind 
context, const_tree type,
   gcc_unreachable ();
 }
 
+/* Register the vxrm enum.  */
+static void
+register_vxrm ()
+{
+  auto_vec values;
+#define DEF_RVV_VXRM_ENUM(NAME, VALUE) \
+  values.quick_push (string_int_pair ("VXRM_" #NAME, VALUE));
+#include "riscv-vector-builtins.def"
+#undef DEF_RVV_VXRM_ENUM
+
+  lang_hooks.types.simulate_enum_decl (input_location, "RVV_VXRM", );
+}
+
 /* Implement #pragma riscv intrinsic vector.  */
 void
 handle_pragma_vector ()
@@ -3755,6 +3768,9 @@ handle_pragma_vector ()
   for (unsigned int type_i = 0; type_i < NUM_VECTOR_TYPES; ++type_i)
 register_vector_type ((enum vector_type_index) type_i);
 
+  /* Define the enums.  */
+  register_vxrm ();
+
   /* Define the functions.  */
   function_table = new hash_table (1023);
   function_builder builder;
diff --git a/gcc/config/riscv/riscv-vector-builtins.def 
b/gcc/config/riscv/riscv-vector-builtins.def
index 0a387fd1617..2a1a9dbc903 100644
--- a/gcc/config/riscv/riscv-vector-builtins.def
+++ b/gcc/config/riscv/riscv-vector-builtins.def
@@ -83,6 +83,11 @@ along with GCC; see the file COPYING3.  If not see
   X64_VLMUL_EXT, TUPLE_SUBPART)
 #endif
 
+/* Define RVV_VXRM rounding mode enum for fixed-point intrinsics.  */
+#ifndef DEF_RVV_VXRM_ENUM
+#define DEF_RVV_VXRM_ENUM(NAME, VALUE)
+#endif
+
 /* SEW/LMUL = 64:
Only enable when TARGET_MIN_VLEN > 32.
Machine mode = VNx1BImode when TARGET_MIN_VLEN < 128.
@@ -643,6 +648,11 @@ DEF_RVV_BASE_TYPE (vlmul_ext_x64, get_vector_type 
(type_idx))
 DEF_RVV_BASE_TYPE (size_ptr, build_pointer_type (size_type_node))
 DEF_RVV_BASE_TYPE (tuple_subpart, get_tuple_subpart_type (type_idx))
 
+DEF_RVV_VXRM_ENUM (RNU, VXRM_RNU)
+DEF_RVV_VXRM_ENUM (RNE, VXRM_RNE)
+DEF_RVV_VXRM_ENUM (RDN, VXRM_RDN)
+DEF_RVV_VXRM_ENUM (ROD, VXRM_ROD)
+
 #include "riscv-vector-type-indexer.gen.def"
 
 #undef DEF_RVV_PRED_TYPE
@@ -651,3 +661,4 @@ DEF_RVV_BASE_TYPE (tuple_subpart, get_tuple_subpart_type 
(type_idx))
 #undef DEF_RVV_TUPLE_TYPE
 #undef DEF_RVV_BASE_TYPE
 #undef DEF_RVV_TYPE_INDEX
+#undef DEF_RVV_VXRM_ENUM
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c
new file mode 100644
index 000..0d364787ad0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vxrm-1.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+size_t f0 ()
+{
+  return VXRM_RNU;
+}
+
+size_t f1 ()
+{
+  return VXRM_RNE;
+}
+
+size_t f2 ()
+{
+  return VXRM_RDN;
+}
+
+size_t f3 ()
+{
+  return VXRM_ROD;
+}
+
+/* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*0} 1} } */
+/* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*1} 1} } */
+/* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*2} 1} } */
+/* { dg-final { scan-assembler-times {li\s+[a-x0-9]+,\s*3} 1} } */
-- 
2.36.3
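
For readers unfamiliar with vxrm, the four enumerators registered by the patch correspond to the encodings 0 through 3 that the new test looks for as `li` immediates. As a rough illustration of what each mode does to a fixed-point right shift, the scalar C model below follows the rounding-increment rules given in the RVV 1.0 specification. It is a sketch only: the enum and function names are local to this example (hypothetical) and are not the identifiers that riscv_vector.h provides.

```c
#include <stdint.h>

/* Local stand-ins for the four vxrm encodings.  */
enum vxrm_mode
{
  MODE_RNU = 0,	/* round-to-nearest-up */
  MODE_RNE = 1,	/* round-to-nearest-even */
  MODE_RDN = 2,	/* round-down (truncate) */
  MODE_ROD = 3	/* round-to-odd (jam) */
};

/* Shift v right by d bits (0 <= d <= 63) and add the rounding
   increment r selected by the mode.  */
static uint64_t
roundoff_unsigned (uint64_t v, unsigned d, enum vxrm_mode mode)
{
  if (d == 0)
    return v;

  uint64_t shifted = v >> d;
  uint64_t bit_d = shifted & 1;			/* v[d] */
  uint64_t bit_dm1 = (v >> (d - 1)) & 1;	/* v[d-1] */
  uint64_t low_mask = (1ULL << d) - 1;		/* selects v[d-1:0] */
  uint64_t low_nonzero = (v & low_mask) != 0;
  uint64_t below_half_nonzero = (v & (low_mask >> 1)) != 0; /* v[d-2:0] */
  uint64_t r = 0;

  switch (mode)
    {
    case MODE_RNU:
      r = bit_dm1;
      break;
    case MODE_RNE:
      r = bit_dm1 & (below_half_nonzero | bit_d);
      break;
    case MODE_RDN:
      r = 0;
      break;
    case MODE_ROD:
      r = (bit_d ^ 1) & low_nonzero;
      break;
    }
  return shifted + r;
}
```

For example, shifting 5 right by one bit (2.5) gives 3 under RNU, 2 under RNE and RDN, and 3 under ROD.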



Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 18:04:37 PDT (-0700), Vineet Gupta wrote:

+ Christoph, Jiawei

On 5/16/23 17:20, Palmer Dabbelt wrote:

We really need to add some CI around RV toolchains to trip on these
sooner !


Sounds like you're volunteering to set one up?


Patrick's github CI patch seems to be a great start. Lets wait for it to
get merged, that will at least catch rv toolchain snafus: although the
granularity of testing is not ideal (tc changes are not so frequent)


You mean riscv-gnu-toolchain changes?  That's not super useful for GCC 
development, they're on a fork.



I think the most pressing need is bleeding edge gcc regression tracking.
  @Jeff is anything setup on sourceware and/or usable ? I thought they
do have existing bots for some arches to spin up build / run - perhaps
runs are native and not qemu.


IIRC Jeff said his builders were hanging right now.


FWIW rivos gitlab CI (not public) has capability to track upstream gcc
(Kevin almost has it working), but there is no easy way to publish it
for rest of the world and I'd rather that be done in a public infra.


+Kevin

At least having the failure lists public would be a must-have, and I
think that's tricky to do with gitlab.  Bjorn and Conor have something
glued to the kernel patchwork that uploads test results to github as
snippets, but IIRC we're trying to replace it with something more
directly visible.



Didn't ISCAS/PLCT have such infra - sorry Kito asked the same question
this morning, but I was not fully awoke so don't remember what Jiawei
replied.


I didn't even remember he asked ;)


Re: RISC-V Test Errors and Failures

2023-05-16 Thread Vineet Gupta

+ Christoph, Jiawei

On 5/16/23 17:20, Palmer Dabbelt wrote:
We really need to add some CI around RV toolchains to trip on these 
sooner !


Sounds like you're volunteering to set one up? 


Patrick's github CI patch seems to be a great start. Let's wait for it to
get merged, that will at least catch rv toolchain snafus: although the
granularity of testing is not ideal (tc changes are not so frequent)


I think the most pressing need is bleeding edge gcc regression tracking.
 @Jeff is anything setup on sourceware and/or usable ? I thought they 
do have existing bots for some arches to spin up build / run - perhaps 
runs are native and not qemu.


FWIW rivos gitlab CI (not public) has capability to track upstream gcc 
(Kevin almost has it working), but there is no easy way to publish it 
for rest of the world and I'd rather that be done in a public infra.


Didn't ISCAS/PLCT have such infra - sorry Kito asked the same question
this morning, but I was not fully awake so don't remember what Jiawei
replied.


Thx,
-Vineet


[PATCH] Fix PR 106900: array-bounds warning inside simplify_builtin_call

2023-05-16 Thread Andrew Pinski via Gcc-patches
The problem here is that VRP cannot figure out isize could not be 0
due to using integer_zerop. This patch removes the use of integer_zerop
and instead checks for 0 directly after converting the tree to
an unsigned HOST_WIDE_INT. This allows VRP to figure out isize is not 0
and `isize - 1` will always be >= 0.

This patch is just to avoid the warning that GCC could produce sometimes
and does not change any code generation or even VRP.
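
As a loose illustration of the idea (this is not the GCC code itself): once the zero check is written against the same unsigned integer that is later used in `sz - 1`, a range analysis can see directly that the subtraction cannot wrap. In the sketch below, WORD_BYTES is a hypothetical stand-in for UNITS_PER_WORD and last_written_index is a made-up helper.

```c
#include <stddef.h>

#define WORD_BYTES 8	/* hypothetical stand-in for UNITS_PER_WORD */

/* Returns the index of the last byte written, or -1 if the size is
   rejected.  */
static int
last_written_index (unsigned long long sz, size_t slen)
{
  unsigned char buf[WORD_BYTES] = { 0 };

  /* One guard on the integer itself: past this point the compiler
     knows 1 <= sz <= WORD_BYTES.  */
  if (sz == 0 || sz > WORD_BYTES || sz >= slen)
    return -1;

  buf[sz - 1] = 1;	/* In bounds because of the guard above.  */
  return buf[sz - 1] ? (int) (sz - 1) : -1;
}
```

If the zero test were instead done through a separate predicate on a different representation of the value, the analysis would have to connect the two facts itself, which is where the spurious warning came from.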

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* tree-ssa-forwprop.cc (simplify_builtin_call): Check
against 0 instead of calling integer_zerop.
---
 gcc/tree-ssa-forwprop.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 06f19868ade..0326e6733e8 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -1231,14 +1231,14 @@ simplify_builtin_call (gimple_stmt_iterator *gsi_p, 
tree callee2)
  tree size = gimple_call_arg (stmt2, 2);
  /* Size must be a constant which is <= UNITS_PER_WORD and
 <= the string length.  */
- if (TREE_CODE (size) != INTEGER_CST || integer_zerop (size))
+ if (TREE_CODE (size) != INTEGER_CST)
break;
 
  if (!tree_fits_uhwi_p (size))
break;
 
  unsigned HOST_WIDE_INT sz = tree_to_uhwi (size);
- if (sz > UNITS_PER_WORD || sz >= slen)
+ if (sz == 0 || sz > UNITS_PER_WORD || sz >= slen)
break;
 
  tree ch = gimple_call_arg (stmt2, 1);
-- 
2.31.1



Re: RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

On Tue, 16 May 2023 17:16:11 PDT (-0700), Vineet Gupta wrote:

On 5/16/23 16:06, Palmer Dabbelt wrote:

A few of us were talking about test-related issues in the patchwork
meeting this morning.  I bumped to trunk and did a full rebuild, I'm
getting the following (it's in riscv-systems-ci/riscv-gnu-toolchain).
This is about what I remember seeing last time I ran the tests, which was
a week or so ago.  I figured it'd be best to just blast the lists, as Jeff
said his test running had been hanging so there might be some issue
preventing folks from seeing the failures.

I guess I didn't get time to look last time and I doubt things are
looking any better right now.  I'll try and take a look at some point,
but any help would of course be appreciated.


Yes I was seeing similar tcl errors and such - and in my case an even
higher count.
Also for posterity, what was your configure cmdline ? multilibs or no


If only I'd saved those in the build somewhere... :)

It's all in github.com/palmer-dabbelt/riscv-systems-ci, which points to 
riscv-gnu-toolchain.  I've always got uncommitted diffs in my various 
local checkouts, but I think this would only be


   toolchain: toolchain/install.stamp

   toolchain/install.stamp: toolchain/Makefile
           $(MAKE) -C $(dir $<)
           date > $@

   toolchain/Makefile: riscv-gnu-toolchain/configure
           mkdir -p $(dir $@)
           env -C $(dir $@) $(abspath $<) --prefix="$(abspath $(dir $@)/install)" --enable-linux --enable-multilib --enable-gcc-checking=yes

   toolchain/check.log: toolchain/install.stamp
           $(MAKE) -C $(dir $<) check \
           GLIBC_TARGET_BOARDS_EXTRA="riscv-sim/-march=rv64gczba_zbb_zbc_zbs/-mabi=lp64d riscv-sim/-march=rv64imafdcv/-mabi=lp64d riscv-sim/-march=rv32imafdcv/-mabi=ilp32d" |& tee $@
           touch -c $@

   toolchain/report: toolchain/check.log
           $(MAKE) -C $(dir $<) report \
           GLIBC_TARGET_BOARDS_EXTRA="riscv-sim/-march=rv64gczba_zbb_zbc_zbs/-mabi=lp64d riscv-sim/-march=rv64imafdcv/-mabi=lp64d riscv-sim/-march=rv32imafdcv/-mabi=ilp32d" |& tee $@
           touch -c $@


We really need to add some CI around RV toolchains to trip on these sooner !


Sounds like you're volunteering to set one up?


$ cat toolchain/report
make[1]: Entering directory '/scratch/merges/rgt-gcc-trunk/toolchain'
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/scripts/testsuite-filter
gcc glibc
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/test/allowlist `find
build-gcc-linux-stage2/gcc/testsuite/ -name *.sum |paste -sd "," -`
    === g++: Unexpected fails for rv64imac lp64 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
    === g++: Unexpected fails for rv32imac ilp32 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
    === g++: Unexpected fails for rv64imafdc lp64d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
    === g++: Unexpected fails for rv32imafdc ilp32d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
    === g++: Unexpected fails for rv64imafdcv lp64d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
FAIL: g++.target/riscv/rvv/base/bug-14.C execution test
FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
FAIL: g++.target/riscv/rvv/base/bug-9.C execution test
    === g++: Unexpected fails for rv32imafdcv ilp32d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: 

Re: RISC-V Test Errors and Failures

2023-05-16 Thread Vineet Gupta

On 5/16/23 16:06, Palmer Dabbelt wrote:
A few of us were talking about test-related issues in the patchwork meeting
this morning.  I bumped to trunk and did a full rebuild, I'm getting the
following (it's in riscv-systems-ci/riscv-gnu-toolchain).  This is about what I
remember seeing last time I ran the tests, which was a week or so ago.  I
figured it'd be best to just blast the lists, as Jeff said his test running had
been hanging so there might be some issue preventing folks from seeing the
failures.

I guess I didn't get time to look last time and I doubt things are looking any
better right now.  I'll try and take a look at some point, but any help would
of course be appreciated.


Yes, I was seeing similar tcl errors and such, and in my case an even
higher count.

Also, for posterity, what was your configure cmdline?  Multilibs or no?
We really need to add some CI around RV toolchains to trip on these sooner!



$ cat toolchain/report
make[1]: Entering directory '/scratch/merges/rgt-gcc-trunk/toolchain'
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/scripts/testsuite-filter 
gcc glibc 
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/test/allowlist `find 
build-gcc-linux-stage2/gcc/testsuite/ -name *.sum |paste -sd "," -`

    === g++: Unexpected fails for rv64imac lp64 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
    === g++: Unexpected fails for rv32imac ilp32 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
    === g++: Unexpected fails for rv64imafdc lp64d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
    === g++: Unexpected fails for rv32imafdc ilp32d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
    === g++: Unexpected fails for rv64imafdcv lp64d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
FAIL: g++.target/riscv/rvv/base/bug-14.C execution test
FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
FAIL: g++.target/riscv/rvv/base/bug-9.C execution test
    === g++: Unexpected fails for rv32imafdcv ilp32d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
FAIL: g++.target/riscv/rvv/base/bug-14.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
FAIL: g++.target/riscv/rvv/base/bug-18.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-19.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
FAIL: g++.target/riscv/rvv/base/bug-20.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-21.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-22.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
FAIL: g++.target/riscv/rvv/base/bug-9.C (test for excess errors)
    === g++: Unexpected fails for rv64gczba_zbb_zbc_zbs lp64d ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
    === gcc: Unexpected fails for rv64imac lp64 medlow ===
ERROR: tcl error sourcing 

[committed] c: Remove restrictions on declarations in 'for' loops for C2X

2023-05-16 Thread Joseph Myers
C2X removes a restriction that the only declarations in the
declaration part of a 'for' loop are declarations of objects with
storage class auto or register.  Implement this change, making the
diagnostics into pedwarn_c11 calls instead of errors (as usual for
features added in a new standard version that were invalid code in a
previous version), so now pedwarn-if-pedantic for older standards and
diagnosed also with -Wc11-c2x-compat.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-decl.cc (check_for_loop_decls): Use pedwarn_c11 for
diagnostics.

gcc/testsuite/
* gcc.dg/c11-fordecl-1.c, gcc.dg/c11-fordecl-2.c,
gcc.dg/c11-fordecl-3.c, gcc.dg/c11-fordecl-4.c,
gcc.dg/c2x-fordecl-1.c, gcc.dg/c2x-fordecl-2.c,
gcc.dg/c2x-fordecl-3.c, gcc.dg/c2x-fordecl-4.c: New tests.
* gcc.dg/c99-fordecl-2.c: Test diagnostic for typedef declaration
in for loop here.
* gcc.dg/pr67784-2.c, gcc.dg/pr68320.c, objc.dg/foreach-7.m: Do
not expect errors for typedef declaration in for loop.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 90d7cd27cd5..f8ede362bfd 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -11032,7 +11032,9 @@ check_for_loop_decls (location_t loc, bool turn_off_iso_c99_error)
  only applies to those that are.  (A question on this in comp.std.c
  in November 2000 received no answer.)  We implement the strictest
  interpretation, to avoid creating an extension which later causes
- problems.  */
+ problems.
+
+ This constraint was removed in C2X.  */
 
   for (b = current_scope->bindings; b; b = b->prev)
 {
@@ -11048,33 +11050,35 @@ check_for_loop_decls (location_t loc, bool turn_off_iso_c99_error)
  {
location_t decl_loc = DECL_SOURCE_LOCATION (decl);
if (TREE_STATIC (decl))
- error_at (decl_loc,
-   "declaration of static variable %qD in %<for%> loop "
-   "initial declaration", decl);
+ pedwarn_c11 (decl_loc, OPT_Wpedantic,
+  "declaration of static variable %qD in %<for%> "
+  "loop initial declaration", decl);
else if (DECL_EXTERNAL (decl))
- error_at (decl_loc,
-   "declaration of %<extern%> variable %qD in %<for%> loop "
-   "initial declaration", decl);
+ pedwarn_c11 (decl_loc, OPT_Wpedantic,
+  "declaration of %<extern%> variable %qD in %<for%> "
+  "loop initial declaration", decl);
  }
  break;
 
case RECORD_TYPE:
- error_at (loc,
-   "%<struct %E%> declared in %<for%> loop initial "
-   "declaration", id);
+ pedwarn_c11 (loc, OPT_Wpedantic,
+  "%<struct %E%> declared in %<for%> loop initial "
+  "declaration", id);
  break;
case UNION_TYPE:
- error_at (loc,
-   "%<union %E%> declared in %<for%> loop initial declaration",
-   id);
+ pedwarn_c11 (loc, OPT_Wpedantic,
+  "%<union %E%> declared in %<for%> loop initial "
+  "declaration",
+  id);
  break;
case ENUMERAL_TYPE:
- error_at (loc, "%<enum %E%> declared in %<for%> loop "
-   "initial declaration", id);
+ pedwarn_c11 (loc, OPT_Wpedantic,
+  "%<enum %E%> declared in %<for%> loop "
+  "initial declaration", id);
  break;
default:
- error_at (loc, "declaration of non-variable "
-   "%qD in %<for%> loop initial declaration", decl);
+ pedwarn_c11 (loc, OPT_Wpedantic, "declaration of non-variable "
+  "%qD in %<for%> loop initial declaration", decl);
}
 
   n_decls++;
diff --git a/gcc/testsuite/gcc.dg/c11-fordecl-1.c b/gcc/testsuite/gcc.dg/c11-fordecl-1.c
new file mode 100644
index 000..4aceb335e18
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-fordecl-1.c
@@ -0,0 +1,27 @@
+/* Test for C99 declarations in for loops.  Test constraints are diagnosed for
+   C11.  Based on c99-fordecl-2.c.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+void
+foo (void)
+{
+  int j = 0;
+  for (int i = 1, bar (void); i <= 10; i++) /* { dg-error "bar" } */
+j += i;
+
+  for (static int i = 1; i <= 10; i++) /* /* { dg-error "static" } */
+j += i;
+
+  for (extern int i; j <= 500; j++) /* { dg-error "extern" } */
+j += 5;
+
+  for (enum { FOO } i = FOO; i < 10; i++) /* { dg-error "FOO" } */
+j += i;
+
+  for (enum BAR { FOO } i = FOO; i < 10; i++) /* { dg-error "FOO" } */
+/* { dg-error "BAR" "enum tag in for loop" { target *-*-* } .-1 } */
+j += i;
+  for (typedef int T;;) /* { dg-error "non-variable" } */
+;
+}
diff --git a/gcc/testsuite/gcc.dg/c11-fordecl-2.c b/gcc/testsuite/gcc.dg/c11-fordecl-2.c
new file mode 100644
index 000..0be1a0d13fa
--- /dev/null

RISC-V Test Errors and Failures

2023-05-16 Thread Palmer Dabbelt

A few of us were talking about test-related issues in the patchwork meeting
this morning.  I bumped to trunk and did a full rebuild, I'm getting the
following (it's in riscv-systems-ci/riscv-gnu-toolchain).  This is about what I
remember seeing last time I ran the tests, which was a week or so ago.  I
figured it'd be best to just blast the lists, as Jeff said his test running had
been hanging so there might be some issue preventing folks from seeing the
failures.

I guess I didn't get time to look last time and I doubt things are looking any
better right now.  I'll try and take a look at some point, but any help would
of course be appreciated.

$ cat toolchain/report
make[1]: Entering directory '/scratch/merges/rgt-gcc-trunk/toolchain'
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/scripts/testsuite-filter gcc glibc 
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/test/allowlist `find 
build-gcc-linux-stage2/gcc/testsuite/ -name *.sum |paste -sd "," -`
=== g++: Unexpected fails for rv64imac lp64 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
=== g++: Unexpected fails for rv32imac ilp32 medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
=== g++: Unexpected fails for rv64imafdc lp64d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
=== g++: Unexpected fails for rv32imafdc ilp32d medlow ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
=== g++: Unexpected fails for rv64imafdcv lp64d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
FAIL: g++.target/riscv/rvv/base/bug-14.C execution test
FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
FAIL: g++.target/riscv/rvv/base/bug-9.C execution test
=== g++: Unexpected fails for rv32imafdcv ilp32d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_c.C -std=c++2b (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-10.C execution test
FAIL: g++.target/riscv/rvv/base/bug-11.C execution test
FAIL: g++.target/riscv/rvv/base/bug-12.C execution test
FAIL: g++.target/riscv/rvv/base/bug-13.C execution test
FAIL: g++.target/riscv/rvv/base/bug-14.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-15.C execution test
FAIL: g++.target/riscv/rvv/base/bug-16.C execution test
FAIL: g++.target/riscv/rvv/base/bug-17.C execution test
FAIL: g++.target/riscv/rvv/base/bug-18.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-19.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-2.C execution test
FAIL: g++.target/riscv/rvv/base/bug-20.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-21.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-22.C (test for excess errors)
FAIL: g++.target/riscv/rvv/base/bug-23.C execution test
FAIL: g++.target/riscv/rvv/base/bug-3.C execution test
FAIL: g++.target/riscv/rvv/base/bug-4.C execution test
FAIL: g++.target/riscv/rvv/base/bug-5.C execution test
FAIL: g++.target/riscv/rvv/base/bug-6.C execution test
FAIL: g++.target/riscv/rvv/base/bug-7.C execution test
FAIL: g++.target/riscv/rvv/base/bug-8.C execution test
FAIL: g++.target/riscv/rvv/base/bug-9.C (test for excess errors)
=== g++: Unexpected fails for rv64gczba_zbb_zbc_zbs lp64d  ===
FAIL: g++.dg/contracts/contracts-tmpl-spec2.C   output pattern test
=== gcc: Unexpected fails for rv64imac lp64 medlow ===
ERROR: tcl error sourcing 
/scratch/merges/rgt-gcc-trunk/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp.
ERROR: torture-init: torture_without_loops is not empty as expected
ERROR: tcl error sourcing 

Re: [PATCH] aarch64: Add SVE instruction types

2023-05-16 Thread Evandro Menezes via Gcc-patches
Hi, Kyrill.

It makes sense.  I could add the classification to a different attribute as you 
did and keep it in aarch64 as well.

I took the same approach, gleaning from several optimization guides for Arm 
processors supporting SVE and figuring out the smallest number of types that 
could cover most variations of resources used.  Methinks that the 
classification in this patch is close to that goal, but feedback is appreciated.

I did observe a meaningful gain in performance.  Of course, wide machines like 
the V1 can handle most instruction sequences thrown at them, but there’s still 
some efficiency left on the table without tailored scheduling, especially 
when recovering from cache or branch misses, when it’s important to quickly 
fill the pipeline back up to its steady state, even though umpteen transistors 
are dedicated to making sure that misses do not happen often.

Thank you,

-- 
Evandro Menezes



> On May 16, 2023, at 03:36, Kyrylo Tkachov wrote:
> 
> Hi Evandro,
>  
> I created a new attribute so I didn’t have to extend the “type” attribute 
> that lives in config/arm/types.md. As that attribute and file live in the 
> arm backend but SVE is AArch64-only, I didn’t want to add logic to the arm 
> backend as it’s not truly shared.
> The granularity has been somewhat subjective. I had looked at the Software 
> Optimisation guides for various SVE and SVE2-capable cores from Arm on 
> developer.arm.com and tried to glean 
> commonalities between different instruction groups.
> I did try writing a model for Neoverse V1 using that classification but I 
> couldn’t spend much time on it and the resulting model didn’t give me much 
> improvement and gave some regressions instead.
> I think that was more down to my rushed model rather than anything else 
> though.
>  
> Thanks,
> Kyrill
>  
> From: Evandro Menezes
> Sent: Monday, May 15, 2023 9:13 PM
> To: Kyrylo Tkachov
> Cc: Richard Sandiford; Evandro Menezes via Gcc-patches;
> evandro+...@gcc.gnu.org; Tamar Christina
> Subject: Re: [PATCH] aarch64: Add SVE instruction types
>  
> Hi, Kyrill.
>  
> I wasn’t aware of your previous patch.  Could you clarify why you considered 
> creating an SVE specific type attribute instead of reusing the common one?  I 
> really liked the iterators that you created; I’d like to use them.
>  
> Do you have specific examples which you might want to mention with regards to 
> granularity?
>  
> Yes, my intent for this patch is to enable modeling the SVE instructions on 
> N1.  The patch that implements it brings up some performance improvements, 
> but it’s mostly flat, as expected.
>  
> Thank you,
> 
> -- 
> Evandro Menezes
>  
>  
> 
> 
> On May 15, 2023, at 04:49, Kyrylo Tkachov wrote:
> 
> -----Original Message-----
> From: Richard Sandiford
> Sent: Monday, May 15, 2023 10:01 AM
> To: Evandro Menezes via Gcc-patches
> Cc: evandro+...@gcc.gnu.org; Evandro Menezes <ebah...@icloud.com>;
> Kyrylo Tkachov <kyrylo.tkac...@arm.com>;
> Tamar Christina <tamar.christ...@arm.com>
> Subject: Re: [PATCH] aarch64: Add SVE instruction types
> 
> Evandro Menezes via Gcc-patches writes:
> 
> This patch adds the attribute `type` to most SVE1 instructions, as in the
> other instructions.
> 
> Thanks for doing this.
> 
> Could you say what criteria you used for picking the granularity?  Other
> maintainers might disagree, but personally I'd prefer to distinguish two
> instructions only if:
> 
> (a) a scheduling description really needs to distinguish them or
> (b) grouping them together would be very artificial (because they're
>     logically unrelated)
> 
> It's always possible to split types later if new scheduling descriptions
> require it.  Because of that, I don't think we should try to predict ahead
> of time what future scheduling descriptions will need.
> 
> Of course, this depends on having results that show that scheduling
> makes a significant difference on an SVE core.  I think one of the
> problems here is that, when a different scheduling model changes the
> performance of a particular test, it's difficult to tell whether
> the gain/loss is caused by the model being more/less accurate than
> the previous one, or if it's due to important "secondary" effects
> on register live ranges.  Instinctively, I'd have expected these
> secondary effects to dominate on OoO cores.
> 
> I agree with Richard on these points. The key here is getting the granularity 
> right without having to maintain too many types that aren't useful in the 
> models.
> FWIW I had posted 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607101.html in 
> November. It adds annotations to SVE2 patterns as well as for base SVE.
> Feel free to reuse it if you'd like.
> I see you had 

Re: [PATCH] configure: Implement --enable-host-pie

2023-05-16 Thread Iain Sandoe
Hi Marek,

> On 16 May 2023, at 16:29, Marek Polacek via Gcc-patches 
>  wrote:
> 
> Ping.

I’m trying this on Darwin (since I have a local patch to do this for modern
[darwin20+] versions, which do not allow non-PIE).

I think you are missing a hunk to deal with Ada.

thanks for the patch
Iain

> 
> On Tue, May 09, 2023 at 03:41:58PM -0400, Marek Polacek via Gcc-patches wrote:
>> [ This is my third attempt to add this configure option.  The first
>> version was approved but it came too late in the development cycle.
>> The second version was also approved, but I had to revert it:
>> .
>> I've fixed the problem (by moving $(PICFLAG) from INTERNAL_CFLAGS to
>> ALL_COMPILERFLAGS).  Another change is that since r13-4536 I no longer
>> need to touch Makefile.def, so this patch is simplified. ]
>> 
>> This patch implements the --enable-host-pie configure option which
>> makes the compiler executables PIE.  This can be used to enhance
>> protection against ROP attacks, and can be viewed as part of a wider
>> trend to harden binaries.
>> 
>> It is similar to the option --enable-host-shared, except that --e-h-s
>> won't add -shared to the linker flags whereas --e-h-p will add -pie.
>> It is different from --enable-default-pie because that option just
>> adds an implicit -fPIE/-pie when the compiler is invoked, but the
>> compiler itself isn't PIE.
>> 
>> Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH
>> regressions.
>> 
>> When building the compiler, the build process may use various in-tree
>> libraries; these need to be built with -fPIE so that it's possible to
>> use them when building a PIE.  For instance, when --with-included-gettext
>> is in effect, intl object files must be compiled with -fPIE.  Similarly,
>> when building in-tree gmp, isl, mpfr and mpc, they must be compiled with
>> -fPIE.
>> 
>> With this patch and --enable-host-pie used to configure gcc:
>> 
>> $ file gcc/cc1{,plus,obj} gcc/f951 gcc/lto1 gcc/cpp
>> gcc/cc1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
>> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
>> 3.2.0, with debug_info, not stripped
>> gcc/cc1plus: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
>> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
>> 3.2.0, with debug_info, not stripped
>> gcc/f951: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
>> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
>> 3.2.0, with debug_info, not stripped
>> gcc/cc1obj: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
>> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
>> 3.2.0, with debug_info, not stripped
>> gcc/lto1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
>> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
>> 3.2.0, with debug_info, not stripped
>> gcc/cpp: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
>> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
>> 3.2.0, with debug_info, not stripped
>> 
>> I plan to add an option to link with -Wl,-z,now.
>> 
>> Bootstrapped on x86_64-pc-linux-gnu with --with-included-gettext
>> --enable-host-pie as well as without --enable-host-pie.  Also tested
>> on a Debian system where the system gcc was configured with
>> --enable-default-pie.
>> 
>> ChangeLog:
>> 
>>  * configure.ac (--enable-host-pie): New check.  Set PICFLAG after this
>>  check.
>>  * configure: Regenerate.
>> 
>> c++tools/ChangeLog:
>> 
>>  * Makefile.in: Rename PIEFLAG to PICFLAG.  Set LD_PICFLAG.  Use it.
>>  Use pic/libiberty.a if PICFLAG is set.
>>  * configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG.
>>  (--enable-host-pie): New check.
>>  * configure: Regenerate.
>> 
>> fixincludes/ChangeLog:
>> 
>>  * Makefile.in: Set and use PICFLAG and LD_PICFLAG.  Use the "pic"
>>  build of libiberty if PICFLAG is set.
>>  * configure.ac:
>>  * configure: Regenerate.
>> 
>> gcc/ChangeLog:
>> 
>>  * Makefile.in: Set LD_PICFLAG.  Use it.  Set enable_host_pie.
>>  Remove NO_PIE_CFLAGS and NO_PIE_FLAG.  Pass LD_PICFLAG to
>>  ALL_LINKERFLAGS.  Use the "pic" build of libiberty if --enable-host-pie.
>>  * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>>  (--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
>>  check.
>>  * configure: Regenerate.
>>  * doc/install.texi: Document --enable-host-pie.
>> 
>> gcc/d/ChangeLog:
>> 
>>  * Make-lang.in: Remove NO_PIE_CFLAGS.
>> 
>> intl/ChangeLog:
>> 
>>  * Makefile.in: Use @PICFLAG@ in COMPILE as well.
>>  * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>>  (--enable-host-pie): New check.  Set PICFLAG after this check.
>>  * 

Re: [PATCH] s390: Implement TARGET_ATOMIC_ALIGN_FOR_MODE

2023-05-16 Thread Andreas Krebbel via Gcc-patches
On 5/16/23 08:43, Stefan Schulze Frielinghaus wrote:
> So far atomic objects are aligned according to their default alignment.
> For 128 bit scalar types like int128 or long double this results in an
> 8 byte alignment which is wrong and must be 16 byte.
> 
> libstdc++ already computes a correct alignment, though, still adding a
> test case in order to make sure that both implementations are
> compatible.
> 
> Bootstrapped and regtested.  Ok for mainline?  Since this is an ABI
> break, is a backport to GCC 13 reasonable?

Ok for mainline.

I would also like to have it in GCC 13. It is an ABI breakage, but on the other
hand it also fixes an ABI inconsistency between C and C++, which we should fix
ASAP, I think.

Andreas


> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (TARGET_ATOMIC_ALIGN_FOR_MODE):
>   New.
>   (s390_atomic_align_for_mode): New.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.target/s390/atomic-align-1.C: New test.
>   * gcc.target/s390/atomic-align-1.c: New test.
>   * gcc.target/s390/atomic-align-2.c: New test.
> ---
>  gcc/config/s390/s390.cc   |  8 ++
>  .../g++.target/s390/atomic-align-1.C  | 25 +++
>  .../gcc.target/s390/atomic-align-1.c  | 23 +
>  .../gcc.target/s390/atomic-align-2.c  | 18 +
>  4 files changed, 74 insertions(+)
>  create mode 100644 gcc/testsuite/g++.target/s390/atomic-align-1.C
>  create mode 100644 gcc/testsuite/gcc.target/s390/atomic-align-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/atomic-align-2.c
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index 505de995da8..4813bf91dc4 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -450,6 +450,14 @@ s390_preserve_fpr_arg_p (int regno)
> && regno >= FPR0_REGNUM);
>  }
>  
> +#undef TARGET_ATOMIC_ALIGN_FOR_MODE
> +#define TARGET_ATOMIC_ALIGN_FOR_MODE s390_atomic_align_for_mode
> +static unsigned int
> +s390_atomic_align_for_mode (machine_mode mode)
> +{
> +  return GET_MODE_BITSIZE (mode);
> +}
> +
>  /* A couple of shortcuts.  */
>  #define CONST_OK_FOR_J(x) \
>   CONST_OK_FOR_CONSTRAINT_P((x), 'J', "J")
> diff --git a/gcc/testsuite/g++.target/s390/atomic-align-1.C b/gcc/testsuite/g++.target/s390/atomic-align-1.C
> new file mode 100644
> index 000..43aa0bc39ed
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/s390/atomic-align-1.C
> @@ -0,0 +1,25 @@
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-std=c++11" } */
> +/* { dg-final { scan-assembler-times {\.align\t2} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t4} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t8} 3 } } */
> +/* { dg-final { scan-assembler-times {\.align\t16} 2 } } */
> +
> +#include <atomic>
> +
> +// 2
> +std::atomic<char> var_char;
> +std::atomic<short> var_short;
> +// 4
> +std::atomic<int> var_int;
> +// 8
> +std::atomic<long> var_long;
> +std::atomic<long long> var_long_long;
> +// 16
> +std::atomic<__int128> var_int128;
> +// 4
> +std::atomic<float> var_float;
> +// 8
> +std::atomic<double> var_double;
> +// 16
> +std::atomic<long double> var_long_double;
> diff --git a/gcc/testsuite/gcc.target/s390/atomic-align-1.c b/gcc/testsuite/gcc.target/s390/atomic-align-1.c
> new file mode 100644
> index 000..b2e1233e3ee
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/atomic-align-1.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-std=c11" } */
> +/* { dg-final { scan-assembler-times {\.align\t2} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t4} 2 } } */
> +/* { dg-final { scan-assembler-times {\.align\t8} 3 } } */
> +/* { dg-final { scan-assembler-times {\.align\t16} 2 } } */
> +
> +// 2
> +_Atomic char var_char;
> +_Atomic short var_short;
> +// 4
> +_Atomic int var_int;
> +// 8
> +_Atomic long var_long;
> +_Atomic long long var_long_long;
> +// 16
> +_Atomic __int128 var_int128;
> +// 4
> +_Atomic float var_float;
> +// 8
> +_Atomic double var_double;
> +// 16
> +_Atomic long double var_long_double;
> diff --git a/gcc/testsuite/gcc.target/s390/atomic-align-2.c b/gcc/testsuite/gcc.target/s390/atomic-align-2.c
> new file mode 100644
> index 000..0bf17341bf8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/atomic-align-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-O -std=c11" } */
> +/* { dg-final { scan-assembler-not {abort} } } */
> +
> +/* The stack is 8 byte aligned which means GCC has to manually align a 16 byte
> +   aligned object.  This is done by allocating not 16 but rather 24 bytes for
> +   variable X and then manually aligning a pointer inside the memory block.
> +   Validate this by ensuring that the if-statement is optimized out.  */
> +
> +void bar (_Atomic unsigned __int128 *ptr);
> +
> +void foo (void) {
> +  _Atomic unsigned __int128 x;
> +  unsigned long n = (unsigned long)&x;
> +  if (n % 16 != 0)
> +    __builtin_abort ();
> +  bar (&x);
> +}



Re: [PATCH] c++: Don't try to initialize zero width bitfields in zero initialization [PR109868]

2023-05-16 Thread Jason Merrill via Gcc-patches

On 5/16/23 15:34, Jakub Jelinek wrote:

Hi!

My GCC 12 change to avoid removing zero-sized bitfields as they are
important for ABI and are needed for layout compatibility traits
apparently causes zero sized bitfields to be initialized in the IL,
which at least in 13+ results in ICEs in the ranger which is upset
about zero precision types.

I think we could even avoid initializing other unnamed bitfields, but
unfortunately !CONSTRUCTOR_NO_CLEARING doesn't mean in the middle-end
clearing of padding bits and until we have some new flag that represents
the request to clear padding bits, I think it is better to keep zeroing
non-zero sized unnamed bitfields.

In addition to skipping those fields, I have changed how UNION_TYPEs are
handled.  The current code was a little bit weird in that e.g. if the first
non-static data member had error_mark_node type, we'd happily zero
initialize the second non-static data member, etc.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/13,
perhaps even 12?


OK back to 12, I think.


2023-05-16  Jakub Jelinek  

PR c++/109868
* init.cc (build_zero_init_1): Don't initialize zero-width bitfields.
For unions only initialize the first FIELD_DECL.

* g++.dg/init/pr109868.C: New test.

--- gcc/cp/init.cc.jj   2023-05-01 23:07:05.147417750 +0200
+++ gcc/cp/init.cc  2023-05-16 10:01:14.512489727 +0200
@@ -189,15 +189,21 @@ build_zero_init_1 (tree type, tree nelts
  init = build_zero_cst (type);
else if (RECORD_OR_UNION_CODE_P (TREE_CODE (type)))
  {
-  tree field;
+  tree field, next;
vec<constructor_elt, va_gc> *v = NULL;
  
/* Iterate over the fields, building initializations.  */

-  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+  for (field = TYPE_FIELDS (type); field; field = next)
{
+ next = DECL_CHAIN (field);
+
  if (TREE_CODE (field) != FIELD_DECL)
continue;
  
+	  /* For unions, only the first field is initialized.  */

+ if (TREE_CODE (type) == UNION_TYPE)
+   next = NULL_TREE;
+
  if (TREE_TYPE (field) == error_mark_node)
continue;
  
@@ -212,6 +218,11 @@ build_zero_init_1 (tree type, tree nelts

continue;
}
  
+	  /* Don't add zero width bitfields.  */

+ if (DECL_C_BIT_FIELD (field)
+ && integer_zerop (DECL_SIZE (field)))
+   continue;
+
  /* Note that for class types there will be FIELD_DECLs
 corresponding to base classes as well.  Thus, iterating
 over TYPE_FIELDs will result in correct initialization of
@@ -230,10 +241,6 @@ build_zero_init_1 (tree type, tree nelts
  if (value)
CONSTRUCTOR_APPEND_ELT(v, field, value);
}
-
- /* For unions, only the first field is initialized.  */
- if (TREE_CODE (type) == UNION_TYPE)
-   break;
}
  
/* Build a constructor to contain the initializations.  */

--- gcc/testsuite/g++.dg/init/pr109868.C.jj 2023-05-16 09:43:54.706278293 +0200
+++ gcc/testsuite/g++.dg/init/pr109868.C 2023-05-16 09:44:16.581966894 +0200
@@ -0,0 +1,13 @@
+// PR c++/109868
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct A { virtual void foo (); };
+struct B { long b; int : 0; };
+struct C : A { B c; };
+
+void
+bar (C *p)
+{
+  *p = C ();
+}

Jakub





Re: [PATCH] c++: -Wdangling-reference not suppressed in template [PR109774]

2023-05-16 Thread Jason Merrill via Gcc-patches

On 5/16/23 15:13, Marek Polacek wrote:

In check_return_expr, we suppress the -Wdangling-reference warning when
we're sure it would be a false positive.  It wasn't working in a
template, though, because the suppress_warning call was never reached.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 13.2?


OK.


PR c++/109774

gcc/cp/ChangeLog:

* typeck.cc (check_return_expr): In a template, return only after
suppressing -Wdangling-reference.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference13.C: New test.
---
  gcc/cp/typeck.cc  |  6 ++---
  .../g++.dg/warn/Wdangling-reference13.C   | 23 +++
  2 files changed, 26 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference13.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 53ac925a092..c225c4e2423 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -11236,9 +11236,6 @@ check_return_expr (tree retval, bool *no_warning)
 build_zero_cst (TREE_TYPE (retval)));
  }
  
-  if (processing_template_decl)

-return saved_retval;
-
/* A naive attempt to reduce the number of -Wdangling-reference false
   positives: if we know that this function can return a variable with
   static storage duration rather than one of its parameters, suppress
@@ -11250,6 +11247,9 @@ check_return_expr (tree retval, bool *no_warning)
&& TREE_STATIC (bare_retval))
  suppress_warning (current_function_decl, OPT_Wdangling_reference);
  
+  if (processing_template_decl)

+return saved_retval;
+
/* Actually copy the value returned into the appropriate location.  */
if (retval && retval != result)
  {
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference13.C b/gcc/testsuite/g++.dg/warn/Wdangling-reference13.C
new file mode 100644
index 000..bc09fbae22b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference13.C
@@ -0,0 +1,23 @@
+// PR c++/109774
+// { dg-do compile }
+// { dg-options "-Wdangling-reference" }
+
+int y;
+
+template
+int& get(const char& )
+{
+return y;
+}
+
+int& get2(const char&)
+{
+return y;
+}
+
+int stuff(void)
+{
+const int &h = get(0); // { dg-bogus "dangling reference" }
+const int &k = get2(0); // { dg-bogus "dangling reference" }
+return h+k;
+}

base-commit: 94a311abf783de754f0f1b2d4c1f00a9788e795b




[PATCH] c++: Don't try to initialize zero width bitfields in zero initialization [PR109868]

2023-05-16 Thread Jakub Jelinek via Gcc-patches
Hi!

My GCC 12 change to avoid removing zero-sized bitfields as they are
important for ABI and are needed for layout compatibility traits
apparently causes zero sized bitfields to be initialized in the IL,
which at least in 13+ results in ICEs in the ranger which is upset
about zero precision types.

I think we could even avoid initializing other unnamed bitfields, but
unfortunately !CONSTRUCTOR_NO_CLEARING doesn't mean in the middle-end
clearing of padding bits and until we have some new flag that represents
the request to clear padding bits, I think it is better to keep zeroing
non-zero sized unnamed bitfields.

In addition to skipping those fields, I have changed how UNION_TYPEs are
handled.  The current code was a little bit weird in that e.g. if the first
non-static data member had error_mark_node type, we'd happily zero
initialize the second non-static data member, etc.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/13,
perhaps even 12?

2023-05-16  Jakub Jelinek  

PR c++/109868
* init.cc (build_zero_init_1): Don't initialize zero-width bitfields.
For unions only initialize the first FIELD_DECL.

* g++.dg/init/pr109868.C: New test.

--- gcc/cp/init.cc.jj   2023-05-01 23:07:05.147417750 +0200
+++ gcc/cp/init.cc  2023-05-16 10:01:14.512489727 +0200
@@ -189,15 +189,21 @@ build_zero_init_1 (tree type, tree nelts
 init = build_zero_cst (type);
   else if (RECORD_OR_UNION_CODE_P (TREE_CODE (type)))
 {
-  tree field;
+  tree field, next;
   vec<constructor_elt, va_gc> *v = NULL;
 
   /* Iterate over the fields, building initializations.  */
-  for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+  for (field = TYPE_FIELDS (type); field; field = next)
{
+ next = DECL_CHAIN (field);
+
  if (TREE_CODE (field) != FIELD_DECL)
continue;
 
+ /* For unions, only the first field is initialized.  */
+ if (TREE_CODE (type) == UNION_TYPE)
+   next = NULL_TREE;
+
  if (TREE_TYPE (field) == error_mark_node)
continue;
 
@@ -212,6 +218,11 @@ build_zero_init_1 (tree type, tree nelts
continue;
}
 
+ /* Don't add zero width bitfields.  */
+ if (DECL_C_BIT_FIELD (field)
+ && integer_zerop (DECL_SIZE (field)))
+   continue;
+
  /* Note that for class types there will be FIELD_DECLs
 corresponding to base classes as well.  Thus, iterating
 over TYPE_FIELDs will result in correct initialization of
@@ -230,10 +241,6 @@ build_zero_init_1 (tree type, tree nelts
  if (value)
CONSTRUCTOR_APPEND_ELT(v, field, value);
}
-
- /* For unions, only the first field is initialized.  */
- if (TREE_CODE (type) == UNION_TYPE)
-   break;
}
 
   /* Build a constructor to contain the initializations.  */
--- gcc/testsuite/g++.dg/init/pr109868.C.jj 2023-05-16 09:43:54.706278293 +0200
+++ gcc/testsuite/g++.dg/init/pr109868.C 2023-05-16 09:44:16.581966894 +0200
@@ -0,0 +1,13 @@
+// PR c++/109868
+// { dg-do compile }
+// { dg-options "-O2" }
+
+struct A { virtual void foo (); };
+struct B { long b; int : 0; };
+struct C : A { B c; };
+
+void
+bar (C *p)
+{
+  *p = C ();
+}

Jakub
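
As a rough illustration of the case described in the message above (not part of the patch), the following sketch shows a zero-width unnamed bit-field that takes part in layout but has no storage of its own to initialize. The type names `B` and `P` and the helper `value_init_b` are hypothetical; `B` is modelled on the struct in the new test.

```cpp
#include <cstddef>

// Hypothetical types; B mirrors "struct B" from the PR109868 test case.
struct B { long b; int : 0; };          // zero-width unnamed bit-field
struct P { char a; int : 0; char c; };  // the bit-field separates 'a' and 'c'

// Value-initialization zeroes the named member.  The zero-width bit-field
// occupies no bits, so there is nothing to store for it.
long value_init_b() {
  B x = B();
  return x.b;
}

// The zero-width bit-field still influences layout: 'c' is placed in a new
// allocation unit after 'a'.  The exact offset is ABI-dependent.
std::size_t offset_of_c() {
  return offsetof(P, c);
}
```

The point of the sketch is only that such a field matters for layout (hence it is kept in the type) while an initializer for it is meaningless, which is why the patch skips it when building the zero initializer.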



[PATCH] c++: -Wdangling-reference not suppressed in template [PR109774]

2023-05-16 Thread Marek Polacek via Gcc-patches
In check_return_expr, we suppress the -Wdangling-reference warning when
we're sure it would be a false positive.  It wasn't working in a
template, though, because the suppress_warning call was never reached.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 13.2?

PR c++/109774

gcc/cp/ChangeLog:

* typeck.cc (check_return_expr): In a template, return only after
suppressing -Wdangling-reference.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference13.C: New test.
---
 gcc/cp/typeck.cc  |  6 ++---
 .../g++.dg/warn/Wdangling-reference13.C   | 23 +++
 2 files changed, 26 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference13.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 53ac925a092..c225c4e2423 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -11236,9 +11236,6 @@ check_return_expr (tree retval, bool *no_warning)
 build_zero_cst (TREE_TYPE (retval)));
 }
 
-  if (processing_template_decl)
-return saved_retval;
-
   /* A naive attempt to reduce the number of -Wdangling-reference false
  positives: if we know that this function can return a variable with
  static storage duration rather than one of its parameters, suppress
@@ -11250,6 +11247,9 @@ check_return_expr (tree retval, bool *no_warning)
   && TREE_STATIC (bare_retval))
 suppress_warning (current_function_decl, OPT_Wdangling_reference);
 
+  if (processing_template_decl)
+return saved_retval;
+
   /* Actually copy the value returned into the appropriate location.  */
   if (retval && retval != result)
 {
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference13.C b/gcc/testsuite/g++.dg/warn/Wdangling-reference13.C
new file mode 100644
index 000..bc09fbae22b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference13.C
@@ -0,0 +1,23 @@
+// PR c++/109774
+// { dg-do compile }
+// { dg-options "-Wdangling-reference" }
+
+int y;
+
+template
+int& get(const char& )
+{
+return y;
+}
+
+int& get2(const char&)
+{
+return y;
+}
+
+int stuff(void)
+{
+const int &h = get(0); // { dg-bogus "dangling reference" }
+const int &k = get2(0); // { dg-bogus "dangling reference" }
+return h+k;
+}

base-commit: 94a311abf783de754f0f1b2d4c1f00a9788e795b
-- 
2.40.1
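
A minimal sketch of the kind of code the message above is about, assuming hypothetical names (`counter`, `get_counter`, `bump`) rather than the ones in the test: a function template takes a `const` reference parameter but returns a reference to an object with static storage duration, so binding the result at the call site cannot dangle.

```cpp
// Hypothetical names; the shape follows the description in the patch.
int counter = 0;  // static storage duration

// Returns a reference to a global, not to its parameter.  The argument may
// be a temporary, but the returned reference does not refer to it.
template <typename T>
int& get_counter(const T&) {
  return counter;
}

int bump() {
  // The temporary 'a' is destroyed at the end of this full-expression;
  // r still refers to the global and stays valid.
  int& r = get_counter('a');
  r += 1;
  return counter;
}
```

With the fix, the suppression that already applied to the non-template form should also apply to the template form, since the early return for `processing_template_decl` now comes after the `suppress_warning` call.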



[committed] libstdc++: Disable cacheline alignment for DJGPP [PR109741]

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Builds OK on djgpp too.

Pushed to trunk.

-- >8 --

DJGPP (and maybe other targets) uses MAX_OFILE_ALIGNMENT=16 which means
that globals (and static objects) can't have alignment greater than 16.
This causes an error for the locks defined in src/c++11/shared_ptr.cc
because we try to align them to the cacheline size, to avoid false
sharing.

Add a configure check for the increased alignment, and live with false
sharing where we can't increase the alignment.
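
The idea can be sketched roughly as follows. This is an illustration only, not the code in src/c++11/shared_ptr.cc: `HYPOTHETICAL_CAN_ALIGN` stands in for the configure result, the fallback value 64 and the names `Lock`, `lock_table` and `lock_stride` are assumptions.

```cpp
#include <cstddef>

// Assumption: fall back to 64 when the compiler does not predefine
// __GCC_DESTRUCTIVE_SIZE.
#ifdef __GCC_DESTRUCTIVE_SIZE
constexpr std::size_t cacheline = __GCC_DESTRUCTIVE_SIZE;
#else
constexpr std::size_t cacheline = 64;
#endif

// Hypothetical stand-in for the configure-time check; not a real
// libstdc++ macro.
#ifndef HYPOTHETICAL_CAN_ALIGN
# define HYPOTHETICAL_CAN_ALIGN 1
#endif

#if HYPOTHETICAL_CAN_ALIGN
// Each lock gets its own cacheline, avoiding false sharing.
struct alignas(cacheline) Lock { unsigned char flag; };
#else
// Target cannot over-align static objects: accept false sharing.
struct Lock { unsigned char flag; };
#endif

Lock lock_table[16];

// Distance in bytes between neighbouring locks in the table.
constexpr std::size_t lock_stride() { return sizeof(Lock); }
```

Building with `-DHYPOTHETICAL_CAN_ALIGN=0` gives the unaligned variant, which is what a target limited by MAX_OFILE_ALIGNMENT would have to use.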

libstdc++-v3/ChangeLog:

PR libstdc++/109741
* acinclude.m4 (GLIBCXX_CHECK_ALIGNAS_CACHELINE): Define.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use GLIBCXX_CHECK_ALIGNAS_CACHELINE.
* src/c++11/shared_ptr.cc (__gnu_internal::get_mutex): Do not
align lock table if not supported.  Use __GCC_DESTRUCTIVE_SIZE
instead of hardcoded 64.
---
 libstdc++-v3/acinclude.m4| 25 +++
 libstdc++-v3/config.h.in |  4 +++
 libstdc++-v3/configure   | 48 
 libstdc++-v3/configure.ac|  3 ++
 libstdc++-v3/src/c++11/shared_ptr.cc |  8 +++--
 5 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 988c532c4e2..8129373e9dd 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -5471,6 +5471,31 @@ AC_DEFUN([GLIBCXX_ZONEINFO_DIR], [
   fi
 ])
 
+dnl
+dnl Check whether lock tables can be aligned to avoid false sharing.
+dnl
+dnl Defines:
+dnl  _GLIBCXX_CAN_ALIGNAS_DESTRUCTIVE_SIZE if objects with static storage
+dnlduration can be aligned to std::hardware_destructive_interference_size.
+dnl
+AC_DEFUN([GLIBCXX_CHECK_ALIGNAS_CACHELINE], [
+  AC_LANG_SAVE
+  AC_LANG_CPLUSPLUS
+
+  AC_MSG_CHECKING([whether static objects can be aligned to the cacheline size])
+  AC_TRY_COMPILE(, [struct alignas(__GCC_DESTRUCTIVE_SIZE) Aligned { };
+   alignas(Aligned) static char buf[sizeof(Aligned) * 16];
+], [ac_alignas_cacheline=yes], [ac_alignas_cacheline=no])
+  if test "$ac_alignas_cacheline" = yes; then
+AC_DEFINE_UNQUOTED(_GLIBCXX_CAN_ALIGNAS_DESTRUCTIVE_SIZE, 1,
+  [Define if global objects can be aligned to
+   std::hardware_destructive_interference_size.])
+  fi
+  AC_MSG_RESULT($ac_alignas_cacheline)
+
+  AC_LANG_RESTORE
+])
+
 # Macros from the top-level gcc directory.
 m4_include([../config/gc++filt.m4])
 m4_include([../config/tls.m4])
diff --git a/libstdc++-v3/config.h.in b/libstdc++-v3/config.h.in
index f91f7eb9097..bbb2613ff69 100644
--- a/libstdc++-v3/config.h.in
+++ b/libstdc++-v3/config.h.in
@@ -819,6 +819,10 @@
 /* Define if the compiler supports C++11 atomics. */
 #undef _GLIBCXX_ATOMIC_BUILTINS
 
+/* Define if global objects can be aligned to
+   std::hardware_destructive_interference_size. */
+#undef _GLIBCXX_CAN_ALIGNAS_DESTRUCTIVE_SIZE
+
 /* Define to use concept checking code from the boost libraries. */
 #undef _GLIBCXX_CONCEPT_CHECKS
 
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index a9589d882e6..188be08d716 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -71957,6 +71957,54 @@ _ACEOF
   fi
 
 
+
+
+  ac_ext=cpp
+ac_cpp='$CXXCPP $CPPFLAGS'
+ac_compile='$CXX -c $CXXFLAGS $CPPFLAGS conftest.$ac_ext >&5'
+ac_link='$CXX -o conftest$ac_exeext $CXXFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
+ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
+
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether static objects can be aligned to the cacheline size" >&5
+$as_echo_n "checking whether static objects can be aligned to the cacheline size... " >&6; }
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+struct alignas(__GCC_DESTRUCTIVE_SIZE) Aligned { };
+   alignas(Aligned) static char buf[sizeof(Aligned) * 16];
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_cxx_try_compile "$LINENO"; then :
+  ac_alignas_cacheline=yes
+else
+  ac_alignas_cacheline=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+  if test "$ac_alignas_cacheline" = yes; then
+
+cat >>confdefs.h <<_ACEOF
+#define _GLIBCXX_CAN_ALIGNAS_DESTRUCTIVE_SIZE 1
+_ACEOF
+
+  fi
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_alignas_cacheline" >&5
+$as_echo "$ac_alignas_cacheline" >&6; }
+
+  ac_ext=c
+ac_cpp='$CPP $CPPFLAGS'
+ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5'
+ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5'
+ac_compiler_gnu=$ac_cv_c_compiler_gnu
+
+
+
 # Define documentation rules conditionally.
 
 # See if makeinfo has been installed and is modern enough
diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index 0dd550a4b4b..df01f58bd83 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -538,6 +538,9 @@ GLIBCXX_EMERGENCY_EH_ALLOC
 # For src/c++20/tzdb.cc defaults.
 

[PATCH] libstdc++: Disable embedded tzdata for all 16-bit targets

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Builds OK for avr too.

Roger, does this work for xstormy16?


-- >8 --

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ZONEINFO_DIR): Extend logic for avr and
msp430 to all 16-bit targets.
* configure: Regenerate.
---
 libstdc++-v3/acinclude.m4 | 15 +--
 libstdc++-v3/configure| 18 --
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 8129373e9dd..eb30c4f00a5 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -5426,12 +5426,15 @@ AC_DEFUN([GLIBCXX_ZONEINFO_DIR], [
zoneinfo_dir=none
;;
 esac
-case "$host" in
-  avr-*-* | msp430-*-* ) embed_zoneinfo=no ;;
-  *)
-   # Also embed a copy of the tzdata.zi file as a static string.
-   embed_zoneinfo=yes ;;
-esac
+
+AC_COMPUTE_INT(glibcxx_cv_at_least_32bit, [sizeof(void*) >= 4])
+if test "$glibcxx_cv_at_least_32bit" -ne 0; then
+  # Also embed a copy of the tzdata.zi file as a static string.
+  embed_zoneinfo=yes
+else
+  # The embedded data is too large for 16-bit targets.
+  embed_zoneinfo=no
+fi
   elif test "x${with_libstdcxx_zoneinfo}" = xno; then
 # Disable tzdb support completely.
 zoneinfo_dir=none
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index 188be08d716..345ba5721a8 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -71903,12 +71903,18 @@ fi
zoneinfo_dir=none
;;
 esac
-case "$host" in
-  avr-*-* | msp430-*-* ) embed_zoneinfo=no ;;
-  *)
-   # Also embed a copy of the tzdata.zi file as a static string.
-   embed_zoneinfo=yes ;;
-esac
+
+if ac_fn_c_compute_int "$LINENO" "sizeof(void*) >= 4" "glibcxx_cv_at_least_32bit"""; then :
+
+fi
+
+if test "$glibcxx_cv_at_least_32bit" -ne 0; then
+  # Also embed a copy of the tzdata.zi file as a static string.
+  embed_zoneinfo=yes
+else
+  # The embedded data is too large for 16-bit targets.
+  embed_zoneinfo=no
+fi
   elif test "x${with_libstdcxx_zoneinfo}" = xno; then
 # Disable tzdb support completely.
 zoneinfo_dir=none
-- 
2.40.1



[committed] rs6000: Enable REE pass by default

2023-05-16 Thread Ajit Agarwal via Gcc-patches
rs6000: Enable REE pass by default

Enable the REE pass by default for the rs6000 target at -O2
and above.

2023-05-16  Ajit Kumar Agarwal  

gcc/ChangeLog:

* common/config/rs6000/rs6000-common.cc: Add REE pass as a
default rs6000 target pass for O2 and above.
* doc/invoke.texi: Document -free
---
 gcc/common/config/rs6000/rs6000-common.cc | 2 ++
 gcc/doc/invoke.texi   | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/common/config/rs6000/rs6000-common.cc b/gcc/common/config/rs6000/rs6000-common.cc
index 2140c442ba9..968db215028 100644
--- a/gcc/common/config/rs6000/rs6000-common.cc
+++ b/gcc/common/config/rs6000/rs6000-common.cc
@@ -34,6 +34,8 @@ static const struct default_options rs6000_option_optimization_table[] =
 { OPT_LEVELS_ALL, OPT_fsplit_wide_types_early, NULL, 1 },
 /* Enable -fsched-pressure for first pass instruction scheduling.  */
 { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
+/* Enable -free for zero extension and sign extension elimination.*/
+{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
 /* Enable -munroll-only-small-loops with -funroll-loops to unroll small
loops at -O2 and above by default.  */
 { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_funroll_loops, NULL, 1 },
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b92b8576027..2c525762171 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12455,8 +12455,8 @@ Attempt to remove redundant extension instructions.  This is especially
 helpful for the x86-64 architecture, which implicitly zero-extends in 64-bit
 registers after writing to their lower 32-bit half.

-Enabled for Alpha, AArch64 and x86 at levels @option{-O2},
-@option{-O3}, @option{-Os}.
+Enabled for Alpha, AArch64, PowerPC, RISC-V, SPARC, h83000 and x86 at levels
+@option{-O2}, @option{-O3}, @option{-Os}.

 @opindex fno-lifetime-dse
 @opindex flifetime-dse
-- 
2.31.1


[committed gcc13 backport] RISCV: Inline subword atomic ops

2023-05-16 Thread Patrick O'Neill

On 5/15/23 21:32, Jeff Law wrote:




On 5/9/23 10:01, Patrick O'Neill wrote:

Ping.

OK for backporting.  Sorry for the delay.

jeff


Committed.

Thanks,
Patrick



Re: [PATCH] c++: desig init in presence of list ctor [PR109871]

2023-05-16 Thread Jason Merrill via Gcc-patches

On 5/16/23 11:38, Patrick Palka wrote:

add_list_candidates has logic to reject designated initialization of a
non-aggregate type, but this is inadvertently being suppressed if the type
has a list constructor due to the order of case analysis, which in the
below testcase leads to us incorrectly treating the list initializer as
an ordinary non-designated one.  This patch fixes this by making us check
for invalid designated initialization sooner.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 13?  IIUC desig init is C++20 but we also accept it
with a pedwarn in earlier dialects, so not sure if this'd be suitable
for backporting.


OK.


PR c++/109871

gcc/cp/ChangeLog:

* call.cc (add_list_candidates): Check for invalid
designated initialization sooner, even for types that have
a list constructor.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/desig6.C: New test.
---
  gcc/cp/call.cc  | 16 
  gcc/testsuite/g++.dg/cpp0x/desig6.C | 16 
  2 files changed, 24 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/desig6.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 48611bb16a3..908374a43c9 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -4129,6 +4129,14 @@ add_list_candidates (tree fns, tree first_arg,
if (CONSTRUCTOR_NELTS (init_list) == 0
&& TYPE_HAS_DEFAULT_CONSTRUCTOR (totype))
  ;
+  else if (CONSTRUCTOR_IS_DESIGNATED_INIT (init_list)
+  && !CP_AGGREGATE_TYPE_P (totype))
+{
+  if (complain & tf_error)
+   error ("designated initializers cannot be used with a "
+  "non-aggregate type %qT", totype);
+  return;
+}
/* If the class has a list ctor, try passing the list as a single
   argument first, but only consider list ctors.  */
else if (TYPE_HAS_LIST_CTOR (totype))
@@ -4140,14 +4148,6 @@ add_list_candidates (tree fns, tree first_arg,
if (any_strictly_viable (*candidates))
return;
  }
-  else if (CONSTRUCTOR_IS_DESIGNATED_INIT (init_list)
-  && !CP_AGGREGATE_TYPE_P (totype))
-{
-  if (complain & tf_error)
-   error ("designated initializers cannot be used with a "
-  "non-aggregate type %qT", totype);
-  return;
-}
  
/* Expand the CONSTRUCTOR into a new argument vec.  */

vec<tree, va_gc> *new_args;
diff --git a/gcc/testsuite/g++.dg/cpp0x/desig6.C b/gcc/testsuite/g++.dg/cpp0x/desig6.C
new file mode 100644
index 000..8d4cf483176
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/desig6.C
@@ -0,0 +1,16 @@
+// PR c++/109871
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+#include <initializer_list>
+
+struct vector {
+  vector(std::initializer_list); // #1
+  vector(int); // #2
+};
+
+void f(vector);
+
+int main() {
+  f({.blah = 42}); // { dg-error "designated" } previously incorrectly selected #2
+}




RE: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Li, Pan2 via Gcc-patches
I have posted PATCH v4 at the link below (sorry, the subject is missing the v4
tag); the x86 bootstrap test passed.

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618742.html

Pan

-Original Message-
From: Gcc-patches  On Behalf 
Of Li, Pan2 via Gcc-patches
Sent: Tuesday, May 16, 2023 8:17 PM
To: Richard Sandiford 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang ; jeffreya...@gmail.com; rguent...@suse.de
Subject: RE: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

Thanks Richard Sandiford for review.

Yes, currently the class access_info grows from 8 bytes to 12 bytes, which is
missing from the table.  With the adjustment you suggested it stays at 8
bytes, but unfortunately the change to m_kind may trigger an ICE in some test
case(s).

I will look into it and keep you posted.

Pan

-Original Message-
From: Richard Sandiford  
Sent: Tuesday, May 16, 2023 5:09 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang ; jeffreya...@gmail.com; rguent...@suse.de
Subject: Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

pan2...@intel.com writes:
> diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
> index c5180b9308a..38b4d6160c2 100644
> --- a/gcc/rtl-ssa/accesses.h
> +++ b/gcc/rtl-ssa/accesses.h
> @@ -254,7 +254,7 @@ private:
>unsigned int m_spare : 2;
>  
>// The value returned by the accessor above.
> -  machine_mode m_mode : 8;
> +  machine_mode m_mode : MACHINE_MODE_BITSIZE;
>  };
>  
>  // A contiguous array of access_info pointers.  Used to represent a

This structure (access_info) isn't mentioned in the table in the patch 
description.  The structure is currently 1 LP64 word and is very 
size-sensitive.  I think we should:

- Put the mode after m_regno
- Reduce m_kind to 2 bits
- Remove m_spare

I *think* that will keep the current size, but please check.

LGTM otherwise.

Thanks,
Richard
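
To make the size argument above concrete, here is a rough sketch with hypothetical structs. Only the 8-bit and 16-bit mode widths, the 2-bit spare field and the suggested 2-bit kind come from the thread; the other field names and widths are assumptions chosen so that the original layout fills exactly one 32-bit bit-field unit, and they do not reproduce rtl-ssa's access_info. The sizes in the comments hold on common LP64 ABIs but are not guaranteed by the language.

```cpp
#include <cstddef>

// Layout before the patch: 32-bit regno plus 8 + 14 + 2 + 8 = 32 bits.
struct before {
  unsigned int regno;
  unsigned int kind : 8;
  unsigned int flags : 14;   // assumed width
  unsigned int spare : 2;
  unsigned int mode : 8;
};

// Simply widening the mode: 8 + 14 + 2 + 16 = 40 bits no longer fit in one
// 32-bit unit, so the mode spills into a third word.
struct naive {
  unsigned int regno;
  unsigned int kind : 8;
  unsigned int flags : 14;
  unsigned int spare : 2;
  unsigned int mode : 16;
};

// Suggested repacking: mode right after regno, kind reduced to 2 bits,
// spare removed.  16 + 2 + 14 = 32 bits again.
struct repacked {
  unsigned int regno;
  unsigned int mode : 16;
  unsigned int kind : 2;
  unsigned int flags : 14;
};

constexpr std::size_t size_before = sizeof(before);      // typically 8
constexpr std::size_t size_naive = sizeof(naive);        // typically 12
constexpr std::size_t size_repacked = sizeof(repacked);  // typically 8
```

This matches the 8-byte versus 12-byte figures quoted in the reply, under the stated assumptions about the flag bits.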


[PATCH] c++: desig init in presence of list ctor [PR109871]

2023-05-16 Thread Patrick Palka via Gcc-patches
add_list_candidates has logic to reject designated initialization of a
non-aggregate type, but this is inadvertently being suppressed if the type
has a list constructor due to the order of case analysis, which in the
below testcase leads to us incorrectly treating the list initializer as
an ordinary non-designated one.  This patch fixes this by making us check
for invalid designated initialization sooner.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and perhaps 13?  IIUC desig init is C++20 but we also accept it
with a pedwarn in earlier dialects, so not sure if this'd be suitable
for backporting.

PR c++/109871

gcc/cp/ChangeLog:

* call.cc (add_list_candidates): Check for invalid
designated initialization sooner, even for types that have
a list constructor.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/desig6.C: New test.
---
 gcc/cp/call.cc  | 16 
 gcc/testsuite/g++.dg/cpp0x/desig6.C | 16 
 2 files changed, 24 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/desig6.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 48611bb16a3..908374a43c9 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -4129,6 +4129,14 @@ add_list_candidates (tree fns, tree first_arg,
   if (CONSTRUCTOR_NELTS (init_list) == 0
   && TYPE_HAS_DEFAULT_CONSTRUCTOR (totype))
 ;
+  else if (CONSTRUCTOR_IS_DESIGNATED_INIT (init_list)
+  && !CP_AGGREGATE_TYPE_P (totype))
+{
+  if (complain & tf_error)
+   error ("designated initializers cannot be used with a "
+  "non-aggregate type %qT", totype);
+  return;
+}
   /* If the class has a list ctor, try passing the list as a single
  argument first, but only consider list ctors.  */
   else if (TYPE_HAS_LIST_CTOR (totype))
@@ -4140,14 +4148,6 @@ add_list_candidates (tree fns, tree first_arg,
   if (any_strictly_viable (*candidates))
return;
 }
-  else if (CONSTRUCTOR_IS_DESIGNATED_INIT (init_list)
-  && !CP_AGGREGATE_TYPE_P (totype))
-{
-  if (complain & tf_error)
-   error ("designated initializers cannot be used with a "
-  "non-aggregate type %qT", totype);
-  return;
-}
 
   /* Expand the CONSTRUCTOR into a new argument vec.  */
   vec<tree, va_gc> *new_args;
diff --git a/gcc/testsuite/g++.dg/cpp0x/desig6.C b/gcc/testsuite/g++.dg/cpp0x/desig6.C
new file mode 100644
index 000..8d4cf483176
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/desig6.C
@@ -0,0 +1,16 @@
+// PR c++/109871
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+#include <initializer_list>
+
+struct vector {
+  vector(std::initializer_list); // #1
+  vector(int); // #2
+};
+
+void f(vector);
+
+int main() {
+  f({.blah = 42}); // { dg-error "designated" } previously incorrectly selected #2
+}
-- 
2.40.1.552.g91428f078b
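
For contrast with the rejected case in the test above, a small sketch of where designated initialization is valid and how a list constructor is normally selected. The names `Aggregate`, `ListCtor`, `sum` and `used_list_ctor` are hypothetical, and the sketch assumes C++20 (or GCC's acceptance of designators as an extension in earlier dialects).

```cpp
#include <initializer_list>

// An aggregate: designated initializers are allowed.
struct Aggregate { int blah; int other; };

// A non-aggregate with a list constructor, similar in shape to the
// "vector" struct in the test case.
struct ListCtor {
  ListCtor(std::initializer_list<int>) : from_list(true) {}   // #1
  ListCtor(int) : from_list(false) {}                         // #2
  bool from_list;
};

int sum(Aggregate a) { return a.blah + a.other; }
bool used_list_ctor(ListCtor v) { return v.from_list; }

// Valid: 'other' is not named and is therefore zero.
int valid_designated() { return sum({.blah = 42}); }

// Ordinary list-initialization prefers the initializer_list constructor.
bool plain_list() { return used_list_ctor({42}); }

// Ill-formed and, with the patch, diagnosed even though ListCtor has a
// list constructor:
//   used_list_ctor({.blah = 42});
```

The commented-out call is the pattern the patch is meant to reject up front instead of treating the braces as a non-designated list.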



[PATCH] configure: Implement --enable-host-bind-now

2023-05-16 Thread Marek Polacek via Gcc-patches
As promised in the --enable-host-pie patch, this patch adds another
configure option, --enable-host-bind-now, which adds -z now when linking
the compiler executables in order to extend hardening.  BIND_NOW with RELRO
allows the GOT to be marked RO; this prevents GOT modification attacks.

This option does not affect linking of target libraries; you can use
LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW.

With this patch:
$ readelf -Wd cc1{,plus} | grep FLAGS
 0x001e (FLAGS)  BIND_NOW
 0x6ffb (FLAGS_1)Flags: NOW PIE
 0x001e (FLAGS)  BIND_NOW
 0x6ffb (FLAGS_1)Flags: NOW PIE

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

c++tools/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.
* configure: Regenerate.

gcc/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Add
-Wl,-z,now to LD_PICFLAG if --enable-host-bind-now.
* configure: Regenerate.
* doc/install.texi: Document --enable-host-bind-now.

lto-plugin/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Link with
-z,now.
* configure: Regenerate.

diff --git a/c++tools/configure b/c++tools/configure
index 88087009383..006efe07b35 100755
--- a/c++tools/configure
+++ b/c++tools/configure
@@ -628,6 +628,7 @@ EGREP
 GREP
 CXXCPP
 LD_PICFLAG
+enable_host_bind_now
 PICFLAG
 MAINTAINER
 CXX_AUX_TOOLS
@@ -702,6 +703,7 @@ enable_maintainer_mode
 enable_checking
 enable_default_pie
 enable_host_pie
+enable_host_bind_now
 with_gcc_major_version_only
 '
   ac_precious_vars='build_alias
@@ -1336,6 +1338,7 @@ Optional Features:
   yes,no,all,none,release.
   --enable-default-pieenable Position Independent Executable as default
   --enable-host-pie   build host code as PIE
+  --enable-host-bind-now  link host code as BIND_NOW
 
 Optional Packages:
   --with-PACKAGE[=ARG]use PACKAGE [ARG=yes]
@@ -3007,6 +3010,14 @@ fi
 
 
 
+# Enable --enable-host-bind-now
+# Check whether --enable-host-bind-now was given.
+if test "${enable_host_bind_now+set}" = set; then :
+  enableval=$enable_host_bind_now; LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"
+fi
+
+
+
 
 # Check if O_CLOEXEC is defined by fcntl
 
diff --git a/c++tools/configure.ac b/c++tools/configure.ac
index 44dfaccbbfa..c2a16601425 100644
--- a/c++tools/configure.ac
+++ b/c++tools/configure.ac
@@ -110,6 +110,13 @@ AC_ARG_ENABLE(host-pie,
[build host code as PIE])],
 [PICFLAG=-fPIE; LD_PICFLAG=-pie], [])
 AC_SUBST(PICFLAG)
+
+# Enable --enable-host-bind-now
+AC_ARG_ENABLE(host-bind-now,
+[AS_HELP_STRING([--enable-host-bind-now],
+   [link host code as BIND_NOW])],
+[LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"], [])
+AC_SUBST(enable_host_bind_now)
 AC_SUBST(LD_PICFLAG)
 
 # Check if O_CLOEXEC is defined by fcntl
diff --git a/gcc/configure b/gcc/configure
index 629446ecf3b..6d847c60024 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -635,6 +635,7 @@ CET_HOST_FLAGS
 LD_PICFLAG
 PICFLAG
 enable_default_pie
+enable_host_bind_now
 enable_host_pie
 enable_host_shared
 enable_plugin
@@ -1031,6 +1032,7 @@ enable_version_specific_runtime_libs
 enable_plugin
 enable_host_shared
 enable_host_pie
+enable_host_bind_now
 enable_libquadmath_support
 with_linker_hash_style
 with_diagnostics_color
@@ -1794,6 +1796,7 @@ Optional Features:
   --enable-plugin enable plugin support
   --enable-host-shared    build host code as shared libraries
   --enable-host-pie   build host code as PIE
+  --enable-host-bind-now  link host code as BIND_NOW
   --disable-libquadmath-support
   disable libquadmath support for Fortran
   --enable-default-pie    enable Position Independent Executable as default
@@ -19852,7 +19855,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19867 "configure"
+#line 19870 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19958,7 +19961,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19973 "configure"
+#line 19976 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -32105,6 +32108,14 @@ fi
 
 
 
+# Enable --enable-host-bind-now
+# Check whether --enable-host-bind-now was given.
+if test "${enable_host_bind_now+set}" = set; then :
+  enableval=$enable_host_bind_now;
+fi
+
+
+
 # Check whether --enable-libquadmath-support was given.
 if test "${enable_libquadmath_support+set}" = set; then :
   enableval=$enable_libquadmath_support; ENABLE_LIBQUADMATH_SUPPORT=$enableval
@@ -32291,6 +32302,8 @@ else
   PICFLAG=
 fi
 
+
+
 if test x$enable_host_pie = xyes; then
   LD_PICFLAG=-pie
 elif test x$gcc_cv_no_pie = xyes; then
@@ -32299,6 +32312,9 @@ else
   LD_PICFLAG=
 fi
 
+if test x$enable_host_bind_now = xyes; then
+  

[PATCH] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Pan Li via Gcc-patches
From: Pan Li 

We are running out of machine_mode values (8 bits) in the RISC-V backend, so
we would like to extend the machine_mode bit size from 8 to 16 bits.
However, growing common structures such as tree or rtx is sensitive in
terms of memory use. This patch extends machine_mode to 16 bits by
shrinking or rearranging neighbouring fields:

* Swap the bit sizes of the code and the machine mode in rtx_def.
* Adjust the machine_mode location and the spare bits in tree.

The memory impact of this patch on the affected structures is as follows:

+-------------------+----------+---------+------+
| struct/bytes      | upstream | patched | diff |
+-------------------+----------+---------+------+
| rtx_obj_reference |        8 |      12 |   +4 |
| ext_modified      |        2 |       4 |   +2 |
| ira_allocno       |      192 |     184 |   -8 |
| qty_table_elem    |       40 |      40 |    0 |
| reg_stat_type     |       64 |      64 |    0 |
| rtx_def           |       40 |      40 |    0 |
| table_elt         |       80 |      80 |    0 |
| tree_decl_common  |      112 |     112 |    0 |
| tree_type_common  |      128 |     128 |    0 |
| access_info       |        8 |       8 |    0 |
+-------------------+----------+---------+------+

The tree and rtx related structs do not change in size with this patch,
while machine_mode is now 16 bits wide.
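
To illustrate why swapping bit-field widths need not grow a structure, here
is a minimal sketch.  The two structs are hypothetical stand-ins, not the
real rtx_def layout: they only mirror the idea of a 16-bit code plus 8-bit
mode becoming an 8-bit code plus 16-bit mode inside the same 32-bit word.
The helper names are made up for the example, and the size check assumes a
typical ABI with a 32-bit unsigned int.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical layout before the change: 16-bit code, 8-bit mode.  */
struct header_before
{
  unsigned int code : 16;
  unsigned int mode : 8;
  unsigned int flags : 8;
};

/* Hypothetical layout after the change: the two widths are swapped.  */
struct header_after
{
  unsigned int code : 8;
  unsigned int mode : 16;
  unsigned int flags : 8;
};

/* Both layouts use 32 bits in total, so the size does not change.  */
static_assert (sizeof (struct header_before) == sizeof (struct header_after),
               "swapping the field widths must not change the size");

/* Store MODE in the old 8-bit field and read it back; values above 255
   are truncated modulo 256.  */
static inline unsigned int
store_mode_before (unsigned int mode)
{
  struct header_before h = { 0 };
  h.mode = mode;
  return h.mode;
}

/* Store MODE in the new 16-bit field and read it back.  */
static inline unsigned int
store_mode_after (unsigned int mode)
{
  struct header_after h = { 0 };
  h.mode = mode;
  return h.mode;
}
```

With these definitions a mode number such as 300 no longer fits the old
8-bit field but survives a round trip through the new 16-bit one.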

Signed-off-by: Pan Li 
Co-authored-by: Ju-Zhe Zhong 
Co-authored-by: Kito Cheng 
Co-Authored-By: Richard Biener 
Co-Authored-By: Richard Sandiford 

gcc/ChangeLog:

* combine.cc (struct reg_stat_type): Extend machine_mode to 16 bits.
* cse.cc (struct qty_table_elem): Extend machine_mode to 16 bits
(struct table_elt): Extend machine_mode to 16 bits.
(struct set): Ditto.
* genmodes.cc (emit_mode_wider): Extend type from char to short.
(emit_mode_complex): Ditto.
(emit_mode_inner): Ditto.
(emit_class_narrowest_mode): Ditto.
* genopinit.cc (main): Extend the machine_mode limit.
* ira-int.h (struct ira_allocno): Extend machine_mode to 16 bits and
re-ordered the struct fields for padding.
* machmode.h (MACHINE_MODE_BITSIZE): New macro.
(GET_MODE_2XWIDER_MODE): Extend type from char to short.
(get_mode_alignment): Extend type from char to short.
* ree.cc (struct ext_modified): Extend machine_mode to 16 bits and
removed the ATTRIBUTE_PACKED.
* rtl-ssa/accesses.h: Extend machine_mode to 16 bits, narrow
m_kind to 2 bits and remove m_spare.
* rtl.h (RTX_CODE_BITSIZE): New macro.
(struct rtx_def): Swap both the bit size and location between the
rtx_code and the machine_mode.
(subreg_shape::unique_id): Extend the machine_mode limit.
* rtlanal.h: Extend machine_mode to 16 bits.
* tree-core.h (struct tree_type_common): Extend machine_mode to 16
bits and re-ordered the struct fields for padding.
(struct tree_decl_common): Extend machine_mode to 16 bits.
* internals.inl (rtl_ssa::access_info): Adjust the assignment.
---
 gcc/combine.cc|  4 +--
 gcc/cse.cc| 16 ---
 gcc/genmodes.cc   | 16 +--
 gcc/genopinit.cc  |  3 ++-
 gcc/ira-int.h | 56 +++
 gcc/machmode.h| 27 ++-
 gcc/ree.cc|  4 +--
 gcc/rtl-ssa/accesses.h| 12 -
 gcc/rtl-ssa/internals.inl |  5 ++--
 gcc/rtl.h | 12 +
 gcc/rtlanal.h |  2 +-
 gcc/tree-core.h   |  9 ---
 12 files changed, 88 insertions(+), 78 deletions(-)

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 5aa0ec5c45a..a23caeed96f 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -200,7 +200,7 @@ struct reg_stat_type {
 
   unsigned HOST_WIDE_INT   last_set_nonzero_bits;
   char last_set_sign_bit_copies;
-  ENUM_BITFIELD(machine_mode)  last_set_mode : 8;
+  ENUM_BITFIELD(machine_mode)  last_set_mode : MACHINE_MODE_BITSIZE;
 
   /* Set nonzero if references to register n in expressions should not be
  used.  last_set_invalid is set nonzero when this register is being
@@ -235,7 +235,7 @@ struct reg_stat_type {
  truncation if we know that value already contains a truncated
  value.  */
 
-  ENUM_BITFIELD(machine_mode)  truncated_to_mode : 8;
+  ENUM_BITFIELD(machine_mode)  truncated_to_mode : MACHINE_MODE_BITSIZE;
 };
 
 
diff --git a/gcc/cse.cc b/gcc/cse.cc
index b10c9b0c94d..86403b95938 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -248,10 +248,8 @@ struct qty_table_elem
   rtx comparison_const;
   int comparison_qty;
   unsigned int first_reg, last_reg;
-  /* The sizes of these fields should match the sizes of the
- code and mode fields of struct rtx_def (see rtl.h).  */
-  ENUM_BITFIELD(rtx_code) comparison_code : 16;
-  ENUM_BITFIELD(machine_mode) mode : 8;
+  ENUM_BITFIELD(machine_mode) mode : MACHINE_MODE_BITSIZE;
+  

[committed] RISC-V: Fix wrong select_kind in riscv_compute_multilib

2023-05-16 Thread Kito Cheng via Gcc-patches
Seems like I screwed up the bare-metal toolchain multilib selection while
fixing the Linux multilib selection...

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_compute_multilib):
Fix wrong select_kind...
---
 gcc/common/config/riscv/riscv-common.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 3a285dfbff0e..fb2635eb5599 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1777,11 +1777,11 @@ riscv_compute_multilib (
   switch (select_kind)
 {
 case select_by_abi:
-  return riscv_select_multilib (riscv_current_abi_str, subset_list,
-   switches, n_switches, multilib_infos);
-case select_by_abi_arch_cmodel:
   return riscv_select_multilib_by_abi (riscv_current_abi_str,
   multilib_infos);
+case select_by_abi_arch_cmodel:
+  return riscv_select_multilib (riscv_current_abi_str, subset_list,
+   switches, n_switches, multilib_infos);
 case select_by_builtin:
   gcc_unreachable ();
 default:
-- 
2.39.2



Re: [PATCH] configure: Implement --enable-host-pie

2023-05-16 Thread Marek Polacek via Gcc-patches
Ping.

On Tue, May 09, 2023 at 03:41:58PM -0400, Marek Polacek via Gcc-patches wrote:
> [ This is my third attempt to add this configure option.  The first
> version was approved but it came too late in the development cycle.
> The second version was also approved, but I had to revert it:
> .
> I've fixed the problem (by moving $(PICFLAG) from INTERNAL_CFLAGS to
> ALL_COMPILERFLAGS).  Another change is that since r13-4536 I no longer
> need to touch Makefile.def, so this patch is simplified. ]
> 
> This patch implements the --enable-host-pie configure option which
> makes the compiler executables PIE.  This can be used to enhance
> protection against ROP attacks, and can be viewed as part of a wider
> trend to harden binaries.
> 
> It is similar to the option --enable-host-shared, except that --e-h-s
> won't add -shared to the linker flags whereas --e-h-p will add -pie.
> It is different from --enable-default-pie because that option just
> adds an implicit -fPIE/-pie when the compiler is invoked, but the
> compiler itself isn't PIE.
> 
> Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH
> regressions.
> 
> When building the compiler, the build process may use various in-tree
> libraries; these need to be built with -fPIE so that it's possible to
> use them when building a PIE.  For instance, when --with-included-gettext
> is in effect, intl object files must be compiled with -fPIE.  Similarly,
> when building in-tree gmp, isl, mpfr and mpc, they must be compiled with
> -fPIE.
> 
> With this patch and --enable-host-pie used to configure gcc:
> 
> $ file gcc/cc1{,plus,obj} gcc/f951 gcc/lto1 gcc/cpp
> gcc/cc1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/cc1plus: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/f951:ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/cc1obj:  ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/lto1:ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> gcc/cpp: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
> dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
> 3.2.0, with debug_info, not stripped
> 
> I plan to add an option to link with -Wl,-z,now.
> 
> Bootstrapped on x86_64-pc-linux-gnu with --with-included-gettext
> --enable-host-pie as well as without --enable-host-pie.  Also tested
> on a Debian system where the system gcc was configured with
> --enable-default-pie.
> 
> ChangeLog:
> 
>   * configure.ac (--enable-host-pie): New check.  Set PICFLAG after this
>   check.
>   * configure: Regenerate.
> 
> c++tools/ChangeLog:
> 
>   * Makefile.in: Rename PIEFLAG to PICFLAG.  Set LD_PICFLAG.  Use it.
>   Use pic/libiberty.a if PICFLAG is set.
>   * configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG.
>   (--enable-host-pie): New check.
>   * configure: Regenerate.
> 
> fixincludes/ChangeLog:
> 
>   * Makefile.in: Set and use PICFLAG and LD_PICFLAG.  Use the "pic"
>   build of libiberty if PICFLAG is set.
>   * configure.ac:
>   * configure: Regenerate.
> 
> gcc/ChangeLog:
> 
>   * Makefile.in: Set LD_PICFLAG.  Use it.  Set enable_host_pie.
>   Remove NO_PIE_CFLAGS and NO_PIE_FLAG.  Pass LD_PICFLAG to
>   ALL_LINKERFLAGS.  Use the "pic" build of libiberty if --enable-host-pie.
>   * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>   (--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
>   check.
>   * configure: Regenerate.
>   * doc/install.texi: Document --enable-host-pie.
> 
> gcc/d/ChangeLog:
> 
>   * Make-lang.in: Remove NO_PIE_CFLAGS.
> 
> intl/ChangeLog:
> 
>   * Makefile.in: Use @PICFLAG@ in COMPILE as well.
>   * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>   (--enable-host-pie): New check.  Set PICFLAG after this check.
>   * configure: Regenerate.
> 
> libcody/ChangeLog:
> 
>   * Makefile.in: Pass LD_PICFLAG to LDFLAGS.
>   * configure.ac (--enable-host-shared): Don't set PICFLAG here.
>   (--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
>   check.
>   * configure: Regenerate.
> 
> libcpp/ChangeLog:
> 
>   * configure.ac (--enable-host-shared): Don't 

Re: [PATCH v5 1/4] rs6000: Enable REE pass by default

2023-05-16 Thread Segher Boessenkool
Hi!

On Tue, May 16, 2023 at 11:45:28AM +0530, Ajit Agarwal wrote:
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -12455,8 +12455,8 @@ Attempt to remove redundant extension instructions.  
> This is especially
>  helpful for the x86-64 architecture, which implicitly zero-extends in 64-bit
>  registers after writing to their lower 32-bit half.
>  
> -Enabled for Alpha, AArch64 and x86 at levels @option{-O2},
> -@option{-O3}, @option{-Os}.
> +Enabled for Alpha, AArch64, RS/6000, RISC-V, SPARC, h83000 and x86 at levels 
> +@option{-O2}, @option{-O3}, @option{-Os}.

Please don't mention RS/6000, we don't support that anymore.  The
architecture we do support is called Power or PowerPC; the target
triplets are powerpc*-*-*.  rs6000-*-* might still somewhat work, but
no one should use it anymore, and we probably should delete it.

Please say PowerPC here.

With that the patch is okay for trunk.  Thank you!


Segher
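
For readers unfamiliar with the pass under review, a minimal C sketch of
where a redundant extension can arise.  The function name is made up for
the example.  On a target where a 32-bit operation already clears the upper
half of the 64-bit register (as the quoted documentation describes for
x86-64), the explicit zero extension below has nothing left to do; whether
REE or some earlier pass actually removes it depends on the target and the
optimization level.

```c
#include <stdint.h>

/* Add two 32-bit values and widen the result to 64 bits.  */
uint64_t
add_then_widen (uint32_t a, uint32_t b)
{
  uint32_t sum = a + b;    /* 32-bit add, wraps modulo 2^32 */
  return (uint64_t) sum;   /* zero extension to 64 bits */
}
```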


[PATCH] OpenMP: Array shaping operator and strided "target update" for C

2023-05-16 Thread Julian Brown
Following the similar support for C++ and Fortran, here is the
C implementation for the OpenMP 5.0 array-shaping operator, and for
strided and rectangular updates for "target update" directives.

Much of the implementation is shared with the previously-posted C++
support:

  https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613788.html

Some details of parsing necessarily differ for C, but the general ideas
are the same.

This patch is intended to be applied on top of the following series:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609031.html

(with followup:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609566.html)

and (the series supporting the C++ patch in the first link above):

  https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613785.html

and (Fortran support):

  https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616921.html

Tested with offloading to NVPTX, and bootstrapped. OK?

Thanks,

Julian

2023-05-16  Julian Brown  

gcc/c/
* c-parser.cc (c_parser_braced_init): Disallow array-shaping operator
in braced init.
(c_parser_conditional_expression): Disallow array-shaping operator in
conditional expression.
(c_parser_cast_expression): Add array-shaping operator support.
(c_parser_postfix_expression): Disallow array-shaping operator in
statement expressions.
(c_parser_postfix_expression_after_primary): Add OpenMP array section
stride support.
(c_parser_expr_list): Disallow array-shaping operator in expression
lists.
(c_array_type_nelts_top, c_array_type_nelts_total): New functions.
(c_parser_omp_variable_list): Support array-shaping operator.
(c_parser_omp_clause_to, c_parser_omp_clause_from): Allow generalised
lvalue parsing in "to" and "from" clauses.
(c_parser_omp_target_update): Recognize GOMP_MAP_TO_GRID and
GOMP_MAP_FROM_GRID map kinds as well as OMP_CLAUSE_TO/OMP_CLAUSE_FROM.
* c-tree.h (c_omp_array_shaping_op_p, c_omp_has_array_shape_p): New
extern declarations.
(create_omp_arrayshape_type): Add prototype.
* c-typeck.cc (c_omp_array_shaping_op_p, c_omp_has_array_shape_p): New
globals.
(build_omp_array_section): Permit integral types, not just integer
constants, when creating array types for array sections.
(create_omp_arrayshape_type): New function.
(handle_omp_array_sections_1): Add DISCONTIGUOUS parameter.  Add
strided/rectangular array section support.
(omp_array_section_low_bound): New function.
(handle_omp_array_sections): Add DISCONTIGUOUS parameter.  Add
strided/rectangular array section support.
(c_finish_omp_clauses): Update calls to handle_omp_array_sections.
Handle discontiguous updates.

gcc/testsuite/
* gcc.dg/gomp/bad-array-shaping-c-1.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-2.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-3.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-4.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-5.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-6.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-7.c: New test.

libgomp/
* testsuite/libgomp.c/array-shaping-1.c: New test.
* testsuite/libgomp.c/array-shaping-2.c: New test.
* testsuite/libgomp.c/array-shaping-3.c: New test.
* testsuite/libgomp.c/array-shaping-4.c: New test.
* testsuite/libgomp.c/array-shaping-5.c: New test.
* testsuite/libgomp.c/array-shaping-6.c: New test.
---
 gcc/c/c-parser.cc | 305 +-
 gcc/c/c-tree.h|   4 +
 gcc/c/c-typeck.cc | 241 --
 .../gcc.dg/gomp/bad-array-shaping-c-1.c   |  26 ++
 .../gcc.dg/gomp/bad-array-shaping-c-2.c   |  24 ++
 .../gcc.dg/gomp/bad-array-shaping-c-3.c   |  30 ++
 .../gcc.dg/gomp/bad-array-shaping-c-4.c   |  27 ++
 .../gcc.dg/gomp/bad-array-shaping-c-5.c   |  17 +
 .../gcc.dg/gomp/bad-array-shaping-c-6.c   |  26 ++
 .../gcc.dg/gomp/bad-array-shaping-c-7.c   |  15 +
 libgomp/testsuite/libgomp.c/array-shaping-1.c | 236 ++
 libgomp/testsuite/libgomp.c/array-shaping-2.c |  39 +++
 libgomp/testsuite/libgomp.c/array-shaping-3.c |  42 +++
 libgomp/testsuite/libgomp.c/array-shaping-4.c |  36 +++
 libgomp/testsuite/libgomp.c/array-shaping-5.c |  38 +++
 libgomp/testsuite/libgomp.c/array-shaping-6.c |  45 +++
 16 files changed, 1101 insertions(+), 50 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-1.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-2.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-3.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-4.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-shaping-c-5.c
 create mode 

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

2023-05-16 Thread Thomas Schwinge
Hi!

On 2023-05-05T10:59:31+0200, I wrote:
> On 2023-05-05T10:55:41+0200, I wrote:
>> [Putting Bernhard, Honza, Segher in CC, as they are eager to test this,
>> based on recent comments on IRC.]  ;-P


>> First, establish the parallel testing infrastructure -- while still
>> hard-coding the number of parallel slots to one.

>> "Support parallel testing in libgomp, part I [PR66005]"

> On top of that, second, enable parallel testing.

> implemented what I'd described in
> :
>
> | [...] parallelize *all* compilation, while just allowing for *one*
> | execution test job slot.  That will require some GCC DejaGnu test
> | harness hackery which I've [now] gotten to look into.  That is, enable
> | the usual GCC/DejaGnu parallel testing, but also have some kind of
> | mutex for the execution test invocation.  This has to play nicely with
> | DejaGnu timeout handling, etc.

> Subject: [PATCH] Support parallel testing in libgomp, part II [PR66005]
>
> ..., and enable if 'flock' is available for serializing execution testing.

OK to push the attached
"Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]"?

Per the PR66005 discussion, if 'flock' is not available, having a
fallback Perl 'flock' for parallelizing 'check-target-libgomp' wasn't met
with the greatest of all enthusiasm -- but in my opinion it's still
better than continued all-serial 'check-target-libgomp'?

We may then proceed working on a more integrated solution, using TCL or
shell features.
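
As a rough illustration of the "mutex for the execution test invocation"
idea, here is a minimal sketch in C using the flock(2) system call.  It is
not what the patch does (the patch uses the 'flock' utility or a Perl
fallback from the DejaGnu harness); the function name, lock path and
commands are assumptions made up for the example, and EINTR handling is
left out.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run COMMAND through the shell while holding an exclusive lock on
   LOCK_PATH, so that concurrent callers execute one at a time.
   Returns the command's exit status, or -1 on error.  */
int
run_serialized (const char *lock_path, const char *command)
{
  int fd = open (lock_path, O_RDWR | O_CREAT, 0600);
  if (fd < 0)
    return -1;

  /* Block until no other process holds the lock.  */
  if (flock (fd, LOCK_EX) != 0)
    {
      close (fd);
      return -1;
    }

  int status = system (command);

  /* Release the lock explicitly, then drop the descriptor.  */
  flock (fd, LOCK_UN);
  close (fd);

  if (status == -1 || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

Compilation can still proceed in parallel; only callers that go through
run_serialized with the same lock file wait for each other.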


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From c62858bf888fec2f61febafcd6afe2dc8c3f679b Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 15 May 2023 20:00:07 +0200
Subject: [PATCH] Support parallel testing in libgomp: fallback Perl 'flock'
 [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
> $ uname -srvi
> Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
> $ grep '^model name' < /proc/cpuinfo | uniq -c
>  12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
> $ nvidia-smi -L
> GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

> $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
> 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
> 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
> -j12 GCC_TEST_PARALLEL_SLOTS=12
> 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
> 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata 505216maxresident)k
2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata 505216maxresident)k

	PR testsuite/66005
	gcc/
	* doc/install.texi: Document (optional) Perl usage for parallel
	testing of libgomp.
	libgomp/
	* testsuite/lib/libgomp.exp: 'flock' through stdout.
	* testsuite/flock: New.
	* configure.ac (FLOCK): Point to that if no 'flock' available, but
	'perl' is.
	* configure: Regenerate.
---
 gcc/doc/install.texi  |  3 +++
 libgomp/configure | 42 +++
 libgomp/configure.ac  |  5 
 libgomp/testsuite/flock   | 17 +
 libgomp/testsuite/lib/libgomp.exp |  4 ++-
 5 files changed, 70 insertions(+), 1 deletion(-)
 create mode 100755 libgomp/testsuite/flock

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index dfab47dac96..fe4a972980f 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -382,6 +382,9 @@ tables.
 
 Used by @command{automake}.
 
+If available, enables parallel testing of @samp{libgomp} in case that
+@command{flock} is not available.
+
 @end table
 
 Several support libraries are necessary to build GCC, some are required,
diff --git a/libgomp/configure b/libgomp/configure
index 2b45acd08c6..a280ca9238a 100755
--- a/libgomp/configure
+++ b/libgomp/configure
@@ -16457,6 +16457,8 @@ $as_echo "unable to detect (assuming 1)" >&6; }
 fi
 
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for flock implementation" >&5
+$as_echo "$as_me: checking for flock implementation" >&6;}
 for ac_prog in flock
 do
   # Extract the first word of "$ac_prog", so it 

Remove stale Autoconf checks for Perl

2023-05-16 Thread Thomas Schwinge
Hi!

OK to push the attached "Remove stale Autoconf checks for Perl"?


For avoidance of doubt, there still exist a few instances of Perl usage
in the GCC build process (like, when 'contrib/make_sunver.pl' is used),
but those always directly invoke 'perl'.  As this, apparently, is working
fine, I'm not proposing changing those to now use Autoconf-determined
Perl.


Grüße
 Thomas


From e86eabae296a9153a1d02b1ed8cafda1b70485a6 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 16 May 2023 12:00:37 +0200
Subject: [PATCH] Remove stale Autoconf checks for Perl

Subversion r110220 (Git commit 03b8fe495d716c004f5491eb2347537f115ab2d8) for
PR25884 "libgomp should not require perl to compile" removed all '$(PERL)'
usage from libgomp -- but didn't remove the then-unused Autoconf Perl check
itself.  Later, this Autoconf Perl check appears to have been copied from
libgomp into other GCC libraries, likewise unused.

	libgomp/
	* configure.ac (PERL): Remove.
	* configure: Regenerate.
	* Makefile.in: Likewise.
	* testsuite/Makefile.in: Likewise.
	libatomic/
	* configure.ac (PERL): Remove.
	* configure: Regenerate.
	* Makefile.in: Likewise.
	* testsuite/Makefile.in: Likewise.
	libgm2/
	* configure.ac (PERL): Remove.
	* configure: Regenerate.
	* Makefile.in: Likewise.
	* libm2cor/Makefile.in: Likewise.
	* libm2iso/Makefile.in: Likewise.
	* libm2log/Makefile.in: Likewise.
	* libm2min/Makefile.in: Likewise.
	* libm2pim/Makefile.in: Likewise.
	libitm/
	* configure.ac (PERL): Remove.
	* configure: Regenerate.
	* Makefile.in: Likewise.
	* testsuite/Makefile.in: Likewise.
---
 libatomic/Makefile.in   |  1 -
 libatomic/configure | 46 ++---
 libatomic/configure.ac  |  1 -
 libatomic/testsuite/Makefile.in |  1 -
 libgm2/Makefile.in  |  1 -
 libgm2/configure| 46 ++---
 libgm2/configure.ac |  1 -
 libgm2/libm2cor/Makefile.in |  1 -
 libgm2/libm2iso/Makefile.in |  1 -
 libgm2/libm2log/Makefile.in |  1 -
 libgm2/libm2min/Makefile.in |  1 -
 libgm2/libm2pim/Makefile.in |  1 -
 libgomp/Makefile.in |  1 -
 libgomp/configure   | 46 ++---
 libgomp/configure.ac|  1 -
 libgomp/testsuite/Makefile.in   |  1 -
 libitm/Makefile.in  |  1 -
 libitm/configure| 46 ++---
 libitm/configure.ac |  1 -
 libitm/testsuite/Makefile.in|  1 -
 20 files changed, 8 insertions(+), 192 deletions(-)

diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index a0fa3dfc8cc..83efe7d2694 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -321,7 +321,6 @@ PACKAGE_TARNAME = @PACKAGE_TARNAME@
 PACKAGE_URL = @PACKAGE_URL@
 PACKAGE_VERSION = @PACKAGE_VERSION@
 PATH_SEPARATOR = @PATH_SEPARATOR@
-PERL = @PERL@
 RANLIB = @RANLIB@
 SECTION_LDFLAGS = @SECTION_LDFLAGS@
 SED = @SED@
diff --git a/libatomic/configure b/libatomic/configure
index e47d2d7fb35..1994662b7c5 100755
--- a/libatomic/configure
+++ b/libatomic/configure
@@ -680,7 +680,6 @@ EGREP
 GREP
 SED
 LIBTOOL
-PERL
 RANLIB
 NM
 AR
@@ -4869,47 +4868,6 @@ else
   RANLIB="$ac_cv_prog_RANLIB"
 fi
 
-# Extract the first word of "perl", so it can be a program name with args.
-set dummy perl; ac_word=$2
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
-$as_echo_n "checking for $ac_word... " >&6; }
-if ${ac_cv_path_PERL+:} false; then :
-  $as_echo_n "(cached) " >&6
-else
-  case $PERL in
-  [\\/]* | ?:[\\/]*)
-  ac_cv_path_PERL="$PERL" # Let the user override the test with a path.
-  ;;
-  *)
-  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
-for as_dir in $PATH
-do
-  IFS=$as_save_IFS
-  test -z "$as_dir" && as_dir=.
-for ac_exec_ext in '' $ac_executable_extensions; do
-  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
-ac_cv_path_PERL="$as_dir/$ac_word$ac_exec_ext"
-$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
-break 2
-  fi
-done
-  done
-IFS=$as_save_IFS
-
-  test -z "$ac_cv_path_PERL" && ac_cv_path_PERL="perl-not-found-in-path-error"
-  ;;
-esac
-fi
-PERL=$ac_cv_path_PERL
-if test -n "$PERL"; then
-  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $PERL" >&5
-$as_echo "$PERL" >&6; }
-else
-  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
-$as_echo "no" >&6; }
-fi
-
-
 
 
 # Configure libtool
@@ -11406,7 +11364,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11409 "configure"
+#line 11367 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11512,7 +11470,7 @@ else
   lt_dlunknown=0; 

Re: [PATCH] RFC: New compact syntax for insn and insn_split in Machine Descriptions

2023-05-16 Thread Richard Earnshaw (lists) via Gcc-patches

On 24/04/2023 09:33, Richard Sandiford via Gcc-patches wrote:

Richard Sandiford  writes:

Tamar Christina  writes:

Hi All,

This patch adds support for a compact syntax for specifying constraints in
instruction patterns. Credit for the idea goes to Richard Earnshaw.

I am sending up this RFC to get feedback for its inclusion in GCC 14.
With this new syntax we want a clean break from the current limitations to make
something that is hopefully easier to use and maintain.

The idea behind this compact syntax is that it is often quite hard to
correlate the entries in the constraints list, attributes and instruction lists.

One has to count, and this is often tedious.  Additionally, when changing a single
line in the insn, multiple lines in a diff change, making it harder to see what's
going on.

This new syntax takes into account many of the common things that are done in MD
files.   It's also worth saying that this version is intended to deal with the
common case of string-based alternatives.   For C chunks we have some ideas
but those are not intended to be addressed here.

It's easiest to explain with an example:

normal syntax:

(define_insn_and_split "*movsi_aarch64"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r,w, m, m,  r,  
r,  r, w,r,w, w")
(match_operand:SI 1 "aarch64_mov_operand"  " 
r,r,k,M,n,Usv,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
   "@
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %w1
mov\\t%w0, %1
#
* return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
ldr\\t%w0, %1
ldr\\t%s0, %1
str\\t%w1, %0
str\\t%s1, %0
adrp\\t%x0, %A1\;ldr\\t%w0, [%x0, %L1]
adr\\t%x0, %c1
adrp\\t%x0, %A1
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1
* return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);"
   "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), 
SImode)
 && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
[(const_int 0)]
"{
aarch64_expand_mov_immediate (operands[0], operands[1]);
DONE;
 }"
   ;; The "mov_imm" type for CNT is just a placeholder.
   [(set_attr "type" "mov_reg,mov_reg,mov_reg,mov_imm,mov_imm,mov_imm,load_4,

load_4,store_4,store_4,load_4,adr,adr,f_mcr,f_mrc,fmov,neon_move")
(set_attr "arch"   "*,*,*,*,*,sve,*,fp,*,fp,*,*,*,fp,fp,fp,simd")
(set_attr "length" "4,4,4,4,*,  4,4, 4,4, 4,8,4,4, 4, 4, 4,   4")
]
)

New syntax:

(define_insn_and_split "*movsi_aarch64"
   [(set (match_operand:SI 0 "nonimmediate_operand")
(match_operand:SI 1 "aarch64_mov_operand"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
   "@@ (cons: 0 1; attrs: type arch length)
[=r, r  ; mov_reg  , *   , 4] mov\t%w0, %w1
[k , r  ; mov_reg  , *   , 4] ^
[r , k  ; mov_reg  , *   , 4] ^
[r , M  ; mov_imm  , *   , 4] mov\t%w0, %1
[r , n  ; mov_imm  , *   , *] #
[r , Usv; mov_imm  , sve , 4] << aarch64_output_sve_cnt_immediate ('cnt', 
'%x0', operands[1]);
[r , m  ; load_4   , *   , 4] ldr\t%w0, %1
[w , m  ; load_4   , fp  , 4] ldr\t%s0, %1
[m , rZ ; store_4  , *   , 4] str\t%w1, %0
[m , w  ; store_4  , fp  , 4] str\t%s1, %0
[r , Usw; load_4   , *   , 8] adrp\t%x0, %A1;ldr\t%w0, [%x0, %L1]
[r , Usa; adr  , *   , 4] adr\t%x0, %c1
[r , Ush; adr  , *   , 4] adrp\t%x0, %A1
[w , rZ ; f_mcr, fp  , 4] fmov\t%s0, %w1
[r , w  ; f_mrc, fp  , 4] fmov\t%w0, %s1
[w , w  ; fmov , fp  , 4] fmov\t%s0, %s1
[w , Ds ; neon_move, simd, 4] << aarch64_output_scalar_simd_mov_immediate 
(operands[1], SImode);"
   "CONST_INT_P (operands[1]) && !aarch64_move_imm (INTVAL (operands[1]), 
SImode)
 && REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
   [(const_int 0)]
   {
 aarch64_expand_mov_immediate (operands[0], operands[1]);
 DONE;
   }
   ;; The "mov_imm" type for CNT is just a placeholder.
)

The patch contains some more rewritten examples for both Arm and AArch64.  I
have included them as examples in this RFC but the final version posted for
GCC 14 will have these split out.

The main syntax rules are as follows (See docs for full rules):
   - Template must start with "@@" to use the new syntax.
   - "@@" is followed by a layout in parentheses which is "cons:" followed by
 a list of match_operand/match_scratch IDs, then a semicolon, then the
 same for attributes ("attrs:"). Both sections are optional (so you can
 use only cons, or only attrs, or both), and cons must come before attrs
 if present.
   - Each alternative begins with any amount of whitespace.
   - Following the whitespace is a comma-separated list of constraints and/or
 attributes within brackets [], with sections separated by a semicolon.
   - Following the closing ']' is any amount of 

[GCC12 backport] arm: MVE testsuite and backend bugfixes

2023-05-16 Thread Stamatis Markianos-Wright via Gcc-patches

Hi all,

We've recently sent up a lot of patches overhauling the testsuite of the 
Arm MVE backend.
With these changes, we've also identified and fixed a number of bugs 
(some backend bugs and many to do with the polymorphism of intrinsics in 
the MVE header file).

These would all be relevant to backport to GCC12.
The list is as follows (in the order they all apply on top of each other):

* This patch series: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606552.html 
(commits 9a79b522e0663a202a288db56ebcbdcdb48bdaca to 
f2b54e5b796b00f0072b61f9cd6a964c66ead29b)

* ecc363971aeac52481d92de8b37521f6cc2d38e6 arm: Fix MVE testsuite fallouts
* 06aa66af7d0dacc1b247d9e38175e789ef159191 arm: Add missing early 
clobber to MVE vrev64q_m patterns
* c09663eabfb84ac56ddd8d44abcab3f4902c83bd testsuite: [arm] Relax 
expected register names in MVE tests
* 330d665ce6dcc63ed0bd78d807e69bbfc55255b6 arm: [MVE] Add missing 
length=8 attribute
* 8d4f007398bc3f8fea812fb8cff4d7d0556d12f1 arm: fix mve intrinsics scan 
body tests for C++
* This patch series 
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/610312.html 
(commits dd4424ef898608321b60610c4f3c98737ace3680 to 
267f01a493ab8a0bec9325ce3386b946c46f2e98)
* 8a1360e72d6c6056606aa5edd8c906c50f26de59 arm: Split up MVE _Generic 
associations to prevent type clashes [PR107515]

* 3f0ca7a3e4431534bff3b8eb73709cc822e489b0 arm: Fix vcreate definition
* c1093923733a1072a237f112e3239b5ebd88eadd arm: Make MVE masked stores 
read memory operand [PR 108177]
* f54e31ddefe3ea7146624eabcb75b1c90dc59f1a arm: fix __arm_vld1q_z* and 
__arm_vst1q_p* intrinsics [PR108442]
* 1d509f190393627cdf0afffc427b25dd21c2 arm: remove unused variables 
from test


-- up to this point everything applied cleanly. The final two need minor 
rebasing changes --


* This patch series: 
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/617008.html (Not 
pushed to trunk yet, but has been approved. For trunk we do now need to 
resolve some merge conflicts, since Christophe has started merging the 
MVE Intrinsic Restructuring, but these are trivial. I will also backport 
to GCC13 where this patch series applies cleanly)
* cfa118fc089e38a94ec60ccf5b667aea015e5f60 [arm] complete vmsr/vmrs 
blank and case adjustments.


The final one is a commit from Alexandre Oliva that is needed to ensure 
that we don't accidentally regress the test due to the tabs vs spaces 
and capitalisation on the vmrs/vmsr instructions :)


After all that, no regressions on baremetal arm-none-eabi in a bunch of 
configurations (-marm, thumb1, thumb2, MVE, MVE.FP, softfp and hardfp):


Thanks,
Stam



Re: [PATCH v4 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces.

2023-05-16 Thread Ajit Agarwal via Gcc-patches



On 29/04/23 5:03 am, Jeff Law wrote:
> 
> 
> On 4/28/23 16:42, Hans-Peter Nilsson wrote:
>> On Sat, 22 Apr 2023, Ajit Agarwal via Gcc-patches wrote:
>>
>>> Hello All:
>>>
>>> This new version of patch 4 use improve ree pass for rs6000 target using 
>>> defined ABI interfaces.
>>> Bootstrapped and regtested on power64-linux-gnu.
>>>
>>> Thanks & Regards
>>> Ajit
>>>
>>>
>>> ree: Improve ree pass for rs6000 target using defined abi interfaces
>>>
>>>  For rs6000 target we see redundant zero and sign
>>>  extension and done to improve ree pass to eliminate
>>>  such redundant zero and sign extension using defines
>>>  ABI interfaces.
>>>
>>>  2023-04-22  Ajit Kumar Agarwal  
>>>
>>> gcc/ChangeLog:
>>>
>>>  * ree.cc (combine_reaching_defs): Add zero_extend
>>>  using defined abi interfaces.
>>>  (add_removable_extension): use of defined abi interfaces
>>>  for no reaching defs.
>>>  (abi_extension_candidate_return_reg_p): New defined ABI function.
>>>  (abi_extension_candidate_p): New defined ABI function.
>>>  (abi_extension_candidate_argno_p): New defined ABI function.
>>>  (abi_handle_regs_without_defs_p): New defined ABI function.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>  * g++.target/powerpc/zext-elim-3.C
>>> ---
>>>   gcc/ree.cc    | 176 +++---
>>>   .../g++.target/powerpc/zext-elim-3.C  |  16 ++
>>>   2 files changed, 162 insertions(+), 30 deletions(-)
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C
>>>
>>> diff --git a/gcc/ree.cc b/gcc/ree.cc
>>> index 413aec7c8eb..0de96b1ece1 100644
>>> --- a/gcc/ree.cc
>>> +++ b/gcc/ree.cc
>>> @@ -473,7 +473,8 @@ get_defs (rtx_insn *insn, rtx reg, vec 
>>> *dest)
>>>   break;
>>>   }
>>>   -  gcc_assert (use != NULL);
>>> +  if (use == NULL)
>>> +    return NULL;
>>>       ref_chain = DF_REF_CHAIN (use);
>>>   @@ -514,7 +515,8 @@ get_uses (rtx_insn *insn, rtx reg)
>>>   if (REGNO (DF_REF_REG (def)) == REGNO (reg))
>>>     break;
>>>   -  gcc_assert (def != NULL);
>>> +  if (def == NULL)
>>> +    return NULL;
>>>       ref_chain = DF_REF_CHAIN (def);
>>>   @@ -750,6 +752,103 @@ get_extended_src_reg (rtx src)
>>>     return src;
>>>   }
>>>   +/* Return TRUE if the candidate insn is zero extend and regno is
>>> +   an return  registers.  */
>>> +
>>> +static bool
>>> +abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
>>> +{
>>> +  rtx set = single_set (insn);
>>> +
>>> +  if (GET_CODE (SET_SRC (set)) !=  ZERO_EXTEND)
>>> +    return false;
>>> +
>>> +  if (FUNCTION_VALUE_REGNO_P (regno))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>>> +
>>> +/* Return TRUE if reg source operand of zero_extend is argument registers
>>> +   and not return registers and source and destination operand are same
>>> +   and mode of source and destination operand are not same.  */
>>> +
>>> +static bool
>>> +abi_extension_candidate_p (rtx_insn *insn)
>>> +{
>>> +  rtx set = single_set (insn);
>>> +
>>> +  if (GET_CODE (SET_SRC (set)) !=  ZERO_EXTEND)
>>> +    return false;
>>> +
>>> +  machine_mode ext_dst_mode = GET_MODE (SET_DEST (set));
>>> +  rtx orig_src = XEXP (SET_SRC (set),0);
>>> +
>>> +  bool copy_needed
>>> +    = (REGNO (SET_DEST (set)) != REGNO (XEXP (SET_SRC (set), 0)));
>>> +
>>> +  if (!copy_needed && ext_dst_mode != GET_MODE (orig_src)
>>> +  && FUNCTION_ARG_REGNO_P (REGNO (orig_src))
>>> +  && !abi_extension_candidate_return_reg_p (insn, REGNO (orig_src)))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>>> +
>>> +/* Return TRUE if the candidate insn is zero extend and regno is
>>> +   an argument registers.  */
>>> +
>>> +static bool
>>> +abi_extension_candidate_argno_p (rtx_code code, int regno)
>>> +{
>>> +  if (code !=  ZERO_EXTEND)
>>> +    return false;
>>> +
>>> +  if (FUNCTION_ARG_REGNO_P (regno))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>>
>> I don't see anything in those functions that checks if
>> ZERO_EXTEND is actually a feature of the ABI, e.g. as opposed to
>> no extension or SIGN_EXTEND.  Do I miss something?
> I don't think you missed anything.  That was one of the points I was making 
> last week.  Somewhere, somehow we need to describe what the ABI mandates and 
> guarantees.
> 
> So while what Ajit has done is a step forward, at some point the actual 
> details of the ABI need to be described in a way that can be checked and 
> consumed by REE.


The ABI information we need for the ree pass is the set of argument registers
and return registers. Based on that, I have described the interfaces that we
need. Other than that, we don't need any other ABI hooks. I have used the
FUNCTION_VALUE_REGNO_P and FUNCTION_ARG_REGNO_P ABI hooks.

Thanks & Regards
Ajit
> 
> Jeff


Re: [PATCH] rtl: AArch64: New RTL for ABD

2023-05-16 Thread Richard Sandiford via Gcc-patches
Sorry for the slow reply.

Oluwatamilore Adebayo  writes:
> From afa416dab831795f7e1114da2fb9e94ea3b8c519 Mon Sep 17 00:00:00 2001
> From: oluade01 
> Date: Fri, 14 Apr 2023 15:10:07 +0100
> Subject: [PATCH 2/4] AArch64: New RTL for ABD
>
> This patch adds new RTL and tests for sabd and uabd
>
> PR tree-optimization/109156
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-simd-builtins.def (sabd, uabd):
> Change the mode to 3.
> * config/aarch64/aarch64-simd.md (aarch64_abd):
> Rename to abd3.
> * config/aarch64/aarch64-sve.md (abd_3): Rename
> to abd3.

Thanks.  These changes look good, once the vectoriser part is sorted,
but I have some comments about the tests:

> diff --git a/gcc/testsuite/gcc.target/aarch64/abd.h 
> b/gcc/testsuite/gcc.target/aarch64/abd.h
> new file mode 100644
> index 
> ..bc38e8508056cf2623cddd6053bf1cec3fa4ece4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/abd.h
> @@ -0,0 +1,62 @@
> +#ifdef ABD_IDIOM
> +
> +#define TEST1(S, TYPE) \
> +void fn_##S##_##TYPE (S TYPE * restrict a, \
> + S TYPE * restrict b,  \
> + S TYPE * restrict out) {  \
> +  for (int i = 0; i < N; i++) {\
> +signed TYPE diff = b[i] - a[i];\
> +out[i] = diff > 0 ? diff : -diff;  \
> +} }
> +
> +#define TEST2(S, TYPE1, TYPE2) \
> +void fn_##S##_##TYPE1##_##TYPE1##_##TYPE2  \
> +(S TYPE1 * restrict a, \
> + S TYPE1 * restrict b, \
> + S TYPE2 * restrict out) { \
> +  for (int i = 0; i < N; i++) {\
> +signed TYPE2 diff = b[i] - a[i];   \
> +out[i] = diff > 0 ? diff : -diff;  \
> +} }
> +
> +#define TEST3(S, TYPE1, TYPE2, TYPE3)  \
> +void fn_##S##_##TYPE1##_##TYPE2##_##TYPE3  \
> +(S TYPE1 * restrict a, \
> + S TYPE2 * restrict b, \
> + S TYPE3 * restrict out) { \
> +  for (int i = 0; i < N; i++) {\
> +signed TYPE3 diff = b[i] - a[i];   \
> +out[i] = diff > 0 ? diff : -diff;  \
> +} }
> +
> +#endif
> +
> +#ifdef ABD_ABS
> +
> +#define TEST1(S, TYPE) \
> +void fn_##S##_##TYPE (S TYPE * restrict a, \
> + S TYPE * restrict b,  \
> + S TYPE * restrict out) {  \
> +  for (int i = 0; i < N; i++)  \
> +out[i] = __builtin_abs(a[i] - b[i]);   \
> +}
> +
> +#define TEST2(S, TYPE1, TYPE2) \
> +void fn_##S##_##TYPE1##_##TYPE1##_##TYPE2  \
> +(S TYPE1 * restrict a, \
> + S TYPE1 * restrict b, \
> + S TYPE2 * restrict out) { \
> +  for (int i = 0; i < N; i++)  \
> +out[i] = __builtin_abs(a[i] - b[i]);   \
> +}
> +
> +#define TEST3(S, TYPE1, TYPE2, TYPE3)  \
> +void fn_##S##_##TYPE1##_##TYPE2##_##TYPE3  \
> +(S TYPE1 * restrict a, \
> + S TYPE2 * restrict b, \
> + S TYPE3 * restrict out) { \
> +  for (int i = 0; i < N; i++)  \
> +out[i] = __builtin_abs(a[i] - b[i]);   \
> +}
> +
> +#endif

It would be good to mark all of these functions with __attribute__((noipa)),
since I think interprocedural optimisations might otherwise defeat the
runtime test in abd_run_1.c (in the sense that we might end up folding
things at compile time and not testing the vector versions of the functions).
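
As a rough illustration of the suggestion, here is one of the test functions written out by hand with the attribute applied. This is only a sketch: abd_int and the value of N are made up for the example, and the real tests generate their functions through the TEST1/TEST2/TEST3 macros. noipa is a GCC function attribute that disables interprocedural optimisations between the function and its callers, as if the body were not visible to them.

```cpp
#include <cstdlib>

// Illustrative loop bound; the real tests define their own N.
#define N 16

// With noipa, a caller in the same translation unit cannot inline this
// function or fold a call on known inputs at compile time, so a runtime
// test really executes the (possibly vectorised) loop below.
__attribute__((noipa)) void
abd_int (const int *__restrict a, const int *__restrict b,
         int *__restrict out)
{
  for (int i = 0; i < N; i++)
    out[i] = std::abs (a[i] - b[i]);
}
```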

> diff --git a/gcc/testsuite/gcc.target/aarch64/abd_2.c 
> b/gcc/testsuite/gcc.target/aarch64/abd_2.c
> new file mode 100644
> index 
> ..45bcfabe05a395f6775f78f28c73eb536ba5654e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/abd_2.c
> @@ -0,0 +1,34 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +#pragma GCC target "+nosve"
> +#define N 1024
> +
> +#define ABD_ABS
> +#include "abd.h"
> +
> +TEST1(signed, int)
> +TEST1(signed, short)
> +TEST1(signed, char)
> +
> +TEST2(signed, char, int)
> +TEST2(signed, char, short)
> +
> +TEST3(signed, char, int, short)
> +TEST3(signed, char, short, int)
> +
> +TEST1(unsigned, int)
> +TEST1(unsigned, short)
> +TEST1(unsigned, char)
> +
> +TEST2(unsigned, char, int)
> +TEST2(unsigned, char, short)
> +
> +TEST3(unsigned, char, int, short)
> +TEST3(unsigned, char, short, int)
> +
> +/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, 
> v\[0-9\]+\.4s" 2 } } */
> +/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, 
> v\[0-9\]+\.8h" 1 } } */
> +/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.16b, v\[0-9\]+\.16b, 
> v\[0-9\]+\.16b" 1 } } */
> +/* { dg-final { scan-assembler-times "uabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, 
> 

RE: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Li, Pan2 via Gcc-patches
Thanks Richard Sandiford for review.

Yes, currently the class access_info will be extended from 8 bytes to 12 bytes, 
which is missing from the table. With the adjustment you suggested it will stay 
at 8 bytes, but unfortunately the change to m_kind may trigger an ICE in some 
test case(s).

I will take a look into it and keep you posted.

Pan

-----Original Message-----
From: Richard Sandiford  
Sent: Tuesday, May 16, 2023 5:09 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Wang, 
Yanzhang ; jeffreya...@gmail.com; rguent...@suse.de
Subject: Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

pan2...@intel.com writes:
> diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
> index c5180b9308a..38b4d6160c2 100644
> --- a/gcc/rtl-ssa/accesses.h
> +++ b/gcc/rtl-ssa/accesses.h
> @@ -254,7 +254,7 @@ private:
>unsigned int m_spare : 2;
>  
>// The value returned by the accessor above.
> -  machine_mode m_mode : 8;
> +  machine_mode m_mode : MACHINE_MODE_BITSIZE;
>  };
>  
>  // A contiguous array of access_info pointers.  Used to represent a

This structure (access_info) isn't mentioned in the table in the patch 
description.  The structure is currently 1 LP64 word and is very 
size-sensitive.  I think we should:

- Put the mode after m_regno
- Reduce m_kind to 2 bits
- Remove m_spare

I *think* that will keep the current size, but please check.
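
To show why the field order and widths matter, here is a small sketch with hypothetical stand-in structs. They are NOT the real rtl-ssa definitions: the field names and the 14-bit "flags" field are assumptions chosen only so the bit counts add up. The sizes in the comments are what GCC and Clang produce on common LP64 targets such as x86-64 and AArch64; bit-field layout is implementation-defined, so other ABIs may differ.

```cpp
#include <cstdint>

// Stand-in for the current layout: 32 + (8 + 14 + 2 + 8) bits = 8 bytes.
struct access_mode8 {
  std::uint32_t regno;
  std::uint32_t kind : 8;
  std::uint32_t flags : 14;
  std::uint32_t spare : 2;
  std::uint32_t mode : 8;
};

// Only widening the mode: 8 + 14 + 2 + 16 = 40 bits no longer fit in one
// 32-bit unit, so the mode starts a new unit and the struct grows to 12 bytes.
struct access_mode16_naive {
  std::uint32_t regno;
  std::uint32_t kind : 8;
  std::uint32_t flags : 14;
  std::uint32_t spare : 2;
  std::uint32_t mode : 16;
};

// Mode placed right after regno, kind reduced to 2 bits, spare removed:
// 16 + 2 + 14 = 32 bits, so the struct is back to 8 bytes.
struct access_mode16_packed {
  std::uint32_t regno;
  std::uint32_t mode : 16;
  std::uint32_t kind : 2;
  std::uint32_t flags : 14;
};
```

A static_assert on sizeof next to the real class would be one way to do the "please check" step at compile time.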

LGTM otherwise.

Thanks,
Richard


Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]

2023-05-16 Thread Richard Sandiford via Gcc-patches
Tejas Belagod  writes:
>>> +   {
>>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>>> + bitsize_int (step * BITS_PER_UNIT),
>>> + bitsize_int ((16 - step) * BITS_PER_UNIT));
>>> +
>>> + return gimple_build_assign (f.lhs, b);
>>> +   }
>>> +
>>> + /* If VECTOR_CST_NELTS_PER_PATTERN (pred) == 2 and every multiple of
>>> +'step_1' in
>>> +[VECTOR_CST_NPATTERNS .. VECTOR_CST_ENCODED_NELTS - 1]
>>> +is zero, then we can treat the vector as VECTOR_CST_NPATTERNS
>>> +elements followed by all inactive elements.  */
>>> + if (!const_vl && VECTOR_CST_NELTS_PER_PATTERN (pred) == 2)
>>
>> Following on from the above, maybe use:
>>
>>   !VECTOR_CST_NELTS (pred).is_constant ()
>>
>> instead of !const_vl here.
>>
>> I have a horrible suspicion that I'm contradicting our earlier discussion
>> here, sorry, but: I think we have to return null if NELTS_PER_PATTERN != 2.
>>
>> 
>>
>> IIUC, the NPATTERNS .. ENCODED_ELTS represent the repeated part of the
> encoded
>> constant. This means the repetition occurs if NELTS_PER_PATTERN == 2, IOW the
>> base1 repeats in the encoding. This loop is checking this condition and looks
>> for a 1 in the repeated part of the NELTS_PER_PATTERN == 2 in a VL vector.
>> Please correct me if I’m misunderstanding here.
>
> NELTS_PER_PATTERN == 1 is also a repeating pattern: it means that the
> entire sequence is repeated to fill a vector.  So if an NELTS_PER_PATTERN
> == 1 constant has elements {0, 1, 0, 0}, the vector is:
>
>{0, 1, 0, 0, 0, 1, 0, 0, ...}
>
>
> Wouldn’t the vect_all_same(pred, step) cover this case for a given value of
> step?
>
>
> and the optimisation can't handle that.  NELTS_PER_PATTERN == 3 isn't
> likely to occur for predicates, but in principle it has the same problem.
>
>  
>
> OK, I had misunderstood the encoding to always make base1 the repeating value
> by adjusting the NPATTERNS accordingly – I didn’t know you could also have the
> base2 value and beyond encoding the repeat value. In this case could I just
> remove NELTS_PER_PATTERN == 2 condition and the enclosed loop would check for a
> repeating ‘1’ in the repeated part of the encoded pattern?

But for NELTS_PER_PATTERN==1, the whole encoded sequence repeats.
So you would have to start the check at element 0 rather than
NPATTERNS.  And then (for NELTS_PER_PATTERN==1) the loop would reject
any constant that has a nonzero element.  But all valid zero-vector
cases have been handled by this point, so the effect wouldn't be useful.

It should never be the case that all elements from NPATTERNS
onwards are zero for NELTS_PER_PATTERN==3; that case should be
canonicalised to NELTS_PER_PATTERN==2 instead.

So in practice it's simpler and more obviously correct to punt
when NELTS_PER_PATTERN != 2.
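
For readers less familiar with the encoding, the following toy model may help. It is a sketch only: EncodedVector, element_at and zero_after_leading_patterns are hypothetical names, plain integers stand in for trees, and the zero check looks at every encoded element from NPATTERNS onwards rather than at multiples of step_1 as the patch does.

```cpp
#include <cstddef>
#include <vector>

// Toy model of an encoded vector constant; not GCC's tree representation.
struct EncodedVector {
  unsigned npatterns;          // number of interleaved patterns
  unsigned nelts_per_pattern;  // 1, 2 or 3 encoded elements per pattern
  std::vector<long> encoded;   // the first npatterns * nelts_per_pattern elements
};

// Element i of the full (conceptually unbounded) vector.
long element_at(const EncodedVector &v, std::size_t i) {
  std::size_t encoded_nelts = v.encoded.size();
  if (i < encoded_nelts) return v.encoded[i];
  std::size_t pattern = i % v.npatterns;  // which interleaved pattern
  std::size_t count = i / v.npatterns;    // position within that pattern
  long last = v.encoded[encoded_nelts - v.npatterns + pattern];
  // 1 element per pattern: the whole encoded sequence repeats.
  // 2 elements per pattern: the second element of each pattern repeats.
  if (v.nelts_per_pattern <= 2) return last;
  // 3 elements per pattern: continue the series given by the last two.
  long prev = v.encoded[encoded_nelts - 2 * v.npatterns + pattern];
  return last + static_cast<long>(count - 2) * (last - prev);
}

// True if the vector can be treated as NPATTERNS leading elements followed
// only by zeros.  Anything other than 2 elements per pattern is rejected.
bool zero_after_leading_patterns(const EncodedVector &v) {
  if (v.nelts_per_pattern != 2) return false;
  for (std::size_t i = v.npatterns; i < v.encoded.size(); ++i)
    if (v.encoded[i] != 0) return false;
  return true;
}
```

With npatterns = 4, nelts_per_pattern = 1 and encoded elements {0, 1, 0, 0}, element_at returns 1 for index 5, matching the {0, 1, 0, 0, 0, 1, 0, 0, ...} expansion described earlier in the thread.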

Thanks,
Richard


Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]

2023-05-16 Thread Tejas Belagod via Gcc-patches



From: Richard Sandiford 
Date: Tuesday, May 16, 2023 at 2:15 PM
To: Tejas Belagod 
Cc: gcc-patches@gcc.gnu.org 
Subject: Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]
Tejas Belagod  writes:
>> +  {
>> +int i;
>> +int nelts = vector_cst_encoded_nelts (v);
>> +int first_el = 0;
>> +
>> +for (i = first_el; i < nelts; i += step)
>> +  if (VECTOR_CST_ENCODED_ELT (v, i) != VECTOR_CST_ENCODED_ELT (v,
> first_el))
>
> I think this should use !operand_equal_p (..., ..., 0).
>
>
> Oops! I wonder why I thought VECTOR_CST_ENCODED_ELT returned a constant!
> Thanks for spotting that.

It does only return a constant.  But there can be multiple trees with
the same constant value, through things like TREE_OVERFLOW (not sure
where things stand on expunging that from gimple) and the fact that
gimple does not maintain a distinction between different types that
have the same mode and signedness.  (E.g. on ILP32 hosts, gimple does
not maintain a distinction between int and long, even though int 0 and
long 0 are different trees.)
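
A loose analogy, assuming a toy node type rather than GCC's trees: two distinct objects can describe the same constant, so comparing pointers is stricter than comparing what the nodes mean. ConstNode, same_node and same_value are hypothetical names, and same_value is only in the spirit of a value comparison such as operand_equal_p; it is not a description of that function's exact rules.

```cpp
// Toy constant node; real trees carry far more information.
struct ConstNode {
  unsigned precision;  // width in bits of the node's type
  bool is_unsigned;    // signedness of the node's type
  long value;          // the constant itself
};

// Identity comparison: true only for the very same object.
bool same_node(const ConstNode *a, const ConstNode *b) { return a == b; }

// Value comparison: equal when width, signedness and value all match,
// even if the two nodes are different objects (e.g. an "int 0" and a
// "long 0" on a host where both types are 32 bits wide).
bool same_value(const ConstNode *a, const ConstNode *b) {
  return a->precision == b->precision
         && a->is_unsigned == b->is_unsigned
         && a->value == b->value;
}
```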

> Also, should the flags here be OEP_ONLY_CONST ?

Nah, just 0 should be fine.

>> + return false;
>> +
>> +return true;
>> +  }
>> +
>> +  /* Fold a svlast{a/b} call with constant predicate to a BIT_FIELD_REF.
>> + BIT_FIELD_REF lowers to a NEON element extract, so we have to make sure
>> + the index of the element being accessed is in the range of a NEON
> vector
>> + width.  */
>
> s/NEON/Advanced SIMD/.  Same in later comments
>
>> +  gimple *fold (gimple_folder & f) const override
>> +  {
>> +tree pred = gimple_call_arg (f.call, 0);
>> +tree val = gimple_call_arg (f.call, 1);
>> +
>> +if (TREE_CODE (pred) == VECTOR_CST)
>> +  {
>> + HOST_WIDE_INT pos;
>> + unsigned int const_vg;
>> + int i = 0;
>> + int step = f.type_suffix (0).element_bytes;
>> + int step_1 = gcd (step, VECTOR_CST_NPATTERNS (pred));
>> + int npats = VECTOR_CST_NPATTERNS (pred);
>> + unsigned HOST_WIDE_INT nelts = vector_cst_encoded_nelts (pred);
>> + tree b = NULL_TREE;
>> + bool const_vl = aarch64_sve_vg.is_constant (&const_vg);
>
> I think this might be left over from previous versions, but:
> const_vg isn't used and const_vl is only used once, so I think it
> would be better to remove them.
>
>> +
>> + /* We can optimize 2 cases common to variable and fixed-length cases
>> +without a linear search of the predicate vector:
>> +1.  LASTA if predicate is all true, return element 0.
>> +2.  LASTA if predicate all false, return element 0.  */
>> + if (is_lasta () && vect_all_same (pred, step_1))
>> +   {
>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>> + bitsize_int (step * BITS_PER_UNIT), bitsize_int (0));
>> + return gimple_build_assign (f.lhs, b);
>> +   }
>> +
>> + /* Handle the all-false case for LASTB where SVE VL == 128b -
>> +return the highest numbered element.  */
>> + if (is_lastb () && known_eq (BYTES_PER_SVE_VECTOR, 16)
>> + && vect_all_same (pred, step_1)
>> + && integer_zerop (VECTOR_CST_ENCODED_ELT (pred, 0)))
>
> Formatting nit: one condition per line once one line isn't enough.
>
>> +   {
>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>> + bitsize_int (step * BITS_PER_UNIT),
>> + bitsize_int ((16 - step) * BITS_PER_UNIT));
>> +
>> + return gimple_build_assign (f.lhs, b);
>> +   }
>> +
>> + /* If VECTOR_CST_NELTS_PER_PATTERN (pred) == 2 and every multiple of
>> +'step_1' in
>> +[VECTOR_CST_NPATTERNS .. VECTOR_CST_ENCODED_NELTS - 1]
>> +is zero, then we can treat the vector as VECTOR_CST_NPATTERNS
>> +elements followed by all inactive elements.  */
>> + if (!const_vl && VECTOR_CST_NELTS_PER_PATTERN (pred) == 2)
>
> Following on from the above, maybe use:
>
>   !VECTOR_CST_NELTS (pred).is_constant ()
>
> instead of !const_vl here.
>
> I have a horrible suspicion that I'm contradicting our earlier discussion
> here, sorry, but: I think we have to return null if NELTS_PER_PATTERN != 2.
>
>
>
> IIUC, the NPATTERNS .. ENCODED_ELTS represent the repeated part of the encoded
> constant. This means the repetition occurs if NELTS_PER_PATTERN == 2, IOW the
> base1 repeats in the encoding. This loop is checking this condition and looks
> for a 1 in the repeated part of the NELTS_PER_PATTERN == 2 in a VL vector.
> Please correct me if I’m misunderstanding here.

NELTS_PER_PATTERN == 1 is also a repeating pattern: it means that the
entire sequence is repeated to fill a vector.  So if an NELTS_PER_PATTERN
== 1 constant has elements {0, 1, 0, 0}, the vector is:

   {0, 1, 0, 0, 0, 1, 0, 0, ...}

Wouldn’t the vect_all_same(pred, step) cover this case for a given value of 
step?

and the optimisation can't handle that.  NELTS_PER_PATTERN == 3 isn't
likely to occur for predicates, but in principle it has the 

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard. Please ignore the V10 patch and go directly to the V11 patch.
I am sorry that I sent V10; I originally did not notice that Case 2 and Case 
3 are exactly the same.
I apologize for that. I have reviewed the V11 patch twice, and it seems much 
more reasonable and easier to understand than before.

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 16:30
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai"  writes:
> Hi, Richard.
>
> RVV infrastructure in RISC-V backend status:
> 1. All RVV instructions pattern related to intrinsics are all finished (They 
> will be called not only by intrinsics but also autovec in the future).
> 2. In case of autovec, we finished len_load/len_store (They are temporary 
> used and will be removed after I support len_mask_load/len_mask_store in the 
> middle-end).
>binary integer autovec patterns.
>vec_init pattern.
>That's all we have so far.
 
Thanks.
 
> In case of testing of this patch, I have multiple rgroup testcases in local, 
> you mean you want me to post them together with this patch?
> Since I am gonna to put them in RISC-V backend testsuite, I was planning to 
> post them after this patch is finished and merged into trunk.
> What do you suggest ?
 
It would be useful to include the tests with the patch itself (as a patch
to the testsuite).  It doesn't matter that the tests are riscv-specific.
 
Obviously it would be more appropriate for the riscv maintainers to
review the riscv tests.  But keeping the tests with the patch helps when
reviewing the code, and also ensures that code is committed and never
later tested.
 
Richard
 


Re: [PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-05-16 Thread Jakub Jelinek via Gcc-patches
On Tue, May 16, 2023 at 11:45:16AM +0200, Frederik Harwath wrote:
> The place where different compilers implement the loop transformations
> was discussed in an OpenMP loop transformation meeting last year. Two
> compilers (another one and GCC with this patch series) transformed the loops
> in the middle end after the handling of data sharing, one planned to do so.
> Yet another vendor had not yet decided where it will be implemented. Clang
> currently does everything in the front end, but it was mentioned that this
> might change in the future e.g. for code sharing with Flang. Implementing
> the loop transformations late could potentially
> complicate the implementation of transformations which require adjustments
> of the data sharing clauses, but this is known and consequentially, no such

When already in the FE we determine how many canonical loops a particular
loop transformation creates, I think the primary changes I'd like to see is
really have OMP_UNROLL/OMP_TILE GENERIC statements (see below) and consider
where is the best spot to lower it.  I believe for data sharing it is best
done during gimplification before the containing loops are handled, it is
already shared code among all the FEs, I think will make it easier to handle
data sharing right and gimplification is also where doacross processing is
done.  While there is restriction that ordered clause is incompatible with
generated loops from tile construct, there isn't one for unroll (unless
"The ordered clause must not appear on a worksharing-loop directive if the
associated loops include the generated loops of a tile directive."
means unroll partial implicitly because partial unroll tiles the loop, but
it doesn't say it acts as if it was a tile construct), so we'd have to handle
#pragma omp for ordered(2)
for (int i = 0; i < 64; i++)
  #pragma omp unroll partial(4)
  for (int j = 0; j < 64; j++)
    {
      #pragma omp ordered depend (sink: i - 1, j - 2)
      #pragma omp ordered depend (source)
    }
and I think handling it after gimplification is going to be increasingly
harder.  Of course another possibility is ask lang committee to clarify
unless it has been clarified already in 6.0 (but in TR11 it is not).
Also, I think creating temporaries is easier to be done during
gimplification than later.
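
To make the remark about partial unroll acting like a tiling more concrete, here is a hand-written sketch of the loop shape, written without any OpenMP pragmas. It is an assumption about the general shape only, not what any compiler actually emits; original_order and partially_unrolled_order are made-up names.

```cpp
#include <vector>

// Visit order of the original loop: for (j = 0; j < n; j++).
std::vector<int> original_order(int n) {
  std::vector<int> order;
  for (int j = 0; j < n; j++) order.push_back(j);
  return order;
}

// Hand-written shape of a partial unroll by `factor`: one generated outer
// loop whose body covers up to `factor` consecutive original iterations.
// The inner loop is what a compiler would then fully unroll.
std::vector<int> partially_unrolled_order(int n, int factor) {
  std::vector<int> order;
  for (int jj = 0; jj < n; jj += factor)
    for (int j = jj; j < jj + factor && j < n; j++)
      order.push_back(j);
  return order;
}
```

Both functions visit the iterations in the same order, but in the second form the loop that an enclosing ordered(2) would associate with is the generated outer loop, whose iteration numbers no longer map one-to-one onto j. That is what makes the sink: i - 1, j - 2 dependence above awkward to handle.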

Another option is as you implemented a separate pre-omp-lowering pass,
and another one would be do it in the omplower pass, which has actually
several subpasses internally, do it in the scan phase.  Disadvantage of
a completely separate pass is that we have to walk the whole IL again,
while doing it in the scan phase means we avoid that cost.  We already
do there similar transformations, scan_omp_simd transforms simd constructs
into if (...) simd else simt and then we process it with normal scan_omp_for
on what we've created.  So, if you insist doing it after gimplification
perhaps for compatibility with other non-LLVM compilers, I'd prefer to
do it there rather than in a completely separate pass.

> transformations are planned for OpenMP 6.0. In particular, the "apply"
> clause therefore only permits loop-transforming constructs to be applied to
> the loops generated from other loop
> transformations in TR11.
> 
> > The normal loop constructs (OMP_FOR, OMP_SIMD, OMP_DISTRIBUTE, OMP_LOOP)
> > already need to know given their collapse/ordered how many loops they are
> > actually associated with and the loop transformation constructs can change
> > that.
> > So, I think we need to do the loop transformations in the FEs, that doesn't
> > mean we need to write everything 3 times, once for each frontend.
> > Already now, e.g. various stuff is shared between C and C++ FEs in c-family,
> > though how much can be shared between c-family and Fortran is to be
> > discovered.
> > Or at least partially, to the extent that we compute how many canonical
> > loops the loop transformations result in, what artificial iterators they
> > will use etc., so that during gimplification we can take all that into
> > account and then can do the actual transformations later.
> 
> The patches in this patch series already do compute how many canonical
> loop nests result from the loop transformations in the front end.

Good.

> This is necessary to represent the loop nest that is affected by the
> loop transformations by a single OMP_FOR to meet the expectations
> of all later OpenMP code transformations. This is also the major
> reason why the loop transformations are represented by clauses
> instead of representing them as  "OMP_UNROLL/OMP_TILE as
> GENERIC constructs like OMP_FOR" as you suggest below. Since the

I really don't see why.  We try to represent what we see in the source
as OpenMP constructs as those constructs.  We already have a precedent
with composite loop constructs, where for the combined constructs which
aren't innermost we temporarily use NULL OMP_FOR_{INIT,COND,INCR,ORIG_DECLS}
vectors to stand for this will be some loop, but the details for it 

[PATCH] aarch64: Allow moves after tied-register intrinsics (2nd edition)

2023-05-16 Thread Richard Sandiford via Gcc-patches
I missed these two in g:4ff89f10ca0d41f9cfa76 because I was
testing on a system that didn't support big-endian compilation.
Testing on aarch64_be-elf shows no other related failures
(although the overall results are worse than for little-endian).

Tested on aarch64_be-elf & pushed.

Richard


gcc/testsuite/
* gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c: Allow moves
to occur after the intrinsic instruction, rather than requiring
them to happen before.
* gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c: Likewise.
---
 .../gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c| 10 ++
 .../gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c   | 10 ++
 2 files changed, 20 insertions(+)

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
index ae0a953f7b4..9975edb8fdb 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
@@ -70,8 +70,13 @@ float32x4_t ufooq_lane(float32x4_t r, bfloat16x8_t x, 
bfloat16x4_t y)
 
 /*
 **ufoo_untied:
+** (
 ** mov v0.8b, v1.8b
 ** bfdot   v0.2s, (v2.4h, v3.4h|v3.4h, v2.4h)
+** |
+** bfdot   v1.2s, (v2.4h, v3.4h|v3.4h, v2.4h)
+** mov v0.8b, v1.8b
+** )
 ** ret
 */
 float32x2_t ufoo_untied(float32x4_t unused, float32x2_t r, bfloat16x4_t x, 
bfloat16x4_t y)
@@ -81,8 +86,13 @@ float32x2_t ufoo_untied(float32x4_t unused, float32x2_t r, 
bfloat16x4_t x, bfloa
 
 /*
 **ufooq_lane_untied:
+** (
 ** mov v0.16b, v1.16b
 ** bfdot   v0.4s, v2.8h, v3.2h\[1\]
+** |
+** bfdot   v1.4s, v2.8h, v3.2h\[1\]
+** mov v0.16b, v1.16b
+** )
 ** ret
 */
 float32x4_t ufooq_lane_untied(float32x4_t unused, float32x4_t r, bfloat16x8_t 
x, bfloat16x4_t y)
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
index 61c7c51f5ec..76787f6bedd 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
@@ -115,8 +115,13 @@ int32x4_t sfooq_laneq (int32x4_t r, int8x16_t x, 
uint8x16_t y)
 
 /*
 **ufoo_untied:
+** (
 ** mov v0\.8b, v1\.8b
 ** usdot   v0\.2s, v2\.8b, v3\.8b
+** |
+** usdot   v1\.2s, v2\.8b, v3\.8b
+** mov v0\.8b, v1\.8b
+** )
 ** ret
 */
 int32x2_t ufoo_untied (int32x2_t unused, int32x2_t r, uint8x8_t x, int8x8_t y)
@@ -126,8 +131,13 @@ int32x2_t ufoo_untied (int32x2_t unused, int32x2_t r, 
uint8x8_t x, int8x8_t y)
 
 /*
 **ufooq_laneq_untied:
+** (
 ** mov v0\.16b, v1\.16b
 ** usdot   v0\.4s, v2\.16b, v3\.4b\[3\]
+** |
+** usdot   v1\.4s, v2\.16b, v3\.4b\[3\]
+** mov v0\.16b, v1\.16b
+** )
 ** ret
 */
 int32x4_t ufooq_laneq_untied (int32x2_t unused, int32x4_t r, uint8x16_t x, 
int8x16_t y)
-- 
2.25.1



Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard and Richi.
I am sorry for sending you garbage patches (my mistake: I sent RISC-V 
patches to you).

I finally realized that Case 2 and Case 3 are exactly the same sequence!
I have combined them into a single function called "vect_adjust_loop_lens_control".

I have sent V11 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618724.html 

I think this patch is reasonable now!
Could you take a look at it?

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 16:30
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai"  writes:
> Hi, Richard.
>
> RVV infrastructure status in the RISC-V backend:
> 1. All RVV instruction patterns related to intrinsics are finished (they 
> will be used not only by intrinsics but also by autovec in the future).
> 2. For autovec, we have finished len_load/len_store (they are used 
> temporarily and will be removed after I support len_mask_load/len_mask_store 
> in the middle-end).
>    binary integer autovec patterns.
>    vec_init pattern.
>    That's all we have so far.
 
Thanks.
 
> As for testing this patch, I have multiple-rgroup testcases locally. 
> Do you mean you want me to post them together with this patch?
> Since I am going to put them in the RISC-V backend testsuite, I was planning to 
> post them after this patch is finished and merged into trunk.
> What do you suggest?
 
It would be useful to include the tests with the patch itself (as a patch
to the testsuite).  It doesn't matter that the tests are riscv-specific.
 
Obviously it would be more appropriate for the riscv maintainers to
review the riscv tests.  But keeping the tests with the patch helps when
reviewing the code, and also ensures that code isn't committed and never
later tested.
 
Richard
 


[PATCH V11] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This patch implements a decrementing IV for the length approach in loop control.

Address the comment from kewen by incorporating the implementation inside
"vect_set_loop_controls_directly" instead of a standalone function.

Address the comment from Richard by using MIN_EXPR to handle the following
3 cases:
1. single rgroup.
2. multiple rgroup for SLP.
3. multiple rgroup for non-SLP (tested on vec_pack_trunc).
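
As a rough scalar model of the single-rgroup case only, a decrementing IV can be pictured as a counter of remaining elements, with each iteration's length taken as the minimum of that counter and the vectorization factor. The sketch below is an illustration under that assumption, not the code the vectorizer emits; the function name and the `vf` parameter are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar model of a length-controlled loop that uses a
// decrementing IV.  `vf` stands in for the number of elements that one
// vector iteration can process.  Returns the length used by each iteration.
std::vector<std::size_t>
decrementing_iv_lengths (std::size_t n, std::size_t vf)
{
  assert (vf > 0);
  std::vector<std::size_t> lens;
  std::size_t remaining = n;	// The decrementing IV.
  while (remaining > 0)
    {
      // Corresponds to the MIN_EXPR: never process more than what is left.
      std::size_t len = std::min (remaining, vf);
      lens.push_back (len);
      remaining -= len;		// Decrement by the length just processed.
    }
  return lens;
}
```

With n = 10 and vf = 4 this model yields the lengths 4, 4, 2. The multiple-rgroup cases need additional adjustment of the per-control lengths, which this sketch does not attempt to model.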


gcc/ChangeLog:

* tree-vect-loop-manip.cc (vect_adjust_loop_lens): New function.
(vect_set_loop_controls_directly): Add decrement IV support.
(vect_set_loop_condition_partial_vectors): Ditto.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): New variable.
(vect_get_loop_len): Add decrement IV support.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.
(vect_get_loop_len): Add decrement IV support.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New 
test.

---
 .../rvv/autovec/partial/multiple_rgroup-1.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-1.h   | 304 ++
 .../rvv/autovec/partial/multiple_rgroup-2.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-2.h   | 546 ++
 .../autovec/partial/multiple_rgroup_run-1.c   |  19 +
 .../autovec/partial/multiple_rgroup_run-2.c   |  19 +
 gcc/tree-vect-loop-manip.cc   | 184 +-
 gcc/tree-vect-loop.cc |  37 +-
 gcc/tree-vect-stmts.cc|   9 +-
 gcc/tree-vectorizer.h |  13 +-
 10 files changed, 1132 insertions(+), 11 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
new file mode 100644
index 000..69cc3be78f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param 
riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-1.h"
+
+TEST_ALL (test_1)
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
new file mode 100644
index 000..fbc49f4855d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
@@ -0,0 +1,304 @@
+#include 
+#include 
+
+#define test_1(TYPE1, TYPE2)   
\
+  void __attribute__ ((noinline, noclone)) 
\
+  test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x,   
\
+ TYPE1 x2, TYPE2 y, int n)\
+  {
\
+for (int i = 0; i < n; ++i)
\
+  {
\
+   f[i * 2 + 0] = x;  \
+   f[i * 2 + 1] = x2; \
+   d[i] = y;  \
+  }
\
+  }
+
+#define run_1(TYPE1, TYPE2)
\
+  int n_1_##TYPE1_##TYPE2 = 1; 
\
+  TYPE1 x_1_##TYPE1 = 117; 
\
+  TYPE1 x2_1_##TYPE1 = 232;
\
+  TYPE2 y_1_##TYPE2 = 9762;
\
+  TYPE1 f_1_##TYPE1[2 * 2 + 1] = {0};  
\
+  TYPE2 d_1_##TYPE2[2] = {0};

[PATCH V2] RISC-V: Add FRM and rounding mode operand into floating point intrinsics

2023-05-16 Thread juzhe . zhong
From: Juzhe-Zhong 

This patch adds a rounding mode operand and an FRM_REGNUM dependency
to floating-point instructions.

The floating-point instructions to which we added FRM and a rounding mode operand:
1. vfadd/vfsub
2. vfwadd/vfwsub
3. vfmul
4. vfdiv
5. vfwmul
6. vfwmacc/vfwnmacc/vfwmsac/vfwnmsac
7. vfsqrt
8. floating-point conversions.
9. floating-point reductions.
10. floating-point ternary.

The floating-point instructions to which we did NOT add FRM and a rounding mode operand:
1. vfabs/vfneg/vfsqrt7/vfrec7
2. vfmin/vfmax
3. comparisons
4. vfclass
5. vfsgnj/vfsgnjn/vfsgnjx
6. vfmerge
7. vfmv.v.f
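
For readers unfamiliar with the FRM field, the sketch below repeats the encodings that this patch lists in frm_field_enum and adds a small check for whether a 3-bit value is one of them. The helper frm_value_defined_p is hypothetical and not part of the patch; the assumption that 0b101 and 0b110 are reserved comes from the RISC-V F extension specification.

```cpp
// Mirrors the encodings listed in the patch's frm_field_enum.
enum frm_field_enum
{
  FRM_RNE = 0b000,
  FRM_RTZ = 0b001,
  FRM_RDN = 0b010,
  FRM_RUP = 0b011,
  FRM_RMM = 0b100,
  DYN = 0b111
};

// Hypothetical helper (not in the patch): return true if VALUE is one of
// the encodings above.  0b101 and 0b110 are treated as reserved.
constexpr bool
frm_value_defined_p (unsigned value)
{
  return value <= static_cast<unsigned> (FRM_RMM)
	 || value == static_cast<unsigned> (DYN);
}
```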

gcc/ChangeLog:

* config/riscv/riscv-protos.h (enum frm_field_enum): New enum.
* config/riscv/riscv-vector-builtins.cc 
(function_expander::use_ternop_insn): Add default rounding mode.
(function_expander::use_widen_ternop_insn): Ditto.
* config/riscv/riscv.cc (riscv_hard_regno_nregs): Add FRM REGNUM.
(riscv_hard_regno_mode_ok): Ditto.
(riscv_conditional_register_usage): Ditto.
* config/riscv/riscv.h (DWARF_FRAME_REGNUM): Ditto.
(FRM_REG_P): Ditto.
(RISCV_DWARF_FRM): Ditto.
* config/riscv/riscv.md: Ditto.
* config/riscv/vector-iterators.md: split no frm and has frm operations.
* config/riscv/vector.md (@pred__scalar): New pattern.
(@pred_): Ditto.

---
 gcc/config/riscv/riscv-protos.h   |  10 +
 gcc/config/riscv/riscv-vector-builtins.cc |  14 ++
 gcc/config/riscv/riscv.cc |   7 +-
 gcc/config/riscv/riscv.h  |   7 +-
 gcc/config/riscv/riscv.md |   1 +
 gcc/config/riscv/vector-iterators.md  |   9 +-
 gcc/config/riscv/vector.md| 258 ++
 7 files changed, 251 insertions(+), 55 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 835bb802fc6..12634d0ac1a 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -231,6 +231,16 @@ enum vxrm_field_enum
   VXRM_RDN,
   VXRM_ROD
 };
+/* Rounding mode bitfield for floating point FRM.  */
+enum frm_field_enum
+{
+  FRM_RNE = 0b000,
+  FRM_RTZ = 0b001,
+  FRM_RDN = 0b010,
+  FRM_RUP = 0b011,
+  FRM_RMM = 0b100,
+  DYN = 0b111
+};
 }
 
 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 1de075fb90d..b7458aaace6 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3460,6 +3460,13 @@ function_expander::use_ternop_insn (bool vd_accum_p, 
insn_code icode)
   add_input_operand (Pmode, get_tail_policy_for_pred (pred));
   add_input_operand (Pmode, get_mask_policy_for_pred (pred));
   add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
+
+  /* TODO: Currently, we don't support intrinsic that is modeling rounding 
mode.
+ We add default rounding mode for the intrinsics that didn't model rounding
+ mode yet.  */
+  if (opno != insn_data[icode].n_generator_args)
+add_input_operand (Pmode, const0_rtx);
+
   return generate_insn (icode);
 }
 
@@ -3482,6 +3489,13 @@ function_expander::use_widen_ternop_insn (insn_code 
icode)
   add_input_operand (Pmode, get_tail_policy_for_pred (pred));
   add_input_operand (Pmode, get_mask_policy_for_pred (pred));
   add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
+
+  /* TODO: Currently, we don't support intrinsic that is modeling rounding 
mode.
+ We add default rounding mode for the intrinsics that didn't model rounding
+ mode yet.  */
+  if (opno != insn_data[icode].n_generator_args)
+add_input_operand (Pmode, const0_rtx);
+
   return generate_insn (icode);
 }
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index b52e613c629..de5b87b1a87 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -6082,7 +6082,8 @@ riscv_hard_regno_nregs (unsigned int regno, machine_mode 
mode)
 
   /* mode for VL or VTYPE are just a marker, not holding value,
  so it always consume one register.  */
-  if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno))
+  if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno)
+  || FRM_REG_P (regno))
 return 1;
 
   /* Assume every valid non-vector mode fits in one vector register.  */
@@ -6150,7 +6151,8 @@ riscv_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
   if (lmul != 1)
return ((regno % lmul) == 0);
 }
-  else if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno))
+  else if (VTYPE_REG_P (regno) || VL_REG_P (regno) || VXRM_REG_P (regno)
+  || FRM_REG_P (regno))
 return true;
   else
 return false;
@@ -6587,6 +6589,7 @@ riscv_conditional_register_usage (void)
   fixed_regs[VTYPE_REGNUM] = call_used_regs[VTYPE_REGNUM] = 1;
   fixed_regs[VL_REGNUM] = call_used_regs[VL_REGNUM] = 1;
   fixed_regs[VXRM_REGNUM] = 

Re: [PATCH 0/7] openmp: OpenMP 5.1 loop transformation directives

2023-05-16 Thread Frederik Harwath via Gcc-patches

Hi Jakub,

On 15.05.23 12:19, Jakub Jelinek wrote:

On Fri, Mar 24, 2023 at 04:30:38PM +0100, Frederik Harwath wrote:

this patch series implements the OpenMP 5.1 "unroll" and "tile"
constructs.  It includes changes to the C, C++, and Fortran front ends
for parsing the new constructs and a new middle-end
"omp_transform_loops" pass which implements the transformations in a
source language agnostic way.

I'm afraid we can't do it this way, at least not completely.

The OpenMP requirements and what is being discussed for further loop
transformations pretty much requires parts of it to be done as soon as possible.
My understanding is that that is where other implementations implement that
too, and I would also prefer GCC not to be the only implementation that takes
a significantly different decision in that case from other implementations.


The place where different compilers implement the loop transformations
was discussed in an OpenMP loop transformation meeting last year. Two 
compilers (another one and GCC with this patch series) transformed the 
loops in the middle end after the handling of data sharing, one planned 
to do so. Yet another vendor had not yet decided where it will be 
implemented. Clang currently does everything in the front end, but it 
was mentioned that this might change in the future e.g. for code sharing 
with Flang. Implementing the loop transformations late could potentially
complicate the implementation of transformations which require 
adjustments of the data sharing clauses, but this is known and 
consequentially, no such transformations are planned for OpenMP 6.0. In 
particular, the "apply" clause therefore only permits loop-transforming 
constructs to be applied to the loops generated from other loop
transformations in TR11.


The normal loop constructs (OMP_FOR, OMP_SIMD, OMP_DISTRIBUTE, OMP_LOOP)
already need to know given their collapse/ordered how many loops they are
actually associated with and the loop transformation constructs can change
that.
So, I think we need to do the loop transformations in the FEs, that doesn't
mean we need to write everything 3 times, once for each frontend.
Already now, e.g. various stuff is shared between C and C++ FEs in c-family,
though how much can be shared between c-family and Fortran is to be
discovered.
Or at least partially, to the extent that we compute how many canonical
loops the loop transformations result in, what artificial iterators they
will use etc., so that during gimplification we can take all that into
account and then can do the actual transformations later.


The patches in this patch series already do compute how many canonical
loop nests result from the loop transformations in the front end.
This is necessary to represent the loop nest that is affected by the
loop transformations by a single OMP_FOR to meet the expectations
of all later OpenMP code transformations. This is also the major
reason why the loop transformations are represented by clauses
instead of representing them as "OMP_UNROLL/OMP_TILE as
GENERIC constructs like OMP_FOR" as you suggest below. Since the
loop transformations may also appear on inner loops of a collapsed
loop nest (i.e. within the collapsed depth), representing the
transformation by OMP_FOR-like constructs would imply that a collapsed
loop nest would have to be broken apart into single loops. Perhaps this
could be handled somehow, but the collapsed loop nest would have to be
re-assembled to meet the expectations of e.g. gimplification.
The clause representation is also much better suited for the upcoming
OpenMP "apply" clause where the transformations will not appear
as directives in front of actual loops but inside of other clauses.
In fact, the loop transformation clauses in the implementation already
specify the level of a loop nest to which they apply and it could
be possible to re-use this handling for "apply".

My initial reaction also was to implement the loop transformations
as OMP_FOR-like constructs and the patch actually introduces an
OMP_LOOP_TRANS construct which is used to represent loops that
are not going to be associated with another OpenMP directive after
the transformation, e.g.

void foo () {
  #pragma omp tile sizes (4, 8, 16)
  for (int i = 0; i < 64; ++i)
  {
...
  }

}
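
To make concrete what a tile transformation does to iteration order, here is a hand-written sketch of a two-level loop nest tiled manually. It only illustrates the general floor-loop/tile-loop idea and is not the code generated by the "omp_transform_loops" pass; the function name and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical illustration: record the order in which the iterations (i, j)
// of an n x m loop nest are visited after tiling by hand with tile sizes
// ti x tj.
std::vector<std::pair<int, int>>
tiled_order (int n, int m, int ti, int tj)
{
  assert (ti > 0 && tj > 0);
  std::vector<std::pair<int, int>> order;
  // "Floor" loops: step from tile to tile.
  for (int i0 = 0; i0 < n; i0 += ti)
    for (int j0 = 0; j0 < m; j0 += tj)
      // "Tile" loops: iterate inside one tile, clamped at the nest bounds.
      for (int i = i0; i < std::min (i0 + ti, n); ++i)
	for (int j = j0; j < std::min (j0 + tj, m); ++j)
	  order.emplace_back (i, j);
  return order;
}
```

Note that one tiled loop nest of depth 2 becomes a nest of depth 4, which is why the number of loops associated with an enclosing directive can change after the transformation.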

You suggest to implement the loop transformations during gimplification.
I am not sure if gimplification is actually well-suited to implement the 
depth-first evaluation of the loop transformations. I also believe that 
gimplification already handles too many things which conceptually are 
not related to the translation to GIMPLE. Having a separate pass seems 
to be the right move to achieve a better separation of concerns. I think 
this will be even more important in the future as the size of the loop 
transformation implementation keeps growing. As you mention below, 
several new constructs are already planned.



For C, I think the lowering of loop transformation constructs or at least
determining 

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard.
I have sent V10:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618718.html 

I can't combine the implementations of Case 2 and Case 3: in Case 2 the 
controls (len) all come from the same rgc,
but in Case 3 the controls (len) come from different rgcs.
Can you help me with that?
Also, I have appended my testcases to this patch.
Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 16:30
To: juzhe.zhong@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai"  writes:
> Hi, Richard.
>
> RVV infrastructure status in the RISC-V backend:
> 1. All RVV instruction patterns related to intrinsics are finished (they 
> will be used not only by intrinsics but also by autovec in the future).
> 2. For autovec, we have finished len_load/len_store (they are used 
> temporarily and will be removed after I support len_mask_load/len_mask_store 
> in the middle-end).
>    binary integer autovec patterns.
>    vec_init pattern.
>    That's all we have so far.
 
Thanks.
 
> As for testing this patch, I have multiple-rgroup testcases locally. 
> Do you mean you want me to post them together with this patch?
> Since I am going to put them in the RISC-V backend testsuite, I was planning to 
> post them after this patch is finished and merged into trunk.
> What do you suggest?
 
It would be useful to include the tests with the patch itself (as a patch
to the testsuite).  It doesn't matter that the tests are riscv-specific.
 
Obviously it would be more appropriate for the riscv maintainers to
review the riscv tests.  But keeping the tests with the patch helps when
reviewing the code, and also ensures that code isn't committed and never
later tested.
 
Richard
 


Re: [PATCH] Add auto-resizing capability to irange's [PR109695]

2023-05-16 Thread Aldy Hernandez via Gcc-patches



On 5/15/23 20:14, Aldy Hernandez wrote:

On 5/15/23 17:07, Aldy Hernandez wrote:



On 5/15/23 12:42, Jakub Jelinek wrote:

On Mon, May 15, 2023 at 12:35:23PM +0200, Aldy Hernandez wrote:

gcc/ChangeLog:

PR tree-optimization/109695
* value-range.cc (irange::operator=): Resize range.
(irange::union_): Same.
(irange::intersect): Same.
(irange::invert): Same.
(int_range_max): Default to 3 sub-ranges and resize as needed.
* value-range.h (irange::maybe_resize): New.
(~int_range): New.
(int_range::int_range): Adjust for resizing.
(int_range::operator=): Same.


LGTM.

One question is if we shouldn't do it for GCC13/GCC12 as well, perhaps
changing it to some larger number than 3 when the members aren't wide_ints
in there but just trees.  Sure, in 13/12 the problem is 10x less severe
than in current trunk, but still we have some cases where we run out of
stack because of it on some hosts.


Sure, but that would require messing around with the gt_* GTY 
functions, and making sure we're allocating the trees from a sensible 
place, etc etc.  I'm less confident in my ability to mess with GTY 
stuff this late in the game.


Hmmm, maybe backporting this isn't too bad.  The only time we'd have a 
chunk on the heap is for int_range_max, which will never live in GC 
space.  So I don't think we need to worry about GC at all.


Although, legacy mode in GCC13 does get in the way a bit.  Sigh.


I've adapted the patch to GCC13 and tested it on x86-64 Linux.  Please 
look over the new[] I do for trees to make sure I did things right.


int_range_max on GCC13 is currently 4112 bytes.  Here are the numbers 
for various defaults:


< 2> =  64 bytes, 3.02% for VRP.
< 3> =  80 bytes, 2.67% for VRP.
< 8> = 160 bytes, 2.46% for VRP.
<16> = 288 bytes, 2.40% for VRP.

Note that we don't have any runway on GCC13, so this would be a net loss 
in performance for VRP.  Threading shows about half as much of a drop 
as VRP.  Overall compilation is within 0.2%, so not noticeable.


I'm surprised 2 sub-ranges don't incur a bigger penalty, but 3 seems 
to be the happy medium.  Anything more than that, and there's no difference.


The patch defaults to 3 sub-ranges.  I must say, 80 bytes looks mighty 
nice.  It's up to you what to do with the patch.  I'm chicken shit at 
heart and hate touching release compilers :).


Aldy

From 777aa930b106fea2dd6ed9fe22b42a2717f1472d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Mon, 15 May 2023 12:25:58 +0200
Subject: [PATCH] [GCC13] Add auto-resizing capability to irange's [PR109695]

Backport the following from trunk.

	Note that the patch has been adapted to trees.

	The numbers for various sub-ranges on GCC13 are:
		< 2> =  64 bytes, -3.02% for VRP.
		< 3> =  80 bytes, -2.67% for VRP.
		< 8> = 160 bytes, -2.46% for VRP.
		<16> = 288 bytes, -2.40% for VRP.


We can now have int_range for automatically
resizable ranges.  int_range_max is now int_range<3, true>
for a 69X reduction in size from current trunk, and 6.9X reduction from
GCC12.  This incurs a 5% performance penalty for VRP that is more than
covered by our > 13% improvements recently.


int_range_max is the temporary range object we use in the ranger for
integers.  With the conversion to wide_int, this structure bloated up
significantly because wide_ints are huge (80 bytes a piece) and are
about 10 times as big as a plain tree.  Since the temporary object
requires 255 sub-ranges, that's 255 * 80 * 2, plus the control word.
This means the structure grew from 4112 bytes to 40912 bytes.

This patch adds the ability to resize ranges as needed, defaulting to
no resizing, while int_range_max now defaults to 3 sub-ranges (instead
of 255) and grows to 255 when the range being calculated does not fit.

For example:

int_range<1> foo;	// 1 sub-range with no resizing.
int_range<5> foo;	// 5 sub-ranges with no resizing.
int_range<5, true> foo;	// 5 sub-ranges with resizing.

I ran some tests and found that 3 sub-ranges cover 99% of cases, so
I've set the int_range_max default to that:

	typedef int_range<3, /*RESIZABLE=*/true> int_range_max;

We don't bother growing incrementally, since the default covers most
cases and we have a 255 hard-limit.  This hard limit could be reduced
to 128, since my tests never saw a range needing more than 124, but we
could do that as a follow-up if needed.
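
The growth policy described above (a small inline default, then a single jump to the hard limit rather than incremental growth) can be sketched in isolation. The class below is a simplified, hypothetical model and is not GCC's irange or int_range; the name small_range, the use of pairs of long for sub-ranges, and the push interface are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical model: N inline sub-ranges, optionally growing once to a
// fixed hard limit of 255 sub-ranges on the heap.
template <std::size_t N, bool RESIZABLE = false>
class small_range
{
  static_assert (N > 0, "need at least one inline sub-range");

public:
  static constexpr std::size_t hard_limit = 255;

  small_range () = default;
  // Copying would leave m_data pointing into another object; disallow it
  // to keep the sketch short.
  small_range (const small_range &) = delete;
  small_range &operator= (const small_range &) = delete;

  std::size_t size () const { return m_num; }
  std::size_t capacity () const { return m_max; }

  // Append the sub-range [lo, hi].  Returns false if the storage is full
  // and cannot grow.
  bool push (long lo, long hi)
  {
    if (m_num == m_max && !maybe_resize ())
      return false;
    m_data[m_num++] = {lo, hi};
    return true;
  }

private:
  // Grow straight to the hard limit; there is no incremental growth.
  bool maybe_resize ()
  {
    if (!RESIZABLE || m_max >= hard_limit)
      return false;
    auto heap = std::make_unique<std::pair<long, long>[]> (hard_limit);
    std::copy (m_data, m_data + m_num, heap.get ());
    m_heap = std::move (heap);
    m_data = m_heap.get ();
    m_max = hard_limit;
    return true;
  }

  std::pair<long, long> m_inline[N];
  std::unique_ptr<std::pair<long, long>[]> m_heap;
  std::pair<long, long> *m_data = m_inline;
  std::size_t m_num = 0;
  std::size_t m_max = N;
};
```

In this model a small_range<3, true> stays on its inline storage for the common case and allocates only when a fourth sub-range is pushed, while a small_range<3> simply refuses the fourth push.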

With 3 sub-ranges, int_range_max is now 592 bytes versus 40912 for
trunk, and versus 4112 bytes for GCC12!  The penalty is 5.04% for VRP
and 3.02% for threading, with no noticeable change in overall
compilation (0.27%).  This is more than covered by our 13.26%
improvements for the legacy removal + wide_int conversion.

I think this approach is a good alternative, while providing us with
flexibility going forward.  For example, we could try defaulting to
8 sub-ranges for a noticeable improvement in VRP.  We could also use
large sub-ranges for switch 

[PATCH V10] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe . zhong
From: Ju-Zhe Zhong 

This patch implements a decrementing IV for the length approach in loop control.

Address the comment from kewen by incorporating the implementation inside
"vect_set_loop_controls_directly" instead of a standalone function.

Address the comment from Richard by using MIN_EXPR to handle the following
3 cases:
1. single rgroup.
2. multiple rgroup for SLP.
3. multiple rgroup for non-SLP (tested on vec_pack_trunc).


gcc/ChangeLog:

* tree-vect-loop-manip.cc (vect_adjust_loop_lens): New function.
(vect_set_loop_controls_directly): Add decrement IV support.
(vect_set_loop_condition_partial_vectors): Ditto.
* tree-vect-loop.cc (_loop_vec_info::_loop_vec_info): New variable.
(vect_get_loop_len): Add decrement IV support.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (LOOP_VINFO_USING_DECREMENTING_IV_P): New macro.
(vect_get_loop_len): Add decrement IV support.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New 
test.

---
 .../rvv/autovec/partial/multiple_rgroup-1.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-1.h   | 304 ++
 .../rvv/autovec/partial/multiple_rgroup-2.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-2.h   | 546 ++
 .../autovec/partial/multiple_rgroup_run-1.c   |  19 +
 .../autovec/partial/multiple_rgroup_run-2.c   |  19 +
 gcc/tree-vect-loop-manip.cc   | 203 ++-
 gcc/tree-vect-loop.cc |  37 +-
 gcc/tree-vect-stmts.cc|   9 +-
 gcc/tree-vectorizer.h |  13 +-
 10 files changed, 1148 insertions(+), 14 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
new file mode 100644
index 000..69cc3be78f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param 
riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-1.h"
+
+TEST_ALL (test_1)
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
new file mode 100644
index 000..fbc49f4855d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
@@ -0,0 +1,304 @@
+#include 
+#include 
+
+#define test_1(TYPE1, TYPE2)   
\
+  void __attribute__ ((noinline, noclone)) 
\
+  test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x,   
\
+ TYPE1 x2, TYPE2 y, int n)\
+  {
\
+for (int i = 0; i < n; ++i)
\
+  {
\
+   f[i * 2 + 0] = x;  \
+   f[i * 2 + 1] = x2; \
+   d[i] = y;  \
+  }
\
+  }
+
+#define run_1(TYPE1, TYPE2)
\
+  int n_1_##TYPE1_##TYPE2 = 1; 
\
+  TYPE1 x_1_##TYPE1 = 117; 
\
+  TYPE1 x2_1_##TYPE1 = 232;
\
+  TYPE2 y_1_##TYPE2 = 9762;
\
+  TYPE1 f_1_##TYPE1[2 * 2 + 1] = {0};  
\
+  TYPE2 d_1_##TYPE2[2] = {0};   

[committed 2/3] libstdc++: Stop using _GLIBCXX_USE_C99_STDINT_TR1 in

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

The _GLIBCXX_USE_C99_STDINT_TR1 macro (and the comments about it in
acinclude.m4 and config.h) are misleading when it is also used for
, not only . It is also wrong, because the
configure checks for TR1 use -std=c++98 and a target might define
uint32_t etc. for C++11 but not for C++98.

Add a separate configure check for the  types using -std=c++11
for the checks. Use the result of that separate check in  and
most other places that still depend on the macro (many uses of that
macro have been removed already). The remaining uses of the STDINT_TR1
macro are really for TR1, or are in the src/c++11/compatibility-*.cc
files, where we don't want/need to change the condition they depend on
(if those symbols were only exported when  types were
available for -std=c++98, then that's the condition we should continue
to use for whether to export the compat symbols now).

Make similar changes for the related _GLIBCXX_USE_C99_INTTYPES_TR1 and
_GLIBCXX_USE_C99_INTTYPES_WCHAR_T_TR1 macros, adding new macros for
non-TR1 uses.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_USE_C99): Check for  types in
C++11 mode and define _GLIBCXX_USE_C99_STDINT. Check for
 features in C++11 mode and define
_GLIBCXX_USE_C99_INTTYPES and _GLIBCXX_USE_C99_INTTYPES_WCHAR_T.
* config.h.in: Regenerate.
* configure: Regenerate.
* doc/doxygen/user.cfg.in (PREDEFINED): Add new macros.
* include/bits/chrono.h: Check _GLIBCXX_USE_C99_STDINT instead
of _GLIBCXX_USE_C99_STDINT_TR1.
* include/c_compatibility/inttypes.h: Check
_GLIBCXX_USE_C99_INTTYPES and _GLIBCXX_USE_C99_INTTYPES_WCHAR_T
instead of _GLIBCXX_USE_C99_INTTYPES_TR1 and
_GLIBCXX_USE_C99_INTTYPES_WCHAR_T_TR1.
* include/c_compatibility/stdatomic.h: Check
_GLIBCXX_USE_C99_STDINT instead of _GLIBCXX_USE_C99_STDINT_TR1.
* include/c_compatibility/stdint.h: Likewise.
* include/c_global/cinttypes: Check _GLIBCXX_USE_C99_INTTYPES
and _GLIBCXX_USE_C99_INTTYPES_WCHAR_T instead of
_GLIBCXX_USE_C99_INTTYPES_TR1 and
_GLIBCXX_USE_C99_INTTYPES_WCHAR_T_TR1.
* include/c_global/cstdint: Check _GLIBCXX_USE_C99_STDINT
instead of _GLIBCXX_USE_C99_STDINT_TR1.
* include/std/atomic: Likewise.
* src/c++11/cow-stdexcept.cc: Likewise.
* testsuite/29_atomics/headers/stdatomic.h/c_compat.cc:
Likewise.
* testsuite/lib/libstdc++.exp (check_v3_target_cstdint):
Likewise.
---
 libstdc++-v3/acinclude.m4 | 142 +
 libstdc++-v3/config.h.in  |  12 ++
 libstdc++-v3/configure| 196 ++
 libstdc++-v3/doc/doxygen/user.cfg.in  |   3 +
 libstdc++-v3/include/bits/chrono.h|   2 +-
 .../include/c_compatibility/inttypes.h|   6 +-
 .../include/c_compatibility/stdatomic.h   |   4 +-
 libstdc++-v3/include/c_compatibility/stdint.h |   4 +-
 libstdc++-v3/include/c_global/cinttypes   |   6 +-
 libstdc++-v3/include/c_global/cstdint |   6 +-
 libstdc++-v3/include/std/atomic   |   2 +-
 libstdc++-v3/src/c++11/cow-stdexcept.cc   |   4 +-
 .../headers/stdatomic.h/c_compat.cc   |   2 +-
 libstdc++-v3/testsuite/lib/libstdc++.exp  |   2 +-
 14 files changed, 372 insertions(+), 19 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 84b12adbc24..0c01b526ebf 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1103,6 +1103,148 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
   ])
 fi
 
+# Check for the existence of  types.
+AC_CACHE_CHECK([for ISO C99 support in  for C++11],
+glibcxx_cv_c99_stdint, [
+AC_TRY_COMPILE([#define __STDC_LIMIT_MACROS
+   #define __STDC_CONSTANT_MACROS
+   #include ],
+  [typedef int8_t  my_int8_t;
+   my_int8_t   i8 = INT8_MIN;
+   i8 = INT8_MAX;
+   typedef int16_t my_int16_t;
+   my_int16_t  i16 = INT16_MIN;
+   i16 = INT16_MAX;
+   typedef int32_t my_int32_t;
+   my_int32_t  i32 = INT32_MIN;
+   i32 = INT32_MAX;
+   typedef int64_t my_int64_t;
+   my_int64_t  i64 = INT64_MIN;
+   i64 = INT64_MAX;
+   typedef int_fast8_t my_int_fast8_t;
+   my_int_fast8_t  if8 = INT_FAST8_MIN;
+   if8 = INT_FAST8_MAX;
+   typedef int_fast16_tmy_int_fast16_t;
+   my_int_fast16_t if16 = INT_FAST16_MIN;
+   if16 = INT_FAST16_MAX;
+   typedef int_fast32_tmy_int_fast32_t;
+   my_int_fast32_t  

[committed 3/3] libstdc++: Stop using TR1 macros in and

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

As with the two commits before this, the _GLIBCXX_USE_C99_CTYPE_TR1 and
_GLIBCXX_USE_C99_FENV_TR1 macros are misleading when they are also used
for  and , not only for TR1 headers. It is also wrong,
because the configure checks for TR1 use -std=c++98 and a target might
define the C99 features for C++11 but not for C++98.

Add separate configure checks for the  and  features using -std=c++11
for the checks. Use the new macros defined by those checks in the
C++11-specific parts of , , and .

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_USE_C99): Check for isblank in C++11
mode and define _GLIBCXX_USE_C99_CTYPE. Check for 
functions in C++11 mode and define _GLIBCXX_USE_C99_FENV.
* config.h.in: Regenerate.
* configure: Regenerate.
* include/c_compatibility/fenv.h: Check _GLIBCXX_USE_C99_FENV
instead of _GLIBCXX_USE_C99_FENV_TR1.
* include/c_global/cfenv: Likewise.
* include/c_global/cctype: Check _GLIBCXX_USE_C99_CTYPE instead
of _GLIBCXX_USE_C99_CTYPE_TR1.
---
 libstdc++-v3/acinclude.m4   | 46 ++
 libstdc++-v3/config.h.in|  8 ++
 libstdc++-v3/configure  | 97 +
 libstdc++-v3/include/c_compatibility/fenv.h |  4 +-
 libstdc++-v3/include/c_global/cctype|  4 +-
 libstdc++-v3/include/c_global/cfenv |  4 +-
 6 files changed, 157 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 0c01b526ebf..988c532c4e2 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1476,6 +1476,52 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
   fi
 fi
 
+# Check for the existence of  functions.
+AC_CACHE_CHECK([for ISO C99 support for C++11 in ],
+glibcxx_cv_c99_ctype, [
+AC_TRY_COMPILE([#include ],
+  [int ch;
+   int ret;
+   ret = isblank(ch);
+  ],[glibcxx_cv_c99_ctype=yes],
+[glibcxx_cv_c99_ctype=no])
+])
+if test x"$glibcxx_cv_c99_ctype" = x"yes"; then
+  AC_DEFINE(_GLIBCXX_USE_C99_CTYPE, 1,
+   [Define if C99 functions in  should be imported in
+in namespace std for C++11.])
+fi
+
+# Check for the existence of  functions.
+AC_CHECK_HEADERS(fenv.h, ac_has_fenv_h=yes, ac_has_fenv_h=no)
+ac_c99_fenv=no;
+if test x"$ac_has_fenv_h" = x"yes"; then
+  AC_MSG_CHECKING([for ISO C99 support for C++11 in ])
+  AC_TRY_COMPILE([#include ],
+[int except, mode;
+ fexcept_t* pflag;
+ fenv_t* penv;
+ int ret;
+ ret = feclearexcept(except);
+ ret = fegetexceptflag(pflag, except);
+ ret = feraiseexcept(except);
+ ret = fesetexceptflag(pflag, except);
+ ret = fetestexcept(except);
+ ret = fegetround();
+ ret = fesetround(mode);
+ ret = fegetenv(penv);
+ ret = feholdexcept(penv);
+ ret = fesetenv(penv);
+ ret = feupdateenv(penv);
+],[ac_c99_fenv=yes], [ac_c99_fenv=no])
+  AC_MSG_RESULT($ac_c99_fenv)
+fi
+if test x"$ac_c99_fenv" = x"yes"; then
+  AC_DEFINE(_GLIBCXX_USE_C99_FENV, 1,
+   [Define if C99 functions in  should be imported in
+in namespace std for C++11.])
+fi
+
 gcc_no_link="$ac_save_gcc_no_link"
 LIBS="$ac_save_LIBS"
 CXXFLAGS="$ac_save_CXXFLAGS"
diff --git a/libstdc++-v3/include/c_compatibility/fenv.h b/libstdc++-v3/include/c_compatibility/fenv.h
index 70ce3f834f4..83e930f12d1 100644
--- a/libstdc++-v3/include/c_compatibility/fenv.h
+++ b/libstdc++-v3/include/c_compatibility/fenv.h
@@ -38,7 +38,7 @@
 
 #if __cplusplus >= 201103L
 
-#if _GLIBCXX_USE_C99_FENV_TR1
+#if _GLIBCXX_USE_C99_FENV
 
 #undef feclearexcept
 #undef fegetexceptflag
@@ -74,7 +74,7 @@ namespace std
   using ::feupdateenv;
 } // namespace
 
-#endif // _GLIBCXX_USE_C99_FENV_TR1
+#endif // _GLIBCXX_USE_C99_FENV
 
 #endif // C++11
 
diff --git a/libstdc++-v3/include/c_global/cctype b/libstdc++-v3/include/c_global/cctype
index bd667fba15d..e6ff1204df6 100644
--- a/libstdc++-v3/include/c_global/cctype
+++ b/libstdc++-v3/include/c_global/cctype
@@ -78,7 +78,7 @@ namespace std
 
 #if __cplusplus >= 201103L
 
-#ifdef _GLIBCXX_USE_C99_CTYPE_TR1
+#ifdef _GLIBCXX_USE_C99_CTYPE
 
 #undef isblank
 
@@ -87,7 +87,7 @@ namespace std
   using ::isblank;
 } // namespace std
 
-#endif // _GLIBCXX_USE_C99_CTYPE_TR1
+#endif // _GLIBCXX_USE_C99_CTYPE
 
 #endif // C++11
 
diff --git a/libstdc++-v3/include/c_global/cfenv b/libstdc++-v3/include/c_global/cfenv
index 6704dc5423e..3a1d9c4a6aa 100644
--- a/libstdc++-v3/include/c_global/cfenv
+++ 

[committed 1/3] libstdc++: Stop using _GLIBCXX_USE_C99_COMPLEX_TR1 in

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

The _GLIBCXX_USE_C99_COMPLEX_TR1 macro (and the comments about it in
acinclude.m4 and config.h) are misleading when it is also used for
, not only . It is also wrong, because the
configure checks for TR1 use -std=c++98 and a target might define cacos
etc. for C++11 but not for C++98.

Add a separate configure check for the inverse trigonometric functions
that are covered by _GLIBCXX_USE_C99_COMPLEX_TR1, but using -std=c++11
for the checks. Use the result of that separate check in .

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_USE_C99): Check for complex inverse trig
functions in C++11 mode and define _GLIBCXX_USE_C99_COMPLEX_ARC.
* config.h.in: Regenerate.
* configure: Regenerate.
* doc/doxygen/user.cfg.in (PREDEFINED): Add new macro.
* include/std/complex: Check _GLIBCXX_USE_C99_COMPLEX_ARC
instead of _GLIBCXX_USE_C99_COMPLEX_TR1.
---
 libstdc++-v3/acinclude.m4| 37 +++
 libstdc++-v3/config.h.in |  5 +++
 libstdc++-v3/configure   | 53 
 libstdc++-v3/doc/doxygen/user.cfg.in |  1 +
 libstdc++-v3/include/std/complex | 14 
 5 files changed, 103 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 0ce3b8b5b31..84b12adbc24 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1200,6 +1200,43 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
 requires corresponding C99 library functions to be present.])
 fi
 
+# Check for the existence of  complex inverse trigonometric
+# math functions used by  for C++11 and later.
+ac_c99_complex_arc=no;
+if test x"$ac_has_complex_h" = x"yes"; then
+  AC_MSG_CHECKING([for ISO C99 support for inverse trig functions in 
])
+  AC_TRY_COMPILE([#include ],
+[typedef __complex__ float float_type; float_type tmpf;
+ cacosf(tmpf);
+ casinf(tmpf);
+ catanf(tmpf);
+ cacoshf(tmpf);
+ casinhf(tmpf);
+ catanhf(tmpf);
+ typedef __complex__ double double_type; double_type tmpd;
+ cacos(tmpd);
+ casin(tmpd);
+ catan(tmpd);
+ cacosh(tmpd);
+ casinh(tmpd);
+ catanh(tmpd);
+ typedef __complex__ long double ld_type; ld_type tmpld;
+ cacosl(tmpld);
+ casinl(tmpld);
+ catanl(tmpld);
+ cacoshl(tmpld);
+ casinhl(tmpld);
+ catanhl(tmpld);
+],[ac_c99_complex_arc=yes], [ac_c99_complex_arc=no])
+fi
+AC_MSG_RESULT($ac_c99_complex_arc)
+if test x"$ac_c99_complex_arc" = x"yes"; then
+  AC_DEFINE(_GLIBCXX_USE_C99_COMPLEX_ARC, 1,
+   [Define if C99 inverse trig functions in  should be
+   used in . Using compiler builtins for these functions
+   requires corresponding C99 library functions to be present.])
+fi
+
 # Check for the existence in  of vscanf, et. al.
 AC_CACHE_CHECK([for ISO C99 support in  for C++11],
   glibcxx_cv_c99_stdio_cxx11, [
diff --git a/libstdc++-v3/doc/doxygen/user.cfg.in b/libstdc++-v3/doc/doxygen/user.cfg.in
index 14981c96f95..210e13400b9 100644
--- a/libstdc++-v3/doc/doxygen/user.cfg.in
+++ b/libstdc++-v3/doc/doxygen/user.cfg.in
@@ -2352,6 +2352,7 @@ PREDEFINED = __cplusplus=202002L \
  _GLIBCXX_USE_NOEXCEPT=noexcept \
  _GLIBCXX_USE_WCHAR_T \
  _GLIBCXX_USE_LONG_LONG \
+_GLIBCXX_USE_C99_COMPLEX_ARC \
  _GLIBCXX_USE_C99_STDINT_TR1 \
  _GLIBCXX_USE_SCHED_YIELD \
  _GLIBCXX_USE_NANOSLEEP \
diff --git a/libstdc++-v3/include/std/complex b/libstdc++-v3/include/std/complex
index 0f5f14c3ddb..40fc062e53d 100644
--- a/libstdc++-v3/include/std/complex
+++ b/libstdc++-v3/include/std/complex
@@ -2021,7 +2021,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return std::complex<_Tp>(__pi_2 - __t.real(), -__t.imag());
 }
 
-#if _GLIBCXX_USE_C99_COMPLEX_TR1
+#if _GLIBCXX_USE_C99_COMPLEX_ARC
 #if defined(__STDCPP_FLOAT16_T__) && defined(_GLIBCXX_FLOAT_IS_IEEE_BINARY32)
   inline __complex__ _Float16
   __complex_acos(__complex__ _Float16 __z)
@@ -2177,7 +2177,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 #endif
 
-#if _GLIBCXX_USE_C99_COMPLEX_TR1
+#if _GLIBCXX_USE_C99_COMPLEX_ARC
   inline __complex__ float
   __complex_acos(__complex__ float __z)
   { return __builtin_cacosf(__z); }
@@ -2213,7 +2213,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return std::complex<_Tp>(__t.imag(), -__t.real());
 }
 
-#if 

Re: [PATCH v3] Machine_Mode: Extend machine_mode from 8 to 16 bits

2023-05-16 Thread Richard Sandiford via Gcc-patches
pan2...@intel.com writes:
> diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
> index c5180b9308a..38b4d6160c2 100644
> --- a/gcc/rtl-ssa/accesses.h
> +++ b/gcc/rtl-ssa/accesses.h
> @@ -254,7 +254,7 @@ private:
>unsigned int m_spare : 2;
>  
>// The value returned by the accessor above.
> -  machine_mode m_mode : 8;
> +  machine_mode m_mode : MACHINE_MODE_BITSIZE;
>  };
>  
>  // A contiguous array of access_info pointers.  Used to represent a

This structure (access_info) isn't mentioned in the table in the patch
description.  The structure is currently 1 LP64 word and is very
size-sensitive.  I think we should:

- Put the mode after m_regno
- Reduce m_kind to 2 bits
- Remove m_spare

I *think* that will keep the current size, but please check.
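
One way to check the size is a compile-time assertion. The following is only a sketch: sketch_access_info is a hypothetical layout that follows the three suggestions above, not the real rtl_ssa::access_info, and the 16-bit mode width and the 14 flag bits are assumptions made for illustration.

```cpp
// Hypothetical: assume the widened mode field needs 16 bits.
constexpr unsigned SKETCH_MODE_BITSIZE = 16;

// Hypothetical layout, not the real access_info.  All bit-fields use
// the same underlying type so that they pack into one 32-bit unit.
struct sketch_access_info
{
  unsigned int m_regno;                       // 32 bits
  unsigned int m_mode : SKETCH_MODE_BITSIZE;  // stands in for machine_mode,
                                              // placed right after m_regno
  unsigned int m_kind : 2;                    // reduced to 2 bits
  unsigned int m_flags : 14;                  // assumed remaining flag bits;
                                              // no spare field
};

// 32 + 16 + 2 + 14 = 64 bits, i.e. one LP64 word.
static_assert (sizeof (sketch_access_info) == 8,
               "sketch_access_info should stay at 8 bytes");
```

A static_assert of this kind next to the real structure would make any future growth a build failure rather than a silent regression.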

LGTM otherwise.

Thanks,
Richard


[committed] libstdc++: Add assertion to debug_allocator test

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* testsuite/ext/debug_allocator/check_deallocate_null.cc: Add
assertion to ensure expected exception is thrown.
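
The pattern being added can be sketched as follows. This is a minimal illustration only: sketch_check_deallocate_null is a hypothetical stand-in for the helper under test, and plain assert is used in place of the testsuite's VERIFY macro.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the helper under test: it always throws.
void sketch_check_deallocate_null ()
{
  throw std::runtime_error ("null pointer passed to deallocate");
}

// Returns true only if the expected exception was actually thrown.
bool sketch_run_test ()
{
  try
    {
      sketch_check_deallocate_null ();
      // Reached only if nothing was thrown, so fail loudly instead of
      // letting the test pass by falling through.
      assert (false);
    }
  catch (const std::runtime_error&)
    {
      return true;
    }
  return false;
}
```

Without the assertion after the call, a helper that silently returned would leave the try block normally and the test would report success.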
---
 .../testsuite/ext/debug_allocator/check_deallocate_null.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/ext/debug_allocator/check_deallocate_null.cc b/libstdc++-v3/testsuite/ext/debug_allocator/check_deallocate_null.cc
index 1f0a9eb0b61..c5bcafb04e9 100644
--- a/libstdc++-v3/testsuite/ext/debug_allocator/check_deallocate_null.cc
+++ b/libstdc++-v3/testsuite/ext/debug_allocator/check_deallocate_null.cc
@@ -31,7 +31,8 @@ int main()
 
   try
 {
-  __gnu_test::check_deallocate_null(); 
+  __gnu_test::check_deallocate_null();
+  VERIFY(false);
 }
   catch (std::runtime_error& obj)
 {
-- 
2.40.1



[committed] libstdc++: Require tzdb support for chrono::zoned_time printer test

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* testsuite/libstdc++-prettyprinters/chrono.cc: Only test
printer for chrono::zoned_time for cxx11 ABI and tzdb effective
target.
---
 libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc b/libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc
index 01a46169393..b5314e025cc 100644
--- a/libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc
+++ b/libstdc++-v3/testsuite/libstdc++-prettyprinters/chrono.cc
@@ -1,5 +1,6 @@
 // { dg-options "-g -O0 -std=gnu++2a" }
 // { dg-do run { target c++2a } }
+// { dg-additional-options "-DTEST_ZONED_TIME" { target tzdb } }
 
 // Copyright The GNU Toolchain Authors.
 //
@@ -38,7 +39,7 @@ main()
   utc_time utc(467664h);
   // { dg-final { note-test utc {std::chrono::utc_time = { 467664h }} } }
 
-#if _GLIBCXX_USE_CXX11_ABI
+#if _GLIBCXX_USE_CXX11_ABI && defined TEST_ZONED_TIME
   zoned_time zt("Europe/London", half_past_epoch);
   // { dg-final { note-test zt {std::chrono::zoned_time = { "Europe/London" 180ms [1970-01-01 00:30:00] }} { target cxx11_abi } } }
 #endif
-- 
2.40.1



[committed] libstdc++: Do not use pthread_mutex_clocklock with ThreadSanitizer

2023-05-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

As noted in https://github.com/llvm/llvm-project/issues/62623 there are
no tsan interceptors for some of the new POSIX-1:202x APIs added by
https://austingroupbugs.net/view.php?id=1216 so tsan gives false
positive warnings for try_lock_for on timed mutexes.

Disable the uses of the new pthread_mutex_clocklock API when tsan is
active. This changes the semantics of the try_lock_for functions,
because it can change which clock is used for the wait. This means those
functions might be affected by system clock adjustments when tsan is
used, when they would not be affected otherwise.
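
The shape of the change can be sketched like this. It is a hedged illustration only: SKETCH_TSAN is a hypothetical stand-in for _GLIBCXX_TSAN, and detecting tsan through __SANITIZE_THREAD__ is an assumption about how such a macro might be set, not a description of the library's own logic.

```cpp
// Hypothetical stand-in for _GLIBCXX_TSAN; the detection below is an
// assumption made for this sketch.
#if defined(__SANITIZE_THREAD__)
# define SKETCH_TSAN 1
#else
# define SKETCH_TSAN 0
#endif

// Mirrors the shape of the new definition: the macro is still defined
// whenever the API exists, but evaluates to 0 when tsan is active.
#define SKETCH_USE_PTHREAD_MUTEX_CLOCKLOCK (SKETCH_TSAN == 0)

// Reports which code path a timed wait would take in this sketch.
inline const char *
sketch_timed_wait_path ()
{
#if SKETCH_USE_PTHREAD_MUTEX_CLOCKLOCK
  return "clocklock";  // wait on the clock chosen by the caller
#else
  return "fallback";   // wait may be tied to the system clock instead
#endif
}
```

Because the macro expands to an expression rather than a plain 1, code that tests it with #if keeps working, while code that tested it with #ifdef would not notice the difference.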

Reviewed-by: Thomas Rodgers 
Reviewed-by: Mike Crowe 

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_CHECK_PTHREAD_MUTEX_CLOCKLOCK): Define
_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK in terms of _GLIBCXX_TSAN.
* configure: Regenerate.
---
 libstdc++-v3/acinclude.m4 | 2 +-
 libstdc++-v3/configure| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 42a8e7a775e..0ce3b8b5b31 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4314,7 +4314,7 @@ AC_DEFUN([GLIBCXX_CHECK_PTHREAD_MUTEX_CLOCKLOCK], [
   [glibcxx_cv_PTHREAD_MUTEX_CLOCKLOCK=no])
   ])
   if test $glibcxx_cv_PTHREAD_MUTEX_CLOCKLOCK = yes; then
-AC_DEFINE(_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK, 1, [Define if 
pthread_mutex_clocklock is available in .])
+AC_DEFINE(_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK, (_GLIBCXX_TSAN==0), 
[Define if pthread_mutex_clocklock is available in .])
   fi
 
   CXXFLAGS="$ac_save_CXXFLAGS"
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index d4286b67a73..c1faebd54f2 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -21364,7 +21364,7 @@ fi
 $as_echo "$glibcxx_cv_PTHREAD_MUTEX_CLOCKLOCK" >&6; }
   if test $glibcxx_cv_PTHREAD_MUTEX_CLOCKLOCK = yes; then
 
-$as_echo "#define _GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK 1" >>confdefs.h
+$as_echo "#define _GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK (_GLIBCXX_TSAN==0)" 
>>confdefs.h
 
   fi
 
-- 
2.40.1



[patch,avr] PR105753: Fix ICE in add_clobbers.

2023-05-16 Thread Georg-Johann Lay

This patch removes the superfluous parallel in the [u]divmod patterns
of the AVR backend.  The effect of the extra parallel is that add_clobbers
reaches gcc_unreachable() because the clobbers for [u]divmod are
missing.  The parallel around the parts of an insn pattern is
implicit when the pattern has multiple parts such as clobbers, so the
extra parallel should be removed.

Ok to apply?

Johann

--

gcc/
PR target/105753
* config/avr/avr.md (divmodpsi, udivmodpsi, divmodsi, udivmodsi):
Remove superfluous "parallel" in insn pattern.
([u]divmod4): Tidy code.  Use gcc_unreachable() instead of
printing error text to assembly.

gcc/testsuite/
PR target/105753
* gcc.target/avr/torture/pr105753.c: New test.

diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index 43b75046384..a79c6824fad 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -3705,17 +3705,17 @@ (define_insn "*mulohisi3_call"
 ;;CSE has problems to operate on hard regs.
 ;;
 (define_insn_and_split "divmodqi4"
-  [(set (match_operand:QI 0 "pseudo_register_operand" "")
-(div:QI (match_operand:QI 1 "pseudo_register_operand" "")
-(match_operand:QI 2 "pseudo_register_operand" "")))
-   (set (match_operand:QI 3 "pseudo_register_operand" "")
+  [(set (match_operand:QI 0 "pseudo_register_operand")
+(div:QI (match_operand:QI 1 "pseudo_register_operand")
+(match_operand:QI 2 "pseudo_register_operand")))
+   (set (match_operand:QI 3 "pseudo_register_operand")
 (mod:QI (match_dup 1) (match_dup 2)))
(clobber (reg:QI 22))
(clobber (reg:QI 23))
(clobber (reg:QI 24))
(clobber (reg:QI 25))]
   ""
-  "this divmodqi4 pattern should have been splitted;"
+  { gcc_unreachable(); }
   ""
   [(set (reg:QI 24) (match_dup 1))
(set (reg:QI 22) (match_dup 2))
@@ -3751,17 +3751,17 @@ (define_insn "*divmodqi4_call"
   [(set_attr "type" "xcall")])
 
 (define_insn_and_split "udivmodqi4"
- [(set (match_operand:QI 0 "pseudo_register_operand" "")
-   (udiv:QI (match_operand:QI 1 "pseudo_register_operand" "")
-(match_operand:QI 2 "pseudo_register_operand" "")))
-   (set (match_operand:QI 3 "pseudo_register_operand" "")
-(umod:QI (match_dup 1) (match_dup 2)))
-   (clobber (reg:QI 22))
-   (clobber (reg:QI 23))
-   (clobber (reg:QI 24))
-   (clobber (reg:QI 25))]
-  ""
-  "this udivmodqi4 pattern should have been splitted;"
+ [(set (match_operand:QI 0 "pseudo_register_operand")
+   (udiv:QI (match_operand:QI 1 "pseudo_register_operand")
+(match_operand:QI 2 "pseudo_register_operand")))
+  (set (match_operand:QI 3 "pseudo_register_operand")
+   (umod:QI (match_dup 1) (match_dup 2)))
+  (clobber (reg:QI 22))
+  (clobber (reg:QI 23))
+  (clobber (reg:QI 24))
+  (clobber (reg:QI 25))]
+  ""
+  { gcc_unreachable(); }
   ""
   [(set (reg:QI 24) (match_dup 1))
(set (reg:QI 22) (match_dup 2))
@@ -3793,17 +3793,17 @@ (define_insn "*udivmodqi4_call"
   [(set_attr "type" "xcall")])
 
 (define_insn_and_split "divmodhi4"
-  [(set (match_operand:HI 0 "pseudo_register_operand" "")
-(div:HI (match_operand:HI 1 "pseudo_register_operand" "")
-(match_operand:HI 2 "pseudo_register_operand" "")))
-   (set (match_operand:HI 3 "pseudo_register_operand" "")
+  [(set (match_operand:HI 0 "pseudo_register_operand")
+(div:HI (match_operand:HI 1 "pseudo_register_operand")
+(match_operand:HI 2 "pseudo_register_operand")))
+   (set (match_operand:HI 3 "pseudo_register_operand")
 (mod:HI (match_dup 1) (match_dup 2)))
(clobber (reg:QI 21))
(clobber (reg:HI 22))
(clobber (reg:HI 24))
(clobber (reg:HI 26))]
   ""
-  "this should have been splitted;"
+  { gcc_unreachable(); }
   ""
   [(set (reg:HI 24) (match_dup 1))
(set (reg:HI 22) (match_dup 2))
@@ -3839,17 +3839,17 @@ (define_insn "*divmodhi4_call"
   [(set_attr "type" "xcall")])
 
 (define_insn_and_split "udivmodhi4"
-  [(set (match_operand:HI 0 "pseudo_register_operand" "")
-(udiv:HI (match_operand:HI 1 "pseudo_register_operand" "")
- (match_operand:HI 2 "pseudo_register_operand" "")))
-   (set (match_operand:HI 3 "pseudo_register_operand" "")
+  [(set (match_operand:HI 0 "pseudo_register_operand")
+(udiv:HI (match_operand:HI 1 "pseudo_register_operand")
+ (match_operand:HI 2 "pseudo_register_operand")))
+   (set (match_operand:HI 3 "pseudo_register_operand")
 (umod:HI (match_dup 1) (match_dup 2)))
(clobber (reg:QI 21))
(clobber (reg:HI 22))
(clobber (reg:HI 24))
(clobber (reg:HI 26))]
   ""
-  "this udivmodhi4 pattern should have been splitted.;"
+  { gcc_unreachable(); }
   ""
   [(set (reg:HI 24) (match_dup 1))
(set (reg:HI 22) (match_dup 2))
@@ -4090,14 +4090,14 @@ (define_insn "*mulpsi3.libgcc"
 ;; implementation works the other way round.
 
 (define_insn_and_split "divmodpsi4"
-  [(parallel [(set 

[COMMITTED] ada: Fix crash on iterated component in expression function

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The problem is that the freeze node generated for the type of a static
subexpression present in the expression function is incorrectly placed
inside instead of outside the function.

gcc/ada/

* freeze.adb (Freeze_Expression): When the freezing is to be done
outside the current scope, skip any scope that is an internal loop.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/freeze.adb | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
index 86622003b97..f54ae0503a1 100644
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -8712,17 +8712,19 @@ package body Freeze is
 
 --  The current scope may be that of a constrained component of
 --  an enclosing record declaration, or of a loop of an enclosing
---  quantified expression, which is above the current scope in the
---  scope stack. Indeed in the context of a quantified expression,
---  a scope is created and pushed above the current scope in order
---  to emulate the loop-like behavior of the quantified expression.
+--  quantified expression or aggregate with an iterated component
+--  in Ada 2022, which is above the current scope in the scope
+--  stack. Indeed in the context of a quantified expression or
+--  an aggregate with an iterated component, an internal scope is
+--  created and pushed above the current scope in order to emulate
+--  the loop-like behavior of the construct.
 --  If the expression is within a top-level pragma, as for a pre-
 --  condition on a library-level subprogram, nothing to do.
 
 if not Is_Compilation_Unit (Current_Scope)
   and then (Is_Record_Type (Scope (Current_Scope))
- or else Nkind (Parent (Current_Scope)) =
- N_Quantified_Expression)
+ or else (Ekind (Current_Scope) = E_Loop
+   and then Is_Internal (Current_Scope)))
 then
Pos := Pos - 1;
 end if;
-- 
2.40.0



[COMMITTED] ada: Fix internal error on 'Image applied to array component

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This happens because the array component depends on a discriminant.

gcc/ada/

* exp_imgv.adb (Rewrite_Object_Image): If the prefix is a component
that depends on a discriminant, create an actual subtype for it.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_imgv.adb | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/exp_imgv.adb b/gcc/ada/exp_imgv.adb
index 93fdb70306f..257f65badd0 100644
--- a/gcc/ada/exp_imgv.adb
+++ b/gcc/ada/exp_imgv.adb
@@ -2498,12 +2498,31 @@ package body Exp_Imgv is
   Attr_Name : Name_Id;
   Str_Typ   : Entity_Id)
is
+  Ptyp : Entity_Id;
+
begin
+  Ptyp := Etype (Pref);
+
+  --  If the prefix is a component that depends on a discriminant, then
+  --  create an actual subtype for it.
+
+  if Nkind (Pref) = N_Selected_Component then
+ declare
+Decl : constant Node_Id :=
+ Build_Actual_Subtype_Of_Component (Ptyp, Pref);
+ begin
+if Present (Decl) then
+   Insert_Action (N, Decl);
+   Ptyp := Defining_Identifier (Decl);
+end if;
+ end;
+  end if;
+
   Rewrite (N,
 Make_Attribute_Reference (Sloc (N),
-  Prefix => New_Occurrence_Of (Etype (Pref), Sloc (N)),
+  Prefix => New_Occurrence_Of (Ptyp, Sloc (N)),
   Attribute_Name => Attr_Name,
-  Expressions=> New_List (Relocate_Node (Pref))));
+  Expressions=> New_List (Unchecked_Convert_To (Ptyp, Pref))));
 
   Analyze_And_Resolve (N, Str_Typ);
end Rewrite_Object_Image;
-- 
2.40.0



[COMMITTED] ada: Update proof of runtime units

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Yannick Moy 

Following changes in GNATprove, proofs need to be amended.

gcc/ada/

* libgnat/s-aridou.adb (Lemma_Div_Pow2): Add assertion.
* libgnat/s-arit32.adb (Lemma_Abs_Div_Commutation): Simplify.
* libgnat/s-expmod.adb (Lemma_Exp_Mod): Add assertions.
(Lemma_Euclidean_Mod): Add body to lemma.
(Lemma_Mult_Mod): Add assertion.
* libgnat/s-valueu.adb (Scan_Raw_Unsigned): Modify assertion.
* libgnat/s-vauspe.ads (Raw_Unsigned_Last_Ghost): Add
postcondition.
* libgnat/s-widthi.adb: Use more precise types.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/s-aridou.adb |  2 +-
 gcc/ada/libgnat/s-arit32.adb | 33 +
 gcc/ada/libgnat/s-expmod.adb | 20 ++--
 gcc/ada/libgnat/s-valueu.adb | 12 ++--
 gcc/ada/libgnat/s-vauspe.ads |  3 ++-
 gcc/ada/libgnat/s-widthi.adb |  6 +++---
 6 files changed, 31 insertions(+), 45 deletions(-)

diff --git a/gcc/ada/libgnat/s-aridou.adb b/gcc/ada/libgnat/s-aridou.adb
index dbf0f42cd49..041478538a7 100644
--- a/gcc/ada/libgnat/s-aridou.adb
+++ b/gcc/ada/libgnat/s-aridou.adb
@@ -1543,7 +1543,7 @@ is
  Div2 : constant Double_Uns := Double_Uns'(2);
  Left : constant Double_Uns := X / Div1 / Div2;
  R2   : constant Double_Uns := X / Div1 - Left * Div2;
- pragma Assert (R2 < Div2);
+ pragma Assert (R2 <= Div2 - 1);
  R1   : constant Double_Uns := X - X / Div1 * Div1;
  pragma Assert (R1 < Div1);
   begin
diff --git a/gcc/ada/libgnat/s-arit32.adb b/gcc/ada/libgnat/s-arit32.adb
index bd316c1bc20..219523b00f2 100644
--- a/gcc/ada/libgnat/s-arit32.adb
+++ b/gcc/ada/libgnat/s-arit32.adb
@@ -195,12 +195,6 @@ is
or else (X >= Big_0 and then Y <= Big_0),
  Post => X * Y <= Big_0;
 
-   procedure Lemma_Neg_Div (X, Y : Big_Integer)
-   with
- Ghost,
- Pre  => Y /= 0,
- Post => X / Y = (-X) / (-Y);
-
procedure Lemma_Neg_Rem (X, Y : Big_Integer)
with
  Ghost,
@@ -223,6 +217,7 @@ is
-
 
procedure Lemma_Abs_Commutation (X : Int32) is null;
+   procedure Lemma_Abs_Div_Commutation (X, Y : Big_Integer) is null;
procedure Lemma_Abs_Mult_Commutation (X, Y : Big_Integer) is null;
procedure Lemma_Div_Commutation (X, Y : Uns64) is null;
procedure Lemma_Div_Ge (X, Y, Z : Big_Integer) is null;
@@ -234,22 +229,6 @@ is
procedure Lemma_Not_In_Range_Big2xx32 is null;
procedure Lemma_Rem_Commutation (X, Y : Uns64) is null;
 
-   ---
-   -- Lemma_Abs_Div_Commutation --
-   ---
-
-   procedure Lemma_Abs_Div_Commutation (X, Y : Big_Integer) is
-   begin
-  if Y < 0 then
- if X < 0 then
-pragma Assert (abs (X / Y) = abs (X / (-Y)));
- else
-Lemma_Neg_Div (X, Y);
-pragma Assert (abs (X / Y) = abs ((-X) / (-Y)));
- end if;
-  end if;
-   end Lemma_Abs_Div_Commutation;
-
---
-- Lemma_Abs_Rem_Commutation --
---
@@ -277,16 +256,6 @@ is
   pragma Assert (Uns64 (Xlo) = Xu mod 2 ** 32);
end Lemma_Hi_Lo;
 
-   ---
-   -- Lemma_Neg_Div --
-   ---
-
-   procedure Lemma_Neg_Div (X, Y : Big_Integer) is
-   begin
-  pragma Assert ((-X) / (-Y) = -(X / (-Y)));
-  pragma Assert (X / (-Y) = -(X / Y));
-   end Lemma_Neg_Div;
-
-
-- Raise_Error --
-
diff --git a/gcc/ada/libgnat/s-expmod.adb b/gcc/ada/libgnat/s-expmod.adb
index 0682589d352..aa6e9b4c361 100644
--- a/gcc/ada/libgnat/s-expmod.adb
+++ b/gcc/ada/libgnat/s-expmod.adb
@@ -109,9 +109,21 @@ is
 
   procedure Lemma_Euclidean_Mod (Q, F, R : Big_Natural) with
 Pre  => F /= 0,
-Post => (Q * F + R) mod F = R mod F;
+Post => (Q * F + R) mod F = R mod F,
+Subprogram_Variant => (Decreases => Q);
 
-  procedure Lemma_Euclidean_Mod (Q, F, R : Big_Natural) is null;
+  -
+  -- Lemma_Euclidean_Mod --
+  -
+
+  procedure Lemma_Euclidean_Mod (Q, F, R : Big_Natural) is
+  begin
+ if Q > 0 then
+Lemma_Euclidean_Mod (Q - 1, F, R);
+ end if;
+  end Lemma_Euclidean_Mod;
+
+  --  Local variables
 
   Left  : constant Big_Natural := (X + Y) mod B;
   Right : constant Big_Natural := ((X mod B) + (Y mod B)) mod B;
@@ -164,6 +176,9 @@ is
 Lemma_Mod_Mod (A, B);
 Lemma_Exp_Mod (A, Exp - 1, B);
 Lemma_Mult_Mod (A, A ** (Exp - 1), B);
+pragma Assert
+  ((A mod B) * (A mod B) ** (Exp - 1) = (A mod B) ** Exp);
+pragma Assert (A * A ** (Exp - 1) = A ** Exp);
 pragma Assert (Left = Right);
  end;
   end if;
@@ -190,6 +205,7 @@ is
 pragma Assert (Left = Right);
  

[COMMITTED] ada: Add "gnat --help-ada" text for new switches.

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Steve Baird 

The output generated by "gnat --help-ada" should include descriptions for
the newly added -gnatw_s and -gnatw_S switches.

gcc/ada/

* usage.adb: Generate output text describing the -gnatw_s switch
(and the corresponding -gnatw_S switch).

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/usage.adb | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/ada/usage.adb b/gcc/ada/usage.adb
index 97cedbb9a2d..9e2aa019573 100644
--- a/gcc/ada/usage.adb
+++ b/gcc/ada/usage.adb
@@ -580,6 +580,10 @@ begin
   Write_Line ("s    suppress all info/warnings");
Write_Line (".s   turn on warnings for overridden size clause");
Write_Line (".S*  turn off warnings for overridden size clause");
+   Write_Line ("_s+  turn on warnings for ineffective predicate " &
+  "tests");
+   Write_Line ("_S*  turn off warnings for ineffective predicate " &
+   "tests");
   Write_Line ("t    turn on warnings for tracking deleted code");
Write_Line ("T*   turn off warnings for tracking deleted code");
Write_Line (".t*+ turn on warnings for suspicious contract");
-- 
2.40.0



[COMMITTED] ada: Follow-up improvement to implementation of storage models

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This avoids recreating an actual subtype for an explicit dereference.

gcc/ada/

* sem_util.adb (Get_Actual_Subtype): For an explicit dereference,
return the Actual_Designated_Subtype if it is present.
(Get_Actual_Subtype_If_Available): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_util.adb | 16 
 1 file changed, 16 insertions(+)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 8bce0229867..ad74de6b6f6 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -10017,6 +10017,14 @@ package body Sem_Util is
   then
  return Actual_Subtype (Entity (N));
 
+  --  Similarly, if we have an explicit dereference, then we get the
+  --  actual subtype from the node itself if one has been built.
+
+  elsif Nkind (N) = N_Explicit_Dereference
+and then Present (Actual_Designated_Subtype (N))
+  then
+ return Actual_Designated_Subtype (N);
+
   --  Actual subtype of unchecked union is always itself. We never need
   --  the "real" actual subtype. If we did, we couldn't get it anyway
   --  because the discriminant is not available. The restrictions on
@@ -10130,6 +10138,14 @@ package body Sem_Util is
   then
  return Actual_Subtype (Entity (N));
 
+  --  Similarly, if we have an explicit dereference, then we get the
+  --  actual subtype from the node itself if one has been built.
+
+  elsif Nkind (N) = N_Explicit_Dereference
+and then Present (Actual_Designated_Subtype (N))
+  then
+ return Actual_Designated_Subtype (N);
+
   --  Otherwise the Etype of N is returned unchanged
 
   else
-- 
2.40.0



Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]

2023-05-16 Thread Richard Sandiford via Gcc-patches
Tejas Belagod  writes:
>> +  {
>> +int i;
>> +int nelts = vector_cst_encoded_nelts (v);
>> +int first_el = 0;
>> +
>> +for (i = first_el; i < nelts; i += step)
>> +  if (VECTOR_CST_ENCODED_ELT (v, i) != VECTOR_CST_ENCODED_ELT (v,
> first_el))
>
> I think this should use !operand_equal_p (..., ..., 0).
>
>
> Oops! I wonder why I thought VECTOR_CST_ENCODED_ELT returned a constant! 
> Thanks
> for spotting that.

It does only return a constant.  But there can be multiple trees with
the same constant value, through things like TREE_OVERFLOW (not sure
where things stand on expunging that from gimple) and the fact that
gimple does not maintain a distinction between different types that
have the same mode and signedness.  (E.g. on ILP32 hosts, gimple does
not maintain a distinction between int and long, even though int 0 and
long 0 are different trees.)
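
To illustrate the point outside of GCC, here is a minimal C++ sketch. `Node`, `same_value`, `same_pointer` and `all_same` are hypothetical names, not GCC's `tree` API; `same_value` only stands in for the kind of decision `operand_equal_p (a, b, 0)` makes. Two distinct nodes can carry the same value, so a pointer comparison under-reports equality:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a constant node; this is not GCC's tree type.
struct Node {
  long value;
};

// Value comparison, standing in for something like operand_equal_p (a, b, 0).
inline bool same_value(const Node *a, const Node *b) {
  return a->value == b->value;
}

// Identity comparison, which is what != on two trees amounts to.
inline bool same_pointer(const Node *a, const Node *b) { return a == b; }

// True if every step-th element compares equal to the first one under eq.
// step is assumed to be positive.
template <typename Eq>
bool all_same(const std::vector<const Node *> &elts, int step, Eq eq) {
  for (std::size_t i = 0; i < elts.size(); i += step)
    if (!eq(elts[i], elts[0]))
      return false;
  return true;
}
```

With two separate nodes that both hold 0, `all_same` is true under `same_value` and false under `same_pointer`.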

> Also, should the flags here be OEP_ONLY_CONST ?

Nah, just 0 should be fine.

>> + return false;
>> +
>> +return true;
>> +  }
>> +
>> +  /* Fold a svlast{a/b} call with constant predicate to a BIT_FIELD_REF.
>> + BIT_FIELD_REF lowers to a NEON element extract, so we have to make sure
>> + the index of the element being accessed is in the range of a NEON
> vector
>> + width.  */
>
> s/NEON/Advanced SIMD/.  Same in later comments
>
>> +  gimple *fold (gimple_folder & f) const override
>> +  {
>> +tree pred = gimple_call_arg (f.call, 0);
>> +tree val = gimple_call_arg (f.call, 1);
>> +
>> +if (TREE_CODE (pred) == VECTOR_CST)
>> +  {
>> + HOST_WIDE_INT pos;
>> + unsigned int const_vg;
>> + int i = 0;
>> + int step = f.type_suffix (0).element_bytes;
>> + int step_1 = gcd (step, VECTOR_CST_NPATTERNS (pred));
>> + int npats = VECTOR_CST_NPATTERNS (pred);
>> + unsigned HOST_WIDE_INT nelts = vector_cst_encoded_nelts (pred);
>> + tree b = NULL_TREE;
>> + bool const_vl = aarch64_sve_vg.is_constant (&const_vg);
>
> I think this might be left over from previous versions, but:
> const_vg isn't used and const_vl is only used once, so I think it
> would be better to remove them.
>
>> +
>> + /* We can optimize 2 cases common to variable and fixed-length cases
>> +without a linear search of the predicate vector:
>> +1.  LASTA if predicate is all true, return element 0.
>> +2.  LASTA if predicate all false, return element 0.  */
>> + if (is_lasta () && vect_all_same (pred, step_1))
>> +   {
>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>> + bitsize_int (step * BITS_PER_UNIT), bitsize_int (0));
>> + return gimple_build_assign (f.lhs, b);
>> +   }
>> +
>> + /* Handle the all-false case for LASTB where SVE VL == 128b -
>> +return the highest numbered element.  */
>> + if (is_lastb () && known_eq (BYTES_PER_SVE_VECTOR, 16)
>> + && vect_all_same (pred, step_1)
>> + && integer_zerop (VECTOR_CST_ENCODED_ELT (pred, 0)))
>
> Formatting nit: one condition per line once one line isn't enough.
>
>> +   {
>> + b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>> + bitsize_int (step * BITS_PER_UNIT),
>> + bitsize_int ((16 - step) * BITS_PER_UNIT));
>> +
>> + return gimple_build_assign (f.lhs, b);
>> +   }
>> +
>> + /* If VECTOR_CST_NELTS_PER_PATTERN (pred) == 2 and every multiple of
>> +'step_1' in
>> +[VECTOR_CST_NPATTERNS .. VECTOR_CST_ENCODED_NELTS - 1]
>> +is zero, then we can treat the vector as VECTOR_CST_NPATTERNS
>> +elements followed by all inactive elements.  */
>> + if (!const_vl && VECTOR_CST_NELTS_PER_PATTERN (pred) == 2)
>
> Following on from the above, maybe use:
>
>   !VECTOR_CST_NELTS (pred).is_constant ()
>
> instead of !const_vl here.
>
> I have a horrible suspicion that I'm contradicting our earlier discussion
> here, sorry, but: I think we have to return null if NELTS_PER_PATTERN != 2.
>
>  
>
> IIUC, the NPATTERNS .. ENCODED_ELTS represent the repeated part of the encoded
> constant. This means the repetition occurs if NELTS_PER_PATTERN == 2, IOW the
> base1 repeats in the encoding. This loop is checking this condition and looks
> for a 1 in the repeated part of the NELTS_PER_PATTERN == 2 in a VL vector.
> Please correct me if I’m misunderstanding here.

NELTS_PER_PATTERN == 1 is also a repeating pattern: it means that the
entire sequence is repeated to fill a vector.  So if an NELTS_PER_PATTERN
== 1 constant has elements {0, 1, 0, 0}, the vector is:

   {0, 1, 0, 0, 0, 1, 0, 0, ...}

and the optimisation can't handle that.  NELTS_PER_PATTERN == 3 isn't
likely to occur for predicates, but in principle it has the same problem.

Thanks,
Richard
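
The encoding discussed in the reply above can be sketched in plain C++. This is only an illustration under stated assumptions: `expand` is a hypothetical helper, elements are plain `int`, and only NELTS_PER_PATTERN values 1 and 2 are modelled (the stepped NELTS_PER_PATTERN == 3 form is left out). Element i of the full vector belongs to pattern i % npatterns, and once a pattern runs out of encoded elements its last one repeats:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expand an encoded constant into its first len elements.
// encoded holds npatterns * nelts_per_pattern values, interleaved by
// pattern.  npatterns is assumed to be nonzero.
std::vector<int> expand(const std::vector<int> &encoded, std::size_t npatterns,
                        std::size_t nelts_per_pattern, std::size_t len) {
  assert(nelts_per_pattern == 1 || nelts_per_pattern == 2);
  assert(encoded.size() == npatterns * nelts_per_pattern);
  std::vector<int> out;
  for (std::size_t i = 0; i < len; ++i) {
    std::size_t pattern = i % npatterns;  // which interleaved pattern
    std::size_t index = i / npatterns;    // position within that pattern
    if (index >= nelts_per_pattern)
      index = nelts_per_pattern - 1;      // the last encoded element repeats
    out.push_back(encoded[index * npatterns + pattern]);
  }
  return out;
}
```

With four patterns and one element per pattern, the encoded {0, 1, 0, 0} expands to {0, 1, 0, 0, 0, 1, 0, 0, ...}, which is the case the optimisation cannot handle. With two patterns and two elements per pattern, {1, 1, 0, 0} expands to two leading ones followed by zeros only.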


[COMMITTED] ada: Fix internal error on chain of predicated record types

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The preanalysis of a predicate set on one of the record types was causing
premature freezing of another record type.

gcc/ada/

* sem_ch13.adb: Add with and use clauses for Expander.
(Resolve_Aspect_Expressions) : Emulate a
bona-fide preanalysis setup before calling
Resolve_Aspect_Expression.

Tested on x86_64-pc-linux-gnu, committed on master.
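
The save/set/restore sequence this patch adds around Resolve_Aspect_Expression is a common pattern. As a rough C++ analogue (not the front-end's code), it could be sketched with a scope guard. The globals and `with_preanalysis_setup` below are hypothetical stand-ins for In_Spec_Expression and Full_Analysis; the expander-mode save/restore is left out:

```cpp
#include <cassert>

// Hypothetical globals mirroring the Ada flags touched by the patch.
bool in_spec_expression = false;
bool full_analysis = true;

// Saves a flag, sets it to a new value, and restores it on scope exit.
class ScopedFlag {
public:
  ScopedFlag(bool &flag, bool value) : flag_(flag), saved_(flag) {
    flag_ = value;
  }
  ~ScopedFlag() { flag_ = saved_; }
  ScopedFlag(const ScopedFlag &) = delete;
  ScopedFlag &operator=(const ScopedFlag &) = delete;

private:
  bool &flag_;
  bool saved_;
};

// Runs resolve under a preanalysis-like setup and returns its result.
template <typename F>
bool with_preanalysis_setup(F resolve) {
  ScopedFlag spec(in_spec_expression, true);
  ScopedFlag full(full_analysis, false);
  return resolve();
}
```

The Ada code restores the flags with explicit assignments in reverse order; the guard objects here do the same when they go out of scope.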

---
 gcc/ada/sem_ch13.adb | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
index 1c757228241..a4a5084793e 100644
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -38,6 +38,7 @@ with Exp_Ch3;  use Exp_Ch3;
 with Exp_Disp; use Exp_Disp;
 with Exp_Tss;  use Exp_Tss;
 with Exp_Util; use Exp_Util;
+with Expander; use Expander;
 with Freeze;   use Freeze;
 with Ghost;use Ghost;
 with Lib;  use Lib;
@@ -15625,15 +15626,29 @@ package body Sem_Ch13 is
  --  Preanalyze expression after type replacement to catch
  --  name resolution errors if the predicate function has
  --  not been built yet.
+
  --  Note that we cannot use Preanalyze_Spec_Expression
- --  because of the special handling required for
- --  quantifiers, see comments on Resolve_Aspect_Expression
- --  above.
+ --  directly because of the special handling required for
+ --  quantifiers (see comments on Resolve_Aspect_Expression
+ --  above) but we need to emulate it properly.
 
  if No (Predicate_Function (E)) then
-Push_Type (E);
-Resolve_Aspect_Expression (Expr);
-Pop_Type (E);
+declare
+   Save_In_Spec_Expression : constant Boolean :=
+   In_Spec_Expression;
+   Save_Full_Analysis : constant Boolean :=
+  Full_Analysis;
+begin
+   In_Spec_Expression := True;
+   Full_Analysis := False;
+   Expander_Mode_Save_And_Set (False);
+   Push_Type (E);
+   Resolve_Aspect_Expression (Expr);
+   Pop_Type (E);
+   Expander_Mode_Restore;
+   Full_Analysis := Save_Full_Analysis;
+   In_Spec_Expression := Save_In_Spec_Expression;
+end;
  end if;
 
   when Pre_Post_Aspects =>
-- 
2.40.0



[COMMITTED] ada: Adjust semantics and implementation of storage models

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

This makes the following adjustments to the semantics and implementation of
storage models in the compiler:

  1. By-copy semantics in subprogram calls: when an object accessed with a
 nonnative storage model is passed as an actual parameter in a call to
 a subprogram, an intermediate copy made on the host is passed instead.

  2. More generally, any additional temporary required on the host by the
 semantics of nonnative storage models is now created by the front-end
 instead of the code generator.

  3. All the temporaries created on the host for nonnative storage models
 are allocated on the secondary stack instead of the primary stack.

As a result, this should simplify the implementation in code generators.

gcc/ada/

* exp_aggr.adb (Build_Assignment_With_Temporary): Adjust comment
and fix type of second parameter. Create the temporary on the
secondary stack by calling Build_Temporary_On_Secondary_Stack.
(Convert_Array_Aggr_In_Allocator): Adjust formatting.
(Expand_Array_Aggregate): Likewise.
* exp_ch4.adb (Expand_N_Allocator): Set Actual_Designated_Subtype
on the dereference in the initialization for all composite types.
* exp_ch5.adb (Expand_N_Assignment_Statement): Create a temporary
on the host for an assignment between nonnative storage models.
Suppress more checks when Suppress_Assignment_Checks is set.
* exp_ch6.adb (Add_Simple_Call_By_Copy_Code): Deal with actuals
that are dereferences with an Actual_Designated_Subtype. Add
support for nonnative storage models.
(Expand_Actuals): Create a copy if the actual is a dereference
with a nonnative storage model.
* exp_util.ads (Build_Temporary_On_Secondary_Stack): Declare.
* exp_util.adb (Build_Temporary_On_Secondary_Stack): New function.
* sem_ch5.adb (Analyze_Assignment.Set_Assignment_Type): Do not
build an actual subtype for dereferences with an
Actual_Designated_Subtype.
* sinfo.ads (Actual_Designated_Subtype): Adjust documentation.
(Suppress_Assignment_Checks): Likewise.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_aggr.adb |  51 +-
 gcc/ada/exp_ch4.adb  |  52 +--
 gcc/ada/exp_ch5.adb  |  58 +++--
 gcc/ada/exp_ch6.adb  | 121 ---
 gcc/ada/exp_util.adb |  49 ++
 gcc/ada/exp_util.ads |  12 +
 gcc/ada/sem_ch5.adb  |   9 ++--
 gcc/ada/sinfo.ads|   4 +-
 8 files changed, 274 insertions(+), 82 deletions(-)

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
index f1cbbfc3155..cf8bac0f4bf 100644
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -62,7 +62,7 @@ with Sem_Eval;   use Sem_Eval;
 with Sem_Mech;   use Sem_Mech;
 with Sem_Res;use Sem_Res;
 with Sem_Util;   use Sem_Util;
-use Sem_Util.Storage_Model_Support;
+ use Sem_Util.Storage_Model_Support;
 with Sinfo;  use Sinfo;
 with Sinfo.Nodes;use Sinfo.Nodes;
 with Sinfo.Utils;use Sinfo.Utils;
@@ -78,12 +78,10 @@ package body Exp_Aggr is
 
function Build_Assignment_With_Temporary
  (Target : Node_Id;
-  Typ: Node_Id;
+  Typ: Entity_Id;
   Source : Node_Id) return List_Id;
--  Returns a list of actions to assign Source to Target of type Typ using
-   --  an extra temporary:
-   --   Tmp := Source;
-   --   Target := Tmp;
+   --  an extra temporary, which can potentially be large.
 
type Case_Bounds is record
  Choice_Lo   : Node_Id;
@@ -2524,33 +2522,33 @@ package body Exp_Aggr is
 
function Build_Assignment_With_Temporary
  (Target : Node_Id;
-  Typ: Node_Id;
+  Typ: Entity_Id;
   Source : Node_Id) return List_Id
is
   Loc : constant Source_Ptr := Sloc (Source);
 
   Aggr_Code : List_Id;
   Tmp   : Entity_Id;
-  Tmp_Decl  : Node_Id;
 
begin
-  Tmp := Make_Temporary (Loc, 'A', Source);
-  Tmp_Decl :=
-Make_Object_Declaration (Loc,
-  Defining_Identifier => Tmp,
-  Object_Definition   => New_Occurrence_Of (Typ, Loc));
-  Set_No_Initialization (Tmp_Decl, True);
+  Aggr_Code := New_List;
+
+  Tmp := Build_Temporary_On_Secondary_Stack (Loc, Typ, Aggr_Code);
 
-  Aggr_Code := New_List (Tmp_Decl);
   Append_To (Aggr_Code,
 Make_OK_Assignment_Statement (Loc,
-  Name   => New_Occurrence_Of (Tmp, Loc),
+  Name   =>
+Make_Explicit_Dereference (Loc,
+  Prefix => New_Occurrence_Of (Tmp, Loc)),
   Expression => Source));
 
   Append_To (Aggr_Code,
 Make_OK_Assignment_Statement (Loc,
   Name   => Target,
-  Expression => New_Occurrence_Of (Tmp, Loc)));
+  Expression =>
+Make_Explicit_Dereference (Loc,
+  Prefix => 

[COMMITTED] ada: Use accumulator type in expansion of 'Reduce attribute

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

The current expansion of the 'Reduce attribute uses the resolution type of
the expression for the accumulator. Now this type can be unresolved or set
to a universal type, for example if it is itself the prefix of the 'Image
attribute, and this may yield a spurious type mismatch error in that case.

This changes the expansion to use the accumulator type instead as defined
by the RM 4.5.10 clause, albeit only in the prefixed case for now.

gcc/ada/

* exp_attr.adb (Expand_N_Attribute_Reference) :
Use the canonical accumulator type as the type of the accumulator
in the prefixed case.

Tested on x86_64-pc-linux-gnu, committed on master.
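
As an analogy only (this is C++, not Ada semantics), the same kind of question arises with `std::accumulate`, where the accumulator takes the type of the initial value rather than the type expected by the context. `reduce_with` is a hypothetical wrapper used to make that visible:

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// The accumulator has the type of init, not the type of the elements
// and not the type the caller assigns the result to.
template <typename Acc>
Acc reduce_with(const std::vector<double> &values, Acc init) {
  return std::accumulate(values.begin(), values.end(), init);
}
```

Reducing {0.5, 0.5, 0.5} with an `int` initial value truncates at every step and yields 0, while a `double` initial value yields 1.5. Picking the accumulator type from the reducer, as the patch does in the prefixed case, avoids depending on the surrounding context.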

---
 gcc/ada/exp_attr.adb | 72 ++--
 1 file changed, 62 insertions(+), 10 deletions(-)

diff --git a/gcc/ada/exp_attr.adb b/gcc/ada/exp_attr.adb
index aababd516d5..7e71422eba3 100644
--- a/gcc/ada/exp_attr.adb
+++ b/gcc/ada/exp_attr.adb
@@ -5978,27 +5978,30 @@ package body Exp_Attr is
   when Attribute_Reduce =>
  declare
 Loc : constant Source_Ptr := Sloc (N);
-E1  : constant Node_Id := First (Expressions (N));
-E2  : constant Node_Id := Next (E1);
-Bnn : constant Entity_Id := Make_Temporary (Loc, 'B', N);
-Typ : constant Entity_Id := Etype (N);
+E1  : constant Node_Id:= First (Expressions (N));
+E2  : constant Node_Id:= Next (E1);
+Bnn : constant Entity_Id  := Make_Temporary (Loc, 'B', N);
 
-New_Loop : Node_Id;
-Stat : Node_Id;
+Accum_Typ : Entity_Id;
+New_Loop  : Node_Id;
 
 function Build_Stat (Comp : Node_Id) return Node_Id;
 --  The reducer can be a function, a procedure whose first
 --  parameter is in-out, or an attribute that is a function,
 --  which (for now) can only be Min/Max. This subprogram
---  builds the corresponding computation for the generated loop.
+--  builds the corresponding computation for the generated loop
+--  and retrieves the accumulator type as per RM 4.5.10(19/5).
 
 
 -- Build_Stat --
 
 
 function Build_Stat (Comp : Node_Id) return Node_Id is
+   Stat : Node_Id;
+
 begin
if Nkind (E1) = N_Attribute_Reference then
+  Accum_Typ := Entity (Prefix (E1));
   Stat := Make_Assignment_Statement (Loc,
 Name => New_Occurrence_Of (Bnn, Loc),
 Expression => Make_Attribute_Reference (Loc,
@@ -6009,12 +6012,14 @@ package body Exp_Attr is
 Comp)));
 
elsif Ekind (Entity (E1)) = E_Procedure then
+  Accum_Typ := Etype (First_Formal (Entity (E1)));
   Stat := Make_Procedure_Call_Statement (Loc,
 Name => New_Occurrence_Of (Entity (E1), Loc),
Parameter_Associations => New_List (
  New_Occurrence_Of (Bnn, Loc),
  Comp));
else
+  Accum_Typ := Etype (Entity (E1));
   Stat := Make_Assignment_Statement (Loc,
 Name => New_Occurrence_Of (Bnn, Loc),
 Expression => Make_Function_Call (Loc,
@@ -6074,6 +6079,13 @@ package body Exp_Attr is
   End_Label => Empty,
   Statements =>
 New_List (Build_Stat (Relocate_Node (Expr;
+
+  --  If the reducer subprogram is a universal operator, then
+  --  we still look at the context to find the type for now.
+
+  if Is_Universal_Numeric_Type (Accum_Typ) then
+ Accum_Typ := Etype (N);
+  end if;
end;
 
 else
@@ -6082,9 +6094,10 @@ package body Exp_Attr is
--  a container with the proper aspects.
 
declare
-  Iter : Node_Id;
   Elem : constant Entity_Id := Make_Temporary (Loc, 'E', N);
 
+  Iter : Node_Id;
+
begin
   Iter :=
 Make_Iterator_Specification (Loc,
@@ -6101,6 +6114,44 @@ package body Exp_Attr is
   End_Label => Empty,
   Statements => New_List (
 Build_Stat (New_Occurrence_Of (Elem, Loc;
+
+  --  If the reducer subprogram is a universal operator, then
+  --  we need to look at the prefix to find the type. This is
+  --  modeled on Analyze_Iterator_Specification in Sem_Ch5.
+
+  if Is_Universal_Numeric_Type (Accum_Typ) then
+ 

[COMMITTED] ada: Spurious error on function returning CPP type

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Javier Miranda 

gcc/ada/

* exp_ch6.adb
(Needs_BIP_Alloc_Form): Return False for functions with foreign
convention since we never use build-in-place for such functions.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch6.adb | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
index af7f75342fa..b8e5a720a7c 100644
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -9435,9 +9435,14 @@ package body Exp_Ch6 is
   --  types, and those can be used to call primitives, so the formal needs
   --  to be passed to all such build-in-place functions, primitive or not.
 
+  --  We never use build-in-place if the function has foreign convention,
+  --  but note that it is OK for a build-in-place function to return a
+  --  type with a foreign convention because the machinery ensures there
+  --  is no copying.
+
   return not Restriction_Active (No_Secondary_Stack)
 and then (Needs_Secondary_Stack (Typ) or else Is_Tagged_Type (Typ))
-and then not Has_Foreign_Convention (Typ);
+and then not Has_Foreign_Convention (Func_Id);
end Needs_BIP_Alloc_Form;
 
-
-- 
2.40.0



[COMMITTED] ada: Build invariant procedure while freezing in GNATprove mode

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

Invariant procedure bodies are created either by expansion of freezing
nodes (but only in ordinary compilation mode) or at the end of package
private declarations (but not for types with private types in the type
derivation chain).

In GNATprove mode we didn't create invariant procedure bodies in
lightweight expansion, so we didn't create them at all when there were
private types in the type derivation chain.

This patch copies the relevant freezing part from ordinary to
lightweight expansion. This obviously involves code duplication,
but it seems better to duplicate whole sections that work properly
instead of small pieces that are incomplete. There are other pieces
of freezing that are similarly duplicated, so this patch doesn't make
the code substantially worse.

gcc/ada/

* exp_spark.adb (SPARK_Freeze_Type): Copy whole handling of DIC
and Type_Invariant from Freeze_Type.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_spark.adb | 54 ---
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/exp_spark.adb b/gcc/ada/exp_spark.adb
index efa5c2cd8da..c344dc1e706 100644
--- a/gcc/ada/exp_spark.adb
+++ b/gcc/ada/exp_spark.adb
@@ -101,7 +101,7 @@ package body Exp_SPARK is
--  expanded body would compare the _parent component, which is
--  intentionally not generated in the GNATprove mode.
--
-   --  We build the DIC procedure body here as well.
+   --  We build the DIC and Type_Invariant procedure bodies here as well.
 
--
-- Expand_SPARK --
@@ -920,15 +920,53 @@ package body Exp_SPARK is
 
   Set_Ghost_Mode (Typ);
 
-  --  When a DIC is inherited by a tagged type, it may need to be
-  --  specialized to the descendant type, hence build a separate DIC
-  --  procedure for it as done during regular expansion for compilation.
+  --  Generate the [spec and] body of the invariant procedure tasked with
+  --  the runtime verification of all invariants that pertain to the type.
+  --  This includes invariants on the partial and full view, inherited
+  --  class-wide invariants from parent types or interfaces, and invariants
+  --  on array elements or record components. But skip internal types.
 
-  if Has_DIC (Typ) and then Is_Tagged_Type (Typ) then
- --  Why is this needed for DIC, but not for other aspects (such as
- --  Type_Invariant)???
+  if Is_Itype (Typ) then
+ null;
+
+  elsif Is_Interface (Typ) then
+
+ --  Interfaces are treated as the partial view of a private type in
+ --  order to achieve uniformity with the general case. As a result, an
+ --  interface receives only a "partial" invariant procedure which is
+ --  never called.
+
+ if Has_Own_Invariants (Typ) then
+Build_Invariant_Procedure_Body
+  (Typ   => Typ,
+   Partial_Invariant => Is_Interface (Typ));
+ end if;
+
+  --  Non-interface types
 
- Build_DIC_Procedure_Body (Typ);
+  --  Do not generate invariant procedure within other assertion
+  --  subprograms, which may involve local declarations of local
+  --  subtypes to which these checks do not apply.
+
+  else
+ if Has_Invariants (Typ) then
+if not Predicate_Check_In_Scope (Typ)
+  or else (Ekind (Current_Scope) = E_Function
+and then Is_Predicate_Function (Current_Scope))
+then
+   null;
+else
+   Build_Invariant_Procedure_Body (Typ);
+end if;
+ end if;
+
+ --  Generate the [spec and] body of the procedure tasked with the
+ --  run-time verification of pragma Default_Initial_Condition's
+ --  expression.
+
+ if Has_DIC (Typ) then
+Build_DIC_Procedure_Body (Typ);
+ end if;
   end if;
 
   if Ekind (Typ) = E_Record_Type
-- 
2.40.0



[COMMITTED] ada: Spurious error analyzing 'old or 'result in class-wide conditions

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Javier Miranda 

gcc/ada/

* sem_attr.adb
(Analyze_Attribute_Old_Result): When preanalyzing a class-wide
condition, search in the scopes stack for the subprogram that has
the condition. This is required because returning the current
scope causes reporting spurious errors when the occurrence of the
attribute is found, for example, in a quantified expression.

Tested on x86_64-pc-linux-gnu, committed on master.
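
The lookup described in the ChangeLog entry can be pictured as a walk up a chain of scopes. The following C++ sketch assumes a hypothetical `Scope` record with a parent link; it does not model GNAT's scope stack or Enclosing_Subprogram in any detail:

```cpp
#include <cassert>

// Hypothetical scope kinds; a quantified expression is modelled as Loop.
enum class ScopeKind { Package, Subprogram, Loop };

struct Scope {
  ScopeKind kind;
  const Scope *parent;  // nullptr at the outermost scope
};

// Returns current itself if it is a subprogram, otherwise the nearest
// enclosing subprogram, or nullptr if there is none.
const Scope *subprogram_for(const Scope *current) {
  for (const Scope *s = current; s != nullptr; s = s->parent)
    if (s->kind == ScopeKind::Subprogram)
      return s;
  return nullptr;
}
```

Starting from the inner scope of a quantified expression, the walk reaches the subprogram that owns the condition instead of stopping at the inner scope.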

---
 gcc/ada/sem_attr.adb | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/gcc/ada/sem_attr.adb b/gcc/ada/sem_attr.adb
index 452aabdd436..a07e91b839d 100644
--- a/gcc/ada/sem_attr.adb
+++ b/gcc/ada/sem_attr.adb
@@ -1366,8 +1366,27 @@ package body Sem_Attr is
  --  yet on its definite context.
 
  if Inside_Class_Condition_Preanalysis then
-Legal   := True;
-Spec_Id := Current_Scope;
+Legal := True;
+
+--  Search for the subprogram that has this class-wide condition;
+--  required to avoid reporting spurious errors since the current
+--  scope may not be appropriate because the attribute may be
+--  referenced from the inner scope of, for example, quantified
+--  expressions.
+
+--  Although the expression is not installed on its definite
+--  context, we know that the subprogram has been placed in the
+--  scope stack by Preanalyze_Condition; we also know that it is
+--  not a generic subprogram since class-wide pre/postconditions
+--  can only be applied for primitive operations of tagged types.
+
+if Is_Subprogram (Current_Scope) then
+   Spec_Id := Current_Scope;
+else
+   Spec_Id := Enclosing_Subprogram (Current_Scope);
+end if;
+
+pragma Assert (Is_Dispatching_Operation (Spec_Id));
 return;
  end if;
 
-- 
2.40.0



[COMMITTED] ada: Apply range checks to preanalyzed aggregate expressions

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Piotr Trojanek 

When preanalyzing expressions in GNATprove mode, e.g. Pre/Post
contracts, we apply checks, because these expressions will never
be expanded. This didn't happen for aggregate expressions, most
likely because of an oversight.

gcc/ada/

* sem_util.adb (Aggregate_Constraint_Checks): Don't exit early
when preanalysing in GNATprove mode. Now the condition is
consistent with other similar conditions in other code.

Tested on x86_64-pc-linux-gnu, committed on master.
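
To see what the one-line change in the diff below does, the two early-exit conditions can be transcribed into C++ and compared. The function and parameter names are hypothetical; the boolean expressions follow the removed and added lines of the patch:

```cpp
#include <cassert>

// Early-exit condition before the patch.
constexpr bool old_exit(bool expander_active, bool inside_a_generic,
                        bool full_analysis, bool gnatprove_mode) {
  return !expander_active &&
         (inside_a_generic || !full_analysis || !gnatprove_mode);
}

// Early-exit condition after the patch.
constexpr bool new_exit(bool expander_active, bool inside_a_generic,
                        bool gnatprove_mode) {
  return !expander_active && !(gnatprove_mode && !inside_a_generic);
}
```

The two differ only when the expander is inactive, GNATprove mode is on, the code is not inside a generic, and Full_Analysis is off: the old condition returned early there, the new one goes on to apply the checks. That is the preanalysis case named in the commit message.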

---
 gcc/ada/sem_util.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index ad74de6b6f6..38dc654f7be 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -477,7 +477,7 @@ package body Sem_Util is
   --  this breaks the name resolution mechanism for generic instances.
 
   if not Expander_Active
-and (Inside_A_Generic or not Full_Analysis or not GNATprove_Mode)
+and not (GNATprove_Mode and not Inside_A_Generic)
   then
  return;
   end if;
-- 
2.40.0



[COMMITTED] ada: Document examples of No_Dependence restriction for code generation

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Eric Botcazou 

gcc/ada/

* doc/gnat_rm/standard_and_implementation_defined_restrictions.rst
(No_Dependence): Give examples of new No_Dependence restrictions.
* gnat_rm.texi: Regenerate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 ...nd_implementation_defined_restrictions.rst | 12 ++-
 gcc/ada/gnat_rm.texi  | 33 ++-
 2 files changed, 43 insertions(+), 2 deletions(-)

diff --git 
a/gcc/ada/doc/gnat_rm/standard_and_implementation_defined_restrictions.rst 
b/gcc/ada/doc/gnat_rm/standard_and_implementation_defined_restrictions.rst
index f8e2a58595f..275b46c3712 100644
--- a/gcc/ada/doc/gnat_rm/standard_and_implementation_defined_restrictions.rst
+++ b/gcc/ada/doc/gnat_rm/standard_and_implementation_defined_restrictions.rst
@@ -186,7 +186,17 @@ No_Dependence
 [RM 13.12.1] This restriction ensures at compile time that there are no
 dependences on a library unit. For GNAT, this includes implicit implementation
 dependences on units of the runtime library that are created by the compiler
-to support specific constructs of the language.
+to support specific constructs of the language. Here are some examples:
+
+* ``System.Arith_64``: 64-bit arithmetics for 32-bit platforms,
+* ``System.Arith_128``: 128-bit arithmetics for 64-bit platforms,
+* ``System.Memory``: heap memory allocation routines,
+* ``System.Memory_Compare``: memory comparison routine (aka ``memcmp`` for C),
+* ``System.Memory_Copy``: memory copy routine (aka ``memcpy`` for C),
+* ``System.Memory_Move``: memory move routine (aka ``memmove`` for C),
+* ``System.Memory_Set``: memory set routine (aka ``memset`` for C),
+* ``System.Stack_Checking[.Operations]``: stack checking without MMU,
+* ``System.GCC``: support routines from the GCC library.
 
 No_Direct_Boolean_Operators
 ---
diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi
index 5e05287d6d8..3818f22414a 100644
--- a/gcc/ada/gnat_rm.texi
+++ b/gcc/ada/gnat_rm.texi
@@ -12727,7 +12727,38 @@ delay statements and no semantic dependences on 
package Calendar.
 [RM 13.12.1] This restriction ensures at compile time that there are no
 dependences on a library unit. For GNAT, this includes implicit implementation
 dependences on units of the runtime library that are created by the compiler
-to support specific constructs of the language.
+to support specific constructs of the language. Here are some examples:
+
+
+@itemize *
+
+@item 
+@code{System.Arith_64}: 64-bit arithmetics for 32-bit platforms,
+
+@item 
+@code{System.Arith_128}: 128-bit arithmetics for 64-bit platforms,
+
+@item 
+@code{System.Memory}: heap memory allocation routines,
+
+@item 
+@code{System.Memory_Compare}: memory comparison routine (aka @code{memcmp} for 
C),
+
+@item 
+@code{System.Memory_Copy}: memory copy routine (aka @code{memcpy} for C),
+
+@item 
+@code{System.Memory_Move}: memory move routine (aka @code{memmove} for C),
+
+@item 
+@code{System.Memory_Set}: memory set routine (aka @code{memset} for C),
+
+@item 
+@code{System.Stack_Checking[.Operations]}: stack checking without MMU,
+
+@item 
+@code{System.GCC}: support routines from the GCC library.
+@end itemize
 
 @node No_Direct_Boolean_Operators,No_Dispatch,No_Dependence,Partition-Wide 
Restrictions
 @anchor{gnat_rm/standard_and_implementation_defined_restrictions 
no-direct-boolean-operators}@anchor{1ca}
-- 
2.40.0



[COMMITTED] ada: usage.adb: document -gnatyD switch

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Ghjuvan Lacambre 

-gnatyD was documented in the user guide but not in `gnat --help-ada`.

gcc/ada/

* usage.adb (Usage): Document -gnatyD.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/usage.adb | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/ada/usage.adb b/gcc/ada/usage.adb
index 4a2fa019013..97cedbb9a2d 100644
--- a/gcc/ada/usage.adb
+++ b/gcc/ada/usage.adb
@@ -655,6 +655,7 @@ begin
Write_Line ("ccheck comment format (two spaces)");
Write_Line ("Ccheck comment format (one space)");
Write_Line ("dcheck no DOS line terminators");
+   Write_Line ("Dcheck declared identifiers in mixed case");
Write_Line ("echeck end/exit labels present");
Write_Line ("fcheck no form feeds/vertical tabs in source");
Write_Line ("gcheck standard GNAT style rules, same as ydISux");
-- 
2.40.0



[COMMITTED] ada: Enable Support_Atomic_Primitives on PPC Linux

2023-05-16 Thread Marc Poulhiès via Gcc-patches
From: Johannes Kliemann 

gcc/ada/

* libgnat/system-linux-ppc.ads: Add Support_Atomic_Primitives.
* libgnat/s-atopri__32.ads: Add 32 bit version of s-atopri.ads.
* Makefile.rtl: Use s-atopri__32.ads for ppc-linux.

Tested on x86_64-pc-linux-gnu, committed on master.
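
For readers unfamiliar with the two lock-free operations described in the comments of the new s-atopri__32.ads, here is a rough C++ equivalent using `std::atomic`. It is a sketch of the documented behaviour for the 32-bit case only, not the Ada implementation; the function names are hypothetical and the default (sequentially consistent) ordering of `compare_exchange_strong` is an assumption:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomically loads the value with acquire ordering.
inline std::uint32_t lock_free_read(const std::atomic<std::uint32_t> &obj) {
  return obj.load(std::memory_order_acquire);
}

// Tries to write desired if obj currently holds expected.  Returns true
// if the value was changed.  On failure, expected is updated to the value
// actually found in obj.
inline bool lock_free_try_write(std::atomic<std::uint32_t> &obj,
                                std::uint32_t &expected,
                                std::uint32_t desired) {
  if (expected == desired)
    return true;  // nothing to do, as the package comment describes
  return obj.compare_exchange_strong(expected, desired);
}
```

A failed attempt leaves the object untouched and hands back the current value in `expected`, so a caller can recompute and retry.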

---
 gcc/ada/Makefile.rtl |   1 +
 gcc/ada/libgnat/s-atopri__32.ads | 149 +++
 gcc/ada/libgnat/system-linux-ppc.ads |   1 +
 3 files changed, 151 insertions(+)
 create mode 100644 gcc/ada/libgnat/s-atopri__32.ads

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
index 96306f8cc9a..2cfdd8dc613 100644
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -2185,6 +2185,7 @@ ifeq ($(strip $(filter-out powerpc% linux%,$(target_cpu) 
$(target_os))),)
   EXTRA_GNATRTL_NONTASKING_OBJS += $(GNATRTL_128BIT_OBJS)
 endif
   else
+LIBGNAT_TARGET_PAIRS += s-atopri.adshttp://www.gnu.org/licenses/>.  --
+--  --
+-- GNAT was originally developed  by the GNAT team at  New York University. --
+-- Extensive contributions were provided by Ada Core Technologies Inc.  --
+--  --
+--
+
+--  This package contains both atomic primitives defined from GCC built-in
+--  functions and operations used by the compiler to generate the lock-free
+--  implementation of protected objects.
+--  This is the version that only contains primitives available on 32 bit
+--  platforms.
+
+with Interfaces.C;
+
+package System.Atomic_Primitives is
+   pragma Pure;
+
+   type uint is mod 2 ** Long_Integer'Size;
+
+   type uint8  is mod 2**8
+ with Size => 8;
+
+   type uint16 is mod 2**16
+ with Size => 16;
+
+   type uint32 is mod 2**32
+ with Size => 32;
+
+   Relaxed : constant := 0;
+   Consume : constant := 1;
+   Acquire : constant := 2;
+   Release : constant := 3;
+   Acq_Rel : constant := 4;
+   Seq_Cst : constant := 5;
+   Last: constant := 6;
+
+   subtype Mem_Model is Integer range Relaxed .. Last;
+
+   
+   -- GCC built-in atomic primitives --
+   
+
+   generic
+  type Atomic_Type is mod <>;
+   function Atomic_Load
+ (Ptr   : Address;
+  Model : Mem_Model := Seq_Cst) return Atomic_Type;
+   pragma Import (Intrinsic, Atomic_Load, "__atomic_load_n");
+
+   function Atomic_Load_8  is new Atomic_Load (uint8);
+   function Atomic_Load_16 is new Atomic_Load (uint16);
+   function Atomic_Load_32 is new Atomic_Load (uint32);
+
+   generic
+  type Atomic_Type is mod <>;
+   function Atomic_Compare_Exchange
+ (Ptr   : Address;
+  Expected  : Address;
+  Desired   : Atomic_Type;
+  Weak  : Boolean   := False;
+  Success_Model : Mem_Model := Seq_Cst;
+  Failure_Model : Mem_Model := Seq_Cst) return Boolean;
+   pragma Import
+ (Intrinsic, Atomic_Compare_Exchange, "__atomic_compare_exchange_n");
+
+   function Atomic_Compare_Exchange_8  is new Atomic_Compare_Exchange (uint8);
+   function Atomic_Compare_Exchange_16 is new Atomic_Compare_Exchange (uint16);
+   function Atomic_Compare_Exchange_32 is new Atomic_Compare_Exchange (uint32);
+
+   function Atomic_Test_And_Set
+ (Ptr   : System.Address;
+  Model : Mem_Model := Seq_Cst) return Boolean;
+   pragma Import (Intrinsic, Atomic_Test_And_Set, "__atomic_test_and_set");
+
+   procedure Atomic_Clear
+ (Ptr   : System.Address;
+  Model : Mem_Model := Seq_Cst);
+   pragma Import (Intrinsic, Atomic_Clear, "__atomic_clear");
+
+   function Atomic_Always_Lock_Free
+ (Size : Interfaces.C.size_t;
+  Ptr  : System.Address := System.Null_Address) return Boolean;
+   pragma Import
+ (Intrinsic, Atomic_Always_Lock_Free, "__atomic_always_lock_free");
+
+   --
+   -- Lock-free operations --
+   --
+
+   --  The lock-free implementation uses two atomic instructions for the
+   --  expansion of protected operations:
+
+   --  * Lock_Free_Read atomically loads the value contained in Ptr (with the
+   --Acquire synchronization mode).
+
+   --  * Lock_Free_Try_Write atomically tries to write the Desired value into
+   --Ptr if Ptr contains the Expected value. It returns true if the value
+   --in Ptr was changed, or False if it was not, in which case Expected is
+   --updated to the unexpected value in Ptr. Note that it does nothing and
+   --returns true if Desired and Expected are equal.
+
+   generic
+  type Atomic_Type is mod <>;
+   function Lock_Free_Read (Ptr : Address) return Atomic_Type;
+
+   function Lock_Free_Read_8  is new Lock_Free_Read (uint8);
+   function Lock_Free_Read_16 is new Lock_Free_Read (uint16);
+   function Lock_Free_Read_32 is 
