[PATCH] Make sure SCALAR_INT_MODE_P before invoke try_const_anchors

2023-06-08 Thread Jiufu Guo via Gcc-patches
Hi,

While checking the code, I noticed a "gcc_assert (SCALAR_INT_MODE_P (mode))"
in "try_const_anchors".
The assert seems correct: try_const_anchors currently only deals with
integer values, so modes other than SCALAR_INT_MODE_P do not need to be
supported.

This patch makes sure the mode satisfies SCALAR_INT_MODE_P before calling
try_const_anchors.

The issue came up while drafting the patch below:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603530.html
With that patch, "{[%1:DI]=0;} stack_tie" with BLKmode runs into
try_const_anchors and hits the assert (ICE).
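
A minimal sketch of the guard-before-assert pattern described above, written
as standalone C rather than GCC internals. The toy_* names and the
three-value mode enum are hypothetical stand-ins, not GCC's real definitions;
only the shape of the condition mirrors the proposed change to cse_insn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for machine modes; not GCC's real enum.  */
enum toy_mode { TOY_SImode, TOY_DImode, TOY_BLKmode };

/* Toy analogue of SCALAR_INT_MODE_P: true only for the integer modes.  */
static bool toy_scalar_int_mode_p (enum toy_mode mode)
{
  return mode == TOY_SImode || mode == TOY_DImode;
}

/* Toy analogue of try_const_anchors: it asserts its precondition and
   returns a non-NULL token when it "finds" a related value.  */
static const char *toy_try_const_anchors (long value, enum toy_mode mode)
{
  assert (toy_scalar_int_mode_p (mode));
  return value != 0 ? "anchor" : NULL;
}

/* Caller in the style of the patched condition: test the mode first,
   so the callee's assertion cannot fire for BLKmode.  */
static const char *toy_find_related (long value, enum toy_mode mode)
{
  if (toy_scalar_int_mode_p (mode))
    return toy_try_const_anchors (value, mode);
  return NULL;
}
```

With the guard in place, toy_find_related (0, TOY_BLKmode) simply returns
NULL instead of tripping the assertion inside the callee.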

Bootstrap and regtest pass on ppc64{,le} and x86_64.
Is this ok for trunk?


BR,
Jeff (Jiufu Guo)

gcc/ChangeLog:

* cse.cc (cse_insn): Add SCALAR_INT_MODE_P condition.

---
 gcc/cse.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/cse.cc b/gcc/cse.cc
index 2bb63ac4105..f213fa0faf7 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -5003,7 +5003,8 @@
 if (targetm.const_anchor
   && !src_related
   && src_const
- && GET_CODE (src_const) == CONST_INT)
+ && GET_CODE (src_const) == CONST_INT
+ && SCALAR_INT_MODE_P (mode))
 {
   src_related = try_const_anchors (src_const, mode);
   src_related_is_const_anchor = src_related != NULL_RTX;
-- 
2.39.3



Re: [PATCH v2] Explicitly view_convert_expr mask to signed type when folding pblendvb builtins.

2023-06-08 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 6, 2023 at 4:23 PM liuhongt  wrote:
>
> > I think this is a better patch and will always be correct and still
> > get folded at the gimple level (correctly):
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index d4ff56ee8dd..02bf5ba93a5 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -18561,8 +18561,10 @@ ix86_gimple_fold_builtin (gimple_stmt_iterator 
> > *gsi)
> >   tree itype = GET_MODE_INNER (TYPE_MODE (type)) == E_SFmode
> > ? intSI_type_node : intDI_type_node;
> >   type = get_same_sized_vectype (itype, type);
> > > - arg2 = gimple_build (&stmts, VIEW_CONVERT_EXPR, type, arg2);
> > > }
> > > + else
> > > +   type = signed_type_for (type);
> > > + arg2 = gimple_build (&stmts, VIEW_CONVERT_EXPR, type, arg2);
> > >   tree zero_vec = build_zero_cst (type);
> > >   tree cmp_type = truth_type_for (type);
> > >   tree cmp = gimple_build (&stmts, LT_EXPR, cmp_type, arg2, zero_vec);
> >
> >
>
> Yes, thanks.
>
> Here's the updated patch:
>
> mask < 0 is always false for vector char under -funsigned-char,
> but vpblendvb needs to check the most significant bit. The patch
> explicitly VCEs the mask to vector signed char.
>
Pushed to trunk and backported to the GCC-13 and GCC-12 release branches.
(No need for GCC-11 and earlier, since the bug was introduced in GCC 12.)
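
A scalar sketch of the point made in the quoted commit message, one byte
lane at a time. This is plain C, not the gimple folding itself; the
blend_lane_* names are made up for illustration, and memcpy stands in for
the VIEW_CONVERT_EXPR.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pick b when "mask < 0" holds, with the mask lane kept unsigned, as a
   plain char lane is under -funsigned-char.  The test can never be true.  */
static uint8_t blend_lane_unsigned (uint8_t a, uint8_t b, uint8_t mask)
{
  return mask < 0 ? b : a;
}

/* Same selection after reinterpreting the mask bits as a signed byte.
   Now "< 0" is exactly a test of the most significant bit.  */
static uint8_t blend_lane_signed (uint8_t a, uint8_t b, uint8_t mask)
{
  int8_t smask;
  memcpy (&smask, &mask, sizeof smask);
  return smask < 0 ? b : a;
}

/* Usage: a mask byte with the top bit set should select b.  */
void blend_lane_demo (void)
{
  assert (blend_lane_unsigned (1, 2, 0x80) == 1);  /* b is never picked */
  assert (blend_lane_signed (1, 2, 0x80) == 2);    /* MSB set: picks b */
  assert (blend_lane_signed (1, 2, 0x7f) == 1);    /* MSB clear: picks a */
}
```

Some compilers may warn that the comparison in blend_lane_unsigned is always
false, which is the same observation the patch is built on.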
>
> gcc/ChangeLog:
>
> PR target/110108
> * config/i386/i386.cc (ix86_gimple_fold_builtin): Explicitly
> view_convert_expr mask to signed type when folding pblendvb
> builtins.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr110108-2.c: New test.
> ---
>  gcc/config/i386/i386.cc|  4 +++-
>  gcc/testsuite/gcc.target/i386/pr110108-2.c | 14 ++
>  2 files changed, 17 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr110108-2.c
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index da20c2c49de..4e594a9c88e 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -18561,8 +18561,10 @@ ix86_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>   tree itype = GET_MODE_INNER (TYPE_MODE (type)) == E_SFmode
> ? intSI_type_node : intDI_type_node;
>   type = get_same_sized_vectype (itype, type);
> - arg2 = gimple_build (&stmts, VIEW_CONVERT_EXPR, type, arg2);
> }
> + else
> +   type = signed_type_for (type);
> + arg2 = gimple_build (&stmts, VIEW_CONVERT_EXPR, type, arg2);
>   tree zero_vec = build_zero_cst (type);
>   tree cmp_type = truth_type_for (type);
>   tree cmp = gimple_build (&stmts, LT_EXPR, cmp_type, arg2, zero_vec);
> diff --git a/gcc/testsuite/gcc.target/i386/pr110108-2.c 
> b/gcc/testsuite/gcc.target/i386/pr110108-2.c
> new file mode 100644
> index 000..2d1d2fd4991
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr110108-2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx2 -O2 -funsigned-char" } */
> +/* { dg-final { scan-assembler-times "vpblendvb" 2 } } */
> +
> +#include <immintrin.h>
> +__m128i do_stuff_128(__m128i X0, __m128i X1, __m128i X2) {
> +  __m128i Result = _mm_blendv_epi8(X0, X1, X2);
> +  return Result;
> +}
> +
> +__m256i do_stuff_256(__m256i X0, __m256i X1, __m256i X2) {
> +  __m256i Result = _mm256_blendv_epi8(X0, X1, X2);
> +  return Result;
> +}
> --
> 2.39.1.388.g2fc9e9ca3c
>


-- 
BR,
Hongtao


Re: [PATCH] Fold _mm{, 256, 512}_abs_{epi8, epi16, epi32, epi64} into gimple ABSU_EXPR + VCE.

2023-06-08 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 7, 2023 at 8:31 AM Hongtao Liu  wrote:
>
> On Tue, Jun 6, 2023 at 10:36 PM Uros Bizjak  wrote:
> >
> > On Tue, Jun 6, 2023 at 1:42 PM Hongtao Liu  wrote:
> > >
> > > On Tue, Jun 6, 2023 at 5:11 PM Uros Bizjak  wrote:
> > > >
> > > > On Tue, Jun 6, 2023 at 6:33 AM liuhongt via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > > r14-1145 folds the intrinsics into gimple ABS_EXPR, which has UB for
> > > > > > TYPE_MIN, but PABSB stores an unsigned result into dst. The patch
> > > > > > uses ABSU_EXPR + VCE instead of ABS_EXPR.
> > > > >
> > > > > Also don't fold _mm_abs_{pi8,pi16,pi32} w/o TARGET_64BIT since 64-bit
> > > > > vector absm2 is guarded with TARGET_MMX_WITH_SSE.
> > > >
> > > > > This should be !TARGET_MMX_WITH_SSE. TARGET_64BIT is not enough, see
> > > > > the definition of T_M_W_S in i386.h. OTOH, these builtins are
> > > > > available for TARGET_MMX, so I'm not sure if the above check is needed
> > > > > at all.
> > > BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0,
> > > CODE_FOR_ssse3_absv8qi2, "__builtin_ia32_pabsb", IX86_BUILTIN_PABSB,
> > > UNKNOWN, (int) V8QI_FTYPE_V8QI)
> > >
> > > ISA requirement(OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX) will be
> > > checked by ix86_check_builtin_isa_match which is at the beginning of
> > > ix86_gimple_fold_builtin.
> > > > Here, we're folding those builtins into gimple ABSU_EXPR, and
> > > > ABSU_EXPR will be lowered by the vec_lower pass when the backend
> > > > doesn't support the corresponding absm2_optab; that's why I only check
> > > > TARGET_64BIT here.
> > >
> > > > Please note that we are using builtins here, so we should not fold to
> > > > absm2, but to ssse3_absm2, which is also available with TARGET_MMX.
> > > > Yes, that's exactly why I checked TARGET_64BIT here: w/ TARGET_64BIT,
> > > > the backend supports absm2_optab, which exactly matches ssse3_absm2.
> > > > W/o TARGET_64BIT, the builtin shouldn't be folded into gimple ABSU_EXPR,
> > > > but left for the backend to expand to ssse3_absm2.
> >
> > Thanks for the explanation, but for consistency, I'd recommend
> > checking TARGET_MMX_WITH_SSE (= TARGET_64BIT && TARGET_SSE2) here. The
> > macro is self-explanatory, while the usage of TARGET_64BIT is not that
> > descriptive.
> Sure.
Pushed to trunk.
> >
> > Uros.
>
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao
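
The ABS_EXPR versus ABSU_EXPR + VCE distinction from the quoted commit
message can be sketched for a single signed byte as follows. This is an
illustration in plain C under the assumption that one lane behaves like an
int8_t; the function names are made up and memcpy plays the role of the
VCE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ABSU-style absolute value: negate in unsigned arithmetic so the most
   negative input maps to 128 instead of overflowing.  */
static uint8_t absu8 (int8_t x)
{
  uint8_t bits;
  memcpy (&bits, &x, sizeof bits);
  return x < 0 ? (uint8_t) (0u - bits) : bits;
}

/* Reinterpret the unsigned result as signed again (the "+ VCE" step),
   so the caller still sees a signed element type.  */
static int8_t abs8_via_absu (int8_t x)
{
  uint8_t u = absu8 (x);
  int8_t r;
  memcpy (&r, &u, sizeof r);
  return r;
}

/* Usage: only the most negative value behaves specially.  */
void abs8_demo (void)
{
  assert (absu8 (-5) == 5);
  assert (absu8 (INT8_MIN) == 128);
  assert (abs8_via_absu (-5) == 5);
  assert (abs8_via_absu (INT8_MIN) == INT8_MIN);
}
```

The last assertion is the case a signed ABS_EXPR leaves undefined: the
unsigned result 128 has a well-defined bit pattern, and viewing it as a
signed byte gives back the most negative value.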


[PATCH] MATCH: Fix zero_one_valued_p not to match signed 1 bit integers

2023-06-08 Thread Andrew Pinski via Gcc-patches
For the attached testcases, we assumed that zero_one_valued_p only
matches values in the range [0,1], but it currently also matches
signed 1-bit integers, whose values are 0 and -1.
This patch stops it from matching those and fixes the 2 new testcases
at all optimization levels.
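
A small standalone illustration of why a signed 1-bit value must not be
treated as "0 or 1". It uses the multiply-to-AND rewrite that appears as
context in the match.pd hunk of this patch; that choice is only an example
and is not claimed to be the transformation behind either PR. The struct and
function names are made up.

```c
#include <assert.h>

/* A signed 1-bit bit-field can hold only 0 and -1.  */
struct one_bit
{
  signed int v : 1;
};

static int multiply (struct one_bit a, struct one_bit b)
{
  return a.v * b.v;
}

static int bitwise_and (struct one_bit a, struct one_bit b)
{
  return a.v & b.v;
}

/* Usage: the two operations agree for 0 but not for -1.  */
void one_bit_demo (void)
{
  struct one_bit zero = { 0 };
  struct one_bit minus_one = { -1 };

  assert (multiply (zero, minus_one) == bitwise_and (zero, minus_one));
  assert (multiply (minus_one, minus_one) == 1);
  assert (bitwise_and (minus_one, minus_one) == -1);
}
```

For true 0/1 operands the product and the bitwise AND are interchangeable;
with -1 in play they differ, so a predicate that promises "0 or 1" has to
exclude this type.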

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Note the GCC 13 patch will be slightly different due to the changes
made to zero_one_valued_p.

PR tree-optimization/110165
PR tree-optimization/110166

gcc/ChangeLog:

* match.pd (zero_one_valued_p): Don't accept
signed 1-bit integers.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr110165-1.c: New test.
* gcc.c-torture/execute/pr110166-1.c: New test.
---
 gcc/match.pd  | 13 ++--
 .../gcc.c-torture/execute/pr110165-1.c| 28 
 .../gcc.c-torture/execute/pr110166-1.c| 33 +++
 3 files changed, 71 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110165-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr110166-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 4ad037d641a..9a6bc2e9348 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1984,12 +1984,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   @0)
 
 /* zero_one_valued_p will match when a value is known to be either
-   0 or 1 including constants 0 or 1. */
+   0 or 1 including constants 0 or 1.
+   Signed 1-bits includes -1 so they cannot match here. */
 (match zero_one_valued_p
  @0
- (if (INTEGRAL_TYPE_P (type) && wi::leu_p (tree_nonzero_bits (@0), 1))))
+ (if (INTEGRAL_TYPE_P (type)
+  && (TYPE_UNSIGNED (type)
+ || TYPE_PRECISION (type) > 1)
+  && wi::leu_p (tree_nonzero_bits (@0), 1))))
 (match zero_one_valued_p
- truth_valued_p@0)
+ truth_valued_p@0
+ (if (INTEGRAL_TYPE_P (type)
+  && (TYPE_UNSIGNED (type)
+ || TYPE_PRECISION (type) > 1))))
 
 /* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 }.  */
 (simplify
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110165-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr110165-1.c
new file mode 100644
index 000..9521a19428e
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110165-1.c
@@ -0,0 +1,28 @@
+struct s
+{
+  int t : 1;
+};
+
+int f(struct s t, int a, int b) __attribute__((noinline));
+int f(struct s t, int a, int b)
+{
+int bd = t.t;
+if (bd) a|=b;
+return a;
+}
+
+int main(void)
+{
+struct s t;
+for(int i = -1;i <= 1; i++)
+{
+int a = 0x10;
+int b = 0x0f;
+int c = a | b;
+   struct s t = {i};
+int r = f(t, a, b);
+int exp = (i != 0) ? a | b : a;
+if (exp != r)
+ __builtin_abort();
+}
+}
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr110166-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr110166-1.c
new file mode 100644
index 000..f999d47fe69
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr110166-1.c
@@ -0,0 +1,33 @@
+struct s
+{
+  int t : 1;
+  int t1 : 1;
+};
+
+int f(struct s t) __attribute__((noinline));
+int f(struct s t)
+{
+   int c = t.t;
+   int d = t.t1;
+   if (c > d)
+ t.t = d;
+   else
+ t.t = c;
+  return t.t;
+}
+
+int main(void)
+{
+struct s t;
+for(int i = -1;i <= 0; i++)
+{
+  for(int j = -1;j <= 0; j++)
+  {
+   struct s t = {i, j};
+int r = f(t);
+int exp = i < j ? i : j;
+if (exp != r)
+ __builtin_abort();
+  }
+}
+}
-- 
2.31.1



Re: [PATCH 1/2] Implementation of new RISCV optimizations pass: fold-mem-offsets.

2023-06-08 Thread Jeff Law via Gcc-patches




On 5/25/23 06:35, Manolis Tsamis wrote:

Implementation of the new RISC-V optimization pass for memory offset
calculations, documentation and testcases.

gcc/ChangeLog:

* config.gcc: Add riscv-fold-mem-offsets.o to extra_objs.
* config/riscv/riscv-passes.def (INSERT_PASS_AFTER): Schedule a new
pass.
* config/riscv/riscv-protos.h (make_pass_fold_mem_offsets): Declare.
* config/riscv/riscv.opt: New options.
* config/riscv/t-riscv: New build rule.
* doc/invoke.texi: Document new option.
* config/riscv/riscv-fold-mem-offsets.cc: New file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/fold-mem-offsets-1.c: New test.
* gcc.target/riscv/fold-mem-offsets-2.c: New test.
* gcc.target/riscv/fold-mem-offsets-3.c: New test.

So a followup.

While I think we probably could create a variety of backend patterns,
perhaps disallow the frame pointer as the addend argument to a shadd
pattern and the like, and capture the key cases from mcf and probably
deepsjeng, it's probably not the best direction.


What I suspect would ultimately happen is we'd be presented with 
additional cases over time that would require an ever increasing number 
of patterns.  sign vs zero extension, increasing depth of search space 
to find reassociation opportunities, different variants with and without 
shadd/zbb, etc etc.


So with that space explored a bit, the next big question is whether this
should be target specific or generic.  I'll poke in there a bit over the
coming days.  In the meantime I do have some questions/comments on the
code itself.  There may be more over time.





+static rtx_insn*
+get_single_def_in_bb (rtx_insn *insn, rtx reg)

[ ... ]



+  for (ref_link = ref_chain; ref_link; ref_link = ref_link->next)
+{
+  /* Problem getting some definition for this instruction.  */
+  if (ref_link->ref == NULL)
+   return NULL;
+  if (DF_REF_INSN_INFO (ref_link->ref) == NULL)
+   return NULL;
+  if (global_regs[REGNO (reg)]
+ && !set_of (reg, DF_REF_INSN (ref_link->ref)))
+   return NULL;
+}
That last condition feels a bit odd.  It would seem that you wanted an 
OR boolean rather than AND.




+
+  unsigned int dest_regno = REGNO (dest);
+
+  /* We don't want to fold offsets from instructions that change some
+ particular registers with potentially global side effects.  */
+  if (!GP_REG_P (dest_regno)
+  || dest_regno == STACK_POINTER_REGNUM
+  || (frame_pointer_needed && dest_regno == HARD_FRAME_POINTER_REGNUM)
+  || dest_regno == GP_REGNUM
+  || dest_regno == THREAD_POINTER_REGNUM
+  || dest_regno == RETURN_ADDR_REGNUM)
+return 0;
I'd think most of this would be captured by testing fixed_registers 
rather than trying to list each register individually.  In fact, if we 
need to generalize this to work on other targets we almost certainly 
want a more general test.




+  else if ((
+   GET_CODE (src) == SIGN_EXTEND
+   || GET_CODE (src) == ZERO_EXTEND
+ )
+ && MEM_P (XEXP (src, 0)))

Formatting is goofy above...




+
+ if (dump_file)
+   {
+ fprintf (dump_file, "Instruction deleted from folding:");
+ print_rtl_single (dump_file, insn);
+   }
+
+ if (REGNO (dest) != REGNO (arg1))
+   {
+ /* If the dest register is different than the fisrt argument
+then the addition with constant 0 is equivalent to a move
+instruction.  We emit the move and let the subsequent
+pass cprop_hardreg eliminate that if possible.  */
+ rtx arg1_reg_rtx = gen_rtx_REG (GET_MODE (dest), REGNO (arg1));
+ rtx mov_rtx = gen_move_insn (dest, arg1_reg_rtx);
+ df_insn_rescan (emit_insn_after (mov_rtx, insn));
+   }
+
+ /* If the dest register is the same with the first argument
+then the addition with constant 0 is a no-op.
+We can now delete the original add immidiate instruction.  */
+ delete_insn (insn);
The debugging message is a bit misleading.  Yea, we always delete 
something here, but in one case we end up emitting a copy.





+
+ /* Temporarily change the offset in MEM to test whether
+it results in a valid instruction.  */
+ machine_mode mode = GET_MODE (mem_addr);
+ XEXP (mem, 0) = gen_rtx_PLUS (mode, reg, GEN_INT (offset));
+
+ bool valid_change = recog (PATTERN (insn), insn, 0) >= 0;
+
+ /* Restore the instruction.  */
+ XEXP (mem, 0) = mem_addr;
You need to reset the INSN_CODE after restoring the instruction.  That's 
generally a bad thing to do, but I've seen it done enough (and been 
guilty myself in the past) that we should just assume some ports are 
broken in this regard.
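
A toy model of the "try a change, restore, then reset the cached code" advice
above. Everything here is hypothetical: struct toy_insn, the toy_* functions
and the [-2048, 2047] offset range are assumptions for illustration, and the
sketch assumes a recognizer that records its answer in the instruction as a
side effect, which is the kind of port behavior the comment is guarding
against. It is not how GCC's recog or INSN_CODE are implemented.

```c
#include <assert.h>

struct toy_insn
{
  int offset;       /* memory offset operand */
  int cached_code;  /* -1 = not recognized yet, otherwise a pattern code */
};

/* Toy recognizer: code 0 for a zero offset, code 1 for a nonzero offset
   in range, -1 when out of range.  It also caches what it computed.  */
static int toy_recog (struct toy_insn *insn)
{
  int code;
  if (insn->offset < -2048 || insn->offset > 2047)
    code = -1;
  else
    code = insn->offset == 0 ? 0 : 1;
  insn->cached_code = code;
  return code;
}

/* Return the cached code, recognizing first if nothing is cached.  */
static int toy_recog_memoized (struct toy_insn *insn)
{
  if (insn->cached_code < 0)
    toy_recog (insn);
  return insn->cached_code;
}

/* Test a candidate offset without committing to it.  */
static int toy_offset_is_valid (struct toy_insn *insn, int new_offset)
{
  int saved_offset = insn->offset;
  insn->offset = new_offset;
  int valid = toy_recog (insn) >= 0;
  insn->offset = saved_offset;   /* restore the instruction */
  insn->cached_code = -1;        /* drop what the trial left behind */
  return valid;
}

/* Usage: after a trial, the next lookup sees the restored form.  */
void toy_insn_demo (void)
{
  struct toy_insn insn = { 0, -1 };
  assert (toy_recog_memoized (&insn) == 0);
  assert (toy_offset_is_valid (&insn, 16));
  assert (insn.offset == 0 && insn.cached_code == -1);
  assert (toy_recog_memoized (&insn) == 0);
  assert (!toy_offset_is_valid (&insn, 4096));
}
```

Without the reset in toy_offset_is_valid, the cache would still say 1 after
the trial with offset 16 even though the restored instruction has offset 0.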



Anyway, just wanted to get those issues raised so that you can address them.

jeff


Re: [pushed] c++: allow NRV and non-NRV returns [PR58487]

2023-06-08 Thread Hans-Peter Nilsson via Gcc-patches
> Date: Wed,  7 Jun 2023 18:06:15 -0400
> From: Jason Merrill via Gcc-patches 

> Tested x86_64-pc-linux-gnu, applying to trunk.
> 
> -- 8< --
> 
> Now that we support NRV from an inner block, we can also support non-NRV
> returns from other blocks, since once the NRV is out of scope a later return
> expression can't possibly alias it.
> 
> This fixes 58487 and half-fixes 53637: now one of the returns is elided, but
> not the other.
> 
> Fixing the remaining xfails in these testcases will require a very different
> approach, probably involving a full tree/block walk from finalize_nrv, and
> check_return_expr only adding to a list of potential return variables.
> 
>   PR c++/58487
>   PR c++/53637
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (INIT_EXPR_NRV_P): New.
>   * semantics.cc (finalize_nrv_r): Check it.
>   * name-lookup.h (decl_in_scope_p): Declare.
>   * name-lookup.cc (decl_in_scope_p): New.
>   * typeck.cc (check_return_expr): Allow non-NRV
>   returns if the NRV is no longer in scope.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/opt/nrv26.C: New test.
>   * g++.dg/opt/nrv26a.C: New test.
>   * g++.dg/opt/nrv27.C: New test.

This somehow caused 21 regressions for cris-elf in the c++
and libstdc++ testsuites.  I opened PR110185 to hold the
preprocessed g++.dg/cpp2a/spaceship-p1186.C.

brgds, H-P


Re: [PATCH] analyzer: Standalone OOB-warning [PR109437, PR109439]

2023-06-08 Thread Benjamin Priour via Gcc-patches
Hi Maxim,

I managed to track down the bug behind the failing test pr100244.C; I
also observed a divergence after my patch.
For pr101962.c, the failure was simply a dg-note I forgot to remove; the
related warning is no longer relevant. The behavior is otherwise as
expected before and after the patch.
It is now fixed.

Thanks again for your attention,
Benjamin.

Fwd: [PATCH] analyzer: Standalone OOB-warning [PR109437, PR109439]

2023-06-08 Thread Benjamin Priour via Gcc-patches
Hi David,

So my first real committed patch actually was a misstep, and I'm currently
fixing that.
The issue is that the original idea, to return a boolean and create an
unknown_svalue on OOB access to prevent further "use-of-uninitialized-value"
warnings, caused a loss of information on the location of the buffer. Now,
when a buffer is on the stack, we lose that information by returning an
unknown_svalue from get_store_value. Therefore further checks from
'sm_malloc' can't detect erroneous operations that expect a heap-allocated
buffer, e.g. a delete.
It does not trouble successive out_of_bounds checks, since those checks are
done on the boundaries of the initial buffer.
The issue comes from sm_state_map::get_state: since an unknown_svalue cannot
hold any state, the checker is misled.
I thought of artificially adding a state, but since unknown_svalues are
singletons per type, that is not right.
Therefore I'm considering making can_have_initial_svalue_p hold true for
OOB heap accesses, as it does for the stack, instead of creating an
unknown_svalue.
I'll do **PROPER** testing tomorrow, now that I have access to the compile
farm, to check that doing so won't introduce any further issues.
This way we would keep all the relevant information about the region
without making it poisoned, and OOB checks are done with the initial byte
size of the buffer, so they should not be disturbed.

I briefly tried the above as a proof of concept. Doing so fixed pr100244.C,
mentioned by Maxim, while still passing my new test cases for PR109439.
I will regtest this configuration tomorrow morning on the farm, as I am
getting sleepy, unless you can already see problems this would cause.

Sorry again that I managed to botch my first commit, and pushed it.

Thanks,
Benjamin.

-- Forwarded message -
From: Benjamin Priour 
Date: Thu, Jun 8, 2023 at 8:18 PM
Subject: Re: [PATCH] analyzer: Standalone OOB-warning [PR109437, PR109439]
To: Maxim Kuvyrkov 
Cc: , Benjamin Priour , <
dmalc...@redhat.com>


Hi,

Yes, of course. I tested many days ago, since regtesting takes several days
on my box; I should have retested!
But I got an account for the compile farm today, so I'm on it immediately.
I also see a divergence in the warnings on my box.

Thanks for the report !
Sincerely sorry,
Benjamin.

On Thu, Jun 8, 2023 at 7:53 PM Maxim Kuvyrkov 
wrote:

> > On Jun 6, 2023, at 15:48, Benjamin Priour via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> >
> > From: Benjamin Priour 
> >
> > This patch enhances -Wanalyzer-out-of-bounds, which is no longer paired
> > with a -Wanalyzer-use-of-uninitialized-value on out-of-bounds-read.
> >
> > This also fixes PR analyzer/109437.
> > Before there could always be at most one OOB-read warning per frame
> because
> > -Wanalyzer-use-of-uninitialized-value always terminates the analysis
> > path.
> >
> > PR analyzer/109439
> >
> > gcc/analyzer/ChangeLog:
> >
> > * bounds-checking.cc (region_model::check_symbolic_bounds):
> >  Returns whether the BASE_REG region access was OOB.
> > (region_model::check_region_bounds): Likewise.
> > * region-model.cc (region_model::get_store_value): Creates an
> >  unknown svalue on OOB-read access to REG.
> > (region_model::check_region_access): Returns whether an
> > unknown svalue needs be created.
> > (region_model::check_region_for_read): Passes
> > check_region_access return value.
> > * region-model.h: Update prior function definitions.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/analyzer/out-of-bounds-2.c: Cleaned test for
> >  uninitialized-value warning.
> > * gcc.dg/analyzer/out-of-bounds-5.c: Likewise.
> > * gcc.dg/analyzer/pr101962.c: Likewise.
> > * gcc.dg/analyzer/realloc-5.c: Likewise.
> > * gcc.dg/analyzer/pr109439.c: New test.
> > ---
>
> Hi Benjamin,
>
> This patch makes two tests fail on arm-linux-gnueabihf.  Probably, they
> need to be updated as well.  Would you please investigate?  Let me know if
> it doesn't easily reproduce for you, and I'll help.
>
> === g++ tests ===
>
> Running g++:g++.dg/analyzer/analyzer.exp ...
> FAIL: g++.dg/analyzer/pr100244.C -std=c++14 (test for warnings, line 17)
> FAIL: g++.dg/analyzer/pr100244.C -std=c++17 (test for warnings, line 17)
> FAIL: g++.dg/analyzer/pr100244.C -std=c++20 (test for warnings, line 17)
>
> === gcc tests ===
>
> Running gcc:gcc.dg/analyzer/analyzer.exp ...
> FAIL: gcc.dg/analyzer/pr101962.c (test for warnings, line 19)
>
> Thanks,
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org
>
>
>
> > gcc/analyzer/bounds-checking.cc   | 30 +--
> > gcc/analyzer/region-model.cc  | 22 +-
> > gcc/analyzer/region-model.h   |  8 ++---
> > .../gcc.dg/analyzer/out-of-bounds-2.c |  1 -
> > .../gcc.dg/analyzer/out-of-bounds-5.c |  2 --
> > gcc/testsuite/gcc.dg/analyzer/pr101962.c  |  1 -
> > gcc/testsuite/gcc.dg/analyzer/pr109439.c  | 12 
> > 

Re: [PATCH] c++: unsynthesized defaulted constexpr fn [PR110122]

2023-06-08 Thread Jason Merrill via Gcc-patches

On 6/8/23 15:54, Patrick Palka wrote:
> On Wed, 7 Jun 2023, Jason Merrill wrote:
>
> > On 6/6/23 14:29, Patrick Palka wrote:
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > trunk?
> > >
> > > -- >8 --
> > >
> > > In the second testcase of PR110122, during regeneration of the generic
> > > lambda with V=Bar{}, substitution followed by coerce_template_parms for
> > > A's template argument naturally yields a copy of V in terms of Bar's
> > > (implicitly) defaulted copy constructor.
> > >
> > > This however happens inside a template context so although we introduced
> > > a use of the copy constructor, mark_used didn't actually synthesize it,
> > > which causes subsequent constant evaluation of the template argument to
> > > fail with:
> > >
> > >    nontype-class58.C: In instantiation of ‘void f() [with Bar V =
> > > Bar{Foo()}]’:
> > >    nontype-class58.C:22:11:   required from here
> > >    nontype-class58.C:18:18: error: ‘constexpr Bar::Bar(const Bar&)’ used
> > > before its definition
> > >
> > > Conveniently we already make sure to instantiate eligible constexpr
> > > functions before such (manifestly) constant evaluation, as per P0859R0.
> > > So this patch fixes this by making sure to synthesize eligible defaulted
> > > constexpr functions beforehand as well.
> >
> > We probably also want to do this in cxx_eval_call_expression, under
>
> Makes sense, like so?  I'm not sure if it's possible to write a test
> for which this code path makes an observable difference, but I verified
> the code path is hit a couple of times throughout the testsuite (mainly
> from fold_non_dependent_expr called from build_non_dependent_expr).
> Bootstrapped and regtested on x86_64-pc-linux-gnu.

OK.


-->8 --

PR c++/110122

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Also synthesize
eligible defaulted functions.
(instantiate_cx_fn_r): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class58.C: New test.
---
  gcc/cp/constexpr.cc  | 14 
  gcc/testsuite/g++.dg/cpp2a/nontype-class58.C | 23 
  2 files changed, 33 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class58.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 8f7f0b7d325..9122a5efa65 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -2897,7 +2897,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
  
/* We can't defer instantiating the function any longer.  */

if (!DECL_INITIAL (fun)
-  && DECL_TEMPLOID_INSTANTIATION (fun)
+  && (DECL_TEMPLOID_INSTANTIATION (fun) || DECL_DEFAULTED_FN (fun))
&& !uid_sensitive_constexpr_evaluation_p ())
  {
location_t save_loc = input_location;
@@ -2905,7 +2905,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
++function_depth;
if (ctx->manifestly_const_eval == mce_true)
FNDECL_MANIFESTLY_CONST_EVALUATED (fun) = true;
-  instantiate_decl (fun, /*defer_ok*/false, /*expl_inst*/false);
+  if (DECL_TEMPLOID_INSTANTIATION (fun))
+   instantiate_decl (fun, /*defer_ok*/false, /*expl_inst*/false);
+  else
+   synthesize_method (fun);
--function_depth;
input_location = save_loc;
  }
@@ -8110,11 +8113,14 @@ instantiate_cx_fn_r (tree *tp, int *walk_subtrees, void 
*/*data*/)
&& DECL_DECLARED_CONSTEXPR_P (*tp)
&& !DECL_INITIAL (*tp)
&& !trivial_fn_p (*tp)
-  && DECL_TEMPLOID_INSTANTIATION (*tp)
+  && (DECL_TEMPLOID_INSTANTIATION (*tp) || DECL_DEFAULTED_FN (*tp))
&& !uid_sensitive_constexpr_evaluation_p ())
  {
++function_depth;
-  instantiate_decl (*tp, /*defer_ok*/false, /*expl_inst*/false);
+  if (DECL_TEMPLOID_INSTANTIATION (*tp))
+   instantiate_decl (*tp, /*defer_ok*/false, /*expl_inst*/false);
+  else
+   synthesize_method (*tp);
--function_depth;
  }
else if (TREE_CODE (*tp) == CALL_EXPR
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class58.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class58.C
new file mode 100644
index 000..6e40698da2f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class58.C
@@ -0,0 +1,23 @@
+// PR c++/110122
+// { dg-do compile { target c++20 } }
+
+struct Foo {
+  Foo() = default;
+  constexpr Foo(const Foo&) { }
+};
+
+struct Bar {
+  Foo _;
+};
+
+template<Bar V>
+struct A { };
+
+template<Bar V>
+void f() {
+  [](auto){ A<V> d; }(0); // { dg-bogus "used before its definition" }
+};
+
+int main() {
+  f<Bar{}>();
+}




Orphaned patches

2023-06-08 Thread Steve Kargl via Gcc-patches
If anyone is so inclined, the patches in the following
PRs can be committed and the PRs closed.  These are
patches for gfortran.

 69101  91960  95613  99139  99368  99798
100607 103795 103796 104626 105594 101967
101951 104649 106050 106500 107266 107406
107596

This is an opportunity for lurkers on the fortran@
list to grab a patch, apply it to the tree, ask questions,
and then take the plunge into becoming someone who can
help care for gfortran.
 
-- 
Steve


Re: [PATCH] c++: unsynthesized defaulted constexpr fn [PR110122]

2023-06-08 Thread Patrick Palka via Gcc-patches
On Wed, 7 Jun 2023, Jason Merrill wrote:

> On 6/6/23 14:29, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > -- >8 --
> > 
> > In the second testcase of PR110122, during regeneration of the generic
> > lambda with V=Bar{}, substitution followed by coerce_template_parms for
> > A's template argument naturally yields a copy of V in terms of Bar's
> > (implicitly) defaulted copy constructor.
> > 
> > This however happens inside a template context so although we introduced
> > a use of the copy constructor, mark_used didn't actually synthesize it,
> > which causes subsequent constant evaluation of the template argument to
> > fail with:
> > 
> >nontype-class58.C: In instantiation of ‘void f() [with Bar V =
> > Bar{Foo()}]’:
> >nontype-class58.C:22:11:   required from here
> >nontype-class58.C:18:18: error: ‘constexpr Bar::Bar(const Bar&)’ used
> > before its definition
> > 
> > Conveniently we already make sure to instantiate eligible constexpr
> > functions before such (manifestly) constant evaluation, as per P0859R0.
> > So this patch fixes this by making sure to synthesize eligible defaulted
> > constexpr functions beforehand as well.
> 
> We probably also want to do this in cxx_eval_call_expression, under

Makes sense, like so?  I'm not sure if it's possible to write a test
for which this code path makes an observable difference, but I verified
the code path is hit a couple of times throughout the testsuite (mainly
from fold_non_dependent_expr called from build_non_dependent_expr).
Bootstrapped and regtested on x86_64-pc-linux-gnu.

-->8 --

PR c++/110122

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Also synthesize
eligible defaulted functions.
(instantiate_cx_fn_r): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class58.C: New test.
---
 gcc/cp/constexpr.cc  | 14 
 gcc/testsuite/g++.dg/cpp2a/nontype-class58.C | 23 
 2 files changed, 33 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class58.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 8f7f0b7d325..9122a5efa65 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -2897,7 +2897,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
 
   /* We can't defer instantiating the function any longer.  */
   if (!DECL_INITIAL (fun)
-  && DECL_TEMPLOID_INSTANTIATION (fun)
+  && (DECL_TEMPLOID_INSTANTIATION (fun) || DECL_DEFAULTED_FN (fun))
   && !uid_sensitive_constexpr_evaluation_p ())
 {
   location_t save_loc = input_location;
@@ -2905,7 +2905,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
   ++function_depth;
   if (ctx->manifestly_const_eval == mce_true)
FNDECL_MANIFESTLY_CONST_EVALUATED (fun) = true;
-  instantiate_decl (fun, /*defer_ok*/false, /*expl_inst*/false);
+  if (DECL_TEMPLOID_INSTANTIATION (fun))
+   instantiate_decl (fun, /*defer_ok*/false, /*expl_inst*/false);
+  else
+   synthesize_method (fun);
   --function_depth;
   input_location = save_loc;
 }
@@ -8110,11 +8113,14 @@ instantiate_cx_fn_r (tree *tp, int *walk_subtrees, void 
*/*data*/)
   && DECL_DECLARED_CONSTEXPR_P (*tp)
   && !DECL_INITIAL (*tp)
   && !trivial_fn_p (*tp)
-  && DECL_TEMPLOID_INSTANTIATION (*tp)
+  && (DECL_TEMPLOID_INSTANTIATION (*tp) || DECL_DEFAULTED_FN (*tp))
   && !uid_sensitive_constexpr_evaluation_p ())
 {
   ++function_depth;
-  instantiate_decl (*tp, /*defer_ok*/false, /*expl_inst*/false);
+  if (DECL_TEMPLOID_INSTANTIATION (*tp))
+   instantiate_decl (*tp, /*defer_ok*/false, /*expl_inst*/false);
+  else
+   synthesize_method (*tp);
   --function_depth;
 }
   else if (TREE_CODE (*tp) == CALL_EXPR
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class58.C 
b/gcc/testsuite/g++.dg/cpp2a/nontype-class58.C
new file mode 100644
index 000..6e40698da2f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class58.C
@@ -0,0 +1,23 @@
+// PR c++/110122
+// { dg-do compile { target c++20 } }
+
+struct Foo {
+  Foo() = default;
+  constexpr Foo(const Foo&) { }
+};
+
+struct Bar {
+  Foo _;
+};
+
+template<Bar V>
+struct A { };
+
+template<Bar V>
+void f() {
+  [](auto){ A<V> d; }(0); // { dg-bogus "used before its definition" }
+};
+
+int main() {
+  f<Bar{}>();
+}
-- 
2.41.0.rc1.10.g9e49351c30


> 
> >   /* We can't defer instantiating the function any longer.  */
> 
> Jason
> 
> 


[COMMITTED 3/4] Unify range_operators to one class.

2023-06-08 Thread Andrew MacLeod via Gcc-patches
Range_operator and range_operator_float are 2 different classes, which 
was not the original intent. This makes generalized dispatch to the 
appropriate function more difficult.  The distinction between what is a 
float operator and what is an integral operator also blurs when some 
methods have multiple types.  ie, casts : INT = FLOAT and FLOAT = INT, 
or other mixed operations like INT = FLOAT < FLOAT


This patch unifies all possible invocation patterns in one 
range_operator class. All the float operators now inherit from 
range_operator, and this allows the float table to use the general 
range_op_table class instead of re-implementing another kind of table.  
This paves the way for the next patch which provides generalized 
dispatch for the various routines from a VRANGE.
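
As a rough illustration of the idea (a standalone sketch, not the GCC sources): one base class can carry every invocation pattern as an overload, with each default reporting "not handled". The names IntRange, FloatRange, RangeOp and OpFloatLess are hypothetical stand-ins for irange, frange, range_operator and a concrete operator, and NaN handling is left out entirely.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's irange and frange; not the real classes.
struct IntRange   { long lo, hi; };
struct FloatRange { double lo, hi; };

// One base class carries every invocation pattern.  Each default returns
// false, meaning "this operator does not handle that type combination".
class RangeOp
{
public:
  virtual ~RangeOp () = default;
  // INT = INT op INT
  virtual bool fold_range (IntRange &, const IntRange &,
                           const IntRange &) const { return false; }
  // FLOAT = FLOAT op FLOAT
  virtual bool fold_range (FloatRange &, const FloatRange &,
                           const FloatRange &) const { return false; }
  // INT = FLOAT op FLOAT, e.g. a comparison of two floats
  virtual bool fold_range (IntRange &, const FloatRange &,
                           const FloatRange &) const { return false; }
};

// A float comparison: integral (boolean) result from two float operands.
class OpFloatLess : public RangeOp
{
public:
  // Overriding one overload hides the others, so re-expose them.
  using RangeOp::fold_range;
  bool fold_range (IntRange &r, const FloatRange &lh,
                   const FloatRange &rh) const override
  {
    if (lh.hi < rh.lo)
      r = {1, 1};   // every lh value is below every rh value
    else if (lh.lo >= rh.hi)
      r = {0, 0};   // no lh value can be below any rh value
    else
      r = {0, 1};   // cannot tell
    return true;
  }
};
```

With a single base class, a table of operators only needs to store RangeOp pointers, and a mixed-type operator such as the comparison above no longer has to pick between an "integral" and a "float" hierarchy.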


There is little functional difference after this patch. Bootstraps on 
x86_64-pc-linux-gnu with no regressions.  Pushed.


Andrew
From e925119d520ac10674ed42faf14955aaf130c03b Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 31 May 2023 12:31:53 -0400
Subject: [PATCH 3/4] Unify range_operators to one class.

Range_operator and range_operator_float are 2 different classes, making
generalized dispatch difficult.  The distinction between what is a float
operator and what is an integral operator also blurs when some methods
have multiple types.  ie, casts : INT = FLOAT and FLOAT = INT

This patch unifies all possible invocation patterns in one class, and
switches the float table to use the general range_op_table.

	* gimple-range-op.cc (cfn_constant_float_p): Change base class.
	(cfn_pass_through_arg1): Adjust using statement.
	(cfn_signbit): Change base class, adjust using statement.
	(cfn_copysign): Ditto.
	(cfn_sqrt): Ditto.
	(cfn_sincos): Ditto.
	* range-op-float.cc (fold_range): Change class to range_operator.
	(rv_fold): Ditto.
	(op1_range): Ditto
	(op2_range): Ditto
	(lhs_op1_relation): Ditto.
	(lhs_op2_relation): Ditto.
	(op1_op2_relation): Ditto.
	(foperator_*): Ditto.
	(class float_table): New.  Inherit from range_op_table.
	(floating_tree_table) Change to range_op_table pointer.
	(class floating_op_table): Delete.
	* range-op.cc (operator_equal): Adjust using statement.
	(operator_not_equal): Ditto.
	(operator_lt, operator_le, operator_gt, operator_ge): Ditto.
	(operator_minus, operator_cast): Ditto.
	(operator_bitwise_and, pointer_plus_operator): Ditto.
	(get_float_handle): Change return type.
	* range-op.h (range_operator_float): Delete.  Relocate all methods
	into class range_operator.
	(range_op_handler::m_float): Change type to range_operator.
	(floating_op_table): Delete.
	(floating_tree_table): Change type.
---
 gcc/gimple-range-op.cc |  27 ++---
 gcc/range-op-float.cc  | 222 +++--
 gcc/range-op.cc|  12 ++-
 gcc/range-op.h | 124 +++
 4 files changed, 183 insertions(+), 202 deletions(-)

diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 59c47e2074d..293d76402e1 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -268,10 +268,10 @@ gimple_range_op_handler::calc_op2 (vrange , const vrange _range,
 // 
 
 // Implement range operator for float CFN_BUILT_IN_CONSTANT_P.
-class cfn_constant_float_p : public range_operator_float
+class cfn_constant_float_p : public range_operator
 {
 public:
-  using range_operator_float::fold_range;
+  using range_operator::fold_range;
   virtual bool fold_range (irange , tree type, const frange ,
 			   const irange &, relation_trio) const
   {
@@ -319,6 +319,7 @@ class cfn_pass_through_arg1 : public range_operator
 {
 public:
   using range_operator::fold_range;
+  using range_operator::op1_range;
   virtual bool fold_range (irange , tree, const irange ,
 			   const irange &, relation_trio) const
   {
@@ -334,11 +335,11 @@ public:
 } op_cfn_pass_through_arg1;
 
 // Implement range operator for CFN_BUILT_IN_SIGNBIT.
-class cfn_signbit : public range_operator_float
+class cfn_signbit : public range_operator
 {
 public:
-  using range_operator_float::fold_range;
-  using range_operator_float::op1_range;
+  using range_operator::fold_range;
+  using range_operator::op1_range;
   virtual bool fold_range (irange , tree type, const frange ,
 			   const irange &, relation_trio) const override
   {
@@ -373,10 +374,10 @@ public:
 } op_cfn_signbit;
 
 // Implement range operator for CFN_BUILT_IN_COPYSIGN
-class cfn_copysign : public range_operator_float
+class cfn_copysign : public range_operator
 {
 public:
-  using range_operator_float::fold_range;
+  using range_operator::fold_range;
   virtual bool fold_range (frange , tree type, const frange ,
 			   const frange , relation_trio) const override
   {
@@ -464,11 +465,11 @@ frange_mpfr_arg1 (REAL_VALUE_TYPE *res_low, REAL_VALUE_TYPE *res_high,
   return true;
 }
 
-class cfn_sqrt : public range_operator_float
+class cfn_sqrt : public range_operator
 {
 public:
-  

[COMMITTED 4/4] Provide a new dispatch mechanism for range-ops.

2023-06-08 Thread Andrew MacLeod via Gcc-patches

This patch introduces a new dispatch mechanism for range_op_handler.

Instead of ad-hoc if then elses based on is_a<irange *> and is_a<frange 
*>, the discriminators in class vrange are used for each operand to 
create a triplet, ie (III for "LHS = Irange, op1 = Irange, op2 = 
Irange", and IFI for "Irange Frange Irange")


These triplets are then used in a switch to dispatch the call to the 
appropriate one in range_operator for those types.  An added bonus is 
that if there is a pattern that is not recognized, we no longer trap. 
The dispatch routine simply returns the same thing as a default routine 
does in range_operator: either false or VREL_VARYING depending on the 
signature.
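
The triplet dispatch can be sketched in a few lines of standalone C++. This is only an illustration under assumptions, not the code from the patch: the types Kind, VRange, IRange, FRange and Op, the free function fold, and the bit packing used here are invented for the example. Only the names dispatch_trio and RO_III/RO_IFF/RO_FFF are borrowed from the ChangeLog below, and their real definitions may well differ.

```cpp
#include <cassert>

// Hypothetical discriminator, standing in for the one in class vrange.
enum Kind : unsigned { K_INT = 1, K_FLOAT = 2 };

struct VRange { Kind kind; explicit VRange (Kind k) : kind (k) {} };
struct IRange : VRange { long lo = 0, hi = 0; IRange () : VRange (K_INT) {} };
struct FRange : VRange { double lo = 0, hi = 0; FRange () : VRange (K_FLOAT) {} };

// Pack the discriminators of LHS, op1 and op2 into one switchable value.
constexpr unsigned dispatch_trio (unsigned lhs, unsigned op1, unsigned op2)
{ return (lhs << 8) | (op1 << 4) | op2; }

constexpr unsigned RO_III = dispatch_trio (K_INT, K_INT, K_INT);
constexpr unsigned RO_IFF = dispatch_trio (K_INT, K_FLOAT, K_FLOAT);
constexpr unsigned RO_FFF = dispatch_trio (K_FLOAT, K_FLOAT, K_FLOAT);

// Unified operator base; every default returns false.
struct Op
{
  virtual ~Op () = default;
  virtual bool fold_range (IRange &, const IRange &, const IRange &) const
  { return false; }
  virtual bool fold_range (IRange &, const FRange &, const FRange &) const
  { return false; }
  virtual bool fold_range (FRange &, const FRange &, const FRange &) const
  { return false; }
};

// Generic entry point: pick the typed overload from the operand kinds.
bool fold (const Op &op, VRange &r, const VRange &lh, const VRange &rh)
{
  switch (dispatch_trio (r.kind, lh.kind, rh.kind))
    {
    case RO_III:
      return op.fold_range (static_cast<IRange &> (r),
                            static_cast<const IRange &> (lh),
                            static_cast<const IRange &> (rh));
    case RO_IFF:
      return op.fold_range (static_cast<IRange &> (r),
                            static_cast<const FRange &> (lh),
                            static_cast<const FRange &> (rh));
    case RO_FFF:
      return op.fold_range (static_cast<FRange &> (r),
                            static_cast<const FRange &> (lh),
                            static_cast<const FRange &> (rh));
    default:
      // Unrecognized pattern: behave like the default routine, no trap.
      return false;
    }
}

// Example operator handling only FLOAT = FLOAT + FLOAT.
struct FloatAdd : Op
{
  using Op::fold_range;
  bool fold_range (FRange &r, const FRange &lh,
                   const FRange &rh) const override
  { r.lo = lh.lo + rh.lo; r.hi = lh.hi + rh.hi; return true; }
};
```

A caller holding only VRange references can call fold without first checking which types the operator supports; an unsupported combination simply yields false.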


This will make adding additional range types much simpler going 
forward, and alleviates the need to check for supported types before 
invoking routines like fold_range.


As part of the rework, this patch also simplifies range_op_handler to 
now only contain a single pointer to a range_operator instead of 2 
pointers and a flag.   This is enabled by the previous patch which 
unifies all range operators to one class.


An added bonus is a bit of a compile time improvement for VRP and 
threading, as well as other clients due to less conditional checks at 
dispatch time. It only amounts to about a 0.75% improvement in those 
passes for the moment... but every little bit helps.


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew
From f36f25792b3cb0b9067f318dd4d5c968f75a5c3d Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 31 May 2023 13:10:31 -0400
Subject: [PATCH 4/4] Provide a new dispatch mechanism for range-ops.

Simplify range_op_handler to have a single range_operator pointer and
provide a more flexible dispatch mechanism for calls via generic vrange
classes.   This is more extensible for adding new classes of range support.
Any unsupported dispatch patterns will simply return FALSE now rather
than generating compile time exceptions, alleviating the need to
constantly check for supported types.

	* gimple-range-op.cc
	(gimple_range_op_handler::gimple_range_op_handler): Adjust.
	(gimple_range_op_handler::maybe_builtin_call): Adjust.
	* gimple-range-op.h (operand1, operand2): Use m_operator.
	* range-op.cc (integral_table, pointer_table): Relocate.
	(get_op_handler): Rename from get_handler and handle all types.
	(range_op_handler::range_op_handler): Relocate.
	(range_op_handler::set_op_handler): Relocate and adjust.
	(range_op_handler::range_op_handler): Relocate.
	(dispatch_trio): New.
	(RO_III, RO_IFI, RO_IFF, RO_FFF, RO_FIF, RO_FII): New consts.
	(range_op_handler::dispatch_kind): New.
	(range_op_handler::fold_range): Relocate and Use new dispatch value.
	(range_op_handler::op1_range): Ditto.
	(range_op_handler::op2_range): Ditto.
	(range_op_handler::lhs_op1_relation): Ditto.
	(range_op_handler::lhs_op2_relation): Ditto.
	(range_op_handler::op1_op2_relation): Ditto.
	(range_op_handler::set_op_handler): Use m_operator member.
	* range-op.h (range_op_handler::operator bool): Use m_operator.
	(range_op_handler::dispatch_kind): New.
	(range_op_handler::m_valid): Delete.
	(range_op_handler::m_int): Delete
	(range_op_handler::m_float): Delete
	(range_op_handler::m_operator): New.
	(range_op_table::operator[]): Relocate from .cc file.
	(range_op_table::set): Ditto.
	* value-range.h (class vrange): Make range_op_handler a friend.
---
 gcc/gimple-range-op.cc |  84 +++-
 gcc/gimple-range-op.h  |   4 +-
 gcc/range-op.cc| 470 ++---
 gcc/range-op.h |  27 ++-
 gcc/value-range.h  |   1 +
 5 files changed, 306 insertions(+), 280 deletions(-)

diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 293d76402e1..b6b10e47b78 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -144,7 +144,7 @@ gimple_range_op_handler::gimple_range_op_handler (gimple *s)
   if (type)
 set_op_handler (code, type);
 
-  if (m_valid)
+  if (m_operator)
 switch (gimple_code (m_stmt))
   {
 	case GIMPLE_COND:
@@ -152,7 +152,7 @@ gimple_range_op_handler::gimple_range_op_handler (gimple *s)
 	  m_op2 = gimple_cond_rhs (m_stmt);
 	  // Check that operands are supported types.  One check is enough.
 	  if (!Value_Range::supports_type_p (TREE_TYPE (m_op1)))
-	m_valid = false;
+	m_operator = NULL;
 	  return;
 	case GIMPLE_ASSIGN:
 	  m_op1 = gimple_range_base_of_assignment (m_stmt);
@@ -171,7 +171,7 @@ gimple_range_op_handler::gimple_range_op_handler (gimple *s)
 	m_op2 = gimple_assign_rhs2 (m_stmt);
 	  // Check that operands are supported types.  One check is enough.
 	  if ((m_op1 && !Value_Range::supports_type_p (TREE_TYPE (m_op1
-	m_valid = false;
+	m_operator = NULL;
 	  return;
 	default:
 	  gcc_unreachable ();
@@ -1193,7 +1193,6 @@ gimple_range_op_handler::maybe_non_standard ()
   {
 	case WIDEN_MULT_EXPR:
 	{
-	  m_valid = false;
 	  m_op1 = gimple_assign_rhs1 (m_stmt);
 	  m_op2 = gimple_assign_rhs2 

[COMMITTED 2/4] - Remove tree_code from range-operator.

2023-06-08 Thread Andrew MacLeod via Gcc-patches
Range_operator had a tree code added last release to facilitate bitmask 
operations.  This was intended to be a temporary change until we could 
figure out something more strategic going forward.


This patch removes the tree_code and replaces it with a virtual routine 
to perform the masking. Each of the affected tree codes operators now 
call the bitmask routine via a virtual function.  At some point we may 
want to consolidate the code that CCP is using so that it resides in the 
range_operator, but the extensive parameter list used by that CCP 
routine makes that prohibitive to do at the moment.
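
The shape of this change can be shown with a small standalone sketch. It is not the GCC code: Code, Range, RangeOp and the operator classes are hypothetical, and the helper update_known_bitmask here is a toy that only tracks bits known to be zero. The point is that the base class no longer stores a code; each operator that wants masking overrides a virtual hook and supplies its own code.

```cpp
#include <cassert>

// Hypothetical stand-in for tree_code.
enum Code { EQ, NE, AND };

// Toy range that only records which bits are known to be zero.
struct Range { unsigned long known_zero = 0; };

// Shared helper that still needs a code, in the spirit of
// update_known_bitmask.
void update_known_bitmask (Range &r, Code code, const Range &lh,
                           const Range &rh)
{
  switch (code)
    {
    case EQ:
    case NE:
      r.known_zero = ~1UL;   // a comparison yields 0 or 1
      break;
    case AND:
      // A bit that is zero in either operand is zero in the result.
      r.known_zero = lh.known_zero | rh.known_zero;
      break;
    }
}

class RangeOp
{
public:
  virtual ~RangeOp () = default;
  bool fold_range (Range &r, const Range &lh, const Range &rh) const
  {
    r = Range ();                 // the real range folding is omitted
    update_bitmask (r, lh, rh);   // no stored code needed any more
    return true;
  }
protected:
  // Default: this operator has nothing to say about bitmasks.
  virtual void update_bitmask (Range &, const Range &,
                               const Range &) const {}
};

class OpEqual : public RangeOp
{
  void update_bitmask (Range &r, const Range &lh,
                       const Range &rh) const override
  { update_known_bitmask (r, EQ, lh, rh); }
};

class OpBitAnd : public RangeOp
{
  void update_bitmask (Range &r, const Range &lh,
                       const Range &rh) const override
  { update_known_bitmask (r, AND, lh, rh); }
};

// An operator with no override keeps the empty default.
class OpUnknown : public RangeOp {};
```

Operators that previously existed only to carry a different code can then share a single instance, since nothing in the object depends on the code any longer.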


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew


From c5065669a36ba0c26841cb108d32f03058757e85 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 31 May 2023 10:55:28 -0400
Subject: [PATCH 2/4] Remove tree_code from range-operator.

Range_operator had a tree code added last release to facilitate
bitmask operations.  This removes the tree_code and replaces it with a
virtual routine to perform the masking.  Remove any duplicate instances
which are no longer needed.

	* range-op.cc (range_operator::fold_range): Call virtual routine.
	(range_operator::update_bitmask): New.
	(operator_equal::update_bitmask): New.
	(operator_not_equal::update_bitmask): New.
	(operator_lt::update_bitmask): New.
	(operator_le::update_bitmask): New.
	(operator_gt::update_bitmask): New.
	(operator_ge::update_bitmask): New.
	(operator_ge::update_bitmask): New.
	(operator_plus::update_bitmask): New.
	(operator_minus::update_bitmask): New.
	(operator_pointer_diff::update_bitmask): New.
	(operator_min::update_bitmask): New.
	(operator_max::update_bitmask): New.
	(operator_mult::update_bitmask): New.
	(operator_div:operator_div):New.
	(operator_div::update_bitmask): New.
	(operator_div::m_code): New member.
	(operator_exact_divide::operator_exact_divide): New constructor.
	(operator_lshift::update_bitmask): New.
	(operator_rshift::update_bitmask): New.
	(operator_bitwise_and::update_bitmask): New.
	(operator_bitwise_or::update_bitmask): New.
	(operator_bitwise_xor::update_bitmask): New.
	(operator_trunc_mod::update_bitmask): New.
	(op_ident, op_unknown, op_ptr_min_max): New.
	(op_nop, op_convert): Delete.
	(op_ssa, op_paren, op_obj_type): Delete.
	(op_realpart, op_imagpart): Delete.
	(op_ptr_min, op_ptr_max): Delete.
	(pointer_plus_operator:update_bitmask): New.
	(range_op_table::set): Do not use m_code.
	(integral_table::integral_table): Adjust to single instances.
	* range-op.h (range_operator::range_operator): Delete.
	(range_operator::m_code): Delete.
	(range_operator::update_bitmask): New.
---
 gcc/range-op.cc | 110 +---
 gcc/range-op.h  |   6 +--
 2 files changed, 79 insertions(+), 37 deletions(-)

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 3ab2c665901..2deca3bac93 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -286,7 +286,7 @@ range_operator::fold_range (irange , tree type,
 	break;
 	}
   op1_op2_relation_effect (r, type, lh, rh, rel);
-  update_known_bitmask (r, m_code, lh, rh);
+  update_bitmask (r, lh, rh);
   return true;
 }
 
@@ -298,7 +298,7 @@ range_operator::fold_range (irange , tree type,
   wi_fold_in_parts (r, type, lh.lower_bound (), lh.upper_bound (),
 			rh.lower_bound (), rh.upper_bound ());
   op1_op2_relation_effect (r, type, lh, rh, rel);
-  update_known_bitmask (r, m_code, lh, rh);
+  update_bitmask (r, lh, rh);
   return true;
 }
 
@@ -316,12 +316,12 @@ range_operator::fold_range (irange , tree type,
 	if (r.varying_p ())
 	  {
 	op1_op2_relation_effect (r, type, lh, rh, rel);
-	update_known_bitmask (r, m_code, lh, rh);
+	update_bitmask (r, lh, rh);
 	return true;
 	  }
   }
   op1_op2_relation_effect (r, type, lh, rh, rel);
-  update_known_bitmask (r, m_code, lh, rh);
+  update_bitmask (r, lh, rh);
   return true;
 }
 
@@ -387,6 +387,14 @@ range_operator::op1_op2_relation_effect (irange _range ATTRIBUTE_UNUSED,
   return false;
 }
 
+// Apply any known bitmask updates based on this operator.
+
+void
+range_operator::update_bitmask (irange &, const irange &,
+   const irange &) const
+{
+}
+
 // Create and return a range from a pair of wide-ints that are known
 // to have overflowed (or underflowed).
 
@@ -562,6 +570,8 @@ public:
 			  const irange ,
 			  relation_trio = TRIO_VARYING) const;
   virtual relation_kind op1_op2_relation (const irange ) const;
+  void update_bitmask (irange &r, const irange &lh, const irange &rh) const
+{ update_known_bitmask (r, EQ_EXPR, lh, rh); }
 } op_equal;
 
 // Check if the LHS range indicates a relation between OP1 and OP2.
@@ -682,6 +692,8 @@ public:
 			  const irange ,
 			  relation_trio = TRIO_VARYING) const;
   virtual relation_kind op1_op2_relation (const irange ) const;
+  void update_bitmask (irange &r, const irange &lh, const irange &rh) const
+{ update_known_bitmask (r, NE_EXPR, lh, rh); }
 } op_not_equal;
 

[COMMITTED 1/4] Fix floating point bug in fold_range.

2023-06-08 Thread Andrew MacLeod via Gcc-patches
We currently do not have any floating point operators where operand 1 is 
a different type than the LHS. When we eventually do there is a bug in 
fold_range. If either operand is a known NAN, it returns a NAN of the 
type of operand 1 instead of the result type.


This patch sets it to the correct type.
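
A reduced, standalone sketch of the issue, assuming a hypothetical operator whose result type differs from the type of operand 1. FType and this FRange are invented for the example and are far simpler than the real frange.

```cpp
#include <cassert>

// Hypothetical type tags; real code would pass a tree type.
enum class FType { Float32, Float64 };

// Hypothetical, heavily reduced stand-in for frange.
struct FRange
{
  FType type = FType::Float64;
  bool is_nan = false;
  void set_nan (FType t) { type = t; is_nan = true; }
  bool known_isnan () const { return is_nan; }
};

// TYPE is the type of the result R, which may differ from op1's type.
bool fold_range (FRange &r, FType type, const FRange &op1,
                 const FRange &op2)
{
  if (op1.known_isnan () || op2.known_isnan ())
    {
      // The buggy form was r.set_nan (op1.type), which is only right
      // when op1 happens to share the result type.
      r.set_nan (type);
      return true;
    }
  r.type = type;
  r.is_nan = false;
  return true;
}
```

With a Float32 NaN operand and a Float64 result type, the result is a NaN of type Float64, which is what the one-line change in the patch arranges.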

Bootstraps on build-x86_64-pc-linux-gnu with no regressions. Pushed.

Andrew

From ff0ef34aa04f7767933541f58f016600a3462c84 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 7 Jun 2023 14:03:35 -0400
Subject: [PATCH 1/4] Fix floating point bug in fold_range.

We currently do not have any floating point operators where operand 1 is
a different type than the LHS. When we eventually do there is a bug
in fold_range. If either operand is a known NAN, it returns a NAN
of the type of operand 1 instead of the result type.

	* range-op-float.cc (range_operator_float::fold_range): Return
	NAN of the result type.
---
 gcc/range-op-float.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index a99a6b01ed8..af598b60a79 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -57,7 +57,7 @@ range_operator_float::fold_range (frange , tree type,
 return true;
   if (op1.known_isnan () || op2.known_isnan ())
 {
-  r.set_nan (op1.type ());
+  r.set_nan (type);
   return true;
 }
 
-- 
2.40.1



Fwd: [PATCH][RFC] c++: Accept elaborated-enum-base in system headers

2023-06-08 Thread Iain Sandoe



> Begin forwarded message:
> 
> From: Jason Merrill 
> Subject: Re: [PATCH][RFC] c++: Accept elaborated-enum-base in system headers
> Date: 8 June 2023 at 19:06:36 BST
> To: Alex Coplan , gcc-patches@gcc.gnu.org
> Cc: Nathan Sidwell , Iain Sandoe 
> 
> On 6/8/23 07:06, Alex Coplan wrote:
>> Hi,
>> macOS SDK headers using the CF_ENUM macro can expand to invalid C++ code
>> of the form:
>> typedef enum T : BaseType T;
>> i.e. an elaborated-type-specifier with an additional enum-base.
>> Upstream LLVM can be made to accept the above construct with
>> -Wno-error=elaborated-enum-base.
> 
> I guess we might as well follow that example, and so instead of this check:
> 
>> +   || (underlying_type && !in_system_header_at (colon_loc)))
> 
> Make the below an on-by-default pedwarn using OPT_Welaborated_enum_base, and 
> don't return error_mark_node.

I was also wondering about (for this and other reasons) a -fclang-compat to put 
some of these things behind (since std=clang++NN is not really going to work to 
describe other non-standard extensions etc. since most are not synchronised to 
std revisions.)

Iain

> 
>> +  cp_parser_commit_to_tentative_parse (parser);
>> +  error_at (colon_loc,
>> +"declaration of enumeration with "
>> +"fixed underlying type and no enumerator list is "
>> +"only permitted as a standalone declaration");
> 
> 



Re: [PATCH] analyzer: Standalone OOB-warning [PR109437, PR109439]

2023-06-08 Thread Benjamin Priour via Gcc-patches
Hi,

Yes, of course. I tested many days ago, since regtesting takes several days
on my box; I should have retested!
I got an account for the compile farm today, so I'm on it immediately.
I also see a divergence in the warnings on my box.

Thanks for the report!
Sincerely sorry,
Benjamin.

On Thu, Jun 8, 2023 at 7:53 PM Maxim Kuvyrkov 
wrote:

> > On Jun 6, 2023, at 15:48, Benjamin Priour via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> >
> > From: Benjamin Priour 
> >
> > This patch enhances -Wanalyzer-out-of-bounds that is no longer paired
> > with a -Wanalyzer-use-of-uninitialized-value on out-of-bounds-read.
> >
> > This also fixes PR analyzer/109437.
> > Before there could always be at most one OOB-read warning per frame
> because
> > -Wanalyzer-use-of-uninitialized-value always terminates the analysis
> > path.
> >
> > PR analyzer/109439
> >
> > gcc/analyzer/ChangeLog:
> >
> > * bounds-checking.cc (region_model::check_symbolic_bounds):
> >  Returns whether the BASE_REG region access was OOB.
> > (region_model::check_region_bounds): Likewise.
> > * region-model.cc (region_model::get_store_value): Creates an
> >  unknown svalue on OOB-read access to REG.
> > (region_model::check_region_access): Returns whether an
> > unknown svalue needs be created.
> > (region_model::check_region_for_read): Passes
> > check_region_access return value.
> > * region-model.h: Update prior function definitions.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/analyzer/out-of-bounds-2.c: Cleaned test for
> >  uninitialized-value warning.
> > * gcc.dg/analyzer/out-of-bounds-5.c: Likewise.
> > * gcc.dg/analyzer/pr101962.c: Likewise.
> > * gcc.dg/analyzer/realloc-5.c: Likewise.
> > * gcc.dg/analyzer/pr109439.c: New test.
> > ---
>
> Hi Benjamin,
>
> This patch makes two tests fail on arm-linux-gnueabihf.  Probably, they
> need to be updated as well.  Would you please investigate?  Let me know if
> it doesn't easily reproduce for you, and I'll help.
>
> === g++ tests ===
>
> Running g++:g++.dg/analyzer/analyzer.exp ...
> FAIL: g++.dg/analyzer/pr100244.C -std=c++14 (test for warnings, line 17)
> FAIL: g++.dg/analyzer/pr100244.C -std=c++17 (test for warnings, line 17)
> FAIL: g++.dg/analyzer/pr100244.C -std=c++20 (test for warnings, line 17)
>
> === gcc tests ===
>
> Running gcc:gcc.dg/analyzer/analyzer.exp ...
> FAIL: gcc.dg/analyzer/pr101962.c (test for warnings, line 19)
>
> Thanks,
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org
>
>
>
> > gcc/analyzer/bounds-checking.cc   | 30 +--
> > gcc/analyzer/region-model.cc  | 22 +-
> > gcc/analyzer/region-model.h   |  8 ++---
> > .../gcc.dg/analyzer/out-of-bounds-2.c |  1 -
> > .../gcc.dg/analyzer/out-of-bounds-5.c |  2 --
> > gcc/testsuite/gcc.dg/analyzer/pr101962.c  |  1 -
> > gcc/testsuite/gcc.dg/analyzer/pr109439.c  | 12 
> > gcc/testsuite/gcc.dg/analyzer/realloc-5.c |  1 -
> > 8 files changed, 51 insertions(+), 26 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr109439.c
> >
> > diff --git a/gcc/analyzer/bounds-checking.cc
> b/gcc/analyzer/bounds-checking.cc
> > index 3bf542a8eba..479b8e4b88d 100644
> > --- a/gcc/analyzer/bounds-checking.cc
> > +++ b/gcc/analyzer/bounds-checking.cc
> > @@ -767,9 +767,11 @@ public:
> >   }
> > };
> >
> > -/* Check whether an access is past the end of the BASE_REG.  */
> > +/* Check whether an access is past the end of the BASE_REG.
> > +  Return TRUE if the access was valid, FALSE otherwise.
> > +*/
> >
> > -void
> > +bool
> > region_model::check_symbolic_bounds (const region *base_reg,
> > const svalue *sym_byte_offset,
> > const svalue *num_bytes_sval,
> > @@ -800,6 +802,7 @@ region_model::check_symbolic_bounds (const region
> *base_reg,
> >  offset_tree,
> >  num_bytes_tree,
> >  capacity_tree));
> > +return false;
> >  break;
> > case DIR_WRITE:
> >  ctxt->warn (make_unique (base_reg,
> > @@ -807,9 +810,11 @@ region_model::check_symbolic_bounds (const region
> *base_reg,
> > offset_tree,
> > num_bytes_tree,
> > capacity_tree));
> > +return false;
> >  break;
> > }
> > }
> > +  return true;
> > }
> >
> > static tree
> > @@ -822,9 +827,11 @@ maybe_get_integer_cst_tree (const svalue *sval)
> >   return NULL_TREE;
> > }
> >
> > -/* May complain when the access on REG is out-of-bounds.  */
> > +/* May complain when the access on REG is out-of-bounds.
> > +   Return TRUE if the access was valid, FALSE otherwise.
> > +*/
> >
> > -void
> > +bool
> > region_model::check_region_bounds (const region *reg,
> >   enum access_direction dir,
> >   region_model_context *ctxt) const
> > @@ -839,14 +846,14 @@ region_model::check_region_bounds (const region
> *reg,
> >  (e.g. because the analyzer did not see previous offsets on the
> latter,
> >  it might think that a negative access is before the buffer).  */
> >   if (base_reg->symbolic_p ())
> > -  

Re: [PATCH] Fortran: add Fortran 2018 IEEE_{MIN,MAX} functions

2023-06-08 Thread Thomas Koenig via Gcc-patches

Hi Steve,


On Thu, Jun 08, 2023 at 12:17:02PM +0200, Thomas Koenig wrote:

[...]


Thanks for the explanation.  As I likely will not use a POWER-based
system, I only loosely followed the discussion.  I don't remember
if ibm double-double is REAL(16) or REAL(17).  If ieee 128-bit is
REAL(16), then everything should (I believe) be okay.


From a user standpoint, REAL(16) is always used. We only use the 17
as a marker in the library, and to be able to have library versions
of IBM long double co-reside with IEEE long double.


There is a virtual POWER machine at OSUL dedicated to the IEEE QP
gfortran effort. It hasn't been used for some time, but it's still
active. I just bootstrapped trunk there and ran all the IEEE tests from the
testsuite manually, with

$ for a in *.f90; do echo "Testing $a"; gfortran -o $a.exe -fno-range-check
-mcpu=power9 -mabi=ieeelongdouble -static-libgfortran $a signaling_1_c.c
signaling_2_c.c ; ./$a.exe ; done 2>&1 | grep -v command-line
Testing fma_1.f90


These could be misleading.  gfortran has pre-processor tokens
for REAL(10) and REAL(16).   If __GFC_REAL_16__ isn't defined
the ieee testing is skipped.


Hmm... need to check.  With the recently-built compiler:

$ cat tst.F90
program memain
#if __GFC_REAL_16__
  print *,"__GFC_REAL_16 found"
#endif
#if __GFC_REAL_17__
  print *,"__GFC_REAL_17 found"
#endif
  print *,"digits is ",digits(1._16)
end program memain
$ gfortran -static-libgfortran tst.F90 && ./a.out
 __GFC_REAL_16 found
 digits is  106
$ gfortran -static-libgfortran -mabi=ieeelongdouble -mcpu=power9 tst.F90 
&& ./a.out

 __GFC_REAL_16 found
 digits is  113

Looks clean.

[...]


Should we have a __GFC_REAL_17__?


I don't think we need it - REAL(KIND=17) is not supported in
the compiler (we discussed and rejected that), and people
who mix IBM long double and IEEE long double will have no
joy with their programs; they need to recompile everything.

But we may have to do something about the files in the
libgfortran/ieee subdirectory.

Best regards

Thomas


Re: [PATCH][RFC] c++: Accept elaborated-enum-base in system headers

2023-06-08 Thread Jason Merrill via Gcc-patches

On 6/8/23 07:06, Alex Coplan wrote:

Hi,

macOS SDK headers using the CF_ENUM macro can expand to invalid C++ code
of the form:

typedef enum T : BaseType T;

i.e. an elaborated-type-specifier with an additional enum-base.
Upstream LLVM can be made to accept the above construct with
-Wno-error=elaborated-enum-base.


I guess we might as well follow that example, and so instead of this check:


+  || (underlying_type && !in_system_header_at (colon_loc)))


Make the below an on-by-default pedwarn using OPT_Welaborated_enum_base, 
and don't return error_mark_node.



+ cp_parser_commit_to_tentative_parse (parser);
+ error_at (colon_loc,
+   "declaration of enumeration with "
+   "fixed underlying type and no enumerator list is "
+   "only permitted as a standalone declaration");





Re: [PATCH v6 0/4] P1689R5 support

2023-06-08 Thread Maxim Kuvyrkov via Gcc-patches
> On Jun 7, 2023, at 00:50, Ben Boeckel via Gcc-patches 
>  wrote:
> 
> Hi,
> 
> This patch series adds initial support for ISO C++'s [P1689R5][], a
> format for describing C++ module requirements and provisions based on
> the source code. This is required because compiling C++ with modules is
> not embarrassingly parallel and need to be ordered to ensure that
> `import some_module;` can be satisfied in time by making sure that any
> TU with `export import some_module;` is compiled first.

Hi Ben,

This patch series causes ICEs on arm-linux-gnueabihf.  Would you please 
investigate?  Please let me know if you need any help reproducing these.

=== g++ tests ===

Running g++:g++.dg/modules/modules.exp ...
FAIL: g++.dg/modules/ben-1_a.C -std=c++17 (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/ben-1_a.C -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/ben-1_a.C -std=c++2a (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/ben-1_a.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/ben-1_a.C -std=c++2b (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/ben-1_a.C -std=c++2b (test for excess errors)
FAIL: g++.dg/modules/ben-1_a.C module-cmi =partitions/module-import.mod (partitions/module-import.mod)
FAIL: g++.dg/modules/ben-1_b.C -std=c++17 (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/ben-1_b.C -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/ben-1_b.C -std=c++2a (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/ben-1_b.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/ben-1_b.C -std=c++2b (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/ben-1_b.C -std=c++2b (test for excess errors)
FAIL: g++.dg/modules/ben-1_b.C module-cmi =module.mod (module.mod)
FAIL: g++.dg/modules/gc-2_a.C -std=c++17 (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/gc-2_a.C -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/gc-2_a.C -std=c++2a (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/gc-2_a.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/gc-2_a.C -std=c++2b (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/gc-2_a.C -std=c++2b (test for excess errors)
FAIL: g++.dg/modules/gc-2_a.C module-cmi =map-1_a.nms (map-1_a.nms)
UNRESOLVED: g++.dg/modules/map-1 -std=c++17 execute
UNRESOLVED: g++.dg/modules/map-1 -std=c++17 link
UNRESOLVED: g++.dg/modules/map-1 -std=c++2a execute
UNRESOLVED: g++.dg/modules/map-1 -std=c++2a link
UNRESOLVED: g++.dg/modules/map-1 -std=c++2b execute
UNRESOLVED: g++.dg/modules/map-1 -std=c++2b link
FAIL: g++.dg/modules/map-1_a.C -std=c++17 (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/map-1_a.C -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/map-1_a.C -std=c++2a (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/map-1_a.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/map-1_a.C -std=c++2b (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/map-1_a.C -std=c++2b (test for excess errors)
FAIL: g++.dg/modules/map-1_a.C module-cmi =map-1_a.nms (map-1_a.nms)
FAIL: g++.dg/modules/map-1_b.C -std=c++17 (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/map-1_b.C -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/map-1_b.C -std=c++2a (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/map-1_b.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/map-1_b.C -std=c++2b (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/map-1_b.C -std=c++2b (test for excess errors)
FAIL: g++.dg/modules/map-2.C -std=c++17 at line 8 (test for errors, line 7)
FAIL: g++.dg/modules/map-2.C -std=c++17 at line 9 (test for errors, line )
FAIL: g++.dg/modules/map-2.C -std=c++17 (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/map-2.C -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/map-2.C -std=c++2a at line 8 (test for errors, line 7)
FAIL: g++.dg/modules/map-2.C -std=c++2a at line 9 (test for errors, line )
FAIL: g++.dg/modules/map-2.C -std=c++2a (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/map-2.C -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/map-2.C -std=c++2b at line 8 (test for errors, line 7)
FAIL: g++.dg/modules/map-2.C -std=c++2b at line 9 (test for errors, line )
FAIL: g++.dg/modules/map-2.C -std=c++2b (internal compiler error: Segmentation fault)
FAIL: g++.dg/modules/map-2.C -std=c++2b (test for excess errors)
===

Thanks,

--
Maxim Kuvyrkov
https://www.linaro.org





> 
> [P1689R5]: https://isocpp.org/files/papers/P1689R5.html
> 
> I've also added patches to include imported module CMI files and the
> module mapper file as dependencies of the compilation. I briefly looked
> into adding dependencies on response files as well, but that appeared to
> need some code contortions to have a `class mkdeps` available before
> parsing the 

[PATCH] doc: Clarification for -Wmissing-field-initializers

2023-06-08 Thread Marek Polacek via Gcc-patches
The manual is incorrect in saying that the option does not warn
about designated initializers, which it does in C++.  Whether the
divergence in behavior is desirable is another thing, but let's
at least make the manual match the reality.

PR c/39589
PR c++/96868

gcc/ChangeLog:

* doc/invoke.texi: Clarify that -Wmissing-field-initializers doesn't
warn about designated initializers in C only.
---
 gcc/doc/invoke.texi | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6d08229ce40..0870f7aff93 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -9591,8 +9591,9 @@ struct s @{ int f, g, h; @};
 struct s x = @{ 3, 4 @};
 @end smallexample
 
-This option does not warn about designated initializers, so the following
-modification does not trigger a warning:
+@c It's unclear if this behavior is desirable.  See PR39589 and PR96868.
+In C this option does not warn about designated initializers, so the
+following modification does not trigger a warning:
 
 @smallexample
 struct s @{ int f, g, h; @};

base-commit: 1379ae33e05c28d705f3c69a3f6c774bf6e83136
-- 
2.40.1



Re: [PATCH] analyzer: Standalone OOB-warning [PR109437, PR109439]

2023-06-08 Thread Maxim Kuvyrkov via Gcc-patches
> On Jun 6, 2023, at 15:48, Benjamin Priour via Gcc-patches 
>  wrote:
> 
> From: Benjamin Priour 
> 
> This patch enhances -Wanalyzer-out-of-bounds, which is no longer paired
> with a -Wanalyzer-use-of-uninitialized-value on out-of-bounds-read.
> 
> This also fixes PR analyzer/109437.
> Before there could always be at most one OOB-read warning per frame because
> -Wanalyzer-use-of-uninitialized-value always terminates the analysis
> path.
> 
> PR analyzer/109439
> 
> gcc/analyzer/ChangeLog:
> 
> * bounds-checking.cc (region_model::check_symbolic_bounds):
>  Returns whether the BASE_REG region access was OOB.
> (region_model::check_region_bounds): Likewise.
> * region-model.cc (region_model::get_store_value): Creates an
>  unknown svalue on OOB-read access to REG.
> (region_model::check_region_access): Returns whether an
> unknown svalue needs be created.
> (region_model::check_region_for_read): Passes
> check_region_access return value.
> * region-model.h: Update prior function definitions.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/analyzer/out-of-bounds-2.c: Cleaned test for
>  uninitialized-value warning.
> * gcc.dg/analyzer/out-of-bounds-5.c: Likewise.
> * gcc.dg/analyzer/pr101962.c: Likewise.
> * gcc.dg/analyzer/realloc-5.c: Likewise.
> * gcc.dg/analyzer/pr109439.c: New test.
> ---

Hi Benjamin,

This patch makes two tests fail on arm-linux-gnueabihf.  Probably, they need to 
be updated as well.  Would you please investigate?  Let me know if it doesn't 
easily reproduce for you, and I'll help. 

=== g++ tests ===

Running g++:g++.dg/analyzer/analyzer.exp ...
FAIL: g++.dg/analyzer/pr100244.C -std=c++14 (test for warnings, line 17)
FAIL: g++.dg/analyzer/pr100244.C -std=c++17 (test for warnings, line 17)
FAIL: g++.dg/analyzer/pr100244.C -std=c++20 (test for warnings, line 17)

=== gcc tests ===

Running gcc:gcc.dg/analyzer/analyzer.exp ...
FAIL: gcc.dg/analyzer/pr101962.c (test for warnings, line 19)

Thanks,

--
Maxim Kuvyrkov
https://www.linaro.org



> gcc/analyzer/bounds-checking.cc   | 30 +--
> gcc/analyzer/region-model.cc  | 22 +-
> gcc/analyzer/region-model.h   |  8 ++---
> .../gcc.dg/analyzer/out-of-bounds-2.c |  1 -
> .../gcc.dg/analyzer/out-of-bounds-5.c |  2 --
> gcc/testsuite/gcc.dg/analyzer/pr101962.c  |  1 -
> gcc/testsuite/gcc.dg/analyzer/pr109439.c  | 12 
> gcc/testsuite/gcc.dg/analyzer/realloc-5.c |  1 -
> 8 files changed, 51 insertions(+), 26 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr109439.c
> 
> diff --git a/gcc/analyzer/bounds-checking.cc b/gcc/analyzer/bounds-checking.cc
> index 3bf542a8eba..479b8e4b88d 100644
> --- a/gcc/analyzer/bounds-checking.cc
> +++ b/gcc/analyzer/bounds-checking.cc
> @@ -767,9 +767,11 @@ public:
>   }
> };
> 
> -/* Check whether an access is past the end of the BASE_REG.  */
> +/* Check whether an access is past the end of the BASE_REG.
> +  Return TRUE if the access was valid, FALSE otherwise.
> +*/
> 
> -void
> +bool
> region_model::check_symbolic_bounds (const region *base_reg,
> const svalue *sym_byte_offset,
> const svalue *num_bytes_sval,
> @@ -800,6 +802,7 @@ region_model::check_symbolic_bounds (const region 
> *base_reg,
>  offset_tree,
>  num_bytes_tree,
>  capacity_tree));
> +return false;
>  break;
> case DIR_WRITE:
>  ctxt->warn (make_unique (base_reg,
> @@ -807,9 +810,11 @@ region_model::check_symbolic_bounds (const region 
> *base_reg,
> offset_tree,
> num_bytes_tree,
> capacity_tree));
> +return false;
>  break;
> }
> }
> +  return true;
> }
> 
> static tree
> @@ -822,9 +827,11 @@ maybe_get_integer_cst_tree (const svalue *sval)
>   return NULL_TREE;
> }
> 
> -/* May complain when the access on REG is out-of-bounds.  */
> +/* May complain when the access on REG is out-of-bounds.
> +   Return TRUE if the access was valid, FALSE otherwise.
> +*/
> 
> -void
> +bool
> region_model::check_region_bounds (const region *reg,
>   enum access_direction dir,
>   region_model_context *ctxt) const
> @@ -839,14 +846,14 @@ region_model::check_region_bounds (const region *reg,
>  (e.g. because the analyzer did not see previous offsets on the latter,
>  it might think that a negative access is before the buffer).  */
>   if (base_reg->symbolic_p ())
> -return;
> +return true;
> 
>   /* Find out how many bytes were accessed.  */
>   const svalue *num_bytes_sval = reg->get_byte_size_sval (m_mgr);
>   tree num_bytes_tree = maybe_get_integer_cst_tree (num_bytes_sval);
>   /* Bail out if 0 bytes are accessed.  */
>   if (num_bytes_tree && zerop (num_bytes_tree))
> -return;
> +return true;
> 
>   /* Get the capacity of the buffer.  */
>   const svalue *capacity = get_capacity (base_reg);
> @@ -877,13 +884,13 @@ region_model::check_region_bounds (const region *reg,
> }
>   else
> byte_offset_sval = reg_offset.get_symbolic_byte_offset ();
> -   

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Mikael Morin

On 08/06/2023 at 11:58, Tamar Christina via Gcc-patches wrote:

Hi,

New version of the patch, I've omitted the explanation again 

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Any feedback?


Hello,

this is not my area of expertise, but I saw the following:


+  /* [ns..ns + len) should be a string with the id of the rtx to match
+ i.e. if rtx is the relevant match_operand or match_scratch then
+ [ns..ns + len) should equal itoa (XINT (rtx, 0)), and if set_attr then
+ [ns..ns + len) should equal XSTR (rtx, 0).  */
+  conlist (const char *ns, unsigned int len, bool numeric)
+  {
+/* Trim leading whitespaces.  */
+while (ISSPACE (*ns))
+  {
+   ns++;
+   len--;
+  }
+
+/* Trim trailing whitespace.  */
+for (int i = len - 1; i >= 0; i++, len--)
+  if (!ISSPACE (*ns))
+   break;
+
This for loop makes little sense to me.  Shouldn't the iteration step be 
i-- rather than i++ and the pointer dereference *(ns + i) rather than *ns?
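
To make the suggestion concrete, here is a minimal standalone sketch of the trimming logic as I would expect it to behave.  It is not the code from the patch: trim_span is a hypothetical helper, and std::isspace stands in for GCC's ISSPACE.  The trailing loop tests the last remaining character and only shrinks len.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the patch): returns the characters of
// [ns, ns + len) with leading and trailing whitespace removed.
static std::string
trim_span (const char *ns, std::size_t len)
{
  // Trim leading whitespace: advance the start and shrink the length.
  while (len > 0 && std::isspace (static_cast<unsigned char> (*ns)))
    {
      ns++;
      len--;
    }

  // Trim trailing whitespace: inspect the last remaining character,
  // i.e. ns[len - 1], and walk backwards by decrementing len.
  while (len > 0 && std::isspace (static_cast<unsigned char> (ns[len - 1])))
    len--;

  return std::string (ns, len);
}
```

Checking len > 0 in both loops also keeps an all-whitespace input from running off either end of the buffer.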


Re: [PATCH] Fortran: add Fortran 2018 IEEE_{MIN,MAX} functions

2023-06-08 Thread Steve Kargl via Gcc-patches
On Thu, Jun 08, 2023 at 12:17:02PM +0200, Thomas Koenig wrote:
> Hi together,
> 
> > > On 6/6/23 21:11, FX Coudert via Gcc-patches wrote:
> > > > Hi,
> > > > 
> > > > > I cannot see if there is proper support for kind=17 in your patch;
> > > > > at least the libgfortran/ieee/ieee_arithmetic.F90 part does not
> > > > > seem to have any related code.
> > > > 
> > > > Can real(kind=17) ever be an IEEE mode? If so, something seriously 
> > > > wrong happened, because the IEEE modules have no kind=17 mention in 
> > > > them anywhere.
> > > > 
> > > > Actually, where is the kind=17 documented?
> > > > 
> > > > FX
> > > 
> > > I was hoping for Thomas to come forward with some comment, as
> > > he was quite involved in related work.
> > > 
> > > There are several threads on IEEE128 for Power on the fortran ML
> > > e.g. around November/December 2021, January 2022.
> > > 
> > > I wasn't meaning to block your work, just wondering if the Power
> > > platform needs more attention here.
> > > 
> > 
> > % cd gcc/gccx/libgfortran
> > % grep HAVE_GFC_REAL_17 ieee/*
> > % troutmask:sgk[219] ls ieee
> > % ieee_arithmetic.F90 ieee_features.F90
> > % ieee_exceptions.F90 ieee_helper.c
> > 
> > There are zero hits for REAL(17) in the IEEE code.  If REAL(17)
> > is intended to be an IEEE-754 type, then it seems gfortran's
> > support was never added for it.  If anyone has access to a
> > power system, it's easy to test
> > 
> > program foo
> > use ieee_arithmetic
> > print *, ieee_support_datatype(1.e_17)
> > end program foo
> 
> The KIND=17 is a bit of a kludge.  It is not visible for
> user programs, they use KIND=16, but this is then translated
> to library calls as if it was KIND=17 if the IEEE 128-bit floats
> are selected:
> 
> $ cat ml.f90
> subroutine mm(a)
>   real(kind=16), dimension(:,:) :: a
>   print *,maxloc(a)
> end subroutine mm
> $ gfortran -S -mabi=ieeelongdouble ml.f90 && grep maxloc ml.s
> bl _gfortran_maxloc0_4_r17
> $ gfortran -S  ml.f90 && grep maxloc ml.s
> bl _gfortran_maxloc0_4_r16
> 
> On POWER, if IBM long double exists, it is GFC_REAL_16, with GFC_REAL_17
> being IEEE long double. Everywhere else, GFC_REAL_16 is IEEE long
> double.
> 
> If POWER gets the flag -mabi=ieeelongdouble, it uses IEEE long doubles.
> 
> If it gets the additional flag -mcpu=power9 or -mcpu=power10, it uses
> the hardware instructions for the arithmetic instead of library calls.
> 

(trimming for length)

Thanks for the explanation.  As I likely will not use a POWER-based
system, I only loosely followed the discussion.  I don't remember
if ibm double-double is REAL(16) or REAL(17).  If ieee 128-bit is
REAL(16), then everything should (I believe) be okay.

> There is a virtual POWER machine at OSUL dedicated to the IEEE QP
> gfortran effort. It hasn't been used for some time, but it's still
> active. I just bootstrapped trunk there and ran all the IEEE from the
> testsuite manually, with
> 
> $ for a in *.f90; do echo "Testing $a"; gfortran -o $a.exe -fno-range-check
> -mcpu=power9 -mabi=ieeelongdouble -static-libgfortran $a signaling_1_c.c
> signaling_2_c.c ; ./$a.exe ; done 2>&1 | grep -v command-line
> Testing fma_1.f90

These could be misleading.  gfortran has pre-processor tokens
for REAL(10) and REAL(16).   If __GFC_REAL_16__ isn't defined
the ieee testing is skipped.

% cd gcc/gccx/gcc/fortran/
% grep __GFC_REAL_ *
cpp.cc: cpp_define (cpp_in, "__GFC_REAL_10__=1");
cpp.cc: cpp_define (cpp_in, "__GFC_REAL_16__=1");
invoke.texi:@code{__GFC_REAL_10__}, and @code{__GFC_REAL_16__}.
% more cpp.cc
...
  /* Pre-defined macros for non-required REAL kind types.  */
  for (gfc_real_info *rtype = gfc_real_kinds; rtype->kind != 0; rtype++)
{
  if (rtype->kind == 10)
cpp_define (cpp_in, "__GFC_REAL_10__=1");
  if (rtype->kind == 16)
cpp_define (cpp_in, "__GFC_REAL_16__=1");
}
...

Should we have a __GFC_REAL_17__?

% cd ../testsuite/gfortran.dg/
% grep __GFC_REAL ieee/*
ieee/dec_math_1.f90:  ! Note however that if both __GFC_REAL_10__ and __GFC_REAL_16__ are defined,
ieee/dec_math_1.f90:#if defined(__GFC_REAL_10__)
ieee/dec_math_1.f90:#elif defined(__GFC_REAL_16__)
ieee/dec_math_1.f90:#ifdef __GFC_REAL_10__
ieee/dec_math_1.f90:#elif defined(__GFC_REAL_16__)
ieee/dec_math_1.f90:#ifdef __GFC_REAL_16__
ieee/dec_math_1.f90:#elif defined(__GFC_REAL_10__)
ieee/ieee_11.F90:#ifdef __GFC_REAL_10__
ieee/ieee_11.F90:#ifdef __GFC_REAL_16__

-- 
Steve


Re: [PATCH] rs6000: Add builtins for IEEE 128-bit floating point values

2023-06-08 Thread Carl Love via Gcc-patches


Kewen:
On Wed, 2023-06-07 at 17:36 +0800, Kewen.Lin wrote:
> Hi,
> 
> on 2023/6/7 03:54, Carl Love wrote:
> > On Mon, 2023-06-05 at 16:45 +0800, Kewen.Lin wrote:
> > > Hi Carl,
> > > 
> > > on 2023/5/2 23:52, Carl Love via Gcc-patches wrote:
> > > > GCC maintainers:
> > > > 
> > > > The following patch adds three builtins for inserting and
> > > > extracting
> > > > the
> > > > exponent and significand for an IEEE 128-bit floating point
> > > > values. 
> > > > The builtins are valid for Power 9 and Power 10.  
> > > 
> > > We already have:
> > > 
> > > unsigned long long int scalar_extract_exp (__ieee128 source);
> > > unsigned __int128 scalar_extract_sig (__ieee128 source);
> > > ieee_128 scalar_insert_exp (unsigned __int128 significand,
> > > unsigned long long int exponent);
> > > ieee_128 scalar_insert_exp (ieee_128 significand, unsigned long
> > > long
> > > int exponent);
> > > 
> > > you need to say something about the requirements or the
> > > justification
> > > for
> > > adding more, for this patch itself, some comments are inline
> > > below,
> > > thanks!
> > 
> > I implemented the patch based on a request for the builtins.  It
> > didn't
> > include any justification so I reached out to Steve Monroe who
> > requested the builtins to understand why he wanted them.  Here is
> > his
> > reply:
> > 
> >Basically there is no clean and performant way to transfer
> > between a
> >vector type and the ieee128 scalar, despite the fact that both
> >reside in vector registers. Also a union transfer does not work
> >correctly on most GCC versions (and will likely break again in
> > the
> >next release). I offer the long sad history of the IBM long
> > double
> >float runtime.
> 
> Thanks for clarifying this.  As the proposed changes, I think he
> meant
> to say "Basically there is no clean and performant way to transfer
> between
> a vector type and the scalar **types**". :) Because the proposed
> changes
> are:
>   scalar_extract_exp: unsigned long long => vector unsigned long long
>   scalar_extract_sig: unsigned __int128  => vector unsigned __int128
>   scalar_insert_exp: unsigned __int128 => vector unsigned __int128
>  unsigned long long => vector unsigned long long.
> 
> >Also there are __ieee128 operations that are provided by
> > builtins
> >for POWER9 but are not provided by libgcc (for POWER8).
> > 
> >Finally I can prove that a softfloat __ieee128 implementation
> > using
> >VMX integer operations, out-performs the current libgcc
> >implementation using DW GPRs.
> > 
> >The details are in the PVECLIB documentation
> >pveclib/vec__f128__ppc.h
> > 
> > 
> > > > The patch has been tested on both Power 9 and Power 10.
> > > > 
> > > > Please let me know if this patch is acceptable for
> > > > mainline.  Thanks.
> > > > 
> > > > Carl 
> > > > 
> > > > 
> > > > --
> > > > From a20cc81f98cce1140fc95775a7c25b55d1ca7cba Mon Sep 17
> > > > 00:00:00
> > > > 2001
> > > > From: Carl Love 
> > > > Date: Wed, 12 Apr 2023 17:46:37 -0400
> > > > Subject: [PATCH] rs6000: Add builtins for IEEE 128-bit floating
> > > > point values
> > > > 
> > > > Add support for the following builtins:
> > > > 
> > > >  __vector unsigned long long int __builtin_extractf128_exp
> > > > (__ieee128);
> > > 
> > > Could you make the name similar to the existing one?  The
> > > existing
> > > one
> > >   
> > >   unsigned long long int scalar_extract_exp (__ieee128 source);
> > > 
> > > has nothing like f128 on its name, this variant is just to change
> > > the
> > > return type to vector type, how about scalar_extract_exp_to_vec?
> > 
> > I changed the name  __builtin_extractf128_exp  to
> > __builtin_scalar_extract_exp_to_vec.
> > 
> > > >  __vector unsigned __int128 __builtin_extractf128_sig
> > > > (__ieee128);
> > > 
> > > Ditto.
> > 
> > I changed the name  __builtin_extractf128_sig to
> > __builtin_scalar_extract_sig_to_vec.
> > 
> > > >  __ieee128 __builtin_insertf128_exp (__vector unsigned
> > > > __int128,
> > > >  __vector unsigned long
> > > > long);
> > > 
> > > This one can just overload the existing scalar_insert_exp?
> > 
> > I tried making this one an overloaded version of
> > scalar_insert_exp.  However, the overload with the vector arguments
> > isn't recognized when I put the overload definition at the end of
> > the
> > list of overloads.  When I tried putting the vector version as the
> > first overloaded definition, I get an internal error
> > on  __builtin_vsx_scalar_insert_exp_q which has the same
> > arguments
> > types but as scalars not vectors.  Best I can tell, there is an
> > issue
> > with mixing scalar and vector arguments in an overloaded builtin.
> 
> No, it's not due to mixing scalar and vector arguments, I looked into
> this and found we have some special handling for this builtin in
> altivec_resolve_overloaded_builtin, see the 

[PATCH ver 3] rs6000: Add builtins for IEEE 128-bit floating point values

2023-06-08 Thread Carl Love via Gcc-patches
Kewen, GCC maintainers:

Version 3: I was able to get the overloaded version of scalar_insert_exp
to work, and the change to the xsxexpqp_f128_ define instruction to
work, with the suggestions from Kewen.

Version 2, I have addressed the various comments from Kewen.  I had
issues with adding an additional overloaded version of
scalar_insert_exp with vector arguments.  The overload infrastructure
didn't work with a mix of scalar and vector arguments.  I did rename
the __builtin_insertf128_exp to __builtin_vsx_scalar_insert_exp_qp to make
it similar to the existing builtin.  I also wasn't able to get the
suggested merge of xsxexpqp_f128_ with xsxexpqp_ to work so
I left the two simpler definitions.

The patch adds three new builtins to extract the significand and
exponent of an IEEE float 128-bit value where the builtin argument is a
vector.  Additionally, a builtin to insert the exponent into an IEEE
float 128-bit vector argument is added.  These builtins were requested
since there is no clean and optimal way to transfer between a vector
and a scalar IEEE 128 bit value.

The patch has been tested on Power 10 with no regressions.  Please let
me know if the patch is acceptable or not.  Thanks.

   Carl

---
rs6000: Add builtins for IEEE 128-bit floating point values

Add support for the following builtins:

 __vector unsigned long long int
 __builtin_scalar_extract_exp_to_vec (__ieee128);
 __vector unsigned __int128
 __builtin_scalar_extract_sig_to_vec (__ieee128);
 __ieee128 scalar_insert_exp (__vector unsigned __int128,
  __vector unsigned long long);

These builtins were requested since there is no clean and performant way to
transfer a value between a vector type and a scalar type, despite the fact
that they both reside in vector registers.

gcc/
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin):
Rename CODE_FOR_xsxexpqp_tf to CODE_FOR_xsxexpqp_tf_di.
Rename CODE_FOR_xsxexpqp_kf to CODE_FOR_xsxexpqp_kf_di.
* config/rs6000/rs6000-builtins.def (__builtin_extractf128_exp,
 __builtin_extractf128_sig, __builtin_insertf128_exp): Add new
builtin definitions.
Rename xsxexpqp_kf to xsxexpqp_kf_di.
* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
Add else if for MODE_VECTOR_INT. Update comments.
* config/rs6000/rs6000-overload.def
(__builtin_vec_scalar_insert_exp): Add new overload definition with
vector arguments.
* config/rs6000/vsx.md (VSEEQP_DI): New mode iterator.
Rename define_insn xsxexpqp_ to
sxexpqp__.
(xsxsigqp_f128_, xsiexpqpf_f128_): Add define_insn for
new builtins.
* doc/extend.texi (__builtin_extractf128_exp,
__builtin_extractf128_sig): Add documentation for new builtins.
(scalar_insert_exp): Add new overloaded builtin definition.

gcc/testsuite/
* gcc.target/powerpc/bfp/extract-exp-ieee128.c: New test case.
* gcc.target/powerpc/bfp/extract-sig-ieee128.c: New test case.
* gcc.target/powerpc/bfp/insert-exp-ieee128.c: New test case.
---
 gcc/config/rs6000/rs6000-builtin.cc   |  4 +-
 gcc/config/rs6000/rs6000-builtins.def | 11 ++-
 gcc/config/rs6000/rs6000-c.cc | 10 +-
 gcc/config/rs6000/rs6000-overload.def |  2 +
 gcc/config/rs6000/vsx.md  | 28 +-
 gcc/doc/extend.texi   |  9 ++
 .../powerpc/bfp/extract-exp-ieee128.c | 50 ++
 .../powerpc/bfp/extract-sig-ieee128.c | 57 
 .../powerpc/bfp/insert-exp-ieee128.c  | 91 +++
 9 files changed, 253 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-exp-ieee128.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/extract-sig-ieee128.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/bfp/insert-exp-ieee128.c

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 534698e7d3e..d99f0ae5dda 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -3326,8 +3326,8 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* 
subtarget */,
   case CODE_FOR_fmakf4_odd:
icode = CODE_FOR_fmatf4_odd;
break;
-  case CODE_FOR_xsxexpqp_kf:
-   icode = CODE_FOR_xsxexpqp_tf;
+  case CODE_FOR_xsxexpqp_kf_di:
+   icode = CODE_FOR_xsxexpqp_tf_di;
break;
   case CODE_FOR_xsxsigqp_kf:
icode = CODE_FOR_xsxsigqp_tf;
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index 638d0bc72ca..dcd4a393906 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2901,8 +2901,14 @@
   fpmath double __builtin_truncf128_round_to_odd (_Float128);
 TRUNCF128_ODD trunckfdf2_odd {}
 
+  vull __builtin_scalar_extract_exp_to_vec 

Re: [PATCH] RISC-V: Add Veyron V1 pipeline description

2023-06-08 Thread Kito Cheng via Gcc-patches
> > I'd very much like to see the condops go into GCC as well, but I've been
> > hesitant to move it forward myself.  We're still waiting on hardware and
> > it wasn't clear to me that we really had consensus agreement to move the
> > bits forward based on an announcement vs waiting on actual hardware
> > availability (based on the comments from Palmer when I upstreamed the
> > binutils bits).

My bad, apparently I grepped the wrong binutils tree, so I thought it wasn't
in the tree yet.  I don't have a strong opinion on that, so I would defer this
to Palmer.


RE: [PATCH v2] RISC-V: Add more test cases for RVV FP16

2023-06-08 Thread Li, Pan2 via Gcc-patches
Committed, as it passed all riscv.exp and rvv.exp tests. Thanks, Jeff.

Pan

-Original Message-
From: Jeff Law  
Sent: Thursday, June 8, 2023 10:01 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; rdapp@gmail.com; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: Re: [PATCH v2] RISC-V: Add more test cases for RVV FP16



On 6/8/23 01:52, pan2...@intel.com wrote:
> From: Pan Li 
> 
> This patch would like to add new test cases to make sure the RVV FP16 
> works well as expected.
> 
> Signed-off-by: Pan Li 
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Add new cases.
>   * gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: New test.
OK.  If there are dependencies on the ZVFHMIN (or anything else) then please 
wait to commit.  If the current compiler can handle these new tests, then you 
can go ahead and commit them now.

jeff


Re: [PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-06-08 Thread Jeff Law via Gcc-patches




On 6/6/23 21:50, Wang, Yanzhang wrote:

Hi Jeff,

Thanks for your comments. I have a few questions that I don't quite understand.


One of the things that needs to be upstreamed is long jump support within
a function.  Essentially once a function reaches 1M in size we have the
real possibility that a direct jump may not reach its target.

To support this I expect that $ra is going to become a fixed register (ie,
not available to the register allocator as a temporary).  It'll be used
as a scratch register for long jump sequences.

One of the consequences of this is $ra will need to be saved in leaf
functions that are near or over 1M in size.

Note that at the time when we have to lay out the stack, we do not know
the precise length of the function.  So there's a degree of "fuzz" in the
decision whether or not to save $ra in a function that is close to the 1M
limit.


Do you mean that, long jump to more than 1M offset will need multiple jal
and each jal will save the $ra ?
Long jumps are implemented as an indirect jump which needs a scratch 
register to hold the high part of the jump target address.
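
To illustrate where the 1M figure comes from, here is a small sketch.  It relies on the base ISA fact that JAL encodes a signed 21-bit byte offset whose low bit is always zero; the function name fits_direct_jump is hypothetical and is not something in GCC or binutils.

```cpp
#include <cstdint>

// Hypothetical helper: returns true when BYTE_OFFSET (target minus the
// address of the jump) can be encoded in a single RISC-V JAL.  JAL holds
// a signed 21-bit offset in units of 2 bytes, so the reachable range is
// [-2^20, 2^20 - 2] bytes and the offset must be even.
static bool
fits_direct_jump (std::int64_t byte_offset)
{
  constexpr std::int64_t min_offset = -(std::int64_t{1} << 20);
  constexpr std::int64_t max_offset = (std::int64_t{1} << 20) - 2;
  return byte_offset % 2 == 0
	 && byte_offset >= min_offset
	 && byte_offset <= max_offset;
}
```

Anything outside that range needs the indirect sequence, and hence the scratch register.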




If yes, I'm confused about what's the influence of the $ra saving for
function prologue. We will save the fp+ra at the prologue, the next $ra
saving seems will not modify the $ra already saved.
The long branch handling is done at the assembler level.  So the 
clobbering of $ra isn't visible to the compiler.  Thus the compiler has 
to be extremely careful to not hold values in $ra because the assembler 
may clobber $ra.


This ultimately comes back to the phase ordering problem.  At register 
allocation time we don't know if we need long jumps or not.  So we don't 
know if $ra is potentially clobbered by the assembler.   A similar phase 
ordering problem exists in the prologue/epilogue generation.


The other approach to long branch handling would be to do it all in the 
compiler.  I would actually prefer this approach, but it's not likely to 
land in the near term.





I think it's yes (not valid) when we want to get the return address to parent
function from $ra directly in the function body. But we can get the right
return address from fp with offset if we save them at prologue, is it right ?

Right.  You'll be able to get the value of $ra out of the stack.






Meaning that what you really want is to be using -fno-omit-frame-pointer
and for $ra to always be saved in the stack, even in a leaf function.


This is also another solution but will change the default behavior of
-fno-omit-frame-pointer.
That's OK.  While -f options are target independent options, targets are 
allowed to adjust certain behaviors based on those options.


If you're not going to use dwarf, then my recommendation is to ensure 
that the data you need is *always* available in the stack at known 
offsets.   That will mean your code isn't optimized as well.  It means 
hand written assembly code has to follow the conventions, you can't link 
against libraries that do not follow those conventions, etc etc.  But 
that's the price you pay for not using dwarf (or presumably ORC/SFRAME 
which I haven't studied in detail).


Jeff


[PATCH] simplify-rtx: Implement constant folding of SS_TRUNCATE, US_TRUNCATE

2023-06-08 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This patch implements RTL constant-folding for the SS_TRUNCATE and
US_TRUNCATE codes.  The semantics are a clamping operation on the argument
with the min and max of the narrow mode, followed by a truncation.  The
signedness of the clamp and the min/max extrema is derived from the
signedness of the saturating operation.
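
As a rough illustration of those semantics (a sketch only, not the code in s_truncate.patch; the helper names are hypothetical and the 16-bit to 8-bit widths are chosen just for the example):

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// SS_TRUNCATE-like: clamp to the signed range of the narrow type,
// then truncate.  Hypothetical helper for illustration.
static std::int8_t
ss_truncate_16_to_8 (std::int16_t x)
{
  const std::int16_t lo = std::numeric_limits<std::int8_t>::min ();
  const std::int16_t hi = std::numeric_limits<std::int8_t>::max ();
  const std::int16_t clamped = std::min (std::max (x, lo), hi);
  return static_cast<std::int8_t> (clamped);
}

// US_TRUNCATE-like: the operand is treated as unsigned, so only the
// upper bound of the narrow unsigned type can be exceeded.
static std::uint8_t
us_truncate_16_to_8 (std::uint16_t x)
{
  const std::uint16_t hi = std::numeric_limits<std::uint8_t>::max ();
  return static_cast<std::uint8_t> (std::min (x, hi));
}
```

With constant operands both results are known at compile time, which is what lets the instructions be folded away.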

We have a number of instructions in aarch64 that use SS_TRUNCATE and
US_TRUNCATE to represent their operations and we have pretty thorough
runtime tests in gcc.target/aarch64/advsimd-intrinsics/vqmovn*.c.
With this patch the instructions are folded away at optimisation levels and
the correctness checks still pass.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
Ok for trunk?

Thanks,
Kyrill

gcc/ChangeLog:

* simplify-rtx.cc (simplify_const_unary_operation):
Handle US_TRUNCATE, SS_TRUNCATE.


s_truncate.patch
Description: s_truncate.patch


Re: [PATCH] RISC-V: Add Veyron V1 pipeline description

2023-06-08 Thread Philipp Tomsich
On Thu 8. Jun 2023 at 16:17, Jeff Law  wrote:

>
>
> On 6/8/23 04:22, Kito Cheng wrote:
>
> >
> >
> > Oh, okay, I got the awkward point... I am OK with that on the GCC side,
> > but I would like binutils to support that first, or to remove the
> > extension from the mcpu temporarily until binutils supports it; otherwise
> > it is just broken support for that CPU on trunk GCC.
> I pushed the binutils bits into the repo a couple months ago:
>
> > commit 1656d3f8ef56a16745689c03269412988ebcaa54
> > Author: Philipp Tomsich 
> > Date:   Wed Apr 26 14:09:34 2023 -0600
> >
> > RISC-V: Support XVentanaCondOps extension
> [ ... ]
>
> I'd very much like to see the condops go into GCC as well, but I've been
> hesitant to move it forward myself.  We're still waiting on hardware and
> it wasn't clear to me that we really had consensus agreement to move the
> bits forward based on an announcement vs waiting on actual hardware
> availability (based on the comments from Palmer when I upstreamed the
> binutils bits).


Zicond will go to ratification in the next couple of weeks, and the plan
is to revise the patches by then.

So I would propose that we move Zicond forward as that happens and (given
how small XVentanaCondOps is on-top of Zicond) we pick it up then.


> IIRC there was general consensus on rewriting the lowest level


That was part of the “moving forward”… this needs a rebase and a major
revision.


> primitives as if-then-else constructs.  Something like this:
>
> > (define_code_iterator eq_or_ne [eq ne])
> > (define_code_attr n [(eq "") (ne "n")])
> > (define_code_attr rev [(eq "n") (ne "")])
> >
> > (define_insn "*vt.maskc"
> >   [(set (match_operand:X 0 "register_operand" "=r")
> > (if_then_else:X
> >  (eq_or_ne (match_operand:X 1 "register_operand" "r")
> >  (const_int 0))
> >  (const_int 0)
> >  (match_operand:X 2 "register_operand" "r")))]
> >   "TARGET_XVENTANACONDOPS"
> >   "vt.maskc<n>\t%0,%2,%1")
> >
> > (define_insn "*vt.maskc<n>_reverse"
> >   [(set (match_operand:X 0 "register_operand" "=r")
> > (if_then_else:X
> >  (eq_or_ne (match_operand:X 1 "register_operand" "r")
> >  (const_int 0))
> >  (match_operand:X 2 "register_operand" "r")
> >  (const_int 0)))]
> >   "TARGET_XVENTANACONDOPS"
> >   "vt.maskc<rev>\t%0,%2,%1")
>
> That's what we're using internally these days.  I would expect zicond to
> work in exactly the same manner, but with a different instruction being
> emitted.
>
> We've also got bits here which wire this up in the conditional move
> expander and which adjust the ifcvt.cc bits from VRULL to use the
> if-then-else form.  All this will be useful for zicond as well.
>
> I don't mind letting zicond go first.  It's frozen so it ought to be
> non-controversial.  We can then upstream the various improvements to
> utilize zicond better.  That moves things forward in a meaningful manner
> and buys time to meet the hardware requirement for xventanacondops which
> will be trivial to add if zicond is already supported.
>
>
>
>
> Jeff
>


[PATCH] fix frange_nextafter odr violation

2023-06-08 Thread Alexandre Oliva via Gcc-patches


C++ requires inline functions to be declared inline and defined in
every translation unit that uses them.  frange_nextafter is used in
gimple-range-op.cc but it's only defined as inline in
range-op-float.cc.  Drop the extraneous inline specifier.

Other non-static inline functions in range-op-float.cc are not
referenced elsewhere, so I'm making them static.
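
To illustrate the linkage rule described above, here is a minimal
single-file sketch.  The names next_after_sketch and is_finite_sketch
are hypothetical stand-ins, not the functions touched by the patch; the
only point is which definition carries `inline` and which carries
`static`.

```cpp
#include <cmath>

// Stand-in for a declaration that would live in a shared header
// (hypothetical name).  Other translation units would see only this
// declaration, so the definition below must not be `inline`: an inline
// function has to be defined in every translation unit that uses it.
void next_after_sketch (double &value, double inf);

// Helper used only in this file.  `static` gives it internal linkage,
// so keeping it `inline` cannot conflict with another translation unit.
static inline bool
is_finite_sketch (double x)
{
  return std::isfinite (x);
}

// Ordinary external definition matching the declaration above.
// Moves VALUE one representable step towards INF; leaves it alone if
// it is already infinite or NaN.
void
next_after_sketch (double &value, double inf)
{
  if (is_finite_sketch (value))
    value = std::nextafter (value, inf);
}
```

With this layout a second source file can call next_after_sketch
through the declaration alone, while the static helper stays private.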

Bootstrapping on x86_64-linux-gnu, along with other changes that exposed
the problem; it's already into stage3, and it wouldn't get past stage2
before.  Ok to install?


for  gcc/ChangeLog

* range-op-float.cc (frange_nextafter): Drop inline.
(frelop_early_resolve): Add static.
(frange_float): Likewise.
---
 gcc/range-op-float.cc |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index a99a6b01ed835..d6da2aa701ee3 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -255,7 +255,7 @@ maybe_isnan (const frange &op1, const frange &op2)
 // Floating version of relop_early_resolve that takes into account NAN
 // and -ffinite-math-only.
 
-inline bool
+static inline bool
 frelop_early_resolve (irange &r, tree type,
  const frange &op1, const frange &op2,
  relation_trio rel, relation_kind my_rel)
@@ -272,7 +272,7 @@ frelop_early_resolve (irange &r, tree type,
 
 // Set VALUE to its next real value, or INF if the operation overflows.
 
-inline void
+void
 frange_nextafter (enum machine_mode mode,
  REAL_VALUE_TYPE &value,
  const REAL_VALUE_TYPE &inf)
@@ -2878,7 +2878,7 @@ namespace selftest
 
 // Build an frange from string endpoints.
 
-inline frange
+static inline frange
 frange_float (const char *lb, const char *ub, tree type = float_type_node)
 {
   REAL_VALUE_TYPE min, max;


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Richard Sandiford via Gcc-patches
In addition to Andreas's and Richard's comments:

Tamar Christina  writes:
> +@item
> +@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:} 
> followed by
> +a list of @code{match_operand}/@code{match_scratch} comma operand numbers, 
> then a

How about:

  a comma-separated list of @code{match_operand}/@code{match_scratch} operand
  numbers, then a

Some lines are >80 chars.

> +semicolon, followed by the same for attributes (@samp{attrs:}).  Operand
> +modifiers can be placed in this section group as well.  Both sections
> +are optional (so you can use only @samp{cons}, or only @samp{attrs}, or 
> both),
> +and @samp{cons} must come before @samp{attrs} if present.
> +
> +@item
> +Each alternative begins with any amount of whitespace.
> +
> +@item
> +Following the whitespace is a comma-separated list of "constraints" and/or
> +"attributes" within brackets @code{[]}, with sections separated by a 
> semicolon.
> +
> +@item
> +Should you want to copy the previous asm line, the symbol @code{^} can be 
> used.
> +This allows less copy pasting between alternative and reduces the number of
> +lines to update on changes.
> +
> +@item
> +When using C functions for output, the idiom @samp{* return ;} can 
> be

@samp{* return @var{function};}

> +replaced with the shorthand @samp{<< @var{function};}.
> +
> +@item
> +Following the closing @samp{]} is any amount of whitespace, and then the 
> actual
> +asm output.
> +
> +@item
> +Spaces are allowed in the list (they will simply be removed).
> +
> +@item
> +All constraint alternatives should be specified: a blank list should be
> +@samp{[,,]} or generally use @samp{*} for the alternatives. e.g. 
> @samp{[*,*,*]}.

I think this is mixing two things.  How about:

@item
All constraint alternatives should be specified.  For example, a list
of three blank alternatives should be written @samp{[,,]} rather than
@samp{[]}.

@item
All attribute alternatives should be non-empty, with @samp{*}
representing the default attribute value.  For example, a list of three
default attribute values should be written @samp{[*,*,*]} rather than
@samp{[]}.

> +
> +@item
> +Within an @samp{@{@@} block both multiline and singleline C comments are
> +allowed, but when used outside of a C block they must be the only 
> non-whitespace
> +blocks on the line.
> +
> +@item
> +Within an @samp{@{@@} block, any iterators that do not get expanded will 
> result in an
> +error.  If for some reason it is required to have @code{<>} in the output 
> then

Maybe better as:

s/@code{<>}/@code{<} or @code{>}/

> +these must be escaped using @backslashchar{}.
> +
> +@item
> +The actual constraint string in the @code{match_operand} or
> +@code{match_scratch}, and the attribute string in the @code{set_attr}, must 
> be
> +blank or an empty string (you can't combine the old and new syntaxes).

It looks like the new version drops support for the set_attr case though
(thanks).

> +
> +@item
> +Additional @code{set_attr} can be specified other than the ones in the
> +@samp{attrs} list.  These must use the normal syntax and must come last.  
> There
> +must not be any overlap between the two lists.

Similarly here: I don't think the “they must come last” bit applies
any more.  How about something like:

  It is possible to use the @samp{attrs} list to specify some attributes
  and to use the normal @code{set_attr} syntax to specify other attributes.
  There must not be any overlap between the two lists.

> +
> +In other words, the following is valid:
> +@smallexample
> +@group
> +(define_insn_and_split ""
> +  [(set (match_operand:SI 0 "nonimmediate_operand")
> +   (match_operand:SI 1 "aarch64_mov_operand"))]
> +  ""
> +  @{@@ [cons: 0, 1; attrs: type, arch, length]@}
> +  @dots{}
> +  [(set_attr "foo" "mov_imm")]
> +)
> +@end group
> +@end smallexample
> +
> +but these are not valid:
> +@smallexample
> +@group
> +(define_insn_and_split ""
> +  [(set (match_operand:SI 0 "nonimmediate_operand")
> +   (match_operand:SI 1 "aarch64_mov_operand"))]
> +  ""
> +  @{@@ [cons: 0, 1; attrs: type, arch, length]@}
> +  @dots{}
> +  [(set_attr "type")
> +   (set_attr "arch")
> +   (set_attr "foo" "mov_imm")]
> +)
> +@end group
> +@end smallexample
> +
> +and
> +
> +@smallexample
> +@group
> +(define_insn_and_split ""
> +  [(set (match_operand:SI 0 "nonimmediate_operand")
> +   (match_operand:SI 1 "aarch64_mov_operand"))]
> +  ""
> +  @{@@ [cons: 0, 1; attrs: type, arch, length]@}
> +  @dots{}
> +  [(set_attr "type")
> +   (set_attr "foo" "mov_imm")
> +   (set_attr "arch")
> +   (set_attr "length")]
> +)
> +@end group
> +@end smallexample
> +
> +because the order of the entries don't match and new entries must be last.
> +@end itemize

These examples probably need updating too.

> +
>  @node Predicates
>  @section Predicates
>  @cindex predicates
> diff --git a/gcc/genoutput.cc b/gcc/genoutput.cc
> index 
> 163e8dfef4ca2c2c92ce1cf001ee6be40a54ca3e..7088f816cfa6e6ab2c1f51b8bbaa5eae990a0a4b
>  

Re: [PATCH] RISC-V: Add Veyron V1 pipeline description

2023-06-08 Thread Jeff Law via Gcc-patches




On 6/8/23 04:22, Kito Cheng wrote:




Oh, okay I got the awkward point...I am ok with that on gcc land, but I
would like binutils support that first, or remove the extension from the 
mcpu for temporary before binutils support, otherwise it just a broken 
support for that CPU on trunk gcc.

I pushed the binutils bits into the repo a couple months ago:


commit 1656d3f8ef56a16745689c03269412988ebcaa54
Author: Philipp Tomsich 
Date:   Wed Apr 26 14:09:34 2023 -0600

RISC-V: Support XVentanaCondOps extension

[ ... ]

I'd very much like to see the condops go into GCC as well, but I've been 
hesitant to move it forward myself.  We're still waiting on hardware and 
it wasn't clear to me that we really had consensus agreement to move the 
bits forward based on an announcement vs waiting on actual hardware 
availability (based on the comments from Palmer when I upstreamed the 
binutils bits).


IIRC there was general consensus on rewriting the lowest level 
primitives as if-then-else constructs.  Something like this:



(define_code_iterator eq_or_ne [eq ne])
(define_code_attr n [(eq "") (ne "n")])
(define_code_attr rev [(eq "n") (ne "")])

(define_insn "*vt.maskc<n>"
  [(set (match_operand:X 0 "register_operand" "=r")
(if_then_else:X
 (eq_or_ne (match_operand:X 1 "register_operand" "r")
 (const_int 0))
 (const_int 0)
 (match_operand:X 2 "register_operand" "r")))]
  "TARGET_XVENTANACONDOPS"
  "vt.maskc<n>\t%0,%2,%1")

(define_insn "*vt.maskc<n>_reverse"
  [(set (match_operand:X 0 "register_operand" "=r")
(if_then_else:X
 (eq_or_ne (match_operand:X 1 "register_operand" "r")
 (const_int 0))
 (match_operand:X 2 "register_operand" "r")
 (const_int 0)))]
  "TARGET_XVENTANACONDOPS"
  "vt.maskc<rev>\t%0,%2,%1")


That's what we're using internally these days.  I would expect zicond to 
work in exactly the same manner, but with a different instruction being 
emitted.
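
As a rough illustration of what these if-then-else patterns compute,
here is a small C++ model.  It assumes the usual reading of the
XVentanaCondOps pair (vt.maskc yields the source value when the
condition register is nonzero and zero otherwise; vt.maskcn is the
inverse).  The function names, and the select built from the two
masks, are my own sketch and are not taken from the patch.

```cpp
#include <cstdint>

// Model of vt.maskc: (if_then_else (eq cond 0) 0 value).
constexpr std::uint64_t
maskc (std::uint64_t value, std::uint64_t cond)
{
  return cond == 0 ? 0 : value;
}

// Model of vt.maskcn: (if_then_else (ne cond 0) 0 value).
constexpr std::uint64_t
maskcn (std::uint64_t value, std::uint64_t cond)
{
  return cond != 0 ? 0 : value;
}

// Branch-free "cond ? a : b": exactly one of the two masked values is
// nonzero-capable, so OR-ing them selects the right operand.
constexpr std::uint64_t
select_sketch (std::uint64_t cond, std::uint64_t a, std::uint64_t b)
{
  return maskc (a, cond) | maskcn (b, cond);
}
```

This is the kind of composition a conditional move expander could
build on; the real expander may well choose a different sequence.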


We've also got bits here which wire this up in the conditional move 
expander and which adjust the ifcvt.cc bits from VRULL to use the 
if-then-else form.  All this will be useful for zicond as well.


I don't mind letting zicond go first.  It's frozen so it ought to be 
non-controversial.  We can then upstream the various improvements to 
utilize zicond better.  That moves things forward in a meaningful manner 
and buys time to meet the hardware requirement for xventanacondops which 
will be trivial to add if zicond is already supported.





Jeff


Re: [PATCH RFC] c++: use __cxa_call_terminate for MUST_NOT_THROW [PR97720]

2023-06-08 Thread Jason Merrill via Gcc-patches
On Thu, Jun 8, 2023 at 9:13 AM Jonathan Wakely  wrote:

>
> On Fri, 26 May 2023 at 10:58, Jonathan Wakely wrote:
>
>>
>>
>> On Wed, 24 May 2023 at 19:56, Jason Merrill via Libstdc++ <
>> libstd...@gcc.gnu.org> wrote:
>>
>>> Middle-end folks: any thoughts about how best to make the change
>>> described in
>>> the last paragraph below?
>>>
>>> Library folks: any thoughts on the changes to __cxa_call_terminate?
>>>
>>
>> I see no harm in exporting it (with the adjusted signature). The "looks
>> standard but isn't" name is a little unfortunate, but not a big deal.
>>
>
> Jason, do you have any objection to exporting __cxa_call_terminate for GCC
> 13.2 as well, even though the FE won't use it?
>

That sounds fine.

Jason


Re: [PATCH v2] RISC-V: Add more test cases for RVV FP16

2023-06-08 Thread Jeff Law via Gcc-patches




On 6/8/23 01:52, pan2...@intel.com wrote:

From: Pan Li 

This patch would like to add new test cases to make sure the
RVV FP16 works well as expected.

Signed-off-by: Pan Li 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Add new cases.
* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: New test.
OK.  If there are dependencies on the ZVFHMIN (or anything else) then 
please wait to commit.  If the current compiler can handle these new 
tests, then you can go ahead and commit them now.


jeff


Re: Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread Kito Cheng via Gcc-patches
I like JuZhe's proposal too since it's a less invasive way :)

On Thu, Jun 8, 2023 at 9:18 PM Li, Pan2 via Gcc-patches
 wrote:
>
> Thanks Juzhe for the idea. It looks like it works as we expected, based on
> the following experiment.
>
>
>   1.  Allow all FP16 types for vfadd; then both _zvfh and _zvfhmin are OK.
>   2.  Add the restricting define_attr as Juzhe mentioned; then _zvfh still
> works, and _zvfhmin fails with an error like an unsatisfied insn.
>
> I think all we need to do is add the define_attr, and there will be no
> changes to vector.md. If there are no more concerns, I will try this approach.
>
> Pan
>
> From: juzhe.zh...@rivai.ai 
> Sent: Thursday, June 8, 2023 4:32 PM
> To: kito.cheng 
> Cc: Li, Pan2 ; gcc-patches ; 
> Robin Dapp ; jeffreyalaw ; Wang, 
> Yanzhang 
> Subject: Re: Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
>
> I have an idea based on what Kito said.
> We enable vfadd FP16 for TARGET_ZVFH. But we don't need to add TARGET_VECTOR 
> && 
> for each instruction.
>
> We can reference riscv.md:
> (define_attr "ext_enabled" "no,yes"
>   (cond [(eq_attr "ext" "base")
>(const_string "yes")
>
>(and (eq_attr "ext" "f")
> (match_test "TARGET_HARD_FLOAT"))
>(const_string "yes")
>
>(and (eq_attr "ext" "d")
> (match_test "TARGET_DOUBLE_FLOAT"))
>(const_string "yes")
>
>(and (eq_attr "ext" "vector")
> (match_test "TARGET_VECTOR"))
>(const_string "yes")
>   ]
>   (const_string "no")))
>
> Define a new attribute as follows:
> (define_attr "fp16_vector_enabled" "no,yes"
>   (cond [
>(and (eq_attr "type" "vfalu")
>  (and (eq_attr "mode" "VNx1HF")
> (match_test "!TARGET_ZVFH")))
>(const_string "no")
>   ]
>   (const_string "yes")))
>
>
> I think you can experiment with this to see whether it can disable the MD
> pattern.
>
> 
> juzhe.zh...@rivai.ai
>
> From: Kito Cheng
> Date: 2023-06-08 15:58
> To: juzhe.zh...@rivai.ai
> CC: pan2.li; 
> gcc-patches; Robin 
> Dapp; jeffreyalaw; 
> yanzhang.wang
> Subject: Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
> I am thinking, is it possible to use mode attr to remove the overhead
> of checking the mode for other FP modes other than FP16?
>
> e.g.
> (define_mode_attr TARGET_FP_FULL_OPERATION_CHECKING [
>   (VNx1HF "TARGET_ZVFH")
> ...
>   (VNx1SF "1")
> ...
> ])
>
>
>   "TARGET_VECTOR && riscv_vector::float_mode_supported_p (<MODE>mode)"
> ->
>   "TARGET_VECTOR && <TARGET_FP_FULL_OPERATION_CHECKING>"
>
> On Thu, Jun 8, 2023 at 2:35 PM juzhe.zh...@rivai.ai wrote:
> >
> > LGTM. Let's wait for Jeff and Robin. After this patch, we can start FP16 
> > autovec.
> >
> >
> >
> > juzhe.zh...@rivai.ai
> >
> > From: pan2.li
> > Date: 2023-06-08 14:29
> > To: gcc-patches
> > CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
> > Subject: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
> > From: Pan Li <pan2...@intel.com>
> >
> > This patch would like to refactor the requirement of both the ZVFH
> > and ZVFHMIN. By default, the ZVFHMIN will enable FP16 for all the
> > iterators of RVV. And then the ZVFH will leverage one function as
> > the gate for FP16 supported or not.
> >
> > Please note the ZVFH will cover the ZVFHMIN instructions. This patch
> > add one test for this.
> >
> > Signed-off-by: Pan Li <pan2...@intel.com>
> > Co-Authored by: Juzhe-Zhong <juzhe.zh...@rivai.ai>
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv-protos.h (float_mode_supported_p):
> > New function to float point is supported by extension.
> > * config/riscv/riscv-v.cc (float_mode_supported_p):
> > Ditto.
> > * config/riscv/vector-iterators.md: Fix V_WHOLE and V_FRACT.
> > * config/riscv/vector.md: Add condition to FP define insn.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add vle16 test
> > for ZVFHMIN.
> > ---
> > gcc/config/riscv/riscv-protos.h   |   1 +
> > gcc/config/riscv/riscv-v.cc   |  12 ++
> > gcc/config/riscv/vector-iterators.md  |  23 +--
> > gcc/config/riscv/vector.md| 144 ++
> > .../riscv/rvv/base/zvfhmin-intrinsic.c|  15 +-
> > 5 files changed, 118 insertions(+), 77 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv-protos.h 
> > b/gcc/config/riscv/riscv-protos.h
> > index ebbaac255f9..1f606f59ce1 100644
> > --- a/gcc/config/riscv/riscv-protos.h
> > +++ b/gcc/config/riscv/riscv-protos.h
> > @@ -177,6 +177,7 @@ rtx expand_builtin (unsigned int, tree, rtx);
> > bool check_builtin_call (location_t, vec, unsigned int,
> >tree, unsigned int, tree *);
> > bool 

RE: Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread Li, Pan2 via Gcc-patches
Thanks Juzhe for the idea. It looks like it works as we expected, based on the
following experiment.


  1.  Allow all FP16 types for vfadd; then both _zvfh and _zvfhmin are OK.
  2.  Add the restricting define_attr as Juzhe mentioned; then _zvfh still works,
and _zvfhmin fails with an error like an unsatisfied insn.

I think all we need to do is add the define_attr, and there will be no changes to
vector.md. If there are no more concerns, I will try this approach.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Thursday, June 8, 2023 4:32 PM
To: kito.cheng 
Cc: Li, Pan2 ; gcc-patches ; Robin 
Dapp ; jeffreyalaw ; Wang, Yanzhang 

Subject: Re: Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

I have an idea based on what Kito said.
We enable vfadd FP16 for TARGET_ZVFH. But we don't need to add TARGET_VECTOR && 

for each instruction.

We can reference riscv.md:
(define_attr "ext_enabled" "no,yes"
  (cond [(eq_attr "ext" "base")
   (const_string "yes")

   (and (eq_attr "ext" "f")
(match_test "TARGET_HARD_FLOAT"))
   (const_string "yes")

   (and (eq_attr "ext" "d")
(match_test "TARGET_DOUBLE_FLOAT"))
   (const_string "yes")

   (and (eq_attr "ext" "vector")
(match_test "TARGET_VECTOR"))
   (const_string "yes")
  ]
  (const_string "no")))

Define a new attribute as follows:
(define_attr "fp16_vector_enabled" "no,yes"
  (cond [
   (and (eq_attr "type" "vfalu")
 (and (eq_attr "mode" "VNx1HF")
(match_test "!TARGET_ZVFH")))
   (const_string "no")
  ]
  (const_string "yes")))


I think you can experiment with this to see whether it can disable the MD
pattern.


juzhe.zh...@rivai.ai

From: Kito Cheng
Date: 2023-06-08 15:58
To: juzhe.zh...@rivai.ai
CC: pan2.li; 
gcc-patches; Robin 
Dapp; jeffreyalaw; 
yanzhang.wang
Subject: Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
I am thinking, is it possible to use mode attr to remove the overhead
of checking the mode for other FP modes other than FP16?

e.g.
(define_mode_attr TARGET_FP_FULL_OPERATION_CHECKING [
  (VNx1HF "TARGET_ZVFH")
...
  (VNx1SF "1")
...
])


  "TARGET_VECTOR && riscv_vector::float_mode_supported_p (<MODE>mode)"
->
  "TARGET_VECTOR && <TARGET_FP_FULL_OPERATION_CHECKING>"
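
A loose C++ analogy of the mode-attribute idea above, offered only as a
sketch: the per-mode condition is picked when the pattern is
instantiated, so modes that need no FP16 check pay nothing for it.
Mode, Target, fp_full_op_ok and insn_enabled are all hypothetical
names; the real mechanism is the machine-description iterator
substitution, not C++ templates.

```cpp
// Two example modes standing in for the full RVV mode list.
enum class Mode { VNx1HF, VNx1SF };

// Minimal model of the target flags involved.
struct Target
{
  bool vector; // models TARGET_VECTOR
  bool zvfh;   // models TARGET_ZVFH
};

// Per-mode condition, resolved by specialization rather than by a
// run-time switch on the mode (the analogue of the mode attribute).
template <Mode M> constexpr bool fp_full_op_ok (const Target &t);

// FP16 needs the ZVFH check.
template <> constexpr bool
fp_full_op_ok<Mode::VNx1HF> (const Target &t)
{
  return t.zvfh;
}

// Other FP modes get the constant "1".
template <> constexpr bool
fp_full_op_ok<Mode::VNx1SF> (const Target &)
{
  return true;
}

// Analogue of the insn condition "TARGET_VECTOR && <per-mode check>".
template <Mode M> constexpr bool
insn_enabled (const Target &t)
{
  return t.vector && fp_full_op_ok<M> (t);
}
```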


On Thu, Jun 8, 2023 at 2:35 PM juzhe.zh...@rivai.ai wrote:
>
> LGTM. Let's wait for Jeff and Robin. After this patch, we can start FP16 
> autovec.
>
>
>
> juzhe.zh...@rivai.ai
>
> From: pan2.li
> Date: 2023-06-08 14:29
> To: gcc-patches
> CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
> Subject: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
> From: Pan Li <pan2...@intel.com>
>
> This patch would like to refactor the requirement of both the ZVFH
> and ZVFHMIN. By default, the ZVFHMIN will enable FP16 for all the
> iterators of RVV. And then the ZVFH will leverage one function as
> the gate for FP16 supported or not.
>
> Please note the ZVFH will cover the ZVFHMIN instructions. This patch
> add one test for this.
>
> Signed-off-by: Pan Li <pan2...@intel.com>
> Co-Authored by: Juzhe-Zhong <juzhe.zh...@rivai.ai>
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (float_mode_supported_p):
> New function to float point is supported by extension.
> * config/riscv/riscv-v.cc (float_mode_supported_p):
> Ditto.
> * config/riscv/vector-iterators.md: Fix V_WHOLE and V_FRACT.
> * config/riscv/vector.md: Add condition to FP define insn.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add vle16 test
> for ZVFHMIN.
> ---
> gcc/config/riscv/riscv-protos.h   |   1 +
> gcc/config/riscv/riscv-v.cc   |  12 ++
> gcc/config/riscv/vector-iterators.md  |  23 +--
> gcc/config/riscv/vector.md| 144 ++
> .../riscv/rvv/base/zvfhmin-intrinsic.c|  15 +-
> 5 files changed, 118 insertions(+), 77 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index ebbaac255f9..1f606f59ce1 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -177,6 +177,7 @@ rtx expand_builtin (unsigned int, tree, rtx);
> bool check_builtin_call (location_t, vec, unsigned int,
>tree, unsigned int, tree *);
> bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
> +bool float_mode_supported_p (machine_mode mode);
> bool legitimize_move (rtx, rtx);
> void emit_vlmax_vsetvl (machine_mode, rtx);
> void emit_hard_vlmax_vsetvl (machine_mode, rtx);
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index 49752cd8899..fe4eb058ec0 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ 

Re: [PATCH RFC] c++: use __cxa_call_terminate for MUST_NOT_THROW [PR97720]

2023-06-08 Thread Jonathan Wakely via Gcc-patches
On Fri, 26 May 2023 at 10:58, Jonathan Wakely wrote:

>
>
> On Wed, 24 May 2023 at 19:56, Jason Merrill via Libstdc++ <
> libstd...@gcc.gnu.org> wrote:
>
>> Middle-end folks: any thoughts about how best to make the change
>> described in
>> the last paragraph below?
>>
>> Library folks: any thoughts on the changes to __cxa_call_terminate?
>>
>
> I see no harm in exporting it (with the adjusted signature). The "looks
> standard but isn't" name is a little unfortunate, but not a big deal.
>

Jason, do you have any objection to exporting __cxa_call_terminate for GCC
13.2 as well, even though the FE won't use it?

Currently both gcc-13 and trunk are at the same library version,
libstdc++.so.6.0.32

But with this addition to trunk we need to bump that .32 to .33, meaning
that gcc-13 and trunk diverge. If we want to backport any new symbols from
trunk to gcc-13 that gets trickier once they've diverged.

If we added __cxa_call_terminate to gcc-13, making it another new addition
to libstdc++.so.6.0.32, then it would simplify a few things.

In theory it could be a problem for distros already shipping gcc-13.1.1
with that new libstdc++.so.6.0.32 version, but since the
__cxa_call_terminate symbol won't actually be used by the gcc-13.1.1
compilers, I don't think it will be a problem.


Re: [V1][PATCH 1/3] Provide element_count attribute to flexible array member field (PR108896)

2023-06-08 Thread Qing Zhao via Gcc-patches


> On Jun 7, 2023, at 6:05 PM, Joseph Myers  wrote:
> 
> On Wed, 7 Jun 2023, Qing Zhao via Gcc-patches wrote:
> 
>> Are you suggesting to use identifier directly as the argument of the 
>> attribute?
>> I tried this in the beginning, however, the current parser for the attribute 
>> argument can not identify that this identifier is a field identifier inside 
>> the same structure. 
>> 
>> For example:
>> 
>> int count;
>> struct trailing_array_7 {
>>  int count;
>>  int array_7[] __attribute ((element_count (count))); 
>> };
>> 
>> The identifier “count” inside the attribute will refer to the variable 
>> “int count” outside of the structure.
> 
> c_parser_attribute_arguments is supposed to allow an identifier as an 
> attribute argument - and not look it up (the user of the attribute would 
> later need to look it up in the context of the containing structure).  
> Callers use attribute_takes_identifier_p to determine which attributes 
> take identifiers (versus expressions) as arguments, which would need 
> updating to cover the new attribute.
> 
> There is a ??? comment about the case where the identifier is declared as 
> a type name.  That would simply be one of the cases carried over from the 
> old Bison parser, and it would seem reasonable to remove that 
> special-casing so that the attribute works even when the identifier is 
> declared as a typedef name as an ordinary identifier, since it's fine for 
> structure members to have the same name as a typedef name.
> 
> Certainly taking an identifier directly seems like cleaner syntax than 
> taking a string that then needs reinterpreting as an identifier.

Thanks a lot for the helpful info. I will study a little bit here to see how to 
do this.
Qing
> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com



Re: [PATCH v5] MIPS: Add speculation_barrier support

2023-06-08 Thread Richard Earnshaw (lists) via Gcc-patches



On 01/06/2023 05:26, YunQiang Su wrote:

speculation_barrier for MIPS needs sync+jr.hb (r2+),
so we implement __speculation_barrier in libgcc, like arm32 does.
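
For context, a hedged sketch of the kind of user code that depends on
this.  As far as I understand, GCC's __builtin_speculation_safe_value
falls back to the target's speculation_barrier pattern when no more
specific expansion exists.  load_checked and HAVE_SAFE_VALUE are
illustrative names only; the guard keeps the sketch compiling on
compilers that lack the builtin.

```cpp
#include <cstddef>

#if defined(__has_builtin)
#  if __has_builtin(__builtin_speculation_safe_value)
#    define HAVE_SAFE_VALUE 1
#  endif
#endif

// Bounds-checked load.  After the check, the index is passed through
// the builtin so that a mispredicted branch cannot speculatively use
// an out-of-range value.
int
load_checked (const int *table, std::size_t size, std::size_t idx)
{
  if (idx < size)
    {
#ifdef HAVE_SAFE_VALUE
      idx = __builtin_speculation_safe_value (idx);
#endif
      return table[idx];
    }
  return 0;
}
```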

gcc/ChangeLog:
* config/mips/mips-protos.h (mips_emit_speculation_barrier): New
 prototype.
* config/mips/mips.cc (speculation_barrier_libfunc): New static
 variable.
(mips_init_libfuncs): Initialize it.
(mips_emit_speculation_barrier): New function.
* config/mips/mips.md (speculation_barrier): Call
 mips_emit_speculation_barrier.

libgcc/ChangeLog:
* config/mips/lib1funcs.S: New file.
define __speculation_barrier and include mips16.S.
* config/mips/t-mips: define LIB1ASMSRC as mips/lib1funcs.S.
define LIB1ASMFUNCS as _speculation_barrier.
set version info for __speculation_barrier.
* config/mips/libgcc-mips.ver: New file.
* config/mips/t-mips16: don't define LIB1ASMSRC as mips16.S
included in lib1funcs.S now.
---


Please remember to cite PR86793 when committing this fix.

R.


  gcc/config/mips/mips-protos.h  |  2 +
  gcc/config/mips/mips.cc| 12 ++
  gcc/config/mips/mips.md| 12 ++
  libgcc/config/mips/lib1funcs.S | 65 ++
  libgcc/config/mips/libgcc-mips.ver | 21 ++
  libgcc/config/mips/t-mips  |  7 
  libgcc/config/mips/t-mips16|  3 +-
  7 files changed, 120 insertions(+), 2 deletions(-)
  create mode 100644 libgcc/config/mips/lib1funcs.S
  create mode 100644 libgcc/config/mips/libgcc-mips.ver

diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index 20483469105..da7902c235b 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -388,4 +388,6 @@ extern void mips_register_frame_header_opt (void);
  extern void mips_expand_vec_cond_expr (machine_mode, machine_mode, rtx *);
  extern void mips_expand_vec_cmp_expr (rtx *);
  
+extern void mips_emit_speculation_barrier_function (void);

+
  #endif /* ! GCC_MIPS_PROTOS_H */
diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index ca491b981a3..c1d1691306e 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -13611,6 +13611,9 @@ mips_autovectorize_vector_modes (vector_modes *modes, 
bool)
return 0;
  }
  
+

+static GTY (()) rtx speculation_barrier_libfunc;
+
  /* Implement TARGET_INIT_LIBFUNCS.  */
  
  static void

@@ -13680,6 +13683,7 @@ mips_init_libfuncs (void)
synchronize_libfunc = init_one_libfunc ("__sync_synchronize");
init_sync_libfuncs (UNITS_PER_WORD);
  }
+  speculation_barrier_libfunc = init_one_libfunc ("__speculation_barrier");
  }
  
  /* Build up a multi-insn sequence that loads label TARGET into $AT.  */

@@ -19092,6 +19096,14 @@ mips_avoid_hazard (rtx_insn *after, rtx_insn *insn, 
int *hilo_delay,
}
  }
  
+/* Emit a speculation barrier.

+   JR.HB is needed, so we put speculation_barrier_libfunc in libgcc.  */
+void
+mips_emit_speculation_barrier_function ()
+{
+  emit_library_call (speculation_barrier_libfunc, LCT_NORMAL, VOIDmode);
+}
+
  /* A SEQUENCE is breakable iff the branch inside it has a compact form
 and the target has compact branches.  */
  
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md

index ac1d77afc7d..5d04ac566dd 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -160,6 +160,8 @@
;; The `.insn' pseudo-op.
UNSPEC_INSN_PSEUDO
UNSPEC_JRHB
+
+  VUNSPEC_SPECULATION_BARRIER
  ])
  
  (define_constants

@@ -7455,6 +7457,16 @@
mips_expand_conditional_move (operands);
DONE;
  })
+
+(define_expand "speculation_barrier"
+  [(unspec_volatile [(const_int 0)] VUNSPEC_SPECULATION_BARRIER)]
+  ""
+  "
+  mips_emit_speculation_barrier_function ();
+  DONE;
+  "
+)
+
  
  ;;
  ;;  
diff --git a/libgcc/config/mips/lib1funcs.S b/libgcc/config/mips/lib1funcs.S
new file mode 100644
index 000..97a3655e8ab
--- /dev/null
+++ b/libgcc/config/mips/lib1funcs.S
@@ -0,0 +1,65 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+This file is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3, or (at your option) any
+later version.
+
+This file is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, 

Re: [PATCH] Fortran: add IEEE_QUIET_* and IEEE_SIGNALING_* comparisons

2023-06-08 Thread Harald Anlauf via Gcc-patches

Hi FX,

Am 06.06.23 um 21:29 schrieb FX Coudert via Gcc-patches:

Hi,

This is a repost of the patch at 
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/600887.html
which never really got green light, but I stopped pushing because stage 1 was 
closing and I was out of time.


I just looked at that thread.  I guess if you answer Mikael's
questions at

  https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601744.html

the patch will be fine.


It depends on a middle-end patch adding a type-generic __builtin_iseqsig(), 
which I posted for review at: 
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620801.html

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK to commit (once the 
middle-end patch is accepted)?

FX



Thanks,
Harald




Re: [committed] libstdc++: Fix code size regressions in std::vector [PR110060]

2023-06-08 Thread Jonathan Wakely via Gcc-patches
On Thu, 8 Jun 2023 at 09:58, Maxim Kuvyrkov 
wrote:

> Hi Jonathan,
>
> Interestingly, this increases code-size of -O3 code on aarch64-linux-gnu
> on SPEC CPU2017's 641.leela_s benchmark [1].
>
> In particular, FastBoard::get_nearby_enemies() grew from 1444 to 2212
> bytes.  This seems like a corner-case; the rest of SPEC CPU2017 is, mostly,
> neutral to this patch.  Is this something you may be interested in
> investigating?  I'll be happy to assist.
>
> Looking at assembly, one of the differences I see is that the "after"
> version has calls to realloc_insert(), while "before" version seems to have
> them inlined [2].
>

Was the size of that function stable at (approximately) 1444 bytes prior to
my most recent change?

Is it possible that r14-1452-gfb409a15d9babc caused the size to shrink, and
then r14-1470-gb7b255e77a2719 caused it to grow again?





>
> [1]
> https://git.linaro.org/toolchain/ci/interesting-commits.git/tree/gcc/sha1/b7b255e77a271974479c34d1db3daafc04b920bc/tcwg_bmk-code_size-cpu2017fast/status.txt
>
> [2] 641.leela_s is a non-GPL/non-BSD benchmark, and I'm not sure if I can
> post its compiled and/or preprocessed code publicly.  I assume Red Hat has
> a SPEC CPU2017 license, and I can post details to you privately.
>
> Kind regards,
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org
>
>
>
>
> > On Jun 1, 2023, at 19:09, Jonathan Wakely via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> >
> > Tested powerpc64le-linux. Pushed to trunk.
> >
> > -- >8 --
> >
> > My r14-1452-gfb409a15d9babc change to add optimization hints to
> > std::vector causes regressions because it makes std::vector::size() and
> > std::vector::capacity() too big to inline. That's the opposite of what
> > I wanted, so revert the changes to those functions.
> >
> > To achieve the original aim of optimizing vec.assign(vec.size(), x) we
> > can add a local optimization hint to _M_fill_assign, so that it doesn't
> > affect all other uses of size() and capacity().
> >
> > Additionally, add the same hint to the _M_assign_aux overload for
> > forward iterators and add that to the testcase.
> >
> > It would be nice to similarly optimize:
> >  if (vec1.size() == vec2.size()) vec1 = vec2;
> > but adding hints to operator=(const vector&) doesn't help. Presumably
> > the relationships between the two sizes and two capacities are too
> > complex to track effectively.
> >
> > libstdc++-v3/ChangeLog:
> >
> > PR libstdc++/110060
> > * include/bits/stl_vector.h (_Vector_base::_M_invariant):
> > Remove.
> > (vector::size, vector::capacity): Remove calls to _M_invariant.
> > * include/bits/vector.tcc (vector::_M_fill_assign): Add
> > optimization hint to reallocating path.
> > (vector::_M_assign_aux(FwdIter, FwdIter, forward_iterator_tag)):
> > Likewise.
> > * testsuite/23_containers/vector/capacity/invariant.cc: Moved
> > to...
> > * testsuite/23_containers/vector/modifiers/assign/no_realloc.cc:
> > ...here. Check assign(FwdIter, FwdIter) too.
> > * testsuite/23_containers/vector/types/1.cc: Revert addition
> > of -Wno-stringop-overread option.
> > ---
> > libstdc++-v3/include/bits/stl_vector.h| 23 +--
> > libstdc++-v3/include/bits/vector.tcc  | 17 ++
> > .../assign/no_realloc.cc} |  6 +
> > .../testsuite/23_containers/vector/types/1.cc |  2 +-
> > 4 files changed, 20 insertions(+), 28 deletions(-)
> > rename
> libstdc++-v3/testsuite/23_containers/vector/{capacity/invariant.cc =>
> modifiers/assign/no_realloc.cc} (70%)
> >
> > diff --git a/libstdc++-v3/include/bits/stl_vector.h
> b/libstdc++-v3/include/bits/stl_vector.h
> > index e593be443bc..70ced3d101f 100644
> > --- a/libstdc++-v3/include/bits/stl_vector.h
> > +++ b/libstdc++-v3/include/bits/stl_vector.h
> > @@ -389,23 +389,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >
> > protected:
> >
> > -  __attribute__((__always_inline__))
> > -  _GLIBCXX20_CONSTEXPR void
> > -  _M_invariant() const
> > -  {
> > -#if __OPTIMIZE__
> > - if (this->_M_impl._M_finish < this->_M_impl._M_start)
> > -  __builtin_unreachable();
> > - if (this->_M_impl._M_finish > this->_M_impl._M_end_of_storage)
> > -  __builtin_unreachable();
> > -
> > - size_t __sz = this->_M_impl._M_finish - this->_M_impl._M_start;
> > - size_t __cap = this->_M_impl._M_end_of_storage -
> this->_M_impl._M_start;
> > - if (__sz > __cap)
> > -  __builtin_unreachable();
> > -#endif
> > -  }
> > -
> >   _GLIBCXX20_CONSTEXPR
> >   void
> >   _M_create_storage(size_t __n)
> > @@ -1005,10 +988,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
> >   size_type
> >   size() const _GLIBCXX_NOEXCEPT
> > -  {
> > - _Base::_M_invariant();
> > - return size_type(this->_M_impl._M_finish - this->_M_impl._M_start);
> > -  }
> > +  { return size_type(this->_M_impl._M_finish -
> this->_M_impl._M_start); }
> >
> >   /**  Returns the size() of the largest possible %vector.  */
> >   

Re: [PATCH] Fortran: add Fortran 2018 IEEE_{MIN,MAX} functions

2023-06-08 Thread Thomas Koenig via Gcc-patches

Hi FX,


Having a POWER system isn't enough, it also needs the IBM "advance
toolchain", and (at least with current distros, which default to
ibm long double), you need to dance counterclockwise three
times... I mean you need to invoke configure with some special magic


Thanks for the frank description, Thomas. To be honest, it reinforces
my feeling from when this was originally proposed and added: why are
we doing so much extra work for a feature that is used by such a tiny
fraction of our user base.


Well, I can tell you why I helped in this:  I like non-standard
architectures, I like 128-bit floats, and I like fast execution
speed of programs.  And if POWER having this goes any way towards
pushing Intel, AMD, or ARM towards having 128-bit floating point...
well, I would like that a lot.

And the need for all this magic will go away once distributions switch
to IEEE QP float as default.

By the way, if anybody wants to play with it, there should be no
problem in getting an account on the OSL (virtual) machine
I ran this on.

As for the speed difference: A simple matrix multiplication has around
50 MFlops on my home box and around 250 MFlops on the POWER9 box I am
testing this on.  POWER10 should double that.

Best regards

Thomas


[PATCH] testsuite: fix the condition bug in tsvc s176

2023-06-08 Thread Lehua Ding
Hi,

This patch fixes the problem that the loop in the tsvc s176 function is
optimized and removed because `iterations/LEN_1D` is 0 (where iterations
is set to 1, LEN_1D is set to 32000 in tsvc.h).
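
The truncation behind this can be sketched in C.  This is only an
illustration of the integer arithmetic, assuming the values quoted above
(iterations = 1, LEN_1D = 32000); `outer_trip_count` and its `scale`
parameter are hypothetical names and are not part of tsvc.h.

```c
/* Illustration only: the outer-loop bound in s176 has the shape
   4 * (scale * iterations / LEN_1D), evaluated in integer arithmetic.
   outer_trip_count is a hypothetical helper, not part of tsvc.h.  */
static int
outer_trip_count (int scale, int iterations, int len_1d)
{
  /* Integer division truncates toward zero, so the quotient is 0
     whenever scale * iterations < len_1d.  */
  return 4 * (scale * iterations / len_1d);
}
```

With the quoted values, `outer_trip_count (1, 1, 32000)` is 0, so the loop
body never runs and the compiler is free to delete it.  The bound only
becomes non-zero once `scale * iterations` reaches `LEN_1D`.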

This testcase passes on x86 and AArch64 systems.

Best,
Lehua

gcc/testsuite/ChangeLog:

* gcc.dg/vect/tsvc/vect-tsvc-s176.c: adjust iterations

---
 gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c 
b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c
index 79faf7fdb9e4..365e5205982b 100644
--- a/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c
+++ b/gcc/testsuite/gcc.dg/vect/tsvc/vect-tsvc-s176.c
@@ -14,7 +14,7 @@ real_t s176(struct args_t * func_args)
 initialise_arrays(__func__);
 
 int m = LEN_1D/2;
-for (int nl = 0; nl < 4*(iterations/LEN_1D); nl++) {
+for (int nl = 0; nl < 4*(10*iterations/LEN_1D); nl++) {
 for (int j = 0; j < (LEN_1D/2); j++) {
 for (int i = 0; i < m; i++) {
 a[i] += b[i+m-j-1] * c[j];
@@ -39,4 +39,4 @@ int main (int argc, char **argv)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail *-*-* } } } 
*/
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
-- 
2.36.1



[PATCH][RFC] c++: Accept elaborated-enum-base in system headers

2023-06-08 Thread Alex Coplan via Gcc-patches
Hi,

macOS SDK headers using the CF_ENUM macro can expand to invalid C++ code
of the form:

typedef enum T : BaseType T;

i.e. an elaborated-type-specifier with an additional enum-base.
Upstream LLVM can be made to accept the above construct with
-Wno-error=elaborated-enum-base.

This macro expansion occurs in the case that the compiler declares
support for enums with underlying type using __has_feature, see
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618450.html

GCC rejecting this construct outright means that GCC fails to bootstrap
on Darwin in the case that it (correctly) implements __has_feature and
declares support for C++ enums with underlying type.

This patch attempts to accept this construct in the C++ parser but only
if it appears in system headers. With this patch, GCC can bootstrap on
Darwin in combination with the (WIP) __has_feature patch posted at:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/617878.html

We also attempt to improve the diagnostic for this case, using a
similar diagnostic to that given by LLVM.

If it is more palatable I can look into restricting the change to accept
this code to only take effect on Darwin, but it's not clear that that's
any better or worse.

Other possible approaches here include trying to fixincludes the SDK
framework headers, but as Iain pointed out in the review of the
has_feature RFC, the necessary infrastructure doesn't exist at the
moment. Even if this support did exist, I believe the headers would
require quite extensive non-trivial "fixing".

Adjusting the parser to accept this construct in system headers seemed
more pragmatic and cleaner on balance.

Bootstrapped/regtested on aarch64-linux-gnu and x86_64-apple-darwin.

Any thoughts?

Thanks,
Alex

gcc/cp/ChangeLog:

* parser.cc (cp_parser_enum_specifier): Accept
elaborated-type-specifier with enum-base if in system headers.
Improve diagnostic when rejecting such a construct.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/enum40.C: Adjust expected diagnostics.
* g++.dg/cpp0x/forw_enum6.C: Likewise.
* g++.dg/ext/elab-enum-header.C: New test.
* g++.dg/ext/elab-enum-invalid.C: New test.
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index d77fbd20e56..e13133a6cfb 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -21024,11 +21024,13 @@ cp_parser_enum_specifier (cp_parser* parser)
 
   /* Check for the `:' that denotes a specified underlying type in C++0x.
  Note that a ':' could also indicate a bitfield width, however.  */
+  location_t colon_loc = UNKNOWN_LOCATION;
   if (cp_lexer_next_token_is (parser->lexer, CPP_COLON))
 {
   cp_decl_specifier_seq type_specifiers;
 
   /* Consume the `:'.  */
+  colon_loc = cp_lexer_peek_token (parser->lexer)->location;
   cp_lexer_consume_token (parser->lexer);
 
   auto tdf
@@ -21073,12 +21075,20 @@ cp_parser_enum_specifier (cp_parser* parser)
return error_mark_node;
}
   /* An opaque-enum-specifier must have a ';' here.  */
-  if ((scoped_enum_p || underlying_type)
+  if ((scoped_enum_p
+  || (underlying_type && !in_system_header_at (colon_loc)))
  && cp_lexer_next_token_is_not (parser->lexer, CPP_SEMICOLON))
{
  if (has_underlying_type)
-   cp_parser_commit_to_tentative_parse (parser);
- cp_parser_error (parser, "expected %<;%> or %<{%>");
+   {
+ cp_parser_commit_to_tentative_parse (parser);
+ error_at (colon_loc,
+   "declaration of enumeration with "
+   "fixed underlying type and no enumerator list is "
+   "only permitted as a standalone declaration");
+   }
+ else
+   cp_parser_error (parser, "expected %<;%> or %<{%>");
  if (has_underlying_type)
return error_mark_node;
}
diff --git a/gcc/testsuite/g++.dg/cpp0x/enum40.C 
b/gcc/testsuite/g++.dg/cpp0x/enum40.C
index cfdf2a4a18a..d3ffeb62d70 100644
--- a/gcc/testsuite/g++.dg/cpp0x/enum40.C
+++ b/gcc/testsuite/g++.dg/cpp0x/enum40.C
@@ -4,23 +4,25 @@
 void
 foo ()
 {
-  enum : int a alignas;// { dg-error "expected" }
+  enum : int a alignas;// { dg-error "declaration of enum" }
+  // { dg-error {expected '\(' before ';'} "" { target *-*-* } .-1 }
 }
 
 void
 bar ()
 {
-  enum : int a;// { dg-error "expected" }
+  enum : int a;// { dg-error "declaration of enum" }
 }
 
 void
 baz ()
 {
-  enum class a : int b alignas;// { dg-error "expected" }
+  enum class a : int b alignas;// { dg-error "declaration of enum" }
+  // { dg-error {expected '\(' before ';'} "" { target *-*-* } .-1 }
 }
 
 void
 qux ()
 {
-  enum class a : int b;// { dg-error "expected" }
+  enum class a : int b;// { dg-error "declaration of enum" }
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/forw_enum6.C 

Re: [PATCH][GCC][AArch64] convert some patterns to new MD syntax

2023-06-08 Thread Richard Earnshaw (lists) via Gcc-patches

On 08/06/2023 11:00, Tamar Christina via Gcc-patches wrote:

Hi All,

This converts some patterns in the AArch64 backend to use the new
compact syntax.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

gcc/ChangeLog:

* config/aarch64/aarch64.md (arches): Add nosimd.
(*mov_aarch64, *movsi_aarch64, *movdi_aarch64): Rewrite to
compact syntax.

Thanks,
Tamar


A few nits but ok apart from that:



--- inline copy of patch ---

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 
8b8951d7b14aa1a8858fdc24bf6f9dd3d927d5ea..601173338a9068f7694867c8e6e78f9b10f32a17
 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -366,7 +366,7 @@ (define_constants
  ;; As a convenience, "fp_q" means "fp" + the ability to move between
  ;; Q registers and is equivalent to "simd".
  
-(define_enum "arches" [ any rcpc8_4 fp fp_q simd sve fp16])

+(define_enum "arches" [ any rcpc8_4 fp fp_q simd nosimd sve fp16])
  
  (define_enum_attr "arch" "arches" (const_string "any"))
  
@@ -397,6 +397,9 @@ (define_attr "arch_enabled" "no,yes"

(and (eq_attr "arch" "fp_q, simd")
 (match_test "TARGET_SIMD"))
  
+	(and (eq_attr "arch" "nosimd")

+(match_test "!TARGET_SIMD"))
+
(and (eq_attr "arch" "fp16")
 (match_test "TARGET_FP_F16INST"))
  
@@ -1206,44 +1209,27 @@ (define_expand "mov"

  )
  
  (define_insn "*mov_aarch64"

-  [(set (match_operand:SHORT 0 "nonimmediate_operand" "=r,r,w,r  ,r,w, 
m,m,r,w,w")
-   (match_operand:SHORT 1 "aarch64_mov_operand"  " 
r,M,D,Usv,m,m,rZ,w,w,rZ,w"))]
+  [(set (match_operand:SHORT 0 "nonimmediate_operand")
+   (match_operand:SHORT 1 "aarch64_mov_operand"))]
"(register_operand (operands[0], mode)
  || aarch64_reg_or_zero (operands[1], mode))"
-{
-   switch (which_alternative)
- {
- case 0:
-   return "mov\t%w0, %w1";
- case 1:
-   return "mov\t%w0, %1";
- case 2:
-   return aarch64_output_scalar_simd_mov_immediate (operands[1],
-   mode);
- case 3:
-   return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
- case 4:
-   return "ldr\t%w0, %1";
- case 5:
-   return "ldr\t%0, %1";
- case 6:
-   return "str\t%w1, %0";
- case 7:
-   return "str\t%1, %0";
- case 8:
-   return TARGET_SIMD ? "umov\t%w0, %1.[0]" : "fmov\t%w0, %s1";
- case 9:
-   return TARGET_SIMD ? "dup\t%0., %w1" : "fmov\t%s0, %w1";
- case 10:
-   return TARGET_SIMD ? "dup\t%0, %1.[0]" : "fmov\t%s0, %s1";
- default:
-   gcc_unreachable ();
- }
-}
-  ;; The "mov_imm" type for CNT is just a placeholder.
-  [(set_attr "type" "mov_reg,mov_imm,neon_move,mov_imm,load_4,load_4,store_4,
-store_4,neon_to_gp,neon_from_gp,neon_dup")
-   (set_attr "arch" "*,*,simd,sve,*,*,*,*,*,*,*")]
+  {@ [cons: =0, 1; attrs: type, arch]
+ [r , r; mov_reg, * ] mov\t%w0, %w1

  ^
This space seems redundant as all alternatives have a single letter for 
the first constraint.  Perhaps this is a hang-over from when the first 
alternative had '=r'?




+ [r , M; mov_imm, * ] mov\t%w0, %1
+ [w , D; neon_move  , simd  ] << 
aarch64_output_scalar_simd_mov_immediate (operands[1], mode);
+ /* The "mov_imm" type for CNT is just a placeholder.  */
+ [r , Usv  ; mov_imm, sve   ] << aarch64_output_sve_cnt_immediate ("cnt", 
"%x0", operands[1]);
+ [r , m; load_4 , * ] ldr\t%w0, %1
+ [w , m; load_4 , * ] ldr\t%0, %1
+ [m , rZ   ; store_4, * ] str\\t%w1, %0


I'd write "rZ" as "r Z" to make it clear that these are two separate 
constraints and not a single multi-letter constraint.



+ [m , w; store_4, * ] str\t%1, %0
+ [r , w; neon_to_gp  , simd  ] umov\t%w0, %1.[0]
+ [r , w; neon_to_gp  , nosimd] fmov\t%w0, %s1 /*foo */
+ [w , rZ   ; neon_from_gp, simd  ] dup\t%0., %w1
+ [w , rZ   ; neon_from_gp, nosimd] fmov\t%s0, %w1
+ [w , w; neon_dup   , simd  ] dup\t%0, %1.[0]
+ [w , w; neon_dup   , nosimd] fmov\t%s0, %s1
+  }
  )
  
  (define_expand "mov"

@@ -1280,79 +1266,71 @@ (define_expand "mov"
  )
  
  (define_insn_and_split "*movsi_aarch64"

-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r,w, m, m,  r,  
r,  r, w,r,w, w")
-   (match_operand:SI 1 "aarch64_mov_operand"  " 
r,r,k,M,n,Usv,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand")
+   (match_operand:SI 1 "aarch64_mov_operand"))]
"(register_operand (operands[0], SImode)
  || aarch64_reg_or_zero (operands[1], SImode))"
-  "@
-   mov\\t%w0, %w1
-   mov\\t%w0, %w1
-   mov\\t%w0, %w1
-   mov\\t%w0, %1
-   #
-   * return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
-   ldr\\t%w0, %1
-   ldr\\t%s0, %1
-   

[PATCH 2/2] AArch64: New RTL for ABD

2023-06-08 Thread Oluwatamilore Adebayo via Gcc-patches
From: oluade01 

This patch adds new RTL and tests for sabd and uabd.
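
The aarch64-simd.md comment kept as context in the diff further down
explains why the pattern is written as max minus min rather than as ABS
of MINUS.  A minimal C sketch of that 8-bit example follows, assuming
wrapping QImode arithmetic is modelled with uint8_t; both function names
are hypothetical and neither is code from the patch.

```c
#include <stdint.h>

/* ABS of a wrapping 8-bit subtraction, modelled with unsigned
   arithmetic so that the wrap-around is well defined in C.  */
uint8_t
abs_of_wrapped_minus (int8_t a, int8_t b)
{
  uint8_t diff = (uint8_t) ((uint8_t) a - (uint8_t) b); /* modulo 256 */
  /* Bit patterns >= 128 are negative when read as signed 8-bit.  */
  return diff >= 128 ? (uint8_t) (0u - diff) : diff;
}

/* max (a, b) - min (a, b), computed without wrapping.  The result is
   in the range 0..255 and therefore fits in uint8_t.  */
uint8_t
max_minus_min (int8_t a, int8_t b)
{
  int hi = a > b ? a : b;
  int lo = a > b ? b : a;
  return (uint8_t) (hi - lo);
}
```

For the operands 64 and -128 the first function returns 64 and the second
returns 192, which are the two values given in that comment.  For
operands whose difference fits in the signed 8-bit range the two forms
agree.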

PR tree-optimization/109156

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def (sabd, uabd):
Change the mode to 3.
* config/aarch64/aarch64-simd.md (aarch64_abd):
Rename to abd3.
* config/aarch64/aarch64-sve.md (abd_3): Rename
to abd3.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/abd.h: New file.
* gcc.target/aarch64/abd_2.c: New test.
* gcc.target/aarch64/abd_3.c: New test.
* gcc.target/aarch64/abd_4.c: New test.
* gcc.target/aarch64/abd_none_2.c: New test.
* gcc.target/aarch64/abd_none_3.c: New test.
* gcc.target/aarch64/abd_none_4.c: New test.
* gcc.target/aarch64/abd_run_1.c: New test.
* gcc.target/aarch64/sve/abd_1.c: New test.
* gcc.target/aarch64/sve/abd_none_1.c: New test.
* gcc.target/aarch64/sve/abd_2.c: New test.
* gcc.target/aarch64/sve/abd_none_2.c: New test.
---
 gcc/config/aarch64/aarch64-simd-builtins.def  |  6 +-
 gcc/config/aarch64/aarch64-simd.md|  4 +-
 gcc/config/aarch64/aarch64-sve.md |  4 +-
 gcc/testsuite/gcc.target/aarch64/abd.h| 68 ++
 gcc/testsuite/gcc.target/aarch64/abd_2.c  | 35 +++
 gcc/testsuite/gcc.target/aarch64/abd_3.c  | 36 +++
 gcc/testsuite/gcc.target/aarch64/abd_4.c  | 30 ++
 gcc/testsuite/gcc.target/aarch64/abd_none_2.c | 14 +++
 gcc/testsuite/gcc.target/aarch64/abd_none_3.c | 14 +++
 gcc/testsuite/gcc.target/aarch64/abd_none_4.c | 19 
 gcc/testsuite/gcc.target/aarch64/abd_run_1.c  | 93 +++
 gcc/testsuite/gcc.target/aarch64/sve/abd_1.c  | 35 +++
 gcc/testsuite/gcc.target/aarch64/sve/abd_2.c  | 32 +++
 .../gcc.target/aarch64/sve/abd_none_1.c   | 13 +++
 .../gcc.target/aarch64/sve/abd_none_2.c   | 18 
 15 files changed, 414 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd.h
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_none_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_none_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_none_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_run_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/abd_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/abd_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/abd_none_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/abd_none_2.c

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
b/gcc/config/aarch64/aarch64-simd-builtins.def
index 
1beaa08c1e7c94bc13a64865ddb677345534699c..3efbf0a1874f6242e69665b8316d9a7d62a9c8cf
 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -194,9 +194,9 @@
   BUILTIN_VDQV_L (UNOP, saddlv, 0, NONE)
   BUILTIN_VDQV_L (UNOPU, uaddlv, 0, NONE)
 
-  /* Implemented by aarch64_abd.  */
-  BUILTIN_VDQ_BHSI (BINOP, sabd, 0, NONE)
-  BUILTIN_VDQ_BHSI (BINOPU, uabd, 0, NONE)
+  /* Implemented by abd3.  */
+  BUILTIN_VDQ_BHSI (BINOP, sabd, 3, NONE)
+  BUILTIN_VDQ_BHSI (BINOPU, uabd, 3, NONE)
 
   /* Implemented by aarch64_aba.  */
   BUILTIN_VDQ_BHSI (TERNOP, saba, 0, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 
a567f016b354c0f0542e58e7b51c0be739882d65..da35a928bac91db61f4e9884d9c8b162c3a3c937
 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -896,7 +896,7 @@ (define_insn "aarch64_abs"
 ;; So (ABS:QI (minus:QI 64 -128)) == (ABS:QI (192 or -64 signed)) == 64.
 ;; Whereas SABD would return 192 (-64 signed) on the above example.
 ;; Use MINUS ([us]max (op1, op2), [us]min (op1, op2)) instead.
-(define_insn "aarch64_abd"
+(define_insn "abd3"
   [(set (match_operand:VDQ_BHSI 0 "register_operand" "=w")
(minus:VDQ_BHSI
  (USMAX:VDQ_BHSI
@@ -1087,7 +1087,7 @@ (define_expand "sadv16qi"
   {
rtx ones = force_reg (V16QImode, CONST1_RTX (V16QImode));
rtx abd = gen_reg_rtx (V16QImode);
-   emit_insn (gen_aarch64_abdv16qi (abd, operands[1], operands[2]));
+   emit_insn (gen_abdv16qi3 (abd, operands[1], operands[2]));
emit_insn (gen_udot_prodv16qi (operands[0], abd, ones, operands[3]));
DONE;
   }
diff --git a/gcc/config/aarch64/aarch64-sve.md 
b/gcc/config/aarch64/aarch64-sve.md
index 
2898b85376b831c2728b806e0f2079086345f1fe..2de651a1989c6b36272dd78a8744c700ebc75c1a
 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -4001,7 +4001,7 @@ (define_insn_and_rewrite "*aarch64_adr_shift_uxtw"
 ;; -
 
 ;; Unpredicated integer 

Re: [PATCH] rtl: AArch64: New RTL for ABD

2023-06-08 Thread Oluwatamilore Adebayo via Gcc-patches
> It would be good to add a:
> 
> /* { dg-final { scan-assembler-not {\tabs\t} } } */
> 
> to be the positive tests, to make it more obvious that all separate
> ABS instructions are elided.

Done.

Patch is in the next response.


Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Richard Earnshaw (lists) via Gcc-patches

On 08/06/2023 11:29, Richard Earnshaw (lists) via Gcc-patches wrote:

On 08/06/2023 11:12, Andreas Schwab wrote:

On Jun 08 2023, Tamar Christina via Gcc-patches wrote:

@@ -713,6 +714,183 @@ you can use @samp{*} inside of a @samp{@@} 
multi-alternative template:

  @end group
  @end smallexample
+@node Compact Syntax
+@section Compact Syntax
+@cindex compact syntax
+
+In cases where the number of alternatives in a @code{define_insn} or
+@code{define_insn_and_split} are large then it may be beneficial to 
use the


 is large



Or perhaps better still:

When a define_insn or define_insn_and_split has many alternatives it may 
be beneficial to ...


R.


Or perhaps even s/many/multiple/.  It doesn't have to have very many to 
make this new syntax preferable, IMO.


R.


[PATCH 1/2] Missed opportunity to use [SU]ABD

2023-06-08 Thread Oluwatamilore Adebayo via Gcc-patches
From: oluade01 

This adds a recognition pattern for the non-widening
absolute difference (ABD).
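
As a rough sketch of the scalar idiom in question, assuming 32-bit
elements and inputs whose difference does not overflow (the function
names are hypothetical, and this is not code from the patch or its
tests):

```c
#include <stddef.h>
#include <stdint.h>

/* The abs (a - b) shape: subtract, then take the absolute value in
   the same (non-widened) type.  */
void
abd_abs_form (int32_t *restrict out, const int32_t *restrict a,
	      const int32_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      int32_t diff = a[i] - b[i];
      out[i] = diff < 0 ? -diff : diff;
    }
}

/* The same operation written as the select form that the md.texi
   hunk below gives as the C equivalent.  */
void
abd_select_form (int32_t *restrict out, const int32_t *restrict a,
		 const int32_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
}
```

Both loops produce the same values for inputs in range; the first is the
abs (a - b) form that the ChangeLog entry below names as the recognized
idiom.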

gcc/ChangeLog:

* doc/md.texi (sabd, uabd): Document them.
* internal-fn.def (ABD): Use new optab.
* optabs.def (sabd_optab, uabd_optab): New optabs.
* tree-vect-patterns.cc (vect_recog_absolute_difference):
Recognize the following idiom abs (a - b).
(vect_recog_sad_pattern): Refactor to use
vect_recog_absolute_difference.
(vect_recog_abd_pattern): Use patterns found by
vect_recog_absolute_difference to build a new ABD
internal call.
---
 gcc/doc/md.texi   |  10 ++
 gcc/internal-fn.def   |   3 +
 gcc/optabs.def|   2 +
 gcc/tree-vect-patterns.cc | 259 +-
 4 files changed, 244 insertions(+), 30 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 
6a435eb44610960513e9739ac9ac1e8a27182c10..e11b10d2fca11016232921bc85e47975f700e6c6
 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5787,6 +5787,16 @@ Other shift and rotate instructions, analogous to the
 Vector shift and rotate instructions that take vectors as operand 2
 instead of a scalar type.
 
+@cindex @code{uabd@var{m}} instruction pattern
+@cindex @code{sabd@var{m}} instruction pattern
+@item @samp{uabd@var{m}}, @samp{sabd@var{m}}
+Signed and unsigned absolute difference instructions.  These
+instructions find the difference between operands 1 and 2
+then return the absolute value.  A C code equivalent would be:
+@smallexample
+op0 = op1 > op2 ? op1 - op2 : op2 - op1;
+@end smallexample
+
 @cindex @code{avg@var{m}3_floor} instruction pattern
 @cindex @code{uavg@var{m}3_floor} instruction pattern
 @item @samp{avg@var{m}3_floor}
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 
3ac9d82aace322bd8ef108596e5583daa18c76e3..116965f4830cec8f60642ff011a86b6562e2c509
 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -191,6 +191,9 @@ DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary)
 
+DEF_INTERNAL_SIGNED_OPTAB_FN (ABD, ECF_CONST | ECF_NOTHROW, first,
+ sabd, uabd, binary)
+
 DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_FLOOR, ECF_CONST | ECF_NOTHROW, first,
  savg_floor, uavg_floor, binary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_CEIL, ECF_CONST | ECF_NOTHROW, first,
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 
6c064ff4993620067d38742a0bfe0a3efb511069..35b835a6ac56d72417dac8ddfd77a8a7e2475e65
 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -359,6 +359,8 @@ OPTAB_D (mask_fold_left_plus_optab, 
"mask_fold_left_plus_$a")
 OPTAB_D (extract_last_optab, "extract_last_$a")
 OPTAB_D (fold_extract_last_optab, "fold_extract_last_$a")
 
+OPTAB_D (uabd_optab, "uabd$a3")
+OPTAB_D (sabd_optab, "sabd$a3")
 OPTAB_D (savg_floor_optab, "avg$a3_floor")
 OPTAB_D (uavg_floor_optab, "uavg$a3_floor")
 OPTAB_D (savg_ceil_optab, "avg$a3_ceil")
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 
dc102c919352a0328cf86eabceb3a38c41a7e4fd..7296892aaa07da59b8122d29a22a2f583e8ff5aa
 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -782,6 +782,100 @@ vect_split_statement (vec_info *vinfo, stmt_vec_info 
stmt2_info, tree new_rhs,
 }
 }
 
+/* Look for the following pattern
+   X = x[i]
+   Y = y[i]
+   DIFF = X - Y
+   DAD = ABS_EXPR
+
+   ABS_STMT should point to a statement of code ABS_EXPR or ABSU_EXPR.
+   HALF_TYPE and UNPROM will be set should the statement be found to
+   be a widened operation.
+   DIFF_STMT will be set to the MINUS_EXPR
+   statement that precedes the ABS_STMT unless vect_widened_op_tree
+   succeeds.
+ */
+static bool
+vect_recog_absolute_difference (vec_info *vinfo, gassign *abs_stmt,
+   tree *half_type,
+   vect_unpromoted_value unprom[2],
+   gassign **diff_stmt)
+{
+  if (!abs_stmt)
+return false;
+
+  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
+ inside the loop (in case we are analyzing an outer-loop).  */
+  enum tree_code code = gimple_assign_rhs_code (abs_stmt);
+  if (code != ABS_EXPR && code != ABSU_EXPR)
+return false;
+
+  tree abs_oprnd = gimple_assign_rhs1 (abs_stmt);
+  tree abs_type = TREE_TYPE (abs_oprnd);
+  if (!abs_oprnd)
+return false;
+  if (!ANY_INTEGRAL_TYPE_P (abs_type)
+  || TYPE_OVERFLOW_WRAPS (abs_type)
+  || TYPE_UNSIGNED (abs_type))
+return false;
+
+  /* Peel off conversions from the ABS input.  This can involve sign
+ changes (e.g. from an unsigned subtraction to a signed ABS input)
+ or signed promotion, but it can't include unsigned promotion.
+ (Note that ABS of an unsigned promotion should have been folded
+ away before now anyway.)  */
+  vect_unpromoted_value 

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Richard Earnshaw (lists) via Gcc-patches

On 08/06/2023 11:12, Andreas Schwab wrote:

On Jun 08 2023, Tamar Christina via Gcc-patches wrote:


@@ -713,6 +714,183 @@ you can use @samp{*} inside of a @samp{@@} 
multi-alternative template:
  @end group
  @end smallexample
  
+@node Compact Syntax

+@section Compact Syntax
+@cindex compact syntax
+
+In cases where the number of alternatives in a @code{define_insn} or
+@code{define_insn_and_split} are large then it may be beneficial to use the


 is large



Or perhaps better still:

When a define_insn or define_insn_and_split has many alternatives it may 
be beneficial to ...


R.


Re: [PATCH] vect: Missed opportunity to use [SU]ABD

2023-06-08 Thread Oluwatamilore Adebayo via Gcc-patches
New patch to address an issue brought up in a different
thread: mptjzwgplp2@arm.com

> > +  /* Failed to find a widen operation so we check for a regular 
> > MINUS_EXPR.  */
> > +  if (diff
> > +  && gimple_assign_rhs_code (diff) == MINUS_EXPR
> > +  && (TYPE_UNSIGNED (abs_type) || TYPE_OVERFLOW_UNDEFINED (abs_type)))
> > +{
> > +  *half_type = NULL_TREE;
> > +  return true;
> > +}
> 
> the condition should instead be:
> 
>   if (diff
>   && gimple_assign_rhs_code (diff) == MINUS_EXPR
>   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd)))
> {
>   *half_type = NULL_TREE;
>   return true;
> }
> 
> That is, we rely on overflow being undefined, so we need to check
> TYPE_OVERFLOW_UNDEFINED on the type of the subtraction (rather than
> abs_type, which is the type of ABS input, and at this point can be
> different from TREE_TYPE (abs_oprnd)).

I found that this change alone rejects cases that should still be
accepted, so I added an extra step: if this check fails, we try to find
the unpromoted diff operands and then apply the overflow check to the
types of those unpromoted operands.

Patch is in the next response.


Re: [PATCH] RISC-V: Add Veyron V1 pipeline description

2023-06-08 Thread Kito Cheng via Gcc-patches
> On Thu 8. Jun 2023 at 09:35, Kito Cheng via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
> > > diff --git a/gcc/config/riscv/riscv-cores.def
> > b/gcc/config/riscv/riscv-cores.def
> > > index 7d87ab7ce28..4078439e562 100644
> > > --- a/gcc/config/riscv/riscv-cores.def
> > > +++ b/gcc/config/riscv/riscv-cores.def
> > > @@ -38,6 +38,7 @@ RISCV_TUNE("sifive-3-series", generic,
> > rocket_tune_info)
> > >  RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
> > >  RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
> > >  RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
> > > +RISCV_TUNE("veyron-v1", veyron_v1, veyron_v1_tune_info)
> > >  RISCV_TUNE("size", generic, optimize_size_tune_info)
> > >
> > >  #undef RISCV_TUNE
> > > @@ -77,4 +78,7 @@ RISCV_CORE("thead-c906",
> > "rv64imafdc_xtheadba_xtheadbb_xtheadbs_xtheadcmo_"
> > >   "xtheadcondmov_xtheadfmemidx_xtheadmac_"
> > >   "xtheadmemidx_xtheadmempair_xtheadsync",
> > >   "thead-c906")
> > > +
> > > +RISCV_CORE("veyron-v1",
> >  "rv64imafdc_zba_zbb_zbc_zbs_zifencei_xventanacondops",
> > > + "veyron-v1")
> >
> > It seems xventanacondops is not in trunk yet; I saw that Jeff
> > approved it before, but it has not been committed yet.
>
>
> We couldn’t apply it back then, as Veyron-V1 had not been announced yet.
> Can we move this forward now?
>

Oh, okay, I see the awkward point... I am OK with that on the GCC side,
but I would like binutils to support it first, or the extension to be
removed from the mcpu definition temporarily until binutils supports it;
otherwise it is just broken support for that CPU on trunk GCC.




> Philipp.
>
> >
>


Re: [PATCH] Fortran: add Fortran 2018 IEEE_{MIN,MAX} functions

2023-06-08 Thread FX Coudert via Gcc-patches
> Having a POWER system isn't enough, it also needs the IBM "advance
> toolchain", and (at least with current distros, which default to
> ibm long double), you need to dance counterclockwise three
> times... I mean you need to invoke configure with some special magic

Thanks for the frank description, Thomas. To be honest, it reinforces my 
feeling from when this was originally proposed and added: why are we doing so 
much extra work for a feature that is used by such a tiny fraction of our user 
base.

FX

Re: [PATCH] Fortran: add Fortran 2018 IEEE_{MIN,MAX} functions

2023-06-08 Thread Thomas Koenig via Gcc-patches

Hi all,


On 6/6/23 21:11, FX Coudert via Gcc-patches wrote:

Hi,


I cannot see if there is proper support for kind=17 in your patch;
at least the libgfortran/ieee/ieee_arithmetic.F90 part does not
seem to have any related code.


Can real(kind=17) ever be an IEEE mode? If so, something seriously wrong 
happened, because the IEEE modules have no kind=17 mention in them anywhere.

Actually, where is the kind=17 documented?

FX


I was hoping for Thomas to come forward with some comment, as
he was quite involved in related work.

There are several threads on IEEE128 for Power on the fortran ML
e.g. around November/December 2021, January 2022.

I wasn't meaning to block your work, just wondering if the Power
platform needs more attention here.



% cd gcc/gccx/libgfortran
% grep HAVE_GFC_REAL_17 ieee/*
% troutmask:sgk[219] ls ieee
% ieee_arithmetic.F90 ieee_features.F90
% ieee_exceptions.F90 ieee_helper.c

There are zero hits for REAL(17) in the IEEE code.  If REAL(17)
is intended to be an IEEE-754 type, then it seems gfortran's
support was never added for it.  If anyone has access to a
power system, it's easy to test

program foo
use ieee_arithmetic
print *, ieee_support_datatype(1.e_17)
end program foo


The KIND=17 is a bit of a kludge.  It is not visible for
user programs, they use KIND=16, but this is then translated
to library calls as if it was KIND=17 if the IEEE 128-bit floats
are selected:

$ cat ml.f90
subroutine mm(a)
  real(kind=16), dimension(:,:) :: a
  print *,maxloc(a)
end subroutine mm
$ gfortran -S -mabi=ieeelongdouble ml.f90 && grep maxloc ml.s
bl _gfortran_maxloc0_4_r17
$ gfortran -S  ml.f90 && grep maxloc ml.s
bl _gfortran_maxloc0_4_r16

On POWER, if IBM long double exists, it is GFC_REAL_16, with GFC_REAL_17
being IEEE long double. Everywhere else, GFC_REAL_16 is IEEE long
double.

If POWER gets the flag -mabi=ieeelongdouble, it uses IEEE long doubles.

If it gets the additional flag -mcpu=power9 or -mcpu=power10, it uses
the hardware instructions for the arithmetic instead of library calls.
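
As a rough illustration of the rule above (a sketch, not libgfortran
code): the helper names and the two boolean parameters below are
hypothetical; only the two symbol names come from the grep output shown
earlier.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: which library kind a user-visible REAL(KIND=16)
// maps to.  Per the description above, only POWER with
// -mabi=ieeelongdouble translates KIND=16 into KIND=17 library calls.
int library_real_kind(bool is_power, bool abi_ieee_long_double)
{
  return (is_power && abi_ieee_long_double) ? 17 : 16;
}

// Hypothetical helper: builds the maxloc symbol name seen in the
// assembly output for the ml.f90 example.
std::string maxloc0_symbol(bool is_power, bool abi_ieee_long_double)
{
  int kind = library_real_kind(is_power, abi_ieee_long_double);
  assert(kind == 16 || kind == 17);
  return "_gfortran_maxloc0_4_r" + std::to_string(kind);
}
```

The point of the sketch is only that the user-facing kind never changes;
the ABI flag selects which library entry point is called.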

Having a POWER system isn't enough, it also needs the IBM "advance
toolchain", and (at least with current distros, which default to
ibm long double), you need to dance counterclockwise three
times... I mean you need to invoke configure with some special magic
like

configure \
--prefix=$HOME \
--enable-languages=c,c++,fortran \
--disable-plugin \
--enable-checking \
--enable-stage1-checking \
--enable-gnu-indirect-function \
--disable-maintainer-mode \
--disable-libgomp \
--enable-decimal-float \
--enable-secureplt \
--enable-threads=posix \
--enable-__cxa_atexit \
--with-cpu=power9 \
--with-long-double-128 \
--with-as=/opt/at15.0/bin/as \
--with-ld=/opt/at15.0/bin/ld \
--with-gnu-as=/opt/at15.0/bin/as \
--with-gnu-ld=/opt/at15.0/bin/ld \
--with-advance-toolchain=at15.0 \
--with-native-system-header-dir=/opt/at15.0/include \
--without-ppl \
--without-cloog \
--without-isl

which Michael Meissner helped me with, I would never have figured it out
on my own.

There is a virtual POWER machine at OSUL dedicated to the IEEE QP
gfortran effort. It hasn't been used for some time, but it's still
active. I just bootstrapped trunk there and ran all the IEEE tests from
the testsuite manually, with

$ for a in *.f90; do echo "Testing $a"; gfortran -o $a.exe \
    -fno-range-check -mcpu=power9 -mabi=ieeelongdouble -static-libgfortran \
    $a signaling_1_c.c signaling_2_c.c ; ./$a.exe ; done 2>&1 | \
    grep -v command-line

Testing fma_1.f90
   2.
   1.5000
   2.
   1.5000
   2.000
   1.500
   2.
   1.5000
Testing ieee_10.f90
Testing ieee_12.f90
Testing ieee_2.f90
Testing ieee_3.f90
Testing ieee_4.f90
Testing ieee_5.f90
Testing ieee_6.f90
Testing ieee_7.f90
Testing ieee_8.f90
Testing ieee_9.f90
Testing intrinsics_1.f90
Testing large_1.f90
Testing large_2.f90
Testing large_4.f90
Testing modes_1.f90
Testing pr77372.f90
Testing pr77507.f90
-Infinity
 F
Testing rounding_1.f90
Testing rounding_2.f90
Testing rounding_3.f90
Testing signaling_1.f90
Testing signaling_2.f90
Testing signaling_3.f90
Testing signbit_1.f90
Testing underflow_1.f90

so that seems to be OK.  However, the fact that there is no
mention of GFC_REAL_17 in there makes me a bit suspicious.

Michael, maybe you can comment if all is indeed well there,
and if the right things are being tested?

Regarding FX's patch: I am not quite sure that I am
actually testing the right thing if running the testsuite
there, so POWER should not hold up this patch.  If it turns
out that POWER needs additional work on IEEE, we can always
add that later.

Best regards

Thomas


Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Andreas Schwab
On Jun 08 2023, Tamar Christina via Gcc-patches wrote:

> @@ -713,6 +714,183 @@ you can use @samp{*} inside of a @samp{@@} 
> multi-alternative template:
>  @end group
>  @end smallexample
>  
> +@node Compact Syntax
> +@section Compact Syntax
> +@cindex compact syntax
> +
> +In cases where the number of alternatives in a @code{define_insn} or
> +@code{define_insn_and_split} are large then it may be beneficial to use the

is large

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[PATCH][GCC][AArch64] convert some patterns to new MD syntax

2023-06-08 Thread Tamar Christina via Gcc-patches
Hi All,

This converts some patterns in the AArch64 backend to use the new
compact syntax.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

gcc/ChangeLog:

* config/aarch64/aarch64.md (arches): Add nosimd.
(*mov_aarch64, *movsi_aarch64, *movdi_aarch64): Rewrite to
compact syntax.

Thanks,
Tamar

--- inline copy of patch ---

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 8b8951d7b14aa1a8858fdc24bf6f9dd3d927d5ea..601173338a9068f7694867c8e6e78f9b10f32a17 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -366,7 +366,7 @@ (define_constants
 ;; As a convenience, "fp_q" means "fp" + the ability to move between
 ;; Q registers and is equivalent to "simd".
 
-(define_enum "arches" [ any rcpc8_4 fp fp_q simd sve fp16])
+(define_enum "arches" [ any rcpc8_4 fp fp_q simd nosimd sve fp16])
 
 (define_enum_attr "arch" "arches" (const_string "any"))
 
@@ -397,6 +397,9 @@ (define_attr "arch_enabled" "no,yes"
(and (eq_attr "arch" "fp_q, simd")
 (match_test "TARGET_SIMD"))
 
+   (and (eq_attr "arch" "nosimd")
+(match_test "!TARGET_SIMD"))
+
(and (eq_attr "arch" "fp16")
 (match_test "TARGET_FP_F16INST"))
 
@@ -1206,44 +1209,27 @@ (define_expand "mov"
 )
 
 (define_insn "*mov_aarch64"
-  [(set (match_operand:SHORT 0 "nonimmediate_operand" "=r,r,w,r  ,r,w, 
m,m,r,w,w")
-   (match_operand:SHORT 1 "aarch64_mov_operand"  " 
r,M,D,Usv,m,m,rZ,w,w,rZ,w"))]
+  [(set (match_operand:SHORT 0 "nonimmediate_operand")
+   (match_operand:SHORT 1 "aarch64_mov_operand"))]
   "(register_operand (operands[0], mode)
 || aarch64_reg_or_zero (operands[1], mode))"
-{
-   switch (which_alternative)
- {
- case 0:
-   return "mov\t%w0, %w1";
- case 1:
-   return "mov\t%w0, %1";
- case 2:
-   return aarch64_output_scalar_simd_mov_immediate (operands[1],
-   mode);
- case 3:
-   return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
- case 4:
-   return "ldr\t%w0, %1";
- case 5:
-   return "ldr\t%0, %1";
- case 6:
-   return "str\t%w1, %0";
- case 7:
-   return "str\t%1, %0";
- case 8:
-   return TARGET_SIMD ? "umov\t%w0, %1.[0]" : "fmov\t%w0, %s1";
- case 9:
-   return TARGET_SIMD ? "dup\t%0., %w1" : "fmov\t%s0, %w1";
- case 10:
-   return TARGET_SIMD ? "dup\t%0, %1.[0]" : "fmov\t%s0, %s1";
- default:
-   gcc_unreachable ();
- }
-}
-  ;; The "mov_imm" type for CNT is just a placeholder.
-  [(set_attr "type" "mov_reg,mov_imm,neon_move,mov_imm,load_4,load_4,store_4,
-store_4,neon_to_gp,neon_from_gp,neon_dup")
-   (set_attr "arch" "*,*,simd,sve,*,*,*,*,*,*,*")]
+  {@ [cons: =0, 1; attrs: type, arch]
+ [r , r; mov_reg, * ] mov\t%w0, %w1
+ [r , M; mov_imm, * ] mov\t%w0, %1
+ [w , D; neon_move  , simd  ] << 
aarch64_output_scalar_simd_mov_immediate (operands[1], mode);
+ /* The "mov_imm" type for CNT is just a placeholder.  */
+ [r , Usv  ; mov_imm, sve   ] << aarch64_output_sve_cnt_immediate 
("cnt", "%x0", operands[1]);
+ [r , m; load_4 , * ] ldr\t%w0, %1
+ [w , m; load_4 , * ] ldr\t%0, %1
+ [m , rZ   ; store_4, * ] str\\t%w1, %0
+ [m , w; store_4, * ] str\t%1, %0
+ [r , w; neon_to_gp  , simd  ] umov\t%w0, %1.[0]
+ [r , w; neon_to_gp  , nosimd] fmov\t%w0, %s1 /*foo */
+ [w , rZ   ; neon_from_gp, simd  ] dup\t%0., %w1
+ [w , rZ   ; neon_from_gp, nosimd] fmov\t%s0, %w1
+ [w , w; neon_dup   , simd  ] dup\t%0, %1.[0]
+ [w , w; neon_dup   , nosimd] fmov\t%s0, %s1
+  }
 )
 
 (define_expand "mov"
@@ -1280,79 +1266,71 @@ (define_expand "mov"
 )
 
 (define_insn_and_split "*movsi_aarch64"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r,w, m, m,  
r,  r,  r, w,r,w, w")
-   (match_operand:SI 1 "aarch64_mov_operand"  " 
r,r,k,M,n,Usv,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand")
+   (match_operand:SI 1 "aarch64_mov_operand"))]
   "(register_operand (operands[0], SImode)
 || aarch64_reg_or_zero (operands[1], SImode))"
-  "@
-   mov\\t%w0, %w1
-   mov\\t%w0, %w1
-   mov\\t%w0, %w1
-   mov\\t%w0, %1
-   #
-   * return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]);
-   ldr\\t%w0, %1
-   ldr\\t%s0, %1
-   str\\t%w1, %0
-   str\\t%s1, %0
-   adrp\\t%x0, %A1\;ldr\\t%w0, [%x0, %L1]
-   adr\\t%x0, %c1
-   adrp\\t%x0, %A1
-   fmov\\t%s0, %w1
-   fmov\\t%w0, %s1
-   fmov\\t%s0, %s1
-   * return aarch64_output_scalar_simd_mov_immediate (operands[1], SImode);"
+  {@ [cons: =0, 1; attrs: type, arch, length]
+ [r , r  ; mov_reg  , *   , 4] mov\t%w0, %w1
+ [k , r  ; mov_reg  , *   , 4] ^
+ [r , k  ; mov_reg  , * 

RE: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Tamar Christina via Gcc-patches
Hi,

New version of the patch; I've omitted the explanation again.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Any feedback?

Thanks,
Tamar

gcc/ChangeLog:

* gensupport.cc (class conlist, add_constraints, add_attributes,
create_missing_attributes, skip_spaces, expect_char,
preprocess_compact_syntax, parse_section_layout, parse_section,
convert_syntax): New.
(process_rtx): Check for conversion.
* genoutput.cc (process_template): Check for unresolved iterators.
(class data): Add compact_syntax_p.
(gen_insn): Use it.
* gensupport.h (compact_syntax): New.
(hash-set.h): Include.

Co-Authored-By: Omar Tahir 

--- inline copy of patch ---

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 6a435eb44610960513e9739ac9ac1e8a27182c10..eee3684cd0865dbb07c0da45e0aa4ac0ce4e9643 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -27,6 +27,7 @@ See the next chapter for information on the C header file.
 from such an insn.
 * Output Statement::For more generality, write C code to output
 the assembler code.
+* Compact Syntax::  Compact syntax for writing machine descriptors.
 * Predicates::  Controlling what kinds of operands can be used
 for an insn.
 * Constraints:: Fine-tuning operand selection.
@@ -713,6 +714,183 @@ you can use @samp{*} inside of a @samp{@@} 
multi-alternative template:
 @end group
 @end smallexample
 
+@node Compact Syntax
+@section Compact Syntax
+@cindex compact syntax
+
+In cases where the number of alternatives in a @code{define_insn} or
+@code{define_insn_and_split} are large then it may be beneficial to use the
+compact syntax when specifying alternatives.
+
+This syntax puts the constraints and attributes on the same horizontal line as
+the instruction assembly template.
+
+As an example
+
+@smallexample
+@group
+(define_insn_and_split ""
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r")
+   (match_operand:SI 1 "aarch64_mov_operand"  " r,r,k,M,n,Usv"))]
+  ""
+  "@@
+   mov\\t%w0, %w1
+   mov\\t%w0, %w1
+   mov\\t%w0, %w1
+   mov\\t%w0, %1
+   #
+   * return aarch64_output_sve_cnt_immediate ('cnt', '%x0', operands[1]);"
+  "&& true"
+   [(const_int 0)]
+  @{
+ aarch64_expand_mov_immediate (operands[0], operands[1]);
+ DONE;
+  @}
+  [(set_attr "type" "mov_reg,mov_reg,mov_reg,mov_imm,mov_imm,mov_imm")
+   (set_attr "arch"   "*,*,*,*,*,sve")
+   (set_attr "length" "4,4,4,4,*,  4")
+]
+)
+@end group
+@end smallexample
+
+can be better expressed as:
+
+@smallexample
+@group
+(define_insn_and_split ""
+  [(set (match_operand:SI 0 "nonimmediate_operand")
+   (match_operand:SI 1 "aarch64_mov_operand"))]
+  ""
+  @{@@ [cons: =0, 1; attrs: type, arch, length]
+ [r , r  ; mov_reg  , *   , 4] mov\t%w0, %w1
+ [k , r  ; mov_reg  , *   , 4] ^
+ [r , k  ; mov_reg  , *   , 4] ^
+ [r , M  ; mov_imm  , *   , 4] mov\t%w0, %1
+ [r , n  ; mov_imm  , *   , *] #
+ [r , Usv; mov_imm  , sve , 4] << aarch64_output_sve_cnt_immediate ("cnt", 
"%x0", operands[1]);
+  @}
+  "&& true"
+  [(const_int 0)]
+  @{
+aarch64_expand_mov_immediate (operands[0], operands[1]);
+DONE;
+  @}
+)
+@end group
+@end smallexample
+
+The syntax rules are as follows:
+@itemize @bullet
+@item
+Templates must start with @samp{@{@@} to use the new syntax.
+
+@item
+@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:} 
followed by
+a list of @code{match_operand}/@code{match_scratch} comma operand numbers, 
then a
+semicolon, followed by the same for attributes (@samp{attrs:}).  Operand
+modifiers can be placed in this section group as well.  Both sections
+are optional (so you can use only @samp{cons}, or only @samp{attrs}, or both),
+and @samp{cons} must come before @samp{attrs} if present.
+
+@item
+Each alternative begins with any amount of whitespace.
+
+@item
+Following the whitespace is a comma-separated list of "constraints" and/or
+"attributes" within brackets @code{[]}, with sections separated by a semicolon.
+
+@item
+Should you want to copy the previous asm line, the symbol @code{^} can be used.
+This allows less copy pasting between alternative and reduces the number of
+lines to update on changes.
+
+@item
+When using C functions for output, the idiom @samp{* return ;} can be
+replaced with the shorthand @samp{<< @var{function};}.
+
+@item
+Following the closing @samp{]} is any amount of whitespace, and then the actual
+asm output.
+
+@item
+Spaces are allowed in the list (they will simply be removed).
+
+@item
+All constraint alternatives should be specified: a blank list should be
+@samp{[,,]} or generally use @samp{*} for the alternatives. e.g. 
@samp{[*,*,*]}.
+
+@item
+Within an @samp{@{@@} block both multiline and singleline C comments are
+allowed, but when used outside of a C block they must be the only 
non-whitespace
+blocks on the line.

Re: [PATCH] RISC-V: Add Veyron V1 pipeline description

2023-06-08 Thread Philipp Tomsich
On Thu 8. Jun 2023 at 09:35, Kito Cheng via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> > diff --git a/gcc/config/riscv/riscv-cores.def
> b/gcc/config/riscv/riscv-cores.def
> > index 7d87ab7ce28..4078439e562 100644
> > --- a/gcc/config/riscv/riscv-cores.def
> > +++ b/gcc/config/riscv/riscv-cores.def
> > @@ -38,6 +38,7 @@ RISCV_TUNE("sifive-3-series", generic,
> rocket_tune_info)
> >  RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
> >  RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
> >  RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
> > +RISCV_TUNE("veyron-v1", veyron_v1, veyron_v1_tune_info)
> >  RISCV_TUNE("size", generic, optimize_size_tune_info)
> >
> >  #undef RISCV_TUNE
> > @@ -77,4 +78,7 @@ RISCV_CORE("thead-c906",
> "rv64imafdc_xtheadba_xtheadbb_xtheadbs_xtheadcmo_"
> >   "xtheadcondmov_xtheadfmemidx_xtheadmac_"
> >   "xtheadmemidx_xtheadmempair_xtheadsync",
> >   "thead-c906")
> > +
> > +RISCV_CORE("veyron-v1",
>  "rv64imafdc_zba_zbb_zbc_zbs_zifencei_xventanacondops",
> > + "veyron-v1")
>
> It seems xventanacondops is not in trunk yet; I saw Jeff approved it
> before, but it has not been committed yet.


We couldn’t apply it back then, as Veyron-V1 had been unannounced.
Can we move this forward now?

Philipp.

>


Re: [PATCH] Handle FMA friendly in reassoc pass

2023-06-08 Thread Maxim Kuvyrkov via Gcc-patches
> On May 25, 2023, at 03:30, Cui, Lili via Gcc-patches 
>  wrote:
> 
> From: Lili Cui 
> 
> Make some changes in reassoc pass to make it more friendly to fma pass later.
> Using FMA instead of mult + add reduces register pressure and instructions
> retired.
> 
> There are mainly two changes
> 1. Put no-mult ops and mult ops alternately at the end of the queue, which is
> conducive to generating more fma and reducing the loss of FMA when breaking
> the chain.
> 2. Rewrite the rewrite_expr_tree_parallel function to try to build parallel
> chains according to the given correlation width, keeping the FMA chance as
> much as possible.
> 
> With the patch applied
> 
> On ICX:
> 507.cactuBSSN_r: Improved by 1.7% for multi-copy .
> 503.bwaves_r   : Improved by  0.60% for single copy .
> 507.cactuBSSN_r: Improved by  1.10% for single copy .
> 519.lbm_r  : Improved by  2.21% for single copy .
> no measurable changes for other benchmarks.
> 
> On aarch64
> 507.cactuBSSN_r: Improved by 1.7% for multi-copy.
> 503.bwaves_r   : Improved by 6.00% for single-copy.
> no measurable changes for other benchmarks.

Hi Cui,

I'm seeing a 4% slowdown on 436.cactusADM from SPEC CPU2006 on 
aarch64-linux-gnu (Cortex-A57) when compiling with "-O2 -flto".  All other 
benchmarks seem neutral to this patch, and I didn't observe the slowdown with
plain -O2 no-LTO or with -O3.

Is this something interesting to investigate?  I'll be happy to assist.

Kind regards,

--
Maxim Kuvyrkov
https://www.linaro.org







[COMMITTED] analyzer: Standalone OOB-warning, formatting fixed [PR109437, PR109439]

2023-06-08 Thread Benjamin Priour via Gcc-patches
From: Benjamin Priour 

For the record, below is the previous patch I submitted, with the
minor formatting issues fixed (multiline docstrings no longer end on a newline).
It was otherwise validated by David Malcolm, so I have already committed it.

This patch enhances -Wanalyzer-out-of-bounds, which is no longer paired
with a -Wanalyzer-use-of-uninitialized-value on out-of-bounds reads.

This also fixes PR analyzer/109437.
Previously there could be at most one OOB-read warning per frame because
-Wanalyzer-use-of-uninitialized-value always terminates the analysis
path.
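
To make the control flow of the change easier to follow, here is a toy
model, not the analyzer's code: the bounds check returns whether the
access was valid, and the read path yields an "unknown" value on an
out-of-bounds read instead of going on to report an uninitialized value.
The function names echo the real ones, but the signatures, the byte
buffer, and the use of std::optional for "unknown" are assumptions made
for illustration; the real check also emits the diagnostic, which is
omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy stand-in: returns true if [offset, offset + num_bytes) lies within
// a buffer of the given capacity.  Zero-byte accesses are always valid.
bool check_region_bounds(std::size_t capacity, std::size_t offset,
                         std::size_t num_bytes)
{
  if (num_bytes == 0)
    return true;
  return offset <= capacity && num_bytes <= capacity - offset;
}

// Toy stand-in: std::nullopt plays the role of an unknown value, so an
// out-of-bounds read does not also look like a read of uninitialized
// memory further down the path.
std::optional<unsigned char>
get_store_value(const std::vector<unsigned char> &buf, std::size_t offset)
{
  if (!check_region_bounds(buf.size(), offset, 1))
    return std::nullopt;
  return buf[offset];
}
```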

PR analyzer/109439

gcc/analyzer/ChangeLog:

* bounds-checking.cc (region_model::check_symbolic_bounds):
  Returns whether the BASE_REG region access was OOB.
(region_model::check_region_bounds): Likewise.
* region-model.cc (region_model::get_store_value): Creates an
  unknown svalue on OOB-read access to REG.
(region_model::check_region_access): Returns whether an
unknown svalue needs be created.
(region_model::check_region_for_read): Passes
check_region_access return value.
* region-model.h: Update prior function definitions.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/out-of-bounds-2.c: Cleaned test for
  uninitialized-value warning.
* gcc.dg/analyzer/out-of-bounds-5.c: Likewise.
* gcc.dg/analyzer/pr101962.c: Likewise.
* gcc.dg/analyzer/realloc-5.c: Likewise.
* gcc.dg/analyzer/pr109439.c: New test.
---
 gcc/analyzer/bounds-checking.cc   | 28 +--
 gcc/analyzer/region-model.cc  | 22 +--
 gcc/analyzer/region-model.h   |  8 +++---
 .../gcc.dg/analyzer/out-of-bounds-2.c |  1 -
 .../gcc.dg/analyzer/out-of-bounds-5.c |  2 --
 gcc/testsuite/gcc.dg/analyzer/pr101962.c  |  1 -
 gcc/testsuite/gcc.dg/analyzer/pr109439.c  | 12 
 gcc/testsuite/gcc.dg/analyzer/realloc-5.c |  1 -
 8 files changed, 49 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr109439.c

diff --git a/gcc/analyzer/bounds-checking.cc b/gcc/analyzer/bounds-checking.cc
index 3bf542a8eba..a5692cf9319 100644
--- a/gcc/analyzer/bounds-checking.cc
+++ b/gcc/analyzer/bounds-checking.cc
@@ -767,9 +767,10 @@ public:
   }
 };
 
-/* Check whether an access is past the end of the BASE_REG.  */
+/* Check whether an access is past the end of the BASE_REG.
+  Return TRUE if the access was valid, FALSE otherwise.  */
 
-void
+bool
 region_model::check_symbolic_bounds (const region *base_reg,
 const svalue *sym_byte_offset,
 const svalue *num_bytes_sval,
@@ -800,6 +801,7 @@ region_model::check_symbolic_bounds (const region *base_reg,
  offset_tree,
  num_bytes_tree,
  capacity_tree));
+ return false;
  break;
case DIR_WRITE:
  ctxt->warn (make_unique (base_reg,
@@ -807,9 +809,11 @@ region_model::check_symbolic_bounds (const region *base_reg,
 offset_tree,
 num_bytes_tree,
 capacity_tree));
+ return false;
  break;
}
 }
+  return true;
 }
 
 static tree
@@ -822,9 +826,10 @@ maybe_get_integer_cst_tree (const svalue *sval)
   return NULL_TREE;
 }
 
-/* May complain when the access on REG is out-of-bounds.  */
+/* May complain when the access on REG is out-of-bounds.
+   Return TRUE if the access was valid, FALSE otherwise.  */
 
-void
+bool
 region_model::check_region_bounds (const region *reg,
   enum access_direction dir,
   region_model_context *ctxt) const
@@ -839,14 +844,14 @@ region_model::check_region_bounds (const region *reg,
  (e.g. because the analyzer did not see previous offsets on the latter,
  it might think that a negative access is before the buffer).  */
   if (base_reg->symbolic_p ())
-return;
+ return true;
 
   /* Find out how many bytes were accessed.  */
   const svalue *num_bytes_sval = reg->get_byte_size_sval (m_mgr);
   tree num_bytes_tree = maybe_get_integer_cst_tree (num_bytes_sval);
   /* Bail out if 0 bytes are accessed.  */
   if (num_bytes_tree && zerop (num_bytes_tree))
-return;
+ return true;
 
   /* Get the capacity of the buffer.  */
   const svalue *capacity = get_capacity (base_reg);
@@ -877,13 +882,13 @@ region_model::check_region_bounds (const region *reg,
}
   else
byte_offset_sval = reg_offset.get_symbolic_byte_offset ();
-  check_symbolic_bounds (base_reg, byte_offset_sval, num_bytes_sval,
+ return 

Re: [committed] libstdc++: Fix code size regressions in std::vector [PR110060]

2023-06-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 08, 2023 at 10:05:43AM +0100, Jonathan Wakely via Gcc-patches wrote:
> > Looking at assembly, one of the differences I see is that the "after"
> > version has calls to realloc_insert(), while "before" version seems to have
> > them inlined [2].
> >
> > [1]
> > https://git.linaro.org/toolchain/ci/interesting-commits.git/tree/gcc/sha1/b7b255e77a271974479c34d1db3daafc04b920bc/tcwg_bmk-code_size-cpu2017fast/status.txt
> >
> >
> I find it annoying that adding `if (n < sz) __builtin_unreachable()` seems
> to affect the size estimates for the function, and so perturbs inlining
> decisions. That code shouldn't add any actual instructions, so shouldn't
> affect size estimates.
> 
> I mentioned this in a meeting last week and Jason suggested checking
> whether using __builtin_assume has the same undesirable consequences, so I

We don't support __builtin_assume (intentionally).  If you mean
[[assume(n>=sz)]], then because n >= sz doesn't have side-effects, it will be
lowered to exactly that if (n < sz) __builtin_unreachable(); - you can look at
-fdump-tree-all to confirm that.
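
For reference, a minimal sketch of the form being discussed.  The
function name and its arithmetic are made up for illustration; the only
claim taken from this thread is that a side-effect-free
[[assume(n >= sz)]] ends up as the if/__builtin_unreachable pair.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example function.  The hint emits no instructions of its
// own; it only tells the optimizer that n >= sz holds on this path.
inline std::size_t spare_capacity(std::size_t n, std::size_t sz)
{
  if (n < sz)
    __builtin_unreachable();  // same effect as [[assume(n >= sz)]];
  return n - sz;              // cannot wrap, given the promise above
}
```

Calling it with n < sz is undefined behavior, which is exactly what
makes the hint usable by the optimizer.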

I agree that the inliner should ignore if (comparison) __builtin_unreachable();
from costs estimation.  And inliner should ignore what we emit for [[assume()]]
if there are side-effects.  CCing Honza.

Jakub



Re: [Patch, fortran] PR87477 - (associate) - [meta-bug] [F03] issues concerning the ASSOCIATE statement

2023-06-08 Thread Paul Richard Thomas via Gcc-patches
Thanks Gents!

The solution is to gfc_free_expr (p) if the replacement is not made.
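
The ownership pattern can be sketched with a toy model; this is not the
gfortran code.  Expr, the helper functions and the string-based
"simplifier" are all hypothetical; only the shape (copy, try to simplify
the copy, replace on success, free the copy on failure) follows the
discussion.

```cpp
#include <cassert>
#include <string>

struct Expr { std::string text; };
static int live_exprs = 0;  // counts allocations to make a leak visible

Expr *make_expr(const std::string &s) { ++live_exprs; return new Expr{s}; }
Expr *copy_expr(const Expr *e) { return make_expr(e->text); }
void free_expr(Expr *e) { if (e) { --live_exprs; delete e; } }

// Modeled on the quoted gfc_replace_expr: dest takes over the contents
// of src, and the src shell is released.
void replace_expr(Expr *dest, Expr *src) { *dest = *src; free_expr(src); }

// Toy simplifier: strips one pair of outer parentheses, then fails on a
// division by zero, leaving its argument modified ("mutilated").
bool toy_simplify(Expr *e)
{
  std::string &t = e->text;
  if (t.size() >= 2 && t.front() == '(' && t.back() == ')')
    t = t.substr(1, t.size() - 2);
  return t.find("/0") == std::string::npos;
}

bool simplify_in_place(Expr *e)
{
  Expr *p = copy_expr(e);
  if (toy_simplify(p))
    {
      replace_expr(e, p);  // success: e now holds the simplified form
      return true;
    }
  free_expr(p);            // failure: without this the copy is leaked
  return false;
}
```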

I am regtesting a patch for PR107900. I'll include the fix for the
memory leak in the patch for that.

Cheers

Paul


On Thu, 8 Jun 2023 at 09:30, Harald Anlauf  wrote:
>
> On 6/8/23 09:46, Mikael Morin wrote:
> > Le 08/06/2023 à 07:57, Paul Richard Thomas via Fortran a écrit :
> >> Hi Harald,
> >>
> >> In answer to your question:
> >> void
> >> gfc_replace_expr (gfc_expr *dest, gfc_expr *src)
> >> {
> >>free_expr0 (dest);
> >>*dest = *src;
> >>free (src);
> >> }
> >> So it does indeed do the job.
> >>
> > Sure, but his comment was about the case gfc_replace_expr is *not*
> > executed.
>
> Right.  The following legal code exhibits the leak, pointing
> to the gfc_copy_expr:
>
> subroutine paul (n)
>integer  :: n
>character(n) :: c
> end
>
> >> I should perhaps have remarked that, following the divide error,
> >> gfc_simplify_expr was returning a mutilated version of the expression
> >> and this was somehow connected with successfully simplifying the
> >> parentheses. Copying and replacing on no errors deals with the
> >> problem.
> >>
> > Is the expression mutilated enough that it can't be safely freed?
> >
> >
> >
>


-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


Re: [committed] libstdc++: Fix code size regressions in std::vector [PR110060]

2023-06-08 Thread Jonathan Wakely via Gcc-patches
On Thu, 8 Jun 2023 at 09:58, Maxim Kuvyrkov 
wrote:

> Hi Jonathan,
>
> Interestingly, this increases code-size of -O3 code on aarch64-linux-gnu
> on SPEC CPU2017's 641.leela_s benchmark [1].
>
> In particular, FastBoard::get_nearby_enemies() grew from 1444 to 2212
> bytes.  This seems like a corner-case; the rest of SPEC CPU2017 is, mostly,
> neutral to this patch.  Is this something you may be interested in
> investigating?  I'll be happy to assist.
>

I'd certainly like to avoid the regression, but I'm too dumb to understand
most inlining bugs myself.


>
> Looking at assembly, one of the differences I see is that the "after"
> version has calls to realloc_insert(), while "before" version seems to have
> them inlined [2].
>
> [1]
> https://git.linaro.org/toolchain/ci/interesting-commits.git/tree/gcc/sha1/b7b255e77a271974479c34d1db3daafc04b920bc/tcwg_bmk-code_size-cpu2017fast/status.txt
>
>
I find it annoying that adding `if (n < sz) __builtin_unreachable()` seems
to affect the size estimates for the function, and so perturbs inlining
decisions. That code shouldn't add any actual instructions, so shouldn't
affect size estimates.
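
A user-level sketch of what a local hint for the vec.assign(vec.size(), x)
case might look like.  This is an illustration under assumptions, not the
code in vector.tcc: the wrapper function is hypothetical, and whether the
reallocating path is actually dropped depends on the optimizer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: states the size <= capacity relationship right
// where it is useful, rather than inside size() and capacity().
void refill(std::vector<int> &v, int x)
{
  const std::size_t n = v.size();
  if (n > v.capacity())
    __builtin_unreachable();  // size() never exceeds capacity()
  v.assign(n, x);
}
```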

I mentioned this in a meeting last week and Jason suggested checking
whether using __builtin_assume has the same undesirable consequences, so I
think I'll start by investigating that.



> [2] 641.leela_s is non-GPL/non-BSD benchmark, and I'm not sure if I can
> post its compiled and/or preprocessed code publicly.  I assume RedHat has
> SPEC CPU2017 license, and I can post details to you privately.
>
>
Yes, I think I can get the benchmark code from Vlad.

Thanks for bringing this to my attention.


Re: [committed] libstdc++: Fix code size regressions in std::vector [PR110060]

2023-06-08 Thread Maxim Kuvyrkov via Gcc-patches
Hi Jonathan,

Interestingly, this increases code-size of -O3 code on aarch64-linux-gnu on 
SPEC CPU2017's 641.leela_s benchmark [1].

In particular, FastBoard::get_nearby_enemies() grew from 1444 to 2212 bytes.  
This seems like a corner-case; the rest of SPEC CPU2017 is, mostly, neutral to 
this patch.  Is this something you may be interested in investigating?  I'll be 
happy to assist.

Looking at assembly, one of the differences I see is that the "after" version 
has calls to realloc_insert(), while "before" version seems to have them 
inlined [2]. 

[1] 
https://git.linaro.org/toolchain/ci/interesting-commits.git/tree/gcc/sha1/b7b255e77a271974479c34d1db3daafc04b920bc/tcwg_bmk-code_size-cpu2017fast/status.txt

[2] 641.leela_s is non-GPL/non-BSD benchmark, and I'm not sure if I can post 
its compiled and/or preprocessed code publicly.  I assume RedHat has SPEC 
CPU2017 license, and I can post details to you privately.

Kind regards,

--
Maxim Kuvyrkov
https://www.linaro.org




> On Jun 1, 2023, at 19:09, Jonathan Wakely via Gcc-patches 
>  wrote:
> 
> Tested powerpc64le-linux. Pushed to trunk.
> 
> -- >8 --
> 
> My r14-1452-gfb409a15d9babc change to add optimization hints to
> std::vector causes regressions because it makes std::vector::size() and
> std::vector::capacity() too big to inline. That's the opposite of what
> I wanted, so revert the changes to those functions.
> 
> To achieve the original aim of optimizing vec.assign(vec.size(), x) we
> can add a local optimization hint to _M_fill_assign, so that it doesn't
> affect all other uses of size() and capacity().
> 
> Additionally, add the same hint to the _M_assign_aux overload for
> forward iterators and add that to the testcase.
> 
> It would be nice to similarly optimize:
>  if (vec1.size() == vec2.size()) vec1 = vec2;
> but adding hints to operator=(const vector&) doesn't help. Presumably
> the relationships between the two sizes and two capacities are too
> complex to track effectively.
> 
> libstdc++-v3/ChangeLog:
> 
> PR libstdc++/110060
> * include/bits/stl_vector.h (_Vector_base::_M_invariant):
> Remove.
> (vector::size, vector::capacity): Remove calls to _M_invariant.
> * include/bits/vector.tcc (vector::_M_fill_assign): Add
> optimization hint to reallocating path.
> (vector::_M_assign_aux(FwdIter, FwdIter, forward_iterator_tag)):
> Likewise.
> * testsuite/23_containers/vector/capacity/invariant.cc: Moved
> to...
> * testsuite/23_containers/vector/modifiers/assign/no_realloc.cc:
> ...here. Check assign(FwdIter, FwdIter) too.
> * testsuite/23_containers/vector/types/1.cc: Revert addition
> of -Wno-stringop-overread option.
> ---
> libstdc++-v3/include/bits/stl_vector.h| 23 +--
> libstdc++-v3/include/bits/vector.tcc  | 17 ++
> .../assign/no_realloc.cc} |  6 +
> .../testsuite/23_containers/vector/types/1.cc |  2 +-
> 4 files changed, 20 insertions(+), 28 deletions(-)
> rename libstdc++-v3/testsuite/23_containers/vector/{capacity/invariant.cc => 
> modifiers/assign/no_realloc.cc} (70%)
> 
> diff --git a/libstdc++-v3/include/bits/stl_vector.h 
> b/libstdc++-v3/include/bits/stl_vector.h
> index e593be443bc..70ced3d101f 100644
> --- a/libstdc++-v3/include/bits/stl_vector.h
> +++ b/libstdc++-v3/include/bits/stl_vector.h
> @@ -389,23 +389,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> 
> protected:
> 
> -  __attribute__((__always_inline__))
> -  _GLIBCXX20_CONSTEXPR void
> -  _M_invariant() const
> -  {
> -#if __OPTIMIZE__
> - if (this->_M_impl._M_finish < this->_M_impl._M_start)
> -  __builtin_unreachable();
> - if (this->_M_impl._M_finish > this->_M_impl._M_end_of_storage)
> -  __builtin_unreachable();
> -
> - size_t __sz = this->_M_impl._M_finish - this->_M_impl._M_start;
> - size_t __cap = this->_M_impl._M_end_of_storage - this->_M_impl._M_start;
> - if (__sz > __cap)
> -  __builtin_unreachable();
> -#endif
> -  }
> -
>   _GLIBCXX20_CONSTEXPR
>   void
>   _M_create_storage(size_t __n)
> @@ -1005,10 +988,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
>   size_type
>   size() const _GLIBCXX_NOEXCEPT
> -  {
> - _Base::_M_invariant();
> - return size_type(this->_M_impl._M_finish - this->_M_impl._M_start);
> -  }
> +  { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
> 
>   /**  Returns the size() of the largest possible %vector.  */
>   _GLIBCXX_NODISCARD _GLIBCXX20_CONSTEXPR
> @@ -1095,7 +1075,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>   size_type
>   capacity() const _GLIBCXX_NOEXCEPT
>   {
> - _Base::_M_invariant();
> return size_type(this->_M_impl._M_end_of_storage
>   - this->_M_impl._M_start);
>   }
> diff --git a/libstdc++-v3/include/bits/vector.tcc 
> b/libstdc++-v3/include/bits/vector.tcc
> index d6fdea2dd01..acd11e2dc68 100644
> --- a/libstdc++-v3/include/bits/vector.tcc
> +++ b/libstdc++-v3/include/bits/vector.tcc
> @@ -270,15 

Re: Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread juzhe.zh...@rivai.ai
I have an idea based on what Kito said.
We enable vfadd FP16 for TARGET_ZVFH. But we don't need to add TARGET_VECTOR && 

for each instruction.

We can reference riscv.md:
(define_attr "ext_enabled" "no,yes"
  (cond [(eq_attr "ext" "base")
   (const_string "yes")

   (and (eq_attr "ext" "f")
(match_test "TARGET_HARD_FLOAT"))
   (const_string "yes")

   (and (eq_attr "ext" "d")
(match_test "TARGET_DOUBLE_FLOAT"))
   (const_string "yes")

   (and (eq_attr "ext" "vector")
(match_test "TARGET_VECTOR"))
   (const_string "yes")
  ]
  (const_string "no")))

Define a new attribute as follows:
(define_attr "fp16_vector_enabled" "no,yes"
  (cond [
   (and (eq_attr "type" "vfalu")
(and (eq_attr "mode" "VNx1HF")
(match_test "!TARGET_ZVFH")))
   (const_string "no")
  ]
  (const_string "yes")))


I think you can experiment with this to see whether it can disable the MD
pattern.
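
The intended truth table of the proposed attribute, written out as a
plain predicate.  This is only a model of the condition above: the enum
types and the function are hypothetical, and it says nothing about
whether the attribute actually disables the pattern in the MD machinery.

```cpp
#include <cassert>

enum class InsnType { vfalu, other };
enum class Mode { VNx1HF, VNx1SF };

// Mirrors the define_attr sketch: "no" only for a vfalu instruction in
// VNx1HF mode when ZVFH is not available, "yes" otherwise.
bool fp16_vector_enabled(InsnType type, Mode mode, bool target_zvfh)
{
  if (type == InsnType::vfalu && mode == Mode::VNx1HF && !target_zvfh)
    return false;
  return true;
}
```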



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2023-06-08 15:58
To: juzhe.zh...@rivai.ai
CC: pan2.li; gcc-patches; Robin Dapp; jeffreyalaw; yanzhang.wang
Subject: Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
I am thinking, is it possible to use mode attr to remove the overhead
of checking the mode for other FP modes other than FP16?
 
e.g.
(define_mode_attr TARGET_FP_FULL_OPERATION_CHECKING [
  (VNx1HF "TARGET_ZVFH")
...
  (VNx1SF "1")
...
])
 
 
  "TARGET_VECTOR && riscv_vector::float_mode_supported_p (mode)"
->
  "TARGET_VECTOR && "
 
 
On Thu, Jun 8, 2023 at 2:35 PM juzhe.zh...@rivai.ai
 wrote:
>
> LGTM. Let's wait for Jeff and Robin. After this patch, we can start FP16 
> autovec.
>
>
>
> juzhe.zh...@rivai.ai
>
> From: pan2.li
> Date: 2023-06-08 14:29
> To: gcc-patches
> CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
> Subject: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
> From: Pan Li 
>
> This patch would like to refactor the requirement of both the ZVFH
> and ZVFHMIN. By default, the ZVFHMIN will enable FP16 for all the
> iterators of RVV. And then the ZVFH will leverage one function as
> the gate for FP16 supported or not.
>
> Please note the ZVFH will cover the ZVFHMIN instructions. This patch
> add one test for this.
>
> Signed-off-by: Pan Li 
> Co-Authored by: Juzhe-Zhong 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (float_mode_supported_p):
> New function to float point is supported by extension.
> * config/riscv/riscv-v.cc (float_mode_supported_p):
> Ditto.
> * config/riscv/vector-iterators.md: Fix V_WHOLE and V_FRACT.
> * config/riscv/vector.md: Add condition to FP define insn.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add vle16 test
> for ZVFHMIN.
> ---
> gcc/config/riscv/riscv-protos.h   |   1 +
> gcc/config/riscv/riscv-v.cc   |  12 ++
> gcc/config/riscv/vector-iterators.md  |  23 +--
> gcc/config/riscv/vector.md| 144 ++
> .../riscv/rvv/base/zvfhmin-intrinsic.c|  15 +-
> 5 files changed, 118 insertions(+), 77 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index ebbaac255f9..1f606f59ce1 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -177,6 +177,7 @@ rtx expand_builtin (unsigned int, tree, rtx);
> bool check_builtin_call (location_t, vec, unsigned int,
>tree, unsigned int, tree *);
> bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
> +bool float_mode_supported_p (machine_mode mode);
> bool legitimize_move (rtx, rtx);
> void emit_vlmax_vsetvl (machine_mode, rtx);
> void emit_hard_vlmax_vsetvl (machine_mode, rtx);
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index 49752cd8899..fe4eb058ec0 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -418,6 +418,18 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT 
> minval,
>   && IN_RANGE (INTVAL (elt), minval, maxval));
> }
> +/* Return true if the inner of mode is HFmode when ZVFH enabled, or other
> +   float point machine mode.  */
> +bool
> +float_mode_supported_p (machine_mode mode)
> +{
> +  machine_mode inner_mode = GET_MODE_INNER (mode);
> +
> +  gcc_assert (FLOAT_MODE_P (inner_mode));
> +
> +  return inner_mode == HFmode ? TARGET_ZVFH : true;
> +}
> +
> /* Return true if VEC is a constant in which every element is in the range
> [MINVAL, MAXVAL].  The elements do not need to have the same value.
> diff --git a/gcc/config/riscv/vector-iterators.md 
> b/gcc/config/riscv/vector-iterators.md
> index f4946d84449..234b712bc9d 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -453,9 +453,8 @@ (define_mode_iterator V_WHOLE [
>(VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI 
> "TARGET_VECTOR_ELEN_64")
>(VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI 
> "TARGET_VECTOR_ELEN_64 

Re: [Patch, fortran] PR87477 - (associate) - [meta-bug] [F03] issues concerning the ASSOCIATE statement

2023-06-08 Thread Harald Anlauf via Gcc-patches

On 6/8/23 09:46, Mikael Morin wrote:
> On 08/06/2023 at 07:57, Paul Richard Thomas via Fortran wrote:
>> Hi Harald,
>>
>> In answer to your question:
>> void
>> gfc_replace_expr (gfc_expr *dest, gfc_expr *src)
>> {
>>    free_expr0 (dest);
>>    *dest = *src;
>>    free (src);
>> }
>> So it does indeed do the job.
>
> Sure, but his comment was about the case where gfc_replace_expr is *not*
> executed.

Right.  The following legal code exhibits the leak, pointing
to the gfc_copy_expr:

subroutine paul (n)
  integer  :: n
  character(n) :: c
end

>> I should perhaps have remarked that, following the divide error,
>> gfc_simplify_expr was returning a mutilated version of the expression
>> and this was somehow connected with successfully simplifying the
>> parentheses. Copying and replacing on no errors deals with the
>> problem.
>
> Is the expression mutilated enough that it can't be safely freed?








RE: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread Li, Pan2 via Gcc-patches
This looks doable up to a point; I will give it a try and keep you posted.

Pan


Re: Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread juzhe.zh...@rivai.ai
Oh. Good suggestion.  It's much better than my solution I think.
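
To make the comparison concrete, here is a small C++ sketch, under stated
assumptions, of the two approaches being discussed: a function that inspects
the mode on every query, and a per-mode condition in the spirit of the
suggested mode attribute, where non-FP16 modes reduce to a constant true. The
Mode enum, the Target struct and all function names are hypothetical and only
model the logic; they are not GCC interfaces.

```cpp
// Hypothetical subset of vector float modes and target flags.
enum class Mode { VNx1HF, VNx2HF, VNx1SF, VNx1DF };

struct Target {
  bool vector;  // stands in for TARGET_VECTOR
  bool zvfh;    // stands in for TARGET_ZVFH
};

// Approach 1: one generic check that looks at the mode each time it is
// called, similar in spirit to float_mode_supported_p in the quoted patch.
bool enabled_by_function(Mode m, const Target &t) {
  bool is_half = (m == Mode::VNx1HF || m == Mode::VNx2HF);
  return t.vector && (is_half ? t.zvfh : true);
}

// Approach 2: the extra condition is selected per mode at compile time,
// like a mode attribute.  For non-FP16 modes it is the constant "true",
// so no mode test is left in the final condition.
template <Mode M>
constexpr bool fp_full_operation_check(const Target &t) {
  if constexpr (M == Mode::VNx1HF || M == Mode::VNx2HF)
    return t.zvfh;
  else
    return true;
}

template <Mode M>
constexpr bool enabled_by_mode_attr(const Target &t) {
  return t.vector && fp_full_operation_check<M>(t);
}
```

Both variants give the same answer for every mode; the difference is only
where the mode test happens.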



juzhe.zh...@rivai.ai
 

Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread Kito Cheng via Gcc-patches
I am wondering: is it possible to use a mode attr to remove the overhead
of checking the mode for FP modes other than FP16?

e.g.
(define_mode_attr TARGET_FP_FULL_OPERATION_CHECKING [
  (VNx1HF "TARGET_ZVFH")
...
  (VNx1SF "1")
...
])


  "TARGET_VECTOR && riscv_vector::float_mode_supported_p (mode)"
->
  "TARGET_VECTOR && "


On Thu, Jun 8, 2023 at 2:35 PM juzhe.zh...@rivai.ai
 wrote:
>
> LGTM. Let's wait for Jeff and Robin. After this patch, we can start FP16 
> autovec.
>
>
>
> juzhe.zh...@rivai.ai
>
> From: pan2.li
> Date: 2023-06-08 14:29
> To: gcc-patches
> CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
> Subject: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
> From: Pan Li 
>
> This patch would like to refactor the requirement of both the ZVFH
> and ZVFHMIN. By default, the ZVFHMIN will enable FP16 for all the
> iterators of RVV. And then the ZVFH will leverage one function as
> the gate for FP16 supported or not.
>
> Please note the ZVFH will cover the ZVFHMIN instructions. This patch
> add one test for this.
>
> Signed-off-by: Pan Li 
> Co-Authored by: Juzhe-Zhong 
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-protos.h (float_mode_supported_p):
> New function to float point is supported by extension.
> * config/riscv/riscv-v.cc (float_mode_supported_p):
> Ditto.
> * config/riscv/vector-iterators.md: Fix V_WHOLE and V_FRACT.
> * config/riscv/vector.md: Add condition to FP define insn.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add vle16 test
> for ZVFHMIN.
> ---
> gcc/config/riscv/riscv-protos.h   |   1 +
> gcc/config/riscv/riscv-v.cc   |  12 ++
> gcc/config/riscv/vector-iterators.md  |  23 +--
> gcc/config/riscv/vector.md| 144 ++
> .../riscv/rvv/base/zvfhmin-intrinsic.c|  15 +-
> 5 files changed, 118 insertions(+), 77 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index ebbaac255f9..1f606f59ce1 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -177,6 +177,7 @@ rtx expand_builtin (unsigned int, tree, rtx);
> bool check_builtin_call (location_t, vec, unsigned int,
>tree, unsigned int, tree *);
> bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
> +bool float_mode_supported_p (machine_mode mode);
> bool legitimize_move (rtx, rtx);
> void emit_vlmax_vsetvl (machine_mode, rtx);
> void emit_hard_vlmax_vsetvl (machine_mode, rtx);
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index 49752cd8899..fe4eb058ec0 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -418,6 +418,18 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT 
> minval,
>   && IN_RANGE (INTVAL (elt), minval, maxval));
> }
> +/* Return true if the inner of mode is HFmode when ZVFH enabled, or other
> +   float point machine mode.  */
> +bool
> +float_mode_supported_p (machine_mode mode)
> +{
> +  machine_mode inner_mode = GET_MODE_INNER (mode);
> +
> +  gcc_assert (FLOAT_MODE_P (inner_mode));
> +
> +  return inner_mode == HFmode ? TARGET_ZVFH : true;
> +}
> +
> /* Return true if VEC is a constant in which every element is in the range
> [MINVAL, MAXVAL].  The elements do not need to have the same value.
> diff --git a/gcc/config/riscv/vector-iterators.md 
> b/gcc/config/riscv/vector-iterators.md
> index f4946d84449..234b712bc9d 100644
> --- a/gcc/config/riscv/vector-iterators.md
> +++ b/gcc/config/riscv/vector-iterators.md
> @@ -453,9 +453,8 @@ (define_mode_iterator V_WHOLE [
>(VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI 
> "TARGET_VECTOR_ELEN_64")
>(VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI 
> "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
> -  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
> -  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
> -  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
> +  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 32")
> +  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 64")
>(VNx8HF "TARGET_VECTOR_ELEN_FP_16")
>(VNx16HF "TARGET_VECTOR_ELEN_FP_16")
>(VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
> @@ -477,7 +476,11 @@ (define_mode_iterator V_WHOLE [
> (define_mode_iterator V_FRACT [
>(VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI (VNx4QI "TARGET_MIN_VLEN > 32") 
> (VNx8QI "TARGET_MIN_VLEN >= 128")
>(VNx1HI "TARGET_MIN_VLEN < 128") (VNx2HI "TARGET_MIN_VLEN > 32") (VNx4HI 
> "TARGET_MIN_VLEN >= 128")
> -  (VNx1HF "TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_MIN_VLEN > 32") (VNx4HF 
> "TARGET_MIN_VLEN >= 128")
> +
> +  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
> +  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
> +  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
> +
>(VNx1SI "TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN < 128") 

[PATCH v2] RISC-V: Add more test cases for RVV FP16

2023-06-08 Thread Pan Li via Gcc-patches
From: Pan Li 

This patch adds new test cases to make sure that
RVV FP16 works as expected.

Signed-off-by: Pan Li 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Add new cases.
* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: New test.
---
 .../riscv/rvv/base/zvfh-intrinsic.c   | 22 ++-
 .../riscv/rvv/base/zvfh-over-zvfhmin.c| 37 +++
 2 files changed, 57 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
index 2e86d1faaf1..c951644de4b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-intrinsic.c
@@ -413,9 +413,25 @@ vfloat32m1_t test_vfwredusum_vs_f16m8_f32m1(vfloat16m8_t 
vector, vfloat32m1_t sc
   return __riscv_vfwredusum_vs_f16m8_f32m1(vector, scalar, vl);
 }
 
-/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au]} 49 } } */
+vfloat16mf4_t test_vfslide1up_vf_f16mf4(vfloat16mf4_t src, float16_t value, 
size_t vl) {
+  return __riscv_vfslide1up_vf_f16mf4(src, value, vl);
+}
+
+vfloat16m8_t test_vfslide1up_vf_f16m8(vfloat16m8_t src, float16_t value, 
size_t vl) {
+  return __riscv_vfslide1up_vf_f16m8(src, value, vl);
+}
+
+vfloat16mf4_t test_vfslide1down_vf_f16mf4(vfloat16mf4_t src, float16_t value, 
size_t vl) {
+  return __riscv_vfslide1down_vf_f16mf4(src, value, vl);
+}
+
+vfloat16m8_t test_vfslide1down_vf_f16m8(vfloat16m8_t src, float16_t value, 
size_t vl) {
+  return __riscv_vfslide1down_vf_f16m8(src, value, vl);
+}
+
+/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au]} 51 } } */
 /* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m4,\s*t[au],\s*m[au]} 11 } } */
-/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m8,\s*t[au],\s*m[au]} 40 } } */
+/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m8,\s*t[au],\s*m[au]} 42 } } */
 /* { dg-final { scan-assembler-times 
{vfadd\.v[fv]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
 /* { dg-final { scan-assembler-times 
{vfsub\.v[fv]\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
 /* { dg-final { scan-assembler-times 
{vfrsub\.vf\s+v[0-9]+,\s*v[0-9]+,\s*[vfa]+[0-9]+} 2 } } */
@@ -470,3 +486,5 @@ vfloat32m1_t test_vfwredusum_vs_f16m8_f32m1(vfloat16m8_t 
vector, vfloat32m1_t sc
 /* { dg-final { scan-assembler-times 
{vfredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
 /* { dg-final { scan-assembler-times 
{vfwredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
 /* { dg-final { scan-assembler-times 
{vfwredusum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times 
{vfslide1up\.vf\s+v[0-9]+,\s*v[0-9]+,\s*fa[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times 
{vfslide1down\.vf\s+v[0-9]+,\s*v[0-9]+,\s*fa[0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c
new file mode 100644
index 000..2afc105e2da
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64 -O3" } */
+
+#include "riscv_vector.h"
+
+typedef _Float16 float16_t;
+
+vfloat16mf4_t test_vfncvt_f_f_w_f16mf4(vfloat32mf2_t src, size_t vl) {
+  return __riscv_vfncvt_f_f_w_f16mf4(src, vl);
+}
+
+vfloat16m4_t test_vfncvt_f_f_w_f16m4(vfloat32m8_t src, size_t vl) {
+  return __riscv_vfncvt_f_f_w_f16m4(src, vl);
+}
+
+vfloat32mf2_t test_vfwcvt_f_f_v_f32mf2(vfloat16mf4_t src, size_t vl) {
+  return __riscv_vfwcvt_f_f_v_f32mf2(src, vl);
+}
+
+vfloat32m8_t test_vfwcvt_f_f_v_f32m8(vfloat16m4_t src, size_t vl) {
+  return __riscv_vfwcvt_f_f_v_f32m8(src, vl);
+}
+
+vfloat16mf4_t test_vle16_v_f16mf4(const float16_t *base, size_t vl) {
+  return __riscv_vle16_v_f16mf4(base, vl);
+}
+
+vfloat16m8_t test_vle16_v_f16m8(const float16_t *base, size_t vl) {
+  return __riscv_vle16_v_f16m8(base, vl);
+}
+
+/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au]} 3 } } */
+/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m4,\s*t[au],\s*m[au]} 2 } } */
+/* { dg-final { scan-assembler-times 
{vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m8,\s*t[au],\s*m[au]} 1 } } */
+/* { dg-final { scan-assembler-times {vfwcvt\.f\.f\.v\s+v[0-9]+,\s*v[0-9]+} 2 
} } */
+/* { dg-final { scan-assembler-times {vfncvt\.f\.f\.w\s+v[0-9]+,\s*v[0-9]+} 2 
} } */
+/* { dg-final { scan-assembler-times {vle16\.v\s+v[0-9]+,\s*0\([0-9ax]+\)} 3 } 
} */
-- 
2.34.1



Re: [Patch, fortran] PR87477 - (associate) - [meta-bug] [F03] issues concerning the ASSOCIATE statement

2023-06-08 Thread Mikael Morin

On 08/06/2023 at 07:57, Paul Richard Thomas via Fortran wrote:
> Hi Harald,
>
> In answer to your question:
> void
> gfc_replace_expr (gfc_expr *dest, gfc_expr *src)
> {
>    free_expr0 (dest);
>    *dest = *src;
>    free (src);
> }
> So it does indeed do the job.

Sure, but his comment was about the case where gfc_replace_expr is *not* executed.

> I should perhaps have remarked that, following the divide error,
> gfc_simplify_expr was returning a mutilated version of the expression
> and this was somehow connected with successfully simplifying the
> parentheses. Copying and replacing on no errors deals with the
> problem.


Is the expression mutilated enough that it can't be safely freed?
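
The concern about the path where gfc_replace_expr is not executed can be
sketched in a few lines of C++. This is only a model: Expr, copy_expr,
free_expr, replace_expr, simplify and simplify_via_copy are hypothetical
stand-ins, not the gfortran functions, and the sketch assumes that a copy
left behind by a failed simplification can still be freed safely, which is
exactly the open question above.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical, much simplified stand-in for gfc_expr.
struct Expr {
  int value;
};

// Counts live heap copies so that a leak becomes observable.
static int live_exprs = 0;

Expr *copy_expr(const Expr *e) {
  Expr *c = static_cast<Expr *>(std::malloc(sizeof(Expr)));
  assert(c != nullptr);
  *c = *e;
  ++live_exprs;
  return c;
}

void free_expr(Expr *e) {
  if (e == nullptr) return;
  std::free(e);
  --live_exprs;
}

// Mirrors the shape of the quoted gfc_replace_expr: dest takes over the
// contents of src, and the src node itself is released.
void replace_expr(Expr *dest, Expr *src) {
  *dest = *src;
  std::free(src);
  --live_exprs;
}

// Stand-in for simplification: fails on negative values, doubles otherwise.
bool simplify(Expr *e) {
  if (e->value < 0) return false;
  e->value *= 2;
  return true;
}

// Simplify a copy and replace the original only on success.  On failure
// the copy has to be freed explicitly, otherwise it leaks.
bool simplify_via_copy(Expr *e) {
  Expr *tmp = copy_expr(e);
  if (simplify(tmp)) {
    replace_expr(e, tmp);
    return true;
  }
  free_expr(tmp);  // the path where replace_expr is not executed
  return false;
}
```

Dropping the free_expr call on the failure path leaves live_exprs at 1 after
a failed simplification, which is the kind of leak being discussed.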




Re: [PATCH V5] VECT: Add SELECT_VL support

2023-06-08 Thread juzhe.zh...@rivai.ai
Bootstrap and regression tests passed.
OK for trunk?



juzhe.zh...@rivai.ai
 
From: juzhe.zhong
Date: 2023-06-08 10:05
To: gcc-patches
CC: richard.sandiford; rguenther; Ju-Zhe Zhong
Subject: [PATCH V5] VECT: Add SELECT_VL support
From: Ju-Zhe Zhong 
 
Co-authored-by: Richard Sandiford
Co-authored-by: Richard Biener 
 
This patch addresses comments from Richard and Richi and is rebased onto trunk.

This patch adds SELECT_VL middle-end support, which allows a target to
apply target-dependent optimizations to the length calculation.

This patch is inspired by the RVV ISA and LLVM:
https://reviews.llvm.org/D99750

SELECT_VL behaves the same as LLVM's "get_vector_length", with the
following properties:

1. It only applies to a single rgroup.
2. It does not apply to SLP.
3. It adjusts the loop control IV.
4. It adjusts the data reference IV.
5. It allows a non-final iteration to process a number of elements other than VF.
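
The properties above can be modeled with a short scalar C++ sketch. It is a
simulation under stated assumptions, not the vectorizer's output:
select_vl_model and vvadd are hypothetical names, and the policy of splitting
the work evenly when fewer than 2 * VF elements remain is just one choice a
target might make. The only property relied on is that the returned length
is nonzero and never exceeds MIN (remaining, VF).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical target policy: normally process MIN (remaining, vf)
// elements, but split the tail evenly when between vf and 2 * vf remain.
std::size_t select_vl_model(std::size_t remaining, std::size_t vf) {
  assert(vf > 0);
  if (remaining > vf && remaining < 2 * vf)
    return (remaining + 1) / 2;
  return std::min(remaining, vf);
}

// z[i] = x[i] + y[i], processed in chunks whose length is chosen per
// iteration.  Both the loop control IV (remaining) and the data reference
// IV (i) advance by the selected length, so the step is variable.
std::vector<int> vvadd(const std::vector<int> &x, const std::vector<int> &y,
                       std::size_t vf) {
  assert(x.size() == y.size());
  std::vector<int> z(x.size());
  std::size_t remaining = x.size();
  std::size_t i = 0;
  while (remaining > 0) {
    std::size_t vl = select_vl_model(remaining, vf);
    for (std::size_t k = 0; k < vl; ++k)
      z[i + k] = x[i + k] + y[i + k];
    i += vl;
    remaining -= vl;
  }
  return z;
}
```

With 10 elements and VF = 8, this model processes 5 + 5 elements instead of
8 + 2, so a non-final iteration handles fewer than VF elements.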
 
Code:
   # void vvaddint32(size_t n, const int*x, const int*y, int*z)
# { for (size_t i=0; i
Co-authored-by: Richard Biener 
 
---
gcc/doc/md.texi | 22 ++
gcc/internal-fn.def |  1 +
gcc/optabs.def  |  1 +
gcc/tree-vect-loop-manip.cc | 32 ++
gcc/tree-vect-loop.cc   | 72 +++
gcc/tree-vect-stmts.cc  | 86 -
gcc/tree-vectorizer.h   |  6 +++
7 files changed, 201 insertions(+), 19 deletions(-)
 
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 6a435eb4461..95f7fe1f802 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -4974,6 +4974,28 @@ for (i = 1; i < operand3; i++)
   operand0[i] = operand0[i - 1] && (operand1 + i < operand2);
@end smallexample
+@cindex @code{select_vl@var{m}} instruction pattern
+@item @code{select_vl@var{m}}
+Set operand 0 to the number of scalar iterations that should be handled
+by one iteration of a vector loop.  Operand 1 is the total number of
+scalar iterations that the loop needs to process and operand 2 is a
+maximum bound on the result (also known as the maximum ``vectorization
+factor'').
+
+The maximum value of operand 0 is given by:
+@smallexample
+operand0 = MIN (operand1, operand2)
+@end smallexample
+However, targets might choose a lower value than this, based on
+target-specific criteria.  Each iteration of the vector loop might
+therefore process a different number of scalar iterations, which in turn
+means that induction variables will have a variable step.  Because of
+this, it is generally not useful to define this instruction if it will
+always calculate the maximum value.
+
+This optab is only useful on targets that implement @samp{len_load_@var{m}}
+and/or @samp{len_store_@var{m}}.
+
@cindex @code{check_raw_ptrs@var{m}} instruction pattern
@item @samp{check_raw_ptrs@var{m}}
Check whether, given two pointers @var{a} and @var{b} and a length @var{len},
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 3ac9d82aace..5d638de6d06 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -177,6 +177,7 @@ DEF_INTERNAL_OPTAB_FN (VEC_SET, 0, vec_set, vec_set)
DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store)
DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while)
+DEF_INTERNAL_OPTAB_FN (SELECT_VL, ECF_CONST | ECF_NOTHROW, select_vl, binary)
DEF_INTERNAL_OPTAB_FN (CHECK_RAW_PTRS, ECF_CONST | ECF_NOTHROW,
   check_raw_ptrs, check_ptrs)
DEF_INTERNAL_OPTAB_FN (CHECK_WAR_PTRS, ECF_CONST | ECF_NOTHROW,
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 6c064ff4993..f31b69c5d85 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -488,3 +488,4 @@ OPTAB_DC (vec_series_optab, "vec_series$a", VEC_SERIES)
OPTAB_D (vec_shl_insert_optab, "vec_shl_insert_$a")
OPTAB_D (len_load_optab, "len_load_$a")
OPTAB_D (len_store_optab, "len_store_$a")
+OPTAB_D (select_vl_optab, "select_vl$a")
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 3f735945e67..1c8100c1a1c 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -534,7 +534,7 @@ vect_set_loop_controls_directly (class loop *loop, 
loop_vec_info loop_vinfo,
   _10 = (unsigned long) count_12(D);
   ...
   # ivtmp_9 = PHI 
-_36 = MIN_EXPR ;
+_36 = (MIN_EXPR | SELECT_VL) ;
   ...
   vect__4.8_28 = .LEN_LOAD (_17, 32B, _36, 0);
   ...
@@ -549,15 +549,28 @@ vect_set_loop_controls_directly (class loop *loop, 
loop_vec_info loop_vinfo,
   tree step = rgc->controls.length () == 1 ? rgc->controls[0]
   : make_ssa_name (iv_type);
   /* Create decrement IV.  */
-  create_iv (nitems_total, MINUS_EXPR, nitems_step, NULL_TREE, loop,
- _gsi, insert_after, _before_incr,
- _after_incr);
-  gimple_seq_add_stmt (header_seq, gimple_build_assign (step, MIN_EXPR,
- index_before_incr,
- nitems_step));
+  if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
+ {
+   create_iv (nitems_total, MINUS_EXPR, step, NULL_TREE, loop, _gsi,
+  insert_after, _before_incr, _after_incr);
+   tree len = gimple_build (header_seq, IFN_SELECT_VL, iv_type,
+index_before_incr, 

Re: [PATCH] RISC-V: Add Veyron V1 pipeline description

2023-06-08 Thread Kito Cheng via Gcc-patches
> diff --git a/gcc/config/riscv/riscv-cores.def 
> b/gcc/config/riscv/riscv-cores.def
> index 7d87ab7ce28..4078439e562 100644
> --- a/gcc/config/riscv/riscv-cores.def
> +++ b/gcc/config/riscv/riscv-cores.def
> @@ -38,6 +38,7 @@ RISCV_TUNE("sifive-3-series", generic, rocket_tune_info)
>  RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
>  RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
>  RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
> +RISCV_TUNE("veyron-v1", veyron_v1, veyron_v1_tune_info)
>  RISCV_TUNE("size", generic, optimize_size_tune_info)
>
>  #undef RISCV_TUNE
> @@ -77,4 +78,7 @@ RISCV_CORE("thead-c906",  
> "rv64imafdc_xtheadba_xtheadbb_xtheadbs_xtheadcmo_"
>   "xtheadcondmov_xtheadfmemidx_xtheadmac_"
>   "xtheadmemidx_xtheadmempair_xtheadsync",
>   "thead-c906")
> +
> +RISCV_CORE("veyron-v1",   
> "rv64imafdc_zba_zbb_zbc_zbs_zifencei_xventanacondops",
> + "veyron-v1")

It seems xventanacondops is not in trunk yet. I saw that Jeff approved it
earlier, but it has not been committed yet:

https://patchwork.ozlabs.org/project/gcc/patch/20230210224150.2801962-11-philipp.toms...@vrull.eu/


Re: [PATCH v2] LoongArch: Modify the register constraints for template "jumptable" and "indirect_jump" from "r" to "e" [PR110136]

2023-06-08 Thread WANG Xuerui

On 2023/6/8 10:27, Lulu Cheng wrote:

Micro-architecture unconditionally treats a "jr $ra" as "return from 
subroutine",
hence doing "jr $ra" would interfere with both subroutine return prediction and
the more general indirect branch prediction.

Therefore, a problem like PR110136 can cause a significant increase in branch 
error
prediction rate and affect performance. The same problem exists with 
"indirect_jump".

gcc/ChangeLog:

* config/loongarch/loongarch.md: Modify the register constraints for
template "jumptable" and "indirect_jump" from "r" to "e".

Co-authored-by: Andrew Pinski 
---
v1 -> v2:
   1. Modify the description
   2. Modify the register constraints of the template "indirect_jump".
---
  gcc/config/loongarch/loongarch.md | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index 816a943d155..43a2ecc8957 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2895,6 +2895,10 @@ (define_insn "*jump_pic"
  }
[(set_attr "type" "branch")])
  
+;; Micro-architecture unconditionally treats a "jr $ra" as "return from subroutine",
+;; hence doing "jr $ra" would interfere with both subroutine return prediction and


Not all cases of "doing 'jr $ra'" are harmful, obviously. Paraphrasing 
it like e.g. "non-returning indirect jumps through $ra" would be better.


You could simplify the patch title a lot with this too: "Avoid 
non-returning indirect jumps through $ra" is shorter and does not 
duplicate the ChangeLog message.
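
To make the distinction concrete, here is a toy model of a return-address-stack
predictor. It is only a sketch under stated assumptions: ToyPredictor, call,
jr_ra, jr_other and count_mispredicts are hypothetical names, and real hardware
is certainly more involved. The only property taken from the description above
is that every "jr $ra" is predicted as a subroutine return.

```cpp
#include <cstdint>
#include <vector>

// Toy return-address-stack (RAS) predictor; purely illustrative.
struct ToyPredictor
{
  std::vector<std::uint64_t> ras;  // predicted return addresses
  int mispredicts = 0;

  // A call pushes the address the callee is expected to return to.
  void call (std::uint64_t return_addr) { ras.push_back (return_addr); }

  // Assumption: any "jr $ra" is predicted as a return, so the predictor
  // pops the RAS and compares the popped address with the real target.
  // An empty RAS yields 0, i.e. "no usable prediction".
  void jr_ra (std::uint64_t actual_target)
  {
    std::uint64_t predicted = 0;
    if (!ras.empty ())
      {
        predicted = ras.back ();
        ras.pop_back ();
      }
    if (predicted != actual_target)
      ++mispredicts;
  }

  // A "jr" through any other register leaves the RAS untouched.  The
  // general indirect branch predictor is not modelled here.
  void jr_other (std::uint64_t) {}
};

// One call, one table jump to 0x2000, then the real return to 0x1000.
int count_mispredicts (bool table_jump_uses_ra)
{
  ToyPredictor p;
  p.call (0x1000);
  if (table_jump_uses_ra)
    p.jr_ra (0x2000);     // non-returning indirect jump through $ra
  else
    p.jr_other (0x2000);  // same jump through some other register
  p.jr_ra (0x1000);       // the genuine subroutine return
  return p.mispredicts;
}
```

In this model a returning "jr $ra" costs nothing, while a table jump routed
through $ra costs two mispredictions: the jump itself and the later real
return, whose stack entry it consumed.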



+;; the more general indirect branch prediction.
+
  (define_expand "indirect_jump"
[(set (pc) (match_operand 0 "register_operand"))]
""
@@ -2905,7 +2909,7 @@ (define_expand "indirect_jump"
  })
  
  (define_insn "@indirect_jump"

-  [(set (pc) (match_operand:P 0 "register_operand" "r"))]
+  [(set (pc) (match_operand:P 0 "register_operand" "e"))]
""
"jr\t%0"
[(set_attr "type" "jump")
@@ -2928,7 +2932,7 @@ (define_expand "tablejump"
  
  (define_insn "@tablejump"

[(set (pc)
-   (match_operand:P 0 "register_operand" "r"))
+   (match_operand:P 0 "register_operand" "e"))
 (use (label_ref (match_operand 1 "" "")))]
""
"jr\t%0"


Re: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread juzhe.zh...@rivai.ai
LGTM. Let's wait for Jeff and Robin. After this patch, we can start FP16 
autovec.



juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2023-06-08 14:29
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
From: Pan Li 
 
This patch refactors the requirements of both ZVFH and ZVFHMIN. By
default, ZVFHMIN enables FP16 for all the RVV iterators, and ZVFH then
uses one function as the gate for whether FP16 is supported.

Please note that ZVFH covers the ZVFHMIN instructions. This patch
adds one test for this.
 
Signed-off-by: Pan Li 
Co-Authored by: Juzhe-Zhong 
 
gcc/ChangeLog:
 
* config/riscv/riscv-protos.h (float_mode_supported_p):
New function to check whether a floating-point mode is supported by the extension.
* config/riscv/riscv-v.cc (float_mode_supported_p):
Ditto.
* config/riscv/vector-iterators.md: Fix V_WHOLE and V_FRACT.
* config/riscv/vector.md: Add condition to FP define insn.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add vle16 test
for ZVFHMIN.
---
gcc/config/riscv/riscv-protos.h   |   1 +
gcc/config/riscv/riscv-v.cc   |  12 ++
gcc/config/riscv/vector-iterators.md  |  23 +--
gcc/config/riscv/vector.md| 144 ++
.../riscv/rvv/base/zvfhmin-intrinsic.c|  15 +-
5 files changed, 118 insertions(+), 77 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ebbaac255f9..1f606f59ce1 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -177,6 +177,7 @@ rtx expand_builtin (unsigned int, tree, rtx);
bool check_builtin_call (location_t, vec, unsigned int,
   tree, unsigned int, tree *);
bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+bool float_mode_supported_p (machine_mode mode);
bool legitimize_move (rtx, rtx);
void emit_vlmax_vsetvl (machine_mode, rtx);
void emit_hard_vlmax_vsetvl (machine_mode, rtx);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 49752cd8899..fe4eb058ec0 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -418,6 +418,18 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
  && IN_RANGE (INTVAL (elt), minval, maxval));
}
+/* Return true if the inner of mode is HFmode when ZVFH enabled, or other
+   float point machine mode.  */
+bool
+float_mode_supported_p (machine_mode mode)
+{
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+
+  gcc_assert (FLOAT_MODE_P (inner_mode));
+
+  return inner_mode == HFmode ? TARGET_ZVFH : true;
+}
+
/* Return true if VEC is a constant in which every element is in the range
[MINVAL, MAXVAL].  The elements do not need to have the same value.
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index f4946d84449..234b712bc9d 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -453,9 +453,8 @@ (define_mode_iterator V_WHOLE [
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI 
"TARGET_VECTOR_ELEN_64")
   (VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI 
"TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
-  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
-  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
-  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 64")
   (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
@@ -477,7 +476,11 @@ (define_mode_iterator V_WHOLE [
(define_mode_iterator V_FRACT [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI (VNx4QI "TARGET_MIN_VLEN > 32") 
(VNx8QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") (VNx2HI "TARGET_MIN_VLEN > 32") (VNx4HI 
"TARGET_MIN_VLEN >= 128")
-  (VNx1HF "TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_MIN_VLEN > 32") (VNx4HF 
"TARGET_MIN_VLEN >= 128")
+
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+
   (VNx1SI "TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN < 128") (VNx2SI 
"TARGET_MIN_VLEN >= 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN 
< 128")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
@@ -497,12 +500,12 @@ (define_mode_iterator VWEXTI [
])
(define_mode_iterator VWEXTF [
-  (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
-  (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
-  (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
+  (VNx1SF 

RE: [PATCH v7] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread Li, Pan2 via Gcc-patches
Sure, updated in PATCH v8.

https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621016.html

Pan

From: juzhe.zh...@rivai.ai 
Sent: Thursday, June 8, 2023 2:09 PM
To: Li, Pan2; gcc-patches
Cc: Robin Dapp; jeffreyalaw; Li, Pan2; Wang, Yanzhang
Subject: Re: [PATCH v7] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

Please rename float_point_mode_supported_p to float_mode_supported_p.



juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-06-08 14:06
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang
Subject: [PATCH v7] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
From: Pan Li

This patch refactors the requirements of both ZVFH and ZVFHMIN. By
default, ZVFHMIN enables FP16 for all the RVV iterators, and ZVFH then
uses one function as the gate for whether FP16 is supported.

Please note that ZVFH covers the ZVFHMIN instructions. This patch
adds one test for this.

Signed-off-by: Pan Li
Co-Authored by: Juzhe-Zhong

gcc/ChangeLog:

* config/riscv/riscv-protos.h (float_point_mode_supported_p):
New function to check whether a floating-point mode is supported by the extension.
* config/riscv/riscv-v.cc (float_point_mode_supported_p):
Ditto.
* config/riscv/vector-iterators.md: Fix V_WHOLE and V_FRACT.
* config/riscv/vector.md: Add condition to FP define insn.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add vle16 test
for ZVFHMIN.
---
gcc/config/riscv/riscv-protos.h   |   1 +
gcc/config/riscv/riscv-v.cc   |  12 ++
gcc/config/riscv/vector-iterators.md  |  23 +--
gcc/config/riscv/vector.md| 144 ++
.../riscv/rvv/base/zvfhmin-intrinsic.c|  15 +-
5 files changed, 118 insertions(+), 77 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ebbaac255f9..e4881786b53 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -177,6 +177,7 @@ rtx expand_builtin (unsigned int, tree, rtx);
bool check_builtin_call (location_t, vec, unsigned int,
   tree, unsigned int, tree *);
bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+bool float_point_mode_supported_p (machine_mode mode);
bool legitimize_move (rtx, rtx);
void emit_vlmax_vsetvl (machine_mode, rtx);
void emit_hard_vlmax_vsetvl (machine_mode, rtx);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 49752cd8899..1cc157f1858 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -418,6 +418,18 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
  && IN_RANGE (INTVAL (elt), minval, maxval));
}
+/* Return true if the inner of mode is HFmode when ZVFH enabled, or other
+   float point machine mode.  */
+bool
+float_point_mode_supported_p (machine_mode mode)
+{
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+
+  gcc_assert (FLOAT_MODE_P (inner_mode));
+
+  return inner_mode == HFmode ? TARGET_ZVFH : true;
+}
+
/* Return true if VEC is a constant in which every element is in the range
[MINVAL, MAXVAL].  The elements do not need to have the same value.
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index f4946d84449..234b712bc9d 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -453,9 +453,8 @@ (define_mode_iterator V_WHOLE [
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI 
"TARGET_VECTOR_ELEN_64")
   (VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI 
"TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
-  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
-  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
-  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 64")
   (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
@@ -477,7 +476,11 @@ (define_mode_iterator V_WHOLE [
(define_mode_iterator V_FRACT [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI (VNx4QI "TARGET_MIN_VLEN > 32") 
(VNx8QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") (VNx2HI "TARGET_MIN_VLEN > 32") (VNx4HI 
"TARGET_MIN_VLEN >= 128")
-  (VNx1HF "TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_MIN_VLEN > 32") (VNx4HF 
"TARGET_MIN_VLEN >= 128")
+
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+

[PATCH v8] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread Pan Li via Gcc-patches
From: Pan Li 

This patch refactors the requirements of both ZVFH and ZVFHMIN. By
default, ZVFHMIN enables FP16 for all the RVV iterators, and ZVFH then
uses one function as the gate for whether FP16 is supported.

Please note that ZVFH covers the ZVFHMIN instructions. This patch
adds one test for this.

Signed-off-by: Pan Li 
Co-Authored by: Juzhe-Zhong 

gcc/ChangeLog:

* config/riscv/riscv-protos.h (float_mode_supported_p):
New function to check whether a floating-point mode is supported by the extension.
* config/riscv/riscv-v.cc (float_mode_supported_p):
Ditto.
* config/riscv/vector-iterators.md: Fix V_WHOLE and V_FRACT.
* config/riscv/vector.md: Add condition to FP define insn.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add vle16 test
for ZVFHMIN.
---
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-v.cc   |  12 ++
 gcc/config/riscv/vector-iterators.md  |  23 +--
 gcc/config/riscv/vector.md| 144 ++
 .../riscv/rvv/base/zvfhmin-intrinsic.c|  15 +-
 5 files changed, 118 insertions(+), 77 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ebbaac255f9..1f606f59ce1 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -177,6 +177,7 @@ rtx expand_builtin (unsigned int, tree, rtx);
 bool check_builtin_call (location_t, vec, unsigned int,
   tree, unsigned int, tree *);
 bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+bool float_mode_supported_p (machine_mode mode);
 bool legitimize_move (rtx, rtx);
 void emit_vlmax_vsetvl (machine_mode, rtx);
 void emit_hard_vlmax_vsetvl (machine_mode, rtx);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 49752cd8899..fe4eb058ec0 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -418,6 +418,18 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
  && IN_RANGE (INTVAL (elt), minval, maxval));
 }
 
+/* Return true if the inner of mode is HFmode when ZVFH enabled, or other
+   float point machine mode.  */
+bool
+float_mode_supported_p (machine_mode mode)
+{
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+
+  gcc_assert (FLOAT_MODE_P (inner_mode));
+
+  return inner_mode == HFmode ? TARGET_ZVFH : true;
+}
+
 /* Return true if VEC is a constant in which every element is in the range
[MINVAL, MAXVAL].  The elements do not need to have the same value.
 
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index f4946d84449..234b712bc9d 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -453,9 +453,8 @@ (define_mode_iterator V_WHOLE [
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI 
"TARGET_VECTOR_ELEN_64")
   (VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI 
"TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
 
-  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
-  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
-  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 64")
   (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
@@ -477,7 +476,11 @@ (define_mode_iterator V_WHOLE [
 (define_mode_iterator V_FRACT [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI (VNx4QI "TARGET_MIN_VLEN > 32") 
(VNx8QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") (VNx2HI "TARGET_MIN_VLEN > 32") (VNx4HI 
"TARGET_MIN_VLEN >= 128")
-  (VNx1HF "TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_MIN_VLEN > 32") (VNx4HF 
"TARGET_MIN_VLEN >= 128")
+
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+
   (VNx1SI "TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN < 128") (VNx2SI 
"TARGET_MIN_VLEN >= 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN 
< 128")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
@@ -497,12 +500,12 @@ (define_mode_iterator VWEXTI [
 ])
 
 (define_mode_iterator VWEXTF [
-  (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
-  (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
-  (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
+  (VNx1SF "TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32 && 
TARGET_MIN_VLEN < 128")
+  (VNx2SF "TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32")
+  (VNx4SF "TARGET_VECTOR_ELEN_FP_16 && 

Re: [PATCH 2/2] cprop_hardreg: Enable propagation of the stack pointer if possible.

2023-06-08 Thread Manolis Tsamis
Hi Jeff,

Yes that one has changed; I changed the implementation based on your feedback.

Thanks,
Manolis

On Thu, Jun 8, 2023 at 1:18 AM Jeff Law  wrote:
>
>
>
> On 5/25/23 06:35, Manolis Tsamis wrote:
> > Propagation of the stack pointer in cprop_hardreg is currently forbidden
> > in all cases, due to maybe_mode_change returning NULL. Relax this
> > restriction and allow propagation when no mode change is requested.
> >
> > gcc/ChangeLog:
> >
> >  * regcprop.cc (maybe_mode_change): Enable stack pointer 
> > propagation.
> Thanks for the clarification.  This is OK for the trunk.  It looks
> generic enough to have value going forward now rather than waiting.
>
> jeff
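
For readers following along, the relaxed rule from the quoted description can
be modelled as below. This is a toy sketch, not the code in regcprop.cc: Mode,
kStackPointerRegno and maybe_mode_change_sketch are hypothetical names, and the
other checks the real maybe_mode_change performs are left out. It only shows
"refuse the stack pointer when a mode change is requested, allow it otherwise".

```cpp
#include <optional>

// Hypothetical, much reduced set of machine modes.
enum class Mode { SI, DI };

// Hypothetical register number standing in for STACK_POINTER_REGNUM.
constexpr int kStackPointerRegno = 2;

// Returns the register to substitute, or std::nullopt when the
// propagation must be refused.  Only the stack pointer rule from the
// description is modelled; every other register is accepted as is.
std::optional<int>
maybe_mode_change_sketch (Mode orig_mode, Mode new_mode, int copy_regno)
{
  // The stack pointer may be propagated only when no mode change is
  // requested, i.e. when the original and the new mode agree.
  if (copy_regno == kStackPointerRegno && orig_mode != new_mode)
    return std::nullopt;
  return copy_regno;
}
```

Under the old behaviour described above, the first branch would have refused
the stack pointer unconditionally.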


Re: [PATCH v7] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread juzhe.zh...@rivai.ai
Please rename float_point_mode_supported_p to float_mode_supported_p.




juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2023-06-08 14:06
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang
Subject: [PATCH v7] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
From: Pan Li 
 
This patch refactors the requirements of both ZVFH and ZVFHMIN. By
default, ZVFHMIN enables FP16 for all the RVV iterators, and ZVFH then
uses one function as the gate for whether FP16 is supported.

Please note that ZVFH covers the ZVFHMIN instructions. This patch
adds one test for this.
 
Signed-off-by: Pan Li 
Co-Authored by: Juzhe-Zhong 
 
gcc/ChangeLog:
 
* config/riscv/riscv-protos.h (float_point_mode_supported_p):
New function to check whether a floating-point mode is supported by the extension.
* config/riscv/riscv-v.cc (float_point_mode_supported_p):
Ditto.
* config/riscv/vector-iterators.md: Fix V_WHOLE and V_FRACT.
* config/riscv/vector.md: Add condition to FP define insn.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add vle16 test
for ZVFHMIN.
---
gcc/config/riscv/riscv-protos.h   |   1 +
gcc/config/riscv/riscv-v.cc   |  12 ++
gcc/config/riscv/vector-iterators.md  |  23 +--
gcc/config/riscv/vector.md| 144 ++
.../riscv/rvv/base/zvfhmin-intrinsic.c|  15 +-
5 files changed, 118 insertions(+), 77 deletions(-)
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ebbaac255f9..e4881786b53 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -177,6 +177,7 @@ rtx expand_builtin (unsigned int, tree, rtx);
bool check_builtin_call (location_t, vec, unsigned int,
   tree, unsigned int, tree *);
bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+bool float_point_mode_supported_p (machine_mode mode);
bool legitimize_move (rtx, rtx);
void emit_vlmax_vsetvl (machine_mode, rtx);
void emit_hard_vlmax_vsetvl (machine_mode, rtx);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 49752cd8899..1cc157f1858 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -418,6 +418,18 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
  && IN_RANGE (INTVAL (elt), minval, maxval));
}
+/* Return true if the inner of mode is HFmode when ZVFH enabled, or other
+   float point machine mode.  */
+bool
+float_point_mode_supported_p (machine_mode mode)
+{
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+
+  gcc_assert (FLOAT_MODE_P (inner_mode));
+
+  return inner_mode == HFmode ? TARGET_ZVFH : true;
+}
+
/* Return true if VEC is a constant in which every element is in the range
[MINVAL, MAXVAL].  The elements do not need to have the same value.
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index f4946d84449..234b712bc9d 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -453,9 +453,8 @@ (define_mode_iterator V_WHOLE [
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI 
"TARGET_VECTOR_ELEN_64")
   (VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI 
"TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
-  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
-  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
-  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 64")
   (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
@@ -477,7 +476,11 @@ (define_mode_iterator V_WHOLE [
(define_mode_iterator V_FRACT [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI (VNx4QI "TARGET_MIN_VLEN > 32") 
(VNx8QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") (VNx2HI "TARGET_MIN_VLEN > 32") (VNx4HI 
"TARGET_MIN_VLEN >= 128")
-  (VNx1HF "TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_MIN_VLEN > 32") (VNx4HF 
"TARGET_MIN_VLEN >= 128")
+
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+
   (VNx1SI "TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN < 128") (VNx2SI 
"TARGET_MIN_VLEN >= 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN 
< 128")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
@@ -497,12 +500,12 @@ (define_mode_iterator VWEXTI [
])
(define_mode_iterator VWEXTF [
-  (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
-  (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
-  (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
+  (VNx1SF 

RE: [PATCH v5] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread Li, Pan2 via Gcc-patches
Updated in PATCH v7 for this change (please ignore v6); thanks Juzhe for the
suggestion.

https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621012.html

Pan

From: Li, Pan2
Sent: Wednesday, June 7, 2023 4:43 PM
To: juzhe.zh...@rivai.ai; gcc-patches
Cc: Robin Dapp; jeffreyalaw; Wang, Yanzhang
Subject: RE: [PATCH v5] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

Thanks Juzhe for reviewing. I see; this approach may need an even smaller code
change, since it treats zvfhmin as the minimal base sub-extension.
I will give it a try in PATCH v6.

Pan

From: juzhe.zh...@rivai.ai
Sent: Wednesday, June 7, 2023 4:27 PM
To: Li, Pan2; gcc-patches
Cc: Robin Dapp; jeffreyalaw; Li, Pan2; Wang, Yanzhang
Subject: Re: [PATCH v5] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

In this patch, you add TARGET_ZVFH to the VF iterator, which is not correct.

When TARGET_ZVFH is true, TARGET_ZVFHMIN is always true.

For vfadd, it is true that we should enable "vfadd" only for TARGET_ZVFH.
For vle16, we should enable it for TARGET_ZVFHMIN.
This patch will disable both "vle16" and "vfadd" for FP16 on ZVFHMIN, which is
not correct.

I think you should allow all FP16 vector modes in the iterators, enabled by
TARGET_VECTOR_ELEN_FP_16 (TARGET_ZVFHMIN).

Then, when zvfhmin is enabled, all FP16 instructions are enabled by default.

To gate the patterns that should only be enabled when TARGET_ZVFH is set, add a
predicate as below:
For example:
vfadd.vv (need

(define_insn "@pred_"
  [(set (match_operand:VF 0 "register_operand"   "=vd, vd, vr, vr")
  (if_then_else:VF
(unspec:
  [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1")
   (match_operand 5 "vector_length_operand"" rK, rK, rK, rK")
   (match_operand 6 "const_int_operand""  i,  i,  i,  i")
   (match_operand 7 "const_int_operand""  i,  i,  i,  i")
   (match_operand 8 "const_int_operand""  i,  i,  i,  i")
   (match_operand 9 "const_int_operand""  i,  i,  i,  i")
   (reg:SI VL_REGNUM)
   (reg:SI VTYPE_REGNUM)
   (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
(any_float_binop:VF
  (match_operand:VF 3 "register_operand"   " vr, vr, vr, vr")
  (match_operand:VF 4 "register_operand"   " vr, vr, vr, vr"))
(match_operand:VF 2 "vector_merge_operand" " vu,  0, vu,  0")))]
  "TARGET_VECTOR && riscv_vector::float_mode_supported_p (mode)"
  "vf.vv\t%0,%3,%4%p1"
  [(set_attr "type" "")
   (set_attr "mode" "")])

bool
float_mode_supported_p (machine_mode mode)
{
  if (GET_MODE_INNER (mode) == HFmode)
    return TARGET_ZVFH;
  return true;
}
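
A self-contained model of the two-level gating suggested above may help. This
is a sketch, not GCC code: Target, ElemMode, mode_available,
float_mode_supported, can_vle16 and can_vfadd_fp16 are hypothetical stand-ins
for the TARGET_* macros, the iterator conditions and the insn condition.

```cpp
// Toy model of the suggested gating; not GCC code.
enum class ElemMode { HF, SF, DF };

struct Target
{
  bool zvfhmin;  // stands in for TARGET_ZVFHMIN / TARGET_VECTOR_ELEN_FP_16
  bool zvfh;     // stands in for TARGET_ZVFH
};

// ZVFH implies ZVFHMIN, so normalize the flags first.
inline Target normalize (Target t)
{
  t.zvfhmin = t.zvfhmin || t.zvfh;
  return t;
}

// Iterator level: FP16 vector modes exist as soon as ZVFHMIN is enabled.
// Conditions for SF and DF are not modelled; they are always available here.
inline bool mode_available (Target t, ElemMode m)
{
  return m != ElemMode::HF || normalize (t).zvfhmin;
}

// Insn level: arithmetic patterns such as vfadd additionally need ZVFH
// when the element mode is HF.
inline bool float_mode_supported (Target t, ElemMode m)
{
  return m == ElemMode::HF ? normalize (t).zvfh : true;
}

// vle16 only needs the FP16 mode to exist.
inline bool can_vle16 (Target t)
{
  return mode_available (t, ElemMode::HF);
}

// FP16 vfadd needs the mode to exist and the insn-level gate to pass.
inline bool can_vfadd_fp16 (Target t)
{
  return mode_available (t, ElemMode::HF)
         && float_mode_supported (t, ElemMode::HF);
}
```

With only zvfhmin set, can_vle16 holds and can_vfadd_fp16 does not; with zvfh
set, both hold, which is the behaviour asked for above.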



juzhe.zh...@rivai.ai

From: pan2.li
Date: 2023-06-07 16:06
To: gcc-patches
CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang
Subject: [PATCH v5] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.
From: Pan Li

This patch refactors the requirements of both ZVFH and ZVFHMIN. The
related define_insn patterns and iterators take their requirements
from ZVFHMIN and ZVFH.

Please note that ZVFH covers the ZVFHMIN instructions. This patch
adds one test for this.

Signed-off-by: Pan Li

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Add requirement to VF,
VWEXTF and VWCONVERTI, add V_CONVERT_F and VCONVERTF.
* config/riscv/vector.md: Adjust FP convert to V_CONVERT_F
and VCONVERTF, and fix V_WHOLE and V_FRACT.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: New test.
---
gcc/config/riscv/vector-iterators.md  | 79 +--
gcc/config/riscv/vector.md| 46 +--
.../riscv/rvv/base/zvfh-over-zvfhmin.c| 25 ++
3 files changed, 104 insertions(+), 46 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c

diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index f4946d84449..e6c2ecf7c86 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -296,13 +296,13 @@ (define_mode_iterator VWI_ZVE32 [
])
(define_mode_iterator VF [
-  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
-  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
-  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
-  (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
-  (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
-  (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
-  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_ZVFH && 

[PATCH v7] RISC-V: Refactor requirement of ZVFH and ZVFHMIN.

2023-06-08 Thread Pan Li via Gcc-patches
From: Pan Li 

This patch refactors the requirements of both ZVFH and ZVFHMIN. By
default, ZVFHMIN enables FP16 for all the RVV iterators, and ZVFH then
uses one function as the gate for whether FP16 is supported.

Please note that ZVFH covers the ZVFHMIN instructions. This patch
adds one test for this.

Signed-off-by: Pan Li 
Co-Authored by: Juzhe-Zhong 

gcc/ChangeLog:

* config/riscv/riscv-protos.h (float_point_mode_supported_p):
New function to check whether a floating-point mode is supported by the extension.
* config/riscv/riscv-v.cc (float_point_mode_supported_p):
Ditto.
* config/riscv/vector-iterators.md: Fix V_WHOLE and V_FRACT.
* config/riscv/vector.md: Add condition to FP define insn.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Add vle16 test
for ZVFHMIN.
---
 gcc/config/riscv/riscv-protos.h   |   1 +
 gcc/config/riscv/riscv-v.cc   |  12 ++
 gcc/config/riscv/vector-iterators.md  |  23 +--
 gcc/config/riscv/vector.md| 144 ++
 .../riscv/rvv/base/zvfhmin-intrinsic.c|  15 +-
 5 files changed, 118 insertions(+), 77 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index ebbaac255f9..e4881786b53 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -177,6 +177,7 @@ rtx expand_builtin (unsigned int, tree, rtx);
 bool check_builtin_call (location_t, vec, unsigned int,
   tree, unsigned int, tree *);
 bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+bool float_point_mode_supported_p (machine_mode mode);
 bool legitimize_move (rtx, rtx);
 void emit_vlmax_vsetvl (machine_mode, rtx);
 void emit_hard_vlmax_vsetvl (machine_mode, rtx);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 49752cd8899..1cc157f1858 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -418,6 +418,18 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
  && IN_RANGE (INTVAL (elt), minval, maxval));
 }
 
+/* Return true if the inner of mode is HFmode when ZVFH enabled, or other
+   float point machine mode.  */
+bool
+float_point_mode_supported_p (machine_mode mode)
+{
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+
+  gcc_assert (FLOAT_MODE_P (inner_mode));
+
+  return inner_mode == HFmode ? TARGET_ZVFH : true;
+}
+
 /* Return true if VEC is a constant in which every element is in the range
[MINVAL, MAXVAL].  The elements do not need to have the same value.
 
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index f4946d84449..234b712bc9d 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -453,9 +453,8 @@ (define_mode_iterator V_WHOLE [
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI 
"TARGET_VECTOR_ELEN_64")
   (VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI 
"TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
 
-  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
-  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
-  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN == 64")
   (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
@@ -477,7 +476,11 @@ (define_mode_iterator V_WHOLE [
 (define_mode_iterator V_FRACT [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI (VNx4QI "TARGET_MIN_VLEN > 32") 
(VNx8QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") (VNx2HI "TARGET_MIN_VLEN > 32") (VNx4HI 
"TARGET_MIN_VLEN >= 128")
-  (VNx1HF "TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_MIN_VLEN > 32") (VNx4HF 
"TARGET_MIN_VLEN >= 128")
+
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
+
   (VNx1SI "TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN < 128") (VNx2SI 
"TARGET_MIN_VLEN >= 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32 && TARGET_MIN_VLEN 
< 128")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
@@ -497,12 +500,12 @@ (define_mode_iterator VWEXTI [
 ])
 
 (define_mode_iterator VWEXTF [
-  (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
-  (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
-  (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
-  (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
+  (VNx1SF "TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32 && 
TARGET_MIN_VLEN < 128")
+  (VNx2SF "TARGET_VECTOR_ELEN_FP_16 && TARGET_VECTOR_ELEN_FP_32")
+  (VNx4SF