Re: [PATCH] Add VXRM enum

2023-07-12 Thread Robin Dapp via Gcc-patches
> +enum __RISCV_VXRM {
> +  __RISCV_VXRM_RNU = 0,
> +  __RISCV_VXRM_RNE = 1,
> +  __RISCV_VXRM_RDN = 2,
> +  __RISCV_VXRM_ROD = 3,
> +};
> +
>  __extension__ extern __inline unsigned long
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  vread_csr(enum RVV_CSR csr)

We have that already in riscv-protos.h :)
(fixed_point_rounding_mode)

Regards
 Robin



Re: [PATCH] tree-optimization/94864 - vector insert of vector extract simplification

2023-07-12 Thread Richard Biener via Gcc-patches
On Wed, 12 Jul 2023, Richard Sandiford wrote:

> Richard Biener  writes:
> > The PRs ask for optimizing of
> >
> >   _1 = BIT_FIELD_REF ;
> >   result_4 = BIT_INSERT_EXPR ;
> >
> > to a vector permutation.  The following implements this as
> > match.pd pattern, improving code generation on x86_64.
> >
> > On the RTL level we face the issue that backend patterns inconsistently
> > use vec_merge and vec_select of vec_concat to represent permutes.
> 
> Yeah, the current RTL codes probably overlap a bit too much.
> 
> Maybe we should have a rule that a vec_merge with a constant
> third operand should be canonicalised to a vec_select?

But vec_merge always has a constant third operand:

@findex vec_merge
@item (vec_merge:@var{m} @var{vec1} @var{vec2} @var{items})
This describes a merge operation between two vectors.  The result is a
vector of mode @var{m}; its elements are selected from either @var{vec1} or
@var{vec2}.  Which elements are selected is described by @var{items}, which
is a bit mask represented by a @code{const_int}; a zero bit indicates the
corresponding element in the result vector is taken from @var{vec2} while
a set bit indicates it is taken from @var{vec1}.

the "advantage" of vec_merge over vec_concat + vec_select is
that you don't need the 2x wider vector mode, but that's the
only one.  I guess we could allow a mode-less (VOIDmode) vec_concat as
the first operand of a vec_select since that mode isn't really
used for anything.
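
To make the documented semantics concrete, here is a minimal sketch in plain C of what vec_merge computes, next to the equivalent selector into the concatenation of both inputs (the vec_select of vec_concat form).  The function names are made up for illustration and are not GCC APIs; the sketch assumes at most 64 elements so the mask fits in a uint64_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of (vec_merge:M vec1 vec2 items) on arrays of N ints:
   bit i of ITEMS set   -> result[i] = vec1[i]
   bit i of ITEMS clear -> result[i] = vec2[i].  */
void
vec_merge_model (int *result, const int *vec1, const int *vec2,
                 uint64_t items, size_t n)
{
  for (size_t i = 0; i < n; i++)
    result[i] = ((items >> i) & 1) ? vec1[i] : vec2[i];
}

/* The same blend expressed as a selector into the concatenation
   { vec1, vec2 }: index i picks vec1[i], index n + i picks vec2[i].
   Every lane stays in place, which is what makes it a blend.  */
void
blend_selector_model (size_t *sel, uint64_t items, size_t n)
{
  for (size_t i = 0; i < n; i++)
    sel[i] = ((items >> i) & 1) ? i : n + i;
}
```

With four elements and items = 0x5, lanes 0 and 2 come from vec1 and lanes 1 and 3 from vec2; the matching selector is { 0, 5, 2, 7 }, which needs the 2x wider concatenated vector only as an indexing space.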

That said, we could work around the issue by having combine
also try to match vec_merge when the vec_select + vec_concat
combination is a blend.  But I fear that doesn't resonate well
with Segher.

>  And maybe
> change the first operand of vec_select to be an rtvec, so that
> no separate vec_concat (and thus wider mode) is needed for two-input
> permutes?  Would be a lot of work though...
> 
> > I think using a (supported) permute is almost always better
> > than an extract plus insert, maybe excluding the case we extract
> > element zero and that's aliased to a register that can be used
> > directly for insertion (not sure how to query that).
> 
> Yeah, extraction of the low element (0 for LE, N-1 for BE) is special
> in RTL, in that it is now folded to a subreg.  But IMO it's reasonable
> for even that case to go through TARGET_VECTORIZE_VEC_PERM_CONST,
> maybe with a target-independent helper function to match permute
> vectors that are equivalent to extract-and-insert.
> 
> On AArch64, extract-and-insert is a single operation for other
> elements too, e.g.:
> 
>   ins v0.s[2], v1.s[1]
> 
> is a thing.  But if the helper returns the index of the extracted
> elements, targets can decide for themselves whether the index is
> supported or not.
> 
> Agree that this is the right thing for gimple to do FWIW.

I think so as well.  Btw, I think only proper re-association and
merging will handle a full sequence of select, merge and permute
optimally.  In principle we have the bswap pass facility for this.
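
As an illustration of the equivalence discussed above, the following plain C sketch builds the two-input permute selector that corresponds to "extract element relt of b, insert it at element ielt of a".  It is not the code from the patch; the function names are hypothetical, and the only convention assumed is the usual one where selector indices 0 .. n-1 address the first input and n .. 2n-1 the second.

```c
#include <stddef.h>

/* Fill SEL so that permuting the concatenation { a, b } by SEL yields
   a copy of A whose element IELT is replaced by B[RELT].  */
void
insert_extract_selector (size_t *sel, size_t n, size_t ielt, size_t relt)
{
  for (size_t i = 0; i < n; i++)
    sel[i] = (i == ielt) ? n + relt : i;
}

/* Apply a two-input permute: result[i] = concat (a, b)[sel[i]].  */
void
permute2 (int *result, const int *a, const int *b,
          const size_t *sel, size_t n)
{
  for (size_t i = 0; i < n; i++)
    result[i] = sel[i] < n ? a[sel[i]] : b[sel[i] - n];
}
```

For n = 4, ielt = 2 and relt = 1 (the shape of the AArch64 "ins v0.s[2], v1.s[1]" example) the selector is { 0, 1, 5, 3 }.  A helper that recognises selectors of this shape could hand the pair (ielt, relt) back to the target.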

Richard.

> Thanks,
> Richard
> 
> > But this regresses for example gcc.target/i386/pr54855-8.c because PRE
> > now realizes that
> >
> >   _1 = BIT_FIELD_REF ;
> >   if (_1 > a_4(D))
> > goto ; [50.00%]
> >   else
> > goto ; [50.00%]
> >
> >[local count: 536870913]:
> >
> >[local count: 1073741824]:
> >   # iftmp.0_2 = PHI <_1(3), a_4(D)(2)>
> >   x_5 = BIT_INSERT_EXPR ;
> >
> > is equal to
> >
> >[local count: 1073741824]:
> >   _1 = BIT_FIELD_REF ;
> >   if (_1 > a_4(D))
> > goto ; [50.00%]
> >   else
> > goto ; [50.00%]
> >
> >[local count: 536870912]:
> >   _7 = BIT_INSERT_EXPR ;
> >
> >[local count: 1073741824]:
> >   # prephitmp_8 = PHI 
> >
> > and that no longer produces the desired maxsd operation at the RTL
> > level (we fail to match .FMAX at the GIMPLE level earlier).
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu with regressions:
> >
> > FAIL: gcc.target/i386/pr54855-13.c scan-assembler-times vmaxsh[ t] 1
> > FAIL: gcc.target/i386/pr54855-13.c scan-assembler-not vcomish[ t]
> > FAIL: gcc.target/i386/pr54855-8.c scan-assembler-times maxsd 1
> > FAIL: gcc.target/i386/pr54855-8.c scan-assembler-not movsd
> > FAIL: gcc.target/i386/pr54855-9.c scan-assembler-times minss 1
> > FAIL: gcc.target/i386/pr54855-9.c scan-assembler-not movss
> >
> > I think this is also PR88540 (the lack of min/max detection, not
> > sure if the SSE min/max are suitable here)
> >
> > PR tree-optimization/94864
> > PR tree-optimization/94865
> > * match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): New pattern
> > for vector insertion from vector extraction.
> >
> > * gcc.target/i386/pr94864.c: New testcase.
> > * gcc.target/i386/pr94865.c: Likewise.
> > ---
> >  gcc/match.pd                            | 25 +
> >  gcc/testsuite/gcc.target/i386/pr94864.c | 13 +
> >  gcc/testsuite/gcc.target/i386/pr94865.c | 13 +
> >  3 files changed, 51 insertions(+)
> >  

Re: [PATCH] tree-optimization/94864 - vector insert of vector extract simplification

2023-07-12 Thread Richard Biener via Gcc-patches
On Thu, 13 Jul 2023, Hongtao Liu wrote:

> On Thu, Jul 13, 2023 at 10:47 AM Hongtao Liu wrote:
> >
> > On Wed, Jul 12, 2023 at 9:37 PM Richard Biener via Gcc-patches
> > wrote:
> > >
> > > The PRs ask for optimizing of
> > >
> > >   _1 = BIT_FIELD_REF ;
> > >   result_4 = BIT_INSERT_EXPR ;
> > >
> > > to a vector permutation.  The following implements this as
> > > match.pd pattern, improving code generation on x86_64.
> > >
> > > On the RTL level we face the issue that backend patterns inconsistently
> > > use vec_merge and vec_select of vec_concat to represent permutes.
> > >
> > > I think using a (supported) permute is almost always better
> > > than an extract plus insert, maybe excluding the case we extract
> > > element zero and that's aliased to a register that can be used
> > > directly for insertion (not sure how to query that).
> > >
> > > But this regresses for example gcc.target/i386/pr54855-8.c because PRE
> > > now realizes that
> > >
> > >   _1 = BIT_FIELD_REF ;
> > >   if (_1 > a_4(D))
> > > goto ; [50.00%]
> > >   else
> > > goto ; [50.00%]
> > >
> > >[local count: 536870913]:
> > >
> > >[local count: 1073741824]:
> > >   # iftmp.0_2 = PHI <_1(3), a_4(D)(2)>
> > >   x_5 = BIT_INSERT_EXPR ;
> > >
> > > is equal to
> > >
> > >[local count: 1073741824]:
> > >   _1 = BIT_FIELD_REF ;
> > >   if (_1 > a_4(D))
> > > goto ; [50.00%]
> > >   else
> > > goto ; [50.00%]
> > >
> > >[local count: 536870912]:
> > >   _7 = BIT_INSERT_EXPR ;
> > >
> > >[local count: 1073741824]:
> > >   # prephitmp_8 = PHI 
> > >
> > > and that no longer produces the desired maxsd operation at the RTL
> > The comparison is in scalar mode, but the operations in then_bb are in
> > vector mode, so if_convert can't eliminate the condition any more (and
> > won't go into the backend's ix86_expand_sse_fp_minmax).
> > I think ordered comparisons like _1 > a_4 don't match fmin/fmax, but
> > they do match SSE MINSS/MAXSS, since those always return the second
> > operand (not the other operand) when there's NONE.
> I mean NANs.

Btw, I once tried to recognize MAX here at the GIMPLE level, but
while the x86 (vector) max insns are fine for x > y ? x : y, we
have no tree code or optab for exactly that: MAX_EXPR behaves
differently for NaN, and .FMAX is exactly IEEE, which the x86 ISA
isn't.

I wonder if we thus should if-convert this on the GIMPLE level
but to x > y ? x : y, thus a COND_EXPR?
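
The NaN difference can be shown with a small plain C sketch.  cond_max is the source-level conditional; fmax_model is a hand-written stand-in for the C library fmax NaN rule (return the non-NaN operand), used here only so the example does not depend on libm.  Both names are made up for illustration.

```c
#include <math.h>

/* The conditional form: an ordered compare.  If either operand is
   NaN the compare is false, so the second operand Y is returned.  */
double
cond_max (double x, double y)
{
  return x > y ? x : y;
}

/* Model of the fmax NaN rule: if exactly one operand is NaN, the
   other operand is returned.  */
double
fmax_model (double x, double y)
{
  if (isnan (x))
    return y;
  if (isnan (y))
    return x;
  return x > y ? x : y;
}
```

The two agree whenever no NaN is involved and when only the first operand is NaN.  They differ when the second operand is NaN: cond_max (1.0, NAN) is NaN while fmax_model (1.0, NAN) is 1.0.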

Richard.

> > > level (we fail to match .FMAX at the GIMPLE level earlier).
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu with regressions:
> > >
> > > FAIL: gcc.target/i386/pr54855-13.c scan-assembler-times vmaxsh[ t] 1
> > > FAIL: gcc.target/i386/pr54855-13.c scan-assembler-not vcomish[ t]
> > > FAIL: gcc.target/i386/pr54855-8.c scan-assembler-times maxsd 1
> > > FAIL: gcc.target/i386/pr54855-8.c scan-assembler-not movsd
> > > FAIL: gcc.target/i386/pr54855-9.c scan-assembler-times minss 1
> > > FAIL: gcc.target/i386/pr54855-9.c scan-assembler-not movss
> > >
> > > I think this is also PR88540 (the lack of min/max detection, not
> > > sure if the SSE min/max are suitable here)
> > >
> > > PR tree-optimization/94864
> > > PR tree-optimization/94865
> > > * match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): New pattern
> > > for vector insertion from vector extraction.
> > >
> > > * gcc.target/i386/pr94864.c: New testcase.
> > > * gcc.target/i386/pr94865.c: Likewise.
> > > ---
> > >  gcc/match.pd                            | 25 +
> > >  gcc/testsuite/gcc.target/i386/pr94864.c | 13 +
> > >  gcc/testsuite/gcc.target/i386/pr94865.c | 13 +
> > >  3 files changed, 51 insertions(+)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr94864.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr94865.c
> > >
> > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > index 8543f777a28..8cc106049c4 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -7770,6 +7770,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >   wi::to_wide (@ipos) + isize))
> > >  (BIT_FIELD_REF @0 @rsize @rpos)
> > >
> > > +/* Simplify vector inserts of other vector extracts to a permute.  */
> > > +(simplify
> > > + (bit_insert @0 (BIT_FIELD_REF@2 @1 @rsize @rpos) @ipos)
> > > + (if (VECTOR_TYPE_P (type)
> > > +  && types_match (@0, @1)
> > > +  && types_match (TREE_TYPE (TREE_TYPE (@0)), TREE_TYPE (@2))
> > > +  && TYPE_VECTOR_SUBPARTS (type).is_constant ())
> > > +  (with
> > > +   {
> > > + unsigned HOST_WIDE_INT elsz
> > > +   = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (TREE_TYPE (@1))));
> > > + poly_uint64 relt = exact_div (tree_to_poly_uint64 (@rpos), elsz);
> > > + poly_uint64 ielt = exact_div (tree_to_poly_uint64 (@ipos), elsz);
> > > + unsigned nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
> > > + vec_perm_builder builder;
> > > + builder.n

[PATCH] Add VXRM enum

2023-07-12 Thread chenyixuan
From: XYenChi 

Noticed that the rvv-intrinsic-doc has updated __RISCV_VXRM.
This patch adds the __RISCV_VXRM enum to riscv_vector.h.

gcc/ChangeLog:

2023-07-13  XYenChi

* config/riscv/riscv_vector.h (enum __RISCV_VXRM): Add an enum
__RISCV_VXRM to help express the rounding modes.


---
 gcc/config/riscv/riscv_vector.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/config/riscv/riscv_vector.h b/gcc/config/riscv/riscv_vector.h
index ff54b6be863..0a90816be1a 100644
--- a/gcc/config/riscv/riscv_vector.h
+++ b/gcc/config/riscv/riscv_vector.h
@@ -42,6 +42,13 @@ enum RVV_CSR {
   RVV_VCSR,
 };
 
+enum __RISCV_VXRM {
+  __RISCV_VXRM_RNU = 0,
+  __RISCV_VXRM_RNE = 1,
+  __RISCV_VXRM_RDN = 2,
+  __RISCV_VXRM_ROD = 3,
+};
+
 __extension__ extern __inline unsigned long
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vread_csr(enum RVV_CSR csr)
-- 
2.41.0
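
For readers unfamiliar with the four encodings, here is a minimal plain C sketch of the fixed-point rounding they select, following the roundoff definition in the RISC-V vector specification as I understand it: RNU is round-to-nearest-up, RNE round-to-nearest-even, RDN round-down (truncate) and ROD round-to-odd.  The enum and function names below are made up for the sketch (the real enumerators use the reserved __RISCV_VXRM_ prefix), and the shift amount is assumed to be in the range 1..63.

```c
#include <stdint.h>

enum vxrm_model
{
  VXRM_MODEL_RNU = 0,
  VXRM_MODEL_RNE = 1,
  VXRM_MODEL_RDN = 2,
  VXRM_MODEL_ROD = 3
};

/* Shift V right by D bits (1 <= D <= 63) and add the rounding
   increment selected by MODE.  */
uint64_t
roundoff_unsigned (uint64_t v, unsigned d, enum vxrm_model mode)
{
  uint64_t shifted = v >> d;
  uint64_t lsb = shifted & 1;                        /* v[d] */
  uint64_t half = (v >> (d - 1)) & 1;                /* v[d-1] */
  uint64_t rest = v & ((UINT64_C (1) << (d - 1)) - 1); /* v[d-2:0] */
  uint64_t r = 0;

  switch (mode)
    {
    case VXRM_MODEL_RNU:
      r = half;
      break;
    case VXRM_MODEL_RNE:
      r = half & ((rest != 0) | lsb);
      break;
    case VXRM_MODEL_RDN:
      r = 0;
      break;
    case VXRM_MODEL_ROD:
      /* Set the low bit if any discarded bit was set.  */
      r = !lsb & (half | (rest != 0));
      break;
    }
  return shifted + r;
}
```

For example, shifting 10 right by 2 (the exact value is 2.5) gives 3 under RNU, 2 under RNE, 2 under RDN and 3 under ROD.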



Re: [PATCH v2] RISC-V: Refactor riscv mode after for VXRM and FRM

2023-07-12 Thread Kito Cheng via Gcc-patches
Hmmm? I didn't get that error on selftest?

my diff with your v2:

$ git diff
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 12655f7fdc65..466e1aed91c7 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8058,8 +8058,9 @@ asm_insn_p (rtx_insn *insn)
static bool
vxrm_unknown_p (rtx_insn *insn)
{
+  static const_rtx vxrm_reg = gen_rtx_REG (SImode, VXRM_REGNUM);
  /* Return true if there is a definition of VXRM.  */
-  if (reg_set_p (gen_rtx_REG (SImode, VXRM_REGNUM), insn))
+  if (reg_set_p (vxrm_reg, insn))
return true;

  /* A CALL function may contain an instruction that modifies the VXRM,
@@ -8080,8 +8081,9 @@ vxrm_unknown_p (rtx_insn *insn)
static bool
frm_unknown_dynamic_p (rtx_insn *insn)
{
+  static const_rtx frm_reg = gen_rtx_REG (SImode, FRM_REGNUM);
  /* Return true if there is a definition of FRM.  */
-  if (reg_set_p (gen_rtx_REG (SImode, FRM_REGNUM), insn))
+  if (reg_set_p (frm_reg, insn))
return true;

  /* A CALL function may contain an instruction that modifies the FRM,


On Thu, Jul 13, 2023 at 1:07 PM Li, Pan2 via Gcc-patches
 wrote:
>
> Thanks Jeff and Kito for comments, update the V3 version as below.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624347.html
>
> > Extract vxrm reg to a local static variable to avoid constructing it
> > again and again.
>
> The "static const_rtx vxrm_rtx = gen_rtx_REG (SImode, VXRM_REGNUM)" results
> in an error during selftest, shown below, so patch v3 doesn't include this
> change.
>
> /home/pli/repos/gcc/111/riscv-gnu-toolchain/build-gcc-newlib-stage1/./gcc/xgcc
>  -B/home/pli/repos/gcc/111/riscv-gnu-toolchain/build-gcc-newlib-stage1/./gcc/ 
>  -xc -nostdinc /dev/null -S -o /dev/null 
> -fself-test=../.././gcc/gcc/testsuite/selftests
> virtual memory exhausted: Invalid argument
> make[2]: *** [../.././gcc/gcc/c/Make-lang.in:153: s-selftest-c] Error 1
>
> Pan
>
> -----Original Message-----
> From: Jeff Law 
> Sent: Wednesday, July 12, 2023 11:31 PM
> To: Li, Pan2 ; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; rdapp@gmail.com; Wang, Yanzhang 
> ; kito.ch...@gmail.com
> Subject: Re: [PATCH v2] RISC-V: Refactor riscv mode after for VXRM and FRM
>
>
>
> On 7/11/23 23:50, pan2...@intel.com wrote:
> > From: Pan Li 
> >
> > When investigating the FRM dynamic rounding mode, we found that the
> > global unknown status differs considerably between fixed-point and
> > floating-point. Thus, we separate the unknown-state functions and
> > extract some common inner helpers.
> >
> > We will also prepare more test cases in another PATCH.
> >
> > Signed-off-by: Pan Li 
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/riscv.cc (regnum_definition_p): New function.
> >   (insn_asm_p): Ditto.
> >   (riscv_vxrm_mode_after): New function for fixed-point.
> >   (global_vxrm_state_unknown_p): Ditto.
> >   (riscv_frm_mode_after): New function for floating-point.
> >   (global_frm_state_unknown_p): Ditto.
> >   (riscv_mode_after): Leverage new functions.
> >   (riscv_entity_mode_after): Removed.
> > ---
> >   gcc/config/riscv/riscv.cc | 96 +--
> >   1 file changed, 82 insertions(+), 14 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 38d8eb2fcf5..553fbb4435a 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -7742,19 +7742,91 @@ global_state_unknown_p (rtx_insn *insn, unsigned int regno)
> > return false;
> >   }
> >
> > +static bool
> > +regnum_definition_p (rtx_insn *insn, unsigned int regno)
> Needs a function comment.  This is true for each new function added.  In
> this specific case something like this might be appropriate:
>
> /* Return TRUE if REGNO is set in INSN, FALSE otherwise.  */
>
> Which begs the question, is there some reason why we're not using the
> existing reg_set_p or simple_regno_set from rtlanal.cc?
>
>
>
> Jeff


[PATCH] RISCV: Add -m(no)-omit-leaf-frame-pointer support.

2023-07-12 Thread yanzhang.wang--- via Gcc-patches
From: Yanzhang Wang 

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_save_reg_p): Save ra for leaf
  when enabling -mno-omit-leaf-frame-pointer.
(riscv_option_override): Override omit-frame-pointer.
(riscv_frame_pointer_required): Save s0 for non-leaf function.
(TARGET_FRAME_POINTER_REQUIRED): Override definition.
* config/riscv/riscv.opt: Add option support.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/omit-frame-pointer-1.c: New test.
* gcc.target/riscv/omit-frame-pointer-2.c: New test.
* gcc.target/riscv/omit-frame-pointer-3.c: New test.
* gcc.target/riscv/omit-frame-pointer-4.c: New test.
* gcc.target/riscv/omit-frame-pointer-test.c: New test.

Signed-off-by: Yanzhang Wang 
---
 gcc/config/riscv/riscv.cc | 34 ++-
 gcc/config/riscv/riscv.opt|  4 +++
 .../gcc.target/riscv/omit-frame-pointer-1.c   |  7 
 .../gcc.target/riscv/omit-frame-pointer-2.c   |  7 
 .../gcc.target/riscv/omit-frame-pointer-3.c   |  7 
 .../gcc.target/riscv/omit-frame-pointer-4.c   |  7 
 .../riscv/omit-frame-pointer-test.c   | 13 +++
 7 files changed, 78 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/omit-frame-pointer-test.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 706c18416db..caae6168c29 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -379,6 +379,10 @@ static const struct riscv_tune_info riscv_tune_info_table[] = {
 #include "riscv-cores.def"
 };
 
+/* Global variable to distinguish whether we should save and restore s0/fp for
+   function.  */
+static bool riscv_save_frame_pointer;
+
 void riscv_frame_info::reset(void)
 {
   total_size = 0;
@@ -4948,7 +4952,11 @@ riscv_save_reg_p (unsigned int regno)
   if (regno == HARD_FRAME_POINTER_REGNUM && frame_pointer_needed)
 return true;
 
-  if (regno == RETURN_ADDR_REGNUM && crtl->calls_eh_return)
+  /* There is no need to save ra for a leaf function when the frame pointer
+     is turned off by option, whatever the value of omit-leaf-frame-pointer.  */
+  bool keep_leaf_ra = frame_pointer_needed && crtl->is_leaf
+&& !TARGET_OMIT_LEAF_FRAME_POINTER;
+  if (regno == RETURN_ADDR_REGNUM && (crtl->calls_eh_return || keep_leaf_ra))
 return true;
 
   /* If this is an interrupt handler, then must save extra registers.  */
@@ -6577,6 +6585,21 @@ riscv_option_override (void)
   if (flag_pic)
 riscv_cmodel = CM_PIC;
 
+  /* With -fno-omit-frame-pointer and -momit-leaf-frame-pointer we need to
+     save fp and ra for non-leaf functions, and neither fp nor ra for leaf
+     functions.  x_flag_omit_frame_pointer has first priority in deciding
+     whether the frame pointer is needed.  If we do not override it, fp and
+     ra will be stored for leaf functions too, which is not what we want.  */
+  riscv_save_frame_pointer = false;
+  if (TARGET_OMIT_LEAF_FRAME_POINTER_P (global_options.x_target_flags))
+{
+  if (!global_options.x_flag_omit_frame_pointer)
+   riscv_save_frame_pointer = true;
+
+  global_options.x_flag_omit_frame_pointer = 1;
+}
+
   /* We get better code with explicit relocs for CM_MEDLOW, but
  worse code for the others (for now).  Pick the best default.  */
   if ((target_flags_explicit & MASK_EXPLICIT_RELOCS) == 0)
@@ -7857,6 +7880,12 @@ riscv_preferred_else_value (unsigned, tree, unsigned int nops, tree *ops)
   return nops == 3 ? ops[2] : ops[0];
 }
 
+static bool
+riscv_frame_pointer_required (void)
+{
+  return riscv_save_frame_pointer && !crtl->is_leaf;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -8161,6 +8190,9 @@ riscv_preferred_else_value (unsigned, tree, unsigned int nops, tree *ops)
 #undef TARGET_PREFERRED_ELSE_VALUE
 #define TARGET_PREFERRED_ELSE_VALUE riscv_preferred_else_value
 
+#undef TARGET_FRAME_POINTER_REQUIRED
+#define TARGET_FRAME_POINTER_REQUIRED riscv_frame_pointer_required
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-riscv.h"
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index dd062f1c8bd..8e6a94fd01a 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -138,6 +138,10 @@ Enable the CSR checking for the ISA-dependent CRS and the 
read-only CSR.
 The ISA-dependent CSR are only valid when the specific ISA is set.  The
 read-only CSR can not be written by the CSR instructions.
 
+momit-leaf-frame-pointer
+Target Mask (OMIT_LEAF_FRAME_POINTER) Save
+Omit the frame pointer in leaf functions.
+
 Mask(64BIT)
 
 Mask(MUL)
diff --git a/gcc/te
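
The interaction of the two options in the hunks above can be summarised with a small plain C model.  This is only a sketch of how I read the patch, not code from it: the struct and function names are hypothetical, and it assumes that frame_pointer_needed depends on nothing but the (overridden) omit-frame-pointer flag and the new target hook, which ignores the other conditions GCC takes into account.

```c
#include <stdbool.h>

struct frame_decision
{
  bool frame_pointer_needed; /* s0/fp is set up and saved.  */
  bool keep_leaf_ra;         /* ra is saved although the function is a leaf.  */
};

/* OMIT_FP models -fomit-frame-pointer, OMIT_LEAF_FP models
   -momit-leaf-frame-pointer, IS_LEAF models crtl->is_leaf.  */
struct frame_decision
model_frame_decision (bool omit_fp, bool omit_leaf_fp, bool is_leaf)
{
  struct frame_decision d;

  /* riscv_option_override: remember the user's request, then force
     omit-frame-pointer on so that leaf functions can drop the frame.  */
  bool save_frame_pointer = false;
  if (omit_leaf_fp)
    {
      if (!omit_fp)
        save_frame_pointer = true;
      omit_fp = true;
    }

  /* riscv_frame_pointer_required.  */
  bool required = save_frame_pointer && !is_leaf;

  /* Assumption of this sketch: nothing else forces a frame pointer.  */
  d.frame_pointer_needed = !omit_fp || required;

  /* keep_leaf_ra from riscv_save_reg_p.  */
  d.keep_leaf_ra = d.frame_pointer_needed && is_leaf && !omit_leaf_fp;
  return d;
}
```

Under this model, -fno-omit-frame-pointer with -momit-leaf-frame-pointer keeps the frame pointer in non-leaf functions only, while -fno-omit-frame-pointer with -mno-omit-leaf-frame-pointer keeps both fp and ra in leaf functions as well.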

[PATCH 1/4] Support Intel AVX-VNNI-INT16

2023-07-12 Thread Haochen Jiang via Gcc-patches
From: Kong Lingling 

gcc/ChangeLog

* common/config/i386/cpuinfo.h (get_available_features): Detect
avxvnniint16.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AVXVNNIINT16_SET): New.
(OPTION_MASK_ISA2_AVXVNNIINT16_UNSET): Ditto.
(ix86_handle_option): Handle -mavxvnniint16.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_AVXVNNIINT16.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
avxvnniint16.
* config.gcc: Add avxvnniint16intrin.h.
* config/i386/avxvnniint16intrin.h: New file.
* config/i386/cpuid.h (bit_AVXVNNIINT16): New.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__AVXVNNIINT16__.
* config/i386/i386-options.cc (isa2_opts): Add -mavxvnniint16.
(ix86_valid_target_attribute_inner_p): Handle avxvnniint16.
* config/i386/i386-isa.def: Add DEF_PTA(AVXVNNIINT16).
* config/i386/i386.opt: Add option -mavxvnniint16.
* config/i386/immintrin.h: Include avxvnniint16intrin.h.
* config/i386/sse.md
(vpdp_): New define_insn.
* doc/extend.texi: Document avxvnniint16.
* doc/invoke.texi: Document -mavxvnniint16.
* doc/sourcebuild.texi: Document target avxvnniint16.

gcc/testsuite/ChangeLog

* g++.dg/other/i386-2.C: Add -mavxvnniint16.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx-check.h: Add avxvnniint16 check.
* gcc.target/i386/sse-12.c: Add -mavxvnniint16.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* lib/target-supports.exp
(check_effective_target_avxvnniint16): New.
* gcc.target/i386/avxvnniint16-1.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwusd-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwusds-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwsud-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwsuds-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwuud-2.c: Ditto.
* gcc.target/i386/avxvnniint16-vpdpwuuds-2.c: Ditto.

Co-authored-by: Haochen Jiang 
---
 gcc/common/config/i386/cpuinfo.h  |   2 +
 gcc/common/config/i386/i386-common.cc |  22 ++-
 gcc/common/config/i386/i386-cpuinfo.h |   1 +
 gcc/common/config/i386/i386-isas.h|   2 +
 gcc/config.gcc|   2 +-
 gcc/config/i386/avxvnniint16intrin.h  | 138 ++
 gcc/config/i386/cpuid.h   |   1 +
 gcc/config/i386/i386-builtin.def  |  14 ++
 gcc/config/i386/i386-c.cc |   2 +
 gcc/config/i386/i386-isa.def  |   1 +
 gcc/config/i386/i386-options.cc   |   4 +-
 gcc/config/i386/i386.opt  |   5 +
 gcc/config/i386/immintrin.h   |   2 +
 gcc/config/i386/sse.md|  32 
 gcc/doc/extend.texi   |   5 +
 gcc/doc/invoke.texi   |  10 +-
 gcc/doc/sourcebuild.texi  |   3 +
 gcc/testsuite/g++.dg/other/i386-2.C   |   2 +-
 gcc/testsuite/g++.dg/other/i386-3.C   |   2 +-
 gcc/testsuite/gcc.target/i386/avx-check.h |   3 +
 .../gcc.target/i386/avxvnniint16-1.c  |  43 ++
 .../gcc.target/i386/avxvnniint16-vpdpwsud-2.c |  71 +
 .../i386/avxvnniint16-vpdpwsuds-2.c   |  72 +
 .../gcc.target/i386/avxvnniint16-vpdpwusd-2.c |  71 +
 .../i386/avxvnniint16-vpdpwusds-2.c   |  72 +
 .../gcc.target/i386/avxvnniint16-vpdpwuud-2.c |  71 +
 .../i386/avxvnniint16-vpdpwuuds-2.c   |  71 +
 gcc/testsuite/gcc.target/i386/funcspec-56.inc |   2 +
 gcc/testsuite/gcc.target/i386/sse-12.c|   2 +-
 gcc/testsuite/gcc.target/i386/sse-13.c|   2 +-
 gcc/testsuite/gcc.target/i386/sse-14.c|   2 +-
 gcc/testsuite/gcc.target/i386/sse-22.c|   4 +-
 gcc/testsuite/gcc.target/i386/sse-23.c|   2 +-
 gcc/testsuite/lib/target-supports.exp |  12 ++
 34 files changed, 735 insertions(+), 15 deletions(-)
 create mode 100644 gcc/config/i386/avxvnniint16intrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/avxvnniint16-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avxvnniint16-vpdpwsud-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avxvnniint16-vpdpwsuds-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avxvnniint16-vpdpwusd-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avxvnniint16-vpdpwusds-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avxvnniint16-vpdpwuud-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avxvnniint16-vpdpwuuds-2.c
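
As background for the instruction names in the test list, here is a scalar plain C sketch of what one 32-bit lane of the unsigned-by-unsigned variants computes, based on my reading of Intel's description of AVX-VNNI-INT16: each dword lane holds two 16-bit words, the word pairs are multiplied, and both products are added to the accumulator.  The function names are made up for illustration and are not the intrinsics added by the patch.

```c
#include <stdint.h>

/* One dword lane of a vpdpwuud-style operation: unsigned 16-bit words
   of A times unsigned 16-bit words of B, both products accumulated
   into ACC with wrap-around modulo 2^32.  */
uint32_t
vpdpwuud_lane_model (uint32_t acc, uint32_t a, uint32_t b)
{
  uint32_t a_lo = a & 0xffff, a_hi = a >> 16;
  uint32_t b_lo = b & 0xffff, b_hi = b >> 16;
  return acc + a_lo * b_lo + a_hi * b_hi;
}

/* The saturating form (the trailing "s" in vpdpwuuds): the exact sum
   is clamped to the largest unsigned 32-bit value.  */
uint32_t
vpdpwuuds_lane_model (uint32_t acc, uint32_t a, uint32_t b)
{
  uint64_t a_lo = a & 0xffff, a_hi = a >> 16;
  uint64_t b_lo = b & 0xffff, b_hi = b >> 16;
  uint64_t sum = (uint64_t) acc + a_lo * b_lo + a_hi * b_hi;
  return sum > UINT32_MAX ? UINT32_MAX : (uint32_t) sum;
}
```

The "su" and "us" variants differ only in which source's words are treated as signed, so they are left out of the sketch.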

[PATCH 3/4] Support Intel SHA512

2023-07-12 Thread Haochen Jiang via Gcc-patches
gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features):
Detect SHA512.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_SHA512_SET,
OPTION_MASK_ISA2_SHA512_UNSET): New.
(OPTION_MASK_ISA2_AVX_UNSET): Add SHA512.
(ix86_handle_option): Handle -msha512.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_SHA512.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
sha512.
* config.gcc: Add sha512intrin.h.
* config/i386/cpuid.h (bit_SHA512): New.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (V4DI, V4DI, V4DI, V2DI).
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__SHA512__.
* config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle
V4DI_FTYPE_V4DI_V4DI_V2DI and V4DI_FTYPE_V4DI_V2DI.
* config/i386/i386-isa.def (SHA512): Add DEF_PTA(SHA512).
* config/i386/i386-options.cc (isa2_opts): Add -msha512.
(ix86_valid_target_attribute_inner_p): Handle sha512.
* config/i386/i386.opt: Add option -msha512.
* config/i386/immintrin.h: Include sha512intrin.h.
* config/i386/sse.md (vsha512msg1): New define insn.
(vsha512msg2): Ditto.
(vsha512rnds2): Ditto.
* doc/extend.texi: Document sha512.
* doc/invoke.texi: Document -msha512.
* doc/sourcebuild.texi: Document target sha512.
* config/i386/sha512intrin.h: New file.

gcc/testsuite/ChangeLog:

* g++.dg/other/i386-2.C: Add -msha512.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-12.c: Add -msha512.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Add sha512.
* gcc.target/i386/sse-23.c: Ditto.
* lib/target-supports.exp (check_effective_target_sha512): New.
* gcc.target/i386/sha512-1.c: New test.
* gcc.target/i386/sha512-check.h: Ditto.
* gcc.target/i386/sha512msg1-2.c: Ditto.
* gcc.target/i386/sha512msg2-2.c: Ditto.
* gcc.target/i386/sha512rnds2-2.c: Ditto.
---
 gcc/common/config/i386/cpuinfo.h  |  2 +
 gcc/common/config/i386/i386-common.cc | 19 -
 gcc/common/config/i386/i386-cpuinfo.h |  1 +
 gcc/common/config/i386/i386-isas.h|  1 +
 gcc/config.gcc|  2 +-
 gcc/config/i386/cpuid.h   |  1 +
 gcc/config/i386/i386-builtin-types.def|  3 +
 gcc/config/i386/i386-builtin.def  |  5 ++
 gcc/config/i386/i386-c.cc |  2 +
 gcc/config/i386/i386-expand.cc|  2 +
 gcc/config/i386/i386-isa.def  |  1 +
 gcc/config/i386/i386-options.cc   |  4 +-
 gcc/config/i386/i386.opt  | 10 +++
 gcc/config/i386/immintrin.h   |  2 +
 gcc/config/i386/sha512intrin.h| 64 ++
 gcc/config/i386/sse.md| 40 +
 gcc/doc/extend.texi   |  5 ++
 gcc/doc/invoke.texi   | 10 ++-
 gcc/doc/sourcebuild.texi  |  3 +
 gcc/testsuite/g++.dg/other/i386-2.C   |  2 +-
 gcc/testsuite/g++.dg/other/i386-3.C   |  2 +-
 gcc/testsuite/gcc.target/i386/funcspec-56.inc |  2 +
 gcc/testsuite/gcc.target/i386/sha512-1.c  | 18 
 gcc/testsuite/gcc.target/i386/sha512-check.h  | 43 ++
 gcc/testsuite/gcc.target/i386/sha512msg1-2.c  | 48 +++
 gcc/testsuite/gcc.target/i386/sha512msg2-2.c  | 47 ++
 gcc/testsuite/gcc.target/i386/sha512rnds2-2.c | 85 +++
 gcc/testsuite/gcc.target/i386/sse-12.c|  2 +-
 gcc/testsuite/gcc.target/i386/sse-13.c|  2 +-
 gcc/testsuite/gcc.target/i386/sse-14.c|  2 +-
 gcc/testsuite/gcc.target/i386/sse-22.c|  4 +-
 gcc/testsuite/gcc.target/i386/sse-23.c|  2 +-
 gcc/testsuite/lib/target-supports.exp | 14 +++
 33 files changed, 436 insertions(+), 14 deletions(-)
 create mode 100644 gcc/config/i386/sha512intrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/sha512-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sha512-check.h
 create mode 100644 gcc/testsuite/gcc.target/i386/sha512msg1-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sha512msg2-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sha512rnds2-2.c

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index e5cdffe017a..0cfde3ebccd 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -879,6 +879,8 @@ get_available_features (struct __processor_model *cpu_model,
set_feature (FEATURE_AVXVNNIINT16);
  if (eax & bit_SM3)
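
The detection pattern in this hunk (test a bit of a CPUID output register, then set the feature) can be sketched in plain C as below.  The bit positions are an assumption taken from my reading of Intel's documentation for CPUID leaf 7, sub-leaf 1, and should be checked against cpuid.h; the MODEL_ macros, the struct and the function are hypothetical names, and the register values are passed in rather than read from the CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in CPUID leaf 7, sub-leaf 1.  */
#define MODEL_BIT_SHA512       (UINT32_C (1) << 0)   /* EAX */
#define MODEL_BIT_SM3          (UINT32_C (1) << 1)   /* EAX */
#define MODEL_BIT_SM4          (UINT32_C (1) << 2)   /* EAX */
#define MODEL_BIT_AVXVNNIINT16 (UINT32_C (1) << 10)  /* EDX */

struct model_features
{
  bool sha512;
  bool sm3;
  bool sm4;
  bool avxvnniint16;
};

/* Decode the four features of this patch series from the EAX and EDX
   values that the CPUID instruction returned for that sub-leaf.  */
struct model_features
model_decode_leaf7_1 (uint32_t eax, uint32_t edx)
{
  struct model_features f;
  f.sha512 = (eax & MODEL_BIT_SHA512) != 0;
  f.sm3 = (eax & MODEL_BIT_SM3) != 0;
  f.sm4 = (eax & MODEL_BIT_SM4) != 0;
  f.avxvnniint16 = (edx & MODEL_BIT_AVXVNNIINT16) != 0;
  return f;
}
```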
  

[PATCH 2/4] Support Intel SM3

2023-07-12 Thread Haochen Jiang via Gcc-patches
gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features):
Detect SM3.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_SM3_SET,
OPTION_MASK_ISA2_SM3_UNSET): New.
(OPTION_MASK_ISA2_AVX_UNSET): Add SM3.
(ix86_handle_option): Handle -msm3.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_SM3.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
SM3.
* config.gcc: Add sm3intrin.h.
* config/i386/cpuid.h (bit_SM3): New.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (V4SI, V4SI, V4SI, V4SI, INT).
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__SM3__.
* config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle
V4SI_FTYPE_V4SI_V4SI_V4SI_INT.
* config/i386/i386-isa.def (SM3): Add DEF_PTA(SM3).
* config/i386/i386-options.cc (isa2_opts): Add -msm3.
(ix86_valid_target_attribute_inner_p): Handle sm3.
* config/i386/i386.opt: Add option -msm3.
* config/i386/immintrin.h: Include sm3intrin.h.
* config/i386/sse.md (vsm3msg1): New define insn.
(vsm3msg2): Ditto.
(vsm3rnds2): Ditto.
* doc/extend.texi: Document sm3.
* doc/invoke.texi: Document -msm3.
* doc/sourcebuild.texi: Document target sm3.
* config/i386/sm3intrin.h: New file.

gcc/testsuite/ChangeLog:

* g++.dg/other/i386-2.C: Add -msm3.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/avx-1.c: Add new define for immediate.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-12.c: Add -msm3.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Add sm3.
* gcc.target/i386/sse-23.c: Ditto.
* lib/target-supports.exp (check_effective_target_sm3): New.
* gcc.target/i386/sm3-1.c: New test.
* gcc.target/i386/sm3-check.h: Ditto.
* gcc.target/i386/sm3msg1-2.c: Ditto.
* gcc.target/i386/sm3msg2-2.c: Ditto.
* gcc.target/i386/sm3rnds2-2.c: Ditto.
---
 gcc/common/config/i386/cpuinfo.h  |   2 +
 gcc/common/config/i386/i386-common.cc |  20 +++-
 gcc/common/config/i386/i386-cpuinfo.h |   1 +
 gcc/common/config/i386/i386-isas.h|   1 +
 gcc/config.gcc|   3 +-
 gcc/config/i386/cpuid.h   |   1 +
 gcc/config/i386/i386-builtin-types.def|   3 +
 gcc/config/i386/i386-builtin.def  |   5 +
 gcc/config/i386/i386-c.cc |   2 +
 gcc/config/i386/i386-expand.cc|   1 +
 gcc/config/i386/i386-isa.def  |   1 +
 gcc/config/i386/i386-options.cc   |   2 +
 gcc/config/i386/i386.opt  |   5 +
 gcc/config/i386/immintrin.h   |   2 +
 gcc/config/i386/sm3intrin.h   |  72 
 gcc/config/i386/sse.md|  43 
 gcc/doc/extend.texi   |   5 +
 gcc/doc/invoke.texi   |   7 +-
 gcc/doc/sourcebuild.texi  |   3 +
 gcc/testsuite/g++.dg/other/i386-2.C   |   2 +-
 gcc/testsuite/g++.dg/other/i386-3.C   |   2 +-
 gcc/testsuite/gcc.target/i386/avx-1.c |   3 +
 gcc/testsuite/gcc.target/i386/funcspec-56.inc |   2 +
 gcc/testsuite/gcc.target/i386/sm3-1.c |  17 +++
 gcc/testsuite/gcc.target/i386/sm3-check.h |  37 +++
 gcc/testsuite/gcc.target/i386/sm3msg1-2.c |  54 +
 gcc/testsuite/gcc.target/i386/sm3msg2-2.c |  57 ++
 gcc/testsuite/gcc.target/i386/sm3rnds2-2.c| 104 ++
 gcc/testsuite/gcc.target/i386/sse-12.c|   2 +-
 gcc/testsuite/gcc.target/i386/sse-13.c|   5 +-
 gcc/testsuite/gcc.target/i386/sse-14.c|   5 +-
 gcc/testsuite/gcc.target/i386/sse-22.c|   7 +-
 gcc/testsuite/gcc.target/i386/sse-23.c|   5 +-
 gcc/testsuite/lib/target-supports.exp |  15 +++
 34 files changed, 484 insertions(+), 12 deletions(-)
 create mode 100644 gcc/config/i386/sm3intrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/sm3-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sm3-check.h
 create mode 100644 gcc/testsuite/gcc.target/i386/sm3msg1-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sm3msg2-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sm3rnds2-2.c

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 3599f9def2c..e5cdffe017a 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -877,6 +877,8 @@ get_available_features (struct __processor_model *cpu_model,
set_feature (FEATURE_AVXNECONVERT);
  if (edx & bit_AVXV

[PATCH 4/4] Support Intel SM4

2023-07-12 Thread Haochen Jiang via Gcc-patches
gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_available_features):
Detect SM4.
* common/config/i386/i386-common.cc (OPTION_MASK_ISA2_SM4_SET,
OPTION_MASK_ISA2_SM4_UNSET): New.
(OPTION_MASK_ISA2_AVX_UNSET): Add SM4.
(ix86_handle_option): Handle -msm4.
* common/config/i386/i386-cpuinfo.h (enum processor_features):
Add FEATURE_SM4.
* common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for
sm4.
* config.gcc: Add sm4intrin.h.
* config/i386/cpuid.h (bit_SM4): New.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-c.cc (ix86_target_macros_internal): Define
__SM4__.
* config/i386/i386-isa.def (SM4): Add DEF_PTA(SM4).
* config/i386/i386-options.cc (isa2_opts): Add -msm4.
(ix86_valid_target_attribute_inner_p): Handle sm4.
* config/i386/i386.opt: Add option -msm4.
* config/i386/immintrin.h: Include sm4intrin.h
* config/i386/sse.md (vsm4key4_): New define insn.
(vsm4rnds4_): Ditto.
* doc/extend.texi: Document sm4.
* doc/invoke.texi: Document -msm4.
* doc/sourcebuild.texi: Document target sm4.
* config/i386/sm4intrin.h: New file.

gcc/testsuite/ChangeLog:

* g++.dg/other/i386-2.C: Add -msm4.
* g++.dg/other/i386-3.C: Ditto.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* gcc.target/i386/sse-12.c: Add -msm4.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Add sm4.
* gcc.target/i386/sse-23.c: Ditto.
* lib/target-supports.exp (check_effective_target_sm4): New.
* gcc.target/i386/sm4-1.c: New test.
* gcc.target/i386/sm4-check.h: Ditto.
* gcc.target/i386/sm4key4-2.c: Ditto.
* gcc.target/i386/sm4rnds4-2.c: Ditto.
---
 gcc/common/config/i386/cpuinfo.h  |   2 +
 gcc/common/config/i386/i386-common.cc |  20 +-
 gcc/common/config/i386/i386-cpuinfo.h |   1 +
 gcc/common/config/i386/i386-isas.h|   1 +
 gcc/config.gcc|   2 +-
 gcc/config/i386/cpuid.h   |   1 +
 gcc/config/i386/i386-builtin.def  |   6 +
 gcc/config/i386/i386-c.cc |   2 +
 gcc/config/i386/i386-isa.def  |   1 +
 gcc/config/i386/i386-options.cc   |   4 +-
 gcc/config/i386/i386.opt  |   5 +
 gcc/config/i386/immintrin.h   |   2 +
 gcc/config/i386/sm4intrin.h   |  70 +++
 gcc/config/i386/sse.md|  26 +++
 gcc/doc/extend.texi   |   5 +
 gcc/doc/invoke.texi   |   9 +-
 gcc/doc/sourcebuild.texi  |   3 +
 gcc/testsuite/g++.dg/other/i386-2.C   |   2 +-
 gcc/testsuite/g++.dg/other/i386-3.C   |   2 +-
 gcc/testsuite/gcc.target/i386/funcspec-56.inc |   2 +
 gcc/testsuite/gcc.target/i386/sm4-1.c |  20 ++
 gcc/testsuite/gcc.target/i386/sm4-check.h | 183 ++
 gcc/testsuite/gcc.target/i386/sm4key4-2.c |  14 ++
 gcc/testsuite/gcc.target/i386/sm4rnds4-2.c|  14 ++
 gcc/testsuite/gcc.target/i386/sse-12.c|   2 +-
 gcc/testsuite/gcc.target/i386/sse-13.c|   2 +-
 gcc/testsuite/gcc.target/i386/sse-14.c|   2 +-
 gcc/testsuite/gcc.target/i386/sse-22.c|   4 +-
 gcc/testsuite/gcc.target/i386/sse-23.c|   2 +-
 gcc/testsuite/lib/target-supports.exp |  14 ++
 30 files changed, 409 insertions(+), 14 deletions(-)
 create mode 100644 gcc/config/i386/sm4intrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/sm4-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sm4-check.h
 create mode 100644 gcc/testsuite/gcc.target/i386/sm4key4-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sm4rnds4-2.c

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 0cfde3ebccd..f9434f038ea 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -881,6 +881,8 @@ get_available_features (struct __processor_model *cpu_model,
set_feature (FEATURE_SM3);
  if (eax & bit_SHA512)
set_feature (FEATURE_SHA512);
+ if (eax & bit_SM4)
+   set_feature (FEATURE_SM4);
}
   if (avx512_usable)
{
diff --git a/gcc/common/config/i386/i386-common.cc 
b/gcc/common/config/i386/i386-common.cc
index 97c3cdfe5e1..610cabe52c1 100644
--- a/gcc/common/config/i386/i386-common.cc
+++ b/gcc/common/config/i386/i386-common.cc
@@ -122,6 +122,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA2_AVXVNNIINT16_SET OPTION_MASK_ISA2_AVXVNNIINT16
 #define OPTION_MASK_ISA2_SM3_SET OPTION_MASK_ISA2_SM3
 #define OPTION_MASK_ISA2_SHA512_SET OPTION_MASK_ISA2_SHA512
+#define OPTION_MASK_ISA2_SM4_SET OPTION_

[PATCH 0/4] Support Intel Arrow Lake/Lunar Lake ISAs

2023-07-12 Thread Haochen Jiang via Gcc-patches
Hi all,

These four patches aim to add support for the Intel Arrow Lake/Lunar Lake
instructions, including AVX-VNNI-INT16, SM3, SHA512 and SM4.

The information is based on the newly released
Intel Architecture Instruction Set Extensions and Future Features.

The document is available at:
https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Regtested on x86_64-pc-linux-gnu. Ok for trunk?

BRs,
Haochen




RE: [PATCH v2] RISC-V: Add more tests for RVV floating-point FRM.

2023-07-12 Thread Li, Pan2 via Gcc-patches
Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Thursday, July 13, 2023 1:54 PM
To: Li, Pan2 
Cc: GCC Patches; 钟居哲; Robin Dapp; Jeff Law; Wang, Yanzhang
Subject: Re: [PATCH v2] RISC-V: Add more tests for RVV floating-point FRM.

LGTM

<pan2...@intel.com> wrote on Thu, Jul 13, 2023 at 13:10:
From: Pan Li <pan2...@intel.com>

Add more test cases, including both the asm check and run tests, for RVV FRM.

Signed-off-by: Pan Li <pan2...@intel.com>

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-frm-insert-10.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-insert-7.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-insert-8.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-insert-9.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-run-1.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-run-2.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-run-3.c: New test.
---
 .../rvv/base/float-point-frm-insert-10.c  | 23 ++
 .../riscv/rvv/base/float-point-frm-insert-7.c | 29 +++
 .../riscv/rvv/base/float-point-frm-insert-8.c | 27 +++
 .../riscv/rvv/base/float-point-frm-insert-9.c | 24 ++
 .../riscv/rvv/base/float-point-frm-run-1.c| 79 +++
 .../riscv/rvv/base/float-point-frm-run-2.c| 70 
 .../riscv/rvv/base/float-point-frm-run-3.c| 71 +
 7 files changed, 323 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-9.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-3.c

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
new file mode 100644
index 000..c46910b878c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+void
+test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl)
+{
+  asm volatile (
+"addi %0, %0, 0x12"
+:"+r"(vl)
+:
+:
+  );
+
+  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
+  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
+  *(vfloat32m1_t *)out = result;
+}
+
+/* { dg-final { scan-assembler-times 
{vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
new file mode 100644
index 000..7b1602fd509
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+normalize_vl (size_t vl)
+{
+  if (vl % 4 == 0)
+return vl;
+
+  return ((vl / 4) + 1) * 4;
+}
+
+void
+test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl)
+{
+  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
+
+  vl = normalize_vl (vl);
+
+  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
+
+  *(vfloat32m1_t *)out = result;
+}
+
+/* { dg-final { scan-assembler-times 
{vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
new file mode 100644
index 000..37481ddac38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+normalize_vl (size_t vl)
+{
+  if (vl % 4 == 0)
+return vl;
+
+  return ((vl / 4) + 1) * 4;
+}
+
+void
+test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl)
+{
+  vl = normalize_vl (vl);
+
+  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
+
+  *(vfloat32m1_t *)out = result;
+}
+
+/* { dg-fi

Re: [PATCH v2] RISC-V: Add more tests for RVV floating-point FRM.

2023-07-12 Thread Kito Cheng via Gcc-patches
LGTM

 wrote on Thu, Jul 13, 2023 at 13:10:

> From: Pan Li 
>
> Add more test cases, including both the asm check and run tests, for RVV FRM.
>
> Signed-off-by: Pan Li 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-frm-insert-10.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-insert-7.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-insert-8.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-insert-9.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-run-1.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-run-2.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-run-3.c: New test.
> ---
>  .../rvv/base/float-point-frm-insert-10.c  | 23 ++
>  .../riscv/rvv/base/float-point-frm-insert-7.c | 29 +++
>  .../riscv/rvv/base/float-point-frm-insert-8.c | 27 +++
>  .../riscv/rvv/base/float-point-frm-insert-9.c | 24 ++
>  .../riscv/rvv/base/float-point-frm-run-1.c| 79 +++
>  .../riscv/rvv/base/float-point-frm-run-2.c| 70 
>  .../riscv/rvv/base/float-point-frm-run-3.c| 71 +
>  7 files changed, 323 insertions(+)
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-9.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-1.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-2.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-3.c
>
> diff --git
> a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
> new file mode 100644
> index 000..c46910b878c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +void
> +test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t
> op2,
> +size_t vl)
> +{
> +  asm volatile (
> +"addi %0, %0, 0x12"
> +:"+r"(vl)
> +:
> +:
> +  );
> +
> +  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
> +  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
> +  *(vfloat32m1_t *)out = result;
> +}
> +
> +/* { dg-final { scan-assembler-times
> {vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2
> } } */
> diff --git
> a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
> new file mode 100644
> index 000..7b1602fd509
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +size_t __attribute__ ((noinline))
> +normalize_vl (size_t vl)
> +{
> +  if (vl % 4 == 0)
> +return vl;
> +
> +  return ((vl / 4) + 1) * 4;
> +}
> +
> +void
> +test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t
> op2,
> +size_t vl)
> +{
> +  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
> +
> +  vl = normalize_vl (vl);
> +
> +  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
> +
> +  *(vfloat32m1_t *)out = result;
> +}
> +
> +/* { dg-final { scan-assembler-times
> {vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2
> } } */
> diff --git
> a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
> new file mode 100644
> index 000..37481ddac38
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +size_t __attribute__ ((noinline))
> +normalize_vl (size_t vl)
> +{
> +  if (vl % 4 == 0)
> +return vl;
> +
> +  return ((vl / 4) + 1) * 4;
> +}
> +
> +void
> +test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t
> op2,
> +size_t vl)
> +{
> +  vl = normalize_vl (vl);
> +
> +  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
> +
> +  *(vfloat32m1_t *)out = result;
> +}
> +
> +/* { dg-final { scan-assembler-times
> {vfadd\.v[vf

[PATCH 2/2] RISC-V: Implement locality for __builtin_prefetch

2023-07-12 Thread Monk Chiang via Gcc-patches
gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_print_operand):
  Add 'N' to print a non-temporal locality hint instruction.
* config/riscv/riscv.md (prefetch):
  Add NTLH instruction for prefetch.r and prefetch.w.
gcc/testsuite/ChangeLog:

* gcc.target/riscv/prefetch-zihintntl.c: New test.
---
 gcc/config/riscv/riscv.cc | 22 +++
 gcc/config/riscv/riscv.md | 10 ++---
 .../gcc.target/riscv/prefetch-zihintntl.c | 20 +
 3 files changed, 49 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/prefetch-zihintntl.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 706c18416db..42f80088bab 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4532,6 +4532,7 @@ riscv_memmodel_needs_amo_release (enum memmodel model)
'A' Print the atomic operation suffix for memory model OP.
'I' Print the LR suffix for memory model OP.
'J' Print the SC suffix for memory model OP.
+   'N' Print a non-temporal locality hints instruction.
'z' Print x0 if OP is zero, otherwise print OP normally.
'i' Print i if the operand is not a register.
'S' Print shift-index of single-bit mask OP.
@@ -4718,6 +4719,27 @@ riscv_print_operand (FILE *file, rtx op, int letter)
   break;
 }
 
+case 'N':
+  {
+   const char *ntl_hint = NULL;
+   switch (INTVAL (op))
+ {
+ case 0:
+   ntl_hint = "ntl.all";
+   break;
+ case 1:
+   ntl_hint = "ntl.pall";
+   break;
+ case 2:
+   ntl_hint = "ntl.p1";
+   break;
+ }
+
+  if (ntl_hint)
+   asm_fprintf (file, "%s\n\t", ntl_hint);
+  break;
+  }
+
 case 'i':
   if (code != REG)
 fputs ("i", file);
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 7988026d129..3357c981b5d 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3256,11 +3256,15 @@
 {
   switch (INTVAL (operands[1]))
   {
-case 0: return "prefetch.r\t%a0";
-case 1: return "prefetch.w\t%a0";
+case 0: return TARGET_ZIHINTNTL ? "%N2prefetch.r\t%a0" : "prefetch.r\t%a0";
+case 1: return TARGET_ZIHINTNTL ? "%N2prefetch.w\t%a0" : "prefetch.w\t%a0";
 default: gcc_unreachable ();
   }
-})
+}
+  [(set (attr "length") (if_then_else (and (match_test "TARGET_ZIHINTNTL")
+  (match_test "INTVAL (operands[2]) != 
3"))
+ (const_string "8")
+ (const_string "4")))])
 
 (define_insn "riscv_prefetchi_"
   [(unspec_volatile:X [(match_operand:X 0 "address_operand" "r")
diff --git a/gcc/testsuite/gcc.target/riscv/prefetch-zihintntl.c 
b/gcc/testsuite/gcc.target/riscv/prefetch-zihintntl.c
new file mode 100644
index 000..78a3afe6833
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/prefetch-zihintntl.c
@@ -0,0 +1,20 @@
+/* { dg-do compile target { { rv64-*-*}}} */
+/* { dg-options "-march=rv64gc_zicbop_zihintntl -mabi=lp64" } */
+
+void foo (char *p)
+{
+  __builtin_prefetch (p, 0, 0);
+  __builtin_prefetch (p, 0, 1);
+  __builtin_prefetch (p, 0, 2);
+  __builtin_prefetch (p, 0, 3);
+  __builtin_prefetch (p, 1, 0);
+  __builtin_prefetch (p, 1, 1);
+  __builtin_prefetch (p, 1, 2);
+  __builtin_prefetch (p, 1, 3);
+}
+
+/* { dg-final { scan-assembler-times "ntl.all" 2 } } */
+/* { dg-final { scan-assembler-times "ntl.pall" 2 } } */
+/* { dg-final { scan-assembler-times "ntl.p1" 2 } } */
+/* { dg-final { scan-assembler-times "prefetch.r" 4 } } */
+/* { dg-final { scan-assembler-times "prefetch.w" 4 } } */
-- 
2.40.1
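
For readers skimming the diff, the locality handling above can be sketched
as plain C. This is only an illustration of the mapping in the 'N' operand
case and of the "length" attribute; the helper names ntl_hint_for_locality
and prefetch_length are made up for this note and do not exist in GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the 'N' operand case in the patch:
   locality 0..2 selects an NTL hint, locality 3 selects none.  */
static const char *
ntl_hint_for_locality (int locality)
{
  assert (locality >= 0 && locality <= 3);
  switch (locality)
    {
    case 0:
      return "ntl.all";
    case 1:
      return "ntl.pall";
    case 2:
      return "ntl.p1";
    default:
      return NULL;
    }
}

/* Hypothetical helper mirroring the "length" attribute: the hint adds
   4 bytes only when zihintntl is enabled and locality is not 3.  */
static int
prefetch_length (int has_zihintntl, int locality)
{
  return (has_zihintntl && locality != 3) ? 8 : 4;
}
```

Under these assumptions, __builtin_prefetch (p, 0, 3) stays a single
4-byte prefetch.r, while localities 0 to 2 get a hint instruction in front
when zihintntl is enabled.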



[PATCH 1/2] RISC-V: Recognized zihintntl extensions

2023-07-12 Thread Monk Chiang via Gcc-patches
gcc/ChangeLog:

* common/config/riscv/riscv-common.cc:
(riscv_implied_info): Add zihintntl item.
(riscv_ext_version_table): Ditto.
(riscv_ext_flag_table): Ditto.
* config/riscv/riscv-opts.h (MASK_ZIHINTNTL): New macro.
(TARGET_ZIHINTNTL): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-22.c: New test.
* gcc.target/riscv/predef-28.c: New test.
---
 gcc/common/config/riscv/riscv-common.cc|  4 ++
 gcc/config/riscv/riscv-opts.h  |  2 +
 gcc/testsuite/gcc.target/riscv/arch-22.c   |  5 +++
 gcc/testsuite/gcc.target/riscv/predef-28.c | 47 ++
 4 files changed, 58 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-22.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/predef-28.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 6091d8f281b..28c8f0c1489 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -206,6 +206,8 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"zksh",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"zkt",   ISA_SPEC_CLASS_NONE, 1, 0},
 
+  {"zihintntl", ISA_SPEC_CLASS_NONE, 1, 0},
+
   {"zicboz",ISA_SPEC_CLASS_NONE, 1, 0},
   {"zicbom",ISA_SPEC_CLASS_NONE, 1, 0},
   {"zicbop",ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1267,6 +1269,8 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   {"zksh",   &gcc_options::x_riscv_zk_subext, MASK_ZKSH},
   {"zkt",&gcc_options::x_riscv_zk_subext, MASK_ZKT},
 
+  {"zihintntl", &gcc_options::x_riscv_zi_subext, MASK_ZIHINTNTL},
+
   {"zicboz", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOZ},
   {"zicbom", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOM},
   {"zicbop", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOP},
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index cfcf608ea62..beee241aa1b 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -101,9 +101,11 @@ enum riscv_entity
 
 #define MASK_ZICSR(1 << 0)
 #define MASK_ZIFENCEI (1 << 1)
+#define MASK_ZIHINTNTL (1 << 2)
 
 #define TARGET_ZICSR((riscv_zi_subext & MASK_ZICSR) != 0)
 #define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)
+#define TARGET_ZIHINTNTL ((riscv_zi_subext & MASK_ZIHINTNTL) != 0)
 
 #define MASK_ZAWRS   (1 << 0)
 #define TARGET_ZAWRS ((riscv_za_subext & MASK_ZAWRS) != 0)
diff --git a/gcc/testsuite/gcc.target/riscv/arch-22.c 
b/gcc/testsuite/gcc.target/riscv/arch-22.c
new file mode 100644
index 000..cdc18e13d0f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/arch-22.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zihintntl -mabi=lp64 -mcmodel=medlow" } */
+int foo()
+{
+}
diff --git a/gcc/testsuite/gcc.target/riscv/predef-28.c 
b/gcc/testsuite/gcc.target/riscv/predef-28.c
new file mode 100644
index 000..81fdad571e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/predef-28.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zihintntl -mabi=lp64 -mcmodel=medlow" } */
+
+int main () {
+
+#ifndef __riscv_arch_test
+#error "__riscv_arch_test"
+#endif
+
+#if __riscv_xlen != 64
+#error "__riscv_xlen"
+#endif
+
+#if !defined(__riscv_i)
+#error "__riscv_i"
+#endif
+
+#if !defined(__riscv_c)
+#error "__riscv_c"
+#endif
+
+#if defined(__riscv_e)
+#error "__riscv_e"
+#endif
+
+#if !defined(__riscv_a)
+#error "__riscv_a"
+#endif
+
+#if !defined(__riscv_m)
+#error "__riscv_m"
+#endif
+
+#if !defined(__riscv_f)
+#error "__riscv_f"
+#endif
+
+#if !defined(__riscv_d)
+#error "__riscv_d"
+#endif
+
+#if !defined(__riscv_zihintntl)
+#error "__riscv_zihintntl"
+#endif
+
+  return 0;
+}
-- 
2.40.1
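
As a side note, the MASK_/TARGET_ pair added in riscv-opts.h follows the
usual bit-flag pattern. A minimal standalone sketch, assuming a plain int
named zi_subext as a stand-in for GCC's riscv_zi_subext option word and a
made-up helper enable_zi_ext:

```c
#include <assert.h>

/* Bit positions copied from the riscv-opts.h hunk in this patch.  */
#define MASK_ZICSR     (1 << 0)
#define MASK_ZIFENCEI  (1 << 1)
#define MASK_ZIHINTNTL (1 << 2)

/* Stand-in for GCC's riscv_zi_subext (assumption: a plain int here).  */
static int zi_subext;

#define TARGET_ZIHINTNTL ((zi_subext & MASK_ZIHINTNTL) != 0)

/* Hypothetical helper: record that an extension bit was requested.  */
static void
enable_zi_ext (int mask)
{
  zi_subext |= mask;
}
```

Each extension owns one bit, so enabling zicsr alone leaves
TARGET_ZIHINTNTL false, and the new bit does not disturb the existing two.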



[PATCH] SSA MATH: Support COND_LEN_FMA for floating-point math optimization

2023-07-12 Thread juzhe . zhong
From: Ju-Zhe Zhong 

Hi, Richard and Richi.

The previous patch added support for COND_LEN_* binary operations, but not
for COND_LEN_* ternary operations.

This patch adds support for COND_LEN_* ternary operations. Consider the
following case:

#define TEST_TYPE(TYPE)                                                       \
  __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst,          \
                                              TYPE *__restrict a,            \
                                              TYPE *__restrict b,            \
                                              TYPE *__restrict c, int n)     \
  {                                                                           \
    for (int i = 0; i < n; i++)                                               \
      dst[i] += a[i] * b[i];                                                  \
  }

#define TEST_ALL() TEST_TYPE (double)

TEST_ALL ()

Before this patch:
...
COND_LEN_MUL
COND_LEN_ADD

After this patch:
...
COND_LEN_FMA
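
For reference, a scalar model of what a COND_LEN_FMA computes may help
here. This is a sketch based on the comment this patch adds to
internal-fn.cc (element i is active when i < LEN + BIAS and COND[i] is
set), not GCC code; the function name cond_len_fma_model and its parameter
names are made up for illustration. In the loop above, c would simply be
dst itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Scalar reference model (assumption, derived from the patch comment):
   lhs[i] = (i < len + bias && cond[i]) ? a[i] * b[i] + c[i]
                                        : else_vals[i].  */
static void
cond_len_fma_model (double *lhs, const bool *cond, const double *a,
                    const double *b, const double *c,
                    const double *else_vals, int nunits, int len, int bias)
{
  for (int i = 0; i < nunits; i++)
    {
      if (i < len + bias && cond[i])
        lhs[i] = a[i] * b[i] + c[i];
      else
        lhs[i] = else_vals[i];
    }
}
```

Elements that are masked off or lie beyond the length take the fallback
value, which is why a single COND_LEN_FMA can replace the COND_LEN_MUL and
COND_LEN_ADD pair.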

gcc/ChangeLog:

* genmatch.cc (commutative_op): Add COND_LEN_*
* internal-fn.cc (first_commutative_argument): Ditto.
(CASE): Ditto.
(get_unconditional_internal_fn): Ditto.
(can_interpret_as_conditional_op_p): Ditto.
(internal_fn_len_index): Ditto.
* internal-fn.h (can_interpret_as_conditional_op_p): Ditto.
* tree-ssa-math-opts.cc (convert_mult_to_fma_1): Ditto.
(convert_mult_to_fma): Ditto.
(math_opts_dom_walker::after_dom_children): Ditto.

---
 gcc/genmatch.cc   | 13 +++
 gcc/internal-fn.cc| 82 +++
 gcc/internal-fn.h |  2 +-
 gcc/tree-ssa-math-opts.cc | 57 ---
 4 files changed, 139 insertions(+), 15 deletions(-)

diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
index 5fceeec9780..2302f2a7ff0 100644
--- a/gcc/genmatch.cc
+++ b/gcc/genmatch.cc
@@ -559,6 +559,19 @@ commutative_op (id_base *id)
   case CFN_COND_FMS:
   case CFN_COND_FNMA:
   case CFN_COND_FNMS:
+  case CFN_COND_LEN_ADD:
+  case CFN_COND_LEN_MUL:
+  case CFN_COND_LEN_MIN:
+  case CFN_COND_LEN_MAX:
+  case CFN_COND_LEN_FMIN:
+  case CFN_COND_LEN_FMAX:
+  case CFN_COND_LEN_AND:
+  case CFN_COND_LEN_IOR:
+  case CFN_COND_LEN_XOR:
+  case CFN_COND_LEN_FMA:
+  case CFN_COND_LEN_FMS:
+  case CFN_COND_LEN_FNMA:
+  case CFN_COND_LEN_FNMS:
return 1;
 
   default:
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index c11123a1173..e47b1377ff8 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -4191,6 +4191,19 @@ first_commutative_argument (internal_fn fn)
 case IFN_COND_FMS:
 case IFN_COND_FNMA:
 case IFN_COND_FNMS:
+case IFN_COND_LEN_ADD:
+case IFN_COND_LEN_MUL:
+case IFN_COND_LEN_MIN:
+case IFN_COND_LEN_MAX:
+case IFN_COND_LEN_FMIN:
+case IFN_COND_LEN_FMAX:
+case IFN_COND_LEN_AND:
+case IFN_COND_LEN_IOR:
+case IFN_COND_LEN_XOR:
+case IFN_COND_LEN_FMA:
+case IFN_COND_LEN_FMS:
+case IFN_COND_LEN_FNMA:
+case IFN_COND_LEN_FNMS:
   return 1;
 
 default:
@@ -4330,11 +4343,15 @@ conditional_internal_fn_code (internal_fn ifn)
 {
   switch (ifn)
 {
-#define CASE(CODE, IFN) case IFN_COND_##IFN: return CODE;
-  FOR_EACH_CODE_MAPPING(CASE)
+#define CASE(CODE, IFN)                                                        \
+  case IFN_COND_##IFN:                                                         \
+    return CODE;                                                               \
+  case IFN_COND_LEN_##IFN:                                                     \
+    return CODE;
+  FOR_EACH_CODE_MAPPING (CASE)
 #undef CASE
-default:
-  return ERROR_MARK;
+  default:
+   return ERROR_MARK;
 }
 }
 
@@ -4433,6 +4450,18 @@ get_unconditional_internal_fn (internal_fn ifn)
operating elementwise if the operands are vectors.  This includes
the case of an all-true COND, so that the operation always happens.
 
+   There is an alternative approach to interpret the STMT when the operands
+   are vectors which is the operation predicated by both conditional mask
+   and loop control length, the equivalent C code:
+
+ for (int i = 0; i < NUNITS; i++)
+  {
+   if (i < LEN + BIAS && COND[i])
+ LHS[i] = A[i] CODE B[i];
+   else
+ LHS[i] = ELSE[i];
+  }
+
When returning true, set:
 
- *COND_OUT to the condition COND, or to NULL_TREE if the condition
@@ -4440,13 +4469,18 @@ get_unconditional_internal_fn (internal_fn ifn)
- *CODE_OUT to the tree code
- OPS[I] to operand I of *CODE_OUT
- *ELSE_OUT to the fallback value ELSE, or to NULL_TREE if the
- condition is known to be all true.  */
+ condition is known to be all true.
+   - *LEN to the len argument if it is a COND_LEN_* operation, or to NULL_TREE.
+   - *BIAS to the bias argument if it is a COND_LEN_* operation, or to NULL_TREE.  */
 
 bool
 can_interpret

[PATCH v2] RISC-V: Add more tests for RVV floating-point FRM.

2023-07-12 Thread Pan Li via Gcc-patches
From: Pan Li 

Add more test cases, including both the asm check and run tests, for RVV FRM.

Signed-off-by: Pan Li 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-frm-insert-10.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-insert-7.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-insert-8.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-insert-9.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-run-1.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-run-2.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-run-3.c: New test.
---
 .../rvv/base/float-point-frm-insert-10.c  | 23 ++
 .../riscv/rvv/base/float-point-frm-insert-7.c | 29 +++
 .../riscv/rvv/base/float-point-frm-insert-8.c | 27 +++
 .../riscv/rvv/base/float-point-frm-insert-9.c | 24 ++
 .../riscv/rvv/base/float-point-frm-run-1.c| 79 +++
 .../riscv/rvv/base/float-point-frm-run-2.c| 70 
 .../riscv/rvv/base/float-point-frm-run-3.c| 71 +
 7 files changed, 323 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-9.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-3.c

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
new file mode 100644
index 000..c46910b878c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+void
+test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl)
+{
+  asm volatile (
+"addi %0, %0, 0x12"
+:"+r"(vl)
+:
+:
+  );
+
+  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
+  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
+  *(vfloat32m1_t *)out = result;
+}
+
+/* { dg-final { scan-assembler-times 
{vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
new file mode 100644
index 000..7b1602fd509
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+normalize_vl (size_t vl)
+{
+  if (vl % 4 == 0)
+return vl;
+
+  return ((vl / 4) + 1) * 4;
+}
+
+void
+test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl)
+{
+  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
+
+  vl = normalize_vl (vl);
+
+  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
+
+  *(vfloat32m1_t *)out = result;
+}
+
+/* { dg-final { scan-assembler-times 
{vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
new file mode 100644
index 000..37481ddac38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+normalize_vl (size_t vl)
+{
+  if (vl % 4 == 0)
+return vl;
+
+  return ((vl / 4) + 1) * 4;
+}
+
+void
+test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl)
+{
+  vl = normalize_vl (vl);
+
+  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
+
+  *(vfloat32m1_t *)out = result;
+}
+
+/* { dg-final { scan-assembler-times 
{vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-9.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-9.c
new file mode 100644
index 00

RE: [PATCH v2] RISC-V: Refactor riscv mode after for VXRM and FRM

2023-07-12 Thread Li, Pan2 via Gcc-patches
Thanks Jeff and Kito for the comments; the V3 version is updated as below.

https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624347.html

> Extract vxrm reg to a local static variable to prevent construct that again 
> and again.

The "static const_rtx vxrm_rtx = gen_rtx_REG (SImode, VXRM_REGNUM)" results in 
an error during selftest, as shown below, so patch v3 doesn't include this change.

/home/pli/repos/gcc/111/riscv-gnu-toolchain/build-gcc-newlib-stage1/./gcc/xgcc 
-B/home/pli/repos/gcc/111/riscv-gnu-toolchain/build-gcc-newlib-stage1/./gcc/  
-xc -nostdinc /dev/null -S -o /dev/null 
-fself-test=../.././gcc/gcc/testsuite/selftests
virtual memory exhausted: Invalid argument
make[2]: *** [../.././gcc/gcc/c/Make-lang.in:153: s-selftest-c] Error 1

Pan

-----Original Message-----
From: Jeff Law  
Sent: Wednesday, July 12, 2023 11:31 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; rdapp@gmail.com; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: Re: [PATCH v2] RISC-V: Refactor riscv mode after for VXRM and FRM



On 7/11/23 23:50, pan2...@intel.com wrote:
> From: Pan Li 
> 
> When investigating the FRM dynamic rounding mode, we found that the global
> unknown status is quite different between fixed-point and
> floating-point. Thus, we separate the unknown-status function, extracting
> some common inner functions.
> 
> We will also prepare more test cases in another PATCH.
> 
> Signed-off-by: Pan Li 
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv.cc (regnum_definition_p): New function.
>   (insn_asm_p): Ditto.
>   (riscv_vxrm_mode_after): New function for fixed-point.
>   (global_vxrm_state_unknown_p): Ditto.
>   (riscv_frm_mode_after): New function for floating-point.
>   (global_frm_state_unknown_p): Ditto.
>   (riscv_mode_after): Leverage new functions.
>   (riscv_entity_mode_after): Removed.
> ---
>   gcc/config/riscv/riscv.cc | 96 +--
>   1 file changed, 82 insertions(+), 14 deletions(-)
> 
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 38d8eb2fcf5..553fbb4435a 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -7742,19 +7742,91 @@ global_state_unknown_p (rtx_insn *insn, unsigned int 
> regno)
> return false;
>   }
>   
> +static bool
> +regnum_definition_p (rtx_insn *insn, unsigned int regno)
Needs a function comment.  This is true for each new function added.  In 
this specific case something like this might be appropriate:

/* Return TRUE if REGNO is set in INSN, FALSE otherwise.  */

Which begs the question, is there some reason why we're not using the 
existing reg_set_p or simple_regno_set from rtlanal.cc?



Jeff


[PATCH v3] RISC-V: Refactor riscv mode after for VXRM and FRM

2023-07-12 Thread Pan Li via Gcc-patches
From: Pan Li 

When investigating the FRM dynamic rounding mode, we found that the global
unknown status is quite different between fixed-point and
floating-point. Thus, we separate the unknown-status function, extracting
some common inner functions.

We will also prepare more test cases in another PATCH.

Signed-off-by: Pan Li 

gcc/ChangeLog:

* config/riscv/riscv.cc (vxrm_rtx): New static var.
(frm_rtx): Ditto.
(global_state_unknown_p): Removed.
(riscv_entity_mode_after): Removed.
(asm_insn_p): New function.
(vxrm_unknown_p): New function for fixed-point.
(riscv_vxrm_mode_after): Ditto.
(frm_unknown_dynamic_p): New function for floating-point.
(riscv_frm_mode_after): Ditto.
(riscv_mode_after): Leverage new functions.
---
 gcc/config/riscv/riscv.cc | 85 ---
 1 file changed, 62 insertions(+), 23 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 706c18416db..6ed735d6983 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -7701,17 +7701,24 @@ riscv_mode_needed (int entity, rtx_insn *insn)
 }
 }
 
-/* Return true if the VXRM/FRM status of the INSN is unknown.  */
+/* Return TRUE if INSN is an asm insn.  */
+
 static bool
-global_state_unknown_p (rtx_insn *insn, unsigned int regno)
+asm_insn_p (rtx_insn *insn)
 {
-  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
-  df_ref ref;
+  extract_insn (insn);
+
+  return recog_data.is_asm;
+}
+
+/* Return TRUE if INSN leaves VXRM in an unknown state.  */
 
+static bool
+vxrm_unknown_p (rtx_insn *insn)
+{
   /* Return true if there is a definition of VXRM.  */
-  for (ref = DF_INSN_INFO_DEFS (insn_info); ref; ref = DF_REF_NEXT_LOC (ref))
-if (DF_REF_REGNO (ref) == regno)
-  return true;
+  if (reg_set_p (gen_rtx_REG (SImode, VXRM_REGNUM), insn))
+return true;
 
   /* A CALL function may contain an instruction that modifies the VXRM,
  return true in this situation.  */
@@ -7720,25 +7727,61 @@ global_state_unknown_p (rtx_insn *insn, unsigned int 
regno)
 
   /* Return true for all assembly since users may hardcode a assembly
  like this: asm volatile ("csrwi vxrm, 0").  */
-  extract_insn (insn);
-  if (recog_data.is_asm)
+  if (asm_insn_p (insn))
+return true;
+
+  return false;
+}
+
+/* Return TRUE if INSN leaves FRM in an unknown (dynamic) state.  */
+
+static bool
+frm_unknown_dynamic_p (rtx_insn *insn)
+{
+  /* Return true if there is a definition of FRM.  */
+  if (reg_set_p (gen_rtx_REG (SImode, FRM_REGNUM), insn))
 return true;
+
+  /* A CALL function may contain an instruction that modifies the FRM,
+ return true in this situation.  */
+  if (CALL_P (insn))
+return true;
+
   return false;
 }
 
+/* Return the mode that an insn results in for VXRM.  */
+
 static int
-riscv_entity_mode_after (int regnum, rtx_insn *insn, int mode,
-int (*get_attr_mode) (rtx_insn *), int default_mode)
+riscv_vxrm_mode_after (rtx_insn *insn, int mode)
 {
-  if (global_state_unknown_p (insn, regnum))
-return default_mode;
-  else if (recog_memoized (insn) < 0)
+  if (vxrm_unknown_p (insn))
+return VXRM_MODE_NONE;
+
+  if (recog_memoized (insn) < 0)
+return mode;
+
+  if (reg_mentioned_p (gen_rtx_REG (SImode, VXRM_REGNUM), PATTERN (insn)))
+return get_attr_vxrm_mode (insn);
+  else
 return mode;
+}
+
+/* Return the mode that an insn results in for FRM.  */
 
-  rtx reg = gen_rtx_REG (SImode, regnum);
-  bool mentioned_p = reg_mentioned_p (reg, PATTERN (insn));
+static int
+riscv_frm_mode_after (rtx_insn *insn, int mode)
+{
+  if (frm_unknown_dynamic_p (insn))
+return FRM_MODE_DYN;
 
-  return mentioned_p ? get_attr_mode (insn): mode;
+  if (recog_memoized (insn) < 0)
+return mode;
+
+  if (reg_mentioned_p (gen_rtx_REG (SImode, FRM_REGNUM), PATTERN (insn)))
+return get_attr_frm_mode (insn);
+  else
+return mode;
 }
 
 /* Return the mode that an insn results in.  */
@@ -7749,13 +7792,9 @@ riscv_mode_after (int entity, int mode, rtx_insn *insn)
   switch (entity)
 {
 case RISCV_VXRM:
-  return riscv_entity_mode_after (VXRM_REGNUM, insn, mode,
- (int (*)(rtx_insn *)) get_attr_vxrm_mode,
- VXRM_MODE_NONE);
+  return riscv_vxrm_mode_after (insn, mode);
 case RISCV_FRM:
-  return riscv_entity_mode_after (FRM_REGNUM, insn, mode,
- (int (*)(rtx_insn *)) get_attr_frm_mode,
- FRM_MODE_DYN);
+  return riscv_frm_mode_after (insn, mode);
 default:
   gcc_unreachable ();
 }
-- 
2.34.1
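
For readers following the refactor, the decision order used by
riscv_frm_mode_after can be modeled outside GCC.  The sketch below is a toy
C++ model and not GCC code: ToyInsn, its fields, the TOY_FRM_* values and
toy_frm_mode_after are hypothetical stand-ins for rtx_insn, reg_set_p,
CALL_P, recog_memoized, reg_mentioned_p and get_attr_frm_mode.

```cpp
// Toy model of the FRM "mode after" decision in the patch above.
// Every name here is a hypothetical stand-in; none of it is GCC API.
enum ToyFrmMode { TOY_FRM_RNE, TOY_FRM_RTZ, TOY_FRM_DYN };

struct ToyInsn {
  bool sets_frm;      // models reg_set_p (FRM register, insn)
  bool is_call;       // models CALL_P (insn)
  bool recognized;    // models recog_memoized (insn) >= 0
  bool mentions_frm;  // models reg_mentioned_p (FRM register, PATTERN (insn))
  ToyFrmMode attr;    // models get_attr_frm_mode (insn)
};

// Same ordering as the patch: unknown state first, then unrecognized
// insns, then insns that mention the register.
constexpr ToyFrmMode toy_frm_mode_after(const ToyInsn &insn, ToyFrmMode mode) {
  if (insn.sets_frm || insn.is_call)
    return TOY_FRM_DYN;  // state is unknown after a definition or a call
  if (!insn.recognized)
    return mode;         // unrecognized insn: keep the incoming mode
  return insn.mentions_frm ? insn.attr : mode;
}
```

The VXRM variant would follow the same shape, with asm statements also
counted as unknown and VXRM_MODE_NONE as the fallback value.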



Re: [PATCH] mklog: Add --append option to auto add generate ChangeLog to patch file

2023-07-12 Thread Lehua Ding
Committed to the trunk, thanks Jeff.
 
 
-- Original --
From:  "Jeff Law"

[PATCH v5 2/2] libstdc++: Use new built-in trait __is_pointer

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch lets libstdc++ use new built-in trait __is_pointer.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_ptr): Use __is_pointer
built-in trait.
* include/std/type_traits (is_pointer): Likewise. Optimize its
implementation.
(is_pointer_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/bits/cpp_type_traits.h |  8 
 libstdc++-v3/include/std/type_traits| 44 +
 2 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/bits/cpp_type_traits.h 
b/libstdc++-v3/include/bits/cpp_type_traits.h
index 3711e4be526..4da1e7c407c 100644
--- a/libstdc++-v3/include/bits/cpp_type_traits.h
+++ b/libstdc++-v3/include/bits/cpp_type_traits.h
@@ -363,6 +363,13 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   //
   // Pointer types
   //
+#if __has_builtin(__is_pointer)
+  template<typename _Tp>
+struct __is_ptr : __truth_type<__is_pointer(_Tp)>
+{
+  enum { __value = __is_pointer(_Tp) };
+};
+#else
   template
 struct __is_ptr
 {
@@ -376,6 +383,7 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   enum { __value = 1 };
   typedef __true_type __type;
 };
+#endif
 
   //
   // An arithmetic type is an integer type or a floating point type
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..16b2f6de536 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -515,19 +515,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_array<_Tp[]>
 : public true_type { };
 
-  template
-struct __is_pointer_helper
+  /// is_pointer
+#if __has_builtin(__is_pointer)
+  template<typename _Tp>
+struct is_pointer
+: public __bool_constant<__is_pointer(_Tp)>
+{ };
+#else
+  template
+struct is_pointer
 : public false_type { };
 
   template
-struct __is_pointer_helper<_Tp*>
+struct is_pointer<_Tp*>
 : public true_type { };
 
-  /// is_pointer
   template
-struct is_pointer
-: public __is_pointer_helper<__remove_cv_t<_Tp>>::type
-{ };
+struct is_pointer<_Tp* const>
+: public true_type { };
+
+  template
+struct is_pointer<_Tp* volatile>
+: public true_type { };
+
+  template
+struct is_pointer<_Tp* const volatile>
+: public true_type { };
+#endif
 
   /// is_lvalue_reference
   template
@@ -3168,8 +3182,22 @@ template 
 template 
   inline constexpr bool is_array_v<_Tp[_Num]> = true;
 
+#if __has_builtin(__is_pointer)
+template <typename _Tp>
+  inline constexpr bool is_pointer_v = __is_pointer(_Tp);
+#else
 template <typename _Tp>
-  inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
+  inline constexpr bool is_pointer_v = false;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp*> = true;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp* const> = true;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp* volatile> = true;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
+#endif
+
 template 
   inline constexpr bool is_lvalue_reference_v = false;
 template 
-- 
2.41.0
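
As an illustration of the non-builtin fallback in the patch above, here is a
minimal standalone sketch.  It assumes nothing beyond standard C++; the
namespace `sketch` is a hypothetical name used only to avoid clashing with
std.  Instead of stripping cv-qualifiers through a helper, it spells out one
partial specialization per cv-qualified pointer form.

```cpp
#include <type_traits>

namespace sketch {  // hypothetical namespace, for illustration only

// Primary template: not a pointer.
template <typename T>
struct is_pointer : std::false_type {};

// One specialization per top-level cv-qualification of a pointer type.
template <typename T>
struct is_pointer<T*> : std::true_type {};
template <typename T>
struct is_pointer<T* const> : std::true_type {};
template <typename T>
struct is_pointer<T* volatile> : std::true_type {};
template <typename T>
struct is_pointer<T* const volatile> : std::true_type {};

}  // namespace sketch
```

Only the top-level qualifiers of the pointer itself need their own
specialization; qualifiers on the pointee (as in const int*) are absorbed
into T.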



[PATCH v5 1/2] c++, libstdc++: Implement __is_pointer built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_pointer.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_POINTER.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_pointer.
* g++.dg/ext/is_pointer.C: New test.
* g++.dg/tm/pr46567.C (__is_pointer): Rename to ...
(__is_ptr): ... this.
* g++.dg/torture/20070621-1.C: Likewise.
* g++.dg/torture/pr57107.C: Likewise.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_pointer): Rename to ...
(__is_ptr): ... this.
* include/bits/deque.tcc: Use __is_ptr instead.
* include/bits/stl_algobase.h: Likewise.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc|  3 ++
 gcc/cp/cp-trait.def |  1 +
 gcc/cp/semantics.cc |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
 gcc/testsuite/g++.dg/ext/is_pointer.C   | 51 +
 gcc/testsuite/g++.dg/tm/pr46567.C   | 22 -
 gcc/testsuite/g++.dg/torture/20070621-1.C   |  4 +-
 gcc/testsuite/g++.dg/torture/pr57107.C  |  4 +-
 libstdc++-v3/include/bits/cpp_type_traits.h |  6 +--
 libstdc++-v3/include/bits/deque.tcc |  6 +--
 libstdc++-v3/include/bits/stl_algobase.h|  6 +--
 11 files changed, 86 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_pointer.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..30266204eb5 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_POINTER:
+  inform (loc, "  %qT is not a pointer", t1);
+  break;
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..b7c263e9a77 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_POINTER, "__is_pointer", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..68f8a4fe85b 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
+case CPTK_IS_POINTER:
+  return TYPE_PTR_P (type1);
+
 case CPTK_IS_ASSIGNABLE:
   return is_xible (MODIFY_EXPR, type1, type2);
 
@@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_POINTER:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..9dace5cbd48 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_pointer)
+# error "__has_builtin (__is_pointer) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_pointer.C 
b/gcc/testsuite/g++.dg/ext/is_pointer.C
new file mode 100644
index 00000000000..d6e39565950
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_pointer.C
@@ -0,0 +1,51 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+SA(!__is_pointer(int));
+SA(__is_pointer(int*));
+SA(__is_pointer(int**));
+
+SA(__is_pointer(const int*));
+SA(__is_pointer(const int**));
+SA(__is_pointer(int* const));
+SA(__is_pointer(int** const));
+SA(__is_pointer(int* const* const));
+
+SA(__is_pointer(volatile int*));
+SA(__is_pointer(volatile int**));
+SA(__is_pointer(int* volatile));
+SA(__is_pointer(int** volatile));
+SA(__is_pointer(int* volatile* volatile));
+
+SA(__is_pointer(const volatile int*));
+SA(__is_pointer(const volatile int**));
+SA(__is_pointer(const int* volatile));
+SA(__is_pointer(volatile int* const));
+SA(__is_pointer(int* const volatile));
+SA(__is_pointer(const int** volatile));
+SA(__is_pointer(volatile int** const));
+SA(__is_pointer(int** const volatile));
+SA(__

[PATCH v4 2/2] libstdc++: Use new built-in trait __is_pointer

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch lets libstdc++ use new built-in trait __is_pointer.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_ptr): Use __is_pointer
built-in trait.
* include/std/type_traits (is_pointer): Likewise. Optimize its
implementation.
(is_pointer_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/bits/cpp_type_traits.h |  8 
 libstdc++-v3/include/std/type_traits| 45 +
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/bits/cpp_type_traits.h 
b/libstdc++-v3/include/bits/cpp_type_traits.h
index 3711e4be526..4da1e7c407c 100644
--- a/libstdc++-v3/include/bits/cpp_type_traits.h
+++ b/libstdc++-v3/include/bits/cpp_type_traits.h
@@ -363,6 +363,13 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   //
   // Pointer types
   //
+#if __has_builtin(__is_pointer)
+  template<typename _Tp>
+struct __is_ptr : __truth_type<__is_pointer(_Tp)>
+{
+  enum { __value = __is_pointer(_Tp) };
+};
+#else
   template
 struct __is_ptr
 {
@@ -376,6 +383,7 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   enum { __value = 1 };
   typedef __true_type __type;
 };
+#endif
 
   //
   // An arithmetic type is an integer type or a floating point type
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..0743db4cb51 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -515,19 +515,34 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_array<_Tp[]>
 : public true_type { };
 
-  template
-struct __is_pointer_helper
+  /// is_pointer
+#if __has_builtin(__is_pointer)
+  template<typename _Tp>
+struct is_pointer
+: public __bool_constant<__is_pointer(_Tp)>
+{ };
+#else
+  template
+struct is_pointer
 : public false_type { };
+{ };
 
   template
-struct __is_pointer_helper<_Tp*>
+struct is_pointer<_Tp*>
 : public true_type { };
 
-  /// is_pointer
   template
-struct is_pointer
-: public __is_pointer_helper<__remove_cv_t<_Tp>>::type
-{ };
+struct is_pointer<_Tp* const>
+: public true_type { };
+
+  template
+struct is_pointer<_Tp* volatile>
+: public true_type { };
+
+  template
+struct is_pointer<_Tp* const volatile>
+: public true_type { };
+#endif
 
   /// is_lvalue_reference
   template
@@ -3168,8 +3183,22 @@ template 
 template 
   inline constexpr bool is_array_v<_Tp[_Num]> = true;
 
+#if __has_builtin(__is_pointer)
+template 
+  inline constexpr bool is_pointer_v = __is_pointer(_Tp);
+#else
+template 
+  inline constexpr bool is_pointer_v = false;
+template 
+  inline constexpr bool is_pointer_v<_Tp*> = true;
+template 
+  inline constexpr bool is_pointer_v<_Tp* const> = true;
 template 
-  inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
+  inline constexpr bool is_pointer_v<_Tp* volatile> = true;
+template 
+  inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
+#endif
+
 template 
   inline constexpr bool is_lvalue_reference_v = false;
 template 
-- 
2.41.0



[PATCH v4 1/2] c++, libstdc++: Implement __is_pointer built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_pointer.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_POINTER.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_pointer.
* g++.dg/ext/is_pointer.C: New test.
* g++.dg/tm/pr46567.C (__is_pointer): Rename to ...
(__is_ptr): ... this.
* g++.dg/torture/20070621-1.C: Likewise.
* g++.dg/torture/pr57107.C: Likewise.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_pointer): Rename to ...
(__is_ptr): ... this.
* include/bits/deque.tcc: Use __is_ptr instead.
* include/bits/stl_algobase.h: Likewise.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc|  3 ++
 gcc/cp/cp-trait.def |  1 +
 gcc/cp/semantics.cc |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
 gcc/testsuite/g++.dg/ext/is_pointer.C   | 51 +
 gcc/testsuite/g++.dg/tm/pr46567.C   | 22 -
 gcc/testsuite/g++.dg/torture/20070621-1.C   |  4 +-
 gcc/testsuite/g++.dg/torture/pr57107.C  |  4 +-
 libstdc++-v3/include/bits/cpp_type_traits.h |  6 +--
 libstdc++-v3/include/bits/deque.tcc |  6 +--
 libstdc++-v3/include/bits/stl_algobase.h|  6 +--
 11 files changed, 86 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_pointer.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..30266204eb5 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_POINTER:
+  inform (loc, "  %qT is not a pointer", t1);
+  break;
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..b7c263e9a77 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_POINTER, "__is_pointer", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..68f8a4fe85b 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
+case CPTK_IS_POINTER:
+  return TYPE_PTR_P (type1);
+
 case CPTK_IS_ASSIGNABLE:
   return is_xible (MODIFY_EXPR, type1, type2);
 
@@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_POINTER:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..9dace5cbd48 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_pointer)
+# error "__has_builtin (__is_pointer) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_pointer.C 
b/gcc/testsuite/g++.dg/ext/is_pointer.C
new file mode 100644
index 00000000000..d6e39565950
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_pointer.C
@@ -0,0 +1,51 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+SA(!__is_pointer(int));
+SA(__is_pointer(int*));
+SA(__is_pointer(int**));
+
+SA(__is_pointer(const int*));
+SA(__is_pointer(const int**));
+SA(__is_pointer(int* const));
+SA(__is_pointer(int** const));
+SA(__is_pointer(int* const* const));
+
+SA(__is_pointer(volatile int*));
+SA(__is_pointer(volatile int**));
+SA(__is_pointer(int* volatile));
+SA(__is_pointer(int** volatile));
+SA(__is_pointer(int* volatile* volatile));
+
+SA(__is_pointer(const volatile int*));
+SA(__is_pointer(const volatile int**));
+SA(__is_pointer(const int* volatile));
+SA(__is_pointer(volatile int* const));
+SA(__is_pointer(int* const volatile));
+SA(__is_pointer(const int** volatile));
+SA(__is_pointer(volatile int** const));
+SA(__is_pointer(int** const volatile));
+SA(__

Re: [PATCH] tree-optimization/94864 - vector insert of vector extract simplification

2023-07-12 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 13, 2023 at 10:47 AM Hongtao Liu  wrote:
>
> On Wed, Jul 12, 2023 at 9:37 PM Richard Biener via Gcc-patches
>  wrote:
> >
> > The PRs ask for optimizing of
> >
> >   _1 = BIT_FIELD_REF ;
> >   result_4 = BIT_INSERT_EXPR ;
> >
> > to a vector permutation.  The following implements this as
> > match.pd pattern, improving code generation on x86_64.
> >
> > On the RTL level we face the issue that backend patterns inconsistently
> > use vec_merge and vec_select of vec_concat to represent permutes.
> >
> > I think using a (supported) permute is almost always better
> > than an extract plus insert, maybe excluding the case we extract
> > element zero and that's aliased to a register that can be used
> > directly for insertion (not sure how to query that).
> >
> > But this regresses for example gcc.target/i386/pr54855-8.c because PRE
> > now realizes that
> >
> >   _1 = BIT_FIELD_REF ;
> >   if (_1 > a_4(D))
> > goto ; [50.00%]
> >   else
> > goto ; [50.00%]
> >
> >[local count: 536870913]:
> >
> >[local count: 1073741824]:
> >   # iftmp.0_2 = PHI <_1(3), a_4(D)(2)>
> >   x_5 = BIT_INSERT_EXPR ;
> >
> > is equal to
> >
> >[local count: 1073741824]:
> >   _1 = BIT_FIELD_REF ;
> >   if (_1 > a_4(D))
> > goto ; [50.00%]
> >   else
> > goto ; [50.00%]
> >
> >[local count: 536870912]:
> >   _7 = BIT_INSERT_EXPR ;
> >
> >[local count: 1073741824]:
> >   # prephitmp_8 = PHI 
> >
> > and that no longer produces the desired maxsd operation at the RTL
> The comparison is scalar mode, but operations in then_bb is
> vector_mode, if_convert can't eliminate the condition any more(and
> won't go into backend ix86_expand_sse_fp_minmax).
> I think for ordered comparisons like _1 > a_4, it doesn't match
> fmin/fmax, but match SSE MINSS/MAXSS since it alway returns the second
> operand(not the other operand) when there's NONE.
I mean NANs.
> > level (we fail to match .FMAX at the GIMPLE level earlier).
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu with regressions:
> >
> > FAIL: gcc.target/i386/pr54855-13.c scan-assembler-times vmaxsh[ t] 1
> > FAIL: gcc.target/i386/pr54855-13.c scan-assembler-not vcomish[ t]
> > FAIL: gcc.target/i386/pr54855-8.c scan-assembler-times maxsd 1
> > FAIL: gcc.target/i386/pr54855-8.c scan-assembler-not movsd
> > FAIL: gcc.target/i386/pr54855-9.c scan-assembler-times minss 1
> > FAIL: gcc.target/i386/pr54855-9.c scan-assembler-not movss
> >
> > I think this is also PR88540 (the lack of min/max detection, not
> > sure if the SSE min/max are suitable here)
> >
> > PR tree-optimization/94864
> > PR tree-optimization/94865
> > * match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): New pattern
> > for vector insertion from vector extraction.
> >
> > * gcc.target/i386/pr94864.c: New testcase.
> > * gcc.target/i386/pr94865.c: Likewise.
> > ---
> >  gcc/match.pd| 25 +
> >  gcc/testsuite/gcc.target/i386/pr94864.c | 13 +
> >  gcc/testsuite/gcc.target/i386/pr94865.c | 13 +
> >  3 files changed, 51 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr94864.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr94865.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 8543f777a28..8cc106049c4 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -7770,6 +7770,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >   wi::to_wide (@ipos) + isize))
> >  (BIT_FIELD_REF @0 @rsize @rpos)
> >
> > +/* Simplify vector inserts of other vector extracts to a permute.  */
> > +(simplify
> > + (bit_insert @0 (BIT_FIELD_REF@2 @1 @rsize @rpos) @ipos)
> > + (if (VECTOR_TYPE_P (type)
> > +  && types_match (@0, @1)
> > +  && types_match (TREE_TYPE (TREE_TYPE (@0)), TREE_TYPE (@2))
> > +  && TYPE_VECTOR_SUBPARTS (type).is_constant ())
> > +  (with
> > +   {
> > + unsigned HOST_WIDE_INT elsz
> > +   = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (TREE_TYPE (@1))));
> > + poly_uint64 relt = exact_div (tree_to_poly_uint64 (@rpos), elsz);
> > + poly_uint64 ielt = exact_div (tree_to_poly_uint64 (@ipos), elsz);
> > + unsigned nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
> > + vec_perm_builder builder;
> > + builder.new_vector (nunits, nunits, 1);
> > + for (unsigned i = 0; i < nunits; ++i)
> > +   builder.quick_push (known_eq (ielt, i) ? nunits + relt : i);
> > + vec_perm_indices sel (builder, 2, nunits);
> > +   }
> > +   (if (!VECTOR_MODE_P (TYPE_MODE (type))
> > +   || can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), sel, 
> > false))
> > +(vec_perm @0 @1 { vec_perm_indices_to_tree
> > +(build_vector_type (ssizetype, nunits), sel); 
> > })))))
> > +
> >  (if (canonicalize_math_after_vectorization_p ())
> >   (for fmas (FMA)
> >(simplify
> > diff --git a/gcc/testsuite/gcc.target/i386/pr94864.c 
> > b/gcc/te
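
The selector built by the quoted match.pd pattern can be sketched outside GCC
as follows.  This is a toy model under stated assumptions: Lanes, Selector,
insert_extract_selector and toy_vec_perm are hypothetical names, the element
type is fixed to int, and the two-input permute uses the usual convention
that indices below N pick from the first vector and indices from N upward
pick from the second.

```cpp
#include <cstddef>

template <std::size_t N>
struct Lanes { int v[N]; };          // toy fixed-length vector

template <std::size_t N>
struct Selector { unsigned idx[N]; };  // toy permute selector

// Mirrors the loop in the pattern: lane ielt selects element relt of the
// second input (index N + relt); every other lane i selects element i of
// the first input.
template <std::size_t N>
constexpr Selector<N> insert_extract_selector(unsigned ielt, unsigned relt) {
  Selector<N> sel{};
  for (unsigned i = 0; i < N; ++i)
    sel.idx[i] = (i == ielt) ? static_cast<unsigned>(N) + relt : i;
  return sel;
}

// Two-input permute: a plays the role of @0, b the role of @1.
template <std::size_t N>
constexpr Lanes<N> toy_vec_perm(const Lanes<N> &a, const Lanes<N> &b,
                                const Selector<N> &sel) {
  Lanes<N> out{};
  for (unsigned i = 0; i < N; ++i)
    out.v[i] = sel.idx[i] < N ? a.v[sel.idx[i]] : b.v[sel.idx[i] - N];
  return out;
}
```

For four lanes, inserting element 1 of the second vector into lane 2 of the
first gives the selector {0, 1, 5, 3}.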

Re: [PATCH v10 2/5] libstdc++: Use new built-in trait __is_reference for std::is_reference

2023-07-12 Thread Ken Matsui via Gcc-patches
Hi,

Here is the benchmark result for is_reference:

https://github.com/ken-matsui/gcc-benches/blob/main/is_reference.md#wed-jul-12-074702-pm-pdt-2023

Time: -8.15593%
Peak Memory Usage: -4.48408%
Total Memory Usage: -8.03783%

Sincerely,
Ken Matsui

On Wed, Jul 12, 2023 at 7:39 PM Ken Matsui  wrote:
>
> This patch gets std::is_reference to dispatch to new built-in trait
> __is_reference.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/type_traits (is_reference): Use __is_reference built-in
> trait.
> (is_reference_v): Likewise.
>
> Signed-off-by: Ken Matsui 
> ---
>  libstdc++-v3/include/std/type_traits | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/libstdc++-v3/include/std/type_traits 
> b/libstdc++-v3/include/std/type_traits
> index 0e7a9c9c7f3..2a14df7e5f9 100644
> --- a/libstdc++-v3/include/std/type_traits
> +++ b/libstdc++-v3/include/std/type_traits
> @@ -639,6 +639,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>// Composite type categories.
>
>/// is_reference
> +#if __has_builtin(__is_reference)
> +  template<typename _Tp>
> +struct is_reference
> +: public __bool_constant<__is_reference(_Tp)>
> +{ };
> +#else
>template
>  struct is_reference
>  : public false_type
> @@ -653,6 +659,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  struct is_reference<_Tp&&>
>  : public true_type
>  { };
> +#endif
>
>/// is_arithmetic
>template
> @@ -3192,12 +3199,19 @@ template 
>inline constexpr bool is_class_v = __is_class(_Tp);
>  template 
>inline constexpr bool is_function_v = is_function<_Tp>::value;
> +
> +#if __has_builtin(__is_reference)
> +template <typename _Tp>
> +  inline constexpr bool is_reference_v = __is_reference(_Tp);
> +#else
>  template 
>inline constexpr bool is_reference_v = false;
>  template 
>inline constexpr bool is_reference_v<_Tp&> = true;
>  template 
>inline constexpr bool is_reference_v<_Tp&&> = true;
> +#endif
> +
>  template 
>inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
>  template 
> --
> 2.41.0
>


Re: [PATCH] tree-optimization/94864 - vector insert of vector extract simplification

2023-07-12 Thread Hongtao Liu via Gcc-patches
On Wed, Jul 12, 2023 at 9:37 PM Richard Biener via Gcc-patches
 wrote:
>
> The PRs ask for optimizing of
>
>   _1 = BIT_FIELD_REF ;
>   result_4 = BIT_INSERT_EXPR ;
>
> to a vector permutation.  The following implements this as
> match.pd pattern, improving code generation on x86_64.
>
> On the RTL level we face the issue that backend patterns inconsistently
> use vec_merge and vec_select of vec_concat to represent permutes.
>
> I think using a (supported) permute is almost always better
> than an extract plus insert, maybe excluding the case we extract
> element zero and that's aliased to a register that can be used
> directly for insertion (not sure how to query that).
>
> But this regresses for example gcc.target/i386/pr54855-8.c because PRE
> now realizes that
>
>   _1 = BIT_FIELD_REF ;
>   if (_1 > a_4(D))
> goto ; [50.00%]
>   else
> goto ; [50.00%]
>
>[local count: 536870913]:
>
>[local count: 1073741824]:
>   # iftmp.0_2 = PHI <_1(3), a_4(D)(2)>
>   x_5 = BIT_INSERT_EXPR ;
>
> is equal to
>
>[local count: 1073741824]:
>   _1 = BIT_FIELD_REF ;
>   if (_1 > a_4(D))
> goto ; [50.00%]
>   else
> goto ; [50.00%]
>
>[local count: 536870912]:
>   _7 = BIT_INSERT_EXPR ;
>
>[local count: 1073741824]:
>   # prephitmp_8 = PHI 
>
> and that no longer produces the desired maxsd operation at the RTL
The comparison is in scalar mode, but the operations in then_bb are in
vector mode, so if_convert can't eliminate the condition any more (and
won't go into the backend's ix86_expand_sse_fp_minmax).
I think ordered comparisons like _1 > a_4 don't match
fmin/fmax, but do match SSE MINSS/MAXSS, since those always return the second
operand (not the other operand) when there's a NaN.
> level (we fail to match .FMAX at the GIMPLE level earlier).
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu with regressions:
>
> FAIL: gcc.target/i386/pr54855-13.c scan-assembler-times vmaxsh[ t] 1
> FAIL: gcc.target/i386/pr54855-13.c scan-assembler-not vcomish[ t]
> FAIL: gcc.target/i386/pr54855-8.c scan-assembler-times maxsd 1
> FAIL: gcc.target/i386/pr54855-8.c scan-assembler-not movsd
> FAIL: gcc.target/i386/pr54855-9.c scan-assembler-times minss 1
> FAIL: gcc.target/i386/pr54855-9.c scan-assembler-not movss
>
> I think this is also PR88540 (the lack of min/max detection, not
> sure if the SSE min/max are suitable here)
>
> PR tree-optimization/94864
> PR tree-optimization/94865
> * match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): New pattern
> for vector insertion from vector extraction.
>
> * gcc.target/i386/pr94864.c: New testcase.
> * gcc.target/i386/pr94865.c: Likewise.
> ---
>  gcc/match.pd| 25 +
>  gcc/testsuite/gcc.target/i386/pr94864.c | 13 +
>  gcc/testsuite/gcc.target/i386/pr94865.c | 13 +
>  3 files changed, 51 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr94864.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr94865.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 8543f777a28..8cc106049c4 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -7770,6 +7770,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   wi::to_wide (@ipos) + isize))
>  (BIT_FIELD_REF @0 @rsize @rpos)
>
> +/* Simplify vector inserts of other vector extracts to a permute.  */
> +(simplify
> + (bit_insert @0 (BIT_FIELD_REF@2 @1 @rsize @rpos) @ipos)
> + (if (VECTOR_TYPE_P (type)
> +  && types_match (@0, @1)
> +  && types_match (TREE_TYPE (TREE_TYPE (@0)), TREE_TYPE (@2))
> +  && TYPE_VECTOR_SUBPARTS (type).is_constant ())
> +  (with
> +   {
> + unsigned HOST_WIDE_INT elsz
> +   = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (TREE_TYPE (@1;
> + poly_uint64 relt = exact_div (tree_to_poly_uint64 (@rpos), elsz);
> + poly_uint64 ielt = exact_div (tree_to_poly_uint64 (@ipos), elsz);
> + unsigned nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
> + vec_perm_builder builder;
> + builder.new_vector (nunits, nunits, 1);
> + for (unsigned i = 0; i < nunits; ++i)
> +   builder.quick_push (known_eq (ielt, i) ? nunits + relt : i);
> + vec_perm_indices sel (builder, 2, nunits);
> +   }
> +   (if (!VECTOR_MODE_P (TYPE_MODE (type))
> +   || can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), sel, 
> false))
> +(vec_perm @0 @1 { vec_perm_indices_to_tree
> +(build_vector_type (ssizetype, nunits), sel); })
> +
>  (if (canonicalize_math_after_vectorization_p ())
>   (for fmas (FMA)
>(simplify
> diff --git a/gcc/testsuite/gcc.target/i386/pr94864.c 
> b/gcc/testsuite/gcc.target/i386/pr94864.c
> new file mode 100644
> index 000..69cb481fcfe
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr94864.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -msse2 -mno-avx" } */
> +
> +typedef double v2df __attribute__((vector_size(16)));
> +
> +v2df mo

[PATCH v10 5/5] libstdc++: Make std::is_object dispatch to new built-in traits

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch gets std::is_object to dispatch to new built-in traits,
__is_function and __is_reference.

libstdc++-v3/ChangeLog:
* include/std/type_traits (is_object): Use new built-in traits,
__is_function and __is_reference.
(is_object_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 17 +
 1 file changed, 17 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 7ef50a2e64f..4ff025b09fa 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -682,11 +682,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { };
 
   /// is_object
+#if __has_builtin(__is_function) && __has_builtin(__is_reference) \
+&& __has_builtin(__is_void)
+  template
+struct is_object
+: public __bool_constant
+{ };
+#else
   template
 struct is_object
 : public __not_<__or_, is_reference<_Tp>,
   is_void<_Tp>>>::type
 { };
+#endif
 
   template
 struct is_member_pointer;
@@ -3233,8 +3242,16 @@ template 
   inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
 template 
   inline constexpr bool is_fundamental_v = is_fundamental<_Tp>::value;
+
+#if __has_builtin(__is_function) && __has_builtin(__is_reference)
+template 
+  inline constexpr bool is_object_v
+= !(__is_function(_Tp) || __is_reference(_Tp) || is_void<_Tp>::value);
+#else
 template 
   inline constexpr bool is_object_v = is_object<_Tp>::value;
+#endif
+
 template 
   inline constexpr bool is_scalar_v = is_scalar<_Tp>::value;
 template 
-- 
2.41.0



Re: [PATCH v10 3/5] c++: Implement __is_function built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
Hi,

Here is the benchmark result for is_function:

https://github.com/ken-matsui/gcc-benches/blob/main/is_function.md#wed-jul-12-072510-pm-pdt-2023

Time: -21.3748%
Peak Memory Usage: -10.962%
Total Memory Usage: -12.8384%

Sincerely,
Ken Matsui

On Wed, Jul 12, 2023 at 7:40 PM Ken Matsui  wrote:
>
> This patch implements built-in trait for std::is_function.
>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __is_function.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_FUNCTION.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_function.
> * g++.dg/ext/is_function.C: New test.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc |  3 ++
>  gcc/cp/cp-trait.def  |  1 +
>  gcc/cp/semantics.cc  |  4 ++
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
>  gcc/testsuite/g++.dg/ext/is_function.C   | 58 
>  5 files changed, 69 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_function.C
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index f6951ee2670..927605c6cb7 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3754,6 +3754,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_UNION:
>inform (loc, "  %qT is not a union", t1);
>break;
> +case CPTK_IS_FUNCTION:
> +  inform (loc, "  %qT is not a function", t1);
> +  break;
>  case CPTK_IS_AGGREGATE:
>inform (loc, "  %qT is not an aggregate", t1);
>break;
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 1e3310cd682..3cd3babc242 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -83,6 +83,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> "__is_trivially_assignable", 2)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> -1)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
>  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> +DEFTRAIT_EXPR (IS_FUNCTION, "__is_function", 1)
>  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> "__reference_constructs_from_temporary", 2)
>  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> "__reference_converts_from_temporary", 2)
>  /* FIXME Added space to avoid direct usage in GCC 13.  */
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 2f37bc353a1..b976633645a 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12072,6 +12072,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> tree type2)
>  case CPTK_IS_ENUM:
>return type_code1 == ENUMERAL_TYPE;
>
> +case CPTK_IS_FUNCTION:
> +  return type_code1 == FUNCTION_TYPE;
> +
>  case CPTK_IS_FINAL:
>return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
>
> @@ -12293,6 +12296,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>  case CPTK_IS_UNION:
>  case CPTK_IS_SAME:
>  case CPTK_IS_REFERENCE:
> +case CPTK_IS_FUNCTION:
>break;
>
>  case CPTK_IS_LAYOUT_COMPATIBLE:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index b697673790c..90eb00ebf2d 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -149,3 +149,6 @@
>  #if !__has_builtin (__is_reference)
>  # error "__has_builtin (__is_reference) failed"
>  #endif
> +#if !__has_builtin (__is_function)
> +# error "__has_builtin (__is_function) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_function.C 
> b/gcc/testsuite/g++.dg/ext/is_function.C
> new file mode 100644
> index 000..2e1594b12ad
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_function.C
> @@ -0,0 +1,58 @@
> +// { dg-do compile { target c++11 } }
> +
> +#include 
> +
> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
> +  SA(TRAIT(TYPE) == EXPECT);   \
> +  SA(TRAIT(const TYPE) == EXPECT); \
> +  SA(TRAIT(volatile TYPE) == EXPECT);  \
> +  SA(TRAIT(const volatile TYPE) == EXPECT)
> +
> +struct A
> +{ void fn(); };
> +
> +template
> +struct AHolder { };
> +
> +template
> +struct AHolder
> +{ using type = U; };
> +
> +// Positive tests.
> +SA(__is_function(int (int)));
> +SA(__is_function(ClassType (ClassType)));
> +SA(__is_function(float (int, float, int[], int&)));
> +SA(__is_function(int (int, ...)));
> +SA(__is_function(bool (ClassType) const));
> +SA(__is_function(AHolder::type));
> +
> +void fn();
> +SA(__is_function(decltype(fn)));
> +
> +// Negative tests.
> +SA_TEST_CATEGORY(__is_function, int, false);
> +SA_TEST_CATEGORY(__is_function, int*, false);
> +SA_TEST_CATEGORY(__is_function, int&, false);
> +SA_TEST_CATEGORY(__

[PATCH v10 4/5] libstdc++: Use new built-in trait __is_function for std::is_function

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch gets std::is_function to dispatch to new built-in trait
__is_function.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_function): Use __is_function built-in
trait.
(is_function_v): Likewise. Optimize its implementation.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 2a14df7e5f9..7ef50a2e64f 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -594,6 +594,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { };
 
   /// is_function
+#if __has_builtin(__is_function)
+  template
+struct is_function
+: public __bool_constant<__is_function(_Tp)>
+{ };
+#else
   template
 struct is_function
 : public __bool_constant::value> { };
@@ -605,6 +611,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_function<_Tp&&>
 : public false_type { };
+#endif
 
 #define __cpp_lib_is_null_pointer 201309L
 
@@ -3197,8 +3204,18 @@ template 
   inline constexpr bool is_union_v = __is_union(_Tp);
 template 
   inline constexpr bool is_class_v = __is_class(_Tp);
+
+#if __has_builtin(__is_function)
 template 
-  inline constexpr bool is_function_v = is_function<_Tp>::value;
+  inline constexpr bool is_function_v = __is_function(_Tp);
+#else
+template 
+  inline constexpr bool is_function_v = !is_const_v;
+template 
+  inline constexpr bool is_function_v<_Tp&> = false;
+template 
+  inline constexpr bool is_function_v<_Tp&&> = false;
+#endif
 
 #if __has_builtin(__is_reference)
 template 
-- 
2.41.0



[PATCH v10 3/5] c++: Implement __is_function built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_function.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_function.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_FUNCTION.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_function.
* g++.dg/ext/is_function.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 ++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
 gcc/testsuite/g++.dg/ext/is_function.C   | 58 
 5 files changed, 69 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_function.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index f6951ee2670..927605c6cb7 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3754,6 +3754,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_FUNCTION:
+  inform (loc, "  %qT is not a function", t1);
+  break;
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 1e3310cd682..3cd3babc242 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -83,6 +83,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_FUNCTION, "__is_function", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 2f37bc353a1..b976633645a 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12072,6 +12072,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_ENUM:
   return type_code1 == ENUMERAL_TYPE;
 
+case CPTK_IS_FUNCTION:
+  return type_code1 == FUNCTION_TYPE;
+
 case CPTK_IS_FINAL:
   return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
 
@@ -12293,6 +12296,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
 case CPTK_IS_REFERENCE:
+case CPTK_IS_FUNCTION:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index b697673790c..90eb00ebf2d 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -149,3 +149,6 @@
 #if !__has_builtin (__is_reference)
 # error "__has_builtin (__is_reference) failed"
 #endif
+#if !__has_builtin (__is_function)
+# error "__has_builtin (__is_function) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_function.C 
b/gcc/testsuite/g++.dg/ext/is_function.C
new file mode 100644
index 000..2e1594b12ad
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_function.C
@@ -0,0 +1,58 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+struct A
+{ void fn(); };
+
+template
+struct AHolder { };
+
+template
+struct AHolder
+{ using type = U; };
+
+// Positive tests.
+SA(__is_function(int (int)));
+SA(__is_function(ClassType (ClassType)));
+SA(__is_function(float (int, float, int[], int&)));
+SA(__is_function(int (int, ...)));
+SA(__is_function(bool (ClassType) const));
+SA(__is_function(AHolder::type));
+
+void fn();
+SA(__is_function(decltype(fn)));
+
+// Negative tests.
+SA_TEST_CATEGORY(__is_function, int, false);
+SA_TEST_CATEGORY(__is_function, int*, false);
+SA_TEST_CATEGORY(__is_function, int&, false);
+SA_TEST_CATEGORY(__is_function, void, false);
+SA_TEST_CATEGORY(__is_function, void*, false);
+SA_TEST_CATEGORY(__is_function, void**, false);
+SA_TEST_CATEGORY(__is_function, std::nullptr_t, false);
+
+SA_TEST_CATEGORY(__is_function, AbstractClass, false);
+SA(!__is_function(int(&)(int)));
+SA(!__is_function(int(*)(int)));
+
+SA_TEST_CATEGORY(__is_function, A, false);
+SA_TEST_CATEGORY(__is_function, decltype(&A::fn), false);
+
+struct FnCallOverload
+{ void operator()(); };
+SA_TEST_CATEGORY(__is_function, FnCallOverload, false);
+
+// Sanity check.
+SA_TEST_CATEGORY(__is_

[PATCH v10 1/5] c++: Implement __is_reference built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_reference.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_reference.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_REFERENCE.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_reference.
* g++.dg/ext/is_reference.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_reference.C  | 34 
 5 files changed, 45 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_reference.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..f6951ee2670 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3705,6 +3705,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_HAS_VIRTUAL_DESTRUCTOR:
   inform (loc, "  %qT does not have a virtual destructor", t1);
   break;
+case CPTK_IS_REFERENCE:
+  inform (loc, "  %qT is not a reference", t1);
+  break;
 case CPTK_IS_ABSTRACT:
   inform (loc, "  %qT is not an abstract class", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..1e3310cd682 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -67,6 +67,7 @@ DEFTRAIT_EXPR (IS_CONVERTIBLE, "__is_convertible", 2)
 DEFTRAIT_EXPR (IS_EMPTY, "__is_empty", 1)
 DEFTRAIT_EXPR (IS_ENUM, "__is_enum", 1)
 DEFTRAIT_EXPR (IS_FINAL, "__is_final", 1)
+DEFTRAIT_EXPR (IS_REFERENCE, "__is_reference", 1)
 DEFTRAIT_EXPR (IS_LAYOUT_COMPATIBLE, "__is_layout_compatible", 2)
 DEFTRAIT_EXPR (IS_LITERAL_TYPE, "__is_literal_type", 1)
 DEFTRAIT_EXPR (IS_NOTHROW_ASSIGNABLE, "__is_nothrow_assignable", 2)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index a2e74a5d2c7..2f37bc353a1 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12075,6 +12075,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_FINAL:
   return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
 
+case CPTK_IS_REFERENCE:
+  return type_code1 == REFERENCE_TYPE;
+
 case CPTK_IS_LAYOUT_COMPATIBLE:
   return layout_compatible_type_p (type1, type2);
 
@@ -12289,6 +12292,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_REFERENCE:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..b697673790c 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_reference)
+# error "__has_builtin (__is_reference) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_reference.C 
b/gcc/testsuite/g++.dg/ext/is_reference.C
new file mode 100644
index 000..b5ce4db7afd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_reference.C
@@ -0,0 +1,34 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+// Positive tests.
+SA_TEST_CATEGORY(__is_reference, int&, true);
+SA_TEST_CATEGORY(__is_reference, ClassType&, true);
+SA(__is_reference(int(&)(int)));
+SA_TEST_CATEGORY(__is_reference, int&&, true);
+SA_TEST_CATEGORY(__is_reference, ClassType&&, true);
+SA(__is_reference(int(&&)(int)));
+SA_TEST_CATEGORY(__is_reference, IncompleteClass&, true);
+
+// Negative tests
+SA_TEST_CATEGORY(__is_reference, void, false);
+SA_TEST_CATEGORY(__is_reference, int*, false);
+SA_TEST_CATEGORY(__is_reference, int[3], false);
+SA(!__is_reference(int(int)));
+SA(!__is_reference(int(*const)(int)));
+SA(!__is_reference(int(*volatile)(int)));
+SA(!__is_reference(int(*const volatile)(int)));
+
+// Sanity check.
+SA_TEST_CATEGORY(__is_reference, ClassType, false);
+SA_TEST_CATEGORY(__is_reference, IncompleteClass, false);
-- 
2.41.0



[PATCH v10 2/5] libstdc++: Use new built-in trait __is_reference for std::is_reference

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch gets std::is_reference to dispatch to new built-in trait
__is_reference.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_reference): Use __is_reference built-in
trait.
(is_reference_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..2a14df7e5f9 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -639,6 +639,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Composite type categories.
 
   /// is_reference
+#if __has_builtin(__is_reference)
+  template
+struct is_reference
+: public __bool_constant<__is_reference(_Tp)>
+{ };
+#else
   template
 struct is_reference
 : public false_type
@@ -653,6 +659,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_reference<_Tp&&>
 : public true_type
 { };
+#endif
 
   /// is_arithmetic
   template
@@ -3192,12 +3199,19 @@ template 
   inline constexpr bool is_class_v = __is_class(_Tp);
 template 
   inline constexpr bool is_function_v = is_function<_Tp>::value;
+
+#if __has_builtin(__is_reference)
+template 
+  inline constexpr bool is_reference_v = __is_reference(_Tp);
+#else
 template 
   inline constexpr bool is_reference_v = false;
 template 
   inline constexpr bool is_reference_v<_Tp&> = true;
 template 
   inline constexpr bool is_reference_v<_Tp&&> = true;
+#endif
+
 template 
   inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
 template 
-- 
2.41.0



[PATCH v10 0/5] c++, libstdc++: Make std::is_object dispatch to new built-in traits

2023-07-12 Thread Ken Matsui via Gcc-patches
Hi,

This patch series gets std::is_object to dispatch to built-in traits and
implements the following built-in traits, on which std::is_object depends.

* __is_reference
* __is_function

std::is_object was depending on them with disjunction and negation.

__not_<__or_<is_function<_Tp>, is_reference<_Tp>, is_void<_Tp>>>::type

Therefore, this patch uses them directly instead of implementing an additional
built-in trait __is_object, which would make the compiler slightly bigger and
slower.

__bool_constant<!(__is_function(_Tp) || __is_reference(_Tp) || is_void<_Tp>::value)>

This would instantiate only __bool_constant<true> and __bool_constant<false>,
which can be mostly shared. That is, the purpose of built-in traits is
considered achieved.
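For readers without the patched compiler, the composition can be sketched with the standard traits in place of the new built-ins. This is a minimal sketch under that assumption; `sketch_is_object` is a hypothetical name, and `std::is_function` / `std::is_reference` stand in for `__is_function` / `__is_reference`.

```cpp
#include <type_traits>

// Hypothetical stand-in for the new is_object definition: a single boolean
// expression instead of the nested __not_<__or_<...>> helper templates.
template <typename T>
struct sketch_is_object
  : std::integral_constant<bool, !(std::is_function<T>::value
                                   || std::is_reference<T>::value
                                   || std::is_void<T>::value)> {};

static_assert(sketch_is_object<int>::value, "scalar is an object type");
static_assert(sketch_is_object<int*>::value, "pointer is an object type");
static_assert(!sketch_is_object<int&>::value, "reference is not");
static_assert(!sketch_is_object<void>::value, "void is not");
static_assert(!sketch_is_object<int(int)>::value, "function is not");
```

Every specialization derives from one of only two integral_constant instantiations, which is the sharing the paragraph above refers to.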

Changes in v8

* Dropped __is_void built-in implementation since it is optimal.
* Optimized is_function_v

Ken Matsui (5):
  c++: Implement __is_reference built-in trait
  libstdc++: Use new built-in trait __is_reference for std::is_reference
  c++: Implement __is_function built-in trait
  libstdc++: Use new built-in trait __is_function for std::is_function
  libstdc++: Make std::is_object dispatch to new built-in traits

 gcc/cp/constraint.cc |  6 +++
 gcc/cp/cp-trait.def  |  2 +
 gcc/cp/semantics.cc  |  8 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  6 +++
 gcc/testsuite/g++.dg/ext/is_function.C   | 58 
 gcc/testsuite/g++.dg/ext/is_reference.C  | 34 ++
 libstdc++-v3/include/std/type_traits | 50 +++-
 7 files changed, 163 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_function.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_reference.C

-- 
2.41.0



Re: [PATCH] RISC-V: Throw compilation error for unknown sub-extension or supervisor extension

2023-07-12 Thread Kito Cheng via Gcc-patches
That was intentional: binutils sometimes supports an extension before
the compiler does, so GCC just passes the unknown extension through and
lets binutils reject it.

But recently I have also been considering rejecting those extensions,
or adding more checks, on the GCC side, because accepting unknown
extensions can cause problems with the architecture extension test
macro[1]: its value becomes unreliable if the extension version info
isn't in GCC yet.

So I am OK with this change, but I have two minor comments:

---
> riscv64-unknown-elf-gcc: error: '-march=rv64gcv_zvl128_s123': extension 'zvl' 
> starts with `z` but is not a standard sub-extension

I would prefer to say `unsupported standard extension` rather than
`not a standard sub-extension`.

Some extensions have only just been ratified and GCC doesn't support
them yet, so `not a standard sub-extension` might be confusing IMO.
As for why `extension` rather than `sub-extension`: IIRC `sub-extension`
was used as an official term long ago, but it is called a standard
extension now[2].


> riscv64-unknown-elf-gcc: error: '-march=rv64gcv_zvl128_s123': extension 
> 's123' start with `s` but not a standard supervisor extension
`is` is missing.



Also I would like to reject unknown single-letter extensions and `x`
extensions too, for the same reason as the other two: accepting them
makes the architecture extension test macro less useful.
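As a rough sketch of the kind of check being discussed (not GCC's actual -march parser), one could classify an extension name by its prefix and reject anything not in a table of known extensions. Everything here is hypothetical: the function names, the three-entry table, and the message wording other than `unsupported standard extension`.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical table of extensions this imaginary compiler knows about.
inline const std::set<std::string>& known_extensions() {
  static const std::set<std::string> known = {"zba", "zbb", "svinval"};
  return known;
}

// Returns an empty string when the extension is accepted, otherwise a
// diagnostic chosen from the name's prefix letter.
inline std::string diagnose_extension(const std::string& ext) {
  if (ext.empty())
    return "empty extension name";
  if (known_extensions().count(ext))
    return "";
  switch (ext[0]) {
    case 'z': return "unsupported standard extension '" + ext + "'";
    case 's': return "unsupported standard supervisor extension '" + ext + "'";
    case 'x': return "unsupported non-standard extension '" + ext + "'";
    default:  return "unsupported extension '" + ext + "'";
  }
}
```

Rejecting at this point means a feature-test macro is only ever defined for an extension whose version the compiler actually knows.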


[1] 
https://github.com/riscv-non-isa/riscv-c-api-doc/blob/master/riscv-c-api.md#architecture-extension-test-macro
[2] 
https://github.com/riscv/riscv-isa-manual/blob/main/src/naming.adoc#additional-standard-extension-names


[PATCH v2 2/2] libstdc++: Use new built-in trait __is_signed

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch lets libstdc++ use new built-in trait __is_signed.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_signed): Use __is_signed built-in trait.
(is_signed_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..23ab5a4b1e5 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -865,6 +865,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public __bool_constant<__is_abstract(_Tp)>
 { };
 
+  /// is_signed
+#if __has_builtin(__is_signed)
+  template
+struct is_signed
+: public __bool_constant<__is_signed(_Tp)>
+{ };
+#else
   /// @cond undocumented
   template::value>
@@ -877,11 +884,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { };
   /// @endcond
 
-  /// is_signed
   template
 struct is_signed
 : public __is_signed_helper<_Tp>::type
 { };
+#endif
 
   /// is_unsigned
   template
@@ -3240,8 +3247,14 @@ template 
 template 
   inline constexpr bool is_final_v = __is_final(_Tp);
 
+#if __has_builtin(__is_signed)
+template 
+  inline constexpr bool is_signed_v = __is_signed(_Tp);
+#else
 template 
   inline constexpr bool is_signed_v = is_signed<_Tp>::value;
+#endif
+
 template 
   inline constexpr bool is_unsigned_v = is_unsigned<_Tp>::value;
 
-- 
2.41.0



[PATCH v2 1/2] c++, libstdc++: Implement __is_signed built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_signed.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_signed.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_SIGNED.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_signed.
* g++.dg/ext/is_signed.C: New test.
* g++.dg/tm/pr46567.C (__is_signed): Rename to ...
(__is_signed_type): ... this.

libstdc++-v3/ChangeLog:

* include/ext/numeric_traits.h (__is_signed): Rename to ...
(__is_signed_type): ... this.
* include/bits/charconv.h: Use __is_signed_type instead.
* include/bits/locale_facets.tcc: Likewise.
* include/bits/uniform_int_dist.h: Likewise.

Signed-off-by: Ken Matsui 
---
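For comparison with the built-in, which checks that the type is arithmetic and has a signed representation, here is a minimal library-only sketch of the same property. `sketch_is_signed` is a hypothetical name; it uses the common `T(-1) < T(0)` test and is not the libstdc++ source.

```cpp
#include <type_traits>

// Primary template: non-arithmetic types are never signed.
template <typename T, bool = std::is_arithmetic<T>::value>
struct sketch_is_signed : std::false_type {};

// Arithmetic types: signed exactly when -1 converted to T is below zero.
template <typename T>
struct sketch_is_signed<T, true>
  : std::integral_constant<bool, (T(-1) < T(0))> {};

static_assert(sketch_is_signed<int>::value, "int is signed");
static_assert(sketch_is_signed<double>::value, "floating point is signed");
static_assert(!sketch_is_signed<unsigned int>::value, "unsigned int is not");
static_assert(!sketch_is_signed<void>::value, "void is not arithmetic");
static_assert(!sketch_is_signed<int*>::value, "pointers are not arithmetic");
```

The signedness of plain `char` and `wchar_t` is implementation-defined, which is why the new test compares against `char(-1) < char(0)` rather than a fixed value.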
 gcc/cp/constraint.cc |  3 ++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
 gcc/testsuite/g++.dg/ext/is_signed.C | 47 
 gcc/testsuite/g++.dg/tm/pr46567.C| 12 ++---
 libstdc++-v3/include/bits/charconv.h |  2 +-
 libstdc++-v3/include/bits/locale_facets.tcc  |  6 +--
 libstdc++-v3/include/bits/uniform_int_dist.h |  4 +-
 libstdc++-v3/include/ext/numeric_traits.h| 18 
 10 files changed, 79 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_signed.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..73fcbfe39e8 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_SIGNED:
+  inform (loc, "  %qT is not a signed type", t1);
+  break;
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..576d5528d05 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_SIGNED, "__is_signed", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..17aad992f96 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
+case CPTK_IS_SIGNED:
+  return ARITHMETIC_TYPE_P (type1) && TYPE_SIGN (type1) == SIGNED;
+
 case CPTK_IS_ASSIGNABLE:
   return is_xible (MODIFY_EXPR, type1, type2);
 
@@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_SIGNED:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..a43202d0d59 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_signed)
+# error "__has_builtin (__is_signed) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_signed.C 
b/gcc/testsuite/g++.dg/ext/is_signed.C
new file mode 100644
index 000..a04b548105d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_signed.C
@@ -0,0 +1,47 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+#define SA_TEST_CATEGORY(TRAIT, X, expect) \
+  SA(TRAIT(X) == expect);  \
+  SA(TRAIT(const X) == expect);\
+  SA(TRAIT(volatile X) == expect); \
+  SA(TRAIT(const volatile X) == expect)
+
+SA_TEST_CATEGORY(__is_signed, void, false);
+
+SA_TEST_CATEGORY(__is_signed, bool, bool(-1) < bool(0));
+SA_TEST_CATEGORY(__is_signed, char, char(-1) < char(0));
+SA_TEST_CATEGORY(__is_signed, signed char, true);
+SA_TEST_CATEGORY(__is_signed, unsigned char, false);
+SA_TEST_CATEGORY(__is_signed, wchar_t, wchar_t(-1) < wchar_t(0));
+SA_TEST_CATEGORY(__is_signed, short, true);
+SA_TEST_CATEGORY(__is_signed, unsigned short, false);
+SA_TEST_CATEGORY(__is_signed, int, true);
+SA_TEST_CATEGORY(__is_signed, unsigned int, fa

Re: [PATCH 1/2] c++, libstdc++: implement __is_signed built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
On Wed, Jul 12, 2023 at 3:20 AM Jonathan Wakely  wrote:
>
> On Sun, 9 Jul 2023 at 09:50, Ken Matsui via Libstdc++
>  wrote:
> >
> > This patch implements built-in trait for std::is_signed.
> >
> > gcc/cp/ChangeLog:
> >
> > * cp-trait.def: Define __is_signed.
> > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_SIGNED.
> > * semantics.cc (trait_expr_value): Likewise.
> > (finish_trait_expr): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.dg/ext/has-builtin-1.C: Test existence of __is_signed.
> > * g++.dg/ext/is_signed.C: New test.
> > * g++.dg/tm/pr46567.C (__is_signed): Rename to ...
> > (is_signed): ... this.
> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/ext/numeric_traits.h (__is_signed): Rename to ...
> > (is_signed): ... this.
>
> Again, please do not use four underscores.
>
> This data member of __numeric_traits_integer could be __signed or
> __is_signed_integer. I think I prefer __signed here, since the
> "integer" part is redundant with __numeric_traits_integer.
>

Thank you for your review. It appears that __signed is a keyword. I
will choose __is_signed_type since we also have __is_signed for
__numeric_traits_floating.
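
For readers following the thread, here is a minimal sketch of the value the trait under review is expected to produce, written against the standard std::is_signed only. The name signed_by_definition is hypothetical and is not part of the patch; it just spells out "arithmetic, and T(-1) < T(0)", which is also what the new test checks for char and wchar_t.

```cpp
#include <type_traits>

// Hypothetical illustration, not code from the patch: the value that
// std::is_signed<T> is specified to have.  Non-arithmetic types are never
// signed; arithmetic types are signed when T(-1) < T(0).
template <typename T, bool = std::is_arithmetic<T>::value>
struct signed_by_definition
  : std::integral_constant<bool, (T(-1) < T(0))> {};

// Non-arithmetic types (void, pointers, classes, ...) take this branch,
// so T(-1) is never formed for them.
template <typename T>
struct signed_by_definition<T, false> : std::false_type {};

static_assert(signed_by_definition<int>::value, "int is signed");
static_assert(!signed_by_definition<unsigned>::value, "unsigned is not signed");
static_assert(signed_by_definition<double>::value, "floating types are signed");
static_assert(!signed_by_definition<void>::value, "void is not arithmetic");
// The signedness of plain char is implementation-defined, so compare against
// the library trait instead of hard-coding an answer.
static_assert(signed_by_definition<char>::value == std::is_signed<char>::value,
              "matches std::is_signed for char");
```

The built-in answers the same question inside the compiler, so no class template has to be instantiated per query.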

>
>
> > * include/bits/charconv.h: Use is_signed instead.
> > * include/bits/locale_facets.tcc: Likewise.
> > * include/bits/uniform_int_dist.h: Likewise.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  gcc/cp/constraint.cc |  3 ++
> >  gcc/cp/cp-trait.def  |  1 +
> >  gcc/cp/semantics.cc  |  4 ++
> >  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
> >  gcc/testsuite/g++.dg/ext/is_signed.C | 47 
> >  gcc/testsuite/g++.dg/tm/pr46567.C| 12 ++---
> >  libstdc++-v3/include/bits/charconv.h |  2 +-
> >  libstdc++-v3/include/bits/locale_facets.tcc  |  6 +--
> >  libstdc++-v3/include/bits/uniform_int_dist.h |  4 +-
> >  libstdc++-v3/include/ext/numeric_traits.h| 18 
> >  10 files changed, 79 insertions(+), 21 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/ext/is_signed.C
> >
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 8cf0f2d0974..73fcbfe39e8 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
> >  case CPTK_IS_UNION:
> >inform (loc, "  %qT is not a union", t1);
> >break;
> > +case CPTK_IS_SIGNED:
> > +  inform (loc, "  %qT is not a signed type", t1);
> > +  break;
> >  case CPTK_IS_AGGREGATE:
> >inform (loc, "  %qT is not an aggregate", t1);
> >break;
> > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> > index 8b7fece0cc8..576d5528d05 100644
> > --- a/gcc/cp/cp-trait.def
> > +++ b/gcc/cp/cp-trait.def
> > @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> > "__is_trivially_assignable", 2)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> > -1)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
> >  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> > +DEFTRAIT_EXPR (IS_SIGNED, "__is_signed", 1)
> >  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> > "__reference_constructs_from_temporary", 2)
> >  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> > "__reference_converts_from_temporary", 2)
> >  /* FIXME Added space to avoid direct usage in GCC 13.  */
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 8fb47fd179e..17aad992f96 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> > tree type2)
> >  case CPTK_IS_UNION:
> >return type_code1 == UNION_TYPE;
> >
> > +case CPTK_IS_SIGNED:
> > +  return ARITHMETIC_TYPE_P (type1) && TYPE_SIGN (type1) == SIGNED;
> > +
> >  case CPTK_IS_ASSIGNABLE:
> >return is_xible (MODIFY_EXPR, type1, type2);
> >
> > @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> > kind, tree type1, tree type2)
> >  case CPTK_IS_ENUM:
> >  case CPTK_IS_UNION:
> >  case CPTK_IS_SAME:
> > +case CPTK_IS_SIGNED:
> >break;
> >
> >  case CPTK_IS_LAYOUT_COMPATIBLE:
> > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > index f343e153e56..a43202d0d59 100644
> > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > @@ -146,3 +146,6 @@
> >  #if !__has_builtin (__remove_cvref)
> >  # error "__has_builtin (__remove_cvref) failed"
> >  #endif
> > +#if !__has_builtin (__is_signed)
> > +# error "__has_builtin (__is_signed) failed"
> > +#endif
> > diff --git a/gcc/testsuite/g++.dg/ext/is_signed.C 
> > b/gcc/testsuite/g++.dg/ext/is_signed.C
> > new file mode 100644
> > index 000..a04b5481

Re: [PATCH v3 1/2] c++, libstdc++: Implement __is_pointer built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
Also, here is the Kanban board for our GSoC project, which might be
useful for you to manage non-reviewed patches.

https://github.com/users/ken-matsui/projects/1/views/1

On Wed, Jul 12, 2023 at 6:13 PM Ken Matsui  wrote:
>
> Hi,
>
> Here is the updated benchmark result for is_pointer:
>
> https://github.com/ken-matsui/gcc-benches/blob/main/is_pointer.md#wed-jul-12-055654-pm-pdt-2023
>
> Time: -2.79488%
> Peak Memory Usage: -2.39379%
> Total Memory Usage: -3.39559%
>
> Sincerely,
> Ken Matsui
>
> On Wed, Jul 12, 2023 at 6:12 PM Ken Matsui  wrote:
> >
> > This patch implements built-in trait for std::is_pointer.
> >
> > gcc/cp/ChangeLog:
> >
> > * cp-trait.def: Define __is_pointer.
> > * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_POINTER.
> > * semantics.cc (trait_expr_value): Likewise.
> > (finish_trait_expr): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.dg/ext/has-builtin-1.C: Test existence of __is_pointer.
> > * g++.dg/ext/is_pointer.C: New test.
> > * g++.dg/tm/pr46567.C (__is_pointer): Rename to ...
> > (__is_ptr): ... this.
> > * g++.dg/torture/20070621-1.C: Likewise.
> > * g++.dg/torture/pr57107.C: Likewise.
> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/bits/cpp_type_traits.h (__is_pointer): Rename to ...
> > (__is_ptr): ... this.
> > * include/bits/deque.tcc: Use __is_ptr instead.
> > * include/bits/stl_algobase.h: Likewise.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  gcc/cp/constraint.cc|  3 ++
> >  gcc/cp/cp-trait.def |  1 +
> >  gcc/cp/semantics.cc |  4 ++
> >  gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
> >  gcc/testsuite/g++.dg/ext/is_pointer.C   | 51 +
> >  gcc/testsuite/g++.dg/tm/pr46567.C   | 22 -
> >  gcc/testsuite/g++.dg/torture/20070621-1.C   |  4 +-
> >  gcc/testsuite/g++.dg/torture/pr57107.C  |  4 +-
> >  libstdc++-v3/include/bits/cpp_type_traits.h |  6 +--
> >  libstdc++-v3/include/bits/deque.tcc |  6 +--
> >  libstdc++-v3/include/bits/stl_algobase.h|  6 +--
> >  11 files changed, 86 insertions(+), 24 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/ext/is_pointer.C
> >
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 8cf0f2d0974..30266204eb5 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
> >  case CPTK_IS_UNION:
> >inform (loc, "  %qT is not a union", t1);
> >break;
> > +case CPTK_IS_POINTER:
> > +  inform (loc, "  %qT is not a pointer", t1);
> > +  break;
> >  case CPTK_IS_AGGREGATE:
> >inform (loc, "  %qT is not an aggregate", t1);
> >break;
> > diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> > index 8b7fece0cc8..b7c263e9a77 100644
> > --- a/gcc/cp/cp-trait.def
> > +++ b/gcc/cp/cp-trait.def
> > @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> > "__is_trivially_assignable", 2)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> > -1)
> >  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
> >  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> > +DEFTRAIT_EXPR (IS_POINTER, "__is_pointer", 1)
> >  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> > "__reference_constructs_from_temporary", 2)
> >  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> > "__reference_converts_from_temporary", 2)
> >  /* FIXME Added space to avoid direct usage in GCC 13.  */
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 8fb47fd179e..68f8a4fe85b 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> > tree type2)
> >  case CPTK_IS_UNION:
> >return type_code1 == UNION_TYPE;
> >
> > +case CPTK_IS_POINTER:
> > +  return TYPE_PTR_P (type1);
> > +
> >  case CPTK_IS_ASSIGNABLE:
> >return is_xible (MODIFY_EXPR, type1, type2);
> >
> > @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> > kind, tree type1, tree type2)
> >  case CPTK_IS_ENUM:
> >  case CPTK_IS_UNION:
> >  case CPTK_IS_SAME:
> > +case CPTK_IS_POINTER:
> >break;
> >
> >  case CPTK_IS_LAYOUT_COMPATIBLE:
> > diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> > b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > index f343e153e56..9dace5cbd48 100644
> > --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> > @@ -146,3 +146,6 @@
> >  #if !__has_builtin (__remove_cvref)
> >  # error "__has_builtin (__remove_cvref) failed"
> >  #endif
> > +#if !__has_builtin (__is_pointer)
> > +# error "__has_builtin (__is_pointer) failed"
> > +#endif
> > diff --git a/gcc/testsuite/g++.dg/ext/is_pointer.C 
> > b/gcc/testsuite/g++.dg/e

[PATCH v3 2/2] libstdc++: Use new built-in trait __is_pointer

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch lets libstdc++ use the new built-in trait __is_pointer.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_ptr): Use __is_pointer
built-in trait.
* include/std/type_traits (is_pointer): Likewise.
(is_pointer_v): Likewise. Optimize its implementation.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/bits/cpp_type_traits.h |  8 +++
 libstdc++-v3/include/std/type_traits| 25 +++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/cpp_type_traits.h 
b/libstdc++-v3/include/bits/cpp_type_traits.h
index 3711e4be526..4da1e7c407c 100644
--- a/libstdc++-v3/include/bits/cpp_type_traits.h
+++ b/libstdc++-v3/include/bits/cpp_type_traits.h
@@ -363,6 +363,13 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   //
   // Pointer types
   //
+#if __has_builtin(__is_pointer)
+  template<typename _Tp>
+    struct __is_ptr : __truth_type<__is_pointer(_Tp)>
+    {
+      enum { __value = __is_pointer(_Tp) };
+    };
+#else
   template
 struct __is_ptr
 {
@@ -376,6 +383,7 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   enum { __value = 1 };
   typedef __true_type __type;
 };
+#endif
 
   //
   // An arithmetic type is an integer type or a floating point type
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 0e7a9c9c7f3..181a50e48d0 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -515,6 +515,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_array<_Tp[]>
 : public true_type { };
 
+  /// is_pointer
+#if __has_builtin(__is_pointer)
+  template<typename _Tp>
+    struct is_pointer
+    : public __bool_constant<__is_pointer(_Tp)>
+    { };
+#else
   template
 struct __is_pointer_helper
 : public false_type { };
@@ -523,11 +530,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __is_pointer_helper<_Tp*>
 : public true_type { };
 
-  /// is_pointer
   template
 struct is_pointer
 : public __is_pointer_helper<__remove_cv_t<_Tp>>::type
 { };
+#endif
 
   /// is_lvalue_reference
   template
@@ -3168,8 +3175,22 @@ template 
 template 
   inline constexpr bool is_array_v<_Tp[_Num]> = true;
 
+#if __has_builtin(__is_pointer)
 template <typename _Tp>
-  inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
+  inline constexpr bool is_pointer_v = __is_pointer(_Tp);
+#else
+template <typename _Tp>
+  inline constexpr bool is_pointer_v = false;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp*> = true;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp* const> = true;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp* volatile> = true;
+template <typename _Tp>
+  inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
+#endif
+
 template 
   inline constexpr bool is_lvalue_reference_v = false;
 template 
-- 
2.41.0
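
The dispatch used in this patch (take the built-in when __has_builtin reports it, otherwise fall back to partial specializations) can be sketched outside libstdc++ as follows. This is only an illustration under stated assumptions: demo_is_pointer and DEMO_HAVE_BUILTIN_IS_POINTER are made-up names, and the sketch deliberately includes no library headers so it does not depend on how a given libstdc++ version uses the identifier __is_pointer internally.

```cpp
// Hypothetical stand-alone sketch, not the libstdc++ implementation.
// __has_builtin itself is a compiler extension, so test for it first.
#if defined(__has_builtin)
# if __has_builtin(__is_pointer)
#  define DEMO_HAVE_BUILTIN_IS_POINTER 1
# endif
#endif

#if defined(DEMO_HAVE_BUILTIN_IS_POINTER)
// Fast path: ask the compiler directly, no specializations needed.
template <typename T>
struct demo_is_pointer { static constexpr bool value = __is_pointer(T); };
#else
// Fallback: the primary template says "no", and one partial specialization
// per cv-qualification of the pointer itself says "yes".
template <typename T>
struct demo_is_pointer { static constexpr bool value = false; };
template <typename T>
struct demo_is_pointer<T*> { static constexpr bool value = true; };
template <typename T>
struct demo_is_pointer<T* const> { static constexpr bool value = true; };
template <typename T>
struct demo_is_pointer<T* volatile> { static constexpr bool value = true; };
template <typename T>
struct demo_is_pointer<T* const volatile> { static constexpr bool value = true; };
#endif

static_assert(demo_is_pointer<int*>::value, "plain pointer");
static_assert(demo_is_pointer<const int* const volatile>::value,
              "cv-qualified pointer");
static_assert(!demo_is_pointer<int>::value, "int is not a pointer");
static_assert(!demo_is_pointer<int&>::value, "a reference is not a pointer");
```

Either branch should give the same answers; the built-in path simply avoids instantiating the specializations.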



Re: [PATCH v3 1/2] c++, libstdc++: Implement __is_pointer built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
Hi,

Here is the updated benchmark result for is_pointer:

https://github.com/ken-matsui/gcc-benches/blob/main/is_pointer.md#wed-jul-12-055654-pm-pdt-2023

Time: -2.79488%
Peak Memory Usage: -2.39379%
Total Memory Usage: -3.39559%

Sincerely,
Ken Matsui

On Wed, Jul 12, 2023 at 6:12 PM Ken Matsui  wrote:
>
> This patch implements built-in trait for std::is_pointer.
>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __is_pointer.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_POINTER.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_pointer.
> * g++.dg/ext/is_pointer.C: New test.
> * g++.dg/tm/pr46567.C (__is_pointer): Rename to ...
> (__is_ptr): ... this.
> * g++.dg/torture/20070621-1.C: Likewise.
> * g++.dg/torture/pr57107.C: Likewise.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/cpp_type_traits.h (__is_pointer): Rename to ...
> (__is_ptr): ... this.
> * include/bits/deque.tcc: Use __is_ptr instead.
> * include/bits/stl_algobase.h: Likewise.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc|  3 ++
>  gcc/cp/cp-trait.def |  1 +
>  gcc/cp/semantics.cc |  4 ++
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
>  gcc/testsuite/g++.dg/ext/is_pointer.C   | 51 +
>  gcc/testsuite/g++.dg/tm/pr46567.C   | 22 -
>  gcc/testsuite/g++.dg/torture/20070621-1.C   |  4 +-
>  gcc/testsuite/g++.dg/torture/pr57107.C  |  4 +-
>  libstdc++-v3/include/bits/cpp_type_traits.h |  6 +--
>  libstdc++-v3/include/bits/deque.tcc |  6 +--
>  libstdc++-v3/include/bits/stl_algobase.h|  6 +--
>  11 files changed, 86 insertions(+), 24 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_pointer.C
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 8cf0f2d0974..30266204eb5 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_UNION:
>inform (loc, "  %qT is not a union", t1);
>break;
> +case CPTK_IS_POINTER:
> +  inform (loc, "  %qT is not a pointer", t1);
> +  break;
>  case CPTK_IS_AGGREGATE:
>inform (loc, "  %qT is not an aggregate", t1);
>break;
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 8b7fece0cc8..b7c263e9a77 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
> "__is_trivially_assignable", 2)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", 
> -1)
>  DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
>  DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
> +DEFTRAIT_EXPR (IS_POINTER, "__is_pointer", 1)
>  DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
> "__reference_constructs_from_temporary", 2)
>  DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
> "__reference_converts_from_temporary", 2)
>  /* FIXME Added space to avoid direct usage in GCC 13.  */
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 8fb47fd179e..68f8a4fe85b 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> tree type2)
>  case CPTK_IS_UNION:
>return type_code1 == UNION_TYPE;
>
> +case CPTK_IS_POINTER:
> +  return TYPE_PTR_P (type1);
> +
>  case CPTK_IS_ASSIGNABLE:
>return is_xible (MODIFY_EXPR, type1, type2);
>
> @@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>  case CPTK_IS_ENUM:
>  case CPTK_IS_UNION:
>  case CPTK_IS_SAME:
> +case CPTK_IS_POINTER:
>break;
>
>  case CPTK_IS_LAYOUT_COMPATIBLE:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index f343e153e56..9dace5cbd48 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -146,3 +146,6 @@
>  #if !__has_builtin (__remove_cvref)
>  # error "__has_builtin (__remove_cvref) failed"
>  #endif
> +#if !__has_builtin (__is_pointer)
> +# error "__has_builtin (__is_pointer) failed"
> +#endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_pointer.C 
> b/gcc/testsuite/g++.dg/ext/is_pointer.C
> new file mode 100644
> index 000..d6e39565950
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_pointer.C
> @@ -0,0 +1,51 @@
> +// { dg-do compile { target c++11 } }
> +
> +#define SA(X) static_assert((X),#X)
> +
> +SA(!__is_pointer(int));
> +SA(__is_pointer(int*));
> +SA(__is_pointer(int**));
> +
> +SA(__is_pointer(const int*));
> +SA(__is_pointer(const int**));
> +SA(__is_pointer(int* const));
> +SA(__is_pointer(int** const));
> +SA(_

RE: Reply: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-----Original Message-----
From: Gcc-patches On Behalf Of Jeff Law via Gcc-patches
Sent: Thursday, July 13, 2023 5:49 AM
To: 钟居哲; gcc-patches
Cc: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: Reply: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization



On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store, instead, I will support strided 
> load/store in vectorizer later.
> 
> Ok for trunk?
Assuming this removes the strided loads/stores while we figure out the 
best way to support them, OK for the trunk.  The formatting is so messed 
up that it's nearly impossible to read.



Note for the future, if you hit the message size limit, go ahead and 
gzip the patch.  That's better than forwarding from a failed message, as 
the latter mucks up indentation so badly that the result is unreadable.

Jeff


[PATCH v3 1/2] c++, libstdc++: Implement __is_pointer built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
This patch implements built-in trait for std::is_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_pointer.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_POINTER.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_pointer.
* g++.dg/ext/is_pointer.C: New test.
* g++.dg/tm/pr46567.C (__is_pointer): Rename to ...
(__is_ptr): ... this.
* g++.dg/torture/20070621-1.C: Likewise.
* g++.dg/torture/pr57107.C: Likewise.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_pointer): Rename to ...
(__is_ptr): ... this.
* include/bits/deque.tcc: Use __is_ptr instead.
* include/bits/stl_algobase.h: Likewise.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc|  3 ++
 gcc/cp/cp-trait.def |  1 +
 gcc/cp/semantics.cc |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
 gcc/testsuite/g++.dg/ext/is_pointer.C   | 51 +
 gcc/testsuite/g++.dg/tm/pr46567.C   | 22 -
 gcc/testsuite/g++.dg/torture/20070621-1.C   |  4 +-
 gcc/testsuite/g++.dg/torture/pr57107.C  |  4 +-
 libstdc++-v3/include/bits/cpp_type_traits.h |  6 +--
 libstdc++-v3/include/bits/deque.tcc |  6 +--
 libstdc++-v3/include/bits/stl_algobase.h|  6 +--
 11 files changed, 86 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_pointer.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 8cf0f2d0974..30266204eb5 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3751,6 +3751,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_POINTER:
+  inform (loc, "  %qT is not a pointer", t1);
+  break;
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 8b7fece0cc8..b7c263e9a77 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_POINTER, "__is_pointer", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 /* FIXME Added space to avoid direct usage in GCC 13.  */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..68f8a4fe85b 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12118,6 +12118,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
+case CPTK_IS_POINTER:
+  return TYPE_PTR_P (type1);
+
 case CPTK_IS_ASSIGNABLE:
   return is_xible (MODIFY_EXPR, type1, type2);
 
@@ -12296,6 +12299,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ENUM:
 case CPTK_IS_UNION:
 case CPTK_IS_SAME:
+case CPTK_IS_POINTER:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index f343e153e56..9dace5cbd48 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -146,3 +146,6 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__is_pointer)
+# error "__has_builtin (__is_pointer) failed"
+#endif
diff --git a/gcc/testsuite/g++.dg/ext/is_pointer.C 
b/gcc/testsuite/g++.dg/ext/is_pointer.C
new file mode 100644
index 000..d6e39565950
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_pointer.C
@@ -0,0 +1,51 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+SA(!__is_pointer(int));
+SA(__is_pointer(int*));
+SA(__is_pointer(int**));
+
+SA(__is_pointer(const int*));
+SA(__is_pointer(const int**));
+SA(__is_pointer(int* const));
+SA(__is_pointer(int** const));
+SA(__is_pointer(int* const* const));
+
+SA(__is_pointer(volatile int*));
+SA(__is_pointer(volatile int**));
+SA(__is_pointer(int* volatile));
+SA(__is_pointer(int** volatile));
+SA(__is_pointer(int* volatile* volatile));
+
+SA(__is_pointer(const volatile int*));
+SA(__is_pointer(const volatile int**));
+SA(__is_pointer(const int* volatile));
+SA(__is_pointer(volatile int* const));
+SA(__is_pointer(int* const volatile));
+SA(__is_pointer(const int** volatile));
+SA(__is_pointer(volatile int** const));
+SA(__is_pointer(int** const volatile));
+SA(__

RE: [PATCH V2] RISC-V: Support COND_LEN_* patterns

2023-07-12 Thread Li, Pan2 via Gcc-patches
Committed, thanks Jeff.

Pan

-----Original Message-----
From: Gcc-patches On Behalf Of Jeff Law via Gcc-patches
Sent: Wednesday, July 12, 2023 11:40 PM
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; kito.ch...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH V2] RISC-V: Support COND_LEN_* patterns



On 7/12/23 09:24, Juzhe-Zhong wrote:
> This middle-end patch has been merged:
> https://github.com/gcc-mirror/gcc/commit/0d4dd7e07a879d6c07a33edb2799710faa95651e
> 
> With this patch, we can handle operations that may trap on elements outside the 
> loop.
>   
> These 2 following cases will be addressed by this patch:
>   
> 1. integer division:
>   
>#define TEST_TYPE(TYPE) \
>__attribute__((noipa)) \
>void vrem_##TYPE (TYPE * __restrict dst, TYPE * __restrict a, TYPE * 
> __restrict b, int n) \
>{ \
>  for (int i = 0; i < n; i++) \
>dst[i] = a[i] % b[i]; \
>}
>#define TEST_ALL() \
> TEST_TYPE(int8_t) \
>TEST_ALL()
>   
>Before this patch:
>   
> vrem_int8_t:
>  ble a3,zero,.L14
>  csrrt4,vlenb
>  addiw   a5,a3,-1
>  addiw   a4,t4,-1
>  sext.w  t5,a3
>  bltua5,a4,.L10
>  csrrt3,vlenb
>  subwt3,t5,t3
>  li  a5,0
>  vsetvli t6,zero,e8,m1,ta,ma
> .L4:
>  add a6,a2,a5
>  add a7,a0,a5
>  add t1,a1,a5
>  mv  a4,a5
>  add a5,a5,t4
>  vl1re8.vv2,0(a6)
>  vl1re8.vv1,0(t1)
>  sext.w  a6,a5
>  vrem.vv v1,v1,v2
>  vs1r.v  v1,0(a7)
>  bleua6,t3,.L4
>  csrra5,vlenb
>  addwa4,a4,a5
>  sext.w  a5,a4
>  beq t5,a4,.L16
> .L3:
>  csrra6,vlenb
>  subwt5,t5,a4
>  srlia6,a6,1
>  addiw   t1,t5,-1
>  addiw   a7,a6,-1
>  bltut1,a7,.L9
>  sllia4,a4,32
>  srlia4,a4,32
>  add t0,a1,a4
>  add t6,a2,a4
>  add a4,a0,a4
>  vsetvli a7,zero,e8,mf2,ta,ma
>  sext.w  t3,a6
>  vle8.v  v1,0(t0)
>  vle8.v  v2,0(t6)
>  subwt4,t5,a6
>  vrem.vv v1,v1,v2
>  vse8.v  v1,0(a4)
>  mv  t1,t3
>  bltut4,t3,.L7
>  csrrt1,vlenb
>  add a4,a4,a6
>  add t0,t0,a6
>  add t6,t6,a6
>  sext.w  t1,t1
>  vle8.v  v1,0(t0)
>  vle8.v  v2,0(t6)
>  vrem.vv v1,v1,v2
>  vse8.v  v1,0(a4)
> .L7:
>  addwa5,t1,a5
>  beq t5,t1,.L14
> .L9:
>  add a4,a1,a5
>  add a6,a2,a5
>  lb  a6,0(a6)
>  lb  a4,0(a4)
>  add a7,a0,a5
>  addia5,a5,1
>  remwa4,a4,a6
>  sext.w  a6,a5
>  sb  a4,0(a7)
>  bgt a3,a6,.L9
> .L14:
>  ret
> .L10:
>  li  a4,0
>  li  a5,0
>  j   .L3
> .L16:
>  ret
>   
> After this patch:
>   
> vrem_int8_t:
> ble a3,zero,.L5
> .L3:
> vsetvli a5,a3,e8,m1,tu,ma
> vle8.v v1,0(a1)
> vle8.v v2,0(a2)
> sub a3,a3,a5
> vrem.vv v1,v1,v2
> vse8.v v1,0(a0)
> add a1,a1,a5
> add a2,a2,a5
> add a0,a0,a5
> bne a3,zero,.L3
> .L5:
> ret
>   
> 2. Floating-point operation **WITHOUT** -ffast-math:
>   
>  #define TEST_TYPE(TYPE) \
>  __attribute__((noipa)) \
>  void vadd_##TYPE (TYPE * __restrict dst, TYPE *__restrict a, TYPE 
> *__restrict b, int n) \
>  { \
>for (int i = 0; i < n; i++) \
>  dst[i] = a[i] + b[i]; \
>  }
>   
>  #define TEST_ALL() \
>   TEST_TYPE(float) \
>   
>  TEST_ALL()
> 
> Before this patch:
> 
> vadd_float:
>  ble a3,zero,.L10
>  csrra4,vlenb
>  srlit3,a4,2
>  addiw   a5,a3,-1
>  addiw   a6,t3,-1
>  sext.w  t6,a3
>  bltua5,a6,.L7
>  subwt5,t6,t3
>  mv  t1,a1
>  mv  a7,a2
>  mv  a6,a0
>  li  a5,0
>  vsetvli t4,zero,e32,m1,ta,ma
> .L4:
>  vl1re32.v   v1,0(t1)
>  vl1re32.v   v2,0(a7)
>  addwa5,a5,t3
>  vfadd.vvv1,v1,v2
>  vs1r.v  v1,0(a6)
>  add t1,t1,a4
>  add a7,a7,a4
>  add a6,a6,a4
>  bgeut5,a5,.L4
>  beq t6,a5,.L10
>  sext.w  a5,a5
> .L3:
>  sllia4,a5,2
> .L6:
>  add a6,a1,a4
>  add a7,a2,a4
>  flw fa4,0(a6)
>  flw fa5,0(a7)
>  add a6,a0,a4
>  addiw   a5,a5,1
>  fadd.s  fa5,fa5,fa4
>  addia4,a4,4
>  fsw fa5,0(a6)
>  bgt a3,a5,.L6
> .L10:
>  ret
> .L7:
>  li  a5,0
>  j   .L3
>   
> After this patch:
>   
> vadd_float:
> ble a3,zero,.L5
> .L3:
> vsetvli a5,a3,e32,m1,tu,m

Loop-ch improvements, part 2

2023-07-12 Thread Jan Hubicka via Gcc-patches
Hi,
as discussed, this patch moves profile updating to tree-ssa-loop-ch.cc since it
is now quite ch-specific.  There are no functional changes.

Bootstrapped/regtested x86_64-linux, committed.
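
For readers less familiar with the pass, here is a source-level sketch of what loop header copying does conceptually. It is an illustration only: the pass works on GIMPLE, not on C++ source, and the function names below are made up.

```cpp
#include <cstddef>

// Before: the exit test lives in the loop header and is evaluated before
// every iteration, including the first.
int sum_while(const int* a, std::size_t n) {
  int s = 0;
  std::size_t i = 0;
  while (i < n) {
    s += a[i];
    ++i;
  }
  return s;
}

// After (conceptually): the header test is duplicated in front of the loop,
// so the loop itself becomes a do-while with the exit test at the bottom.
int sum_peeled(const int* a, std::size_t n) {
  int s = 0;
  std::size_t i = 0;
  if (i < n) {            // copied header: guards entry into the loop
    do {
      s += a[i];
      ++i;
    } while (i < n);      // original test, now at the latch
  }
  return s;
}
```

Because the copied test and the in-loop test execute with different frequencies, the profile of both copies has to be rescaled after duplication, which is the bookkeeping this patch moves into tree-ssa-loop-ch.cc.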

gcc/ChangeLog:

* tree-cfg.cc (gimple_duplicate_sese_region): Rename to ...
(gimple_duplicate_seme_region): ... this; break out profile updating
code to ...
* tree-ssa-loop-ch.cc (update_profile_after_ch): ... here.
(ch_base::copy_headers): Update.
* tree-cfg.h (gimple_duplicate_sese_region): Rename to ...
(gimple_duplicate_seme_region): ... this.

diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 7dad7b4ac72..7ccc2a5a5a7 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -6662,25 +6662,19 @@ add_phi_args_after_copy (basic_block *region_copy, 
unsigned n_region,
The function returns false if it is unable to copy the region,
true otherwise.
 
-   ELIMINATED_EDGE is an edge that is known to be removed in the dupicated
-   region.  ORIG_ELIMINATED_EDGES, if non-NULL is set of edges known to be
-   removed from the original region.  */
+   It is callers responsibility to update profile.  */
 
 bool
-gimple_duplicate_sese_region (edge entry, edge exit,
+gimple_duplicate_seme_region (edge entry, edge exit,
  basic_block *region, unsigned n_region,
  basic_block *region_copy,
- bool update_dominance,
- edge eliminated_edge,
- hash_set  *orig_eliminated_edges)
+ bool update_dominance)
 {
   unsigned i;
   bool free_region_copy = false, copying_header = false;
   class loop *loop = entry->dest->loop_father;
   edge exit_copy;
   edge redirected;
-  profile_count total_count = profile_count::uninitialized ();
-  profile_count entry_count = profile_count::uninitialized ();
 
   if (!can_copy_bbs_p (region, n_region))
 return false;
@@ -6733,144 +6727,10 @@ gimple_duplicate_sese_region (edge entry, edge exit,
  inside.  */
   auto_vec<basic_block> doms;
   if (update_dominance)
-{
-  doms = get_dominated_by_region (CDI_DOMINATORS, region, n_region);
-}
-
-  if (entry->dest->count.initialized_p ())
-{
-  total_count = entry->dest->count;
-  entry_count = entry->count ();
-  /* Fix up corner cases, to avoid division by zero or creation of negative
-frequencies.  */
-  if (entry_count > total_count)
-   entry_count = total_count;
-}
+doms = get_dominated_by_region (CDI_DOMINATORS, region, n_region);
 
   copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop,
split_edge_bb_loc (entry), update_dominance);
-  if (total_count.initialized_p () && entry_count.initialized_p ())
-{
-  if (!eliminated_edge
- && (!orig_eliminated_edges || orig_eliminated_edges->is_empty ()))
-   {
- scale_bbs_frequencies_profile_count (region, n_region,
-  total_count - entry_count,
-  total_count);
- scale_bbs_frequencies_profile_count (region_copy, n_region,
-  entry_count, total_count);
-   }
-  else
-   {
- /* We only support only case where eliminated_edge is one and it
-exists first BB.  We also assume that the duplicated region is
-acyclic.  So we expect the following:
-
-  // region_copy_start entry will be scaled to entry_count
-if (cond1) <- this condition will become false
-  and we update probabilities
-  goto loop_exit;
-if (cond2) <- this condition is loop invariant
-  goto loop_exit;
-goto loop_header   <- this will be redirected to loop.
-  // region_copy_end
-loop:
-  
-  // region start
-loop_header:
-  if (cond1)   <- we need to update probabbility here
-goto loop_exit;
-  if (cond2)   <- and determine scaling factor here.
-  moreover cond2 is now always true
-goto loop_exit;
-  else
-goto loop;
-  // region end
-
-Adding support for more exits can be done similarly,
-but only consumer so far is tree-ssa-loop-ch and it uses only this
-to handle the common case of peeling headers which have
-conditionals known to be always true upon entry.  */
- gcc_checking_assert (copying_header);
- for (unsigned int i = 0; i < n_region; i++)
-   {
- edge exit_e, exit_e_copy, e, e_copy;
- if (EDGE_COUNT (region[i]->succs) == 1)
-   {
- reg

Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread 钟居哲
I noticed vectorizable_call in the loop vectorizer.
It vectorizes calls to functions such as fmax/fmin.
From my understanding, we don't have RVV instructions for fmax/fmin?

So for now, I don't need to support builtin function call vectorization for RVV.
Am I right?

I am wondering whether we have any kind of builtin function call
vectorization that uses RVV instructions.
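
To make the question concrete, a loop of the following shape is the kind of input being asked about: every iteration is a call to a math function. This is only a sketch with a made-up function name; whether such a call is actually vectorized depends on the flags and on what the target backend provides.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical example input: each iteration calls std::fmax, so the
// vectorizer has to handle the loop as a call rather than as plain
// arithmetic.
void elementwise_max(float* dst, const float* a, const float* b,
                     std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = std::fmax(a[i], b[i]);
}
```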


Thanks.


juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-07-13 06:25
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV 
auto-vectorization
 
 
On 7/12/23 16:17, 钟居哲 wrote:
> Thanks Jeff.
> Will commit after fixing the code formatting.
> 
> I am going to support COND_FMA and reduction first (which I think 
> is higher priority),
> then come back to support strided_load/store.
Sure.  One thing to note with strided loads: they can significantly 
help x264's sad/satd loops.  So hopefully you're testing with those :-)
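
As a rough picture of the kind of kernel meant here, below is a simplified sum-of-absolute-differences over a 2-D block. It is a sketch, not x264's code: the function name and signature are assumptions. The relevant property is that consecutive rows sit a fixed stride apart in memory, so gathering the same column from several rows is a strided access.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical, simplified SAD kernel in the style of video-encoder code.
// stride_a and stride_b are the byte distances between consecutive rows.
int sad_block(const std::uint8_t* a, std::ptrdiff_t stride_a,
              const std::uint8_t* b, std::ptrdiff_t stride_b,
              int width, int height) {
  int sum = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x)
      sum += std::abs(int(a[x]) - int(b[x]));
    a += stride_a;  // step to the next row of each block
    b += stride_b;
  }
  return sum;
}
```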
 
 
 
jeff
 


Re: [pushed] c++: C++26 constexpr cast from void* [PR110344]

2023-07-12 Thread Marek Polacek via Gcc-patches
On Tue, Jun 27, 2023 at 11:29:34PM -0400, Jason Merrill via Gcc-patches wrote:
> Tested x86_64-pc-linux-gnu, applying to trunk.
> 
> -- 8< --
> 
> P2738 allows static_cast from void* to ob* in constant evaluation if the
> pointer does in fact point to an object of the appropriate type.
> cxx_fold_indirect_ref already does the work of finding such an object if it
> happens to be a subobject rather than the outermost object at that address,
> as in constexpr-voidptr2.C.

This patch seems to have broken a lot of tests when running with
GXX_TESTSUITE_STDS=98,11,14,17,20,23,26.  cpp0x/constexpr-cast2.C
probably just needs not to expect certain errors in C++26, and
cpp2a/constexpr-new*.C may need code to handle &heap (?).
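
For reference, the construct the patch makes valid in constant evaluation looks roughly like this. It is a minimal sketch with made-up names (through_void, DEMO_CONSTEXPR26); the constant-evaluation check is only enabled when the compiler advertises the C++26 value of __cpp_constexpr, which is the macro the patch itself bumps to 202306L.

```cpp
// Hypothetical illustration.  Only mark the function constexpr when the
// compiler claims C++26 constexpr support; older modes reject the cast
// from void* during constant evaluation.
#if defined(__cpp_constexpr) && __cpp_constexpr >= 202306L
# define DEMO_CONSTEXPR26 constexpr
#else
# define DEMO_CONSTEXPR26 inline
#endif

// Round-trips an int through void*.  The pointer really does point to an
// int, which is the case the C++26 rule allows.
DEMO_CONSTEXPR26 int through_void(int v) {
  int i = v;
  void* p = &i;
  return *static_cast<int*>(p);
}

#if defined(__cpp_constexpr) && __cpp_constexpr >= 202306L
static_assert(through_void(42) == 42, "usable in a constant expression");
#endif
```

In earlier language modes the same cast is still rejected in constant evaluation, which is presumably why tests such as cpp0x/constexpr-cast2.C now see different diagnostics under -std=c++26.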
 
>   P2738
>   PR c++/110344
> 
> gcc/c-family/ChangeLog:
> 
>   * c-cppbuiltin.cc (c_cpp_builtins): Update __cpp_constexpr.
> 
> gcc/cp/ChangeLog:
> 
>   * constexpr.cc (cxx_eval_constant_expression): In C++26, allow cast
>   from void* to the type of a pointed-to object.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp26/constexpr-voidptr1.C: New test.
>   * g++.dg/cpp26/constexpr-voidptr2.C: New test.
>   * g++.dg/cpp26/feat-cxx26.C: New test.
> ---
>  gcc/c-family/c-cppbuiltin.cc  |   8 +-
>  gcc/cp/constexpr.cc   |  11 +
>  .../g++.dg/cpp26/constexpr-voidptr1.C |  35 +
>  .../g++.dg/cpp26/constexpr-voidptr2.C |  15 +
>  gcc/testsuite/g++.dg/cpp26/feat-cxx26.C   | 597 ++
>  5 files changed, 665 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp26/constexpr-voidptr1.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp26/constexpr-voidptr2.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp26/feat-cxx26.C
> 
> diff --git a/gcc/c-family/c-cppbuiltin.cc b/gcc/c-family/c-cppbuiltin.cc
> index 5d64625fcd7..6bd4c1261a7 100644
> --- a/gcc/c-family/c-cppbuiltin.cc
> +++ b/gcc/c-family/c-cppbuiltin.cc
> @@ -1075,12 +1075,18 @@ c_cpp_builtins (cpp_reader *pfile)
> cpp_define (pfile, "__cpp_size_t_suffix=202011L");
> cpp_define (pfile, "__cpp_if_consteval=202106L");
> cpp_define (pfile, "__cpp_auto_cast=202110L");
> -   cpp_define (pfile, "__cpp_constexpr=202211L");
> +   if (cxx_dialect <= cxx23)
> + cpp_define (pfile, "__cpp_constexpr=202211L");
> cpp_define (pfile, "__cpp_multidimensional_subscript=202211L");
> cpp_define (pfile, "__cpp_named_character_escapes=202207L");
> cpp_define (pfile, "__cpp_static_call_operator=202207L");
> cpp_define (pfile, "__cpp_implicit_move=202207L");
>   }
> +  if (cxx_dialect > cxx23)
> + {
> +   /* Set feature test macros for C++26.  */
> +   cpp_define (pfile, "__cpp_constexpr=202306L");
> + }
>if (flag_concepts)
>  {
> if (cxx_dialect >= cxx20)
> diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> index 432b3a275e8..cca0435bafc 100644
> --- a/gcc/cp/constexpr.cc
> +++ b/gcc/cp/constexpr.cc
> @@ -7681,6 +7681,17 @@ cxx_eval_constant_expression (const constexpr_ctx 
> *ctx, tree t,
>   && !is_std_construct_at (ctx->call)
>   && !is_std_allocator_allocate (ctx->call))
> {
> + /* P2738 (C++26): a conversion from a prvalue P of type "pointer to
> +cv void" to a pointer-to-object type T unless P points to an
> +object whose type is similar to T.  */
> + if (cxx_dialect > cxx23)
> +   if (tree ob
> +   = cxx_fold_indirect_ref (ctx, loc, TREE_TYPE (type), op))
> + {
> +   r = build1 (ADDR_EXPR, type, ob);
> +   break;
> + }
> +
>   /* Likewise, don't error when casting from void* when OP is
>  &heap uninit and similar.  */
>   tree sop = tree_strip_nop_conversions (op);
> diff --git a/gcc/testsuite/g++.dg/cpp26/constexpr-voidptr1.C 
> b/gcc/testsuite/g++.dg/cpp26/constexpr-voidptr1.C
> new file mode 100644
> index 000..ce0ccbef5f9
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp26/constexpr-voidptr1.C
> @@ -0,0 +1,35 @@
> +// PR c++/110344
> +// { dg-do compile { target c++26 } }
> +
> +#include <string_view>
> +struct Sheep {
> +  constexpr std::string_view speak() const noexcept { return "Baa"; }
> +};
> +struct Cow {
> +  constexpr std::string_view speak() const noexcept { return "Mooo"; }
> +};
> +class Animal_View {
> +private:
> +  const void *animal;
> +  std::string_view (*speak_function)(const void *);
> +public:
> +  template <typename Animal>
> +  constexpr Animal_View(const Animal &a)
> +: animal{&a}, speak_function{[](const void *object) {
> +return static_cast<const Animal *>(object)->speak();
> +  }} {}
> +  constexpr std::string_view speak() const noexcept {
> +return speak_function(animal);
> +  }
> +};
> +// This is the key bit here. This is a single concrete function
> +// that can take anything that happens to 

Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 16:17, 钟居哲 wrote:

Thanks Jeff.
Will commit after fixing the code formatting.

I am going to support COND_FMA and reduction first (which I think 
is higher priority).

Then I will come back to support strided_load/store.
Sure.  One thing to note with strided loads: they can significantly 
help x264's sad/satd loops.  So hopefully you're testing with those :-)




jeff


Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread 钟居哲
Thanks Jeff.
Will commit after fixing the code formatting.

I am going to support COND_FMA and reduction first (which I think is 
higher priority).
Then I will come back to support strided_load/store.

Thanks.



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-07-13 05:48
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: 回复: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV 
auto-vectorization
 
 
On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store, instead, I will support strided 
> load/store in vectorizer later.
> 
> Ok for trunk?
Assuming this removes the strided loads/stores while we figure out the 
best way to support them, OK for the trunk.  The formatting is so messed 
up that it's nearly impossible to read.
 
 
 
Note for the future, if you hit the message size limit, go ahead and 
gzip the patch.  That's better than forwarding from a failed message as 
the latter mucks up indentation so badly that the result is unreadable.
 
Jeff
 


Re: [COMMITTED] [range-op] Take known set bits into account in popcount [PR107053]

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 15:15, Aldy Hernandez via Gcc-patches wrote:

This patch teaches popcount about known set bits which are now
available in the irange.

PR tree-optimization/107053

gcc/ChangeLog:

* gimple-range-op.cc (cfn_popcount): Use known set bits.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr107053.c: New test.
You could probably play similar games with ctz/clz, though it's hard to 
know if it's worth the effort.


One way to find out might be to build jemalloc which uses those idioms 
heavily.  Similarly for deepsjeng from spec2017.


Jeff


Re: 回复: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 15:22, 钟居哲 wrote:
I have removed strided load/store, instead, I will support strided 
load/store in vectorizer later.


Ok for trunk?
Assuming this removes the strided loads/stores while we figure out the 
best way to support them, OK for the trunk.  The formatting is so messed 
up that it's nearly impossible to read.




Note for the future, if you hit the message size limit, go ahead and 
gzip the patch.  That's better than forwarding from a failed message as 
the latter mucks up indentation so badly that the result is unreadable.


Jeff


[COMMITTED] [range-op] Take known set bits into account in popcount [PR107053]

2023-07-12 Thread Aldy Hernandez via Gcc-patches
This patch teaches popcount about known set bits which are now
available in the irange.

PR tree-optimization/107053

gcc/ChangeLog:

* gimple-range-op.cc (cfn_popcount): Use known set bits.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr107053.c: New test.
---
 gcc/gimple-range-op.cc   | 11 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr107053.c | 13 +
 2 files changed, 20 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107053.c

diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 72c7b866f90..67b3c3d015e 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -880,17 +880,20 @@ public:
 if (lh.undefined_p ())
   return false;
 unsigned prec = TYPE_PRECISION (type);
-wide_int nz = lh.get_nonzero_bits ();
-wide_int pop = wi::shwi (wi::popcount (nz), prec);
+irange_bitmask bm = lh.get_bitmask ();
+wide_int nz = bm.get_nonzero_bits ();
+wide_int high = wi::shwi (wi::popcount (nz), prec);
 // Calculating the popcount of a singleton is trivial.
 if (lh.singleton_p ())
   {
-   r.set (type, pop, pop);
+   r.set (type, high, high);
return true;
   }
 if (cfn_ffs::fold_range (r, type, lh, rh, rel))
   {
-   int_range<2> tmp (type, wi::zero (prec), pop);
+   wide_int known_ones = ~bm.mask () & bm.value ();
+   wide_int low = wi::shwi (wi::popcount (known_ones), prec);
+   int_range<2> tmp (type, low, high);
r.intersect (tmp);
return true;
   }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107053.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr107053.c
new file mode 100644
index 000..8195d0f57b4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107053.c
@@ -0,0 +1,13 @@
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-evrp" }
+
+void link_failure();
+void f(int a)
+{
+a |= 0x300;
+int b =  __builtin_popcount(a);
+if (b < 2)
+link_failure();
+}
+
+// { dg-final { scan-tree-dump-not "link_failure" "evrp" } }
-- 
2.40.1
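
The bounds this patch computes can be illustrated outside of GCC.  What
follows is only a minimal sketch: KnownBits is a hypothetical stand-in for
irange_bitmask, and it assumes that a set mask bit means "unknown" and that
the possibly-nonzero bits are value | mask.  The lower bound is the popcount
of the known ones (~mask & value), the upper bound the popcount of the bits
that may be set.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for irange_bitmask (not GCC's class): a set bit in
// `mask` means the bit is unknown; where `mask` is clear, the bit is known
// and equals the corresponding bit of `value`.
struct KnownBits {
  std::uint32_t value;
  std::uint32_t mask;
};

// Inclusive range [low, high] for the popcount of any matching value.
struct PopcountBounds {
  unsigned low;
  unsigned high;
};

// Portable population count (clears the lowest set bit per iteration).
static unsigned popcount32(std::uint32_t x) {
  unsigned n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
}

PopcountBounds popcount_bounds(KnownBits kb) {
  // Bits that are certainly 1: known (mask clear) and set in value.
  std::uint32_t known_ones = static_cast<std::uint32_t>(~kb.mask) & kb.value;
  // Bits that might be 1: unknown bits plus the known ones.
  std::uint32_t maybe_ones = kb.mask | kb.value;
  return PopcountBounds{popcount32(known_ones), popcount32(maybe_ones)};
}
```

For the testcase above, "a |= 0x300" corresponds to
KnownBits{0x300, ~0x300}, which gives a lower bound of 2, so the "b < 2"
branch is dead.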



[COMMITTED] [range-op] Take known mask into account for bitwise ands [PR107043]

2023-07-12 Thread Aldy Hernandez via Gcc-patches
PR tree-optimization/107043

gcc/ChangeLog:

* range-op.cc (operator_bitwise_and::op1_range): Update bitmask.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr107043.c: New test.
---
 gcc/range-op.cc  |  8 
 gcc/testsuite/gcc.dg/tree-ssa/pr107043.c | 22 ++
 2 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107043.c

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 56e80c9f3ae..6b5d4f2accd 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -3463,6 +3463,14 @@ operator_bitwise_and::op1_range (irange &r, tree type,
   if (r.undefined_p ())
 set_nonzero_range_from_mask (r, type, lhs);
 
+  // For MASK == op1 & MASK, all the bits in MASK must be set in op1.
+  wide_int mask;
+  if (lhs == op2 && lhs.singleton_p (mask))
+{
+  r.update_bitmask (irange_bitmask (mask, ~mask));
+  return true;
+}
+
   // For 0 = op1 & MASK, op1 is ~MASK.
   if (lhs.zero_p () && op2.singleton_p ())
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107043.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr107043.c
new file mode 100644
index 000..af5df225746
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107043.c
@@ -0,0 +1,22 @@
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-evrp" }
+
+int g0(int n)
+{
+  int n1 = n & 0x8000;
+  if (n1 == 0)
+return 1;
+  // n1 will be 0x8000 here.
+  return (n1 >> 15) & 0x1;
+}
+
+int g1(int n)
+{
+  int n1 = n & 0x8000;
+  if (n1 == 0)
+return 1;
+  // n>>15 will be xx1 here.
+  return (n >> 15) & 0x1;
+}
+
+// { dg-final { scan-tree-dump-times "return 1;" 2 "evrp" } }
-- 
2.40.1
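
The rule added to op1_range ("for MASK == op1 & MASK, all the bits in MASK
must be set in op1") can be sketched in isolation.  This is only an
illustration under assumptions: KnownBits and both helper functions are
hypothetical names, and the (value, mask) convention mirrors the
irange_bitmask (mask, ~mask) call in the patch, with set mask bits meaning
"unknown".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for irange_bitmask (not GCC's class): a set bit in
// `mask` means the bit is unknown; clear mask bits are known and equal the
// corresponding bit of `value`.
struct KnownBits {
  std::uint32_t value;
  std::uint32_t mask;
};

// Given that (x & m) == m holds, every bit of m is a known one in x and
// nothing is known about the remaining bits.
KnownBits known_bits_from_and_equals(std::uint32_t m) {
  return KnownBits{m, static_cast<std::uint32_t>(~m)};
}

// True if the concrete value x agrees with kb on every known bit.
bool matches(KnownBits kb, std::uint32_t x) {
  std::uint32_t known = static_cast<std::uint32_t>(~kb.mask);
  return (x & known) == (kb.value & known);
}
```

With m = 0x8000 as in g1 above, bit 15 of n is a known one, so
(n >> 15) & 0x1 folds to 1.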



RISC-V: Folding memory for FP + constant case

2023-07-12 Thread Jivan Hakobyan via Gcc-patches
Accessing an element of a local array turns into a load from an address of
the form (fp + (index << C1)) + C2.
When the access is inside a loop, part of that address computation is loop
invariant.
For some reason, the loop-invariant passes cannot move that part out of the
loop.
But we can handle it in the target-specific hook (legitimize_address), which
provides an opportunity to rewrite the memory access into a form more
suitable for the target architecture.

This patch handles the case above by rewriting the address to ((fp + C2) +
(index << C1)).
I have evaluated it on SPEC2017 and got an improvement on leela (over 7b
instructions, .39% of the dynamic count) that dwarfs the regression for gcc
(14m instructions, .0012% of the dynamic count).
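
To make the rewrite concrete, here is a minimal sketch on plain integers
rather than RTL.  All names are hypothetical and none of this is GCC code;
it only assumes what the description and the patch below state: the scale is
a left shift by 1..3 or a multiply by 2, 4 or 8, and the address is
reassociated so that fp + C2 becomes a single invariant term.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag for how the index is scaled.
enum class ScaleOp { Shift, Mult };

// Returns log2(c) if c is a positive power of two, otherwise -1.
int exact_log2_sketch(std::int64_t c) {
  if (c <= 0 || (c & (c - 1)) != 0)
    return -1;
  int n = 0;
  while (c > 1) {
    c >>= 1;
    ++n;
  }
  return n;
}

// Accepts a shift by 1, 2 or 3, or a multiply by 2, 4 or 8.
bool is_shadd_scale(ScaleOp op, std::int64_t c) {
  if (op == ScaleOp::Shift)
    return c >= 1 && c <= 3;
  int log2c = exact_log2_sketch(c);
  return log2c >= 1 && log2c <= 3;
}

// Original shape: (fp + (index << shift)) + offset.
std::uint64_t address_original(std::uint64_t fp, std::uint64_t index,
                               unsigned shift, std::int64_t offset) {
  return (fp + (index << shift)) + static_cast<std::uint64_t>(offset);
}

// Reassociated shape: (fp + offset) + (index << shift).  The first term does
// not depend on index, so it can be computed once outside a loop.
std::uint64_t address_folded(std::uint64_t fp, std::uint64_t index,
                             unsigned shift, std::int64_t offset) {
  std::uint64_t invariant_base = fp + static_cast<std::uint64_t>(offset);
  return invariant_base + (index << shift);
}
```

Both functions compute the same address; only the grouping differs, which is
what lets the invariant part be hoisted.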


gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_legitimize_address): Handle folding.
(mem_shadd_or_shadd_rtx_p): New predicate.


-- 
With the best regards
Jivan Hakobyan
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index e4dc8115e696ed44affe6ee8b51d635fe0eaaa33..2a7e464b855ec45f1fce4daec36d84842f3f3ea4 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1754,6 +1754,22 @@ riscv_shorten_lw_offset (rtx base, HOST_WIDE_INT offset)
   return addr;
 }
 
+/* Helper for riscv_legitimize_address. Given X, return true if it
+   is a left shift by 1, 2 or 3 positions or a multiply by 2, 4 or 8.
+
+   This respectively represent canonical shift-add rtxs or scaled
+   memory addresses.  */
+static bool
+mem_shadd_or_shadd_rtx_p (rtx x)
+{
+  return ((GET_CODE (x) == ASHIFT
+   || GET_CODE (x) == MULT)
+  && GET_CODE (XEXP (x, 1)) == CONST_INT
+  && ((GET_CODE (x) == ASHIFT && IN_RANGE (INTVAL (XEXP (x, 1)), 1, 3))
+  || (GET_CODE (x) == MULT
+  && IN_RANGE (exact_log2 (INTVAL (XEXP (x, 1))), 1, 3))));
+}
+
 /* This function is used to implement LEGITIMIZE_ADDRESS.  If X can
be legitimized in a way that the generic machinery might not expect,
return a new address, otherwise return NULL.  MODE is the mode of
@@ -1779,6 +1795,33 @@ riscv_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
   rtx base = XEXP (x, 0);
   HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
 
+  /* Handle (plus (plus (mult (a) (mem_shadd_constant)) (fp)) (C)) case.  */
+  if (GET_CODE (base) == PLUS && mem_shadd_or_shadd_rtx_p (XEXP (base, 0))
+  && SMALL_OPERAND (offset))
+{
+
+  rtx index = XEXP (base, 0);
+  rtx fp = XEXP (base, 1);
+  if (REGNO (fp) == VIRTUAL_STACK_VARS_REGNUM)
+{
+
+  /* If we were given a MULT, we must fix the constant
+ as we're going to create the ASHIFT form.  */
+  int shift_val = INTVAL (XEXP (index, 1));
+  if (GET_CODE (index) == MULT)
+shift_val = exact_log2 (shift_val);
+
+  rtx reg1 = gen_reg_rtx (Pmode);
+  rtx reg2 = gen_reg_rtx (Pmode);
+  rtx reg3 = gen_reg_rtx (Pmode);
+  riscv_emit_binary (PLUS, reg1, fp, GEN_INT (offset));
+  riscv_emit_binary (ASHIFT, reg2, XEXP (index, 0), GEN_INT (shift_val));
+  riscv_emit_binary (PLUS, reg3, reg2, reg1);
+
+  return reg3;
+}
+}
+
   if (!riscv_valid_base_register_p (base, mode, false))
 	base = copy_to_mode_reg (Pmode, base);
   if (optimize_function_for_size_p (cfun)


Re: [PATCH v2 1/2] c++, libstdc++: implement __is_pointer built-in trait

2023-07-12 Thread Ken Matsui via Gcc-patches
On Wed, Jul 12, 2023 at 3:01 AM Jonathan Wakely  wrote:
>
> On Mon, 10 Jul 2023 at 06:51, Ken Matsui via Libstdc++
>  wrote:
> >
> > Hi,
> >
> > Here is the benchmark result for is_pointer:
> >
> > https://github.com/ken-matsui/gcc-benches/blob/main/is_pointer.md#sun-jul--9-103948-pm-pdt-2023
> >
> > Time: -62.1344%
> > Peak Memory Usage: -52.4281%
> > Total Memory Usage: -53.5889%
>
> Wow!
>
> Although maybe we could have improved our std::is_pointer_v anyway, like so:
>
> template <typename _Tp>
>   inline constexpr bool is_pointer_v = false;
> template <typename _Tp>
>   inline constexpr bool is_pointer_v<_Tp*> = true;
> template <typename _Tp>
>   inline constexpr bool is_pointer_v<_Tp* const> = true;
> template <typename _Tp>
>   inline constexpr bool is_pointer_v<_Tp* volatile> = true;
> template <typename _Tp>
>   inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
>
> I'm not sure why I didn't already do that.
>
> Could you please benchmark that? And if it is better than the current
> impl using is_pointer<_Tp>::value then we should do this in the
> library:
>
> #if __has_builtin(__is_pointer)
> template <typename _Tp>
>   inline constexpr bool is_pointer_v = __is_pointer(_Tp);
> #else
> template <typename _Tp>
>   inline constexpr bool is_pointer_v = false;
> template <typename _Tp>
>   inline constexpr bool is_pointer_v<_Tp*> = true;
> template <typename _Tp>
>   inline constexpr bool is_pointer_v<_Tp* const> = true;
> template <typename _Tp>
>   inline constexpr bool is_pointer_v<_Tp* volatile> = true;
> template <typename _Tp>
>   inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
> #endif
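
The partial-specialization approach quoted above can be tried outside the
library.  This is only a sketch: the sketch namespace is a hypothetical name
chosen so that nothing in std is redefined, and the static_asserts just
compare against std::is_pointer_v (C++17).

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Primary template: not a pointer unless a specialization below matches.
template <typename T>
inline constexpr bool is_pointer_v = false;

// One partial specialization per cv-qualification of the pointer itself.
template <typename T>
inline constexpr bool is_pointer_v<T*> = true;
template <typename T>
inline constexpr bool is_pointer_v<T* const> = true;
template <typename T>
inline constexpr bool is_pointer_v<T* volatile> = true;
template <typename T>
inline constexpr bool is_pointer_v<T* const volatile> = true;

}  // namespace sketch

struct Widget { int member; };

// The sketch should agree with the standard trait on these cases.
static_assert(sketch::is_pointer_v<int*> == std::is_pointer_v<int*>, "");
static_assert(sketch::is_pointer_v<int* const volatile>
              == std::is_pointer_v<int* const volatile>, "");
static_assert(sketch::is_pointer_v<int Widget::*>
              == std::is_pointer_v<int Widget::*>, "");
static_assert(sketch::is_pointer_v<decltype(nullptr)>
              == std::is_pointer_v<decltype(nullptr)>, "");
```

No class template has to be instantiated to answer the query here, which is
presumably why this form could be cheaper than going through
is_pointer<_Tp>::value; the benchmark requested above would have to confirm
that.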

Hi François and Jonathan,

Thank you for your reviews! I will rename the four underscores to the
appropriate name and take a benchmark once I get home.

If I apply your change on is_pointer_v, is it better to add the
`Co-authored-by:` line in the commit?


[committed] libstdc++: Check conversion from filesystem::path to wide strings [PR95048]

2023-07-12 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk. This can be backported too.

-- >8 --

The testcase added for this bug only checks conversion from wide strings
on construction, but the fix also covered conversion to wide strings via
path::wstring(). Add checks for that, and u16string() and u32string().

libstdc++-v3/ChangeLog:

PR libstdc++/95048
* testsuite/27_io/filesystem/path/construct/95048.cc: Check
conversions to wide strings.
* testsuite/experimental/filesystem/path/construct/95048.cc:
Likewise.
---
 .../testsuite/27_io/filesystem/path/construct/95048.cc  | 6 ++
 .../experimental/filesystem/path/construct/95048.cc | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/libstdc++-v3/testsuite/27_io/filesystem/path/construct/95048.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/path/construct/95048.cc
index c1a382d1420..cd80d668b23 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/path/construct/95048.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/path/construct/95048.cc
@@ -16,6 +16,8 @@ test_wide()
   VERIFY( CHECK(L, "\U0001F4C1") ); // folder
   VERIFY( CHECK(L, "\U0001F4C2") ); // open folder
   VERIFY( CHECK(L, "\U0001F4C4") ); // filing cabient
+
+  VERIFY( path(u8"\U0001D11E").wstring() == L"\U0001D11E" ); // G Clef
 }
 
 void
@@ -25,6 +27,8 @@ test_u16()
   VERIFY( CHECK(u, "\U0001F4C1") ); // folder
   VERIFY( CHECK(u, "\U0001F4C2") ); // open folder
   VERIFY( CHECK(u, "\U0001F4C4") ); // filing cabient
+
+  VERIFY( path(u8"\U0001D11E").u16string() == u"\U0001D11E" ); // G Clef
 }
 
 void
@@ -34,6 +38,8 @@ test_u32()
   VERIFY( CHECK(U, "\U0001F4C1") ); // folder
   VERIFY( CHECK(U, "\U0001F4C2") ); // open folder
   VERIFY( CHECK(U, "\U0001F4C4") ); // filing cabient
+
+  VERIFY( path(u8"\U0001D11E").u32string() == U"\U0001D11E" ); // G Clef
 }
 
 int
diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/path/construct/95048.cc 
b/libstdc++-v3/testsuite/experimental/filesystem/path/construct/95048.cc
index b7a93f3c985..fc65bfecd4d 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/path/construct/95048.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/path/construct/95048.cc
@@ -18,6 +18,8 @@ test_wide()
   VERIFY( CHECK(L, "\U0001F4C1") ); // folder
   VERIFY( CHECK(L, "\U0001F4C2") ); // open folder
   VERIFY( CHECK(L, "\U0001F4C4") ); // filing cabient
+
+  VERIFY( path(u8"\U0001D11E").wstring() == L"\U0001D11E" ); // G Clef
 }
 
 void
@@ -27,6 +29,8 @@ test_u16()
   VERIFY( CHECK(u, "\U0001F4C1") ); // folder
   VERIFY( CHECK(u, "\U0001F4C2") ); // open folder
   VERIFY( CHECK(u, "\U0001F4C4") ); // filing cabient
+
+  VERIFY( path(u8"\U0001D11E").u16string() == u"\U0001D11E" ); // G Clef
 }
 
 void
@@ -36,6 +40,8 @@ test_u32()
   VERIFY( CHECK(U, "\U0001F4C1") ); // folder
   VERIFY( CHECK(U, "\U0001F4C2") ); // open folder
   VERIFY( CHECK(U, "\U0001F4C4") ); // filing cabient
+
+  VERIFY( path(u8"\U0001D11E").u32string() == U"\U0001D11E" ); // G Clef
 }
 
 int
-- 
2.41.0



Re: [PATCH v2 2/2] libstdc++: use new built-in trait __is_scalar for std::is_scalar

2023-07-12 Thread Ken Matsui via Gcc-patches
On Wed, Jul 12, 2023 at 12:23 PM Jonathan Wakely  wrote:
>
>
>
> On Wed, 12 Jul 2023, 19:33 Ken Matsui via Libstdc++,  
> wrote:
>>
>> On Wed, Jul 12, 2023 at 2:50 AM Jonathan Wakely  wrote:
>> >
>> > On Sat, 8 Jul 2023 at 05:47, Ken Matsui via Libstdc++
>> >  wrote:
>> > >
>> > > This patch gets std::is_scalar to dispatch to new built-in trait
>> > > __is_scalar.
>> > >
>> > > libstdc++-v3/ChangeLog:
>> > >
>> > > * include/std/type_traits (is_scalar): Use __is_scalar built-in
>> > > trait.
>> > > (is_scalar_v): Likewise.
>> >
>> > OK for trunk (conditional on the front-end change being committed
>> > first of course).
>> >
>>
>> Thank you for your review!
>>
>> Just to confirm, this approval does not include the [1/2] patch, does
>> it? Or, did you approve this entire patch series?
>
>
> Only this patch. I cannot approve compiler changes, I'm only a reviewer for 
> libstdc++.
>
>
>>
>> > conditional on the front-end change being committed first of course
>>
>> Does this mean we want to commit this [2/2] patch before committing
>> the [1/2] patch in this case?
>
>
> The other way around, as Xi Ruoyao said.
>
>
>>
>> Also, can I tweak the commit message without being approved again,
>> such as attaching the benchmark result?
>
>
> Yes, that's fine.

Thank you!

>>
>> > >
>> > > Signed-off-by: Ken Matsui 
>> > > ---
>> > >  libstdc++-v3/include/std/type_traits | 14 ++
>> > >  1 file changed, 14 insertions(+)
>> > >
>> > > diff --git a/libstdc++-v3/include/std/type_traits 
>> > > b/libstdc++-v3/include/std/type_traits
>> > > index 0e7a9c9c7f3..bc90b2c61ca 100644
>> > > --- a/libstdc++-v3/include/std/type_traits
>> > > +++ b/libstdc++-v3/include/std/type_traits
>> > > @@ -678,11 +678,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> > >  struct is_member_pointer;
>> > >
>> > >/// is_scalar
>> > > +#if __has_builtin(__is_scalar)
>> > > +  template<typename _Tp>
>> > > +    struct is_scalar
>> > > +    : public __bool_constant<__is_scalar(_Tp)>
>> > > +    { };
>> > > +#else
>> > >    template<typename _Tp>
>> > >      struct is_scalar
>> > >      : public __or_<is_arithmetic<_Tp>, is_enum<_Tp>, is_pointer<_Tp>,
>> > >                     is_member_pointer<_Tp>, is_null_pointer<_Tp>>::type
>> > >      { };
>> > > +#endif
>> > >
>> > >    /// is_compound
>> > >    template<typename _Tp>
>> > > @@ -3204,8 +3211,15 @@ template <typename _Tp>
>> > >    inline constexpr bool is_fundamental_v = is_fundamental<_Tp>::value;
>> > >  template <typename _Tp>
>> > >    inline constexpr bool is_object_v = is_object<_Tp>::value;
>> > > +
>> > > +#if __has_builtin(__is_scalar)
>> > > +template <typename _Tp>
>> > > +  inline constexpr bool is_scalar_v = __is_scalar(_Tp);
>> > > +#else
>> > >  template <typename _Tp>
>> > >    inline constexpr bool is_scalar_v = is_scalar<_Tp>::value;
>> > > +#endif
>> > > +
>> > >  template <typename _Tp>
>> > >    inline constexpr bool is_compound_v = is_compound<_Tp>::value;
>> > >  template <typename _Tp>
>> > > --
>> > > 2.41.0
>> > >
>> >


Re: [PATCH 0/9] Add btf_decl_tag C attribute

2023-07-12 Thread David Faust via Gcc-patches



On 7/12/23 06:49, Jose E. Marchesi wrote:
> 
>> On Wed, Jul 12, 2023 at 2:44 PM Jose E. Marchesi
>>  wrote:
>>>
>>>
>>> [Added Eduard Zingerman in CC, who is implementing this same feature in
>>>  clang/llvm and also the consumer component in the kernel (pahole).]
>>>
>>> Hi Richard.
>>>
 On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches
  wrote:
>
> Hello,
>
> This series adds support for a new attribute, "btf_decl_tag" in GCC.
> The same attribute is already supported in clang, and is used by various
> components of the BPF ecosystem.
>
> The purpose of the attribute is to allow to associate (to "tag")
> declarations with arbitrary string annotations, which are emitted into
> debugging information (DWARF and/or BTF) to facilitate post-compilation
> analysis (the motivating use case being the Linux kernel BPF verifier).
> Multiple tags are allowed on the same declaration.
>
> These strings are not interpreted by the compiler, and the attribute
> itself has no effect on generated code, other than to produce additional
> DWARF DIEs and/or BTF records conveying the annotations.
>
> This entails:
>
> - A new C-language-level attribute which allows to associate (to "tag")
>   particular declarations with arbitrary strings.
>
> - The conveyance of that information in DWARF in the form of a new DIE,
>   DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
>   that of the DW_TAG_LLVM_annotation extension supported in LLVM for
>   the same purpose. These DIEs are already supported by BPF tooling,
>   such as pahole.
>
> - The conveyance of that information in BTF debug info in the form of
>   BTF_KIND_DECL_TAG records. These records are already supported by
>   LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
>   eBPF verifier.
>
>
> Background
> ==========
>
> The purpose of these tags is to convey additional semantic information
> to post-compilation consumers, in particular the Linux kernel eBPF
> verifier. The verifier can make use of that information while analyzing
> a BPF program to aid in determining whether to allow or reject the
> program to be run. More background on these tags can be found in the
> early support for them in the kernel here [1] and [2].
>
> The "btf_decl_tag" attribute is half the story; the other half is a
> sibling attribute "btf_type_tag" which serves the same purpose but
> applies to types. Support for btf_type_tag will come in a separate
> patch series, since it is impacted by GCC bug 110439 which needs to be
> addressed first.
>
> I submitted an initial version of this work (including btf_type_tag)
> last spring [3], however at the time there were some open questions
> about the behavior of the btf_type_tag attribute and issues with its
> implementation. Since then we have clarified these details and agreed
> to solutions with the BPF community and LLVM BPF folks.
>
> The main motivation for emitting the tags in DWARF is that the Linux
> kernel generates its BTF information via pahole, using DWARF as a source:
>
>     +--------+  BTF          BTF   +----------+
>     | pahole |---> vmlinux.btf --->| verifier |
>     +--------+                     +----------+
>         ^                               ^
>         |                               |
>   DWARF |                           BTF |
>         |                               |
>   vmlinux                        +-------------+
>   module1.ko                     | BPF program |
>   module2.ko                     +-------------+
>     ...
>
> This is because:
>
> a)  pahole adds additional kernel-specific information into the
> produced BTF based on additional analysis of kernel objects.
>
> b)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>
> c)  GCC can generate BTF for whatever target with -gbtf, but there is no
> support for linking/deduplicating BTF in the linker.
>
> In the scenario above, the verifier needs access to the pointer tags of
> both the kernel types/declarations (conveyed in the DWARF and translated
> to BTF by pahole) and those of the BPF program (available directly in 
> BTF).
>
>
> DWARF Representation
> ====================
>
> As noted above, btf_decl_tag is represented in DWARF via a new DIE
> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
> the following format:
>
>   DW_TAG_GNU_annotation (0x6000)
> DW_AT_name: "btf_decl_tag"
> DW_AT_const_value: 
>
>

Re: [PATCH v2 2/2] libstdc++: use new built-in trait __is_scalar for std::is_scalar

2023-07-12 Thread Jonathan Wakely via Gcc-patches
On Wed, 12 Jul 2023, 19:33 Ken Matsui via Libstdc++, 
wrote:

> On Wed, Jul 12, 2023 at 2:50 AM Jonathan Wakely 
> wrote:
> >
> > On Sat, 8 Jul 2023 at 05:47, Ken Matsui via Libstdc++
> >  wrote:
> > >
> > > This patch gets std::is_scalar to dispatch to new built-in trait
> > > __is_scalar.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > > * include/std/type_traits (is_scalar): Use __is_scalar built-in
> > > trait.
> > > (is_scalar_v): Likewise.
> >
> > OK for trunk (conditional on the front-end change being committed
> > first of course).
> >
>
> Thank you for your review!
>
> Just to confirm, this approval does not include the [1/2] patch, does
> it? Or, did you approve this entire patch series?
>

Only this patch. I cannot approve compiler changes, I'm only a reviewer for
libstdc++.



> > conditional on the front-end change being committed first of course
>
> Does this mean we want to commit this [2/2] patch before committing
> the [1/2] patch in this case?
>

The other way around, as Xi Ruoyao said.



> Also, can I tweak the commit message without being approved again,
> such as attaching the benchmark result?
>

Yes, that's fine.


> > >
> > > Signed-off-by: Ken Matsui 
> > > ---
> > >  libstdc++-v3/include/std/type_traits | 14 ++
> > >  1 file changed, 14 insertions(+)
> > >
> > > diff --git a/libstdc++-v3/include/std/type_traits
> b/libstdc++-v3/include/std/type_traits
> > > index 0e7a9c9c7f3..bc90b2c61ca 100644
> > > --- a/libstdc++-v3/include/std/type_traits
> > > +++ b/libstdc++-v3/include/std/type_traits
> > > @@ -678,11 +678,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >  struct is_member_pointer;
> > >
> > >/// is_scalar
> > > +#if __has_builtin(__is_scalar)
> > > +  template
> > > +struct is_scalar
> > > +: public __bool_constant<__is_scalar(_Tp)>
> > > +{ };
> > > +#else
> > >template
> > >  struct is_scalar
> > >  : public __or_, is_enum<_Tp>, is_pointer<_Tp>,
> > > is_member_pointer<_Tp>, is_null_pointer<_Tp>>::type
> > >  { };
> > > +#endif
> > >
> > >/// is_compound
> > >template
> > > @@ -3204,8 +3211,15 @@ template 
> > >inline constexpr bool is_fundamental_v = is_fundamental<_Tp>::value;
> > >  template 
> > >inline constexpr bool is_object_v = is_object<_Tp>::value;
> > > +
> > > +#if __has_builtin(__is_scalar)
> > > +template 
> > > +  inline constexpr bool is_scalar_v = __is_scalar(_Tp);
> > > +#else
> > >  template 
> > >inline constexpr bool is_scalar_v = is_scalar<_Tp>::value;
> > > +#endif
> > > +
> > >  template 
> > >inline constexpr bool is_compound_v = is_compound<_Tp>::value;
> > >  template 
> > > --
> > > 2.41.0
> > >
> >
>


[committed] IRA+LRA: Change return type of predicate functions from int to bool

2023-07-12 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog:

* ira.cc (equiv_init_varies_p): Change return type from int to bool
and adjust function body accordingly.
(equiv_init_movable_p): Ditto.
(memref_used_between_p): Ditto.
* lra-constraints.cc (valid_address_p): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/ira.cc b/gcc/ira.cc
index 02dea5d49ee..a1860105c60 100644
--- a/gcc/ira.cc
+++ b/gcc/ira.cc
@@ -3075,7 +3075,7 @@ validate_equiv_mem_from_store (rtx dest, const_rtx set 
ATTRIBUTE_UNUSED,
 info->equiv_mem_modified = true;
 }
 
-static int equiv_init_varies_p (rtx x);
+static bool equiv_init_varies_p (rtx x);
 
 enum valid_equiv { valid_none, valid_combine, valid_reload };
 
@@ -3145,8 +3145,8 @@ validate_equiv_mem (rtx_insn *start, rtx reg, rtx memref)
   return valid_none;
 }
 
-/* Returns zero if X is known to be invariant.  */
-static int
+/* Returns false if X is known to be invariant.  */
+static bool
 equiv_init_varies_p (rtx x)
 {
   RTX_CODE code = GET_CODE (x);
@@ -3162,14 +3162,14 @@ equiv_init_varies_p (rtx x)
 CASE_CONST_ANY:
 case SYMBOL_REF:
 case LABEL_REF:
-  return 0;
+  return false;
 
 case REG:
   return reg_equiv[REGNO (x)].replace == 0 && rtx_varies_p (x, 0);
 
 case ASM_OPERANDS:
   if (MEM_VOLATILE_P (x))
-   return 1;
+   return true;
 
   /* Fall through.  */
 
@@ -3182,24 +3182,24 @@ equiv_init_varies_p (rtx x)
 if (fmt[i] == 'e')
   {
if (equiv_init_varies_p (XEXP (x, i)))
- return 1;
+ return true;
   }
 else if (fmt[i] == 'E')
   {
int j;
for (j = 0; j < XVECLEN (x, i); j++)
  if (equiv_init_varies_p (XVECEXP (x, i, j)))
-   return 1;
+   return true;
   }
 
-  return 0;
+  return false;
 }
 
-/* Returns nonzero if X (used to initialize register REGNO) is movable.
+/* Returns true if X (used to initialize register REGNO) is movable.
X is only movable if the registers it uses have equivalent initializations
which appear to be within the same loop (or in an inner loop) and movable
or if they are not candidates for local_alloc and don't vary.  */
-static int
+static bool
 equiv_init_movable_p (rtx x, int regno)
 {
   int i, j;
@@ -3212,7 +3212,7 @@ equiv_init_movable_p (rtx x, int regno)
   return equiv_init_movable_p (SET_SRC (x), regno);
 
 case CLOBBER:
-  return 0;
+  return false;
 
 case PRE_INC:
 case PRE_DEC:
@@ -3220,7 +3220,7 @@ equiv_init_movable_p (rtx x, int regno)
 case POST_DEC:
 case PRE_MODIFY:
 case POST_MODIFY:
-  return 0;
+  return false;
 
 case REG:
   return ((reg_equiv[REGNO (x)].loop_depth >= reg_equiv[regno].loop_depth
@@ -3229,11 +3229,11 @@ equiv_init_movable_p (rtx x, int regno)
  && ! rtx_varies_p (x, 0)));
 
 case UNSPEC_VOLATILE:
-  return 0;
+  return false;
 
 case ASM_OPERANDS:
   if (MEM_VOLATILE_P (x))
-   return 0;
+   return false;
 
   /* Fall through.  */
 
@@ -3247,16 +3247,16 @@ equiv_init_movable_p (rtx x, int regno)
   {
   case 'e':
if (! equiv_init_movable_p (XEXP (x, i), regno))
- return 0;
+ return false;
break;
   case 'E':
for (j = XVECLEN (x, i) - 1; j >= 0; j--)
  if (! equiv_init_movable_p (XVECEXP (x, i, j), regno))
-   return 0;
+   return false;
break;
   }
 
-  return 1;
+  return true;
 }
 
 static bool memref_referenced_p (rtx memref, rtx x, bool read_p);
@@ -3370,7 +3370,7 @@ memref_referenced_p (rtx memref, rtx x, bool read_p)
Callers should not call this routine if START is after END in the
RTL chain.  */
 
-static int
+static bool
 memref_used_between_p (rtx memref, rtx_insn *start, rtx_insn *end)
 {
   rtx_insn *insn;
@@ -3383,15 +3383,15 @@ memref_used_between_p (rtx memref, rtx_insn *start, 
rtx_insn *end)
continue;
 
   if (memref_referenced_p (memref, PATTERN (insn), false))
-   return 1;
+   return true;
 
   /* Nonconst functions may access memory.  */
   if (CALL_P (insn) && (! RTL_CONST_CALL_P (insn)))
-   return 1;
+   return true;
 }
 
   gcc_assert (insn == NEXT_INSN (end));
-  return 0;
+  return false;
 }
 
 /* Mark REG as having no known equivalence.
diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index 123ff662cbc..9bfc88149ff 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -329,20 +329,20 @@ in_mem_p (int regno)
   return get_reg_class (regno) == NO_REGS;
 }
 
-/* Return 1 if ADDR is a valid memory address for mode MODE in address
+/* Return true if ADDR is a valid memory address for mode MODE in address
space AS, and check that each pseudo has the proper kind of hard
reg. */
-static int
+static bool
 valid_address_p (machine_mode mode ATTRIBUTE_UNUSED,
 rtx addr, addr_space_t as)
 {
 #ifdef GO_IF_LEGITIMATE_ADDRESS
   l

Re: [PATCH v2 2/2] libstdc++: use new built-in trait __is_scalar for std::is_scalar

2023-07-12 Thread Ken Matsui via Gcc-patches
On Wed, Jul 12, 2023 at 11:56 AM Xi Ruoyao  wrote:
>
> On Wed, 2023-07-12 at 11:32 -0700, Ken Matsui via Gcc-patches wrote:
> > > conditional on the front-end change being committed first of course
> >
> > Does this mean we want to commit this [2/2] patch before committing
> > the [1/2] patch in this case?
>
> No, this means you should get 1/2 reviewed and committed first.
>
> > Also, can I tweak the commit message without being approved again,
> > such as attaching the benchmark result?
>
> Yes, as long as the ChangeLog is still correct (the Git hook will reject
> a push with wrong ChangeLog format anyway).

I see. Thank you so much!

> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


Re: [PATCH v2 2/2] libstdc++: use new built-in trait __is_scalar for std::is_scalar

2023-07-12 Thread Xi Ruoyao via Gcc-patches
On Wed, 2023-07-12 at 11:32 -0700, Ken Matsui via Gcc-patches wrote:
> > conditional on the front-end change being committed first of course
> 
> Does this mean we want to commit this [2/2] patch before committing
> the [1/2] patch in this case?

No, this means you should get 1/2 reviewed and committed first.

> Also, can I tweak the commit message without being approved again,
> such as attaching the benchmark result?

Yes, as long as the ChangeLog is still correct (the Git hook will reject
a push with wrong ChangeLog format anyway).

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] c++: non-standalone surrogate call template

2023-07-12 Thread Patrick Palka via Gcc-patches
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  There might be an existing PR for this issue but Bugzilla search
seems to be timing out for me currently.

-- >8 --

I noticed we were accidentally preventing ourselves from considering
a pointer/reference-to-function conversion function template if it's
not the first conversion function that's considered, which for the
testcase below resulted in us accepting the B call but not the A call
despite the only difference between A and B being the order of member
declarations.  This patch fixes this so that the outcome of overload
resolution doesn't arbitrarily depend on declaration order in this
situation.

gcc/cp/ChangeLog:

* call.cc (add_template_conv_candidate): Don't check for
non-empty 'candidates' here.
(build_op_call): Check it here, before we've considered any
conversion functions.

gcc/testsuite/ChangeLog:

* g++.dg/overload/conv-op5.C: New test.
---
 gcc/cp/call.cc   | 24 ++--
 gcc/testsuite/g++.dg/overload/conv-op5.C | 18 ++
 2 files changed, 32 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/overload/conv-op5.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 81935b83908..119063979fa 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -3709,12 +3709,6 @@ add_template_conv_candidate (struct z_candidate **candidates, tree tmpl,
 tree return_type, tree access_path,
 tree conversion_path, tsubst_flags_t complain)
 {
-  /* Making this work broke PR 71117 and 85118, so until the committee resolves
- core issue 2189, let's disable this candidate if there are any call
- operators.  */
-  if (*candidates)
-return NULL;
-
   return
 add_template_candidate_real (candidates, tmpl, NULL_TREE, NULL_TREE,
 NULL_TREE, arglist, return_type, access_path,
@@ -5290,6 +5284,8 @@ build_op_call (tree obj, vec<tree, va_gc> **args, tsubst_flags_t complain)
  LOOKUP_NORMAL, &candidates, complain);
 }
 
+  bool any_call_ops = candidates != nullptr;
+
   convs = lookup_conversions (type);
 
   for (; convs; convs = TREE_CHAIN (convs))
@@ -5306,10 +5302,18 @@ build_op_call (tree obj, vec<tree, va_gc> **args, tsubst_flags_t complain)
  continue;
 
if (TREE_CODE (fn) == TEMPLATE_DECL)
- add_template_conv_candidate
-   (&candidates, fn, obj, *args, totype,
-/*access_path=*/NULL_TREE,
-/*conversion_path=*/NULL_TREE, complain);
+ {
+   /* Making this work broke PR 71117 and 85118, so until the
+  committee resolves core issue 2189, let's disable this
+  candidate if there are any call operators.  */
+   if (any_call_ops)
+ continue;
+
+   add_template_conv_candidate
+ (&candidates, fn, obj, *args, totype,
+  /*access_path=*/NULL_TREE,
+  /*conversion_path=*/NULL_TREE, complain);
+ }
else
  add_conv_candidate (&candidates, fn, obj,
  *args, /*conversion_path=*/NULL_TREE,
diff --git a/gcc/testsuite/g++.dg/overload/conv-op5.C b/gcc/testsuite/g++.dg/overload/conv-op5.C
new file mode 100644
index 000..b7724908b62
--- /dev/null
+++ b/gcc/testsuite/g++.dg/overload/conv-op5.C
@@ -0,0 +1,18 @@
+// { dg-do compile { target c++11 } }
+
+template<class T> using F = int(*)(T);
+using G = int(*)(int*);
+
+struct A {
+  template<class T> operator F<T>();  // #1
+  operator G() = delete; // #2
+};
+
+int i = A{}(0); // selects #1
+
+struct B {
+  operator G() = delete; // #2
+  template<class T> operator F<T>();  // #1
+};
+
+int j = B{}(0); // selects #1
-- 
2.41.0.327.gaa9166bcc0
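
The declaration-order dependence described in this patch can be illustrated with a toy model. This is only a sketch: `ConvFn`, `collect_old` and `collect_new` are hypothetical names that mimic the shape of the loop in build_op_call, not the real front-end data structures. The old check looked at whether the candidate list was non-empty at the time each template was visited; the new check takes a snapshot before any conversion function is considered.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a conversion function found by lookup.
struct ConvFn {
  std::string name;
  bool is_template;
};

// Old behaviour: a conversion function template is skipped whenever the
// candidate list is already non-empty, even if the earlier entry is just
// another conversion function rather than a call operator.
std::vector<std::string> collect_old(bool has_call_ops,
                                     const std::vector<ConvFn>& convs) {
  std::vector<std::string> candidates;
  if (has_call_ops)
    candidates.push_back("operator()");
  for (const ConvFn& fn : convs) {
    if (fn.is_template && !candidates.empty())
      continue;  // depends on what was visited earlier in the loop
    candidates.push_back(fn.name);
  }
  return candidates;
}

// New behaviour: record whether any call operators exist before the loop,
// so the decision no longer depends on the order of the conversions.
std::vector<std::string> collect_new(bool has_call_ops,
                                     const std::vector<ConvFn>& convs) {
  std::vector<std::string> candidates;
  if (has_call_ops)
    candidates.push_back("operator()");
  const bool any_call_ops = !candidates.empty();
  for (const ConvFn& fn : convs) {
    if (fn.is_template && any_call_ops)
      continue;
    candidates.push_back(fn.name);
  }
  return candidates;
}
```

With the two orderings from the testcase (template first as in A, template second as in B), the old model drops the template only for the B ordering, while the new model keeps it for both.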



Re: [PATCH v2 2/2] libstdc++: use new built-in trait __is_scalar for std::is_scalar

2023-07-12 Thread Ken Matsui via Gcc-patches
On Wed, Jul 12, 2023 at 2:50 AM Jonathan Wakely  wrote:
>
> On Sat, 8 Jul 2023 at 05:47, Ken Matsui via Libstdc++
>  wrote:
> >
> > This patch gets std::is_scalar to dispatch to new built-in trait
> > __is_scalar.
> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/std/type_traits (is_scalar): Use __is_scalar built-in
> > trait.
> > (is_scalar_v): Likewise.
>
> OK for trunk (conditional on the front-end change being committed
> first of course).
>

Thank you for your review!

Just to confirm, this approval does not include the [1/2] patch, does
it? Or, did you approve this entire patch series?

> conditional on the front-end change being committed first of course

Does this mean we want to commit this [2/2] patch before committing
the [1/2] patch in this case?

Also, can I tweak the commit message without being approved again,
such as attaching the benchmark result?

> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  libstdc++-v3/include/std/type_traits | 14 ++
> >  1 file changed, 14 insertions(+)
> >
> > diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
> > index 0e7a9c9c7f3..bc90b2c61ca 100644
> > --- a/libstdc++-v3/include/std/type_traits
> > +++ b/libstdc++-v3/include/std/type_traits
> > @@ -678,11 +678,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >  struct is_member_pointer;
> >
> >/// is_scalar
> > +#if __has_builtin(__is_scalar)
> > +  template<typename _Tp>
> > +    struct is_scalar
> > +    : public __bool_constant<__is_scalar(_Tp)>
> > +    { };
> > +#else
> >    template<typename _Tp>
> >      struct is_scalar
> >      : public __or_<is_arithmetic<_Tp>, is_enum<_Tp>, is_pointer<_Tp>,
> >                     is_member_pointer<_Tp>, is_null_pointer<_Tp>>::type
> >      { };
> > +#endif
> >
> >/// is_compound
> >    template<typename _Tp>
> > @@ -3204,8 +3211,15 @@ template <typename _Tp>
> >    inline constexpr bool is_fundamental_v = is_fundamental<_Tp>::value;
> >  template <typename _Tp>
> >    inline constexpr bool is_object_v = is_object<_Tp>::value;
> > +
> > +#if __has_builtin(__is_scalar)
> > +template <typename _Tp>
> > +  inline constexpr bool is_scalar_v = __is_scalar(_Tp);
> > +#else
> >  template <typename _Tp>
> >    inline constexpr bool is_scalar_v = is_scalar<_Tp>::value;
> > +#endif
> > +
> >  template <typename _Tp>
> >    inline constexpr bool is_compound_v = is_compound<_Tp>::value;
> >  template <typename _Tp>
> > --
> > 2.41.0
> >
>
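
The dispatch pattern in the quoted patch can be tried outside libstdc++ with a small stand-alone sketch. The `sketch` namespace and the `Widget` and `Color` types are made up for illustration. The fallback branch spells out the same disjunction as the library's `__or_<...>` form, so the code should behave the same whether or not the compiler provides an `__is_scalar` built-in.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Compilers without __has_builtin simply take the library fallback.
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

namespace sketch {

#if __has_builtin(__is_scalar)
// Fast path: ask the compiler directly.
template<typename T>
struct is_scalar : std::integral_constant<bool, __is_scalar(T)> { };
#else
// Fallback: scalar means arithmetic, enum, pointer, member pointer,
// or std::nullptr_t.
template<typename T>
struct is_scalar
: std::integral_constant<bool,
    std::is_arithmetic<T>::value || std::is_enum<T>::value
    || std::is_pointer<T>::value || std::is_member_pointer<T>::value
    || std::is_null_pointer<T>::value> { };
#endif

}  // namespace sketch

// Hypothetical types used only to exercise the trait.
struct Widget { int x; };
enum Color { red };
```

Either branch is expected to agree with `std::is_scalar` on the types it is given; only the instantiation cost of the trait differs.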


Re: [PATCH] ci: Add a linux CI

2023-07-12 Thread Tal Regev via Gcc-patches
- I am also not sure whether the gcc maintainers will want this CI, but many
other developers will be happy to have it.
  Many forks of gcc already exist on GitHub, as can be seen here:
https://github.com/gcc-mirror/gcc.
  Contributors are also asked to check and validate gcc, as I did in the CI;
see https://gcc.gnu.org/contribute.html, which asks them to test on one
platform and run all testsuites.
  The CI can be extended from a single Linux job to other OSes such as
Windows and macOS, so that more platforms are checked with one click.
- GitHub is not the main repo for gcc; it hosts mirrors of gcc, like this
one: https://github.com/gcc-mirror/gcc.
  Many developers and forks are already on GitHub.
- I think the repo https://github.com/gcc-mirror/gcc was created by the gcc
maintainers, because making a mirror requires a `git push --mirror` from the
original repo, as explained here:
https://docs.github.com/en/repositories/creating-and-managing-repositories/duplicating-a-repository.
But maybe I am mistaken.
- It means that each time a person opens a PR on their own repo, the CI will
be activated. (The CI can also be turned on in the mirror repo.) If the
mirror allows it, the CI can also run each time the master or release
branches change. GitHub will handle it; it already does so for all the other
open source projects.
  For public open source repos, each job can run for up to 6 hours. A repo
can have as many jobs as it wants, with at most 20 running in parallel, and
there is no limit on total CPU time. All of this is free of charge.
  This is how GitHub Actions works for all open source projects: any open
(non-private) repo gets it.
- This should be fixed on the gcc side. If that is done, gcc will have a
great way to test its code and will let fewer bugs into the repo.
  More developers can also contribute, and they can check gcc across
platforms in a single PR, running the CI on OSes that they don't have access
to or the skills to set up.

The CI can be extended to aarch64, arm, or any other flavor. GitHub Actions
supports public docker images and can run gcc on them too.

Thank you for your support.

Regards,
Tal Regev.


On Wed, 12 Jul 2023 at 15:42, Christophe Lyon 
wrote:

> Hi,
>
>
> On Sun, 9 Jul 2023 at 19:13, Tal Regev via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
>> Description: adding a ci in a github repo. Everytime a user will do a PR
>> to
>> master branch or releases branches, it will activate the ci on their repo.
>> for example: https://github.com/talregev/gcc/pull/1. Can help users to
>> verify their own changes before submitting a patch.
>>
>> ChangeLog: Add a linux CI
>>
>> Bootstrapping and testing: I tested it on linux with
>> host: x86_64-linux-gnu
>> target: x86_64-linux-gnu
>> some tests are failing. You can see the results in my CI yourself.
>>
>>
> Thanks for sharing your patch & idea.
> I think GCC validation is and has been a problem for a long time ;-)
>
> I am not a maintainer, so take my comments with a grain of salt ;-)
>
> - I don't know if the GCC project would want to accept such patches,
> pointing to github etc...
> - github is not the main GCC repository, it hosts several mirrors AFAIK
> - these mirrors are updated by individuals, I think, I don't know at which
> frequency etc... correct me if I'm wrong
> - would this mean that each time each such mirror/fork is updated, this
> triggers builds on github servers? Would that handle the load? I don't
> think so (also: how many "free" minutes of CPU time can be used?)
> - as you have noticed the GCC testsuite is not 100% clean (i.e. there are
> failures, so 'make check' always exits with an error code), making such a
> step useless. What we need is to compare to a baseline (eg. results of
> previous build) and report if there were detections. Several companies have
> CI systems doing this (either internally, or on publicly accessible servers)
>
> In particular, at Linaro we monitor regressions for several arm and
> aarch64 flavors, and we are also experimenting with "pre-commit CI", based
> on patchwork.
>
> Thanks anyway for sharing, it's good to see such initiatives ;-)
>
> Christophe
>
>
>
>> Patch: attach to this email.
>>
>
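
The baseline comparison that Christophe describes (report only new failures instead of relying on the exit code of 'make check') can be sketched as follows. This is a minimal illustration, not any existing CI's implementation: `Results` and `find_regressions` are hypothetical names, and real .sum files carry more states than PASS and FAIL.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical summary of one testsuite run: test name -> "PASS" or "FAIL".
using Results = std::map<std::string, std::string>;

// Return the tests that FAIL in `current` but did not FAIL in `baseline`.
// Tests that are new in `current` and fail are reported as well.
std::vector<std::string> find_regressions(const Results& baseline,
                                          const Results& current) {
  std::vector<std::string> regressions;
  for (const auto& entry : current) {
    if (entry.second != "FAIL")
      continue;
    const auto it = baseline.find(entry.first);
    if (it == baseline.end() || it->second != "FAIL")
      regressions.push_back(entry.first);
  }
  return regressions;  // sorted by name, since std::map iterates in order
}
```

A CI job built this way would fail only when the returned list is non-empty, so failures already present in the baseline do not mask new ones.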


Re: [PATCH] RISC-V: Throw compilation error for unknown sub-extension or supervisor extension

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 10:32, Palmer Dabbelt wrote:
> On Wed, 12 Jul 2023 09:02:06 PDT (-0700), jeffreya...@gmail.com wrote:
>> On 7/11/23 21:30, juzhe.zh...@rivai.ai wrote:
>>> LGTM
>> OK for the trunk.
>
> I'd like to make sure Kito is OK with this.  IIUC the "pass through
> unknown extensions" behavior is deliberate.  It's not what I would have
> done, but I didn't do it ;)

OK.  Not sure what the rationale behind that would be, but Kito may
have had a good reason.  Let's wait for his input.

jeff


Re: [PATCH] RISC-V: Throw compilation error for unknown sub-extension or supervisor extension

2023-07-12 Thread Palmer Dabbelt

On Wed, 12 Jul 2023 09:02:06 PDT (-0700), jeffreya...@gmail.com wrote:
> On 7/11/23 21:30, juzhe.zh...@rivai.ai wrote:
>> LGTM
> OK for the trunk.

I'd like to make sure Kito is OK with this.  IIUC the "pass through
unknown extensions" behavior is deliberate.  It's not what I would have
done, but I didn't do it ;)

> jeff


Re: [pushed][LRA][PR110372]: Refine reload pseudo class

2023-07-12 Thread Vladimir Makarov via Gcc-patches



On 7/12/23 12:22, Richard Sandiford wrote:
> Vladimir Makarov  writes:
>> On 7/12/23 06:07, Richard Sandiford wrote:
>>> Vladimir Makarov via Gcc-patches  writes:
>>>> diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc
>>>> index 73fbef29912..2f95121df06 100644
>>>> --- a/gcc/lra-assigns.cc
>>>> +++ b/gcc/lra-assigns.cc
>>>> @@ -1443,10 +1443,11 @@ assign_by_spills (void)
>>>>  pass.  Indicate that it is no longer spilled.  */
>>>>   bitmap_clear_bit (&all_spilled_pseudos, regno);
>>>>   assign_hard_regno (hard_regno, regno);
>>>> - if (! reload_p)
>>>> -   /* As non-reload pseudo assignment is changed we
>>>> -  should reconsider insns referring for the
>>>> -  pseudo.  */
>>>> + if (! reload_p || regno_allocno_class_array[regno] == ALL_REGS)
>>> Is this test meaningful on all targets?  We have some for which
>>> GENERAL_REGS == ALL_REGS (e.g. nios2 and nvptx), so ALL_REGS can
>>> be a valid allocation class.
>>>
>> Richard, thank you for the question.
>>
>> As I remember nvptx does not use IRA/LRA.
>>
>> I don't think it is a problem.  For targets with GENERAL_REGS ==
>> ALL_REGS, it only results in one more insn processing on the next
>> constraint sub-pass.
>
> Ah, ok, thanks.  If there's no risk of cycling then I agree it
> doesn't matter.

No.  There is no additional risk of cycling, as insn processing only
starts after assigning a hard reg to the reload pseudo, and it can happen
only once for the reload pseudo before the spilling sub-pass.




Re: [pushed][LRA][PR110372]: Refine reload pseudo class

2023-07-12 Thread Richard Sandiford via Gcc-patches
Vladimir Makarov  writes:
> On 7/12/23 06:07, Richard Sandiford wrote:
>> Vladimir Makarov via Gcc-patches  writes:
>>> diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc
>>> index 73fbef29912..2f95121df06 100644
>>> --- a/gcc/lra-assigns.cc
>>> +++ b/gcc/lra-assigns.cc
>>> @@ -1443,10 +1443,11 @@ assign_by_spills (void)
>>>  pass.  Indicate that it is no longer spilled.  */
>>>   bitmap_clear_bit (&all_spilled_pseudos, regno);
>>>   assign_hard_regno (hard_regno, regno);
>>> - if (! reload_p)
>>> -   /* As non-reload pseudo assignment is changed we
>>> -  should reconsider insns referring for the
>>> -  pseudo.  */
>>> + if (! reload_p || regno_allocno_class_array[regno] == ALL_REGS)
>> Is this test meaningful on all targets?  We have some for which
>> GENERAL_REGS == ALL_REGS (e.g. nios2 and nvptx), so ALL_REGS can
>> be a valid allocation class.
>>
> Richard, thank you for the question.
>
> As I remember nvptx does not use IRA/LRA.
>
> I don't think it is a problem.  For targets with GENERAL_REGS == 
> ALL_REGS, it only results in one more insn processing on the next 
> constraint sub-pass.

Ah, ok, thanks.  If there's no risk of cycling then I agree it
doesn't matter.

Richard

> I could do more accurate solution but it would need introducing new data 
> (flags) for pseudos which I'd like to avoid.


[COMMITTED] [range-op] Enable value/mask propagation in range-op.

2023-07-12 Thread Aldy Hernandez via Gcc-patches
Throw the switch in range-ops to make full use of the value/mask
information instead of only the nonzero bits.  This will cause most of
the operators implemented in range-ops to use the value/mask
information calculated by CCP's bit_value_binop() function which
range-ops uses.  This opens up more optimization opportunities.

In follow-up patches I will change the global range setter
(set_range_info) to be able to save the value/mask pair, and make both
CCP and IPA be able to save the known ones bit info, instead of
throwing it away.

gcc/ChangeLog:

* range-op.cc (irange_to_masked_value): Remove.
(update_known_bitmask): Update irange value/mask pair instead of
only updating nonzero bits.

gcc/testsuite/ChangeLog:

* gcc.dg/pr83073.c: Adjust testcase.
---
 gcc/range-op.cc| 53 ++
 gcc/testsuite/gcc.dg/pr83073.c |  2 +-
 2 files changed, 23 insertions(+), 32 deletions(-)

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index cb584314f4c..56e80c9f3ae 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -367,23 +367,6 @@ range_op_handler::op1_op2_relation (const vrange &lhs) const
 }
 
 
-// Convert irange bitmasks into a VALUE MASK pair suitable for calling CCP.
-
-static void
-irange_to_masked_value (const irange &r, widest_int &value, widest_int &mask)
-{
-  if (r.singleton_p ())
-{
-  mask = 0;
-  value = widest_int::from (r.lower_bound (), TYPE_SIGN (r.type ()));
-}
-  else
-{
-  mask = widest_int::from (r.get_nonzero_bits (), TYPE_SIGN (r.type ()));
-  value = 0;
-}
-}
-
 // Update the known bitmasks in R when applying the operation CODE to
 // LH and RH.
 
@@ -391,25 +374,33 @@ void
 update_known_bitmask (irange &r, tree_code code,
  const irange &lh, const irange &rh)
 {
-  if (r.undefined_p () || lh.undefined_p () || rh.undefined_p ())
+  if (r.undefined_p () || lh.undefined_p () || rh.undefined_p ()
+  || r.singleton_p ())
 return;
 
-  widest_int value, mask, lh_mask, rh_mask, lh_value, rh_value;
+  widest_int widest_value, widest_mask;
   tree type = r.type ();
   signop sign = TYPE_SIGN (type);
   int prec = TYPE_PRECISION (type);
-  signop lh_sign = TYPE_SIGN (lh.type ());
-  signop rh_sign = TYPE_SIGN (rh.type ());
-  int lh_prec = TYPE_PRECISION (lh.type ());
-  int rh_prec = TYPE_PRECISION (rh.type ());
-
-  irange_to_masked_value (lh, lh_value, lh_mask);
-  irange_to_masked_value (rh, rh_value, rh_mask);
-  bit_value_binop (code, sign, prec, &value, &mask,
-  lh_sign, lh_prec, lh_value, lh_mask,
-  rh_sign, rh_prec, rh_value, rh_mask);
-  wide_int tmp = wide_int::from (value | mask, prec, sign);
-  r.set_nonzero_bits (tmp);
+  irange_bitmask lh_bits = lh.get_bitmask ();
+  irange_bitmask rh_bits = rh.get_bitmask ();
+
+  bit_value_binop (code, sign, prec, &widest_value, &widest_mask,
+  TYPE_SIGN (lh.type ()),
+  TYPE_PRECISION (lh.type ()),
+  widest_int::from (lh_bits.value (), sign),
+  widest_int::from (lh_bits.mask (), sign),
+  TYPE_SIGN (rh.type ()),
+  TYPE_PRECISION (rh.type ()),
+  widest_int::from (rh_bits.value (), sign),
+  widest_int::from (rh_bits.mask (), sign));
+
+  wide_int mask = wide_int::from (widest_mask, prec, sign);
+  wide_int value = wide_int::from (widest_value, prec, sign);
+  // Bitmasks must have the unknown value bits cleared.
+  value &= ~mask;
+  irange_bitmask bm (value, mask);
+  r.update_bitmask (bm);
 }
 
 // Return the upper limit for a type.
diff --git a/gcc/testsuite/gcc.dg/pr83073.c b/gcc/testsuite/gcc.dg/pr83073.c
index 1168ae822a4..228e1890086 100644
--- a/gcc/testsuite/gcc.dg/pr83073.c
+++ b/gcc/testsuite/gcc.dg/pr83073.c
@@ -7,4 +7,4 @@ int f(int x)
 return x & 1;
 }
 
-/* { dg-final { scan-tree-dump "gimple_simplified to.* = 1" "evrp" } }  */
+/* { dg-final { scan-tree-dump "Folded into: return 1;" "evrp" } }  */
-- 
2.40.1
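
The value/mask representation this patch switches to can be sketched with a small lattice in the style of CCP. This is an illustration only: `KnownBits`, `bit_or` and `bit_and` are hypothetical names, not the `irange_bitmask` API, and the sketch is fixed at 32 bits. A set mask bit means "unknown", and the value bits under the mask are kept at zero, matching the `value &= ~mask` step in the patch.

```cpp
#include <cassert>
#include <cstdint>

// A bit is unknown when its mask bit is 1; otherwise it equals the
// corresponding bit of `value`.  Invariant: (value & mask) == 0.
struct KnownBits {
  std::uint32_t value;
  std::uint32_t mask;
};

KnownBits constant(std::uint32_t c) { return {c, 0u}; }
KnownBits varying() { return {0u, ~0u}; }

// Bits that are known to be zero in K.
static std::uint32_t known_zero(KnownBits k) { return ~k.value & ~k.mask; }

// OR: a result bit is known 1 if it is known 1 in either operand, and
// known 0 only if it is known 0 in both.
KnownBits bit_or(KnownBits a, KnownBits b) {
  const std::uint32_t ones = a.value | b.value;
  const std::uint32_t zeros = known_zero(a) & known_zero(b);
  return {ones, ~(ones | zeros)};
}

// AND: a result bit is known 0 if it is known 0 in either operand, and
// known 1 only if it is known 1 in both.
KnownBits bit_and(KnownBits a, KnownBits b) {
  const std::uint32_t ones = a.value & b.value;
  const std::uint32_t zeros = known_zero(a) | known_zero(b);
  return {ones, ~(ones | zeros)};
}

// The older, weaker summary: every bit that might be nonzero.
std::uint32_t nonzero_bits(KnownBits k) { return k.value | k.mask; }
```

For an expression of the form `(x | 1) & 1` (assumed here as a representative case; the adjusted testcase itself only shows `return x & 1;`), the nonzero-bits summary of `x | 1` is all ones, so the AND could still be 0 or 1. The value/mask pair records that bit 0 is known to be set, and the AND then folds to the constant 1.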



Re: [pushed][LRA][PR110372]: Refine reload pseudo class

2023-07-12 Thread Vladimir Makarov via Gcc-patches



On 7/12/23 06:07, Richard Sandiford wrote:
> Vladimir Makarov via Gcc-patches  writes:
>> diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc
>> index 73fbef29912..2f95121df06 100644
>> --- a/gcc/lra-assigns.cc
>> +++ b/gcc/lra-assigns.cc
>> @@ -1443,10 +1443,11 @@ assign_by_spills (void)
>>  pass.  Indicate that it is no longer spilled.  */
>>   bitmap_clear_bit (&all_spilled_pseudos, regno);
>>   assign_hard_regno (hard_regno, regno);
>> - if (! reload_p)
>> -   /* As non-reload pseudo assignment is changed we
>> -  should reconsider insns referring for the
>> -  pseudo.  */
>> + if (! reload_p || regno_allocno_class_array[regno] == ALL_REGS)
> Is this test meaningful on all targets?  We have some for which
> GENERAL_REGS == ALL_REGS (e.g. nios2 and nvptx), so ALL_REGS can
> be a valid allocation class.
>
Richard, thank you for the question.

As I remember nvptx does not use IRA/LRA.

I don't think it is a problem.  For targets with GENERAL_REGS ==
ALL_REGS, it only results in one more insn processing on the next
constraint sub-pass.

I could do a more accurate solution, but it would require introducing new
data (flags) for pseudos, which I'd like to avoid.




[r14-2407 Regression] FAIL: g++.dg/vect/pr110557.cc -std=c++98 (test for excess errors) on Linux/x86_64

2023-07-12 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

63ae6bc60c0f67fb2791991bf4b6e7e0a907d420 is the first bad commit
commit 63ae6bc60c0f67fb2791991bf4b6e7e0a907d420
Author: Xi Ruoyao 
Date:   Thu Jul 6 23:08:57 2023 +0800

vect: Fix vectorized BIT_FIELD_REF for signed bit-fields [PR110557]

caused

FAIL: g++.dg/vect/pr110557.cc  -std=c++14 (test for excess errors)
FAIL: g++.dg/vect/pr110557.cc  -std=c++17 (test for excess errors)
FAIL: g++.dg/vect/pr110557.cc  -std=c++20 (test for excess errors)
FAIL: g++.dg/vect/pr110557.cc  -std=c++98 (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-2407/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=g++.dg/vect/pr110557.cc --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=g++.dg/vect/pr110557.cc --target_board='unix{-m32\ 
-march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact
me at haochen dot jiang at intel.com.)
(If you run into problems related to cascadelake, disabling AVX512F on the
command line might help.)
(However, please make sure that there are no potential problems with AVX512.)


[PATCH] Fix part of PR 110293: `A NEEQ (A NEEQ CST)` part

2023-07-12 Thread Andrew Pinski via Gcc-patches
This fixes part of PR 110293, for the outer comparison case
being `!=` or `==`.  In turn PR 110539 is able to be optimized
again as the if statement for `(a&1) == ((a & 1) != 0)` gets optimized
to `false` early enough to allow FRE/DOM to do a CSE for memory store/load.

OK? Bootstrapped and tested on x86_64-linux with no regressions.
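
The case analysis behind this fold can be checked by brute force. The sketch below is not part of the patch; `original` and `folded` are hypothetical helper names. `folded` follows the three branches of the new pattern: a constant result, a range test for x in {0, 1}, and a fall-through comparison against 0 or 1. By enumeration, the fall-through for an outer `==` has to be `x == 0` (inner `==`) or `x == 1` (inner `!=`), so the sketch uses the outer comparison there.

```cpp
#include <cassert>

// The unsimplified expression: x OUTER (typeof x)(x INNER cst).
bool original(bool outer_eq, bool inner_eq, unsigned x, unsigned cst) {
  const unsigned inner = inner_eq ? (x == cst) : (x != cst);
  return outer_eq ? (x == inner) : (x != inner);
}

// The folded form, following the three branches of the pattern.
bool folded(bool outer_eq, bool inner_eq, unsigned x, unsigned cst) {
  const bool cst0 = (cst == 0), cst1 = (cst == 1);
  if (inner_eq ? cst0 : cst1)
    return !outer_eq;                       // constant result
  if (inner_eq ? cst1 : cst0)
    return outer_eq ? (x <= 1u) : (x > 1u); // x is (not) one of 0 and 1
  const unsigned k = inner_eq ? 0u : 1u;    // fall-through constant
  return outer_eq ? (x == k) : (x != k);
}

// Compare both forms over a small exhaustive range.
bool all_agree() {
  for (int outer = 0; outer < 2; ++outer)
    for (int inner = 0; inner < 2; ++inner)
      for (unsigned cst = 0; cst < 4; ++cst)
        for (unsigned x = 0; x < 8; ++x)
          if (original(outer, inner, x, cst) != folded(outer, inner, x, cst))
            return false;
  return true;
}
```

The range 0..7 covers every distinct case, since only the relations of x and CST to 0 and 1 and to each other matter.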

gcc/ChangeLog:

PR tree-optimization/110293
PR tree-optimization/110539
* match.pd: Expand the `x != (typeof x)(x == 0)`
pattern to handle where the inner and outer comparisons
are either `!=` or `==` and handle other constants
than 0.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr110293-1.c: New test.
* gcc.dg/tree-ssa/pr110539-1.c: New test.
* gcc.dg/tree-ssa/pr110539-2.c: New test.
* gcc.dg/tree-ssa/pr110539-3.c: New test.
* gcc.dg/tree-ssa/pr110539-4.c: New test.
---
 gcc/match.pd   | 39 --
 gcc/testsuite/gcc.dg/tree-ssa/pr110293-1.c | 58 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr110539-1.c | 12 
 gcc/testsuite/gcc.dg/tree-ssa/pr110539-2.c | 12 
 gcc/testsuite/gcc.dg/tree-ssa/pr110539-3.c | 75 
 gcc/testsuite/gcc.dg/tree-ssa/pr110539-4.c | 82 ++
 6 files changed, 274 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110293-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110539-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110539-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110539-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr110539-4.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 8543f777a28..351d9285e92 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6429,10 +6429,41 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
   { constant_boolean_node (false, type); }))
 
-/* x != (typeof x)(x == 0) is always true.  */
-(simplify
- (ne:c @0 (convert (eq @0 integer_zerop)))
- { constant_boolean_node (true, type); })
+/* x != (typeof x)(x == CST) -> CST == 0 ? 1 : (CST == 1 ? (x!=0&&x!=1) : x != 0) */
+/* x != (typeof x)(x != CST) -> CST == 1 ? 1 : (CST == 0 ? (x!=0&&x!=1) : x != 1) */
+/* x == (typeof x)(x == CST) -> CST == 0 ? 0 : (CST == 1 ? (x==0||x==1) : x == 0) */
+/* x == (typeof x)(x != CST) -> CST == 1 ? 0 : (CST == 0 ? (x==0||x==1) : x == 1) */
+(for outer (ne eq)
+ (for inner (ne eq)
+  (simplify
+   (outer:c @0 (convert (inner @0 INTEGER_CST@1)))
+   (with {
+ bool cst1 = integer_onep (@1);
+ bool cst0 = integer_zerop (@1);
+ bool innereq = inner == EQ_EXPR;
+ bool outereq = outer == EQ_EXPR;
+}
+   (switch
+(if (innereq ? cst0 : cst1)
+ { constant_boolean_node (!outereq, type); })
+(if (innereq ? cst1 : cst0)
+ (with {
+   tree utype = unsigned_type_for (TREE_TYPE (@0));
+   tree ucst1 = build_one_cst (utype);
+  }
+  (if (!outereq)
+   (gt (convert:utype @0) { ucst1; })
+   (le (convert:utype @0) { ucst1; })
+  )
+ )
+)
+(if (innereq)
+ (outer @0 { build_zero_cst (TREE_TYPE (@0)); }))
+(outer @0 { build_one_cst (TREE_TYPE (@0)); }))
+   )
+  )
+ )
+)
 
 (for cmp (unordered ordered unlt unle ungt unge uneq ltgt)
  /* If the second operand is NaN, the result is constant.  */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr110293-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr110293-1.c
new file mode 100644
index 000..24aea1a2d03
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr110293-1.c
@@ -0,0 +1,58 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized-raw" } */
+
+_Bool eqeq0(unsigned x)
+{
+  return x == (x == 0);
+}
+_Bool eqeq1(unsigned x)
+{
+  return x == (x == 1);
+}
+_Bool eqeq2(unsigned x)
+{
+  return x == (x == 2);
+}
+
+_Bool neeq0(unsigned x)
+{
+  return x != (x == 0);
+}
+_Bool neeq1(unsigned x)
+{
+  return x != (x == 1);
+}
+_Bool neeq2(unsigned x)
+{
+  return x != (x == 2);
+}
+
+_Bool eqne0(unsigned x)
+{
+  return x == (x != 0);
+}
+_Bool eqne1(unsigned x)
+{
+  return x == (x != 1);
+}
+_Bool eqne2(unsigned x)
+{
+  return x == (x != 2);
+}
+
+_Bool nene0(unsigned x)
+{
+  return x != (x != 0);
+}
+_Bool nene1(unsigned x)
+{
+  return x != (x != 1);
+}
+_Bool nene2(unsigned x)
+{
+  return x != (x != 2);
+}
+
+/* All of these functions should have removed the innermost comparison, which
+   means all of the conversions from bool to unsigned should have been removed too. */
+/* { dg-final { scan-tree-dump-not "nop_expr," "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr110539-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr110539-1.c
new file mode 100644
index 000..6ba864cdd13
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr110539-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized" } */
+int f(int a)
+{
+int b = a & 1;
+int c = b != 0;
+return c == b;
+}
+
+/* This should be optimized to jus

Re: [PATCH] mklog: Add --append option to auto add generate ChangeLog to patch file

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/11/23 22:01, Lehua Ding wrote:
> Hi,
>
> This tiny patch adds an --append option to mklog.py that adds the generated
> ChangeLog to the corresponding patch file. With this option there is no need
> to manually copy the generated ChangeLog to the patch file. e.g.:
>
> Running `mklog.py -a /path/to/this/patch` will add the generated ChangeLog
>
> ```
> contrib/ChangeLog:
>
> * mklog.py:
> ```
>
> to the right place in the /path/to/this/patch file.
>
> Best,
> Lehua
>
> contrib/ChangeLog:
>
> * mklog.py: Add --append option.

OK for the trunk.
jeff


Re: [PATCH] RISC-V: Throw compilation error for unknown sub-extension or supervisor extension

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/11/23 21:30, juzhe.zh...@rivai.ai wrote:
> LGTM

OK for the trunk.
jeff


Re: [PATCH] c++: constrained surrogate calls [PR110535]

2023-07-12 Thread Patrick Palka via Gcc-patches
On Wed, 12 Jul 2023, Patrick Palka wrote:

> We're not checking constraints of pointer/reference-to-function conversion
> functions during overload resolution, which causes us to ICE on the first
> testcase and incorrectly reject the second testcase.

Er, I noticed [over.call.object] doesn't exactly say that surrogate
call functions inherit the constraints of the corresponding conversion
function, but I reckon that's the intent?
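
For readers less familiar with surrogate call functions, here is a minimal unconstrained example of the mechanism that [over.call.object] describes. It is only a sketch with made-up names (`twice`, `Callable`, `call_through_surrogate`). The constrained variants from the testcases are deliberately not reproduced, since their behaviour is what this patch changes.

```cpp
#include <cassert>

int twice(int x) { return 2 * x; }

using F = int (*)(int);

// Callable has no operator(), only a conversion to pointer-to-function.
struct Callable {
  operator F() const { return &twice; }
};

// The call Callable{}(21) is resolved through a surrogate call function:
// the object is converted to F and the resulting pointer is called.
int call_through_surrogate() {
  return Callable{}(21);
}
```

Here `call_through_surrogate()` returns 42. In the patch, add_conv_candidate is where such a conversion function becomes a candidate, which is why the constraint check is added there.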

> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk/13?
> 
>   PR c++/110535
> 
> gcc/cp/ChangeLog:
> 
>   * call.cc (add_conv_candidate): Check constraints.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp2a/concepts-surrogate1.C: New test.
>   * g++.dg/cpp2a/concepts-surrogate2.C: New test.
> ---
>  gcc/cp/call.cc   |  8 
>  gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C | 12 
>  gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C | 14 ++
>  3 files changed, 34 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C
> 
> diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> index 15a3d6f2a1f..81935b83908 100644
> --- a/gcc/cp/call.cc
> +++ b/gcc/cp/call.cc
> @@ -2588,6 +2588,14 @@ add_conv_candidate (struct z_candidate **candidates, tree fn, tree obj,
>if (*candidates && (*candidates)->fn == totype)
>  return NULL;
>  
> +  if (!constraints_satisfied_p (fn))
> +{
> +  reason = constraint_failure ();
> +  viable = 0;
> +  return add_candidate (candidates, fn, obj, arglist, len, convs,
> + access_path, conversion_path, viable, reason, flags);
> +}
> +
>for (i = 0; i < len; ++i)
>  {
>tree arg, argtype, convert_type = NULL_TREE;
> diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C b/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C
> new file mode 100644
> index 000..e8481a31656
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C
> @@ -0,0 +1,12 @@
> +// PR c++/110535
> +// { dg-do compile { target c++20 } }
> +
> +using F = int(int);
> +
> +template<bool B>
> +struct A {
> + operator F*() requires B;
> +};
> +
> +int i = A<true>{}(0);  // OK
> +int j = A<false>{}(0); // { dg-error "no match" }
> diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C b/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C
> new file mode 100644
> index 000..8bf8364beb7
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C
> @@ -0,0 +1,14 @@
> +// PR c++/110535
> +// { dg-do compile { target c++20 } }
> +
> +using F = int(int);
> +using G = long(int);
> +
> +template<bool B>
> +struct A {
> + operator F&() requires B;
> + operator G&() requires (!B);
> +};
> +
> +int i = A<true>{}(0);  // { dg-bogus "ambiguous" }
> +int j = A<false>{}(0); // { dg-bogus "ambiguous" }
> -- 
> 2.41.0.327.gaa9166bcc0
> 
> 



[PATCH] c++: constrained surrogate calls [PR110535]

2023-07-12 Thread Patrick Palka via Gcc-patches
We're not checking constraints of pointer/reference-to-function conversion
functions during overload resolution, which causes us to ICE on the first
testcase and incorrectly reject the second testcase.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/13?

PR c++/110535

gcc/cp/ChangeLog:

* call.cc (add_conv_candidate): Check constraints.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-surrogate1.C: New test.
* g++.dg/cpp2a/concepts-surrogate2.C: New test.
---
 gcc/cp/call.cc   |  8 
 gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C | 12 
 gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C | 14 ++
 3 files changed, 34 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 15a3d6f2a1f..81935b83908 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -2588,6 +2588,14 @@ add_conv_candidate (struct z_candidate **candidates, 
tree fn, tree obj,
   if (*candidates && (*candidates)->fn == totype)
 return NULL;
 
+  if (!constraints_satisfied_p (fn))
+{
+  reason = constraint_failure ();
+  viable = 0;
+  return add_candidate (candidates, fn, obj, arglist, len, convs,
+   access_path, conversion_path, viable, reason, 
flags);
+}
+
   for (i = 0; i < len; ++i)
 {
   tree arg, argtype, convert_type = NULL_TREE;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C
new file mode 100644
index 000..e8481a31656
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate1.C
@@ -0,0 +1,12 @@
+// PR c++/110535
+// { dg-do compile { target c++20 } }
+
+using F = int(int);
+
+template<bool B>
+struct A {
+ operator F*() requires B;
+};
+
+int i = A<true>{}(0);  // OK
+int j = A<false>{}(0); // { dg-error "no match" }
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C
new file mode 100644
index 000..8bf8364beb7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-surrogate2.C
@@ -0,0 +1,14 @@
+// PR c++/110535
+// { dg-do compile { target c++20 } }
+
+using F = int(int);
+using G = long(int);
+
+template<bool B>
+struct A {
+ operator F&() requires B;
+ operator G&() requires (!B);
+};
+
+int i = A<true>{}(0);  // { dg-bogus "ambiguous" }
+int j = A<false>{}(0); // { dg-bogus "ambiguous" }
-- 
2.41.0.327.gaa9166bcc0



Re: [PATCH V2] RISC-V: Support COND_LEN_* patterns

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 09:24, Juzhe-Zhong wrote:

The middle-end part has been merged:
https://github.com/gcc-mirror/gcc/commit/0d4dd7e07a879d6c07a33edb2799710faa95651e

With this patch, we can handle operations that may trap on elements outside the loop.

The following two cases are addressed by this patch:
  
1. integer division:
  
   #define TEST_TYPE(TYPE) \
   __attribute__((noipa)) \
   void vrem_##TYPE (TYPE * __restrict dst, TYPE * __restrict a, TYPE * __restrict b, int n) \
   { \
     for (int i = 0; i < n; i++) \
       dst[i] = a[i] % b[i]; \
   }
   #define TEST_ALL() \
    TEST_TYPE(int8_t)
   TEST_ALL()
  
   Before this patch:
  
vrem_int8_t:

 ble a3,zero,.L14
 csrr t4,vlenb
 addiw   a5,a3,-1
 addiw   a4,t4,-1
 sext.w  t5,a3
 bltu a5,a4,.L10
 csrr t3,vlenb
 subw t3,t5,t3
 li  a5,0
 vsetvli t6,zero,e8,m1,ta,ma
.L4:
 add a6,a2,a5
 add a7,a0,a5
 add t1,a1,a5
 mv  a4,a5
 add a5,a5,t4
 vl1re8.v v2,0(a6)
 vl1re8.v v1,0(t1)
 sext.w  a6,a5
 vrem.vv v1,v1,v2
 vs1r.v  v1,0(a7)
 bleu a6,t3,.L4
 csrr a5,vlenb
 addw a4,a4,a5
 sext.w  a5,a4
 beq t5,a4,.L16
.L3:
 csrr a6,vlenb
 subw t5,t5,a4
 srli a6,a6,1
 addiw   t1,t5,-1
 addiw   a7,a6,-1
 bltu t1,a7,.L9
 slli a4,a4,32
 srli a4,a4,32
 add t0,a1,a4
 add t6,a2,a4
 add a4,a0,a4
 vsetvli a7,zero,e8,mf2,ta,ma
 sext.w  t3,a6
 vle8.v  v1,0(t0)
 vle8.v  v2,0(t6)
 subw t4,t5,a6
 vrem.vv v1,v1,v2
 vse8.v  v1,0(a4)
 mv  t1,t3
 bltu t4,t3,.L7
 csrr t1,vlenb
 add a4,a4,a6
 add t0,t0,a6
 add t6,t6,a6
 sext.w  t1,t1
 vle8.v  v1,0(t0)
 vle8.v  v2,0(t6)
 vrem.vv v1,v1,v2
 vse8.v  v1,0(a4)
.L7:
 addw a5,t1,a5
 beq t5,t1,.L14
.L9:
 add a4,a1,a5
 add a6,a2,a5
 lb  a6,0(a6)
 lb  a4,0(a4)
 add a7,a0,a5
 addi a5,a5,1
 remw a4,a4,a6
 sext.w  a6,a5
 sb  a4,0(a7)
 bgt a3,a6,.L9
.L14:
 ret
.L10:
 li  a4,0
 li  a5,0
 j   .L3
.L16:
 ret
  
After this patch:
  
vrem_int8_t:

ble a3,zero,.L5
.L3:
vsetvli a5,a3,e8,m1,tu,ma
vle8.v v1,0(a1)
vle8.v v2,0(a2)
sub a3,a3,a5
vrem.vv v1,v1,v2
vse8.v v1,0(a0)
add a1,a1,a5
add a2,a2,a5
add a0,a0,a5
bne a3,zero,.L3
.L5:
ret
  
2. Floating-point operation **WITHOUT** -ffast-math:
  
 #define TEST_TYPE(TYPE) \
 __attribute__((noipa)) \
 void vadd_##TYPE (TYPE * __restrict dst, TYPE *__restrict a, TYPE *__restrict b, int n) \
 { \
   for (int i = 0; i < n; i++) \
     dst[i] = a[i] + b[i]; \
 }

 #define TEST_ALL() \
  TEST_TYPE(float)

 TEST_ALL()

Before this patch:

vadd_float:

 ble a3,zero,.L10
 csrr a4,vlenb
 srli t3,a4,2
 addiw   a5,a3,-1
 addiw   a6,t3,-1
 sext.w  t6,a3
 bltu a5,a6,.L7
 subw t5,t6,t3
 mv  t1,a1
 mv  a7,a2
 mv  a6,a0
 li  a5,0
 vsetvli t4,zero,e32,m1,ta,ma
.L4:
 vl1re32.v   v1,0(t1)
 vl1re32.v   v2,0(a7)
 addw a5,a5,t3
 vfadd.vv v1,v1,v2
 vs1r.v  v1,0(a6)
 add t1,t1,a4
 add a7,a7,a4
 add a6,a6,a4
 bgeu t5,a5,.L4
 beq t6,a5,.L10
 sext.w  a5,a5
.L3:
 slli a4,a5,2
.L6:
 add a6,a1,a4
 add a7,a2,a4
 flw fa4,0(a6)
 flw fa5,0(a7)
 add a6,a0,a4
 addiw   a5,a5,1
 fadd.s  fa5,fa5,fa4
 addi a4,a4,4
 fsw fa5,0(a6)
 bgt a3,a5,.L6
.L10:
 ret
.L7:
 li  a5,0
 j   .L3
  
After this patch:
  
vadd_float:

ble a3,zero,.L5
.L3:
vsetvli a5,a3,e32,m1,tu,ma
slli a4,a5,2
vle32.v v1,0(a1)
vle32.v v2,0(a2)
sub a3,a3,a5
vfadd.vv v1,v1,v2
vse32.v v1,0(a0)
add a1,a1,a4
add a2,a2,a4
add a0,a0,a4
bne a3,zero,.L3
.L5:
ret
   
gcc/ChangeLog:
  
 * config/riscv/autovec.md (cond_len_): New pattern.

 * config/riscv/riscv-protos.h (enum insn_type): New enum.
 (expand_cond_len_binop): New function.
 * config/riscv/riscv-v.cc (emit_nonvlmax_tu_insn): Ditto.
 (emit_nonvlmax_fp_tu_insn): Ditto.
 (need_fp_rounding_p): Ditto.
 (expand_cond_len_binop): Ditto.
 * config/riscv/riscv.cc (riscv_preferred_else_value): Ditto.
 (TARGET_PREFERRED_ELSE_VALUE): New target hook.
  
gcc/testsuite/Change

Re: [PATCH v2] RISC-V: Refactor riscv mode after for VXRM and FRM

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/11/23 23:50, pan2...@intel.com wrote:

From: Pan Li 

While investigating the FRM dynamic rounding mode, we found that the global
unknown status differs considerably between fixed-point and
floating-point. Thus, we split the unknown-state function in two and
extract the shared logic into common helper functions.

We will also prepare more test cases in another PATCH.

Signed-off-by: Pan Li 

gcc/ChangeLog:

* config/riscv/riscv.cc (regnum_definition_p): New function.
(insn_asm_p): Ditto.
(riscv_vxrm_mode_after): New function for fixed-point.
(global_vxrm_state_unknown_p): Ditto.
(riscv_frm_mode_after): New function for floating-point.
(global_frm_state_unknown_p): Ditto.
(riscv_mode_after): Leverage new functions.
(riscv_entity_mode_after): Removed.
---
  gcc/config/riscv/riscv.cc | 96 +--
  1 file changed, 82 insertions(+), 14 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 38d8eb2fcf5..553fbb4435a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -7742,19 +7742,91 @@ global_state_unknown_p (rtx_insn *insn, unsigned int 
regno)
return false;
  }
  
+static bool
+regnum_definition_p (rtx_insn *insn, unsigned int regno)

Needs a function comment.  This is true for each new function added.  In
this specific case something like this might be appropriate:

/* Return TRUE if REGNO is set in INSN, FALSE otherwise.  */

Which begs the question, is there some reason why we're not using the
existing reg_set_p or simple_regno_set from rtlanal.cc?




Jeff


[PATCH V2] RISC-V: Support COND_LEN_* patterns

2023-07-12 Thread Juzhe-Zhong
The middle-end part has been merged:
https://github.com/gcc-mirror/gcc/commit/0d4dd7e07a879d6c07a33edb2799710faa95651e

With this patch, we can handle operations that may trap on elements outside the loop.
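
For readers less familiar with the COND_LEN_* operations, here is a rough scalar sketch in C of what a length- and mask-predicated remainder computes. It is only a model under assumptions: the name cond_len_rem_model and all of its parameters are made up for illustration and are not part of the patch, and the bias operand of the real optabs is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model (hypothetical, for illustration only) of a length- and
   mask-predicated remainder.  Lanes at index >= LEN, or with a zero
   mask, never evaluate a[i] % b[i]; they take the "else" value
   instead, so a zero divisor in an inactive lane is never touched.  */
void
cond_len_rem_model (int8_t *dst, const int8_t *a, const int8_t *b,
                    const uint8_t *mask, const int8_t *else_vals,
                    size_t nunits, size_t len)
{
  for (size_t i = 0; i < nunits; i++)
    {
      if (i < len && mask[i])
        dst[i] = (int8_t) (a[i] % b[i]);
      else
        dst[i] = else_vals[i];
    }
}
```

The point of the sketch is only that the operation is skipped entirely for inactive lanes, which is why a loop like the vrem example can be vectorized without a scalar tail.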
 
The following two cases are addressed by this patch:
 
1. integer division:
 
  #define TEST_TYPE(TYPE) \
  __attribute__((noipa)) \
  void vrem_##TYPE (TYPE * __restrict dst, TYPE * __restrict a, TYPE * __restrict b, int n) \
  { \
    for (int i = 0; i < n; i++) \
      dst[i] = a[i] % b[i]; \
  }
  #define TEST_ALL() \
   TEST_TYPE(int8_t)
  TEST_ALL()
 
  Before this patch:
 
   vrem_int8_t:
ble a3,zero,.L14
csrr t4,vlenb
addiw   a5,a3,-1
addiw   a4,t4,-1
sext.w  t5,a3
bltu a5,a4,.L10
csrr t3,vlenb
subw t3,t5,t3
li  a5,0
vsetvli t6,zero,e8,m1,ta,ma
.L4:
add a6,a2,a5
add a7,a0,a5
add t1,a1,a5
mv  a4,a5
add a5,a5,t4
vl1re8.v v2,0(a6)
vl1re8.v v1,0(t1)
sext.w  a6,a5
vrem.vv v1,v1,v2
vs1r.v  v1,0(a7)
bleu a6,t3,.L4
csrr a5,vlenb
addw a4,a4,a5
sext.w  a5,a4
beq t5,a4,.L16
.L3:
csrr a6,vlenb
subw t5,t5,a4
srli a6,a6,1
addiw   t1,t5,-1
addiw   a7,a6,-1
bltu t1,a7,.L9
slli a4,a4,32
srli a4,a4,32
add t0,a1,a4
add t6,a2,a4
add a4,a0,a4
vsetvli a7,zero,e8,mf2,ta,ma
sext.w  t3,a6
vle8.v  v1,0(t0)
vle8.v  v2,0(t6)
subw t4,t5,a6
vrem.vv v1,v1,v2
vse8.v  v1,0(a4)
mv  t1,t3
bltu t4,t3,.L7
csrr t1,vlenb
add a4,a4,a6
add t0,t0,a6
add t6,t6,a6
sext.w  t1,t1
vle8.v  v1,0(t0)
vle8.v  v2,0(t6)
vrem.vv v1,v1,v2
vse8.v  v1,0(a4)
.L7:
addw a5,t1,a5
beq t5,t1,.L14
.L9:
add a4,a1,a5
add a6,a2,a5
lb  a6,0(a6)
lb  a4,0(a4)
add a7,a0,a5
addi a5,a5,1
remw a4,a4,a6
sext.w  a6,a5
sb  a4,0(a7)
bgt a3,a6,.L9
.L14:
ret
.L10:
li  a4,0
li  a5,0
j   .L3
.L16:
ret
 
After this patch:
 
   vrem_int8_t:
ble a3,zero,.L5
.L3:
vsetvli a5,a3,e8,m1,tu,ma
vle8.v v1,0(a1)
vle8.v v2,0(a2)
sub a3,a3,a5
vrem.vv v1,v1,v2
vse8.v v1,0(a0)
add a1,a1,a5
add a2,a2,a5
add a0,a0,a5
bne a3,zero,.L3
.L5:
ret
 
2. Floating-point operation **WITHOUT** -ffast-math:
 
#define TEST_TYPE(TYPE) \
__attribute__((noipa)) \
void vadd_##TYPE (TYPE * __restrict dst, TYPE *__restrict a, TYPE *__restrict b, int n) \
{ \
  for (int i = 0; i < n; i++) \
    dst[i] = a[i] + b[i]; \
}

#define TEST_ALL() \
 TEST_TYPE(float)

TEST_ALL()
   
Before this patch:
   
   vadd_float:
ble a3,zero,.L10
csrr a4,vlenb
srli t3,a4,2
addiw   a5,a3,-1
addiw   a6,t3,-1
sext.w  t6,a3
bltu a5,a6,.L7
subw t5,t6,t3
mv  t1,a1
mv  a7,a2
mv  a6,a0
li  a5,0
vsetvli t4,zero,e32,m1,ta,ma
.L4:
vl1re32.v   v1,0(t1)
vl1re32.v   v2,0(a7)
addw a5,a5,t3
vfadd.vv v1,v1,v2
vs1r.v  v1,0(a6)
add t1,t1,a4
add a7,a7,a4
add a6,a6,a4
bgeu t5,a5,.L4
beq t6,a5,.L10
sext.w  a5,a5
.L3:
slli a4,a5,2
.L6:
add a6,a1,a4
add a7,a2,a4
flw fa4,0(a6)
flw fa5,0(a7)
add a6,a0,a4
addiw   a5,a5,1
fadd.s  fa5,fa5,fa4
addi a4,a4,4
fsw fa5,0(a6)
bgt a3,a5,.L6
.L10:
ret
.L7:
li  a5,0
j   .L3
 
After this patch:
 
   vadd_float:
ble a3,zero,.L5
.L3:
vsetvli a5,a3,e32,m1,tu,ma
slli a4,a5,2
vle32.v v1,0(a1)
vle32.v v2,0(a2)
sub a3,a3,a5
vfadd.vv v1,v1,v2
vse32.v v1,0(a0)
add a1,a1,a4
add a2,a2,a4
add a0,a0,a4
bne a3,zero,.L3
.L5:
ret
  
gcc/ChangeLog:
 
* config/riscv/autovec.md (cond_len_): New pattern.
* config/riscv/riscv-protos.h (enum insn_type): New enum.
(expand_cond_len_binop): New function.
* config/riscv/riscv-v.cc (emit_nonvlmax_tu_insn): Ditto.
(emit_nonvlmax_fp_tu_insn): Ditto.
(need_fp_rounding_p): Ditto.
(expand_cond_len_binop): Ditto.
* config/riscv/riscv.cc (riscv_preferred_else_value): Ditto.
(TARGET_PREFERRED_ELSE_VALUE): New target hook.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Adapt testcase.
* gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c

Re: [PATCH v1] RISC-V: Add more tests for RVV floating-point FRM.

2023-07-12 Thread Kito Cheng via Gcc-patches
Pan Li via Gcc-patches wrote on Wed, Jul 12, 2023 at 23:07:

> From: Pan Li 
>
> Add more test cases, including both asm-check and run tests, for RVV FRM.
>
> Signed-off-by: Pan Li 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/float-point-frm-insert-10.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-insert-7.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-insert-8.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-insert-9.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-run-1.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-run-2.c: New test.
> * gcc.target/riscv/rvv/base/float-point-frm-run-3.c: New test.
> ---
>  .../rvv/base/float-point-frm-insert-10.c  | 23 ++
>  .../riscv/rvv/base/float-point-frm-insert-7.c | 29 +++
>  .../riscv/rvv/base/float-point-frm-insert-8.c | 27 +++
>  .../riscv/rvv/base/float-point-frm-insert-9.c | 24 ++
>  .../riscv/rvv/base/float-point-frm-run-1.c| 79 +++
>  .../riscv/rvv/base/float-point-frm-run-2.c| 71 +
>  .../riscv/rvv/base/float-point-frm-run-3.c| 73 +
>  7 files changed, 326 insertions(+)
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-9.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-1.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-2.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-3.c
>
> diff --git
> a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
> new file mode 100644
> index 000..d35ee6d2131
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +void
> +test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t
> op2,
> +size_t vl)
> +{
> +  asm volatile (
> +"addi %0, %0, 0x12"
> +:"=r"(vl)


Should be + rather than = here
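
To spell out the constraint point: the asm reads vl as well as writing it, so the operand needs to be read-write ("+r"); with "=r" the compiler may assume the incoming value is dead. A minimal sketch follows. The helper name bump_vl is made up here, and the non-RISC-V branch is only a stand-in so the snippet also builds on other hosts with GCC or Clang.

```c
#include <stddef.h>

/* Hypothetical helper (not from the patch): add 0x12 to VL via
   inline asm.  The "+r" constraint tells the compiler that the asm
   both reads and writes the operand.  */
size_t
bump_vl (size_t vl)
{
#if defined(__riscv)
  __asm__ volatile ("addi %0, %0, 0x12" : "+r" (vl));
#else
  /* Portable stand-in: do the addition in C, then pass the value
     through an empty asm that uses the same "+r" constraint.  */
  vl += 0x12;
  __asm__ volatile ("" : "+r" (vl));
#endif
  return vl;
}
```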

>
> +:
> +:
> +  );
> +
> +  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
> +  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
> +  *(vfloat32m1_t *)out = result;
> +}
> +
> +/* { dg-final { scan-assembler-times
> {vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2
> } } */
> diff --git
> a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
> new file mode 100644
> index 000..7b1602fd509
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +size_t __attribute__ ((noinline))
> +normalize_vl (size_t vl)
> +{
> +  if (vl % 4 == 0)
> +return vl;
> +
> +  return ((vl / 4) + 1) * 4;
> +}
> +
> +void
> +test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t
> op2,
> +size_t vl)
> +{
> +  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
> +
> +  vl = normalize_vl (vl);
> +
> +  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
> +
> +  *(vfloat32m1_t *)out = result;
> +}
> +
> +/* { dg-final { scan-assembler-times
> {vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
> +/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2
> } } */
> diff --git
> a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
> b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
> new file mode 100644
> index 000..37481ddac38
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
> +
> +#include "riscv_vector.h"
> +
> +size_t __attribute__ ((noinline))
> +normalize_vl (size_t vl)
> +{
> +  if (vl % 4 == 0)
> +return vl;
> +
> +  return ((vl / 4) + 1) * 4;
> +}
> +
> +void
> +test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t
> op2,
> +size_t vl)
> +{
> +  vl = normalize_vl (vl);
> +
> +  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
> +
> +  *(vfloat32m1_t *)out = result;
> +}
> +
> 

[PATCH v1] RISC-V: Add more tests for RVV floating-point FRM.

2023-07-12 Thread Pan Li via Gcc-patches
From: Pan Li 

Add more test cases, including both asm-check and run tests, for RVV FRM.

Signed-off-by: Pan Li 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/float-point-frm-insert-10.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-insert-7.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-insert-8.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-insert-9.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-run-1.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-run-2.c: New test.
* gcc.target/riscv/rvv/base/float-point-frm-run-3.c: New test.
---
 .../rvv/base/float-point-frm-insert-10.c  | 23 ++
 .../riscv/rvv/base/float-point-frm-insert-7.c | 29 +++
 .../riscv/rvv/base/float-point-frm-insert-8.c | 27 +++
 .../riscv/rvv/base/float-point-frm-insert-9.c | 24 ++
 .../riscv/rvv/base/float-point-frm-run-1.c| 79 +++
 .../riscv/rvv/base/float-point-frm-run-2.c| 71 +
 .../riscv/rvv/base/float-point-frm-run-3.c| 73 +
 7 files changed, 326 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-9.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-run-3.c

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
new file mode 100644
index 000..d35ee6d2131
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-10.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+void
+test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl)
+{
+  asm volatile (
+"addi %0, %0, 0x12"
+:"=r"(vl)
+:
+:
+  );
+
+  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
+  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
+  *(vfloat32m1_t *)out = result;
+}
+
+/* { dg-final { scan-assembler-times 
{vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
new file mode 100644
index 000..7b1602fd509
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-7.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+normalize_vl (size_t vl)
+{
+  if (vl % 4 == 0)
+return vl;
+
+  return ((vl / 4) + 1) * 4;
+}
+
+void
+test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl)
+{
+  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
+
+  vl = normalize_vl (vl);
+
+  result = __riscv_vfadd_vv_f32m1_rm (op1, result, 3, vl);
+
+  *(vfloat32m1_t *)out = result;
+}
+
+/* { dg-final { scan-assembler-times 
{vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
new file mode 100644
index 000..37481ddac38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-8.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O3 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+normalize_vl (size_t vl)
+{
+  if (vl % 4 == 0)
+return vl;
+
+  return ((vl / 4) + 1) * 4;
+}
+
+void
+test_float_point_frm_static (float *out, vfloat32m1_t op1, vfloat32m1_t op2,
+size_t vl)
+{
+  vl = normalize_vl (vl);
+
+  vfloat32m1_t result = __riscv_vfadd_vv_f32m1_rm (op1, op2, 2, vl);
+
+  *(vfloat32m1_t *)out = result;
+}
+
+/* { dg-final { scan-assembler-times 
{vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 1 } } */
+/* { dg-final { scan-assembler-times {fsrm\s+[ax][0-9]+,\s*[ax][0-9]+} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-9.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-frm-insert-9.c
new file mode 100644
index 0

Re: [PATCH] riscv: thead: Fix failing XTheadCondMov tests (indirect-rv[32|64])

2023-07-12 Thread Philipp Tomsich
Thanks, applied to trunk!

Philipp.

On Wed, 12 Jul 2023 at 16:08, Jeff Law  wrote:

>
>
> On 7/12/23 08:07, Philipp Tomsich wrote:
> >
> >
> > On Wed, 12 Jul 2023 at 16:05, Jeff Law wrote:
> >
> >
> >
> > On 7/12/23 06:48, Christoph Müllner wrote:
> >  > On Wed, Jul 12, 2023 at 4:05 AM Jeff Law wrote:
> >  >>
> >  >>
> >  >>
> >  >> On 7/10/23 22:44, Christoph Muellner wrote:
> >  >>> From: Christoph Müllner
> >  >>>
> >  >>> Recently, two identical XTheadCondMov tests have been added,
> > which both fail.
> >  >>> Let's fix that by changing the following:
> >  >>> * Merge both files into one (no need for separate tests for
> > rv32 and rv64)
> >  >>> * Drop unrelated attribute check test (we already test for
> > `th.mveqz`
> >  >>> and `th.mvnez` instructions, so there is little additional
> > value)
> >  >>> * Fix the pattern to allow matching
> >  >>>
> >  >>> gcc/testsuite/ChangeLog:
> >  >>>
> >  >>>* gcc.target/riscv/xtheadcondmov-indirect-rv32.c: Moved
> > to...
> >  >>>* gcc.target/riscv/xtheadcondmov-indirect.c: ...here.
> >  >>>* gcc.target/riscv/xtheadcondmov-indirect-rv64.c:
> Removed.
> >  >> I thought this stuff got fixed recently.  Certainly happy to see
> the
> >  >> files merged though.  Here's what I got from the July 4 run:
> >  >
> >  > I have the following with a GCC master from today
> >  > (a454325bea77a0dd79415480d48233a7c296bc0a):
> >  >
> >  > FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2
> >  > scan-assembler .attribute arch,
> >  >
> >
>  "rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> >  > FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2
> >  > scan-assembler .attribute arch,
> >  >
> >
>  "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> >  >
> >  > With this patch the fails are gone.
> > Then it's fine with me :-)
> >
> >
> > For the avoidance of all doubt: could I hear an "OK"?
> OK for the trunk.
> jeff
>


Re: [PATCH v2] RISC-V: Refactor riscv mode after for VXRM and FRM

2023-07-12 Thread Kito Cheng via Gcc-patches
Li, Pan2 via Gcc-patches wrote on Wed, Jul 12, 2023 at 15:07:

> Thanks Juzhe for the review. Sure, let me hold the v3 for Kito's comments.
>
> Pan
>
> From: juzhe.zh...@rivai.ai 
> Sent: Wednesday, July 12, 2023 2:11 PM
> To: Li, Pan2 ; gcc-patches 
> Cc: Robin Dapp ; jeffreyalaw ;
> Li, Pan2 ; Wang, Yanzhang ;
> kito.cheng 
> Subject: Re: [PATCH v2] RISC-V: Refactor riscv mode after for VXRM and FRM
>
>
> +regnum_definition_p (rtx_insn *insn, unsigned int regno)
>
> I prefer it to be reg_set_p.
>
>
>
> +insn_asm_p (rtx_insn *insn)
>
> asm_insn_p
>
>
>
> +global_vxrm_state_unknown_p
>
> vxrm_unknown_p
>
>
>
> +global_frm_state_unknown_p (rtx_insn *insn)
>
> Unlike VXRM, FRM after a CALL is not "UNKNOWN".
>
> It just changes into another unknown dynamic mode (which may be the same
> as or different from the previous dynamic mode).
>
> frm_unknown_dynamic_p
>
>
>
> The rest of the refactoring looks good.
>
> Let's see whether kito has more comments.
>
>
>
> Thanks.
>
> 
> juzhe.zh...@rivai.ai
>
> From: pan2.li
> Date: 2023-07-12 13:50
> To: gcc-patches
> CC: juzhe.zhong; rdapp.gcc; jeffreyalaw; pan2.li; yanzhang.wang;
> kito.cheng
> Subject: [PATCH v2] RISC-V: Refactor riscv mode after for VXRM and FRM
> From: Pan Li <pan2...@intel.com>
>
> While investigating the FRM dynamic rounding mode, we found that the global
> unknown status differs considerably between fixed-point and
> floating-point. Thus, we split the unknown-state function in two and
> extract the shared logic into common helper functions.
>
> We will also prepare more test cases in another PATCH.
>
> Signed-off-by: Pan Li <pan2...@intel.com>
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (regnum_definition_p): New function.
> (insn_asm_p): Ditto.
> (riscv_vxrm_mode_after): New function for fixed-point.
> (global_vxrm_state_unknown_p): Ditto.
> (riscv_frm_mode_after): New function for floating-point.
> (global_frm_state_unknown_p): Ditto.
> (riscv_mode_after): Leverage new functions.
> (riscv_entity_mode_after): Removed.
> ---
> gcc/config/riscv/riscv.cc | 96 +--
> 1 file changed, 82 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 38d8eb2fcf5..553fbb4435a 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -7742,19 +7742,91 @@ global_state_unknown_p (rtx_insn *insn, unsigned
> int regno)
>return false;
> }
> +static bool
> +regnum_definition_p (rtx_insn *insn, unsigned int regno)
> +{
> +  df_ref ref;
> +  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
> +
> +  /* Return true if there is a definition of regno.  */
> +  for (ref = DF_INSN_INFO_DEFS (insn_info); ref; ref = DF_REF_NEXT_LOC
> (ref))
> +if (DF_REF_REGNO (ref) == regno)
> +  return true;
> +
> +  return false;
> +}
> +
> +static bool
> +insn_asm_p (rtx_insn *insn)
> +{
> +  extract_insn (insn);
> +
> +  return recog_data.is_asm;
> +}
> +
> +static bool
> +global_vxrm_state_unknown_p (rtx_insn *insn)
> +{
> +  /* Return true if there is a definition of VXRM.  */
> +  if (regnum_definition_p (insn, VXRM_REGNUM))
> +return true;
> +
> +  /* A CALL function may contain an instruction that modifies the VXRM,
> + return true in this situation.  */
> +  if (CALL_P (insn))
> +return true;
> +
> +  /* Return true for all assembly since users may hardcode a assembly
> + like this: asm volatile ("csrwi vxrm, 0").  */
> +  if (insn_asm_p (insn))
> +return true;
> +
> +  return false;
> +}
> +
> +static bool
> +global_frm_state_unknown_p (rtx_insn *insn)
> +{
> +  /* Return true if there is a definition of FRM.  */
> +  if (regnum_definition_p (insn, FRM_REGNUM))
> +return true;
> +
> +  /* A CALL function may contain an instruction that modifies the FRM,
> + return true in this situation.  */
> +  if (CALL_P (insn))
> +return true;
> +
> +  return false;
> +}
> +
> static int
> -riscv_entity_mode_after (int regnum, rtx_insn *insn, int mode,
> - int (*get_attr_mode) (rtx_insn *), int default_mode)
> +riscv_vxrm_mode_after (rtx_insn *insn, int mode)
> {
> -  if (global_state_unknown_p (insn, regnum))
> -return default_mode;
> -  else if (recog_memoized (insn) < 0)
> +  if (global_vxrm_state_unknown_p (insn))
> +return VXRM_MODE_NONE;
> +
> +  if (recog_memoized (insn) < 0)
> +return mode;
> +
> +  if (reg_mentioned_p (gen_rtx_REG (SImode, VXRM_REGNUM), PATTERN (insn)))


Extract the VXRM reg into a local static variable to avoid constructing it
again and again.
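
As a rough illustration of the caching idea, here is a self-contained toy in C. Every name in it (toy_reg, toy_gen_reg, toy_vxrm_reg, the register number 66) is a made-up stand-in, not GCC's rtx machinery. In GCC itself a cached rtx would presumably also have to be kept alive across garbage collection, which this toy ignores.

```c
#include <stddef.h>

/* Toy stand-in for a register object; not GCC's real rtx type.  */
struct toy_reg
{
  unsigned int regno;
};

/* Counts how often the "constructor" below actually runs.  */
int toy_reg_constructions = 0;

/* Hypothetical stand-in for building a register object.  */
static struct toy_reg *
toy_gen_reg (unsigned int regno)
{
  static struct toy_reg storage;
  storage.regno = regno;
  toy_reg_constructions++;
  return &storage;
}

/* Build the register object on first use and reuse it afterwards,
   instead of constructing a fresh one on every query.  */
struct toy_reg *
toy_vxrm_reg (void)
{
  static struct toy_reg *cached = NULL;
  if (cached == NULL)
    cached = toy_gen_reg (66 /* placeholder register number */);
  return cached;
}
```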


> +return get_attr_vxrm_mode (insn);
> +  else
>  return mode;
> +}
> -  rtx reg = gen_rtx_REG (SImode, regnum);
> -  bool mentioned_p = reg_mentioned_p (reg, PATTERN (insn));
> +static int
> +riscv_frm_mode_after (rtx

[committed] ifcvt: Change return type of predicate functions from int to bool

2023-07-12 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function arguments from int to bool.

gcc/ChangeLog:

* ifcvt.cc (cond_exec_changed_p): Change variable to bool.
(last_active_insn): Change "skip_use_p" function argument to bool.
(noce_operand_ok): Change return type from int to bool.
(find_cond_trap): Ditto.
(block_jumps_and_fallthru_p): Change "fallthru_p" and
"jump_p" variables to bool.
(noce_find_if_block): Change return type from int to bool.
(cond_exec_find_if_block): Ditto.
(find_if_case_1): Ditto.
(find_if_case_2): Ditto.
(dead_or_predicable): Ditto. Change "reversep" function arg to bool.
(block_jumps_and_fallthru): Rename from block_jumps_and_fallthru_p.
(cond_exec_process_insns): Change return type from int to bool.
Change "mod_ok" function arg to bool.
(cond_exec_process_if_block): Change return type from int to bool.
Change "do_multiple_p" function arg to bool.  Change "then_mod_ok"
variable to bool.
(noce_emit_store_flag): Change return type from int to bool.
Change "reversep" function arg to bool.  Change "cond_complex"
variable to bool.
(noce_try_move): Change return type from int to bool.
(noce_try_ifelse_collapse): Ditto.
(noce_try_store_flag): Ditto. Change "reversep" variable to bool.
(noce_try_addcc): Change return type from int to bool.  Change
"subtract" variable to bool.
(noce_try_store_flag_constants): Change return type from int to bool.
(noce_try_store_flag_mask): Ditto.  Change "reversep" variable to bool.
(noce_try_cmove): Change return type from int to bool.
(noce_try_cmove_arith): Ditto. Change "is_mem" variable to bool.
(noce_try_minmax): Change return type from int to bool.  Change
"unsignedp" variable to bool.
(noce_try_abs): Change return type from int to bool.  Change
"negate" variable to bool.
(noce_try_sign_mask): Change return type from int to bool.
(noce_try_move): Ditto.
(noce_try_store_flag_constants): Ditto.
(noce_try_cmove): Ditto.
(noce_try_cmove_arith): Ditto.
(noce_try_minmax): Ditto.  Change "unsignedp" variable to bool.
(noce_try_bitop): Change return type from int to bool.
(noce_operand_ok): Ditto.
(noce_convert_multiple_sets): Ditto.
(noce_convert_multiple_sets_1): Ditto.
(noce_process_if_block): Ditto.
(check_cond_move_block): Ditto.
(cond_move_process_if_block): Ditto. Change "success_p"
variable to bool.
(rest_of_handle_if_conversion): Change return type to void.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 0b180b4568f..a0af553b9ff 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -73,29 +73,29 @@ static int num_updated_if_blocks;
 static int num_true_changes;
 
 /* Whether conditional execution changes were made.  */
-static int cond_exec_changed_p;
+static bool cond_exec_changed_p;
 
 /* Forward references.  */
 static int count_bb_insns (const_basic_block);
 static bool cheap_bb_rtx_cost_p (const_basic_block, profile_probability, int);
 static rtx_insn *first_active_insn (basic_block);
-static rtx_insn *last_active_insn (basic_block, int);
+static rtx_insn *last_active_insn (basic_block, bool);
 static rtx_insn *find_active_insn_before (basic_block, rtx_insn *);
 static rtx_insn *find_active_insn_after (basic_block, rtx_insn *);
 static basic_block block_fallthru (basic_block);
 static rtx cond_exec_get_condition (rtx_insn *, bool);
 static rtx noce_get_condition (rtx_insn *, rtx_insn **, bool);
-static int noce_operand_ok (const_rtx);
+static bool noce_operand_ok (const_rtx);
 static void merge_if_block (ce_if_block *);
-static int find_cond_trap (basic_block, edge, edge);
+static bool find_cond_trap (basic_block, edge, edge);
 static basic_block find_if_header (basic_block, int);
-static int block_jumps_and_fallthru_p (basic_block, basic_block);
-static int noce_find_if_block (basic_block, edge, edge, int);
-static int cond_exec_find_if_block (ce_if_block *);
-static int find_if_case_1 (basic_block, edge, edge);
-static int find_if_case_2 (basic_block, edge, edge);
-static int dead_or_predicable (basic_block, basic_block, basic_block,
-  edge, int);
+static int block_jumps_and_fallthru (basic_block, basic_block);
+static bool noce_find_if_block (basic_block, edge, edge, int);
+static bool cond_exec_find_if_block (ce_if_block *);
+static bool find_if_case_1 (basic_block, edge, edge);
+static bool find_if_case_2 (basic_block, edge, edge);
+static bool dead_or_predicable (basic_block, basic_block, basic_block,
+   edge, bool);
 static void noce_emit_move_insn (rtx, rtx);
 static rtx_insn *block_has_only_trap (basic_block);
 static void need_cmov_or_rewire (basic_block, hash_set<rtx_insn *> *,
@@ -234,7 +234,7 @@ first_active_insn (basic_block bb)
 /* Return the last non-jump active (non-jump) insn in the basic block.  */
 
 static rtx_insn *
-last_active_insn

Re: Re: [PATCH] RISC-V: Support COND_LEN_* patterns

2023-07-12 Thread 钟居哲
>> Return true if the operation requires a rounding mode operand.  Maybe also
>> call it needs_fp_rounding?
OK.

>> What's FMLA?  That's SVE I suppose and ours is fmacc?
Yes, the comment is misleading; I will fix it soon.


juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-07-12 22:24
To: Juzhe-Zhong; gcc-patches
CC: rdapp.gcc; kito.cheng; kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Support COND_LEN_* patterns
Hi Juzhe,
 
> +/* Return true if the operation is the floating-point operation need FRM.  */
> +static bool
> +need_frm_p (rtx_code code, machine_mode mode)
> +{
> +  if (!FLOAT_MODE_P (mode))
> +return false;
> +  return code != SMIN && code != SMAX;
> +}
 
Return true if the operation requires a rounding mode operand.  Maybe also
call it needs_fp_rounding?
 
> +  if (need_frm_p (code, mode))
> + emit_nonvlmax_fp_tu_insn (icode, RVV_BINOP_MU, ops, len);
> +  else
> + emit_nonvlmax_tu_insn (icode, RVV_BINOP_MU, ops, len);
> +}
 
This feels like we could decide it inside emit_nonvlmax_tu_insn.
Same for without _tu.  But let's keep it like this for now in
order not to stall progress.
 
> +/* Implement TARGET_PREFERRED_ELSE_VALUE.  For binary operations,
> +   prefer to use the first arithmetic operand as the else value if
> +   the else value doesn't matter, since that exactly matches the SVE
> +   destructive merging form.  For ternary operations we could either
> +   pick the first operand and use FMAD-like instructions or the last
> +   operand and use FMLA-like instructions; the latter seems more
> +   natural.  */
 
What's FMLA?  That's SVE I suppose and ours is fmacc?
 
Apart from that fine from my side, thanks for supporting this.
 
Regards
Robin
 
 


RE: Re: [PATCH V3] VECT: Apply COND_LEN_* into vectorizable_operation

2023-07-12 Thread Li, Pan2 via Gcc-patches
Committed v4, as it passed both the regression and bootstrap tests. Thanks to
both Richards.

Pan

-----Original Message-----
From: Gcc-patches  On Behalf Of 钟居哲
Sent: Wednesday, July 12, 2023 9:19 PM
To: richard.sandiford 
Cc: gcc-patches ; rguenther 
Subject: Re: Re: [PATCH V3] VECT: Apply COND_LEN_* into vectorizable_operation

I fixed the comments as you suggested.

Thanks a lot!
I will merge it soon, once the bootstrap and regression tests finish.



juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-07-12 20:14
To: juzhe.zhong
CC: gcc-patches; rguenther
Subject: Re: [PATCH V3] VECT: Apply COND_LEN_* into vectorizable_operation
juzhe.zh...@rivai.ai writes:
> From: Ju-Zhe Zhong 
>
> Hi, Richard and Richi.
> As we discussed before, COND_LEN_* patterns were added for multiple
> situations.
> This patch applies COND_LEN_* to the following situation:
>
> Support for the following situation in "vectorizable_operation":
>   /* If operating on inactive elements could generate spurious traps,
>  we need to restrict the operation to active lanes.  Note that this
>  specifically doesn't apply to unhoisted invariants, since they
>  operate on the same value for every lane.
>
>  Similarly, if this operation is part of a reduction, a fully-masked
>  loop should only change the active lanes of the reduction chain,
>  keeping the inactive lanes as-is.  */
>   bool mask_out_inactive = ((!is_invariant && gimple_could_trap_p (stmt))
> || reduc_idx >= 0);
>
> This covers the case where mask_out_inactive is true with length-based loop
> control.
>
> So we can handle the following two cases:
>
> 1. Integer division:
>
>   #define TEST_TYPE(TYPE) \
>   __attribute__((noipa)) \
>   void vrem_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n) \
>   { \
>     for (int i = 0; i < n; i++) \
>       dst[i] = a[i] % b[i]; \
>   }
>   #define TEST_ALL() \
>   TEST_TYPE(int8_t)
>   TEST_ALL()
>
> With this patch:
>   
>   _61 = .SELECT_VL (ivtmp_59, POLY_INT_CST [4, 4]);
>   ivtmp_45 = _61 * 4;
>   vect__4.8_48 = .LEN_MASK_LOAD (vectp_a.6_46, 32B, _61, 0, { -1, ... });
>   vect__6.11_52 = .LEN_MASK_LOAD (vectp_b.9_50, 32B, _61, 0, { -1, ... });
>   vect__8.12_53 = .COND_LEN_ADD ({ -1, ... }, vect__4.8_48, vect__6.11_52, 
> vect__4.8_48, _61, 0);
>   .LEN_MASK_STORE (vectp_dst.13_55, 32B, _61, 0, { -1, ... }, vect__8.12_53);
>
> 2. Floating-point arithmetic **WITHOUT** -ffast-math
>   
>   #define TEST_TYPE(TYPE) \
>   __attribute__((noipa)) \
>   void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n) \
>   { \
>     for (int i = 0; i < n; i++) \
>       dst[i] = a[i] + b[i]; \
>   }
>   #define TEST_ALL() \
>   TEST_TYPE(float)
>   TEST_ALL()
>
> With this patch:
>
>   _61 = .SELECT_VL (ivtmp_59, POLY_INT_CST [4, 4]);
>   ivtmp_45 = _61 * 4;
>   vect__4.8_48 = .LEN_MASK_LOAD (vectp_a.6_46, 32B, _61, 0, { -1, ... });
>   vect__6.11_52 = .LEN_MASK_LOAD (vectp_b.9_50, 32B, _61, 0, { -1, ... });
>   vect__8.12_53 = .COND_LEN_ADD ({ -1, ... }, vect__4.8_48, vect__6.11_52, 
> vect__4.8_48, _61, 0);
>   .LEN_MASK_STORE (vectp_dst.13_55, 32B, _61, 0, { -1, ... }, vect__8.12_53);
>
> With this patch, we can make sure operations won't trap on elements that
> are masked out by "mask_out_inactive".
>
> gcc/ChangeLog:
>
> * internal-fn.cc (FOR_EACH_CODE_MAPPING): Adapt for COND_LEN_* 
> support.
> (CASE): Ditto.
> (get_conditional_len_internal_fn): New function.
> * internal-fn.h (get_conditional_len_internal_fn): Ditto.
> * tree-vect-stmts.cc (vectorizable_operation): Adapt for COND_LEN_* 
> support.
>
> ---
>  gcc/internal-fn.cc | 73 +++---
>  gcc/internal-fn.h  |  1 +
>  gcc/tree-vect-stmts.cc | 48 ---
>  3 files changed, 93 insertions(+), 29 deletions(-)
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index f9aaf66cf2a..b288ac6fe6b 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -4276,23 +4276,24 @@ static void (*const internal_fn_expanders[]) 
> (internal_fn, gcall *) = {
>0
>  };
>  
> -/* Invoke T(CODE, IFN) for each conditional function IFN that maps to a
> -   tree code CODE.  */
> +/* Invoke T(CODE, SUFFIX) for each conditional function IFN_COND_##SUFFIX
> +   that maps to a tree code CODE.  There is also an IFN_COND_LEN_##SUFFIX
> +   for each such IFN_COND_##SUFFIX.  */
>  #define FOR_EACH_CODE_MAPPING(T) \
> -  T (PLUS_EXPR, IFN_COND_ADD) \
> -  T (MINUS_EXPR, IFN_COND_SUB) \
> -  T (MULT_EXPR, IFN_COND_MUL) \
> -  T (TRUNC_DIV_EXPR, IFN_COND_DIV) \
> -  T (TRUNC_MOD_EXPR, IFN_COND_MOD) \
> -  T (RDIV_EXPR, IFN_COND_RDIV) \
> -  T (MIN_EXPR, IFN_COND_MIN) \
> -  T (MAX_EXPR, IFN_COND_MAX) \
> -  T (BIT_AND_EXPR, IFN_COND_AND) \
> -  T (BIT_IOR_EXPR, IFN_COND_IOR) \
> -  T (BIT_XOR_EXPR, IFN_COND_XOR) \
> -  T (LSHIFT_EXPR, IFN_COND_SHL) \
> -  T (RSHIFT_EXPR, IFN_COND_SHR) \
> -  T (NEGATE_EXPR, IFN_COND_NEG)
> +  T (PLUS_EXPR, ADD) \
> +  T (MINUS_EXPR, SUB) \
> +  T (MULT_EXPR, MUL) \
> +  T (TRUNC_DIV_EXPR, DIV) 

Re: [PATCH V5] RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 01:27, Richard Biener wrote:

> Using SSA_NAME_DEF_STMT during expansion is OK, but I don't think you
> can rely on REG_EXPR here since you don't know whether any coalescing
> happened.  That said, maybe the implementation currently guarantees
> you'll only see a REG_EXPR SSA name if there's a single definition
> of that register, but at least I'm not aware of that and this is also
> not documented.

If anyone knows if the implementation guarantees that, it'd probably be 
Michael, since he did the revamping of the expansion code years ago.




> I wonder if you can recover vlse.v at combine time though?

It may be hard to recover at combine time -- our vector insns aren't in 
forms that are easily digested by combine.  In this specific case we 
have hope though.  Essentially combine would need to recognize the 
offsets vector as a simple stride and adjust appropriately.




> That said, if the ISA supports gather/scatter with an affine offset
> the more appropriate way would be to add additional named expanders
> for this and deal with the above in the middle-end during RTL
> expansion instead.

It's worth a try.  I didn't have much luck with this at Tachyum, but I 
always expected it was a misunderstanding of some parts of the 
vectorizer on my part.  I was deep inside this class of problems when I 
had to push it on the stack to develop a golang port :(


We were basically going down the path of treating everything as a 
scatter-gather, but trying to recognize strides in the offsets vector as 
a degenerate case.
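
To make the "strides in the offsets vector" idea concrete, here is a minimal
standalone sketch.  It is not GCC code: the helper name constant_stride and
the use of std::vector for the offsets are assumptions made purely for
illustration.  It only shows the check itself, i.e. whether a gather/scatter
offsets vector is an arithmetic sequence and could therefore be treated as a
strided access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not a GCC API): if OFFSETS is an arithmetic sequence
// (base, base + s, base + 2*s, ...), return the stride s.  Fewer than two
// offsets do not determine a stride, so nothing is returned in that case.
std::optional<int64_t>
constant_stride (const std::vector<int64_t> &offsets)
{
  if (offsets.size () < 2)
    return std::nullopt;
  const int64_t stride = offsets[1] - offsets[0];
  for (std::size_t i = 2; i < offsets.size (); ++i)
    if (offsets[i] - offsets[i - 1] != stride)
      return std::nullopt;
  return stride;
}
```

With this sketch, offsets {0, 8, 16, 24} are recognized as a stride of 8,
while {0, 8, 20} are rejected and would stay a general gather/scatter.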


jeff


Re: [PATCH] RISC-V: Support COND_LEN_* patterns

2023-07-12 Thread Robin Dapp via Gcc-patches
Hi Juzhe,

> +/* Return true if the operation is the floating-point operation need FRM.  */
> +static bool
> +need_frm_p (rtx_code code, machine_mode mode)
> +{
> +  if (!FLOAT_MODE_P (mode))
> +return false;
> +  return code != SMIN && code != SMAX;
> +}

Return true if the operation requires a rounding mode operand.  Maybe also
call it needs_fp_rounding?
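
A standalone sketch of what the suggested comment and name could look like.
The enum Code and the bool float_mode_p are hypothetical stand-ins for
rtx_code and FLOAT_MODE_P (mode), so this models only the logic of the quoted
hunk and is not the riscv-v.cc implementation.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's rtx_code; only a few codes are modelled.
enum class Code { Plus, Minus, Mult, Div, SMin, SMax };

// Return true if the operation requires a rounding mode operand.
// FLOAT_MODE_P is a hypothetical stand-in for FLOAT_MODE_P (mode).
// Mirrors the quoted hunk: integer operations never need one, and among
// floating-point operations only min/max are excluded.
constexpr bool
needs_fp_rounding (Code code, bool float_mode_p)
{
  if (!float_mode_p)
    return false;
  return code != Code::SMin && code != Code::SMax;
}
```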

> +  if (need_frm_p (code, mode))
> + emit_nonvlmax_fp_tu_insn (icode, RVV_BINOP_MU, ops, len);
> +  else
> + emit_nonvlmax_tu_insn (icode, RVV_BINOP_MU, ops, len);
> +}

This feels like we could decide it inside emit_nonvlmax_tu_insn.
Same for without _tu.  But let's keep it like this for now in
order not to stall progress.

> +/* Implement TARGET_PREFERRED_ELSE_VALUE.  For binary operations,
> +   prefer to use the first arithmetic operand as the else value if
> +   the else value doesn't matter, since that exactly matches the SVE
> +   destructive merging form.  For ternary operations we could either
> +   pick the first operand and use FMAD-like instructions or the last
> +   operand and use FMLA-like instructions; the latter seems more
> +   natural.  */

What's FMLA?  That's SVE I suppose and ours is fmacc?

Apart from that fine from my side, thanks for supporting this.

Regards
 Robin



[committed] libgomp.texi: add cross ref, remove duplicated entry

2023-07-12 Thread Tobias Burnus

Committed as r14-2468-g13c3e29d47e359

"Some are only stubs" sounded worse than the actual status and we now
have a rather extensive and complete section about this topic.

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
commit 13c3e29d47e359b2f05ea98d61710fc162ba6d31
Author: Tobias Burnus 
Date:   Wed Jul 12 16:14:20 2023 +0200

libgomp.texi: add cross ref, remove duplicated entry

libgomp/

* libgomp.texi (OpenMP 5.0): Replace '... stub' by @ref to
'Memory allocation' section which contains the full status.
(TR11): Remove differently worded duplicated entry.
---
 libgomp/libgomp.texi | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 9d910e6883c..1645cc0a2d3 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -221,7 +221,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{mutexinoutset} @emph{dependence-type} for @code{depend} clause
   @tab Y @tab
 @item Predefined memory spaces, memory allocators, allocator traits
-  @tab Y @tab Some are only stubs
+  @tab Y @tab See also @ref{Memory allocation}
 @item Memory management routines @tab Y @tab
 @item @code{allocate} directive @tab N @tab
 @item @code{allocate} clause @tab P @tab Initial support
@@ -487,8 +487,6 @@ Technical Report (TR) 11 is the first preview for OpenMP 6.0.
 @item Mapping lambda captures @tab N @tab
 @item For Fortran, atomic compare with storing the comparison result
   @tab N @tab
-@item @code{aligned} clause changes for @code{simd} and @code{declare simd}
-  @tab N @tab
 @end multitable
 
 


Re: [PATCH] tree-optimization/94864 - vector insert of vector extract simplification

2023-07-12 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> The PRs ask for optimizing of
>
>   _1 = BIT_FIELD_REF ;
>   result_4 = BIT_INSERT_EXPR ;
>
> to a vector permutation.  The following implements this as
> match.pd pattern, improving code generation on x86_64.
>
> On the RTL level we face the issue that backend patterns inconsistently
> use vec_merge and vec_select of vec_concat to represent permutes.

Yeah, the current RTL codes probably overlap a bit too much.

Maybe we should have a rule that a vec_merge with a constant
third operand should be canonicalised to a vec_select?  And maybe
change the first operand of vec_select to be an rtvec, so that
no separate vec_concat (and thus wider mode) is needed for two-input
permutes?  Would be a lot of work though...

> I think using a (supported) permute is almost always better
> than an extract plus insert, maybe excluding the case we extract
> element zero and that's aliased to a register that can be used
> directly for insertion (not sure how to query that).

Yeah, extraction of the low element (0 for LE, N-1 for BE) is special
in RTL, in that it is now folded to a subreg.  But IMO it's reasonable
for even that case to go through TARGET_VECTORIZE_VEC_PERM_CONST,
maybe with a target-independent helper function to match permute
vectors that are equivalent to extract-and-insert.

On AArch64, extract-and-insert is a single operation for other
elements too, e.g.:

ins v0.s[2], v1.s[1]

is a thing.  But if the helper returns the index of the extracted
elements, targets can decide for themselves whether the index is
supported or not.
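
A rough sketch of what such a target-independent helper might look like.
Everything here is an assumption for illustration: the name
match_extract_insert, the ExtractInsert struct, and the use of a plain
std::vector as the selector instead of vec_perm_indices.  The selector
convention follows the patch: for n lanes, values in [0, n) pick from the
first input and values in [n, 2n) pick from the second.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Result of a successful match: lane INSERT of the first input is replaced
// by lane EXTRACT of the second input; all other lanes are unchanged.
struct ExtractInsert
{
  std::size_t insert;
  std::size_t extract;
};

// Hypothetical helper (not a GCC API).  Return the lane pair if SEL is the
// identity on the first input except for exactly one lane that is taken
// from the second input; otherwise return nothing.
std::optional<ExtractInsert>
match_extract_insert (const std::vector<std::size_t> &sel)
{
  const std::size_t n = sel.size ();
  std::optional<ExtractInsert> found;
  for (std::size_t i = 0; i < n; ++i)
    {
      if (sel[i] == i)
        continue;
      // Any other lane must come from the second input, and only once.
      if (sel[i] < n || sel[i] >= 2 * n || found)
        return std::nullopt;
      found = ExtractInsert{i, sel[i] - n};
    }
  return found;
}
```

For the AArch64 example above with four lanes, the selector {0, 1, 5, 3}
would match with insert lane 2 and extract lane 1; a target could then
decide whether that index pair is supported.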

Agree that this is the right thing for gimple to do FWIW.

Thanks,
Richard

> But this regresses for example gcc.target/i386/pr54855-8.c because PRE
> now realizes that
>
>   _1 = BIT_FIELD_REF ;
>   if (_1 > a_4(D))
> goto ; [50.00%]
>   else
> goto ; [50.00%]
>
>[local count: 536870913]:
>
>[local count: 1073741824]:
>   # iftmp.0_2 = PHI <_1(3), a_4(D)(2)>
>   x_5 = BIT_INSERT_EXPR ;
>
> is equal to
>
>[local count: 1073741824]:
>   _1 = BIT_FIELD_REF ;
>   if (_1 > a_4(D))
> goto ; [50.00%]
>   else
> goto ; [50.00%]
>
>[local count: 536870912]:
>   _7 = BIT_INSERT_EXPR ;
>
>[local count: 1073741824]:
>   # prephitmp_8 = PHI 
>
> and that no longer produces the desired maxsd operation at the RTL
> level (we fail to match .FMAX at the GIMPLE level earlier).
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu with regressions:
>
> FAIL: gcc.target/i386/pr54855-13.c scan-assembler-times vmaxsh[ t] 1
> FAIL: gcc.target/i386/pr54855-13.c scan-assembler-not vcomish[ t]
> FAIL: gcc.target/i386/pr54855-8.c scan-assembler-times maxsd 1
> FAIL: gcc.target/i386/pr54855-8.c scan-assembler-not movsd
> FAIL: gcc.target/i386/pr54855-9.c scan-assembler-times minss 1
> FAIL: gcc.target/i386/pr54855-9.c scan-assembler-not movss
>
> I think this is also PR88540 (the lack of min/max detection, not
> sure if the SSE min/max are suitable here)
>
>   PR tree-optimization/94864
>   PR tree-optimization/94865
>   * match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): New pattern
>   for vector insertion from vector extraction.
>
>   * gcc.target/i386/pr94864.c: New testcase.
>   * gcc.target/i386/pr94865.c: Likewise.
> ---
>  gcc/match.pd| 25 +
>  gcc/testsuite/gcc.target/i386/pr94864.c | 13 +
>  gcc/testsuite/gcc.target/i386/pr94865.c | 13 +
>  3 files changed, 51 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr94864.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr94865.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 8543f777a28..8cc106049c4 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -7770,6 +7770,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> wi::to_wide (@ipos) + isize))
>  (BIT_FIELD_REF @0 @rsize @rpos)
>  
> +/* Simplify vector inserts of other vector extracts to a permute.  */
> +(simplify
> + (bit_insert @0 (BIT_FIELD_REF@2 @1 @rsize @rpos) @ipos)
> + (if (VECTOR_TYPE_P (type)
> +  && types_match (@0, @1)
> +  && types_match (TREE_TYPE (TREE_TYPE (@0)), TREE_TYPE (@2))
> +  && TYPE_VECTOR_SUBPARTS (type).is_constant ())
> +  (with
> +   {
> + unsigned HOST_WIDE_INT elsz
> +   = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (TREE_TYPE (@1))));
> + poly_uint64 relt = exact_div (tree_to_poly_uint64 (@rpos), elsz);
> + poly_uint64 ielt = exact_div (tree_to_poly_uint64 (@ipos), elsz);
> + unsigned nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
> + vec_perm_builder builder;
> + builder.new_vector (nunits, nunits, 1);
> + for (unsigned i = 0; i < nunits; ++i)
> +   builder.quick_push (known_eq (ielt, i) ? nunits + relt : i);
> + vec_perm_indices sel (builder, 2, nunits);
> +   }
> +   (if (!VECTOR_MODE_P (TYPE_MODE (type))
> + || can_vec_perm_const_p (TYPE_MODE (type), TYPE_MODE (type), sel, 
> false))
> + 

Re: [PATCH v2] Implement new RTL optimizations pass: fold-mem-offsets.

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 03:12, Manolis Tsamis wrote:
> On Mon, Jul 10, 2023 at 12:58 AM Hans-Peter Nilsson  wrote:
>>
>> On Sun, 9 Jul 2023, Hans-Peter Nilsson wrote:
>>
>>> On Thu, 15 Jun 2023, Manolis Tsamis wrote:
>>>
>>>> This is a new RTL pass that tries to optimize memory offset calculations
>>>> by moving them from add immediate instructions to the memory loads/stores.
>>>
>>> It punts on all "use" insns that are not SET.
>>> Why not use single_set there too?
>>
>> Also, I don't see insn costs considered?
>> (Also: typo "immidiate".)
>
> The only change that this pass does is to change offsets where
> possible and then simplify add immediate instructions to register
> moves.
> I don't see how this could result in worse performance and by
> extension I don't see where insn costs could be used.
> Do you have any thoughts about where to use the costs?

If the offset crosses an architectural size boundary such that the 
instruction was longer, but still valid, it could affect the cost.


That's the most obvious case to me.  There may be others.
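
A toy model of that case, to illustrate the point only.  The encoding rule
below is invented and not taken from any real ISA: a memory access is assumed
to be 2 bytes when its offset is in [0, 32) and 4 bytes otherwise.  The
function names are hypothetical as well.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Invented encoding rule for illustration: short form for small
// non-negative offsets, long form for everything else.
constexpr int
mem_insn_size (int64_t offset)
{
  return (offset >= 0 && offset < 32) ? 2 : 4;
}

// Total size change, in bytes, over all memory accesses sharing one base
// register when DELTA is folded into each of their offsets.  A positive
// result means the folded accesses are larger in total.
int
fold_size_delta (const std::vector<int64_t> &offsets, int64_t delta)
{
  int diff = 0;
  for (int64_t off : offsets)
    diff += mem_insn_size (off + delta) - mem_insn_size (off);
  return diff;
}
```

Under this assumed rule, folding 16 into accesses at offsets 0 and 8 costs
nothing, while folding 64 makes both accesses longer even though they remain
valid, which is where an insn cost check could reject the transformation.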

Any progress on that m68k issue?  I've also got a report of x264 failing 
to build on riscv64 with the V2 variant, but I haven't distilled that 
down to a testcase yet.


jeff


Re: [PATCH] RISC-V: Support COND_LEN_* patterns

2023-07-12 Thread 钟居哲
The middle-end vectorizer patch has been approved and will be merged soon.

The middle-end dependency is resolved.

Ok for trunk?


juzhe.zh...@rivai.ai
 
From: Juzhe-Zhong
Date: 2023-07-12 12:44
To: gcc-patches
CC: kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc; Juzhe-Zhong
Subject: [PATCH] RISC-V: Support COND_LEN_* patterns
This patch depends on the following vectorizer patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624179.html
 
With this patch, we can handle operations that may trap on elements outside the loop.
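
As a rough illustration of why this avoids traps, here is a scalar model of a
conditional, length-controlled remainder.  It assumes the operand order seen
in the vectorizer dumps (mask, two inputs, else value, length, bias) and that
a lane is active only when its index is below len + bias and its mask bit is
set.  The function name cond_len_mod and the std::vector types are
hypothetical; this is a sketch, not the internal function itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model: inactive lanes copy the else value and never evaluate
// a[i] % b[i], so a zero divisor in an inactive lane is harmless.
std::vector<int8_t>
cond_len_mod (const std::vector<bool> &mask, const std::vector<int8_t> &a,
              const std::vector<int8_t> &b,
              const std::vector<int8_t> &else_val,
              std::size_t len, std::ptrdiff_t bias)
{
  const std::ptrdiff_t active = static_cast<std::ptrdiff_t> (len) + bias;
  std::vector<int8_t> result (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      const bool lane_active
        = static_cast<std::ptrdiff_t> (i) < active && mask[i];
      // Only the selected arm of ?: is evaluated.
      result[i] = lane_active ? static_cast<int8_t> (a[i] % b[i])
                              : else_val[i];
    }
  return result;
}
```

In this model, calling it with len = 2 on four-lane inputs whose last two
divisors are zero computes only lanes 0 and 1 and passes the else value
through for lanes 2 and 3.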
 
This patch addresses the following two cases:
 
1. integer division:
 
  #define TEST_TYPE(TYPE) \
  __attribute__((noipa)) \
  void vrem_##TYPE (TYPE * __restrict dst, TYPE * __restrict a, \
                    TYPE * __restrict b, int n) \
  { \
    for (int i = 0; i < n; i++) \
      dst[i] = a[i] % b[i]; \
  }
  #define TEST_ALL() \
   TEST_TYPE(int8_t)
  TEST_ALL()
 
  Before this patch:
 
   vrem_int8_t:
ble a3,zero,.L14
csrr    t4,vlenb
addiw   a5,a3,-1
addiw   a4,t4,-1
sext.w  t5,a3
bltu    a5,a4,.L10
csrr    t3,vlenb
subw    t3,t5,t3
li  a5,0
vsetvli t6,zero,e8,m1,ta,ma
.L4:
add a6,a2,a5
add a7,a0,a5
add t1,a1,a5
mv  a4,a5
add a5,a5,t4
vl1re8.v    v2,0(a6)
vl1re8.v    v1,0(t1)
sext.w  a6,a5
vrem.vv v1,v1,v2
vs1r.v  v1,0(a7)
bleu    a6,t3,.L4
csrr    a5,vlenb
addw    a4,a4,a5
sext.w  a5,a4
beq t5,a4,.L16
.L3:
csrr    a6,vlenb
subw    t5,t5,a4
srli    a6,a6,1
addiw   t1,t5,-1
addiw   a7,a6,-1
bltu    t1,a7,.L9
slli    a4,a4,32
srli    a4,a4,32
add t0,a1,a4
add t6,a2,a4
add a4,a0,a4
vsetvli a7,zero,e8,mf2,ta,ma
sext.w  t3,a6
vle8.v  v1,0(t0)
vle8.v  v2,0(t6)
subw    t4,t5,a6
vrem.vv v1,v1,v2
vse8.v  v1,0(a4)
mv  t1,t3
bltu    t4,t3,.L7
csrr    t1,vlenb
add a4,a4,a6
add t0,t0,a6
add t6,t6,a6
sext.w  t1,t1
vle8.v  v1,0(t0)
vle8.v  v2,0(t6)
vrem.vv v1,v1,v2
vse8.v  v1,0(a4)
.L7:
addw    a5,t1,a5
beq t5,t1,.L14
.L9:
add a4,a1,a5
add a6,a2,a5
lb  a6,0(a6)
lb  a4,0(a4)
add a7,a0,a5
addi    a5,a5,1
remw    a4,a4,a6
sext.w  a6,a5
sb  a4,0(a7)
bgt a3,a6,.L9
.L14:
ret
.L10:
li  a4,0
li  a5,0
j   .L3
.L16:
ret
 
After this patch:
 
   vrem_int8_t:
ble a3,zero,.L5
.L3:
vsetvli a5,a3,e8,m1,tu,ma
vle8.v v1,0(a1)
vle8.v v2,0(a2)
sub a3,a3,a5
vrem.vv v1,v1,v2
vse8.v v1,0(a0)
add a1,a1,a5
add a2,a2,a5
add a0,a0,a5
bne a3,zero,.L3
.L5:
ret
 
2. Floating-point operation **WITHOUT** -ffast-math:
 
#define TEST_TYPE(TYPE) \
__attribute__((noipa)) \
void vadd_##TYPE (TYPE * __restrict dst, TYPE *__restrict a, \
                  TYPE *__restrict b, int n) \
{ \
  for (int i = 0; i < n; i++) \
    dst[i] = a[i] + b[i]; \
}

#define TEST_ALL() \
 TEST_TYPE(float)

TEST_ALL()
   
Before this patch:
   
   vadd_float:
ble a3,zero,.L10
csrr    a4,vlenb
srli    t3,a4,2
addiw   a5,a3,-1
addiw   a6,t3,-1
sext.w  t6,a3
bltu    a5,a6,.L7
subw    t5,t6,t3
mv  t1,a1
mv  a7,a2
mv  a6,a0
li  a5,0
vsetvli t4,zero,e32,m1,ta,ma
.L4:
vl1re32.v   v1,0(t1)
vl1re32.v   v2,0(a7)
addw    a5,a5,t3
vfadd.vv    v1,v1,v2
vs1r.v  v1,0(a6)
add t1,t1,a4
add a7,a7,a4
add a6,a6,a4
bgeu    t5,a5,.L4
beq t6,a5,.L10
sext.w  a5,a5
.L3:
slli    a4,a5,2
.L6:
add a6,a1,a4
add a7,a2,a4
flw fa4,0(a6)
flw fa5,0(a7)
add a6,a0,a4
addiw   a5,a5,1
fadd.s  fa5,fa5,fa4
addi    a4,a4,4
fsw fa5,0(a6)
bgt a3,a5,.L6
.L10:
ret
.L7:
li  a5,0
j   .L3
 
After this patch:
 
   vadd_float:
ble a3,zero,.L5
.L3:
vsetvli a5,a3,e32,m1,tu,ma
slli a4,a5,2
vle32.v v1,0(a1)
vle32.v v2,0(a2)
sub a3,a3,a5
vfadd.vv v1,v1,v2
vse32.v v1,0(a0)
add a1,a1,a4
add a2,a2,a4
add a0,a0,a4
bne a3,zero,.L3
.L5:
ret
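
The loop shape in the "After this patch" listings can be read as the
following scalar model, shown here for the int8_t remainder case.  This is a
sketch under assumptions: VLMAX is a made-up lane count, vsetvli is assumed
to grant min (remaining, VLMAX) elements (one behaviour the RVV
specification permits), and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical number of int8_t lanes available per iteration.
constexpr std::size_t VLMAX = 16;

// Scalar model of the single strip-mined loop: no separate vector
// prologue, narrower vector step, or scalar epilogue is needed.
void
vrem_int8_model (int8_t *dst, const int8_t *a, const int8_t *b,
                 std::size_t n)
{
  while (n != 0)
    {
      // Models "vsetvli a5,a3,e8,m1,tu,ma" under the assumption above.
      const std::size_t vl = std::min (n, VLMAX);
      // Models vle8.v / vrem.vv / vse8.v: only the first vl lanes are
      // loaded, computed, and stored.
      for (std::size_t i = 0; i < vl; ++i)
        dst[i] = static_cast<int8_t> (a[i] % b[i]);
      // Models the pointer bumps and "sub a3,a3,a5".
      dst += vl;
      a += vl;
      b += vl;
      n -= vl;
    }
}
```

Because each iteration touches only the vl granted elements, the final
partial iteration never operates on data past the end of the arrays.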
  
gcc/ChangeLog:
 
* config/riscv/autovec.md (cond_len_): New pattern.
* config/riscv/riscv-protos.h (enum insn_type): New enum.
(expand_cond_len_binop): New function.
* config/riscv/riscv-v.cc (emit_nonvlmax_tu_insn): Ditto.
(emit_nonvlmax_fp_tu_insn): Ditto.
(need_frm_p): Ditto.
(expand_cond_len_binop): Ditto.
* config/riscv/riscv.cc

[PATCH] - Devirtualization of array destruction (C++) - 110057

2023-07-12 Thread Ng YongXiang via Gcc-patches
Component:
c++

Bug ID:
110057

Bugzilla link:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110057

Description:
Array should not call virtual destructor of object when array is destructed

ChangeLog:

2023-07-12  Ng YongXiang

    PR c++/110057
    Devirtualize auto generated destructor calls of array

cp/
    * init.cc: Call non virtual destructor of objects in array

testsuite/
    * g++.dg/devirt-array-destructor-1.C: New.
    * g++.dg/devirt-array-destructor-2.C: New.
    * g++.dg/warn/pr83054.C: Change expected number of devirtualized calls


On Wed, Jul 12, 2023 at 5:02 PM Xi Ruoyao  wrote:

> On Wed, 2023-07-12 at 16:58 +0800, Ng YongXiang via Gcc-patches wrote:
> > I'm writing to seek for a review for an issue I filed some time ago.
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110057 . A proposed patch
> is
> > attached in the bug tracker as well.
>
> You should send the patch to gcc-patches@gcc.gnu.org for a review, see
> https://gcc.gnu.org/contribute.html for the details.  Generally we
> consider patches attached in bugzilla as drafts.
>
> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University
>
From aafa45669695520c26504479eb3f21d61ea81edb Mon Sep 17 00:00:00 2001
From: yongxiangng 
Date: Sat, 3 Jun 2023 00:36:32 +0800
Subject: [PATCH] Devirtualize auto generated destructor calls of arrays

---
 gcc/cp/init.cc|  8 +++---
 .../g++.dg/devirt-array-destructor-1.C| 27 ++
 .../g++.dg/devirt-array-destructor-2.C| 28 +++
 gcc/testsuite/g++.dg/warn/pr83054.C   | 24 +++-
 4 files changed, 69 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/devirt-array-destructor-1.C
 create mode 100644 gcc/testsuite/g++.dg/devirt-array-destructor-2.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 6ccda365b04..69ab51d0a4b 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -4112,8 +4112,8 @@ build_vec_delete_1 (location_t loc, tree base, tree maxindex, tree type,
   if (type_build_dtor_call (type))
 	{
 	  tmp = build_delete (loc, ptype, base, sfk_complete_destructor,
-			  LOOKUP_NORMAL|LOOKUP_DESTRUCTOR, 1,
-			  complain);
+			  LOOKUP_NORMAL|LOOKUP_DESTRUCTOR|LOOKUP_NONVIRTUAL,
+			  1, complain);
 	  if (tmp == error_mark_node)
 	return error_mark_node;
 	}
@@ -4143,8 +4143,8 @@ build_vec_delete_1 (location_t loc, tree base, tree maxindex, tree type,
 return error_mark_node;
   body = build_compound_expr (loc, body, tmp);
   tmp = build_delete (loc, ptype, tbase, sfk_complete_destructor,
-		  LOOKUP_NORMAL|LOOKUP_DESTRUCTOR, 1,
-		  complain);
+		  LOOKUP_NORMAL|LOOKUP_DESTRUCTOR|LOOKUP_NONVIRTUAL,
+		  1, complain);
   if (tmp == error_mark_node)
 return error_mark_node;
   body = build_compound_expr (loc, body, tmp);
diff --git a/gcc/testsuite/g++.dg/devirt-array-destructor-1.C b/gcc/testsuite/g++.dg/devirt-array-destructor-1.C
new file mode 100644
index 00000000000..be2d16ae761
--- /dev/null
+++ b/gcc/testsuite/g++.dg/devirt-array-destructor-1.C
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* Virtual calls should be devirtualized because we know dynamic type of object in array at compile time */
+/* { dg-options "-O3 -fdump-tree-optimized -fno-inline"  } */
+
+class A
+{
+public:
+  virtual ~A()
+  {
+  }
+};
+
+class B : public A
+{
+public:
+  virtual ~B()
+  {
+  }
+};
+
+int main()
+{
+  B b[10];
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "OBJ_TYPE_REF" 0 "optimized"} } */
diff --git a/gcc/testsuite/g++.dg/devirt-array-destructor-2.C b/gcc/testsuite/g++.dg/devirt-array-destructor-2.C
new file mode 100644
index 00000000000..0b3ab2ca9d0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/devirt-array-destructor-2.C
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* Virtual calls should be devirtualized because we know dynamic type of object in array at compile time */
+/* { dg-options "-O3 -fdump-tree-optimized -fno-inline"  } */
+
+class A
+{
+public:
+  virtual ~A()
+  {
+  }
+};
+
+class B : public A
+{
+public:
+  virtual ~B()
+  {
+  }
+};
+
+int main()
+{
+  B* ptr = new B[10];
+  delete[] ptr;
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "OBJ_TYPE_REF" 0 "optimized"} } */
diff --git a/gcc/testsuite/g++.dg/warn/pr83054.C b/gcc/testsuite/g++.dg/warn/pr83054.C
index 5285f94acee..7cd0951713d 100644
--- a/gcc/testsuite/g++.dg/warn/pr83054.C
+++ b/gcc/testsuite/g++.dg/warn/pr83054.C
@@ -10,7 +10,7 @@
 #endif
 
 extern "C" int printf (const char *, ...);
-struct foo // { dg-warning "final would enable devirtualization of 5 calls" }
+struct foo // { dg-warning "final would enable devirtualization of 1 call" }
 {
   static int count;
   void print (int i, int j) { printf ("foo[%d][%d] = %d\n", i, j, x); }
@@ -29,19 +29,15 @@ int foo::count;
 
 int main ()
 {
-  {
-foo array[3][3];
-for (int i = 0; i < 3; i++)
-  {
-	for (int j = 0; j < 3; j++)
-	  {
-	printf("&a[%d][%d] = %x\n", i, j, (void *)&arra

Re: [PATCH] riscv: thead: Fix failing XTheadCondMov tests (indirect-rv[32|64])

2023-07-12 Thread Kito Cheng via Gcc-patches
Ok

On Wed, Jul 12, 2023 at 22:08, Philipp Tomsich wrote:

> On Wed, 12 Jul 2023 at 16:05, Jeff Law  wrote:
>
> >
> >
> > On 7/12/23 06:48, Christoph Müllner wrote:
> > > On Wed, Jul 12, 2023 at 4:05 AM Jeff Law wrote:
> > >>
> > >>
> > >>
> > >> On 7/10/23 22:44, Christoph Muellner wrote:
> > >>> From: Christoph Müllner
> > >>>
> > >>> Recently, two identical XTheadCondMov tests have been added, which
> > >>> both fail.
> > >>> Let's fix that by changing the following:
> > >>> * Merge both files into one (no need for separate tests for rv32 and
> > >>> rv64)
> > >>> * Drop unrelated attribute check test (we already test for `th.mveqz`
> > >>> and `th.mvnez` instructions, so there is little additional value)
> > >>> * Fix the pattern to allow matching
> > >>>
> > >>> gcc/testsuite/ChangeLog:
> > >>>
> > >>>* gcc.target/riscv/xtheadcondmov-indirect-rv32.c: Moved to...
> > >>>* gcc.target/riscv/xtheadcondmov-indirect.c: ...here.
> > >>>* gcc.target/riscv/xtheadcondmov-indirect-rv64.c: Removed.
> > >> I thought this stuff got fixed recently.  Certainly happy to see the
> > >> files merged though.  Here's what I got from the July 4 run:
> > >
> > > I have the following with a GCC master from today
> > > (a454325bea77a0dd79415480d48233a7c296bc0a):
> > >
> > > FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2
> > > scan-assembler .attribute arch,
> > > "rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> > > FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2
> > > scan-assembler .attribute arch,
> > > "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> > >
> > > With this patch the fails are gone.
> > Then it's fine with me :-)
>
>
> For the avoidance of all doubt: could I hear an "OK"?
>
> Thanks,
> Philipp.
>


Re: [PATCH] riscv: thead: Fix failing XTheadCondMov tests (indirect-rv[32|64])

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 08:07, Philipp Tomsich wrote:



> On Wed, 12 Jul 2023 at 16:05, Jeff Law wrote:
>
> > On 7/12/23 06:48, Christoph Müllner wrote:
> > > On Wed, Jul 12, 2023 at 4:05 AM Jeff Law wrote:
> > >>
> > >> On 7/10/23 22:44, Christoph Muellner wrote:
> > >>> From: Christoph Müllner
> > >>>
> > >>> Recently, two identical XTheadCondMov tests have been added, which
> > >>> both fail.
> > >>> Let's fix that by changing the following:
> > >>> * Merge both files into one (no need for separate tests for rv32 and
> > >>> rv64)
> > >>> * Drop unrelated attribute check test (we already test for `th.mveqz`
> > >>> and `th.mvnez` instructions, so there is little additional value)
> > >>> * Fix the pattern to allow matching
> > >>>
> > >>> gcc/testsuite/ChangeLog:
> > >>>
> > >>>        * gcc.target/riscv/xtheadcondmov-indirect-rv32.c: Moved to...
> > >>>        * gcc.target/riscv/xtheadcondmov-indirect.c: ...here.
> > >>>        * gcc.target/riscv/xtheadcondmov-indirect-rv64.c: Removed.
> > >> I thought this stuff got fixed recently.  Certainly happy to see the
> > >> files merged though.  Here's what I got from the July 4 run:
> > >
> > > I have the following with a GCC master from today
> > > (a454325bea77a0dd79415480d48233a7c296bc0a):
> > >
> > > FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2
> > > scan-assembler .attribute arch,
> > > "rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> > > FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2
> > > scan-assembler .attribute arch,
> > > "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> > >
> > > With this patch the fails are gone.
> > Then it's fine with me :-)
>
> For the avoidance of all doubt: could I hear an "OK"?

OK for the trunk.
jeff


Re: [PATCH] riscv: thead: Fix failing XTheadCondMov tests (indirect-rv[32|64])

2023-07-12 Thread Philipp Tomsich
On Wed, 12 Jul 2023 at 16:05, Jeff Law  wrote:

>
>
> On 7/12/23 06:48, Christoph Müllner wrote:
> > On Wed, Jul 12, 2023 at 4:05 AM Jeff Law  wrote:
> >>
> >>
> >>
> >> On 7/10/23 22:44, Christoph Muellner wrote:
> >>> From: Christoph Müllner 
> >>>
> >>> Recently, two identical XTheadCondMov tests have been added, which
> >>> both fail.
> >>> Let's fix that by changing the following:
> >>> * Merge both files into one (no need for separate tests for rv32 and
> >>> rv64)
> >>> * Drop unrelated attribute check test (we already test for `th.mveqz`
> >>> and `th.mvnez` instructions, so there is little additional value)
> >>> * Fix the pattern to allow matching
> >>>
> >>> gcc/testsuite/ChangeLog:
> >>>
> >>>* gcc.target/riscv/xtheadcondmov-indirect-rv32.c: Moved to...
> >>>* gcc.target/riscv/xtheadcondmov-indirect.c: ...here.
> >>>* gcc.target/riscv/xtheadcondmov-indirect-rv64.c: Removed.
> >> I thought this stuff got fixed recently.  Certainly happy to see the
> >> files merged though.  Here's what I got from the July 4 run:
> >
> > I have the following with a GCC master from today
> > (a454325bea77a0dd79415480d48233a7c296bc0a):
> >
> > FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2
> > scan-assembler .attribute arch,
> > "rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> > FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2
> > scan-assembler .attribute arch,
> > "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> >
> > With this patch the fails are gone.
> Then it's fine with me :-)


For the avoidance of all doubt: could I hear an "OK"?

Thanks,
Philipp.


Re: [PATCH] riscv: thead: Fix failing XTheadCondMov tests (indirect-rv[32|64])

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 06:48, Christoph Müllner wrote:
> On Wed, Jul 12, 2023 at 4:05 AM Jeff Law wrote:
>> On 7/10/23 22:44, Christoph Muellner wrote:
>>> From: Christoph Müllner
>>>
>>> Recently, two identical XTheadCondMov tests have been added, which both fail.
>>> Let's fix that by changing the following:
>>> * Merge both files into one (no need for separate tests for rv32 and rv64)
>>> * Drop unrelated attribute check test (we already test for `th.mveqz`
>>> and `th.mvnez` instructions, so there is little additional value)
>>> * Fix the pattern to allow matching
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>    * gcc.target/riscv/xtheadcondmov-indirect-rv32.c: Moved to...
>>>    * gcc.target/riscv/xtheadcondmov-indirect.c: ...here.
>>>    * gcc.target/riscv/xtheadcondmov-indirect-rv64.c: Removed.
>> I thought this stuff got fixed recently.  Certainly happy to see the
>> files merged though.  Here's what I got from the July 4 run:
>
> I have the following with a GCC master from today
> (a454325bea77a0dd79415480d48233a7c296bc0a):
>
> FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv32.c   -O2
> scan-assembler .attribute arch,
> "rv32i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
> FAIL: gcc.target/riscv/xtheadcondmov-indirect-rv64.c   -O2
> scan-assembler .attribute arch,
> "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_xtheadcondmov1p0"
>
> With this patch the fails are gone.

Then it's fine with me :-)

jeff
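
For reference, a merged test of the kind the patch description outlines might look roughly like the sketch below. This is not the file from the patch: the function names are made up, and whether each function really yields the named instruction at a given optimization level would have to be checked against the compiler. The point is only that the final checks match instruction mnemonics rather than the `.attribute arch` string, so one file can serve both rv32 and rv64.

```c
/* { dg-do compile } */
/* { dg-options "-march=rv32gc_xtheadcondmov" { target { rv32 } } } */
/* { dg-options "-march=rv64gc_xtheadcondmov" { target { rv64 } } } */

/* Hypothetical helper: select A when COND is zero, otherwise B.  */
int
select_if_zero (int cond, int a, int b)
{
  return cond == 0 ? a : b;
}

/* Hypothetical helper: select A when COND is nonzero, otherwise B.  */
int
select_if_nonzero (int cond, int a, int b)
{
  return cond != 0 ? a : b;
}

/* Match the mnemonics only; no dependence on the arch attribute string.  */
/* { dg-final { scan-assembler "th.mveqz" } } */
/* { dg-final { scan-assembler "th.mvnez" } } */
```

The DejaGnu directives are ordinary comments to the compiler, so the file also builds as plain C on any host.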


Re: [PATCH] tree-optimization/94864 - vector insert of vector extract simplification

2023-07-12 Thread Jeff Law via Gcc-patches




On 7/12/23 07:36, Richard Biener via Gcc-patches wrote:

> The PRs ask for optimizing of
>
>    _1 = BIT_FIELD_REF ;
>    result_4 = BIT_INSERT_EXPR ;
>
> to a vector permutation.  The following implements this as
> match.pd pattern, improving code generation on x86_64.
>
> On the RTL level we face the issue that backend patterns inconsistently
> use vec_merge and vec_select of vec_concat to represent permutes.
>
> I think using a (supported) permute is almost always better
> than an extract plus insert, maybe excluding the case we extract
> element zero and that's aliased to a register that can be used
> directly for insertion (not sure how to query that).

So for a target with aliases at the register level, I'd bet they're 
already aware of the aliasing and are prepared to deal with it in the 
target (and are probably already trying to take advantage of that quirk 
when possible).


So I'd just punt that problem to the targets.  If it turns out to be 
common, then we can try to address it, probably at the gimple->rtl border.
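
To make the transformation under discussion concrete, here is a small sketch using GCC's generic vector extensions. It is not taken from the patch or the PR testcases; the type and function names are only illustrative. The first function is the extract-plus-insert shape, the second writes the same result as a single two-input permute.

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Extract element 1 of SRC and insert it into element 2 of DST.  */
static v4si
extract_insert (v4si dst, v4si src)
{
  dst[2] = src[1];
  return dst;
}

/* The same result as one two-input permute.  Indices 0..3 select from
   DST and 4..7 from SRC, so index 5 names SRC[1].  */
static v4si
as_permute (v4si dst, v4si src)
{
#ifdef __clang__
  return __builtin_shufflevector (dst, src, 0, 1, 5, 3);
#else
  const v4si sel = { 0, 1, 5, 3 };
  return __builtin_shuffle (dst, src, sel);
#endif
}
```

Both functions return the same vector for any inputs; which form produces better code on a given target is exactly the question raised above.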


jeff



Re: [PATCH 0/9] Add btf_decl_tag C attribute

2023-07-12 Thread Jose E. Marchesi via Gcc-patches


> On Wed, Jul 12, 2023 at 2:44 PM Jose E. Marchesi
>  wrote:
>>
>>
>> [Added Eduard Zingerman in CC, who is implementing this same feature in
>>  clang/llvm and also the consumer component in the kernel (pahole).]
>>
>> Hi Richard.
>>
>> > On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches
>> >  wrote:
>> >>
>> >> Hello,
>> >>
>> >> This series adds support for a new attribute, "btf_decl_tag" in GCC.
>> >> The same attribute is already supported in clang, and is used by various
>> >> components of the BPF ecosystem.
>> >>
>> >> The purpose of the attribute is to associate (to "tag")
>> >> declarations with arbitrary string annotations, which are emitted into
>> >> debugging information (DWARF and/or BTF) to facilitate post-compilation
>> >> analysis (the motivating use case being the Linux kernel BPF verifier).
>> >> Multiple tags are allowed on the same declaration.
>> >>
>> >> These strings are not interpreted by the compiler, and the attribute
>> >> itself has no effect on generated code, other than to produce additional
>> >> DWARF DIEs and/or BTF records conveying the annotations.
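
As an illustration of the usage described above, the sketch below tags a struct member and a function declaration, including two tags on one declaration. The tag strings, the `DECL_TAG` macro and the struct and function names are invented for the example. The macro expands to nothing on compilers that do not know the attribute, so the code builds either way.

```c
/* Fall back to an empty expansion when the attribute is unavailable.  */
#if defined (__has_attribute)
# if __has_attribute (btf_decl_tag)
#  define DECL_TAG(str) __attribute__ ((btf_decl_tag (str)))
# endif
#endif
#ifndef DECL_TAG
# define DECL_TAG(str)
#endif

struct conn_stats
{
  unsigned long rx_bytes DECL_TAG ("counter");
  /* Multiple tags may be attached to the same declaration.  */
  unsigned long tx_bytes DECL_TAG ("counter") DECL_TAG ("exported");
};

DECL_TAG ("entry_point")
int account_rx (struct conn_stats *stats, unsigned long n);

/* The tags do not change generated code; this behaves like any
   untagged function.  */
int
account_rx (struct conn_stats *stats, unsigned long n)
{
  stats->rx_bytes += n;
  return 0;
}
```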
>> >>
>> >> This entails:
>> >>
>> >> - A new C-language-level attribute which associates ("tags")
>> >>   particular declarations with arbitrary strings.
>> >>
>> >> - The conveyance of that information in DWARF in the form of a new DIE,
>> >>   DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
>> >>   that of the DW_TAG_LLVM_annotation extension supported in LLVM for
>> >>   the same purpose. These DIEs are already supported by BPF tooling,
>> >>   such as pahole.
>> >>
>> >> - The conveyance of that information in BTF debug info in the form of
>> >>   BTF_KIND_DECL_TAG records. These records are already supported by
>> >>   LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
>> >>   eBPF verifier.
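
For orientation, a BTF_KIND_DECL_TAG record can be pictured as in the sketch below, which follows the layout in the Linux UAPI BTF header as I understand it: the common type header followed by a single component index. The struct and function names here are made up for the example and are not the kernel's or GCC's.

```c
#include <stdint.h>

/* Kind number for declaration tags in the Linux UAPI BTF format.  */
#define BTF_KIND_DECL_TAG 17

/* Common header of a BTF type record (mirrors struct btf_type).  */
struct btf_type_hdr
{
  uint32_t name_off; /* Offset of the tag string in the string table.  */
  uint32_t info;     /* Bits 0-15 vlen, bits 24-28 kind, bit 31 kind_flag.  */
  uint32_t type;     /* Type id of the tagged declaration.  */
};

/* A decl tag record: the header plus one trailing field.  */
struct btf_decl_tag_rec
{
  struct btf_type_hdr hdr;
  int32_t component_idx; /* -1 for the declaration itself, otherwise the
                            index of the tagged member or argument.  */
};

/* Extract the kind from the info word.  */
static uint32_t
btf_info_kind (uint32_t info)
{
  return (info >> 24) & 0x1f;
}

/* Build a decl tag record with vlen 0 and kind_flag 0.  */
static struct btf_decl_tag_rec
make_decl_tag (uint32_t name_off, uint32_t target_type_id,
               int32_t component_idx)
{
  struct btf_decl_tag_rec rec;
  rec.hdr.name_off = name_off;
  rec.hdr.info = (uint32_t) BTF_KIND_DECL_TAG << 24;
  rec.hdr.type = target_type_id;
  rec.component_idx = component_idx;
  return rec;
}
```

A declaration carrying several tags would simply get one such record per tag, each pointing at the same target type id.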
>> >>
>> >>
>> >> Background
>> >> ==========
>> >>
>> >> The purpose of these tags is to convey additional semantic information
>> >> to post-compilation consumers, in particular the Linux kernel eBPF
>> >> verifier. The verifier can make use of that information while analyzing
>> >> a BPF program to aid in determining whether to allow or reject the
>> >> program to be run. More background on these tags can be found in the
>> >> early support for them in the kernel here [1] and [2].
>> >>
>> >> The "btf_decl_tag" attribute is half the story; the other half is a
>> >> sibling attribute "btf_type_tag" which serves the same purpose but
>> >> applies to types. Support for btf_type_tag will come in a separate
>> >> patch series, since it is impacted by GCC bug 110439 which needs to be
>> >> addressed first.
>> >>
>> >> I submitted an initial version of this work (including btf_type_tag)
>> >> last spring [3], however at the time there were some open questions
>> >> about the behavior of the btf_type_tag attribute and issues with its
>> >> implementation. Since then we have clarified these details and agreed
>> >> to solutions with the BPF community and LLVM BPF folks.
>> >>
>> >> The main motivation for emitting the tags in DWARF is that the Linux
>> >> kernel generates its BTF information via pahole, using DWARF as a source:
>> >>
>> >>     +--------+  BTF                  BTF   +----------+
>> >>     | pahole |---> vmlinux.btf ----------->| verifier |
>> >>     +--------+                             +----------+
>> >>         ^                                       ^
>> >>         |                                       |
>> >>   DWARF |                                   BTF |
>> >>         |                                       |
>> >>      vmlinux                             +-------------+
>> >>      module1.ko                          | BPF program |
>> >>      module2.ko                          +-------------+
>> >>        ...
>> >>
>> >> This is because:
>> >>
>> >> a)  pahole adds additional kernel-specific information into the
>> >> produced BTF based on additional analysis of kernel objects.
>> >>
>> >> b)  Unlike GCC, LLVM will only generate BTF for BPF programs.
>> >>
>> >> c)  GCC can generate BTF for whatever target with -gbtf, but there is no
>> >> support for linking/deduplicating BTF in the linker.
>> >>
>> >> In the scenario above, the verifier needs access to the pointer tags of
>> >> both the kernel types/declarations (conveyed in the DWARF and translated
>> >> to BTF by pahole) and those of the BPF program (available directly in BTF).
>> >>
>> >>
>> >> DWARF Representation
>> >> ====================
>> >>
>> >> As noted above, btf_decl_tag is represented in DWARF via a new DIE
>> >> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
>> >> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
>> >> the following format:
>> >>
>> >>   DW_TAG_GNU_annotation (0x6000)
>> >> DW_AT_name: "btf_decl_tag"
>> >> DW_AT_const_value: 
>> >>
>> >> These DIEs are placed in the DWARF tree as childre
