[PATCH v3] Modify gas uleb128 support test

2023-09-19 Thread mengqinggang
Some assemblers (GNU as for LoongArch) generate relocations for leb128
symbol arithmetic to support relaxation, so we need to disable relaxation
when probing leb128 support.
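
For context, a ULEB128 value occupies a variable number of bytes depending on
its magnitude, which is presumably why a `.uleb128 L2 - L1` whose operands can
still move under relaxation is left to a relocation. A minimal C sketch of the
encoding follows; the helper name `uleb128_encode` is hypothetical and is not
part of GCC or binutils.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode VALUE as ULEB128 into OUT (room for at
   least 10 bytes) and return the number of bytes written.  */
size_t
uleb128_encode (uint64_t value, unsigned char *out)
{
  size_t n = 0;
  do
    {
      unsigned char byte = value & 0x7f;  /* Low 7 payload bits.  */
      value >>= 7;
      if (value != 0)
        byte |= 0x80;                     /* More bytes follow.  */
      out[n++] = byte;
    }
  while (value != 0);
  return n;
}
```

With this sketch, 127 encodes in one byte and 128 in two, so the size of the
directive's output depends on the final value of the difference.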

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: Check assembler for -mno-relax support.
Disable relaxation when probing leb128 support.

co-authored-by: Xi Ruoyao 
---
 gcc/configure| 42 +-
 gcc/configure.ac | 17 -
 2 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/gcc/configure b/gcc/configure
index d5e218e9a16..2ceab4e3b9c 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -24676,6 +24676,46 @@ _ACEOF
 
 
 
+# Some assemblers (GNU as for LoongArch) generates relocations for
+# leb128 symbol arithmetic for relaxation, we need to disable relaxation
+# probing leb128 support then.
+case $target in
+  loongarch*-*-*)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for -mno-relax support" >&5
+$as_echo_n "checking assembler for -mno-relax support... " >&6; }
+if ${gcc_cv_as_mno_relax+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_mno_relax=no
+  if test x$gcc_cv_as != x; then
+$as_echo '.text' > conftest.s
+if { ac_try='$gcc_cv_as $gcc_cv_as_flags -mno-relax -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+then
+   gcc_cv_as_mno_relax=yes
+else
+  echo "configure: failed program was" >&5
+  cat conftest.s >&5
+fi
+rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_mno_relax" >&5
+$as_echo "$gcc_cv_as_mno_relax" >&6; }
+if test $gcc_cv_as_mno_relax = yes; then
+  check_leb128_asflags=-mno-relax
+fi
+
+;;
+  *)
+check_leb128_asflags=
+;;
+esac
+
 # Check if we have .[us]leb128, and support symbol arithmetic with it.
 # Older versions of GAS and some non-GNU assemblers, have a bugs handling
 # these directives, even when they appear to accept them.
@@ -24694,7 +24734,7 @@ L1:
 L2:
.uleb128 0x8000
 ' > conftest.s
-if { ac_try='$gcc_cv_as $gcc_cv_as_flags  -o conftest.o conftest.s >&5'
+if { ac_try='$gcc_cv_as $gcc_cv_as_flags $check_leb128_asflags -o conftest.o conftest.s >&5'
   { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
   (eval $ac_try) 2>&5
   ac_status=$?
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 019d0375a2f..d780ea25386 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3229,10 +3229,25 @@ AC_MSG_RESULT($gcc_cv_ld_ro_rw_mix)
 
 gcc_AC_INITFINI_ARRAY
 
+# Some assemblers (GNU as for LoongArch) generates relocations for
+# leb128 symbol arithmetic for relaxation, we need to disable relaxation
+# probing leb128 support then.
+case $target in
+  loongarch*-*-*)
+gcc_GAS_CHECK_FEATURE([-mno-relax support],
+  gcc_cv_as_mno_relax,[-mno-relax],[.text],,
+  [check_leb128_asflags=-mno-relax])
+;;
+  *)
+check_leb128_asflags=
+;;
+esac
+
 # Check if we have .[us]leb128, and support symbol arithmetic with it.
 # Older versions of GAS and some non-GNU assemblers, have a bugs handling
 # these directives, even when they appear to accept them.
-gcc_GAS_CHECK_FEATURE([.sleb128 and .uleb128], gcc_cv_as_leb128,,
+gcc_GAS_CHECK_FEATURE([.sleb128 and .uleb128], gcc_cv_as_leb128,
+[$check_leb128_asflags],
 [  .data
.uleb128 L2 - L1
 L1:
-- 
2.31.1



C++ patch ping

2023-09-19 Thread Jakub Jelinek via Gcc-patches
Hi!

I'd like to ping a couple of C++ patches.  All of them together
with the 2 updated patches posted yesterday have been
bootstrapped/regtested on x86_64-linux and i686-linux again yesterday.

- c++: Implement C++26 P2361R6 - Unevaluated strings [PR110342]
  https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628375.html

- c++: Implement C++26 P2741R3 - user-generated static_assert messages [PR110348]
  https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628378.html

- c++: Implement C++ DR 2406 - [[fallthrough]] attribute and iteration statements
  https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628487.html
  (from this one Richi approved the middle-end changes)

- c++: Implement C++26 P1854R4 - Making non-encodable string literals ill-formed [PR110341]
  https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628490.html

- libcpp, v2: Small incremental patch for P1854R4 [PR110341]
  https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628586.html

Thanks

Jakub



Patch ping: [PATCH] testsuite work-around compound-assignment-1.c C++ failures on various targets [PR111377]

2023-09-19 Thread Jakub Jelinek via Gcc-patches
Hi!

On Tue, Sep 12, 2023 at 09:02:55AM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Mon, Sep 11, 2023 at 11:11:30PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > On Mon, Sep 11, 2023 at 07:27:57PM +0200, Benjamin Priour via Gcc-patches wrote:
> > > Thanks for the report,
> > > 
> > > After investigation it seems the location of the new dejagnu directive for
> > > C++ differs depending on the configuration.
> > > The expected warning is still emitted, but its location differs slightly.
> > > I expect it to be not an issue per se of the analyzer, but a divergence in
> > > the FE between the two configurations.
> > 
> > I think the divergence is whether called_by_test_5b returns the struct
> > in registers or in memory.  If in memory (like in the x86_64 -m32 case), we have
> >   [compound-assignment-1.c:71:21] D.3191 = called_by_test_5b (); [return slot optimization]
> >   [compound-assignment-1.c:71:21 discrim 1] D.3191 ={v} {CLOBBER(eol)};
> >   [compound-assignment-1.c:72:1] return;
> > in the IL, while if in registers (like x86_64 -m64 case), just
> >   [compound-assignment-1.c:71:21] D.3591 = called_by_test_5b ();
> >   [compound-assignment-1.c:72:1] return;
> > 
> > If you just want to avoid the differences, putting } on the same line as the
> > call might be a usable workaround for that.
> 
> Here is the workaround in patch form.  Tested on x86_64-linux -m32/-m64, ok
> for trunk?

I'd like to ping this patch.

> 2023-09-12  Jakub Jelinek  
> 
>   PR testsuite/111377
>   * c-c++-common/analyzer/compound-assignment-1.c (test_5b): Move
>   closing } to the same line as the call to work-around differences in
>   diagnostics line.
> 
> --- gcc/testsuite/c-c++-common/analyzer/compound-assignment-1.c.jj	2023-09-11 11:05:47.523727789 +0200
> +++ gcc/testsuite/c-c++-common/analyzer/compound-assignment-1.c	2023-09-12 08:58:52.854231161 +0200
> @@ -68,5 +68,8 @@ called_by_test_5b (void)
>  
>  void test_5b (void)
>  {
> -  called_by_test_5b ();
> -} /* { dg-warning "leak of '.ptr_wrapper::ptr'" "" { target c++ } } */
> +  called_by_test_5b (); }
> +/* { dg-warning "leak of '.ptr_wrapper::ptr'" "" { target c++ } .-1 } */
> +/* The closing } above is intentionally on the same line as the call, because
> +   otherwise the exact line of the diagnostics depends on whether the
> +   called_by_test_5b () call satisfies aggregate_value_p or not.  */
> 
> 
>   Jakub

Jakub



[PATCH] v2: small _BitInt tweaks

2023-09-19 Thread Jakub Jelinek via Gcc-patches
Hi!

On Tue, Sep 12, 2023 at 05:27:30PM +, Joseph Myers wrote:
> On Tue, 12 Sep 2023, Jakub Jelinek via Gcc-patches wrote:
> 
> > And by ensuring we never create 1-bit signed BITINT_TYPE e.g. the backends
> > don't need to worry about them.
> > 
> > But I admit I don't feel strongly about that.
> > 
> > Joseph, what do you think about this?
> 
> I think it's appropriate to avoid 1-bit signed BITINT_TYPE consistently.

Here is a patch which does that.  In addition to the previously changed two
hunks it also adds a checking assertion that we don't create
signed _BitInt(0), unsigned _BitInt(0) or signed _BitInt(1) types.
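
To spell out what the added condition `precision >= 1 + !unsignedp` accepts,
here is a small stand-alone C sketch. The function `bitint_precision_ok` is a
hypothetical mirror of the assertion's condition only; it is not GCC code.

```c
#include <stdbool.h>

/* Hypothetical mirror of the condition in the new checking assertion:
   unsigned types need at least 1 bit, signed types at least 2.  */
bool
bitint_precision_ok (unsigned precision, bool unsignedp)
{
  return precision >= 1u + !unsignedp;
}
```

Under this reading, precision 0 is rejected for both signednesses, precision 1
is accepted only for unsigned, and precision 2 or more is accepted for both.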

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-09-18  Jakub Jelinek  

gcc/
* tree.cc (build_bitint_type): Assert precision is not 0, or
for signed types 1.
(signed_or_unsigned_type_for): Return INTEGER_TYPE for signed variant
of unsigned _BitInt(1).
gcc/c-family/
* c-common.cc (c_common_signed_or_unsigned_type): Return INTEGER_TYPE
for signed variant of unsigned _BitInt(1).

--- gcc/tree.cc.jj  2023-09-11 17:01:17.612714178 +0200
+++ gcc/tree.cc 2023-09-18 12:36:37.598912717 +0200
@@ -7179,6 +7179,8 @@ build_bitint_type (unsigned HOST_WIDE_IN
 {
   tree itype, ret;
 
+  gcc_checking_assert (precision >= 1 + !unsignedp);
+
   if (unsignedp)
 unsignedp = MAX_INT_CACHED_PREC + 1;
 
@@ -11096,7 +11098,7 @@ signed_or_unsigned_type_for (int unsigne
   else
 return NULL_TREE;
 
-  if (TREE_CODE (type) == BITINT_TYPE)
+  if (TREE_CODE (type) == BITINT_TYPE && (unsignedp || bits > 1))
 return build_bitint_type (bits, unsignedp);
   return build_nonstandard_integer_type (bits, unsignedp);
 }
--- gcc/c-family/c-common.cc.jj 2023-09-11 17:01:17.517715431 +0200
+++ gcc/c-family/c-common.cc2023-09-18 12:35:06.829126858 +0200
@@ -2739,7 +2739,9 @@ c_common_signed_or_unsigned_type (int un
   || TYPE_UNSIGNED (type) == unsignedp)
 return type;
 
-  if (TREE_CODE (type) == BITINT_TYPE)
+  if (TREE_CODE (type) == BITINT_TYPE
+  /* signed _BitInt(1) is invalid, avoid creating that.  */
+  && (unsignedp || TYPE_PRECISION (type) > 1))
 return build_bitint_type (TYPE_PRECISION (type), unsignedp);
 
 #define TYPE_OK(node)  \


Jakub



PING^5: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant

2023-09-19 Thread Xi Ruoyao via Gcc-patches
Ping^5.
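
For readers skimming the quoted patch below: it relies on narrow constants
being stored sign-extended, and on masking to recover the unsigned value. A
minimal C sketch of those two operations follows, using 64-bit integers as a
stand-in for HOST_WIDE_INT. The helper names `mask_to_prec` and
`sign_extend_prec` are hypothetical; they only imitate the effect of
`GET_MODE_MASK` and `trunc_int_for_mode` and are not the GCC functions.

```c
#include <stdint.h>

/* Hypothetical helper: low PREC bits of VALUE, zero-extended (the effect
   of "& GET_MODE_MASK (mode)" for a PREC-bit mode).  */
uint64_t
mask_to_prec (int64_t value, unsigned prec)
{
  uint64_t mask = prec >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << prec) - 1;
  return (uint64_t) value & mask;
}

/* Hypothetical helper: sign-extend the low PREC bits of VALUE, i.e. the
   sign-extended normal form of a PREC-bit constant.  */
int64_t
sign_extend_prec (int64_t value, unsigned prec)
{
  uint64_t sign = UINT64_C (1) << (prec - 1);
  uint64_t low = mask_to_prec (value, prec);
  return (int64_t) ((low ^ sign) - sign);
}
```

For an 8-bit mode, the unsigned constant 0x80 is held as -128, so comparing
the raw 64-bit value against `1 << 7` fails unless it is masked first.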

> > > On Thu, Aug 10, 2023 at 03:04:03PM +0200, Stefan Schulze Frielinghaus wrote:
> > > > In the former fix in commit 41ef5a34161356817807be3a2e51fbdbe575ae85 I
> > > > completely missed the fact that the normal form of a generated constant
> > > > for a mode with fewer bits than in HOST_WIDE_INT is a sign extended
> > > > version of the actual constant.  This even holds true for unsigned
> > > > constants.
> > > > 
> > > > Fixed by masking out the upper bits for the incoming constant and sign
> > > > extending the resulting unsigned constant.
> > > > 
> > > > Bootstrapped and regtested on x64 and s390x.  Ok for mainline?
> > > > 
> > > > While reading existing optimizations in combine I stumbled across two
> > > > optimizations where either my intuition about the representation of
> > > > unsigned integers via a const_int rtx is wrong, which then in turn would
> > > > probably also mean that this patch is wrong, or that the optimizations
> > > > are missed sometimes.  In other words in the following I would assume
> > > > that the upper bits are masked out:
> > > > 
> > > > diff --git a/gcc/combine.cc b/gcc/combine.cc
> > > > index 468b7fde911..80c4ff0fbaf 100644
> > > > --- a/gcc/combine.cc
> > > > +++ b/gcc/combine.cc
> > > > @@ -11923,7 +11923,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
> > > >    /* (unsigned) < 0x8000 is equivalent to >= 0.  */
> > > >    else if (is_a  (mode, &int_mode)
> > > >    && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
> > > > -  && ((unsigned HOST_WIDE_INT) const_op
> > > > +  && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
> > > >    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
> > > >     {
> > > >   const_op = 0;
> > > > @@ -11962,7 +11962,7 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
> > > >    /* (unsigned) >= 0x8000 is equivalent to < 0.  */
> > > >    else if (is_a  (mode, &int_mode)
> > > >    && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
> > > > -  && ((unsigned HOST_WIDE_INT) const_op
> > > > +  && (((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
> > > >    == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
> > > >     {
> > > >   const_op = 0;
> > > > 
> > > > For example, while bootstrapping on x64 the optimization is missed since
> > > > a LTU comparison in QImode is done and the constant equals
> > > > 0xff80.
> > > > 
> > > > Sorry for inlining another patch, but I would really like to make sure
> > > > that my understanding is correct, now, before I come up with another
> > > > patch.  Thus it would be great if someone could shed some light on this.
> > > > 
> > > > gcc/ChangeLog:
> > > > 
> > > > * combine.cc (simplify_compare_const): Properly handle unsigned
> > > > constants while narrowing comparison of memory and constants.
> > > > ---
> > > >  gcc/combine.cc | 19 ++-
> > > >  1 file changed, 10 insertions(+), 9 deletions(-)
> > > > 
> > > > diff --git a/gcc/combine.cc b/gcc/combine.cc
> > > > index e46d202d0a7..468b7fde911 100644
> > > > --- a/gcc/combine.cc
> > > > +++ b/gcc/combine.cc
> > > > @@ -12003,14 +12003,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
> > > >    && !MEM_VOLATILE_P (op0)
> > > >    /* The optimization makes only sense for constants which are big enough
> > > >  so that we have a chance to chop off something at all.  */
> > > > -  && (unsigned HOST_WIDE_INT) const_op > 0xff
> > > > -  /* Bail out, if the constant does not fit into INT_MODE.  */
> > > > -  && (unsigned HOST_WIDE_INT) const_op
> > > > -    < ((HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1) << 1) - 1)
> > > > +  && ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode)) > 0xff
> > > >    /* Ensure that we do not overflow during normalization.  */
> > > > -  && (code != GTU || (unsigned HOST_WIDE_INT) const_op < HOST_WIDE_INT_M1U))
> > > > +  && (code != GTU
> > > > + || ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
> > > > +    < HOST_WIDE_INT_M1U)
> > > > +  && trunc_int_for_mode (const_op, int_mode) == const_op)
> > > >  {
> > > > -  unsigned HOST_WIDE_INT n = (unsigned HOST_WIDE_INT) const_op;
> > > > +  unsigned HOST_WIDE_INT n
> > > > +   = (unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode);
> > > >    enum rtx_code adjusted_code;
> > > >  
> > > >    /* Normalize code to either LEU or GEU.  */
> > > > @@ -12051,15 +12052,15 @@ simplify_compare_const (enum rtx_code code, machine_mode mode,
> > > > HOST_WI

[committed] libgomp: Handle NULL environ like pointer to NULL pointer [PR111413]

2023-09-19 Thread Jakub Jelinek via Gcc-patches
Hi!

clearenv function just sets environ to NULL (after sometimes freeing it),
rather than setting it to a pointer to NULL, and our code was assuming
it is always non-NULL.

Fixed thusly, the change seems to be large but actually is just
+  if (environ)
 for (env = environ; *env != 0; env++)
plus reindentation.  I've also noticed the block after this for loop
was badly indented (too much) and fixed that too.
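
The shape of the fix, treating a NULL `environ` like an empty environment, can
be sketched in isolation as follows. The function `count_omp_vars` and its
`envp` parameter are hypothetical and exist only for illustration; the real
change is in `initialize_env` as shown in the diff further down.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the entries of ENVP that start with "OMP_".
   A NULL ENVP (as left behind by clearenv) is treated as empty.  */
size_t
count_omp_vars (char **envp)
{
  size_t n = 0;
  if (envp)
    for (char **env = envp; *env != NULL; env++)
      if (strncmp (*env, "OMP_", 4) == 0)
        n++;
  return n;
}
```

Passing the environment as a parameter keeps the sketch testable without
calling clearenv; without the `if (envp)` guard, the loop condition would
dereference a NULL pointer.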

No testcase added, as it needs clearenv + dlopen.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2023-09-19  Jakub Jelinek  

PR libgomp/111413
* env.c (initialize_env): Don't dereference environ if it is NULL.
Reindent.

--- libgomp/env.c.jj2023-06-20 08:57:39.095494805 +0200
+++ libgomp/env.c   2023-09-18 10:53:21.976636936 +0200
@@ -2224,139 +2224,140 @@ initialize_env (void)
   none = gomp_get_initial_icv_item (GOMP_DEVICE_NUM_FOR_NO_SUFFIX);
   initialize_icvs (&none->icvs);
 
-  for (env = environ; *env != 0; env++)
-{
-  if (!startswith (*env, "OMP_"))
-   continue;
-
- /* Name of the environment variable without suffix "OMP_".  */
- char *name = *env + sizeof ("OMP_") - 1;
- for (omp_var = 0; omp_var < OMP_VAR_CNT; omp_var++)
-   {
- if (startswith (name, envvars[omp_var].name))
-   {
- pos = envvars[omp_var].name_len;
- if (name[pos] == '=')
-   {
- pos++;
- flag_var_addr
-   = add_initial_icv_to_list (GOMP_DEVICE_NUM_FOR_NO_SUFFIX,
-  envvars[omp_var].flag_vars[0],
-  params);
-   }
- else if (startswith (&name[pos], "_DEV=")
-  && envvars[omp_var].flag & GOMP_ENV_SUFFIX_DEV)
-   {
- pos += 5;
- flag_var_addr
-   = add_initial_icv_to_list (GOMP_DEVICE_NUM_FOR_DEV,
-  envvars[omp_var].flag_vars[0],
-  params);
-   }
- else if (startswith (&name[pos], "_ALL=")
-  && envvars[omp_var].flag & GOMP_ENV_SUFFIX_ALL)
-   {
- pos += 5;
- flag_var_addr
-   = add_initial_icv_to_list (GOMP_DEVICE_NUM_FOR_ALL,
-  envvars[omp_var].flag_vars[0],
-  params);
-   }
- else if (startswith (&name[pos], "_DEV_")
-  && envvars[omp_var].flag & GOMP_ENV_SUFFIX_DEV_X)
-   {
- pos += 5;
- if (!get_device_num (*env, &name[pos], &dev_num,
-  &dev_num_len))
-   break;
+  if (environ)
+for (env = environ; *env != 0; env++)
+  {
+   if (!startswith (*env, "OMP_"))
+ continue;
 
- pos += dev_num_len + 1;
- flag_var_addr
-   = add_initial_icv_to_list (dev_num,
-  envvars[omp_var].flag_vars[0],
-  params);
-   }
- else
-   {
- gomp_error ("Invalid environment variable in %s", *env);
- break;
-   }
- env_val = &name[pos];
-
- if (envvars[omp_var].parse_func (*env, env_val, params))
-   {
- for (i = 0; i < 3; ++i)
-   if (envvars[omp_var].flag_vars[i])
- gomp_set_icv_flag (flag_var_addr,
-envvars[omp_var].flag_vars[i]);
-   else
+   /* Name of the environment variable without suffix "OMP_".  */
+   char *name = *env + sizeof ("OMP_") - 1;
+   for (omp_var = 0; omp_var < OMP_VAR_CNT; omp_var++)
+ {
+   if (startswith (name, envvars[omp_var].name))
+ {
+   pos = envvars[omp_var].name_len;
+   if (name[pos] == '=')
+ {
+   pos++;
+   flag_var_addr
+ = add_initial_icv_to_list (GOMP_DEVICE_NUM_FOR_NO_SUFFIX,
+envvars[omp_var].flag_vars[0],
+params);
+ }
+   else if (startswith (&name[pos], "_DEV=")
+&& envvars[omp_var].flag & GOMP_ENV_SUFFIX_DEV)
+ {
+   pos += 5;
+   flag_var_addr
+ = add_initial_icv_to_list (GOMP_DEVICE_NUM_FOR_DEV,
+envvars[omp_var].flag_vars[0],
+params);
+ }
+   else if (startswith (&n

Re: [PATCH] v2: small _BitInt tweaks

2023-09-19 Thread Richard Biener via Gcc-patches
On Tue, 19 Sep 2023, Jakub Jelinek wrote:

> Hi!
> 
> On Tue, Sep 12, 2023 at 05:27:30PM +, Joseph Myers wrote:
> > On Tue, 12 Sep 2023, Jakub Jelinek via Gcc-patches wrote:
> > 
> > > And by ensuring we never create 1-bit signed BITINT_TYPE e.g. the backends
> > > don't need to worry about them.
> > > 
> > > But I admit I don't feel strongly about that.
> > > 
> > > Joseph, what do you think about this?
> > 
> > I think it's appropriate to avoid 1-bit signed BITINT_TYPE consistently.
> 
> Here is a patch which does that.  In addition to the previously changed two
> hunks it also adds a checking assertion that we don't create
> signed _BitInt(0), unsigned _BitInt(0) or signed _BitInt(1) types.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2023-09-18  Jakub Jelinek  
> 
> gcc/
>   * tree.cc (build_bitint_type): Assert precision is not 0, or
>   for signed types 1.
>   (signed_or_unsigned_type_for): Return INTEGER_TYPE for signed variant
>   of unsigned _BitInt(1).
> gcc/c-family/
>   * c-common.cc (c_common_signed_or_unsigned_type): Return INTEGER_TYPE
>   for signed variant of unsigned _BitInt(1).
> 
> --- gcc/tree.cc.jj2023-09-11 17:01:17.612714178 +0200
> +++ gcc/tree.cc   2023-09-18 12:36:37.598912717 +0200
> @@ -7179,6 +7179,8 @@ build_bitint_type (unsigned HOST_WIDE_IN
>  {
>tree itype, ret;
>  
> +  gcc_checking_assert (precision >= 1 + !unsignedp);
> +
>if (unsignedp)
>  unsignedp = MAX_INT_CACHED_PREC + 1;
>  
> @@ -11096,7 +11098,7 @@ signed_or_unsigned_type_for (int unsigne
>else
>  return NULL_TREE;
>  
> -  if (TREE_CODE (type) == BITINT_TYPE)
> +  if (TREE_CODE (type) == BITINT_TYPE && (unsignedp || bits > 1))
>  return build_bitint_type (bits, unsignedp);
>return build_nonstandard_integer_type (bits, unsignedp);
>  }
> --- gcc/c-family/c-common.cc.jj   2023-09-11 17:01:17.517715431 +0200
> +++ gcc/c-family/c-common.cc  2023-09-18 12:35:06.829126858 +0200
> @@ -2739,7 +2739,9 @@ c_common_signed_or_unsigned_type (int un
>|| TYPE_UNSIGNED (type) == unsignedp)
>  return type;
>  
> -  if (TREE_CODE (type) == BITINT_TYPE)
> +  if (TREE_CODE (type) == BITINT_TYPE
> +  /* signed _BitInt(1) is invalid, avoid creating that.  */
> +  && (unsignedp || TYPE_PRECISION (type) > 1))
>  return build_bitint_type (TYPE_PRECISION (type), unsignedp);
>  
>  #define TYPE_OK(node)
> \
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[PATCH] match.pd: Some build_nonstandard_integer_type tweaks

2023-09-19 Thread Jakub Jelinek via Gcc-patches
Hi!

As discussed earlier, using build_nonstandard_integer_type blindly for all
INTEGRAL_TYPE_Ps is problematic now that we have BITINT_TYPE, because it
always creates an INTEGER_TYPE with some possibly very large precision.
The following patch attempts to deal with 3 such spots in match.pd, others
still need looking at.

In the first case, I think it is quite expensive/undesirable to create
a non-standard INTEGER_TYPE with possibly huge precision and then
immediately just see type_has_mode_precision_p being false for it, or even
worse introducing a cast to TImode or OImode or XImode INTEGER_TYPE which
nothing will be able to actually handle.  128-bit or 64-bit (on 32-bit
targets) types are the largest supported by the backend, so the following
patch avoids creating and matching conversions to larger types.  It is
an optimization anyway, so it should only be used when it is cheap.
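
As I read the first hunk, the rewrite in question turns `(x << c) >> c` on a
signed value into a conversion through a narrower signed type of
`precision - c` bits. A C sketch for a 32-bit value and c = 24 follows; the
function names are hypothetical, and the casts rely on the usual
two's-complement, arithmetic-shift behaviour of GCC rather than on guarantees
of ISO C.

```c
#include <stdint.h>

/* Shift form: move the low 8 bits to the top, then shift back
   arithmetically.  The left shift is done on an unsigned copy to avoid
   signed-overflow undefined behaviour in this illustration.  */
int32_t
shl_shr_24 (int32_t x)
{
  return (int32_t) ((uint32_t) x << 24) >> 24;
}

/* Conversion form: truncate to an 8-bit signed type, then widen.  */
int32_t
narrow_to_8 (int32_t x)
{
  return (int32_t) (int8_t) x;
}
```

Both forms sign-extend the low 8 bits. For a huge `_BitInt` precision, the
intermediate narrow type could itself be very wide, which is the case the
patch declines to handle.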

In the second hunk, I believe the uses of build_nonstandard_integer_type
aren't useful at all.  It is when matching a ? -1 : 0 and trying to express
it as say -(type) (bool) a etc., but this is all GIMPLE only, where most of
integral types with same precision/signedness are compatible and we know
-1 is representable in that type, so I really don't see any reason not to
perform the negation of a [0, 1] valued expression in type, rather
than doing it in
build_nonstandard_integer_type (TYPE_PRECISION (type), TYPE_UNSIGNED (type))
(except that it breaks the BITINT_TYPEs).  I don't think we need to do
something like range_check_type.
While in there, I've also noticed it was using a (with {
tree booltrue = constant_boolean_node (true, boolean_type_node);
} and removed that + replaced uses of booltrue with boolean_true_node
which the above function always returns.
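
The identity behind the second hunk, `a ? -1 : 0` being the negation of the
[0, 1] valued truth value of `a` computed directly in the result type, can be
checked with a small C sketch. The function names are hypothetical, and plain
`int` stands in for `type`.

```c
#include <stdbool.h>

/* Original form of the expression.  */
int
select_form (int a)
{
  return a ? -1 : 0;
}

/* Rewritten form: convert to bool, widen to the result type, negate.  */
int
negate_form (int a)
{
  return -(int) (bool) a;
}
```

Since the boolean is either 0 or 1, its negation is either 0 or -1, and -1 is
representable in any signed integral type, so no intermediate type is needed.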

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-09-19  Jakub Jelinek  

* match.pd ((x << c) >> c): Don't call build_nonstandard_integer_type
nor check type_has_mode_precision_p for width larger than [TD]Imode
precision.
(a ? CST1 : CST2): Don't use build_nonstandard_integer_type, just convert
to type.  Use boolean_true_node instead of
constant_boolean_node (true, boolean_type_node).  Formatting fixes.

--- gcc/match.pd.jj 2023-09-18 10:37:56.002965361 +0200
+++ gcc/match.pd2023-09-18 12:14:32.321631010 +0200
@@ -4114,9 +4114,13 @@ (define_operator_list SYNC_FETCH_AND_AND
(if (INTEGRAL_TYPE_P (type))
 (with {
   int width = element_precision (type) - tree_to_uhwi (@1);
-  tree stype = build_nonstandard_integer_type (width, 0);
+  tree stype = NULL_TREE;
+  scalar_int_mode mode = (targetm.scalar_mode_supported_p (TImode)
+ ? TImode : DImode);
+  if (width <= GET_MODE_PRECISION (mode))
+   stype = build_nonstandard_integer_type (width, 0);
  }
- (if (width == 1 || type_has_mode_precision_p (stype))
+ (if (stype && (width == 1 || type_has_mode_precision_p (stype)))
   (convert (convert:stype @0
 
 /* Optimize x >> x into 0 */
@@ -5092,49 +5096,24 @@ (define_operator_list SYNC_FETCH_AND_AND
 /* a ? -1 : 0 -> -a.  No need to check the TYPE_PRECISION not being 1
here as the powerof2cst case above will handle that case correctly.  */
 (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
+ (negate (convert:type (convert:boolean_type_node @0))
+  (if (integer_zerop (@1))
+   (switch
+/* a ? 0 : 1 -> !a. */
+(if (integer_onep (@2))
+ (convert (bit_xor (convert:boolean_type_node @0) { boolean_true_node; })))
+/* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
+(if (INTEGRAL_TYPE_P (type) && integer_pow2p (@2))
  (with {
-   auto prec = TYPE_PRECISION (type);
-   auto unsign = TYPE_UNSIGNED (type);
-   tree inttype = build_nonstandard_integer_type (prec, unsign);
+   tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
   }
-  (convert (negate (convert:inttype (convert:boolean_type_node @0
-  (if (integer_zerop (@1))
-   (with {
-  tree booltrue = constant_boolean_node (true, boolean_type_node);
-}
-(switch
- /* a ? 0 : 1 -> !a. */
- (if (integer_onep (@2))
-  (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } )))
- /* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
- (if (INTEGRAL_TYPE_P (type) &&  integer_pow2p (@2))
-  (with {
-   tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
-   }
-   (lshift (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } ))
-{ shift; })))
- /* a ? -1 : 0 -> -(!a).  No need to check the TYPE_PRECISION not being 1
+  (lshift (convert (bit_xor (convert:boolean_type_node @0)
+   { boolean_true_node; })) { shift; })))
+/* a ? -1 : 0 -> -(!a).  No need to check the TYPE_PRECISION not being 1
here as the powerof2cst case above will handle that case correctl

Re: Re: [PATCH v1] RISC-V: Fix one ICE for vect test vect-multitypes-5

2023-09-19 Thread juzhe.zh...@rivai.ai
Thanks for reporting it.

Could you try this and verify for me?

-  rtx src_op_0 = XEXP (src, 0);
-
-  if (GET_CODE (src) == CONST && GET_CODE (src_op_0) == PLUS
-&& CONST_POLY_INT_P (XEXP (src_op_0, 1)))
+  if (GET_CODE (src) == CONST && GET_CODE (XEXP (src, 0)) == PLUS
+&& CONST_POLY_INT_P (XEXP (XEXP (src, 0), 1)))
 {
   rtx dest_tmp = gen_reg_rtx (mode);
   rtx tmp = gen_reg_rtx (mode);

-  riscv_emit_move (dest, XEXP (src_op_0, 0));
-  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (src_op_0, 1));
+  riscv_emit_move (dest, XEXP (XEXP (src, 0), 0));
+  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (XEXP (src, 0), 
1));

If it can fix your issue, please send a patch and commit it.
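
The rewrite above moves the `XEXP (src, 0)` access behind the
`GET_CODE (src) == CONST` test, so the operand is only fetched once the
short-circuit `&&` has established the code. A toy C illustration of that
ordering follows; the `struct node` type, the `operand` accessor and the
failure counter are all hypothetical stand-ins, not GCC's rtx or its RTL
checking machinery.

```c
#include <stddef.h>

enum code { SYMBOL, CONST_WRAP, PLUS };  /* Hypothetical node codes.  */

struct node
{
  enum code code;
  struct node *op[2];  /* Only meaningful for CONST_WRAP and PLUS.  */
};

int checked_accesses_failed;

/* Hypothetical checked accessor: records a failure when asked for an
   operand of a node kind that has none.  */
struct node *
operand (struct node *n, int i)
{
  if (n->code == SYMBOL)
    {
      checked_accesses_failed++;
      return NULL;
    }
  return n->op[i];
}

/* Guarded form: operand () is reached only after the code test passes.  */
int
is_const_plus (struct node *src)
{
  return src->code == CONST_WRAP && operand (src, 0)->code == PLUS;
}
```

In this sketch, calling `is_const_plus` on a SYMBOL node never reaches the
checked accessor, whereas fetching the operand unconditionally beforehand
would trip it.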

Thanks.



juzhe.zh...@rivai.ai
 
From: Patrick O'Neill
Date: 2023-09-19 01:38
To: Li, Pan2; Kito Cheng
CC: gcc-patches@gcc.gnu.org; Wang, Yanzhang; juzhe.zh...@rivai.ai; Palmer Dabbelt
Subject: Re: [PATCH v1] RISC-V: Fix one ICE for vect test vect-multitypes-5
Hi,
 
After this patch, there is now an ICE when bootstrapping with
--enable-checking=rtl on rv32gc.
 
More details:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111461
 
Thanks,
Patrick
 
On 8/29/23 07:40, Li, Pan2 via Gcc-patches wrote:
> Committed, thanks Kito.
>
> Pan
>
> -Original Message-
> From: Kito Cheng 
> Sent: Tuesday, August 29, 2023 9:46 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; Wang, Yanzhang ; 
> juzhe.zh...@rivai.ai
> Subject: Re: [PATCH v1] RISC-V: Fix one ICE for vect test vect-multitypes-5
>
> LGTM, thanks :)
>
> On Tue, Aug 29, 2023 at 6:50 PM Pan Li via Gcc-patches
>  wrote:
>> From: Pan Li 
>>
>> There will be one ICE when building vect-multitypes-5.c, similar to the below:
>>
>> riscv64-unknown-elf-gcc -O3 \
>>-march=rv64imafdcv -mabi=lp64d -mcmodel=medlow \
>>-fdiagnostics-plain-output -flto -ffat-lto-objects \
>>--param riscv-autovec-preference=scalable -Wno-psabi \
>>-ftree-vectorize -fno-tree-loop-distribute-patterns \
>>-fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-details \
>>gcc/testsuite/gcc.dg/vect/vect-multitypes-5.c -o test.elf -lm
>>
>> The below RTL is not well handled in riscv_legitimize_const_move, and
>> then falls through to the default pass. Then the
>> default force_const_mem will return NULL_RTX, and there will be an ICE
>> when operating on the NULL_RTX.
>>
>> (const:DI
>>(plus:DI
>>  (symbol_ref:DI ("ic") [flags 0x2] )
>>  (const_poly_int:DI [16, 16])))
>>
>> This patch would like to take care of this rtl in riscv_legitimize_const_move.
>>
>> Signed-off-by: Pan Li 
>> Co-Authored-By: Ju-Zhe Zhong 
>>
>> gcc/ChangeLog:
>>
>>  * config/riscv/riscv.cc (riscv_legitimize_poly_move): New 
>> declaration.
>>  (riscv_legitimize_const_move): Handle ref plus const poly.
>> ---
>>   gcc/config/riscv/riscv.cc | 23 +++
>>   1 file changed, 23 insertions(+)
>>
>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> index 1d6e278ea90..bab6ed70b2d 100644
>> --- a/gcc/config/riscv/riscv.cc
>> +++ b/gcc/config/riscv/riscv.cc
>> @@ -366,6 +366,7 @@ static const struct riscv_tune_param 
>> optimize_size_tune_info = {
>>
>>   static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool 
>> *);
>>   static tree riscv_handle_type_attribute (tree *, tree, tree, int, bool *);
>> +static void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
>>
>>   /* Defining target-specific uses of __attribute__.  */
>>   static const struct attribute_spec riscv_attribute_table[] =
>> @@ -2118,6 +2119,28 @@ riscv_legitimize_const_move (machine_mode mode, rtx 
>> dest, rtx src)
>> return;
>>   }
>>
>> +  /* Handle below format.
>> + (const:DI
>> +   (plus:DI
>> +(symbol_ref:DI ("ic") [flags 0x2] ) <- 
>> op_0
>> +(const_poly_int:DI [16, 16]) // <- op_1
>> + ))
>> +   */
>> +  rtx src_op_0 = XEXP (src, 0);
>> +
>> +  if (GET_CODE (src) == CONST && GET_CODE (src_op_0) == PLUS
>> +&& CONST_POLY_INT_P (XEXP (src_op_0, 1)))
>> +{
>> +  rtx dest_tmp = gen_reg_rtx (mode);
>> +  rtx tmp = gen_reg_rtx (mode);
>> +
>> +  riscv_emit_move (dest, XEXP (src_op_0, 0));
>> +  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (src_op_0, 1));
>> +
>> +  emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, dest, dest_tmp)));
>> +  return;
>> +}
>> +
>> src = force_const_mem (mode, src);
>>
>> /* When using explicit relocs, constant pool references are sometimes
>> --
>> 2.34.1
>>
 


[Committed] RISC-V: Support integer FMA/FNMA VLS modes autovectorization

2023-09-19 Thread Juzhe-Zhong
Simply extend the current VLA iterator and patterns.

Regression passed with no difference.
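
For orientation, the two patterns touched here compute `op1 * op2 + op3` (fma)
and `op3 - op1 * op2` (fnma) on integer vectors. Scalar C loops of the
following shape are the kind of source that maps onto them. This is only a
sketch: the function names and the trip count `N_ELEMS` are hypothetical, the
contents of the new fma-*.c and fnma-*.c tests are not shown in this message,
and whether a given loop is actually vectorized with VLS modes depends on
compiler options not covered here.

```c
#include <stdint.h>

#define N_ELEMS 16  /* Hypothetical fixed trip count.  */

/* dst[i] = a[i] * b[i] + c[i], the shape matched by the fma pattern.  */
void
fma_i32 (int32_t *restrict dst, const int32_t *restrict a,
         const int32_t *restrict b, const int32_t *restrict c)
{
  for (int i = 0; i < N_ELEMS; i++)
    dst[i] = a[i] * b[i] + c[i];
}

/* dst[i] = c[i] - a[i] * b[i], the shape matched by the fnma pattern.  */
void
fnma_i32 (int32_t *restrict dst, const int32_t *restrict a,
          const int32_t *restrict b, const int32_t *restrict c)
{
  for (int i = 0; i < N_ELEMS; i++)
    dst[i] = c[i] - a[i] * b[i];
}
```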

gcc/ChangeLog:

* config/riscv/autovec.md: Add VLS modes.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS FMA/FNMA test.
* gcc.target/riscv/rvv/autovec/vls/fma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-4.c: New test.

---
 gcc/config/riscv/autovec.md   |  24 +-
 gcc/config/riscv/vector.md| 296 +-
 .../gcc.target/riscv/rvv/autovec/vls/def.h|  18 ++
 .../gcc.target/riscv/rvv/autovec/vls/fma-1.c  |  45 +++
 .../gcc.target/riscv/rvv/autovec/vls/fma-2.c  |  43 +++
 .../gcc.target/riscv/rvv/autovec/vls/fma-3.c  |  41 +++
 .../gcc.target/riscv/rvv/autovec/vls/fma-4.c  |  39 +++
 .../gcc.target/riscv/rvv/autovec/vls/fnma-1.c |  45 +++
 .../gcc.target/riscv/rvv/autovec/vls/fnma-2.c |  43 +++
 .../gcc.target/riscv/rvv/autovec/vls/fnma-3.c |  41 +++
 .../gcc.target/riscv/rvv/autovec/vls/fnma-4.c |  39 +++
 11 files changed, 514 insertions(+), 160 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fma-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fma-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fma-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fma-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnma-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnma-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnma-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnma-4.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index ac7599f3e0a..1aadb6eea1f 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1079,12 +1079,12 @@
 ;; -
 
 (define_insn_and_split "fma4"
-  [(set (match_operand:VI 0 "register_operand")
-   (plus:VI
- (mult:VI
-   (match_operand:VI 1 "register_operand")
-   (match_operand:VI 2 "register_operand"))
- (match_operand:VI 3 "register_operand")))]
+  [(set (match_operand:V_VLSI 0 "register_operand")
+   (plus:V_VLSI
+ (mult:V_VLSI
+   (match_operand:V_VLSI 1 "register_operand")
+   (match_operand:V_VLSI 2 "register_operand"))
+ (match_operand:V_VLSI 3 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
@@ -1107,12 +1107,12 @@
 ;; -
 
 (define_insn_and_split "fnma4"
-  [(set (match_operand:VI 0 "register_operand")
-(minus:VI
-  (match_operand:VI 3 "register_operand")
-  (mult:VI
-(match_operand:VI 1 "register_operand")
-(match_operand:VI 2 "register_operand"]
+  [(set (match_operand:V_VLSI 0 "register_operand")
+(minus:V_VLSI
+  (match_operand:V_VLSI 3 "register_operand")
+  (mult:V_VLSI
+(match_operand:V_VLSI 1 "register_operand")
+(match_operand:V_VLSI 2 "register_operand"]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index c7c6ec3d6f1..c5a1c9061c4 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -5144,8 +5144,8 @@
 ;; 
---
 
 (define_expand "@pred_mul_plus"
-  [(set (match_operand:VI 0 "register_operand")
-   (if_then_else:VI
+  [(set (match_operand:V_VLSI 0 "register_operand")
+   (if_then_else:V_VLSI
  (unspec:
[(match_operand: 1 "vector_mask_operand")
 (match_operand 6 "vector_length_operand")
@@ -5154,20 +5154,20 @@
 (match_operand 9 "const_int_operand")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (plus:VI
-   (mult:VI
- (match_operand:VI 2 "register_operand")
- (match_operand:VI 3 "register_operand"))
-   (match_operand:VI 4 "register_operand"))
- (match_operand:VI 5 "vector_merge_operand")))]
+ (plus:V_VLSI
+   (mult:V_VLSI
+ (match_operand:V_VLSI 2 "register_operand")
+ (match_operand:V_VLSI 3 "register_operand"))
+   (match_operand:V_VLSI 4 "register_operand"))
+ (match_operand:V_VLSI 5 "vector_merge

Re: gcc-patches From rewriting mailman settings (Was: [Linaro-TCWG-CI] gcc patch #75674: FAIL: 68 regressions)

2023-09-19 Thread Mark Wielaard
Hi all,

On Sun, Sep 17, 2023 at 10:04:37PM +0200, Mark Wielaard wrote:
> > We (Jeff or anyone else with mailman admin privs) could use the same
> > settings for gcc-patches. The settings that need to be set are in that
> > bug:
> > 
> > - subject_prefix (general): (empty)
> > - from_is_list (general): No
> > - anonymous_list (general): No
> > - first_strip_reply_to (general): No
> > - reply_goes_to_list (general): Poster
> > - reply_to_address (general): (empty)
> > - include_sender_header (general): No
> > - drop_cc (general): No
> > - msg_header (nondigest): (empty)
> > - msg_footer (nondigest): (empty)
> > - scrub_nondigest (nondigest): No
> > - dmarc_moderation_action (privacy): Accept
> > - filter_content (contentfilter): No
> > 
> > The only visible change (apart from no more From rewriting) is that
> > HTML multi-parts aren't scrubbed anymore (that would be a message
> > altering issue). The html part is still scrubbed from the
> > inbox.sourceware.org archive, so b4 works just fine. But I don't know
> > what patchwork.sourceware.org does with HTML attachments. Of course
> > people really shouldn't send HTML attachments to gcc-patches, so maybe
> > this is no real problem.
> 
> Although there were some positive responses (on list and on irc) it is
> sometimes hard to know if there really is consensus for this kind of
> infrastructure tweaks. But I believe there is at least no sustained
> opposition to changing the gcc-patches mailman setting as proposed
> above.
> 
> So unless someone objects I would like to make this change Tuesday September
> 19 around 08:00 UTC.

This change is now done for gcc-patches.

> And if there are no complaints at Cauldron we could do the same for
> the other patch lists the week after.
> 
> > > [1] https://patchwork.sourceware.org/project/gcc/list/
> > > [2] https://sourceware.org/bugzilla/show_bug.cgi?id=29713


Re: [PATCH] match.pd: Some build_nonstandard_integer_type tweaks

2023-09-19 Thread Richard Biener
On Tue, 19 Sep 2023, Jakub Jelinek wrote:

> Hi!
> 
> As discussed earlier, using build_nonstandard_integer_type blindly for all
> INTEGRAL_TYPE_Ps is problematic now that we have BITINT_TYPE, because it
> always creates an INTEGRAL_TYPE with some possibly very large precision.
> The following patch attempts to deal with 3 such spots in match.pd, others
> still need looking at.
> 
> In the first case, I think it is quite expensive/undesirable to create
> a non-standard INTEGER_TYPE with possibly huge precision and then
> immediately just see type_has_mode_precision_p being false for it, or even
> worse introducing a cast to TImode or OImode or XImode INTEGER_TYPE which
> nothing will be able to actually handle.  128-bit or 64-bit (on 32-bit
> targets) types are the largest supported by the backend, so the following
> patch avoids creating and matching conversions to larger types, it is
> an optimization anyway and so should be used when it is cheap that way.
> 
> In the second hunk, I believe the uses of build_nonstandard_integer_type
> aren't useful at all.  It is when matching a ? -1 : 0 and trying to express
> it as say -(type) (bool) a etc., but this is all GIMPLE only, where most of
> integral types with same precision/signedness are compatible and we know
> -1 is representable in that type, so I really don't see any reason not to
> perform the negation of a [0, 1] valued expression in type, rather
> than doing it in
> build_nonstandard_integer_type (TYPE_PRECISION (type), TYPE_UNSIGNED (type))
> (except that it breaks the BITINT_TYPEs).  I don't think we need to do
> something like range_check_type.
> While in there, I've also noticed it was using a (with {
> tree booltrue = constant_boolean_node (true, boolean_type_node);
> } and removed that + replaced uses of booltrue with boolean_true_node
> which the above function always returns.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.
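
For readers following the quoted description, the integer identities behind the two hunks can be checked in plain C++. This is only an illustrative sketch, not code from the patch: the function names are made up for the example, and fixed-width uint32_t/int32_t stand in for the GIMPLE types that match.pd operates on.

```cpp
#include <cassert>
#include <cstdint>

// (x << c) >> c on an unsigned 32-bit value keeps only the low (32 - c)
// bits.  For c == 16 that equals a conversion through a 16-bit unsigned
// type, which is the (convert (convert:stype @0)) form of the first hunk.
inline uint32_t shift_pair(uint32_t x) { return (x << 16) >> 16; }
inline uint32_t via_narrow_type(uint32_t x) {
  return static_cast<uint32_t>(static_cast<uint16_t>(x));
}

// a ? -1 : 0 equals the negation of the [0, 1] valued boolean, performed
// directly in the result type.
inline int32_t select_all_ones(int32_t a) { return a ? -1 : 0; }
inline int32_t negate_bool(int32_t a) {
  return -static_cast<int32_t>(static_cast<bool>(a));
}

// a ? 0 : 1 equals the boolean XORed with true.
inline int32_t select_zero_one(int32_t a) { return a ? 0 : 1; }
inline int32_t xor_true(int32_t a) {
  return static_cast<int32_t>(static_cast<bool>(a) ^ true);
}

// a ? 0 : 8 equals the inverted boolean shifted left by log2 (8) == 3.
inline int32_t select_zero_pow2(int32_t a) { return a ? 0 : 8; }
inline int32_t shifted_not(int32_t a) {
  return static_cast<int32_t>(static_cast<bool>(a) ^ true) << 3;
}

// Hypothetical helper: compare each pair of forms for one input value.
inline void check_identities(int32_t a) {
  assert(shift_pair(static_cast<uint32_t>(a))
         == via_narrow_type(static_cast<uint32_t>(a)));
  assert(select_all_ones(a) == negate_bool(a));
  assert(select_zero_one(a) == xor_true(a));
  assert(select_zero_pow2(a) == shifted_not(a));
}
```

Calling check_identities with values such as 0, 1 and -5 exercises both arms of each conditional. None of the right-hand forms needs an integer type other than the result type and a boolean.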

> 2023-09-19  Jakub Jelinek  
> 
>   * match.pd ((x << c) >> c): Don't call build_nonstandard_integer_type
>   nor check type_has_mode_precision_p for width larger than [TD]Imode
>   precision.
>   (a ? CST1 : CST2): Don't use build_nonstandard_type, just convert
>   to type.  Use boolean_true_node instead of
>   constant_boolean_node (true, boolean_type_node).  Formatting fixes.
> 
> --- gcc/match.pd.jj   2023-09-18 10:37:56.002965361 +0200
> +++ gcc/match.pd  2023-09-18 12:14:32.321631010 +0200
> @@ -4114,9 +4114,13 @@ (define_operator_list SYNC_FETCH_AND_AND
> (if (INTEGRAL_TYPE_P (type))
>  (with {
>int width = element_precision (type) - tree_to_uhwi (@1);
> -  tree stype = build_nonstandard_integer_type (width, 0);
> +  tree stype = NULL_TREE;
> +  scalar_int_mode mode = (targetm.scalar_mode_supported_p (TImode)
> +   ? TImode : DImode);
> +  if (width <= GET_MODE_PRECISION (mode))
> + stype = build_nonstandard_integer_type (width, 0);
>   }
> - (if (width == 1 || type_has_mode_precision_p (stype))
> + (if (stype && (width == 1 || type_has_mode_precision_p (stype)))
>(convert (convert:stype @0
>  
>  /* Optimize x >> x into 0 */
> @@ -5092,49 +5096,24 @@ (define_operator_list SYNC_FETCH_AND_AND
>  /* a ? -1 : 0 -> -a.  No need to check the TYPE_PRECISION not being 1
> here as the powerof2cst case above will handle that case correctly.  */
>  (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
> + (negate (convert:type (convert:boolean_type_node @0))
> +  (if (integer_zerop (@1))
> +   (switch
> +/* a ? 0 : 1 -> !a. */
> +(if (integer_onep (@2))
> + (convert (bit_xor (convert:boolean_type_node @0) { boolean_true_node; })))
> +/* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
> +(if (INTEGRAL_TYPE_P (type) && integer_pow2p (@2))
>   (with {
> -   auto prec = TYPE_PRECISION (type);
> -   auto unsign = TYPE_UNSIGNED (type);
> -   tree inttype = build_nonstandard_integer_type (prec, unsign);
> +   tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
>}
> -  (convert (negate (convert:inttype (convert:boolean_type_node @0
> -  (if (integer_zerop (@1))
> -   (with {
> -  tree booltrue = constant_boolean_node (true, boolean_type_node);
> -}
> -(switch
> - /* a ? 0 : 1 -> !a. */
> - (if (integer_onep (@2))
> -  (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } )))
> - /* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
> - (if (INTEGRAL_TYPE_P (type) &&  integer_pow2p (@2))
> -  (with {
> - tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
> -   }
> -   (lshift (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } ))
> -{ shift; })))
> - /* a ? -1 : 0 -> -(!a).  No need to check the TYPE_PRECISION not being 1
> +  (lshift (convert (bit_xor (convert:

[RFC 1/2] RISC-V: Add support for _Bfloat16.

2023-09-19 Thread Jin Ma
gcc/ChangeLog:

* config/riscv/iterators.md (HFBF): New.
* config/riscv/riscv-builtins.cc (riscv_init_builtin_types):
Initialize data type _Bfloat16.
* config/riscv/riscv-modes.def (FLOAT_MODE): New.
(ADJUST_FLOAT_FORMAT): New.
* config/riscv/riscv.cc (riscv_mangle_type): Support for BFmode.
(riscv_scalar_mode_supported_p): Ditto.
(riscv_libgcc_floating_mode_supported_p): Ditto.
(riscv_block_arith_comp_libfuncs_for_mode): New.
(riscv_init_libfuncs): Opening and closing some libfuncs for BFmode.
* config/riscv/riscv.md (mode" ): Add BF.
(truncdfbf2): New.
(movhf): Support for BFmode.
(mov): Ditto.
(*mov_softfloat):  Ditto.
(fix_truncbf2): New.
(fixuns_truncbf2): New.
(floatbf2): New.
(floatunsbf2): New.

libgcc/ChangeLog:

* config/riscv/sfp-machine.h (_FP_NANFRAC_B): New.
(_FP_NANSIGN_B): New.
* config/riscv/t-softfp32: Add support for BF libfuncs.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/bf16_arithmetic.c: New test.
* gcc.target/riscv/bf16_call.c: New test.
* gcc.target/riscv/bf16_comparisons.c: New test.
* gcc.target/riscv/bf16_convert-1.c: New test.
* gcc.target/riscv/bf16_convert-2.c: New test.
* gcc.target/riscv/bf16_convert_run.c: New test.
---
 gcc/config/riscv/iterators.md |   2 +
 gcc/config/riscv/riscv-builtins.cc|  16 ++
 gcc/config/riscv/riscv-modes.def  |   4 +
 gcc/config/riscv/riscv.cc |  93 --
 gcc/config/riscv/riscv.md |  94 --
 .../gcc.target/riscv/bf16_arithmetic.c|  36 
 gcc/testsuite/gcc.target/riscv/bf16_call.c|  17 ++
 .../gcc.target/riscv/bf16_comparisons.c   |  25 +++
 .../gcc.target/riscv/bf16_convert-1.c |  39 +
 .../gcc.target/riscv/bf16_convert-2.c |  38 
 .../gcc.target/riscv/bf16_convert_run.c   | 163 ++
 libgcc/config/riscv/sfp-machine.h |   3 +
 libgcc/config/riscv/t-softfp32|   7 +-
 13 files changed, 503 insertions(+), 34 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_arithmetic.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_call.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_comparisons.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_convert-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_convert-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/bf16_convert_run.c

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..73523b73fdd 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -84,6 +84,8 @@ (define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")])
 ;; instruction.
 (define_mode_attr size [(QI "b") (HI "h")])
 
+(define_mode_iterator HFBF [HF BF])
+
 ;; Mode attributes for loads.
 (define_mode_attr load [(QI "lb") (HI "lh") (SI "lw") (DI "ld") (HF "flh") (SF "flw") (DF "fld")])
 
diff --git a/gcc/config/riscv/riscv-builtins.cc b/gcc/config/riscv/riscv-builtins.cc
index 3fe3a89dcc2..b7bb89794f7 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -192,6 +192,7 @@ static GTY(()) int riscv_builtin_decl_index[NUM_INSN_CODES];
   riscv_builtin_decls[riscv_builtin_decl_index[(CODE)]]
 
 tree riscv_float16_type_node = NULL_TREE;
+tree riscv_bfloat16_type_node = NULL_TREE;
 
 /* Return the function type associated with function prototype TYPE.  */
 
@@ -235,6 +236,21 @@ riscv_init_builtin_types (void)
   if (!maybe_get_identifier ("_Float16"))
 lang_hooks.types.register_builtin_type (riscv_float16_type_node,
"_Float16");
+
+  /* Provide the _Bfloat16 type and bfloat16_type_node if needed.  */
+  if (!bfloat16_type_node)
+{
+  riscv_bfloat16_type_node = make_node (REAL_TYPE);
+  TYPE_PRECISION (riscv_bfloat16_type_node) = 16;
+  SET_TYPE_MODE (riscv_bfloat16_type_node, BFmode);
+  layout_type (riscv_bfloat16_type_node);
+}
+  else
+riscv_bfloat16_type_node = bfloat16_type_node;
+
+  if (!maybe_get_identifier ("_Bfloat16"))
+lang_hooks.types.register_builtin_type (riscv_bfloat16_type_node,
+   "_Bfloat16");
 }
 
 /* Implement TARGET_INIT_BUILTINS.  */
diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def
index e3c6ccb2809..723bfaee42d 100644
--- a/gcc/config/riscv/riscv-modes.def
+++ b/gcc/config/riscv/riscv-modes.def
@@ -22,6 +22,10 @@ along with GCC; see the file COPYING3.  If not see
 FLOAT_MODE (HF, 2, ieee_half_format);
 FLOAT_MODE (TF, 16, ieee_quad_format);
 
+FLOAT_MODE (BF, 2, 0);
+/* Reuse definition from arm.  */
+ADJUST_FLOAT_FORMAT (BF, &arm_bfloat_half_format);
+
 /* Vector modes.  */
 
 /* Encode the ratio of SEW/

[RFC 2/2] RISC-V: Add 'Zfbfmin' extension.

2023-09-19 Thread Jin Ma
This patch adds the 'Zfbfmin' extension for RISC-V, based on the bfloat16 spec:
https://github.com/riscv/riscv-bfloat16/commit/5578e34e15a44e9ad13246072a29f51274b4d999

The 'Zfbfmin' extension of binutils-gdb (REVIEW ONLY):
https://sourceware.org/pipermail/binutils/2023-August/128773.html

The 'Zfbfmin' extension of qemu:
https://github.com/qemu/qemu/commit/5d1270caac2ef7b8c887d4cb5a2444ba6d237516

Because binutils does not yet support the 'Zfbfmin' extension, the test case
zfbfmin_convert_run.c is disabled with '#if 0' and '#endif'.
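
As background for the conversion patterns listed below (truncsfbf2, extendbfsf2), here is a minimal software sketch of the two conversions. It assumes only the well-known layout of bfloat16 as the upper 16 bits of an IEEE binary32 value and uses round-to-nearest-even for narrowing; the function names are hypothetical, and the sketch does not model the dynamic rounding mode or exception flags of the actual instructions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Widen a bfloat16 bit pattern to float: the 16 bits become the upper half
// of the binary32 encoding, so the conversion is exact.
inline float bf16_bits_to_float(uint16_t h) {
  uint32_t bits = static_cast<uint32_t>(h) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Narrow a float to a bfloat16 bit pattern with round-to-nearest-even.
inline uint16_t float_to_bf16_bits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  if ((bits & 0x7fffffffu) > 0x7f800000u)   // NaN: keep it a quiet NaN
    return static_cast<uint16_t>((bits >> 16) | 0x0040u);
  uint32_t lsb = (bits >> 16) & 1u;         // ties go to the even result
  bits += 0x7fffu + lsb;
  return static_cast<uint16_t>(bits >> 16);
}

// Hypothetical self-check: values with at most 8 significant bits survive
// a round trip unchanged.
inline void bf16_self_check() {
  assert(float_to_bf16_bits(1.0f) == 0x3f80);
  assert(bf16_bits_to_float(float_to_bf16_bits(0.5f)) == 0.5f);
}
```

Widening is a plain shift, which is why only the narrowing direction has to make a rounding decision.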

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add 'Zfbfmin' extension.
* config/riscv/riscv-opts.h (MASK_ZFBFMIN): New.
(TARGET_ZFBFMIN): New.
* config/riscv/riscv.cc (riscv_output_move): Enable FMV.X.H and FMV.H.X
for 'Zfbfmin' extension.
(riscv_excess_precision): Likewise.
* config/riscv/riscv.md (truncsfbf2): New.
(extendbfsf2): New.
(*mov_hardfloat): Support for BFmode.
(*mov_softfloat): Disable for BFmode when 'Zfbfmin' extension is
enabled.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zfbfmin_arithmetic.c: New test.
* gcc.target/riscv/zfbfmin_call.c: New test.
* gcc.target/riscv/zfbfmin_comparisons.c: New test.
* gcc.target/riscv/zfbfmin_convert.c: New test.
* gcc.target/riscv/zfbfmin_convert_run.c: New test.
* gcc.target/riscv/zfbfmin_fsh_and_flh.c: New test.
---
 gcc/common/config/riscv/riscv-common.cc   |   3 +
 gcc/config/riscv/riscv-opts.h |   2 +
 gcc/config/riscv/riscv.cc |   4 +-
 gcc/config/riscv/riscv.md |  40 ++--
 .../gcc.target/riscv/zfbfmin_arithmetic.c |  31 
 gcc/testsuite/gcc.target/riscv/zfbfmin_call.c |  17 ++
 .../gcc.target/riscv/zfbfmin_comparisons.c|  22 +++
 .../gcc.target/riscv/zfbfmin_convert.c|  38 
 .../gcc.target/riscv/zfbfmin_convert_run.c| 173 ++
 .../gcc.target/riscv/zfbfmin_fsh_and_flh.c|  12 ++
 10 files changed, 329 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin_arithmetic.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin_call.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin_comparisons.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin_convert.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin_convert_run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin_fsh_and_flh.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 9a0a68fe5db..1fcbb862aa4 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -123,6 +123,7 @@ static const riscv_implied_info_t riscv_implied_info[] =
 
   {"zfh", "zfhmin"},
   {"zfhmin", "f"},
+  {"zfbfmin", "f"},
 
   {"zfa", "f"},
 
@@ -284,6 +285,7 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"zfhmin",ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvfhmin",   ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvfh",  ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zfbfmin", ISA_SPEC_CLASS_NONE, 0, 8},
 
   {"zfa", ISA_SPEC_CLASS_NONE, 0, 1},
 
@@ -1461,6 +1463,7 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"zfh",   &gcc_options::x_riscv_zf_subext, MASK_ZFH},
   {"zvfhmin",   &gcc_options::x_riscv_zf_subext, MASK_ZVFHMIN},
   {"zvfh",  &gcc_options::x_riscv_zf_subext, MASK_ZVFH},
+  {"zfbfmin",  &gcc_options::x_riscv_zf_subext, MASK_ZFBFMIN},
 
   {"zfa",   &gcc_options::x_riscv_zfa_subext, MASK_ZFA},
 
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index a525f679683..900a46fcae0 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -256,11 +256,13 @@ enum riscv_entity
 #define MASK_ZFH  (1 << 1)
 #define MASK_ZVFHMIN  (1 << 2)
 #define MASK_ZVFH (1 << 3)
+#define MASK_ZFBFMIN  (1 << 4)
 
 #define TARGET_ZFHMIN  ((riscv_zf_subext & MASK_ZFHMIN) != 0)
 #define TARGET_ZFH ((riscv_zf_subext & MASK_ZFH) != 0)
 #define TARGET_ZVFHMIN ((riscv_zf_subext & MASK_ZVFHMIN) != 0)
 #define TARGET_ZVFH((riscv_zf_subext & MASK_ZVFH) != 0)
+#define TARGET_ZFBFMIN((riscv_zf_subext & MASK_ZFBFMIN) != 0)
 
 #define MASK_ZMMUL  (1 << 0)
 #define TARGET_ZMMUL((riscv_zm_subext & MASK_ZMMUL) != 0)
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 910523ee2b9..6362c3f83c8 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3372,7 +3372,7 @@ riscv_output_move (rtx dest, rtx src)
switch (width)
  {
  case 2:
-   if (TARGET_ZFHMIN)
+   if (TARGET_ZFHMIN || TARGET_ZFBFMIN)
  return "fmv.x.h\t%0,%1";
/* Using fmv.x.s + sign-extend to emulate fmv.x.h.  */
return "fmv.x.s\t%0,%1;slli\t%0,%0,16;srai\t%0,%0,16";
@@ -3428,7 +3428,7 @@ riscv_output_move 

[PING] More '#ifdef ASM_OUTPUT_DEF' -> 'if (TARGET_SUPPORTS_ALIASES)' etc. (was: [PATCH][v2] Introduce TARGET_SUPPORTS_ALIASES)

2023-09-19 Thread Thomas Schwinge
Hi!

Ping.


Regards
 Thomas


On 2023-09-08T14:02:50+0200, I wrote:
> Hi!
>
> On 2017-08-10T15:42:13+0200, Jan Hubicka  wrote:
>>> On 07/31/2017 11:57 AM, Yuri Gribov wrote:
>>> > On Mon, Jul 31, 2017 at 9:04 AM, Martin Liška  wrote:
>>> >> Doing the transformation suggested by Honza.
>
> ... which was:
>
> | On 2017-07-24T16:06:22+0200, Jan Hubicka  wrote:
> | > we probably should turn ASM_OUTPUT_DEF ifdefs into a conditional compilation
> | > incrementally.
>
>>> >From 78ee08b25d22125cb1fa248bac98ef1e84504761 Mon Sep 17 00:00:00 2001
>>> From: marxin 
>>> Date: Tue, 25 Jul 2017 13:11:28 +0200
>>> Subject: [PATCH] Introduce TARGET_SUPPORTS_ALIASES
>
> ..., and got pushed as commit a8b522b483ebb8c972ecfde8779a7a6ec16aecd6
> (Subversion r251048) "Introduce TARGET_SUPPORTS_ALIASES".
>
> I don't know if that was actually intentional here, or just an
> "accident", but such changes actually allow that a back end may or may
> not provide symbol aliasing support ('TARGET_SUPPORTS_ALIASES')
> independent of '#ifdef ASM_OUTPUT_DEF', and in particular, depending not
> just on static but instead on dynamic (run-time) configuration.  This is
> relevant for the nvptx back end's '-malias' flag.
>
> There did remain a few instances where we currently still assume that
> from '#ifdef ASM_OUTPUT_DEF' follows 'TARGET_SUPPORTS_ALIASES', which I'm
> adjusting in the attached (with '--ignore-space-change', for easy review)
> "More '#ifdef ASM_OUTPUT_DEF' -> 'if (TARGET_SUPPORTS_ALIASES)' etc.".
> OK to push?
>
> These changes are necessary to cure nvptx regressions raised in
> 
> "[nvptx] Use .alias directive for mptx >= 6.3", addressing the comment:
> "[...] remains to be analyzed".
>
>
> Regards
>  Thomas
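
The difference between a build-time '#ifdef' and a run-time 'if (...)' check that the quoted message relies on can be sketched as follows. Everything here is hypothetical and only illustrates the idea: the macro, the option struct and the function names are invented for the example and are not GCC's.

```cpp
#include <cassert>

// Hypothetical stand-in for a target macro such as ASM_OUTPUT_DEF: it only
// says that the back end knows how to print an alias definition.
#define HYPOTHETICAL_ASM_OUTPUT_DEF 1

// Hypothetical run-time option, in the spirit of a flag like '-malias'.
struct hypothetical_options { bool alias_enabled; };

// Old style: the decision is frozen when the compiler itself is built.
inline bool aliases_by_ifdef() {
#ifdef HYPOTHETICAL_ASM_OUTPUT_DEF
  return true;
#else
  return false;
#endif
}

// New style: the macro is still required, but a run-time flag can veto it.
inline bool aliases_by_runtime_check(const hypothetical_options &opts) {
#ifdef HYPOTHETICAL_ASM_OUTPUT_DEF
  return opts.alias_enabled;
#else
  (void) opts;
  return false;
#endif
}

// Hypothetical self-check: the two styles disagree once the flag is off.
inline void alias_self_check() {
  assert(aliases_by_ifdef());
  assert(!aliases_by_runtime_check(hypothetical_options{false}));
}
```

With the flag off, the '#ifdef' variant still reports alias support. Code guarded only by the macro can therefore take a path that the run-time configuration has disabled.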


From 4c725226c3657adb775af274876de5077b8fbf45 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 7 Sep 2023 22:15:08 +0200
Subject: [PATCH] More '#ifdef ASM_OUTPUT_DEF' -> 'if
 (TARGET_SUPPORTS_ALIASES)' etc.

Per commit a8b522b483ebb8c972ecfde8779a7a6ec16aecd6 (Subversion r251048)
"Introduce TARGET_SUPPORTS_ALIASES", there is the idea that a back end may or
may not provide symbol aliasing support ('TARGET_SUPPORTS_ALIASES') independent
of '#ifdef ASM_OUTPUT_DEF', and in particular, depending not just on static but
instead on dynamic (run-time) configuration.  There did remain a few instances
where we currently still assume that from '#ifdef ASM_OUTPUT_DEF' follows
'TARGET_SUPPORTS_ALIASES'.  Change these to 'if (TARGET_SUPPORTS_ALIASES)',
similarly, or 'gcc_checking_assert (TARGET_SUPPORTS_ALIASES);'.

	gcc/
	* ipa-icf.cc (sem_item::target_supports_symbol_aliases_p):
	'gcc_checking_assert (TARGET_SUPPORTS_ALIASES);' before
	'return true;'.
	* ipa-visibility.cc (function_and_variable_visibility): Change
	'#ifdef ASM_OUTPUT_DEF' to 'if (TARGET_SUPPORTS_ALIASES)'.
	* varasm.cc (output_constant_pool_contents)
	[#ifdef ASM_OUTPUT_DEF]:
	'gcc_checking_assert (TARGET_SUPPORTS_ALIASES);'.
	(do_assemble_alias) [#ifdef ASM_OUTPUT_DEF]:
	'if (!TARGET_SUPPORTS_ALIASES)',
	'gcc_checking_assert (seen_error ());'.
	(assemble_alias): Change '#if !defined (ASM_OUTPUT_DEF)' to
	'if (!TARGET_SUPPORTS_ALIASES)'.
	(default_asm_output_anchor):
	'gcc_checking_assert (TARGET_SUPPORTS_ALIASES);'.
---
 gcc/ipa-icf.cc|  1 +
 gcc/ipa-visibility.cc |  8 +---
 gcc/varasm.cc | 13 ++---
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/gcc/ipa-icf.cc b/gcc/ipa-icf.cc
index 836d0914ded..bbdfd445397 100644
--- a/gcc/ipa-icf.cc
+++ b/gcc/ipa-icf.cc
@@ -218,6 +218,7 @@ sem_item::target_supports_symbol_aliases_p (void)
 #if !defined (ASM_OUTPUT_DEF) || (!defined(ASM_OUTPUT_WEAK_ALIAS) && !defined (ASM_WEAKEN_DECL))
   return false;
 #else
+  gcc_checking_assert (TARGET_SUPPORTS_ALIASES);
   return true;
 #endif
 }
diff --git a/gcc/ipa-visibility.cc b/gcc/ipa-visibility.cc
index 8ec82bb333e..8ce56114ee3 100644
--- a/gcc/ipa-visibility.cc
+++ b/gcc/ipa-visibility.cc
@@ -622,7 +622,8 @@ function_and_variable_visibility (bool whole_program)
   /* All aliases should be processed at this point.  */
   gcc_checking_assert (!alias_pairs || !alias_pairs->length ());
 
-#ifdef ASM_OUTPUT_DEF
+  if (TARGET_SUPPORTS_ALIASES)
+{
   FOR_EACH_DEFINED_FUNCTION (node)
 	{
 	  if (node->get_availability () != AVAIL_INTERPOSABLE
@@ -643,7 +644,8 @@ function_and_variable_visibility (bool whole_program)
 
 	  if (!alias)
 		{
-	  alias = dyn_cast (node->noninterposable_alias ());
+		  alias
+		= dyn_cast (node->noninterposable_alias ());
 		  gcc_assert (alias && alias != node);
 		}
 
@@ -656,7 +658,7 @@ function_and_variable_visibility (bool whole_program)
 		}
 	}
 	}
-#en

Re: [PATCH] AArch64: Improve immediate expansion [PR105928]

2023-09-19 Thread Richard Sandiford
Wilco Dijkstra  writes:
> Hi Richard,
>
>> I was worried that reusing "dest" for intermediate results would
>> prevent CSE for cases like:
>>
>> void g (long long, long long);
>> void
>> f (long long *ptr)
>> {
>>   g (0xee11ee22ee11ee22LL, 0xdc23dc44ee11ee22LL);
>> }
>
> Note that aarch64_internal_mov_immediate may be called after reload,
> so it would end up even more complex.

The sequence I quoted was supposed to work before and after reload.  The:

rtx tmp = aarch64_target_reg (dest, DImode);

would create a fresh temporary before reload and reuse dest otherwise.
So the sequence after reload would be the same as in your patch,
but the sequence before reload would use a temporary.

> This should be done as a
> dedicated mid-end optimization similar to TARGET_CONST_ANCHOR.
> However the number of 3/4-instruction immediates is so small that
> sharable cases would be very rare, so I don't believe it is worth it.

Yeah.  If, with a few tweaks, we could easily reuse the existing pass
flow to optimise the split forms, then it might have been worth it.
But I agree it's not worth doing something special that only works
for multi-insn immediates.

I think there are other cases where CSE after split would help though.

Thanks,
Richard


[PATCH v7 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces

2023-09-19 Thread Ajit Agarwal
This new version (v7) of the patch improves the ree pass for the rs6000
target using defined ABI interfaces.
Bootstrapped and regtested on power64-linux-gnu.

Review comments incorporated.

Thanks & Regards
Ajit

ree: Improve ree pass for rs6000 target using defined abi interfaces

On the rs6000 target we see redundant zero and sign extensions.  Improve
the ree pass to eliminate such redundant zero and sign extensions
using defined ABI interfaces.

2023-09-19  Ajit Kumar Agarwal  

gcc/ChangeLog:

* ree.cc (combine_reaching_defs): Use of zero_extend and sign_extend
defined abi interfaces.
(add_removable_extension): Use of defined abi interfaces for no
reaching defs.
(abi_extension_candidate_return_reg_p): New function.
(abi_extension_candidate_p): New function.
(abi_extension_candidate_argno_p): New function.
(abi_handle_regs_without_defs_p): New function.
(abi_target_promote_function_mode): New function.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/zext-elim-3.C: New test.
---
 gcc/ree.cc| 148 +-
 .../g++.target/powerpc/zext-elim-3.C  |  13 ++
 2 files changed, 158 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C

diff --git a/gcc/ree.cc b/gcc/ree.cc
index fc04249fa84..79fc54f38a3 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -514,7 +514,8 @@ get_uses (rtx_insn *insn, rtx reg)
 if (REGNO (DF_REF_REG (def)) == REGNO (reg))
   break;
 
-  gcc_assert (def != NULL);
+  if (def == NULL)
+return NULL;
 
   ref_chain = DF_REF_CHAIN (def);
 
@@ -750,6 +751,122 @@ get_extended_src_reg (rtx src)
   return src;
 }
 
+/* Return TRUE if target mode is equal to source mode of zero_extend
+   or sign_extend otherwise false.  */
+
+static bool
+abi_target_promote_function_mode (machine_mode mode)
+{
+  int unsignedp;
+  machine_mode tgt_mode
+= targetm.calls.promote_function_mode (NULL_TREE, mode, &unsignedp,
+  NULL_TREE, 1);
+
+  if (tgt_mode == mode)
+return true;
+  else
+return false;
+}
+
+/* Return TRUE if the candidate insn is zero extend and regno is
+   a return registers.  */
+
+static bool
+abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
+{
+  rtx set = single_set (insn);
+
+  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
+return false;
+
+  if (targetm.calls.function_value_regno_p (regno))
+return true;
+
+  return false;
+}
+
+/* Return TRUE if reg source operand of zero_extend is argument registers
+   and not return registers and source and destination operand are same
+   and mode of source and destination operand are not same.  */
+
+static bool
+abi_extension_candidate_p (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+
+  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
+return false;
+
+  machine_mode dst_mode = GET_MODE (SET_DEST (set));
+  rtx orig_src = XEXP (SET_SRC (set), 0);
+
+  if (!FUNCTION_ARG_REGNO_P (REGNO (orig_src))
+  || abi_extension_candidate_return_reg_p (insn, REGNO (orig_src)))
+return false;
+
+  /* Mode of destination and source of zero_extend should be different.  */
+  if (dst_mode == GET_MODE (orig_src))
+return false;
+
+  /* REGNO of source and destination of zero_extend should be same.  */
+  if (REGNO (SET_DEST (set)) != REGNO (orig_src))
+return false;
+
+  return true;
+}
+
+/* Return TRUE if the candidate insn is zero extend and regno is
+   an argument registers.  */
+
+static bool
+abi_extension_candidate_argno_p (rtx_code code, int regno)
+{
+  if (code != ZERO_EXTEND && code != SIGN_EXTEND)
+return false;
+
+  if (FUNCTION_ARG_REGNO_P (regno))
+return true;
+
+  return false;
+}
+
+/* Return TRUE if the candidate insn doesn't have defs and have
+ * uses without RTX_BIN_ARITH/RTX_COMM_ARITH/RTX_UNARY rtx class.  */
+
+static bool
+abi_handle_regs_without_defs_p (rtx_insn *insn)
+{
+  if (side_effects_p (PATTERN (insn)))
+return false;
+
+  struct df_link *uses = get_uses (insn, SET_DEST (PATTERN (insn)));
+
+  if (!uses)
+return false;
+
+  for (df_link *use = uses; use; use = use->next)
+{
+  if (!use->ref)
+   return false;
+
+  if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (DF_REF_INSN (use->ref)))
+   return false;
+
+  rtx_insn *use_insn = DF_REF_INSN (use->ref);
+
+  if (GET_CODE (PATTERN (use_insn)) == SET)
+   {
+ rtx_code code = GET_CODE (SET_SRC (PATTERN (use_insn)));
+
+ if (GET_RTX_CLASS (code) == RTX_BIN_ARITH
+ || GET_RTX_CLASS (code) == RTX_COMM_ARITH
+ || GET_RTX_CLASS (code) == RTX_UNARY)
+   return false;
+   }
+ }
+  return true;
+}
+
 /* This function goes through all reaching defs of the source
of the candidate for elimination (CAND) and tries to combine
the extension with the definition instruction.  The changes
@@ -770,6 +887,11 @@ combine_reaching_defs (ext_cand *

Re: [PATCH v6 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces.

2023-09-19 Thread Ajit Agarwal
Hello Maxim:

Version 7 of the patch solves this problem. Sorry for the inconvenience caused.

Thanks & Regards
Ajit

On 18/09/23 2:36 pm, Maxim Kuvyrkov wrote:
> Hi Ajit,
> 
> Is this patch supposed to be applied on top of another patch?  As is, this 
> patch fails to build on AArch64 and AArch32, and Linaro TCWG CI has sent 
> notifications about the failures for v5 [1] and v6 [2] of this patch to you.  
> Did you receive the notifications?
> 
> Kind regards,
> 
> [1] 
> https://patchwork.sourceware.org/project/gcc/patch/5ad7cdca-63e1-73af-b38d-d58898e21...@linux.ibm.com/
> [2] 
> https://patchwork.sourceware.org/project/gcc/patch/65ed79a3-9964-dd50-39cb-98d5dbc72...@linux.ibm.com/
> 
> --
> Maxim Kuvyrkov
> https://www.linaro.org
> 
>> On Sep 18, 2023, at 09:59, Ajit Agarwal via Gcc-patches 
>>  wrote:
>>
>> This new version (v6) of the patch improves the ree pass for the rs6000
>> target using defined ABI interfaces.
>> Bootstrapped and regtested on power64-linux-gnu.
>>
>> Review comments incorporated.
>>
>> Thanks & Regards
>> Ajit
>>
>>
>> ree: Improve ree pass for rs6000 target using defined abi interfaces
>>
>> On the rs6000 target we see redundant zero and sign extensions.  Improve
>> the ree pass to eliminate such redundant zero and sign extensions
>> using defined ABI interfaces.
>>
>> 2023-09-18  Ajit Kumar Agarwal  
>>
>> gcc/ChangeLog:
>>
>> * ree.cc (combine_reaching_defs): Use of  zero_extend and sign_extend
>> defined abi interfaces.
>> (add_removable_extension): Use of defined abi interfaces for no
>> reaching defs.
>> (abi_extension_candidate_return_reg_p): New function.
>> (abi_extension_candidate_p): New function.
>> (abi_extension_candidate_argno_p): New function.
>> (abi_handle_regs_without_defs_p): New function.
>> (abi_target_promote_function_mode): New function.
>>
>> gcc/testsuite/ChangeLog:
>>
>>* g++.target/powerpc/zext-elim-3.C
>> ---
>> gcc/ree.cc| 145 +-
>> .../g++.target/powerpc/zext-elim-3.C  |  13 ++
>> 2 files changed, 155 insertions(+), 3 deletions(-)
>> create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C
>>
>> diff --git a/gcc/ree.cc b/gcc/ree.cc
>> index fc04249fa84..e395af6b1bd 100644
>> --- a/gcc/ree.cc
>> +++ b/gcc/ree.cc
>> @@ -514,7 +514,8 @@ get_uses (rtx_insn *insn, rtx reg)
>> if (REGNO (DF_REF_REG (def)) == REGNO (reg))
>>   break;
>>
>> -  gcc_assert (def != NULL);
>> +  if (def == NULL)
>> +return NULL;
>>
>>   ref_chain = DF_REF_CHAIN (def);
>>
>> @@ -750,6 +751,118 @@ get_extended_src_reg (rtx src)
>>   return src;
>> }
>>
>> +/* Return TRUE if target mode is equal to source mode of zero_extend
>> +   or sign_extend otherwise false.  */
>> +
>> +static bool
>> +abi_target_promote_function_mode (machine_mode mode)
>> +{
>> +  int unsignedp;
>> +  machine_mode tgt_mode =
>> +targetm.calls.promote_function_mode (NULL_TREE, mode, &unsignedp,
>> + NULL_TREE, 1);
>> +
>> +  if (tgt_mode == mode)
>> +return true;
>> +  else
>> +return false;
>> +}
>> +
>> +/* Return TRUE if the candidate insn is zero extend and regno is
>> +   an return  registers.  */
>> +
>> +static bool
>> +abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
>> +{
>> +  rtx set = single_set (insn);
>> +
>> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
>> +return false;
>> +
>> +  if (FUNCTION_VALUE_REGNO_P (regno))
>> +return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Return TRUE if reg source operand of zero_extend is argument registers
>> +   and not return registers and source and destination operand are same
>> +   and mode of source and destination operand are not same.  */
>> +
>> +static bool
>> +abi_extension_candidate_p (rtx_insn *insn)
>> +{
>> +  rtx set = single_set (insn);
>> +
>> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
>> +return false;
>> +
>> +  machine_mode ext_dst_mode = GET_MODE (SET_DEST (set));
>> +  rtx orig_src = XEXP (SET_SRC (set),0);
>> +
>> +  bool copy_needed
>> += (REGNO (SET_DEST (set)) != REGNO (XEXP (SET_SRC (set), 0)));
>> +
>> +  if (!copy_needed && ext_dst_mode != GET_MODE (orig_src)
>> +  && FUNCTION_ARG_REGNO_P (REGNO (orig_src))
>> +  && !abi_extension_candidate_return_reg_p (insn, REGNO (orig_src)))
>> +return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Return TRUE if the candidate insn is zero extend and regno is
>> +   an argument registers.  */
>> +
>> +static bool
>> +abi_extension_candidate_argno_p (rtx_code code, int regno)
>> +{
>> +  if (code != ZERO_EXTEND)
>> +return false;
>> +
>> +  if (FUNCTION_ARG_REGNO_P (regno))
>> +return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Return TRUE if the candidate insn doesn't have defs and have
>> + * uses without RTX_BIN_ARITH/RTX_COMM_ARITH/RTX_UNARY rtx class.  */
>> +
>> +static bool
>> +abi_handle_regs_without_defs_p (rtx_insn *insn)
>> +{
>> +  if (side_effects_p (PATTERN (insn)))
>> +return false;
>> +
>> +  struct df_link *uses 

Re: [PATCH v7 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces

2023-09-19 Thread Xi Ruoyao
On Tue, 2023-09-19 at 14:29 +0530, Ajit Agarwal wrote:
> This new version of patch 7 use improve ree pass for rs6000 target
> using defined ABI interfaces.
> Bootstrapped and regtested on power64-linux-gnu.

You should drop the "4/4" from the subject if the patch does not depend on
other non-committed patches.  Otherwise you should send all the non-committed
dependencies as a series.

If you just have the 4/4 there without {1..3}/4, people will believe
this is an incomplete patch submission and likely ignore it.

> Review comments incorporated.
> 
> Thanks & Regards
> Ajit
> 
> ree: Improve ree pass for rs6000 target using defined abi interfaces
> 
> For rs6000 target we see redundant zero and sign extension and done to
> improve ree pass to eliminate such redundant zero and sign extension
> using defined ABI interfaces.
> 
> 2023-09-19  Ajit Kumar Agarwal  
> 
> gcc/ChangeLog:
> 
>   * ree.cc (combine_reaching_defs): Use of zero_extend and
> sign_extend
>   defined abi interfaces.
>   (add_removable_extension): Use of defined abi interfaces for
> no
>   reaching defs.
>   (abi_extension_candidate_return_reg_p): New function.
>   (abi_extension_candidate_p): New function.
>   (abi_extension_candidate_argno_p): New function.
>   (abi_handle_regs_without_defs_p): New function.
>   (abi_target_promote_function_mode): New function.
> 
> gcc/testsuite/ChangeLog:
> 
>     * g++.target/powerpc/zext-elim-3.C
> ---
>  gcc/ree.cc    | 148
> +-
>  .../g++.target/powerpc/zext-elim-3.C  |  13 ++
>  2 files changed, 158 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C
> 
> diff --git a/gcc/ree.cc b/gcc/ree.cc
> index fc04249fa84..79fc54f38a3 100644
> --- a/gcc/ree.cc
> +++ b/gcc/ree.cc
> @@ -514,7 +514,8 @@ get_uses (rtx_insn *insn, rtx reg)
>  if (REGNO (DF_REF_REG (def)) == REGNO (reg))
>    break;
>  
> -  gcc_assert (def != NULL);
> +  if (def == NULL)
> +    return NULL;
>  
>    ref_chain = DF_REF_CHAIN (def);
>  
> @@ -750,6 +751,122 @@ get_extended_src_reg (rtx src)
>    return src;
>  }
>  
> +/* Return TRUE if target mode is equal to source mode of zero_extend
> +   or sign_extend otherwise false.  */
> +
> +static bool
> +abi_target_promote_function_mode (machine_mode mode)
> +{
> +  int unsignedp;
> +  machine_mode tgt_mode
> +    = targetm.calls.promote_function_mode (NULL_TREE, mode,
> &unsignedp,
> +    NULL_TREE, 1);
> +
> +  if (tgt_mode == mode)
> +    return true;
> +  else
> +    return false;
> +}
> +
> +/* Return TRUE if the candidate insn is zero extend and regno is
> +   a return registers.  */
> +
> +static bool
> +abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
> +{
> +  rtx set = single_set (insn);
> +
> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
> +    return false;
> +
> +  if (targetm.calls.function_value_regno_p (regno))
> +    return true;
> +
> +  return false;
> +}
> +
> +/* Return TRUE if reg source operand of zero_extend is argument
> registers
> +   and not return registers and source and destination operand are
> same
> +   and mode of source and destination operand are not same.  */
> +
> +static bool
> +abi_extension_candidate_p (rtx_insn *insn)
> +{
> +  rtx set = single_set (insn);
> +
> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
> +    return false;
> +
> +  machine_mode dst_mode = GET_MODE (SET_DEST (set));
> +  rtx orig_src = XEXP (SET_SRC (set), 0);
> +
> +  if (!FUNCTION_ARG_REGNO_P (REGNO (orig_src))
> +  || abi_extension_candidate_return_reg_p (insn, REGNO
> (orig_src)))
> +    return false;
> +
> +  /* Mode of destination and source of zero_extend should be
> different.  */
> +  if (dst_mode == GET_MODE (orig_src))
> +    return false;
> +
> +  /* REGNO of source and destination of zero_extend should be same. 
> */
> +  if (REGNO (SET_DEST (set)) != REGNO (orig_src))
> +    return false;
> +
> +  return true;
> +}
> +
> +/* Return TRUE if the candidate insn is zero extend and regno is
> +   an argument registers.  */
> +
> +static bool
> +abi_extension_candidate_argno_p (rtx_code code, int regno)
> +{
> +  if (code != ZERO_EXTEND && code != SIGN_EXTEND)
> +    return false;
> +
> +  if (FUNCTION_ARG_REGNO_P (regno))
> +    return true;
> +
> +  return false;
> +}
> +
> +/* Return TRUE if the candidate insn doesn't have defs and have
> + * uses without RTX_BIN_ARITH/RTX_COMM_ARITH/RTX_UNARY rtx class.  */
> +
> +static bool
> +abi_handle_regs_without_defs_p (rtx_insn *insn)
> +{
> +  if (side_effects_p (PATTERN (insn)))
> +    return false;
> +
> +  struct df_link *uses = get_uses (insn, SET_DEST (PATTERN (insn)));
> +
> +  if (!uses)
> +    return false;
> +
> +  for (df_link *use = uses; use; use = use->next)
> +    {
> +  if (!use->ref)
> + return false;
> +
> +  if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (DF_REF_INSN (use-
> >ref)))
> + return fa

Re: [PATCH v6 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces.

2023-09-19 Thread Ajit Agarwal
Hello Vineet:

Version 7 of the patch incorporates the below review comments.
Please review.

Thanks & Regards
Ajit

On 19/09/23 1:57 am, Vineet Gupta wrote:
> Hi Ajit,
> 
> On 9/17/23 22:59, Ajit Agarwal wrote:
>> This new version of patch 6 use improve ree pass for rs6000 target using 
>> defined ABI interfaces.
>> Bootstrapped and regtested on power64-linux-gnu.
>>
>> Review comments incorporated.
>>
>> Thanks & Regards
>> Ajit
> 
> Nit: This seems to belong to "what changed in v6" between the two "---" lines 
> right before start of source diff.
> 
>> ree: Improve ree pass for rs6000 target using defined abi interfaces
>>
>> For rs6000 target we see redundant zero and sign extension and done to
>> improve ree pass to eliminate such redundant zero and sign extension
>> using defined ABI interfaces.
> 
> It seems you have redundant "redundant zero and sign extension" - pun 
> intended  ;-)
> 
> On a serious note, when debugging your code for a possible RISC-V benefit, it 
> seems what it is trying to do is address REE giving up due to "missing 
> definition(s)". Perhaps mentioning that in commitlog would give the reader 
> more context.
> 
>> +/* Return TRUE if target mode is equal to source mode of zero_extend
>> +   or sign_extend otherwise false.  */
>> +
>> +static bool
>> +abi_target_promote_function_mode (machine_mode mode)
>> +{
>> +  int unsignedp;
>> +  machine_mode tgt_mode =
>> +    targetm.calls.promote_function_mode (NULL_TREE, mode, &unsignedp,
>> + NULL_TREE, 1);
>> +
>> +  if (tgt_mode == mode)
>> +    return true;
>> +  else
>> +    return false;
>> +}
>> +
>> +/* Return TRUE if the candidate insn is zero extend and regno is
>> +   an return  registers.  */
> 
> Extra whitespace and grammar:
> s/an return  registers/a return register/
> 
> Please *run* contrib/check_gnu_style on your patch before sending out on 
> mailing lists, saves reviewers time and they can focus more on technical 
> content.
> 
>> +
>> +static bool
>> +abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
>> +{
>> +  rtx set = single_set (insn);
>> +
>> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
>> +    return false;
> 
> This still has ABI assumptions: RISC-V generates SIGN_EXTEND for functions 
> args and return reg.
> This is not a deficiency of patch per-se, but something we would like to 
> address - even if as an addon-patch.
> 
>> +
>> +  if (FUNCTION_VALUE_REGNO_P (regno))
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Return TRUE if reg source operand of zero_extend is argument registers
>> +   and not return registers and source and destination operand are same
>> +   and mode of source and destination operand are not same.  */
>> +
>> +static bool
>> +abi_extension_candidate_p (rtx_insn *insn)
>> +{
>> +  rtx set = single_set (insn);
>> +
>> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
>> +    return false;
> Ditto: ABI assumption.
> 
>> +
>> +  machine_mode ext_dst_mode = GET_MODE (SET_DEST (set));
> 
> why not simply @dst_mode
> 
>> +  rtx orig_src = XEXP (SET_SRC (set),0);
>> +
>> +  bool copy_needed
>> +    = (REGNO (SET_DEST (set)) != REGNO (XEXP (SET_SRC (set), 0)));
> 
> Maybe use @orig_src here, rather than duplicating XEXP (SET_SRC (set),0)
> 
>> +  if (!copy_needed && ext_dst_mode != GET_MODE (orig_src)
> 
> The bail-out on copy_needed needs a comment explaining why.
> 
>> +  && FUNCTION_ARG_REGNO_P (REGNO (orig_src))
>> +  && !abi_extension_candidate_return_reg_p (insn, REGNO (orig_src)))
>> +    return true;
>> +
>> +  return false;
> 
> Consider this bike-shedding, but I would arrange this code differently: check 
> the main case (function args) first, then the less important conditions.
> 
> +  rtx orig_src = XEXP (src, 0);
> +
> +  if (!FUNCTION_ARG_REGNO_P (REGNO (orig_src))
> +  || abi_extension_candidate_return_reg_p (insn, REGNO (orig_src)))
> +    return false;
> +
> +  /* commentary as to why  */
> +  if (dst_mode == GET_MODE (orig_src))
> +    return false;
> 
> -   bool copy_needed
> -    = (REGNO (SET_DEST (set)) != REGNO (XEXP (SET_SRC (set), 0)));
> +  /* copy needed  . */
> +  if (REGNO (SET_DEST (set)) != REGNO (orig_src))
> +    return false;
> +
> + return true;
> 
>> +/* Return TRUE if the candidate insn is zero extend and regno is
>> +   an argument registers.  */
>> +
>> +static bool
>> +abi_extension_candidate_argno_p (rtx_code code, int regno)
>> +{
>> +  if (code != ZERO_EXTEND)
>> +    return false;
> 
> ABI assumption still.
> 
>> +
>> +  if (FUNCTION_ARG_REGNO_P (regno))
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Return TRUE if the candidate insn doesn't have defs and have
>> + * uses without RTX_BIN_ARITH/RTX_COMM_ARITH/RTX_UNARY rtx class.  */
>> +
>> +static bool
>> +abi_handle_regs_without_defs_p (rtx_insn *insn)
>> +{
>> +  if (side_effects_p (PATTERN (insn)))
>> +    return false;
>> +
>> +  struct df_link *uses = get_uses (insn, SET_DEST (PATTERN (insn)));
>> +
>> 

Re: [PATCH v7 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces

2023-09-19 Thread Ajit Agarwal



On 19/09/23 2:36 pm, Xi Ruoyao wrote:
> On Tue, 2023-09-19 at 14:29 +0530, Ajit Agarwal wrote:
>> This new version of patch 7 use improve ree pass for rs6000 target
>> using defined ABI interfaces.
>> Bootstrapped and regtested on power64-linux-gnu.
> 
> You should drop the "4/4" from the subject if the patch does not depend on
> other non-committed patches.  Otherwise you should send all the non-committed
> dependencies as a series.
> 
> If you just have the 4/4 there without {1..3}/4, people will believe
> this is an incomplete patch submission and likely ignore it.
> 

All 4 patches in the series, including the previous ones, are already under
review.

I will send 3/4 patches again for review.

Thanks & Regards
Ajit
>> Review comments incorporated.
>>
>> Thanks & Regards
>> Ajit
>>
>> ree: Improve ree pass for rs6000 target using defined abi interfaces
>>
>> For rs6000 target we see redundant zero and sign extension and done to
>> improve ree pass to eliminate such redundant zero and sign extension
>> using defined ABI interfaces.
>>
>> 2023-09-19  Ajit Kumar Agarwal  
>>
>> gcc/ChangeLog:
>>
>>  * ree.cc (combine_reaching_defs): Use of zero_extend and
>> sign_extend
>>  defined abi interfaces.
>>  (add_removable_extension): Use of defined abi interfaces for
>> no
>>  reaching defs.
>>  (abi_extension_candidate_return_reg_p): New function.
>>  (abi_extension_candidate_p): New function.
>>  (abi_extension_candidate_argno_p): New function.
>>  (abi_handle_regs_without_defs_p): New function.
>>  (abi_target_promote_function_mode): New function.
>>
>> gcc/testsuite/ChangeLog:
>>
>>     * g++.target/powerpc/zext-elim-3.C
>> ---
>>  gcc/ree.cc    | 148
>> +-
>>  .../g++.target/powerpc/zext-elim-3.C  |  13 ++
>>  2 files changed, 158 insertions(+), 3 deletions(-)
>>  create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C
>>
>> diff --git a/gcc/ree.cc b/gcc/ree.cc
>> index fc04249fa84..79fc54f38a3 100644
>> --- a/gcc/ree.cc
>> +++ b/gcc/ree.cc
>> @@ -514,7 +514,8 @@ get_uses (rtx_insn *insn, rtx reg)
>>  if (REGNO (DF_REF_REG (def)) == REGNO (reg))
>>    break;
>>  
>> -  gcc_assert (def != NULL);
>> +  if (def == NULL)
>> +    return NULL;
>>  
>>    ref_chain = DF_REF_CHAIN (def);
>>  
>> @@ -750,6 +751,122 @@ get_extended_src_reg (rtx src)
>>    return src;
>>  }
>>  
>> +/* Return TRUE if target mode is equal to source mode of zero_extend
>> +   or sign_extend otherwise false.  */
>> +
>> +static bool
>> +abi_target_promote_function_mode (machine_mode mode)
>> +{
>> +  int unsignedp;
>> +  machine_mode tgt_mode
>> +    = targetm.calls.promote_function_mode (NULL_TREE, mode,
>> &unsignedp,
>> +   NULL_TREE, 1);
>> +
>> +  if (tgt_mode == mode)
>> +    return true;
>> +  else
>> +    return false;
>> +}
>> +
>> +/* Return TRUE if the candidate insn is zero extend and regno is
>> +   a return registers.  */
>> +
>> +static bool
>> +abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
>> +{
>> +  rtx set = single_set (insn);
>> +
>> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
>> +    return false;
>> +
>> +  if (targetm.calls.function_value_regno_p (regno))
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Return TRUE if reg source operand of zero_extend is argument
>> registers
>> +   and not return registers and source and destination operand are
>> same
>> +   and mode of source and destination operand are not same.  */
>> +
>> +static bool
>> +abi_extension_candidate_p (rtx_insn *insn)
>> +{
>> +  rtx set = single_set (insn);
>> +
>> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
>> +    return false;
>> +
>> +  machine_mode dst_mode = GET_MODE (SET_DEST (set));
>> +  rtx orig_src = XEXP (SET_SRC (set), 0);
>> +
>> +  if (!FUNCTION_ARG_REGNO_P (REGNO (orig_src))
>> +  || abi_extension_candidate_return_reg_p (insn, REGNO
>> (orig_src)))
>> +    return false;
>> +
>> +  /* Mode of destination and source of zero_extend should be
>> different.  */
>> +  if (dst_mode == GET_MODE (orig_src))
>> +    return false;
>> +
>> +  /* REGNO of source and destination of zero_extend should be same. 
>> */
>> +  if (REGNO (SET_DEST (set)) != REGNO (orig_src))
>> +    return false;
>> +
>> +  return true;
>> +}
>> +
>> +/* Return TRUE if the candidate insn is zero extend and regno is
>> +   an argument registers.  */
>> +
>> +static bool
>> +abi_extension_candidate_argno_p (rtx_code code, int regno)
>> +{
>> +  if (code != ZERO_EXTEND && code != SIGN_EXTEND)
>> +    return false;
>> +
>> +  if (FUNCTION_ARG_REGNO_P (regno))
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Return TRUE if the candidate insn doesn't have defs and have
>> + * uses without RTX_BIN_ARITH/RTX_COMM_ARITH/RTX_UNARY rtx class.  */
>> +
>> +static bool
>> +abi_handle_regs_without_defs_p (rtx_insn *insn)
>> +{
>> +  if (side_effects_p (PATTERN (insn)))

[PATCH v2 3/4] Improve functionality of ree pass with various constants with AND operation.

2023-09-19 Thread Ajit Agarwal


Hello Jeff:

This patch eliminates redundant zero and sign extension with ree pass for rs6000
target.

Bootstrapped and regtested for powerpc64-linux-gnu.

Thanks & Regards
Ajit


ree: Improve ree pass

For the rs6000 target we see redundant zero and sign extensions, and the ree
pass is improved to eliminate them.  This adds support for
zero_extend/sign_extend/AND.
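
The reason an AND can stand in for a zero extension can be sketched in
plain C++.  This is only an illustration of the arithmetic, not code from
the patch: the helper names (zext8, zext16, zext32, low_bits_mask_p,
and_matches_zext) are hypothetical, and which constants the pass actually
accepts is decided by rtx_is_zext_p in the diff.

```cpp
#include <cstdint>

// Zero-extend the low 8, 16 or 32 bits of X to 64 bits by narrowing
// and widening again through an unsigned type.
constexpr std::uint64_t zext8 (std::uint64_t x)
{
  return static_cast<std::uint64_t> (static_cast<std::uint8_t> (x));
}

constexpr std::uint64_t zext16 (std::uint64_t x)
{
  return static_cast<std::uint64_t> (static_cast<std::uint16_t> (x));
}

constexpr std::uint64_t zext32 (std::uint64_t x)
{
  return static_cast<std::uint64_t> (static_cast<std::uint32_t> (x));
}

// Hypothetical helper (an assumption, not part of the patch): true if
// MASK is a contiguous run of low-order one bits, i.e. 2^n - 1.
constexpr bool low_bits_mask_p (std::uint64_t mask)
{
  return mask != 0 && (mask & (mask + 1)) == 0;
}

// For one input value, AND with the all-ones mask of each narrow width
// gives the same result as the corresponding zero extension.
constexpr bool and_matches_zext (std::uint64_t x)
{
  return (x & 0xFFu) == zext8 (x)
	 && (x & 0xFFFFu) == zext16 (x)
	 && (x & 0xFFFFFFFFu) == zext32 (x);
}
```

If a register was already produced by such an AND, a later zero_extend
from the same narrow width cannot change its value, which is the kind of
redundancy the pass looks for.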

2023-09-04  Ajit Kumar Agarwal  

gcc/ChangeLog:

* ree.cc (eliminate_across_bbs_p): Add checks to enable extension
elimination across and within basic blocks.
(def_arith_p): New function to check definition has arithmetic
operation.
(combine_set_extension): Modification to incorporate AND
and current zero_extend and sign_extend instruction.
(merge_def_and_ext): Add calls to eliminate_across_bbs_p and
zero_extend sign_extend and AND instruction.
(rtx_is_zext_p): New function.
(feasible_cfg): New function.
* rtl.h (reg_used_set_between_p): Add prototype.
* rtlanal.cc (reg_used_set_between_p): New function.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/zext-elim.C: New testcase.
* g++.target/powerpc/zext-elim-1.C: New testcase.
* g++.target/powerpc/zext-elim-2.C: New testcase.
* g++.target/powerpc/sext-elim.C: New testcase.
---
 gcc/ree.cc| 487 --
 gcc/rtl.h |   1 +
 gcc/rtlanal.cc|  15 +
 gcc/testsuite/g++.target/powerpc/sext-elim.C  |  17 +
 .../g++.target/powerpc/zext-elim-1.C  |  19 +
 .../g++.target/powerpc/zext-elim-2.C  |  11 +
 gcc/testsuite/g++.target/powerpc/zext-elim.C  |  30 ++
 7 files changed, 534 insertions(+), 46 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C

diff --git a/gcc/ree.cc b/gcc/ree.cc
index fc04249fa84..931b9b08821 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -253,6 +253,77 @@ struct ext_cand
 
 static int max_insn_uid;
 
+/* Return TRUE if OP can be considered a zero extension from one or
+   more sub-word modes to larger modes up to a full word.
+
+   For example (and:DI (reg) (const_int X))
+
+   Depending on the value of X could be considered a zero extension
+   from QI, HI and SI to larger modes up to DImode.  */
+
+static bool
+rtx_is_zext_p (rtx insn)
+{
+  if (GET_CODE (insn) == AND)
+{
+  rtx set = XEXP (insn, 0);
+  if (REG_P (set))
+   {
+ rtx src = XEXP (insn, 1);
+ machine_mode m_mode = GET_MODE (set);
+
+ if (CONST_INT_P (src)
+ && (INTVAL (src) == 1
+ || (m_mode == QImode && INTVAL (src) == 0x7)
+ || (m_mode == QImode && INTVAL (src) == 0x007F)
+ || (m_mode == HImode && INTVAL (src) == 0x7FFF)
+ || (m_mode == SImode && INTVAL (src) == 0x007F)))
+   return true;
+
+   }
+  else
+   return false;
+}
+
+  return false;
+}
+/* Return TRUE if OP can be considered a zero extension from one or
+   more sub-word modes to larger modes up to a full word.
+
+   For example (and:DI (reg) (const_int X))
+
+   Depending on the value of X could be considered a zero extension
+   from QI, HI and SI to larger modes up to DImode.  */
+
+static bool
+rtx_is_zext_p (rtx_insn *insn)
+{
+  rtx body = single_set (insn);
+
+  if (GET_CODE (body) == SET && GET_CODE (SET_SRC (body)) == AND)
+   {
+ rtx set = XEXP (SET_SRC (body), 0);
+
+ if (REG_P (set) && GET_MODE (SET_DEST (body)) == GET_MODE (set))
+   {
+ rtx src = XEXP (SET_SRC (body), 1);
+ machine_mode m_mode = GET_MODE (set);
+
+ if (CONST_INT_P (src)
+ && (INTVAL (src) == 1
+ || (m_mode == QImode && INTVAL (src) == 0x7)
+ || (m_mode == QImode && INTVAL (src) == 0x007F)
+ || (m_mode == HImode && INTVAL (src) == 0x7FFF)
+ || (m_mode == SImode && INTVAL (src) == 0x007F)))
+   return true;
+   }
+ else
+  return false;
+   }
+
+   return false;
+}
+
 /* Update or remove REG_EQUAL or REG_EQUIV notes for INSN.  */
 
 static bool
@@ -319,7 +390,7 @@ combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
 {
   rtx orig_src = SET_SRC (*orig_set);
   machine_mode orig_mode = GET_MODE (SET_DEST (*orig_set));
-  rtx new_set;
+  rtx new_set = NULL_RTX;
   rtx cand_pat = single_set (cand->insn);
 
   /* If the extension's source/destination registers are not the same
@@ -359,27 +430,41 @@ combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
   else if (GET_CODE (orig_src) == cand->code)
 {
   /* Here is a sequence of two extensions.  Tr

[Committed] RISC-V: Support VLS floating-point FMA/FNMA/FMS auto-vectorization

2023-09-19 Thread Juzhe-Zhong
Support VLS floating-point FMA/FNMA/FMS patterns.

No regression differences after this patch; committed.
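
As a rough illustration of what the three operations compute per lane,
here is a scalar C++ sketch over a fixed-length array.  The names
(fma_lanes, fnma_lanes, fms_lanes) and the lane count N are assumptions
made for this example; the operand order follows the fma4, fnma4 and
fms4 patterns in the diff below (a * b + c, c - a * b, a * b - c).

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Fixed vector length, chosen only for illustration.
constexpr std::size_t N = 4;
using Vec = std::array<double, N>;

// FMA: a * b + c in every lane.
inline Vec fma_lanes (const Vec &a, const Vec &b, const Vec &c)
{
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = std::fma (a[i], b[i], c[i]);
  return r;
}

// FNMA: c - a * b in every lane, written as fma (-a, b, c).
inline Vec fnma_lanes (const Vec &a, const Vec &b, const Vec &c)
{
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = std::fma (-a[i], b[i], c[i]);
  return r;
}

// FMS: a * b - c in every lane, written as fma (a, b, -c).
inline Vec fms_lanes (const Vec &a, const Vec &b, const Vec &c)
{
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = std::fma (a[i], b[i], -c[i]);
  return r;
}
```

Loops of this shape with a compile-time trip count are the sort of input
a fixed-length (VLS) vectorizer can map onto a single fused instruction
per operation, though whether that happens depends on the target options.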

gcc/ChangeLog:

* config/riscv/autovec.md: Extend VLS floating-point modes.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add FMS tests.
* gcc.target/riscv/rvv/autovec/vls/fma-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fma-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fms-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fms-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnma-7.c: New test.

---
 gcc/config/riscv/autovec.md   |  50 ++--
 gcc/config/riscv/vector.md| 222 +-
 .../gcc.target/riscv/rvv/autovec/vls/def.h|   9 +
 .../gcc.target/riscv/rvv/autovec/vls/fma-5.c  |  31 +++
 .../gcc.target/riscv/rvv/autovec/vls/fma-6.c  |  30 +++
 .../gcc.target/riscv/rvv/autovec/vls/fma-7.c  |  29 +++
 .../gcc.target/riscv/rvv/autovec/vls/fms-1.c  |  31 +++
 .../gcc.target/riscv/rvv/autovec/vls/fms-2.c  |  30 +++
 .../gcc.target/riscv/rvv/autovec/vls/fms-3.c  |  29 +++
 .../gcc.target/riscv/rvv/autovec/vls/fnma-5.c |  31 +++
 .../gcc.target/riscv/rvv/autovec/vls/fnma-6.c |  30 +++
 .../gcc.target/riscv/rvv/autovec/vls/fnma-7.c |  29 +++
 12 files changed, 415 insertions(+), 136 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fma-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fma-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fma-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fms-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fms-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fms-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnma-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnma-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnma-7.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 1aadb6eea1f..769ef6daa36 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1135,12 +1135,12 @@
 ;; -
 
 (define_insn_and_split "fma4"
-  [(set (match_operand:VF 0 "register_operand")
-(plus:VF
- (mult:VF
-   (match_operand:VF 1 "register_operand")
-   (match_operand:VF 2 "register_operand"))
- (match_operand:VF 3 "register_operand")))]
+  [(set (match_operand:V_VLSF 0 "register_operand")
+(plus:V_VLSF
+ (mult:V_VLSF
+   (match_operand:V_VLSF 1 "register_operand")
+   (match_operand:V_VLSF 2 "register_operand"))
+ (match_operand:V_VLSF 3 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
@@ -1163,12 +1163,12 @@
 ;; -
 
 (define_insn_and_split "fnma4"
-  [(set (match_operand:VF 0 "register_operand")
-(minus:VF
-  (match_operand:VF 3 "register_operand")
- (mult:VF
-   (match_operand:VF 1 "register_operand")
-   (match_operand:VF 2 "register_operand"))))]
+  [(set (match_operand:V_VLSF 0 "register_operand")
+(minus:V_VLSF
+  (match_operand:V_VLSF 3 "register_operand")
+ (mult:V_VLSF
+   (match_operand:V_VLSF 1 "register_operand")
+   (match_operand:V_VLSF 2 "register_operand"))))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
@@ -1191,12 +1191,12 @@
 ;; -
 
 (define_insn_and_split "fms4"
-  [(set (match_operand:VF 0 "register_operand")
-(minus:VF
- (mult:VF
-   (match_operand:VF 1 "register_operand")
-   (match_operand:VF 2 "register_operand"))
- (match_operand:VF 3 "register_operand")))]
+  [(set (match_operand:V_VLSF 0 "register_operand")
+(minus:V_VLSF
+ (mult:V_VLSF
+   (match_operand:V_VLSF 1 "register_operand")
+   (match_operand:V_VLSF 2 "register_operand"))
+ (match_operand:V_VLSF 3 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
@@ -1219,13 +1219,13 @@
 ;; -
 
 (define_insn_and_split "fnms4"
-  [(set (match_operand:VF 0 "register_operand")
-(minus:VF
-  (neg:VF
-   (mult:VF
- (match_operand:VF 1 "register_operand")
- (match_operand:VF 2 "register_operand")))
- (ma

[PATCH] c/111468 - add unordered compare and pointer diff to GIMPLE FE parsing

2023-09-19 Thread Richard Biener
The following adds __UN{LT,LE,GT,GE,EQ}, __UNORDERED and __ORDERED
operator parsing support and support for parsing - as POINTER_DIFF_EXPR.
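
For readers less familiar with these tree codes, the following C++ sketch
models their meaning as I understand it: each UN* comparison is true when
the operands are unordered (at least one is a NaN) or the named relation
holds, and POINTER_DIFF_EXPR yields a signed difference of two pointers.
The function names are hypothetical, and char pointers are used so that
the element difference and the address difference coincide.

```cpp
#include <cmath>
#include <cstddef>

// UNORDERED / ORDERED: whether either operand is a NaN.
inline bool unordered (double a, double b) { return std::isunordered (a, b); }
inline bool ordered (double a, double b) { return !std::isunordered (a, b); }

// UN{LT,LE,GT,GE,EQ}: unordered, or the plain comparison holds.
inline bool unlt (double a, double b) { return unordered (a, b) || a < b; }
inline bool unle (double a, double b) { return unordered (a, b) || a <= b; }
inline bool ungt (double a, double b) { return unordered (a, b) || a > b; }
inline bool unge (double a, double b) { return unordered (a, b) || a >= b; }
inline bool uneq (double a, double b) { return unordered (a, b) || a == b; }

// Signed difference of two pointers into the same array of char.
inline std::ptrdiff_t byte_diff (const char *p, const char *q)
{
  return p - q;
}
```

Note that unlt (a, b) is not the same as !(a >= b) in general source
code only because of how NaNs are treated by optimizations and traps;
value-wise the two agree, which is why these codes mostly show up after
folding of negated floating-point comparisons.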

Bootstrapped and tested on x86_64-unknown-linux-gnu, will push later.

PR c/111468
gcc/c/
* gimple-parser.cc (c_parser_gimple_binary_expression): Add
return type argument.
(c_parser_gimple_statement): Adjust.
(c_parser_gimple_paren_condition): Likewise.
(c_parser_gimple_binary_expression): Use passed in return type,
add support for - as POINTER_DIFF_EXPR, __UN{LT,LE,GT,GE,EQ},
__UNORDERED and __ORDERED.

* gcc.dg/gimplefe-50.c: New testcase.
* gcc.dg/gimplefe-51.c: Likewise.
---
 gcc/c/gimple-parser.cc | 72 +-
 gcc/testsuite/gcc.dg/gimplefe-50.c | 27 +++
 gcc/testsuite/gcc.dg/gimplefe-51.c | 12 +
 3 files changed, 91 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/gimplefe-50.c
 create mode 100644 gcc/testsuite/gcc.dg/gimplefe-51.c

diff --git a/gcc/c/gimple-parser.cc b/gcc/c/gimple-parser.cc
index cc3a8899d97..9cf29701c06 100644
--- a/gcc/c/gimple-parser.cc
+++ b/gcc/c/gimple-parser.cc
@@ -108,7 +108,7 @@ gimple_parser::push_edge (int src, int dest, int flags,
 static bool c_parser_gimple_compound_statement (gimple_parser &, gimple_seq *);
 static void c_parser_gimple_label (gimple_parser &, gimple_seq *);
 static void c_parser_gimple_statement (gimple_parser &, gimple_seq *);
-static struct c_expr c_parser_gimple_binary_expression (gimple_parser &);
+static struct c_expr c_parser_gimple_binary_expression (gimple_parser &, tree);
 static struct c_expr c_parser_gimple_unary_expression (gimple_parser &);
 static struct c_expr c_parser_gimple_postfix_expression (gimple_parser &);
 static struct c_expr c_parser_gimple_postfix_expression_after_primary
@@ -869,7 +869,7 @@ c_parser_gimple_statement (gimple_parser &parser, gimple_seq *seq)
   return;
 }
 
-  rhs = c_parser_gimple_binary_expression (parser);
+  rhs = c_parser_gimple_binary_expression (parser, TREE_TYPE (lhs.value));
   if (lhs.value != error_mark_node
   && rhs.value != error_mark_node)
 {
@@ -930,7 +930,7 @@ c_parser_gimple_statement (gimple_parser &parser, gimple_seq *seq)
 */
 
 static c_expr
-c_parser_gimple_binary_expression (gimple_parser &parser)
+c_parser_gimple_binary_expression (gimple_parser &parser, tree ret_type)
 {
   /* Location of the binary operator.  */
   struct c_expr ret, lhs, rhs;
@@ -939,7 +939,6 @@ c_parser_gimple_binary_expression (gimple_parser &parser)
   lhs = c_parser_gimple_postfix_expression (parser);
   if (c_parser_error (parser))
 return ret;
-  tree ret_type = TREE_TYPE (lhs.value);
   switch (c_parser_peek_token (parser)->type)
 {
 case CPP_MULT:
@@ -958,7 +957,10 @@ c_parser_gimple_binary_expression (gimple_parser &parser)
code = PLUS_EXPR;
   break;
 case CPP_MINUS:
-  code = MINUS_EXPR;
+  if (POINTER_TYPE_P (TREE_TYPE (lhs.value)))
+   code = POINTER_DIFF_EXPR;
+  else
+   code = MINUS_EXPR;
   break;
 case CPP_LSHIFT:
   code = LSHIFT_EXPR;
@@ -968,27 +970,21 @@ c_parser_gimple_binary_expression (gimple_parser &parser)
   break;
 case CPP_LESS:
   code = LT_EXPR;
-  ret_type = boolean_type_node;
   break;
 case CPP_GREATER:
   code = GT_EXPR;
-  ret_type = boolean_type_node;
   break;
 case CPP_LESS_EQ:
   code = LE_EXPR;
-  ret_type = boolean_type_node;
   break;
 case CPP_GREATER_EQ:
   code = GE_EXPR;
-  ret_type = boolean_type_node;
   break;
 case CPP_EQ_EQ:
   code = EQ_EXPR;
-  ret_type = boolean_type_node;
   break;
 case CPP_NOT_EQ:
   code = NE_EXPR;
-  ret_type = boolean_type_node;
   break;
 case CPP_AND:
   code = BIT_AND_EXPR;
@@ -1006,14 +1002,49 @@ c_parser_gimple_binary_expression (gimple_parser &parser)
   c_parser_error (parser, "%<||%> not valid in GIMPLE");
   return ret;
 case CPP_NAME:
-   {
- tree id = c_parser_peek_token (parser)->value;
- if (strcmp (IDENTIFIER_POINTER (id), "__MULT_HIGHPART") == 0)
-   {
- code = MULT_HIGHPART_EXPR;
- break;
-   }
-   }
+  {
+   tree id = c_parser_peek_token (parser)->value;
+   if (strcmp (IDENTIFIER_POINTER (id), "__MULT_HIGHPART") == 0)
+ {
+   code = MULT_HIGHPART_EXPR;
+   break;
+ }
+   else if (strcmp (IDENTIFIER_POINTER (id), "__UNLT") == 0)
+ {
+   code = UNLT_EXPR;
+   break;
+ }
+   else if (strcmp (IDENTIFIER_POINTER (id), "__UNLE") == 0)
+ {
+   code = UNLE_EXPR;
+   break;
+ }
+   else if (strcmp (IDENTIFIER_POINTER (id), "__UNGT") == 0)
+ {
+   code = UNGT_EXPR;
+   break;
+ }
+   else if (strcmp (IDENTIFIER_POINTER (id), "__UNGE"

[PATCH] tree-optimization/111465 - bogus jump threading with no-copy src block

2023-09-19 Thread Richard Biener
The following avoids forward-threading a path with an EDGE_NO_COPY_SRC_BLOCK
block that became non-empty due to folding.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to unbreak
bootstrap on powerpc.

PR tree-optimization/111465
* tree-ssa-threadupdate.cc (fwd_jt_path_registry::thread_block_1):
Cancel the path when a EDGE_NO_COPY_SRC_BLOCK became non-empty.

* g++.dg/torture/pr111465.C: New testcase.
---
 gcc/testsuite/g++.dg/torture/pr111465.C | 55 +
 gcc/tree-ssa-threadupdate.cc| 13 ++
 2 files changed, 68 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr111465.C

diff --git a/gcc/testsuite/g++.dg/torture/pr111465.C b/gcc/testsuite/g++.dg/torture/pr111465.C
new file mode 100644
index 000..8f2577adf4c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr111465.C
@@ -0,0 +1,55 @@
+// { dg-do compile }
+// { dg-additional-options "-fno-exceptions --param=logical-op-non-short-circuit=0" }
+
+typedef unsigned int location_t;
+const location_t MAX_LOCATION_T = 0x7FFF;
+struct line_maps {
+  unsigned int  info_ordinary;
+  location_t *maps;
+  unsigned int used;
+  location_t *data;
+};
+inline location_t LINEMAPS_MACRO_LOWEST_LOCATION(const line_maps *set) {
+  return set->used
+ ? set->maps[set->used - 1]
+ : MAX_LOCATION_T + 1;
+}
+const location_t *linemap_lookup(const line_maps *set, location_t line) {
+  int mn = set->info_ordinary;
+  if (mn >= 0)
+  if ((unsigned int)mn < set->used)
+  return &set->maps[0];
+  __builtin_unreachable();
+}
+bool linemap_location_from_macro_expansion_p(const class line_maps *set,
+ location_t location) {
+  if (location > MAX_LOCATION_T)
+location = set->data[location & MAX_LOCATION_T];
+  return location >= LINEMAPS_MACRO_LOWEST_LOCATION(set);
+}
+void first_map_in_common_1(line_maps *set, location_t *loc0,
+ location_t *loc1) {
+  linemap_lookup(set, 0);
+  __builtin_unreachable();
+}
+int linemap_compare_locations(line_maps *set, location_t pre, location_t post) {
+  bool pre_virtual_p;
+  location_t l0 = pre, l1 = post;
+  if (l0 > MAX_LOCATION_T)
+l0 = set->data[l0 & MAX_LOCATION_T];
+  if (l1 > MAX_LOCATION_T)
+l1 = set->data[l1 & MAX_LOCATION_T];;
+  if (l0 == l1)
+return 0;
+  if ((pre_virtual_p = linemap_location_from_macro_expansion_p(set, l0)))
+l0 = set->data[l0 & MAX_LOCATION_T];
+  if (linemap_location_from_macro_expansion_p(set, l1))
+l1 = set->data[l1 & MAX_LOCATION_T];
+  if (l0 == l1)
+ if (pre_virtual_p)
+   first_map_in_common_1(set, &l0, &l1);
+  if (l0 > MAX_LOCATION_T)
+if (l1 > MAX_LOCATION_T)
+  l1 = set->data[l1 & MAX_LOCATION_T];
+  return l1 - l0;
+}
diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
index a5b9a002a8a..86fe8aac677 100644
--- a/gcc/tree-ssa-threadupdate.cc
+++ b/gcc/tree-ssa-threadupdate.cc
@@ -1454,6 +1454,19 @@ fwd_jt_path_registry::thread_block_1 (basic_block bb,
  || ((*path)[1]->type == EDGE_COPY_SRC_BLOCK && joiners))
continue;
 
+  /* When a NO_COPY_SRC block became non-empty cancel the path.  */
+  if (path->last ()->type == EDGE_NO_COPY_SRC_BLOCK)
+   {
+ auto gsi = gsi_start_nondebug_bb (path->last ()->e->src);
+ if (!gsi_end_p (gsi)
+ && !is_ctrl_stmt (gsi_stmt (gsi)))
+   {
+ cancel_thread (path, "Non-empty EDGE_NO_COPY_SRC_BLOCK");
+ e->aux = NULL;
+ continue;
+   }
+   }
+
   e2 = path->last ()->e;
   if (!e2 || noloop_only)
{
-- 
2.35.3


[Committed] RISC-V: Support VLS unary floating-point patterns

2023-09-19 Thread Juzhe-Zhong
Extend current VLA patterns with VLS modes.

All regression tests passed.

gcc/ChangeLog:

* config/riscv/autovec.md: Extend VLS modes.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add unary test.
* gcc.target/riscv/rvv/autovec/vls/neg-2.c: New test.

---
 gcc/config/riscv/autovec.md   | 12 ++---
 gcc/config/riscv/vector.md| 20 +++
 .../gcc.target/riscv/rvv/autovec/vls/def.h|  3 +-
 .../gcc.target/riscv/rvv/autovec/vls/neg-2.c  | 52 +++
 4 files changed, 70 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/neg-2.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 769ef6daa36..75ed7ae4f2e 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1031,9 +1031,9 @@
 ;; - vfneg.v/vfabs.v
 ;; 
---
 (define_insn_and_split "2"
-  [(set (match_operand:VF 0 "register_operand")
-(any_float_unop_nofrm:VF
- (match_operand:VF 1 "register_operand")))]
+  [(set (match_operand:V_VLSF 0 "register_operand")
+(any_float_unop_nofrm:V_VLSF
+ (match_operand:V_VLSF 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
@@ -1052,9 +1052,9 @@
 ;; - vfsqrt.v
 ;; 
---
 (define_insn_and_split "2"
-  [(set (match_operand:VF 0 "register_operand")
-(any_float_unop:VF
- (match_operand:VF 1 "register_operand")))]
+  [(set (match_operand:V_VLSF 0 "register_operand")
+(any_float_unop:V_VLSF
+ (match_operand:V_VLSF 1 "register_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index f7f37da692a..f66ffebba24 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -6756,8 +6756,8 @@
 ;; 
---
 
 (define_insn "@pred_"
-  [(set (match_operand:VF 0 "register_operand"   "=vd, vd, vr, vr")
-   (if_then_else:VF
+  [(set (match_operand:V_VLSF 0 "register_operand"   "=vd, vd, vr, vr")
+   (if_then_else:V_VLSF
  (unspec:
[(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1")
 (match_operand 4 "vector_length_operand"" rK, rK, rK, rK")
@@ -6768,9 +6768,9 @@
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)
 (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
- (any_float_unop:VF
-   (match_operand:VF 3 "register_operand"   " vr, vr, vr, vr"))
- (match_operand:VF 2 "vector_merge_operand" " vu,  0, vu,  0")))]
+ (any_float_unop:V_VLSF
+   (match_operand:V_VLSF 3 "register_operand"   " vr, vr, vr, vr"))
+ (match_operand:V_VLSF 2 "vector_merge_operand" " vu,  0, vu,  0")))]
   "TARGET_VECTOR"
   "vf.v\t%0,%3%p1"
   [(set_attr "type" "")
@@ -6783,8 +6783,8 @@
(symbol_ref "riscv_vector::get_frm_mode (operands[8])"))])
 
 (define_insn "@pred_"
-  [(set (match_operand:VF 0 "register_operand"   "=vd, vd, vr, vr")
-   (if_then_else:VF
+  [(set (match_operand:V_VLSF 0 "register_operand"   "=vd, vd, vr, vr")
+   (if_then_else:V_VLSF
  (unspec:
[(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1")
 (match_operand 4 "vector_length_operand"" rK, rK, rK, rK")
@@ -6793,9 +6793,9 @@
 (match_operand 7 "const_int_operand""  i,  i,  i,  i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (any_float_unop_nofrm:VF
-   (match_operand:VF 3 "register_operand"   " vr, vr, vr, vr"))
- (match_operand:VF 2 "vector_merge_operand" " vu,  0, vu,  0")))]
+ (any_float_unop_nofrm:V_VLSF
+   (match_operand:V_VLSF 3 "register_operand"   " vr, vr, vr, vr"))
+ (match_operand:V_VLSF 2 "vector_merge_operand" " vu,  0, vu,  0")))]
   "TARGET_VECTOR"
   "vf.v\t%0,%3%p1"
   [(set_attr "type" "")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
index 5df90704885..d7b721b4e3e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
@@ -1,4 +1,5 @@
 #include 
+#include 
 
 typedef int8_t v1qi __attribute__ ((vector_size (1)));
 typedef int8_t v2qi __attribute__ ((vector_size (2)));
@@ -210,7 +211,7 @@ typedef double v512df __attribute__ ((vector_size (4096)));
   PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b)                    \
   {                                                                            \
 for (int i = 0; i < NUM; ++i)   

[COMMITTED] ada: Crash processing type invariants on child subprogram

2023-09-19 Thread Marc Poulhiès
From: Javier Miranda 

gcc/ada/

* contracts.adb
(Has_Public_Visibility_Of_Subprogram): Add missing support for
child subprograms.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/contracts.adb | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/gcc/ada/contracts.adb b/gcc/ada/contracts.adb
index 77578dacc18..4aaa276495b 100644
--- a/gcc/ada/contracts.adb
+++ b/gcc/ada/contracts.adb
@@ -2484,7 +2484,7 @@ package body Contracts is
--  declarations of the package containing the type, or in the
--  visible declaration of a child unit of that package.
 
-   else
+   elsif Is_List_Member (Subp_Decl) then
   declare
  Decls  : constant List_Id   :=
 List_Containing (Subp_Decl);
@@ -2508,6 +2508,29 @@ package body Contracts is
  (Specification
(Unit_Declaration_Node (Subp_Scope;
   end;
+
+   --  Determine whether the subprogram is a child subprogram of
+   --  of the package containing the type.
+
+   else
+  pragma Assert
+(Nkind (Parent (Subp_Decl)) = N_Compilation_Unit);
+
+  declare
+ Subp_Scope : constant Entity_Id :=
+Scope (Defining_Entity (Subp_Decl));
+ Typ_Scope  : constant Entity_Id := Scope (Typ);
+
+  begin
+ return
+   Ekind (Subp_Scope) = E_Package
+ and then
+   (Typ_Scope = Subp_Scope
+  or else
+(Is_Child_Unit (Subp_Scope)
+   and then Is_Ancestor_Package
+  (Typ_Scope, Subp_Scope)));
+  end;
end if;
 end Has_Public_Visibility_Of_Subprogram;
 
-- 
2.40.0



[COMMITTED] ada: Refine upper array bound for bit packed array

2023-09-19 Thread Marc Poulhiès
When using bit-packed arrays, the compiler creates new array subtypes of
1-bit components indexed by integers. The existing routine checks the
index subtype to find the min/max values. Because bit-packed arrays are
indexed by integers, the routine gives up, as returning the maximum
possible integer carries no useful information.

This change adds a simple max_value routine that can evaluate very
simple expressions by substituting variables with their min/max value.
Bit-packed array subtypes are currently declared as:

  subtype bp_array is packed_bytes1 (0 .. integer((1 * Var +  7) / 8 - 1));

The simple max_value evaluator handles the bare minimum for this
expression pattern.
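
To illustrate the idea, here is a standalone sketch under stated assumptions. It is not the code added to utils.cc: the Expr type and the cst/var/bin helpers are invented for the example, it returns an empty optional rather than EXP when it gives up, and it does no overflow checking (the real code guards results with TREE_OVERFLOW). A bound is computed bottom-up, replacing each variable with the minimum or maximum of its type and flipping the requested bound where an operator reverses the ordering:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <utility>

// Hypothetical toy expression tree; this is not GCC's `tree`.
struct Expr;
using ExprP = std::shared_ptr<Expr>;

struct Expr {
  enum Kind { Const, Var, Add, Sub, Mul, Div };
  Kind kind = Const;
  std::int64_t value = 0;       // used by Const
  std::int64_t lo = 0, hi = 0;  // used by Var: bounds of the variable's type
  ExprP lhs, rhs;               // used by the binary operators
};

ExprP cst(std::int64_t v) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Const;
  e->value = v;
  return e;
}

ExprP var(std::int64_t lo, std::int64_t hi) {
  assert(lo <= hi);
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Var;
  e->lo = lo;
  e->hi = hi;
  return e;
}

ExprP bin(Expr::Kind k, ExprP l, ExprP r) {
  auto e = std::make_shared<Expr>();
  e->kind = k;
  e->lhs = std::move(l);
  e->rhs = std::move(r);
  return e;
}

// Maximum (max_p) or minimum (!max_p) of E, or nullopt when the
// expression is outside the small set of handled patterns.
std::optional<std::int64_t> max_value(const ExprP &e, bool max_p) {
  switch (e->kind) {
    case Expr::Const:
      return e->value;
    case Expr::Var:
      return max_p ? e->hi : e->lo;
    case Expr::Add: {
      auto l = max_value(e->lhs, max_p), r = max_value(e->rhs, max_p);
      if (!l || !r) return std::nullopt;
      return *l + *r;
    }
    case Expr::Sub: {
      // The subtrahend pulls the other way, so ask for the opposite bound.
      auto l = max_value(e->lhs, max_p), r = max_value(e->rhs, !max_p);
      if (!l || !r) return std::nullopt;
      return *l - *r;
    }
    case Expr::Mul: {
      // Only constant * expression; a negative constant flips the bound.
      const ExprP &c = e->lhs->kind == Expr::Const ? e->lhs : e->rhs;
      const ExprP &x = e->lhs->kind == Expr::Const ? e->rhs : e->lhs;
      if (c->kind != Expr::Const) return std::nullopt;
      auto v = max_value(x, c->value >= 0 ? max_p : !max_p);
      if (!v) return std::nullopt;
      return c->value * *v;
    }
    case Expr::Div: {
      // Only division by a positive constant, which preserves ordering.
      if (e->rhs->kind != Expr::Const || e->rhs->value <= 0)
        return std::nullopt;
      auto v = max_value(e->lhs, max_p);
      if (!v) return std::nullopt;
      return *v / e->rhs->value;
    }
  }
  return std::nullopt;
}
```

For the pattern (1 * Var + 7) / 8 - 1 with Var ranging over 0 .. 100, this sketch yields an upper bound of 12 and a lower bound of -1.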

gcc/ada/ChangeLog:

* gcc-interface/utils.cc (max_value): New.
* gcc-interface/gigi.h (max_value): New.
* gcc-interface/decl.cc (gnat_to_gnu_entity) :
When computing gnu_min/gnu_max, try to use max_value if there is
an initial expression.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gcc-interface/decl.cc  | 22 
 gcc/ada/gcc-interface/gigi.h   |  6 +++
 gcc/ada/gcc-interface/utils.cc | 95 ++
 3 files changed, 123 insertions(+)

diff --git a/gcc/ada/gcc-interface/decl.cc b/gcc/ada/gcc-interface/decl.cc
index 0cf7d3cee60..5e16b56217c 100644
--- a/gcc/ada/gcc-interface/decl.cc
+++ b/gcc/ada/gcc-interface/decl.cc
@@ -2551,6 +2551,17 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
  else
gnu_min = gnu_orig_min;
 
+ if (DECL_P (gnu_min)
+ && DECL_INITIAL (gnu_min) != NULL_TREE
+ && (TREE_CODE (gnu_min) != INTEGER_CST
+ || TREE_OVERFLOW (gnu_min)))
+   {
+ tree tmp = max_value (DECL_INITIAL(gnu_min), false);
+ if (TREE_CODE (tmp) == INTEGER_CST
+ && !TREE_OVERFLOW (tmp))
+   gnu_min = tmp;
+   }
+
  if (TREE_CODE (gnu_min) != INTEGER_CST
  || TREE_OVERFLOW (gnu_min))
gnu_min = TYPE_MIN_VALUE (TREE_TYPE (gnu_min));
@@ -2560,6 +2571,17 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
  else
gnu_max = gnu_orig_max;
 
+ if (DECL_P (gnu_max)
+ && DECL_INITIAL (gnu_max) != NULL_TREE
+ && (TREE_CODE (gnu_max) != INTEGER_CST
+ || TREE_OVERFLOW (gnu_max)))
+   {
+ tree tmp = max_value (DECL_INITIAL(gnu_max), true);
+ if (TREE_CODE (tmp) == INTEGER_CST
+ && !TREE_OVERFLOW (tmp))
+   gnu_max = tmp;
+   }
+
  if (TREE_CODE (gnu_max) != INTEGER_CST
  || TREE_OVERFLOW (gnu_max))
gnu_max = TYPE_MAX_VALUE (TREE_TYPE (gnu_max));
diff --git a/gcc/ada/gcc-interface/gigi.h b/gcc/ada/gcc-interface/gigi.h
index ec85ce44bc3..eb5496f50db 100644
--- a/gcc/ada/gcc-interface/gigi.h
+++ b/gcc/ada/gcc-interface/gigi.h
@@ -763,6 +763,12 @@ extern void update_pointer_to (tree old_type, tree new_type);
minimum (if !MAX_P) possible value of the discriminant.  */
 extern tree max_size (tree exp, bool max_p);
 
+/* Try to compute the maximum (if MAX_P) or minimum (if !MAX_P) value for the
+   expression EXP, for very simple expressions.  Substitute variable references
+   with their respective type's min/max values.  Return the computed value if
+   any, or EXP if no value can be computed. */
+extern tree max_value (tree exp, bool max_p);
+
 /* Remove all conversions that are done in EXP.  This includes converting
from a padded type or to a left-justified modular type.  If TRUE_ADDRESS
is true, always return the address of the containing object even if
diff --git a/gcc/ada/gcc-interface/utils.cc b/gcc/ada/gcc-interface/utils.cc
index f720f3a3b4a..4e2ed173fbe 100644
--- a/gcc/ada/gcc-interface/utils.cc
+++ b/gcc/ada/gcc-interface/utils.cc
@@ -3830,6 +3830,100 @@ fntype_same_flags_p (const_tree t, tree cico_list, bool return_by_direct_ref_p,
 && TREE_ADDRESSABLE (t) == return_by_invisi_ref_p;
 }
 
+/* Try to compute the maximum (if MAX_P) or minimum (if !MAX_P) value for the
+   expression EXP, for very simple expressions.  Substitute variable references
+   with their respective type's min/max values.  Return the computed value if
+   any, or EXP if no value can be computed. */
+
+tree
+max_value (tree exp, bool max_p)
+{
+  enum tree_code code = TREE_CODE (exp);
+  tree type = TREE_TYPE (exp);
+  tree op0, op1, op2;
+
+  switch (TREE_CODE_CLASS (code))
+{
+case tcc_declaration:
+  if (VAR_P (exp))
+return fold_convert (type,
+ max_p
+ ? TYPE_MAX_VALUE (type) : TY

Re: [PATCH] ipa: Self-DCE of uses of removed call LHSs (PR 108007)

2023-09-19 Thread Martin Jambor
Hello,

and ping.

Thanks,

Martin


On Fri, Sep 01 2023, Martin Jambor wrote:
> Hello
>
> and ping.
>
> Thanks,
>
> Martin
>
>
> On Fri, May 12 2023, Martin Jambor wrote:
>> Hi,
>>
>> PR 108007 is another manifestation where we rely on DCE to clean up
>> after IPA-SRA and if the user explicitly switches DCE off, IPA-SRA
>> can leave behind statements which are fed uninitialized values and
>> trap, even though their results are themselves never used.
>>
>> I have already fixed this for unused parameters in callees, this bug
>> shows that almost the same thing can happen for removed returns, on
>> the side of callers.  This means that the issue has to be fixed
>> elsewhere, in call redirection.  This patch adds a function which
>> recursively looks for uses of operations fed specific SSA names and
>> removes them all.
>>
>> That would have been easy if it wasn't for debug statements during
>> tree-inline (from which call redirection is also invoked).  Debug
>> statements are decoupled from the rest at this point and iterating
>> over uses of SSAs does not bring them up.  During tree-inline they are
>> handled especially at the end, I assume in order to make sure that
>> relative ordering of UIDs are the same with and without debug info.
>>
>> This means that during tree-inline we need to make a hash of killed
>> SSAs, that we already have in copy_body_data, available to the
>> function making the purging.  So the patch duly does also that, making
>> the interface slightly ugly.
>>
>> Bootstrapped and tested on x86_64-linux.  OK for master?  (I am not sure
>> the problem is grave enough to warrant backporting to release branches
>> but can do that as well if people think I should.)
>>
>> Thanks,
>>
>> Martin
>>
>>
>> gcc/ChangeLog:
>>
>> 2023-05-11  Martin Jambor  
>>
>>  PR ipa/108007
>>  * cgraph.h (cgraph_edge): Add a parameter to
>>  redirect_call_stmt_to_callee.
>>  * ipa-param-manipulation.h (ipa_param_adjustments): Added a
>>  parameter to modify_call.
>>  * cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): New
>>  parameter killed_ssas, pass it to padjs->modify_call.
>>  * ipa-param-manipulation.cc (purge_transitive_uses): New function.
>>  (ipa_param_adjustments::modify_call): New parameter killed_ssas.
>>  Instead of substituting uses, invoke purge_transitive_uses.  If
>>  hash of killed SSAs has not been provided, create a temporary one
>>  and release SSAs that have been added to it.
>>  * tree-inline.cc (redirect_all_calls): Create
>>  id->killed_new_ssa_names earlier, pass it to edge redirection,
>>  adjust a comment.
>>  (copy_body): Release SSAs in id->killed_new_ssa_names.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2023-05-11  Martin Jambor  
>>
>>  PR ipa/108007
>>  * gcc.dg/ipa/pr108007.c: New test.
>> ---
>>  gcc/cgraph.cc   | 10 +++-
>>  gcc/cgraph.h|  9 ++-
>>  gcc/ipa-param-manipulation.cc   | 85 +
>>  gcc/ipa-param-manipulation.h|  3 +-
>>  gcc/testsuite/gcc.dg/ipa/pr108007.c | 32 +++
>>  gcc/tree-inline.cc  | 28 ++
>>  6 files changed, 129 insertions(+), 38 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr108007.c
>>
>> diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
>> index e8f9bec8227..5e923bf0557 100644
>> --- a/gcc/cgraph.cc
>> +++ b/gcc/cgraph.cc
>> @@ -1403,11 +1403,17 @@ cgraph_edge::redirect_callee (cgraph_node *n)
>> speculative indirect call, remove "speculative" of the indirect call and
>> also redirect stmt to it's final direct target.
>>  
>> +   When called from within tree-inline, KILLED_SSAs has to contain the pointer
>> +   to killed_new_ssa_names within the copy_body_data structure and SSAs
>> +   discovered to be useless (if LHS is removed) will be added to it, otherwise
>> +   it needs to be NULL.
>> +
>> It is up to caller to iteratively transform each "speculative"
>> direct call as appropriate.  */
>>  
>>  gimple *
>> -cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e)
>> +cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e,
>> +   hash_set  *killed_ssas)
>>  {
>>tree decl = gimple_call_fndecl (e->call_stmt);
>>gcall *new_stmt;
>> @@ -1527,7 +1533,7 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e)
>>  remove_stmt_from_eh_lp (e->call_stmt);
>>  
>>tree old_fntype = gimple_call_fntype (e->call_stmt);
>> -  new_stmt = padjs->modify_call (e, false);
>> +  new_stmt = padjs->modify_call (e, false, killed_ssas);
>>cgraph_node *origin = e->callee;
>>while (origin->clone_of)
>>  origin = origin->clone_of;
>> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
>> index f5f54769eda..c1a3691b6f5 100644
>> --- a/gcc/cgraph.h
>> +++ b/gcc/cgraph.h
>> @@ -1833,9 +1833,16 @@ public:
>>   speculative indirect call, remove "speculative" of the indirect c

[COMMITTED] ada: Private extensions with the keyword "synchronized" are always limited.

2023-09-19 Thread Marc Poulhiès
From: Richard Wai 

GNAT was relying on synchronized private type extensions deriving from a
concurrent interface to determine their limitedness. This does not cover the
case where such an extension derives from a limited interface. RM-7.6(6/2)
makes it clear that "synchronized" in a private extension implies the derived
type is limited. GNAT should explicitly check for the presence of
"synchronized" in a private extension declaration, and it should have the same
effect as the presence of "limited".

gcc/ada/ChangeLog:

* sem_ch3.adb (Build_Derived_Record_Type): Treat presence of
keyword "synchronized" the same as "limited" when determining if a
private extension is limited.

gcc/testsuite/ChangeLog:

* gnat.dg/sync_tag_discriminals.adb: New test.
* gnat.dg/sync_tag_limited.adb: New test.

Signed-off-by: Richard Wai 
---
 gcc/ada/sem_ch3.adb   | 12 +++--
 .../gnat.dg/sync_tag_discriminals.adb | 51 +++
 gcc/testsuite/gnat.dg/sync_tag_limited.adb| 50 ++
 3 files changed, 110 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gnat.dg/sync_tag_discriminals.adb
 create mode 100644 gcc/testsuite/gnat.dg/sync_tag_limited.adb

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
index 3262236dd14..92902a7debb 100644
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -9599,9 +9599,15 @@ package body Sem_Ch3 is
 
   --  AI-419: Limitedness is not inherited from an interface parent, so to
   --  be limited in that case the type must be explicitly declared as
-  --  limited. However, task and protected interfaces are always limited.
-
-  if Limited_Present (Type_Def) then
+  --  limited, or synchronized. While task and protected interfaces are
+  --  always limited, a synchronized private extension might not inherit
+  --  from such interfaces, and so we also need to recognize the
+  --  explicit limitedness implied by a synchronized private extension
+  --  that does not derive from a synchronized interface (see RM-7.3(6/2)).
+
+  if Limited_Present (Type_Def)
+or else Synchronized_Present (Type_Def)
+  then
  Set_Is_Limited_Record (Derived_Type);
 
   elsif Is_Limited_Record (Parent_Type)
diff --git a/gcc/testsuite/gnat.dg/sync_tag_discriminals.adb b/gcc/testsuite/gnat.dg/sync_tag_discriminals.adb
new file mode 100644
index 000..b105acf6e98
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/sync_tag_discriminals.adb
@@ -0,0 +1,51 @@
+-- This test is related to sync_tag_limited in that previous versions of GNAT
+-- failed to consider a synchronized private extension as limited if it was
+-- not derrived from a synchronized interface (i.e. a limited interface). Since
+-- such a private type would not be considered limited, GNAT would fail to
+-- correctly build the expected discriminals later needed by the creation of
+-- the concurrent type's "corresponding record type", leading to a compilation
+-- error where the discriminants of the corresponding record type had no
+-- identifiers.
+--
+-- This test is in addition to sync_tag_limited because the sync_tag_limited
+-- would fail for "legality" reasons (default discriminants not allowed for
+-- a non-limited taged type). It is also an opportunity to ensure that non-
+-- defaulted discriminated synchronized private extensions work as expected.
+
+--  { dg-do compile }
+
+procedure Sync_Tag_Discriminals is
+   
+   package Ifaces is
+  
+  type Test_Interface is limited interface;
+  
+  procedure Interface_Action (Test: in out Test_Interface) is abstract;
+  
+   end Ifaces;
+   
+   
+   package Implementation is
+  type Test_Implementation
+(Constraint: Positive) is
+synchronized new Ifaces.Test_Interface with private;
+  
+   private
+  protected type Test_Implementation
+(Constraint: Positive)
+  is new Ifaces.Test_Interface with
+  
+ overriding procedure Interface_Action;
+ 
+  end Test_Implementation;
+   end Implementation;
+   
+   package body Implementation is
+  protected body Test_Implementation is
+ procedure Interface_Action is null;
+  end;
+   end Implementation;
+   
+begin
+   null;
+end Sync_Tag_Discriminals;
diff --git a/gcc/testsuite/gnat.dg/sync_tag_limited.adb b/gcc/testsuite/gnat.dg/sync_tag_limited.adb
new file mode 100644
index 000..608f10662a3
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/sync_tag_limited.adb
@@ -0,0 +1,50 @@
+--  Synchronized tagged types created by a private extension with the keyword
+--  'synchronized' shall be seen as an (immutably) limited tagged type, and
+--  should therefore accept default disciminant spectifications.
+--  This was a bug in earlier versions of GNAT, whereby GNAT erroneously
+--  relied on a parent synchronized interface to determine limitedness
+--  of a synchronized private extension. The problem being that a sync

Re: [PATCH V2] RISC-V: Fix RVV can change mode class bug

2023-09-19 Thread Robin Dapp
Hi Juzhe,

I'd agree that punting is reasonable for now, therefore LGTM.

Regards
 Robin


Re: [PATCH V2] RISC-V: Fix RVV can change mode class bug

2023-09-19 Thread Lehua Ding

Committed, thanks Robin.

On 2023/9/19 20:04, Robin Dapp wrote:

Hi Juzhe,

I'd agree that punting is reasonable for now, therefore LGTM.

Regards
  Robin


--
Best,
Lehua


[COMMITTED] ada: TSS finalize address subprogram generation for constrained...

2023-09-19 Thread Marc Poulhiès
From: Richard Wai 

...subtypes of unconstrained synchronized private extensions should take
care to designate the corresponding record of the underlying concurrent
type.

When generating TSS finalize address subprograms for class-wide types of
constrained root types, it follows the parent chain looking for the
first "non-constrained" type. It is possible that such a type is a
private extension with the “synchronized” keyword, in which case the
underlying type is a concurrent type. When that happens, the designated
type of the finalize address subprogram should be the corresponding
record’s class-wide-type.

gcc/ada/ChangeLog:
* exp_ch3.adb (Expand_Freeze_Class_Wide_Type): Expanded comments
explaining why TSS Finalize_Address is not generated for
concurrent class-wide types.
* exp_ch7.adb (Make_Finalize_Address_Stmts): Handle cases where the
underlying non-constrained parent type is a concurrent type, and
adjust the designated type to be the corresponding record’s
class-wide type.

gcc/testsuite/ChangeLog:

* gnat.dg/sync_tag_finalize.adb: New test.

Signed-off-by: Richard Wai 
---
 gcc/ada/exp_ch3.adb |  4 ++
 gcc/ada/exp_ch7.adb | 28 +-
 gcc/testsuite/gnat.dg/sync_tag_finalize.adb | 60 +
 3 files changed, 90 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gnat.dg/sync_tag_finalize.adb

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
index 04c3ad8c631..bb015986200 100644
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -5000,6 +5000,10 @@ package body Exp_Ch3 is
   --  Do not create TSS routine Finalize_Address for concurrent class-wide
   --  types. Ignore C, C++, CIL and Java types since it is assumed that the
   --  non-Ada side will handle their destruction.
+  --
+  --  Concurrent Ada types are functionally represented by an associated
+  --  "corresponding record type" (typenameV), which owns the actual TSS
+  --  finalize bodies for the type (and technically class-wide type).
 
   elsif Is_Concurrent_Type (Root)
 or else Is_C_Derivation (Root)
diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
index aa16c707887..4ea5e6ede64 100644
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -8512,7 +8512,8 @@ package body Exp_Ch7 is
   Is_Empty_Elmt_List (Discriminant_Constraint (Root_Type (Typ)))
   then
  declare
-Parent_Typ : Entity_Id;
+Parent_Typ  : Entity_Id;
+Parent_Utyp : Entity_Id;
 
  begin
 --  Climb the parent type chain looking for a non-constrained type
@@ -8533,7 +8534,30 @@ package body Exp_Ch7 is
Parent_Typ := Underlying_Record_View (Parent_Typ);
 end if;
 
-Desig_Typ := Class_Wide_Type (Underlying_Type (Parent_Typ));
+Parent_Utyp := Underlying_Type (Parent_Typ);
+
+--  Handle views created for a synchronized private extension with
+--  known, non-defaulted discriminants. In that case, parent_typ
+--  will be the private extension, as it is the first "non
+--  -constrained" type in the parent chain. Unfortunately, the
+--  underlying type, being a protected or task type, is not the
+--  "real" type needing finalization. Rather, the "corresponding
+--  record type" should be the designated type here. In fact, TSS
+--  finalizer generation is specifically skipped for the nominal
+--  class-wide type of (the full view of) a concurrent type (see
+--  exp_ch7.Expand_Freeze_Class_Wide_Type). If we don't designate
+--  the underlying record (Tprot_typeVC), we will end up trying to
+--  dispatch to prot_typeVDF from an incorrectly designated
+--  Tprot_typeC, which is, of course, not actually a member of
+--  prot_typeV'Class, and thus incompatible.
+
+if Ekind (Parent_Utyp) in Concurrent_Kind
+  and then Present (Corresponding_Record_Type (Parent_Utyp))
+then
+   Parent_Utyp := Corresponding_Record_Type (Parent_Utyp);
+end if;
+
+Desig_Typ := Class_Wide_Type (Parent_Utyp);
  end;
 
   --  General case
diff --git a/gcc/testsuite/gnat.dg/sync_tag_finalize.adb b/gcc/testsuite/gnat.dg/sync_tag_finalize.adb
new file mode 100644
index 000..6dffd4a102c
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/sync_tag_finalize.adb
@@ -0,0 +1,60 @@
+--  In previous versions of GNAT there was a curious bug that caused
+--  compilation to fail in the case of a synchronized private extension
+--  with non-default discriminants, where the creation of a constrained object
+--  (and thus subtype) caused the TSS deep finalize machinery of the internal
+--  class-wide constratined subtype (TConstrainedC) to construct a malformed
+--  T

[PATCH] RISC-V: Add FNMS floating-point VLS tests

2023-09-19 Thread Juzhe-Zhong
Add FNMS floating-point VLS tests.  Committed.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add FNMS VLS modes tests.
* gcc.target/riscv/rvv/autovec/vls/fnms-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnms-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/fnms-3.c: New test.

---
 .../gcc.target/riscv/rvv/autovec/vls/def.h|  9 ++
 .../gcc.target/riscv/rvv/autovec/vls/fnms-1.c | 31 +++
 .../gcc.target/riscv/rvv/autovec/vls/fnms-2.c | 30 ++
 .../gcc.target/riscv/rvv/autovec/vls/fnms-3.c | 29 +
 4 files changed, 99 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnms-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnms-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnms-3.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
index d7b721b4e3e..64ef72d3ff4 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h
@@ -485,3 +485,12 @@ typedef double v512df __attribute__ ((vector_size (4096)));
 for (int i = 0; i < NUM; ++i)                                                 \
   a[i] = b[i] * c[i] - d[i];                                                  \
   }
+
+#define DEF_FNMS_VV(PREFIX, NUM, TYPE)                                         \
+  void __attribute__ ((noinline, noclone))                                     \
+  PREFIX##_##TYPE##NUM (TYPE *restrict a, TYPE *restrict b, TYPE *restrict c,  \
+   TYPE *restrict d)                                                           \
+  {                                                                            \
+for (int i = 0; i < NUM; ++i)                                                  \
+  a[i] = -(b[i] * c[i]) - d[i];                                                \
+  }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnms-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnms-1.c
new file mode 100644
index 000..7fb8884f58c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnms-1.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 --param=riscv-autovec-lmul=m8 -fdump-tree-optimized" } */
+
+#include "def.h"
+
+DEF_FNMS_VV (fnms, 2, _Float16)
+DEF_FNMS_VV (fnms, 4, _Float16)
+DEF_FNMS_VV (fnms, 8, _Float16)
+DEF_FNMS_VV (fnms, 16, _Float16)
+DEF_FNMS_VV (fnms, 32, _Float16)
+DEF_FNMS_VV (fnms, 64, _Float16)
+DEF_FNMS_VV (fnms, 128, _Float16)
+DEF_FNMS_VV (fnms, 256, _Float16)
+DEF_FNMS_VV (fnms, 512, _Float16)
+DEF_FNMS_VV (fnms, 1024, _Float16)
+DEF_FNMS_VV (fnms, 2048, _Float16)
+
+/* { dg-final { scan-assembler-times {vfnma[c-d][c-d]\.vv} 11 } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "4,4" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "16,16" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "32,32" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "64,64" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "128,128" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "256,256" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "512,512" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "1024,1024" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "2048,2048" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "4096,4096" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnms-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnms-2.c
new file mode 100644
index 000..b044061c9d6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/fnms-2.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 --param=riscv-autovec-lmul=m8 -fdump-tree-optimized" } */
+
+#include "def.h"
+
+DEF_FNMS_VV (fnms, 2, float)
+DEF_FNMS_VV (fnms, 4, float)
+DEF_FNMS_VV (fnms, 8, float)
+DEF_FNMS_VV (fnms, 16, float)
+DEF_FNMS_VV (fnms, 32, float)
+DEF_FNMS_VV (fnms, 64, float)
+DEF_FNMS_VV (fnms, 128, float)
+DEF_FNMS_VV (fnms, 256, float)
+DEF_FNMS_VV (fnms, 512, float)
+DEF_FNMS_VV (fnms, 1024, float)
+
+/* { dg-final { scan-assembler-times {vfnma[c-d][c-d]\.vv} 10 } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "4,4" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "16,16" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "32,32" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "64,64" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "128,128" "optimized" } } */
+/

[PATCH] middle-end: relax validate_subreg to allow paradoxical subregs that change mode

2023-09-19 Thread Tamar Christina
Hi All,

This patch relaxes the subreg invariant that a single conversion can either
change the mode or be paradoxical, but not both; i.e. it now allows
(subreg:V2DI (reg:DF ..)).

This is well defined in the generic sense and allowing it would enable
you to write RTL without the extra moves which can be interfered with by
combine.
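
As a hedged illustration of what changes, the sketch below models only the single clause touched by the patch, not the whole of validate_subreg. The Mode struct, the Scalar enum and the helper names are hypothetical simplifications of machine_mode, GET_MODE_INNER and GET_MODE_UNIT_SIZE:

```cpp
// Hypothetical, heavily simplified stand-in for machine_mode.
enum class Scalar { SI, SF, DI, DF };

struct Mode {
  Scalar inner;     // element mode (the mode itself for scalar modes)
  unsigned nunits;  // 1 for scalar modes, more than 1 for vector modes
};

// Size in bytes of one element of the mode.
constexpr unsigned unit_size(Mode m) {
  return (m.inner == Scalar::SI || m.inner == Scalar::SF) ? 4 : 8;
}

constexpr bool is_vector(Mode m) { return m.nunits > 1; }

// Clause before the patch: the element modes must be identical.
constexpr bool old_clause(Mode omode, Mode imode) {
  return is_vector(omode) && omode.inner == imode.inner;
}

// Clause after the patch: only the element sizes must match.
constexpr bool new_clause(Mode omode, Mode imode) {
  return is_vector(omode) && unit_size(omode) == unit_size(imode);
}

constexpr Mode V2DI{Scalar::DI, 2};
constexpr Mode V4SI{Scalar::SI, 4};
constexpr Mode DF{Scalar::DF, 1};
```

Under this model a V2DI subreg of a DF register fails the old clause (DI differs from DF) but passes the new one (both units are 8 bytes), while a V4SI subreg of DF is still not accepted by this clause.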

The patch has been pre-approved [1], but I am giving people a chance to object.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629119.html

Bootstrapped and regtested on aarch64-none-linux-gnu, x86_64-linux-gnu,
and powerpc64le-unknown-linux-gnu with no issues.

Pushed to master.

Thanks,
Tamar

gcc/ChangeLog:

* emit-rtl.cc (validate_subreg): Relax subreg rule.

--- inline copy of patch -- 
diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index f6276a2d0b627fd5b2a742795a55423f449e9d93..ea85a46a0f014b81aeb76d69d19a57233dd70bdc 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -947,7 +947,7 @@ validate_subreg (machine_mode omode, machine_mode imode,
  in post-reload splitters that make arbitrarily mode changes to the
  registers themselves.  */
   else if (VECTOR_MODE_P (omode)
-  && GET_MODE_INNER (omode) == GET_MODE_INNER (imode))
+  && GET_MODE_UNIT_SIZE (omode) == GET_MODE_UNIT_SIZE (imode))
 ;
   /* Subregs involving floating point modes are not allowed to
  change size unless it's an insert into a complex mode.





[PATCH] c/111468 - dump unordered compare operators in their GIMPLE form with -gimple

2023-09-19 Thread Richard Biener
The following adjusts -gimple dumping to dump the unordered compare ops
and *h in their GIMPLE form.  It also adds parsing for __LTGT which I
missed before.
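
As a rough illustration of the dumping change described above, the following minimal C++ sketch picks an operator spelling depending on a flag, with the flag argument defaulted so that existing callers keep the old output. The names `CmpCode`, `DumpFlags`, `kNone`, `kGimple`, and `symbol_for` are hypothetical stand-ins, not GCC's actual declarations; only the operator strings are taken from the patch.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for GCC's tree codes and dump flags.
enum class CmpCode { Lt, UnLt, UnEq, Ordered, Unordered };
using DumpFlags = unsigned;
constexpr DumpFlags kNone = 0;
constexpr DumpFlags kGimple = 1u << 0;

// Return the spelling of CODE.  When kGimple is set, unordered compares
// use the __NAME spelling that the GIMPLE front end can parse back;
// otherwise the traditional dump spelling is returned.
const char *symbol_for(CmpCode code, DumpFlags flags = kNone)
{
  const bool gimple = (flags & kGimple) != 0;
  switch (code) {
    case CmpCode::Lt:        return "<";
    case CmpCode::UnLt:      return gimple ? "__UNLT" : "u<";
    case CmpCode::UnEq:      return gimple ? "__UNEQ" : "u==";
    case CmpCode::Ordered:   return gimple ? "__ORDERED" : "ord";
    case CmpCode::Unordered: return gimple ? "__UNORDERED" : "unord";
  }
  return "?";
}
```

Defaulting the flag argument mirrors the patch's approach of leaving callers that do not pass flags unchanged.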

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR c/111468
gcc/c/
* gimple-parser.cc (c_parser_gimple_binary_expression): Handle __LTGT.

gcc/
* tree-pretty-print.h (op_symbol_code): Add defaulted flags
argument.
* tree-pretty-print.cc (op_symbol): Likewise.
(op_symbol_code): Print TDF_GIMPLE variant if requested.
* gimple-pretty-print.cc (dump_binary_rhs): Pass flags to
op_symbol_code.
(dump_gimple_cond): Likewise.

gcc/testsuite/
* gcc.dg/gimplefe-50.c: Amend.
---
 gcc/c/gimple-parser.cc |  5 +
 gcc/gimple-pretty-print.cc |  4 ++--
 gcc/testsuite/gcc.dg/gimplefe-50.c |  1 +
 gcc/tree-pretty-print.cc   | 26 +-
 gcc/tree-pretty-print.h|  2 +-
 5 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/gcc/c/gimple-parser.cc b/gcc/c/gimple-parser.cc
index 9cf29701c06..f43c0398655 100644
--- a/gcc/c/gimple-parser.cc
+++ b/gcc/c/gimple-parser.cc
@@ -1044,6 +1044,11 @@ c_parser_gimple_binary_expression (gimple_parser &parser, tree ret_type)
code = ORDERED_EXPR;
break;
  }
+   else if (strcmp (IDENTIFIER_POINTER (id), "__LTGT") == 0)
+ {
+   code = LTGT_EXPR;
+   break;
+ }
   }
   /* Fallthru.  */
 default:
diff --git a/gcc/gimple-pretty-print.cc b/gcc/gimple-pretty-print.cc
index 82017b92e89..320df9197b4 100644
--- a/gcc/gimple-pretty-print.cc
+++ b/gcc/gimple-pretty-print.cc
@@ -480,7 +480,7 @@ dump_binary_rhs (pretty_printer *buffer, const gassign *gs, int spc,
   else
dump_generic_node (buffer, gimple_assign_rhs1 (gs), spc, flags, false);
   pp_space (buffer);
-  pp_string (buffer, op_symbol_code (gimple_assign_rhs_code (gs)));
+  pp_string (buffer, op_symbol_code (gimple_assign_rhs_code (gs), flags));
   pp_space (buffer);
   if (op_prio (gimple_assign_rhs2 (gs)) <= op_code_prio (code))
{
@@ -1092,7 +1092,7 @@ dump_gimple_cond (pretty_printer *buffer, const gcond *gs, int spc,
 flags | ((flags & TDF_GIMPLE) ? TDF_GIMPLE_VAL : TDF_NONE),
 false);
   pp_space (buffer);
-  pp_string (buffer, op_symbol_code (gimple_cond_code (gs)));
+  pp_string (buffer, op_symbol_code (gimple_cond_code (gs), flags));
   pp_space (buffer);
   dump_generic_node (buffer, gimple_cond_rhs (gs), spc,
 flags | ((flags & TDF_GIMPLE) ? TDF_GIMPLE_VAL : TDF_NONE),
diff --git a/gcc/testsuite/gcc.dg/gimplefe-50.c b/gcc/testsuite/gcc.dg/gimplefe-50.c
index 03db786b619..63d228ce76d 100644
--- a/gcc/testsuite/gcc.dg/gimplefe-50.c
+++ b/gcc/testsuite/gcc.dg/gimplefe-50.c
@@ -14,6 +14,7 @@ foo (float a, float b)
   x_7 = a_1(D) __UNEQ b_2(D);
   x_8 = a_1(D) __UNORDERED b_2(D);
   x_9 = a_1(D) __ORDERED b_2(D);
+  x_10 = a_1(D) __LTGT b_2(D);
   if (a_1(D) __UNEQ b_2(D))
 goto __BB4;
   else
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 45a1fd3e848..12c57c14dd4 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -49,7 +49,7 @@ along with GCC; see the file COPYING3.  If not see
 #endif
 
 /* Local functions, macros and variables.  */
-static const char *op_symbol (const_tree);
+static const char *op_symbol (const_tree, dump_flags_t = TDF_NONE);
 static void newline_and_indent (pretty_printer *, int);
 static void maybe_init_pretty_print (FILE *);
 static void print_struct_decl (pretty_printer *, const_tree, int, dump_flags_t);
@@ -4327,7 +4327,7 @@ op_prio (const_tree op)
 /* Return the symbol associated with operator CODE.  */
 
 const char *
-op_symbol_code (enum tree_code code)
+op_symbol_code (enum tree_code code, dump_flags_t flags)
 {
   switch (code)
 {
@@ -4354,14 +4354,14 @@ op_symbol_code (enum tree_code code)
   return "&";
 
 case ORDERED_EXPR:
-  return "ord";
+  return (flags & TDF_GIMPLE) ? "__ORDERED" : "ord";
 case UNORDERED_EXPR:
-  return "unord";
+  return (flags & TDF_GIMPLE) ? "__UNORDERED" : "unord";
 
 case EQ_EXPR:
   return "==";
 case UNEQ_EXPR:
-  return "u==";
+  return (flags & TDF_GIMPLE) ? "__UNEQ" : "u==";
 
 case NE_EXPR:
   return "!=";
@@ -4369,25 +4369,25 @@ op_symbol_code (enum tree_code code)
 case LT_EXPR:
   return "<";
 case UNLT_EXPR:
-  return "u<";
+  return (flags & TDF_GIMPLE) ? "__UNLT" : "u<";
 
 case LE_EXPR:
   return "<=";
 case UNLE_EXPR:
-  return "u<=";
+  return (flags & TDF_GIMPLE) ? "__UNLE" : "u<=";
 
 case GT_EXPR:
   return ">";
 case UNGT_EXPR:
-  return "u>";
+  return (flags & TDF_GIMPLE) ? "__UNGT" : "u>";
 
 case GE_EXPR:
   return ">=";
 case UNGE_EXPR:
-  ret

[PATCH] target/30484 - testcase for exploration

2023-09-19 Thread Richard Biener
The following adds a testcase for the unfixed PR, skipped on the
known affected target.  The intent is to discover targets that
suffer from the same issue or whose sdiv does not wrap as expected
with -fwrapv.
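
For reference, the wrapping behaviour the testcase expects can be written out without relying on -fwrapv. The sketch below is only an illustration: `wrapping_neg` and `wrapping_div` are hypothetical helper names, and the code assumes a two's-complement `int` whose unsigned-to-signed conversion is modular (as it is with GCC, and as C++20 requires).

```cpp
#include <cassert>
#include <climits>

// Negate with two's-complement wrap-around, i.e. the result that
// -fwrapv is expected to give for -a.  The arithmetic is done in
// unsigned, where wrap-around is well defined.
int wrapping_neg(int a)
{
  return static_cast<int>(0u - static_cast<unsigned>(a));
}

// Divide with wrap-around.  INT_MIN / -1 is the only quotient that
// overflows, so division by -1 is routed through wrapping_neg.
// The divisor b must be non-zero.
int wrapping_div(int a, int b)
{
  if (b == -1)
    return wrapping_neg(a);
  return a / b;
}
```

With these definitions, both `wrapping_div (INT_MIN, -1)` and `wrapping_neg (INT_MIN)` yield `INT_MIN`, which is the condition the CHECK macro in the testcase verifies for the compiler-generated code.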

Tested on x86_64-unknown-linux-gnu.

OK?  Jeff, do you want to put this on your tester first (do they
perform runtime simulator testing?)

PR target/30484
* gcc.dg/torture/pr30484.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr30484.c | 33 ++
 1 file changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr30484.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr30484.c b/gcc/testsuite/gcc.dg/torture/pr30484.c
new file mode 100644
index 000..22ea23bd27a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr30484.c
@@ -0,0 +1,33 @@
+/* { dg-do run } */
+/* { dg-skip-if "PR30484" { x86_64-*-* i?86-*-* } } */
+/* { dg-additional-options "-fwrapv" } */
+
+#define CHECK(TYPE, UTYPE) \
+TYPE __attribute__((noipa)) \
+div ## TYPE (TYPE a, TYPE b) \
+{ \
+  return a / b; \
+} \
+TYPE __attribute__((noipa)) \
+neg ## TYPE (TYPE a) \
+{ \
+  return -a; \
+} \
+void __attribute__((noipa)) \
+test ## TYPE () \
+{ \
+  TYPE min = (TYPE)((UTYPE)1 << (sizeof(TYPE)*8-1)); \
+  if (div ## TYPE (min, -1) != min  \
+  || neg ## TYPE (min) != min) \
+__builtin_abort (); \
+}
+
+CHECK(int, unsigned)
+CHECK(long, unsigned long)
+
+int main()
+{
+  testlong ();
+  testint ();
+  return 0;
+}
-- 
2.35.3


Re: [PATCH 1/2] using overflow_free_p to simplify pattern

2023-09-19 Thread Richard Biener
On Tue, 19 Sep 2023, Jiufu Guo wrote:

> Hi,
> 
> In r14-3582, an "overflow_free_p" interface is added.
> The pattern of "(t * 2) / 2" in match.pd can be simplified
> by using this interface.
> 
> Bootstrap & regtest pass on ppc64{,le} and x86_64.
> Is this ok for trunk?
> 
> BR,
> Jeff (Jiufu)
> 
> gcc/ChangeLog:
> 
>   * match.pd ((t * 2) / 2): Update to use overflow_free_p.
> 
> ---
>  gcc/match.pd | 37 +++--
>  1 file changed, 7 insertions(+), 30 deletions(-)
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 87edf0e75c3..8bba7056000 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -926,36 +926,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (if (TYPE_OVERFLOW_UNDEFINED (type))
>  @0
>  #if GIMPLE
> -(with
> - {
> -   bool overflowed = true;
> -   value_range vr0, vr1;
> -   if (INTEGRAL_TYPE_P (type)
> -&& get_range_query (cfun)->range_of_expr (vr0, @0)
> -&& get_range_query (cfun)->range_of_expr (vr1, @1)
> -&& !vr0.varying_p () && !vr0.undefined_p ()
> -&& !vr1.varying_p () && !vr1.undefined_p ())
> -  {
> -wide_int wmin0 = vr0.lower_bound ();
> -wide_int wmax0 = vr0.upper_bound ();
> -wide_int wmin1 = vr1.lower_bound ();
> -wide_int wmax1 = vr1.upper_bound ();
> -/* If the multiplication can't overflow/wrap around, then
> -   it can be optimized too.  */
> -wi::overflow_type min_ovf, max_ovf;
> -wi::mul (wmin0, wmin1, TYPE_SIGN (type), &min_ovf);
> -wi::mul (wmax0, wmax1, TYPE_SIGN (type), &max_ovf);
> -if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
> -  {
> -wi::mul (wmin0, wmax1, TYPE_SIGN (type), &min_ovf);
> -wi::mul (wmax0, wmin1, TYPE_SIGN (type), &max_ovf);
> -if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
> -  overflowed = false;
> -  }
> -  }
> - }
> -(if (!overflowed)
> - @0))
> +(with {value_range vr0, vr1;}
> + (if (INTEGRAL_TYPE_P (type)
> +   && get_range_query (cfun)->range_of_expr (vr0, @0)
> +   && get_range_query (cfun)->range_of_expr (vr1, @1)
> +   && !vr0.varying_p () && !vr1.varying_p ()

From your other uses, checking !varying_p doesn't seem necessary?

OK with omitting.

Richard.

> +   && range_op_handler (MULT_EXPR).overflow_free_p (vr0, vr1))
> +  @0))
>  #endif
> 
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
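
The range check that the removed match.pd code performed by hand can be sketched as follows. This is not GCC's `overflow_free_p` implementation; `IntRange` and `mul_overflow_free` are hypothetical names, and `__builtin_mul_overflow` is the GCC/Clang builtin. Like the removed code, the sketch tests the four corner products of the two ranges; because a product is linear in each operand separately, its extremes over a box of values occur at the corners.

```cpp
#include <cassert>
#include <climits>

// Closed interval [lo, hi] of int values; the caller ensures lo <= hi.
struct IntRange { int lo; int hi; };

// Return true if x * y cannot overflow int for any x in A and y in B.
// Only the four corner products need to be tested.
bool mul_overflow_free(IntRange a, IntRange b)
{
  const int xs[2] = {a.lo, a.hi};
  const int ys[2] = {b.lo, b.hi};
  for (int x : xs)
    for (int y : ys) {
      int product;
      if (__builtin_mul_overflow(x, y, &product))
        return false;  // this corner does not fit in int
    }
  return true;
}
```

For example, `mul_overflow_free ({0, 100}, {-5, 5})` holds, while a range reaching `INT_MAX` multiplied by the constant range `{2, 2}` does not.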


Re: [PATCH 2/2] testcase: rename pr111303.c to pr111324.c

2023-09-19 Thread Richard Biener
On Tue, 19 Sep 2023, Jiufu Guo wrote:

> Hi,
> 
> When committing the fix for pr111324, the test case was named pr111303.c
> by mistake.  This patch renames it to pr111324.c.
> 
> Is this ok for trunk?

OK.

> BR,
> Jeff (Jiufu Guo)
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/pr111303.c: Rename to ...
>   * gcc.dg/tree-ssa/pr111324.c: ... this.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/{pr111303.c => pr111324.c} | 0
>  1 file changed, 0 insertions(+), 0 deletions(-)
>  rename gcc/testsuite/gcc.dg/tree-ssa/{pr111303.c => pr111324.c} (100%)
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr111303.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr111324.c
> similarity index 100%
> rename from gcc/testsuite/gcc.dg/tree-ssa/pr111303.c
> rename to gcc/testsuite/gcc.dg/tree-ssa/pr111324.c
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] [RFC] New early __builtin_unreachable processing.

2023-09-19 Thread Richard Biener
On Mon, Sep 18, 2023 at 3:48 PM Andrew MacLeod  wrote:
>
>
> On 9/18/23 02:53, Richard Biener wrote:
> > On Fri, Sep 15, 2023 at 4:45 PM Andrew MacLeod  wrote:
> >> I've been looking at __builtin_unreachable () regressions.  The
> >> fundamental problem seems to be a lack of consistent expectation for
> >> when we remove it earlier than the final pass of VRP.  After looking
> >> through them, I think this provides a sensible approach.
> >>
> >> Ranger is pretty good at providing ranges in blocks dominated by the
> >> __builtin_unreachable branch, so removing it isn't quite as critical as
> >> it once was.  It's also pretty good at identifying what in the block can
> >> be affected by the branch.
> >>
> >> This patch provides an alternate removal algorithm for earlier passes.
> >> It looks at *all* the exports from the block, and if the branch
> >> dominates every use of all the exports, AND none of those values access
> >> memory, VRP will remove the unreachable call, rewrite the branch, update
> >> all the values globally, and finally perform the simple DCE on the
> >> branch's ssa-name.  This is kind of what it did before, but it wasn't
> >> as stringent on the requirements.
> >>
> >> The memory access check is required because there are a couple of test
> >> cases for PRE in which there is a series of instructions leading to an
> >> unreachable call, and none of those ssa names are ever used in the IL
> >> again. The whole chunk is dead, and we update globals, however
> >> pointlessly.  However, one of the ssa_names loads from memory, and a later
> >> pass commons this value with a later load, and then the unreachable
> >> call provides additional information about the later load.  This is
> >> evident in tree-ssa/ssa-pre-34.c.  The only way I see to avoid this
> >> situation is to not remove the unreachable if there is a load feeding it.
> >>
> >> What this does is a more sophisticated version of what DOM does in
> >> all_uses_feed_or_dominated_by_stmt.  The feeding instructions don't have
> >> to be single use, but they do have to be dominated by the branch or be
> >> single use within the branch's block.
> >>
> >> If there are multiple uses in the same block as the branch, this does
> >> not remove the unreachable call.  If we could be sure there are no
> >> intervening calls or side effects, it would be allowable, but this is a
> >> more expensive checking operation.  Ranger gets the ranges right anyway,
> >> so with more passes using ranger, I'm not sure we'd see much benefit from
> >> the additional analysis.  It could always be added later.
> >>
> >> This fixes at least 110249 and 110080 (and probably others).  The only
> >> regression is 93917 for which I changed the testcase to adjust
> >> expectations:
> >>
> >> // PR 93917
> >> void f1(int n)
> >> {
> >> if(n<0)
> >>   __builtin_unreachable();
> >> f3(n);
> >> }
> >>
> >> void f2(int*n)
> >> {
> >> if(*n<0)
> >>   __builtin_unreachable();
> >> f3 (*n);
> >> }
> >>
> >> We were removing both unreachable calls in VRP1, but only updating the
> >> global values in the first case, meaning we lose information.   With the
> >> change in semantics, we only update the global in the first case, but we
> >> leave the unreachable call in the second case now (due to the load from
> >> memory).  Ranger still calculates the contextual range correctly as [0,
> >> +INF] in the second case, it just doesn't set the global value until
> >> VRP2 when it is removed.
> >>
> >> Does this seem reasonable?
> > I wonder how this addresses the fundamental issue we always faced
> > in that when we apply the range this range info in itself allows the
> > branch to the __builtin_unreachable () to be statically determined,
> > so when the first VRP pass sets the range the next pass evaluating
> > the condition will remove it (and the guarded __builtin_unreachable ()).
> >
> > In principle there's nothing wrong with that if we don't lose the range
> > info during optimizations, but that unfortunately happens more often
> > than wanted and with the __builtin_unreachable () gone we've lost
> > the ability to re-compute them.
> >
> > I think it's good to explicitly remove the branch at the point we want
> > rather than relying on the "next" visitor to pick up the global range.
> >
> > As I read the patch we now remove __builtin_unreachable () explicitly
> > as soon as possible but don't really address the fundamental issue
> > in any way?
>
>
> I think it pretty much addresses the issue completely.  No globals are
> updated by the unreachable branch unless it is removed.  We remove the
> unreachable early ONLY if every use of all the exports is dominated by
> the branch...with the exception of a single use in the block used to
> define a different export. and those have to all have no other uses
> which are not dominated.  ie
>
>[local count: 1073741824]:
>y_2 = x_1(D) >> 1;
>t_3 = y_2 + 1;
>if (t_3 > 100)
>  goto ; [0.00%]
>el

Re: [PATCH] debug/111409 - don't generate COMDAT macro sections for split DWARF

2023-09-19 Thread Richard Biener
On Thu, Sep 14, 2023 at 8:42 AM Omar Sandoval  wrote:
>
> Split DWARF files aren't processed by the linker, so DW_MACRO_import
> offsets aren't relocated and the .debug_macro.dwo sections aren't
> deduplicated and merged.  There's no clear way for this to work for
> split DWARF, so disable it.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
> PR debug/111409
> * dwarf2out.cc (output_macinfo): Don't call optimize_macinfo_range if
> dwarf_split_debug_info.
>
> gcc/testsuite/ChangeLog:
>
> PR debug/111409
> * gcc.dg/pr111409.c: New test.
> ---
>  gcc/ChangeLog   | 6 ++
>  gcc/dwarf2out.cc| 1 +
>  gcc/testsuite/ChangeLog | 5 +
>  gcc/testsuite/gcc.dg/pr111409.c | 7 +++
>  4 files changed, 19 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr111409.c
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index b69160b025d..2b0ff902f55 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2023-09-13  Omar Sandoval  
> +
> +   PR debug/111409
> +   * dwarf2out.cc (output_macinfo): Don't call optimize_macinfo_range if
> +   dwarf_split_debug_info.
> +
>  2023-09-12  Juzhe-Zhong  
>
> PR target/111337
> diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
> index c4a935d5101..f60a0656d8f 100644
> --- a/gcc/dwarf2out.cc
> +++ b/gcc/dwarf2out.cc
> @@ -29247,6 +29247,7 @@ output_macinfo (const char *debug_line_label, bool 
> early_lto_debug)
> case DW_MACINFO_define:
> case DW_MACINFO_undef:
>   if ((!dwarf_strict || dwarf_version >= 5)
> + && !dwarf_split_debug_info
>   && HAVE_COMDAT_GROUP
>   && vec_safe_length (files) != 1
>   && i > 0
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index de0eadc31d7..3534e203a8a 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2023-09-13  Omar Sandoval  
> +
> +   PR debug/111409
> +   * gcc.dg/pr111409.c: New test.
> +
>  2023-09-12  Juzhe-Zhong  
>
> * lib/target-supports.exp: Enable vect_int for RVV.
> diff --git a/gcc/testsuite/gcc.dg/pr111409.c b/gcc/testsuite/gcc.dg/pr111409.c
> new file mode 100644
> index 000..1a79d81444e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr111409.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "split DWARF unsupported" { *-*-darwin* } } */
> +/* { dg-options "-gsplit-dwarf -g3 -dA" } */
> +/* { dg-final { scan-assembler-times {.section\s+.debug_macro} 1 } } */
> +/* { dg-final { scan-assembler-not {.byte\s+0x7\s*#\s*Import} } } */
> +
> +#define foo 1
> --
> 2.41.0
>


Re: [PATCH] AArch64: Improve immediate expansion [PR105928]

2023-09-19 Thread Wilco Dijkstra
Hi Richard,

>> Note that aarch64_internal_mov_immediate may be called after reload,
>> so it would end up even more complex.
>
> The sequence I quoted was supposed to work before and after reload.  The:
>
>    rtx tmp = aarch64_target_reg (dest, DImode);
>
> would create a fresh temporary before reload and reuse dest otherwise.
> So the sequence after reload would be the same as in your patch,
> but the sequence before reload would use a temporary.

aarch64_target_reg just returns the input register so it won't do that.
Also the movsi/movdi patterns only split if the destination register is
physical.  That's typically after register allocation but not uniformly so
(e.g. immediates in returns will get split early), which is inconsistent.
Given we always emit register notes it's not obvious whether splitting
early or late is better overall.

Cheers,
Wilco

Re: [PATCH v7] c++: Move consteval folding to cp_fold_r

2023-09-19 Thread Marek Polacek
On Mon, Sep 18, 2023 at 09:36:31PM -0400, Jason Merrill wrote:
> On 9/18/23 17:42, Marek Polacek wrote:
> > +  /* The purpose of this is not to emit errors for mce_unknown.  */
> > +  const tsubst_flags_t complain = (data->flags == ff_fold_immediate
> > +  ? tf_none : tf_error);
> 
> Maybe check flags & ff_mce_false, instead?  OK with that change.

Thanks!  And what do you think about:

--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1162,7 +1162,8 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_)
   tree stmt = *stmt_p;
   enum tree_code code = TREE_CODE (stmt);

-  cp_fold_immediate_r (stmt_p, walk_subtrees, data);
+  if (cxx_dialect > cxx17)
+cp_fold_immediate_r (stmt_p, walk_subtrees, data);

   *stmt_p = stmt = cp_fold (*stmt_p, data->flags);


since we can recurse on ?:, this could save some time.  It's sad that
it checks cxx_dialect in every invocation of cp_fold_r but still, it
should help.

Thanks again for the reviews,

Marek



Re: [PATCH v7] c++: Move consteval folding to cp_fold_r

2023-09-19 Thread Jason Merrill

On 9/19/23 09:01, Marek Polacek wrote:
> On Mon, Sep 18, 2023 at 09:36:31PM -0400, Jason Merrill wrote:
>> On 9/18/23 17:42, Marek Polacek wrote:
>>> +  /* The purpose of this is not to emit errors for mce_unknown.  */
>>> +  const tsubst_flags_t complain = (data->flags == ff_fold_immediate
>>> +  ? tf_none : tf_error);
>>
>> Maybe check flags & ff_mce_false, instead?  OK with that change.
>
> Thanks!  And what do you think about:
>
> --- a/gcc/cp/cp-gimplify.cc
> +++ b/gcc/cp/cp-gimplify.cc
> @@ -1162,7 +1162,8 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_)
>    tree stmt = *stmt_p;
>    enum tree_code code = TREE_CODE (stmt);
>
> -  cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> +  if (cxx_dialect > cxx17)
> +cp_fold_immediate_r (stmt_p, walk_subtrees, data);
>
>    *stmt_p = stmt = cp_fold (*stmt_p, data->flags);
>
> since we can recurse on ?:, this could save some time.  It's sad that
> it checks cxx_dialect in every invocation of cp_fold_r but still, it
> should help.


Sure.

Jason



Re: [PATCH] [RFC] New early __builtin_unreachable processing.

2023-09-19 Thread Andrew MacLeod



On 9/19/23 08:56, Richard Biener wrote:
> On Mon, Sep 18, 2023 at 3:48 PM Andrew MacLeod wrote:
>
> OK.
>
>> I don't see anything in the early VRP processing now that would allow a
>> later pass to remove the unreachable unless it does its own analysis
>> like DOM might do.
>
> Isn't it as simple as
>
>    if (i_2 > 5) __builtin_unreachable ();
>
> registering a global range of [6, INF] for i_2 and then the next time
> we fold if (i_2 > 5) using range info will eliminate it?  Yes, historically
> that required VRP or DOM since nothing else looked at ranges, not
> sure how it behaves now given more match.pd patterns do look
> at (global) ranges.

If we set the range, yes.  What I meant was that in the cases where we
decide it can't be removed, we do NOT set the range globally in vrp1 now.
This means that unless some other pass determines the range is [6, +INF],
the unreachable call will remain in the IL and any ranger-aware pass will
still get the contextual range info resulting from the unreachable.  We
were sometimes removing the unreachable without being able to update
every affected global/future optimization opportunity, which this fixes.
Hopefully :-)   It's certainly much better at least.

In theory, if inlining were aware of global ranges and propagated them,
we could also now remove some of these unreachables in EVRP rather
than VRP1, as I think we're now sure there is no benefit to keeping
the unreachable call when we remove it.

> In any case, thanks for the explanation and OK for the patch.

Will check it in shortly.

Andrew
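
For readers less familiar with the construct discussed in this thread, here is a minimal sketch in the spirit of the `f1` example quoted earlier. The function name `half_of_nonnegative` is hypothetical. The guarded `__builtin_unreachable ()` (a GCC/Clang builtin) tells the compiler that `n` is never negative on the path that follows, and a range-aware pass may use that fact, for instance to simplify the signed division; whether it actually does so depends on the pass pipeline being debated here.

```cpp
#include <cassert>

// The caller promises n >= 0.  Passing a negative value is undefined
// behaviour, so the compiler may assume the range [0, +INF] for n below.
int half_of_nonnegative(int n)
{
  if (n < 0)
    __builtin_unreachable();
  // With n known to be non-negative, this signed division may be
  // simplified (for example to a shift) by a range-aware optimizer.
  return n / 2;
}
```

The thread is about when the guard itself may be deleted: once it is gone, the range it implied survives only if it was recorded globally for the SSA name.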



Re: [PATCH] RISC-V: Support combine cond extend and reduce sum to cond widen reduce sum

2023-09-19 Thread Robin Dapp
Hi Lehua,

>> Would it hurt to allow any nonmemory operand here and just force the
>> "unsupported" constants into a register?
> 
> Are you talking about why operand 2 doesn't use nonmemory_operand
> predicate? If so, I think this is because our vmerge.v[vxi]m insns
> only supports that operand 1 is a scalar and operand 2 must be a
> vector register.

My question was rather:

Why doesn't something like

(define_insn_and_split "vcond_mask_"
  [(set (match_operand:V_VLS 0 "register_operand")
(if_then_else:V_VLS
  (match_operand: 3 "register_operand")
  (match_operand:V_VLS 1 "nonmemory_operand")
  (match_operand:V_VLS 2 "nonmemory_operand")))]
  "TARGET_VECTOR && can_create_pseudo_p ()"
  "#"
  "&& 1"
  [(const_int 0)]
  {
/* The order of vcond_mask is opposite to pred_merge.  */
if (REG_P (operands[2]))
  operands[2] = force_reg (mode, operands[2]);
std::swap (operands[1], operands[2]);
riscv_vector::emit_vlmax_insn (code_for_pred_merge (mode),
   riscv_vector::MERGE_OP, operands);
DONE;
  }
  [(set_attr "type" "vector")]
)

suffice?  You could disallow operands[2] != 0 if needed.

Regards
 Robin



[PATCH]middle-end ifcvt: replace C++ sort with vec::qsort [PR109154]

2023-09-19 Thread Tamar Christina
Hi All,

As requested later on, this replaces the C++ sort with vec::qsort.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/109154
* tree-if-conv.cc (INCLUDE_ALGORITHM): Remove.
(cmp_arg_entry): New.
(predicate_scalar_phi): Use it.

--- inline copy of patch -- 
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index 799f071965e5c41eb352b5530cf1d9c7ecf7bf25..0d7ac82986f399f1c5ff91c04ddb524813ab27de 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -80,7 +80,6 @@ along with GCC; see the file COPYING3.  If not see
  :;
 */
 
-#define INCLUDE_ALGORITHM
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -2045,6 +2044,28 @@ gen_phi_nest_statement (gphi *phi, gimple_stmt_iterator *gsi,
   return lhs;
 }
 
+typedef std::pair > ArgEntry;
+static int
+cmp_arg_entry (const void *p1, const void *p2)
+{
+  const ArgEntry sval1 = *(const ArgEntry *)p1;
+  const ArgEntry sval2 = *(const ArgEntry *)p2;
+  auto x1 = sval1.second;
+  auto x2 = sval2.second;
+
+  if (x1.first < x2.first)
+return -1;
+  else if (x1.first > x2.first)
+return 1;
+
+  if (x1.second < x2.second)
+return -1;
+  else if (x1.second > x2.second)
+return 1;
+
+  return 0;
+}
+
 /* Replace a scalar PHI node with a COND_EXPR using COND as condition.
This routine can handle PHI nodes with more than two arguments.
 
@@ -2186,7 +2207,6 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
   /* Determine element with max number of occurrences and complexity.  Looking at only
  number of occurrences as a measure for complexity isn't enough as all usages can
  be unique but the comparisons to reach the PHI node differ per branch.  */
-  typedef std::pair > ArgEntry;
   auto_vec argsKV;
   for (i = 0; i < args.length (); i++)
 {
@@ -2204,10 +2224,7 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
 }
 
   /* Sort elements based on rankings ARGS.  */
-  std::sort(argsKV.begin(), argsKV.end(), [](const ArgEntry &left,
-const ArgEntry &right) {
-return left.second < right.second;
-  });
+  argsKV.qsort (cmp_arg_entry);
 
   for (i = 0; i < args.length (); i++)
 args[i] = argsKV[i].first;
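
The comparator in the patch above follows the usual qsort contract: return a negative, zero, or positive value, comparing the two rank components lexicographically. A self-contained sketch of the same pattern is shown below. It uses `std::qsort` from the standard library rather than GCC's `vec::qsort`, and the types `Rank` and `Entry` and the functions `cmp_entry` and `sort_entries` are hypothetical stand-ins for the patch's `ArgEntry` machinery. Plain structs are used so that the elements are trivially copyable, which `std::qsort` expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins: an argument id plus a two-part ranking.
struct Rank { unsigned primary; unsigned secondary; };
struct Entry { int arg; Rank rank; };

// qsort-style three-way comparison: order by rank.primary first and
// break ties with rank.secondary.  Returns <0, 0 or >0.
int cmp_entry(const void *p1, const void *p2)
{
  const Rank &r1 = static_cast<const Entry *>(p1)->rank;
  const Rank &r2 = static_cast<const Entry *>(p2)->rank;
  if (r1.primary != r2.primary)
    return r1.primary < r2.primary ? -1 : 1;
  if (r1.secondary != r2.secondary)
    return r1.secondary < r2.secondary ? -1 : 1;
  return 0;
}

// Sort COUNT entries in ascending rank order.
void sort_entries(Entry *entries, std::size_t count)
{
  std::qsort(entries, count, sizeof(Entry), cmp_entry);
}
```

Unlike the lambda passed to `std::sort`, which only answers "less than", a qsort comparator has to distinguish all three outcomes, which is why the patch spells out both the `-1` and the `1` branches.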





Re: RFC: Introduce -fhardened to enable security-related flags

2023-09-19 Thread Qing Zhao



> On Sep 17, 2023, at 12:36 PM, Hans-Peter Nilsson via Gcc-patches wrote:
> 
>> From: Sam James 
>> Date: Sun, 17 Sep 2023 05:00:37 +0100
> 
>> Hans-Peter Nilsson via Gcc-patches  writes:
>> 
>>>> Date: Tue, 29 Aug 2023 15:42:27 -0400
>>>> From: Marek Polacek via Gcc-patches
>>> 
>>>> Surely, there must be no ABI impact, the option cannot cause
>>>> severe performance issues,
>>> 
>>>> Currently, -fhardened enables:
>>> ...
>>>>  -ftrivial-auto-var-init=zero
>>> 
>>>> Thoughts?
>>> 
>>> Regarding -ftrivial-auto-var-init=zero, I was consulted when
>>> colleagues investigating a performance regression
>>> pin-pointed it as *causing severe performance issues*;
>>> cf. https://github.com/systemd/systemd.git commit 1a4e392760
>>> (TL;DR: adds "-ftrivial-auto-var-init=zero" to the systemd
>>> build).
>>> 
>>> The situation was described as "we noticed that some test
>>> suites takes 35% percent longer time to finish.  After
>>> further investigation it was noticed that running systemctl
>>> unmask x takes around 5s more time on [version including
>>> patch vs. before that patch]" (timing out some tests).
>>> Reverting that patch fixed the drop in performance.
>> 
>> Did some bug ever get filed for this to see if we can do a bit
>> better here?
> 
> Not that I know of; neither for systemd nor gcc.

Then, would it be convenient to file a bug for this?  That would be very
helpful for us to locate the issue and fix it.

Before committing the -ftrivial-auto-var-init patch, I did some performance
testing on CPU2017 for x86 and aarch64; the runtime overhead was quite limited.

Which platform was the 35% performance slowdown observed on?

Thanks.

Qing
> 
>> Some slowdown doesn't mean it's of the expected magnitude.
> 
> Can you please rephrase that?
> 
>>> Just a data point, but I believe also exactly your intended
>>> use.  IMO including -ftrivial-auto-var-init is worth extra
>>> consideration.
>>> 
>>> Alternatively, strike the while "cannot cause severe
>>> performance issues".
>>> 
>>> brgds, H-P



Re: [PATCH]middle-end ifcvt: replace C++ sort with vec::qsort [PR109154]

2023-09-19 Thread Richard Biener
On Tue, 19 Sep 2023, Tamar Christina wrote:

> Hi All,
> 
> As requested later on, this replaces the C++ sort with vec::qsort.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/109154
>   * tree-if-conv.cc (INCLUDE_ALGORITHM): Remove.
>   (cmp_arg_entry): New.
>   (predicate_scalar_phi): Use it.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
> index 
> 799f071965e5c41eb352b5530cf1d9c7ecf7bf25..0d7ac82986f399f1c5ff91c04ddb524813ab27de
>  100644
> --- a/gcc/tree-if-conv.cc
> +++ b/gcc/tree-if-conv.cc
> @@ -80,7 +80,6 @@ along with GCC; see the file COPYING3.  If not see
>   :;
>  */
>  
> -#define INCLUDE_ALGORITHM
>  #include "config.h"
>  #include "system.h"
>  #include "coretypes.h"
> @@ -2045,6 +2044,28 @@ gen_phi_nest_statement (gphi *phi, gimple_stmt_iterator *gsi,
>return lhs;
>  }
>  
> +typedef std::pair <tree, std::pair <unsigned, unsigned>> ArgEntry;
> +static int
> +cmp_arg_entry (const void *p1, const void *p2)
> +{
> +  const ArgEntry sval1 = *(const ArgEntry *)p1;
> +  const ArgEntry sval2 = *(const ArgEntry *)p2;
> +  auto x1 = sval1.second;
> +  auto x2 = sval2.second;
> +
> +  if (x1.first < x2.first)
> +return -1;
> +  else if (x1.first > x2.first)
> +return 1;
> +
> +  if (x1.second < x2.second)
> +return -1;
> +  else if (x1.second > x2.second)
> +return 1;
> +
> +  return 0;
> +}
> +
>  /* Replace a scalar PHI node with a COND_EXPR using COND as condition.
> This routine can handle PHI nodes with more than two arguments.
>  
> @@ -2186,7 +2207,6 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
>    /* Determine element with max number of occurrences and complexity.  Looking at only
>       number of occurrences as a measure for complexity isn't enough as all usages can
>       be unique but the comparisons to reach the PHI node differ per branch.  */
> -  typedef std::pair <tree, std::pair <unsigned, unsigned>> ArgEntry;
>    auto_vec<ArgEntry> argsKV;
>for (i = 0; i < args.length (); i++)
>  {
> @@ -2204,10 +2224,7 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
>  }
>  
>/* Sort elements based on rankings ARGS.  */
> -  std::sort(argsKV.begin(), argsKV.end(), [](const ArgEntry &left,
> -  const ArgEntry &right) {
> -return left.second < right.second;
> -  });
> +  argsKV.qsort (cmp_arg_entry);
>  
>for (i = 0; i < args.length (); i++)
>  args[i] = argsKV[i].first;
> 
> 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
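
For readers following the change from std::sort to a qsort-style callback, here is a minimal standalone sketch of the same comparison shape.  It is not the code from tree-if-conv.cc: it uses std::qsort instead of GCC's vec::qsort, and the plain struct `Entry` with its fields `key`, `rank_major` and `rank_minor` is a hypothetical stand-in for ArgEntry (an int replaces `tree`).

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for ArgEntry: a key plus a two-part rank.
struct Entry
{
  int key;
  unsigned rank_major;
  unsigned rank_minor;
};

// qsort-style callback: returns <0, 0 or >0, comparing the rank
// lexicographically (major part first, then minor part).
static int
cmp_entry (const void *p1, const void *p2)
{
  const Entry *e1 = static_cast<const Entry *> (p1);
  const Entry *e2 = static_cast<const Entry *> (p2);

  if (e1->rank_major != e2->rank_major)
    return e1->rank_major < e2->rank_major ? -1 : 1;
  if (e1->rank_minor != e2->rank_minor)
    return e1->rank_minor < e2->rank_minor ? -1 : 1;
  return 0;
}

// Sort a copy of ENTRIES by rank and return the keys in sorted order.
static std::vector<int>
sorted_keys (std::vector<Entry> entries)
{
  if (!entries.empty ())
    std::qsort (entries.data (), entries.size (), sizeof (Entry), cmp_entry);

  std::vector<int> keys;
  for (const Entry &e : entries)
    keys.push_back (e.key);
  return keys;
}
```

The difference from the removed lambda is that a std::sort predicate answers "is left less than right?" with a bool, while a qsort callback has to return a three-way result, so equality needs its own `return 0` path.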


Re: Patch ping: [PATCH] testsuite work-around compound-assignment-1.c C++ failures on various targets [PR111377]

2023-09-19 Thread David Malcolm
On Tue, 2023-09-19 at 09:20 +0200, Jakub Jelinek wrote:
> Hi!
> 
> On Tue, Sep 12, 2023 at 09:02:55AM +0200, Jakub Jelinek via Gcc-
> patches wrote:
> > On Mon, Sep 11, 2023 at 11:11:30PM +0200, Jakub Jelinek via Gcc-
> > patches wrote:
> > > On Mon, Sep 11, 2023 at 07:27:57PM +0200, Benjamin Priour via
> > > Gcc-patches wrote:
> > > > Thanks for the report,
> > > > 
> > > > After investigation it seems the location of the new dejagnu
> > > > directive for
> > > > C++ differs depending on the configuration.
> > > > The expected warning is still emitted, but its location differs
> > > > slightly.
> > > > I expect it to be not an issue per se of the analyzer, but a
> > > > divergence in
> > > > the FE between the two configurations.
> > > 
> > > I think the divergence is whether called_by_test_5b returns the
> > > struct
> > > in registers or in memory.  If in memory (like in the x86_64 -m32
> > > case), we have
> > >   [compound-assignment-1.c:71:21] D.3191 = called_by_test_5b ();
> > > [return slot optimization]
> > >   [compound-assignment-1.c:71:21 discrim 1] D.3191 ={v}
> > > {CLOBBER(eol)};
> > >   [compound-assignment-1.c:72:1] return;
> > > in the IL, while if in registers (like x86_64 -m64 case), just
> > >   [compound-assignment-1.c:71:21] D.3591 = called_by_test_5b ();
> > >   [compound-assignment-1.c:72:1] return;
> > > 
> > > If you just want to avoid the differences, putting } on the same
> > > line as the
> > > call might be a usable workaround for that.
> > 
> > Here is the workaround in patch form.  Tested on x86_64-linux -
> > m32/-m64, ok
> > for trunk?
> 
> I'd like to ping this patch.

OK

Dave

> 
> > 2023-09-12  Jakub Jelinek  
> > 
> > PR testsuite/111377
> > * c-c++-common/analyzer/compound-assignment-1.c (test_5b):
> > Move
> > closing } to the same line as the call to work-around
> > differences in
> > diagnostics line.
> > 
> > --- gcc/testsuite/c-c++-common/analyzer/compound-assignment-
> > 1.c.jj  2023-09-11 11:05:47.523727789 +0200
> > +++ gcc/testsuite/c-c++-common/analyzer/compound-assignment-
> > 1.c 2023-09-12 08:58:52.854231161 +0200
> > @@ -68,5 +68,8 @@ called_by_test_5b (void)
> >  
> >  void test_5b (void)
> >  {
> > -  called_by_test_5b ();
> > -} /* { dg-warning "leak of '.ptr_wrapper::ptr'" "" {
> > target c++ } } */
> > +  called_by_test_5b (); }
> > +/* { dg-warning "leak of '.ptr_wrapper::ptr'" "" {
> > target c++ } .-1 } */
> > +/* The closing } above is intentionally on the same line as the
> > call, because
> > +   otherwise the exact line of the diagnostics depends on whether
> > the
> > +   called_by_test_5b () call satisfies aggregate_value_p or not. 
> > */
> > 
> > 
> > Jakub
> 
> Jakub
> 



Re: [PATCH] gcc: Introduce -fhardened

2023-09-19 Thread Marek Polacek
On Mon, Sep 18, 2023 at 08:57:39AM +0200, Richard Biener wrote:
> On Fri, Sep 15, 2023 at 5:09 PM Marek Polacek via Gcc-patches
>  wrote:
> >
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, 
> > powerpc64le-unknown-linux-gnu,
> > and aarch64-unknown-linux-gnu; ok for trunk?
> >
> > -- >8 --
> > In 
> > I proposed -fhardened, a new umbrella option that enables a reasonable set
> > of hardening flags.  The read of the room seems to be that the option
> > would be useful.  So here's a patch implementing that option.
> >
> > Currently, -fhardened enables:
> >
> >   -D_FORTIFY_SOURCE=3 (or =2 for older glibcs)
> >   -D_GLIBCXX_ASSERTIONS
> >   -ftrivial-auto-var-init=pattern
> >   -fPIE  -pie  -Wl,-z,relro,-z,now
> >   -fstack-protector-strong
> >   -fstack-clash-protection
> >   -fcf-protection=full (x86 GNU/Linux only)
> >
> > -fhardened will not override options that were specified on the command line
> > (before or after -fhardened).  For example,
> >
> >  -D_FORTIFY_SOURCE=1 -fhardened
> >
> > means that _FORTIFY_SOURCE=1 will be used.  Similarly,
> >
> >   -fhardened -fstack-protector
> >
> > will not enable -fstack-protector-strong.
> >
> > In DW_AT_producer it is reflected only as -fhardened; it doesn't expand
> > to anything.  I think we need a better way to show what it actually
> > enables.
> 
> I do think we need to find a solution here to solve asserting compliance.

Fair enough.

> Maybe we can have -Whardened that will diagnose any altering of
> -fhardened by other options on the command-line or by missed target
> implementations?  People might for example use -fstack-protector
> but don't really want to make protection lower than requested with -fhardened.
> 
> Any such conflict is much less apparent than when you use the
> flags -fhardened composes.

How about: --help=hardened says which options -fhardened attempts to
enable, and -Whardened warns when it didn't enable an option?  E.g.,

  -fstack-protector -fhardened -Whardened

would say that it didn't enable -fstack-protector-strong because
-fstack-protector was specified on the command line?

If !HAVE_LD_NOW_SUPPORT, --help=hardened probably doesn't even have to
list -z now, likewise for -z relro.

Unclear if -Whardened should be enabled by default, but probably yes?

Marek
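
The precedence rule and the proposed -Whardened report can be modelled in a few lines.  This is only a sketch under stated assumptions, not how GCC processes options: the table `defaults`, the types `HardenedDefault` and `Resolution`, and the function `resolve_hardened` are all hypothetical, and only the two cases spelled out in the mail (_FORTIFY_SOURCE and -fstack-protector) are covered.

```cpp
#include <string>
#include <vector>

// One default that -fhardened would like to enable, plus the prefix of
// explicit user options that take precedence over it.  Hypothetical table.
struct HardenedDefault
{
  std::string option;           // what -fhardened would enable
  std::string conflict_prefix;  // an explicit option with this prefix wins
};

struct Resolution
{
  std::vector<std::string> enabled;  // defaults that took effect
  std::vector<std::string> skipped;  // what a -Whardened could report
};

static bool
starts_with (const std::string &s, const std::string &prefix)
{
  return s.compare (0, prefix.size (), prefix) == 0;
}

static Resolution
resolve_hardened (const std::vector<std::string> &args)
{
  static const HardenedDefault defaults[] = {
    { "-D_FORTIFY_SOURCE=3", "-D_FORTIFY_SOURCE=" },
    { "-fstack-protector-strong", "-fstack-protector" },
  };

  Resolution r;
  bool hardened = false;
  for (const std::string &arg : args)
    if (arg == "-fhardened")
      hardened = true;
  if (!hardened)
    return r;

  for (const HardenedDefault &d : defaults)
    {
      // Position does not matter: an explicit option before or after
      // -fhardened wins either way.
      bool overridden = false;
      for (const std::string &arg : args)
        if (starts_with (arg, d.conflict_prefix))
          overridden = true;
      (overridden ? r.skipped : r.enabled).push_back (d.option);
    }
  return r;
}
```

With the command line from the mail, `-fstack-protector -fhardened`, this model puts -fstack-protector-strong into `skipped`, which is the case a -Whardened diagnostic would mention.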



Re: RISC-V sign extension query

2023-09-19 Thread Jeff Law




On 9/18/23 21:37, Vineet Gupta wrote:
> On 9/18/23 19:41, Jeff Law wrote:
>> On 9/18/23 13:45, Vineet Gupta wrote:
>>> For the cases which do require sign extends, but are not being
>>> eliminated due to "missing definition(s)", I'm working on adapting
>>> Ajit's REE ABI interfaces work [2] to work for RISC-V as well.
>>
>> I wonder if we could walk the DECL_ARGUMENTS for current_function_decl
>> and create sign extensions for integer arguments smaller than XLEN at
>> the start of the function.  Record them in a list.
>>
>> Then we just let the rest of REE do its thing.  When REE is done we go
>> back and delete those sign extensions we created.
>
> Forgot to add that even if we were to do this (and the test is doing
> this already), REE would fail anyways. It does DF use/def traversal -
> starting with use in an extension insn and finding the defs. If the def
> was implicit - as in a function arg, it bails out. This is essentially
> what Ajit is trying to fix by identifying the def as a potential
> function arg and not bailing.


Right.  What I'm suggesting is to create an explicit extension in the IL 
at the very beginning of REE for the appropriate arguments.  Put the 
extension in the entry block.


That would make extensions that were previously in the RTL redundant 
allowing REE to remove them.


Then we also remove the explicitly added extensions.

Think of the extensions we're adding as expressing the extension 
that the caller would have done and would expose the redundant 
extensions in the callee.  We put them in the IL merely to make the 
existing REE algorithm happy.  Once they've served their purpose in 
exposing redundant extensions we remove the ones we added.


If you're familiar with DSE there are a lot of similarities with how we 
discover dead stores into the stack.  We pretend there's a store to each 
local frame slot in the epilogue, that in turn exposes stores into the 
local frame that would never be read because we're exiting the function.


Jeff
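
The three steps suggested above (add explicit extensions in the entry block, run the pass, delete what was added) can be sketched on a toy straight-line IL.  Nothing below is GCC's RTL or the real REE algorithm: `Insn`, `remove_redundant_extensions` and `optimize_with_entry_extensions` are hypothetical names, and the "pass" is a deliberately simple forward scan that drops an in-place sign extension when the register is already known to be extended.

```cpp
#include <set>
#include <vector>

// Toy instruction: either an in-place sign extension of REG or some
// other definition of REG.  Purely illustrative.
struct Insn
{
  enum Kind { SIGN_EXTEND, OTHER_DEF } kind;
  int reg;         // register defined (SIGN_EXTEND extends it in place)
  bool synthetic;  // true for extensions added only to help the pass
};

// Stand-in for the existing pass: drop every sign extension of a
// register that is already known to be sign-extended at that point.
static void
remove_redundant_extensions (std::vector<Insn> &insns)
{
  std::set<int> extended;
  std::vector<Insn> kept;
  for (const Insn &insn : insns)
    {
      if (insn.kind == Insn::SIGN_EXTEND)
        {
          if (extended.count (insn.reg))
            continue;                 // redundant: drop it
          extended.insert (insn.reg);
        }
      else
        extended.erase (insn.reg);    // new definition: state unknown again
      kept.push_back (insn);
    }
  insns = kept;
}

static std::vector<Insn>
optimize_with_entry_extensions (const std::vector<Insn> &body,
                                const std::vector<int> &narrow_arg_regs)
{
  // 1. Express the extension the caller would have done as explicit
  //    extensions at function entry.
  std::vector<Insn> insns;
  for (int reg : narrow_arg_regs)
    insns.push_back ({ Insn::SIGN_EXTEND, reg, true });
  insns.insert (insns.end (), body.begin (), body.end ());

  // 2. Let the existing pass do its thing.
  remove_redundant_extensions (insns);

  // 3. Delete the extensions that were added in step 1.
  std::vector<Insn> result;
  for (const Insn &insn : insns)
    if (!insn.synthetic)
      result.push_back (insn);
  return result;
}
```

In this model an extension of an incoming argument register disappears only when the register is listed in `narrow_arg_regs`; with an empty list the scan has no definition to rely on and keeps it, which mirrors the "missing definition" problem discussed in the thread.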


[PATCH] c++: further optimize tsubst_template_decl

2023-09-19 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?

-- >8 --

This patch makes tsubst_template_decl use use_spec_table=false also in
the non-class non-function template case, to avoid computing 'argvec' and
doing a hash table lookup from tsubst_decl (when partially instantiating
a member variable or alias template).

This change reveals that for function templates, tsubst_template_decl
registers the partially instantiated TEMPLATE_DECL, whereas for other
non-class templates it registers the corresponding DECL_TEMPLATE_RESULT
which is an interesting inconsistency that I decided to preserve for now.
Trying to consistently register the TEMPLATE_DECL (or FUNCTION_DECL)
causes modules crashes, but I haven't looked into why.

In passing, I noticed in tsubst_function_decl that its 'argvec' goes
unused when 'lambda_fntype' is set (since lambdas aren't recorded in the
specializations table), so we can avoid computing it in that case.
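
As an illustration of the source-level situation (not of the compiler internals), the sketch below shows a class template with a member variable template, a member alias template and a member function template.  The names `Outer`, `same_as`, `second` and `size_sum` are made up for this example; the assumption is that naming `Outer<int>` substitutes T while U remains a template parameter, which is the kind of partial instantiation the description refers to.

```cpp
#include <type_traits>

template <typename T>
struct Outer
{
  // Member variable template (needs C++14).
  template <typename U>
  static constexpr bool same_as = std::is_same<T, U>::value;

  // Member alias template.
  template <typename U>
  using second = U;

  // Member function template.
  template <typename U>
  static constexpr int size_sum ()
  {
    return static_cast<int> (sizeof (T) + sizeof (U));
  }
};

// In each use below T is fixed by the enclosing specialization first;
// U is supplied only afterwards.
static_assert (Outer<int>::same_as<int>, "T = int, U = int");
static_assert (!Outer<int>::same_as<long>, "T = int, U = long");
static_assert (std::is_same<Outer<int>::second<char>, char>::value,
               "alias template");
static_assert (Outer<char>::size_sum<char> () == 2, "function template");
```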

gcc/cp/ChangeLog:

* pt.cc (tsubst_function_decl): Don't bother computing 'argvec'
when 'lambda_fntype' is set.
(tsubst_template_decl): Make sure we return a TEMPLATE_DECL
after specialization lookup.  In the non-class non-function
template case, use tsubst_decl directly with use_spec_table=false,
update DECL_TI_ARGS and call register_specialization like
tsubst_decl would have done if use_spec_table=true.
---
 gcc/cp/pt.cc | 39 +--
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 777ff592789..cc8ba21d6fd 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -14370,7 +14370,7 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain,
 
   /* Calculate the complete set of arguments used to
 specialize R.  */
-  if (use_spec_table)
+  if (use_spec_table && !lambda_fntype)
{
  argvec = tsubst_template_args (DECL_TI_ARGS
 (DECL_TEMPLATE_RESULT
@@ -14380,14 +14380,11 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain,
return error_mark_node;
 
  /* Check to see if we already have this specialization.  */
- if (!lambda_fntype)
-   {
- hash = spec_hasher::hash (gen_tmpl, argvec);
- if (tree spec = retrieve_specialization (gen_tmpl, argvec, hash))
-   /* The spec for these args might be a partial instantiation of 
the
-  template, but here what we want is the FUNCTION_DECL.  */
-   return STRIP_TEMPLATE (spec);
-   }
+ hash = spec_hasher::hash (gen_tmpl, argvec);
+ if (tree spec = retrieve_specialization (gen_tmpl, argvec, hash))
+   /* The spec for these args might be a partial instantiation of the
+  template, but here what we want is the FUNCTION_DECL.  */
+   return STRIP_TEMPLATE (spec);
}
   else
argvec = args;
@@ -14704,6 +14701,8 @@ tsubst_template_decl (tree t, tree args, tsubst_flags_t complain,
/* Type partial instantiations are stored as the type by
   lookup_template_class_1, not here as the template.  */
spec = CLASSTYPE_TI_TEMPLATE (spec);
+ else if (TREE_CODE (spec) != TEMPLATE_DECL)
+   spec = DECL_TI_TEMPLATE (spec);
  return spec;
}
 }
@@ -14754,7 +14753,7 @@ tsubst_template_decl (tree t, tree args, tsubst_flags_t complain,
inner = tsubst_aggr_type (inner, args, complain,
  in_decl, /*entering*/1);
   else
-   inner = tsubst (inner, args, complain, in_decl);
+   inner = tsubst_decl (inner, args, complain, /*use_spec_table=*/false);
 }
   --processing_template_decl;
   if (inner == error_mark_node)
@@ -14780,12 +14779,11 @@ tsubst_template_decl (tree t, tree args, tsubst_flags_t complain,
 }
   else
 {
-  if (TREE_CODE (inner) == FUNCTION_DECL)
-   /* Set DECL_TI_ARGS to the full set of template arguments, which
-  tsubst_function_decl didn't do due to use_spec_table=false.  */
-   DECL_TI_ARGS (inner) = full_args;
-
   DECL_TI_TEMPLATE (inner) = r;
+  /* Set DECL_TI_ARGS to the full set of template arguments,
+which tsubst_function_decl / tsubst_decl didn't do due to
+use_spec_table=false.  */
+  DECL_TI_ARGS (inner) = full_args;
   DECL_TI_ARGS (r) = DECL_TI_ARGS (inner);
 }
 
@@ -14813,9 +14811,14 @@ tsubst_template_decl (tree t, tree args, tsubst_flags_t complain,
   if (PRIMARY_TEMPLATE_P (t))
 DECL_PRIMARY_TEMPLATE (r) = r;
 
-  if (TREE_CODE (decl) == FUNCTION_DECL && !lambda_fntype)
-/* Record this non-type partial instantiation.  */
-register_specialization (r, t, full_args, false, hash);
+  if (!lambda_fntype && !class_p)
+{
+  /* Record this non-type partial instantiation.  */
+  if (TREE_CODE (inner) == FUNCTION_DE

[PATCH 0/2] RISC-V: Support CORE-V XCVMAC and XCVALU extensions

2023-09-19 Thread Mary Bennett
This patch series presents the comprehensive implementation of the MAC and ALU
extension for CORE-V.

Tested with riscv-gnu-toolchain on binutils, ld, gas and gcc testsuites to
ensure its correctness and compatibility with the existing codebase.
However, your input, reviews, and suggestions are invaluable in making this
extension even more robust.

The CORE-V builtins are described in the specification [1] and work can be
found in the OpenHW group's Github repository [2].

[1] 
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

[2] github.com/openhwgroup/corev-gcc

Contributors:
Mary Bennett 
Nandni Jamnadas 
Pietra Ferreira 
Charlie Keaney
Jessica Mills
Craig Blackmore 
Simon Cook 
Jeremy Bennett 
Helene Chelin 

  RISC-V: Add support for XCValu extension in CV32E40P
  RISC-V: Add support for XCVmac extension in CV32E40P

 gcc/common/config/riscv/riscv-common.cc   |   6 +
 gcc/config/riscv/constraints.md   |   7 +
 gcc/config/riscv/corev.def|  43 ++
 gcc/config/riscv/corev.md | 675 ++
 gcc/config/riscv/predicates.md|   5 +
 gcc/config/riscv/riscv-builtins.cc|  13 +
 gcc/config/riscv/riscv-ftypes.def |  11 +
 gcc/config/riscv/riscv-opts.h |   7 +
 gcc/config/riscv/riscv.cc |   7 +
 gcc/config/riscv/riscv.md |   1 +
 gcc/config/riscv/riscv.opt|   3 +
 gcc/doc/extend.texi   | 174 +
 .../gcc.target/riscv/cv-alu-compile.c | 252 +++
 .../riscv/cv-alu-fail-compile-addn.c  |  11 +
 .../riscv/cv-alu-fail-compile-addrn.c |  11 +
 .../riscv/cv-alu-fail-compile-addun.c |  11 +
 .../riscv/cv-alu-fail-compile-addurn.c|  11 +
 .../riscv/cv-alu-fail-compile-clip.c  |  11 +
 .../riscv/cv-alu-fail-compile-clipu.c |  11 +
 .../riscv/cv-alu-fail-compile-subn.c  |  11 +
 .../riscv/cv-alu-fail-compile-subrn.c |  11 +
 .../riscv/cv-alu-fail-compile-subun.c |  11 +
 .../riscv/cv-alu-fail-compile-suburn.c|  11 +
 .../gcc.target/riscv/cv-alu-fail-compile.c|  32 +
 .../gcc.target/riscv/cv-mac-compile.c | 198 +
 .../riscv/cv-mac-fail-compile-mac.c   |  25 +
 .../riscv/cv-mac-fail-compile-machhsn.c   |  24 +
 .../riscv/cv-mac-fail-compile-machhsrn.c  |  24 +
 .../riscv/cv-mac-fail-compile-machhun.c   |  24 +
 .../riscv/cv-mac-fail-compile-machhurn.c  |  24 +
 .../riscv/cv-mac-fail-compile-macsn.c |  24 +
 .../riscv/cv-mac-fail-compile-macsrn.c|  24 +
 .../riscv/cv-mac-fail-compile-macun.c |  24 +
 .../riscv/cv-mac-fail-compile-macurn.c|  24 +
 .../riscv/cv-mac-fail-compile-msu.c   |  25 +
 .../riscv/cv-mac-fail-compile-mulhhsn.c   |  24 +
 .../riscv/cv-mac-fail-compile-mulhhsrn.c  |  24 +
 .../riscv/cv-mac-fail-compile-mulhhun.c   |  24 +
 .../riscv/cv-mac-fail-compile-mulhhurn.c  |  24 +
 .../riscv/cv-mac-fail-compile-mulsn.c |  24 +
 .../riscv/cv-mac-fail-compile-mulsrn.c|  24 +
 .../riscv/cv-mac-fail-compile-mulun.c |  24 +
 .../riscv/cv-mac-fail-compile-mulurn.c|  24 +
 .../riscv/cv-mac-test-autogeneration.c|  18 +
 gcc/testsuite/lib/target-supports.exp |  26 +
 45 files changed, 2022 insertions(+)
 create mode 100644 gcc/config/riscv/corev.def
 create mode 100644 gcc/config/riscv/corev.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-compile.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-addn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-addrn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-addun.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-addurn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-clip.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-clipu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-subn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-subrn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-subun.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-suburn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-compile.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-mac.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-machhsn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-machhsrn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-machhun.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-machhurn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-m

[PATCH 1/2] RISC-V: Add support for XCVmac extension in CV32E40P

2023-09-19 Thread Mary Bennett
Spec: 
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett 
  Nandni Jamnadas 
  Pietra Ferreira 
  Charlie Keaney
  Jessica Mills
  Craig Blackmore 
  Simon Cook 
  Jeremy Bennett 
  Helene Chelin 

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Added XCVmac.
* config/riscv/riscv-ftypes.def: Added XCVmac builtins.
* config/riscv/riscv-opts.h: Likewise.
* config/riscv/riscv.md: Likewise.
* config/riscv/riscv.opt: Likewise.
* doc/extend.texi: Added XCVmac builtin documentation.
* config/riscv/corev.def: New file.
* config/riscv/corev.md: New file.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Added new effective target check.
* gcc.target/riscv/cv-mac-compile.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mac.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-machhsn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-machhsrn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-machhun.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-machhurn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-macsn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-macsrn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-macun.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-macurn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-msu.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulhhsn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulhhsrn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulhhun.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulhhurn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulsn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulsrn.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulun.c: New test.
* gcc.target/riscv/cv-mac-fail-compile-mulurn.c: New test.
* gcc.target/riscv/cv-mac-test-autogeneration.c: New test.
---
 gcc/common/config/riscv/riscv-common.cc   |   4 +
 gcc/config/riscv/corev.def|  19 +
 gcc/config/riscv/corev.md | 390 ++
 gcc/config/riscv/riscv-builtins.cc|  10 +
 gcc/config/riscv/riscv-ftypes.def |   5 +
 gcc/config/riscv/riscv-opts.h |   5 +
 gcc/config/riscv/riscv.md |   1 +
 gcc/config/riscv/riscv.opt|   3 +
 gcc/doc/extend.texi   |  80 
 .../gcc.target/riscv/cv-mac-compile.c | 198 +
 .../riscv/cv-mac-fail-compile-mac.c   |  25 ++
 .../riscv/cv-mac-fail-compile-machhsn.c   |  24 ++
 .../riscv/cv-mac-fail-compile-machhsrn.c  |  24 ++
 .../riscv/cv-mac-fail-compile-machhun.c   |  24 ++
 .../riscv/cv-mac-fail-compile-machhurn.c  |  24 ++
 .../riscv/cv-mac-fail-compile-macsn.c |  24 ++
 .../riscv/cv-mac-fail-compile-macsrn.c|  24 ++
 .../riscv/cv-mac-fail-compile-macun.c |  24 ++
 .../riscv/cv-mac-fail-compile-macurn.c|  24 ++
 .../riscv/cv-mac-fail-compile-msu.c   |  25 ++
 .../riscv/cv-mac-fail-compile-mulhhsn.c   |  24 ++
 .../riscv/cv-mac-fail-compile-mulhhsrn.c  |  24 ++
 .../riscv/cv-mac-fail-compile-mulhhun.c   |  24 ++
 .../riscv/cv-mac-fail-compile-mulhhurn.c  |  24 ++
 .../riscv/cv-mac-fail-compile-mulsn.c |  24 ++
 .../riscv/cv-mac-fail-compile-mulsrn.c|  24 ++
 .../riscv/cv-mac-fail-compile-mulun.c |  24 ++
 .../riscv/cv-mac-fail-compile-mulurn.c|  24 ++
 .../riscv/cv-mac-test-autogeneration.c|  18 +
 gcc/testsuite/lib/target-supports.exp |  13 +
 30 files changed, 1180 insertions(+)
 create mode 100644 gcc/config/riscv/corev.def
 create mode 100644 gcc/config/riscv/corev.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-compile.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-mac.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-machhsn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-machhsrn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-machhun.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-machhurn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-macsn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-macsrn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-macun.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-macurn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-msu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-mulhhsn.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/cv-mac-fail-compile-

[PATCH 2/2] RISC-V: Add support for XCValu extension in CV32E40P

2023-09-19 Thread Mary Bennett
Spec: 
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett 
  Nandni Jamnadas 
  Pietra Ferreira 
  Charlie Keaney
  Jessica Mills
  Craig Blackmore 
  Simon Cook 
  Jeremy Bennett 
  Helene Chelin 

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Added the XCValu
  extension.
* config/riscv/constraints.md: Added builtins for the XCValu
  extension.
* config/riscv/predicates.md (immediate_register_operand):
  Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
  (RISCV_ATYPE_UHI): Likewise.
* config/riscv/riscv-ftypes.def: Likewise.
* config/riscv/riscv-opts.h: Likewise.
* config/riscv/riscv.opt: Likewise.
* config/riscv/riscv.cc (riscv_print_operand): Likewise.
* doc/extend.texi: Added XCValu documentation.
* config/riscv/corev.def: New file.
* config/riscv/corev.md: New file.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Added proc for the XCValu extension.
* gcc.target/riscv/cv-alu-compile.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addrn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addun.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-addurn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-clip.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-clipu.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-subn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-subrn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-subun.c: New test.
* gcc.target/riscv/cv-alu-fail-compile-suburn.c: New test.
* gcc.target/riscv/cv-alu-fail-compile.c: New test.
---
 gcc/common/config/riscv/riscv-common.cc   |   2 +
 gcc/config/riscv/constraints.md   |   7 +
 gcc/config/riscv/corev.def|  24 ++
 gcc/config/riscv/corev.md | 285 ++
 gcc/config/riscv/predicates.md|   5 +
 gcc/config/riscv/riscv-builtins.cc|   3 +
 gcc/config/riscv/riscv-ftypes.def |   6 +
 gcc/config/riscv/riscv-opts.h |   2 +
 gcc/config/riscv/riscv.cc |   7 +
 gcc/doc/extend.texi   |  94 ++
 .../gcc.target/riscv/cv-alu-compile.c | 252 
 .../riscv/cv-alu-fail-compile-addn.c  |  11 +
 .../riscv/cv-alu-fail-compile-addrn.c |  11 +
 .../riscv/cv-alu-fail-compile-addun.c |  11 +
 .../riscv/cv-alu-fail-compile-addurn.c|  11 +
 .../riscv/cv-alu-fail-compile-clip.c  |  11 +
 .../riscv/cv-alu-fail-compile-clipu.c |  11 +
 .../riscv/cv-alu-fail-compile-subn.c  |  11 +
 .../riscv/cv-alu-fail-compile-subrn.c |  11 +
 .../riscv/cv-alu-fail-compile-subun.c |  11 +
 .../riscv/cv-alu-fail-compile-suburn.c|  11 +
 .../gcc.target/riscv/cv-alu-fail-compile.c|  32 ++
 gcc/testsuite/lib/target-supports.exp |  13 +
 23 files changed, 842 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-compile.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-addn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-addrn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-addun.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-addurn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-clip.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-clipu.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-subn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-subrn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-subun.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile-suburn.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/cv-alu-fail-compile.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 53e21fa4bce..e7c1a99fbd2 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -311,6 +311,7 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"svpbmt",  ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"xcvmac", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"xcvalu", ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"xtheadba", ISA_SPEC_CLASS_NONE, 1, 0},
   {"xtheadbb", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1483,6 +1484,7 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"ztso", &gcc_options::x_riscv_ztso_subext, MASK_ZTSO},
 
   {"xcvmac",&gcc_options::x_riscv_xcv_flags, MASK_XCVMAC},
+  {"xcvalu",&gcc_options::x_riscv_xcv_flags, MASK_XCVALU},
 
   {"xtheadba

Re: [PATCH] gcc: Introduce -fhardened

2023-09-19 Thread Jakub Jelinek
On Tue, Sep 19, 2023 at 10:58:19AM -0400, Marek Polacek wrote:
> > > In 
> > > I proposed -fhardened, a new umbrella option that enables a reasonable set
> > > of hardening flags.  The read of the room seems to be that the option
> > > would be useful.  So here's a patch implementing that option.
> > >
> > > Currently, -fhardened enables:
> > >
> > >   -D_FORTIFY_SOURCE=3 (or =2 for older glibcs)
> > >   -D_GLIBCXX_ASSERTIONS
> > >   -ftrivial-auto-var-init=pattern
> > >   -fPIE  -pie  -Wl,-z,relro,-z,now
> > >   -fstack-protector-strong
> > >   -fstack-clash-protection
> > >   -fcf-protection=full (x86 GNU/Linux only)
> > >
> > > -fhardened will not override options that were specified on the command 
> > > line
> > > (before or after -fhardened).  For example,
> > >
> > >  -D_FORTIFY_SOURCE=1 -fhardened
> > >
> > > means that _FORTIFY_SOURCE=1 will be used.  Similarly,
> > >
> > >   -fhardened -fstack-protector
> > >
> > > will not enable -fstack-protector-strong.
> > >
> > > In DW_AT_producer it is reflected only as -fhardened; it doesn't expand
> > > to anything.  I think we need a better way to show what it actually
> > > enables.
> > 
> > I do think we need to find a solution here to solve asserting compliance.
> 
> Fair enough.

Well, asserting compliance doesn't make sense, because many of these features
are only best effort.  So, one can assert that certain options have been
passed to the compiler (or not), and for that the current
-grecord-gcc-switches I think works mostly fine (well, it doesn't record -D*
options), one knows the compiler version/snapshot date etc. and what options
have been passed and can by repeating those options see what is and isn't
enabled in that case.
As for what exact options are actually enabled in the end, --help should be
able to answer that.  Still, even if one records that -D_FORTIFY_SOURCE=3
was passed on the command line, that doesn't mean there is no
#undef _FORTIFY_SOURCE in the source before including headers, or that
the compiler has been successful in figuring out the object size (static or
dynamic) for a certain pointer, or that a function has some array so that it
will use a stack protector guard, or that a certain function didn't disable
-fstack-protector through function attributes etc.
So, if one wants to know whether a certain vulnerability exploit can be stopped
through hardening, one needs to analyze the actually emitted code; just looking
for checkboxes isn't enough.
One can assert -D_FORTIFY_SOURCE=2 has been passed, but without glibc
headers being used and actually using __builtin_*object_size it doesn't do
anything either.  Programs can just declare memcpy etc. themselves and not
include glibc headers...

If the checkboxes are desirable for some reason, perhaps we could introduce
a new DWARF DW_AT_GNU_hardening attribute which would contain, say, a bitfield
of which of those hardening features have been enabled (though I'm not sure if
we want to emit them just per DW_TAG_compile_unit, or DW_TAG_subprogram,
or both (say, put on DW_TAG_compile_unit always if at least one of those is
enabled and on DW_TAG_subprogram only if it is different from the CU one)).

Jakub
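
The emission rule floated in the last paragraph can be written down as a small sketch.  DW_AT_GNU_hardening is only a proposal in this mail, so everything here is hypothetical: the bit assignments in `HardeningBits` and the helper names `emit_on_compile_unit` and `emit_on_subprogram` are invented for illustration and do not correspond to any existing DWARF attribute or GCC code.

```cpp
#include <cstdint>

// Hypothetical bit assignments for the proposed attribute value.
enum HardeningBits : std::uint32_t
{
  HARDEN_FORTIFY_SOURCE  = 1u << 0,
  HARDEN_STACK_PROTECTOR = 1u << 1,
  HARDEN_STACK_CLASH     = 1u << 2,
  HARDEN_CF_PROTECTION   = 1u << 3,
};

// Put the attribute on DW_TAG_compile_unit whenever at least one
// feature is enabled for the unit.
static bool
emit_on_compile_unit (std::uint32_t cu_bits)
{
  return cu_bits != 0;
}

// Put it on DW_TAG_subprogram only when the function's set of enabled
// features differs from the compile unit's.
static bool
emit_on_subprogram (std::uint32_t cu_bits, std::uint32_t fn_bits)
{
  return fn_bits != cu_bits;
}
```

Under this rule a function that turns the stack protector off through an attribute would carry its own value, while functions that simply inherit the unit's settings would carry nothing.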



[pushed] c++: inherited default constructor [CWG2799]

2023-09-19 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

In this testcase, it seems clear that B should be trivially
default-constructible, since the inherited default constructor is trivial
and there are no other subobjects to initialize.  But we were saying no
because we don't define triviality of inherited constructors.

CWG discussion suggested that the solution is to implicitly declare a
default constructor when inheriting a default constructor; that makes sense
to me.
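
A sketch of the kind of class hierarchy being discussed follows; it is a guess at the shape of the problem, not a copy of the new test.  The names `A`, `B` and `b_is_trivially_default_constructible` are chosen for this example.  The result for B is deliberately returned rather than asserted, because it depends on whether the compiler implements this resolution.

```cpp
#include <type_traits>

struct A
{
  A () = default;   // trivial default constructor
};

struct B : A
{
  using A::A;       // inherits A's constructors
  B (int) {}        // user-declared, so the usual rule adds no implicit B()
};

// A itself is uncontroversial.
static_assert (std::is_trivially_default_constructible<A>::value,
               "A has a trivial default constructor");

// Whether this returns true is what the change is about: with an
// implicitly declared default constructor for B it can be trivial.
static bool
b_is_trivially_default_constructible ()
{
  return std::is_trivially_default_constructible<B>::value;
}
```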

DR 2799

gcc/cp/ChangeLog:

* class.cc (add_implicit_default_ctor): Split out...
(add_implicitly_declared_members): ...from here.
Also call it when inheriting a default ctor.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/inh-ctor38.C: New test.
---
 gcc/cp/class.cc | 36 -
 gcc/testsuite/g++.dg/cpp0x/inh-ctor38.C | 19 +
 2 files changed, 43 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/inh-ctor38.C

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index d270dcbb14c..469e98ed8b7 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -3292,6 +3292,22 @@ one_inherited_ctor (tree ctor, tree t, tree using_decl)
 }
 }
 
+/* Implicitly declare T().  */
+
+static void
+add_implicit_default_ctor (tree t)
+{
+  TYPE_HAS_DEFAULT_CONSTRUCTOR (t) = 1;
+  CLASSTYPE_LAZY_DEFAULT_CTOR (t) = 1;
+  if (cxx_dialect >= cxx11)
+TYPE_HAS_CONSTEXPR_CTOR (t)
+  /* Don't force the declaration to get a hard answer; if the
+definition would have made the class non-literal, it will still be
+non-literal because of the base or member in question, and that
+gives a better diagnostic.  */
+  = type_maybe_constexpr_default_constructor (t);
+}
+
 /* Create default constructors, assignment operators, and so forth for
the type indicated by T, if they are needed.  CANT_HAVE_CONST_CTOR,
and CANT_HAVE_CONST_ASSIGNMENT are nonzero if, for whatever reason,
@@ -3320,17 +3336,7 @@ add_implicitly_declared_members (tree t, tree* 
access_decls,
  If there is no user-declared constructor for a class, a default
  constructor is implicitly declared.  */
   if (! TYPE_HAS_USER_CONSTRUCTOR (t))
-{
-  TYPE_HAS_DEFAULT_CONSTRUCTOR (t) = 1;
-  CLASSTYPE_LAZY_DEFAULT_CTOR (t) = 1;
-  if (cxx_dialect >= cxx11)
-   TYPE_HAS_CONSTEXPR_CTOR (t)
- /* Don't force the declaration to get a hard answer; if the
-definition would have made the class non-literal, it will still be
-non-literal because of the base or member in question, and that
-gives a better diagnostic.  */
- = type_maybe_constexpr_default_constructor (t);
-}
+add_implicit_default_ctor (t);
 
   /* [class.ctor]
 
@@ -3394,7 +3400,13 @@ add_implicitly_declared_members (tree t, tree* 
access_decls,
  location_t loc = input_location;
  input_location = DECL_SOURCE_LOCATION (using_decl);
  for (tree fn : ovl_range (ctor_list))
-   one_inherited_ctor (fn, t, using_decl);
+   {
+ if (!TYPE_HAS_DEFAULT_CONSTRUCTOR (t) && default_ctor_p (fn))
+   /* CWG2799: Inheriting a default constructor gives us a default
+  constructor, not just an inherited constructor.  */
+   add_implicit_default_ctor (t);
+ one_inherited_ctor (fn, t, using_decl);
+   }
  *access_decls = TREE_CHAIN (*access_decls);
  input_location = loc;
}
diff --git a/gcc/testsuite/g++.dg/cpp0x/inh-ctor38.C 
b/gcc/testsuite/g++.dg/cpp0x/inh-ctor38.C
new file mode 100644
index 000..56217be1aae
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/inh-ctor38.C
@@ -0,0 +1,19 @@
+// CWG 2799
+// Test that inheriting a trivial default constructor produces a trivial
+// default constructor.
+
+// { dg-do compile { target c++11 } }
+
+#include <type_traits>
+
+struct A {
+  A() = default;
+};
+
+struct B : A
+{
+  using A::A;
+  B(int);
+};
+
+static_assert (std::is_trivially_constructible<B>::value, "");

base-commit: 5c5851bd93b8078bdd9665bc9bfe91fbf0028dc1
-- 
2.39.3



Re: [PATCH] RISC-V: Support combine cond extend and reduce sum to cond widen reduce sum

2023-09-19 Thread Lehua Ding

Hi Robin,


Would it hurt to allow any nonmemory operand here and just force the
"unsupported" constants into a register?


Are you asking why operand 2 doesn't use the nonmemory_operand
predicate? If so, I think this is because our vmerge.v[vxi]m insns
only allow operand 1 to be a scalar; operand 2 must be a
vector register.


My question was rather:

Why doesn't something like

(define_insn_and_split "vcond_mask_"
   [(set (match_operand:V_VLS 0 "register_operand")
 (if_then_else:V_VLS
   (match_operand: 3 "register_operand")
   (match_operand:V_VLS 1 "nonmemory_operand")
   (match_operand:V_VLS 2 "nonmemory_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
   [(const_int 0)]
   {
 /* The order of vcond_mask is opposite to pred_merge.  */
 if (REG_P (operands[2]))
   operands[2] = force_reg (mode, operands[2]);
 std::swap (operands[1], operands[2]);
 riscv_vector::emit_vlmax_insn (code_for_pred_merge (mode),
riscv_vector::MERGE_OP, operands);
 DONE;
   }
   [(set_attr "type" "vector")]
)

suffice?  You could disallow operands[2] != 0 if needed.


I think I understand what you're saying. The reason for not doing this
is that simply allowing operand 2 of vcond_mask to be vec_const_0
would cause the other combine patterns to fail, because they require
operand 2 of vcond_mask to be a register. Take the following existing
combine pattern, for example (operand 2, as the instruction's merge
operand, is not allowed to be a non-register):


(define_insn_and_split "*cond_"
  [(set (match_operand:VF 0 "register_operand")
 (if_then_else:VF
   (match_operand: 1 "register_operand")
   (any_float_unop:VF
 (match_operand:VF 2 "register_operand"))
   (match_operand:VF 3 "register_operand")))]
  "TARGET_VECTOR && can_create_pseudo_p ()"
  "#"
  "&& 1"
  [(const_int 0)]
{
  insn_code icode = code_for_pred (, mode);
  rtx ops[] = {operands[0], operands[1], operands[2], operands[3],
   gen_int_mode (GET_MODE_NUNITS (mode), Pmode)};
  riscv_vector::expand_cond_len_unop (icode, ops);
  DONE;
}
[(set_attr "type" "vector")])

My current method still keeps operand 2 of vcond_mask as a
register, but the mov_vec_const_0 pattern is simplified so that the
corresponding combine pattern can be simpler. That's the only reason
I split vcond_mask into three patterns.


--
Best,
Lehua


Re: PING^5: [PATCH] rtl-optimization/110939 Really fix narrow comparison of memory and constant

2023-09-19 Thread Stefan Schulze Frielinghaus
Since this patch has been sitting in the queue for quite some time and (more
importantly?) solves a bootstrap problem, let me reiterate:

While writing the initial commit 7cdd0860949c6c3232e6cff1d7ca37bb5234074c
and the subsequent (potential) fix 41ef5a34161356817807be3a2e51fbdbe575ae85
I was not aware of the fact that the normal form of a CONST_INT,
representing an unsigned integer with fewer bits than in HOST_WIDE_INT,
is a sign-extended version of the unsigned integer.  This invariant is
checked in rtl.h where we have at line 2297:

case CONST_INT:
  if (precision < HOST_BITS_PER_WIDE_INT)
/* Nonzero BImodes are stored as STORE_FLAG_VALUE, which on many
   targets is 1 rather than -1.  */
gcc_checking_assert (INTVAL (x.first)
 == sext_hwi (INTVAL (x.first), precision)
 || (x.second == BImode && INTVAL (x.first) == 1));

This was pretty surprising and, frankly speaking, unintuitive to me, which
is why I skimmed further over existing code, where I found this
in combine.cc:

  /* (unsigned) < 0x8000 is equivalent to >= 0.  */
  else if (is_a <scalar_int_mode> (mode, &int_mode)
   && GET_MODE_PRECISION (int_mode) - 1 < HOST_BITS_PER_WIDE_INT
   && ((unsigned HOST_WIDE_INT) const_op
   == HOST_WIDE_INT_1U << (GET_MODE_PRECISION (int_mode) - 1)))
{

The expression of the if-statement is a contradiction rendering the then-part
unreachable unless you mask out the upper bits of the sign-extended
unsigned integer const_op (as proposed in the inlined patch):

  ((unsigned HOST_WIDE_INT) const_op & GET_MODE_MASK (int_mode))
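
To make this concrete, here is a small standalone sketch (plain C++, not
GCC code) of the situation for a 32-bit mode on a host where HOST_WIDE_INT
is assumed to be 64 bits wide.  The names hwi, uhwi, sext, matches_unmasked
and matches_masked are made up for this example; sext only stands in for
what sext_hwi does, and mode_mask for GET_MODE_MASK.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for HOST_WIDE_INT and unsigned HOST_WIDE_INT (assumed 64-bit).
using hwi = std::int64_t;
using uhwi = std::uint64_t;

// Sign-extend the low PREC bits of SRC; illustrative stand-in for sext_hwi.
static hwi sext (hwi src, unsigned prec)
{
  assert (prec > 0 && prec < 64);
  uhwi mask = (uhwi (1) << prec) - 1;
  uhwi sign = uhwi (1) << (prec - 1);
  uhwi low = uhwi (src) & mask;
  // (low ^ sign) - sign copies bit PREC-1 into all higher bits.
  return hwi ((low ^ sign) - sign);
}

// The comparison without masking; PREC plays the role of
// GET_MODE_PRECISION (int_mode).
static bool matches_unmasked (hwi const_op, unsigned prec)
{
  return uhwi (const_op) == (uhwi (1) << (prec - 1));
}

// The comparison with the upper bits masked out first.
static bool matches_masked (hwi const_op, unsigned prec)
{
  uhwi mode_mask = (uhwi (1) << prec) - 1;
  return (uhwi (const_op) & mode_mask) == (uhwi (1) << (prec - 1));
}
```

With const_op = sext (0x80000000, 32), i.e. the normal form of the unsigned
32-bit value 0x80000000, matches_unmasked returns false while
matches_masked returns true.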

This is why I got a bit uncertain and hoped to get some feedback whether
my intuition is correct or not.  Meanwhile I also found a comment in
the internals book at "14.7 Constant Expression Types" where we have:

   "Constants generated for modes with fewer bits than in HOST_WIDE_INT
must be sign extended to full width (e.g., with gen_int_mode).
[...]
Note however that values are neither inherently signed nor
inherently unsigned; where necessary, signedness is determined by
the rtl operation instead."

At least this and the assert statement document that the normal form of
a CONST_INT is kind of special w.r.t. unsigned integers.  Is there
anyone who can shed some light on _why_ such a normal form was chosen?

Independent of why such a normal form was chosen, this patch restores
the normal form and solves the bootstrap problem for LoongArch.

Cheers,
Stefan


[COMMITTED] [frange] Add op2_range for operator_not_equal.

2023-09-19 Thread Aldy Hernandez
We're missing an op2_range entry for operator_not_equal so GORI can
calculate an outgoing edge.  On the false side of != the operands compare
equal, which guarantees we don't have a NAN, so it's important to get this
right.  We eventually get it through an intersection of various ranges in
ranger, but it's best to get things correct as early as possible.
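
As a quick sanity check of the NAN claim, the following standalone sketch
(ordinary C++, unrelated to the ranger implementation) walks a few sample
values and counts the pairs for which x != y is false although one operand
is a NAN.  It assumes IEEE comparison semantics, i.e. no -ffast-math; the
function names are made up for the example.

```cpp
#include <cmath>
#include <limits>

// True if the false edge of "x != y" would be taken.
static bool takes_false_edge (float x, float y)
{
  return !(x != y);
}

// Count sample pairs that take the false edge while at least one operand
// is a NAN.  With IEEE semantics this should never happen.
static int nan_on_false_edge_count ()
{
  const float nan = std::numeric_limits<float>::quiet_NaN ();
  const float inf = std::numeric_limits<float>::infinity ();
  const float samples[] = { nan, -inf, -1.0f, -0.0f, 0.0f, 1.0f, inf };
  int count = 0;
  for (float x : samples)
    for (float y : samples)
      if (takes_false_edge (x, y) && (std::isnan (x) || std::isnan (y)))
        ++count;
  return count;
}
```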

gcc/ChangeLog:

* range-op-float.cc (operator_not_equal::op2_range): New.
* range-op-mixed.h: Add operator_not_equal::op2_range.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp-float-13.c: New test.
---
 gcc/range-op-float.cc|  8 
 gcc/range-op-mixed.h |  3 +++
 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-13.c | 16 
 3 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-13.c

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index cc729e12a9e..91d3096fdac 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -900,6 +900,14 @@ operator_not_equal::op1_range (frange &r, tree type,
   return true;
 }
 
+bool
+operator_not_equal::op2_range (frange &r, tree type,
+  const irange &lhs,
+  const frange &op1,
+  relation_trio trio) const
+{
+  return op1_range (r, type, lhs, op1, trio);
+}
 
 // Check if the LHS range indicates a relation between OP1 and OP2.
 
diff --git a/gcc/range-op-mixed.h b/gcc/range-op-mixed.h
index ef562326c1f..f7ff47b2725 100644
--- a/gcc/range-op-mixed.h
+++ b/gcc/range-op-mixed.h
@@ -164,6 +164,9 @@ public:
   bool op2_range (irange &r, tree type,
  const irange &lhs, const irange &op1,
  relation_trio = TRIO_VARYING) const final override;
+  bool op2_range (frange &r, tree type,
+ const irange &lhs, const frange &op1,
+ relation_trio = TRIO_VARYING) const final override;
 
   relation_kind op1_op2_relation (const irange &lhs, const irange &,
  const irange &) const final override;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-13.c 
b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-13.c
new file mode 100644
index 000..f5a0164dd91
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-13.c
@@ -0,0 +1,16 @@
+// { dg-do compile }
+// { dg-options "-O2 -fno-thread-jumps -fdisable-tree-fre1 
-fdump-tree-evrp-details" }
+
+void a(float, float);
+void b(float, float);
+
+void foo(float x, float y)
+{
+  if (x != y)
+a (x,y);
+  else if (x < y)
+b (x,y);
+}
+
+// Test that the false side of if(x != y) has a range for y.
+// { dg-final { scan-tree-dump "2->4  \\(F\\) y_3\\(D\\)" "evrp" } }
-- 
2.41.0



gcc-patches@gcc.gnu.org

2023-09-19 Thread Aldy Hernandez
We can set_nan() with a nan_state so it's good form to have the
analogous form for update_nan().
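
The shape of the change can be sketched with a toy model (not the real
frange; HONOR_NANS, normalize_kind and verify_range are left out, and
toy_nan_range is a name invented here).  The nan_state below is a
simplified stand-in whose one-argument constructor is assumed to set both
signs at once.

```cpp
// Simplified stand-in for nan_state: which NAN signs are possible.
class nan_state
{
public:
  explicit nan_state (bool both) : m_pos (both), m_neg (both) {}
  nan_state (bool pos, bool neg) : m_pos (pos), m_neg (neg) {}
  bool pos_p () const { return m_pos; }
  bool neg_p () const { return m_neg; }
private:
  bool m_pos;
  bool m_neg;
};

// Toy range that only tracks the two NAN bits.
struct toy_nan_range
{
  bool m_pos_nan = false;
  bool m_neg_nan = false;

  // General form: the other two overloads funnel into this one.
  void update_nan (const nan_state &nan)
  {
    m_pos_nan = nan.pos_p ();
    m_neg_nan = nan.neg_p ();
  }

  // Set the NAN bits to +-NAN.
  void update_nan () { update_nan (nan_state (true)); }

  // Set a NAN of known sign; SIGN is true for a negative NAN.
  void update_nan (bool sign) { update_nan (nan_state (!sign, sign)); }
};
```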

gcc/ChangeLog:

* value-range.h (frange::update_nan): New.
---
 gcc/value-range.h | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/gcc/value-range.h b/gcc/value-range.h
index da04be00ab4..a792c593faa 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -1257,36 +1257,40 @@ frange::set_undefined ()
 verify_range ();
 }
 
-// Set the NAN bit and adjust the range.
+// Set the NAN bits to NAN and adjust the range.
 
 inline void
-frange::update_nan ()
+frange::update_nan (const nan_state &nan)
 {
   gcc_checking_assert (!undefined_p ());
   if (HONOR_NANS (m_type))
 {
-  m_pos_nan = true;
-  m_neg_nan = true;
+  m_pos_nan = nan.pos_p ();
+  m_neg_nan = nan.neg_p ();
   normalize_kind ();
   if (flag_checking)
verify_range ();
 }
 }
 
+// Set the NAN bit to +-NAN.
+
+inline void
+frange::update_nan ()
+{
+  gcc_checking_assert (!undefined_p ());
+  nan_state nan (true);
+  update_nan (nan);
+}
+
 // Like above, but set the sign of the NAN.
 
 inline void
 frange::update_nan (bool sign)
 {
   gcc_checking_assert (!undefined_p ());
-  if (HONOR_NANS (m_type))
-{
-  m_pos_nan = !sign;
-  m_neg_nan = sign;
-  normalize_kind ();
-  if (flag_checking)
-   verify_range ();
-}
+  nan_state nan (/*pos=*/!sign, /*neg=*/sign);
+  update_nan (nan);
 }
 
 inline bool
-- 
2.41.0



[COMMITTED] [frange] Remove redundant known_isnan() checks.

2023-09-19 Thread Aldy Hernandez
The known_isnan() method is a subset of maybe_isnan().  This patch
removes redundant calls to known_isnan().
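
The subset relation can be checked exhaustively on a toy model.  The struct
below is only an illustration written for this note (it is not frange, and
toy_range and known_check_is_redundant are invented names): a range is
modelled as "may contain ordinary numbers" plus the two NAN bits.

```cpp
// Toy model of a float range.
struct toy_range
{
  bool has_numbers;  // range contains at least one non-NAN value
  bool pos_nan;      // +NAN is possible
  bool neg_nan;      // -NAN is possible

  // The range may contain a NAN.
  bool maybe_isnan () const { return pos_nan || neg_nan; }

  // The range contains nothing but NAN.
  bool known_isnan () const { return !has_numbers && maybe_isnan (); }
};

// Return true if "known_isnan () || maybe_isnan ()" agrees with plain
// "maybe_isnan ()" in every state of the toy model.
static bool known_check_is_redundant ()
{
  for (int bits = 0; bits < 8; ++bits)
    {
      toy_range r = { (bits & 1) != 0, (bits & 2) != 0, (bits & 4) != 0 };
      if ((r.known_isnan () || r.maybe_isnan ()) != r.maybe_isnan ())
        return false;
    }
  return true;
}
```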

gcc/ChangeLog:

* range-op-float.cc (operator_lt::op1_range): Remove known_isnan check.
(operator_lt::op2_range): Same.
(operator_le::op1_range): Same.
(operator_le::op2_range): Same.
(operator_gt::op1_range): Same.
(operator_gt::op2_range): Same.
(operator_ge::op1_range): Same.
(operator_ge::op2_range): Same.
(foperator_unordered_lt::op1_range): Same.
(foperator_unordered_lt::op2_range): Same.
(foperator_unordered_le::op1_range): Same.
(foperator_unordered_le::op2_range): Same.
(foperator_unordered_gt::op1_range): Same.
(foperator_unordered_gt::op2_range): Same.
(foperator_unordered_ge::op1_range): Same.
(foperator_unordered_ge::op2_range): Same.
---
 gcc/range-op-float.cc | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index 91d3096fdac..5eb1d9c06e3 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -981,7 +981,7 @@ operator_lt::op1_range (frange &r,
 
 case BRS_FALSE:
   // On the FALSE side of x < NAN, we know nothing about x.
-  if (op2.known_isnan () || op2.maybe_isnan ())
+  if (op2.maybe_isnan ())
r.set_varying (type);
   else
build_ge (r, type, op2);
@@ -1018,7 +1018,7 @@ operator_lt::op2_range (frange &r,
 
 case BRS_FALSE:
   // On the FALSE side of NAN < x, we know nothing about x.
-  if (op1.known_isnan () || op1.maybe_isnan ())
+  if (op1.maybe_isnan ())
r.set_varying (type);
   else
build_le (r, type, op1);
@@ -1091,7 +1091,7 @@ operator_le::op1_range (frange &r,
 
 case BRS_FALSE:
   // On the FALSE side of x <= NAN, we know nothing about x.
-  if (op2.known_isnan () || op2.maybe_isnan ())
+  if (op2.maybe_isnan ())
r.set_varying (type);
   else
build_gt (r, type, op2);
@@ -1124,7 +1124,7 @@ operator_le::op2_range (frange &r,
 
 case BRS_FALSE:
   // On the FALSE side of NAN <= x, we know nothing about x.
-  if (op1.known_isnan () || op1.maybe_isnan ())
+  if (op1.maybe_isnan ())
r.set_varying (type);
   else if (op1.undefined_p ())
return false;
@@ -1210,7 +1210,7 @@ operator_gt::op1_range (frange &r,
 
 case BRS_FALSE:
   // On the FALSE side of x > NAN, we know nothing about x.
-  if (op2.known_isnan () || op2.maybe_isnan ())
+  if (op2.maybe_isnan ())
r.set_varying (type);
   else if (op2.undefined_p ())
return false;
@@ -1249,7 +1249,7 @@ operator_gt::op2_range (frange &r,
 
 case BRS_FALSE:
   // On The FALSE side of NAN > x, we know nothing about x.
-  if (op1.known_isnan () || op1.maybe_isnan ())
+  if (op1.maybe_isnan ())
r.set_varying (type);
   else if (op1.undefined_p ())
return false;
@@ -1323,7 +1323,7 @@ operator_ge::op1_range (frange &r,
 
 case BRS_FALSE:
   // On the FALSE side of x >= NAN, we know nothing about x.
-  if (op2.known_isnan () || op2.maybe_isnan ())
+  if (op2.maybe_isnan ())
r.set_varying (type);
   else if (op2.undefined_p ())
return false;
@@ -1357,7 +1357,7 @@ operator_ge::op2_range (frange &r, tree type,
 
 case BRS_FALSE:
   // On the FALSE side of NAN >= x, we know nothing about x.
-  if (op1.known_isnan () || op1.maybe_isnan ())
+  if (op1.maybe_isnan ())
r.set_varying (type);
   else if (op1.undefined_p ())
return false;
@@ -1720,7 +1720,7 @@ foperator_unordered_lt::op1_range (frange &r, tree type,
   switch (get_bool_state (r, lhs, type))
 {
 case BRS_TRUE:
-  if (op2.known_isnan () || op2.maybe_isnan ())
+  if (op2.maybe_isnan ())
r.set_varying (type);
   else if (op2.undefined_p ())
return false;
@@ -1754,7 +1754,7 @@ foperator_unordered_lt::op2_range (frange &r, tree type,
   switch (get_bool_state (r, lhs, type))
 {
 case BRS_TRUE:
-  if (op1.known_isnan () || op1.maybe_isnan ())
+  if (op1.maybe_isnan ())
r.set_varying (type);
   else if (op1.undefined_p ())
return false;
@@ -1832,7 +1832,7 @@ foperator_unordered_le::op1_range (frange &r, tree type,
   switch (get_bool_state (r, lhs, type))
 {
 case BRS_TRUE:
-  if (op2.known_isnan () || op2.maybe_isnan ())
+  if (op2.maybe_isnan ())
r.set_varying (type);
   else if (op2.undefined_p ())
return false;
@@ -1865,7 +1865,7 @@ foperator_unordered_le::op2_range (frange &r,
   switch (get_bool_state (r, lhs, type))
 {
 case BRS_TRUE:
-  if (op1.known_isnan () || op1.maybe_isnan ())
+  if (op1.maybe_isnan ())
r.set_varying (type);
   else if (op1.undefined_p ())
return false;
@@ -1945,7 +1945,7 @@ foperator_unordered_gt::op1_range 

Re: [PATCH v3][RFC] c-family: Implement __has_feature and __has_extension [PR60512]

2023-09-19 Thread Jason Merrill

On 8/3/23 05:21, Alex Coplan wrote:

Hi,

This patch implements clang's __has_feature and __has_extension in GCC.
This is a v3 which addresses feedback for the v2 patch posted here:

https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626058.html

Main changes since v2:
  - As per Jason's feedback, dropped the langhook in favour of
a function prototyped in c-family/c-common.h and implemented in
*-lang.cc for each frontend.
  - Also dropped the callbacks as suggested, we now compute whether
features/extensions are available when __has_feature is first invoked,
and only add available features to the hash table (storing a boolean
to indicate whether a given identifier names a feature or an extension).
  - Added many comments to top-level definitions.
  - Generally polished and tidied up a bit.

As of this writing, there are still a couple of unresolved issues
around cxx_binary_literals and TLS, see:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/626058.html


I think what you have for these makes sense.


Bootstrapped/regtested on aarch64-linux-gnu and x86_64-apple-darwin.
How does this version look?



diff --git a/gcc/doc/cpp.texi b/gcc/doc/cpp.texi
index 3f492b33470..76dbb9892d6 100644
--- a/gcc/doc/cpp.texi
+++ b/gcc/doc/cpp.texi
@@ -3199,6 +3199,8 @@ directive}: @samp{#if}, @samp{#ifdef} or @samp{#ifndef}.
 * @code{__has_cpp_attribute}::
 * @code{__has_c_attribute}::
 * @code{__has_builtin}::
+* @code{__has_feature}::
+* @code{__has_extension}::
 * @code{__has_include}::
 @end menu
 
@@ -3561,6 +3563,33 @@ the operator is as follows:

 #endif
 @end smallexample
 
+@node @code{__has_feature}

+@subsection @code{__has_feature}
+@cindex @code{__has_feature}
+
+The special operator @code{__has_feature (@var{operand})} may be used in
+constant integer contexts and in preprocessor @samp{#if} and @samp{#elif}
+expressions to test whether the identifier given in @var{operand} is recognized
+as a feature supported by GCC given the current options and, in the case of
+standard language features, whether the feature is available in the chosen
+version of the language standard.
+
+@node @code{__has_extension}
+@subsection @code{__has_extension}
+@cindex @code{__has_extension}
+
+The special operator @code{__has_extension (@var{operand})} may be used in
+constant integer contexts and in preprocessor @samp{#if} and @samp{#elif}
+expressions to test whether the identifier given in @var{operand} is recognized
+as an extension supported by GCC given the current options.  In any given
+context, the features accepted by @code{__has_extension} are a strict superset
+of those accepted by @code{__has_feature}.  Unlike @code{__has_feature},
+@code{__has_extension} tests whether a given feature is available regardless of
+strict language standards conformance.
+
+If the @code{-pedantic-errors} flag is given, @code{__has_extension} is
+equivalent to @code{__has_feature}.
+


I think we need some documentation of what identifiers someone might 
specify.  It might be best to say that these are not recommended for new 
code, just provided for Clang compatibility, and point to their 
documentation.  I notice that we already refer to Clang docs for UBSan.
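
For illustration only, this is roughly how such code tends to look in
practice.  The fallback definitions follow the idiom from Clang's
documentation, and cxx_rvalue_references is an identifier taken from that
documentation; whether a given GCC version recognizes it is an assumption
here, and have_rvalue_references is a made-up helper name.

```cpp
// Fallbacks for compilers that do not provide the operators at all.
#ifndef __has_feature
#  define __has_feature(x) 0
#endif
#ifndef __has_extension
#  define __has_extension(x) __has_feature(x)
#endif

// Return 1 if the compiler reports rvalue references as a feature in the
// current language mode, 0 otherwise (including when __has_feature is
// not supported).
static int have_rvalue_references ()
{
#if __has_feature(cxx_rvalue_references)
  return 1;
#else
  return 0;
#endif
}
```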


Jason



[PATCH] c++: improve class NTTP object pretty printing [PR111471]

2023-09-19 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk/13?

-- >8 --

1. Move class NTTP object pretty printing to a more general spot in
   the pretty printer.
2. Print the type of a class NTTP object alongside its CONSTRUCTOR
   value, like dump_expr would have done.
3. Don't pretty print const VIEW_CONVERT_EXPR wrapping class NTTPs.

PR c++/111471

gcc/cp/ChangeLog:

* cxx-pretty-print.cc (cxx_pretty_printer::expression)
<case VAR_DECL>: Print the value of a class NTTP object.
<case VIEW_CONVERT_EXPR>: Strip const VIEW_CONVERT_EXPR
wrappers for class NTTPs.
(pp_cxx_template_argument_list): Don't handle the class
NTTP objects here.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/diagnostic19.C: New test.
---
 gcc/cp/cxx-pretty-print.cc   | 19 +--
 gcc/testsuite/g++.dg/concepts/diagnostic19.C | 20 
 2 files changed, 37 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/concepts/diagnostic19.C

diff --git a/gcc/cp/cxx-pretty-print.cc b/gcc/cp/cxx-pretty-print.cc
index 909a9dc917f..7cd43151592 100644
--- a/gcc/cp/cxx-pretty-print.cc
+++ b/gcc/cp/cxx-pretty-print.cc
@@ -1121,6 +1121,15 @@ cxx_pretty_printer::expression (tree t)
   t = OVL_FIRST (t);
   /* FALLTHRU */
 case VAR_DECL:
+  if (DECL_NTTP_OBJECT_P (t))
+   {
+ /* Print the type followed by the CONSTRUCTOR value of an
+NTTP object.  */
+ simple_type_specifier (cv_unqualified (TREE_TYPE (t)));
+ expression (DECL_INITIAL (t));
+ break;
+   }
+  /* FALLTHRU */
 case PARM_DECL:
 case FIELD_DECL:
 case CONST_DECL:
@@ -1261,6 +1270,14 @@ cxx_pretty_printer::expression (tree t)
   pp_cxx_right_paren (this);
   break;
 
+case VIEW_CONVERT_EXPR:
+  if (TREE_CODE (TREE_OPERAND (t, 0)) == TEMPLATE_PARM_INDEX)
+   {
+ /* Strip const VIEW_CONVERT_EXPR wrappers for class NTTPs.  */
+ expression (TREE_OPERAND (t, 0));
+ break;
+   }
+  /* FALLTHRU */
 default:
   c_pretty_printer::expression (t);
   break;
@@ -1966,8 +1983,6 @@ pp_cxx_template_argument_list (cxx_pretty_printer *pp, 
tree t)
  if (TYPE_P (arg) || (TREE_CODE (arg) == TEMPLATE_DECL
   && TYPE_P (DECL_TEMPLATE_RESULT (arg
pp->type_id (arg);
- else if (VAR_P (arg) && DECL_NTTP_OBJECT_P (arg))
-   pp->expression (DECL_INITIAL (arg));
  else
pp->expression (arg);
}
diff --git a/gcc/testsuite/g++.dg/concepts/diagnostic19.C 
b/gcc/testsuite/g++.dg/concepts/diagnostic19.C
new file mode 100644
index 000..20cdb63380b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/diagnostic19.C
@@ -0,0 +1,20 @@
+// Verify our pretty printing of class NTTP objects.
+// PR c++/111471
+// { dg-do compile { target c++20 } }
+
+struct A { bool value; };
+
+template
+  requires (V.value) // { dg-message {'\(V\).value \[with V = A\{false\}\]'} }
+void f();
+
+template struct B { static constexpr auto value = V.value; };
+
+template
+  requires T::value // { dg-message {'T::value \[with T = B\]'} }
+void g();
+
+int main() {
+  f(); // { dg-error "no match" }
+  g>(); // { dg-error "no match" } 
+}
-- 
2.42.0.216.gbda494f404



Re: [PATCH] match.pd: Some build_nonstandard_integer_type tweaks

2023-09-19 Thread Richard Sandiford
Jakub Jelinek via Gcc-patches  writes:
> Hi!
>
> As discussed earlier, using build_nonstandard_integer_type blindly for all
> INTEGRAL_TYPE_Ps is problematic now that we have BITINT_TYPE, because it
> always creates an INTEGRAL_TYPE with some possibly very large precision.
> The following patch attempts to deal with 3 such spots in match.pd, others
> still need looking at.
>
> In the first case, I think it is quite expensive/undesirable to create
> a non-standard INTEGER_TYPE with possibly huge precision and then
> immediately just see type_has_mode_precision_p being false for it, or even
> worse introducing a cast to TImode or OImode or XImode INTEGER_TYPE which
> nothing will be able to actually handle.  128-bit or 64-bit (on 32-bit
> targets) types are the largest supported by the backend, so the following
> patch avoids creating and matching conversions to larger types, it is
> an optimization anyway and so should be used when it is cheap that way.
>
> In the second hunk, I believe the uses of build_nonstandard_integer_type
> aren't useful at all.  It is when matching a ? -1 : 0 and trying to express
> it as say -(type) (bool) a etc., but this is all GIMPLE only, where most of
> integral types with same precision/signedness are compatible and we know
> -1 is representable in that type, so I really don't see any reason not to
> perform the negation of a [0, 1] valued expression in type, rather
> than doing it in
> build_nonstandard_integer_type (TYPE_PRECISION (type), TYPE_UNSIGNED (type))
> (except that it breaks the BITINT_TYPEs).  I don't think we need to do
> something like range_check_type.
> While in there, I've also noticed it was using a (with {
> tree booltrue = constant_boolean_node (true, boolean_type_node);
> } and removed that + replaced uses of booltrue with boolean_true_node
> which the above function always returns.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2023-09-19  Jakub Jelinek  
>
>   * match.pd ((x << c) >> c): Don't call build_nonstandard_integer_type
>   nor check type_has_mode_precision_p for width larger than [TD]Imode
>   precision.
>   (a ? CST1 : CST2): Don't use build_nonstandard_type, just convert
>   to type.  Use boolean_true_node instead of
>   constant_boolean_node (true, boolean_type_node).  Formatting fixes.
>
> --- gcc/match.pd.jj   2023-09-18 10:37:56.002965361 +0200
> +++ gcc/match.pd  2023-09-18 12:14:32.321631010 +0200
> @@ -4114,9 +4114,13 @@ (define_operator_list SYNC_FETCH_AND_AND
> (if (INTEGRAL_TYPE_P (type))
>  (with {
>int width = element_precision (type) - tree_to_uhwi (@1);
> -  tree stype = build_nonstandard_integer_type (width, 0);
> +  tree stype = NULL_TREE;
> +  scalar_int_mode mode = (targetm.scalar_mode_supported_p (TImode)
> +   ? TImode : DImode);
> +  if (width <= GET_MODE_PRECISION (mode))
> + stype = build_nonstandard_integer_type (width, 0);

How about using MAX_FIXED_MODE_SIZE for things like this?
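
For readers following along, the first hunk concerns rewriting
(x << c) >> c on a signed type as a truncation to
width = precision - c bits followed by sign extension.  A standalone check
of that identity for a 32-bit type (plain C++ with invented function names,
not GCC code; it assumes two's-complement conversions and an arithmetic
right shift, which C++20 guarantees and mainstream compilers provide
earlier):

```cpp
#include <cstdint>

// (x << c) >> c with the left shift done in unsigned arithmetic to avoid
// signed overflow, then an arithmetic right shift.
static std::int32_t shift_pair (std::int32_t x, unsigned c)
{
  std::uint32_t shifted = static_cast<std::uint32_t> (x) << c;
  return static_cast<std::int32_t> (shifted) >> c;
}

// The same value via a narrowing conversion followed by sign extension,
// for the two widths that have a fixed-width integer type here.
static std::int32_t narrow_then_widen (std::int32_t x, unsigned c)
{
  switch (32 - c)
    {
    case 8:
      return static_cast<std::int8_t> (x);
    case 16:
      return static_cast<std::int16_t> (x);
    default:
      // No matching narrow type in this sketch; keep the shifts.
      return shift_pair (x, c);
    }
}
```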

Thanks,
Richard

>   }
> - (if (width == 1 || type_has_mode_precision_p (stype))
> + (if (stype && (width == 1 || type_has_mode_precision_p (stype)))
>(convert (convert:stype @0
>  
>  /* Optimize x >> x into 0 */
> @@ -5092,49 +5096,24 @@ (define_operator_list SYNC_FETCH_AND_AND
>  /* a ? -1 : 0 -> -a.  No need to check the TYPE_PRECISION not being 1
> here as the powerof2cst case above will handle that case correctly.  
> */
>  (if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
> + (negate (convert:type (convert:boolean_type_node @0))
> +  (if (integer_zerop (@1))
> +   (switch
> +/* a ? 0 : 1 -> !a. */
> +(if (integer_onep (@2))
> + (convert (bit_xor (convert:boolean_type_node @0) { boolean_true_node; 
> })))
> +/* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
> +(if (INTEGRAL_TYPE_P (type) && integer_pow2p (@2))
>   (with {
> -   auto prec = TYPE_PRECISION (type);
> -   auto unsign = TYPE_UNSIGNED (type);
> -   tree inttype = build_nonstandard_integer_type (prec, unsign);
> +   tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
>}
> -  (convert (negate (convert:inttype (convert:boolean_type_node @0
> -  (if (integer_zerop (@1))
> -   (with {
> -  tree booltrue = constant_boolean_node (true, boolean_type_node);
> -}
> -(switch
> - /* a ? 0 : 1 -> !a. */
> - (if (integer_onep (@2))
> -  (convert (bit_xor (convert:boolean_type_node @0) { booltrue; } )))
> - /* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
> - (if (INTEGRAL_TYPE_P (type) &&  integer_pow2p (@2))
> -  (with {
> - tree shift = build_int_cst (integer_type_node, tree_log2 (@2));
> -   }
> -   (lshift (convert (bit_xor (convert:boolean_type_node @0) { booltrue; 
> } ))
> -{ shift; })))
> - /* a ? -1 : 0 -> -(!a).  No need to check the TYPE_PRECISION not being 1

Re: [PATCH v14 16/40] c, c++: Use 16 bits for all use of enum rid for more keyword space

2023-09-19 Thread Jason Merrill

On 9/15/23 19:51, Ken Matsui via Gcc-patches wrote:

Now that RID_MAX has reached 255, we need to update the bit sizes of every
use of the enum rid from 8 to 16 to support more keywords.


Sorry to bring this up so late, but this does raise the question of 
whether we actually want to use keyword space for all these traits that 
will probably be used approximately once in a C++ translation unit.  I 
wonder if it would make sense to instead use e.g. RID_TRAIT for all of 
them and use gperf to look up the specific trait from the identifier?


Jason



Re: [PATCH] c++: improve class NTTP object pretty printing [PR111471]

2023-09-19 Thread Jason Merrill

On 9/19/23 12:40, Patrick Palka wrote:

Tested on x86_64-pc-linux-gnu, does this look OK for trunk/13?


OK for trunk.  What's your argument for backporting?


-- >8 --

1. Move class NTTP object pretty printing to a more general spot in
the pretty printer.
2. Print the type of a class NTTP object alongside its CONSTRUCTOR
value, like dump_expr would have done.
3. Don't pretty print const VIEW_CONVERT_EXPR wrapping class NTTPs.

PR c++/111471

gcc/cp/ChangeLog:

* cxx-pretty-print.cc (cxx_pretty_printer::expression)
<case VAR_DECL>: Print the value of a class NTTP object.
<case VIEW_CONVERT_EXPR>: Strip const VIEW_CONVERT_EXPR
wrappers for class NTTPs.
(pp_cxx_template_argument_list): Don't handle the class
NTTP objects here.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/diagnostic19.C: New test.
---
  gcc/cp/cxx-pretty-print.cc   | 19 +--
  gcc/testsuite/g++.dg/concepts/diagnostic19.C | 20 
  2 files changed, 37 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/concepts/diagnostic19.C

diff --git a/gcc/cp/cxx-pretty-print.cc b/gcc/cp/cxx-pretty-print.cc
index 909a9dc917f..7cd43151592 100644
--- a/gcc/cp/cxx-pretty-print.cc
+++ b/gcc/cp/cxx-pretty-print.cc
@@ -1121,6 +1121,15 @@ cxx_pretty_printer::expression (tree t)
t = OVL_FIRST (t);
/* FALLTHRU */
  case VAR_DECL:
+  if (DECL_NTTP_OBJECT_P (t))
+   {
+ /* Print the type followed by the CONSTRUCTOR value of an
+NTTP object.  */
+ simple_type_specifier (cv_unqualified (TREE_TYPE (t)));
+ expression (DECL_INITIAL (t));
+ break;
+   }
+  /* FALLTHRU */
  case PARM_DECL:
  case FIELD_DECL:
  case CONST_DECL:
@@ -1261,6 +1270,14 @@ cxx_pretty_printer::expression (tree t)
pp_cxx_right_paren (this);
break;
  
+case VIEW_CONVERT_EXPR:

+  if (TREE_CODE (TREE_OPERAND (t, 0)) == TEMPLATE_PARM_INDEX)
+   {
+ /* Strip const VIEW_CONVERT_EXPR wrappers for class NTTPs.  */
+ expression (TREE_OPERAND (t, 0));
+ break;
+   }
+  /* FALLTHRU */
  default:
c_pretty_printer::expression (t);
break;
@@ -1966,8 +1983,6 @@ pp_cxx_template_argument_list (cxx_pretty_printer *pp, 
tree t)
  if (TYPE_P (arg) || (TREE_CODE (arg) == TEMPLATE_DECL
   && TYPE_P (DECL_TEMPLATE_RESULT (arg
pp->type_id (arg);
- else if (VAR_P (arg) && DECL_NTTP_OBJECT_P (arg))
-   pp->expression (DECL_INITIAL (arg));
  else
pp->expression (arg);
}
diff --git a/gcc/testsuite/g++.dg/concepts/diagnostic19.C 
b/gcc/testsuite/g++.dg/concepts/diagnostic19.C
new file mode 100644
index 000..20cdb63380b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/diagnostic19.C
@@ -0,0 +1,20 @@
+// Verify our pretty printing of class NTTP objects.
+// PR c++/111471
+// { dg-do compile { target c++20 } }
+
+struct A { bool value; };
+
+template
+  requires (V.value) // { dg-message {'\(V\).value \[with V = A\{false\}\]'} }
+void f();
+
+template struct B { static constexpr auto value = V.value; };
+
+template
+  requires T::value // { dg-message {'T::value \[with T = B\]'} }
+void g();
+
+int main() {
+  f(); // { dg-error "no match" }
+  g>(); // { dg-error "no match" }
+}




Re: [PATCH] libcpp: Fix ICE on #include after a line marker directive [PR61474]

2023-09-19 Thread Richard Sandiford
Lewis Hyatt via Gcc-patches  writes:
> Hello-
>
> This fixes an old PR, bootstrap + regtest on x86-64 Linux. Please let me know 
> if it's ok? Thanks!
>
> -Lewis
>
> -- >8 --
>
> As noted in the PR, GCC will segfault if a file name is first seen in a
> linemarker directive, and then later seen in a normal #include.  This is
> because the fake include process adds the file to the cache with a null PATH
> member. The normal #include finds this file in the cache and then attempts
> to use the null PATH.  Resolve by adding the file to the cache with a unique
> starting directory, so that the fake entry will only be found by a
> subsequent fake include, not by a real one.
>
> libcpp/ChangeLog:
>
>   PR preprocessor/61474
>   * files.cc (_cpp_find_file): Set DONT_READ to TRUE for fake
>   include files.
>   (_cpp_fake_include): Pass a unique cpp_dir* address so
>   the fake file will not be found when looked up for real.
>
> gcc/testsuite/ChangeLog:
>
>   PR preprocessor/61474
>   * c-c++-common/cpp/pr61474-2.h: New test.
>   * c-c++-common/cpp/pr61474.c: New test.
>   * c-c++-common/cpp/pr61474.h: New test.

Neat fix!  I don't know this code very well, but I agree it looks
correct.  OK if no-one objects in 24 hours.

Thanks,
Richard

> ---
>  libcpp/files.cc| 11 +--
>  gcc/testsuite/c-c++-common/cpp/pr61474-2.h |  1 +
>  gcc/testsuite/c-c++-common/cpp/pr61474.c   |  5 +
>  gcc/testsuite/c-c++-common/cpp/pr61474.h   |  6 ++
>  4 files changed, 21 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr61474-2.h
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr61474.c
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr61474.h
>
> diff --git a/libcpp/files.cc b/libcpp/files.cc
> index 43a8894b7de..27301d79fa4 100644
> --- a/libcpp/files.cc
> +++ b/libcpp/files.cc
> @@ -541,7 +541,9 @@ _cpp_find_file (cpp_reader *pfile, const char *fname, 
> cpp_dir *start_dir,
>  = (kind == _cpp_FFK_PRE_INCLUDE
> || (pfile->buffer && pfile->buffer->file->implicit_preinclude));
>  
> -  if (kind != _cpp_FFK_FAKE)
> +  if (kind == _cpp_FFK_FAKE)
> +file->dont_read = true;
> +  else
>  /* Try each path in the include chain.  */
>  for (;;)
>{
> @@ -1490,7 +1492,12 @@ cpp_clear_file_cache (cpp_reader *pfile)
>  void
>  _cpp_fake_include (cpp_reader *pfile, const char *fname)
>  {
> -  _cpp_find_file (pfile, fname, pfile->buffer->file->dir, 0, _cpp_FFK_FAKE, 
> 0);
> +  /* It does not matter what are the contents of fake_source_dir, it will 
> never
> + be inspected; we just use its address to uniquely signify that this file
> + was added as a fake include, so a later call to _cpp_find_file (to 
> include
> + the file for real) won't find the fake one in the hash table.  */
> +  static cpp_dir fake_source_dir;
> +  _cpp_find_file (pfile, fname, &fake_source_dir, 0, _cpp_FFK_FAKE, 0);
>  }
>  
>  /* Not everyone who wants to set system-header-ness on a buffer can
> diff --git a/gcc/testsuite/c-c++-common/cpp/pr61474-2.h 
> b/gcc/testsuite/c-c++-common/cpp/pr61474-2.h
> new file mode 100644
> index 000..6f70f09beec
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/pr61474-2.h
> @@ -0,0 +1 @@
> +#pragma once
> diff --git a/gcc/testsuite/c-c++-common/cpp/pr61474.c 
> b/gcc/testsuite/c-c++-common/cpp/pr61474.c
> new file mode 100644
> index 000..f835a40fc7a
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/pr61474.c
> @@ -0,0 +1,5 @@
> +/* { dg-do preprocess } */
> +#include "pr61474.h"
> +/* Make sure that the file can be included for real, after it was
> +   fake-included by the linemarker directives in pr61474.h.  */
> +#include "pr61474-2.h"
> diff --git a/gcc/testsuite/c-c++-common/cpp/pr61474.h 
> b/gcc/testsuite/c-c++-common/cpp/pr61474.h
> new file mode 100644
> index 000..d9e8c3a1fec
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/pr61474.h
> @@ -0,0 +1,6 @@
> +/* Create a fake include for pr61474-2.h and exercise looking it up.  */
> +/* Use #pragma once to check also that the fake-include entry in the file
> +   cache does not cause a problem in libcpp/files.cc:has_unique_contents().  
> */
> +#pragma once
> +# 1 "pr61474-2.h" 1
> +# 2 "pr61474-2.h" 1
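
[Editorial sketch] The idea described in the commit message can be illustrated outside libcpp. The code below is a minimal model, not the actual files.cc implementation: `Dir`, `Entry`, and `FileCache` are hypothetical stand-ins for `cpp_dir`, `_cpp_file`, and the file hash table. It assumes the cache is keyed by file name plus starting directory, and shows how a static sentinel whose address is unique keeps fake entries invisible to real lookups.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for libcpp's cpp_dir and _cpp_file.
struct Dir { std::string name; };
struct Entry { bool dont_read; };

class FileCache {
public:
  // Only the address of this object matters; its contents are never read.
  static const Dir *fake_dir() {
    static const Dir sentinel{};
    return &sentinel;
  }

  // Record a file that was only named in a linemarker directive.
  void add_fake(const std::string &fname) {
    entries_[fname][fake_dir()] = Entry{true};
  }

  // Return the entry cached for (fname, start_dir), or nullptr if none.
  const Entry *lookup(const std::string &fname, const Dir *start_dir) const {
    auto by_name = entries_.find(fname);
    if (by_name == entries_.end())
      return nullptr;
    auto by_dir = by_name->second.find(start_dir);
    return by_dir == by_name->second.end() ? nullptr : &by_dir->second;
  }

private:
  // File name -> starting directory -> cached entry.
  std::map<std::string, std::map<const Dir *, Entry>> entries_;
};

// Usage: a real #include starting from src_dir does not see the fake entry.
inline void check_fake_entry_is_hidden() {
  FileCache cache;
  Dir src_dir{"src"};
  cache.add_fake("pr61474-2.h");
  assert(cache.lookup("pr61474-2.h", &src_dir) == nullptr);
}
```

Because no real include chain can ever produce the sentinel's address, a later real lookup misses the fake entry and falls through to the normal search path.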


Re: [PATCH] libcpp: Fix ICE on #include after a line marker directive [PR61474]

2023-09-19 Thread Marek Polacek
On Tue, Sep 19, 2023 at 06:08:50PM +0100, Richard Sandiford wrote:
> Lewis Hyatt via Gcc-patches writes:
> > Hello-
> >
> > This fixes an old PR, bootstrap + regtest on x86-64 Linux. Please let me 
> > know if it's ok? Thanks!
> >
> > -Lewis
> >
> > -- >8 --
> >
> > As noted in the PR, GCC will segfault if a file name is first seen in a
> > linemarker directive, and then later seen in a normal #include.  This is
> > because the fake include process adds the file to the cache with a null PATH
> > member. The normal #include finds this file in the cache and then attempts
> > to use the null PATH.  Resolve by adding the file to the cache with a unique
> > starting directory, so that the fake entry will only be found by a
> > subsequent fake include, not by a real one.
> >
> > libcpp/ChangeLog:
> >
> > PR preprocessor/61474
> > * files.cc (_cpp_find_file): Set DONT_READ to TRUE for fake
> > include files.
> > (_cpp_fake_include): Pass a unique cpp_dir* address so
> > the fake file will not be found when looked up for real.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR preprocessor/61474
> > * c-c++-common/cpp/pr61474-2.h: New test.
> > * c-c++-common/cpp/pr61474.c: New test.
> > * c-c++-common/cpp/pr61474.h: New test.
> 
> Neat fix!  I don't know this code very well, but I agree it looks
> correct.  OK if no-one objects in 24 hours.

Looks fine to me too, thanks Lewis.
 
> Thanks,
> Richard
> 
> > ---
> >  libcpp/files.cc| 11 +--
> >  gcc/testsuite/c-c++-common/cpp/pr61474-2.h |  1 +
> >  gcc/testsuite/c-c++-common/cpp/pr61474.c   |  5 +
> >  gcc/testsuite/c-c++-common/cpp/pr61474.h   |  6 ++
> >  4 files changed, 21 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr61474-2.h
> >  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr61474.c
> >  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr61474.h
> >
> > diff --git a/libcpp/files.cc b/libcpp/files.cc
> > index 43a8894b7de..27301d79fa4 100644
> > --- a/libcpp/files.cc
> > +++ b/libcpp/files.cc
> > @@ -541,7 +541,9 @@ _cpp_find_file (cpp_reader *pfile, const char *fname, 
> > cpp_dir *start_dir,
> >  = (kind == _cpp_FFK_PRE_INCLUDE
> > || (pfile->buffer && pfile->buffer->file->implicit_preinclude));
> >  
> > -  if (kind != _cpp_FFK_FAKE)
> > +  if (kind == _cpp_FFK_FAKE)
> > +file->dont_read = true;
> > +  else
> >  /* Try each path in the include chain.  */
> >  for (;;)
> >{
> > @@ -1490,7 +1492,12 @@ cpp_clear_file_cache (cpp_reader *pfile)
> >  void
> >  _cpp_fake_include (cpp_reader *pfile, const char *fname)
> >  {
> > -  _cpp_find_file (pfile, fname, pfile->buffer->file->dir, 0, 
> > _cpp_FFK_FAKE, 0);
> > +  /* It does not matter what are the contents of fake_source_dir, it will 
> > never
> > + be inspected; we just use its address to uniquely signify that this 
> > file
> > + was added as a fake include, so a later call to _cpp_find_file (to 
> > include
> > + the file for real) won't find the fake one in the hash table.  */
> > +  static cpp_dir fake_source_dir;
> > +  _cpp_find_file (pfile, fname, &fake_source_dir, 0, _cpp_FFK_FAKE, 0);
> >  }
> >  
> >  /* Not everyone who wants to set system-header-ness on a buffer can
> > diff --git a/gcc/testsuite/c-c++-common/cpp/pr61474-2.h 
> > b/gcc/testsuite/c-c++-common/cpp/pr61474-2.h
> > new file mode 100644
> > index 000..6f70f09beec
> > --- /dev/null
> > +++ b/gcc/testsuite/c-c++-common/cpp/pr61474-2.h
> > @@ -0,0 +1 @@
> > +#pragma once
> > diff --git a/gcc/testsuite/c-c++-common/cpp/pr61474.c 
> > b/gcc/testsuite/c-c++-common/cpp/pr61474.c
> > new file mode 100644
> > index 000..f835a40fc7a
> > --- /dev/null
> > +++ b/gcc/testsuite/c-c++-common/cpp/pr61474.c
> > @@ -0,0 +1,5 @@
> > +/* { dg-do preprocess } */
> > +#include "pr61474.h"
> > +/* Make sure that the file can be included for real, after it was
> > +   fake-included by the linemarker directives in pr61474.h.  */
> > +#include "pr61474-2.h"
> > diff --git a/gcc/testsuite/c-c++-common/cpp/pr61474.h 
> > b/gcc/testsuite/c-c++-common/cpp/pr61474.h
> > new file mode 100644
> > index 000..d9e8c3a1fec
> > --- /dev/null
> > +++ b/gcc/testsuite/c-c++-common/cpp/pr61474.h
> > @@ -0,0 +1,6 @@
> > +/* Create a fake include for pr61474-2.h and exercise looking it up.  */
> > +/* Use #pragma once to check also that the fake-include entry in the file
> > +   cache does not cause a problem in 
> > libcpp/files.cc:has_unique_contents().  */
> > +#pragma once
> > +# 1 "pr61474-2.h" 1
> > +# 2 "pr61474-2.h" 1
> 

Marek



Re: [PATCH] RISC-V: Finish Typing Un-Typed Instructions and Turn on Assert

2023-09-19 Thread Jeff Law




On 9/18/23 01:29, Lehua Ding wrote:

Hi Jeff,

Thank you for your comments, I'm not fully convinced yet but I will 
follow the current rules. I won't dwell on this for now.

Sounds reasonable.  Of all the things we need to do, this isn't that big :-)

jeff


[committed] Fix bogus operand predicate on iq2000

2023-09-19 Thread Jeff Law

The iq2000-elf port regressed these tests recently:


iq2000-sim: gcc.c-torture/execute/20040703-1.c   -O2  (test for excess errors)
iq2000-sim: gcc.c-torture/execute/20040703-1.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  (test for excess errors)
iq2000-sim: gcc.c-torture/execute/20040703-1.c   -O3 -g  (test for excess 
errors)




It turns out one of the patterns had an operand predicate that allowed
REG, SUBREG, and CONST_INT (with a limited set of CONST_INTs), yet the
constraint only allowed the limited set of immediates.  A register
operand could therefore pass the predicate but never satisfy the
constraint, which naturally triggered an LRA constraint failure.


The fix is trivial: create an operand predicate that accurately reflects
the kinds of operands allowed by the instruction.


It turns out this was a long-standing bug -- fixing the pattern resolved
55 failing tests in the testsuite.


Pushed to the trunk,
Jeff
commit eec7c373c2de6d5806537552de5f5b2bd064c43e
Author: Jeff Law 
Date:   Tue Sep 19 11:28:53 2023 -0600

Fix bogus operand predicate on iq2000

The iq2000-elf port regressed these tests recently:

> iq2000-sim: gcc.c-torture/execute/20040703-1.c   -O2  (test for excess 
errors)
> iq2000-sim: gcc.c-torture/execute/20040703-1.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> iq2000-sim: gcc.c-torture/execute/20040703-1.c   -O3 -g  (test for excess 
errors)

It turns out one of the patterns had an operand predicate that allowed REG,
SUBREG, CONST_INT (with a limited set of CONST_INTs).  Yet the constraint 
only
allowed the limited set of immediates.  This naturally triggered an LRA
constraint failure.

The fix is trivial, create an operand predicate that accurately reflects the
kinds of operands allowed by the instruction.

It turns out this was a long standing bug -- fixing the pattern resolved 55
failing tests in the testsuite.

gcc/
* config/iq2000/predicates.md (uns_arith_constant): New predicate.
* config/iq2000/iq2000.md (rotrsi3): Use it.

diff --git a/gcc/config/iq2000/iq2000.md b/gcc/config/iq2000/iq2000.md
index aaeda39ae99..f157a82ebc0 100644
--- a/gcc/config/iq2000/iq2000.md
+++ b/gcc/config/iq2000/iq2000.md
@@ -988,7 +988,7 @@ (define_insn "lshrsi3_internal1"
 (define_insn "rotrsi3"
   [(set (match_operand:SI 0 "register_operand" "=r")
 (rotatert:SI (match_operand:SI 1 "register_operand" "r")
- (match_operand:SI 2 "uns_arith_operand" "O")))]
+ (match_operand:SI 2 "uns_arith_constant" "O")))]
   ""
   "ram %0,%1,%2,0x0,0x0"
   [(set_attr "type" "arith")])
diff --git a/gcc/config/iq2000/predicates.md b/gcc/config/iq2000/predicates.md
index 1330f7d613c..38857e17c24 100644
--- a/gcc/config/iq2000/predicates.md
+++ b/gcc/config/iq2000/predicates.md
@@ -17,6 +17,15 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; <http://www.gnu.org/licenses/>.
 
+;; Return 1 if OP can be used as an operand where a 16-bit
+;; unsigned integer is needed.
+
+(define_predicate "uns_arith_constant"
+  (match_code "const_int")
+{
+  return SMALL_INT_UNSIGNED (op);
+})
+
 ;; Return 1 if OP can be used as an operand where a register or 16-bit
 ;; unsigned integer is needed.
 


Re: [PATCH] c++: improve class NTTP object pretty printing [PR111471]

2023-09-19 Thread Patrick Palka
On Tue, 19 Sep 2023, Jason Merrill wrote:

> On 9/19/23 12:40, Patrick Palka wrote:
> > Tested on x86_64-pc-linux-gnu, does this look OK for trunk/13?
> 
> OK for trunk.  What's your argument for backporting?

Thanks.  I don't feel strongly about it, but I was thinking that since
we typically backport C++20-only correctness fixes to the most recent
release branch, C++20-only diagnostic improvements might be suitable
too?

> 
> > -- >8 --
> > 
> > 1. Move class NTTP object pretty printing to a more general spot in
> > the pretty printer.
> > > 2. Print the type of a class NTTP object alongside its CONSTRUCTOR
> > value, like dump_expr would have done.
> > 3. Don't pretty print const VIEW_CONVERT_EXPR wrapping class NTTPs.
> > 
> > PR c++/111471
> > 
> > gcc/cp/ChangeLog:
> > 
> > * cxx-pretty-print.cc (cxx_pretty_printer::expression)
> > > <case VAR_DECL>: Print the value of a class NTTP object.
> > > <case VIEW_CONVERT_EXPR>: Strip const VIEW_CONVERT_EXPR
> > wrappers for class NTTPs.
> > (pp_cxx_template_argument_list): Don't handle the class
> > NTTP objects here.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/concepts/diagnostic19.C: New test.
> > ---
> >   gcc/cp/cxx-pretty-print.cc   | 19 +--
> >   gcc/testsuite/g++.dg/concepts/diagnostic19.C | 20 
> >   2 files changed, 37 insertions(+), 2 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/concepts/diagnostic19.C
> > 
> > diff --git a/gcc/cp/cxx-pretty-print.cc b/gcc/cp/cxx-pretty-print.cc
> > index 909a9dc917f..7cd43151592 100644
> > --- a/gcc/cp/cxx-pretty-print.cc
> > +++ b/gcc/cp/cxx-pretty-print.cc
> > @@ -1121,6 +1121,15 @@ cxx_pretty_printer::expression (tree t)
> > t = OVL_FIRST (t);
> > /* FALLTHRU */
> >   case VAR_DECL:
> > +  if (DECL_NTTP_OBJECT_P (t))
> > +   {
> > + /* Print the type followed by the CONSTRUCTOR value of an
> > +NTTP object.  */
> > + simple_type_specifier (cv_unqualified (TREE_TYPE (t)));
> > + expression (DECL_INITIAL (t));
> > + break;
> > +   }
> > +  /* FALLTHRU */
> >   case PARM_DECL:
> >   case FIELD_DECL:
> >   case CONST_DECL:
> > @@ -1261,6 +1270,14 @@ cxx_pretty_printer::expression (tree t)
> > pp_cxx_right_paren (this);
> > break;
> >   +case VIEW_CONVERT_EXPR:
> > +  if (TREE_CODE (TREE_OPERAND (t, 0)) == TEMPLATE_PARM_INDEX)
> > +   {
> > + /* Strip const VIEW_CONVERT_EXPR wrappers for class NTTPs.  */
> > + expression (TREE_OPERAND (t, 0));
> > + break;
> > +   }
> > +  /* FALLTHRU */
> >   default:
> > c_pretty_printer::expression (t);
> > break;
> > @@ -1966,8 +1983,6 @@ pp_cxx_template_argument_list (cxx_pretty_printer *pp,
> > tree t)
> >   if (TYPE_P (arg) || (TREE_CODE (arg) == TEMPLATE_DECL
> >&& TYPE_P (DECL_TEMPLATE_RESULT (arg
> > pp->type_id (arg);
> > - else if (VAR_P (arg) && DECL_NTTP_OBJECT_P (arg))
> > -   pp->expression (DECL_INITIAL (arg));
> >   else
> > pp->expression (arg);
> > }
> > diff --git a/gcc/testsuite/g++.dg/concepts/diagnostic19.C
> > b/gcc/testsuite/g++.dg/concepts/diagnostic19.C
> > new file mode 100644
> > index 000..20cdb63380b
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/concepts/diagnostic19.C
> > @@ -0,0 +1,20 @@
> > +// Verify our pretty printing of class NTTP objects.
> > +// PR c++/111471
> > +// { dg-do compile { target c++20 } }
> > +
> > +struct A { bool value; };
> > +
> > > +template<A V>
> > > +  requires (V.value) // { dg-message {'\(V\).value \[with V = A\{false\}\]'} }
> > > +void f();
> > > +
> > > +template<A V> struct B { static constexpr auto value = V.value; };
> > > +
> > > +template<class T>
> > > +  requires T::value // { dg-message {'T::value \[with T = B<A\{false\}>\]'} }
> > > +void g();
> > > +
> > > +int main() {
> > > +  f<A{false}>(); // { dg-error "no match" }
> > > +  g<B<A{false}>>(); // { dg-error "no match" }
> > > +}
> 
> 



Re: [PATCH v2 2/2] c++: convert_to_void and volatile references

2023-09-19 Thread Patrick Palka
On Mon, 18 Sep 2023, Jason Merrill wrote:

> On 9/18/23 12:12, Patrick Palka wrote:
> > Jason pointed out that even implicit loads of volatile references need
> > to undergo lvalue-to-rvalue conversion, but we currently emit a warning
> > in this case and discard the load.  This patch changes this behavior so
> > that we don't issue a warning, and preserve the load.
> > 
> > gcc/cp/ChangeLog:
> > 
> > * cvt.cc (convert_to_void) : Remove warning
> > for an implicit load of a volatile reference.  Simplify as if
> > is_reference is false.  Check REFERENCE_REF_P in the test
> > guarding the -Wunused-value diagnostic.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/expr/discarded1a.C: No longer expect warning for
> > implicit load of a volatile reference.
> > * g++.old-deja/g++.bugs/900428_01.C: Likewise.
> > * g++.dg/expr/volatile2.C: New test.
> > ---
> >   gcc/cp/cvt.cc | 56 ++-
> >   gcc/testsuite/g++.dg/expr/discarded1a.C   |  1 -
> >   gcc/testsuite/g++.dg/expr/volatile2.C | 12 
> >   .../g++.old-deja/g++.bugs/900428_01.C | 26 -
> >   4 files changed, 30 insertions(+), 65 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/expr/volatile2.C
> > 
> > diff --git a/gcc/cp/cvt.cc b/gcc/cp/cvt.cc
> > index 4424670356c..1cb6c1222c2 100644
> > --- a/gcc/cp/cvt.cc
> > +++ b/gcc/cp/cvt.cc
> > @@ -1251,12 +1251,9 @@ convert_to_void (tree expr, impl_conv_void implicit,
> > tsubst_flags_t complain)
> > {
> > tree type = TREE_TYPE (expr);
> > int is_volatile = TYPE_VOLATILE (type);
> > -   if (is_volatile)
> > - complete_type (type);
> > -   int is_complete = COMPLETE_TYPE_P (type);
> > /* Can't load the value if we don't know the type.  */
> > -   if (is_volatile && !is_complete)
> > +   if (is_volatile && !COMPLETE_TYPE_P (complete_type (type)))
> > {
> >   if (complain & tf_warning)
> >   switch (implicit)
> > @@ -1298,50 +1295,7 @@ convert_to_void (tree expr, impl_conv_void implicit,
> > tsubst_flags_t complain)
> > gcc_unreachable ();
> > }
> > }
> > -   /* Don't load the value if this is an implicit dereference, or if
> > -  the type needs to be handled by ctors/dtors.  */
> > -   else if (is_volatile && is_reference)
> > -  {
> > -if (complain & tf_warning)
> > - switch (implicit)
> > -   {
> > - case ICV_CAST:
> > -   warning_at (loc, 0, "conversion to void will not access "
> > -   "object of type %qT", type);
> > -   break;
> > - case ICV_SECOND_OF_COND:
> > -   warning_at (loc, 0, "implicit dereference will not access
> > "
> > -   "object of type %qT in second operand of "
> > -   "conditional expression", type);
> > -   break;
> > - case ICV_THIRD_OF_COND:
> > -   warning_at (loc, 0, "implicit dereference will not access
> > "
> > -   "object of type %qT in third operand of "
> > -   "conditional expression", type);
> > -   break;
> > - case ICV_RIGHT_OF_COMMA:
> > -   warning_at (loc, 0, "implicit dereference will not access
> > "
> > -   "object of type %qT in right operand of "
> > -   "comma operator", type);
> > -   break;
> > - case ICV_LEFT_OF_COMMA:
> > -   warning_at (loc, 0, "implicit dereference will not access
> > "
> > -   "object of type %qT in left operand of comma "
> > -   "operator", type);
> > -   break;
> > - case ICV_STATEMENT:
> > -   warning_at (loc, 0, "implicit dereference will not access
> > "
> > -   "object of type %qT in statement",  type);
> > -break;
> > - case ICV_THIRD_IN_FOR:
> > -   warning_at (loc, 0, "implicit dereference will not access
> > "
> > -   "object of type %qT in for increment
> > expression",
> > -   type);
> > -   break;
> > - default:
> > -   gcc_unreachable ();
> > -   }
> > -  }
> > +   /* Don't load the value if the type needs to be handled by cdtors.  */
> > else if (is_volatile && TREE_ADDRESSABLE (type))
> >   {
> > if (complain & tf_warning)
> > @@ -1386,7 +1340,7 @@ convert_to_void (tree expr, impl_conv_void implicit,
> > tsubst_flags_t complain)
> > gcc_unreachable ();
> > }
> >   }
> > -   if (is_reference || !is_volatile || !is_complete || TREE_ADDRESSABLE
> > (type))
> > +   if (!is_volatile || !COMPLETE_TYPE_P (type))
> > {
> >   /* Emit 

[PATCH] RISC-V: Fix --enable-checking=rtl ICE on rv32gc bootstrap

2023-09-19 Thread Patrick O'Neill
Resolves PR 111461.

during RTL pass: expand
offtime.c: In function '__offtime':
offtime.c:79:6: internal compiler error: RTL check: expected elt 0 type 'e' or 
'u', have 'w' (rtx const_int) in riscv_legitimize_const_move, at 
config/riscv/riscv.cc:2176
   79 |   ip = __mon_yday[__isleap(y)]; 

Tested on rv32gc glibc with --enable-checking=rtl.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_legitimize_const_move): Eliminate
src_op_0 var to avoid rtl check error.

Authored-by: Juzhe Zhong 
Tested-by: Patrick O'Neill 
---
 gcc/config/riscv/riscv.cc | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 8c766e2e2be..9a1e643a6a8 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2173,16 +2173,14 @@ riscv_legitimize_const_move (machine_mode mode, rtx 
dest, rtx src)
 (const_poly_int:DI [16, 16]) // <- op_1
  ))
*/
-  rtx src_op_0 = XEXP (src, 0);
-
-  if (GET_CODE (src) == CONST && GET_CODE (src_op_0) == PLUS
-&& CONST_POLY_INT_P (XEXP (src_op_0, 1)))
+  if (GET_CODE (src) == CONST && GET_CODE (XEXP (src, 0)) == PLUS
+  && CONST_POLY_INT_P (XEXP (XEXP (src, 0), 1)))
 {
   rtx dest_tmp = gen_reg_rtx (mode);
   rtx tmp = gen_reg_rtx (mode);
 
-  riscv_emit_move (dest, XEXP (src_op_0, 0));
-  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (src_op_0, 1));
+  riscv_emit_move (dest, XEXP (XEXP (src, 0), 0));
+  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (XEXP (src, 0), 
1));
 
   emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, dest, dest_tmp)));
   return;
-- 
2.34.1
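
[Editorial sketch] The ICE comes from reading operand 0 before checking that `src` is a CONST. The model below is an illustration only: `Node` and `xexp` are hypothetical stand-ins for rtx and a checked `XEXP`, and the exception plays the role of the RTL-checking abort. It contrasts the eager read with the guarded form used in the patch.

```cpp
#include <cassert>
#include <stdexcept>

enum class Code { ConstInt, ConstPolyInt, Reg, Const, Plus };

// Hypothetical node type; only Const and Plus carry expression operands.
struct Node {
  Code code;
  const Node *op0;
  const Node *op1;
};

// Checked accessor, loosely modeled on XEXP with RTL checking enabled.
const Node &xexp(const Node &n, int i) {
  if (n.code != Code::Const && n.code != Code::Plus)
    throw std::logic_error("operand is not an expression");
  const Node *op = (i == 0) ? n.op0 : n.op1;
  if (op == nullptr)
    throw std::logic_error("missing operand");
  return *op;
}

// Eager form: reads operand 0 before knowing that src is a Const.
bool matches_eager(const Node &src) {
  const Node &op0 = xexp(src, 0);
  return src.code == Code::Const && op0.code == Code::Plus
         && xexp(op0, 1).code == Code::ConstPolyInt;
}

// Guarded form: short-circuit evaluation reaches xexp only for a Const.
bool matches_guarded(const Node &src) {
  return src.code == Code::Const && xexp(src, 0).code == Code::Plus
         && xexp(xexp(src, 0), 1).code == Code::ConstPolyInt;
}

// Usage: a plain integer constant is rejected without touching its operands.
inline void check_guarded_on_const_int() {
  const Node imm{Code::ConstInt, nullptr, nullptr};
  assert(!matches_guarded(imm));
}
```

The guarded form repeats the accessor calls, which is slightly noisier but keeps every operand access behind the code check that makes it valid.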



[PATCH] Fixes for profile count/probability maintenance

2023-09-19 Thread Eugene Rozenfeld
Verifier checks have recently been strengthened to check that
all counts and probabilities are initialized. The checks fired
during an autoprofiledbootstrap build; this patch fixes the failures.

gcc/ChangeLog:

* auto-profile.cc (afdo_calculate_branch_prob): Fix count comparisons
* ipa-utils.cc (ipa_merge_profiles): Guard against zero count when
computing probabilities
* tree-vect-loop-manip.cc (vect_do_peeling): Guard against zero count
when scaling loop profile

Tested on x86_64-pc-linux-gnu.

---
 gcc/auto-profile.cc |  4 ++--
 gcc/ipa-utils.cc| 16 +---
 gcc/tree-vect-loop-manip.cc |  2 +-
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index ff3b763945c..3e61f36c29b 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -1434,7 +1434,7 @@ afdo_calculate_branch_prob (bb_set *annotated_bb)
   else
 total_count += AFDO_EINFO (e)->get_count ();
 }
-if (num_unknown_succ == 0 && total_count > profile_count::zero ())
+if (num_unknown_succ == 0 && total_count > profile_count::zero ().afdo ())
   {
FOR_EACH_EDGE (e, ei, bb->succs)
  e->probability
@@ -1571,7 +1571,7 @@ afdo_annotate_cfg (const stmt_set &promoted_stmts)
   DECL_SOURCE_LOCATION (current_function_decl));
   afdo_source_profile->mark_annotated (cfun->function_start_locus);
   afdo_source_profile->mark_annotated (cfun->function_end_locus);
-  if (max_count > profile_count::zero ())
+  if (max_count > profile_count::zero ().afdo ())
 {
   /* Calculate, propagate count and probability information on CFG.  */
   afdo_calculate_branch_prob (&annotated_bb);
diff --git a/gcc/ipa-utils.cc b/gcc/ipa-utils.cc
index 956c6294fd7..3aaf7e595df 100644
--- a/gcc/ipa-utils.cc
+++ b/gcc/ipa-utils.cc
@@ -651,13 +651,15 @@ ipa_merge_profiles (struct cgraph_node *dst,
{
  edge srce = EDGE_SUCC (srcbb, i);
  edge dste = EDGE_SUCC (dstbb, i);
- dste->probability =
-   dste->probability * dstbb->count.ipa ().probability_in
-(dstbb->count.ipa ()
- + srccount.ipa ())
-   + srce->probability * srcbb->count.ipa ().probability_in
-(dstbb->count.ipa ()
- + srccount.ipa ());
+ profile_count total = dstbb->count.ipa () + srccount.ipa ();
+ if (total.nonzero_p ())
+   {
+ dste->probability =
+   dste->probability * dstbb->count.ipa ().probability_in
+   (total)
+   + srce->probability * srcbb->count.ipa ().probability_in
+   (total);
+   }
}
  dstbb->count = dstbb->count.ipa () + srccount.ipa ();
}
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 09641901ff1..2608c286e5d 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -3335,7 +3335,7 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, 
tree nitersm1,
  free (bbs);
  free (original_bbs);
}
- else
+ else if (old_count.nonzero_p ())
scale_loop_profile (epilog, guard_to->count.probability_in 
(old_count), -1);

  /* Only need to handle basic block before epilog loop if it's not
--
2.25.1
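
[Editorial sketch] The ipa_merge_profiles change amounts to skipping a count-weighted average when the combined count is zero. The function below is a simplified numeric model, assuming plain doubles and integers in place of GCC's profile_probability and profile_count classes; `merge_probability` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Merge two edge probabilities, weighting each by its block's count.
// When both counts are zero there is no weight to divide by, so the
// destination probability is left unchanged.
double merge_probability(double dst_prob, std::uint64_t dst_count,
                         double src_prob, std::uint64_t src_count) {
  const std::uint64_t total = dst_count + src_count;
  if (total == 0)
    return dst_prob;
  const double t = static_cast<double>(total);
  return dst_prob * (dst_count / t) + src_prob * (src_count / t);
}

// Usage: with zero counts on both sides the old probability survives.
inline void check_zero_total() {
  assert(merge_probability(0.5, 0, 0.9, 0) == 0.5);
}
```

In this model the unguarded version would divide by zero; in GCC the corresponding symptom was an uninitialized probability caught by the verifier.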


[PATCH] Remove .PHONY targets when building .fda files during autoprofiledbootstrap

2023-09-19 Thread Eugene Rozenfeld
Because a .PHONY target is always considered out of date, these targets
ran on every invocation and were breaking `make install` for an
autoprofiledbootstrap build.

gcc/ChangeLog:

* c/Make-lang.in: Make create_fdas_for_cc1 target not .PHONY
* cp/Make-lang.in: Make create_fdas_for_cc1plus target not .PHONY
* lto/Make-lang.in: Make create_fdas_for_lto1 target not .PHONY

Tested on x86_64-pc-linux-gnu.

---
 gcc/c/Make-lang.in   | 4 ++--
 gcc/cp/Make-lang.in  | 4 ++--
 gcc/lto/Make-lang.in | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in
index 79bc0dfd1cf..3ef8a674971 100644
--- a/gcc/c/Make-lang.in
+++ b/gcc/c/Make-lang.in
@@ -91,8 +91,6 @@ cc1$(exeext): $(C_OBJS) cc1-checksum.o $(BACKEND) $(LIBDEPS)
 components_in_prev = "bfd opcodes binutils fixincludes gas gcc gmp mpfr mpc 
isl gold intl ld libbacktrace libcpp libcody libdecnumber libiberty 
libiberty-linker-plugin libiconv zlib lto-plugin libctf libsframe"
 components_in_prev_target = "libstdc++-v3 libsanitizer libvtv libgcc 
libbacktrace libphobos zlib libgomp libatomic"

-.PHONY: create_fdas_for_cc1
-
 cc1.fda: create_fdas_for_cc1
$(PROFILE_MERGER) $(shell ls -ha cc1_*.fda) --output_file cc1.fda 
-gcov_version 2

@@ -116,6 +114,8 @@ create_fdas_for_cc1: ../stage1-gcc/cc1$(exeext) 
../prev-gcc/$(PERF_DATA)
$(CREATE_GCOV) -binary ../prev-gcc/cc1$(exeext) -gcov 
$$profile_name -profile $$perf_path -gcov_version 2; \
  fi; \
done;
+
+   $(STAMP) $@
 #
 # Build hooks:

diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index ba5e8766e99..2727fb7f8cc 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -189,8 +189,6 @@ cp/name-lookup.o: $(srcdir)/cp/std-name-hint.h
 components_in_prev = "bfd opcodes binutils fixincludes gas gcc gmp mpfr mpc 
isl gold intl ld libbacktrace libcpp libcody libdecnumber libiberty 
libiberty-linker-plugin libiconv zlib lto-plugin libctf libsframe"
 components_in_prev_target = "libstdc++-v3 libsanitizer libvtv libgcc 
libbacktrace libphobos zlib libgomp libatomic"

-.PHONY: create_fdas_for_cc1plus
-
 cc1plus.fda: create_fdas_for_cc1plus
$(PROFILE_MERGER) $(shell ls -ha cc1plus_*.fda) --output_file 
cc1plus.fda -gcov_version 2

@@ -214,6 +212,8 @@ create_fdas_for_cc1plus: ../stage1-gcc/cc1plus$(exeext) 
../prev-gcc/$(PERF_DATA)
$(CREATE_GCOV) -binary ../prev-gcc/cc1plus$(exeext) -gcov 
$$profile_name -profile $$perf_path -gcov_version 2; \
  fi; \
done;
+
+   $(STAMP) $@
 #
 # Build hooks:

diff --git a/gcc/lto/Make-lang.in b/gcc/lto/Make-lang.in
index 98aa9f4cc39..7dc0a9fef42 100644
--- a/gcc/lto/Make-lang.in
+++ b/gcc/lto/Make-lang.in
@@ -108,8 +108,6 @@ lto/lto-dump.o: $(LTO_OBJS)
 components_in_prev = "bfd opcodes binutils fixincludes gas gcc gmp mpfr mpc 
isl gold intl ld libbacktrace libcpp libcody libdecnumber libiberty 
libiberty-linker-plugin libiconv zlib lto-plugin libctf libsframe"
 components_in_prev_target = "libstdc++-v3 libsanitizer libvtv libgcc 
libbacktrace libphobos zlib libgomp libatomic"

-.PHONY: create_fdas_for_lto1
-
 lto1.fda: create_fdas_for_lto1
$(PROFILE_MERGER) $(shell ls -ha lto1_*.fda) --output_file lto1.fda 
-gcov_version 2

@@ -134,6 +132,8 @@ create_fdas_for_lto1: ../stage1-gcc/lto1$(exeext) 
../prev-gcc/$(PERF_DATA)
  fi; \
done;

+   $(STAMP) $@
+
 # LTO testing is done as part of C/C++/Fortran etc. testing.
 check-lto:

--
2.25.1


[pushed] c++: fix cxx_print_type's template-info dumping

2023-09-19 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, pushed to trunk as obvious.

-- >8 --

Unlike DECL_TEMPLATE_INFO which is stored in DECL_LANG_SPECIFIC,
TYPE_TEMPLATE_INFO isn't stored in TYPE_LANG_SPECIFIC, so we don't
need to check for both in cxx_print_type.  This fixes dumping the
template-info of ENUMERAL_TYPE and BOUND_TEMPLATE_TEMPLATE_PARM,
which seem to never have TYPE_LANG_SPECIFIC.

gcc/cp/ChangeLog:

* ptree.cc (cxx_print_type): Remove TYPE_LANG_SPECIFIC
test guarding TYPE_TEMPLATE_INFO.
---
 gcc/cp/ptree.cc | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/ptree.cc b/gcc/cp/ptree.cc
index b4001486701..32c5b5280dc 100644
--- a/gcc/cp/ptree.cc
+++ b/gcc/cp/ptree.cc
@@ -141,9 +141,8 @@ cxx_print_decl (FILE *file, tree node, int indent)
 void
 cxx_print_type (FILE *file, tree node, int indent)
 {
-  if (TYPE_LANG_SPECIFIC (node)
-  && TYPE_TEMPLATE_INFO (node))
-print_node (file, "template-info", TYPE_TEMPLATE_INFO (node), indent + 4);
+  if (tree ti = TYPE_TEMPLATE_INFO (node))
+print_node (file, "template-info", ti, indent + 4);
 
   switch (TREE_CODE (node))
 {
-- 
2.42.0.216.gbda494f404
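
[Editorial sketch] The effect of dropping the extra guard can be shown with a small model. `TypeNode`, `LangSpecific`, and `TemplateInfo` are hypothetical stand-ins for the tree node and its fields, assuming (as the commit message states) that the template info is stored independently of the language-specific data.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for a type node and its optional attachments.
struct TemplateInfo { std::string name; };
struct LangSpecific {};
struct TypeNode {
  const LangSpecific *lang_specific;  // may be null, e.g. for an enum
  const TemplateInfo *template_info;  // stored independently of the above
};

// Old shape: an unrelated guard hides template info on some nodes.
std::string dump_old(const TypeNode &node) {
  if (node.lang_specific && node.template_info)
    return "template-info " + node.template_info->name;
  return "";
}

// New shape: declare the value in the condition and test only it.
std::string dump_new(const TypeNode &node) {
  if (const TemplateInfo *ti = node.template_info)
    return "template-info " + ti->name;
  return "";
}

// Usage: a node without language-specific data still gets dumped.
inline void check_dump_without_lang_specific() {
  const TemplateInfo ti{"E"};
  const TypeNode node{nullptr, &ti};
  assert(dump_old(node).empty());
  assert(dump_new(node) == "template-info E");
}
```

Declaring the variable in the `if` condition also avoids evaluating the accessor twice, which is the same idiom the patch uses.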



Re: [PATCH] c++: further optimize tsubst_template_decl

2023-09-19 Thread Jason Merrill

On 9/19/23 11:04, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?

-- >8 --

This patch makes tsubst_template_decl use use_spec_table=false also in
the non-class non-function template case, to avoid computing 'argvec' and
doing a hash table lookup from tsubst_decl (when partially instantiating
a member variable or alias template).

This change reveals that for function templates, tsubst_template_decl
registers the partially instantiated TEMPLATE_DECL, whereas for other
non-class templates it registers the corresponding DECL_TEMPLATE_RESULT,
which is an interesting inconsistency that I decided to preserve for now.


Can you document that in a comment somewhere, maybe at the bottom of 
tsubst_template_decl where you're handling them differently?  OK with 
that change.



Trying to consistently register the TEMPLATE_DECL (or FUNCTION_DECL)
causes modules crashes, but I haven't looked into why.

In passing, I noticed in tsubst_function_decl that its 'argvec' goes
unused when 'lambda_fntype' is set (since lambdas aren't recorded in the
specializations table), so we can avoid computing it in that case.

gcc/cp/ChangeLog:

* pt.cc (tsubst_function_decl): Don't bother computing 'argvec'
when 'lambda_fntype' is set.
(tsubst_template_decl): Make sure we return a TEMPLATE_DECL
after specialization lookup.  In the non-class non-function
template case, use tsubst_decl directly with use_spec_table=false,
update DECL_TI_ARGS and call register_specialization like
tsubst_decl would have done if use_spec_table=true.
---
  gcc/cp/pt.cc | 39 +--
  1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 777ff592789..cc8ba21d6fd 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -14370,7 +14370,7 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t 
complain,
  
/* Calculate the complete set of arguments used to

 specialize R.  */
-  if (use_spec_table)
+  if (use_spec_table && !lambda_fntype)
{
  argvec = tsubst_template_args (DECL_TI_ARGS
 (DECL_TEMPLATE_RESULT
@@ -14380,14 +14380,11 @@ tsubst_function_decl (tree t, tree args, 
tsubst_flags_t complain,
return error_mark_node;
  
  	  /* Check to see if we already have this specialization.  */

- if (!lambda_fntype)
-   {
- hash = spec_hasher::hash (gen_tmpl, argvec);
- if (tree spec = retrieve_specialization (gen_tmpl, argvec, hash))
-   /* The spec for these args might be a partial instantiation of 
the
-  template, but here what we want is the FUNCTION_DECL.  */
-   return STRIP_TEMPLATE (spec);
-   }
+ hash = spec_hasher::hash (gen_tmpl, argvec);
+ if (tree spec = retrieve_specialization (gen_tmpl, argvec, hash))
+   /* The spec for these args might be a partial instantiation of the
+  template, but here what we want is the FUNCTION_DECL.  */
+   return STRIP_TEMPLATE (spec);
}
else
argvec = args;
@@ -14704,6 +14701,8 @@ tsubst_template_decl (tree t, tree args, tsubst_flags_t 
complain,
/* Type partial instantiations are stored as the type by
   lookup_template_class_1, not here as the template.  */
spec = CLASSTYPE_TI_TEMPLATE (spec);
+ else if (TREE_CODE (spec) != TEMPLATE_DECL)
+   spec = DECL_TI_TEMPLATE (spec);
  return spec;
}
  }
@@ -14754,7 +14753,7 @@ tsubst_template_decl (tree t, tree args, tsubst_flags_t 
complain,
inner = tsubst_aggr_type (inner, args, complain,
  in_decl, /*entering*/1);
else
-   inner = tsubst (inner, args, complain, in_decl);
+   inner = tsubst_decl (inner, args, complain, /*use_spec_table=*/false);
  }
--processing_template_decl;
if (inner == error_mark_node)
@@ -14780,12 +14779,11 @@ tsubst_template_decl (tree t, tree args, 
tsubst_flags_t complain,
  }
else
  {
-  if (TREE_CODE (inner) == FUNCTION_DECL)
-   /* Set DECL_TI_ARGS to the full set of template arguments, which
-  tsubst_function_decl didn't do due to use_spec_table=false.  */
-   DECL_TI_ARGS (inner) = full_args;
-
DECL_TI_TEMPLATE (inner) = r;
+  /* Set DECL_TI_ARGS to the full set of template arguments,
+which tsubst_function_decl / tsubst_decl didn't do due to
+use_spec_table=false.  */
+  DECL_TI_ARGS (inner) = full_args;
DECL_TI_ARGS (r) = DECL_TI_ARGS (inner);
  }
  
@@ -14813,9 +14811,14 @@ tsubst_template_decl (tree t, tree args, tsubst_flags_t complain,

if (PRIMARY_TEMPLATE_P (t))
  DECL_PRIMARY_TEMPLATE (r) = r;
  
-  if (TREE_CODE (decl) == FUNCTION_DECL && !lambda_fntype)

-/* Record this non-type partial inst

Re: [PATCH] [frange] Relax floating point relational folding.

2023-09-19 Thread Aldy Hernandez
Hi Jakub.

I wasn't ignoring you, just quietly thinking, making sure we weren't
missing anything.  In the process I cleaned everything, which
hopefully makes it easier to see why we don't need relationals (the
key is to look at frelop_early_resolve() and the op1/op2_range entries
which clear the NAN bits).

I am committing the patch below, and as I say in it:

I don't mean this patch as a hard-no against implementing the
unordered relations Jakub preferred, but seeing that it's looking
cleaner and trivially simple without the added burden of more enums,
I'd like to flesh it out completely and then discuss if we still think
new codes are needed.  At least now we have tests :).

Please let me know if you have any test cases you think we may be
missing.  FYI, I'm still not done with the unordered folders.

Tested on x86-64 Linux.  Committed.
Aldy

On Mon, Aug 28, 2023 at 3:01 AM Jakub Jelinek  wrote:
>
> On Wed, Aug 23, 2023 at 05:22:00PM +0200, Aldy Hernandez via Gcc-patches 
> wrote:
> > BTW, we batted some ideas on how to get this work, and it seems this
> > is the cleaner route with the special cases nestled in the operators
> > themselves.  Another idea is to add unordered relations, but that
> > would require bloating the various tables adding spots for VREL_UNEQ,
> > VREL_UNLT, etc, plus adding relations for VREL_UNORDERED so the
> > intersects work correctly.  I'm not wed to either one, and we can
> > certainly revisit this if it becomes burdensome to maintain (or to get
> > right).
>
> My strong preference would be to have the full set of operations,
> i.e. VREL_LTGT, VREL_{,UN}ORDERED, VREL_UN{LT,LE,GT,GE,EQ}, then everything
> will fall out of this cleanly, not just some common special cases, but
> also unions of them, intersections etc.
> The only important question is if you want to differentiate VREL_*
> for floating point comparisons with possible NANs vs. other comparisons
> in the callers, then one needs effectively e.g. 2 sets of rr_* tables
> in value-relation.cc and what exactly say VREL_EQ inverts to etc. is then
> dependent on the context (this is what we do at the tree level normally,
> e.g. fold-const.cc (invert_tree_comparison) has honor_nans argument),
> or whether it would be a completely new set of value relations, so
> even for EQ/NE etc. one would use VRELF_ or something similar.
>
> Jakub
>
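
To make the honor_nans point above concrete, here is a minimal standalone sketch in plain C++ (not GCC code; both helper names are invented for illustration).  It shows that the inverse of `<` is `>=` only when neither operand can be a NAN; once NANs are honored the inverse is "unordered or >=" (UNGE):

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helpers, for illustration only.

// Inverse of "a < b" when NANs are assumed absent: a >= b.
static bool inverted_lt_no_nans (double a, double b)
{
  return a >= b;
}

// Inverse of "a < b" when NANs are honored: unordered or a >= b (UNGE).
static bool inverted_lt_honor_nans (double a, double b)
{
  return std::isunordered (a, b) || a >= b;
}
```

For ordinary operands both helpers agree with `!(a < b)`; with a NAN operand only the second one does, which is why the meaning of "invert VREL_LT" depends on the context.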
From 220a58d9abbb1f403e8f79cd42ad01b7c9b10ae9 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Mon, 18 Sep 2023 21:41:08 -0400
Subject: [PATCH] [frange] Clean up floating point relational folding.

The following patch removes all the special casing from the floating
point relational folding code.  Now all the code relating to folding
of relationals is in frelop_early_resolve() and in
operator_not_equal::fold_range() which requires a small tweak.

I have written new relational tests, and moved them to
gcc.dg/tree-ssa/vrp-float-relations-* for easy reference.  In the
tests it's easy to see the type of things we need to handle:

(a)
	if (x != y)
	  if (x == y)
	link_error ();

(b)
	if (a != b)
	  if (a != b) // Foldable as true.

(c)
	/* We can thread BB2->BB4->BB5 even though we have no knowledge
	   of the NANness of either x_1 or a_5.  */
	__BB(4):
	  x_1 = __PHI (__BB2: a_5(D), __BB3: b_4(D));
	  if (x_1 __UNEQ a_5(D))

(d)
	/* Even though x_1 and a_4 are equivalent on the BB2->BB4 path,
	   we cannot fold the conditional because of possible NANs:  */
	__BB(4):
	  # x_1 = __PHI (__BB2: a_4(D), __BB3: 8.0e+0(3));
	  if (x_1 == a_4(D))

(e)
	if (cond)
	  x = a;
	else
	  x = 8.0;

	/* We can fold this as false on the path coming out of cond==1,
	   regardless of NANs on either "x" or "a".  */
	if (x < a)
	  stuff ();

[etc, etc]

We can implement everything without either special casing,
get_identity_relation(), or adding new unordered relationals.

The basic idea is that if we accurately reflect NANs in op[12]_range,
this information gets propagated to the relevant edges, and there's no
need for unordered relations (VREL_UN*), because the information is in
the range itself.  This information is then used in
frelop_early_resolve() to fold certain combinations.
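
As a standalone illustration of cases (d) and (e) above (plain C++, not GCC internals; the function names are made up for this sketch): when x is a copy of a, `x == a` still cannot be folded to true because both may be NAN, whereas `x < a` is false whether or not a NAN is involved.

```cpp
#include <cassert>
#include <limits>

// In both helpers x is known to be a copy of a, as on the
// BB2->BB4 path in (d) or the cond==1 path in (e).

// Not foldable to true: NAN == NAN is false.
static bool copy_compares_equal (double a)
{
  double x = a;
  return x == a;
}

// Foldable to false: x < x is false for every value, NAN included.
static bool copy_compares_less (double a)
{
  double x = a;
  return x < a;
}
```

So knowing the equivalence alone is not enough for `==`; the fold is only valid once the range says the operand cannot be a NAN.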

I don't mean this patch as a hard-no against implementing the
unordered relations Jakub preferred, but seeing that it's looking
cleaner and trivially simple without the added burden of more enums,
I'd like to flesh it out completely and then discuss if we still think
new codes are needed.

More testcases or corner cases are highly welcome.

In follow-up patches I will finish up unordered relation folding, and
come up with suitable tests.

gcc/ChangeLog:

	* range-op-float.cc (frelop_early_resolve): Clean-up and remove
	special casing.
	(operator_not_equal::fold_range): Handle VREL_EQ.
	(operator_lt::fold_range): Remove special casing for VREL_EQ.
	(operator_gt::fold_range): Same.
	(foperator_unordered_equal::fold_range): Same.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/vrp-float-12.c: Moved to...
	

Re: [PATCH V4, rs6000] Disable generation of scalar modulo instructions

2023-09-19 Thread Pat Haugen

On 9/13/23 3:48 PM, Segher Boessenkool wrote:



-  "TARGET_POWER10 && TARGET_POWERPC64"
+  "TARGET_POWER10 && TARGET_POWERPC64 && !RS6000_DISABLE_SCALAR_MODULO"
"vmoduq %0,%1,%2"


Did we ever test if this insn in fact is slower as well?  I don't mean
either way, orthogonality is good, but just for my enlightenment.

Yes, I tested quadword too and saw the same result, div/mul/sub is the 
better option.




With improved changelog: okay for trunk.  Okay for all backports as
well (after some soak time).


Thanks, updated changelog and committed to trunk. Will backport after 
burn-in.


-Pat



Re: [PATCH v2 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-09-19 Thread Jason Merrill

On 9/11/23 09:49, waffl3x via Gcc-patches wrote:

Bootstrapped and tested on x86_64-linux with no regressions.

Hopefully I fixed all the issues. I also took the opportunity to remove the
small mistake present in v1, so that is no longer a concern.

Thanks again for all the patience.
   -Alex


Thank you, this is great!

One legal hurdle to start with: our DCO policy 
(https://gcc.gnu.org/dco.html) requires real names in the sign-off, not 
pseudonyms.  If you would prefer to contribute under this pseudonym, I 
encourage you to file a copyright assignment with the FSF, who are set 
up to handle that.



+/* These need to moved to somewhere appropriate.  */


This isn't a bad spot for these macros, but you could also move them 
down lower, maybe near DECL_THIS_STATIC and DECL_ARRAY_PARAMETER_P for 
some thematic connection.



+/* The flag is a member of base, but the value is meaningless for other
+   decl types so checking is still justified I imagine.  */


Absolutely, we often reuse bits for other purposes if they're disjoint 
from the use they were added for.



+/* Not a lang_decl field, but still specific to c++.  */
+#define DECL_PARM_XOBJ_FLAG(NODE) \
+  (PARM_DECL_CHECK (NODE)->decl_common.decl_flag_3)


Better to use a DECL_LANG_FLAG than claim one of the 
language-independent flags for C++.


There's a list at the top of cp-tree.h of the uses of *_LANG_FLAG_* on 
various kinds of tree node.  DECL_LANG_FLAG_4 seems free on PARM_DECL.



+  /* Only used for skipping over build_memfn_type, grokfndecl handles
+ copying the flag to the correct field for a func_decl.
+ There must be a better way to do this, but it isn't obvious how.  */
+  bool is_xobj_member_function = false;
+  auto get_xobj_parm = [](tree parm_list)


I guess you could add a flag to the declarator, but this is fine too. 
Though I'd move this lambda down into the cdk_function case or out to a 
separate function.



case cdk_function:
  {
+   tree xobj_parm
+ = get_xobj_parm (declarator->u.function.parameters);
+   is_xobj_member_function = xobj_parm;


I'd also move these down a few lines after the setting of 'raises'.


+   /* Set the xobj flag for this parm, unfortunately
+  I don't think there is a better way to do this.  */
+   DECL_PARM_XOBJ_FLAG (decl)
+ = decl_spec_seq_has_spec_p (declspecs, ds_this);


This seems like a fine way to handle this.


+  /* Special case for xobj parm, doesn't really belong up here
+(it applies to parm decls and those are mostly handled below
+the following specifiers) but I intend to refactor this function
+so I'm not worrying about it too much.
+The error diagnostics might be better elsewhere though.  */


This seems like a reasonable place for it since 'this' is supposed to 
precede the decl-specifiers, and since we are parsing initial attributes 
here rather than in the caller.  You will want to give an error if 
found_decl_spec is set.  And elsewhere complain about 'this' on 
parameters after the first (in cp_parser_parameter_declaration_list?), 
or in a non-member/lambda (in grokdeclarator?).


Jason



Re: [PATCH 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-09-19 Thread Jason Merrill

On 8/31/23 04:33, Jakub Jelinek wrote:

On Thu, Aug 31, 2023 at 06:02:36AM +, waffl3x via Gcc-patches wrote:


+++ b/gcc/testsuite/g++.dg/cpp23/explicit-object-param-valid2.C
@@ -0,0 +1,24 @@
+// P0847R7
+// { dg-do run { target c++23 } }


This raises an important question whether we as an extension
should support deducing this even in older standards or not.
I admit I haven't studied the paper enough to figure that out.
The syntax is certainly something that wasn't valid in older standards,
so from that POV it could be accepted say with pedwarn with
OPT_Wc__23_extensions if cxx_dialect < cxx23.  But perhaps some
of the rules in the paper change something unconditionally even when
the new syntax doesn't appear.
And, if it is accepted in older standards, the question is if it
shouldn't be banned say from C++98.


I don't think there's any obstacle to allowing it as an extension in 
older standards (with a pedwarn, of course).


Jason



Re: [PATCH v5] c++: extend cold, hot attributes to classes

2023-09-19 Thread Jason Merrill

On 9/6/23 06:20, Javier Martinez wrote:

reminder: ready for commit?


Pushed, thanks!


- Javier

On Wed 23. Aug 2023 at 15:02, Javier Martinez wrote:


On Tue, Aug 22, 2023 at 7:50 PM Jason Merrill <ja...@redhat.com> wrote:
 > You still need an update to doc/extend.texi for this additional use of
 > the attribute.  Sorry I didn't think of that before.

I should have caught that too, many thanks.

Also addressed the formatting comments. Patch attached.

Signed-off-by: Javier Martinez <javier.martinez.bugzi...@gmail.com>





Re: [PATCH] debug/111409 - don't generate COMDAT macro sections for split DWARF

2023-09-19 Thread Omar Sandoval
On Tue, Sep 19, 2023 at 02:56:36PM +0200, Richard Biener wrote:
> On Thu, Sep 14, 2023 at 8:42 AM Omar Sandoval  wrote:
> >
> > Split DWARF files aren't processed by the linker, so DW_MACRO_import
> > offsets aren't relocated and the .debug_macro.dwo sections aren't
> > deduplicated and merged.  There's no clear way for this to work for
> > split DWARF, so disable it.
> 
> OK.
> 
> Thanks,
> Richard.

Thank you! I don't have write access, how can I get this committed?

Omar


Re: [PATCH] RISC-V: Fix --enable-checking=rtl ICE on rv32gc bootstrap

2023-09-19 Thread 钟居哲
LGTM. You can commit it.
Thanks.



juzhe.zh...@rivai.ai
 
From: Patrick O'Neill
Date: 2023-09-20 02:04
To: gcc-patches
CC: juzhe.zhong; patrick; pan2.li; kito.cheng; yanzhang.wang; gnu-toolchain
Subject: [PATCH] RISC-V: Fix --enable-checking=rtl ICE on rv32gc bootstrap
Resolves PR 111461.
 
during RTL pass: expand
offtime.c: In function '__offtime':
offtime.c:79:6: internal compiler error: RTL check: expected elt 0 type 'e' or 
'u', have 'w' (rtx const_int) in riscv_legitimize_const_move, at 
config/riscv/riscv.cc:2176
   79 |   ip = __mon_yday[__isleap(y)]; 
 
Tested on rv32gc glibc with --enable-checking=rtl.
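
The ICE above comes from fetching operand 0 of SRC before checking that SRC is a CONST; for a const_int, element 0 is not an rtx.  The sketch below is a toy model of that ordering problem in standalone C++.  The `Node` type, the `operand` accessor and both predicates are hypothetical stand-ins, not GCC's rtx API:

```cpp
#include <cassert>
#include <stdexcept>

// Toy stand-in for an rtx; NOT GCC's real data structure.
enum class Code { ConstInt, Const, Plus, ConstPolyInt, Reg };

struct Node
{
  Code code;
  const Node *op0;
  const Node *op1;
  Node (Code c, const Node *a = nullptr, const Node *b = nullptr)
    : code (c), op0 (a), op1 (b) {}
};

// Checked operand accessor, loosely modelled on XEXP under
// --enable-checking=rtl: leaf nodes have no expression operands.
static const Node *operand (const Node &n, int i)
{
  if (n.code != Code::Const && n.code != Code::Plus)
    throw std::logic_error ("check: operand is not an expression");
  return i == 0 ? n.op0 : n.op1;
}

// Old shape: the operand is fetched before the code of SRC is tested.
static bool is_const_plus_poly_eager (const Node &src)
{
  const Node *op0 = operand (src, 0);	// Throws for a const_int SRC.
  return src.code == Code::Const && op0->code == Code::Plus
	 && operand (*op0, 1)->code == Code::ConstPolyInt;
}

// New shape: && short-circuits, so the operand is only fetched once
// SRC is known to be a CONST.
static bool is_const_plus_poly_lazy (const Node &src)
{
  return src.code == Code::Const
	 && operand (src, 0)->code == Code::Plus
	 && operand (*operand (src, 0), 1)->code == Code::ConstPolyInt;
}
```

In this model the eager predicate trips the check on a const_int input, while the lazy one simply returns false.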
 
gcc/ChangeLog:
 
* config/riscv/riscv.cc (riscv_legitimize_const_move): Eliminate
src_op_0 var to avoid rtl check error.
 
Authored-by: Juzhe Zhong 
Tested-by: Patrick O'Neill 
---
gcc/config/riscv/riscv.cc | 10 --
1 file changed, 4 insertions(+), 6 deletions(-)
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 8c766e2e2be..9a1e643a6a8 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2173,16 +2173,14 @@ riscv_legitimize_const_move (machine_mode mode, rtx 
dest, rtx src)
(const_poly_int:DI [16, 16]) // <- op_1
  ))
*/
-  rtx src_op_0 = XEXP (src, 0);
-
-  if (GET_CODE (src) == CONST && GET_CODE (src_op_0) == PLUS
-&& CONST_POLY_INT_P (XEXP (src_op_0, 1)))
+  if (GET_CODE (src) == CONST && GET_CODE (XEXP (src, 0)) == PLUS
+  && CONST_POLY_INT_P (XEXP (XEXP (src, 0), 1)))
 {
   rtx dest_tmp = gen_reg_rtx (mode);
   rtx tmp = gen_reg_rtx (mode);
-  riscv_emit_move (dest, XEXP (src_op_0, 0));
-  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (src_op_0, 1));
+  riscv_emit_move (dest, XEXP (XEXP (src, 0), 0));
+  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (XEXP (src, 0), 
1));
   emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, dest, dest_tmp)));
   return;
-- 
2.34.1
 
 


[PATCH v8 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces

2023-09-19 Thread Ajit Agarwal
Hello All:

Version 8 of the patch uses ABI interfaces to eliminate redundant zero and
sign extensions.
Bootstrapped and regtested on powerpc-linux-gnu.

Incorporated all the review comments of version 6.  Added sign extension
elimination using ABI interfaces.

Thanks & Regards
Ajit

ree: Improve ree pass for rs6000 target using defined abi interfaces

For the rs6000 target we see redundant zero and sign extensions, so
improve the ree pass to eliminate such redundant zero and sign extensions
using defined ABI interfaces.
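
As a rough illustration of why such an extension can be redundant, here is a toy model in standalone C++ (not GCC RTL; all names are invented, and it assumes the ABI has already promoted a narrow argument or return value to full register width).  An extension that leaves the register unchanged can be dropped:

```cpp
#include <cassert>
#include <cstdint>

// Toy model of 64-bit registers holding 8-bit values; not GCC RTL.

// Zero-extend the low 8 bits to 64 bits.
static std::uint64_t zero_extend8 (std::uint64_t reg)
{
  return reg & 0xff;
}

// Sign-extend the low 8 bits to 64 bits.
static std::uint64_t sign_extend8 (std::uint64_t reg)
{
  std::uint64_t low = reg & 0xff;
  // Flip and subtract the sign bit to replicate it into the upper bits.
  return (low ^ 0x80) - 0x80;
}

// An extension is redundant when the register is already in extended form.
static bool zero_extension_redundant (std::uint64_t reg)
{
  return zero_extend8 (reg) == reg;
}

static bool sign_extension_redundant (std::uint64_t reg)
{
  return sign_extend8 (reg) == reg;
}
```

The pass itself decides this from the target hooks rather than from run-time values; the model only shows the property being exploited.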

2023-09-20  Ajit Kumar Agarwal  

gcc/ChangeLog:

* ree.cc (combine_reaching_defs): Use of zero_extend and sign_extend
defined abi interfaces.
(add_removable_extension): Use of defined abi interfaces for no
reaching defs.
(abi_extension_candidate_return_reg_p): New function.
(abi_extension_candidate_p): New function.
(abi_extension_candidate_argno_p): New function.
(abi_handle_regs_without_defs_p): New function.
(abi_target_promote_function_mode): New function.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/zext-elim-3.C
---
 gcc/ree.cc| 161 +-
 .../g++.target/powerpc/zext-elim-3.C  |  13 ++
 2 files changed, 171 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C

diff --git a/gcc/ree.cc b/gcc/ree.cc
index fc04249fa84..e833db2432d 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -514,7 +514,8 @@ get_uses (rtx_insn *insn, rtx reg)
 if (REGNO (DF_REF_REG (def)) == REGNO (reg))
   break;
 
-  gcc_assert (def != NULL);
+  if (def == NULL)
+return NULL;
 
   ref_chain = DF_REF_CHAIN (def);
 
@@ -750,6 +751,134 @@ get_extended_src_reg (rtx src)
   return src;
 }
 
+/* Return TRUE if target mode is equal to source mode of zero_extend
+   or sign_extend otherwise false.  */
+
+static bool
+abi_target_promote_function_mode (machine_mode mode)
+{
+  int unsignedp;
+  machine_mode tgt_mode
+= targetm.calls.promote_function_mode (NULL_TREE, mode, &unsignedp,
+  NULL_TREE, 1);
+
+  if (tgt_mode == mode)
+return true;
+  else
+return false;
+}
+
+/* Return TRUE if the candidate insn is zero extend and regno is
+   a return registers.  */
+
+static bool
+abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
+{
+  rtx set = single_set (insn);
+  rtx src = SET_SRC (set);
+
+  if (GET_CODE (src) != ZERO_EXTEND && GET_CODE (src) != SIGN_EXTEND)
+return false;
+
+  if (targetm.calls.function_value_regno_p (regno))
+return true;
+
+  return false;
+}
+
+/* Return TRUE if reg source operand of zero_extend is argument registers
+   and not return registers and source and destination operand are same
+   and mode of source and destination operand are not same.  */
+
+static bool
+abi_extension_candidate_p (rtx_insn *insn)
+{
+  rtx set = single_set (insn);
+  rtx src = SET_SRC (set);
+
+  if (GET_CODE (src) != ZERO_EXTEND && GET_CODE (src) != SIGN_EXTEND)
+return false;
+
+  machine_mode dst_mode = GET_MODE (SET_DEST (set));
+  rtx orig_src = XEXP (SET_SRC (set), 0);
+
+  if (!FUNCTION_ARG_REGNO_P (REGNO (orig_src))
+  || abi_extension_candidate_return_reg_p (insn, REGNO (orig_src)))
+return false;
+
+  /* Mode of destination and source of zero_extend should be different.  */
+  if (dst_mode == GET_MODE (orig_src))
+return false;
+
+  /* REGNO of source and destination of zero_extend should be same.  */
+  if (REGNO (SET_DEST (set)) != REGNO (orig_src))
+return false;
+
+  return true;
+}
+
+/* Return TRUE if the candidate insn is zero extend and regno is
+   an argument registers.  */
+
+static bool
+abi_extension_candidate_argno_p (rtx_code code, int regno)
+{
+  if (code != ZERO_EXTEND && code != SIGN_EXTEND)
+return false;
+
+  if (FUNCTION_ARG_REGNO_P (regno))
+return true;
+
+  return false;
+}
+
+/* Return TRUE if the candidate insn doesn't have defs and have
+ * uses without RTX_BIN_ARITH/RTX_COMM_ARITH/RTX_UNARY rtx class.  */
+
+static bool
+abi_handle_regs (rtx_insn *insn)
+{
+  if (side_effects_p (PATTERN (insn)))
+return false;
+
+  struct df_link *uses = get_uses (insn, SET_DEST (PATTERN (insn)));
+
+  if (!uses)
+return false;
+
+  for (df_link *use = uses; use; use = use->next)
+{
+  if (!use->ref)
+   return false;
+
+  if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (DF_REF_INSN (use->ref)))
+   return false;
+
+  rtx_insn *use_insn = DF_REF_INSN (use->ref);
+
+  if (GET_CODE (PATTERN (use_insn)) == SET)
+   {
+ rtx_code code = GET_CODE (SET_SRC (PATTERN (use_insn)));
+
+ if (GET_RTX_CLASS (code) == RTX_BIN_ARITH
+ || GET_RTX_CLASS (code) == RTX_COMM_ARITH
+ || GET_RTX_CLASS (code) == RTX_UNARY)
+   return false;
+   }
+ }
+
+  rtx set = single_set (insn);
+
+  if (GET_CODE (SET_SRC (set)) == SIGN_EXTEND)
+{
+

Re: [PATCH v7 4/4] ree: Improve ree pass for rs6000 target using defined ABI interfaces

2023-09-19 Thread Ajit Agarwal
Hello Vineet:

This patch is without sign_extension implementation.
I have sent patch 8 that incorporates sign_extension using abi interfaces.

Please review.

Thanks & Regards
Ajit

On 19/09/23 2:48 pm, Ajit Agarwal wrote:
> 
> 
> On 19/09/23 2:36 pm, Xi Ruoyao wrote:
>> On Tue, 2023-09-19 at 14:29 +0530, Ajit Agarwal wrote:
>>> This new version 7 of the patch improves the ree pass for the rs6000
>>> target using defined ABI interfaces.
>>> Bootstrapped and regtested on power64-linux-gnu.
>>
>> You should drop the "4/4" in subject if it does not depends on other
>> non-committed patches.  Otherwise you should send all the non-committed
>> dependencies in a series.
>>
>> If you just have the 4/4 there without {1..3}/4, people will believe
>> this is an incomplete patch submission and likely ignore it.
>>
> 
> There are 4 patches that are already under review, and all the previous
> patches are under review.
> 
> I will send 3/4 patches again for review.
> 
> Thanks & Regards
> Ajit
>>> Review comments incorporated.
>>>
>>> Thanks & Regards
>>> Ajit
>>>
>>> ree: Improve ree pass for rs6000 target using defined abi interfaces
>>>
>>> For the rs6000 target we see redundant zero and sign extensions, so
>>> improve the ree pass to eliminate such redundant zero and sign extensions
>>> using defined ABI interfaces.
>>>
>>> 2023-09-19  Ajit Kumar Agarwal  
>>>
>>> gcc/ChangeLog:
>>>
>>> * ree.cc (combine_reaching_defs): Use of zero_extend and
>>> sign_extend
>>> defined abi interfaces.
>>> (add_removable_extension): Use of defined abi interfaces for
>>> no
>>> reaching defs.
>>> (abi_extension_candidate_return_reg_p): New function.
>>> (abi_extension_candidate_p): New function.
>>> (abi_extension_candidate_argno_p): New function.
>>> (abi_handle_regs_without_defs_p): New function.
>>> (abi_target_promote_function_mode): New function.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     * g++.target/powerpc/zext-elim-3.C
>>> ---
>>>  gcc/ree.cc    | 148
>>> +-
>>>  .../g++.target/powerpc/zext-elim-3.C  |  13 ++
>>>  2 files changed, 158 insertions(+), 3 deletions(-)
>>>  create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-3.C
>>>
>>> diff --git a/gcc/ree.cc b/gcc/ree.cc
>>> index fc04249fa84..79fc54f38a3 100644
>>> --- a/gcc/ree.cc
>>> +++ b/gcc/ree.cc
>>> @@ -514,7 +514,8 @@ get_uses (rtx_insn *insn, rtx reg)
>>>  if (REGNO (DF_REF_REG (def)) == REGNO (reg))
>>>    break;
>>>  
>>> -  gcc_assert (def != NULL);
>>> +  if (def == NULL)
>>> +    return NULL;
>>>  
>>>    ref_chain = DF_REF_CHAIN (def);
>>>  
>>> @@ -750,6 +751,122 @@ get_extended_src_reg (rtx src)
>>>    return src;
>>>  }
>>>  
>>> +/* Return TRUE if target mode is equal to source mode of zero_extend
>>> +   or sign_extend otherwise false.  */
>>> +
>>> +static bool
>>> +abi_target_promote_function_mode (machine_mode mode)
>>> +{
>>> +  int unsignedp;
>>> +  machine_mode tgt_mode
>>> +    = targetm.calls.promote_function_mode (NULL_TREE, mode,
>>> &unsignedp,
>>> +      NULL_TREE, 1);
>>> +
>>> +  if (tgt_mode == mode)
>>> +    return true;
>>> +  else
>>> +    return false;
>>> +}
>>> +
>>> +/* Return TRUE if the candidate insn is zero extend and regno is
>>> +   a return registers.  */
>>> +
>>> +static bool
>>> +abi_extension_candidate_return_reg_p (rtx_insn *insn, int regno)
>>> +{
>>> +  rtx set = single_set (insn);
>>> +
>>> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
>>> +    return false;
>>> +
>>> +  if (targetm.calls.function_value_regno_p (regno))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>>> +
>>> +/* Return TRUE if reg source operand of zero_extend is argument
>>> registers
>>> +   and not return registers and source and destination operand are
>>> same
>>> +   and mode of source and destination operand are not same.  */
>>> +
>>> +static bool
>>> +abi_extension_candidate_p (rtx_insn *insn)
>>> +{
>>> +  rtx set = single_set (insn);
>>> +
>>> +  if (GET_CODE (SET_SRC (set)) != ZERO_EXTEND)
>>> +    return false;
>>> +
>>> +  machine_mode dst_mode = GET_MODE (SET_DEST (set));
>>> +  rtx orig_src = XEXP (SET_SRC (set), 0);
>>> +
>>> +  if (!FUNCTION_ARG_REGNO_P (REGNO (orig_src))
>>> +  || abi_extension_candidate_return_reg_p (insn, REGNO
>>> (orig_src)))
>>> +    return false;
>>> +
>>> +  /* Mode of destination and source of zero_extend should be
>>> different.  */
>>> +  if (dst_mode == GET_MODE (orig_src))
>>> +    return false;
>>> +
>>> +  /* REGNO of source and destination of zero_extend should be same. 
>>> */
>>> +  if (REGNO (SET_DEST (set)) != REGNO (orig_src))
>>> +    return false;
>>> +
>>> +  return true;
>>> +}
>>> +
>>> +/* Return TRUE if the candidate insn is zero extend and regno is
>>> +   an argument registers.  */
>>> +
>>> +static bool
>>> +abi_extension_candidate_argno_p (rtx_code code, int regno)
>>> +{
>>> +  if (code != ZERO_EXTEND && code != SIGN_EXTEND)
>>> +    re

[Committed] RISC-V: Fix --enable-checking=rtl ICE on rv32gc bootstrap

2023-09-19 Thread Patrick O'Neill

Committed, thanks!

The pre-commit hook didn't like the Authored-by format so I changed it
into:
2023-09-19 Juzhe Zhong 

Patrick


From 0b9c51dc2fb58911b91889895d00437673d9f4cf Mon Sep 17 00:00:00 2001
From: Patrick O'Neill 
Date: Tue, 19 Sep 2023 10:03:35 -0700
Subject: [PATCH] RISC-V: Fix --enable-checking=rtl ICE on rv32gc bootstrap

Resolves PR 111461.

during RTL pass: expand
offtime.c: In function '__offtime':
offtime.c:79:6: internal compiler error: RTL check: expected elt 0 type 
'e' or 'u', have 'w' (rtx const_int) in riscv_legitimize_const_move, at 
config/riscv/riscv.cc:2176

   79 |   ip = __mon_yday[__isleap(y)];

Tested on rv32gc glibc with --enable-checking=rtl.

2023-09-19 Juzhe Zhong 

gcc/ChangeLog:

    * config/riscv/riscv.cc (riscv_legitimize_const_move): Eliminate
    src_op_0 var to avoid rtl check error.

Tested-by: Patrick O'Neill 
---
 gcc/config/riscv/riscv.cc | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 8c766e2e2be..9a1e643a6a8 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2173,16 +2173,14 @@ riscv_legitimize_const_move (machine_mode mode, 
rtx dest, rtx src)

  (const_poly_int:DI [16, 16]) // <- op_1
  ))
    */
-  rtx src_op_0 = XEXP (src, 0);
-
-  if (GET_CODE (src) == CONST && GET_CODE (src_op_0) == PLUS
-    && CONST_POLY_INT_P (XEXP (src_op_0, 1)))
+  if (GET_CODE (src) == CONST && GET_CODE (XEXP (src, 0)) == PLUS
+  && CONST_POLY_INT_P (XEXP (XEXP (src, 0), 1)))
 {
   rtx dest_tmp = gen_reg_rtx (mode);
   rtx tmp = gen_reg_rtx (mode);

-  riscv_emit_move (dest, XEXP (src_op_0, 0));
-  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (src_op_0, 1));
+  riscv_emit_move (dest, XEXP (XEXP (src, 0), 0));
+  riscv_legitimize_poly_move (mode, dest_tmp, tmp, XEXP (XEXP (src, 
0), 1));


   emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, dest, dest_tmp)));
   return;
--
2.34.1



Re: [PATCH v2] c++: Catch indirect change of active union member in constexpr [PR101631]

2023-09-19 Thread Jason Merrill

On 9/1/23 08:22, Nathaniel Shead wrote:

On Wed, Aug 30, 2023 at 04:28:18PM -0400, Jason Merrill wrote:

On 8/29/23 09:35, Nathaniel Shead wrote:

This is an attempt to improve the constexpr machinery's handling of
union lifetime by catching more cases that cause UB. Is this approach
OK?

I'd also like some feedback on a couple of pain points with this
implementation; in particular, is there a good way to detect if a type
has a non-deleted trivial constructor? I've used 'is_trivially_xible' in
this patch, but that also checks for a trivial destructor which by my
reading of [class.union.general]p5 is possibly incorrect. Checking for a
trivial default constructor doesn't seem too hard but I couldn't find a
good way of checking if that constructor is deleted.


I guess the simplest would be

(TYPE_HAS_TRIVIAL_DFLT (t) && locate_ctor (t))

because locate_ctor returns null for a deleted default ctor.  It would be
good to make this a separate predicate.


I'm also generally unsatisfied with the additional complexity with the
third 'refs' argument in 'cxx_eval_store_expression' being pushed and
popped; would it be better to replace this with a vector of some
specific structure type for the data that needs to be passed on?


Perhaps, but what you have here is fine.  Another possibility would be to
just have a vec of the refs and extract the index from the ref later as
needed.

Jason



Thanks for the feedback. I've kept the refs as-is for now. I've also
cleaned up a couple of other typos I'd had with comments and diagnostics.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

@@ -6192,10 +6197,16 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
  
type = reftype;
  
-  if (code == UNION_TYPE && CONSTRUCTOR_NELTS (*valp)

- && CONSTRUCTOR_ELT (*valp, 0)->index != index)
+  if (code == UNION_TYPE
+ && TREE_CODE (t) == MODIFY_EXPR
+ && (CONSTRUCTOR_NELTS (*valp) == 0
+ || CONSTRUCTOR_ELT (*valp, 0)->index != index))
{
- if (cxx_dialect < cxx20)
+ /* We changed the active member of a union. Ensure that this is
+valid.  */
+ bool has_active_member = CONSTRUCTOR_NELTS (*valp) != 0;
+ tree inner = strip_array_types (reftype);
+ if (has_active_member && cxx_dialect < cxx20)
{
  if (!ctx->quiet)
error_at (cp_expr_loc_or_input_loc (t),


While we're looking at this area, this error message should really 
mention that it's allowed in C++20.



@@ -6205,8 +6216,36 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
  index);
  *non_constant_p = true;
}
- else if (TREE_CODE (t) == MODIFY_EXPR
-  && CONSTRUCTOR_NO_CLEARING (*valp))
+ else if (!is_access_expr
+  || (CLASS_TYPE_P (inner)
+  && !type_has_non_deleted_trivial_default_ctor (inner)))
+   {
+ /* Diagnose changing active union member after initialisation
+without a valid member access expression, as described in
+[class.union.general] p5.  */
+ if (!ctx->quiet)
+   {
+ if (has_active_member)
+   error_at (cp_expr_loc_or_input_loc (t),
+ "accessing %qD member instead of initialized "
+ "%qD member in constant expression",
+ index, CONSTRUCTOR_ELT (*valp, 0)->index);
+ else
+   error_at (cp_expr_loc_or_input_loc (t),
+ "accessing uninitialized member %qD",
+ index);
+ if (is_access_expr)
+   {
+ inform (DECL_SOURCE_LOCATION (index),
+ "%qD does not implicitly begin its lifetime "
+ "because %qT does not have a non-deleted "
+ "trivial default constructor",
+ index, inner);
+   }


The !is_access_expr case could also use an explanatory message.

Also, I notice that this testcase crashes with the patch:

union U { int i; float f; };
constexpr auto g (U u) { return (u.i = 42); }
static_assert (g({.f = 3.14}) == 42);

Jason



Re: [PATCH v8 0/4] P1689R5 support

2023-09-19 Thread Jason Merrill

On 9/1/23 09:04, Ben Boeckel wrote:

Hi,

This patch series adds initial support for ISO C++'s [P1689R5][], a
format for describing C++ module requirements and provisions based on
the source code. This is required because compiling C++ with modules is
not embarrassingly parallel and needs to be ordered to ensure that
`import some_module;` can be satisfied in time by making sure that any
TU with `export import some_module;` is compiled first.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html
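
The ordering requirement described above amounts to a topological sort over "provides" and "requires" information.  The following is a minimal sketch under stated assumptions: `ScanResult` is a hypothetical, simplified view of one scan result (the real P1689R5 format is JSON with more fields), and `compile_order` is an invented helper, not part of GCC or any build tool.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical, simplified scan result for one translation unit.
struct ScanResult
{
  std::string source;			// Source file name.
  std::vector<std::string> provides;	// Modules this TU provides.
  std::vector<std::string> needs;	// Modules this TU imports.
};

// Return the sources in an order where every provider precedes its
// importers (Kahn's algorithm).  Returns an empty vector if an import
// has no provider or the imports are cyclic.
static std::vector<std::string>
compile_order (const std::vector<ScanResult> &tus)
{
  // Map each module name to the index of the TU providing it.
  std::map<std::string, std::size_t> provider;
  for (std::size_t i = 0; i < tus.size (); ++i)
    for (const std::string &m : tus[i].provides)
      provider[m] = i;

  // users[p] lists the TUs that must wait for TU p.
  std::vector<std::vector<std::size_t>> users (tus.size ());
  std::vector<std::size_t> pending (tus.size (), 0);
  for (std::size_t i = 0; i < tus.size (); ++i)
    for (const std::string &m : tus[i].needs)
      {
	std::map<std::string, std::size_t>::const_iterator it
	  = provider.find (m);
	if (it == provider.end ())
	  return {};			// Nothing provides this module.
	users[it->second].push_back (i);
	++pending[i];
      }

  // Start with the TUs that import nothing.
  std::queue<std::size_t> ready;
  for (std::size_t i = 0; i < tus.size (); ++i)
    if (pending[i] == 0)
      ready.push (i);

  std::vector<std::string> order;
  while (!ready.empty ())
    {
      std::size_t i = ready.front ();
      ready.pop ();
      order.push_back (tus[i].source);
      for (std::size_t u : users[i])
	if (--pending[u] == 0)
	  ready.push (u);
    }

  if (order.size () != tus.size ())
    return {};				// Cyclic imports.
  return order;
}
```

A build system would typically feed the scanner output into something like this and then schedule independent TUs in parallel within each "ready" batch.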

I've also added patches to include imported module CMI files and the
module mapper file as dependencies of the compilation. I briefly looked
into adding dependencies on response files as well, but that appeared to
need some code contortions to have a `class mkdeps` available before
parsing the command line or to keep the information around until one was
made.

I'd like feedback on the approach taken here with respect to the
user-visible flags. I'll also note that header units are not supported
at this time because the current `-E` behavior with respect to `import
;` is to search for an appropriate `.gcm` file which is not
something such a "scan" can support. A new mode will likely need to be
created (e.g., replacing `-E` with `-fc++-module-scanning` or something)
where headers are looked up "normally" and processed only as much as
scanning requires.

FWIW, Clang has taken an alternate approach with its `clang-scan-deps`
tool rather than using the compiler directly.

Thanks,

--Ben

---
v7 -> v8:

- rename `DEPS_FMT_` enum variants to `FDEPS_FMT_` to match the
   associated flag
- memory leak fix in the `join` specfunc implementation (also better
   comments), both from Jason
- formatting fix in `mkdeps.cc` for `write_make_modules_deps` assignment
- comments on new functions for P1689R5 implementation


Pushed, thanks!

Jason



Re: [PATCH] RISC-V: Refactor and cleanup fma patterns

2023-09-19 Thread Robin Dapp
Hi Patrick,

thanks for reporting.  Before seeing your message here I already opened a PR
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111488

Regards
 Robin



Re: [PATCH] RISC-V: Support combine cond extend and reduce sum to cond widen reduce sum

2023-09-19 Thread Robin Dapp
Hi Lehua,

thanks for the explanation.

> My current method is still to keep the operand 2 of vcond_mask as a
> register, but the pattern of mov_vec_const_0 is simplified, so that
> the corresponding combine pattern can be more simple. That's the only
> reason I split the vcond_mask into three patterns.

My "problem" with the separate split is that it really sticks out
and everybody seeing it would wonder why we need it.  It's not that
bad of course but it appears as if we messed up somewhere else. 

I checked and I don't see additional FAILs with the vmask pattern
that additionally allows a const0 operand (that is forced into a register)
and a force_reg in abs:VF.

Would you mind re-checking if we can avoid the extra
"vec_duplicate_const_0" by changing the other affected patterns
in a similar manner?  I really didn't verify in-depth so if we needed
to add a force_reg to every pattern we might need to reconsider.
Still, I'd be unsure if I preferred the "vec_dup_const_0" over
additional force_regs ;)

Regards
 Robin


Re: [PATCH] libcpp: Fix ICE on #include after a line marker directive [PR61474]

2023-09-19 Thread Lewis Hyatt
On Tue, Sep 19, 2023 at 1:13 PM Marek Polacek  wrote:
>
> On Tue, Sep 19, 2023 at 06:08:50PM +0100, Richard Sandiford wrote:
> > Lewis Hyatt via Gcc-patches  writes:
> > > Hello-
> > >
> > > This fixes an old PR, bootstrap + regtest on x86-64 Linux. Please let me 
> > > know if it's ok? Thanks!
> > >
> > > -Lewis
> > >
> > > -- >8 --
> > >
> > > > As noted in the PR, GCC will segfault if a file name is first seen in a
> > > > linemarker directive, and then later seen in a normal #include.  This is
> > > > because the fake include process adds the file to the cache with a null
> > > > PATH member. The normal #include finds this file in the cache and then
> > > > attempts to use the null PATH.  Resolve by adding the file to the cache
> > > > with a unique starting directory, so that the fake entry will only be
> > > > found by a subsequent fake include, not by a real one.
> > >
> > > libcpp/ChangeLog:
> > >
> > > PR preprocessor/61474
> > > * files.cc (_cpp_find_file): Set DONT_READ to TRUE for fake
> > > include files.
> > > (_cpp_fake_include): Pass a unique cpp_dir* address so
> > > the fake file will not be found when looked up for real.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR preprocessor/61474
> > > * c-c++-common/cpp/pr61474-2.h: New test.
> > > * c-c++-common/cpp/pr61474.c: New test.
> > > * c-c++-common/cpp/pr61474.h: New test.
> >
> > Neat fix!  I don't know this code very well, but I agree it looks
> > correct.  OK if no-one objects in 24 hours.
>
> Looks fine to me too, thanks Lewis.

Thank you both, much appreciated. I will push it tomorrow evening then.

-Lewis
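
For readers skimming the thread, the idea in the quoted commit message can be sketched roughly as below. This is only an illustration under assumed names (`Dir`, `FileEntry`, `FileCache`); it is not libcpp's actual data structure or the code in `files.cc`.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins for a search directory and a cached file.
struct Dir { std::string name; };
struct FileEntry { std::string path; bool dont_read; };

class FileCache {
 public:
  // Entries are keyed by file name and by the directory the lookup started in.
  FileEntry* find(const std::string& fname, const Dir* start_dir) {
    auto by_name = entries_.find(fname);
    if (by_name == entries_.end()) return nullptr;
    auto it = by_name->second.find(start_dir);
    return it == by_name->second.end() ? nullptr : &it->second;
  }

  // A real #include records the resolved path.
  FileEntry& add_real(const std::string& fname, const Dir* start_dir) {
    FileEntry& e = entries_[fname][start_dir];
    e = FileEntry{start_dir->name + "/" + fname, false};
    return e;
  }

  // A fake include (from a line marker) has no path, so it is stored under a
  // sentinel directory that no real lookup ever starts from.
  FileEntry& add_fake(const std::string& fname) {
    FileEntry& e = entries_[fname][&fake_dir_];
    e = FileEntry{"", true};
    return e;
  }

  const Dir* fake_dir() const { return &fake_dir_; }

 private:
  Dir fake_dir_{"<fake>"};
  std::map<std::string, std::map<const Dir*, FileEntry>> entries_;
};
```

With this keying, a later real lookup of the same name starting from an ordinary directory misses the path-less fake entry and goes on to resolve the file normally.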


Re: [PATCH v14 16/40] c, c++: Use 16 bits for all use of enum rid for more keyword space

2023-09-19 Thread Ken Matsui
On Tue, Sep 19, 2023 at 9:59 AM Jason Merrill  wrote:
>
> On 9/15/23 19:51, Ken Matsui via Gcc-patches wrote:
> > Now that RID_MAX has reached 255, we need to update the bit sizes of every
> > use of the enum rid from 8 to 16 to support more keywords.
>
> Sorry to bring this up so late, but this does raise the question of
> whether we actually want to use keyword space for all these traits that
> will probably be used approximately once in a C++ translation unit.  I
> wonder if it would make sense to instead use e.g. RID_TRAIT for all of
> them and use gperf to look up the specific trait from the identifier?
>

Thank you for your review. To use gperf, we might need to duplicate
the list of all traits defined in cp-trait.def. Modifying the traits
would require us to edit two files, but would it be acceptable?
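
One way to avoid maintaining two lists, sketched here purely as an illustration, is to generate both the enumerators and the identifier lookup from a single X-macro list. The names `EXAMPLE_TRAITS`, `example_trait_kind` and `lookup_trait` are made up for this sketch, and the linear `strcmp` chain merely stands in for a gperf-generated hash; whether cp-trait.def could feed gperf this way is an assumption, not something verified against the tree.

```cpp
#include <cstring>

// Hypothetical single source of truth for the trait list.
#define EXAMPLE_TRAITS(X)        \
  X(IS_CLASS, "__is_class")      \
  X(IS_ENUM, "__is_enum")        \
  X(IS_UNION, "__is_union")

// Expansion 1: one enumerator per trait.
enum example_trait_kind {
#define X(CODE, NAME) EX_TRAIT_##CODE,
  EXAMPLE_TRAITS(X)
#undef X
  EX_TRAIT_NONE
};

// Expansion 2: map an identifier spelling to its trait, or EX_TRAIT_NONE.
inline example_trait_kind lookup_trait(const char* identifier) {
#define X(CODE, NAME) \
  if (std::strcmp(identifier, NAME) == 0) return EX_TRAIT_##CODE;
  EXAMPLE_TRAITS(X)
#undef X
  return EX_TRAIT_NONE;
}
```

Adding a trait then means touching only the list; both expansions pick it up.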

> Jason
>


Re: [PATCH v2 2/2] c++: convert_to_void and volatile references

2023-09-19 Thread Jason Merrill

On 9/19/23 13:53, Patrick Palka wrote:

On Mon, 18 Sep 2023, Jason Merrill wrote:


On 9/18/23 12:12, Patrick Palka wrote:

Jason pointed out that even implicit loads of volatile references need
to undergo lvalue-to-rvalue conversion, but we currently emit a warning
in this case and discard the load.  This patch changes this behavior so
that we don't issue a warning, and preserve the load.
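
A minimal sketch of the kind of code affected (the function name is illustrative and not taken from the new test):

```cpp
// The expression statement below names a volatile object through a
// reference and discards the result. Per the description above, the
// lvalue-to-rvalue conversion still applies, so this counts as one
// volatile read; compilers without this change may warn and drop the load.
inline void read_and_discard(volatile int& device_register) {
  device_register;
}
```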

gcc/cp/ChangeLog:

* cvt.cc (convert_to_void) : Remove warning
for an implicit load of a volatile reference.  Simplify as if
is_reference is false.  Check REFERENCE_REF_P in the test
guarding the -Wunused-value diagnostic.

gcc/testsuite/ChangeLog:

* g++.dg/expr/discarded1a.C: No longer expect warning for
implicit load of a volatile reference.
* g++.old-deja/g++.bugs/900428_01.C: Likewise.
* g++.dg/expr/volatile2.C: New test.
---
   gcc/cp/cvt.cc | 56 ++-
   gcc/testsuite/g++.dg/expr/discarded1a.C   |  1 -
   gcc/testsuite/g++.dg/expr/volatile2.C | 12 
   .../g++.old-deja/g++.bugs/900428_01.C | 26 -
   4 files changed, 30 insertions(+), 65 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/expr/volatile2.C

diff --git a/gcc/cp/cvt.cc b/gcc/cp/cvt.cc
index 4424670356c..1cb6c1222c2 100644
--- a/gcc/cp/cvt.cc
+++ b/gcc/cp/cvt.cc
@@ -1251,12 +1251,9 @@ convert_to_void (tree expr, impl_conv_void implicit, tsubst_flags_t complain)
 {
tree type = TREE_TYPE (expr);
int is_volatile = TYPE_VOLATILE (type);
-   if (is_volatile)
- complete_type (type);
-   int is_complete = COMPLETE_TYPE_P (type);
/* Can't load the value if we don't know the type.  */
-   if (is_volatile && !is_complete)
+   if (is_volatile && !COMPLETE_TYPE_P (complete_type (type)))
 {
   if (complain & tf_warning)
  switch (implicit)
@@ -1298,50 +1295,7 @@ convert_to_void (tree expr, impl_conv_void implicit, tsubst_flags_t complain)
gcc_unreachable ();
}
 }
-   /* Don't load the value if this is an implicit dereference, or if
-  the type needs to be handled by ctors/dtors.  */
-   else if (is_volatile && is_reference)
-  {
-if (complain & tf_warning)
- switch (implicit)
-   {
- case ICV_CAST:
-   warning_at (loc, 0, "conversion to void will not access "
-   "object of type %qT", type);
-   break;
- case ICV_SECOND_OF_COND:
-   warning_at (loc, 0, "implicit dereference will not access "
-   "object of type %qT in second operand of "
-   "conditional expression", type);
-   break;
- case ICV_THIRD_OF_COND:
-   warning_at (loc, 0, "implicit dereference will not access "
-   "object of type %qT in third operand of "
-   "conditional expression", type);
-   break;
- case ICV_RIGHT_OF_COMMA:
-   warning_at (loc, 0, "implicit dereference will not access "
-   "object of type %qT in right operand of "
-   "comma operator", type);
-   break;
- case ICV_LEFT_OF_COMMA:
-   warning_at (loc, 0, "implicit dereference will not access "
-   "object of type %qT in left operand of comma "
-   "operator", type);
-   break;
- case ICV_STATEMENT:
-   warning_at (loc, 0, "implicit dereference will not access "
-   "object of type %qT in statement",  type);
-break;
- case ICV_THIRD_IN_FOR:
-   warning_at (loc, 0, "implicit dereference will not access "
-   "object of type %qT in for increment expression",
-   type);
-   break;
- default:
-   gcc_unreachable ();
-   }
-  }
+   /* Don't load the value if the type needs to be handled by cdtors.  */
else if (is_volatile && TREE_ADDRESSABLE (type))
  {
if (complain & tf_warning)
@@ -1386,7 +1340,7 @@ convert_to_void (tree expr, impl_conv_void implicit, tsubst_flags_t complain)
gcc_unreachable ();
}
  }
-   if (is_reference || !is_volatile || !is_complete || TREE_ADDRESSABLE (type))
+   if (!is_volatile || !COMPLETE_TYPE_P (type))
 {
   /* Emit a warning (if enabled) when the "effect-less" INDIRECT_REF
  operation is stripped off. Note that we don't warn about
@

[PATCH v2 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-09-19 Thread waffl3x
> Thank you, this is great!

Thanks!

> One legal hurdle to start with: our DCO policy
> (https://gcc.gnu.org/dco.html) requires real names in the sign-off, not
> pseudonyms. If you would prefer to contribute under this pseudonym, I
> encourage you to file a copyright assignment with the FSF, who are set
> up to handle that.

I will get on that right away.

> > +/* These need to moved to somewhere appropriate. */
> 
> 
> This isn't a bad spot for these macros, but you could also move them
> down lower, maybe near DECL_THIS_STATIC and DECL_ARRAY_PARAMETER_P for
> some thematic connection.

Sounds good, I will move them down.

> > +/* The flag is a member of base, but the value is meaningless for other
> > + decl types so checking is still justified I imagine. */
> 
> 
> Absolutely, we often reuse bits for other purposes if they're disjoint
> from the use they were added for.

Would it be more appropriate to give it a general name in base instead
then? If so, I can also change that.
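
To make the bit-reuse point concrete for myself, here is a tiny sketch of a shared flag bit guarded by a checked accessor. Everything in it (`Node`, `NodeKind`, `parm_decl_check`) is a made-up miniature, not GCC's tree representation or its `*_CHECK` macros.

```cpp
#include <cassert>

enum class NodeKind { ParmDecl, FunctionDecl };

// Hypothetical miniature of a node with one flag bit that different node
// kinds may use for unrelated purposes.
struct Node {
  NodeKind kind;
  unsigned lang_flag : 1;
};

// Checked access: the flag has this meaning only on parameter nodes, so
// anything else is rejected before the bit is touched.
inline Node& parm_decl_check(Node& n) {
  assert(n.kind == NodeKind::ParmDecl && "flag is only meaningful on parameters");
  return n;
}

inline void set_parm_xobj_flag(Node& n, bool value) {
  parm_decl_check(n).lang_flag = value;
}

inline bool parm_xobj_flag(Node& n) {
  return parm_decl_check(n).lang_flag;
}
```

Because every access goes through the check, the same bit can carry a different meaning on another node kind without the two uses colliding.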

> > +/* Not a lang_decl field, but still specific to c++. */
> > +#define DECL_PARM_XOBJ_FLAG(NODE) \
> > + (PARM_DECL_CHECK (NODE)->decl_common.decl_flag_3)
> 
> 
> Better to use a DECL_LANG_FLAG than claim one of the
> language-independent flags for C++.
> 
> There's a list at the top of cp-tree.h of the uses of LANG_FLAG on
> various kinds of tree node. DECL_LANG_FLAG_4 seems free on PARM_DECL.

Okay, I will switch to that instead. I didn't like using such a general
purpose flag for what is only relevant until the FUNC_DECL is created
and then never again.

If you don't mind answering right now, what are the consequences of
claiming language-independent flags for C++? Or to phrase it
differently, why would this be claiming it for C++? My guess was that
those flags could be used by any front ends and there wouldn't be any
conflicts, as you can't really have crossover between two front ends at
the same time. Or is that the thing, that kind of cross-over is
actually viable and claiming a language independent flag inhibits that
possibility? Like I alluded to, this is kinda off topic from the patch
so feel free to defer the answer to someone else but I just want to
clear up my understanding for the future.

> > + /* Only used for skipping over build_memfn_type, grokfndecl handles
> > + copying the flag to the correct field for a func_decl.
> > + There must be a better way to do this, but it isn't obvious how. */
> > + bool is_xobj_member_function = false;
> > + auto get_xobj_parm = [](tree parm_list)
> 
> 
> I guess you could add a flag to the declarator, but this is fine too.
> Though I'd move this lambda down into the cdk_function case or out to a
> separate function.

Okay, I will move the lambda.

> > case cdk_function:
> > {
> > + tree xobj_parm
> > + = get_xobj_parm (declarator->u.function.parameters);
> > + is_xobj_member_function = xobj_parm;
> 
> 
> I'd also move these down a few lines after the setting of 'raises'.

Will do.
Also, I forgot to mention it anywhere, the diagnostic patch utilizes
xobj_parm which is why it's a separate variable.

> > + /* Set the xobj flag for this parm, unfortunately
> > + I don't think there is a better way to do this. */
> > + DECL_PARM_XOBJ_FLAG (decl)
> > + = decl_spec_seq_has_spec_p (declspecs, ds_this);
> 
> 
> This seems like a fine way to handle this.

Okay good, I had my doubts there.

> > + /* Special case for xobj parm, doesn't really belong up here
> > + (it applies to parm decls and those are mostly handled below
> > + the following specifiers) but I intend to refactor this function
> > + so I'm not worrying about it too much.
> > + The error diagnostics might be better elsewhere though. */
> 
> 
> This seems like a reasonable place for it since 'this' is supposed to
> precede the decl-specifiers, and since we are parsing initial attributes
> here rather than in the caller. You will want to give an error if
> found_decl_spec is set. And elsewhere complain about 'this' on
> parameters after the first (in cp_parser_parameter_declaration_list?),
> or in a non-member/lambda (in grokdeclarator?).
> 
> Jason

Yeah, I separated all the diagnostics out into the second patch. This
patch was meant to include the bare minimum of what was necessary to
get the feature functional. As for the diagnostics patch, I'm not happy
with how scattered about the code base it is, but you'll be able to
judge for yourself when I resubmit that patch, hopefully later today.
So not to worry, I didn't neglect diagnostics, it's just in a follow
up. The v1 of it was submitted on August 31st if you want to find it,
but I wouldn't recommend it. I misunderstood how some things were to be
formatted so it's probably best you just wait for me to finish a v2 of
it.

One last thing, I assume I should clean up the comments and replace
them with more typical ones right? I'm going to go forward with that
assumption in v3, I just want to mention it in advance just in case I
have the wrong idea.

I will get started on v3 of

Re: [Committed] RISC-V: Support VLS unary floating-point patterns

2023-09-19 Thread Patrick O'Neill

Hi,

This patch highlights an issue Edwin and I have been having with the
testsuite where rv64 testcases are run when testing rv32gcv.

There's a large number of new failures in the rv32gcv testsuite from
this seemingly innocuous patch.

https://github.com/ewlu/riscv-gnu-toolchain/issues/166
(The repo is still a WIP - eventually will be non-gating patchworks
pre-commit CI)

From Edwin's and my investigation, the failures for rv32gcv look like [1].
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/gnu/stubs.h:17:11: 
fatal error: gnu/stubs-lp64d.h: No such file or directory

compilation terminated.

Top of the failing testcase:
/* { dg-do compile } */
/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-schedule-insns -fno-schedule-insns2 --param=riscv-autovec-lmul=m8" } */


#include "def.h"

The dg-options explicitly set rv64gcv, so I don't think this testcase
should even be executed.

For the 3 new failures on rv64gcv, they all explicitly set rv32gcv.
/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */

These are seen on non-multilib builds. Multilib rv32/64gc does not
appear to have the same issue when compiling (we're currently testing
multilib rv32/64gcv to see if they encounter issues when executing).

Are other people seeing similar errors/is this a known issue?

Patrick

[1]:
Executing on host: 
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc 
-B/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/ 
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-mul-3.c 
-march=rv32gcv -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output  
-O3 -ftree-vectorize --param riscv-autovec-preference=scalable 
-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-schedule-insns 
-fno-schedule-insns2 --param=riscv-autovec-lmul=m8 -ffat-lto-objects 
-fno-ident -S   -o floating-point-mul-3.s    (timeout = 600)
spawn -ignore SIGHUP 
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/xgcc 
-B/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/build-gcc-linux-stage2/gcc/ 
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-mul-3.c 
-march=rv32gcv -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output 
-O3 -ftree-vectorize --param riscv-autovec-preference=scalable 
-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-schedule-insns 
-fno-schedule-insns2 --param=riscv-autovec-lmul=m8 -ffat-lto-objects 
-fno-ident -S -o floating-point-mul-3.s
In file included from 
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/features.h:515,
 from 
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/bits/libc-header-start.h:33,
 from 
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/math.h:27,
 from 
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/def.h:2,
 from 
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-mul-3.c:4:
/home/runner/work/riscv-gnu-toolchain/riscv-gnu-toolchain/build/sysroot/usr/include/gnu/stubs.h:17:11: 
fatal error: gnu/stubs-lp64d.h: No such file or directory

compilation terminated.
compiler exited with status 1
FAIL: gcc.target/riscv/rvv/autovec/vls/floating-point-mul-3.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable (test for 
excess errors)


On 9/19/23 04:26, Juzhe-Zhong wrote:

Extend current VLA patterns with VLS modes.

Regression all passed.

gcc/ChangeLog:

* config/riscv/autovec.md: Extend VLS modes.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/def.h: Add unary test.
* gcc.target/riscv/rvv/autovec/vls/neg-2.c: New test.

---
  gcc/config/riscv/autovec.md   | 12 ++---
  gcc/config/riscv/vector.md| 20 +++
  .../gcc.target/riscv/rvv/autovec/vls/def.h|  3 +-
  .../gcc.target/riscv/rvv/autovec/vls/neg-2.c  | 52 +++
  4 files changed, 70 insertions(+), 17 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/neg-2.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 769ef6daa36..75ed7ae4f2e 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1031,9 +1031,9 @@
  ;; - vfneg.v/vfabs.v
  ;; 
---
  (define_insn_and_split "2"
-  [(set (match_operand:VF 0 "register_operand")
-(any_float_unop_nofrm:VF
- (match_operand:VF 1 "register_operand")))]
+  [(set (match_operand:V_VLSF 0 "regi

Re: Re: [Committed] RISC-V: Support VLS unary floating-point patterns

2023-09-19 Thread juzhe.zh...@rivai.ai
I haven't seen this issue.
These should be bogus FAILs.
We should either fix the testcases or ignore them.



juzhe.zh...@rivai.ai
 
From: Patrick O'Neill
Date: 2023-09-20 08:34
To: Juzhe-Zhong; Robin Dapp; gcc-patches
CC: kito.cheng; kito.cheng; jeffreyalaw; Palmer Dabbelt; Edwin Lu; 
joern.rennecke; jeremy.bennett; gnu-toolchain
Subject: Re: [Committed] RISC-V: Support VLS unary floating-point patterns

Re: Re: [Committed] RISC-V: Support VLS unary floating-point patterns

2023-09-19 Thread Kito Cheng
This seems to be caused by math.h, a similar issue to the one with stdint.h.
Is math.h necessary for the test case?
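
If the header is only needed for a handful of math functions, one option is to call the compiler builtins directly so the test does not pull in target libc headers. A minimal sketch follows; the function names are hypothetical, and whether def.h actually needs more than this from math.h is an assumption on my part.

```cpp
// No #include <math.h>: GCC and Clang provide these builtins without any
// libc header, so no gnu/stubs-*.h lookup is triggered by the include chain.
inline float example_fabsf(float x) { return __builtin_fabsf(x); }
inline double example_fabs(double x) { return __builtin_fabs(x); }
```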

On Wed, Sep 20, 2023 at 08:44, juzhe.zh...@rivai.ai wrote:

> I didn't see this issue.
> They should be the bogus FAILs.
> We should either fix testcases or ignore them.
