Re: [committed] testsuite: Fix up syntax errors in scan-tree-dump-times target selectors

2023-03-07 Thread Jakub Jelinek via Gcc-patches
On Mon, Mar 06, 2023 at 11:27:16AM +0100, Robin Dapp wrote:
> > This broke the tests, I'm seeing syntax errors:
> > ERROR: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects: error executing dg-final: syntax error in target selector "target !  vect_partial_vectors || vect32  || s390_vx"
> > ERROR: gcc.dg/vect/slp-3.c: error executing dg-final: syntax error in target selector "target !  vect_partial_vectors || vect32  || s390_vx"
> > ERROR: gcc.dg/vect/slp-multitypes-11.c -flto -ffat-lto-objects: error executing dg-final: syntax error in target selector "target vect_unpack && vect_partial_vectors_usage_1 &&  ! s390_vx"
> > ERROR: gcc.dg/vect/slp-multitypes-11.c: error executing dg-final: syntax error in target selector "target vect_unpack && vect_partial_vectors_usage_1 &&  ! s390_vx"
> 
> it appears that we are still missing some braces:
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-3.c b/gcc/testsuite/gcc.dg/vect/slp-3.c
> index a0c6a72995bb..760b3fa35a2a 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-3.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-3.c
> @@ -144,4 +144,4 @@ int main (void)
>  /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { { ! { vect_partial_vectors || vect32 } } || s390_vx } } } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" { target { { vect_partial_vectors || vect32 } && { ! s390_vx } } } } } */
>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" { target { { ! { vect_partial_vectors || vect32 } } || s390_vx } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { target { vect_partial_vectors || vect32 } && { ! s390_vx } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { target { { vect_partial_vectors || vect32 } && { ! s390_vx } } } } } */
> 
> Would you mind double-checking and committing if it's OK?

I'm not a Tcl expert.  I can't reproduce any ERROR with this anymore
on any target, but I think your change is OK.

So please just check it in yourself; you have my ack for it.

> I keep making mistakes with the dejagnu syntax.  I suppose there is no better way
> to test the selector (and regex) syntax than just running an individual test case?

I'll defer that to TCL experts.

Jakub
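
On the question of checking selector syntax without a full test run: a rough pre-check is possible outside DejaGnu.  The sketch below is not the DejaGnu or GCC testsuite parser.  It only encodes the one rule that caused the errors in this thread, namely that `!`, `&&` and `||` have to sit inside a braced group instead of directly after `target`.  The function name is made up for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* SELECTOR is the text that follows the "target" keyword.
   Return true if its braces balance and no logical operator
   ("!", "&&", "||") appears outside of a braced group.
   This is a simplified check, not the real selector grammar.  */
bool
selector_ops_are_braced (const char *selector)
{
  int depth = 0;

  for (const char *p = selector; *p; p++)
    {
      if (*p == '{')
        depth++;
      else if (*p == '}')
        {
          /* A closing brace without a matching opener.  */
          if (--depth < 0)
            return false;
        }
      else if (depth == 0
               && (*p == '!'
                   || strncmp (p, "&&", 2) == 0
                   || strncmp (p, "||", 2) == 0))
        /* Operator at top level: the expression lacks its outer braces.  */
        return false;
    }

  return depth == 0;
}
```

With this, the selector from the removed line of the slp-3.c diff, `{ vect_partial_vectors || vect32 } && { ! s390_vx }`, is rejected, and the replacement with the extra pair of braces is accepted.  It does not verify that each braced group uses only a single operator, so a clean result still needs confirming by running the test itself.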



Re: [PATCH 1/2] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-07 Thread Richard Biener via Gcc-patches
On Tue, 7 Mar 2023, Xionghu Luo wrote:

> 
> 
> On 2023/3/6 16:11, Richard Biener wrote:
> > On Mon, Mar 6, 2023 at 8:22 AM Xionghu Luo  wrote:
> >>
> >>
> >>
> >> On 2023/3/2 18:45, Richard Biener wrote:
> 
> 
>  small.gcno:  648:  block 2:`small.c':1, 3, 4, 6
>  small.gcno:  688:0145:  36:LINES
>  small.gcno:  700:  block 3:`small.c':8, 9
>  small.gcno:  732:0145:  32:LINES
>  small.gcno:  744:  block 5:`small.c':10
>  -small.gcno:  772:0145:  32:LINES
>  -small.gcno:  784:  block 6:`small.c':12
>  -small.gcno:  812:0145:  36:LINES
>  -small.gcno:  824:  block 7:`small.c':12, 13
>  +small.gcno:  772:0145:  36:LINES
>  +small.gcno:  784:  block 6:`small.c':12, 13
>  +small.gcno:  816:0145:  32:LINES
>  +small.gcno:  828:  block 8:`small.c':14
>  small.gcno:  856:0145:  32:LINES
>  -small.gcno:  868:  block 8:`small.c':14
>  -small.gcno:  896:0145:  32:LINES
>  -small.gcno:  908:  block 9:`small.c':17
>  +small.gcno:  868:  block 9:`small.c':17
> >>>
> >>> Looking at the CFG and the instrumentation shows
> >>>
> >>>  :
> >>> PROF_edge_counter_17 = __gcov0.f[0];
> >>> PROF_edge_counter_18 = PROF_edge_counter_17 + 1;
> >>> __gcov0.f[0] = PROF_edge_counter_18;
> >>> [t.c:3:7] p_6 = 0;
> >>> [t.c:5:3] switch (s_7(D))  [INV], [t.c:7:5] case 0:
> >>>  [INV], [t.c:11:5] case 1:  [INV]>
> >>>
> >>>  :
> >>> # n_1 = PHI 
> >>> # p_3 = PHI <[t.c:3:7] p_6(2), [t.c:8:15] p_12(4)>
> >>> [t.c:7:5] :
> >>> [t.c:8:15] p_12 = p_3 + 1;
> >>> [t.c:8:28] n_13 = n_1 + -1;
> >>> [t.c:8:28] if (n_13 != 0)
> >>>   goto ; [INV]
> >>> else
> >>>   goto ; [INV]
> >>>
> >>>  :
> >>> PROF_edge_counter_21 = __gcov0.f[2];
> >>> PROF_edge_counter_22 = PROF_edge_counter_21 + 1;
> >>> __gcov0.f[2] = PROF_edge_counter_22;
> >>> [t.c:7:5] goto ; [100.00%]
> >>>
> >>>  :
> >>> PROF_edge_counter_23 = __gcov0.f[3];
> >>> PROF_edge_counter_24 = PROF_edge_counter_23 + 1;
> >>> __gcov0.f[3] = PROF_edge_counter_24;
> >>> [t.c:9:16] _14 = p_12;
> >>> [t.c:9:16] goto ; [INV]
> >>>
> >>> so the reason this goes wrong is that gcov associates the "wrong"
> >>> counter with the block containing the 'case' label(s): for case 0 it
> >>> should have chosen the counter from bb5, but it likely computed the
> >>> count of bb3?
> >>>
> >>> It might be that ordering blocks differently puts the instrumentation
> >>> into different blocks, or makes gcov's association choose different
> >>> blocks, but that means it's just luck and not a fix for the actual issue?
> >>>
> >>> To me it looks like the correct thing to investigate is switch
> >>> statement and/or case label handling.  One can also see that  having
> >>> line number 7 is wrong to the extent that the position of the label
> >>> doesn't match the number of times it executes in the source.  So
> >>> placement of the label is wrong here, possibly caused by CFG cleanup
> >>> after CFG build (but generally labels are not used for anything once
> >>> the CFG is built and coverage instrumentation is late, so it might fail
> >>> due to us moving labels).  It might be OK to avoid moving labels for
> >>> --coverage, but then coverage should possibly look at edges rather
> >>> than labels?
> >>>
> >>
> >> Thanks, I investigated the labels.  They already look wrong very early,
> >> between .gimple and .cfg, much like PR90574:
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90574
> >>
> >> .gimple:
> >>
> >> int f (int s, int n)
> >> [small.c:2:1] {
> >> int D.2755;
> >> int p;
> >>
> >> [small.c:3:7] p = 0;
> >> [small.c:5:3] switch (s) , [small.c:7:5] case 0:
> >> , [small.c:11:5] case 1: >
> >> [small.c:7:5] :  <= case label
> >> :<= loop label
> >> [small.c:8:13] p = p + 1;
> >> [small.c:8:26] n = n + -1;
> >> [small.c:8:26] if (n != 0) goto ; else goto ;
> >> :
> >> [small.c:9:14] D.2755 = p;
> >> [small.c:9:14] return D.2755;
> >> [small.c:11:5] :
> >> :
> >> [small.c:12:13] p = p + 1;
> >> [small.c:12:26] n = n + -1;
> >> [small.c:12:26] if (n != 0) goto ; else goto ;
> >> :
> >> [small.c:13:14] D.2755 = p;
> >> [small.c:13:14] return D.2755;
> >> :
> >> [small.c:16:10] D.2755 = 0;
> >> [small.c:16:10] return D.2755;
> >> }
> >>
> >> .cfg:
> >>
> >> int f (int s, int n)
> >> {
> >> int p;
> >> int D.2755;
> >>
> >>  :
> >> [small.c:3:7] p = 0;
> >> [small.c:5:3] switch (s)  [INV], [small.c:7:5] case 0:
> >>  [INV], [small.c:11:5] case 1:  [INV]>
> >>
> >>  :
> >> [small.c:7:5] : 
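
For readers following the dumps: the line and column locations in the .gimple dump above are consistent with a test case of the following shape.  This is a reconstruction from those locations, not a file quoted in this message, so the exact layout is an assumption.

```c
/* Reconstructed sketch of small.c, inferred from the [small.c:LINE:COL]
   locations in the dump; the original file may differ in details.  */
int f (int s, int n)
{
  int p = 0;

  switch (s)
  {
    case 0:
      do { p++; } while (--n);
      return p;

    case 1:
      do { p++; } while (--n);
      return p;
  }

  return 0;
}
```

Each `case` label is immediately followed by the head of a do-while loop, so the block holding the label is also the target of the loop backedge.  That is the situation in which the label line and the loop body are discussed above as getting the same count.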

[PATCHv3, gfortran] Escalate failure when Hollerith constant to real conversion fails [PR103628]

2023-03-07 Thread HAO CHEN GUI via Gcc-patches
Hi,
  The patch escalates the failure when Hollerith constant to real conversion
fails in native_interpret_expr. It finally reports a "Cannot simplify
expression" error in the do_simplify function.

  The patch for pr95450 added a decoding/encoding round-trip verification
to native_interpret_expr. native_interpret_expr may therefore fail on real
type conversion and return a NULL tree. The upper-layer callers don't handle
that failure, so an ICE is reported when the verification fails.

  IBM long double is an example. It doesn't have a unique memory representation
for some real values, so it may not pass the verification. The new test
case shows the problem.
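
To illustrate why a non-unique representation fails a decode/encode check, here is a toy model.  It is not the GCC code: the double-double is modelled as a plain pair of doubles, the value is computed as hi + lo in ordinary double arithmetic (which is only exact for simple inputs), and all names are invented.

```c
#include <stdbool.h>
#include <string.h>

/* Toy model of a double-double: the value is hi + lo.  */
struct toy_dd
{
  double hi;
  double lo;
};

/* Decode a byte buffer into a value (toy: plain double addition).  */
static double
toy_decode (const unsigned char *buf)
{
  struct toy_dd d;
  memcpy (&d, buf, sizeof d);
  return d.hi + d.lo;
}

/* Encode a value in the one canonical form this toy model knows.  */
static void
toy_encode (double value, unsigned char *buf)
{
  struct toy_dd d = { value, 0.0 };
  memcpy (buf, &d, sizeof d);
}

/* Interpret BUF as a value, but fail if re-encoding the decoded value
   does not reproduce the same bytes.  */
bool
toy_interpret_float (const unsigned char *buf, double *out)
{
  unsigned char again[sizeof (struct toy_dd)];
  double value = toy_decode (buf);

  toy_encode (value, again);
  if (memcmp (buf, again, sizeof again) != 0)
    return false;

  *out = value;
  return true;
}
```

In this model the pairs (2.0, 0.0) and (1.0, 1.0) both denote 2.0, but only the first survives the round trip.  A Hollerith constant supplies arbitrary bytes, so it can easily land on a bit pattern of the second kind, and the caller then has to cope with the failure.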

  errorcount is used to check whether an error has already been reported when
a bad expr is returned. Cases with buffered errors need to be excluded too, as
buffered errors don't increase errorcount.

  The patch passed regression tests on Power and x86 Linux platforms.

Thanks
Gui Haochen

ChangeLog
2023-03-07  Haochen Gui 

gcc/
PR target/103628
* fortran/target-memory.cc (gfc_interpret_float): Return FAIL when
native_interpret_expr gets a NULL tree.
* fortran/arith.cc (gfc_hollerith2real): Return NULL when
gfc_interpret_float fails.
* fortran/error.cc (gfc_buffered_p): Define.
* fortran/gfortran.h (gfc_buffered_p): Declare.
* fortran/intrinsic.cc: Add diagnostic.h to include list.
(do_simplify): Save errorcount and check it at finish.  Report a
"Cannot simplify expression" error on a bad result if error count
doesn't change and no other errors buffered.

gcc/testsuite/
PR target/103628
* gfortran.dg/pr103628.f90: New.

Co-Authored-By: Tobias Burnus 

patch.diff
diff --git a/gcc/fortran/arith.cc b/gcc/fortran/arith.cc
index c0d12cfad9d..d3d38c7eb6a 100644
--- a/gcc/fortran/arith.cc
+++ b/gcc/fortran/arith.cc
@@ -2752,10 +2752,12 @@ gfc_hollerith2real (gfc_expr *src, int kind)
   result = gfc_get_constant_expr (BT_REAL, kind, &src->where);

   hollerith2representation (result, src);
-  gfc_interpret_float (kind, (unsigned char *) result->representation.string,
-  result->representation.length, result->value.real);
-
-  return result;
+  if (gfc_interpret_float (kind,
+  (unsigned char *) result->representation.string,
+  result->representation.length, result->value.real))
+return result;
+  else
+return NULL;
 }

 /* Convert character to real.  The constant will be padded or truncated.  */
diff --git a/gcc/fortran/error.cc b/gcc/fortran/error.cc
index 214fb78ba7b..872d42e731e 100644
--- a/gcc/fortran/error.cc
+++ b/gcc/fortran/error.cc
@@ -49,6 +49,13 @@ static gfc_error_buffer error_buffer;
 static output_buffer *pp_error_buffer, *pp_warning_buffer;
 static int warningcount_buffered, werrorcount_buffered;

+/* Return buffered_p.  */
+bool
+gfc_buffered_p (void)
+{
+  return buffered_p;
+}
+
 /* Return true if there output_buffer is empty.  */

 static bool
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 219ef8c7612..edfe11796a6 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3328,6 +3328,7 @@ void gfc_internal_error (const char *, ...) ATTRIBUTE_NORETURN ATTRIBUTE_GCC_GFC
 void gfc_clear_error (void);
 bool gfc_error_check (void);
 bool gfc_error_flag_test (void);
+bool gfc_buffered_p (void);

 notification gfc_notification_std (int);
 bool gfc_notify_std (int, const char *, ...) ATTRIBUTE_GCC_GFC(2,3);
diff --git a/gcc/fortran/intrinsic.cc b/gcc/fortran/intrinsic.cc
index e89131f5a71..9d049001a51 100644
--- a/gcc/fortran/intrinsic.cc
+++ b/gcc/fortran/intrinsic.cc
@@ -25,6 +25,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "options.h"
 #include "gfortran.h"
 #include "intrinsic.h"
+#include "diagnostic.h" /* For errorcount.  */

 /* Namespace to hold the resolved symbols for intrinsic subroutines.  */
 static gfc_namespace *gfc_intrinsic_namespace;
@@ -4620,6 +4621,7 @@ do_simplify (gfc_intrinsic_sym *specific, gfc_expr *e)
 {
   gfc_expr *result, *a1, *a2, *a3, *a4, *a5, *a6;
   gfc_actual_arglist *arg;
+  int old_errorcount = errorcount;

   /* Max and min require special handling due to the variable number
  of args.  */
@@ -4708,7 +4710,12 @@ do_simplify (gfc_intrinsic_sym *specific, gfc_expr *e)

 finish:
   if (result == &gfc_bad_expr)
-return false;
+{
+  if (errorcount == old_errorcount
+ && (gfc_buffered_p () && !gfc_error_flag_test ()))
+   gfc_error ("Cannot simplify expression at %L", &e->where);
+  return false;
+}

   if (result == NULL)
 resolve_intrinsic (specific, e);   /* Must call at run-time */
diff --git a/gcc/fortran/target-memory.cc b/gcc/fortran/target-memory.cc
index 7ce7d736629..0c47aa6b842 100644
--- a/gcc/fortran/target-memory.cc
+++ b/gcc/fortran/target-memory.cc
@@ -416,11 +416,14 @@ gfc_interpret_float (int kind, unsigned char *buffer, size_t buffer_size,
 mp

[PATCH] c++: Fix up ICE in emit_support_tinfo_1 [PR109042]

2023-03-07 Thread Jakub Jelinek via Gcc-patches
Hi!

In my recent rtti.cc change I assumed when emitting the support tinfos
that the tinfos for the fundamental types haven't been created yet.
Normally (in libsupc++.a (fundamental_type_info.o)) that is the case,
but as the testcase shows, one can violate it by using typeid
etc. in the same TU before the ~__fundamental_type_info ()
definition.

The following patch fixes that by popping from unemitted_tinfo_decls
only in the normal case when it is there, and treating non-NULL
DECL_INITIAL on a tinfo node as indication that emit_tinfo_decl has
processed it already.
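
The two halves of the fix can be pictured with a toy queue.  This is only a sketch with invented names, not the rtti.cc or decl2.cc code: `has_initial` stands in for a non-NULL DECL_INITIAL, and the emit function always succeeds.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_tinfo
{
  bool has_initial;   /* Stands in for DECL_INITIAL != NULL.  */
  int emit_count;     /* How often the toy emit function ran.  */
};

struct toy_queue
{
  struct toy_tinfo *items[8];
  size_t len;
};

static bool
toy_emit (struct toy_tinfo *t)
{
  t->has_initial = true;
  t->emit_count++;
  return true;
}

/* Emit T right away; pop it from the queue only if it happens to be
   the last element, instead of asserting that it is.  */
void
toy_emit_support (struct toy_queue *q, struct toy_tinfo *t)
{
  if (!t->has_initial)
    {
      toy_emit (t);
      if (q->len > 0 && q->items[q->len - 1] == t)
        q->len--;
    }
}

/* Final sweep: entries that already have an initializer are dropped
   without being emitted a second time.  */
void
toy_final_cleanups (struct toy_queue *q)
{
  for (size_t i = q->len; i-- > 0;)
    if (q->items[i]->has_initial || toy_emit (q->items[i]))
      q->items[i] = q->items[--q->len];   /* Unordered remove.  */
}
```

If an entry was queued before some other entry (the typeid-before-destructor case), it is not last, stays in the queue after being emitted, and is later skipped by the final sweep instead of being emitted again.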

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-03-07  Jakub Jelinek  

PR c++/109042
* rtti.cc (emit_support_tinfo_1): Don't assert that last
unemitted_tinfo_decls element is tinfo, instead pop from it only in
that case.
* decl2.cc (c_parse_final_cleanups): Don't call emit_tinfo_decl
for unemitted_tinfo_decls which already have non-NULL DECL_INITIAL.

* g++.dg/rtti/pr109042.C: New test.

--- gcc/cp/rtti.cc.jj   2023-03-03 00:34:52.028567946 +0100
+++ gcc/cp/rtti.cc  2023-03-06 19:06:27.433307136 +0100
@@ -1581,10 +1581,10 @@ emit_support_tinfo_1 (tree bltn)
   /* Emit it right away if not emitted already.  */
   if (DECL_INITIAL (tinfo) == NULL_TREE)
{
- gcc_assert (unemitted_tinfo_decls->last () == tinfo);
  bool ok = emit_tinfo_decl (tinfo);
  gcc_assert (ok);
- unemitted_tinfo_decls->pop ();
+ if (unemitted_tinfo_decls->last () == tinfo)
+   unemitted_tinfo_decls->pop ();
}
 }
 }
--- gcc/cp/decl2.cc.jj  2023-01-18 16:11:47.053213397 +0100
+++ gcc/cp/decl2.cc 2023-03-06 19:07:16.830582984 +0100
@@ -4982,7 +4982,7 @@ c_parse_final_cleanups (void)
 get emitted.  */
   for (i = unemitted_tinfo_decls->length ();
   unemitted_tinfo_decls->iterate (--i, &t);)
-   if (emit_tinfo_decl (t))
+   if (DECL_INITIAL (t) || emit_tinfo_decl (t))
  {
reconsider = true;
unemitted_tinfo_decls->unordered_remove (i);
--- gcc/testsuite/g++.dg/rtti/pr109042.C.jj 2023-03-06 19:11:06.995208812 +0100
+++ gcc/testsuite/g++.dg/rtti/pr109042.C 2023-03-06 19:10:59.117324298 +0100
@@ -0,0 +1,20 @@
+// PR c++/109042
+// { dg-do compile }
+
+namespace std { class type_info {}; }
+
+std::type_info
+foo ()
+{
+  return typeid (void);
+}
+
+namespace __cxxabiv1 {
+  struct __fundamental_type_info {
+virtual ~__fundamental_type_info ();
+  };
+
+  __fundamental_type_info::~__fundamental_type_info ()
+  {
+  }
+}

Jakub



[RFC] RISC-V: Support risc-v bfloat16 This patch support bfloat16 in riscv like x86_64 and arm.

2023-03-07 Thread Liao Shihua
   According to https://github.com/riscv/riscv-bfloat16 , the zfbfmin extension
depends on the zfh/zfhmin extension.

   According to the discussion in
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/367, we use __bf16
and use DF16b in riscv_mangle_type like x86.
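
As background for the truncsfbf2/extendbfsf2 patterns listed below, this is roughly what a float to bfloat16 conversion and back computes.  It is a portable sketch, not code from the patch: it assumes bfloat16 is the upper 16 bits of an IEEE binary32 value, assumes round-to-nearest-even on truncation, and uses invented function names.

```c
#include <stdint.h>
#include <string.h>

/* Narrow a binary32 value to bfloat16 bits (round to nearest, ties to even).  */
uint16_t
toy_float_to_bf16 (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);

  /* NaN: keep the top bits and force a mantissa bit so it stays a NaN.  */
  if ((bits & 0x7fffffffu) > 0x7f800000u)
    return (uint16_t) ((bits >> 16) | 0x0040u);

  /* Add half of the discarded range, plus one more if the kept part is odd.  */
  bits += 0x7fffu + ((bits >> 16) & 1u);
  return (uint16_t) (bits >> 16);
}

/* Widen bfloat16 bits to binary32; this direction is exact.  */
float
toy_bf16_to_float (uint16_t h)
{
  uint32_t bits = (uint32_t) h << 16;
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}
```

Because bfloat16 keeps the full 8-bit exponent of binary32, widening is a plain shift, which is different from the half-precision case where the exponent has to be rebiased.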


gcc\ChangeLog:

* common/config/riscv/riscv-common.cc: Add ZFBFMIN extension.
* config/riscv/iterators.md (TARGET_ZFHMIN):Add iterator BF.
(fld):Likewise.
(ld):Likewise
(fsd):Likewise
(sd):Likewise
(d):Likewise
(DF):Likewise
* config/riscv/riscv-builtins.cc (riscv_init_builtin_types): Add 
bfloat16 type in riscv .
(riscv_fp16_builtin_type):Likewise
(riscv_bf16_builtin_type):Likewise
* config/riscv/riscv-modes.def (FLOAT_MODE):Likewise
(ADJUST_FLOAT_FORMAT):Likewise
* config/riscv/riscv-opts.h (MASK_ZFBFMIN):Add ZFBFMIN extension.
(TARGET_ZFBFMIN):
* config/riscv/riscv-vector-switch.def (ENTRY):
* config/riscv/riscv.cc (riscv_emit_float_compare):Add bfloat16 type in 
riscv .
(riscv_mangle_type):
(riscv_scalar_mode_supported_p):
(riscv_libgcc_floating_mode_supported_p):
(riscv_init_libfuncs):
* config/riscv/riscv.md (mode" ):Add bfloat16 type in riscv .
(truncdfhf2):
(truncsfbf2):
(truncdf2):
(extendbfsf2):
(extendhfdf2):
(extenddf2):
(movbf):
(*movbf_hardfloat):
(*movbf_softfloat):

libgcc\ChangeLog:

* config/riscv/sfp-machine.h (_FP_NANFRAC_B):Add bfloat16 type in riscv 
.
(_FP_NANSIGN_B):
* config/riscv/t-softfp32:

---
 gcc/common/config/riscv/riscv-common.cc  |  4 ++
 gcc/config/riscv/iterators.md| 21 +---
 gcc/config/riscv/riscv-builtins.cc   | 29 +-
 gcc/config/riscv/riscv-modes.def |  2 +
 gcc/config/riscv/riscv-opts.h|  2 +
 gcc/config/riscv/riscv-vector-switch.def |  4 +-
 gcc/config/riscv/riscv.cc| 37 +++--
 gcc/config/riscv/riscv.md| 69 
 libgcc/config/riscv/sfp-machine.h|  3 ++
 libgcc/config/riscv/t-softfp32   |  8 +--
 10 files changed, 150 insertions(+), 29 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index ebc1ed7d7e4..2b3ff1f5b8e 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -102,6 +102,8 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"zvl32768b", "zvl16384b"},
   {"zvl65536b", "zvl32768b"},
 
+  {"zfbfmin", "zfhmin"},
+
   {"zfh", "zfhmin"},
   {"zfhmin", "f"},
   
@@ -1239,6 +1241,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"zvl16384b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL16384B},
   {"zvl32768b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL32768B},
   {"zvl65536b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL65536B},
+  
+  {"zfbfmin",&gcc_options::x_riscv_zf_subext, MASK_ZFBFMIN},
 
   {"zfhmin",&gcc_options::x_riscv_zf_subext, MASK_ZFHMIN},
   {"zfh",   &gcc_options::x_riscv_zf_subext, MASK_ZFH},
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 5b70ab20758..6349f032bc8 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -61,10 +61,15 @@
 ;; Iterator for hardware-supported floating-point modes.
 (define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT || TARGET_ZFINX")
(DF "TARGET_DOUBLE_FLOAT || TARGET_ZDINX")
-   (HF "TARGET_ZFH || TARGET_ZHINX")])
+   (HF "TARGET_ZFH || TARGET_ZHINX") 
+   (BF "TARGET_ZFBFMIN")])
+
+;; Iterator for HImode constant generation.
+(define_mode_iterator BFHF [BF HF])
 
 ;; Iterator for floating-point modes that can be loaded into X registers.
-(define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")])
+(define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")
+   (BF "TARGET_ZFBFMIN")])
 
 
 ;; ---
@@ -76,27 +81,27 @@
 (define_mode_attr size [(QI "b") (HI "h")])
 
 ;; Mode attributes for loads.
-(define_mode_attr load [(QI "lb") (HI "lh") (SI "lw") (DI "ld") (HF "flh") (SF "flw") (DF "fld")])
+(define_mode_attr load [(QI "lb") (HI "lh") (SI "lw") (DI "ld") (BF "flh") (HF "flh") (SF "flw") (DF "fld")])
 
 ;; Instruction names for integer loads that aren't explicitly sign or zero
 ;; extended.  See riscv_output_move and LOAD_EXTEND_OP.
 (define_mode_attr default_load [(QI "lbu") (HI "lhu") (SI "lw") (DI "ld")])
 
 ;; Mode attribute for FP loads into integer registers.
-(define_mode_attr softload [(HF "lh") (SF "lw") (DF "ld")])
+(define_mode_attr softload [(BF "lh") (HF "lh") (SF "lw") (DF "ld")])
 
 ;; Instruction names for stores.
-(define_mode_

Re: [PATCH 1/2] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-07 Thread Xionghu Luo via Gcc-patches




On 2023/3/7 16:53, Richard Biener wrote:

On Tue, 7 Mar 2023, Xionghu Luo wrote:



Unfortunately this change (flag_test_coverage -> !optimize) caused hundreds
of gfortran test cases to fail at execution with -O0.  Take gfortran.dg/index.f90 for
example:

.gimple:

__attribute__((fn spec (". ")))
void p ()
[/data/RocksDB_Docker/tgcc-master/gcc/testsuite/gfortran.dg/index_4.f90:6:9] {
   
[/data/RocksDB_Docker/tgcc-master/gcc/testsuite/gfortran.dg/index_4.f90:13:28]
   L.1:
   
[/data/RocksDB_Docker/tgcc-master/gcc/testsuite/gfortran.dg/index_4.f90:14:28]
   L.2:
   
[/data/RocksDB_Docker/tgcc-master/gcc/testsuite/gfortran.dg/index_4.f90:15:28]
   L.3:
   
[/data/RocksDB_Docker/tgcc-master/gcc/testsuite/gfortran.dg/index_4.f90:16:28]
   L.4:
   
[/data/RocksDB_Docker/tgcc-master/gcc/testsuite/gfortran.dg/index_4.f90:17:28]
   L.5:
   
[/data/RocksDB_Docker/tgcc-master/gcc/testsuite/gfortran.dg/index_4.f90:18:72]
   L.6:
}

.cfg:

...
Removing basic block 7
;; basic block 7, loop depth 0
;;  pred:
return;
;;  succ:   EXIT


;; 1 loops found
;;
;; Loop 0
;;  header 0, latch 1
;;  depth 0, outer -1
;;  nodes: 0 1 2
;;2 succs { }
__attribute__((fn spec (". ")))
void p ()
{
:

}

This is because the "return;" in bb 7 is removed.


OK, the issue is that make_edges_bb does nothing for an empty block
but it should at least create a fallthru edge here.  Thus,

   if (!last)
 fallthru = true;

   else
 switch (gimple_code (last))
   {

instead of simply returning if (!last).  The alternative would be
to make sure that cleanup_dead_labels preserves at least one
statement in a block.
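
As a toy model of the suggestion above (this is not GCC's make_edges_bb; the types and names are invented), an edge builder that skips empty blocks leaves the blocks behind them unreachable, while giving an empty block a fallthru edge keeps the chain connected:

```c
#include <stdbool.h>
#include <stddef.h>

/* Kind of the last statement in a toy block; LAST_NONE is an empty block.  */
enum toy_last { LAST_NONE, LAST_PLAIN, LAST_RETURN };

struct toy_block
{
  enum toy_last last;
  bool fallthru;   /* Edge to the next block in layout order.  */
};

/* Create fallthru edges.  HANDLE_EMPTY selects between the old behaviour
   (do nothing for an empty block) and the proposed one.  */
void
toy_make_edges (struct toy_block *bbs, size_t n, bool handle_empty)
{
  for (size_t i = 0; i < n; i++)
    switch (bbs[i].last)
      {
      case LAST_NONE:
        bbs[i].fallthru = handle_empty;
        break;
      case LAST_PLAIN:
        bbs[i].fallthru = true;
        break;
      case LAST_RETURN:
        bbs[i].fallthru = false;
        break;
      }
}

/* True if block TARGET is reachable from block 0 via fallthru edges.  */
bool
toy_reaches (const struct toy_block *bbs, size_t target)
{
  for (size_t i = 0; i < target; i++)
    if (!bbs[i].fallthru)
      return false;
  return true;
}
```

With three empty blocks in front of a block that ends in a return, the old behaviour leaves the return block without a predecessor, which matches the "Removing basic block 7" output in the dump.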

Looking at the testcases I wonder if preserving all the fallthru labels
is really necessary - for coverage we should have a counter ready.  For
the testcase we arrive with

L.1:
L.2:
L.3:
L.4:
i = 1;


It was:

 :

 :
L.1:

 :
L.2:

 :
L.3:

 :
L.4:

 :
L.5:

 :
L.6:
return;

 :

before the second call of cleanup_dead_labels.  After it, all labels are
removed, and then tree_forwarder_block_p removes all forwarders.  Yes, it
creates blocks and removes them again immediately...



where the frontend simplified things but put labels at each line.
I suppose we could optimize this by re-computing TREE_USED and only
splitting before labels reached by a control statement?  That would
cover the backedge case in the original testcase.  cleanup_dead_labels
does something like that already.


Actually, in build_gimple_cfg, cleanup_dead_labels first removes all labels L.1
to L.6, then make_edges fails to create edges for  to  because they
are all EMPTY bbs in make_edges_bb...
  


   240│   /* To speed up statement iterator walks, we first purge dead labels.  */
   241│   cleanup_dead_labels ();
   242│
   243│   /* Group case nodes to reduce the number of edges.
   244│  We do this after cleaning up dead labels because otherwise we miss
   245│  a lot of obvious case merging opportunities.  */
   246│   group_case_labels ();
   247│
   248│   /* Create the edges of the flowgraph.  */
   249│   discriminator_per_locus = new hash_table (13);
   250├>  make_edges ();


 :

 :

 :

 :

 :

 :

 :
return;

 :


This looks like a deadlock for setting goto_locus as you suggested, because
labels are removed before edges are created.  The test case passes if I
comment out the call to cleanup_dead_labels (), so should we also skip that
call when !optimize?

if (!!optimize)
  cleanup_dead_labels ();


That probably makes sense.  Looking at group_case_labels () that also
seems to do unwanted things (to debugging and coverage), its comment
says that for

  switch (i)
  {
  case 1:
/* fallthru */
  case 2:
/* fallthru */
  case 3:
k = 0;

it would replace that with

  case 1..3:
k = 0;

but that also fails to produce correct coverage, right?  Likewise
setting breakpoints.


Yes.  Should also exclude this.
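
To make the coverage concern concrete, here is a small sketch with hand-written counters standing in for per-line gcov counts.  The counters and names are invented for illustration; the assumption is that the expected count for a `case` line is the number of times control passes that label.

```c
/* Hypothetical per-label hit counters, one per case line.  */
static unsigned toy_hits[4];
static int toy_k = -1;

void
toy_dispatch (int i)
{
  switch (i)
    {
    case 1:
      toy_hits[1]++;
      /* fallthru */
    case 2:
      toy_hits[2]++;
      /* fallthru */
    case 3:
      toy_hits[3]++;
      toy_k = 0;
    }
}
```

Calling it once with 1 and twice with 3 gives counts of 1, 1 and 3 for the three labels.  Once the labels are merged into a single `case 1..3:`, there is only one place left to hang a count (or a breakpoint) on, so the three lines can no longer be told apart.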



Does preserving the labels help setting a goto_locus for the
fallthru edges?  I don't see any code doing that, so
CFG cleanup will remove the forwarders we created again.


For the backedge case with switch-case-do-while, tree_forwarder_block_p
returns false in its statement-check loop.
The newly created  with only one case label statement still carries
location information, so CFG cleanup won't remove the forwarders.

 390│   for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi))
 391│ {
 392│   gimple *stmt = gsi_stmt (gsi);
 393│
 394│   switch (gimple_code (stmt))
 395│ {
 396│ case GIMPLE_LABEL:
 397│   if (DECL_NONLOCAL (gimple_label_label (as_a <glabel *> (stmt))))
 398│ return false;
 399│   if (!optimize
 400│   && (gimple_has_location (stmt)
 401│   || LOCATION_LOCUS (locus) != UNKNOWN_LOCATION)
 402│   && gimple_location (stmt) != locus)
 403├>return false;
 404│   break;


(gdb) ps stmt
:
(gdb) p gimple_location (stmt)
$154 = 2147483656
(gdb) pel $154
{file = 0x3e41af0 "small.c", line = 7, column = 5, data = 0x76f80420, sysp = false}
(gdb)
(gdb) pbb bb
;; basic block 3, loop de

[PATCH] tree-optimization/109046 - re-combine complex loads

2023-03-07 Thread Richard Biener via Gcc-patches
The following addresses PR109046 by adding an optimization to forwprop
to combine a piecewise complex load to a complex load when there are
no uses of the components.  That's something useful in general and
easier to do than avoiding the splitting in complex lowering.

The testcase exercises both the manual and the complex lowering case.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

PR tree-optimization/109046
* tree-ssa-forwprop.cc (pass_forwprop::execute): Combine
piecewise complex loads.

* gcc.dg/tree-ssa/forwprop-39.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/forwprop-39.c | 15 ++
 gcc/tree-ssa-forwprop.cc| 31 -
 2 files changed, 45 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/forwprop-39.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-39.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-39.c
new file mode 100644
index 000..eb2930e77fd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-39.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -O2 -fdump-tree-forwprop1 -fdump-tree-optimized" } */
+
+#include <complex.h>
+
+extern void push1(void *p, float _Complex x);
+void foo (void *q, float _Complex *x)
+{
+  float r = __real *x;
+  float i = __imag *x;
+  push1 (q, CMPLXF (r, i));
+}
+
+/* { dg-final { scan-tree-dump-not "COMPLEX_EXPR" "forwprop1" } } */
+/* { dg-final { scan-tree-dump-not "REALPART_EXPR" "optimized" } } */
diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 03fe0a3f6df..3111a2b96a3 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -3669,7 +3669,8 @@ pass_forwprop::execute (function *fun)
  /* Rewrite stores of a single-use complex build expression
 to component-wise stores.  */
  use_operand_p use_p;
- gimple *use_stmt;
+ gimple *use_stmt, *def1, *def2;
+ tree rhs2;
  if (single_imm_use (lhs, &use_p, &use_stmt)
  && gimple_store_p (use_stmt)
  && !gimple_has_volatile_ops (use_stmt)
@@ -3703,6 +3704,34 @@ pass_forwprop::execute (function *fun)
  release_defs (stmt);
  gsi_remove (&gsi, true);
}
+ /* Rewrite a component-wise load of a complex to a complex
+load if the components are not used separately.  */
+ else if (TREE_CODE (rhs) == SSA_NAME
+  && has_single_use (rhs)
+  && ((rhs2 = gimple_assign_rhs2 (stmt)), true)
+  && TREE_CODE (rhs2) == SSA_NAME
+  && has_single_use (rhs2)
+  && (def1 = SSA_NAME_DEF_STMT (rhs),
+  gimple_assign_load_p (def1))
+  && (def2 = SSA_NAME_DEF_STMT (rhs2),
+  gimple_assign_load_p (def2))
+  && (gimple_vuse (def1) == gimple_vuse (def2))
+  && gimple_assign_rhs_code (def1) == REALPART_EXPR
+  && gimple_assign_rhs_code (def2) == IMAGPART_EXPR
+  && operand_equal_p (TREE_OPERAND (gimple_assign_rhs1
+(def1), 0),
+  TREE_OPERAND (gimple_assign_rhs1
+(def2), 0)))
+   {
+ tree cl = TREE_OPERAND (gimple_assign_rhs1 (def1), 0);
+ gimple_assign_set_rhs_from_tree (&gsi, unshare_expr (cl));
+ gcc_assert (gsi_stmt (gsi) == stmt);
+ gimple_set_vuse (stmt, gimple_vuse (def1));
+ gimple_set_modified (stmt, true);
+ gimple_stmt_iterator gsi2 = gsi_for_stmt (def1);
+ gsi_remove (&gsi, false);
+ gsi_insert_after (&gsi2, stmt, GSI_SAME_STMT);
+   }
  else
gsi_next (&gsi);
}
-- 
2.35.3


Re: [PATCH 1/2] gcov: Fix "do-while" structure in case statement leads to incorrect code coverage [PR93680]

2023-03-07 Thread Richard Biener via Gcc-patches
On Tue, 7 Mar 2023, Xionghu Luo wrote:


Re: Enable UTF-8 code page in driver and compiler on 64-bit mingw host [PR108865]

2023-03-07 Thread Jacek Caban via Gcc-patches

Hi Costas,

On 3/7/23 01:52, Costas Argyris via Gcc-patches wrote:

This is a proposal for addressing

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108865

by integrating the UTF-8 manifest file into gcc's build process for the
64-bit mingw host.



Is there a reason to make it specific to x86_64? It seems to me that all 
mingw hosts could use it.





+# The resource .rc file references the utf8 .manifest file.
+# Compile it into an object file using windres.
+# The resulting .o file gets added to host_extra_gcc_objs in
+# config.host for x86_64-*-mingw* host and gets linked into
+# the driver as a .o file, so its lack of symbols is OK.
+utf8rc-mingw32.o : $(srcdir)/config/i386/utf8-mingw32.rc
+   $(WINDRES) $< $@



I think that .manifest file should also be a dependency here.


Thanks,

Jacek


[PATCH v5] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-07 Thread pan2.li--- via Gcc-patches
From: Pan Li 

Fix the rvv bool mode precision bug by adjusting the precision.
The bit size of vbool*_t is adjusted to [1, 2, 4, 8, 16, 32, 64]
according to the rvv 1.0 ISA spec. The adjusted mode precision of
vbool*_t helps the underlying passes make the right decisions for
both correctness and optimization.

Given the sample code below:
void test_1(int8_t * restrict in, int8_t * restrict out)
{
  vbool8_t v2 = *(vbool8_t*)in;
  vbool16_t v5 = *(vbool16_t*)in;
  *(vbool16_t*)(out + 200) = v5;
  *(vbool8_t*)(out + 100) = v2;
}

Before the precision adjustment:
addi    a4,a1,100
vsetvli a5,zero,e8,m1,ta,ma
addi    a1,a1,200
vlm.v   v24,0(a0)
vsm.v   v24,0(a4)
// Need one vsetvli and vlm.v for correctness here.
vsm.v   v24,0(a1)

After the precision adjustment:
csrr    t0,vlenb
slli    t1,t0,1
csrr    a3,vlenb
sub     sp,sp,t1
slli    a4,a3,1
add     a4,a4,sp
sub     a3,a4,a3
vsetvli a5,zero,e8,m1,ta,ma
addi    a2,a1,200
vlm.v   v24,0(a0)
vsm.v   v24,0(a3)
addi    a1,a1,100
vsetvli a4,zero,e8,mf2,ta,ma
csrr    t0,vlenb
vlm.v   v25,0(a3)
vsm.v   v25,0(a2)
slli    t1,t0,1
vsetvli a5,zero,e8,m1,ta,ma
vsm.v   v24,0(a1)
add     sp,sp,t1
jr      ra

However, there may be some optimization opportunities after
the mode precision adjustment. They can be taken care of in
the RISC-V backend in separate PR(s).

PR 108185
PR 108654

gcc/ChangeLog:

* config/riscv/riscv-modes.def (ADJUST_PRECISION):
* config/riscv/riscv.cc (riscv_v_adjust_precision):
* config/riscv/riscv.h (riscv_v_adjust_precision):
* genmodes.cc (ADJUST_PRECISION):
(emit_mode_adjustments):

gcc/testsuite/ChangeLog:

* gcc.target/riscv/pr108185-1.c: New test.
* gcc.target/riscv/pr108185-2.c: New test.
* gcc.target/riscv/pr108185-3.c: New test.
* gcc.target/riscv/pr108185-4.c: New test.
* gcc.target/riscv/pr108185-5.c: New test.
* gcc.target/riscv/pr108185-6.c: New test.
* gcc.target/riscv/pr108185-7.c: New test.
* gcc.target/riscv/pr108185-8.c: New test.

Signed-off-by: Pan Li 
Co-authored-by: Ju-Zhe Zhong 
---
 gcc/config/riscv/riscv-modes.def|  8 +++
 gcc/config/riscv/riscv.cc   | 12 
 gcc/config/riscv/riscv.h|  1 +
 gcc/genmodes.cc | 28 +++-
 gcc/testsuite/gcc.target/riscv/pr108185-1.c | 68 ++
 gcc/testsuite/gcc.target/riscv/pr108185-2.c | 68 ++
 gcc/testsuite/gcc.target/riscv/pr108185-3.c | 68 ++
 gcc/testsuite/gcc.target/riscv/pr108185-4.c | 68 ++
 gcc/testsuite/gcc.target/riscv/pr108185-5.c | 68 ++
 gcc/testsuite/gcc.target/riscv/pr108185-6.c | 68 ++
 gcc/testsuite/gcc.target/riscv/pr108185-7.c | 68 ++
 gcc/testsuite/gcc.target/riscv/pr108185-8.c | 77 +
 12 files changed, 600 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-8.c

diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def
index d5305efa8a6..110bddce851 100644
--- a/gcc/config/riscv/riscv-modes.def
+++ b/gcc/config/riscv/riscv-modes.def
@@ -72,6 +72,14 @@ ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * 
riscv_bytes_per_vector_chunk);
 ADJUST_BYTESIZE (VNx32BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
 ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_nunits (VNx64BImode, 8));
 
+ADJUST_PRECISION (VNx1BI, riscv_v_adjust_precision (VNx1BImode, 1));
+ADJUST_PRECISION (VNx2BI, riscv_v_adjust_precision (VNx2BImode, 2));
+ADJUST_PRECISION (VNx4BI, riscv_v_adjust_precision (VNx4BImode, 4));
+ADJUST_PRECISION (VNx8BI, riscv_v_adjust_precision (VNx8BImode, 8));
+ADJUST_PRECISION (VNx16BI, riscv_v_adjust_precision (VNx16BImode, 16));
+ADJUST_PRECISION (VNx32BI, riscv_v_adjust_precision (VNx32BImode, 32));
+ADJUST_PRECISION (VNx64BI, riscv_v_adjust_precision (VNx64BImode, 64));
+
 /*
| Mode| MIN_VLEN=32 | MIN_VLEN=32 | MIN_VLEN=64 | MIN_VLEN=64 |
| | LMUL| SEW/LMUL| LMUL| SEW/LMUL|
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/

RE: [PATCH v4] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-07 Thread Li, Pan2 via Gcc-patches
Great! Thank you very much for help, Richard Sandiford!

Just adjusted the PATCH v5 with minor changes for the GNU style check, and
completed the regression test and the RISC-V backend test without any surprises.
Given that, could anyone help to merge this patch? Please let me know if you
have any questions or concerns.

https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613504.html

Pan

-----Original Message-----
From: Richard Sandiford  
Sent: Monday, March 6, 2023 11:02 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@sifive.com; 
rguent...@suse.de
Subject: Re: [PATCH v4] RISC-V: Bugfix for rvv bool mode precision adjustment

pan2...@intel.com writes:
> From: Pan Li 
>
>   Fix the rvv bool mode precision bug by adjusting the precision.
>   The bit size of vbool*_t is adjusted to [1, 2, 4, 8, 16, 32, 64]
>   according to the rvv 1.0 ISA spec. The adjusted mode precision of
>   vbool*_t helps the underlying passes make the right decisions for
>   both correctness and optimization.
>
>   Given the sample code below:
>   void test_1(int8_t * restrict in, int8_t * restrict out)
>   {
> vbool8_t v2 = *(vbool8_t*)in;
> vbool16_t v5 = *(vbool16_t*)in;
> *(vbool16_t*)(out + 200) = v5;
> *(vbool8_t*)(out + 100) = v2;
>   }
>
>   Before the precision adjustment:
>   addi    a4,a1,100
>   vsetvli a5,zero,e8,m1,ta,ma
>   addi    a1,a1,200
>   vlm.v   v24,0(a0)
>   vsm.v   v24,0(a4)
>   // Need one vsetvli and vlm.v for correctness here.
>   vsm.v   v24,0(a1)
>
>   After the precision adjustment:
>   csrr    t0,vlenb
>   slli    t1,t0,1
>   csrr    a3,vlenb
>   sub     sp,sp,t1
>   slli    a4,a3,1
>   add     a4,a4,sp
>   sub     a3,a4,a3
>   vsetvli a5,zero,e8,m1,ta,ma
>   addi    a2,a1,200
>   vlm.v   v24,0(a0)
>   vsm.v   v24,0(a3)
>   addi    a1,a1,100
>   vsetvli a4,zero,e8,mf2,ta,ma
>   csrr    t0,vlenb
>   vlm.v   v25,0(a3)
>   vsm.v   v25,0(a2)
>   slli    t1,t0,1
>   vsetvli a5,zero,e8,m1,ta,ma
>   vsm.v   v24,0(a1)
>   add     sp,sp,t1
>   jr      ra
>
>   However, there may be some optimization opportunities after
>   the mode precision adjustment. They can be taken care of in
>   the RISC-V backend in separate PR(s).
>
>   PR 108185
>   PR 108654
>
> gcc/ChangeLog:
>
>   * config/riscv/riscv-modes.def (ADJUST_PRECISION):
>   * config/riscv/riscv.cc (riscv_v_adjust_precision):
>   * config/riscv/riscv.h (riscv_v_adjust_precision):
>   * genmodes.cc (ADJUST_PRECISION):
>   (emit_mode_adjustments):

OK for the genmodes.cc part, thanks.

Richard

> gcc/testsuite/ChangeLog:
>
>   * gcc.target/riscv/pr108185-1.c: New test.
>   * gcc.target/riscv/pr108185-2.c: New test.
>   * gcc.target/riscv/pr108185-3.c: New test.
>   * gcc.target/riscv/pr108185-4.c: New test.
>   * gcc.target/riscv/pr108185-5.c: New test.
>   * gcc.target/riscv/pr108185-6.c: New test.
>   * gcc.target/riscv/pr108185-7.c: New test.
>   * gcc.target/riscv/pr108185-8.c: New test.
>
> Signed-off-by: Pan Li 
> Co-authored-by: Ju-Zhe Zhong 
> ---
>  gcc/config/riscv/riscv-modes.def|  8 +++
>  gcc/config/riscv/riscv.cc   | 12 
>  gcc/config/riscv/riscv.h|  1 +
>  gcc/genmodes.cc | 28 +++-
>  gcc/testsuite/gcc.target/riscv/pr108185-1.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-2.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-3.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-4.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-5.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-6.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-7.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-8.c | 77 +
>  12 files changed, 600 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-8.c
>
> diff --git a/gcc/config/riscv/riscv-modes.def 
> b/gcc/config/riscv/riscv-modes.def
> index d5305efa8a6..110bddce851 100644
> --- a/gcc/config/riscv/riscv-modes.def
> +++ b/gcc/config/riscv/riscv-modes.def
> @@ -72,6 +72,14 @@ ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * 
> riscv_bytes_per_vector_chunk);  ADJUST_BYTESIZE (VNx32BI, 

Re: [PATCH 2/2] libstdc++: use copy_file_range

2023-03-07 Thread Jonathan Wakely via Gcc-patches

On 06/03/23 23:11 +0100, Jannik Glückert wrote:

copy_file_range is a recent-ish syscall for copying files. It is similar
to sendfile but allows filesystem-specific optimizations. Common are:
Reflinks: BTRFS, XFS, ZFS (does not implement the syscall yet)
Server-side copy: NFS, SMB

If copy_file_range is not available for the given files, fall back to
sendfile / userspace copy.


Thanks for the patch!

Please note the legal prerequisites for GCC contributions:
https://gcc.gnu.org/contribute.html#legal


libstdc++-v3/ChangeLog:

* acinclude.m4 (_GLIBCXX_USE_COPY_FILE_RANGE): define
* config.h.in: Regenerate.
* configure: Regenerate.
* src/filesystem/ops-common.h: use copy_file_range in
std::filesystem::copy_file
---
libstdc++-v3/acinclude.m4| 20 
libstdc++-v3/config.h.in |  3 ++
libstdc++-v3/configure   | 62 
libstdc++-v3/src/filesystem/ops-common.h | 34 +
4 files changed, 119 insertions(+)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 5136c0571e8..ca09e1d22db 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4581,6 +4581,7 @@ dnl  _GLIBCXX_USE_UTIMENSAT
dnl  _GLIBCXX_USE_ST_MTIM
dnl  _GLIBCXX_USE_FCHMOD
dnl  _GLIBCXX_USE_FCHMODAT
+dnl  _GLIBCXX_USE_COPY_FILE_RANGE
dnl  _GLIBCXX_USE_SENDFILE
dnl  HAVE_LINK
dnl  HAVE_READLINK
@@ -4718,6 +4719,25 @@ dnl
  if test $glibcxx_cv_fchmodat = yes; then
AC_DEFINE(_GLIBCXX_USE_FCHMODAT, 1, [Define if fchmodat is available in 
.])
  fi
+dnl
+  AC_CACHE_CHECK([for copy_file_range that can copy files],
+glibcxx_cv_copy_file_range, [dnl
+case "${target_os}" in
+  linux*)
+   GCC_TRY_COMPILE_OR_LINK(
+ [#include ],
+ [copy_file_range(1, NULL, 2, NULL, 1, 0);],


This should either include  for NULL, or use nullptr.


+ [glibcxx_cv_copy_file_range=yes],
+ [glibcxx_cv_copy_file_range=no])
+   ;;
+  *)
+   glibcxx_cv_copy_file_range=no
+   ;;
+esac
+  ])
+  if test $glibcxx_cv_copy_file_range = yes; then
+AC_DEFINE(_GLIBCXX_USE_COPY_FILE_RANGE, 1, [Define if copy_file_range is 
available in .])
+  fi
dnl
  AC_CACHE_CHECK([for sendfile that can copy files],
glibcxx_cv_sendfile, [dnl
diff --git a/libstdc++-v3/src/filesystem/ops-common.h 
b/libstdc++-v3/src/filesystem/ops-common.h
index d8afc6a4d64..0491dc8d811 100644
--- a/libstdc++-v3/src/filesystem/ops-common.h
+++ b/libstdc++-v3/src/filesystem/ops-common.h
@@ -49,6 +49,9 @@
#ifdef NEED_DO_COPY_FILE
# include 
# include 
+# ifdef _GLIBCXX_USE_COPY_FILE_RANGE
+#  include  // copy_file_range
+# endif
# ifdef _GLIBCXX_USE_SENDFILE
#  include  // sendfile
# endif
@@ -358,6 +361,24 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM
  }

#ifdef NEED_DO_COPY_FILE
+#ifdef _GLIBCXX_USE_COPY_FILE_RANGE
+  bool
+  copy_file_copy_file_range(int fd_in, int fd_out, size_t length) noexcept
+  {
+size_t bytes_left = length;
+off_t offset = 0;
+ssize_t bytes_copied;
+do {
+  bytes_copied = ::copy_file_range(fd_in, &offset, fd_out, NULL, 
bytes_left, 0);


If there's a good reason to use &offset here instead of nullptr then I
think we need a comment explaining it. I initially thought the point
was to not adjust the file offset for fd_in, so that if
copy_file_range fails after the first call, we retry using sendfile or
filebufs and restart copying from the beginning again. But
copy_file_range will have updated the file offset of fd_out in that
case, and so sendfile/filebuf would start copying from the beginning
of fd_in but write to the end of fd_out, producing an incorrect copy.

Either we should pass &off_in and &off_out, so that neither file
offset is updated, or we should rewind the output file offset before
retrying with sendfile/filebuf. Your first patch removes the existing
code that seeks the filebuf positions to account for any partial
sendfile copying. I'm not sure the code is correct after removing
that.


+  if (bytes_copied < 0)
+{
+  return false;
+}
+  bytes_left -= bytes_copied;
+} while (bytes_left > 0 && bytes_copied > 0);


This will break out of the loop if copy_file_range returns zero, even
if bytes_left != 0. It's not clear from the man page whether that can
happen (it says it will return zero if the file offset of fd_in is at
EOF, but it doesn't say it *won't* return zero otherwise). This does
match the loop condition in the man page's example though.
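
To make the zero-return concern concrete, here is a minimal sketch of a
loop that distinguishes "everything copied" from "stopped early with bytes
still outstanding". It is not the patch's code: copy_result, copy_all and
the copy_chunk callable are hypothetical names, and the callable merely
stands in for ::copy_file_range so that the control flow can be read (and
exercised) without the syscall.

```cpp
#include <cstddef>

// Outcome of a chunked copy.  Hypothetical names, not taken from the patch.
enum class copy_result { complete, short_copy, failed };

// COPY_CHUNK is any callable that is given the number of bytes still wanted
// and returns the number of bytes copied (> 0), 0 for "nothing more to
// copy", or a negative value on error.  In the patch this role is played by
// ::copy_file_range.
template<typename CopyChunk>
copy_result
copy_all(CopyChunk copy_chunk, std::size_t length)
{
  std::size_t bytes_left = length;
  while (bytes_left > 0)
    {
      const long long n = copy_chunk(bytes_left);
      if (n < 0)
        return copy_result::failed;
      if (n == 0)
        // Zero with bytes still outstanding: report it to the caller
        // instead of silently treating the copy as finished.
        return copy_result::short_copy;
      // Guard against a callee that claims more than was requested.
      const std::size_t copied = static_cast<std::size_t>(n);
      bytes_left -= copied < bytes_left ? copied : bytes_left;
    }
  return copy_result::complete;
}
```

With a three-way result the caller can decide for itself whether a short
copy should fall back to sendfile/filebuf or be reported as an error,
rather than having the loop return true in that case.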



+return true;
+  }
+#endif
#if defined _GLIBCXX_USE_SENDFILE && ! defined _GLIBCXX_FILESYSTEM_IS_WINDOWS
  bool
  copy_file_sendfile(int fd_in, int fd_out, size_t length) noexcept
@@ -518,6 +539,19 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM

bool has_copied = false;

+#ifdef _GLIBCXX_USE_COPY_FILE_RANGE
+if (!has_copied)
+  has_copied = copy_file_copy_file_range(in.fd, out.fd, from_st->st_size);
+if (!has_copied)
+  {
+if (errno != EFBIG && 

Re: [PATCH v5] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-07 Thread Kito Cheng via Gcc-patches
Hi Pan:

Really appreciate your patch for fixing this issue!

committed to trunk with minor commit log tweak :)

On Tue, Mar 7, 2023 at 8:05 PM pan2.li--- via Gcc-patches
 wrote:
>
> From: Pan Li 
>
> Fix the rvv bool mode precision bug by adjusting the precision.
> The bit size of vbool*_t is adjusted to [1, 2, 4, 8, 16, 32, 64]
> according to the rvv 1.0 ISA spec. The adjusted mode precision of
> vbool*_t helps the underlying passes make the right decisions for
> both correctness and optimization.
>
> Given the sample code below:
> void test_1(int8_t * restrict in, int8_t * restrict out)
> {
>   vbool8_t v2 = *(vbool8_t*)in;
>   vbool16_t v5 = *(vbool16_t*)in;
>   *(vbool16_t*)(out + 200) = v5;
>   *(vbool8_t*)(out + 100) = v2;
> }
>
> Before the precision adjustment:
> addi    a4,a1,100
> vsetvli a5,zero,e8,m1,ta,ma
> addi    a1,a1,200
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a4)
> // Need one vsetvli and vlm.v for correctness here.
> vsm.v   v24,0(a1)
>
> After the precision adjustment:
> csrr    t0,vlenb
> slli    t1,t0,1
> csrr    a3,vlenb
> sub     sp,sp,t1
> slli    a4,a3,1
> add     a4,a4,sp
> sub     a3,a4,a3
> vsetvli a5,zero,e8,m1,ta,ma
> addi    a2,a1,200
> vlm.v   v24,0(a0)
> vsm.v   v24,0(a3)
> addi    a1,a1,100
> vsetvli a4,zero,e8,mf2,ta,ma
> csrr    t0,vlenb
> vlm.v   v25,0(a3)
> vsm.v   v25,0(a2)
> slli    t1,t0,1
> vsetvli a5,zero,e8,m1,ta,ma
> vsm.v   v24,0(a1)
> add     sp,sp,t1
> jr      ra
>
> However, there may be some optimization opportunities after
> the mode precision adjustment. They can be taken care of in
> the RISC-V backend in separate PR(s).
>
> PR 108185
> PR 108654
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-modes.def (ADJUST_PRECISION):
> * config/riscv/riscv.cc (riscv_v_adjust_precision):
> * config/riscv/riscv.h (riscv_v_adjust_precision):
> * genmodes.cc (ADJUST_PRECISION):
> (emit_mode_adjustments):
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/pr108185-1.c: New test.
> * gcc.target/riscv/pr108185-2.c: New test.
> * gcc.target/riscv/pr108185-3.c: New test.
> * gcc.target/riscv/pr108185-4.c: New test.
> * gcc.target/riscv/pr108185-5.c: New test.
> * gcc.target/riscv/pr108185-6.c: New test.
> * gcc.target/riscv/pr108185-7.c: New test.
> * gcc.target/riscv/pr108185-8.c: New test.
>
> Signed-off-by: Pan Li 
> Co-authored-by: Ju-Zhe Zhong 
> ---
>  gcc/config/riscv/riscv-modes.def|  8 +++
>  gcc/config/riscv/riscv.cc   | 12 
>  gcc/config/riscv/riscv.h|  1 +
>  gcc/genmodes.cc | 28 +++-
>  gcc/testsuite/gcc.target/riscv/pr108185-1.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-2.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-3.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-4.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-5.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-6.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-7.c | 68 ++
>  gcc/testsuite/gcc.target/riscv/pr108185-8.c | 77 +
>  12 files changed, 600 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-8.c
>
> diff --git a/gcc/config/riscv/riscv-modes.def 
> b/gcc/config/riscv/riscv-modes.def
> index d5305efa8a6..110bddce851 100644
> --- a/gcc/config/riscv/riscv-modes.def
> +++ b/gcc/config/riscv/riscv-modes.def
> @@ -72,6 +72,14 @@ ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * 
> riscv_bytes_per_vector_chunk);
>  ADJUST_BYTESIZE (VNx32BI, riscv_vector_chunks * 
> riscv_bytes_per_vector_chunk);
>  ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_nunits (VNx64BImode, 8));
>
> +ADJUST_PRECISION (VNx1BI, riscv_v_adjust_precision (VNx1BImode, 1));
> +ADJUST_PRECISION (VNx2BI, riscv_v_adjust_precision (VNx2BImode, 2));
> +ADJUST_PRECISION (VNx4BI, riscv_v_adjust_precision (VNx4BImode, 4));
> +ADJUST_PRECISION (VNx8BI, riscv_v_adjust_precision (VNx8BImode, 8));
> +ADJUST_PRECISION (VNx16BI, 

Re: Enable UTF-8 code page in driver and compiler on 64-bit mingw host [PR108865]

2023-03-07 Thread Costas Argyris via Gcc-patches
Hi Jacek,

"Is there a reason to make it specific to x86_64? It seems to me that all
mingw hosts could use it."

Are you referring to the 32-bit host?  My concern here is that this
functionality (embedding the UTF-8 manifest file into the executable) is
only truly supported in recent versions of Windows.  From:

https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

It says that Windows Version 1903 (May 2019 Update) enables this, so we are
looking at the 64-bit version of Windows.

I suppose you are referring to the scenario where one has a 32-bit gcc +
mingw running in a 64-bit Windows that is recent enough to support this?
It is not clear to me based on the above doc what would happen encoding-wise
in that situation, and I haven't tried it either because I assumed that most
people would want the 64-bit version of gcc since they are probably running
a 64-bit OS.

If you think it is useful, I could look into that as a separate task to try
and keep this one simple, if that makes sense.

"I think that .manifest file should also be a dependency here."

Why is that?  Windres takes only the .rc file as its input, as per its own
doc, and it successfully compiles it into an object file.  The .manifest file
is only referenced by the .rc file, and it doesn't get passed to windres, so
I don't see why it has to be listed as a prerequisite in the make rule.

Thanks,
Costas

On Tue, 7 Mar 2023 at 12:02, Jacek Caban  wrote:

> Hi Costas,
>
> On 3/7/23 01:52, Costas Argyris via Gcc-patches wrote:
>
> This is a proposal for addressing
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108865
>
> by integrating the UTF-8 manifest file into gcc's build process for the
> 64-bit mingw host.
>
>
> Is there a reason to make it specific to x86_64? It seems to me that all
> mingw hosts could use it.
>
>
>
> +# The resource .rc file references the utf8 .manifest file.
> +# Compile it into an object file using windres.
> +# The resulting .o file gets added to host_extra_gcc_objs in
> +# config.host for x86_64-*-mingw* host and gets linked into
> +# the driver as a .o file, so its lack of symbols is OK.
> +utf8rc-mingw32.o : $(srcdir)/config/i386/utf8-mingw32.rc
> + $(WINDRES) $< $@
>
>
> I think that .manifest file should also be a dependency here.
>
>
> Thanks,
>
> Jacek
>


Re: Enable UTF-8 code page in driver and compiler on 64-bit mingw host [PR108865]

2023-03-07 Thread Jacek Caban via Gcc-patches

Hi Costas,

On 3/7/23 15:00, Costas Argyris wrote:

Hi Jacek,

"Is there a reason to make it specific to x86_64? It seems to me that 
all mingw hosts could use it."


Are you referring to the 32-bit host?  My concern here is that this
functionality (embedding the UTF-8 manifest file into the executable) is
only truly supported in recent versions of Windows.  From:

https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

It says that Windows Version 1903 (May 2019 Update) enables this, so we are
looking at the 64-bit version of Windows.

I suppose you are referring to the scenario where one has a 32-bit gcc +
mingw running in a 64-bit Windows that is recent enough to support this?
It is not clear to me based on the above doc what would happen encoding-wise
in that situation, and I haven't tried it either because I assumed that most
people would want the 64-bit version of gcc since they are probably running
a 64-bit OS.

If you think it is useful, I could look into that as a separate task to try
and keep this one simple, if that makes sense.



Yes, realistically it's mostly about 32-bit gcc on 64-bit Windows
(perhaps aarch64 as well at some point in the future). It's probably
not a very popular configuration these days, but I think it should
work just fine if you didn't explicitly limit the patch to x86_64.




"I think that .manifest file should also be a dependency here."

Why is that?  Windres takes only the .rc file as its input, as per its own
doc, and it successfully compiles it into an object file.  The .manifest file
is only referenced by the .rc file, and it doesn't get passed to windres, so
I don't see why it has to be listed as a prerequisite in the make rule.



The point is that when winnt-utf8.manifest is modified, utf8-mingw32.o
should be rebuilt. Anyway, it's probably not a big deal (I should
disclaim that I'm not very familiar with the gcc build system; I'm mostly on
this ML due to mingw-w64 contributions).



Thanks,

Jacek



Re: Ping: [PATCH 1/2] testsuite: Provide means to regexp in multiline patterns

2023-03-07 Thread Martin Liška
On 3/7/23 01:32, Hans-Peter Nilsson wrote:
>> From: Mike Stump 
>> Date: Mon, 6 Mar 2023 02:05:35 -0800
> 
>> Ok
> 
> Thanks!  The server-side hook didn't like my ChangeLog
> entry:
> 
> * lib/multiline.exp (_build_multiline_regex): Map
> "{re:" to "(", ":re}" to ")" and ":re?}" to ")?".
> 
> It seems I forgot to validate that patch by
> contrib/gcc-changelog/git_check_commit.py, which complains:
> 
> Checking c0debd6f586ef76f1ceabfed11d7eaf8f6d1b110: FAILED
> ERR: bad wrapping of parenthesis: "   "{re:" to "(", ":re}" to ")" and 
> ":re?}" to ")?"."

Hello.

Yeah, that's quite an interesting problem ;)


> 
> I gave in and took the easy way out; not fixing the bug in
> that script, but instead "wrapped the parenthesis" to:
> 
>   * lib/multiline.exp (_build_multiline_regex): Map
>   "{re:" to "(", similarly ")?" from ":re?}" and the
>   same without question mark.
> 
> I hope to make amends by fixing git_check_commit.py, if
> given guidance.

Sure, you can take a look at:
contrib/gcc-changelog/git_commit.py::process_parentheses
where we might want to skip the stack push/pop if the character is wrapped
in apostrophes or double quotes.
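
As a rough illustration of that idea (not the script's actual code, which
is Python), here is a C++ sketch of a parenthesis balance check that
ignores anything between a pair of quotes. The function name is
hypothetical, and it checks a whole string at once, whereas the real
check is concerned with how parentheses wrap across ChangeLog lines.

```cpp
#include <string>

// Hypothetical helper: return true if '(' and ')' are balanced in TEXT,
// ignoring any that appear between a pair of double quotes or apostrophes.
bool
parens_balanced_outside_quotes(const std::string &text)
{
  int depth = 0;
  char open_quote = '\0';  // Quote character we are inside, or NUL.
  for (char c : text)
    {
      if (open_quote != '\0')
        {
          // Inside a quoted run: only look for the matching close quote.
          if (c == open_quote)
            open_quote = '\0';
          continue;
        }
      if (c == '"' || c == '\'')
        open_quote = c;
      else if (c == '(')
        ++depth;
      else if (c == ')' && --depth < 0)
        return false;  // Closing parenthesis with nothing open.
    }
  return depth == 0;
}
```

One caveat of this simple version: a lone apostrophe in prose (as in
"don't") opens a quoted run that never closes, so everything after it is
ignored. A real fix would probably need to be stricter about what counts
as a quote.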

Martin

> 
> brgds, H-P



Re: [PATCH v6] c++: -Wdangling-reference with reference wrapper [PR107532]

2023-03-07 Thread Jason Merrill via Gcc-patches

On 3/6/23 16:54, Marek Polacek wrote:

On Fri, Mar 03, 2023 at 09:30:38PM -0500, Jason Merrill wrote:

On 3/3/23 12:50, Marek Polacek wrote:

 switch (TREE_CODE (expr))
   {
   case CALL_EXPR:
@@ -13831,7 +13895,8 @@ do_warn_dangling_reference (tree expr)
 std::pair v = std::minmax(1, 2);
   which also creates a dangling reference, because std::minmax
   returns std::pair(b, a).  */
-   if (!(TYPE_REF_OBJ_P (rettype) || std_pair_ref_ref_p (rettype)))
+   if (!arg_p
+   && (!(TYPE_REF_OBJ_P (rettype) || std_pair_ref_ref_p (rettype


Instead of checking !arg_p maybe the std_pair_ref_ref_p call should change
to reference_like_class_p (which in turn should check std_pair_ref_ref_p)?


Could do.  I suppose the logic is that for std::pair
arguments we want to see through it to get at its arguments.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

   const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

This patch adjusts do_warn_dangling_reference so that we look through
reference wrapper classes (meaning, has a reference member and a
constructor taking the same reference type, or is std::reference_wrapper
or std::ranges::ref_view) and don't warn for them, supposing that the
member function returns a reference to a non-temporary object.
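
For illustration, a minimal sketch of the kind of code this is about. Ref,
Plane, planes() and inner() echo the names used above, but the class
bodies are assumptions, and FrameMetadata and plane_id are hypothetical
names made up for this example:

```cpp
struct Plane { int id; };

// A reference wrapper in the sense described above: it has a reference
// member and a constructor taking that same reference type.
struct Ref {
  const Plane &p;
  Ref(const Plane &plane) : p(plane) {}
  const Plane &inner() const { return p; }
};

// Hypothetical owner of the Plane; it outlives the temporary Ref.
struct FrameMetadata {
  Plane plane{42};
  Ref planes() const { return Ref(plane); }
};

int
plane_id(const FrameMetadata &fm)
{
  // planes() yields a temporary Ref, but inner() refers to fm.plane,
  // which is not a temporary, so 'meta' does not dangle.
  const Plane &meta = fm.planes().inner();
  return meta.id;
}
```

The temporary Ref is gone at the end of the full expression, but the
reference it handed out still points into fm, which is why warning here
is unhelpful.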

PR c++/107532

gcc/cp/ChangeLog:

* call.cc (reference_like_class_p): New.
(do_warn_dangling_reference): Add new bool parameter.  See through
reference_like_class_p.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference8.C: New test.
* g++.dg/warn/Wdangling-reference9.C: New test.
---
  gcc/cp/call.cc| 97 ---
  .../g++.dg/warn/Wdangling-reference8.C| 77 +++
  .../g++.dg/warn/Wdangling-reference9.C| 21 
  3 files changed, 181 insertions(+), 14 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference8.C
  create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference9.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 048b2b052f8..a43980b6e15 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -13779,6 +13779,52 @@ std_pair_ref_ref_p (tree t)
return true;
  }
  
+/* Return true if a class CTYPE is either std::reference_wrapper or
+   std::ref_view, or a reference wrapper class.  We consider a class
+   a reference wrapper class if it has a reference member and a
+   constructor taking the same reference type.  */
+
+static bool
+reference_like_class_p (tree ctype)
+{
+  if (!CLASS_TYPE_P (ctype))
+return false;
+
+  /* Also accept a std::pair.  */
+  if (std_pair_ref_ref_p (ctype))
+return true;
+
+  tree tdecl = TYPE_NAME (TYPE_MAIN_VARIANT (ctype));
+  if (decl_in_std_namespace_p (tdecl))
+{
+  tree name = DECL_NAME (tdecl);
+  return (name
+ && (id_equal (name, "reference_wrapper")
+ || id_equal (name, "ref_view")));
+}
+  for (tree fields = TYPE_FIELDS (ctype);
+   fields;
+   fields = DECL_CHAIN (fields))
+{
+  if (TREE_CODE (fields) != FIELD_DECL || DECL_ARTIFICIAL (fields))
+   continue;
+  tree type = TREE_TYPE (fields);
+  if (!TYPE_REF_P (type))
+   continue;
+  /* OK, the field is a reference member.  Do we have a constructor
+taking its type?  */
+  for (tree fn : ovl_range (CLASSTYPE_CONSTRUCTORS (ctype)))
+   {
+ tree args = FUNCTION_FIRST_USER_PARMTYPE (fn);
+ if (args
+ && same_type_p (TREE_VALUE (args), type)
+ && TREE_CHAIN (args) == void_list_node)
+   return true;
+   }
+}
+  return false;
+}
+
  /* Helper for maybe_warn_dangling_reference to find a problematic CALL_EXPR
 that initializes the LHS (and at least one of its arguments represents
 a temporary, as outlined in maybe_warn_dangling_reference), or NULL_TREE
@@ -13793,12 +13839,36 @@ std_pair_ref_ref_p (tree t)
   const int& y = (f(1), 42); // NULL_TREE
   const int& z = f(f(1)); // f(f(1))
  
-   EXPR is the initializer.  */
+   EXPR is the initializer.  If ARG_P is true, we're processing an argument
+   to a function; the point is to distinguish between, for example,
+
+ Ref::inner (&TARGET_EXPR )
+
+   where we shouldn't warn, and
+
+ Ref::inner (&TARGET_EXPR )>)
+
+   where we should warn (Ref is a reference_like_class_p so we see through
+   it.  */
  
  static tree
-do_warn_dangling_reference (tree expr)
+do_warn_dan

Re: [PATCH v2] c++: error with constexpr operator() [PR107939]

2023-03-07 Thread Jason Merrill via Gcc-patches

On 3/6/23 17:01, Marek Polacek wrote:

On Mon, Mar 06, 2023 at 11:12:56AM -0500, Jason Merrill wrote:

On 3/3/23 12:51, Marek Polacek wrote:

Similarly to PR107938, this also started with r11-557, whereby cp_finish_decl
can call check_initializer even in a template for a constexpr initializer.

Here we are rejecting

extern const Q q;

template
constexpr auto p = q(0);

even though q has a constexpr operator().  It's deemed non-const by
decl_maybe_constant_var_p because even though 'q' is const it is not
of integral/enum type.  I think the fix is for p_c_e to treat q(0) as
potentially-constant, as below.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/12?

PR c++/107939

gcc/cp/ChangeLog:

* constexpr.cc (is_constexpr_function_object): New.
(potential_constant_expression_1): Treat an object with constexpr
operator() as potentially-constant.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/var-templ74.C: Remove dg-error.
* g++.dg/cpp1y/var-templ77.C: New test.
---
   gcc/cp/constexpr.cc  | 23 ++-
   gcc/testsuite/g++.dg/cpp1y/var-templ74.C |  2 +-
   gcc/testsuite/g++.dg/cpp1y/var-templ77.C | 14 ++
   3 files changed, 37 insertions(+), 2 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp1y/var-templ77.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index acf9847a4d1..7d786f332b4 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -8929,6 +8929,24 @@ check_for_return_continue (tree *tp, int *walk_subtrees, 
void *data)
 return NULL_TREE;
   }
+/* Return true iff TYPE is a class with constexpr operator().  */
+
+static bool
+is_constexpr_function_object (tree type)
+{
+  if (!CLASS_TYPE_P (type))
+return false;
+
+  for (tree f = TYPE_FIELDS (type); f; f = DECL_CHAIN (f))
+if (TREE_CODE (f) == FUNCTION_DECL
+   && DECL_OVERLOADED_OPERATOR_P (f)
+   && DECL_OVERLOADED_OPERATOR_IS (f, CALL_EXPR)
+   && DECL_DECLARED_CONSTEXPR_P (f))
+  return true;
+
+  return false;
+}
+
   /* Return true if T denotes a potentially constant expression.  Issue
  diagnostic as appropriate under control of FLAGS.  If WANT_RVAL is true,
  an lvalue-rvalue conversion is implied.  If NOW is true, we want to
@@ -9160,7 +9178,10 @@ potential_constant_expression_1 (tree t, bool want_rval, 
bool strict, bool now,
  }
else if (fun)
 {
-   if (RECUR (fun, rval))
+   if (VAR_P (fun)
+   && is_constexpr_function_object (TREE_TYPE (fun)))
+ /* Could be an object with constexpr operator().  */;


I guess if fun is not a function pointer, we don't know if we're using it as
an lvalue or rvalue


Presumably the operator function could return this, making it an lvalue?
I'm not sure I'm really clear on this.


I mean just calling the operator uses the variable as an lvalue, by 
passing its address as 'this'.



, so we want to pass 'any' for want_rval, which should
make this work;


Yes, want_rval==false means that p_c_e/VAR_DECL will not issue the
hard error.


I don't think we need to be specific about constexpr op(),
as a constexpr conversion operator to fn* could also do the trick.


Ah, those surrogate classes.  I couldn't reproduce the problem with
them, though I'm adding a test for it anyway.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK, thanks.


-- >8 --
Similarly to PR107938, this also started with r11-557, whereby cp_finish_decl
can call check_initializer even in a template for a constexpr initializer.

Here we are rejecting

   extern const Q q;

   template
   constexpr auto p = q(0);

even though q has a constexpr operator().  It's deemed non-const by
decl_maybe_constant_var_p because even though 'q' is const it is not
of integral/enum type.

If fun is not a function pointer, we don't know if we're using it as an
lvalue or rvalue, so with this patch we pass 'any' for want_rval.  With
that, p_c_e/VAR_DECL doesn't flat out reject the underlying VAR_DECL.

PR c++/107939

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1) : Pass
'any' when recursing on a VAR_DECL and not a pointer to function.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/var-templ74.C: Remove dg-error.
* g++.dg/cpp1y/var-templ77.C: New test.
---
  gcc/cp/constexpr.cc  |  8 --
  gcc/testsuite/g++.dg/cpp1y/var-templ74.C |  2 +-
  gcc/testsuite/g++.dg/cpp1y/var-templ77.C | 32 
  3 files changed, 39 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/var-templ77.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 364695b762c..3079561f2e8 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -9179,8 +9179,12 @@ potential_constant_expression_1 (tree t, bool want_rval, 
bool strict, bool now,
  }
else if (fun)
{
-   if (RECUR (f

Re: [PATCH] c++: noexcept and copy elision [PR109030]

2023-03-07 Thread Jason Merrill via Gcc-patches

On 3/6/23 18:59, Marek Polacek wrote:

When processing a noexcept, constructors aren't elided: build_over_call
has
 /* It's unsafe to elide the constructor when handling
a noexcept-expression, it may evaluate to the wrong
value (c++/53025).  */
 && (force_elide || cp_noexcept_operand == 0))
so the assert I added recently needs to be relaxed a little bit.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


PR c++/109030

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Relax assert.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept77.C: New test.
---
  gcc/cp/constexpr.cc | 6 +-
  gcc/testsuite/g++.dg/cpp0x/noexcept77.C | 9 +
  2 files changed, 14 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept77.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 364695b762c..5384d0e8e46 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -2869,7 +2869,11 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
  
/* We used to shortcut trivial constructor/op= here, but nowadays

   we can only get a trivial function here with -fno-elide-constructors.  */
-  gcc_checking_assert (!trivial_fn_p (fun) || !flag_elide_constructors);
+  gcc_checking_assert (!trivial_fn_p (fun)
+  || !flag_elide_constructors
+  /* We don't elide constructors when processing
+ a noexcept-expression.  */
+  || cp_noexcept_operand);
  
bool non_constant_args = false;

new_call.bindings
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept77.C 
b/gcc/testsuite/g++.dg/cpp0x/noexcept77.C
new file mode 100644
index 000..16db8eb79ee
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept77.C
@@ -0,0 +1,9 @@
+// PR c++/109030
+// { dg-do compile { target c++11 } }
+
+struct foo { };
+
+struct __as_receiver {
+  foo empty_env;
+};
+void sched(foo __fun) noexcept(noexcept(__as_receiver{__fun})) { }

base-commit: dfb14cdd796ad9df6b5f2def047ef36b29385902




Re: [Patch, fortran] PR37336 finalization

2023-03-07 Thread Thomas Koenig via Gcc-patches

Paul,

first of all, thank you very much indeed for the hard work you put into
this!  This is a great step for gfortran.


I can hurry this along to get the patch
into 13-branch or I can wait until 14-branch opens.


Personally, I think that this fixes so many bugs, and makes
the compiler so much better, that I would prefer having it
in gcc-13.  Finalization was only of very limited use before,
and the risk of meaningful regressions (short of a build
failure) is therefore very low.

Again, thanks a lot!

Best regards

Thomas





Re: [PATCH] c++: Fix up ICE in emit_support_tinfo_1 [PR109042]

2023-03-07 Thread Jason Merrill via Gcc-patches

On 3/7/23 04:07, Jakub Jelinek wrote:

Hi!

In my recent rtti.cc change I assumed when emitting the support tinfos
that the tinfos for the fundamental types haven't been created yet.
Normally (in libsupc++.a (fundamental_type_info.o)) that is the case,
but as can be seen on the testcase, one can violate it by using typeid
etc. in the same TU and do it before ~__fundamental_type_info ()
definition.

The following patch fixes that by popping from unemitted_tinfo_decls
only in the normal case when it is there, and treating non-NULL
DECL_INITIAL on a tinfo node as indication that emit_tinfo_decl has
processed it already.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-03-07  Jakub Jelinek  

PR c++/109042
* rtti.cc (emit_support_tinfo_1): Don't assert that last
unemitted_tinfo_decls element is tinfo, instead pop from it only in
that case.
* decl2.cc (c_parse_final_cleanups): Don't call emit_tinfo_decl
for unemitted_tinfo_decls which already have non-NULL DECL_INITIAL.

* g++.dg/rtti/pr109042.C: New test.

--- gcc/cp/rtti.cc.jj   2023-03-03 00:34:52.028567946 +0100
+++ gcc/cp/rtti.cc  2023-03-06 19:06:27.433307136 +0100
@@ -1581,10 +1581,10 @@ emit_support_tinfo_1 (tree bltn)
/* Emit it right away if not emitted already.  */
if (DECL_INITIAL (tinfo) == NULL_TREE)
{
- gcc_assert (unemitted_tinfo_decls->last () == tinfo);
  bool ok = emit_tinfo_decl (tinfo);
  gcc_assert (ok);
- unemitted_tinfo_decls->pop ();
+ if (unemitted_tinfo_decls->last () == tinfo)
+   unemitted_tinfo_decls->pop ();


So if it's not last we'll leave it in the vec, even though it is no 
longer unemitted, and let c_parse_final_cleanups deal with removing it? 
 That could use a comment.  OK with that change.



}
  }
  }
--- gcc/cp/decl2.cc.jj  2023-01-18 16:11:47.053213397 +0100
+++ gcc/cp/decl2.cc 2023-03-06 19:07:16.830582984 +0100
@@ -4982,7 +4982,7 @@ c_parse_final_cleanups (void)
 get emitted.  */
for (i = unemitted_tinfo_decls->length ();
   unemitted_tinfo_decls->iterate (--i, &t);)
-   if (emit_tinfo_decl (t))
+   if (DECL_INITIAL (t) || emit_tinfo_decl (t))
  {
reconsider = true;
unemitted_tinfo_decls->unordered_remove (i);
--- gcc/testsuite/g++.dg/rtti/pr109042.C.jj 2023-03-06 19:11:06.995208812 
+0100
+++ gcc/testsuite/g++.dg/rtti/pr109042.C2023-03-06 19:10:59.117324298 
+0100
@@ -0,0 +1,20 @@
+// PR c++/109042
+// { dg-do compile }
+
+namespace std { class type_info {}; }
+
+std::type_info
+foo ()
+{
+  return typeid (void);
+}
+
+namespace __cxxabiv1 {
+  struct __fundamental_type_info {
+virtual ~__fundamental_type_info ();
+  };
+
+  __fundamental_type_info::~__fundamental_type_info ()
+  {
+  }
+}

Jakub
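The bookkeeping described in the patch above can be sketched outside of GCC. This is only an illustration under stated assumptions: Tinfo, emit, emit_support and final_cleanup are hypothetical stand-ins (not GCC functions), and the 'emitted' flag plays the role that a non-NULL DECL_INITIAL plays in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tinfo decl; 'emitted' plays the role that a
// non-NULL DECL_INITIAL plays in the patch.
struct Tinfo {
  bool emitted = false;
};

// Stand-in for emit_tinfo_decl: marks the node as emitted, reports success.
bool emit(Tinfo *t) {
  t->emitted = true;
  return true;
}

// Mirrors the rtti.cc hunk: emit right away, but pop only when the node
// really is the last pending entry.
void emit_support(std::vector<Tinfo *> &pending, Tinfo *t) {
  if (!t->emitted) {
    emit(t);
    if (!pending.empty() && pending.back() == t)
      pending.pop_back();
    // Otherwise t stays in 'pending' although it is already emitted;
    // final_cleanup below removes it.
  }
}

// Mirrors the decl2.cc hunk: entries that are already emitted are removed
// without being emitted a second time.  Returns the number of emit() calls.
int final_cleanup(std::vector<Tinfo *> &pending) {
  int emitted_here = 0;
  for (std::size_t i = pending.size(); i-- > 0;) {
    Tinfo *t = pending[i];
    bool done = t->emitted;
    if (!done) {
      done = emit(t);
      ++emitted_here;
    }
    if (done) {
      pending[i] = pending.back();  // unordered remove
      pending.pop_back();
    }
  }
  return emitted_here;
}
```

If a node is emitted early while it is not the last pending entry, it stays in the vector until the final pass, which is the situation Jason asks to have documented in a comment.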





Re: [PATCH v2] c++: error with constexpr operator() [PR107939]

2023-03-07 Thread Marek Polacek via Gcc-patches
On Tue, Mar 07, 2023 at 09:53:28AM -0500, Jason Merrill wrote:
> On 3/6/23 17:01, Marek Polacek wrote:
> > On Mon, Mar 06, 2023 at 11:12:56AM -0500, Jason Merrill wrote:
> > > On 3/3/23 12:51, Marek Polacek wrote:
> > > > Similarly to PR107938, this also started with r11-557, whereby 
> > > > cp_finish_decl
> > > > can call check_initializer even in a template for a constexpr 
> > > > initializer.
> > > > 
> > > > Here we are rejecting
> > > > 
> > > > extern const Q q;
> > > > 
> > > > template
> > > > constexpr auto p = q(0);
> > > > 
> > > > even though q has a constexpr operator().  It's deemed non-const by
> > > > decl_maybe_constant_var_p because even though 'q' is const it is not
> > > > of integral/enum type.  I think the fix is for p_c_e to treat q(0) as
> > > > potentially-constant, as below.
> > > > 
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/12?
> > > > 
> > > > PR c++/107939
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * constexpr.cc (is_constexpr_function_object): New.
> > > > (potential_constant_expression_1): Treat an object with 
> > > > constexpr
> > > > operator() as potentially-constant.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/cpp1y/var-templ74.C: Remove dg-error.
> > > > * g++.dg/cpp1y/var-templ77.C: New test.
> > > > ---
> > > >gcc/cp/constexpr.cc  | 23 ++-
> > > >gcc/testsuite/g++.dg/cpp1y/var-templ74.C |  2 +-
> > > >gcc/testsuite/g++.dg/cpp1y/var-templ77.C | 14 ++
> > > >3 files changed, 37 insertions(+), 2 deletions(-)
> > > >create mode 100644 gcc/testsuite/g++.dg/cpp1y/var-templ77.C
> > > > 
> > > > diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> > > > index acf9847a4d1..7d786f332b4 100644
> > > > --- a/gcc/cp/constexpr.cc
> > > > +++ b/gcc/cp/constexpr.cc
> > > > @@ -8929,6 +8929,24 @@ check_for_return_continue (tree *tp, int 
> > > > *walk_subtrees, void *data)
> > > >  return NULL_TREE;
> > > >}
> > > > +/* Return true iff TYPE is a class with constexpr operator().  */
> > > > +
> > > > +static bool
> > > > +is_constexpr_function_object (tree type)
> > > > +{
> > > > +  if (!CLASS_TYPE_P (type))
> > > > +return false;
> > > > +
> > > > +  for (tree f = TYPE_FIELDS (type); f; f = DECL_CHAIN (f))
> > > > +if (TREE_CODE (f) == FUNCTION_DECL
> > > > +   && DECL_OVERLOADED_OPERATOR_P (f)
> > > > +   && DECL_OVERLOADED_OPERATOR_IS (f, CALL_EXPR)
> > > > +   && DECL_DECLARED_CONSTEXPR_P (f))
> > > > +  return true;
> > > > +
> > > > +  return false;
> > > > +}
> > > > +
> > > >/* Return true if T denotes a potentially constant expression.  Issue
> > > >   diagnostic as appropriate under control of FLAGS.  If WANT_RVAL 
> > > > is true,
> > > >   an lvalue-rvalue conversion is implied.  If NOW is true, we want 
> > > > to
> > > > @@ -9160,7 +9178,10 @@ potential_constant_expression_1 (tree t, bool 
> > > > want_rval, bool strict, bool now,
> > > >   }
> > > > else if (fun)
> > > >  {
> > > > -   if (RECUR (fun, rval))
> > > > +   if (VAR_P (fun)
> > > > +   && is_constexpr_function_object (TREE_TYPE (fun)))
> > > > + /* Could be an object with constexpr operator().  */;
> > > 
> > > I guess if fun is not a function pointer, we don't know if we're using it 
> > > as
> > > an lvalue or rvalue
> > 
> > Presumably the operator function could return this, making it an lvalue?
> > I'm not sure I'm really clear on this.
> 
> I mean just calling the operator uses the variable as an lvalue, by passing
> its address as 'this'.

Ah yeah, right.  Unless there's the && ref-qual etc.
 
> > > , so we want to pass 'any' for want_rval, which should
> > > make this work;
> > 
> > Yes, want_rval==false means that p_c_e/VAR_DECL will not issue the
> > hard error.
> > 
> > > I don't think we need to be specific about constexpr op(),
> > > as a constexpr conversion operator to fn* could also do the trick.
> > 
> > Ah, those surrogate classes.  I couldn't reproduce the problem with
> > them, though I'm adding a test for it anyway.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> OK, thanks.

Thanks.

Marek
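The lvalue point discussed above can be sketched in a few lines. This is a minimal illustration, not the testcase from the PR: the struct below is a hypothetical stand-in, and a plain (non-template) variable is used because the rejection reported in the PR is specific to the template case.

```cpp
#include <cassert>

// Hypothetical stand-in for the class Q in the PR; the real testcase differs.
struct Q {
  // Never reads a member of *this, so the call needs only the address of
  // the object, not its value.
  constexpr int operator()(int x) const { return x + 1; }
};

extern const Q q;  // const, but neither constexpr nor initialized here

// q(0) passes &q as 'this', which is an lvalue use of q.  No
// lvalue-to-rvalue conversion is applied to q itself, so the call can be
// a constant expression.
constexpr int p = q(0);

const Q q{};  // definition, so the example also links

int value_of_p() { return p; }
```

Since only the address of q is needed, treating the callee operand as "any" rather than as an rvalue matches how the object is actually used.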



Re: Enable UTF-8 code page in driver and compiler on 64-bit mingw host [PR108865]

2023-03-07 Thread Costas Argyris via Gcc-patches
Hi Jacek,

"but I think it should work just fine if you didn't explicitly limit the
patch to x86_64."

I would think so too.

Actually, even cygwin might benefit from this, assuming it has the same
problem; I don't know whether that is the case.

But I'm not experienced with that so I would like to explore these hosts
separately and just focus on the most common 64-bit Windows host with this
change, if possible.

"The point that when winnt-utf8.manifest is modified, utf8-mingw32.o should
be rebuilt."

Right, makes sense.

Just noting that winnt-utf8.manifest is really not meant to be modified,
because it is copied straight from:

https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

and will probably remain like that, but I do get your point and I am happy
to make the change.

Thanks,
Costas

On Tue, 7 Mar 2023 at 14:18, Jacek Caban  wrote:

> Hi Costas,
>
> On 3/7/23 15:00, Costas Argyris wrote:
> > Hi Jacek,
> >
> > "Is there a reason to make it specific to x86_64? It seems to me that
> > all mingw hosts could use it."
> >
> > Are you referring to the 32-bit host? My concern here is that this
> > functionality (embedding the UTF-8
> > manifest file into the executable) is only truly supported in recent
> > versions of Windows. From:
> >
> >
> https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
> >
> > It says that Windows Version 1903 (May 2019 Update) enables this, so
> > we are looking at the 64-bit
> > version of Windows.
> >
> > I suppose you are referring to the scenario where one has a 32-bit
> > gcc + mingw running in a 64-bit
> > Windows that is recent enough to support this? It is not clear to
> > me based on the above doc what
> > would happen encoding-wise in that situation, and I haven't tried it
> > either because I assumed that
> > most people would want the 64-bit version of gcc since they are
> > probably running a 64-bit OS.
> >
> > If you think it is useful, I could look into that as a separate task
> > to try and keep this one simple, if
> > that makes sense.
>
>
> Yes, realistically it's mostly about 32-bit gcc on 64-bit Windows
> (perhaps aarch64 as well at some point in the future). It's probably
> indeed not a very popular configuration these days, but I think it should
> work just fine if you didn't explicitly limit the patch to x86_64.
>
>
> > "I think that .manifest file should also be a dependency here."
> >
> > Why is that? Windres takes only the .rc file as its input, as per
> > its own doc, and it successfully
> > compiles it into an object file. The .manifest file is only
> > referenced by the .rc file, and it doesn't
> > get passed to windres, so I don't see why it has to be listed as a
> > prerequisite in the make rule.
>
>
> The point that when winnt-utf8.manifest is modified, utf8-mingw32.o
> should be rebuilt. Anyway, it's probably not a big deal (I should
> disclaim that I'm not very familiar with gcc build system; I'm mostly on
> this ML due to mingw-w64 contributions).
>
>
> Thanks,
>
> Jacek
>
>


Re: [Patch, fortran] PR37336 finalization

2023-03-07 Thread Steve Kargl via Gcc-patches
On Tue, Mar 07, 2023 at 03:58:32PM +0100, Thomas Koenig via Fortran wrote:
> Paul,
> 
> first of all, thank you very much indeed for the hard work you put into
> this!  This is a great step for gfortran.

Ditto**2

> > I can hurry this along to get the patch
> > into 13-branch or I can wait until 14-branch opens.
> 
> Personally, I think that this fixes so many bugs, and makes
> the compiler so much better, that I would prefer having it
> in gcc-13.  Finalization was only of very limited use before,
> and the risk of meaningful regressions (short of a build
> failure) is therefore very low.
> 

I agree with Thomas.  The main branch is in stage 4,
which is regression and documentation fixing mode.  I
would think the number of bugs fixed by your patch
can be argued as fixing regressions.  I can set aside 
some time on Saturday to help with a review (if required).

-- 
Steve


[committed] libstdc++: Fix comment typo in eh_personality.cc

2023-03-07 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* libsupc++/eh_personality.cc: Fix spelling in comment.
---
 libstdc++-v3/libsupc++/eh_personality.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/libsupc++/eh_personality.cc 
b/libstdc++-v3/libsupc++/eh_personality.cc
index d21a0858555..12391e563d6 100644
--- a/libstdc++-v3/libsupc++/eh_personality.cc
+++ b/libstdc++-v3/libsupc++/eh_personality.cc
@@ -95,7 +95,7 @@ get_ttype_entry (lsda_header_info *info, _uleb128_t i)
   i *= size_of_encoded_value (info->ttype_encoding);
   read_encoded_value_with_base (
 #if __FDPIC__
-   /* Force these flags to nake sure to
+   /* Force these flags to make sure to
   take the GOT into account.  */
(DW_EH_PE_pcrel | DW_EH_PE_indirect),
 #else
-- 
2.39.2



[committed] libstdc++: Fix symver for __gnu_cxx11_ieee128::__try_use_facet [PR108882]

2023-03-07 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux. Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

PR libstdc++/108882
* config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Adjust patterns to
not match symbols in namespace std::__gnu_cxx11_ieee128.
* config/os/gnu-linux/ldbl-ieee128-extra.ver: Add patterns for
std::__gnu_cxx11_ieee128::money_{get,put}.
---
 libstdc++-v3/config/abi/pre/gnu.ver | 3 ++-
 libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver 
b/libstdc++-v3/config/abi/pre/gnu.ver
index 34f23bcbce0..02a449a2f2f 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -2484,7 +2484,8 @@ GLIBCXX_3.4.31 {
 _ZSt8to_charsPcS_DF128_St12chars_formati;
 _ZSt10from_charsPKcS0_RDF128_St12chars_format;
 
-_ZSt15__try_use_facet*;
+_ZSt15__try_use_facetISt*;
+_ZSt15__try_use_facetINSt7__cxx11*;
 
 _ZNSt6chrono11reload_tzdbEv;
 _ZNSt6chrono8get_tzdbEv;
diff --git a/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver 
b/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver
index 17d696364c4..20d87a5e373 100644
--- a/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver
+++ b/libstdc++-v3/config/os/gnu-linux/ldbl-ieee128-extra.ver
@@ -53,6 +53,8 @@ GLIBCXX_IEEE128_3.4.30 {
 GLIBCXX_IEEE128_3.4.31 {
   _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1287num_get*;
   _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1287num_put*;
+  _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1289money_get*;
+  _ZSt15__try_use_facetINSt17__gnu_cxx_ieee1289money_put*;
   
_ZNSt19__gnu_cxx11_ieee1289money_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE2idE;
   
_ZNSt19__gnu_cxx11_ieee1289money_putI[cw]St19ostreambuf_iteratorI[cw]St11char_traitsI[cw]EEE2idE;
 } GLIBCXX_3.4.31;
-- 
2.39.2
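The effect of narrowing the GLIBCXX_3.4.31 pattern can be checked with a small sketch. POSIX fnmatch is used here only as an approximation of how the linker matches version-script globs, which is an assumption of this example; the ieee128 name is abbreviated from the symbols discussed for this PR, and the second name is an illustrative mangled name for an ordinary std facet.

```cpp
#include <cassert>
#include <fnmatch.h>

// Returns true if the glob PATTERN matches the mangled NAME.  fnmatch only
// approximates the matching done by the linker for version scripts.
bool glob_matches(const char *pattern, const char *name) {
  return fnmatch(pattern, name, 0) == 0;
}

// Leading part of a __try_use_facet specialization in
// std::__gnu_cxx11_ieee128 (tail abbreviated for illustration).
const char *const ieee128_sym =
    "_ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_getIc";

// Illustrative name for a specialization on an ordinary std facet.
const char *const std_sym =
    "_ZSt15__try_use_facetISt5ctypeIcEEPKT_RKSt6locale";

// The old pattern was too greedy: it also claimed the ieee128 symbol.
bool old_pattern_claims_ieee128() {
  return glob_matches("_ZSt15__try_use_facet*", ieee128_sym);
}

// Neither of the two new patterns matches the ieee128 symbol.
bool new_patterns_claim_ieee128() {
  return glob_matches("_ZSt15__try_use_facetISt*", ieee128_sym)
         || glob_matches("_ZSt15__try_use_facetINSt7__cxx11*", ieee128_sym);
}
```

With the narrower patterns the ieee128 specializations are no longer claimed by the generic version node, so they can be assigned by the ldbl-ieee128-extra.ver patterns instead.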



[PATCH]middle-end: On emergency dumps finish the graph generation.

2023-03-07 Thread Tamar Christina via Gcc-patches
Hi All,

When doing an emergency dump the cfg output dumps are corrupted because the
ending "}" is missing.

Normally when the pass manager finishes it would call finish_graph_dump_file to
produce this.  This is called here because each pass can dump multiple digraphs.

However during an emergency dump we only dump the current function and so after
that is done we never go back to the pass manager.

As such, we need to manually call finish_graph_dump_file in order to properly
finish off graph generation.

With this -ftree-dump-*-graph works properly during a crash dump.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

* passes.cc (emergency_dump_function): Finish graph generation.

--- inline copy of patch -- 
diff --git a/gcc/passes.cc b/gcc/passes.cc
index 
347214e81d0cfac05d9ba782db0eda1cdd7e9c87..38642a4010941b414a1ed1fd70a348778addbf60
 100644
--- a/gcc/passes.cc
+++ b/gcc/passes.cc
@@ -1845,6 +1845,13 @@ emergency_dump_function ()
   fprintf (dump_file, "\n\n\nEMERGENCY DUMP:\n\n");
   execute_function_dump (cfun, current_pass);
 
+  /* Normally the passmanager will close the graphs as a pass could be wanting
+ to print multiple digraphs. But during an emergency dump there can only be
+ one and we must finish the graph manually.  */
+  if ((cfun->curr_properties & PROP_cfg)
+  && (dump_flags & TDF_GRAPH))
+finish_graph_dump_file (dump_file_name);
+
   if (symtab && current_pass->type == IPA_PASS)
 symtab->dump (dump_file);
 }
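
The reasoning behind the patch above can be illustrated with a toy model. Everything here is a hypothetical stand-in: GraphDump keeps the text in a string instead of a dump file, and its finish member only plays the role that finish_graph_dump_file plays in passes.cc.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a graph dump file; the text is kept in memory.
struct GraphDump {
  std::string text;
  bool open = false;

  void start(const std::string &name) {
    text += "digraph \"" + name + "\" {\n";
    open = true;
  }
  void add_edge(int from, int to) {
    text += "  bb" + std::to_string(from) + " -> bb" + std::to_string(to)
            + ";\n";
  }
  // Writes the closing brace, but only if a graph is still open.
  void finish() {
    if (open) {
      text += "}\n";
      open = false;
    }
  }
};

// Normal path: the caller (the "pass manager" in this model) is expected
// to call finish() later, because more digraphs may follow.
void dump_function(GraphDump &g) {
  g.start("f");
  g.add_edge(0, 1);
}

// Emergency path: control never returns to that caller, so the graph has
// to be finished here or the output stays unterminated.
void emergency_dump(GraphDump &g) {
  dump_function(g);
  g.finish();
}
```

Without the explicit finish() call in the emergency path, the output ends after the last edge and tools reading the file see a truncated digraph.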









Re: [PATCH] libstdc++: Some baseline_symbols.txt updates

2023-03-07 Thread Jonathan Wakely via Gcc-patches
On Mon, 20 Feb 2023 at 11:54, Jakub Jelinek via Libstdc++
 wrote:
>
> Hi!

Sorry for the delay.

> This updates baseline_symbols.txt for the Fedora 39 arches.
> Most of the added symbols are added to all 6 files, exceptions are
> DF16_ rtti stuff (only added on x86 and aarch64 which supports those),
> DF16b rtti stuff (only x86 right now), _M_replace_cold (m vs. j
> differences), DF128_ charconv (only x86), GLIBCXX_IEEE128_3.4.31
> symver symbols (only ppc64), GLIBCXX_LDBL_3.4.31 symver (ppc64 and s390x),
> _M_get_sys_info/_M_get_local_info (l vs. x) and
>   1 
> +FUNC:_ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEPKT_RKSt6locale@@GLIBCXX_3.4.31
>   1 
> +FUNC:_ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEPKT_RKSt6locale@@GLIBCXX_3.4.31
>   1 
> +FUNC:_ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEPKT_RKSt6locale@@GLIBCXX_3.4.31
>   1 
> +FUNC:_ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEPKT_RKSt6locale@@GLIBCXX_3.4.31
> for those, I wonder why they aren't in GLIBCXX_IEEE128_3.4.31 symver...

PR 108882 is fixed now, so they're not in the GLIBCXX_3.4.31 symver now.

> I was using
> grep ^+ | sed 's/OBJECT:[0-9]*:/OBJECT:/' | sort | uniq -c | sort -n | less
> on the patch to analyze.
>
> Ok for trunk?

I guess you want to regenerate the powerpc64 ones now. The others are
all OK for trunk.


>
> 2023-02-20  Jakub Jelinek  
>
> * config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.
> * config/abi/post/x86_64-linux-gnu/32/baseline_symbols.txt: Update.
> * config/abi/post/i486-linux-gnu/baseline_symbols.txt: Update.
> * config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Update.
> * config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update.
> * config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Update.



Re: [PATCH] [RFC] RAII auto_mpfr and autp_mpz

2023-03-07 Thread Bernhard Reutner-Fischer via Gcc-patches
On Mon, 6 Mar 2023 11:29:30 + (UTC)
Richard Biener via Gcc-patches  wrote:

> On Mon, 6 Mar 2023, Jakub Jelinek wrote:
> 
> > On Mon, Mar 06, 2023 at 11:01:18AM +, Richard Biener wrote:  
> > > +  auto_mpfr &operator=(const auto_mpfr &) = delete;
> > > +  auto_mpz &operator=(const auto_mpz &) = delete;  
> > 
> > Just formatting nit, space before (.
> > 
> > Looks like nice improvement and thanks Jonathan for the suggestions ;)  
> 
> Good, I've queued it for stage1 unless fortran folks want to pick it
> up earlier for the purpose of fixing leaks.

You did not Cc the fortran list though, not sure if Tobias is aware of
this possibility.

While it's a nice idea, there have been resentments towards (visible)
C++ in the fortran frontend and especially the library, i think.
I do not know if this has changed in the meantime.

PS: not sure about libgfortran/intrinsics/trigd.inc
FTYPE s
in #ifdef SIND_SMALL_LITERAL
when this is true:
if (mpfr_cmp_ld (ax, SIND_SMALL_LITERAL) < 0)
ISTM we leak s in the return x;

This seems to happen both in SIND and TAND, from the looks.

PPS: without handling (mpfr|mpz)_clears properly, hence missing quite
some potential spots, i guess one could potentially discuss at least
$ git diff | diffstat -ulp1
gcc/builtins.cc
gcc/fold-const-call.cc
gcc/fortran/arith.cc
gcc/fortran/array.cc
gcc/fortran/check.cc
gcc/fortran/data.cc
gcc/fortran/dependency.cc
gcc/fortran/expr.cc
gcc/fortran/resolve.cc
gcc/fortran/simplify.cc
gcc/fortran/target-memory.cc
gcc/fortran/trans-array.cc
gcc/fortran/trans-intrinsic.cc
gcc/real.cc
gcc/rust/backend/rust-compile-expr.cc
gcc/tree-ssa-loop-niter.cc
gcc/ubsan.cc

See the spots highlighted in the attached patch, which i did not try to
compile. I did not filter out the files you patched already, nor
externally maintained files like rust (which seems to leak fval and
ival not only on errors, from a quick look).

That was from
find ./ \( -name "*.c[cp]" -o -name "*.[ch]" -o -name "*.inc" \) -a \( ! -path 
"./gcc/testsuite/*" -a ! -path "./gcc/contrib/*" \) -exec spatch --sp-file 
~/coccinelle/auto_mpfrz.2.cocci --in-place {} \;

thanks,
// mpz and mpfr -> auto_\1
// Remember to s/= [[:space:]]*XXXZXXX//g
// to get rid of the disguise-in-c hack
// TODO: properly handle (mpfr|mpz)_clears
@ mpfr_1 @
typedef mpfr_t;
identifier i;
@@
- mpfr_t i;
+ auto_mpfr i;
...
- mpfr_init (i);
...
(
- mpfr_clear (i);
|
mpfr_clears (
 ...,
- i,
 ...
 );
)

@ mpfr_2 @
typedef mpfr_t;
identifier i;
expression prec;
@@
- mpfr_t i;
+ auto_mpfr i = XXXZXXX(prec);
...
- mpfr_init2 (i, prec);
...
(
- mpfr_clear (i);
|
mpfr_clears (
 ...,
- i,
 ...
 );
)

@ mpfr_3 @
@@
- mpfr_clears(NULL);

/// mpz ///

@ mpz_1 @
typedef mpz_t;
identifier i;
@@
- mpz_t i;
+ auto_mpz i;
...
- mpz_init (i);
...
(
- mpz_clear (i);
|
mpz_clears (
 ...,
- i,
 ...
 );
)

@ mpz_2 @
typedef mpz_t;
identifier i;
expression prec;
@@
- mpz_t i;
+ auto_mpz i = XXXZXXX(prec);
...
- mpz_init2 (i, prec);
...
(
- mpz_clear (i);
|
mpz_clears (
 ...,
- i,
 ...
 );
)

@ mpz_3 @
@@
- mpz_clears(NULL);

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 4d467c8c5c1..9be0949aa1d 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -11064,15 +11064,13 @@ do_mpfr_lgamma_r (tree arg, tree arg_sg, tree type)
  const int prec = fmt->p;
  const mpfr_rnd_t rnd = fmt->round_towards_zero? MPFR_RNDZ : MPFR_RNDN;
  int inexact, sg;
- mpfr_t m;
+ auto_mpfr m (prec);
  tree result_lg;
 
- mpfr_init2 (m, prec);
  mpfr_from_real (m, ra, MPFR_RNDN);
  mpfr_clear_flags ();
  inexact = mpfr_lgamma (m, &sg, m, rnd);
  result_lg = do_mpfr_ckconv (m, type, inexact);
- mpfr_clear (m);
  if (result_lg)
{
  tree result_sg;
diff --git a/gcc/fold-const-call.cc b/gcc/fold-const-call.cc
index 43819c1f984..900c4069ddb 100644
--- a/gcc/fold-const-call.cc
+++ b/gcc/fold-const-call.cc
@@ -130,14 +130,13 @@ do_mpfr_arg1 (real_value *result,
 
   int prec = format->p;
   mpfr_rnd_t rnd = format->round_towards_zero ? MPFR_RNDZ : MPFR_RNDN;
-  mpfr_t m;
+  auto_mpfr m (prec);
 
-  mpfr_init2 (m, prec);
+  
   mpfr_from_real (m, arg, MPFR_RNDN);
   mpfr_clear_flags ();
   bool inexact = func (m, m, rnd);
   bool ok = do_mpfr_ckconv (result, m, inexact, format);
-  mpfr_clear (m);
 
   return ok;
 }
@@ -224,14 +223,13 @@ do_mpfr_arg2 (real_value *result,
 
   int prec = format->p;
   mpfr_rnd_t rnd = format->round_towards_zero ? MPFR_RNDZ : MPFR_RNDN;
-  mpfr_t m;
+  auto_mpfr m (prec);
 
-  mpfr_init2 (m, prec);
+  
   mpfr_from_real (m, arg1, MPFR_RNDN);
   mpfr_clear_flags ();
   bool inexact = func (m, arg0.to_shwi (), m, rnd);
   bool ok = do_mpfr_ckconv (result, m, inexact, format);
-  mpfr_clear (m);
 
   return ok;
 }
diff --git a/gcc/fortran/arith.cc b/gcc/fortran/arith.cc
index d0d1c0b03d2..5b14a833d9a 100644
--- a/gcc/fortran/arith.cc
+++ b/gcc/fortran/arith.cc
@@ -329,1

Re: [PATCH] [RFC] RAII auto_mpfr and autp_mpz

2023-03-07 Thread Jakub Jelinek via Gcc-patches
On Tue, Mar 07, 2023 at 07:51:03PM +0100, Bernhard Reutner-Fischer wrote:
> While it's a nice idea, there have been resentments towards (visible)
> C++ in the fortran frontend and especially the library, i think.

I thought libgfortran is written in C and Fortran and doesn't use gmp/mpfr,
so this doesn't apply to it (ok, intrinsics/trigd.inc uses mpfr_*-named
macros when built in the library, which do something different).

As for the FE, we don't need to change all places with manual
allocation/deallocation for those, though changing most of them
will help with maintainability as one doesn't have to care about leaks.
I only see 2 mpfr_init2 calls in Fortran FE though, so presumably
everything else could use visually C like auto_mpfr var;
instead of mpfr_t var + mpfr_init + mpfr_clear.

Jakub



Re: [PATCH] [RFC] RAII auto_mpfr and autp_mpz

2023-03-07 Thread Alexander Monakov via Gcc-patches
Hi,

On Mon, 6 Mar 2023, Richard Biener via Gcc-patches wrote:

> --- a/gcc/realmpfr.h
> +++ b/gcc/realmpfr.h
> @@ -24,6 +24,26 @@
>  #include 
>  #include 
>  
> +class auto_mpfr
> +{
> +public:
> +  auto_mpfr () { mpfr_init (m_mpfr); }
> +  explicit auto_mpfr (mpfr_prec_t prec) { mpfr_init2 (m_mpfr, prec); }
> +  ~auto_mpfr () { mpfr_clear (m_mpfr); }
> +
> +  operator mpfr_t& () { return m_mpfr; }
> +
> +  auto_mpfr (const auto_mpfr &) = delete;
> +  auto_mpfr &operator=(const auto_mpfr &) = delete;

Shouldn't this use the idiom suggested in ansidecl.h, i.e.

  private:
DISABLE_COPY_AND_ASSIGN (auto_mpfr);

Alexander
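
For readers following the thread, the RAII idea can be sketched without depending on MPFR. The handle type and the handle_init/handle_clear functions below are hypothetical stand-ins for mpfr_t, mpfr_init and mpfr_clear, and the copy operations are spelled out as deleted functions rather than through the ansidecl.h macro.

```cpp
#include <cassert>

// Hypothetical C-style resource standing in for mpfr_t, so the sketch has
// no library dependency.  The counter makes init/clear calls observable.
struct handle { int value; };
int live_handles = 0;
void handle_init(handle *h) { h->value = 0; ++live_handles; }
void handle_clear(handle *) { --live_handles; }

class auto_handle {
public:
  auto_handle() { handle_init(&m_h); }
  ~auto_handle() { handle_clear(&m_h); }

  // Implicit conversion lets the wrapper be passed to the C-style API.
  operator handle *() { return &m_h; }

  // Copying would clear the same resource twice, so it is disabled.
  auto_handle(const auto_handle &) = delete;
  auto_handle &operator=(const auto_handle &) = delete;

private:
  handle m_h;
};

// The early return has no explicit clear call; the destructor still runs.
int compute(bool bail_out_early) {
  auto_handle h;
  if (bail_out_early)
    return -1;
  static_cast<handle *>(h)->value = 42;
  return static_cast<handle *>(h)->value;
}
```

The early-return path is the kind of spot where a manual clear call is easy to forget, which is the leak scenario raised earlier in the thread.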


[PATCH] c++: static lambda tsubst [PR108526]

2023-03-07 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

A missed piece of the patch for static operator(): in tsubst_function_decl,
we don't want to replace the first parameter with a new closure pointer if
operator() is static.

PR c++/108526
PR c++/106651

gcc/cp/ChangeLog:

* pt.cc (tsubst_function_decl): Don't replace the closure
parameter if DECL_STATIC_FUNCTION_P.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/static-operator-call5.C: Pass -g.
---
 gcc/cp/pt.cc   | 4 ++--
 gcc/testsuite/g++.dg/cpp23/static-operator-call5.C | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 85136df1730..aafc99d12c3 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -14393,12 +14393,12 @@ tsubst_function_decl (tree t, tree args, 
tsubst_flags_t complain,
 DECL_NAME (r) = make_conv_op_name (TREE_TYPE (type));
 
   tree parms = DECL_ARGUMENTS (t);
-  if (closure)
+  if (closure && !DECL_STATIC_FUNCTION_P (t))
 parms = DECL_CHAIN (parms);
   parms = tsubst (parms, args, complain, t);
   for (tree parm = parms; parm; parm = DECL_CHAIN (parm))
 DECL_CONTEXT (parm) = r;
-  if (closure)
+  if (closure && !DECL_STATIC_FUNCTION_P (t))
 {
   tree tparm = build_this_parm (r, closure, type_memfn_quals (type));
   DECL_NAME (tparm) = closure_identifier;
diff --git a/gcc/testsuite/g++.dg/cpp23/static-operator-call5.C 
b/gcc/testsuite/g++.dg/cpp23/static-operator-call5.C
index ae022d0b971..f7ce8c03008 100644
--- a/gcc/testsuite/g++.dg/cpp23/static-operator-call5.C
+++ b/gcc/testsuite/g++.dg/cpp23/static-operator-call5.C
@@ -1,5 +1,6 @@
 // PR c++/108526
 // { dg-do compile { target c++23 } }
+// { dg-additional-options -g } PR108706
 
 template void f()
 {

base-commit: 247cacc9e381d666a492dfa4ed61b7b19e2d008f
-- 
2.31.1



HELP: Questions on multiple PROGRAM_SUMMARY sections in a profiling data file

2023-03-07 Thread Qing Zhao via Gcc-patches
Hi, Jan,

I have recently been studying a profile-feedback ICE with GCC 8.
It’s an assertion failure inside the routine “compute_working_sets” of gcov-io.c:

gcov_nonruntime_assert (ws_ix == NUM_GCOV_WORKING_SETS);

After some debugging and study, I found that the corresponding .gcda file has 
two PROGRAM_SUMMARY sections:

foo.gcda:  a300: 202:PROGRAM_SUMMARY checksum=0x91f3e3ae 
foo.gcda:counts=10831, runs=0, sum_all=478965, run_max=125615, 
sum_max=201126 
foo.gcda:counter histogram: 
foo.gcda: 0: num counts=10187, min counter=0, cum_counter=0 
… 
foo.gcda:51: num counts=1, min counter=14524, cum_counter=14524 
foo.gcda:63: num counts=1, min counter=125615, 
cum_counter=125615 
foo.gcda:  a300: 137:PROGRAM_SUMMARY checksum=0xcf9f0896 
foo.gcda:counts=10502, runs=1, sum_all=48618, run_max=13999, 
sum_max=14046 
foo.gcda:counter histogram: 
foo.gcda: 0: num counts=9821, min counter=0, cum_counter=0 
… 
foo.gcda:43: num counts=1, min counter=3830, cum_counter=3830 
foo.gcda:50: num counts=1, min counter=13999, cum_counter=13999 

It looks like the 2nd PROGRAM_SUMMARY has some issue. If I manually change
gcc/coverage.c to ignore the 2nd PROGRAM_SUMMARY section, the ICE disappears.

I have some questions about the profiling feedback data file:

1. Under what circumstances will there be multiple PROGRAM_SUMMARY sections
for one module?
2. How can I check whether one of the PROGRAM_SUMMARY sections has an issue?

Thanks a lot for any help.

Qing

[PATCH] libstdc++: extraneous begin in cartesian_product_view::end [PR107572]

2023-03-07 Thread Patrick Palka via Gcc-patches
ranges::begin() isn't guaranteed to be equality-preserving for
non-forward ranges, so in cartesian_product_view::end we need to be
careful about calling begin() on the first range (which could be
non-forward) in the (non-degenerate) case where __empty_tail is false.

Since we're already using a variadic lambda to compute __empty_tail, we
might as well use that same lambda to build up the tuple of iterators
instead of doing it via __tuple_transform.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

PR libstdc++/107572

libstdc++-v3/ChangeLog:

* include/std/ranges (cartesian_product_view::end): When
building the tuple of iterators, avoid calling ranges::begin on
the first range if __empty_tail is false.
* testsuite/std/ranges/cartesian_product/1.cc (test07): New test.
---
 libstdc++-v3/include/std/ranges   | 36 +--
 .../std/ranges/cartesian_product/1.cc | 22 
 2 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index e0cac15a64f..0de7bdef504 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -8078,26 +8078,42 @@ namespace views::__adaptor
 end() requires ((!__detail::__simple_view<_First> || ... || 
!__detail::__simple_view<_Vs>)
&& __detail::__cartesian_product_is_common<_First, _Vs...>)
 {
-  bool __empty_tail = [this](index_sequence<_Is...>) {
-   return (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
+  auto __it = [this](index_sequence<_Is...>) {
+   bool __empty_tail = (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
+   auto& __first = std::get<0>(_M_bases);
+   auto __first_it = __empty_tail
+ ? ranges::begin(__first)
+ : __detail::__cartesian_common_arg_end(__first);
+   // N.B. When implementing P2165R4 this should be changed to always 
return tuple.
+   if constexpr (sizeof...(_Is) == 1)
+ return std::make_pair(std::move(__first_it),
+ranges::begin(std::get<1 + _Is>(_M_bases))...);
+   else
+ return std::make_tuple(std::move(__first_it),
+ranges::begin(std::get<1 + _Is>(_M_bases))...);
   }(make_index_sequence{});
 
-  auto __it = __detail::__tuple_transform(ranges::begin, _M_bases);
-  if (!__empty_tail)
-   std::get<0>(__it) = 
__detail::__cartesian_common_arg_end(std::get<0>(_M_bases));
   return _Iterator{*this, std::move(__it)};
 }
 
 constexpr _Iterator
 end() const requires __detail::__cartesian_product_is_common
 {
-  bool __empty_tail = [this](index_sequence<_Is...>) {
-   return (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
+  auto __it = [this](index_sequence<_Is...>) {
+   bool __empty_tail = (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
+   auto& __first = std::get<0>(_M_bases);
+   auto __first_it = __empty_tail
+ ? ranges::begin(__first)
+ : __detail::__cartesian_common_arg_end(__first);
+   // N.B. When implementing P2165R4 this should be changed to always 
return tuple.
+   if constexpr (sizeof...(_Is) == 1)
+ return std::make_pair(std::move(__first_it),
+ranges::begin(std::get<1 + _Is>(_M_bases))...);
+   else
+ return std::make_tuple(std::move(__first_it),
+ranges::begin(std::get<1 + _Is>(_M_bases))...);
   }(make_index_sequence{});
 
-  auto __it = __detail::__tuple_transform(ranges::begin, _M_bases);
-  if (!__empty_tail)
-   std::get<0>(__it) = 
__detail::__cartesian_common_arg_end(std::get<0>(_M_bases));
   return _Iterator{*this, std::move(__it)};
 }
 
diff --git a/libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc 
b/libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc
index 1ec4422e6f3..34639f514aa 100644
--- a/libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc
+++ b/libstdc++-v3/testsuite/std/ranges/cartesian_product/1.cc
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -178,6 +179,26 @@ test06()
   return true;
 }
 
+void
+test07()
+{
+  // PR libstdc++/107572
+  static std::istringstream ints("0 1 2 3 4");
+  struct istream_range
+  {
+auto begin() { return std::istream_iterator{ints}; }
+auto end() { return std::istream_iterator{}; }
+  };
+  istream_range r;
+  int i = 0;
+  for (auto [v] : views::cartesian_product(r))
+{
+  VERIFY( v == i );
+  ++i;
+};
+  VERIFY( i == 5 );
+}
+
 int
 main()
 {
@@ -187,4 +208,5 @@ main()
   test04();
   test05();
   static_assert(test06());
+  test07();
 }
-- 
2.40.0.rc0.57.g454dfcbddf
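
For readers following the PR, here is a minimal sketch (not part of the patch) of why an extra begin() on a single-pass range is observable. It assumes that std::istream_iterator extracts a value when it is constructed from a stream, which is what C++20 specifies and what libstdc++ does; the helper name values_seen_by_two_begins is made up for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <utility>

// Illustration only: every "begin()" on an input range built over a stream
// constructs a fresh istream_iterator, and (by assumption, see above)
// construction extracts one value from the stream.
inline std::pair<int, int>
values_seen_by_two_begins ()
{
  std::istringstream ints ("0 1 2 3 4");
  std::istream_iterator<int> first (ints);   // extracts 0
  std::istream_iterator<int> second (ints);  // extracts 1: the stream moved on
  // Each iterator caches the value it extracted at construction.
  return {*first, *second};
}
```

The second "begin" has consumed an element that the first iterator will never see, which is the kind of side effect the patch avoids by not calling ranges::begin on the first range when __empty_tail is false.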



Re: [PATCH] libstdc++: extraneous begin in cartesian_product_view::end [PR107572]

2023-03-07 Thread Patrick Palka via Gcc-patches
On Tue, 7 Mar 2023, Patrick Palka wrote:

> ranges::begin() isn't guaranteed to be equality-preserving for
> non-forward ranges, so in cartesian_product_view::end we need to be
> careful about calling begin() on the first range (which could be
> non-forward) in the (non-degenerate) case where __empty_tail is false.
> 
> Since we're already using a variadic lambda to compute __empty_tail, we
> might as well use that same lambda to build up the tuple of iterators
> instead of doing it via __tuple_transform.
> 
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
> 
>   PR libstdc++/107572
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/ranges (cartesian_product_view::end): When
>   building the tuple of iterators, avoid calling ranges::begin on
>   the first range if __empty_tail is false.
>   * testsuite/std/ranges/cartesian_product/1.cc (test07): New test.
> ---
>  libstdc++-v3/include/std/ranges   | 36 +--
>  .../std/ranges/cartesian_product/1.cc | 22 
>  2 files changed, 48 insertions(+), 10 deletions(-)
> 
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index e0cac15a64f..0de7bdef504 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -8078,26 +8078,42 @@ namespace views::__adaptor
>  end() requires ((!__detail::__simple_view<_First> || ... || 
> !__detail::__simple_view<_Vs>)
>   && __detail::__cartesian_product_is_common<_First, _Vs...>)
>  {
> -  bool __empty_tail = [this](index_sequence<_Is...>) {
> - return (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
> +  auto __it = [this](index_sequence<_Is...>) {
> + bool __empty_tail = (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
> + auto& __first = std::get<0>(_M_bases);
> + auto __first_it = __empty_tail
> +   ? ranges::begin(__first)
> +   : __detail::__cartesian_common_arg_end(__first);
> + // N.B. When implementing P2165R4 this should be changed to always 
> return tuple.
> + if constexpr (sizeof...(_Is) == 1)
> +   return std::make_pair(std::move(__first_it),
> +  ranges::begin(std::get<1 + _Is>(_M_bases))...);
> + else
> +   return std::make_tuple(std::move(__first_it),
> +  ranges::begin(std::get<1 + _Is>(_M_bases))...);

On second thought, it might be better to use __tuple_or_pair_t here
instead of manually determining whether to use a pair or tuple, so that
we don't forget to adjust this site when implementing P2165R4 (which
removes __tuple_or_pair_t):

-- >8 --

Subject: [PATCH] libstdc++: extraneous begin in cartesian_product_view::end
 [PR107572]

ranges::begin() isn't guaranteed to be equality-preserving for
non-forward ranges, so in cartesian_product_view::end we need to avoid
calling begin() on the first range (which could be non-forward) in the
(non-degenerate) case where __empty_tail is false.

Since we're already using a variadic lambda to compute __empty_tail, we
might as well use that same lambda to build up the tuple of iterators
instead of doing it separately via __tuple_transform.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

PR libstdc++/107572

libstdc++-v3/ChangeLog:

* include/std/ranges (cartesian_product_view::end): When
building the tuple of iterators, avoid calling ranges::begin on
the first range if __empty_tail is false.
* testsuite/std/ranges/cartesian_product/1.cc (test07): New test.
---
 libstdc++-v3/include/std/ranges   | 28 ---
 .../std/ranges/cartesian_product/1.cc | 25 +
 2 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index e0cac15a64f..d2ab79179ca 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -8078,26 +8078,34 @@ namespace views::__adaptor
 end() requires ((!__detail::__simple_view<_First> || ... || 
!__detail::__simple_view<_Vs>)
&& __detail::__cartesian_product_is_common<_First, _Vs...>)
 {
-  bool __empty_tail = [this](index_sequence<_Is...>) {
-   return (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
+  auto __it = [this](index_sequence<_Is...>) {
+   using _Ret = __detail::__tuple_or_pair_t,
+iterator_t<_Vs>...>;
+   bool __empty_tail = (ranges::empty(std::get<1 + _Is>(_M_bases)) || ...);
+   auto& __first = std::get<0>(_M_bases);
+   return _Ret{(__empty_tail
+? ranges::begin(__first)
+: __detail::__cartesian_common_arg_end(__first)),
+   ranges::begin(std::get<1 + _Is>(_M_bases))...};
   }(make_index_sequence{});
 
-  auto __it = __detail::__tuple_transform(ranges::begin, _M_bases);
-  if (!__empty_tail)
-   s

Re: [PATCH] [RFC] RAII auto_mpfr and autp_mpz

2023-03-07 Thread Jonathan Wakely via Gcc-patches
On Tue, 7 Mar 2023 at 19:15, Alexander Monakov wrote:
>
> Hi,
>
> On Mon, 6 Mar 2023, Richard Biener via Gcc-patches wrote:
>
> > --- a/gcc/realmpfr.h
> > +++ b/gcc/realmpfr.h
> > @@ -24,6 +24,26 @@
> >  #include 
> >  #include 
> >
> > +class auto_mpfr
> > +{
> > +public:
> > +  auto_mpfr () { mpfr_init (m_mpfr); }
> > +  explicit auto_mpfr (mpfr_prec_t prec) { mpfr_init2 (m_mpfr, prec); }
> > +  ~auto_mpfr () { mpfr_clear (m_mpfr); }
> > +
> > +  operator mpfr_t& () { return m_mpfr; }
> > +
> > +  auto_mpfr (const auto_mpfr &) = delete;
> > +  auto_mpfr &operator=(const auto_mpfr &) = delete;
>
> Shouldn't this use the idiom suggested in ansidecl.h, i.e.
>
>   private:
> DISABLE_COPY_AND_ASSIGN (auto_mpfr);


Why? A macro like that (or a base class like boost::noncopyable) has
some value in a code base that wants to work for both C++03 and C++11
(or later). But in GCC we know we have C++11 now, so we can just
delete members. I don't see what the macro adds.



Re: [PATCH] [RFC] RAII auto_mpfr and autp_mpz

2023-03-07 Thread Alexander Monakov via Gcc-patches


On Tue, 7 Mar 2023, Jonathan Wakely wrote:

> > Shouldn't this use the idiom suggested in ansidecl.h, i.e.
> >
> >   private:
> > DISABLE_COPY_AND_ASSIGN (auto_mpfr);
> 
> 
> Why? A macro like that (or a base class like boost::noncopyable) has
> some value in a code base that wants to work for both C++03 and C++11
> (or later). But in GCC we know we have C++11 now, so we can just
> delete members. I don't see what the macro adds.

Evidently it's possible to forget to delete one of the members, as
showcased in this very thread.

The idiom is also slightly easier to read.

Alexander
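
To make the two positions in this subthread concrete, here is a small self-contained sketch. It is not the proposed realmpfr.h code: demo_handle, demo_init and demo_clear are hypothetical stand-ins for a C type with init/clear functions, and DEMO_DISABLE_COPY_AND_ASSIGN is an illustration-only macro written in the spirit of the ansidecl.h idiom, not a copy of it.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a C handle with explicit init/clear calls.
struct demo_handle { int live; };
inline void demo_init (demo_handle *h) { h->live = 1; }
inline void demo_clear (demo_handle *h) { h->live = 0; }

// Illustration-only macro; deletes both copy operations in one place.
#define DEMO_DISABLE_COPY_AND_ASSIGN(TYPE) \
  TYPE (const TYPE &) = delete;            \
  TYPE &operator= (const TYPE &) = delete

// Variant 1: both copy members deleted explicitly (plain C++11).
class auto_demo
{
public:
  auto_demo () { demo_init (&m_h); }
  ~auto_demo () { demo_clear (&m_h); }
  operator demo_handle * () { return &m_h; }
  auto_demo (const auto_demo &) = delete;
  auto_demo &operator= (const auto_demo &) = delete;
private:
  demo_handle m_h;
};

// Variant 2: the same effect through the macro.
class auto_demo_macro
{
public:
  auto_demo_macro () { demo_init (&m_h); }
  ~auto_demo_macro () { demo_clear (&m_h); }
private:
  DEMO_DISABLE_COPY_AND_ASSIGN (auto_demo_macro);
  demo_handle m_h;
};

// Variant 3: the slip discussed above. Only the copy constructor is
// deleted, so the implicit copy assignment operator still exists.
class auto_demo_forgot
{
public:
  auto_demo_forgot () { demo_init (&m_h); }
  ~auto_demo_forgot () { demo_clear (&m_h); }
  auto_demo_forgot (const auto_demo_forgot &) = delete;
private:
  demo_handle m_h;
};
```

With the first two variants neither copy operation is available; with the third, std::is_copy_assignable still reports true, which is the kind of omission a macro or a careful review is meant to catch.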


Ping: [PATCH] libcpp: Fix ICE on directive inside _Pragma() operator [PR67046]

2023-03-07 Thread Lewis Hyatt via Gcc-patches
Hello-

May I please ping this short patch that fixes an old bug? Thanks...

-Lewis

On Sat, Jan 14, 2023 at 1:46 PM Lewis Hyatt wrote:
>
> get__Pragma_string() in directives.cc is responsible for lexing the parens
> and the string argument from a _Pragma("...") operator. This function does
> not handle the case when the closing paren is not on the same line as the
> string; in that case, libcpp will by default reuse the token buffer it
> previously used for the string, so that the string token returned by
> get__Pragma_string() may be corrupted, as shown in the testcase. Fix using
> the existing keep_tokens mechanism that temporarily disables the reuse of
> token buffers.
>
> libcpp/ChangeLog:
>
> PR preprocessor/67046
> * directives.cc (_cpp_do__Pragma): Increment pfile->keep_tokens to
> ensure the returned string token is valid.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/67046
> * c-c++-common/cpp/pr67046.c: New test.
> ---
>
> Notes:
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67046
>
> This fixes an old ICE in libcpp that can happen when lexing the tokens 
> from a
> _Pragma operator. Bootstrapped+tested on x86-64 Linux with no
> regressions. Please let me know if it's OK? Thanks...
>
> -Lewis
>
>  gcc/testsuite/c-c++-common/cpp/pr67046.c | 10 ++
>  libcpp/directives.cc |  5 +
>  2 files changed, 15 insertions(+)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pr67046.c
>
> diff --git a/gcc/testsuite/c-c++-common/cpp/pr67046.c 
> b/gcc/testsuite/c-c++-common/cpp/pr67046.c
> new file mode 100644
> index 000..f37f20c624e
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/pr67046.c
> @@ -0,0 +1,10 @@
> +/* { dg-do preprocess } */
> +
> +_Pragma(
> +"message(\"msg\")"
> +)
> +
> +_Pragma(
> +"message(\"msg\")"
> +#
> +)
> diff --git a/libcpp/directives.cc b/libcpp/directives.cc
> index 9dc4363c65a..ffd262bce7d 100644
> --- a/libcpp/directives.cc
> +++ b/libcpp/directives.cc
> @@ -1996,7 +1996,12 @@ destringize_and_run (cpp_reader *pfile, const 
> cpp_string *in,
>  int
>  _cpp_do__Pragma (cpp_reader *pfile, location_t expansion_loc)
>  {
> +  /* Make sure we don't invalidate the string token, if the closing 
> parenthesis
> +   ended up on a different line.  */
> +  ++pfile->keep_tokens;
>const cpp_token *string = get__Pragma_string (pfile);
> +  --pfile->keep_tokens;
> +
>pfile->directive_result.type = CPP_PADDING;
>
>if (string)
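
The buffer-reuse hazard described above can be modelled with a toy lexer. This is only a sketch under loud assumptions: toy_lexer, start_line, lex and pragma_string are invented for illustration and do not reflect libcpp's actual data structures; the only idea borrowed from the patch is a keep_tokens counter that suppresses reuse of token storage while it is nonzero.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Toy model, not libcpp: tokens live in a small pool that is recycled at
// the start of each line unless keep_tokens is nonzero.
struct toy_lexer
{
  std::array<std::string, 8> pool;
  std::size_t cur = 0;
  int keep_tokens = 0;

  // Beginning a new line rewinds the pool when nobody asked to keep tokens.
  void start_line () { if (keep_tokens == 0) cur = 0; }

  // Store a token in the next slot and hand out a pointer to it.
  const std::string *lex (const std::string &text)
  {
    pool[cur] = text;
    return &pool[cur++];
  }
};

// Lex "(", a string and ")" on three separate lines, then report what the
// pointer obtained for the string token refers to afterwards.
inline std::string
pragma_string (bool keep)
{
  toy_lexer lx;
  if (keep)
    ++lx.keep_tokens;
  lx.start_line ();
  lx.lex ("(");
  lx.start_line ();
  const std::string *str = lx.lex ("\"message\"");
  lx.start_line ();
  lx.lex (")");   // without keep_tokens this overwrites the string's slot
  if (keep)
    --lx.keep_tokens;
  return *str;
}
```

In this model the string token survives only when the counter is raised around the lexing calls, which mirrors the increment/decrement pair added in _cpp_do__Pragma.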


Re: [PATCH] [RFC] RAII auto_mpfr and autp_mpz

2023-03-07 Thread Jonathan Wakely via Gcc-patches
On Tue, 7 Mar 2023 at 21:52, Alexander Monakov wrote:
>
>
> On Tue, 7 Mar 2023, Jonathan Wakely wrote:
>
> > > Shouldn't this use the idiom suggested in ansidecl.h, i.e.
> > >
> > >   private:
> > > DISABLE_COPY_AND_ASSIGN (auto_mpfr);
> >
> >
> > Why? A macro like that (or a base class like boost::noncopyable) has
> > some value in a code base that wants to work for both C++03 and C++11
> > (or later). But in GCC we know we have C++11 now, so we can just
> > delete members. I don't see what the macro adds.
>
> Evidently it's possible to forget to delete one of the members, as
> showcased in this very thread.

But easily caught by review.

> The idiom is also slightly easier to read.

That's a matter of opinion, I prefer the idiomatic C++ code to a SHOUTY MACRO.



Re: [PATCH] [RFC] RAII auto_mpfr and autp_mpz

2023-03-07 Thread Marek Polacek via Gcc-patches
On Tue, Mar 07, 2023 at 09:54:08PM +, Jonathan Wakely via Gcc-patches wrote:
> On Tue, 7 Mar 2023 at 21:52, Alexander Monakov wrote:
> >
> >
> > On Tue, 7 Mar 2023, Jonathan Wakely wrote:
> >
> > > > Shouldn't this use the idiom suggested in ansidecl.h, i.e.
> > > >
> > > >   private:
> > > > DISABLE_COPY_AND_ASSIGN (auto_mpfr);
> > >
> > >
> > > Why? A macro like that (or a base class like boost::noncopyable) has
> > > some value in a code base that wants to work for both C++03 and C++11
> > > (or later). But in GCC we know we have C++11 now, so we can just
> > > delete members. I don't see what the macro adds.
> >
> > Evidently it's possible to forget to delete one of the members, as
> > showcased in this very thread.
> 
> But easily caught by review.
> 
> > The idiom is also slightly easier to read.
> 
> That's a matter of opinion, I prefer the idiomatic C++ code to a SHOUTY MACRO.

FWIW, I'd also prefer to see the explicit =deletes rather than having to
go look up what exactly the macro does.

Marek



Re: Fwd: Bugzilla Bug 81649 [PATCH]: Clarify LeakSanitizer in documentation

2023-03-07 Thread Sandra Loosemore

On 3/1/23 05:53, Jonny Grant wrote:

Hello
I don't have write access, could someone review and apply this please?
Kind regards
Jonny


Looks good; I've gone ahead and pushed it for you.

-Sandra




Re: [ping][PATCH 1/1] docs: Add link to gmplib.org

2023-03-07 Thread Sandra Loosemore

On 1/11/23 07:57, Benson Muite via Gcc-patches wrote:

Improvement to documentation from a new contributor without commit rights.

On 1/5/23 06:38, Benson Muite wrote:

Link is missing from install documentation


Thanks, I've pushed this patch.

-Sandra


Re: [PATCH v2 0/5] A small Texinfo refinement

2023-03-07 Thread Sandra Loosemore

On 2/23/23 03:27, Arsen Arsenović via Gcc-patches wrote:

I've rerendered the updated documentation with latest development
Texinfo (as some of the changes I made for the purposes of the GCC
manual still aren't in releases) at:

   https://www.aarsen.me/~arsen/final/


Ummm.  I don't think GCC's documentation should depend on an unreleased 
version of Texinfo.  Currently install.texi documents that version 4.7 
or later is required, 4.8 for "make pdf"; did I miss something in your 
patch set that bumps this requirement?  Exactly what features do you 
depend on that are not yet supported by an official Texinfo release?


-Sandra



[PATCH RFC] c++: lambda mangling alias issues [PR107897]

2023-03-07 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu.  Does this look good, or do we want to factor the
flag clearing into a symtab_node counterpart to cgraph_node::reset?

-- 8< --

In 107897, by the time we are looking at the mangling clash, the
alias has already been removed from the symbol table by analyze_functions,
so we can't look at n->cpp_implicit_alias.  So just assume that it's an
alias if it's internal.

In 108887 the problem is that removing the mangling alias from the symbol
table confuses analyze_functions, because it ended up as first_analyzed
somehow, so it becomes a dangling pointer.  Fixed by clearing various flags
to neutralize the alias.

PR c++/107897
PR c++/108887

gcc/cp/ChangeLog:

* decl2.cc (record_mangling): Improve symbol table handling.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-lambda3.C: Use -flto if supported.
* g++.dg/cpp0x/lambda/lambda-mangle7.C: New test.
---
 gcc/cp/decl2.cc   | 25 +--
 .../g++.dg/cpp0x/lambda/lambda-mangle7.C  | 70 +++
 gcc/testsuite/g++.dg/cpp2a/concepts-lambda3.C |  1 +
 3 files changed, 91 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-mangle7.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 387e24542cd..e6e58b08de4 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -4742,15 +4742,30 @@ record_mangling (tree decl, bool need_warning)
 = mangled_decls->find_slot_with_hash (id, IDENTIFIER_HASH_VALUE (id),
  INSERT);
 
-  /* If this is already an alias, remove the alias, because the real
+  /* If this is already an alias, cancel the alias, because the real
  decl takes precedence.  */
   if (*slot && DECL_ARTIFICIAL (*slot) && DECL_IGNORED_P (*slot))
-if (symtab_node *n = symtab_node::get (*slot))
-  if (n->cpp_implicit_alias)
+{
+  if (symtab_node *n = symtab_node::get (*slot))
{
- n->remove ();
- *slot = NULL_TREE;
+ if (n->cpp_implicit_alias)
+   {
+ /* Actually removing the node isn't safe if other code is already
+holding a pointer to it, so just neutralize it.  */
+ n->remove_from_same_comdat_group ();
+ n->analyzed = false;
+ n->definition = false;
+ n->alias = false;
+ n->cpp_implicit_alias = false;
+   }
}
+  else
+   /* analyze_functions might have already removed the alias from the
+  symbol table if it's internal.  */
+   gcc_checking_assert (!TREE_PUBLIC (*slot));
+
+  *slot = NULL_TREE;
+}
 
   if (!*slot)
 *slot = decl;
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-mangle7.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-mangle7.C
new file mode 100644
index 000..c7946a2be08
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-mangle7.C
@@ -0,0 +1,70 @@
+// PR c++/108887
+// { dg-do compile { target c++11 } }
+
+template  struct integral_constant {
+  static constexpr int value = __v;
+};
+using false_type = integral_constant;
+template  struct __result_of_impl;
+template 
+struct __result_of_impl {
+  typedef decltype(0) type;
+};
+template 
+struct __invoke_result
+: __result_of_impl {};
+template 
+void __invoke_impl(_Fn __f, _Args... __args) {
+  __f(__args...);
+}
+template 
+void __invoke_r(_Callable __fn, _Args... __args) {
+  using __result = __invoke_result<_Args...>;
+  using __type = typename __result::type;
+  __invoke_impl<__type>(__fn, __args...);
+}
+struct QString {
+  QString(const char *);
+};
+template  class function;
+template  struct _Base_manager {
+  static _Functor _M_get_pointer(int) { __builtin_abort (); }
+};
+template  class _Function_handler;
+template 
+struct _Function_handler<_Res(_ArgTypes...), _Functor> {
+  using _Base = _Base_manager<_Functor>;
+  static _Res _M_invoke(const int &__functor, _ArgTypes &&...__args) {
+auto __trans_tmp_1 = _Base::_M_get_pointer(__functor);
+__invoke_r<_Res>(__trans_tmp_1, __args...);
+  }
+};
+template 
+struct function<_Res(_ArgTypes...)> {
+  template 
+  using _Handler = _Function_handler<_Res(_ArgTypes...), _Functor>;
+  template  function(_Functor) {
+using _My_handler = _Handler<_Functor>;
+_M_invoker = _My_handler::_M_invoke;
+  }
+  using _Invoker_type = _Res (*)(const int &, _ArgTypes &&...);
+  _Invoker_type _M_invoker;
+};
+struct QRegularExpression {
+  QRegularExpression(QString);
+};
+struct AbstractAccount {
+  void get(function,
+   function);
+};
+struct AbstractTimelineModel {
+  AbstractAccount m_account;
+};
+struct LinkPaginationTimelineModel : AbstractTimelineModel {
+  void fillTimeline();
+};
+void LinkPaginationTimelineModel::fillTimeline() {
+  [] {};
+  m_account.get([](AbstractAccount *) { static QRegularExpression re(""); },
+[](AbstractAccount *) {});
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda3.C 
b/gc

[PATCH] RISC-V: Fine tune merge operand constraint for integer/load/store

2023-03-07 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Split indexed load 
patterns according to RVV ISA.
* config/riscv/vector-iterators.md: New iterators.
* config/riscv/vector.md 
(@pred_indexed_load): Remove.
(@pred_indexed_load_same_eew): New pattern.
(@pred_indexed_load_x2_greater_eew): Ditto.
(@pred_indexed_load_x4_greater_eew): Ditto.
(@pred_indexed_load_x8_greater_eew): Ditto.
(@pred_indexed_load_x2_smaller_eew): Ditto.
(@pred_indexed_load_x4_smaller_eew): Ditto.
(@pred_indexed_load_x8_smaller_eew): Ditto.
(@pred_indexed_load): Remove.
(@pred_indexed_load): Ditto.
(@pred_indexed_load): Ditto.
(@pred_indexed_load): Ditto.
(@pred_indexed_load): Ditto.
(@pred_indexed_load): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/merge_constraint-1.c: New test.
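
As a reading aid for the pattern split listed above, here is a hedged sketch of the selection rule the builtin expander in this patch applies: compare the data element width (EEW) with the index element width and pick a variant by their ratio. The function name indexed_load_variant and its string results are hypothetical; the strings only echo the pattern suffixes from the ChangeLog, and the "unsupported" result stands in for the gcc_unreachable () default.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: dst_eew_bits is the element width of the data
// vector, src_eew_bits the element width of the index vector.
inline std::string
indexed_load_variant (unsigned dst_eew_bits, unsigned src_eew_bits)
{
  if (dst_eew_bits == src_eew_bits)
    return "same_eew";

  const bool greater = dst_eew_bits > src_eew_bits;
  const unsigned factor = greater ? dst_eew_bits / src_eew_bits
                                  : src_eew_bits / dst_eew_bits;
  switch (factor)
    {
    case 2:
    case 4:
    case 8:
      return "x" + std::to_string (factor)
             + (greater ? "_greater_eew" : "_smaller_eew");
    default:
      // The patch treats any other ratio as unreachable.
      return "unsupported";
    }
}
```

For example, 64-bit data indexed by 8-bit offsets maps to the x8_greater_eew variant, and 8-bit data indexed by 32-bit offsets maps to x4_smaller_eew.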

---
 .../riscv/riscv-vector-builtins-bases.cc  |   54 +-
 gcc/config/riscv/vector-iterators.md  |  214 ++-
 gcc/config/riscv/vector.md| 1243 +
 .../riscv/rvv/base/merge_constraint-1.c   |  204 +++
 4 files changed, 1065 insertions(+), 650 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/merge_constraint-1.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index 532b2edbf2e..9f87f8c645a 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -129,9 +129,57 @@ public:
code_for_pred_indexed_store (unspec, e.vector_mode (),
 e.index_mode ()));
else
- return e.use_exact_insn (
-   code_for_pred_indexed_load (unspec, e.vector_mode (),
-   e.index_mode ()));
+ {
+   unsigned src_eew_bitsize
+ = GET_MODE_BITSIZE (GET_MODE_INNER (e.index_mode ()));
+   unsigned dst_eew_bitsize
+ = GET_MODE_BITSIZE (GET_MODE_INNER (e.vector_mode ()));
+   if (dst_eew_bitsize == src_eew_bitsize)
+ return e.use_exact_insn (
+   code_for_pred_indexed_load_same_eew (unspec, e.vector_mode ()));
+   else if (dst_eew_bitsize > src_eew_bitsize)
+ {
+   unsigned factor = dst_eew_bitsize / src_eew_bitsize;
+   switch (factor)
+ {
+ case 2:
+   return e.use_exact_insn (
+ code_for_pred_indexed_load_x2_greater_eew (
+   unspec, e.vector_mode ()));
+ case 4:
+   return e.use_exact_insn (
+ code_for_pred_indexed_load_x4_greater_eew (
+   unspec, e.vector_mode ()));
+ case 8:
+   return e.use_exact_insn (
+ code_for_pred_indexed_load_x8_greater_eew (
+   unspec, e.vector_mode ()));
+ default:
+   gcc_unreachable ();
+ }
+ }
+   else
+ {
+   unsigned factor = src_eew_bitsize / dst_eew_bitsize;
+   switch (factor)
+ {
+ case 2:
+   return e.use_exact_insn (
+ code_for_pred_indexed_load_x2_smaller_eew (
+   unspec, e.vector_mode ()));
+ case 4:
+   return e.use_exact_insn (
+ code_for_pred_indexed_load_x4_smaller_eew (
+   unspec, e.vector_mode ()));
+ case 8:
+   return e.use_exact_insn (
+ code_for_pred_indexed_load_x8_smaller_eew (
+   unspec, e.vector_mode ()));
+ default:
+   gcc_unreachable ();
+ }
+ }
+ }
   }
 else if (LST_TYPE == LST_STRIDED)
   {
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index 4dea46f4470..d44943ae7c3 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -99,6 +99,65 @@
   (VNx8DF "TARGET_VECTOR_ELEN_FP_64")
 ])
 
+(define_mode_iterator VEEWEXT2 [
+  VNx1HI VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32")
+  VNx1SI VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx1DI "TARGET_MIN_VLEN > 32") (VNx2DI "TARGET_MIN_VLEN > 32")
+  (VNx4DI "TARGET_MIN_VLEN > 32") (VNx8DI "TARGET_MIN_VLEN > 32")
+  (VNx1SF "TARGET_VECTOR_ELEN_FP_32")
+  (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
+  (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
+  (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
+  (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
+  (VNx1DF "TARGET_VECTOR_ELEN_FP_64")
+  (VNx2DF "TARGET_VECTOR_ELEN_FP_64")
+  (VNx4DF "TARGET_VECTOR_ELEN_FP_64

[PATCH v3 0/6] RISC-V: autovec: Add auto-vectorization support

2023-03-07 Thread Michael Collison
This series of patches adds foundational support for RISC-V auto-vectorization.
The patches are based on the current upstream RVV vector intrinsic support and
are not a new implementation. Most of the implementation consists of adding the
new vector cost model, the autovectorization patterns themselves and target
hooks. This implementation only provides support for integer addition and
subtraction as a proof of concept. This patch set should not be construed to be
feature complete. Based on conversations with the community, these patches are
intended to lay the groundwork for feature completion and collaboration within
the RISC-V community.

These patches are largely based on the work of Juzhe Zhong
(juzhe.zh...@rivai.ai) of RiVAI. More specifically, the rvv-next branch at
https://github.com/riscv-collab/riscv-gcc.git is the foundation of this
patch set.

As discussed on this list, if these patches are approved they will be merged
into an "auto-vectorization" branch once gcc-13 branches for release. There are
two known issues related to crashes (assert failures) associated with tree
vectorization; for one of them I have sent a patch and received feedback.

Changes in v3:

- Removed the cost model and cost hooks based on feedback from Richard Biener
- Used RVV_VUNDEF macro to fix failing patterns

Changes in v2:

- Updated ChangeLog entry to include RiVAI contributions
- Fixed ChangeLog email formatting
- Fixed gnu formatting issues in the code

Michael Collison (6):
  RISC-V: Add new predicates and function prototypes
  RISC-V: autovec: Export policy functions to global scope
  RISC-V:autovec: Add auto-vectorization support functions
  RISC-V:autovec: Add target vectorization hooks
  RISC-V:autovec: Add autovectorization patterns for add & sub
  RISC-V:autovec: Add autovectorization tests for add & sub

 gcc/config/riscv/predicates.md|  13 ++
 gcc/config/riscv/riscv-opts.h |  40 
 gcc/config/riscv/riscv-protos.h   |  15 ++
 gcc/config/riscv/riscv-v.cc   | 178 +-
 gcc/config/riscv/riscv-vector-builtins.cc |   4 +-
 gcc/config/riscv/riscv-vector-builtins.h  |   2 +
 gcc/config/riscv/riscv.cc | 156 +++
 gcc/config/riscv/riscv.md |   1 +
 gcc/config/riscv/riscv.opt|  20 ++
 gcc/config/riscv/vector-auto.md   | 172 +
 gcc/config/riscv/vector-iterators.md  |   2 +
 gcc/config/riscv/vector.md|   4 +-
 .../riscv/rvv/autovec/loop-add-rv32.c |  24 +++
 .../gcc.target/riscv/rvv/autovec/loop-add.c   |  24 +++
 .../riscv/rvv/autovec/loop-sub-rv32.c |  24 +++
 .../gcc.target/riscv/rvv/autovec/loop-sub.c   |  24 +++
 16 files changed, 698 insertions(+), 5 deletions(-)
 create mode 100644 gcc/config/riscv/vector-auto.md
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c

-- 
2.34.1



[PATCH v3 1/6] RISC-V: autovec: Add new predicates and function prototypes

2023-03-07 Thread Michael Collison
2023-03-02  Michael Collison  
Juzhe Zhong  

* config/riscv/riscv-protos.h (riscv_classify_vlmul_field):
New external declaration.
(riscv_vector_preferred_simd_mode): Ditto.
(riscv_tuple_mode_p): Ditto.
(riscv_vector_mask_mode_p): Ditto.
(riscv_classify_nf): Ditto.
(riscv_vlmul_regsize): Ditto.
(riscv_vector_preferred_simd_mode): Ditto.
(riscv_vector_get_mask_mode): Ditto.
(emit_vlmax_vsetvl): Ditto.
(get_mask_policy_no_pred): Ditto.
(get_tail_policy_no_pred): Ditto.
* config/riscv/riscv-opts.h (riscv_vector_bits_enum): New enum.
(riscv_vector_lmul_enum): Ditto.
(vlmul_field_enum): Ditto.
* config/riscv/riscv-v.cc (emit_vlmax_vsetvl):
Remove static scope.
* config/riscv/riscv.opt (riscv_vector_lmul):
New option -mriscv_vector_lmul.
* config/riscv/predicates.md (p_reg_or_const_csr_operand):
New predicate.
(vector_reg_or_const_dup_operand): Ditto.
---
 gcc/config/riscv/predicates.md  | 13 +++
 gcc/config/riscv/riscv-opts.h   | 40 +
 gcc/config/riscv/riscv-protos.h | 15 +
 gcc/config/riscv/riscv-v.cc |  2 +-
 gcc/config/riscv/riscv.opt  | 20 +
 5 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 0d9d7701c7e..19aa5e12920 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -264,6 +264,14 @@
 })
 
 ;; Predicates for the V extension.
+(define_special_predicate "p_reg_or_const_csr_operand"
+  (match_code "reg, subreg, const_int")
+{
+  if (CONST_INT_P (op))
+return satisfies_constraint_K (op);
+  return GET_MODE (op) == Pmode;
+})
+
 (define_special_predicate "vector_length_operand"
   (ior (match_operand 0 "pmode_register_operand")
(match_operand 0 "const_csr_operand")))
@@ -291,6 +299,11 @@
   (and (match_code "const_vector")
(match_test "rtx_equal_p (op, riscv_vector::gen_scalar_move_mask 
(GET_MODE (op)))")))
 
+(define_predicate "vector_reg_or_const_dup_operand"
+  (ior (match_operand 0 "register_operand")
+   (match_test "const_vec_duplicate_p (op)
+   && !CONST_POLY_INT_P (CONST_VECTOR_ELT (op, 0))")))
+
 (define_predicate "vector_mask_operand"
   (ior (match_operand 0 "register_operand")
(match_operand 0 "vector_all_trues_mask_operand")))
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index ff398c0a2ae..c6b6d84fce4 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -67,6 +67,46 @@ enum stack_protector_guard {
   SSP_GLOBAL   /* global canary */
 };
 
+/* RVV vector register sizes.  */
+enum riscv_vector_bits_enum
+{
+  RVV_SCALABLE,
+  RVV_NOT_IMPLEMENTED = RVV_SCALABLE,
+  RVV_64 = 64,
+  RVV_128 = 128,
+  RVV_256 = 256,
+  RVV_512 = 512,
+  RVV_1024 = 1024,
+  RVV_2048 = 2048,
+  RVV_4096 = 4096,
+  RVV_8192 = 8192,
+  RVV_16384 = 16384,
+  RVV_32768 = 32768,
+  RVV_65536 = 65536
+};
+
+/* vectorization factor.  */
+enum riscv_vector_lmul_enum
+{
+  RVV_LMUL1 = 1,
+  RVV_LMUL2 = 2,
+  RVV_LMUL4 = 4,
+  RVV_LMUL8 = 8
+};
+
+enum vlmul_field_enum
+{
+  VLMUL_FIELD_000, /* LMUL = 1.  */
+  VLMUL_FIELD_001, /* LMUL = 2.  */
+  VLMUL_FIELD_010, /* LMUL = 4.  */
+  VLMUL_FIELD_011, /* LMUL = 8.  */
+  VLMUL_FIELD_100, /* RESERVED.  */
+  VLMUL_FIELD_101, /* LMUL = 1/8.  */
+  VLMUL_FIELD_110, /* LMUL = 1/4.  */
+  VLMUL_FIELD_111, /* LMUL = 1/2.  */
+  MAX_VLMUL_FIELD
+};
+
 #define MASK_ZICSR    (1 << 0)
 #define MASK_ZIFENCEI (1 << 1)
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 88a6bf5442f..6a486a1cd61 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -217,4 +217,19 @@ const unsigned int RISCV_BUILTIN_SHIFT = 1;
 /* Mask that selects the riscv_builtin_class part of a function code.  */
 const unsigned int RISCV_BUILTIN_CLASS = (1 << RISCV_BUILTIN_SHIFT) - 1;
 
+/* Routines implemented in riscv-v.cc.  */
+
+namespace riscv_vector {
+extern unsigned int riscv_classify_vlmul_field (enum machine_mode m);
+extern machine_mode riscv_vector_preferred_simd_mode (scalar_mode mode,
+ unsigned vf);
+extern bool riscv_tuple_mode_p (machine_mode);
+extern bool riscv_vector_mask_mode_p (machine_mode);
+extern int riscv_classify_nf (machine_mode);
+extern int riscv_vlmul_regsize (machine_mode);
+extern opt_machine_mode riscv_vector_get_mask_mode (machine_mode mode);
+extern rtx emit_vlmax_vsetvl (machine_mode vmode);
+extern rtx get_mask_policy_no_pred ();
+extern rtx get_tail_policy_no_pred ();
+}
 #endif /* ! GCC_RISCV_PROTOS_H */
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index d65c65b26cd..2d2de6e4a6c 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -109,
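
The LMUL comments on vlmul_field_enum in this patch can be read as a decoding table. The following is only a sketch of that table as a standalone function, not code from the patch: the names Lmul and decode_vlmul_field are hypothetical, and the mapping is taken solely from the enum comments above (000..011 are LMUL 1, 2, 4, 8; 100 is reserved; 101..111 are LMUL 1/8, 1/4, 1/2).

```cpp
#include <cassert>

// Hypothetical helper type: LMUL expressed as a fraction so that the
// fractional settings (1/8, 1/4, 1/2) need no floating point.
struct Lmul {
  int numerator;
  int denominator;
};

// Decode a 3-bit vlmul field following the comments on vlmul_field_enum.
// Returns false for the reserved encoding (100) and for values that do
// not fit in three bits; *out is only written on success.
bool decode_vlmul_field(unsigned field, Lmul *out) {
  switch (field) {
  case 0: *out = Lmul{1, 1}; return true;  // VLMUL_FIELD_000
  case 1: *out = Lmul{2, 1}; return true;  // VLMUL_FIELD_001
  case 2: *out = Lmul{4, 1}; return true;  // VLMUL_FIELD_010
  case 3: *out = Lmul{8, 1}; return true;  // VLMUL_FIELD_011
  case 5: *out = Lmul{1, 8}; return true;  // VLMUL_FIELD_101
  case 6: *out = Lmul{1, 4}; return true;  // VLMUL_FIELD_110
  case 7: *out = Lmul{1, 2}; return true;  // VLMUL_FIELD_111
  default: return false;                   // 4 is reserved, > 7 is invalid
  }
}
```

Keeping the reserved encoding as an explicit failure, rather than mapping it to some default, mirrors how the enum singles it out as RESERVED.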

[PATCH v3 3/6] RISC-V: autovec: Add auto-vectorization support functions

2023-03-07 Thread Michael Collison
2023-03-02  Michael Collison  
Juzhe Zhong  

* config/riscv/riscv-v.cc (riscv_classify_vlmul_field):
New function.
(riscv_vector_preferred_simd_mode): Ditto.
(get_mask_policy_no_pred): Ditto.
(get_tail_policy_no_pred): Ditto.
(riscv_tuple_mode_p): Ditto.
(riscv_classify_nf): Ditto.
(riscv_vlmul_regsize): Ditto.
(riscv_vector_mask_mode_p): Ditto.
(riscv_vector_get_mask_mode): Ditto.
---
 gcc/config/riscv/riscv-v.cc | 176 
 1 file changed, 176 insertions(+)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 2d2de6e4a6c..d21bde1bda6 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -39,9 +39,11 @@
 #include "emit-rtl.h"
 #include "tm_p.h"
 #include "target.h"
+#include "targhooks.h"
 #include "expr.h"
 #include "optabs.h"
 #include "tm-constrs.h"
+#include "riscv-vector-builtins.h"
 #include "rtx-vector-builder.h"
 
 using namespace riscv_vector;
@@ -109,6 +111,41 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
  && IN_RANGE (INTVAL (elt), minval, maxval));
 }
 
+/* Return the vlmul field for a specific machine mode.  */
+unsigned int
+riscv_classify_vlmul_field (enum machine_mode mode)
+{
+  /* Make the decision based on the mode's enum value rather than its
+ properties, so that we keep the correct classification regardless
+ of -mriscv-vector-bits.  */
+  switch (mode)
+{
+case E_VNx8BImode:
+  return VLMUL_FIELD_111;
+
+case E_VNx4BImode:
+  return VLMUL_FIELD_110;
+
+case E_VNx2BImode:
+  return VLMUL_FIELD_101;
+
+case E_VNx16BImode:
+  return VLMUL_FIELD_000;
+
+case E_VNx32BImode:
+  return VLMUL_FIELD_001;
+
+case E_VNx64BImode:
+  return VLMUL_FIELD_010;
+
+default:
+  break;
+}
+
+  /* we don't care about VLMUL for Mask.  */
+  return VLMUL_FIELD_000;
+}
+
 rtx
 emit_vlmax_vsetvl (machine_mode vmode)
 {
@@ -163,6 +200,64 @@ calculate_ratio (unsigned int sew, enum vlmul_type vlmul)
   return ratio;
 }
 
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE for RVV.  */
+
+machine_mode
+riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf)
+{
+  if (!TARGET_VECTOR)
+return word_mode;
+
+  switch (mode)
+{
+case E_QImode:
+  return vf == 1   ? VNx8QImode
+: vf == 2 ? VNx16QImode
+: vf == 4 ? VNx32QImode
+  : VNx64QImode;
+  break;
+case E_HImode:
+  return vf == 1   ? VNx4HImode
+: vf == 2 ? VNx8HImode
+: vf == 4 ? VNx16HImode
+  : VNx32HImode;
+  break;
+case E_SImode:
+  return vf == 1   ? VNx2SImode
+: vf == 2 ? VNx4SImode
+: vf == 4 ? VNx8SImode
+  : VNx16SImode;
+  break;
+case E_DImode:
+  if (riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+ && riscv_vector_elen_flags != MASK_VECTOR_ELEN_FP_32)
+   return vf == 1   ? VNx1DImode
+  : vf == 2 ? VNx2DImode
+  : vf == 4 ? VNx4DImode
+: VNx8DImode;
+  break;
+case E_SFmode:
+  if (TARGET_HARD_FLOAT && riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+ && riscv_vector_elen_flags != MASK_VECTOR_ELEN_64)
+   return vf == 1   ? VNx2SFmode
+  : vf == 2 ? VNx4SFmode
+  : vf == 4 ? VNx8SFmode
+: VNx16SFmode;
+  break;
+case E_DFmode:
+  if (TARGET_DOUBLE_FLOAT && TARGET_VECTOR_ELEN_FP_64)
+   return vf == 1   ? VNx1DFmode
+  : vf == 2 ? VNx2DFmode
+  : vf == 4 ? VNx4DFmode
+: VNx8DFmode;
+  break;
+default:
+  break;
+}
+
+  return word_mode;
+}
+
 /* Emit an RVV unmask && vl mov from SRC to DEST.  */
 static void
 emit_pred_op (unsigned icode, rtx mask, rtx dest, rtx src, rtx len,
@@ -375,6 +470,87 @@ get_avl_type_rtx (enum avl_type type)
   return gen_int_mode (type, Pmode);
 }
 
+rtx
+get_mask_policy_no_pred ()
+{
+  return get_mask_policy_for_pred (PRED_TYPE_none);
+}
+
+rtx
+get_tail_policy_no_pred ()
+{
+  return get_tail_policy_for_pred (PRED_TYPE_none);
+}
+
+/* Return true if it is a RVV tuple mode.  */
+bool
+riscv_tuple_mode_p (machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return false;
+}
+
+/* Return nf for a machine mode.  */
+int
+riscv_classify_nf (machine_mode mode)
+{
+  switch (mode)
+{
+
+default:
+  break;
+}
+
+  return 1;
+}
+
+/* Return vlmul register size for a machine mode.  */
+int
+riscv_vlmul_regsize (machine_mode mode)
+{
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
+return 1;
+  switch (riscv_classify_vlmul_field (mode))
+{
+case VLMUL_FIELD_001:
+  return 2;
+case VLMUL_FIELD_010:
+  return 4;
+case VLMUL_FIELD_011:
+  return 8;
+case VLMUL_FIELD_100:
+  gcc_unreachable ();
+default:
+  return 

[PATCH v3 2/6] RISC-V: autovec: Export policy functions to global scope

2023-03-07 Thread Michael Collison
2023-03-02  Michael Collison  
Juzhe Zhong  

* config/riscv/riscv-vector-builtins.cc (get_tail_policy_for_pred):
Remove static declaration to make externally visible.
(get_mask_policy_for_pred): Ditto.
* config/riscv/riscv-vector-builtins.h (get_tail_policy_for_pred):
New external declaration.
(get_mask_policy_for_pred): Ditto.
---
 gcc/config/riscv/riscv-vector-builtins.cc | 4 ++--
 gcc/config/riscv/riscv-vector-builtins.h  | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 2d57086262b..352ffd8867d 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -2448,7 +2448,7 @@ use_real_merge_p (enum predication_type_index pred)
 
 /* Get TAIL policy for predication. If predication indicates TU, return the TU.
Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
 get_tail_policy_for_pred (enum predication_type_index pred)
 {
   if (pred == PRED_TYPE_tu || pred == PRED_TYPE_tum || pred == PRED_TYPE_tumu)
@@ -2458,7 +2458,7 @@ get_tail_policy_for_pred (enum predication_type_index 
pred)
 
 /* Get MASK policy for predication. If predication indicates MU, return the MU.
Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
 get_mask_policy_for_pred (enum predication_type_index pred)
 {
   if (pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu)
diff --git a/gcc/config/riscv/riscv-vector-builtins.h 
b/gcc/config/riscv/riscv-vector-builtins.h
index 8464aa9b7e9..d62d2bdab54 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -456,6 +456,8 @@ extern const char *const operand_suffixes[NUM_OP_TYPES];
 extern const rvv_builtin_suffixes type_suffixes[NUM_VECTOR_TYPES + 1];
 extern const char *const predication_suffixes[NUM_PRED_TYPES];
 extern rvv_builtin_types_t builtin_types[NUM_VECTOR_TYPES + 1];
+extern rtx get_tail_policy_for_pred (enum predication_type_index pred);
+extern rtx get_mask_policy_for_pred (enum predication_type_index pred);
 
 inline tree
 rvv_arg_type_info::get_scalar_type (vector_type_index type_idx) const
-- 
2.34.1



[PATCH v3 4/6] RISC-V: autovec: Add target vectorization hooks

2023-03-07 Thread Michael Collison
2023-03-02  Michael Collison  
Juzhe Zhong  

* config/riscv/riscv.cc (riscv_option_override):
Set riscv_vectorization_factor.
(riscv_estimated_poly_value): Implement
TARGET_ESTIMATED_POLY_VALUE.
(riscv_preferred_simd_mode): Implement
TARGET_VECTORIZE_PREFERRED_SIMD_MODE.
(riscv_autovectorize_vector_modes): Implement
TARGET_AUTOVECTORIZE_VECTOR_MODES.
(riscv_get_mask_mode): Implement TARGET_VECTORIZE_GET_MASK_MODE.
(riscv_empty_mask_is_expensive): Implement
TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.
(riscv_vectorize_create_costs): Implement
TARGET_VECTORIZE_CREATE_COSTS.
(TARGET_ESTIMATED_POLY_VALUE): Register target macro.
(TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto.
(TARGET_VECTORIZE_GET_MASK_MODE): Ditto.
(TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE): Ditto.
(TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK): Ditto.
---
 gcc/config/riscv/riscv.cc | 156 ++
 1 file changed, 156 insertions(+)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index befb9b498b7..1ca9f3c7ae4 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -60,6 +60,15 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "tm-constrs.h"
 #include "rtl-iter.h"
+#include "gimple.h"
+#include "cfghooks.h"
+#include "cfgloop.h"
+#include "cfgrtl.h"
+#include "sel-sched.h"
+#include "fold-const.h"
+#include "gimple-iterator.h"
+#include "gimple-expr.h"
+#include "tree-vectorizer.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -275,6 +284,9 @@ poly_uint16 riscv_vector_chunks;
 /* The number of bytes in a vector chunk.  */
 unsigned riscv_bytes_per_vector_chunk;
 
+/* Prefer vf for auto-vectorizer.  */
+unsigned riscv_vectorization_factor;
+
 /* Index R is the smallest register class that contains register R.  */
 const enum reg_class riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   GR_REGS, GR_REGS,GR_REGS,GR_REGS,
@@ -6199,6 +6211,10 @@ riscv_option_override (void)
 
   /* Convert -march to a chunks count.  */
   riscv_vector_chunks = riscv_convert_vector_bits ();
+
+  if (TARGET_VECTOR)
+riscv_vectorization_factor = riscv_vector_lmul;
+
 }
 
 /* Implement TARGET_CONDITIONAL_REGISTER_USAGE.  */
@@ -6893,6 +6909,128 @@ riscv_dwarf_poly_indeterminate_value (unsigned int i, 
unsigned int *factor,
   return RISCV_DWARF_VLENB;
 }
 
+/* Implement TARGET_ESTIMATED_POLY_VALUE.
+   Look into the tuning structure for an estimate.
+   KIND specifies the type of requested estimate: min, max or likely.
+   For cores with a known RVV width all three estimates are the same.
+   For generic RVV tuning we want to distinguish the maximum estimate from
+   the minimum and likely ones.
+   The likely estimate is the same as the minimum in that case to give a
+   conservative behavior of auto-vectorizing with RVV when it is a win
+   even for 128-bit RVV.
+   When RVV width information is available VAL.coeffs[1] is multiplied by
+   the number of VQ chunks over the initial Advanced SIMD 128 bits.  */
+
+static HOST_WIDE_INT
+riscv_estimated_poly_value (poly_int64 val,
+   poly_value_estimate_kind kind = POLY_VALUE_LIKELY)
+{
+  unsigned int width_source = BITS_PER_RISCV_VECTOR.is_constant ()
+? (unsigned int) BITS_PER_RISCV_VECTOR.to_constant ()
+: (unsigned int) RVV_SCALABLE;
+
+  /* If there is no core-specific information then the minimum and likely
+ values are based on 128-bit vectors and the maximum is based on
+ the architectural maximum of 2048 bits.  */
+  if (width_source == RVV_SCALABLE)
+switch (kind)
+  {
+  case POLY_VALUE_MIN:
+  case POLY_VALUE_LIKELY:
+   return val.coeffs[0];
+
+  case POLY_VALUE_MAX:
+   return val.coeffs[0] + val.coeffs[1] * 15;
+  }
+
+  /* Allow BITS_PER_RISCV_VECTOR to be a bitmask of different VL, treating the
+ lowest as likely.  This could be made more general if future -mtune
+ options need it to be.  */
+  if (kind == POLY_VALUE_MAX)
+width_source = 1 << floor_log2 (width_source);
+  else
+width_source = least_bit_hwi (width_source);
+
+  /* If the core provides width information, use that.  */
+  HOST_WIDE_INT over_128 = width_source - 128;
+  return val.coeffs[0] + val.coeffs[1] * over_128 / 128;
+}
+
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
+
+static machine_mode
+riscv_preferred_simd_mode (scalar_mode mode)
+{
+  machine_mode vmode =
+riscv_vector::riscv_vector_preferred_simd_mode (mode,
+   riscv_vectorization_factor);
+  if (VECTOR_MODE_P (vmode))
+return vmode;
+
+  return word_mode;
+}
+
+/* Implement TARGET_AUTOVECTORIZE_VECTOR_MODES for RVV.  */
+static unsigned int
+riscv_autovectorize_vector_mo
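
The arithmetic in riscv_estimated_poly_value above may be easier to follow in isolation. This is a simplified sketch under stated assumptions, not the hook itself: PolyValue, EstimateKind and estimate_poly_value are hypothetical stand-ins for poly_int64, poly_value_estimate_kind and the hook, a width of 0 stands for the scalable (unknown) case, and the width is assumed to be a single power of two, so the bitmask handling via least_bit_hwi/floor_log2 is left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for poly_value_estimate_kind.
enum class EstimateKind { Min, Max, Likely };

// Hypothetical stand-in for a two-coefficient poly_int64: c0 + c1 * x,
// where x counts extra 128-bit chunks beyond the first.
struct PolyValue {
  int64_t c0;
  int64_t c1;
};

// width_bits == 0 means "scalable, no core-specific width known".
int64_t estimate_poly_value(PolyValue val, EstimateKind kind,
                            unsigned width_bits) {
  if (width_bits == 0) {
    // Min and likely assume 128-bit vectors (x == 0); max uses the
    // factor 15 from the patch, i.e. 2048 / 128 - 1 extra chunks.
    return kind == EstimateKind::Max ? val.c0 + val.c1 * 15 : val.c0;
  }
  // Known width: scale c1 by the number of 128-bit chunks over the
  // first, using the same expression order as the patch.
  int64_t over_128 = static_cast<int64_t>(width_bits) - 128;
  return val.c0 + val.c1 * over_128 / 128;
}
```

For example, with a value of 4 + 4x the scalable case yields 4 for min and likely and 64 for max, while a known 256-bit width yields 8 for all three kinds.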

[PATCH v3 5/6] RISC-V: autovec: Add autovectorization patterns for add & sub

2023-03-07 Thread Michael Collison
2023-03-02  Michael Collison  
Juzhe Zhong  

* config/riscv/riscv.md (riscv_vector_preferred_simd_mode): Include
vector-iterators.md.
* config/riscv/vector-auto.md: New file containing
autovectorization patterns.
* config/riscv/vector-iterators.md (UNSPEC_VADD/UNSPEC_VSUB):
New unspecs for autovectorization patterns.
* config/riscv/vector.md: Remove include of vector-iterators.md
and include vector-auto.md.
---
 gcc/config/riscv/riscv.md|   1 +
 gcc/config/riscv/vector-auto.md  | 172 +++
 gcc/config/riscv/vector-iterators.md |   2 +
 gcc/config/riscv/vector.md   |   4 +-
 4 files changed, 177 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/riscv/vector-auto.md

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 6c3176042fb..a504ace72e5 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -131,6 +131,7 @@
 (include "predicates.md")
 (include "constraints.md")
 (include "iterators.md")
+(include "vector-iterators.md")
 
 ;; 
 ;;
diff --git a/gcc/config/riscv/vector-auto.md b/gcc/config/riscv/vector-auto.md
new file mode 100644
index 000..5227a73d96d
--- /dev/null
+++ b/gcc/config/riscv/vector-auto.md
@@ -0,0 +1,172 @@
+;; Machine description for RISC-V 'V' Extension for GNU compiler.
+;; Copyright (C) 2022-2023 Free Software Foundation, Inc.
+;; Contributed by Juzhe Zhong (juzhe.zh...@rivai.ai), RiVAI Technologies Ltd.
+;; Contributed by Michael Collison (colli...@rivosinc.com), Rivos Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+;; -
+;;  [INT] Addition
+;; -
+;; Includes:
+;; - vadd.vv
+;; - vadd.vx
+;; - vadd.vi
+;; -
+
+(define_expand "add<mode>3"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "vector_arith_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = RVV_VUNDEF (<MODE>mode);
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add<mode>(operands[0], mask, merge, operands[1], operands[2],
+   vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "cond_add<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:<VM> 1 "register_operand")
+   (match_operand:VI 2 "register_operand")
+   (match_operand:VI 3 "vector_reg_or_const_dup_operand")
+   (match_operand:VI 4 "register_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = operands[4];
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = operands[1];
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add<mode>(operands[0], mask, merge, operands[2], operands[3],
+   vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
+(define_expand "len_add<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "vector_reg_or_const_dup_operand")
+   (match_operand 3 "p_reg_or_const_csr_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = RVV_VUNDEF (<MODE>mode);
+  rtx vl = operands[3];
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add<mode>(operands[0], mask, merge, operands[1], operands[2],
+   vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
+
+;; -
+;;  [INT] Subtraction
+;; -
+;; Includes:
+;; - vsub.vv
+;; - vsub.vx
+;; - vadd.vi
+;; - vrsub.vx
+;; - vrsub.vi
+;; --

[PATCH v3 6/6] RISC-V: autovec: Add autovectorization tests for add & sub

2023-03-07 Thread Michael Collison
2023-03-02  Michael Collison  
Vineet Gupta 

* gcc.target/riscv/rvv/autovec: New directory
for autovectorization tests.
* gcc.target/riscv/rvv/autovec/loop-add-rv32.c: New
test to verify code generation of vector add on rv32.
* gcc.target/riscv/rvv/autovec/loop-add.c: New
test to verify code generation of vector add on rv64.
* gcc.target/riscv/rvv/autovec/loop-sub-rv32.c: New
test to verify code generation of vector subtract on rv32.
* gcc.target/riscv/rvv/autovec/loop-sub.c: New
test to verify code generation of vector subtract on rv64.
---
 .../riscv/rvv/autovec/loop-add-rv32.c | 24 +++
 .../gcc.target/riscv/rvv/autovec/loop-add.c   | 24 +++
 .../riscv/rvv/autovec/loop-sub-rv32.c | 24 +++
 .../gcc.target/riscv/rvv/autovec/loop-sub.c   | 24 +++
 4 files changed, 96 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
new file mode 100644
index 000..bdc3b6892e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)\
+  {\
+for (int i = 0; i < n; i++)\
+  dst[i] = a[i] + b[i];\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL() \
+ TEST_TYPE(int16_t)\
+ TEST_TYPE(uint16_t)   \
+ TEST_TYPE(int32_t)\
+ TEST_TYPE(uint32_t)   \
+ TEST_TYPE(int64_t)\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
new file mode 100644
index 000..d7f992c7d27
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)\
+  {\
+for (int i = 0; i < n; i++)\
+  dst[i] = a[i] + b[i];\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL() \
+ TEST_TYPE(int16_t)\
+ TEST_TYPE(uint16_t)   \
+ TEST_TYPE(int32_t)\
+ TEST_TYPE(uint32_t)   \
+ TEST_TYPE(int64_t)\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
new file mode 100644
index 000..7d0a40ec539
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)\
+  {\
+for (int i = 0; i < n; i++)\
+  dst[i] = a[i] - b[i];\
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL() \
+ TEST_TYPE(int16_t)\
+ TEST_TYPE(uint16_t)   \
+ TEST_TYPE(int32_t)\
+ TEST_TYPE(uint32_t)   \
+ TEST_TYPE(int64_t)\
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
new file mode 100644
index 000..c8900884f83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)\
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)\
+  {\
+for (int i = 0; i < n; i++)\
+  dst[i] = a[i] - b[i];\
+  }
+
+/* 

[PATCH] testsuite: Fix omp-parallel-for-get-min.c and -for-1.c for non-openmp

2023-03-07 Thread Hans-Peter Nilsson via Gcc-patches
Committed as obvious.
-- >8 --
The recently added tests missed checking for "fopenmp" (see
other tests where "-fopenmp" is passed), which makes them
fail on non-openmp systems.

* gcc.dg/analyzer/omp-parallel-for-get-min.c,
gcc.dg/analyzer/omp-parallel-for-1.c: Require effective target fopenmp.
---
 gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c   | 1 +
 gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c 
b/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c
index dae940dac200..cadacc842750 100644
--- a/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-1.c
@@ -1,3 +1,4 @@
+/* { dg-require-effective-target fopenmp } */
 /* { dg-additional-options "-fopenmp -Wall" } */
 
 typedef struct _Image
diff --git a/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c 
b/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c
index a7e64e1a3a85..ba9f634cd716 100644
--- a/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c
+++ b/gcc/testsuite/gcc.dg/analyzer/omp-parallel-for-get-min.c
@@ -1,5 +1,6 @@
 /* Reduced from ImageMagick-7.1.0-57's MagickCore/attribute.c: 
GetEdgeBackgroundColor */
 
+/* { dg-require-effective-target fopenmp } */
 /* { dg-additional-options "-fopenmp -Wall" } */
 
 extern double get_census (void);
-- 
2.30.2



[committed] Fix MIPS testsuite over-eager matching

2023-03-07 Thread Jeff Law via Gcc-patches


The mips msa-ds.c test is trying to ensure that MSA branches can have 
their delay slots filled.  The regexp it used looked for the function 
name, a nop, then the function name again.  If it found that sequence, then
the test failed.


The problem is that, with Vlad's recent IRA work, there's simply less code in
the test (good) and as a result one of the *other* branches in the test 
had an unfilled delay slot -- the delay slot for the MSA branch was 
still being filled.


This patch tightens up the regexp.  In particular it looks for the MSA 
branch and a nop on the next line (avoiding the over-eager .* 
construct).  That indicates that the MSA branch did not have its delay 
slot filled.  When that sequence is found, then the test fails.
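
To illustrate the difference between the two regexps, here is a small standalone sketch. It is not part of the patch: the assembly snippets and the helper found_in are hypothetical, and std::regex is used only as a stand-in for the testsuite's Tcl regexps. Because '.' in std::regex never matches a newline, "any character, newlines included" is spelled [\s\S] below.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if `pattern` matches anywhere in `assembly`.
bool found_in(const std::string &assembly, const std::string &pattern) {
  return std::regex_search(assembly, std::regex(pattern));
}

// Old check: the label, then a nop anywhere later, then "jr", then "foo,".
const std::string kLoosePattern = R"(foo:[\s\S]*nop[\s\S]*jr[\s\S]*foo,)";

// Tightened check: an MSA branch (bz.w or bnz.w) whose very next line
// is a nop, i.e. an MSA branch with an unfilled delay slot.
const std::string kTightPattern = R"(foo:[\s\S]*bn?z.w[^\n\r]*\n\tnop)";

// Hypothetical output: the MSA branch has its delay slot filled, but an
// unrelated branch later in the function is followed by a nop.
const std::string kFilledMsaSlotAsm =
    "foo:\n"
    "\tbnz.w\t$w0,.L2\n"
    "\taddiu\t$2,$4,1\n"
    "\tbeq\t$4,$5,.L3\n"
    "\tnop\n"
    "\tjr\t$31\n"
    "\tmove\t$2,$5\n"
    "\t.size\tfoo, .-foo\n";

// Hypothetical output: the MSA branch itself is followed by a nop.
const std::string kUnfilledMsaSlotAsm =
    "foo:\n"
    "\tbz.w\t$w0,.L2\n"
    "\tnop\n"
    "\tjr\t$31\n"
    "\tmove\t$2,$5\n"
    "\t.size\tfoo, .-foo\n";
```

On the first snippet the old pattern matches (a false failure for scan-assembler-not) while the tightened one does not; on the second snippet the tightened pattern matches, which is the case the test is meant to catch.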



This fixes the recent regressions for mips64 and mips64el in the tester.

Installing on the trunk,
jeff
commit 0d25f8265b3ba9338f4572ac3fab08e3f33367a5
Author: Jeff Law 
Date:   Tue Mar 7 22:00:39 2023 -0700

Fix MIPS testsuite over-eager matching

The mips msa-ds.c test is trying to ensure that MSA branches can have their
delay slots filled.  The regexp it used looked for the function name, a nop,
then the function name again.  If it found that sequence, then the test failed.

The problem is that, with Vlad's recent IRA work, there's simply less code in the
test (good) and as a result one of the *other* branches in the test had an
unfilled delay slot -- the delay slot for the MSA branch was still being
filled.

This patch tightens up the regexp.  In particular it looks for the MSA branch
and a nop on the next line (avoiding the over-eager .* construct).  That
indicates that the MSA branch did not have its delay slot filled.  When that
sequence is found, then the test fails.

This fixes the recent regressions for mips64 and mips64el in the tester.

Installing on the trunk,

gcc/testsuite:
* gcc.target/mips/msa-ds.c: Fix over eager pattern matching.

diff --git a/gcc/testsuite/gcc.target/mips/msa-ds.c 
b/gcc/testsuite/gcc.target/mips/msa-ds.c
index c6932b280cb..37957a02bd8 100644
--- a/gcc/testsuite/gcc.target/mips/msa-ds.c
+++ b/gcc/testsuite/gcc.target/mips/msa-ds.c
@@ -27,5 +27,9 @@ int __attribute__ ((cold)) bar (v4si v , int a, int b)
return b + c;
 }
 
-/* { dg-final { scan-assembler-not "foo:.*nop.*jr.*foo," } } */
-/* { dg-final { scan-assembler-not "bar:.*nop.*jr.*bar," } } */
+/* We need to avoid over matching here as we could have other
+   branches with unfilled slots.  So we verify that we do not have
+   an MSA branch with a NOP in its delay slot.  We need to match
+   both forms of the MSA branch that can occur in this test.  */
+/* { dg-final { scan-assembler-not "foo:.*bn\?z.w\[^\\n\\r\]*\\n\\tnop" } } */
+/* { dg-final { scan-assembler-not "bar:.*bn\?z.w\[^\\n\\r\]*\\n\\tnop" } } */


[PATCH] xtensa: Fix for enabling LRA

2023-03-07 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch makes LRA work well, with some exceptions
(e.g. MI thunk generation due to pretending reload_completed).

gcc/ChangeLog:

* config/xtensa/constraints.md (R, T, U):
Change define_constraint to define_memory_constraint.
* config/xtensa/xtensa.cc (xtensa_legitimate_constant_p):
Add short-circuit path for integer load instructions when
lra_in_progress.
* config/xtensa/xtensa.md (movsf):
Use can_create_pseudo_p() rather than reload_in_progress and
reload_completed.
---
 gcc/config/xtensa/constraints.md | 26 --
 gcc/config/xtensa/xtensa.cc  |  4 
 gcc/config/xtensa/xtensa.md  |  2 +-
 3 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/gcc/config/xtensa/constraints.md b/gcc/config/xtensa/constraints.md
index 53e4d0d8dd1..9b31e162941 100644
--- a/gcc/config/xtensa/constraints.md
+++ b/gcc/config/xtensa/constraints.md
@@ -123,29 +123,19 @@
   (and (match_code "const_int")
   (match_test "! xtensa_split1_finished_p ()"
 
-;; Memory constraints.  Do not use define_memory_constraint here.  Doing so
-;; causes reload to force some constants into the constant pool, but since
-;; the Xtensa constant pool can only be accessed with L32R instructions, it
-;; is always better to just copy a constant into a register.  Instead, use
-;; regular constraints but add a check to allow pseudos during reload.
+;; Memory constraints.
 
-(define_constraint "R"
+(define_memory_constraint "R"
  "Memory that can be accessed with a 4-bit unsigned offset from a register."
- (ior (and (match_code "mem")
-  (match_test "smalloffset_mem_p (op)"))
-  (and (match_code "reg")
-  (match_test "reload_in_progress
-   && REGNO (op) >= FIRST_PSEUDO_REGISTER"
+ (and (match_code "mem")
+  (match_test "smalloffset_mem_p (op)")))
 
-(define_constraint "T"
+(define_memory_constraint "T"
  "Memory in a literal pool (addressable with an L32R instruction)."
  (and (match_code "mem")
   (match_test "!TARGET_CONST16 && constantpool_mem_p (op)")))
 
-(define_constraint "U"
+(define_memory_constraint "U"
  "Memory that is not in a literal pool."
- (ior (and (match_code "mem")
-  (match_test "! constantpool_mem_p (op)"))
-  (and (match_code "reg")
-  (match_test "reload_in_progress
-   && REGNO (op) >= FIRST_PSEUDO_REGISTER"
+ (and (match_code "mem")
+  (match_test "! constantpool_mem_p (op)")))
diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index 7287aa7a258..a500dc2a06e 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -4872,6 +4872,10 @@ xtensa_trampoline_init (rtx m_tramp, tree fndecl, rtx 
chain)
 static bool
 xtensa_legitimate_constant_p (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
 {
+  if (lra_in_progress && CONST_INT_P (x))
+return TARGET_AUTO_LITPOOLS || TARGET_CONST16
+  || xtensa_simm12b (INTVAL (x));
+
   return !xtensa_tls_referenced_p (x);
 }
 
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 3521fa33b47..195515d9427 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -1268,7 +1268,7 @@
   if ((!register_operand (operands[0], SFmode)
&& !register_operand (operands[1], SFmode))
   || (FP_REG_P (xt_true_regnum (operands[0]))
- && !(reload_in_progress | reload_completed)
+ && can_create_pseudo_p ()
  && (constantpool_mem_p (operands[1])
  || CONSTANT_P (operands[1]
 operands[1] = force_reg (SFmode, operands[1]);
-- 
2.30.2


[PATCH] RISC-V: Fine tuning merge operand constraint

2023-03-07 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/vector-iterators.md (=vd,vr): Fine tune.
(=vd,vd,vr,vr): Ditto.
* config/riscv/vector.md: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/merge_constraint-2.c: New test.

---
 gcc/config/riscv/vector-iterators.md  |   6 +-
 gcc/config/riscv/vector.md| 890 +-
 .../riscv/rvv/base/merge_constraint-2.c   | 118 +++
 3 files changed, 566 insertions(+), 448 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/merge_constraint-2.c

diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index d44943ae7c3..266563a3aa0 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -813,9 +813,9 @@
 (UNSPEC_VSLIDE1UP "1up") (UNSPEC_VSLIDE1DOWN "1down")
 (UNSPEC_VFSLIDE1UP "1up") (UNSPEC_VFSLIDE1DOWN "1down")])
 
-(define_int_attr ud_constraint [(UNSPEC_VSLIDEUP "=&vr,&vr") 
(UNSPEC_VSLIDEDOWN "=vd,vr")
-   (UNSPEC_VSLIDE1UP "=&vr,&vr") 
(UNSPEC_VSLIDE1DOWN "=vd,vr")
-   (UNSPEC_VFSLIDE1UP "=&vr,&vr") 
(UNSPEC_VFSLIDE1DOWN "=vd,vr")])
+(define_int_attr ud_constraint [(UNSPEC_VSLIDEUP "=&vr,&vr,&vr,&vr") 
(UNSPEC_VSLIDEDOWN "=vd,vd,vr,vr")
+   (UNSPEC_VSLIDE1UP "=&vr,&vr,&vr,&vr") 
(UNSPEC_VSLIDE1DOWN "=vd,vd,vr,vr")
+   (UNSPEC_VFSLIDE1UP "=&vr,&vr,&vr,&vr") 
(UNSPEC_VFSLIDE1DOWN "=vd,vd,vr,vr")])
 
 (define_int_attr UNSPEC [(UNSPEC_VSLIDE1UP "UNSPEC_VSLIDE1UP")
 (UNSPEC_VSLIDE1DOWN "UNSPEC_VSLIDE1DOWN")])
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index a4a68b67e24..d3013844e5f 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -4653,123 +4653,123 @@
 ;; 
---
 
 (define_insn "@pred_widen_mul_plus"
-  [(set (match_operand:VWEXTI 0 "register_operand""=&vr")
+  [(set (match_operand:VWEXTI 0 "register_operand""=&vr,  
&vr")
(if_then_else:VWEXTI
  (unspec:
-   [(match_operand: 1 "vector_mask_operand" "vmWc1")
-(match_operand 6 "vector_length_operand""   rK")
-(match_operand 7 "const_int_operand""i")
-(match_operand 8 "const_int_operand""i")
-(match_operand 9 "const_int_operand""i")
+   [(match_operand: 1 "vector_mask_operand" 
"vmWc1,vmWc1")
+(match_operand 6 "vector_length_operand""   rK,   
rK")
+(match_operand 7 "const_int_operand""i,
i")
+(match_operand 8 "const_int_operand""i,
i")
+(match_operand 9 "const_int_operand""i,
i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
  (plus:VWEXTI
(mult:VWEXTI
  (any_extend:VWEXTI
-   (match_operand: 3 "register_operand" "   vr"))
+   (match_operand: 3 "register_operand" "   vr,   
vr"))
  (any_extend:VWEXTI
-   (match_operand: 4 "register_operand" "   vr")))
-   (match_operand:VWEXTI 2 "register_operand"   "0"))
- (match_operand:VWEXTI 5 "vector_merge_operand" "  0vu")))]
+   (match_operand: 4 "register_operand" "   vr,   
vr")))
+   (match_operand:VWEXTI 2 "register_operand"   "0,
0"))
+ (match_operand:VWEXTI 5 "vector_merge_operand" "   vu,
0")))]
   "TARGET_VECTOR"
   "vwmacc.vv\t%0,%3,%4%p1"
   [(set_attr "type" "viwmuladd")
(set_attr "mode" "")])
 
 (define_insn "@pred_widen_mul_plus_scalar"
-  [(set (match_operand:VWEXTI 0 "register_operand""=&vr")
+  [(set (match_operand:VWEXTI 0 "register_operand""=&vr,  
&vr")
(if_then_else:VWEXTI
  (unspec:
-   [(match_operand: 1 "vector_mask_operand" "vmWc1")
-(match_operand 6 "vector_length_operand""   rK")
-(match_operand 7 "const_int_operand""i")
-(match_operand 8 "const_int_operand""i")
-(match_operand 9 "const_int_operand""i")
+   [(match_operand: 1 "vector_mask_operand" 
"vmWc1,vmWc1")
+(match_operand 6 "vector_length_operand""   rK,   
rK")
+(match_operand 7 "const_int_operand""i,
i")
+(match_operand 8 "const_int_operand""i,
i")
+(match_operand 9 "const_in

Re: [ping][PATCH 1/1] docs: Add link to gmplib.org

2023-03-07 Thread Benson Muite via Gcc-patches
> Thanks, I've pushed this patch.
> 
> -Sandra
Appreciated. Thanks.


[PATCH] libgomp: Fix default value of GOMP_SPINCOUNT [PR 109062]

2023-03-07 Thread Hongyu Wang via Gcc-patches
Hi,

When OMP_WAIT_POLICY is not specified, the current implementation leaves
the ICV flag GOMP_ICV_WAIT_POLICY unset, so the global variable wait_policy
keeps its uninitialized value. Set it to -1 when the flag is not
specified, to keep GOMP_SPINCOUNT behavior consistent with its description.
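
To illustrate why the unset flag matters, here is a minimal sketch of
how a default spin count could depend on the wait policy. It is not the
libgomp code: the helper names are hypothetical, and the three constants
are assumptions chosen for illustration (a large count for "active", zero
for "passive", an intermediate count when nothing was specified).

```cpp
// Hypothetical model, not the libgomp implementation.
// Convention assumed here: > 0 means active, 0 means passive,
// < 0 means "OMP_WAIT_POLICY was not specified".

// Pick the effective policy: use the parsed value only if the ICV flag
// was actually set, otherwise fall back to -1 (unspecified).
constexpr int resolve_wait_policy(bool flag_set, int parsed_value) {
  return flag_set ? parsed_value : -1;
}

// Derive a default spin count from the policy. The constants are
// illustrative assumptions, not values taken from the patch.
constexpr long long default_spin_count(int wait_policy) {
  if (wait_policy > 0)
    return 30000000000LL;  // active: spin for a long time
  if (wait_policy < 0)
    return 300000LL;       // unspecified: intermediate default
  return 0;                // passive: do not spin
}
```

If the fallback to -1 is missing, the "unspecified" case is indistinguishable
from "passive" in this model and the intermediate default is never selected.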

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

libgomp/ChangeLog:

PR libgomp/109062
* env.c (initialize_env): Set wait_policy to -1 if
OMP_WAIT_POLICY is not specified.
* testsuite/libgomp.c-c++-common/pr109062.c: New test.
---
 libgomp/env.c |  2 ++
 libgomp/testsuite/libgomp.c-c++-common/pr109062.c | 14 ++
 2 files changed, 16 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/pr109062.c

diff --git a/libgomp/env.c b/libgomp/env.c
index c41c1f852cc..fa36a8697d6 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -2249,6 +2249,8 @@ initialize_env (void)
 wait_policy = none->icvs.wait_policy;
   else if (all != NULL && gomp_get_icv_flag (all->flags, GOMP_ICV_WAIT_POLICY))
 wait_policy = all->icvs.wait_policy;
+  else
+wait_policy = -1;
 
   if (!parse_spincount ("GOMP_SPINCOUNT", &gomp_spin_count_var))
 {
diff --git a/libgomp/testsuite/libgomp.c-c++-common/pr109062.c 
b/libgomp/testsuite/libgomp.c-c++-common/pr109062.c
new file mode 100644
index 000..5c7c287dafd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/pr109062.c
@@ -0,0 +1,14 @@
+/* { dg-do run } */
+
+#include 
+#include 
+
+int
+main ()
+{
+  omp_display_env (1);
+
+  return 0;
+}
+
+/* { dg-output ".*\\\[host] GOMP_SPINCOUNT = '30'.*" { target native } } */
-- 
2.31.1



Re: [PATCH]middle-end: On emergency dumps finish the graph generation.

2023-03-07 Thread Richard Biener via Gcc-patches
On Tue, 7 Mar 2023, Tamar Christina wrote:

> Hi All,
> 
> When doing an emergency dump the cfg output dumps are corrupted because the
> ending "}" is missing.
> 
> Normally when the pass manager finishes it would call finish_graph_dump_file 
> to
> produce this.  This is called here because each pass can dump multiple 
> digraphs.
> 
> However during an emergency dump we only dump the current function and so 
> after
> that is done we never go back to the pass manager.
> 
> As such, we need to manually call finish_graph_dump_file in order to properly
> finish off graph generation.
> 
> With this -ftree-dump-*-graph works properly during a crash dump.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * passes.cc (emergency_dump_function): Finish graph generation.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/passes.cc b/gcc/passes.cc
> index 
> 347214e81d0cfac05d9ba782db0eda1cdd7e9c87..38642a4010941b414a1ed1fd70a348778addbf60
>  100644
> --- a/gcc/passes.cc
> +++ b/gcc/passes.cc
> @@ -1845,6 +1845,13 @@ emergency_dump_function ()
>fprintf (dump_file, "\n\n\nEMERGENCY DUMP:\n\n");
>execute_function_dump (cfun, current_pass);
>  
> +  /* Normally the passmanager will close the graphs as a pass could be 
> wanting
> + to print multiple digraphs. But during an emergency dump there can only 
> be
> + one and we must finish the graph manually.  */
> +  if ((cfun->curr_properties & PROP_cfg)
> +  && (dump_flags & TDF_GRAPH))
> +finish_graph_dump_file (dump_file_name);
> +
>if (symtab && current_pass->type == IPA_PASS)
>  symtab->dump (dump_file);
>  }
> 
> 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH] [RFC] RAII auto_mpfr and autp_mpz

2023-03-07 Thread Richard Biener via Gcc-patches
On Wed, 8 Mar 2023, Alexander Monakov wrote:

> 
> On Tue, 7 Mar 2023, Jonathan Wakely wrote:
> 
> > > Shouldn't this use the idiom suggested in ansidecl.h, i.e.
> > >
> > >   private:
> > > DISABLE_COPY_AND_ASSIGN (auto_mpfr);
> > 
> > 
> > Why? A macro like that (or a base class like boost::noncopyable) has
> > some value in a code base that wants to work for both C++03 and C++11
> > (or later). But in GCC we know we have C++11 now, so we can just
> > delete members. I don't see what the macro adds.
> 
> Evidently it's possible to forget to delete one of the members, as
> showcased in this very thread.

Yes.  And I copy&pasted from somewhere I forgot which also forgot it ...

> The idiom is also slightly easier to read.

Of course inconsistency in the code-base isn't helping that.
auto_bitmap seems to declare but not define things (including
move assign/CTOR?)

Richard.
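
For reference, a minimal sketch of the "just delete the members" approach
discussed above. The resource type and its init/clear functions are
hypothetical stand-ins (this is not auto_mpfr); the point is only that both
the copy constructor and the copy assignment operator have to be deleted,
and that a static_assert can catch a forgotten one.

```cpp
#include <type_traits>

// Hypothetical C-style resource with explicit init/clear functions.
struct raw_handle { bool live; };
inline void raw_init(raw_handle *h) { h->live = true; }
inline void raw_clear(raw_handle *h) { h->live = false; }

// RAII wrapper: acquires in the constructor, releases in the destructor.
class auto_handle {
public:
  auto_handle() { raw_init(&m_h); }
  ~auto_handle() { raw_clear(&m_h); }

  // Both copy operations are deleted. Omitting either one would leave an
  // implicitly generated copy that ends in a double clear.
  auto_handle(const auto_handle &) = delete;
  auto_handle &operator=(const auto_handle &) = delete;

  bool live() const { return m_h.live; }

private:
  raw_handle m_h;
};

// Compile-time guard against forgetting one of the two deletions.
static_assert(!std::is_copy_constructible<auto_handle>::value,
              "auto_handle must not be copy-constructible");
static_assert(!std::is_copy_assignable<auto_handle>::value,
              "auto_handle must not be copy-assignable");
```

A macro such as DISABLE_COPY_AND_ASSIGN bundles the two deleted
declarations; the static_assert pair is an alternative way to guard
against the omission mentioned in the thread.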


[PATCH] RISC-V: Bugfix for rvv bool mode size adjustment

2023-03-07 Thread pan2.li--- via Gcc-patches
From: yes 

Fix the rvv bool mode size by adjusting it.
Besides the mode precision (aka bit size [1, 2, 4, 8, 16, 32, 64])
of the vbool*_t types, the mode size (aka byte size) is now adjusted to
[1, 1, 1, 1, 2, 4, 8] according to the RVV 1.0 ISA spec. The
adjustment provides correct size information to the underlying
redundant instruction elimination.
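
One way to read the two lists above is that each byte size is the bit size
rounded up to whole bytes. The sketch below checks that reading; the
function name is hypothetical and the rounding rule is an assumption
inferred from the numbers, not something stated by the patch.

```cpp
// Hypothetical helper: number of bytes needed to hold PRECISION_BITS bits,
// i.e. the bit size rounded up to a whole byte.
constexpr unsigned bool_mode_bytes(unsigned precision_bits) {
  return (precision_bits + 7) / 8;
}

// Bit sizes [1, 2, 4, 8, 16, 32, 64] map to byte sizes [1, 1, 1, 1, 2, 4, 8].
static_assert(bool_mode_bytes(1) == 1, "1 bit fits in 1 byte");
static_assert(bool_mode_bytes(8) == 1, "8 bits fit in 1 byte");
static_assert(bool_mode_bytes(16) == 2, "16 bits need 2 bytes");
static_assert(bool_mode_bytes(64) == 8, "64 bits need 8 bytes");
```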

Given the below sample code:
{
  vbool1_t v1 = *(vbool1_t*)in;
  vbool64_t v2 = *(vbool64_t*)in;

  *(vbool1_t*)(out + 100) = v1;
  *(vbool64_t*)(out + 200) = v2;
}

Before the size adjustment:
csrr    t0,vlenb
slli    t1,t0,1
csrr    a3,vlenb
sub     sp,sp,t1
slli    a4,a3,1
add     a4,a4,sp
addi    a2,a1,100
vsetvli a5,zero,e8,m8,ta,ma
sub     a3,a4,a3
vlm.v   v24,0(a0)
vsm.v   v24,0(a2)
vsm.v   v24,0(a3)
addi    a1,a1,200
csrr    t0,vlenb
vsetvli a4,zero,e8,mf8,ta,ma
slli    t1,t0,1
vlm.v   v24,0(a3)
vsm.v   v24,0(a1)
add     sp,sp,t1
jr      ra

After the size adjustment:
addi    a3,a1,100
vsetvli a4,zero,e8,m8,ta,ma
addi    a1,a1,200
vlm.v   v24,0(a0)
vsm.v   v24,0(a3)
vsetvli a5,zero,e8,mf8,ta,ma
vlm.v   v24,0(a0)
vsm.v   v24,0(a1)
ret

Note that the size adjustment cannot cover all possible combinations
of vbool*_t code patterns like the one above. We will look into the
remaining cases in follow-up patches.

PR 108185
PR 108654

gcc/ChangeLog:

* config/riscv/riscv-modes.def (ADJUST_BYTESIZE):
* config/riscv/riscv.cc (riscv_v_adjust_bytesize):
* config/riscv/riscv.h (riscv_v_adjust_bytesize):

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr108185-1.c:
* gcc.target/riscv/rvv/base/pr108185-2.c:
* gcc.target/riscv/rvv/base/pr108185-3.c:

Signed-off-by: Pan Li 
Co-authored-by: Ju-Zhe Zhong 
---
 gcc/config/riscv/riscv-modes.def  | 14 ++--
 gcc/config/riscv/riscv.cc | 22 +++
 gcc/config/riscv/riscv.h  |  1 +
 .../gcc.target/riscv/rvv/base/pr108185-1.c|  2 +-
 .../gcc.target/riscv/rvv/base/pr108185-2.c|  2 +-
 .../gcc.target/riscv/rvv/base/pr108185-3.c|  2 +-
 6 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def
index 110bddce851..4cf7cf8b1c6 100644
--- a/gcc/config/riscv/riscv-modes.def
+++ b/gcc/config/riscv/riscv-modes.def
@@ -64,13 +64,13 @@ ADJUST_ALIGNMENT (VNx16BI, 1);
 ADJUST_ALIGNMENT (VNx32BI, 1);
 ADJUST_ALIGNMENT (VNx64BI, 1);
 
-ADJUST_BYTESIZE (VNx1BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx2BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx4BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx8BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx32BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
-ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_nunits (VNx64BImode, 8));
+ADJUST_BYTESIZE (VNx1BI, riscv_v_adjust_bytesize (VNx1BImode, 1));
+ADJUST_BYTESIZE (VNx2BI, riscv_v_adjust_bytesize (VNx2BImode, 1));
+ADJUST_BYTESIZE (VNx4BI, riscv_v_adjust_bytesize (VNx4BImode, 1));
+ADJUST_BYTESIZE (VNx8BI, riscv_v_adjust_bytesize (VNx8BImode, 1));
+ADJUST_BYTESIZE (VNx16BI, riscv_v_adjust_bytesize (VNx16BImode, 2));
+ADJUST_BYTESIZE (VNx32BI, riscv_v_adjust_bytesize (VNx32BImode, 4));
+ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_bytesize (VNx64BImode, 8));
 
 ADJUST_PRECISION (VNx1BI, riscv_v_adjust_precision (VNx1BImode, 1));
 ADJUST_PRECISION (VNx2BI, riscv_v_adjust_precision (VNx2BImode, 2));
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index e7b7d87cebc..428fbb28fae 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1003,6 +1003,28 @@ riscv_v_adjust_nunits (machine_mode mode, int scale)
   return scale;
 }
 
+/* Call from ADJUST_BYTESIZE in riscv-modes.def.  Return the correct
+   BYTE size for corresponding machine_mode.  */
+
+poly_int64
+riscv_v_adjust_bytesize (machine_mode mode, int scale)
+{
+  if (riscv_v_ext_vector_mode_p (mode))
+  {
+poly_uint16 mode_size = GET_MODE_SIZE (mode);
+
+if (maybe_eq (mode_size, (uint16_t)-1))
+  mode_size = riscv_vector_chunks * scale;
+
+if (known_gt (mode_size, BYTES_PER_RISCV_VECTOR))
+  mode_size = BYTES_PER_RISCV_VECTOR;
+
+return mode_size;
+  }
+
+  return scale;
+}
+
 /* Call from ADJUST_PRECISION in riscv-modes.def.  Return the correct
PRECISION size for corresponding machine_mode.  */
 
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 15b9317a8ce..66fb07d6652 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -1026,6 +1026,7 @@ extern unsigned riscv_bytes_per_vector_chunk;
 extern poly_uint16 riscv_vector_chunks;
 extern poly_int64 riscv_v_adjust_nunits (enum machine_mode, int);
 extern poly_int64 riscv_v_adjust_precision (enum machine_mode, int);
+extern poly_int64 riscv_v_adjust_bytesize

Re: [PATCH] libgomp: Fix default value of GOMP_SPINCOUNT [PR 109062]

2023-03-07 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 08, 2023 at 02:31:38PM +0800, Hongyu Wang wrote:
> Hi,
> 
> When OMP_WAIT_POLICY is not specified, current implementation will cause
> icv flag GOMP_ICV_WAIT_POLICY unset, so global variable wait_policy
> will remain its uninitialized value. Set it to -1 when the flag is not
> specified to keep GOMP_SPINCOUNT behavior consistent with its description.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> libgomp/ChangeLog:
> 
>   PR libgomp/109062
>   * env.c (initialize_env): Set wait_policy to -1 if
>   OMP_WAIT_POLICY is not specified.
>   * testsuite/libgomp.c-c++-common/pr109062.c: New test.

I think the right spot to fix this would be instead in initialize_icvs,
change the
  icvs->wait_policy = 0;
in there to
  icvs->wait_policy = -1;
That way it will be the default for all the devices, not just the
initial one.

Ok for trunk with that change if it works.

> diff --git a/libgomp/env.c b/libgomp/env.c
> index c41c1f852cc..fa36a8697d6 100644
> --- a/libgomp/env.c
> +++ b/libgomp/env.c
> @@ -2249,6 +2249,8 @@ initialize_env (void)
>  wait_policy = none->icvs.wait_policy;
>else if (all != NULL && gomp_get_icv_flag (all->flags, 
> GOMP_ICV_WAIT_POLICY))
>  wait_policy = all->icvs.wait_policy;
> +  else
> +wait_policy = -1;
>  
>if (!parse_spincount ("GOMP_SPINCOUNT", &gomp_spin_count_var))
>  {
> diff --git a/libgomp/testsuite/libgomp.c-c++-common/pr109062.c 
> b/libgomp/testsuite/libgomp.c-c++-common/pr109062.c
> new file mode 100644
> index 000..5c7c287dafd
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c-c++-common/pr109062.c
> @@ -0,0 +1,14 @@
> +/* { dg-do run } */
> +
> +#include 
> +#include 
> +
> +int
> +main ()
> +{
> +  omp_display_env (1);
> +
> +  return 0;
> +}
> +
> +/* { dg-output ".*\\\[host] GOMP_SPINCOUNT = '30'.*" { target native } } 
> */
> -- 
> 2.31.1

Jakub



[PATCH] Extend nops num in "maybe_gen_insn" for RISC-V Vector intrinsics

2023-03-07 Thread juzhe . zhong
From: Ju-Zhe Zhong 

Hi, currently maybe_gen_insn can only expand insns with up to 9 operands (nops).
For RVV intrinsics, I need to extend it to 10; otherwise I have to call GEN_FCN directly.
This patch is a quite obvious change. OK for trunk?
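
As a rough illustration of the pattern being extended, the sketch below
dispatches on the operand count and forwards the operands to a
fixed-arity generator. Everything in it is hypothetical (plain ints
instead of rtx operands, toy gen1/gen2/gen3 functions instead of
GEN_FCN), and it stops at three operands to stay short; supporting one
more operand means adding one more case, which is what the optabs.cc hunk
does for 10.

```cpp
// Hypothetical fixed-arity generators standing in for GEN_FCN (icode).
constexpr int gen1(int a) { return a; }
constexpr int gen2(int a, int b) { return a + b; }
constexpr int gen3(int a, int b, int c) { return a + b + c; }

// Dispatch on the number of operands. Each supported count needs its own
// case because the callee has a fixed number of parameters.
constexpr int dispatch(const int *ops, unsigned nops) {
  switch (nops) {
  case 1:
    return gen1(ops[0]);
  case 2:
    return gen2(ops[0], ops[1]);
  case 3:
    return gen3(ops[0], ops[1], ops[2]);
  }
  return -1;  // stands in for gcc_unreachable () on unsupported counts
}
```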

Thanks.

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins.cc 
(function_expander::use_ternop_insn): Use maybe_gen_insn instead.
(function_expander::use_widen_ternop_insn): Ditto.
* optabs.cc (maybe_gen_insn): Extend nops handling.

---
 gcc/config/riscv/riscv-vector-builtins.cc | 24 ++-
 gcc/optabs.cc |  5 +
 2 files changed, 7 insertions(+), 22 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 60381cfe98f..fcda3863576 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -3154,17 +3154,7 @@ function_expander::use_ternop_insn (bool vd_accum_p, 
insn_code icode)
   add_input_operand (Pmode, get_tail_policy_for_pred (pred));
   add_input_operand (Pmode, get_mask_policy_for_pred (pred));
   add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
-
-  /* See optabs.cc, the maximum nops is 9 for using 'maybe_gen_insn'.
- We temporarily use GCN directly. We will change it back it we
- can support nops >= 10.  */
-  gcc_assert (maybe_legitimize_operands (icode, 0, opno, m_ops));
-  rtx_insn *pat = GEN_FCN (
-icode) (m_ops[0].value, m_ops[1].value, m_ops[2].value, m_ops[3].value,
-   m_ops[4].value, m_ops[5].value, m_ops[6].value, m_ops[7].value,
-   m_ops[8].value, m_ops[9].value);
-  emit_insn (pat);
-  return m_ops[0].value;
+  return generate_insn (icode);
 }
 
 /* Implement the call using instruction ICODE, with a 1:1 mapping between
@@ -3196,17 +3186,7 @@ function_expander::use_widen_ternop_insn (insn_code 
icode)
   add_input_operand (Pmode, get_tail_policy_for_pred (pred));
   add_input_operand (Pmode, get_mask_policy_for_pred (pred));
   add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
-
-  /* See optabs.cc, the maximum nops is 9 for using 'maybe_gen_insn'.
- We temporarily use GCN directly. We will change it back it we
- can support nops >= 10.  */
-  gcc_assert (maybe_legitimize_operands (icode, 0, opno, m_ops));
-  rtx_insn *pat = GEN_FCN (
-icode) (m_ops[0].value, m_ops[1].value, m_ops[2].value, m_ops[3].value,
-   m_ops[4].value, m_ops[5].value, m_ops[6].value, m_ops[7].value,
-   m_ops[8].value, m_ops[9].value);
-  emit_insn (pat);
-  return m_ops[0].value;
+  return generate_insn (icode);
 }
 
 /* Implement the call using instruction ICODE, with a 1:1 mapping between
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index cf22bfec3f5..4c641cab192 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -8091,6 +8091,11 @@ maybe_gen_insn (enum insn_code icode, unsigned int nops,
   return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value,
  ops[3].value, ops[4].value, ops[5].value,
  ops[6].value, ops[7].value, ops[8].value);
+case 10:
+  return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value,
+ ops[3].value, ops[4].value, ops[5].value,
+ ops[6].value, ops[7].value, ops[8].value,
+ ops[9].value);
 }
   gcc_unreachable ();
 }
-- 
2.36.1



Re: [PATCH] Extend nops num in "maybe_gen_insn" for RISC-V Vector intrinsics

2023-03-07 Thread Richard Biener via Gcc-patches
On Wed, 8 Mar 2023, juzhe.zh...@rivai.ai wrote:

> From: Ju-Zhe Zhong 
> 
> Hi, current maybe_gen_insn can only expand 9 nops.
> For RVV intrinsics, I need to extend it as 10, otherwise I should use GEN_FCN.
> This patch is quite obvious change, Ok for trunk ?

The optabs.cc change is OK.

Thanks,
Richard.

> Thanks.
> 
> gcc/ChangeLog:
> 
> * config/riscv/riscv-vector-builtins.cc 
> (function_expander::use_ternop_insn): Use maybe_gen_insn instead.
> (function_expander::use_widen_ternop_insn): Ditto.
> * optabs.cc (maybe_gen_insn): Extend nops handling.
> 
> ---
>  gcc/config/riscv/riscv-vector-builtins.cc | 24 ++-
>  gcc/optabs.cc |  5 +
>  2 files changed, 7 insertions(+), 22 deletions(-)
> 
> diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
> b/gcc/config/riscv/riscv-vector-builtins.cc
> index 60381cfe98f..fcda3863576 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins.cc
> @@ -3154,17 +3154,7 @@ function_expander::use_ternop_insn (bool vd_accum_p, 
> insn_code icode)
>add_input_operand (Pmode, get_tail_policy_for_pred (pred));
>add_input_operand (Pmode, get_mask_policy_for_pred (pred));
>add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
> -
> -  /* See optabs.cc, the maximum nops is 9 for using 'maybe_gen_insn'.
> - We temporarily use GCN directly. We will change it back it we
> - can support nops >= 10.  */
> -  gcc_assert (maybe_legitimize_operands (icode, 0, opno, m_ops));
> -  rtx_insn *pat = GEN_FCN (
> -icode) (m_ops[0].value, m_ops[1].value, m_ops[2].value, m_ops[3].value,
> - m_ops[4].value, m_ops[5].value, m_ops[6].value, m_ops[7].value,
> - m_ops[8].value, m_ops[9].value);
> -  emit_insn (pat);
> -  return m_ops[0].value;
> +  return generate_insn (icode);
>  }
>  
>  /* Implement the call using instruction ICODE, with a 1:1 mapping between
> @@ -3196,17 +3186,7 @@ function_expander::use_widen_ternop_insn (insn_code 
> icode)
>add_input_operand (Pmode, get_tail_policy_for_pred (pred));
>add_input_operand (Pmode, get_mask_policy_for_pred (pred));
>add_input_operand (Pmode, get_avl_type_rtx (avl_type::NONVLMAX));
> -
> -  /* See optabs.cc, the maximum nops is 9 for using 'maybe_gen_insn'.
> - We temporarily use GCN directly. We will change it back it we
> - can support nops >= 10.  */
> -  gcc_assert (maybe_legitimize_operands (icode, 0, opno, m_ops));
> -  rtx_insn *pat = GEN_FCN (
> -icode) (m_ops[0].value, m_ops[1].value, m_ops[2].value, m_ops[3].value,
> - m_ops[4].value, m_ops[5].value, m_ops[6].value, m_ops[7].value,
> - m_ops[8].value, m_ops[9].value);
> -  emit_insn (pat);
> -  return m_ops[0].value;
> +  return generate_insn (icode);
>  }
>  
>  /* Implement the call using instruction ICODE, with a 1:1 mapping between
> diff --git a/gcc/optabs.cc b/gcc/optabs.cc
> index cf22bfec3f5..4c641cab192 100644
> --- a/gcc/optabs.cc
> +++ b/gcc/optabs.cc
> @@ -8091,6 +8091,11 @@ maybe_gen_insn (enum insn_code icode, unsigned int 
> nops,
>return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value,
> ops[3].value, ops[4].value, ops[5].value,
> ops[6].value, ops[7].value, ops[8].value);
> +case 10:
> +  return GEN_FCN (icode) (ops[0].value, ops[1].value, ops[2].value,
> +   ops[3].value, ops[4].value, ops[5].value,
> +   ops[6].value, ops[7].value, ops[8].value,
> +   ops[9].value);
>  }
>gcc_unreachable ();
>  }
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)