[PATCH] c-family: Fix up -fno-debug-cpp [PR111965]

2023-12-06 Thread Jakub Jelinek
Hi!

As can be seen in the second testcase, -fno-debug-cpp is actually
implemented the same as -fdebug-cpp and so doesn't turn the debugging
off.

The following patch fixes that.
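
As a minimal sketch of the bug (not the actual GCC code): the handler
is assumed to receive value as 1 for -fdebug-cpp and 0 for
-fno-debug-cpp.  The names toy_cpp_options, handle_debug_cpp_old,
handle_debug_cpp_new and replay_on_then_off are hypothetical and only
model the one-line change.

```c
#include <assert.h>  /* for the checks mentioned in the note below */

/* Hypothetical stand-in for the preprocessor options; only debug matters.  */
struct toy_cpp_options { int debug; };

/* Old behaviour: VALUE is ignored, so -fno-debug-cpp also enables debugging.  */
static void
handle_debug_cpp_old (struct toy_cpp_options *opts, int value)
{
  (void) value;
  opts->debug = 1;
}

/* Patched behaviour: VALUE is assumed to be 1 for -fdebug-cpp and
   0 for -fno-debug-cpp, so the negative form switches debugging off.  */
static void
handle_debug_cpp_new (struct toy_cpp_options *opts, int value)
{
  opts->debug = value;
}

/* Replays "-fdebug-cpp -fno-debug-cpp" (the options of the second
   testcase) through HANDLER and returns the final flag.  */
static int
replay_on_then_off (void (*handler) (struct toy_cpp_options *, int))
{
  struct toy_cpp_options opts = { 0 };
  handler (&opts, 1);
  handler (&opts, 0);
  return opts.debug;
}
```

With the old handler the replay leaves the flag set; with the new one
it ends up cleared, which is what the scan-file-not test relies on.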

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-12-07  Andrew Pinski  
Jakub Jelinek  

PR preprocessor/111965
gcc/c-family/
* c-opts.cc (c_common_handle_option) <case OPT_fdebug_cpp>: Set
cpp_opts->debug to value rather than 1.
gcc/testsuite/
* gcc.dg/cpp/pr111965-1.c: New test.
* gcc.dg/cpp/pr111965-2.c: New test.

--- gcc/c-family/c-opts.cc.jj   2023-12-05 09:06:05.867881859 +0100
+++ gcc/c-family/c-opts.cc  2023-12-06 18:02:20.445469185 +0100
@@ -532,7 +532,7 @@ c_common_handle_option (size_t scode, co
   break;
 
 case OPT_fdebug_cpp:
-  cpp_opts->debug = 1;
+  cpp_opts->debug = value;
   break;
 
 case OPT_ftrack_macro_expansion:
--- gcc/testsuite/gcc.dg/cpp/pr111965-1.c.jj   2023-12-06 17:54:03.696424916 +0100
+++ gcc/testsuite/gcc.dg/cpp/pr111965-1.c   2023-12-06 18:01:32.341142764 +0100
@@ -0,0 +1,5 @@
+/* PR preprocessor/111965
+   { dg-do preprocess }
+   { dg-options "-fdebug-cpp" }
+   { dg-final { scan-file pr111965-1.i "P:;F:;" } } */
+int x;
--- gcc/testsuite/gcc.dg/cpp/pr111965-2.c.jj   2023-12-06 17:59:36.953758477 +0100
+++ gcc/testsuite/gcc.dg/cpp/pr111965-2.c   2023-12-06 18:01:27.147215490 +0100
@@ -0,0 +1,5 @@
+/* PR preprocessor/111965
+   { dg-do preprocess }
+   { dg-options "-fdebug-cpp -fno-debug-cpp" }
+   { dg-final { scan-file-not pr111965-2.i "P:;F:;" } } */
+int x;

Jakub



Re: [PATCH] s390: Fix expansion of vec_step

2023-12-06 Thread Andreas Krebbel
On 12/4/23 11:14, Stefan Schulze Frielinghaus wrote:
> Add missing "s390" while expanding vec_step to __builtin_s390_vec_step.
> 
> gcc/ChangeLog:
> 
>   * config/s390/vecintrin.h (vec_step): Expand vec_step to
>   __builtin_s390_vec_step.

Ok, Thanks!

Andreas



Re: [PATCH] testsuite: Fix up gcc.target/s390/pr96127.c test for modern C [PR96127]

2023-12-06 Thread Andreas Krebbel
On 12/3/23 19:36, Jakub Jelinek wrote:
> Hi!
> 
> I've noticed this test regressed on s390x-linux with the addition of the
> switch to modern C patchset.  Haven't tried to reproduce the ICE, but as it
> was a backend ICE and FE after warning used to add such casts before (now
> errors), I think this ought to keep the testcase testing what was intended
> before.
> 
> Ok for trunk?

Ok, thanks!

Andreas



[PATCH] expr: Handle BITINT_TYPE in count_type_elements [PR112881]

2023-12-06 Thread Jakub Jelinek
Hi!

The following testcase ICEs during gimplification, because
count_type_elements doesn't handle BITINT_TYPE.  It should handle it like
other integral types.
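
A rough sketch of the failure mode, under the assumption that the
function is a switch over type codes in which an unlisted code ends up
on an error path.  The enum toy_type_code and toy_count_elements are
hypothetical; the error path is modelled as a -1 return rather than an
ICE.

```c
#include <assert.h>  /* for the checks mentioned in the note below */

/* Hypothetical, heavily reduced set of type codes; not GCC's tree codes.  */
enum toy_type_code { TOY_INTEGER, TOY_POINTER, TOY_BITINT, TOY_VOID };

/* Returns the number of scalar elements of a type with code CODE, or -1
   where the real function is assumed to reach an error path.  PATCHED
   selects whether TOY_BITINT is listed next to the other scalar codes.  */
static int
toy_count_elements (enum toy_type_code code, int patched)
{
  switch (code)
    {
    case TOY_INTEGER:
    case TOY_POINTER:
      return 1;

    case TOY_BITINT:
      /* Before the fix there is no such case label; model that by
	 sending the unpatched variant to the error value.  */
      return patched ? 1 : -1;

    default:
      return -1;
    }
}
```

Only the TOY_BITINT answer differs between the two variants; codes
that were already listed keep returning 1.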

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-12-07  Jakub Jelinek  

PR middle-end/112881
* expr.cc (count_type_elements): Handle BITINT_TYPE like INTEGER_TYPE.

* gcc.dg/bitint-50.c: New test.

--- gcc/expr.cc.jj  2023-12-01 08:10:43.037319912 +0100
+++ gcc/expr.cc 2023-12-06 17:36:15.437408307 +0100
@@ -7021,6 +7021,7 @@ count_type_elements (const_tree type, bo
 case REFERENCE_TYPE:
 case NULLPTR_TYPE:
 case OPAQUE_TYPE:
+case BITINT_TYPE:
   return 1;
 
 case ERROR_MARK:
--- gcc/testsuite/gcc.dg/bitint-50.c.jj 2023-12-06 17:37:15.502565091 +0100
+++ gcc/testsuite/gcc.dg/bitint-50.c   2023-12-06 17:36:48.866939013 +0100
@@ -0,0 +1,21 @@
+/* PR middle-end/112881 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-O2 -std=c23" } */
+
+struct S { _BitInt(64) b; };
+
+struct S
+foo (_BitInt(64) p)
+{
+  return (struct S) { p };
+}
+
+#if __BITINT_MAXWIDTH__ >= 3924
+struct T { _BitInt(3924) b; };
+
+struct T
+bar (_BitInt(3924) p)
+{
+  return (struct T) { p };
+}
+#endif

Jakub



[PATCH] tree-ssa-dce: Fix up maybe_optimize_arith_overflow for BITINT_TYPE [PR112880]

2023-12-06 Thread Jakub Jelinek
Hi!

The following testcase ICEs because maybe_optimize_arith_overflow
uses build_nonstandard_integer_type, which is inappropriate if
type is large BITINT_TYPE.
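
For illustration only: the code in the diff builds the arithmetic in an
unsigned type and converts the operands to it.  The sketch below shows
that idea at a fixed 32-bit width in plain C.  It does not model
BITINT_TYPE, and the helper names (to_signed_wrapping, wrapping_add,
wrapping_sub) are hypothetical.

```c
#include <assert.h>  /* for the checks mentioned in the note below */
#include <stdint.h>

/* Converts U to int32_t with two's complement wraparound, without
   relying on the implementation-defined out-of-range conversion.  */
static int32_t
to_signed_wrapping (uint32_t u)
{
  if (u <= (uint32_t) INT32_MAX)
    return (int32_t) u;
  return -(int32_t) (UINT32_MAX - u) - 1;
}

/* Adds in the unsigned type, where wraparound is well defined, then
   converts the result back to the signed type.  */
static int32_t
wrapping_add (int32_t a, int32_t b)
{
  return to_signed_wrapping ((uint32_t) a + (uint32_t) b);
}

/* Same for subtraction.  */
static int32_t
wrapping_sub (int32_t a, int32_t b)
{
  return to_signed_wrapping ((uint32_t) a - (uint32_t) b);
}
```

For example, wrapping_add (INT32_MAX, 1) yields INT32_MIN instead of
invoking signed overflow.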

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2023-12-07  Jakub Jelinek  

PR tree-optimization/112880
* tree-ssa-dce.cc (maybe_optimize_arith_overflow): Use
unsigned_type_for instead of conditionally calling
build_nonstandard_integer_type.

* gcc.dg/bitint-49.c: New test.

--- gcc/tree-ssa-dce.cc.jj  2023-12-05 16:14:36.833248092 +0100
+++ gcc/tree-ssa-dce.cc 2023-12-06 17:16:42.740873447 +0100
@@ -1241,9 +1241,7 @@ maybe_optimize_arith_overflow (gimple_st
   tree arg1 = gimple_call_arg (stmt, 1);
   location_t loc = gimple_location (stmt);
   tree type = TREE_TYPE (TREE_TYPE (lhs));
-  tree utype = type;
-  if (!TYPE_UNSIGNED (type))
-utype = build_nonstandard_integer_type (TYPE_PRECISION (type), 1);
+  tree utype = unsigned_type_for (type);
   tree result = fold_build2_loc (loc, subcode, utype,
 fold_convert_loc (loc, utype, arg0),
 fold_convert_loc (loc, utype, arg1));
--- gcc/testsuite/gcc.dg/bitint-49.c.jj 2023-12-06 17:15:30.461888327 +0100
+++ gcc/testsuite/gcc.dg/bitint-49.c   2023-12-06 17:15:01.669292609 +0100
@@ -0,0 +1,37 @@
+/* PR tree-optimization/112880 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-std=c23 -O2" } */
+
+#if __BITINT_MAXWIDTH__ >= 1024
+_BitInt(1024) a, b, c, d, e, f;
+
+void
+foo (void)
+{
+  __builtin_add_overflow (a, b, &a);
+  __builtin_sub_overflow (c, d, &c);
+  __builtin_mul_overflow (e, f, &e);
+}
+#endif
+
+#if __BITINT_MAXWIDTH__ >= 512
+_BitInt(512) g, h, i, j, k, l;
+
+void
+bar (void)
+{
+  __builtin_add_overflow (g, h, &g);
+  __builtin_sub_overflow (i, j, &i);
+  __builtin_mul_overflow (k, l, &k);
+}
+#endif
+
+_BitInt(32) m, n, o, p, q, r;
+
+void
+baz (void)
+{
+  __builtin_add_overflow (m, n, &m);
+  __builtin_sub_overflow (o, p, &o);
+  __builtin_mul_overflow (q, r, &q);
+}

Jakub



Re: veclower: improve selection of vector mode when lowering [PR 112787]

2023-12-06 Thread Richard Biener
On Wed, 6 Dec 2023, Andre Vieira (lists) wrote:

> Hi,
> 
> This patch addresses the issue reported in PR target/112787 by improving the
> compute type selection.  We do this by not considering types with more
> elements
> than the type we are lowering since we'd reject such types anyway.
> 
> gcc/ChangeLog:
> 
>   PR target/112787
>   * tree-vect-generic (type_for_widest_vector_mode): Add a parameter to
>   control maximum amount of elements in resulting vector mode.
>   (get_compute_type): Restrict vector_compute_type to a mode no wider
>   than the original compute type.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/aarch64/pr112787.c: New test.
> 
> Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
> x86_64-pc-linux-gnu.
> 
> Is this OK for trunk?

@@ -1347,7 +1347,7 @@ optimize_vector_constructor (gimple_stmt_iterator *gsi)
TYPE, or NULL_TREE if none is found.  */

Can you improve the function comment?  It also doesn't mention OP ...

 static tree
-type_for_widest_vector_mode (tree type, optab op)
+type_for_widest_vector_mode (tree type, optab op, poly_int64 max_nunits = 0)
 {
   machine_mode inner_mode = TYPE_MODE (type);
   machine_mode best_mode = VOIDmode, mode;
@@ -1371,7 +1371,9 @@ type_for_widest_vector_mode (tree type, optab op)
   FOR_EACH_MODE_FROM (mode, mode)
 if (GET_MODE_INNER (mode) == inner_mode
&& maybe_gt (GET_MODE_NUNITS (mode), best_nunits)
-   && optab_handler (op, mode) != CODE_FOR_nothing)
+   && optab_handler (op, mode) != CODE_FOR_nothing
+   && (known_eq (max_nunits, 0)
+   || known_lt (GET_MODE_NUNITS (mode), max_nunits)))

max_nunits suggests that known_le would be appropriate instead.
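
To make the difference concrete, here is a small sketch with plain
integers instead of poly_int64 (so "known" comparisons become ordinary
ones).  The struct toy_mode and toy_widest_nunits are hypothetical; a
cap of 0 means "no cap", as in the hunk above.

```c
#include <assert.h>  /* for the checks mentioned in the note below */

/* Hypothetical table entry: a vector mode described only by its
   element count and whether the operation is supported for it.  */
struct toy_mode { int nunits; int supported; };

/* Returns the largest supported element count allowed by MAX_NUNITS,
   or 0 if none qualifies.  INCLUSIVE selects <= (the known_le
   suggestion) instead of < (the known_lt in the patch).  */
static int
toy_widest_nunits (const struct toy_mode *modes, int n,
		   int max_nunits, int inclusive)
{
  int best = 0;
  for (int i = 0; i < n; i++)
    {
      int nunits = modes[i].nunits;
      int within_cap = (max_nunits == 0
			|| (inclusive
			    ? nunits <= max_nunits
			    : nunits < max_nunits));
      if (modes[i].supported && nunits > best && within_cap)
	best = nunits;
    }
  return best;
}
```

With modes of 2, 4 and 8 elements and a cap of 4, the exclusive test
picks 2 while the inclusive one picks 4.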

I see the only other caller with similar "problems":

}
  /* Can't use get_compute_type here, as supportable_convert_operation
 doesn't necessarily use an optab and needs two arguments.  */
  tree vec_compute_type
= type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
  if (vec_compute_type
  && VECTOR_MODE_P (TYPE_MODE (vec_compute_type))
  && subparts_gt (arg_type, vec_compute_type))

so please do not default to 0 but adjust this one as well.  It also
seems you then can remove the subparts_gt guards on both
vec_compute_type uses.

I think the API would be cleaner if we'd pass the original vector type
we can then extract TYPE_VECTOR_SUBPARTS from, avoiding the extra arg.

No?

Thanks,
Richard.


Re: [PATCH] c++: Don't diagnose ignoring of attributes if all ignored attributes are attribute_ignored_p

2023-12-06 Thread Jakub Jelinek
On Wed, Dec 06, 2023 at 03:10:41PM +0100, Jakub Jelinek wrote:
> So far tested with
> GXX_TESTSUITE_STDS=98,11,14,17,20,23,26 make check-g++ 
> RUNTESTFLAGS="dg.exp=Wno-attributes* ubsan.exp=Wno-attributes*"
> (which is all tests that use -Wno-attributes=), ok for trunk if it passes
> full bootstrap/regtest?

Successfully bootstrapped/regtested on x86_64-linux and i686-linux.

> 2023-12-06  Jakub Jelinek  
> 
> gcc/
>   * attribs.h (any_nonignored_attribute_p): Declare.
>   * attribs.cc (any_nonignored_attribute_p): New function.
> gcc/cp/
>   * parser.cc (cp_parser_statement, cp_parser_expression_statement,
>   cp_parser_declaration, cp_parser_elaborated_type_specifier,
>   cp_parser_asm_definition): Don't diagnose ignored attributes
>   if !any_nonignored_attribute_p.
>   * decl.cc (grokdeclarator): Likewise.
>   * name-lookup.cc (handle_namespace_attrs, finish_using_directive):
>   Don't diagnose ignoring of attr_ignored_p attributes.
> gcc/testsuite/
>   * g++.dg/warn/Wno-attributes-1.C: New test.

Jakub



Re: [PATCH v6] aarch64: New RTL optimization pass avoid-store-forwarding.

2023-12-06 Thread Richard Biener
On Wed, Dec 6, 2023 at 6:15 PM Manos Anagnostakis
 wrote:
>
> Hi again,
>
> I went and tested the requested changes and found out the following:
>
> 1. The pass is currently increasing insn_cnt on a NONJUMP_INSN_P, which is a 
> subset of NONDEBUG_INSN_P. I think there is no problem with depending on -g 
> with the current version. Do you see something I don't or did you mean 
> something else?

It just occurred to me - thanks for double-checking (it wasn't obvious
to me NONJUMP_INSN_P doesn't include DEBUG_INSNs ...).

> 2. Not processing all instructions is not letting cselib record all the 
> effects they have, thus it does not have updated information to find true 
> forwardings at any given time. I can confirm this since I am witnessing many 
> unexpected changes on the number of handled cases if I do this only for 
> loads/stores.

Ah, OK.  I guess I don't fully get it: it seems you use cselib to
compare addresses, and while not processing part of the address
computation might break this, other stmts in between should be
uninteresting (at least I don't see you handling intermediate
may-aliasing [small] stores to disable the splitting).

So in the end it's a compile-time trade-off between relying solely on
cselib or trying to optimize cselib use with DF for address computes?

Richard.

> Thanks in advance and please let me know your thoughts on the above.
> Manos.
>
> On Wed, Dec 6, 2023 at 5:10 PM Manos Anagnostakis 
>  wrote:
>>
>> Hi Richard,
>>
>> thanks for the useful comments.
>>
>> On Wed, Dec 6, 2023 at 4:32 PM Richard Biener  
>> wrote:
>>>
>>> On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis
>>>  wrote:
>>> >
>>> > This is an RTL pass that detects store forwarding from stores to larger 
>>> > loads (load pairs).
>>> >
>>> > This optimization is SPEC2017-driven and was found to be beneficial for 
>>> > some benchmarks,
>>> > through testing on ampere1/ampere1a machines.
>>> >
>>> > For example, it can transform cases like
>>> >
>>> > str  d5, [sp, #320]
>>> > fmul d5, d31, d29
>>> > ldp  d31, d17, [sp, #312] # Large load from small store
>>> >
>>> > to
>>> >
>>> > str  d5, [sp, #320]
>>> > fmul d5, d31, d29
>>> > ldr  d31, [sp, #312]
>>> > ldr  d17, [sp, #320]
>>> >
>>> > Currently, the pass is disabled by default on all architectures and 
>>> > enabled by a target-specific option.
>>> >
>>> > If deemed beneficial enough for a default, it will be enabled on 
>>> > ampere1/ampere1a,
>>> > or other architectures as well, without needing to be turned on by this 
>>> > option.
>>>
>>> What is aarch64-specific about the pass?
>>
>> The pass was designed to target load pairs, which are aarch64 specific, thus 
>> it cannot handle generic loads.
>>>
>>>
>>> I see an increasingly large number of target specific passes pop up 
>>> (probably
>>> for the excuse we can generalize them if necessary).  But GCC isn't LLVM
>>> and this feels like getting out of hand?
>>>
>>> The x86 backend also has its store-forwarding "pass" as part of mdreorg
>>> in ix86_split_stlf_stall_load.
>>>
>>> Richard.
>>>
>>> > Bootstrapped and regtested on aarch64-linux.
>>> >
>>> > gcc/ChangeLog:
>>> >
>>> > * config.gcc: Add aarch64-store-forwarding.o to extra_objs.
>>> > * config/aarch64/aarch64-passes.def (INSERT_PASS_AFTER): New pass.
>>> > * config/aarch64/aarch64-protos.h 
>>> > (make_pass_avoid_store_forwarding): Declare.
>>> > * config/aarch64/aarch64.opt (mavoid-store-forwarding): New 
>>> > option.
>>> > (aarch64-store-forwarding-threshold): New param.
>>> > * config/aarch64/t-aarch64: Add aarch64-store-forwarding.o
>>> > * doc/invoke.texi: Document new option and new param.
>>> > * config/aarch64/aarch64-store-forwarding.cc: New file.
>>> >
>>> > gcc/testsuite/ChangeLog:
>>> >
>>> > * gcc.target/aarch64/ldp_ssll_no_overlap_address.c: New test.
>>> > * gcc.target/aarch64/ldp_ssll_no_overlap_offset.c: New test.
>>> > * gcc.target/aarch64/ldp_ssll_overlap.c: New test.
>>> >
>>> > Signed-off-by: Manos Anagnostakis 
>>> > Co-Authored-By: Manolis Tsamis 
>>> > Co-Authored-By: Philipp Tomsich 
>>> > ---
>>> > Changes in v6:
>>> > - An obvious change. insn_cnt was incremented only on
>>> >   stores and not for every insn in the bb. Now restored.
>>> >
>>> >  gcc/config.gcc|   1 +
>>> >  gcc/config/aarch64/aarch64-passes.def |   1 +
>>> >  gcc/config/aarch64/aarch64-protos.h   |   1 +
>>> >  .../aarch64/aarch64-store-forwarding.cc   | 318 ++
>>> >  gcc/config/aarch64/aarch64.opt|   9 +
>>> >  gcc/config/aarch64/t-aarch64  |  10 +
>>> >  gcc/doc/invoke.texi   |  11 +-
>>> >  .../aarch64/ldp_ssll_no_overlap_address.c |  33 ++
>>> >  .../aarch64/ldp_ssll_no_overlap_offset.c  |  33 ++
>>> >  .../gcc.target/aarch64/ldp_ssll_overlap.c |  33 ++
>>> >  10 files changed, 449 insertions(+), 1 d

Re: [PATCH v6] aarch64: New RTL optimization pass avoid-store-forwarding.

2023-12-06 Thread Richard Biener
On Wed, Dec 6, 2023 at 7:44 PM Philipp Tomsich  wrote:
>
> On Wed, 6 Dec 2023 at 23:32, Richard Biener  
> wrote:
> >
> > On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis
> >  wrote:
> > >
> > > This is an RTL pass that detects store forwarding from stores to larger 
> > > loads (load pairs).
> > >
> > > This optimization is SPEC2017-driven and was found to be beneficial for 
> > > some benchmarks,
> > > through testing on ampere1/ampere1a machines.
> > >
> > > For example, it can transform cases like
> > >
> > > str  d5, [sp, #320]
> > > fmul d5, d31, d29
> > > ldp  d31, d17, [sp, #312] # Large load from small store
> > >
> > > to
> > >
> > > str  d5, [sp, #320]
> > > fmul d5, d31, d29
> > > ldr  d31, [sp, #312]
> > > ldr  d17, [sp, #320]
> > >
> > > Currently, the pass is disabled by default on all architectures and 
> > > enabled by a target-specific option.
> > >
> > > If deemed beneficial enough for a default, it will be enabled on 
> > > ampere1/ampere1a,
> > > or other architectures as well, without needing to be turned on by this 
> > > option.
> >
> > What is aarch64-specific about the pass?
> >
> > I see an increasingly large number of target specific passes pop up 
> > (probably
> > for the excuse we can generalize them if necessary).  But GCC isn't LLVM
> > and this feels like getting out of hand?
>
> We had an OK from Richard Sandiford on the earlier (v5) version with
> v6 just fixing an obvious bug... so I was about to merge this earlier
> just when you commented.
>
> Given that this had months of test exposure on our end, I would prefer
> to move this forward for GCC14 in its current form.
> The project of replacing architecture-specific store-forwarding passes
> with a generalized infrastructure could then be addressed in the GCC15
> timeframe (or beyond)?

It's up to target maintainers, I just picked this pass (randomly) to make this
comment (of course also knowing that STLF fails are a common issue on
pipelined uarchs).
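
For readers unfamiliar with the pattern, a sketch of the condition in
the quoted str/ldp example: a load that overlaps an earlier store but
is wider than it.  This only models byte ranges relative to a common
base register; it is not how the pass detects the case, and
toy_access, toy_overlaps and toy_large_load_from_small_store are
hypothetical names.

```c
#include <assert.h>  /* for the checks mentioned in the note below */
#include <stdint.h>

/* Byte range [offset, offset + size) relative to a common base register.  */
struct toy_access { int64_t offset; int64_t size; };

/* Nonzero when the two byte ranges share at least one byte.  */
static int
toy_overlaps (struct toy_access a, struct toy_access b)
{
  return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

/* Nonzero when LOAD overlaps STORE and is wider than it, i.e. the
   store alone cannot supply all of the loaded bytes.  */
static int
toy_large_load_from_small_store (struct toy_access store,
				 struct toy_access load)
{
  return toy_overlaps (store, load) && load.size > store.size;
}
```

In the quoted example the 8-byte store at #320 and the 16-byte ldp at
#312 satisfy the condition; after the split, neither 8-byte ldr does.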

Richard.

>
> --Philipp.
>
> >
> > The x86 backend also has its store-forwarding "pass" as part of mdreorg
> > in ix86_split_stlf_stall_load.
> >
> > Richard.
> >
> > > Bootstrapped and regtested on aarch64-linux.
> > >
> > > gcc/ChangeLog:
> > >
> > > * config.gcc: Add aarch64-store-forwarding.o to extra_objs.
> > > * config/aarch64/aarch64-passes.def (INSERT_PASS_AFTER): New pass.
> > > * config/aarch64/aarch64-protos.h 
> > > (make_pass_avoid_store_forwarding): Declare.
> > > * config/aarch64/aarch64.opt (mavoid-store-forwarding): New 
> > > option.
> > > (aarch64-store-forwarding-threshold): New param.
> > > * config/aarch64/t-aarch64: Add aarch64-store-forwarding.o
> > > * doc/invoke.texi: Document new option and new param.
> > > * config/aarch64/aarch64-store-forwarding.cc: New file.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/aarch64/ldp_ssll_no_overlap_address.c: New test.
> > > * gcc.target/aarch64/ldp_ssll_no_overlap_offset.c: New test.
> > > * gcc.target/aarch64/ldp_ssll_overlap.c: New test.
> > >
> > > Signed-off-by: Manos Anagnostakis 
> > > Co-Authored-By: Manolis Tsamis 
> > > Co-Authored-By: Philipp Tomsich 
> > > ---
> > > Changes in v6:
> > > - An obvious change. insn_cnt was incremented only on
> > >   stores and not for every insn in the bb. Now restored.
> > >
> > >  gcc/config.gcc|   1 +
> > >  gcc/config/aarch64/aarch64-passes.def |   1 +
> > >  gcc/config/aarch64/aarch64-protos.h   |   1 +
> > >  .../aarch64/aarch64-store-forwarding.cc   | 318 ++
> > >  gcc/config/aarch64/aarch64.opt|   9 +
> > >  gcc/config/aarch64/t-aarch64  |  10 +
> > >  gcc/doc/invoke.texi   |  11 +-
> > >  .../aarch64/ldp_ssll_no_overlap_address.c |  33 ++
> > >  .../aarch64/ldp_ssll_no_overlap_offset.c  |  33 ++
> > >  .../gcc.target/aarch64/ldp_ssll_overlap.c |  33 ++
> > >  10 files changed, 449 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/config/aarch64/aarch64-store-forwarding.cc
> > >  create mode 100644 
> > > gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_address.c
> > >  create mode 100644 
> > > gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_offset.c
> > >  create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_ssll_overlap.c
> > >
> > > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > > index 6450448f2f0..7c48429eb82 100644
> > > --- a/gcc/config.gcc
> > > +++ b/gcc/config.gcc
> > > @@ -350,6 +350,7 @@ aarch64*-*-*)
> > > cxx_target_objs="aarch64-c.o"
> > > d_target_objs="aarch64-d.o"
> > > extra_objs="aarch64-builtins.o aarch-common.o 
> > > aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o 
> > > aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2.o 
> > > aarch64-sve-builtins-sme.o cortex-a57-fma-steering.o 
> > > aarch64-speculation.o falkor-tag-collis

Re: [PATCH] strub: enable conditional support

2023-12-06 Thread Richard Biener
On Thu, Dec 7, 2023 at 4:34 AM Alexandre Oliva  wrote:
>
> On Dec  6, 2023, Alexandre Oliva  wrote:
>
> > Disabling the runtime bits is easy, once we determine what condition we
> > wish to test for.  I suppose testing for target support in the compiler,
> > issuing a 'sorry' in case the feature is required, would provide
> > something for libgcc configure and testsuite effective-target to test
> > for and decide whether to enable runtime support and run the tests.
>
> Instead of doing something equivalent to an implicit -fstrub=disable,
> that would quietly compile without stack scrubbing, I thought it would
> be safer to be noisy if the feature is used (requested, really) when
> support is not available.
>
>
> Targets that don't expose callee stacks to callers, such as nvptx, as
> well as -fsplit-stack compilations, violate fundamental assumptions of
> the current strub implementation.  This patch enables targets to
> disable strub, and disables it when -fsplit-stack is enabled.
>
> When strub support is disabled, the testsuite will now skip strub
> tests, and libgcc will not build the strub runtime components.
>
> Regstrapped on x86_64-linux-gnu.  Also tested with an additional patch
> for i386.cc that mirrors the nvptx.cc change, to check that strub gets
> disabled without noisy test results.  Ok to install?

OK.

Thanks,
Richard.

>
> for  gcc/ChangeLog
>
> * target.def (have_strub_support_for): New hook.
> * doc/tm.texi.in: Document it.
> * doc/tm.texi: Rebuild.
> * ipa-strub.cc: Include target.h.
> (strub_target_support_p): New.
> (can_strub_p): Call it.  Test for no flag_split_stack.
> (pass_ipa_strub::adjust_at_calls_call): Check for target
> support.
> * config/nvptx/nvptx.cc (TARGET_HAVE_STRUB_SUPPORT_FOR):
> Disable.
> * doc/sourcebuild.texi (strub): Document new effective
> target.
>
> for  gcc/testsuite/ChangeLog
>
> * gcc.dg/strub-split-stack.c: New.
> * gcc.dg/strub-unsupported.c: New.
> * gcc.dg/strub-unsupported-2.c: New.
> * gcc.dg/strub-unsupported-3.c: New.
> * lib/target-supports.exp (check_effective_target_strub): New.
> * c-c++-common/strub-O0.c: Require effective target strub.
> * c-c++-common/strub-O1.c: Likewise.
> * c-c++-common/strub-O2.c: Likewise.
> * c-c++-common/strub-O2fni.c: Likewise.
> * c-c++-common/strub-O3.c: Likewise.
> * c-c++-common/strub-O3fni.c: Likewise.
> * c-c++-common/strub-Og.c: Likewise.
> * c-c++-common/strub-Os.c: Likewise.
> * c-c++-common/strub-all1.c: Likewise.
> * c-c++-common/strub-all2.c: Likewise.
> * c-c++-common/strub-apply1.c: Likewise.
> * c-c++-common/strub-apply2.c: Likewise.
> * c-c++-common/strub-apply3.c: Likewise.
> * c-c++-common/strub-apply4.c: Likewise.
> * c-c++-common/strub-at-calls1.c: Likewise.
> * c-c++-common/strub-at-calls2.c: Likewise.
> * c-c++-common/strub-defer-O1.c: Likewise.
> * c-c++-common/strub-defer-O2.c: Likewise.
> * c-c++-common/strub-defer-O3.c: Likewise.
> * c-c++-common/strub-defer-Os.c: Likewise.
> * c-c++-common/strub-internal1.c: Likewise.
> * c-c++-common/strub-internal2.c: Likewise.
> * c-c++-common/strub-parms1.c: Likewise.
> * c-c++-common/strub-parms2.c: Likewise.
> * c-c++-common/strub-parms3.c: Likewise.
> * c-c++-common/strub-relaxed1.c: Likewise.
> * c-c++-common/strub-relaxed2.c: Likewise.
> * c-c++-common/strub-short-O0-exc.c: Likewise.
> * c-c++-common/strub-short-O0.c: Likewise.
> * c-c++-common/strub-short-O1.c: Likewise.
> * c-c++-common/strub-short-O2.c: Likewise.
> * c-c++-common/strub-short-O3.c: Likewise.
> * c-c++-common/strub-short-Os.c: Likewise.
> * c-c++-common/strub-strict1.c: Likewise.
> * c-c++-common/strub-strict2.c: Likewise.
> * c-c++-common/strub-tail-O1.c: Likewise.
> * c-c++-common/strub-tail-O2.c: Likewise.
> * c-c++-common/strub-var1.c: Likewise.
> * c-c++-common/torture/strub-callable1.c: Likewise.
> * c-c++-common/torture/strub-callable2.c: Likewise.
> * c-c++-common/torture/strub-const1.c: Likewise.
> * c-c++-common/torture/strub-const2.c: Likewise.
> * c-c++-common/torture/strub-const3.c: Likewise.
> * c-c++-common/torture/strub-const4.c: Likewise.
> * c-c++-common/torture/strub-data1.c: Likewise.
> * c-c++-common/torture/strub-data2.c: Likewise.
> * c-c++-common/torture/strub-data3.c: Likewise.
> * c-c++-common/torture/strub-data4.c: Likewise.
> * c-c++-common/torture/strub-data5.c: Likewise.
> * c-c++-common/torture/strub-indcall1.c: Likewise.
> * c-c++-common/torture/strub-indcall2.c: Likewise.
> * c-c++-common/torture/strub-in

Re: Causes to nvptx bootstrap fail: [PATCH v5] Introduce strub: machine-independent stack scrubbing

2023-12-06 Thread Richard Biener
On Wed, Dec 6, 2023 at 11:12 PM Alexandre Oliva  wrote:
>
> On Dec  6, 2023, Thomas Schwinge  wrote:
>
> > As I understand things, this cannot be implemented (at the call site) for
> > nvptx, given that the callee's stack is not visible there: PTX is unusual
> > in that the concept of a "standard" stack isn't exposed.
>
> Not even when one PTX function calls another?  Interesting.  I'd hoped
> that with control over entering and leaving strub contexts, one could
> (manually) ensure they'd run in the same execution domain.  But if not
> even that is possible, it will render the current strub implementation
> entirely unusable for this target indeed.
>
> Now, it doesn't seem to me that the build errors being experienced have
> to do with that, but rather with lack of or incomplete support for
> __builtin_{frame,stack}_address().  Are those errors expected when using
> these builtins on this target?  I'd have expected them to compile, even
> if something went wrong at runtime.
>
>
> > Instead of allowing "strub" pieces that can be implemented, should this
> > whole machinery generally be disabled (forced '-fstrub=disable', or via a
> > new target hook?)?  The libgcc functions should then not get defined
> > (thus, linker error upon accidental use), or should just '__builtin_trap'
> > if that makes more sense?  Need an effective-target for the test cases.
>
> > Alternatively, we may also leave the generic middle end handling alive,
> > and 'sorry' (or similar) in the nvptx back end, as necessary?
>
> Disabling the runtime bits is easy, once we determine what condition we
> wish to test for.  I suppose testing for target support in the compiler,
> issuing a 'sorry' in case the feature is required, would provide
> something for libgcc configure and testsuite effective-target to test
> for and decide whether to enable runtime support and run the tests.

There's always the possibility to hardcode target triplets we don't support
of course.

Richard.

> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp

2023-12-06 Thread Kewen.Lin
on 2023/12/6 13:09, Michael Meissner wrote:
> On Wed, Dec 06, 2023 at 10:22:57AM +0800, Kewen.Lin wrote:
>> I'd expect you use UNSPEC_MMA_EXTRACT to extract V16QI from the result of 
>> lxvp,
>> the current define_insn_and_split "*vsx_disassemble_pair" should be able to 
>> take
>> care of it further (eg: reg and regoff).
>>
>> BR,
>> Kewen
> 
> With Peter's subreg patch, UNSPEC_MMA_EXTRACT would produce two moves with
> SUBREGs:

With the below details, I think you meant that even with Peter's subreg patch
which was intended to get rid of UNSPEC_MMA_EXTRACT for OOmode, we could still
have sub-optimal moves?

The proposed subreg and the current UNSPEC_MMA_EXTRACT unspec are alternatives
to extract the component from the result of lxvp.  Since the latest trunk still
adopts UNSPEC_MMA_EXTRACT, I replied to Ajit with it.
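
As a loose illustration of what "extracting the component" means: the
RTL quoted below selects the halves of a 32-byte pair with byte offsets
0 and 16.  The sketch models that with plain byte copies; toy_pair,
toy_vec and toy_extract_half are hypothetical names and say nothing
about which half is "high" on a given endianness.

```c
#include <assert.h>  /* for the checks mentioned in the note below */
#include <string.h>

/* Hypothetical 32-byte container standing in for an OOmode vector pair.  */
struct toy_pair { unsigned char bytes[32]; };

/* Hypothetical 16-byte value standing in for one 128-bit half.  */
struct toy_vec { unsigned char bytes[16]; };

/* Copies the 16 bytes at BYTE_OFFSET (expected to be 0 or 16) out of
   PAIR, mirroring a subreg at that byte offset.  */
static struct toy_vec
toy_extract_half (const struct toy_pair *pair, size_t byte_offset)
{
  struct toy_vec half;
  memcpy (half.bytes, pair->bytes + byte_offset, sizeof half.bytes);
  return half;
}
```

Either mechanism (unspec or subreg) ends up naming one of these two
16-byte pieces; the difference is in how much the optimizers can see.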

> 
> For a FMA type loop such as:
> 
> union vector_hack2 {
>   vector  unsigned char vuc[2];
>   vector double v[2];
> };
> 
> static void
> use_mma_ld_st_normal_no_unroll (double * __restrict__ r,
>   const double * __restrict__ a,
>   const double * __restrict__ b,
>   size_t num)
> {
>   __vector_pair * __restrict__ v_r = ( __vector_pair * __restrict__) r;
>   const __vector_pair * __restrict__ v_a = (const __vector_pair * 
> __restrict__) a;
>   const __vector_pair * __restrict__ v_b = (const __vector_pair * 
> __restrict__) b;
>   size_t num_vector = num / (2 * (sizeof (vector double) / sizeof (double)));
>   size_t num_scalar = num % (2 * (sizeof (vector double) / sizeof (double)));
>   size_t i;
>   union vector_hack2 a_union;
>   union vector_hack2 b_union;
>   union vector_hack2 r_union;
>   vector double a_hi, a_lo;
>   vector double b_hi, b_lo;
>   vector double r_hi, r_lo;
>   union vector_hack result_hi, result_lo;
> 
> #pragma GCC unroll 0
>   for (i = 0; i < num_vector; i++)
> {
>   __builtin_vsx_disassemble_pair (&a_union.vuc, &v_a[i]);
>   __builtin_vsx_disassemble_pair (&b_union.vuc, &v_b[i]);
>   __builtin_vsx_disassemble_pair (&r_union.vuc, &v_r[i]);
> 
>   a_hi = a_union.v[0];
>   b_hi = b_union.v[0];
>   r_hi = r_union.v[0];
> 
>   a_lo = a_union.v[1];
>   b_lo = b_union.v[1];
>   r_lo = r_union.v[1];
> 
>   result_hi.v = (a_hi * b_hi) + r_hi;
>   result_lo.v = (a_lo * b_lo) + r_lo;
> 
>   __builtin_vsx_build_pair (&v_r[i], result_hi.vuc, result_lo.vuc);
> }
> 
>   if (num_scalar)
> {
>   r += num_vector * (2 * (sizeof (vector double) / sizeof (double)));
>   a += num_vector * (2 * (sizeof (vector double) / sizeof (double)));
>   b += num_vector * (2 * (sizeof (vector double) / sizeof (double)));
> 
> #pragma GCC unroll 0
>   for (i = 0; i < num_scalar; i++)
>  r[i] += (a[i] * b[i]);
> }
> 
>   return;
> }
> 
> Peter's code would produce the following in the inner loop:
> 
> (insn 16 15 19 4 (set (reg:OO 133 [ _43 ])
> (mem:OO (plus:DI (reg/v/f:DI 150 [ a ])
> (reg:DI 143 [ ivtmp.1088 ])) [6 MEM[(__vector_pair *)a_30(D) 
> + ivtmp.1088_88 * 1]+0 S32 A128])) "p10-fma.h":3285:1 2181 {*movoo}
>  (nil))
> (insn 19 16 22 4 (set (reg:OO 136 [ _48 ])
> (mem:OO (plus:DI (reg/v/f:DI 151 [ b ])
> (reg:DI 143 [ ivtmp.1088 ])) [6 MEM[(__vector_pair *)b_31(D) 
> + ivtmp.1088_88 * 1]+0 S32 A128])) "p10-fma.h":3285:1 2181 {*movoo}
>  (nil))
> (insn 22 19 25 4 (set (reg:OO 139 [ _53 ])
> (mem:OO (plus:DI (reg/v/f:DI 149 [ r ])
> (reg:DI 143 [ ivtmp.1088 ])) [6 MEM[(__vector_pair *)r_29(D) 
> + ivtmp.1088_88 * 1]+0 S32 A128])) "p10-fma.h":3285:1 2181 {*movoo}
>  (nil))
> (insn 25 22 26 4 (set (reg:V2DF 117 [ _6 ])
> (fma:V2DF (subreg:V2DF (reg:OO 136 [ _48 ]) 16)
> (subreg:V2DF (reg:OO 133 [ _43 ]) 16)
> (subreg:V2DF (reg:OO 139 [ _53 ]) 16))) "p10-fma.h":3319:35 1265 
> {*vsx_fmav2df4}
>  (nil))
> (insn 26 25 27 4 (set (reg:V2DF 118 [ _8 ])
> (fma:V2DF (subreg:V2DF (reg:OO 136 [ _48 ]) 0)
> (subreg:V2DF (reg:OO 133 [ _43 ]) 0)
> (subreg:V2DF (reg:OO 139 [ _53 ]) 0))) "p10-fma.h":3320:35 1265 
> {*vsx_fmav2df4}
>  (expr_list:REG_DEAD (reg:OO 139 [ _53 ])
> (expr_list:REG_DEAD (reg:OO 136 [ _48 ])
> (expr_list:REG_DEAD (reg:OO 133 [ _43 ])
> (nil)
> (insn 27 26 28 4 (set (reg:OO 142 [ _59 ])
> (unspec:OO [
> (subreg:V16QI (reg:V2DF 117 [ _6 ]) 0)
> (subreg:V16QI (reg:V2DF 118 [ _8 ]) 0)
> ] UNSPEC_VSX_ASSEMBLE)) 2183 {*vsx_assemble_pair}
>  (expr_list:REG_DEAD (reg:V2DF 118 [ _8 ])
> (expr_list:REG_DEAD (reg:V2DF 117 [ _6 ])
> (nil
> 
> Now in theory you could also get rid of the UNSPEC_VSX_ASSEMBLE using
> SUBREGs.

Agree, it looks doable, this comment seems more for Peter's subreg patch. :)

BR,
Kewen


Re: [patch-1v2, rs6000] enable fctiw on old archs [PR112707]

2023-12-06 Thread Kewen.Lin
Hi,

on 2023/12/6 16:13, HAO CHEN GUI wrote:
> Hi,
>   SImode in a float register is supported on P7 and above. As a result,
> "fctiw" can't be generated on old 32-bit processors, because the output
> operand of the fctiw insn is an SImode in a float/double register. This
> patch fixes the problem by adding one expand and one insn pattern for
> fctiw. The output of the new pattern is DImode. When the target doesn't
> support SImode in a float register, it calls the new insn pattern and
> converts the DImode to SImode via the stack.
> 
>   Compared to last version,
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/638860.html
> the main change is to change the mode of output operand of the new
> insn from SFmode to DImode so that it can call stfiwx pattern directly.
> No additional unspecs are needed.
> 
>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with
> no regressions. Is this OK for trunk?
> 
> Thanks
> Gui Haochen
> 
> ChangeLog
> rs6000: enable fctiw on old archs

Nit: s/old archs/old archs with stfiwx enabled/

> 
> The powerpc 32-bit processors (e.g. 5470) support the "fctiw" instruction,
> but the instruction can't be generated on such platforms as the insn is
> guarded by TARGET_POPCNTD.  The root cause is that SImode in a float
> register is supported only from Power7 on.  Actually the implementation of
> "fctiw" only needs stfiwx, which is supported by the old 32-bit processors.
> This patch enables the "fctiw" expand for these processors.
> 
> gcc/
>   PR target/112707
>   * config/rs6000/rs6000.md (expand lrintsi2): New.
>   (insn lrintsi2): Rename to...
>   (*lrintsi): ...this.
>   (lrintsi_di): New.
> 
> gcc/testsuite/
>   PR target/112707
>   * gcc.target/powerpc/pr112707-1.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 2a1b5ecfaee..dfb7f19c6ad 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -6722,7 +6722,27 @@ (define_insn "lrintdi2"
>"fctid %0,%1"
>[(set_attr "type" "fp")])
> 
> -(define_insn "lrintsi2"
> +(define_expand "lrintsi2"
> +  [(set (match_operand:SI 0 "gpc_reg_operand" "=d")
> + (unspec:SI [(match_operand:SFDF 1 "gpc_reg_operand" "")]
> +UNSPEC_FCTIW))]
> +  "TARGET_HARD_FLOAT && TARGET_STFIWX"
> +{
> +  /* For those old archs in which SImode can't be hold in float registers,
> + call lrintsi_internal2 to put the result in SFmode then

Nit: s/_internal2/_di/
 s/SFmode/DImode/

OK for trunk with the above nits fixed, thanks!

BR,
Kewen

> + convert it via stack.  */
> +  if (!TARGET_POPCNTD)
> +{
> +  rtx tmp = gen_reg_rtx (DImode);
> +  emit_insn (gen_lrintsi_di (tmp, operands[1]));
> +  rtx stack = rs6000_allocate_stack_temp (SImode, false, true);
> +  emit_insn (gen_stfiwx (stack, tmp));
> +  emit_move_insn (operands[0], stack);
> +  DONE;
> +}
> +})
> +
> +(define_insn "*lrintsi"
>[(set (match_operand:SI 0 "gpc_reg_operand" "=d")
>   (unspec:SI [(match_operand:SFDF 1 "gpc_reg_operand" "")]
>  UNSPEC_FCTIW))]
> @@ -6730,6 +6750,14 @@ (define_insn "lrintsi2"
>"fctiw %0,%1"
>[(set_attr "type" "fp")])
> 
> +(define_insn "lrintsi_di"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=d")
> + (unspec:DI [(match_operand:SFDF 1 "gpc_reg_operand" "")]
> +UNSPEC_FCTIW))]
> +  "TARGET_HARD_FLOAT && !TARGET_POPCNTD"
> +  "fctiw %0,%1"
> +  [(set_attr "type" "fp")])
> +
>  (define_insn "btrunc2"
>[(set (match_operand:SFDF 0 "gpc_reg_operand" "=d,wa")
>   (unspec:SFDF [(match_operand:SFDF 1 "gpc_reg_operand" "d,wa")]
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112707-1.c 
> b/gcc/testsuite/gcc.target/powerpc/pr112707-1.c
> new file mode 100644
> index 000..cce6bd7f690
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr112707-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=7450 -fno-math-errno" } */
> +/* { dg-require-effective-target ilp32 } */
> +/* { dg-skip-if "" { has_arch_ppc64 } } */
> +/* { dg-final { scan-assembler-times {\mfctiw\M} 2 } }  */
> +/* { dg-final { scan-assembler-times {\mstfiwx\M} 2 } }  */
> +
> +int test1 (double a)
> +{
> +  return __builtin_irint (a);
> +}
> +
> +int test2 (float a)
> +{
> +  return __builtin_irint (a);
> +}
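
For readers unfamiliar with what the two test functions above exercise: __builtin_irint rounds a floating-point value to int using the current rounding mode, which is the operation fctiw performs in hardware. The following is only a sketch of that behaviour, not part of the patch; round_to_int and check_round_to_int are hypothetical names, and std::lrint is used as a portable stand-in for the GCC built-in.

```cpp
#include <cassert>
#include <cmath>

// Portable stand-in for GCC's __builtin_irint (an assumption made so the
// sketch also builds with other compilers): round to int using the
// current rounding mode, which is round-to-nearest-even by default.
inline int round_to_int (double x) { return static_cast<int> (std::lrint (x)); }
inline int round_to_int (float x) { return static_cast<int> (std::lrint (x)); }

// Halfway cases go to the even neighbour under the default rounding mode.
inline void check_round_to_int ()
{
  assert (round_to_int (2.5) == 2);
  assert (round_to_int (3.5f) == 4);
  assert (round_to_int (-2.5) == -2);
}
```

The values assume the rounding mode has not been changed with fesetround.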




Re: [patch-2v2, rs6000] guard fctid on PPC64 and powerpc 476 [PR112707]

2023-12-06 Thread Kewen.Lin
Hi Haochen,

on 2023/12/6 16:13, HAO CHEN GUI wrote:
> Hi,
>   The "fctid" instruction is supported on 64-bit Power processors and
> powerpc 476. It needs a guard that checks for this. The patch fixes the issue.
> 
>   Compared with the last version,
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/638859.html
> the main change is to define TARGET_FCTID as POWERPC64 or PPC476, and to
> guard "lrintdi2" by TARGET_FCTID as well, since it generates fctid.
> 
>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions. Is this OK for trunk?
> 
> Thanks
> Gui Haochen
> 
> ChangeLog
> rs6000: guard fctid on PPC64 and powerpc 476.

We should unify the style, "PPC64 and PPC476", "PowerPC64 and PowerPC 476"
or "ppc64 and ppc476" or ...

> 
> fctid is supported on 64-bit Power processors and powerpc 476. It should
> be guarded by this condition. The patch fixes the issue.
> 
> gcc/
>   PR target/112707
>   * config/rs6000/rs6000.h (TARGET_FCTID): Define.
>   * config/rs6000/rs6000.md (lrintdi2): Add guard TARGET_FCTID.
>   * (lrounddi2): Replace TARGET_FPRND with TARGET_FCTID.
> 
> gcc/testsuite/
>   PR target/112707
>   * gcc.target/powerpc/pr112707.h: New.
>   * gcc.target/powerpc/pr112707-2.c: New.
>   * gcc.target/powerpc/pr112707-3.c: New.
>   * gcc.target/powerpc/pr88558-p7.c: Remove fctid for ilp32 as it's
>   now guarded by powerpc64.
>   * gcc.target/powerpc/pr88558-p8.c: Likewise.
>   * gfortran.dg/nint_p7.f90: Add powerpc64 target requirement as
>   lrounddi2 is now guarded by powerpc64.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index 22595f6ebd7..8c29ca68ccf 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -467,6 +467,8 @@ extern int rs6000_vector_align[];
>  #define TARGET_FCFIDUS   TARGET_POPCNTD
>  #define TARGET_FCTIDUZ   TARGET_POPCNTD
>  #define TARGET_FCTIWUZ   TARGET_POPCNTD
> +/* Enable fctid on ppc64 and powerpc476.  */

Nit: It seems clearer with "Only powerpc64 and powerpc476 support fctid."

> +#define TARGET_FCTID (TARGET_POWERPC64 || rs6000_cpu == PROCESSOR_PPC476)
>  #define TARGET_CTZ   TARGET_MODULO
>  #define TARGET_EXTSWSLI  (TARGET_MODULO && TARGET_POWERPC64)
>  #define TARGET_MADDLD	TARGET_MODULO
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 2a1b5ecfaee..3be79d49dc0 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -6718,7 +6718,7 @@ (define_insn "lrintdi2"
>[(set (match_operand:DI 0 "gpc_reg_operand" "=d")
>   (unspec:DI [(match_operand:SFDF 1 "gpc_reg_operand" "")]
>  UNSPEC_FCTID))]
> -  "TARGET_HARD_FLOAT"
> +  "TARGET_HARD_FLOAT && TARGET_FCTID"
>"fctid %0,%1"
>[(set_attr "type" "fp")])
> 
> @@ -6784,7 +6784,7 @@ (define_expand "lrounddi2"
> (set (match_operand:DI 0 "gpc_reg_operand")
>   (unspec:DI [(match_dup 2)]
>  UNSPEC_FCTID))]
> -  "TARGET_HARD_FLOAT && TARGET_VSX && TARGET_FPRND"
> +  "TARGET_HARD_FLOAT && TARGET_VSX && TARGET_FCTID"
>  {
>operands[2] = gen_reg_rtx (mode);
>  })
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112707-2.c 
> b/gcc/testsuite/gcc.target/powerpc/pr112707-2.c
> new file mode 100644
> index 000..672e00691ea
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr112707-2.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=7450 -fno-math-errno" } */
> +/* { dg-require-effective-target ilp32 } */
> +/* { dg-skip-if "" { has_arch_ppc64 } } */
> +/* { dg-final { scan-assembler-not {\mfctid\M} } }  */
> +
> +/* powerpc 7450 doesn't support ppc64 (-m32 -mpowerpc64), so skips it.  */
> +
> +#include "pr112707.h"
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112707-3.c 
> b/gcc/testsuite/gcc.target/powerpc/pr112707-3.c
> new file mode 100644
> index 000..924338fd390
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr112707-3.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-math-errno -mdejagnu-cpu=476fp" } */
> +/* { dg-require-effective-target ilp32 } */
> +
> +/* powerpc 476fp has hard float enabled which is required by fctid */
> +
> +#include "pr112707.h"
> +
> +/* { dg-final { scan-assembler-times {\mfctid\M} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112707.h 
> b/gcc/testsuite/gcc.target/powerpc/pr112707.h
> new file mode 100644
> index 000..e427dc6a72e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr112707.h
> @@ -0,0 +1,10 @@
> +long long test1 (double a)
> +{
> +  return __builtin_llrint (a);
> +}
> +
> +long long test2 (float a)
> +{
> +  return __builtin_llrint (a);
> +}
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr88558-p7.c 
> b/gcc/testsuite/gcc.target/powerpc/pr88558-p7.c
> index 3932656c5fd..13d433c4bdb 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr88558-p7.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr88558-p7.c
> @@ -

Re: [PATCH v3] LoongArch: Fix eh_return epilogue for normal returns

2023-12-06 Thread Yang Yujie
On Thu, Dec 07, 2023 at 11:02:58AM +0800, Xi Ruoyao wrote:
> 
> I don't like this pair of {} for the for statement.  It's not necessary
> and it changes the indent level, making the diff hard to review.
> 
> Otherwise LGTM.  I'm not sure why I didn't notice the eh_return issue
> when I learnt shrink wrapping from RISC-V...
>

Thanks for the review!  This problem on LoongArch was first noticed in a
failed libphobos test case, and the fix is partially borrowed from i386,
which seemed to be the only architecture without this issue.

So despite the extra braces (which I'd say I prefer to have because of the
new block of comment inserted on top of the if statement :P), I am going to
ask Lulu to push this.

Yujie



Re: [PATCH V3 2/3] Using pli for constant splitting

2023-12-06 Thread Kewen.Lin
Hi Jeff,

on 2023/12/6 13:24, Jiufu Guo wrote:
> Hi,
> 
> For constant building e.g. r120=0x, which does not fit 'li or lis',
> 'pli' is used to build this constant via 'emit_move_insn'.
> 
> While for a complicated constant, e.g. 0x6666666666666666ULL, when using
> 'rs6000_emit_set_long_const' to split the constant recursively, it fails to
> use 'pli' to build the half part constant: 0x66666666.
> 
> 'rs6000_emit_set_long_const' could be updated to use 'pli' to build half
> part of the constant when necessary.  For example: 0x6666666666666666ULL,
> "pli 3,1717986918; rldimi 3,3,32,0" can be used.
> 
> Compared with the previous version:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636567.html
> This version is refreshed and adds a new testcase.
> 
> Bootstrap & regtest pass on ppc64{,le}.
> Is this ok for trunk?
> 
> BR,
> Jeff (Jiufu Guo)
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add code to use
>   pli for 34bit constant.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/const_split_pli.c: New test.

Nit: Now we have:

gcc/testsuite/gcc.target/powerpc/const-build.c
gcc/testsuite/gcc.target/powerpc/const_anchors.c
gcc/testsuite/gcc.target/powerpc/const-compare.c

I'd prefer this new test case to be named something like const-build-1.c
(with a detailed comment inside) or const-build-split-pli.c,
to align with the existing ones.

> 
> ---
>  gcc/config/rs6000/rs6000.cc| 7 +++
>  gcc/testsuite/gcc.target/powerpc/const_split_pli.c | 9 +
>  2 files changed, 16 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/const_split_pli.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index dbdc72dce5d..2e074a21a05 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10509,6 +10509,13 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT 
> c, int *num_insns)
>  GEN_INT (0x)));
>};
>  
> +  if (TARGET_PREFIXED && SIGNED_INTEGER_34BIT_P (c))
> +{
> +  /* li/lis/pli */
> +  count_or_emit_insn (dest, GEN_INT (c));
> +  return;
> +}
> +
>if ((ud4 == 0x && ud3 == 0x && ud2 == 0x && (ud1 & 0x8000))
>|| (ud4 == 0 && ud3 == 0 && ud2 == 0 && !(ud1 & 0x8000)))
>  {
> diff --git a/gcc/testsuite/gcc.target/powerpc/const_split_pli.c 
> b/gcc/testsuite/gcc.target/powerpc/const_split_pli.c
> new file mode 100644
> index 000..626c93084aa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/const_split_pli.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile { target lp64 } } */
> +/* { dg-options "-O2" } */

It needs -mdejagnu-cpu=power10 as well.

> +/* { dg-require-effective-target power10_ok } */
> +
> +unsigned long long msk66() { return 0xULL; }
> +
> +/* { dg-final { scan-assembler-times {\mpli\M} 1 } } */
> +/* { dg-final { scan-assembler-not {\mli\M} } } */
> +/* { dg-final { scan-assembler-not {\mlis\M} } } */

OK for trunk with the above nits tweaked, thanks!

BR,
Kewen
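
As an aside, the arithmetic behind the "pli 3,1717986918; rldimi 3,3,32,0" sequence quoted above can be checked at compile time. This is only a sketch: fits_signed_34bit models the intent of SIGNED_INTEGER_34BIT_P and replicate_low_word models the effect of the rldimi step; both names are hypothetical and neither is GCC code.

```cpp
#include <cstdint>

// Does C fit in the signed 34-bit immediate that pli can load?
constexpr bool
fits_signed_34bit (std::int64_t c)
{
  return c >= -(INT64_C (1) << 33) && c < (INT64_C (1) << 33);
}

// Model of "rldimi rD,rD,32,0": rotate the register left by 32 and insert
// the result into the high 32 bits, keeping the low 32 bits unchanged.
constexpr std::uint64_t
replicate_low_word (std::uint64_t v)
{
  return (v << 32) | (v & 0xffffffffu);
}

// 1717986918 is 0x66666666, so one pli can load it ...
static_assert (fits_signed_34bit (1717986918), "half fits in 34 bits");
// ... while the full 64-bit pattern does not fit and needs splitting.
static_assert (!fits_signed_34bit (INT64_C (0x6666666666666666)),
	       "full constant needs more than one insn");
// pli followed by rldimi rebuilds the full constant.
static_assert (replicate_low_word (1717986918)
	       == UINT64_C (0x6666666666666666), "pli + rldimi");
```

The early return added by the patch corresponds to the first check: when the whole constant already fits, a single move is enough.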


Re: [PATCH V3 1/3]rs6000: update num_insns_constant for 2 insns

2023-12-06 Thread Kewen.Lin
Hi Jeff,

on 2023/12/6 13:24, Jiufu Guo wrote:
> Hi,
> 
> Trunk gcc supports more constants to be built via two instructions:
> e.g. "li/lis; xori/xoris/rldicl/rldicr/rldic".
> And then num_insns_constant should also be updated.
> 
> Function "rs6000_emit_set_long_const" is used to build complicated
> constants, and "num_insns_constant_gpr" is used to compute how
> many instructions are needed to build the constant. So, these
> two functions should be kept aligned.
> 
> The idea of this patch is to reuse "rs6000_emit_set_long_const" to
> compute/record the number of instructions (when computing the insn
> count, no instructions are emitted).
> 
> Compared with the previous version:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636565.html
> this version updates "rs6000_emit_set_long_const" to use a condition
> to select either "computing the insn number" or "emitting the insn",
> and puts them together to avoid misalignment in the future.
> 
> Bootstrap & regtest pass ppc64{,le}.
> Is this ok for trunk?
> 
> BR,
> Jeff (Jiufu Guo)
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add new
>   parameter to record number of instructions to build the constant.
>   (num_insns_constant_gpr): Call rs6000_emit_set_long_const to compute
>   num_insn.
> 
> ---
>  gcc/config/rs6000/rs6000.cc | 272 ++--
>  1 file changed, 137 insertions(+), 135 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 3dfd79c4c43..dbdc72dce5d 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -1115,7 +1115,7 @@ static tree rs6000_handle_longcall_attribute (tree *, 
> tree, tree, int, bool *);
>  static tree rs6000_handle_altivec_attribute (tree *, tree, tree, int, bool 
> *);
>  static tree rs6000_handle_struct_attribute (tree *, tree, tree, int, bool *);
>  static tree rs6000_builtin_vectorized_libmass (combined_fn, tree, tree);
> -static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT);
> +static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT, int * = nullptr);
>  static int rs6000_memory_move_cost (machine_mode, reg_class_t, bool);
>  static bool rs6000_debug_rtx_costs (rtx, machine_mode, int, int, int *, 
> bool);
>  static int rs6000_debug_address_cost (rtx, machine_mode, addr_space_t,
> @@ -6054,21 +6054,9 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
>  
>else if (TARGET_POWERPC64)
>  {
> -  HOST_WIDE_INT low = sext_hwi (value, 32);
> -  HOST_WIDE_INT high = value >> 31;
> -
> -  if (high == 0 || high == -1)
> - return 2;
> -
> -  high >>= 1;
> -
> -  if (low == 0 || low == high)
> - return num_insns_constant_gpr (high) + 1;
> -  else if (high == 0)
> - return num_insns_constant_gpr (low) + 1;
> -  else
> - return (num_insns_constant_gpr (high)
> - + num_insns_constant_gpr (low) + 1);
> +  int num_insns = 0;
> +  rs6000_emit_set_long_const (NULL, value, &num_insns);

Nit: Maybe nullptr to align with the others in this patch?

> +  return num_insns;
>  }
>  
>else
> @@ -10494,14 +10482,13 @@ can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int 
> *shift, HOST_WIDE_INT *mask)
>  
>  /* Subroutine of rs6000_emit_set_const, handling PowerPC64 DImode.
> Output insns to set DEST equal to the constant C as a series of
> -   lis, ori and shl instructions.  */
> +   lis, ori and shl instructions.  If NUM_INSNS is not NULL, then
> +   only increase *NUM_INSNS as the number of insns, and do not output
> +   real insns.  */

Nit: Maybe s/output real/emit any/.

>  
>  static void
> -rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
> +rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c, int *num_insns)
>  {
> -  rtx temp;
> -  int shift;
> -  HOST_WIDE_INT mask;
>HOST_WIDE_INT ud1, ud2, ud3, ud4;
>  
>ud1 = c & 0x;
> @@ -10509,168 +10496,183 @@ rs6000_emit_set_long_const (rtx dest, 
> HOST_WIDE_INT c)
>ud3 = (c >> 32) & 0x;
>ud4 = (c >> 48) & 0x;
>  
> -  if ((ud4 == 0x && ud3 == 0x && ud2 == 0x && (ud1 & 0x8000))
> -  || (ud4 == 0 && ud3 == 0 && ud2 == 0 && ! (ud1 & 0x8000)))
> -emit_move_insn (dest, GEN_INT (sext_hwi (ud1, 16)));
> +  /* This lambda is used to emit one insn or just increase the insn count.
> + When counting the insn number, no need to emit the insn.  Here, two
> + kinds of insns are needed: move and rldimi. */

Can we make the latter a bit more generic?  Like something below?

> +  auto count_or_emit_insn = [&num_insns] (rtx dest, rtx op1, rtx op2 = NULL) 
> {
> +if (num_insns)
> +  (*num_insns)++;

Nit: Make it early return.

> +else if (!op2)
> +  emit_move_insn (dest, op1);
> +else
> +  emit_insn (gen_rotldi3_insert_3 (dest, op1, GEN_INT (32), op2,
> +GEN_INT (0x)));


[&num_insns] (rtx dest_or_insn, rtx src)

if (src)
  emit_move_insn (dest_or
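
The "count or emit" idea discussed in this thread can be illustrated outside of GCC. The sketch below is an assumption-laden toy: insn_stream, build_constant and count_insns are hypothetical names, and the splitting rule is deliberately trivial rather than the rs6000 algorithm. It only shows how one routine can serve both as the emitter and as the cost query, with an early return in counting mode as suggested in the review.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the RTL stream; not a GCC type.
struct insn_stream
{
  std::vector<std::string> insns;
};

// Build C either for real (NUM_INSNS == nullptr) or in counting mode.
// The splitting rule here is a toy one, not the rs6000 algorithm.
inline void
build_constant (insn_stream &out, std::uint64_t c, int *num_insns = nullptr)
{
  auto count_or_emit = [&out, num_insns] (const char *insn)
    {
      if (num_insns)
        {
          // Counting mode: record the insn and return early.
          ++*num_insns;
          return;
        }
      out.insns.push_back (insn);
    };

  count_or_emit ("load low 32 bits");
  if (c >> 32)
    count_or_emit ("insert high 32 bits");
}

// The cost query reuses the builder, so the two cannot drift apart.
inline int
count_insns (std::uint64_t c)
{
  insn_stream unused;
  int n = 0;
  build_constant (unused, c, &n);
  assert (n > 0);  // every constant takes at least one insn
  return n;
}
```

Because count_insns calls the same code path as the emitter, adding a new special case to build_constant updates the cost automatically.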

[PATCH v26 23/23] libstdc++: Optimize std::remove_pointer compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of std::remove_pointer
by dispatching to the new __remove_pointer built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (remove_pointer): Use __remove_pointer
built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 2979d79a801..f00c07f94f9 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2089,6 +2089,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // Pointer modifications.
 
+  /// remove_pointer
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__remove_pointer)
+  template<typename _Tp>
+    struct remove_pointer
+    { using type = __remove_pointer(_Tp); };
+#else
   template<typename _Tp, typename>
     struct __remove_pointer_helper
     { using type = _Tp; };
@@ -2097,11 +2103,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __remove_pointer_helper<_Tp, _Up*>
 { using type = _Up; };
 
-  /// remove_pointer
   template<typename _Tp>
     struct remove_pointer
     : public __remove_pointer_helper<_Tp, __remove_cv_t<_Tp>>
     { };
+#endif
 
   template
 struct __add_pointer_helper
-- 
2.43.0
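
For reference, the library fallback that the built-in replaces can be reproduced in a few lines. This is a sketch in a hypothetical namespace sketch, written against the standard traits rather than libstdc++ internals; it strips top-level cv-qualifiers before matching the pointer, which is why "int* const" yields "int".

```cpp
#include <type_traits>

namespace sketch
{
  // Primary template: the second argument is T without top-level cv.
  template<typename T, typename>
    struct remove_pointer_helper
    { using type = T; };

  // Selected when the cv-unqualified type is a pointer U*.
  template<typename T, typename U>
    struct remove_pointer_helper<T, U*>
    { using type = U; };

  template<typename T>
    struct remove_pointer
    : remove_pointer_helper<T, typename std::remove_cv<T>::type>
    { };
}

static_assert (std::is_same<sketch::remove_pointer<int>::type, int>::value, "");
static_assert (std::is_same<sketch::remove_pointer<int*>::type, int>::value, "");
static_assert (std::is_same<sketch::remove_pointer<int* const>::type, int>::value, "");
static_assert (std::is_same<sketch::remove_pointer<const int*>::type, const int>::value, "");
static_assert (std::is_same<sketch::remove_pointer<int**>::type, int*>::value, "");
static_assert (std::is_same<sketch::remove_pointer<int&>::type, int&>::value, "");
```

Each use instantiates two class templates, which is the cost a built-in avoids.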



[PATCH v26 21/23] libstdc++: Optimize std::is_object compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_object
by dispatching to the new __is_object built-in trait.

libstdc++-v3/ChangeLog:
* include/std/type_traits (is_object): Use __is_object built-in
trait.
(is_object_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index b6d0441129b..2979d79a801 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -725,11 +725,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { };
 
   /// is_object
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_object)
+  template<typename _Tp>
+    struct is_object
+    : public __bool_constant<__is_object(_Tp)>
+    { };
+#else
   template<typename _Tp>
     struct is_object
     : public __not_<__or_<is_function<_Tp>, is_reference<_Tp>,
                           is_void<_Tp>>>::type
     { };
+#endif
 
   template
 struct is_member_pointer;
@@ -3280,8 +3287,15 @@ template 
   inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
 template <typename _Tp>
   inline constexpr bool is_fundamental_v = is_fundamental<_Tp>::value;
+
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_object)
+template <typename _Tp>
+  inline constexpr bool is_object_v = __is_object(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_object_v = is_object<_Tp>::value;
+#endif
+
 template <typename _Tp>
   inline constexpr bool is_scalar_v = is_scalar<_Tp>::value;
 template <typename _Tp>
-- 
2.43.0
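
The dispatch pattern used by this patch can be sketched in user code as follows. The macro SKETCH_USE_BUILTIN_IS_OBJECT and the namespace sketch are hypothetical; libstdc++ uses its own _GLIBCXX_USE_BUILTIN_TRAIT machinery instead. The sketch assumes C++14 for variable templates and falls back to composing standard traits when the compiler does not advertise __is_object.

```cpp
#include <type_traits>

// Detect the built-in only when the compiler provides __has_builtin.
#ifdef __has_builtin
# if __has_builtin (__is_object)
#  define SKETCH_USE_BUILTIN_IS_OBJECT 1
# endif
#endif

namespace sketch
{
#ifdef SKETCH_USE_BUILTIN_IS_OBJECT
  // Single intrinsic, no class template instantiations.
  template<typename T>
    constexpr bool is_object_v = __is_object (T);
#else
  // Fallback: an object type is anything that is not a function,
  // a reference, or void.
  template<typename T>
    constexpr bool is_object_v
      = !(std::is_function<T>::value || std::is_reference<T>::value
          || std::is_void<T>::value);
#endif
}

static_assert (sketch::is_object_v<int>, "");
static_assert (sketch::is_object_v<int*>, "");
static_assert (sketch::is_object_v<int[3]>, "");
static_assert (!sketch::is_object_v<int&>, "");
static_assert (!sketch::is_object_v<void>, "");
static_assert (!sketch::is_object_v<int (int)>, "");
```

Both branches give the same answers; only the amount of work the compiler does differs.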



[PATCH v26 22/23] c++: Implement __remove_pointer built-in trait

2023-12-06 Thread Ken Matsui
This patch implements a built-in trait for std::remove_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __remove_pointer.
* semantics.cc (finish_trait_type): Handle CPTK_REMOVE_POINTER.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __remove_pointer.
* g++.dg/ext/remove_pointer.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  5 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 ++
 gcc/testsuite/g++.dg/ext/remove_pointer.C | 51 +++
 4 files changed, 60 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/remove_pointer.C

diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index b833efff26e..394f006f20f 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -95,6 +95,7 @@ DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_tempo
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1)
 DEFTRAIT_TYPE (REMOVE_CVREF, "__remove_cvref", 1)
+DEFTRAIT_TYPE (REMOVE_POINTER, "__remove_pointer", 1)
 DEFTRAIT_TYPE (REMOVE_REFERENCE, "__remove_reference", 1)
 DEFTRAIT_TYPE (TYPE_PACK_ELEMENT, "__type_pack_element", -1)
 DEFTRAIT_TYPE (UNDERLYING_TYPE, "__underlying_type", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index cf3d5476dbb..4b3acbcc767 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12695,6 +12695,11 @@ finish_trait_type (cp_trait_kind kind, tree type1, 
tree type2,
type1 = TREE_TYPE (type1);
   return cv_unqualified (type1);
 
+case CPTK_REMOVE_POINTER:
+  if (TYPE_PTR_P (type1))
+type1 = TREE_TYPE (type1);
+  return type1;
+
 case CPTK_REMOVE_REFERENCE:
   if (TYPE_REF_P (type1))
type1 = TREE_TYPE (type1);
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 2242276f633..02b4b4d745d 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -167,6 +167,9 @@
 #if !__has_builtin (__remove_cvref)
 # error "__has_builtin (__remove_cvref) failed"
 #endif
+#if !__has_builtin (__remove_pointer)
+# error "__has_builtin (__remove_pointer) failed"
+#endif
 #if !__has_builtin (__remove_reference)
 # error "__has_builtin (__remove_reference) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/remove_pointer.C 
b/gcc/testsuite/g++.dg/ext/remove_pointer.C
new file mode 100644
index 000..7b13db93950
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/remove_pointer.C
@@ -0,0 +1,51 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+SA(__is_same(__remove_pointer(int), int));
+SA(__is_same(__remove_pointer(int*), int));
+SA(__is_same(__remove_pointer(int**), int*));
+
+SA(__is_same(__remove_pointer(const int*), const int));
+SA(__is_same(__remove_pointer(const int**), const int*));
+SA(__is_same(__remove_pointer(int* const), int));
+SA(__is_same(__remove_pointer(int** const), int*));
+SA(__is_same(__remove_pointer(int* const* const), int* const));
+
+SA(__is_same(__remove_pointer(volatile int*), volatile int));
+SA(__is_same(__remove_pointer(volatile int**), volatile int*));
+SA(__is_same(__remove_pointer(int* volatile), int));
+SA(__is_same(__remove_pointer(int** volatile), int*));
+SA(__is_same(__remove_pointer(int* volatile* volatile), int* volatile));
+
+SA(__is_same(__remove_pointer(const volatile int*), const volatile int));
+SA(__is_same(__remove_pointer(const volatile int**), const volatile int*));
+SA(__is_same(__remove_pointer(const int* volatile), const int));
+SA(__is_same(__remove_pointer(volatile int* const), volatile int));
+SA(__is_same(__remove_pointer(int* const volatile), int));
+SA(__is_same(__remove_pointer(const int** volatile), const int*));
+SA(__is_same(__remove_pointer(volatile int** const), volatile int*));
+SA(__is_same(__remove_pointer(int** const volatile), int*));
+SA(__is_same(__remove_pointer(int* const* const volatile), int* const));
+SA(__is_same(__remove_pointer(int* volatile* const volatile), int* volatile));
+SA(__is_same(__remove_pointer(int* const volatile* const volatile), int* const 
volatile));
+
+SA(__is_same(__remove_pointer(int&), int&));
+SA(__is_same(__remove_pointer(const int&), const int&));
+SA(__is_same(__remove_pointer(volatile int&), volatile int&));
+SA(__is_same(__remove_pointer(const volatile int&), const volatile int&));
+
+SA(__is_same(__remove_pointer(int&&), int&&));
+SA(__is_same(__remove_pointer(const int&&), const int&&));
+SA(__is_same(__remove_pointer(volatile int&&), volatile int&&));
+SA(__is_same(__remove_pointer(const volatile int&&), const volatile int&&));
+
+SA(__is_same(__remove_pointer(int[3]), int[3]));
+SA(__is_same(__remove_pointer(const int[3]), const int[3]));
+SA(__is_same(__remove_pointer(volatile int[3]), volatile int[3]));
+SA(__is_same(__remove_pointer(co

[PATCH v26 19/23] libstdc++: Optimize std::is_function compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_function
by dispatching to the new __is_function built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_function): Use __is_function
built-in trait.
(is_function_v): Likewise. Optimize its implementation.  Move
this under is_const_v as this depends on is_const_v.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index db880d87f60..b6d0441129b 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -637,6 +637,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { };
 
   /// is_function
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_function)
+  template<typename _Tp>
+    struct is_function
+    : public __bool_constant<__is_function(_Tp)>
+    { };
+#else
   template<typename _Tp>
     struct is_function
     : public __bool_constant<!is_const<const _Tp>::value> { };
@@ -648,6 +654,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct is_function<_Tp&&>
     : public false_type { };
+#endif
 
 #ifdef __cpp_lib_is_null_pointer // C++ >= 11
   /// is_null_pointer (LWG 2247).
@@ -3255,8 +3262,7 @@ template 
   inline constexpr bool is_union_v = __is_union(_Tp);
 template <typename _Tp>
   inline constexpr bool is_class_v = __is_class(_Tp);
-template <typename _Tp>
-  inline constexpr bool is_function_v = is_function<_Tp>::value;
+// is_function_v is defined below, after is_const_v.
 
 #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_reference)
 template <typename _Tp>
@@ -3293,6 +3299,19 @@ template 
   inline constexpr bool is_const_v = false;
 template <typename _Tp>
   inline constexpr bool is_const_v<const _Tp> = true;
+
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_function)
+template <typename _Tp>
+  inline constexpr bool is_function_v = __is_function(_Tp);
+#else
+template <typename _Tp>
+  inline constexpr bool is_function_v = !is_const_v<const _Tp>;
+template <typename _Tp>
+  inline constexpr bool is_function_v<_Tp&> = false;
+template <typename _Tp>
+  inline constexpr bool is_function_v<_Tp&&> = false;
+#endif
+
 template <typename _Tp>
   inline constexpr bool is_volatile_v = false;
 template <typename _Tp>
-- 
2.43.0
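
The non-built-in branch relies on a language rule that may not be obvious: cv-qualifiers applied to a function or reference type are ignored, so "const T" is only non-const when T is a function or a reference. The sketch below, in a hypothetical namespace sketch and using only standard traits, shows that trick in isolation; references are then excluded by partial specialization.

```cpp
#include <type_traits>

namespace sketch
{
  // "const T" stays non-const only for function and reference types.
  template<typename T>
    struct is_function
    : std::integral_constant<bool, !std::is_const<const T>::value> { };

  // References also ignore const, so rule them out explicitly.
  template<typename T>
    struct is_function<T&> : std::false_type { };

  template<typename T>
    struct is_function<T&&> : std::false_type { };
}

static_assert (sketch::is_function<int (int)>::value, "");
static_assert (!sketch::is_function<int>::value, "");
static_assert (!sketch::is_function<void>::value, "");
static_assert (!sketch::is_function<int (&) (int)>::value, "");
static_assert (!sketch::is_function<int (*) (int)>::value, "");
```

This is why the patch has to move is_function_v below is_const_v in the fallback branch: the variable template version depends on it.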



[PATCH v26 20/23] c++: Implement __is_object built-in trait

2023-12-06 Thread Ken Matsui
This patch implements a built-in trait for std::is_object.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_object.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_OBJECT.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_object.
* g++.dg/ext/is_object.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  6 +
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_object.C | 29 
 5 files changed, 42 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_object.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 4bea6089791..eeacead52a5 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3782,6 +3782,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_NOTHROW_CONVERTIBLE:
  inform (loc, "  %qT is not nothrow convertible from %qE", t2, t1);
   break;
+case CPTK_IS_OBJECT:
+  inform (loc, "  %qT is not an object type", t1);
+  break;
 case CPTK_IS_POINTER_INTERCONVERTIBLE_BASE_OF:
   inform (loc, "  %qT is not pointer-interconvertible base of %qT",
  t1, t2);
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 89712f18667..b833efff26e 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -78,6 +78,7 @@ DEFTRAIT_EXPR (IS_MEMBER_POINTER, "__is_member_pointer", 1)
 DEFTRAIT_EXPR (IS_NOTHROW_ASSIGNABLE, "__is_nothrow_assignable", 2)
 DEFTRAIT_EXPR (IS_NOTHROW_CONSTRUCTIBLE, "__is_nothrow_constructible", -1)
 DEFTRAIT_EXPR (IS_NOTHROW_CONVERTIBLE, "__is_nothrow_convertible", 2)
+DEFTRAIT_EXPR (IS_OBJECT, "__is_object", 1)
 DEFTRAIT_EXPR (IS_POINTER_INTERCONVERTIBLE_BASE_OF, 
"__is_pointer_interconvertible_base_of", 2)
 DEFTRAIT_EXPR (IS_POD, "__is_pod", 1)
 DEFTRAIT_EXPR (IS_POLYMORPHIC, "__is_polymorphic", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 50330922d70..cf3d5476dbb 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12415,6 +12415,11 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_NOTHROW_CONVERTIBLE:
   return is_nothrow_convertible (type1, type2);
 
+case CPTK_IS_OBJECT:
+  return (type_code1 != FUNCTION_TYPE
+ && type_code1 != REFERENCE_TYPE
+ && type_code1 != VOID_TYPE);
+
 case CPTK_IS_POINTER_INTERCONVERTIBLE_BASE_OF:
   return pointer_interconvertible_base_of_p (type1, type2);
 
@@ -12615,6 +12620,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_MEMBER_FUNCTION_POINTER:
 case CPTK_IS_MEMBER_OBJECT_POINTER:
 case CPTK_IS_MEMBER_POINTER:
+case CPTK_IS_OBJECT:
 case CPTK_IS_REFERENCE:
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 5215da27d6f..2242276f633 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -113,6 +113,9 @@
 #if !__has_builtin (__is_nothrow_convertible)
 # error "__has_builtin (__is_nothrow_convertible) failed"
 #endif
+#if !__has_builtin (__is_object)
+# error "__has_builtin (__is_object) failed"
+#endif
 #if !__has_builtin (__is_pointer_interconvertible_base_of)
 # error "__has_builtin (__is_pointer_interconvertible_base_of) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_object.C 
b/gcc/testsuite/g++.dg/ext/is_object.C
new file mode 100644
index 000..5c759a5ef69
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_object.C
@@ -0,0 +1,29 @@
+// { dg-do compile { target c++11 } }
+
+#include <testsuite_tr1.h>
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+#define SA_TEST_NON_VOLATILE(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT)
+
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+SA_TEST_NON_VOLATILE(__is_object, int (int), false);
+SA_TEST_NON_VOLATILE(__is_object, ClassType (ClassType), false);
+SA_TEST_NON_VOLATILE(__is_object,
+float (int, float, int[], int&), false);
+SA_TEST_CATEGORY(__is_object, int&, false);
+SA_TEST_CATEGORY(__is_object, ClassType&, false);
+SA_TEST_NON_VOLATILE(__is_object, int(&)(int), false);
+SA_TEST_CATEGORY(__is_object, void, false);
+
+// Sanity check.
+SA_TEST_CATEGORY(__is_object, ClassType, true);
-- 
2.43.0
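
The cv-qualification test macros used in the new test can be tried against the standard trait on any C++11 compiler. This is a sketch only: the local ClassType is a hypothetical stand-in for the testsuite type of the same name, and the macros are adapted to take a class template (std::is_object) instead of a built-in.

```cpp
#include <type_traits>

struct ClassType { };  // hypothetical stand-in for the testsuite's ClassType

#define SA(X) static_assert ((X), #X)

// Check TRAIT for TYPE under every cv-qualification.
#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)           \
  SA (TRAIT<TYPE>::value == EXPECT);                    \
  SA (TRAIT<const TYPE>::value == EXPECT);              \
  SA (TRAIT<volatile TYPE>::value == EXPECT);           \
  SA (TRAIT<const volatile TYPE>::value == EXPECT)

SA_TEST_CATEGORY (std::is_object, int&, false);
SA_TEST_CATEGORY (std::is_object, void, false);
SA_TEST_CATEGORY (std::is_object, int*, true);

// Sanity check.
SA_TEST_CATEGORY (std::is_object, ClassType, true);
```

Function types are tested with the NON_VOLATILE variant in the patch, presumably because the added qualifier would then apply to the return type rather than to the function type itself.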



[PATCH v26 18/23] c++: Implement __is_function built-in trait

2023-12-06 Thread Ken Matsui
This patch implements a built-in trait for std::is_function.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_function.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_FUNCTION.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_function.
* g++.dg/ext/is_function.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 ++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
 gcc/testsuite/g++.dg/ext/is_function.C   | 58 
 5 files changed, 69 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_function.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index aa42017f67c..4bea6089791 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3752,6 +3752,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_FINAL:
   inform (loc, "  %qT is not a final class", t1);
   break;
+case CPTK_IS_FUNCTION:
+  inform (loc, "  %qT is not a function", t1);
+  break;
 case CPTK_IS_LAYOUT_COMPATIBLE:
   inform (loc, "  %qT is not layout compatible with %qT", t1, t2);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 2d82ed3dd35..89712f18667 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -69,6 +69,7 @@ DEFTRAIT_EXPR (IS_CONVERTIBLE, "__is_convertible", 2)
 DEFTRAIT_EXPR (IS_EMPTY, "__is_empty", 1)
 DEFTRAIT_EXPR (IS_ENUM, "__is_enum", 1)
 DEFTRAIT_EXPR (IS_FINAL, "__is_final", 1)
+DEFTRAIT_EXPR (IS_FUNCTION, "__is_function", 1)
 DEFTRAIT_EXPR (IS_LAYOUT_COMPATIBLE, "__is_layout_compatible", 2)
 DEFTRAIT_EXPR (IS_LITERAL_TYPE, "__is_literal_type", 1)
 DEFTRAIT_EXPR (IS_MEMBER_FUNCTION_POINTER, "__is_member_function_pointer", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index b637798f605..50330922d70 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12388,6 +12388,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_FINAL:
   return CLASS_TYPE_P (type1) && CLASSTYPE_FINAL (type1);
 
+case CPTK_IS_FUNCTION:
+  return type_code1 == FUNCTION_TYPE;
+
 case CPTK_IS_LAYOUT_COMPATIBLE:
   return layout_compatible_type_p (type1, type2);
 
@@ -12608,6 +12611,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_BOUNDED_ARRAY:
 case CPTK_IS_CLASS:
 case CPTK_IS_ENUM:
+case CPTK_IS_FUNCTION:
 case CPTK_IS_MEMBER_FUNCTION_POINTER:
 case CPTK_IS_MEMBER_OBJECT_POINTER:
 case CPTK_IS_MEMBER_POINTER:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index b667b5c33ac..5215da27d6f 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -86,6 +86,9 @@
 #if !__has_builtin (__is_final)
 # error "__has_builtin (__is_final) failed"
 #endif
+#if !__has_builtin (__is_function)
+# error "__has_builtin (__is_function) failed"
+#endif
 #if !__has_builtin (__is_layout_compatible)
 # error "__has_builtin (__is_layout_compatible) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_function.C b/gcc/testsuite/g++.dg/ext/is_function.C
new file mode 100644
index 000..2e1594b12ad
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_function.C
@@ -0,0 +1,58 @@
+// { dg-do compile { target c++11 } }
+
+#include <testsuite_tr1.h>
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+struct A
+{ void fn(); };
+
+template<typename T>
+struct AHolder { };
+
+template<typename T, typename U>
+struct AHolder<U T::*>
+{ using type = U; };
+
+// Positive tests.
+SA(__is_function(int (int)));
+SA(__is_function(ClassType (ClassType)));
+SA(__is_function(float (int, float, int[], int&)));
+SA(__is_function(int (int, ...)));
+SA(__is_function(bool (ClassType) const));
+SA(__is_function(AHolder<decltype(&A::fn)>::type));
+
+void fn();
+SA(__is_function(decltype(fn)));
+
+// Negative tests.
+SA_TEST_CATEGORY(__is_function, int, false);
+SA_TEST_CATEGORY(__is_function, int*, false);
+SA_TEST_CATEGORY(__is_function, int&, false);
+SA_TEST_CATEGORY(__is_function, void, false);
+SA_TEST_CATEGORY(__is_function, void*, false);
+SA_TEST_CATEGORY(__is_function, void**, false);
+SA_TEST_CATEGORY(__is_function, std::nullptr_t, false);
+
+SA_TEST_CATEGORY(__is_function, AbstractClass, false);
+SA(!__is_function(int(&)(int)));
+SA(!__is_function(int(*)(int)));
+
+SA_TEST_CATEGORY(__is_function, A, false);
+SA_TEST_CATEGORY(__is_function, decltype(&A::fn), false);
+
+struct FnCallOverload
+{ void operator()(

[PATCH v26 17/23] libstdc++: Optimize std::is_reference compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_reference
by dispatching to the new __is_reference built-in trait.
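
For illustration only (not part of the patch), a minimal sketch of the
dispatch pattern used in the hunks below: one definition built on the
built-in when it is available, and the usual partial specializations
otherwise. SKETCH_HAS_BUILTIN and namespace sketch are hypothetical names;
the real header uses its own _GLIBCXX_USE_BUILTIN_TRAIT macro, whose
definition is not shown in this mail.

```cpp
#include <type_traits>

// Hypothetical feature probe; evaluates to 0 on preprocessors that lack
// __has_builtin.
#if defined(__has_builtin)
# define SKETCH_HAS_BUILTIN(B) __has_builtin(B)
#else
# define SKETCH_HAS_BUILTIN(B) 0
#endif

namespace sketch
{
#if SKETCH_HAS_BUILTIN(__is_reference)
  // Fast path: a single class template, no partial specializations for the
  // compiler to match.
  template<typename T>
    struct is_reference
    : std::integral_constant<bool, __is_reference(T)>
    { };
#else
  // Fallback: primary template plus one specialization per reference kind.
  template<typename T>
    struct is_reference : std::false_type { };

  template<typename T>
    struct is_reference<T&> : std::true_type { };

  template<typename T>
    struct is_reference<T&&> : std::true_type { };
#endif
}

static_assert(sketch::is_reference<int&>::value, "lvalue reference");
static_assert(sketch::is_reference<int&&>::value, "rvalue reference");
static_assert(sketch::is_reference<int (&)(int)>::value, "reference to function");
static_assert(!sketch::is_reference<int*>::value, "pointer is not a reference");
```

Either branch yields the same values; only the amount of template
machinery the compiler has to instantiate differs.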

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_reference): Use __is_reference
built-in trait.
(is_reference_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 1edd05acb4c..db880d87f60 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -682,6 +682,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Composite type categories.
 
   /// is_reference
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_reference)
+  template<typename _Tp>
+struct is_reference
+: public __bool_constant<__is_reference(_Tp)>
+{ };
+#else
   template<typename _Tp>
 struct is_reference
 : public false_type
@@ -696,6 +702,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_reference<_Tp&&>
 : public true_type
 { };
+#endif
 
   /// is_arithmetic
   template<typename _Tp>
@@ -3250,12 +3257,19 @@ template <typename _Tp>
   inline constexpr bool is_class_v = __is_class(_Tp);
 template <typename _Tp>
   inline constexpr bool is_function_v = is_function<_Tp>::value;
+
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_reference)
+template <typename _Tp>
+  inline constexpr bool is_reference_v = __is_reference(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_reference_v = false;
 template <typename _Tp>
   inline constexpr bool is_reference_v<_Tp&> = true;
 template <typename _Tp>
   inline constexpr bool is_reference_v<_Tp&&> = true;
+#endif
+
 template <typename _Tp>
   inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
 template <typename _Tp>
-- 
2.43.0



[PATCH v26 16/23] c++: Implement __is_reference built-in trait

2023-12-06 Thread Ken Matsui
This patch implements a built-in trait for std::is_reference.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_reference.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_REFERENCE.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_reference.
* g++.dg/ext/is_reference.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_reference.C  | 34 
 5 files changed, 45 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_reference.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 204b9989b6a..aa42017f67c 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3789,6 +3789,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_POLYMORPHIC:
   inform (loc, "  %qT is not a polymorphic type", t1);
   break;
+case CPTK_IS_REFERENCE:
+  inform (loc, "  %qT is not a reference", t1);
+  break;
 case CPTK_IS_SAME:
   inform (loc, "  %qT is not the same as %qT", t1, t2);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index f5efffdfc99..2d82ed3dd35 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -80,6 +80,7 @@ DEFTRAIT_EXPR (IS_NOTHROW_CONVERTIBLE, "__is_nothrow_convertible", 2)
 DEFTRAIT_EXPR (IS_POINTER_INTERCONVERTIBLE_BASE_OF, "__is_pointer_interconvertible_base_of", 2)
 DEFTRAIT_EXPR (IS_POD, "__is_pod", 1)
 DEFTRAIT_EXPR (IS_POLYMORPHIC, "__is_polymorphic", 1)
+DEFTRAIT_EXPR (IS_REFERENCE, "__is_reference", 1)
 DEFTRAIT_EXPR (IS_SAME, "__is_same", 2)
 DEFTRAIT_EXPR (IS_SCOPED_ENUM, "__is_scoped_enum", 1)
 DEFTRAIT_EXPR (IS_STD_LAYOUT, "__is_standard_layout", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 557642d6089..b637798f605 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12421,6 +12421,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_POLYMORPHIC:
   return CLASS_TYPE_P (type1) && TYPE_POLYMORPHIC_P (type1);
 
+case CPTK_IS_REFERENCE:
+  return type_code1 == REFERENCE_TYPE;
+
 case CPTK_IS_SAME:
   return same_type_p (type1, type2);
 
@@ -12608,6 +12611,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_MEMBER_FUNCTION_POINTER:
 case CPTK_IS_MEMBER_OBJECT_POINTER:
 case CPTK_IS_MEMBER_POINTER:
+case CPTK_IS_REFERENCE:
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
 case CPTK_IS_UNION:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index b5797075d52..b667b5c33ac 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -119,6 +119,9 @@
 #if !__has_builtin (__is_polymorphic)
 # error "__has_builtin (__is_polymorphic) failed"
 #endif
+#if !__has_builtin (__is_reference)
+# error "__has_builtin (__is_reference) failed"
+#endif
 #if !__has_builtin (__is_same)
 # error "__has_builtin (__is_same) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_reference.C b/gcc/testsuite/g++.dg/ext/is_reference.C
new file mode 100644
index 000..b5ce4db7afd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_reference.C
@@ -0,0 +1,34 @@
+// { dg-do compile { target c++11 } }
+
+#include <testsuite_tr1.h>
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+// Positive tests.
+SA_TEST_CATEGORY(__is_reference, int&, true);
+SA_TEST_CATEGORY(__is_reference, ClassType&, true);
+SA(__is_reference(int(&)(int)));
+SA_TEST_CATEGORY(__is_reference, int&&, true);
+SA_TEST_CATEGORY(__is_reference, ClassType&&, true);
+SA(__is_reference(int(&&)(int)));
+SA_TEST_CATEGORY(__is_reference, IncompleteClass&, true);
+
+// Negative tests
+SA_TEST_CATEGORY(__is_reference, void, false);
+SA_TEST_CATEGORY(__is_reference, int*, false);
+SA_TEST_CATEGORY(__is_reference, int[3], false);
+SA(!__is_reference(int(int)));
+SA(!__is_reference(int(*const)(int)));
+SA(!__is_reference(int(*volatile)(int)));
+SA(!__is_reference(int(*const volatile)(int)));
+
+// Sanity check.
+SA_TEST_CATEGORY(__is_reference, ClassType, false);
+SA_TEST_CATEGORY(__is_reference, IncompleteClass, false);
-- 
2.43.0



[PATCH v26 15/23] libstdc++: Optimize std::is_member_object_pointer compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of
std::is_member_object_pointer by dispatching to the new
__is_member_object_pointer built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_member_object_pointer): Use
__is_member_object_pointer built-in trait.
(is_member_object_pointer_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 99ae825301c..1edd05acb4c 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -574,6 +574,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_rvalue_reference<_Tp&&>
 : public true_type { };
 
+  /// is_member_object_pointer
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_member_object_pointer)
+  template<typename _Tp>
+struct is_member_object_pointer
+: public __bool_constant<__is_member_object_pointer(_Tp)>
+{ };
+#else
   template<typename _Tp>
 struct __is_member_object_pointer_helper
 : public false_type { };
@@ -582,11 +589,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct __is_member_object_pointer_helper<_Tp _Cp::*>
 : public __not_<is_function<_Tp>>::type { };
 
-  /// is_member_object_pointer
+
   template<typename _Tp>
 struct is_member_object_pointer
 : public __is_member_object_pointer_helper<__remove_cv_t<_Tp>>::type
 { };
+#endif
 
 #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_member_function_pointer)
   /// is_member_function_pointer
@@ -3213,9 +3221,16 @@ template <typename _Tp>
   inline constexpr bool is_rvalue_reference_v = false;
 template <typename _Tp>
   inline constexpr bool is_rvalue_reference_v<_Tp&&> = true;
+
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_member_object_pointer)
+template <typename _Tp>
+  inline constexpr bool is_member_object_pointer_v =
+__is_member_object_pointer(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_member_object_pointer_v =
 is_member_object_pointer<_Tp>::value;
+#endif
 
 #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_member_function_pointer)
 template <typename _Tp>
-- 
2.43.0



[PATCH v26 14/23] c++: Implement __is_member_object_pointer built-in trait

2023-12-06 Thread Ken Matsui
This patch implements a built-in trait for std::is_member_object_pointer.
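
For illustration only (not part of the patch): the semantics.cc hunk
computes the trait as "pointer to member, but not pointer to member
function". A minimal sketch of that relationship, written with the standard
library traits rather than the compiler internals. The names
member_object_pointer_sketch and S are hypothetical.

```cpp
#include <type_traits>

struct S
{
  int data;
  int fn(int);
};

// A member object pointer is a member pointer that does not point to a
// member function.
template<typename T>
  constexpr bool member_object_pointer_sketch()
  {
    return std::is_member_pointer<T>::value
	   && !std::is_member_function_pointer<T>::value;
  }

static_assert(member_object_pointer_sketch<int S::*>(),
	      "pointer to data member");
static_assert(member_object_pointer_sketch<int S::* const volatile>(),
	      "cv-qualified pointer to data member");
static_assert(!member_object_pointer_sketch<int (S::*)(int)>(),
	      "pointer to member function");
static_assert(!member_object_pointer_sketch<int*>(),
	      "ordinary pointer");

// The sketch agrees with the library trait on both kinds of member pointer.
static_assert(member_object_pointer_sketch<decltype(&S::data)>()
	      == std::is_member_object_pointer<decltype(&S::data)>::value, "");
static_assert(member_object_pointer_sketch<decltype(&S::fn)>()
	      == std::is_member_object_pointer<decltype(&S::fn)>::value, "");
```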

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_member_object_pointer.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_MEMBER_OBJECT_POINTER.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__is_member_object_pointer.
* g++.dg/ext/is_member_object_pointer.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc  |  3 ++
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  4 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 ++
 .../g++.dg/ext/is_member_object_pointer.C | 30 +++
 5 files changed, 41 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_member_object_pointer.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 1efc7983039..204b9989b6a 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3761,6 +3761,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_MEMBER_FUNCTION_POINTER:
   inform (loc, "  %qT is not a member function pointer", t1);
   break;
+case CPTK_IS_MEMBER_OBJECT_POINTER:
+  inform (loc, "  %qT is not a member object pointer", t1);
+  break;
 case CPTK_IS_MEMBER_POINTER:
   inform (loc, "  %qT is not a member pointer", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 03a5cc28020..f5efffdfc99 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -72,6 +72,7 @@ DEFTRAIT_EXPR (IS_FINAL, "__is_final", 1)
 DEFTRAIT_EXPR (IS_LAYOUT_COMPATIBLE, "__is_layout_compatible", 2)
 DEFTRAIT_EXPR (IS_LITERAL_TYPE, "__is_literal_type", 1)
 DEFTRAIT_EXPR (IS_MEMBER_FUNCTION_POINTER, "__is_member_function_pointer", 1)
+DEFTRAIT_EXPR (IS_MEMBER_OBJECT_POINTER, "__is_member_object_pointer", 1)
 DEFTRAIT_EXPR (IS_MEMBER_POINTER, "__is_member_pointer", 1)
 DEFTRAIT_EXPR (IS_NOTHROW_ASSIGNABLE, "__is_nothrow_assignable", 2)
 DEFTRAIT_EXPR (IS_NOTHROW_CONSTRUCTIBLE, "__is_nothrow_constructible", -1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index eefce24ac2c..557642d6089 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12397,6 +12397,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_MEMBER_FUNCTION_POINTER:
   return TYPE_PTRMEMFUNC_P (type1);
 
+case CPTK_IS_MEMBER_OBJECT_POINTER:
+  return TYPE_PTRMEM_P (type1) && !TYPE_PTRMEMFUNC_P (type1);
+
 case CPTK_IS_MEMBER_POINTER:
   return TYPE_PTRMEM_P (type1);
 
@@ -12603,6 +12606,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_CLASS:
 case CPTK_IS_ENUM:
 case CPTK_IS_MEMBER_FUNCTION_POINTER:
+case CPTK_IS_MEMBER_OBJECT_POINTER:
 case CPTK_IS_MEMBER_POINTER:
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index fb17680d3b0..b5797075d52 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -95,6 +95,9 @@
 #if !__has_builtin (__is_member_function_pointer)
 # error "__has_builtin (__is_member_function_pointer) failed"
 #endif
+#if !__has_builtin (__is_member_object_pointer)
+# error "__has_builtin (__is_member_object_pointer) failed"
+#endif
 #if !__has_builtin (__is_member_pointer)
 # error "__has_builtin (__is_member_pointer) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_member_object_pointer.C b/gcc/testsuite/g++.dg/ext/is_member_object_pointer.C
new file mode 100644
index 000..835e48c8f8e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_member_object_pointer.C
@@ -0,0 +1,30 @@
+// { dg-do compile { target c++11 } }
+
+#include <testsuite_tr1.h>
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+#define SA_TEST_NON_VOLATILE(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT)
+
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+// Positive tests.
+SA_TEST_CATEGORY(__is_member_object_pointer, int (ClassType::*), true);
+SA_TEST_CATEGORY(__is_member_object_pointer, ClassType (ClassType::*), true);
+
+// Negative tests.
+SA_TEST_NON_VOLATILE(__is_member_object_pointer, int (ClassType::*) (int), false);
+SA_TEST_NON_VOLATILE(__is_member_object_pointer, int (ClassType::*) (float, ...), false);
+SA_TEST_NON_VOLATILE(__is_member_object_pointer, ClassType (ClassType::*) (ClassType), false);
+SA_TEST_NON_VOLATILE(__is_member_object_pointer, float (ClassType::*) (i

[PATCH v26 13/23] libstdc++: Optimize std::is_member_function_pointer compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of
std::is_member_function_pointer by dispatching to the new
__is_member_function_pointer built-in trait.
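
For illustration only (not part of the patch), a minimal sketch of the kind
of library-only implementation that the built-in replaces: strip the
top-level cv-qualifiers, match the pointer-to-member shape with a partial
specialization, then ask whether the pointee is a function type. Namespace
sketch, is_mem_fn_ptr_helper and S are hypothetical names.

```cpp
#include <type_traits>

namespace sketch
{
  // Primary helper: the type is not a pointer to member at all.
  template<typename T>
    struct is_mem_fn_ptr_helper : std::false_type { };

  // Pointer to a member of class C; it is a member function pointer exactly
  // when the pointee type T is a function type.
  template<typename T, typename C>
    struct is_mem_fn_ptr_helper<T C::*> : std::is_function<T>::type { };

  // Remove top-level cv-qualifiers so that "T C::* const" also matches.
  template<typename T>
    struct is_member_function_pointer
    : is_mem_fn_ptr_helper<typename std::remove_cv<T>::type>::type
    { };
}

struct S
{
  int data;
  int fn(int) const;
};

static_assert(sketch::is_member_function_pointer<int (S::*)(int)>::value,
	      "pointer to member function");
static_assert(sketch::is_member_function_pointer<decltype(&S::fn)>::value,
	      "pointer to const member function");
static_assert(sketch::is_member_function_pointer<int (S::* const)(int)>::value,
	      "const-qualified pointer to member function");
static_assert(!sketch::is_member_function_pointer<int S::*>::value,
	      "pointer to data member");
```

Each use of the library-only form instantiates several class templates,
which is the cost the built-in is meant to avoid.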

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_member_function_pointer): Use
__is_member_function_pointer built-in trait.
(is_member_function_pointer_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 16 
 1 file changed, 16 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 4ab1d29ff51..99ae825301c 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -588,6 +588,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public __is_member_object_pointer_helper<__remove_cv_t<_Tp>>::type
 { };
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_member_function_pointer)
+  /// is_member_function_pointer
+  template<typename _Tp>
+struct is_member_function_pointer
+: public __bool_constant<__is_member_function_pointer(_Tp)>
+{ };
+#else
   template<typename _Tp>
 struct __is_member_function_pointer_helper
 : public false_type { };
@@ -601,6 +608,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_member_function_pointer
 : public __is_member_function_pointer_helper<__remove_cv_t<_Tp>>::type
 { };
+#endif
 
   /// is_enum
   template<typename _Tp>
@@ -3208,9 +3216,17 @@ template <typename _Tp>
 template <typename _Tp>
   inline constexpr bool is_member_object_pointer_v =
 is_member_object_pointer<_Tp>::value;
+
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_member_function_pointer)
+template <typename _Tp>
+  inline constexpr bool is_member_function_pointer_v =
+__is_member_function_pointer(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_member_function_pointer_v =
 is_member_function_pointer<_Tp>::value;
+#endif
+
 template <typename _Tp>
   inline constexpr bool is_enum_v = __is_enum(_Tp);
 template <typename _Tp>
-- 
2.43.0



[PATCH v26 12/23] c++: Implement __is_member_function_pointer built-in trait

2023-12-06 Thread Ken Matsui
This patch implements a built-in trait for std::is_member_function_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_member_function_pointer.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_MEMBER_FUNCTION_POINTER.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__is_member_function_pointer.
* g++.dg/ext/is_member_function_pointer.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc  |  3 ++
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  4 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 ++
 .../g++.dg/ext/is_member_function_pointer.C   | 31 +++
 5 files changed, 42 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_member_function_pointer.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index fb150e02ea9..1efc7983039 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3758,6 +3758,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_LITERAL_TYPE:
   inform (loc, "  %qT is not a literal type", t1);
   break;
+case CPTK_IS_MEMBER_FUNCTION_POINTER:
+  inform (loc, "  %qT is not a member function pointer", t1);
+  break;
 case CPTK_IS_MEMBER_POINTER:
   inform (loc, "  %qT is not a member pointer", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index e17f5eaeac4..03a5cc28020 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -71,6 +71,7 @@ DEFTRAIT_EXPR (IS_ENUM, "__is_enum", 1)
 DEFTRAIT_EXPR (IS_FINAL, "__is_final", 1)
 DEFTRAIT_EXPR (IS_LAYOUT_COMPATIBLE, "__is_layout_compatible", 2)
 DEFTRAIT_EXPR (IS_LITERAL_TYPE, "__is_literal_type", 1)
+DEFTRAIT_EXPR (IS_MEMBER_FUNCTION_POINTER, "__is_member_function_pointer", 1)
 DEFTRAIT_EXPR (IS_MEMBER_POINTER, "__is_member_pointer", 1)
 DEFTRAIT_EXPR (IS_NOTHROW_ASSIGNABLE, "__is_nothrow_assignable", 2)
 DEFTRAIT_EXPR (IS_NOTHROW_CONSTRUCTIBLE, "__is_nothrow_constructible", -1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 050808b96b9..eefce24ac2c 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12394,6 +12394,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_LITERAL_TYPE:
   return literal_type_p (type1);
 
+case CPTK_IS_MEMBER_FUNCTION_POINTER:
+  return TYPE_PTRMEMFUNC_P (type1);
+
 case CPTK_IS_MEMBER_POINTER:
   return TYPE_PTRMEM_P (type1);
 
@@ -12599,6 +12602,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_BOUNDED_ARRAY:
 case CPTK_IS_CLASS:
 case CPTK_IS_ENUM:
+case CPTK_IS_MEMBER_FUNCTION_POINTER:
 case CPTK_IS_MEMBER_POINTER:
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 349fae7104e..fb17680d3b0 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -92,6 +92,9 @@
 #if !__has_builtin (__is_literal_type)
 # error "__has_builtin (__is_literal_type) failed"
 #endif
+#if !__has_builtin (__is_member_function_pointer)
+# error "__has_builtin (__is_member_function_pointer) failed"
+#endif
 #if !__has_builtin (__is_member_pointer)
 # error "__has_builtin (__is_member_pointer) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_member_function_pointer.C b/gcc/testsuite/g++.dg/ext/is_member_function_pointer.C
new file mode 100644
index 000..555123e8f07
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_member_function_pointer.C
@@ -0,0 +1,31 @@
+// { dg-do compile { target c++11 } }
+
+#include <testsuite_tr1.h>
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+#define SA_TEST_FN(TRAIT, TYPE, EXPECT)\
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT);
+
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+// Positive tests.
+SA_TEST_FN(__is_member_function_pointer, int (ClassType::*) (int), true);
+SA_TEST_FN(__is_member_function_pointer, int (ClassType::*) (int) const, true);
+SA_TEST_FN(__is_member_function_pointer, int (ClassType::*) (float, ...), true);
+SA_TEST_FN(__is_member_function_pointer, ClassType (ClassType::*) (ClassType), true);
+SA_TEST_FN(__is_member_function_pointer, float (ClassType::*) (int, float, int[], int&), true);
+
+// Negative tests.
+SA_TEST_CATEGORY(__is_member_function_pointer, int (ClassType::*), false);
+SA_TEST_CATEGORY(__is_member_function_pointer, ClassType (ClassType::*), false);
+
+// Sanity check.
+SA_TEST_CATEGORY(__is_m

[PATCH v26 11/23] libstdc++: Optimize std::is_member_pointer compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_member_pointer
by dispatching to the new __is_member_pointer built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_member_pointer): Use
__is_member_pointer built-in trait.
(is_member_pointer_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 4a5068791af..4ab1d29ff51 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -716,6 +716,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct is_compound
 : public __not_<is_fundamental<_Tp>>::type { };
 
+  /// is_member_pointer
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_member_pointer)
+  template<typename _Tp>
+struct is_member_pointer
+: public __bool_constant<__is_member_pointer(_Tp)>
+{ };
+#else
   /// @cond undocumented
   template<typename _Tp>
 struct __is_member_pointer_helper
@@ -726,11 +733,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public true_type { };
   /// @endcond
 
-  /// is_member_pointer
   template<typename _Tp>
 struct is_member_pointer
 : public __is_member_pointer_helper<__remove_cv_t<_Tp>>::type
 { };
+#endif
 
   template<typename, typename>
 struct is_same;
@@ -3228,8 +3235,15 @@ template <typename _Tp>
   inline constexpr bool is_scalar_v = is_scalar<_Tp>::value;
 template <typename _Tp>
   inline constexpr bool is_compound_v = is_compound<_Tp>::value;
+
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_member_pointer)
+template <typename _Tp>
+  inline constexpr bool is_member_pointer_v = __is_member_pointer(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value;
+#endif
+
 template <typename _Tp>
   inline constexpr bool is_const_v = false;
 template <typename _Tp>
-- 
2.43.0



[PATCH v26 10/23] c++: Implement __is_member_pointer built-in trait

2023-12-06 Thread Ken Matsui
This patch implements a built-in trait for std::is_member_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_member_pointer.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_MEMBER_POINTER.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__is_member_pointer.
* g++.dg/ext/is_member_pointer.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 ++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
 gcc/testsuite/g++.dg/ext/is_member_pointer.C | 30 
 5 files changed, 41 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_member_pointer.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 062dc404ccf..fb150e02ea9 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3758,6 +3758,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_LITERAL_TYPE:
   inform (loc, "  %qT is not a literal type", t1);
   break;
+case CPTK_IS_MEMBER_POINTER:
+  inform (loc, "  %qT is not a member pointer", t1);
+  break;
 case CPTK_IS_NOTHROW_ASSIGNABLE:
   inform (loc, "  %qT is not nothrow assignable from %qT", t1, t2);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 9d848f6f77d..e17f5eaeac4 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -71,6 +71,7 @@ DEFTRAIT_EXPR (IS_ENUM, "__is_enum", 1)
 DEFTRAIT_EXPR (IS_FINAL, "__is_final", 1)
 DEFTRAIT_EXPR (IS_LAYOUT_COMPATIBLE, "__is_layout_compatible", 2)
 DEFTRAIT_EXPR (IS_LITERAL_TYPE, "__is_literal_type", 1)
+DEFTRAIT_EXPR (IS_MEMBER_POINTER, "__is_member_pointer", 1)
 DEFTRAIT_EXPR (IS_NOTHROW_ASSIGNABLE, "__is_nothrow_assignable", 2)
 DEFTRAIT_EXPR (IS_NOTHROW_CONSTRUCTIBLE, "__is_nothrow_constructible", -1)
 DEFTRAIT_EXPR (IS_NOTHROW_CONVERTIBLE, "__is_nothrow_convertible", 2)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index ea85da3b41a..050808b96b9 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12394,6 +12394,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_LITERAL_TYPE:
   return literal_type_p (type1);
 
+case CPTK_IS_MEMBER_POINTER:
+  return TYPE_PTRMEM_P (type1);
+
 case CPTK_IS_NOTHROW_ASSIGNABLE:
   return is_nothrow_xible (MODIFY_EXPR, type1, type2);
 
@@ -12596,6 +12599,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_BOUNDED_ARRAY:
 case CPTK_IS_CLASS:
 case CPTK_IS_ENUM:
+case CPTK_IS_MEMBER_POINTER:
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
 case CPTK_IS_UNION:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 744cfb3b42f..349fae7104e 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -92,6 +92,9 @@
 #if !__has_builtin (__is_literal_type)
 # error "__has_builtin (__is_literal_type) failed"
 #endif
+#if !__has_builtin (__is_member_pointer)
+# error "__has_builtin (__is_member_pointer) failed"
+#endif
 #if !__has_builtin (__is_nothrow_assignable)
 # error "__has_builtin (__is_nothrow_assignable) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_member_pointer.C b/gcc/testsuite/g++.dg/ext/is_member_pointer.C
new file mode 100644
index 000..7ee2e3ab90c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_member_pointer.C
@@ -0,0 +1,30 @@
+// { dg-do compile { target c++11 } }
+
+#include <testsuite_tr1.h>
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+#define SA_TEST_NON_VOLATILE(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT)
+
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+SA_TEST_CATEGORY(__is_member_pointer, int (ClassType::*), true);
+SA_TEST_CATEGORY(__is_member_pointer, ClassType (ClassType::*), true);
+
+SA_TEST_NON_VOLATILE(__is_member_pointer, int (ClassType::*)(int), true);
+SA_TEST_NON_VOLATILE(__is_member_pointer, int (ClassType::*)(int) const, true);
+SA_TEST_NON_VOLATILE(__is_member_pointer, int (ClassType::*)(float, ...), true);
+SA_TEST_NON_VOLATILE(__is_member_pointer, ClassType (ClassType::*)(ClassType), true);
+SA_TEST_NON_VOLATILE(__is_member_pointer,
+float (ClassType::*)(int, float, int[], int&), true);
+
+// Sanity check.
+SA_TEST_CATEGORY(__is_member_pointer, ClassType, false);
-- 
2.43.0



[PATCH v26 09/23] libstdc++: Optimize std::is_scoped_enum compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_scoped_enum
by dispatching to the new __is_scoped_enum built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_scoped_enum): Use
__is_scoped_enum built-in trait.
(is_scoped_enum_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 2a1a0aa80ff..4a5068791af 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3602,6 +3602,12 @@ template
   /// True if the type is a scoped enumeration type.
   /// @since C++23
 
+# if _GLIBCXX_USE_BUILTIN_TRAIT(__is_scoped_enum)
+  template<typename _Tp>
+struct is_scoped_enum
+: bool_constant<__is_scoped_enum(_Tp)>
+{ };
+# else
   template<typename _Tp>
 struct is_scoped_enum
 : false_type
@@ -3613,11 +3619,17 @@ template
 struct is_scoped_enum<_Tp>
 : bool_constant
 { };
+# endif
 
   /// @ingroup variable_templates
   /// @since C++23
+# if _GLIBCXX_USE_BUILTIN_TRAIT(__is_scoped_enum)
+  template<typename _Tp>
+inline constexpr bool is_scoped_enum_v = __is_scoped_enum(_Tp);
+# else
   template<typename _Tp>
 inline constexpr bool is_scoped_enum_v = is_scoped_enum<_Tp>::value;
+# endif
 #endif
 
 #ifdef __cpp_lib_reference_from_temporary // C++ >= 23 && ref_{converts,constructs}_from_temp
-- 
2.43.0



[PATCH v26 08/23] c++: Implement __is_scoped_enum built-in trait

2023-12-06 Thread Ken Matsui
This patch implements a built-in trait for std::is_scoped_enum.
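
For illustration only (not part of the patch), a minimal sketch of what
"scoped enum" means for this trait. It uses a common C++11 approximation,
not the library's or the compiler's actual implementation: an enumeration
is treated as scoped when it does not convert implicitly to int. Namespace
sketch and the enum names are hypothetical.

```cpp
#include <type_traits>

namespace sketch
{
  // Approximation: unscoped enumerations convert implicitly to int,
  // scoped ones (enum class / enum struct) do not.
  template<typename T>
    struct is_scoped_enum
    : std::integral_constant<bool,
			     std::is_enum<T>::value
			     && !std::is_convertible<T, int>::value>
    { };
}

enum Unscoped { u0 };
enum class Scoped { s0 };
enum class Narrow : unsigned char { n0 };

static_assert(sketch::is_scoped_enum<Scoped>::value, "enum class");
static_assert(sketch::is_scoped_enum<Narrow>::value,
	      "enum class with a fixed underlying type");
static_assert(!sketch::is_scoped_enum<Unscoped>::value, "unscoped enum");
static_assert(!sketch::is_scoped_enum<int>::value, "not an enumeration");
```

The built-in answers the same question directly from the compiler's
representation of the type, without going through overload resolution.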

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_scoped_enum.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_SCOPED_ENUM.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_scoped_enum.
* g++.dg/ext/is_scoped_enum.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc  |  3 +
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 +
 gcc/testsuite/g++.dg/ext/is_scoped_enum.C | 67 +++
 5 files changed, 78 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_scoped_enum.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 2311bab28c4..062dc404ccf 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3783,6 +3783,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_SAME:
   inform (loc, "  %qT is not the same as %qT", t1, t2);
   break;
+case CPTK_IS_SCOPED_ENUM:
+  inform (loc, "  %qT is not a scoped enum", t1);
+  break;
 case CPTK_IS_STD_LAYOUT:
   inform (loc, "  %qT is not an standard layout type", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 0e93e2b7114..9d848f6f77d 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -78,6 +78,7 @@ DEFTRAIT_EXPR (IS_POINTER_INTERCONVERTIBLE_BASE_OF, "__is_pointer_interconvertib
 DEFTRAIT_EXPR (IS_POD, "__is_pod", 1)
 DEFTRAIT_EXPR (IS_POLYMORPHIC, "__is_polymorphic", 1)
 DEFTRAIT_EXPR (IS_SAME, "__is_same", 2)
+DEFTRAIT_EXPR (IS_SCOPED_ENUM, "__is_scoped_enum", 1)
 DEFTRAIT_EXPR (IS_STD_LAYOUT, "__is_standard_layout", 1)
 DEFTRAIT_EXPR (IS_TRIVIAL, "__is_trivial", 1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, "__is_trivially_assignable", 2)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index e3ea835a6b1..ea85da3b41a 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12415,6 +12415,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_SAME:
   return same_type_p (type1, type2);
 
+case CPTK_IS_SCOPED_ENUM:
+  return SCOPED_ENUM_P (type1);
+
 case CPTK_IS_STD_LAYOUT:
   return std_layout_type_p (type1);
 
@@ -12594,6 +12597,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_CLASS:
 case CPTK_IS_ENUM:
 case CPTK_IS_SAME:
+case CPTK_IS_SCOPED_ENUM:
 case CPTK_IS_UNION:
   break;
 
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 4cfb817788c..744cfb3b42f 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -116,6 +116,9 @@
 #if !__has_builtin (__is_same_as)
 # error "__has_builtin (__is_same_as) failed"
 #endif
+#if !__has_builtin (__is_scoped_enum)
+# error "__has_builtin (__is_scoped_enum) failed"
+#endif
 #if !__has_builtin (__is_standard_layout)
 # error "__has_builtin (__is_standard_layout) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_scoped_enum.C b/gcc/testsuite/g++.dg/ext/is_scoped_enum.C
new file mode 100644
index 000..a563b6ee67d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_scoped_enum.C
@@ -0,0 +1,67 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+#define SA_TEST_FN(TRAIT, TYPE, EXPECT)\
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT);
+
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+enum class E { e1, e2 };
+SA_TEST_CATEGORY(__is_scoped_enum, E, true);
+enum class Ec : char { e1, e2 };
+SA_TEST_CATEGORY(__is_scoped_enum, Ec, true);
+
+// negative tests
+enum U { u1, u2 };
+SA_TEST_CATEGORY(__is_scoped_enum, U, false);
+enum F : int { f1, f2 };
+SA_TEST_CATEGORY(__is_scoped_enum, F, false);
+struct S;
+SA_TEST_CATEGORY(__is_scoped_enum, S, false);
+struct S { };
+SA_TEST_CATEGORY(__is_scoped_enum, S, false);
+
+SA_TEST_CATEGORY(__is_scoped_enum, int, false);
+SA_TEST_CATEGORY(__is_scoped_enum, int[], false);
+SA_TEST_CATEGORY(__is_scoped_enum, int[2], false);
+SA_TEST_CATEGORY(__is_scoped_enum, int[][2], false);
+SA_TEST_CATEGORY(__is_scoped_enum, int[2][3], false);
+SA_TEST_CATEGORY(__is_scoped_enum, int*, false);
+SA_TEST_CATEGORY(__is_scoped_enum, int&, false);
+SA_TEST_CATEGORY(__is_scoped_enum, int*&, false);
+SA_TEST_FN(__is_scoped_enum, int(), false);
+SA_TEST_FN(__is_scoped_enum, int(*)(), false);
+SA_TEST_FN(__is_scoped_enum, int(&)(), false);
+
+enum o

[PATCH v26 07/23] libstdc++: Optimize std::is_bounded_array compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_bounded_array
by dispatching to the new __is_bounded_array built-in trait.
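
For the variable-template form, the same idea can be sketched as below.
The names SKETCH_USE_BUILTIN_TRAIT and sketch::is_bounded_array_v are
hypothetical and exist only for illustration; the fallback mirrors the
partial-specialization approach visible in the diff, written here
without `inline` so that it also compiles as C++14.

```cpp
#include <cstddef>

// Hypothetical stand-in for _GLIBCXX_USE_BUILTIN_TRAIT.
#if defined(__has_builtin)
# define SKETCH_USE_BUILTIN_TRAIT(BT) __has_builtin(BT)
#else
# define SKETCH_USE_BUILTIN_TRAIT(BT) 0
#endif

namespace sketch {

#if SKETCH_USE_BUILTIN_TRAIT(__is_bounded_array)
  // One variable template, answered by the compiler.
  template<typename T>
    constexpr bool is_bounded_array_v = __is_bounded_array(T);
#else
  // Primary template plus a partial specialization that matches
  // only arrays of known bound.
  template<typename T>
    constexpr bool is_bounded_array_v = false;

  template<typename T, std::size_t N>
    constexpr bool is_bounded_array_v<T[N]> = true;
#endif

} // namespace sketch

static_assert(sketch::is_bounded_array_v<int[2]>, "known bound");
static_assert(!sketch::is_bounded_array_v<int[]>, "unknown bound");
static_assert(sketch::is_bounded_array_v<int[2][3]>, "outer bound known");
static_assert(!sketch::is_bounded_array_v<int[][3]>, "outer bound unknown");
static_assert(!sketch::is_bounded_array_v<int(*)[2]>, "pointer to array");
```

The expected values follow the cases listed in the new
is_bounded_array.C test: only the outermost bound matters, and pointers
or references to arrays are not arrays.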

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_bounded_array_v): Use
__is_bounded_array built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 64f9d67fe29..2a1a0aa80ff 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3506,11 +3506,16 @@ template
   /// True for a type that is an array of known bound.
   /// @ingroup variable_templates
   /// @since C++20
+# if _GLIBCXX_USE_BUILTIN_TRAIT(__is_bounded_array)
+  template<typename _Tp>
+inline constexpr bool is_bounded_array_v = __is_bounded_array(_Tp);
+# else
   template
 inline constexpr bool is_bounded_array_v = false;
 
   template
 inline constexpr bool is_bounded_array_v<_Tp[_Size]> = true;
+# endif
 
   /// True for a type that is an array of unknown bound.
   /// @ingroup variable_templates
-- 
2.43.0



[PATCH v26 06/23] c++: Implement __is_bounded_array built-in trait

2023-12-06 Thread Ken Matsui
This patch implements the built-in trait for std::is_bounded_array.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_bounded_array.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_BOUNDED_ARRAY.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__is_bounded_array.
* g++.dg/ext/is_bounded_array.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc|  3 ++
 gcc/cp/cp-trait.def |  1 +
 gcc/cp/semantics.cc |  4 +++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C|  3 ++
 gcc/testsuite/g++.dg/ext/is_bounded_array.C | 38 +
 5 files changed, 49 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_bounded_array.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index d75132e8e82..2311bab28c4 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3728,6 +3728,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_BASE_OF:
   inform (loc, "  %qT is not a base of %qT", t1, t2);
   break;
+case CPTK_IS_BOUNDED_ARRAY:
+  inform (loc, "  %qT is not a bounded array", t1);
+  break;
 case CPTK_IS_CLASS:
   inform (loc, "  %qT is not a class", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 759f10a3532..0e93e2b7114 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -62,6 +62,7 @@ DEFTRAIT_EXPR (IS_AGGREGATE, "__is_aggregate", 1)
 DEFTRAIT_EXPR (IS_ARRAY, "__is_array", 1)
 DEFTRAIT_EXPR (IS_ASSIGNABLE, "__is_assignable", 2)
 DEFTRAIT_EXPR (IS_BASE_OF, "__is_base_of", 2)
+DEFTRAIT_EXPR (IS_BOUNDED_ARRAY, "__is_bounded_array", 1)
 DEFTRAIT_EXPR (IS_CLASS, "__is_class", 1)
 DEFTRAIT_EXPR (IS_CONSTRUCTIBLE, "__is_constructible", -1)
 DEFTRAIT_EXPR (IS_CONVERTIBLE, "__is_convertible", 2)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index aa551638a78..e3ea835a6b1 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12367,6 +12367,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
  && (same_type_ignoring_top_level_qualifiers_p (type1, type2)
  || DERIVED_FROM_P (type1, type2)));
 
+case CPTK_IS_BOUNDED_ARRAY:
+  return type_code1 == ARRAY_TYPE && TYPE_DOMAIN (type1);
+
 case CPTK_IS_CLASS:
   return NON_UNION_CLASS_TYPE_P (type1);
 
@@ -12587,6 +12590,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
   break;
 
 case CPTK_IS_ARRAY:
+case CPTK_IS_BOUNDED_ARRAY:
 case CPTK_IS_CLASS:
 case CPTK_IS_ENUM:
 case CPTK_IS_SAME:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 6b9437f7c47..4cfb817788c 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -65,6 +65,9 @@
 #if !__has_builtin (__is_base_of)
 # error "__has_builtin (__is_base_of) failed"
 #endif
+#if !__has_builtin (__is_bounded_array)
+# error "__has_builtin (__is_bounded_array) failed"
+#endif
 #if !__has_builtin (__is_class)
 # error "__has_builtin (__is_class) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_bounded_array.C b/gcc/testsuite/g++.dg/ext/is_bounded_array.C
new file mode 100644
index 000..346790eba12
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_bounded_array.C
@@ -0,0 +1,38 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+#define SA_TEST_CONST(TRAIT, TYPE, EXPECT) \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT)
+
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+SA_TEST_CATEGORY(__is_bounded_array, int[2], true);
+SA_TEST_CATEGORY(__is_bounded_array, int[], false);
+SA_TEST_CATEGORY(__is_bounded_array, int[2][3], true);
+SA_TEST_CATEGORY(__is_bounded_array, int[][3], false);
+SA_TEST_CATEGORY(__is_bounded_array, float*[2], true);
+SA_TEST_CATEGORY(__is_bounded_array, float*[], false);
+SA_TEST_CATEGORY(__is_bounded_array, float*[2][3], true);
+SA_TEST_CATEGORY(__is_bounded_array, float*[][3], false);
+SA_TEST_CATEGORY(__is_bounded_array, ClassType[2], true);
+SA_TEST_CATEGORY(__is_bounded_array, ClassType[], false);
+SA_TEST_CATEGORY(__is_bounded_array, ClassType[2][3], true);
+SA_TEST_CATEGORY(__is_bounded_array, ClassType[][3], false);
+SA_TEST_CATEGORY(__is_bounded_array, int(*)[2], false);
+SA_TEST_CATEGORY(__is_bounded_array, int(*)[], false);
+SA_TEST_CATEGORY(__is_bounded_array, int(&)[2], false);
+SA_TEST_CONST(__is_bounded_array, int(&)[], false);
+
+// Sanity check.
+S

[PATCH v26 05/23] libstdc++: Optimize std::is_array compilation performance

2023-12-06 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_array
by dispatching to the new __is_array built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_array): Use __is_array built-in
trait.
(is_array_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 677cd934b94..64f9d67fe29 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -523,6 +523,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { };
 
   /// is_array
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_array)
+  template<typename _Tp>
+struct is_array
+: public __bool_constant<__is_array(_Tp)>
+{ };
+#else
   template
 struct is_array
 : public false_type { };
@@ -534,6 +540,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_array<_Tp[]>
 : public true_type { };
+#endif
 
   template
 struct __is_pointer_helper
@@ -3169,12 +3176,17 @@ template 
 template 
   inline constexpr bool is_floating_point_v = is_floating_point<_Tp>::value;
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_array)
+template <typename _Tp>
+  inline constexpr bool is_array_v = __is_array(_Tp);
+#else
 template 
   inline constexpr bool is_array_v = false;
 template 
   inline constexpr bool is_array_v<_Tp[]> = true;
 template 
   inline constexpr bool is_array_v<_Tp[_Num]> = true;
+#endif
 
 template 
   inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
-- 
2.43.0



[PATCH v26 04/23] c++: Implement __is_array built-in trait

2023-12-06 Thread Ken Matsui
This patch implements the built-in trait for std::is_array.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_array.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_ARRAY.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_array.
* g++.dg/ext/is_array.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_array.C  | 28 
 5 files changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_array.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 29aa7bb3df8..d75132e8e82 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3719,6 +3719,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_AGGREGATE:
   inform (loc, "  %qT is not an aggregate", t1);
   break;
+case CPTK_IS_ARRAY:
+  inform (loc, "  %qT is not an array", t1);
+  break;
 case CPTK_IS_ASSIGNABLE:
   inform (loc, "  %qT is not assignable from %qT", t1, t2);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 0e48e64b8dd..759f10a3532 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -59,6 +59,7 @@ DEFTRAIT_EXPR (HAS_UNIQUE_OBJ_REPRESENTATIONS, "__has_unique_object_representati
 DEFTRAIT_EXPR (HAS_VIRTUAL_DESTRUCTOR, "__has_virtual_destructor", 1)
 DEFTRAIT_EXPR (IS_ABSTRACT, "__is_abstract", 1)
 DEFTRAIT_EXPR (IS_AGGREGATE, "__is_aggregate", 1)
+DEFTRAIT_EXPR (IS_ARRAY, "__is_array", 1)
 DEFTRAIT_EXPR (IS_ASSIGNABLE, "__is_assignable", 2)
 DEFTRAIT_EXPR (IS_BASE_OF, "__is_base_of", 2)
 DEFTRAIT_EXPR (IS_CLASS, "__is_class", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 3abbd6df983..aa551638a78 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12356,6 +12356,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2)
 case CPTK_IS_AGGREGATE:
   return CP_AGGREGATE_TYPE_P (type1);
 
+case CPTK_IS_ARRAY:
+  return type_code1 == ARRAY_TYPE;
+
 case CPTK_IS_ASSIGNABLE:
   return is_xible (MODIFY_EXPR, type1, type2);
 
@@ -12583,6 +12586,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, tree type1, tree type2)
return error_mark_node;
   break;
 
+case CPTK_IS_ARRAY:
 case CPTK_IS_CLASS:
 case CPTK_IS_ENUM:
 case CPTK_IS_SAME:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 2223f08a628..6b9437f7c47 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -56,6 +56,9 @@
 #if !__has_builtin (__is_aggregate)
 # error "__has_builtin (__is_aggregate) failed"
 #endif
+#if !__has_builtin (__is_array)
+# error "__has_builtin (__is_array) failed"
+#endif
 #if !__has_builtin (__is_assignable)
 # error "__has_builtin (__is_assignable) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_array.C b/gcc/testsuite/g++.dg/ext/is_array.C
new file mode 100644
index 000..facfed5c7cb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_array.C
@@ -0,0 +1,28 @@
+// { dg-do compile { target c++11 } }
+
+#include 
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+#define SA_TEST_CATEGORY(TRAIT, X, expect) \
+  SA(TRAIT(X) == expect);  \
+  SA(TRAIT(const X) == expect);\
+  SA(TRAIT(volatile X) == expect); \
+  SA(TRAIT(const volatile X) == expect)
+
+SA_TEST_CATEGORY(__is_array, int[2], true);
+SA_TEST_CATEGORY(__is_array, int[], true);
+SA_TEST_CATEGORY(__is_array, int[2][3], true);
+SA_TEST_CATEGORY(__is_array, int[][3], true);
+SA_TEST_CATEGORY(__is_array, float*[2], true);
+SA_TEST_CATEGORY(__is_array, float*[], true);
+SA_TEST_CATEGORY(__is_array, float*[2][3], true);
+SA_TEST_CATEGORY(__is_array, float*[][3], true);
+SA_TEST_CATEGORY(__is_array, ClassType[2], true);
+SA_TEST_CATEGORY(__is_array, ClassType[], true);
+SA_TEST_CATEGORY(__is_array, ClassType[2][3], true);
+SA_TEST_CATEGORY(__is_array, ClassType[][3], true);
+
+// Sanity check.
+SA_TEST_CATEGORY(__is_array, ClassType, false);
-- 
2.43.0



[PATCH v26 03/23] c++: Accept the use of built-in trait identifiers

2023-12-06 Thread Ken Matsui
This patch accepts the use of built-in trait identifiers when they are
actually not used as traits.  Specifically, we check if the subsequent
token is '(' for ordinary built-in traits or is '<' only for the special
__type_pack_element built-in trait.  If those identifiers are used
differently, the parser treats them as normal identifiers.  This allows
us to accept code like: struct __is_pointer {};.
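
The lookahead rule described above can be modelled in a few lines.
This is a toy sketch, not the parser code: TokKind, Token and
peek_is_trait are hypothetical names, and the two boolean fields stand
in for IDENTIFIER_TRAIT_P and the CPTK_TYPE_PACK_ELEMENT check.

```cpp
#include <cstddef>

namespace sketch {

// Simplified token kinds; the real lexer uses cpp_ttype values such
// as CPP_NAME, CPP_OPEN_PAREN and CPP_LESS.
enum class TokKind { Name, OpenParen, Less, Other };

struct Token {
  TokKind kind;
  bool names_trait;      // plays the role of IDENTIFIER_TRAIT_P
  bool is_pack_element;  // true only for __type_pack_element
};

// True when tokens[pos] should be parsed as a built-in trait: it must
// name a trait and be followed by `<' (for __type_pack_element) or
// `(' (for every other trait).
constexpr bool
peek_is_trait (const Token *tokens, std::size_t count, std::size_t pos)
{
  return pos + 1 < count
	 && tokens[pos].kind == TokKind::Name
	 && tokens[pos].names_trait
	 && tokens[pos + 1].kind == (tokens[pos].is_pack_element
				     ? TokKind::Less : TokKind::OpenParen);
}

// `__is_pointer (' : parsed as a trait.
constexpr Token trait_use[] = { { TokKind::Name, true, false },
				{ TokKind::OpenParen, false, false } };
// `__is_pointer {' : ordinary identifier, as in struct __is_pointer {};
constexpr Token plain_use[] = { { TokKind::Name, true, false },
				{ TokKind::Other, false, false } };
// `__type_pack_element <' : parsed as a trait.
constexpr Token pack_use[] = { { TokKind::Name, true, true },
			       { TokKind::Less, false, false } };
// `__type_pack_element (' : ordinary identifier.
constexpr Token pack_paren[] = { { TokKind::Name, true, true },
				 { TokKind::OpenParen, false, false } };

} // namespace sketch

static_assert(sketch::peek_is_trait (sketch::trait_use, 2, 0), "trait");
static_assert(!sketch::peek_is_trait (sketch::plain_use, 2, 0), "identifier");
static_assert(sketch::peek_is_trait (sketch::pack_use, 2, 0), "pack trait");
static_assert(!sketch::peek_is_trait (sketch::pack_paren, 2, 0), "identifier");
```

The point of the design is that the decision needs exactly one token
of lookahead, which is why the functions in the patch take the lexer
rather than a single token.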

gcc/cp/ChangeLog:

* parser.cc (cp_lexer_lookup_trait): Rename to ...
(cp_lexer_peek_trait): ... this.  Handle a subsequent token for
the corresponding built-in trait.
(cp_lexer_lookup_trait_expr): Rename to ...
(cp_lexer_peek_trait_expr): ... this.
(cp_lexer_lookup_trait_type): Rename to ...
(cp_lexer_peek_trait_type): ... this.
(cp_lexer_next_token_is_decl_specifier_keyword): Call
cp_lexer_peek_trait_type.
(cp_parser_simple_type_specifier): Likewise.
(cp_parser_primary_expression): Call cp_lexer_peek_trait_expr.

Signed-off-by: Ken Matsui 
---
 gcc/cp/parser.cc | 53 +++-
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index a416b58a2a5..bf5add5cae9 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -246,12 +246,12 @@ static void cp_lexer_start_debugging
   (cp_lexer *) ATTRIBUTE_UNUSED;
 static void cp_lexer_stop_debugging
   (cp_lexer *) ATTRIBUTE_UNUSED;
-static const cp_trait *cp_lexer_lookup_trait
-  (const cp_token *);
-static const cp_trait *cp_lexer_lookup_trait_expr
-  (const cp_token *);
-static const cp_trait *cp_lexer_lookup_trait_type
-  (const cp_token *);
+static const cp_trait *cp_lexer_peek_trait
+  (cp_lexer *);
+static const cp_trait *cp_lexer_peek_trait_expr
+  (cp_lexer *);
+static const cp_trait *cp_lexer_peek_trait_type
+  (cp_lexer *);
 
 static cp_token_cache *cp_token_cache_new
   (cp_token *, cp_token *);
@@ -1188,15 +1188,29 @@ cp_keyword_starts_decl_specifier_p (enum rid keyword)
 }
 }
 
-/* Look ups the corresponding built-in trait if a given token is
-   a built-in trait.  Otherwise, returns nullptr.  */
+/* Peeks the corresponding built-in trait if the first token is
+   a built-in trait and the second token is either `(' or `<' depending
+   on the trait.  Otherwise, returns nullptr.  */
 
 static const cp_trait *
-cp_lexer_lookup_trait (const cp_token *token)
+cp_lexer_peek_trait (cp_lexer *lexer)
 {
-  if (token->type == CPP_NAME && IDENTIFIER_TRAIT_P (token->u.value))
-return &cp_traits[IDENTIFIER_CP_INDEX (token->u.value)];
+  const cp_token *token1 = cp_lexer_peek_token (lexer);
+  if (token1->type == CPP_NAME && IDENTIFIER_TRAIT_P (token1->u.value))
+{
+  const cp_trait &trait = cp_traits[IDENTIFIER_CP_INDEX (token1->u.value)];
+  const bool is_pack_element = (trait.kind == CPTK_TYPE_PACK_ELEMENT);
+
+  /* Check if the subsequent token is a `<' token to
+__type_pack_element or is a `(' token to everything else.  */
+  const cp_token *token2 = cp_lexer_peek_nth_token (lexer, 2);
+  if (is_pack_element && token2->type != CPP_LESS)
+   return nullptr;
+  if (!is_pack_element && token2->type != CPP_OPEN_PAREN)
+   return nullptr;
 
+  return &trait;
+}
   return nullptr;
 }
 
@@ -1204,9 +1218,9 @@ cp_lexer_lookup_trait (const cp_token *token)
built-in trait.  */
 
 static const cp_trait *
-cp_lexer_lookup_trait_expr (const cp_token *token)
+cp_lexer_peek_trait_expr (cp_lexer *lexer)
 {
-  const cp_trait *trait = cp_lexer_lookup_trait (token);
+  const cp_trait *trait = cp_lexer_peek_trait (lexer);
   if (trait && !trait->type)
 return trait;
 
@@ -1217,9 +1231,9 @@ cp_lexer_lookup_trait_expr (const cp_token *token)
built-in trait.  */
 
 static const cp_trait *
-cp_lexer_lookup_trait_type (const cp_token *token)
+cp_lexer_peek_trait_type (cp_lexer *lexer)
 {
-  const cp_trait *trait = cp_lexer_lookup_trait (token);
+  const cp_trait *trait = cp_lexer_peek_trait (lexer);
   if (trait && trait->type)
 return trait;
 
@@ -1233,9 +1247,10 @@ cp_lexer_next_token_is_decl_specifier_keyword (cp_lexer *lexer)
 {
   cp_token *token;
 
-  token = cp_lexer_peek_token (lexer);
-  if (cp_lexer_lookup_trait_type (token))
+  if (cp_lexer_peek_trait_type (lexer))
 return true;
+
+  token = cp_lexer_peek_token (lexer);
   return cp_keyword_starts_decl_specifier_p (token->keyword);
 }
 
@@ -6133,7 +6148,7 @@ cp_parser_primary_expression (cp_parser *parser,
 `::' as the beginning of a qualified-id, or the "operator"
 keyword.  */
 case CPP_NAME:
-  if (const cp_trait* trait = cp_lexer_lookup_trait_expr (token))
+  if (const cp_trait* trait = cp_lexer_peek_trait_expr (parser->lexer))
return cp_parser_trait (parser, trait);
   /* FALLTHRU */
 case CPP_SCOPE:
@@ -20151,7 +20166,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
 }
 
   /* If token is a type-yielding built-in tra

[PATCH v26 02/23] c-family, c++: Look up built-in traits via identifier node

2023-12-06 Thread Ken Matsui
Since RID_MAX will soon reach 255 and all built-in traits are used
approximately once in a C++ translation unit, this patch removes
all RID values for built-in traits and uses the identifier node to
look up the specific trait.  Rather than holding traits as keywords,
we set all trait identifiers as cik_trait, which is a new
cp_identifier_kind.  As cik_reserved_for_udlit was unused and
cp_identifier_kind is 3 bits, we replaced the unused field with the new
cik_trait.  A later patch also checks the token that follows a
built-in trait identifier so that we accept non-function-like uses
of built-in trait identifiers.
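
As a rough model of this scheme, the sketch below tags an identifier
with a kind flag and a small index into a trait table instead of giving
each trait its own keyword value.  Everything in it is hypothetical and
heavily simplified: SKETCH_TRAITS stands in for cp-trait.def, the
Identifier struct for an identifier node, and is_trait/index for
cik_trait and IDENTIFIER_CP_INDEX.

```cpp
namespace sketch {

// Hypothetical three-entry trait list: code, spelling, arity, and
// whether the trait yields a type rather than an expression.
#define SKETCH_TRAITS(X)				\
  X (IS_ARRAY, "__is_array", 1, false)			\
  X (IS_SAME, "__is_same", 2, false)			\
  X (UNDERLYING_TYPE, "__underlying_type", 1, true)

enum TraitKind {
#define X(CODE, NAME, ARITY, TYPE) TK_##CODE,
  SKETCH_TRAITS (X)
#undef X
  TK_COUNT
};

struct Trait {
  const char *name;
  TraitKind kind;
  int arity;
  bool yields_type;
};

// One table entry per trait, generated from the same list.
constexpr Trait traits[] = {
#define X(CODE, NAME, ARITY, TYPE) { NAME, TK_##CODE, ARITY, TYPE },
  SKETCH_TRAITS (X)
#undef X
};

// Stand-in for an identifier node: a kind flag plus a table index.
struct Identifier {
  const char *spelling;
  bool is_trait;
  unsigned index;
};

// Looking up a trait is a flag test and an array index; no keyword
// (RID) value is consumed per trait.
constexpr const Trait *
lookup_trait (const Identifier &id)
{
  return id.is_trait ? &traits[id.index] : nullptr;
}

constexpr Identifier id_is_same = { "__is_same", true, TK_IS_SAME };
constexpr Identifier id_foo = { "foo", false, 0 };

} // namespace sketch

static_assert(sketch::lookup_trait (sketch::id_is_same)->arity == 2, "");
static_assert(sketch::lookup_trait (sketch::id_foo) == nullptr, "");
```

In the patch itself the flag and index are set once at start-up by
init_cp_traits; here they are simply written into the initializers.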

gcc/c-family/ChangeLog:

* c-common.cc (c_common_reswords): Remove all mappings of
built-in traits.
* c-common.h (enum rid): Remove all RID values for built-in
traits.

gcc/cp/ChangeLog:

* cp-objcp-common.cc (names_builtin_p): Remove all RID value
cases for built-in traits.  Check for built-in traits via
the new cik_trait kind.
* cp-tree.h (enum cp_trait_kind): Set its underlying type to
addr_space_t.
(struct cp_trait): New struct to hold trait information.
(cp_traits): New array to hold a mapping to all traits.
(cik_reserved_for_udlit): Rename to ...
(cik_trait): ... this.
(IDENTIFIER_ANY_OP_P): Exclude cik_trait.
(IDENTIFIER_TRAIT_P): New macro to detect cik_trait.
* lex.cc (cp_traits): Define its values, declared in cp-tree.h.
(init_cp_traits): New function to set cik_trait and
IDENTIFIER_CP_INDEX for all built-in trait identifiers.
(cxx_init): Call init_cp_traits function.
* parser.cc (cp_lexer_lookup_trait): New function to look up a
built-in trait by IDENTIFIER_CP_INDEX.
(cp_lexer_lookup_trait_expr): Likewise, look up an
expression-yielding built-in trait.
(cp_lexer_lookup_trait_type): Likewise, look up a type-yielding
built-in trait.
(cp_keyword_starts_decl_specifier_p): Remove all RID value cases
for built-in traits.
(cp_lexer_next_token_is_decl_specifier_keyword): Handle
type-yielding built-in traits.
(cp_parser_primary_expression): Remove all RID value cases for
built-in traits.  Handle expression-yielding built-in traits.
(cp_parser_trait): Handle cp_trait instead of enum rid.
(cp_parser_simple_type_specifier): Remove all RID value cases
for built-in traits.  Handle type-yielding built-in traits.

Co-authored-by: Patrick Palka 
Signed-off-by: Ken Matsui 
---
 gcc/c-family/c-common.cc  |   7 ---
 gcc/c-family/c-common.h   |   5 --
 gcc/cp/cp-objcp-common.cc |   8 +--
 gcc/cp/cp-tree.h  |  32 +---
 gcc/cp/lex.cc |  34 
 gcc/cp/parser.cc  | 105 +++---
 6 files changed, 126 insertions(+), 65 deletions(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index d175054dddb..0f1de44a348 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -560,13 +560,6 @@ const struct c_common_resword c_common_reswords[] =
   { "wchar_t", RID_WCHAR,  D_CXXONLY },
   { "while",   RID_WHILE,  0 },
 
-#define DEFTRAIT(TCC, CODE, NAME, ARITY) \
-  { NAME,  RID_##CODE, D_CXXONLY },
-#include "cp/cp-trait.def"
-#undef DEFTRAIT
-  /* An alias for __is_same.  */
-  { "__is_same_as",RID_IS_SAME,D_CXXONLY },
-
   /* C++ transactional memory.  */
   { "synchronized",RID_SYNCHRONIZED, D_CXX_OBJC | D_TRANSMEM },
   { "atomic_noexcept", RID_ATOMIC_NOEXCEPT, D_CXXONLY | D_TRANSMEM },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index cb9b6f301d8..62d76c87cc0 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -168,11 +168,6 @@ enum rid
   RID_BUILTIN_LAUNDER,
   RID_BUILTIN_BIT_CAST,
 
-#define DEFTRAIT(TCC, CODE, NAME, ARITY) \
-  RID_##CODE,
-#include "cp/cp-trait.def"
-#undef DEFTRAIT
-
   /* C++11 */
   RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT,
 
diff --git a/gcc/cp/cp-objcp-common.cc b/gcc/cp/cp-objcp-common.cc
index 9439c4dc744..ee88df5767b 100644
--- a/gcc/cp/cp-objcp-common.cc
+++ b/gcc/cp/cp-objcp-common.cc
@@ -565,6 +565,10 @@ names_builtin_p (const char *name)
}
 }
 
+  /* Check for built-in traits.  */
+  if (IDENTIFIER_TRAIT_P (id))
+return true;
+
   /* Also detect common reserved C++ words that aren't strictly built-in
  functions.  */
   switch (C_RID_CODE (id))
@@ -578,10 +582,6 @@ names_builtin_p (const char *name)
 case RID_BUILTIN_ASSOC_BARRIER:
 case RID_BUILTIN_BIT_CAST:
 case RID_OFFSETOF:
-#define DEFTRAIT(TCC, CODE, NAME, ARITY) \
-case RID_##CODE:
-#include "cp-trait.def"
-#undef DEFTRAIT
   return true;
 default:
   break;
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index cb89d372b23..cbf280ec454 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree

[PATCH v26 01/23] c++: Sort built-in traits alphabetically

2023-12-06 Thread Ken Matsui
This patch sorts built-in traits alphabetically for better code
readability.

gcc/cp/ChangeLog:

* constraint.cc (diagnose_trait_expr): Sort built-in traits
alphabetically.
* cp-trait.def: Likewise.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.
(finish_trait_type): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Sort built-in traits alphabetically.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc | 68 -
 gcc/cp/cp-trait.def  | 10 +--
 gcc/cp/semantics.cc  | 94 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C | 70 +-
 4 files changed, 121 insertions(+), 121 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index d9972d69725..29aa7bb3df8 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3707,18 +3707,36 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_HAS_TRIVIAL_DESTRUCTOR:
   inform (loc, "  %qT is not trivially destructible", t1);
   break;
+case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS:
+  inform (loc, "  %qT does not have unique object representations", t1);
+  break;
 case CPTK_HAS_VIRTUAL_DESTRUCTOR:
   inform (loc, "  %qT does not have a virtual destructor", t1);
   break;
 case CPTK_IS_ABSTRACT:
   inform (loc, "  %qT is not an abstract class", t1);
   break;
+case CPTK_IS_AGGREGATE:
+  inform (loc, "  %qT is not an aggregate", t1);
+  break;
+case CPTK_IS_ASSIGNABLE:
+  inform (loc, "  %qT is not assignable from %qT", t1, t2);
+  break;
 case CPTK_IS_BASE_OF:
   inform (loc, "  %qT is not a base of %qT", t1, t2);
   break;
 case CPTK_IS_CLASS:
   inform (loc, "  %qT is not a class", t1);
   break;
+case CPTK_IS_CONSTRUCTIBLE:
+  if (!t2)
+inform (loc, "  %qT is not default constructible", t1);
+  else
+inform (loc, "  %qT is not constructible from %qE", t1, t2);
+  break;
+case CPTK_IS_CONVERTIBLE:
+  inform (loc, "  %qT is not convertible from %qE", t2, t1);
+  break;
 case CPTK_IS_EMPTY:
   inform (loc, "  %qT is not an empty class", t1);
   break;
@@ -3734,6 +3752,18 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_LITERAL_TYPE:
   inform (loc, "  %qT is not a literal type", t1);
   break;
+case CPTK_IS_NOTHROW_ASSIGNABLE:
+  inform (loc, "  %qT is not nothrow assignable from %qT", t1, t2);
+  break;
+case CPTK_IS_NOTHROW_CONSTRUCTIBLE:
+  if (!t2)
+   inform (loc, "  %qT is not nothrow default constructible", t1);
+  else
+   inform (loc, "  %qT is not nothrow constructible from %qE", t1, t2);
+  break;
+case CPTK_IS_NOTHROW_CONVERTIBLE:
+ inform (loc, "  %qT is not nothrow convertible from %qE", t2, t1);
+  break;
 case CPTK_IS_POINTER_INTERCONVERTIBLE_BASE_OF:
   inform (loc, "  %qT is not pointer-interconvertible base of %qT",
  t1, t2);
@@ -3753,50 +3783,20 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_TRIVIAL:
   inform (loc, "  %qT is not a trivial type", t1);
   break;
-case CPTK_IS_UNION:
-  inform (loc, "  %qT is not a union", t1);
-  break;
-case CPTK_IS_AGGREGATE:
-  inform (loc, "  %qT is not an aggregate", t1);
-  break;
-case CPTK_IS_TRIVIALLY_COPYABLE:
-  inform (loc, "  %qT is not trivially copyable", t1);
-  break;
-case CPTK_IS_ASSIGNABLE:
-  inform (loc, "  %qT is not assignable from %qT", t1, t2);
-  break;
 case CPTK_IS_TRIVIALLY_ASSIGNABLE:
   inform (loc, "  %qT is not trivially assignable from %qT", t1, t2);
   break;
-case CPTK_IS_NOTHROW_ASSIGNABLE:
-  inform (loc, "  %qT is not nothrow assignable from %qT", t1, t2);
-  break;
-case CPTK_IS_CONSTRUCTIBLE:
-  if (!t2)
-   inform (loc, "  %qT is not default constructible", t1);
-  else
-   inform (loc, "  %qT is not constructible from %qE", t1, t2);
-  break;
 case CPTK_IS_TRIVIALLY_CONSTRUCTIBLE:
   if (!t2)
inform (loc, "  %qT is not trivially default constructible", t1);
   else
inform (loc, "  %qT is not trivially constructible from %qE", t1, t2);
   break;
-case CPTK_IS_NOTHROW_CONSTRUCTIBLE:
-  if (!t2)
-   inform (loc, "  %qT is not nothrow default constructible", t1);
-  else
-   inform (loc, "  %qT is not nothrow constructible from %qE", t1, t2);
-  break;
-case CPTK_HAS_UNIQUE_OBJ_REPRESENTATIONS:
-  inform (loc, "  %qT does not have unique object representations", t1);
-  break;
-case CPTK_IS_CONVERTIBLE:
-  inform (loc, "  %qT is not convertible from %qE", t2, t1);
+case CPTK_IS_TRIVIALLY_COPYABLE:
+  inform (loc, "  %qT is not trivially copyable", t1);
   break;
-case CPTK_IS_NOTHROW_CONVERTIBLE:
-   infor

[PATCH v26 00/23] Optimize type traits compilation performance

2023-12-06 Thread Ken Matsui
This patch series optimizes type traits compilation performance by
implementing built-in type traits and using them in libstdc++.

Changes in v26:

* Rebased on top of trunk.
* Moved is_function_v under is_const_v.
* Isolated the is_const, is_volatile, is_pointer, and
  is_unbounded_array patches, which cause a performance regression,
  from this patch series.

Changes in v25:

* Optimized the __is_pointer implementation in cpp_type_traits.h.
* Fixed a compilation error in cpp_type_traits.h with Clang 16.
* Wrapped commit messages at 75 columns.
* Used & instead of && for the new IDENTIFIER_TRAIT_P macro.
* Made cp_lexer_peek_trait not take a cp_token.
* Fixed indentation error in cp_lexer_peek_trait.

Changes in v24:

* Fixed the handling of the incomplete type error from __is_invocable
  in the test cases so that we can correctly test both with and
  without the built-in.

Changes in v23:

* Improved the comment in cp-tree.h.
* Moved the definition of cp_traits to lex.cc from parser.cc.
* Implemented __is_invocable built-in trait.

Changes in v22:

* Included a missing patch in v21.

Changes in v21:

* Used _GLIBCXX_USE_BUILTIN_TRAIT instead of __has_builtin in
cpp_type_traits.h.
* Added const char* name to struct cp_trait, and loop over cp_traits
in init_cp_traits to get the name.
* Isolated patches for integral-related built-in traits from
this patch series since they are not ready for review yet.
* Implemented __is_object built-in trait.

Changes in v20:

* Used identifier node instead of gperf to look up built-in
traits.

Changes in v19:

* Fixed a typo.
* Rebased on top of trunk.
* Improved clarity of the commit message.

Changes in v18:

* Removed all RID values for built-in traits and used cik_trait
instead.
* Improved to handle the use of non-function-like built-in trait
identifiers.
* Reverted all changes to conflicted identifiers with new built-ins
in the existing code base.

Changes in v17:

* Rebased on top of trunk.
* Improved clarity of the commit message.
* Simplified Make-lang.in.
* Made ridpointers for RID_TRAIT_EXPR and RID_TRAIT_TYPE empty.

Changes in v16:

* Rebased on top of trunk.
* Improved clarity of the commit message.
* Simplified Make-lang.in and gperf struct.
* Supply -k option to gperf to support older versions than 2.8.

Changes in v15:

* Rebased on top of trunk.
* Use gperf to look up traits instead of enum rid.

Changes in v14:

* Added padding calculation to the commit message.

Changes in v13:

* Fixed ambiguous commit message and comment.

Changes in v12:

* Evaluated all paddings affected by the enum rid change.

Changes in v11:

* Merged all patches into one patch series.
* Rebased on top of trunk.
* Unified commit message style.
* Used _GLIBCXX_USE_BUILTIN_TRAIT.

Ken Matsui (23):
  c++: Sort built-in traits alphabetically
  c-family, c++: Look up built-in traits via identifier node
  c++: Accept the use of built-in trait identifiers
  c++: Implement __is_array built-in trait
  libstdc++: Optimize std::is_array compilation performance
  c++: Implement __is_bounded_array built-in trait
  libstdc++: Optimize std::is_bounded_array compilation performance
  c++: Implement __is_scoped_enum built-in trait
  libstdc++: Optimize std::is_scoped_enum compilation performance
  c++: Implement __is_member_pointer built-in trait
  libstdc++: Optimize std::is_member_pointer compilation performance
  c++: Implement __is_member_function_pointer built-in trait
  libstdc++: Optimize std::is_member_function_pointer compilation
performance
  c++: Implement __is_member_object_pointer built-in trait
  libstdc++: Optimize std::is_member_object_pointer compilation
performance
  c++: Implement __is_reference built-in trait
  libstdc++: Optimize std::is_reference compilation performance
  c++: Implement __is_function built-in trait
  libstdc++: Optimize std::is_function compilation performance
  c++: Implement __is_object built-in trait
  libstdc++: Optimize std::is_object compilation performance
  c++: Implement __remove_pointer built-in trait
  libstdc++: Optimize std::remove_pointer compilation performance

 gcc/c-family/c-common.cc  |   7 -
 gcc/c-family/c-common.h   |   5 -
 gcc/cp/constraint.cc  |  95 +++-
 gcc/cp/cp-objcp-common.cc |   8 +-
 gcc/cp/cp-trait.def   |  20 ++-
 gcc/cp/cp-tree.h  |  32 +++--
 gcc/cp/lex.cc |  34 +
 gcc/cp/parser.cc  | 120 ++-

RE: [PATCH v4] libgfortran: Replace mutex with rwlock

2023-12-06 Thread Zhu, Lipeng
> > [CCing Ian as libgcc maintainer]
> >
> > On Wed, 1 Nov 2023 10:14:37 +
> > "Zhu, Lipeng"  wrote:
> >
> > > > >
> > > > > Hi Lipeng,
> > > > >
> > > > > >>> Sure, as your comments, in the patch V6, I added 3 test
> > > > > >>> cases with OpenMP to test different cases in concurrency
> respectively:
> > > > > >>> 1. find and create unit very frequently to stress read lock
> > > > > >>> and write
> > lock.
> > > > > >>> 2. only access the unit which exist in cache to stress read lock.
> > > > > >>> 3. access the same unit in concurrency.
> > > > > >>> For the third test case, it also help to find a bug:  When
> > > > > >>> unit can't be found in cache nor unit list in read phase,
> > > > > >>> then threads will try to acquire write lock to insert the
> > > > > >>> same unit, this will cause duplicate key
> > > > > >> error.
> > > > > >>> To fix this bug, I get the unit from unit list once again
> > > > > >>> before insert in write
> > > > > >> lock.
> > > > > >>> More details you can refer the patch v6.
> > > > > >>>
> > > > > >>
> > > > > >> Could you help to review this update? I really appreciate
> > > > > >> your
> > assistance.
> > > > > >>
> > > > >
> > > > > > Could you help to review this update?  Any concern will be
> > appreciated.
> > > > >
> > > > > Fortran parts are OK (I think I wrote that already), we need
> > > > > somebody for the non-Fortran parts.
> > > > >
> > > > Hi Thomas,
> > > >
> > > > Thanks for your response. Very appreciate for your patience and help.
> > > >
> > > > > Jakub, could you maybe take a look?
> > > > >
> > > > > Best regards
> > > > >
> > > > >   Thomas
> > > >
> > > > Hi Jakub,
> > > >
> > > > Can you help to take a look at the change for libgcc part that
> > > > added several rwlock macros in libgcc/gthr-posix.h?
> > > >
> > >
> > > Hi Jakub,
> > >
> > > Could you help to review this, any comment will be greatly appreciated.
> >
> > Latest version is at
> > https://inbox.sourceware.org/gcc-patches/20230818031818.2161842-1-
> > lipeng@intel.com/
> >
> Thanks Bernhard.
> 
> Hi Ian,
> Could you help to review the changes for libgcc part?
> Very looking forward to your help.
> 
> > >
> > > > Best Regards,
> > > > Lipeng Zhu
> > >

Hi Jakub, 

Could you help review this patch, specifically the changes in libgcc/gthr-posix.h?
As Thomas commented, "Fortran parts are OK", so we need your
comments on the non-Fortran parts.  I would greatly appreciate your help.
Latest version is at 
https://inbox.sourceware.org/gcc-patches/20230818031818.2161842-1-lipeng@intel.com/

Lipeng Zhu,
Best Regards.


[PATCH v26 00/23] Optimize type traits compilation performance

2023-12-06 Thread Ken Matsui
This patch series optimizes type traits compilation performance by
implementing built-in type traits and using them in libstdc++.

Changes in v26:

* Rebased on top of trunk.
* Moved is_function_v under is_const_v.
* Isolated patches for is_const, is_volatile, is_pointer, and
is_unbounded_array, which contain performance regression, from
this patch series since they are not ready for review yet.

Changes in v25:

* Optimized the __is_pointer implementation in cpp_type_traits.h.
* Fix compilation error in cpp_type_traits.h with Clang 16.
* Wrapped commit messages at 75 columns.
* Used & instead of && for the new IDENTIFIER_TRAIT_P macro.
* Made cp_lexer_peek_trait not to take cp_token.
* Fixed indentation error in cp_lexer_peek_trait.

Changes in v24:

* Fixed the way to handle an incomplete type error from __is_invocable
in the test cases so that we can correctly test both the use of
built-in and vice-versa.

Changes in v23:

* Improved the comment in cp-tree.h.
* Moved the definition of cp_traits to lex.cc from parser.cc.
* Implemented __is_invocable built-in trait.

Changes in v22:

* Included a missing patch in v21.

Changes in v21:

* Used _GLIBCXX_USE_BUILTIN_TRAIT instead of __has_builtin in
cpp_type_traits.h.
* Added const char* name to struct cp_trait, and loop over cp_traits
in init_cp_traits to get the name.
* Isolated patches for integral-related built-in traits from
this patch series since they are not ready for review yet.
* Implemented __is_object built-in trait.

Changes in v20:

* Used identifier node instead of gperf to look up built-in
traits.

Changes in v19:

* Fixed a typo.
* Rebased on top of trunk.
* Improved clarity of the commit message.

Changes in v18:

* Removed all RID values for built-in traits and used cik_trait
instead.
* Improved to handle the use of non-function-like built-in trait
identifiers.
* Reverted all changes to conflicted identifiers with new built-ins
in the existing code base.

Changes in v17:

* Rebased on top of trunk.
* Improved clarity of the commit message.
* Simplified Make-lang.in.
* Made ridpointers for RID_TRAIT_EXPR and RID_TRAIT_TYPE empty.

Changes in v16:

* Rebased on top of trunk.
* Improved clarity of the commit message.
* Simplified Make-lang.in and gperf struct.
* Supply -k option to gperf to support older versions than 2.8.

Changes in v15:

* Rebased on top of trunk.
* Use gperf to look up traits instead of enum rid.

Changes in v14:

* Added padding calculation to the commit message.

Changes in v13:

* Fixed ambiguous commit message and comment.

Changes in v12:

* Evaluated all paddings affected by the enum rid change.

Changes in v11:

* Merged all patches into one patch series.
* Rebased on top of trunk.
* Unified commit message style.
* Used _GLIBCXX_USE_BUILTIN_TRAIT.

Ken Matsui (23):
  c++: Sort built-in traits alphabetically
  c-family, c++: Look up built-in traits via identifier node
  c++: Accept the use of built-in trait identifiers
  c++: Implement __is_array built-in trait
  libstdc++: Optimize std::is_array compilation performance
  c++: Implement __is_bounded_array built-in trait
  libstdc++: Optimize std::is_bounded_array compilation performance
  c++: Implement __is_scoped_enum built-in trait
  libstdc++: Optimize std::is_scoped_enum compilation performance
  c++: Implement __is_member_pointer built-in trait
  libstdc++: Optimize std::is_member_pointer compilation performance
  c++: Implement __is_member_function_pointer built-in trait
  libstdc++: Optimize std::is_member_function_pointer compilation
performance
  c++: Implement __is_member_object_pointer built-in trait
  libstdc++: Optimize std::is_member_object_pointer compilation
performance
  c++: Implement __is_reference built-in trait
  libstdc++: Optimize std::is_reference compilation performance
  c++: Implement __is_function built-in trait
  libstdc++: Optimize std::is_function compilation performance
  c++: Implement __is_object built-in trait
  libstdc++: Optimize std::is_object compilation performance
  c++: Implement __remove_pointer built-in trait
  libstdc++: Optimize std::remove_pointer compilation performance

 gcc/c-family/c-common.cc  |   7 -
 gcc/c-family/c-common.h   |   5 -
 gcc/cp/constraint.cc  |  95 +++-
 gcc/cp/cp-objcp-common.cc |   8 +-
 gcc/cp/cp-trait.def   |  20 ++-
 gcc/cp/cp-tree.h  |  32 +++--
 gcc/cp/lex.cc |  34 +
 gcc/cp/parser.cc   

[PATCH] strub: enable conditional support

2023-12-06 Thread Alexandre Oliva
On Dec  6, 2023, Alexandre Oliva  wrote:

> Disabling the runtime bits is easy, once we determine what condition we
> wish to test for.  I suppose testing for target support in the compiler,
> issuing a 'sorry' in case the feature is required, would provide
> something for libgcc configure and testsuite effective-target to test
> for and decide whether to enable runtime support and run the tests.

Instead of doing something equivalent to an implicit -fstrub=disable,
that would quietly compile without stack scrubbing, I thought it would
be safer to be noisy if the feature is used (requested, really) when
support is not available.


Targets that don't expose callee stacks to callers, such as nvptx, as
well as -fsplit-stack compilations, violate fundamental assumptions of
the current strub implementation.  This patch enables targets to
disable strub, and disables it when -fsplit-stack is enabled.

When strub support is disabled, the testsuite will now skip strub
tests, and libgcc will not build the strub runtime components.

Regstrapped on x86_64-linux-gnu.  Also tested with an additional patch
for i386.cc that mirrors the nvptx.cc change, to check that strub gets
disabled without noisy test results.  Ok to install?


for  gcc/ChangeLog

* target.def (have_strub_support_for): New hook.
* doc/tm.texi.in: Document it.
* doc/tm.texi: Rebuild.
* ipa-strub.cc: Include target.h.
(strub_target_support_p): New.
(can_strub_p): Call it.  Test for no flag_split_stack.
(pass_ipa_strub::adjust_at_calls_call): Check for target
support.
* config/nvptx/nvptx.cc (TARGET_HAVE_STRUB_SUPPORT_FOR):
Disable.
* doc/sourcebuild.texi (strub): Document new effective
target.

for  gcc/testsuite/ChangeLog

* gcc.dg/strub-split-stack.c: New.
* gcc.dg/strub-unsupported.c: New.
* gcc.dg/strub-unsupported-2.c: New.
* gcc.dg/strub-unsupported-3.c: New.
* lib/target-supports.exp (check_effective_target_strub): New.
* c-c++-common/strub-O0.c: Require effective target strub.
* c-c++-common/strub-O1.c: Likewise.
* c-c++-common/strub-O2.c: Likewise.
* c-c++-common/strub-O2fni.c: Likewise.
* c-c++-common/strub-O3.c: Likewise.
* c-c++-common/strub-O3fni.c: Likewise.
* c-c++-common/strub-Og.c: Likewise.
* c-c++-common/strub-Os.c: Likewise.
* c-c++-common/strub-all1.c: Likewise.
* c-c++-common/strub-all2.c: Likewise.
* c-c++-common/strub-apply1.c: Likewise.
* c-c++-common/strub-apply2.c: Likewise.
* c-c++-common/strub-apply3.c: Likewise.
* c-c++-common/strub-apply4.c: Likewise.
* c-c++-common/strub-at-calls1.c: Likewise.
* c-c++-common/strub-at-calls2.c: Likewise.
* c-c++-common/strub-defer-O1.c: Likewise.
* c-c++-common/strub-defer-O2.c: Likewise.
* c-c++-common/strub-defer-O3.c: Likewise.
* c-c++-common/strub-defer-Os.c: Likewise.
* c-c++-common/strub-internal1.c: Likewise.
* c-c++-common/strub-internal2.c: Likewise.
* c-c++-common/strub-parms1.c: Likewise.
* c-c++-common/strub-parms2.c: Likewise.
* c-c++-common/strub-parms3.c: Likewise.
* c-c++-common/strub-relaxed1.c: Likewise.
* c-c++-common/strub-relaxed2.c: Likewise.
* c-c++-common/strub-short-O0-exc.c: Likewise.
* c-c++-common/strub-short-O0.c: Likewise.
* c-c++-common/strub-short-O1.c: Likewise.
* c-c++-common/strub-short-O2.c: Likewise.
* c-c++-common/strub-short-O3.c: Likewise.
* c-c++-common/strub-short-Os.c: Likewise.
* c-c++-common/strub-strict1.c: Likewise.
* c-c++-common/strub-strict2.c: Likewise.
* c-c++-common/strub-tail-O1.c: Likewise.
* c-c++-common/strub-tail-O2.c: Likewise.
* c-c++-common/strub-var1.c: Likewise.
* c-c++-common/torture/strub-callable1.c: Likewise.
* c-c++-common/torture/strub-callable2.c: Likewise.
* c-c++-common/torture/strub-const1.c: Likewise.
* c-c++-common/torture/strub-const2.c: Likewise.
* c-c++-common/torture/strub-const3.c: Likewise.
* c-c++-common/torture/strub-const4.c: Likewise.
* c-c++-common/torture/strub-data1.c: Likewise.
* c-c++-common/torture/strub-data2.c: Likewise.
* c-c++-common/torture/strub-data3.c: Likewise.
* c-c++-common/torture/strub-data4.c: Likewise.
* c-c++-common/torture/strub-data5.c: Likewise.
* c-c++-common/torture/strub-indcall1.c: Likewise.
* c-c++-common/torture/strub-indcall2.c: Likewise.
* c-c++-common/torture/strub-indcall3.c: Likewise.
* c-c++-common/torture/strub-inlinable1.c: Likewise.
* c-c++-common/torture/strub-inlinable2.c: Likewise.
* c-c++-common/torture/strub-ptrfn1.c: Likewise.
* c-c++-common/torture/strub-ptrfn2.c: Likewise.
* c-c+

Re: [PATCH v1] LoongArch: Modify the check type of the vector builtin function.

2023-12-06 Thread chenxiaolong
On Tue, 2023-12-05 at 20:44 +0800, Xi Ruoyao wrote:
> On Tue, 2023-12-05 at 17:21 +0800, chenxiaolong wrote:
> > According to your suggestion, the check of the built-in function
> > was modified in the simd_correctness_check.h file, and the types of
> > the actual parameters
> > of the built-in function were inconsistent with those of the formal
> > parameters.
> > The problems with the GCC regression test are as follows:
> > 
> > ...
> > note: expected 'const void *' but argument is of type '__m128i'
> > error: incompatible type for argument 3 of 'ASSERTEQ_64'
> > ...
> > 
> > The reason is that the types used in __m{128i,128,128d} are defined
> > in
> > the vector header file (lsxintrin.h or lasxintrin.h), and their
> > basic
> > types do not match the parameter types corresponding to the
> > functions.
> 
> Ouch.  I forgot that we are passing vectors themselves to
> ASSERTEQ_64,
> not the pointers.
> 
> Now I come up with this:
> 
> #include 
> #include 
> #include 
> 
> static inline void
> dump (const void *_ptr, int size, const char *name)
> {
>   const char *ptr = (const char *)_ptr;
> 
>   printf("%s:", name);
> 
>   for (int i = 0; i < size; i++)
> printf(" %02hhx", ptr[i]);
> 
>   putchar('\n');
> }
> 
> template <typename U, typename V>
> static inline void
> assert_eq (const U &res, const V &ref, int line)
> {
>   static_assert(sizeof (res) == sizeof (ref));
>   if (!memcmp (&res, &ref, sizeof(ref)))
> return;
> 
>   dump (&res, sizeof (res), "res");
>   dump (&ref, sizeof (ref), "ref");
> }
> 
> int main()
> {
>   float x[4] = {};
>   int y[4] = {};
>   assert_eq(x, y, __LINE__);
> }
> 
> This is C++, not C.  But IMO we can port the tests to C++ anyway.
> 
Following your idea, I tried to convert the C code to C++. The problem is
that the LoongArch test cases are written in C, and converting them to C++
raises further problems and is not easy to do completely. So it is best
to keep the tests in C.
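
For reference, the byte-wise comparison idea from the quoted C++ snippet can be sketched as below. This is only an illustration under stated assumptions: the namespace sketch, the bool return value, and the mismatch message are additions for this example and are not part of simd_correctness_check.h.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

namespace sketch {  // hypothetical namespace for this example

// Print the object representation of a value, one byte at a time.
inline void dump(const void *ptr, std::size_t size, const char *name) {
  const unsigned char *bytes = static_cast<const unsigned char *>(ptr);
  std::printf("%s:", name);
  for (std::size_t i = 0; i < size; i++)
    std::printf(" %02x", static_cast<unsigned>(bytes[i]));
  std::putchar('\n');
}

// Compare two objects of possibly different types by their bytes.
// Returns true when they match; otherwise dumps both and returns false.
template <typename U, typename V>
bool assert_eq(const U &res, const V &ref, int line) {
  static_assert(sizeof(U) == sizeof(V), "operands must have the same size");
  if (std::memcmp(&res, &ref, sizeof(ref)) == 0)
    return true;

  std::printf("mismatch at line %d\n", line);
  dump(&res, sizeof(res), "res");
  dump(&ref, sizeof(ref), "ref");
  return false;
}

}  // namespace sketch
```

Because the operands are taken by reference and compared through memcmp, the helper accepts vector types and plain arrays alike, e.g. sketch::assert_eq(x, y, __LINE__) for float x[4] and int y[4].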



Re: [PATCH v25 25/33] libstdc++: Optimize std::is_function compilation performance

2023-12-06 Thread Ken Matsui
On Tue, Oct 24, 2023 at 4:02 AM Jonathan Wakely  wrote:
>
>
>
> On Tue, 24 Oct 2023 at 03:16, Ken Matsui  wrote:
>>
>> This patch optimizes the compilation performance of std::is_function
>> by dispatching to the new __is_function built-in trait.
>>
>> libstdc++-v3/ChangeLog:
>>
>> * include/std/type_traits (is_function): Use __is_function
>> built-in trait.
>> (is_function_v): Likewise. Optimize its implementation.
>> (is_const_v): Move on top of is_function_v as is_function_v now
>> depends on is_const_v.
>
>
> I think I'd prefer to keep is_const_v where it is now, adjacent to 
> is_volatile_v, and move is_function_v after those.
>
> i.e. like this (but with the additional changes to use the new built-in):
>
> --- a/libstdc++-v3/include/std/type_traits
> +++ b/libstdc++-v3/include/std/type_traits
> @@ -3198,8 +3198,8 @@ template <typename _Tp>
>    inline constexpr bool is_union_v = __is_union(_Tp);
>  template <typename _Tp>
>    inline constexpr bool is_class_v = __is_class(_Tp);
> -template <typename _Tp>
> -  inline constexpr bool is_function_v = is_function<_Tp>::value;
> +// is_function_v is defined below, after is_const_v.
> +
>  template <typename _Tp>
>    inline constexpr bool is_reference_v = false;
>  template <typename _Tp>
> @@ -3226,6 +3226,8 @@ template <typename _Tp>
>    inline constexpr bool is_volatile_v = false;
>  template <typename _Tp>
>    inline constexpr bool is_volatile_v<volatile _Tp> = true;
> +template <typename _Tp>
> +  inline constexpr bool is_function_v = is_function<_Tp>::value;
>
>  template <typename _Tp>
>    inline constexpr bool is_trivial_v = __is_trivial(_Tp);
>
> The variable templates are currently defined in the order shown in the 
> standard, in the [meta.type.synop] synopsis, and in the [meta.unary.cat] 
> table. So let's move is_function_v later and add a comment saying why it's 
> not in the expected place.
>
>

Sorry for missing this.  Thank you for your review!  That totally
makes sense, and I will update my patches accordingly.
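
To illustrate why the ordering matters, here is a minimal sketch of an is_function_v that is written in terms of is_const_v and therefore has to be declared after it. It relies on the fact that adding const to a function type (or a reference type) has no effect. The namespace sketch is hypothetical, and this is the non-built-in formulation, not necessarily the exact code in the patch.

```cpp
namespace sketch {  // hypothetical namespace, not libstdc++ itself

template <typename T>
inline constexpr bool is_const_v = false;
template <typename T>
inline constexpr bool is_const_v<const T> = true;

// Must come after is_const_v: "const T" stays non-const only for
// function types and reference types, and references are excluded below.
template <typename T>
inline constexpr bool is_function_v = !is_const_v<const T>;
template <typename T>
inline constexpr bool is_function_v<T &> = false;
template <typename T>
inline constexpr bool is_function_v<T &&> = false;

}  // namespace sketch

static_assert(sketch::is_function_v<void(int)>, "function type");
static_assert(!sketch::is_function_v<void (*)(int)>, "pointer to function");
static_assert(!sketch::is_function_v<int &>, "reference");
static_assert(!sketch::is_function_v<int>, "object type");
```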


Re: Re: [PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.

2023-12-06 Thread Feng Wang
OK, I will do that first.
Thanks.


ESWIN
海宁奕斯伟集成电路设计有限公司
王峰
 
From: juzhe.zh...@rivai.ai
Date: 2023-12-07 10:39
To: wangfeng; gcc-patches
CC: kito.cheng; jeffreyalaw; zhusonghe; panciyan; wangfeng
Subject: Re: Re: [PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.
I think you can first send a single, separate patch that adds unsigned int (*avail) 
(void)
to the current function_group_info.

Then test it with full coverage of the current RVV intrinsics.



juzhe.zh...@rivai.ai
 
From: juzhe.zh...@rivai.ai
Date: 2023-12-07 10:28
To: wangfeng; gcc-patches
CC: kito.cheng; jeffreyalaw; zhusonghe; panciyan; wangfeng
Subject: Re: [PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.
+/* There is no op_type name in vaesz overloaded intrinsic */
+if (!((strcmp (instance.base_name, "vaesz") == 0) && overloaded_p))
+  b.append_name (operand_suffixes[instance.op_info->op]);
You can dedicate a shape for vaesz to avoid using strcmp.

diff --git a/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h 
b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
new file mode 100755
index 000..c360c1d794f
--- /dev/null
+++ b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
@@ -0,0 +1,25 @@
+#ifndef GCC_RISCV_VECTOR_CRYPTO_BUILTINS_AVAIL_H
+#define GCC_RISCV_VECTOR_CRYPTO_BUILTINS_AVAIL_H
+
+#include "insn-codes.h"
+namespace riscv_vector {
+
+/* Declare an availability predicate for built-in functions.  */
+#define AVAIL(NAME, COND)  \
+ static unsigned int   \
+ riscv_vector_crypto_avail_##NAME (void)   \
+ { \
+   return (COND);  \
+ }
+
+AVAIL (zvbb, TARGET_ZVBB)
+AVAIL (zvbc, TARGET_ZVBC)
+AVAIL (zvkb_or_zvbb, TARGET_ZVKB || TARGET_ZVBB)
+AVAIL (zvkg, TARGET_ZVKG)
+AVAIL (zvkned, TARGET_ZVKNED)
+AVAIL (zvknha_or_zvknhb, TARGET_ZVKNHA || TARGET_ZVKNHB)
+AVAIL (zvknhb, TARGET_ZVKNHB)
+AVAIL (zvksed, TARGET_ZVKSED)
+AVAIL (zvksh, TARGET_ZVKSH)
+}
+#endif

Can you rename riscv-vector-crypto-builtins-avail.h to 
riscv-vector-builtins-avail.h?

Make AVAIL not crypto-specific.
Make it general, so that we can use it for future BF16 vector support.

So, instead of creating crypto_function_group_info, I would prefer adding unsigned 
int (*avail) (void); to the current function_group_info.
For current vector intrinsics:
DEF_RVV_FUNCTION (vlsegff, seg_fault_load, full_preds, 
tuple_v_scalar_const_ptr_size_ptr_ops)

change it into:


DEF_RVV_FUNCTION (vlsegff, seg_fault_load, full_preds, 
tuple_v_scalar_const_ptr_size_ptr_ops, true)




juzhe.zh...@rivai.ai
 
From: Feng Wang
Date: 2023-12-07 10:15
To: gcc-patches
CC: kito.cheng; jeffreyalaw; juzhe.zhong; zhusonghe; panciyan; Feng Wang
Subject: [PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.
Patch v2:Optimize function_shape class for crypto_vector.
 
This patch adds the intrinsic functions of crypto vector based on the
intrinsic doc
(https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/eopc/vector-crypto/auto-generated/vector-crypto/intrinsic_funcs.md).
 
Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins-bases.cc (class vandn):
Add new function_base for crypto vector.
(class bitmanip): Ditto.
(class b_reverse):Ditto.
(class vwsll):Ditto.
(class clmul):Ditto.
(class vg_nhab):  Ditto.
(class crypto_vv):Ditto.
(class crypto_vi):Ditto.
(class vaeskf2_vsm3c):Ditto.
(class vsm3me):Ditto.
(BASE): Add BASE declaration for crypto vector.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct crypto_vv_def):
Add new function_shape for crypto vector.
(struct crypto_vi_def): Ditto.
(SHAPE): Add SHAPE declaration of crypto vector.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_CRYPTO_SEW32_OPS):
Add new data struct for crypto vector.
(DEF_RVV_CRYPTO_SEW64_OPS): Ditto.
(DEF_VECTOR_CRYPTO_FUNCTION): New MACRO define of crypto vector.
(registered_function::overloaded_hash): Processing size_t uimm for C overloaded 
func.
(handle_pragma_vector): Add registration for crypto vector.
* config/riscv/riscv-vector-builtins.def (vi): Add vi OP_TYPE.
* config/riscv/riscv-vector-builtins.h (struct crypto_function_group_info):
Add new struct definition for crypto vector.
* config/riscv/t-riscv: Add building dependency files.
* config/riscv/riscv-vector-crypto-builtins-avail.h:
New file to control enable.
* config/riscv/riscv-vector-crypto-builtins-functions.def:
New file. Definition of crypto vector.
* config/riscv/riscv-vector-crypto-builtins-types.def:
New file. New type definition for crypto vector.
---
.../riscv/riscv-vector-builtins-bases.cc  | 259 +-
.../riscv/riscv-vector-builtins-bases.h   |  28 ++
.../riscv/riscv-vector-builtins-shapes.cc |  58 +++-
.../riscv/riscv-vector-builtins-shapes.h  |   3 +
gcc/config/riscv/riscv-vector-b

Re: [PATCH v3] LoongArch: Fix eh_return epilogue for normal returns

2023-12-06 Thread Xi Ruoyao
On Thu, 2023-12-07 at 09:40 +0800, Yang Yujie wrote:
>  static void
>  loongarch_for_each_saved_reg (HOST_WIDE_INT sp_offset,
> -   loongarch_save_restore_fn fn)
> +   loongarch_save_restore_fn fn,
> +   bool skip_eh_data_regs_p)
>  {
>    HOST_WIDE_INT offset;
>  
>    /* Save the link register and s-registers.  */
>    offset = cfun->machine->frame.gp_sp_offset - sp_offset;
>    for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
> -    if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> -  {
> - if (!cfun->machine->reg_is_wrapped_separately[regno])
> -   loongarch_save_restore_reg (word_mode, regno, offset, fn);
> +    {
> +  /* Special care needs to be taken for $r4-$r7 (EH_RETURN_DATA_REGNO)
> +  when returning normally from a function that calls __builtin_eh_return.
> +  In this case, these registers are saved but should not be restored,
> +  or the return value may be clobbered.  */
>  
> - offset -= UNITS_PER_WORD;
> -  }
> +  if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> + {
> +   if (!(cfun->machine->reg_is_wrapped_separately[regno]
> + || (skip_eh_data_regs_p
> + && GP_ARG_FIRST <= regno && regno < GP_ARG_FIRST + 4)))
> +     loongarch_save_restore_reg (word_mode, regno, offset, fn);
> +
> +   offset -= UNITS_PER_WORD;
> + }
> +    }

I don't like this pair of {} for the for statement.  It's not necessary
and it changes the indent level, making the diff hard to review.
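
One way the loop could keep its original shape is sketched below as a self-contained model: the if stays the sole statement of the for, so no extra braces or indentation level are needed. Everything here is an assumption for illustration (FrameModel, Restore, the register constants, and an 8-byte word); it models the control flow only and is not the LoongArch back end code.

```cpp
#include <cstdint>
#include <vector>

namespace sketch {  // hypothetical model of the save/restore loop

constexpr int kGpRegFirst = 0;
constexpr int kGpRegLast = 31;
constexpr int kGpArgFirst = 4;  // $r4, first EH return data register
constexpr long kWordSize = 8;   // assumed word size

struct FrameModel {
  std::uint32_t mask;     // bit N set: register N has a save slot
  std::uint32_t wrapped;  // bit N set: handled separately by shrink-wrapping
  long gp_sp_offset;      // offset of the first save slot
};

struct Restore {
  int regno;
  long offset;
};

inline std::vector<Restore> for_each_saved_reg(const FrameModel &frame,
                                               long sp_offset,
                                               bool skip_eh_data_regs_p) {
  std::vector<Restore> out;
  long offset = frame.gp_sp_offset - sp_offset;

  // $r4-$r7 are saved but must not be restored on a normal return from a
  // function that calls __builtin_eh_return, or the return value may be
  // clobbered.  Their slots still consume stack space.
  for (int regno = kGpRegFirst; regno <= kGpRegLast; regno++)
    if ((frame.mask >> regno) & 1u)
      {
        bool wrapped = (frame.wrapped >> regno) & 1u;
        bool eh_data = (skip_eh_data_regs_p && kGpArgFirst <= regno
                        && regno < kGpArgFirst + 4);
        if (!(wrapped || eh_data))
          out.push_back({regno, offset});

        offset -= kWordSize;
      }
  return out;
}

}  // namespace sketch
```

Note that the offset is decremented for every saved register, including skipped ones, so the remaining registers are still restored from the right slots.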

Otherwise LGTM.  I'm not sure why I didn't notice the eh_return issue
when I learnt shrink wrapping from RISC-V...

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: Re: [PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.

2023-12-06 Thread juzhe.zh...@rivai.ai
I think you can first send a single, separate patch that adds unsigned int (*avail) 
(void)
to the current function_group_info.

Then test it with full coverage of the current RVV intrinsics.



juzhe.zh...@rivai.ai
 
From: juzhe.zh...@rivai.ai
Date: 2023-12-07 10:28
To: wangfeng; gcc-patches
CC: kito.cheng; jeffreyalaw; zhusonghe; panciyan; wangfeng
Subject: Re: [PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.
+/* There is no op_type name in vaesz overloaded intrinsic */
+if (!((strcmp (instance.base_name, "vaesz") == 0) && overloaded_p))
+  b.append_name (operand_suffixes[instance.op_info->op]);
You can dedicate a shape for vaesz to avoid using strcmp.

diff --git a/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h 
b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
new file mode 100755
index 000..c360c1d794f
--- /dev/null
+++ b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
@@ -0,0 +1,25 @@
+#ifndef GCC_RISCV_VECTOR_CRYPTO_BUILTINS_AVAIL_H
+#define GCC_RISCV_VECTOR_CRYPTO_BUILTINS_AVAIL_H
+
+#include "insn-codes.h"
+namespace riscv_vector {
+
+/* Declare an availability predicate for built-in functions.  */
+#define AVAIL(NAME, COND)  \
+ static unsigned int   \
+ riscv_vector_crypto_avail_##NAME (void)   \
+ { \
+   return (COND);  \
+ }
+
+AVAIL (zvbb, TARGET_ZVBB)
+AVAIL (zvbc, TARGET_ZVBC)
+AVAIL (zvkb_or_zvbb, TARGET_ZVKB || TARGET_ZVBB)
+AVAIL (zvkg, TARGET_ZVKG)
+AVAIL (zvkned, TARGET_ZVKNED)
+AVAIL (zvknha_or_zvknhb, TARGET_ZVKNHA || TARGET_ZVKNHB)
+AVAIL (zvknhb, TARGET_ZVKNHB)
+AVAIL (zvksed, TARGET_ZVKSED)
+AVAIL (zvksh, TARGET_ZVKSH)
+}
+#endif

Can you rename riscv-vector-crypto-builtins-avail.h to 
riscv-vector-builtins-avail.h?

Make AVAIL not crypto-specific.
Make it general, so that we can use it for future BF16 vector support.

So, instead of creating crypto_function_group_info, I would prefer adding unsigned 
int (*avail) (void); to the current function_group_info.
For current vector intrinsics:
DEF_RVV_FUNCTION (vlsegff, seg_fault_load, full_preds, 
tuple_v_scalar_const_ptr_size_ptr_ops)

change it into:


DEF_RVV_FUNCTION (vlsegff, seg_fault_load, full_preds, 
tuple_v_scalar_const_ptr_size_ptr_ops, true)
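
The suggestion can be pictured with the following standalone sketch, in which every function group carries an availability predicate. All names are hypothetical stand-ins (SKETCH_AVAIL, the target_* flags, the reduced function_group_info, and the group table); the real structure in riscv-vector-builtins.h has more fields, and the real predicates would test TARGET_* macros.

```cpp
namespace sketch {  // hypothetical model, not the GCC sources

// Stand-ins for target feature tests such as TARGET_ZVBB.
static bool target_vector = true;
static bool target_zvbb = false;
static bool target_zvkb = false;

// Declare an availability predicate, mirroring the AVAIL macro above
// but without a crypto-specific prefix.
#define SKETCH_AVAIL(NAME, COND) \
  static unsigned int sketch_avail_##NAME() { return (COND); }

SKETCH_AVAIL(always, target_vector)
SKETCH_AVAIL(zvbb, target_zvbb)
SKETCH_AVAIL(zvkb_or_zvbb, target_zvkb || target_zvbb)

// Reduced function group: just a name and its predicate.
struct function_group_info {
  const char *base_name;
  unsigned int (*avail)();
};

static const function_group_info groups[] = {
  {"vlsegff", sketch_avail_always},
  {"vandn", sketch_avail_zvkb_or_zvbb},
  {"vbrev", sketch_avail_zvbb},
};

// Count the groups that would be registered for the current target.
inline int count_available() {
  int n = 0;
  for (const function_group_info &group : groups)
    if (group.avail())
      ++n;
  return n;
}

}  // namespace sketch
```

With a single predicate field, existing intrinsics and future extensions can share one table, and registration only needs to call group.avail() before adding a function.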




juzhe.zh...@rivai.ai
 
From: Feng Wang
Date: 2023-12-07 10:15
To: gcc-patches
CC: kito.cheng; jeffreyalaw; juzhe.zhong; zhusonghe; panciyan; Feng Wang
Subject: [PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.
Patch v2:Optimize function_shape class for crypto_vector.
 
This patch adds the intrinsic functions of crypto vector based on the
intrinsic doc
(https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/eopc/vector-crypto/auto-generated/vector-crypto/intrinsic_funcs.md).
 
Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins-bases.cc (class vandn):
Add new function_base for crypto vector.
(class bitmanip): Ditto.
(class b_reverse):Ditto.
(class vwsll):Ditto.
(class clmul):Ditto.
(class vg_nhab):  Ditto.
(class crypto_vv):Ditto.
(class crypto_vi):Ditto.
(class vaeskf2_vsm3c):Ditto.
(class vsm3me):Ditto.
(BASE): Add BASE declaration for crypto vector.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct crypto_vv_def):
Add new function_shape for crypto vector.
(struct crypto_vi_def): Ditto.
(SHAPE): Add SHAPE declaration of crypto vector.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_CRYPTO_SEW32_OPS):
Add new data struct for crypto vector.
(DEF_RVV_CRYPTO_SEW64_OPS): Ditto.
(DEF_VECTOR_CRYPTO_FUNCTION): New MACRO define of crypto vector.
(registered_function::overloaded_hash): Processing size_t uimm for C overloaded 
func.
(handle_pragma_vector): Add registration for crypto vector.
* config/riscv/riscv-vector-builtins.def (vi): Add vi OP_TYPE.
* config/riscv/riscv-vector-builtins.h (struct crypto_function_group_info):
Add new struct definition for crypto vector.
* config/riscv/t-riscv: Add building dependency files.
* config/riscv/riscv-vector-crypto-builtins-avail.h:
New file to control enable.
* config/riscv/riscv-vector-crypto-builtins-functions.def:
New file. Definition of crypto vector.
* config/riscv/riscv-vector-crypto-builtins-types.def:
New file. New type definition for crypto vector.
---
.../riscv/riscv-vector-builtins-bases.cc  | 259 +-
.../riscv/riscv-vector-builtins-bases.h   |  28 ++
.../riscv/riscv-vector-builtins-shapes.cc |  58 +++-
.../riscv/riscv-vector-builtins-shapes.h  |   3 +
gcc/config/riscv/riscv-vector-builtins.cc | 152 +-
gcc/config/riscv/riscv-vector-builtins.def|   1 +
gcc/config/riscv/riscv-vector-builtins.h  |   8 +
.../riscv-vector-crypto-builtins-avail.h  |  25 ++
...riscv-vector-crypto-builtins-functions.def |  78 ++
.../riscv-vector-cr

Re: [PATCH 3/4][v2] RISC-V: Add crypto machine descriptions

2023-12-06 Thread juzhe.zh...@rivai.ai
+(define_insn "@pred_vandn"
+  [(set (match_operand:VI 0 "register_operand"   "=vr,vr")
+(if_then_else:VI
+  (unspec:
+[(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1")
+ (match_operand 5 "vector_length_operand""rK,rK")
+ (match_operand 6 "const_int_operand""i, i")
+ (match_operand 7 "const_int_operand""i, i")
+ (match_operand 8 "const_int_operand""i, i")
+ (reg:SI VL_REGNUM)
+ (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+  (and:VI
+(match_operand:VI 3 "register_operand"  "vr,vr")
+(not:VI (match_operand:VI 4 "register_operand"  "vr,vr")))
+  (match_operand:VI 2 "vector_merge_operand" "vu, 0")))]
+  "TARGET_ZVBB || TARGET_ZVKB"
+  "vandn.vv\t%0,%3,%4%p1"
+  [(set_attr "type" "vandn")
+   (set_attr "mode" "")])

Does vandn allow vandn v0,v2,v3,v0.t?
I suspect it is not allowed.

So change the constraint as follows:

=vd,vr,vd,vr

"vm,Wc1,vm,Wc1"


+(define_insn "@pred_vwsll_scalar"
+  [(set (match_operand:VWEXTI 0 "register_operand"   "=&vr")
+(if_then_else:VWEXTI
+  (unspec:
+[(match_operand: 1 "vector_mask_operand" "vmWc1")
+ (match_operand 5 "vector_length_operand""   rK")
+ (match_operand 6 "const_int_operand""   i")
+ (match_operand 7 "const_int_operand""   i")
+ (match_operand 8 "const_int_operand""   i")
+ (reg:SI VL_REGNUM)
+ (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+  (ashift:VWEXTI
+(zero_extend:VWEXTI
+ (match_operand: 3 "register_operand"  "vr"))
+(match_operand: 4 "pmode_reg_or_uimm5_operand" "rK"))
+  (match_operand:VWEXTI 2 "vector_merge_operand" "0vu")))]
+  "TARGET_ZVBB"
+  "vwsll.v%o4\t%0,%3,%4%p1"
+  [(set_attr "type" "vwsll")
+   (set_attr "mode" "")])

Change constraint as I mentioned in the PR. That PR has been merged.



juzhe.zh...@rivai.ai
 
From: Feng Wang
Date: 2023-12-07 10:15
To: gcc-patches
CC: kito.cheng; jeffreyalaw; juzhe.zhong; zhusonghe; panciyan; Feng Wang
Subject: [PATCH 3/4][v2] RISC-V: Add crypto machine descriptions
Patch v2: Add crypto vector insns to the RATIO attr and use vr as the
destination register.
 
This patch adds the crypto machine descriptions (vector-crypto.md) and
some new iterators which are used by the crypto vector extension.
 
Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
 
gcc/ChangeLog:
 
* config/riscv/iterators.md: Add rotate insn name.
* config/riscv/riscv.md: Add new insns name for crypto vector.
* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
* config/riscv/vector.md: Add the corresponding attr for crypto vector.
* config/riscv/vector-crypto.md: New file.  The machine descriptions for
crypto vector.
---
gcc/config/riscv/iterators.md|   4 +-
gcc/config/riscv/riscv.md|  33 +-
gcc/config/riscv/vector-crypto.md| 500 +++
gcc/config/riscv/vector-iterators.md |  41 +++
gcc/config/riscv/vector.md   |  55 ++-
5 files changed, 612 insertions(+), 21 deletions(-)
create mode 100755 gcc/config/riscv/vector-crypto.md
 
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..f332fba7031 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -304,7 +304,9 @@
(umax "maxu")
(clz "clz")
(ctz "ctz")
- (popcount "cpop")])
+ (popcount "cpop")
+ (rotate "rol")
+ (rotatert "ror")])
;; ---
;; Int Iterators.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 935eeb7fd8e..a887f3cd412 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -428,6 +428,34 @@
;; vcompress    vector compress instruction
;; vmov         whole vector register move
;; vector       unknown vector instruction
+;; 17. Crypto Vector instructions
+;; vandn        crypto vector bitwise and-not instructions
+;; vbrev        crypto vector reverse bits in elements instructions
+;; vbrev8       crypto vector reverse bits in bytes instructions
+;; vrev8        crypto vector reverse bytes instructions
+;; vclz         crypto vector count leading Zeros instructions
+;; vctz         crypto vector count trailing Zeros instructions
+;; vrol         crypto vector rotate left instructions
+;; vror         crypto vector rotate right instructions
+;; vwsll        crypto vector widening shift left logical instructions
+;; vclmul       crypto vector carry-less multiply - return low half instructions
+;; vclmulh      crypto vector carry-less multiply - return high half instructions
+;; vghsh        crypto vector add-multiply over GHASH Galois-Field instructions
+;; vgmul        crypto vector multiply over GHASH Galois-Field instructions
+;; vaesef       crypto vector AES final-round encryption instructions
+;; vaes

Re: [PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.

2023-12-06 Thread juzhe.zh...@rivai.ai
+/* There is no op_type name in vaesz overloaded intrinsic */
+if (!((strcmp (instance.base_name, "vaesz") == 0) && overloaded_p))
+  b.append_name (operand_suffixes[instance.op_info->op]);
You can dedicate a shape to vaesz to avoid using strcmp.
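
A rough C++ sketch of that idea follows, assuming a much simplified stand-in for the function_shape machinery. The types shape_base, crypto_vv_shape and vaesz_shape, the hook overloaded_has_op_suffix and the "__riscv_" name layout are all hypothetical and only illustrate how a virtual hook can replace the string comparison.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a function shape.
struct shape_base
{
  virtual ~shape_base () = default;

  // Whether the overloaded intrinsic name carries the operand-type suffix.
  virtual bool overloaded_has_op_suffix () const { return true; }

  // Build an (illustrative) intrinsic name from its parts.
  std::string build_name (const std::string &base,
                          const std::string &op_suffix,
                          bool overloaded_p) const
  {
    std::string name = "__riscv_" + base;
    // No strcmp on the base name: the shape itself decides.
    if (!overloaded_p || overloaded_has_op_suffix ())
      name += op_suffix;
    return name;
  }
};

// Ordinary crypto shape: keeps the suffix in both forms.
struct crypto_vv_shape : shape_base {};

// Dedicated shape for vaesz: the overloaded form drops the suffix.
struct vaesz_shape : shape_base
{
  bool overloaded_has_op_suffix () const override { return false; }
};
```

The special case then lives in one small class instead of in a check inside the shared name builder.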

diff --git a/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h 
b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
new file mode 100755
index 000..c360c1d794f
--- /dev/null
+++ b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
@@ -0,0 +1,25 @@
+#ifndef GCC_RISCV_VECTOR_CRYPTO_BUILTINS_AVAIL_H
+#define GCC_RISCV_VECTOR_CRYPTO_BUILTINS_AVAIL_H
+
+#include "insn-codes.h"
+namespace riscv_vector {
+
+/* Declare an availability predicate for built-in functions.  */
+#define AVAIL(NAME, COND)  \
+ static unsigned int   \
+ riscv_vector_crypto_avail_##NAME (void)   \
+ { \
+   return (COND);  \
+ }
+
+AVAIL (zvbb, TARGET_ZVBB)
+AVAIL (zvbc, TARGET_ZVBC)
+AVAIL (zvkb_or_zvbb, TARGET_ZVKB || TARGET_ZVBB)
+AVAIL (zvkg, TARGET_ZVKG)
+AVAIL (zvkned, TARGET_ZVKNED)
+AVAIL (zvknha_or_zvknhb, TARGET_ZVKNHA || TARGET_ZVKNHB)
+AVAIL (zvknhb, TARGET_ZVKNHB)
+AVAIL (zvksed, TARGET_ZVKSED)
+AVAIL (zvksh, TARGET_ZVKSH)
+}
+#endif

Can you rename riscv-vector-crypto-builtins-avail.h to
riscv-vector-builtins-avail.h?

Make AVAIL general rather than crypto-specific, so that we can also use it
for future BF16 vector support.

So, instead of creating crypto_function_group_info, I would prefer adding
unsigned int (*avail) (void); to the current function_group_info.
For the current vector intrinsics, e.g.:
DEF_RVV_FUNCTION (vlsegff, seg_fault_load, full_preds, 
tuple_v_scalar_const_ptr_size_ptr_ops)

change it into:


DEF_RVV_FUNCTION (vlsegff, seg_fault_load, full_preds, 
tuple_v_scalar_const_ptr_size_ptr_ops, true)
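
A minimal C++ sketch of the function-pointer variant of this suggestion, under stated assumptions: target_vector and target_zvbb are plain variables standing in for the real TARGET_* macros, and function_group_info_sketch, groups and count_available are hypothetical names, not the GCC definitions.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for target flags; in GCC these would be TARGET_* macros.
static bool target_vector = true;
static bool target_zvbb = false;

// Declare a general (not crypto-specific) availability predicate.
#define AVAIL(NAME, COND)                          \
  static unsigned int                              \
  riscv_vector_avail_##NAME (void)                 \
  {                                                \
    return (COND);                                 \
  }

AVAIL (always, target_vector)
AVAIL (zvbb, target_zvbb)

// Hypothetical group record carrying its own predicate.
struct function_group_info_sketch
{
  const char *base_name;
  unsigned int (*avail) (void);
};

static const function_group_info_sketch groups[] = {
  {"vlsegff", riscv_vector_avail_always},
  {"vandn", riscv_vector_avail_zvbb},
};

// Count the groups that would be registered for the current target.
static std::size_t count_available ()
{
  std::size_t n = 0;
  for (const auto &group : groups)
    if (group.avail ())
      n++;
  return n;
}
```

Registration code can then loop over a single table and skip groups whose predicate returns zero, which is what would let later extensions reuse the same mechanism.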




juzhe.zh...@rivai.ai
 
From: Feng Wang
Date: 2023-12-07 10:15
To: gcc-patches
CC: kito.cheng; jeffreyalaw; juzhe.zhong; zhusonghe; panciyan; Feng Wang
Subject: [PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.
Patch v2: Optimize function_shape class for crypto_vector.
 
This patch adds the intrinsic functions of crypto vector based on the
intrinsic doc (https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/eopc/vector-crypto/auto-generated/vector-crypto/intrinsic_funcs.md).
 
Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-builtins-bases.cc (class vandn):
Add new function_base for crypto vector.
(class bitmanip): Ditto.
(class b_reverse):Ditto.
(class vwsll):Ditto.
(class clmul):Ditto.
(class vg_nhab):  Ditto.
(class crypto_vv):Ditto.
(class crypto_vi):Ditto.
(class vaeskf2_vsm3c):Ditto.
(class vsm3me):Ditto.
(BASE): Add BASE declaration for crypto vector.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct crypto_vv_def):
Add new function_shape for crypto vector.
(struct crypto_vi_def): Ditto.
(SHAPE): Add SHAPE declaration of crypto vector.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_CRYPTO_SEW32_OPS):
Add new data struct for crypto vector.
(DEF_RVV_CRYPTO_SEW64_OPS): Ditto.
(DEF_VECTOR_CRYPTO_FUNCTION): New MACRO define of crypto vector.
(registered_function::overloaded_hash): Processing size_t uimm for C overloaded 
func.
(handle_pragma_vector): Add registration for crypto vector.
* config/riscv/riscv-vector-builtins.def (vi): Add vi OP_TYPE.
* config/riscv/riscv-vector-builtins.h (struct crypto_function_group_info):
Add new struct definition for crypto vector.
* config/riscv/t-riscv: Add building dependency files.
* config/riscv/riscv-vector-crypto-builtins-avail.h:
New file to control enable.
* config/riscv/riscv-vector-crypto-builtins-functions.def:
New file. Definition of crypto vector.
* config/riscv/riscv-vector-crypto-builtins-types.def:
New file. New type definition for crypto vector.
---
.../riscv/riscv-vector-builtins-bases.cc  | 259 +-
.../riscv/riscv-vector-builtins-bases.h   |  28 ++
.../riscv/riscv-vector-builtins-shapes.cc |  58 +++-
.../riscv/riscv-vector-builtins-shapes.h  |   3 +
gcc/config/riscv/riscv-vector-builtins.cc | 152 +-
gcc/config/riscv/riscv-vector-builtins.def|   1 +
gcc/config/riscv/riscv-vector-builtins.h  |   8 +
.../riscv-vector-crypto-builtins-avail.h  |  25 ++
...riscv-vector-crypto-builtins-functions.def |  78 ++
.../riscv-vector-crypto-builtins-types.def|  21 ++
gcc/config/riscv/t-riscv  |   2 +
11 files changed, 632 insertions(+), 3 deletions(-)
create mode 100755 gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
create mode 100755 gcc/config/riscv/riscv-vector-crypto-builtins-functions.def
create mode 100755 gcc/config/riscv/riscv-vector-crypto-builtins-types.def
 
diff --git a/gcc/config/riscv/riscv-ve

[PATCH 2/4][v2] RISC-V: Add crypto vector builtin function.

2023-12-06 Thread Feng Wang
Patch v2: Optimize function_shape class for crypto_vector.

This patch adds the intrinsic functions of crypto vector based on the
intrinsic doc (https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/eopc/vector-crypto/auto-generated/vector-crypto/intrinsic_funcs.md).

Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc (class vandn):
Add new function_base for crypto vector.
(class bitmanip): Ditto.
(class b_reverse):Ditto.
(class vwsll):Ditto.
(class clmul):Ditto.
(class vg_nhab):  Ditto.
(class crypto_vv):Ditto.
(class crypto_vi):Ditto.
(class vaeskf2_vsm3c):Ditto.
(class vsm3me):Ditto.
(BASE): Add BASE declaration for crypto vector.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct crypto_vv_def):
Add new function_shape for crypto vector.
(struct crypto_vi_def): Ditto.
(SHAPE): Add SHAPE declaration of crypto vector.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_CRYPTO_SEW32_OPS):
Add new data struct for crypto vector.
(DEF_RVV_CRYPTO_SEW64_OPS): Ditto.
(DEF_VECTOR_CRYPTO_FUNCTION): New MACRO define of crypto vector.
(registered_function::overloaded_hash): Processing size_t uimm for C 
overloaded func.
(handle_pragma_vector): Add registration for crypto vector.
* config/riscv/riscv-vector-builtins.def (vi): Add vi OP_TYPE.
* config/riscv/riscv-vector-builtins.h (struct 
crypto_function_group_info):
Add new struct definition for crypto vector.
* config/riscv/t-riscv: Add building dependency files.
* config/riscv/riscv-vector-crypto-builtins-avail.h:
New file to control enable.
* config/riscv/riscv-vector-crypto-builtins-functions.def:
New file. Definition of crypto vector.
* config/riscv/riscv-vector-crypto-builtins-types.def:
New file. New type definition for crypto vector.
---
 .../riscv/riscv-vector-builtins-bases.cc  | 259 +-
 .../riscv/riscv-vector-builtins-bases.h   |  28 ++
 .../riscv/riscv-vector-builtins-shapes.cc |  58 +++-
 .../riscv/riscv-vector-builtins-shapes.h  |   3 +
 gcc/config/riscv/riscv-vector-builtins.cc | 152 +-
 gcc/config/riscv/riscv-vector-builtins.def|   1 +
 gcc/config/riscv/riscv-vector-builtins.h  |   8 +
 .../riscv-vector-crypto-builtins-avail.h  |  25 ++
 ...riscv-vector-crypto-builtins-functions.def |  78 ++
 .../riscv-vector-crypto-builtins-types.def|  21 ++
 gcc/config/riscv/t-riscv  |   2 +
 11 files changed, 632 insertions(+), 3 deletions(-)
 create mode 100755 gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
 create mode 100755 gcc/config/riscv/riscv-vector-crypto-builtins-functions.def
 create mode 100755 gcc/config/riscv/riscv-vector-crypto-builtins-types.def

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index d70468542ee..6d52230e9ba 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -2127,6 +2127,207 @@ public:
   }
 };
 
+/* Below implements are vector crypto */
+/* Implements vandn.[vv,vx] */
+class vandn : public function_base
+{
+public:
+  rtx expand (function_expander &e) const override
+  {
+switch (e.op_info->op)
+  {
+  case OP_TYPE_vv:
+return e.use_exact_insn (code_for_pred_vandn (e.vector_mode ()));
+  case OP_TYPE_vx:
+return e.use_exact_insn (code_for_pred_vandn_scalar (e.vector_mode 
()));
+  default:
+gcc_unreachable ();
+  }
+  }
+};
+
+/* Implements vrol/vror/clz/ctz.  */
+template
+class bitmanip : public function_base
+{
+public:
+  bool apply_tail_policy_p () const override
+  {
+return (CODE == CLZ || CODE == CTZ) ? false : true;
+  }
+  bool apply_mask_policy_p () const override
+  {
+return (CODE == CLZ || CODE == CTZ) ? false : true;
+  }
+  bool has_merge_operand_p () const override
+  {
+return (CODE == CLZ || CODE == CTZ) ? false : true;
+  }
+  
+  rtx expand (function_expander &e) const override
+  {
+switch (e.op_info->op)
+{
+  case OP_TYPE_v:
+  case OP_TYPE_vv:
+return e.use_exact_insn (code_for_pred_v (CODE, e.vector_mode ()));
+  case OP_TYPE_vx:
+return e.use_exact_insn (code_for_pred_v_scalar (CODE, e.vector_mode 
()));
+  default:
+gcc_unreachable ();
+}
+  }
+};
+
+/* Implements vbrev/vbrev8/vrev8.  */
+template
+class b_reverse : public function_base
+{
+public:
+  rtx expand (fu

[PATCH 3/4][v2] RISC-V: Add crypto machine descriptions

2023-12-06 Thread Feng Wang
Patch v2: Add crypto vector insns to the RATIO attr and use vr as the
destination register.

This patch adds the crypto machine descriptions (vector-crypto.md) and
some new iterators which are used by the crypto vector extension.

Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 

gcc/ChangeLog:

* config/riscv/iterators.md: Add rotate insn name.
* config/riscv/riscv.md: Add new insns name for crypto vector.
* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
* config/riscv/vector.md: Add the corresponding attr for crypto vector.
* config/riscv/vector-crypto.md: New file.  The machine descriptions for
crypto vector.
---
 gcc/config/riscv/iterators.md|   4 +-
 gcc/config/riscv/riscv.md|  33 +-
 gcc/config/riscv/vector-crypto.md| 500 +++
 gcc/config/riscv/vector-iterators.md |  41 +++
 gcc/config/riscv/vector.md   |  55 ++-
 5 files changed, 612 insertions(+), 21 deletions(-)
 create mode 100755 gcc/config/riscv/vector-crypto.md

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..f332fba7031 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -304,7 +304,9 @@
 (umax "maxu")
 (clz "clz")
 (ctz "ctz")
-(popcount "cpop")])
+(popcount "cpop")
+(rotate "rol")
+(rotatert "ror")])
 
 ;; ---
 ;; Int Iterators.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 935eeb7fd8e..a887f3cd412 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -428,6 +428,34 @@
 ;; vcompress    vector compress instruction
 ;; vmov         whole vector register move
 ;; vector       unknown vector instruction
+;; 17. Crypto Vector instructions
+;; vandn        crypto vector bitwise and-not instructions
+;; vbrev        crypto vector reverse bits in elements instructions
+;; vbrev8       crypto vector reverse bits in bytes instructions
+;; vrev8        crypto vector reverse bytes instructions
+;; vclz         crypto vector count leading Zeros instructions
+;; vctz         crypto vector count trailing Zeros instructions
+;; vrol         crypto vector rotate left instructions
+;; vror         crypto vector rotate right instructions
+;; vwsll        crypto vector widening shift left logical instructions
+;; vclmul       crypto vector carry-less multiply - return low half instructions
+;; vclmulh      crypto vector carry-less multiply - return high half instructions
+;; vghsh        crypto vector add-multiply over GHASH Galois-Field instructions
+;; vgmul        crypto vector multiply over GHASH Galois-Field instructions
+;; vaesef       crypto vector AES final-round encryption instructions
+;; vaesem       crypto vector AES middle-round encryption instructions
+;; vaesdf       crypto vector AES final-round decryption instructions
+;; vaesdm       crypto vector AES middle-round decryption instructions
+;; vaeskf1      crypto vector AES-128 Forward KeySchedule generation instructions
+;; vaeskf2      crypto vector AES-256 Forward KeySchedule generation instructions
+;; vaesz        crypto vector AES round zero encryption/decryption instructions
+;; vsha2ms      crypto vector SHA-2 message schedule instructions
+;; vsha2ch      crypto vector SHA-2 two rounds of compression instructions
+;; vsha2cl      crypto vector SHA-2 two rounds of compression instructions
+;; vsm4k        crypto vector SM4 KeyExpansion instructions
+;; vsm4r        crypto vector SM4 Rounds instructions
+;; vsm3me       crypto vector SM3 Message Expansion instructions
+;; vsm3c        crypto vector SM3 Compression instructions
 (define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -447,7 +475,9 @@
vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,
vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,
-   vgather,vcompress,vmov,vector"
+   
vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vcpop,vrol,vror,vwsll,
+   
vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz,
+   vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r,vsm3me,vsm3c"
   (cond [(eq_attr "got" "load") (const_string "load")
 
 ;; If a doubleword move uses these expensive instructions,
@@ -3747,6 +3777,7 @@
 (include "thead.md")
 (include "generic-ooo.md")
 (include "vector.md")
+(include "vector-crypto.md")
 (include "zicond.md")
 (include "zc.md")
 (include "corev.md")
diff --git a/gcc/config/riscv/vector-crypto.md 
b/gcc/config/riscv/vector-crypto.md
new file mode 100755
index 0

[PATCH 1/4][v2] RISC-V:Add crypto vector implied ISA info.

2023-12-06 Thread Feng Wang
Patch v2: Change the implied ISA info to use the minimum set and add the
dependency info to the Python script.

Because the crypto vector extensions depend on the Vector extension, the
minimum required vector base ("zve32x" or "zve64x", rather than the full
"v") is added to the implied ISA info of each crypto vector extension.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Modify implied ISA info.
* config/riscv/arch-canonicalize: Add crypto vector implied info.
---
 gcc/common/config/riscv/riscv-common.cc |  9 +
 gcc/config/riscv/arch-canonicalize  | 21 +++--
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 6c210412515..a7aa3435a8a 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -120,6 +120,15 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"zvksc", "zvbc"},
   {"zvksg", "zvks"},
   {"zvksg", "zvkg"},
+  {"zvbb",  "zvkb"},
+  {"zvbc",   "zve64x"},
+  {"zvkb",   "zve32x"},
+  {"zvkg",   "zve32x"},
+  {"zvkned", "zve32x"},
+  {"zvknha", "zve32x"},
+  {"zvknhb", "zve64x"},
+  {"zvksed", "zve32x"},
+  {"zvksh",  "zve32x"},
 
   {"zfh", "zfhmin"},
   {"zfhmin", "f"},
diff --git a/gcc/config/riscv/arch-canonicalize 
b/gcc/config/riscv/arch-canonicalize
index ea2f67a0944..a8f47a1752b 100755
--- a/gcc/config/riscv/arch-canonicalize
+++ b/gcc/config/riscv/arch-canonicalize
@@ -69,12 +69,21 @@ IMPLIED_EXT = {
   "zvl32768b" : ["zvl16384b"],
   "zvl65536b" : ["zvl32768b"],
 
-  "zvkn" : ["zvkned", "zvknhb", "zvbb", "zvkt"],
-  "zvknc" : ["zvkn", "zvbc"],
-  "zvkng" : ["zvkn", "zvkg"],
-  "zvks" : ["zvksed", "zvksh", "zvbb", "zvkt"],
-  "zvksc" : ["zvks", "zvbc"],
-  "zvksg" : ["zvks", "zvkg"],
+  "zvkn"   : ["zvkned", "zvknhb", "zvkb", "zvkt"],
+  "zvknc"  : ["zvkn", "zvbc"],
+  "zvkng"  : ["zvkn", "zvkg"],
+  "zvks"   : ["zvksed", "zvksh", "zvkb", "zvkt"],
+  "zvksc"  : ["zvks", "zvbc"],
+  "zvksg"  : ["zvks", "zvkg"],
+  "zvbb"   : ["zvkb"],
+  "zvbc"   : ["zve64x"],
+  "zvkb"   : ["zve32x"],
+  "zvkg"   : ["zve32x"],
+  "zvkned" : ["zve32x"],
+  "zvknha" : ["zve32x"],
+  "zvknhb" : ["zve64x"],
+  "zvksed" : ["zve32x"],
+  "zvksh"  : ["zve32x"],
 }
 
 def arch_canonicalize(arch, isa_spec):
-- 
2.17.1



Re: Re: [PATCH 1/4] RISC-V: Add crypto vector implied ISA info.

2023-12-06 Thread Feng Wang
2023-12-06 11:33 Tsukasa OI  wrote:
>On 2023/12/06 11:45, Feng Wang wrote:
>> Because the crypto vector extension depends on the Vector extension,
>> the "v" info is added into the implied ISA info of the corresponding
>> crypto vector extension.
>
>Hi Feng,
>
>It's true that vector crypto extensions are based on the vector
>extension but it *does not* mean that it requires the full 'V'
>extension.  Vector crypto extensions also consider embedded
>processors where VLEN < 128.
>
>Quoting the documentation:
>
>> The Zvknhb and Zvbc Vector Crypto Extensions --and accordingly the composite extensions Zvkn
>> and Zvks-- require a Zve64x base, or application ("V") base Vector Extension.
>>
>> All of the other Vector Crypto Extensions can be built on any embedded (Zve*) or application ("V")
>> base Vector Extension.
>
>So, the correct dependencies to add are as follows:
>
>> +  {"zvbb",  "zvkb"},
>> +  {"zvbc",   "zve64x"},
>> +  {"zvkb",   "zve32x"},
>> +  {"zvkg",   "zve32x"},
>> +  {"zvkned", "zve32x"},
>> +  {"zvknha", "zve32x"},
>> +  {"zvknhb", "zve64x"},
>> +  {"zvksed", "zve32x"},
>> +  {"zvksh",  "zve32x"},
>
>Note that 'V' indirectly depends on both 'Zve32x' and 'Zve64x' so this
>would be fine to represent the "or application ('V')" part quoted above.
>
>Also, consider adding those dependencies to the Python script
>gcc/config/riscv/arch-canonicalize.
>
>Thanks,
>Tsukasa

I modified this part. Thank you for your correction.

>>
>> gcc/ChangeLog:
>>
>>  * common/config/riscv/riscv-common.cc: Add "v" into implied ISA info.
>> ---
>>  gcc/common/config/riscv/riscv-common.cc | 9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
>> index 6c210412515..dbb42ca2f1e 100644
>> --- a/gcc/common/config/riscv/riscv-common.cc
>> +++ b/gcc/common/config/riscv/riscv-common.cc
>> @@ -120,6 +120,15 @@ static const riscv_implied_info_t riscv_implied_info[] =
>>    {"zvksc", "zvbc"},
>>    {"zvksg", "zvks"},
>>    {"zvksg", "zvkg"},
>> +  {"zvbb",  "zvkb"},
>> +  {"zvbc", "v"},
>> +  {"zvkb", "v"},
>> +  {"zvkg", "v"},
>> +  {"zvkned",   "v"},
>> +  {"zvknha",   "v"},
>> +  {"zvknhb",   "v"},
>> +  {"zvksed",   "v"},
>> +  {"zvksh",    "v"},
>>
>>    {"zfh", "zfhmin"},
>>    {"zfhmin", "f"},




Re: Re: [PATCH 3/4] RISC-V: Add crypto vector machine descriptions

2023-12-06 Thread Feng Wang
2023-12-06 14:53 juzhe.zhong  wrote:
>Do vector crypto instructions demand RATIO?
>
>If no, add them into:
>
>;; It is valid for instruction that require sew/lmul ratio.
>(define_attr "ratio" ""
>  (cond [(eq_attr "type" "vimov,vfmov,vldux,vldox,vstux,vstox,\
>    vialu,vshift,vicmp,vimul,vidiv,vsalu,\
>    vext,viwalu,viwmul,vicalu,vnshift,\
>    vimuladd,vimerge,vaalu,vsmul,vsshift,\
>    vnclip,viminmax,viwmuladd,vmffs,vmsfs,\
>    vmiota,vmidx,vfalu,vfmul,vfminmax,vfdiv,\
>    vfwalu,vfwmul,vfsqrt,vfrecp,vfsgnj,vfcmp,\
>    vfmerge,vfcvtitof,vfcvtftoi,vfwcvtitof,\
>    vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,\
>    vfncvtftof,vfmuladd,vfwmuladd,vfclass,vired,\
>    viwred,vfredu,vfredo,vfwredu,vfwredo,vimovvx,\
>    vimovxv,vfmovvf,vfmovfv,vslideup,vslidedown,\
>    vislide1up,vislide1down,vfslide1up,vfslide1down,\
>    vgather,vcompress,vlsegdux,vlsegdox,vssegtux,vssegtox")
> (const_int INVALID_ATTRIBUTE)
>

Modified, thanks!

>
>+(define_insn "@pred_vandn"
>+  [(set (match_operand:VI 0 "register_operand"   "=vd,vd")
>
>Seems all vector crypto instructions are not allowed to use v0? Why not use vr?
>
>+   (set_attr "mode" "")])
>use  is enough.

Done.

>
>+(define_insn "@pred_vwsll_scalar"
>+  [(set (match_operand:VWEXTI 0 "register_operand"   "=&vd")
>+    (if_then_else:VWEXTI
>+  (unspec:
>+    [(match_operand: 1 "vector_mask_operand" "vmWc1")
>+ (match_operand 5 "vector_length_operand"    "   rK")
>+ (match_operand 6 "const_int_operand"    "   i")
>+ (match_operand 7 "const_int_operand"    "   i")
>+ (match_operand 8 "const_int_operand"    "   i")
>+ (reg:SI VL_REGNUM)
>+ (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
>+  (ashift:VWEXTI
>+    (zero_extend:VWEXTI
>+ (match_operand: 3 "register_operand"  "vr"))
>+    (match_operand: 4 "pmode_reg_or_uimm5_operand" "rK"))
>+  (match_operand:VWEXTI 2 "vector_merge_operand" "0vu")))]
>+  "TARGET_ZVBB"
>+  "vwsll.v%o4\t%0,%3,%4%p1"
>+  [(set_attr "type" "vwsll")
>+   (set_attr "mode" "")])
>
>Seems that we can leverage EEW widen overlap?
>
>See RVV ISA:
>
> ;; According to RVV ISA:
> ;; The destination EEW is greater than the source EEW, the source EMUL is at least 1,
> ;; and the overlap is in the highest-numbered part of the destination register group
> ;; (e.g., when LMUL=8, vzext.vf4 v0, v6 is legal, but a source of v0, v2, or v4 is not).
> ;; So the source operand should have LMUL >= 1.
>
>Reference patch:
>https://gcc.gnu.org/pipermail/gcc-patches/2023-December/638869.html
>
>Currently, I don't have a solution to support highest-number overlap for vv instructions.
>Keeping them early clobber for now is OK.
>
>juzhe.zh...@rivai.ai

Will update this part after your patch is merged.

> 
>From: Feng Wang
>Date: 2023-12-06 10:45
>To: gcc-patches
>CC: kito.cheng; jeffreyalaw; juzhe.zhong; zhusonghe; panciyan; Feng Wang
>Subject: [PATCH 3/4] RISC-V: Add crypto vector machine descriptions
>This patch add the crypto machine descriptions(vector-crypto.md) and
>some new iterators which are used by crypto vector ext.
> 
>Co-Authored by: Songhe Zhu 
>Co-Authored by: Ciyan Pan 
> 
>gcc/ChangeLog:
> 
>* config/riscv/iterators.md: Add rotate insn name.
>* config/riscv/riscv.md: Add new insns name for crypto vector.
>* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
>* config/riscv/vector.md: Add the corresponding attr for crypto vector.
>* config/riscv/vector-crypto.md: New file.The machine descriptions for crypto vector.
>---
>gcc/config/riscv/iterators.md        |   4 +-
>gcc/config/riscv/riscv.md            |  33 +-
>gcc/config/riscv/vector-crypto.md    | 500 +++
>gcc/config/riscv/vector-iterators.md |  41 +++
>gcc/config/riscv/vector.md           |  49 ++-
>5 files changed, 607 insertions(+), 20 deletions(-)
>create mode 100755 gcc/config/riscv/vector-crypto.md
> 
>diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
>index ecf033f2fa7..f332fba7031 100644
>--- a/gcc/config/riscv/iterators.md
>+++ b/gcc/config/riscv/iterators.md
>@@ -304,7 +304,9 @@
>(umax "maxu")
>(clz "clz")
>(ctz "ctz")
>- (popcount "cpop")])
>+ (popcount "cpop")
>+ (rotate "rol")
>+ (rotatert "ror")])
>;; ---
>;; Int Iterators.
>diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
>index 935eeb7fd8e..a887f3cd412 100644
>--- a/gcc/config/riscv/riscv.md
>++

Re: [PATCH] testsuite: Adjust for the new permerror -Wincompatible-pointer-types

2023-12-06 Thread Yang Yujie
On Thu, Dec 07, 2023 at 01:35:23AM +, Sam James wrote:
> 
> Yang Yujie  writes:
> 
> > On Wed, Dec 06, 2023 at 10:45:22AM -0700, Jeff Law wrote:
> >> 
> >> 
> >> On 12/6/23 05:12, Florian Weimer wrote:
> >> > * Yang Yujie:
> >> > 
> >> > > From: Yang Yujie 
> >> > > Subject: [PATCH] testsuite: Adjust for the new permerror
> >> > >   -Wincompatible-pointer-types
> >> > > To: gcc-patches@gcc.gnu.org
> >> > > Cc: r...@cebitec.uni-bielefeld.de, mikest...@comcast.net, 
> >> > > fwei...@redhat.com,
> >> > >   Yang Yujie 
> >> > > Date: Wed,  6 Dec 2023 10:29:31 +0800 (9 hours, 42 minutes, 7 seconds 
> >> > > ago)
> >> > > Message-ID: <20231206022931.33437-1-yangyu...@loongson.cn>
> >> > > 
> >> > > r14-6037 turned -Wincompatible-pointer-types into a permerror,
> >> > > which causes the following tests to fail.
> >> > > 
> >> > > gcc/testsuite/ChangeLog:
> >> > > 
> >> > >* gcc.dg/fixed-point/composite-type.c: replace dg-warning with 
> >> > > dg-error.
> >> > > ---
> >> > >   .../gcc.dg/fixed-point/composite-type.c   | 64 
> >> > > +--
> >> > >   1 file changed, 32 insertions(+), 32 deletions(-)
> >> > 
> >> > Looks reasonable to me, but I can't approve it.
> >> We might want to fix that from a policy standpoint :-)
> >> 
> >> Regardless, this is OK for the trunk.  Thanks Yang for taking care of it.  I
> >> don't see you in the maintainers file, so I'll go ahead and push it
> >> momentarily.
> >> 
> >> jeff
> >
> > Thanks for the review!
> >
> > With this patch, I also noticed a few errors in building unpatched older
> > software like expect-5.45.4, perl-5.28.3 and bash-5.0.  Will this also be
> > the case when GCC 14 gets released?
> >
> 
> Old versions of software will need to be patched or built with
> -fpermissive. We are working on fixing supported versions of software
> and sending patches upstream - please do join in if you're able, as
> the more help the better.
> 
> It is normal for software to need porting to newer compilers. For
> example, https://gcc.gnu.org/gcc-10/porting_to.html (-fcommon).
> 
> > Thanks,
> > Yujie
> 
> thanks,
> sam

Got it, thanks.

Yujie



[PATCH v3 0/1] LoongArch: Fix eh_return epilogue for normal returns

2023-12-06 Thread Yang Yujie
Updates:
v1 -> v2: Add a test case.
v2 -> v3: Fix code format.

Yang Yujie (1):
  LoongArch: Fix eh_return epilogue for normal returns

 gcc/config/loongarch/loongarch-protos.h   |  2 +-
 gcc/config/loongarch/loongarch.cc | 41 ---
 gcc/config/loongarch/loongarch.md | 18 +++-
 .../loongarch/eh_return-normal-return.c   | 32 +++
 4 files changed, 76 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/eh_return-normal-return.c

-- 
2.43.0



[PATCH v3] LoongArch: Fix eh_return epilogue for normal returns

2023-12-06 Thread Yang Yujie
On LoongArch, the registers $r4 - $r7 (EH_RETURN_DATA_REGNO) will be saved
and restored in the function prologue and epilogue if the given function calls
__builtin_eh_return.  This causes the return value to be overwritten on normal
return paths and breaks a rare case of libgcc's _Unwind_RaiseException.
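
As a rough illustration of the fix, the following C++ sketch models the restore loop: registers $r4 - $r7 keep their save slots, but are not restored when the caller asks to skip them. It is a simplified model, not the loongarch.cc code. The names restore_op and restores_for_epilogue are hypothetical, and the 32-register mask, GP_ARG_FIRST = 4 and UNITS_PER_WORD = 8 are assumptions for an LP64 target.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr int GP_ARG_FIRST = 4;    // assumed: $r4, first eh_return data register
constexpr int UNITS_PER_WORD = 8;  // assumed: 64-bit target

// One register restore from a stack slot.
struct restore_op
{
  int regno;
  long offset;
};

// List the restores an epilogue would emit for the saved-register mask.
std::vector<restore_op>
restores_for_epilogue (uint32_t mask, long top_offset,
                       bool skip_eh_data_regs_p)
{
  std::vector<restore_op> ops;
  long offset = top_offset;
  for (int regno = 0; regno < 32; regno++)
    {
      if (!(mask & (1u << regno)))
        continue;                  // register was never saved
      bool is_eh_data
        = GP_ARG_FIRST <= regno && regno < GP_ARG_FIRST + 4;
      if (!(skip_eh_data_regs_p && is_eh_data))
        ops.push_back ({regno, offset});
      // The slot is consumed whether or not the register is restored,
      // so the layout still matches the prologue.
      offset -= UNITS_PER_WORD;
    }
  return ops;
}
```

On a normal return the caller would pass true, so a return value held in $r4 is left alone; the eh_return path passes false and restores everything.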

gcc/ChangeLog:

* config/loongarch/loongarch.cc: Do not restore the saved eh_return
data registers ($r4-$r7) for a normal return of a function that calls
__builtin_eh_return elsewhere.
* config/loongarch/loongarch-protos.h: Same.
* config/loongarch/loongarch.md: Same.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/eh_return-normal-return.c: New test.
---
 gcc/config/loongarch/loongarch-protos.h   |  2 +-
 gcc/config/loongarch/loongarch.cc | 41 ---
 gcc/config/loongarch/loongarch.md | 18 +++-
 .../loongarch/eh_return-normal-return.c   | 32 +++
 4 files changed, 76 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/eh_return-normal-return.c

diff --git a/gcc/config/loongarch/loongarch-protos.h 
b/gcc/config/loongarch/loongarch-protos.h
index cb8fc36b086..af20b5d7132 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -60,7 +60,7 @@ enum loongarch_symbol_type {
 extern rtx loongarch_emit_move (rtx, rtx);
 extern HOST_WIDE_INT loongarch_initial_elimination_offset (int, int);
 extern void loongarch_expand_prologue (void);
-extern void loongarch_expand_epilogue (bool);
+extern void loongarch_expand_epilogue (int);
 extern bool loongarch_can_use_return_insn (void);
 
 extern bool loongarch_symbolic_constant_p (rtx, enum loongarch_symbol_type *);
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 3545e66a10e..9c0e0dd1b73 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -1015,20 +1015,30 @@ loongarch_save_restore_reg (machine_mode mode, int 
regno, HOST_WIDE_INT offset,
 
 static void
 loongarch_for_each_saved_reg (HOST_WIDE_INT sp_offset,
- loongarch_save_restore_fn fn)
+ loongarch_save_restore_fn fn,
+ bool skip_eh_data_regs_p)
 {
   HOST_WIDE_INT offset;
 
   /* Save the link register and s-registers.  */
   offset = cfun->machine->frame.gp_sp_offset - sp_offset;
   for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
-if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
-  {
-   if (!cfun->machine->reg_is_wrapped_separately[regno])
- loongarch_save_restore_reg (word_mode, regno, offset, fn);
+{
+  /* Special care needs to be taken for $r4-$r7 (EH_RETURN_DATA_REGNO)
+when returning normally from a function that calls __builtin_eh_return.
+In this case, these registers are saved but should not be restored,
+or the return value may be clobbered.  */
 
-   offset -= UNITS_PER_WORD;
-  }
+  if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
+   {
+ if (!(cfun->machine->reg_is_wrapped_separately[regno]
+   || (skip_eh_data_regs_p
+   && GP_ARG_FIRST <= regno && regno < GP_ARG_FIRST + 4)))
+   loongarch_save_restore_reg (word_mode, regno, offset, fn);
+
+ offset -= UNITS_PER_WORD;
+   }
+}
 
   /* This loop must iterate over the same space as its companion in
  loongarch_compute_frame_info.  */
@@ -1297,7 +1307,7 @@ loongarch_expand_prologue (void)
GEN_INT (-step1));
   RTX_FRAME_RELATED_P (emit_insn (insn)) = 1;
   size -= step1;
-  loongarch_for_each_saved_reg (size, loongarch_save_reg);
+  loongarch_for_each_saved_reg (size, loongarch_save_reg, false);
 }
 
   /* Set up the frame pointer, if we're using one.  */
@@ -1382,11 +1392,11 @@ loongarch_can_use_return_insn (void)
   return reload_completed && cfun->machine->frame.total_size == 0;
 }
 
-/* Expand an "epilogue" or "sibcall_epilogue" pattern; SIBCALL_P
-   says which.  */
+/* Expand function epilogue for the following insn patterns:
+   "epilogue" (style == 0) / "sibcall_epilogue" (1) / "eh_return" (2).  */
 
 void
-loongarch_expand_epilogue (bool sibcall_p)
+loongarch_expand_epilogue (int style)
 {
   /* Split the frame into two.  STEP1 is the amount of stack we should
  deallocate before restoring the registers.  STEP2 is the amount we
@@ -1403,7 +1413,8 @@ loongarch_expand_epilogue (bool sibcall_p)
   bool need_barrier_p
 = (get_frame_size () + cfun->machine->frame.arg_pointer_offset) != 0;
 
-  if (!sibcall_p && loongarch_can_use_return_insn ())
+  /* Handle simple returns.  */
+  if (style == 0 && loongarch_can_use_return_insn ())
 {
   emit_jump_insn (gen_return ());
   return;
@@ -1479,7 +1490,8 @@ loongarch_expand_epilogue (bool sibcall_p)
 
   /* Restore the register
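
The register walk changed in the loongarch_for_each_saved_reg hunk above can
be sketched in isolation as follows.  This is a simplified stand-alone model,
not the GCC code: the function name visit_saved_regs, the constants
EH_DATA_FIRST/EH_DATA_COUNT/WORD_SIZE and the 32-bit mask are all assumptions
made for illustration.  The point it shows is that a skipped register still
consumes its stack slot, so the offset is decremented either way.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_GP_REGS 32
#define EH_DATA_FIRST 4   /* assumption: $r4 is the first EH data register */
#define EH_DATA_COUNT 4   /* models $r4-$r7 */
#define WORD_SIZE 8       /* assumption: 64-bit words */

/* Walk every register whose bit is set in MASK.  Each saved register owns
   one stack slot, so OFFSET moves even when the register is skipped.
   Returns a bitmask of the registers that were actually visited and
   stores the offset reached after the walk in *FINAL_OFFSET.  */
uint32_t visit_saved_regs(uint32_t mask, long offset, bool skip_eh_data_regs,
                          long *final_offset)
{
    uint32_t visited = 0;

    for (int regno = 0; regno < NUM_GP_REGS; regno++) {
        if (!(mask & (UINT32_C(1) << regno)))
            continue;

        bool is_eh_data = regno >= EH_DATA_FIRST
                          && regno < EH_DATA_FIRST + EH_DATA_COUNT;

        /* Restore (visit) unless this is an EH data register on a path
           that must leave those registers untouched.  */
        if (!(skip_eh_data_regs && is_eh_data))
            visited |= UINT32_C(1) << regno;

        offset -= WORD_SIZE;
    }

    *final_offset = offset;
    return visited;
}
```

With the skip flag set, registers 4 to 7 are left out of the returned mask
while the final offset is the same as without the flag.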

Re: [PATCH] testsuite: Adjust for the new permerror -Wincompatible-pointer-types

2023-12-06 Thread Sam James


Yang Yujie  writes:

> On Wed, Dec 06, 2023 at 10:45:22AM -0700, Jeff Law wrote:
>> 
>> 
>> On 12/6/23 05:12, Florian Weimer wrote:
>> > * Yang Yujie:
>> > 
>> > > From: Yang Yujie 
>> > > Subject: [PATCH] testsuite: Adjust for the new permerror
>> > >   -Wincompatible-pointer-types
>> > > To: gcc-patches@gcc.gnu.org
>> > > Cc: r...@cebitec.uni-bielefeld.de, mikest...@comcast.net, 
>> > > fwei...@redhat.com,
>> > >   Yang Yujie 
>> > > Date: Wed,  6 Dec 2023 10:29:31 +0800 (9 hours, 42 minutes, 7 seconds 
>> > > ago)
>> > > Message-ID: <20231206022931.33437-1-yangyu...@loongson.cn>
>> > > 
>> > > r14-6037 turned -Wincompatible-pointer-types into a permerror,
>> > > which causes the following tests to fail.
>> > > 
>> > > gcc/testsuite/ChangeLog:
>> > > 
>> > >  * gcc.dg/fixed-point/composite-type.c: replace dg-warning with dg-error.
>> > > ---
>> > >   .../gcc.dg/fixed-point/composite-type.c   | 64 +--
>> > >   1 file changed, 32 insertions(+), 32 deletions(-)
>> > 
>> > Looks reasonable to me, but I can't approve it.
>> We might want to fix that from a policy standpoint :-)
>> 
>> Regardless, this is OK for the trunk.  Thanks Yang for taking care of it.  I
>> don't see you in the maintainers file, so I'll go ahead and push it
>> momentarily.
>> 
>> jeff
>
> Thanks for the review!
>
> With this patch, I also noticed a few errors in building unpatched older
> software like expect-5.45.4, perl-5.28.3 and bash-5.0.  Will this also be
> the case when GCC 14 gets released?
>

Old versions of software will need to be patched or built with
-fpermissive. We are working on fixing supported versions of software
and sending patches upstream - please do join in if you're able, as
the more help the better.

It is normal for software to need porting to newer compilers. For
example, https://gcc.gnu.org/gcc-10/porting_to.html (-fcommon).
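
As a rough sketch of the kind of porting fix involved (the function names
below are made up for illustration and are not taken from any of the packages
mentioned in this thread), the usual remedy is to make the pointer types
agree rather than to rely on -fpermissive:

```c
#include <stddef.h>

/* Sums N long values.  */
long sum_longs(const long *values, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += values[i];
    return total;
}

long sum_three(void)
{
    /* Before the fix this array was declared as 'int data[3]'.  Passing it
       to sum_longs converts int * to const long *, which older GCC
       releases only warned about (-Wincompatible-pointer-types) and which
       is rejected after r14-6037 unless -fpermissive is given.  */
    long data[3] = { 1, 2, 3 };   /* After: element type matches the callee.  */
    return sum_longs(data, 3);
}
```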

> Thanks,
> Yujie

thanks,
sam


Re: [PATCH] testsuite: Adjust for the new permerror -Wincompatible-pointer-types

2023-12-06 Thread Yang Yujie
On Wed, Dec 06, 2023 at 10:45:22AM -0700, Jeff Law wrote:
> 
> 
> On 12/6/23 05:12, Florian Weimer wrote:
> > * Yang Yujie:
> > 
> > > From: Yang Yujie 
> > > Subject: [PATCH] testsuite: Adjust for the new permerror
> > >   -Wincompatible-pointer-types
> > > To: gcc-patches@gcc.gnu.org
> > > Cc: r...@cebitec.uni-bielefeld.de, mikest...@comcast.net, 
> > > fwei...@redhat.com,
> > >   Yang Yujie 
> > > Date: Wed,  6 Dec 2023 10:29:31 +0800 (9 hours, 42 minutes, 7 seconds ago)
> > > Message-ID: <20231206022931.33437-1-yangyu...@loongson.cn>
> > > 
> > > r14-6037 turned -Wincompatible-pointer-types into a permerror,
> > > which causes the following tests to fail.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   * gcc.dg/fixed-point/composite-type.c: replace dg-warning with dg-error.
> > > ---
> > >   .../gcc.dg/fixed-point/composite-type.c   | 64 +--
> > >   1 file changed, 32 insertions(+), 32 deletions(-)
> > 
> > Looks reasonable to me, but I can't approve it.
> We might want to fix that from a policy standpoint :-)
> 
> Regardless, this is OK for the trunk.  Thanks Yang for taking care of it.  I
> don't see you in the maintainers file, so I'll go ahead and push it
> momentarily.
> 
> jeff

Thanks for the review!

With this patch, I also noticed a few errors in building unpatched older
software like expect-5.45.4, perl-5.28.3 and bash-5.0.  Will this also be
the case when GCC 14 gets released?

Thanks,
Yujie




Re: [PATCH 17/21]AArch64: Add implementation for vector cbranch for Advanced SIMD

2023-12-06 Thread Richard Sandiford
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Tuesday, November 28, 2023 5:56 PM
>> To: Tamar Christina 
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov 
>> Subject: Re: [PATCH 17/21]AArch64: Add implementation for vector cbranch for
>> Advanced SIMD
>> 
>> Richard Sandiford  writes:
>> > Tamar Christina  writes:
>> >> Hi All,
>> >>
>> >> This adds an implementation for conditional branch optab for AArch64.
>> >>
>> >> For e.g.
>> >>
>> >> void f1 ()
>> >> {
>> >>   for (int i = 0; i < N; i++)
>> >> {
>> >>   b[i] += a[i];
>> >>   if (a[i] > 0)
>> >>   break;
>> >> }
>> >> }
>> >>
>> >> For 128-bit vectors we generate:
>> >>
>> >>   cmgt    v1.4s, v1.4s, #0
>> >>   umaxp   v1.4s, v1.4s, v1.4s
>> >>   fmov    x3, d1
>> >>   cbnz    x3, .L8
>> >>
>> >> and for 64-bit vectors we can omit the compression:
>> >>
>> >>   cmgt    v1.2s, v1.2s, #0
>> >>   fmov    x2, d1
>> >>   cbz     x2, .L13
>> >>
>> >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>> >>
>> >> Ok for master?
>> >>
>> >> Thanks,
>> >> Tamar
>> >>
>> >> gcc/ChangeLog:
>> >>
>> >>   * config/aarch64/aarch64-simd.md (cbranch4): New.
>> >>
>> >> gcc/testsuite/ChangeLog:
>> >>
>> >>   * gcc.target/aarch64/vect-early-break-cbranch.c: New test.
>> >>
>> >> --- inline copy of patch --
>> >> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
>> >> index 90118c6348e9614bef580d1dc94c0c1841dd5204..cd5ec35c3f53028f14828bd70a92924f62524c15 100644
>> >> --- a/gcc/config/aarch64/aarch64-simd.md
>> >> +++ b/gcc/config/aarch64/aarch64-simd.md
>> >> @@ -3830,6 +3830,46 @@ (define_expand
>> "vcond_mask_"
>> >>DONE;
>> >>  })
>> >>
>> >> +;; Patterns comparing two vectors and conditionally jump
>> >> +
>> >> +(define_expand "cbranch4"
>> >> +  [(set (pc)
>> >> +(if_then_else
>> >> +  (match_operator 0 "aarch64_equality_operator"
>> >> +[(match_operand:VDQ_I 1 "register_operand")
>> >> + (match_operand:VDQ_I 2 "aarch64_simd_reg_or_zero")])
>> >> +  (label_ref (match_operand 3 ""))
>> >> +  (pc)))]
>> >> +  "TARGET_SIMD"
>> >> +{
>> >> +  auto code = GET_CODE (operands[0]);
>> >> +  rtx tmp = operands[1];
>> >> +
>> >> +  /* If comparing against a non-zero vector we have to do a comparison first
>> >> + so we can have a != 0 comparison with the result.  */
>> >> +  if (operands[2] != CONST0_RTX (mode))
>> >> +emit_insn (gen_vec_cmp (tmp, operands[0], operands[1],
>> >> + operands[2]));
>> >> +
>> >> +  /* For 64-bit vectors we need no reductions.  */
>> >> +  if (known_eq (128, GET_MODE_BITSIZE (mode)))
>> >> +{
>> >> +  /* Always reduce using a V4SI.  */
>> >> +  rtx reduc = gen_lowpart (V4SImode, tmp);
>> >> +  rtx res = gen_reg_rtx (V4SImode);
>> >> +  emit_insn (gen_aarch64_umaxpv4si (res, reduc, reduc));
>> >> +  emit_move_insn (tmp, gen_lowpart (mode, res));
>> >> +}
>> >> +
>> >> +  rtx val = gen_reg_rtx (DImode);
>> >> +  emit_move_insn (val, gen_lowpart (DImode, tmp));
>> >> +
>> >> +  rtx cc_reg = aarch64_gen_compare_reg (code, val, const0_rtx);
>> >> +  rtx cmp_rtx = gen_rtx_fmt_ee (code, DImode, cc_reg, const0_rtx);
>> >> +  emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[3]));
>> >> +  DONE;
>> >
>> > Are you sure this is correct for the operands[2] != const0_rtx case?
>> > It looks like it uses the same comparison code for the vector comparison
>> > and the scalar comparison.
>> >
>> > E.g. if the pattern is passed a comparison:
>> >
>> >   (eq (reg:V2SI x) (reg:V2SI y))
>> >
>> > it looks like we'd generate a CMEQ for the x and y, then branch
>> > when the DImode bitcast of the CMEQ result equals zero.  This means
>> > that we branch when no elements of x and y are equal, rather than
>> > when all elements of x and y are equal.
>> >
>> > E.g. for:
>> >
>> >{ 1, 2 } == { 1, 2 }
>> >
>> > CMEQ will produce { -1, -1 }, the scalar comparison will be -1 == 0,
>> > and the branch won't be taken.
>> >
>> > ISTM it would be easier for the operands[2] != const0_rtx case to use
>> > EOR instead of a comparison.  That gives a zero result if the input
>> > vectors are equal and a nonzero result if the input vectors are
>> > different.  We can then branch on the result using CODE and const0_rtx.
>> >
>> > (Hope I've got that right.)
>> >
>> > Maybe that also removes the need for patch 18.
>> 
>> Sorry, I forgot to say: we can't use operands[1] as a temporary,
>> since it's only an input to the pattern.  The EOR destination would
>> need to be a fresh register.
>
> I've updated the patch but it doesn't help since cbranch doesn't really push
> comparisons in.  So we don't seem to ever really get called with anything 
> non-zero.

I suppose it won't trigger for the early-break stuff, since for a scalar
== break condi
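
The concern raised in the review above (CMEQ followed by a scalar "== 0"
test branches when no lanes are equal, whereas EOR followed by the same test
branches when all lanes are equal) can be checked with a small scalar model.
This is only a sketch: the v2si struct and the helper names are assumptions
standing in for a 64-bit V2SI vector and the instructions discussed, not
GCC or AArch64 intrinsics.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct { int32_t lane[2]; } v2si;  /* stand-in for a 64-bit vector */

/* Reinterpret both lanes as one 64-bit scalar (models the DImode bitcast).  */
static uint64_t as_di(v2si v)
{
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* Models CMEQ: all-ones in a lane where the inputs match, zero otherwise.  */
static v2si lane_cmeq(v2si x, v2si y)
{
    v2si r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = (x.lane[i] == y.lane[i]) ? -1 : 0;
    return r;
}

/* Models EOR: zero in a lane where the inputs match, nonzero otherwise.  */
static v2si lane_eor(v2si x, v2si y)
{
    v2si r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = x.lane[i] ^ y.lane[i];
    return r;
}

/* Branch decision for an EQ condition when the vector step is CMEQ.  */
bool eq_branch_via_cmeq(v2si x, v2si y)
{
    return as_di(lane_cmeq(x, y)) == 0;
}

/* Branch decision for an EQ condition when the vector step is EOR.  */
bool eq_branch_via_eor(v2si x, v2si y)
{
    return as_di(lane_eor(x, y)) == 0;
}
```

For { 1, 2 } == { 1, 2 } the CMEQ variant does not branch and the EOR variant
does, matching the example given in the review.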

Re: [PATCH v3 00/16] Support Intel APX NDD

2023-12-06 Thread Hongtao Liu
On Wed, Dec 6, 2023 at 8:11 PM Uros Bizjak  wrote:
>
> On Wed, Dec 6, 2023 at 9:08 AM Hongyu Wang  wrote:
> >
> > Hi,
> >
> > Following up the discussion of V2 patches in
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639368.html,
> > this patch series add early clobber for all TImode NDD alternatives
> > to avoid any potential overlapping between dest register and src
> > register/memory. Also use get_attr_isa (insn) == ISA_APX_NDD instead of
> > checking alternative at asm output stage.
> >
> > Bootstrapped & regtested on x86_64-pc-linux-gnu{-m32,} and sde.
> >
> > Ok for master?
>
> LGTM, but Hongtao should have the final approval here.
Ok, thanks.
>
> Thanks,
> Uros.
>
> >
> > Hongyu Wang (7):
> >   [APX NDD] Disable seg_prefixed memory usage for NDD add
> >   [APX NDD] Support APX NDD for left shift insns
> >   [APX NDD] Support APX NDD for right shift insns
> >   [APX NDD] Support APX NDD for rotate insns
> >   [APX NDD] Support APX NDD for shld/shrd insns
> >   [APX NDD] Support APX NDD for cmove insns
> >   [APX NDD] Support TImode shift for NDD
> >
> > Kong Lingling (9):
> >   [APX NDD] Support Intel APX NDD for legacy add insn
> >   [APX NDD] Support APX NDD for optimization patterns of add
> >   [APX NDD] Support APX NDD for adc insns
> >   [APX NDD] Support APX NDD for sub insns
> >   [APX NDD] Support APX NDD for sbb insn
> >   [APX NDD] Support APX NDD for neg insn
> >   [APX NDD] Support APX NDD for not insn
> >   [APX NDD] Support APX NDD for and insn
> >   [APX NDD] Support APX NDD for or/xor insn
> >
> >  gcc/config/i386/constraints.md|5 +
> >  gcc/config/i386/i386-expand.cc|  164 +-
> >  gcc/config/i386/i386-options.cc   |2 +
> >  gcc/config/i386/i386-protos.h |   16 +-
> >  gcc/config/i386/i386.cc   |   30 +-
> >  gcc/config/i386/i386.md   | 2325 +++--
> >  gcc/testsuite/gcc.target/i386/apx-ndd-adc.c   |   15 +
> >  gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c  |   16 +
> >  gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c   |6 +
> >  .../gcc.target/i386/apx-ndd-shld-shrd.c   |   24 +
> >  .../gcc.target/i386/apx-ndd-ti-shift.c|   91 +
> >  gcc/testsuite/gcc.target/i386/apx-ndd.c   |  202 ++
> >  12 files changed, 2141 insertions(+), 755 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-adc.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-cmov.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-sbb.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-shld-shrd.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-ti-shift.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd.c
> >
> > --
> > 2.31.1
> >



-- 
BR,
Hongtao


[pushed] analyzer: fix taint false positives with UNKNOWN [PR112850]

2023-12-06 Thread David Malcolm
PR analyzer/112850 reports a false positive from
-Wanalyzer-tainted-allocation-size on the Linux kernel [1] where
-fanalyzer complains that an allocation size is attacker-controlled
despite the value being correctly sanitized against upper and lower
limits.

The root cause is that the expression is sufficiently complex
to exceed the -param=analyzer-max-svalue-depth= threshold,
currently at 12, with depth 13, and so it is treated as UNKNOWN.
Hence the sanitizations are seen as comparisons of an UNKNOWN
symbolic value against constants, and these were being ignored
by the taint state machine.

The expression in question is relatively typical for those seen in
Linux kernel ioctl handlers, and I was surprised that it had exceeded
the analyzer's default expression complexity limit.

This patch addresses this problem in three ways:
(a) the default value of the threshold parameter is increased, from 12
to 18, so that such expressions are precisely handled
(b) adding a new -Wanalyzer-symbol-too-complex to warn when the symbol
complexity limit is reached.  This is off by default for users, and
on by default in the test suite.
(c) the taint state machine handles comparisons against UNKNOWN svalues
by dropping all taint information on that execution path, so that if
the complexity limit has been exceeded we don't generate false positives

As well as fixing the taint false positive (PR analyzer/112850), the
patch also fixes a couple of leak false positives seen on flex-generated
scanners (PR analyzer/103546).

[1] specifically, in sound/core/rawmidi.c's handler for
SNDRV_RAWMIDI_STREAM_OUTPUT.

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r14-6239-g08b7462d3ad8e5.
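
For readers unfamiliar with the pattern, the following is a minimal sketch of
a size that is sanitized against lower and upper limits before being used for
an allocation.  It is not the rawmidi.c code from the PR; alloc_checked,
MIN_ELEMS and MAX_ELEMS are hypothetical names, and the sketch assumes that
count arrives from an untrusted caller.

```c
#include <stdlib.h>

#define MIN_ELEMS 1u      /* hypothetical lower limit */
#define MAX_ELEMS 4096u   /* hypothetical upper limit */

/* COUNT is assumed to come from an untrusted source, for example an
   ioctl argument.  Both bounds are checked before the value reaches the
   allocator, so the allocation size is no longer attacker-controlled.  */
void *alloc_checked(unsigned int count)
{
    if (count < MIN_ELEMS || count > MAX_ELEMS)
        return NULL;
    return calloc(count, sizeof(int));
}
```

When the checked expression is much more deeply nested than this, it can
exceed the svalue depth limit described above, which is the situation the
patch addresses.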

gcc/ChangeLog:
PR analyzer/103546
PR analyzer/112850
* doc/invoke.texi: Add -Wanalyzer-symbol-too-complex.

gcc/analyzer/ChangeLog:
PR analyzer/103546
PR analyzer/112850
* analyzer.opt (-param=analyzer-max-svalue-depth=): Increase from
12 to 18.
(Wanalyzer-symbol-too-complex): New.
* diagnostic-manager.cc
(null_assignment_sm_context::clear_all_per_svalue_state): New.
* engine.cc (impl_sm_context::clear_all_per_svalue_state): New.
* program-state.cc (sm_state_map::clear_all_per_svalue_state):
New.
* program-state.h (sm_state_map::clear_all_per_svalue_state): New
decl.
* region-model-manager.cc
(region_model_manager::reject_if_too_complex): Add
-Wanalyzer-symbol-too-complex.
* sm-taint.cc (taint_state_machine::on_condition): Handle
comparisons against UNKNOWN.
* sm.h (sm_context::clear_all_per_svalue_state): New.

gcc/testsuite/ChangeLog:
PR analyzer/103546
PR analyzer/112850
* c-c++-common/analyzer/call-summaries-pr107158-2.c: Add
-Wno-analyzer-symbol-too-complex.
* c-c++-common/analyzer/call-summaries-pr107158.c: Likewise.
* c-c++-common/analyzer/deref-before-check-pr109060-haproxy-cfgparse.c:
Likewise.
* c-c++-common/analyzer/feasibility-3.c: Add
-Wno-analyzer-too-complex and -Wno-analyzer-symbol-too-complex.
* c-c++-common/analyzer/flex-with-call-summaries.c: Add
-Wno-analyzer-symbol-too-complex.  Remove fail for
PR analyzer/103546 leak false positive.
* c-c++-common/analyzer/flex-without-call-summaries.c: Remove
xfail for PR analyzer/103546 leak false positive.
* c-c++-common/analyzer/infinite-recursion-3.c: Add
-Wno-analyzer-symbol-too-complex.
* 
c-c++-common/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:
Likewise.
* 
c-c++-common/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:
Likewise.
* c-c++-common/analyzer/null-deref-pr108400-SoftEtherVPN-WebUi.c:
Likewise.
* c-c++-common/analyzer/null-deref-pr108806-qemu.c: Likewise.
* c-c++-common/analyzer/null-deref-pr108830.c: Likewise.
* c-c++-common/analyzer/pr94596.c: Likewise.
* c-c++-common/analyzer/strtok-2.c: Likewise.
* c-c++-common/analyzer/strtok-4.c: Add -Wno-analyzer-too-complex
and -Wno-analyzer-symbol-too-complex.
* c-c++-common/analyzer/strtok-cppreference.c: Likewise.
* gcc.dg/analyzer/analyzer.exp: Add -Wanalyzer-symbol-too-complex
to DEFAULT_CFLAGS.
* gcc.dg/analyzer/attr-const-3.c: Add
-Wno-analyzer-symbol-too-complex.
* gcc.dg/analyzer/call-summaries-pr107072.c: Likewise.
* gcc.dg/analyzer/doom-s_sound-pr108867.c: Likewise.
* gcc.dg/analyzer/explode-4.c: Likewise.
* gcc.dg/analyzer/null-deref-pr102671-1.c: Likewise.
* gcc.dg/analyzer/null-deref-pr105755.c: Likewise.
* gcc.dg/analyzer/out-of-bounds-curl.c: Likewise.
* gcc.dg/analyzer/pr101503.c: Likewise.
* gcc.dg/analyzer/pr

[PATCH] aarch64: add -fno-stack-protector to tests

2023-12-06 Thread Marek Polacek
Bootstrapped/regtested on aarch64-pc-linux-gnu, ok for trunk/13?

-- >8 --
These tests fail when the testsuite is executed with -fstack-protector-strong.
To avoid this, this patch adds -fno-stack-protector to dg-options.
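
As an illustration of the approach, a test file adjusted this way might look
like the sketch below.  It is a hypothetical example rather than one of the
tests touched by the patch; the function fill_and_sum and the scanned string
are assumptions chosen for the example.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -fno-stack-protector" } */

/* The local buffer would normally attract a stack canary under
   -fstack-protector-strong, which changes the prologue and epilogue
   that scan-assembler patterns expect.  */
int fill_and_sum(int seed)
{
    volatile char buf[64];
    int total = 0;

    for (int i = 0; i < 64; i++) {
        buf[i] = (char) (seed + i);
        total += buf[i];
    }
    return total;
}

/* { dg-final { scan-assembler-not "__stack_chk_fail" } } */
```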

The list of FAILs is appended.  As you can see, it's mostly about
scan-assembler-* which are sort of expected to fail with the stack
protector on.

FAIL: gcc.target/aarch64/ldp_stp_unaligned_2.c scan-assembler-not 
mov\\tx[0-9]+, sp
FAIL: gcc.target/aarch64/shadow_call_stack_5.c scan-assembler-times 
stptx29, x30, [sp] 1
FAIL: gcc.target/aarch64/shadow_call_stack_5.c scan-assembler ldrtx29, 
[sp]
FAIL: gcc.target/aarch64/shadow_call_stack_6.c scan-assembler-times 
strtx30, [sp] 1
FAIL: gcc.target/aarch64/shadow_call_stack_7.c scan-assembler-times 
stptx19, x30, [sp, -[0-9]+]! 1
FAIL: gcc.target/aarch64/shadow_call_stack_7.c scan-assembler ldrtx19, 
[sp], [0-9]+
FAIL: gcc.target/aarch64/shadow_call_stack_8.c scan-assembler-times 
stptx19, x20, [sp, -[0-9]+]! 1
FAIL: gcc.target/aarch64/shadow_call_stack_8.c scan-assembler ldptx19, x20, 
[sp], [0-9]+
FAIL: gcc.target/aarch64/stack-check-12.c scan-assembler-times strtxzr,  2
FAIL: gcc.target/aarch64/stack-check-prologue-11.c scan-assembler-times 
strs+xzr, [sp, 1024] 1
FAIL: gcc.target/aarch64/stack-check-prologue-12.c scan-assembler-times 
strs+xzr, [sp, 1024] 1
FAIL: gcc.target/aarch64/stack-check-prologue-13.c scan-assembler-times 
strs+xzr, [sp, 1024] 1
FAIL: gcc.target/aarch64/stack-check-prologue-13.c scan-assembler-times 
strs+x30, [sp] 1
FAIL: gcc.target/aarch64/stack-check-prologue-14.c scan-assembler-times 
strs+xzr, [sp, 1024] 1
FAIL: gcc.target/aarch64/stack-check-prologue-14.c scan-assembler-times 
strs+x30, [sp] 1
FAIL: gcc.target/aarch64/stack-check-prologue-15.c scan-assembler-times 
strs+xzr, [sp, 1024] 1
FAIL: gcc.target/aarch64/stack-check-prologue-15.c scan-assembler-times 
strs+x30, [sp] 1
FAIL: gcc.target/aarch64/stack-check-prologue-17.c check-function-bodies test1
FAIL: gcc.target/aarch64/stack-check-prologue-17.c check-function-bodies test2
FAIL: gcc.target/aarch64/stack-check-prologue-18.c check-function-bodies test1
FAIL: gcc.target/aarch64/stack-check-prologue-18.c check-function-bodies test2
FAIL: gcc.target/aarch64/stack-check-prologue-18.c check-function-bodies test3
FAIL: gcc.target/aarch64/stack-check-prologue-19.c check-function-bodies test1
FAIL: gcc.target/aarch64/stack-check-prologue-19.c check-function-bodies test2
FAIL: gcc.target/aarch64/stack-check-prologue-19.c check-function-bodies test3
FAIL: gcc.target/aarch64/stack-check-prologue-2.c scan-assembler-times 
strs+xzr, 0
FAIL: gcc.target/aarch64/stack-check-prologue-5.c scan-assembler-times 
strs+xzr, [sp, 1024] 1
FAIL: gcc.target/aarch64/stack-check-prologue-6.c scan-assembler-times 
strs+xzr, [sp, 1024] 1
FAIL: gcc.target/aarch64/stack-check-prologue-8.c scan-assembler-times 
strs+xzr, [sp, 1024] 2
FAIL: gcc.target/aarch64/stack-check-prologue-9.c scan-assembler-times 
strs+xzr, [sp, 1024] 1
FAIL: gcc.target/aarch64/test_frame_1.c scan-assembler-times str\\tx30, 
[sp, -[0-9]+]! 2
FAIL: gcc.target/aarch64/test_frame_10.c scan-assembler-times stp\\tx19, x30, 
[sp, [0-9]+] 1
FAIL: gcc.target/aarch64/test_frame_10.c scan-assembler ldp\\tx19, x30, 
[sp, [0-9]+]
FAIL: gcc.target/aarch64/test_frame_11.c scan-assembler-times stp\\tx29, x30, 
[sp, -[0-9]+]! 2
FAIL: gcc.target/aarch64/test_frame_13.c scan-assembler-times stp\\tx29, x30, 
[sp] 1
FAIL: gcc.target/aarch64/test_frame_15.c scan-assembler-times stp\\tx29, x30, 
[sp, [0-9]+] 1
FAIL: gcc.target/aarch64/test_frame_2.c scan-assembler-times stp\\tx19, x30, 
[sp, -[0-9]+]! 1
FAIL: gcc.target/aarch64/test_frame_2.c scan-assembler ldp\\tx19, x30, 
[sp], [0-9]+
FAIL: gcc.target/aarch64/test_frame_4.c scan-assembler-times stp\\tx19, x30, 
[sp, -[0-9]+]! 1
FAIL: gcc.target/aarch64/test_frame_4.c scan-assembler ldp\\tx19, x30, 
[sp], [0-9]+
FAIL: gcc.target/aarch64/test_frame_6.c scan-assembler-times str\\tx30, 
[sp] 1
FAIL: gcc.target/aarch64/test_frame_7.c scan-assembler-times stp\\tx19, x30, 
[sp] 1
FAIL: gcc.target/aarch64/test_frame_8.c scan-assembler-times str\\tx30, 
[sp, [0-9]+] 1
FAIL: gcc.target/aarch64/test_frame_8.c scan-assembler ldr\\tx30, [sp, 
[0-9]+]
FAIL: gcc.target/aarch64/sve/struct_vect_24.c scan-assembler-times 
cmps+x[0-9]+, 61440 4
FAIL: gcc.target/aarch64/sve/struct_vect_24.c scan-assembler-times 
subs+x[0-9]+, x[0-9]+, 61440 4
FAIL: gcc.target/aarch64/sve/struct_vect_24.c scan-assembler-times 
cmp\\s+x[0-9]+, 61440 4
FAIL: gcc.target/aarch64/sve/struct_vect_24.c scan-assembler-times 
sub\\s+x[0-9]+, x[0-9]+, 61440 4

gcc/testsuite/ChangeLo

Re: [Committed] RISC-V: Fix PR112888 ICE

2023-12-06 Thread Patrick O'Neill

Committed on behalf of Juzhe since he was having internet issues.

Thanks,
Patrick

On 12/6/23 14:35, Juzhe-Zhong wrote:

Committed as it is obvious.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (extract_single_source): new function.
(pre_vsetvl::compute_lcm_local_properties): Fix ICE.

---
  gcc/config/riscv/riscv-vsetvl.cc | 12 +---
  1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 68f0be7e81d..90477f331d7 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -596,6 +596,14 @@ extract_single_source (set_info *set)
return first_insn;
  }
  
+static insn_info *
+extract_single_source (def_info *def)
+{
+  if (!def)
+return nullptr;
+  return extract_single_source (dyn_cast (def));
+}
+
  static bool
  same_equiv_note_p (set_info *set1, set_info *set2)
  {
@@ -2692,9 +2700,7 @@ pre_vsetvl::compute_lcm_local_properties ()
  def_lookup dl = crtl->ssa->find_def (resource, insn);
  def_info *def
= dl.matching_set_or_last_def_of_prev_group ();
- gcc_assert (def);
- insn_info *def_insn = extract_single_source (
-   dyn_cast (def));
+ insn_info *def_insn = extract_single_source (def);
  if (def_insn && vsetvl_insn_p (def_insn->rtl ()))
{
  vsetvl_info def_info = vsetvl_info (def_insn);


Re: [PATCH] libsupc++: try cxa_thread_atexit_impl at runtime

2023-12-06 Thread Alexandre Oliva
On Dec  6, 2023, Jonathan Wakely  wrote:

>> -void *obj, void *dso_handle)
>> +void *obj, [[maybe_unused]] void *dso_handle)

> The patch is OK with that change.

Thanks, here's what I'm going to install.  Regstrapped on
x86_64-linux-gnu, with and without
ac_cv_func___cxa_thread_atexit_impl=no, on a machine that has
__cxa_thread_atexit_impl (but not __cxa_thread_atexit) in libc.


libsupc++: try cxa_thread_atexit_impl at runtime

g++.dg/tls/thread_local-order2.C fails when the toolchain is built for
a platform that lacks __cxa_thread_atexit_impl, even if the program is
built and run using that toolchain on a (later) platform that offers
__cxa_thread_atexit_impl.

This patch adds runtime testing for __cxa_thread_atexit_impl on select
platforms (GNU variants, for starters) that support weak symbols.
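
The runtime test relies on a weak declaration: if nothing in the link
provides the symbol, its address compares equal to null and a fallback can be
used instead.  A minimal sketch of that technique follows.  The names
optional_register_impl, fallback_register and register_callback are
hypothetical and are not part of libstdc++ or libc, and the sketch assumes an
ELF target with a GNU-compatible compiler.

```c
#include <stddef.h>

/* Hypothetical optional function.  Because the declaration is weak, its
   address is null when no object or library in the link defines it.  */
extern int optional_register_impl(void (*fn)(void *), void *arg)
    __attribute__((weak));

static int fallback_calls;

/* Used when the optional implementation is not available at run time.  */
static int fallback_register(void (*fn)(void *), void *arg)
{
    (void) fn;
    (void) arg;
    fallback_calls++;
    return 0;
}

int register_callback(void (*fn)(void *), void *arg)
{
    /* Prefer the optional implementation if the dynamic linker found one.  */
    if (optional_register_impl)
        return optional_register_impl(fn, arg);
    return fallback_register(fn, arg);
}

int fallback_call_count(void)
{
    return fallback_calls;
}
```

In a program where nothing defines optional_register_impl, every call to
register_callback takes the fallback path.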


for  libstdc++-v3/ChangeLog

PR libstdc++/112858
* config/os/gnu-linux/os_defines.h
(_GLIBCXX_MAY_HAVE___CXA_THREAD_ATEXIT_IMPL): Define.
* libsupc++/atexit_thread.cc [__GXX_WEAK__ &&
_GLIBCXX_MAY_HAVE___CXA_THREAD_ATEXIT_IMPL]
(__cxa_thread_atexit): Add dynamic detection of
__cxa_thread_atexit_impl.
---
 libstdc++-v3/config/os/gnu-linux/os_defines.h |5 +
 libstdc++-v3/libsupc++/atexit_thread.cc   |   23 ++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/config/os/gnu-linux/os_defines.h b/libstdc++-v3/config/os/gnu-linux/os_defines.h
index 87317031fcd71..a2e4baec069d5 100644
--- a/libstdc++-v3/config/os/gnu-linux/os_defines.h
+++ b/libstdc++-v3/config/os/gnu-linux/os_defines.h
@@ -60,6 +60,11 @@
 # define _GLIBCXX_HAVE_FLOAT128_MATH 1
 #endif
 
+// Enable __cxa_thread_atexit to rely on a (presumably libc-provided)
+// __cxa_thread_atexit_impl, if it happens to be defined, even if
+// configure couldn't find it during the build.
+#define _GLIBCXX_MAY_HAVE___CXA_THREAD_ATEXIT_IMPL 1
+
 #ifdef __linux__
 // The following libpthread properties only apply to Linux, not GNU/Hurd.
 
diff --git a/libstdc++-v3/libsupc++/atexit_thread.cc b/libstdc++-v3/libsupc++/atexit_thread.cc
index 9346d50f5dafe..28423344a0f34 100644
--- a/libstdc++-v3/libsupc++/atexit_thread.cc
+++ b/libstdc++-v3/libsupc++/atexit_thread.cc
@@ -138,11 +138,32 @@ namespace {
   }
 }
 
+#if __GXX_WEAK__ && _GLIBCXX_MAY_HAVE___CXA_THREAD_ATEXIT_IMPL
+extern "C"
+int __attribute__ ((__weak__))
+__cxa_thread_atexit_impl (void (_GLIBCXX_CDTOR_CALLABI *func) (void *),
+ void *arg, void *d);
+#endif
+
+// ??? We can't make it an ifunc, can we?
 extern "C" int
 __cxxabiv1::__cxa_thread_atexit (void (_GLIBCXX_CDTOR_CALLABI *dtor)(void *),
-void *obj, void */*dso_handle*/)
+void *obj, [[maybe_unused]] void *dso_handle)
   _GLIBCXX_NOTHROW
 {
+#if __GXX_WEAK__ && _GLIBCXX_MAY_HAVE___CXA_THREAD_ATEXIT_IMPL
+  if (__cxa_thread_atexit_impl)
+// Rely on a (presumably libc-provided) __cxa_thread_atexit_impl,
+// if it happens to be defined, even if configure couldn't find it
+// during the build.  _GLIBCXX_MAY_HAVE___CXA_THREAD_ATEXIT_IMPL
+// may be defined e.g. in os_defines.h on platforms where some
+// versions of libc have a __cxa_thread_atexit_impl definition,
+// but whose earlier versions didn't.  This enables programs built
+// by toolchains compatible with earlier libc versions to still
+// benefit from a libc-provided __cxa_thread_atexit_impl.
+return __cxa_thread_atexit_impl (dtor, obj, dso_handle);
+#endif
+
   // Do this initialization once.
   if (__gthread_active_p ())
 {

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


[Committed] RISC-V: Fix PR112888 ICE

2023-12-06 Thread Juzhe-Zhong
Committed as it is obvious.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (extract_single_source): new function.
(pre_vsetvl::compute_lcm_local_properties): Fix ICE.

---
 gcc/config/riscv/riscv-vsetvl.cc | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 68f0be7e81d..90477f331d7 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -596,6 +596,14 @@ extract_single_source (set_info *set)
   return first_insn;
 }
 
+static insn_info *
+extract_single_source (def_info *def)
+{
+  if (!def)
+return nullptr;
+  return extract_single_source (dyn_cast (def));
+}
+
 static bool
 same_equiv_note_p (set_info *set1, set_info *set2)
 {
@@ -2692,9 +2700,7 @@ pre_vsetvl::compute_lcm_local_properties ()
  def_lookup dl = crtl->ssa->find_def (resource, insn);
  def_info *def
= dl.matching_set_or_last_def_of_prev_group ();
- gcc_assert (def);
- insn_info *def_insn = extract_single_source (
-   dyn_cast (def));
+ insn_info *def_insn = extract_single_source (def);
  if (def_insn && vsetvl_insn_p (def_insn->rtl ()))
{
  vsetvl_info def_info = vsetvl_info (def_insn);
-- 
2.36.3



[PATCH] htdocs: correct spelling and use https in examples

2023-12-06 Thread Jonny Grant
Revised version of this patch after review.

ChangeLog:

htdocs: correct spelling and use https in examples.



From 52d413bce86827f2add424e78321b509661f6f59 Mon Sep 17 00:00:00 2001
From: Jonathan Grant 
Date: Wed, 6 Dec 2023 22:27:29 +
Subject: [PATCH] htdocs: correct spelling and use https in examples

Signed-off-by: Jonathan Grant 
---
 htdocs/bugs/management.html   | 2 +-
 htdocs/codingrationale.html   | 2 +-
 htdocs/contribute.html| 6 +++---
 htdocs/gcc-14/changes.html| 2 +-
 htdocs/gccmission.html| 2 +-
 htdocs/git.html   | 7 +++
 htdocs/projects/cfg.html  | 2 +-
 htdocs/projects/cli.html  | 2 +-
 htdocs/projects/cxx-reflection/index.html | 2 +-
 htdocs/projects/optimize.html | 6 +++---
 htdocs/projects/tree-profiling.html   | 2 +-
 htdocs/testing/index.html | 2 +-
 12 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/htdocs/bugs/management.html b/htdocs/bugs/management.html
index 28dfa76a..b2bb740e 100644
--- a/htdocs/bugs/management.html
+++ b/htdocs/bugs/management.html
@@ -64,7 +64,7 @@ perspective, these are the relevant ones and what their values mean:
 The status and resolution fields define and track the life cycle of a
 bug.  In addition to their https://gcc.gnu.org/bugzilla/page.cgi?id=fields.html";>regular
-descriptions, we also use two adition status values:
+descriptions, we also use two additional status values:
 
 
 
diff --git a/htdocs/codingrationale.html b/htdocs/codingrationale.html
index 6cc76885..c51c9da4 100644
--- a/htdocs/codingrationale.html
+++ b/htdocs/codingrationale.html
@@ -155,7 +155,7 @@ Wide use of implicit conversion can cause some very surprising results.
 
 
 C++03 has no explicit conversion operators,
-and hence using them cannot avoid suprises.
+and hence using them cannot avoid surprises.
 Wait for C++11.
 
 
diff --git a/htdocs/contribute.html b/htdocs/contribute.html
index 7c1ae323..152675fa 100644
--- a/htdocs/contribute.html
+++ b/htdocs/contribute.html
@@ -299,7 +299,7 @@ followed by a colon.  For example,
 
 
 Some large components may be subdivided into sub-components.  If
-the subcomponent name is not disctinct in its own right, you can use the
+the subcomponent name is not distinct in its own right, you can use the
 form component/sub-component:.
 
 Series identifier
@@ -329,7 +329,7 @@ the commit message so that Bugzilla will correctly notice the
 commit.  If your patch relates to two bugs, then write
 [PRn, PRm].  For multiple
 bugs, just cite the most relevant one in the summary and use an
-elipsis instead of the second, or subsequent PR numbers; list all the
+ellipsis instead of the second, or subsequent PR numbers; list all the
 related PRs in the body of the commit message in the normal way.
 
 It is not necessary to cite bugs that are closed as duplicates of
@@ -354,7 +354,7 @@ together.
 If you submit a new version of a patch series, then you should
 start a new email thread (don't reply to the original patch series).
 This avoids email threads becoming confused between discussions of the
-first and subsequent revisions of the patch set.  Your cover leter
+first and subsequent revisions of the patch set.  Your cover letter
 (0/nnn) should explain clearly what has been changed between
 the two patch series.  Also state if some of the patches are unchanged
 between revisions; this saves maintainers having to re-review the
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 5a453437..bd51ecb4 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -34,7 +34,7 @@ a work-in-progress.
   another structure, is deprecated. Refer to
   https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html";>
   Zero Length Arrays.
-  Any code relying on this extension should be modifed to ensure that
+  Any code relying on this extension should be modified to ensure that
   C99 flexible array members only end up at the ends of structures.
   Please use the warning option
   https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wflex-array-member-not-at-end";>-Wflex-array-member-not-at-end
 to
diff --git a/htdocs/gccmission.html b/htdocs/gccmission.html
index 58a12755..1124fe9f 100644
--- a/htdocs/gccmission.html
+++ b/htdocs/gccmission.html
@@ -55,7 +55,7 @@ GCC.
  Patches will be considered equally based on their
  technical merits.
  All individuals and companies are welcome to contribute
- as long as they accept the groundrules.
+ as long as they accept the ground rules.
  
 Open mailing lists.
 Developer friendly tools and procedures (i.e. [version control], multiple
diff --git a/htdocs/git.html b/htdocs/git.html
index 22c0eec1..ed4607ef 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -27,7 +27,6 @@ Git history online.
 (Our web pages are managed via Git in a
 sep

Re: [PATCH] htdocs/git.html: correct spelling and use git in example

2023-12-06 Thread Jonny Grant



On 04/12/2023 20:37, Joseph Myers wrote:
> On Fri, 1 Dec 2023, Jonny Grant wrote:
> 
>>
>>
>> On 30/11/2023 23:56, Joseph Myers wrote:
>>> On Thu, 30 Nov 2023, Jonny Grant wrote:
>>>
>>>> ChangeLog:
>>>>
>>>> htdocs/git.html: change example to use git:// and correct
>>>> spelling repostiory -> repository.
>>>
>>> git:// (unencrypted / unauthenticated) is pretty widely considered 
>>> obsolescent, I'm not sure adding a use of it (as opposed to changing any 
>>> existing examples to use a secure connection mechanism) is a good idea.
>>>
>>
>> Hi Joseph
>>
>> Thank you for your review.
>>
>> Good point. I changed the ssh:// example because it doesn't work with 
>> anonymous access.
>> How about changing both to https:// ?
> 
> Using https:// makes sense for examples for anonymous access, yes.
> 

Okay, I'll email a revised patch in a moment with that changed.
With kind regards, Jonny


Re: [PATCH] analyzer: deal with -fshort-enums

2023-12-06 Thread David Malcolm
On Wed, 2023-12-06 at 02:31 -0300, Alexandre Oliva wrote:
> On Nov 22, 2023, Alexandre Oliva  wrote:
> 
> > Ah, nice, that's a great idea, I wish I'd thought of that!  Will
> > do.
> 
> Sorry it took me so long, here it is.  I added two tests, so that,
> regardless of the defaults, we get both circumstances tested, without
> repetition.
> 
> Regstrapped on x86_64-linux-gnu.  Also tested on arm-eabi.  Ok to
> install?

Thanks for the updated patch.

Looks good to me.

Dave

> 
> 
> analyzer: deal with -fshort-enums
> 
> On platforms that enable -fshort-enums by default, various switch-
> enum
> analyzer tests fail, because apply_constraints_for_gswitch doesn't
> expect the integral promotion type cast.  I've arranged for the code
> to cope with those casts.
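
For readers unfamiliar with the promotion mentioned above, here is a minimal sketch of the situation. It is an assumption-laden illustration, not the analyzer test case: the enum is given an explicit one-byte underlying type to imitate what -fshort-enums does, and the names e and classify are made up for this example.

```cpp
#include <type_traits>

// Hypothetical stand-in for an enum compiled with -fshort-enums: the
// explicit underlying type makes it one byte wide regardless of flags.
enum e : unsigned char { E_VAL0, E_VAL1, E_VAL2 };

// The switch operand undergoes integral promotion, so the value that is
// actually switched on has type int rather than e.
static_assert (sizeof (e) == 1, "e is stored in a single byte");
static_assert (std::is_same<decltype (+E_VAL0), int>::value,
               "a one-byte enum promotes to int");

// Every enumerator has its own case and there is no default label.
int classify (e val)
{
  switch (val)
    {
    case E_VAL0: return 10;
    case E_VAL1: return 11;
    case E_VAL2: return 12;
    }
  return -1;  // only reached for values outside the enum
}
```

The point of the sketch is only that the switch index is an int wrapped around an enum value, which is the cast the patch teaches apply_constraints_for_gswitch to look through.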
> 
> 
> for  gcc/analyzer/ChangeLog
> 
> * region-model.cc (has_nondefault_case_for_value_p): Take
> enumerate type as a parameter.
> (region_model::apply_constraints_for_gswitch): Cope with
> integral promotion type casts.
> 
> for  gcc/testsuite/ChangeLog
> 
> * gcc.dg/analyzer/switch-short-enum-1.c: New.
> * gcc.dg/analyzer/switch-no-short-enum-1.c: New.
> ---
>  gcc/analyzer/region-model.cc   |   27 +++-
>  .../gcc.dg/analyzer/switch-no-short-enum-1.c   |  141
> 
>  .../gcc.dg/analyzer/switch-short-enum-1.c  |  140
> 
>  3 files changed, 304 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/switch-no-short-
> enum-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/switch-short-enum-
> 1.c
> 
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
> model.cc
> index 2157ad2578b85..6a7a8bc9f4884 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -5387,10 +5387,10 @@ has_nondefault_case_for_value_p (const
> gswitch *switch_stmt, tree int_cst)
>     has nondefault cases handling all values in the enum.  */
>  
>  static bool
> -has_nondefault_cases_for_all_enum_values_p (const gswitch
> *switch_stmt)
> +has_nondefault_cases_for_all_enum_values_p (const gswitch
> *switch_stmt,
> +   tree type)
>  {
>    gcc_assert (switch_stmt);
> -  tree type = TREE_TYPE (gimple_switch_index (switch_stmt));
>    gcc_assert (TREE_CODE (type) == ENUMERAL_TYPE);
>  
>    for (tree enum_val_iter = TYPE_VALUES (type);
> @@ -5426,6 +5426,23 @@ apply_constraints_for_gswitch (const
> switch_cfg_superedge &edge,
>  {
>    tree index  = gimple_switch_index (switch_stmt);
>    const svalue *index_sval = get_rvalue (index, ctxt);
> +  bool check_index_type = true;
> +
> +  /* With -fshort-enum, there may be a type cast.  */
> +  if (ctxt && index_sval->get_kind () == SK_UNARYOP
> +  && TREE_CODE (index_sval->get_type ()) == INTEGER_TYPE)
> +    {
> +  const unaryop_svalue *unaryop = as_a <const unaryop_svalue *> (index_sval);
> +  if (unaryop->get_op () == NOP_EXPR
> + && is_a <const initial_svalue *> (unaryop->get_arg ()))
> +   if (const initial_svalue *initvalop = (as_a <const initial_svalue *>
> +  (unaryop->get_arg ())))
> + if (TREE_CODE (initvalop->get_type ()) == ENUMERAL_TYPE)
> +   {
> + index_sval = initvalop;
> + check_index_type = false;
> +   }
> +    }
>  
>    /* If we're switching based on an enum type, assume that the user
> is only
>   working with values from the enum.  Hence if this is an
> @@ -5437,12 +5454,14 @@ apply_constraints_for_gswitch (const
> switch_cfg_superedge &edge,
>    ctxt
>    /* Must be an enum value.  */
>    && index_sval->get_type ()
> -  && TREE_CODE (TREE_TYPE (index)) == ENUMERAL_TYPE
> +  && (!check_index_type
> + || TREE_CODE (TREE_TYPE (index)) == ENUMERAL_TYPE)
>    && TREE_CODE (index_sval->get_type ()) == ENUMERAL_TYPE
>    /* If we have a constant, then we can check it directly.  */
>    && index_sval->get_kind () != SK_CONSTANT
>    && edge.implicitly_created_default_p ()
> -  && has_nondefault_cases_for_all_enum_values_p (switch_stmt)
> +  && has_nondefault_cases_for_all_enum_values_p (switch_stmt,
> +    index_sval-
> >get_type ())
>    /* Don't do this if there's a chance that the index is
>  attacker-controlled.  */
>    && !ctxt->possibly_tainted_p (index_sval))
> diff --git a/gcc/testsuite/gcc.dg/analyzer/switch-no-short-enum-1.c
> b/gcc/testsuite/gcc.dg/analyzer/switch-no-short-enum-1.c
> new file mode 100644
> index 0..98f6d91f97481
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/switch-no-short-enum-1.c
> @@ -0,0 +1,141 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-fno-short-enums" } */
> +/* { dg-skip-if "default" { ! short_enums } } */
> +
> +#include "analyzer-decls.h"
> +
> +/* Verify the handling of "switch (enum_value)".  */
> +
> +enum e
> +{
> + E_VAL0,
> + E

Re: Causes to nvptx bootstrap fail: [PATCH v5] Introduce strub: machine-independent stack scrubbing

2023-12-06 Thread Alexandre Oliva
On Dec  6, 2023, Thomas Schwinge  wrote:

> As I understand things, this cannot be implemented (at the call site) for
> nvptx, given that the callee's stack is not visible there: PTX is unusual
> in that the concept of a "standard" stack isn't exposed.

Not even when one PTX function calls another?  Interesting.  I'd hoped
that with control over entering and leaving strub contexts, one could
(manually) ensure they'd run in the same execution domain.  But if not
even that is possible, it will render the current strub implementation
entirely unusable for this target indeed.

Now, it doesn't seem to me that the build errors being experienced have
to do with that, but rather with lack of or incomplete support for
__builtin_{frame,stack}_address().  Are those errors expected when using
these builtins on this target?  I'd have expected them to compile, even
if something went wrong at runtime.


> Instead of allowing "strub" pieces that can be implemented, should this
> whole machinery generally be disabled (forced '-fstrub=disable', or via a
> new target hook?)?  The libgcc functions should then not get defined
> (thus, linker error upon accidental use), or should just '__builtin_trap'
> if that makes more sense?  Need an effective-target for the test cases.

> Alternatively, we may also leave the generic middle end handling alive,
> and 'sorry' (or similar) in the nvptx back end, as necessary?

Disabling the runtime bits is easy, once we determine what condition we
wish to test for.  I suppose testing for target support in the compiler,
issuing a 'sorry' in case the feature is required, would provide
something for libgcc configure and testsuite effective-target to test
for and decide whether to enable runtime support and run the tests.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH] RISC-V: xtheadfmemidx: Disable if xtheadmemidx is not available

2023-12-06 Thread Jeff Law




On 12/5/23 08:16, Christoph Müllner wrote:

XTheadMemIdx provides register-register offsets for GP register
loads/stores.  XTheadFMemIdx does the same for FP registers.

We've observed an issue with XTheadFMemIdx-only builds, where FP
registers have been promoted to GP registers:

(insn 26 22 51 (set (reg:DF 15 a5 [orig:136  ] [136])
 (mem/u:DF (plus:DI (reg/f:DI 15 a5 [141])
 (reg:DI 10 a0 [144])) [1 CSWTCH.2[_10]+0 S8 A64])) 217 
{*movdf_hardfloat_rv64}
  (expr_list:REG_DEAD (reg:DI 10 a0 [144])
 (nil)))

This results in the following assembler error:
   Assembler messages:
   Error: unrecognized opcode `th.lrd a5,a5,a0,0', extension `xtheadmemidx' 
required

There seems to be a (reasonable) assumption that addressing modes
for FP registers are compatible with those of GP registers.

We already ran into a similar issue during development of the
XTheadFMemIdx support patch, where we could trace the issue down to
the optimization splitters.  Back then we simply disabled them in case
XTheadMemIdx is not available.  But as it turned out, that was not
enough.

To ensure we won't see such issues anymore, let's make the support
for XTheadFMemIdx depend on XTheadMemIdx.  I.e., if only XTheadFMemIdx
is available, then no instructions of this extension will be emitted.

While this looks a bit drastic at first view, it is the best practical
solution since XTheadFMemIdx without XTheadMemIdx does not exist in real
hardware and would be an odd thing to do.
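
The dependency described above can be sketched as follows. This is a simplified model under stated assumptions: target_flags and indexed_address_ok are hypothetical names for this illustration and are not the ones used in config/riscv/thead.cc.

```cpp
// Hypothetical, simplified model of the gating; these names are not the
// ones used in config/riscv/thead.cc.
struct target_flags
{
  bool xtheadmemidx;   // register-register addressing for GP loads/stores
  bool xtheadfmemidx;  // the same for FP loads/stores
};

// Return true if a register+register address may be used for an access
// in the given mode class.
bool indexed_address_ok (const target_flags &t, bool fp_mode)
{
  if (!t.xtheadmemidx)
    return false;          // FP-only configurations get no indexed addresses
  return fp_mode ? t.xtheadfmemidx : true;
}
```

With this shape, an XTheadFMemIdx-only configuration never produces an indexed address, so an FP value that ends up in a GP register cannot require an instruction the assembler would reject.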

gcc/ChangeLog:

* config/riscv/thead.cc (th_memidx_classify_address_index):
Require TARGET_XTHEADMEMIDX for FP modes.
* config/riscv/thead.md: Require TARGET_XTHEADMEMIDX for all
XTheadFMemIdx patterns.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadfmemidx-without-xtheadmemidx.c: New test.

OK.

Note that in the reload era this kind of issue was common.  Essentially 
you have a MEM, but you don't know if it's going to be used in a load or 
a store.  For loads you don't know if the destination will be a GPR, FPR 
or something else, similarly for stores you didn't know if the source 
value came from a GPR, FPR or elsewhere.


As a result you had to assume that you'd eventually see a GPR used with 
floating point modes and FPRs used with integer modes.  It caused all 
kinds of headaches on the PA.  Valid addressing modes differed across 
the various cases, and the fact that integer multiplies actually used the 
FP unit, and thus had to be moved into FP regs first, made it even worse.


I think things are better with LRA, but I don't know if this class of 
problems is totally eliminated.  Point being I'm not surprised that 
you're seeing GPRs referenced in FP modes.


jeff


Re: [PATCH] Reimplement __gnu_cxx::__ops operators

2023-12-06 Thread François Dumont

I think I still got no feedback about this cleanup proposal.

Here is a new version.

François

On 15/06/2023 07:07, François Dumont wrote:
I think we all agree that __gnu_cxx::__ops needed to be reimplemented, 
here it is.


Note that I kept the usage of std::ref in <deque>, <string> and <vector>.

    libstdc++: Reimplement __gnu_cxx::__ops operators

    Replace functors that take iterators as input with functors that
    match the same Standard expectations as the ones imposed on
    predicates used in predicate-aware algos. Doing so we need far fewer
    functors. It requires that iterators are dereferenced at algo level and
    not in the functors anymore.

    libstdc++-v3/ChangeLog:

    * include/std/functional (_Not_fn): Move to...
    * include/bits/predefined_ops.h: ...here, and expose a version
    in pre-C++14 mode.
    (__not_fn): New, use latter.
    (_Iter_less_iter, _Iter_less_val, _Val_less_iter, _Iter_equal_to_iter)
    (_Iter_equal_to_val, _Iter_comp_iter, _Iter_comp_val, _Val_comp_iter)
    (_Iter_equals_val, _Iter_equals_iter, _Iter_pred, _Iter_comp_val)
    (_Iter_comp_to_val, _Iter_comp_to_iter, _Iter_negate): Remove.
    (__iter_less_iter, __iter_less_val, __iter_comp_val, __val_less_iter)
    (__val_comp_iter, __iter_equal_to_iter, __iter_equal_to_val, __iter_comp_iter)
    (__val_comp_iter, __iter_equals_val, __iter_comp_iter, __pred_iter): Remove.
    (_Less, _Equal_to, _Equal_to_val, _Comp_val): New.
    (__less, __equal_to, __comp_val): New.
    * include/bits/stl_algo.h: Adapt all algos to use new __gnu_cxx::__ops operators.
    When possible use std::move to pass predicates between routines.
    * include/bits/stl_algobase.h: Likewise.
    * include/bits/stl_heap.h: Likewise.
    * include/std/deque: Cleanup usage of __gnu_cxx::__ops operators.
    * include/std/string: Likewise.
    * include/std/vector: Likewise.

Tested under Linux x86_64 normal and _GLIBCXX_DEBUG modes.

Ok to commit ?

François
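
To illustrate the idea in the commit message (predicates that take values, with the dereference done by the algorithm), here is a small stand-alone sketch. It is not the libstdc++ code: less_values and lower_bound_sketch are hypothetical names chosen for this example, and the search is an ordinary lower_bound written out by hand.

```cpp
#include <iterator>

// Hypothetical names throughout; this is not the libstdc++ code.

// Value-based comparison: it never sees an iterator.
struct less_values
{
  template<typename Lhs, typename Rhs>
    bool operator() (const Lhs &lhs, const Rhs &rhs) const
    { return lhs < rhs; }
};

// A lower_bound-style search that dereferences at the algorithm level
// and passes plain values to the predicate.
template<typename FwdIt, typename T, typename Compare>
  FwdIt lower_bound_sketch (FwdIt first, FwdIt last, const T &val,
                            Compare comp)
  {
    typename std::iterator_traits<FwdIt>::difference_type len
      = std::distance (first, last);
    while (len > 0)
      {
        typename std::iterator_traits<FwdIt>::difference_type half = len / 2;
        FwdIt mid = first;
        std::advance (mid, half);
        if (comp (*mid, val))   // dereference here, not inside the functor
          {
            first = ++mid;
            len -= half + 1;
          }
        else
          len = half;
      }
    return first;
  }
```

Because the functor only compares values, the same less_values object works whether the algorithm compares element with element or element with a searched-for value, which is presumably why far fewer functor types are needed.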
diff --git a/libstdc++-v3/include/bits/predefined_ops.h 
b/libstdc++-v3/include/bits/predefined_ops.h
index e9933373ed9..8753e6f64cd 100644
--- a/libstdc++-v3/include/bits/predefined_ops.h
+++ b/libstdc++-v3/include/bits/predefined_ops.h
@@ -32,376 +32,229 @@
 
 #include 
 
+#if __cplusplus >= 201103L
+# include 
+#endif
+
 namespace __gnu_cxx
 {
 namespace __ops
 {
-  struct _Iter_less_iter
+  struct _Less
   {
-template<typename _Iterator1, typename _Iterator2>
+template<typename _Lhs, typename _Rhs>
   _GLIBCXX14_CONSTEXPR
   bool
-  operator()(_Iterator1 __it1, _Iterator2 __it2) const
-  { return *__it1 < *__it2; }
+  operator()(const _Lhs& __lhs, const _Rhs& __rhs) const
+  { return __lhs < __rhs; }
   };
 
   _GLIBCXX14_CONSTEXPR
-  inline _Iter_less_iter
-  __iter_less_iter()
-  { return _Iter_less_iter(); }
-
-  struct _Iter_less_val
-  {
-#if __cplusplus >= 201103L
-constexpr _Iter_less_val() = default;
-#else
-_Iter_less_val() { }
-#endif
-
-_GLIBCXX20_CONSTEXPR
-explicit
-_Iter_less_val(_Iter_less_iter) { }
-
-template<typename _Iterator, typename _Value>
-  _GLIBCXX20_CONSTEXPR
-  bool
-  operator()(_Iterator __it, _Value& __val) const
-  { return *__it < __val; }
-  };
-
-  _GLIBCXX20_CONSTEXPR
-  inline _Iter_less_val
-  __iter_less_val()
-  { return _Iter_less_val(); }
-
-  _GLIBCXX20_CONSTEXPR
-  inline _Iter_less_val
-  __iter_comp_val(_Iter_less_iter)
-  { return _Iter_less_val(); }
-
-  struct _Val_less_iter
-  {
-#if __cplusplus >= 201103L
-constexpr _Val_less_iter() = default;
-#else
-_Val_less_iter() { }
-#endif
-
-_GLIBCXX20_CONSTEXPR
-explicit
-_Val_less_iter(_Iter_less_iter) { }
-
-template<typename _Value, typename _Iterator>
-  _GLIBCXX20_CONSTEXPR
-  bool
-  operator()(_Value& __val, _Iterator __it) const
-  { return __val < *__it; }
-  };
+  inline _Less
+  __less()
+  { return _Less(); }
 
-  _GLIBCXX20_CONSTEXPR
-  inline _Val_less_iter
-  __val_less_iter()
-  { return _Val_less_iter(); }
-
-  _GLIBCXX20_CONSTEXPR
-  inline _Val_less_iter
-  __val_comp_iter(_Iter_less_iter)
-  { return _Val_less_iter(); }
-
-  struct _Iter_equal_to_iter
+  struct _Equal_to
   {
-template<typename _Iterator1, typename _Iterator2>
+template<typename _Lhs, typename _Rhs>
   _GLIBCXX20_CONSTEXPR
   bool
-  operator()(_Iterator1 __it1, _Iterator2 __it2) const
-  { return *__it1 == *__it2; }
+  operator()(const _Lhs& __lhs, const _Rhs& __rhs) const
+  { return __lhs == __rhs; }
   };
 
   _GLIBCXX20_CONSTEXPR
-  inline _Iter_equal_to_iter
-  __iter_equal_to_iter()
-  { return _Iter_equal_to_iter(); }
-
-  struct _Iter_equal_to_val
-  {
-template<typename _Iterator, typename _Value>
-  _GLIBCXX20_CONSTEXPR
-  bool
-  operator()(_Iterator __it, _Value& __val) const
-  { return *__it == __val; }
-  };
+  inline _Equal_to
+  __equal_to()
+  { return _Equal_to(); }
 
-  _GLIBCXX20_CONSTEXPR
-  inline _Iter_equal_to_val
-  __iter_equal_to_val()
-  { return _Iter_equal_to_val(); }
-
-  _GLIBCXX20_CONSTEXPR
-  inl

[PATCH] Fortran: function returning contiguous class array [PR105543]

2023-12-06 Thread Harald Anlauf
Dear all,

the attached patch fixes a rejects-valid for functions returning
a contiguous CLASS result.  The problem occurs because attr.class_ok
is inconsistent between sym and sym->result at the time the check
of the contiguous attribute is done.

I first thought that resolve_fl_procedure would be the right place
to do this fixup, but this is invoked only later from resolve_symbol.
Another attempt to put a fix directly after the recursive call to
resolve_symbol for sym->result led to frightening regressions in
the testsuite, so I stayed with the attached simple solution.
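
As a rough picture of the fixup, the following sketch models it with heavily simplified stand-in types. Everything here is hypothetical: symbol_sketch is not gfc_symbol, the mp_flag condition from the patch is left out, and the null check on the result pointer is an addition made for this stand-alone example.

```cpp
// Hypothetical, heavily simplified stand-ins for gfc_symbol and its
// attributes; only the fields involved in the fixup are modelled.
enum basic_type { BT_UNKNOWN_SKETCH, BT_CLASS_SKETCH };

struct symbol_sketch
{
  basic_type type;
  bool class_ok;
  symbol_sketch *result;   // result variable, if any
};

// Copy class_ok from the resolved result variable to the function symbol
// so that later checks on the symbol see a consistent value.
void sync_class_ok (symbol_sketch &sym)
{
  if (sym.type == BT_CLASS_SKETCH && sym.result && sym.result->class_ok)
    sym.class_ok = true;
}
```

The sketch only shows the direction of the copy: the flag flows from the result variable to the function symbol, never the other way round.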

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 15810999b2f5cb4d8fbd69cb488c9b0c58e6 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 6 Dec 2023 20:42:27 +0100
Subject: [PATCH] Fortran: function returning contiguous class array [PR105543]

gcc/fortran/ChangeLog:

	PR fortran/105543
	* resolve.cc (resolve_symbol): For a CLASS-valued function having a
	RESULT clause, ensure that attr.class_ok is set for its symbol as
	well as for its resolved result variable.

gcc/testsuite/ChangeLog:

	PR fortran/105543
	* gfortran.dg/contiguous_13.f90: New test.
---
 gcc/fortran/resolve.cc  |  5 +
 gcc/testsuite/gfortran.dg/contiguous_13.f90 | 22 +
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/contiguous_13.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 166b702cd9a..4fe0e7202e5 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -16102,6 +16102,11 @@ resolve_symbol (gfc_symbol *sym)
   specification_expr = saved_specification_expr;
 }

+  /* For a CLASS-valued function with a result variable, affirm that it has
+ been resolved also when looking at the symbol 'sym'.  */
+  if (mp_flag && sym->ts.type == BT_CLASS && sym->result->attr.class_ok)
+sym->attr.class_ok = sym->result->attr.class_ok;
+
   if (sym->ts.type == BT_CLASS && sym->attr.class_ok && sym->ts.u.derived
   && CLASS_DATA (sym))
 {
diff --git a/gcc/testsuite/gfortran.dg/contiguous_13.f90 b/gcc/testsuite/gfortran.dg/contiguous_13.f90
new file mode 100644
index 000..8c6784432c9
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/contiguous_13.f90
@@ -0,0 +1,22 @@
+! { dg-do compile }
+! PR fortran/105543 - function returning contiguous class array
+! Contributed by martin 
+
+module func_contiguous
+  implicit none
+  type :: a
+  end type a
+contains
+  function create1 () result(x)
+class(a), dimension(:), contiguous, pointer :: x
+  end
+  function create2 ()
+class(a), dimension(:), contiguous, pointer :: create2
+  end
+  function create3 () result(x)
+class(*), dimension(:), contiguous, pointer :: x
+  end
+  function create4 ()
+class(*), dimension(:), contiguous, pointer :: create4
+  end
+end module func_contiguous
--
2.35.3



Re: [Committed V2] RISC-V: Fix VSETVL PASS bug

2023-12-06 Thread Patrick O'Neill

Hi Juzhe,

An assert added in this patch is firing on a testcase on rv64gcv:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112888

Thanks,
Patrick

On 12/6/23 06:26, Juzhe-Zhong wrote:

As PR112855 mentioned, the VSETVL PASS inserts vsetvli in an unexpected location.

This is due to 2 reasons:
1. Incorrect computation of the transparent LCM data.  We need to check VL operand defs 
and uses.
2. Incorrect fusion of an unrelated edge, i.e. an edge that never reaches the vsetvl 
expression.

PR target/112855

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc 
(pre_vsetvl::compute_lcm_local_properties): Fix transparent LCM data.
(pre_vsetvl::earliest_fuse_vsetvl_info): Disable earliest fusion for 
unrelated edge.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr112855.c: New test.


Re: {Patch, fortran] PR112834 - Class array function selector causes chain of syntax and other spurious errors

2023-12-06 Thread Harald Anlauf

Hi Paul,

On 12/6/23 17:09, Paul Richard Thomas wrote:

Dear All,

This patch was rescued from my ill-fated and long winded attempt to provide
a fix-up for function selector references, where the function is parsed
after the procedure containing the associate/select type construct (PRs
89645 and 99065). The fix-ups broke down completely once these constructs
were enclosed by another associate construct, where the selector is a
derived type or class function. My inclination now is to introduce two pass
parsing for contained procedures.

Returning to PR112834, the patch is simple enough and is well described by
the change logs. PR111853 was fixed as a side effect of the bigger patch.
Steve Kargl had also posted the same fix on the PR.


the patch looks good, but could you please check the coding style?

@@ -6550,7 +6551,19 @@ select_type_set_tmp (gfc_typespec *ts)
   sym = tmp->n.sym;
   gfc_add_type (sym, ts, NULL);

-  if (selector->ts.type == BT_CLASS && selector->attr.class_ok
+  /* If the SELECT TYPE selector is a function we might be able to
obtain
+a typespec from the result. Since the function might not have been
+parsed yet we have to check that there is indeed a result symbol.  */
+  if (selector->ts.type == BT_UNKNOWN
+ && gfc_state_stack->construct
+ && (expr2 = gfc_state_stack->construct->expr2)
+ && expr2->expr_type == EXPR_FUNCTION
+ && expr2->symtree
+ && expr2->symtree->n.sym && expr2->symtree->n.sym->result)

Adding a line break before the second '&&' makes it more readable.

+   selector->ts = expr2->symtree->n.sym->result->ts;

@@ -2037,7 +2038,12 @@ trans_associate_var (gfc_symbol *sym,
gfc_wrapped_block *block)

   /* Class associate-names come this way because they are
 unconditionally associate pointers and the symbol is scalar.  */
-  if (sym->ts.type == BT_CLASS && CLASS_DATA (sym)->attr.dimension)
+  if (sym->ts.type == BT_CLASS && e->expr_type ==EXPR_FUNCTION)

There should be whitespace before AND after '=='.

+   {
+ gfc_conv_expr (&se, e);
+ se.expr = gfc_evaluate_now (se.expr, &se.pre);
+   }
+  else if (sym->ts.type == BT_CLASS && CLASS_DATA
(sym)->attr.dimension)
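
For illustration, the two style points above applied to a stand-alone condition might look like this. The types are hypothetical stand-ins and not gfortran's; only the layout of the condition matters.

```cpp
// Hypothetical stand-in types; only the layout of the condition matters.
enum expr_kind { EXPR_VARIABLE_SKETCH, EXPR_FUNCTION_SKETCH };
struct sym_sketch { const sym_sketch *result; };
struct expr_sketch { expr_kind expr_type; const sym_sketch *sym; };

bool
function_with_result_p (const expr_sketch *e)
{
  /* One condition per line, each continuation line starting with '&&',
     and a space on both sides of '=='.  */
  return (e
          && e->expr_type == EXPR_FUNCTION_SKETCH
          && e->sym
          && e->sym->result);
}
```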


Regression tests - OK for trunk and 13-branch?

Paul



Thanks for the patch!

Harald



Re: {Patch, fortran] PR112834 - Class array function selector causes chain of syntax and other spurious errors

2023-12-06 Thread Jerry D

On 12/6/23 8:09 AM, Paul Richard Thomas wrote:

Dear All,

This patch was rescued from my ill-fated and long winded attempt to 
provide a fix-up for function selector references, where the function is 
parsed after the procedure containing the associate/select type 
construct (PRs 89645 and 99065). The fix-ups broke down completely once 
these constructs were enclosed by another associate construct, where the 
selector is a derived type or class function. My inclination now is to 
introduce two pass parsing for contained procedures.


Returning to PR112834, the patch is simple enough and is well described 
by the change logs. PR111853 was fixed as a side effect of the bigger 
patch. Steve Kargl had also posted the same fix on the PR.


Regression tests - OK for trunk and 13-branch?

Paul



Hi Paul, I am taking a crack at this. It looks reasonable to me. 
Certainly OK for trunk, and then, if no fallout, 13 at your discretion.


Regards,

Jerry



Re: [PATCH] RISC-V: Remove xfail from ssa-fre-3.c testcase

2023-12-06 Thread Palmer Dabbelt

On Wed, 06 Dec 2023 10:48:30 PST (-0800), Vineet Gupta wrote:


On 12/6/23 08:22, Palmer Dabbelt wrote:

Ran the test case at 122e7b4f9d0c2d54d865272463a1d812002d0a5c where the xfail

That's the original port submission, I'm actually kind of surprised it
still builds/works at all.


Full toolchain build would have been a stretch (matching pairing
binutils etc).
So I'd asked Edwin to just do a minimal cc1 build.


Ah, good idea.  I've gotten hung up a bunch of times trying to reproduce 
old stuff.  I'd always been trying full toolchain builds, I bet cc1 
would have a better chance of building for me.


veclower: improve selection of vector mode when lowering [PR 112787]

2023-12-06 Thread Andre Vieira (lists)

Hi,

This patch addresses the issue reported in PR target/112787 by improving the
compute type selection.  We do this by not considering types with more elements
than the type we are lowering since we'd reject such types anyway.
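
The selection rule can be pictured with a small model. This is only a sketch under assumptions: mode_sketch and widest_supported_nunits are hypothetical names, element counts are plain unsigned integers rather than poly_int64, and "supported" stands in for the optab_handler check.

```cpp
#include <vector>

// Hypothetical description of one candidate vector mode.
struct mode_sketch
{
  unsigned nunits;   // number of elements
  bool supported;    // the target implements the operation in this mode
};

// Return the element count of the widest supported mode, or 0 if none.
// max_nunits == 0 means "no limit"; otherwise only modes with strictly
// fewer elements than max_nunits are considered, mirroring the
// known_lt test in the patch.
unsigned widest_supported_nunits (const std::vector<mode_sketch> &modes,
                                  unsigned max_nunits = 0)
{
  unsigned best = 0;
  for (const mode_sketch &m : modes)
    if (m.supported
        && m.nunits > best
        && (max_nunits == 0 || m.nunits < max_nunits))
      best = m.nunits;
  return best;
}
```

Without the limit the widest supported mode always wins, even when it has more elements than the type being lowered; with the limit the search settles on the widest mode that is actually usable.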

gcc/ChangeLog:

PR target/112787
* tree-vect-generic.cc (type_for_widest_vector_mode): Add a parameter to
control maximum amount of elements in resulting vector mode.
(get_compute_type): Restrict vector_compute_type to a mode no wider
than the original compute type.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr112787.c: New test.

Bootstrapped and regression tested on aarch64-unknown-linux-gnu and 
x86_64-pc-linux-gnu.


Is this OK for trunk?

Kind regards,
Andre Vieira

diff --git a/gcc/testsuite/gcc.target/aarch64/pr112787.c b/gcc/testsuite/gcc.target/aarch64/pr112787.c
new file mode 100644
index 
..caca1bf7ef447e4489b2c134d7200a4afd16763f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr112787.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -march=armv8-a+sve -mcpu=neoverse-n2" } */
+
+typedef int __attribute__((__vector_size__ (64))) vec;
+
+vec fn (vec a, vec b)
+{
+  return a + b;
+}
+
+/* { dg-final { scan-assembler-times {add\tv[0-9]+} 4 } } */
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index a7e6cb87a5e31e3dd2a893ea5652eeebf8d5d214..2dbf3c8f5f64f2623944110dbc371fe0944198f0 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -1347,7 +1347,7 @@ optimize_vector_constructor (gimple_stmt_iterator *gsi)
TYPE, or NULL_TREE if none is found.  */
 
 static tree
-type_for_widest_vector_mode (tree type, optab op)
+type_for_widest_vector_mode (tree type, optab op, poly_int64 max_nunits = 0)
 {
   machine_mode inner_mode = TYPE_MODE (type);
   machine_mode best_mode = VOIDmode, mode;
@@ -1371,7 +1371,9 @@ type_for_widest_vector_mode (tree type, optab op)
   FOR_EACH_MODE_FROM (mode, mode)
 if (GET_MODE_INNER (mode) == inner_mode
&& maybe_gt (GET_MODE_NUNITS (mode), best_nunits)
-   && optab_handler (op, mode) != CODE_FOR_nothing)
+   && optab_handler (op, mode) != CODE_FOR_nothing
+   && (known_eq (max_nunits, 0)
+   || known_lt (GET_MODE_NUNITS (mode), max_nunits)))
   best_mode = mode, best_nunits = GET_MODE_NUNITS (mode);
 
   if (best_mode == VOIDmode)
@@ -1702,7 +1704,8 @@ get_compute_type (enum tree_code code, optab op, tree type)
  || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))
 {
   tree vector_compute_type
-   = type_for_widest_vector_mode (TREE_TYPE (type), op);
+   = type_for_widest_vector_mode (TREE_TYPE (type), op,
+  TYPE_VECTOR_SUBPARTS (compute_type));
   if (vector_compute_type != NULL_TREE
  && subparts_gt (compute_type, vector_compute_type)
  && maybe_ne (TYPE_VECTOR_SUBPARTS (vector_compute_type), 1U)


Re: [PATCH] RISC-V: Remove xfail from ssa-fre-3.c testcase

2023-12-06 Thread Vineet Gupta


On 12/6/23 08:22, Palmer Dabbelt wrote:
>> Ran the test case at 122e7b4f9d0c2d54d865272463a1d812002d0a5c where the xfail
> That's the original port submission, I'm actually kind of surprised it 
> still builds/works at all.

Full toolchain build would have been a stretch (matching pairing
binutils etc).
So I'd asked Edwin to just do a minimal cc1 build.


Re: [PATCH v6] aarch64: New RTL optimization pass avoid-store-forwarding.

2023-12-06 Thread Philipp Tomsich
On Wed, 6 Dec 2023 at 23:32, Richard Biener  wrote:
>
> On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis
>  wrote:
> >
> > This is an RTL pass that detects store forwarding from stores to larger 
> > loads (load pairs).
> >
> > This optimization is SPEC2017-driven and was found to be beneficial for 
> > some benchmarks,
> > through testing on ampere1/ampere1a machines.
> >
> > For example, it can transform cases like
> >
> > str  d5, [sp, #320]
> > fmul d5, d31, d29
> > ldp  d31, d17, [sp, #312] # Large load from small store
> >
> > to
> >
> > str  d5, [sp, #320]
> > fmul d5, d31, d29
> > ldr  d31, [sp, #312]
> > ldr  d17, [sp, #320]
> >
> > Currently, the pass is disabled by default on all architectures and enabled 
> > by a target-specific option.
> >
> > If deemed beneficial enough for a default, it will be enabled on 
> > ampere1/ampere1a,
> > or other architectures as well, without needing to be turned on by this 
> > option.
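
The condition in the quoted example (a wide load that overlaps an earlier, narrower store) can be sketched as below. This is an illustrative model only, not the pass itself: access_sketch and both function names are hypothetical, accesses are reduced to a byte offset and size from a shared base register, and the distance threshold the patch adds as a param is not modelled.

```cpp
// Hypothetical model of a memory access as a byte offset from a common
// base register plus a size in bytes.
struct access_sketch
{
  long offset;
  long size;
};

// True if the two byte ranges share at least one byte.
bool overlaps (const access_sketch &a, const access_sketch &b)
{
  return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// True if a load is wider than an earlier store it overlaps, i.e. the
// kind of pair the quoted example rewrites into two narrower loads.
bool wide_load_after_narrow_store (const access_sketch &store,
                                   const access_sketch &load)
{
  return load.size > store.size && overlaps (store, load);
}
```

In the example, the str covers bytes 320..327 and the ldp covers 312..327, so the check fires; after the split, neither 8-byte ldr is wider than the store.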
>
> What is aarch64-specific about the pass?
>
> I see an increasingly large number of target specific passes pop up (probably
> for the excuse we can generalize them if necessary).  But GCC isn't LLVM
> and this feels like getting out of hand?

We had an OK from Richard Sandiford on the earlier (v5) version with
v6 just fixing an obvious bug... so I was about to merge this earlier
just when you commented.

Given that this had months of test exposure on our end, I would prefer
to move this forward for GCC14 in its current form.
The project of replacing architecture-specific store-forwarding passes
with a generalized infrastructure could then be addressed in the GCC15
timeframe (or beyond)?

--Philipp.

>
> The x86 backend also has its store-forwarding "pass" as part of mdreorg
> in ix86_split_stlf_stall_load.
>
> Richard.
>
> > Bootstrapped and regtested on aarch64-linux.
> >
> > gcc/ChangeLog:
> >
> > * config.gcc: Add aarch64-store-forwarding.o to extra_objs.
> > * config/aarch64/aarch64-passes.def (INSERT_PASS_AFTER): New pass.
> > * config/aarch64/aarch64-protos.h 
> > (make_pass_avoid_store_forwarding): Declare.
> > * config/aarch64/aarch64.opt (mavoid-store-forwarding): New option.
> > (aarch64-store-forwarding-threshold): New param.
> > * config/aarch64/t-aarch64: Add aarch64-store-forwarding.o
> > * doc/invoke.texi: Document new option and new param.
> > * config/aarch64/aarch64-store-forwarding.cc: New file.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/ldp_ssll_no_overlap_address.c: New test.
> > * gcc.target/aarch64/ldp_ssll_no_overlap_offset.c: New test.
> > * gcc.target/aarch64/ldp_ssll_overlap.c: New test.
> >
> > Signed-off-by: Manos Anagnostakis 
> > Co-Authored-By: Manolis Tsamis 
> > Co-Authored-By: Philipp Tomsich 
> > ---
> > Changes in v6:
> > - An obvious change. insn_cnt was incremented only on
> >   stores and not for every insn in the bb. Now restored.
> >
> >  gcc/config.gcc|   1 +
> >  gcc/config/aarch64/aarch64-passes.def |   1 +
> >  gcc/config/aarch64/aarch64-protos.h   |   1 +
> >  .../aarch64/aarch64-store-forwarding.cc   | 318 ++
> >  gcc/config/aarch64/aarch64.opt|   9 +
> >  gcc/config/aarch64/t-aarch64  |  10 +
> >  gcc/doc/invoke.texi   |  11 +-
> >  .../aarch64/ldp_ssll_no_overlap_address.c |  33 ++
> >  .../aarch64/ldp_ssll_no_overlap_offset.c  |  33 ++
> >  .../gcc.target/aarch64/ldp_ssll_overlap.c |  33 ++
> >  10 files changed, 449 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/config/aarch64/aarch64-store-forwarding.cc
> >  create mode 100644 
> > gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_address.c
> >  create mode 100644 
> > gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_offset.c
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_ssll_overlap.c
> >
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 6450448f2f0..7c48429eb82 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -350,6 +350,7 @@ aarch64*-*-*)
> > cxx_target_objs="aarch64-c.o"
> > d_target_objs="aarch64-d.o"
> > extra_objs="aarch64-builtins.o aarch-common.o 
> > aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o 
> > aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2.o 
> > aarch64-sve-builtins-sme.o cortex-a57-fma-steering.o aarch64-speculation.o 
> > falkor-tag-collision-avoidance.o aarch-bti-insert.o aarch64-cc-fusion.o"
> > +   extra_objs="${extra_objs} aarch64-store-forwarding.o"
> > target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.cc 
> > \$(srcdir)/config/aarch64/aarch64-sve-builtins.h 
> > \$(srcdir)/config/aarch64/aarch64-sve-builtins.cc"
> > target_has_targetm_common=yes
> > ;;
> > diff --git a/gcc/config/aarch64/aarch64-passes.def 
> > b/gcc/config/aarch64/aarch64-passes.def
> > index 662a1

Re: [PATCH] remove qmtest-related Makefile targets

2023-12-06 Thread Eric Gallager
On Wed, Dec 6, 2023 at 12:56 PM Jeff Law  wrote:
> On 12/5/23 09:41, Eric Gallager wrote:
> > On GitHub, Joseph Myers (@jsm28 there) says in MentorEmbedded/qmtest#1
> > that the qmtest-related targets should have been removed long ago. This
> > patch does so.
> >
> > Ref:
> > https://github.com/MentorEmbedded/qmtest/issues/1
> >
> > gcc/ChangeLog:
> >
> >   * Makefile.in: Remove qmtest-related targets.
> In hindsight I probably should have been more supportive of the QMTest
> effort.  But it's dead now.
>
> OK for the trunk.
>
> jeff

Thanks, committed as r14-6230-gec266cbb859160:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ec266cbb859160aa207b6b292cfd974280ca8ff9


[pushed] v2: diagnostics: prettify JSON output formats

2023-12-06 Thread David Malcolm
I messed up the testing of the previous version of this patch, and
it turned out to have regressions.

Whilst fixing them, it turned out I needed a way to disable the
formatting for some test cases, so this version of the patch restricts
the formatting to just the diagnostics format, and adds a
-fno-diagnostics-json-formatting for turning it off.

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-6228-g3bd8241a1f1982.

For reference, here's what I've pushed:

Previously our JSON output emitted the JSON all on one line, with
no indentation to show the structure of the values.

Although it's easy to reformat such output (e.g. with
"python -m json.tool"), I've found it's a pain to need to do so;
e.g. my text editor sometimes hangs when opening a multimegabyte
JSON file all on one line.  Similarly, diffing is easier if the
JSON is already formatted.

This patch adds whitespace to the JSON output to show the structure.
It turned out to be fairly easy to implement using pretty_printer's
existing indentation machinery.

The patch uses this formatting for the various JSON-based diagnostic
output formats.

For example, with this patch, the output from
-fdiagnostics-format=json-stderr looks like:

[{"kind": "warning",
  "message": "stack-based buffer overflow",
  "option": "-Wanalyzer-out-of-bounds",
  "option_url": "https://gcc.gnu.org/onlinedocs/gcc/Static-Analyzer-Options.html#index-Wanalyzer-out-of-bounds",
  "children": [{"kind": "note",
                "message": "write of 350 bytes to beyond the end of ‘buf’",
                "locations": [{"caret": {"file": "../../src/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-19.c",
                                         "line": 20,
                                         "display-column": 3,
                                         "byte-column": 3,
                                         "column": 3},
                               "finish": {"file": "../../src/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-19.c",
                                          "line": 20,
                                          "display-column": 27,
                                          "byte-column": 27,
                                          "column": 27}}],
                "escape-source": false},
               {"kind": "note",
                "message": "valid subscripts for ‘buf’ are ‘[0]’ to ‘[99]’",
                "locations": [{"caret": {"file": "../../src/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-19.c",
                                         "line": 20,
                                         "display-column": 3,
                                         "byte-column": 3,
                                         "column": 3},
                               "finish": {"file": "../../src/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-19.c",
                                          "line": 20,
                                          "display-column": 27,
                                          "byte-column": 27,
                                          "column": 27}}],
                "escape-source": false}],
  "column-origin": 1,
...snip...]

I was able to update almost all of our DejaGnu test cases for JSON to
handle this format tweak, and IMHO it improved the readability of these
test cases, but a couple were more awkward.  Hence I added
-fno-diagnostics-json-formatting as an option to disable this
formatting.

The formatting does not affect the output of -fsave-optimization-record
or the JSON output from gcov (but this could be enabled if desirable).

gcc/analyzer/ChangeLog:
* engine.cc (dump_analyzer_json): Use
flag_diagnostics_json_formatting.

gcc/ChangeLog:
* common.opt (fdiagnostics-json-formatting): New.
* diagnostic-format-json.cc: Add "formatted" boolean
to json_output_format and subclasses, and to the
diagnostic_output_format_init_json_* functions.  Use it when
printing JSON.
* diagnostic-format-sarif.cc: Likewise for sarif_builder,
sarif_output_format, and the various
diagnostic_output_format_init_sarif_* functions.
* diagnostic.cc (diagnostic_output_format_init): Add
"json_formatting" boolean and pass on to the various cases.
* diagnostic.h (diagnostic_output_format_init): Add
"json_formatted" param.
(diagnostic_output_format_init_json_stderr): Add "formatted" param
(diagnostic_output_format_init_json_file): Likewise.
(diagnostic_output_format_init_sarif_stderr): Likewise.
(diagnostic_output_format_init_sarif_file): Likewise.
(diagnostic_output_format_init_sarif_stream): Likewise.
* doc/invoke.texi (-fdiagnostics-format=json): Remove discussion
about JSON output needing formatting.
(-fno-diagnostics-json-formatting): Add.
* gcc.cc (driver_handle_option): Use
opts->x_flag_diagnostics_json_formatting

Re: [C PATCH, v2] Add Walloc-size to warn about insufficient size in allocations [PR71219]

2023-12-06 Thread Eric Gallager
On Wed, Dec 6, 2023 at 10:13 AM Martin Uecker  wrote:
>
> Am Mittwoch, dem 06.12.2023 um 16:01 +0100 schrieb Jakub Jelinek:
> > On Wed, Dec 06, 2023 at 03:56:10PM +0100, Martin Uecker wrote:
> > > > That would be my preference because then the allocation size is
> > > > correct and it is purely a style warning.
> > > > It doesn't follow how the warning is described:
> > > > "Warn about calls to allocation functions decorated with attribute
> > > > @code{alloc_size} that specify insufficient size for the target type of
> > > > the pointer the result is assigned to"
> > > > when the size is certainly sufficient.
> > >
> > > > The C standard defines the semantics of calloc as allocating space
> > > > for 'nmemb' objects of size 'size', so I would say
> > > the warning and its description are correct because
> > > if you call calloc with '1' as size argument but
> > > the object size is larger then you specify an
> > > insufficient size for the object given the semantical
> > > description of calloc in the standard.
> >
> > 1 is sizeof (char), so you ask for an array of sizeof (struct ...)
> > chars and store the struct into it.
>
> If you use
>
> char *p = calloc(sizeof(struct foo), 1);
>
> it does not warn.
>
> >
> > > > We have the -Wmemset-transposed-args warning, couldn't we
> > > > have a similar one for calloc, and perhaps do it solely in
> > > > the case where one uses sizeof of the type used in the cast
> > > > pointer?
> > > > So warn for
> > > > (struct S *) calloc (sizeof (struct S), 1)
> > > > or
> > > > (struct S *) calloc (sizeof (struct S), n)
> > > > but not for
> > > > (struct S *) calloc (4, 15)
> > > > or
> > > > (struct S *) calloc (sizeof (struct T), 1)
> > > > or similar?  Of course check for compatible types of TYPE_MAIN_VARIANTs.
> > >
> > > > Yes, although in contrast to -Wmemset-transposed-args
> > > this would be considered a "style" option which then
> > > nobody would activate.  And if we put it into -Wextra
> > > then we have the same situation as today.
> >
> > Well, the significant difference would be that users would
> > know that they got the size for the allocation right, just
> > that a coding style says it is better to put the type's size
> > as the second argument rather than first, and they could disable
> > that warning separately from -Walloc-size and still get warnings
> > on (struct S *) calloc (1, 1) or (struct S *) malloc (3) if
> > sizeof (struct S) is 24...
>
> Ok.
>
> Note that another limitation of the current version is that it
> does not warn for
>
> ... = (struct S*) calloc (...)
>
> with the cast (which is non-idiomatic in C).

Note that -Wc++-compat encourages the cast, for people who are trying
to make their code compilable as both C and C++.

> This is also
> something I would like to address in the future and would be
> more important for the C++ version.  But for this case it
> should probably use the type of the cast and the warning
> needs to be added somewhere else in the FE.
>
>
> Martin
>
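
For readers following along, the call shapes discussed in this thread can be sketched as below. This is only an illustration: struct S and the helper names are made up here, and the comments about which forms are diagnosed restate what is said in the thread rather than verified compiler output.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type; it is 24 bytes on typical LP64 targets, which merely
   echoes the "sizeof (struct S) is 24" remark in the thread.  */
struct S { long a, b, c; };

/* Conventional order: element count first, element size second.  */
struct S *
alloc_conventional (size_t n)
{
  return calloc (n, sizeof (struct S));
}

/* Transposed order: the total size is the same, so the allocation is
   large enough.  Per the thread, this is the shape that would only
   deserve a style diagnostic.  */
struct S *
alloc_transposed (size_t n)
{
  return calloc (sizeof (struct S), n);
}

/* The same call with the explicit cast that -Wc++-compat encourages.
   Per the thread, the current patch does not look through this cast.  */
struct S *
alloc_with_cast (size_t n)
{
  return (struct S *) calloc (n, sizeof (struct S));
}

/* Not written as code because it would be a real bug:
     struct S *p = malloc (3);
   is the genuinely insufficient allocation the warning is aimed at.  */
```

All three helpers return zeroed storage for n objects; only the argument order and the cast differ, which is exactly the distinction being argued about above.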


Re: [PATCH 1/1] RISC-V: Add support for XCVbitmanip extension in CV32E40P

2023-12-06 Thread Jeff Law




On 12/5/23 08:30, Kito Cheng wrote:

index 7d7b952d817..e7d4ad1760c 100644
--- a/gcc/config/riscv/corev.md
+++ b/gcc/config/riscv/corev.md
@@ -27,6 +27,25 @@

;;CORE-V EVENT LOAD
UNSPECV_CV_ELW
+
+  ;;CORE-V BITMANIP
+  UNSPEC_CV_BITMANIP_EXTRACT
+  UNSPEC_CV_BITMANIP_EXTRACT_INSN
+  UNSPEC_CV_BITMANIP_EXTRACTR_INSN
+  UNSPEC_CV_BITMANIP_EXTRACTU
+  UNSPEC_CV_BITMANIP_EXTRACTU_INSN
+  UNSPEC_CV_BITMANIP_EXTRACTUR_INSN
+  UNSPEC_CV_BITMANIP_INSERT
+  UNSPEC_CV_BITMANIP_INSERT_INSN
+  UNSPEC_CV_BITMANIP_INSERTR_INSN


You could reference bfe, sbfx and ubfx instructions in aarch64.md
to see how to write the insert and extract bit with RTL code.


+  UNSPEC_CV_BITMANIP_BCLR
+  UNSPEC_CV_BITMANIP_BCLR_INSN
+  UNSPEC_CV_BITMANIP_BCLRR_INSN
+  UNSPEC_CV_BITMANIP_BSET
+  UNSPEC_CV_BITMANIP_BSET_INSN
+  UNSPEC_CV_BITMANIP_BSETR_INSN


Just using generic RTL code for bset and bclr is fine; you could
reference bitmanip.md

Agreed.  And as a general principle if we can reasonably express the 
semantics of an instruction with RTL, we generally should.  Doing so 
gives the optimizers a chance to improve stuff.


I haven't looked at the patches, but the same might apply to the 
extractions & insertions, though that's more complex in that there are 
multiple implementations and I suspect some general cleanups would 
likely be necessary for that to work.  We started to look at it a bit, 
but concluded there were bigger fish to fry.
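
As a rough illustration of why these operations do not need unspecs, the semantics in question are ordinary mask-and-shift arithmetic, which is the kind of thing generic RTL can describe. The C sketch below uses made-up helper names and a generic (position, width) interface; it is an assumption for illustration and does not claim to match the XCVbitmanip operand encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Mask of WIDTH one-bits starting at bit POS.
   Requires pos < 32 and pos + width <= 32.  */
static uint32_t
field_mask (unsigned pos, unsigned width)
{
  uint32_t ones = width >= 32 ? UINT32_MAX : ((UINT32_C (1) << width) - 1);
  return ones << pos;
}

/* Set / clear a run of bits: a plain IOR / AND with a mask.  */
uint32_t
bit_set_field (uint32_t x, unsigned pos, unsigned width)
{
  return x | field_mask (pos, width);
}

uint32_t
bit_clear_field (uint32_t x, unsigned pos, unsigned width)
{
  return x & ~field_mask (pos, width);
}

/* Zero-extending extract: shift right, then mask.  */
uint32_t
extract_unsigned (uint32_t x, unsigned pos, unsigned width)
{
  return (x >> pos) & field_mask (0, width);
}

/* Sign-extending extract (width >= 1): flip the field's sign bit, then
   subtract it again so the upper bits are filled with copies of it.  */
int32_t
extract_signed (uint32_t x, unsigned pos, unsigned width)
{
  uint32_t field = extract_unsigned (x, pos, width);
  uint32_t sign = UINT32_C (1) << (width - 1);
  return (int32_t) ((field ^ sign) - sign);
}

/* Insert the low WIDTH bits of VALUE into X at POS.  */
uint32_t
insert_field (uint32_t x, uint32_t value, unsigned pos, unsigned width)
{
  uint32_t mask = field_mask (pos, width);
  return (x & ~mask) | ((value << pos) & mask);
}
```

Written this way, each helper is a combination of shifts, AND, IOR and NOT, which is what lets the optimizers recognise and combine the patterns instead of treating them as opaque.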


jeff


Re: [PATCH] remove qmtest-related Makefile targets

2023-12-06 Thread Jeff Law




On 12/5/23 09:41, Eric Gallager wrote:

On GitHub, Joseph Myers (@jsm28 there) says in MentorEmbedded/qmtest#1
that the qmtest-related targets should have been removed long ago. This
patch does so.

Ref:
https://github.com/MentorEmbedded/qmtest/issues/1

gcc/ChangeLog:

* Makefile.in: Remove qmtest-related targets.

In hindsight I probably should have been more supportive of the QMTest 
effort.  But it's dead now.


OK for the trunk.

jeff


Re: [PATCH] libgcc: Avoid -Wbuiltin-declaration-mismatch warnings in emutls.c

2023-12-06 Thread Jeff Law




On 12/6/23 03:04, Jakub Jelinek wrote:

Hi!

When libgcc is being built in --disable-tls configuration or on
a target without native TLS support, one gets annoying warnings:
../../../../libgcc/emutls.c:61:7: warning: conflicting types for built-in 
function ‘__emutls_get_address’; expected ‘void *(void *)’ 
[-Wbuiltin-declaration-mismatch]
61 | void *__emutls_get_address (struct __emutls_object *);
   |   ^~~~
../../../../libgcc/emutls.c:63:6: warning: conflicting types for built-in 
function ‘__emutls_register_common’; expected ‘void(void *, unsigned int,  
unsigned int,  void *)’ [-Wbuiltin-declaration-mismatch]
63 | void __emutls_register_common (struct __emutls_object *, word, word, 
void *);
   |  ^~~~
../../../../libgcc/emutls.c:140:1: warning: conflicting types for built-in 
function ‘__emutls_get_address’; expected ‘void *(void *)’ 
[-Wbuiltin-declaration-mismatch]
   140 | __emutls_get_address (struct __emutls_object *obj)
   | ^~~~
../../../../libgcc/emutls.c:204:1: warning: conflicting types for built-in 
function ‘__emutls_register_common’; expected ‘void(void *, unsigned int,  
unsigned int,  void *)’ [-Wbuiltin-declaration-mismatch]
   204 | __emutls_register_common (struct __emutls_object *obj,
   | ^~~~
The thing is that in that case __emutls_get_address and
__emutls_register_common are builtins, and are declared with void *
arguments rather than struct __emutls_object *.
Now, struct __emutls_object is a type private to libgcc/emutls.c and the
middle-end creates on demand when calling the builtins a similar structure
(with small differences, like not having the union in there).

We have a precedent for this e.g. for fprintf or strftime builtins where
the builtins are created with magic fileptr_type_node or const_tm_ptr_type_node
types and then match it with user definition of pointers to some structure,
but I think for this case users should never define these functions
themselves nor call them and having special types for them in the compiler
would mean extra compile time spent during compiler initialization and more
GC data, so I think it is better to keep the compiler as is.

On the library side, there is an option to just follow what the
compiler is doing and do
  EMUTLS_ATTR void
-__emutls_register_common (struct __emutls_object *obj,
+__emutls_register_common (void *xobj,
word size, word align, void *templ)
  {
+  struct __emutls_object *obj = (struct __emutls_object *) xobj;
but that will make e.g. libabigail complain about ABI change in libgcc.

So, the patch just turns the warning off.
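
The two approaches weighed above can be sketched in isolation as follows. The type and function names are hypothetical stand-ins, not the ones in emutls.c, and because these functions are not builtins the pragma has nothing to suppress here; the push/pop scoping is also just one possible way to apply it and is not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a type private to one translation unit.  */
struct demo_object
{
  size_t size;
  size_t align;
};

/* Option 1: take void * in the signature, as the compiler's builtin
   declaration does, and convert inside the function.  This changes the
   visible prototype, which is what ABI checkers would flag.  */
size_t
demo_size_via_void (void *xobj)
{
  struct demo_object *obj = (struct demo_object *) xobj;
  return obj->size;
}

/* Option 2: keep the struct pointer in the signature and silence the
   diagnostic around the declaration instead.  */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
size_t demo_size_via_struct (struct demo_object *obj);
#pragma GCC diagnostic pop

size_t
demo_size_via_struct (struct demo_object *obj)
{
  return obj->size;
}
```

Both functions behave identically at run time; the difference is purely in the declared parameter type, which is why the second option leaves the exported interface untouched.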

Tested on x86_64-linux with --disable-tls, ok for trunk?

2023-12-06  Thomas Schwinge  
Jakub Jelinek  

PR libgcc/109289
* emutls.c: Add GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
pragma.

OK
jeff


Re: [PATCH] testsuite: Adjust for the new permerror -Wincompatible-pointer-types

2023-12-06 Thread Jeff Law




On 12/6/23 05:12, Florian Weimer wrote:

* Yang Yujie:


From: Yang Yujie 
Subject: [PATCH] testsuite: Adjust for the new permerror
  -Wincompatible-pointer-types
To: gcc-patches@gcc.gnu.org
Cc: r...@cebitec.uni-bielefeld.de, mikest...@comcast.net, fwei...@redhat.com,
  Yang Yujie 
Date: Wed,  6 Dec 2023 10:29:31 +0800 (9 hours, 42 minutes, 7 seconds ago)
Message-ID: <20231206022931.33437-1-yangyu...@loongson.cn>

r14-6037 turned -Wincompatible-pointer-types into a permerror,
which causes the following tests to fail.

gcc/testsuite/ChangeLog:

* gcc.dg/fixed-point/composite-type.c: Replace dg-warning with dg-error.
---
  .../gcc.dg/fixed-point/composite-type.c   | 64 +--
  1 file changed, 32 insertions(+), 32 deletions(-)


Looks reasonable to me, but I can't approve it.

We might want to fix that from a policy standpoint :-)

Regardless, this is OK for the trunk.  Thanks Yang for taking care of 
it.  I don't see you in the maintainers file, so I'll go ahead and push 
it momentarily.


jeff


[pushed] diagnostics: use const and references for diagnostic_info

2023-12-06 Thread David Malcolm
No functional change intended.

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r14-6227-g8fc4e6c397e1ce.

gcc/c-family/ChangeLog:
* c-opts.cc (c_diagnostic_finalizer): Make "diagnostic" param
const.

gcc/cp/ChangeLog:
* cp-tree.h (cxx_print_error_function): Make diagnostic_info param
const.
* error.cc (cxx_print_error_function): Likewise.
(cp_diagnostic_starter): Likewise.
(cp_print_error_function): Likewise.

gcc/ChangeLog:
* diagnostic-format-json.cc (on_begin_diagnostic): Convert param
to const reference.
(on_end_diagnostic): Likewise.
(json_output_format::on_end_diagnostic): Likewise.
* diagnostic-format-sarif.cc
(sarif_invocation::add_notification_for_ice): Likewise.
(sarif_result::on_nested_diagnostic): Likewise.
(sarif_ice_notification::sarif_ice_notification): Likewise.
(sarif_builder::end_diagnostic): Likewise.
(sarif_builder::make_result_object): Likewise.
(make_reporting_descriptor_object_for_warning): Likewise.
(sarif_builder::make_locations_arr): Likewise.
(sarif_output_format::on_begin_diagnostic): Likewise.
(sarif_output_format::on_end_diagnostic): Likewise.
* diagnostic.cc (default_diagnostic_starter): Make diagnostic_info
param const.
(default_diagnostic_finalizer): Likewise.
(diagnostic_context::report_diagnostic): Pass diagnostic by
reference to on_{begin,end}_diagnostic.
(diagnostic_text_output_format::on_begin_diagnostic): Convert
param to const reference.
(diagnostic_text_output_format::on_end_diagnostic): Likewise.
* diagnostic.h (diagnostic_starter_fn): Make diagnostic_info param
const.
(diagnostic_finalizer_fn): Likewise.
(diagnostic_output_format::on_begin_diagnostic): Convert param to
const reference.
(diagnostic_output_format::on_end_diagnostic): Likewise.
(diagnostic_text_output_format::on_begin_diagnostic): Likewise.
(diagnostic_text_output_format::on_end_diagnostic): Likewise.
(default_diagnostic_starter): Make diagnostic_info param const.
(default_diagnostic_finalizer): Likewise.
* langhooks-def.h (lhd_print_error_function): Make diagnostic_info
param const.
* langhooks.cc (lhd_print_error_function): Likewise.
* langhooks.h (lang_hooks::print_error_function): Likewise.
* tree-diagnostic.cc (diagnostic_report_current_function):
Likewise.
(default_tree_diagnostic_starter): Likewise.
(virt_loc_aware_diagnostic_finalizer): Likewise.
* tree-diagnostic.h (diagnostic_report_current_function):
Likewise.
(virt_loc_aware_diagnostic_finalizer): Likewise.

gcc/fortran/ChangeLog:
* error.cc (gfc_diagnostic_starter): Make diagnostic_info param
const.
(gfc_diagnostic_finalizer): Likewise.

gcc/jit/ChangeLog:
* dummy-frontend.cc (jit_begin_diagnostic): Make diagnostic_info
param const.
(jit_end_diagnostic): Likewise.  Pass to add_diagnostic by
reference.
* jit-playback.cc (jit::playback::context::add_diagnostic):
Convert diagnostic_info to const reference.
* jit-playback.h (jit::playback::context::add_diagnostic):
Likewise.

gcc/testsuite/ChangeLog:
* g++.dg/plugin/show_template_tree_color_plugin.c
(noop_starter_fn): Make diagnostic_info param const.
* gcc.dg/plugin/diagnostic_group_plugin.c
(test_diagnostic_starter): Likewise.
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
(custom_diagnostic_finalizer): Likewise.
* gcc.dg/plugin/location_overflow_plugin.c
(verify_unpacked_ranges): Likewise.
(verify_no_columns): Likewise.

libcc1/ChangeLog:
* context.cc (plugin_print_error_function): Make diagnostic_info
param const.

Signed-off-by: David Malcolm 
---
 gcc/c-family/c-opts.cc|  2 +-
 gcc/cp/cp-tree.h  |  2 +-
 gcc/cp/error.cc   | 12 ++--
 gcc/diagnostic-format-json.cc | 20 +++
 gcc/diagnostic-format-sarif.cc| 57 ++-
 gcc/diagnostic.cc | 17 +++---
 gcc/diagnostic.h  | 17 +++---
 gcc/fortran/error.cc  |  4 +-
 gcc/jit/dummy-frontend.cc |  7 ++-
 gcc/jit/jit-playback.cc   |  4 +-
 gcc/jit/jit-playback.h|  2 +-
 gcc/langhooks-def.h   |  3 +-
 gcc/langhooks.cc  |  2 +-
 gcc/langhooks.h   |  2 +-
 .../plugin/show_template_tree_color_plugin.c  |  2 +-
 .../gcc.dg/plugin/diagnostic_group_plugin.c   |  2 +-
 .../diagnostic_plugin_test_show_locus.

Re: [PATCH] gettext: disable install, docs targets, libasprintf, threads

2023-12-06 Thread Eric Gallager
On Mon, Dec 4, 2023 at 1:44 PM Tom Tromey  wrote:
>
> > "Arsen" == Arsen Arsenović  writes:
>
> Arsen> Thanks.  I'll wait for the Binutils and GDB maintainers to weigh in
> Arsen> before pushing (plus, I can't push there).
>
> Seems fine to me.  Thank you.
>
> Tom

LGTM; please post once it has been committed.
Thanks,
Eric


Re: [PATCH v6] aarch64: New RTL optimization pass avoid-store-forwarding.

2023-12-06 Thread Manos Anagnostakis
Hi again,

I went and tested the requested changes and found out the following:

1. The pass is currently increasing insn_cnt on a NONJUMP_INSN_P, which is
a subset of NONDEBUG_INSN_P. I think there is no problem with depending on
-g with the current version. Do you see something I don't or did you mean
something else?
2. Not processing all instructions is not letting cselib record all the
effects they have, thus it does not have updated information to find true
forwardings at any given time. I can confirm this since I am witnessing
many unexpected changes on the number of handled cases if I do this only
for loads/stores.

Thanks in advance and please let me know your thoughts on the above.
Manos.

On Wed, Dec 6, 2023 at 5:10 PM Manos Anagnostakis <
manos.anagnosta...@vrull.eu> wrote:

> Hi Richard,
>
> thanks for the useful comments.
>
> On Wed, Dec 6, 2023 at 4:32 PM Richard Biener 
> wrote:
>
>> On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis
>>  wrote:
>> >
>> > This is an RTL pass that detects store forwarding from stores to larger
>> loads (load pairs).
>> >
>> > This optimization is SPEC2017-driven and was found to be beneficial for
>> some benchmarks,
>> > through testing on ampere1/ampere1a machines.
>> >
>> > For example, it can transform cases like
>> >
>> > str  d5, [sp, #320]
>> > fmul d5, d31, d29
>> > ldp  d31, d17, [sp, #312] # Large load from small store
>> >
>> > to
>> >
>> > str  d5, [sp, #320]
>> > fmul d5, d31, d29
>> > ldr  d31, [sp, #312]
>> > ldr  d17, [sp, #320]
>> >
>> > Currently, the pass is disabled by default on all architectures and
>> enabled by a target-specific option.
>> >
>> > If deemed beneficial enough for a default, it will be enabled on
>> ampere1/ampere1a,
>> > or other architectures as well, without needing to be turned on by this
>> option.
>>
>> What is aarch64-specific about the pass?
>>
> The pass was designed to target load pairs, which are aarch64 specific,
> thus it cannot handle generic loads.
>
>>
>> I see an increasingly large number of target specific passes pop up
>> (probably
>> for the excuse we can generalize them if necessary).  But GCC isn't LLVM
>> and this feels like getting out of hand?
>>
>> The x86 backend also has its store-forwarding "pass" as part of mdreorg
>> in ix86_split_stlf_stall_load.
>>
>> Richard.
>>
>> > Bootstrapped and regtested on aarch64-linux.
>> >
>> > gcc/ChangeLog:
>> >
>> > * config.gcc: Add aarch64-store-forwarding.o to extra_objs.
>> > * config/aarch64/aarch64-passes.def (INSERT_PASS_AFTER): New
>> pass.
>> > * config/aarch64/aarch64-protos.h
>> (make_pass_avoid_store_forwarding): Declare.
>> > * config/aarch64/aarch64.opt (mavoid-store-forwarding): New
>> option.
>> > (aarch64-store-forwarding-threshold): New param.
>> > * config/aarch64/t-aarch64: Add aarch64-store-forwarding.o
>> > * doc/invoke.texi: Document new option and new param.
>> > * config/aarch64/aarch64-store-forwarding.cc: New file.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> > * gcc.target/aarch64/ldp_ssll_no_overlap_address.c: New test.
>> > * gcc.target/aarch64/ldp_ssll_no_overlap_offset.c: New test.
>> > * gcc.target/aarch64/ldp_ssll_overlap.c: New test.
>> >
>> > Signed-off-by: Manos Anagnostakis 
>> > Co-Authored-By: Manolis Tsamis 
>> > Co-Authored-By: Philipp Tomsich 
>> > ---
>> > Changes in v6:
>> > - An obvious change. insn_cnt was incremented only on
>> >   stores and not for every insn in the bb. Now restored.
>> >
>> >  gcc/config.gcc|   1 +
>> >  gcc/config/aarch64/aarch64-passes.def |   1 +
>> >  gcc/config/aarch64/aarch64-protos.h   |   1 +
>> >  .../aarch64/aarch64-store-forwarding.cc   | 318 ++
>> >  gcc/config/aarch64/aarch64.opt|   9 +
>> >  gcc/config/aarch64/t-aarch64  |  10 +
>> >  gcc/doc/invoke.texi   |  11 +-
>> >  .../aarch64/ldp_ssll_no_overlap_address.c |  33 ++
>> >  .../aarch64/ldp_ssll_no_overlap_offset.c  |  33 ++
>> >  .../gcc.target/aarch64/ldp_ssll_overlap.c |  33 ++
>> >  10 files changed, 449 insertions(+), 1 deletion(-)
>> >  create mode 100644 gcc/config/aarch64/aarch64-store-forwarding.cc
>> >  create mode 100644
>> gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_address.c
>> >  create mode 100644
>> gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_offset.c
>> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_ssll_overlap.c
>> >
>> > diff --git a/gcc/config.gcc b/gcc/config.gcc
>> > index 6450448f2f0..7c48429eb82 100644
>> > --- a/gcc/config.gcc
>> > +++ b/gcc/config.gcc
>> > @@ -350,6 +350,7 @@ aarch64*-*-*)
>> > cxx_target_objs="aarch64-c.o"
>> > d_target_objs="aarch64-d.o"
>> > extra_objs="aarch64-builtins.o aarch-common.o
>> aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o
>> aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2

[committed v4 3/3] amdgcn, libgomp: low-latency allocator

2023-12-06 Thread Andrew Stubbs

This implements the OpenMP low-latency memory allocator for AMD GCN using the
small per-team LDS memory (Local Data Store).

Since addresses can now refer to LDS space, the "Global" address space is
no-longer compatible.  This patch therefore switches the backend to use
entirely "Flat" addressing (which supports both memories).  A future patch
will re-enable "global" instructions for cases where it is known to be safe
to do so.

gcc/ChangeLog:

* config/gcn/gcn-builtins.def (DISPATCH_PTR): New built-in.
* config/gcn/gcn.cc (gcn_init_machine_status): Disable global
addressing.
(gcn_expand_builtin_1): Implement GCN_BUILTIN_DISPATCH_PTR.

libgomp/ChangeLog:

* config/gcn/libgomp-gcn.h (TEAM_ARENA_START): Move to here.
(TEAM_ARENA_FREE): Likewise.
(TEAM_ARENA_END): Likewise.
(GCN_LOWLAT_HEAP): New.
* config/gcn/team.c (LITTLEENDIAN_CPU): New, and import hsa.h.
(__gcn_lowlat_init): New prototype.
(gomp_gcn_enter_kernel): Initialize the low-latency heap.
* libgomp.h (TEAM_ARENA_START): Move to libgomp.h.
(TEAM_ARENA_FREE): Likewise.
(TEAM_ARENA_END): Likewise.
* plugin/plugin-gcn.c (lowlat_size): New variable.
(print_kernel_dispatch): Label the group_segment_size purpose.
(init_environment_variables): Read GOMP_GCN_LOWLAT_POOL.
(create_kernel_dispatch): Pass low-latency head allocation to kernel.
(run_kernel): Use shadow; don't assume values.
* testsuite/libgomp.c/omp_alloc-traits.c: Enable for amdgcn.
* config/gcn/allocator.c: New file.
* libgomp.texi: Document low-latency implementation details.
---
 gcc/config/gcn/gcn-builtins.def   |   2 +
 gcc/config/gcn/gcn.cc |  16 ++-
 libgomp/config/gcn/allocator.c| 127 ++
 libgomp/config/gcn/libgomp-gcn.h  |   6 +
 libgomp/config/gcn/team.c |  12 ++
 libgomp/libgomp.h |   3 -
 libgomp/libgomp.texi  |  13 ++
 libgomp/plugin/plugin-gcn.c   |  35 -
 .../testsuite/libgomp.c/omp_alloc-traits.c|   2 +-
 9 files changed, 205 insertions(+), 11 deletions(-)
 create mode 100644 libgomp/config/gcn/allocator.c

diff --git a/gcc/config/gcn/gcn-builtins.def b/gcc/config/gcn/gcn-builtins.def
index 636a8e7a1a9..471457d7c23 100644
--- a/gcc/config/gcn/gcn-builtins.def
+++ b/gcc/config/gcn/gcn-builtins.def
@@ -164,6 +164,8 @@ DEF_BUILTIN (FIRST_CALL_THIS_THREAD_P, -1, "first_call_this_thread_p", B_INSN,
 	 _A1 (GCN_BTI_BOOL), gcn_expand_builtin_1)
 DEF_BUILTIN (KERNARG_PTR, -1, "kernarg_ptr", B_INSN, _A1 (GCN_BTI_VOIDPTR),
 	 gcn_expand_builtin_1)
+DEF_BUILTIN (DISPATCH_PTR, -1, "dispatch_ptr", B_INSN, _A1 (GCN_BTI_VOIDPTR),
+	 gcn_expand_builtin_1)
 DEF_BUILTIN (GET_STACK_LIMIT, -1, "get_stack_limit", B_INSN,
 	 _A1 (GCN_BTI_VOIDPTR), gcn_expand_builtin_1)
 
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 0781c2a47c2..031b405e810 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -110,7 +110,8 @@ gcn_init_machine_status (void)
 
   f = ggc_cleared_alloc ();
 
-  if (TARGET_GCN3)
+  // FIXME: re-enable global addressing with safety for LDS-flat addresses
+  //if (TARGET_GCN3)
 f->use_flat_addressing = true;
 
   return f;
@@ -4879,6 +4880,19 @@ gcn_expand_builtin_1 (tree exp, rtx target, rtx /*subtarget */ ,
 	  }
 	return ptr;
   }
+case GCN_BUILTIN_DISPATCH_PTR:
+  {
+	rtx ptr;
+	if (cfun->machine->args.reg[DISPATCH_PTR_ARG] >= 0)
+	   ptr = gen_rtx_REG (DImode,
+			  cfun->machine->args.reg[DISPATCH_PTR_ARG]);
+	else
+	  {
+	ptr = gen_reg_rtx (DImode);
+	emit_move_insn (ptr, const0_rtx);
+	  }
+	return ptr;
+  }
 case GCN_BUILTIN_FIRST_CALL_THIS_THREAD_P:
   {
 	/* Stash a marker in the unused upper 16 bits of s[0:1] to indicate
diff --git a/libgomp/config/gcn/allocator.c b/libgomp/config/gcn/allocator.c
new file mode 100644
index 000..e9a95d683f9
--- /dev/null
+++ b/libgomp/config/gcn/allocator.c
@@ -0,0 +1,127 @@
+/* Copyright (C) 2023 Free Software Foundation, Inc.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have

[committed v4 1/3] libgomp, nvptx: low-latency memory allocator

2023-12-06 Thread Andrew Stubbs

This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU devices, via omp_low_lat_mem_space and omp_alloc.  The memory
can be allocated, reallocated, and freed using a basic but fast algorithm
that is thread safe.  The size of the low-latency heap can be configured
using the GOMP_NVPTX_LOWLAT_POOL environment variable.

The use of the PTX dynamic_smem_size feature means that the low-latency
allocator will not work with the PTX 3.1 multilib.

For now, the omp_low_lat_mem_alloc allocator also works, but that will change
when I implement the access traits.

libgomp/ChangeLog:

* allocator.c (MEMSPACE_ALLOC): New macro.
(MEMSPACE_CALLOC): New macro.
(MEMSPACE_REALLOC): New macro.
(MEMSPACE_FREE): New macro.
(predefined_alloc_mapping): New array.  Add _Static_assert to match.
(ARRAY_SIZE): New macro.
(omp_aligned_alloc): Use MEMSPACE_ALLOC.
Implement fall-backs for predefined allocators.  Simplify existing
fall-backs.
(omp_free): Use MEMSPACE_FREE.
(omp_calloc): Use MEMSPACE_CALLOC. Implement fall-backs for
predefined allocators.  Simplify existing fall-backs.
(omp_realloc): Use MEMSPACE_REALLOC, MEMSPACE_ALLOC, and MEMSPACE_FREE.
Implement fall-backs for predefined allocators.  Simplify existing
fall-backs.
* config/nvptx/team.c (__nvptx_lowlat_pool): New asm variable.
(__nvptx_lowlat_init): New prototype.
(gomp_nvptx_main): Call __nvptx_lowlat_init.
* libgomp.texi: Update memory space table.
* plugin/plugin-nvptx.c (lowlat_pool_size): New variable.
(GOMP_OFFLOAD_init_device): Read the GOMP_NVPTX_LOWLAT_POOL envvar.
(GOMP_OFFLOAD_run): Apply lowlat_pool_size.
* basic-allocator.c: New file.
* config/nvptx/allocator.c: New file.
* testsuite/libgomp.c/omp_alloc-1.c: New test.
* testsuite/libgomp.c/omp_alloc-2.c: New test.
* testsuite/libgomp.c/omp_alloc-3.c: New test.
* testsuite/libgomp.c/omp_alloc-4.c: New test.
* testsuite/libgomp.c/omp_alloc-5.c: New test.
* testsuite/libgomp.c/omp_alloc-6.c: New test.

Co-authored-by: Kwok Cheung Yeung  
Co-Authored-By: Thomas Schwinge 
---
 libgomp/allocator.c   | 246 --
 libgomp/basic-allocator.c | 382 ++
 libgomp/config/nvptx/allocator.c  | 120 +++
 libgomp/config/nvptx/team.c   |  18 +
 libgomp/libgomp.texi  |  11 +-
 libgomp/plugin/plugin-nvptx.c |  23 +-
 libgomp/testsuite/libgomp.c/omp_alloc-1.c |  56 
 libgomp/testsuite/libgomp.c/omp_alloc-2.c |  64 
 libgomp/testsuite/libgomp.c/omp_alloc-3.c |  42 +++
 libgomp/testsuite/libgomp.c/omp_alloc-4.c | 199 +++
 libgomp/testsuite/libgomp.c/omp_alloc-5.c |  63 
 libgomp/testsuite/libgomp.c/omp_alloc-6.c | 120 +++
 12 files changed, 1239 insertions(+), 105 deletions(-)
 create mode 100644 libgomp/basic-allocator.c
 create mode 100644 libgomp/config/nvptx/allocator.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-2.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-6.c

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index b4e50e2ad72..fa398128368 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -37,6 +37,47 @@
 
 #define omp_max_predefined_alloc omp_thread_mem_alloc
 
+/* These macros may be overridden in config//allocator.c.
+   The following definitions (ab)use comma operators to avoid unused
+   variable errors.  */
+#ifndef MEMSPACE_ALLOC
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
+  malloc (((void)(MEMSPACE), (SIZE)))
+#endif
+#ifndef MEMSPACE_CALLOC
+#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
+  calloc (1, (((void)(MEMSPACE), (SIZE))))
+#endif
+#ifndef MEMSPACE_REALLOC
+#define MEMSPACE_REALLOC(MEMSPACE, ADDR, OLDSIZE, SIZE) \
+  realloc (ADDR, (((void)(MEMSPACE), (void)(OLDSIZE), (SIZE))))
+#endif
+#ifndef MEMSPACE_FREE
+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
+  free (((void)(MEMSPACE), (void)(SIZE), (ADDR)))
+#endif
+
+/* Map the predefined allocators to the correct memory space.
+   The index to this table is the omp_allocator_handle_t enum value.
+   When the user calls omp_alloc with a predefined allocator this
+   table determines what memory they get.  */
+static const omp_memspace_handle_t predefined_alloc_mapping[] = {
+  omp_default_mem_space,   /* omp_null_allocator doesn't actually use this. */
+  omp_default_mem_space,   /* omp_default_mem_alloc. */
+  omp_large_cap_mem_space, /* omp_large_cap_mem_alloc. */
+  omp_const_mem_space, /* omp_const_mem_alloc. */
+  omp_high_bw_mem_space,   /* omp_high_bw_mem_alloc. */

[committed v4 2/3] openmp, nvptx: low-lat memory access traits

2023-12-06 Thread Andrew Stubbs

The NVPTX low latency memory is not accessible outside the team that allocates
it, and therefore should be unavailable for allocators with the access trait
"all".  This change means that the omp_low_lat_mem_alloc predefined
allocator no longer works (but omp_cgroup_mem_alloc still does).

libgomp/ChangeLog:

* allocator.c (MEMSPACE_VALIDATE): New macro.
(omp_init_allocator): Use MEMSPACE_VALIDATE.
(omp_aligned_alloc): Use OMP_LOW_LAT_MEM_ALLOC_INVALID.
(omp_aligned_calloc): Likewise.
(omp_realloc): Likewise.
* config/nvptx/allocator.c (nvptx_memspace_validate): New function.
(MEMSPACE_VALIDATE): New macro.
(OMP_LOW_LAT_MEM_ALLOC_INVALID): New define.
* libgomp.texi: Document low-latency implementation details.
* testsuite/libgomp.c/omp_alloc-1.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-2.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-3.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-4.c (main): Add access trait.
* testsuite/libgomp.c/omp_alloc-5.c (main): Add gnu_lowlat.
* testsuite/libgomp.c/omp_alloc-6.c (main): Add access trait.
* testsuite/libgomp.c/omp_alloc-traits.c: New test.
---
 libgomp/allocator.c   | 20 ++
 libgomp/config/nvptx/allocator.c  | 21 ++
 libgomp/libgomp.texi  | 18 +
 libgomp/testsuite/libgomp.c/omp_alloc-1.c | 10 +++
 libgomp/testsuite/libgomp.c/omp_alloc-2.c |  8 +++
 libgomp/testsuite/libgomp.c/omp_alloc-3.c |  7 ++
 libgomp/testsuite/libgomp.c/omp_alloc-4.c |  7 +-
 libgomp/testsuite/libgomp.c/omp_alloc-5.c |  8 +++
 libgomp/testsuite/libgomp.c/omp_alloc-6.c |  7 +-
 .../testsuite/libgomp.c/omp_alloc-traits.c| 66 +++
 10 files changed, 166 insertions(+), 6 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-traits.c

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index fa398128368..a8a80f8028d 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -56,6 +56,10 @@
 #define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE) \
   free (((void)(MEMSPACE), (void)(SIZE), (ADDR)))
 #endif
+#ifndef MEMSPACE_VALIDATE
+#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS) \
+  (((void)(MEMSPACE), (void)(ACCESS), 1))
+#endif
 
 /* Map the predefined allocators to the correct memory space.
The index to this table is the omp_allocator_handle_t enum value.
@@ -439,6 +443,10 @@ omp_init_allocator (omp_memspace_handle_t memspace, int ntraits,
   if (data.pinned)
 return omp_null_allocator;
 
+  /* Reject unsupported memory spaces.  */
+  if (!MEMSPACE_VALIDATE (data.memspace, data.access))
+return omp_null_allocator;
+
   ret = gomp_malloc (sizeof (struct omp_allocator_data));
   *ret = data;
 #ifndef HAVE_SYNC_BUILTINS
@@ -522,6 +530,10 @@ retry:
 new_size += new_alignment - sizeof (void *);
   if (__builtin_add_overflow (size, new_size, &new_size))
 goto fail;
+#ifdef OMP_LOW_LAT_MEM_ALLOC_INVALID
+  if (allocator == omp_low_lat_mem_alloc)
+goto fail;
+#endif
 
   if (__builtin_expect (allocator_data
 			&& allocator_data->pool_size < ~(uintptr_t) 0, 0))
@@ -820,6 +832,10 @@ retry:
 goto fail;
   if (__builtin_add_overflow (size_temp, new_size, &new_size))
 goto fail;
+#ifdef OMP_LOW_LAT_MEM_ALLOC_INVALID
+  if (allocator == omp_low_lat_mem_alloc)
+goto fail;
+#endif
 
   if (__builtin_expect (allocator_data
 			&& allocator_data->pool_size < ~(uintptr_t) 0, 0))
@@ -1054,6 +1070,10 @@ retry:
   if (__builtin_add_overflow (size, new_size, &new_size))
 goto fail;
   old_size = data->size;
+#ifdef OMP_LOW_LAT_MEM_ALLOC_INVALID
+  if (allocator == omp_low_lat_mem_alloc)
+goto fail;
+#endif
 
   if (__builtin_expect (allocator_data
 			&& allocator_data->pool_size < ~(uintptr_t) 0, 0))
diff --git a/libgomp/config/nvptx/allocator.c b/libgomp/config/nvptx/allocator.c
index 6014fba177f..a3302411bcb 100644
--- a/libgomp/config/nvptx/allocator.c
+++ b/libgomp/config/nvptx/allocator.c
@@ -108,6 +108,21 @@ nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
 return realloc (addr, size);
 }
 
+static inline int
+nvptx_memspace_validate (omp_memspace_handle_t memspace, unsigned access)
+{
+#if __PTX_ISA_VERSION_MAJOR__ > 4 \
+|| (__PTX_ISA_VERSION_MAJOR__ == 4 && __PTX_ISA_VERSION_MINOR__ >= 1)
+  /* Disallow use of low-latency memory when it must be accessible by
+ all threads.  */
+  return (memspace != omp_low_lat_mem_space
+	  || access != omp_atv_all);
+#else
+  /* Low-latency memory is not available before PTX 4.1.  */
+  return (memspace != omp_low_lat_mem_space);
+#endif
+}
+
 #define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
   nvptx_memspace_alloc (MEMSPACE, SIZE)
 #define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
@@ -116,5 +131,11 @@ nvptx_memspace_realloc (omp_memspace_handle_t memspace, void *addr,
   nvptx_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE)
 #d

[committed v4 0/3] libgomp: OpenMP low-latency omp_alloc

2023-12-06 Thread Andrew Stubbs
Thank you, Tobias, for approving the v3 patch series with minor changes.

https://patchwork.sourceware.org/project/gcc/list/?series=27815&state=%2A&archive=both

These patches are what I've actually committed.  Besides the requested
changes there were one or two bug fixes and minor tweaks, but otherwise
the patches are the same.

The series implements device-specific allocators and adds a low-latency
allocator for both GPU architectures.

Andrew Stubbs (3):
  libgomp, nvptx: low-latency memory allocator
  openmp, nvptx: low-lat memory access traits
  amdgcn, libgomp: low-latency allocator

 gcc/config/gcn/gcn-builtins.def   |   2 +
 gcc/config/gcn/gcn.cc |  16 +-
 libgomp/allocator.c   | 266 +++-
 libgomp/basic-allocator.c | 382 ++
 libgomp/config/gcn/allocator.c| 127 ++
 libgomp/config/gcn/libgomp-gcn.h  |   6 +
 libgomp/config/gcn/team.c |  12 +
 libgomp/config/nvptx/allocator.c  | 141 +++
 libgomp/config/nvptx/team.c   |  18 +
 libgomp/libgomp.h |   3 -
 libgomp/libgomp.texi  |  42 +-
 libgomp/plugin/plugin-gcn.c   |  35 +-
 libgomp/plugin/plugin-nvptx.c |  23 +-
 libgomp/testsuite/libgomp.c/omp_alloc-1.c |  66 +++
 libgomp/testsuite/libgomp.c/omp_alloc-2.c |  72 
 libgomp/testsuite/libgomp.c/omp_alloc-3.c |  49 +++
 libgomp/testsuite/libgomp.c/omp_alloc-4.c | 200 +
 libgomp/testsuite/libgomp.c/omp_alloc-5.c |  71 
 libgomp/testsuite/libgomp.c/omp_alloc-6.c | 121 ++
 .../testsuite/libgomp.c/omp_alloc-traits.c|  66 +++
 20 files changed, 1603 insertions(+), 115 deletions(-)
 create mode 100644 libgomp/basic-allocator.c
 create mode 100644 libgomp/config/gcn/allocator.c
 create mode 100644 libgomp/config/nvptx/allocator.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-1.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-2.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-6.c
 create mode 100644 libgomp/testsuite/libgomp.c/omp_alloc-traits.c

-- 
2.41.0



[PATCH] libiberty/buildargv: POSIX behaviour for backslash handling

2023-12-06 Thread Andrew Burgess
GDB makes use of the libiberty function buildargv for splitting the
inferior (program being debugged) argument string in the case where
the inferior is not being started under a shell.

I have recently been working to improve this area of GDB, and have
tracked down some of the unexpected behaviour to the libiberty
function buildargv, and how it handles backslash escapes.

For reference, I've been mostly reading:

  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html

The issues that I would like to fix are:

  1. Backslashes within single quotes should not be treated as an
  escape, thus: '\a' should split to \a, retaining the backslash.

  2. Backslashes within double quotes should only act as an escape if
  they are immediately before one of the characters $ (dollar),
  ` (backtick), " (double quote), \ (backslash), or \n (newline).  In
  all other cases a backslash should not be treated as an escape
  character.  Thus: "\a" should split to \a, but "\$" should split to
  $.

  3. A backslash-newline sequence should be treated as a line
  continuation, both the backslash and the newline should be removed.

I've updated libiberty and also added some tests.  All the existing
libiberty tests continue to pass, but I'm not sure if there is more
testing that should be done.  buildargv is used within lto-wrapper.cc,
so maybe there's some testing folk can suggest that I run?
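
To make the three rules above concrete, here is a minimal, self-contained
sketch of a splitter that follows them.  It is not the libiberty code:
split_one_arg is a hypothetical helper written only for illustration, and it
assumes that plain spaces are the only argument separators.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libiberty: copy the first argument
   found in INPUT into OUT (capacity OUTSZ) and return a pointer to the
   first character after it.  Only the three backslash rules listed
   above are modelled.  */
static const char *
split_one_arg (const char *input, char *out, size_t outsz)
{
  int squote = 0, dquote = 0;
  size_t n = 0;

  assert (outsz > 0);
  while (*input == ' ')
    input++;

  for (; *input != '\0'; input++)
    {
      char c = *input;

      if (c == ' ' && !squote && !dquote)
        break;

      if (c == '\\' && !squote && input[1] != '\0'
          && (!dquote || strchr ("$`\"\\\n", input[1]) != NULL))
        {
          /* The backslash is an escape: drop it, and also drop the
             next character when it is a newline (line continuation).  */
          input++;
          if (*input != '\n' && n + 1 < outsz)
            out[n++] = *input;
        }
      else if (c == '\'' && !dquote)
        squote = !squote;
      else if (c == '"' && !squote)
        dquote = !dquote;
      else if (n + 1 < outsz)
        out[n++] = c;
    }

  out[n] = '\0';
  return input;
}
```

With this sketch the input '\a' yields \a, "\a" also yields \a, "\$" yields
$, and a backslash-newline pair is removed both inside and outside double
quotes.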
---
 libiberty/argv.c  |  8 +--
 libiberty/testsuite/test-expandargv.c | 34 +++
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/libiberty/argv.c b/libiberty/argv.c
index c2823d3e4ba..6bae4ca2ee9 100644
--- a/libiberty/argv.c
+++ b/libiberty/argv.c
@@ -224,9 +224,13 @@ char **buildargv (const char *input)
  if (bsquote)
{
  bsquote = 0;
- *arg++ = *input;
+ if (*input != '\n')
+   *arg++ = *input;
}
- else if (*input == '\\')
+ else if (*input == '\\'
+  && !squote
+  && (!dquote
+  || strchr ("$`\"\\\n", *(input + 1)) != NULL))
{
  bsquote = 1;
}
diff --git a/libiberty/testsuite/test-expandargv.c 
b/libiberty/testsuite/test-expandargv.c
index 30f2337ef77..b8dcc6a269a 100644
--- a/libiberty/testsuite/test-expandargv.c
+++ b/libiberty/testsuite/test-expandargv.c
@@ -142,6 +142,40 @@ const char *test_data[] = {
   "b",
   0,
 
+  /* Test 7 - No backslash removal within single quotes.  */
+  "'a\\$VAR' '\\\"'",/* Test 7 data */
+  ARGV0,
+  "@test-expandargv-7.lst",
+  0,
+  ARGV0,
+  "a\\$VAR",
+  "\\\"",
+  0,
+
+  /* Test 8 - Remove backslash / newline pairs.  */
+  "\"ab\\\ncd\" ef\\\ngh",/* Test 8 data */
+  ARGV0,
+  "@test-expandargv-8.lst",
+  0,
+  ARGV0,
+  "abcd",
+  "efgh",
+  0,
+
+  /* Test 9 - Backslash within double quotes.  */
+  "\"\\$VAR\" \"\\`\" \"\\\"\" \"\\\\\" \"\\n\" \"\\t\"",/* Test 9 data */
+  ARGV0,
+  "@test-expandargv-9.lst",
+  0,
+  ARGV0,
+  "$VAR",
+  "`",
+  "\"",
+  "\\",
+  "\\n",
+  "\\t",
+  0,
+
   0 /* Test done marker, don't remove. */
 };
 

base-commit: 458e7c937924bbcef80eb006af0b61420dbfc1c1
-- 
2.25.4



Re: [PATCH] [arm] testsuite: make mve_intrinsic_type_overloads-int.c libc-agnostic

2023-12-06 Thread Richard Earnshaw

Sorry, I only just spotted this while looking at something else.


On 23/05/2023 15:41, Christophe Lyon via Gcc-patches wrote:

Glibc defines int32_t as 'int' while newlib defines it as 'long int'.

Although these correspond to the same size, g++ complains when using the
'wrong' version:
   invalid conversion from 'long int*' to 'int32_t*' {aka 'int*'} [-fpermissive]
or
   invalid conversion from 'int*' to 'int32_t*' {aka 'long int*'} [-fpermissive]

when calling vst1q(int32*, int32x4_t) with a first parameter of type
'long int *' (resp. 'int *')

To make this test pass with any type of toolchain, this patch defines
'word_type' according to which libc is in use.

2023-05-23  Christophe Lyon  

gcc/testsuite/
* gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c:
Support both definitions of int32_t.
---
  .../mve_intrinsic_type_overloads-int.c| 28 ++-
  1 file changed, 15 insertions(+), 13 deletions(-)

diff --git 
a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
index 7947dc024bc..ab51cc8b323 100644
--- 
a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
+++ 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_intrinsic_type_overloads-int.c
@@ -47,14 +47,22 @@ foo2 (short * addr, int16x8_t value)
vst1q (addr, value);
  }
  
-void
-foo3 (int * addr, int32x4_t value)
-{
-  vst1q (addr, value); /* { dg-warning "invalid conversion" "" { target c++ } 
} */
-}
+/* Glibc defines int32_t as 'int' while newlib defines it as 'long int'.
+
+   Although these correspond to the same size, g++ complains when using the
+   'wrong' version:
+  invalid conversion from 'long int*' to 'int32_t*' {aka 'int*'} [-fpermissive]
+
+  The trick below is to make this test pass whether using glibc-based or
+  newlib-based toolchains.  */
  
+#if defined(__GLIBC__)
+#define word_type int
+#else
+#define word_type long int
+#endif


GCC #defines __INT32_TYPE__ for this and should be more reliable than 
trying to detect one specific library implementation.  Did you try that?
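
A sketch of that approach, assuming a compiler that predefines
__INT32_TYPE__ and __UINT32_TYPE__ (GCC does).  The word_type and uword_type
names and the store_word/store_uword helpers are illustrative only, and the
fallback branch is an assumption for compilers without those macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Take the underlying 32-bit types straight from the compiler instead
   of guessing them from the C library in use.  */
#if defined(__INT32_TYPE__) && defined(__UINT32_TYPE__)
typedef __INT32_TYPE__ word_type;
typedef __UINT32_TYPE__ uword_type;
#else
/* Assumed fallback: rely on <stdint.h> directly.  */
typedef int32_t word_type;
typedef uint32_t uword_type;
#endif

/* Stand-ins for an interface declared in terms of int32_t/uint32_t.
   Passing a word_type * or uword_type * needs no pointer conversion,
   whichever basic type the library picked for int32_t.  */
static void
store_word (int32_t *addr, int32_t value)
{
  assert (addr != NULL);
  *addr = value;
}

static void
store_uword (uint32_t *addr, uint32_t value)
{
  assert (addr != NULL);
  *addr = value;
}
```

A separate unsigned typedef is used here because "unsigned word_type" only
works when word_type is a macro, not a typedef.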



  void
-foo4 (long * addr, int32x4_t value)
+foo3 (word_type * addr, int32x4_t value)
  {
vst1q (addr, value);
  }
@@ -78,13 +86,7 @@ foo7 (unsigned short * addr, uint16x8_t value)
  }
  
  void
-foo8 (unsigned int * addr, uint32x4_t value)
-{
-  vst1q (addr, value); /* { dg-warning "invalid conversion" "" { target c++ } 
} */
-}
-
-void
-foo9 (unsigned long * addr, uint32x4_t value)
+foo8 (unsigned word_type * addr, uint32x4_t value)
  {
vst1q (addr, value);
  }


R.


RE: [PATCH 17/21]AArch64: Add implementation for vector cbranch for Advanced SIMD

2023-12-06 Thread Tamar Christina
> -Original Message-
> From: Richard Sandiford 
> Sent: Tuesday, November 28, 2023 5:56 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH 17/21]AArch64: Add implementation for vector cbranch for
> Advanced SIMD
> 
> Richard Sandiford  writes:
> > Tamar Christina  writes:
> >> Hi All,
> >>
> >> This adds an implementation for conditional branch optab for AArch64.
> >>
> >> For e.g.
> >>
> >> void f1 ()
> >> {
> >>   for (int i = 0; i < N; i++)
> >> {
> >>   b[i] += a[i];
> >>   if (a[i] > 0)
> >>break;
> >> }
> >> }
> >>
> >> For 128-bit vectors we generate:
> >>
> >> cmgt    v1.4s, v1.4s, #0
> >> umaxp   v1.4s, v1.4s, v1.4s
> >> fmov    x3, d1
> >> cbnz    x3, .L8
> >>
> >> and for 64-bit vectors we can omit the compression:
> >>
> >> cmgt    v1.2s, v1.2s, #0
> >> fmov    x2, d1
> >> cbz     x2, .L13
> >>
> >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >>
> >> Ok for master?
> >>
> >> Thanks,
> >> Tamar
> >>
> >> gcc/ChangeLog:
> >>
> >>* config/aarch64/aarch64-simd.md (cbranch4): New.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>* gcc.target/aarch64/vect-early-break-cbranch.c: New test.
> >>
> >> --- inline copy of patch --
> >> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> >> index
> 90118c6348e9614bef580d1dc94c0c1841dd5204..cd5ec35c3f53028f14828bd7
> 0a92924f62524c15 100644
> >> --- a/gcc/config/aarch64/aarch64-simd.md
> >> +++ b/gcc/config/aarch64/aarch64-simd.md
> >> @@ -3830,6 +3830,46 @@ (define_expand
> "vcond_mask_"
> >>DONE;
> >>  })
> >>
> >> +;; Patterns comparing two vectors and conditionally jump
> >> +
> >> +(define_expand "cbranch4"
> >> +  [(set (pc)
> >> +(if_then_else
> >> +  (match_operator 0 "aarch64_equality_operator"
> >> +[(match_operand:VDQ_I 1 "register_operand")
> >> + (match_operand:VDQ_I 2 "aarch64_simd_reg_or_zero")])
> >> +  (label_ref (match_operand 3 ""))
> >> +  (pc)))]
> >> +  "TARGET_SIMD"
> >> +{
> >> +  auto code = GET_CODE (operands[0]);
> >> +  rtx tmp = operands[1];
> >> +
> >> +  /* If comparing against a non-zero vector we have to do a comparison 
> >> first
> >> + so we can have a != 0 comparison with the result.  */
> >> +  if (operands[2] != CONST0_RTX (mode))
> >> +emit_insn (gen_vec_cmp (tmp, operands[0], operands[1],
> >> +  operands[2]));
> >> +
> >> +  /* For 64-bit vectors we need no reductions.  */
> >> +  if (known_eq (128, GET_MODE_BITSIZE (mode)))
> >> +{
> >> +  /* Always reduce using a V4SI.  */
> >> +  rtx reduc = gen_lowpart (V4SImode, tmp);
> >> +  rtx res = gen_reg_rtx (V4SImode);
> >> +  emit_insn (gen_aarch64_umaxpv4si (res, reduc, reduc));
> >> +  emit_move_insn (tmp, gen_lowpart (mode, res));
> >> +}
> >> +
> >> +  rtx val = gen_reg_rtx (DImode);
> >> +  emit_move_insn (val, gen_lowpart (DImode, tmp));
> >> +
> >> +  rtx cc_reg = aarch64_gen_compare_reg (code, val, const0_rtx);
> >> +  rtx cmp_rtx = gen_rtx_fmt_ee (code, DImode, cc_reg, const0_rtx);
> >> +  emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[3]));
> >> +  DONE;
> >
> > Are you sure this is correct for the operands[2] != const0_rtx case?
> > It looks like it uses the same comparison code for the vector comparison
> > and the scalar comparison.
> >
> > E.g. if the pattern is passed a comparison:
> >
> >   (eq (reg:V2SI x) (reg:V2SI y))
> >
> > it looks like we'd generate a CMEQ for the x and y, then branch
> > when the DImode bitcast of the CMEQ result equals zero.  This means
> > that we branch when no elements of x and y are equal, rather than
> > when all elements of x and y are equal.
> >
> > E.g. for:
> >
> >{ 1, 2 } == { 1, 2 }
> >
> > CMEQ will produce { -1, -1 }, the scalar comparison will be -1 == 0,
> > and the branch won't be taken.
> >
> > ISTM it would be easier for the operands[2] != const0_rtx case to use
> > EOR instead of a comparison.  That gives a zero result if the input
> > vectors are equal and a nonzero result if the input vectors are
> > different.  We can then branch on the result using CODE and const0_rtx.
> >
> > (Hope I've got that right.)
> >
> > Maybe that also removes the need for patch 18.
> 
> Sorry, I forgot to say: we can't use operands[1] as a temporary,
> since it's only an input to the pattern.  The EOR destination would
> need to be a fresh register.
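
The point about CMEQ versus EOR can be checked with a scalar model.  The
sketch below is plain C, not AArch64 code; the v2si struct and the
model_cmeq, model_eor and as_di helpers are illustrative names only, and it
assumes a vector of two 32-bit lanes viewed as one 64-bit value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model a 64-bit vector of two 32-bit lanes (V2SI).  */
typedef struct { uint32_t lane[2]; } v2si;

/* Lane-wise compare-equal: all-ones where the lanes match and zero
   elsewhere, which is what CMEQ computes.  */
static v2si
model_cmeq (v2si x, v2si y)
{
  v2si r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = (x.lane[i] == y.lane[i]) ? UINT32_MAX : 0;
  return r;
}

/* Lane-wise exclusive OR, which is what EOR computes.  */
static v2si
model_eor (v2si x, v2si y)
{
  v2si r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = x.lane[i] ^ y.lane[i];
  return r;
}

/* Reinterpret both lanes as one 64-bit scalar (the DImode bitcast).  */
static uint64_t
as_di (v2si v)
{
  uint64_t d;
  assert (sizeof v == sizeof d);
  memcpy (&d, &v, sizeof d);
  return d;
}
```

For { 1, 2 } == { 1, 2 }, as_di (model_cmeq (x, y)) is nonzero, so a scalar
"== 0" test on it fails even though the vectors are equal, whereas
as_di (model_eor (x, y)) is zero exactly when all lanes match.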

I've updated the patch but it doesn't help since cbranch doesn't really push
comparisons in.  So we don't seem to ever really get called with anything 
non-zero.

That said, I'm not entirely convinced that the == case is correct, since ==
means all bits equal instead of any bit set, so it needs to generate cbz
instead of cbnz, and I'm not sure that's guaranteed.

I do have a failing testcase with this but haven'

Re: [PATCH] RISC-V: Remove xfail from ssa-fre-3.c testcase

2023-12-06 Thread Palmer Dabbelt

On Tue, 05 Dec 2023 16:39:06 PST (-0800), e...@rivosinc.com wrote:

Ran the test case at 122e7b4f9d0c2d54d865272463a1d812002d0a5c where the xfail


That's the original port submission, I'm actually kind of surprised it 
still builds/works at all.



was introduced. The test did pass at that hash and has continued to pass since
then. Remove the xfail

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/ssa-fre-3.c: Remove xfail

Signed-off-by: Edwin Lu 
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
index 224dd4f72ef..b2924837a22 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c
@@ -18,4 +18,4 @@ foo (int a, int b)
   return aa + bb;
 }

-/* { dg-final { scan-tree-dump "Replaced \\\(int\\\) aa_.*with a_" "fre1" { xfail { 
riscv*-*-* && lp64 } } } } */
+/* { dg-final { scan-tree-dump "Replaced \\\(int\\\) aa_.*with a_" "fre1" } } 
*/


Reviewed-by: Palmer Dabbelt 

Though Kito did all the test suite stuff back then, so not sure if he 
happens to remember anything specific about what was going on.


Thanks!

