Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Matthias Kretz
On Tuesday, 16 November 2021 21:25:33 CET Jason Merrill wrote:
> On 11/8/21 15:00, Matthias Kretz wrote:
> > I forgot to mention why I tagged it [RFC]: I needed one more bit of
> > information on the template args TREE_VEC to encode
> > EXPLICIT_TEMPLATE_ARGS_P. Its TREE_CHAIN already points to an integer
> > constant denoting the number of non-default arguments, so I couldn't
> > trivially replace that. Therefore, I used the sign of that integer. I was
> > hoping to find a cleaner solution, though.
> It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that
> would be a cleaner solution.

I tried that first but realized that TREE_VEC doesn't allow any 
TREE_LANG_FLAGs (it uses those bits for the length IIRC). And setting the 
TREE_LANG_FLAGs on the TREE_CHAIN of the TREE_VEC can't work either (since the 
int constants are shared between many trees).

Should I maybe turn the TREE_CHAIN into a TREE_LIST using TREE_PURPOSE and 
TREE_VALUE for EXPLICIT_TEMPLATE_ARGS_P and non-default arguments, 
respectively? (And where would I document this?)
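
As a rough illustration of the sign trick described above (this is not the GCC code: `ArgsInfo`, `encode`, `non_default_count` and `explicit_args_p` are hypothetical names, and a plain `int` stands in for the shared integer constant), a count and a one-bit flag can share a single integer roughly like this. The offset by one is an assumption of this sketch so that a count of zero still has a sign to carry; how the patch itself treats that case is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical stand-in for the integer constant reachable via TREE_CHAIN.
// GCC stores a tree node there; this sketch only models the arithmetic.
struct ArgsInfo {
  int encoded;
};

// Pack the number of non-default arguments and the "explicit" bit into one
// integer. The count is stored as count + 1 so that zero remains encodable
// with either sign (an assumption of this sketch).
constexpr ArgsInfo encode(int count, bool explicit_args) {
  return ArgsInfo{explicit_args ? -(count + 1) : count + 1};
}

// Recover the count: absolute value of the stored number, minus the offset.
constexpr int non_default_count(ArgsInfo a) {
  return (a.encoded < 0 ? -a.encoded : a.encoded) - 1;
}

// Recover the flag: it is the sign of the stored number.
constexpr bool explicit_args_p(ArgsInfo a) { return a.encoded < 0; }

static_assert(non_default_count(encode(2, true)) == 2, "count survives");
static_assert(explicit_args_p(encode(2, true)), "flag survives");
static_assert(!explicit_args_p(encode(0, false)), "zero, not explicit");

inline void check_encoding() {
  assert(non_default_count(encode(0, true)) == 0);
  assert(explicit_args_p(encode(0, true)));
}
```

The drawback this sketch makes visible is that every reader of the count has to go through an accessor that strips the sign, which is why a dedicated flag or a TREE_LIST would be cleaner.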

-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


[PATCH 2/2] libstdc++: Use diagnose_as attribute to improve simd diagnostics

2021-11-15 Thread Matthias Kretz


Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h: Diagnose
'std::experimental::parallelism_v2::simd_abi' as 'simd_abi'.
On x86, diagnose _VecBuiltin<16>, _VecBuiltin<32>, and
_VecBltnBtmsk<64> as 'simd_abi::[SSE]', 'simd_abi::[AVX]', and
'simd_abi::[AVX512]' respectively.
(simd_abi::_Scalar): Diagnose as 'simd_abi::scalar'.
(simd_abi::_Fixed): Diagnose as 'simd_abi::fixed_size'.
(__odr_helper): Shorten implementation details (effectively
hiding them).
* include/experimental/bits/simd_detail.h: Diagnose
'std::experimental::parallelism_v2' as 'stdₓ'.
---
 libstdc++-v3/include/experimental/bits/simd.h | 37 +--
 .../include/experimental/bits/simd_detail.h   |  2 +-
 2 files changed, 11 insertions(+), 28 deletions(-)


--
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 4fbad7d67b5..f581b46fbd8 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -83,13 +83,13 @@ using __m512d [[__gnu__::__vector_size__(64)]] = double;
 using __m512i [[__gnu__::__vector_size__(64)]] = long long;
 #endif
 
-namespace simd_abi {
+namespace simd_abi [[__gnu__::__diagnose_as__("simd_abi")]] {
 // simd_abi forward declarations {{{
 // implementation details:
-struct _Scalar;
+  struct [[__gnu__::__diagnose_as__("scalar")]] _Scalar;
 
 template 
-  struct _Fixed;
+  struct [[__gnu__::__diagnose_as__("fixed_size")]] _Fixed;
 
 // There are two major ABIs that appear on different architectures.
 // Both have non-boolean values packed into an N Byte register
@@ -108,28 +108,11 @@ template 
 template 
   struct _VecBltnBtmsk;
 
-template 
-  using _VecN = _VecBuiltin;
-
-template 
-  using _Sse = _VecBuiltin<_UsedBytes>;
-
-template 
-  using _Avx = _VecBuiltin<_UsedBytes>;
-
-template 
-  using _Avx512 = _VecBltnBtmsk<_UsedBytes>;
-
-template 
-  using _Neon = _VecBuiltin<_UsedBytes>;
-
-// implementation-defined:
-using __sse = _Sse<>;
-using __avx = _Avx<>;
-using __avx512 = _Avx512<>;
-using __neon = _Neon<>;
-using __neon128 = _Neon<16>;
-using __neon64 = _Neon<8>;
+#if defined __i386__ || defined __x86_64__
+using __sse [[__gnu__::__diagnose_as__("[SSE]")]] = _VecBuiltin<16>;
+using __avx [[__gnu__::__diagnose_as__("[AVX]")]] = _VecBuiltin<32>;
+using __avx512 [[__gnu__::__diagnose_as__("[AVX512]")]] = _VecBltnBtmsk<64>;
+#endif
 
 // standard:
 template 
@@ -367,7 +350,7 @@ namespace __detail
* users link TUs compiled with different flags. This is especially important
* for using simd in libraries.
*/
-  using __odr_helper
+  using __odr_helper [[__gnu__::__diagnose_as__("[ODR helper]")]]
 = conditional_t<__machine_flags() == 0, _OdrEnforcer,
 		_MachineFlagsTemplate<__machine_flags(), __floating_point_flags()>>;
 
@@ -692,7 +675,7 @@ template 
   __is_avx512_abi()
   {
 constexpr auto _Bytes = __abi_bytes_v<_Abi>;
-return _Bytes <= 64 && is_same_v, _Abi>;
+return _Bytes <= 64 && is_same_v, _Abi>;
   }
 
 // }}}
diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h b/libstdc++-v3/include/experimental/bits/simd_detail.h
index 198c925c133..437f1ddb278 100644
--- a/libstdc++-v3/include/experimental/bits/simd_detail.h
+++ b/libstdc++-v3/include/experimental/bits/simd_detail.h
@@ -37,7 +37,7 @@
   {\
 _GLIBCXX_BEGIN_NAMESPACE_VERSION   \
   namespace experimental { \
-  inline namespace parallelism_v2 {
+	inline namespace parallelism_v2 [[__gnu__::__diagnose_as__("std\u2093")]] {
 #define _GLIBCXX_SIMD_END_NAMESPACE\
   }\
   }\


[PATCH 1/2] libstdc++: Use diagnose_as attribute to improve string diagnostics

2021-11-15 Thread Matthias Kretz


This hides the basic_string template in all diagnostics, improving the
signal-to-noise ratio significantly. It also hides the std::__cxx11
namespace from users by presenting it as std.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

PR c++/89370
* include/bits/c++config: Diagnose std::__cxx11:: as std:: using
the diagnose_as attribute.
* include/bits/stringfwd.h: Add diagnose_as attribute to string,
wstring, u8string, u16string, and u32string.
* include/debug/string: Ditto.
* include/experimental/string: Ditto.
* include/std/string: Ditto.
---
 libstdc++-v3/include/bits/c++config  |  3 ++-
 libstdc++-v3/include/bits/stringfwd.h| 10 +-
 libstdc++-v3/include/debug/string| 10 +-
 libstdc++-v3/include/experimental/string | 10 +-
 libstdc++-v3/include/std/string  | 10 +-
 5 files changed, 22 insertions(+), 21 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index a6495809671..02d11afc1aa 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -318,7 +318,8 @@ namespace std
 #if _GLIBCXX_USE_CXX11_ABI
 namespace std
 {
-  inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { }
+  inline namespace __cxx11
+__attribute__((__abi_tag__ ("cxx11"), __diagnose_as__("std"))) { }
 }
 namespace __gnu_cxx
 {
diff --git a/libstdc++-v3/include/bits/stringfwd.h b/libstdc++-v3/include/bits/stringfwd.h
index bcfd350e505..3f653feae14 100644
--- a/libstdc++-v3/include/bits/stringfwd.h
+++ b/libstdc++-v3/include/bits/stringfwd.h
@@ -74,22 +74,22 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 _GLIBCXX_END_NAMESPACE_CXX11
 
   /// A string of @c char
-  typedef basic_string<char>    string;
+  typedef basic_string<char>    string __attribute__((__diagnose_as__));
 
   /// A string of @c wchar_t
-  typedef basic_string<wchar_t> wstring;
+  typedef basic_string<wchar_t> wstring __attribute__((__diagnose_as__));
 
 #ifdef _GLIBCXX_USE_CHAR8_T
   /// A string of @c char8_t
-  typedef basic_string<char8_t> u8string;
+  typedef basic_string<char8_t> u8string __attribute__((__diagnose_as__));
 #endif
 
 #if __cplusplus >= 201103L
   /// A string of @c char16_t
-  typedef basic_string<char16_t> u16string;
+  typedef basic_string<char16_t> u16string __attribute__((__diagnose_as__));
 
   /// A string of @c char32_t
-  typedef basic_string<char32_t> u32string;
+  typedef basic_string<char32_t> u32string __attribute__((__diagnose_as__));
 #endif
 
   /** @}  */
diff --git a/libstdc++-v3/include/debug/string b/libstdc++-v3/include/debug/string
index a8389528001..d6299e5552f 100644
--- a/libstdc++-v3/include/debug/string
+++ b/libstdc++-v3/include/debug/string
@@ -1296,21 +1296,21 @@ namespace __gnu_debug
   return __res;
 }
 
-  typedef basic_string<char>    string;
+  typedef basic_string<char>    string __attribute__((__diagnose_as__));
 
-  typedef basic_string<wchar_t> wstring;
+  typedef basic_string<wchar_t> wstring __attribute__((__diagnose_as__));
 
 #ifdef _GLIBCXX_USE_CHAR8_T
   /// A string of @c char8_t
-  typedef basic_string<char8_t> u8string;
+  typedef basic_string<char8_t> u8string __attribute__((__diagnose_as__));
 #endif
 
 #if __cplusplus >= 201103L
   /// A string of @c char16_t
-  typedef basic_string<char16_t> u16string;
+  typedef basic_string<char16_t> u16string __attribute__((__diagnose_as__));
 
   /// A string of @c char32_t
-  typedef basic_string<char32_t> u32string;
+  typedef basic_string<char32_t> u32string __attribute__((__diagnose_as__));
 #endif
 
   template
diff --git a/libstdc++-v3/include/experimental/string b/libstdc++-v3/include/experimental/string
index 4d92a7e39cc..91a9dd8b164 100644
--- a/libstdc++-v3/include/experimental/string
+++ b/libstdc++-v3/include/experimental/string
@@ -73,13 +73,13 @@ inline namespace fundamentals_v2
 
 // basic_string typedef names using polymorphic allocator in namespace
 // std::experimental::pmr
-typedef basic_string<char> string;
+typedef basic_string<char> string __attribute__((__diagnose_as__));
 #ifdef _GLIBCXX_USE_CHAR8_T
-typedef basic_string<char8_t> u8string;
+typedef basic_string<char8_t> u8string __attribute__((__diagnose_as__));
 #endif
-typedef basic_string<char16_t> u16string;
-typedef basic_string<char32_t> u32string;
-typedef basic_string<wchar_t> wstring;
+typedef basic_string<char16_t> u16string __attribute__((__diagnose_as__));
+typedef basic_string<char32_t> u32string __attribute__((__diagnose_as__));
+typedef basic_string<wchar_t> wstring __attribute__((__diagnose_as__));
 
   } // namespace pmr
 #endif
diff --git a/libstdc++-v3/include/std/string b/libstdc++-v3/include/std/string
index af840e887d5..03a3c68050f 100644
--- a/libstdc++-v3/include/std/string
+++ b/libstdc++-v3

[PATCH 0/2] Make use of the diagnose_as attribute to improve libstdc++ diagnostics

2021-11-15 Thread Matthias Kretz
After my two C++ patches for template diagnostics and the diagnose_as 
attribute are in, I'd like to make use of the attribute for std::*string and 
std::pmr::*string as well as for std::experimental::simd diagnostics.

Matthias Kretz (2):
  libstdc++: Use diagnose_as attribute to improve string diagnostics
  libstdc++: Use diagnose_as attribute to improve simd diagnostics

 libstdc++-v3/include/bits/c++config   |  3 +-
 libstdc++-v3/include/bits/stringfwd.h | 10 ++---
 libstdc++-v3/include/debug/string | 10 ++---
 libstdc++-v3/include/experimental/bits/simd.h | 37 +--
 .../include/experimental/bits/simd_detail.h   |  2 +-
 libstdc++-v3/include/experimental/string  | 10 ++---
 libstdc++-v3/include/std/string   | 10 ++---
 7 files changed, 33 insertions(+), 49 deletions(-)

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──



Re: [PATCH 11/11] libstdc++: Fix ODR issues with different -m flags

2021-11-15 Thread Matthias Kretz
ping. OK to push?

On Tuesday, 8 June 2021 14:12:23 CET Matthias Kretz wrote:
> From: Matthias Kretz 
> 
> Explicitly support use of the stdx::simd implementation in situations
> where the user links TUs that were compiled with different -m flags. In
> general, this is always a (quasi) ODR violation for inline functions
> because at least codegen may differ in important ways. However, in the
> resulting executable only one (unspecified which one) of them might be
> used. For simd we want to support users to compile code multiple times,
> with different -m flags and have a runtime dispatch to the TU matching
> the target CPU. But if internal functions are not inlined this may lead
> to unexpected performance loss or execution of illegal instructions.
> Therefore, inline functions that are not marked as always_inline must
> use an additional template parameter somewhere in their name, to
> disambiguate between the different -m translations.
> 
> Signed-off-by: Matthias Kretz 
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/experimental/bits/simd.h: Move feature detection bools
>   and add __have_avx512bitalg, __have_avx512vbmi2,
>   __have_avx512vbmi, __have_avx512ifma, __have_avx512cd,
>   __have_avx512vnni, __have_avx512vpopcntdq.
>   (__detail::__machine_flags): New function which returns a unique
>   uint64 depending on relevant -m and -f flags.
>   (__detail::__odr_helper): New type alias for either an anonymous
>   type or a type specialized with the __machine_flags number.
>   (_SimdIntOperators): Change template parameters from _Impl to
>   _Tp, _Abi because _Impl now has an __odr_helper parameter which
>   may be _OdrEnforcer from the anonymous namespace, which makes
>   for a bad base class.
>   (many): Either add __odr_helper template parameter or mark as
>   always_inline.
>   * include/experimental/bits/simd_detail.h: Add defines for
>   AVX512BITALG, AVX512VBMI2, AVX512VBMI, AVX512IFMA, AVX512CD,
>   AVX512VNNI, AVX512VPOPCNTDQ, and AVX512VP2INTERSECT.
>   * include/experimental/bits/simd_builtin.h: Add __odr_helper
>   template parameter or mark as always_inline.
>   * include/experimental/bits/simd_fixed_size.h: Ditto.
>   * include/experimental/bits/simd_math.h: Ditto.
>   * include/experimental/bits/simd_scalar.h: Ditto.
>   * include/experimental/bits/simd_neon.h: Add __odr_helper
>   template parameter.
>   * include/experimental/bits/simd_ppc.h: Ditto.
>   * include/experimental/bits/simd_x86.h: Ditto.
> ---
>  libstdc++-v3/include/experimental/bits/simd.h | 380 --
>  .../include/experimental/bits/simd_builtin.h  |  41 +-
>  .../include/experimental/bits/simd_detail.h   |  40 ++
>  .../experimental/bits/simd_fixed_size.h   |  39 +-
>  .../include/experimental/bits/simd_math.h |  45 ++-
>  .../include/experimental/bits/simd_neon.h |   4 +-
>  .../include/experimental/bits/simd_ppc.h  |   4 +-
>  .../include/experimental/bits/simd_scalar.h   |  71 +++-
>  .../include/experimental/bits/simd_x86.h  |   4 +-
>  9 files changed, 440 insertions(+), 188 deletions(-)
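
For readers skimming the thread, the mechanism described in the quoted commit message can be sketched as follows. This is a simplified illustration, not the libstdc++ implementation: `machine_flags`, `MachineFlagsTag`, `odr_helper`, `sum4` and `first` are hypothetical names, only a handful of x86 macros are folded into the number, and the anonymous-namespace `_OdrEnforcer` fallback mentioned in the ChangeLog is left out.

```cpp
#include <cassert>
#include <cstdint>

// Fold a few ISA-related predefined macros into one number. A TU compiled
// with -mavx2 gets a different value than one compiled with plain -msse2.
constexpr std::uint64_t machine_flags() {
  std::uint64_t flags = 0;
#ifdef __SSE2__
  flags |= std::uint64_t(1) << 0;
#endif
#ifdef __AVX__
  flags |= std::uint64_t(1) << 1;
#endif
#ifdef __AVX2__
  flags |= std::uint64_t(1) << 2;
#endif
#ifdef __AVX512F__
  flags |= std::uint64_t(1) << 3;
#endif
  return flags;
}

// A type that differs between TUs built with different -m flags.
template <std::uint64_t Flags>
struct MachineFlagsTag {};

using odr_helper = MachineFlagsTag<machine_flags()>;

// Option 1: a defaulted tag parameter. It takes part in name mangling, so
// TUs with different flags emit differently named instantiations and the
// linker cannot merge them into one.
template <typename = odr_helper>
int sum4(const int* p) {
  return p[0] + p[1] + p[2] + p[3];
}

// Option 2: force inlining, so no out-of-line copy is left to be merged.
[[gnu::always_inline]] inline int first(const int* p) { return p[0]; }

inline void check_odr_sketch() {
  const int v[4] = {1, 2, 3, 4};
  assert(sum4(v) == 10);
  assert(first(v) == 1);
}
```

Callers never spell the tag; `sum4(v)` picks up the default, which is what keeps the extra parameter invisible in user code.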

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──





[PATCH v5] c++: Add gnu::diagnose_as attribute

2021-11-14 Thread Matthias Kretz
Sorry for taking so long. I hope we can still get this done for GCC 12.

One open question: If we change std::__cxx11::basic_string to 
std::string with this feature, should DWARF strings change or not? I.e. should 
diagnose_as be conditional on (pp->flags & pp_c_flag_gnu_v3)? If these strings 
are only for user consumption, I think the DWARF strings should be affected by 
the attribute...

Oh, and note that the current patch depends on the "c++: Print function 
template parms when relevant" patch I sent on Nov 8th.

On Wednesday, 8 September 2021 04:21:51 CEST Jason Merrill wrote:
> On 7/23/21 4:58 AM, Matthias Kretz wrote:
> > gcc/cp/ChangeLog:
> >  PR c++/89370
> >  * cp-tree.h: Add is_alias_template_p declaration.
> >  * decl2.c (is_alias_template_p): New function. Determines
> >  whether a given TYPE_DECL is actually an alias template that is
> >  still missing its template_info.
> 
> I still think you want to share code with get_underlying_template.  For
> the case where the alias doesn't have DECL_TEMPLATE_INFO yet, you can
> compare to current_template_args ().  Or you could do some initial
> processing that doesn't care about templates in the handler, and then do
> more in cp_parser_alias_declaration after the call to grokfield/start_decl.

I still don't understand how I could make use of get_underlying_template. I.e. 
I don't even understand how get_underlying_template answers any of the 
questions I need answered. I spent way too much time trying to make this
work...
 
> If you still think you need this function, let's call it
> is_renaming_alias_template or renaming_alias_template_p; using both is_
> and _p is redundant.  I don't have a strong preference which.

OK.
 
> >  (is_late_template_attribute): Decls with diagnose_as attribute
> >  are early attributes only if they are alias templates.
> 
> Is there a reason not to apply it early to other templates as well?

Unconditionally returning false for diagnose_as in is_late_template_attribute 
makes renamed class templates print without template parameter list. E.g.

  template  struct [[diagnose_as("foo")]] A;
  using bar [[diagnose_as]] = A;

  template  struct A {
template  struct B {};
using C [[diagnose_as]] = B;
  };

could query for attributes. So IIUC, member types of class templates require 
late attributes.

> >  * error.c (dump_scope): When printing the name of a namespace,
> >  look for the diagnose_as attribute. If found, print the
> >  associated string instead of calling dump_decl.
> 
> Did you decide not to handle this in dump_decl, so we use the
> diagnose_as when referring to the namespace in non-scope contexts as well?

Good question. dump_decl is the more general place for handling the attribute 
and that's where I moved it to.

> > +  if (flag_diagnostics_use_aliases)
> > +{
> > +  tree attr = lookup_attribute ("diagnose_as", DECL_ATTRIBUTES (decl));
> > +  if (attr && TREE_VALUE (attr))
> > +   {
> > + pp_cxx_ws_string (
> > +   pp, TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr))));
> 
> This pattern is used several places outside this function; can we factor
> it into something like
> 
> if (maybe_print_diagnose_as (special))
>/* OK */;

Yes, I added the functions lookup_diagnose_as_attribute and 
dump_diagnose_as_alias to remove code duplication.

> Missing space before (

OK. I think I found and fixed all of them.

> > + if (tmplate)
> > +   TREE_VALUE (*parms) = make_tree_vec (0);
> 
> This could use a comment.

Added.

> >  (dump_aggr_type): If the type has a diagnose_as attribute, print
> >  the associated string instead of printing the original type
> >  name. Print template parms only if the attribute was not applied
> >  to the instantiation / full specialization. Delay call to
> >  dump_scope until the diagnose_as attribute is found. If the
> >  attribute has a second argument, use it to override the context
> >  passed to dump_scope.
> > 
> > + for (int i = 0; i < NUM_TMPL_ARGS (args); ++i)
> > +   {
> > + tree arg = TREE_VEC_ELT (args, i);
> > + while (INDIRECT_TYPE_P (arg))
> > +   arg = TREE_TYPE (arg);
> > + if (WILDCARD_TYPE_P (arg))
> > +   {
> > + tmplate = true;
> > + break;
> > +   }
> > +   }
> 
> I think you want any_dependent_template_args_p (args)

Yes, except that I need `++pr

[PATCH v3] c-family: Add __builtin_assoc_barrier

2021-11-11 Thread Matthias Kretz
On Wednesday, 8 September 2021 15:49:27 CET Matthias Kretz wrote:
> On Wednesday, 8 September 2021 15:44:28 CEST Jason Merrill wrote:
> > On 9/8/21 5:37 AM, Matthias Kretz wrote:
> > > On Tuesday, 7 September 2021 19:36:22 CEST Jason Merrill wrote:
> > >>> case PAREN_EXPR:
> > > >>> -  RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t,
> > > >>> 0))));
> > > >>> +  if (REF_PARENTHESIZED_P (t))
> > > >>> +   RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t,
> > > >>> 0))));
> > >>> +  else
> > >>> +   RETURN (RECUR (TREE_OPERAND (t, 0)));
> > >> 
> > >> I think you need to build a new PAREN_EXPR in the assoc barrier case as
> > >> well, for it to have any effect in templates.
> > > 
> > > My intent was to ignore __builtin_assoc_barrier in templates / constexpr
> > > evaluation since it's not affected by -fassociative-math anyway. Or do
> > > you
> > > mean something else?
> > 
> > I agree about constexpr, but why wouldn't template instantiations be
> > affected by -fassociative-math like any other function?
> 
> Oh, that seems like a major misunderstanding on my part. I assumed
> tsubst_copy_and_build would evaluate the expressions in template arguments.
> I'll expand the test and will fix.

Sorry for the long delay. New patch is attached. OK for trunk?


New builtin to enable explicit use of PAREN_EXPR in C & C++ code.

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

* c-c++-common/builtin-assoc-barrier-1.c: New test.

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_constant_expression): Handle PAREN_EXPR
via cxx_eval_constant_expression.
* cp-objcp-common.c (names_builtin_p): Handle
RID_BUILTIN_ASSOC_BARRIER.
* cp-tree.h: Adjust TREE_LANG_FLAG documentation to include
PAREN_EXPR in REF_PARENTHESIZED_P.
(REF_PARENTHESIZED_P): Add PAREN_EXPR.
* parser.c (cp_parser_postfix_expression): Handle
RID_BUILTIN_ASSOC_BARRIER.
* pt.c (tsubst_copy_and_build): If the PAREN_EXPR is not a
parenthesized initializer, build a new PAREN_EXPR.
* semantics.c (force_paren_expr): Simplify conditionals. Set
REF_PARENTHESIZED_P on PAREN_EXPR.
(maybe_undo_parenthesized_ref): Test PAREN_EXPR for
REF_PARENTHESIZED_P.

gcc/c-family/ChangeLog:

* c-common.c (c_common_reswords): Add __builtin_assoc_barrier.
* c-common.h (enum rid): Add RID_BUILTIN_ASSOC_BARRIER.

gcc/c/ChangeLog:

* c-decl.c (names_builtin_p): Handle RID_BUILTIN_ASSOC_BARRIER.
* c-parser.c (c_parser_postfix_expression): Likewise.

gcc/ChangeLog:

* doc/extend.texi: Document __builtin_assoc_barrier.
---
 gcc/c-family/c-common.c   |  1 +
 gcc/c-family/c-common.h   |  2 +-
 gcc/c/c-decl.c|  1 +
 gcc/c/c-parser.c  | 20 ++
 gcc/cp/constexpr.c|  8 +++
 gcc/cp/cp-objcp-common.c  |  1 +
 gcc/cp/cp-tree.h  | 12 ++--
 gcc/cp/parser.c   | 14 
 gcc/cp/pt.c   | 10 ++-
 gcc/cp/semantics.c| 23 ++
 gcc/doc/extend.texi   | 18 +
 .../c-c++-common/builtin-assoc-barrier-1.c    | 71 +++
 12 files changed, 158 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/builtin-assoc-barrier-1.c
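
To make the purpose of the builtin concrete, here is a small usage sketch. It assumes a compiler that provides `__builtin_assoc_barrier` and reports it through `__has_builtin`; elsewhere the hypothetical `ASSOC_BARRIER` macro falls back to plain parentheses, which gives no protection under -fassociative-math. The function name `fast_two_sum_error` is likewise made up for this example.

```cpp
#include <cassert>

// Use the builtin where the compiler reports it, otherwise fall back to
// ordinary parentheses (which -fassociative-math is free to ignore).
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assoc_barrier)
#    define ASSOC_BARRIER(x) __builtin_assoc_barrier(x)
#  endif
#endif
#ifndef ASSOC_BARRIER
#  define ASSOC_BARRIER(x) (x)
#endif

// Rounding error of a + b for |a| >= |b| (the Fast2Sum error term).
// Algebraically b - ((a + b) - a) is zero, so a compiler that may
// re-associate floating-point math could fold it away. The barriers keep
// the intermediate results from being combined with surrounding operations.
inline double fast_two_sum_error(double a, double b) {
  const double s = ASSOC_BARRIER(a + b);
  const double b_virtual = ASSOC_BARRIER(s - a);
  return b - b_virtual;
}

inline void check_assoc_barrier_sketch() {
  // 2^-60 is far below half an ulp of 1.0, so 1.0 + tiny rounds to 1.0
  // and the whole of tiny is reported as the rounding error.
  const double tiny = 1.0 / (1ull << 60);
  assert(fast_two_sum_error(1.0, tiny) == tiny);
  // 1.0 + 0.5 is exact, so there is no error to report.
  assert(fast_two_sum_error(1.0, 0.5) == 0.0);
}
```

The same function placed in a template is the case discussed in the review above: the barrier only survives instantiation if the PAREN_EXPR is rebuilt during substitution.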


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 436df45df68..dd2a3d5da9e 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -384,6 +384,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
   { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
   { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
+  { "__builtin_assoc_barrier", RID_BUILTIN_ASSOC_BARRIER, 0 },
   { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
   { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 },
   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index d5dad99ff97..c089fda12e4 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -108,7 +108,7 @@ enum rid
   

Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-08 Thread Matthias Kretz
I forgot to mention why I tagged it [RFC]: I needed one more bit of 
information on the template args TREE_VEC to encode EXPLICIT_TEMPLATE_ARGS_P. 
Its TREE_CHAIN already points to an integer constant denoting the number of 
non-default arguments, so I couldn't trivially replace that. Therefore, I used 
the sign of that integer. I was hoping to find a cleaner solution, though.

-Matthias

On Monday, 8 November 2021 17:40:44 CET Matthias Kretz wrote:
> On Tuesday, 17 August 2021 20:31:54 CET Jason Merrill wrote:
> > > 2. Given a DECL_TI_ARGS tree, can I query whether an argument was
> > > deduced
> > > or explicitly specified? I'm asking because I still consider diagnostics
> > > of function templates unfortunate. `template  void f()` is
> > > fine,
> > > as is `void f(T) [with T = float]`, but `void f() [with T = float]`
> > > could
> > > be better. I.e. if the template parameter appears somewhere in the
> > > function parameter list, dump_template_parms would only produce noise.
> > > If, however, the template parameter was given explicitly, it would be
> > > nice if it could show up accordingly in diagnostics.
> > 
> > NON_DEFAULT_TEMPLATE_ARGS_COUNT has that information, though there are
> > some issues with it.  Attached is my WIP from May to improve it
> > somewhat, if that's interesting.
> 
> It is interesting. I used your patch to come up with the attached patch. I
> must say, I didn't try to read through all the cp/pt.c code to understand
> all of what you did there (which is why my ChangeLog entry says "Jason?"),
> but it works for me (and all of `make check`).
> 
> Anyway, I'd like to propose the following before finishing my diagnose_as
> patch. I believe it's useful to fix this part first. The diagnostic/default-
> template-args-[12].C tests show a lot of examples of the intent of this
> patch. And the remaining changes to the testsuite show how it changes
> diagnostic output.
> 
> -- 8< 
> 
> The choice when to print a function template parameter was still
> suboptimal. That's because sometimes the function template parameter
> list only adds noise, while in other situations the lack of a function
> template parameter list makes diagnostic messages hard to understand.
> 
> The general idea of this change is to print template parms wherever they
> would appear in the source code as well. Thus, the diagnostics code
> needs to know whether any template parameter was given explicitly.
> 
> Signed-off-by: Matthias Kretz 
> 
> gcc/testsuite/ChangeLog:
> 
> * g++.dg/debug/dwarf2/template-params-12n.C: Optionally, allow
> DW_AT_default_value.
> * g++.dg/diagnostic/default-template-args-1.C: New.
> * g++.dg/diagnostic/default-template-args-2.C: New.
> * g++.dg/diagnostic/param-type-mismatch-2.C: Expect template
> parms in diagnostic.
> * g++.dg/ext/pretty1.C: Expect function template specialization
> to not pretty-print template parms.
> * g++.old-deja/g++.ext/pretty3.C: Ditto.
> * g++.old-deja/g++.pt/memtemp77.C: Ditto.
> * g++.dg/goacc/template.C: Expect function template parms for
> explicit arguments.
> * g++.dg/gomp/declare-variant-7.C: Expect no function template
> parms for deduced arguments.
> * g++.dg/template/error40.C: Expect only non-default template
> arguments in diagnostic.
> 
> gcc/cp/ChangeLog:
> 
> * cp-tree.h (GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT): Return
> absolute value of stored constant.
> (EXPLICIT_TEMPLATE_ARGS_P): New.
> (SET_EXPLICIT_TEMPLATE_ARGS_P): New.
> (TFF_AS_PRIMARY): New constant.
> * error.c (get_non_default_template_args_count): Avoid
> GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT if
> NON_DEFAULT_TEMPLATE_ARGS_COUNT is a NULL_TREE. Make independent
> of flag_pretty_templates.
> (dump_template_bindings): Add flags parameter to be passed to
> get_non_default_template_args_count. Print only non-default
> template arguments.
> (dump_function_decl): Call dump_function_name and dump_type of
> the DECL_CONTEXT with specialized template and set
> TFF_AS_PRIMARY for their flags.
> (dump_function_name): Add and document conditions for calling
> dump_template_parms.
> (dump_template_parms): Print only non-default template
> parameters.
> * pt.c (determine_specialization): Jason?
> (template_parms_level_to_args): Jason?
> (copy_template_args): Jason?
> (fn_type_unification): Set EXPLICIT_TEMPL

[RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-08 Thread Matthias Kretz
On Tuesday, 17 August 2021 20:31:54 CET Jason Merrill wrote:
> > 2. Given a DECL_TI_ARGS tree, can I query whether an argument was deduced
> > or explicitly specified? I'm asking because I still consider diagnostics
> > of function templates unfortunate. `template  void f()` is fine,
> > as is `void f(T) [with T = float]`, but `void f() [with T = float]` could
> > be better. I.e. if the template parameter appears somewhere in the
> > function parameter list, dump_template_parms would only produce noise.
> > If, however, the template parameter was given explicitly, it would be
> > nice if it could show up accordingly in diagnostics.
> 
> NON_DEFAULT_TEMPLATE_ARGS_COUNT has that information, though there are
> some issues with it.  Attached is my WIP from May to improve it
> somewhat, if that's interesting.

It is interesting. I used your patch to come up with the attached patch. I
must say, I didn't try to read through all the cp/pt.c code to understand all 
of what you did there (which is why my ChangeLog entry says "Jason?"), but it 
works for me (and all of `make check`).

Anyway, I'd like to propose the following before finishing my diagnose_as 
patch. I believe it's useful to fix this part first. The diagnostic/default-
template-args-[12].C tests show a lot of examples of the intent of this patch. 
And the remaining changes to the testsuite show how it changes diagnostic 
output.

-- 8< 

The choice when to print a function template parameter was still
suboptimal. That's because sometimes the function template parameter
list only adds noise, while in other situations the lack of a function
template parameter list makes diagnostic messages hard to understand.

The general idea of this change is to print template parms wherever they
would appear in the source code as well. Thus, the diagnostics code
needs to know whether any template parameter was given explicitly.
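
The distinction the diagnostics code has to draw can be seen in a few lines of user code. This is only an illustration with made-up function names; the exact diagnostic text produced by the patch is not reproduced here.

```cpp
#include <cassert>

// T appears in no function parameter, so every caller has to write the
// template argument explicitly: size_in_bytes<float>().
template <class T>
constexpr unsigned size_in_bytes() {
  return sizeof(T);
}

// T is deduced from the argument, so callers normally write no template
// argument list at all: size_in_bytes_of(1.0f).
template <class T>
constexpr unsigned size_in_bytes_of(T) {
  return sizeof(T);
}

// U has a default, so callers usually write only the first argument:
// same_size<int>().
template <class T, class U = int>
constexpr bool same_size() {
  return sizeof(T) == sizeof(U);
}

static_assert(size_in_bytes<float>() == sizeof(float), "explicit argument");
static_assert(size_in_bytes_of(1.0f) == sizeof(float), "deduced argument");
static_assert(same_size<int>(), "defaulted second argument");

inline void check_template_arg_forms() {
  assert(size_in_bytes<char>() == 1);
  assert(size_in_bytes_of('x') == 1);
}
```

Following the stated idea of printing template parms wherever they would appear in the source, a diagnostic naming the first call would carry the argument list, one naming the second would not, and the third would show only the non-default argument.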

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

* g++.dg/debug/dwarf2/template-params-12n.C: Optionally, allow
DW_AT_default_value.
* g++.dg/diagnostic/default-template-args-1.C: New.
* g++.dg/diagnostic/default-template-args-2.C: New.
* g++.dg/diagnostic/param-type-mismatch-2.C: Expect template
parms in diagnostic.
* g++.dg/ext/pretty1.C: Expect function template specialization
to not pretty-print template parms.
* g++.old-deja/g++.ext/pretty3.C: Ditto.
* g++.old-deja/g++.pt/memtemp77.C: Ditto.
* g++.dg/goacc/template.C: Expect function template parms for
explicit arguments.
* g++.dg/gomp/declare-variant-7.C: Expect no function template
parms for deduced arguments.
* g++.dg/template/error40.C: Expect only non-default template
arguments in diagnostic.

gcc/cp/ChangeLog:

* cp-tree.h (GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT): Return
absolute value of stored constant.
(EXPLICIT_TEMPLATE_ARGS_P): New.
(SET_EXPLICIT_TEMPLATE_ARGS_P): New.
(TFF_AS_PRIMARY): New constant.
* error.c (get_non_default_template_args_count): Avoid
GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT if
NON_DEFAULT_TEMPLATE_ARGS_COUNT is a NULL_TREE. Make independent
of flag_pretty_templates.
(dump_template_bindings): Add flags parameter to be passed to
get_non_default_template_args_count. Print only non-default
template arguments.
(dump_function_decl): Call dump_function_name and dump_type of
the DECL_CONTEXT with specialized template and set
TFF_AS_PRIMARY for their flags.
(dump_function_name): Add and document conditions for calling
dump_template_parms.
(dump_template_parms): Print only non-default template
parameters.
* pt.c (determine_specialization): Jason?
(template_parms_level_to_args): Jason?
(copy_template_args): Jason?
(fn_type_unification): Set EXPLICIT_TEMPLATE_ARGS_P on the
template arguments tree if any template parameter was explicitly
given.
(type_unification_real): Jason?
(get_partial_spec_bindings): Jason?
(tsubst_template_args): Determine number of defaulted arguments
from new argument vector, if possible.
---
 gcc/cp/cp-tree.h  | 18 +++-
 gcc/cp/error.c| 83 ++-
 gcc/cp/pt.c   | 58 +
 .../g++.dg/debug/dwarf2/template-params-12n.C |  2 +-
 .../diagnostic/default-template-args-1.C  | 73 
 .../diagnostic/default-template-args-2.C  | 37 +
 .../g++.dg/diagnostic/param-type-mismatch-2.C |  2 +-
 gcc/testsuite/g++.dg/ext/pretty1.C|  2 +-
 gcc/testsuite/g++.dg/goacc/template.C |  8 +-
 gcc/testsuite/g++.dg/gomp/declare-varian

Re: [PATCH v2] c-family: Add __builtin_assoc_barrier

2021-09-08 Thread Matthias Kretz
On Wednesday, 8 September 2021 15:44:28 CEST Jason Merrill wrote:
> On 9/8/21 5:37 AM, Matthias Kretz wrote:
> > On Tuesday, 7 September 2021 19:36:22 CEST Jason Merrill wrote:
> >>> case PAREN_EXPR:
> >>> -  RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t, 0))));
> >>> +  if (REF_PARENTHESIZED_P (t))
> >>> +   RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t,
> >>> 0))));
> >>> +  else
> >>> +   RETURN (RECUR (TREE_OPERAND (t, 0)));
> >> 
> >> I think you need to build a new PAREN_EXPR in the assoc barrier case as
> >> well, for it to have any effect in templates.
> > 
> > My intent was to ignore __builtin_assoc_barrier in templates / constexpr
> > evaluation since it's not affected by -fassociative-math anyway. Or do you
> > mean something else?
> 
> I agree about constexpr, but why wouldn't template instantiations be
> affected by -fassociative-math like any other function?

Oh, that seems like a major misunderstanding on my part. I assumed 
tsubst_copy_and_build would evaluate the expressions in template arguments.
I'll expand the test and will fix.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


Re: [PATCH v2] c-family: Add __builtin_assoc_barrier

2021-09-08 Thread Matthias Kretz
On Tuesday, 7 September 2021 19:36:22 CEST Jason Merrill wrote:
> > case PAREN_EXPR:
> > -  RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t, 0))));
> > +  if (REF_PARENTHESIZED_P (t))
> > +   RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t, 0))));
> > +  else
> > +   RETURN (RECUR (TREE_OPERAND (t, 0)));
> 
> I think you need to build a new PAREN_EXPR in the assoc barrier case as
> well, for it to have any effect in templates.

My intent was to ignore __builtin_assoc_barrier in templates / constexpr 
evaluation since it's not affected by -fassociative-math anyway. Or do you 
mean something else?

> Please also add a comment mentioning __builtin_assoc_barrier.

I added a comment to that effect to both the cp/pt.c and cp/constexpr.c 
changes.

New patch attached.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 681fcc972f4..c62a6398a47 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -384,6 +384,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
   { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
   { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
+  { "__builtin_assoc_barrier", RID_BUILTIN_ASSOC_BARRIER, 0 },
   { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
   { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 },
   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 50ca8fb6ebd..f34dc47c2ba 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -108,7 +108,7 @@ enum rid
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,  RID_CHOOSE_EXPR,
   RID_TYPES_COMPATIBLE_P,  RID_BUILTIN_COMPLEX,	 RID_BUILTIN_SHUFFLE,
   RID_BUILTIN_SHUFFLEVECTOR,   RID_BUILTIN_CONVERTVECTOR,   RID_BUILTIN_TGMATH,
-  RID_BUILTIN_HAS_ATTRIBUTE,
+  RID_BUILTIN_HAS_ATTRIBUTE,   RID_BUILTIN_ASSOC_BARRIER,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
 
   /* TS 18661-3 keywords, in the same sequence as the TI_* values.  */
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 983d65e930c..dcf4a2d7c32 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -10557,6 +10557,7 @@ names_builtin_p (const char *name)
 case RID_BUILTIN_HAS_ATTRIBUTE:
 case RID_BUILTIN_SHUFFLE:
 case RID_BUILTIN_SHUFFLEVECTOR:
+case RID_BUILTIN_ASSOC_BARRIER:
 case RID_CHOOSE_EXPR:
 case RID_OFFSETOF:
 case RID_TYPES_COMPATIBLE_P:
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9a56e0c04c6..fffd81f4e5b 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -8931,6 +8931,7 @@ c_parser_predefined_identifier (c_parser *parser)
 			 assignment-expression ,
 			 assignment-expression, )
  __builtin_convertvector ( assignment-expression , type-name )
+ __builtin_assoc_barrier ( assignment-expression )
 
offsetof-member-designator:
  identifier
@@ -10076,6 +10077,25 @@ c_parser_postfix_expression (c_parser *parser)
 	  }
 	  }
 	  break;
+	case RID_BUILTIN_ASSOC_BARRIER:
+	  {
+	location_t start_loc = loc;
+	c_parser_consume_token (parser);
+	matching_parens parens;
+	if (!parens.require_open (parser))
+	  {
+		expr.set_error ();
+		break;
+	  }
+	e1 = c_parser_expr_no_commas (parser, NULL);
+	mark_exp_read (e1.value);
+	location_t end_loc = c_parser_peek_token (parser)->get_finish ();
+	parens.skip_until_found_close (parser);
+	expr.value = build1_loc (loc, PAREN_EXPR, TREE_TYPE (e1.value),
+ e1.value);
+	set_c_expr_source_range (&expr, start_loc, end_loc);
+	  }
+	  break;
 	case RID_AT_SELECTOR:
 	  {
 	gcc_assert (c_dialect_objc ());
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 31fa5b66865..6e964837d24 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -6730,6 +6730,14 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
    non_constant_p, overflow_p);
   break;
 
+case PAREN_EXPR:
+  gcc_assert (!REF_PARENTHESIZED_P (t));
+  /* A PAREN_EXPR resulting from __builtin_assoc_barrier has no effect in
+ constant expressions since it's unaffected by -fassociative-math.  */
+  r = cxx_eval_constant_expression (ctx, TREE_OPERAND (t, 0), lval,
+	non_constant_p, overflow_p);
+  break;
+
 case NOP_EXPR:
   if (REINTERPRET_CAST_P (t))
 	{
diff --git a/gcc/cp/cp-objcp-common.c b/gcc/cp/cp-objcp-common.c
index ee255732d5a..04522a23eda 100644
--- a/gcc/cp/cp-objcp-common.c
+++ b

Re: [PATCH v2] c-family: Add __builtin_assoc_barrier

2021-09-06 Thread Matthias Kretz
On Monday, 6 September 2021 14:59:27 CEST Richard Biener wrote:
> On Mon, 6 Sep 2021, Matthias Kretz wrote:
> > On Monday, 6 September 2021 14:40:31 CEST Richard Biener wrote:
> > > I'll note that currently a + PAREN_EXPR (b * c) is for example
> > > also not contracted to PAREN_EXPR (FMA (PAREN_EXPR (a), b, c))
> > > even though technically FP contraction is not association.  But
> > > that's an implementation detail that could be changed.  There
> > > are likely other transforms that it prevents as well that are
> > > not associations, the implementation focus was correctness
> > > as to preventing association, not so much not hindering
> > > unrelated optimizations.  If you run into any such issues
> > > reporting a bugzilla would be welcome.
> > 
> > Thanks, interesting point. I believe it might even be useful to nail down
> > that behavior (i.e. document it and write a test). Because a + b * c
> > evaluates b * c before the addition in any case. So why would anyone add
> > a PAREN_EXPR around b * c?
> 
> At least for integers we have transforms that do a + a * c -> a * (1 + c)
> so one could think of (x/y) + (x/y)*c -> (x/y) * (1 + c) which would
> then have associated the c * (x/y) multiplication ...  Or when
> c is constant then a + a * C can be simplified.

Right, given float a, `a + 2.1f * a` is compiled to `3.1f * a` with
-ffast-math. So yes, there's a reason one might want `a + 
__builtin_assoc_barrier(2.1f * a)` without inhibiting contraction. I'll 
investigate more and might submit a PR...
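
A minimal sketch of that case, assuming a compiler that reports `__builtin_assoc_barrier` via `__has_builtin`. The `ASSOC_BARRIER` macro and the function name `scaled_sum` are hypothetical, and the fallback is plain parentheses, which give no protection under -ffast-math:

```cpp
#include <cassert>
#include <cmath>

// Use the builtin where the compiler reports it; otherwise fall back to
// plain parentheses (which do NOT stop re-association under -ffast-math).
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assoc_barrier)
#    define ASSOC_BARRIER(x) __builtin_assoc_barrier(x)
#  endif
#endif
#ifndef ASSOC_BARRIER
#  define ASSOC_BARRIER(x) (x)
#endif

// With -ffast-math, `a + 2.1f * a` may be rewritten to `3.1f * a`.
// The barrier asks the compiler to keep the product as its own operand.
inline float scaled_sum(float a)
{
  assert(std::isfinite(a));  // illustration only: keep the input well-behaved
  return a + ASSOC_BARRIER(2.1f * a);
}
```

Calling `scaled_sum(1.0f)` yields a value close to 3.1f either way; the difference only shows up in which operations the optimizer is allowed to merge.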

> > We have (std::)fma (__builtin_fma) to explicitly request contraction.
> > PAREN_EXPR seems like a good fit to inhibit contraction.
> 
> OK, I guess it should apply to PAREN_EXPR (a + a) + a as well
> which then does not become 3 * PAREN_EXPR (a).  Likewise
> PAREN_EXPR (a) - a might eventually not become zero (I'm not
> absolutely sure about that ;))

Just tested it. PAREN_EXPR inhibits both transformations.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


Re: [PATCH v2] c-family: Add __builtin_assoc_barrier

2021-09-06 Thread Matthias Kretz
On Monday, 6 September 2021 14:40:31 CEST Richard Biener wrote:
> I'll note that currently a + PAREN_EXPR (b * c) is for example
> also not contracted to PAREN_EXPR (FMA (PAREN_EXPR (a), b, c))
> even though technically FP contraction is not association.  But
> that's an implementation detail that could be changed.  There
> are likely other transforms that it prevents as well that are
> not associations, the implementation focus was correctness
> as to preventing association, not so much not hindering
> unrelated optimizations.  If you run into any such issues
> reporting a bugzilla would be welcome.

Thanks, interesting point. I believe it might even be useful to nail down that 
behavior (i.e. document it and write a test). Because a + b * c evaluates b * 
c before the addition in any case. So why would anyone add a PAREN_EXPR around 
b * c?

We have (std::)fma (__builtin_fma) to explicitly request contraction. 
PAREN_EXPR seems like a good fit to inhibit contraction.
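
To illustrate what requesting contraction explicitly buys, here is a small sketch using `std::fma` from `<cmath>`; the function name `fma_residual` is made up for this example. It relies only on the fact that a fused multiply-add rounds once:

```cpp
#include <cassert>
#include <cmath>

// x = 1 + 2^-27, so x*x = 1 + 2^-26 + 2^-54 exactly.
// Rounded to double, the product alone loses the 2^-54 term;
// std::fma rounds only once, after adding -p, so the term survives.
inline double fma_residual()
{
  const double x = 1.0 + std::ldexp(1.0, -27);
  const double p = 1.0 + std::ldexp(1.0, -26);  // x*x rounded to double
  return std::fma(x, x, -p);                    // exactly 2^-54
}

// Small self-check for the claim in the comment above.
inline void check_fma_residual()
{
  assert(fma_residual() == std::ldexp(1.0, -54));
}
```

Whether a plain `x * x - p` gives 0 or 2^-54 depends on whether the compiler contracts it, which is exactly the behavior a barrier could pin down.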

-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


[PATCH v2] c-family: Add __builtin_assoc_barrier

2021-09-06 Thread Matthias Kretz
Hi,

On Tuesday, 20 July 2021 22:22:02 CEST Jason Merrill wrote:
> The C++ front end already uses PAREN_EXPR in templates to indicate
> parenthesized initializers in cases where that matters for
> decltype(auto).  It should be fine to use it for both that and
> __builtin_assoc_barrier, but you probably want to distinguish them with
> a TREE_LANG_FLAG, and change tsubst_copy_and_build to keep the
> PAREN_EXPR in this case.

I reused REF_PARENTHESIZED_P for PAREN_EXPR.

> For constexpr you probably just need to add handling to
> cxx_eval_constant_expression to evaluate its operand instead.

OK, that was easy.

On Monday, 19 July 2021 14:34:12 CEST Richard Biener wrote:
> On Mon, 19 Jul 2021, Matthias Kretz wrote:
> > tested on x86_64-pc-linux-gnu with no new failures. OK for master?
> 
> I think now that PAREN_EXPR can appear in C++ code you need to
> adjust some machinery to expect it (constexpr folding?  template stuff?).
> I suggest to add some testcases covering templates and constexpr
> functions.

Right. I expanded the test.
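
A hypothetical sketch of what template and constexpr coverage could look like (this is not the test from the patch; `ASSOC_BARRIER` and `plus_minus` are names invented here, and the macro degrades to plain parentheses where the builtin is missing):

```cpp
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assoc_barrier)
#    define ASSOC_BARRIER(x) __builtin_assoc_barrier(x)
#  endif
#endif
#ifndef ASSOC_BARRIER
#  define ASSOC_BARRIER(x) (x)
#endif

// (x + y) is rounded first, then y is subtracted from the rounded value.
template <class T>
constexpr T plus_minus(T x, T y)
{
  return ASSOC_BARRIER(x + y) - y;
}

// In float, 1 + 1e20 rounds to 1e20, so the barriered result is 0, not 1.
static_assert(plus_minus(1.f, 1.e20f) == 0.f,
              "the sum must be evaluated before the subtraction");
```

Constant evaluation does not re-associate, so the `static_assert` checks only that the barrier is accepted in a constexpr function template and evaluates its operand.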

> +@deftypefn {Built-in Function} @var{type} __builtin_assoc_barrier (@var{type} @var{expr})
> +This built-in represents a re-association barrier for the floating-point
> +expression @var{expr} with operations following the built-in. The expression
> +@var{expr} itself can be reordered, and the whole expression @var{expr} can be
> +reordered with operations after the barrier.
> 
> What operations follow the built-in also applies to operations leading
> the builtin?  Maybe "This built-in represents a re-association barrier
> for the floating-point expression @var{expr} with the expression
> consuming its value."  But I'm not an english speaker - I guess
> I'm mostly confused about "follow" here.

With "follow" I meant time / precedence and not that the operation follows 
syntactically. So e.g. a + b * c: the addition follows after the 
multiplication. It's probably not as precise as it could/should be. Also "the 
whole expression @var{expr} can be reordered with operations after the 
barrier" probably should say "with operands" not "with operations", right?

> I'm not sure if there are better C/C++ language terms describing what
> the builtin does, but basically it appears as opaque operand to the
> surrounding expression and the surrounding expression is opaque
> to the expression inside the parens.

I can't think of any other term that would help here.

Based upon your suggestion, the attached patch now says:
"This built-in inhibits re-association of the floating-point expression 
@var{expr} with expressions consuming the return value of the built-in. The 
expression @var{expr} itself can be reordered, and the whole expression 
@var{expr} can be reordered with operands after the barrier. [...]"
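
As an illustration of the wording above, compensated (Kahan) summation is a typical candidate for such a barrier. This is a sketch under assumptions: `ASSOC_BARRIER` and `kahan_sum` are hypothetical names, the fallback is plain parentheses, and it has not been verified that both barriers are strictly required:

```cpp
#include <cassert>
#include <cstddef>

#if defined(__has_builtin)
#  if __has_builtin(__builtin_assoc_barrier)
#    define ASSOC_BARRIER(x) __builtin_assoc_barrier(x)
#  endif
#endif
#ifndef ASSOC_BARRIER
#  define ASSOC_BARRIER(x) (x)
#endif

// Compensated summation: `c` carries the rounding error of each addition.
// Algebraically (t - sum) - y is zero, so a re-associating optimizer may
// delete the compensation; the barriers keep each step as its own operand.
inline float kahan_sum(const float* data, std::size_t n)
{
  assert(data != nullptr || n == 0);
  float sum = 0.f;
  float c = 0.f;
  for (std::size_t i = 0; i < n; ++i)
    {
      const float y = data[i] - c;
      const float t = ASSOC_BARRIER(sum + y);
      c = ASSOC_BARRIER(t - sum) - y;
      sum = t;
    }
  return sum;
}
```

Without -fassociative-math the barriers change nothing; they only matter once the compiler is permitted to treat floating-point addition as associative.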

New patch attached. OK to push?

---

New builtin to enable explicit use of PAREN_EXPR in C & C++ code.

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

* c-c++-common/builtin-assoc-barrier-1.c: New test.

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_constant_expression): Handle PAREN_EXPR
via cxx_eval_constant_expression.
* cp-objcp-common.c (names_builtin_p): Handle
RID_BUILTIN_ASSOC_BARRIER.
* cp-tree.h: Adjust TREE_LANG_FLAG documentation to include
PAREN_EXPR in REF_PARENTHESIZED_P.
(REF_PARENTHESIZED_P): Add PAREN_EXPR.
* parser.c (cp_parser_postfix_expression): Handle
RID_BUILTIN_ASSOC_BARRIER.
* pt.c (tsubst_copy_and_build): If the PAREN_EXPR is not a
parenthesized initializer, evaluate by ignoring the PAREN_EXPR.
* semantics.c (force_paren_expr): Simplify conditionals. Set
REF_PARENTHESIZED_P on PAREN_EXPR.
(maybe_undo_parenthesized_ref): Test PAREN_EXPR for
REF_PARENTHESIZED_P.

gcc/c-family/ChangeLog:

* c-common.c (c_common_reswords): Add __builtin_assoc_barrier.
* c-common.h (enum rid): Add RID_BUILTIN_ASSOC_BARRIER.

gcc/c/ChangeLog:

* c-decl.c (names_builtin_p): Handle RID_BUILTIN_ASSOC_BARRIER.
* c-parser.c (c_parser_postfix_expression): Likewise.

gcc/ChangeLog:

* doc/extend.texi: Document __builtin_assoc_barrier.
---
 gcc/c-family/c-common.c   |  1 +
 gcc/c-family/c-common.h   |  2 +-
 gcc/c/c-decl.c|  1 +
 gcc/c/c-parser.c  | 20 
 gcc/cp/constexpr.c|  6 +++
 gcc/cp/cp-objcp-common.c  |  1 +
 gcc/cp/cp-tree.h  | 12 +++--
 gcc/cp/parser.c   | 14 ++
 gcc/cp/pt.c   |  5 +-
 gcc/cp/semantics.c  

ping-3: [PATCH] c-family: Add more predefined macros for math flags

2021-07-27 Thread Matthias Kretz
OK?

On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote:
> Library code, especially in headers, sometimes needs to know how the
> compiler interprets / optimizes floating-point types and operations.
> This information can be used for additional optimizations or for
> ensuring correctness. This change makes -freciprocal-math,
> -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and
> -frounding-math report their state via corresponding pre-defined macros.
> 
> Signed-off-by: Matthias Kretz 
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/associative-math-1.c: New test.
>   * gcc.dg/associative-math-2.c: New test.
>   * gcc.dg/no-signed-zeros-1.c: New test.
>   * gcc.dg/no-signed-zeros-2.c: New test.
>   * gcc.dg/no-trapping-math-1.c: New test.
>   * gcc.dg/no-trapping-math-2.c: New test.
>   * gcc.dg/reciprocal-math-1.c: New test.
>   * gcc.dg/reciprocal-math-2.c: New test.
>   * gcc.dg/rounding-math-1.c: New test.
>   * gcc.dg/rounding-math-2.c: New test.
> 
> gcc/c-family/ChangeLog:
> 
>   * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or
>   undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
>   __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
>   __ROUNDING_MATH__ according to the new optimization flags.
> 
> gcc/ChangeLog:
> 
>   * cppbuiltin.c (define_builtin_macros_for_compilation_flags):
>   Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
>   __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
>   __ROUNDING_MATH__ according to their corresponding flags.
>   * doc/cpp.texi: Document __RECIPROCAL_MATH__,
>   __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__,
>   and __ROUNDING_MATH__.
> ---
>  gcc/c-family/c-cppbuiltin.c   | 25 +++
>  gcc/cppbuiltin.c  | 10 +
>  gcc/doc/cpp.texi  | 18 
>  gcc/testsuite/gcc.dg/associative-math-1.c | 17 +++
>  gcc/testsuite/gcc.dg/associative-math-2.c | 17 +++
>  gcc/testsuite/gcc.dg/no-signed-zeros-1.c  | 17 +++
>  gcc/testsuite/gcc.dg/no-signed-zeros-2.c  | 17 +++
>  gcc/testsuite/gcc.dg/no-trapping-math-1.c | 17 +++
>  gcc/testsuite/gcc.dg/no-trapping-math-2.c | 17 +++
>  gcc/testsuite/gcc.dg/reciprocal-math-1.c  | 17 +++
>  gcc/testsuite/gcc.dg/reciprocal-math-2.c  | 17 +++
>  gcc/testsuite/gcc.dg/rounding-math-1.c| 17 +++
>  gcc/testsuite/gcc.dg/rounding-math-2.c| 17 +++
>  13 files changed, 223 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/associative-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/associative-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/rounding-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/rounding-math-2.c
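
A sketch of how header code might consume the macros named above. The macro names are the ones proposed in this patch; the helper functions are hypothetical, and a compiler that does not define the macros simply takes the `#else` branches:

```cpp
// Reports whether this translation unit is compiled with re-association
// enabled, based on the macro proposed in the patch.
constexpr bool reassociation_enabled()
{
#ifdef __ASSOCIATIVE_MATH__
  return true;
#else
  return false;  // flag off, or a compiler that does not define the macro
#endif
}

// Signed zeros are honoured unless -fno-signed-zeros is in effect.
constexpr bool signed_zeros_honoured()
{
#ifdef __NO_SIGNED_ZEROS__
  return false;
#else
  return true;
#endif
}
```

A library could branch on these helpers, for example to pick a compensated algorithm only when re-association is off.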


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index f79f939bd10..671af04b1f8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -628,6 +628,31 @@ c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree,
   cpp_undef (pfile, "__FINITE_MATH_ONLY__");
   cpp_define_unused (pfile, "__FINITE_MATH_ONLY__=0");
 }
+
+  if (!prev->x_flag_reciprocal_math && cur->x_flag_reciprocal_math)
+cpp_define_unused (pfile, "__RECIPROCAL_MATH__");
+  else if (prev->x_flag_reciprocal_math && !cur->x_flag_reciprocal_math)
+cpp_undef (pfile, "__RECIPROCAL_MATH__");
+
+  if (!prev->x_flag_signed_zeros && cur->x_flag_signed_zeros)
+cpp_undef (pfile, "__NO_SIGNED_ZEROS__");
+  else if (prev->x_flag_signed_zeros && !cur->x_flag_signed_zeros)
+cpp_define_unused (pfile, "__NO_SIGNED_ZEROS__");
+
+  if (!prev->x_flag_trapping_math && cur->x_flag_trapping_math)
+cpp_undef (pfile, "__NO_TRAPPING_MATH__");
+  else if (prev->x_

[PATCH v4] c++: Add gnu::diagnose_as attribute

2021-07-23 Thread Matthias Kretz
Hi Jason,

I found a few regressions from the last patch in the meantime. Version 4 of 
the patch is attached.

Questions:

1. I simplified the condition for calling dump_template_parms in 
dump_function_name. !DECL_FRIEND_PSEUDO_TEMPLATE_INSTANTIATION (t) is 
equivalent to DECL_USE_TEMPLATE (t) in this context; implying that 
dump_template_parms is unconditionally called with `primary = false`. Or am I 
missing something?

2. Given a DECL_TI_ARGS tree, can I query whether an argument was deduced or 
explicitly specified? I'm asking because I still consider diagnostics of 
function templates unfortunate. `template <class T> void f()` is fine, as is 
`void f(T) [with T = float]`, but `void f() [with T = float]` could be better. 
I.e. if the template parameter appears somewhere in the function parameter 
list, dump_template_parms would only produce noise. If, however, the template 
parameter was given explicitly, it would be nice if it could show up 
accordingly in diagnostics.

3. When parsing tentatively and the parse is rejected, input_location is not 
reset, correct? In the attached patch I therefore made 
cp_parser_namespace_alias_definition reset input_location on a failed 
tentative parse. But it feels wrong. Shouldn't input_location be restored on 
cp_parser_parse_definitely?
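
To make question 2 concrete, the following sketch contrasts an explicitly specified template argument with a deduced one. It uses the GCC/Clang extension `__PRETTY_FUNCTION__` to expose how the compiler spells each specialization; the exact strings vary by compiler and version, so none are asserted here:

```cpp
#include <cassert>
#include <cstring>

// T cannot be deduced here; callers must write f<float>() explicitly.
template <class T>
const char* f()
{
  return __PRETTY_FUNCTION__;
}

// T is deduced from the argument and already visible in the parameter list.
template <class T>
const char* g(T)
{
  return __PRETTY_FUNCTION__;
}

// Both spellings are expected to mention the argument type somewhere.
inline void check_names()
{
  assert(std::strstr(f<float>(), "float") != nullptr);
  assert(std::strstr(g(1.0f), "float") != nullptr);
}
```

Printing `f<float>()` and `g(1.0f)` side by side shows the asymmetry the question is about: for `g` the bindings repeat the parameter list, for `f` they are the only place the argument appears.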

--

This attribute overrides the diagnostics output string for the entity it
appertains to. The motivation is to improve QoI for library TS
implementations, where diagnostics have a very bad signal-to-noise ratio
due to the long namespaces involved.

With the attribute, it is possible to solve PR89370 and make
std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as
std::string in diagnostic output without extra hacks to recognize the
type in the C++ frontend.
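
A hypothetical usage sketch, assuming the attribute takes a single string as described in this patch. The attribute is a proposal; compilers that do not know it are expected to ignore it, typically with a warning. All names below are invented for illustration:

```cpp
namespace some_library_detail_v2  // stand-in for a long implementation namespace
{
  // With the proposed attribute, diagnostics would name this type "vec4"
  // instead of its implementation name.
  struct [[gnu::diagnose_as("vec4")]] fixed_float_vector_impl
  {
    float lanes[4];
  };
}

// The attribute concerns diagnostics only; the layout is unchanged.
static_assert(sizeof(some_library_detail_v2::fixed_float_vector_impl)
                == 4 * sizeof(float),
              "attribute must not change layout");
```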

Signed-off-by: Matthias Kretz 

gcc/ChangeLog:

PR c++/89370
* doc/extend.texi: Document the diagnose_as attribute.
* doc/invoke.texi: Document -fno-diagnostics-use-aliases.

gcc/c-family/ChangeLog:

PR c++/89370
* c.opt (fdiagnostics-use-aliases): New diagnostics flag.

gcc/cp/ChangeLog:

PR c++/89370
* cp-tree.h: Add is_alias_template_p declaration.
* decl2.c (is_alias_template_p): New function. Determines
whether a given TYPE_DECL is actually an alias template that is
still missing its template_info.
(is_late_template_attribute): Decls with diagnose_as attribute
are early attributes only if they are alias templates.
* error.c (dump_scope): When printing the name of a namespace,
look for the diagnose_as attribute. If found, print the
associated string instead of calling dump_decl.
(dump_decl_name_or_diagnose_as): New function to replace
dump_decl (pp, DECL_NAME(t), flags) and inspect the tree for the
diagnose_as attribute before printing the DECL_NAME.
(dump_template_scope): New function. Prints the scope of a
template instance correctly applying diagnose_as attributes and
adjusting the list of template parms accordingly.
(dump_aggr_type): If the type has a diagnose_as attribute, print
the associated string instead of printing the original type
name. Print template parms only if the attribute was not applied
to the instantiation / full specialization. Delay call to
dump_scope until the diagnose_as attribute is found. If the
attribute has a second argument, use it to override the context
passed to dump_scope.
(dump_simple_decl): Call dump_decl_name_or_diagnose_as instead
of dump_decl.
(dump_decl): Ditto.
(lang_decl_name): Ditto.
(dump_function_decl): Walk the functions context list to
determine whether a call to dump_template_scope is required.
Ensure function templates diagnosed with pretty templates set
TFF_TEMPLATE_NAME to skip dump_template_parms.
(dump_function_name): Replace the function's identifier with the
diagnose_as attribute value, if set. Expand
DECL_FRIEND_PSEUDO_TEMPLATE_INSTANTIATION to DECL_USE_TEMPLATE
and consequently call dump_template_parms with primary = false.
(comparable_template_types_p): Consider the types not a template
if one carries a diagnose_as attribute.
(print_template_differences): Replace the identifier with the
diagnose_as attribute value on the most general template, if it
is set.
* name-lookup.c (handle_namespace_attrs): Handle the diagnose_as
attribute on namespaces. Ensure exactly one string argument.
Ensure previous diagnose_as attributes used the same name.
'diagnose_as' on namespace aliases are forwarded to the original
namespace. Support no-argument 'diagnose_as' on namespace
aliases.
(do_namespace_alias): Add attributes parameter and call
handle_namespace_attrs.
* name-lookup.h (do_namespace_alias

[PATCH] c-family: Add __builtin_assoc_barrier

2021-07-19 Thread Matthias Kretz
tested on x86_64-pc-linux-gnu with no new failures. OK for master?

New builtin to enable explicit use of PAREN_EXPR in C & C++ code.

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

* c-c++-common/builtin-assoc-barrier-1.c: New test.

gcc/cp/ChangeLog:

* cp-objcp-common.c (names_builtin_p): Handle
RID_BUILTIN_ASSOC_BARRIER.
* parser.c (cp_parser_postfix_expression): Handle
RID_BUILTIN_ASSOC_BARRIER.

gcc/c-family/ChangeLog:

* c-common.c (c_common_reswords): Add __builtin_assoc_barrier.
* c-common.h (enum rid): Add RID_BUILTIN_ASSOC_BARRIER.

gcc/c/ChangeLog:

* c-decl.c (names_builtin_p): Handle RID_BUILTIN_ASSOC_BARRIER.
* c-parser.c (c_parser_postfix_expression): Likewise.

gcc/ChangeLog:

* doc/extend.texi: Document __builtin_assoc_barrier.
---
 gcc/c-family/c-common.c   |  1 +
 gcc/c-family/c-common.h   |  2 +-
 gcc/c/c-decl.c|  1 +
 gcc/c/c-parser.c  | 20 
 gcc/cp/cp-objcp-common.c  |  1 +
 gcc/cp/parser.c   | 14 +++
 gcc/doc/extend.texi   | 18 ++
 .../c-c++-common/builtin-assoc-barrier-1.c| 24 +++
 8 files changed, 80 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/builtin-assoc-barrier-1.c


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 681fcc972f4..c62a6398a47 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -384,6 +384,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
   { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
   { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
+  { "__builtin_assoc_barrier", RID_BUILTIN_ASSOC_BARRIER, 0 },
   { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
   { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 },
   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 50ca8fb6ebd..f34dc47c2ba 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -108,7 +108,7 @@ enum rid
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,  RID_CHOOSE_EXPR,
   RID_TYPES_COMPATIBLE_P,  RID_BUILTIN_COMPLEX,	 RID_BUILTIN_SHUFFLE,
   RID_BUILTIN_SHUFFLEVECTOR,   RID_BUILTIN_CONVERTVECTOR,   RID_BUILTIN_TGMATH,
-  RID_BUILTIN_HAS_ATTRIBUTE,
+  RID_BUILTIN_HAS_ATTRIBUTE,   RID_BUILTIN_ASSOC_BARRIER,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
 
   /* TS 18661-3 keywords, in the same sequence as the TI_* values.  */
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 983d65e930c..dcf4a2d7c32 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -10557,6 +10557,7 @@ names_builtin_p (const char *name)
 case RID_BUILTIN_HAS_ATTRIBUTE:
 case RID_BUILTIN_SHUFFLE:
 case RID_BUILTIN_SHUFFLEVECTOR:
+case RID_BUILTIN_ASSOC_BARRIER:
 case RID_CHOOSE_EXPR:
 case RID_OFFSETOF:
 case RID_TYPES_COMPATIBLE_P:
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9a56e0c04c6..fffd81f4e5b 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -8931,6 +8931,7 @@ c_parser_predefined_identifier (c_parser *parser)
 			 assignment-expression ,
 			 assignment-expression, )
  __builtin_convertvector ( assignment-expression , type-name )
+ __builtin_assoc_barrier ( assignment-expression )
 
offsetof-member-designator:
  identifier
@@ -10076,6 +10077,25 @@ c_parser_postfix_expression (c_parser *parser)
 	  }
 	  }
 	  break;
+	case RID_BUILTIN_ASSOC_BARRIER:
+	  {
+	location_t start_loc = loc;
+	c_parser_consume_token (parser);
+	matching_parens parens;
+	if (!parens.require_open (parser))
+	  {
+		expr.set_error ();
+		break;
+	  }
+	e1 = c_parser_expr_no_commas (parser, NULL);
+	mark_exp_read (e1.value);
+	location_t end_loc = c_parser_peek_token (parser)->get_finish ();
+	parens.skip_until_found_close (parser);
+	expr.value = build1_loc (loc, PAREN_EXPR, TREE_TYPE (e1.value),
+ e1.value);
+	set_c_expr_source_range (&expr, start_loc, end_loc);
+	  }
+	  break;
 	case RID_AT_SELECTOR:
 	  {
 	gcc_assert (c_dialect_objc ());
diff --git a/gcc/cp/cp-objcp-common.c b/gcc/cp/cp-objcp-common.c
index ee255732d5a..04522a23eda 100644
--- a/gcc/cp/cp-objcp-common.c
+++ b/gcc/cp/cp-ob

Re: [PATCH] c++: implement C++17 hardware interference size

2021-07-17 Thread Matthias Kretz
On Saturday, 17 July 2021 15:32:42 CEST Jonathan Wakely wrote:
> On Sat, 17 Jul 2021, 09:15 Matthias Kretz wrote:
> > If somebody writes a library with `keep_apart` in the public API/ABI then
> > you're right.
> 
> Yes, it's fine if those constants don't affect anything across module
> boundaries.

I believe a significant fraction of hardware interference size usage will be 
internal.

> > The developer who wants his code to be included in a distro should care
> > about binary distribution. If his code has an ABI issue, that's a bug he
> > needs to fix. It's not the fault of the packager.
> Yes but in practice it's the packagers who have to deal with the bug
> reports, analyze the problem, and often fix the bug too. It might not be
> the packager's fault but it's often their problem 

I can imagine. But I don't think requiring users to specify the value 
according to what -mtune suggests will improve things. Users will write a 
configure/cmake/... macro to parse the value -mtune prints and pass that on 
the command line (we'll soon find this solution on SO). I.e. things are 
likely to be even more broken.

-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [PATCH] c++: implement C++17 hardware interference size

2021-07-17 Thread Matthias Kretz
On Friday, 16 July 2021 21:58:36 CEST Jonathan Wakely wrote:
> On Fri, 16 Jul 2021 at 20:26, Matthias Kretz  wrote:
> > On Friday, 16 July 2021 18:54:30 CEST Jonathan Wakely wrote:
> > > On Fri, 16 Jul 2021 at 16:33, Jason Merrill wrote:
> > > > Adjusting them based on tuning would certainly simplify a significant
> > > > use case, perhaps the only reasonable use.  Cases more concerned with
> > > > ABI stability probably shouldn't use them at all. And that would mean
> > > > not needing to worry about the impossible task of finding the right
> > > > values for an entire architecture.
> > > 
> > > But it would be quite a significant change in behaviour if -mtune
> > > started affecting ABI, wouldn't it?
> > 
> > For existing code -mtune still doesn't affect ABI.
> 
> True, because existing code isn't using the constants.
> 
> > The users who write
> >
> > struct keep_apart {
> >   alignas(std::hardware_destructive_interference_size) std::atomic<int> cat;
> >   alignas(std::hardware_destructive_interference_size) std::atomic<int> dog;
> > };
> > 
> > *want* to have different sizeof(keep_apart) depending on the CPU the code
> > is compiled for. I.e. they *ask* for getting their ABI broken.
> 
> Right, but the person who wants that and the person who chooses the
> -mtune option might be different people.

Yes. But it was the intent of the person who wrote the code that the person 
compiling the code can change the data layout of keep_apart via -mtune. Of 
course, if the one compiling doesn't want to choose because the binary needs 
to work on the widest range of systems, then there's a problem we might want 
to solve (direction of target_clones?). (Or the developer of the library 
solves it by providing the ABI for all possible interference_size values.)
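
The layout dependence can be made visible with a small sketch. The fallback value 64 and the name `destructive_size` are assumptions for illustration, used only when the standard library does not provide the constant:

```cpp
#include <atomic>
#include <cstddef>
#include <new>

// Fall back to 64 when the library does not provide the constant;
// 64 is an assumption for illustration, not a recommendation.
#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t destructive_size =
  std::hardware_destructive_interference_size;
#else
constexpr std::size_t destructive_size = 64;
#endif

struct keep_apart
{
  alignas(destructive_size) std::atomic<int> cat;
  alignas(destructive_size) std::atomic<int> dog;
};

// The layout, and therefore the ABI, follows the constant.
static_assert(sizeof(keep_apart) == 2 * destructive_size,
              "one aligned slot per member");
static_assert(alignof(keep_apart) == destructive_size,
              "the struct inherits the member alignment");
```

If the constant changes with the tuning option, both `sizeof(keep_apart)` and the offset of `dog` change with it, which is the ABI concern under discussion.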

> A distro might add -mtune=core2 to all package builds by default, not
> expecting it to cause ABI changes. Some header in a package in the
> distro might start using the constants. Now everybody who includes
> that header needs to use the same -mtune option as the distro default.

If somebody writes a library with `keep_apart` in the public API/ABI then 
you're right.

> That change in the behaviour and expected use of an existing option
> seems scary to me. Even with a warning about using the constants
> (because somebody's just going to use #pragma around their use of the
> constants to disable the warning, and now the ABI impact of -mtune is
> much less obvious).

There are people who say that linking TUs compiled with different compiler 
flags is UB. In general I think that's correct, but we can make explicit 
exceptions. Up to now -mtune wouldn't lead to UB, AFAIK, though -march easily 
does. So maybe, to keep the status quo, the constants should be tied to -march 
not -mtune?

> It's much less scary in a world where the code is written and used by
> the same group of people, but for something like a linux distro it
> worries me.

The developer who wants his code to be included in a distro should care about 
binary distribution. If his code has an ABI issue, that's a bug he needs to 
fix. It's not the fault of the packager.



-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──





Re: [PATCH] c++: implement C++17 hardware interference size

2021-07-16 Thread Matthias Kretz
On Friday, 16 July 2021 19:20:29 CEST Noah Goldstein wrote:
> On Fri, Jul 16, 2021 at 11:12 AM Matthias Kretz  wrote:
> > I don't understand how this feature would lead to false sharing. But maybe
> > I misunderstand the spatial prefetcher. The first access to one of the two
> > cache lines pairs would bring both cache lines to LLC (and possibly L2). If
> > a core with a different L2 reads the other cache line the cache line would
> > be duplicated; if it writes to it, it would be exclusive to the other
> > core's L2. The cache line pairs do not affect each other anymore. Maybe
> > there's a minor inefficiency on initial transfer from memory, but isn't
> > that all?
> 
> If two cores that do not share an L2 cache need exclusive access to
> a cache-line, the L2 spatial prefetcher could cause pingponging if those
> two cache-lines were adjacent and shared the same 128 byte alignment.
> Say core A requests line x1 in exclusive, it also get line x2 (not sure
> if x2 would be in shared or exclusive), core B then requests x2 in
> exclusive, it also gets x1. Irrelevant of the state x1 comes into core B's
> private L2 cache it invalidates the exclusive state on cache-line x1 in
> core A's private L2 cache. If this was done in a loop (say a simple
> `lock add` loop) it would cause pingponging on cache-lines x1/x2 between
> core A and B's private L2 caches.

Quoting the latest ORM: "The following two hardware prefetchers fetched data 
from memory to the L2 cache and last level cache:
Spatial Prefetcher: This prefetcher strives to complete every cache line 
fetched to the L2 cache with the pair line that completes it to a 128-byte 
aligned chunk."

1. If the requested cache line is already present on some other core, the 
spatial prefetcher should not get used ("fetched data from memory").

2. The section is about data prefetching. It is unclear whether the spatial 
prefetcher applies at all for normal cache line fetches.

3. The ORM uses past tense ("The following two hardware prefetchers fetched 
data"), which indicates to me that Intel isn't doing this for newer 
generations anymore.

4. If I'm wrong on points 1 & 2 consider this: Core 1 requests a read of cache
line A and the adjacent cache line B thus is also loaded to LLC. Core 2
requests a read of line B and thus loads line A into LLC. Now both cores have
both cache lines in LLC. Core 1 writes to line A, which invalidates line A in
LLC of Core 2 but does not affect line B. Core 2 writes to line B,
invalidating line B for Core 1. => no false sharing. Where did I get my mental
cache protocol wrong?
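
To make the "pair line" wording from the manual concrete, here is a minimal
sketch of the address arithmetic only. The 64-byte line size and the 128-byte
pair size are assumptions taken from the discussion, and the helper names are
made up for illustration. It says nothing about what the coherence protocol
does with the pair.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sizes from the discussion: 64-byte cache lines that the spatial
// prefetcher completes to 128-byte aligned chunks.
constexpr std::uintptr_t line_size = 64;
constexpr std::uintptr_t pair_size = 128;

// True if both addresses fall into the same cache line.
constexpr bool same_line(std::uintptr_t a, std::uintptr_t b) {
  return a / line_size == b / line_size;
}

// True if both addresses fall into the same 128-byte aligned chunk, i.e. the
// two lines are "pair lines" in the manual's wording.
constexpr bool same_prefetch_pair(std::uintptr_t a, std::uintptr_t b) {
  return a / pair_size == b / pair_size;
}
```

Addresses 0 and 64 are in different lines but in the same pair, whereas 64
and 128 are adjacent lines that belong to different pairs.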

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──





Re: [PATCH] c++: implement C++17 hardware interference size

2021-07-16 Thread Matthias Kretz
On Friday, 16 July 2021 18:54:30 CEST Jonathan Wakely wrote:
> On Fri, 16 Jul 2021 at 16:33, Jason Merrill wrote:
> > Adjusting them based on tuning would certainly simplify a significant use
> > case, perhaps the only reasonable use.  Cases more concerned with ABI
> > stability probably shouldn't use them at all. And that would mean not
> > needing to worry about the impossible task of finding the right values for
> > an entire architecture.
> 
> But it would be quite a significant change in behaviour if -mtune
> started affecting ABI, wouldn't it?

For existing code -mtune still doesn't affect ABI. The users who write 

struct keep_apart {
  alignas(std::hardware_destructive_interference_size) std::atomic<int> cat;
  alignas(std::hardware_destructive_interference_size) std::atomic<int> dog;
};

*want* to have different sizeof(keep_apart) depending on the CPU the code is 
compiled for. I.e. they *ask* for getting their ABI broken. If they wanted to 
specify the value themselves on the command line they'd have written:

struct keep_apart {
  alignas(SOME_MACRO) std::atomic<int> cat;
  alignas(SOME_MACRO) std::atomic<int> dog;
};

I would be very disappointed if std::hardware_destructive_interference_size 
and std::hardware_constructive_interference_size turn into a glorified macro.
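
A minimal sketch of how the layout follows the constant, for illustration
only. The fallback value of 64 and the name destructive_size are my
assumptions, used when the library does not provide the C++17 constant.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <new>

#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t destructive_size =
    std::hardware_destructive_interference_size;
#else
// Assumed fallback for libraries without the C++17 constant.
constexpr std::size_t destructive_size = 64;
#endif

struct keep_apart {
  alignas(destructive_size) std::atomic<int> cat;
  alignas(destructive_size) std::atomic<int> dog;
};

// The struct layout is a direct function of the constant: changing the
// constant changes sizeof(keep_apart) and the offset of dog.
static_assert(alignof(keep_apart) == destructive_size,
              "alignment follows the constant");
static_assert(sizeof(keep_apart) == 2 * destructive_size,
              "size follows the constant");
```

If the constant depends on the tuning target, two TUs built for different
targets would disagree on sizeof(keep_apart), which is exactly the ABI effect
discussed here.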

-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [PATCH] c++: implement C++17 hardware interference size

2021-07-16 Thread Matthias Kretz
On Friday, 16 July 2021 04:41:17 CEST Jason Merrill via Gcc-patches wrote:
> > Currently the patch does not adjust the values based on -march, as in JF's
> > proposal.  I'll need more guidance from the ARM/AArch64 maintainers about
> > how to go about that.  --param l1-cache-line-size is set based on -mtune,
> > but I don't think we want -mtune to change these ABI-affecting values. 
> > Are
> > there -march values for which a smaller range than 64-256 makes sense?

As a user who cares about ABI but also cares about maximizing performance of 
builds for a specific HPC setup I'd expect the hardware interference size 
values to be allowed to break ABIs. The point of these values is to give me 
better performance portability (but not necessarily binary portability) than 
my usual "pick 64 as a good average".

Wrt. -march / -mtune setting hardware interference size: IMO -mtune=X should 
be interpreted as "my binary is supposed to be optimized for X, I accept 
inefficiencies on everything that's not X".

On Friday, 16 July 2021 04:48:52 CEST Noah Goldstein wrote:
> On intel x86 systems with a private L2 cache the spatial prefetcher
> can cause destructive interference along 128 byte aligned boundaries.
> https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf#page=60

I don't understand how this feature would lead to false sharing. But maybe I 
misunderstand the spatial prefetcher. The first access to one of the two cache 
lines pairs would bring both cache lines to LLC (and possibly L2). If a core 
with a different L2 reads the other cache line the cache line would be 
duplicated; if it writes to it, it would be exclusive to the other core's L2. 
The cache line pairs do not affect each other anymore. Maybe there's a minor 
inefficiency on initial transfer from memory, but isn't that all?

That said, Intel documents the spatial prefetcher exclusively for Sandy 
Bridge. So if you still believe 128 is necessary, set the destructive hardware 
interference size to 64 for all of x86 except -mtune=sandybridge.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [RFC] c-family: Add __builtin_noassoc

2021-07-16 Thread Matthias Kretz
On Friday, 16 July 2021 11:31:29 CEST Richard Biener wrote:
> On Fri, Jul 16, 2021 at 10:57 AM Matthias Kretz  wrote:
> > On Wednesday, 14 July 2021 10:14:55 CEST Richard Biener wrote:
> > > I think implementing it similar to how we do __builtin_shufflevector
> > > would
> > > be easily possible.  PAREN_EXPR is a tree code.
> > 
> > Like this? If you like it, I'll write the missing documentation and do
> > real
> > regression testing.
> 
> Yes, like this.  Now, __builtin_noassoc (a + b + c) might suggest that
> it prevents a + b + c from being re-associated - but it does not. 
> PAREN_EXPR is a barrier for association, so for 'a + b + c + PAREN_EXPR
> <d + e + f>' the a+b+c and d+e+f chains will not mix but they individually
> can be re-associated.  That said __builtin_noassoc might be a bad name,
> maybe __builtin_assoc_barrier is better?

Yes, I agree with renaming it. And assoc_barrier sounds intuitive to me.

> To fully prevent association of an a + b + d + e chain you need at least
> two PAREN_EXPRs, for example (a+b) + (d+e) would do.
> 
> One could of course provide __builtin_noassoc (a+b+c+d) with the
> implied semantics and insert PAREN_EXPRs around all operands
> when lowering it.

I wouldn't want to go there. __builtin_noassoc(f(x, y, z))? We probably both
agree that it would be a no-op, but it reads like f should be evaluated with
-fno-associative-math.

> Not sure what's more useful in practice - directly exposing the middle-end
> PAREN_EXPR or providing a way to mark a whole expression as to be
> not re-associated?  Maybe both?

I think this is a tool for specialists. Give them the low-level tool and 
they'll build whatever higher level abstractions they need on top of it. Like

float sum_noassoc(RangeOfFloats auto x) {
  float sum = 0;
  for (float v : x)
    sum = __builtin_assoc_barrier(sum + v);
  return sum;
}
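
A self-contained variant of the same idea, as a sketch. The wrapper name
assoc_barrier and the plain fallback are my assumptions. The fallback
provides no barrier at all, so it only keeps the code compiling on compilers
without the builtin.

```cpp
#include <cassert>
#include <vector>

#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical wrapper: use the builtin where the compiler provides it,
// otherwise pass the value through unchanged (no barrier in that case).
inline float assoc_barrier(float x) {
#if __has_builtin(__builtin_assoc_barrier)
  return __builtin_assoc_barrier(x);
#else
  return x;
#endif
}

// Sums strictly left to right; each partial sum is wrapped in the barrier so
// that the additions cannot be mixed across iterations.
inline float sum_in_order(const std::vector<float>& xs) {
  float sum = 0;
  for (float v : xs)
    sum = assoc_barrier(sum + v);
  return sum;
}
```

Called as sum_in_order({1.0f, 2.0f, 3.0f}) this yields 6.0f either way; the
difference only shows up for inputs where the order of additions changes the
rounding.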

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


[RFC] c-family: Add __builtin_noassoc

2021-07-16 Thread Matthias Kretz
On Wednesday, 14 July 2021 10:14:55 CEST Richard Biener wrote:
> > > There's one "related" IL feature used by the Fortran frontend -
> > > PAREN_EXPR
> > > prevents association across it.  So for Fortran (when not
> > > -fno-protect-parens which is enabled by -Ofast), (a + b) - b cannot be
> > > optimized to a.  Eventually this could be used to wrap intrinsic results
> > > since most of the issues in the end require association.  Note
> > > PAREN_EXPR
> > > isn't exposed to the C family frontends but we could of course add a
> > > builtin-like thing for this _Noassoc ( <expr> ) or so.  Note PAREN_EXPR
> > > survives -Ofast so it's the frontends that would need to choose to emit
> > > or
> > > not emit it (or always emit it).
> >
> > Interesting. I want that builtin in C++. Currently I use inline asm to
> > achieve a similar effect. But the inline asm hammer is really too big for
> > the problem.
>
> I think implementing it similar to how we do __builtin_shufflevector would
> be easily possible.  PAREN_EXPR is a tree code.

Like this? If you like it, I'll write the missing documentation and do real 
regression testing.

---

New builtin to enable explicit use of PAREN_EXPR in C & C++ code.

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

* c-c++-common/builtin-noassoc-1.c: New test.

gcc/cp/ChangeLog:

* cp-objcp-common.c (names_builtin_p): Handle
RID_BUILTIN_NOASSOC.
* parser.c (cp_parser_postfix_expression): Handle
RID_BUILTIN_NOASSOC.

gcc/c-family/ChangeLog:

* c-common.c (c_common_reswords): Add __builtin_noassoc.
* c-common.h (enum rid): Add RID_BUILTIN_NOASSOC.

gcc/c/ChangeLog:

* c-decl.c (names_builtin_p): Handle RID_BUILTIN_NOASSOC.
* c-parser.c (c_parser_postfix_expression): Likewise.
---
 gcc/c-family/c-common.c              |  1 +
 gcc/c-family/c-common.h              |  2 +-
 gcc/c/c-decl.c                       |  1 +
 gcc/c/c-parser.c                     | 20
 gcc/cp/cp-objcp-common.c             |  1 +
 gcc/cp/parser.c                      | 14 +++
 .../c-c++-common/builtin-noassoc-1.c | 24 +++
 7 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/builtin-noassoc-1.c


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 681fcc972f4..e74123d896c 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -384,6 +384,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
   { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
   { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
+  { "__builtin_noassoc", RID_BUILTIN_NOASSOC, 0 },
   { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
   { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 },
   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 50ca8fb6ebd..b772cf9c5e9 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -108,7 +108,7 @@ enum rid
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,  RID_CHOOSE_EXPR,
   RID_TYPES_COMPATIBLE_P,  RID_BUILTIN_COMPLEX,	 RID_BUILTIN_SHUFFLE,
   RID_BUILTIN_SHUFFLEVECTOR,   RID_BUILTIN_CONVERTVECTOR,   RID_BUILTIN_TGMATH,
-  RID_BUILTIN_HAS_ATTRIBUTE,
+  RID_BUILTIN_HAS_ATTRIBUTE,   RID_BUILTIN_NOASSOC,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
 
   /* TS 18661-3 keywords, in the same sequence as the TI_* values.  */
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 983d65e930c..7b7ecba026f 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -10557,6 +10557,7 @@ names_builtin_p (const char *name)
 case RID_BUILTIN_HAS_ATTRIBUTE:
 case RID_BUILTIN_SHUFFLE:
 case RID_BUILTIN_SHUFFLEVECTOR:
+case RID_BUILTIN_NOASSOC:
 case RID_CHOOSE_EXPR:
 case RID_OFFSETOF:
 case RID_TYPES_COMPATIBLE_P:
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9a56e0c04c6..2b40dc8253e 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -8931,6 +8931,7 @@ c_parser_predefined_identifier (c_parser *parser)
 			 assignment-expression ,
 			 assignment-expression, )
  __builtin_convertvector ( assignment-expression , type-name )
+ __builtin_noassoc (

Re: ping-2: [PATCH] c-family: Add more predefined macros for math flags

2021-07-16 Thread Matthias Kretz
On Wednesday, 14 July 2021 14:42:01 CEST H.J. Lu wrote:
> On Wed, Jul 14, 2021 at 12:32 AM Matthias Kretz  wrote:
> > OK?
> > 
> > On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote:
> > > Library code, especially in headers, sometimes needs to know how the
> > > compiler interprets / optimizes floating-point types and operations.
> > > This information can be used for additional optimizations or for
> > > ensuring correctness. This change makes -freciprocal-math,
> > > -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and
> > > -frounding-math report their state via corresponding pre-defined macros.
> > > 
> > > Signed-off-by: Matthias Kretz 
> > > 
> > > gcc/testsuite/ChangeLog:
> > >   * gcc.dg/associative-math-1.c: New test.
> > >   * gcc.dg/associative-math-2.c: New test.
> > >   * gcc.dg/no-signed-zeros-1.c: New test.
> > >   * gcc.dg/no-signed-zeros-2.c: New test.
> > >   * gcc.dg/no-trapping-math-1.c: New test.
> > >   * gcc.dg/no-trapping-math-2.c: New test.
> > >   * gcc.dg/reciprocal-math-1.c: New test.
> > >   * gcc.dg/reciprocal-math-2.c: New test.
> > >   * gcc.dg/rounding-math-1.c: New test.
> > >   * gcc.dg/rounding-math-2.c: New test.
> > > 
> > > gcc/c-family/ChangeLog:
> > >   * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or
> > >   undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
> > >   __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
> > >   __ROUNDING_MATH__ according to the new optimization flags.
> > > 
> > > gcc/ChangeLog:
> > >   * cppbuiltin.c (define_builtin_macros_for_compilation_flags):
> > >   Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
> > >   __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
> > >   __ROUNDING_MATH__ according to their corresponding flags.
> > >   * doc/cpp.texi: Document __RECIPROCAL_MATH__,
> > >   __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__,
> > >   and __ROUNDING_MATH__.
> > > 
> 
> Hi Hongtao,
> 
> Can this be used to address
> 
> https://gcc.gnu.org/pipermail/gcc/2021-July/236778.html

It should help to determine when a workaround is necessary. I use inline asm 
to implement the workaround. Relevant libstdc++ code (not upstream yet and not 
making use of __ASSOCIATIVE_MATH__ yet):

/*
 * Ensure the expressions leading up to the @p __x argument are evaluated at
 * least once.
 *
 * Example: __force_evaluation(x + y) - y will not optimize to x with
 * -fassociative-math.
 * _TV is expected to be __vector_type_t.
 */
template <typename _TV>
  [[__gnu__::__flatten__, __gnu__::__const__]]
  _GLIBCXX_SIMD_INTRINSIC constexpr
  _TV
  __force_evaluation(_TV __x) noexcept
  {
    if (__builtin_is_constant_evaluated())
      return __x;
    else
      return [&] {
        if constexpr(__have_sse)
          {
            if constexpr (sizeof(__x) >= 16)
              {
                asm("" :: "x"(__x));
                asm("" : "+x"(__x));
              }
            else if constexpr (is_same_v<__vector_type_t, _TV>)
              {
                asm("" :: "x"(__x[0]), "x"(__x[1]));
                asm("" : "+x"(__x[0]), "+x"(__x[1]));
              }
            else
              __assert_unreachable<_TV>();
          }
        else if constexpr(__have_neon)
          {
            asm("" :: "w"(__x));
            asm("" : "+w"(__x));
          }
        else if constexpr (__have_power_vmx)
          {
            if constexpr (is_same_v<__vector_type_t, _TV>)
              {
                asm("" :: "fgr"(__x[0]), "fgr"(__x[1]));
                asm("" : "+fgr"(__x[0]), "+fgr"(__x[1]));
              }
            else
              {
                asm("" :: "v"(__x));
                asm("" : "+v"(__x));
              }
          }
        else
          {
            asm("" :: "g"(__x));
            asm("" : "+g"(__x));
          }
        return __x;
      }();
  }

// Returns __x + __y - __y without -fassociative-math optimizing to __x.
// - _TV must be __vector_type_t.
// - _UV must be _TV or floating-point type.
template <typename _TV, typename _UV>
  [[__gnu__::__const__]]
  _GLIBCXX_SIMD_INTRINSIC constexpr
  _TV
  __plus_minus(_TV __x, _UV __y) noexcept
  {
#if defined __clang__ || __GCC_IEC_559 > 0
    return (__x + __y) - __y;
#else
if 

[PATCH v3] c++: Add gnu::diagnose_as attribute

2021-07-15 Thread Matthias Kretz
Hi Jason,

A new revision of the patch is attached. I think I implemented all your 
suggestions.

Please comment on cp/decl2.c (is_alias_template_p). I find it surprising that 
I had to write this function. Maybe I missed something? In any case, 
DECL_ALIAS_TEMPLATE_P requires a template_decl and the TYPE_DECL apparently 
doesn't have a template_info/decl at this point.

From: Matthias Kretz 

This attribute overrides the diagnostics output string for the entity it
appertains to. The motivation is to improve QoI for library TS
implementations, where diagnostics have a very bad signal-to-noise ratio
due to the long namespaces involved.

With the attribute, it is possible to solve PR89370 and make
std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as
std::string in diagnostic output without extra hacks to recognize the
type in the C++ frontend.

Signed-off-by: Matthias Kretz 

gcc/ChangeLog:

PR c++/89370
* doc/extend.texi: Document the diagnose_as attribute.
* doc/invoke.texi: Document -fno-diagnostics-use-aliases.

gcc/c-family/ChangeLog:

PR c++/89370
* c.opt (fdiagnostics-use-aliases): New diagnostics flag.

gcc/cp/ChangeLog:

PR c++/89370
* cp-tree.h: Add TFF_AS_PRIMARY. Add is_alias_template_p
declaration.
* decl2.c (is_alias_template_p): New function. Determines
whether a given TYPE_DECL is actually an alias template that is
still missing its template_info.
(is_late_template_attribute): Decls with diagnose_as attribute
are early attributes only if they are alias templates.
* error.c (dump_scope): When printing the name of a namespace,
look for the diagnose_as attribute. If found, print the
associated string instead of calling dump_decl.
(dump_decl_name_or_diagnose_as): New function to replace
dump_decl (pp, DECL_NAME(t), flags) and inspect the tree for the
diagnose_as attribute before printing the DECL_NAME.
(dump_template_scope): New function. Prints the scope of a
template instance correctly applying diagnose_as attributes and
adjusting the list of template parms accordingly.
(dump_aggr_type): If the type has a diagnose_as attribute, print
the associated string instead of printing the original type
name. Print template parms only if the attribute was not applied
to the instantiation / full specialization. Delay call to
dump_scope until the diagnose_as attribute is found. If the
attribute has a second argument, use it to override the context
passed to dump_scope.
(dump_simple_decl): Call dump_decl_name_or_diagnose_as instead
of dump_decl.
(dump_decl): Ditto.
(lang_decl_name): Ditto.
(dump_function_decl): Walk the function's context list to
determine whether a call to dump_template_scope is required.
Ensure function templates are presented as primary templates.
(dump_function_name): Replace the function's identifier with the
diagnose_as attribute value, if set.
(dump_template_parms): Treat as primary template if flags
contains TFF_AS_PRIMARY.
(comparable_template_types_p): Consider the types not a template
if one carries a diagnose_as attribute.
(print_template_differences): Replace the identifier with the
diagnose_as attribute value on the most general template, if it
is set.
* name-lookup.c (handle_namespace_attrs): Handle the diagnose_as
attribute on namespaces. Ensure exactly one string argument.
Ensure previous diagnose_as attributes used the same name.
'diagnose_as' on namespace aliases are forwarded to the original
namespace. Support no-argument 'diagnose_as' on namespace
aliases.
(do_namespace_alias): Add attributes parameter and call
handle_namespace_attrs.
* name-lookup.h (do_namespace_alias): Add attributes tree
parameter.
* parser.c (cp_parser_declaration): If the next token is
RID_NAMESPACE, tentatively parse a namespace alias definition.
If this fails expect a namespace definition.
(cp_parser_namespace_alias_definition): Allow optional
attributes before and after the identifier. Fast exit if the
expected CPP_EQ token is missing. Pass attributes to
do_namespace_alias.
* tree.c (cxx_attribute_table): Add diagnose_as attribute to the
table.
(check_diagnose_as_redeclaration): New function; copied and
adjusted from check_abi_tag_redeclaration.
(handle_diagnose_as_attribute): New function; copied and
adjusted from handle_abi_tag_attribute. If the given *node is a
TYPE_DECL: allow no argument to the attribute, using DECL_NAME
instead; apply the attribute to the type on the RHS in place,
even if the type is complete. A

ping-2: [PATCH] c-family: Add more predefined macros for math flags

2021-07-14 Thread Matthias Kretz
OK?

On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote:
> Library code, especially in headers, sometimes needs to know how the
> compiler interprets / optimizes floating-point types and operations.
> This information can be used for additional optimizations or for
> ensuring correctness. This change makes -freciprocal-math,
> -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and
> -frounding-math report their state via corresponding pre-defined macros.
> 
> Signed-off-by: Matthias Kretz 
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/associative-math-1.c: New test.
>   * gcc.dg/associative-math-2.c: New test.
>   * gcc.dg/no-signed-zeros-1.c: New test.
>   * gcc.dg/no-signed-zeros-2.c: New test.
>   * gcc.dg/no-trapping-math-1.c: New test.
>   * gcc.dg/no-trapping-math-2.c: New test.
>   * gcc.dg/reciprocal-math-1.c: New test.
>   * gcc.dg/reciprocal-math-2.c: New test.
>   * gcc.dg/rounding-math-1.c: New test.
>   * gcc.dg/rounding-math-2.c: New test.
> 
> gcc/c-family/ChangeLog:
> 
>   * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or
>   undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
>   __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
>   __ROUNDING_MATH__ according to the new optimization flags.
> 
> gcc/ChangeLog:
> 
>   * cppbuiltin.c (define_builtin_macros_for_compilation_flags):
>   Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
>   __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
>   __ROUNDING_MATH__ according to their corresponding flags.
>   * doc/cpp.texi: Document __RECIPROCAL_MATH__,
>   __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__,
>   and __ROUNDING_MATH__.
> ---
>  gcc/c-family/c-cppbuiltin.c               | 25 +++
>  gcc/cppbuiltin.c                          | 10 +
>  gcc/doc/cpp.texi                          | 18
>  gcc/testsuite/gcc.dg/associative-math-1.c | 17 +++
>  gcc/testsuite/gcc.dg/associative-math-2.c | 17 +++
>  gcc/testsuite/gcc.dg/no-signed-zeros-1.c  | 17 +++
>  gcc/testsuite/gcc.dg/no-signed-zeros-2.c  | 17 +++
>  gcc/testsuite/gcc.dg/no-trapping-math-1.c | 17 +++
>  gcc/testsuite/gcc.dg/no-trapping-math-2.c | 17 +++
>  gcc/testsuite/gcc.dg/reciprocal-math-1.c  | 17 +++
>  gcc/testsuite/gcc.dg/reciprocal-math-2.c  | 17 +++
>  gcc/testsuite/gcc.dg/rounding-math-1.c    | 17 +++
>  gcc/testsuite/gcc.dg/rounding-math-2.c    | 17 +++
>  13 files changed, 223 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/associative-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/associative-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/rounding-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/rounding-math-2.c
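
For illustration, a sketch of how header code could consume one of the
proposed macros. It assumes __ASSOCIATIVE_MATH__ is defined exactly when
-fassociative-math is in effect, as the patch describes. The names
reassociation_allowed and plus_minus are hypothetical, and the volatile
temporary is only a stand-in for a real barrier.

```cpp
#include <cassert>

// Assumption: the macro is defined only while -fassociative-math is active
// (as proposed in the patch). Without the patch it is simply never defined.
#ifdef __ASSOCIATIVE_MATH__
constexpr bool reassociation_allowed = true;
#else
constexpr bool reassociation_allowed = false;
#endif

// Returns (x + y) - y. When the compiler may re-associate, the sum is forced
// through a volatile temporary so that it cannot be folded back to x.
inline float plus_minus(float x, float y) {
  if (reassociation_allowed) {
    volatile float tmp = x + y;  // stand-in barrier, not the libstdc++ code
    return tmp - y;
  }
  return (x + y) - y;
}
```

The point is only that the workaround can be limited to translation units
that actually need it.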


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index f79f939bd10..671af04b1f8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -628,6 +628,31 @@ c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree,
   cpp_undef (pfile, "__FINITE_MATH_ONLY__");
   cpp_define_unused (pfile, "__FINITE_MATH_ONLY__=0");
 }
+
+  if (!prev->x_flag_reciprocal_math && cur->x_flag_reciprocal_math)
+cpp_define_unused (pfile, "__RECIPROCAL_MATH__");
+  else if (prev->x_flag_reciprocal_math && !cur->x_flag_reciprocal_math)
+cpp_undef (pfile, "__RECIPROCAL_MATH__");
+
+  if (!prev->x_flag_signed_zeros && cur->x_flag_signed_zeros)
+cpp_undef (pfile, "__NO_SIGNED_ZEROS__");
+  else if (prev->x_flag_signed_zeros && !cur->x_flag_signed_zeros)
+cpp_define_unused (pfile, "__NO_SIGNED_ZEROS__");
+
+  if (!prev->x_flag_trapping_math && cur->x_flag_trapping_math)
+cpp_undef (pfile, "__NO_TRAPPING_MATH__");
+  else if (prev->x_

Re: [Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?

2021-07-14 Thread Matthias Kretz
On Wednesday, 14 July 2021 07:18:29 CEST Hongtao Liu via Gcc-help wrote:
> On Wed, Jul 14, 2021 at 1:15 PM Hongtao Liu  wrote:
> > Hi:
> >   The original problem was that some users wanted the cmdline option
> > 
> > -ffast-math not to act on intrinsic production code.

This sounds like the users want intrinsics to map *directly* to the 
corresponding instruction. If that's the case such users should use inline 
assembly, IMHO. If you compile a TU with -ffast-math then *all* floating-point 
operations are affected. Yes, more control over where to use fast-math and the 
ability to mix fast-math and no-fast-math without risking ODR violations would 
be great. But that's a larger issue, and one that would ideally be solved in 
WG14/WG21.

FWIW, this is what I'd do, i.e. turn off fast-math for the function in 
question:
https://godbolt.org/z/3cKq5hT1o
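
In case the link goes stale: a minimal sketch of the general approach, not
necessarily identical to the linked example. It uses GCC's optimize function
attribute, guarded so that other compilers just see a plain function. The
macro name NO_FAST_MATH and the function are made up for illustration, and
how completely the individual fast-math sub-flags are reset may depend on the
GCC version.

```cpp
#include <cassert>

// GCC-specific: request -fno-fast-math for a single function. Other
// compilers get an empty macro (assumption: they need no such override).
#if defined(__GNUC__) && !defined(__clang__)
#define NO_FAST_MATH __attribute__((optimize("no-fast-math")))
#else
#define NO_FAST_MATH
#endif

// With the attribute in effect the expression should not be simplified to x,
// even if the rest of the TU is compiled with -ffast-math.
NO_FAST_MATH float plus_minus_strict(float x, float y) {
  return (x + y) - y;
}
```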

-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [PATCH] Add gnu::diagnose_as attribute

2021-07-07 Thread Matthias Kretz
On Tuesday, 22 June 2021 21:52:16 CEST Jason Merrill wrote:
> > 2. About the namespace aliases: IIUC an attribute would currently be
> > rejected because of the C++ grammar. Do you want to make it valid before
> > WG21 officially decides how to proceed? And if you have a pointer for me
> > where I'd have to adjust the grammar rules, that'd help. 
> 
> You will want to adjust cp_parser_namespace_alias_definition to handle
> attributes like cp_parser_namespace_definition.  The latter currently
> accepts attributes both before and after the name, which seems like a
> good pattern to follow so it doesn't matter which WG21 chooses.
> Probably best to pedwarn about C++11 attributes in both locations for
> now, not just after.

This introduces an ambiguity in cp_parser_declaration. The function has to 
decide whether to call cp_parser_namespace_definition or fall back to 
cp_parser_block_declaration (which calls 
cp_parser_namespace_alias_definition). But now the parser has to look ahead a 
lot farther:

namespace foo [[whatever]] {}
namespace bar [[whatever]] = foo;

I.e. only at '{' vs. '=' can cp_parser_declaration decide to call 
cp_parser_namespace_definition.

Consequently, should I really modify cp_parser_namespace_definition to handle 
namespace aliases? Or can/should cp_parser_declaration look ahead past the 
attribute(s)? How?
With pedantic standard C++ it would be easy, since only these attribute 
placements are allowed:

namespace [[whatever]] foo {}
namespace bar [[whatever]] = foo;
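
To illustrate the lookahead problem, here is a toy classifier over an already
tokenized input. This is not GCC code; the token representation and all names
are hypothetical. It skips the identifier and any balanced [[...]] attribute
and decides only when it reaches '{' or '='.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class NamespaceKind { Definition, Alias, Unknown };

// Toy model (not the GCC parser): toks holds the tokens that follow the
// `namespace` keyword. The decision is made at the first '{' or '=' that is
// not inside an attribute specifier.
inline NamespaceKind classify_namespace(const std::vector<std::string>& toks) {
  std::size_t i = 0;
  while (i < toks.size()) {
    if (toks[i] == "{") return NamespaceKind::Definition;
    if (toks[i] == "=") return NamespaceKind::Alias;
    if (toks[i] == "[" && i + 1 < toks.size() && toks[i + 1] == "[") {
      // Skip a balanced [[ ... ]] attribute specifier.
      int depth = 0;
      do {
        if (toks[i] == "[") ++depth;
        else if (toks[i] == "]") --depth;
        ++i;
      } while (i < toks.size() && depth > 0);
      continue;
    }
    ++i;  // identifier or part of a nested name
  }
  return NamespaceKind::Unknown;
}
```

The number of tokens to skip is unbounded, which is why a fixed-size peek is
not enough and tentative parsing looks like the more natural fit.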

-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


ping: [PATCH] c-family: Add more predefined macros for math flags

2021-07-07 Thread Matthias Kretz
OK? (I want to use the macros in libstdc++.)

On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote:
> Library code, especially in headers, sometimes needs to know how the
> compiler interprets / optimizes floating-point types and operations.
> This information can be used for additional optimizations or for
> ensuring correctness. This change makes -freciprocal-math,
> -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and
> -frounding-math report their state via corresponding pre-defined macros.
> 
> Signed-off-by: Matthias Kretz 
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/associative-math-1.c: New test.
>   * gcc.dg/associative-math-2.c: New test.
>   * gcc.dg/no-signed-zeros-1.c: New test.
>   * gcc.dg/no-signed-zeros-2.c: New test.
>   * gcc.dg/no-trapping-math-1.c: New test.
>   * gcc.dg/no-trapping-math-2.c: New test.
>   * gcc.dg/reciprocal-math-1.c: New test.
>   * gcc.dg/reciprocal-math-2.c: New test.
>   * gcc.dg/rounding-math-1.c: New test.
>   * gcc.dg/rounding-math-2.c: New test.
> 
> gcc/c-family/ChangeLog:
> 
>   * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or
>   undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
>   __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
>   __ROUNDING_MATH__ according to the new optimization flags.
> 
> gcc/ChangeLog:
> 
>   * cppbuiltin.c (define_builtin_macros_for_compilation_flags):
>   Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
>   __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
>   __ROUNDING_MATH__ according to their corresponding flags.
>   * doc/cpp.texi: Document __RECIPROCAL_MATH__,
>   __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__,
>   and __ROUNDING_MATH__.
> ---
>  gcc/c-family/c-cppbuiltin.c               | 25 +++
>  gcc/cppbuiltin.c                          | 10 +
>  gcc/doc/cpp.texi                          | 18
>  gcc/testsuite/gcc.dg/associative-math-1.c | 17 +++
>  gcc/testsuite/gcc.dg/associative-math-2.c | 17 +++
>  gcc/testsuite/gcc.dg/no-signed-zeros-1.c  | 17 +++
>  gcc/testsuite/gcc.dg/no-signed-zeros-2.c  | 17 +++
>  gcc/testsuite/gcc.dg/no-trapping-math-1.c | 17 +++
>  gcc/testsuite/gcc.dg/no-trapping-math-2.c | 17 +++
>  gcc/testsuite/gcc.dg/reciprocal-math-1.c  | 17 +++
>  gcc/testsuite/gcc.dg/reciprocal-math-2.c  | 17 +++
>  gcc/testsuite/gcc.dg/rounding-math-1.c    | 17 +++
>  gcc/testsuite/gcc.dg/rounding-math-2.c    | 17 +++
>  13 files changed, 223 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/associative-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/associative-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/rounding-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/rounding-math-2.c


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index f79f939bd10..671af04b1f8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -628,6 +628,31 @@ c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree,
   cpp_undef (pfile, "__FINITE_MATH_ONLY__");
   cpp_define_unused (pfile, "__FINITE_MATH_ONLY__=0");
 }
+
+  if (!prev->x_flag_reciprocal_math && cur->x_flag_reciprocal_math)
+cpp_define_unused (pfile, "__RECIPROCAL_MATH__");
+  else if (prev->x_flag_reciprocal_math && !cur->x_flag_reciprocal_math)
+cpp_undef (pfile, "__RECIPROCAL_MATH__");
+
+  if (!prev->x_flag_signed_zeros && cur->x_flag_signed_zeros)
+cpp_undef (pfile, "__NO_SIGNED_ZEROS__");
+  else if (prev->x_flag_signed_zeros && !cur->x_flag_signed_zeros)
+cpp_define_unused (pfile, "__NO_SIGNED_ZEROS__");
+
+  if (!prev->x_flag_trapping_math && cur->x_flag_trapping_math)
+cpp_undef (pfile, "__NO

Re: [PATCH] Add gnu::diagnose_as attribute

2021-07-05 Thread Matthias Kretz
On Thursday, 1 July 2021 17:18:26 CEST Jason Merrill wrote:
> You probably want to adjust is_late_template_attribute to change that.

Right, I hacked is_late_template_attribute but now I only see a TYPE_DECL 
passed to my attribute handler (!DECL_ALIAS_TEMPLATE_P). I.e. I don't know how 
your previous comment is supposed to help me:

On Tuesday, 22 June 2021 22:12:42 CEST Jason Merrill wrote:
> Yes.  You can check that with get_underlying_template.

FWIW, I don't feel qualified to implement the diagnose_as attribute on alias 
templates. The trees I've seen while testing the following test case don't 
make sense to me. :(


// { dg-do compile { target c++11 } }
// { dg-options "-fdiagnostics-use-aliases -fpretty-templates" }

template  class A0 {};
template  using B0 [[gnu::diagnose_as]] = A0; // #1
template  using C0 [[gnu::diagnose_as]] = A0; // #2

template  class A1 {};
template  class A1 {};
template  using B1 [[gnu::diagnose_as]] = A1; // #3

void fn_1(int);

int main ()
{
  fn_1 (A0 ()); // { dg-error "cannot convert 'B0' to 'int'" }
  fn_1 (A1 ()); // { dg-error "cannot convert 'A1' to 'int'" }
  fn_1 (A1 ()); // { dg-error "cannot convert 'B1' to 'int'" }
}


On #1 I see !COMPLETE_TYPE_P (TREE_TYPE (*node)) while on #3 TREE_TYPE (*node) 
is a complete type. Like I said, I don't get to see the TEMPLATE_DECL of 
either #1, #2, or #3, only a TYPE_DECL whose TREE_TYPE is A0. I thus have no 
idea how to reject #2.

-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [PATCH] Add gnu::diagnose_as attribute

2021-07-01 Thread Matthias Kretz
On Tuesday, 22 June 2021 22:12:42 CEST Jason Merrill wrote:
> On 6/22/21 4:01 PM, Matthias Kretz wrote:
> > On Tuesday, 22 June 2021 21:52:16 CEST Jason Merrill wrote:
> >> For alias templates, you probably want the attribute only on the
> >> templated class, not on the instantiations.
> > 
> > Oh good point. My current patch does not allow the attribute on alias
> > templates. Consider:
> > 
> > template 
> > 
> >struct X {};
> > 
> > template 
> > 
> >using foo [[gnu::diagnose_as]] = X;
> > 
> > I have no idea how this could work. I would have to set the attribute for
> > an implicit partial specialization (not that I know of the existence of
> > such a thing)? I.e. X would have to be diagnosed as foo,
> > but X would have to be diagnosed as X, not foo.
> > 
> > So if anything it should only support alias templates if they are strictly
> > "renaming" the type. I.e. their template parameters must match up exactly.
> > Can I constrain the attribute like this?
> 
> Yes.  You can check that with get_underlying_template.
> 
> Or you could support the above by putting the attribute on the
> instantiation with the TEMPLATE_INFO for foo rather than a simple name.

Question, given:

  template  using foo = bar;

The diagnose_as attribute handler isn't called until e.g. `foo` is 
instantiated. This means that even after the declaration of the alias template, 
`bar` will not be diagnosed as `foo`; that happens only after the 
first use of `foo`. I find that more confusing than helpful, even if the 
expectation would be that users only use the alias template.

So do you still expect alias templates to support diagnose_as? And if yes, how 
could I handle the attribute so that the diagnose_as attribute is applied to 
`bar` on declaration of `foo`?

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


[PATCH] c-family: Add more predefined macros for math flags

2021-06-30 Thread Matthias Kretz

Library code, especially in headers, sometimes needs to know how the
compiler interprets / optimizes floating-point types and operations.
This information can be used for additional optimizations or for
ensuring correctness. This change makes -freciprocal-math,
-fno-signed-zeros, -fno-trapping-math, -fassociative-math, and
-frounding-math report their state via corresponding pre-defined macros.
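
As a hedged illustration of how a header might consume these macros: the macro name below is the one proposed in this patch, while `is_negative` and `signed_zeros_honored` are hypothetical names that are not part of the patch. The sketch picks the cheaper comparison only when the compiler was told that signed zeros do not matter.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, for illustration only.
// Assumption: __NO_SIGNED_ZEROS__ is predefined only when
// -fno-signed-zeros is in effect, as proposed in this patch. On a
// compiler without the macro, the conservative branch is used.
inline bool
is_negative(double x)
{
#ifdef __NO_SIGNED_ZEROS__
  // The sign of zero is irrelevant, so a plain comparison suffices.
  return x < 0.0;
#else
  // Must distinguish -0.0 from +0.0.
  return std::signbit(x);
#endif
}

// Compile-time view of the same flag, e.g. for `if constexpr` dispatch.
constexpr bool signed_zeros_honored =
#ifdef __NO_SIGNED_ZEROS__
  false;
#else
  true;
#endif
```

Because the macro is simply absent without the flag, the same header compiles unchanged with either setting.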

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

* gcc.dg/associative-math-1.c: New test.
* gcc.dg/associative-math-2.c: New test.
* gcc.dg/no-signed-zeros-1.c: New test.
* gcc.dg/no-signed-zeros-2.c: New test.
* gcc.dg/no-trapping-math-1.c: New test.
* gcc.dg/no-trapping-math-2.c: New test.
* gcc.dg/reciprocal-math-1.c: New test.
* gcc.dg/reciprocal-math-2.c: New test.
* gcc.dg/rounding-math-1.c: New test.
* gcc.dg/rounding-math-2.c: New test.

gcc/c-family/ChangeLog:

* c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or
undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
__NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
__ROUNDING_MATH__ according to the new optimization flags.

gcc/ChangeLog:

* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
__NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
__ROUNDING_MATH__ according to their corresponding flags.
* doc/cpp.texi: Document __RECIPROCAL_MATH__,
__NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__,
and __ROUNDING_MATH__.
---
 gcc/c-family/c-cppbuiltin.c   | 25 +++
 gcc/cppbuiltin.c  | 10 +
 gcc/doc/cpp.texi  | 18 
 gcc/testsuite/gcc.dg/associative-math-1.c | 17 +++
 gcc/testsuite/gcc.dg/associative-math-2.c | 17 +++
 gcc/testsuite/gcc.dg/no-signed-zeros-1.c  | 17 +++
 gcc/testsuite/gcc.dg/no-signed-zeros-2.c  | 17 +++
 gcc/testsuite/gcc.dg/no-trapping-math-1.c | 17 +++
 gcc/testsuite/gcc.dg/no-trapping-math-2.c | 17 +++
 gcc/testsuite/gcc.dg/reciprocal-math-1.c  | 17 +++
 gcc/testsuite/gcc.dg/reciprocal-math-2.c  | 17 +++
 gcc/testsuite/gcc.dg/rounding-math-1.c| 17 +++
 gcc/testsuite/gcc.dg/rounding-math-2.c| 17 +++
 13 files changed, 223 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/associative-math-1.c
 create mode 100644 gcc/testsuite/gcc.dg/associative-math-2.c
 create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-1.c
 create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-2.c
 create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-1.c
 create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-2.c
 create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-1.c
 create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-2.c
 create mode 100644 gcc/testsuite/gcc.dg/rounding-math-1.c
 create mode 100644 gcc/testsuite/gcc.dg/rounding-math-2.c


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index f79f939bd10..671af04b1f8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -628,6 +628,31 @@ c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree,
   cpp_undef (pfile, "__FINITE_MATH_ONLY__");
   cpp_define_unused (pfile, "__FINITE_MATH_ONLY__=0");
 }
+
+  if (!prev->x_flag_reciprocal_math && cur->x_flag_reciprocal_math)
+cpp_define_unused (pfile, "__RECIPROCAL_MATH__");
+  else if (prev->x_flag_reciprocal_math && !cur->x_flag_reciprocal_math)
+cpp_undef (pfile, "__RECIPROCAL_MATH__");
+
+  if (!prev->x_flag_signed_zeros && cur->x_flag_signed_zeros)
+cpp_undef (pfile, "__NO_SIGNED_ZEROS__");
+  else if (prev->x_flag_signed_zeros && !cur->x_flag_signed_zeros)
+cpp_define_unused (pfile, "__NO_SIGNED_ZEROS__");
+
+  if (!prev->x_flag_trapping_math && cur->x_flag_trapping_math)
+cpp_undef (pfile, "__NO_TRAPPING_MATH__");
+  else if (prev->x_flag_trapping_math && !cur->x_flag_trapping_math)
+cpp_define_unused (pfile, "__NO_TRAPPING_MATH__");
+
+  if (!prev->x_flag_associative_math && cur->x_flag_associative_math)
+cpp_define_unused (pfile, "__ASSOCIATIVE_MATH__");
+  else if (prev->x_flag_associative_math && !cur->x_flag_

Re: [PATCH 04/11 v3] libstdc++: Make use of __builtin_bit_cast

2021-06-24 Thread Matthias Kretz
With -ffast-math, one `using namespace __proposed` was still missing. The 
attached patch resolves the issue.

From: Matthias Kretz 

The __bit_cast function was a hack to achieve what __builtin_bit_cast
can do, therefore use __builtin_bit_cast if possible. However,
__builtin_bit_cast cannot be used to cast from/to fixed_size_simd, since
it isn't trivially copyable (in the language sense — in principle it
is). Therefore add __proposed::simd_bit_cast to enable the use case
required in the test framework.
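
A minimal sketch of the "use the builtin if available, otherwise fall back" pattern described above. The name `bit_cast_sketch` is hypothetical, and the memcpy fallback is a simplification of mine, not the fallback used in simd.h.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Compilers without __has_builtin get the fallback branch.
#ifndef __has_builtin
#  define __has_builtin(x) 0
#endif

// Hypothetical stand-in for the library-internal __bit_cast.
template <typename To, typename From>
inline To
bit_cast_sketch(const From& x) noexcept
{
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable_v<To>
                  && std::is_trivially_copyable_v<From>,
                "both types must be trivially copyable");
#if __has_builtin(__builtin_bit_cast)
  return __builtin_bit_cast(To, x);
#else
  // Fallback: requires To to be default-constructible.
  To r;
  std::memcpy(&r, &x, sizeof(To));
  return r;
#endif
}
```

The trivially-copyable requirement in the static_assert is exactly what rules out fixed_size_simd as a source or destination type.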

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h (__bit_cast): Implement via
__builtin_bit_cast #if available.
(__proposed::simd_bit_cast): Add overloads for simd and
simd_mask, which use __builtin_bit_cast (or __bit_cast #if not
available), which return an object of the requested type with
the same bits as the argument.
* include/experimental/bits/simd_math.h: Use simd_bit_cast
instead of __bit_cast to allow casts to fixed_size_simd.
(copysign): Remove branch that was only required if __bit_cast
cannot be constexpr.
* testsuite/experimental/simd/tests/bits/test_values.h: Switch
from __bit_cast to __proposed::simd_bit_cast since the former
will not cast fixed_size objects anymore.
---
 libstdc++-v3/include/experimental/bits/simd.h | 57 ++-
 .../include/experimental/bits/simd_math.h | 37 ++--
 .../simd/tests/bits/test_values.h |  8 +--
 3 files changed, 76 insertions(+), 26 deletions(-)


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 163f1b574e2..852d0b62012 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -1598,7 +1598,9 @@ template 
   _GLIBCXX_SIMD_INTRINSIC constexpr _To
   __bit_cast(const _From __x)
   {
-// TODO: implement with / replace by __builtin_bit_cast ASAP
+#if __has_builtin(__builtin_bit_cast)
+return __builtin_bit_cast(_To, __x);
+#else
 static_assert(sizeof(_To) == sizeof(_From));
 constexpr bool __to_is_vectorizable
   = is_arithmetic_v<_To> || is_enum_v<_To>;
@@ -1629,6 +1631,7 @@ template 
 			 reinterpret_cast(&__x), sizeof(_To));
 	return __r;
   }
+#endif
   }
 
 // }}}
@@ -2900,6 +2903,58 @@ template (__x)};
   }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  _To
+  simd_bit_cast(const simd<_Up, _Abi>& __x)
+  {
+using _Tp = typename _To::value_type;
+using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember;
+using _From = simd<_Up, _Abi>;
+using _FromMember = typename _SimdTraits<_Up, _Abi>::_SimdMember;
+// with concepts, the following should be constraints
+static_assert(sizeof(_To) == sizeof(_From));
+static_assert(is_trivially_copyable_v<_Tp> && is_trivially_copyable_v<_Up>);
+static_assert(is_trivially_copyable_v<_ToMember> && is_trivially_copyable_v<_FromMember>);
+#if __has_builtin(__builtin_bit_cast)
+return {__private_init, __builtin_bit_cast(_ToMember, __data(__x))};
+#else
+return {__private_init, __bit_cast<_ToMember>(__data(__x))};
+#endif
+  }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  _To
+  simd_bit_cast(const simd_mask<_Up, _Abi>& __x)
+  {
+using _From = simd_mask<_Up, _Abi>;
+static_assert(sizeof(_To) == sizeof(_From));
+static_assert(is_trivially_copyable_v<_From>);
+// _To can be simd, specifically simd> in which case _To is not trivially
+// copyable.
+if constexpr (is_simd_v<_To>)
+  {
+	using _Tp = typename _To::value_type;
+	using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember;
+	static_assert(is_trivially_copyable_v<_ToMember>);
+#if __has_builtin(__builtin_bit_cast)
+	return {__private_init, __builtin_bit_cast(_ToMember, __x)};
+#else
+	return {__private_init, __bit_cast<_ToMember>(__x)};
+#endif
+  }
+else
+  {
+	static_assert(is_trivially_copyable_v<_To>);
+#if __has_builtin(__builtin_bit_cast)
+	return __builtin_bit_cast(_To, __x);
+#else
+	return __bit_cast<_To>(__x);
+#endif
+  }
+  }
 } // namespace __proposed
 
 // simd_cast {{{2
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
index d954e761eee..ef2bdc641b8 100644
--- a/libstdc++-v3/include

Re: [PATCH] Add gnu::diagnose_as attribute

2021-06-22 Thread Matthias Kretz
On Tuesday, 22 June 2021 21:52:16 CEST Jason Merrill wrote:
> For alias templates, you probably want the attribute only on the
> templated class, not on the instantiations.

Oh good point. My current patch does not allow the attribute on alias 
templates. Consider:

template 
  struct X {};

template 
  using foo [[gnu::diagnose_as]] = X;

I have no idea how this could work. I would have to set the attribute for an 
implicit partial specialization (not that I know of the existence of such a 
thing)? I.e. X would have to be diagnosed as foo, but X would have to be diagnosed as X, not foo.

So if anything it should only support alias templates if they are strictly 
"renaming" the type. I.e. their template parameters must match up exactly. Can 
I constrain the attribute like this? Or should we rely on developers to be 
reasonable and only use it for template aliases with matching template params?
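
To make the distinction concrete, here is a small sketch. The template parameter lists are assumptions chosen for illustration (the archive does not preserve the original ones), and `renamed` is a hypothetical name.

```cpp
#include <type_traits>

template <typename T, typename U>
  struct X {};

// Not a pure rename: fixes the second argument (assumed shape of `foo`).
template <typename T>
  using foo = X<T, int>;

// Pure rename: parameters match the underlying template one-to-one.
template <typename T, typename U>
  using renamed = X<T, U>;

// Every specialization of X is also a specialization of `renamed` ...
static_assert(std::is_same_v<renamed<char, float>, X<char, float>>);
// ... but only some specializations of X can be spelled via `foo`.
static_assert(std::is_same_v<foo<char>, X<char, int>>);
static_assert(!std::is_same_v<foo<char>, X<char, float>>);
```

Only in the `renamed` case can the replacement name be applied to the class template as a whole without a per-specialization decision.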

-Matthias

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [PATCH v2] libstdc++: Improve std::lock algorithm

2021-06-22 Thread Matthias Kretz
On Tuesday, 22 June 2021 17:20:41 CEST Jonathan Wakely wrote:
> On Tue, 22 Jun 2021 at 14:21, Matthias Kretz wrote:
> > This does a try_lock on all lockables even if any of them fails. I think
> > that's not only more expensive but also non-conforming. I think you need to
> > defer locking and then loop from beginning to end to break the loop on the
> > first unsuccessful try_lock.
> 
> Oops, good point. I'll add a test for that too. Here's the fixed code:
> 
> template
>   inline int
>   __try_lock_impl(_L0& __l0, _Lockables&... __lockables)
>   {
> #if __cplusplus >= 201703L
> if constexpr ((is_same_v<_L0, _Lockables> && ...))
>   {
> constexpr int _Np = 1 + sizeof...(_Lockables);
> unique_lock<_L0> __locks[_Np] = {
> {__l0, defer_lock}, {__lockables, defer_lock}...
> };
> for (int __i = 0; __i < _Np; ++__i)

I thought coding style requires a { here?

>   if (!__locks[__i].try_lock())
> {
>   const int __failed = __i;
>   while (__i--)
> __locks[__i].unlock();
>   return __i;

You meant `return __failed`?

> }
> for (auto& __l : __locks)
>   __l.release();
> return -1;
>   }
> else
> #endif
> 
> > [...]
> > Yes, if only we had a wrapping integer type that wraps at an arbitrary N.
> > Like unsigned int but with parameter, like:
> > 
> >   for (__wrapping_uint<_Np> __k = __idx; __k != __first; --__k)
> >     __locks[__k - 1].unlock();
> > 
> > This is the loop I wanted to write, except --__k is simpler to write and
> > __k - 1 would also wrap around to _Np - 1 for __k == 0. But if this is the
> > only place it's not important enough to abstract.
> 
> We might be able to use __wrapping_uint in std::seed_seq::generate too, and
> maybe some other places in . But we can add that later if we decide
> it's worth it.

OK.
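
For illustration only, a wrapping integer of the kind discussed above could look roughly like this. `wrapping_uint` and `count_unlock_steps` are hypothetical names; nothing like this exists in libstdc++ as far as this thread shows.

```cpp
// Hypothetical sketch: an unsigned integer that wraps at N.
// Assumes N is small enough that 2 * N does not overflow unsigned.
template <unsigned N>
class wrapping_uint
{
  static_assert(N > 0, "modulus must be positive");
  unsigned v_;

public:
  constexpr wrapping_uint(unsigned v = 0) : v_(v % N) {}

  constexpr unsigned value() const { return v_; }

  constexpr wrapping_uint& operator++()
  { v_ = (v_ + 1 == N) ? 0 : v_ + 1; return *this; }

  constexpr wrapping_uint& operator--()
  { v_ = (v_ == 0) ? N - 1 : v_ - 1; return *this; }

  // k - 1 wraps to N - 1 for k == 0.
  friend constexpr wrapping_uint operator-(wrapping_uint a, unsigned b)
  { return wrapping_uint(a.v_ + (N - b % N)); }

  friend constexpr bool operator==(wrapping_uint a, wrapping_uint b)
  { return a.v_ == b.v_; }

  friend constexpr bool operator!=(wrapping_uint a, wrapping_uint b)
  { return a.v_ != b.v_; }
};

// Same shape as the loop in the quoted message: walk backwards from idx
// until first is reached, counting the slots that would be unlocked.
template <unsigned N>
constexpr unsigned
count_unlock_steps(unsigned idx, unsigned first)
{
  unsigned steps = 0;
  for (wrapping_uint<N> k = idx; k != first; --k)
    ++steps;
  return steps;
}
```

The sketch deliberately has no implicit conversion back to unsigned, so that `k - 1` cannot silently fall back to non-wrapping built-in arithmetic.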

> > I also considered moving it down here. Makes sense unless you want to call
> > __detail::__lock_impl from other functions. And if we want to make it work
> > for pre-C++11 we could do
> > 
> >   using __homogeneous
> >     = __and_, is_same<_L1, _L3>...>;
> > 
> >   int __i = 0;
> >   __detail::__lock_impl(__homogeneous(), __i, 0, __l1, __l2, __l3...);
> 
> We don't need tag dispatching, we could just do:
> 
> if _GLIBCXX17_CONSTEXPR (homogeneous::value)
>  ...
> else
>  ...
> 
> because both branches are valid for the homogeneous case, i.e. we aren't
> using if-constexpr to avoid invalid instantiations.

But for the inhomogeneous case the homogeneous code is invalid (initialization 
of C-array of unique_lock<_L1>).

> But given that the default -std option is gnu++17 now, I'm OK with the
> iterative version only being used for C++17.

Fair enough.
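
The corrected try-lock loop from this exchange (stop at the first failure, unlock what was taken, return the failed index) can be sketched standalone as follows. `ToyLock` and `try_lock_all` are hypothetical names; `ToyLock` is a single-threaded stand-in so the behavior can be observed without calling try_lock on a std::mutex the calling thread already owns.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// Single-threaded toy lockable, for demonstration only (not thread-safe).
struct ToyLock
{
  bool held = false;
  void lock() { held = true; }           // never blocks in this toy
  bool try_lock()
  {
    if (held)
      return false;
    held = true;
    return true;
  }
  void unlock() { held = false; }
};

// Returns -1 if all locks were acquired, otherwise the index of the first
// lockable whose try_lock failed (with all earlier ones unlocked again).
template <typename L0, typename... Ls>
int
try_lock_all(L0& l0, Ls&... ls)
{
  static_assert((std::is_same_v<L0, Ls> && ...),
                "sketch covers the homogeneous case only");
  constexpr int N = 1 + sizeof...(Ls);
  std::unique_lock<L0> locks[N] = {
    {l0, std::defer_lock}, {ls, std::defer_lock}...
  };
  for (int i = 0; i < N; ++i)
    {
      if (!locks[i].try_lock())
        {
          const int failed = i;
          while (i--)
            locks[i].unlock();
          return failed;               // not i, which is -1 by now
        }
    }
  for (auto& l : locks)
    l.release();                       // keep everything locked on success
  return -1;
}
```

Returning `failed` rather than `__i` matters because the unlock loop has already counted the index down past zero.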

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──





Re: [PATCH v2] libstdc++: Improve std::lock algorithm

2021-06-22 Thread Matthias Kretz
{
> +   const int __idx = (__first + __j) % _Np;
> +   if (!__locks[__idx].try_lock())
> + {
> +   for (int __k = __j; __k != 0; --__k)
> + __locks[(__first + __k - 1) % _Np].unlock();
> +   __first = __idx;
> +   break;
> + }
> + }
> + } while (!__locks[__first]);
> +
> + for (auto& __l : __locks)
> +   __l.release();
> +   }
> +  else
> +#endif
> +   {
> + int __i = 0;
> + __detail::__lock_impl(__i, 0, __l1, __l2, __l3...);
> +   }
>  }
> 
>  #if __cplusplus >= 201703L
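
Since the beginning of the quoted hunk is cut off in this archive, here is a hedged standalone sketch of the rotating algorithm it appears to implement. The opening of the loop (`locks[first].lock()` and the start of the inner `for`) is my assumption; `ToyLock` and `lock_all` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>

// Single-threaded toy lockable, for demonstration only (not thread-safe).
struct ToyLock
{
  bool held = false;
  void lock() { held = true; }           // never blocks in this toy
  bool try_lock()
  {
    if (held)
      return false;
    held = true;
    return true;
  }
  void unlock() { held = false; }
};

template <typename L1, typename... Ls>
void
lock_all(L1& l1, Ls&... ls)
{
  static_assert((std::is_same_v<L1, Ls> && ...),
                "sketch covers the homogeneous case only");
  constexpr int N = 1 + sizeof...(Ls);
  std::unique_lock<L1> locks[N] = {
    {l1, std::defer_lock}, {ls, std::defer_lock}...
  };
  int first = 0;
  do
    {
      // Block on one lock, then try the others in cyclic order.
      locks[first].lock();
      for (int j = 1; j < N; ++j)
        {
          const int idx = (first + j) % N;
          if (!locks[idx].try_lock())
            {
              // Back out, and block on the contended lock next time.
              for (int k = j; k != 0; --k)
                locks[(first + k - 1) % N].unlock();
              first = idx;
              break;
            }
        }
    }
  while (!locks[first]);               // owns first <=> all were acquired

  for (auto& l : locks)
    l.release();                       // leave everything locked
}
```

Starting the next round at the lock that was contended avoids spinning on try_lock while another thread holds it.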


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──





Re: [committed] libstdc++: Improve std::lock algorithm

2021-06-22 Thread Matthias Kretz
90,19 +627,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  void
>  lock(_L1& __l1, _L2& __l2, _L3&... __l3)
>  {
> -  while (true)
> -{
> -  using __try_locker = __try_lock_impl<0, sizeof...(_L3) != 0>;
> -  unique_lock<_L1> __first(__l1);
> -  int __idx;
> -  auto __locks = std::tie(__l2, __l3...);
> -  __try_locker::__do_try_lock(__locks, __idx);
> -  if (__idx == -1)
> -{
> -  __first.release();
> -  return;
> -}
> -}
> +  int __i = 0;
> +  __detail::__lock_impl(__i, 0, __l1, __l2, __l3...);
>  }
> 
>  #if __cplusplus >= 201703L


-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──





Re: [PATCH] Add gnu::diagnose_as attribute

2021-06-22 Thread Matthias Kretz
On Wednesday, 16 June 2021 02:48:09 CEST Jason Merrill wrote:
> > IIUC, your main concern is that my proposed diagnose_as *can* be used to
> > make diagnostics worse, by replacing names with strings that are not
> > valid identifiers. Of course, whoever uses the attribute to that effect
> > should have a good reason to do so. Is your other concern that using the
> > attribute in a "good" way is repetitive? Would you be happier if I make
> > the string argument to the attribute optional for type aliases?
> 
> Yes, and namespace aliases.

I'll look into making the attribute argument optional for aliases. Would you 
accept the patch with this change?

Questions:

1. If a type alias applies the attribute after a type was completed / 
implicitly instantiated (and possibly already used in diagnostics) should / 
can I still modify the type and add the attribute?

2. About the namespace aliases: IIUC an attribute would currently be rejected 
because of the C++ grammar. Do you want to make it valid before WG21 
officially decides how to proceed? And if you have a pointer for me where I'd 
have to adjust the grammar rules, that'd help. :)

Best,
  Matthias

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [PATCH] Add gnu::diagnose_as attribute

2021-06-15 Thread Matthias Kretz
On Tuesday, 15 June 2021 17:51:20 CEST Jason Merrill wrote:
> On 6/11/21 6:01 AM, Matthias Kretz wrote:
> > For reference I'll attach my stdx::simd diagnose_as patch.
> > 
> > We could also talk about extending the feature to provide more information
> > about the diagnose_as substitution. E.g. print a list of all diagnose_as
> > substitutions, which were used, at the end of the output stream. Or
> > simpler, print "note: some identifiers were simplified, use
> > -fno-diagnostics-use-aliases to see their real names".
> 
> Or perhaps before the first use of a name that doesn't correspond to a
> source-level name.

Right. I guess that would be even easier to implement than printing it at the 
end.

> > -struct _Scalar;
> > +  struct [[__gnu__::__diagnose_as__("scalar")]] _Scalar;
> > 
> >  template 
> > 
> > -  struct _Fixed;
> > +  struct [[__gnu__::__diagnose_as__("fixed_size")]] _Fixed;
> 
> These two could be the variant of the attribute without an explicit
> string, attached to the alias-declaration.

Agreed. (since you don't have implementation concerns...)

> > +using __sse [[__gnu__::__diagnose_as__("[SSE]")]] = _VecBuiltin<16>;
> > +using __avx [[__gnu__::__diagnose_as__("[AVX]")]] = _VecBuiltin<32>;
> > +using __avx512 [[__gnu__::__diagnose_as__("[AVX512]")]] = _VecBltnBtmsk<64>;
> > +  using __odr_helper [[__gnu__::__diagnose_as__("[ODR helper]")]]
> These [] names seem like minimal improvements over the __ names that you
> would get from the attribute without an explicit string.

Right. It would, however, give the user an identifier that I don't want them 
to use in their code. We could argue "it has a double-underscore and it's not 
a documented implementation-defined type, so you're shooting yourself in the 
foot". Or we could just avoid the issue altogether. I agree this is not a huge 
issue.

> > +   inline namespace parallelism_v2 [[__gnu__::__diagnose_as__("std\u2093")]] {
> This could go on std::experimental itself, along with my proposed change
> to hide inline namespaces by default (with a note similar to the one above).

Yes, with the following consequences:

* If only the std::experimental::parallelism_v2::simd headers set the 
diagnose_as attribute on std::experimental, the #inclusion of  changes the diagnostics of all other TS implementations.

* If all TS implementations set the diagnose_as attribute, then it's basically 
impossible to go back to the long and scary name. Which is what we really 
should do as soon as there's both a std::simd and a stdₓ::simd. Attaching the 
diagnose_as attribute to the inline namespace allows for better granularity, 
even if it's maybe not good enough for some TSs.

* If `namespace std { namespace experimental [[gnu::diagnose_as("foo")]] {` 
turns the scope into 'foo::' and not 'std::foo::' (not sure what you intended) 
then I could still attach the attribute to the inline namespace.


So, yes, I could improve stdx::simd with what you propose. IMHO it wouldn't be 
as good as what I can do with the patch at hand, though.

IIUC, your main concern is that my proposed diagnose_as *can* be used to make 
diagnostics worse, by replacing names with strings that are not valid 
identifiers. Of course, whoever uses the attribute to that effect should have 
a good reason to do so. Is your other concern that using the attribute in a 
"good" way is repetitive? Would you be happier if I make the string argument 
to the attribute optional for type aliases?
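
For readers who want to experiment, the struct-level usage quoted above could be wrapped so that the same source also builds on compilers without the attribute. This is a sketch under assumptions: `SKETCH_DIAGNOSE_AS`, `sketch_abi`, `Scalar_impl` and `Fixed_impl` are hypothetical names, and it is an assumption that a compiler carrying this patch would report the attribute via `__has_cpp_attribute`.

```cpp
#include <type_traits>

// Expand to the attribute only where the compiler claims to know it.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(gnu::diagnose_as)
#    define SKETCH_DIAGNOSE_AS(name) [[gnu::diagnose_as(name)]]
#  endif
#endif
#ifndef SKETCH_DIAGNOSE_AS
#  define SKETCH_DIAGNOSE_AS(name)
#endif

namespace sketch_abi
{
  // Implementation types that diagnostics should show under public names.
  struct SKETCH_DIAGNOSE_AS("scalar") Scalar_impl {};

  template <int N>
    struct SKETCH_DIAGNOSE_AS("fixed_size") Fixed_impl {};

  // The public spellings users are expected to write.
  using scalar = Scalar_impl;

  template <int N>
    using fixed_size = Fixed_impl<N>;
}

// The attribute only affects how names are printed, never type identity.
static_assert(std::is_same_v<sketch_abi::fixed_size<4>,
                             sketch_abi::Fixed_impl<4>>);
```

Here the replacement strings are valid identifiers that match the public aliases, which is the "good" usage both sides of the discussion agree on.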

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──




Re: [PATCH 04/11 v2] libstdc++: Make use of __builtin_bit_cast

2021-06-11 Thread Matthias Kretz
While testing newer patches I found several missing conversions from 
__bit_cast to simd_bit_cast in this patch (i.e. where bit casting to / from 
fixed_size was sometimes required). Corrected patch attached.


From: Matthias Kretz 

The __bit_cast function was a hack to achieve what __builtin_bit_cast
can do, therefore use __builtin_bit_cast if possible. However,
__builtin_bit_cast cannot be used to cast from/to fixed_size_simd, since
it isn't trivially copyable (in the language sense — in principle it
is). Therefore add __proposed::simd_bit_cast to enable the use case
required in the test framework.
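
The point about "not trivially copyable in the language sense" can be sketched with a toy wrapper: the wrapper has a user-provided copy constructor, so a generic bit cast must reject it, yet its data member can still be converted bit for bit. All names here (`Box`, `IntStorage`, `FloatStorage`, `box_bit_cast`) are hypothetical, and memcpy stands in for the builtin used by the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

struct IntStorage   { std::uint32_t v[4]; };
struct FloatStorage { float v[4]; };

// Toy stand-in for a simd-like wrapper around trivially copyable storage.
template <typename Storage>
struct Box
{
  Storage data;

  Box() : data{} {}
  explicit Box(const Storage& s) : data(s) {}
  // User-provided copy constructor: Box is no longer trivially copyable.
  Box(const Box& o) : data(o.data) {}
};

static_assert(!std::is_trivially_copyable_v<Box<IntStorage>>);
static_assert(std::is_trivially_copyable_v<IntStorage>);

// Cast between wrappers by bit-copying the members instead of the wrappers.
template <typename To, typename From>
To
box_bit_cast(const From& x)
{
  using ToMember = decltype(To::data);
  using FromMember = decltype(From::data);
  static_assert(sizeof(ToMember) == sizeof(FromMember));
  static_assert(std::is_trivially_copyable_v<ToMember>
                  && std::is_trivially_copyable_v<FromMember>);
  ToMember m;
  std::memcpy(&m, &x.data, sizeof(m));
  return To(m);
}
```

This mirrors the structure of simd_bit_cast in the diff, where the constraints are checked on the `_SimdMember` types rather than on the simd objects themselves.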

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h (__bit_cast): Implement via
__builtin_bit_cast #if available.
(__proposed::simd_bit_cast): Add overloads for simd and
simd_mask, which use __builtin_bit_cast (or __bit_cast #if not
available), which return an object of the requested type with
the same bits as the argument.
* include/experimental/bits/simd_math.h: Use simd_bit_cast
instead of __bit_cast to allow casts to fixed_size_simd.
(copysign): Remove branch that was only required if __bit_cast
cannot be constexpr.
* testsuite/experimental/simd/tests/bits/test_values.h: Switch
from __bit_cast to __proposed::simd_bit_cast since the former
will not cast fixed_size objects anymore.
---
 libstdc++-v3/include/experimental/bits/simd.h | 57 ++-
 .../include/experimental/bits/simd_math.h | 36 +---
 .../simd/tests/bits/test_values.h |  8 +--
 3 files changed, 75 insertions(+), 26 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 163f1b574e2..852d0b62012 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -1598,7 +1598,9 @@ template 
   _GLIBCXX_SIMD_INTRINSIC constexpr _To
   __bit_cast(const _From __x)
   {
-// TODO: implement with / replace by __builtin_bit_cast ASAP
+#if __has_builtin(__builtin_bit_cast)
+return __builtin_bit_cast(_To, __x);
+#else
 static_assert(sizeof(_To) == sizeof(_From));
 constexpr bool __to_is_vectorizable
   = is_arithmetic_v<_To> || is_enum_v<_To>;
@@ -1629,6 +1631,7 @@ template 
 			 reinterpret_cast(&__x), sizeof(_To));
 	return __r;
   }
+#endif
   }
 
 // }}}
@@ -2900,6 +2903,58 @@ template (__x)};
   }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  _To
+  simd_bit_cast(const simd<_Up, _Abi>& __x)
+  {
+using _Tp = typename _To::value_type;
+using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember;
+using _From = simd<_Up, _Abi>;
+using _FromMember = typename _SimdTraits<_Up, _Abi>::_SimdMember;
+// with concepts, the following should be constraints
+static_assert(sizeof(_To) == sizeof(_From));
+static_assert(is_trivially_copyable_v<_Tp> && is_trivially_copyable_v<_Up>);
+static_assert(is_trivially_copyable_v<_ToMember> && is_trivially_copyable_v<_FromMember>);
+#if __has_builtin(__builtin_bit_cast)
+return {__private_init, __builtin_bit_cast(_ToMember, __data(__x))};
+#else
+return {__private_init, __bit_cast<_ToMember>(__data(__x))};
+#endif
+  }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  _To
+  simd_bit_cast(const simd_mask<_Up, _Abi>& __x)
+  {
+using _From = simd_mask<_Up, _Abi>;
+static_assert(sizeof(_To) == sizeof(_From));
+static_assert(is_trivially_copyable_v<_From>);
+// _To can be simd, specifically simd> in which case _To is not trivially
+// copyable.
+if constexpr (is_simd_v<_To>)
+  {
+	using _Tp = typename _To::value_type;
+	using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember;
+	static_assert(is_trivially_copyable_v<_ToMember>);
+#if __has_builtin(__builtin_bit_cast)
+	return {__private_init, __builtin_bit_cast(_ToMember, __x)};
+#else
+	return {__private_init, __bit_cast<_ToMember>(__x)};
+#endif
+  }
+else
+  {
+	static_assert(is_trivially_copyable_v<_To>);
+#if __has_builtin(__builtin_bit_cast)
+	return __builtin_bit_cast(_To, __x);
+#else
+	return __bit_cast<_To>(__x);
+#endif
+  }
+  }
 } // namespace __proposed
 
 // simd_cast {{{2
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/

Re: [PATCH] Add gnu::diagnose_as attribute

2021-06-11 Thread Matthias Kretz
How can we make progress here? I could try to produce some "Tony Tables" of 
diagnostic output of my modified stdx::simd. I believe it's a major 
productivity boost to see abbreviated / "obfuscated" diagnostics *out of the 
box* (with the possibility to opt-out). Actually, it already *is* a 
productivity boost to me. Understanding diagnostics has improved from 

"1. ooof, I'm not going to read this, let me rather guess what the issue was
2. sh** I have to read it
3. several minutes later: I finally found the five words to understand the 
problem; I could use a break"

to

"1. right, let me check that"

For reference I'll attach my stdx::simd diagnose_as patch.

We could also talk about extending the feature to provide more information 
about the diagnose_as substitution. E.g. print a list of all diagnose_as 
substitutions, which were used, at the end of the output stream. Or simpler, 
print "note: some identifiers were simplified, use 
-fno-diagnostics-use-aliases to see their real names".

On Tuesday, 1 June 2021 21:12:18 CEST Jason Merrill wrote:
> > Right, but then two of my design goals can't be met:
> > 
> > 1. Diagnostics have an improved signal-to-noise ratio out of the box.
> > 
> > 2. We can use replacement names that are not valid identifiers.
> 
> This is the basic disconnect: I think that these goals are
> contradictory, and that replacement names that are not valid identifiers
> will just confuse users that don't know about them.
> 
> If a user sees stdx::foo in a diagnostic and then tries to refer to
> stdx::foo and gets an error, the diagnostic is not more helpful than one
> that uses the fully qualified name.
> 
> Jonathan, David, any thoughts on this issue?

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 43331134301..8e0cceff860 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -80,13 +80,13 @@ using __m512d [[__gnu__::__vector_size__(64)]] = double;
 using __m512i [[__gnu__::__vector_size__(64)]] = long long;
 #endif
 
-namespace simd_abi {
+namespace simd_abi [[__gnu__::__diagnose_as__("simd_abi")]] {
 // simd_abi forward declarations {{{
 // implementation details:
-struct _Scalar;
+  struct [[__gnu__::__diagnose_as__("scalar")]] _Scalar;
 
 template 
-  struct _Fixed;
+  struct [[__gnu__::__diagnose_as__("fixed_size")]] _Fixed;
 
 // There are two major ABIs that appear on different architectures.
 // Both have non-boolean values packed into an N Byte register
@@ -105,28 +105,11 @@ template 
 template 
   struct _VecBltnBtmsk;
 
-template 
-  using _VecN = _VecBuiltin;
-
-template 
-  using _Sse = _VecBuiltin<_UsedBytes>;
-
-template 
-  using _Avx = _VecBuiltin<_UsedBytes>;
-
-template 
-  using _Avx512 = _VecBltnBtmsk<_UsedBytes>;
-
-template 
-  using _Neon = _VecBuiltin<_UsedBytes>;
-
-// implementation-defined:
-using __sse = _Sse<>;
-using __avx = _Avx<>;
-using __avx512 = _Avx512<>;
-using __neon = _Neon<>;
-using __neon128 = _Neon<16>;
-using __neon64 = _Neon<8>;
+#if defined __i386__ || defined __x86_64__
+using __sse [[__gnu__::__diagnose_as__("[SSE]")]] = _VecBuiltin<16>;
+using __avx [[__gnu__::__diagnose_as__("[AVX]")]] = _VecBuiltin<32>;
+using __avx512 [[__gnu__::__diagnose_as__("[AVX512]")]] = _VecBltnBtmsk<64>;
+#endif
 
 // standard:
 template 
@@ -364,7 +347,7 @@ namespace __detail
* users link TUs compiled with different flags. This is especially important
* for using simd in libraries.
*/
-  using __odr_helper
+  using __odr_helper [[__gnu__::__diagnose_as__("[ODR helper]")]]
 = conditional_t<__machine_flags() == 0, _OdrEnforcer,
 		_MachineFlagsTemplate<__machine_flags(), __floating_point_flags()>>;
 
@@ -689,7 +672,7 @@ template 
   __is_avx512_abi()
   {
 constexpr auto _Bytes = __abi_bytes_v<_Abi>;
-return _Bytes <= 64 && is_same_v, _Abi>;
+return _Bytes <= 64 && is_same_v, _Abi>;
   }
 
 // }}}
diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h b/libstdc++-v3/include/experimental/bits/simd_detail.h
index 78ad33f74e4..1f127cd0d52 100644
--- a/libstdc++-v3/include/experimental/bits/simd_detail.h
+++ b/libstdc++-v3/include/experimental/bits/simd_detail.h
@@ -36,7 +36,7 @@
   {

Re: [PATCH 11/11] libstdc++: Fix ODR issues with different -m flags

2021-06-09 Thread Matthias Kretz
On Wednesday, 9 June 2021 14:22:00 CEST Richard Biener wrote:
> On Tue, Jun 8, 2021 at 2:23 PM Matthias Kretz  wrote:
> > From: Matthias Kretz 
> > 
> > Explicitly support use of the stdx::simd implementation in situations
> > where the user links TUs that were compiled with different -m flags. In
> > general, this is always a (quasi) ODR violation for inline functions
> > because at least codegen may differ in important ways. However, in the
> > resulting executable only one (unspecified which one) of them might be
> > used. For simd we want to support users to compile code multiple times,
> > with different -m flags and have a runtime dispatch to the TU matching
> > the target CPU. But if internal functions are not inlined this may lead
> > to unexpected performance loss or execution of illegal instructions.
> > Therefore, inline functions that are not marked as always_inline must
> > use an additional template parameter somewhere in their name, to
> > disambiguate between the different -m translations.
> 
> Note that excessive use of always_inline can cause compile-time issues
> (see for example PR99785).

Ah, I should verify whether that's also the reason my stdx::simd 
implementation is slow to compile.

However, I really must have the always_inline semantics in most of the places 
stdx::simd uses it. Because most of these functions compile to either a single 
function call or a single instruction (often f0 -> f1 -> f2 -> single 
instruction). If the inliner even makes one single wrong inlining decision, 
the whole program might slow down by integral factors, not only small 
percentages. And without inlining these functions, -fno-inline builds (i.e. 
many debug builds) become unbearably slow (aka useless).
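
To make the f0 -> f1 -> f2 chain concrete, here is a minimal sketch. The names f0, f1, f2 and caller are placeholders taken from the arrow notation above, not stdx::simd internals, and [[gnu::always_inline]] is the GCC/Clang attribute spelling:

```cpp
#include <cassert>

// Hypothetical three-level wrapper chain; each level only forwards.
// With always_inline, GCC folds the whole chain into the caller even
// when the inliner is otherwise disabled (e.g. -O0 or -fno-inline).
[[gnu::always_inline]] inline int f2(int x) { return x + 1; }
[[gnu::always_inline]] inline int f1(int x) { return f2(x); }
[[gnu::always_inline]] inline int f0(int x) { return f1(x); }

// Ordinary caller: after inlining, its body reduces to `x + 1`.
int caller(int x) { return f0(x); }
```

Without the attribute, a debug build would emit three real calls for what is meant to be a single instruction.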

> I wonder whether the inlines can be
> placed in an anonymous namespace instead of the difficult to maintain
> explicit list of SIMD features?

It's possible, and part of the patch:

+  namespace
+  {
+struct _OdrEnforcer {};
+  }
[...]
+  using __odr_helper
+= conditional_t<__machine_flags() == 0, _OdrEnforcer,
+   _MachineFlagsTemplate<__machine_flags(), __floating_point_flags()>>;

It can potentially blow up the code size and the instruction cache usage, 
though. The trade-off isn't obvious to make. I guess I can't promise that 
mixing different compiler flags is ODR violation free.

> It also doesn't solve the issue when
> instantiating the functions from a TU which contains #pragma GCC target
> sections to switch options, of course.

Yes. Can I get PR83875? ;-)

- Matthias

> > Signed-off-by: Matthias Kretz 
> > 
> > libstdc++-v3/ChangeLog:
> > * include/experimental/bits/simd.h: Move feature detection bools
> > and add __have_avx512bitalg, __have_avx512vbmi2,
> > __have_avx512vbmi, __have_avx512ifma, __have_avx512cd,
> > __have_avx512vnni, __have_avx512vpopcntdq.
> > (__detail::__machine_flags): New function which returns a unique
> > uint64 depending on relevant -m and -f flags.
> > (__detail::__odr_helper): New type alias for either an anonymous
> > type or a type specialized with the __machine_flags number.
> > (_SimdIntOperators): Change template parameters from _Impl to
> > _Tp, _Abi because _Impl now has an __odr_helper parameter which
> > may be _OdrEnforcer from the anonymous namespace, which makes
> > for a bad base class.
> > (many): Either add __odr_helper template parameter or mark as
> > always_inline.
> > * include/experimental/bits/simd_detail.h: Add defines for
> > AVX512BITALG, AVX512VBMI2, AVX512VBMI, AVX512IFMA, AVX512CD,
> > AVX512VNNI, AVX512VPOPCNTDQ, and AVX512VP2INTERSECT.
> > * include/experimental/bits/simd_builtin.h: Add __odr_helper
> > template parameter or mark as always_inline.
> > * include/experimental/bits/simd_fixed_size.h: Ditto.
> > * include/experimental/bits/simd_math.h: Ditto.
> > * include/experimental/bits/simd_scalar.h: Ditto.
> > * include/experimental/bits/simd_neon.h: Add __odr_helper
> > template parameter.
> > * include/experimental/bits/simd_ppc.h: Ditto.
> > * include/experimental/bits/simd_x86.h: Ditto.
> > 
> > ---
> > 
> >  libstdc++-v3/include/experimental/bits/simd.h | 380 --
> >  .../include/experimental/bits/simd_builtin.h  |  41 +-
> >  .../include/experimental/bits/simd_detail.h   |  40 ++
> >  .../experimental/bits/simd_fixed_size.h   |  39 +-

[PATCH 11/11] libstdc++: Fix ODR issues with different -m flags

2021-06-08 Thread Matthias Kretz

From: Matthias Kretz 

Explicitly support use of the stdx::simd implementation in situations
where the user links TUs that were compiled with different -m flags. In
general, this is always a (quasi) ODR violation for inline functions
because at least codegen may differ in important ways. However, in the
resulting executable only one (unspecified which one) of them might be
used. For simd we want to support users to compile code multiple times,
with different -m flags and have a runtime dispatch to the TU matching
the target CPU. But if internal functions are not inlined this may lead
to unexpected performance loss or execution of illegal instructions.
Therefore, inline functions that are not marked as always_inline must
use an additional template parameter somewhere in their name, to
disambiguate between the different -m translations.
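
The idea can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: machine_flags, OdrEnforcer, MachineFlagsTag, odr_helper, and add_one are made-up names, and the three feature macros are only a small sample of what a real flag encoding would have to cover.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

namespace sketch {
// Hypothetical flag encoding: one bit per ISA extension that is enabled
// for the translation unit being compiled.
constexpr std::uint64_t machine_flags()
{
  std::uint64_t flags = 0;
#ifdef __SSE4_2__
  flags |= 1u << 0;
#endif
#ifdef __AVX2__
  flags |= 1u << 1;
#endif
#ifdef __AVX512F__
  flags |= 1u << 2;
#endif
  return flags;
}

namespace {
// Internal linkage: this is a distinct type in every TU.
struct OdrEnforcer {};
}

template <std::uint64_t Flags>
struct MachineFlagsTag {};

// Fall back to the TU-local type when no flag distinguishes the TU.
using odr_helper = std::conditional_t<machine_flags() == 0, OdrEnforcer,
                                      MachineFlagsTag<machine_flags()>>;

// Not always_inline, so the tag is made part of the symbol name: TUs built
// with different -m flags instantiate differently named functions.
template <typename T, typename = odr_helper>
T add_one(T x) { return x + T(1); }

// Different flag values really are different types.
static_assert(!std::is_same_v<MachineFlagsTag<1>, MachineFlagsTag<2>>);
} // namespace sketch
```

The defaulted template parameter keeps call sites unchanged; only the mangled name differs between translations.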

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h: Move feature detection bools
and add __have_avx512bitalg, __have_avx512vbmi2,
__have_avx512vbmi, __have_avx512ifma, __have_avx512cd,
__have_avx512vnni, __have_avx512vpopcntdq.
(__detail::__machine_flags): New function which returns a unique
uint64 depending on relevant -m and -f flags.
(__detail::__odr_helper): New type alias for either an anonymous
type or a type specialized with the __machine_flags number.
(_SimdIntOperators): Change template parameters from _Impl to
_Tp, _Abi because _Impl now has an __odr_helper parameter which
may be _OdrEnforcer from the anonymous namespace, which makes
for a bad base class.
(many): Either add __odr_helper template parameter or mark as
always_inline.
* include/experimental/bits/simd_detail.h: Add defines for
AVX512BITALG, AVX512VBMI2, AVX512VBMI, AVX512IFMA, AVX512CD,
AVX512VNNI, AVX512VPOPCNTDQ, and AVX512VP2INTERSECT.
* include/experimental/bits/simd_builtin.h: Add __odr_helper
template parameter or mark as always_inline.
* include/experimental/bits/simd_fixed_size.h: Ditto.
* include/experimental/bits/simd_math.h: Ditto.
* include/experimental/bits/simd_scalar.h: Ditto.
* include/experimental/bits/simd_neon.h: Add __odr_helper
template parameter.
* include/experimental/bits/simd_ppc.h: Ditto.
* include/experimental/bits/simd_x86.h: Ditto.
---
 libstdc++-v3/include/experimental/bits/simd.h | 380 --
 .../include/experimental/bits/simd_builtin.h  |  41 +-
 .../include/experimental/bits/simd_detail.h   |  40 ++
 .../experimental/bits/simd_fixed_size.h   |  39 +-
 .../include/experimental/bits/simd_math.h |  45 ++-
 .../include/experimental/bits/simd_neon.h |   4 +-
 .../include/experimental/bits/simd_ppc.h  |   4 +-
 .../include/experimental/bits/simd_scalar.h   |  71 +++-
 .../include/experimental/bits/simd_x86.h  |   4 +-
 9 files changed, 440 insertions(+), 188 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 21100c1087d..43331134301 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -35,6 +35,7 @@
 #include  // for stderr
 #endif
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -203,9 +204,170 @@ template 
 // }}}
 template 
   using _SizeConstant = integral_constant;
+// constexpr feature detection{{{
+constexpr inline bool __have_mmx = _GLIBCXX_SIMD_HAVE_MMX;
+constexpr inline bool __have_sse = _GLIBCXX_SIMD_HAVE_SSE;
+constexpr inline bool __have_sse2 = _GLIBCXX_SIMD_HAVE_SSE2;
+constexpr inline bool __have_sse3 = _GLIBCXX_SIMD_HAVE_SSE3;
+constexpr inline bool __have_ssse3 = _GLIBCXX_SIMD_HAVE_SSSE3;
+constexpr inline bool __have_sse4_1 = _GLIBCXX_SIMD_HAVE_SSE4_1;
+constexpr inline bool __have_sse4_2 = _GLIBCXX_SIMD_HAVE_SSE4_2;
+constexpr inline bool __have_xop = _GLIBCXX_SIMD_HAVE_XOP;
+constexpr inline bool __have_avx = _GLIBCXX_SIMD_HAVE_AVX;
+constexpr inline bool __have_avx2 = _GLIBCXX_SIMD_HAVE_AVX2;
+constexpr inline bool __have_bmi = _GLIBCXX_SIMD_HAVE_BMI1;
+constexpr inline bool __have_bmi2 = _GLIBCXX_SIMD_HAVE_BMI2;
+constexpr inline bool __have_lzcnt = _GLIBCXX_SIMD_HAVE_LZCNT;
+constexpr inline bool __have_sse4a = _GLIBCXX_SIMD_HAVE_SSE4A;
+constexpr inline bool __have_fma = _GLIBCXX_SIMD_HAVE_FMA;
+constexpr inline bool __have_fma4 = _GLIBCXX_SIMD_HAVE_FMA4;
+constexpr inline bool __have_f16c = _GLIBCXX_SIMD_HAVE_F16C;
+constexpr inline bool __have_popcnt

[PATCH 10/11] libstdc++: Fix internal names: add missing underscores

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_math.h
(_GLIBCXX_SIMD_MATH_CALL2_): Rename arg2_ to __arg2.
(_GLIBCXX_SIMD_MATH_CALL3_): Rename arg2_ to __arg2 and arg3_ to
__arg3.
---
 libstdc++-v3/include/experimental/bits/simd_math.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
index a5df2039970..61af9fc67af 100644
--- a/libstdc++-v3/include/experimental/bits/simd_math.h
+++ b/libstdc++-v3/include/experimental/bits/simd_math.h
@@ -119,10 +119,10 @@ template 
 
 //}}}
 // _GLIBCXX_SIMD_MATH_CALL2_ {{{
-#define _GLIBCXX_SIMD_MATH_CALL2_(__name, arg2_)   \
+#define _GLIBCXX_SIMD_MATH_CALL2_(__name, __arg2)  \
 template < \
   typename _Tp, typename _Abi, typename...,\
-  typename _Arg2 = _Extra_argument_type<arg2_, _Tp, _Abi>, \
+  typename _Arg2 = _Extra_argument_type<__arg2, _Tp, _Abi>,\
   typename _R = _Math_return_type_t<   \
 decltype(std::__name(declval(), _Arg2::declval())), _Tp, _Abi>>\
   enable_if_t, _R>\
@@ -137,7 +137,7 @@ template\
   declval(),   \
   declval, \
+	  is_same<__arg2, _Tp>,\
 	  negation, simd<_Tp, _Abi>>>,   \
 	  is_convertible<_Up, simd<_Tp, _Abi>>, is_floating_point<_Tp>>,   \
 	double>>())),  \
@@ -147,10 +147,10 @@ template\
 
 // }}}
 // _GLIBCXX_SIMD_MATH_CALL3_ {{{
-#define _GLIBCXX_SIMD_MATH_CALL3_(__name, arg2_, arg3_)\
+#define _GLIBCXX_SIMD_MATH_CALL3_(__name, __arg2, __arg3)  \
 template , \
-	  typename _Arg3 = _Extra_argument_type, \
+	  typename _Arg2 = _Extra_argument_type<__arg2, _Tp, _Abi>,\
+	  typename _Arg3 = _Extra_argument_type<__arg3, _Tp, _Abi>,\
 	  typename _R = _Math_return_type_t<   \
 	decltype(std::__name(declval(), _Arg2::declval(),  \
  _Arg3::declval())),   \


[PATCH 09/11] libstdc++: Ensure unrolled loops inline the lambda

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h (__execute_on_index_sequence,
__execute_on_index_sequence_with_return,
__call_with_n_evaluations, __call_with_subscripts): Add flatten
attribute.
---
 libstdc++-v3/include/experimental/bits/simd.h | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 5d243f22434..21100c1087d 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -234,7 +234,8 @@ namespace __detail
 // unrolled/pack execution helpers
 // __execute_n_times{{{
 template 
-  _GLIBCXX_SIMD_INTRINSIC constexpr void
+  [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr
+  void
   __execute_on_index_sequence(_Fp&& __f, index_sequence<_I...>)
   { ((void)__f(_SizeConstant<_I>()), ...); }
 
@@ -254,7 +255,8 @@ template 
 // }}}
 // __generate_from_n_evaluations{{{
 template 
-  _GLIBCXX_SIMD_INTRINSIC constexpr _R
+  [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr
+  _R
   __execute_on_index_sequence_with_return(_Fp&& __f, index_sequence<_I...>)
   { return _R{__f(_SizeConstant<_I>())...}; }
 
@@ -269,7 +271,8 @@ template 
 // }}}
 // __call_with_n_evaluations{{{
 template 
-  _GLIBCXX_SIMD_INTRINSIC constexpr auto
+  [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr
+  auto
   __call_with_n_evaluations(index_sequence<_I...>, _F0&& __f0, _FArgs&& __fargs)
   { return __f0(__fargs(_SizeConstant<_I>())...); }
 
@@ -285,7 +288,8 @@ template 
 // }}}
 // __call_with_subscripts{{{
 template 
-  _GLIBCXX_SIMD_INTRINSIC constexpr auto
+  [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr
+  auto
   __call_with_subscripts(_Tp&& __x, index_sequence<_It...>, _Fp&& __fun)
   { return __fun(__x[_First + _It]...); }
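
For readers unfamiliar with these helpers, here is a standalone sketch of the unrolling pattern the flatten attribute is applied to. It assumes C++17 and a compiler that understands [[gnu::flatten]] (GCC, Clang); the unprefixed names and sum_of_indices are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Calls f once per index via a fold over the comma operator. flatten asks
// the compiler to inline every call made inside this function body, i.e.
// the lambda invocations.
template <typename F, std::size_t... I>
[[gnu::flatten]] constexpr void
execute_on_index_sequence(F&& f, std::index_sequence<I...>)
{ ((void)f(std::integral_constant<std::size_t, I>()), ...); }

// Convenience wrapper: run f for the indices 0 .. N-1.
template <std::size_t N, typename F>
constexpr void
execute_n_times(F&& f)
{ execute_on_index_sequence(std::forward<F>(f), std::make_index_sequence<N>()); }

// Usage: the index arrives as an integral_constant, so it is usable as a
// compile-time constant inside the lambda.
constexpr int
sum_of_indices()
{
  int sum = 0;
  execute_n_times<4>([&sum](auto i) { sum += static_cast<int>(i()); });
  return sum;
}

static_assert(sum_of_indices() == 0 + 1 + 2 + 3);
```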
 


[PATCH 08/11] libstdc++: Avoid raising fp exceptions in trunc, floor, and ceil

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_x86.h (_S_trunc, _S_floor,
_S_ceil): Set bit 8 (_MM_FROUND_NO_EXC) on AVX and SSE4.1
roundp[sd] calls.
---
 .../include/experimental/bits/simd_x86.h  | 24 +--
 1 file changed, 12 insertions(+), 12 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
index 5706bf63845..34633c096b1 100644
--- a/libstdc++-v3/include/experimental/bits/simd_x86.h
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -2657,13 +2657,13 @@ template 
 	else if constexpr (__is_avx512_pd<_Tp, _Np>())
 	  return _mm512_roundscale_pd(__x, 0x0b);
 	else if constexpr (__is_avx_ps<_Tp, _Np>())
-	  return _mm256_round_ps(__x, 0x3);
+	  return _mm256_round_ps(__x, 0xb);
 	else if constexpr (__is_avx_pd<_Tp, _Np>())
-	  return _mm256_round_pd(__x, 0x3);
+	  return _mm256_round_pd(__x, 0xb);
 	else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
-	  return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0x3));
+	  return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0xb));
 	else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
-	  return _mm_round_pd(__x, 0x3);
+	  return _mm_round_pd(__x, 0xb);
 	else if constexpr (__is_sse_ps<_Tp, _Np>())
 	  {
 	auto __truncated
@@ -2786,13 +2786,13 @@ template 
 	else if constexpr (__is_avx512_pd<_Tp, _Np>())
 	  return _mm512_roundscale_pd(__x, 0x09);
 	else if constexpr (__is_avx_ps<_Tp, _Np>())
-	  return _mm256_round_ps(__x, 0x1);
+	  return _mm256_round_ps(__x, 0x9);
 	else if constexpr (__is_avx_pd<_Tp, _Np>())
-	  return _mm256_round_pd(__x, 0x1);
+	  return _mm256_round_pd(__x, 0x9);
 	else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
-	  return __auto_bitcast(_mm_floor_ps(__to_intrin(__x)));
+	  return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0x9));
 	else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
-	  return _mm_floor_pd(__x);
+	  return _mm_round_pd(__x, 0x9);
 	else
 	  return _Base::_S_floor(__x);
   }
@@ -2808,13 +2808,13 @@ template 
 	else if constexpr (__is_avx512_pd<_Tp, _Np>())
 	  return _mm512_roundscale_pd(__x, 0x0a);
 	else if constexpr (__is_avx_ps<_Tp, _Np>())
-	  return _mm256_round_ps(__x, 0x2);
+	  return _mm256_round_ps(__x, 0xa);
 	else if constexpr (__is_avx_pd<_Tp, _Np>())
-	  return _mm256_round_pd(__x, 0x2);
+	  return _mm256_round_pd(__x, 0xa);
 	else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>())
-	  return __auto_bitcast(_mm_ceil_ps(__to_intrin(__x)));
+	  return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0xa));
 	else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>())
-	  return _mm_ceil_pd(__x);
+	  return _mm_round_pd(__x, 0xa);
 	else
 	  return _Base::_S_ceil(__x);
   }
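
The changed immediates are the old rounding modes with the 0x8 bit OR'ed in. A small sketch of the arithmetic follows; the constants mirror the _MM_FROUND_* values from <smmintrin.h> as plain integers so that it compiles on any target, and the lower-case names are made up for this example.

```cpp
#include <cassert>

// Rounding-control values as defined in <smmintrin.h>.
constexpr int fround_to_neg_inf = 0x01; // _MM_FROUND_TO_NEG_INF (floor)
constexpr int fround_to_pos_inf = 0x02; // _MM_FROUND_TO_POS_INF (ceil)
constexpr int fround_to_zero    = 0x03; // _MM_FROUND_TO_ZERO    (trunc)
constexpr int fround_no_exc     = 0x08; // _MM_FROUND_NO_EXC

// Immediates after the patch: rounding mode plus "suppress exceptions".
constexpr int trunc_imm = fround_to_zero    | fround_no_exc;
constexpr int floor_imm = fround_to_neg_inf | fround_no_exc;
constexpr int ceil_imm  = fround_to_pos_inf | fround_no_exc;

static_assert(trunc_imm == 0xb);
static_assert(floor_imm == 0x9);
static_assert(ceil_imm  == 0xa);
```

In GCC's header, _mm_floor_ps and _mm_ceil_ps expand to _mm_round_ps with _MM_FROUND_FLOOR and _MM_FROUND_CEIL, which do not set the no-exception bit. That is presumably why the patch spells out _mm_round_ps with an explicit immediate instead.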


[PATCH 07/11] libstdc++: Fix condition when AVX512F ldexp implementation is used

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

This improves codegen of ldexp if AVX512VL is available.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_x86.h (_S_ldexp): The AVX512F
implementation doesn't require a _VecBltnBtmsk ABI tag, it
requires either a 64-Byte input (in which case AVX512F must be
available) or AVX512VL.
---
 libstdc++-v3/include/experimental/bits/simd_x86.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
index 305d7a9fa54..5706bf63845 100644
--- a/libstdc++-v3/include/experimental/bits/simd_x86.h
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -2611,13 +2611,14 @@ template 
   _S_ldexp(_SimdWrapper<_Tp, _Np> __x,
 	   __fixed_size_storage_t __exp)
   {
-	if constexpr (__is_avx512_abi<_Abi>())
+	if constexpr (sizeof(__x) == 64 || __have_avx512vl)
 	  {
 	const auto __xi = __to_intrin(__x);
 	constexpr _SimdConverter, _Tp, _Abi>
 	  __cvt;
 	const auto __expi = __to_intrin(__cvt(__exp));
-	constexpr auto __k1 = _Abi::template _S_implicit_mask_intrin<_Tp>();
+	using _Up = __bool_storage_member_type_t<_Np>;
+	constexpr _Up __k1 = _Np < sizeof(_Up) * __CHAR_BIT__ ? _Up((1ULL << _Np) - 1) : ~_Up();
 	if constexpr (sizeof(__xi) == 16)
 	  {
 		if constexpr (sizeof(_Tp) == 8)
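
The new __k1 computation in the hunk above builds a mask with the low _Np bits set and special-cases the full-width mask, because shifting by the full width of the type would be undefined. A standalone sketch of that expression (low_bits_mask is a made-up name; Up is assumed to be an unsigned integer type of at most 64 bits):

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Returns a value of type Up with the n least significant bits set.
// For n >= the width of Up the shift is skipped and all bits are set.
template <typename Up>
constexpr Up
low_bits_mask(unsigned n)
{
  constexpr unsigned bits = sizeof(Up) * CHAR_BIT;
  return n < bits ? Up((1ULL << n) - 1) : Up(~Up());
}

static_assert(low_bits_mask<std::uint8_t>(4) == 0x0f);
static_assert(low_bits_mask<std::uint8_t>(8) == 0xff);
static_assert(low_bits_mask<std::uint64_t>(64) == ~0ULL);
```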


[PATCH 06/11] libstdc++: Minor simd_math cleanups

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_math.h: Undefine internal
macros after use.
(frexp): Move #if to a more sensible position and reformat
preceding code.
(logb): Call _SimdImpl::_S_logb for fixed_size instead of
duplicating the code here.
(modf): Simplify condition.
---
 .../include/experimental/bits/simd_math.h | 22 +--
 1 file changed, 6 insertions(+), 16 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
index cff4371619d..a5df2039970 100644
--- a/libstdc++-v3/include/experimental/bits/simd_math.h
+++ b/libstdc++-v3/include/experimental/bits/simd_math.h
@@ -645,11 +645,8 @@ template 
 	return __r;
   }
 else if constexpr (__is_fixed_size_abi_v<_Abi>)
-  {
-	return {__private_init,
-		_Abi::_SimdImpl::_S_frexp(__data(__x), __data(*__exp))};
+  return {__private_init, _Abi::_SimdImpl::_S_frexp(__data(__x), __data(*__exp))};
 #if _GLIBCXX_SIMD_X86INTRIN
-  }
 else if constexpr (__have_avx512f)
   {
 	constexpr size_t _Np = simd_size_v<_Tp, _Abi>;
@@ -667,8 +664,8 @@ template 
 		_Abi::_CommonImpl::_S_blend(_SimdWrapper(
 	  __isnonzero),
 	__v, __getmant_avx512(__v))};
-#endif // _GLIBCXX_SIMD_X86INTRIN
   }
+#endif // _GLIBCXX_SIMD_X86INTRIN
 else
   {
 	// fallback implementation
@@ -749,14 +746,7 @@ template 
 if constexpr (_Np == 1)
   return std::logb(__x[0]);
 else if constexpr (__is_fixed_size_abi_v<_Abi>)
-  {
-	return {__private_init,
-		__data(__x)._M_apply_per_chunk([](auto __impl, auto __xx) {
-		  using _V = typename decltype(__impl)::simd_type;
-		  return __data(
-		std::experimental::logb(_V(__private_init, __xx)));
-		})};
-  }
+  return {__private_init, _Abi::_SimdImpl::_S_logb(__data(__x))};
 #if _GLIBCXX_SIMD_X86INTRIN // {{{
 else if constexpr (__have_avx512vl && __is_sse_ps<_Tp, _Np>())
   return {__private_init,
@@ -827,9 +817,7 @@ template 
   enable_if_t, simd<_Tp, _Abi>>
   modf(const simd<_Tp, _Abi>& __x, simd<_Tp, _Abi>* __iptr)
   {
-if constexpr (__is_scalar_abi<_Abi>()
-		  || (__is_fixed_size_abi_v<
-			_Abi> && simd_size_v<_Tp, _Abi> == 1))
+if constexpr (simd_size_v<_Tp, _Abi> == 1)
   {
 	_Tp __tmp;
 	_Tp __r = std::modf(__x[0], &__tmp);
@@ -1472,6 +1460,8 @@ template 
   }
 // }}}
 
+#undef _GLIBCXX_SIMD_CVTING2
+#undef _GLIBCXX_SIMD_CVTING3
 #undef _GLIBCXX_SIMD_MATH_CALL_
 #undef _GLIBCXX_SIMD_MATH_CALL2_
 #undef _GLIBCXX_SIMD_MATH_CALL3_


[PATCH 05/11] libstdc++: Remove incorrect fabs overload

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

fabs(int) returns double, this one didn't. This overload is not
specified in the Parallelism TS 2. Also remove the comment about labs
and llabs: it doesn't belong here.
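
The scalar behaviour referred to here can be checked directly. A short sketch, where fabs_of_int is a made-up helper name:

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <type_traits>

// std::fabs on an integer argument converts to double and returns double,
// whereas std::abs on an int stays in int.
static_assert(std::is_same_v<decltype(std::fabs(-3)), double>);
static_assert(std::is_same_v<decltype(std::abs(-3)), int>);

inline double
fabs_of_int(int x)
{ return std::fabs(x); }
```

The removed overload returned the integral simd type unchanged, so it did not match this scalar contract.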

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_math.h (fabs): Remove
fabs(simd) overload.
---
 .../include/experimental/bits/simd_math.h| 16 
 1 file changed, 16 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
index 3ade293fcbf..cff4371619d 100644
--- a/libstdc++-v3/include/experimental/bits/simd_math.h
+++ b/libstdc++-v3/include/experimental/bits/simd_math.h
@@ -863,22 +863,6 @@ template 
   abs(const simd<_Tp, _Abi>& __x)
   { return {__private_init, _Abi::_SimdImpl::_S_abs(__data(__x))}; }
 
-template 
-  enable_if_t && is_signed_v<_Tp>, simd<_Tp, _Abi>>
-  fabs(const simd<_Tp, _Abi>& __x)
-  { return {__private_init, _Abi::_SimdImpl::_S_abs(__data(__x))}; }
-
-// the following are overloads for functions in  and not covered by
-// [parallel.simd.math]. I don't see much value in making them work, though
-/*
-template  simd labs(const simd &__x)
-{ return {__private_init, _Abi::_SimdImpl::abs(__data(__x))}; }
-
-template  simd llabs(const simd
-&__x)
-{ return {__private_init, _Abi::_SimdImpl::abs(__data(__x))}; }
-*/
-
 #define _GLIBCXX_SIMD_CVTING2(_NAME)   \
 template  \
   _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME(   \


[PATCH 04/11] libstdc++: Make use of __builtin_bit_cast

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

The __bit_cast function was a hack to achieve what __builtin_bit_cast
can do, therefore use __builtin_bit_cast if possible. However,
__builtin_bit_cast cannot be used to cast from/to fixed_size_simd, since
it isn't trivially copyable (in the language sense — in principle it
is). Therefore add __proposed::simd_bit_cast to enable the use case
required in the test framework.
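
As a rough scalar illustration of the "builtin if available, fallback otherwise" structure (bit_cast_sketch is a made-up name, the memcpy fallback is a simplification and not the library's __bit_cast, and the usage below assumes IEEE 754 binary32 float):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Compilers without __has_builtin simply take the fallback path.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

template <typename To, typename From>
To
bit_cast_sketch(const From& x)
{
  static_assert(sizeof(To) == sizeof(From));
  // Same requirement the builtin imposes; fixed_size_simd fails it.
  static_assert(std::is_trivially_copyable_v<To>
		  && std::is_trivially_copyable_v<From>);
#if __has_builtin(__builtin_bit_cast)
  return __builtin_bit_cast(To, x);
#else
  To r;
  std::memcpy(&r, &x, sizeof(To));
  return r;
#endif
}
```

With an IEEE 754 float, bit_cast_sketch<std::uint32_t>(1.0f) yields 0x3f800000, and casting back restores 1.0f.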

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h (__bit_cast): Implement via
__builtin_bit_cast #if available.
(__proposed::simd_bit_cast): Add overloads for simd and
simd_mask, which use __builtin_bit_cast (or __bit_cast #if not
available), which return an object of the requested type with
the same bits as the argument.
* include/experimental/bits/simd_math.h: Use simd_bit_cast
instead of __bit_cast to allow casts to fixed_size_simd.
* testsuite/experimental/simd/tests/bits/test_values.h: Switch
from __bit_cast to __proposed::simd_bit_cast since the former
will not cast fixed_size objects anymore.
---
 libstdc++-v3/include/experimental/bits/simd.h | 40 ++-
 .../include/experimental/bits/simd_math.h |  8 ++--
 .../simd/tests/bits/test_values.h |  8 ++--
 3 files changed, 46 insertions(+), 10 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 163f1b574e2..5d243f22434 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -1598,7 +1598,9 @@ template 
   _GLIBCXX_SIMD_INTRINSIC constexpr _To
   __bit_cast(const _From __x)
   {
-// TODO: implement with / replace by __builtin_bit_cast ASAP
+#if __has_builtin(__builtin_bit_cast)
+return __builtin_bit_cast(_To, __x);
+#else
 static_assert(sizeof(_To) == sizeof(_From));
 constexpr bool __to_is_vectorizable
   = is_arithmetic_v<_To> || is_enum_v<_To>;
@@ -1629,6 +1631,7 @@ template 
 			 reinterpret_cast(&__x), sizeof(_To));
 	return __r;
   }
+#endif
   }
 
 // }}}
@@ -2900,6 +2903,41 @@ template (__x)};
   }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  _To
+  simd_bit_cast(const simd<_Up, _Abi>& __x)
+  {
+using _Tp = typename _To::value_type;
+using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember;
+using _From = simd<_Up, _Abi>;
+using _FromMember = typename _SimdTraits<_Up, _Abi>::_SimdMember;
+// with concepts, the following should be constraints
+static_assert(sizeof(_To) == sizeof(_From));
+static_assert(is_trivially_copyable_v<_Tp> && is_trivially_copyable_v<_Up>);
+static_assert(is_trivially_copyable_v<_ToMember> && is_trivially_copyable_v<_FromMember>);
+#if __has_builtin(__builtin_bit_cast)
+return {__private_init, __builtin_bit_cast(_ToMember, __data(__x))};
+#else
+return {__private_init, __bit_cast<_ToMember>(__data(__x))};
+#endif
+  }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  _To
+  simd_bit_cast(const simd_mask<_Up, _Abi>& __x)
+  {
+using _From = simd_mask<_Up, _Abi>;
+static_assert(sizeof(_To) == sizeof(_From));
+static_assert(is_trivially_copyable_v<_To> && is_trivially_copyable_v<_From>);
+#if __has_builtin(__builtin_bit_cast)
+return __builtin_bit_cast(_To, __x);
+#else
+return __bit_cast<_To>(__x);
+#endif
+  }
 } // namespace __proposed
 
 // simd_cast {{{2
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
index d954e761eee..3ade293fcbf 100644
--- a/libstdc++-v3/include/experimental/bits/simd_math.h
+++ b/libstdc++-v3/include/experimental/bits/simd_math.h
@@ -700,11 +700,9 @@ template 
 	// (inf and NaN are excluded by -ffinite-math-only)
 	const auto __iszero_inf_nan = __x == 0;
 #else
-	const auto __as_int
-	  = __bit_cast, _V>>(abs(__x));
-	const auto __inf
-	  = __bit_cast, _V>>(
-	_V(__infinity_v<_Tp>));
+	using _Ip = __int_for_sizeof_t<_Tp>;
+	const auto __as_int = simd_bit_cast>(abs(__x));
+	const auto __inf = simd_bit_cast>(_V(__infinity_v<_Tp>));
 	const auto __iszero_inf_nan = static_simd_cast(
 	  __as_int == 0 || __as_int >= __inf);
 #endif
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_value

[PATCH 03/11] libstdc++: Improve fixed_size codegen

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

Sometimes fixed_size objects will get unnecessarily copied on the stack.
The simd implementation should never pass _SimdTuple by value to avoid
requiring the optimizer to see through these copies.
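
A toy illustration of the difference, using a made-up Chunks type with a counting copy constructor. The real _SimdTuple has no such counter; it is only there to make the copy observable.

```cpp
#include <cassert>

// Hypothetical stand-in for a multi-register aggregate.
struct Chunks
{
  static inline int copies = 0;
  double data[8] = {};

  Chunks() = default;

  Chunks(const Chunks& other)
  {
    for (int i = 0; i < 8; ++i)
      data[i] = other.data[i];
    ++copies; // record every copy that is actually made
  }
};

// By value: the argument is copied into the parameter object.
inline double first_by_value(const Chunks x) { return x.data[0]; }

// By const-ref: no copy, the callee reads the caller's object.
inline double first_by_cref(const Chunks& x) { return x.data[0]; }

inline int
copies_made_by_value()
{
  Chunks c;
  const int before = Chunks::copies;
  (void) first_by_value(c);
  return Chunks::copies - before;
}

inline int
copies_made_by_cref()
{
  Chunks c;
  const int before = Chunks::copies;
  (void) first_by_cref(c);
  return Chunks::copies - before;
}
```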

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_converter.h
(_SimdConverter::operator()): Pass _SimdTuple by const-ref.
* include/experimental/bits/simd_fixed_size.h
(_GLIBCXX_SIMD_FIXED_OP): Pass binary operator _SimdTuple
arguments by const-ref.
(_S_masked_unary): Pass _SimdTuple by const-ref.
---
 libstdc++-v3/include/experimental/bits/simd_converter.h  | 2 +-
 libstdc++-v3/include/experimental/bits/simd_fixed_size.h | 5 ++---
 2 files changed, 3 insertions(+), 4 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd_converter.h b/libstdc++-v3/include/experimental/bits/simd_converter.h
index 9c8bf382df9..11999df25e4 100644
--- a/libstdc++-v3/include/experimental/bits/simd_converter.h
+++ b/libstdc++-v3/include/experimental/bits/simd_converter.h
@@ -316,7 +316,7 @@ template 
 
 _GLIBCXX_SIMD_INTRINSIC constexpr
   typename _SimdTraits<_To, _Ap>::_SimdMember
-  operator()(_Arg __x) const noexcept
+  operator()(const _Arg& __x) const noexcept
 {
   if constexpr (_Arg::_S_tuple_size == 1)
 	return __vector_convert<__vector_type_t<_To, _Np>>(__x.first);
diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
index b6fb47cdf39..dc2fb90b9b2 100644
--- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
+++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
@@ -1480,7 +1480,7 @@ template 
 #define _GLIBCXX_SIMD_FIXED_OP(name_, op_) \
 template\
   static inline constexpr _SimdTuple<_Tp, _As...> name_(   \
-	const _SimdTuple<_Tp, _As...> __x, const _SimdTuple<_Tp, _As...> __y)  \
+	const _SimdTuple<_Tp, _As...>& __x, const _SimdTuple<_Tp, _As...>& __y)\
   {\
 	return __x._M_apply_per_chunk( \
 	  [](auto __impl, auto __xx, auto __yy) constexpr {\
@@ -1780,8 +1780,7 @@ template 
 // _S_masked_unary {{{2
 template  class _Op, typename _Tp, typename... _As>
   static inline _SimdTuple<_Tp, _As...>
-  _S_masked_unary(const _MaskMember __bits,
-		  const _SimdTuple<_Tp, _As...> __v) // TODO: const-ref __v?
+  _S_masked_unary(const _MaskMember __bits, const _SimdTuple<_Tp, _As...>& __v)
   {
 	return __v._M_apply_wrapped([&__bits](auto __meta,
 	  auto __native) constexpr {


[PATCH 02/11] libstdc++: Remove dead code

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

This helper type became unused at some point.

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_fixed_size.h
(_AbisInSimdTuple): Removed.
---
 .../experimental/bits/simd_fixed_size.h   | 49 ---
 1 file changed, 49 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
index 7c2c1df77c8..b6fb47cdf39 100644
--- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
+++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
@@ -1025,55 +1025,6 @@ template 
   _Tp, _Remain, _SimdTuple<_Tp, _As..., typename _Next::abi_type>>::type;
   };
 
-// }}}
-// _AbisInSimdTuple {{{
-template 
-  struct _SeqOp;
-
-template 
-  struct _SeqOp>
-  {
-using _FirstPlusOne = index_sequence<_I0 + 1, _Is...>;
-using _NotFirstPlusOne = index_sequence<_I0, (_Is + 1)...>;
-template 
-using _Prepend = index_sequence<_First, _I0 + _Add, (_Is + _Add)...>;
-  };
-
-template 
-  struct _AbisInSimdTuple;
-
-template 
-  struct _AbisInSimdTuple<_SimdTuple<_Tp>>
-  {
-using _Counts = index_sequence<0>;
-using _Begins = index_sequence<0>;
-  };
-
-template 
-  struct _AbisInSimdTuple<_SimdTuple<_Tp, _Ap>>
-  {
-using _Counts = index_sequence<1>;
-using _Begins = index_sequence<0>;
-  };
-
-template 
-  struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A0, _As...>>
-  {
-using _Counts = typename _SeqOp>::_Counts>::_FirstPlusOne;
-using _Begins = typename _SeqOp>::_Begins>::_NotFirstPlusOne;
-  };
-
-template 
-  struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A1, _As...>>
-  {
-using _Counts = typename _SeqOp>::_Counts>::template _Prepend<1, 0>;
-using _Begins = typename _SeqOp>::_Begins>::template _Prepend<0, 1>;
-  };
-
 // }}}
 // __autocvt_to_simd {{{
 template >>


[PATCH 01/11] libstdc++: Improve copysign codegen

2021-06-08 Thread Matthias Kretz


From: Matthias Kretz 

This also resolves a test failure on aarch64 with -ffast-math and
fixed_size with large N.
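
For readers unfamiliar with the bit trick the generic copysign path relies on, here is a scalar sketch for IEEE-754 binary32 `float`. It is not the libstdc++ code: `to_bits`, `from_bits`, and `copysign_bits` are hypothetical helper names, and it assumes `float` is 32 bits wide. It computes the sign mask the same way the patch does (bits of 1 XOR bits of -1) and then evaluates `(x & ~signmask) | (y & signmask)`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helpers: reinterpret a float as its 32-bit pattern and back.
// Assumes sizeof(float) == sizeof(std::uint32_t).
static std::uint32_t to_bits(float v)
{
  std::uint32_t b;
  std::memcpy(&b, &v, sizeof b);
  return b;
}

static float from_bits(std::uint32_t b)
{
  float v;
  std::memcpy(&v, &b, sizeof v);
  return v;
}

// Magnitude of x combined with the sign of y.
float copysign_bits(float x, float y)
{
  // 1.0f and -1.0f differ only in the sign bit, so XOR isolates it.
  const std::uint32_t signmask = to_bits(1.0f) ^ to_bits(-1.0f);
  return from_bits((to_bits(x) & ~signmask) | (to_bits(y) & signmask));
}
```

The `~signmask` form needs a complement operator, which is why the patch adds `operator~` for floating-point simd types.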

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h: Add missing operator~
overload for simd to __float_bitwise_operators.
* include/experimental/bits/simd_builtin.h
(_SimdImplBuiltin::_S_complement): Bitcast to int (and back) to
implement complement for floating-point vectors.
* include/experimental/bits/simd_fixed_size.h
(_SimdImplFixedSize::_S_copysign): New function, forwarding to
copysign implementation of _SimdTuple members.
* include/experimental/bits/simd_math.h (copysign): Call
_SimdImpl::_S_copysign for fixed_size arguments. Simplify
generic copysign implementation using the new ~ operator.
---
 libstdc++-v3/include/experimental/bits/simd.h| 6 ++
 libstdc++-v3/include/experimental/bits/simd_builtin.h| 7 ++-
 libstdc++-v3/include/experimental/bits/simd_fixed_size.h | 2 +-
 libstdc++-v3/include/experimental/bits/simd_math.h   | 4 +++-
 4 files changed, 16 insertions(+), 3 deletions(-)


diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 59ddf3cc958..163f1b574e2 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -5189,6 +5189,12 @@ template 
 return {__private_init,
 	_Ap::_SimdImpl::_S_bit_and(__data(__a), __data(__b))};
   }
+
+template 
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  enable_if_t, simd<_Tp, _Ap>>
+  operator~(const simd<_Tp, _Ap>& __a)
+  { return {__private_init, _Ap::_SimdImpl::_S_complement(__data(__a))}; }
 } // namespace __float_bitwise_operators }}}
 
 _GLIBCXX_SIMD_END_NAMESPACE
diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h
index e986ee91620..8cd338e313f 100644
--- a/libstdc++-v3/include/experimental/bits/simd_builtin.h
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -1632,7 +1632,12 @@ template 
 template 
   _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
   _S_complement(_SimdWrapper<_Tp, _Np> __x) noexcept
-  { return ~__x._M_data; }
+  {
+	if constexpr (is_floating_point_v<_Tp>)
+	  return __vector_bitcast<_Tp>(~__vector_bitcast<__int_for_sizeof_t<_Tp>>(__x));
+	else
+	  return ~__x._M_data;
+  }
 
 // _S_unary_minus {{{2
 template 
diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
index 2722055c899..7c2c1df77c8 100644
--- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
+++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
@@ -1663,7 +1663,7 @@ template 
 _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, ldexp)
 _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmod)
 _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, remainder)
-// copysign in simd_math.h
+_GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, copysign)
 _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, nextafter)
 _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fdim)
 _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmax)
diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h
index 4799803a200..d954e761eee 100644
--- a/libstdc++-v3/include/experimental/bits/simd_math.h
+++ b/libstdc++-v3/include/experimental/bits/simd_math.h
@@ -1304,6 +1304,8 @@ template 
   {
 if constexpr (simd_size_v<_Tp, _Abi> == 1)
   return std::copysign(__x[0], __y[0]);
+else if constexpr (__is_fixed_size_abi_v<_Abi>)
+  return {__private_init, _Abi::_SimdImpl::_S_copysign(__data(__x), __data(__y))};
 else if constexpr (is_same_v<_Tp, long double> && sizeof(_Tp) == 12)
   // Remove this case once __bit_cast is implemented via __builtin_bit_cast.
   // It is necessary, because __signmask below cannot be computed at compile
@@ -1315,7 +1317,7 @@ template 
 	using _V = simd<_Tp, _Abi>;
 	using namespace std::experimental::__float_bitwise_operators;
 	_GLIBCXX_SIMD_USE_CONSTEXPR_API auto __signmask = _V(1) ^ _V(-1);
-	return (__x & (__x ^ __signmask)) | (__y & __signmask);
+	return (__x & ~__signmask) | (__y & __signmask);
   }
   }
 


[PATCH 00/11] stdx::simd optimizations, corrections, and cleanups

2021-06-08 Thread Matthias Kretz
The following patches mostly contain code cleanups and minor corrections. The 
major feature in this patchset is the last patch, which should make the use of 
stdx::simd much safer wrt. ODR violations involuntarily introduced by linking 
TUs that were compiled with different -m and floating-point flags.

Matthias Kretz (11):
  libstdc++: Improve copysign codegen
  libstdc++: Remove dead code
  libstdc++: Improve fixed_size codegen
  libstdc++: Make use of __builtin_bit_cast
  libstdc++: Remove incorrect fabs overload
  libstdc++: Minor simd_math cleanups
  libstdc++: Fix condition when AVX512F ldexp implementation is used
  libstdc++: Avoid raising fp exceptions in trunc, floor, and ceil
  libstdc++: Ensure unrolled loops inline the lambda
  libstdc++: Fix internal names: add missing underscores
  libstdc++: Fix ODR issues with different -m flags

 libstdc++-v3/include/experimental/bits/simd.h | 438 --
 .../include/experimental/bits/simd_builtin.h  |  48 +-
 .../experimental/bits/simd_converter.h|   2 +-
 .../include/experimental/bits/simd_detail.h   |  40 ++
 .../experimental/bits/simd_fixed_size.h   |  95 ++--
 .../include/experimental/bits/simd_math.h | 107 ++---
 .../include/experimental/bits/simd_neon.h |   4 +-
 .../include/experimental/bits/simd_ppc.h  |   4 +-
 .../include/experimental/bits/simd_scalar.h   |  71 ++-
 .../include/experimental/bits/simd_x86.h  |  33 +-
 .../simd/tests/bits/test_values.h |   8 +-
 11 files changed, 540 insertions(+), 310 deletions(-)




[PATCH 3/3] libstdc++: Document simd testsuite

2021-06-08 Thread Matthias Kretz


libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/README.md: New file.

Signed-off-by: Matthias Kretz 
---
 .../testsuite/experimental/simd/README.md | 257 ++
 1 file changed, 257 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/experimental/simd/README.md


diff --git a/libstdc++-v3/testsuite/experimental/simd/README.md b/libstdc++-v3/testsuite/experimental/simd/README.md
new file mode 100644
index 000..db0d71f8d43
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/README.md
@@ -0,0 +1,257 @@
+# SIMD Tests
+
+To execute the simd testsuite, call `make check-simd`, typically with `-j N` 
+argument.
+
+For more control over verbosity, compiler flags, and use of a simulator, use 
+the environment variables documented below.
+
+## Environment variables
+
+### `target_list`
+
+Similar to dejagnu target lists: E.g. 
+`target_list="unix{-march=sandybridge,-march=native/-ffast-math,-march=native/-ffinite-math-only}"`
+would create three subdirs in `testsuite/simd/` to run the complete simd 
+testsuite first with `-march=sandybridge`, then with `-march=native 
+-ffast-math`, and finally with `-march=native -ffinite-math-only`.
+
+
+### `CHECK_SIMD_CONFIG`
+
+This variable can be set to a path to a file which is equivalent to a dejagnu 
+board. The file needs to be a valid `sh` script since it is sourced from the 
+`scripts/check_simd` script. Its purpose is to set the `target_list` variable 
+depending on `$target_triplet` (or whatever else makes sense for you). Example:
+
+```sh
+case "$target_triplet" in
+x86_64-*)
+  target_list="unix{-march=sandybridge,-march=skylake-avx512,-march=native/-ffast-math,-march=athlon64,-march=core2,-march=nehalem,-march=skylake,-march=native/-ffinite-math-only,-march=knl}"
+  ;;
+
+powerpc64le-*)
+  define_target power7 "-mcpu=power7 -static" "$HOME/bin/run_on_gccfarm gcc112"
+  define_target power8 "-mcpu=power8 -static" "$HOME/bin/run_on_gccfarm gcc112"
+  define_target power9 "-mcpu=power9 -static" "$HOME/bin/run_on_gccfarm gcc135"
+  target_list="power7 power8 power9{,-ffast-math}"
+  ;;
+
+powerpc64-*)
+  define_target power7 "-mcpu=power7 -static" "$HOME/bin/run_on_gccfarm gcc110"
+  define_target power8 "-mcpu=power8 -static" "$HOME/bin/run_on_gccfarm gcc110"
+  target_list="power7 power8{,-ffast-math}"
+  ;;
+esac
+```
+
+The `unix` target is pre-defined to have no initial flags and no simulator. Use 
+the `define_target(name, flags, sim)` function to define your own targets for 
+the `target_list` variable. In the example above `define_target power7 
+"-mcpu=power7 -static" "$HOME/bin/run_on_gccfarm gcc112"` defines the target 
+`power7` which always uses the flags `-mcpu=power7` and `-static` when 
+compiling tests and prepends `$HOME/bin/run_on_gccfarm gcc112` to test 
+executables. In `target_list` you can now use the name `power7`. E.g. 
+`target_list="power7 power7/-ffast-math"` or its shorthand 
+`target_list="power7{,-ffast-math}"`.
+
+
+### `DRIVEROPTS`
+
+This variable affects the `Makefile`s generated per target (as defined above). 
+It's a string of flags that are prepended to the `driver.sh` invocation which 
+builds and runs the tests. You `cd` into a simd test subdir and use `make help` 
+to see possible options and a list of all valid targets.
+
+```
+use DRIVEROPTS= to pass the following options:
+-q, --quiet Disable same-line progress output (default if stdout is
+not a tty).
+-p, --percentage Add percentage to default same-line progress output.
+-v, --verbose   Print one line per test and minimal extra information on
+failure.
+-vv Print all compiler and test output.
+-k, --keep-failed   Keep executables of failed tests.
+--sim   Path to an executable that is prepended to the test
+execution binary (default: the value of
+GCC_TEST_SIMULATOR).
+--timeout-factor 
+Multiply the default timeout with x.
+-x, --run-expensive Compile and run tests marked as expensive (default:
+true if GCC_TEST_RUN_EXPENSIVE is set, false otherwise).
+-o , --only 
+Compile and run only tests matching the given pattern.
+```
+
+
+### `TESTFLAGS`
+
+This variable also affects the `Makefile`s generated per target. It's a list of 
+compiler flags that are appended to `CXXFLAGS`.
+
+

[PATCH 2/3] libstdc++: Improve output verbosity options and default

2021-06-08 Thread Matthias Kretz


For most uses --quiet was too quiet while the default was too noisy. Now
the default output, if stdout is a tty, shows the last successful test
on the same line. With --percentage it adds a percentage at the start of
the line. --percentage is not default because it requires more resources
and might not be 100% compatible with all environments.
If stdout is not a tty the default is quiet output like for dejagnu.
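
The verbosity logic described above can be sketched in C++ as follows. The real implementation is the shell code in driver.sh; the enumerator names and values mirror the `output_mode` constants in the patch, while `default_mode` and `apply_option` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>

// Mirrors the output_mode values defined in driver.sh.
enum OutputMode {
  really_quiet = 0,   // print only failures with minimal context
  same_line = 1,      // plus same-line output of last successful test
  percentage = 2,     // plus percentage
  verbose = 3,        // one line per finished test
  really_verbose = 4  // full compiler and test output
};

// Default: same-line progress on a tty, quiet otherwise.
OutputMode default_mode(bool stdout_is_tty)
{
  return stdout_is_tty ? same_line : really_quiet;
}

// Apply one verbosity-related option; other options leave the mode unchanged.
OutputMode apply_option(OutputMode mode, const std::string& opt)
{
  if (opt == "-q" || opt == "--quiet")
    return really_quiet;
  if (opt == "-p" || opt == "--percentage")
    return percentage;
  if (opt == "-v" || opt == "--verbose")
    return mode < verbose ? verbose : really_verbose;  // second -v escalates
  return mode;
}
```

Passing `-v` once selects `verbose` from any lower mode; passing it again selects `really_verbose`, which is what "Allow -v/--verbose to be used twice" in the ChangeLog refers to.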

Additionally, argument parsing now recognizes contracted short options
which is easier to use with e.g. DRIVEROPTS=-pxk.
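
As an illustration of what "contracted short options" means, here is a small C++ sketch that splits an argument such as `-pxk` into `-p`, `-x`, `-k`. The function name `split_short_options` is hypothetical, and the sketch assumes every letter in the group is a flag without an argument; how driver.sh treats argument-taking options inside such a group is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split "-pxk" into {"-p", "-x", "-k"}.
// Long options ("--only"), single short options ("-p"), a lone "-",
// and non-option words are returned unchanged.
std::vector<std::string> split_short_options(const std::string& arg)
{
  if (arg.size() < 3 || arg[0] != '-' || arg[1] == '-')
    return {arg};
  std::vector<std::string> out;
  for (std::size_t i = 1; i < arg.size(); ++i)
    out.push_back(std::string("-") + arg[i]);
  return out;
}
```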

libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/driver.sh: Rewrite output
verbosity logic. Add -p/--percentage option. Allow -v/--verbose
to be used twice. Add -x and -o short options. Parse long
options with = instead of separating space generically. Parse
contracted short options. Make unrecognized options an error.
If same-line output is active, trap on EXIT to increment the
progress (only with --percentage), erase the line and print the
current status.
* testsuite/experimental/simd/generate_makefile.sh: Initialize
helper files for progress account keeping. Update help target
for changes to DRIVEROPTS.

Signed-off-by: Matthias Kretz 
---
 .../testsuite/experimental/simd/driver.sh | 137 +-
 .../experimental/simd/generate_makefile.sh|  33 +++--
 2 files changed, 121 insertions(+), 49 deletions(-)


diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index f2d31c70bd0..5ae9905e3a3 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -5,8 +5,22 @@ abi=0
 name=
 srcdir="$(cd "${0%/*}" && pwd)/tests"
 sim="$GCC_TEST_SIMULATOR"
-quiet=false
-verbose=false
+
+# output_mode values:
+# print only failures with minimal context
+readonly really_quiet=0
+# as above plus same-line output of last successful test
+readonly same_line=1
+# as above plus percentage
+readonly percentage=2
+# print one line per finished test with minimal context on failure
+readonly verbose=3
+# print one line per finished test with full output of the compiler and test
+readonly really_verbose=4
+
+output_mode=$really_quiet
+[ -t 1 ] && output_mode=$same_line
+
 timeout=180
 run_expensive=false
 if [ -n "$GCC_TEST_RUN_EXPENSIVE" ]; then
@@ -21,8 +35,12 @@ Usage: $0 [Options] 
 
 Options:
   -h, --help  Print this message and exit.
-  -q, --quiet Only print failures.
-  -v, --verbose   Print compiler and test output on failure.
+  -q, --quiet Disable same-line progress output (default if stdout is
+  not a tty).
+  -p, --percentage Add percentage to default same-line progress output.
+  -v, --verbose   Print one line per test and minimal extra information on
+  failure.
+  -vv Print all compiler and test output.
   -t , --type 
   The value_type to test (default: $type).
   -a [0-9], --abi [0-9]
@@ -36,9 +54,10 @@ Options:
   GCC_TEST_SIMULATOR).
   --timeout-factor 
   Multiply the default timeout with x.
-  --run-expensive Compile and run tests marked as expensive (default:
+  -x, --run-expensive Compile and run tests marked as expensive (default:
   true if GCC_TEST_RUN_EXPENSIVE is set, false otherwise).
-  --only Compile and run only tests matching the given pattern.
+  -o , --only 
+  Compile and run only tests matching the given pattern.
 EOF
 }
 
@@ -49,71 +68,74 @@ while [ $# -gt 0 ]; do
 exit
 ;;
   -q|--quiet)
-quiet=true
+output_mode=$really_quiet
+;;
+  -p|--percentage)
+output_mode=$percentage
 ;;
   -v|--verbose)
-verbose=true
+if [ $output_mode -lt $verbose ]; then
+  output_mode=$verbose
+else
+  output_mode=$really_verbose
+fi
 ;;
-  --run-expensive)
+  -x|--run-expensive)
 run_expensive=true
 ;;
   -k|--keep-failed)
 keep_failed=true
 ;;
-  --only)
+  -o|--only)
 only="$2"
 shift
 ;;
-  --only=*)
-only="${1#--only=}"
-;;
   -t|--type)
 type="$2"
 shift
 ;;
-  --type=*)
-type="${1#--type=}"
-;;
   -a|--abi)
 abi="$2"
 shift
 ;;
-  --abi=*)
-abi="${1#--abi=}"
-;;
   -n|--name)
 name="$2"
 shift
 ;;
-  --name

[PATCH 1/3] libstdc++: Remove -fno-tree-vrp after PR98834 was resolved

2021-06-08 Thread Matthias Kretz


libstdc++-v3/ChangeLog:

* testsuite/Makefile.am (check-simd): Remove -fno-tree-vrp flag
and associated warning.
* testsuite/Makefile.in: Regenerate.

Signed-off-by: Matthias Kretz 
---
 libstdc++-v3/testsuite/Makefile.am | 3 +--
 libstdc++-v3/testsuite/Makefile.in | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)


diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index ba5023a8b54..d2011f03c64 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -191,10 +191,9 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
 	${glibcxx_srcdir}/scripts/check_simd \
 	testsuite_files_simd \
 	${glibcxx_builddir}/scripts/testsuite_flags
-	@echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834."
 	@rm -f .simd.summary
 	@echo "Generating simd testsuite subdirs and Makefiles ..."
-	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
+	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
 	  while read subdir; do \
 	$(MAKE) -C "$${subdir}"; \
 	tail -n20 $${subdir}/simd_testsuite.sum | \
diff --git a/libstdc++-v3/testsuite/Makefile.in b/libstdc++-v3/testsuite/Makefile.in
index c9dd7f5da61..c65cdaf2015 100644
--- a/libstdc++-v3/testsuite/Makefile.in
+++ b/libstdc++-v3/testsuite/Makefile.in
@@ -716,10 +716,9 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
 	${glibcxx_srcdir}/scripts/check_simd \
 	testsuite_files_simd \
 	${glibcxx_builddir}/scripts/testsuite_flags
-	@echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834."
 	@rm -f .simd.summary
 	@echo "Generating simd testsuite subdirs and Makefiles ..."
-	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
+	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
 	  while read subdir; do \
 	$(MAKE) -C "$${subdir}"; \
 	tail -n20 $${subdir}/simd_testsuite.sum | \


[PATCH 0/3] Improve and document stdx::simd testsuite

2021-06-08 Thread Matthias Kretz
As discussed a long time ago on IRC, this improves (i.e. decreases by default) 
the verbosity of make check-simd, gives more verbosity options, and finally 
documents how the simd testsuite is used and how it works. In addition, after 
PR98834 was resolved, remove the -fno-tree-vrp workaround.

Tested on x86_64-linux (and more).


Matthias Kretz (3):
  libstdc++: Remove -fno-tree-vrp after PR98834 was resolved
  libstdc++: Improve output verbosity options and default
  libstdc++: Document simd testsuite

 libstdc++-v3/testsuite/Makefile.am|   3 +-
 libstdc++-v3/testsuite/Makefile.in|   3 +-
 .../testsuite/experimental/simd/README.md | 257 ++
 .../testsuite/experimental/simd/driver.sh | 137 +++---
 .../experimental/simd/generate_makefile.sh|  33 ++-
 5 files changed, 380 insertions(+), 53 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/experimental/simd/README.md




Re: [PATCH] Add gnu::diagnose_as attribute

2021-06-01 Thread Matthias Kretz
On Tuesday, 1 June 2021 21:12:18 CEST Jason Merrill wrote:
> On 5/28/21 3:42 AM, Matthias Kretz wrote:
> > On Friday, 28 May 2021 05:05:52 CEST Jason Merrill wrote:
> >> I'd think you could get the same effect from a hypothetical
> >> 
> >> namespace [[gnu::diagnose_as]] stdx = std::experimental;
> >> 
> >> though we'll need to add support for attributes on namespace aliases to
> >> the grammar.
> > 
> > Right, but then two of my design goals can't be met:
> > 
> > 1. Diagnostics have an improved signal-to-noise ratio out of the box.
> > 
> > 2. We can use replacement names that are not valid identifiers.
> 
> This is the basic disconnect: I think that these goals are
> contradictory, and that replacement names that are not valid identifiers
> will just confuse users that don't know about them.

With signal-to-noise ratio I meant the ratio (averaged over all GCC users - so 
yes, we can't give actual numbers for these):

  #characters one needs to read to understand / #total diagnostic characters.

Or more specifically

  1 - #characters that are distracting from understanding the issue / #total 
diagnostic characters.

Consider that for the stdx::simd case I regularly hit the problem that vim's 
QuickFix truncates at 4095 characters and the message basically just got 
started (i.e. it's sometimes impossible to use vim's QuickFix to understand 
errors involving stdx::simd). There's *a lot* of noise that must be removed 
*per default*.
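
To make the second ratio concrete, here is a minimal sketch. `noise_chars` and `signal_ratio` are hypothetical helper names, and treating one fixed substring as "the distracting characters" is an assumption made purely for illustration; what actually distracts a reader cannot be measured this simply.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of characters in `text` occupied by non-overlapping
// occurrences of `noise` (e.g. a long namespace qualification).
std::size_t noise_chars(const std::string& text, const std::string& noise)
{
  if (noise.empty())
    return 0;
  std::size_t n = 0;
  for (std::size_t pos = text.find(noise); pos != std::string::npos;
       pos = text.find(noise, pos + noise.size()))
    n += noise.size();
  return n;
}

// 1 - #distracting characters / #total diagnostic characters
double signal_ratio(std::size_t distracting, std::size_t total)
{
  return total == 0 ? 1.0 : 1.0 - static_cast<double>(distracting) / total;
}
```

For a diagnostic fragment like `std::experimental::parallelism_v2::simd`, counting `experimental::parallelism_v2::` as noise means that most of the characters carry no information for the reader.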

WRT "invalid identifiers", there are two types:
(1) string of characters that is not a valid C++ identifier
(2) valid C++ identifier, but not defined for the given TU

(2) can be confusing, I agree, but doesn't have to be. (1) provides a stronger 
hint that something is either abbreviated or intentionally hidden from the 
user.

If I write `std::experimental::simd` in my code and get a diagnostic 
that says 'stdₓ::simd' then it's relatively easy to 
make the connection what happened here: 'stdₓ' clearly must mean something 
else than a literal 'stdₓ' in my code. The user knows there's no `std::simd` 
so it must be `std::experimental::simd`. (Note that once 
std::experimental::simd goes into the IS, I would be the first to propose a 
change for 'stdₓ::simd' back to 'std::experimental::simd'.)

> If a user sees stdx::foo in a diagnostic and then tries to refer to
> stdx::foo and gets an error, the diagnostic is not more helpful than one
> that uses the fully qualified name.

Hmm, if GCC prints an actual suggestion like "write 'stdₓ::foo' here" then 
yes, I agree. That should not make use of diagnose_as.

> Jonathan, David, any thoughts on this issue?
>
> > I can imagine using it to make _Internal __names more readable while at
> > the
> > same time discouraging users to utter them in their own code. Sorry for
> > the
> > bad code obfuscation example above.
> > 
> > An example for consideration from stdx::simd:
> >namespace std {
> >namespace experimental {
> >namespace parallelism_v2 [[gnu::diagnose_as("stdx")]] {
> >namespace simd_abi [[gnu::diagnose_as("simd_abi")]] {
> >
> >  template 
> >  
> >struct _VecBuiltin;
> >  
> >  template 
> >  
> >struct _VecBltnBtmsk;
> >
> >#if x86
> >
> >  using __ignore_me_0 [[gnu::diagnose_as("[SSE]")]] = _VecBuiltin<16>;
> >  using __ignore_me_1 [[gnu::diagnose_as("[AVX]")]] = _VecBuiltin<32>;
> >  using __ignore_me_2 [[gnu::diagnose_as("[AVX512]")]] =
> >  _VecBltnBtmsk<64>;
> >
> >#endif
> >
> > 
> > Then diagnostics would print 'stdx::simd'
> > instead of 'stdx::simd>'. (Users utter
> > the type by saying e.g. 'stdx::native_simd', while compiling with
> > AVX512 flags.)
>
> Wouldn't it be better to print stdx::native_simd if that's how
> the users write the type?

No. For example, I might expect that native_simd maps to AVX-512 vectors but 
forgot the relevant -m flag(s). If the diagnostics show 'simd' I have a good chance of catching that issue.
And the other way around: If I wrote `stdx::simd` and it happens to be 
the same type as the native_simd typedef, it would show the latter in 
diagnostics. Similar issue with asking for a simd ABI with 
`simd_abi::deduce_t`: I typically don't want to know whether that's 
also native_simd but rather what exact simd_abi I got. And no, as a 
user I don't typically care about the libstdc++ implementation details but 
what those details mean.






Re: [PATCH] Add gnu::diagnose_as attribute

2021-05-28 Thread Matthias Kretz
On Friday, 28 May 2021 05:05:52 CEST Jason Merrill wrote:
> On 5/27/21 6:07 PM, Matthias Kretz wrote:
> > On Thursday, 27 May 2021 23:15:46 CEST Jason Merrill wrote:
> >> On 5/27/21 2:54 PM, Matthias Kretz wrote:
> >>> namespace Vir {
> >>> inline namespace foo {
> >>>   struct A {};
> >>> }
> >>> struct A {};
> >>> }
> >>> using Vir::A;
> >>> 
> >>> :7:12: error: reference to 'A' is ambiguous
> >>> :3:12: note: candidates are: 'struct Vir::A'
> >>> :5:10: note: 'struct Vir::A'
> >> 
> >> That doesn't seem so bad.
> > 
> > As long as you ignore the line numbers, it's a puzzling diagnostic.
> 
> Only briefly puzzling, I think; Vir::A is a valid way of referring to
> both types.

True. But that's also what led to the error. GCC easily clears it up 
nowadays, but wouldn't anymore if inline namespaces were hidden by default.

> I'd think you could get the same effect from a hypothetical
> 
> namespace [[gnu::diagnose_as]] stdx = std::experimental;
> 
> though we'll need to add support for attributes on namespace aliases to
> the grammar.

Right, but then two of my design goals can't be met:

1. Diagnostics have an improved signal-to-noise ratio out of the box.

2. We can use replacement names that are not valid identifiers.

I don't think libstdc++ would ship with a namespace alias outside of the std 
namespace. So we'd place the "burden" of using diagnose_as correctly on our 
users. Also as a user you'd possibly have to repeat the namespace alias in 
every source file and/or place it in your application's or library's namespace.

> >> Here it seems like you want to say "use this typedef as the true name of
> >> the type".  Is it useful to have to repeat the name?  Allowing people to
> >> use names that don't correspond to actual declarations seems unnecessary.
> > 
> > Yes, but you could also use it to apply diagnose_as to a template
> > instantiation without introducing a name for users. E.g.
> > 
> >using __only_to_apply_the_attribute [[gnu::diagnose_as("intvector")]]
> >
> >  = std::vector;
> > 
> > Now all diagnostics of 'std::vector' would print 'intvector' instead.
> 
> Yes, but why would you want to?  Making diagnostics print names that the
> user can't use in their own code seems obfuscatory, and requiring users
> to write the same names in two places seems like extra work.

I can imagine using it to make _Internal __names more readable while at the 
same time discouraging users from uttering them in their own code. Sorry for the 
bad code obfuscation example above.

An example for consideration from stdx::simd:

  namespace std {
  namespace experimental {
  namespace parallelism_v2 [[gnu::diagnose_as("stdx")]] {
  namespace simd_abi [[gnu::diagnose_as("simd_abi")]] {
template 
  struct _VecBuiltin;

template 
  struct _VecBltnBtmsk;

  #if x86
using __ignore_me_0 [[gnu::diagnose_as("[SSE]")]] = _VecBuiltin<16>;
using __ignore_me_1 [[gnu::diagnose_as("[AVX]")]] = _VecBuiltin<32>;
using __ignore_me_2 [[gnu::diagnose_as("[AVX512]")]] = _VecBltnBtmsk<64>;
  #endif
  

Then diagnostics would print 'stdx::simd' instead 
of 'stdx::simd>'. (Users utter the type by 
saying e.g. 'stdx::native_simd', while compiling with AVX512 flags.)


> > But in general, I tend to agree, for type aliases there's rarely a case
> > where the names wouldn't match.
> > 
> > However, I didn't want to special-case the attribute parameters for type
> > aliases (or introduce another attribute just for this case). The attribute
> > works consistently and with the same interface independent of where it's
> > used. I tried to build a generic, broad feature instead of a narrow
> > one-problem solution.
> 
> "Treat this declaration as the name of the type/namespace it refers to
> in diagnostics" also seems consistent to me.

Sure. In general, I think

  namespace foo [[gnu::this_is_the_name_I_want]] = bar;
  using foo [[gnu::this_is_the_name_I_want]] = bar;

is not a terribly bad idea on its own. But it's not the solution for the 
problems I set out to solve.

> Still, perhaps it would be better to store these aliases in a separate hash
> table instead of *_ATTRIBUTES.

Maybe. For performance reasons or for simplification of the implementation? 
What entity could I use for hashing? The identifier alone wouldn't suffice 
since different instantiations of the same class template can have different 
diagnose_as values (e.g. std::string, std::wstring, ...).
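
To illustrate the keying problem, here is a sketch of a side table keyed on the template name plus its argument list rather than on the identifier alone. None of this is GCC code: `InstantiationKey`, `DiagnoseAsTable`, and `display_name` are hypothetical names, the arguments are plain strings instead of trees, and an ordered `std::map` stands in for a hash table to keep the example short.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical key: template name plus the spelled-out template arguments.
using InstantiationKey = std::pair<std::string, std::vector<std::string>>;

// Hypothetical side table: instantiation -> replacement name for diagnostics.
using DiagnoseAsTable = std::map<InstantiationKey, std::string>;

// Return the replacement name if one is registered, else the fallback.
std::string display_name(const DiagnoseAsTable& table,
                         const InstantiationKey& key,
                         const std::string& fallback)
{
  const auto it = table.find(key);
  return it == table.end() ? fallback : it->second;
}
```

With the identifier alone as key, the entries for `basic_string` with `char` and with `wchar_t` would collide, which is the objection raised above.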






Re: [PATCH] Add gnu::diagnose_as attribute

2021-05-27 Thread Matthias Kretz
On Thursday, 27 May 2021 23:15:46 CEST Jason Merrill wrote:
> On 5/27/21 2:54 PM, Matthias Kretz wrote:
> > Also hiding all inline namespace by default might make some error messages
> > harder to understand:
> > 
> > namespace Vir {
> >inline namespace foo {
> >  struct A {};
> >}
> >struct A {};
> > }
> > using Vir::A;
> > 
> > :7:12: error: reference to 'A' is ambiguous
> > :3:12: note: candidates are: 'struct Vir::A'
> > :5:10: note: 'struct Vir::A'
> 
> That doesn't seem so bad.

As long as you ignore the line numbers, it's a puzzling diagnostic.

> > This is from my pending std::string patch:
> > 
> > --- a/libstdc++-v3/include/bits/c++config
> > +++ b/libstdc++-v3/include/bits/c++config
> > @@ -299,7 +299,8 @@ namespace std
> > 
> >   #if _GLIBCXX_USE_CXX11_ABI
> >   namespace std
> >   {
> > 
> > -  inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { }
> > +  inline namespace __cxx11
> > +__attribute__((__abi_tag__ ("cxx11"), __diagnose_as__("std"))) { }
> 
> This seems to have the same benefits and drawbacks of my inline
> namespace suggestion.

True for std::string, not true for TS's where the extra '::experimental' still 
makes finding the relevant information in diagnostics harder than necessary.

> And it seems like applying the attribute to a
> namespace means that enclosing namespaces are not printed, unlike the
> handling for types.

Yes, that's also how I documented it. For nested namespaces I wanted to enable 
the removal of nesting (e.g. from std::experimental::parallelism_v2::simd to 
stdx::simd). However, for types and functions it would be a problem to drop 
the enclosing scope, because the scope can be class templates and thus the 
diagnose_as attribute would remove all template parms & args.
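
For context, the nesting in question looks roughly like the sketch below. It uses only standard C++ and a hypothetical namespace `lib` as a stand-in for `std` (user code must not add declarations there); `simd` is reduced to an empty template. It shows that the short alias spelling and the fully nested spelling name the same type in source code. It does not show how either spelling is printed in diagnostics.

```cpp
#include <cassert>
#include <type_traits>

namespace lib {                    // hypothetical stand-in for namespace std
namespace experimental {
inline namespace parallelism_v2 {  // inline: members are visible in experimental
  template <typename T>
    struct simd {};
}
}
}

// User-side shorthand, as commonly written for the Parallelism TS.
namespace stdx = lib::experimental;

static_assert(std::is_same<stdx::simd<float>,
                           lib::experimental::parallelism_v2::simd<float>>::value,
              "alias spelling and fully nested spelling name the same type");
```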

> > -  typedef basic_string string;
> > +  typedef basic_string string
> > [[__gnu__::__diagnose_as__("string")]];
>
> Here it seems like you want to say "use this typedef as the true name of
> the type".  Is it useful to have to repeat the name?  Allowing people to
> use names that don't correspond to actual declarations seems unnecessary.

Yes, but you could also use it to apply diagnose_as to a template 
instantiation without introducing a name for users. E.g.

  using __only_to_apply_the_attribute [[gnu::diagnose_as("intvector")]]
= std::vector;

Now all diagnostics of 'std::vector' would print 'intvector' instead. But 
in general, I tend to agree, for type aliases there's rarely a case where the 
names wouldn't match.

However, I didn't want to special-case the attribute parameters for type 
aliases (or introduce another attribute just for this case). The attribute 
works consistently and with the same interface independent of where it's used. 
I tried to build a generic, broad feature instead of a narrow one-problem 
solution.

FWIW, before you suggest having one attribute for namespaces and one for type 
aliases (to cover the std::string case), I have another use case in stdx::simd 
(the spec requires simd_abi::scalar to be an alias):

  namespace std::experimental::parallelism_v2::simd_abi {
struct [[gnu::diagnose_as("scalar")]] _Scalar;
using scalar = _Scalar;
  }

If the attribute were on the type alias (using scalar [[gnu::diagnose_as]] = 
_Scalar;), then we'd have to apply the attribute to _Scalar after it was 
completed. That seemed like a bad idea to me.  






Re: [PATCH] Add gnu::diagnose_as attribute

2021-05-27 Thread Matthias Kretz
On Thursday, 27 May 2021 19:39:48 CEST Jason Merrill wrote:
> On 5/4/21 7:13 AM, Matthias Kretz wrote:
> > From: Matthias Kretz 
> > 
> > This attribute overrides the diagnostics output string for the entity it
> > appertains to. The motivation is to improve QoI for library TS
> > implementations, where diagnostics have a very bad signal-to-noise ratio
> > due to the long namespaces involved.
> > 
> > On Tuesday, 27 April 2021 11:46:48 CEST Jonathan Wakely wrote:
> >> I think it's a great idea and would like to use it for all the TS
> >> implementations where there is some inline namespace that the user
> >> doesn't care about. std::experimental::fundamentals_v1:: would be much
> >> better as just std::experimental::, or something like std::[LFTS]::.
> 
> Hmm, how much of the benefit could we get from a flag (probably on by
> default) to skip inline namespaces in diagnostics?

I'd say about 20% for the TS's. Even std::experimental::simd (i.e. without the 
'::parallelism_v2' part) is still rather noisy. I want stdₓ::simd, std-x::simd 
or std::[PTS2]::simd or whatever shortest shorthand Jonathan will allow. ;)

For PR89370, the benefit would be ~2%:

'template<class _Tp> std::__cxx11::basic_string<_CharT, _Traits, 
_Alloc>::_If_sv<_Tp, std::__cxx11::basic_string<_CharT, _Traits, _Alloc>&> 
std::__cxx11::basic_string<_CharT, _Traits, 
_Alloc>::insert(std::__cxx11::basic_string<_CharT, _Traits, 
_Alloc>::size_type, const _Tp&, std::__cxx11::basic_string<_CharT, _Traits, 
_Alloc>::size_type, std::__cxx11::basic_string<_CharT, _Traits, 
_Alloc>::size_type) [with _Tp = _Tp; _CharT = char; _Traits = 
std::char_traits<char>; _Alloc = std::allocator<char>]'

would then only turn into:

'template<class _Tp> std::basic_string<_CharT, _Traits, 
_Alloc>::_If_sv<_Tp, std::basic_string<_CharT, _Traits, _Alloc>&> 
std::basic_string<_CharT, _Traits, 
_Alloc>::insert(std::basic_string<_CharT, _Traits, 
_Alloc>::size_type, const _Tp&, std::basic_string<_CharT, _Traits, 
_Alloc>::size_type, std::basic_string<_CharT, _Traits, 
_Alloc>::size_type) [with _Tp = _Tp; _CharT = char; _Traits = 
std::char_traits<char>; _Alloc = std::allocator<char>]'

instead of:

'template<class _Tp> std::string::_If_sv<_Tp, std::string&> 
std::string::insert<_Tp>(std::string::size_type, const _Tp&, 
std::string::size_type, std::string::size_type)'


Also, hiding all inline namespaces by default might make some error messages 
harder to understand:

namespace Vir {
  inline namespace foo {
    struct A {};
  }
  struct A {};
}
using Vir::A;


<source>:7:12: error: reference to 'A' is ambiguous
<source>:3:12: note: candidates are: 'struct Vir::A'
<source>:5:10: note: 'struct Vir::A'

> > With the attribute, it is possible to solve PR89370 and make
> > std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as
> > std::string in diagnostic output without extra hacks to recognize the
> > type.
> 
> That sounds wrong to me; std::string is the <char> instantiation, not
> the template.  Your patch doesn't make it possible to apply this
> attribute to class template instantiations, does it?

Yes, it does.

Initially, when I tried to improve the TS experience, it didn't. When Jonathan 
showed PR89370 to me I tried to make [[gnu::diagnose_as]] more generic & 
useful. Since there's no obvious syntax to apply an attribute to a template 
instantiation, I had to be creative. This is from my pending std::string 
patch:

--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -299,7 +299,8 @@ namespace std
 #if _GLIBCXX_USE_CXX11_ABI
 namespace std
 {
-  inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { }
+  inline namespace __cxx11
+__attribute__((__abi_tag__ ("cxx11"), __diagnose_as__("std"))) { }
 }
 namespace __gnu_cxx
 {
--- a/libstdc++-v3/include/bits/stringfwd.h
+++ b/libstdc++-v3/include/bits/stringfwd.h
@@ -76,24 +76,24 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 _GLIBCXX_END_NAMESPACE_CXX11
 
   /// A string of @c char
-  typedef basic_string<char>    string;
+  typedef basic_string<char> string [[__gnu__::__diagnose_as__("string")]];
 
 #ifdef _GLIBCXX_USE_WCHAR_T
   /// A string of @c wchar_t
-  typedef basic_string<wchar_t> wstring;
+  typedef basic_string<wchar_t> wstring
 [[__gnu__::__diagnose_as__("wstring")]];
 #endif
[...]

The part of my frontend patch that makes this work is in 
handle_diagnose_as_attribute:

+  if (TREE_CODE (*node) == TYPE_DECL)
+{
+  // Apply the attribute to the type alias itself.
+  decl = *node;
+  tree type = TREE_TYPE (*node);
+  if (CLASS_TYPE_P (type) && CLASSTYPE_TEMPLATE_INSTANTIATION (type))
+   {
+ if (COMPLETE_OR_OPEN_TYPE_P (type))
+   warning (OPT_Wattributes,
+"igno

Re: [PATCH] c++: Add missing scope in typedef diagnostic [PR100763]

2021-05-27 Thread Dr. Matthias Kretz
On Thursday, 27 May 2021 17:18:58 CEST Jason Merrill wrote:
> On 5/26/21 5:27 PM, Matthias Kretz wrote:
> > From: Matthias Kretz 
> > 
> > dump_type on 'const std::string' should not print 'const string' unless
> > TFF_UNQUALIFIED_NAME is requested.
> > 
> > gcc/cp/ChangeLog:
> > PR c++/100763
> > * error.c: Call dump_scope when printing a typedef.
> > 
> > + if (! (flags & TFF_UNQUALIFIED_NAME))
> > +   dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags);
> 
> You can use "decl" instead of "TYPE_NAME (t)" here.
> 
> OK with that change.

Updated patch below.


From: Matthias Kretz 

dump_type on 'const std::string' should not print 'const string' unless
TFF_UNQUALIFIED_NAME is requested.

gcc/cp/ChangeLog:

PR c++/100763
* error.c: Call dump_scope when printing a typedef.
---
 gcc/cp/error.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 3d5eebd4bcd..ae78b10c7b2 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -501,6 +501,8 @@ dump_type (cxx_pretty_printer *pp, tree t, int flags)
   else
 	{
 	  pp_cxx_cv_qualifier_seq (pp, t);
+	  if (! (flags & TFF_UNQUALIFIED_NAME))
+	dump_scope (pp, CP_DECL_CONTEXT (decl), flags);
 	  pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t));
 	  return;
 	}


Re: [PATCH] c++: Output less irrelevant info for function template decl [PR100716]

2021-05-27 Thread Matthias Kretz
On Thursday, 27 May 2021 17:07:40 CEST Jason Merrill wrote:
> On 5/26/21 5:29 PM, Matthias Kretz wrote:
> > New revision which can also be compiled with GCC 4.8.
> > 
> > From: Matthias Kretz 
> > 
> > Ensure dump_template_decl for function templates never prints template
> > parameters after the function name (it did with -fno-pretty-templates)
> > and skip output of irrelevant & confusing "[with T = T]" in
> > dump_substitution.
> > 
> > gcc/cp/ChangeLog:
> > PR c++/100716
> > * error.c (dump_template_bindings): Include code to print
> > "[with" and ']', conditional on whether anything is printed at
> > all. This is tied to whether a semicolon is needed to separate
> > multiple template parameters. If the template argument repeats
> > the template parameter (T = T), then skip the parameter.
> 
> This description should really be in a comment in the code, rather than
> the ChangeLog.  OK either way.

Added comments in the code. New patch below.


From: Matthias Kretz 

Ensure dump_template_decl for function templates never prints template
parameters after the function name (it did with -fno-pretty-templates)
and skip output of irrelevant & confusing "[with T = T]" in
dump_substitution.

gcc/cp/ChangeLog:

PR c++/100716
* error.c (dump_template_bindings): Include code to print
"[with" and ']', conditional on whether anything is printed at
all. This is tied to whether a semicolon is needed to separate
multiple template parameters. If the template argument repeats
the template parameter (T = T), then skip the parameter.
(dump_substitution): Moved code to print "[with" and ']' to
dump_template_bindings.
(dump_function_decl): Partial revert of PR50828, which masked
TFF_TEMPLATE_NAME for all of dump_function_decl. Now
TFF_TEMPLATE_NAME is masked for the scope of the function and
only carries through to dump_function_name.
(dump_function_name): Avoid calling dump_template_parms if
TFF_TEMPLATE_NAME is set.

gcc/testsuite/ChangeLog:

PR c++/100716
* g++.dg/diagnostic/pr100716.C: New test.
* g++.dg/diagnostic/pr100716-1.C: Same test with
-fno-pretty-templates.
---
 gcc/cp/error.c   | 63 +++-
 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C | 54 +
 gcc/testsuite/g++.dg/diagnostic/pr100716.C   | 54 +
 3 files changed, 156 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716.C


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index a2f19d1a5c1..ade9b17e663 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -371,7 +371,35 @@ static void
 dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 vec<tree, va_gc> *typenames)
 {
-  bool need_semicolon = false;
+  /* Print "[with" and ']', conditional on whether anything is printed at all.
+ This is tied to whether a semicolon is needed to separate multiple template
+ parameters.  */
+  struct prepost_semicolon
+  {
+cxx_pretty_printer *pp;
+bool need_semicolon;
+
+void operator()()
+{
+  if (need_semicolon)
+	pp_separate_with_semicolon (pp);
+  else
+	{
+	  pp_cxx_whitespace (pp);
+	  pp_cxx_left_bracket (pp);
+	  pp->translate_string ("with");
+	  pp_cxx_whitespace (pp);
+	  need_semicolon = true;
+	}
+}
+
+~prepost_semicolon()
+{
+  if (need_semicolon)
+	pp_cxx_right_bracket (pp);
+}
+  } semicolon_or_introducer = {pp, false};
+
   int i;
   tree t;
 
@@ -395,10 +423,20 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 	  if (lvl_args && NUM_TMPL_ARGS (lvl_args) > arg_idx)
 	arg = TREE_VEC_ELT (lvl_args, arg_idx);
 
-	  if (need_semicolon)
-	pp_separate_with_semicolon (pp);
-	  dump_template_parameter (pp, TREE_VEC_ELT (p, i),
-   TFF_PLAIN_IDENTIFIER);
+	  tree parm_i = TREE_VEC_ELT (p, i);
+	  /* If the template argument repeats the template parameter (T = T),
+	 skip the parameter.*/
+	  if (arg && TREE_CODE (arg) == TEMPLATE_TYPE_PARM
+		&& TREE_CODE (parm_i) == TREE_LIST
+		&& TREE_CODE (TREE_VALUE (parm_i)) == TYPE_DECL
+		&& TREE_CODE (TREE_TYPE (TREE_VALUE (parm_i)))
+		 == TEMP
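
The introducer/separator helper in the hunk above can be tried outside GCC. The following is only a minimal sketch under stated assumptions: it writes to a std::ostream instead of a cxx_pretty_printer, the function format_bindings and its pair-of-strings input are hypothetical names invented for this illustration, and the "T = T" check is a plain string comparison rather than the tree comparison in the patch.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the helper in the patch: the first call writes " [with ",
// every later call writes "; ", and the destructor closes the bracket only
// if something was written at all.
struct prepost_semicolon
{
  std::ostream &out;
  bool need_semicolon;

  void operator() ()
  {
    if (need_semicolon)
      out << "; ";
    else
      {
        out << " [with ";
        need_semicolon = true;
      }
  }

  ~prepost_semicolon ()
  {
    if (need_semicolon)
      out << ']';
  }
};

// Hypothetical driver (not a GCC function): formats "parm = arg" pairs and
// skips a binding whose argument merely repeats the parameter name.
std::string
format_bindings (const std::vector<std::pair<std::string, std::string>> &bindings)
{
  std::ostringstream out;
  {
    prepost_semicolon semicolon_or_introducer = {out, false};
    for (const auto &b : bindings)
      {
        if (b.first == b.second)
          continue; // noise such as "T = T"
        semicolon_or_introducer ();
        out << b.first << " = " << b.second;
      }
  } // the destructor runs here and may append ']'
  return out.str ();
}
```

With this sketch, a list containing only "T = T" produces an empty string, so no empty "[with ]" is left behind. That is the reason the bracket printing is tied to the separator state.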

Re: [PATCH] Add gnu::diagnose_as attribute

2021-05-26 Thread Matthias Kretz
o. Thanks for your detailed comments on this topic. Very helpful.

> > + continue;
> > +   }
> > + if (TREE_CODE (TREE_VALUE (args)) != STRING_CST)
> > +   {
> > + error ("the argument to the %qE attribute must be a string "
> > +   "literal", name);
> 
> Similarly here, recommend to follow one of the existing styles (see
> c-family/c-attribs.c) rather than adding another variation to the mix.

The visibility attribute on namespaces says: "%qD attribute requires a single 
NTBS argument". So I copied that (and its logic) for now. However, I believe 
the use of "NTBS" is not very user friendly.

> > + if (CLASS_TYPE_P (type) && CLASSTYPE_IMPLICIT_INSTANTIATION (type))
> > +   {
> > + if (COMPLETE_OR_OPEN_TYPE_P (type))
> > +   warning (OPT_Wattributes, "%qE attribute cannot be applied to %qT "
> > + "after its instantiation", name, type);
> 
> Ditto here:
> msgid "ignoring %qE attribute applied to template instantiation %qT"

Ah, here I want to be more precise. Because the attribute can be applied to a 
template instantiation. But only before its instantiation. Example:

template <class T> struct X {};
using [[gnu::diagnose_as("XX")]] XX = X<int>; // OK
template struct X<float>;
using [[gnu::diagnose_as("XY")]] XY = X<float>; // not OK

msgid "ignoring %qE attribute applied to template %qT after instantiation"
OK?

> > + error ("%qE attribute applied to extern \"C\" declaration %qD",
> 
> > Please quote extern "C" (as "%<extern \"C\"%>").

OK. However the msgid was copied from handle_abi_tag_attribute above.

New patch (and ChangeLog) below:


From: Matthias Kretz 

This attribute overrides the diagnostics output string for the entity it
appertains to. The motivation is to improve QoI for library TS
implementations, where diagnostics have a very bad signal-to-noise ratio
due to the long namespaces involved.

With the attribute, it is possible to solve PR89370 and make
std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as
std::string in diagnostic output without extra hacks to recognize the
type in the C++ frontend.

gcc/ChangeLog:

PR c++/89370
* doc/extend.texi: Document the diagnose_as attribute.
* doc/invoke.texi: Document -fno-diagnostics-use-aliases.

gcc/c-family/ChangeLog:

PR c++/89370
* c.opt (fdiagnostics-use-aliases): New diagnostics flag.

gcc/cp/ChangeLog:

PR c++/89370
* cp-tree.h: Add TFF_AS_PRIMARY.
* error.c (dump_scope): When printing the name of a namespace,
look for the diagnose_as attribute. If found, print the
associated string instead of calling dump_decl.
(dump_decl_name_or_diagnose_as): New function to replace
dump_decl (pp, DECL_NAME(t), flags) and inspect the tree for the
diagnose_as attribute before printing the DECL_NAME.
(dump_template_scope): New function. Prints the scope of a
template instance correctly applying diagnose_as attributes and
adjusting the list of template parms accordingly.
(dump_aggr_type): If the type has a diagnose_as attribute, print
the associated string instead of printing the original type
name. Print template parms only if the attribute was not applied
to the instantiation / full specialization.
(dump_simple_decl): Call dump_decl_name_or_diagnose_as instead
of dump_decl.
(dump_decl): Ditto.
(lang_decl_name): Ditto.
(dump_function_decl): Walk the functions context list to
determine whether a call to dump_template_scope is required.
Ensure function templates are presented as primary templates.
(dump_function_name): Replace the function's identifier with the
diagnose_as attribute value, if set.
(dump_template_parms): Treat as primary template if flags
contains TFF_AS_PRIMARY.
(comparable_template_types_p): Consider the types not a template
if one carries a diagnose_as attribute.
(print_template_differences): Replace the identifier with the
diagnose_as attribute value on the most general template, if it
is set.
* name-lookup.c (handle_namespace_attrs): Handle the diagnose_as
attribute. Ensure exactly one string argument. Ensure previous
diagnose_as attributes used the same name.
* tree.c (cxx_attribute_table): Add diagnose_as attribute to the
table.
(check_diagnose_as_redeclaration): New function; copied and
adjusted from check_abi_tag_redeclaration.
(handle_diagnose_as_attribute): New function; copied and
adjusted from handle_abi_tag_attribute. If the given *node is a
TYPE_DECL and the TREE_TYPE is an implicit class te

Re: [PATCH] c++: Output less irrelevant info for function template decl [PR100716]

2021-05-26 Thread Matthias Kretz
New revision which can also be compiled with GCC 4.8.

From: Matthias Kretz 

Ensure dump_template_decl for function templates never prints template
parameters after the function name (it did with -fno-pretty-templates)
and skip output of irrelevant & confusing "[with T = T]" in
dump_substitution.

gcc/cp/ChangeLog:

PR c++/100716
* error.c (dump_template_bindings): Include code to print
"[with" and ']', conditional on whether anything is printed at
all. This is tied to whether a semicolon is needed to separate
multiple template parameters. If the template argument repeats
the template parameter (T = T), then skip the parameter.
(dump_substitution): Moved code to print "[with" and ']' to
dump_template_bindings.
(dump_function_decl): Partial revert of PR50828, which masked
TFF_TEMPLATE_NAME for all of dump_function_decl. Now
TFF_TEMPLATE_NAME is masked for the scope of the function and
only carries through to dump_function_name.
(dump_function_name): Avoid calling dump_template_parms if
TFF_TEMPLATE_NAME is set.

gcc/testsuite/ChangeLog:

PR c++/100716
* g++.dg/diagnostic/pr100716.C: New test.
* g++.dg/diagnostic/pr100716-1.C: Same test with
-fno-pretty-templates.
---
 gcc/cp/error.c   | 59 +++-
 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C | 54 ++
 gcc/testsuite/g++.dg/diagnostic/pr100716.C   | 54 ++
 3 files changed, 152 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716.C


--
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index ad69df6ef7f..b0836d83888 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -371,7 +371,32 @@ static void
 dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 vec<tree, va_gc> *typenames)
 {
-  bool need_semicolon = false;
+  struct prepost_semicolon
+  {
+cxx_pretty_printer *pp;
+bool need_semicolon;
+
+void operator()()
+{
+  if (need_semicolon)
+	pp_separate_with_semicolon (pp);
+  else
+	{
+	  pp_cxx_whitespace (pp);
+	  pp_cxx_left_bracket (pp);
+	  pp->translate_string ("with");
+	  pp_cxx_whitespace (pp);
+	  need_semicolon = true;
+	}
+}
+
+~prepost_semicolon()
+{
+  if (need_semicolon)
+	pp_cxx_right_bracket (pp);
+}
+  } semicolon_or_introducer = {pp, false};
+
   int i;
   tree t;
 
@@ -395,10 +420,19 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 	  if (lvl_args && NUM_TMPL_ARGS (lvl_args) > arg_idx)
 	arg = TREE_VEC_ELT (lvl_args, arg_idx);
 
-	  if (need_semicolon)
-	pp_separate_with_semicolon (pp);
-	  dump_template_parameter (pp, TREE_VEC_ELT (p, i),
-   TFF_PLAIN_IDENTIFIER);
+	  tree parm_i = TREE_VEC_ELT (p, i);
+	  /* Skip this parameter if it is just noise such as "T = T".  */
+	  if (arg && TREE_CODE (arg) == TEMPLATE_TYPE_PARM
+		&& TREE_CODE (parm_i) == TREE_LIST
+		&& TREE_CODE (TREE_VALUE (parm_i)) == TYPE_DECL
+		&& TREE_CODE (TREE_TYPE (TREE_VALUE (parm_i)))
+		 == TEMPLATE_TYPE_PARM
+		&& DECL_NAME (TREE_VALUE (parm_i))
+		 == DECL_NAME (TREE_CHAIN (arg)))
+	continue;
+
+	  semicolon_or_introducer();
+	  dump_template_parameter (pp, parm_i, TFF_PLAIN_IDENTIFIER);
 	  pp_cxx_whitespace (pp);
 	  pp_equal (pp);
 	  pp_cxx_whitespace (pp);
@@ -414,7 +448,6 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 	pp_string (pp, M_("<missing>"));
 
 	  ++arg_idx;
-	  need_semicolon = true;
 	}
 
   parms = TREE_CHAIN (parms);
@@ -436,8 +469,7 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 
   FOR_EACH_VEC_SAFE_ELT (typenames, i, t)
 {
-  if (need_semicolon)
-	pp_separate_with_semicolon (pp);
+  semicolon_or_introducer();
   dump_type (pp, t, TFF_PLAIN_IDENTIFIER);
   pp_cxx_whitespace (pp);
   pp_equal (pp);
@@ -1599,12 +1631,7 @@ dump_substitution (cxx_pretty_printer *pp,
   && !(flags & TFF_NO_TEMPLATE_BINDINGS))
 {
   vec<tree, va_gc> *typenames = t ? find_typenames (t) : NULL;
-  pp_cxx_whitespace (pp);
-  pp_cxx_left_bracket (pp);
-  pp->translate_string ("with");
-  pp_cxx_whitespace (pp);
   dump_template_bindings (pp, template_parms, template_args, typenames);
-  pp_cxx_right_

[PATCH] c++: Add missing scope in typedef diagnostic [PR100763]

2021-05-26 Thread Matthias Kretz


From: Matthias Kretz 

dump_type on 'const std::string' should not print 'const string' unless
TFF_UNQUALIFIED_NAME is requested.

gcc/cp/ChangeLog:

PR c++/100763
* error.c: Call dump_scope when printing a typedef.
---
 gcc/cp/error.c | 2 ++
 1 file changed, 2 insertions(+)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index c88d1749a0f..ad69df6ef7f 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -501,6 +501,8 @@ dump_type (cxx_pretty_printer *pp, tree t, int flags)
   else
 	{
 	  pp_cxx_cv_qualifier_seq (pp, t);
+	  if (! (flags & TFF_UNQUALIFIED_NAME))
+	dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags);
 	  pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t));
 	  return;
 	}


[PATCH] c++: Output less irrelevant info for function template decl [PR100716]

2021-05-25 Thread Matthias Kretz

From: Matthias Kretz 

Ensure dump_template_decl for function templates never prints template 
parameters after the function name (it did with -fno-pretty-templates) and 
skip output of irrelevant & confusing "[with T = T]" in dump_substitution.

gcc/cp/ChangeLog:

PR c++/100716
* error.c (dump_template_bindings): Include code to print
"[with" and ']', conditional on whether anything is printed at
all. This is tied to whether a semicolon is needed to separate
multiple template parameters. If the template argument repeats
the template parameter (T = T), then skip the parameter.
(dump_substitution): Moved code to print "[with" and ']' to
dump_template_bindings.
(dump_function_decl): Partial revert of PR50828, which masked
TFF_TEMPLATE_NAME for all of dump_function_decl. Now
TFF_TEMPLATE_NAME is masked for the scope of the function and
only carries through to dump_function_name.
(dump_function_name): Avoid calling dump_template_parms if
TFF_TEMPLATE_NAME is set.

gcc/testsuite/ChangeLog:

PR c++/100716
* g++.dg/diagnostic/pr100716.C: New test.
* g++.dg/diagnostic/pr100716-1.C: Same test with
-fno-pretty-templates.
---
 gcc/cp/error.c   | 59 +++-
 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C | 54 ++
 gcc/testsuite/g++.dg/diagnostic/pr100716.C   | 54 ++
 3 files changed, 152 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C
 create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716.C


--
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 010fbce41a7..bc0b68f07e0 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -381,7 +381,32 @@ static void
 dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 vec<tree, va_gc> *typenames)
 {
-  bool need_semicolon = false;
+  struct prepost_semicolon
+  {
+cxx_pretty_printer *pp;
+bool need_semicolon = false;
+
+void operator()()
+{
+  if (need_semicolon)
+	pp_separate_with_semicolon (pp);
+  else
+	{
+	  pp_cxx_whitespace (pp);
+	  pp_cxx_left_bracket (pp);
+	  pp->translate_string ("with");
+	  pp_cxx_whitespace (pp);
+	  need_semicolon = true;
+	}
+}
+
+~prepost_semicolon()
+{
+  if (need_semicolon)
+	pp_cxx_right_bracket (pp);
+}
+  } semicolon_or_introducer = {pp};
+
   int i;
   tree t;
 
@@ -405,10 +430,19 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 	  if (lvl_args && NUM_TMPL_ARGS (lvl_args) > arg_idx)
 	arg = TREE_VEC_ELT (lvl_args, arg_idx);
 
-	  if (need_semicolon)
-	pp_separate_with_semicolon (pp);
-	  dump_template_parameter (pp, TREE_VEC_ELT (p, i),
-   TFF_PLAIN_IDENTIFIER);
+	  tree parm_i = TREE_VEC_ELT (p, i);
+	  /* Skip this parameter if it is just noise such as "T = T".  */
+	  if (arg && TREE_CODE (arg) == TEMPLATE_TYPE_PARM
+		&& TREE_CODE (parm_i) == TREE_LIST
+		&& TREE_CODE (TREE_VALUE (parm_i)) == TYPE_DECL
+		&& TREE_CODE (TREE_TYPE (TREE_VALUE (parm_i)))
+		 == TEMPLATE_TYPE_PARM
+		&& DECL_NAME (TREE_VALUE (parm_i))
+		 == DECL_NAME (TREE_CHAIN (arg)))
+	continue;
+
+	  semicolon_or_introducer();
+	  dump_template_parameter (pp, parm_i, TFF_PLAIN_IDENTIFIER);
 	  pp_cxx_whitespace (pp);
 	  pp_equal (pp);
 	  pp_cxx_whitespace (pp);
@@ -424,7 +458,6 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 	pp_string (pp, M_("<missing>"));
 
 	  ++arg_idx;
-	  need_semicolon = true;
 	}
 
   parms = TREE_CHAIN (parms);
@@ -446,8 +479,7 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args,
 
   FOR_EACH_VEC_SAFE_ELT (typenames, i, t)
 {
-  if (need_semicolon)
-	pp_separate_with_semicolon (pp);
+  semicolon_or_introducer();
   dump_type (pp, t, TFF_PLAIN_IDENTIFIER);
   pp_cxx_whitespace (pp);
   pp_equal (pp);
@@ -1652,12 +1684,7 @@ dump_substitution (cxx_pretty_printer *pp,
   && !(flags & TFF_NO_TEMPLATE_BINDINGS))
 {
   vec<tree, va_gc> *typenames = t ? find_typenames (t) : NULL;
-  pp_cxx_whitespace (pp);
-  pp_cxx_left_bracket (pp);
-  pp->translate_string ("with");
-  pp_cxx_whitespace (pp);
   dump_template_bindings (pp, template_parms, template_args, typenames);
-  pp_cxx_right_bracket (pp);
 }
 }
 
@@ -1698,7 +1725,8 @@ du

Re: [PATCH] Add gnu::diagnose_as attribute

2021-05-04 Thread Matthias Kretz
> On Tuesday, 4 May 2021 15:34:13 CEST David Malcolm wrote:
> > Does the patch interact correctly with the %H and %I codes that try to
> > show the differences between two template types?

While looking into this, I noticed that given

namespace std {
  struct A {};
  typedef A B;
}

const std::B would print as "'const B' {aka 'const std::A'}", i.e. without 
printing the scope of the typedef. I traced it to cp/error.c (dump_type). In 
the `if (TYPE_P (t) && typedef_variant_p (t))` branch, in the final else 
branch only cv-qualifiers and identifier are printed:

  pp_cxx_cv_qualifier_seq (pp, t);
  pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t));

I believe the following should go in between, correct?

  pp_cxx_cv_qualifier_seq (pp, t);
  if (! (flags & TFF_UNQUALIFIED_NAME))
dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags);
  pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t));

This is important for my diagnose_as patch because otherwise the output is:

  'const string' {aka 'const std::string'}

which is confusing and unnecessarily verbose. Patch below.


From: Matthias Kretz 

dump_type on 'const std::string' should not print 'const string' unless
TFF_UNQUALIFIED_NAME is requested.

gcc/cp/ChangeLog:
* error.c: Call dump_scope when printing a typedef.
---
 gcc/cp/error.c | 2 ++
 1 file changed, 2 insertions(+)


-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 10b547afaa7..edeaad44bcd 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -511,6 +511,8 @@ dump_type (cxx_pretty_printer *pp, tree t, int flags)
   else
 	{
 	  pp_cxx_cv_qualifier_seq (pp, t);
+	  if (! (flags & TFF_UNQUALIFIED_NAME))
+	dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags);
 	  pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t));
 	  return;
 	}
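
The effect of the TFF_UNQUALIFIED_NAME check in the hunk above can be illustrated without GCC internals. This is only a toy model under stated assumptions: the flag constant SKETCH_UNQUALIFIED_NAME and the function dump_typedef_sketch are hypothetical names for this illustration, and strings stand in for the tree nodes that dump_type actually receives.

```cpp
#include <cassert>
#include <string>

// Hypothetical flag for this sketch only; it is not GCC's TFF_UNQUALIFIED_NAME
// and its value is chosen arbitrarily.
constexpr int SKETCH_UNQUALIFIED_NAME = 1 << 0;

// Toy model of the typedef branch: cv-qualifiers first, then the scope
// (unless unqualified output was requested), then the identifier.
std::string
dump_typedef_sketch (const std::string &cv_quals, const std::string &scope,
                     const std::string &identifier, int flags)
{
  std::string out;
  if (!cv_quals.empty ())
    out += cv_quals + " ";
  if (!(flags & SKETCH_UNQUALIFIED_NAME) && !scope.empty ())
    out += scope + "::";
  out += identifier;
  return out;
}
```

In this model, the typedef B from the example prints as "const std::B" by default and as "const B" only when the unqualified flag is passed, which mirrors the behaviour the patch asks for.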


Re: [PATCH] Add gnu::diagnose_as attribute

2021-05-04 Thread Matthias Kretz
On Tuesday, 4 May 2021 16:23:23 CEST Matthias Kretz wrote:
> On Tuesday, 4 May 2021 15:34:13 CEST David Malcolm wrote:
> > Does the patch interact correctly with the %H and %I codes that try to
> > show the differences between two template types?
> 
> I don't know. I'll try to find out. If you have a good idea (or pointer) for
> a testcase, let me know.

I see it now. It currently does not interact with %H and %I (at least in my 
tests). I'll investigate what it should do.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [PATCH] Add gnu::diagnose_as attribute

2021-05-04 Thread Matthias Kretz
On Tuesday, 4 May 2021 15:34:13 CEST David Malcolm wrote:
> On Tue, 2021-05-04 at 13:13 +0200, Matthias Kretz wrote:
> > This attribute overrides the diagnostics output string for the entity
> > it
> > appertains to. The motivation is to improve QoI for library TS
> > implementations, where diagnostics have a very bad signal-to-noise
> > ratio
> > due to the long namespaces involved.
> > [...]
> 
> Thanks for the patch, it looks very promising.

Thanks. I'm new to modifying the compiler like this, so please be extra 
careful with my patch. I believe I understand most of what I did, but I might 
have misunderstood. :)

> The patch has no testcases; it should probably add test coverage for:
> - the various places and ways in which diagnose_as can affect the
> output,
> - disabling it with the option
> - the various ways in which the user can get diagnose_as wrong
> - etc

Right. If you know of an existing similar testcase, that'd help me a lot to 
get started.

> Does the patch affect the output of labels when underlining ranges of
> source code in diagnostics?

AFAIU (and tested), it doesn't affect source code output. So, no?

> Does the patch interact correctly with the %H and %I codes that try to
> show the differences between two template types?

I don't know. I'll try to find out. If you have a good idea (or pointer) for a 
testcase, let me know.

> I have some minor nits from a diagnostics point of view:
> [...]
> Please add an auto_diagnostic_group here so that the "inform" is
> associated with the "error".
> [...]
> diagnose_as should be in quotes here (%< and %>).
> [...]
> Please quote extern "C":

Thanks. All done in my tree. I'll work on testcases before sending an updated 
patch.

> Thanks again for the patch; hope this is constructive




-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


[PATCH] Add gnu::diagnose_as attribute

2021-05-04 Thread Matthias Kretz
From: Matthias Kretz 

This attribute overrides the diagnostics output string for the entity it
appertains to. The motivation is to improve QoI for library TS
implementations, where diagnostics have a very bad signal-to-noise ratio
due to the long namespaces involved.

On Tuesday, 27 April 2021 11:46:48 CEST Jonathan Wakely wrote:
> I think it's a great idea and would like to use it for all the TS
> implementations where there is some inline namespace that the user
> doesn't care about. std::experimental::fundamentals_v1:: would be much
> better as just std::experimental::, or something like std::[LFTS]::.

With the attribute, it is possible to solve PR89370 and make
std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as
std::string in diagnostic output without extra hacks to recognize the
type.

gcc/ChangeLog:

PR c++/89370
* doc/extend.texi: Document the diagnose_as attribute.
* doc/invoke.texi: Document -fno-diagnostics-use-aliases.

gcc/c-family/ChangeLog:

PR c++/89370
* c.opt (fdiagnostics-use-aliases): New diagnostics flag.

gcc/cp/ChangeLog:

PR c++/89370
* error.c (dump_scope): When printing the name of a namespace,
look for the diagnose_as attribute. If found, print the
associated string instead of calling dump_decl.
(dump_decl_name_or_diagnose_as): New function to replace
dump_decl (pp, DECL_NAME(t), flags) and inspect the tree for the
diagnose_as attribute before printing the DECL_NAME.
(dump_aggr_type): If the type has a diagnose_as attribute, print
the associated string instead of printing the original type
name.
(dump_simple_decl): Call dump_decl_name_or_diagnose_as instead
of dump_decl.
(dump_decl): Ditto.
(lang_decl_name): Ditto.
(dump_function_decl): Ensure complete replacement of the class
template diagnostics if a diagnose_as attribute is present.
(dump_function_name): Replace the function diagnostic output if
the diagnose_as attribute is set.
* name-lookup.c (handle_namespace_attrs): Handle the diagnose_as
attribute. Ensure exactly one string argument. Ensure previous
diagnose_as attributes used the same name.
* tree.c (cxx_attribute_table): Add diagnose_as attribute to the
table.
(check_diagnose_as_redeclaration): New function; copied and
adjusted from check_abi_tag_redeclaration.
(handle_diagnose_as_attribute): New function; copied and
adjusted from handle_abi_tag_attribute. If the given *node is a
TYPE_DECL and the TREE_TYPE is an implicit class template
instantiation, call decl_attributes to add the diagnose_as
attribute to the TREE_TYPE.
---
 gcc/c-family/c.opt   |   4 ++
 gcc/cp/error.c   |  85 ---
 gcc/cp/name-lookup.c |  27 ++
 gcc/cp/tree.c| 117 +++
 gcc/doc/extend.texi  |  37 ++
 gcc/doc/invoke.texi  |   9 +++-
 6 files changed, 270 insertions(+), 9 deletions(-)


--
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 3f8b72cdc00..0cf01c6dba4 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1582,6 +1582,10 @@ fdiagnostics-show-template-tree
 C++ ObjC++ Var(flag_diagnostics_show_template_tree) Init(0)
 Print hierarchical comparisons when template types are mismatched.
 
+fdiagnostics-use-aliases
+C++ Var(flag_diagnostics_use_aliases) Init(1)
+Replace identifiers or scope names in diagnostics as defined by the diagnose_as attribute.
+
 fdirectives-only
 C ObjC C++ ObjC++
 Preprocess directives only.
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index c88d1749a0f..10b547afaa7 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "internal-fn.h"
 #include "gcc-rich-location.h"
 #include "cp-name-hint.h"
+#include "attribs.h"
 
 #define pp_separate_with_comma(PP) pp_cxx_separate_with (PP, ',')
 #define pp_separate_with_semicolon(PP) pp_cxx_separate_with (PP, ';')
@@ -66,6 +67,7 @@ static void dump_alias_template_specialization (cxx_pretty_printer *, tree, int)
 static void dump_type (cxx_pretty_printer *, tree, int);
 static void dump_typename (cxx_pretty_printer *, tree, int);
 static void dump_simple_decl (cxx_pretty_printer *, tree, tree, int);
+static void dump_decl_name_or_diagnose_as (cxx_pretty_printer *, tree, int);
 static void dump_decl (cxx_pretty_printer *, tree, int);
 stati

Re: [PATCH 4/4] libstdc++: More efficient last day of month.

2021-02-23 Thread Matthias Kretz
I like the idea.

On Tuesday, 23 February 2021 14:25:10 CET Cassio Neri via Libstdc++ wrote:
> ((__m ^ (__m >> 3)) & 1) | 30

Note that you can drop the `& 1` part. 30 in binary is 0b11110, so ORing it
with a value in [0, 0b01101] can only affect the last bit.
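
To make the argument concrete, here is a small sketch that compares both forms for every month. The function names are made up for illustration, and February is assumed to be handled separately by the caller, as the formula alone yields 30 for it.

```cpp
#include <cassert>

// Hypothetical names for illustration; m is the month number, 1..12.
constexpr unsigned last_day_masked(unsigned m)
{ return ((m ^ (m >> 3)) & 1) | 30; }   // expression as quoted above

constexpr unsigned last_day_unmasked(unsigned m)
{ return (m ^ (m >> 3)) | 30; }         // same, without the "& 1"

// m ^ (m >> 3) is at most 0b01101 for m in [1, 12], and 30 == 0b11110
// already sets bits 1..4, so only bit 0 of the left operand can matter.
constexpr bool variants_agree()
{
  for (unsigned m = 1; m <= 12; ++m)
    if (last_day_masked(m) != last_day_unmasked(m))
      return false;
  return true;
}

static_assert(variants_agree(), "dropping '& 1' must not change the result");

// Runtime spot check; February (m == 2) yields 30 here and is assumed to be
// special-cased elsewhere.
inline void check_last_day()
{
  assert(last_day_unmasked(1) == 31);
  assert(last_day_unmasked(4) == 30);
  assert(last_day_unmasked(8) == 31);
  assert(last_day_unmasked(9) == 30);
}
```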

-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [PATCH] libstdc++: Don't use reserved identifiers in simd headers

2021-02-01 Thread Matthias Kretz
On Monday, 1 February 2021 13:21:33 CET Rainer Orth wrote:
> Two simd tests FAIL on Solaris, both SPARC and x86:
> 
> FAIL: experimental/simd/standard_abi_usable.cc -msse2 -O2 -Wno-psabi (test for excess errors)
> FAIL: experimental/simd/standard_abi_usable_2.cc -msse2 -O2 -Wno-psabi (test for excess errors)
> 
> This happens because the simd headers use identifiers documented in the
> libstdc++ manual as reserved by system headers.

Sorry, this code was originally written as non-stdlib code, i.e. without any 
reserved identifiers. I had hoped I had found all issues...

> Fixed as follows, tested on i386-pc-solaris2.11, sparc-sun-solaris2.11,
> and x86_64-pc-linux-gnu.
> 
> Ok for master?

Looks good to me.

> As an aside, the use of vim: markers initially confused the hell out of
> me.  As an Emacs user, I rarely use vi for much more than a pager, but
> when I wanted to check the lines mentioned in the g++ errors, I had no
> idea what was going on or how to disable the folding enabled there:
> 
> // vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
> 
> I can't help but feel that this is just a personal preference and
> doesn't belong into the upstream code.

Yes. I guess it's better to remove at least foldmethod. The rest isn't 
personal preference, but coding style requirements. However, I don't need any 
of it anymore: by now my vim config autodetects GCC / libstdc++ code. If the 
rest of libstdc++ doesn't have it, the simd headers probably shouldn't have it 
either.

Best,
  Matthias

-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [PATCH 14/16] Implement hmin and hmax

2021-02-01 Thread Matthias Kretz
On Wednesday, 27 January 2021 21:42:50 CET Matthias Kretz wrote:
> --- a/libstdc++-v3/include/experimental/bits/simd.h
> +++ b/libstdc++-v3/include/experimental/bits/simd.h
> @@ -204,6 +204,27 @@ template 
>  template 
>using _SizeConstant = integral_constant;
> 
> +namespace __detail {
> +  struct _Minimum {
> +template <typename _Tp>
> +  _GLIBCXX_SIMD_INTRINSIC constexpr
> +  _Tp
> +  operator()(_Tp __a, _Tp __b) const {

Reviewing my own patch :) This needs line breaks before { for namespace, 
struct, and operator(). And another line break before the next struct. New 
patch attached.

From: Matthias Kretz 

From 9.7.4 in Parallelism TS 2. For some reason I overlooked these two
functions. Implement them via call to _S_reduce.

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Add __detail::_Minimum and
__detail::_Maximum to use them as _BinaryOperation to _S_reduce.
Add hmin and hmax overloads for simd and const_where_expression.
* include/experimental/bits/simd_scalar.h
(_SimdImplScalar::_S_reduce): Make unused _BinaryOperation
parameter const-ref to allow calling _S_reduce with an rvalue.
* testsuite/experimental/simd/tests/reductions.cc: Add tests for
hmin and hmax. Since the compiler statically determined that all
tests pass, repeat the test after a call to make_value_unknown.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 14179491f9d..a90cb3b2d98 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -204,6 +204,33 @@ template 
 template 
   using _SizeConstant = integral_constant;
 
+namespace __detail
+{
+  struct _Minimum
+  {
+template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  _Tp
+  operator()(_Tp __a, _Tp __b) const
+  {
+	using std::min;
+	return min(__a, __b);
+  }
+  };
+
+  struct _Maximum
+  {
+template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  _Tp
+  operator()(_Tp __a, _Tp __b) const
+  {
+	using std::max;
+	return max(__a, __b);
+  }
+  };
+} // namespace __detail
+
 // unrolled/pack execution helpers
 // __execute_n_times{{{
 template 
@@ -3408,7 +3435,7 @@ template 
 
 // }}}1
 // reductions [simd.reductions] {{{1
-  template >
+template >
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
   reduce(const simd<_Tp, _Abi>& __v,
 	 _BinaryOperation __binary_op = _BinaryOperation())
@@ -3454,6 +3481,61 @@ template 
   reduce(const const_where_expression<_M, _V>& __x, bit_xor<> __binary_op)
   { return reduce(__x, 0, __binary_op); }
 
+template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmin(const simd<_Tp, _Abi>& __v) noexcept
+  {
+return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Minimum());
+  }
+
+template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmax(const simd<_Tp, _Abi>& __v) noexcept
+  {
+return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Maximum());
+  }
+
+template <typename _M, typename _V>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmin(const const_where_expression<_M, _V>& __x) noexcept
+  {
+using _Tp = typename _V::value_type;
+constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+  __finite_max_v<_Tp>;
+#else
+  __value_or<__infinity, _Tp>(__finite_max_v<_Tp>);
+#endif
+_V __tmp = __id_elem;
+_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+__data(__get_lvalue(__x)));
+return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Minimum());
+  }
+
+template <typename _M, typename _V>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmax(const const_where_expression<_M, _V>& __x) noexcept
+  {
+using _Tp = typename _V::value_type;
+constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+  __finite_min_v<_Tp>;
+#else
+  [] {
+	if constexpr (__value_exists_v<__infinity, _Tp>)
+	  return -__infinity_v<_Tp>;
+	else
+	  return __finite_min_v<_Tp>;
+  }();
+#endif
+_V __tmp = __id_elem;
+_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+__data(__get_lvalue(__x)));
+return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Maximum());
+  }
+
 // }}}1
 // algorithms [simd.alg] {{{
 template 
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
index 7680bc39c30..7e4

[PATCH 16/16] Improve "find_first/last_set" for NEON

2021-01-27 Thread Matthias Kretz
From: yaozhongxiao 

The find_first_set and find_last_set implementations are not optimal for
NEON. On aarch64 they can be improved by using a horizontal add (vaddv),
which reduces the generated assembly code. In the following case,
vaddvq_s16 generates 2 instructions whereas the chained vpaddq_s16 calls
generate 4 instructions:
```
 # vaddvq_s16
vaddvq_s16(__asint);
//  addv h0, v1.8h
//  smov w1, v0.h[0]
 # vpaddq_s16
vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero), __zero)[0]
// addp v1.8h,v1.8h,v2.8h
// addp v1.8h,v1.8h,v2.8h
// addp v1.8h,v1.8h,v2.8h
// smov w1, v1.h[0]
 #
```
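
As a portable illustration of why one horizontal add is enough, the following sketch models the mask-to-bits step with a plain array instead of NEON registers. The function name and the array-based representation are assumptions for this example, not part of the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, portable stand-in for the NEON code path: lane i of a mask
// is -1 (all bits set) when true and 0 when false.
template <std::size_t N>
  int mask_to_bits(const std::array<std::int16_t, N>& mask)
  {
    static_assert(N <= 8, "sketch limited to at most 8 lanes");
    int sum = 0;
    for (std::size_t i = 0; i < N; ++i)
      {
        // Corresponds to `__asint &= __bitsel`: keep only bit i in lane i.
        const auto lane = static_cast<std::int16_t>(mask[i] & (1 << i));
        // Horizontal add: distinct powers of two never carry into each
        // other, so the sum equals the bitwise OR of all lanes.
        sum += lane;
      }
    return sum;
  }
```

The pairwise-add chain and the single across-vector add compute the same sum; the latter simply needs fewer instructions on aarch64.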

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_neon.h: Replace repeated vpadd
calls with a single vaddv for aarch64.
---
 .../include/experimental/bits/simd_neon.h   | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd_neon.h b/libstdc++-v3/include/experimental/bits/simd_neon.h
index a3a8ffe165f..0b8ccc17513 100644
--- a/libstdc++-v3/include/experimental/bits/simd_neon.h
+++ b/libstdc++-v3/include/experimental/bits/simd_neon.h
@@ -311,8 +311,7 @@ struct _MaskImplNeonMixin
  });
  __asint &= __bitsel;
 #ifdef __aarch64__
- return vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero),
-   __zero)[0];
+ return vaddvq_s16(__asint);
 #else
  return vpadd_s16(
vpadd_s16(vpadd_s16(__lo64(__asint), __hi64(__asint)), __zero),
@@ -328,7 +327,7 @@ struct _MaskImplNeonMixin
  });
  __asint &= __bitsel;
 #ifdef __aarch64__
- return vpaddq_s32(vpaddq_s32(__asint, __zero), __zero)[0];
+ return vaddvq_s32(__asint);
 #else
  return vpadd_s32(vpadd_s32(__lo64(__asint), __hi64(__asint)),
   __zero)[0];
@@ -351,8 +350,12 @@ struct _MaskImplNeonMixin
return static_cast<_I>(__i < _Np ? 1 << __i : 0);
  });
  __asint &= __bitsel;
+#ifdef __aarch64__
+ return vaddv_s8(__asint);
+#else
  return vpadd_s8(vpadd_s8(vpadd_s8(__asint, __zero), __zero),
  __zero)[0];
+#endif
}
  else if constexpr (sizeof(_Tp) == 2)
{
@@ -362,12 +365,20 @@ struct _MaskImplNeonMixin
return static_cast<_I>(__i < _Np ? 1 << __i : 0);
  });
  __asint &= __bitsel;
+#ifdef __aarch64__
+ return vaddv_s16(__asint);
+#else
  return vpadd_s16(vpadd_s16(__asint, __zero), __zero)[0];
+#endif
}
  else if constexpr (sizeof(_Tp) == 4)
{
  __asint &= __make_vector<_I>(0x1, 0x2);
+#ifdef __aarch64__
+ return vaddv_s32(__asint);
+#else
  return vpadd_s32(__asint, __zero)[0];
+#endif
}
  else
__assert_unreachable<_Tp>();
-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──






[PATCH 15/16] Work around test failures using -fno-tree-vrp

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

This is necessary to avoid failures resulting from PR98834.

libstdc++-v3/ChangeLog:
* testsuite/Makefile.am: Warn about the workaround. Add
-fno-tree-vrp to CXXFLAGS passed to the check_simd script.
Improve initial user feedback from make check-simd.
* testsuite/Makefile.in: Regenerated.
---
 libstdc++-v3/testsuite/Makefile.am | 4 +++-
 libstdc++-v3/testsuite/Makefile.in | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index 2d3ad481dba..ba5023a8b54 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -191,8 +191,10 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
${glibcxx_srcdir}/scripts/check_simd \
testsuite_files_simd \
${glibcxx_builddir}/scripts/testsuite_flags
+   @echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834."
@rm -f .simd.summary
-   ${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
+   @echo "Generating simd testsuite subdirs and Makefiles ..."
+   @${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
  while read subdir; do \
$(MAKE) -C "$${subdir}"; \
tail -n20 $${subdir}/simd_testsuite.sum | \
diff --git a/libstdc++-v3/testsuite/Makefile.in b/libstdc++-v3/testsuite/Makefile.in
index ac6207ae75c..c9dd7f5da61 100644
--- a/libstdc++-v3/testsuite/Makefile.in
+++ b/libstdc++-v3/testsuite/Makefile.in
@@ -716,8 +716,10 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
${glibcxx_srcdir}/scripts/check_simd \
testsuite_files_simd \
${glibcxx_builddir}/scripts/testsuite_flags
+   @echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834."
@rm -f .simd.summary
-   ${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
+   @echo "Generating simd testsuite subdirs and Makefiles ..."
+   @${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
  while read subdir; do \
    $(MAKE) -C "$${subdir}"; \
tail -n20 $${subdir}/simd_testsuite.sum | \
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──



[PATCH 14/16] Implement hmin and hmax

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

From 9.7.4 in Parallelism TS 2. For some reason I overlooked these two
functions. Implement them via call to _S_reduce.
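
For readers unfamiliar with masked reductions, here is a scalar sketch of the idea behind the const_where_expression overloads: inactive lanes are replaced by an identity element before reducing. The function name and the std::array-based interface are assumptions for illustration only; the patch itself goes through _S_masked_assign and _S_reduce.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical scalar model of hmin(where(mask, v)): inactive lanes are
// replaced by an identity element that can never win the minimum.
template <typename T, std::size_t N>
  T masked_hmin(const std::array<bool, N>& mask, const std::array<T, N>& v)
  {
    // Prefer +infinity when the type has it, otherwise the largest finite
    // value (the patch additionally switches on __FINITE_MATH_ONLY__).
    constexpr T id_elem = std::numeric_limits<T>::has_infinity
                            ? std::numeric_limits<T>::infinity()
                            : std::numeric_limits<T>::max();
    T result = id_elem;
    for (std::size_t i = 0; i < N; ++i)
      {
        const T lane = mask[i] ? v[i] : id_elem; // models _S_masked_assign
        result = lane < result ? lane : result;  // models _Minimum
      }
    return result;
  }
```

With an all-false mask the result is the identity element itself, which mirrors what the masked overloads in the patch return in that case.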

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Add __detail::_Minimum and
__detail::_Maximum to use them as _BinaryOperation to _S_reduce.
Add hmin and hmax overloads for simd and const_where_expression.
* include/experimental/bits/simd_scalar.h
(_SimdImplScalar::_S_reduce): Make unused _BinaryOperation
parameter const-ref to allow calling _S_reduce with an rvalue.
* testsuite/experimental/simd/tests/reductions.cc: Add tests for
hmin and hmax. Since the compiler statically determined that all
tests pass, repeat the test after a call to make_value_unknown.
---
 libstdc++-v3/include/experimental/bits/simd.h | 78 ++-
 .../include/experimental/bits/simd_scalar.h   |  2 +-
 .../experimental/simd/tests/reductions.cc | 21 +
 3 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 14179491f9d..f08ef4c027d 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -204,6 +204,27 @@ template 
 template 
   using _SizeConstant = integral_constant;
 
+namespace __detail {
+  struct _Minimum {
+template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  _Tp
+  operator()(_Tp __a, _Tp __b) const {
+   using std::min;
+   return min(__a, __b);
+  }
+  };
+  struct _Maximum {
+template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  _Tp
+  operator()(_Tp __a, _Tp __b) const {
+   using std::max;
+   return max(__a, __b);
+  }
+  };
+} // namespace __detail
+
 // unrolled/pack execution helpers
 // __execute_n_times{{{
 template 
@@ -3408,7 +3429,7 @@ template 
 
 // }}}1
 // reductions [simd.reductions] {{{1
-  template >
+template >
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
   reduce(const simd<_Tp, _Abi>& __v,
 _BinaryOperation __binary_op = _BinaryOperation())
@@ -3454,6 +3475,61 @@ template 
   reduce(const const_where_expression<_M, _V>& __x, bit_xor<> __binary_op)
   { return reduce(__x, 0, __binary_op); }
 
+template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmin(const simd<_Tp, _Abi>& __v) noexcept
+  {
+return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Minimum());
+  }
+
+template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmax(const simd<_Tp, _Abi>& __v) noexcept
+  {
+return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Maximum());
+  }
+
+template <typename _M, typename _V>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmin(const const_where_expression<_M, _V>& __x) noexcept
+  {
+using _Tp = typename _V::value_type;
+constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+  __finite_max_v<_Tp>;
+#else
+  __value_or<__infinity, _Tp>(__finite_max_v<_Tp>);
+#endif
+_V __tmp = __id_elem;
+_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+   __data(__get_lvalue(__x)));
+return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Minimum());
+  }
+
+template <typename _M, typename _V>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmax(const const_where_expression<_M, _V>& __x) noexcept
+  {
+using _Tp = typename _V::value_type;
+constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+  __finite_min_v<_Tp>;
+#else
+  [] {
+   if constexpr (__value_exists_v<__infinity, _Tp>)
+ return -__infinity_v<_Tp>;
+   else
+ return __finite_min_v<_Tp>;
+  }();
+#endif
+_V __tmp = __id_elem;
+_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+   __data(__get_lvalue(__x)));
+return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Maximum());
+  }
+
 // }}}1
 // algorithms [simd.alg] {{{
 template 
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
index 7680bc39c30..7e480ecdb37 100644
--- a/libstdc++-v3/include/experimental/bits/simd_scalar.h
+++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h
@@ -182,7 +182,7 @@ struct _SimdImplScalar
   // _S_reduce {{{2
   template 
 static constexpr inline _Tp
-_S_reduce(const simd<_Tp, simd_abi::scalar>& __x, _BinaryOperation&)
+_S_reduce(const simd<_Tp, simd_abi::scalar>& __x, const _BinaryOperation&)
 { return __x._M_data; }
 
   // _S_min, _S_max {{{2
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
index 9d897d5ccd6..02df68fafbc 100644
--- a/libstdc++-v3

[PATCH 13/16] Improve test codegen for interpreting assembly

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

In many failure cases it is helpful to inspect the instructions leading
up to the test failure. After this change, both the failure location and
the branch taken after the failure are easier to find in the disassembly.
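
The pattern can be sketched in isolation as follows. This is a hypothetical miniature (class name, members, and the failure counter are invented for the example), not the testsuite's verify class, and it assumes a compiler that understands the gnu::always_inline attribute.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical miniature of the pattern; not the testsuite's verify class.
class checked
{
  const bool m_failed;

public:
  // The constructor, and thus the `if (m_failed)` test, is forced inline,
  // so the branch shows up directly at the call site in the disassembly.
  [[gnu::always_inline]]
  checked(bool ok, const char* cond, int& failures)
  : m_failed(!ok)
  {
    if (m_failed)
      // Wrapping the slow path in an immediately invoked lambda allows the
      // compiler to keep the reporting code out of line as a function call.
      [&] {
        std::fprintf(stderr, "Assertion '%s' failed.\n", cond);
        ++failures;
      }();
  }

  bool failed() const { return m_failed; }
};
```

Whether the lambda body actually stays out of line is up to the optimizer; the attribute only guarantees that the enclosing function is inlined.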

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/tests/bits/verify.h (verify): Add
instruction pointer data member. Ensure that the `if (m_failed)`
branch is always inlined into the calling code. The body of the
conditional can still be a function call. Move the get_ip call
into the verify ctor to simplify the ctor calls.
(COMPARE): Don't mention the use of all_of for reduction of a
simd_mask. It only distracts from the real issue.
---
 .../experimental/simd/tests/bits/verify.h | 44 +--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
index 5da47b35536..17bda71b77e 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
@@ -60,6 +60,7 @@ template 
 class verify
 {
   const bool m_failed = false;
+  size_t m_ip = 0;
 
   template ()
@@ -129,20 +130,21 @@ class verify
 
 public:
   template 
-verify(bool ok, size_t ip, const char* file, const int line,
+[[gnu::always_inline]]
+verify(bool ok, const char* file, const int line,
   const char* func, const char* cond, const Ts&... extra_info)
-: m_failed(!ok)
+: m_failed(!ok), m_ip(get_ip())
 {
   if (m_failed)
-   {
+   [&] {
  __builtin_fprintf(stderr, "%s:%d: (%s):\nInstruction Pointer: %x\n"
"Assertion '%s' failed.\n",
-   file, line, func, ip, cond);
+   file, line, func, m_ip, cond);
  (print(extra_info, int()), ...);
-   }
+   }();
 }
 
-  ~verify()
+  [[gnu::always_inline]] ~verify()
   {
 if (m_failed)
   {
@@ -152,26 +154,27 @@ public:
   }
 
   template 
+[[gnu::always_inline]]
 const verify&
 operator<<(const T& x) const
 {
   if (m_failed)
-   {
- print(x, int());
-   }
+   print(x, int());
   return *this;
 }
 
   template 
+[[gnu::always_inline]]
 const verify&
 on_failure(const Ts&... xs) const
 {
   if (m_failed)
-   (print(xs, int()), ...);
+   [&] { (print(xs, int()), ...); }();
   return *this;
 }
 
-  [[gnu::always_inline]] static inline size_t
+  [[gnu::always_inline]] static inline
+  size_t
   get_ip()
   {
 size_t _ip = 0;
@@ -220,24 +223,21 @@ template 
 
 #define COMPARE(_a, _b) \
   [&](auto&& _aa, auto&& _bb) { \
-return verify(std::experimental::all_of(_aa == _bb), verify::get_ip(), \
- __FILE__, __LINE__, __PRETTY_FUNCTION__, \
- "all_of(" #_a " == " #_b ")", #_a " = ", _aa, \
+return verify(std::experimental::all_of(_aa == _bb), __FILE__, __LINE__, \
+ __PRETTY_FUNCTION__, #_a " == " #_b, #_a " = ", _aa, \
  "\n" #_b " = ", _bb); \
   }(force_fp_truncation(_a), force_fp_truncation(_b))
 #else
 #define COMPARE(_a, _b) \
   [&](auto&& _aa, auto&& _bb) { \
-return verify(std::experimental::all_of(_aa == _bb), verify::get_ip(), \
- __FILE__, __LINE__, __PRETTY_FUNCTION__, \
- "all_of(" #_a " == " #_b ")", #_a " = ", _aa, \
+return verify(std::experimental::all_of(_aa == _bb), __FILE__, __LINE__, \
+ __PRETTY_FUNCTION__, #_a " == " #_b, #_a " = ", _aa, \
  "\n" #_b " = ", _bb); \
   }((_a), (_b))
 #endif
 
 #define VERIFY(_test) \
-  verify(_test, verify::get_ip(), __FILE__, __LINE__, __PRETTY_FUNCTION__, \
-#_test)
+  verify(_test, __FILE__, __LINE__, __PRETTY_FUNCTION__, #_test)
 
   // ulp_distance_signed can raise FP exceptions and thus must be conditionally
   // executed
@@ -245,9 +245,9 @@ template 
   [&](auto&& _aa, auto&& _bb) { \
 const bool success = std::experimental::all_of( \
   vir::test::ulp_distance(_aa, _bb

[PATCH 12/16] Support timeout and timeout-factor options

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh: Abstract reading test
options into read_src_option function. Read skip, only,
expensive, and xfail via read_src_option. Add timeout and
timeout-factor options and adjust timeout variable accordingly.
* testsuite/experimental/simd/tests/loadstore.cc: Set
timeout-factor 2.
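
The option lookup described above can be modelled in C++ for illustration. This is a hypothetical re-expression of the shell logic, not code from the patch: it returns only the first match, trims surrounding blanks, and the 180-second base timeout used in the usage note below is an assumed value.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical C++ rendering of the driver's read_src_option shell function:
// look for "// <key>: <value>" within the first 25 lines of a test source.
inline std::optional<std::string>
read_src_option(const std::string& source, const std::string& key)
{
  std::istringstream in(source);
  std::string line;
  const std::string marker = key + ": ";
  for (int n = 0; n < 25 && std::getline(in, line); ++n)
    {
      if (line.compare(0, 2, "//") != 0)
        continue;
      const std::size_t pos = line.find_first_not_of(" \t", 2);
      if (pos == std::string::npos
            || line.compare(pos, marker.size(), marker) != 0)
        continue;
      const std::string value = line.substr(pos + marker.size());
      // Trim surrounding blanks (the script also squeezes inner runs).
      const std::size_t first = value.find_first_not_of(' ');
      if (first == std::string::npos)
        return std::string();
      const std::size_t last = value.find_last_not_of(' ');
      return value.substr(first, last - first + 1);
    }
  return std::nullopt;
}

// timeout-factor scales the base timeout, truncating like awk's int().
inline int scaled_timeout(int timeout, double factor)
{ return static_cast<int>(timeout * factor); }
```

For a source containing `// timeout-factor: 2`, `read_src_option(src, "timeout-factor")` yields "2", and `scaled_timeout(180, 2)` doubles the assumed base timeout.
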
---
 .../testsuite/experimental/simd/driver.sh | 38 +--
 .../experimental/simd/tests/loadstore.cc  |  1 +
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index 719e4db8e68..71e0c7d5ee8 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -214,35 +214,43 @@ trap "rm -f '$log' '$sum' $exe; exit" INT
 rm -f "$log" "$sum"
 touch "$log" "$sum"
 
-skip="$(head -n25 "$src" | grep '^//\s*skip: ')"
-if [ -n "$skip" ]; then
-  skip="$(echo "$skip" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')"
+read_src_option() {
+  local key tmp var
+  key="$1"
+  var="$2"
+  [ -z "$var" ] && var="$1"
+  local tmp="$(head -n25 "$src" | grep "^//\\s*${key}: ")"
+  if [ -n "$tmp" ]; then
+tmp="$(echo "${tmp#//*${key}: }" | sed -e 's/ \+/ /g' -e 's/^ //' -e 's/ $//')"
+eval "$var=\"$tmp\""
+  else
+return 1
+  fi
+}
+
+if read_src_option skip; then
   if test_selector "$skip"; then
 # silently skip this test
 exit 0
   fi
 fi
-only="$(head -n25 "$src" | grep '^//\s*only: ')"
-if [ -n "$only" ]; then
-  only="$(echo "$only" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')"
+if read_src_option only; then
   if ! test_selector "$only"; then
 # silently skip this test
 exit 0
   fi
 fi
+
 if ! $run_expensive; then
-  expensive="$(head -n25 "$src" | grep '^//\s*expensive: ')"
-  if [ -n "$expensive" ]; then
-expensive="$(echo "$expensive" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')"
+  if read_src_option expensive; then
 if test_selector "$expensive"; then
   unsupported "skip expensive tests"
   exit 0
 fi
   fi
 fi
-xfail="$(head -n25 "$src" | grep '^//\s*xfail: ')"
-if [ -n "$xfail" ]; then
-  xfail="$(echo "$xfail" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')"
+
+if read_src_option xfail; then
   if test_selector "${xfail#* }"; then
 xfail="${xfail%% *}"
   else
@@ -250,6 +258,12 @@ if [ -n "$xfail" ]; then
   fi
 fi
 
+read_src_option timeout
+
+if read_src_option timeout-factor factor; then
+  timeout=$(awk "BEGIN { print int($timeout * $factor) }")
+fi
+
 log_output() {
   if $verbose; then
 maxcol=${1:-1024}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc b/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc
index dd7d6c30e8c..cd27c3a7426 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc
@@ -16,6 +16,7 @@
 // <http://www.gnu.org/licenses/>.
 
 // expensive: * [1-9] * *
+// timeout-factor: 2
 #include "bits/verify.h"
 #include "bits/make_vec.h"
 #include "bits/conversions.h"
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──






[PATCH 11/16] Abort test after 1000 lines of output

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

Handle overly large output by aborting the log and thus the test. This
is a similar condition to a timeout.
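
The behaviour of the line-limited filter can be sketched in C++ as follows. This is a hypothetical model of the awk logic (function name and stream-based interface are assumptions); like the awk script, it writes the current line before checking the limit.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical C++ model of the awk filter: copy lines to the log and give
// up with status 125 once the limit is exceeded.
inline int
copy_with_limit(std::istream& in, std::ostream& log, int max_lines = 1000)
{
  int count = 0;
  std::string line;
  while (std::getline(in, line))
    {
      log << line << '\n';
      if (count >= max_lines)
        {
          log << "Aborting: too much output\n";
          return 125; // the driver treats this status as a failure
        }
      ++count;
    }
  return 0;
}
```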

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh: When handling the pipe
to log (and on verbose to stdout) count the lines. If it exceeds
1000 log the issue and exit 125, which is then handled as a
failure.
---
 .../testsuite/experimental/simd/driver.sh   | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index 314c6a16f86..719e4db8e68 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -258,7 +258,11 @@ BEGIN { count = 0 }
 /^###exitstatus### [0-9]+$/ { exit \$2 }
 {
   print >> \"$log\"
-  if (count >= 1000) next
+  if (count >= 1000) {
+print \"Aborting: too much output\" >> \"$log\"
+print \"Aborting: too much output\"
+exit 125
+  }
   ++count
   if (length(\$0) > $maxcol) {
 i = 1
@@ -282,8 +286,17 @@ END { close(\"$log\") }
 "
   else
 awk "
+BEGIN { count = 0 }
 /^###exitstatus### [0-9]+$/ { exit \$2 }
-{ print >> \"$log\" }
+{
+  print >> \"$log\"
+  if (count >= 1000) {
+print \"Aborting: too much output\" >> \"$log\"
+print \"Aborting: too much output\"
+exit 125
+  }
+  ++count
+}
 END { close(\"$log\") }
 "
   fi
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──



[PATCH 10/16] Skip testing hypot3 for long double on PPC

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

std::hypot(a, b, c) is imprecise and makes this test fail even though
the failure is unrelated to simd.

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/tests/hypot3_fma.cc: Add skip:
markup for long double on powerpc64*.
---
 libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc
index 689a90c10a5..94d267fccfb 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc
@@ -16,6 +16,7 @@
 // <http://www.gnu.org/licenses/>.
 
 // only: float|double|ldouble * * *
+// skip: ldouble * powerpc64* *
 // expensive: * [1-9] * *
 #include "bits/verify.h"
 #include "bits/metahelpers.h"
-- 
──────
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──



[PATCH 09/16] Fix mask reduction of simd_mask on POWER7

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

POWER7 does not support __vector long long reductions, making the
generic _S_popcount implementation ill-formed. Specializing _S_popcount
for PPC allows optimization and avoids the issue.
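
The idea behind counting via a negated sum can be shown without AltiVec intrinsics. The sketch below is a scalar model under stated assumptions: a 16-byte mask, 4-byte int, and lanes that are all-ones when true; the function name is invented and only the branch for element types at least as wide as int is modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical scalar model: a 16-byte mask register whose lanes of type T
// are all-ones (true) or all-zeros (false), reinterpreted as ints.
template <typename T>
  int popcount_via_int_sum(const std::array<T, 16 / sizeof(T)>& mask)
  {
    static_assert(sizeof(T) >= sizeof(int), "models the >= int branch only");
    int as_int[16 / sizeof(int)];
    std::memcpy(as_int, mask.data(), sizeof(as_int));
    int sum = 0;
    for (int x : as_int)
      sum += x; // each true int-sized chunk contributes -1
    // A true lane of T spans sizeof(T) / sizeof(int) chunks.
    return -sum / static_cast<int>(sizeof(T) / sizeof(int));
  }
```

For element types narrower than int, the patch first sums groups of four lanes into ints (vec_sum4s) and then applies the same negated total.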

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Add __have_power10vec
conditional on _ARCH_PWR10.
* include/experimental/bits/simd_builtin.h: Forward declare
_MaskImplPpc and use it as _MaskImpl when __ALTIVEC__ is
defined.
(_MaskImplBuiltin::_S_some_of): Call _S_popcount from the
_SuperImpl for optimizations and correctness.
* include/experimental/bits/simd_ppc.h: Add _MaskImplPpc.
(_MaskImplPpc::_S_popcount): Implement via vec_cntm for POWER10.
Otherwise, for >=int use -vec_sums divided by a sizeof factor.
For  struct _MaskImplX86;
 template  struct _SimdImplNeon;
 template  struct _MaskImplNeon;
 template  struct _SimdImplPpc;
+template  struct _MaskImplPpc;
 
 // simd_abi::_VecBuiltin {{{
 template 
@@ -959,10 +960,11 @@ template 
 using _CommonImpl = _CommonImplBuiltin;
 #ifdef __ALTIVEC__
 using _SimdImpl = _SimdImplPpc<_VecBuiltin<_UsedBytes>>;
+using _MaskImpl = _MaskImplPpc<_VecBuiltin<_UsedBytes>>;
 #else
 using _SimdImpl = _SimdImplBuiltin<_VecBuiltin<_UsedBytes>>;
-#endif
 using _MaskImpl = _MaskImplBuiltin<_VecBuiltin<_UsedBytes>>;
+#endif
 #endif
 
 // }}}
@@ -2899,7 +2901,7 @@ template 
   _GLIBCXX_SIMD_INTRINSIC static bool
   _S_some_of(simd_mask<_Tp, _Abi> __k)
   {
-   const int __n_true = _S_popcount(__k);
+   const int __n_true = _SuperImpl::_S_popcount(__k);
return __n_true > 0 && __n_true < int(_S_size<_Tp>);
   }
 
diff --git a/libstdc++-v3/include/experimental/bits/simd_ppc.h b/libstdc++-v3/include/experimental/bits/simd_ppc.h
index c00d2323ac6..1d649931eb9 100644
--- a/libstdc++-v3/include/experimental/bits/simd_ppc.h
+++ b/libstdc++-v3/include/experimental/bits/simd_ppc.h
@@ -30,6 +30,7 @@
 #ifndef __ALTIVEC__
 #error "simd_ppc.h may only be included when AltiVec/VMX is available"
 #endif
+#include 
 
 _GLIBCXX_SIMD_BEGIN_NAMESPACE
 
@@ -114,10 +115,42 @@ template 
 // }}}
   };
 
+// }}}
+// _MaskImplPpc {{{
+template 
+  struct _MaskImplPpc : _MaskImplBuiltin<_Abi>
+  {
+using _Base = _MaskImplBuiltin<_Abi>;
+
+// _S_popcount {{{
+template 
+  _GLIBCXX_SIMD_INTRINSIC static int _S_popcount(simd_mask<_Tp, _Abi> __k)
+  {
+   const auto __kv = __as_vector(__k);
+   if constexpr (__have_power10vec)
+ {
+   return vec_cntm(__to_intrin(__kv), 1);
+ }
+   else if constexpr (sizeof(_Tp) >= sizeof(int))
+ {
+   using _Intrin = __intrinsic_type16_t;
+   const int __sum = -vec_sums(__intrin_bitcast<_Intrin>(__kv), _Intrin())[3];
+   return __sum / (sizeof(_Tp) / sizeof(int));
+ }
+   else
+ {
+   const auto __summed_to_int = vec_sum4s(__to_intrin(__kv), __intrinsic_type16_t());
+   return -vec_sums(__summed_to_int, __intrinsic_type16_t())[3];
+ }
+  }
+
+// }}}
+  };
+
 // }}}
 
 _GLIBCXX_SIMD_END_NAMESPACE
 #endif // __cplusplus >= 201703L
 #endif // _GLIBCXX_EXPERIMENTAL_SIMD_PPC_H_
 
-// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
+// vim: foldmethod=marker foldmarker={{{,}}} sw=2 noet ts=8 sts=2 tw=100
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──






[PATCH 08/16] Immediate feedback with -v

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh: Remove executable on
SIGINT. Process compiler and test executable output: In verbose
mode print messages immediately, limited to 1000 lines and
breaking long lines to below $COLUMNS (or 1024 if not set).
Communicating the exit status of the compiler / test with the
necessary pipe is done via a message through stdout/-in.
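
Passing an exit status through a pipe as an in-band message can be sketched as follows. This is a hypothetical C++ model of the reading end (function name and stream interface are assumptions); it relies on the `###exitstatus### N` line format that appears in the driver's awk scripts.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical C++ model of the filter at the reading end of the pipe:
// forward ordinary lines, and turn "###exitstatus### N" into a return value.
inline int forward_until_status(std::istream& in, std::ostream& out)
{
  const std::string tag = "###exitstatus### ";
  std::string line;
  while (std::getline(in, line))
    {
      if (line.compare(0, tag.size(), tag) == 0
            && line.size() > tag.size()
            && line.find_first_not_of("0123456789", tag.size())
                 == std::string::npos)
        return std::stoi(line.substr(tag.size()));
      out << line << '\n';
    }
  return 0; // no sentinel seen; mirrors awk's default exit status
}
```

The writer side only has to print the sentinel line after the compiler or test has finished, so no second channel besides stdout is needed.
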
---
 .../testsuite/experimental/simd/driver.sh | 194 +++---
 1 file changed, 116 insertions(+), 78 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index cf07ff9ad85..314c6a16f86 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -172,81 +172,14 @@ unsupported() {
   echo "UNSUPPORTED: $src $type $abiflag ($*)" >> "$log"
 }
 
-verify_compilation() {
-  failed=$1
-  if [ $failed -eq 0 ]; then
-warnings=$(grep -ic 'warning:' "$log")
-if [ $warnings -gt 0 ]; then
-  fail "excess warnings:" $warnings
-  if $verbose; then
-cat "$log"
-  elif ! $quiet; then
-grep -i 'warning:' "$log" | head -n5
-  fi
-elif [ "$xfail" = "compile" ]; then
-  xpass "test for excess errors"
-else
-  pass "test for excess errors"
-fi
-  else
-if [ $failed -eq 124 ]; then
-  fail "timeout: test for excess errors"
-else
-  errors=$(grep -ic 'error:' "$log")
-  if [ "$xfail" = "compile" ]; then
-xfail "excess errors:" $errors
-exit 0
-  else
-fail "excess errors:" $errors
-  fi
-fi
-if $verbose; then
-  cat "$log"
-elif ! $quiet; then
-  grep -i 'error:' "$log" | head -n5
-fi
-exit 0
-  fi
-}
-
-verify_test() {
-  failed=$1
-  if [ $failed -eq 0 ]; then
-rm "$exe"
-if [ "$xfail" = "run" ]; then
-  xpass "execution test"
-else
-  pass "execution test"
-fi
-  else
-$keep_failed || rm "$exe"
-if [ $failed -eq 124 ]; then
-  fail "timeout: execution test"
-elif [ "$xfail" = "run" ]; then
-  xfail "execution test"
-else
-  fail "execution test"
-fi
-if $verbose; then
-  lines=$(wc -l < "$log")
-  lines=$((lines-3))
-  if [ $lines -gt 1000 ]; then
-echo "[...]"
-tail -n1000 "$log"
-  else
-tail -n$lines "$log"
-  fi
-elif ! $quiet; then
-  grep -i fail "$log" | head -n5
-fi
-exit 0
-  fi
-}
-
 write_log_and_verbose() {
   echo "$*" >> "$log"
   if $verbose; then
-echo "$*"
+if [ -z "$COLUMNS" ] || ! type fmt>/dev/null; then
+  echo "$*"
+else
+  echo "$*" | fmt -w $COLUMNS -s - || cat
+fi
   fi
 }
 
@@ -277,7 +210,7 @@ test_selector() {
   return 1
 }
 
-trap "rm -f '$log' '$sum'; exit" INT
+trap "rm -f '$log' '$sum' $exe; exit" INT
 rm -f "$log" "$sum"
 touch "$log" "$sum"
 
@@ -317,17 +250,122 @@ if [ -n "$xfail" ]; then
   fi
 fi
 
+log_output() {
+  if $verbose; then
+maxcol=${1:-1024}
+awk "
+BEGIN { count = 0 }
+/^###exitstatus### [0-9]+$/ { exit \$2 }
+{
+  print >> \"$log\"
+  if (count >= 1000) next
+  ++count
+  if (length(\$0) > $maxcol) {
+i = 1
+while (i + $maxcol <= length(\$0)) {
+  len = $maxcol
+  line = substr(\$0, i, len)
+  len = match(line, / [^ ]*$/)
+  if (len <= 0) {
+len = match(substr(\$0, i), / [^ ]/)
+if (len <= 0) len = $maxcol
+  }
+  print substr(\$0, i, len)
+  i += len
+}
+print substr(\$0, i)
+  } else {
+print
+  }
+}
+END { close(\"$log\") }
+"
+  else
+awk "
+/^###exitstatus### [0-9]+$/ { exit \$2 }
+{ print >> \"$log\" }
+END { close(\"$log\") }
+"
+  fi
+}
+
+verify_compilation() {
+  log_output $COLUMNS
+  exitstatus=$?
+  if [ $exitstatus -eq 0 ]; then
+warnings=$(grep -ic 'warning:' "$log")
+if [ $warnings -gt 0 ]; then
+  fail "excess warnings:" $warnings
+  if

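The line-breaking step that log_output performs in awk can be sketched in C++ as follows. This is a simplified illustration, not a port: the function name wrap_line is hypothetical, and where the awk script looks ahead for the next space if a window contains none, this sketch simply breaks hard at the column limit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split `text` into pieces of at most `maxcol` characters, preferring to
// break right after the last space inside each window.
// Simplification (assumption): a window without any space is cut at maxcol.
std::vector<std::string> wrap_line(const std::string& text, std::size_t maxcol)
{
  assert(maxcol > 0);
  std::vector<std::string> out;
  std::size_t i = 0;
  while (text.size() - i > maxcol) {
    std::size_t len = maxcol;
    // Last space at or before the end of the current window.
    const std::size_t space = text.rfind(' ', i + maxcol - 1);
    if (space != std::string::npos && space >= i)
      len = space - i + 1; // keep the space at the end of the piece
    out.push_back(text.substr(i, len));
    i += len;
  }
  out.push_back(text.substr(i)); // remainder, possibly empty
  return out;
}
```

Concatenating the returned pieces always reproduces the input, which is the property that matters for a log that must not lose characters.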
[PATCH 07/16] Fix incorrect display of old test summaries

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/Makefile.am: Ensure .simd.summary is empty before
collecting a new summary.
* testsuite/Makefile.in: Regenerate.
---
 libstdc++-v3/testsuite/Makefile.am | 1 +
 libstdc++-v3/testsuite/Makefile.in | 1 +
 2 files changed, 2 insertions(+)

diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index 5dd109b40c9..2d3ad481dba 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -191,6 +191,7 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
${glibcxx_srcdir}/scripts/check_simd \
testsuite_files_simd \
${glibcxx_builddir}/scripts/testsuite_flags
+   @rm -f .simd.summary
${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
  while read subdir; do \
$(MAKE) -C "$${subdir}"; \
diff --git a/libstdc++-v3/testsuite/Makefile.in b/libstdc++-v3/testsuite/Makefile.in
index 3900d6d87b4..ac6207ae75c 100644
--- a/libstdc++-v3/testsuite/Makefile.in
+++ b/libstdc++-v3/testsuite/Makefile.in
@@ -716,6 +716,7 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
${glibcxx_srcdir}/scripts/check_simd \
testsuite_files_simd \
${glibcxx_builddir}/scripts/testsuite_flags
+   @rm -f .simd.summary
${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
  while read subdir; do \
$(MAKE) -C "$${subdir}"; \
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──



[PATCH 05/16] Fix several check-simd interaction issues

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh (verify_test): Print
test output on run xfail. Do not repeat lines from the log that
were already printed on stdout.
(test_selector): Make the compiler flags pattern usable as a
substring selector.
(toplevel): Trap on SIGINT and remove the log and sum files.
Call timeout with --foreground to quickly terminate on SIGINT.
* testsuite/experimental/simd/generate_makefile.sh: Simplify run
targets via target patterns. Default DRIVEROPTS to -v for run
targets. Remove log and sum files after completion of the run
target (so that it's always recompiled).
Place help text into text file for reasonable 'make help'
performance.
---
 .../testsuite/experimental/simd/driver.sh | 16 +++--
 .../experimental/simd/generate_makefile.sh| 70 +--
 2 files changed, 44 insertions(+), 42 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index 84f3829c2d4..cf07ff9ad85 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -224,16 +224,17 @@ verify_test() {
   fail "timeout: execution test"
 elif [ "$xfail" = "run" ]; then
   xfail "execution test"
-  exit 0
 else
   fail "execution test"
 fi
 if $verbose; then
-  if [ $(cat "$log"|wc -l) -gt 1000 ]; then
+  lines=$(wc -l < "$log")
+  lines=$((lines-3))
+  if [ $lines -gt 1000 ]; then
 echo "[...]"
 tail -n1000 "$log"
   else
-cat "$log"
+tail -n$lines "$log"
   fi
 elif ! $quiet; then
   grep -i fail "$log" | head -n5
@@ -267,7 +268,7 @@ test_selector() {
   [ -z "$target_triplet" ] && target_triplet=$($CXX -dumpmachine)
   if matches "$target_triplet" "$pat_triplet"; then
 pat_flags="${string#* }"
-if matches "$CXXFLAGS" "$pat_flags"; then
+if matches "$CXXFLAGS" "*$pat_flags*"; then
   return 0
 fi
   fi
@@ -276,6 +277,7 @@ test_selector() {
   return 1
 }
 
+trap "rm -f '$log' '$sum'; exit" INT
 rm -f "$log" "$sum"
 touch "$log" "$sum"
 
@@ -316,15 +318,15 @@ if [ -n "$xfail" ]; then
 fi
 
 write_log_and_verbose "$CXX $src $@ -D_GLIBCXX_SIMD_TESTTYPE=$type $abiflag -o $exe"
-timeout $timeout "$CXX" "$src" "$@" "-D_GLIBCXX_SIMD_TESTTYPE=$type" $abiflag -o "$exe" >> "$log" 2>&1
+timeout --foreground $timeout "$CXX" "$src" "$@" "-D_GLIBCXX_SIMD_TESTTYPE=$type" $abiflag -o "$exe" >> "$log" 2>&1
 verify_compilation $?
 if [ -n "$sim" ]; then
   write_log_and_verbose "$sim ./$exe"
-  timeout $timeout $sim "./$exe" >> "$log" 2>&1 <&-
+  timeout --foreground $timeout $sim "./$exe" >> "$log" 2>&1 <&-
 else
   write_log_and_verbose "./$exe"
   timeout=$(awk "BEGIN { print int($timeout / 2) }")
-  timeout $timeout "./$exe" >> "$log" 2>&1 <&-
+  timeout --foreground $timeout "./$exe" >> "$log" 2>&1 <&-
 fi
 verify_test $?
 
diff --git a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
index 553bc98f60b..8d642a2941a 100755
--- a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
@@ -240,7 +240,7 @@ EOF
 %-$type.log: %-$type-0.log %-$type-1.log %-$type-2.log %-$type-3.log \
 %-$type-4.log %-$type-5.log %-$type-6.log %-$type-7.log \
 %-$type-8.log %-$type-9.log
-	@cat $^ > \$@
+	@cat \$^ > \$@
 	@cat \$(^:log=sum) > \$(@:log=sum)${rmline}
 
 EOF
@@ -252,47 +252,47 @@ EOF
 EOF
 done
   done
-  echo 'run-%: export GCC_TEST_RUN_EXPENSIVE=yes'
-  all_tests | while read file && read name; do
-echo "run-$name: $name.log"
-all_types "$file" | while read t && read type; do
-  echo "run-$name-$type:

[PATCH 04/16] Fix simd_mask on POWER w/o POWER8

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Remove unnecessary static
assertion. Allow sizeof(8) integer __intrinsic_type to enable
the necessary mask type.
---
 libstdc++-v3/include/experimental/bits/simd.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 64cf8d32328..9685df0be9e 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -2292,12 +2292,6 @@ template 
 #ifndef __VSX__
 static_assert(!is_same_v<_Tp, double>,
  "no __intrinsic_type support for double on PPC w/o VSX");
-#endif
-#ifndef __POWER8_VECTOR__
-static_assert(
-  !(is_integral_v<_Tp> && sizeof(_Tp) > 4),
-  "no __intrinsic_type support for integers larger than 4 Bytes "
-  "on PPC w/o POWER8 vectors");
 #endif
 using type =
   typename __intrinsic_type_impl<
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──



[PATCH 06/16] Fix DRIVEROPTS and TESTFLAGS processing

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/generate_makefile.sh: Use
different variables internally than documented for user
overrides. This makes internal append/prepend work as intended.
---
 .../testsuite/experimental/simd/generate_makefile.sh  | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
index 8d642a2941a..4fb710c7767 100755
--- a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
@@ -85,19 +85,20 @@ CXX="$1"
 shift
 
 echo "TESTFLAGS ?=" > "$dst"
-[ -n "$testflags" ] && echo "TESTFLAGS := $testflags \$(TESTFLAGS)" >> "$dst"
-echo CXXFLAGS = "$@" "\$(TESTFLAGS)" >> "$dst"
+echo "test_flags := $testflags \$(TESTFLAGS)" >> "$dst"
+echo CXXFLAGS = "$@" "\$(test_flags)" >> "$dst"
 [ -n "$sim" ] && echo "export GCC_TEST_SIMULATOR = $sim" >> "$dst"
 cat >> "$dst" <https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──



[PATCH 03/16] Support -mlong-double-64 on PPC

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Let __intrinsic_type be valid if sizeof(long double) == sizeof(double) and
use a __vector double as member type.
---
 libstdc++-v3/include/experimental/bits/simd.h | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index d56176210df..64cf8d32328 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -2285,7 +2285,9 @@ template 
   struct __intrinsic_type<_Tp, _Bytes,
  enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
   {
-static_assert(!is_same_v<_Tp, long double>,
+static constexpr bool _S_is_ldouble = is_same_v<_Tp, long double>;
+// allow _Tp == long double with -mlong-double-64
+static_assert(!(_S_is_ldouble && sizeof(long double) > sizeof(double)),
  "no __intrinsic_type support for long double on PPC");
 #ifndef __VSX__
 static_assert(!is_same_v<_Tp, double>,
@@ -2297,8 +2299,11 @@ template 
   "no __intrinsic_type support for integers larger than 4 Bytes "
   "on PPC w/o POWER8 vectors");
 #endif
-using type = typename __intrinsic_type_impl, _Tp, __int_for_sizeof_t<_Tp>>>::type;
+using type =
+  typename __intrinsic_type_impl<
+conditional_t,
+  conditional_t<_S_is_ldouble, double, _Tp>,
+  __int_for_sizeof_t<_Tp>>>::type;
   };
 #endif // __ALTIVEC__
 
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
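
The type substitution introduced by this patch can be illustrated in isolation. The sketch below is a hedged reduction with hypothetical names (vector_element, supported); unlike the patch it exposes the check as a boolean instead of a static_assert, so that it compiles on targets where long double is wider than double.

```cpp
#include <type_traits>

// Hypothetical reduction of the idea: if long double has the same size as
// double (e.g. with -mlong-double-64), treat it as double when choosing
// the vector element type.
template <typename T>
struct vector_element
{
  static constexpr bool is_ldouble = std::is_same_v<T, long double>;

  // The patch rejects the unsupported case with a static_assert; here it
  // is only reported through a flag.
  static constexpr bool supported
    = !(is_ldouble && sizeof(long double) > sizeof(double));

  // long double maps to double, every other type maps to itself.
  using type = std::conditional_t<is_ldouble, double, T>;
};

template <typename T>
using vector_element_t = typename vector_element<T>::type;
```

On a target where sizeof(long double) == sizeof(double), vector_element<long double>::supported is true and the element type is double, which mirrors the "__vector double as member type" wording of the ChangeLog entry.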



[PATCH 02/16] Fix NEON intrinsic types usage

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

Intrinsic types for NEON now differ from gnu::vector_size types. This
requires explicit specializations for __intrinsic_type and a new
__is_intrinsic_type trait.

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h (__is_intrinsic_type): New
internal type trait. Alias for __is_vector_type on x86.
(_VectorTraitsImpl): Enable for __intrinsic_type in addition to
__vector_type.
(__intrin_bitcast): Allow casting to & from vector & intrinsic
types.
(__intrinsic_type): Explicitly specialize for NEON intrinsic
vector types.
---
 libstdc++-v3/include/experimental/bits/simd.h | 70 +--
 1 file changed, 66 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 00eec50d64f..d56176210df 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -1379,13 +1379,35 @@ template 
 template 
   inline constexpr bool __is_vector_type_v = __is_vector_type<_Tp>::value;
 
+// }}}
+// __is_intrinsic_type {{{
+#if _GLIBCXX_SIMD_HAVE_SSE_ABI
+template 
+  using __is_intrinsic_type = __is_vector_type<_Tp>;
+#else // not SSE (x86)
+template >
+  struct __is_intrinsic_type : false_type {};
+
+template 
+  struct __is_intrinsic_type<
+_Tp, void_t()[0])>, 
sizeof(_Tp)>::type>>
+: is_same<_Tp, typename __intrinsic_type<
+remove_reference_t()[0])>,
+sizeof(_Tp)>::type> {};
+#endif
+
+template 
+  inline constexpr bool __is_intrinsic_type_v = __is_intrinsic_type<_Tp>::value;
+
 // }}}
 // _VectorTraits{{{
 template >
   struct _VectorTraitsImpl;
 
 template 
-  struct _VectorTraitsImpl<_Tp, enable_if_t<__is_vector_type_v<_Tp>>>
+  struct _VectorTraitsImpl<_Tp, enable_if_t<__is_vector_type_v<_Tp>
+ || __is_intrinsic_type_v<_Tp>>>
   {
 using type = _Tp;
 using value_type = remove_reference_t()[0])>;
@@ -1457,7 +1479,8 @@ template 
   _GLIBCXX_SIMD_INTRINSIC constexpr _To
   __intrin_bitcast(_From __v)
   {
-static_assert(__is_vector_type_v<_From> && __is_vector_type_v<_To>);
+static_assert((__is_vector_type_v<_From> || __is_intrinsic_type_v<_From>)
+   && (__is_vector_type_v<_To> || __is_intrinsic_type_v<_To>));
 if constexpr (sizeof(_To) == sizeof(_From))
   return reinterpret_cast<_To>(__v);
 else if constexpr (sizeof(_From) > sizeof(_To))
@@ -2183,16 +2206,55 @@ template 
 #endif // _GLIBCXX_SIMD_HAVE_SSE_ABI
 // __intrinsic_type (ARM){{{
 #if _GLIBCXX_SIMD_HAVE_NEON
+template <>
+  struct __intrinsic_type
+  { using type = float32x2_t; };
+
+template <>
+  struct __intrinsic_type
+  { using type = float32x4_t; };
+
+#if _GLIBCXX_SIMD_HAVE_NEON_A64
+template <>
+  struct __intrinsic_type
+  { using type = float64x1_t; };
+
+template <>
+  struct __intrinsic_type
+  { using type = float64x2_t; };
+#endif
+
+#define _GLIBCXX_SIMD_ARM_INTRIN(_Bits, _Np)                                   \
+template <>                                                                    \
+  struct __intrinsic_type<__int_with_sizeof_t<_Bits / 8>,                      \
+ _Np * _Bits / 8, void>   \
+  { using type = int##_Bits##x##_Np##_t; };                                    \
+template <>                                                                    \
+  struct __intrinsic_type>,                                                    \
+ _Np * _Bits / 8, void>   \
+  { using type = uint##_Bits##x##_Np##_t; }
+_GLIBCXX_SIMD_ARM_INTRIN(8, 8);
+_GLIBCXX_SIMD_ARM_INTRIN(8, 16);
+_GLIBCXX_SIMD_ARM_INTRIN(16, 4);
+_GLIBCXX_SIMD_ARM_INTRIN(16, 8);
+_GLIBCXX_SIMD_ARM_INTRIN(32, 2);
+_GLIBCXX_SIMD_ARM_INTRIN(32, 4);
+_GLIBCXX_SIMD_ARM_INTRIN(64, 1);
+_GLIBCXX_SIMD_ARM_INTRIN(64, 2);
+#undef _GLIBCXX_SIMD_ARM_INTRIN
+
 template 
   struct __intrinsic_type<_Tp, _Bytes,
  enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
   {
-static constexpr int _S_VBytes = _Bytes <= 8 ? 8 : 16;
+static constexpr int _SVecBytes = _Bytes <= 8 ? 8 : 16;
 using _Ip = __int_for_sizeof_t<_Tp>;
 using _Up = conditional_t<
   is_floating_point_v<_Tp>, _Tp,
   conditional_t, make_unsigned_t<_Ip>, _Ip>>;
-using type [[__gnu__::__vector_size__(_S_VBytes)]] = _Up;
+static_assert(!is_same_v<_Tp, _Up> || _SVecBytes != _Bytes,
+ "should use explicit specialization above");
+using type = typename __intrinsic_type<_Up, _SVecBytes>::type;
   };
 #endif // _GLIBCXX_SIMD_H

[PATCH 01/16] Support skip, only, expensive, and xfail markers

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh: Implement skip, only,
expensive, and xfail markers. They can select on type, ABI tag
subset number, target-triplet, and compiler flags.
* testsuite/experimental/simd/generate_makefile.sh: The summary
now includes lines for unexpected passes and expected failures.
If the skip or only markers are only conditional on the type, do
not generate rules for those types.
* testsuite/experimental/simd/tests/abs.cc: Mark test expensive
for ABI tag subsets 1-9.
* testsuite/experimental/simd/tests/algorithms.cc: Ditto.
* testsuite/experimental/simd/tests/broadcast.cc: Ditto.
* testsuite/experimental/simd/tests/casts.cc: Ditto.
* testsuite/experimental/simd/tests/generator.cc: Ditto.
* testsuite/experimental/simd/tests/integer_operators.cc: Ditto.
* testsuite/experimental/simd/tests/loadstore.cc: Ditto.
* testsuite/experimental/simd/tests/mask_broadcast.cc: Ditto.
* testsuite/experimental/simd/tests/mask_conversions.cc: Ditto.
* testsuite/experimental/simd/tests/mask_implicit_cvt.cc: Ditto.
* testsuite/experimental/simd/tests/mask_loadstore.cc: Ditto.
* testsuite/experimental/simd/tests/mask_operator_cvt.cc: Ditto.
* testsuite/experimental/simd/tests/mask_operators.cc: Ditto.
* testsuite/experimental/simd/tests/mask_reductions.cc: Ditto.
* testsuite/experimental/simd/tests/operator_cvt.cc: Ditto.
* testsuite/experimental/simd/tests/operators.cc: Ditto.
* testsuite/experimental/simd/tests/reductions.cc: Ditto.
* testsuite/experimental/simd/tests/simd.cc: Ditto.
* testsuite/experimental/simd/tests/split_concat.cc: Ditto.
* testsuite/experimental/simd/tests/splits.cc: Ditto.
* testsuite/experimental/simd/tests/where.cc: Ditto.
* testsuite/experimental/simd/tests/fpclassify.cc: Ditto. In
addition replace "test only floattypes" marker by unconditional
"float|double|ldouble" only marker.
* testsuite/experimental/simd/tests/frexp.cc: Ditto.
* testsuite/experimental/simd/tests/hypot3_fma.cc: Ditto.
* testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.cc:
Ditto.
* testsuite/experimental/simd/tests/logarithm.cc: Ditto.
* testsuite/experimental/simd/tests/math_1arg.cc: Ditto.
* testsuite/experimental/simd/tests/math_2arg.cc: Ditto.
* testsuite/experimental/simd/tests/remqo.cc: Ditto.
* testsuite/experimental/simd/tests/trigonometric.cc: Ditto.
* testsuite/experimental/simd/tests/trunc_ceil_floor.cc: Ditto.
* testsuite/experimental/simd/tests/sincos.cc: Ditto. In
addition, xfail on run because the reference data is missing.
---
 .../testsuite/experimental/simd/driver.sh | 114 +---
 .../experimental/simd/generate_makefile.sh| 122 --
 .../testsuite/experimental/simd/tests/abs.cc  |   1 +
 .../experimental/simd/tests/algorithms.cc |   1 +
 .../experimental/simd/tests/broadcast.cc  |   1 +
 .../experimental/simd/tests/casts.cc  |   1 +
 .../experimental/simd/tests/fpclassify.cc |   3 +-
 .../experimental/simd/tests/frexp.cc  |   3 +-
 .../experimental/simd/tests/generator.cc  |   1 +
 .../experimental/simd/tests/hypot3_fma.cc |   3 +-
 .../simd/tests/integer_operators.cc   |   1 +
 .../simd/tests/ldexp_scalbn_scalbln_modf.cc   |   3 +-
 .../experimental/simd/tests/loadstore.cc  |   1 +
 .../experimental/simd/tests/logarithm.cc  |   3 +-
 .../experimental/simd/tests/mask_broadcast.cc |   1 +
 .../simd/tests/mask_conversions.cc|   1 +
 .../simd/tests/mask_implicit_cvt.cc   |   1 +
 .../experimental/simd/tests/mask_loadstore.cc |   1 +
 .../simd/tests/mask_operator_cvt.cc   |   1 +
 .../experimental/simd/tests/mask_operators.cc |   1 +
 .../simd/tests/mask_reductions.cc |   1 +
 .../experimental/simd/tests/math_1arg.cc  |   3 +-
 .../experimental/simd/tests/math_2arg.cc  |   3 +-
 .../experimental/simd/tests/operator_cvt.cc   |   1 +
 .../experimental/simd/tests/operators.cc  |   1 +
 .../experimental/simd/tests/reductions.cc |   1 +
 .../experimental/simd/tests/remqo.cc  |   3 +-
 .../testsuite/experimental/simd/tests/simd.cc |   1 +
 .../experimental/simd/tests/sincos.cc |   4 +-
 .../experimental/simd/tests/split_concat.cc   |   1 +
 .../experimental/simd/tests/splits.cc |   1 +
 .../experimental/simd/tests/trigonometric.cc  |   3 +-
 .../simd/tests/trunc_ceil_floor.cc|   3 +-
 .../experimental/simd/tests/where.cc  |   1 +
 34 files changed, 225 insertions(+), 66 deletions(-)


--
──
 Dr. Matthias Kretz

[PATCH 00/16] stdx::simd fixes and testsuite improvements

2021-01-27 Thread Matthias Kretz
As promised on IRC ...

Matthias Kretz (15):
  Support skip, only, expensive, and xfail markers
  Fix NEON intrinsic types usage
  Support -mlong-double-64 on PPC
  Fix simd_mask on POWER w/o POWER8
  Fix several check-simd interaction issues
  Fix DRIVEROPTS and TESTFLAGS processing
  Fix incorrect display of old test summaries
  Immediate feedback with -v
  Fix mask reduction of simd_mask on POWER7
  Skip testing hypot3 for long double on PPC
  Abort test after 1000 lines of output
  Support timeout and timeout-factor options
  Improve test codegen for interpreting assembly
  Implement hmin and hmax
  Work around test failures using -mno-tree-vrp

yaozhongxiao (1):
  Improve "find_first/last_set" for NEON

 libstdc++-v3/include/experimental/bits/simd.h | 170 ++-
 .../include/experimental/bits/simd_builtin.h  |   6 +-
 .../include/experimental/bits/simd_neon.h |  17 +-
 .../include/experimental/bits/simd_ppc.h  |  35 ++-
 .../include/experimental/bits/simd_scalar.h   |   2 +-
 libstdc++-v3/testsuite/Makefile.am|   5 +-
 libstdc++-v3/testsuite/Makefile.in|   5 +-
 .../testsuite/experimental/simd/driver.sh | 263 ++
 .../experimental/simd/generate_makefile.sh| 201 +++--
 .../testsuite/experimental/simd/tests/abs.cc  |   1 +
 .../experimental/simd/tests/algorithms.cc |   1 +
 .../experimental/simd/tests/bits/verify.h |  44 +--
 .../experimental/simd/tests/broadcast.cc  |   1 +
 .../experimental/simd/tests/casts.cc  |   1 +
 .../experimental/simd/tests/fpclassify.cc |   3 +-
 .../experimental/simd/tests/frexp.cc  |   3 +-
 .../experimental/simd/tests/generator.cc  |   1 +
 .../experimental/simd/tests/hypot3_fma.cc |   4 +-
 .../simd/tests/integer_operators.cc   |   1 +
 .../simd/tests/ldexp_scalbn_scalbln_modf.cc   |   3 +-
 .../experimental/simd/tests/loadstore.cc  |   2 +
 .../experimental/simd/tests/logarithm.cc  |   3 +-
 .../experimental/simd/tests/mask_broadcast.cc |   1 +
 .../simd/tests/mask_conversions.cc|   1 +
 .../simd/tests/mask_implicit_cvt.cc   |   1 +
 .../experimental/simd/tests/mask_loadstore.cc |   1 +
 .../simd/tests/mask_operator_cvt.cc   |   1 +
 .../experimental/simd/tests/mask_operators.cc |   1 +
 .../simd/tests/mask_reductions.cc |   1 +
 .../experimental/simd/tests/math_1arg.cc  |   3 +-
 .../experimental/simd/tests/math_2arg.cc  |   3 +-
 .../experimental/simd/tests/operator_cvt.cc   |   1 +
 .../experimental/simd/tests/operators.cc  |   1 +
 .../experimental/simd/tests/reductions.cc |  22 ++
 .../experimental/simd/tests/remqo.cc  |   3 +-
 .../testsuite/experimental/simd/tests/simd.cc |   1 +
 .../experimental/simd/tests/sincos.cc |   4 +-
 .../experimental/simd/tests/split_concat.cc   |   1 +
 .../experimental/simd/tests/splits.cc |   1 +
 .../experimental/simd/tests/trigonometric.cc  |   3 +-
 .../simd/tests/trunc_ceil_floor.cc|   3 +-
 .../experimental/simd/tests/where.cc  |   1 +
 42 files changed, 635 insertions(+), 191 deletions(-)

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──






Re: [PATCH] Add simd testsuite

2020-12-18 Thread Matthias Kretz
On Thursday, 17 December 2020 14:10:51 CET Jonathan Wakely wrote:
> On 16/12/20 12:58 +0100, Matthias Kretz wrote:
> >+  $srcdir/testsuite/experimental/simd/generate_makefile.sh \
> >+--destination="$testdir/$subdir" $CXX $INCLUDES $CXXFLAGS -static
> 
> Is the -static here to avoid needing LD_LIBRARY_PATH to find
> libstdc++.so?
> 
> If you don't have libc.a installed it won't work. How about
> using -static-libgcc -static-libstdc++ instead?

I need the -static for qemu and simple remote execution (copy binary via scp, 
execute via ssh). And yes, -static makes it much easier to avoid the 
LD_LIBRARY_PATH issue.

I'll make -static optional, and default to -static-libgcc -static-libstdc++. 
The latter should still work for most remote execution setups (works for me, 
at least).

> >--- /dev/null
> >+++ b/libstdc++-v3/testsuite/experimental/simd/tests/abs.cc
> >@@ -0,0 +1,24 @@
> >+#include "bits/verify.h"
> >+#include "bits/metahelpers.h"
> 
> We'd usually put these testsuite helper files in testsuite/util, maybe
> in a testsuite/util/simd sub-dir, but I suppose keeping them local to
> the tests is OK too.

At this point the simd testsuite is very close to being usable for other 
Parallelism TS 2 implementations. That's a feature I'd support if there's 
interest outside of libstdc++.

> >--- /dev/null
> >+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/conversions.h
> >@@ -0,0 +1,167 @@
> >+#include 
> >+
> >+// is_conversion_undefined
> >+/* implementation-defined
> >+ * ==
> >+ * §4.7 p3 (integral conversions)
> 
> These section signs will cause errors if the testsuite is run with
> something like -finput-charset=ascii, but I suppose we can say "don't
> do that". We have tests that use that option and include all the
> libstdc++ headers, so there should be no need to run the entire
> testsuite with that option. So it's OK.

Ah, but good point. I have comments in simd_math.h (i.e. the other patch) 
like: "Fold @p x into [-¼π, ¼π] and [...]". These comments are not in the 
testsuite. I guess I need to replace all non-ASCII chars there?

Attached is the diff to the previous patch. In addition to the -static change 
I added license headers (as noted on IRC) and I improved the Makefile 
generator: Instead of only passing TESTFLAGS and GCC_TEST_SIMULATOR as 
environment variables, place the initial values into the generated Makefile. 
This makes it much easier to work on tests or fixes for failures.

I'll top-post the squashed simd patches. 

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/scripts/check_simd b/libstdc++-v3/scripts/check_simd
index 2b7a17a64c9..25acf64c841 100755
--- a/libstdc++-v3/scripts/check_simd
+++ b/libstdc++-v3/scripts/check_simd
@@ -26,7 +26,7 @@ sim=\\\"$sim\\\"\""
 
 if [ -f "$CHECK_SIMD_CONFIG" ]; then
   . "$CHECK_SIMD_CONFIG"
-elif [ -z "$CHECK_SIMD_CONFIG"]; then
+elif [ -z "$CHECK_SIMD_CONFIG" ]; then
   if [ -z "$target_list" ]; then
 target_list="unix"
 case "$target_triplet" in
@@ -69,8 +69,7 @@ while [ ${#list} -gt 0 ]; do
   subdir="simd/$(echo "$flags" | sed 's#[= /-]##g')"
   rm -f "${subdir}/Makefile"
   $srcdir/testsuite/experimental/simd/generate_makefile.sh \
---destination="$testdir/$subdir" $CXX $INCLUDES $CXXFLAGS -static
-  echo "$subdir
-$flags
-$sim"
+--destination="$testdir/$subdir" --sim="$sim" --testflags="$flags" \
+$CXX $INCLUDES $CXXFLAGS -static-libgcc -static-libstdc++
+  echo "$subdir"
 done
diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index d2e282b62b9..fa9cc4753f3 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -192,9 +192,10 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
 	testsuite_files_simd \
 	${glibcxx_builddir}/scripts/testsuite_flags
 	${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
-	  while read subdir && read flags && read sim; do \
-	$(MAKE) -C "$${subdir}" TESTFLAGS="$${flags}" GCC_TEST_SIMULATOR="$${sim}"; \
-	tail -n6 $${subdir}/simd_testsuite.sum >> .simd.summary; \
+	  while read subdir; do \
+	$(M

Re: [PATCH] std::experimental::simd

2020-11-13 Thread Matthias Kretz
On Thursday, 12 November 2020 00:43:31 CET Jonathan Wakely wrote:
> On 08/05/20 21:03 +0200, Matthias Kretz wrote:
> >Here's my last update to the std::experimental::simd patch. It's currently
> >based on the gcc-10 branch.
> >
> >
> >+
> >+// __next_power_of_2{{{
> >+/**
> >+ * \internal
> 
> We use @foo for Doxygen comments rather than \foo

Done.

> >+ * Returns the next power of 2 larger than or equal to \p __x.
> >+ */
> >+constexpr std::size_t
> >+__next_power_of_2(std::size_t __x)
> >+{
> >+  return (__x & (__x - 1)) == 0 ? __x
> >+: __next_power_of_2((__x | (__x >> 1)) + 1);
> >+}
> 
> Can this be replaced with std::__bit_ceil ?
> 
> std::bit_ceil is C++20, but we provide __private versions of
> everything in <bit> for C++14 and up.

Ah good. I'll delete some code.
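
For reference, the quoted helper can be checked at compile time. The sketch below repeats the recursion under a non-reserved, hypothetical name (next_power_of_2) and asserts a few values; for nonzero inputs whose result fits in std::size_t these are the values C++20's std::bit_ceil returns, which is why the library function can take its place.

```cpp
#include <cstddef>

// Same recursion as the quoted __next_power_of_2, renamed for illustration.
// If x has at most one bit set it is returned unchanged; otherwise the
// bits below the top bit are filled step by step until x + 1 is a power
// of 2.
constexpr std::size_t next_power_of_2(std::size_t x)
{
  return (x & (x - 1)) == 0 ? x
                            : next_power_of_2((x | (x >> 1)) + 1);
}

static_assert(next_power_of_2(1) == 1);
static_assert(next_power_of_2(3) == 4);
static_assert(next_power_of_2(5) == 8);
static_assert(next_power_of_2(16) == 16);
static_assert(next_power_of_2(17) == 32);
```

One difference worth keeping in mind: the recursion returns 0 for an input of 0, whereas std::bit_ceil(0u) yields 1.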

> >+// vvv  type traits  vvv
> >+// integer type aliases{{{
> >+using _UChar = unsigned char;
> >+using _SChar = signed char;
> >+using _UShort = unsigned short;
> >+using _UInt = unsigned int;
> >+using _ULong = unsigned long;
> >+using _ULLong = unsigned long long;
> >+using _LLong = long long;
> 
> I have a suspicion some of these might clash with libc macros on some
> OS somewhere, but we can cross that bridge when we come to it.

I need those to help cut down the code to fit into 80 cols. ;-)

> >+// __make_dependent_t {{{
> >+template  struct __make_dependent
> >+{
> >+  using type = _Up;
> >+};
> >+template 
> >+using __make_dependent_t = typename __make_dependent<_Tp, _Up>::type;
> 
> Do you need a distinct class template for this, or can
> __make_dependent_t be an alias to __type_identity::type or
> something else that already exists?

With GCC it would suffice to use __type_identity::type here. But Clang 
rejects it. Clang sees that the first template argument is not used in the 
definition of the alias and thus doesn't make _Up a dependent type.
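
To make the purpose of such an alias concrete, here is a hedged sketch of one typical use: delaying the check of an otherwise non-dependent type until instantiation. The names make_dependent, make_dependent_t, Widget and widget_size are hypothetical and not taken from the patch.

```cpp
#include <cstddef>

// A class template (rather than a plain alias) ensures the result is
// treated as dependent on Tp, even though only Up is used.
template <typename Tp, typename Up>
struct make_dependent
{
  using type = Up;
};

template <typename Tp, typename Up>
using make_dependent_t = typename make_dependent<Tp, Up>::type;

struct Widget; // hypothetical type, only declared at this point

template <typename T>
constexpr std::size_t widget_size()
{
  // Writing sizeof(Widget) here would be rejected because Widget is still
  // incomplete. Routing it through make_dependent_t defers the check to
  // the point of instantiation.
  return sizeof(make_dependent_t<T, Widget>);
}

struct Widget
{
  int a;
  int b;
};

static_assert(widget_size<int>() == sizeof(Widget));
```

Because the alias goes through the member type of a class template, a compiler cannot see through it before instantiation, which is the property described above for the Clang case.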

> >+// __call_with_n_evaluations{{{
> >+template 
> >+_GLIBCXX_SIMD_INTRINSIC constexpr auto
> >+__call_with_n_evaluations(std::index_sequence<_I...>, _F0&& __f0,
> >+  _FArgs&& __fargs)
> 
> I'm not sure if it matters here, but old versions of G++ passed empty
> types (like index_sequence) using the wrong ABI. Passing them as the
> last argument makes it a non-issue. If they're not the last argument,
> you get incompatible code when compiling with -fabi-version=7 or
> lower.

These are all [[gnu::always_inline]]. So it shouldn't matter.

> >+// __is_narrowing_conversion<_From, _To>{{{
> >+template  >std::is_arithmetic<_From>::value, +bool =
> >std::is_arithmetic<_To>::value>
> 
> These could use is_arithmetic_v.

Right. That was me trying to work around a clang-format bug. Will fix. I'm in 
the process of ditching clang-format anyway.

> >+{
> >+};
> >+
> >+template 
> >+struct __is_narrowing_conversion : public true_type
> 
> This looks odd, bool to arithmetic type T is narrowing?
> I assume there's a reason for it, so maybe a comment explaining it
> would help.

Odd indeed. Either I wanted to take a shortcut to implement: "From is a
vectorizable type and every possible value of From can be represented with
type value_type, or [...]". Or I wanted to swap bool and _Tp here and say that
anything other than bool converting to bool is narrowing.

I should clean this up.
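
For reference, one possible reading of "every possible value of From can
be represented" for arithmetic types, as a minimal sketch. The name
is_narrowing_sketch is hypothetical and this is not the trait from the
patch; it only compares std::numeric_limits properties.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical sketch: From -> To is treated as narrowing if To has fewer
// value digits, a smaller maximum, a larger lowest value, or if a signed
// type is converted to an unsigned one.
template <typename From, typename To>
constexpr bool is_narrowing_sketch
  = std::numeric_limits<From>::digits > std::numeric_limits<To>::digits
    || std::numeric_limits<From>::max() > std::numeric_limits<To>::max()
    || std::numeric_limits<From>::lowest() < std::numeric_limits<To>::lowest()
    || (std::is_signed<From>::value && std::is_unsigned<To>::value);

static_assert(is_narrowing_sketch<double, float>, "53 digits do not fit 24");
static_assert(!is_narrowing_sketch<float, double>, "widening is fine");
```

Under this reading bool -> int is not narrowing, while int -> bool is,
which matches the second interpretation above.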

> 
> >+// _BitOps {{{
> >+struct _BitOps
> > [...]
> std::__popcount in <bit>
> > [...]
> std::__countl_zero in <bit>

Yes. I'll clean up all of _BitOps.

> >+template 
> 
> We generally avoid single letter names, although _V isn't in the list
> of BADNAMES in the manual, so maybe this one's OK.
> 
> >+template ,
> >+  typename _R
> 
> Same for _R, it's not listed as a BADNAME.

I believe I checked the list. ;-)

> >+
> >+template 
> >+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp
> >+__and(_Tp __a, _Tp __b) noexcept
> 
> Calls to __and are done unqualified. Are they only with types that
> won't cause ADL to look outside namespace std?
> 
> Even though __and is a reserved name, avoiding ADL has other benefits.

Called either with integers, [[gnu::vector_size(N)]] types, or 
std::experimental::parallelism_v2::_SimdWrapper. I request a column limit 
relaxation to at least 100 cols if I should qualify all of them with 
std::experimental:: ;-)
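
A minimal sketch of what the ADL question is about, with hypothetical
names (lib, user, and_, Mask, call_unqualified, call_qualified) that are
not from the patch. The functions return a marker instead of computing
anything, so it is visible which overload was chosen.

```cpp
namespace lib
{
  // Hypothetical stand-in for the patch's __and.
  template <typename T>
  constexpr int and_(T, T) { return 1; } // 1: the library's own overload

  // Unqualified call: ADL may add candidates from the argument's namespace.
  template <typename T>
  constexpr int call_unqualified(T a, T b) { return and_(a, b); }

  // Qualified call: only lib::and_ is considered.
  template <typename T>
  constexpr int call_qualified(T a, T b) { return lib::and_(a, b); }
}

namespace user
{
  struct Mask {};
  // A non-template overload in the argument's namespace; found via ADL and
  // preferred over the template in overload resolution.
  constexpr int and_(Mask, Mask) { return 2; }
}

static_assert(lib::call_unqualified(user::Mask{}, user::Mask{}) == 2,
              "ADL picked user::and_");
static_assert(lib::call_qualified(user::Mask{}, user::Mask{}) == 1,
              "qualification keeps the call inside lib");
```

With integers or [[gnu::vector_size(N)]] types there is no associated
namespace, so the unqualified call cannot be redirected this way.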

> That's all for now ... not very far through the huge patch though.
> Generally this looks very good. The things mentioned above are
> stylistic or just remove some redundancy, they're not critical.

Thanks. I'll post a new patch ASAP. My tests are running.

-Matthias

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──





Re: [PATCH] Let numeric_limits::is_iec559 reflect -ffast-math

2020-05-25 Thread Matthias Kretz
On Friday, 22 May 2020 18:39:42 CEST Jonathan Wakely wrote:
> On 22/05/20 09:49 +0200, Matthias Kretz wrote:
> >On Thursday, 21 May 2020 17:46:01 CEST Marc Glisse wrote:
> >> On Thu, 21 May 2020, Jonathan Wakely wrote:
> >> > On 27/04/20 17:09 +0200, Matthias Kretz wrote:
> >> >> From: Matthias Kretz 
> >> >> 
> >> >>PR libstdc++/84949
> >> >>* include/std/limits: Let is_iec559 reflect whether
> >> >>__GCC_IEC_559 says float and double support IEEE 754-2008.
> >> >>* testsuite/18_support/numeric_limits/is_iec559.cc: Test IEC559
> >> >>mandated behavior if is_iec559 is true.
> >> >>* testsuite/18_support/numeric_limits/infinity.cc: Only test
> >> >>inf
> >> >>behavior if is_iec559 is true, otherwise there is no guarantee
> >> >>how arithmetic on inf behaves.
> >> >>* testsuite/18_support/numeric_limits/quiet_NaN.cc: ditto for
> >> >>NaN.
> >> >>* testsuite/18_support/numeric_limits/denorm_min-1.cc: Compile
> >> >>with -ffast-math.
> >> >>* testsuite/18_support/numeric_limits/epsilon-1.cc: ditto.
> >> >>* testsuite/18_support/numeric_limits/infinity-1.cc: ditto.
> >> >>* testsuite/18_support/numeric_limits/is_iec559-1.cc: ditto.
> >> >>* testsuite/18_support/numeric_limits/quiet_NaN-1.cc: ditto.
> >> > 
> >> > I'm inclined to go ahead and commit this (to master only, obviously).
> >> > It certainly seems more correct to me, and we'll probably never find
> >> > out if it's "safe" to do unless we actually change it and see what
> >> > happens.
> >> > 
> >> > Marc, do you have an opinion?
> >> 
> >> I don't have a strong opinion on this. I thought we were refraining from
> >> changing numeric_limits based on flags (like -fwrapv for modulo) because
> >> that would lead to ODR violations when people link objects compiled with
> >> different flags. There is a value in libstdc++.so, which may have been
> >> compiled with different flags than the application.
> >
> >But these ODR violations happen in any case: The floating-point types are
> >different types with or without -ffast-math (and related) flags. They
> >behave differently. Compiling a function in multiple TUs with different
> >flags produces observably different results. Choosing a single one of them
> >is obviously fragile and broken. That's the spirit of an ODR violation...
> >
> >It would sometimes be useful to have different types:
> >float, float_no_nan, float_no_nan_no_signed_zero, ...
> 
> Sure. There are ODR violations like that, and then there are ones
> like:
> 
> template <typename T>
> struct X
> {
>   conditional_t<numeric_limits<T>::is_iec559, T, BigNum> val;
> };

Nice. ;-) If only the mangling of a struct could include the type of its 
members (recursively)... But at least val has a different type now. And 
correctly so. Yes, the ABI break made possible by this change is real, though
I'd guess there are zero or close-to-zero ABI dependencies on is_iec559 out in 
the wild (at this point - because it didn't work anyway).

> I'm generally not concerned about ODR violations where one TU behaves
> as requested by the flags used to compile that TU and another behaves
> as requested by the flags used to compile that second TU. That happens
> all the time with -fno-exceptions and -fno-rtti and such like. That
> causes ODR violations too, but of the kind where each definition does
> what was requested.

I am concerned. Showcase: https://godbolt.org/z/KzM3si. If you link those TUs, 
you get one of the two behaviors for both TUs. This can result in very hard to 
find Heisenbugs.

> Constants defined by the library changing value is a bit more
> concerning. But I don't know if it's really a problem in this case.

template <typename T, bool = std::numeric_limits<T>::is_iec559>
struct Float
{
  T val;
};

Finally, the standard mechanism that can help resolve those silent ODR 
violations works. I.e. one can build float_559 and float_non559 types 
(overloading all operators is still rather tedious).
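
A minimal sketch of that idea follows. The aliases float_559 and
float_non559 and the helper follows_iec559 are hypothetical and only serve
as illustration. Because the bool is a template argument, it is part of
the type (and therefore of its mangled name), so the two variants can no
longer be confused silently.

```cpp
#include <limits>
#include <type_traits>

// The defaulted bool records what numeric_limits reports in this TU.
template <typename T, bool Iec559 = std::numeric_limits<T>::is_iec559>
struct Float
{
  T val;
};

// Hypothetical aliases that name the two variants explicitly.
using float_559 = Float<float, true>;
using float_non559 = Float<float, false>;

// Hypothetical helper: recovers the flag from the type.
template <typename T, bool Iec559>
constexpr bool follows_iec559(Float<T, Iec559>) { return Iec559; }

static_assert(!std::is_same<float_559, float_non559>::value,
              "the two variants are distinct types");
static_assert(std::is_same<Float<float>,
                Float<float, std::numeric_limits<float>::is_iec559>>::value,
              "the default picks up the value seen by this TU");
```

A function taking Float<float> then has a different signature in a TU
where is_iec559 is false, so a mismatch is more likely to show up as a
link error than as a silent ODR violation.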

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──





Re: [PATCH] Let numeric_limits::is_iec559 reflect -ffast-math

2020-05-22 Thread Matthias Kretz
On Thursday, 21 May 2020 17:46:01 CEST Marc Glisse wrote:
> On Thu, 21 May 2020, Jonathan Wakely wrote:
> > On 27/04/20 17:09 +0200, Matthias Kretz wrote:
> >> From: Matthias Kretz 
> >> 
> >>PR libstdc++/84949
> >>* include/std/limits: Let is_iec559 reflect whether
> >>__GCC_IEC_559 says float and double support IEEE 754-2008.
> >>* testsuite/18_support/numeric_limits/is_iec559.cc: Test IEC559
> >>mandated behavior if is_iec559 is true.
> >>* testsuite/18_support/numeric_limits/infinity.cc: Only test inf
> >>behavior if is_iec559 is true, otherwise there is no guarantee
> >>how arithmetic on inf behaves.
> >>* testsuite/18_support/numeric_limits/quiet_NaN.cc: ditto for
> >>NaN.
> >>* testsuite/18_support/numeric_limits/denorm_min-1.cc: Compile
> >>with -ffast-math.
> >>* testsuite/18_support/numeric_limits/epsilon-1.cc: ditto.
> >>* testsuite/18_support/numeric_limits/infinity-1.cc: ditto.
> >>* testsuite/18_support/numeric_limits/is_iec559-1.cc: ditto.
> >>* testsuite/18_support/numeric_limits/quiet_NaN-1.cc: ditto.
> > 
> > I'm inclined to go ahead and commit this (to master only, obviously).
> > It certainly seems more correct to me, and we'll probably never find
> > out if it's "safe" to do unless we actually change it and see what
> > happens.
> > 
> > Marc, do you have an opinion?
> 
> I don't have a strong opinion on this. I thought we were refraining from
> changing numeric_limits based on flags (like -fwrapv for modulo) because
> that would lead to ODR violations when people link objects compiled with
> different flags. There is a value in libstdc++.so, which may have been
> compiled with different flags than the application.

But these ODR violations happen in any case: The floating-point types are 
different types with or without -ffast-math (and related) flags. They behave 
differently. Compiling a function in multiple TUs with different flags 
produces observably different results. Choosing a single one of them is 
obviously fragile and broken. That's the spirit of an ODR violation...

It would sometimes be useful to have different types:
float, float_no_nan, float_no_nan_no_signed_zero, ... 

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


Re: [RFC] Clarify -ffinite-math-only documentation

2020-04-28 Thread Matthias Kretz
On Tuesday, 28 April 2020 09:21:38 CEST Richard Biener wrote:
> On Mon, Apr 27, 2020 at 11:26 PM Matthias Kretz wrote:
> > On Monday, 27 April 2020 21:39:17 CEST Richard Sandiford wrote:
> > > "Dr. Matthias Kretz" writes:
> > > > On Monday, 27 April 2020 18:59:08 CEST Richard Sandiford wrote:
> > > >> Richard Biener via Gcc-patches writes:
> > > >> > On Mon, Apr 27, 2020 at 6:09 PM Matthias Kretz wrote:
> > > >> >> Hi,
> > > >> >> 
> > > >> >> This documentation change clarifies the effect of
> > > >> >> -ffinite-math-only.
> > > >> >> With the current documentation, it is unclear what the presence of
> > > >> >> NaN
> > > >> >> and Inf representations means if (arithmetic) operations on such
> > > >> >> values
> > > >> >> are unspecified and even classification functions like isnan are
> > > >> >> unreliable. If the hardware thinks a certain bit pattern is a NaN,
> > > >> >> but
> > > >> >> the software assumes a NaN value cannot ever exist, it is
> > > >> >> questionable
> > > >> >> whether, from a language viewpoint, a representation for NaNs
> > > >> >> really
> > > >> >> exists. Because, a NaN is defined by its behavior. This change
> > > >> >> also
> > > >> >> clarifies that isnan(nan) returning false is fine.
> > > >> >> 
> > > >> >> This relates to PR84949.
> > > >> >> 
> > > >> >> * doc/invoke.texi: Clarify the effects of
> > > >> >> -ffinite-math-only.
> > > >> >> 
> > > >> >> ---
> > > >> >> 
> > > >> >>  gcc/doc/invoke.texi | 6 ++++--
> > > >> >>  1 file changed, 4 insertions(+), 2 deletions(-)
> > > >> >> 
> > > >> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > >> >> index a37a2ee9c19..9e76ab057a9 100644
> > > >> >> --- a/gcc/doc/invoke.texi
> > > >> >> +++ b/gcc/doc/invoke.texi
> > > >> >> @@ -11619,8 +11619,10 @@ The default is
> > > >> >> @option{-fno-reciprocal-math}.
> > > >> >> 
> > > >> >>  @item -ffinite-math-only
> > > >> >>  @opindex ffinite-math-only
> > > >> >> 
> > > >> >> -Allow optimizations for floating-point arithmetic that assume
> > > >> >> -that arguments and results are not NaNs or +-Infs.
> > > >> >> +Assume that floating-point types in the language do not have representations for
> > > >> >> +NaNs and +-Inf. Whether floating-point hardware supports and acts on NaNs and
> > > >> >> ++-Inf is not affected. The behavior of a program that uses a NaN or +-Inf value
> > > >> >> +as function argument, macro argument, or operand is undefined.
> > > >> > 
> > > >> > Minor nit here - I'd avoid the 'undefined' word which has bad
> > > >> > connotation
> > > >> > and use 'unspecified'.  Maybe we can even use ISO C language
> > > >> > specification
> > > >> > terms but I'm not sure which one is most appropriate here.
> > > > 
> > > > I'm an ISO C++ person, and unspecified sounds too reliable to me:
> > > > https://wg21.link/intro.defs#defns.unspecified.
> > > > 
> > > >> > Curiously __builtin_nan ("nan") still gets you a NaN representation
> > > >> > but isnan(__builtin_nan("nan")) is resolved to false.
> > > > 
> > > > Right, that's because only the hardware thinks __builtin_nan ("nan")
> > > > is a
> > > > NaN representation. With -ffinite-math-only, the double data type in
> > > > C/C++ can either hold a finite real value, or an invalid value (i.e. a
> > > > value that the optimizer unconditionally excludes as a possible value
> > > > for
> > > > any object of floating-point type). FWIW, with -ffinite-math-only,
> > > > ubsan
> > > > should flag isnan(__builtin_nan("

Re: [RFC] Clarify -ffinite-math-only documentation

2020-04-27 Thread Matthias Kretz
On Monday, 27 April 2020 21:39:17 CEST Richard Sandiford wrote:
> "Dr. Matthias Kretz" writes:
> > On Monday, 27 April 2020 18:59:08 CEST Richard Sandiford wrote:
> >> Richard Biener via Gcc-patches  writes:
> >> > On Mon, Apr 27, 2020 at 6:09 PM Matthias Kretz  wrote:
> >> >> Hi,
> >> >> 
> >> >> This documentation change clarifies the effect of -ffinite-math-only.
> >> >> With the current documentation, it is unclear what the presence of NaN
> >> >> and Inf representations means if (arithmetic) operations on such
> >> >> values
> >> >> are unspecified and even classification functions like isnan are
> >> >> unreliable. If the hardware thinks a certain bit pattern is a NaN, but
> >> >> the software assumes a NaN value cannot ever exist, it is questionable
> >> >> whether, from a language viewpoint, a representation for NaNs really
> >> >> exists. Because, a NaN is defined by its behavior. This change also
> >> >> clarifies that isnan(nan) returning false is fine.
> >> >> 
> >> >> This relates to PR84949.
> >> >> 
> >> >> * doc/invoke.texi: Clarify the effects of -ffinite-math-only.
> >> >> 
> >> >> ---
> >> >> 
> >> >>  gcc/doc/invoke.texi | 6 ++++--
> >> >>  1 file changed, 4 insertions(+), 2 deletions(-)
> >> >> 
> >> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> >> >> index a37a2ee9c19..9e76ab057a9 100644
> >> >> --- a/gcc/doc/invoke.texi
> >> >> +++ b/gcc/doc/invoke.texi
> >> >> @@ -11619,8 +11619,10 @@ The default is @option{-fno-reciprocal-math}.
> >> >> 
> >> >>  @item -ffinite-math-only
> >> >>  @opindex ffinite-math-only
> >> >> 
> >> >> -Allow optimizations for floating-point arithmetic that assume
> >> >> -that arguments and results are not NaNs or +-Infs.
> >> >> +Assume that floating-point types in the language do not have representations for
> >> >> +NaNs and +-Inf. Whether floating-point hardware supports and acts on NaNs and
> >> >> ++-Inf is not affected. The behavior of a program that uses a NaN or +-Inf value
> >> >> +as function argument, macro argument, or operand is undefined.
> >> > 
> >> > Minor nit here - I'd avoid the 'undefined' word which has bad
> >> > connotation
> >> > and use 'unspecified'.  Maybe we can even use ISO C language
> >> > specification
> >> > terms but I'm not sure which one is most appropriate here.
> > 
> > I'm an ISO C++ person, and unspecified sounds too reliable to me:
> > https://wg21.link/intro.defs#defns.unspecified.
> > 
> >> > Curiously __builtin_nan ("nan") still gets you a NaN representation
> >> > but isnan(__builtin_nan("nan")) is resolved to false.
> > 
> > Right, that's because only the hardware thinks __builtin_nan ("nan") is a
> > NaN representation. With -ffinite-math-only, the double data type in
> > C/C++ can either hold a finite real value, or an invalid value (i.e. a
> > value that the optimizer unconditionally excludes as a possible value for
> > any object of floating-point type). FWIW, with -ffinite-math-only, ubsan
> > should flag isnan(__builtin_nan("nan")) or any f(constexpr nan).
> > 
> > With the above documentation change, it is clear that with
> > https://wg21.link/P1841 std::numbers::quiet_NaN would be
> > ill-formed under -ffinite-math-only. Without the documentation change,
> > it can be argued either way.
> > 
> > There's another interesting observation resulting from the above: double
> > and double under -ffinite-math-only are different types. Any function
> > call from one world to the other is dangerous. Inline functions
> > translated in different TUs compiled with different math flags violate
> > the ODR. But that's all the more reason to have a very precise
> > documentation/understanding of what -ffinite-math-only does. Because this
> > gotcha is already the status quo.
> >
> >> Yeah, for that and other reasons, I think it would be good to avoid
> >> giving the impression that -ffinite-math-only can be relied on to make
> >> the assumption above.  Wouldn't it be more accurate to say that the
> >> compiler is allowed to make the assumption, at any po

Re: [RFC] Clarify -ffinite-math-only documentation

2020-04-27 Thread Dr. Matthias Kretz
On Monday, 27 April 2020 18:59:08 CEST Richard Sandiford wrote:
> Richard Biener via Gcc-patches  writes:
> > On Mon, Apr 27, 2020 at 6:09 PM Matthias Kretz  wrote:
> >> Hi,
> >> 
> >> This documentation change clarifies the effect of -ffinite-math-only.
> >> With the current documentation, it is unclear what the presence of NaN
> >> and Inf representations means if (arithmetic) operations on such values
> >> are unspecified and even classification functions like isnan are
> >> unreliable. If the hardware thinks a certain bit pattern is a NaN, but
> >> the software assumes a NaN value cannot ever exist, it is questionable
> >> whether, from a language viewpoint, a representation for NaNs really
> >> exists. Because, a NaN is defined by its behavior. This change also
> >> clarifies that isnan(nan) returning false is fine.
> >> 
> >> This relates to PR84949.
> >> 
> >> * doc/invoke.texi: Clarify the effects of -ffinite-math-only.
> >> 
> >> ---
> >> 
> >>  gcc/doc/invoke.texi | 6 ++++--
> >>  1 file changed, 4 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> >> index a37a2ee9c19..9e76ab057a9 100644
> >> --- a/gcc/doc/invoke.texi
> >> +++ b/gcc/doc/invoke.texi
> >> @@ -11619,8 +11619,10 @@ The default is @option{-fno-reciprocal-math}.
> >> 
> >>  @item -ffinite-math-only
> >>  @opindex ffinite-math-only
> >> 
> >> -Allow optimizations for floating-point arithmetic that assume
> >> -that arguments and results are not NaNs or +-Infs.
> >> +Assume that floating-point types in the language do not have representations for
> >> +NaNs and +-Inf. Whether floating-point hardware supports and acts on NaNs and
> >> ++-Inf is not affected. The behavior of a program that uses a NaN or +-Inf value
> >> +as function argument, macro argument, or operand is undefined.
> > 
> > Minor nit here - I'd avoid the 'undefined' word which has bad connotation
> > and use 'unspecified'.  Maybe we can even use ISO C language specification
> > terms but I'm not sure which one is most appropriate here.

I'm an ISO C++ person, and unspecified sounds too reliable to me:
https://wg21.link/intro.defs#defns.unspecified.

> > Curiously __builtin_nan ("nan") still gets you a NaN representation
> > but isnan(__builtin_nan("nan")) is resolved to false.

Right, that's because only the hardware thinks __builtin_nan ("nan") is a NaN 
representation. With -ffinite-math-only, the double data type in C/C++ can 
either hold a finite real value, or an invalid value (i.e. a value that the 
optimizer unconditionally excludes as a possible value for any object of 
floating-point type). FWIW, with -ffinite-math-only, ubsan should flag 
isnan(__builtin_nan("nan")) or any f(constexpr nan).
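
To make the distinction between "the hardware's NaN representation" and
"what isnan reports" concrete, here is a minimal sketch of a check that
looks at the bit pattern only. The helper has_nan_bits is hypothetical,
and the code assumes that double is IEEE 754 binary64. Whether such a
check survives the optimizer under -ffinite-math-only is exactly the kind
of question the proposed wording leaves undefined.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: classifies by representation, not by behavior.
// Assumes double is IEEE 754 binary64 (1 sign, 11 exponent, 52 mantissa bits).
inline bool has_nan_bits(double x)
{
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits); // well-defined way to read the bytes
  const std::uint64_t exponent = (bits >> 52) & 0x7ff;
  const std::uint64_t mantissa = bits & ((std::uint64_t(1) << 52) - 1);
  // NaN: exponent all ones and a non-zero mantissa (Inf has mantissa zero).
  return exponent == 0x7ff && mantissa != 0;
}
```

Without any fast-math flags, has_nan_bits(__builtin_nan("nan")) and
isnan(__builtin_nan("nan")) agree; the discussion above is about the case
where they do not.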

With the above documentation change, it is clear that with
https://wg21.link/P1841 std::numbers::quiet_NaN would be ill-formed under
-ffinite-math-only. Without the documentation change, it can be argued either way.

There's another interesting observation resulting from the above: double and 
double under -ffinite-math-only are different types. Any function call from 
one world to the other is dangerous. Inline functions translated in different 
TUs compiled with different math flags violate the ODR. But that's all the 
more reason to have a very precise documentation/understanding of what 
-ffinite-math-only does. Because this gotcha is already the status quo.
 
> Yeah, for that and other reasons, I think it would be good to avoid
> giving the impression that -ffinite-math-only can be relied on to make
> the assumption above.  Wouldn't it be more accurate to say that the
> compiler is allowed to make the assumption, at any point that it seems
> convenient?

I think undefined behavior does what you're asking for while unspecified 
behavior does what you want to avoid. I.e. it's an undocumented behavior, but
it can be relied on with a given implementation (compiler).

-Matthias

-- 
──┬
 Dr. Matthias Kretz   │ SDE — Software Development for Experiments
 Senior Software Engineer,│  +49 6159 713084
 SIMD Expert, │  m.kr...@gsi.de
 ISO C++ Committee Member │  mattkretz.github.io
──┴

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Georg Schütte