Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)
On Tuesday, 16 November 2021 21:25:33 CET Jason Merrill wrote:
> On 11/8/21 15:00, Matthias Kretz wrote:
> > I forgot to mention why I tagged it [RFC]: I needed one more bit of
> > information on the template args TREE_VEC to encode
> > EXPLICIT_TEMPLATE_ARGS_P. Its TREE_CHAIN already points to an integer
> > constant denoting the number of non-default arguments, so I couldn't
> > trivially replace that. Therefore, I used the sign of that integer. I was
> > hoping to find a cleaner solution, though.
>
> It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that
> would be a cleaner solution.

I tried that first but realized that TREE_VEC doesn't allow any
TREE_LANG_FLAGs (it uses those bits for the length IIRC). And setting the
TREE_LANG_FLAGs on the TREE_CHAIN of the TREE_VEC can't work either (since the
int constants are shared between many trees).

Should I maybe turn the TREE_CHAIN into a TREE_LIST using TREE_PURPOSE and
TREE_VALUE for EXPLICIT_TEMPLATE_ARGS_P and non-default arguments,
respectively? (And where would I document this?)

--
──────
 Dr. Matthias Kretz                       https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research           https://gsi.de
 stdₓ::simd
──
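For readers following the encoding discussion, here is a minimal standalone sketch of "use the sign of the count as the extra bit". It is not GCC code: `ArgsInfo`, `encode_args_info`, `non_default_count`, and `explicit_args_p` are hypothetical names, and a plain `int` stands in for the shared integer constant hanging off TREE_CHAIN.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the integer constant reachable via TREE_CHAIN.
// |encoded| is the number of non-default template arguments; a negative
// sign means "at least one argument was given explicitly".
struct ArgsInfo {
  int encoded;
};

inline ArgsInfo encode_args_info(int count, bool explicit_args) {
  return ArgsInfo{explicit_args ? -count : count};
}

// Analogue of reading the count: always take the absolute value.
inline int non_default_count(ArgsInfo info) { return std::abs(info.encoded); }

// Analogue of the extra bit: read it from the sign.
inline bool explicit_args_p(ArgsInfo info) { return info.encoded < 0; }
```

One limitation of this naive form: a count of zero cannot carry the flag, because -0 and 0 are the same integer. A two-field record (which is roughly what the TREE_LIST idea with TREE_PURPOSE and TREE_VALUE amounts to) does not have that corner case.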
[PATCH 2/2] libstdc++: Use diagnose_as attribute to improve simd diagnostics
Signed-off-by: Matthias Kretz

libstdc++-v3/ChangeLog:

        * include/experimental/bits/simd.h: Diagnose
        'std::experimental::parallelism_v2::simd_abi' as 'simd_abi'. On
        x86, diagnose _VecBuiltin<16>, _VecBuiltin<32>, and
        _VecBltnBtmsk<64> as 'simd_abi::[SSE]', 'simd_abi::[AVX]', and
        'simd_abi::[AVX512]' respectively.
        (simd_abi::_Scalar): Diagnose as 'simd_abi::scalar'.
        (simd_abi::_Fixed): Diagnose as 'simd_abi::fixed_size'.
        (__odr_helper): Shorten implementation details (effectively
        hiding them).
        * include/experimental/bits/simd_detail.h: Diagnose
        'std::experimental::parallelism_v2' as 'stdₓ'.
---
 libstdc++-v3/include/experimental/bits/simd.h | 37 +--
 .../include/experimental/bits/simd_detail.h   |  2 +-
 2 files changed, 11 insertions(+), 28 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 4fbad7d67b5..f581b46fbd8 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -83,13 +83,13 @@ using __m512d [[__gnu__::__vector_size__(64)]] = double;
 using __m512i [[__gnu__::__vector_size__(64)]] = long long;
 #endif
 
-namespace simd_abi {
+namespace simd_abi [[__gnu__::__diagnose_as__("simd_abi")]] {
 // simd_abi forward declarations {{{
 // implementation details:
-struct _Scalar;
+  struct [[__gnu__::__diagnose_as__("scalar")]] _Scalar;
 
 template
-  struct _Fixed;
+  struct [[__gnu__::__diagnose_as__("fixed_size")]] _Fixed;
 
 // There are two major ABIs that appear on different architectures.
 // Both have non-boolean values packed into an N Byte register
@@ -108,28 +108,11 @@ template
 template
   struct _VecBltnBtmsk;
 
-template
-  using _VecN = _VecBuiltin;
-
-template
-  using _Sse = _VecBuiltin<_UsedBytes>;
-
-template
-  using _Avx = _VecBuiltin<_UsedBytes>;
-
-template
-  using _Avx512 = _VecBltnBtmsk<_UsedBytes>;
-
-template
-  using _Neon = _VecBuiltin<_UsedBytes>;
-
-// implementation-defined:
-using __sse = _Sse<>;
-using __avx = _Avx<>;
-using __avx512 = _Avx512<>;
-using __neon = _Neon<>;
-using __neon128 = _Neon<16>;
-using __neon64 = _Neon<8>;
+#if defined __i386__ || defined __x86_64__
+using __sse [[__gnu__::__diagnose_as__("[SSE]")]] = _VecBuiltin<16>;
+using __avx [[__gnu__::__diagnose_as__("[AVX]")]] = _VecBuiltin<32>;
+using __avx512 [[__gnu__::__diagnose_as__("[AVX512]")]] = _VecBltnBtmsk<64>;
+#endif
 
 // standard:
 template
@@ -367,7 +350,7 @@ namespace __detail
    * users link TUs compiled with different flags. This is especially important
    * for using simd in libraries.
    */
-  using __odr_helper
+  using __odr_helper [[__gnu__::__diagnose_as__("[ODR helper]")]]
     = conditional_t<__machine_flags() == 0, _OdrEnforcer,
                     _MachineFlagsTemplate<__machine_flags(),
                                           __floating_point_flags()>>;
@@ -692,7 +675,7 @@ template
   __is_avx512_abi()
   {
     constexpr auto _Bytes = __abi_bytes_v<_Abi>;
-    return _Bytes <= 64 && is_same_v, _Abi>;
+    return _Bytes <= 64 && is_same_v, _Abi>;
   }
 
 // }}}
diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h b/libstdc++-v3/include/experimental/bits/simd_detail.h
index 198c925c133..437f1ddb278 100644
--- a/libstdc++-v3/include/experimental/bits/simd_detail.h
+++ b/libstdc++-v3/include/experimental/bits/simd_detail.h
@@ -37,7 +37,7 @@
   {                                                                           \
   _GLIBCXX_BEGIN_NAMESPACE_VERSION                                            \
     namespace experimental {                                                  \
-      inline namespace parallelism_v2 {
+      inline namespace parallelism_v2 [[__gnu__::__diagnose_as__("std\u2093")]] {
 #define _GLIBCXX_SIMD_END_NAMESPACE                                           \
   }                                                                           \
   }                                                                           \
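To try the renaming pattern from this patch outside libstdc++, something like the following sketch may help. It assumes the attribute is spelled `gnu::diagnose_as` as proposed in this thread and wraps it in a hypothetical `DEMO_DIAGNOSE_AS` macro, so that compilers without the attribute see plain declarations. The `demo` namespace and its types are invented for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Only emit the attribute if the compiler reports support for it; the
// attribute is a proposal from this thread, not a released feature.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(gnu::diagnose_as)
#    define DEMO_DIAGNOSE_AS(name) [[gnu::diagnose_as(name)]]
#  endif
#endif
#ifndef DEMO_DIAGNOSE_AS
#  define DEMO_DIAGNOSE_AS(name)
#endif

namespace demo {

// Implementation-detail ABI tag, comparable in spirit to _VecBuiltin.
template <int Bytes>
struct VecBuiltin {};

// A class gets a friendlier name for diagnostics.
struct DEMO_DIAGNOSE_AS("scalar") Scalar {};

// Aliases name specific specializations, as the patch does for SSE and AVX.
using sse DEMO_DIAGNOSE_AS("[SSE]") = VecBuiltin<16>;
using avx DEMO_DIAGNOSE_AS("[AVX]") = VecBuiltin<32>;

// Helper used to show that the aliases still denote the same types.
template <int Bytes>
constexpr int abi_bytes(VecBuiltin<Bytes>) { return Bytes; }

}  // namespace demo
```

The sketch relies on the attribute affecting only how names are printed, which is how the thread describes it: `demo::sse` and `demo::VecBuiltin<16>` remain the same type either way.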
[PATCH 1/2] libstdc++: Use diagnose_as attribute to improve string diagnostics
This hides the basic_string template in all diagnostics, improving the
signal-to-noise ratio significantly. It also hides the std::__cxx11
namespace from users by presenting it as std.

Signed-off-by: Matthias Kretz

libstdc++-v3/ChangeLog:

        PR c++/89370
        * include/bits/c++config: Diagnose std::__cxx11:: as std:: using
        the diagnose_as attribute.
        * include/bits/stringfwd.h: Add diagnose_as attribute to string,
        wstring, u8string, u16string, and u32string.
        * include/debug/string: Ditto.
        * include/experimental/string: Ditto.
        * include/std/string: Ditto.
---
 libstdc++-v3/include/bits/c++config      |  3 ++-
 libstdc++-v3/include/bits/stringfwd.h    | 10 +-
 libstdc++-v3/include/debug/string        | 10 +-
 libstdc++-v3/include/experimental/string | 10 +-
 libstdc++-v3/include/std/string          | 10 +-
 5 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/libstdc++-v3/include/bits/c++config b/libstdc++-v3/include/bits/c++config
index a6495809671..02d11afc1aa 100644
--- a/libstdc++-v3/include/bits/c++config
+++ b/libstdc++-v3/include/bits/c++config
@@ -318,7 +318,8 @@ namespace std
 #if _GLIBCXX_USE_CXX11_ABI
 namespace std
 {
-  inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { }
+  inline namespace __cxx11
+    __attribute__((__abi_tag__ ("cxx11"), __diagnose_as__("std"))) { }
 }
 namespace __gnu_cxx
 {
diff --git a/libstdc++-v3/include/bits/stringfwd.h b/libstdc++-v3/include/bits/stringfwd.h
index bcfd350e505..3f653feae14 100644
--- a/libstdc++-v3/include/bits/stringfwd.h
+++ b/libstdc++-v3/include/bits/stringfwd.h
@@ -74,22 +74,22 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 _GLIBCXX_END_NAMESPACE_CXX11
 
   /// A string of @c char
-  typedef basic_string<char>    string;
+  typedef basic_string<char>    string __attribute__((__diagnose_as__));
 
   /// A string of @c wchar_t
-  typedef basic_string<wchar_t> wstring;
+  typedef basic_string<wchar_t> wstring __attribute__((__diagnose_as__));
 
 #ifdef _GLIBCXX_USE_CHAR8_T
   /// A string of @c char8_t
-  typedef basic_string<char8_t> u8string;
+  typedef basic_string<char8_t> u8string __attribute__((__diagnose_as__));
 #endif
 
 #if __cplusplus >= 201103L
   /// A string of @c char16_t
-  typedef basic_string<char16_t> u16string;
+  typedef basic_string<char16_t> u16string __attribute__((__diagnose_as__));
 
   /// A string of @c char32_t
-  typedef basic_string<char32_t> u32string;
+  typedef basic_string<char32_t> u32string __attribute__((__diagnose_as__));
 #endif
 
   /** @} */
diff --git a/libstdc++-v3/include/debug/string b/libstdc++-v3/include/debug/string
index a8389528001..d6299e5552f 100644
--- a/libstdc++-v3/include/debug/string
+++ b/libstdc++-v3/include/debug/string
@@ -1296,21 +1296,21 @@ namespace __gnu_debug
       return __res;
     }
 
-  typedef basic_string<char>    string;
+  typedef basic_string<char>    string __attribute__((__diagnose_as__));
 
-  typedef basic_string<wchar_t> wstring;
+  typedef basic_string<wchar_t> wstring __attribute__((__diagnose_as__));
 
 #ifdef _GLIBCXX_USE_CHAR8_T
   /// A string of @c char8_t
-  typedef basic_string<char8_t> u8string;
+  typedef basic_string<char8_t> u8string __attribute__((__diagnose_as__));
 #endif
 
 #if __cplusplus >= 201103L
   /// A string of @c char16_t
-  typedef basic_string<char16_t> u16string;
+  typedef basic_string<char16_t> u16string __attribute__((__diagnose_as__));
 
   /// A string of @c char32_t
-  typedef basic_string<char32_t> u32string;
+  typedef basic_string<char32_t> u32string __attribute__((__diagnose_as__));
 #endif
 
   template
diff --git a/libstdc++-v3/include/experimental/string b/libstdc++-v3/include/experimental/string
index 4d92a7e39cc..91a9dd8b164 100644
--- a/libstdc++-v3/include/experimental/string
+++ b/libstdc++-v3/include/experimental/string
@@ -73,13 +73,13 @@ inline namespace fundamentals_v2
   // basic_string typedef names using polymorphic allocator in namespace
   // std::experimental::pmr
-  typedef basic_string<char> string;
+  typedef basic_string<char> string __attribute__((__diagnose_as__));
 #ifdef _GLIBCXX_USE_CHAR8_T
-  typedef basic_string<char8_t> u8string;
+  typedef basic_string<char8_t> u8string __attribute__((__diagnose_as__));
 #endif
-  typedef basic_string<char16_t> u16string;
-  typedef basic_string<char32_t> u32string;
-  typedef basic_string<wchar_t> wstring;
+  typedef basic_string<char16_t> u16string __attribute__((__diagnose_as__));
+  typedef basic_string<char32_t> u32string __attribute__((__diagnose_as__));
+  typedef basic_string<wchar_t> wstring __attribute__((__diagnose_as__));
 
 } // namespace pmr
 #endif
diff --git a/libstdc++-v3/include/std/string b/libstdc++-v3/include/std/string
index af840e887d5..03a3c68050f 100644
--- a/libstdc++-v3/include/std/string
+++ b/libstdc++-v3
[PATCH 0/2] Make use of the diagnose_as attribute to improve libstdc++ diagnostics
After my two C++ patches for template diagnostics and the diagnose_as
attribute are in, I'd like to make use of the attribute for std::*string and
std::pmr::*string as well as for std::experimental::simd diagnostics.

Matthias Kretz (2):
  libstdc++: Use diagnose_as attribute to improve string diagnostics
  libstdc++: Use diagnose_as attribute to improve simd diagnostics

 libstdc++-v3/include/bits/c++config           |  3 +-
 libstdc++-v3/include/bits/stringfwd.h         | 10 ++---
 libstdc++-v3/include/debug/string             | 10 ++---
 libstdc++-v3/include/experimental/bits/simd.h | 37 +--
 .../include/experimental/bits/simd_detail.h   |  2 +-
 libstdc++-v3/include/experimental/string      | 10 ++---
 libstdc++-v3/include/std/string               | 10 ++---
 7 files changed, 33 insertions(+), 49 deletions(-)
Re: [PATCH 11/11] libstdc++: Fix ODR issues with different -m flags
ping. OK to push?

On Tuesday, 8 June 2021 14:12:23 CET Matthias Kretz wrote:
> From: Matthias Kretz
>
> Explicitly support use of the stdx::simd implementation in situations
> where the user links TUs that were compiled with different -m flags. In
> general, this is always a (quasi) ODR violation for inline functions
> because at least codegen may differ in important ways. However, in the
> resulting executable only one (unspecified which one) of them might be
> used. For simd we want to support users to compile code multiple times,
> with different -m flags and have a runtime dispatch to the TU matching
> the target CPU. But if internal functions are not inlined this may lead
> to unexpected performance loss or execution of illegal instructions.
> Therefore, inline functions that are not marked as always_inline must
> use an additional template parameter somewhere in their name, to
> disambiguate between the different -m translations.
>
> Signed-off-by: Matthias Kretz
>
> libstdc++-v3/ChangeLog:
>
>       * include/experimental/bits/simd.h: Move feature detection bools
>       and add __have_avx512bitalg, __have_avx512vbmi2,
>       __have_avx512vbmi, __have_avx512ifma, __have_avx512cd,
>       __have_avx512vnni, __have_avx512vpopcntdq.
>       (__detail::__machine_flags): New function which returns a unique
>       uint64 depending on relevant -m and -f flags.
>       (__detail::__odr_helper): New type alias for either an anonymous
>       type or a type specialized with the __machine_flags number.
>       (_SimdIntOperators): Change template parameters from _Impl to
>       _Tp, _Abi because _Impl now has an __odr_helper parameter which
>       may be _OdrEnforcer from the anonymous namespace, which makes
>       for a bad base class.
>       (many): Either add __odr_helper template parameter or mark as
>       always_inline.
>       * include/experimental/bits/simd_detail.h: Add defines for
>       AVX512BITALG, AVX512VBMI2, AVX512VBMI, AVX512IFMA, AVX512CD,
>       AVX512VNNI, AVX512VPOPCNTDQ, and AVX512VP2INTERSECT.
>       * include/experimental/bits/simd_builtin.h: Add __odr_helper
>       template parameter or mark as always_inline.
>       * include/experimental/bits/simd_fixed_size.h: Ditto.
>       * include/experimental/bits/simd_math.h: Ditto.
>       * include/experimental/bits/simd_scalar.h: Ditto.
>       * include/experimental/bits/simd_neon.h: Add __odr_helper
>       template parameter.
>       * include/experimental/bits/simd_ppc.h: Ditto.
>       * include/experimental/bits/simd_x86.h: Ditto.
> ---
>  libstdc++-v3/include/experimental/bits/simd.h | 380 --
>  .../include/experimental/bits/simd_builtin.h  |  41 +-
>  .../include/experimental/bits/simd_detail.h   |  40 ++
>  .../experimental/bits/simd_fixed_size.h       |  39 +-
>  .../include/experimental/bits/simd_math.h     |  45 ++-
>  .../include/experimental/bits/simd_neon.h     |   4 +-
>  .../include/experimental/bits/simd_ppc.h      |   4 +-
>  .../include/experimental/bits/simd_scalar.h   |  71 +++-
>  .../include/experimental/bits/simd_x86.h      |   4 +-
>  9 files changed, 440 insertions(+), 188 deletions(-)
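The disambiguation idea described in the quoted commit message can be sketched in a few lines. This is a reduced illustration, not the libstdc++ implementation: `demo::machine_flags`, `MachineFlagsTag`, `OdrEnforcer`, `odr_helper`, `sum4`, and `twice` are hypothetical names, only four x86 feature macros are inspected, and C++14 is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

namespace demo {

// Reduced analogue of __machine_flags(): one bit per relevant -m flag,
// read from the compiler's predefined macros.
constexpr std::uint64_t machine_flags() {
  std::uint64_t flags = 0;
#ifdef __SSE2__
  flags |= std::uint64_t(1) << 0;
#endif
#ifdef __AVX__
  flags |= std::uint64_t(1) << 1;
#endif
#ifdef __AVX2__
  flags |= std::uint64_t(1) << 2;
#endif
#ifdef __AVX512F__
  flags |= std::uint64_t(1) << 3;
#endif
  return flags;
}

// A distinct type per flag combination.
template <std::uint64_t Flags>
struct MachineFlagsTag {};

// With no flags set, fall back to a TU-local type.
namespace {
struct OdrEnforcer {};
}

using odr_helper = std::conditional_t<machine_flags() == 0, OdrEnforcer,
                                      MachineFlagsTag<machine_flags()>>;

// Not forced inline, so the tag becomes part of the instantiation's name.
// TUs built with different -m flags then produce different symbols.
template <class = odr_helper>
float sum4(const float* p) {
  return (p[0] + p[1]) + (p[2] + p[3]);
}

// Forced inline on GNU-compatible compilers, so no tag is needed.
#if defined(__GNUC__)
[[gnu::always_inline]]
#endif
inline float twice(float x) { return x + x; }

}  // namespace demo
```

Callers use `demo::sum4(ptr)` as usual; the defaulted tag parameter is picked up silently. The design choice is that the linker can no longer merge two differently compiled copies of `sum4`, since their mangled names differ.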
[PATCH v5] c++: Add gnu::diagnose_as attribute
Sorry for taking so long. I hope we can still get this done for GCC 12.

One open question: If we change std::__cxx11::basic_string to std::string
with this feature, should DWARF strings change or not? I.e. should
diagnose_as be conditional on (pp->flags & pp_c_flag_gnu_v3)? If these
strings are only for user consumption, I think the DWARF strings should be
affected by the attribute...

Oh, and note that the current patch depends on the "c++: Print function
template parms when relevant" patch I sent on Nov 8th.

On Wednesday, 8 September 2021 04:21:51 CEST Jason Merrill wrote:
> On 7/23/21 4:58 AM, Matthias Kretz wrote:
> > gcc/cp/ChangeLog:
> >       PR c++/89370
> >       * cp-tree.h: Add is_alias_template_p declaration.
> >       * decl2.c (is_alias_template_p): New function. Determines
> >       whether a given TYPE_DECL is actually an alias template that is
> >       still missing its template_info.
>
> I still think you want to share code with get_underlying_template. For
> the case where the alias doesn't have DECL_TEMPLATE_INFO yet, you can
> compare to current_template_args (). Or you could do some initial
> processing that doesn't care about templates in the handler, and then do
> more in cp_parser_alias_declaration after the call to grokfield/start_decl.

I still don't understand how I could make use of get_underlying_template.
I.e. I don't even understand how get_underlying_template answers any of the
questions I need answered. I used way too much time trying to make this
work...

> If you still think you need this function, let's call it
> is_renaming_alias_template or renaming_alias_template_p; using both is_
> and _p is redundant. I don't have a strong preference which.

OK.

> >       (is_late_template_attribute): Decls with diagnose_as attribute
> >       are early attributes only if they are alias templates.
>
> Is there a reason not to apply it early to other templates as well?
Unconditionally returning false for diagnose_as in is_late_template_attribute makes renamed class templates print without template parameter list. E.g. template struct [[diagnose_as("foo")]] A; using bar [[diagnose_as]] = A; template struct A { template struct B {}; using C [[diagnose_as]] = B; }; could query for attributes. So IIUC, member types of class templates require late attributes. > > * error.c (dump_scope): When printing the name of a namespace, > > look for the diagnose_as attribute. If found, print the > > associated string instead of calling dump_decl. > > Did you decide not to handle this in dump_decl, so we use the > diagnose_as when referring to the namespace in non-scope contexts as well? Good question. dump_decl is the more general place for handling the attribute and that's where I moved it to. > > + if (flag_diagnostics_use_aliases) > > +{ > > + tree attr = lookup_attribute ("diagnose_as", DECL_ATTRIBUTES > > (decl)); + if (attr && TREE_VALUE (attr)) > > + { > > + pp_cxx_ws_string ( > > + pp, TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr; > > This pattern is used several places outside this function; can we factor > it into something like > > if (maybe_print_diagnose_as (special)) >/* OK */; Yes, I added the functions lookup_diagnose_as_attribute and dump_diagnose_as_alias to remove code duplication. > Missing space before ( OK. I think I found and fixed all of them. > > + if (tmplate) > > + TREE_VALUE (*parms) = make_tree_vec (0); > > This could use a comment. Added. > > (dump_aggr_type): If the type has a diagnose_as attribute, print > > the associated string instead of printing the original type > > name. Print template parms only if the attribute was not applied > > to the instantiation / full specialization. Delay call to > > dump_scope until the diagnose_as attribute is found. If the > > attribute has a second argument, use it to override the context > > passed to dump_scope. 
> > > > + for (int i = 0; i < NUM_TMPL_ARGS (args); ++i) > > + { > > + tree arg = TREE_VEC_ELT (args, i); > > + while (INDIRECT_TYPE_P (arg)) > > + arg = TREE_TYPE (arg); > > + if (WILDCARD_TYPE_P (arg)) > > + { > > + tmplate = true; > > + break; > > + } > > + } > > I think you want any_dependent_template_args_p (args) Yes, except that I need `++pr
[PATCH v3] c-family: Add __builtin_assoc_barrier
On Wednesday, 8 September 2021 15:49:27 CET Matthias Kretz wrote:
> On Wednesday, 8 September 2021 15:44:28 CEST Jason Merrill wrote:
> > On 9/8/21 5:37 AM, Matthias Kretz wrote:
> > > On Tuesday, 7 September 2021 19:36:22 CEST Jason Merrill wrote:
> > >>>       case PAREN_EXPR:
> > >>> -       RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t, 0))));
> > >>> +       if (REF_PARENTHESIZED_P (t))
> > >>> +         RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t, 0))));
> > >>> +       else
> > >>> +         RETURN (RECUR (TREE_OPERAND (t, 0)));
> > >>
> > >> I think you need to build a new PAREN_EXPR in the assoc barrier case as
> > >> well, for it to have any effect in templates.
> > >
> > > My intent was to ignore __builtin_assoc_barrier in templates / constexpr
> > > evaluation since it's not affected by -fassociative-math anyway. Or do
> > > you mean something else?
> >
> > I agree about constexpr, but why wouldn't template instantiations be
> > affected by -fassociative-math like any other function?
>
> Oh, that seems like a major misunderstanding on my part. I assumed
> tsubst_copy_and_build would evaluate the expressions in template arguments.
> I'll expand the test and will fix.

Sorry for the long delay. New patch is attached. OK for trunk?


New builtin to enable explicit use of PAREN_EXPR in C & C++ code.

Signed-off-by: Matthias Kretz

gcc/testsuite/ChangeLog:

        * c-c++-common/builtin-assoc-barrier-1.c: New test.

gcc/cp/ChangeLog:

        * constexpr.c (cxx_eval_constant_expression): Handle PAREN_EXPR
        via cxx_eval_constant_expression.
        * cp-objcp-common.c (names_builtin_p): Handle
        RID_BUILTIN_ASSOC_BARRIER.
        * cp-tree.h: Adjust TREE_LANG_FLAG documentation to include
        PAREN_EXPR in REF_PARENTHESIZED_P.
        (REF_PARENTHESIZED_P): Add PAREN_EXPR.
        * parser.c (cp_parser_postfix_expression): Handle
        RID_BUILTIN_ASSOC_BARRIER.
        * pt.c (tsubst_copy_and_build): If the PAREN_EXPR is not a
        parenthesized initializer, build a new PAREN_EXPR.
* semantics.c (force_paren_expr): Simplify conditionals. Set REF_PARENTHESIZED_P on PAREN_EXPR. (maybe_undo_parenthesized_ref): Test PAREN_EXPR for REF_PARENTHESIZED_P. gcc/c-family/ChangeLog: * c-common.c (c_common_reswords): Add __builtin_assoc_barrier. * c-common.h (enum rid): Add RID_BUILTIN_ASSOC_BARRIER. gcc/c/ChangeLog: * c-decl.c (names_builtin_p): Handle RID_BUILTIN_ASSOC_BARRIER. * c-parser.c (c_parser_postfix_expression): Likewise. gcc/ChangeLog: * doc/extend.texi: Document __builtin_assoc_barrier. --- gcc/c-family/c-common.c | 1 + gcc/c-family/c-common.h | 2 +- gcc/c/c-decl.c| 1 + gcc/c/c-parser.c | 20 ++ gcc/cp/constexpr.c| 8 +++ gcc/cp/cp-objcp-common.c | 1 + gcc/cp/cp-tree.h | 12 ++-- gcc/cp/parser.c | 14 gcc/cp/pt.c | 10 ++- gcc/cp/semantics.c| 23 ++ gcc/doc/extend.texi | 18 + .../c-c++-common/builtin-assoc-barrier-1.c | 71 +++ 12 files changed, 158 insertions(+), 23 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/builtin-assoc-barrier-1.c -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de stdₓ::simd ── diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index 436df45df68..dd2a3d5da9e 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -384,6 +384,7 @@ const struct c_common_resword c_common_reswords[] = { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 }, { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 }, { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY }, + { "__builtin_assoc_barrier", RID_BUILTIN_ASSOC_BARRIER, 0 }, { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 }, { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 }, { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY }, diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index d5dad99ff97..c089fda12e4 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -108,7 +108,7 @@ enum rid
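As a usage illustration for the builtin added by this patch, here is a sketch of compensated (Kahan) summation, a classic case where reassociation under -fassociative-math can cancel the correction term. The `DEMO_ASSOC_BARRIER` macro and `kahan_sum` are hypothetical names; the macro falls back to plain parentheses where `__builtin_assoc_barrier` is not reported as available. Whether two barriers are the minimal placement is an assumption of this sketch, not something taken from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Use the builtin only when the compiler reports it.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assoc_barrier)
#    define DEMO_ASSOC_BARRIER(x) __builtin_assoc_barrier(x)
#  endif
#endif
#ifndef DEMO_ASSOC_BARRIER
#  define DEMO_ASSOC_BARRIER(x) (x)
#endif

// Kahan summation as a template, so the barrier has to survive template
// instantiation (the point raised in the review above).
template <class T>
T kahan_sum(const T* data, std::size_t n) {
  T sum = T(0);
  T comp = T(0);  // running compensation for lost low-order bits
  for (std::size_t i = 0; i < n; ++i) {
    const T y = data[i] - comp;
    // Keep sum + y as one unit so consumers cannot reassociate into it.
    const T t = DEMO_ASSOC_BARRIER(sum + y);
    // Algebraically (t - sum) - y is zero; numerically it is the rounding
    // error of the addition above, which is what we want to keep.
    comp = DEMO_ASSOC_BARRIER(t - sum) - y;
    sum = t;
  }
  return sum;
}
```

Without -fassociative-math (or -ffast-math) the barriers change nothing, so the function behaves like ordinary Kahan summation.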
Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)
I forgot to mention why I tagged it [RFC]: I needed one more bit of information on the template args TREE_VEC to encode EXPLICIT_TEMPLATE_ARGS_P. Its TREE_CHAIN already points to an integer constant denoting the number of non-default arguments, so I couldn't trivially replace that. Therefore, I used the sign of that integer. I was hoping to find a cleaner solution, though. -Matthias On Monday, 8 November 2021 17:40:44 CET Matthias Kretz wrote: > On Tuesday, 17 August 2021 20:31:54 CET Jason Merrill wrote: > > > 2. Given a DECL_TI_ARGS tree, can I query whether an argument was > > > deduced > > > or explicitly specified? I'm asking because I still consider diagnostics > > > of function templates unfortunate. `template void f()` is > > > fine, > > > as is `void f(T) [with T = float]`, but `void f() [with T = float]` > > > could > > > be better. I.e. if the template parameter appears somewhere in the > > > function parameter list, dump_template_parms would only produce noise. > > > If, however, the template parameter was given explicitly, it would be > > > nice if it could show up accordingly in diagnostics. > > > > NON_DEFAULT_TEMPLATE_ARGS_COUNT has that information, though there are > > some issues with it. Attached is my WIP from May to improve it > > somewhat, if that's interesting. > > It is interesting. I used your patch to come up with the attached. Patch. I > must say, I didn't try to read through all the cp/pt.c code to understand > all of what you did there (which is why my ChangeLog entry says "Jason?"), > but it works for me (and all of `make check`). > > Anyway, I'd like to propose the following before finishing my diagnose_as > patch. I believe it's useful to fix this part first. The diagnostic/default- > template-args-[12].C tests show a lot of examples of the intent of this > patch. And the remaining changes to the testsuite show how it changes > diagnostic output. 
> > -- 8< > > The choice when to print a function template parameter was still > suboptimal. That's because sometimes the function template parameter > list only adds noise, while in other situations the lack of a function > template parameter list makes diagnostic messages hard to understand. > > The general idea of this change is to print template parms wherever they > would appear in the source code as well. Thus, the diagnostics code > needs to know whether any template parameter was given explicitly. > > Signed-off-by: Matthias Kretz > > gcc/testsuite/ChangeLog: > > * g++.dg/debug/dwarf2/template-params-12n.C: Optionally, allow > DW_AT_default_value. > * g++.dg/diagnostic/default-template-args-1.C: New. > * g++.dg/diagnostic/default-template-args-2.C: New. > * g++.dg/diagnostic/param-type-mismatch-2.C: Expect template > parms in diagnostic. > * g++.dg/ext/pretty1.C: Expect function template specialization > to not pretty-print template parms. > * g++.old-deja/g++.ext/pretty3.C: Ditto. > * g++.old-deja/g++.pt/memtemp77.C: Ditto. > * g++.dg/goacc/template.C: Expect function template parms for > explicit arguments. > * g++.dg/gomp/declare-variant-7.C: Expect no function template > parms for deduced arguments. > * g++.dg/template/error40.C: Expect only non-default template > arguments in diagnostic. > > gcc/cp/ChangeLog: > > * cp-tree.h (GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT): Return > absolute value of stored constant. > (EXPLICIT_TEMPLATE_ARGS_P): New. > (SET_EXPLICIT_TEMPLATE_ARGS_P): New. > (TFF_AS_PRIMARY): New constant. > * error.c (get_non_default_template_args_count): Avoid > GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT if > NON_DEFAULT_TEMPLATE_ARGS_COUNT is a NULL_TREE. Make independent > of flag_pretty_templates. > (dump_template_bindings): Add flags parameter to be passed to > get_non_default_template_args_count. Print only non-default > template arguments. 
> (dump_function_decl): Call dump_function_name and dump_type of > the DECL_CONTEXT with specialized template and set > TFF_AS_PRIMARY for their flags. > (dump_function_name): Add and document conditions for calling > dump_template_parms. > (dump_template_parms): Print only non-default template > parameters. > * pt.c (determine_specialization): Jason? > (template_parms_level_to_args): Jason? > (copy_template_args): Jason? > (fn_type_unification): Set EXPLICIT_TEMPL
[RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)
On Tuesday, 17 August 2021 20:31:54 CET Jason Merrill wrote:
> > 2. Given a DECL_TI_ARGS tree, can I query whether an argument was deduced
> > or explicitly specified? I'm asking because I still consider diagnostics
> > of function templates unfortunate. `template void f()` is fine,
> > as is `void f(T) [with T = float]`, but `void f() [with T = float]` could
> > be better. I.e. if the template parameter appears somewhere in the
> > function parameter list, dump_template_parms would only produce noise.
> > If, however, the template parameter was given explicitly, it would be
> > nice if it could show up accordingly in diagnostics.
>
> NON_DEFAULT_TEMPLATE_ARGS_COUNT has that information, though there are
> some issues with it. Attached is my WIP from May to improve it
> somewhat, if that's interesting.

It is interesting. I used your patch to come up with the attached patch. I
must say, I didn't try to read through all the cp/pt.c code to understand all
of what you did there (which is why my ChangeLog entry says "Jason?"), but it
works for me (and all of `make check`).

Anyway, I'd like to propose the following before finishing my diagnose_as
patch. I believe it's useful to fix this part first. The
diagnostic/default-template-args-[12].C tests show a lot of examples of the
intent of this patch. And the remaining changes to the testsuite show how it
changes diagnostic output.

-- 8< --

The choice when to print a function template parameter was still
suboptimal. That's because sometimes the function template parameter
list only adds noise, while in other situations the lack of a function
template parameter list makes diagnostic messages hard to understand.

The general idea of this change is to print template parms wherever they
would appear in the source code as well. Thus, the diagnostics code
needs to know whether any template parameter was given explicitly.

Signed-off-by: Matthias Kretz gcc/testsuite/ChangeLog: * g++.dg/debug/dwarf2/template-params-12n.C: Optionally, allow DW_AT_default_value. * g++.dg/diagnostic/default-template-args-1.C: New. * g++.dg/diagnostic/default-template-args-2.C: New. * g++.dg/diagnostic/param-type-mismatch-2.C: Expect template parms in diagnostic. * g++.dg/ext/pretty1.C: Expect function template specialization to not pretty-print template parms. * g++.old-deja/g++.ext/pretty3.C: Ditto. * g++.old-deja/g++.pt/memtemp77.C: Ditto. * g++.dg/goacc/template.C: Expect function template parms for explicit arguments. * g++.dg/gomp/declare-variant-7.C: Expect no function template parms for deduced arguments. * g++.dg/template/error40.C: Expect only non-default template arguments in diagnostic. gcc/cp/ChangeLog: * cp-tree.h (GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT): Return absolute value of stored constant. (EXPLICIT_TEMPLATE_ARGS_P): New. (SET_EXPLICIT_TEMPLATE_ARGS_P): New. (TFF_AS_PRIMARY): New constant. * error.c (get_non_default_template_args_count): Avoid GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT if NON_DEFAULT_TEMPLATE_ARGS_COUNT is a NULL_TREE. Make independent of flag_pretty_templates. (dump_template_bindings): Add flags parameter to be passed to get_non_default_template_args_count. Print only non-default template arguments. (dump_function_decl): Call dump_function_name and dump_type of the DECL_CONTEXT with specialized template and set TFF_AS_PRIMARY for their flags. (dump_function_name): Add and document conditions for calling dump_template_parms. (dump_template_parms): Print only non-default template parameters. * pt.c (determine_specialization): Jason? (template_parms_level_to_args): Jason? (copy_template_args): Jason? (fn_type_unification): Set EXPLICIT_TEMPLATE_ARGS_P on the template arguments tree if any template parameter was explicitly given. (type_unification_real): Jason? (get_partial_spec_bindings): Jason? 
(tsubst_template_args): Determine number of defaulted arguments from new argument vector, if possible. --- gcc/cp/cp-tree.h | 18 +++- gcc/cp/error.c| 83 ++- gcc/cp/pt.c | 58 + .../g++.dg/debug/dwarf2/template-params-12n.C | 2 +- .../diagnostic/default-template-args-1.C | 73 .../diagnostic/default-template-args-2.C | 37 + .../g++.dg/diagnostic/param-type-mismatch-2.C | 2 +- gcc/testsuite/g++.dg/ext/pretty1.C| 2 +- gcc/testsuite/g++.dg/goacc/template.C | 8 +- gcc/testsuite/g++.dg/gomp/declare-varian
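The distinction this RFC is about, deduced versus explicitly given template arguments, can be seen in a small standalone example. The names `demo::size_of_arg` and `demo::size_of_type` are made up for illustration, and the diagnostic spellings in the comments paraphrase the forms discussed in the thread rather than quoting actual compiler output.

```cpp
#include <cassert>
#include <cstddef>

namespace demo {

// T appears in the function parameter list, so callers let it be deduced.
// Per the thread, a form like `size_of_arg(T) [with T = float]` already
// says everything; an extra template argument list would only add noise.
template <class T>
std::size_t size_of_arg(T) {
  return sizeof(T);
}

// T appears nowhere in the parameter list, so it has to be written out at
// the call site. Printing it as the user wrote it, e.g.
// `size_of_type<float>()`, is the behaviour the patch aims for.
template <class T>
std::size_t size_of_type() {
  return sizeof(T);
}

}  // namespace demo
```

A call such as `demo::size_of_arg(1.0f)` deduces `T = float`, while `demo::size_of_type<float>()` names it explicitly; the patch records the latter case with EXPLICIT_TEMPLATE_ARGS_P so that the diagnostics code can tell the two apart.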
Re: [PATCH v2] c-family: Add __builtin_assoc_barrier
On Wednesday, 8 September 2021 15:44:28 CEST Jason Merrill wrote:
> On 9/8/21 5:37 AM, Matthias Kretz wrote:
> > On Tuesday, 7 September 2021 19:36:22 CEST Jason Merrill wrote:
> >>>       case PAREN_EXPR:
> >>> -       RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t, 0))));
> >>> +       if (REF_PARENTHESIZED_P (t))
> >>> +         RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t, 0))));
> >>> +       else
> >>> +         RETURN (RECUR (TREE_OPERAND (t, 0)));
> >>
> >> I think you need to build a new PAREN_EXPR in the assoc barrier case as
> >> well, for it to have any effect in templates.
> >
> > My intent was to ignore __builtin_assoc_barrier in templates / constexpr
> > evaluation since it's not affected by -fassociative-math anyway. Or do you
> > mean something else?
>
> I agree about constexpr, but why wouldn't template instantiations be
> affected by -fassociative-math like any other function?

Oh, that seems like a major misunderstanding on my part. I assumed
tsubst_copy_and_build would evaluate the expressions in template arguments.
I'll expand the test and will fix.
Re: [PATCH v2] c-family: Add __builtin_assoc_barrier
On Tuesday, 7 September 2021 19:36:22 CEST Jason Merrill wrote:
> >      case PAREN_EXPR:
> > -      RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t, 0))));
> > +      if (REF_PARENTHESIZED_P (t))
> > +        RETURN (finish_parenthesized_expr (RECUR (TREE_OPERAND (t, 0))));
> > +      else
> > +        RETURN (RECUR (TREE_OPERAND (t, 0)));
>
> I think you need to build a new PAREN_EXPR in the assoc barrier case as
> well, for it to have any effect in templates.

My intent was to ignore __builtin_assoc_barrier in templates / constexpr
evaluation since it's not affected by -fassociative-math anyway. Or do you
mean something else?

> Please also add a comment mentioning __builtin_assoc_barrier.

I added a comment to that effect to both the cp/pt.c and cp/constexpr.c
changes.

New patch attached.

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 stdₓ::simd
──

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 681fcc972f4..c62a6398a47 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -384,6 +384,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
   { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
   { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
+  { "__builtin_assoc_barrier", RID_BUILTIN_ASSOC_BARRIER, 0 },
   { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
   { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 },
   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 50ca8fb6ebd..f34dc47c2ba 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -108,7 +108,7 @@ enum rid
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL, RID_CHOOSE_EXPR,
   RID_TYPES_COMPATIBLE_P, RID_BUILTIN_COMPLEX, RID_BUILTIN_SHUFFLE,
   RID_BUILTIN_SHUFFLEVECTOR, RID_BUILTIN_CONVERTVECTOR, RID_BUILTIN_TGMATH,
-  RID_BUILTIN_HAS_ATTRIBUTE,
+  RID_BUILTIN_HAS_ATTRIBUTE, RID_BUILTIN_ASSOC_BARRIER,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
 
   /* TS 18661-3 keywords, in the same sequence as the TI_* values. */
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 983d65e930c..dcf4a2d7c32 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -10557,6 +10557,7 @@ names_builtin_p (const char *name)
     case RID_BUILTIN_HAS_ATTRIBUTE:
     case RID_BUILTIN_SHUFFLE:
     case RID_BUILTIN_SHUFFLEVECTOR:
+    case RID_BUILTIN_ASSOC_BARRIER:
     case RID_CHOOSE_EXPR:
     case RID_OFFSETOF:
     case RID_TYPES_COMPATIBLE_P:
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9a56e0c04c6..fffd81f4e5b 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -8931,6 +8931,7 @@ c_parser_predefined_identifier (c_parser *parser)
                              assignment-expression ,
                              assignment-expression, )
      __builtin_convertvector ( assignment-expression , type-name )
+     __builtin_assoc_barrier ( assignment-expression )
 
    offsetof-member-designator:
      identifier
@@ -10076,6 +10077,25 @@ c_parser_postfix_expression (c_parser *parser)
               }
           }
           break;
+        case RID_BUILTIN_ASSOC_BARRIER:
+          {
+            location_t start_loc = loc;
+            c_parser_consume_token (parser);
+            matching_parens parens;
+            if (!parens.require_open (parser))
+              {
+                expr.set_error ();
+                break;
+              }
+            e1 = c_parser_expr_no_commas (parser, NULL);
+            mark_exp_read (e1.value);
+            location_t end_loc = c_parser_peek_token (parser)->get_finish ();
+            parens.skip_until_found_close (parser);
+            expr.value = build1_loc (loc, PAREN_EXPR, TREE_TYPE (e1.value),
+                                     e1.value);
+            set_c_expr_source_range (&expr, start_loc, end_loc);
+          }
+          break;
         case RID_AT_SELECTOR:
           {
             gcc_assert (c_dialect_objc ());
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 31fa5b66865..6e964837d24 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -6730,6 +6730,14 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
                                        non_constant_p, overflow_p);
       break;
 
+    case PAREN_EXPR:
+      gcc_assert (!REF_PARENTHESIZED_P (t));
+      /* A PAREN_EXPR resulting from __builtin_assoc_barrier has no effect in
+         constant expressions since it's unaffected by -fassociative-math. */
+      r = cxx_eval_constant_expression (ctx, TREE_OPERAND (t, 0), lval,
+                                        non_constant_p, overflow_p);
+      break;
+
     case NOP_EXPR:
       if (REINTERPRET_CAST_P (t))
         {
diff --git a/gcc/cp/cp-objcp-common.c b/gcc/cp/cp-objcp-common.c
index ee255732d5a..04522a23eda 100644
--- a/gcc/cp/cp-objcp-common.c
+++ b
Re: [PATCH v2] c-family: Add __builtin_assoc_barrier
On Monday, 6 September 2021 14:59:27 CEST Richard Biener wrote:
> On Mon, 6 Sep 2021, Matthias Kretz wrote:
> > On Monday, 6 September 2021 14:40:31 CEST Richard Biener wrote:
> > > I'll note that currently a + PAREN_EXPR (b * c) is for example
> > > also not contracted to PAREN_EXPR (FMA (PAREN_EXPR (a), b, c))
> > > even though technically FP contraction is not association. But
> > > that's an implementation detail that could be changed. There
> > > are likely other transforms that it prevents as well that are
> > > not associations, the implementation focus was correctness
> > > as to preventing association, not so much not hindering
> > > unrelated optimizations. If you run into any such issues
> > > reporting a bugzilla would be welcome.
> >
> > Thanks, interesting point. I believe it might even be useful to nail down
> > that behavior (i.e. document it and write a test). Because a + b * c
> > evaluates b * c before the addition in any case. So why would anyone add
> > a PAREN_EXPR around b * c?
>
> At least for integers we have transforms that do a + a * c -> a * (1 + c)
> so one could think of (x/y) + (x/y)*c -> (x/y) * (1 + c) which would
> then have associated the c * (x/y) multiplication ... Or when
> c is constant then a + a * C can be simplified.

Right, given float a, `a + 2.1f * a` is compiled to `3.1f * a` with
-ffast-math. So yes, there's a reason one might want
`a + __builtin_assoc_barrier(2.1f * a)` without inhibiting contraction. I'll
investigate more and might submit a PR...

> > We have (std::)fma (__builtin_fma) to explicitly request contraction.
> > PAREN_EXPR seems like a good fit to inhibit contraction.
>
> OK, I guess it should apply to PAREN_EXPR (a + a) + a as well
> which then does not become 3 * PAREN_EXPR (a). Likewise
> PAREN_EXPR (a) - a might eventually not become zero (I'm not
> absolutely sure about that ;))

Just tested it. PAREN_EXPR inhibits both transformations.

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 stdₓ::simd
──
Re: [PATCH v2] c-family: Add __builtin_assoc_barrier
On Monday, 6 September 2021 14:40:31 CEST Richard Biener wrote:
> I'll note that currently a + PAREN_EXPR (b * c) is for example
> also not contracted to PAREN_EXPR (FMA (PAREN_EXPR (a), b, c))
> even though technically FP contraction is not association. But
> that's an implementation detail that could be changed. There
> are likely other transforms that it prevents as well that are
> not associations, the implementation focus was correctness
> as to preventing association, not so much not hindering
> unrelated optimizations. If you run into any such issues
> reporting a bugzilla would be welcome.

Thanks, interesting point. I believe it might even be useful to nail down that
behavior (i.e. document it and write a test). Because a + b * c evaluates b * c
before the addition in any case. So why would anyone add a PAREN_EXPR around
b * c?

We have (std::)fma (__builtin_fma) to explicitly request contraction.
PAREN_EXPR seems like a good fit to inhibit contraction.

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 stdₓ::simd
──
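As a side note on why contraction is worth controlling at all: a fused multiply-add rounds once, while a separate multiply and add round twice, so the two forms can produce different results. The sketch below uses only standard `std::fma`; the function name `square_rounding_error` is hypothetical and the example is an illustration, not part of the patch.

```cpp
#include <cmath>

// Recover the rounding error of x * x by explicitly requesting a fused
// multiply-add. p is the rounded product; std::fma evaluates x * x - p with
// a single rounding, which exposes the bits the plain multiplication dropped.
inline double square_rounding_error(double x) {
  const double p = x * x;
  return std::fma(x, x, -p);
}
```

For x = 1 + 2^-30 the exact square is 1 + 2^-29 + 2^-60; the last term does not fit into a double next to 1, so the function returns 2^-60. If the compiler silently contracted or un-contracted such expressions, results like this would change, which is the reason for wanting an explicit way to request contraction (fma) and an explicit way to inhibit it.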
[PATCH v2] c-family: Add __builtin_assoc_barrier
Hi,

On Tuesday, 20 July 2021 22:22:02 CEST Jason Merrill wrote:
> The C++ front end already uses PAREN_EXPR in templates to indicate
> parenthesized initializers in cases where that matters for
> decltype(auto). It should be fine to use it for both that and
> __builtin_assoc_barrier, but you probably want to distinguish them with
> a TREE_LANG_FLAG, and change tsubst_copy_and_build to keep the
> PAREN_EXPR in this case.

I reused REF_PARENTHESIZED_P for PAREN_EXPR.

> For constexpr you probably just need to add handling to
> cxx_eval_constant_expression to evaluate its operand instead.

OK, that was easy.

On Monday, 19 July 2021 14:34:12 CEST Richard Biener wrote:
> On Mon, 19 Jul 2021, Matthias Kretz wrote:
> > tested on x86_64-pc-linux-gnu with no new failures. OK for master?
>
> I think now that PAREN_EXPR can appear in C++ code you need to
> adjust some machinery to expect it (constexpr folding? template stuff?).
> I suggest to add some testcases covering templates and constexpr
> functions.

Right. I expanded the test.

> +@deftypefn {Built-in Function} @var{type} __builtin_assoc_barrier (@var{type} @var{expr})
> +This built-in represents a re-association barrier for the floating-point
> +expression @var{expr} with operations following the built-in. The expression
> +@var{expr} itself can be reordered, and the whole expression @var{expr} can be
> +reordered with operations after the barrier.
>
> What operations follow the built-in also applies to operations leading
> the builtin? Maybe "This built-in represents a re-association barrier
> for the floating-point expression @var{expr} with the expression
> consuming its value." But I'm not an English speaker - I guess
> I'm mostly confused about "follow" here.

With "follow" I meant time / precedence and not that the operation follows
syntactically. So e.g. a + b * c: the addition follows after the
multiplication. It's probably not as precise as it could/should be. Also "the
whole expression @var{expr} can be reordered with operations after the
barrier" probably should say "with operands" not "with operations", right?

> I'm not sure if there are better C/C++ language terms describing what
> the builtin does, but basically it appears as opaque operand to the
> surrounding expression and the surrounding expression is opaque
> to the expression inside the parens.

I can't think of any other term that would help here. Based upon your
suggestion, the attached patch now says:

"This built-in inhibits re-association of the floating-point expression
@var{expr} with expressions consuming the return value of the built-in. The
expression @var{expr} itself can be reordered, and the whole expression
@var{expr} can be reordered with operands after the barrier. [...]"

New patch attached. OK to push?

---

New builtin to enable explicit use of PAREN_EXPR in C & C++ code.

Signed-off-by: Matthias Kretz

gcc/testsuite/ChangeLog:

        * c-c++-common/builtin-assoc-barrier-1.c: New test.

gcc/cp/ChangeLog:

        * constexpr.c (cxx_eval_constant_expression): Handle PAREN_EXPR
        via cxx_eval_constant_expression.
        * cp-objcp-common.c (names_builtin_p): Handle
        RID_BUILTIN_ASSOC_BARRIER.
        * cp-tree.h: Adjust TREE_LANG_FLAG documentation to include
        PAREN_EXPR in REF_PARENTHESIZED_P.
        (REF_PARENTHESIZED_P): Add PAREN_EXPR.
        * parser.c (cp_parser_postfix_expression): Handle
        RID_BUILTIN_ASSOC_BARRIER.
        * pt.c (tsubst_copy_and_build): If the PAREN_EXPR is not a
        parenthesized initializer, evaluate by ignoring the PAREN_EXPR.
        * semantics.c (force_paren_expr): Simplify conditionals. Set
        REF_PARENTHESIZED_P on PAREN_EXPR.
        (maybe_undo_parenthesized_ref): Test PAREN_EXPR for
        REF_PARENTHESIZED_P.

gcc/c-family/ChangeLog:

        * c-common.c (c_common_reswords): Add __builtin_assoc_barrier.
        * c-common.h (enum rid): Add RID_BUILTIN_ASSOC_BARRIER.

gcc/c/ChangeLog:

        * c-decl.c (names_builtin_p): Handle RID_BUILTIN_ASSOC_BARRIER.
        * c-parser.c (c_parser_postfix_expression): Likewise.

gcc/ChangeLog:

        * doc/extend.texi: Document __builtin_assoc_barrier.
---
 gcc/c-family/c-common.c  |  1 +
 gcc/c-family/c-common.h  |  2 +-
 gcc/c/c-decl.c           |  1 +
 gcc/c/c-parser.c         | 20
 gcc/cp/constexpr.c       |  6 +++
 gcc/cp/cp-objcp-common.c |  1 +
 gcc/cp/cp-tree.h         | 12 +++--
 gcc/cp/parser.c          | 14 ++
 gcc/cp/pt.c              |  5 +-
 gcc/cp/semantics.c
ping-3: [PATCH] c-family: Add more predefined macros for math flags
OK?

On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote:
> Library code, especially in headers, sometimes needs to know how the
> compiler interprets / optimizes floating-point types and operations.
> This information can be used for additional optimizations or for
> ensuring correctness. This change makes -freciprocal-math,
> -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and
> -frounding-math report their state via corresponding pre-defined macros.
>
> Signed-off-by: Matthias Kretz
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/associative-math-1.c: New test.
>         * gcc.dg/associative-math-2.c: New test.
>         * gcc.dg/no-signed-zeros-1.c: New test.
>         * gcc.dg/no-signed-zeros-2.c: New test.
>         * gcc.dg/no-trapping-math-1.c: New test.
>         * gcc.dg/no-trapping-math-2.c: New test.
>         * gcc.dg/reciprocal-math-1.c: New test.
>         * gcc.dg/reciprocal-math-2.c: New test.
>         * gcc.dg/rounding-math-1.c: New test.
>         * gcc.dg/rounding-math-2.c: New test.
>
> gcc/c-family/ChangeLog:
>
>         * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or
>         undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
>         __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
>         __ROUNDING_MATH__ according to the new optimization flags.
>
> gcc/ChangeLog:
>
>         * cppbuiltin.c (define_builtin_macros_for_compilation_flags):
>         Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
>         __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
>         __ROUNDING_MATH__ according to their corresponding flags.
>         * doc/cpp.texi: Document __RECIPROCAL_MATH__,
>         __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__,
>         and __ROUNDING_MATH__.
> ---
>  gcc/c-family/c-cppbuiltin.c               | 25 +++
>  gcc/cppbuiltin.c                          | 10 +
>  gcc/doc/cpp.texi                          | 18
>  gcc/testsuite/gcc.dg/associative-math-1.c | 17 +++
>  gcc/testsuite/gcc.dg/associative-math-2.c | 17 +++
>  gcc/testsuite/gcc.dg/no-signed-zeros-1.c  | 17 +++
>  gcc/testsuite/gcc.dg/no-signed-zeros-2.c  | 17 +++
>  gcc/testsuite/gcc.dg/no-trapping-math-1.c | 17 +++
>  gcc/testsuite/gcc.dg/no-trapping-math-2.c | 17 +++
>  gcc/testsuite/gcc.dg/reciprocal-math-1.c  | 17 +++
>  gcc/testsuite/gcc.dg/reciprocal-math-2.c  | 17 +++
>  gcc/testsuite/gcc.dg/rounding-math-1.c    | 17 +++
>  gcc/testsuite/gcc.dg/rounding-math-2.c    | 17 +++
>  13 files changed, 223 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/associative-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/associative-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/rounding-math-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/rounding-math-2.c

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index f79f939bd10..671af04b1f8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -628,6 +628,31 @@ c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree,
       cpp_undef (pfile, "__FINITE_MATH_ONLY__");
       cpp_define_unused (pfile, "__FINITE_MATH_ONLY__=0");
     }
+
+  if (!prev->x_flag_reciprocal_math && cur->x_flag_reciprocal_math)
+    cpp_define_unused (pfile, "__RECIPROCAL_MATH__");
+  else if (prev->x_flag_reciprocal_math && !cur->x_flag_reciprocal_math)
+    cpp_undef (pfile, "__RECIPROCAL_MATH__");
+
+  if (!prev->x_flag_signed_zeros && cur->x_flag_signed_zeros)
+    cpp_undef (pfile, "__NO_SIGNED_ZEROS__");
+  else if (prev->x_flag_signed_zeros && !cur->x_flag_signed_zeros)
+    cpp_define_unused (pfile, "__NO_SIGNED_ZEROS__");
+
+  if (!prev->x_flag_trapping_math && cur->x_flag_trapping_math)
+    cpp_undef (pfile, "__NO_TRAPPING_MATH__");
+  else if (prev->x_
[PATCH v4] c++: Add gnu::diagnose_as attribute
Hi Jason,

I found a few regressions from the last patch in the meantime. Version 4 of
the patch is attached.

Questions:

1. I simplified the condition for calling dump_template_parms in
   dump_function_name. !DECL_FRIEND_PSEUDO_TEMPLATE_INSTANTIATION (t) is
   equivalent to DECL_USE_TEMPLATE (t) in this context; implying that
   dump_template_parms is unconditionally called with `primary = false`. Or am
   I missing something?

2. Given a DECL_TI_ARGS tree, can I query whether an argument was deduced or
   explicitly specified? I'm asking because I still consider diagnostics of
   function templates unfortunate. `template void f()` is fine, as is
   `void f(T) [with T = float]`, but `void f() [with T = float]` could be
   better. I.e. if the template parameter appears somewhere in the function
   parameter list, dump_template_parms would only produce noise. If, however,
   the template parameter was given explicitly, it would be nice if it could
   show up accordingly in diagnostics.

3. When parsing tentatively and the parse is rejected, input_location is not
   reset, correct? In the attached patch I therefore made
   cp_parser_namespace_alias_definition reset input_location on a failed
   tentative parse. But it feels wrong. Shouldn't input_location be restored
   on cp_parser_parse_definitely?

--

This attribute overrides the diagnostics output string for the entity it
appertains to. The motivation is to improve QoI for library TS
implementations, where diagnostics have a very bad signal-to-noise ratio due
to the long namespaces involved.

With the attribute, it is possible to solve PR89370 and make
std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as std::string in
diagnostic output without extra hacks to recognize the type in the C++
frontend.

Signed-off-by: Matthias Kretz

gcc/ChangeLog:

        PR c++/89370
        * doc/extend.texi: Document the diagnose_as attribute.
        * doc/invoke.texi: Document -fno-diagnostics-use-aliases.

gcc/c-family/ChangeLog:

        PR c++/89370
        * c.opt (fdiagnostics-use-aliases): New diagnostics flag.

gcc/cp/ChangeLog:

        PR c++/89370
        * cp-tree.h: Add is_alias_template_p declaration.
        * decl2.c (is_alias_template_p): New function. Determines whether
        a given TYPE_DECL is actually an alias template that is still
        missing its template_info.
        (is_late_template_attribute): Decls with diagnose_as attribute are
        early attributes only if they are alias templates.
        * error.c (dump_scope): When printing the name of a namespace,
        look for the diagnose_as attribute. If found, print the associated
        string instead of calling dump_decl.
        (dump_decl_name_or_diagnose_as): New function to replace
        dump_decl (pp, DECL_NAME(t), flags) and inspect the tree for the
        diagnose_as attribute before printing the DECL_NAME.
        (dump_template_scope): New function. Prints the scope of a
        template instance correctly applying diagnose_as attributes and
        adjusting the list of template parms accordingly.
        (dump_aggr_type): If the type has a diagnose_as attribute, print
        the associated string instead of printing the original type name.
        Print template parms only if the attribute was not applied to the
        instantiation / full specialization. Delay call to dump_scope
        until the diagnose_as attribute is found. If the attribute has a
        second argument, use it to override the context passed to
        dump_scope.
        (dump_simple_decl): Call dump_decl_name_or_diagnose_as instead of
        dump_decl.
        (dump_decl): Ditto.
        (lang_decl_name): Ditto.
        (dump_function_decl): Walk the function's context list to
        determine whether a call to dump_template_scope is required.
        Ensure function templates diagnosed with pretty templates set
        TFF_TEMPLATE_NAME to skip dump_template_parms.
        (dump_function_name): Replace the function's identifier with the
        diagnose_as attribute value, if set. Expand
        DECL_FRIEND_PSEUDO_TEMPLATE_INSTANTIATION to DECL_USE_TEMPLATE
        and consequently call dump_template_parms with primary = false.
        (comparable_template_types_p): Consider the types not a template
        if one carries a diagnose_as attribute.
        (print_template_differences): Replace the identifier with the
        diagnose_as attribute value on the most general template, if it
        is set.
        * name-lookup.c (handle_namespace_attrs): Handle the diagnose_as
        attribute on namespaces. Ensure exactly one string argument.
        Ensure previous diagnose_as attributes used the same name.
        'diagnose_as' on namespace aliases is forwarded to the original
        namespace. Support no-argument 'diagnose_as' on namespace aliases.
        (do_namespace_alias): Add attributes parameter and call
        handle_namespace_attrs.
        * name-lookup.h (do_namespace_alias
[PATCH] c-family: Add __builtin_assoc_barrier
tested on x86_64-pc-linux-gnu with no new failures. OK for master?

New builtin to enable explicit use of PAREN_EXPR in C & C++ code.

Signed-off-by: Matthias Kretz

gcc/testsuite/ChangeLog:

        * c-c++-common/builtin-assoc-barrier-1.c: New test.

gcc/cp/ChangeLog:

        * cp-objcp-common.c (names_builtin_p): Handle
        RID_BUILTIN_ASSOC_BARRIER.
        * parser.c (cp_parser_postfix_expression): Handle
        RID_BUILTIN_ASSOC_BARRIER.

gcc/c-family/ChangeLog:

        * c-common.c (c_common_reswords): Add __builtin_assoc_barrier.
        * c-common.h (enum rid): Add RID_BUILTIN_ASSOC_BARRIER.

gcc/c/ChangeLog:

        * c-decl.c (names_builtin_p): Handle RID_BUILTIN_ASSOC_BARRIER.
        * c-parser.c (c_parser_postfix_expression): Likewise.

gcc/ChangeLog:

        * doc/extend.texi: Document __builtin_assoc_barrier.
---
 gcc/c-family/c-common.c                       |  1 +
 gcc/c-family/c-common.h                       |  2 +-
 gcc/c/c-decl.c                                |  1 +
 gcc/c/c-parser.c                              | 20
 gcc/cp/cp-objcp-common.c                      |  1 +
 gcc/cp/parser.c                               | 14 +++
 gcc/doc/extend.texi                           | 18 ++
 .../c-c++-common/builtin-assoc-barrier-1.c    | 24 +++
 8 files changed, 80 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/builtin-assoc-barrier-1.c

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 681fcc972f4..c62a6398a47 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -384,6 +384,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
   { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
   { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
+  { "__builtin_assoc_barrier", RID_BUILTIN_ASSOC_BARRIER, 0 },
   { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
   { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 },
   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 50ca8fb6ebd..f34dc47c2ba 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -108,7 +108,7 @@ enum rid
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL, RID_CHOOSE_EXPR,
   RID_TYPES_COMPATIBLE_P, RID_BUILTIN_COMPLEX, RID_BUILTIN_SHUFFLE,
   RID_BUILTIN_SHUFFLEVECTOR, RID_BUILTIN_CONVERTVECTOR, RID_BUILTIN_TGMATH,
-  RID_BUILTIN_HAS_ATTRIBUTE,
+  RID_BUILTIN_HAS_ATTRIBUTE, RID_BUILTIN_ASSOC_BARRIER,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
 
   /* TS 18661-3 keywords, in the same sequence as the TI_* values. */
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 983d65e930c..dcf4a2d7c32 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -10557,6 +10557,7 @@ names_builtin_p (const char *name)
     case RID_BUILTIN_HAS_ATTRIBUTE:
     case RID_BUILTIN_SHUFFLE:
     case RID_BUILTIN_SHUFFLEVECTOR:
+    case RID_BUILTIN_ASSOC_BARRIER:
     case RID_CHOOSE_EXPR:
     case RID_OFFSETOF:
     case RID_TYPES_COMPATIBLE_P:
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9a56e0c04c6..fffd81f4e5b 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -8931,6 +8931,7 @@ c_parser_predefined_identifier (c_parser *parser)
                              assignment-expression ,
                              assignment-expression, )
      __builtin_convertvector ( assignment-expression , type-name )
+     __builtin_assoc_barrier ( assignment-expression )
 
    offsetof-member-designator:
      identifier
@@ -10076,6 +10077,25 @@ c_parser_postfix_expression (c_parser *parser)
               }
           }
           break;
+        case RID_BUILTIN_ASSOC_BARRIER:
+          {
+            location_t start_loc = loc;
+            c_parser_consume_token (parser);
+            matching_parens parens;
+            if (!parens.require_open (parser))
+              {
+                expr.set_error ();
+                break;
+              }
+            e1 = c_parser_expr_no_commas (parser, NULL);
+            mark_exp_read (e1.value);
+            location_t end_loc = c_parser_peek_token (parser)->get_finish ();
+            parens.skip_until_found_close (parser);
+            expr.value = build1_loc (loc, PAREN_EXPR, TREE_TYPE (e1.value),
+                                     e1.value);
+            set_c_expr_source_range (&expr, start_loc, end_loc);
+          }
+          break;
         case RID_AT_SELECTOR:
           {
             gcc_assert (c_dialect_objc ());
diff --git a/gcc/cp/cp-objcp-common.c b/gcc/cp/cp-objcp-common.c
index ee255732d5a..04522a23eda 100644
--- a/gcc/cp/cp-objcp-common.c
+++ b/gcc/cp/cp-ob
Re: [PATCH] c++: implement C++17 hardware interference size
On Saturday, 17 July 2021 15:32:42 CEST Jonathan Wakely wrote:
> On Sat, 17 Jul 2021, 09:15 Matthias Kretz, wrote:
> > If somebody writes a library with `keep_apart` in the public API/ABI then
> > you're right.
>
> Yes, it's fine if those constants don't affect anything across module
> boundaries.

I believe a significant fraction of hardware interference size usage will be
internal.

> > The developer who wants his code to be included in a distro should care
> > about binary distribution. If his code has an ABI issue, that's a bug he
> > needs to fix. It's not the fault of the packager.
>
> Yes but in practice it's the packagers who have to deal with the bug
> reports, analyze the problem, and often fix the bug too. It might not be
> the packager's fault but it's often their problem

I can imagine. But I don't think requiring users to specify the value
according to what -mtune suggests will improve things. Users will write a
configure/cmake/... macro to parse the value -mtune prints and pass that on
the command line (we'll soon find this solution on SO). I.e. things are likely
to be even more broken.

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
Re: [PATCH] c++: implement C++17 hardware interference size
On Friday, 16 July 2021 21:58:36 CEST Jonathan Wakely wrote:
> On Fri, 16 Jul 2021 at 20:26, Matthias Kretz wrote:
> > On Friday, 16 July 2021 18:54:30 CEST Jonathan Wakely wrote:
> > > On Fri, 16 Jul 2021 at 16:33, Jason Merrill wrote:
> > > > Adjusting them based on tuning would certainly simplify a significant
> > > > use case, perhaps the only reasonable use. Cases more concerned with
> > > > ABI stability probably shouldn't use them at all. And that would mean
> > > > not needing to worry about the impossible task of finding the right
> > > > values for an entire architecture.
> > >
> > > But it would be quite a significant change in behaviour if -mtune
> > > started affecting ABI, wouldn't it?
> >
> > For existing code -mtune still doesn't affect ABI.
>
> True, because existing code isn't using the constants.
>
> > The users who write
> >
> > struct keep_apart {
> >   alignas(std::hardware_destructive_interference_size) std::atomic<int> cat;
> >   alignas(std::hardware_destructive_interference_size) std::atomic<int> dog;
> > };
> >
> > *want* to have different sizeof(keep_apart) depending on the CPU the code
> > is compiled for. I.e. they *ask* for getting their ABI broken.
>
> Right, but the person who wants that and the person who chooses the
> -mtune option might be different people.

Yes. But it was the intent of the person who wrote the code that the person
compiling the code can change the data layout of keep_apart via -mtune. Of
course, if the one compiling doesn't want to choose because the binary needs
to work on the widest range of systems, then there's a problem we might want
to solve (direction of target_clones?). (Or the developer of the library
solves it by providing the ABI for all possible interference_size values.)

> A distro might add -mtune=core2 to all package builds by default, not
> expecting it to cause ABI changes. Some header in a package in the
> distro might start using the constants. Now everybody who includes
> that header needs to use the same -mtune option as the distro default.

If somebody writes a library with `keep_apart` in the public API/ABI then
you're right.

> That change in the behaviour and expected use of an existing option
> seems scary to me. Even with a warning about using the constants
> (because somebody's just going to use #pragma around their use of the
> constants to disable the warning, and now the ABI impact of -mtune is
> much less obvious).

There are people who say that linking TUs compiled with different compiler
flags is UB. In general I think that's correct, but we can make explicit
exceptions. Up to now -mtune wouldn't lead to UB, AFAIK, though -march easily
does. So maybe, to keep the status quo, the constants should be tied to -march
not -mtune?

> It's much less scary in a world where the code is written and used by
> the same group of people, but for something like a linux distro it
> worries me.

The developer who wants his code to be included in a distro should care about
binary distribution. If his code has an ABI issue, that's a bug he needs to
fix. It's not the fault of the packager.

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
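One possible reading of "providing the ABI for all possible interference_size values" is sketched below in C++. All names (`keep_apart_n`, `total`) are hypothetical, and the set of sizes 64/128/256 is only an assumption taken from the 64-256 range mentioned elsewhere in the thread. The idea is to make the padding a template parameter, so that it becomes part of the type (and of the mangled symbol names) instead of silently depending on the flags each translation unit was built with.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical library type: the alignment is explicit in the type name, so
// two TUs that disagree about the interference size use different types
// rather than one type with two layouts.
template <std::size_t Align>
struct keep_apart_n {
  alignas(Align) std::atomic<int> cat{0};
  alignas(Align) std::atomic<int> dog{0};
};

// Hypothetical library entry point operating on the padded pair.
template <std::size_t Align>
int total(const keep_apart_n<Align>& k) {
  return k.cat.load() + k.dog.load();
}

// The library ships one explicit instantiation per supported size (assumed
// set: 64, 128, 256), i.e. it provides the ABI for every value it supports.
template int total<64>(const keep_apart_n<64>&);
template int total<128>(const keep_apart_n<128>&);
template int total<256>(const keep_apart_n<256>&);
```

A client would then pick `keep_apart_n<std::hardware_destructive_interference_size>` (or a fixed value) in its own code, and a mismatch between client and library shows up as a missing symbol at link time rather than as a silent layout disagreement.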
Re: [PATCH] c++: implement C++17 hardware interference size
On Friday, 16 July 2021 19:20:29 CEST Noah Goldstein wrote:
> On Fri, Jul 16, 2021 at 11:12 AM Matthias Kretz wrote:
> > I don't understand how this feature would lead to false sharing. But
> > maybe I misunderstand the spatial prefetcher. The first access to one of
> > the two paired cache lines would bring both cache lines to LLC (and
> > possibly L2). If a core with a different L2 reads the other cache line
> > the cache line would be duplicated; if it writes to it, it would be
> > exclusive to the other core's L2. The cache line pairs do not affect each
> > other anymore. Maybe there's a minor inefficiency on initial transfer
> > from memory, but isn't that all?
>
> If two cores that do not share an L2 cache need exclusive access to
> a cache-line, the L2 spatial prefetcher could cause pingponging if those
> two cache-lines were adjacent and shared the same 128 byte alignment.
> Say core A requests line x1 in exclusive, it also gets line x2 (not sure
> if x2 would be in shared or exclusive), core B then requests x2 in
> exclusive, it also gets x1. Irrespective of the state x1 comes into core
> B's private L2 cache it invalidates the exclusive state on cache-line x1
> in core A's private L2 cache. If this was done in a loop (say a simple
> `lock add` loop) it would cause pingponging on cache-lines x1/x2 between
> core A and B's private L2 caches.

Quoting the latest ORM: "The following two hardware prefetchers fetched data
from memory to the L2 cache and last level cache:
Spatial Prefetcher: This prefetcher strives to complete every cache line
fetched to the L2 cache with the pair line that completes it to a 128-byte
aligned chunk."

1. If the requested cache line is already present on some other core, the
   spatial prefetcher should not get used ("fetched data from memory").

2. The section is about data prefetching. It is unclear whether the spatial
   prefetcher applies at all for normal cache line fetches.

3. The ORM uses past tense ("The following two hardware prefetchers fetched
   data"), which indicates to me that Intel isn't doing this for newer
   generations anymore.

4. If I'm wrong on points 1 & 2 consider this: Core 1 requests a read of cache
   line A and the adjacent cache line B thus is also loaded to LLC. Core 2
   requests a read of line B and thus loads line A into LLC. Now both cores
   have both cache lines in LLC. Core 1 writes to line A, which invalidates
   line A in LLC of Core 2 but does not affect line B. Core 2 writes to line
   B, invalidating line B for Core 1. => no false sharing. Where did I get my
   mental cache protocol wrong?

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
Re: [PATCH] c++: implement C++17 hardware interference size
On Friday, 16 July 2021 18:54:30 CEST Jonathan Wakely wrote:
> On Fri, 16 Jul 2021 at 16:33, Jason Merrill wrote:
> > Adjusting them based on tuning would certainly simplify a significant use
> > case, perhaps the only reasonable use. Cases more concerned with ABI
> > stability probably shouldn't use them at all. And that would mean not
> > needing to worry about the impossible task of finding the right values for
> > an entire architecture.
>
> But it would be quite a significant change in behaviour if -mtune
> started affecting ABI, wouldn't it?

For existing code -mtune still doesn't affect ABI. The users who write

struct keep_apart {
  alignas(std::hardware_destructive_interference_size) std::atomic<int> cat;
  alignas(std::hardware_destructive_interference_size) std::atomic<int> dog;
};

*want* to have different sizeof(keep_apart) depending on the CPU the code is
compiled for. I.e. they *ask* for getting their ABI broken. If they wanted to
specify the value themselves on the command line they'd have written:

struct keep_apart {
  alignas(SOME_MACRO) std::atomic<int> cat;
  alignas(SOME_MACRO) std::atomic<int> dog;
};

I would be very disappointed if std::hardware_destructive_interference_size
and std::hardware_constructive_interference_size turn into a glorified macro.

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
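For reference, a self-contained version of the `keep_apart` example is sketched below in C++. It assumes `std::atomic<int>` members and uses the standard feature-test macro `__cpp_lib_hardware_interference_size` to detect the library constant. The name `destructive_size` and the fallback value of 64 are assumptions made for this sketch, not something the patch prescribes.

```cpp
#include <atomic>
#include <cstddef>
#include <new>  // declares the interference-size constants where available

#ifdef __cpp_lib_hardware_interference_size
// Value chosen by the implementation (possibly depending on tuning flags).
constexpr std::size_t destructive_size =
    std::hardware_destructive_interference_size;
#else
// Assumed fallback for toolchains without the constant.
constexpr std::size_t destructive_size = 64;
#endif

// cat and dog are placed at least destructive_size bytes apart, so updates
// from different threads should not contend for the same cache line.
struct keep_apart {
  alignas(destructive_size) std::atomic<int> cat;
  alignas(destructive_size) std::atomic<int> dog;
};

// The layout, and therefore the ABI of keep_apart, follows the constant.
static_assert(sizeof(keep_apart) >= 2 * destructive_size,
              "each member occupies its own aligned slot");
```

Because `sizeof(keep_apart)` follows the constant, any change to the constant's value changes the layout of every type written this way, which is exactly the ABI effect debated in this thread.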
Re: [PATCH] c++: implement C++17 hardware interference size
On Friday, 16 July 2021 04:41:17 CEST Jason Merrill via Gcc-patches wrote:
> > Currently the patch does not adjust the values based on -march, as in
> > JF's proposal. I'll need more guidance from the ARM/AArch64 maintainers
> > about how to go about that. --param l1-cache-line-size is set based on
> > -mtune, but I don't think we want -mtune to change these ABI-affecting
> > values. Are there -march values for which a smaller range than 64-256
> > makes sense?

As a user who cares about ABI but also cares about maximizing performance of
builds for a specific HPC setup I'd expect the hardware interference size
values to be allowed to break ABIs. The point of these values is to give me
better performance portability (but not necessarily binary portability) than
my usual "pick 64 as a good average".

Wrt. -march / -mtune setting hardware interference size: IMO -mtune=X should
be interpreted as "my binary is supposed to be optimized for X, I accept
inefficiencies on everything that's not X".

On Friday, 16 July 2021 04:48:52 CEST Noah Goldstein wrote:
> On intel x86 systems with a private L2 cache the spatial prefetcher
> can cause destructive interference along 128 byte aligned boundaries.
> https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf#page=60

I don't understand how this feature would lead to false sharing. But maybe I
misunderstand the spatial prefetcher. The first access to one of the two
paired cache lines would bring both cache lines to LLC (and possibly L2). If a
core with a different L2 reads the other cache line the cache line would be
duplicated; if it writes to it, it would be exclusive to the other core's L2.
The cache line pairs do not affect each other anymore. Maybe there's a minor
inefficiency on initial transfer from memory, but isn't that all?

That said, Intel documents the spatial prefetcher exclusively for Sandy
Bridge. So if you still believe 128 is necessary, set the destructive hardware
interference size to 64 for all of x86 except -mtune=sandybridge.

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
Re: [RFC] c-family: Add __builtin_noassoc
On Friday, 16 July 2021 11:31:29 CEST Richard Biener wrote:
> On Fri, Jul 16, 2021 at 10:57 AM Matthias Kretz wrote:
> > On Wednesday, 14 July 2021 10:14:55 CEST Richard Biener wrote:
> > > I think implementing it similar to how we do __builtin_shufflevector
> > > would be easily possible. PAREN_EXPR is a tree code.
> >
> > Like this? If you like it, I'll write the missing documentation and do
> > real regression testing.
>
> Yes, like this. Now, __builtin_noassoc (a + b + c) might suggest that
> it prevents a + b + c from being re-associated - but it does not.
> PAREN_EXPR is a barrier for association, so for
> 'a + b + c + PAREN_EXPR <d + e + f>' the a+b+c and d+e+f chains will not
> mix but they individually can be re-associated. That said
> __builtin_noassoc might be a bad name, maybe __builtin_assoc_barrier is
> better?

Yes, I agree with renaming it. And assoc_barrier sounds intuitive to me.

> To fully prevent association of a a + b + d + e chain you need at least
> two PAREN_EXPRs, for example (a+b) + (d+e) would do.
>
> One could of course provide __builtin_noassoc (a+b+c+d) with the
> implied semantics and insert PAREN_EXPRs around all operands
> when lowering it.

I wouldn't want to go there. __builtin_noassoc(f(x, y, z))? We probably both agree that it would be a no-op, but it reads like f should be evaluated with -fno-associative-math.

> Not sure what's more useful in practice - directly exposing the middle-end
> PAREN_EXPR or providing a way to mark a whole expression as to be
> not re-associated? Maybe both?

I think this is a tool for specialists. Give them the low-level tool and they'll build whatever higher level abstractions they need on top of it. Like

  float sum_noassoc(RangeOfFloats auto x)
  {
    float sum = 0;
    for (float v : x)
      sum = __builtin_assoc_barrier(sum + v);
    return sum;
  }

-- 
──
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──
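The barrier semantics discussed above can be sketched as follows, assuming a compiler that provides the builtin under the name __builtin_assoc_barrier. ASSOC_BARRIER, plus_minus and sum_in_order are hypothetical names used only for illustration. The fallback branch is a plain pass-through and therefore only preserves the evaluation order when the code is not compiled with -fassociative-math.

```cpp
#include <cassert>

// Use the builtin where the compiler reports it; otherwise fall back to a
// pass-through (assumption: no -fassociative-math / -ffast-math in that case).
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assoc_barrier)
#    define ASSOC_BARRIER(expr) __builtin_assoc_barrier(expr)
#  endif
#endif
#ifndef ASSOC_BARRIER
#  define ASSOC_BARRIER(expr) (expr)
#endif

// (x + y) - y with a barrier around the addition, so the subtraction cannot
// be merged into the x + y chain and cancelled against y.
inline float plus_minus(float x, float y)
{
  return ASSOC_BARRIER(x + y) - y;
}

// Strict left-to-right summation: every partial sum sits behind a barrier,
// so no two additions can be re-associated with each other.
inline float sum_in_order(const float* first, const float* last)
{
  float sum = 0;
  for (; first != last; ++first)
    sum = ASSOC_BARRIER(sum + *first);
  return sum;
}
```

With IEEE single precision, plus_minus(1.0f, 1e8f) yields 0.0f because 1.0f is absorbed by the addition; a re-associated evaluation would return 1.0f instead. Note that a single barrier around a longer chain such as a + b + c would still allow re-association inside that chain, which is why sum_in_order places one barrier per addition.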
[RFC] c-family: Add __builtin_noassoc
On Wednesday, 14 July 2021 10:14:55 CEST Richard Biener wrote:
> > > There's one "related" IL feature used by the Fortran frontend -
> > > PAREN_EXPR prevents association across it. So for Fortran (when not
> > > -fno-protect-parens which is enabled by -Ofast), (a + b) - b cannot be
> > > optimized to a. Eventually this could be used to wrap intrinsic results
> > > since most of the issues in the end require association. Note
> > > PAREN_EXPR isn't exposed to the C family frontends but we could of
> > > course add a builtin-like thing for this _Noassoc ( ) or so. Note
> > > PAREN_EXPR survives -Ofast so it's the frontends that would need to
> > > choose to emit or not emit it (or always emit it).
> >
> > Interesting. I want that builtin in C++. Currently I use inline asm to
> > achieve a similar effect. But the inline asm hammer is really too big for
> > the problem.
>
> I think implementing it similar to how we do __builtin_shufflevector would
> be easily possible. PAREN_EXPR is a tree code.

Like this? If you like it, I'll write the missing documentation and do real regression testing.

---

New builtin to enable explicit use of PAREN_EXPR in C & C++ code.

Signed-off-by: Matthias Kretz

gcc/testsuite/ChangeLog:

        * c-c++-common/builtin-noassoc-1.c: New test.

gcc/cp/ChangeLog:

        * cp-objcp-common.c (names_builtin_p): Handle
        RID_BUILTIN_NOASSOC.
        * parser.c (cp_parser_postfix_expression): Handle
        RID_BUILTIN_NOASSOC.

gcc/c-family/ChangeLog:

        * c-common.c (c_common_reswords): Add __builtin_noassoc.
        * c-common.h (enum rid): Add RID_BUILTIN_NOASSOC.

gcc/c/ChangeLog:

        * c-decl.c (names_builtin_p): Handle RID_BUILTIN_NOASSOC.
        * c-parser.c (c_parser_postfix_expression): Likewise.
---
 gcc/c-family/c-common.c | 1 +
 gcc/c-family/c-common.h | 2 +-
 gcc/c/c-decl.c | 1 +
 gcc/c/c-parser.c | 20
 gcc/cp/cp-objcp-common.c | 1 +
 gcc/cp/parser.c | 14 +++
 .../c-c++-common/builtin-noassoc-1.c | 24 +++
 7 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/builtin-noassoc-1.c

-- 
──
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 681fcc972f4..e74123d896c 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -384,6 +384,7 @@ const struct c_common_resword c_common_reswords[] =
   { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
   { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
   { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
+  { "__builtin_noassoc", RID_BUILTIN_NOASSOC, 0 },
   { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
   { "__builtin_shufflevector", RID_BUILTIN_SHUFFLEVECTOR, 0 },
   { "__builtin_tgmath", RID_BUILTIN_TGMATH, D_CONLY },
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 50ca8fb6ebd..b772cf9c5e9 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -108,7 +108,7 @@ enum rid
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL, RID_CHOOSE_EXPR,
   RID_TYPES_COMPATIBLE_P, RID_BUILTIN_COMPLEX, RID_BUILTIN_SHUFFLE,
   RID_BUILTIN_SHUFFLEVECTOR, RID_BUILTIN_CONVERTVECTOR, RID_BUILTIN_TGMATH,
-  RID_BUILTIN_HAS_ATTRIBUTE,
+  RID_BUILTIN_HAS_ATTRIBUTE, RID_BUILTIN_NOASSOC,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
 
   /* TS 18661-3 keywords, in the same sequence as the TI_* values.  */
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 983d65e930c..7b7ecba026f 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -10557,6 +10557,7 @@ names_builtin_p (const char *name)
     case RID_BUILTIN_HAS_ATTRIBUTE:
     case RID_BUILTIN_SHUFFLE:
     case RID_BUILTIN_SHUFFLEVECTOR:
+    case RID_BUILTIN_NOASSOC:
     case RID_CHOOSE_EXPR:
     case RID_OFFSETOF:
     case RID_TYPES_COMPATIBLE_P:
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9a56e0c04c6..2b40dc8253e 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -8931,6 +8931,7 @@ c_parser_predefined_identifier (c_parser *parser)
                          assignment-expression ,
                          assignment-expression, )
      __builtin_convertvector ( assignment-expression , type-name )
+     __builtin_noassoc (
Re: ping-2: [PATCH] c-family: Add more predefined macros for math flags
On Wednesday, 14 July 2021 14:42:01 CEST H.J. Lu wrote: > On Wed, Jul 14, 2021 at 12:32 AM Matthias Kretz wrote: > > OK? > > > > On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote: > > > Library code, especially in headers, sometimes needs to know how the > > > compiler interprets / optimizes floating-point types and operations. > > > This information can be used for additional optimizations or for > > > ensuring correctness. This change makes -freciprocal-math, > > > -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and > > > -frounding-math report their state via corresponding pre-defined macros. > > > > > > Signed-off-by: Matthias Kretz > > > > > > gcc/testsuite/ChangeLog: > > > * gcc.dg/associative-math-1.c: New test. > > > * gcc.dg/associative-math-2.c: New test. > > > * gcc.dg/no-signed-zeros-1.c: New test. > > > * gcc.dg/no-signed-zeros-2.c: New test. > > > * gcc.dg/no-trapping-math-1.c: New test. > > > * gcc.dg/no-trapping-math-2.c: New test. > > > * gcc.dg/reciprocal-math-1.c: New test. > > > * gcc.dg/reciprocal-math-2.c: New test. > > > * gcc.dg/rounding-math-1.c: New test. > > > * gcc.dg/rounding-math-2.c: New test. > > > > > > gcc/c-family/ChangeLog: > > > * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or > > > undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__, > > > __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and > > > __ROUNDING_MATH__ according to the new optimization flags. > > > > > > gcc/ChangeLog: > > > * cppbuiltin.c (define_builtin_macros_for_compilation_flags): > > > Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__, > > > __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and > > > __ROUNDING_MATH__ according to their corresponding flags. > > > * doc/cpp.texi: Document __RECIPROCAL_MATH__, > > > __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, > > > and __ROUNDING_MATH__. 
> > > > > Hi Hongtao, > > Can this be used to address > > https://gcc.gnu.org/pipermail/gcc/2021-July/236778.html It should help to determine when a workaround is necessary. I use inline asm to implement the workaround. Relevant libstdc++ code (not upstream yet and not making use of __ASSOCIATIVE_MATH__ yet): /* * Ensure the expressions leading up to the @p __x argument are evaluated at least once. * * Example: __force_evaluation(x + y) - y will not optimize to x with - fassociative-math. * _TV is expected to be __vector_type_t. */ template [[__gnu__::__flatten__, __gnu__::__const__]] _GLIBCXX_SIMD_INTRINSIC constexpr _TV __force_evaluation(_TV __x) noexcept { if (__builtin_is_constant_evaluated()) return __x; else return [&] { if constexpr(__have_sse) { if constexpr (sizeof(__x) >= 16) { asm("" :: "x"(__x)); asm("" : "+x"(__x)); } else if constexpr (is_same_v<__vector_type_t, _TV>) { asm("" :: "x"(__x[0]), "x"(__x[1])); asm("" : "+x"(__x[0]), "+x"(__x[1])); } else __assert_unreachable<_TV>(); } else if constexpr(__have_neon) { asm("" :: "w"(__x)); asm("" : "+w"(__x)); } else if constexpr (__have_power_vmx) { if constexpr (is_same_v<__vector_type_t, _TV>) { asm("" :: "fgr"(__x[0]), "fgr"(__x[1])); asm("" : "+fgr"(__x[0]), "+fgr"(__x[1])); } else { asm("" :: "v"(__x)); asm("" : "+v"(__x)); } } else { asm("" :: "g"(__x)); asm("" : "+g"(__x)); } return __x; }(); } // Returns __x + __y - __y without -fassociative-math optimizing to __x. // - _TV must be __vector_type_t. // - _UV must be _TV or floating-point type. template [[__gnu__::__const__]] _GLIBCXX_SIMD_INTRINSIC constexpr _TV __plus_minus(_TV __x, _UV __y) noexcept { #if defined __clang__ || __GCC_IEC_559 > 0 return (__x + __y) - __y; #else if
[PATCH v3] c++: Add gnu::diagnose_as attribute
Hi Jason,

A new revision of the patch is attached. I think I implemented all your suggestions.

Please comment on cp/decl2.c (is_alias_template_p). I find it surprising that I had to write this function. Maybe I missed something? In any case, DECL_ALIAS_TEMPLATE_P requires a template_decl and the TYPE_DECL apparently doesn't have a template_info/decl at this point.

From: Matthias Kretz

This attribute overrides the diagnostics output string for the entity it appertains to. The motivation is to improve QoI for library TS implementations, where diagnostics have a very bad signal-to-noise ratio due to the long namespaces involved.

With the attribute, it is possible to solve PR89370 and make std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as std::string in diagnostic output without extra hacks to recognize the type in the C++ frontend.

Signed-off-by: Matthias Kretz

gcc/ChangeLog:

        PR c++/89370
        * doc/extend.texi: Document the diagnose_as attribute.
        * doc/invoke.texi: Document -fno-diagnostics-use-aliases.

gcc/c-family/ChangeLog:

        PR c++/89370
        * c.opt (fdiagnostics-use-aliases): New diagnostics flag.

gcc/cp/ChangeLog:

        PR c++/89370
        * cp-tree.h: Add TFF_AS_PRIMARY. Add is_alias_template_p
        declaration.
        * decl2.c (is_alias_template_p): New function. Determines
        whether a given TYPE_DECL is actually an alias template that is
        still missing its template_info.
        (is_late_template_attribute): Decls with diagnose_as attribute
        are early attributes only if they are alias templates.
        * error.c (dump_scope): When printing the name of a namespace,
        look for the diagnose_as attribute. If found, print the
        associated string instead of calling dump_decl.
        (dump_decl_name_or_diagnose_as): New function to replace
        dump_decl (pp, DECL_NAME(t), flags) and inspect the tree for the
        diagnose_as attribute before printing the DECL_NAME.
        (dump_template_scope): New function. Prints the scope of a
        template instance correctly applying diagnose_as attributes and
        adjusting the list of template parms accordingly.
        (dump_aggr_type): If the type has a diagnose_as attribute, print
        the associated string instead of printing the original type
        name. Print template parms only if the attribute was not applied
        to the instantiation / full specialization. Delay call to
        dump_scope until the diagnose_as attribute is found. If the
        attribute has a second argument, use it to override the context
        passed to dump_scope.
        (dump_simple_decl): Call dump_decl_name_or_diagnose_as instead
        of dump_decl.
        (dump_decl): Ditto.
        (lang_decl_name): Ditto.
        (dump_function_decl): Walk the function's context list to
        determine whether a call to dump_template_scope is required.
        Ensure function templates are presented as primary templates.
        (dump_function_name): Replace the function's identifier with the
        diagnose_as attribute value, if set.
        (dump_template_parms): Treat as primary template if flags
        contains TFF_AS_PRIMARY.
        (comparable_template_types_p): Consider the types not a template
        if one carries a diagnose_as attribute.
        (print_template_differences): Replace the identifier with the
        diagnose_as attribute value on the most general template, if it
        is set.
        * name-lookup.c (handle_namespace_attrs): Handle the diagnose_as
        attribute on namespaces. Ensure exactly one string argument.
        Ensure previous diagnose_as attributes used the same name.
        'diagnose_as' on namespace aliases is forwarded to the original
        namespace. Support no-argument 'diagnose_as' on namespace
        aliases.
        (do_namespace_alias): Add attributes parameter and call
        handle_namespace_attrs.
        * name-lookup.h (do_namespace_alias): Add attributes tree
        parameter.
        * parser.c (cp_parser_declaration): If the next token is
        RID_NAMESPACE, tentatively parse a namespace alias definition.
        If this fails expect a namespace definition.
        (cp_parser_namespace_alias_definition): Allow optional
        attributes before and after the identifier. Fast exit if the
        expected CPP_EQ token is missing. Pass attributes to
        do_namespace_alias.
        * tree.c (cxx_attribute_table): Add diagnose_as attribute to the
        table.
        (check_diagnose_as_redeclaration): New function; copied and
        adjusted from check_abi_tag_redeclaration.
        (handle_diagnose_as_attribute): New function; copied and
        adjusted from handle_abi_tag_attribute. If the given *node is a
        TYPE_DECL: allow no argument to the attribute, using DECL_NAME
        instead; apply the attribute to the type on the RHS in place,
        even if the type is complete. A
ping-2: [PATCH] c-family: Add more predefined macros for math flags
OK? On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote: > Library code, especially in headers, sometimes needs to know how the > compiler interprets / optimizes floating-point types and operations. > This information can be used for additional optimizations or for > ensuring correctness. This change makes -freciprocal-math, > -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and > -frounding-math report their state via corresponding pre-defined macros. > > Signed-off-by: Matthias Kretz > > gcc/testsuite/ChangeLog: > > * gcc.dg/associative-math-1.c: New test. > * gcc.dg/associative-math-2.c: New test. > * gcc.dg/no-signed-zeros-1.c: New test. > * gcc.dg/no-signed-zeros-2.c: New test. > * gcc.dg/no-trapping-math-1.c: New test. > * gcc.dg/no-trapping-math-2.c: New test. > * gcc.dg/reciprocal-math-1.c: New test. > * gcc.dg/reciprocal-math-2.c: New test. > * gcc.dg/rounding-math-1.c: New test. > * gcc.dg/rounding-math-2.c: New test. > > gcc/c-family/ChangeLog: > > * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or > undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__, > __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and > __ROUNDING_MATH__ according to the new optimization flags. > > gcc/ChangeLog: > > * cppbuiltin.c (define_builtin_macros_for_compilation_flags): > Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__, > __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and > __ROUNDING_MATH__ according to their corresponding flags. > * doc/cpp.texi: Document __RECIPROCAL_MATH__, > __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, > and __ROUNDING_MATH__. 
> --- > gcc/c-family/c-cppbuiltin.c | 25 +++ > gcc/cppbuiltin.c | 10 + > gcc/doc/cpp.texi | 18 > gcc/testsuite/gcc.dg/associative-math-1.c | 17 +++ > gcc/testsuite/gcc.dg/associative-math-2.c | 17 +++ > gcc/testsuite/gcc.dg/no-signed-zeros-1.c | 17 +++ > gcc/testsuite/gcc.dg/no-signed-zeros-2.c | 17 +++ > gcc/testsuite/gcc.dg/no-trapping-math-1.c | 17 +++ > gcc/testsuite/gcc.dg/no-trapping-math-2.c | 17 +++ > gcc/testsuite/gcc.dg/reciprocal-math-1.c | 17 +++ > gcc/testsuite/gcc.dg/reciprocal-math-2.c | 17 +++ > gcc/testsuite/gcc.dg/rounding-math-1.c| 17 +++ > gcc/testsuite/gcc.dg/rounding-math-2.c| 17 +++ > 13 files changed, 223 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/associative-math-1.c > create mode 100644 gcc/testsuite/gcc.dg/associative-math-2.c > create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-1.c > create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-2.c > create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-1.c > create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-2.c > create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-1.c > create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-2.c > create mode 100644 gcc/testsuite/gcc.dg/rounding-math-1.c > create mode 100644 gcc/testsuite/gcc.dg/rounding-math-2.c -- ── Dr. 
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ── diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c index f79f939bd10..671af04b1f8 100644 --- a/gcc/c-family/c-cppbuiltin.c +++ b/gcc/c-family/c-cppbuiltin.c @@ -628,6 +628,31 @@ c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree, cpp_undef (pfile, "__FINITE_MATH_ONLY__"); cpp_define_unused (pfile, "__FINITE_MATH_ONLY__=0"); } + + if (!prev->x_flag_reciprocal_math && cur->x_flag_reciprocal_math) +cpp_define_unused (pfile, "__RECIPROCAL_MATH__"); + else if (prev->x_flag_reciprocal_math && !cur->x_flag_reciprocal_math) +cpp_undef (pfile, "__RECIPROCAL_MATH__"); + + if (!prev->x_flag_signed_zeros && cur->x_flag_signed_zeros) +cpp_undef (pfile, "__NO_SIGNED_ZEROS__"); + else if (prev->x_flag_signed_zeros && !cur->x_flag_signed_zeros) +cpp_define_unused (pfile, "__NO_SIGNED_ZEROS__"); + + if (!prev->x_flag_trapping_math && cur->x_flag_trapping_math) +cpp_undef (pfile, "__NO_TRAPPING_MATH__"); + else if (prev->x_
Re: [Questions] Is there any bit in gimple/rtl to indicate this IR support fast-math or not?
On Wednesday, 14 July 2021 07:18:29 CEST Hongtao Liu via Gcc-help wrote:
> On Wed, Jul 14, 2021 at 1:15 PM Hongtao Liu wrote:
> > Hi:
> > The original problem was that some users wanted the cmdline option
> > -ffast-math not to act on intrinsic production code.

This sounds like the users want intrinsics to map *directly* to the corresponding instruction. If that's the case such users should use inline assembly, IMHO. If you compile a TU with -ffast-math then *all* floating-point operations are affected.

Yes, more control over where to use fast-math and the ability to mix fast-math and no-fast-math without risking ODR violations would be great. But that's a larger issue, and one that would ideally be solved in WG14/WG21.

FWIW, this is what I'd do, i.e. turn off fast-math for the function in question:
https://godbolt.org/z/3cKq5hT1o

-- 
──────
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──
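The contents of the Compiler Explorer link are not reproduced here. As one possible way to turn off fast-math for a single function, the following sketch uses GCC's optimize function attribute; whether this matches the linked snippet is an assumption. NO_FAST_MATH and plus_minus_strict are hypothetical names, and the macro expands to nothing on compilers other than GCC, where the function then simply follows the TU-wide flags.

```cpp
#include <cassert>

// GCC-specific: the optimize attribute prefixes its argument with -f, so
// "no-fast-math" requests -fno-fast-math for this one function.
// Assumption: other compilers (including Clang) get an empty macro.
#if defined(__GNUC__) && !defined(__clang__)
#  define NO_FAST_MATH __attribute__((optimize("no-fast-math")))
#else
#  define NO_FAST_MATH
#endif

// Evaluates (x + y) - y as written, even if the rest of the translation
// unit is compiled with -ffast-math (on GCC).
NO_FAST_MATH float plus_minus_strict(float x, float y)
{
  return (x + y) - y;
}
```

The GCC manual describes the optimize attribute as intended mainly for debugging, so this is a per-function workaround rather than a general mechanism for mixing floating-point models.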
Re: [PATCH] Add gnu::diagnose_as attribute
On Tuesday, 22 June 2021 21:52:16 CEST Jason Merrill wrote:
> > 2. About the namespace aliases: IIUC an attribute would currently be
> > rejected because of the C++ grammar. Do you want to make it valid before
> > WG21 officially decides how to proceed? And if you have a pointer for me
> > where I'd have to adjust the grammar rules, that'd help.
>
> You will want to adjust cp_parser_namespace_alias_definition to handle
> attributes like cp_parser_namespace_definition. The latter currently
> accepts attributes both before and after the name, which seems like a
> good pattern to follow so it doesn't matter which WG21 chooses.
> Probably best to pedwarn about C++11 attributes in both locations for
> now, not just after.

This introduces an ambiguity in cp_parser_declaration. The function has to decide whether to call cp_parser_namespace_definition or fall back to cp_parser_block_declaration (which calls cp_parser_namespace_alias_definition). But now the parser has to look ahead a lot farther:

  namespace foo [[whatever]] {}
  namespace bar [[whatever]] = foo;

I.e. only at '{' vs. '=' can cp_parser_declaration decide to call cp_parser_namespace_definition. Consequently, should I really modify cp_parser_namespace_definition to handle namespace aliases? Or can/should cp_parser_declaration look ahead past the attribute(s)? How?

With pedantic standard C++ it would be easy, since only these attribute placements are allowed:

  namespace [[whatever]] foo {}
  namespace bar [[whatever]] = foo;

-- 
──────
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──
ping: [PATCH] c-family: Add more predefined macros for math flags
OK? (I want to use the macros in libstdc++.) On Wednesday, 30 June 2021 10:59:28 CEST Matthias Kretz wrote: > Library code, especially in headers, sometimes needs to know how the > compiler interprets / optimizes floating-point types and operations. > This information can be used for additional optimizations or for > ensuring correctness. This change makes -freciprocal-math, > -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and > -frounding-math report their state via corresponding pre-defined macros. > > Signed-off-by: Matthias Kretz > > gcc/testsuite/ChangeLog: > > * gcc.dg/associative-math-1.c: New test. > * gcc.dg/associative-math-2.c: New test. > * gcc.dg/no-signed-zeros-1.c: New test. > * gcc.dg/no-signed-zeros-2.c: New test. > * gcc.dg/no-trapping-math-1.c: New test. > * gcc.dg/no-trapping-math-2.c: New test. > * gcc.dg/reciprocal-math-1.c: New test. > * gcc.dg/reciprocal-math-2.c: New test. > * gcc.dg/rounding-math-1.c: New test. > * gcc.dg/rounding-math-2.c: New test. > > gcc/c-family/ChangeLog: > > * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or > undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__, > __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and > __ROUNDING_MATH__ according to the new optimization flags. > > gcc/ChangeLog: > > * cppbuiltin.c (define_builtin_macros_for_compilation_flags): > Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__, > __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and > __ROUNDING_MATH__ according to their corresponding flags. > * doc/cpp.texi: Document __RECIPROCAL_MATH__, > __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, > and __ROUNDING_MATH__. 
> --- > gcc/c-family/c-cppbuiltin.c | 25 +++ > gcc/cppbuiltin.c | 10 + > gcc/doc/cpp.texi | 18 > gcc/testsuite/gcc.dg/associative-math-1.c | 17 +++ > gcc/testsuite/gcc.dg/associative-math-2.c | 17 +++ > gcc/testsuite/gcc.dg/no-signed-zeros-1.c | 17 +++ > gcc/testsuite/gcc.dg/no-signed-zeros-2.c | 17 +++ > gcc/testsuite/gcc.dg/no-trapping-math-1.c | 17 +++ > gcc/testsuite/gcc.dg/no-trapping-math-2.c | 17 +++ > gcc/testsuite/gcc.dg/reciprocal-math-1.c | 17 +++ > gcc/testsuite/gcc.dg/reciprocal-math-2.c | 17 +++ > gcc/testsuite/gcc.dg/rounding-math-1.c| 17 +++ > gcc/testsuite/gcc.dg/rounding-math-2.c| 17 +++ > 13 files changed, 223 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/associative-math-1.c > create mode 100644 gcc/testsuite/gcc.dg/associative-math-2.c > create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-1.c > create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-2.c > create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-1.c > create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-2.c > create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-1.c > create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-2.c > create mode 100644 gcc/testsuite/gcc.dg/rounding-math-1.c > create mode 100644 gcc/testsuite/gcc.dg/rounding-math-2.c -- ── Dr. 
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ── diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c index f79f939bd10..671af04b1f8 100644 --- a/gcc/c-family/c-cppbuiltin.c +++ b/gcc/c-family/c-cppbuiltin.c @@ -628,6 +628,31 @@ c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree, cpp_undef (pfile, "__FINITE_MATH_ONLY__"); cpp_define_unused (pfile, "__FINITE_MATH_ONLY__=0"); } + + if (!prev->x_flag_reciprocal_math && cur->x_flag_reciprocal_math) +cpp_define_unused (pfile, "__RECIPROCAL_MATH__"); + else if (prev->x_flag_reciprocal_math && !cur->x_flag_reciprocal_math) +cpp_undef (pfile, "__RECIPROCAL_MATH__"); + + if (!prev->x_flag_signed_zeros && cur->x_flag_signed_zeros) +cpp_undef (pfile, "__NO_SIGNED_ZEROS__"); + else if (prev->x_flag_signed_zeros && !cur->x_flag_signed_zeros) +cpp_define_unused (pfile, "__NO_SIGNED_ZEROS__"); + + if (!prev->x_flag_trapping_math && cur->x_flag_trapping_math) +cpp_undef (pfile, "__NO
Re: [PATCH] Add gnu::diagnose_as attribute
On Thursday, 1 July 2021 17:18:26 CEST Jason Merrill wrote: > You probably want to adjust is_late_template_attribute to change that. Right, I hacked is_late_template_attribute but now I only see a TYPE_DECL passed to my attribute handler (!DECL_ALIAS_TEMPLATE_P). I.e. I don't know how your previous comment is supposed to help me: On Tuesday, 22 June 2021 22:12:42 CEST Jason Merrill wrote: > Yes. You can check that with get_underlying_template. FWIW, I don't feel qualified to implement the diagnose_as attribute on alias templates. The trees I've seen while testing the following test case don't make sense to me. :( // { dg-do compile { target c++11 } } // { dg-options "-fdiagnostics-use-aliases -fpretty-templates" } template class A0 {}; template using B0 [[gnu::diagnose_as]] = A0; // #1 template using C0 [[gnu::diagnose_as]] = A0; // #2 template class A1 {}; template class A1 {}; template using B1 [[gnu::diagnose_as]] = A1; // #3 void fn_1(int); int main () { fn_1 (A0 ()); // { dg-error "cannot convert 'B0' to 'int'" } fn_1 (A1 ()); // { dg-error "cannot convert 'A1' to 'int'" } fn_1 (A1 ()); // { dg-error "cannot convert 'B1' to 'int'" } } On #1 I see !COMPLETE_TYPE_P (TREE_TYPE (*node)) while on #3 TREE_TYPE (*node) is a complete type. Like I said, I don't get to see the TEMPLATE_DECL of either #1, #2, or #3, only a TYPE_DECL whose TREE_TYPE is A0. I thus have no idea how to reject #2. -- ────── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
Re: [PATCH] Add gnu::diagnose_as attribute
On Tuesday, 22 June 2021 22:12:42 CEST Jason Merrill wrote: > On 6/22/21 4:01 PM, Matthias Kretz wrote: > > On Tuesday, 22 June 2021 21:52:16 CEST Jason Merrill wrote: > >> For alias templates, you probably want the attribute only on the > >> templated class, not on the instantiations. > > > > Oh good point. My current patch does not allow the attribute on alias > > templates. Consider: > > > > template > > > >struct X {}; > > > > template > > > >using foo [[gnu::diagnose_as]] = X; > > > > I have no idea how this could work. I would have to set the attribute for > > an implicit partial specialization (not that I know of the existence of > > such a thing)? I.e. X would have to be diagnosed as foo, > > but X would have to be diagnosed as X, not foo. > > > > So if anything it should only support alias templates if they are strictly > > "renaming" the type. I.e. their template parameters must match up exactly. > > Can I constrain the attribute like this? > > Yes. You can check that with get_underlying_template. > > Or you could support the above by putting the attribute on the > instantiation with the TEMPLATE_INFO for foo rather than a simple name. Question, given: template using foo = bar; The diagnose_as attribute handler isn't called until e.g. `foo` is instantiated. Meaning that even after the declaration of the alias template `bar` will not be diagnosed as `foo`, which happens only after the first use of `foo`. I find that more confusing than helpful, even if the expectation would be that users only use the alias template. So do you still expect alias templates to support diagnose_as? And if yes, how could I handle the attribute so that the diagnose_as attribute is applied to `bar` on declaration of `foo`? -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
[PATCH] c-family: Add more predefined macros for math flags
Library code, especially in headers, sometimes needs to know how the compiler interprets / optimizes floating-point types and operations. This information can be used for additional optimizations or for ensuring correctness. This change makes -freciprocal-math, -fno-signed-zeros, -fno-trapping-math, -fassociative-math, and -frounding-math report their state via corresponding pre-defined macros. Signed-off-by: Matthias Kretz gcc/testsuite/ChangeLog: * gcc.dg/associative-math-1.c: New test. * gcc.dg/associative-math-2.c: New test. * gcc.dg/no-signed-zeros-1.c: New test. * gcc.dg/no-signed-zeros-2.c: New test. * gcc.dg/no-trapping-math-1.c: New test. * gcc.dg/no-trapping-math-2.c: New test. * gcc.dg/reciprocal-math-1.c: New test. * gcc.dg/reciprocal-math-2.c: New test. * gcc.dg/rounding-math-1.c: New test. * gcc.dg/rounding-math-2.c: New test. gcc/c-family/ChangeLog: * c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and __ROUNDING_MATH__ according to the new optimization flags. gcc/ChangeLog: * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and __ROUNDING_MATH__ according to their corresponding flags. * doc/cpp.texi: Document __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and __ROUNDING_MATH__. 
--- gcc/c-family/c-cppbuiltin.c | 25 +++ gcc/cppbuiltin.c | 10 + gcc/doc/cpp.texi | 18 gcc/testsuite/gcc.dg/associative-math-1.c | 17 +++ gcc/testsuite/gcc.dg/associative-math-2.c | 17 +++ gcc/testsuite/gcc.dg/no-signed-zeros-1.c | 17 +++ gcc/testsuite/gcc.dg/no-signed-zeros-2.c | 17 +++ gcc/testsuite/gcc.dg/no-trapping-math-1.c | 17 +++ gcc/testsuite/gcc.dg/no-trapping-math-2.c | 17 +++ gcc/testsuite/gcc.dg/reciprocal-math-1.c | 17 +++ gcc/testsuite/gcc.dg/reciprocal-math-2.c | 17 +++ gcc/testsuite/gcc.dg/rounding-math-1.c| 17 +++ gcc/testsuite/gcc.dg/rounding-math-2.c| 17 +++ 13 files changed, 223 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/associative-math-1.c create mode 100644 gcc/testsuite/gcc.dg/associative-math-2.c create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-1.c create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-2.c create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-1.c create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-2.c create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-1.c create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-2.c create mode 100644 gcc/testsuite/gcc.dg/rounding-math-1.c create mode 100644 gcc/testsuite/gcc.dg/rounding-math-2.c -- ── Dr. 
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c index f79f939bd10..671af04b1f8 100644 --- a/gcc/c-family/c-cppbuiltin.c +++ b/gcc/c-family/c-cppbuiltin.c @@ -628,6 +628,31 @@ c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree, cpp_undef (pfile, "__FINITE_MATH_ONLY__"); cpp_define_unused (pfile, "__FINITE_MATH_ONLY__=0"); } + + if (!prev->x_flag_reciprocal_math && cur->x_flag_reciprocal_math) +cpp_define_unused (pfile, "__RECIPROCAL_MATH__"); + else if (prev->x_flag_reciprocal_math && !cur->x_flag_reciprocal_math) +cpp_undef (pfile, "__RECIPROCAL_MATH__"); + + if (!prev->x_flag_signed_zeros && cur->x_flag_signed_zeros) +cpp_undef (pfile, "__NO_SIGNED_ZEROS__"); + else if (prev->x_flag_signed_zeros && !cur->x_flag_signed_zeros) +cpp_define_unused (pfile, "__NO_SIGNED_ZEROS__"); + + if (!prev->x_flag_trapping_math && cur->x_flag_trapping_math) +cpp_undef (pfile, "__NO_TRAPPING_MATH__"); + else if (prev->x_flag_trapping_math && !cur->x_flag_trapping_math) +cpp_define_unused (pfile, "__NO_TRAPPING_MATH__"); + + if (!prev->x_flag_associative_math && cur->x_flag_associative_math) +cpp_define_unused (pfile, "__ASSOCIATIVE_MATH__"); + else if (prev->x_flag_associative_math && !cur->x_flag_
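A minimal sketch of how header code might consume the proposed macros, assuming a compiler that includes this patch (on other compilers the macros are simply undefined, so every flag below reads false). The constant names and the plus_minus helper are hypothetical, and the volatile temporary is only a stand-in workaround for illustration, not the approach used in libstdc++.

```cpp
#include <cassert>

// Translate the proposed predefined macros into constants that ordinary
// C++ code can branch on.
#ifdef __RECIPROCAL_MATH__
constexpr bool reciprocal_math_enabled = true;
#else
constexpr bool reciprocal_math_enabled = false;
#endif

#ifdef __NO_SIGNED_ZEROS__
constexpr bool signed_zeros_ignored = true;
#else
constexpr bool signed_zeros_ignored = false;
#endif

#ifdef __NO_TRAPPING_MATH__
constexpr bool trapping_math_disabled = true;
#else
constexpr bool trapping_math_disabled = false;
#endif

#ifdef __ASSOCIATIVE_MATH__
constexpr bool associative_math_enabled = true;
#else
constexpr bool associative_math_enabled = false;
#endif

#ifdef __ROUNDING_MATH__
constexpr bool rounding_math_enabled = true;
#else
constexpr bool rounding_math_enabled = false;
#endif

// Returns (x + y) - y. The workaround path is only taken when the compiler
// reports that it may re-associate floating-point operations.
inline float plus_minus(float x, float y)
{
  if (associative_math_enabled)
    {
      // Stand-in workaround: the volatile store forces x + y to be
      // evaluated before the subtraction.
      volatile float tmp = x + y;
      return tmp - y;
    }
  return (x + y) - y;
}
```

The benefit is that the cheap code path stays untouched for builds without -fassociative-math, while fast-math builds opt into the workaround automatically.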
Re: [PATCH 04/11 v3] libstdc++: Make use of __builtin_bit_cast
With -ffast-math, a "using namespace __proposed" was still missing. The attached patch resolves the issue.

From: Matthias Kretz

The __bit_cast function was a hack to achieve what __builtin_bit_cast can
do, therefore use __builtin_bit_cast if possible. However,
__builtin_bit_cast cannot be used to cast from/to fixed_size_simd, since
it isn't trivially copyable (in the language sense; in principle it is).
Therefore add __proposed::simd_bit_cast to enable the use case required
in the test framework.

Signed-off-by: Matthias Kretz

libstdc++-v3/ChangeLog:

        * include/experimental/bits/simd.h (__bit_cast): Implement via
        __builtin_bit_cast #if available.
        (__proposed::simd_bit_cast): Add overloads for simd and
        simd_mask, which use __builtin_bit_cast (or __bit_cast #if not
        available) and return an object of the requested type with the
        same bits as the argument.
        * include/experimental/bits/simd_math.h: Use simd_bit_cast
        instead of __bit_cast to allow casts to fixed_size_simd.
        (copysign): Remove branch that was only required if __bit_cast
        cannot be constexpr.
        * testsuite/experimental/simd/tests/bits/test_values.h: Switch
        from __bit_cast to __proposed::simd_bit_cast since the former
        will not cast fixed_size objects anymore.
---
 libstdc++-v3/include/experimental/bits/simd.h | 57 ++-
 .../include/experimental/bits/simd_math.h     | 37 ++--
 .../simd/tests/bits/test_values.h             |  8 +--
 3 files changed, 76 insertions(+), 26 deletions(-)

--
── Dr.
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ── diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 163f1b574e2..852d0b62012 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -1598,7 +1598,9 @@ template _GLIBCXX_SIMD_INTRINSIC constexpr _To __bit_cast(const _From __x) { -// TODO: implement with / replace by __builtin_bit_cast ASAP +#if __has_builtin(__builtin_bit_cast) +return __builtin_bit_cast(_To, __x); +#else static_assert(sizeof(_To) == sizeof(_From)); constexpr bool __to_is_vectorizable = is_arithmetic_v<_To> || is_enum_v<_To>; @@ -1629,6 +1631,7 @@ template reinterpret_cast(&__x), sizeof(_To)); return __r; } +#endif } // }}} @@ -2900,6 +2903,58 @@ template (__x)}; } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + _To + simd_bit_cast(const simd<_Up, _Abi>& __x) + { +using _Tp = typename _To::value_type; +using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember; +using _From = simd<_Up, _Abi>; +using _FromMember = typename _SimdTraits<_Up, _Abi>::_SimdMember; +// with concepts, the following should be constraints +static_assert(sizeof(_To) == sizeof(_From)); +static_assert(is_trivially_copyable_v<_Tp> && is_trivially_copyable_v<_Up>); +static_assert(is_trivially_copyable_v<_ToMember> && is_trivially_copyable_v<_FromMember>); +#if __has_builtin(__builtin_bit_cast) +return {__private_init, __builtin_bit_cast(_ToMember, __data(__x))}; +#else +return {__private_init, __bit_cast<_ToMember>(__data(__x))}; +#endif + } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + _To + simd_bit_cast(const simd_mask<_Up, _Abi>& __x) + { +using _From = simd_mask<_Up, _Abi>; +static_assert(sizeof(_To) == sizeof(_From)); +static_assert(is_trivially_copyable_v<_From>); +// _To can be simd, 
specifically simd> in which case _To is not trivially +// copyable. +if constexpr (is_simd_v<_To>) + { + using _Tp = typename _To::value_type; + using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember; + static_assert(is_trivially_copyable_v<_ToMember>); +#if __has_builtin(__builtin_bit_cast) + return {__private_init, __builtin_bit_cast(_ToMember, __x)}; +#else + return {__private_init, __bit_cast<_ToMember>(__x)}; +#endif + } +else + { + static_assert(is_trivially_copyable_v<_To>); +#if __has_builtin(__builtin_bit_cast) + return __builtin_bit_cast(_To, __x); +#else + return __bit_cast<_To>(__x); +#endif + } + } } // namespace __proposed // simd_cast {{{2 diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h index d954e761eee..ef2bdc641b8 100644 --- a/libstdc++-v3/include
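The core idea of the __bit_cast change (prefer the builtin, fall back to a byte copy) can be sketched outside libstdc++ as follows. This is only an illustration under stated assumptions: bit_cast_sketch is a hypothetical name, the fallback uses std::memcpy rather than the vector-aware code in the patch, and it covers trivially copyable types only, which is exactly the restriction that motivates simd_bit_cast for fixed_size_simd.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Compilers without __has_builtin simply take the fallback branch.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: reinterpret the object representation of `x` as `To`.
template <typename To, typename From>
inline To bit_cast_sketch(const From& x)
{
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable_v<To>
                  && std::is_trivially_copyable_v<From>,
                "both types must be trivially copyable");
#if __has_builtin(__builtin_bit_cast)
  return __builtin_bit_cast(To, x);
#else
  To r;
  std::memcpy(&r, &x, sizeof(To)); // byte-wise copy, not usable in constexpr
  return r;
#endif
}
```

For example, on a platform with IEEE 754 single precision, bit_cast_sketch<std::uint32_t>(1.0f) yields 0x3f800000.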
Re: [PATCH] Add gnu::diagnose_as attribute
On Tuesday, 22 June 2021 21:52:16 CEST Jason Merrill wrote:
> For alias templates, you probably want the attribute only on the
> templated class, not on the instantiations.

Oh good point. My current patch does not allow the attribute on alias templates. Consider:

template struct X {};
template using foo [[gnu::diagnose_as]] = X;

I have no idea how this could work. I would have to set the attribute for an implicit partial specialization (not that I know of the existence of such a thing)? I.e. X would have to be diagnosed as foo, but X would have to be diagnosed as X, not foo.

So if anything it should only support alias templates if they are strictly "renaming" the type. I.e. their template parameters must match up exactly. Can I constrain the attribute like this? Or should we rely on developers to be reasonable and only use it for template aliases with matching template params?

-Matthias

--
──
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──
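To make the "strictly renaming" distinction concrete, here is a small sketch. The template arguments in the message above were lost in the archive, so the parameter lists below are hypothetical and only illustrate the kind of alias being discussed; the attribute itself is left out because it is not assumed to exist in any released compiler.

```cpp
#include <type_traits>

// Hypothetical class template with two parameters.
template <typename T, int N>
struct X {};

// Hypothetical alias template that fixes N: it covers only some X<T, N>.
template <typename T>
using foo = X<T, 4>;

// Hypothetical alias template whose parameters match X exactly: a renaming.
template <typename T, int N>
using bar = X<T, N>;

// foo<int> names X<int, 4>, but X<int, 8> cannot be spelled via foo, so a
// diagnostic for X<int, 8> would still have to print X.
static_assert(std::is_same_v<foo<int>, X<int, 4>>);
static_assert(!std::is_same_v<foo<int>, X<int, 8>>);

// Every specialization of X can be spelled via bar.
static_assert(std::is_same_v<bar<int, 8>, X<int, 8>>);
```

Only an alias like bar maps one-to-one onto the underlying template, which is why restricting the attribute to that case sidesteps the partial-specialization question.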
Re: [PATCH v2] libstdc++: Improve std::lock algorithm
On Tuesday, 22 June 2021 17:20:41 CEST Jonathan Wakely wrote:
> On Tue, 22 Jun 2021 at 14:21, Matthias Kretz wrote:
> > This does a try_lock on all lockables even if any of them fails. I think
> > that's not only more expensive but also non-conforming. I think you need
> > to defer locking and then loop from beginning to end to break the loop on
> > the first unsuccessful try_lock.
>
> Oops, good point. I'll add a test for that too. Here's the fixed code:
>
> template
>   inline int
>   __try_lock_impl(_L0& __l0, _Lockables&... __lockables)
>   {
> #if __cplusplus >= 201703L
>     if constexpr ((is_same_v<_L0, _Lockables> && ...))
>       {
>         constexpr int _Np = 1 + sizeof...(_Lockables);
>         unique_lock<_L0> __locks[_Np] = {
>             {__l0, defer_lock}, {__lockables, defer_lock}...
>         };
>         for (int __i = 0; __i < _Np; ++__i)

I thought coding style requires a { here?

>           if (!__locks[__i].try_lock())
>             {
>               const int __failed = __i;
>               while (__i--)
>                 __locks[__i].unlock();
>               return __i;

You meant `return __failed`?

>             }
>         for (auto& __l : __locks)
>           __l.release();
>         return -1;
>       }
>     else
> #endif
>
> > [...]
> > Yes, if only we had a wrapping integer type that wraps at an arbitrary N.
> > Like unsigned int but with parameter, like:
> >
> > for (__wrapping_uint<_Np> __k = __idx; __k != __first; --__k)
> >   __locks[__k - 1].unlock();
> >
> > This is the loop I wanted to write, except --__k is simpler to write and
> > __k - 1 would also wrap around to _Np - 1 for __k == 0. But if this is the
> > only place it's not important enough to abstract.
>
> We might be able to use __wrapping_uint in std::seed_seq::generate too, and
> maybe some other places in . But we can add that later if we decide
> it's worth it.

OK.

> > I also considered moving it down here. Makes sense unless you want to call
> > __detail::__lock_impl from other functions. And if we want to make it work
> > for pre-C++11 we could do
> >
> > using __homogeneous
> >   = __and_, is_same<_L1, _L3>...>;
> > int __i = 0;
> > __detail::__lock_impl(__homogeneous(), __i, 0, __l1, __l2, __l3...);
>
> We don't need tag dispatching, we could just do:
>
> if _GLIBCXX17_CONSTEXPR (homogeneous::value)
>   ...
> else
>   ...
>
> because both branches are valid for the homogeneous case, i.e. we aren't
> using if-constexpr to avoid invalid instantiations.

But for the inhomogeneous case the homogeneous code is invalid (initialization of C-array of unique_lock<_L1>).

> But given that the default -std option is gnu++17 now, I'm OK with the
> iterative version only being used for C++17.

Fair enough.

--
──
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──
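The two review points in this message (defer the locking so the loop stops at the first failure, and return the saved index because the unlock loop leaves the counter at -1) can be sketched as follows. This is a minimal sketch of the homogeneous case only, not the libstdc++ implementation: try_lock_all and FakeLock are hypothetical names, and FakeLock is a single-threaded stand-in used purely to make the behaviour observable.

```cpp
#include <mutex>
#include <type_traits>

// Hypothetical counterpart of the reviewed __try_lock_impl.
// Returns -1 on success, otherwise the index of the lock that refused.
template <typename L0, typename... Ls>
int try_lock_all(L0& l0, Ls&... ls)
{
  static_assert((std::is_same_v<L0, Ls> && ...),
                "this sketch only handles lockables of one type");
  constexpr int N = 1 + sizeof...(Ls);
  // defer_lock: constructing the guards acquires nothing yet.
  std::unique_lock<L0> locks[N] = {
    {l0, std::defer_lock}, {ls, std::defer_lock}...
  };
  for (int i = 0; i < N; ++i)
    {
      if (!locks[i].try_lock())
        {
          const int failed = i; // save it: the loop below ends with i == -1
          while (i--)
            locks[i].unlock();
          return failed;
        }
    }
  for (auto& l : locks)
    l.release(); // success: ownership passes to the caller
  return -1;
}

// Hypothetical single-threaded stand-in for a mutex, for demonstration only.
struct FakeLock
{
  bool held = false;
  bool refuse = false;
  int attempts = 0;

  void lock() { held = true; }
  bool try_lock()
  {
    ++attempts;
    if (refuse || held)
      return false;
    held = true;
    return true;
  }
  void unlock() { held = false; }
};
```

With three FakeLock objects where the second one refuses, the call returns 1, the first lock is released again, and the third is never attempted.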
Re: [PATCH v2] libstdc++: Improve std::lock algorithm
{
> +              const int __idx = (__first + __j) % _Np;
> +              if (!__locks[__idx].try_lock())
> +                {
> +                  for (int __k = __j; __k != 0; --__k)
> +                    __locks[(__first + __k - 1) % _Np].unlock();
> +                  __first = __idx;
> +                  break;
> +                }
> +            }
> +        } while (!__locks[__first]);
> +
> +      for (auto& __l : __locks)
> +        __l.release();
> +    }
> +  else
> +#endif
> +    {
> +      int __i = 0;
> +      __detail::__lock_impl(__i, 0, __l1, __l2, __l3...);
> +    }
>    }
>
>  #if __cplusplus >= 201703L

--
──
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──
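The quoted hunk shows only the inner part of the retry loop. A self-contained sketch of the whole idea might look like the following; the outer do-loop with the blocking lock() call is an assumption reconstructed from the visible tail, and lock_all and FlakyLock are hypothetical names. After a failed try_lock, everything acquired so far is released and the next round starts by blocking on the lock that refused.

```cpp
#include <mutex>
#include <type_traits>

// Hypothetical sketch of a rotating lock-all for lockables of one type.
template <typename L1, typename L2, typename... L3>
void lock_all(L1& l1, L2& l2, L3&... l3)
{
  static_assert(std::is_same_v<L1, L2> && (std::is_same_v<L1, L3> && ...),
                "this sketch only handles lockables of one type");
  constexpr int N = 2 + sizeof...(L3);
  std::unique_lock<L1> locks[N] = {
    {l1, std::defer_lock}, {l2, std::defer_lock}, {l3, std::defer_lock}...
  };
  int first = 0;
  do
    {
      locks[first].lock(); // block on one lock, then try the others in order
      for (int j = 1; j < N; ++j)
        {
          const int idx = (first + j) % N;
          if (!locks[idx].try_lock())
            {
              // Release the j locks acquired in this round.
              for (int k = j; k != 0; --k)
                locks[(first + k - 1) % N].unlock();
              first = idx; // next round blocks on the contended lock
              break;
            }
        }
    }
  while (!locks[first]); // not owned means the round was abandoned

  for (auto& l : locks)
    l.release(); // keep everything locked for the caller
}

// Hypothetical single-threaded stand-in that refuses a given number of
// try_lock calls, for demonstration only.
struct FlakyLock
{
  bool held = false;
  int refusals = 0;

  void lock() { held = true; }
  bool try_lock()
  {
    if (refusals > 0)
      {
        --refusals;
        return false;
      }
    if (held)
      return false;
    held = true;
    return true;
  }
  void unlock() { held = false; }
};
```

The modulo arithmetic is what the thread's __wrapping_uint remark refers to: the indices wrap around N so any lock can serve as the starting point.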
Re: [committed] libstdc++: Improve std::lock algorithm
90,19 +627,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>      void
>      lock(_L1& __l1, _L2& __l2, _L3&... __l3)
>      {
> -      while (true)
> -        {
> -          using __try_locker = __try_lock_impl<0, sizeof...(_L3) != 0>;
> -          unique_lock<_L1> __first(__l1);
> -          int __idx;
> -          auto __locks = std::tie(__l2, __l3...);
> -          __try_locker::__do_try_lock(__locks, __idx);
> -          if (__idx == -1)
> -            {
> -              __first.release();
> -              return;
> -            }
> -        }
> +      int __i = 0;
> +      __detail::__lock_impl(__i, 0, __l1, __l2, __l3...);
>      }
>
>  #if __cplusplus >= 201703L

--
──
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──
Re: [PATCH] Add gnu::diagnose_as attribute
On Wednesday, 16 June 2021 02:48:09 CEST Jason Merrill wrote:
> > IIUC, your main concern is that my proposed diagnose_as *can* be used to
> > make diagnostics worse, by replacing names with strings that are not
> > valid identifiers. Of course, whoever uses the attribute to that effect
> > should have a good reason to do so. Is your other concern that using the
> > attribute in a "good" way is repetitive? Would you be happier if I make
> > the string argument to the attribute optional for type aliases?
>
> Yes, and namespace aliases.

I'll look into making the attribute argument optional for aliases. Would you accept the patch with this change?

Questions:

1. If a type alias applies the attribute after a type was completed / implicitly instantiated (and possibly already used in diagnostics) should / can I still modify the type and add the attribute?

2. About the namespace aliases: IIUC an attribute would currently be rejected because of the C++ grammar. Do you want to make it valid before WG21 officially decides how to proceed? And if you have a pointer for me where I'd have to adjust the grammar rules, that'd help. :)

Best,
Matthias

--
──
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──
Re: [PATCH] Add gnu::diagnose_as attribute
On Tuesday, 15 June 2021 17:51:20 CEST Jason Merrill wrote:
> On 6/11/21 6:01 AM, Matthias Kretz wrote:
> > For reference I'll attach my stdx::simd diagnose_as patch.
> >
> > We could also talk about extending the feature to provide more information
> > about the diagnose_as substitution. E.g. print a list of all diagnose_as
> > substitutions, which were used, at the end of the output stream. Or
> > simpler, print "note: some identifiers were simplified, use
> > -fno-diagnostics-use-aliases to see their real names".
>
> Or perhaps before the first use of a name that doesn't correspond to a
> source-level name.

Right. I guess that would be even easier to implement than printing it at the end.

> > -struct _Scalar;
> > +  struct [[__gnu__::__diagnose_as__("scalar")]] _Scalar;
> >
> >  template
> >
> > -  struct _Fixed;
> > +  struct [[__gnu__::__diagnose_as__("fixed_size")]] _Fixed;
>
> These two could be the variant of the attribute without an explicit
> string, attached to the alias-declaration.

Agreed. (since you don't have implementation concerns...)

> > +using __sse [[__gnu__::__diagnose_as__("[SSE]")]] = _VecBuiltin<16>;
> > +using __avx [[__gnu__::__diagnose_as__("[AVX]")]] = _VecBuiltin<32>;
> > +using __avx512 [[__gnu__::__diagnose_as__("[AVX512]")]] = _VecBltnBtmsk<64>;
> > +  using __odr_helper [[__gnu__::__diagnose_as__("[ODR helper]")]]
>
> These [] names seem like minimal improvements over the __ names that you
> would get from the attribute without an explicit string.

Right. It would, however, give the user an identifier that I don't want them to use in their code. We could argue "it has a double-underscore and it's not a documented implementation-defined type, so you're shooting yourself in the foot". Or we could just avoid the issue altogether. I agree this is not a huge issue.

> > +  inline namespace parallelism_v2 [[__gnu__::__diagnose_as__("std\u2093")]] {
>
> This could go on std::experimental itself, along with my proposed change
> to hide inline namespaces by default (with a note similar to the one above).

Yes, with the following consequences:

* If only the std::experimental::parallelism_v2::simd headers set the diagnose_as attribute on std::experimental, the #inclusion of changes the diagnostics of all other TS implementations.

* If all TS implementations set the diagnose_as attribute, then it's basically impossible to go back to the long and scary name. Which is what we really should do as soon as there's both a std::simd and a stdₓ::simd. Attaching the diagnose_as attribute to the inline namespace allows for better granularity, even if it's maybe not good enough for some TSs.

* If `namespace std { namespace experimental [[gnu::diagnose_as("foo")]] {` turns the scope into 'foo::' and not 'std::foo::' (not sure what you intended) then I could still attach the attribute to the inline namespace.

So, yes, I could improve stdx::simd with what you propose. IMHO it wouldn't be as good as what I can do with the patch at hand, though.

IIUC, your main concern is that my proposed diagnose_as *can* be used to make diagnostics worse, by replacing names with strings that are not valid identifiers. Of course, whoever uses the attribute to that effect should have a good reason to do so. Is your other concern that using the attribute in a "good" way is repetitive? Would you be happier if I make the string argument to the attribute optional for type aliases?

--
──
Dr. Matthias Kretz https://mattkretz.github.io
GSI Helmholtz Centre for Heavy Ion Research https://gsi.de
std::experimental::simd https://github.com/VcDevel/std-simd
──
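As background for the inline-namespace point, the following sketch shows why hiding an inline namespace in diagnostics loses no information a user needs: the type is reachable with or without the inline namespace in its spelling. The namespace lib and its layout are hypothetical and merely mirror the shape of std::experimental::parallelism_v2; no diagnose_as attribute is used, since it is not assumed to be available.

```cpp
#include <type_traits>

// Hypothetical library layout mirroring std::experimental::parallelism_v2.
namespace lib
{
  namespace experimental
  {
    inline namespace parallelism_v2
    {
      struct simd {};
    }
  }
}

// Both spellings name the same type; users normally write the shorter one,
// while a diagnostic would print the fully qualified longer one.
static_assert(std::is_same_v<lib::experimental::simd,
                             lib::experimental::parallelism_v2::simd>);
```

Attaching a replacement name to the inline namespace rather than to its parent, as argued above, keeps the effect local to one TS.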
Re: [PATCH 04/11 v2] libstdc++: Make use of __builtin_bit_cast
While testing newer patches I found several missing conversions from __bit_cast to simd_bit_cast in this patch (i.e. where bit casting to / from fixed_size was sometimes required). Corrected patch attached.

From: Matthias Kretz

The __bit_cast function was a hack to achieve what __builtin_bit_cast can
do, therefore use __builtin_bit_cast if possible. However,
__builtin_bit_cast cannot be used to cast from/to fixed_size_simd, since
it isn't trivially copyable (in the language sense; in principle it is).
Therefore add __proposed::simd_bit_cast to enable the use case required
in the test framework.

Signed-off-by: Matthias Kretz

libstdc++-v3/ChangeLog:

        * include/experimental/bits/simd.h (__bit_cast): Implement via
        __builtin_bit_cast #if available.
        (__proposed::simd_bit_cast): Add overloads for simd and
        simd_mask, which use __builtin_bit_cast (or __bit_cast #if not
        available) and return an object of the requested type with the
        same bits as the argument.
        * include/experimental/bits/simd_math.h: Use simd_bit_cast
        instead of __bit_cast to allow casts to fixed_size_simd.
        (copysign): Remove branch that was only required if __bit_cast
        cannot be constexpr.
        * testsuite/experimental/simd/tests/bits/test_values.h: Switch
        from __bit_cast to __proposed::simd_bit_cast since the former
        will not cast fixed_size objects anymore.
---
 libstdc++-v3/include/experimental/bits/simd.h | 57 ++-
 .../include/experimental/bits/simd_math.h     | 36 +---
 .../simd/tests/bits/test_values.h             |  8 +--
 3 files changed, 75 insertions(+), 26 deletions(-)

--
── Dr.
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 163f1b574e2..852d0b62012 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -1598,7 +1598,9 @@ template _GLIBCXX_SIMD_INTRINSIC constexpr _To __bit_cast(const _From __x) { -// TODO: implement with / replace by __builtin_bit_cast ASAP +#if __has_builtin(__builtin_bit_cast) +return __builtin_bit_cast(_To, __x); +#else static_assert(sizeof(_To) == sizeof(_From)); constexpr bool __to_is_vectorizable = is_arithmetic_v<_To> || is_enum_v<_To>; @@ -1629,6 +1631,7 @@ template reinterpret_cast(&__x), sizeof(_To)); return __r; } +#endif } // }}} @@ -2900,6 +2903,58 @@ template (__x)}; } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + _To + simd_bit_cast(const simd<_Up, _Abi>& __x) + { +using _Tp = typename _To::value_type; +using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember; +using _From = simd<_Up, _Abi>; +using _FromMember = typename _SimdTraits<_Up, _Abi>::_SimdMember; +// with concepts, the following should be constraints +static_assert(sizeof(_To) == sizeof(_From)); +static_assert(is_trivially_copyable_v<_Tp> && is_trivially_copyable_v<_Up>); +static_assert(is_trivially_copyable_v<_ToMember> && is_trivially_copyable_v<_FromMember>); +#if __has_builtin(__builtin_bit_cast) +return {__private_init, __builtin_bit_cast(_ToMember, __data(__x))}; +#else +return {__private_init, __bit_cast<_ToMember>(__data(__x))}; +#endif + } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + _To + simd_bit_cast(const simd_mask<_Up, _Abi>& __x) + { +using _From = simd_mask<_Up, _Abi>; +static_assert(sizeof(_To) == sizeof(_From)); +static_assert(is_trivially_copyable_v<_From>); +// _To can be simd, 
specifically simd> in which case _To is not trivially +// copyable. +if constexpr (is_simd_v<_To>) + { + using _Tp = typename _To::value_type; + using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember; + static_assert(is_trivially_copyable_v<_ToMember>); +#if __has_builtin(__builtin_bit_cast) + return {__private_init, __builtin_bit_cast(_ToMember, __x)}; +#else + return {__private_init, __bit_cast<_ToMember>(__x)}; +#endif + } +else + { + static_assert(is_trivially_copyable_v<_To>); +#if __has_builtin(__builtin_bit_cast) + return __builtin_bit_cast(_To, __x); +#else + return __bit_cast<_To>(__x); +#endif + } + } } // namespace __proposed // simd_cast {{{2 diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/
Re: [PATCH] Add gnu::diagnose_as attribute
How can we make progress here? I could try to produce some "Tony Tables" of diagnostic output of my modified stdx::simd. I believe it's a major productivity boost to see abbreviated / "obfuscated" diagnostics *out of the box* (with the possibility to opt out). Actually, it already *is* a productivity boost to me. Understanding diagnostics has improved from

"1. ooof, I'm not going to read this, let me rather guess what the issue was
 2. sh** I have to read it
 3. several minutes later: I finally found the five words to understand the problem; I could use a break"

to

"1. right, let me check that"

For reference I'll attach my stdx::simd diagnose_as patch.

We could also talk about extending the feature to provide more information about the diagnose_as substitution. E.g. print a list of all diagnose_as substitutions, which were used, at the end of the output stream. Or simpler, print "note: some identifiers were simplified, use -fno-diagnostics-use-aliases to see their real names".

On Tuesday, 1 June 2021 21:12:18 CEST Jason Merrill wrote:
> > Right, but then two of my design goals can't be met:
> >
> > 1. Diagnostics have an improved signal-to-noise ratio out of the box.
> >
> > 2. We can use replacement names that are not valid identifiers.
>
> This is the basic disconnect: I think that these goals are
> contradictory, and that replacement names that are not valid identifiers
> will just confuse users that don't know about them.
>
> If a user sees stdx::foo in a diagnostic and then tries to refer to
> stdx::foo and gets an error, the diagnostic is not more helpful than one
> that uses the fully qualified name.
>
> Jonathan, David, any thoughts on this issue?

--
── Dr.
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ── diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 43331134301..8e0cceff860 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -80,13 +80,13 @@ using __m512d [[__gnu__::__vector_size__(64)]] = double; using __m512i [[__gnu__::__vector_size__(64)]] = long long; #endif -namespace simd_abi { +namespace simd_abi [[__gnu__::__diagnose_as__("simd_abi")]] { // simd_abi forward declarations {{{ // implementation details: -struct _Scalar; + struct [[__gnu__::__diagnose_as__("scalar")]] _Scalar; template - struct _Fixed; + struct [[__gnu__::__diagnose_as__("fixed_size")]] _Fixed; // There are two major ABIs that appear on different architectures. // Both have non-boolean values packed into an N Byte register @@ -105,28 +105,11 @@ template template struct _VecBltnBtmsk; -template - using _VecN = _VecBuiltin; - -template - using _Sse = _VecBuiltin<_UsedBytes>; - -template - using _Avx = _VecBuiltin<_UsedBytes>; - -template - using _Avx512 = _VecBltnBtmsk<_UsedBytes>; - -template - using _Neon = _VecBuiltin<_UsedBytes>; - -// implementation-defined: -using __sse = _Sse<>; -using __avx = _Avx<>; -using __avx512 = _Avx512<>; -using __neon = _Neon<>; -using __neon128 = _Neon<16>; -using __neon64 = _Neon<8>; +#if defined __i386__ || defined __x86_64__ +using __sse [[__gnu__::__diagnose_as__("[SSE]")]] = _VecBuiltin<16>; +using __avx [[__gnu__::__diagnose_as__("[AVX]")]] = _VecBuiltin<32>; +using __avx512 [[__gnu__::__diagnose_as__("[AVX512]")]] = _VecBltnBtmsk<64>; +#endif // standard: template @@ -364,7 +347,7 @@ namespace __detail * users link TUs compiled with different flags. This is especially important * for using simd in libraries. 
*/ - using __odr_helper + using __odr_helper [[__gnu__::__diagnose_as__("[ODR helper]")]] = conditional_t<__machine_flags() == 0, _OdrEnforcer, _MachineFlagsTemplate<__machine_flags(), __floating_point_flags()>>; @@ -689,7 +672,7 @@ template __is_avx512_abi() { constexpr auto _Bytes = __abi_bytes_v<_Abi>; -return _Bytes <= 64 && is_same_v, _Abi>; +return _Bytes <= 64 && is_same_v, _Abi>; } // }}} diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h b/libstdc++-v3/include/experimental/bits/simd_detail.h index 78ad33f74e4..1f127cd0d52 100644 --- a/libstdc++-v3/include/experimental/bits/simd_detail.h +++ b/libstdc++-v3/include/experimental/bits/simd_detail.h @@ -36,7 +36,7 @@ {
Re: [PATCH 11/11] libstdc++: Fix ODR issues with different -m flags
On Wednesday, 9 June 2021 14:22:00 CEST Richard Biener wrote:
> On Tue, Jun 8, 2021 at 2:23 PM Matthias Kretz wrote:
> > From: Matthias Kretz
> >
> > Explicitly support use of the stdx::simd implementation in situations
> > where the user links TUs that were compiled with different -m flags. In
> > general, this is always a (quasi) ODR violation for inline functions
> > because at least codegen may differ in important ways. However, in the
> > resulting executable only one (unspecified which one) of them might be
> > used. For simd we want to support users to compile code multiple times,
> > with different -m flags and have a runtime dispatch to the TU matching
> > the target CPU. But if internal functions are not inlined this may lead
> > to unexpected performance loss or execution of illegal instructions.
> > Therefore, inline functions that are not marked as always_inline must
> > use an additional template parameter somewhere in their name, to
> > disambiguate between the different -m translations.
>
> Note that excessive use of always_inline can cause compile-time issues
> (see for example PR99785).

Ah, I should verify whether that's also the reason my stdx::simd implementation is slow to compile. However, I really must have the always_inline semantics in most of the places stdx::simd uses it, because most of these functions compile to either a single function call or a single instruction (often f0 -> f1 -> f2 -> single instruction). If the inliner makes even a single wrong inlining decision, the whole program might slow down by integral factors, not only small percentages. And without inlining these functions, -fno-inline builds (i.e. many debug builds) become unbearably slow (aka useless).

> I wonder whether the inlines can be
> placed in an anonymous namespace instead of the difficult to maintain
> explicit list of SIMD features?

It's possible, and part of the patch:

+  namespace
+  {
+    struct _OdrEnforcer {};
+  }
[...]
+ using __odr_helper += conditional_t<__machine_flags() == 0, _OdrEnforcer, + _MachineFlagsTemplate<__machine_flags(), __floating_point_flags()>>; It can potentially blow up the code size and the instruction cache usage, though. The trade-off isn't obvious to make. I guess I can't promise that mixing different compiler flags is ODR violation free > It also doesn't solve the issue when > instantiating the functions from a TU which contains #pragma GCC target > sections to switch options, of course. Yes. Can I get PR83875? ;-) - Matthias > > Signed-off-by: Matthias Kretz > > > > libstdc++-v3/ChangeLog: > > * include/experimental/bits/simd.h: Move feature detection bools > > and add __have_avx512bitalg, __have_avx512vbmi2, > > __have_avx512vbmi, __have_avx512ifma, __have_avx512cd, > > __have_avx512vnni, __have_avx512vpopcntdq. > > (__detail::__machine_flags): New function which returns a unique > > uint64 depending on relevant -m and -f flags. > > (__detail::__odr_helper): New type alias for either an anonymous > > type or a type specialized with the __machine_flags number. > > (_SimdIntOperators): Change template parameters from _Impl to > > _Tp, _Abi because _Impl now has an __odr_helper parameter which > > may be _OdrEnforcer from the anonymous namespace, which makes > > for a bad base class. > > (many): Either add __odr_helper template parameter or mark as > > always_inline. > > * include/experimental/bits/simd_detail.h: Add defines for > > AVX512BITALG, AVX512VBMI2, AVX512VBMI, AVX512IFMA, AVX512CD, > > AVX512VNNI, AVX512VPOPCNTDQ, and AVX512VP2INTERSECT. > > * include/experimental/bits/simd_builtin.h: Add __odr_helper > > template parameter or mark as always_inline. > > * include/experimental/bits/simd_fixed_size.h: Ditto. > > * include/experimental/bits/simd_math.h: Ditto. > > * include/experimental/bits/simd_scalar.h: Ditto. > > * include/experimental/bits/simd_neon.h: Add __odr_helper > > template parameter. > > * include/experimental/bits/simd_ppc.h: Ditto. 
> > * include/experimental/bits/simd_x86.h: Ditto. > > > > --- > > > > libstdc++-v3/include/experimental/bits/simd.h | 380 -- > > .../include/experimental/bits/simd_builtin.h | 41 +- > > .../include/experimental/bits/simd_detail.h | 40 ++ > > .../experimental/bits/simd_fixed_size.h | 39 +- > &
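The disambiguation trick discussed in this message can be sketched in isolation as follows. This is an illustration under stated assumptions, not the patch: the namespace sketch and the names OdrEnforcer, MachineFlagsTag, machine_flags, odr_helper and twice are hypothetical, and only three target macros are folded into the flag value, whereas the patch considers many more -m and -f flags.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch
{
  namespace
  {
    // A type in an unnamed namespace is distinct in every translation unit.
    struct OdrEnforcer {};
  }

  // One distinct type per combination of target features.
  template <std::uint64_t Flags>
  struct MachineFlagsTag {};

  // Hypothetical reduction of the active target flags to one number.
  constexpr std::uint64_t machine_flags()
  {
    std::uint64_t bits = 0;
#ifdef __SSE2__
    bits |= std::uint64_t(1) << 0;
#endif
#ifdef __AVX__
    bits |= std::uint64_t(1) << 1;
#endif
#ifdef __AVX2__
    bits |= std::uint64_t(1) << 2;
#endif
    return bits;
  }

  using odr_helper
    = std::conditional_t<machine_flags() == 0, OdrEnforcer,
                         MachineFlagsTag<machine_flags()>>;

  // The extra defaulted parameter becomes part of the specialization's name,
  // so builds with different flags instantiate differently named functions.
  template <typename T, typename = odr_helper>
  T twice(T x)
  { return x + x; }
}
```

Callers still write sketch::twice(21); the tag only shows up in the mangled symbol, which is what keeps the linker from merging code generated for different targets.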
[PATCH 11/11] libstdc++: Fix ODR issues with different -m flags
From: Matthias Kretz Explicitly support use of the stdx::simd implementation in situations where the user links TUs that were compiled with different -m flags. In general, this is always a (quasi) ODR violation for inline functions because at least codegen may differ in important ways. However, in the resulting executable only one (unspecified which one) of them might be used. For simd we want to support users to compile code multiple times, with different -m flags and have a runtime dispatch to the TU matching the target CPU. But if internal functions are not inlined this may lead to unexpected performance loss or execution of illegal instructions. Therefore, inline functions that are not marked as always_inline must use an additional template parameter somewhere in their name, to disambiguate between the different -m translations. Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd.h: Move feature detection bools and add __have_avx512bitalg, __have_avx512vbmi2, __have_avx512vbmi, __have_avx512ifma, __have_avx512cd, __have_avx512vnni, __have_avx512vpopcntdq. (__detail::__machine_flags): New function which returns a unique uint64 depending on relevant -m and -f flags. (__detail::__odr_helper): New type alias for either an anonymous type or a type specialized with the __machine_flags number. (_SimdIntOperators): Change template parameters from _Impl to _Tp, _Abi because _Impl now has an __odr_helper parameter which may be _OdrEnforcer from the anonymous namespace, which makes for a bad base class. (many): Either add __odr_helper template parameter or mark as always_inline. * include/experimental/bits/simd_detail.h: Add defines for AVX512BITALG, AVX512VBMI2, AVX512VBMI, AVX512IFMA, AVX512CD, AVX512VNNI, AVX512VPOPCNTDQ, and AVX512VP2INTERSECT. * include/experimental/bits/simd_builtin.h: Add __odr_helper template parameter or mark as always_inline. * include/experimental/bits/simd_fixed_size.h: Ditto. 
* include/experimental/bits/simd_math.h: Ditto. * include/experimental/bits/simd_scalar.h: Ditto. * include/experimental/bits/simd_neon.h: Add __odr_helper template parameter. * include/experimental/bits/simd_ppc.h: Ditto. * include/experimental/bits/simd_x86.h: Ditto. --- libstdc++-v3/include/experimental/bits/simd.h | 380 -- .../include/experimental/bits/simd_builtin.h | 41 +- .../include/experimental/bits/simd_detail.h | 40 ++ .../experimental/bits/simd_fixed_size.h | 39 +- .../include/experimental/bits/simd_math.h | 45 ++- .../include/experimental/bits/simd_neon.h | 4 +- .../include/experimental/bits/simd_ppc.h | 4 +- .../include/experimental/bits/simd_scalar.h | 71 +++- .../include/experimental/bits/simd_x86.h | 4 +- 9 files changed, 440 insertions(+), 188 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 21100c1087d..43331134301 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -35,6 +35,7 @@ #include // for stderr #endif #include +#include #include #include #include @@ -203,9 +204,170 @@ template // }}} template using _SizeConstant = integral_constant; +// constexpr feature detection{{{ +constexpr inline bool __have_mmx = _GLIBCXX_SIMD_HAVE_MMX; +constexpr inline bool __have_sse = _GLIBCXX_SIMD_HAVE_SSE; +constexpr inline bool __have_sse2 = _GLIBCXX_SIMD_HAVE_SSE2; +constexpr inline bool __have_sse3 = _GLIBCXX_SIMD_HAVE_SSE3; +constexpr inline bool __have_ssse3 = _GLIBCXX_SIMD_HAVE_SSSE3; +constexpr inline bool __have_sse4_1 = _GLIBCXX_SIMD_HAVE_SSE4_1; +constexpr inline bool __have_sse4_2 = _GLIBCXX_SIMD_HAVE_SSE4_2; +constexpr inline bool __have_xop = _GLIBCXX_SIMD_HAVE_XOP; +constexpr inline bool __have_avx = _GLIBCXX_SIMD_HAVE_AVX; +constexpr 
inline bool __have_avx2 = _GLIBCXX_SIMD_HAVE_AVX2; +constexpr inline bool __have_bmi = _GLIBCXX_SIMD_HAVE_BMI1; +constexpr inline bool __have_bmi2 = _GLIBCXX_SIMD_HAVE_BMI2; +constexpr inline bool __have_lzcnt = _GLIBCXX_SIMD_HAVE_LZCNT; +constexpr inline bool __have_sse4a = _GLIBCXX_SIMD_HAVE_SSE4A; +constexpr inline bool __have_fma = _GLIBCXX_SIMD_HAVE_FMA; +constexpr inline bool __have_fma4 = _GLIBCXX_SIMD_HAVE_FMA4; +constexpr inline bool __have_f16c = _GLIBCXX_SIMD_HAVE_F16C; +constexpr inline bool __have_popcnt
[PATCH 10/11] libstdc++: Fix internal names: add missing underscores
From: Matthias Kretz Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd_math.h (_GLIBCXX_SIMD_MATH_CALL2_): Rename arg2_ to __arg2. (_GLIBCXX_SIMD_MATH_CALL3_): Rename arg2_ to __arg2 and arg3_ to __arg3. --- libstdc++-v3/include/experimental/bits/simd_math.h | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h index a5df2039970..61af9fc67af 100644 --- a/libstdc++-v3/include/experimental/bits/simd_math.h +++ b/libstdc++-v3/include/experimental/bits/simd_math.h @@ -119,10 +119,10 @@ template //}}} // _GLIBCXX_SIMD_MATH_CALL2_ {{{ -#define _GLIBCXX_SIMD_MATH_CALL2_(__name, arg2_) \ +#define _GLIBCXX_SIMD_MATH_CALL2_(__name, __arg2) \ template < \ typename _Tp, typename _Abi, typename...,\ - typename _Arg2 = _Extra_argument_type, \ + typename _Arg2 = _Extra_argument_type<__arg2, _Tp, _Abi>,\ typename _R = _Math_return_type_t< \ decltype(std::__name(declval(), _Arg2::declval())), _Tp, _Abi>>\ enable_if_t, _R>\ @@ -137,7 +137,7 @@ template\ declval(), \ declval, \ + is_same<__arg2, _Tp>,\ negation, simd<_Tp, _Abi>>>, \ is_convertible<_Up, simd<_Tp, _Abi>>, is_floating_point<_Tp>>, \ double>>())), \ @@ -147,10 +147,10 @@ template\ // }}} // _GLIBCXX_SIMD_MATH_CALL3_ {{{ -#define _GLIBCXX_SIMD_MATH_CALL3_(__name, arg2_, arg3_)\ +#define _GLIBCXX_SIMD_MATH_CALL3_(__name, __arg2, __arg3) \ template , \ - typename _Arg3 = _Extra_argument_type, \ + typename _Arg2 = _Extra_argument_type<__arg2, _Tp, _Abi>,\ + typename _Arg3 = _Extra_argument_type<__arg3, _Tp, _Abi>,\ typename _R = _Math_return_type_t< \ decltype(std::__name(declval(), _Arg2::declval(), \ _Arg3::declval())), \
[PATCH 09/11] libstdc++: Ensure unrolled loops inline the lambda
From: Matthias Kretz Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd.h (__execute_on_index_sequence, __execute_on_index_sequence_with_return, __call_with_n_evaluations, __call_with_subscripts): Add flatten attribute. --- libstdc++-v3/include/experimental/bits/simd.h | 12 1 file changed, 8 insertions(+), 4 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 5d243f22434..21100c1087d 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -234,7 +234,8 @@ namespace __detail // unrolled/pack execution helpers // __execute_n_times{{{ template - _GLIBCXX_SIMD_INTRINSIC constexpr void + [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr + void __execute_on_index_sequence(_Fp&& __f, index_sequence<_I...>) { ((void)__f(_SizeConstant<_I>()), ...); } @@ -254,7 +255,8 @@ template // }}} // __generate_from_n_evaluations{{{ template - _GLIBCXX_SIMD_INTRINSIC constexpr _R + [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr + _R __execute_on_index_sequence_with_return(_Fp&& __f, index_sequence<_I...>) { return _R{__f(_SizeConstant<_I>())...}; } @@ -269,7 +271,8 @@ template // }}} // __call_with_n_evaluations{{{ template - _GLIBCXX_SIMD_INTRINSIC constexpr auto + [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr + auto __call_with_n_evaluations(index_sequence<_I...>, _F0&& __f0, _FArgs&& __fargs) { return __f0(__fargs(_SizeConstant<_I>())...); } @@ -285,7 +288,8 @@ template // }}} // __call_with_subscripts{{{ template - _GLIBCXX_SIMD_INTRINSIC constexpr auto + [[__gnu__::__flatten__]] _GLIBCXX_SIMD_INTRINSIC constexpr + auto __call_with_subscripts(_Tp&& __x, index_sequence<_It...>, _Fp&& __fun) { return 
__fun(__x[_First + _It]...); }
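To illustrate what the flatten attribute is applied to, here is a reduced sketch of an index-sequence based "unrolled loop" helper. It is not the libstdc++ code: the names in namespace `sketch` are hypothetical, and `[[gnu::flatten]]` is spelled without the reserved-name underscores used in the patch. The attribute asks GCC (and Clang) to inline every call made in the function body, so the lambda passed in is not left as an out-of-line call per iteration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

namespace sketch
{
  template <std::size_t I>
    using SizeConstant = std::integral_constant<std::size_t, I>;

  // Calls f(SizeConstant<0>()), f(SizeConstant<1>()), ... through a fold
  // expression. flatten requests inlining of all calls inside this body.
  template <typename F, std::size_t... I>
    [[gnu::flatten]] constexpr void
    execute_on_index_sequence(F&& f, std::index_sequence<I...>)
    { ((void)f(SizeConstant<I>()), ...); }

  // Convenience wrapper: run f for the indices 0 .. N-1.
  template <std::size_t N, typename F>
    constexpr void
    execute_n_times(F&& f)
    {
      execute_on_index_sequence(static_cast<F&&>(f),
                                std::make_index_sequence<N>());
    }

  // Example use: sum a fixed-size array without a runtime loop.
  template <std::size_t N>
    constexpr int
    sum(const std::array<int, N>& a)
    {
      int r = 0;
      execute_n_times<N>([&](auto i) { r += a[i]; });
      return r;
    }
}
```

Each `i` is a distinct `integral_constant` type, so the index is a compile-time constant inside the lambda; whether the calls are actually inlined remains up to the compiler, which is why the patch adds the attribute.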
[PATCH 08/11] libstdc++: Avoid raising fp exceptions in trunc, floor, and ceil
From: Matthias Kretz Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd_x86.h (_S_trunc, _S_floor, _S_ceil): Set bit 8 (_MM_FROUND_NO_EXC) on AVX and SSE4.1 roundp[sd] calls. --- .../include/experimental/bits/simd_x86.h | 24 +-- 1 file changed, 12 insertions(+), 12 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h index 5706bf63845..34633c096b1 100644 --- a/libstdc++-v3/include/experimental/bits/simd_x86.h +++ b/libstdc++-v3/include/experimental/bits/simd_x86.h @@ -2657,13 +2657,13 @@ template else if constexpr (__is_avx512_pd<_Tp, _Np>()) return _mm512_roundscale_pd(__x, 0x0b); else if constexpr (__is_avx_ps<_Tp, _Np>()) - return _mm256_round_ps(__x, 0x3); + return _mm256_round_ps(__x, 0xb); else if constexpr (__is_avx_pd<_Tp, _Np>()) - return _mm256_round_pd(__x, 0x3); + return _mm256_round_pd(__x, 0xb); else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>()) - return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0x3)); + return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0xb)); else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>()) - return _mm_round_pd(__x, 0x3); + return _mm_round_pd(__x, 0xb); else if constexpr (__is_sse_ps<_Tp, _Np>()) { auto __truncated @@ -2786,13 +2786,13 @@ template else if constexpr (__is_avx512_pd<_Tp, _Np>()) return _mm512_roundscale_pd(__x, 0x09); else if constexpr (__is_avx_ps<_Tp, _Np>()) - return _mm256_round_ps(__x, 0x1); + return _mm256_round_ps(__x, 0x9); else if constexpr (__is_avx_pd<_Tp, _Np>()) - return _mm256_round_pd(__x, 0x1); + return _mm256_round_pd(__x, 0x9); else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>()) - return __auto_bitcast(_mm_floor_ps(__to_intrin(__x))); + return 
__auto_bitcast(_mm_round_ps(__to_intrin(__x), 0x9)); else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>()) - return _mm_floor_pd(__x); + return _mm_round_pd(__x, 0x9); else return _Base::_S_floor(__x); } @@ -2808,13 +2808,13 @@ template else if constexpr (__is_avx512_pd<_Tp, _Np>()) return _mm512_roundscale_pd(__x, 0x0a); else if constexpr (__is_avx_ps<_Tp, _Np>()) - return _mm256_round_ps(__x, 0x2); + return _mm256_round_ps(__x, 0xa); else if constexpr (__is_avx_pd<_Tp, _Np>()) - return _mm256_round_pd(__x, 0x2); + return _mm256_round_pd(__x, 0xa); else if constexpr (__have_sse4_1 && __is_sse_ps<_Tp, _Np>()) - return __auto_bitcast(_mm_ceil_ps(__to_intrin(__x))); + return __auto_bitcast(_mm_round_ps(__to_intrin(__x), 0xa)); else if constexpr (__have_sse4_1 && __is_sse_pd<_Tp, _Np>()) - return _mm_ceil_pd(__x); + return _mm_round_pd(__x, 0xa); else return _Base::_S_ceil(__x); }
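The immediates in this diff are easier to read when decomposed into rounding mode and exception control. The sketch below re-declares the relevant `_MM_FROUND_*` values as plain constants (hypothetical names in namespace `sketch`) so that it compiles without SSE4.1 enabled; the numeric values are those of the corresponding macros in `<smmintrin.h>`. `_MM_FROUND_NO_EXC` is the bit with value 0x8.

```cpp
#include <cassert>

namespace sketch
{
  // Rounding-mode field (low two bits) of the roundps/roundpd immediate.
  constexpr int fround_to_neg_inf = 0x1; // _MM_FROUND_TO_NEG_INF
  constexpr int fround_to_pos_inf = 0x2; // _MM_FROUND_TO_POS_INF
  constexpr int fround_to_zero    = 0x3; // _MM_FROUND_TO_ZERO
  // Exception control: suppress the precision (inexact) exception.
  constexpr int fround_no_exc     = 0x8; // _MM_FROUND_NO_EXC

  // Immediates before the patch (may raise the inexact exception).
  constexpr int old_trunc_imm = fround_to_zero;    // 0x3
  constexpr int old_floor_imm = fround_to_neg_inf; // 0x1
  constexpr int old_ceil_imm  = fround_to_pos_inf; // 0x2

  // Immediates after the patch.
  constexpr int trunc_imm = old_trunc_imm | fround_no_exc;
  constexpr int floor_imm = old_floor_imm | fround_no_exc;
  constexpr int ceil_imm  = old_ceil_imm | fround_no_exc;

  static_assert(trunc_imm == 0xb);
  static_assert(floor_imm == 0x9);
  static_assert(ceil_imm == 0xa);
}
```

The new values match the immediates already used by the `_mm512_roundscale_*` branches in the context lines (0x0b, 0x09, 0x0a).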
[PATCH 07/11] libstdc++: Fix condition when AVX512F ldexp implementation is used
From: Matthias Kretz This improves codegen of ldexp if AVX512VL is available. Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd_x86.h (_S_ldexp): The AVX512F implementation doesn't require a _VecBltnBtmsk ABI tag, it requires either a 64-Byte input (in which case AVX512F must be available) or AVX512VL. --- libstdc++-v3/include/experimental/bits/simd_x86.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h index 305d7a9fa54..5706bf63845 100644 --- a/libstdc++-v3/include/experimental/bits/simd_x86.h +++ b/libstdc++-v3/include/experimental/bits/simd_x86.h @@ -2611,13 +2611,14 @@ template _S_ldexp(_SimdWrapper<_Tp, _Np> __x, __fixed_size_storage_t __exp) { - if constexpr (__is_avx512_abi<_Abi>()) + if constexpr (sizeof(__x) == 64 || __have_avx512vl) { const auto __xi = __to_intrin(__x); constexpr _SimdConverter, _Tp, _Abi> __cvt; const auto __expi = __to_intrin(__cvt(__exp)); - constexpr auto __k1 = _Abi::template _S_implicit_mask_intrin<_Tp>(); + using _Up = __bool_storage_member_type_t<_Np>; + constexpr _Up __k1 = _Np < sizeof(_Up) * __CHAR_BIT__ ? _Up((1ULL << _Np) - 1) : ~_Up(); if constexpr (sizeof(__xi) == 16) { if constexpr (sizeof(_Tp) == 8)
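The replacement for `_S_implicit_mask_intrin` computes a bitmask with the `_Np` lowest bits set. A standalone sketch of that computation follows; the function name and namespace are hypothetical, and `if constexpr` is used here instead of the conditional expression from the patch so that the shift by the full bit width is never instantiated.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

namespace sketch
{
  // Bitmask with the N least significant bits set, stored in the unsigned
  // integer type Up. Corresponds to the __k1 computation in the hunk above.
  template <typename Up, std::size_t N>
    constexpr Up
    implicit_mask()
    {
      static_assert(N <= sizeof(Up) * CHAR_BIT, "mask does not fit into Up");
      if constexpr (N < sizeof(Up) * CHAR_BIT)
        return Up((1ULL << N) - 1);
      else
        return Up(~Up()); // all bits set; avoids an out-of-range shift
    }
}
```

For example, four `float` elements in an 8-bit mask type give `0x0f`, while eight elements fill the whole mask.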
[PATCH 06/11] libstdc++: Minor simd_math cleanups
From: Matthias Kretz Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd_math.h: Undefine internal macros after use. (frexp): Move #if to a more sensible position and reformat preceding code. (logb): Call _SimdImpl::_S_logb for fixed_size instead of duplicating the code here. (modf): Simplify condition. --- .../include/experimental/bits/simd_math.h | 22 +-- 1 file changed, 6 insertions(+), 16 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h index cff4371619d..a5df2039970 100644 --- a/libstdc++-v3/include/experimental/bits/simd_math.h +++ b/libstdc++-v3/include/experimental/bits/simd_math.h @@ -645,11 +645,8 @@ template return __r; } else if constexpr (__is_fixed_size_abi_v<_Abi>) - { - return {__private_init, - _Abi::_SimdImpl::_S_frexp(__data(__x), __data(*__exp))}; + return {__private_init, _Abi::_SimdImpl::_S_frexp(__data(__x), __data(*__exp))}; #if _GLIBCXX_SIMD_X86INTRIN - } else if constexpr (__have_avx512f) { constexpr size_t _Np = simd_size_v<_Tp, _Abi>; @@ -667,8 +664,8 @@ template _Abi::_CommonImpl::_S_blend(_SimdWrapper( __isnonzero), __v, __getmant_avx512(__v))}; -#endif // _GLIBCXX_SIMD_X86INTRIN } +#endif // _GLIBCXX_SIMD_X86INTRIN else { // fallback implementation @@ -749,14 +746,7 @@ template if constexpr (_Np == 1) return std::logb(__x[0]); else if constexpr (__is_fixed_size_abi_v<_Abi>) - { - return {__private_init, - __data(__x)._M_apply_per_chunk([](auto __impl, auto __xx) { - using _V = typename decltype(__impl)::simd_type; - return __data( - std::experimental::logb(_V(__private_init, __xx))); - })}; - } + return {__private_init, _Abi::_SimdImpl::_S_logb(__data(__x))}; #if _GLIBCXX_SIMD_X86INTRIN // {{{ else if constexpr (__have_avx512vl && __is_sse_ps<_Tp, 
_Np>()) return {__private_init, @@ -827,9 +817,7 @@ template enable_if_t, simd<_Tp, _Abi>> modf(const simd<_Tp, _Abi>& __x, simd<_Tp, _Abi>* __iptr) { -if constexpr (__is_scalar_abi<_Abi>() - || (__is_fixed_size_abi_v< - _Abi> && simd_size_v<_Tp, _Abi> == 1)) +if constexpr (simd_size_v<_Tp, _Abi> == 1) { _Tp __tmp; _Tp __r = std::modf(__x[0], &__tmp); @@ -1472,6 +1460,8 @@ template } // }}} +#undef _GLIBCXX_SIMD_CVTING2 +#undef _GLIBCXX_SIMD_CVTING3 #undef _GLIBCXX_SIMD_MATH_CALL_ #undef _GLIBCXX_SIMD_MATH_CALL2_ #undef _GLIBCXX_SIMD_MATH_CALL3_
[PATCH 05/11] libstdc++: Remove incorrect fabs overload
From: Matthias Kretz fabs(int) returns double, this one didn't. This overload is not specified in the Parallelism TS 2. Also remove the comment about labs and llabs: it doesn't belong here. Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd_math.h (fabs): Remove fabs(simd) overload. --- .../include/experimental/bits/simd_math.h| 16 1 file changed, 16 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h index 3ade293fcbf..cff4371619d 100644 --- a/libstdc++-v3/include/experimental/bits/simd_math.h +++ b/libstdc++-v3/include/experimental/bits/simd_math.h @@ -863,22 +863,6 @@ template abs(const simd<_Tp, _Abi>& __x) { return {__private_init, _Abi::_SimdImpl::_S_abs(__data(__x))}; } -template - enable_if_t && is_signed_v<_Tp>, simd<_Tp, _Abi>> - fabs(const simd<_Tp, _Abi>& __x) - { return {__private_init, _Abi::_SimdImpl::_S_abs(__data(__x))}; } - -// the following are overloads for functions in and not covered by -// [parallel.simd.math]. I don't see much value in making them work, though -/* -template simd labs(const simd &__x) -{ return {__private_init, _Abi::_SimdImpl::abs(__data(__x))}; } - -template simd llabs(const simd -&__x) -{ return {__private_init, _Abi::_SimdImpl::abs(__data(__x))}; } -*/ - #define _GLIBCXX_SIMD_CVTING2(_NAME) \ template \ _GLIBCXX_SIMD_INTRINSIC simd<_Tp, _Abi> _NAME( \
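The return-type mismatch named in the commit message can be checked with the scalar functions. This is a small sketch using only the standard library; the alias names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <type_traits>

namespace sketch
{
  // std::fabs converts an integer argument to double,
  // whereas std::abs keeps the integer type.
  using fabs_of_int = decltype(std::fabs(1));
  using abs_of_int = decltype(std::abs(1));

  static_assert(std::is_same_v<fabs_of_int, double>);
  static_assert(std::is_same_v<abs_of_int, int>);
}
```

The removed overload returned `simd<_Tp, _Abi>` for signed integral `_Tp`, which is not the analogue of what scalar `fabs` does.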
[PATCH 04/11] libstdc++: Make use of __builtin_bit_cast
From: Matthias Kretz

The __bit_cast function was a hack to achieve what __builtin_bit_cast can
do, therefore use __builtin_bit_cast if possible. However,
__builtin_bit_cast cannot be used to cast from/to fixed_size_simd, since
it isn't trivially copyable (in the language sense; in principle it is).
Therefore add __proposed::simd_bit_cast to enable the use case required
in the test framework.

Signed-off-by: Matthias Kretz

libstdc++-v3/ChangeLog:

        * include/experimental/bits/simd.h (__bit_cast): Implement via
        __builtin_bit_cast #if available.
        (__proposed::simd_bit_cast): Add overloads for simd and
        simd_mask, which use __builtin_bit_cast (or __bit_cast #if not
        available), which return an object of the requested type with
        the same bits as the argument.
        * include/experimental/bits/simd_math.h: Use simd_bit_cast
        instead of __bit_cast to allow casts to fixed_size_simd.
        * testsuite/experimental/simd/tests/bits/test_values.h: Switch
        from __bit_cast to __proposed::simd_bit_cast since the former
        will not cast fixed_size objects anymore.
---
 libstdc++-v3/include/experimental/bits/simd.h | 40 ++-
 .../include/experimental/bits/simd_math.h     |  8 ++--
 .../simd/tests/bits/test_values.h             |  8 ++--
 3 files changed, 46 insertions(+), 10 deletions(-)

-- ── Dr. 
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 163f1b574e2..5d243f22434 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -1598,7 +1598,9 @@ template _GLIBCXX_SIMD_INTRINSIC constexpr _To __bit_cast(const _From __x) { -// TODO: implement with / replace by __builtin_bit_cast ASAP +#if __has_builtin(__builtin_bit_cast) +return __builtin_bit_cast(_To, __x); +#else static_assert(sizeof(_To) == sizeof(_From)); constexpr bool __to_is_vectorizable = is_arithmetic_v<_To> || is_enum_v<_To>; @@ -1629,6 +1631,7 @@ template reinterpret_cast(&__x), sizeof(_To)); return __r; } +#endif } // }}} @@ -2900,6 +2903,41 @@ template (__x)}; } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + _To + simd_bit_cast(const simd<_Up, _Abi>& __x) + { +using _Tp = typename _To::value_type; +using _ToMember = typename _SimdTraits<_Tp, typename _To::abi_type>::_SimdMember; +using _From = simd<_Up, _Abi>; +using _FromMember = typename _SimdTraits<_Up, _Abi>::_SimdMember; +// with concepts, the following should be constraints +static_assert(sizeof(_To) == sizeof(_From)); +static_assert(is_trivially_copyable_v<_Tp> && is_trivially_copyable_v<_Up>); +static_assert(is_trivially_copyable_v<_ToMember> && is_trivially_copyable_v<_FromMember>); +#if __has_builtin(__builtin_bit_cast) +return {__private_init, __builtin_bit_cast(_ToMember, __data(__x))}; +#else +return {__private_init, __bit_cast<_ToMember>(__data(__x))}; +#endif + } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + _To + simd_bit_cast(const simd_mask<_Up, _Abi>& __x) + { +using _From = simd_mask<_Up, _Abi>; +static_assert(sizeof(_To) == sizeof(_From)); +static_assert(is_trivially_copyable_v<_To> && 
is_trivially_copyable_v<_From>); +#if __has_builtin(__builtin_bit_cast) +return __builtin_bit_cast(_To, __x); +#else +return __bit_cast<_To>(__x); +#endif + } } // namespace __proposed // simd_cast {{{2 diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h index d954e761eee..3ade293fcbf 100644 --- a/libstdc++-v3/include/experimental/bits/simd_math.h +++ b/libstdc++-v3/include/experimental/bits/simd_math.h @@ -700,11 +700,9 @@ template // (inf and NaN are excluded by -ffinite-math-only) const auto __iszero_inf_nan = __x == 0; #else - const auto __as_int - = __bit_cast, _V>>(abs(__x)); - const auto __inf - = __bit_cast, _V>>( - _V(__infinity_v<_Tp>)); + using _Ip = __int_for_sizeof_t<_Tp>; + const auto __as_int = simd_bit_cast>(abs(__x)); + const auto __inf = simd_bit_cast>(_V(__infinity_v<_Tp>)); const auto __iszero_inf_nan = static_simd_cast( __as_int == 0 || __as_int >= __inf); #endif diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_values.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/test_value
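The "builtin if available, fallback otherwise" pattern from this patch can be sketched in isolation. The following is not the libstdc++ `__bit_cast`: the name `sketch::bit_cast` and the macro `SKETCH_HAVE_BUILTIN_BIT_CAST` are hypothetical, and the fallback shown here is a plain `memcpy`, which additionally requires `To` to be default-constructible.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Detect __builtin_bit_cast without assuming that __has_builtin exists.
#if defined(__has_builtin)
#if __has_builtin(__builtin_bit_cast)
#define SKETCH_HAVE_BUILTIN_BIT_CAST 1
#endif
#endif

namespace sketch
{
  // Returns an object of type To with the same object representation as x.
  template <typename To, typename From>
    inline To
    bit_cast(const From& x)
    {
      static_assert(sizeof(To) == sizeof(From));
      static_assert(std::is_trivially_copyable_v<To>
                      && std::is_trivially_copyable_v<From>);
#ifdef SKETCH_HAVE_BUILTIN_BIT_CAST
      return __builtin_bit_cast(To, x);
#else
      To r; // fallback path: copy the bytes
      std::memcpy(&r, &x, sizeof(To));
      return r;
#endif
    }
}
```

Assuming `float` is IEEE 754 binary32, `sketch::bit_cast<std::uint32_t>(1.0f)` yields `0x3f800000`. The trivially-copyable requirement is exactly why the patch needs a separate `simd_bit_cast` for `fixed_size_simd`.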
[PATCH 03/11] libstdc++: Improve fixed_size codegen
From: Matthias Kretz Sometimes fixed_size objects will get unnecessarily copied on the stack. The simd implementation should never pass _SimdTuple by value to avoid requiring the optimizer to see through these copies. Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd_converter.h (_SimdConverter::operator()): Pass _SimdTuple by const-ref. * include/experimental/bits/simd_fixed_size.h (_GLIBCXX_SIMD_FIXED_OP): Pass binary operator _SimdTuple arguments by const-ref. (_S_masked_unary): Pass _SimdTuple by const-ref. --- libstdc++-v3/include/experimental/bits/simd_converter.h | 2 +- libstdc++-v3/include/experimental/bits/simd_fixed_size.h | 5 ++--- 2 files changed, 3 insertions(+), 4 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd_converter.h b/libstdc++-v3/include/experimental/bits/simd_converter.h index 9c8bf382df9..11999df25e4 100644 --- a/libstdc++-v3/include/experimental/bits/simd_converter.h +++ b/libstdc++-v3/include/experimental/bits/simd_converter.h @@ -316,7 +316,7 @@ template _GLIBCXX_SIMD_INTRINSIC constexpr typename _SimdTraits<_To, _Ap>::_SimdMember - operator()(_Arg __x) const noexcept + operator()(const _Arg& __x) const noexcept { if constexpr (_Arg::_S_tuple_size == 1) return __vector_convert<__vector_type_t<_To, _Np>>(__x.first); diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h index b6fb47cdf39..dc2fb90b9b2 100644 --- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h +++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h @@ -1480,7 +1480,7 @@ template #define _GLIBCXX_SIMD_FIXED_OP(name_, op_) \ template\ static inline constexpr _SimdTuple<_Tp, _As...> name_( \ - const _SimdTuple<_Tp, _As...> __x, const _SimdTuple<_Tp, _As...> 
__y) \ + const _SimdTuple<_Tp, _As...>& __x, const _SimdTuple<_Tp, _As...>& __y)\ {\ return __x._M_apply_per_chunk( \ [](auto __impl, auto __xx, auto __yy) constexpr {\ @@ -1780,8 +1780,7 @@ template // _S_masked_unary {{{2 template class _Op, typename _Tp, typename... _As> static inline _SimdTuple<_Tp, _As...> - _S_masked_unary(const _MaskMember __bits, - const _SimdTuple<_Tp, _As...> __v) // TODO: const-ref __v? + _S_masked_unary(const _MaskMember __bits, const _SimdTuple<_Tp, _As...>& __v) { return __v._M_apply_wrapped([&__bits](auto __meta, auto __native) constexpr {
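The difference between the two parameter-passing styles can be made visible with a copy counter. This sketch uses a hypothetical `Tuple` type as a stand-in for `_SimdTuple`; the counter exists only to observe copies, the real type is not instrumented like this, and there the copies are something the optimizer has to see through.

```cpp
#include <cassert>

namespace sketch
{
  // Hypothetical stand-in for _SimdTuple that counts copy constructions.
  struct Tuple
  {
    static inline int copies = 0;
    int data[4] = {};

    Tuple() = default;

    Tuple(const Tuple& o)
    {
      for (int i = 0; i < 4; ++i)
        data[i] = o.data[i];
      ++copies;
    }
  };

  // Old style: the argument is copied into the parameter.
  inline int first_by_value(const Tuple x) { return x.data[0]; }

  // New style: the parameter refers to the caller's object.
  inline int first_by_cref(const Tuple& x) { return x.data[0]; }
}
```

Calling `first_by_value` with an lvalue increments the counter once per call, while `first_by_cref` leaves it unchanged.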
[PATCH 02/11] libstdc++: Remove dead code
From: Matthias Kretz This helper type became unused at some point. Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd_fixed_size.h (_AbisInSimdTuple): Removed. --- .../experimental/bits/simd_fixed_size.h | 49 --- 1 file changed, 49 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h index 7c2c1df77c8..b6fb47cdf39 100644 --- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h +++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h @@ -1025,55 +1025,6 @@ template _Tp, _Remain, _SimdTuple<_Tp, _As..., typename _Next::abi_type>>::type; }; -// }}} -// _AbisInSimdTuple {{{ -template - struct _SeqOp; - -template - struct _SeqOp> - { -using _FirstPlusOne = index_sequence<_I0 + 1, _Is...>; -using _NotFirstPlusOne = index_sequence<_I0, (_Is + 1)...>; -template -using _Prepend = index_sequence<_First, _I0 + _Add, (_Is + _Add)...>; - }; - -template - struct _AbisInSimdTuple; - -template - struct _AbisInSimdTuple<_SimdTuple<_Tp>> - { -using _Counts = index_sequence<0>; -using _Begins = index_sequence<0>; - }; - -template - struct _AbisInSimdTuple<_SimdTuple<_Tp, _Ap>> - { -using _Counts = index_sequence<1>; -using _Begins = index_sequence<0>; - }; - -template - struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A0, _As...>> - { -using _Counts = typename _SeqOp>::_Counts>::_FirstPlusOne; -using _Begins = typename _SeqOp>::_Begins>::_NotFirstPlusOne; - }; - -template - struct _AbisInSimdTuple<_SimdTuple<_Tp, _A0, _A1, _As...>> - { -using _Counts = typename _SeqOp>::_Counts>::template _Prepend<1, 0>; -using _Begins = typename _SeqOp>::_Begins>::template _Prepend<0, 1>; - }; - // }}} // __autocvt_to_simd {{{ template >>
[PATCH 01/11] libstdc++: Improve copysign codegen
From: Matthias Kretz

This also resolves a test failure on aarch64 with -ffast-math and
fixed_size with large N.

Signed-off-by: Matthias Kretz

libstdc++-v3/ChangeLog:

        * include/experimental/bits/simd.h: Add missing operator~
        overload for simd to __float_bitwise_operators.
        * include/experimental/bits/simd_builtin.h
        (_SimdImplBuiltin::_S_complement): Bitcast to int (and back) to
        implement complement for floating-point vectors.
        * include/experimental/bits/simd_fixed_size.h
        (_SimdImplFixedSize::_S_copysign): New function, forwarding to
        copysign implementation of _SimdTuple members.
        * include/experimental/bits/simd_math.h (copysign): Call
        _SimdImpl::_S_copysign for fixed_size arguments. Simplify
        generic copysign implementation using the new ~ operator.
---
 libstdc++-v3/include/experimental/bits/simd.h            | 6 ++
 libstdc++-v3/include/experimental/bits/simd_builtin.h    | 7 ++-
 libstdc++-v3/include/experimental/bits/simd_fixed_size.h | 2 +-
 libstdc++-v3/include/experimental/bits/simd_math.h       | 4 +++-
 4 files changed, 16 insertions(+), 3 deletions(-)

-- ── Dr. 
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 59ddf3cc958..163f1b574e2 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -5189,6 +5189,12 @@ template return {__private_init, _Ap::_SimdImpl::_S_bit_and(__data(__a), __data(__b))}; } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + enable_if_t, simd<_Tp, _Ap>> + operator~(const simd<_Tp, _Ap>& __a) + { return {__private_init, _Ap::_SimdImpl::_S_complement(__data(__a))}; } } // namespace __float_bitwise_operators }}} _GLIBCXX_SIMD_END_NAMESPACE diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h index e986ee91620..8cd338e313f 100644 --- a/libstdc++-v3/include/experimental/bits/simd_builtin.h +++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h @@ -1632,7 +1632,12 @@ template template _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np> _S_complement(_SimdWrapper<_Tp, _Np> __x) noexcept - { return ~__x._M_data; } + { + if constexpr (is_floating_point_v<_Tp>) + return __vector_bitcast<_Tp>(~__vector_bitcast<__int_for_sizeof_t<_Tp>>(__x)); + else + return ~__x._M_data; + } // _S_unary_minus {{{2 template diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h index 2722055c899..7c2c1df77c8 100644 --- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h +++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h @@ -1663,7 +1663,7 @@ template _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, ldexp) _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmod) _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, remainder) -// copysign in simd_math.h +_GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, copysign) 
_GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, nextafter) _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fdim) _GLIBCXX_SIMD_APPLY_ON_TUPLE(_Tp, fmax) diff --git a/libstdc++-v3/include/experimental/bits/simd_math.h b/libstdc++-v3/include/experimental/bits/simd_math.h index 4799803a200..d954e761eee 100644 --- a/libstdc++-v3/include/experimental/bits/simd_math.h +++ b/libstdc++-v3/include/experimental/bits/simd_math.h @@ -1304,6 +1304,8 @@ template { if constexpr (simd_size_v<_Tp, _Abi> == 1) return std::copysign(__x[0], __y[0]); +else if constexpr (__is_fixed_size_abi_v<_Abi>) + return {__private_init, _Abi::_SimdImpl::_S_copysign(__data(__x), __data(__y))}; else if constexpr (is_same_v<_Tp, long double> && sizeof(_Tp) == 12) // Remove this case once __bit_cast is implemented via __builtin_bit_cast. // It is necessary, because __signmask below cannot be computed at compile @@ -1315,7 +1317,7 @@ template using _V = simd<_Tp, _Abi>; using namespace std::experimental::__float_bitwise_operators; _GLIBCXX_SIMD_USE_CONSTEXPR_API auto __signmask = _V(1) ^ _V(-1); - return (__x & (__x ^ __signmask)) | (__y & __signmask); + return (__x & ~__signmask) | (__y & __signmask); } }
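The generic copysign expression in the last hunk can be reproduced on a single `float` by working on its bit pattern. This is a scalar sketch with hypothetical names, assuming IEEE 754 binary32 for `float`; it shows both the old expression `x & (x ^ signmask)` and the new one `x & ~signmask`, which clear the sign bit of `x` in the same way.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

namespace sketch
{
  inline std::uint32_t
  bits(float x)
  {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
  }

  inline float
  from_bits(std::uint32_t u)
  {
    float x;
    std::memcpy(&x, &u, sizeof x);
    return x;
  }

  // Same construction as in the patch: 1 and -1 differ only in the sign bit.
  inline std::uint32_t
  signmask()
  { return bits(1.0f) ^ bits(-1.0f); }

  // New form: magnitude of x combined with the sign of y.
  inline float
  copysign_new(float x, float y)
  {
    const std::uint32_t m = signmask();
    return from_bits((bits(x) & ~m) | (bits(y) & m));
  }

  // Old form: x & (x ^ m) also clears the sign bit, with one more dependency
  // on x.
  inline float
  copysign_old(float x, float y)
  {
    const std::uint32_t m = signmask();
    const std::uint32_t xb = bits(x);
    return from_bits((xb & (xb ^ m)) | (bits(y) & m));
  }
}
```

For a bit where the mask is 0, `x & (x ^ 0)` is `x`; where the mask is 1, `x & ~x` is 0. Both forms therefore give the same result, and the new one only needs the complement operator that this patch adds for floating-point `simd`.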
[PATCH 00/11] stdx::simd optimizations, corrections, and cleanups
The following patches mostly contain code cleanups and minor corrections. The
major feature in this patchset is the last patch, which should make the use of
stdx::simd much safer wrt. ODR violations involuntarily introduced by linking
TUs that were compiled with different -m and floating-point flags.

Matthias Kretz (11):
  libstdc++: Improve copysign codegen
  libstdc++: Remove dead code
  libstdc++: Improve fixed_size codegen
  libstdc++: Make use of __builtin_bit_cast
  libstdc++: Remove incorrect fabs overload
  libstdc++: Minor simd_math cleanups
  libstdc++: Fix condition when AVX512F ldexp implementation is used
  libstdc++: Avoid raising fp exceptions in trunc, floor, and ceil
  libstdc++: Ensure unrolled loops inline the lambda
  libstdc++: Fix internal names: add missing underscores
  libstdc++: Fix ODR issues with different -m flags

 libstdc++-v3/include/experimental/bits/simd.h | 438 --
 .../include/experimental/bits/simd_builtin.h  |  48 +-
 .../experimental/bits/simd_converter.h        |   2 +-
 .../include/experimental/bits/simd_detail.h   |  40 ++
 .../experimental/bits/simd_fixed_size.h       |  95 ++--
 .../include/experimental/bits/simd_math.h     | 107 ++---
 .../include/experimental/bits/simd_neon.h     |   4 +-
 .../include/experimental/bits/simd_ppc.h      |   4 +-
 .../include/experimental/bits/simd_scalar.h   |  71 ++-
 .../include/experimental/bits/simd_x86.h      |  33 +-
 .../simd/tests/bits/test_values.h             |   8 +-
 11 files changed, 540 insertions(+), 310 deletions(-)

-- 
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
[PATCH 3/3] libstdc++: Document simd testsuite
libstdc++-v3/ChangeLog:

        * testsuite/experimental/simd/README.md: New file.

Signed-off-by: Matthias Kretz
---
 .../testsuite/experimental/simd/README.md | 257 ++
 1 file changed, 257 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/experimental/simd/README.md

-- 
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
diff --git a/libstdc++-v3/testsuite/experimental/simd/README.md b/libstdc++-v3/testsuite/experimental/simd/README.md
new file mode 100644
index 000..db0d71f8d43
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/README.md
@@ -0,0 +1,257 @@
+# SIMD Tests
+
+To execute the simd testsuite, call `make check-simd`, typically with a `-j N`
+argument.
+
+For more control over verbosity, compiler flags, and use of a simulator, use
+the environment variables documented below.
+
+## Environment variables
+
+### `target_list`
+
+Similar to dejagnu target lists: E.g.
+`target_list="unix{-march=sandybridge,-march=native/-ffast-math,-march=native/-ffinite-math-only}"`
+would create three subdirs in `testsuite/simd/` to run the complete simd
+testsuite first with `-march=sandybridge`, then with `-march=native
+-ffast-math`, and finally with `-march=native -ffinite-math-only`.
+
+
+### `CHECK_SIMD_CONFIG`
+
+This variable can be set to a path to a file which is equivalent to a dejagnu
+board. The file needs to be a valid `sh` script since it is sourced from the
+`scripts/check_simd` script. Its purpose is to set the `target_list` variable
+depending on `$target_triplet` (or whatever else makes sense for you).
+Example:
+
+```sh
+case "$target_triplet" in
+x86_64-*)
+  target_list="unix{-march=sandybridge,-march=skylake-avx512,-march=native/-ffast-math,-march=athlon64,-march=core2,-march=nehalem,-march=skylake,-march=native/-ffinite-math-only,-march=knl}"
+  ;;
+
+powerpc64le-*)
+  define_target power7 "-mcpu=power7 -static" "$HOME/bin/run_on_gccfarm gcc112"
+  define_target power8 "-mcpu=power8 -static" "$HOME/bin/run_on_gccfarm gcc112"
+  define_target power9 "-mcpu=power9 -static" "$HOME/bin/run_on_gccfarm gcc135"
+  target_list="power7 power8 power9{,-ffast-math}"
+  ;;
+
+powerpc64-*)
+  define_target power7 "-mcpu=power7 -static" "$HOME/bin/run_on_gccfarm gcc110"
+  define_target power8 "-mcpu=power8 -static" "$HOME/bin/run_on_gccfarm gcc110"
+  target_list="power7 power8{,-ffast-math}"
+  ;;
+esac
+```
+
+The `unix` target is pre-defined to have no initial flags and no simulator. Use
+the `define_target(name, flags, sim)` function to define your own targets for
+the `target_list` variable. In the example above `define_target power7
+"-mcpu=power7 -static" "$HOME/bin/run_on_gccfarm gcc112"` defines the target
+`power7` which always uses the flags `-mcpu=power7` and `-static` when
+compiling tests and prepends `$HOME/bin/run_on_gccfarm gcc112` to test
+executables. In `target_list` you can now use the name `power7`. E.g.
+`target_list="power7 power7/-ffast-math"` or its shorthand
+`target_list="power7{,-ffast-math}"`.
+
+
+### `DRIVEROPTS`
+
+This variable affects the `Makefile`s generated per target (as defined above).
+It's a string of flags that are prepended to the `driver.sh` invocation which
+builds and runs the tests. You `cd` into a simd test subdir and use `make help`
+to see possible options and a list of all valid targets.
+
+```
+use DRIVEROPTS= to pass the following options:
+-q, --quiet         Disable same-line progress output (default if stdout is
+                    not a tty).
+-p, --percentage    Add percentage to default same-line progress output.
+-v, --verbose       Print one line per test and minimal extra information on
+                    failure.
+-vv                 Print all compiler and test output.
+-k, --keep-failed   Keep executables of failed tests.
+--sim               Path to an executable that is prepended to the test
+                    execution binary (default: the value of
+                    GCC_TEST_SIMULATOR).
+--timeout-factor
+                    Multiply the default timeout with x.
+-x, --run-expensive Compile and run tests marked as expensive (default:
+                    true if GCC_TEST_RUN_EXPENSIVE is set, false otherwise).
+-o , --only
+                    Compile and run only tests matching the given pattern.
+```
+
+
+### `TESTFLAGS`
+
+This variable also affects the `Makefile`s generated per target. It's a list of
+compiler flags that are appended to `CXXFLAGS`.
+
[PATCH 2/3] libstdc++: Improve output verbosity options and default
For most uses --quiet was too quiet while the default was too noisy. Now
the default output, if stdout is a tty, shows the last successful test on
the same line. With --percentage it adds a percentage at the start of the
line. --percentage is not the default because it requires more resources
and might not be 100% compatible with all environments. If stdout is not a
tty the default is quiet output like for dejagnu.

Additionally, argument parsing now recognizes contracted short options,
which is easier to use with e.g. DRIVEROPTS=-pxk.

libstdc++-v3/ChangeLog:

	* testsuite/experimental/simd/driver.sh: Rewrite output
	verbosity logic. Add -p/--percentage option. Allow -v/--verbose
	to be used twice. Add -x and -o short options. Parse long
	options with = instead of separating space generically. Parse
	contracted short options. Make unrecognized options an error.
	If same-line output is active, trap on EXIT to increment the
	progress (only with --percentage), erase the line and print the
	current status.
	* testsuite/experimental/simd/generate_makefile.sh: Initialize
	helper files for progress account keeping. Update help target
	for changes to DRIVEROPTS.

Signed-off-by: Matthias Kretz
---
 .../testsuite/experimental/simd/driver.sh          | 137 +-
 .../experimental/simd/generate_makefile.sh         |  33 +++--
 2 files changed, 121 insertions(+), 49 deletions(-)

-- ── Dr.
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh index f2d31c70bd0..5ae9905e3a3 100755 --- a/libstdc++-v3/testsuite/experimental/simd/driver.sh +++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh @@ -5,8 +5,22 @@ abi=0 name= srcdir="$(cd "${0%/*}" && pwd)/tests" sim="$GCC_TEST_SIMULATOR" -quiet=false -verbose=false + +# output_mode values: +# print only failures with minimal context +readonly really_quiet=0 +# as above plus same-line output of last successful test +readonly same_line=1 +# as above plus percentage +readonly percentage=2 +# print one line per finished test with minimal context on failure +readonly verbose=3 +# print one line per finished test with full output of the compiler and test +readonly really_verbose=4 + +output_mode=$really_quiet +[ -t 1 ] && output_mode=$same_line + timeout=180 run_expensive=false if [ -n "$GCC_TEST_RUN_EXPENSIVE" ]; then @@ -21,8 +35,12 @@ Usage: $0 [Options] Options: -h, --help Print this message and exit. - -q, --quiet Only print failures. - -v, --verbose Print compiler and test output on failure. + -q, --quiet Disable same-line progress output (default if stdout is + not a tty). + -p, --percentageAdd percentage to default same-line progress output. + -v, --verbose Print one line per test and minimal extra information on + failure. + -vv Print all compiler and test output. -t , --type The value_type to test (default: $type). -a [0-9], --abi [0-9] @@ -36,9 +54,10 @@ Options: GCC_TEST_SIMULATOR). --timeout-factor Multiply the default timeout with x. - --run-expensive Compile and run tests marked as expensive (default: + -x, --run-expensive Compile and run tests marked as expensive (default: true if GCC_TEST_RUN_EXPENSIVE is set, false otherwise). 
- --only Compile and run only tests matching the given pattern. + -o , --only + Compile and run only tests matching the given pattern. EOF } @@ -49,71 +68,74 @@ while [ $# -gt 0 ]; do exit ;; -q|--quiet) -quiet=true +output_mode=$really_quiet +;; + -p|--percentage) +output_mode=$percentage ;; -v|--verbose) -verbose=true +if [ $output_mode -lt $verbose ]; then + output_mode=$verbose +else + output_mode=$really_verbose +fi ;; - --run-expensive) + -x|--run-expensive) run_expensive=true ;; -k|--keep-failed) keep_failed=true ;; - --only) + -o|--only) only="$2" shift ;; - --only=*) -only="${1#--only=}" -;; -t|--type) type="$2" shift ;; - --type=*) -type="${1#--type=}" -;; -a|--abi) abi="$2" shift ;; - --abi=*) -abi="${1#--abi=}" -;; -n|--name) name="$2" shift ;; - --name
[PATCH 1/3] libstdc++: Remove -fno-tree-vrp after PR98834 was resolved
libstdc++-v3/ChangeLog:

	* testsuite/Makefile.am (check-simd): Remove -fno-tree-vrp flag and
	associated warning.
	* testsuite/Makefile.in: Regenerate.

Signed-off-by: Matthias Kretz
---
 libstdc++-v3/testsuite/Makefile.am | 3 +--
 libstdc++-v3/testsuite/Makefile.in | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

-- 
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──

diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index ba5023a8b54..d2011f03c64 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -191,10 +191,9 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
 	    ${glibcxx_srcdir}/scripts/check_simd \
 	    testsuite_files_simd \
 	    ${glibcxx_builddir}/scripts/testsuite_flags
-	@echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834."
 	@rm -f .simd.summary
 	@echo "Generating simd testsuite subdirs and Makefiles ..."
-	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
+	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
 	  while read subdir; do \
 	    $(MAKE) -C "$${subdir}"; \
 	    tail -n20 $${subdir}/simd_testsuite.sum | \
diff --git a/libstdc++-v3/testsuite/Makefile.in b/libstdc++-v3/testsuite/Makefile.in
index c9dd7f5da61..c65cdaf2015 100644
--- a/libstdc++-v3/testsuite/Makefile.in
+++ b/libstdc++-v3/testsuite/Makefile.in
@@ -716,10 +716,9 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
 	    ${glibcxx_srcdir}/scripts/check_simd \
 	    testsuite_files_simd \
 	    ${glibcxx_builddir}/scripts/testsuite_flags
-	@echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834."
 	@rm -f .simd.summary
 	@echo "Generating simd testsuite subdirs and Makefiles ..."
-	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
+	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
 	  while read subdir; do \
 	    $(MAKE) -C "$${subdir}"; \
 	    tail -n20 $${subdir}/simd_testsuite.sum | \
[PATCH 0/3] Improve and document stdx::simd testsuite
As discussed a long time ago on IRC, this improves (i.e. decreases by
default) the verbosity of make check-simd, gives more verbosity options,
and finally documents how the simd testsuite is used and how it works. In
addition, now that PR98834 is resolved, it removes the -fno-tree-vrp
workaround.

Tested on x86_64-linux (and more).

Matthias Kretz (3):
  libstdc++: Remove -fno-tree-vrp after PR98834 was resolved
  libstdc++: Improve output verbosity options and default
  libstdc++: Document simd testsuite

 libstdc++-v3/testsuite/Makefile.am                 |   3 +-
 libstdc++-v3/testsuite/Makefile.in                 |   3 +-
 .../testsuite/experimental/simd/README.md          | 257 ++
 .../testsuite/experimental/simd/driver.sh          | 137 +++---
 .../experimental/simd/generate_makefile.sh         |  33 ++-
 5 files changed, 380 insertions(+), 53 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/experimental/simd/README.md

-- 
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
Re: [PATCH] Add gnu::diagnose_as attribute
On Tuesday, 1 June 2021 21:12:18 CEST Jason Merrill wrote: > On 5/28/21 3:42 AM, Matthias Kretz wrote: > > On Friday, 28 May 2021 05:05:52 CEST Jason Merrill wrote: > >> I'd think you could get the same effect from a hypothetical > >> > >> namespace [[gnu::diagnose_as]] stdx = std::experimental; > >> > >> though we'll need to add support for attributes on namespace aliases to > >> the grammar. > > > > Right, but then two of my design goals can't be met: > > > > 1. Diagnostics have an improved signal-to-noise ratio out of the box. > > > > 2. We can use replacement names that are not valid identifiers. > > This is the basic disconnect: I think that these goals are > contradictory, and that replacement names that are not valid identifiers > will just confuse users that don't know about them. With signal-to-noise ratio I meant the ratio (averaged over all GCC users - so yes, we can't give actual numbers for these): #characters one needs to read to understand / #total diagnostic characters. Or more specifically 1 - #characters that are distracting from understanding the issue / #total diagnostic characters. Consider that for the stdx::simd case I regularly hit the problem that vim's QuickFix truncates at 4095 characters and the message basically just got started (i.e. it's sometimes impossible to use vim's QuickFix to understand errors involving stdx::simd). There's *a lot* of noise that must be removed *per default*. WRT "invalid identifiers", there are two types: (1) string of characters that is not a valid C++ identifier (2) valid C++ identifier, but not defined for the given TU (2) can be confusing, I agree, but doesn't have to be. (1) provides a stronger hint that something is either abbreviated or intentionally hidden from the user. If I write `std::experimental::simd` in my code and get a diagnostic that says 'stdₓ::simd' then it's relatively easy to make the connection what happened here: 'stdₓ' clearly must mean something else than a literal 'stdₓ' in my code. 
The user knows there's no `std::simd' so it must be `std::experimental::simd`. (Note that once std::experimental::simd goes into the IS, I would be the first to propose a change for 'stdₓ::simd' back to 'std::experimental::simd'.) > If a user sees stdx::foo in a diagnostic and then tries to refer to > stdx::foo and gets an error, the diagnostic is not more helpful than one > that uses the fully qualified name. Hmm, if GCC prints an actual suggestion like "write 'stdₓ::foo' here" then yes, I agree. That should not make use of diagnose_as. > Jonathan, David, any thoughts on this issue? > > > I can imagine using it to make _Internal __names more readable while at > > the > > same time discouraging users to utter them in their own code. Sorry for > > the > > bad code obfuscation example above. > > > > An example for consideration from stdx::simd: > >namespace std { > >namespace experimental { > >namespace parallelism_v2 [[gnu::diagnose_as("stdx")]] { > >namespace simd_abi [[gnu::diagnose_as("simd_abi")]] { > > > > template > > > >struct _VecBuiltin; > > > > template > > > >struct _VecBltnBtmsk; > > > >#if x86 > > > > using __ignore_me_0 [[gnu::diagnose_as("[SSE]")]] = _VecBuiltin<16>; > > using __ignore_me_1 [[gnu::diagnose_as("[AVX]")]] = _VecBuiltin<32>; > > using __ignore_me_2 [[gnu::diagnose_as("[AVX512]")]] = > > _VecBltnBtmsk<64>; > > > >#endif > > > > > > Then diagnostics would print 'stdx::simd' > > instead of 'stdx::simd>'. (Users utter > > the type by saying e.g. 'stdx::native_simd', while compiling with > > AVX512 flags.) > > Wouldn't it be better to print stdx::native_simd if that's how > the users write the type? No. For example, I might expect that native_simd maps to AVX-512 vectors but forgot the relevant -m flag(s). If the diagnostics show 'simd' I have a good chance of catching that issue. And the other way around: If I wrote `stdx::simd` and it happens to be the same type as the native_simd typedef, it would show the latter in diagnostics. 
Similar issue with asking for a simd ABI with `simd_abi::deduce_t`: I typically don't want to know whether that's also native_simd but rather what exact simd_abi I got. And no, as a user I don't typically care about the libstdc++ implementation details but what those details mean. -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
Re: [PATCH] Add gnu::diagnose_as attribute
On Friday, 28 May 2021 05:05:52 CEST Jason Merrill wrote:
> On 5/27/21 6:07 PM, Matthias Kretz wrote:
> > On Thursday, 27 May 2021 23:15:46 CEST Jason Merrill wrote:
> >> On 5/27/21 2:54 PM, Matthias Kretz wrote:
> >>> namespace Vir {
> >>>   inline namespace foo {
> >>>     struct A {};
> >>>   }
> >>>   struct A {};
> >>> }
> >>> using Vir::A;
> >>>
> >>> :7:12: error: reference to 'A' is ambiguous
> >>> :3:12: note: candidates are: 'struct Vir::A'
> >>> :5:10: note:                 'struct Vir::A'
> >>
> >> That doesn't seem so bad.
> >
> > As long as you ignore the line numbers, it's a puzzling diagnostic.
>
> Only briefly puzzling, I think; Vir::A is a valid way of referring to
> both types.

True. But that's also what led to the error. GCC easily clears it up
nowadays, but wouldn't anymore if inline namespaces were hidden by default.

> I'd think you could get the same effect from a hypothetical
>
>   namespace [[gnu::diagnose_as]] stdx = std::experimental;
>
> though we'll need to add support for attributes on namespace aliases to
> the grammar.

Right, but then two of my design goals can't be met:

1. Diagnostics have an improved signal-to-noise ratio out of the box.

2. We can use replacement names that are not valid identifiers.

I don't think libstdc++ would ship with a namespace alias outside of the
std namespace. So we'd place the "burden" of using diagnose_as correctly on
our users. Also, as a user you'd possibly have to repeat the namespace alias
in every source file and/or place it in your application's/library's
namespace.

> >> Here it seems like you want to say "use this typedef as the true name of
> >> the type". Is it useful to have to repeat the name? Allowing people to
> >> use names that don't correspond to actual declarations seems unnecessary.
> >
> > Yes, but you could also use it to apply diagnose_as to a template
> > instantiation without introducing a name for users. E.g.
> >
> >   using __only_to_apply_the_attribute [[gnu::diagnose_as("intvector")]]
> >     = std::vector;
> >
> > Now all diagnostics of 'std::vector' would print 'intvector' instead.
>
> Yes, but why would you want to? Making diagnostics print names that the
> user can't use in their own code seems obfuscatory, and requiring users
> to write the same names in two places seems like extra work.

I can imagine using it to make _Internal __names more readable while at the
same time discouraging users from uttering them in their own code. Sorry for
the bad code obfuscation example above.

An example for consideration from stdx::simd:

  namespace std {
  namespace experimental {
  namespace parallelism_v2 [[gnu::diagnose_as("stdx")]] {
  namespace simd_abi [[gnu::diagnose_as("simd_abi")]] {
    template
      struct _VecBuiltin;

    template
      struct _VecBltnBtmsk;

  #if x86
    using __ignore_me_0 [[gnu::diagnose_as("[SSE]")]] = _VecBuiltin<16>;
    using __ignore_me_1 [[gnu::diagnose_as("[AVX]")]] = _VecBuiltin<32>;
    using __ignore_me_2 [[gnu::diagnose_as("[AVX512]")]] = _VecBltnBtmsk<64>;
  #endif

Then diagnostics would print 'stdx::simd' instead of 'stdx::simd>'. (Users
utter the type by saying e.g. 'stdx::native_simd', while compiling with
AVX512 flags.)

> > But in general, I tend to agree, for type aliases there's rarely a case
> > where the names wouldn't match.
> >
> > However, I didn't want to special-case the attribute parameters for type
> > aliases (or introduce another attribute just for this case). The attribute
> > works consistently and with the same interface independent of where it's
> > used. I tried to build a generic, broad feature instead of a narrow
> > one-problem solution.
>
> "Treat this declaration as the name of the type/namespace it refers to
> in diagnostics" also seems consistent to me.

Sure. In general, I think

  namespace foo [[gnu::this_is_the_name_I_want]] = bar;
  using foo [[gnu::this_is_the_name_I_want]] = bar;

is not a terribly bad idea on its own. But it's not the solution for the
problems I set out to solve.

> Still, perhaps it would be better to store these aliases in a separate hash
> table instead of *_ATTRIBUTES.

Maybe. For performance reasons or for simplification of the implementation?
What entity could I use for hashing? The identifier alone wouldn't suffice
since different instantiations of the same class template can have different
diagnose_as values (e.g. std::string, std::wstring, ...).

-- 
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
Re: [PATCH] Add gnu::diagnose_as attribute
On Thursday, 27 May 2021 23:15:46 CEST Jason Merrill wrote:
> On 5/27/21 2:54 PM, Matthias Kretz wrote:
> > Also hiding all inline namespace by default might make some error messages
> > harder to understand:
> >
> > namespace Vir {
> >   inline namespace foo {
> >     struct A {};
> >   }
> >   struct A {};
> > }
> > using Vir::A;
> >
> > :7:12: error: reference to 'A' is ambiguous
> > :3:12: note: candidates are: 'struct Vir::A'
> > :5:10: note:                 'struct Vir::A'
>
> That doesn't seem so bad.

As long as you ignore the line numbers, it's a puzzling diagnostic.

> > This is from my pending std::string patch:
> >
> > --- a/libstdc++-v3/include/bits/c++config
> > +++ b/libstdc++-v3/include/bits/c++config
> > @@ -299,7 +299,8 @@ namespace std
> >
> >  #if _GLIBCXX_USE_CXX11_ABI
> >  namespace std
> >  {
> >
> > -  inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { }
> > +  inline namespace __cxx11
> > +    __attribute__((__abi_tag__ ("cxx11"), __diagnose_as__("std"))) { }
>
> This seems to have the same benefits and drawbacks of my inline
> namespace suggestion.

True for std::string, not true for TS's where the extra '::experimental'
still makes finding the relevant information in diagnostics harder than
necessary.

> And it seems like applying the attribute to a
> namespace means that enclosing namespaces are not printed, unlike the
> handling for types.

Yes, that's also how I documented it. For nested namespaces I wanted to
enable the removal of nesting (e.g. from
std::experimental::parallelism_v2::simd to stdx::simd). However, for types
and functions it would be a problem to drop the enclosing scope, because the
scope can be class templates and thus the diagnose_as attribute would remove
all template parms & args.

> > -  typedef basic_string<char> string;
> > +  typedef basic_string<char> string
> > [[__gnu__::__diagnose_as__("string")]];
>
> Here it seems like you want to say "use this typedef as the true name of
> the type". Is it useful to have to repeat the name? Allowing people to
> use names that don't correspond to actual declarations seems unnecessary.

Yes, but you could also use it to apply diagnose_as to a template
instantiation without introducing a name for users. E.g.

  using __only_to_apply_the_attribute [[gnu::diagnose_as("intvector")]]
    = std::vector;

Now all diagnostics of 'std::vector' would print 'intvector' instead.

But in general, I tend to agree, for type aliases there's rarely a case
where the names wouldn't match.

However, I didn't want to special-case the attribute parameters for type
aliases (or introduce another attribute just for this case). The attribute
works consistently and with the same interface independent of where it's
used. I tried to build a generic, broad feature instead of a narrow
one-problem solution.

FWIW, before you suggest to have one attribute for namespaces and one for
type aliases (to cover the std::string case), I have another use case in
stdx::simd (the spec requires simd_abi::scalar to be an alias):

  namespace std::experimental::parallelism_v2::simd_abi
  {
    struct [[gnu::diagnose_as("scalar")]] _Scalar;
    using scalar = _Scalar;
  }

If the attribute were on the type alias (using scalar [[gnu::diagnose_as]] =
_Scalar;), then we'd have to apply the attribute to _Scalar after it was
completed. That seemed like a bad idea to me.

-- 
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──
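
For readers less familiar with inline namespaces, here is a minimal sketch of why the shorter spelling is accepted in the first place and where it stops being unique. The names `Vir`, `foo`, and `A` come from the example quoted in this message; the second namespace `Vir2` and the helper functions `same_type()` and `inner_still_nameable()` are assumptions added only for illustration.

```cpp
#include <type_traits>

namespace Vir {
inline namespace foo {
struct A {};
}  // namespace foo
}  // namespace Vir

// Members of an inline namespace are also members of the enclosing
// namespace, so both spellings name the same type.
constexpr bool same_type() {
  return std::is_same<Vir::A, Vir::foo::A>::value;
}
static_assert(same_type(), "Vir::A and Vir::foo::A are one type");

// Hypothetical variant mirroring the quoted example: two distinct types
// called A. Vir2::foo::A stays nameable, but the qualified name Vir2::A
// would be ambiguous because lookup finds both declarations.
namespace Vir2 {
inline namespace foo {
struct A {};
}  // namespace foo
struct A {};
}  // namespace Vir2

constexpr bool inner_still_nameable() {
  return std::is_class<Vir2::foo::A>::value;
}
```

If `foo::` were hidden in diagnostics, both candidates in `Vir2` would be printed with the same name, which is the situation the quoted error message illustrates.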
Re: [PATCH] Add gnu::diagnose_as attribute
On Thursday, 27 May 2021 19:39:48 CEST Jason Merrill wrote: > On 5/4/21 7:13 AM, Matthias Kretz wrote: > > From: Matthias Kretz > > > > This attribute overrides the diagnostics output string for the entity it > > appertains to. The motivation is to improve QoI for library TS > > implementations, where diagnostics have a very bad signal-to-noise ratio > > due to the long namespaces involved. > > > > On Tuesday, 27 April 2021 11:46:48 CEST Jonathan Wakely wrote: > >> I think it's a great idea and would like to use it for all the TS > >> implementations where there is some inline namespace that the user > >> doesn't care about. std::experimental::fundamentals_v1:: would be much > >> better as just std::experimental::, or something like std::[LFTS]::. > > Hmm, how much of the benefit could we get from a flag (probably on by > default) to skip inline namespaces in diagnostics? I'd say about 20% for the TS's. Even std::experimental::simd (i.e. without the '::parallelism_v2' part) is still rather noisy. I want stdₓ::simd, std-x::simd or std::[PTS2]::simd or whatever shortest shorthand Jonathan will allow. 
;) For PR89370, the benefit would be ~2%: 'template std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_If_sv<_Tp, std::__cxx11::basic_string<_CharT, _Traits, _Alloc>&> std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::insert(std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Tp&, std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::size_type, std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::size_type) [with _Tp = _Tp; _CharT = char; _Traits = std::char_traits; _Alloc = std::allocator]' would then only turn into: 'template std::basic_string<_CharT, _Traits, _Alloc>::_If_sv<_Tp, std::basic_string<_CharT, _Traits, _Alloc>&> std::basic_string<_CharT, _Traits, _Alloc>::insert(std::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Tp&, std::basic_string<_CharT, _Traits, _Alloc>::size_type, std::basic_string<_CharT, _Traits, _Alloc>::size_type) [with _Tp = _Tp; _CharT = char; _Traits = std::char_traits; _Alloc = std::allocator]' instead of: 'template std::string::_If_sv<_Tp, std::string&> std::string::insert<_Tp>(std::string::size_type, const _Tp&, std::string::size_type, std::string::size_type)' Also hiding all inline namespace by default might make some error messages harder to understand: namespace Vir { inline namespace foo { struct A {}; } struct A {}; } using Vir::A; :7:12: error: reference to 'A' is ambiguous :3:12: note: candidates are: 'struct Vir::A' :5:10: note: 'struct Vir::A' > > With the attribute, it is possible to solve PR89370 and make > > std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as > > std::string in diagnostic output without extra hacks to recognize the > > type. > > That sounds wrong to me; std::string is the instantiation, not > the template. Your patch doesn't make it possible to apply this > attribute to class template instantiations, does it? Yes, it does. Initially, when I tried to improve the TS experience, it didn't. 
When Jonathan showed PR89370 to me I tried to make [[gnu::diagnose_as]] more generic & useful. Since there's no obvious syntax to apply an attribute to a template instantiation, I had to be creative. This is from my pending std::string patch: --- a/libstdc++-v3/include/bits/c++config +++ b/libstdc++-v3/include/bits/c++config @@ -299,7 +299,8 @@ namespace std #if _GLIBCXX_USE_CXX11_ABI namespace std { - inline namespace __cxx11 __attribute__((__abi_tag__ ("cxx11"))) { } + inline namespace __cxx11 +__attribute__((__abi_tag__ ("cxx11"), __diagnose_as__("std"))) { } } namespace __gnu_cxx { --- a/libstdc++-v3/include/bits/stringfwd.h +++ b/libstdc++-v3/include/bits/stringfwd.h @@ -76,24 +76,24 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 _GLIBCXX_END_NAMESPACE_CXX11 /// A string of @c char - typedef basic_stringstring; + typedef basic_string string [[__gnu__::__diagnose_as__("string")]]; #ifdef _GLIBCXX_USE_WCHAR_T /// A string of @c wchar_t - typedef basic_string wstring; + typedef basic_string wstring [[__gnu__::__diagnose_as__("wstring")]]; #endif [...] The part of my frontend patch that makes this work is in handle_diagnose_as_attribute: + if (TREE_CODE (*node) == TYPE_DECL) +{ + // Apply the attribute to the type alias itself. + decl = *node; + tree type = TREE_TYPE (*node); + if (CLASS_TYPE_P (type) && CLASSTYPE_TEMPLATE_INSTANTIATION (type)) + { + if (COMPLETE_OR_OPEN_TYPE_P (type)) + warning (OPT_Wattributes, +"igno
Re: [PATCH] c++: Add missing scope in typedef diagnostic [PR100763]
On Thursday, 27 May 2021 17:18:58 CEST Jason Merrill wrote:
> On 5/26/21 5:27 PM, Matthias Kretz wrote:
> > From: Matthias Kretz
> >
> > dump_type on 'const std::string' should not print 'const string' unless
> > TFF_UNQUALIFIED_NAME is requested.
> >
> > gcc/cp/ChangeLog:
> > 	PR c++/100763
> > 	* error.c: Call dump_scope when printing a typedef.
> >
> > +	  if (! (flags & TFF_UNQUALIFIED_NAME))
> > +	    dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags);
>
> You can use "decl" instead of "TYPE_NAME (t)" here.
>
> OK with that change.

Updated patch below.

From: Matthias Kretz

dump_type on 'const std::string' should not print 'const string' unless
TFF_UNQUALIFIED_NAME is requested.

gcc/cp/ChangeLog:

	PR c++/100763
	* error.c: Call dump_scope when printing a typedef.
---
 gcc/cp/error.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 3d5eebd4bcd..ae78b10c7b2 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -501,6 +501,8 @@ dump_type (cxx_pretty_printer *pp, tree t, int flags)
       else
 	{
 	  pp_cxx_cv_qualifier_seq (pp, t);
+	  if (! (flags & TFF_UNQUALIFIED_NAME))
+	    dump_scope (pp, CP_DECL_CONTEXT (decl), flags);
 	  pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t));
 	  return;
 	}
Re: [PATCH] c++: Output less irrelevant info for function template decl [PR100716]
On Thursday, 27 May 2021 17:07:40 CEST Jason Merrill wrote: > On 5/26/21 5:29 PM, Matthias Kretz wrote: > > New revision which can also be compiled with GCC 4.8. > > > > From: Matthias Kretz > > > > Ensure dump_template_decl for function templates never prints template > > parameters after the function name (it did with -fno-pretty-templates) > > and skip output of irrelevant & confusing "[with T = T]" in > > dump_substitution. > > > > gcc/cp/ChangeLog: > > PR c++/100716 > > * error.c (dump_template_bindings): Include code to print > > "[with" and ']', conditional on whether anything is printed at > > all. This is tied to whether a semicolon is needed to separate > > multiple template parameters. If the template argument repeats > > the template parameter (T = T), then skip the parameter. > > This description should really be in a comment in the code, rather than > the ChangeLog. OK either way. Added comments in the code. New patch below. From: Matthias Kretz Ensure dump_template_decl for function templates never prints template parameters after the function name (it did with -fno-pretty-templates) and skip output of irrelevant & confusing "[with T = T]" in dump_substitution. gcc/cp/ChangeLog: PR c++/100716 * error.c (dump_template_bindings): Include code to print "[with" and ']', conditional on whether anything is printed at all. This is tied to whether a semicolon is needed to separate multiple template parameters. If the template argument repeats the template parameter (T = T), then skip the parameter. (dump_substitution): Moved code to print "[with" and ']' to dump_template_bindings. (dump_function_decl): Partial revert of PR50828, which masked TFF_TEMPLATE_NAME for all of dump_function_decl. Now TFF_TEMPLATE_NAME is masked for the scope of the function and only carries through to dump_function_name. (dump_function_name): Avoid calling dump_template_parms if TFF_TEMPLATE_NAME is set. 
gcc/testsuite/ChangeLog: PR c++/100716 * g++.dg/diagnostic/pr100716.C: New test. * g++.dg/diagnostic/pr100716-1.C: Same test with -fno-pretty-templates. --- gcc/cp/error.c | 63 +++- gcc/testsuite/g++.dg/diagnostic/pr100716-1.C | 54 + gcc/testsuite/g++.dg/diagnostic/pr100716.C | 54 + 3 files changed, 156 insertions(+), 15 deletions(-) create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716.C -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/gcc/cp/error.c b/gcc/cp/error.c index a2f19d1a5c1..ade9b17e663 100644 --- a/gcc/cp/error.c +++ b/gcc/cp/error.c @@ -371,7 +371,35 @@ static void dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, vec *typenames) { - bool need_semicolon = false; + /* Print "[with" and ']', conditional on whether anything is printed at all. + This is tied to whether a semicolon is needed to separate multiple template + parameters. 
*/ + struct prepost_semicolon + { +cxx_pretty_printer *pp; +bool need_semicolon; + +void operator()() +{ + if (need_semicolon) + pp_separate_with_semicolon (pp); + else + { + pp_cxx_whitespace (pp); + pp_cxx_left_bracket (pp); + pp->translate_string ("with"); + pp_cxx_whitespace (pp); + need_semicolon = true; + } +} + +~prepost_semicolon() +{ + if (need_semicolon) + pp_cxx_right_bracket (pp); +} + } semicolon_or_introducer = {pp, false}; + int i; tree t; @@ -395,10 +423,20 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, if (lvl_args && NUM_TMPL_ARGS (lvl_args) > arg_idx) arg = TREE_VEC_ELT (lvl_args, arg_idx); - if (need_semicolon) - pp_separate_with_semicolon (pp); - dump_template_parameter (pp, TREE_VEC_ELT (p, i), - TFF_PLAIN_IDENTIFIER); + tree parm_i = TREE_VEC_ELT (p, i); + /* If the template argument repeats the template parameter (T = T), + skip the parameter.*/ + if (arg && TREE_CODE (arg) == TEMPLATE_TYPE_PARM + && TREE_CODE (parm_i) == TREE_LIST + && TREE_CODE (TREE_VALUE (parm_i)) == TYPE_DECL + && TREE_CODE (TREE_TYPE (TREE_VALUE (parm_i))) + == TEMP
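
A stand-alone sketch of the idea behind `prepost_semicolon`, for readers without the GCC sources at hand: the opening " [with " is written lazily before the first binding, later bindings are separated with "; ", and the closing ']' is written by the destructor only if anything was printed. The sketch uses `std::ostream` in place of `cxx_pretty_printer`, and the names `LazyIntroducer` and `format_bindings` are hypothetical, not part of the patch.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the patch's local helper struct.
struct LazyIntroducer {
  std::ostream &os;
  bool need_semicolon;

  explicit LazyIntroducer(std::ostream &o) : os(o), need_semicolon(false) {}

  // Call once before each binding that is actually printed.
  void operator()() {
    if (need_semicolon) {
      os << "; ";
    } else {
      os << " [with ";
      need_semicolon = true;
    }
  }

  // Close the bracket only if it was ever opened.
  ~LazyIntroducer() {
    if (need_semicolon)
      os << ']';
  }
};

// Formats (parameter, argument) pairs and skips bindings where the argument
// merely repeats the parameter, such as "T = T".
std::string format_bindings(
    const std::vector<std::pair<std::string, std::string>> &bindings) {
  std::ostringstream os;
  {
    LazyIntroducer introducer(os);
    for (const auto &b : bindings) {
      if (b.first == b.second)
        continue;  // "T = T" carries no information
      introducer();
      os << b.first << " = " << b.second;
    }
  }  // the destructor runs here and closes the bracket if needed
  return os.str();
}
```

With this shape, a declaration whose bindings are all of the form `T = T` gets no "[with ...]" suffix at all, which matches the intent described in the ChangeLog entry.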
Re: [PATCH] Add gnu::diagnose_as attribute
o. Thanks for your detailed comments on this topic. Very helpful . > > + continue; > > + } > > + if (TREE_CODE (TREE_VALUE (args)) != STRING_CST) > > + { > > + error ("the argument to the %qE attribute must be a string " > > + "literal", name); > > Similarly here, recommend to follow one of the existing styles (see > c-family/c-attribs.c) rather than adding another variation to the mix. The visibility attribute on namespaces says: "%qD attribute requires a single NTBS argument". So I copied that (and its logic) for now. However, I believe the use of "NTBS" is not very user friendly. > > + if (CLASS_TYPE_P (type) && CLASSTYPE_IMPLICIT_INSTANTIATION (type)) > > + { > > + if (COMPLETE_OR_OPEN_TYPE_P (type)) > > + warning (OPT_Wattributes, "%qE attribute cannot be applied to %qT " > > + "after its instantiation", name, type); > > Ditto here: > msgid "ignoring %qE attribute applied to template instantiation %qT" Ah, here I want to be more precise. Because the attribute can be applied to a template instantiation. But only before its instantiation. Example: template struct X {}; using [[gnu::diagnose_as("XX")]] XX = X; // OK template struct X; using [[gnu::diagnose_as("XY")]] XY = X; // not OK msgid "ignoring %qE attribute applied to template %qT after instantiation" OK? > > + error ("%qE attribute applied to extern \"C\" declaration %qD", > > Please quote extern "C" (as "%). OK. However the msgid was copied from handle_abi_tag_attribute above. New patch (and ChangeLog) below: From: Matthias Kretz This attribute overrides the diagnostics output string for the entity it appertains to. The motivation is to improve QoI for library TS implementations, where diagnostics have a very bad signal-to-noise ratio due to the long namespaces involved. With the attribute, it is possible to solve PR89370 and make std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as std::string in diagnostic output without extra hacks to recognize the type in the C++ frontend. 
gcc/ChangeLog: PR c++/89370 * doc/extend.texi: Document the diagnose_as attribute. * doc/invoke.texi: Document -fno-diagnostics-use-aliases. gcc/c-family/ChangeLog: PR c++/89370 * c.opt (fdiagnostics-use-aliases): New diagnostics flag. gcc/cp/ChangeLog: PR c++/89370 * cp-tree.h: Add TFF_AS_PRIMARY. * error.c (dump_scope): When printing the name of a namespace, look for the diagnose_as attribute. If found, print the associated string instead of calling dump_decl. (dump_decl_name_or_diagnose_as): New function to replace dump_decl (pp, DECL_NAME(t), flags) and inspect the tree for the diagnose_as attribute before printing the DECL_NAME. (dump_template_scope): New function. Prints the scope of a template instance correctly applying diagnose_as attributes and adjusting the list of template parms accordingly. (dump_aggr_type): If the type has a diagnose_as attribute, print the associated string instead of printing the original type name. Print template parms only if the attribute was not applied to the instantiation / full specialization. (dump_simple_decl): Call dump_decl_name_or_diagnose_as instead of dump_decl. (dump_decl): Ditto. (lang_decl_name): Ditto. (dump_function_decl): Walk the functions context list to determine whether a call to dump_template_scope is required. Ensure function templates are presented as primary templates. (dump_function_name): Replace the function's identifier with the diagnose_as attribute value, if set. (dump_template_parms): Treat as primary template if flags contains TFF_AS_PRIMARY. (comparable_template_types_p): Consider the types not a template if one carries a diagnose_as attribute. (print_template_differences): Replace the identifier with the diagnose_as attribute value on the most general template, if it is set. * name-lookup.c (handle_namespace_attrs): Handle the diagnose_as attribute. Ensure exactly one string argument. Ensure previous diagnose_as attributes used the same name. 
* tree.c (cxx_attribute_table): Add diagnose_as attribute to the table. (check_diagnose_as_redeclaration): New function; copied and adjusted from check_abi_tag_redeclaration. (handle_diagnose_as_attribute): New function; copied and adjusted from handle_abi_tag_attribute. If the given *node is a TYPE_DECL and the TREE_TYPE is an implicit class te
Re: [PATCH] c++: Output less irrelevant info for function template decl [PR100716]
New revision which can also be compiled with GCC 4.8. From: Matthias Kretz Ensure dump_template_decl for function templates never prints template parameters after the function name (it did with -fno-pretty-templates) and skip output of irrelevant & confusing "[with T = T]" in dump_substitution. gcc/cp/ChangeLog: PR c++/100716 * error.c (dump_template_bindings): Include code to print "[with" and ']', conditional on whether anything is printed at all. This is tied to whether a semicolon is needed to separate multiple template parameters. If the template argument repeats the template parameter (T = T), then skip the parameter. (dump_substitution): Moved code to print "[with" and ']' to dump_template_bindings. (dump_function_decl): Partial revert of PR50828, which masked TFF_TEMPLATE_NAME for all of dump_function_decl. Now TFF_TEMPLATE_NAME is masked for the scope of the function and only carries through to dump_function_name. (dump_function_name): Avoid calling dump_template_parms if TFF_TEMPLATE_NAME is set. gcc/testsuite/ChangeLog: PR c++/100716 * g++.dg/diagnostic/pr100716.C: New test. * g++.dg/diagnostic/pr100716-1.C: Same test with -fno-pretty-templates. --- gcc/cp/error.c | 59 +++- gcc/testsuite/g++.dg/diagnostic/pr100716-1.C | 54 ++ gcc/testsuite/g++.dg/diagnostic/pr100716.C | 54 ++ 3 files changed, 152 insertions(+), 15 deletions(-) create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716.C -- ────── Dr. 
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/gcc/cp/error.c b/gcc/cp/error.c index ad69df6ef7f..b0836d83888 100644 --- a/gcc/cp/error.c +++ b/gcc/cp/error.c @@ -371,7 +371,32 @@ static void dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, vec *typenames) { - bool need_semicolon = false; + struct prepost_semicolon + { +cxx_pretty_printer *pp; +bool need_semicolon; + +void operator()() +{ + if (need_semicolon) + pp_separate_with_semicolon (pp); + else + { + pp_cxx_whitespace (pp); + pp_cxx_left_bracket (pp); + pp->translate_string ("with"); + pp_cxx_whitespace (pp); + need_semicolon = true; + } +} + +~prepost_semicolon() +{ + if (need_semicolon) + pp_cxx_right_bracket (pp); +} + } semicolon_or_introducer = {pp, false}; + int i; tree t; @@ -395,10 +420,19 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, if (lvl_args && NUM_TMPL_ARGS (lvl_args) > arg_idx) arg = TREE_VEC_ELT (lvl_args, arg_idx); - if (need_semicolon) - pp_separate_with_semicolon (pp); - dump_template_parameter (pp, TREE_VEC_ELT (p, i), - TFF_PLAIN_IDENTIFIER); + tree parm_i = TREE_VEC_ELT (p, i); + /* Skip this parameter if it just noise such as "T = T". 
*/ + if (arg && TREE_CODE (arg) == TEMPLATE_TYPE_PARM + && TREE_CODE (parm_i) == TREE_LIST + && TREE_CODE (TREE_VALUE (parm_i)) == TYPE_DECL + && TREE_CODE (TREE_TYPE (TREE_VALUE (parm_i))) + == TEMPLATE_TYPE_PARM + && DECL_NAME (TREE_VALUE (parm_i)) + == DECL_NAME (TREE_CHAIN (arg))) + continue; + + semicolon_or_introducer(); + dump_template_parameter (pp, parm_i, TFF_PLAIN_IDENTIFIER); pp_cxx_whitespace (pp); pp_equal (pp); pp_cxx_whitespace (pp); @@ -414,7 +448,6 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, pp_string (pp, M_("")); ++arg_idx; - need_semicolon = true; } parms = TREE_CHAIN (parms); @@ -436,8 +469,7 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, FOR_EACH_VEC_SAFE_ELT (typenames, i, t) { - if (need_semicolon) - pp_separate_with_semicolon (pp); + semicolon_or_introducer(); dump_type (pp, t, TFF_PLAIN_IDENTIFIER); pp_cxx_whitespace (pp); pp_equal (pp); @@ -1599,12 +1631,7 @@ dump_substitution (cxx_pretty_printer *pp, && !(flags & TFF_NO_TEMPLATE_BINDINGS)) { vec *typenames = t ? find_typenames (t) : NULL; - pp_cxx_whitespace (pp); - pp_cxx_left_bracket (pp); - pp->translate_string ("with"); - pp_cxx_whitespace (pp); dump_template_bindings (pp, template_parms, template_args, typenames); - pp_cxx_right_
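The prepost_semicolon helper in the hunk above can be read as a "lazy introducer": nothing is printed until the first binding survives the "T = T" filter, and the closing bracket is printed only if the introducer was. A minimal standalone sketch of that pattern follows. It writes to a std::ostream instead of a cxx_pretty_printer, and the names lazy_with_clause and format_bindings are hypothetical, not part of GCC. Like this revision of the patch, it initializes the flag through aggregate initialization rather than a default member initializer, which is presumably what makes the helper acceptable to older compilers such as GCC 4.8.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the patch's prepost_semicolon helper.
// First call prints the " [with " introducer, later calls print "; ",
// and the destructor prints ']' only if the introducer was printed.
struct lazy_with_clause
{
  std::ostream &out;
  bool need_semicolon;

  void operator() ()
  {
    if (need_semicolon)
      out << "; ";
    else
      {
        out << " [with ";
        need_semicolon = true;
      }
  }

  ~lazy_with_clause ()
  {
    if (need_semicolon)
      out << ']';
  }
};

// Formats (parameter, argument) pairs, skipping pairs where the argument
// merely repeats the parameter name (the "T = T" noise).
std::string
format_bindings (const std::vector<std::pair<std::string, std::string>> &bindings)
{
  std::ostringstream out;
  {
    lazy_with_clause separator = {out, false};
    for (const auto &binding : bindings)
      {
        if (binding.first == binding.second)
          continue;
        separator ();
        out << binding.first << " = " << binding.second;
      }
  } // the destructor runs here, before out.str () is read
  return out.str ();
}
```

With this sketch, a list containing only {"T", "T"} formats to an empty string, so no empty "[with ]" clause appears, while mixed lists keep only the informative bindings.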
[PATCH] c++: Add missing scope in typedef diagnostic [PR100763]
From: Matthias Kretz dump_type on 'const std::string' should not print 'const string' unless TFF_UNQUALIFIED_NAME is requested. gcc/cp/ChangeLog: PR c++/100763 * error.c: Call dump_scope when printing a typedef. --- gcc/cp/error.c | 2 ++ 1 file changed, 2 insertions(+) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/gcc/cp/error.c b/gcc/cp/error.c index c88d1749a0f..ad69df6ef7f 100644 --- a/gcc/cp/error.c +++ b/gcc/cp/error.c @@ -501,6 +501,8 @@ dump_type (cxx_pretty_printer *pp, tree t, int flags) else { pp_cxx_cv_qualifier_seq (pp, t); + if (! (flags & TFF_UNQUALIFIED_NAME)) + dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags); pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t)); return; }
[PATCH] c++: Output less irrelevant info for function template decl [PR100716]
From: Matthias Kretz Ensure dump_template_decl for function templates never prints template parameters after the function name (it did with -fno-pretty-templates) and skip output of irrelevant & confusing "[with T = T]" in dump_substitution. gcc/cp/ChangeLog: PR c++/100716 * error.c (dump_template_bindings): Include code to print "[with" and ']', conditional on whether anything is printed at all. This is tied to whether a semicolon is needed to separate multiple template parameters. If the template argument repeats the template parameter (T = T), then skip the parameter. (dump_substitution): Moved code to print "[with" and ']' to dump_template_bindings. (dump_function_decl): Partial revert of PR50828, which masked TFF_TEMPLATE_NAME for all of dump_function_decl. Now TFF_TEMPLATE_NAME is masked for the scope of the function and only carries through to dump_function_name. (dump_function_name): Avoid calling dump_template_parms if TFF_TEMPLATE_NAME is set. gcc/testsuite/ChangeLog: PR c++/100716 * g++.dg/diagnostic/pr100716.C: New test. * g++.dg/diagnostic/pr100716-1.C: Same test with -fno-pretty-templates. --- gcc/cp/error.c | 59 +++- gcc/testsuite/g++.dg/diagnostic/pr100716-1.C | 54 ++ gcc/testsuite/g++.dg/diagnostic/pr100716.C | 54 ++ 3 files changed, 152 insertions(+), 15 deletions(-) create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716-1.C create mode 100644 gcc/testsuite/g++.dg/diagnostic/pr100716.C -- ────── Dr. 
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/gcc/cp/error.c b/gcc/cp/error.c index 010fbce41a7..bc0b68f07e0 100644 --- a/gcc/cp/error.c +++ b/gcc/cp/error.c @@ -381,7 +381,32 @@ static void dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, vec *typenames) { - bool need_semicolon = false; + struct prepost_semicolon + { +cxx_pretty_printer *pp; +bool need_semicolon = false; + +void operator()() +{ + if (need_semicolon) + pp_separate_with_semicolon (pp); + else + { + pp_cxx_whitespace (pp); + pp_cxx_left_bracket (pp); + pp->translate_string ("with"); + pp_cxx_whitespace (pp); + need_semicolon = true; + } +} + +~prepost_semicolon() +{ + if (need_semicolon) + pp_cxx_right_bracket (pp); +} + } semicolon_or_introducer = {pp}; + int i; tree t; @@ -405,10 +430,19 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, if (lvl_args && NUM_TMPL_ARGS (lvl_args) > arg_idx) arg = TREE_VEC_ELT (lvl_args, arg_idx); - if (need_semicolon) - pp_separate_with_semicolon (pp); - dump_template_parameter (pp, TREE_VEC_ELT (p, i), - TFF_PLAIN_IDENTIFIER); + tree parm_i = TREE_VEC_ELT (p, i); + /* Skip this parameter if it just noise such as "T = T". 
*/ + if (arg && TREE_CODE (arg) == TEMPLATE_TYPE_PARM + && TREE_CODE (parm_i) == TREE_LIST + && TREE_CODE (TREE_VALUE (parm_i)) == TYPE_DECL + && TREE_CODE (TREE_TYPE (TREE_VALUE (parm_i))) + == TEMPLATE_TYPE_PARM + && DECL_NAME (TREE_VALUE (parm_i)) + == DECL_NAME (TREE_CHAIN (arg))) + continue; + + semicolon_or_introducer(); + dump_template_parameter (pp, parm_i, TFF_PLAIN_IDENTIFIER); pp_cxx_whitespace (pp); pp_equal (pp); pp_cxx_whitespace (pp); @@ -424,7 +458,6 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, pp_string (pp, M_("")); ++arg_idx; - need_semicolon = true; } parms = TREE_CHAIN (parms); @@ -446,8 +479,7 @@ dump_template_bindings (cxx_pretty_printer *pp, tree parms, tree args, FOR_EACH_VEC_SAFE_ELT (typenames, i, t) { - if (need_semicolon) - pp_separate_with_semicolon (pp); + semicolon_or_introducer(); dump_type (pp, t, TFF_PLAIN_IDENTIFIER); pp_cxx_whitespace (pp); pp_equal (pp); @@ -1652,12 +1684,7 @@ dump_substitution (cxx_pretty_printer *pp, && !(flags & TFF_NO_TEMPLATE_BINDINGS)) { vec *typenames = t ? find_typenames (t) : NULL; - pp_cxx_whitespace (pp); - pp_cxx_left_bracket (pp); - pp->translate_string ("with"); - pp_cxx_whitespace (pp); dump_template_bindings (pp, template_parms, template_args, typenames); - pp_cxx_right_bracket (pp); } } @@ -1698,7 +1725,8 @@ du
Re: [PATCH] Add gnu::diagnose_as attribute
> On Tuesday, 4 May 2021 15:34:13 CEST David Malcolm wrote: > > Does the patch interact correctly with the %H and %I codes that try to > > show the differences between two template types? While looking into this, I noticed that given namespace std { struct A {}; typedef A B; } const std::B would print as "'const B' {aka 'const std::A'}", i.e. without printing the scope of the typedef. I traced it to cp/error.c (dump_type). In the `if (TYPE_P (t) && typedef_variant_p (t))` branch, in the final else branch only cv-qualifiers and identifier are printed: pp_cxx_cv_qualifier_seq (pp, t); pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t)); I believe the following should go in between, correct? pp_cxx_cv_qualifier_seq (pp, t); if (! (flags & TFF_UNQUALIFIED_NAME)) dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags); pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t)); This is important for my diagnose_as patch because otherwise the output is: 'const string' {aka 'const std::string'} which is confusing and unnecessarily verbose. Patch below. From: Matthias Kretz dump_type on 'const std::string' should not print 'const string' unless TFF_UNQUALIFIED_NAME is requested. gcc/cp/ChangeLog: * error.c: Call dump_scope when printing a typedef. --- gcc/cp/error.c | 2 ++ 1 file changed, 2 insertions(+) -- ────── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ── diff --git a/gcc/cp/error.c b/gcc/cp/error.c index 10b547afaa7..edeaad44bcd 100644 --- a/gcc/cp/error.c +++ b/gcc/cp/error.c @@ -511,6 +511,8 @@ dump_type (cxx_pretty_printer *pp, tree t, int flags) else { pp_cxx_cv_qualifier_seq (pp, t); + if (! (flags & TFF_UNQUALIFIED_NAME)) + dump_scope (pp, CP_DECL_CONTEXT (TYPE_NAME (t)), flags); pp_cxx_tree_identifier (pp, TYPE_IDENTIFIER (t)); return; }
Re: [PATCH] Add gnu::diagnose_as attribute
On Tuesday, 4 May 2021 16:23:23 CEST Matthias Kretz wrote:
> On Tuesday, 4 May 2021 15:34:13 CEST David Malcolm wrote:
> > Does the patch interact correctly with the %H and %I codes that try to
> > show the differences between two template types?
>
> I don't know. I'll try to find out. If you have a good idea (or pointer) for
> a testcase, let me know.

I see it now. It currently does not interact with %H and %I (at least in my tests). I'll investigate what it should do.

-- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
Re: [PATCH] Add gnu::diagnose_as attribute
On Tuesday, 4 May 2021 15:34:13 CEST David Malcolm wrote:
> On Tue, 2021-05-04 at 13:13 +0200, Matthias Kretz wrote:
> > This attribute overrides the diagnostics output string for the entity it
> > appertains to. The motivation is to improve QoI for library TS
> > implementations, where diagnostics have a very bad signal-to-noise ratio
> > due to the long namespaces involved.
> > [...]
>
> Thanks for the patch, it looks very promising.

Thanks. I'm new to modifying the compiler like this, so please be extra careful with my patch. I believe I understand most of what I did, but I might have misunderstood. :)

> The patch has no testcases; it should probably add test coverage for:
> - the various places and ways in which diagnose_as can affect the output,
> - disabling it with the option
> - the various ways in which the user can get diagnose_as wrong
> - etc

Right. If you know of an existing similar testcase, that'd help me a lot to get started.

> Does the patch affect the output of labels when underlining ranges of
> source code in diagnostics?

AFAIU (and tested), it doesn't affect source code output. So, no?

> Does the patch interact correctly with the %H and %I codes that try to
> show the differences between two template types?

I don't know. I'll try to find out. If you have a good idea (or pointer) for a testcase, let me know.

> I have some minor nits from a diagnostics point of view:
> [...]
> Please add an auto_diagnostic_group here so that the "inform" is
> associated with the "error".
> [...]
> diagnose_as should be in quotes here (%< and %>).
> [...]
> Please quote extern "C":

Thanks. All done in my tree. I'll work on testcases before sending an updated patch.

> Thanks again for the patch; hope this is constructive

-- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
[PATCH] Add gnu::diagnose_as attribute
From: Matthias Kretz This attribute overrides the diagnostics output string for the entity it appertains to. The motivation is to improve QoI for library TS implementations, where diagnostics have a very bad signal-to-noise ratio due to the long namespaces involved. On Tuesday, 27 April 2021 11:46:48 CEST Jonathan Wakely wrote: > I think it's a great idea and would like to use it for all the TS > implementations where there is some inline namespace that the user > doesn't care about. std::experimental::fundamentals_v1:: would be much > better as just std::experimental::, or something like std::[LFTS]::. With the attribute, it is possible to solve PR89370 and make std::__cxx11::basic_string<_CharT, _Traits, _Alloc> appear as std::string in diagnostic output without extra hacks to recognize the type. gcc/ChangeLog: PR c++/89370 * doc/extend.texi: Document the diagnose_as attribute. * doc/invoke.texi: Document -fno-diagnostics-use-aliases. gcc/c-family/ChangeLog: PR c++/89370 * c.opt (fdiagnostics-use-aliases): New diagnostics flag. gcc/cp/ChangeLog: PR c++/89370 * error.c (dump_scope): When printing the name of a namespace, look for the diagnose_as attribute. If found, print the associated string instead of calling dump_decl. (dump_decl_name_or_diagnose_as): New function to replace dump_decl (pp, DECL_NAME(t), flags) and inspect the tree for the diagnose_as attribute before printing the DECL_NAME. (dump_aggr_type): If the type has a diagnose_as attribute, print the associated string instead of printing the original type name. (dump_simple_decl): Call dump_decl_name_or_diagnose_as instead of dump_decl. (dump_decl): Ditto. (lang_decl_name): Ditto. (dump_function_decl): Ensure complete replacement of the class template diagnostics if a diagnose_as attribute is present. (dump_function_name): Replace the function diagnostic output if the diagnose_as attribute is set. * name-lookup.c (handle_namespace_attrs): Handle the diagnose_as attribute. 
Ensure exactly one string argument. Ensure previous diagnose_as attributes used the same name. * tree.c (cxx_attribute_table): Add diagnose_as attribute to the table. (check_diagnose_as_redeclaration): New function; copied and adjusted from check_abi_tag_redeclaration. (handle_diagnose_as_attribute): New function; copied and adjusted from handle_abi_tag_attribute. If the given *node is a TYPE_DECL and the TREE_TYPE is an implicit class template instantiation, call decl_attributes to add the diagnose_as attribute to the TREE_TYPE. --- gcc/c-family/c.opt | 4 ++ gcc/cp/error.c | 85 --- gcc/cp/name-lookup.c | 27 ++ gcc/cp/tree.c| 117 +++ gcc/doc/extend.texi | 37 ++ gcc/doc/invoke.texi | 9 +++- 6 files changed, 270 insertions(+), 9 deletions(-) -- ────── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 3f8b72cdc00..0cf01c6dba4 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -1582,6 +1582,10 @@ fdiagnostics-show-template-tree C++ ObjC++ Var(flag_diagnostics_show_template_tree) Init(0) Print hierarchical comparisons when template types are mismatched. +fdiagnostics-use-aliases +C++ Var(flag_diagnostics_use_aliases) Init(1) +Replace identifiers or scope names in diagnostics as defined by the diagnose_as attribute. + fdirectives-only C ObjC C++ ObjC++ Preprocess directives only. diff --git a/gcc/cp/error.c b/gcc/cp/error.c index c88d1749a0f..10b547afaa7 100644 --- a/gcc/cp/error.c +++ b/gcc/cp/error.c @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3. 
If not see #include "internal-fn.h" #include "gcc-rich-location.h" #include "cp-name-hint.h" +#include "attribs.h" #define pp_separate_with_comma(PP) pp_cxx_separate_with (PP, ',') #define pp_separate_with_semicolon(PP) pp_cxx_separate_with (PP, ';') @@ -66,6 +67,7 @@ static void dump_alias_template_specialization (cxx_pretty_printer *, tree, int) static void dump_type (cxx_pretty_printer *, tree, int); static void dump_typename (cxx_pretty_printer *, tree, int); static void dump_simple_decl (cxx_pretty_printer *, tree, tree, int); +static void dump_decl_name_or_diagnose_as (cxx_pretty_printer *, tree, int); static void dump_decl (cxx_pretty_printer *, tree, int); stati
Re: [PATCH 4/4] libstdc++: More efficient last day of month.
I like the idea.

On Tuesday, 23 February 2021 14:25:10 CET Cassio Neri via Libstdc++ wrote:
> ((__m ^ (__m >> 3)) & 1) | 30

Note that you can drop the `& 1` part. 30 in binary is 0b11110. ORing it with a value in [0, 0b01101] can only change the last bit.

-- ────── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
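The remark about dropping `& 1` can be checked exhaustively, since only the months 1 to 12 matter. The sketch below compares both forms at compile time. The function names are hypothetical, and February is assumed to be handled separately by the caller, as the expression yields 30 for month 2.

```cpp
#include <cassert>

// Hypothetical names. Both return 30 or 31 for a month number in [1, 12];
// February is assumed to be special-cased elsewhere.
constexpr unsigned
last_day_with_mask (unsigned m)
{
  return ((m ^ (m >> 3)) & 1) | 30;
}

constexpr unsigned
last_day_without_mask (unsigned m)
{
  // 30 is 0b11110, so bits 1 to 4 are already set. m ^ (m >> 3) is at most
  // 0b01101 for m <= 12, so only bit 0 can still change the result.
  return (m ^ (m >> 3)) | 30;
}

constexpr bool
forms_agree ()
{
  for (unsigned m = 1; m <= 12; ++m)
    if (last_day_with_mask (m) != last_day_without_mask (m))
      return false;
  return true;
}

static_assert (forms_agree (), "both expressions agree for every month");
```

The compile-time loop needs C++14 or later.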
Re: [PATCH] libstdc++: Don't use reserved identifiers in simd headers
On Monday, 1 February 2021 13:21:33 CET Rainer Orth wrote:
> Two simd tests FAIL on Solaris, both SPARC and x86:
>
> FAIL: experimental/simd/standard_abi_usable.cc -msse2 -O2 -Wno-psabi (test for excess errors)
> FAIL: experimental/simd/standard_abi_usable_2.cc -msse2 -O2 -Wno-psabi (test for excess errors)
>
> This happens because the simd headers use identifiers documented in the
> libstdc++ manual as reserved by system headers.

Sorry, this code was originally written as non-stdlib code, i.e. without any reserved identifiers. I had hoped I had found all issues...

> Fixed as follows, tested on i386-pc-solaris2.11, sparc-sun-solaris2.11,
> and x86_64-pc-linux-gnu.
>
> Ok for master?

Looks good to me.

> As an aside, the use of vim: markers initially confused the hell out of
> me. As an Emacs user, I rarely use vi for much more than a pager, but
> when I wanted to check the lines mentioned in the g++ errors, I had no
> idea what was going on or how to disable the folding enabled there:
>
> // vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
>
> I can't help but feel that this is just a personal preference and
> doesn't belong into the upstream code.

Yes. I guess it's better to remove at least foldmethod. The rest isn't personal preference, but coding style requirements. However, I don't need any of it anymore: by now my vim config autodetects GCC / libstdc++ code. If the rest of libstdc++ doesn't have it, the simd headers probably shouldn't have it either.

Best,
Matthias

-- ────── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
Re: [PATCH 14/16] Implement hmin and hmax
On Wednesday, 27 January 2021 21:42:50 CET Matthias Kretz wrote:
> --- a/libstdc++-v3/include/experimental/bits/simd.h
> +++ b/libstdc++-v3/include/experimental/bits/simd.h
> @@ -204,6 +204,27 @@ template
> template
>   using _SizeConstant = integral_constant;
>
> +namespace __detail {
> +  struct _Minimum {
> +    template
> +      _GLIBCXX_SIMD_INTRINSIC constexpr
> +      _Tp
> +      operator()(_Tp __a, _Tp __b) const {

Reviewing my own patch :) This needs line breaks before { for namespace, struct, and operator(). And another line break before the next struct. New patch attached.

From: Matthias Kretz

From 9.7.4 in Parallelism TS 2. For some reason I overlooked these two functions. Implement them via call to _S_reduce.

libstdc++-v3/ChangeLog:

	* include/experimental/bits/simd.h: Add __detail::_Minimum and __detail::_Maximum to use them as _BinaryOperation to _S_reduce. Add hmin and hmax overloads for simd and const_where_expression.
	* include/experimental/bits/simd_scalar.h (_SimdImplScalar::_S_reduce): Make unused _BinaryOperation parameter const-ref to allow calling _S_reduce with an rvalue.
	* testsuite/experimental/simd/tests/reductions.cc: Add tests for hmin and hmax. Since the compiler statically determined that all tests pass, repeat the test after a call to make_value_unknown.

-- ── Dr.
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ── diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 14179491f9d..a90cb3b2d98 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -204,6 +204,33 @@ template template using _SizeConstant = integral_constant; +namespace __detail +{ + struct _Minimum + { +template + _GLIBCXX_SIMD_INTRINSIC constexpr + _Tp + operator()(_Tp __a, _Tp __b) const + { + using std::min; + return min(__a, __b); + } + }; + + struct _Maximum + { +template + _GLIBCXX_SIMD_INTRINSIC constexpr + _Tp + operator()(_Tp __a, _Tp __b) const + { + using std::max; + return max(__a, __b); + } + }; +} // namespace __detail + // unrolled/pack execution helpers // __execute_n_times{{{ template @@ -3408,7 +3435,7 @@ template // }}}1 // reductions [simd.reductions] {{{1 - template > +template > _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp reduce(const simd<_Tp, _Abi>& __v, _BinaryOperation __binary_op = _BinaryOperation()) @@ -3454,6 +3481,61 @@ template reduce(const const_where_expression<_M, _V>& __x, bit_xor<> __binary_op) { return reduce(__x, 0, __binary_op); } +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp + hmin(const simd<_Tp, _Abi>& __v) noexcept + { +return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Minimum()); + } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp + hmax(const simd<_Tp, _Abi>& __v) noexcept + { +return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Maximum()); + } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + typename _V::value_type + hmin(const const_where_expression<_M, _V>& __x) noexcept + { +using _Tp = typename _V::value_type; +constexpr _Tp __id_elem = +#ifdef __FINITE_MATH_ONLY__ + __finite_max_v<_Tp>; +#else + __value_or<__infinity, 
_Tp>(__finite_max_v<_Tp>); +#endif +_V __tmp = __id_elem; +_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp), +__data(__get_lvalue(__x))); +return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Minimum()); + } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + typename _V::value_type + hmax(const const_where_expression<_M, _V>& __x) noexcept + { +using _Tp = typename _V::value_type; +constexpr _Tp __id_elem = +#ifdef __FINITE_MATH_ONLY__ + __finite_min_v<_Tp>; +#else + [] { + if constexpr (__value_exists_v<__infinity, _Tp>) + return -__infinity_v<_Tp>; + else + return __finite_min_v<_Tp>; + }(); +#endif +_V __tmp = __id_elem; +_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp), +__data(__get_lvalue(__x))); +return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Maximum()); + } + // }}}1 // algorithms [simd.alg] {{{ template diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h index 7680bc39c30..7e4
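The masked hmin and hmax overloads in the patch above first overwrite a temporary with the identity element of the reduction, copy in the selected lanes, and then reduce all lanes. A scalar model of that idea is sketched below with std::array instead of simd types. The names masked_hmin and masked_hmax are hypothetical, and this is not the libstdc++ implementation, only an illustration of why the identity element has to be +infinity (or the largest finite value) for min and -infinity (or the lowest finite value) for max.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical scalar model: lanes with mask[i] == false are replaced by the
// identity element of min, so they cannot influence the result.
template <typename T, std::size_t N>
T
masked_hmin (const std::array<T, N> &v, const std::array<bool, N> &mask)
{
  const T identity = std::numeric_limits<T>::has_infinity
                       ? std::numeric_limits<T>::infinity ()
                       : std::numeric_limits<T>::max ();
  T result = identity;
  for (std::size_t i = 0; i < N; ++i)
    {
      const T lane = mask[i] ? v[i] : identity;
      result = lane < result ? lane : result;
    }
  return result;
}

// Same idea for max: the identity is -infinity, or the lowest finite value
// for types without an infinity.
template <typename T, std::size_t N>
T
masked_hmax (const std::array<T, N> &v, const std::array<bool, N> &mask)
{
  const T identity = std::numeric_limits<T>::has_infinity
                       ? -std::numeric_limits<T>::infinity ()
                       : std::numeric_limits<T>::lowest ();
  T result = identity;
  for (std::size_t i = 0; i < N; ++i)
    {
      const T lane = mask[i] ? v[i] : identity;
      result = result < lane ? lane : result;
    }
  return result;
}
```

A consequence visible in this model is that an all-false mask returns the identity element itself.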
[PATCH 16/16] Improve "find_first/last_set" for NEON
From: yaozhongxiao

The find_first_set and find_last_set implementations are not optimal for NEON. They can be improved by using horizontal adds (vaddv), which reduces the generated assembly code. In the following case, vaddvq_s16 generates 2 instructions, whereas the repeated vpadd calls generate 4 instructions:

```
# vaddvq_s16
vaddvq_s16(__asint);
// addv h0, v1.8h
// smov w1, v0.h[0]
# vpadd_s16
vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero), __zero)[0]
// addp v1.8h,v1.8h,v2.8h
// addp v1.8h,v1.8h,v2.8h
// addp v1.8h,v1.8h,v2.8h
// smov w1, v1.h[0]
#
```

libstdc++-v3/ChangeLog:

	* include/experimental/bits/simd_neon.h: Replace repeated vpadd calls with a single vaddv for aarch64.
---
 .../include/experimental/bits/simd_neon.h | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd_neon.h b/libstdc++-v3/include/experimental/bits/simd_neon.h
index a3a8ffe165f..0b8ccc17513 100644
--- a/libstdc++-v3/include/experimental/bits/simd_neon.h
+++ b/libstdc++-v3/include/experimental/bits/simd_neon.h
@@ -311,8 +311,7 @@ struct _MaskImplNeonMixin
 		});
 	      __asint &= __bitsel;
 #ifdef __aarch64__
-	      return vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero),
-				__zero)[0];
+	      return vaddvq_s16(__asint);
 #else
 	      return vpadd_s16(
 		vpadd_s16(vpadd_s16(__lo64(__asint), __hi64(__asint)), __zero),
@@ -328,7 +327,7 @@ struct _MaskImplNeonMixin
 		});
 	      __asint &= __bitsel;
 #ifdef __aarch64__
-	      return vpaddq_s32(vpaddq_s32(__asint, __zero), __zero)[0];
+	      return vaddvq_s32(__asint);
 #else
 	      return vpadd_s32(vpadd_s32(__lo64(__asint), __hi64(__asint)),
 			       __zero)[0];
@@ -351,8 +350,12 @@ struct _MaskImplNeonMixin
 		  return static_cast<_I>(__i < _Np ? 1 << __i : 0);
 		});
 	      __asint &= __bitsel;
+#ifdef __aarch64__
+	      return vaddv_s8(__asint);
+#else
 	      return vpadd_s8(vpadd_s8(vpadd_s8(__asint, __zero), __zero),
 			      __zero)[0];
+#endif
 	    }
 	  else if constexpr (sizeof(_Tp) == 2)
 	    {
@@ -362,12 +365,20 @@ struct _MaskImplNeonMixin
 		  return static_cast<_I>(__i < _Np ? 1 << __i : 0);
 		});
 	      __asint &= __bitsel;
+#ifdef __aarch64__
+	      return vaddv_s16(__asint);
+#else
 	      return vpadd_s16(vpadd_s16(__asint, __zero), __zero)[0];
+#endif
 	    }
 	  else if constexpr (sizeof(_Tp) == 4)
 	    {
 	      __asint &= __make_vector<_I>(0x1, 0x2);
+#ifdef __aarch64__
+	      return vaddv_s32(__asint);
+#else
 	      return vpadd_s32(__asint, __zero)[0];
+#endif
 	    }
 	  else
 	    __assert_unreachable<_Tp>();

-- ────── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
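The hunks above appear to share one scheme: each mask lane (all bits set or zero) is ANDed with a per-lane bit selector 1 << i, and a horizontal add of the lanes then yields an integer bitmask, from which the first or last set lane can be read off. A portable scalar model of that scheme is sketched below, so it can be tried without NEON hardware. The names mask_to_bits, first_set and last_set are hypothetical and only illustrate the arithmetic, not the intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the NEON code path. Each lane is either 0
// (false) or -1 (all bits set, true), as in a vector comparison result.
template <std::size_t N>
int
mask_to_bits (const std::array<std::int16_t, N> &lanes)
{
  static_assert (N <= 8, "model covers at most eight 16-bit lanes");
  int sum = 0;
  for (std::size_t i = 0; i < N; ++i)
    {
      const int bitsel = 1 << i;
      // The AND keeps bit i for a true lane and nothing for a false lane.
      // Summing all lanes is what the horizontal add (vaddv) does in one step.
      sum += lanes[i] & bitsel;
    }
  return sum;
}

// Index of the lowest set bit; bits must be non-zero.
inline int
first_set (int bits)
{
  int index = 0;
  while (((bits >> index) & 1) == 0)
    ++index;
  return index;
}

// Index of the highest set bit; bits must be non-zero.
inline int
last_set (int bits)
{
  int index = 0;
  while ((bits >> (index + 1)) != 0)
    ++index;
  return index;
}
```

Because every lane contributes a distinct bit, the addition never carries, which is why a plain sum can stand in for a bitwise OR here.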
[PATCH 15/16] Work around test failures using -fno-tree-vrp
From: Matthias Kretz

This is necessary to avoid failures resulting from PR98834.

libstdc++-v3/ChangeLog:

	* testsuite/Makefile.am: Warn about the workaround. Add -fno-tree-vrp to CXXFLAGS passed to the check_simd script. Improve initial user feedback from make check-simd.
	* testsuite/Makefile.in: Regenerated.
---
 libstdc++-v3/testsuite/Makefile.am | 4 +++-
 libstdc++-v3/testsuite/Makefile.in | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index 2d3ad481dba..ba5023a8b54 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -191,8 +191,10 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
 		${glibcxx_srcdir}/scripts/check_simd \
 		testsuite_files_simd \
 		${glibcxx_builddir}/scripts/testsuite_flags
+	@echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834."
 	@rm -f .simd.summary
-	${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
+	@echo "Generating simd testsuite subdirs and Makefiles ..."
+	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
 	  while read subdir; do \
 	    $(MAKE) -C "$${subdir}"; \
 	    tail -n20 $${subdir}/simd_testsuite.sum | \
diff --git a/libstdc++-v3/testsuite/Makefile.in b/libstdc++-v3/testsuite/Makefile.in
index ac6207ae75c..c9dd7f5da61 100644
--- a/libstdc++-v3/testsuite/Makefile.in
+++ b/libstdc++-v3/testsuite/Makefile.in
@@ -716,8 +716,10 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
 		${glibcxx_srcdir}/scripts/check_simd \
 		testsuite_files_simd \
 		${glibcxx_builddir}/scripts/testsuite_flags
+	@echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834."
 	@rm -f .simd.summary
-	${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
+	@echo "Generating simd testsuite subdirs and Makefiles ..."
+	@${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
 	  while read subdir; do \
 	    $(MAKE) -C "$${subdir}"; \
 	    tail -n20 $${subdir}/simd_testsuite.sum | \

-- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
[PATCH 14/16] Implement hmin and hmax
From: Matthias Kretz From 9.7.4 in Parallelism TS 2. For some reason I overlooked these two functions. Implement them via call to _S_reduce. libstdc++-v3/ChangeLog: * include/experimental/bits/simd.h: Add __detail::_Minimum and __detail::_Maximum to use them as _BinaryOperation to _S_reduce. Add hmin and hmax overloads for simd and const_where_expression. * include/experimental/bits/simd_scalar.h (_SimdImplScalar::_S_reduce): Make unused _BinaryOperation parameter const-ref to allow calling _S_reduce with an rvalue. * testsuite/experimental/simd/tests/reductions.cc: Add tests for hmin and hmax. Since the compiler statically determined that all tests pass, repeat the test after a call to make_value_unknown. --- libstdc++-v3/include/experimental/bits/simd.h | 78 ++- .../include/experimental/bits/simd_scalar.h | 2 +- .../experimental/simd/tests/reductions.cc | 21 + 3 files changed, 99 insertions(+), 2 deletions(-) diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/ include/experimental/bits/simd.h index 14179491f9d..f08ef4c027d 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -204,6 +204,27 @@ template template using _SizeConstant = integral_constant; +namespace __detail { + struct _Minimum { +template + _GLIBCXX_SIMD_INTRINSIC constexpr + _Tp + operator()(_Tp __a, _Tp __b) const { + using std::min; + return min(__a, __b); + } + }; + struct _Maximum { +template + _GLIBCXX_SIMD_INTRINSIC constexpr + _Tp + operator()(_Tp __a, _Tp __b) const { + using std::max; + return max(__a, __b); + } + }; +} // namespace __detail + // unrolled/pack execution helpers // __execute_n_times{{{ template @@ -3408,7 +3429,7 @@ template // }}}1 // reductions [simd.reductions] {{{1 - template > +template > _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp reduce(const simd<_Tp, _Abi>& __v, _BinaryOperation __binary_op = _BinaryOperation()) @@ -3454,6 +3475,61 @@ template reduce(const 
const_where_expression<_M, _V>& __x, bit_xor<> __binary_op) { return reduce(__x, 0, __binary_op); } +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp + hmin(const simd<_Tp, _Abi>& __v) noexcept + { +return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Minimum()); + } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp + hmax(const simd<_Tp, _Abi>& __v) noexcept + { +return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Maximum()); + } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + typename _V::value_type + hmin(const const_where_expression<_M, _V>& __x) noexcept + { +using _Tp = typename _V::value_type; +constexpr _Tp __id_elem = +#ifdef __FINITE_MATH_ONLY__ + __finite_max_v<_Tp>; +#else + __value_or<__infinity, _Tp>(__finite_max_v<_Tp>); +#endif +_V __tmp = __id_elem; +_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp), + __data(__get_lvalue(__x))); +return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Minimum()); + } + +template + _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR + typename _V::value_type + hmax(const const_where_expression<_M, _V>& __x) noexcept + { +using _Tp = typename _V::value_type; +constexpr _Tp __id_elem = +#ifdef __FINITE_MATH_ONLY__ + __finite_min_v<_Tp>; +#else + [] { + if constexpr (__value_exists_v<__infinity, _Tp>) + return -__infinity_v<_Tp>; + else + return __finite_min_v<_Tp>; + }(); +#endif +_V __tmp = __id_elem; +_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp), + __data(__get_lvalue(__x))); +return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Maximum()); + } + // }}}1 // algorithms [simd.alg] {{{ template diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++- v3/include/experimental/bits/simd_scalar.h index 7680bc39c30..7e480ecdb37 100644 --- a/libstdc++-v3/include/experimental/bits/simd_scalar.h +++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h @@ -182,7 +182,7 @@ struct _SimdImplScalar // _S_reduce {{{2 
template static constexpr inline _Tp -_S_reduce(const simd<_Tp, simd_abi::scalar>& __x, _BinaryOperation&) +_S_reduce(const simd<_Tp, simd_abi::scalar>& __x, const _BinaryOperation&) { return __x._M_data; } // _S_min, _S_max {{{2 diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc b/ libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc index 9d897d5ccd6..02df68fafbc 100644 --- a/libstdc++-v3
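For readers following the hmin/hmax patch, here is a minimal sketch of the technique it describes: a horizontal minimum expressed as a reduction with a min functor, and a masked variant that first replaces inactive lanes with an identity element. It models a simd object with std::array and does not use the real libstdc++ internals. The names Minimum, reduce_lanes, hmin_sketch, and masked_hmin_sketch are hypothetical stand-ins chosen for illustration, and the identity element is picked with std::numeric_limits rather than the library's own helpers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for __detail::_Minimum: a stateless binary functor.
struct Minimum
{
  template <typename T>
    constexpr T
    operator()(T a, T b) const
    { return b < a ? b : a; }
};

// Hypothetical stand-in for _S_reduce: folds all lanes with a binary
// operation. Taking the functor by const reference allows callers to pass
// an rvalue such as Minimum().
template <typename T, std::size_t N, typename BinaryOp>
  T
  reduce_lanes(const std::array<T, N>& v, const BinaryOp& op)
  {
    static_assert(N > 0, "need at least one lane");
    T acc = v[0];
    for (std::size_t i = 1; i < N; ++i)
      acc = op(acc, v[i]);
    return acc;
  }

// Horizontal minimum over all lanes.
template <typename T, std::size_t N>
  T
  hmin_sketch(const std::array<T, N>& v)
  { return reduce_lanes(v, Minimum()); }

// Masked horizontal minimum: lanes where the mask is false are replaced by
// the identity element of min (infinity if T has one, otherwise the largest
// finite value), so they cannot influence the result.
template <typename T, std::size_t N>
  T
  masked_hmin_sketch(const std::array<bool, N>& mask,
                     const std::array<T, N>& v)
  {
    constexpr T id = std::numeric_limits<T>::has_infinity
                       ? std::numeric_limits<T>::infinity()
                       : std::numeric_limits<T>::max();
    std::array<T, N> tmp{};
    for (std::size_t i = 0; i < N; ++i)
      tmp[i] = mask[i] ? v[i] : id;
    return reduce_lanes(tmp, Minimum());
  }
```

With an all-false mask the sketch returns the identity element itself, which is the same behavior the masked overloads in the patch get from assigning into a vector pre-filled with __id_elem.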
[PATCH 13/16] Improve test codegen for interpreting assembly
From: Matthias Kretz In many failure cases it is helpful to inspect the instructions leading up to the test failure. After this change the location is easier to find and the branch after failure is easier to find. libstdc++-v3/ChangeLog: * testsuite/experimental/simd/tests/bits/verify.h (verify): Add instruction pointer data member. Ensure that the `if (m_failed)` branch is always inlined into the calling code. The body of the conditional can still be a function call. Move the get_ip call into the verify ctor to simplify the ctor calls. (COMPARE): Don't mention the use of all_of for reduction of a simd_mask. It only distracts from the real issue. --- .../experimental/simd/tests/bits/verify.h | 44 +-- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h b/ libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h index 5da47b35536..17bda71b77e 100644 --- a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h +++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h @@ -60,6 +60,7 @@ template class verify { const bool m_failed = false; + size_t m_ip = 0; template () @@ -129,20 +130,21 @@ class verify public: template -verify(bool ok, size_t ip, const char* file, const int line, +[[gnu::always_inline]] +verify(bool ok, const char* file, const int line, const char* func, const char* cond, const Ts&... 
extra_info) -: m_failed(!ok) +: m_failed(!ok), m_ip(get_ip()) { if (m_failed) - { + [&] { __builtin_fprintf(stderr, "%s:%d: (%s):\nInstruction Pointer: %x\n" "Assertion '%s' failed.\n", - file, line, func, ip, cond); + file, line, func, m_ip, cond); (print(extra_info, int()), ...); - } + }(); } - ~verify() + [[gnu::always_inline]] ~verify() { if (m_failed) { @@ -152,26 +154,27 @@ public: } template +[[gnu::always_inline]] const verify& operator<<(const T& x) const { if (m_failed) - { - print(x, int()); - } + print(x, int()); return *this; } template +[[gnu::always_inline]] const verify& on_failure(const Ts&... xs) const { if (m_failed) - (print(xs, int()), ...); + [&] { (print(xs, int()), ...); }(); return *this; } - [[gnu::always_inline]] static inline size_t + [[gnu::always_inline]] static inline + size_t get_ip() { size_t _ip = 0; @@ -220,24 +223,21 @@ template #define COMPARE(_a, _b) \ [&](auto&& _aa, auto&& _bb) { \ -return verify(std::experimental::all_of(_aa == _bb), verify::get_ip(), \ - __FILE__, __LINE__, __PRETTY_FUNCTION__, \ - "all_of(" #_a " == " #_b ")", #_a " = ", _aa,\ +return verify(std::experimental::all_of(_aa == _bb), __FILE__, __LINE__, \ + __PRETTY_FUNCTION__, #_a " == " #_b, #_a " = ", _aa, \ "\n" #_b " = ", _bb);\ }(force_fp_truncation(_a), force_fp_truncation(_b)) #else #define COMPARE(_a, _b) \ [&](auto&& _aa, auto&& _bb) { \ -return verify(std::experimental::all_of(_aa == _bb), verify::get_ip(), \ - __FILE__, __LINE__, __PRETTY_FUNCTION__, \ - "all_of(" #_a " == " #_b ")", #_a " = ", _aa,\ +return verify(std::experimental::all_of(_aa == _bb), __FILE__, __LINE__, \ + __PRETTY_FUNCTION__, #_a " == " #_b, #_a " = ", _aa, \ "\n" #_b " = ", _bb);\ }((_a), (_b)) #endif #define VERIFY(_test) \ - verify(_test, verify::get_ip(), __FILE__, __LINE__, __PRETTY_FUNCTION__, \ -#_test) + verify(_test, __FILE__, __LINE__, __PRETTY_FUNCTION__, #_test) // ulp_distance_signed can raise FP exceptions and thus must be conditionally // executed @@ -245,9 
+245,9 @@ template [&](auto&& _aa, auto&& _bb) { \ const bool success = std::experimental::all_of( \ vir::test::ulp_distance(_aa, _bb
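The inlining idea in this patch can be illustrated in isolation. The sketch below is not the testsuite's verify class; check_sketch and CHECK_SKETCH are hypothetical names. It assumes a compiler that understands [[gnu::always_inline]] (GCC or Clang). The constructor is forced inline so that the `if (m_failed)` branch lands in the calling code, while the failure body sits in a lambda that the compiler is free to keep as a separate call.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical, reduced model of the verify class from the patch.
class check_sketch
{
  const bool m_failed;

public:
  // Force the constructor, and with it the `if (m_failed)` test, into the
  // caller. The reporting code is wrapped in a lambda so that it may stay
  // out of line and the hot path remains a single compare and branch.
  [[gnu::always_inline]]
  check_sketch(bool ok, const char* file, int line, const char* cond)
  : m_failed(!ok)
  {
    if (m_failed)
      [&] {
        std::fprintf(stderr, "%s:%d: Assertion '%s' failed.\n",
                     file, line, cond);
      }();
  }

  // Accessor added for this sketch only, so the result can be inspected.
  bool
  failed() const
  { return m_failed; }
};

#define CHECK_SKETCH(cond) check_sketch((cond), __FILE__, __LINE__, #cond)
```

Whether the lambda body really ends up as a call depends on the optimizer; the sketch only shows the source-level structure.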
[PATCH 12/16] Support timeout and timeout-factor options
From: Matthias Kretz libstdc++-v3/ChangeLog: * testsuite/experimental/simd/driver.sh: Abstract reading test options into read_src_option function. Read skip, only, expensive, and xfail via read_src_option. Add timeout and timeout-factor options and adjust timeout variable accordingly. * testsuite/experimental/simd/tests/loadstore.cc: Set timeout-factor 2. --- .../testsuite/experimental/simd/driver.sh | 38 +-- .../experimental/simd/tests/loadstore.cc | 1 + 2 files changed, 27 insertions(+), 12 deletions(-) diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++- v3/testsuite/experimental/simd/driver.sh index 719e4db8e68..71e0c7d5ee8 100755 --- a/libstdc++-v3/testsuite/experimental/simd/driver.sh +++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh @@ -214,35 +214,43 @@ trap "rm -f '$log' '$sum' $exe; exit" INT rm -f "$log" "$sum" touch "$log" "$sum" -skip="$(head -n25 "$src" | grep '^//\s*skip: ')" -if [ -n "$skip" ]; then - skip="$(echo "$skip" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')" +read_src_option() { + local key tmp var + key="$1" + var="$2" + [ -z "$var" ] && var="$1" + local tmp="$(head -n25 "$src" | grep "^//\\s*${key}: ")" + if [ -n "$tmp" ]; then +tmp="$(echo "${tmp#//*${key}: }" | sed -e 's/ \+/ /g' -e 's/^ //' -e 's/ $//')" +eval "$var=\"$tmp\"" + else +return 1 + fi +} + +if read_src_option skip; then if test_selector "$skip"; then # silently skip this test exit 0 fi fi -only="$(head -n25 "$src" | grep '^//\s*only: ')" -if [ -n "$only" ]; then - only="$(echo "$only" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')" +if read_src_option only; then if ! test_selector "$only"; then # silently skip this test exit 0 fi fi + if ! 
$run_expensive; then - expensive="$(head -n25 "$src" | grep '^//\s*expensive: ')" - if [ -n "$expensive" ]; then -expensive="$(echo "$expensive" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')" + if read_src_option expensive; then if test_selector "$expensive"; then unsupported "skip expensive tests" exit 0 fi fi fi -xfail="$(head -n25 "$src" | grep '^//\s*xfail: ')" -if [ -n "$xfail" ]; then - xfail="$(echo "$xfail" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')" + +if read_src_option xfail; then if test_selector "${xfail#* }"; then xfail="${xfail%% *}" else @@ -250,6 +258,12 @@ if [ -n "$xfail" ]; then fi fi +read_src_option timeout + +if read_src_option timeout-factor factor; then + timeout=$(awk "BEGIN { print int($timeout * $factor) }") +fi + log_output() { if $verbose; then maxcol=${1:-1024} diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc b/ libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc index dd7d6c30e8c..cd27c3a7426 100644 --- a/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc +++ b/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc @@ -16,6 +16,7 @@ // <http://www.gnu.org/licenses/>. // expensive: * [1-9] * * +// timeout-factor: 2 #include "bits/verify.h" #include "bits/make_vec.h" #include "bits/conversions.h" -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
[PATCH 11/16] Abort test after 1000 lines of output
From: Matthias Kretz Handle overly large output by aborting the log and thus the test. This is a similar condition to a timeout. libstdc++-v3/ChangeLog: * testsuite/experimental/simd/driver.sh: When handling the pipe to log (and on verbose to stdout) count the lines. If it exceeds 1000 log the issue and exit 125, which is then handled as a failure. --- .../testsuite/experimental/simd/driver.sh | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++- v3/testsuite/experimental/simd/driver.sh index 314c6a16f86..719e4db8e68 100755 --- a/libstdc++-v3/testsuite/experimental/simd/driver.sh +++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh @@ -258,7 +258,11 @@ BEGIN { count = 0 } /^###exitstatus### [0-9]+$/ { exit \$2 } { print >> \"$log\" - if (count >= 1000) next + if (count >= 1000) { +print \"Aborting: too much output\" >> \"$log\" +print \"Aborting: too much output\" +exit 125 + } ++count if (length(\$0) > $maxcol) { i = 1 @@ -282,8 +286,17 @@ END { close(\"$log\") } " else awk " +BEGIN { count = 0 } /^###exitstatus### [0-9]+$/ { exit \$2 } -{ print >> \"$log\" } +{ + print >> \"$log\" + if (count >= 1000) { +print \"Aborting: too much output\" >> \"$log\" +print \"Aborting: too much output\" +exit 125 + } + ++count +} END { close(\"$log\") } " fi -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
[PATCH 10/16] Skip testing hypot3 for long double on PPC
From: Matthias Kretz

std::hypot(a, b, c) is imprecise and makes this test fail even though the
failure is unrelated to simd.

libstdc++-v3/ChangeLog:

        * testsuite/experimental/simd/tests/hypot3_fma.cc: Add skip:
        markup for long double on powerpc64*.
---
 libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc
index 689a90c10a5..94d267fccfb 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc
@@ -16,6 +16,7 @@
 // <http://www.gnu.org/licenses/>.
 
 // only: float|double|ldouble * * *
+// skip: ldouble * powerpc64* *
 // expensive: * [1-9] * *
 #include "bits/verify.h"
 #include "bits/metahelpers.h"
--
──────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research  https://gsi.de
 std::experimental::simd                      https://github.com/VcDevel/std-simd
──
[PATCH 09/16] Fix mask reduction of simd_mask on POWER7
From: Matthias Kretz POWER7 does not support __vector long long reductions, making the generic _S_popcount implementation ill-formed. Specializing _S_popcount for PPC allows optimization and avoids the issue. libstdc++-v3/ChangeLog: * include/experimental/bits/simd.h: Add __have_power10vec conditional on _ARCH_PWR10. * include/experimental/bits/simd_builtin.h: Forward declare _MaskImplPpc and use it as _MaskImpl when __ALTIVEC__ is defined. (_MaskImplBuiltin::_S_some_of): Call _S_popcount from the _SuperImpl for optimizations and correctness. * include/experimental/bits/simd_ppc.h: Add _MaskImplPpc. (_MaskImplPpc::_S_popcount): Implement via vec_cntm for POWER10. Otherwise, for >=int use -vec_sums divided by a sizeof factor. For struct _MaskImplX86; template struct _SimdImplNeon; template struct _MaskImplNeon; template struct _SimdImplPpc; +template struct _MaskImplPpc; // simd_abi::_VecBuiltin {{{ template @@ -959,10 +960,11 @@ template using _CommonImpl = _CommonImplBuiltin; #ifdef __ALTIVEC__ using _SimdImpl = _SimdImplPpc<_VecBuiltin<_UsedBytes>>; +using _MaskImpl = _MaskImplPpc<_VecBuiltin<_UsedBytes>>; #else using _SimdImpl = _SimdImplBuiltin<_VecBuiltin<_UsedBytes>>; -#endif using _MaskImpl = _MaskImplBuiltin<_VecBuiltin<_UsedBytes>>; +#endif #endif // }}} @@ -2899,7 +2901,7 @@ template _GLIBCXX_SIMD_INTRINSIC static bool _S_some_of(simd_mask<_Tp, _Abi> __k) { - const int __n_true = _S_popcount(__k); + const int __n_true = _SuperImpl::_S_popcount(__k); return __n_true > 0 && __n_true < int(_S_size<_Tp>); } diff --git a/libstdc++-v3/include/experimental/bits/simd_ppc.h b/libstdc++-v3/ include/experimental/bits/simd_ppc.h index c00d2323ac6..1d649931eb9 100644 --- a/libstdc++-v3/include/experimental/bits/simd_ppc.h +++ b/libstdc++-v3/include/experimental/bits/simd_ppc.h @@ -30,6 +30,7 @@ #ifndef __ALTIVEC__ #error "simd_ppc.h may only be included when AltiVec/VMX is available" #endif +#include _GLIBCXX_SIMD_BEGIN_NAMESPACE @@ -114,10 +115,42 @@ template // }}} 
}; +// }}} +// _MaskImplPpc {{{ +template + struct _MaskImplPpc : _MaskImplBuiltin<_Abi> + { +using _Base = _MaskImplBuiltin<_Abi>; + +// _S_popcount {{{ +template + _GLIBCXX_SIMD_INTRINSIC static int _S_popcount(simd_mask<_Tp, _Abi> __k) + { + const auto __kv = __as_vector(__k); + if constexpr (__have_power10vec) + { + return vec_cntm(__to_intrin(__kv), 1); + } + else if constexpr (sizeof(_Tp) >= sizeof(int)) + { + using _Intrin = __intrinsic_type16_t; + const int __sum = -vec_sums(__intrin_bitcast<_Intrin>(__kv), _Intrin())[3]; + return __sum / (sizeof(_Tp) / sizeof(int)); + } + else + { + const auto __summed_to_int = vec_sum4s(__to_intrin(__kv), __intrinsic_type16_t()); + return -vec_sums(__summed_to_int, __intrinsic_type16_t())[3]; + } + } + +// }}} + }; + // }}} _GLIBCXX_SIMD_END_NAMESPACE #endif // __cplusplus >= 201703L #endif // _GLIBCXX_EXPERIMENTAL_SIMD_PPC_H_ -// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80 +// vim: foldmethod=marker foldmarker={{{,}}} sw=2 noet ts=8 sts=2 tw=100 -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
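The arithmetic behind the non-POWER10 branch for element types of at least int size can be checked without AltiVec. The following is a portable sketch under stated assumptions: a mask lane is either 0 or all bits set (-1), the mask is viewed as 32-bit words, the words are summed, and the negated sum is divided by the number of words per lane. popcount_via_sum is a hypothetical name, and the plain loop merely stands in for the vec_sums intrinsic used in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model of a vector mask: every lane is either 0 (false) or
// -1, i.e. all bits set (true).
template <typename Lane, std::size_t N>
  int
  popcount_via_sum(const std::array<Lane, N>& mask)
  {
    static_assert(sizeof(Lane) >= sizeof(std::int32_t),
                  "this sketch covers only lanes of at least int size");
    constexpr std::size_t words = N * sizeof(Lane) / sizeof(std::int32_t);

    // Reinterpret the mask as 32-bit words. A true lane of 8 bytes becomes
    // two words that are each -1.
    std::int32_t as_int[words];
    std::memcpy(as_int, mask.data(), sizeof(as_int));

    // Stand-in for the horizontal sum: each true word contributes -1.
    int sum = 0;
    for (std::int32_t w : as_int)
      sum += w;

    // Negate and correct for lanes that span more than one word.
    constexpr int words_per_lane = sizeof(Lane) / sizeof(std::int32_t);
    return -sum / words_per_lane;
  }
```

For a two-lane 64-bit mask with one true lane, the four words sum to -2, and dividing the negated sum by 2 gives a count of 1.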
[PATCH 08/16] Immediate feedback with -v
From: Matthias Kretz libstdc++-v3/ChangeLog: * testsuite/experimental/simd/driver.sh: Remove executable on SIGINT. Process compiler and test executable output: In verbose mode print messages immediately, limited to 1000 lines and breaking long lines to below $COLUMNS (or 1024 if not set). Communicating the exit status of the compiler / test with the necessary pipe is done via a message through stdout/-in. --- .../testsuite/experimental/simd/driver.sh | 194 +++--- 1 file changed, 116 insertions(+), 78 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh index cf07ff9ad85..314c6a16f86 100755 --- a/libstdc++-v3/testsuite/experimental/simd/driver.sh +++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh @@ -172,81 +172,14 @@ unsupported() { echo "UNSUPPORTED: $src $type $abiflag ($*)" >> "$log" } -verify_compilation() { - failed=$1 - if [ $failed -eq 0 ]; then -warnings=$(grep -ic 'warning:' "$log") -if [ $warnings -gt 0 ]; then - fail "excess warnings:" $warnings - if $verbose; then -cat "$log" - elif ! $quiet; then -grep -i 'warning:' "$log" | head -n5 - fi -elif [ "$xfail" = "compile" ]; then - xpass "test for excess errors" -else - pass "test for excess errors" -fi - else -if [ $failed -eq 124 ]; then - fail "timeout: test for excess errors" -else - errors=$(grep -ic 'error:' "$log") - if [ "$xfail" = "compile" ]; then -xfail "excess errors:" $errors -exit 0 - else -fail "excess errors:" $errors - fi -fi -if $verbose; then - cat "$log" -elif ! 
$quiet; then - grep -i 'error:' "$log" | head -n5 -fi -exit 0 - fi -} - -verify_test() { - failed=$1 - if [ $failed -eq 0 ]; then -rm "$exe" -if [ "$xfail" = "run" ]; then - xpass "execution test" -else - pass "execution test" -fi - else -$keep_failed || rm "$exe" -if [ $failed -eq 124 ]; then - fail "timeout: execution test" -elif [ "$xfail" = "run" ]; then - xfail "execution test" -else - fail "execution test" -fi -if $verbose; then - lines=$(wc -l < "$log") - lines=$((lines-3)) - if [ $lines -gt 1000 ]; then -echo "[...]" -tail -n1000 "$log" - else -tail -n$lines "$log" - fi -elif ! $quiet; then - grep -i fail "$log" | head -n5 -fi -exit 0 - fi -} - write_log_and_verbose() { echo "$*" >> "$log" if $verbose; then -echo "$*" +if [ -z "$COLUMNS" ] || ! type fmt>/dev/null; then + echo "$*" +else + echo "$*" | fmt -w $COLUMNS -s - || cat +fi fi } @@ -277,7 +210,7 @@ test_selector() { return 1 } -trap "rm -f '$log' '$sum'; exit" INT +trap "rm -f '$log' '$sum' $exe; exit" INT rm -f "$log" "$sum" touch "$log" "$sum" @@ -317,17 +250,122 @@ if [ -n "$xfail" ]; then fi fi +log_output() { + if $verbose; then +maxcol=${1:-1024} +awk " +BEGIN { count = 0 } +/^###exitstatus### [0-9]+$/ { exit \$2 } +{ + print >> \"$log\" + if (count >= 1000) next + ++count + if (length(\$0) > $maxcol) { +i = 1 +while (i + $maxcol <= length(\$0)) { + len = $maxcol + line = substr(\$0, i, len) + len = match(line, / [^ ]*$/) + if (len <= 0) { +len = match(substr(\$0, i), / [^ ]/) +if (len <= 0) len = $maxcol + } + print substr(\$0, i, len) + i += len +} +print substr(\$0, i) + } else { +print + } +} +END { close(\"$log\") } +" + else +awk " +/^###exitstatus### [0-9]+$/ { exit \$2 } +{ print >> \"$log\" } +END { close(\"$log\") } +" + fi +} + +verify_compilation() { + log_output $COLUMNS + exitstatus=$? + if [ $exitstatus -eq 0 ]; then +warnings=$(grep -ic 'warning:' "$log") +if [ $warnings -gt 0 ]; then + fail "excess warnings:" $warnings + if
[PATCH 07/16] Fix incorrect display of old test summaries
From: Matthias Kretz

libstdc++-v3/ChangeLog:

        * testsuite/Makefile.am: Ensure .simd.summary is empty before
        collecting a new summary.
        * testsuite/Makefile.in: Regenerate.
---
 libstdc++-v3/testsuite/Makefile.am | 1 +
 libstdc++-v3/testsuite/Makefile.in | 1 +
 2 files changed, 2 insertions(+)

diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index 5dd109b40c9..2d3ad481dba 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -191,6 +191,7 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
                 ${glibcxx_srcdir}/scripts/check_simd \
                 testsuite_files_simd \
                 ${glibcxx_builddir}/scripts/testsuite_flags
+        @rm -f .simd.summary
         ${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
           while read subdir; do \
             $(MAKE) -C "$${subdir}"; \
diff --git a/libstdc++-v3/testsuite/Makefile.in b/libstdc++-v3/testsuite/Makefile.in
index 3900d6d87b4..ac6207ae75c 100644
--- a/libstdc++-v3/testsuite/Makefile.in
+++ b/libstdc++-v3/testsuite/Makefile.in
@@ -716,6 +716,7 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
                 ${glibcxx_srcdir}/scripts/check_simd \
                 testsuite_files_simd \
                 ${glibcxx_builddir}/scripts/testsuite_flags
+        @rm -f .simd.summary
         ${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
           while read subdir; do \
             $(MAKE) -C "$${subdir}"; \
--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research  https://gsi.de
 std::experimental::simd                      https://github.com/VcDevel/std-simd
──
[PATCH 05/16] Fix several check-simd interaction issues
From: Matthias Kretz

libstdc++-v3/ChangeLog:

        * testsuite/experimental/simd/driver.sh (verify_test): Print test
        output on run xfail. Do not repeat lines from the log that were
        already printed on stdout.
        (test_selector): Make the compiler flags pattern usable as a
        substring selector.
        (toplevel): Trap on SIGINT and remove the log and sum files. Call
        timeout with --foreground to quickly terminate on SIGINT.
        * testsuite/experimental/simd/generate_makefile.sh: Simplify run
        targets via target patterns. Default DRIVEROPTS to -v for run
        targets. Remove log and sum files after completion of the run
        target (so that it's always recompiled). Place help text into text
        file for reasonable 'make help' performance.
---
 .../testsuite/experimental/simd/driver.sh     | 16 +++--
 .../experimental/simd/generate_makefile.sh    | 70 +--
 2 files changed, 44 insertions(+), 42 deletions(-)

--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research  https://gsi.de
 std::experimental::simd                      https://github.com/VcDevel/std-simd
──

diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index 84f3829c2d4..cf07ff9ad85 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -224,16 +224,17 @@ verify_test() {
       fail "timeout: execution test"
     elif [ "$xfail" = "run" ]; then
       xfail "execution test"
-      exit 0
     else
       fail "execution test"
     fi
     if $verbose; then
-      if [ $(cat "$log"|wc -l) -gt 1000 ]; then
+      lines=$(wc -l < "$log")
+      lines=$((lines-3))
+      if [ $lines -gt 1000 ]; then
         echo "[...]"
         tail -n1000 "$log"
       else
-        cat "$log"
+        tail -n$lines "$log"
       fi
     elif ! 
$quiet; then grep -i fail "$log" | head -n5 @@ -267,7 +268,7 @@ test_selector() { [ -z "$target_triplet" ] && target_triplet=$($CXX -dumpmachine) if matches "$target_triplet" "$pat_triplet"; then pat_flags="${string#* }" -if matches "$CXXFLAGS" "$pat_flags"; then +if matches "$CXXFLAGS" "*$pat_flags*"; then return 0 fi fi @@ -276,6 +277,7 @@ test_selector() { return 1 } +trap "rm -f '$log' '$sum'; exit" INT rm -f "$log" "$sum" touch "$log" "$sum" @@ -316,15 +318,15 @@ if [ -n "$xfail" ]; then fi write_log_and_verbose "$CXX $src $@ -D_GLIBCXX_SIMD_TESTTYPE=$type $abiflag -o $exe" -timeout $timeout "$CXX" "$src" "$@" "-D_GLIBCXX_SIMD_TESTTYPE=$type" $abiflag -o "$exe" >> "$log" 2>&1 +timeout --foreground $timeout "$CXX" "$src" "$@" "-D_GLIBCXX_SIMD_TESTTYPE=$type" $abiflag -o "$exe" >> "$log" 2>&1 verify_compilation $? if [ -n "$sim" ]; then write_log_and_verbose "$sim ./$exe" - timeout $timeout $sim "./$exe" >> "$log" 2>&1 <&- + timeout --foreground $timeout $sim "./$exe" >> "$log" 2>&1 <&- else write_log_and_verbose "./$exe" timeout=$(awk "BEGIN { print int($timeout / 2) }") - timeout $timeout "./$exe" >> "$log" 2>&1 <&- + timeout --foreground $timeout "./$exe" >> "$log" 2>&1 <&- fi verify_test $? 
diff --git a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh index 553bc98f60b..8d642a2941a 100755 --- a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh +++ b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh @@ -240,7 +240,7 @@ EOF %-$type.log: %-$type-0.log %-$type-1.log %-$type-2.log %-$type-3.log \ %-$type-4.log %-$type-5.log %-$type-6.log %-$type-7.log \ %-$type-8.log %-$type-9.log - @cat $^ > \$@ + @cat \$^ > \$@ @cat \$(^:log=sum) > \$(@:log=sum)${rmline} EOF @@ -252,47 +252,47 @@ EOF EOF done done - echo 'run-%: export GCC_TEST_RUN_EXPENSIVE=yes' - all_tests | while read file && read name; do -echo "run-$name: $name.log" -all_types "$file" | while read t && read type; do - echo "run-$name-$type:
[PATCH 04/16] Fix simd_mask on POWER w/o POWER8
From: Matthias Kretz

libstdc++-v3/ChangeLog:

        * include/experimental/bits/simd.h: Remove unnecessary static
        assertion. Allow sizeof(8) integer __intrinsic_type to enable
        the necessary mask type.
---
 libstdc++-v3/include/experimental/bits/simd.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 64cf8d32328..9685df0be9e 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -2292,12 +2292,6 @@ template
 #ifndef __VSX__
     static_assert(!is_same_v<_Tp, double>,
                   "no __intrinsic_type support for double on PPC w/o VSX");
-#endif
-#ifndef __POWER8_VECTOR__
-    static_assert(
-      !(is_integral_v<_Tp> && sizeof(_Tp) > 4),
-      "no __intrinsic_type support for integers larger than 4 Bytes "
-      "on PPC w/o POWER8 vectors");
 #endif
     using type =
       typename __intrinsic_type_impl<
--
──
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research  https://gsi.de
 std::experimental::simd                      https://github.com/VcDevel/std-simd
──
[PATCH 06/16] Fix DRIVEROPTS and TESTFLAGS processing
From: Matthias Kretz libstdc++-v3/ChangeLog: * testsuite/experimental/simd/generate_makefile.sh: Use different variables internally than documented for user overrides. This makes internal append/prepend work as intended. --- .../testsuite/experimental/simd/generate_makefile.sh | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh b/ libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh index 8d642a2941a..4fb710c7767 100755 --- a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh +++ b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh @@ -85,19 +85,20 @@ CXX="$1" shift echo "TESTFLAGS ?=" > "$dst" -[ -n "$testflags" ] && echo "TESTFLAGS := $testflags \$(TESTFLAGS)" >> "$dst" -echo CXXFLAGS = "$@" "\$(TESTFLAGS)" >> "$dst" +echo "test_flags := $testflags \$(TESTFLAGS)" >> "$dst" +echo CXXFLAGS = "$@" "\$(test_flags)" >> "$dst" [ -n "$sim" ] && echo "export GCC_TEST_SIMULATOR = $sim" >> "$dst" cat >> "$dst" <https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
[PATCH 03/16] Support -mlong-double-64 on PPC
From: Matthias Kretz libstdc++-v3/ChangeLog: * include/experimental/bits/simd.h: Let __intrinsic_type be valid if sizeof(long double) == sizeof(double) and use a __vector double as member type. --- libstdc++-v3/include/experimental/bits/simd.h | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/ include/experimental/bits/simd.h index d56176210df..64cf8d32328 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -2285,7 +2285,9 @@ template struct __intrinsic_type<_Tp, _Bytes, enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>> { -static_assert(!is_same_v<_Tp, long double>, +static constexpr bool _S_is_ldouble = is_same_v<_Tp, long double>; +// allow _Tp == long double with -mlong-double-64 +static_assert(!(_S_is_ldouble && sizeof(long double) > sizeof(double)), "no __intrinsic_type support for long double on PPC"); #ifndef __VSX__ static_assert(!is_same_v<_Tp, double>, @@ -2297,8 +2299,11 @@ template "no __intrinsic_type support for integers larger than 4 Bytes " "on PPC w/o POWER8 vectors"); #endif -using type = typename __intrinsic_type_impl, _Tp, __int_for_sizeof_t<_Tp>>>::type; +using type = + typename __intrinsic_type_impl< +conditional_t, + conditional_t<_S_is_ldouble, double, _Tp>, + __int_for_sizeof_t<_Tp>>>::type; }; #endif // __ALTIVEC__ -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
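The element-type selection described in the ChangeLog entry can be sketched as a small standalone trait. This is an illustration only, not the library's __intrinsic_type: element_type_sketch and int_of_size are hypothetical names, and the exact shape of the nested conditional is an assumption, since the diff text above is incomplete at that point. Floating-point types map to themselves, except that long double maps to double when both have the same size; every other type maps to a signed integer of the same size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: signed integer type with the given size in bytes.
template <std::size_t Bytes> struct int_of_size;
template <> struct int_of_size<1> { using type = std::int8_t; };
template <> struct int_of_size<2> { using type = std::int16_t; };
template <> struct int_of_size<4> { using type = std::int32_t; };
template <> struct int_of_size<8> { using type = std::int64_t; };

// Hypothetical sketch of the element-type choice.
template <typename T>
  struct element_type_sketch
  {
    static constexpr bool is_ldouble = std::is_same<T, long double>::value;

    // long double is accepted only if it is no wider than double, which is
    // the situation -mlong-double-64 creates.
    static_assert(!(is_ldouble && sizeof(long double) > sizeof(double)),
                  "no support for long double wider than double");

    using type = typename std::conditional<
      std::is_floating_point<T>::value,
      typename std::conditional<is_ldouble, double, T>::type,
      typename int_of_size<sizeof(T)>::type>::type;
  };
```

On a target where long double is wider than double, instantiating element_type_sketch<long double> trips the static assertion, mirroring the behavior the patch keeps for the default PPC long double.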
[PATCH 02/16] Fix NEON intrinsic types usage
From: Matthias Kretz Intrinsics types for NEON differ from gnu::vector_size types now. This requires explicit specializations for __intrinsic_type and a new __is_intrinsic_type trait. libstdc++-v3/ChangeLog: * include/experimental/bits/simd.h (__is_intrinsic_type): New internal type trait. Alias for __is_vector_type on x86. (_VectorTraitsImpl): Enable for __intrinsic_type in addition for __vector_type. (__intrin_bitcast): Allow casting to & from vector & intrinsic types. (__intrinsic_type): Explicitly specialize for NEON intrinsic vector types. --- libstdc++-v3/include/experimental/bits/simd.h | 70 +-- 1 file changed, 66 insertions(+), 4 deletions(-) diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/ include/experimental/bits/simd.h index 00eec50d64f..d56176210df 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -1379,13 +1379,35 @@ template template inline constexpr bool __is_vector_type_v = __is_vector_type<_Tp>::value; +// }}} +// __is_intrinsic_type {{{ +#if _GLIBCXX_SIMD_HAVE_SSE_ABI +template + using __is_intrinsic_type = __is_vector_type<_Tp>; +#else // not SSE (x86) +template > + struct __is_intrinsic_type : false_type {}; + +template + struct __is_intrinsic_type< +_Tp, void_t()[0])>, sizeof(_Tp)>::type>> +: is_same<_Tp, typename __intrinsic_type< +remove_reference_t()[0])>, +sizeof(_Tp)>::type> {}; +#endif + +template + inline constexpr bool __is_intrinsic_type_v = __is_intrinsic_type<_Tp>::value; + // }}} // _VectorTraits{{{ template > struct _VectorTraitsImpl; template - struct _VectorTraitsImpl<_Tp, enable_if_t<__is_vector_type_v<_Tp>>> + struct _VectorTraitsImpl<_Tp, enable_if_t<__is_vector_type_v<_Tp> + || __is_intrinsic_type_v<_Tp>>> { using type = _Tp; using value_type = remove_reference_t()[0])>; @@ -1457,7 +1479,8 @@ template _GLIBCXX_SIMD_INTRINSIC constexpr _To __intrin_bitcast(_From __v) { -static_assert(__is_vector_type_v<_From> && __is_vector_type_v<_To>); 
+static_assert((__is_vector_type_v<_From> || __is_intrinsic_type_v<_From>) + && (__is_vector_type_v<_To> || __is_intrinsic_type_v<_To>)); if constexpr (sizeof(_To) == sizeof(_From)) return reinterpret_cast<_To>(__v); else if constexpr (sizeof(_From) > sizeof(_To)) @@ -2183,16 +2206,55 @@ template #endif // _GLIBCXX_SIMD_HAVE_SSE_ABI // __intrinsic_type (ARM){{{ #if _GLIBCXX_SIMD_HAVE_NEON +template <> + struct __intrinsic_type + { using type = float32x2_t; }; + +template <> + struct __intrinsic_type + { using type = float32x4_t; }; + +#if _GLIBCXX_SIMD_HAVE_NEON_A64 +template <> + struct __intrinsic_type + { using type = float64x1_t; }; + +template <> + struct __intrinsic_type + { using type = float64x2_t; }; +#endif + +#define _GLIBCXX_SIMD_ARM_INTRIN(_Bits, _Np) \ +template <> \ + struct __intrinsic_type<__int_with_sizeof_t<_Bits / 8>, \ + _Np * _Bits / 8, void> \ + { using type = int##_Bits##x##_Np##_t; }; \ +template <> \ + struct __intrinsic_type>, \ + _Np * _Bits / 8, void> \ + { using type = uint##_Bits##x##_Np##_t; } +_GLIBCXX_SIMD_ARM_INTRIN(8, 8); +_GLIBCXX_SIMD_ARM_INTRIN(8, 16); +_GLIBCXX_SIMD_ARM_INTRIN(16, 4); +_GLIBCXX_SIMD_ARM_INTRIN(16, 8); +_GLIBCXX_SIMD_ARM_INTRIN(32, 2); +_GLIBCXX_SIMD_ARM_INTRIN(32, 4); +_GLIBCXX_SIMD_ARM_INTRIN(64, 1); +_GLIBCXX_SIMD_ARM_INTRIN(64, 2); +#undef _GLIBCXX_SIMD_ARM_INTRIN + template struct __intrinsic_type<_Tp, _Bytes, enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>> { -static constexpr int _S_VBytes = _Bytes <= 8 ? 8 : 16; +static constexpr int _SVecBytes = _Bytes <= 8 ? 8 : 16; using _Ip = __int_for_sizeof_t<_Tp>; using _Up = conditional_t< is_floating_point_v<_Tp>, _Tp, conditional_t, make_unsigned_t<_Ip>, _Ip>>; -using type [[__gnu__::__vector_size__(_S_VBytes)]] = _Up; +static_assert(!is_same_v<_Tp, _Up> || _SVecBytes != _Bytes, + "should use explicit specialization above"); +using type = typename __intrinsic_type<_Up, _SVecBytes>::type; }; #endif // _GLIBCXX_SIMD_H
[PATCH 01/16] Support skip, only, expensive, and xfail markers
From: Matthias Kretz libstdc++-v3/ChangeLog: * testsuite/experimental/simd/driver.sh: Implement skip, only, expensive, and xfail markers. They can select on type, ABI tag subset number, target-triplet, and compiler flags. * testsuite/experimental/simd/generate_makefile.sh: The summary now includes lines for unexpected passes and expected failures. If the skip or only markers are only conditional on the type, do not generate rules for those types. * testsuite/experimental/simd/tests/abs.cc: Mark test expensive for ABI tag subsets 1-9. * testsuite/experimental/simd/tests/algorithms.cc: Ditto. * testsuite/experimental/simd/tests/broadcast.cc: Ditto. * testsuite/experimental/simd/tests/casts.cc: Ditto. * testsuite/experimental/simd/tests/generator.cc: Ditto. * testsuite/experimental/simd/tests/integer_operators.cc: Ditto. * testsuite/experimental/simd/tests/loadstore.cc: Ditto. * testsuite/experimental/simd/tests/mask_broadcast.cc: Ditto. * testsuite/experimental/simd/tests/mask_conversions.cc: Ditto. * testsuite/experimental/simd/tests/mask_implicit_cvt.cc: Ditto. * testsuite/experimental/simd/tests/mask_loadstore.cc: Ditto. * testsuite/experimental/simd/tests/mask_operator_cvt.cc: Ditto. * testsuite/experimental/simd/tests/mask_operators.cc: Ditto. * testsuite/experimental/simd/tests/mask_reductions.cc: Ditto. * testsuite/experimental/simd/tests/operator_cvt.cc: Ditto. * testsuite/experimental/simd/tests/operators.cc: Ditto. * testsuite/experimental/simd/tests/reductions.cc: Ditto. * testsuite/experimental/simd/tests/simd.cc: Ditto. * testsuite/experimental/simd/tests/split_concat.cc: Ditto. * testsuite/experimental/simd/tests/splits.cc: Ditto. * testsuite/experimental/simd/tests/where.cc: Ditto. * testsuite/experimental/simd/tests/fpclassify.cc: Ditto. In addition replace "test only floattypes" marker by unconditional "float|double|ldouble" only marker. * testsuite/experimental/simd/tests/frexp.cc: Ditto. * testsuite/experimental/simd/tests/hypot3_fma.cc: Ditto. 
* testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.cc: Ditto. * testsuite/experimental/simd/tests/logarithm.cc: Ditto. * testsuite/experimental/simd/tests/math_1arg.cc: Ditto. * testsuite/experimental/simd/tests/math_2arg.cc: Ditto. * testsuite/experimental/simd/tests/remqo.cc: Ditto. * testsuite/experimental/simd/tests/trigonometric.cc: Ditto. * testsuite/experimental/simd/tests/trunc_ceil_floor.cc: Ditto. * testsuite/experimental/simd/tests/sincos.cc: Ditto. In addition, xfail on run because the reference data is missing. --- .../testsuite/experimental/simd/driver.sh | 114 +--- .../experimental/simd/generate_makefile.sh| 122 -- .../testsuite/experimental/simd/tests/abs.cc | 1 + .../experimental/simd/tests/algorithms.cc | 1 + .../experimental/simd/tests/broadcast.cc | 1 + .../experimental/simd/tests/casts.cc | 1 + .../experimental/simd/tests/fpclassify.cc | 3 +- .../experimental/simd/tests/frexp.cc | 3 +- .../experimental/simd/tests/generator.cc | 1 + .../experimental/simd/tests/hypot3_fma.cc | 3 +- .../simd/tests/integer_operators.cc | 1 + .../simd/tests/ldexp_scalbn_scalbln_modf.cc | 3 +- .../experimental/simd/tests/loadstore.cc | 1 + .../experimental/simd/tests/logarithm.cc | 3 +- .../experimental/simd/tests/mask_broadcast.cc | 1 + .../simd/tests/mask_conversions.cc| 1 + .../simd/tests/mask_implicit_cvt.cc | 1 + .../experimental/simd/tests/mask_loadstore.cc | 1 + .../simd/tests/mask_operator_cvt.cc | 1 + .../experimental/simd/tests/mask_operators.cc | 1 + .../simd/tests/mask_reductions.cc | 1 + .../experimental/simd/tests/math_1arg.cc | 3 +- .../experimental/simd/tests/math_2arg.cc | 3 +- .../experimental/simd/tests/operator_cvt.cc | 1 + .../experimental/simd/tests/operators.cc | 1 + .../experimental/simd/tests/reductions.cc | 1 + .../experimental/simd/tests/remqo.cc | 3 +- .../testsuite/experimental/simd/tests/simd.cc | 1 + .../experimental/simd/tests/sincos.cc | 4 +- .../experimental/simd/tests/split_concat.cc | 1 + 
.../experimental/simd/tests/splits.cc | 1 + .../experimental/simd/tests/trigonometric.cc | 3 +- .../simd/tests/trunc_ceil_floor.cc| 3 +- .../experimental/simd/tests/where.cc | 1 + 34 files changed, 225 insertions(+), 66 deletions(-) -- ── Dr. Matthias Kretz
[PATCH 00/16] stdx::simd fixes and testsuite improvements
As promised on IRC ... Matthias Kretz (15): Support skip, only, expensive, and xfail markers Fix NEON intrinsic types usage Support -mlong-double-64 on PPC Fix simd_mask on POWER w/o POWER8 Fix several check-simd interaction issues Fix DRIVEROPTS and TESTFLAGS processing Fix incorrect display of old test summaries Immediate feedback with -v Fix mask reduction of simd_mask on POWER7 Skip testing hypot3 for long double on PPC Abort test after 1000 lines of output Support timeout and timeout-factor options Improve test codegen for interpreting assembly Implement hmin and hmax Work around test failures using -mno-tree-vrp yaozhongxiao (1): Improve "find_first/last_set" for NEON libstdc++-v3/include/experimental/bits/simd.h | 170 ++- .../include/experimental/bits/simd_builtin.h | 6 +- .../include/experimental/bits/simd_neon.h | 17 +- .../include/experimental/bits/simd_ppc.h | 35 ++- .../include/experimental/bits/simd_scalar.h | 2 +- libstdc++-v3/testsuite/Makefile.am| 5 +- libstdc++-v3/testsuite/Makefile.in| 5 +- .../testsuite/experimental/simd/driver.sh | 263 ++ .../experimental/simd/generate_makefile.sh| 201 +++-- .../testsuite/experimental/simd/tests/abs.cc | 1 + .../experimental/simd/tests/algorithms.cc | 1 + .../experimental/simd/tests/bits/verify.h | 44 +-- .../experimental/simd/tests/broadcast.cc | 1 + .../experimental/simd/tests/casts.cc | 1 + .../experimental/simd/tests/fpclassify.cc | 3 +- .../experimental/simd/tests/frexp.cc | 3 +- .../experimental/simd/tests/generator.cc | 1 + .../experimental/simd/tests/hypot3_fma.cc | 4 +- .../simd/tests/integer_operators.cc | 1 + .../simd/tests/ldexp_scalbn_scalbln_modf.cc | 3 +- .../experimental/simd/tests/loadstore.cc | 2 + .../experimental/simd/tests/logarithm.cc | 3 +- .../experimental/simd/tests/mask_broadcast.cc | 1 + .../simd/tests/mask_conversions.cc| 1 + .../simd/tests/mask_implicit_cvt.cc | 1 + .../experimental/simd/tests/mask_loadstore.cc | 1 + .../simd/tests/mask_operator_cvt.cc | 1 + 
.../experimental/simd/tests/mask_operators.cc | 1 + .../simd/tests/mask_reductions.cc | 1 + .../experimental/simd/tests/math_1arg.cc | 3 +- .../experimental/simd/tests/math_2arg.cc | 3 +- .../experimental/simd/tests/operator_cvt.cc | 1 + .../experimental/simd/tests/operators.cc | 1 + .../experimental/simd/tests/reductions.cc | 22 ++ .../experimental/simd/tests/remqo.cc | 3 +- .../testsuite/experimental/simd/tests/simd.cc | 1 + .../experimental/simd/tests/sincos.cc | 4 +- .../experimental/simd/tests/split_concat.cc | 1 + .../experimental/simd/tests/splits.cc | 1 + .../experimental/simd/tests/trigonometric.cc | 3 +- .../simd/tests/trunc_ceil_floor.cc| 3 +- .../experimental/simd/tests/where.cc | 1 + 42 files changed, 635 insertions(+), 191 deletions(-) -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
Re: [PATCH] Add simd testsuite
On Donnerstag, 17. Dezember 2020 14:10:51 CET Jonathan Wakely wrote: > On 16/12/20 12:58 +0100, Matthias Kretz wrote: > >+ $srcdir/testsuite/experimental/simd/generate_makefile.sh \ > >+--destination="$testdir/$subdir" $CXX $INCLUDES $CXXFLAGS -static > > Is the -static here to avoid needing LD_LIBRARY_PATH to find > libstdc++.so? > > If you don't have libc.a installed it won't work. How about > using -static-libgcc -static-libstdc++ instead? I need the -static for qemu and simple remote execution (copy binary via scp, execute via ssh). And yes, -static makes it much easier to avoid the LD_LIBRARY_PATH issue. I'll make -static optional, and default to -static-libgcc -static-libstdc++. The latter should still work for most remote execution setups (works for me, at least). > >--- /dev/null > >+++ b/libstdc++-v3/testsuite/experimental/simd/tests/abs.cc > >@@ -0,0 +1,24 @@ > >+#include "bits/verify.h" > >+#include "bits/metahelpers.h" > > We'd usually put these testsuite helper files in testsuite/util, maybe > in a testsuite/util/simd sub-dir, but I suppose keeping them local to > the tests is OK too. At this point the simd testsuite is very close to being usable for other Parallelism TS 2 implementations. That's a feature I'd support if there's interest outside of libstdc++. > >--- /dev/null > >+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/conversions.h > >@@ -0,0 +1,167 @@ > >+#include > >+ > >+// is_conversion_undefined > >+/* implementation-defined > >+ * == > >+ * §4.7 p3 (integral conversions) > > These section signs will cause errors if the testsuite is run with > something like -finput-charset=ascii, but I suppose we can say "don't > do that". We have tests that use that option and include all the > libstdc++ headers, so there should be no need to run the entire > testsuite with that option. So it's OK. Ah, but good point. I have comments in simd_math.h (i.e. the other patch) like: "Fold @p x into [-¼π, ¼π] and [...]". 
These comments are not in the testsuite. I guess I need to replace all non-ASCII chars there? Attached is the diff to the previous patch. In addition to the -static change I added license headers (as noted on IRC) and I improved the Makefile generator: Instead of only passing TESTFLAGS and GCC_TEST_SIMULATOR as environment variables, place the initial values into the generated Makefile. This makes it much easier to work on tests or fixes for failures. I'll top-post the squashed simd patches. -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ── diff --git a/libstdc++-v3/scripts/check_simd b/libstdc++-v3/scripts/check_simd index 2b7a17a64c9..25acf64c841 100755 --- a/libstdc++-v3/scripts/check_simd +++ b/libstdc++-v3/scripts/check_simd @@ -26,7 +26,7 @@ sim=\\\"$sim\\\"\"" if [ -f "$CHECK_SIMD_CONFIG" ]; then . "$CHECK_SIMD_CONFIG" -elif [ -z "$CHECK_SIMD_CONFIG"]; then +elif [ -z "$CHECK_SIMD_CONFIG" ]; then if [ -z "$target_list" ]; then target_list="unix" case "$target_triplet" in @@ -69,8 +69,7 @@ while [ ${#list} -gt 0 ]; do subdir="simd/$(echo "$flags" | sed 's#[= /-]##g')" rm -f "${subdir}/Makefile" $srcdir/testsuite/experimental/simd/generate_makefile.sh \ ---destination="$testdir/$subdir" $CXX $INCLUDES $CXXFLAGS -static - echo "$subdir -$flags -$sim" +--destination="$testdir/$subdir" --sim="$sim" --testflags="$flags" \ +$CXX $INCLUDES $CXXFLAGS -static-libgcc -static-libstdc++ + echo "$subdir" done diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am index d2e282b62b9..fa9cc4753f3 100644 --- a/libstdc++-v3/testsuite/Makefile.am +++ b/libstdc++-v3/testsuite/Makefile.am @@ -192,9 +192,10 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \ testsuite_files_simd \ ${glibcxx_builddir}/scripts/testsuite_flags ${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" 
| \ - while read subdir && read flags && read sim; do \ - $(MAKE) -C "$${subdir}" TESTFLAGS="$${flags}" GCC_TEST_SIMULATOR="$${sim}"; \ - tail -n6 $${subdir}/simd_testsuite.sum >> .simd.summary; \ + while read subdir; do \ + $(M
Re: [PATCH] std::experimental::simd
On Donnerstag, 12. November 2020 00:43:31 CET Jonathan Wakely wrote: > On 08/05/20 21:03 +0200, Matthias Kretz wrote: > >Here's my last update to the std::experimental::simd patch. It's currently > >based on the gcc-10 branch. > > > > > >+ > >+// __next_power_of_2{{{ > >+/** > >+ * \internal > > We use @foo for Doxygen comments rather than \foo Done. > >+ * Returns the next power of 2 larger than or equal to \p __x. > >+ */ > >+constexpr std::size_t > >+__next_power_of_2(std::size_t __x) > >+{ > >+ return (__x & (__x - 1)) == 0 ? __x > >+: __next_power_of_2((__x | (__x >> 1)) + 1); > >+} > > Can this be replaced with std::__bit_ceil ? > > std::bit_ceil is C++20, but we provide __private versions of > everything in <bit> for C++14 and up. Ah good. I'll delete some code. > >+// vvv type traits vvv > >+// integer type aliases{{{ > >+using _UChar = unsigned char; > >+using _SChar = signed char; > >+using _UShort = unsigned short; > >+using _UInt = unsigned int; > >+using _ULong = unsigned long; > >+using _ULLong = unsigned long long; > >+using _LLong = long long; > > I have a suspicion some of these might clash with libc macros on some > OS somewhere, but we can cross that bridge when we come to it. I need those to help cut the code down to 80 cols. ;-) > >+// __make_dependent_t {{{ > >+template <typename, typename _Up> struct __make_dependent > >+{ > >+ using type = _Up; > >+}; > >+template <typename _Tp, typename _Up> > >+using __make_dependent_t = typename __make_dependent<_Tp, _Up>::type; > > Do you need a distinct class template for this, or can > __make_dependent_t be an alias to __type_identity<_Up>::type or > something else that already exists? With GCC it would suffice to use __type_identity<_Up>::type here. But Clang rejects it. Clang sees that the first template argument is not used in the definition of the alias and thus doesn't make _Up a dependent type. 
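The point about dependent types can be illustrated with a minimal sketch. The names below (make_dependent, make_dependent_t, Widget, widget_size) are hypothetical stand-ins, not the identifiers from the patch, and the example only shows the general technique: routing a type through a class template that also takes the enclosing template's parameter defers checks on that type until instantiation.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical, non-reserved names; the patch itself uses
// __make_dependent and __make_dependent_t.
template <typename Tp, typename Up>
  struct make_dependent
  { using type = Up; };

// The alias goes through a member of a class template specialization that
// depends on Tp, so the resulting type is dependent whenever Tp is.
template <typename Tp, typename Up>
  using make_dependent_t = typename make_dependent<Tp, Up>::type;

struct Widget; // still incomplete at this point

template <typename T>
  constexpr std::size_t
  widget_size()
  {
    // Because the operand is a dependent type, sizeof is only evaluated
    // when widget_size<T> is instantiated, by which time Widget is complete.
    return sizeof(make_dependent_t<T, Widget>);
  }

struct Widget { char data[8]; };

static_assert(std::is_same<make_dependent_t<int, long>, long>::value,
	      "the alias always yields its second argument");
static_assert(widget_size<int>() == 8,
	      "Widget is complete when the template is instantiated");
```

With a plain sizeof(Widget) inside widget_size, the operand would be non-dependent and a compiler may reject the template at its definition, where Widget is still incomplete.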
> >+// __call_with_n_evaluations{{{ > >+template > >+_GLIBCXX_SIMD_INTRINSIC constexpr auto > >+__call_with_n_evaluations(std::index_sequence<_I...>, _F0&& __f0, > >+ _FArgs&& __fargs) > > I'm not sure if it matters here, but old versions of G++ passed empty > types (like index_sequence) using the wrong ABI. Passing them as the > last argument makes it a non-issue. If they're not the last argument, > you get incompatible code when compiling with -fabi-version=7 or > lower. These are all [[gnu::always_inline]]. So it shouldn't matter. > >+// __is_narrowing_conversion<_From, _To>{{{ > >+template <typename _From, typename _To, bool = > >std::is_arithmetic<_From>::value, +bool = > >std::is_arithmetic<_To>::value> > > These could use is_arithmetic_v. Right. That was me trying to work around a clang-format bug. Will fix. I'm in the process of ditching clang-format anyway. > >+{ > >+}; > >+ > >+template > >+struct __is_narrowing_conversion : public true_type > > This looks odd, bool to arithmetic type T is narrowing? > I assume there's a reason for it, so maybe a comment explaining it > would help. Odd indeed. Either I wanted to take a shortcut to implement: "From is a vectorizable type and every possible value of From can be represented with type value_type, or [...]". Or I wanted to swap bool and _Tp here and say that anything other than bool converting to bool is narrowing. I should clean this up. > > >+// _BitOps {{{ > >+struct _BitOps > > [...] > std::__popcount in <bit> > > [...] > std::__countl_zero in <bit> Yes. I'll clean up all of _BitOps. > >+template > > We generally avoid single letter names, although _V isn't in the list > of BADNAMES in the manual, so maybe this one's OK. > > >+template , > >+ typename _R > > Same for _R, it's not listed as a BADNAME. I believe I checked the list. ;-) > >+ > >+template > >+_GLIBCXX_SIMD_INTRINSIC constexpr _Tp > >+__and(_Tp __a, _Tp __b) noexcept > > Calls to __and are done unqualified. Are they only with types that > won't cause ADL to look outside namespace std? 
> > Even though __and is a reserved name, avoiding ADL has other benefits. They are called with either integers, [[gnu::vector_size(N)]] types, or std::experimental::parallelism_v2::_SimdWrapper. I request a column limit relaxation to at least 100 cols if I should qualify all of them with std::experimental:: ;-) > That's all for now ... not very far through the huge patch though. > Generally this looks very good. The things mentioned above are > stylistic or just remove some redundancy, they're not critical. Thanks. I'll post a new patch ASAP. My tests are running. -Matthias -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
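The ADL question raised in this exchange can be sketched as follows. Everything here is made up for illustration (namespaces lib and user, the functions pick, call_unqualified and call_qualified); it is not code from the patch. It shows why an unqualified call inside a template can pick up functions from the argument's namespace, while built-in types such as integers have no associated namespaces and are unaffected.

```cpp
namespace lib
{
  template <typename T>
    constexpr int
    pick(T, T)
    { return 1; }

  // Unqualified call: argument-dependent lookup also searches the
  // namespaces associated with T at the point of instantiation.
  template <typename T>
    constexpr int
    call_unqualified(T a, T b)
    { return pick(a, b); }

  // Qualified call: only lib::pick is considered, no ADL.
  template <typename T>
    constexpr int
    call_qualified(T a, T b)
    { return lib::pick(a, b); }
}

namespace user
{
  struct Num {};

  // A non-template overload in the argument's namespace. It is found via
  // ADL and preferred over the function template lib::pick.
  constexpr int
  pick(Num, Num)
  { return 2; }
}

static_assert(lib::call_unqualified(user::Num{}, user::Num{}) == 2,
	      "ADL finds user::pick");
static_assert(lib::call_qualified(user::Num{}, user::Num{}) == 1,
	      "qualification suppresses ADL");
static_assert(lib::call_unqualified(1, 2) == 1,
	      "int has no associated namespaces");
```

This matches the reply above in spirit: calls made only with integers or vector builtins cannot be hijacked this way, whereas class types from other namespaces could be.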
Re: [PATCH] Let numeric_limits::is_iec559 reflect -ffast-math
On Freitag, 22. Mai 2020 18:39:42 CEST Jonathan Wakely wrote: > On 22/05/20 09:49 +0200, Matthias Kretz wrote: > >On Donnerstag, 21. Mai 2020 17:46:01 CEST Marc Glisse wrote: > >> On Thu, 21 May 2020, Jonathan Wakely wrote: > >> > On 27/04/20 17:09 +0200, Matthias Kretz wrote: > >> >> From: Matthias Kretz > >> >> > >> >>PR libstdc++/84949 > >> >>* include/std/limits: Let is_iec559 reflect whether > >> >>__GCC_IEC_559 says float and double support IEEE 754-2008. > >> >>* testsuite/18_support/numeric_limits/is_iec559.cc: Test IEC559 > >> >>mandated behavior if is_iec559 is true. > >> >>* testsuite/18_support/numeric_limits/infinity.cc: Only test > >> >>inf > >> >>behavior if is_iec559 is true, otherwise there is no guarantee > >> >>how arithmetic on inf behaves. > >> >>* testsuite/18_support/numeric_limits/quiet_NaN.cc: ditto for > >> >>NaN. > >> >>* testsuite/18_support/numeric_limits/denorm_min-1.cc: Compile > >> >>with -ffast-math. > >> >>* testsuite/18_support/numeric_limits/epsilon-1.cc: ditto. > >> >>* testsuite/18_support/numeric_limits/infinity-1.cc: ditto. > >> >>* testsuite/18_support/numeric_limits/is_iec559-1.cc: ditto. > >> >>* testsuite/18_support/numeric_limits/quiet_NaN-1.cc: ditto. > >> > > >> > I'm inclined to go ahead and commit this (to master only, obviously). > >> > It certainly seems more correct to me, and we'll probably never find > >> > out if it's "safe" to do unless we actually change it and see what > >> > happens. > >> > > >> > Marc, do you have an opinion? > >> > >> I don't have a strong opinion on this. I thought we were refraining from > >> changing numeric_limits based on flags (like -fwrapv for modulo) because > >> that would lead to ODR violations when people link objects compiled with > >> different flags. There is a value in libstdc++.so, which may have been > >> compiled with different flags than the application. 
> > >But these ODR violations happen in any case: The floating-point types are > >different types with or without -ffast-math (and related) flags. They > >behave differently. Compiling a function in multiple TUs with different > >flags produces observably different results. Choosing a single one of them > >is obviously fragile and broken. That's the spirit of an ODR violation... > > > >It would sometimes be useful to have different types: > >float, float_no_nan, float_no_nan_no_signed_zero, ... > > Sure. There are ODR violations like that, and then there are ones > like: > >template <typename T> >struct X >{ > conditional_t<numeric_limits<T>::is_iec559, T, BigNum> val; >}; Nice. ;-) If only the mangling of a struct could include the type of its members (recursively)... But at least val has a different type now. And correctly so. Yes, the ABI break possible via this change is real, though I'd guess there are zero or close-to-zero ABI dependencies on is_iec559 out in the wild (at this point - because it didn't work anyway). > I'm generally not concerned about ODR violations where one TU behaves > as requested by the flags used to compile that TU and another behaves > as requested by the flags used to compile that second TU. That happens > all the time with -fno-exceptions and -fno-rtti and such like. That > causes ODR violations too, but of the kind where each definition does > what was requested. I am concerned. Showcase: https://godbolt.org/z/KzM3si. If you link those TUs, you get one of the two behaviors for both TUs. This can result in very hard-to-find Heisenbugs. > Constants defined by the library changing value is a bit more > concerning. But I don't know if it's really a problem in this case. template <typename T, bool = std::numeric_limits<T>::is_iec559> struct Float { T val; }; Finally, the standard mechanism that can help resolve those silent ODR violations works. I.e. one can build float_559 and float_non559 types (overloading all operators is still rather tedious). -- ── Dr. 
Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
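The "standard mechanism" mentioned at the end of this message, a defaulted non-type template parameter that records is_iec559 in the type, can be sketched like this. The template parameter name Iec559 and the explicit aliases are assumptions added for illustration; only Float, float_559 and float_non559 are names taken from the message.

```cpp
#include <limits>
#include <type_traits>

// The defaulted bool becomes part of the type and therefore part of the
// mangled name. Two TUs that see different values of is_iec559 then
// instantiate different types instead of silently sharing one.
template <typename T, bool Iec559 = std::numeric_limits<T>::is_iec559>
  struct Float
  {
    T val;
  };

// Spelled out explicitly here; in real use the second argument would come
// from the default and thus from the flags of the including TU.
using float_559 = Float<float, true>;
using float_non559 = Float<float, false>;

static_assert(!std::is_same<float_559, float_non559>::value,
	      "the flag distinguishes the two types");
static_assert(std::is_same<Float<double>,
			   Float<double,
				 std::numeric_limits<double>::is_iec559>>::value,
	      "the default argument is taken from numeric_limits");
```

As the message notes, a usable wrapper would still need all arithmetic operators overloaded; this sketch only shows how the types are kept apart.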
Re: [PATCH] Let numeric_limits::is_iec559 reflect -ffast-math
On Donnerstag, 21. Mai 2020 17:46:01 CEST Marc Glisse wrote: > On Thu, 21 May 2020, Jonathan Wakely wrote: > > On 27/04/20 17:09 +0200, Matthias Kretz wrote: > >> From: Matthias Kretz > >> > >>PR libstdc++/84949 > >>* include/std/limits: Let is_iec559 reflect whether > >>__GCC_IEC_559 says float and double support IEEE 754-2008. > >>* testsuite/18_support/numeric_limits/is_iec559.cc: Test IEC559 > >>mandated behavior if is_iec559 is true. > >>* testsuite/18_support/numeric_limits/infinity.cc: Only test inf > >>behavior if is_iec559 is true, otherwise there is no guarantee > >>how arithmetic on inf behaves. > >>* testsuite/18_support/numeric_limits/quiet_NaN.cc: ditto for > >>NaN. > >>* testsuite/18_support/numeric_limits/denorm_min-1.cc: Compile > >>with -ffast-math. > >>* testsuite/18_support/numeric_limits/epsilon-1.cc: ditto. > >>* testsuite/18_support/numeric_limits/infinity-1.cc: ditto. > >>* testsuite/18_support/numeric_limits/is_iec559-1.cc: ditto. > >>* testsuite/18_support/numeric_limits/quiet_NaN-1.cc: ditto. > > > > I'm inclined to go ahead and commit this (to master only, obviously). > > It certainly seems more correct to me, and we'll probably never find > > out if it's "safe" to do unless we actually change it and see what > > happens. > > > > Marc, do you have an opinion? > > I don't have a strong opinion on this. I thought we were refraining from > changing numeric_limits based on flags (like -fwrapv for modulo) because > that would lead to ODR violations when people link objects compiled with > different flags. There is a value in libstdc++.so, which may have been > compiled with different flags than the application. But these ODR violations happen in any case: The floating-point types are different types with or without -ffast-math (and related) flags. They behave differently. Compiling a function in multiple TUs with different flags produces observably different results. Choosing a single one of them is obviously fragile and broken. 
That's the spirit of an ODR violation... It would sometimes be useful to have different types: float, float_no_nan, float_no_nan_no_signed_zero, ... -- ── Dr. Matthias Kretz https://mattkretz.github.io GSI Helmholtz Centre for Heavy Ion Research https://gsi.de std::experimental::simd https://github.com/VcDevel/std-simd ──
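The ChangeLog entry "Only test inf behavior if is_iec559 is true" describes a guard that can be sketched roughly as below. This is not the actual testsuite code; the function name infinity_check_passes and the particular checks are assumptions chosen to illustrate the pattern.

```cpp
#include <cassert>
#include <limits>

// Returns true if the infinity checks hold, or if they are skipped because
// the implementation does not claim IEC 559 (IEEE 754) behavior for T.
template <typename T>
  bool
  infinity_check_passes()
  {
    if (!std::numeric_limits<T>::is_iec559)
      return true; // no guarantee how arithmetic on inf behaves: skip

    const T inf = std::numeric_limits<T>::infinity();
    const T max = std::numeric_limits<T>::max();
    // Hypothetical selection of IEC 559 properties of infinity.
    return inf > max && -inf < -max && inf + T(1) == inf;
  }

// Small driver so the checks can be run from a test harness.
inline void
run_infinity_checks()
{
  assert(infinity_check_passes<float>());
  assert(infinity_check_passes<double>());
  assert(infinity_check_passes<int>()); // is_iec559 is false: skipped
}
```

With the proposed change, a build using -ffast-math would report is_iec559 as false and take the early return, instead of asserting properties the compiler no longer promises.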
Re: [RFC] Clarify -ffinite-math-only documentation
On Dienstag, 28. April 2020 09:21:38 CEST Richard Biener wrote: > On Mon, Apr 27, 2020 at 11:26 PM Matthias Kretz wrote: > > On Montag, 27. April 2020 21:39:17 CEST Richard Sandiford wrote: > > > "Dr. Matthias Kretz" writes: > > > > On Montag, 27. April 2020 18:59:08 CEST Richard Sandiford wrote: > > > >> Richard Biener via Gcc-patches writes: > > > >> > On Mon, Apr 27, 2020 at 6:09 PM Matthias Kretz wrote: > > > >> >> Hi, > > > >> >> > > > >> >> This documentation change clarifies the effect of > > > >> >> -ffinite-math-only. > > > >> >> With the current documentation, it is unclear what the presence of > > > >> >> NaN > > > >> >> and Inf representations means if (arithmetic) operations on such > > > >> >> values > > > >> >> are unspecified and even classification functions like isnan are > > > >> >> unreliable. If the hardware thinks a certain bit pattern is a NaN, > > > >> >> but > > > >> >> the software assumes a NaN value cannot ever exist, it is > > > >> >> questionable > > > >> >> whether, from a language viewpoint, a representation for NaNs > > > >> >> really > > > >> >> exists. Because, a NaN is defined by its behavior. This change > > > >> >> also > > > >> >> clarifies that isnan(nan) returning false is fine. > > > >> >> > > > >> >> This relates to PR84949. > > > >> >> > > > >> >> * doc/invoke.texi: Clarify the effects of > > > >> >> -ffinite-math-only. > > > >> >> > > > >> >> --- > > > >> >> > > > >> >> gcc/doc/invoke.texi | 6 -- > > > >> >> 1 file changed, 4 insertions(+), 2 deletions(-) > > > >> >> > > > >> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > > > >> >> index a37a2ee9c19..9e76ab057a9 100644 > > > >> >> --- a/gcc/doc/invoke.texi > > > >> >> +++ b/gcc/doc/invoke.texi > > > >> >> @@ -11619,8 +11619,10 @@ The default is > > > >> >> @option{-fno-reciprocal-math}. 
> > > >> >> > > > >> >> @item -ffinite-math-only > > > >> >> @opindex ffinite-math-only > > > >> >> > > > >> >> -Allow optimizations for floating-point arithmetic that assume > > > >> >> -that arguments and results are not NaNs or +-Infs. > > > >> >> +Assume that floating-point types in the language do not have > > > >> >> representations for > > > >> >> +NaNs and +-Inf. Whether floating-point hardware supports and acts > > > >> >> on > > > >> >> NaNs and ++-Inf is not affected. The behavior of a program that > > > >> >> uses a > > > >> >> NaN or +-Inf value > > > >> >> +as function argument, macro argument, or operand is undefined. > > > >> > > > > >> > Minor nit here - I'd avoid the 'undefined' word which has bad > > > >> > connotation > > > >> > and use 'unspecified'. Maybe we can even use ISO C language > > > >> > specification > > > >> > terms but I'm not sure which one is most appropriate here. > > > > > > > > I'm an ISO C++ person, and unspecified sounds too reliable to me: > > > > https://wg21.link/intro.defs#defns.unspecified. > > > > > > > >> > Curiously __builtin_nan ("nan") still gets you a NaN representation > > > >> > but isnan(__builtin_nan("nan")) is resolved to false. > > > > > > > > Right, that's because only the hardware thinks __builtin_nan ("nan") > > > > is a > > > > NaN representation. With -ffinite-math-only, the double data type in > > > > C/C++ can either hold a finite real value, or an invalid value (i.e. a > > > > value that the optimizer unconditionally excludes as a possible value > > > > for > > > > any object of floating-point type). FWIW, with -ffinite-math-only, > > > > ubsan > > > > should flag isnan(__builtin_nan("
Re: [RFC] Clarify -ffinite-math-only documentation
On Montag, 27. April 2020 21:39:17 CEST Richard Sandiford wrote: > "Dr. Matthias Kretz" writes: > > On Montag, 27. April 2020 18:59:08 CEST Richard Sandiford wrote: > >> Richard Biener via Gcc-patches writes: > >> > On Mon, Apr 27, 2020 at 6:09 PM Matthias Kretz wrote: > >> >> Hi, > >> >> > >> >> This documentation change clarifies the effect of -ffinite-math-only. > >> >> With the current documentation, it is unclear what the presence of NaN > >> >> and Inf representations means if (arithmetic) operations on such > >> >> values > >> >> are unspecified and even classification functions like isnan are > >> >> unreliable. If the hardware thinks a certain bit pattern is a NaN, but > >> >> the software assumes a NaN value cannot ever exist, it is questionable > >> >> whether, from a language viewpoint, a representation for NaNs really > >> >> exists. Because, a NaN is defined by its behavior. This change also > >> >> clarifies that isnan(nan) returning false is fine. > >> >> > >> >> This relates to PR84949. > >> >> > >> >> * doc/invoke.texi: Clarify the effects of -ffinite-math-only. > >> >> > >> >> --- > >> >> > >> >> gcc/doc/invoke.texi | 6 -- > >> >> 1 file changed, 4 insertions(+), 2 deletions(-) > >> >> > >> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > >> >> index a37a2ee9c19..9e76ab057a9 100644 > >> >> --- a/gcc/doc/invoke.texi > >> >> +++ b/gcc/doc/invoke.texi > >> >> @@ -11619,8 +11619,10 @@ The default is @option{-fno-reciprocal-math}. > >> >> > >> >> @item -ffinite-math-only > >> >> @opindex ffinite-math-only > >> >> > >> >> -Allow optimizations for floating-point arithmetic that assume > >> >> -that arguments and results are not NaNs or +-Infs. > >> >> +Assume that floating-point types in the language do not have > >> >> representations for > >> >> +NaNs and +-Inf. Whether floating-point hardware supports and acts on > >> >> NaNs and ++-Inf is not affected. 
The behavior of a program that uses a > >> >> NaN or +-Inf value > >> >> +as function argument, macro argument, or operand is undefined. > >> > > >> > Minor nit here - I'd avoid the 'undefined' word which has bad > >> > connotation > >> > and use 'unspecified'. Maybe we can even use ISO C language > >> > specification > >> > terms but I'm not sure which one is most appropriate here. > > > > I'm an ISO C++ person, and unspecified sounds too reliable to me: > > https://wg21.link/intro.defs#defns.unspecified. > > > >> > Curiously __builtin_nan ("nan") still gets you a NaN representation > >> > but isnan(__builtin_nan("nan")) is resolved to false. > > > > Right, that's because only the hardware thinks __builtin_nan ("nan") is a > > NaN representation. With -ffinite-math-only, the double data type in > > C/C++ can either hold a finite real value, or an invalid value (i.e. a > > value that the optimizer unconditionally excludes as a possible value for > > any object of floating-point type). FWIW, with -ffinite-math-only, ubsan > > should flag isnan(__builtin_nan("nan")) or any f(constexpr nan). > > > > With the above documentation change, it is clear that with > > https://wg21.link/ P1841 std::numbers::quiet_NaN would be > > ill-formed under -ffinite-math- only. Without the documentation change, > > it can be argued either way. > > > > There's another interesting observation resulting from the above: double > > and double under -ffinite-math-only are different types. Any function > > call from one world to the other is dangerous. Inline functions > > translated in different TUs compiled with different math flags violate > > the ODR. But that's all the more reason to have a very precise > > documentation/understanding of what -ffinite-math-only does. 
Because this > > gotcha is already the status quo.> > >> Yeah, for that and other reasons, I think it would be good to avoid > >> giving the impression that -ffinite-math-only can be relied on to make > >> the assumption above. Wouldn't it be more accurate to say that the > >> compiler is allowed to make the assumption, at any po
Re: [RFC] Clarify -ffinite-math-only documentation
On Monday, 27 April 2020 18:59:08 CEST Richard Sandiford wrote:
> Richard Biener via Gcc-patches writes:
> > On Mon, Apr 27, 2020 at 6:09 PM Matthias Kretz wrote:
> >> Hi,
> >>
> >> This documentation change clarifies the effect of -ffinite-math-only.
> >> With the current documentation, it is unclear what the presence of NaN
> >> and Inf representations means if (arithmetic) operations on such values
> >> are unspecified and even classification functions like isnan are
> >> unreliable. If the hardware thinks a certain bit pattern is a NaN, but
> >> the software assumes a NaN value cannot ever exist, it is questionable
> >> whether, from a language viewpoint, a representation for NaNs really
> >> exists. Because, a NaN is defined by its behavior. This change also
> >> clarifies that isnan(nan) returning false is fine.
> >>
> >> This relates to PR84949.
> >>
> >> * doc/invoke.texi: Clarify the effects of -ffinite-math-only.
> >>
> >> ---
> >>
> >> gcc/doc/invoke.texi | 6 ++++--
> >> 1 file changed, 4 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> >> index a37a2ee9c19..9e76ab057a9 100644
> >> --- a/gcc/doc/invoke.texi
> >> +++ b/gcc/doc/invoke.texi
> >> @@ -11619,8 +11619,10 @@ The default is @option{-fno-reciprocal-math}.
> >>
> >> @item -ffinite-math-only
> >> @opindex ffinite-math-only
> >>
> >> -Allow optimizations for floating-point arithmetic that assume
> >> -that arguments and results are not NaNs or +-Infs.
> >> +Assume that floating-point types in the language do not have representations for
> >> +NaNs and +-Inf. Whether floating-point hardware supports and acts on NaNs and
> >> ++-Inf is not affected. The behavior of a program that uses a NaN or +-Inf value
> >> +as function argument, macro argument, or operand is undefined.
> >
> > Minor nit here - I'd avoid the 'undefined' word which has bad connotation
> > and use 'unspecified'. Maybe we can even use ISO C language specification
> > terms but I'm not sure which one is most appropriate here.

I'm an ISO C++ person, and unspecified sounds too reliable to me:
https://wg21.link/intro.defs#defns.unspecified.

> > Curiously __builtin_nan ("nan") still gets you a NaN representation
> > but isnan(__builtin_nan("nan")) is resolved to false.

Right, that's because only the hardware thinks __builtin_nan ("nan") is a
NaN representation. With -ffinite-math-only, the double data type in C/C++
can either hold a finite real value, or an invalid value (i.e. a value that
the optimizer unconditionally excludes as a possible value for any object of
floating-point type). FWIW, with -ffinite-math-only, ubsan should flag
isnan(__builtin_nan("nan")) or any f(constexpr nan).

With the above documentation change, it is clear that with
https://wg21.link/P1841 std::numbers::quiet_NaN would be ill-formed under
-ffinite-math-only. Without the documentation change, it can be argued
either way.

There's another interesting observation resulting from the above: double and
double under -ffinite-math-only are different types. Any function call from
one world to the other is dangerous. Inline functions translated in
different TUs compiled with different math flags violate the ODR. But that's
all the more reason to have a very precise documentation/understanding of
what -ffinite-math-only does. Because this gotcha is already the status quo.

> Yeah, for that and other reasons, I think it would be good to avoid
> giving the impression that -ffinite-math-only can be relied on to make
> the assumption above. Wouldn't it be more accurate to say that the
> compiler is allowed to make the assumption, at any point that it seems
> convenient?

I think undefined behavior does what you're asking for while unspecified
behavior does what you want to avoid. I.e. it's an undocumented behavior, but
it can be relied on with a given implementation (compiler).

-Matthias

--
──────────────────────────────┬───────────────────────────────────────────
 Dr. Matthias Kretz           │ SDE - Software Development for Experiments
 Senior Software Engineer,    │ +49 6159 713084
 SIMD Expert,                 │ m.kr...@gsi.de
 ISO C++ Committee Member     │ mattkretz.github.io
──────────────────────────────┴───────────────────────────────────────────
GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Georg Schütte
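
To illustrate the distinction drawn in the message above between what the hardware treats as a NaN bit pattern and what the language-level isnan reports, here is a minimal sketch. It is not part of the patch or of GCC: has_nan_bit_pattern, classification_agrees and check_bit_patterns are hypothetical names introduced only for this illustration, and the code assumes double is IEEE 754 binary64.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (assumption, not a GCC or libstdc++ API): inspects only
// the object representation of x and uses no floating-point comparison.
// Assumes IEEE 754 binary64: NaN means exponent all ones and mantissa nonzero.
bool has_nan_bit_pattern(double x)
{
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "sketch assumes a 64-bit double");
  static_assert(std::numeric_limits<double>::is_iec559,
                "sketch assumes IEEE 754 doubles");
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);  // well-defined way to read the bits
  const std::uint64_t exponent_mask = 0x7ff0000000000000ull;
  const std::uint64_t mantissa_mask = 0x000fffffffffffffull;
  return (bits & exponent_mask) == exponent_mask
         && (bits & mantissa_mask) != 0;
}

// Hypothetical helper: does the language-level classification (std::isnan)
// agree with the bit-pattern view for this value?
bool classification_agrees(double x)
{
  return std::isnan(x) == has_nan_bit_pattern(x);
}

// Checks on finite values only, so they do not depend on NaN or Inf handling.
void check_bit_patterns()
{
  assert(!has_nan_bit_pattern(0.0));
  assert(!has_nan_bit_pattern(1.5));
  assert(!has_nan_bit_pattern(-1.5));
}
```

With default flags, classification_agrees(__builtin_nan("")) would be expected to hold. With -ffinite-math-only, the thread reports that isnan(__builtin_nan("nan")) is resolved to false, so the two views may diverge. Under the proposed wording, merely passing such a value as a function argument would already be undefined, so this is an illustration of the hardware-versus-language split and not a suggested workaround.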