Re: [PATCH 2/4] xtensa: Consider the Loop Option when setmemsi is expanded to small loop

2022-06-10 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2022/06/11 9:12, Max Filippov wrote: Hi Suwa-san, hi! This change results in a bunch of ICEs in tests that look like this: gcc/gcc/testsuite/gcc.c-torture/compile/memtst.c: In function 'main': gcc/gcc/testsuite/gcc.c-torture/compile/memtst.c:28:1: error: unrecognizable insn: (insn 7 6 8 2

Re: [PATCH 2/4] xtensa: Consider the Loop Option when setmemsi is expanded to small loop

2022-06-10 Thread Max Filippov via Gcc-patches
Hi Suwa-san, On Thu, Jun 9, 2022 at 9:26 PM Takayuki 'January June' Suwa wrote: > > Now apply to almost any size of aligned block under such circumstances. > > gcc/ChangeLog: > > * config/xtensa/xtensa.cc (xtensa_expand_block_set_small_loop): > Pass through the block length /

Re: [PATCH] libstdc++: Rename __null_terminated to avoid collision with Apple SDK

2022-06-10 Thread Mark Mentovai
Thanks, Jonathan. I am, in fact, so certifying. I do believe that bringing up support for new OS versions is in scope for open branches, and it makes sense to merge, particularly for a trivial and uncontentious patch like this one. Jonathan Wakely wrote: > On Fri, 10 Jun 2022 at 21:12, Mark

Re: [PATCH] libstdc++: Rename __null_terminated to avoid collision with Apple SDK

2022-06-10 Thread Jonathan Wakely via Gcc-patches
On Fri, 10 Jun 2022 at 21:12, Mark Mentovai wrote: > > The macOS 13 SDK (and equivalent-version iOS and other Apple OS SDKs) > contain this definition in <sys/cdefs.h>: > > 863 #define __null_terminated > > This collides with the use of __null_terminated in libstdc++'s > experimental fs_path.h. > > As

[PATCH] libstdc++: Rename __null_terminated to avoid collision with Apple SDK

2022-06-10 Thread Mark Mentovai
The macOS 13 SDK (and equivalent-version iOS and other Apple OS SDKs) contain this definition in <sys/cdefs.h>: 863 #define __null_terminated This collides with the use of __null_terminated in libstdc++'s experimental fs_path.h. As libstdc++'s use of this token is entirely internal to fs_path.h, the
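The collision described in this message can be illustrated with a small sketch. It assumes nothing about the SDK beyond the empty object-like macro quoted in the message; every DEMO_ and demo_ name is hypothetical and exists only for illustration. Stringifying an identifier after macro expansion shows that a name matching an empty macro disappears from the token stream, while a renamed identifier survives.

```cpp
#include <string_view>

// Hypothetical stand-in for an SDK header that defines an empty object-like macro.
#define DEMO_NULL_TERMINATED

// Two-level stringification, so the argument is macro-expanded before '#' is applied.
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

// The identifier matches the macro, so it expands to nothing and stringifies to "".
inline std::string_view collided_name() { return DEMO_STR(DEMO_NULL_TERMINATED); }

// A renamed identifier (hypothetical name) is not a macro and is left untouched.
inline std::string_view renamed_name() { return DEMO_STR(demo_null_term); }
```

In a real header the vanished token would typically surface as a confusing syntax error at the point of use, which is why renaming an identifier that is internal to one header is a low-risk fix.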

Re: [PATCH] regrename: Fix -fcompare-debug issue in check_new_reg_p [PR105041]

2022-06-10 Thread Jeff Law via Gcc-patches
On 6/10/2022 9:40 AM, Segher Boessenkool wrote: Hi! On Fri, Jun 10, 2022 at 07:52:57PM +0530, Surya Kumari Jangala wrote: In check_new_reg_p, the nregs of a du chain is computed by obtaining the MODE of the first element in the chain, and then calling hard_regno_nregs() with the MODE. But

Re: [PATCH] c++: Add support for __real__/__imag__ modifications in constant expressions [PR88174]

2022-06-10 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 10, 2022 at 01:27:28PM -0400, Jason Merrill wrote: > > --- gcc/cp/constexpr.cc.jj 2022-06-08 08:21:02.973448193 +0200 > > +++ gcc/cp/constexpr.cc 2022-06-08 17:13:04.986040449 +0200 > > @@ -5707,6 +5707,20 @@ cxx_eval_store_expression (const constex > > } > > break; >

c++: Add a late-writing step for modules

2022-06-10 Thread Nathan Sidwell via Gcc-patches
To add a module initializer optimization, we need to defer finishing writing out the module file until the end of determining the dynamic initializers. This is achieved by passing some saved-state from the main module writing to a new function that completes it. This patch merely adds the

[PATCH] i386: Fix up *<insn><dwi>3_doubleword_mask [PR105911]

2022-06-10 Thread Jakub Jelinek via Gcc-patches
Hi! Another regression caused by my recent patch. This time because define_insn_and_split only requires that the constant mask is const_int_operand. When it was only SImode, that wasn't a problem, HImode neither, but for DImode if we need to and the shift count we might run into a problem that

Re: [committed] openmp: Add support for HBW or large capacity or interleaved memory through the libmemkind.so library

2022-06-10 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 09, 2022 at 01:57:52PM +0200, Jakub Jelinek via Gcc-patches wrote: > On Thu, Jun 09, 2022 at 12:11:28PM +0200, Thomas Schwinge wrote: > > On 2022-06-09T10:19:03+0200, Jakub Jelinek via Gcc-patches > > wrote: > > > This patch adds support for dlopening libmemkind.so > > > > Instead

[PING][PATCH] Add instruction level discriminator support.

2022-06-10 Thread Eugene Rozenfeld via Gcc-patches
Hello, I'd like to ping this patch: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html Thanks, Eugene -Original Message- From: Gcc-patches On Behalf Of Eugene Rozenfeld via Gcc-patches Sent: Thursday, June 02, 2022 12:22 AM To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan

[PATCH] x86: Require AVX for F16C and VAES

2022-06-10 Thread H.J. Lu via Gcc-patches
Since F16C and VAES are only usable with AVX, require AVX for F16C and VAES. OK for master and release branches? Thanks. H.J. --- libgcc/105920 * common/config/i386/cpuinfo.h (get_available_features): Require AVX for F16C and VAES. --- gcc/common/config/i386/cpuinfo.h |
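The dependency stated in this message can be sketched as a feature-gating function. This is a toy model under stated assumptions, not the code in cpuinfo.h: the bit values, the enum, and the function name are all hypothetical, and `avx_usable` stands in for whatever check establishes that AVX can actually be used.

```cpp
#include <cstdint>

// Hypothetical feature bits; they do not correspond to real CPUID bit positions.
enum DemoFeature : std::uint32_t {
    kAvx  = 1u << 0,
    kF16c = 1u << 1,
    kVaes = 1u << 2,
};

// Report F16C and VAES only when AVX is both advertised and usable,
// mirroring the rule "require AVX for F16C and VAES".
inline std::uint32_t usable_features(std::uint32_t advertised, bool avx_usable) {
    std::uint32_t out = 0;
    if (avx_usable && (advertised & kAvx)) {
        out |= kAvx;
        out |= advertised & (kF16c | kVaes);  // dependent features pass through
    }
    return out;  // without AVX, the dependent features are dropped as well
}
```

With this gating, a CPU that advertises F16C or VAES but cannot use AVX reports neither feature.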

Re: [PATCH] Add optional __Bfloat16 support

2022-06-10 Thread H.J. Lu via Gcc-patches
On Fri, Jun 10, 2022 at 7:44 AM H.J. Lu wrote: > > On Fri, Jun 10, 2022 at 2:38 AM Florian Weimer wrote: > > > > * liuhongt via Libc-alpha: > > > > > +\subsubsection{Special Types} > > > + > > > +The \code{__Bfloat16} type uses a 8-bit exponent and 7-bit mantissa. > > > +It is used for

Re: [PATCH V2] Disable generating load/store vector pairs for block copies.

2022-06-10 Thread Segher Boessenkool
Hi! On Fri, Jun 10, 2022 at 11:27:40AM -0400, Michael Meissner wrote: > Testing has found that using store vector pair for block copies can result > in a slow down on power10. This patch disables using the vector pair > instructions for block copies if we are tuning for power10. Load paired

Re: [PATCH] c++: Add support for __real__/__imag__ modifications in constant expressions [PR88174]

2022-06-10 Thread Jason Merrill via Gcc-patches
On 6/9/22 04:37, Jakub Jelinek wrote: Hi! We claim we support P0415R1 (constexpr complex), but e.g. #include <complex> constexpr bool foo () { std::complex<double> a (1.0, 2.0); a += 3.0; a.real (6.0); return a.real () == 6.0 && a.imag () == 2.0; } static_assert (foo ()); fails with test.C:12:20:

Re: [PATCH] c++: optimize specialization of nested class templates

2022-06-10 Thread Jason Merrill via Gcc-patches
On 6/10/22 12:00, Patrick Palka wrote: On Fri, 10 Jun 2022, Patrick Palka wrote: On Thu, 9 Jun 2022, Patrick Palka wrote: On Thu, 9 Jun 2022, Jason Merrill wrote: On 6/8/22 14:21, Patrick Palka wrote: When substituting a class template specialization, tsubst_aggr_type substitutes the

c++: Adjust module initializer calling emission

2022-06-10 Thread Nathan Sidwell via Gcc-patches
We special-case emitting the calls of module initializer functions. It's simpler to just emit a static fn to do that, and add it onto the front of the global init fn chain. We can also move the calculation of the set of initializers to call to the point of use. nathan -- Nathan

Re: [PATCH 2/1] c++: optimize specialization of templated member functions

2022-06-10 Thread Jason Merrill via Gcc-patches
On 6/9/22 15:37, Patrick Palka wrote: On Thu, 9 Jun 2022, Jason Merrill wrote: On 6/9/22 09:00, Patrick Palka wrote: This performs one of the optimizations added by the previous patch to lookup_template_class, to instantiate_template as well. (For the libstdc++ ranges tests this optimization

Re: [PATCH] c++: improve TYPENAME_TYPE hashing [PR65328]

2022-06-10 Thread Jason Merrill via Gcc-patches
On 6/10/22 09:40, Patrick Palka wrote: The reason compiling the testcase in this PR is so slow is ultimately due to our poor hashing of TYPENAME_TYPE causing a huge amount of hash table collisions in the spec_hasher and typename_hasher tables. In spec_hasher, we don't hash the components of a

Re: [PATCH] c++: optimize specialization of nested class templates

2022-06-10 Thread Patrick Palka via Gcc-patches
On Fri, 10 Jun 2022, Patrick Palka wrote: > On Thu, 9 Jun 2022, Patrick Palka wrote: > > > On Thu, 9 Jun 2022, Jason Merrill wrote: > > > > > On 6/8/22 14:21, Patrick Palka wrote: > > > > When substituting a class template specialization, tsubst_aggr_type > > > > substitutes the TYPE_CONTEXT

[PATCH] libgompd: Fix sizes in OMPD support and add local ICVs functions.

2022-06-10 Thread Mohamed Atef via Gcc-patches
libgomp/ChangeLog 2022-06-10 Mohamed Atef * ompd-helper.h (DEREFERENCE, ACCESS_VALUE): New macros. * ompd-helper.c (gompd_get_nthread, gompd_get_thread_limit, gomp_get_run_shed, gompd_get_run_sched_chunk_size, gompd_get_default_device, gompd_get_dynamic, gompd_get_max_active_levels,

Re: [PATCH] regrename: Fix -fcompare-debug issue in check_new_reg_p [PR105041]

2022-06-10 Thread Segher Boessenkool
Hi! On Fri, Jun 10, 2022 at 07:52:57PM +0530, Surya Kumari Jangala wrote: > In check_new_reg_p, the nregs of a du chain is computed by obtaining the MODE > of the first element in the chain, and then calling hard_regno_nregs() with > the > MODE. But the first element of the chain can be a

[PATCH v2 4/4] xtensa: Improve constant synthesis for both integer and floating-point

2022-06-10 Thread Takayuki 'January June' Suwa via Gcc-patches
This patch revises the previous implementation of constant synthesis. First, it now uses a define_split machine description pattern and runs after the reload pass, so as not to interfere with optimizations such as loop invariant motion. Second, not only integer but floating-point is

[PATCH V2] Disable generating load/store vector pairs for block copies.

2022-06-10 Thread Michael Meissner via Gcc-patches
[PATCH, V2] Disable generating load/store vector pairs for block copies. Testing has found that using store vector pair for block copies can result in a slow down on power10. This patch disables using the vector pair instructions for block copies if we are tuning for power10. This is version 2

Re: [PATCH] Darwin: Future-proof -mmacosx-version-min

2022-06-10 Thread Iain Sandoe
Hi Mark, > On 10 Jun 2022, at 15:56, Mark Mentovai wrote: > > f18cbc1ee1f4 (2021-12-18) updated various parts of gcc to not impose a > Darwin or macOS version maximum of the current known release. Different > parts of gcc accept, variously, Darwin version numbers matching > darwin2*, and macOS

[PATCH] Darwin: Future-proof -mmacosx-version-min

2022-06-10 Thread Mark Mentovai
f18cbc1ee1f4 (2021-12-18) updated various parts of gcc to not impose a Darwin or macOS version maximum of the current known release. Different parts of gcc accept, variously, Darwin version numbers matching darwin2*, and macOS major version numbers up to 99. The current released version is Darwin

Re: [PATCH] Add optional __Bfloat16 support

2022-06-10 Thread H.J. Lu via Gcc-patches
On Fri, Jun 10, 2022 at 2:38 AM Florian Weimer wrote: > > * liuhongt via Libc-alpha: > > > +\subsubsection{Special Types} > > + > > +The \code{__Bfloat16} type uses a 8-bit exponent and 7-bit mantissa. > > +It is used for \code{BF16} related intrinsics, it cannot be Please mention that this is

[committed] libstdc++: Make std::lcm and std::gcd detect overflow [PR105844]

2022-06-10 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk. -- >8 -- When I fixed PR libstdc++/92978 I introduced a regression whereby std::lcm(INT_MIN, 1) and std::lcm(5, 4) would no longer produce errors during constant evaluation. Those calls are undefined, because they violate the preconditions that
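The precondition mentioned in this message (the result must be representable) can be sketched as an explicit check. This is a minimal illustration, not libstdc++'s implementation: `checked_lcm` is a hypothetical helper, and `__builtin_mul_overflow` is a GCC/Clang builtin rather than standard C++.

```cpp
#include <climits>
#include <numeric>
#include <optional>

// Hypothetical helper: returns the least common multiple of m and n,
// or std::nullopt when std::lcm's preconditions would be violated.
inline std::optional<int> checked_lcm(int m, int n) {
    if (m == 0 || n == 0)
        return 0;                        // lcm with zero is defined as zero
    if (m == INT_MIN || n == INT_MIN)
        return std::nullopt;             // |INT_MIN| is not representable as int
    const int am = m < 0 ? -m : m;
    const int an = n < 0 ? -n : n;
    const int g = std::gcd(am, an);
    int result = 0;
    // Divide before multiplying to keep the intermediate value small,
    // then detect overflow of the final product.
    if (__builtin_mul_overflow(am / g, an, &result))
        return std::nullopt;
    return result;
}
```

A library implementation can turn the same conditions into errors during constant evaluation; the sketch merely reports them through `std::optional`.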

[committed] libstdc++: Fix lifetime bugs for non-TLS eh_globals [PR105880]

2022-06-10 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk. -- >8 -- This ensures that the single-threaded fallback buffer eh_globals is not destroyed during program termination, using the same immortalization technique used for error category objects. Also ensure that init._M_init can still be read after init

[PATCH] regrename: Fix -fcompare-debug issue in check_new_reg_p [PR105041]

2022-06-10 Thread Surya Kumari Jangala via Gcc-patches
regrename: Fix -fcompare-debug issue in check_new_reg_p [PR105041] In check_new_reg_p, the nregs of a du chain is computed by obtaining the MODE of the first element in the chain, and then calling hard_regno_nregs() with the MODE. But the first element of the chain can be a DEBUG_INSN whose mode
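The hazard described here is that a debug-only element can influence a code generation decision. The toy model below shows one way to make such a computation independent of debug elements. It is a sketch under stated assumptions, not the change made by the patch: `ChainElem` and `nregs_ignoring_debug` are hypothetical stand-ins for GCC's du chain data structures.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for one element of a du chain.
struct ChainElem {
    bool is_debug;   // true models a use inside a DEBUG_INSN
    int mode_nregs;  // number of hard registers the element's mode would need
};

// Take the register count from the first non-debug element, so the answer
// is identical whether or not debug elements are present in the chain.
inline std::optional<int> nregs_ignoring_debug(const std::vector<ChainElem>& chain) {
    for (const ChainElem& e : chain)
        if (!e.is_debug)
            return e.mode_nregs;
    return std::nullopt;  // chain holds only debug elements
}
```

Running the same chain with and without the debug element yields the same result, which is the property that -fcompare-debug checks for.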

[committed] libstdc++: Make std::hash<basic_string<>> allocator-agnostic (LWG 3705)

2022-06-10 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk. -- >8 -- This new library issue was recently moved to Tentatively Ready by an LWG poll, so I'm making the change on trunk. As noted in PR libstdc++/105907 the std::hash specializations for PMR strings were not treated as slow hashes by the unordered

[PATCH] c++: improve TYPENAME_TYPE hashing [PR65328]

2022-06-10 Thread Patrick Palka via Gcc-patches
The reason compiling the testcase in this PR is so slow is ultimately due to our poor hashing of TYPENAME_TYPE causing a huge amount of hash table collisions in the spec_hasher and typename_hasher tables. In spec_hasher, we don't hash the components of a TYPENAME_TYPE at all, presumably because

[PATCH][AArch64] Implement ACLE Data Intrinsics

2022-06-10 Thread Andre Vieira (lists) via Gcc-patches
Hi, This patch adds support for the ACLE Data Intrinsics to the AArch64 port. Bootstrapped and regression tested on aarch64-none-linux. OK for trunk? gcc/ChangeLog: 2022-06-10  Andre Vieira      * config/aarch64/aarch64.md (rbit2): Rename this ...     (@aarch64_rbit): ... this and

[committed] libstdc++: Partially revert r11-9772-g6f8133689f4397 [PR105915]

2022-06-10 Thread Jonathan Wakely via Gcc-patches
I have done a partial revert on the gcc-11 branch to fix PR105915. I'll also backport it to gcc-10 after testing finishes. -- >8 -- The r11-9772-g6f8133689f4397 backport made two changes, but only one was needed on the gcc-11 branch. The other should not have been backported, and causes errors

Re: [PATCH] c++: optimize specialization of nested class templates

2022-06-10 Thread Patrick Palka via Gcc-patches
On Thu, 9 Jun 2022, Patrick Palka wrote: > On Thu, 9 Jun 2022, Jason Merrill wrote: > > > On 6/8/22 14:21, Patrick Palka wrote: > > > When substituting a class template specialization, tsubst_aggr_type > > > substitutes the TYPE_CONTEXT before passing it to lookup_template_class. > > > This

Fix ipa-prop wrt volatile memory accesses

2022-06-10 Thread Jan Hubicka via Gcc-patches
Hi, this patch prevents ipa-prop from propagating aggregates when the load is volatile. Martin, does this look OK? It seems to me that ipa-prop may need some additional volatile flag checks. Bootstrapped/regtested x86_64-linux, OK? Honza gcc/ChangeLog: 2022-06-10 Jan Hubicka PR

Re: [PATCH 2/2] Add a general mapping from internal fns to target insns

2022-06-10 Thread David Malcolm via Gcc-patches
On Fri, 2022-06-10 at 10:14 +0100, Richard Sandiford via Gcc-patches wrote: Several existing internal functions map directly to an instruction defined in target-insns.def.  This patch makes it easier to define more such functions in future. This should help to reduce cut-&-paste, but more

Re: [PING][PATCH][WIP] have configure probe prefix for gmp/mpfr/mpc [PR44425]

2022-06-10 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-06-09 at 16:04 -0400, Eric Gallager via Gcc-patches wrote: > Hi, I'd like to ping this patch: > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596126.html > (cc-ing the build machinery maintainers listed in MAINTAINERS this > time) > > On Thu, Jun 2, 2022 at 11:53 AM Eric

[PATCH] Do not erase warning data in gimple_set_location

2022-06-10 Thread Eric Botcazou via Gcc-patches
Hi, gimple_set_location is mostly invoked on newly built GIMPLE statements, so their location is UNKNOWN_LOCATION and setting it will clobber the warning data of the passed location, if any. Tested on x86-64/Linux, OK for mainline and 12 branch? 2022-06-10 Eric Botcazou *

Re: [PATCH] Add optional __Bfloat16 support

2022-06-10 Thread Florian Weimer via Gcc-patches
* liuhongt via Libc-alpha: > +\subsubsection{Special Types} > + > +The \code{__Bfloat16} type uses a 8-bit exponent and 7-bit mantissa. > +It is used for \code{BF16} related intrinsics, it cannot be > +used with standard C operators. I think it's not necessary to specify whether the type

Re: [PATCH] testsuite: Add -mtune=generic to dg-options for two testcases.

2022-06-10 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 10, 2022 at 4:45 PM Cui,Lili via Gcc-patches wrote: > > This patch is to change dg-options for two testcases. > > Use -mtune=generic to limit these two testcases. Because configuring them with > -mtune=cascadelake or znver3 will vectorize them. > > regtested on

[PATCH 2/2] Add a general mapping from internal fns to target insns

2022-06-10 Thread Richard Sandiford via Gcc-patches
Several existing internal functions map directly to an instruction defined in target-insns.def. This patch makes it easier to define more such functions in future. This should help to reduce cut-&-paste, but more importantly, it allows the difference between optab functions and target-insns.def

[PATCH 1/2] Factor out common internal-fn idiom

2022-06-10 Thread Richard Sandiford via Gcc-patches
internal-fn.c has quite a few functions that simply map the result of the call to an instruction's output operand (if any) and map each argument to an instruction's input operand, in order. This patch adds a single function for doing that. It's really just a generalisation of

[PATCH] testsuite: Add -mtune=generic to dg-options for two testcases.

2022-06-10 Thread Cui,Lili via Gcc-patches
This patch changes dg-options for two testcases. Use -mtune=generic to limit these two testcases, because configuring them with -mtune=cascadelake or znver3 will vectorize them. Regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? Thanks, Lili. Use -mtune=generic to limit these two test

Re: [PATCH] Add optional __Bfloat16 support

2022-06-10 Thread Hongtao Liu via Gcc-patches
On Fri, Jun 10, 2022 at 3:47 PM liuhongt via Libc-alpha wrote: > > Pass and return __Bfloat16 values in XMM registers. > > Background: > __Bfloat16 (BF16) is a new floating-point format that can accelerate machine > learning (deep learning training, in particular) algorithms. > It's first

[PATCH] Add optional __Bfloat16 support

2022-06-10 Thread liuhongt via Gcc-patches
Pass and return __Bfloat16 values in XMM registers. Background: __Bfloat16 (BF16) is a new floating-point format that can accelerate machine learning (deep learning training, in particular) algorithms. It was first introduced by the Intel AVX-512 extension called AVX-512_BF16. __Bfloat16 has 8 bits
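The layout discussed in this thread (1 sign bit, 8 exponent bits, 7 mantissa bits) matches the upper 16 bits of an IEEE-754 binary32 value, so a conversion can be sketched with plain bit manipulation. The helper names below are hypothetical and are not part of the psABI or of any compiler API. The narrowing shown truncates the low mantissa bits rather than rounding, which is an assumption made to keep the example short.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// Hypothetical helper: keep the sign, the 8 exponent bits and the top
// 7 mantissa bits of a binary32 value; the low 16 bits are discarded.
inline std::uint16_t float_to_bf16_truncate(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined way to read the bit pattern
    return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: widen a BF16 bit pattern back to binary32 by
// placing it in the upper half and zero-filling the low mantissa bits.
inline float bf16_to_float(std::uint16_t h) {
    const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Because the exponent field is as wide as in binary32, the dynamic range is preserved and only precision is lost: values such as 1.0f and 1.5f round-trip exactly, while 3.14159f comes back as 3.140625f.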

Re: [PATCH] aarch64: Lower vcombine to GIMPLE

2022-06-10 Thread Richard Sandiford via Gcc-patches
Andrew Carlotti via Gcc-patches writes: > Hi all, > > This lowers vcombine intrinsics to a GIMPLE vector constructor, which enables > better optimisation during GIMPLE passes. > > Bootstrapped and tested on aarch64-none-linux-gnu, and tested for > aarch64_be-none-linux-gnu via