Re: [PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-27 Thread Maciej Cencora
But constexpr-ness of bit_cast has additional limitations and e.g. providing an union as _Tp would be a hard-error. So we have two options: - before bitcasting check if type can be bitcast-ed at compile-time, - change the 'if constexpr' to regular 'if'. If we go with the second solution then we

Re: [PATCH v6] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-27 Thread Tejas Belagod
On 6/28/24 6:18 AM, Pengxuan Zheng wrote: This patch improves GCC’s vectorization of __builtin_popcount for aarch64 target by adding popcount patterns for vector modes besides QImode, i.e., HImode, SImode and DImode. With this patch, we now generate the following for V8HI: cnt v1.16b, v0.

[PATCH] Use move-aware auto_vec in map

2024-06-27 Thread Jørgen Kvalsvik
Using auto_vec rather than vec for means the vectors are released automatically upon return, to stop the leak. The problem seems to be that auto_vec is not really move-aware, only the specialization is. This is actually Jan's original suggestion https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655

Re: [PATCH] Fix native_encode_vector_part for itype when TYPE_PRECISION (itype) == BITS_PER_UNIT

2024-06-27 Thread Richard Biener
On Fri, Jun 28, 2024 at 8:01 AM Richard Biener wrote: > > On Fri, Jun 28, 2024 at 3:15 AM liuhongt wrote: > > > > for the testcase in the PR115406, here is part of the dump. > > > > char D.4882; > > vector(1) _1; > > vector(1) signed char _2; > > char _5; > > > >: > > _1 = { -1 };

RE: nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-27 Thread Roger Sayle
Hi Thomas, There are two things I think I can contribute to this discussion. The first is that I have a patch (from a year or two ago) for adding rtx_costs to the nvptx backend that I will respin, which will provide more backend control over combine-like pass decisions. The second is in res

Re: [PATCH 3/3] [x86] Enable flate-combine.

2024-06-27 Thread Uros Bizjak
On Fri, Jun 28, 2024 at 7:29 AM liuhongt wrote: > > Move pass_stv2 and pass_rpad after pre_reload pass_late_combine, also > define target_insn_cost to prevent post_reload pass_late_combine to > revert the optimization done in pass_rpad. > > Adjust testcases since pass_late_combine generates better

Re: [PATCH] Fix native_encode_vector_part for itype when TYPE_PRECISION (itype) == BITS_PER_UNIT

2024-06-27 Thread Richard Biener
On Fri, Jun 28, 2024 at 3:15 AM liuhongt wrote: > > for the testcase in the PR115406, here is part of the dump. > > char D.4882; > vector(1) _1; > vector(1) signed char _2; > char _5; > >: > _1 = { -1 }; > > When assign { -1 } to vector(1} {signed-boolean:8}, > Since TYPE_PRECISION

Re: [PATCH 2/3] Extend lshifrtsi3_1_zext to ?k alternative.

2024-06-27 Thread Uros Bizjak
On Fri, Jun 28, 2024 at 7:29 AM liuhongt wrote: > > late_combine will combine lshift + zero into *lshifrtsi3_1_zext which > cause extra mov between gpr and kmask, add ?k to the pattern. > > gcc/ChangeLog: > > PR target/115610 > * config/i386/i386.md (<*insnsi3_zext): Add alternativ

Re: [x86 PATCH] Handle sign_extend like zero_extend in *concatditi3_[346]

2024-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2024 at 9:40 PM Roger Sayle wrote: > > > This patch generalizes some of the patterns in i386.md that recognize > double word concatenation, so they handle sign_extend the same way that > they handle zero_extend in appropriate contexts. > > As a motivating example consider the follo

Re: [PATCH] vect: Fix shift-by-induction for single-lane slp

2024-06-27 Thread Richard Biener
On Thu, Jun 27, 2024 at 5:15 PM Feng Xue OS wrote: > > I added two test cases for the examples you mentioned. OK, thanks. > BTW: would you please look over another 3 lane-reducing patches that have > been updated? If ok, I would consider checking them in. Sorry, I've been distracted by other

Re: [PATCH v3] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-27 Thread Richard Biener
On Thu, Jun 27, 2024 at 4:45 PM Li, Pan2 wrote: > > Hi Richard, > > As mentioned by tamar in previous, would like to try even more optimization > based on this patch. > Assume we take zip benchmark as example, we may have gimple similar as below > > unsigned int _1, _2; > unsigned short int _9; >

[PATCH 2/3] Extend lshifrtsi3_1_zext to ?k alternative.

2024-06-27 Thread liuhongt
late_combine will combine lshift + zero into *lshifrtsi3_1_zext, which causes an extra mov between gpr and kmask; add ?k to the pattern. gcc/ChangeLog: PR target/115610 * config/i386/i386.md (<*insnsi3_zext): Add alternative ?k, enable it only for lshiftrt and under avx512bw.

[PATCH 3/3] [x86] Enable flate-combine.

2024-06-27 Thread liuhongt
Move pass_stv2 and pass_rpad after pre_reload pass_late_combine, also define target_insn_cost to prevent post_reload pass_late_combine from reverting the optimization done in pass_rpad. Adjust testcases since pass_late_combine generates better code but breaks scan assembly, i.e. under 32-bit target, gcc

[PATCH 0/3][x86] Enable pass_late_combine for x86.

2024-06-27 Thread liuhongt
Because of the issue described in PR115610, late_combine is disabled by default. The series tries to solve the regressions and enable late_combine. There are 4 regressions observed. 1. The first one is related to pass_stv2, because late_combine will revert the transformation done in the pass. Move the pas

[PATCH 1/3] [avx512 testsuite] Define mask as extern instead of uninitialized local variables.

2024-06-27 Thread liuhongt
The testcases are supposed to scan for vpopcnt{b,w,d,q} operations with k mask, but mask is defined as an uninitialized local variable, which is set to 0 at the rtl expand phase. It is then simplified away by late_combine, which causes the scan-assembler failures. Move the definition of mask outside to

[PATCH v1] Match: Support imm form for unsigned scalar .SAT_ADD

2024-06-27 Thread pan2 . li
From: Pan Li This patch would like to support the form of unsigned scalar .SAT_ADD when one of the operands is an immediate (IMM). For example as below: Form IMM: #define DEF_SAT_U_ADD_IMM_FMT_1(T) \ T __attribute__((noinline)) \ sat_u_add_imm_##T##_fmt_1 (T x) \ {

[PATCH] MIPS/testsuite: Add -mfpxx to call-clobbered-1.c

2024-06-27 Thread YunQiang Su
The scan-assembler-times rules only fit for -mfp32 and -mfpxx. It fails if we are configured as FP64 by default, as it has one less sdc1/ldc1 pair. gcc/testsuite * gcc.target/mips/call-clobbered-1.c: Add -mfpxx. --- gcc/testsuite/gcc.target/mips/call-clobbered-1.c | 2 +- 1 file changed,

[PATCH] MIPS: Support more cases with alien mode of SHF.DF

2024-06-27 Thread YunQiang Su
Currently, we support the cases that strictly fit for the instructions. For example, for V16QImode, we only support shuffle like (0<=N0, N1, N2, N3<=3 here) N0, N1, N2, N3, N0+4, N1+4, N2+4, N3+4, N0+8, N1+8, N2+8, N3+8, N0+12, N1+12, N2+12, N

[PATCH] Fix native_encode_vector_part for itype when TYPE_PRECISION (itype) == BITS_PER_UNIT

2024-06-27 Thread liuhongt
for the testcase in the PR115406, here is part of the dump. char D.4882; vector(1) _1; vector(1) signed char _2; char _5; : _1 = { -1 }; When assign { -1 } to vector(1} {signed-boolean:8}, Since TYPE_PRECISION (itype) <= BITS_PER_UNIT, so it set each bit of dest with each vector el

Re: [x86 SSE PATCH] Some additional ternlog refinements.

2024-06-27 Thread Hongtao Liu
On Thu, Jun 27, 2024 at 4:29 PM Roger Sayle wrote: > > > This patch is another round of refinements to fine tune the new ternlog > infrastructure in i386's sse.md. This patch tweaks ix86_ternlog_idx > to allow multiple MEM/CONST_VECTOR/VEC_DUPLICATE operands prior to > splitting (before reload),

RE: [PATCH v5] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-27 Thread Pengxuan Zheng (QUIC)
Thanks, Richard! I've updated the patch accordingly. https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655912.html Please let me know if any other changes are needed. Thanks, Pengxuan > Sorry for the slow reply. > > Pengxuan Zheng writes: > > This patch improves GCC’s vectorization of __buil

[PATCH v6] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-27 Thread Pengxuan Zheng
This patch improves GCC’s vectorization of __builtin_popcount for aarch64 target by adding popcount patterns for vector modes besides QImode, i.e., HImode, SImode and DImode. With this patch, we now generate the following for V8HI: cnt v1.16b, v0.16b uaddlp v2.8h, v1.16b For V4HI, we gen
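As a rough scalar model of the V8HI sequence quoted above, the sketch below mimics what the two instructions compute: `cnt` counts the set bits in each byte, and `uaddlp` adds adjacent byte pairs into 16-bit lanes. The function names `popcount8` and `popcount_v8hi` are hypothetical, chosen only for this illustration; they are not taken from the patch or from GCC.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Models one byte lane of `cnt`: the number of set bits in a byte.
inline std::uint8_t popcount8(std::uint8_t b) {
    std::uint8_t n = 0;
    while (b != 0) {
        b = static_cast<std::uint8_t>(b & (b - 1));  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Models `cnt` on 16 bytes followed by `uaddlp` into 8 halfword lanes:
// the popcount of each 16-bit element is the sum of its two byte popcounts.
inline std::array<std::uint16_t, 8>
popcount_v8hi(const std::array<std::uint16_t, 8>& v) {
    std::array<std::uint16_t, 8> out{};
    for (std::size_t i = 0; i < v.size(); ++i) {
        const std::uint8_t lo = static_cast<std::uint8_t>(v[i] & 0xFF);
        const std::uint8_t hi = static_cast<std::uint8_t>(v[i] >> 8);
        out[i] = static_cast<std::uint16_t>(popcount8(lo) + popcount8(hi));
    }
    return out;
}
```

For instance, an element holding 0xF0F0 yields 4 + 4 = 8 in the corresponding output lane.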

Re: [PATCH] libgccjit: Fix get_size of size_t

2024-06-27 Thread Antoni Boucher
On 2024-06-26 at 18:01, David Malcolm wrote: On Wed, 2024-02-21 at 14:16 -0500, Antoni Boucher wrote: On Thu, 2023-12-07 at 19:57 -0500, David Malcolm wrote: On Thu, 2023-12-07 at 17:26 -0500, Antoni Boucher wrote: Hi. This patch fixes getting the size of size_t (bug 112910). There's o

Re: [PATCH] preprocessor: Create the parser before handling command-line includes [PR115312]

2024-06-27 Thread Marek Polacek
On Thu, Jun 27, 2024 at 05:06:14PM -0400, Lewis Hyatt wrote: > Hello- > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115312 > > This fixes a 14.1 regression with PCH for MinGW and other platforms that don't > use stdc-predef.h. Bootstrap + regtest all languages on x86-64 Linux; > bootstrap + re

[PATCH] Testsuite/MIPS: Fix msa.c: test7_v2f64, test7_v4f32, test43_v2i64

2024-06-27 Thread YunQiang Su
BNEGI.W/D are used for test7_v2f64 and test7_v4f32 now. It is an improvement, since we can save an instruction. ILVR.D is used for test43_v2i64 now, instead of INSVE.D. gcc/testsuite gcc.target/mips/msa.c: Fix test7_v2f64, test7_v4f32 and test43_v2i64. --- gcc/testsuite/gcc.ta

Re: [PATCH v2] MIPS: Output $0 for conditional trap if !ISA_HAS_COND_TRAPI

2024-06-27 Thread YunQiang Su
On Fri, 28 Jun 2024 at 01:01, Maciej W. Rozycki wrote: > > On Thu, 27 Jun 2024, YunQiang Su wrote: > > > > The missed optimisation in GAS, which used not to trigger pre-R6, is > > > irrelevant from this change's point of view and just adds noise. I'm > > > surprised that it worked even in the first place, as

Re: nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-27 Thread Thomas Schwinge
Hi! On 2024-06-27T23:20:18+0200, I wrote: > On 2024-06-27T22:27:21+0200, I wrote: >> On 2024-06-27T18:49:17+0200, I wrote: >>> On 2023-10-24T19:49:10+0100, Richard Sandiford >>> wrote: This patch adds a combine pass that runs late in the pipeline. >> >> [After sending, I realized I replied

Re: [PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-27 Thread Jonathan Wakely
On Thu, 27 Jun 2024 at 14:27, Maciej Cencora wrote: > > I think going the bit_cast way would be the best because it enables the > optimization for many more classes including common wrappers like optional, > variant, pair, tuple and std::array. This isn't tested but seems to work on simple case

[RFC PATCH] cse: Add another CSE pass after split1

2024-06-27 Thread Palmer Dabbelt
This is really more of a question than a patch. Looking at PR/115687 I managed to convince myself there's a general class of problems here: splitting might produce constant subexpressions, but as far as I can tell there's nothing to eliminate those constant subexpressions. So I very quickly threw

Re: [PATCH] libgccjit: Make new_array_type take unsigned long

2024-06-27 Thread Antoni Boucher
Thanks for the review. I'm a bit concerned about using unsigned long. Would it be OK if I change the type to uint64_t? I could rename the function to gcc_jit_context_new_array_type_u64. Regards. On 2024-06-26 at 11:34, David Malcolm wrote: On Fri, 2024-02-23 at 09:55 -0500, Antoni Boucher wr

Re: nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-27 Thread Thomas Schwinge
Hi! On 2024-06-27T22:27:21+0200, I wrote: > On 2024-06-27T18:49:17+0200, I wrote: >> On 2023-10-24T19:49:10+0100, Richard Sandiford >> wrote: >>> This patch adds a combine pass that runs late in the pipeline. > > [After sending, I realized I replied to a previous thread of this work.] > >> I've

Re: [PATCH] c: ICE with invalid sizeof [PR115642]

2024-06-27 Thread Marek Polacek
Sorry, I used the wrong e-mail address for Joseph. On Wed, Jun 26, 2024 at 11:09:37AM -0400, Marek Polacek wrote: > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > -- >8 -- > Here we ICE in c_expr_sizeof_expr on an erroneous expr.value. The > code checks for expr.value == error_

Re: [PATCH] c: ICE on invalid with attribute optimize [PR115549]

2024-06-27 Thread Marek Polacek
Sorry, I used the wrong e-mail address for Joseph. On Thu, Jun 27, 2024 at 05:04:41PM -0400, Marek Polacek wrote: > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > -- >8 -- > I had this PR in my open tabs so why not go ahead and fix it. > > decl_attributes gets last_decl, the la

[PATCH] preprocessor: Create the parser before handling command-line includes [PR115312]

2024-06-27 Thread Lewis Hyatt
Hello- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115312 This fixes a 14.1 regression with PCH for MinGW and other platforms that don't use stdc-predef.h. Bootstrap + regtest all languages on x86-64 Linux; bootstrap + regtest c,c++ on x86_64-w64-mingw32. Is it OK for 14 branch and master please

[PATCH] c: ICE on invalid with attribute optimize [PR115549]

2024-06-27 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- I had this PR in my open tabs so why not go ahead and fix it. decl_attributes gets last_decl, the last already pushed declaration, to be used in common_handle_aligned_attribute. In C++, we look up the decl via find_last_decl,

Re: [PATCH] _Hashtable fancy pointer support

2024-06-27 Thread Jonathan Wakely
On Thu, 27 Jun 2024 at 20:25, François Dumont wrote: > > Thanks for the link, based on it I removed some of the nullptr usages > keeping only assignments. That's not necessary. A nullable pointer type is equality comparable with nullptr_t, and nullptr can be implicitly converted to the pointer ty

Re: nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-27 Thread Thomas Schwinge
Hi! On 2024-06-27T18:49:17+0200, I wrote: > On 2023-10-24T19:49:10+0100, Richard Sandiford > wrote: >> This patch adds a combine pass that runs late in the pipeline. [After sending, I realized I replied to a previous thread of this work.] > I've been looking a bit through recent nvptx target c

[x86 PATCH] Handle sign_extend like zero_extend in *concatditi3_[346]

2024-06-27 Thread Roger Sayle
This patch generalizes some of the patterns in i386.md that recognize double word concatenation, so they handle sign_extend the same way that they handle zero_extend in appropriate contexts. As a motivating example consider the following function: __int128 foo(long long x, unsigned long long y)

Re: [PATCH] _Hashtable fancy pointer support

2024-06-27 Thread François Dumont
Thanks for the link, based on it I removed some of the nullptr usages keeping only assignments. François On 26/06/2024 23:41, Jonathan Wakely wrote: On Wed, 26 Jun 2024 at 21:39, François Dumont wrote: Hi Here is my proposal to add support for fancy allocator pointer. The only place where

Re: [gcc r15-1619] ira: Scale save/restore costs of callee save registers with block frequency

2024-06-27 Thread Andrew Pinski
On Thu, Jun 27, 2024 at 3:57 AM Andreas Schwab wrote: > > This breaks s390. > > ../../../../../gcc/libstdc++-v3/src/c++17/floating_to_chars.cc: In function > ‘std::to_chars_result std::__floating_to_chars_shortest(char*, char*, T, > chars_format) [with T = long double]’: > ../../../../../gcc/lib

Re: [PATCH] i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL

2024-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2024 at 12:50 PM Evgeny Karpov wrote: > > Thursday, June 27, 2024 10:39 AM > Uros Bizjak wrote: > > > > diff --git a/gcc/config/i386/i386-expand.cc > > > b/gcc/config/i386/i386-expand.cc > > > index 5dfa7d49f58..20adb42e17b 100644 > > > --- a/gcc/config/i386/i386-expand.cc > > >

Re: [PATCH v5] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-27 Thread Richard Sandiford
Sorry for the slow reply. Pengxuan Zheng writes: > This patch improves GCC’s vectorization of __builtin_popcount for aarch64 > target > by adding popcount patterns for vector modes besides QImode, i.e., HImode, > SImode and DImode. > > With this patch, we now generate the following for V8HI: >

Re: [PATCH] fixincludes: adjust stdio fix for macOS 15 headers

2024-06-27 Thread FX Coudert
> OK. thanks for the fix > I guess we have also to backport if we want earlier branches to bootstrap > there too? Thanks. I’ll backport after some time. FX

Re: [PATCH] fixincludes: adjust stdio fix for macOS 15 headers

2024-06-27 Thread Iain Sandoe
> On 27 Jun 2024, at 17:59, FX Coudert wrote: > > macOS 15 headers move the bulk of the content of to an included > header <_stdio.h> so we apply the “apple_local_stdio_fn_deprecation” > fixinclude to this file also. > > Restores bootstrap on darwin24. > OK to push? OK. thanks for the fix

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-27 Thread FX Coudert
Among the review comments from the last round, Jakub suggested: > Perhaps libgccjit.h could use > #ifdef __has_include > #if __has_include () > #include > #endif > #endif > instead of just #include . I’m not sure it’s necessary since other headers treat as always available, but I suppose it ca

Re: [PATCH v2] MIPS: Output $0 for conditional trap if !ISA_HAS_COND_TRAPI

2024-06-27 Thread Maciej W. Rozycki
On Thu, 27 Jun 2024, YunQiang Su wrote: > > The missed optimisation in GAS, which used not to trigger pre-R6, is > > irrelevant from this change's point of view and just adds noise. I'm > > surprised that it worked even in the first place, as I reckon GCC is > > supposed to emit regular MIPS cod

[PATCH] fixincludes: adjust stdio fix for macOS 15 headers

2024-06-27 Thread FX Coudert
macOS 15 headers move the bulk of the content of to an included header <_stdio.h> so we apply the “apple_local_stdio_fn_deprecation” fixinclude to this file also. Restores bootstrap on darwin24. OK to push? FX fixincludes/ChangeLog: * fixincl.x: Regenerate. * inclhack.def (a

nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-27 Thread Thomas Schwinge
Hi! On 2023-10-24T19:49:10+0100, Richard Sandiford wrote: > This patch adds a combine pass that runs late in the pipeline. Great! In context of 'nvptx vs. "fwprop: invoke change_is_worthwhile to judge if a replacement is worthwhile"', I've been looking a bit thr

Re: [BACKPORT] AArch64: Fix strict-align cpymem/setmem [PR103100]

2024-06-27 Thread Richard Sandiford
Wilco Dijkstra writes: > OK to backport to GCC13 (it applies cleanly and regress/bootstrap passes)? Yes, thanks. Richard > > Cheers, > Wilco > > On 29/11/2023 18:09, Richard Sandiford wrote: >> Wilco Dijkstra writes: >>> v2: Use UINTVAL, rename max_mops_size. >>> >>> The cpymemdi/setmemdi impl

Re: [PATCH] libgccjit: Add support for machine-dependent builtins

2024-06-27 Thread Uros Bizjak
On Thu, Jun 27, 2024 at 12:49 AM David Malcolm wrote: > > On Thu, 2023-11-23 at 17:17 -0500, Antoni Boucher wrote: > > Hi. > > I did split the patch and sent one for the bfloat16 support and > > another > > one for the vector support. > > > > Here's the updated patch for the machine-dependent buil

Re: [PATCH 3/3 v2] RISC-V: Add md files for vector BFloat16

2024-06-27 Thread Patrick O'Neill
Hi Feng, Precommit results for the series: https://github.com/ewlu/gcc-precommit-ci/issues/1809#issuecomment-2193980567 https://patchwork.sourceware.org/project/gcc/patch/20240627070121.32461-3-wangf...@eswincomputing.com/ It looks like there are 5 minor testsuite failures added. Log from t

Re: [PATCH v3] Arm: Fix ldrd offset range [PR115153]

2024-06-27 Thread Wilco Dijkstra
Hi Richard, > The Linaro CI is reporting an ICE while building libgfortran with this change. So it looks like Thumb-2 oddly enough restricts the negative range of DFmode even though that is unnecessary and inefficient. The easiest workaround turned out to be to avoid using checked adjust_address. Cheer

Re: [PATCH v3] Arm: Fix disassembly error in Thumb-1 relaxed load/store [PR115188]

2024-06-27 Thread Wilco Dijkstra
Hi Richard, > Doing just this will mean that the register allocator will have to undo a > pre/post memory operand that was accepted by the predicate (memory_operand).  > I think we really need a tighter predicate (lets call it noautoinc_mem_op) > here to avoid that.  Note that the existing uses

Re: [PATCH v3] c: Error message for incorrect use of static in array declarations

2024-06-27 Thread Marek Polacek
On Thu, Jun 27, 2024 at 11:06:51AM +, Uecker, Martin wrote: > > Next version with the improved location. I assume the [PATCH] > should become part of the commit message. Just the "c: ..." part please. > Bootstrapped and regression tested on x86_64. Thanks, this patch is OK. > c: Err

RE: [PATCH][c++ frontend]: check for missing condition for novector [PR115623]

2024-06-27 Thread Tamar Christina
> -Original Message- > From: Jason Merrill > Sent: Tuesday, June 25, 2024 10:24 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; nat...@acm.org > Subject: Re: [PATCH][c++ frontend]: check for missing condition for novector > [PR115623] > > On 6/25/24 12:52, Tamar Christina wr

Re: [PATCH] vect: Fix shift-by-induction for single-lane slp

2024-06-27 Thread Feng Xue OS
I added two test cases for the examples you mentioned. BTW: would you please look over another 3 lane-reducing patches that have been updated? If ok, I would consider checking them in. Thanks, Feng -- Allow shift-by-induction for slp node, when it is single lane, which is aligned with the or

Re: [PATCH v5] gcc, libcpp: Add warning switch for "#pragma once in main file" [PR89808]

2024-06-27 Thread Ken Matsui
Ping. On Sat, Jun 15, 2024 at 10:30 PM Ken Matsui wrote: > > This patch adds a warning switch for "#pragma once in main file". The > warning option name is Wpragma-once-outside-header, which is the same > as Clang provides. > > PR preprocessor/89808 > > gcc/c-family/ChangeLog: > >

[BACKPORT] AArch64: Fix strict-align cpymem/setmem [PR103100]

2024-06-27 Thread Wilco Dijkstra
OK to backport to GCC13 (it applies cleanly and regress/bootstrap passes)? Cheers, Wilco On 29/11/2023 18:09, Richard Sandiford wrote: > Wilco Dijkstra writes: >> v2: Use UINTVAL, rename max_mops_size. >> >> The cpymemdi/setmemdi implementation doesn't fully support strict alignment. >> Block t

RE: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-27 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Thursday, June 27, 2024 3:49 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw ; > Richard Sandiford > Subject: Re: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2 > > Hi Tamar, > Thanks for going thro

Re: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-27 Thread Kyrylo Tkachov
Hi Tamar, Thanks for going through the docs here, > On 27 Jun 2024, at 16:19, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi Kyrill, > >> -Original Message- >> From: Kyrylo Tkachov >> Sent: Thursday, June 27, 2024 9:58 AM >> To: gcc-patch

RE: [PATCH v3] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-27 Thread Li, Pan2
Hi Richard, As Tamar mentioned previously, I would like to try even more optimization based on this patch. Taking the zip benchmark as an example, we may have gimple similar to the following: unsigned int _1, _2; unsigned short int _9; _9 = (unsigned short int).SAT_SUB (_1, _2); If we can locate the

RE: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-27 Thread Tamar Christina
Hi Kyrill, > -Original Message- > From: Kyrylo Tkachov > Sent: Thursday, June 27, 2024 9:58 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Earnshaw ; Richard Sandiford > > Subject: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2 > > Hi all, > > According to the TRM for Neove

[pushed] Disable late-combine for -O0 [PR115677]

2024-06-27 Thread Richard Sandiford
late-combine relies on df, which for -O0 is only initialised late (pass_df_initialize_no_opt, after split1). Other df-based passes cope with this by requiring optimize > 0, so this patch does the same for late-combine. Bootstrapped & regression tested on aarch64-linux-gnu, pushed as obvious. Ric

Re: [PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-27 Thread Maciej Cencora
I think going the bit_cast way would be the best because it enables the optimization for many more classes including common wrappers like optional, variant, pair, tuple and std::array. Regards, Maciej Cencora On Thu, 27 Jun 2024 at 14:57, Maciej Cencora wrote: > You could include some of the b

Re: [PATCH] s390: Check for ADDR_REGS in s390_decompose_addrstyle_without_index

2024-06-27 Thread Andreas Krebbel
On 6/26/24 14:15, Stefan Schulze Frielinghaus wrote: An explicit check for address registers was not required so far since during register allocation the processing of address constraints was sufficient. However, address constraints themselves do not check for REGNO_OK_FOR_{BASE,INDEX}_P. Thus, w

Re: [PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-27 Thread Maciej Cencora
You could include some of the bigger classes by checking whether the class type is bit_cast-able to std::array of bytes, and that bitcasted output is equal to value-initialized array. Regards, Maciej On Thu, 27 Jun 2024 at 14:50, Jonathan Wakely wrote: > On Thu, 27 Jun 2024 at 13:49, Jonathan

Re: [PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-27 Thread Jonathan Wakely
On Thu, 27 Jun 2024 at 13:49, Jonathan Wakely wrote: > > On Thu, 27 Jun 2024 at 13:40, Maciej Cencora wrote: > > > > Hi, > > > > not sure whether I've missed some conditional that would exclude this case, > > but your change seems to incorrectly handle trivial types that have a > > non-zero bit

Re: [PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-27 Thread Jonathan Wakely
On Thu, 27 Jun 2024 at 13:40, Maciej Cencora wrote: > > Hi, > > not sure whether I've missed some conditional that would exclude this case, > but your change seems to incorrectly handle trivial types that have a > non-zero bit pattern of value-initialized object, e.g. pointer to member. Good poi

Re: PR target/115618: can we back port the fix to GCC 13?

2024-06-27 Thread Andrew Carlotti
On Wed, Jun 26, 2024 at 09:03:26AM +, Kyrylo Tkachov wrote: > Hi Andrew, > > I’ve tested the fix for PR 115618 from your commit r14-6612-g8d30107455f230 > on the GCC 13 branch. > I’d like to back port it to that branch. > Is there any problem with that I should be aware of? > It applies clean

[PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-27 Thread Maciej Cencora
Hi, not sure whether I've missed some conditional that would exclude this case, but your change seems to incorrectly handle trivial types that have a non-zero bit pattern of value-initialized object, e.g. pointer to member. Regards, Maciej Cencora
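The concern above can be illustrated with a minimal sketch. It assumes the Itanium C++ ABI (used by GCC on most non-Windows targets), where a null pointer to data member is represented as -1 rather than as all-zero bits. The helper `value_init_is_all_zero` and the struct `S` are hypothetical names made up for this example; they are not part of libstdc++ or of the patch under review.

```cpp
#include <cstddef>
#include <cstring>

struct S { int x; };  // hypothetical type, for illustration only

// Returns true if a value-initialized T has an all-zero object representation.
// Assumption: T is trivially copyable and has no padding bytes, so inspecting
// its bytes via memcpy is meaningful.
template <typename T>
bool value_init_is_all_zero() {
    T value = T();  // value-initialization
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (bytes[i] != 0)
            return false;
    }
    return true;
}
```

Under that ABI assumption, `value_init_is_all_zero<int>()` is true while `value_init_is_all_zero<int S::*>()` is false, so a memset-to-zero fast path would need to exclude pointer-to-member types.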

Re: [PATCH 2/2] Harden SLP reduction support wrt STMT_VINFO_REDUC_IDX

2024-06-27 Thread Richard Biener
On Thu, 27 Jun 2024, Richard Biener wrote: > The following makes sure that for SLP reductions all lanes have > the same STMT_VINFO_REDUC_IDX. Once we move that info and can adjust > it we can implement swapping. It also makes the existing protection > against operand swapping trigger for all s

Re: [PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-27 Thread Jonathan Wakely
On Thu, 27 Jun 2024 at 11:53, Jonathan Wakely wrote: > > For trivial types std::__uninitialized_default (which is used by > std::uninitialized_value_construct) value-initializes the first element > then copies that to the rest of the range using std::fill. > > Tamar is working on improved vectoriz

[committed v2] libstdc++: Fix std::codecvt for empty dest [PR37475]

2024-06-27 Thread Jonathan Wakely
Here's what I've pushed, with a typo fixed as spotted by Kristian in the PR comments. Tested x86_64-linux. Pushed to trunk. -- >8 -- For the GNU locale model, codecvt::do_out and codecvt::do_in incorrectly return 'ok' when the destination range is empty. That happens because detecting incomplete

[PATCH v3] c: Error message for incorrect use of static in array declarations

2024-06-27 Thread Uecker, Martin
Next version with the improved location. I assume the [PATCH] should become part of the commit message. Bootstrapped and regression tested on x86_64. c: Error message for incorrect use of static in array declarations. Add an explicit error message when C99's static is used wi

Re: [gcc r15-1619] ira: Scale save/restore costs of callee save registers with block frequency

2024-06-27 Thread Andreas Schwab
This breaks s390. ../../../../../gcc/libstdc++-v3/src/c++17/floating_to_chars.cc: In function ‘std::to_chars_result std::__floating_to_chars_shortest(char*, char*, T, chars_format) [with T = long double]’: ../../../../../gcc/libstdc++-v3/src/c++17/floating_to_chars.cc:1306:3: internal compiler

[PATCH] i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL

2024-06-27 Thread Evgeny Karpov
Thursday, June 27, 2024 11:31 AM Christophe Lyon wrote: > > Hi Evgeny, > > Minor comments: > - the patch title should end with [PRn, ...] (choose the most > relevant bug number) > - ChangeLog should mention every bug with PR component/n > so that the bugzilla hooks will notice the commit.

[PATCH 3/3] libstdc++: Use std::__uninitialized_default for ranges::uninitialized_value_construct

2024-06-27 Thread Jonathan Wakely
By generalizing std::__uninitialized_default to work with non-common ranges (i.e. iterator/sentinel pair) we can reuse it for the ranges::uninitialized_value_construct function. Doing that ensures that whatever optimizations we have for std::uninitialized_value_construct are automatically used for

[PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-27 Thread Jonathan Wakely
For trivial types std::__uninitialized_default (which is used by std::uninitialized_value_construct) value-initializes the first element then copies that to the rest of the range using std::fill. Tamar is working on improved vectorization for std::fill, but for this value-initialized case where we

[PATCH 1/3] libstdc++: Use RAII in

2024-06-27 Thread Jonathan Wakely
This refactoring to use RAII doesn't seem to make any difference in benchmarks, although the generated code for some std::vector operations seems to be slightly larger. Maybe it will be faster (or slower) in some cases I didn't test? I think I like the change anyway - any other opinions on whether

[PATCH] i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL

2024-06-27 Thread Evgeny Karpov
Thursday, June 27, 2024 10:39 AM Uros Bizjak wrote: > > diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc > > index 5dfa7d49f58..20adb42e17b 100644 > > --- a/gcc/config/i386/i386-expand.cc > > +++ b/gcc/config/i386/i386-expand.cc > > @@ -414,6 +414,10 @@ ix86_expand_mov

Re: [PATCH] i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL

2024-06-27 Thread Sam James
Evgeny Karpov writes: > Thank you for reporting the issues and discussing the root causes. > It helped in preparing the patch. Thanks. I'll test it shortly but it looks equivalent to my local changes, so LGTM. > > This patch fixes 3 bugs reported after merging > the "Add DLL import/export impl

Re: [PATCH] i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL

2024-06-27 Thread Richard Biener
On Thu, Jun 27, 2024 at 10:40 AM Uros Bizjak wrote: > > On Thu, Jun 27, 2024 at 9:16 AM Evgeny Karpov > wrote: > > > > Thank you for reporting the issues and discussing the root causes. > > It helped in preparing the patch. > > > > This patch fixes 3 bugs reported after merging > > the "Add DLL i

Re: [PATCH 5/7] Adjust testcase for the regressed testcases after obsolete of vcond{, u, eq}.

2024-06-27 Thread Richard Biener
On Thu, Jun 27, 2024 at 10:30 AM liuhongt wrote: > > > Richard suggests that we implement the "obvious" transforms like > > inversion in the middle-end but if for example unsigned compares > > are not supported the us_minus + eq + negative trick isn't on > > that list. > > > > The main reason to r

Re: [PATCH 2/7] Lower AVX512 kmask comparison back to AVX2 comparison when op_{true, false} is vector -1/0.

2024-06-27 Thread Richard Biener
On Thu, Jun 27, 2024 at 10:30 AM liuhongt wrote: > > gcc/ChangeLog In PR115659 Kewen notes that ISEL (and possibly folding) could do a better job with these. In addition to the mentioned issues we can also try whether the target can handle an alternate mask mode. So instead of gating with

Re: [PATCH 0/7][x86] Remove vcond{,u,eq} expanders.

2024-06-27 Thread Richard Biener
On Thu, Jun 27, 2024 at 10:27 AM liuhongt wrote: > > There're several regressions after obsoleting vcond{,u,eq}, > Some regressions are due to the direct optimizations in > ix86_expand_{fp,int}_vcond, i.e. ix86_expand_sse_fp_minmax. > Some regressions are due to optimizations that rely on canonicalizatio

[PATCH 2/2] Harden SLP reduction support wrt STMT_VINFO_REDUC_IDX

2024-06-27 Thread Richard Biener
The following makes sure that for SLP reductions all lanes have the same STMT_VINFO_REDUC_IDX. Once we move that info and can adjust it we can implement swapping. It also makes the existing protection against operand swapping trigger for all stmts participating in a reduction, not just the fina

[PATCH 1/2] tree-optimization/115669 - fix SLP reduction association

2024-06-27 Thread Richard Biener
The following avoids associating a reduction path as that might get STMT_VINFO_REDUC_IDX out-of-sync with the SLP operand order. This is a latent issue with SLP reductions but now easily exposed as we're doing single-lane SLP reductions. When we have achieved SLP-only we can move and update this meta-d

Re: [PATCH v3] [testsuite] [arm] [vect] adjust mve-vshr test [PR113281]

2024-06-27 Thread Alexandre Oliva
On Jun 26, 2024, Richard Sandiford wrote: > Alexandre Oliva writes: >> On Jun 25, 2024, Richard Sandiford wrote: >> Richard (Sandiford), do you happen to recall why the IRC conversation mentioned in the PR trail decided to drop it entirely, even for signed types? >> >>> In the

Re: [PATCH] i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL

2024-06-27 Thread unlvsur unlvsur
Can this process be a little bit simpler in the future?

Re: [PATCH] i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL

2024-06-27 Thread Christophe Lyon
Hi Evgeny, Minor comments: - the patch title should end with [PRn, ...] (choose the most relevant bug number) - ChangeLog should mention every bug with PR component/n so that the bugzilla hooks will notice the commit. See https://gcc.gnu.org/contribute.html#patches (but I can do it for yo

Re: [PATCH v2] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-27 Thread Uros Bizjak
EN_STORE (vectp_out.12_75, 32B, { -1, ... }, _81, 0, > vect_patt_49.11_73); > vectp_op_1.7_69 = vectp_op_1.7_68 + ivtmp_67; > vectp_out.12_76 = vectp_out.12_75 + ivtmp_74; > ivtmp_80 = ivtmp_79 - _81; > > riscv64-unknown-elf-gcc (GCC) 15.0.0 20240627 (experimental) >

[PATCH 2/2] libstdc++: Do not use C++11 alignof in C++98 mode [PR104395]

2024-06-27 Thread Jonathan Wakely
As I commented in the PR, I think it would be nice if the compiler accepted C++11 alignof in C++98 mode when -faligned-new is used. But even if G++ added that, we'd need Clang to use it, and then wait a few releases for that new Clang support to be in widespread use. So let's just disable the exte

[PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-27 Thread Kyrylo Tkachov
Hi all, According to the TRM for Neoverse V2 the Memory Tagging and RNG features are optional configurations of the core and may not always be present. Therefore -mcpu=neoverse-v2 shouldn't enable them, similar to how the crypto extensions aren’t enabled by default. Bootstrapped and tested on aa
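One practical consequence for user code, sketched under the assumption that the compiler follows the ACLE feature macros `__ARM_FEATURE_RNG` and `__ARM_FEATURE_MEMORY_TAGGING`: code that depends on these features should test for them at compile time instead of inferring them from the CPU name. The function names below are hypothetical.

```cpp
#include <cassert>

// Reports whether this translation unit was built with the RNG extension.
constexpr bool built_with_rng()
{
#ifdef __ARM_FEATURE_RNG
    return true;
#else
    return false;
#endif
}

// Reports whether this translation unit was built with memory tagging.
constexpr bool built_with_mte()
{
#ifdef __ARM_FEATURE_MEMORY_TAGGING
    return true;
#else
    return false;
#endif
}
```

If the patch is applied as described, both functions would be expected to return false for a plain `-mcpu=neoverse-v2` build, and targets known to implement the features would need to request them explicitly through the corresponding architecture extension options.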

[PATCH 1/2] libstdc++: Simplify class templates

2024-06-27 Thread Jonathan Wakely
I'm planning to push this, although arguably the first change isn't worth doing if we can't use it everywhere. If we need to keep the old code for EDG, maybe we should just keep using that? The new version probably compiles faster though. Removing the dependency on std::aligned_storage and adding

[COMMITTED 1/7] ada: Implement first half of Generalized Finalization

2024-06-27 Thread Marc Poulhiès
From: Eric Botcazou This implements the first half of the Generalized Finalization proposal, namely the Finalizable aspect as well as its optional relaxed semantics for the finalization operations, but the latter part is only implemented for dynamically allocated objects. In accordance with the

[COMMITTED 5/7] ada: Add missing dimension information for target names

2024-06-27 Thread Marc Poulhiès
From: Eric Botcazou It is computed from the Etype of N_Target_Name nodes. gcc/ada/ * sem_ch5.adb (Analyze_Target_Name): Call Analyze_Dimension on the node once the Etype is set. * sem_dim.adb (OK_For_Dimension): Set to True for N_Target_Name. (Analyze_Dimension):

[COMMITTED 4/7] ada: Fix array-manipulating code in Mdll

2024-06-27 Thread Marc Poulhiès
From: Ronan Desplanques This patch fixes a duo of array assignments in Mdll that were bound to fail. gcc/ada/ * mdll.adb (Build_Non_Reloc_DLL): Fix incorrect assignment to array object. (Ada_Build_Non_Reloc_DLL): Likewise. Tested on x86_64-pc-linux-gnu, committed on mast

[COMMITTED 7/7] ada: Remove last uses of System.Address_Operations in runtime library

2024-06-27 Thread Marc Poulhiès
From: Eric Botcazou This completes the switch from using System.Address_Operations to using only System.Storage_Elements in the runtime library. The remaining uses were for simple optimizations that can be done by the optimizer alone. gcc/ada/ * libgnat/s-carsi8.adb: Remove clauses for

[COMMITTED 3/7] ada: Bug using user defined string literals with interpolated strings

2024-06-27 Thread Marc Poulhiès
From: Javier Miranda The frontend rejects the use of user defined string literals using interpolated strings. gcc/ada/ * sem_res.adb (Has_Applicable_User_Defined_Literal): Add missing support for interpolated strings. Tested on x86_64-pc-linux-gnu, committed on master. --- gc

[COMMITTED 2/7] ada: Overridden operation field not correctly set for controlling result wrappers

2024-06-27 Thread Marc Poulhiès
From: Martin Clochard Implicit wrapper overridings generated for functions with controlling result when deriving with null extension may have field Overridden_Operation incorrectly set, when making several such derivations in succession. This happens because overridings were assumed to come from
