Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Richard Biener
On Wed, 14 Feb 2024, Andrew Stubbs wrote: > On 14/02/2024 13:43, Richard Biener wrote: > > On Wed, 14 Feb 2024, Andrew Stubbs wrote: > > > >> On 14/02/2024 13:27, Richard Biener wrote: > >>> On Wed, 14 Feb 2024, Andrew Stubbs wrote: > >>> > On 13/02/2024 08:26, Richard Biener wrote: > >

[PATCH] lower-bitint: Ensure we don't get coalescing ICEs for (ab) SSA_NAMEs used in mul/div/mod [PR113567]

2024-02-14 Thread Jakub Jelinek
Hi! The build_bitint_stmt_ssa_conflicts hook has a special case for multiplication, division and modulo, where to ensure there is no overlap between lhs and rhs1/rhs2 arrays we make the lhs conflict with the operands. On the following testcase, we have # a_1(ab) = PHI lab: a_3(ab) = a_1(ab)

[PATCH] icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged functions [PR113907]

2024-02-14 Thread Jakub Jelinek
Hi! AFAIK we have no code in LTO streaming to stream out or in SSA_NAME_{RANGE,PTR}_INFO, so LTO effectively throws it all away and let vrp1 and alias analysis after IPA recompute that. There is just one spot, for IPA VRP and IPA bit CCP we save/restore ranges and set SSA_NAME_{PTR,RANGE}_INFO

Re: [PATCH] Skip gnat.dg/div_zero.adb on RISC-V

2024-02-14 Thread Kito Cheng
LGTM, thanks :) On Wed, Feb 14, 2024 at 10:11 PM Andreas Schwab wrote: > > Like AArch64 and POWER, RISC-V does not support trap on zero divide. > > gcc/testsuite/ > * gnat.dg/div_zero.adb: Skip on RISC-V. > --- > gcc/testsuite/gnat.dg/div_zero.adb | 2 +- > 1 file changed, 1

PING: [PATCH v3 0/8] Optimize more type traits

2024-02-14 Thread Ken Matsui
IIRC, all libstdc++ patches were already reviewed. It would be great if gcc patches were reviewed as well. Thank you for your time. Sincerely, Ken Matsui On Fri, Jan 5, 2024 at 9:08 PM Ken Matsui wrote: > > Changes in v3: > > - Rebased on top of master. > - Fixed __is_pointer in

[PATCH v3 1/4] c++: Implement __add_pointer built-in trait

2024-02-14 Thread Ken Matsui
This patch implements built-in trait for std::add_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __add_pointer. * semantics.cc (finish_trait_type): Handle CPTK_ADD_POINTER. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __add_pointer.
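
As a rough illustration of what dispatching to such a built-in might look like from user code, here is a minimal sketch. The names HAVE_ADD_POINTER_BUILTIN and add_pointer_sketch are hypothetical and not part of the patch; the sketch assumes the compiler reports the trait through __has_builtin, and it falls back to std::add_pointer otherwise.

```cpp
#include <type_traits>

// Hypothetical feature macro: set only when the compiler advertises the
// __add_pointer built-in trait through __has_builtin.
#if defined(__has_builtin)
# if __has_builtin(__add_pointer)
#  define HAVE_ADD_POINTER_BUILTIN 1
# endif
#endif

// Hypothetical wrapper (not the libstdc++ implementation): use the built-in
// when available, otherwise defer to the library trait.
template <typename T>
struct add_pointer_sketch {
#ifdef HAVE_ADD_POINTER_BUILTIN
  using type = __add_pointer(T);
#else
  using type = typename std::add_pointer<T>::type;
#endif
};

// Both paths are expected to agree with std::add_pointer.
static_assert(std::is_same<add_pointer_sketch<int>::type, int*>::value, "");
static_assert(std::is_same<add_pointer_sketch<int&>::type, int*>::value, "");
```

Either branch should yield the same type, so code using the wrapper does not need to know which one was selected.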

Re: [PATCH v2 1/4] c++: Implement __add_pointer built-in trait

2024-02-14 Thread Ken Matsui
On Wed, Feb 14, 2024 at 12:19 PM Patrick Palka wrote: > > On Wed, 14 Feb 2024, Ken Matsui wrote: > > > This patch implements built-in trait for std::add_pointer. > > > > gcc/cp/ChangeLog: > > > > * cp-trait.def: Define __add_pointer. > > * semantics.cc (finish_trait_type): Handle

[PATCH V4 4/5] RISC-V: Quick and simple fixes to testcases that break due to reordering

2024-02-14 Thread Edwin Lu
The following test cases are easily fixed with small updates to the expected assembly order. Additionally make calling-convention testcases more robust PR target/113249 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c: update *

[PATCH V4 3/5] RISC-V: Use default cost model for insn scheduling

2024-02-14 Thread Edwin Lu
Use default cost model scheduling on these test cases. All these tests introduce scan dump failures with -mtune generic-ooo. Since the vector cost models are the same across all three tunes, some of the tests in PR113249 will be fixed with this patch series. PR target/113249

[PATCH V4 2/5] RISC-V: Add vector related pipelines

2024-02-14 Thread Edwin Lu
Creates new generic vector pipeline file common to all cpu tunes. Moves all vector related pipelines from generic-ooo to generic-vector-ooo. Creates new vector crypto related insn reservations. gcc/ChangeLog: * config/riscv/generic-ooo.md (generic_ooo): Move reservation

[PATCH V4 5/5] RISC-V: Enable assert for insn_has_dfa_reservation

2024-02-14 Thread Edwin Lu
Enables assert that every typed instruction is associated with a dfa reservation gcc/ChangeLog: * config/riscv/riscv.cc (riscv_sched_variable_issue): enable assert Signed-off-by: Edwin Lu --- V2: - No changes V3: - Remove debug statements V4: - no changes ---

[PATCH V4 1/5] RISC-V: Add non-vector types to dfa pipelines

2024-02-14 Thread Edwin Lu
This patch adds non-vector related insn reservations and updates/creates new insn reservations so all non-vector typed instructions have a reservation. gcc/ChangeLog: * config/riscv/generic-ooo.md (generic_ooo_sfb_alu): Add reservation (generic_ooo_branch): ditto *

[PATCH V4 0/5] RISC-V: Associate typed insns to dfa reservation

2024-02-14 Thread Edwin Lu
Previous version (V3 23cd2961bd2ff63583f46e3499a07bd54491d45c) was reverted. Updates all tune insn reservation pipelines to cover all types defined by define_attr "type" in riscv.md. Creates new vector insn reservation pipelines in new file generic-vector-ooo.md which has separate automaton

Re: [PATCH RFA] build: drop target libs from LD_LIBRARY_PATH [PR105688]

2024-02-14 Thread Iain Sandoe
> On 14 Feb 2024, at 22:59, Iain Sandoe wrote: >> On 12 Feb 2024, at 19:59, Jason Merrill wrote: >> >> On 2/10/24 07:30, Iain Sandoe wrote: On 10 Feb 2024, at 12:07, Jason Merrill wrote: On 2/10/24 05:46, Iain Sandoe wrote: >> On 9 Feb 2024, at 23:21, Iain Sandoe wrote:

[PATCH] bpf: fix zero_extendqidi2 ldx template

2024-02-14 Thread David Faust
Commit 77d0f9ec3809b4d2e32c36069b6b9239d301c030 inadvertently changed the normal asm dialect instruction template for zero_extendqidi2 from ldxb to ldxh. Fix that. Tested for bpf-unknown-none on x86_64-linux-gnu host. gcc/ * config/bpf/bpf.md (zero_extendqidi2): Correct asm template to

[PATCH 1/2] doc: Fix some standard named pattern documentation modes

2024-02-14 Thread Andrew Pinski
Currently these use `@var{m3}` but the 3 here is a literal 3 and not part of the mode itself so it should not be inside the var. Fixed as such. Built the documentation to make sure it looks correct now. gcc/ChangeLog: * doc/md.texi (widen_ssum, widen_usum, smulhs, umulhs,

[PATCH 0/2] Some minor internal optabs related fixes

2024-02-14 Thread Andrew Pinski
While working on adding some new vector code to the aarch64 backend, I was confused on which mode was supposed to be used for widen_ssum pattern so I decided to improve the documentation so the next person won't be confused. Andrew Pinski (2): doc: Fix some standard named pattern documentation

[PATCH 2/2] doc: Add documentation of which operand matches the mode of the standard pattern name [PR113508]

2024-02-14 Thread Andrew Pinski
In some of the standard pattern names, it is not obvious which mode is being used in the pattern name. Is it operand 0, 1, or 2? Is it the wider mode or the narrower mode? This fixes that so there is no confusion by adding a sentence to some of them. Built the documentation to make sure that it

Re: [PATCH RFA] build: drop target libs from LD_LIBRARY_PATH [PR105688]

2024-02-14 Thread Iain Sandoe
> On 12 Feb 2024, at 19:59, Jason Merrill wrote: > > On 2/10/24 07:30, Iain Sandoe wrote: >>> On 10 Feb 2024, at 12:07, Jason Merrill wrote: >>> >>> On 2/10/24 05:46, Iain Sandoe wrote: > On 9 Feb 2024, at 23:21, Iain Sandoe wrote: > > > >> On 9 Feb 2024, at 10:56,

[patch, fortran] Bug 105847 - namelist-object-name can be a renamed host associated entity

2024-02-14 Thread Jerry D
Pushed as simple and obvious. Regards, Jerry commit 8221201cc59870579b9dc451b173f94b8d8b0993 (HEAD -> master, origin/master, origin/HEAD) Author: Steve Kargl Date: Wed Feb 14 14:40:16 2024 -0800 Fortran: namelist-object-name renaming. PR fortran/105847

Re: [PATCH][_GLIBCXX_DEBUG] Fix std::__niter_base behavior

2024-02-14 Thread François Dumont
On 14/02/2024 20:44, Jonathan Wakely wrote: On Wed, 14 Feb 2024 at 18:39, François Dumont wrote: libstdc++: [_GLIBCXX_DEBUG] Fix std::__niter_base behavior std::__niter_base is used in _GLIBCXX_DEBUG mode to remove _Safe_iterator<> wrapper on random access iterators. But

Re: [PATCH] aarch64: Reword error message for mismatch guard size and probing interval [PR90155]

2024-02-14 Thread Richard Sandiford
Andrew Pinski writes: > The error message is not clear what options are being talked about when it > says the values > need to match; plus there is a wrong quotation dealing with the diagnostic. > So this changes the error message to be exactly talking about the param > options that > are being

Re: [PATCH] RISC-V: Set require-effective-target rv64 for PR113742

2024-02-14 Thread Edwin Lu
On 2/14/2024 12:09 PM, Robin Dapp wrote: On 2/14/24 20:46, Edwin Lu wrote: The testcase pr113742.c is failing for 32 bit targets due to the following cc1 error: cc1: error: ABI requires '-march=rv64' I think we usually just add exactly this to the test options (so it is always run rather

Re: [PATCH] aarch64: Use vec_perm_indices::new_shrunk_vector in aarch64_evpc_reencode

2024-02-14 Thread Richard Sandiford
Andrew Pinski writes: > While working on PERM related stuff, I came across that aarch64_evpc_reencode > was manually figuring out if we shrink the perm indices instead of > using vec_perm_indices::new_shrunk_vector; shrunk was added after reencode > was added. > > Built and tested for

Re: [PATCH V2] rs6000: New pass for replacement of adjacent loads fusion (lxv).

2024-02-14 Thread Ajit Agarwal
Hello Richard: On 15/02/24 2:21 am, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Richard: >> >> >> On 14/02/24 10:45 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: >> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc >> index 1856fa4884f..ffc47a6eaa0 100644 >> ---

Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

2024-02-14 Thread Ajit Agarwal
Hello Richard: On 15/02/24 1:14 am, Richard Sandiford wrote: > Ajit Agarwal writes: >> On 14/02/24 10:56 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: >> diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc >> index 88ee0dd67fc..a8d0ee7c4db 100644 >> --- a/gcc/df-problems.cc

Re: [PATCH V2] rs6000: New pass for replacement of adjacent loads fusion (lxv).

2024-02-14 Thread Richard Sandiford
Ajit Agarwal writes: > Hello Richard: > > > On 14/02/24 10:45 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc > index 1856fa4884f..ffc47a6eaa0 100644 > --- a/gcc/emit-rtl.cc > +++ b/gcc/emit-rtl.cc > @@ -921,7 +921,7 @@

Re: [PATCH v2 4/4] libstdc++: Optimize std::remove_extent compilation performance

2024-02-14 Thread Patrick Palka
On Wed, 14 Feb 2024, Ken Matsui wrote: > This patch optimizes the compilation performance of std::remove_extent > by dispatching to the new __remove_extent built-in trait. > > libstdc++-v3/ChangeLog: > > * include/std/type_traits (remove_extent): Use __remove_extent > built-in

Re: [PATCH v2 3/4] c++: Implement __remove_extent built-in trait

2024-02-14 Thread Patrick Palka
On Wed, 14 Feb 2024, Ken Matsui wrote: > This patch implements built-in trait for std::remove_extent. > > gcc/cp/ChangeLog: > > * cp-trait.def: Define __remove_extent. > * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_EXTENT. > > gcc/testsuite/ChangeLog: > > *

Re: [PATCH v2 2/4] libstdc++: Optimize std::add_pointer compilation performance

2024-02-14 Thread Patrick Palka
On Wed, 14 Feb 2024, Ken Matsui wrote: > This patch optimizes the compilation performance of std::add_pointer > by dispatching to the new __add_pointer built-in trait. > > libstdc++-v3/ChangeLog: > > * include/std/type_traits (add_pointer): Use __add_pointer > built-in trait. LGTM

Re: [PATCH v2 1/4] c++: Implement __add_pointer built-in trait

2024-02-14 Thread Patrick Palka
On Wed, 14 Feb 2024, Ken Matsui wrote: > This patch implements built-in trait for std::add_pointer. > > gcc/cp/ChangeLog: > > * cp-trait.def: Define __add_pointer. > * semantics.cc (finish_trait_type): Handle CPTK_ADD_POINTER. > > gcc/testsuite/ChangeLog: > > *

[committed] testsuite: Fix a couple of x86 issues in gcc.dg/vect testsuite

2024-02-14 Thread Uros Bizjak
A compile-time test can use -march=skylake-avx512 for all x86 targets, but a runtime test needs to check avx512f effective target if the instructions can be assembled. The runtime test also needs to check if the target machine supports instruction set we have been compiled for. The testsuite

Re: [PATCH] RISC-V: Set require-effective-target rv64 for PR113742

2024-02-14 Thread Robin Dapp
On 2/14/24 20:46, Edwin Lu wrote: > The testcase pr113742.c is failing for 32 bit targets due to the following cc1 > error: > cc1: error: ABI requires '-march=rv64' I think we usually just add exactly this to the test options (so it is always run rather than just on a 64-bit target). Regards

Re: [PATCH v2] x86: Support x32 and IBT in heap trampoline

2024-02-14 Thread H.J. Lu
On Wed, Feb 14, 2024 at 11:59 AM Iain Sandoe wrote: > > > > > On 14 Feb 2024, at 18:12, H.J. Lu wrote: > > > > On Tue, Feb 13, 2024 at 8:46 AM Jakub Jelinek wrote: > >> > >> On Tue, Feb 13, 2024 at 08:40:52AM -0800, H.J. Lu wrote: > >>> Add x32 and IBT support to x86 heap trampoline

Re: [PATCH v2] x86: Support x32 and IBT in heap trampoline

2024-02-14 Thread Jakub Jelinek
On Wed, Feb 14, 2024 at 07:59:26PM +, Iain Sandoe wrote: > I have just one question; > > from your patch the use of endbr* seems to be unconditionally based on the > flags used to build libgcc. > > However, I was expecting that the use of extended trampolines like this would > depend on

Re: [PATCH v2] x86: Support x32 and IBT in heap trampoline

2024-02-14 Thread Iain Sandoe
> On 14 Feb 2024, at 18:12, H.J. Lu wrote: > > On Tue, Feb 13, 2024 at 8:46 AM Jakub Jelinek wrote: >> >> On Tue, Feb 13, 2024 at 08:40:52AM -0800, H.J. Lu wrote: >>> Add x32 and IBT support to x86 heap trampoline implementation with a >>> testcase. >>> >>> 2024-02-13 Jakub Jelinek >>>

[committed] i386: psrlq is not used for PERM [PR113871]

2024-02-14 Thread Uros Bizjak
Introduce vec_shl_ and vec_shr_ expanders to improve '*a = __builtin_shufflevector(*a, (vect64){0}, 1, 2, 3, 4);' and '*a = __builtin_shufflevector((vect64){0}, *a, 3, 4, 5, 6);' shuffles. The generated code improves from: movzwl 6(%rdi), %eax movzwl 4(%rdi), %edx salq

[PATCH] RISC-V: Set require-effective-target rv64 for PR113742

2024-02-14 Thread Edwin Lu
The testcase pr113742.c is failing for 32 bit targets due to the following cc1 error: cc1: error: ABI requires '-march=rv64' Disable testing on rv32 targets PR target/113742 gcc/testsuite/ChangeLog: * gcc.target/riscv/pr113742.c: add require-effective-target Signed-off-by:

Re: [PATCH V2] rs6000: New pass for replacement of adjacent loads fusion (lxv).

2024-02-14 Thread Ajit Agarwal
Hello Richard: On 14/02/24 10:45 pm, Richard Sandiford wrote: > Ajit Agarwal writes: diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc index 1856fa4884f..ffc47a6eaa0 100644 --- a/gcc/emit-rtl.cc +++ b/gcc/emit-rtl.cc @@ -921,7 +921,7 @@ validate_subreg (machine_mode omode,

Re: [PATCH][_GLIBCXX_DEBUG] Fix std::__niter_base behavior

2024-02-14 Thread Jonathan Wakely
On Wed, 14 Feb 2024 at 18:39, François Dumont wrote: > libstdc++: [_GLIBCXX_DEBUG] Fix std::__niter_base behavior > > std::__niter_base is used in _GLIBCXX_DEBUG mode to remove _Safe_iterator<> > wrapper on random access iterators. But doing so it should also preserve > original > behavior to

Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

2024-02-14 Thread Richard Sandiford
Ajit Agarwal writes: > On 14/02/24 10:56 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: > diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc > index 88ee0dd67fc..a8d0ee7c4db 100644 > --- a/gcc/df-problems.cc > +++ b/gcc/df-problems.cc > @@ -3360,7 +3360,7 @@

Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

2024-02-14 Thread Ajit Agarwal
On 14/02/24 10:56 pm, Richard Sandiford wrote: > Ajit Agarwal writes: diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc index 88ee0dd67fc..a8d0ee7c4db 100644 --- a/gcc/df-problems.cc +++ b/gcc/df-problems.cc @@ -3360,7 +3360,7 @@ df_set_unused_notes_for_mw (rtx_insn

[PATCH][_GLIBCXX_DEBUG] Fix std::__niter_base behavior

2024-02-14 Thread François Dumont
libstdc++: [_GLIBCXX_DEBUG] Fix std::__niter_base behavior std::__niter_base is used in _GLIBCXX_DEBUG mode to remove _Safe_iterator<> wrapper on random access iterators. But doing so it should also preserve original behavior to remove __normal_iterator wrapper. libstdc++-v3/ChangeLog:     *

RE: [libatomic PATCH] PR other/113336: Fix libatomic testsuite regressions on ARM.

2024-02-14 Thread Kyrylo Tkachov
> -Original Message- > From: Victor Do Nascimento > Sent: Wednesday, February 14, 2024 5:06 PM > To: Roger Sayle ; gcc-patches@gcc.gnu.org; > Richard Earnshaw > Subject: Re: [libatomic PATCH] PR other/113336: Fix libatomic testsuite > regressions on ARM. > > Though I'm not in a

Re: [PATCH v2] x86: Support x32 and IBT in heap trampoline

2024-02-14 Thread H.J. Lu
On Tue, Feb 13, 2024 at 8:46 AM Jakub Jelinek wrote: > > On Tue, Feb 13, 2024 at 08:40:52AM -0800, H.J. Lu wrote: > > Add x32 and IBT support to x86 heap trampoline implementation with a > > testcase. > > > > 2024-02-13 Jakub Jelinek > > H.J. Lu > > > > libgcc/ > > > > PR

[COMMITTED] aarch64/testsuite: Remove dg-excess-errors from c-c++-common/gomp/pr63328.c and gcc.dg/gomp/pr87895-2.c [PR113861]

2024-02-14 Thread Andrew Pinski
These now pass after r14-6416-gf5fc001a84a7db so let's remove the dg-excess-errors from them. Committed as obvious after a test for aarch64-linux-gnu. gcc/testsuite/ChangeLog: PR testsuite/113861 * c-c++-common/gomp/pr63328.c: Remove dg-excess-errors. *

Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

2024-02-14 Thread Ajit Agarwal
Hello Sam: On 14/02/24 10:50 pm, Sam James wrote: > > Ajit Agarwal writes: > >> Hello Richard: >> >> >> On 14/02/24 4:03 pm, Richard Sandiford wrote: >>> Hi, >>> >>> Thanks for working on this. >>> >>> You posted a version of this patch on Sunday too. If you need to repost >>> to fix bugs or

Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

2024-02-14 Thread Richard Sandiford
Ajit Agarwal writes: >>> diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc >>> index 88ee0dd67fc..a8d0ee7c4db 100644 >>> --- a/gcc/df-problems.cc >>> +++ b/gcc/df-problems.cc >>> @@ -3360,7 +3360,7 @@ df_set_unused_notes_for_mw (rtx_insn *insn, struct >>> df_mw_hardreg *mws, >>>if

Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

2024-02-14 Thread Sam James
Ajit Agarwal writes: > Hello Richard: > > > On 14/02/24 4:03 pm, Richard Sandiford wrote: >> Hi, >> >> Thanks for working on this. >> >> You posted a version of this patch on Sunday too. If you need to repost >> to fix bugs or make other improvements, could you describe the changes >> that

Re: [PATCH V2] rs6000: New pass for replacement of adjacent loads fusion (lxv).

2024-02-14 Thread Richard Sandiford
Ajit Agarwal writes: >>> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc >>> index 1856fa4884f..ffc47a6eaa0 100644 >>> --- a/gcc/emit-rtl.cc >>> +++ b/gcc/emit-rtl.cc >>> @@ -921,7 +921,7 @@ validate_subreg (machine_mode omode, machine_mode imode, >>> return false; >>> >>>/* The subreg

Re: [libatomic PATCH] PR other/113336: Fix libatomic testsuite regressions on ARM.

2024-02-14 Thread Victor Do Nascimento
Though I'm not in a position to approve the patch, I'm happy to confirm the proposed changes look good to me. Thanks for the updated version, Victor On 1/28/24 16:24, Roger Sayle wrote: This patch is a revised version of the fix for PR other/113336. This patch has been tested on

Fix ICE in loop splitting

2024-02-14 Thread Jan Hubicka
Hi, as demonstrated in the testcase, I forgot to check that profile is present in tree-ssa-loop-split. Bootstrapped and regtested x86_64-linux, committed. PR tree-optimization/111054 gcc/ChangeLog: * tree-ssa-loop-split.cc (split_loop): Check for profile being present.

Re: [PATCH] [libiberty] remove TBAA violation in iterative_hash, improve code-gen

2024-02-14 Thread Jakub Jelinek
On Wed, Feb 14, 2024 at 05:09:39PM +0100, Richard Biener wrote: > > > > On 14.02.2024 at 16:22, Jakub Jelinek wrote: > > > > On Wed, Feb 14, 2024 at 04:13:51PM +0100, Richard Biener wrote: > >> The following removes the TBAA violation present in iterative_hash. > >> As we eventually LTO that

Re: [PATCH] [libiberty] remove TBAA violation in iterative_hash, improve code-gen

2024-02-14 Thread Richard Biener
> On 14.02.2024 at 16:22, Jakub Jelinek wrote: > > On Wed, Feb 14, 2024 at 04:13:51PM +0100, Richard Biener wrote: >> The following removes the TBAA violation present in iterative_hash. >> As we eventually LTO that it's important to fix. This also improves >> code generation for the >= 12

Re: [PATCH] coreutils-sum-pr108666.c: fix spurious LLP64 warnings

2024-02-14 Thread Jonathan Yong
On 2/14/24 13:55, David Malcolm wrote: On Fri, 2024-02-02 at 23:55 +, Jonathan Yong wrote: Attached patch OK? Fixes the following warnings: Thanks; looks good to me. Dave Thanks, pushed to master branch.

Re: [PATCH] middle-end/113576 - avoid out-of-bound vector element access

2024-02-14 Thread Richard Sandiford
Richard Biener writes: > On Wed, 14 Feb 2024, Richard Sandiford wrote: > >> Richard Biener writes: >> > The following avoids accessing out-of-bound vector elements when >> > native encoding a boolean vector with sub-BITS_PER_UNIT precision >> > elements. The error was basing the number of

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Andrew Stubbs
On 14/02/2024 13:43, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 14/02/2024 13:27, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 13/02/2024 08:26, Richard Biener wrote: On Mon, 12 Feb 2024, Thomas Schwinge wrote: Hi! On

Re: [PATCH] [libiberty] remove TBAA violation in iterative_hash, improve code-gen

2024-02-14 Thread Jakub Jelinek
On Wed, Feb 14, 2024 at 04:13:51PM +0100, Richard Biener wrote: > The following removes the TBAA violation present in iterative_hash. > As we eventually LTO that it's important to fix. This also improves > code generation for the >= 12 bytes loop by using | to compose the > 4 byte words as at

Re: [PATCH]middle-end: inspect all exits for additional annotations for loop.

2024-02-14 Thread Richard Biener
> On 14.02.2024 at 16:16, Tamar Christina wrote: > > >> >> >> I think this isn't entirely good. For simple cases for >> do {} while the condition ends up in the latch while for while () {} >> loops it ends up in the header. In your case the latch isn't empty >> so it doesn't end up

[PATCH]AArch64: remove ls64 from being mandatory on armv8.7-a..

2024-02-14 Thread Tamar Christina
Hi All, The Arm Architectural Reference Manual (Version J.a, section A2.9 on FEAT_LS64) shows that ls64 is an optional extension and should not be enabled by default for Armv8.7-a. This drops it from the mandatory bits for the architecture and brings GCC in line with LLVM and the architecture.

RE: [PATCH]middle-end: inspect all exits for additional annotations for loop.

2024-02-14 Thread Tamar Christina
> > I think this isn't entirely good. For simple cases for > do {} while the condition ends up in the latch while for while () {} > loops it ends up in the header. In your case the latch isn't empty > so it doesn't end up with the conditional. > > I think your patch is OK to the point of

[PATCH][RFC] tree-optimization/113910 - improve bitmap_hash

2024-02-14 Thread Richard Biener
The following tries to improve the actual hash function for hashing bitmaps. We're still getting collision rates as high as 23 for the testcase in the PR. The following improves this by properly mixing in the bitmap element starting bit number. This brings down the collision rate below 1.4,

[PATCH] [libiberty] remove TBAA violation in iterative_hash, improve code-gen

2024-02-14 Thread Richard Biener
The following removes the TBAA violation present in iterative_hash. As we eventually LTO that it's important to fix. This also improves code generation for the >= 12 bytes loop by using | to compose the 4 byte words as at least GCC 7 and up can recognize that pattern and perform a 4 byte load
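
A minimal sketch of the idea described above: compose a 4-byte word from individual bytes with | and shifts rather than reading through a pointer of a different type. The helper name load_le32 is hypothetical, and the little-endian byte order is an assumption; the actual patch may differ in both.

```cpp
#include <cassert>  // used by the usage checks
#include <cstdint>

// Hypothetical helper: builds a 32-bit word from four bytes, lowest byte
// first. Only unsigned char accesses are used, so no aliasing rule is
// violated, and a compiler that recognizes the pattern can emit one load.
inline std::uint32_t load_le32(const unsigned char* p) {
  return  static_cast<std::uint32_t>(p[0])
       | (static_cast<std::uint32_t>(p[1]) << 8)
       | (static_cast<std::uint32_t>(p[2]) << 16)
       | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Because the byte order is spelled out explicitly, the result is the same on little-endian and big-endian hosts, unlike a type-punned load.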

Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

2024-02-14 Thread Ajit Agarwal
On 14/02/24 7:22 pm, Ajit Agarwal wrote: > Hello Richard: > > > On 14/02/24 4:03 pm, Richard Sandiford wrote: >> Hi, >> >> Thanks for working on this. >> >> You posted a version of this patch on Sunday too. If you need to repost >> to fix bugs or make other improvements, could you describe

[PATCH] Skip gnat.dg/div_zero.adb on RISC-V

2024-02-14 Thread Andreas Schwab
Like AArch64 and POWER, RISC-V does not support trap on zero divide. gcc/testsuite/ * gnat.dg/div_zero.adb: Skip on RISC-V. --- gcc/testsuite/gnat.dg/div_zero.adb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gnat.dg/div_zero.adb

Re: [PATCH] analyzer/pr104308.c: Avoid optimizing away the copies

2024-02-14 Thread David Malcolm
On Tue, 2022-05-03 at 17:29 -0700, Palmer Dabbelt wrote: > The test cases in analyzer/pr104308.c use uninitialized values in a > way > that doesn't plumb through to the return value of the function.  This > allows the accesses to be deleted, which can result in the diagnostic > not firing.

Re: [PATCH] coreutils-sum-pr108666.c: fix spurious LLP64 warnings

2024-02-14 Thread David Malcolm
On Fri, 2024-02-02 at 23:55 +, Jonathan Yong wrote: > Attached patch OK? Fixes the following warnings: Thanks; looks good to me. Dave > coreutils-sum-pr108666.c:17:1: warning: conflicting types for built- > in function ‘memcpy’; expected ‘void *(void *, const void *, long > long unsigned

[PATCH v2 4/4] libstdc++: Optimize std::remove_extent compilation performance

2024-02-14 Thread Ken Matsui
This patch optimizes the compilation performance of std::remove_extent by dispatching to the new __remove_extent built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (remove_extent): Use __remove_extent built-in trait. Signed-off-by: Ken Matsui ---

[PATCH v2 2/4] libstdc++: Optimize std::add_pointer compilation performance

2024-02-14 Thread Ken Matsui
This patch optimizes the compilation performance of std::add_pointer by dispatching to the new __add_pointer built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (add_pointer): Use __add_pointer built-in trait. Signed-off-by: Ken Matsui ---

[PATCH v2 3/4] c++: Implement __remove_extent built-in trait

2024-02-14 Thread Ken Matsui
This patch implements built-in trait for std::remove_extent. gcc/cp/ChangeLog: * cp-trait.def: Define __remove_extent. * semantics.cc (finish_trait_type): Handle CPTK_REMOVE_EXTENT. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of

[PATCH v2 1/4] c++: Implement __add_pointer built-in trait

2024-02-14 Thread Ken Matsui
This patch implements built-in trait for std::add_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __add_pointer. * semantics.cc (finish_trait_type): Handle CPTK_ADD_POINTER. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __add_pointer.

Re: [PATCH] c++: implicitly_declare_fn and access checks [PR113908]

2024-02-14 Thread Jason Merrill
On 2/14/24 08:46, Patrick Palka wrote: On Tue, 13 Feb 2024, Jason Merrill wrote: On 2/13/24 11:49, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, are one or both of these fixes OK for trunk? -- >8 -- Here during ahead of time checking of the non-dependent new-expr

Re: [PATCH V1] Common infrastructure for load-store fusion for aarch64 and rs6000 target

2024-02-14 Thread Ajit Agarwal
Hello Richard: On 14/02/24 4:03 pm, Richard Sandiford wrote: > Hi, > > Thanks for working on this. > > You posted a version of this patch on Sunday too. If you need to repost > to fix bugs or make other improvements, could you describe the changes > that you've made since the previous

Re: [PATCH] testsuite: gdc: Require ucn in gdc.test/runnable/mangle.d etc. [PR104739]

2024-02-14 Thread Iain Buclaw
Excerpts from Rainer Orth's message of February 14, 2024 11:51 am: > gdc.test/runnable/mangle.d and two other tests come out UNRESOLVED on > Solaris with the native assembler: > > UNRESOLVED: gdc.test/runnable/mangle.d compilation failed to produce > executable > UNRESOLVED:

Re: [PATCH] tree-optimization/113910 - huge compile time during PTA

2024-02-14 Thread Richard Biener
On Wed, 14 Feb 2024, Richard Biener wrote: > For the testcase in PR113910 we spend a lot of time in PTA comparing > bitmaps for looking up equivalence class members. This points to > the very weak bitmap_hash function which effectively hashes set > and a subset of not set bits. The following

Re: [PATCH] c++: implicitly_declare_fn and access checks [PR113908]

2024-02-14 Thread Patrick Palka
On Tue, 13 Feb 2024, Jason Merrill wrote: > On 2/13/24 11:49, Patrick Palka wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu, are one or > > both of these fixes OK for trunk? > > > > -- >8 -- > > > > Here during ahead of time checking of the non-dependent new-expr we > > synthesize

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Richard Biener
On Wed, 14 Feb 2024, Andrew Stubbs wrote: > On 14/02/2024 13:27, Richard Biener wrote: > > On Wed, 14 Feb 2024, Andrew Stubbs wrote: > > > >> On 13/02/2024 08:26, Richard Biener wrote: > >>> On Mon, 12 Feb 2024, Thomas Schwinge wrote: > >>> > Hi! > > On 2023-10-20T12:51:03+0100,

Re: [PATCH V2] rs6000: New pass for replacement of adjacent loads fusion (lxv).

2024-02-14 Thread Ajit Agarwal
Hello Alex: On 24/01/24 10:13 pm, Alex Coplan wrote: > Hi Ajit, > > On 21/01/2024 19:57, Ajit Agarwal wrote: >> >> Hello All: >> >> New pass to replace adjacent memory addresses lxv with lxvp. >> Added common infrastructure for load store fusion for >> different targets. > > Thanks for this, it

Re: [PATCH]middle-end: inspect all exits for additional annotations for loop.

2024-02-14 Thread Richard Biener
On Wed, 14 Feb 2024, Tamar Christina wrote: > Hi All, > > Attaching a pragma to a loop which has a complex condition often gets the > pragma > dropped. e.g. > > #pragma GCC novector > while (i < N && parse_tables_n--) > > before lowering this is represented as: > > if (ANNOTATE_EXPR ) ...

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Andrew Stubbs
On 14/02/2024 13:27, Richard Biener wrote: On Wed, 14 Feb 2024, Andrew Stubbs wrote: On 13/02/2024 08:26, Richard Biener wrote: On Mon, 12 Feb 2024, Thomas Schwinge wrote: Hi! On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: I've committed this patch ... as commit

Re: [PATCH v2] c++: Defer emitting inline variables [PR113708]

2024-02-14 Thread Jason Merrill
On 2/14/24 06:03, Nathaniel Shead wrote: On Tue, Feb 13, 2024 at 09:47:27PM -0500, Jason Merrill wrote: On 2/13/24 20:34, Nathaniel Shead wrote: On Tue, Feb 13, 2024 at 06:08:42PM -0500, Jason Merrill wrote: On 2/11/24 08:26, Nathaniel Shead wrote: Currently inline vars imported from

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

2024-02-14 Thread Jan Hubicka
> [Public] > > Hi, > > >>I assume the znver5 costs are same as znver4 so far? > > Costing table updated for below entries. > + {COSTS_N_INSNS (10), /* cost of a divide/mod for QI. */ > + COSTS_N_INSNS (11), /* HI. */

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Richard Biener
On Wed, 14 Feb 2024, Andrew Stubbs wrote: > On 13/02/2024 08:26, Richard Biener wrote: > > On Mon, 12 Feb 2024, Thomas Schwinge wrote: > > > >> Hi! > >> > >> On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: > >>> I've committed this patch > >> > >> ... as commit

RE: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

2024-02-14 Thread Anbazhagan, Karthiban
[Public] Hi, >>I assume the znver5 costs are same as znver4 so far? Costing table updated for below entries. + {COSTS_N_INSNS (10), /* cost of a divide/mod for QI. */ + COSTS_N_INSNS (11), /* HI. */ +

Re: [patch, libgfortran] PR99210 X editing for reading file with encoding='utf-8'

2024-02-14 Thread FX Coudert
> Regression tested on x86_64 and new test case. > OK for trunk? OK, and thanks! FX

[PATCH]middle-end: inspect all exits for additional annotations for loop.

2024-02-14 Thread Tamar Christina
Hi All, Attaching a pragma to a loop which has a complex condition often gets the pragma dropped. e.g. #pragma GCC novector while (i < N && parse_tables_n--) before lowering this is represented as: if (ANNOTATE_EXPR ) ... But after lowering the condition is broken apart and attached to

[PATCH] testsuite: Fix guality/ipa-sra-1.c to work with return IPA-VRP

2024-02-14 Thread Martin Jambor
Hi, the test guality/ipa-sra-1.c stopped working after r14-5628-g53ba8d669550d3 because the variable from which the values of removed parameters could be calculated is also removed with it. Fixed with this patch which stops a function from returning a constant. I have also noticed that the

Re: GCN RDNA2+ vs. GCC vectorizer "Reduce using vector shifts"

2024-02-14 Thread Andrew Stubbs
On 13/02/2024 08:26, Richard Biener wrote: On Mon, 12 Feb 2024, Thomas Schwinge wrote: Hi! On 2023-10-20T12:51:03+0100, Andrew Stubbs wrote: I've committed this patch ... as commit c7ec7bd1c6590cf4eed267feab490288e0b8d691 "amdgcn: add -march=gfx1030 EXPERIMENTAL". The RDNA2 ISA variant

Re: [PATCH] middle-end/113576 - avoid out-of-bound vector element access

2024-02-14 Thread Richard Biener
On Wed, 14 Feb 2024, Richard Sandiford wrote: > Richard Biener writes: > > The following avoids accessing out-of-bound vector elements when > > native encoding a boolean vector with sub-BITS_PER_UNIT precision > > elements. The error was basing the number of elements to extract > > on the

Re: [PATCH] middle-end/113576 - zero padding of vector bools when expanding compares

2024-02-14 Thread Richard Biener
On Wed, 14 Feb 2024, Richard Sandiford wrote: > Richard Biener writes: > > The following zeros paddings of vector bools when expanding compares > > and the mode used for the compare is an integer mode. In that case > > targets cannot distinguish between a 4 element and 8 element vector > >

[PATCH] tree-optimization/113910 - huge compile time during PTA

2024-02-14 Thread Richard Biener
For the testcase in PR113910 we spend a lot of time in PTA comparing bitmaps for looking up equivalence class members. This points to the very weak bitmap_hash function which effectively hashes set and a subset of not set bits. The following improves it by mixing that weak result with the

[PATCH 2/2] libstdc++: Optimize std::add_pointer compilation performance

2024-02-14 Thread Ken Matsui
This patch optimizes the compilation performance of std::add_pointer by dispatching to the new __add_pointer built-in trait. libstdc++-v3/ChangeLog: * include/std/type_traits (add_pointer): Use __add_pointer built-in trait. Signed-off-by: Ken Matsui ---

[PATCH 1/2] c++: Implement __add_pointer built-in trait

2024-02-14 Thread Ken Matsui
This patch implements built-in trait for std::add_pointer. gcc/cp/ChangeLog: * cp-trait.def: Define __add_pointer. * semantics.cc (finish_trait_type): Handle CPTK_ADD_POINTER. gcc/testsuite/ChangeLog: * g++.dg/ext/has-builtin-1.C: Test existence of __add_pointer.

[PATCH][GCC 12] tree-optimization/113896 - reduction of permuted external vector

2024-02-14 Thread Richard Biener
The following fixes eliding of the permutation of a BB reduction of an existing vector which breaks materialization of live lanes as we fail to permute the SLP_TREE_SCALAR_STMTS vector. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/113896 *

Re: [PATCH] middle-end/113576 - avoid out-of-bound vector element access

2024-02-14 Thread Richard Sandiford
Richard Biener writes: > The following avoids accessing out-of-bound vector elements when > native encoding a boolean vector with sub-BITS_PER_UNIT precision > elements. The error was basing the number of elements to extract > on the rounded up total byte size involved and the patch bases >

Re: [PATCH][GCC 12] aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

2024-02-14 Thread Richard Sandiford
Alex Coplan writes: > This is a backport of the GCC 13 fix for PR111677 to the GCC 12 branch. > The only part of the patch that isn't a straight cherry-pick is due to > the TX iterator lacking TDmode for GCC 12, so this version adjusts > TX_V16QI accordingly. > > Bootstrapped/regtested on

Re: [PATCH] middle-end/113576 - zero padding of vector bools when expanding compares

2024-02-14 Thread Richard Sandiford
Richard Biener writes: > The following zeros paddings of vector bools when expanding compares > and the mode used for the compare is an integer mode. In that case > targets cannot distinguish between a 4 element and 8 element vector > compare (both get to the QImode compare optab) so we have to

Re: [PATCH v2] c++: Defer emitting inline variables [PR113708]

2024-02-14 Thread Nathaniel Shead
On Tue, Feb 13, 2024 at 09:47:27PM -0500, Jason Merrill wrote: > On 2/13/24 20:34, Nathaniel Shead wrote: > > On Tue, Feb 13, 2024 at 06:08:42PM -0500, Jason Merrill wrote: > > > On 2/11/24 08:26, Nathaniel Shead wrote: > > > > > > > > Currently inline vars imported from modules aren't correctly

RE: [PATCH] arm/aarch64: Add bti for all functions [PR106671]

2024-02-14 Thread Kyrylo Tkachov
Hi Feng, > -Original Message- > From: Gcc-patches bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Feng Xue OS > via Gcc-patches > Sent: Wednesday, August 2, 2023 4:49 PM > To: gcc-patches@gcc.gnu.org > Subject: [PATCH] arm/aarch64: Add bti for all functions [PR106671] > > This

[PATCH] testsuite: gdc: Require ucn in gdc.test/runnable/mangle.d etc. [PR104739]

2024-02-14 Thread Rainer Orth
gdc.test/runnable/mangle.d and two other tests come out UNRESOLVED on Solaris with the native assembler: UNRESOLVED: gdc.test/runnable/mangle.d compilation failed to produce executable UNRESOLVED: gdc.test/runnable/mangle.d -shared-libphobos compilation failed to produce executable
