Re: [PATCH 1/2] aarch64: PR target/115457 Implement missing __ARM_FEATURE_BF16 macro

2024-07-04 Thread Kyrylo Tkachov
> On 3 Jul 2024, at 11:59, Kyrylo Tkachov wrote: > > Hi all, > > The ACLE asks the user to test for __ARM_FEATURE_BF16 before using the > header but GCC doesn't set this up. > LLVM does, so this is an inconsistency between the compilers. > > This patch enables th
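The guard the message refers to can be sketched as below. This is a minimal illustration, not code from the patch: it assumes the header in question is <arm_bf16.h>, and the helper name have_acle_bf16 is made up for the example. On a compiler that does not define __ARM_FEATURE_BF16 the header is never included, so the file still builds.

```c
#include <assert.h>

/* Assumption: <arm_bf16.h> is the ACLE header guarded by this macro. */
#if defined(__ARM_FEATURE_BF16)
#include <arm_bf16.h>
#define HAVE_ACLE_BF16 1
#else
#define HAVE_ACLE_BF16 0
#endif

/* Hypothetical helper: reports whether the BF16 path was compiled in. */
int have_acle_bf16(void)
{
  return HAVE_ACLE_BF16;
}
```

With a compiler that leaves the macro undefined even when the feature is enabled, this guard silently takes the fallback branch, which is the inconsistency the patch is described as fixing.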

Re: [PATCH v1 0/2] Aarch64: addp NEON big-endian fix [PR114890]

2024-07-04 Thread Kyrylo Tkachov
regards, > Alfie > > Sent from Outlook for iOS > From: Kyrylo Tkachov > Sent: Wednesday, July 3, 2024 11:23:37 AM > To: Alfie Richards > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH v1 0/2] Aarch64: addp NEON big-endian fix [PR114890] Hi > Alfie, >

Re: [PATCH] match.pd: Fold x/sqrt(x) to sqrt(x)

2024-07-03 Thread Kyrylo Tkachov
> On 3 Jul 2024, at 14:22, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Wed, 3 Jul 2024, Jennifer Schmitz wrote: > >> > > OK. I’ve pushed the patch on Jennifer’s behalf with 8dc5ad3ce8d4d2cd6cc2b7516d282395502fdf7d . One thing I noticed is

Re: [PATCH v1 0/2] Aarch64: addp NEON big-endian fix [PR114890]

2024-07-03 Thread Kyrylo Tkachov
Hi Alfie, > On 3 Jul 2024, at 12:10, alfie.richa...@arm.com wrote: > > External email: Use caution opening links or attachments > > > From: Alfie Richards > > Hi All, > > This fixes a case where the operands for the addp NEON intrinsic were > erroneously swapped. > > Regtested on

Re: [PATCH 1/2] aarch64: PR target/115457 Implement missing __ARM_FEATURE_BF16 macro

2024-07-03 Thread Kyrylo Tkachov
> On 3 Jul 2024, at 11:59, Kyrylo Tkachov wrote: > > External email: Use caution opening links or attachments > > > Hi all, > > The ACLE asks the user to test for __ARM_FEATURE_BF16 before using the > header but GCC doesn't set this up. > LLVM does, so th

[PATCH 2/2] aarch64: PR target/115475 Implement missing __ARM_FEATURE_SVE_BF16 macro

2024-07-03 Thread Kyrylo Tkachov
/testsuite/ PR target/115475 * gcc.target/aarch64/acle/bf16_sve_feature.c: New test. Signed-off-by: Kyrylo Tkachov 0002-aarch64-PR-target-115475-Implement-missing-__ARM_FEA.patch Description: 0002-aarch64-PR-target-115475-Implement-missing-__ARM_FEA.patch

[PATCH 1/2] aarch64: PR target/115457 Implement missing __ARM_FEATURE_BF16 macro

2024-07-03 Thread Kyrylo Tkachov
for TARGET_BF16_FP. gcc/testsuite/ PR target/115457 * gcc.target/aarch64/acle/bf16_feature.c: New test. Signed-off-by: Kyrylo Tkachov 0001-aarch64-PR-target-115457-Implement-missing-__ARM_FEA.patch Description: 0001-aarch64-PR-target-115457-Implement-missing-__ARM_FEA.patch

Re: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-28 Thread Kyrylo Tkachov
On 27 Jun 2024, at 16:58, Tamar Christina wrote: External email: Use caution opening links or attachments -Original Message- From: Kyrylo Tkachov <ktkac...@nvidia.com> Sent: Thursday, June 27, 2024 3:49 PM To: Tamar Christina <tamar.christ...@arm.com> Cc:

Re: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-27 Thread Kyrylo Tkachov
Hi Tamar, Thanks for going through the docs here, > On 27 Jun 2024, at 16:19, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi Kyrill, > >> -Original Message----- >> From: Kyrylo Tkachov >> Sent: Thu

[PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-27 Thread Kyrylo Tkachov
Hi all, According to the TRM for Neoverse V2 the Memory Tagging and RNG features are optional configurations of the core and may not always be present. Therefore -mcpu=neoverse-v2 shouldn't enable them, similar to how the crypto extensions aren’t enabled by default. Bootstrapped and tested on
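One way user code can cope with the features being optional is to test the ACLE feature macros rather than assume them from the CPU name. The sketch below is illustrative only: the function names are invented, and it assumes that a build wanting these features would request them explicitly with the +rng and +memtag extension suffixes on -mcpu.

```c
#include <assert.h>

/* __ARM_FEATURE_RNG and __ARM_FEATURE_MEMORY_TAGGING are the ACLE
   feature macros for the RNG and memory tagging extensions.  */

/* Hypothetical helper: 1 if this translation unit was built with RNG.  */
int built_with_rng(void)
{
#if defined(__ARM_FEATURE_RNG)
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: 1 if this translation unit was built with MTE.  */
int built_with_mte(void)
{
#if defined(__ARM_FEATURE_MEMORY_TAGGING)
  return 1;
#else
  return 0;
#endif
}
```

If the change works as described, a plain -mcpu=neoverse-v2 build would make both helpers return 0 unless the extensions are added explicitly.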

[PATCH][GCC 11] aarch64: Add support for -mcpu=grace

2024-06-27 Thread Kyrylo Tkachov
Hi all, This is the GCC 11 (and last) version of the patch. Pushing to the branch. Thanks, Kyrill grace-11.patch Description: grace-11.patch

Re: [PATCH][GCC 14] aarch64: Add support for -mcpu=grace

2024-06-27 Thread Kyrylo Tkachov
The subject line should, of course, say [GCC 13] rather than [GCC 14] > On 27 Jun 2024, at 10:21, Kyrylo Tkachov wrote: > > External email: Use caution opening links or attachments > > > Hi all, > > This is the GCC 13 version of the patch. > Pushing to the branch. > Thanks, > Kyrill > >

[PATCH][GCC 12] aarch64: Add support for -mcpu=grace

2024-06-27 Thread Kyrylo Tkachov
Hi all, This is the GCC 12 version of the patch, still using the AARCH64_FL_* syntax for option extensions. Pushing to the branch. Thanks, Kyrill grace-12.patch Description: grace-12.patch

[PATCH][GCC 14] aarch64: Add support for -mcpu=grace

2024-06-27 Thread Kyrylo Tkachov
Hi all, This is the GCC 13 version of the patch. Pushing to the branch. Thanks, Kyrill grace-13.patch Description: grace-13.patch

[PATCH][GCC 14] aarch64: Add support for -mcpu=grace

2024-06-27 Thread Kyrylo Tkachov
Hi all, This is the GCC 14 version of the patch. It’s the same as the trunk one, but aarch64-tune.md regeneration looks different. Pushing to the branch. Thanks, Kyrill grace-14.patch Description: grace-14.patch

Re: [PATCH] aarch64: Add support for -mcpu=grace

2024-06-27 Thread Kyrylo Tkachov
Hi Andrew, > On 26 Jun 2024, at 23:02, Andrew Pinski wrote: > > External email: Use caution opening links or attachments > > > On Wed, Jun 26, 2024 at 12:40 AM Kyrylo Tkachov wrote: >> >> Hi all, >> >> This adds support for the NVIDIA Grace CPU to

PR target/115618: can we back port the fix to GCC 13?

2024-06-26 Thread Kyrylo Tkachov
Hi Andrew, I’ve tested the fix for PR 115618 from your commit r14-6612-g8d30107455f230 on the GCC 13 branch. I’d like to back port it to that branch. Is there any problem with that I should be aware of? It applies cleanly and tests fine. Thanks, Kyrill

[PATCH] aarch64: Add support for -mcpu=grace

2024-06-26 Thread Kyrylo Tkachov
/invoke.texi (AArch64 Options): Document the above. Signed-off-by: Kyrylo Tkachov grace.patch Description: grace.patch

[MAINTAINERS] Update my email address

2024-06-18 Thread Kyrylo Tkachov
Hi all, Pushing to trunk. Thanks, Kyrill Signed-off-by: Kyrylo Tkachov * MAINTAINERS (aarch64 port): Update my email address. (DCO section): Likewise. maintainers.patch Description: maintainers.patch

Re: [PATCH 0/2] aarch64: Small cleanups of the cavium cores

2024-06-18 Thread Kyrylo Tkachov
Hi Andrew, > On 18 Jun 2024, at 05:40, Andrew Pinski wrote: > > External email: Use caution opening links or attachments > > > While thinking about the variant patch I had posted, I went back to > look at the original cores which used the variant and saw there > was a small cleanup for them since

Re: [PATCH] aarch64: Improve popcount for bytes [PR113042]

2024-06-10 Thread Kyrylo Tkachov
Hi Andrew -Original Message- From: Andrew Pinski <quic_apin...@quicinc.com> Date: Monday, 10 June 2024 at 06:05 To: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org> Cc: Andrew Pinski <quic_apin...@quicinc.com> Subject: [PATCH]

[MAINTAINERS] Update my email address and step down as arm port maintainer

2024-04-04 Thread Kyrylo Tkachov
Hi all, I'm stepping down as arm maintainer. Realistically I won't have good access to arm hardware to test patches for the port in the foreseeable future, or at least the more active M-profile parts of it. I'm still happy to keep helping with AArch64 though. I'm also adding myself to the DCO

[PATCH][wwwdocs] changes.html changes for AArch64 for GCC 14.1

2024-04-02 Thread Kyrylo Tkachov
Hi all, Here's a writeup of the AArch64 changes to highlight in GCC 14.1. If there's something you'd like to highlight feel free to comment or add a patch yourself. I don't expect the list to be exhaustive. It's been a busy release for AArch64! Thanks, Kyrill gcc-14-aarch64-wwwdocs.patch

RE: [PATCH] aarch64: Align lrcpc3 FEAT_STRING with /proc/cpuinfo 'Features' entry

2024-03-25 Thread Kyrylo Tkachov
Hi Victor, > -Original Message- > From: Victor Do Nascimento > Sent: Monday, March 25, 2024 10:59 AM > To: gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov ; Richard Sandiford > ; Richard Earnshaw > ; Victor Do Nascimento > > Subject: [PATCH] aarch64: Align lr

RE: [libatomic PATCH] PR other/113336: Fix libatomic testsuite regressions on ARM.

2024-02-14 Thread Kyrylo Tkachov
> -Original Message- > From: Victor Do Nascimento > Sent: Wednesday, February 14, 2024 5:06 PM > To: Roger Sayle ; gcc-patches@gcc.gnu.org; > Richard Earnshaw > Subject: Re: [libatomic PATCH] PR other/113336: Fix libatomic testsuite > regressions on ARM. > > Though I'm not in a

RE: [PATCH] arm/aarch64: Add bti for all functions [PR106671]

2024-02-14 Thread Kyrylo Tkachov
Hi Feng, > -Original Message- > From: Gcc-patches bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Feng Xue OS > via Gcc-patches > Sent: Wednesday, August 2, 2023 4:49 PM > To: gcc-patches@gcc.gnu.org > Subject: [PATCH] arm/aarch64: Add bti for all functions [PR106671] > > This

RE: [PATCH] AArch64: Add -mcpu=cobalt-100

2024-01-25 Thread Kyrylo Tkachov
> -Original Message- > From: Wilco Dijkstra > Sent: Thursday, January 25, 2024 5:00 PM > To: Kyrylo Tkachov ; GCC Patches patc...@gcc.gnu.org> > Cc: Richard Earnshaw ; Richard Sandiford > > Subject: Re: [PATCH] AArch64: Add -mcpu=cobalt-100 > > Hi,

RE: [PATCH] aarch64: Re-enable ldp/stp fusion pass

2024-01-24 Thread Kyrylo Tkachov
Hi Alex, > -Original Message- > From: Alex Coplan > Sent: Wednesday, January 24, 2024 8:34 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Earnshaw ; Richard Sandiford > ; Kyrylo Tkachov ; > Jakub Jelinek > Subject: [PATCH] aarch64: Re-enable ldp/stp fusion

RE: [PATCH v2 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-17 Thread Kyrylo Tkachov
Hi Andre, > -Original Message- > From: Andre Vieira > Sent: Friday, January 5, 2024 5:52 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Earnshaw ; Stam Markianos-Wright > > Subject: [PATCH v2 2/2] arm: Add support for MVE Tail-Predicated Low Overhead > Loops > > Respin after comments

RE: [PATCH] aarch64: Fix aarch64_ldp_reg_operand predicate not to allow all subreg [PR113221]

2024-01-17 Thread Kyrylo Tkachov
> -Original Message- > From: Andrew Pinski > Sent: Wednesday, January 17, 2024 3:29 AM > To: gcc-patches@gcc.gnu.org > Cc: Alex Coplan ; Andrew Pinski > > Subject: [PATCH] aarch64: Fix aarch64_ldp_reg_operand predicate not to allow > all subreg [PR113221] > > So the problem here is

RE: [PATCH] AArch64: Add -mcpu=cobalt-100

2024-01-16 Thread Kyrylo Tkachov
> -Original Message- > From: Wilco Dijkstra > Sent: Tuesday, January 16, 2024 5:23 PM > To: GCC Patches > Cc: Kyrylo Tkachov ; Richard Earnshaw > ; Richard Sandiford > > Subject: [PATCH] AArch64: Add -mcpu=cobalt-100 > > > Add support

RE: [PATCH][wwwdoc] gcc-14: Add arm cortex-m52 cpu support

2024-01-10 Thread Kyrylo Tkachov
> -Original Message- > From: Chung-Ju Wu > Sent: Wednesday, January 10, 2024 7:07 AM > To: Gerald Pfeifer ; gcc-patches patc...@gcc.gnu.org> > Cc: Kyrylo Tkachov ; Richard Earnshaw > ; Sudakshina Das ; > jason...@anshingtek.com.tw > Subject: [PATCH][wwwdoc]

RE: [PATCH]Arm: Update early-break tests to accept thumb output too.

2024-01-09 Thread Kyrylo Tkachov
> -Original Message- > From: Tamar Christina > Sent: Tuesday, January 9, 2024 12:02 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ni...@redhat.com; Kyrylo Tkachov > Subject: [PATCH]Arm: Update early-break tests to accept thumb output too. >

RE: [PATCH 2/2] arm: Add cortex-m52 doc

2024-01-08 Thread Kyrylo Tkachov
> -Original Message- > From: Chung-Ju Wu > Sent: Monday, January 8, 2024 6:17 AM > To: gcc-patches ; Kyrylo Tkachov > ; Richard Earnshaw > Cc: jason...@anshingtek.com.tw > Subject: [PATCH 2/2] arm: Add cortex-m52 doc > > Hi, > > This is the patch to

RE: [PATCH 1/2] arm: Add cortex-m52 core

2024-01-08 Thread Kyrylo Tkachov
Hi jasonwucj, > -Original Message- > From: Chung-Ju Wu > Sent: Monday, January 8, 2024 6:16 AM > To: gcc-patches ; Kyrylo Tkachov > ; Richard Earnshaw > Cc: jason...@anshingtek.com.tw > Subject: [PATCH 1/2] arm: Add cortex-m52 core > > Hi, > > Recen

RE: [PATCH 20/21]Arm: Add Advanced SIMD cbranch implementation

2024-01-04 Thread Kyrylo Tkachov
Hi Tamar, > -Original Message- > From: Tamar Christina > Sent: Thursday, January 4, 2024 11:06 AM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Ramana Radhakrishnan > ; Richard Earnshaw > ; ni...@redhat.com; Kyrylo Tkachov > > Subject: RE: [PA

RE: [PATCH] aarch64: Fix parens in aarch64_stp_reg_operand [PR113061]

2023-12-19 Thread Kyrylo Tkachov
> -Original Message- > From: Alex Coplan > Sent: Monday, December 18, 2023 10:29 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Earnshaw ; Richard Sandiford > ; Kyrylo Tkachov > Subject: [PATCH] aarch64: Fix parens in aarch64_stp_reg_operand [PR113

RE: [PATCH] aarch64: Add an early RA for strided registers

2023-12-05 Thread Kyrylo Tkachov
Hi Richard, > -Original Message- > From: Richard Sandiford > Sent: Monday, November 20, 2023 12:16 PM > To: gcc-patches@gcc.gnu.org > Subject: [PATCH] aarch64: Add an early RA for strided registers > > [Yeah, I just missed the stage1 deadline, sorry. But this is gated > behind several

RE: [PATCH v2 3/5] aarch64: Sync `aarch64-sys-regs.def' with Binutils.

2023-11-28 Thread Kyrylo Tkachov
Hi Victor, > -Original Message- > From: Victor Do Nascimento > Sent: Tuesday, November 28, 2023 3:56 PM > To: gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov ; Richard Sandiford > ; Richard Earnshaw > ; Victor Do Nascimento > > Subject: [PATCH v2 3/5] aarch64:

RE: [PATCH 20/21]Arm: Add Advanced SIMD cbranch implementation

2023-11-27 Thread Kyrylo Tkachov
Hi Tamar, > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:43 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Ramana Radhakrishnan > ; Richard Earnshaw > ; ni...@redhat.com; Kyrylo Tkachov > > Subject: [PATCH 20/21]Arm: Add Advanced

RE: [PATCH 21/21]Arm: Add MVE cbranch implementation

2023-11-27 Thread Kyrylo Tkachov
Hi Tamar, > -Original Message- > From: Tamar Christina > Sent: Monday, November 6, 2023 7:43 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Ramana Radhakrishnan > ; Richard Earnshaw > ; ni...@redhat.com; Kyrylo Tkachov > > Subject: [PATCH 21/21]Arm: Add MVE cbr

RE: [PATCH]AArch64: fix aarch64_usubw pattern

2023-11-22 Thread Kyrylo Tkachov
Hi Tamar, > -Original Message- > From: Tamar Christina > Sent: Wednesday, November 22, 2023 10:20 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > Marcus Shawcroft ; Kyrylo Tkachov > ; Richard Sandiford > Subject: [PATCH]AArch64: fix aarch64_u

RE: [PATCH 6/6] arm: [MVE intrinsics] rework vldq1 vst1q

2023-11-16 Thread Kyrylo Tkachov
> -Original Message- > From: Christophe Lyon > Sent: Thursday, November 16, 2023 3:26 PM > To: gcc-patches@gcc.gnu.org; Richard Sandiford > ; Richard Earnshaw > ; Kyrylo Tkachov > Cc: Christophe Lyon > Subject: [PATCH 6/6] arm: [MVE intrinsics] rework vldq1 vs

RE: [PATCH 4/6] arm: [MVE intrinsics] add load and store shapes

2023-11-16 Thread Kyrylo Tkachov
> -Original Message- > From: Christophe Lyon > Sent: Thursday, November 16, 2023 3:26 PM > To: gcc-patches@gcc.gnu.org; Richard Sandiford > ; Richard Earnshaw > ; Kyrylo Tkachov > Cc: Christophe Lyon > Subject: [PATCH 4/6] arm: [MVE intrinsics] a

RE: [PATCH 3/6] arm: [MVE intrinsics] Add support for contiguous loads and stores

2023-11-16 Thread Kyrylo Tkachov
> -Original Message- > From: Christophe Lyon > Sent: Thursday, November 16, 2023 3:26 PM > To: gcc-patches@gcc.gnu.org; Richard Sandiford > ; Richard Earnshaw > ; Kyrylo Tkachov > Cc: Christophe Lyon > Subject: [PATCH 3/6] arm: [MVE intrinsics] Add supp

RE: [PATCH 2/6] arm: [MVE intrinsics] Add support for void and load/store pointers as argument types.

2023-11-16 Thread Kyrylo Tkachov
> -Original Message- > From: Christophe Lyon > Sent: Thursday, November 16, 2023 3:26 PM > To: gcc-patches@gcc.gnu.org; Richard Sandiford > ; Richard Earnshaw > ; Kyrylo Tkachov > Cc: Christophe Lyon > Subject: [PATCH 2/6] arm: [MVE intrinsics] Add support f

RE: [PATCH 1/6] arm: Fix arm_simd_types and MVE scalar_types

2023-11-16 Thread Kyrylo Tkachov
> -Original Message- > From: Christophe Lyon > Sent: Thursday, November 16, 2023 3:26 PM > To: gcc-patches@gcc.gnu.org; Richard Sandiford > ; Richard Earnshaw > ; Kyrylo Tkachov > Cc: Christophe Lyon > Subject: [PATCH 1/6] arm: Fix arm_simd_types and MVE sca

RE: [PATCH 5/6] arm: [MVE intrinsics] fix vst1 tests

2023-11-16 Thread Kyrylo Tkachov
> -Original Message- > From: Christophe Lyon > Sent: Thursday, November 16, 2023 3:26 PM > To: gcc-patches@gcc.gnu.org; Richard Sandiford > ; Richard Earnshaw > ; Kyrylo Tkachov > Cc: Christophe Lyon > Subject: [PATCH 5/6] arm: [MVE intrinsics] fix vst1 tes

RE: [PATCH] aarch64: costs: update for TARGET_CSSC

2023-11-16 Thread Kyrylo Tkachov
> -Original Message- > From: Richard Earnshaw > Sent: Thursday, November 16, 2023 8:53 AM > To: Philipp Tomsich ; gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov > Subject: Re: [PATCH] aarch64: costs: update for TARGET_CSSC > > > > On 16/11/202

RE: [PATCH] Add a REG_P check for inc and dec for Arm MVE

2023-11-14 Thread Kyrylo Tkachov
Hi Saurabh, > -Original Message- > From: Saurabh Jha > Sent: Thursday, November 9, 2023 10:12 AM > To: gcc-patches@gcc.gnu.org; Richard Earnshaw > ; Richard Sandiford > > Subject: [PATCH] Add a REG_P check for inc and dec for Arm MVE > > Hey, > > This patch tightens

RE: [PATCH] AArch64: Cleanup memset expansion

2023-11-10 Thread Kyrylo Tkachov
> -Original Message- > From: Richard Earnshaw > Sent: Friday, November 10, 2023 11:31 AM > To: Wilco Dijkstra ; Kyrylo Tkachov > ; GCC Patches > Cc: Richard Sandiford ; Richard Earnshaw > > Subject: Re: [PATCH] AArch64: Cleanup memset expansion > >

RE: [PATCH] libatomic: Improve ifunc selection on AArch64

2023-11-10 Thread Kyrylo Tkachov
> -Original Message- > From: Wilco Dijkstra > Sent: Friday, November 10, 2023 10:23 AM > To: Kyrylo Tkachov ; GCC Patches patc...@gcc.gnu.org>; Richard Sandiford > Subject: Re: [PATCH] libatomic: Improve ifunc selection on AArch64 > > Hi Kyrill, > >

RE: [PATCH] libatomic: Improve ifunc selection on AArch64

2023-11-10 Thread Kyrylo Tkachov
Hi Wilco, > -Original Message- > From: Wilco Dijkstra > Sent: Monday, November 6, 2023 12:13 PM > To: GCC Patches ; Richard Sandiford > > Cc: Kyrylo Tkachov > Subject: Re: [PATCH] libatomic: Improve ifunc selection on AArch64 > > > > ping >

RE: [PATCH] AArch64: Cleanup memset expansion

2023-11-10 Thread Kyrylo Tkachov
Hi Wilco, > -Original Message- > From: Wilco Dijkstra > Sent: Monday, November 6, 2023 12:12 PM > To: GCC Patches > Cc: Richard Sandiford ; Richard Earnshaw > > Subject: Re: [PATCH] AArch64: Cleanup memset expansion > > ping > > Cleanup memset implementation.  Similar to

RE: [PATCH v4] aarch64: Fine-grained policies to control ldp-stp formation.

2023-09-27 Thread Kyrylo Tkachov
Hi Manos, > -Original Message- > From: Manos Anagnostakis > Sent: Tuesday, September 26, 2023 2:52 PM > To: gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov ; Tamar Christina > ; Philipp Tomsich ; > Manos Anagnostakis > Subject: [PATCH v4] aarch64: Fine-grained pol

RE: [PATCH v3] aarch64: Fine-grained policies to control ldp-stp formation.

2023-09-26 Thread Kyrylo Tkachov
> -Original Message- > From: Kyrylo Tkachov > Sent: Tuesday, September 26, 2023 9:36 AM > To: Manos Anagnostakis ; gcc- > patc...@gcc.gnu.org > Cc: Philipp Tomsich ; Andrew Pinski > > Subject: RE: [PATCH v3] aarch64: Fine-grained policies to control ldp-stp

RE: [PATCH v3] aarch64: Fine-grained policies to control ldp-stp formation.

2023-09-26 Thread Kyrylo Tkachov
Tomsich ; Kyrylo Tkachov ; Andrew Pinski Subject: Re: [PATCH v3] aarch64: Fine-grained policies to control ldp-stp formation. Thank you Andrew for the input. I've prepared a patch using --param with enum, which seems a more suitable approach to me as strings are more descriptive as well

RE: [PATCH v3] aarch64: Fine-grained policies to control ldp-stp formation.

2023-09-26 Thread Kyrylo Tkachov
> -Original Message- > From: Andrew Pinski > Sent: Monday, September 25, 2023 9:05 PM > To: Philipp Tomsich > Cc: Manos Anagnostakis ; gcc- > patc...@gcc.gnu.org; Kyrylo Tkachov > Subject: Re: [PATCH v3] aarch64: Fine-grained policies to control ldp-stp > for

RE: [PATCH] aarch64: Fine-grained ldp and stp policies with test-cases.

2023-09-25 Thread Kyrylo Tkachov
Hi Manos, Apologies for the long delay. > -Original Message- > From: Manos Anagnostakis > Sent: Friday, August 18, 2023 8:50 AM > To: gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov ; Philipp Tomsich > ; Manos Anagnostakis > > Subject: [PATCH] aarch64: Fine-grai

RE: [PING][PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2023-09-14 Thread Kyrylo Tkachov via Gcc-patches
Hi Stam, > -Original Message- > From: Stam Markianos-Wright > Sent: Wednesday, September 6, 2023 6:19 PM > To: gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov ; Richard Earnshaw > > Subject: [PING][PATCH 2/2] arm: Add support for MVE Tail-Predicated Low > Ov

RE: [PING][PATCH 1/2] arm: Add define_attr to create a mapping between MVE predicated and unpredicated insns

2023-09-14 Thread Kyrylo Tkachov via Gcc-patches
Hi Stam, > -Original Message- > From: Stam Markianos-Wright > Sent: Wednesday, September 6, 2023 6:19 PM > To: gcc-patches@gcc.gnu.org > Cc: Kyrylo Tkachov ; Richard Earnshaw > > Subject: [PING][PATCH 1/2] arm: Add define_attr to create a mapping >

RE: [PATCH 1/9] arm: [MVE intrinsics] factorize vmullbq vmulltq

2023-08-22 Thread Kyrylo Tkachov via Gcc-patches
Hi Christophe, > -Original Message- > From: Christophe Lyon > Sent: Monday, August 14, 2023 7:34 PM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Earnshaw ; Richard Sandiford > > Cc: Christophe Lyon > Subject: [PATCH 1/9] arm: [MVE intrinsics] f

RE: [PATCH] arm: [MVE intrinsics] Remove dead check for float type in parse_element_type

2023-08-22 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Christophe Lyon > Sent: Monday, August 14, 2023 7:10 PM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Earnshaw ; Richard Sandiford > > Cc: Christophe Lyon > Subject: [PATCH] arm: [MVE intrinsics] Remove d

RE: [PATCH] arm: [MVE intrinsics] fix binary_acca_int32 and binary_acca_int64 shapes

2023-08-22 Thread Kyrylo Tkachov via Gcc-patches
Hi Christophe, > -Original Message- > From: Christophe Lyon > Sent: Monday, August 14, 2023 7:01 PM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Earnshaw ; Richard Sandiford > > Cc: Christophe Lyon > Subject: [PATCH] arm: [MVE intrinsic

RE: [PING][PATCH] arm: Remove unsigned variant of vcaddq_m

2023-08-21 Thread Kyrylo Tkachov via Gcc-patches
Ok. Thanks, Kyrill From: Stam Markianos-Wright Sent: Saturday, August 19, 2023 12:42 PM To: gcc-patches@gcc.gnu.org Cc: Kyrylo Tkachov ; Richard Earnshaw Subject: [PING][PATCH] arm: Remove unsigned variant of vcaddq_m (Pinging since I realised that this is required for my later Low

RE: [PATCH 1/6] arm: [MVE intrinsics] Factorize vcaddq vhcaddq

2023-07-14 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Christophe Lyon > Sent: Thursday, July 13, 2023 11:22 AM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Earnshaw ; Richard Sandiford > > Cc: Christophe Lyon > Subject: [PATCH 1/6] arm: [MVE intrinsics] Factorize vcadd

RE: [PATCH 2/2] [testsuite, arm]: Make mve_fp_fpu[12].c accept single or double precision FPU

2023-07-14 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Christophe Lyon > Sent: Thursday, July 13, 2023 11:22 AM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Earnshaw > Cc: Christophe Lyon > Subject: [PATCH 2/2] [testsuite,arm]: Make mve_fp_fpu[12].c accept single or

RE: [PATCH 1/2] [testsuite,arm]: Make nomve_fp_1.c require arm_fp

2023-07-14 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Christophe Lyon > Sent: Thursday, July 13, 2023 11:22 AM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Earnshaw > Cc: Christophe Lyon > Subject: [PATCH 1/2] [testsuite,arm]: Make nomve_fp_1.c require arm_fp >

RE: [PATCH] testsuite: Add _link flavor for several arm_arch* and arm* effective-targets

2023-07-10 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Christophe Lyon > Sent: Monday, July 10, 2023 2:59 PM > To: Kyrylo Tkachov > Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw > > Subject: Re: [PATCH] testsuite: Add _link flavor for several arm_arch* and > arm* effective-targets

RE: [PATCH v2] arm: Fix MVE intrinsics support with LTO (PR target/110268)

2023-07-10 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Christophe Lyon > Sent: Monday, July 10, 2023 2:09 PM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Earnshaw > Cc: Christophe Lyon > Subject: [PATCH v2] arm: Fix MVE intrinsics support with LTO (PR > target/110268)

RE: [PATCH] doc: Document arm_v8_1m_main_cde_mve_fp

2023-07-10 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Christophe Lyon > Sent: Friday, July 7, 2023 8:52 AM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Earnshaw > Cc: Christophe Lyon > Subject: [PATCH] doc: Document arm_v8_1m_main_cde_mve_fp > > The arm_v8_1m_main_cde

RE: [PATCH] testsuite: Add _link flavor for several arm_arch* and arm* effective-targets

2023-07-10 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Christophe Lyon > Sent: Friday, July 7, 2023 8:52 AM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Earnshaw > Cc: Christophe Lyon > Subject: [PATCH] testsuite: Add _link flavor for several arm_arch* and arm* > effect

RE: [PATCH] arm: Fix MVE intrinsics support with LTO (PR target/110268)

2023-07-06 Thread Kyrylo Tkachov via Gcc-patches
Hi Christophe, > -Original Message- > From: Christophe Lyon > Sent: Thursday, July 6, 2023 4:21 PM > To: Kyrylo Tkachov > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH] arm: Fix MVE intrinsics support with LTO (PR > target/110268) >

RE: [PATCH] arm: Fix MVE intrinsics support with LTO (PR target/110268)

2023-07-05 Thread Kyrylo Tkachov via Gcc-patches
Hi Christophe, > -Original Message- > From: Christophe Lyon > Sent: Monday, June 26, 2023 4:03 PM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; > Richard Sandiford > Cc: Christophe Lyon > Subject: [PATCH] arm: Fix MVE intrinsics support with LTO (PR targ

[PATCH][committed] aarch64: Use <DWI> instead of <V2XWIDE> in scalar SQRSHRUN pattern

2023-06-26 Thread Kyrylo Tkachov via Gcc-patches
Hi all, In the scalar pattern for SQRSHRUN it's a bit clearer to use DWI instead of V2XWIDE, as that makes it explicit that no vector modes are involved. No behavioural change intended. Bootstrapped and tested on aarch64-none-linux-gnu. Pushing to trunk. Thanks, Kyrill gcc/ChangeLog: *

[PATCH][committed] aarch64: Clean up some rounding immediate predicates

2023-06-26 Thread Kyrylo Tkachov via Gcc-patches
Hi all, aarch64_simd_rsra_rnd_imm_vec is now used for more than just RSRA and accepts more than just vectors so rename it to make it more truthful. The aarch64_simd_rshrn_imm_vec is now unused and can be deleted. No behavioural change intended. Bootstrapped and tested on aarch64-none-linux-gnu.

[PATCH][committed] aarch64: Avoid same input and output Z register for gather loads

2023-06-21 Thread Kyrylo Tkachov via Gcc-patches
Hi all, The architecture recommends that load-gather instructions avoid using the same Z register for the load address and the destination, and the Software Optimization Guides for Arm cores recommend that as well. This means that for code like: #include <arm_sve.h> svuint64_t food (svbool_t p, uint64_t

[PATCH][committed] aarch64: Convert SVE gather patterns to compact syntax

2023-06-21 Thread Kyrylo Tkachov via Gcc-patches
Hi all, This patch converts the SVE load gather patterns to the new compact syntax that Tamar introduced. This allows for a future patch I want to contribute to add more alternatives that are better viewed in the more compact form. The lines in some patterns are >80 long now, but I think that's

[PATCH][committed] aarch64: Optimise ADDP with same source operands

2023-06-20 Thread Kyrylo Tkachov via Gcc-patches
Hi all, We've been asked to optimise the testcase in this patch of a 64-bit ADDP with the low and high halves of the same 128-bit vector. This can be done by a single .4s ADDP followed by just reading the bottom 64 bits. A splitter for this is quite straightforward now that all the vec_concat
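The equivalence behind this optimisation can be checked with a scalar model. The sketch below is only an illustration under stated assumptions: lanes are modelled as array elements in little-endian lane order, the .4s ADDP is assumed to take the same 128-bit vector as both of its inputs, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit ADDP applied to the low half {v0, v1} and high half {v2, v3}.  */
void addp_low_high(const uint32_t v[4], uint32_t out[2])
{
  out[0] = v[0] + v[1];
  out[1] = v[2] + v[3];
}

/* 128-bit (.4s) ADDP of v with itself, keeping only the bottom 64 bits.  */
void addp_4s_then_low(const uint32_t v[4], uint32_t out[2])
{
  uint32_t cat[8], full[4];

  /* Concatenate the two (identical) inputs.  */
  for (int i = 0; i < 4; i++)
    {
      cat[i] = v[i];
      cat[4 + i] = v[i];
    }

  /* Add adjacent pairs of the concatenation.  */
  for (int i = 0; i < 4; i++)
    full[i] = cat[2 * i] + cat[2 * i + 1];

  /* Read the bottom two lanes only.  */
  out[0] = full[0];
  out[1] = full[1];
}
```

Both functions produce {v0 + v1, v2 + v3}; the upper two lanes of the .4s result simply repeat those sums and are discarded.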

[PATCH][4/5] aarch64: [US]Q(R)SHR(U)N2 refactoring

2023-06-16 Thread Kyrylo Tkachov via Gcc-patches
This patch is large in lines of code, but it is a fairly regular extension of the first patch as it converts the high-half patterns to standard RTL codes in the same fashion as the first patch did for the low-half ones. This now allows us to remove the unspec codes for these instructions as there

[PATCH][0/5][committed] aarch64: Reimplement [US]Q(R)SHR(U)N(2) patterns with standard RTL codes

2023-06-16 Thread Kyrylo Tkachov via Gcc-patches
Hi all, This patch series reimplements the MD patterns for the instructions that perform narrowing right shifts with optional rounding and saturation using standard RTL codes rather than unspecs. This includes the scalar forms and the *2 forms that write to the high half of the result vector.

[PATCH][2/5] aarch64: [US]Q(R)SHR(U)N scalar forms refactoring

2023-06-16 Thread Kyrylo Tkachov via Gcc-patches
Some instructions from the previous patch have scalar forms: SQSHRN,SQRSHRN,UQSHRN,UQRSHRN,SQSHRUN,SQRSHRUN. This patch converts the patterns for these to use standard RTL codes. Their MD patterns deviate slightly from the vector forms mostly due to things like operands being scalar rather than

[PATCH][5/5] aarch64: Handle ASHIFTRT in patterns for shrn2

2023-06-16 Thread Kyrylo Tkachov via Gcc-patches
Similar to the low-half patterns, we want to match both ashiftrt and lshiftrt with the truncate for SHRN2. We reuse the SHIFTRT iterator and the AARCH64_VALID_SHRN_OP check to help, but because we expand the high-half patterns by their gen_* names we need to disambiguate all the different

[PATCH][3/5] aarch64: Add ASHIFTRT handling for shrn pattern

2023-06-16 Thread Kyrylo Tkachov via Gcc-patches
The first patch in the series has some fallout in the testsuite, particularly gcc.target/aarch64/shrn-combine-2.c. Our previous patterns for SHRN matched both (truncate (ashiftrt (x) (N))) and (truncate (lshiftrt (x) (N))) as these are equivalent for the shift amounts involved. In our refactoring,
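Why the two shift forms agree after truncation can be seen from a small scalar model. This is a sketch under assumptions, not the patch's code: it models a 16-bit to 8-bit narrowing, takes the valid shift amounts to be 1 to 8, and uses invented function names. The arithmetic and logical shifts differ only in the top `shift` bits of the 16-bit result, and for those shift amounts the truncation discards all of them.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift right of a 16-bit value, written out portably by
   filling the vacated top bits with copies of the sign bit.  */
static uint16_t asr16(uint16_t x, unsigned shift)
{
  uint16_t r = (uint16_t) (x >> shift);
  if (x & 0x8000u)
    r |= (uint16_t) (0xFFFFu << (16 - shift));
  return r;
}

/* Model of (truncate (lshiftrt x shift)) for a 16 -> 8 bit narrowing.  */
uint8_t shrn_logical(uint16_t x, unsigned shift)
{
  assert(shift >= 1 && shift <= 8);
  return (uint8_t) (x >> shift);
}

/* Model of (truncate (ashiftrt x shift)) for a 16 -> 8 bit narrowing.  */
uint8_t shrn_arith(uint16_t x, unsigned shift)
{
  assert(shift >= 1 && shift <= 8);
  return (uint8_t) asr16(x, shift);
}
```

For example, with x = 0x8000 and a shift of 8 the logical form gives 0x0080 and the arithmetic form 0xFF80, and both truncate to 0x80.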

[PATCH][1/5] aarch64: Reimplement [US]Q(R)SHR(U)N patterns with RTL codes

2023-06-16 Thread Kyrylo Tkachov via Gcc-patches
This patch reimplements the MD patterns for the instructions that perform narrowing right shifts with optional rounding and saturation using standard RTL codes rather than unspecs. There are four groups of patterns involved: * Simple narrowing shifts with optional signed or unsigned truncation:
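As a reading aid, the behaviour of a signed narrowing right shift with optional rounding and saturation can be modelled in scalar C. This is a sketch of my understanding of the instruction semantics for a 32-bit to 16-bit narrowing, not the RTL from the patch; the function names and the `rounding` flag are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Shift right rounding towards negative infinity, written without
   relying on implementation-defined shifts of negative values.  */
static int64_t floor_shift(int64_t v, unsigned shift)
{
  if (v >= 0)
    return v >> shift;
  return -((-v + (((int64_t) 1 << shift) - 1)) >> shift);
}

/* Hypothetical model: shift a 32-bit value right, optionally rounding,
   then saturate the result to the signed 16-bit range.  */
int16_t sat_shrn_32_to_16(int32_t x, unsigned shift, int rounding)
{
  assert(shift >= 1 && shift <= 16);

  int64_t v = x;
  if (rounding)
    v += (int64_t) 1 << (shift - 1);   /* rounding constant */
  v = floor_shift(v, shift);

  if (v > INT16_MAX)
    return INT16_MAX;
  if (v < INT16_MIN)
    return INT16_MIN;
  return (int16_t) v;
}
```

Each step (add the rounding constant, shift, clamp, truncate) corresponds to an ordinary arithmetic operation, which is what makes a representation in standard RTL codes possible in place of an opaque unspec.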

[PATCH] simplify-rtx: Simplify VEC_CONCAT of SUBREG and VEC_CONCAT from same vector

2023-06-16 Thread Kyrylo Tkachov via Gcc-patches
Hi all, In the testcase for this patch we try to vec_concat the lowpart and highpart of a vector, but the lowpart is expressed as a subreg. simplify-rtx.cc does not recognise this and combine ends up trying to match: Trying 7 -> 8: 7: r93:V2SI=vec_select(r95:V4SI,parallel) 8:

RE: [PATCH v2] [PR96339] Optimise svlast[ab]

2023-06-14 Thread Kyrylo Tkachov via Gcc-patches
> -Original Message- > From: Gcc-patches bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Prathamesh > Kulkarni via Gcc-patches > Sent: Wednesday, June 14, 2023 8:13 AM > To: Tejas Belagod > Cc: Richard Sandiford ; gcc- > patc...@gcc.gnu.org > Subject: Re: [PATCH v2] [PR96339]

[PATCH][committed] arm: Extend -mtp= arguments

2023-06-13 Thread Kyrylo Tkachov via Gcc-patches
Hi all, After discussing the -mtp= option with Arm's LLVM developers we'd like to extend the functionality of the option somewhat. There are actually 3 system registers that can be accessed for the thread pointer in aarch32: tpidrurw, tpidruro, tpidrprw. They are all read through the CP15

[PATCH][committed] aarch64: Extend -mtp= arguments

2023-06-13 Thread Kyrylo Tkachov via Gcc-patches
Hi all, After discussing the -mtp= option with Arm's LLVM developers we'd like to extend the functionality of the option somewhat. First of all, there is another TPIDR register that can be used to read the thread pointer: TPIDRRO_EL0 (which can also be accessed by AArch32 under another name) so

RE: [PATCH] simplify-rtx: Implement constant folding of SS_TRUNCATE, US_TRUNCATE

2023-06-12 Thread Kyrylo Tkachov via Gcc-patches
Hi Richard, > -Original Message- > From: Richard Sandiford > Sent: Friday, June 9, 2023 7:08 PM > To: Kyrylo Tkachov via Gcc-patches > Cc: Kyrylo Tkachov > Subject: Re: [PATCH] simplify-rtx: Implement constant folding of > SS_TRUNCATE, US_TRUNCATE > > Kyr

[PATCH] simplify-rtx: Implement constant folding of SS_TRUNCATE, US_TRUNCATE

2023-06-08 Thread Kyrylo Tkachov via Gcc-patches
Hi all, This patch implements RTL constant-folding for the SS_TRUNCATE and US_TRUNCATE codes. The semantics are a clamping operation on the argument with the min and max of the narrow mode, followed by a truncation. The signedness of the clamp and the min/max extrema is derived from the
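
As a rough illustration of the clamp-then-truncate semantics described above, here is a scalar C sketch. It is not the simplify-rtx code from the patch; the function names and the `bits` parameter are assumptions made for the example, and it takes the usual reading in which SS_TRUNCATE clamps to the signed range of the narrow mode and US_TRUNCATE to its unsigned range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of SS_TRUNCATE: clamp a signed value to the
   signed range of a narrow mode that is `bits` wide.  After the
   clamp, the truncation cannot lose information.  */
static int64_t ss_truncate(int64_t x, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;
    int64_t min = -max - 1;
    if (x < min)
        return min;
    if (x > max)
        return max;
    return x;
}

/* Hypothetical model of US_TRUNCATE: clamp an unsigned value to the
   unsigned maximum of the narrow mode.  */
static uint64_t us_truncate(uint64_t x, unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    uint64_t max = ((uint64_t)1 << bits) - 1;
    return x > max ? max : x;
}
```

With an 8-bit narrow mode, for instance, a signed input of 300 saturates to 127 and an unsigned input of 300 saturates to 255.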

[PATCH][committed] aarch64: Represent SQXTUN with RTL operations

2023-06-07 Thread Kyrylo Tkachov via Gcc-patches
Hi all, This patch removes UNSPEC_SQXTUN and uses organic RTL codes to represent the operation. SQXTUN is an odd one. It's described in the architecture as "Signed saturating extract Unsigned Narrow". It's not a straightforward ss_truncate nor a us_truncate. It is a sort of truncating signed
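
To make the "signed in, unsigned out" behaviour concrete, here is a scalar C sketch of the instruction's architectural description for one 32-bit to 16-bit lane. The function name is hypothetical, and the sketch says nothing about which RTL codes the patch actually uses.

```c
#include <stdint.h>

/* Hypothetical scalar model of SQXTUN on one lane: the input is
   treated as signed, the result is saturated to the unsigned range
   of the narrow element.  */
static uint16_t sqxtun_s32(int32_t x)
{
    if (x < 0)
        return 0;              /* negative inputs saturate to zero */
    if (x > UINT16_MAX)
        return UINT16_MAX;     /* large inputs saturate to the unsigned maximum */
    return (uint16_t)x;
}
```

The lower bound of zero is what separates this from a plain signed saturating truncation, and the signed interpretation of the input is what separates it from an unsigned one.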

[PATCH][committed] aarch64: Improve RTL representation of ADDP instructions

2023-06-07 Thread Kyrylo Tkachov via Gcc-patches
Hi all, Similar to the ADDLP instructions the non-widening ADDP ones can be represented by adding the odd lanes with the even lanes of a vector. These instructions take two vector inputs and the architecture spec describes the operation as concatenating them together before going through it with
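
The pairwise operation described above can be sketched in scalar C as follows. This is only an illustration of the instruction's behaviour as the snippet describes it, with a hypothetical function name and unsigned 32-bit lanes chosen so that overflow wraps; it is not the RTL from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a non-widening ADDP: conceptually
   concatenate a (low half) and b (high half), then add each even
   lane to the odd lane that follows it.  */
static void addp_u32(const uint32_t *a, const uint32_t *b,
                     uint32_t *out, size_t lanes)
{
    assert(lanes % 2 == 0);
    size_t half = lanes / 2;
    for (size_t i = 0; i < half; i++) {
        out[i]        = a[2 * i] + a[2 * i + 1];   /* pairs from the first input */
        out[half + i] = b[2 * i] + b[2 * i + 1];   /* pairs from the second input */
    }
}
```

For inputs {1, 2, 3, 4} and {10, 20, 30, 40} this yields {3, 7, 30, 70}: the low half of the result comes from the first input and the high half from the second.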

[PATCH][committed] aarch64: Improve representation of vpaddd intrinsics

2023-06-06 Thread Kyrylo Tkachov via Gcc-patches
Hi all, The aarch64_addpdi pattern is redundant as the reduc_plus_scal_ pattern can already generate the required form of the ADDP instruction, and is mostly folded to GIMPLE early on so can benefit from more optimisations. Though it turns out that we were missing the folding for the unsigned

[PATCH][committed] aarch64: Reimplement URSHR,SRSHR patterns with standard RTL codes

2023-06-06 Thread Kyrylo Tkachov via Gcc-patches
Hi all, Having converted the patterns for the URSRA,SRSRA instructions to standard RTL codes we can also easily convert the non-accumulating forms URSHR,SRSHR. This patch does that, reusing the various helpers and predicates from that patch in a straightforward way. This allows GCC to perform

[PATCH][committed] aarch64: Simplify SHRN, RSHRN expanders and patterns

2023-06-06 Thread Kyrylo Tkachov via Gcc-patches
Hi all, Now that we've got the annotations we can get rid of explicit !BYTES_BIG_ENDIAN and BYTES_BIG_ENDIAN patterns for the narrowing shift instructions. This allows us to clean up the expanders as well. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. Pushing to

[PATCH][committed] aarch64: Improve representation of ADDLV instructions

2023-06-06 Thread Kyrylo Tkachov via Gcc-patches
Hi all, We've received requests to optimise the attached intrinsics testcase. We currently generate: foo_1: uaddlp v0.4s, v0.8h uaddlv d31, v0.4s fmov x0, d31 ret foo_2: uaddlp v0.4s, v0.8h addv s31, v0.4s fmov w0, s31

[PATCH][committed] aarch64: Add =r, m and =m, r alternatives to 64-bit vector move patterns

2023-06-01 Thread Kyrylo Tkachov via Gcc-patches
Hi all, We can use the X registers to load and store 64-bit vector modes, we just need to add the alternatives to the mov patterns. This straightforward patch does that and for the pair variants too. For the testcase in the code we now generate the optimal assembly without any superfluous

[PATCH][committed] aarch64: PR target/99195 Annotate dot-product patterns for vec-concat-zero

2023-05-31 Thread Kyrylo Tkachov via Gcc-patches
Hi all, This straightforward patch annotates the dotproduct instructions, including the i8mm ones. Tests included. Nothing unexpected here. Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf. Pushing to trunk. Thanks, Kyrill gcc/ChangeLog: PR target/99195

[PATCH][committed] aarch64: PR target/99195 Annotate saturating mult patterns for vec-concat-zero

2023-05-31 Thread Kyrylo Tkachov via Gcc-patches
Hi all, This patch goes through the various alphabet soup saturating multiplication patterns, including those in TARGET_RDMA and annotates them with . Many other patterns are widening and always write the full 128-bit vectors so this annotation doesn't apply to them. Nothing out of the ordinary
