Re: [PATCH] aarch64: Add SVE instruction types

2023-09-12 Thread Evandro Menezes via Gcc-patches
of memory ops through TARGET_SCHED_ADJUST_PRIORITY, but it was ineffective. I’m a bit at a loss as to what’s likely going on with the RA at this point. Any pointers? Thank you, -- Evandro Menezes > On May 16, 2023, at 03:36, Kyrylo Tkachov > wrote: > > Hi Evandro, >

Re: [PATCH] aarch64: Add SVE instruction types

2023-05-16 Thread Evandro Menezes via Gcc-patches
> I think that was more down to my rushed model rather than anything else > though. > > Thanks, > Kyrill > > From: Evandro Menezes > Sent: Monday, May 15, 2023 9:13 PM > To: Kyrylo Tkachov > Cc: Richard Sandiford ; Evandro Menezes via > Gcc-patches ; evandro+.

Re: [PATCH] aarch64: Add SVE instruction types

2023-05-15 Thread Evandro Menezes via Gcc-patches
to mention with regard to granularity? Yes, my intent for this patch is to enable modeling the SVE instructions on N1. The patch that implements it brings up some performance improvements, but it’s mostly flat, as expected. Thank you, -- Evandro Menezes > On May 15, 2023, at 04:49, Kyr

Re: [PATCH] aarch64: Add SVE instruction types

2023-05-15 Thread Evandro Menezes via Gcc-patches
instructions in its group. Do you have specific instances in mind? Thank you, -- Evandro Menezes > On May 15, 2023, at 04:00, Richard Sandiford > wrote: > > Evandro Menezes via Gcc-patches writes: >> This patch adds the attribute `type` to most SVE1 instructions, a

[PATCH] aarch64: Add SVE instruction types

2023-05-12 Thread Evandro Menezes via Gcc-patches
This patch adds the attribute `type` to most SVE1 instructions, as in the other instructions. -- Evandro Menezes 0002-aarch64-Add-SVE-instruction-types.patch Description: Binary data

aarch64: Add scheduling model for Neoverse V1

2023-05-07 Thread Evandro Menezes via Gcc-patches
This patch adds the scheduling model for Neoverse V1, based on the information from the “Arm Neoverse V1 Software Optimization Guide” and on static and dynamic analysis of internal and public benchmarks. Results are forthcoming. -- Evandro Menezes 0001-aarch64-Add-scheduling-model

Re: [PATCH] aarch64: Add the cost model for Neoverse N1

2023-04-24 Thread Evandro Menezes via Gcc-patches
Sorry, but it seems that, before sending, the email client is stripping leading spaces. I’m attaching the file here. -- Evandro Menezes ◊ evan...@yahoo.com ◊ Austin, TX Άγιος ο Θεός ⁂ ܩܕܝܫܐ ܐܢ̱ܬ ܠܐ ܡܝܘܬܐ ⁂ Sanctus Deus > On Apr 24, 2023, at 17:48, Evandro Menezes > wrote:

Re: [PATCH] aarch64: Add the cost model for Neoverse N1

2023-04-24 Thread Evandro Menezes via Gcc-patches
Hi, Tamar. Does this work? Thank you, -- Evandro Menezes ◊ evan...@yahoo.com ◊ Austin, TX Άγιος ο Θεός ⁂ ܩܕܝܫܐ ܐܢ̱ܬ ܠܐ ܡܝܘܬܐ ⁂ Sanctus Deus > On Apr 24, 2023, at 12:37, Tamar Christina > wrote: > > Hi Evandro, > > I wanted to give this patch a try, but the

[PATCH] aarch64: Add the cost model for Neoverse N1

2023-04-18 Thread Evandro Menezes via Gcc-patches
This patch adds the cost model for Neoverse N1, based on the information from the “Arm Neoverse N1 Software Optimization Guide”. -- Evandro Menezes gcc/ChangeLog: * config/aarch64/aarch64-cores.def

[PATCH] aarch64: Add the scheduling model for Neoverse N1

2023-04-18 Thread Evandro Menezes via Gcc-patches
This patch adds the scheduling model for Neoverse N1, based on the information from the “Arm Neoverse N1 Software Optimization Guide”. -- Evandro Menezes gcc/ChangeLog: * config/aarch64/aarch64-core

Re: [PATCH] aarch64: Add the cost and scheduling models for Neoverse N1

2023-04-17 Thread Evandro Menezes via Gcc-patches
Hi, Kyrylo. > Em 11 de abr. de 2023, à(s) 04:41, Kyrylo Tkachov > escreveu: > >> -Original Message- >> From: Gcc-patches > bounces+kyrylo.tkachov=arm@gcc.gnu.org >> <mailto:bounces+kyrylo.tkachov=arm@gcc.gnu.org>> On Behalf Of Evandro >

[PATCH] aarch64: Add the cost and scheduling models for Neoverse N1

2023-04-07 Thread Evandro Menezes via Gcc-patches
This patch adds the cost and scheduling models for Neoverse N1, based on the information from the “Arm Neoverse N1 Software Optimization Guide”. -- Evandro Menezes ◊ evan...@yahoo.com [PATCH] aarch64: Add the cost and scheduling models for Neoverse N1 gcc/ChangeLog: * config/aa

Re: [PATCH][AArch64] Allow multiple-of-8 immediate offsets for TImode LDP/STP

2016-07-13 Thread Evandro Menezes
stp x2, x3, [x0] ret whereas with this patch we generate: bar: ldp x2, x3, [x1, 8] stp x2, x3, [x0, 8] ret Bootstrapped and tested on aarch64-none-linux-gnu. Ok for trunk? LGTM -- Evandro Menezes

Re: [PATCH][AArch64] Increase code alignment

2016-06-29 Thread Evandro Menezes
-systems.com; Evandro Menezes Subject: [PATCH][AArch64] Increase code alignment Increase loop alignment on Cortex cores to 8 and set function alignment to 16. This makes things consistent across big.LITTLE cores, improves performance of benchmarks with tight loops and reduces performance

RE: [PATCH][AArch64] Enable -frename-registers at -O2 and higher

2016-06-16 Thread Evandro Menezes
er benchmarks tonight, but I'm leaning towards having it as a > target specific extra tuning option. The results are in and -frename-registers is not a good idea for Exynos M1. Thank you, -- Evandro Menezes Austin, TX

RE: [PATCH][AArch64] Enable -frename-registers at -O2 and higher

2016-06-15 Thread Evandro Menezes
provements for me to be comfortable with -frename-registers being a generic default for AArch64. I'll run some larger benchmarks tonight, but I'm leaning towards having it as a target specific extra tuning option. Thank you, -- Evandro Menezes

Re: [PATCH 3/3][AArch64] Emit division using the Newton series

2016-06-14 Thread Evandro Menezes
On 06/14/16 03:28, Christophe Lyon wrote: On 13 June 2016 at 21:06, Evandro Menezes <e.mene...@samsung.com> wrote: On 06/13/16 05:15, James Greenhalgh wrote: Thanks for your patience on this patch series. Just checked the series in. If I'm not mistaken, it looks like you forgot to

Re: [PATCH 3/3][AArch64] Emit division using the Newton series

2016-06-13 Thread Evandro Menezes
On 06/13/16 05:15, James Greenhalgh wrote: Thanks for your patience on this patch series. Just checked the series in. Thank y'all for your assistance and patience. Cheers, -- Evandro Menezes

Re: [PATCH][AArch64] Increase code alignment

2016-06-03 Thread Evandro Menezes
On 06/03/16 17:22, Evandro Menezes wrote: On 06/03/16 05:51, Wilco Dijkstra wrote: It looks like almost all AArch64 cores agree on alignment of 16 for function, and 8 for loops and branches, so we should change -mcpu=generic as well if there is no disagreement - feedback welcome. I'll see what

Re: [PATCH][AArch64] Increase code alignment

2016-06-03 Thread Evandro Menezes
comfortable with, but I also wonder if the -falign-labels shouldn't also be a parameter in tune_params. Thoughts? -- Evandro Menezes

Re: [PATCH 3/3][AArch64] Emit division using the Newton series

2016-06-03 Thread Evandro Menezes
Rebasing the patch... -- Evandro Menezes From d791090aae6a29fa94d8fc10894ee1053b05bcc2 Mon Sep 17 00:00:00 2001 From: Evandro Menezes <e.mene...@samsung.com> Date: Mon, 4 Apr 2016 14:02:24 -0500 Subject: [PATCH 3/3] [AArch64] Emit division using the Newton series 2016-04-04 Evandr

Re: [PATCH 1/3][AArch64] Add more choices for the reciprocal square root approximation

2016-06-03 Thread Evandro Menezes
On 06/01/16 03:35, James Greenhalgh wrote: On Fri, May 27, 2016 at 05:57:23PM -0500, Evandro Menezes wrote: From 86d7690632d03ec85fd69bfaef8e89c0542518ad Mon Sep 17 00:00:00 2001 From: Evandro Menezes <e.mene...@samsung.com> Date: Thu, 3 Mar 2016 18:13:46 -0600 Subject: [PATCH 1/3] [A

Re: [PATCH 2/3][AArch64] Emit square root using the Newton series

2016-06-03 Thread Evandro Menezes
On 06/01/16 04:00, James Greenhalgh wrote: On Fri, May 27, 2016 at 05:57:26PM -0500, Evandro Menezes wrote: 2016-04-04 Evandro Menezes <e.mene...@samsung.com> Wilco Dijkstra <wilco.dijks...@arm.com> gcc/ * config/aarch64/aarc

Re: [PATCH][AArch64] Cleanup -mpc-relative-loads

2016-06-03 Thread Evandro Menezes
On 06/03/16 07:56, Wilco Dijkstra wrote: This patch cleans up the -mpc-relative-loads option processing. Rename to avoid the "no*" name and confusing !no* expressions. Fix the option processing code to implement -mno-pc-relative-loads rather than ignore it. OK for commit? LGTM

Re: [PATCH][wwwdocs][AArch64] Mention -mcpu=qdf24xx support for GCC 6

2016-06-03 Thread Evandro Menezes
On 06/02/16 09:54, Kyrill Tkachov wrote: The Qualcomm QDF24xx processor is now supported via the Shouldn't this read "The Qualcomm QDF24xx processors are now supported via the"? Not that I have a strong opinion about it, but, otherwise, OK. -- Evandro Menezes

Re: [PATCH 3/3][AArch64] Emit division using the Newton series

2016-05-31 Thread Evandro Menezes
On 05/31/16 04:27, James Greenhalgh wrote: On Fri, May 27, 2016 at 05:57:30PM -0500, Evandro Menezes wrote: On 05/25/16 11:16, James Greenhalgh wrote: On Wed, Apr 27, 2016 at 04:15:53PM -0500, Evandro Menezes wrote: gcc/ * config/aarch64/aarch64-protos.h (tune_params

Re: [PATCH][AArch64] Use aarch64_fusion_enabled_p to check for insn fusion capabilities

2016-05-27 Thread Evandro Menezes
Kyrylo Tkachov <kyrylo.tkac...@arm.com> * config/aarch64/aarch64.c (aarch_macro_fusion_pair_p): Use aarch64_fusion_enabled_p to check for fusion capabilities. LGTM -- Evandro Menezes

Re: [PATCH 3/3][AArch64] Emit division using the Newton series

2016-05-27 Thread Evandro Menezes
On 05/25/16 11:16, James Greenhalgh wrote: On Wed, Apr 27, 2016 at 04:15:53PM -0500, Evandro Menezes wrote: gcc/ * config/aarch64/aarch64-protos.h (tune_params): Add new member "approx_div_modes". (aarch64_emit_approx_div): Declare new function.

Re: [PATCH 2/3][AArch64] Emit square root using the Newton series

2016-05-27 Thread Evandro Menezes
On 05/25/16 10:52, James Greenhalgh wrote: On Wed, Apr 27, 2016 at 04:15:45PM -0500, Evandro Menezes wrote: gcc/ * config/aarch64/aarch64-protos.h (aarch64_emit_approx_rsqrt): Replace with new function "aarch64_emit_approx_sqrt". (tune_params):

Re: [PATCH 1/3][AArch64] Add more choices for the reciprocal square root approximation

2016-05-27 Thread Evandro Menezes
On 05/25/16 05:15, James Greenhalgh wrote: On Wed, Apr 27, 2016 at 04:13:33PM -0500, Evandro Menezes wrote: gcc/ * config/aarch64/aarch64-protos.h (AARCH64_APPROX_MODE): New macro. (AARCH64_APPROX_{NONE,SP,DP,DFORM,QFORM,SCALAR,VECTOR,ALL}): Likewise

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-24 Thread Evandro Menezes
On 05/23/16 15:32, Evandro Menezes wrote: I'm fine with this patch, as it achieves in part what I intended before: going beyond the default_case_values_threshold, too conservative for Exynos M1. My concern is particularly what happens to in-order targets, like the ubiquitous A53. I'll

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-24 Thread Evandro Menezes
alignment or some other secondary effect. I always thought that this patch, that lays out the branch tree more optimally, deserved to be revisited: https://gcc.gnu.org/ml/gcc-patches/2008-04/msg02197.html Cheers, -- Evandro Menezes

Re: [PATCH 0/3][AArch64] Add infrastructure for more approximate FP operations

2016-05-23 Thread Evandro Menezes
On 04/27/16 16:13, Evandro Menezes wrote: This patch suite increases the granularity of target selections of approximate FP operations and adds the options of emitting approximate square root and division. The full suite is contained in the emails tagged: 1. [PATCH 1/3][AArch64] Add more

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-05-23 Thread Evandro Menezes
, -- Evandro Menezes

[PATCH 3/3][AArch64] Emit division using the Newton series

2016-04-27 Thread Evandro Menezes
Define new function. * config/aarch64/aarch64.md ("div3"): New expansion. * config/aarch64/aarch64-simd.md ("div3"): Likewise. * config/aarch64/aarch64.opt (-mlow-precision-div): Add new option. * doc/invoke.texi (-mlow-precision-div): Describe

[PATCH 2/3][AArch64] Emit square root using the Newton series

2016-04-27 Thread Evandro Menezes
n and insn definitions. * config/aarch64/aarch64.md: Likewise. * config/aarch64/aarch64.opt (mlow-precision-sqrt): Add new option description. * doc/invoke.texi (mlow-precision-sqrt): Likewise. -- Evandro Menezes From 753115a8691afd7aed4a510d9e9cb0a8e859acf4 Mon

[PATCH 1/3][AArch64] Add more choices for the reciprocal square root approximation

2016-04-27 Thread Evandro Menezes
(aarch64_optab_supported_p): New argument for the mode. * doc/invoke.texi (-mlow-precision-recip-sqrt): Reword description. -- Evandro Menezes From 2cb6c0f35bbdc3b4cc6f88c61a50f3fbb168ec99 Mon Sep 17 00:00:00 2001 From: Evandro Menezes <e.mene...@samsung.com> Date: Thu, 3 Mar

[PATCH 0/3][AArch64] Add infrastructure for more approximate FP operations

2016-04-27 Thread Evandro Menezes
approximation 2. [PATCH 2/3][AArch64] Emit square root using the Newton series 3. [PATCH 3/3][AArch64] Emit division using the Newton series Thank you, -- Evandro Menezes

Re: [PATCH][AArch64] Replace insn to zero up SIMD registers

2016-04-27 Thread Evandro Menezes
On 04/26/16 08:25, Wilco Dijkstra wrote: Evandro Menezes wrote: On 03/10/16 10:37, James Greenhalgh wrote: Thanks for sticking with it. This is OK for GCC 7 when development opens. Remember to mention the most recent changes in your Changelog entry (Remove "fp" attribute from *mov

Re: [PATCH][AArch64] Simplify ashl3 expander for SHORT modes

2016-04-27 Thread Evandro Menezes
On 04/27/16 09:10, Kyrill Tkachov wrote: 2016-04-27 Kyrylo Tkachov <kyrylo.tkac...@arm.com> * config/aarch64/aarch64.md (ashl3, SHORT modes): Use const_int_operand for operand 2 predicate. Simplify expand code as a result. LGTM -- Evandro Menezes

Re: [AArch64] Emit square root using the Newton series

2016-04-27 Thread Evandro Menezes
On 04/27/16 09:23, James Greenhalgh wrote: On Tue, Apr 12, 2016 at 01:14:51PM -0500, Evandro Menezes wrote: On 04/05/16 17:30, Evandro Menezes wrote: On 04/05/16 13:37, Wilco Dijkstra wrote: I can't get any of these to work... Not only do I get a large number of collisions and duplicated code

Re: [AArch64] Emit division using the Newton series

2016-04-27 Thread Evandro Menezes
are users to use it through the command line option -mlow-precision-div. -- Evandro Menezes

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-26 Thread Evandro Menezes
On 04/26/16 11:14, Wilco Dijkstra wrote: Evandro Menezes wrote: True, but the results when running on A53 could be quite different. GCC is ~1.2% faster on Cortex-A53 built for generic, but there is no difference in perlbench. Looks good, then. Fine by me. Thanks for your patience

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Evandro Menezes
On 04/25/16 14:58, Wilco Dijkstra wrote: Evandro Menezes wrote: I agree with your assessment, but I'm more curious to understand how this change affects code built with the default -mcpu=generic when run on both A53 and A57, the typical configuration of big.LITTLE machines. I wouldn't expect

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Evandro Menezes
On 04/25/16 14:21, Wilco Dijkstra wrote: Evandro Menezes wrote: I assume that you mean that such improvements are true for -mcpu=generic, yes? On which target, A53 or A57 or other? It's true for any CPU setting. The SPEC results are for Cortex-A57 however I wrote a microbenchmark that shows

Re: [PATCH][AArch64] Replace insn to zero up SIMD registers

2016-04-25 Thread Evandro Menezes
On 03/10/16 10:37, James Greenhalgh wrote: On Thu, Mar 10, 2016 at 10:32:15AM -0600, Evandro Menezes wrote: I agree to postpone until GCC 7. [AArch64] Replace insn to zero up SIMD registers gcc/ * config/aarch64/aarch64.md (*movhf_aarch64): Add "mo

Re: [PATCH][AArch64] Adjust SIMD integer preference

2016-04-25 Thread Evandro Menezes
On 04/22/16 10:35, Wilco Dijkstra wrote: OK for trunk? LGTM -- Evandro Menezes

Re: [PATCH][AArch64][wwwdocs] Summarise some more AArch64 changes for GCC6

2016-04-25 Thread Evandro Menezes
On 04/21/16 03:15, Kyrill Tkachov wrote: Ok to commit? LGTM -- Evandro Menezes

Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-25 Thread Evandro Menezes
assume that you mean that such improvements are true for -mcpu=generic, yes? On which target, A53 or A57 or other? Otherwise, it seems to be a sensible change, but I'm trying to understand how generally beneficial it is. Thank you, -- Evandro Menezes

RE: [AArch64] Add more precision choices for the reciprocal square root approximation

2016-04-21 Thread Evandro Menezes
> On 04/04/16 11:13, Evandro Menezes wrote: > > On 04/01/16 18:08, Wilco Dijkstra wrote: > >> Evandro Menezes wrote: > >>> I hope that this gets in the ballpark of what's been discussed > >>> previously. > >> Yes that's very close to what I had i

RE: [AArch64] Emit square root using the Newton series

2016-04-21 Thread Evandro Menezes
> On 04/05/16 17:30, Evandro Menezes wrote: > > On 04/05/16 13:37, Wilco Dijkstra wrote: > >> I can't get any of these to work... Not only do I get a large number > >> of collisions and duplicated code between these patches, when I try > >> to resolve them,

RE: [AArch64] Emit division using the Newton series

2016-04-21 Thread Evandro Menezes
> On 04/04/16 14:06, Evandro Menezes wrote: > > On 04/01/16 17:52, Evandro Menezes wrote: > >> On 04/01/16 17:45, Wilco Dijkstra wrote: > >>> Evandro Menezes wrote: > >>> > >>>> However, I don't think that there's the need to handle

Re: [AArch64] Emit square root using the Newton series

2016-04-12 Thread Evandro Menezes
On 04/05/16 17:30, Evandro Menezes wrote: On 04/05/16 13:37, Wilco Dijkstra wrote: I can't get any of these to work... Not only do I get a large number of collisions and duplicated code between these patches, when I try to resolve them, all I get is crashes whenever I try to use sqrt (even

Re: [AArch64] Add more precision choices for the reciprocal square root approximation

2016-04-12 Thread Evandro Menezes
On 04/04/16 11:13, Evandro Menezes wrote: On 04/01/16 18:08, Wilco Dijkstra wrote: Evandro Menezes wrote: I hope that this gets in the ballpark of what's been discussed previously. Yes that's very close to what I had in mind. A minor issue is that the vector modes cannot work as they start

Re: [AArch64] Emit division using the Newton series

2016-04-12 Thread Evandro Menezes
On 04/04/16 14:06, Evandro Menezes wrote: On 04/01/16 17:52, Evandro Menezes wrote: On 04/01/16 17:45, Wilco Dijkstra wrote: Evandro Menezes wrote: However, I don't think that there's the need to handle any special case for division. The only case when the approximation differs from

Re: [AArch64] Emit square root using the Newton series

2016-04-05 Thread Evandro Menezes
a patchset that applies cleanly so I can try all approximation routines? Hi, Wilco. The original patches should be independent of each other, so indeed they duplicate code. This patch suite should be suitable for testing. HTH -- Evandro Menezes From cbc2b62f7df5c3e2fef2a24157b1bdd1a6de191b Mon

Re: [AArch64] Emit division using the Newton series

2016-04-04 Thread Evandro Menezes
On 04/01/16 17:52, Evandro Menezes wrote: On 04/01/16 17:45, Wilco Dijkstra wrote: Evandro Menezes wrote: However, I don't think that there's the need to handle any special case for division. The only case when the approximation differs from division is when the numerator is infinity

Re: [AArch64] Emit square root using the Newton series

2016-04-04 Thread Evandro Menezes
On 04/01/16 17:45, Evandro Menezes wrote: On 03/24/16 14:11, Evandro Menezes wrote: On 03/17/16 17:46, Evandro Menezes wrote: This patch refactors the function to emit the reciprocal square root approximation to also emit the square root approximation. This version of the patch cleans up

Re: [AArch64] Add more precision choices for the reciprocal square root approximation

2016-04-04 Thread Evandro Menezes
On 04/01/16 18:08, Wilco Dijkstra wrote: Evandro Menezes wrote: I hope that this gets in the ballpark of what's been discussed previously. Yes that's very close to what I had in mind. A minor issue is that the vector modes cannot work as they start at MAX_MODE_FLOAT (which is >

Re: [AArch64] Emit division using the Newton series

2016-04-01 Thread Evandro Menezes
On 04/01/16 17:45, Wilco Dijkstra wrote: Evandro Menezes wrote: However, I don't think that there's the need to handle any special case for division. The only case when the approximation differs from division is when the numerator is infinity and the denominator, zero, when the approximation

Re: [AArch64] Emit square root using the Newton series

2016-04-01 Thread Evandro Menezes
On 03/24/16 14:11, Evandro Menezes wrote: On 03/17/16 17:46, Evandro Menezes wrote: This patch refactors the function to emit the reciprocal square root approximation to also emit the square root approximation. This version of the patch cleans up the changes to the MD files and fixes some bugs

Re: [AArch64] Emit division using the Newton series

2016-04-01 Thread Evandro Menezes
On 04/01/16 16:22, Wilco Dijkstra wrote: Evandro Menezes wrote: The division variant should use the same latency reduction trick I mentioned for sqrt. I don't think that it applies here, since it doesn't have to deal with special cases. No it applies as it's exactly the same calculation: x

Re: [AArch64] Fix SIMD predicate

2016-04-01 Thread Evandro Menezes
On 03/31/16 04:52, James Greenhalgh wrote: On Wed, Mar 30, 2016 at 11:18:27AM -0500, Evandro Menezes wrote: Add scalar 0.0 to the aarch64_simd_reg_or_zero predicate. 2016-03-30 Evandro Menezes <e.mene...@samsung.com> * gcc/config/aarch64/predica

Re: [AArch64] Emit division using the Newton series

2016-04-01 Thread Evandro Menezes
On 04/01/16 08:58, Wilco Dijkstra wrote: Evandro Menezes wrote: On 03/23/16 11:24, Evandro Menezes wrote: On 03/17/16 15:09, Evandro Menezes wrote: This patch implements FP division by an approximation using the Newton series. With this patch, DF division is sped up by over 100% and SF

Re: [AArch64] Add more precision choices for the reciprocal square root approximation

2016-04-01 Thread Evandro Menezes
rgument for the mode. This patch allows a target to choose the mode of this operation when it is beneficial to use the approximate version. I hope that this gets in the ballpark of what's been discussed previously. Thank you, -- Evandro Menezes From 17ac33719bae8966a481cc833c9ac062f7fb

Re: [AArch64] Add precision choices for the reciprocal square root approximation

2016-04-01 Thread Evandro Menezes
On 04/01/16 09:06, James Greenhalgh wrote: On Fri, Apr 01, 2016 at 02:47:05PM +0100, Wilco Dijkstra wrote: Evandro Menezes wrote: Ping^1 I haven't seen a newer version that incorporates my feedback. To recap what I'd like to see is a more general way to select approximations based on mode. I

Re: [AArch64] Add precision choices for the reciprocal square root approximation

2016-04-01 Thread Evandro Menezes
On 04/01/16 08:47, Wilco Dijkstra wrote: Evandro Menezes wrote: Ping^1 I haven't seen a newer version that incorporates my feedback. To recap what I'd like to see is a more general way to select approximations based on mode. I don't believe that looking at the inner mode works in general

Re: [PATCH 2/4][AArch64] Increase the loop peeling limit

2016-03-31 Thread Evandro Menezes
On 03/16/16 14:48, Evandro Menezes wrote: On 02/03/16 13:46, Evandro Menezes wrote: On 01/08/16 16:55, Evandro Menezes wrote: On 12/16/2015 02:11 PM, Evandro Menezes wrote: On 12/16/2015 05:24 AM, Richard Earnshaw (lists) wrote: On 15/12/15 23:34, Evandro Menezes wrote: On 12/14/2015 05:26

Re: [AArch64] Add precision choices for the reciprocal square root approximation

2016-03-31 Thread Evandro Menezes
On 03/18/16 18:00, Evandro Menezes wrote: On 03/18/16 17:20, Wilco Dijkstra wrote: Evandro Menezes <e.mene...@samsung.com> wrote: On 03/18/16 10:21, Wilco Dijkstra wrote: Hi Evandro, For example, though this approximation improves the performance noticeably for DF on A57, for SF,

Re: [AArch64] Emit division using the Newton series

2016-03-31 Thread Evandro Menezes
On 03/23/16 11:24, Evandro Menezes wrote: On 03/17/16 15:09, Evandro Menezes wrote: This patch implements FP division by an approximation using the Newton series. With this patch, DF division is sped up by over 100% and SF division, zilch, both on A57 and on M1. gcc
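
For readers of this thread, the idea of division by the Newton series can be sketched in C as below. This is only an illustration, not code from the patch: `recip_estimate`, `recip_step`, and `approx_div` are hypothetical names, and the bit-level estimate is an assumed software stand-in for a hardware reciprocal estimate. It handles positive, normal denominators only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Crude estimate of 1/d for positive, normal d (hypothetical stand-in
   for a hardware reciprocal estimate; relative error of a few percent). */
static float recip_estimate(float d)
{
    uint32_t bits;
    float r;

    memcpy(&bits, &d, sizeof bits);
    bits = 0x7EF00000u - bits;   /* roughly negates the exponent */
    memcpy(&r, &bits, sizeof r);
    return r;
}

/* One Newton step for the reciprocal: r' = r * (2 - d * r).
   The relative error is squared at every step. */
static float recip_step(float d, float r)
{
    return r * (2.0f - d * r);
}

/* Approximate n / d as n * (1/d), refining the estimate `steps` times. */
float approx_div(float n, float d, int steps)
{
    float r;
    int i;

    assert(d > 0.0f);            /* this sketch ignores signs and zero */
    r = recip_estimate(d);
    for (i = 0; i < steps; i++)
        r = recip_step(d, r);
    return n * r;
}
```

With this estimate, three steps are about enough for single precision, for instance `approx_div(1.0f, 3.0f, 3)`. A more accurate starting estimate would need fewer steps, and double precision would need more.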

[AArch64] Fix SIMD predicate

2016-03-30 Thread Evandro Menezes
Add scalar 0.0 to the aarch64_simd_reg_or_zero predicate. 2016-03-30 Evandro Menezes <e.mene...@samsung.com> * gcc/config/aarch64/predicates.md (aarch64_simd_reg_or_zero predicate): Add the "const_double" constraint. It seems to me that the aarch64_

Re: [AArch64] Emit square root using the Newton series

2016-03-24 Thread Evandro Menezes
On 03/17/16 17:46, Evandro Menezes wrote: This patch refactors the function to emit the reciprocal square root approximation to also emit the square root approximation. 2016-03-23 Evandro Menezes <e.mene...@samsung.com> Wilco Dijkstra <wilco.dijks...@arm.com

Re: [AArch64] Emit division using the Newton series

2016-03-23 Thread Evandro Menezes
On 03/17/16 15:09, Evandro Menezes wrote: This patch implements FP division by an approximation using the Newton series. With this patch, DF division is sped up by over 100% and SF division, zilch, both on A57 and on M1. gcc/ * config/aarch64/aarch64-tuning-flags.def

[AArch64] Emit division using the Newton series

2016-03-19 Thread Evandro Menezes
Emit division using the Newton series 2016-03-17 Evandro Menezes <e.mene...@samsung.com> gcc/ * config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNE_APPROX_DIV_{SF,DF}): New tuning macros. * config/aarch64/aarch64-pr

Re: [AArch64] Emit square root using the Newton series

2016-03-19 Thread Evandro Menezes
On 03/17/16 09:55, James Greenhalgh wrote: On Wed, Mar 16, 2016 at 02:45:37PM -0500, Evandro Menezes wrote: On 03/08/16 16:08, Evandro Menezes wrote: On 02/16/16 14:56, Evandro Menezes wrote: On 12/08/15 15:35, Evandro Menezes wrote: Emit square root using the Newton series 2015-12-03

[AArch64] Add precision choices for the reciprocal square root approximation

2016-03-19 Thread Evandro Menezes
, not so much, if at all. Feedback appreciated. Thank you, -- Evandro Menezes From 95581aefcf324233c3603f4d8232ee18c5836f8a Mon Sep 17 00:00:00 2001 From: Evandro Menezes <e.mene...@samsung.com> Date: Thu, 17 Mar 2016 17:00:03 -0500 Subject: [PATCH] Add precision choices for the reciproc

Emit square root using the Newton series

2016-03-19 Thread Evandro Menezes
2016-03-16 Evandro Menezes <e.mene...@samsung.com> Wilco Dijkstra <wilco.dijks...@arm.com> gcc/ * config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNE_APPROX_SQRT_{SF,DF}): New tuning macros. * config/aarch64/aarc

[AArch64] Add precision choices for the reciprocal square root approximation

2016-03-19 Thread Evandro Menezes
, not so much, if at all. Feedback appreciated. Thank you, -- Evandro Menezes

Re: [AArch64] Emit square root using the Newton series

2016-03-19 Thread Evandro Menezes
On 03/08/16 16:08, Evandro Menezes wrote: On 02/16/16 14:56, Evandro Menezes wrote: On 12/08/15 15:35, Evandro Menezes wrote: Emit square root using the Newton series 2015-12-03 Evandro Menezes <e.mene...@samsung.com> gcc/ * config/aarch64/aarch64-pr

Re: [AArch64] Add precision choices for the reciprocal square root approximation

2016-03-18 Thread Evandro Menezes
makes the decision in the md file which does not seem a good idea). I agree. Will modify it. Thank you, -- Evandro Menezes

[COMMITTED][AArch64] Tweak the pipeline model for Exynos M1

2016-03-18 Thread Evandro Menezes
Tweak the pipeline model for Exynos M1 * gcc/config/aarch64/aarch64.c (exynosm1_tunings): Enable the weak prefetching model. Committed as r234307. -- Evandro Menezes From a75d875a3c64180c9d6c368e2d87036d70f66036 Mon Sep 17 00:00:00 2001 From: evandro <e

Re: [AArch64] Emit square root using the Newton series

2016-03-18 Thread Evandro Menezes
On 03/10/16 19:06, Wilco Dijkstra wrote: Evandro Menezes <e.mene...@samsung.com> wrote: That's what I had in mind too, but around the approximation for x^-1/2 and using masks for vector cases thusly: fcmne v3.4s, v0.4s, #0.0 frsqrte v1.4s, v0.4s fmul v2.4s,

Re: [AArch64] Add precision choices for the reciprocal square root approximation

2016-03-18 Thread Evandro Menezes
On 03/18/16 17:20, Wilco Dijkstra wrote: Evandro Menezes <e.mene...@samsung.com> wrote: On 03/18/16 10:21, Wilco Dijkstra wrote: Hi Evandro, For example, though this approximation improves the performance noticeably for DF on A57, for SF, not so much, if at all. I'm still ske

Re: [PATCH 2/4][AArch64] Increase the loop peeling limit

2016-03-18 Thread Evandro Menezes
On 02/03/16 13:46, Evandro Menezes wrote: On 01/08/16 16:55, Evandro Menezes wrote: On 12/16/2015 02:11 PM, Evandro Menezes wrote: On 12/16/2015 05:24 AM, Richard Earnshaw (lists) wrote: On 15/12/15 23:34, Evandro Menezes wrote: On 12/14/2015 05:26 AM, James Greenhalgh wrote: On Thu, Dec 03

Re: [AArch64] Emit square root using the Newton series

2016-03-14 Thread Evandro Menezes
On 03/10/16 19:06, Wilco Dijkstra wrote: Evandro Menezes <e.mene...@samsung.com> wrote: That's what I had in mind too, but around the approximation for x^-1/2 and using masks for vector cases thusly: fcmne v3.4s, v0.4s, #0.0 frsqrte v1.4s, v0.4s fmul v2.4s,

Re: [AArch64] Emit square root using the Newton series

2016-03-10 Thread Evandro Menezes
v2.4s, v1.4s, v1.4s frsqrts v2.4s, v0.4s, v2.4s fmul v1.4s, v1.4s, v2.4s and v1.4s, v3.4s fmul v0.4s, v1.4s, v0.4s Thanks, -- Evandro Menezes
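
The sequence quoted above (estimate, one refinement step, mask, final multiply by x) can be mirrored in scalar C roughly as follows. This is a sketch under assumptions, not the patch itself: `rsqrt_estimate`, `rsqrt_step`, and `approx_sqrt` are hypothetical names, the bit-level estimate merely stands in for FRSQRTE, and the explicit zero check plays the role of the compare-and-mask pair.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Crude estimate of 1/sqrt(x) for positive, normal x (hypothetical
   stand-in for the FRSQRTE instruction). */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    float e;

    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&e, &bits, sizeof e);
    return e;
}

/* One Newton step for 1/sqrt(x): e' = e * (3 - x * e * e) / 2.
   The (3 - a * b) / 2 factor is what FRSQRTS computes. */
static float rsqrt_step(float x, float e)
{
    return e * ((3.0f - x * (e * e)) * 0.5f);
}

/* Approximate sqrt(x) as x * (1/sqrt(x)). */
float approx_sqrt(float x, int steps)
{
    float e;
    int i;

    assert(x >= 0.0f);
    if (x == 0.0f)               /* scalar analogue of the mask */
        return 0.0f;
    e = rsqrt_estimate(x);
    for (i = 0; i < steps; i++)
        e = rsqrt_step(x, e);
    return x * e;
}
```

The zero case matters because a hardware estimate of 1/sqrt(0) is infinity, and multiplying it by zero would give NaN rather than 0. A call such as `approx_sqrt(2.0f, 3)` uses three refinement steps, which is about enough for single precision with this rough starting estimate.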

Re: [AArch64] Emit square root using the Newton series

2016-03-10 Thread Evandro Menezes
. Thanks for the pointer, Wilco. Will work it in the patch. -- Evandro Menezes

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-10 Thread Evandro Menezes
On 03/10/16 10:27, Evandro Menezes wrote: On 03/10/16 07:23, James Greenhalgh wrote: On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote: On 03/01/16 13:08, Evandro Menezes wrote: On 03/01/16 13:02, Wilco Dijkstra wrote: Evandro Menezes wrote: The meaning of these attributes

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-10 Thread Evandro Menezes
On 03/10/16 07:23, James Greenhalgh wrote: On Wed, Mar 09, 2016 at 03:35:43PM -0600, Evandro Menezes wrote: On 03/01/16 13:08, Evandro Menezes wrote: On 03/01/16 13:02, Wilco Dijkstra wrote: Evandro Menezes wrote: The meaning of these attributes is not clear to me. Is there a reference

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-09 Thread Evandro Menezes
On 03/01/16 13:08, Evandro Menezes wrote: On 03/01/16 13:02, Wilco Dijkstra wrote: Evandro Menezes wrote: The meaning of these attributes is not clear to me. Is there a reference somewhere about which insns are FP or SIMD or neither? The meaning should be clear, "fp" is a floa

Re: [AArch64] Emit square root using the Newton series

2016-03-08 Thread Evandro Menezes
On 03/08/16 16:08, Evandro Menezes wrote: On 02/16/16 14:56, Evandro Menezes wrote: On 12/08/15 15:35, Evandro Menezes wrote: Emit square root using the Newton series 2015-12-03 Evandro Menezes <e.mene...@samsung.com> gcc/ * config/aarch64/aarch64-pr

Re: [AArch64] Emit square root using the Newton series

2016-03-08 Thread Evandro Menezes
On 02/16/16 14:56, Evandro Menezes wrote: On 12/08/15 15:35, Evandro Menezes wrote: Emit square root using the Newton series 2015-12-03 Evandro Menezes <e.mene...@samsung.com> gcc/ * config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt): Declare new fu

Re: [AArch64] Emit square root using the Newton series

2016-03-03 Thread Evandro Menezes
On 02/16/16 14:56, Evandro Menezes wrote: On 12/08/15 15:35, Evandro Menezes wrote: Emit square root using the Newton series 2015-12-03 Evandro Menezes <e.mene...@samsung.com> gcc/ * config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt): Declare new fu

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-03-01 Thread Evandro Menezes
On 03/01/16 13:02, Wilco Dijkstra wrote: Evandro Menezes wrote: The meaning of these attributes is not clear to me. Is there a reference somewhere about which insns are FP or SIMD or neither? The meaning should be clear, "fp" is a floating point instruction, "simd" a

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-02-29 Thread Evandro Menezes
On 02/29/16 12:07, Wilco Dijkstra wrote: Evandro Menezes <e.mene...@samsung.com> wrote: Please, verify the new "simd" and "fp" attributes for SF and DF. Both movsf and movdf should be: (set_attr "simd" "*,yes,*,*,*,*,*,*,*,*") (set_attr "

Re: [AArch64] Emit square root using the Newton series

2016-02-26 Thread Evandro Menezes
On 02/26/16 17:42, Evandro Menezes wrote: On 02/26/16 08:59, James Greenhalgh wrote: On Mon, Feb 22, 2016 at 06:50:44PM -0600, Evandro Menezes wrote: In preparation for the patch adding the Newton series also for square root, I'd like to propose this patch changing the name of the existing

Re: [AArch64] Emit square root using the Newton series

2016-02-26 Thread Evandro Menezes
On 02/26/16 08:59, James Greenhalgh wrote: On Mon, Feb 22, 2016 at 06:50:44PM -0600, Evandro Menezes wrote: In preparation for the patch adding the Newton series also for square root, I'd like to propose this patch changing the name of the existing tuning flag for the reciprocal square root

Re: [PATCH][AArch64] Replace insn to zero up DF register

2016-02-26 Thread Evandro Menezes
On 02/26/16 06:37, Wilco Dijkstra wrote: Evandro Menezes <e.mene...@samsung.com> wrote: I have a question though: is it necessary to add the "fp" and "simd" attributes to both movsf_aarch64 and movdf_aarch64 as well? You need at least the "simd" attribute, b
