Evandro,

Shouldn't ‘execute_cse_reciprocals_1’ take care of this, once the reciprocal division is implemented?
Do you think there’s additional work needed to catch all cases/opportunities?

Best,
Philipp.

> On 24 Jun 2015, at 20:19, Evandro Menezes <e.mene...@samsung.com> wrote:
>
> Benedikt,
>
> Are you developing the reciprocal approximation just for 1/x proper or for
> any division, as in x/y = x * 1/y?
>
> Thank you,
>
> --
> Evandro Menezes Austin, TX
>
>
>> -----Original Message-----
>> From: Benedikt Huber [mailto:benedikt.hu...@theobroma-systems.com]
>> Sent: Wednesday, June 24, 2015 12:11
>> To: Dr. Philipp Tomsich
>> Cc: Evandro Menezes; gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH] [aarch64] Implemented reciprocal square root (rsqrt)
>> estimation in -ffast-math
>>
>> Evandro,
>>
>> Yes, we also have the 1/x approximation.
>> However we do not have the test cases yet, and it also would need some
>> clean up.
>> I am going to provide a patch for that soon (say next week).
>> Also, for this optimization we have *not* yet found a benchmark with
>> significant improvements.
>>
>> Best Regards,
>> Benedikt
>>
>>
>>> On 24 Jun 2015, at 18:52, Dr. Philipp Tomsich <philipp.tomsich@theobroma-systems.com> wrote:
>>>
>>> Evandro,
>>>
>>> We’ve seen a 28% speed-up on gromacs in SPECfp for the (scalar)
>>> reciprocal sqrt.
>>>
>>> Also, the “reciprocal divide” patches are floating around in various
>>> of our git-tree, but aren’t ready for public consumption, yet… I’ll
>>> leave Benedikt to comment on potential timelines for getting that
>>> pushed out.
>>>
>>> Best,
>>> Philipp.
>>>
>>>> On 24 Jun 2015, at 18:42, Evandro Menezes <e.mene...@samsung.com> wrote:
>>>>
>>>> Benedikt,
>>>>
>>>> You beat me to it! :-) Do you have the implementation for dividing
>>>> using the Newton series as well?
>>>>
>>>> I'm not sure that the series is always for all data types and on all
>>>> processors. It would be useful to allow each AArch64 processor to
>>>> enable this or not depending on the data type. BTW, do you have some
>>>> tests showing the speed up?
>>>>
>>>> Thank you,
>>>>
>>>> --
>>>> Evandro Menezes Austin, TX
>>>>
>>>>> -----Original Message-----
>>>>> From: gcc-patches-ow...@gcc.gnu.org
>>>>> [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Benedikt Huber
>>>>> Sent: Thursday, June 18, 2015 7:04
>>>>> To: gcc-patches@gcc.gnu.org
>>>>> Cc: benedikt.hu...@theobroma-systems.com; philipp.tomsich@theobroma-systems.com
>>>>> Subject: [PATCH] [aarch64] Implemented reciprocal square root
>>>>> (rsqrt) estimation in -ffast-math
>>>>>
>>>>> aarch64 offers the instructions frsqrte and frsqrts, for rsqrt
>>>>> estimation and a Newton-Raphson step, respectively.
>>>>> There are ARMv8 implementations where this is faster than using fdiv
>>>>> and rsqrt.
>>>>> It runs three steps for double and two steps for float to achieve
>>>>> the needed precision.
>>>>>
>>>>> There is one caveat and open question.
>>>>> Since -ffast-math enables flush to zero, intermediate values between
>>>>> approximation steps will be flushed to zero if they are denormal.
>>>>> E.g., this happens in the case of rsqrt (DBL_MAX) and rsqrtf (FLT_MAX).
>>>>> The test cases pass, but it is unclear to me whether this is
>>>>> expected behavior with -ffast-math.
>>>>>
>>>>> The patch applies to commit:
>>>>> svn+ssh://gcc.gnu.org/svn/gcc/trunk@224470
>>>>>
>>>>> Please consider including this patch.
>>>>> Thank you and best regards,
>>>>> Benedikt Huber
>>>>>
>>>>> Benedikt Huber (1):
>>>>>   2015-06-15 Benedikt Huber <benedikt.hu...@theobroma-systems.com>
>>>>>
>>>>>  gcc/ChangeLog                            |   9 +++
>>>>>  gcc/config/aarch64/aarch64-builtins.c    |  60 ++++++++++++++++
>>>>>  gcc/config/aarch64/aarch64-protos.h      |   2 +
>>>>>  gcc/config/aarch64/aarch64-simd.md       |  27 ++++++++
>>>>>  gcc/config/aarch64/aarch64.c             |  63 +++++++++++++++++
>>>>>  gcc/config/aarch64/aarch64.md            |   3 +
>>>>>  gcc/testsuite/gcc.target/aarch64/rsqrt.c | 113 +++++++++++++++++++++++++++++++
>>>>>  7 files changed, 277 insertions(+)
>>>>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c
>>>>>
>>>>> --
>>>>> 1.9.1