[PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Related to bug 86829, but for hyperbolic trigonometric functions. This
patch adds substitution rules for both sinh(tanh(x)) -> x / sqrt(1 - x*x)
and cosh(tanh(x)) -> 1 / sqrt(1 - x*x). Note that both formulas involve
division by 0, but this causes no harm because 1/(+0) -> +infinity, so
the math is still safe.

Changelog:
2018-08-07  Giuliano Belinassi

	* match.pd: Add simplification rules for sinh(atanh(x)) and
	cosh(atanh(x)).

All tests added by this patch run without errors on trunk; however,
there are tests unrelated to this patch that fail on my x86_64
Ubuntu 18.04.

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 263359)
+++ gcc/match.pd	(working copy)
@@ -4219,6 +4219,25 @@
  (mult:c (TAN:s @0) (COS:s @0))
  (SIN @0))
 
+/* Simplify sinh(atanh(x)) -> x / sqrt(1 - x*x). */
+(for sins (SINH)
+     atans (ATANH)
+     sqrts (SQRT)
+ (simplify
+  (sins (atans:s @0))
+  (rdiv @0 (sqrts (minus {build_one_cst (type);}
+                         (mult @0 @0))))))
+
+/* Simplify cosh(atanh(x)) -> 1 / sqrt(1 - x*x). */
+(for coss (COSH)
+     atans (ATANH)
+     sqrts (SQRT)
+ (simplify
+  (coss (atans:s @0))
+  (rdiv {build_one_cst (type);}
+        (sqrts (minus {build_one_cst (type);}
+                      (mult @0 @0))))))
+
 /* Simplify x * pow(x,c) -> pow(x,c+1). */
 (simplify
  (mult:c @0 (POW:s @0 REAL_CST@1))
Index: gcc/testsuite/gcc.dg/sinhtanh-1.c
===================================================================
--- gcc/testsuite/gcc.dg/sinhtanh-1.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/sinhtanh-1.c	(working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -ffast-math -fdump-tree-optimized" } */
+
+extern double sinh(double x);
+extern double atanh(double x);
+
+double __attribute__ ((noinline))
+sinhatanh_(double x)
+{
+    return sinh(atanh(x));
+}
+
+/* There should be no calls to sinh nor atanh */
+/* { dg-final { scan-tree-dump-not "sinh " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "atanh " "optimized" } } */
Index: gcc/testsuite/gcc.dg/sinhtanh-2.c
===================================================================
--- gcc/testsuite/gcc.dg/sinhtanh-2.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/sinhtanh-2.c	(working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -ffast-math -fdump-tree-optimized" } */
+
+extern double cosh(double x);
+extern double atanh(double x);
+
+double __attribute__ ((noinline))
+coshatanh_(double x)
+{
+    return cosh(atanh(x));
+}
+
+/* There should be no calls to cosh nor atanh */
+/* { dg-final { scan-tree-dump-not "cosh " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "atanh " "optimized" } } */
Index: gcc/testsuite/gcc.dg/sinhtanh-3.c
===================================================================
--- gcc/testsuite/gcc.dg/sinhtanh-3.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/sinhtanh-3.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -ffast-math -fdump-tree-optimized" } */
+
+extern double sinh(double x);
+extern double atanh(double x);
+
+double __attribute__ ((noinline))
+sinhatanh_(double x)
+{
+    double atgh = atanh(x);
+    return sinh(atgh) + atgh;
+}
+
+/* There should be calls to both sinh and atanh */
+/* { dg-final { scan-tree-dump "sinh " "optimized" } } */
+/* { dg-final { scan-tree-dump "atanh " "optimized" } } */
Index: gcc/testsuite/gcc.dg/sinhtanh-4.c
===================================================================
--- gcc/testsuite/gcc.dg/sinhtanh-4.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/sinhtanh-4.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -ffast-math -fdump-tree-optimized" } */
+
+extern double cosh(double x);
+extern double atanh(double x);
+
+double __attribute__ ((noinline))
+coshatanh_(double x)
+{
+    double atgh = atanh(x);
+    return cosh(atgh) + atgh;
+}
+
+/* There should be calls to both cosh and atanh */
+/* { dg-final { scan-tree-dump "cosh " "optimized" } } */
+/* { dg-final { scan-tree-dump "atanh " "optimized" } } */
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
> On Aug 7, 2018, at 4:00 PM, Giuliano Augusto Faulin Belinassi wrote:
>
> Related with bug 86829, but for hyperbolic trigonometric functions.
> This patch adds substitution rules to both sinh(tanh(x)) -> x / sqrt(1
> - x*x) and cosh(tanh(x)) -> 1 / sqrt(1 - x*x). Notice that the both
> formulas has division by 0, but it causes no harm because 1/(+0) ->
> +infinity, thus the math is still safe.

What about non-IEEE targets that don't have "infinite" in their float
representation?

	paul
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
That is a good question, because I didn't know that such targets
exist. Any suggestions?

On Tue, Aug 7, 2018 at 5:29 PM, Paul Koning wrote:
>
> What about non-IEEE targets that don't have "infinite" in their float
> representation?
>
>	paul
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Now I'm puzzled.

I don't see how an infinite would show up in the original expression. I
don't know hyperbolic functions, so I just constructed a small test
program, and the original vs. the substitution you mention are not at
all similar.

	paul

> On Aug 7, 2018, at 4:42 PM, Giuliano Augusto Faulin Belinassi wrote:
>
> That is a good question because I didn't know that such targets
> exists. Any suggestion?
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Sorry about that. In the e-mail text field I wrote sinh(tanh(x)) and
cosh(tanh(x)) where it was supposed to be sinh(atanh(x)) and
cosh(atanh(x)); thus I am talking about the inverse hyperbolic tangent
function. The patch code and comments are still correct.

On Wed, Aug 8, 2018 at 10:58 AM, Paul Koning wrote:
> Now I'm puzzled.
>
> I don't see how an infinite would show up in the original expression. I
> don't know hyperbolic functions, so I just constructed a small test
> program, and the original vs. the substitution you mention are not at
> all similar.
>
>	paul
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Thanks. Ok, so the expressions you gave are undefined for x==1, which
says that substituting something that is also undefined for x==1 is
permitted. You can argue from "undefined" rather than relying on IEEE
features like NaN or infinite.

	paul

> On Aug 8, 2018, at 2:57 PM, Giuliano Augusto Faulin Belinassi wrote:
>
> Sorry about that. In the e-mail text field I wrote sinh(tanh(x)) and
> cosh(tanh(x)) where it was supposed to be sinh(atanh(x)) and
> cosh(atanh(x)), thus I am talking about the inverse hyperbolic tangent
> function. The patch code and comments are still correct.
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi Wilco,

On Thu, Nov 08, 2018 at 01:33:19PM +, Wilco Dijkstra wrote:
> > But the max. error in sinh/cosh/atanh is less than 2 ULP, with some math
> > libraries. It could be < 1 ULP, in theory, so sinh(atanh(x)) less than
> > 2 ULP even.
>
> You can't add ULP errors in general - a tiny difference in the input can
> make a huge difference in the result if the derivative is > 1.
>
> Even with perfect implementations of 0.501ULP on easy functions with
> no large derivatives you could get a 2ULP total error if the perfectly
> rounded and actual result end up rounding in different directions in the
> 2nd function...

Sure. My point is that there can be math libraries where the original
sinh(atanh(x)) is more precise than what we replace it with here, for
certain values at least. So you need some fast math flag no matter
what. But we agree on that anyway :-)

> So you have to measure ULP error since it is quite counter intuitive.

It's hard to measure with DP, and worse with QP... Proving things is
often easier.


Segher
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On 10/23/18 3:17 AM, Richard Biener wrote:
> On Mon, Oct 22, 2018 at 10:09 PM Jeff Law wrote:
>>
>> On 10/20/18 9:47 AM, Giuliano Augusto Faulin Belinassi wrote:
>>> So I did some further investigation comparing the ULP error.
>>>
>>> With the formula that Wilco Dijkstra provided, there are cases where
>>> the substitution is super precise.
>>> With floats:
>>> with input  : = 9.9940395355224609375000e-01
>>> sinh: before: = 2.89631005859375e+03
>>> sinh: after : = 2.896309326171875000e+03
>>> sinh: mpfr  : = 2.89630924626497842670468162463283783344599446025119e+03
>>> ulp err befr: = 3
>>> ulp err aftr: = 0
>>>
>>> With doubles:
>>> with input  : = 9.99888977697537484345957636833190917969e-01
>>> sinh: before: = 6.710886400029802322387695312500e+07
>>> sinh: after : = 6.71088632549419403076171875e+07
>>> sinh: mpfr  : = 6.710886344120645523071287770030292885894208e+07
>>> ulp err befr: = 3
>>> ulp err aftr: = 0
>>>
>>> *However*, there are cases where some error shows up. The biggest ULP
>>> error that I could find was 2.
>>>
>>> With floats:
>>> with input  : = 9.99968349933624267578125000e-01
>>> sinh: before: = 1.2568613433837890625000e+02
>>> sinh: after : = 1.2568614959716796875000e+02
>>> sinh: mpfr  : = 1.25686137592274042266452526368087062890399889097864e+02
>>> ulp err befr: = 0
>>> ulp err aftr: = 2
>>>
>>> With doubles:
>>> with input  : = 9.999463651256803586875321343541145324707031e-01
>>> sinh: before: = 9.65520209507428342476487159729003906250e+05
>>> sinh: after : = 9.6552020950742810964584350585937500e+05
>>> sinh: mpfr  : = 9.65520209507428288553227922831618987450806468855883e+05
>>> ulp err befr: = 0
>>> ulp err aftr: = 2
>>>
>>> And with FMA we have the same results showed above. (super precise
>>> cases, and maximum ULP error equal 2).
>>>
>>> So maybe update the patch with the following rules?
>>>   * If FMA is available, then compute 1 - x*x with it.
>>>   * If FMA is not available, then do the dijkstra substitution when
>>>     |x| > 0.5.
>> So I think the runtime math libraries shoot for .5 ULP (yes, they don't
>> always make it, but that's their goal). We should probably have the
>> same goal. Going from 0 to 2 ULPs would be considered bad.
> But we do that everywhere (with -funsafe-math-optimizations or
> -fassociative-math).
So if we're going from 0->2 ULPs in some cases, do we want to guard it
with one of the various options, if so, which? Giuliano's follow-up
will still have the potential for 2ULPs.

jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi Jeff,

> So if we're going from 0->2 ULPs in some cases, do we want to guard it
> with one of the various options, if so, which? Giuliano's follow-up
> will still have the potential for 2ULPs.

The ULP difference is not important since the individual math functions
already have ULP of 3 or higher. Changing ULP error for some or all
inputs (like we did with the rewritten math functions) is not considered
an issue as long as worst-case ULP error doesn't increase.

The question is more like whether errno and trapping/exception behaviour
is identical - I guess it is not, so I would expect this to be fastmath
only. Which particular flag one uses is a detail given there isn't a
clear definition for most of them.

Wilco
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On Wed, Nov 07, 2018 at 10:34:30PM +, Wilco Dijkstra wrote:
> Hi Jeff,
>
> > So if we're going from 0->2 ULPs in some cases, do we want to guard it
> > with one of the various options, if so, which? Giuliano's follow-up
> > will still have the potential for 2ULPs.
>
> The ULP difference is not important since the individual math functions
> already have ULP of 3 or higher. Changing ULP error for some or all
> inputs (like we did with the rewritten math functions) is not considered
> an issue as long as worst-case ULP error doesn't increase.

But the max. error in sinh/cosh/atanh is less than 2 ULP, with some math
libraries. It could be < 1 ULP, in theory, so sinh(atanh(x)) less than
2 ULP even.

> The question is more like whether errno and trapping/exception behaviour
> is identical - I guess it is not so I would expect this to be fastmath
> only. Which particular flag one uses is a detail given there isn't a
> clear definition for most of them.

And signed zeroes. Yeah. I think it would have to be
flag_unsafe_math_optimizations + some more.


Segher
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi,

> But the max. error in sinh/cosh/atanh is less than 2 ULP, with some math
> libraries. It could be < 1 ULP, in theory, so sinh(atanh(x)) less than
> 2 ULP even.

You can't add ULP errors in general - a tiny difference in the input can
make a huge difference in the result if the derivative is > 1.

Even with perfect implementations of 0.501ULP on easy functions with
no large derivatives you could get a 2ULP total error if the perfectly
rounded and actual result end up rounding in different directions in the
2nd function...

So you have to measure ULP error since it is quite counter intuitive.

> And signed zeroes. Yeah. I think it would have to be
> flag_unsafe_math_optimizations + some more.

Indeed.

Wilco
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On 11/8/18 6:33 AM, Wilco Dijkstra wrote:
> Hi,
>
>> But the max. error in sinh/cosh/atanh is less than 2 ULP, with some math
>> libraries. It could be < 1 ULP, in theory, so sinh(atanh(x)) less than
>> 2 ULP even.
>
> You can't add ULP errors in general - a tiny difference in the input can
> make a huge difference in the result if the derivative is > 1.
>
> Even with perfect implementations of 0.501ULP on easy functions with
> no large derivatives you could get a 2ULP total error if the perfectly
> rounded and actual result end up rounding in different directions in the
> 2nd function...
>
> So you have to measure ULP error since it is quite counter intuitive.
>
>> And signed zeroes. Yeah. I think it would have to be
>> flag_unsafe_math_optimizations + some more.
>
> Indeed.
So we need to give Giuliano some clear guidance on guarding. This is
out of my area of expertise, so looking to y'all to help here.

jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi. Sorry for the late reply :P

> But the max. error in sinh/cosh/atanh is less than 2 ULP, with some math
> libraries. It could be < 1 ULP, in theory, so sinh(atanh(x)) less than
> 2 ULP even.

Sorry, but doesn't the user agree to sacrifice precision for performance
when -ffast-math is enabled?

>> The question is more like whether errno and trapping/exception behaviour
>> is identical - I guess it is not so I would expect this to be fastmath
>> only. Which particular flag one uses is a detail given there isn't a
>> clear definition for most of them.

> And signed zeroes. Yeah. I think it would have to be
> flag_unsafe_math_optimizations + some more.

From my point of view, this optimization is OK for IEEE 754. So I have
to check if the target has signed zeroes and supports signed infinity.
I will look into that.

> So we need to give Giuliano some clear guidance on guarding. This is
> out of my area of expertise, so looking to y'all to help here.

At this point I don't know how to check that, but I will look into it.

On Fri, Nov 9, 2018 at 6:03 PM Jeff Law wrote:
>
> So we need to give Giuliano some clear guidance on guarding. This is
> out of my area of expertise, so looking to y'all to help here.
>
> jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On Fri, Nov 09, 2018 at 01:03:55PM -0700, Jeff Law wrote:
> >> And signed zeroes. Yeah. I think it would have to be
> >> flag_unsafe_math_optimizations + some more.
> >
> > Indeed.
> So we need to give Giuliano some clear guidance on guarding. This is
> out of my area of expertise, so looking to y'all to help here.

IMO, it needs flag_unsafe_math_optimizations, as above; and it needs to
be investigated which (if any) options like flag_signed_zeros it needs
in addition to that. It needs an option like that whenever the new
expression can give a zero with a different sign than the original
expression, etc. Although it could be said that
flag_unsafe_math_optimizations supersedes all of that. It isn't clear.


Segher
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On Sat, Nov 10, 2018 at 6:36 AM Segher Boessenkool wrote:
>
> IMO, it needs flag_unsafe_math_optimizations, as above; and it needs to
> be investigated which (if any) options like flag_signed_zeros it needs
> in addition to that. It needs an option like that whenever the new
> expression can give a zero with a different sign than the original
> expression, etc. Although it could be said that
> flag_unsafe_math_optimizations supersedes all of that. It isn't clear.

It indeed isn't clear whether at least some of the other flags make no
sense with -funsafe-math-optimizations. Still, at least for
documentation purposes, please use

  !flag_signed_zeros && flag_unsafe_math_optimizations && ...

flag_unsafe_math_optimizations is generally used when there's extra
rounding involved. Some specific kinds of transforms have individual
flags and do not require flag_unsafe_math_optimizations (re-association
and contraction, for example).

I'm not sure I would require flag_unsafe_math_optimizations for a 2ulp
error though.

Richard.
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On 8/7/18 2:00 PM, Giuliano Augusto Faulin Belinassi wrote:
> Related with bug 86829, but for hyperbolic trigonometric functions.
> This patch adds substitution rules to both sinh(tanh(x)) -> x / sqrt(1
> - x*x) and cosh(tanh(x)) -> 1 / sqrt(1 - x*x). Notice that the both
> formulas has division by 0, but it causes no harm because 1/(+0) ->
> +infinity, thus the math is still safe.
>
> Changelog:
> 2018-08-07  Giuliano Belinassi
>
> * match.pd: add simplification rules to sinh(atanh(x)) and
> cosh(atanh(x)).
>
> All tests added by this patch runs without errors in trunk, however,
> there are tests unrelated with this patch that fails in my x86_64
> Ubuntu 18.04.
I think these are going to need similar handling because the x*x can
overflow. Are the domains constrained in a way that is helpful?

jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hello.

I don't think there is a need for overflow handling here, because the
argument is bound by the argument of the sqrt function :-)

Since we have to compute sqrt(1 - x*x), the input is only valid if
1 - x*x >= 0, implying that -1 <= x <= 1. For any x outside of this set,
the sqrt will return an invalid value, as imaginary numbers are required
to represent the answer.

One can also argue a problem regarding division by 0; however, in the
extremes x -> -1 from the right and x -> 1 from the left we have:

sinh(atanh(-1)) = -1 / sqrt(0) = -inf
sinh(atanh( 1)) =  1 / sqrt(0) = +inf
cosh(atanh(-1)) =  1 / sqrt(0) = +inf
cosh(atanh( 1)) =  1 / sqrt(0) = +inf

Therefore it seems that the target has to support infinity anyway.

Well, I think I can take a look at how glibc handles such cases on
targets where infinity is not supported, to try to keep compatibility,
but I think this is safe :-)

On Fri, Oct 12, 2018 at 1:09 AM Jeff Law wrote:
>
> I think these are going to need similar handling because the x*x can
> overflow. Are the domains constrained in a way that is helpful?
>
> jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On 10/12/18 8:36 AM, Giuliano Augusto Faulin Belinassi wrote:
> Hello.
> I don't think there is a need for overflow handling here because
> the argument is bound by the argument of the sqrt function :-)
Yea, I guess you're right. The domain of arctanh is -1 to 1, so I guess
we're safe there. Except for the case where the input is -1 or 1, in
which case I think you just set the output to +- INF as appropriate.

Hmm, do we have problems as we get close to -1 or 1 where the outputs of
the two forms might diverge?

Jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
> Hmm, do we have problems as we get close to -1 or 1 where the outputs of
> the two forms might diverge?

Well, I did some minor testing with that, with input x around
nextafter(1, -1). There is a minor imprecision when comparing directly
with sinh(atanh(x)) and cosh(atanh(x)).
  * On 32-bit floats, for such x the error is about 10^-4
  * On 64-bit floats, for such x the error is about 10^-7
  * On 80-bit floats, for such x the error is about 10^-9

Here is the code that I used for the test: https://pastebin.com/JzYZyigQ

I can create a testcase based on this if needed :-)
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On 10/17/18 3:25 PM, Giuliano Augusto Faulin Belinassi wrote:
>> Hmm, do we have problems as we get close to -1 or 1 where the outputs of
>> the two forms might diverge?
>
> Well, I did some minor testing with that with input x around
> nextafter(1, -1); there are a minor imprecision when comparing directly
> with sinh(atanh(x)) and cosh(atanh(x)).
> * On 32-bits floats, for such x the error is about 10^-4
> * On 64-bits floats, for such x the error is about 10^-7
> * On 80-bits floats, for such x the error is about 10^-9
>
> here are the code that I used for the test: https://pastebin.com/JzYZyigQ
>
> I can create a testcase based on this if needed :-)
My gut instinct is those errors are too significant in practice.

It also just occurred to me that we may have problems as x*x approaches
0 or 1 from either direction.

Clearly when x^2 is indistinguishable from 0 or 1, then the result has
to be +-0 or +-1. But I'm not sure if figuring out where those points
are is sufficient to avoid the imprecisions noted above. This is *well*
outside my areas of expertise.

jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Oh, please note that the error that I'm talking about is the comparison
between the result obtained before and after the simplification. It is
possible that the result obtained after the simplification is more
precise when compared to an arbitrarily precise value (for example, a
30-digit precise approximation). Well, I will try to check that.

But yes, with regard to compatibility this may be a problem.

On Wed, Oct 17, 2018 at 6:42 PM Jeff Law wrote:
>
> My gut instinct is those errors are too significant in practice.
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On 10/17/18 4:21 PM, Giuliano Augusto Faulin Belinassi wrote:
> Oh, please note that the error that I'm talking about is the
> comparison with the result obtained before and after the
> simplification. It is possible that the result obtained after the
> simplification be more precise when compared to an arbitrary precise
> value (example, a 30 digits precise approximation). Well, I will try
> check that.
That would be helpful. Obviously if we're getting more precise, then
that's a good thing :-)

jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On 10/18, Jeff Law wrote:
> That would be helpful. Obviously if we're getting more precise, then
> that's a good thing :-)
>
> jeff

Well, I compared the results before and after the simplifications with a
512-bit precise mpfr value. Unfortunately, I found that sometimes the
error is very noticeable :-(

For example, using floats and comparing with a 512-bit precision mpfr
calculation:

with input   : = 9.9996697902679443359375e-01
cosh: before : = 1.2305341339111328125000e+02
cosh: after  : = 1.230523986816406250e+02
cosh: mpfr512: = 1.23053409952258504358633865742873246642102963529577e+02
error before : = 3.43885477689136613425712675335789703647042270993727e-06
error after  : = 1.01127061787935863386574287324664210296352957729006e-03

There are also some significant losses of precision with long doubles:

with input   : = 9.96799706237365912286918501195032149553e-01
cosh: before : = 1.24994262843556815705596818588674068450927734375000e+07
cosh: after  : = 1.24994262843556715697559411637485027313232421875000e+07
cosh: mpfr512: = 1.24994262843556815704069193408098058772318248178348e+07
error before : = 1.52762518057600967860948619665184971612393688891101e-13
error after  : = 1.6509781770613031459085826303348150283876063111e-08

So yes, precision may be a problem here.
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi,

> Well, I compared the results before and after the simplifications with
> a 512-bit precise mpfr value. Unfortunately, I found that sometimes the
> error is very noticeable :-( .

Did you enable FMA? I'd expect 1 - x*x to be accurate with FMA, so the
relative error should be much better. If there is no FMA,
2*(1-fabs(x)) - (1-fabs(x))^2 should be more accurate when abs(x)>0.5
and still much faster.

Wilco
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hello,

> Did you enable FMA? I'd expect 1 - x*x to be accurate with FMA, so the
> relative error should be much better. If there is no FMA,
> 2*(1-fabs(x)) - (1-fabs(x))^2 should be more accurate when abs(x)>0.5
> and still much faster.

No, but I will check how to enable it if FMA is available. I did a minor
test with your formula and the precision improved a lot. Here is an
example for floats:

with input  : = 9.988079071044921875e-01
cosh: before: = 2.048000e+03
cosh: after : = 2.048000244140625000e+03
cosh: mpfr  : = 2.0486103515897848424084406334262726138617589463e+03
error before: = 6.10351589784842408440633426272613861758946325324235e-05
error after : = 1.83105466021515759155936657372738613824105367467577e-04

But now I am puzzled about how you came up with that formula :-). I am
able to prove equality, but how did you know it was going to be more
precise?
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi,

>> Did you enable FMA? I'd expect 1 - x*x to be accurate with FMA, so the
>> relative error should be much better. If there is no FMA,
>> 2*(1-fabs(x)) - (1-fabs(x))^2 should be more accurate when abs(x)>0.5
>> and still much faster.
>
> No, but I will check how to enable it if FMA is available.
> I did a minor test with your formula and the precision improved a lot.
> But now I am puzzled about how did you come up with that formula :-).
> I am able to proof equality, but how did you know it was going to be
> more precise?

Basically when x is close to 1, the top N bits in the mantissa of x will
be ones. Then x*x has one bits in the top 2*N bits of the mantissa. Ie.
we lose N bits of useful information in the multiply - problematic when
N gets close to the number of mantissa bits. In contrast, FMA computes
the fully accurate result due to cancellation of the top 2*N one-bits in
the subtract.

If we can use (1-x) instead of x in the evaluation, we avoid losing
accuracy in the multiply when x is close to 1. Then it's basic algebra
to find an equivalent formula that can produce 1-x^2 using 1-x. For
example (1+x)*(1-x) will work fine too (using 1+x loses 1 low bit of x).

Note that these alternative evaluations lose accuracy close to 0 in
exactly the same way, so if no FMA is available you'd need to select
between the 2 cases.

Wilco
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi all,

On Fri, Oct 19, 2018 at 09:21:07AM -0300, Giuliano Augusto Faulin Belinassi wrote:
> > Did you enable FMA? I'd expect 1 - x*x to be accurate with FMA, so the
> > relative error should be much better. If there is no FMA,
> > 2*(1-fabs(x)) - (1-fabs(x))^2 should be more accurate when abs(x)>0.5
> > and still much faster.
>
> No, but I will check how to enable it if FMA is available.
> I did a minor test with your formula and the precision improved a lot.
> Here is an example for floats
>
> with input : = 9.988079071044921875e-01
> cosh: before: = 2.048000e+03
> cosh: after : = 2.048000244140625000e+03
> cosh: mpfr : = 2.04800006103515897848424084406334262726138617589463e+03
> error before: = 6.10351589784842408440633426272613861758946325324235e-05
> error after : = 1.83105466021515759155936657372738613824105367467577e-04

Maybe I am crazy, or the labels here are wrong, but that looks like the
error is three times as *big* after the patch. I.e. it worsened instead
of improving.


Segher
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On Fri, Oct 19, 2018 at 01:39:01PM +, Wilco Dijkstra wrote:
> >> Did you enable FMA? I'd expect 1 - x*x to be accurate with FMA, so the
> >> relative error should be much better. If there is no FMA,
> >> 2*(1-fabs(x)) - (1-fabs(x))^2 should be more accurate when abs(x)>0.5
> >> and still much faster.
> >
> > No, but I will check how to enable it if FMA is available.
> > I did a minor test with your formula and the precision improved a lot.
> >
> > But now I am puzzled about how you came up with that formula :-).
> > I am able to prove equality, but how did you know it was going to be
> > more precise?
>
> Basically when x is close to 1, the top N bits in the mantissa will be
> ones. Then x*x has one bits in the top 2*N bits of the mantissa. Ie. we
> lose N bits of useful information in the multiply - problematic when N
> gets close to the number of mantissa bits. In contrast FMA computes the
> fully accurate result due to cancellation of the top 2*N one-bits in the
> subtract.
>
> If we can use (1-x) instead of x in the evaluation, we avoid losing
> accuracy in the multiply when x is close to 1. Then it's basic algebra
> to find an equivalent formula that can produce 1-x^2 using 1-x. For
> example (1+x)*(1-x) will work fine too (using 1+x loses 1 low bit of x).
>
> Note that these alternative evaluations lose accuracy close to 0 in
> exactly the same way, so if no FMA is available you'd need to select
> between the 2 cases.

At this point this seems like something that shouldn't be done inline
anymore, so either we don't do this optimization at all, because the
errors are far bigger than what is acceptable even for -ffast-math, or
we have a library function that does the sinh (tanh (x)) and
cosh (tanh (x)) computations somewhere (libm, libgcc, ...) that handles
all the cornercases.

	Jakub
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
> Maybe I am crazy, or the labels here are wrong, but that looks like the
> error is three times as *big* after the patch. I.e. it worsened instead
> of improving.

Oh, sorry. I was not clear in my previous message. The error did not
improve with regard to the original formula; what I meant is that it
improved with regard to the original (1-x*x) simplification. But you
are right, the above error is about 3 times bigger than the original
formula, but before the error was about 300 times bigger.

You are not crazy :P

On Fri, Oct 19, 2018 at 10:46 AM Segher Boessenkool wrote:
>
> Hi all,
>
> On Fri, Oct 19, 2018 at 09:21:07AM -0300, Giuliano Augusto Faulin Belinassi wrote:
> > > Did you enable FMA? I'd expect 1 - x*x to be accurate with FMA, so the
> > > relative error should be much better. If there is no FMA,
> > > 2*(1-fabs(x)) - (1-fabs(x))^2 should be more accurate when abs(x)>0.5
> > > and still much faster.
> >
> > No, but I will check how to enable it if FMA is available.
> > I did a minor test with your formula and the precision improved a lot.
> > Here is an example for floats
> >
> > with input : = 9.988079071044921875e-01
> > cosh: before: = 2.048000e+03
> > cosh: after : = 2.048000244140625000e+03
> > cosh: mpfr : = 2.04800006103515897848424084406334262726138617589463e+03
> > error before: = 6.10351589784842408440633426272613861758946325324235e-05
> > error after : = 1.83105466021515759155936657372738613824105367467577e-04
>
> Maybe I am crazy, or the labels here are wrong, but that looks like the
> error is three times as *big* after the patch. I.e. it worsened instead
> of improving.
>
>
> Segher
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Jakub Jelinek wrote:

> At this point this seems like something that shouldn't be done inline
> anymore, so either we don't do this optimization at all, because the
> errors are far bigger than what is acceptable even for -ffast-math, or
> we have a library function that does the sinh (tanh (x)) and
> cosh (tanh (x)) computations somewhere (libm, libgcc, ...) that handles
> all the cornercases.

The FMA version should not have any accuracy issues. Without FMA it's
harder, but it's not that different from the sin(atan(x)) simplification
which also requires two separate cases. So it's more a question how much
effort we want to spend optimizing for targets which do not support FMA.

Wilco
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi,

>> Maybe I am crazy, or the labels here are wrong, but that looks like the
>> error is three times as *big* after the patch. I.e. it worsened instead
>> of improving.

This error is actually 1ULP, so just a rounding error. Can't expect any
better than that!

> with input : = 9.988079071044921875e-01
> cosh: before: = 2.048000e+03
> cosh: after : = 2.048000244140625000e+03
> cosh: mpfr : = 2.04800006103515897848424084406334262726138617589463e+03
> error before: = 6.10351589784842408440633426272613861758946325324235e-05
> error after : = 1.83105466021515759155936657372738613824105367467577e-04

It may be less confusing to print relative error or ULP error...

Wilco
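[Editor's note: for the ULP reporting suggested above, one common trick is to compare the ordered integer encodings of the two doubles. A minimal sketch follows - the helper name is ours; it assumes IEEE-754 binary64, finite inputs, and a distance that fits in an int64 (i.e. not values at opposite extremes of the range).]

```c
#include <stdint.h>
#include <string.h>

/* ULP distance between two finite doubles: reinterpret the bits as
   64-bit integers, remap negative values so the encoding is monotone
   across zero, and take the difference.  Adjacent doubles are 1 apart,
   and +0.0/-0.0 compare equal.  */
static int64_t ulp_dist(double a, double b)
{
  int64_t ia, ib;
  memcpy(&ia, &a, sizeof ia);        /* assumes IEEE-754 binary64 */
  memcpy(&ib, &b, sizeof ib);
  if (ia < 0) ia = INT64_MIN - ia;   /* sign-magnitude -> monotone */
  if (ib < 0) ib = INT64_MIN - ib;
  return ia > ib ? ia - ib : ib - ia;
}
```

With the reference rounded to double first, `ulp_dist(result, reference)` gives exactly the "ulp err" figures quoted later in this thread.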
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
So I did some further investigation comparing the ULP error.

With the formula that Wilco Dijkstra provided, there are cases where
the substitution is super precise.

With floats:
with input : = 9.9940395355224609375000e-01
sinh: before: = 2.89631005859375e+03
sinh: after : = 2.896309326171875000e+03
sinh: mpfr : = 2.89630924626497842670468162463283783344599446025119e+03
ulp err befr: = 3
ulp err aftr: = 0

With doubles:
with input : = 9.99888977697537484345957636833190917969e-01
sinh: before: = 6.710886400029802322387695312500e+07
sinh: after : = 6.71088632549419403076171875e+07
sinh: mpfr : = 6.710886344120645523071287770030292885894208e+07
ulp err befr: = 3
ulp err aftr: = 0

*However*, there are cases where some error shows up. The biggest ULP
error that I could find was 2.

With floats:
with input : = 9.99968349933624267578125000e-01
sinh: before: = 1.2568613433837890625000e+02
sinh: after : = 1.2568614959716796875000e+02
sinh: mpfr : = 1.25686137592274042266452526368087062890399889097864e+02
ulp err befr: = 0
ulp err aftr: = 2

With doubles:
with input : = 9.999463651256803586875321343541145324707031e-01
sinh: before: = 9.65520209507428342476487159729003906250e+05
sinh: after : = 9.6552020950742810964584350585937500e+05
sinh: mpfr : = 9.65520209507428288553227922831618987450806468855883e+05
ulp err befr: = 0
ulp err aftr: = 2

And with FMA we have the same results shown above (super precise cases,
and maximum ULP error equal 2).

So maybe update the patch with the following rules?
 * If FMA is available, then compute 1 - x*x with it.
 * If FMA is not available, then do the Dijkstra substitution when |x| > 0.5.

The code I used for testing: https://pastebin.com/zxPeXmJB

On Fri, Oct 19, 2018 at 11:32 AM Wilco Dijkstra wrote:
>
> Hi,
>
> >> Maybe I am crazy, or the labels here are wrong, but that looks like the
> >> error is three times as *big* after the patch. I.e. it worsened instead
> >> of improving.
>
> This error is actually 1ULP, so just a rounding error. Can't expect any
> better than that!
>
> > with input : = 9.988079071044921875e-01
> > cosh: before: = 2.048000e+03
> > cosh: after : = 2.048000244140625000e+03
> > cosh: mpfr : = 2.04800006103515897848424084406334262726138617589463e+03
> > error before: = 6.10351589784842408440633426272613861758946325324235e-05
> > error after : = 1.83105466021515759155936657372738613824105367467577e-04
>
> It may be less confusing to print relative error or ULP error...
>
> Wilco
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On 10/20/18 9:47 AM, Giuliano Augusto Faulin Belinassi wrote:
> So I did some further investigation comparing the ULP error.
>
> With the formula that Wilco Dijkstra provided, there are cases where
> the substitution is super precise.
>
> With floats:
> with input : = 9.9940395355224609375000e-01
> sinh: before: = 2.89631005859375e+03
> sinh: after : = 2.896309326171875000e+03
> sinh: mpfr : = 2.89630924626497842670468162463283783344599446025119e+03
> ulp err befr: = 3
> ulp err aftr: = 0
>
> With doubles:
> with input : = 9.99888977697537484345957636833190917969e-01
> sinh: before: = 6.710886400029802322387695312500e+07
> sinh: after : = 6.71088632549419403076171875e+07
> sinh: mpfr : = 6.710886344120645523071287770030292885894208e+07
> ulp err befr: = 3
> ulp err aftr: = 0
>
> *However*, there are cases where some error shows up. The biggest ULP
> error that I could find was 2.
>
> With floats:
> with input : = 9.99968349933624267578125000e-01
> sinh: before: = 1.2568613433837890625000e+02
> sinh: after : = 1.2568614959716796875000e+02
> sinh: mpfr : = 1.25686137592274042266452526368087062890399889097864e+02
> ulp err befr: = 0
> ulp err aftr: = 2
>
> With doubles:
> with input : = 9.999463651256803586875321343541145324707031e-01
> sinh: before: = 9.65520209507428342476487159729003906250e+05
> sinh: after : = 9.6552020950742810964584350585937500e+05
> sinh: mpfr : = 9.65520209507428288553227922831618987450806468855883e+05
> ulp err befr: = 0
> ulp err aftr: = 2
>
> And with FMA we have the same results shown above (super precise cases,
> and maximum ULP error equal 2).
>
> So maybe update the patch with the following rules?
>  * If FMA is available, then compute 1 - x*x with it.
>  * If FMA is not available, then do the Dijkstra substitution when
>    |x| > 0.5.

So I think the runtime math libraries shoot for .5 ULP (yes, they don't
always make it, but that's their goal). We should probably have the same
goal. Going from 0 to 2 ULPs would be considered bad.

So ideally we'd have some way to distinguish between the cases where we
actually improve things (such as in your example). I don't know if
that's possible.

jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On Mon, Oct 22, 2018 at 10:09 PM Jeff Law wrote:
>
> On 10/20/18 9:47 AM, Giuliano Augusto Faulin Belinassi wrote:
> > So I did some further investigation comparing the ULP error.
> >
> > With the formula that Wilco Dijkstra provided, there are cases where
> > the substitution is super precise.
> >
> > With floats:
> > with input : = 9.9940395355224609375000e-01
> > sinh: before: = 2.89631005859375e+03
> > sinh: after : = 2.896309326171875000e+03
> > sinh: mpfr : = 2.89630924626497842670468162463283783344599446025119e+03
> > ulp err befr: = 3
> > ulp err aftr: = 0
> >
> > With doubles:
> > with input : = 9.99888977697537484345957636833190917969e-01
> > sinh: before: = 6.710886400029802322387695312500e+07
> > sinh: after : = 6.71088632549419403076171875e+07
> > sinh: mpfr : = 6.710886344120645523071287770030292885894208e+07
> > ulp err befr: = 3
> > ulp err aftr: = 0
> >
> > *However*, there are cases where some error shows up. The biggest ULP
> > error that I could find was 2.
> >
> > With floats:
> > with input : = 9.99968349933624267578125000e-01
> > sinh: before: = 1.2568613433837890625000e+02
> > sinh: after : = 1.2568614959716796875000e+02
> > sinh: mpfr : = 1.25686137592274042266452526368087062890399889097864e+02
> > ulp err befr: = 0
> > ulp err aftr: = 2
> >
> > With doubles:
> > with input : = 9.999463651256803586875321343541145324707031e-01
> > sinh: before: = 9.65520209507428342476487159729003906250e+05
> > sinh: after : = 9.6552020950742810964584350585937500e+05
> > sinh: mpfr : = 9.65520209507428288553227922831618987450806468855883e+05
> > ulp err befr: = 0
> > ulp err aftr: = 2
> >
> > And with FMA we have the same results shown above (super precise
> > cases, and maximum ULP error equal 2).
> >
> > So maybe update the patch with the following rules?
> >  * If FMA is available, then compute 1 - x*x with it.
> >  * If FMA is not available, then do the Dijkstra substitution when
> >    |x| > 0.5.

> So I think the runtime math libraries shoot for .5 ULP (yes, they don't
> always make it, but that's their goal). We should probably have the
> same goal. Going from 0 to 2 ULPs would be considered bad.

But we do that everywhere (with -funsafe-math-optimizations or
-fassociative-math).

Richard.

> So ideally we'd have some way to distinguish between the cases where we
> actually improve things (such as in your example). I don't know if
> that's possible.
>
> jeff
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi,

>> So I think the runtime math libraries shoot for .5 ULP (yes, they don't
>> always make it, but that's their goal). We should probably have the
>> same goal. Going from 0 to 2 ULPs would be considered bad.

Generally the goal is 1 ULP in round to nearest - other rounding modes
may have higher ULP. The current GLIBC float/double/long double sinh and
tanh are 2 ULP in libm-test-ulps (they can be 4 ULP in non-nearest
rounding modes). cosh is 1 ULP in round to nearest but up to 3 in other
rounding modes.

> But we do that everywhere (with -funsafe-math-optimizations or
> -fassociative-math).

Exactly. And 2 ULP is extremely accurate for fast-math transformations -
much better than eg. reassociating additions.

Wilco
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
On Tue, Oct 23, 2018 at 10:37:54AM +, Wilco Dijkstra wrote:
> >> So I think the runtime math libraries shoot for .5 ULP (yes, they don't
> >> always make it, but that's their goal). We should probably have the
> >> same goal. Going from 0 to 2 ULPs would be considered bad.
>
> Generally the goal is 1 ULP in round to nearest

Has that changed recently? At least in the past for double the goal has
always been .5 ULP in round to nearest.

> > But we do that everywhere (with -funsafe-math-optimizations or
> > -fassociative-math).
>
> Exactly. And 2 ULP is extremely accurate for fast-math transformations -
> much better than eg. reassociating additions.

For -ffast-math yeah.

	Jakub
Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules
Hi,

>> Generally the goal is 1 ULP in round to nearest
>
> Has that changed recently? At least in the past for double the goal has
> always been .5 ULP in round to nearest.

Yes. 0.5 ULP (perfect rounding) as a goal was insane, as it caused
ridiculous slowdowns in the 10x range for no apparent reason. GLIBC was
blacklisted in the HPC community as a result. So I removed most of the
perfect rounding code - this not only avoids the slowdown but also
speeds up the average case significantly.

The goal is to stay below 1 ULP; the math functions Szabolcs and I
rewrote generally do better, eg. sinf is 0.56 ULP.

Wilco