Re: patch to bug #86829
On 8/22/18 6:02 AM, Richard Biener wrote:
> On Tue, Aug 21, 2018 at 11:27 PM Jeff Law wrote:
>> [...]
>
> I think his idea was to emit a runtime test? You'd have to use a
> COND_EXPR and evaluate both arms at the same time because
> match.pd doesn't allow you to create control flow.

Right. That's what his subsequent patch does. Can you take a peek at
the match.pd part?

> Note the rounding issue is also real given that for large x you strip
> away lower mantissa bits when computing x*x.

Does that happen for values less than 1e8?

Jeff
Re: patch to bug #86829
On Wed, Aug 22, 2018 at 7:05 PM, Jeff Law wrote:
> On 08/22/2018 06:02 AM, Richard Biener wrote:
>> [...]
>> I think his idea was to emit a runtime test?
>
> Ah, a runtime test. That'd be sufficient. The cost when we can't do
> the transformation is relatively small, but the gains when we can are huge.

Thank you. I will update the patch and send it again. :-)
Re: patch to bug #86829
On 08/22/2018 06:02 AM, Richard Biener wrote:
> On Tue, Aug 21, 2018 at 11:27 PM Jeff Law wrote:
>> [...]
>
> I think his idea was to emit a runtime test? You'd have to use a
> COND_EXPR and evaluate both arms at the same time because
> match.pd doesn't allow you to create control flow.
>
> Note the rounding issue is also real given that for large x you strip
> away lower mantissa bits when computing x*x.

Ah, a runtime test. That'd be sufficient. The cost when we can't do
the transformation is relatively small, but the gains when we can are huge.

Jeff
Re: patch to bug #86829
On Tue, Aug 21, 2018 at 11:27 PM Jeff Law wrote:
> On 08/21/2018 02:08 PM, Giuliano Augusto Faulin Belinassi wrote:
>> [...] Solving for x yields that x must be somewhat
>> bigger than 6.7e7, so let's take 1e8. Therefore if abs(x) > 1e8, it is
>> enough to return copysign(1, x). [...]
>
> The problem is our VRP implementation doesn't handle any floating point
> types at this time. If we had range information for FP types, then
> this kind of analysis is precisely what we'd need to do the
> transformation regardless of -ffast-math.

I think his idea was to emit a runtime test? You'd have to use a
COND_EXPR and evaluate both arms at the same time because
match.pd doesn't allow you to create control flow.

Note the rounding issue is also real given that for large x you strip
away lower mantissa bits when computing x*x.

Richard.
Re: patch to bug #86829
On Tue, 21 Aug 2018, Jeff Law wrote: > The problem is our VRP implementation doesn't handle any floating point > types at this time. If we had range information for FP types, then > this kind of analysis is precisely what we'd need to do the > transformation regardless of -ffast-math. I don't think you can do it regardless of -ffast-math simply because it may change the semantics and we've generally assumed that if the optimization might produce results different from what you get with correctly rounded library functions, it should go under -funsafe-math-optimizations. One might try to figure out a way to split that option, to distinguish optimizations that might change correctly rounded result but keep errors small from optimizations that might produce results that are way off, or spurious exceptions, for some inputs. -- Joseph S. Myers jos...@codesourcery.com
Re: patch to bug #86829
On 08/21/2018 02:08 PM, Giuliano Augusto Faulin Belinassi wrote:
> I think I managed to find a bound where the transformation can be done
> without overflow harm; I don't know about rounding problems, however.
> [...]
> Solving for x yields that x must be somewhat
> bigger than 6.7e7, so let's take 1e8. Therefore if abs(x) > 1e8, it is
> enough to return copysign(1, x).
>
> This might still be faster than calculating sin(atan(x)) explicitly.
>
> Please let me know if this is unfeasible. :-)

The problem is our VRP implementation doesn't handle any floating point
types at this time. If we had range information for FP types, then
this kind of analysis is precisely what we'd need to do the
transformation regardless of -ffast-math.

jeff
Re: patch to bug #86829
> Just as an example, compare the results for
> x = 0x1.fp1023

Thank you for your answer and the counterexample. :-)

> If we had useful range info on floats we might conditionalize such
> transforms appropriately. Or we can enable it on floats and do
> the sqrt (x*x + 1) in double.

I think I managed to find a bound where the transformation can be done
without overflow harm; I don't know about rounding problems, however.

Suppose we are handling double-precision floats for now. The function
x/sqrt(1 + x*x) approaches 1 when x is big enough. How big must x be
for the function to evaluate to exactly 1?

Since sqrt(1 + x*x) > x when x > 1, we must find a value of x such
that x/sqrt(1 + x*x) > eps, where eps is the biggest double smaller
than 1. Such an eps must be around 1 - 2^-53 in IEEE double because
the mantissa has 52 bits. Solving for x yields that x must be somewhat
bigger than 6.7e7, so let's take 1e8. Therefore if abs(x) > 1e8, it is
enough to return copysign(1, x). Notice that this argument is also
valid for x = +-inf (if the target supports that) because
sin(atan(+-inf)) = +-1, and it can be extended to other floating-point
formats. The following test code illustrates my point:
https://pastebin.com/M4G4neLQ

This might still be faster than calculating sin(atan(x)) explicitly.

Please let me know if this is unfeasible. :-)

Giuliano.

On Tue, Aug 21, 2018 at 11:27 AM, Jeff Law wrote:
> [...]
Re: patch to bug #86829
On 08/21/2018 02:02 AM, Richard Biener wrote:
> On Mon, Aug 20, 2018 at 9:40 PM Jeff Law wrote:
>> [...]
>
> It's under the flag_unsafe_math_optimizations umbrella, but sure,
> a "proper" way to optimize this would be to further expand
> sqrt (x*x + 1) to fabs(x) + ... (extra terms) that are precise enough
> and not have this overflow issue.
>
> But yes, I do not find (quickly skimming) other simplifications that
> have this kind of overflow issue (in fact I do remember raising
> overflow/underflow issues for other patches).
>
> Thus approval withdrawn.

At least until we can do some testing around spec. There's also a patch
for logarithm addition/subtraction from MCC CS and another from Giuliano
for hyperbolics that need testing with spec. I think that getting that
testing done anytime between now and stage1 close is sufficient -- none
of the three patches is particularly complex.

> If we had useful range info on floats we might conditionalize such
> transforms appropriately. Or we can enable it on floats and do
> the sqrt (x*x + 1) in double.

Yea. I keep thinking about what it might take to start doing some light
VRP of floating-point objects. I'd originally been thinking to just
track 0.0 and exceptional value state. But the more I ponder, the more I
think we could use the range information to allow transformations that
are currently guarded by the -ffast-math family of options.

jeff
Re: patch to bug #86829
On Mon, Aug 20, 2018 at 9:40 PM Jeff Law wrote:
> On 08/04/2018 07:22 AM, Giuliano Augusto Faulin Belinassi wrote:
>> [...]
>
> Just as an example, compare the results for
> x = 0x1.fp1023
>
> I think sin(atan (x)) is well defined in that case. But the x*x isn't
> because it overflows.
>
> So I think this has to be somewhere under the -ffast-math umbrella.
> And the testing requirements for that are painful -- you have to verify
> it doesn't break the spec benchmark.
>
> I know Richi acked in the PR, but that might have been premature.

It's under the flag_unsafe_math_optimizations umbrella, but sure,
a "proper" way to optimize this would be to further expand
sqrt (x*x + 1) to fabs(x) + ... (extra terms) that are precise enough
and not have this overflow issue.

But yes, I do not find (quickly skimming) other simplifications that
have this kind of overflow issue (in fact I do remember raising
overflow/underflow issues for other patches).

Thus approval withdrawn.

If we had useful range info on floats we might conditionalize such
transforms appropriately. Or we can enable it on floats and do
the sqrt (x*x + 1) in double.

Richard.
Re: patch to bug #86829
On 08/04/2018 07:22 AM, Giuliano Augusto Faulin Belinassi wrote:
> Closes bug #86829
>
> Description: Adds substitution rules for both sin(atan(x)) and
> cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and 1 /
> sqrt(x*x + 1) respectively, providing up to 10x speedup. This identity
> can be proved mathematically.
> [...]

I understand these are mathematical identities. But floating-point
arithmetic in a compiler isn't nearly that clean :-) We have to worry
about overflows, underflows, rounding, and the simple fact that many
floating-point numbers can't be exactly represented.

Just as an example, compare the results for
x = 0x1.fp1023

I think sin(atan(x)) is well defined in that case. But the x*x isn't,
because it overflows.

So I think this has to be somewhere under the -ffast-math umbrella.
And the testing requirements for that are painful -- you have to verify
it doesn't break the spec benchmark.

I know Richi acked in the PR, but that might have been premature.

jeff
Re: patch to bug #86829
ping

On Sat, Aug 4, 2018 at 10:22 AM, Giuliano Augusto Faulin Belinassi wrote:
> Closes bug #86829
>
> Description: Adds substitution rules for both sin(atan(x)) and
> cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and 1 /
> sqrt(x*x + 1) respectively, providing up to 10x speedup. This identity
> can be proved mathematically.
>
> Changelog:
>
> 2018-08-03  Giuliano Belinassi
>
>         * match.pd: add simplification rules to sin(atan(x)) and cos(atan(x)).
>
> Bootstrap and Testing:
> There were no unexpected failures in proper testing with GCC 8.1.0
> on an x86_64 running Ubuntu 18.04.
patch to bug #86829
Closes bug #86829

Description: Adds substitution rules for both sin(atan(x)) and
cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and
1 / sqrt(x*x + 1) respectively, providing up to 10x speedup. This
identity can be proved mathematically.

Changelog:

2018-08-03  Giuliano Belinassi

        * match.pd: add simplification rules to sin(atan(x)) and cos(atan(x)).

Bootstrap and Testing:
There were no unexpected failures in proper testing with GCC 8.1.0
on an x86_64 running Ubuntu 18.04.

Test run by giulianob on Fri Aug 3 17:01:33 2018
Native configuration is x86_64-pc-linux-gnu

=== gcc tests ===

Schedule of variations:
    unix

Running target unix
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /home/giulianob/Downloads/gcc/src/gcc/testsuite/config/default.exp as tool-and-target-specific interface file.