Re: patch to bug #86829

2018-10-03 Thread Jeff Law
On 8/22/18 6:02 AM, Richard Biener wrote:
> On Tue, Aug 21, 2018 at 11:27 PM Jeff Law  wrote:
>>
>> The problem is our VRP implementation doesn't handle any floating point
>> types at this time.   If we had range information for FP types, then
>> this kind of analysis is precisely what we'd need to do the
>> transformation regardless of -ffast-math.
> 
> I think his idea was to emit a runtime test?  You'd have to use a
> COND_EXPR and evaluate both arms at the same time because
> match.pd doesn't allow you to create control flow.
Right.  That's what his subsequent patch does.  Can you take a peek at
the match.pd part?

> 
> Note the rounding issue is also real given for large x you strip
> away lower mantissa bits when computing x*x.
Does that happen for values less than 1e8?

Jeff


Re: patch to bug #86829

2018-08-23 Thread Giuliano Augusto Faulin Belinassi
>
> Ah, a runtime test.  That'd be sufficient.  The cost when we can't do
> the transformation is relatively small, but the gains when we can are huge.

Thank you. I will update the patch and send it again :-)



Re: patch to bug #86829

2018-08-22 Thread Jeff Law
On 08/22/2018 06:02 AM, Richard Biener wrote:
> On Tue, Aug 21, 2018 at 11:27 PM Jeff Law  wrote:
>>
>> The problem is our VRP implementation doesn't handle any floating point
>> types at this time.   If we had range information for FP types, then
>> this kind of analysis is precisely what we'd need to do the
>> transformation regardless of -ffast-math.
> 
> I think his idea was to emit a runtime test?  You'd have to use a
> COND_EXPR and evaluate both arms at the same time because
> match.pd doesn't allow you to create control flow.
> 
> Note the rounding issue is also real given for large x you strip
> away lower mantissa bits when computing x*x.
Ah, a runtime test.  That'd be sufficient.  The cost when we can't do
the transformation is relatively small, but the gains when we can are huge.

Jeff


Re: patch to bug #86829

2018-08-22 Thread Richard Biener
On Tue, Aug 21, 2018 at 11:27 PM Jeff Law  wrote:
>
> On 08/21/2018 02:08 PM, Giuliano Augusto Faulin Belinassi wrote:
> >> Just as an example, compare the results for
> >> x = 0x1.fp1023
> >
> > Thank you for your answer and the counterexample. :-)
> >
> >> If we had useful range info on floats we might conditionalize such
> >> transforms appropriately.  Or we can enable it on floats and do
> >> the sqrt (x*x + 1) in double.
> >
> > I think I managed to find a bound where the transformation can be done
> > without overflow harm, though I don't know about rounding problems.
> >
> > Suppose we are handling double-precision floats for now. The function
> > x/sqrt(1 + x*x) approaches 1 when x is big enough. How big must x be
> > for the function to round to 1?
> >
> > Since sqrt(1 + x*x) > x when x > 1, we must find a value of x such
> > that x/sqrt(1 + x*x) > eps, where eps is the biggest double smaller
> > than 1. Such eps must be around 1 - 2^-53 in IEEE double because the
> > mantissa has 52 bits. Solving for x yields that x must be somewhat
> > bigger than 6.7e7, so let's take 1e8. Therefore if abs(x) > 1e8, it is
> > enough to return copysign(1, x). Notice that this argument is also
> > valid for x = +-inf (if the target supports that) because sin(atan(+-inf))
> > = +-1, and it can be extended to other floating-point formats. The
> > following test code illustrates my point:
> > https://pastebin.com/M4G4neLQ
> >
> > This might still be faster than calculating sin(atan(x)) explicitly.
> >
> > Please let me know if this is unfeasible. :-)
> The problem is our VRP implementation doesn't handle any floating point
> types at this time.   If we had range information for FP types, then
> this kind of analysis is precisely what we'd need to do the
> transformation regardless of -ffast-math.

I think his idea was to emit a runtime test?  You'd have to use a
COND_EXPR and evaluate both arms at the same time because
match.pd doesn't allow you to create control flow.

Note the rounding issue is also real given for large x you strip
away lower mantissa bits when computing x*x.

Richard.


Re: patch to bug #86829

2018-08-21 Thread Joseph Myers
On Tue, 21 Aug 2018, Jeff Law wrote:

> The problem is our VRP implementation doesn't handle any floating point
> types at this time.   If we had range information for FP types, then
> this kind of analysis is precisely what we'd need to do the
> transformation regardless of -ffast-math.

I don't think you can do it regardless of -ffast-math, simply because it
may change the semantics: we've generally assumed that if an
optimization might produce results different from what you get with
correctly rounded library functions, it should go under
-funsafe-math-optimizations.  One might try to figure out a way to split
that option, to distinguish optimizations that might change the
correctly rounded result but keep errors small from optimizations that
might produce results that are way off, or spurious exceptions, for some
inputs.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: patch to bug #86829

2018-08-21 Thread Jeff Law
On 08/21/2018 02:08 PM, Giuliano Augusto Faulin Belinassi wrote:
>> Just as an example, compare the results for
>> x = 0x1.fp1023
> 
> Thank you for your answer and the counterexample. :-)
> 
>> If we had useful range info on floats we might conditionalize such
>> transforms appropriately.  Or we can enable it on floats and do
>> the sqrt (x*x + 1) in double.
> 
> I think I managed to find a bound where the transformation can be done
> without overflow harm, though I don't know about rounding problems.
>
> Suppose we are handling double-precision floats for now. The function
> x/sqrt(1 + x*x) approaches 1 when x is big enough. How big must x be
> for the function to round to 1?
>
> Since sqrt(1 + x*x) > x when x > 1, we must find a value of x such
> that x/sqrt(1 + x*x) > eps, where eps is the biggest double smaller
> than 1. Such eps must be around 1 - 2^-53 in IEEE double because the
> mantissa has 52 bits. Solving for x yields that x must be somewhat
> bigger than 6.7e7, so let's take 1e8. Therefore if abs(x) > 1e8, it is
> enough to return copysign(1, x). Notice that this argument is also
> valid for x = +-inf (if the target supports that) because sin(atan(+-inf))
> = +-1, and it can be extended to other floating-point formats. The
> following test code illustrates my point:
> https://pastebin.com/M4G4neLQ
> 
> This might still be faster than calculating sin(atan(x)) explicitly.
> 
> Please let me know if this is unfeasible. :-)
The problem is our VRP implementation doesn't handle any floating point
types at this time.   If we had range information for FP types, then
this kind of analysis is precisely what we'd need to do the
transformation regardless of -ffast-math.
jeff


Re: patch to bug #86829

2018-08-21 Thread Giuliano Augusto Faulin Belinassi
> Just as an example, compare the results for
> x = 0x1.fp1023

Thank you for your answer and the counterexample. :-)

> If we had useful range info on floats we might conditionalize such
> transforms appropriately.  Or we can enable it on floats and do
> the sqrt (x*x + 1) in double.

I think I managed to find a bound where the transformation can be done
without overflow harm, though I don't know about rounding problems.

Suppose we are handling double-precision floats for now. The function
x/sqrt(1 + x*x) approaches 1 when x is big enough. How big must x be
for the function to round to 1?

Since sqrt(1 + x*x) > x when x > 1, we must find a value of x such
that x/sqrt(1 + x*x) > eps, where eps is the biggest double smaller
than 1. Such eps must be around 1 - 2^-53 in IEEE double because the
mantissa has 52 bits. Solving for x yields that x must be somewhat
bigger than 6.7e7, so let's take 1e8. Therefore if abs(x) > 1e8, it is
enough to return copysign(1, x). Notice that this argument is also
valid for x = +-inf (if the target supports that) because sin(atan(+-inf))
= +-1, and it can be extended to other floating-point formats. The
following test code illustrates my point:
https://pastebin.com/M4G4neLQ

This might still be faster than calculating sin(atan(x)) explicitly.

Please let me know if this is unfeasible. :-)

Giuliano.



Re: patch to bug #86829

2018-08-21 Thread Jeff Law
On 08/21/2018 02:02 AM, Richard Biener wrote:
> On Mon, Aug 20, 2018 at 9:40 PM Jeff Law  wrote:
>>
>> On 08/04/2018 07:22 AM, Giuliano Augusto Faulin Belinassi wrote:
>>> Closes bug #86829
>>>
>>> Description: Adds substitution rules for both sin(atan(x)) and
>>> cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and 1 /
>>> sqrt(x*x + 1) respectively, providing up to 10x speedup. This identity
>>> can be proved mathematically.
>>>
>>> Changelog:
>>>
>>> 2018-08-03  Giuliano Belinassi 
>>>
>>> * match.pd: add simplification rules to sin(atan(x)) and cos(atan(x)).
>>>
>>> Bootstrap and Testing:
>>> There were no unexpected failures in proper testing with GCC 8.1.0
>>> on an x86_64 machine running Ubuntu 18.04.
>> I understand these are mathematical identities.  But floating point
>> arithmetic in a compiler isn't nearly that clean :-)  We have to worry
>> about overflows, underflows, rounding, and the simple fact that many
>> floating point numbers can't be exactly represented.
>>
>> Just as an example, compare the results for
>> x = 0x1.fp1023
>>
>> I think sin(atan (x)) is well defined in that case.  But the x*x isn't
>> because it overflows.
>>
>> So I think this has to be somewhere under the -ffast-math umbrella.
>> And the testing requirements for that are painful -- you have to verify
>> it doesn't break the spec benchmark.
>>
>> I know Richi acked in the PR, but that might have been premature.
> 
> It's under the flag_unsafe_math_optimizations umbrella, but sure,
> a "proper" way to optimize this would be to further expand
> sqrt (x*x + 1) to fabs(x) + ... (extra terms) that are precise enough
> and not have this overflow issue.
> 
> But yes, I do not find (quickly skimming) other simplifications that
> have this kind of overflow issue (in fact I do remember raising
> overflow/underflow issues for other patches).
> 
> Thus approval withdrawn.
At least until we can do some testing around spec.  There's also a patch
for logarithm addition/subtraction from MCC CS and another from Giuliano
for hyperbolics that need testing with spec.  I think that getting that
testing done anytime between now and stage1 close is sufficient -- none
of the 3 patches is particularly complex.


> 
> If we had useful range info on floats we might conditionalize such
> transforms appropriately.  Or we can enable it on floats and do
> the sqrt (x*x + 1) in double.
Yea.  I keep thinking about what it might take to start doing some light
VRP of floating point objects.  I'd originally been thinking to just
track 0.0 and exceptional value state.  But the more I ponder the more I
think we could use the range information to allow transformations that
are currently guarded by the -ffast-math family of options.

jeff



Re: patch to bug #86829

2018-08-21 Thread Richard Biener
On Mon, Aug 20, 2018 at 9:40 PM Jeff Law  wrote:
>
> On 08/04/2018 07:22 AM, Giuliano Augusto Faulin Belinassi wrote:
> > Closes bug #86829
> >
> > Description: Adds substitution rules for both sin(atan(x)) and
> > cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and 1 /
> > sqrt(x*x + 1) respectively, providing up to 10x speedup. This identity
> > can be proved mathematically.
> >
> > Changelog:
> >
> > 2018-08-03  Giuliano Belinassi 
> >
> > * match.pd: add simplification rules to sin(atan(x)) and cos(atan(x)).
> >
> > Bootstrap and Testing:
> > There were no unexpected failures in proper testing with GCC 8.1.0
> > on an x86_64 machine running Ubuntu 18.04.
> I understand these are mathematical identities.  But floating point
> arithmetic in a compiler isn't nearly that clean :-)  We have to worry
> about overflows, underflows, rounding, and the simple fact that many
> floating point numbers can't be exactly represented.
>
> Just as an example, compare the results for
> x = 0x1.fp1023
>
> I think sin(atan (x)) is well defined in that case.  But the x*x isn't
> because it overflows.
>
> So I think this has to be somewhere under the -ffast-math umbrella.
> And the testing requirements for that are painful -- you have to verify
> it doesn't break the spec benchmark.
>
> I know Richi acked in the PR, but that might have been premature.

It's under the flag_unsafe_math_optimizations umbrella, but sure,
a "proper" way to optimize this would be to further expand
sqrt (x*x + 1) to fabs(x) + ... (extra terms) that are precise enough
and not have this overflow issue.
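For reference, the expansion being suggested is presumably the binomial series of sqrt(x*x + 1) for large |x| (this derivation is editorial, not spelled out in the thread):

```latex
\sqrt{x^2 + 1}
  = |x|\sqrt{1 + x^{-2}}
  = |x|\left(1 + \tfrac{1}{2}x^{-2} - \tfrac{1}{8}x^{-4} + \cdots\right)
  = |x| + \frac{1}{2|x|} - \frac{1}{8|x|^{3}} + \cdots
```

Truncating after the first two terms keeps the relative error around 1/(8x^4), and neither |x| nor 1/(2|x|) can overflow where x*x would.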

But yes, I do not find (quickly skimming) other simplifications that
have this kind of overflow issue (in fact I do remember raising
overflow/underflow issues for other patches).

Thus approval withdrawn.

If we had useful range info on floats we might conditionalize such
transforms appropriately.  Or we can enable it on floats and do
the sqrt (x*x + 1) in double.

Richard.

> jeff
>
>
>


Re: patch to bug #86829

2018-08-20 Thread Jeff Law
On 08/04/2018 07:22 AM, Giuliano Augusto Faulin Belinassi wrote:
> Closes bug #86829
> 
> Description: Adds substitution rules for both sin(atan(x)) and
> cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and 1 /
> sqrt(x*x + 1) respectively, providing up to 10x speedup. This identity
> can be proved mathematically.
> 
> Changelog:
> 
> 2018-08-03  Giuliano Belinassi 
> 
> * match.pd: add simplification rules to sin(atan(x)) and cos(atan(x)).
> 
> Bootstrap and Testing:
> There were no unexpected failures in proper testing with GCC 8.1.0
> on an x86_64 machine running Ubuntu 18.04.
I understand these are mathematical identities.  But floating point
arithmetic in a compiler isn't nearly that clean :-)  We have to worry
about overflows, underflows, rounding, and the simple fact that many
floating point numbers can't be exactly represented.

Just as an example, compare the results for
x = 0x1.fp1023

I think sin(atan (x)) is well defined in that case.  But the x*x isn't
because it overflows.

So I think this has to be somewhere under the -ffast-math umbrella.
And the testing requirements for that are painful -- you have to verify
it doesn't break the spec benchmark.

I know Richi acked in the PR, but that might have been premature.

jeff





Re: patch to bug #86829

2018-08-18 Thread Giuliano Augusto Faulin Belinassi
ping

On Sat, Aug 4, 2018 at 10:22 AM, Giuliano Augusto Faulin Belinassi
 wrote:
> Closes bug #86829
>
> Description: Adds substitution rules for both sin(atan(x)) and
> cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and 1 /
> sqrt(x*x + 1) respectively, providing up to 10x speedup. This identity
> can be proved mathematically.
>
> Changelog:
>
> 2018-08-03  Giuliano Belinassi 
>
> * match.pd: add simplification rules to sin(atan(x)) and cos(atan(x)).
>
> Bootstrap and Testing:
> There were no unexpected failures in proper testing with GCC 8.1.0
> on an x86_64 machine running Ubuntu 18.04.


patch to bug #86829

2018-08-04 Thread Giuliano Augusto Faulin Belinassi
Closes bug #86829

Description: Adds substitution rules for both sin(atan(x)) and
cos(atan(x)). These formulas are replaced by x / sqrt(x*x + 1) and 1 /
sqrt(x*x + 1) respectively, providing up to 10x speedup. This identity
can be proved mathematically.

Changelog:

2018-08-03  Giuliano Belinassi 

* match.pd: add simplification rules to sin(atan(x)) and cos(atan(x)).

Bootstrap and Testing:
There were no unexpected failures in proper testing with GCC 8.1.0
on an x86_64 machine running Ubuntu 18.04.

Test run by giulianob on Fri Aug  3 17:01:33 2018
Native configuration is x86_64-pc-linux-gnu

=== gcc tests ===

Schedule of variations:
unix

Running target unix
Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target.
Using /usr/share/dejagnu/config/unix.exp as generic interface file for target.
Using /home/giulianob/Downloads/gcc/src/gcc/testsuite/config/default.exp as tool-and-target-specific interface file.
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/compile/compile.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/execute/execute.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/execute/ieee/ieee.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.c-torture/unsorted/unsorted.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/asan/asan.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/atomic/atomic.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/autopar/autopar.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/charset/charset.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/compat/compat.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/cpp/cpp.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/cpp/trad/trad.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/debug/debug.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/debug/dwarf2/dwarf2.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/dfp/dfp.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/dg.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/fixed-point/fixed-point.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/format/format.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/goacc-gomp/goacc-gomp.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/goacc/goacc.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/gomp/gomp.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/graphite/graphite.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/guality/guality.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/ipa/ipa.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/lto/lto.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/noncompile/noncompile.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/params/params.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/pch/pch.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/plugin/plugin.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/rtl/rtl.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/sancov/sancov.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/simulate-thread/simulate-thread.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/special/mips-abi.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/special/special.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/sso/sso.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tls/tls.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tm/tm.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/torture/dg-torture.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/torture/stackalign/stackalign.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/torture/tls/tls.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tree-ssa/tree-ssa.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/tsan/tsan.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/ubsan/ubsan.exp ...
Running /home/giulianob/Downloads/gcc/src/gcc/testsuite/gcc.dg/vect/costmodel