[jira] [Commented] (MATH-1253) Skewness could get more precision from slightly reordered code.

2015-11-24 Thread Otmar Ertl (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025190#comment-15025190
 ] 

Otmar Ertl commented on MATH-1253:
--

Provided that there is no overflow, I do not think that my proposal is more 
accurate. The relative error of a sum of floating point numbers is the same 
after scaling the numbers by the same factor.

> Skewness could get more precision from slightly reordered code.
> ---
>
> Key: MATH-1253
> URL: https://issues.apache.org/jira/browse/MATH-1253
> Project: Commons Math
>  Issue Type: Bug
>Affects Versions: 3.5
>Reporter: Bill Murphy
>Priority: Minor
>
> In Skewness.java, approx line 180, there is code like:
> {noformat}
> double accum3 = 0.0;
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d;
> }
> accum3 /= variance * FastMath.sqrt(variance);
> {noformat}
> If the division was moved into the for loop, accum3 would be less likely to 
> overflow to Infinity (or -Infinity). This might allow computation to return a 
> result in a case such as:
> {noformat}
> double[] numArray = { 1.234E11, 1.234E51, 1.234E101, 1.234E151 };
> Skewnessskew = new Skewness();
> doublesk = skew.evaluate( numArray );
> {noformat}
> Currently, this returns NaN, but I'd prefer it returned approx 
> 1.154700538379252.
> The change I'm proposing would have the code instead read like:
> {noformat}
> double accum3 = 0.0;
> double divisor = variance * FastMath.sqrt(variance);
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d / divisor;
> }
> {noformat}
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MATH-1253) Skewness could get more precision from slightly reordered code.

2015-11-24 Thread Gilles (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024241#comment-15024241
 ] 

Gilles commented on MATH-1253:
--

Independently of extending the valid range, wouldn't Otmar's proposal also give 
a more accurate result (than the current code), especially when the number of 
samples is large?


> Skewness could get more precision from slightly reordered code.
> ---
>
> Key: MATH-1253
> URL: https://issues.apache.org/jira/browse/MATH-1253
> Project: Commons Math
>  Issue Type: Bug
>Affects Versions: 3.5
>Reporter: Bill Murphy
>Priority: Minor
>
> In Skewness.java, approx line 180, there is code like:
> {noformat}
> double accum3 = 0.0;
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d;
> }
> accum3 /= variance * FastMath.sqrt(variance);
> {noformat}
> If the division was moved into the for loop, accum3 would be less likely to 
> overflow to Infinity (or -Infinity). This might allow computation to return a 
> result in a case such as:
> {noformat}
> double[] numArray = { 1.234E11, 1.234E51, 1.234E101, 1.234E151 };
> Skewnessskew = new Skewness();
> doublesk = skew.evaluate( numArray );
> {noformat}
> Currently, this returns NaN, but I'd prefer it returned approx 
> 1.154700538379252.
> The change I'm proposing would have the code instead read like:
> {noformat}
> double accum3 = 0.0;
> double divisor = variance * FastMath.sqrt(variance);
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d / divisor;
> }
> {noformat}
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MATH-1253) Skewness could get more precision from slightly reordered code.

2015-11-23 Thread Thomas Neidhart (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15023289#comment-15023289
 ] 

Thomas Neidhart commented on MATH-1253:
---

just ftr: I did some tests with octave and scipy, and both return NaN or 
overflow for the input.

> Skewness could get more precision from slightly reordered code.
> ---
>
> Key: MATH-1253
> URL: https://issues.apache.org/jira/browse/MATH-1253
> Project: Commons Math
>  Issue Type: Bug
>Affects Versions: 3.5
>Reporter: Bill Murphy
>Priority: Minor
>
> In Skewness.java, approx line 180, there is code like:
> {noformat}
> double accum3 = 0.0;
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d;
> }
> accum3 /= variance * FastMath.sqrt(variance);
> {noformat}
> If the division was moved into the for loop, accum3 would be less likely to 
> overflow to Infinity (or -Infinity). This might allow computation to return a 
> result in a case such as:
> {noformat}
> double[] numArray = { 1.234E11, 1.234E51, 1.234E101, 1.234E151 };
> Skewnessskew = new Skewness();
> doublesk = skew.evaluate( numArray );
> {noformat}
> Currently, this returns NaN, but I'd prefer it returned approx 
> 1.154700538379252.
> The change I'm proposing would have the code instead read like:
> {noformat}
> double accum3 = 0.0;
> double divisor = variance * FastMath.sqrt(variance);
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d / divisor;
> }
> {noformat}
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MATH-1253) Skewness could get more precision from slightly reordered code.

2015-08-18 Thread Bill Murphy (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702114#comment-14702114
 ] 

Bill Murphy commented on MATH-1253:
---

I agree with Otmar's comments entirely.

I do not yet know a better suggestion. Even a slight extension of the range of 
correct results seems to me to be an improvement. I would still prefer going 
forwards with these mods as suggested, but a true fix as outlined by Otmar 
would be hugely better. Does someone else know enough numerical analysis stuff 
to provide a mostly-correct approximation over the range of doubles? It is 
sadly beyond me as of today.


> Skewness could get more precision from slightly reordered code.
> ---
>
> Key: MATH-1253
> URL: https://issues.apache.org/jira/browse/MATH-1253
> Project: Commons Math
>  Issue Type: Bug
>Affects Versions: 3.5
>Reporter: Bill Murphy
>Priority: Minor
>
> In Skewness.java, approx line 180, there is code like:
> {noformat}
> double accum3 = 0.0;
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d;
> }
> accum3 /= variance * FastMath.sqrt(variance);
> {noformat}
> If the division was moved into the for loop, accum3 would be less likely to 
> overflow to Infinity (or -Infinity). This might allow computation to return a 
> result in a case such as:
> {noformat}
> double[] numArray = { 1.234E11, 1.234E51, 1.234E101, 1.234E151 };
> Skewnessskew = new Skewness();
> doublesk = skew.evaluate( numArray );
> {noformat}
> Currently, this returns NaN, but I'd prefer it returned approx 
> 1.154700538379252.
> The change I'm proposing would have the code instead read like:
> {noformat}
> double accum3 = 0.0;
> double divisor = variance * FastMath.sqrt(variance);
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d / divisor;
> }
> {noformat}
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MATH-1253) Skewness could get more precision from slightly reordered code.

2015-08-08 Thread Otmar Ertl (JIRA)

[ 
https://issues.apache.org/jira/browse/MATH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662994#comment-14662994
 ] 

Otmar Ertl commented on MATH-1253:
--

The given example is problematic anyway, because the numbers (1.234E11, 
1.234E51, 1.234E101, 1.234E151) are many orders of magnitude apart, for which 
cancellation effects are a major issue when using floating point types. I can 
reproduce the NaN result for values that have either large or small exponents, 
for example (1.234E148, 1.234E149, 1.234E150, 1.234E151) and (1.234E-148, 
1.234E-149, 1.234E-150, 1.234E-151). The proposed code does not avoid this 
over-/underflows either, because the division is performed after the two 
multiplications which cause the over-/underflow. However, following code would 
work:
{code}
final double variance = (accum - (accum2 * accum2 / length)) / 
(length - 1);
final double stdDevInverse = 1d / FastMath.sqrt(variance);

double accum3 = 0.0;
for (int i = begin; i < begin + length; i++) {
final double d = (values[i] - m) * stdDevInverse;
accum3 += d * d * d;
}
{code}
(Here the inverse is precomputed, in order to avoid additional divisions which 
are likely to be more expensive than multiplications.) Nevertheless, I am not 
sure, If we really should fix that. The variance calculation is also prone to 
over-/underflows for numbers greater than sqrt(Double.MAX_VALUE) and smaller 
than sqrt(Double.MIN_VALUE), respectively. So the fix would "slightly" extend 
the valid input range from (cbrt(Double.MIN_VALUE), cbrt(Double.MAX_VALUE)) to 
(sqrt(Double.MIN_VALUE), sqrt(Double.MAX_VALUE)) at the expense of an 
additional multiplication within the loop. Of course, if there was also an 
accurate variance calculation method, that avoids over-/underflows for values 
outside of (sqrt(Double.MIN_VALUE), sqrt(Double.MAX_VALUE)), this fix would 
make much more sense to me.

> Skewness could get more precision from slightly reordered code.
> ---
>
> Key: MATH-1253
> URL: https://issues.apache.org/jira/browse/MATH-1253
> Project: Commons Math
>  Issue Type: Bug
>Affects Versions: 3.5
>Reporter: Bill Murphy
>Priority: Minor
>
> In Skewness.java, approx line 180, there is code like:
> {noformat}
> double accum3 = 0.0;
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d;
> }
> accum3 /= variance * FastMath.sqrt(variance);
> {noformat}
> If the division was moved into the for loop, accum3 would be less likely to 
> overflow to Infinity (or -Infinity). This might allow computation to return a 
> result in a case such as:
> {noformat}
> double[] numArray = { 1.234E11, 1.234E51, 1.234E101, 1.234E151 };
> Skewnessskew = new Skewness();
> doublesk = skew.evaluate( numArray );
> {noformat}
> Currently, this returns NaN, but I'd prefer it returned approx 
> 1.154700538379252.
> The change I'm proposing would have the code instead read like:
> {noformat}
> double accum3 = 0.0;
> double divisor = variance * FastMath.sqrt(variance);
> for (int i = begin; i < begin + length; i++) {
> final double d = values[i] - m;
> accum3 += d * d * d / divisor;
> }
> {noformat}
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)