Alex Herbert created MATH-1626:
----------------------------------

             Summary: Variance weighted evaluation may have unexpected output (negative or NaN variance)
                 Key: MATH-1626
                 URL: https://issues.apache.org/jira/browse/MATH-1626
             Project: Commons Math
          Issue Type: Bug
    Affects Versions: 4.0
            Reporter: Alex Herbert

The Variance class implements WeightedEvaluation:

{code:java}
double evaluate(double[] values, double[] weights);
double evaluate(double[] values, double[] weights, int begin, int length);
{code}

This applies a weight to each input value. However, the default behaviour is to compute the bias-corrected variance by dividing by the sum of the weights minus 1. This will result in:

* a negative variance if the weights sum to less than 1
* a NaN variance if the weights sum to 1

The weights are verified by the MathArrays.verifyValues method to have at least 1 non-zero weight in the evaluated range and no negative weights. But no validation is performed to ensure that the weights sum to more than 1.

A suggested fix is to document this behaviour. The bias-corrected weighted evaluation for the variance should only be applied when the weights correspond to observed counts of values in a population. Ideally the sum of the weights is the total number of observations and should be at least 2.

This issue can also be avoided by using the non-bias-corrected variance.

This issue applies to any weighted evaluation based on the Variance:

* StandardDeviation
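
The arithmetic behind the two failure cases can be reproduced with a minimal standalone sketch. It does not call Commons Math: the class WeightedVarianceSketch and its methods are hypothetical names, and the formula is a simplified form that assumes the numerator is the weighted sum of squared deviations from the weighted mean. Because it is not the library's code, the exact non-finite value returned when the weights sum to 1 may differ from what Variance returns (a division by zero gives infinity for a non-zero numerator and NaN for a zero numerator).

```java
// Hypothetical sketch, not the Commons Math implementation.
final class WeightedVarianceSketch {

    /** Weighted mean: sum(w_i * x_i) / sum(w_i). */
    public static double weightedMean(double[] values, double[] weights) {
        double sum = 0;
        double sumWts = 0;
        for (int i = 0; i < values.length; i++) {
            sum += weights[i] * values[i];
            sumWts += weights[i];
        }
        return sum / sumWts;
    }

    /**
     * Weighted variance (assumed simplified formula).
     * Bias corrected: divide by (sum of weights - 1); otherwise by the sum of weights.
     */
    public static double weightedVariance(double[] values, double[] weights,
                                          boolean biasCorrected) {
        final double mean = weightedMean(values, weights);
        double accum = 0;
        double sumWts = 0;
        for (int i = 0; i < values.length; i++) {
            final double dev = values[i] - mean;
            accum += weights[i] * dev * dev;
            sumWts += weights[i];
        }
        // The divisor is negative when sumWts < 1 and zero when sumWts == 1.
        return accum / (biasCorrected ? sumWts - 1.0 : sumWts);
    }

    public static void main(String[] args) {
        final double[] values = {1, 2, 3};
        final double[] sumBelowOne = {0.25, 0.25, 0.25}; // weights sum to 0.75
        final double[] sumToOne = {0.25, 0.5, 0.25};     // weights sum to 1
        final double[] counts = {1, 2, 1};               // weights are observation counts

        System.out.println(weightedVariance(values, sumBelowOne, true));  // negative
        System.out.println(weightedVariance(values, sumToOne, true));     // not finite
        System.out.println(weightedVariance(values, counts, true));       // positive
        System.out.println(weightedVariance(values, sumBelowOne, false)); // positive
    }
}
```

With weights that are observation counts, or with the bias correction disabled, the divisor stays positive and the result is a valid variance.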