After implementing var.2 from the Stanford paper in UnivariateImpl and scratching my head for some time over why the variance calculation failed its JUnit test case, I realized there's a flaw in var.2, and I can't understand why no one talks about it. To update the variance (called S in the paper), the formula calculates
z = y / i
S = S + (i - 1) * y * z
where i is the number of data values (including the value just being added to the collection). It doesn't really matter how y is defined, because you will notice that
S = S + (i - 1) * y * y / i = S + (i - 1) * y**2 / i
which means that S can never decrease in magnitude (for real data, which is what we're talking about). But for the simple case of three data values {1, 2, 2} in the JUnit test case, the variance decreases between the addition of the second and third data values.
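The argument above can be checked numerically with a minimal sketch. It assumes y is the new value minus the running mean (the text leaves y unspecified), and the names IncrementTrace, increment and trace are made up for illustration. It only shows how S moves for {1, 2, 2}; it says nothing about how the paper intends S to be used.

```java
public class IncrementTrace {
    // Amount added to S when the i-th value arrives: (i - 1) * y * (y / i).
    // For i >= 1 and real y this is (i - 1) * y^2 / i, so it is never negative.
    static double increment(int i, double y) {
        double z = y / i;
        return (i - 1) * y * z;
    }

    // Runs the recurrence and returns S after each value has been added.
    // Assumption: y = (new value) - (running mean before the update).
    static double[] trace(double[] vals) {
        double[] s = new double[vals.length];
        double m = vals[0];
        double sum = 0.0;
        s[0] = sum;
        for (int i = 2; i <= vals.length; i++) {
            double y = vals[i - 1] - m;
            sum += increment(i, y);
            m += y / i;
            s[i - 1] = sum;
        }
        return s;
    }

    public static void main(String[] args) {
        double[] s = trace(new double[] { 1.0, 2.0, 2.0 });
        for (int n = 1; n <= s.length; n++) {
            System.out.println("n=" + n + " S=" + s[n - 1]);
        }
    }
}
```

With these inputs S goes from 0 to 0.5 to roughly 0.667, so it does grow at the third value.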
Can anyone point out what I'm missing here?
Al, I see what you're saying. I wrote a little example case to implement the pseudocode they have in the paper:
public class SmallTest {

    public static void main(String[] args) {
        double[] vals = new double[] { 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0 };

        // Start from the first value: mean m, accumulator s.
        double m = vals[0];
        double s = 0.0;

        System.out.println("m=" + m);
        System.out.println("s=" + s);
        System.out.println("");

        // i is the number of values seen so far, including the new one.
        for (int i = 2; i <= vals.length; i++) {
            double y = vals[i - 1] - m;
            double z = y / i;
            m += z;
            s += (i - 1) * y * z;

            System.out.println("y=" + y);
            System.out.println("z=" + z);
            System.out.println("m=" + m);
            System.out.println("s=" + s);
            System.out.println("");
        }
    }
}
s does seem to increase even though the variance should be going down.
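One thing that may be worth ruling out before blaming the formula: the sketch below assumes that S is meant to be the running sum of squared deviations from the mean, not the variance itself, so that the sample variance would be S / (i - 1). That reading is an assumption and is not confirmed by anything quoted here from the paper. The class and method names are made up for illustration. It compares the updating recurrence against a plain two-pass computation on the same data.

```java
import java.util.Arrays;

public class VarianceCheck {
    // Updating recurrence from the example above; returns s after all values.
    static double updatingSumOfSquares(double[] vals) {
        double m = vals[0];
        double s = 0.0;
        for (int i = 2; i <= vals.length; i++) {
            double y = vals[i - 1] - m;
            double z = y / i;
            m += z;
            s += (i - 1) * y * z;
        }
        return s;
    }

    // Two-pass reference: sum of squared deviations from the final mean.
    static double twoPassSumOfSquares(double[] vals) {
        double sum = 0.0;
        for (double v : vals) {
            sum += v;
        }
        double mean = sum / vals.length;
        double ss = 0.0;
        for (double v : vals) {
            ss += (v - mean) * (v - mean);
        }
        return ss;
    }

    public static void main(String[] args) {
        double[] all = { 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0 };
        for (int n = 2; n <= all.length; n++) {
            double[] prefix = Arrays.copyOf(all, n);
            double s = updatingSumOfSquares(prefix);
            double ref = twoPassSumOfSquares(prefix);
            // Assumption: sample variance is s divided by (n - 1).
            System.out.println("n=" + n + " s=" + s + " twoPass=" + ref
                    + " s/(n-1)=" + (s / (n - 1)));
        }
    }
}
```

If the two sums agree for every prefix, then s growing while s / (n - 1) shrinks would be consistent, and the thing to look at would be whether UnivariateImpl divides by i - 1 before the JUnit comparison.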
I want us to review this paper further and go back to the research of

Hanson, R. J. 1975. Stably updating mean and standard deviation of data. Communications of the ACM 18:57-58.
Let's verify whether there's a typo in the equation or something. Maybe these guys even misinterpreted his work.
-M.