Al Chou wrote:


After implementing var.2 from the Stanford paper in UnivariateImpl and
scratching my head for some time over why the variance calculation failed its
JUnit test case, I realized there's a flaw in var.2 that, as far as I can
tell, no one talks about.  To update the variance (called S in the paper),
the formula calculates

z = y / i
S = S + (i-1) * y * z

where i is the number of data values (including the value just being added to
the collection).  It doesn't really matter how y is defined, because you will
notice that

S = S + (i-1) * y * y / i
  = S + (i-1) * y**2 / i

which means that S can never decrease in magnitude (for real data, which is
what we're talking about).  But for the simple case of three data values {1, 2,
2} in the JUnit test case, the variance decreases between the addition of the
second and third data values.

Can anyone point out what I'm missing here?

Al, I see what you're saying. I wrote a little example case to implement the pseudocode they have in the paper:


public class SmallTest {

    public static void main(String[] args) {
        double[] vals = new double[] { 1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0 };

        // Initial state after the first value: mean m, accumulator s.
        double m = vals[0];
        double s = 0.0;

        System.out.println("m=" + m);
        System.out.println("s=" + s);
        System.out.println("");

        // i is the 1-based count of values seen so far, as in the paper.
        for (int i = 2; i <= vals.length; i++) {

            double y = vals[i - 1] - m;  // deviation from the previous mean
            double z = y / i;
            m += z;                      // updated mean
            s += (i - 1) * y * z;        // updated S

            System.out.println("y=" + y);
            System.out.println("z=" + z);
            System.out.println("m=" + m);
            System.out.println("s=" + s);
            System.out.println("");
        }
    }
}

s does seem to increase even though the variance should be going down.
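
One thing that might be worth checking, and this is only an assumption about what the paper intends: S may be the running sum of squared deviations from the mean rather than the variance itself, in which case the variance would be S / (i - 1) (or S / i for the population form). The sketch below runs the same update on {1, 2, 2} and compares S against a plain two-pass sum of squared deviations. The class and method names (RunningVarianceCheck, update, twoPassSumSq) are made up for illustration and are not part of UnivariateImpl.

```java
public class RunningVarianceCheck {

    /** Applies the updating formula to the first n values; returns {mean, S}. */
    static double[] update(double[] vals, int n) {
        double m = vals[0];
        double s = 0.0;
        for (int i = 2; i <= n; i++) {
            double y = vals[i - 1] - m;  // deviation from the previous mean
            double z = y / i;
            m += z;
            s += (i - 1) * y * z;
        }
        return new double[] { m, s };
    }

    /** Two-pass sum of squared deviations over the first n values. */
    static double twoPassSumSq(double[] vals, int n) {
        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            sum += vals[k];
        }
        double mean = sum / n;
        double ss = 0.0;
        for (int k = 0; k < n; k++) {
            double d = vals[k] - mean;
            ss += d * d;
        }
        return ss;
    }

    public static void main(String[] args) {
        double[] vals = { 1.0, 2.0, 2.0 };
        for (int n = 2; n <= vals.length; n++) {
            double s = update(vals, n)[1];
            // Assumption: sample variance is S / (n - 1), not S itself.
            System.out.println("n=" + n
                    + " S=" + s
                    + " twoPass=" + twoPassSumSq(vals, n)
                    + " S/(n-1)=" + (s / (n - 1)));
        }
    }
}
```

For {1, 2, 2}, S goes from 0.5 to about 0.667 while S / (n - 1) goes from 0.5 to about 0.333, so under this reading the variance does decrease between the second and third values even though S grows.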

I want us to review this paper further and go back to the research of

Hanson, R. J. 1975. Stably updating mean and standard
deviation of data. Communications of the
ACM 18:57-58.

Let's verify whether there's a typo in the equation or something. Maybe these guys even misinterpreted his work.

-M.

