I started to consider changing the evaluate() methods in UnivariateStatistic to work with DoubleIterators from the collections.primitives API. But I found some pretty basic shortcomings to working with DoubleCollections and DoubleIterators on double[]'s. Much of this came from the similar warning signs about performance that Phil raised with my initial implementations of UnivariateStatistics. So I went and used his strategy to attempt to come up with efficiency measures for various iteration strategies.


I've done some analysis and I'm pretty surprised to find out how inefficient the "Iterators" hasNext() methods are in for/while loops. I wrote a simple iterator for double[]'s and find that it always performs poorly (~ 2-3 times slower) in relation to the "for(int i = 0; i < length;i++)" loop approach. Phil pointed out that the slowdown in my initial implementation was due to having "method calls" for increment(double d) embedded in an iterative approach. While a significant percentage of time spent processing all the calculations in the "instantaneous increment" method call, there is also a good portion of time lost to poorly designed passes over the double[], DoubleArrayIterator performance, and the use of hasNext(). The attached example code shows these effects fairly clearly.


I'd recommend the following from these results for both [math] and [collections]:

*1* [math] we can get about a 95% efficiency in relation to regular for-loops (or evaluate2(...) in the code) by doing tricks like this.

public double evaluate(double[] values, int begin, int length) {
        for (int i = begin + length; --i >= 0;) {
         ...

*2* [math][collections]if we want to use Iterators they work the fastest when "hasNext()" can be avoided like the following, this provides about a 110% efficiency in relation to regular for-loops (or evaluate2(...) in the code) this is as opposed to a ~200% efficiency for the use of iter.hasNext() in the conditional test of the loop (or evaluate4(...) in the code).

public double evaluate3(double[] values, int begin, int length) {
        MyDoubleIterator iter
            = new MyDoubleIterator(values, begin, length);
        for (int i = length; --i >= 0;) {
              ... iter.next();
              ...

*3* [collections] in primitive collections the toArray() and toArray(primitive[]) methods can benefit from the above "Iterator" for loop in strategy for the Abstract Cases, I recommend however, considering the use of System.arrayCopy in each <Primitive>Collection where ever possible because this will always be faster that copying it by hand. In cases where [math] needs a copy of an array, we would really like it in the most efficient means possible. Our current DoubleArray implementations approach copies using System.arrayCopy(...).

Hope this is helpful to both groups,
Mark
/*
 * Created on Jul 11, 2003
 *
 * To change the template for this generated file go to
 * Window>Preferences>Java>Code Generation>Code and Comments
 */

/**
 * @author Mark Diggory
 *
 * Examples of various methods of calculating the mean.
 * 
 */
public class ExampleUsage {

    public static void main(String[] args) {
        new ExampleUsage();
    }

    public ExampleUsage() {

        double[] x = new double[1000];
        for (int i = 0; i < 1000; i++) {
            x[i] = (5 - i) * (i - 200);
        }
        long startTick = 0;
        double res = 0;

        for (int j = 0; j < 10; j++) {

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                res = evaluate(x, 0, x.length);
            }
            System.out.println(
                "evaluate: "
                    + (System.currentTimeMillis() - startTick)
                    + " result: "
                    + res);

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                res = evaluate1(x, 0, x.length);
            }
            System.out.println(
                "evaluate1: "
                    + (System.currentTimeMillis() - startTick)
                    + " result: "
                    + res);

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                res = evaluate2(x, 0, x.length);
            }
            System.out.println(
                "evaluate2: "
                    + (System.currentTimeMillis() - startTick)
                    + " result: "
                    + res);

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                res = evaluate3(x, 0, x.length);
            }
            System.out.println(
                "evaluate3: "
                    + (System.currentTimeMillis() - startTick)
                    + " result: "
                    + res);

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                res = evaluate4(x, 0, x.length);
            }
            System.out.println(
                "evaluate4: "
                    + (System.currentTimeMillis() - startTick)
                    + " result: "
                    + res);

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                res = evaluate5(x, 0, x.length);
            }
            System.out.println(
                "evaluate5: "
                    + (System.currentTimeMillis() - startTick)
                    + " result: "
                    + res);
        }
    }

    /* evaluate uses unoptimized for loop*/
    public double evaluate(double[] values, int begin, int length) {
        double sum = 0.0;
        int l = begin + length;
        for (int i = begin; i < l; i++) {
            sum += values[i];
        }
        return sum / ((double) length);
    }

    /* evaluate1 uses optimized for loop with comparision to "0"*/
    public double evaluate1(double[] values, int begin, int length) {
        double sum = 0.0;
        for (int i = begin + length; --i >= 0;) {
            sum += values[i];
        }
        return sum / ((double) length);
    }

    MyDoubleIterator iter = new MyDoubleIterator();

    /* evaluate2 uses incremental moment approach with while loop*/
    public double evaluate2(double[] values, int begin, int length) {

        double m1 = 0.0;
        int n = 0;
        iter.setContent(values, begin, length);

        while (iter.hasNext()) {
            n++;
            m1 += ((iter.next() - m1) / (double) n);
        }
        return m1;
    }

    /* evaluate3 uses incremental moment approach with for-loop strategy, a little 
better than evaluate2*/
    public double evaluate3(double[] values, int begin, int length) {

        double m1 = 0.0;
        int diff = begin - 1;
        int l = length + 1;
        for (int i = 1; i < l; i++) {
            m1 += ((values[i + diff] - m1) / (double) i);
        }
        return m1;
    }

    /* evaluate4 uses optimized for-loop evaulation approach on a DoubleIterator, 
shows ok timing*/
    public double evaluate4(double[] values, int begin, int length) {

        double sum = 0.0;
        iter.setContent(values, begin, length);
        for (int i = length; --i >= 0;) {
            sum += iter.next();
        }
        return sum / ((double) length);
    }

    /* evaluate5 uses an unoptimized while loop iteration approach, shows poor timing*/
    public double evaluate5(double[] values, int begin, int length) {

        double sum = 0.0;
        iter.setContent(values, begin, length);
        while (iter.hasNext()) {
            sum += iter.next();
        }
        return sum / ((double) length);
    }
    
    public class MyDoubleIterator {

        double[] vals;
        int l = 0;
        int c = 0;

        public void setContent(double[] values, int begin, int length) {
            this.vals = values;
            this.l = length;
            this.c = begin;
        }

        public boolean hasNext() {
            return c < l && c < vals.length;
        }

        public double next() {
            return vals[c++];
        }

    }
}
/*
 * Created on Jul 11, 2003
 *
 * To change the template for this generated file go to
 * Window>Preferences>Java>Code Generation>Code and Comments
 */

/**
 * @author Mark Diggory
 *
 * Examples of various methods of calculating the mean.
 * 
 */
public class ArrayCopyExampleUsage {

    public static void main(String[] args) {
        new ArrayCopyExampleUsage();
    }

    public ArrayCopyExampleUsage() {

        double[] x = new double[1000];
        for (int i = 0; i < 1000; i++) {
            x[i] = (5 - i) * (i - 200);
        }
        long startTick = 0;
        double res = 0;

        for (int j = 0; j < 10; j++) {

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                copy(x, 0, x.length);
            }
            System.out.println(
                "copy: "
                    + (System.currentTimeMillis() - startTick));

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                copy2(x, 0, x.length);
            }
            System.out.println(
                "copy2: "
                    + (System.currentTimeMillis() - startTick));

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                copy3(x, 0, x.length);
            }
            System.out.println(
                "copy3: "
                    + (System.currentTimeMillis() - startTick));

            startTick = System.currentTimeMillis();
            for (int i = 0; i < 100000; i++) {
                copy4(x, 0, x.length);
            }
            System.out.println(
                "copy4: "
                    + (System.currentTimeMillis() - startTick));

        }
    }

    /* evaluate uses unoptimized for loop*/
    public void copy(double[] values, int begin, int length) {
        double[] copy = new double[length];
        System.arraycopy(values,begin,copy,0,length);
    }

    /* evaluate uses unoptimized for loop*/
    public void copy2(double[] values, int begin, int length) {
        double[] copy = new double[length];
        int l = begin + length;
        int j = 0;
        for (int i = begin; i < l; i++) {
            copy[j] = values[i];
            j++;
        }
    }

    

    public void copy3(double[] values, int begin, int length) {
        MyDoubleIterator iter = new MyDoubleIterator();
        iter.setContent(values, begin, length);
        double[] copy = new double[length];
        for (int j = 0; iter.hasNext(); j++) {
            copy[j] = iter.next();
        }
    }

    public void copy4(double[] values, int begin, int length) {
        MyDoubleIterator iter = new MyDoubleIterator();
        iter.setContent(values, begin, length);
        double[] copy = new double[length];
        for (int j = 0; j < length; j++) {
            copy[j] = iter.next();
        }
    }

    public class MyDoubleIterator {

        double[] vals;
        int l = 0;
        int c = 0;

        public void setContent(double[] values, int begin, int length) {
            this.vals = values;
            this.l = length;
            this.c = begin;
        }

        public boolean hasNext() {
            return c < l && c < vals.length;
        }

        public double next() {
            return vals[c++];
        }

    }
}

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to