Here's a patch that does the list of changes I stated below. Phil, now that is see it, I'm not so sure I'm as against it as I was before. It does seem to make the implementations a little bit more clearly defined. And I think your right about if users want a window they can use DescriptiveStatisticsImpl.

-Mark

Mark R. Diggory wrote:



Phil Steitz wrote:

"Mark R. Diggory" <[EMAIL PROTECTED]> wrote:

I think your percieving apply(...) as if its to "apply" externally instantiated UnivariateStatistics, it wasn't designed for that, it was designed to polymorphically implement the evaluations for the individual internal existing statistics which are incremented in addValue(...).



The problem is that the way the API is written, it takes an arbitrary
Univariate as argument. It makes sense to apply() any Univariate when
there are stored values to apply it to, or if the actual parameter is one
of the internally maintained stats (in which case you can use the getXxx
methods instead); but it makes no sense when there is no storage and <stat>
is not one of the internally maintained instances (or at least of the same
type as one of these or somehow computable from the internally maintained
state). If we keep it, we should throw or return NaN if there is no window
and <stat> is not one of the internally maintained stats.


I agree, thats why it is not part of the DescriptiveStatistics interface because it was never meant to be used by the user. This is why I recomend making it protected, its part of the Abstract classes so that its available to the implementations internally. Its the responsability of the developer to make sure its applied properly, the user will not be making any mistakes by trying to use it because its not available to them in the DescriptiveStatistics interface.

My preference would be to move this down to the Stored version where it
always makes sense. I understand that the loss here would be in applying
additional Univariates to the finite window case, but the same
functionality is available in the stored impl.


I'm not thrilled about it, but continue to entertain the idea:

If we do this we will have to do the following:

1.) We will end up having to reimplement every DescriptiveStatistics.get<Stat> method in AbstractStorelessDescriptiveStatistics to call StorelessUnivariateStatistic.getResult() if we remove the abstract apply() method from the underlying AbstStorlessDescrStat implementation.

2.) We will then need to reimplement every get<Stat> method in AbstDscrStats to overload the above get<Stat> method in AbstStrlsDscrStats to call "apply" again.

3.) We will need to overload get/SetWindowSize in StorlsDscrStatImpl to throw UnsupportedOperationException's

4.) We will remove the windowed storage from StorlsDscrStatImpl.

In the end, this reduces the functionality of StrlsDscrStatImpl and makes it have even more "usupported operations" than it did before.

I'd like to hear what other math developers / contributors think of this.

-Mark


-- Mark Diggory Software Developer Harvard MIT Data Center http://osprey.hmdc.harvard.edu
Index: src/java/org/apache/commons/math/stat/AbstractDescriptiveStatistics.java
===================================================================
retrieving revision 1.1
diff -u -r1.1 AbstractDescriptiveStatistics.java
--- src/java/org/apache/commons/math/stat/AbstractDescriptiveStatistics.java    15 Nov 
2003 16:01:37 -0000      1.1
+++ src/java/org/apache/commons/math/stat/AbstractDescriptiveStatistics.java    18 Jan 
2004 19:12:38 -0000
@@ -55,15 +55,19 @@
 
 import java.util.Arrays;
 
+import org.apache.commons.math.stat.univariate.UnivariateStatistic;
 import org.apache.commons.math.stat.univariate.rank.Percentile;
 
 /**
  * Provides univariate measures for an array of doubles. 
- * @version $Revision: 1.1 $ $Date: 2003/11/15 16:01:37 $
+ * @version $Revision: 1.1 $ $Date: 2003/11/10 17:43:28 $
  */
 public abstract class AbstractDescriptiveStatistics
     extends AbstractStorelessDescriptiveStatistics {
 
+    /** hold the window size **/
+    protected int windowSize = INFINITE_WINDOW;
+    
     /** Percentile */
     protected Percentile percentile = new Percentile(50);
         
@@ -100,21 +104,138 @@
     }
     
     /**
-     * @see org.apache.commons.math.stat.Univariate#addValue(double)
+     * Apply the given statistic to this univariate collection.
+     * @param stat the statistic to apply
+     * @return the computed value of the statistic.
+     */
+    public abstract double apply(UnivariateStatistic stat);
+    
+    /**
+     * @see org.apache.commons.math.stat.Univariate#getSum()
+     */
+    public double getSum() {
+        return apply(sum);
+    }
+
+    /**
+     * @see org.apache.commons.math.stat.Univariate#getSumsq()
+     */
+    public double getSumsq() {
+        return apply(sumsq);
+    }
+
+    /**
+     * @see org.apache.commons.math.stat.Univariate#getMean()
+     */
+    public double getMean() {
+        return apply(mean);
+    }
+
+    /**
+     * Returns the standard deviation for this collection of values
+     * @see org.apache.commons.math.stat.Univariate#getStandardDeviation()
+     */
+    public double getStandardDeviation() {
+        double stdDev = Double.NaN;
+        if (getN() > 0) {
+            if (getN() > 1) {
+                stdDev = Math.sqrt(getVariance());
+            } else {
+                stdDev = 0.0;
+            }
+        }
+        return (stdDev);
+    }
+
+    /**
+     * Returns the variance of the values that have been added via West's
+     * algorithm as described by
+     * <a href="http://doi.acm.org/10.1145/359146.359152";>Chan, T. F. and
+     * J. G. Lewis 1979, <i>Communications of the ACM</i>,
+     * vol. 22 no. 9, pp. 526-531.</a>.
+     *
+     * @return The variance of a set of values.  
+     *         Double.NaN is returned for an empty 
+     *         set of values and 0.0 is returned for 
+     *         a &lt;= 1 value set.
+     */
+    public double getVariance() {
+        return apply(variance);
+    }
+
+    /**
+     * Returns the skewness of the values that have been added as described by
+     * <a href="http://mathworld.wolfram.com/k-Statistic.html";>
+     * Equation (6) for k-Statistics</a>.
+     * @return The skew of a set of values.  Double.NaN is returned for
+     *         an empty set of values and 0.0 is returned for a 
+     *         &lt;= 2 value set.
      */
-    public abstract void addValue(double value);
+    public double getSkewness() {
+        return apply(skewness);
+    }
 
     /**
-     * @see org.apache.commons.math.stat.DescriptiveStatistics#getValues()
+     * Returns the kurtosis of the values that have been added as described by
+     * <a href="http://mathworld.wolfram.com/k-Statistic.html";>
+     * Equation (7) for k-Statistics</a>.
+     *
+     * @return The kurtosis of a set of values.  Double.NaN is returned for
+     *         an empty set of values and 0.0 is returned for a &lt;= 3 
+     *         value set.
      */
-    public abstract double[] getValues();
+    public double getKurtosis() {
+        return apply(kurtosis);
+    }
 
+    /**
+     * @see org.apache.commons.math.stat.DescriptiveStatistics#getKurtosisClass()
+     */
+    public int getKurtosisClass() {
+        int kClass = MESOKURTIC;
+
+        double kurtosis = getKurtosis();
+        if (kurtosis > 0) {
+            kClass = LEPTOKURTIC;
+        } else if (kurtosis < 0) {
+            kClass = PLATYKURTIC;
+        }
+        return (kClass);
+    }
+
+    /**
+     * @see org.apache.commons.math.stat.Univariate#getMax()
+     */
+    public double getMax() {
+        return apply(max);
+    }
 
     /**
-     * @see org.apache.commons.math.stat.DescriptiveStatistics#getElement(int)
+     * @see org.apache.commons.math.stat.Univariate#getMin()
      */
-    public abstract double getElement(int index);
+    public double getMin() {
+        return apply(min);
+    }
 
+    /**
+     * @see org.apache.commons.math.stat.Univariate#getGeometricMean()
+     */
+    public double getGeometricMean() {
+        return apply(geoMean);
+    }
 
+    /**
+     * @see org.apache.commons.math.stat.Univariate#getWindowSize()
+     */
+    public int getWindowSize() {
+        return windowSize;
+    }
 
+    /**
+     * @see org.apache.commons.math.stat.Univariate#setWindowSize(int)
+     */
+    public void setWindowSize(int windowSize) {
+        clear();
+        this.windowSize = windowSize;
+    }
 }
Index: 
src/java/org/apache/commons/math/stat/AbstractStorelessDescriptiveStatistics.java
===================================================================
retrieving revision 1.1
diff -u -r1.1 AbstractStorelessDescriptiveStatistics.java
--- src/java/org/apache/commons/math/stat/AbstractStorelessDescriptiveStatistics.java  
 15 Nov 2003 16:01:37 -0000      1.1
+++ src/java/org/apache/commons/math/stat/AbstractStorelessDescriptiveStatistics.java  
 18 Jan 2004 19:12:38 -0000
@@ -72,9 +72,6 @@
  */
 public abstract class AbstractStorelessDescriptiveStatistics extends 
DescriptiveStatistics {
 
-    /** hold the window size **/
-    protected int windowSize = INFINITE_WINDOW;
-
     /** count of values that have been added */
     protected int n = 0;
 
@@ -141,24 +138,6 @@
     }
 
     /**
-     * Apply the given statistic to this univariate collection.
-     * @param stat the statistic to apply
-     * @return the computed value of the statistic.
-     */
-    public abstract double apply(UnivariateStatistic stat);
-    
-
-    /**
-     * If windowSize is set to Infinite, 
-     * statistics are calculated using the following 
-     * <a href="http://www.spss.com/tech/stat/Algorithms/11.5/descriptives.pdf";>
-     * recursive strategy
-     * </a>.
-     * @see org.apache.commons.math.stat.Univariate#addValue(double)
-     */
-    public abstract void addValue(double value);
-
-    /**
      * @see org.apache.commons.math.stat.Univariate#getN()
      */
     public int getN() {
@@ -169,21 +148,21 @@
      * @see org.apache.commons.math.stat.Univariate#getSum()
      */
     public double getSum() {
-        return apply(sum);
+        return sum.getResult();
     }
 
     /**
      * @see org.apache.commons.math.stat.Univariate#getSumsq()
      */
     public double getSumsq() {
-        return apply(sumsq);
+        return sumsq.getResult();
     }
 
     /**
      * @see org.apache.commons.math.stat.Univariate#getMean()
      */
     public double getMean() {
-        return apply(mean);
+        return mean.getResult();
     }
 
     /**
@@ -215,7 +194,7 @@
      *         a &lt;= 1 value set.
      */
     public double getVariance() {
-        return apply(variance);
+        return variance.getResult();
     }
 
     /**
@@ -227,7 +206,7 @@
      *         &lt;= 2 value set.
      */
     public double getSkewness() {
-        return apply(skewness);
+        return skewness.getResult();
     }
 
     /**
@@ -240,7 +219,7 @@
      *         value set.
      */
     public double getKurtosis() {
-        return apply(kurtosis);
+        return kurtosis.getResult();
     }
 
     /**
@@ -262,21 +241,21 @@
      * @see org.apache.commons.math.stat.Univariate#getMax()
      */
     public double getMax() {
-        return apply(max);
+        return max.getResult();
     }
 
     /**
      * @see org.apache.commons.math.stat.Univariate#getMin()
      */
     public double getMin() {
-        return apply(min);
+        return min.getResult();
     }
 
     /**
     * @see org.apache.commons.math.stat.Univariate#getGeometricMean()
     */
     public double getGeometricMean() {
-        return apply(geoMean);
+        return geoMean.getResult();
     }
     
     /**
@@ -321,15 +300,41 @@
      * @see org.apache.commons.math.stat.Univariate#getWindowSize()
      */
     public int getWindowSize() {
-        return windowSize;
+        throw new UnsupportedOperationException("Not Available in 
AbstractStorelessDescriptiveStatistics");
     }
 
     /**
      * @see org.apache.commons.math.stat.Univariate#setWindowSize(int)
      */
     public void setWindowSize(int windowSize) {
-        clear();
-        this.windowSize = windowSize;
+        throw new UnsupportedOperationException("Not Available in 
AbstractStorelessDescriptiveStatistics");
     }
 
+    /* (non-Javadoc)
+     * @see org.apache.commons.math.stat.DescriptiveStatistics#getValues()
+     */
+    public double[] getValues() {
+        throw new UnsupportedOperationException("Not Available in 
AbstractStorelessDescriptiveStatistics");
+    }
+
+    /* (non-Javadoc)
+     * @see org.apache.commons.math.stat.DescriptiveStatistics#getSortedValues()
+     */
+    public double[] getSortedValues() {
+        throw new UnsupportedOperationException("Not Available in 
AbstractStorelessDescriptiveStatistics");
+    }
+
+    /* (non-Javadoc)
+     * @see org.apache.commons.math.stat.DescriptiveStatistics#getElement(int)
+     */
+    public double getElement(int index) {
+        throw new UnsupportedOperationException("Not Available in 
AbstractStorelessDescriptiveStatistics");
+    }
+
+    /* (non-Javadoc)
+     * @see org.apache.commons.math.stat.DescriptiveStatistics#getPercentile(double)
+     */
+    public double getPercentile(double p) {
+        throw new UnsupportedOperationException("Not Available in 
AbstractStorelessDescriptiveStatistics");
+    }
 }
Index: src/java/org/apache/commons/math/stat/DescriptiveStatisticsImpl.java
===================================================================
retrieving revision 1.2
diff -u -r1.2 DescriptiveStatisticsImpl.java
--- src/java/org/apache/commons/math/stat/DescriptiveStatisticsImpl.java        19 Nov 
2003 03:28:23 -0000      1.2
+++ src/java/org/apache/commons/math/stat/DescriptiveStatisticsImpl.java        18 Jan 
2004 19:12:38 -0000
@@ -74,6 +74,14 @@
     public DescriptiveStatisticsImpl() {
         eDA = new ContractableDoubleArray();
     }
+    
+    /**
+     * Construct a DescriptiveStatisticsImpl with window
+     */
+    public DescriptiveStatisticsImpl(int windowSize) {
+        this.windowSize = windowSize;
+        eDA = new ContractableDoubleArray(windowSize);
+    }   
 
     /**
      * @see org.apache.commons.math.stat.DescriptiveStatistics#getValues()
Index: src/java/org/apache/commons/math/stat/StorelessDescriptiveStatisticsImpl.java
===================================================================
retrieving revision 1.2
diff -u -r1.2 StorelessDescriptiveStatisticsImpl.java
--- src/java/org/apache/commons/math/stat/StorelessDescriptiveStatisticsImpl.java      
 19 Nov 2003 03:28:23 -0000      1.2
+++ src/java/org/apache/commons/math/stat/StorelessDescriptiveStatisticsImpl.java      
 18 Jan 2004 19:12:39 -0000
@@ -55,9 +55,6 @@
 
 import java.io.Serializable;
 
-import org.apache.commons.math.stat.univariate.*;
-import org.apache.commons.math.util.FixedDoubleArray;
-
 /**
  *
  * Accumulates univariate statistics for values fed in
@@ -70,23 +67,11 @@
 */
 public class StorelessDescriptiveStatisticsImpl extends 
AbstractStorelessDescriptiveStatistics implements Serializable {
 
-       /** fixed storage */
-       private FixedDoubleArray storage = null;
-
        /** Creates new univariate with an infinite window */
        public StorelessDescriptiveStatisticsImpl() {
                super();
        }
 
-       /** 
-        * Creates a new univariate with a fixed window 
-        * @param window Window Size
-        */
-       public StorelessDescriptiveStatisticsImpl(int window) {
-               super(window);
-               storage = new FixedDoubleArray(window);
-       }
-
        /**
         *  If windowSize is set to Infinite, moments 
         *  are calculated using the following 
@@ -98,18 +83,6 @@
         */
        public void addValue(double value) {
 
-               if (storage != null) {
-                       /* then all getters deligate to StatUtils
-                        * and this clause simply adds/rolls a value in the storage 
array 
-                        */
-                       if (getWindowSize() == n) {
-                               storage.addElementRolling(value);
-                       } else {
-                               n++;
-                               storage.addElement(value);
-                       }
-
-               } else {
                        /* If the windowSize is infinite don't store any values and 
there 
                         * is no need to discard the influence of any single item.
                         */
@@ -120,13 +93,7 @@
                        sumsq.increment(value);
                        sumLog.increment(value);
                        geoMean.increment(value);
-
                        moment.increment(value);
-                       //mean.increment(value);
-                       //variance.increment(value);
-                       //skewness.increment(value);
-                       //kurtosis.increment(value);
-               }
        }
 
        /**
@@ -146,63 +113,6 @@
                return outBuffer.toString();
-       }
-
-       /**
-        * @see org.apache.commons.math.stat.Univariate#clear()
-        */
-       public void clear() {
-               super.clear();
-               if (getWindowSize() != INFINITE_WINDOW) {
-                       storage = new FixedDoubleArray(getWindowSize());
-               }
-       }
-
-       /**
-        * Apply the given statistic to this univariate collection.
-        * @param stat the statistic to apply
-        * @return the computed value of the statistic.
-        */
-       public double apply(UnivariateStatistic stat) {
-
-               if (storage != null) {
-                       return stat.evaluate(
-                               storage.getValues(),
-                               storage.start(),
-                               storage.getNumElements());
-               } else if (stat instanceof StorelessUnivariateStatistic) {
-                       return ((StorelessUnivariateStatistic) stat).getResult();
-               }
-
-               return Double.NaN;
-       }
-
-       /* (non-Javadoc)
-        * @see org.apache.commons.math.stat.DescriptiveStatistics#getValues()
-        */
-       public double[] getValues() {
-               throw new UnsupportedOperationException("Only Available with Finite 
Window");
-       }
-
-       /* (non-Javadoc)
-        * @see org.apache.commons.math.stat.DescriptiveStatistics#getSortedValues()
-        */
-       public double[] getSortedValues() {
-               throw new UnsupportedOperationException("Only Available with Finite 
Window");
-       }
-
-       /* (non-Javadoc)
-        * @see org.apache.commons.math.stat.DescriptiveStatistics#getElement(int)
-        */
-       public double getElement(int index) {
-               throw new UnsupportedOperationException("Only Available with Finite 
Window");
-       }
-
-       /* (non-Javadoc)
-        * @see 
org.apache.commons.math.stat.DescriptiveStatistics#getPercentile(double)
-        */
-       public double getPercentile(double p) {
-               throw new UnsupportedOperationException("Only Available with Finite 
Window");
        }
 
 }
Index: src/test/org/apache/commons/math/stat/ListUnivariateImpl.java
===================================================================
retrieving revision 1.1
diff -u -r1.1 ListUnivariateImpl.java
--- src/test/org/apache/commons/math/stat/ListUnivariateImpl.java       15 Nov 2003 
16:01:41 -0000      1.1
+++ src/test/org/apache/commons/math/stat/ListUnivariateImpl.java       18 Jan 2004 
19:12:41 -0000
@@ -61,7 +61,7 @@
 import org.apache.commons.math.util.NumberTransformer;
 
 /**
- * @version $Revision: 1.1 $ $Date: 2003/11/15 16:01:41 $
+ * @version $Revision: 1.1 $ $Date: 2003/11/10 17:43:31 $
  */
 public class ListUnivariateImpl extends AbstractDescriptiveStatistics {
 
Index: src/test/org/apache/commons/math/stat/univariate/UnivariateImplTest.java
===================================================================
retrieving revision 1.1
diff -u -r1.1 UnivariateImplTest.java
--- src/test/org/apache/commons/math/stat/univariate/UnivariateImplTest.java    15 Nov 
2003 16:01:41 -0000      1.1
+++ src/test/org/apache/commons/math/stat/univariate/UnivariateImplTest.java    18 Jan 
2004 19:12:41 -0000
@@ -54,6 +54,7 @@
 package org.apache.commons.math.stat.univariate;
 
 import org.apache.commons.math.stat.DescriptiveStatistics;
+import org.apache.commons.math.stat.DescriptiveStatisticsImpl;
 import org.apache.commons.math.stat.StorelessDescriptiveStatisticsImpl;
 
 import junit.framework.Test;
@@ -63,7 +64,7 @@
 /**
  * Test cases for the [EMAIL PROTECTED] DescriptiveStatistics} class.
  *
- * @version $Revision: 1.1 $ $Date: 2003/11/15 16:01:41 $
+ * @version $Revision: 1.1 $ $Date: 2003/11/10 17:43:34 $
  */
 
 public final class UnivariateImplTest extends TestCase {
@@ -169,7 +170,7 @@
     }
 
     public void testProductAndGeometricMean() throws Exception {
-       StorelessDescriptiveStatisticsImpl u = new 
StorelessDescriptiveStatisticsImpl(10);
+       DescriptiveStatistics u = new DescriptiveStatisticsImpl(10);
                
         u.addValue( 1.0 );
         u.addValue( 2.0 );
@@ -191,7 +192,7 @@
     }
     
     public void testRollingMinMax() {
-        StorelessDescriptiveStatisticsImpl u = new 
StorelessDescriptiveStatisticsImpl(3);
+        DescriptiveStatistics u = new DescriptiveStatisticsImpl(3);
         u.addValue( 1.0 );
         u.addValue( 5.0 );
         u.addValue( 3.0 );
Index: xdocs/developers.xml
===================================================================
retrieving revision 1.9
diff -u -r1.9 developers.xml
--- xdocs/developers.xml        15 Nov 2003 18:38:16 -0000      1.9
+++ xdocs/developers.xml        18 Jan 2004 19:12:42 -0000
@@ -238,5 +238,14 @@
     </dl>
    </subsection>
   </section>
+  <section name='Generating Distributions'>
+   <p>
+    Publishing of distributions for the [math] project is currently
+    done via the Maven repository at <a 
href="http://www.ibiblio.org/maven";>Ibiblio</a>. 
+   </p>
+   <p>
+       Requests to publish new releases are made using <a 
href="http://jira.codehaus.org";>jira</a>.
+   </p>
+  </section>
  </body>
 </document>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to