psteitz     2004/04/25 12:16:13

  Modified:    math/xdocs/userguide stat.xml
  Log:
  Added BivariateRegression section.

  Revision  Changes    Path
  1.14      +101 -3    jakarta-commons/math/xdocs/userguide/stat.xml

  Index: stat.xml
  ===================================================================
  RCS file: /home/cvs/jakarta-commons/math/xdocs/userguide/stat.xml,v
  retrieving revision 1.13
  retrieving revision 1.14
  diff -u -r1.13 -r1.14
  --- stat.xml	21 Mar 2004 20:32:50 -0000	1.13
  +++ stat.xml	25 Apr 2004 19:16:13 -0000	1.14
  @@ -240,8 +240,106 @@
           </p>
           </subsection>
           <subsection name="1.4 Bivariate regression" href="regression">
  -          <p>This is yet to be written. Any contributions will be gratefully
  -          accepted!</p>
  +          <p>
  +           <a href="../apidocs/org/apache/commons/math/stat/multivariate/BivariateRegression.html">
  +           org.apache.commons.math.stat.multivariate.BivariateRegression</a>
  +           provides ordinary least squares regression with one independent variable, estimating
  +           the linear model:
  +          </p>
  +          <p>
  +            <code> y = intercept + slope * x </code>
  +          </p>
  +          <p>
  +            Standard errors for <code>intercept</code> and <code>slope</code> are
  +            available, as well as ANOVA, r-square and Pearson's r statistics.
  +          </p>
  +          <p>
  +            Observations (x,y pairs) can be added to the model one at a time or they
  +            can be provided in a 2-dimensional array.  The observations are not stored
  +            in memory, so there is no limit to the number of observations that can be
  +            added to the model.
  +          </p>
  +          <p>
  +            <strong>Usage Notes</strong>: <ul>
  +            <li> When there are fewer than two observations in the model, or when
  +            there is no variation in the x values (i.e. all x values are the same),
  +            all statistics return <code>NaN</code>.  At least two observations with
  +            different x coordinates are required to estimate a bivariate regression
  +            model.</li>
  +            <li> Getters for the statistics always compute values based on the current
  +            set of observations -- i.e., you can get statistics, then add more data
  +            and get updated statistics without using a new instance.  There is no
  +            "compute" method that updates all statistics.  Each of the getters performs
  +            the necessary computations to return the requested statistic.</li>
  +            </ul>
  +          </p>
  +          <p>
  +            <strong>Implementation Notes</strong>: <ul>
  +            <li> As observations are added to the model, the sum of x values, y values,
  +            cross products (x times y), and squared deviations of x and y from their
  +            respective means are updated using updating formulas defined in
  +            "Algorithms for Computing the Sample Variance: Analysis and
  +            Recommendations", Chan, T.F., Golub, G.H., and LeVeque, R.J.
  +            1983, American Statistician, vol. 37, pp. 242-247, referenced in
  +            Weisberg, S. "Applied Linear Regression". 2nd Ed. 1985.  All regression
  +            statistics are computed from these sums.</li>
  +            <li> Inference statistics (confidence intervals, parameter significance levels)
  +            are based on the assumption that the observations included in the model are
  +            drawn from a <a href="http://mathworld.wolfram.com/BivariateNormalDistribution.html">
  +            Bivariate Normal Distribution</a>.</li>
  +            </ul>
  +          </p>
  +          <p>
  +            Here are some examples.
  +            <dl>
  +            <dt>Estimate a model based on observations added one at a time</dt>
  +            <br></br>
  +            <dd>Instantiate a regression instance and add data points
  +            <source>
  +            BivariateRegression regression = new BivariateRegression();
  +            regression.addData(1d, 2d);
  +            // At this point, with only one observation, all regression statistics will return NaN
  +            regression.addData(3d, 3d);
  +            // With only two observations, slope and intercept can be computed
  +            // but inference statistics will return NaN
  +            regression.addData(3d, 3d);
  +            // Now all statistics are defined.
  +            </source>
  +            </dd>
  +            <dd>Compute some statistics based on observations added so far
  +            <source>
  +System.out.println(regression.getIntercept());   // displays intercept of regression line
  +System.out.println(regression.getSlope());       // displays slope of regression line
  +System.out.println(regression.getSlopeStdErr()); // displays slope standard error
  +            </source>
  +            </dd>
  +            <dd>Use the regression model to predict the y value for a new x value
  +            <source>
  +System.out.println(regression.predict(1.5d)); // displays predicted y value for x = 1.5
  +            </source>
  +            More data points can be added and subsequent getXXX calls will incorporate
  +            additional data in statistics.
  +            </dd>
  +            <dt>Estimate a model from a double[][] array of data points</dt>
  +            <br></br>
  +            <dd>Instantiate a regression object and load dataset
  +            <source>
  +            double[][] data = { { 1, 3 }, {2, 5 }, {3, 7 }, {4, 14 }, {5, 11 }};
  +            BivariateRegression regression = new BivariateRegression();
  +            regression.addData(data);
  +            </source>
  +            </dd>
  +            <dd>Estimate regression model based on data
  +            <source>
  +System.out.println(regression.getIntercept());   // displays intercept of regression line
  +System.out.println(regression.getSlope());       // displays slope of regression line
  +System.out.println(regression.getSlopeStdErr()); // displays slope standard error
  +            </source>
  +            More data points -- even another double[][] array -- can be added and subsequent
  +            getXXX calls will incorporate additional data in statistics.
  +            </dd>
  +            </dl>
  +          </p>
           </subsection>
           <subsection name="1.5 Statistical tests" href="tests">
             <p>This is yet to be written. Any contributions will be gratefully
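
Note on the Implementation Notes in the diff above: the idea of keeping running sums instead of stored observations can be illustrated with a small standalone sketch. This is not the BivariateRegression source. The class name RunningRegressionSketch and its fields are hypothetical, and the update step is the standard one-pass mean and cross-deviation recurrence, assumed here to be in the spirit of the updating formulas the notes cite. It uses only the JDK and reproduces the documented NaN behavior for fewer than two observations or no variation in x.

```java
// Hypothetical illustration only; not the Commons Math implementation.
final class RunningRegressionSketch {
    private long n = 0;        // number of observations seen so far
    private double xbar = 0;   // running mean of x
    private double ybar = 0;   // running mean of y
    private double sumXX = 0;  // running sum of squared deviations of x from its mean
    private double sumXY = 0;  // running sum of cross deviations of x and y

    /** Adds one (x, y) observation without storing it. */
    void addData(double x, double y) {
        if (n == 0) {
            xbar = x;
            ybar = y;
        } else {
            double dx = x - xbar;
            double dy = y - ybar;
            double w = n / (n + 1.0);
            // Update deviation sums first, using the means of the previous n points.
            sumXX += dx * dx * w;
            sumXY += dx * dy * w;
            xbar += dx / (n + 1.0);
            ybar += dy / (n + 1.0);
        }
        n++;
    }

    /** Adds every row of a two-column array as an (x, y) observation. */
    void addData(double[][] data) {
        for (double[] row : data) {
            addData(row[0], row[1]);
        }
    }

    /** Slope estimate, or NaN when it is not identifiable. */
    double getSlope() {
        if (n < 2 || sumXX == 0.0) {
            return Double.NaN; // fewer than two points, or all x values equal
        }
        return sumXY / sumXX;
    }

    /** Intercept estimate, or NaN when the slope is NaN. */
    double getIntercept() {
        return ybar - getSlope() * xbar;
    }

    /** Predicted y for a new x under the fitted line. */
    double predict(double x) {
        return getIntercept() + getSlope() * x;
    }

    public static void main(String[] args) {
        RunningRegressionSketch r = new RunningRegressionSketch();
        // Same data set as the double[][] example in the diff.
        r.addData(new double[][] { {1, 3}, {2, 5}, {3, 7}, {4, 14}, {5, 11} });
        System.out.println(r.getIntercept());
        System.out.println(r.getSlope());
        System.out.println(r.predict(1.5));
    }
}
```

Because each getter derives its value from the current sums, more data can be added at any time and the next call reflects it, which matches the usage note about there being no separate "compute" method.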