I forgot to CC the list.

----- Forwarded message from Jason Stover <[EMAIL PROTECTED]> -----

Date: Sun, 4 Feb 2007 11:50:38 -0500
From: Jason Stover <[EMAIL PROTECTED]>
To: John Darrington <[EMAIL PROTECTED]>
Subject: Re: Regression results need checking ?
In-Reply-To: <[EMAIL PROTECTED]>
User-Agent: Mutt/1.5.10i

I just checked in a fix for the computation of the p-values.

I can fix the computation of the standardized coefficients, but before
I do, I have a question. Is there a place where the regression
procedure can just read the standard deviation for a variable, or must
it compute the standard deviation itself? And if the regression
procedure must compute the standard deviation itself, is there a single
routine somewhere in src that it can use, or does it need its own?

The reason I ask is because this test data set has missing data, and
regression already has its own way of dealing with missing cases. It
would be nice if there were another standard procedure to call to
compute descriptive statistics without having to make regression aware
of yet another way to handle missing data.

Computing means, standard deviations, and other univariate statistics
is a common enough task that there should be one place to do it. So as
long as we're on the topic, it might be nice to have a couple of
routines in src/math to compute such descriptive statistics, and maybe
even store them in a cache. Would a pool serve this purpose?

I guess by raising the issue, this means I'm volunteering to do it.

-Jason

On Sat, Feb 03, 2007 at 10:50:17PM -0500, Jason Stover wrote:
>
> For the first example, it looks like pspp and spss are computing the
> same basic statistics. The most important values in the output are
> those in the ANOVA table, the coefficients and their standard errors.
> All these agree with the values shown in the example. That said, there
> are some discrepancies which need further attention:
>
> 1. The "standard error of the estimate" in the model summary table.
>    On the page whose link you sent, this value is about 64, but pspp
>    reports a value of about .08. I'll have to check this one later,
>    but for now, I suspect it's the web page that has the incorrect
>    value. If I remember correctly, the value should be the standard
>    error of R-square. R-square is always between 0 and 1, and
>    therefore should not have a standard error larger than 1, as the
>    web page reports.
>
> 2. The "Coefficients" table, in the column referring to the
>    standardized coefficients. This is something I'll have to check
>    more closely later, but pspp seems to report the incorrect values
>    here.
>
> 3. The discrepancy in the "Sig." column needs to be checked. I'm
>    guessing it's something simple, like a miscalculation of degrees
>    of freedom. This column is filled in by a straightforward
>    computation of the t distribution.
>
> I'll look into this more over the next few days and patch as necessary.
>
> -Jason
>
>
> On Sat, Feb 03, 2007 at 08:25:06AM +0900, John Darrington wrote:
> > When I try out the exercises at
> > http://www.ats.ucla.edu/stat/SPSS/webbooks/reg/chapter1/spssreg1.htm
> > using pspp, the numbers I get are quite different to those in their
> > examples.
> >
> > Do we have something wrong or do they?
> >
> > J'
> >
> > --
> > PGP Public key ID: 1024D/2DE827B3
> > fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
> > See http://pgp.mit.edu or any PGP keyserver for public key.

----- End forwarded message -----

_______________________________________________
pspp-dev mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/pspp-dev
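
For illustration only, the shared descriptive-statistics routine discussed in the forwarded message might look roughly like the sketch below. It is written in Python for brevity rather than in PSPP's own C, and nothing here is taken from the PSPP source tree: the names `complete_cases`, `describe`, and `standardized_coefficient` are hypothetical, missing values are assumed to be represented as None or NaN, and listwise deletion is assumed as the missing-data policy (the message does not say which policy regression actually uses). The standard deviation is accumulated in one pass with Welford's method, and the standardized coefficient uses the usual definition b * sd(x) / sd(y).

```python
import math
from typing import Iterable, List, Optional, Sequence, Tuple

Value = Optional[float]


def is_missing(v: Value) -> bool:
    """Treat None and NaN as missing (an assumption of this sketch)."""
    return v is None or math.isnan(v)


def complete_cases(rows: Iterable[Sequence[Value]]) -> List[Tuple[float, ...]]:
    """Listwise deletion: keep only rows with no missing value."""
    return [tuple(r) for r in rows if not any(is_missing(v) for v in r)]


def describe(values: Iterable[Value]) -> dict:
    """One-pass count, mean, and sample standard deviation (Welford's method).

    Missing values are skipped. The sd is NaN when fewer than two
    non-missing values are seen.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for v in values:
        if is_missing(v):
            continue
        n += 1
        delta = v - mean
        mean += delta / n
        m2 += delta * (v - mean)
    sd = math.sqrt(m2 / (n - 1)) if n > 1 else float("nan")
    return {"n": n, "mean": mean if n > 0 else float("nan"), "sd": sd}


def standardized_coefficient(b: float, sd_x: float, sd_y: float) -> float:
    """Standardized (beta) coefficient: raw slope rescaled by sd(x) / sd(y)."""
    return b * sd_x / sd_y


if __name__ == "__main__":
    # Columns are (x, y); the second row is dropped because x is missing.
    rows = [(1.0, 2.0), (None, 3.0), (3.0, 6.0), (5.0, 10.0)]
    kept = complete_cases(rows)
    stats_x = describe(r[0] for r in kept)
    stats_y = describe(r[1] for r in kept)
    print(stats_x, stats_y)
    print(standardized_coefficient(2.0, stats_x["sd"], stats_y["sd"]))
```

Filtering the cases first and computing the statistics afterwards keeps the missing-data decision in one place, so the standard deviations come from the same cases as the fitted coefficients.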
