----- Original Message -----
From: Thomas A Torda <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, December 26, 1999 11:03 PM
Subject: Excel

> I am a statistical near-illiterate, trying to write an introduction to stats
> for real stats illiterates, using Excel data analysis functions. I have
> problems with some of the procedures and would be interested to know the
> algorithms used. Does anyone know whether these can be found or whether
> anything has been written on the use of Excel as a stats resource?
> The covariance and the two factor Anova especially appear a bit odd.
> Thanks,
> Tom Torda
> [EMAIL PROTECTED]
>

There have been some adverse reactions on edstat to your request.

First of all, you are reinventing the wheel. There are books out there in the software shops on EXCEL, some of which describe just what you are looking for.

Second, there are books that describe how to do data analysis on EXCEL, such as Berk and Carey's book, "Data Analysis With Microsoft Excel", Duxbury, ISBN 0-534-52929-1 ($30) (reviewed in the November 1999 issue of Technometrics).

Third, the issue of "accuracy" or "ability" to solve problems is just a matter of belief: belief in the ability of your favorite black box. If you believe SAS or SPSS or S-Plus or EXCEL will give you what you want, then fine. There are so many theoretical ways these days to do data analysis, and they all have some degree of validity. The fine differences, such as biased/unbiased, marginal or not, power, non-normality, computational accuracy, etc., seem to get lost. Whatever the software displays or prints out must be correct, especially if the black box uses some new method. We tend to overlook the fact that authors do make mistakes or come up with something that is no better than traditional methods of analysis. As one commentator on edstat said some time ago, "there is a lot of stat garbage" out there.

If you go on edstat or semnet for several years, you get the flavor of what people are asking for. Their favorite black box comes up with a message that disturbs them, and they want to know what it means. For example, a program such as MX (free) provides the engine to do structural equation modeling (SEM), but is unconcerned about the problems of building the model and interpreting the results. Bollen in his book goes to great lengths on the difficulties of building the SEM model and interpreting the computed results. The focus, judging from the user group questions, is on getting numbers (or graphs) out, not on how they are generated. Your intent validates this view.

Fourth, EXCEL is an excellent tool for data input and handling. Compared to all the overpriced stat software being peddled, EXCEL is a bargain, in spite of Microsoft overpricing the OFFICE suite. By using Visual Basic for Applications, which comes with the OFFICE suite, you can do your own programming. For example, the line "x = ActiveSheet.Cells(4,7).Value" puts the value of the cell at row 4, column G of the active EXCEL worksheet into the variable x. You use macros to hold the program. All the old BASIC from the 1980s is recognized. Very simple to program. If you don't like EXCEL's built-in ANOVA, write your own. Microsoft and other publishers have a lot of good books out on how to do this. If you build your own engine, you then know what it does.

Fifth, I think EXCEL is an excellent data input engine that can be used as a front end for data input. The graphics are very limited, but there are software packages that do graphics from standard EXCEL, PARADOX, DBASE, ACCESS, ASCII, ... data files. EXCEL is an excellent tool for developers to build on.

DAHeiser
Chair, Department of Silly Things (after Monty Python)
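
P.S. To illustrate the "build your own engine" point with the covariance you mention, here is a minimal sketch, written in Python rather than VBA only so that it can be run outside EXCEL. The function name `covariance` and the sample data are made up for illustration. One common reason two packages disagree on a covariance is the divisor (n versus n - 1); whether that explains the output that looked odd to you is an assumption you would have to check against your own numbers.

```python
def covariance(xs, ys, sample=True):
    """Covariance of two equal-length lists of numbers.

    sample=True divides by n - 1 (sample covariance);
    sample=False divides by n (population covariance).
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("need at least two pairs of values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n - 1 if sample else n)


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print("divide by n - 1:", covariance(xs, ys, sample=True))
print("divide by n:    ", covariance(xs, ys, sample=False))
```

Running both versions on the same columns you gave to the black box shows quickly which divisor, if either, it is using.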