On Mon, 9 Oct 2000, Wuensch, Karl L. wrote:

>       I am sure many of you have been asked a question like that posed today
> by one of my students, and I would be interested in hearing how you respond
> to it.  I've included the question along with the response I gave this
> morning.  It looks a bit long to me now; I must have been having an attack
> of mania <grin>.

Absent the mania ;-) I rather like the sentiment.  I honestly believe that
there is something to be learned from memorizing several of the basic
formulas that are involved in defining statistics.  I, less elegantly,
tell my students that it is important to have this basic understanding so
that it can 1) be utilized when we have the machines start doing the
computations for us and 2) be drawn on for understanding when the
mathematics is no longer so simple.

Michael

> ****************************************************************************
> 9. October 2000
> 
>       One of my graduate students just asked me, "I have been diligently
> studying for the exam, but I realized that there are a lot of formulas and
> sub-formulas that I am having trouble memorizing.  I probably can memorize
> them, but I am not sure if that is what we need to do.  Should we memorize all
> the formulas and sub-formulas or should we expend most of our energy on
> having a good understanding of the concepts that we have covered or both?"
> 
>       Here is my reply:
> 
>       IMHO, one cannot have a good understanding of the concepts without
> knowing some basic definitions.  As a simple example, I opine that you would
> not have a good understanding of the concept of mean without knowing that it
> is the "balance point" which makes the sum of deviations about it zero, and
> that it is the quantity which minimizes the sum of squared deviations about
> it (the least squares criterion).  Now, I can present that definition in
> what you might call a pair of formulae, but it is, nevertheless, a
> definition essential for understanding the concept.  On the other hand, if
> you are going to compute a sample mean by hand, you will probably just add
> up the scores and divide by the number of scores, a useful "computational
> formula," but not a definition essential for understanding.
> 
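A minimal sketch of the two defining properties of the mean described above,
in Python.  The scores are made-up illustration data, and the helper names
sum_dev and sum_sq_dev are hypothetical, not from the original note.  A coarse
grid search stands in for the calculus argument that the mean minimizes the
sum of squared deviations.

```python
from statistics import mean

# Made-up scores, chosen only so the arithmetic is easy to follow.
scores = [2.0, 4.0, 4.0, 5.0, 10.0]
m = mean(scores)  # the "computational formula": add up and divide by N


def sum_dev(center, xs):
    """Sum of deviations of the scores about a candidate center."""
    return sum(x - center for x in xs)


def sum_sq_dev(center, xs):
    """Sum of squared deviations of the scores about a candidate center."""
    return sum((x - center) ** 2 for x in xs)


# Property 1: the mean is the balance point, so the deviations sum to zero.
balance = sum_dev(m, scores)

# Property 2: among candidate centers 0.0, 0.1, ..., 10.0, the mean gives
# the smallest sum of squared deviations (the least squares criterion).
candidates = [i / 10 for i in range(101)]
best = min(candidates, key=lambda c: sum_sq_dev(c, scores))

print("mean:", m)
print("sum of deviations about the mean:", balance)
print("least squares center on the grid:", best)
```

The grid only illustrates the idea; it is not how anyone would locate the
minimum in practice.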
>       Consider next the concept of variance (not just the more general
> concept of dispersion).  To understand it, you need to know that it is
> defined as the mean squared deviation of scores from their mean.  Yes, it is
> just another sort of mean.  Again, I can present that definition in what you
> might call a formula, but it is really just a definition essential for
> understanding the concept.  On the other hand, I would not think it
> essential that you know that you can get the corrected (for the mean) sum of
> squared deviations (numerator of the ratio we call variance) by taking the
> uncorrected sum of squares and subtracting the ratio of the square of the
> summed scores to the number of scores -- but that is the formula you should
> use if you were computing a variance by hand (but we have machines to do
> such tasks now, tasks done by one's graduate students back in the dark ages
> when I was a graduate student).
> 
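The two routes to the sum of squares can be compared side by side.  This is a
sketch under stated assumptions: the scores are made up, the variable names
are mine, and the variance is taken literally as the mean squared deviation
(dividing by N, which is what statistics.pvariance computes), not as the
N - 1 sample estimate.

```python
from math import isclose
from statistics import pvariance

# Made-up scores for illustration.
scores = [2.0, 4.0, 4.0, 5.0, 10.0]
n = len(scores)
m = sum(scores) / n

# Definitional route: squared deviations of the scores from their mean.
ss_definitional = sum((x - m) ** 2 for x in scores)

# Computational route: uncorrected sum of squares minus
# (square of the summed scores) / (number of scores).
ss_computational = sum(x * x for x in scores) - sum(scores) ** 2 / n

# Variance as "just another sort of mean": the mean squared deviation.
variance = ss_definitional / n

print("corrected SS, definitional: ", ss_definitional)
print("corrected SS, computational:", ss_computational)
print("variance (divide by N):     ", variance)
print("agrees with pvariance:      ", isclose(variance, pvariance(scores)))
```

Both routes give the same corrected sum of squares; only the first one says
what a variance is.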
>       Another example: after we cover correlation and regression, I would
> expect you to know that the correlation coefficient is really just a mean --
> the mean cross-product of standardized (z) scores, and it represents the
> slope of the standardized least squares linear regression line for
> predicting one variable from another.  While I could present that definition
> of Pearson r in "formulas," those would not be the formulas you would use to
> compute r, but rather are definitions that would help you understand r.
> With that understanding, you would realize that r is the number of standard
> deviations by which predicted Y increases per one standard deviation change
> in known X.  Building on that understanding of r, you would then recognize
> that the covariance is also just a mean, the mean cross product of
> deviations of X about its mean and deviations of Y about its mean,
> structurally the same as the univariate concept of variance, but in two
> dimensions rather than just one.  The same least squares criterion used to
> define the mean is used to define the regression line -- it minimizes the
> (error) sum of squared deviations (in the Y dimension) about it.  The
> univariate mean is really just our least squares predicted value for a score
> when the only information we have is that in the univariate distribution.
> 
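The "r is just a mean" definition can be checked numerically.  A minimal
sketch, assuming made-up paired scores and z scores built with the N
denominator standard deviation (so that the mean cross-product equals r
exactly); the function and variable names are hypothetical.

```python
from math import isclose, sqrt

# Made-up paired scores for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)


def mean(values):
    return sum(values) / len(values)


def z_scores(values):
    """Standardize using the N-denominator standard deviation."""
    m = mean(values)
    sd = sqrt(mean([(v - m) ** 2 for v in values]))
    return [(v - m) / sd for v in values]


zx, zy = z_scores(x), z_scores(y)

# Pearson r as the mean cross-product of z scores.
r = mean([a * b for a, b in zip(zx, zy)])

# Covariance as the mean cross-product of deviations about the two means.
mx, my = mean(x), mean(y)
covariance = mean([(a - mx) * (b - my) for a, b in zip(x, y)])

# Least squares slope for predicting zy from zx: it should equal r.
standardized_slope = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

print("r as mean cross-product of z scores:", r)
print("covariance:", covariance)
print("slope of the standardized regression line:", standardized_slope)
```

With these scores the covariance is 1.6, each variance is 2, and r comes out
at 0.8 by either route.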
> If our linear model is any good, it should account for some of the variance in
> the variables.  The sum of the squared deviations of the predicted scores
> about the mean of Y is used to measure that portion of the total
> variance, and represents the reduction in error due to adding the X variable
> to the model used to predict Y.  Divide that regression sum of squares by
> the total sum of squares for the predicted variable and you obtain
> r-squared, so now you have another way to interpret r -- squared, it is the
> proportion of the total variance in one variable "accounted for" by our
> model.
> 
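The partition of the total sum of squares can be sketched the same way.  The
data are made up, the names are mine, and the slope is taken as the
covariance divided by the variance of X (the usual least squares result,
which the note itself does not spell out).

```python
from math import isclose

# Made-up paired scores for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Least squares slope and intercept for predicting Y from X.
sp = sum((a - mx) * (b - my) for a, b in zip(x, y))  # sum of cross-products
ss_x = sum((a - mx) ** 2 for a in x)
slope = sp / ss_x
intercept = my - slope * mx
predicted = [intercept + slope * a for a in x]

# Partition the total sum of squares for Y.
ss_total = sum((b - my) ** 2 for b in y)
ss_regression = sum((p - my) ** 2 for p in predicted)  # predicted about mean of Y
ss_error = sum((b - p) ** 2 for b, p in zip(y, predicted))  # scores about the line

r_squared = ss_regression / ss_total

print("SS total:     ", ss_total)
print("SS regression:", ss_regression)
print("SS error:     ", ss_error)
print("r squared:    ", r_squared)
```

Here the regression and error sums of squares add up to the total, and the
ratio is the square of the r obtained from the z score definition.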
> If you have read Edwin Abbott's "Flatland," you might recognize that the
> same concept (a mean) which looked like a point in one-dimensional space now
> looks like a line in two-dimensional space.  Then you would be ready to leap
> into three-dimensional space and even beyond, into hyperspace, but you might
> want to sit down and have a good beer first.  I promise that we shall travel
> that space before the semester is out (as soon as we get started on multiple
> regression).
> 
>       So, to recap, starting with what might seem like a useless task of
> memorizing a couple of formulas for the arithmetic mean, we come to an
> understanding of several useful extensions of that concept, ending up in
> hyperspace with a good beer.  What more could you possibly expect from life
> than having a good beer in hyperspace?
> +++++++++++++++++++++++++++++++++++++++++
> Karl L. Wuensch, Department of Psychology,
> East Carolina University, Greenville NC  27858-4353
> Voice:  252-328-4102     Fax:  252-328-6283
> [EMAIL PROTECTED]
> http://core.ecu.edu/psyc/wuenschk/klw.htm
> 
> 
> =================================================================
> Instructions for joining and leaving this list and remarks about
> the problem of INAPPROPRIATE MESSAGES are available at
>                   http://jse.stat.ncsu.edu/
> =================================================================
> 

*******************************************************************
Michael M. Granaas
Associate Professor                    [EMAIL PROTECTED]
Department of Psychology
University of South Dakota             Phone: (605) 677-5295
Vermillion, SD  57069                  FAX:   (605) 677-6604
*******************************************************************
All views expressed are those of the author and do not necessarily
reflect those of the University of South Dakota, or the South
Dakota Board of Regents.


