I applaud your courage, Dr. Hammer.  I hope everyone appreciates how intimidating this 
list of experts can be. 

I also agree with your point that PCA can be used when the data are not multivariate 
normal if you are just using it to visualize information, or if you just know what it 
is doing for that matter.  I am a fan of using any and all analyses that help in 
figuring out what is happening.  However, in order to understand the results and what 
you are visualizing you have to understand both the data input and what the 
statistical analysis is doing.  Sometimes the information that seems to be revealed is 
an artifact of violated assumptions, and if the observer doesn't realize this, 
it is very easy to come to the wrong conclusion.

I thought "what was the analysis doing" and "how to interpret it" were the original 
questions we were discussing, although I admit to reading the e-mails quickly.  The 
original e-mail indicated that perhaps size and shape confounding was causing their 
odd-looking results.  If the shapes are the same but the sizes are different, then the 
source of the non-normality would be multiple modes only.  This may not be a serious 
enough violation to cause interpretability problems.  However, it sounded to me from 
the description of the problem and the results that, in addition to multiple modes, 
there are multiple variance/covariance matrices.  That was making it difficult to 
interpret the results, and since PCA is based upon the variance/covariance matrix, 
pooling such groups will result in difficult-to-interpret or even invalid components.  
Separating the analysis into subgroups will allow them to visualize and test the 
differences in the modes and in the variance/covariance matrices, and in that way 
understand the source of the differences in the groups.
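
To make the point about multiple variance/covariance matrices concrete, here is a 
rough sketch in Python with NumPy.  The two groups, their means and their covariance 
matrices are made-up simulated values chosen purely for illustration; they are not 
the data from the original question.  The helper names first_pc and angle_deg are 
also just my own.  The sketch compares the first principal axis of each group with 
the first principal axis of the pooled data.

```python
import numpy as np

rng = np.random.default_rng(0)


def first_pc(x):
    """Return the unit-length first principal axis of the rows of x."""
    cov = np.cov(x, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return eigvecs[:, -1]


def angle_deg(u, v):
    """Angle between two axes in degrees, ignoring the arbitrary sign of each axis."""
    c = abs(float(u @ v))
    return float(np.degrees(np.arccos(min(c, 1.0))))


# Hypothetical groups: different means AND different covariance matrices.
cov_a = np.array([[4.0, 0.0], [0.0, 0.25]])  # group A varies mostly along variable 1
cov_b = np.array([[0.25, 0.0], [0.0, 4.0]])  # group B varies mostly along variable 2
group_a = rng.multivariate_normal([0.0, 0.0], cov_a, size=500)
group_b = rng.multivariate_normal([6.0, 6.0], cov_b, size=500)
pooled = np.vstack([group_a, group_b])

pc_a = first_pc(group_a)
pc_b = first_pc(group_b)
pc_pooled = first_pc(pooled)

angle_a_b = angle_deg(pc_a, pc_b)
angle_pooled_a = angle_deg(pc_pooled, pc_a)
angle_pooled_b = angle_deg(pc_pooled, pc_b)

print("angle between group A and group B first axes:", round(angle_a_b, 1))
print("angle between pooled and group A first axes: ", round(angle_pooled_a, 1))
print("angle between pooled and group B first axes: ", round(angle_pooled_b, 1))
```

In this contrived setup the pooled first axis mostly follows the line joining the two 
group means, so it matches neither group's own major axis.  Running the analysis 
within each subgroup is what shows that the two covariance structures differ.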

Maybe the "common PCA" analysis someone else mentioned might do this as well.  I am 
not familiar with that method.

Thanx all again for your attention and patience,
Kath



Kathleen M. Robinette, Ph.D.
Principal Research Anthropologist
Air Force Research Laboratory
AFRL/HEPA
2800 Q Street
Wright-Patterson AFB, OH 45433-7947
(937) 255-8810
DSN 785-8810
FAX (937) 255-8752
e-mail:[EMAIL PROTECTED] 

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, May 19, 2004 9:29 AM
To: [EMAIL PROTECTED]
Subject: Re: size correction & discriminant functions analyses


Just a comment on this one, from a pragmatic point of view.

It is of course true that PCA is only *guaranteed* to produce components maximizing 
variance if you have multivariate normality. The theory of PCA is based on this 
assumption. But in many 
cases, PCA is used purely as a visualization device, projecting a multivariate data 
set onto a sheet of paper so we can see it. For visualization of non-normal data, one 
could play around with different techniques, such as PCA, PCO, NMDS, projection 
pursuit etc., and then find that PCA does (or does not) perform well for the given 
data set. There is no law against making any linear combination you want of your 
variates, if it reveals information. For example, PCA may be perfectly adequate for 
resolving two well-separated groups, if the within-group variance is relatively small.
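
As a small illustration of that last case, here is a sketch in Python with NumPy. 
The data are simulated with made-up numbers (two groups of skewed, clearly 
non-normal exponential variates in five dimensions, one shifted by 5 units on every 
variate), so it only demonstrates the idea and says nothing about any real data set. 
It projects the pooled data onto the first principal component and checks whether 
the two groups' score ranges overlap.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 5  # hypothetical sample size per group and number of variates

# Skewed (exponential) within-group variation, small relative to the group shift.
group_1 = rng.exponential(scale=0.5, size=(n, p))
group_2 = rng.exponential(scale=0.5, size=(n, p)) + 5.0
data = np.vstack([group_1, group_2])
labels = np.repeat([0, 1], n)

# PCA via SVD of the column-centered data; rows of vt are the principal axes.
centered = data - data.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ vt[0]  # scores on the first principal component

s1 = scores[labels == 0]
s2 = scores[labels == 1]

# Positive gap means the two groups' PC1 score ranges do not overlap.
gap = max(s1.min(), s2.min()) - min(s1.max(), s2.max())
print("gap between the groups on PC1:", round(float(gap), 2))
```

With these settings the between-group shift dominates the total variance, so the 
first component lines up with the direction separating the groups even though 
neither group is anywhere near normal.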

Of course, when using PCA for non-normal data one must be a little careful and not 
over-interpret the results (especially not the component loadings), but I think it's 
too harsh to dismiss its use totally.

I'm sure the hard-liners will flame me to pieces for this email, but I hope they will 
at least give me credit for my courage  :-)


Dr. Oyvind Hammer
Geological Museum
University of Oslo



> PCA Analysis assumes multivariate normality.
>
> Kathleen M. Robinette, Ph.D.
> Principal Research Anthropologist
> Air Force Research Laboratory



==
Replies will be sent to list.
For more information see http://life.bio.sunysb.edu/morph/morphmet.html.


