I applaud your courage, Dr. Hammer. I hope everyone appreciates how intimidating this list of experts can be.
I also agree with your point that PCA can be used when the data are not multivariate normal if you are just using it to visualize information, or if you just know what it is doing for that matter. I am a fan of using any and all analyses that help in figuring out what is happening. However, in order to understand the results and what you are visualizing, you have to understand both the data input and what the statistical analysis is doing. Sometimes the information that seems to be revealed is an artifact of violating the assumptions, and if the observer doesn't realize this it is very easy to come to the wrong conclusion. I thought "what is the analysis doing" and "how to interpret it" were the original questions we were discussing, although I admit to reading the e-mails quickly.

The original e-mail indicated that perhaps size and shape confounding was causing their odd-looking results. If the shapes are the same but the sizes are different, then the only source of non-normality would be multiple modes. This may not be a serious enough violation to cause interpretability problems. However, it sounded to me from the description of the problem and the results that, in addition to multiple modes, there are multiple variance/covariance matrices. That was making it difficult to interpret the results, and since PCA is based upon the variance/covariance matrix, this will result in difficult-to-interpret or even invalid components.

Separating the analysis into subgroups will allow them to visualize and test the differences in the modes and in the variance/covariance matrices, and in that way understand the source of the differences in the groups. Maybe the "common PCA" analysis someone else mentioned might do this as well. I am not familiar with that method.

Thanx all again for your attention and patience,
Kath

Kathleen M. Robinette, Ph.D.
Principal Research Anthropologist
Air Force Research Laboratory
AFRL/HEPA
2800 Q Street
Wright-Patterson AFB, OH 45433-7947
(937) 255-8810
DSN 785-8810
FAX (937) 255-8752
e-mail: [EMAIL PROTECTED]

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, May 19, 2004 9:29 AM
To: [EMAIL PROTECTED]
Subject: Re: size correction & discriminant functions analyses

Just a comment on this one, from a pragmatic point of view. It is of course true that PCA is only *guaranteed* to produce components maximizing variance if you have multivariate normality. The theory of PCA is based on this assumption. But in many cases, PCA is used purely as a visualization device, projecting a multivariate data set onto a sheet of paper so we can see it.

For visualization of non-normal data, one could play around with different techniques, such as PCA, PCO, NMDS, projection pursuit etc., and then find that PCA does (or does not) perform well for the given data set. There is no law against making any linear combination you want of your variates, if it reveals information. For example, PCA may be perfectly adequate for resolving two well-separated groups, if the within-group variance is relatively small.

Of course, when using PCA for non-normal data one must be a little careful and not over-interpret the results (especially not the component loadings), but I think it's too harsh to dismiss its use totally.

I'm sure the hard-liners will flame me to pieces for this email, but I hope they will at least give me credit for my courage :-)

Dr. Oyvind Hammer
Geological Museum
University of Oslo

> PCA Analysis assumes multivariate normality.
>
> Kathleen M. Robinette, Ph.D.
> Principal Research Anthropologist
> Air Force Research Laboratory

==
Replies will be sent to list.
For more information see http://life.bio.sunysb.edu/morph/morphmet.html.
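
The point made above about multiple modes combined with multiple variance/covariance matrices can be illustrated with a small simulation. What follows is only a minimal sketch in Python with NumPy, not anything taken from the data discussed in this thread: the two groups, their means and covariance matrices, and the helper names first_pc and angle_deg are all hypothetical. PCA is done here simply as an eigen-decomposition of the sample covariance matrix. The sketch compares the first principal axis of the pooled sample with the first principal axis of each group analysed separately.

```python
import numpy as np

rng = np.random.default_rng(0)


def first_pc(x):
    """Return the unit-length first principal axis of the rows of x."""
    cov = np.cov(x, rowvar=False)           # sample variance/covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return eigvecs[:, -1]                   # axis with the largest variance


def angle_deg(u, v):
    """Angle in degrees between two axes, ignoring their sign."""
    c = abs(float(u @ v))
    return float(np.degrees(np.arccos(min(c, 1.0))))


# Hypothetical data: two groups measured on two variables.
# The groups differ in location (two modes) AND in covariance structure.
cov_a = np.array([[4.0, 0.0], [0.0, 0.25]])  # group A varies mostly along x
cov_b = np.array([[0.25, 0.0], [0.0, 4.0]])  # group B varies mostly along y
group_a = rng.multivariate_normal([0.0, 0.0], cov_a, size=500)
group_b = rng.multivariate_normal([6.0, 6.0], cov_b, size=500)
pooled = np.vstack([group_a, group_b])

pc_pooled = first_pc(pooled)
pc_a = first_pc(group_a)
pc_b = first_pc(group_b)

print("pooled PC1 vs group A PC1:", round(angle_deg(pc_pooled, pc_a), 1), "degrees")
print("pooled PC1 vs group B PC1:", round(angle_deg(pc_pooled, pc_b), 1), "degrees")
print("group A PC1 vs group B PC1:", round(angle_deg(pc_a, pc_b), 1), "degrees")
```

In this made-up setting the pooled first component is dominated by the difference between the group means, so it lines up with neither group's own main axis of variation. That is why the separate subgroup analyses are informative here, while the pooled plot can still be a perfectly adequate picture of two well-separated groups.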