> My question: what does it mean that an asymmetric distribution could 
> affect PCA, and that outliers could affect the factors?

It means what it says: PCA is affected by asymmetry, and outliers affect the 
principal components (sometimes loosely called 'factors'). In particular, an 
extreme outlying data point can cause at least one PC to be essentially 
parallel to the vector between the outlier and the mean of the rest of the 
data. If you want a picture of factors describing the bulk of the data set, 
you need to chuck out the extreme points or use robust PCA.
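
To make the outlier effect concrete, here is a rough sketch in plain Python 
(standard library only, not R). Everything in it is made up for illustration: 
the 'bulk' is a small grid stretched along x, the outlier sits at (-60, 80), 
and pc1_direction and abs_cosine are hypothetical helper names. It uses the 
closed-form major-axis angle of a 2x2 covariance matrix, which only works in 
two dimensions; it is not a general or robust PCA.

```python
import math

def pc1_direction(points):
    """Unit vector along the first principal component of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Major-axis angle of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))

def abs_cosine(u, v):
    """|cos| of the angle between two vectors; 1 means parallel."""
    dot = u[0] * v[0] + u[1] * v[1]
    return abs(dot) / (math.hypot(*u) * math.hypot(*v))

# Assumed toy data: a grid elongated along x, so the bulk's PC1 is the x axis.
bulk = [(float(x), float(y)) for x in range(-6, 7) for y in range(-2, 3)]
outlier = (-60.0, 80.0)  # one extreme point, well away from the bulk

pc1_bulk = pc1_direction(bulk)
pc1_all = pc1_direction(bulk + [outlier])

# Vector from the mean of the rest of the data to the outlier.
bulk_mean = (sum(x for x, _ in bulk) / len(bulk),
             sum(y for _, y in bulk) / len(bulk))
to_outlier = (outlier[0] - bulk_mean[0], outlier[1] - bulk_mean[1])

print("PC1, bulk only:   ", pc1_bulk)
print("PC1, with outlier:", pc1_all)
print("|cos| between PC1 and mean-to-outlier vector:",
      abs_cosine(pc1_all, to_outlier))
```

With the outlier included, PC1 swings away from the x axis and ends up within 
a few degrees of the mean-to-outlier vector; dropping the point gives back the 
axis that describes the bulk.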

Asymmetry I'd worry less about, at least for exploratory graphical 
presentation; if I had a nice spherical data set I'd probably not be very 
interested in the PCA, because it wouldn't have much discriminatory power for 
groups. But inference based on things like Mahalanobis distance often relies 
on some sense of multivariate normality or the like, and if the model used for 
inference isn't built on a symmetric data set, the inferences can be badly 
wrong. Think Turkish flag: the star is 'obviously' not part of the crescent, 
but in Mahalanobis distance it's not much further from the (empty) centre of 
the crescent than most of the crescent is.
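
A rough way to try the flag picture, again in plain Python rather than R. The 
geometry is my own approximation and not the official flag construction: the 
crescent is a unit disk with an offset disk of radius 0.8 removed, the star is 
stood in for by a small disk at an assumed position (1.6, 0), and points are 
taken on a regular grid. I also assume the mean and covariance are estimated 
from the whole flag (crescent plus star), which is one reading of 'the model 
used for inference'. The helper names are hypothetical.

```python
import math

def mean_cov(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis(p, centre, cov):
    """Mahalanobis distance of p from centre, via the 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - centre[0], p[1] - centre[1]
    return math.sqrt((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)

def in_crescent(x, y):
    # Assumed shape: unit disk minus a disk of radius 0.8 centred at (0.25, 0).
    return x * x + y * y <= 1.0 and (x - 0.25) ** 2 + y * y > 0.64

STAR = (1.6, 0.0)  # assumed star position, to the right of the crescent's opening

def in_star(x, y):
    # Stand-in for the star: a disk of radius 0.3 around STAR.
    return (x - STAR[0]) ** 2 + y * y <= 0.09

h = 0.02  # grid spacing
grid = [(i * h, j * h) for i in range(-50, 101) for j in range(-50, 51)]
crescent = [p for p in grid if in_crescent(*p)]
star = [p for p in grid if in_star(*p)]

# Fit one mean and covariance to the whole, very non-elliptical, data set.
centre, cov = mean_cov(crescent + star)
d_crescent = sorted(mahalanobis(p, centre, cov) for p in crescent)
d_star = mahalanobis(STAR, centre, cov)

# Radius containing 95% of a bivariate normal (chi-square, 2 d.f.).
cutoff = math.sqrt(-2.0 * math.log(0.05))

print("centre:", centre)
print("median crescent distance:", d_crescent[len(d_crescent) // 2])
print("largest crescent distance:", d_crescent[-1])
print("star distance:", d_star, " 95% cutoff:", cutoff)
```

On this made-up geometry the fitted centre falls in the empty hole of the 
crescent, and the star's distance comes out comparable to the larger crescent 
distances and under the usual normal-theory cutoff, so a distance-based rule 
would not separate it. The exact numbers depend on the shapes and positions 
assumed above.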


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help