Re: [R] pca vs. pfa: dimension reduction

William Revelle Wed, 25 Mar 2009 18:00:05 -0700

Dear Sören, Mark, and Jon,

At 12:51 PM -0700 3/25/09, Mark Difford wrote:

Hi Sören,

(1) Is there an easy example, which explains the differences between pca and pfa? (2) Which R procedure should I use to get what I want?


There are a number of fundamental differences between PCA and FA (Factor
Analysis), which unfortunately are quite widely ignored. FA is explicitly
model-based, whereas PCA does not invoke an explicit model. FA is also
designed to detect structure, whereas PCA focuses on variance, to put things
simply. In more detail, the two methods "attack" the covariance matrix in
different ways: in PCA the focus of decomposition is on the diagonal
elements, whereas in FA the focus is on the off-diagonal elements.

This is nicely put. Less concisely, see pages 139-149 of my (under development) book on psychometric theory using R (http://personality-project.org/r/book/Chapter6.pdf).

In particular, on page 149:

"Although on the surface, the component model and factor model appear to be very similar (compare Tables 6.6 and 6.7), they are in fact very different. One example of this is when an additional variable is added to the correlation matrix (Table 6.8). In this case, two additional variables are added to the correlation matrix. The factor pattern does not change, but the component pattern does. Why is this? Because the components are aimed at accounting for all of the variance of the matrix, adding new variables increases the amount of variance to be explained and changes the previous estimates. But the common part of the variables (that which is estimated by factors) is not sensitive to the presence (or absence) of other variables. Although a fundamental difference between the two models, this problem of the additional variable is most obvious when there are not very many variables and becomes less of an empirical problem as the number of variables increases."
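The effect described in the quoted passage can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the correlation matrix is built from made-up one-factor loadings (0.8, 0.7, 0.6, 0.5, 0.4), not from the values in Tables 6.6-6.8, and the helper names pc1 and fa1 are hypothetical.

```r
# Made-up one-factor population; loadings are illustrative assumptions only
lambda5 <- c(0.8, 0.7, 0.6, 0.5, 0.4)
R5 <- tcrossprod(lambda5)      # implied correlations l_i * l_j
diag(R5) <- 1                  # unit variances on the diagonal
R3 <- R5[1:3, 1:3]             # the same matrix without the two extra variables

# Loadings on the first principal component:
# first eigenvector scaled by the square root of its eigenvalue
pc1 <- function(R) {
  e <- eigen(R, symmetric = TRUE)
  abs(e$vectors[, 1]) * sqrt(e$values[1])   # abs() only fixes the arbitrary sign
}

# Loadings of a one-factor maximum likelihood solution from factanal()
fa1 <- function(R) {
  fit <- factanal(covmat = R, factors = 1)
  abs(unclass(fit$loadings)[, 1])
}

# Compare the first three variables with and without the two added variables
round(rbind(factor_3vars    = fa1(R3),
            factor_5vars    = fa1(R5)[1:3],
            component_3vars = pc1(R3),
            component_5vars = pc1(R5)[1:3]), 2)
```

In this sketch the two factor rows should agree (up to optimizer tolerance), while the two component rows should not.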

Take a look at Prof. Revelle's psych package (function omega, etc.). Note also
that factanal has a rotation = "none" option.

Regards, Mark.


soeren.vogel wrote:
 Can't make sense of calculated results and hope I'll find help here.
I've collected answers from about 600 persons concerning three variables. I hypothesise those three variables to be components (or indicators) of one latent factor. In order to reduce data (vars), I had the following idea: Calculate the factor underlying these three vars. Use the loadings and the original var values to construct a new (artificial) var: (B1 * X1) + (B2 * X2) + (B3 * X3) = ArtVar (brackets for readability). Use ArtVar for further analysis of the data, that is, as predictor etc.

For 3 variables, there is only one factor possible, so rotation is not a problem. (For 1 factor, there are 3 unknown factor loadings and 3 known correlations. The model is just identified.)
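To make the "3 unknowns, 3 knowns" point concrete, here is a minimal sketch in R. Under a one-factor model each correlation is the product of two loadings (r_ij = l_i * l_j), so the three loadings can be solved for directly. The correlations below are invented for illustration and the variable names are hypothetical.

```r
# Hypothetical correlations among X1, X2, X3 (illustrative values only)
r12 <- 0.56
r13 <- 0.48
r23 <- 0.42

# One-factor model: r_ij = l_i * l_j for i != j.
# Three equations in three unknowns, solved in closed form.
l1 <- sqrt(r12 * r13 / r23)
l2 <- sqrt(r12 * r23 / r13)
l3 <- sqrt(r13 * r23 / r12)
loadings_1f <- c(X1 = l1, X2 = l2, X3 = l3)
loadings_1f

# Multiplying the loadings back together reproduces the three correlations,
# returned in the order r12, r13, r23
implied <- tcrossprod(loadings_1f)
implied[upper.tri(implied)]
```

This closed form needs the product of the three correlations to be positive, and it can return a loading above 1 for some correlation patterns; there are no degrees of freedom left over to test the fit.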

In my (I realise, elementary) psychological statistics readings I was taught to use pca for these problems. Referring to Venables & Ripley (2002, chapter 11), I applied "princomp" to my vars. But the outcome shows 4 components -- which is obviously not what I want. Reading further I found "factanal", which produces loadings on the one specified factor very fine. But since this is a contradiction to theoretical introductions in so many texts I'm completely confused whether I'm right with these calculations.

If you want to think of what these variables have in common, use factor analysis; if you want to summarize them all most efficiently with one composite, use principal components. These are very different models.
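As a rough sketch of the two routes to a single score per person, the following R code simulates data of the kind described (about 600 persons, three indicators of one latent factor) and computes both a factor score and a first principal component score. The simulated loadings, the seed, and all object names are assumptions made for illustration; they are not taken from Sören's data.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 600
f <- rnorm(n)                     # simulated latent factor (an assumption)
lambda <- c(0.8, 0.7, 0.6)        # illustrative loadings, not real estimates

# Each observed variable = loading * factor + unique part, with unit variance
x <- sapply(lambda, function(l) l * f + sqrt(1 - l^2) * rnorm(n))
colnames(x) <- c("X1", "X2", "X3")

# Factor analysis: one factor, regression-method factor scores
fa <- factanal(x, factors = 1, scores = "regression")
fa_score <- fa$scores[, 1]

# Principal components on the correlation matrix; keep the first component
pc <- princomp(x, cor = TRUE)
pc_score <- pc$scores[, 1]

ncol(pc$scores)                   # three variables give three components
abs(cor(fa_score, pc_score))      # how closely the two composites agree
```

Either score can then serve as the ArtVar-style predictor; which one is appropriate depends on whether the goal is the common part of the variables or the most efficient summary of all of them.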

As Mark said, the difference is that FA accounts for the covariances (the off-diagonal elements), which reflect what the variables have in common. PCA accounts for the entire matrix, which, in a 3x3 problem, is primarily the diagonal variances.
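One way to see the diagonal versus off-diagonal distinction is to compare what each one-dimensional solution leaves unexplained. The sketch below uses a made-up 3x3 correlation matrix (built from assumed loadings 0.8, 0.7, 0.6) and base R only; the object names are hypothetical.

```r
# Made-up 3x3 correlation matrix from assumed one-factor loadings
lambda <- c(0.8, 0.7, 0.6)
R <- tcrossprod(lambda)
diag(R) <- 1

# One-factor solution from factanal(), one-component solution from eigen()
fa_load <- abs(unclass(factanal(covmat = R, factors = 1)$loadings)[, 1])
e <- eigen(R, symmetric = TRUE)
pc_load <- abs(e$vectors[, 1]) * sqrt(e$values[1])

# Residuals: observed matrix minus the matrix implied by the loadings
fa_resid <- R - tcrossprod(fa_load)
pc_resid <- R - tcrossprod(pc_load)

off <- upper.tri(R)
round(rbind(fa_offdiag = fa_resid[off],    # the factor fits the correlations
            pc_offdiag = pc_resid[off],    # the component overshoots them
            fa_diag    = diag(fa_resid),   # uniquenesses left on the diagonal
            pc_diag    = diag(pc_resid)),  # less diagonal variance left over
      2)
```

In this example the factor solution leaves the off-diagonal residuals near zero and the uniquenesses on the diagonal, whereas the component solution trades some off-diagonal fit for a better account of the diagonal.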


Let me know if you need more information.

Bill

(1) Is there an easy example, which explains the differences between pca and pfa? (2) Which R procedure should I use to get what I want?

 Thank you for your help

 Sören


 Refs.:

Venables, W. N., and Ripley, B. D. (2002). Modern applied statistics with S (4th edition). New York: Springer.


 ______________________________________________
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




--
William Revelle         http://personality-project.org/revelle.html
Professor                       http://personality-project.org/personality.html
Department of Psychology             http://www.wcas.northwestern.edu/psych/
Northwestern University http://www.northwestern.edu/
Attend  ISSID/ARP:2009               http://issid.org/issid.2009/
