To all ..., Bill's "lateral" wisdom is almost certainly a better solution. So thanks for the advice (and everything else that went before it [Bill: apropos of termplot, what happened to tplot ?]). And I will [almost] desist from asking the obvious: and if there were 10 000 observations ?
BestR,
Mark.


Bill.Venables wrote:
>
> ...but with 500 variables and only 20 'entities' (observations) you will
> have 481 PCs with dead zero eigenvalues. How small is 'smaller' and how
> many is "a few"?
>
> Everyone who has responded to this seems to accept the idea that PCA is
> the way to go here, but that is not clear to me at all. There is a
> 2-sample structure in the 20 observations that you have. If you simply
> ignore that in doing your PCA you are making strong assumptions about
> sampling that would seem to me unlikely to be met. If you allow for the
> structure and project orthogonal to it then you are probably throwing
> the baby out with the bathwater - you want to choose variables which
> maximise separation between the 2 samples (and now you are up to 482
> zero principal variances, if that matters...).
>
> I think this problem probably needs a bit of a re-think. Some variant
> on singular LDA, for example, may be a more useful way to think about
> it.
>
> Bill Venables.
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Ravi Varadhan
> Sent: Monday, 2 July 2007 1:29 PM
> To: 'Patrick Connolly'
> Cc: r-help@stat.math.ethz.ch; 'Mark Difford'
> Subject: Re: [R] Question about PCA with prcomp
>
> The PCs that are associated with the smaller eigenvalues.
>
> ------------------------------------------------------------------------
>
> Ravi Varadhan, Ph.D.
>
> Assistant Professor, The Center on Aging and Health
>
> Division of Geriatric Medicine and Gerontology
>
> Johns Hopkins University
>
> Ph: (410) 502-2619
>
> Fax: (410) 614-9625
>
> Email: [EMAIL PROTECTED]
>
> Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
>
> ------------------------------------------------------------------------
>
> -----Original Message-----
> From: Patrick Connolly [mailto:[EMAIL PROTECTED]
> Sent: Monday, July 02, 2007 4:23 PM
> To: Ravi Varadhan
> Cc: 'Mark Difford'; r-help@stat.math.ethz.ch
> Subject: Re: [R] Question about PCA with prcomp
>
> On Mon, 02-Jul-2007 at 03:16PM -0400, Ravi Varadhan wrote:
>
> |> Mark,
> |>
> |> What you are referring to deals with the selection of covariates,
> |> since PC doesn't do dimensionality reduction in the sense of
> |> covariate selection. But what Mark is asking for is to identify how
> |> much each data point contributes to individual PCs. I don't think
> |> that Mark's query makes much sense, unless he meant to ask: which
> |> individuals have high/low scores on PC1/PC2. Here are some comments
> |> that may be tangentially related to Mark's question:
> |>
> |> 1. If one is worried about a few data points contributing heavily to
> |>    the estimation of PCs, then one can use robust PCA, for example,
> |>    using robust covariance matrices. MASS has some tools for this.
> |> 2. The "biplot" for the first 2 PCs can give some insights
> |> 3. PCs, especially, the last few PCs, can be used to identify
> |>    "outliers".
>
> What is meant by "last few PCs"?
>
> --
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>    ___    Patrick Connolly
>  {~._.~}          Great minds discuss ideas
>  _( Y )_          Middle minds discuss events
> (:_~*~_:)         Small minds discuss people
>  (_)-(_)                            ..... Anon
>
> ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
--
View this message in context: http://www.nabble.com/Question-about-PCA-with-prcomp-tf4012919.html#a11402204
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
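
The eigenvalue counts in Bill Venables's message follow from a rank argument. Centring 20 observations on their overall mean leaves a data matrix of rank at most 19, so at most 19 of the 500 principal variances can be nonzero (481 zeros). Removing the two sample means as well drops the rank to at most 18 (482 zeros). What follows is only a minimal sketch of that check, written in plain Python rather than R so that it needs no packages. The simulated Gaussian data, the 10/10 split between the two samples, the seed, and the helper names matrix_rank and centre_rows are all assumptions made for illustration; none of them comes from the thread.

```python
import random


def matrix_rank(rows, tol=1e-9):
    """Numerical rank of a matrix (list of row lists), found by
    Gaussian elimination with partial pivoting."""
    m = [row[:] for row in rows]          # work on a copy
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Largest remaining entry in this column becomes the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue                      # column is numerically zero
        m[rank], m[pivot] = m[pivot], m[rank]
        pivot_row = m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / pivot_row[col]
            m[r] = [a - factor * b for a, b in zip(m[r], pivot_row)]
        rank += 1
    return rank


def centre_rows(rows, groups):
    """Subtract from each row the column means of the group it belongs to."""
    n_cols = len(rows[0])
    centred = [None] * len(rows)
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        means = [sum(rows[i][j] for i in idx) / len(idx) for j in range(n_cols)]
        for i in idx:
            centred[i] = [x - mu for x, mu in zip(rows[i], means)]
    return centred


random.seed(1)                            # arbitrary seed (assumption)
n_obs, n_vars = 20, 500                   # sizes quoted in the thread
data = [[random.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_obs)]

one_group = [0] * n_obs                   # ordinary PCA: centre on overall mean
two_sample = [0] * 10 + [1] * 10          # hypothetical 10/10 split

rank_pca = matrix_rank(centre_rows(data, one_group))
rank_within = matrix_rank(centre_rows(data, two_sample))

zero_pca = n_vars - rank_pca              # zero eigenvalues, ordinary PCA
zero_within = n_vars - rank_within        # zero eigenvalues after removing group means
print(zero_pca, zero_within)
```

For data in general position this prints 481 482, matching the figures in the quoted message. More generally, the rank of the centred matrix is bounded by min(n_obs - 1, n_vars), so the number of forced zero eigenvalues depends only on the shape of the data.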