I am not sure that you really want to do a separate regression for each column of X against the same y. This does not make much sense.
Why do you think multiple linear regression is not possible just because X'X is not invertible? You have two main options here:

1. Obtain a minimum-norm solution using the SVD (also known as the Moore-Penrose inverse). This solution minimizes ||y - Xb|| and, among all such solutions, has the smallest ||b||.
2. Obtain a regularized solution, such as ridge regression, as Vito suggested.

You can do (1) as follows:

require(MASS)
soln <- ginv(X) %*% y

Here is an example:

X <- matrix(rnorm(1000), 10, 100)  # matrix with rank = 10
b <- rep(1, 100)
y <- crossprod(t(X), b)            # same as X %*% b
soln <- c(ginv(X) %*% y)           # this will not be close to b

Hope this helps,
Ravi.

-----------------------------------------------------------------------------------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html
-----------------------------------------------------------------------------------

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alex Roy
Sent: Tuesday, July 14, 2009 11:29 AM
To: Vito Muggeo (UniPa)
Cc: r-help@r-project.org
Subject: Re: [R] Linear Regression Problem

Dear Vito,

Thanks for your comments, but I want to do simple linear regression, not multiple linear regression. Multiple linear regression is not possible here, as the number of variables is much larger than the number of samples (X is ill-conditioned; the inverse of X^T X does not exist!). I just want to take one predictor variable at a time, regress y on it, and store the regression coefficient, p-value and R^2 value. The loop goes up to 40,000 predictors.
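A minimal sketch of Ravi's option (2), ridge regression, on made-up data (the dimensions and the penalty lambda = 1 are arbitrary choices for illustration, not recommendations; in practice lambda would be tuned, e.g. by cross-validation):

```r
set.seed(1)
n <- 10; p <- 100
X <- matrix(rnorm(n * p), n, p)   # p >> n, so X'X (p x p) is singular
y <- rnorm(n)
lambda <- 1                       # arbitrary ridge penalty for illustration

# Primal form: solve (X'X + lambda I) b = X'y  -- a p x p linear solve
b_primal <- solve(crossprod(X) + lambda * diag(p), crossprod(X, y))

# Dual form: b = X' (XX' + lambda I)^{-1} y  -- only an n x n solve,
# algebraically identical to the primal form
b_dual <- crossprod(X, solve(tcrossprod(X) + lambda * diag(n), y))

max(abs(b_primal - b_dual))       # agree up to rounding error
```

With n = 100 samples and p = 40,000 predictors as in the original post, the dual form is the practical one: it only ever inverts a 100 x 100 matrix.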
Alex

On Tue, Jul 14, 2009 at 5:18 PM, Vito Muggeo (UniPa) <vito.mug...@unipa.it> wrote:

> dear Alex,
> I think your problem with a large number of predictors and a relatively
> small number of subjects may be faced via some regularization approach
> (ridge or lasso regression..)
>
> hope this helps you,
> vito
>
> Alex Roy wrote:
>
>> Dear All,
>> I have a matrix, say X (100 x 40,000), and a vector, say y (100 x 1).
>> I want to perform linear regression. I have scaled the X matrix using
>> scale() to get mean zero and s.d. 1, but I still get very high values
>> for the regression coefficients. If I scale the X matrix, then the
>> regression coefficients should behave like correlation coefficients
>> and should not be larger than 1. Am I right? I do not know what is
>> going wrong.
>> Thanks for your help.
>> Alex
>>
>> *Code:*
>>
>> UniBeta <- sapply(1:dim(X)[2], function(k)
>>   summary(lm(y ~ X[, k]))$coefficients[2, 1])
>>
>> pval <- sapply(1:dim(X)[2], function(l)
>>   summary(lm(y ~ X[, l]))$coefficients[2, 4])
>
> --
> ====================================
> Vito M.R. Muggeo
> Dip.to Sc Statist e Matem `Vianelli'
> Università di Palermo
> viale delle Scienze, edificio 13
> 90128 Palermo - ITALY
> tel: 091 6626240
> fax: 091 485726/485612
> http://dssm.unipa.it/vmuggeo
> ====================================

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
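Alex's loop above fits each lm() twice (once for the coefficient, once for the p-value), i.e. 80,000 model fits for 40,000 predictors. For simple regression the slope, R^2 and p-value all have closed forms in terms of the correlation r = cor(x, y), so every column can be handled with vectorized arithmetic in one pass. A sketch on made-up data (dimensions shrunk from the post's 100 x 40,000 for speed):

```r
set.seed(1)
n <- 100; p <- 1000                # smaller p than the 40,000 in the post
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

# For y ~ x_k: slope = r * sd(y) / sd(x_k), R^2 = r^2,
# t = r * sqrt(n - 2) / sqrt(1 - r^2) on n - 2 degrees of freedom.
r     <- as.vector(cor(X, y))      # correlation of y with every column
slope <- r * sd(y) / apply(X, 2, sd)
R2    <- r^2
tstat <- r * sqrt(n - 2) / sqrt(1 - r^2)
pval  <- 2 * pt(-abs(tstat), df = n - 2)
```

As a sanity check, slope[1], pval[1] and R2[1] should match summary(lm(y ~ X[, 1]))$coefficients[2, c(1, 4)] and $r.squared for any single column.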