William Valdar wrote:

From: Brian Ripley

As a first shot, use lm with a matrix response. That fits them all at once with one QR-decomposition. No analogue for glm or lmer, though, since for those the iterative fits run do depend on the response.


Thanks Brian, that's very helpful. Also thanks to Kevin Wright who suggested using lsfit(x,Y) as being faster than lm for a Y matrix.

I've since worked out that I can bypass even more lm machinery by basing my permutation test significance thresholds on the RSS from qr.resid().
Since,


    y = QRb + e
  Q'y = Rb + Q'e
  RSS = || Q'y - Rb ||

then I can do

  X.qr <- qr(X)

once, and for every instance of y calculate

  e   <- qr.resid(X.qr, y)
  rss <- e %*% e

recording them in

  rss.for.all.fits[i] <- rss

which gives me an empirical distribution of RSS scores. The degrees of freedom in my X matrix are constant throughout (I should have said that before), so all RSS's are on a level footing and map trivially to the p-value. I can therefore take the RSS at, say, the 5th percentile, turn it into a p-value and report that as my 5% threshold.

Actually you do not need to calculate the residuals to be able to calculate RSS. If you write Q = [Q1 Q2] where Q1 is the first p columns and Q2 is the remaining n - p columns and R1 for the first p rows of R then your expression for RSS can be extended as


  RSS = || Q'y - Rb || = || Q1'y - R1 b || + || Q2'y ||

because the last n - p rows of R are zero. At the least squares value of b the first term is zero when X has full column rank. Thus

  rss <- sum(qr.qty(X.qr, y)[-(1:p)]^2)

qr.qty should be slightly faster than qr.resid because it performs only one (virtual) multiplication by Q or Q'. I doubt that the difference would be noticeable in practice.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Reply via email to