Dear Kevin,
If you do the same regression as in the text then you'll get the same
hat-values; the regression is the one on the top of p. 244:
mod - lm(repwt ~ weight*sex, data=Davis)
max(hatvalues(mod))
[1] 0.7141856
As to making sense of the computations:
X - model.matrix(mod)
head(X)
(Intercept) weight sexM weight:sexM
1 1 771 77
2 1 580 0
3 1 530 0
4 1 681 68
5 1 590 0
6 1 761 76
H - X %*% solve( t(X) %*% X ) %*% t(X)
h - diag(H)
max(h)
[1] 0.7141856
I hope this helps,
John
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
Behalf Of rkevinbur...@charter.net
Sent: March-05-09 11:40 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] hatvalues?
I am struiggling a bit with this function 'hatvalues'. I would like a
little
more undrestanding than taking the black-box and using the values. I
looked
at the Fortran source and it is quite opaque to me. So I am asking for
some
help in understanding the theory. First, I take the simplest case of a
single
variant. For this I turn o John Fox's book, Applied Regression Analysis
and
Generalized Linear Models, p 245 and generate this 'R' code:
library(car)
attach(Davis)
# remove the NA's
narepwt - repwt[!is.na(repwt)]
meanrw - mean(narepwt)
drw - narepwt - meanrw
ssrw - sum(drw * drw)
h - 1/length(narepwt) + (drw * drw)/ssrw
h
This gives me a array of values the largest of which is
order(h, decreasing=TRUE)
[1] 21 52 17 93 30 62 158 113 175 131 182 29 106 125 123 146 91
99
So the largest hatvalue is
h[21]
[1] 0.1041207
Which doesn't match the 0.714 value that is reported in the book but I
will
probably take that up with the author later.
Then I use more of 'R' and I get
fit - lm(weight ~ repwt)
hr - hatvalues(fit)
hr[21]
21
0.1041207
So this matches which is reasusing. My question is this, given the QR
transformation and the residuals derived from that transformation what is
a
simple matrix formula for the hatvalues?
From http://en.wikipedia.org/wiki/Linear_regression I get
residuals = y - Hy = y(I - H)
or
H = -(residuals/y - I)
fit - lm(weight ~ repwt)
h - -(residuals(fit)/weight[as.numeric(names(residuals(fit)))] -
diag(1,length(residuals(fit)), length(residuals(fit
This generates a matrix but I cannot see any coerrelation between this
hat-
matrix and the return from hatvalues.
Comments?
Thank you.
Kevin
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.