Re: [R] influence.measures(stats): hatvalues(model, ...)

2009-11-08 Thread Viechtbauer Wolfgang (STAT)
Not sure what you mean.

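# Small example data
yi <- c(2, 3, 2, 4, 3, 6)
xi <- c(1, 4, 3, 2, 4, 5)

res <- lm(yi ~ xi)
hatvalues(res)      # leverages as reported by R

X <- cbind(1, xi)   # design matrix: intercept column plus the predictor
diag(X %*% solve(t(X) %*% X) %*% t(X))   # diagonal of X(X'X)^(-1)X'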

Same result.
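
The agreement is not specific to this data set. As a minimal sketch, assuming the same yi and xi as above: if X = QR is the thin QR decomposition, the hat matrix is QQ', so its diagonal is the row sums of squares of Q. The names h_lm, h_explicit and h_qr are illustrative only.

```r
yi <- c(2, 3, 2, 4, 3, 6)
xi <- c(1, 4, 3, 2, 4, 5)

res <- lm(yi ~ xi)
X <- cbind(1, xi)

# Leverages from the fitted model (names dropped so the vectors compare cleanly)
h_lm <- unname(hatvalues(res))

# Explicit diagonal of X (X'X)^(-1) X'
h_explicit <- unname(diag(X %*% solve(crossprod(X)) %*% t(X)))

# Same diagonal via the QR decomposition: H = Q Q'
Q <- qr.Q(qr(X))
h_qr <- rowSums(Q^2)

all.equal(h_lm, h_explicit)
all.equal(h_lm, h_qr)

# The leverages sum to the number of estimated coefficients (2 here)
sum(h_lm)
```

The QR route avoids forming and inverting X'X, which is usually the numerically safer choice.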

Best,

--
Wolfgang Viechtbauer                        http://www.wvbauer.com/
Department of Methodology and Statistics    Tel: +31 (0)43 388-2277
School for Public Health and Primary Care   Office Location:
Maastricht University, P.O. Box 616         Room B2.01 (second floor)
6200 MD Maastricht, The Netherlands         Debyeplein 1 (Randwyck)

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Sigmund Freud [ss_freud...@yahoo.com]
Sent: Sunday, November 08, 2009 8:14 AM
To: r-help@r-project.org
Subject: [R]  influence.measures(stats): hatvalues(model, ...)

Hello:

I am trying to understand the method 'hatvalues(...)', which returns something 
similar to the diagonals of the plain vanilla hat matrix [X(X'X)^(-1)X'], but 
not quite.

A Fortran programmer I am not, but tracing through the code it looks like 
perhaps some sort of correction based on the notion of 'leave-one-out' variance 
is being applied.

Whatever the difference, in simulations 'hatvalues' appears to perform much 
better in the context of identifying outliers using Cook's Distance than the
diagonals of the plain vanilla hat matrix. (As in 
http://en.wikipedia.org/wiki/Cook's_distance).

Would prefer to understand a little more when using this method. I have 
downloaded the freely available references cited in the help and am in the 
process of digesting them. If someone with knowledge could offer a pointer on 
the most efficient way to get at why 'hatvalues' does what it does, that would 
be great.

Thanks,
Jean Yarrington
Independent consultant.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] influence.measures(stats): hatvalues(model, ...)

2009-11-08 Thread Peter Ehlers


Sigmund Freud wrote:

Hello:

I am trying to understand the method 'hatvalues(...)', which returns something similar to the diagonals of the plain vanilla hat matrix [X(X'X)^(-1)X'], but not quite. 

A Fortran programmer I am not, but tracing through the code it looks like perhaps some sort of correction based on the notion of 'leave-one-out' variance is being applied. 



I can't see what the problem is. Using the LifeCycleSavings
example from ?influence.measures:

  lm.SR <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
  X <- model.matrix(lm.SR)
  H <- X %*% solve(t(X) %*% X) %*% t(X)
  hats1 <- diag(H)
  hats2 <- hatvalues(lm.SR)
  all.equal(hats1, hats2)
  #[1] TRUE
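
Since the question concerns Cook's distance and a possible leave-one-out correction, the same check can be extended as a sketch. Assuming the textbook formula D_i = e_i^2 h_i / (p s^2 (1 - h_i)^2), cooks.distance() can be reproduced from hatvalues(). The leave-one-out residual standard deviation is a separate quantity, returned by lm.influence() as its sigma component, and it does not change the hat values. The names cook_manual and sigma_loo are illustrative only.

```r
lm.SR <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

h  <- hatvalues(lm.SR)   # leverages
e  <- residuals(lm.SR)   # ordinary residuals
n  <- length(e)
p  <- lm.SR$rank         # number of estimated coefficients
s2 <- sum(e^2) / (n - p) # residual variance estimate

# Cook's distance built by hand from residuals and leverages
cook_manual <- e^2 * h / (p * s2 * (1 - h)^2)
all.equal(cook_manual, cooks.distance(lm.SR))

# Leave-one-out residual SD: drop case i and re-estimate sigma (closed form)
sigma_loo <- sqrt(((n - p) * s2 - e^2 / (1 - h)) / (n - p - 1))
all.equal(sigma_loo, lm.influence(lm.SR)$sigma)
```

If a simulation gives different Cook's distances for the two sets of hat values, it may be worth checking whether they were combined with different residuals or variance estimates.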


Whatever the difference, in simulations 'hatvalues' appears to perform much 
better in the context of identifying outliers using Cook's Distance than the
diagonals of the plain vanilla hat matrix. (As in 
http://en.wikipedia.org/wiki/Cook's_distance).

Would prefer to understand a little more when using this method. I have 
downloaded the freely available references cited in the help and am in the 
process of digesting them. If someone with knowledge could offer a pointer on 
the most efficient way to get at why 'hatvalues' does what it does, that would 
be great.


In a nutshell, hatvalues are a measure of how unusual
a point is in predictor space, i.e. to what extent it
"sticks out" in one or more of the X-dimensions.
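
To make "unusual in predictor space" concrete, here is a small sketch for the one-predictor case, where the leverage has the closed form h_i = 1/n + (x_i - mean(x))^2 / sum((x - mean(x))^2). The data are the toy values used earlier in this thread; dev2 and h_formula are illustrative names.

```r
yi <- c(2, 3, 2, 4, 3, 6)
xi <- c(1, 4, 3, 2, 4, 5)

# Squared distance of each x from the mean of x
dev2 <- (xi - mean(xi))^2

# Closed-form leverage for simple linear regression with an intercept
h_formula <- 1 / length(xi) + dev2 / sum(dev2)

# Matches hatvalues(); note that yi plays no role in the formula
all.equal(unname(hatvalues(lm(yi ~ xi))), h_formula)

# The x value farthest from mean(xi) carries the largest leverage
xi[which.max(h_formula)]
```

Leverage depends only on the predictors, so a point can have a high hat value and still lie close to the fitted line.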

 -Peter Ehlers


Thanks,
Jean Yarrington
Independent consultant.



  


[R] influence.measures(stats): hatvalues(model, ...)

2009-11-07 Thread Sigmund Freud
Hello:

I am trying to understand the method 'hatvalues(...)', which returns something 
similar to the diagonals of the plain vanilla hat matrix [X(X'X)^(-1)X'], but 
not quite. 

A Fortran programmer I am not, but tracing through the code it looks like 
perhaps some sort of correction based on the notion of 'leave-one-out' variance 
is being applied. 

Whatever the difference, in simulations 'hatvalues' appears to perform much 
better in the context of identifying outliers using Cook's Distance than the
diagonals of the plain vanilla hat matrix. (As in 
http://en.wikipedia.org/wiki/Cook's_distance).

Would prefer to understand a little more when using this method. I have 
downloaded the freely available references cited in the help and am in the 
process of digesting them. If someone with knowledge could offer a pointer on 
the most efficient way to get at why 'hatvalues' does what it does, that would 
be great.

Thanks,
Jean Yarrington
Independent consultant.



  