Try multiplying var(e) by n-1 and dividing by n-3 (97 is the number of degrees 
of freedom for the error sum of squares in your example: n = 100 observations 
minus 3 estimated coefficients). The point is that var() divides by n-1, while 
lm() estimates the residual variance by dividing the error sum of squares by 
n-3; since OLS residuals from a model with an intercept sum to zero, 
var(e)*(n-1) recovers exactly that error sum of squares. Then it all matches 
up nicely.
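
For example, with the e, z, and reg objects from your code below, something 
along these lines should reproduce lm()'s output (just a sketch; note also 
that the standard errors are the square roots of the diagonal elements of the 
variance/covariance matrix, so diag() of the matrix is cleaner than abs() of 
every entry, although the two agree on the diagonal):

	n <- 100                            ## number of observations
	p <- 3                              ## estimated coefficients: intercept, income, gender
	s2 <- var(e) * (n - 1) / (n - p)    ## residual variance on n - p = 97 df
	sqrt(diag(s2 * solve(t(z) %*% z)))  ## hand-computed standard errors
	summary(reg)$coefficients[, 2]      ## should match the lm() output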

Best,

-- 
Wolfgang Viechtbauer
 Department of Methodology and Statistics
 University of Maastricht, The Netherlands
 http://www.wvbauer.com/



----Original Message----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Daniel Malter
Sent: Tuesday, February 26, 2008 08:25
To: 'r-help'
Subject: [R] OLS standard errors

> Hi,
> 
> the standard errors of the coefficients in a regression that I
> computed both by hand and with lm() differ by about 1%. Can somebody
> help me identify the source of this difference? The coefficient
> estimates are the same, but the standard errors differ.
> 
> ####Simulate data
> 
>       happiness=numeric(100)          ##preallocate response vector
>       income=numeric(100)             ##preallocate predictor vector
>       gender=rep(c(0,1,1,0),25)       ##0/1 gender indicator
>       for(i in 1:100){
>               happiness[i]=1000+i+rnorm(1,0,40)
>               income[i]=2*i+rnorm(1,0,40)
>       }
> 
> ####Run lm()
> 
>       reg=lm(happiness~income+factor(gender))
>       summary(reg)
> 
> ####Compute coefficient estimates "by hand"
> 
>       x=cbind(income,gender)
>       y=happiness
> 
>       z=as.matrix(cbind(rep(1,100),x))
>       beta=solve(t(z)%*%z)%*%t(z)%*%y
> 
> ####Compare estimates
> 
>       cbind(reg$coef,beta)    ##fine so far, they both look the same
> 
>       reg$coef[1]-beta[1]
>       reg$coef[2]-beta[2]
>       reg$coef[3]-beta[3]     ##differences are too small to cause a 1% difference
> 
> ####Check predicted values
> 
>       estimates=c(beta[2],beta[3])
> 
>       predicted=as.vector(estimates%*%t(x)+beta[1])
> 
>       cbind(reg$fitted,predicted)             ##inspect fitted values
>       as.vector(reg$fitted-predicted) ##differences are marginal
> 
> #### Compute errors
> 
>       e=y-predicted           ##for "hand-computed" regression
>       e2=y-reg$fitted         ##for lm() regression
> 
> #### Compute standard error of the coefficients
> 
>       sqrt(abs(var(e)*solve(t(z)%*%z)))       ##for "hand-computed" regression
>       sqrt(abs(var(e2)*solve(t(z)%*%z)))      ##for lm() regression using e2 from above
> 
>       ##Both are the same
> 
> ####Compare to lm() standard errors of the coefficients again
> 
>       summary(reg)
> 
> 
> The square roots of the diagonal elements of the variance/covariance
> matrix should be the standard errors of the coefficients. Both hand
> computations give identical results, but they differ from the standard
> errors reported by summary(reg), and a difference of 1% seems
> nonmarginal. Should I have multiplied var(e)*solve(t(z)%*%z) by n and
> divided by n-1? Even if I do this (sketched below), I still observe a
> difference. Can anybody help me figure out where this difference comes
> from?
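> 
> ####The rescaling I tried (a sketch of what I mean; n = 100 in my simulation)
> 
>       sqrt(abs(var(e)*solve(t(z)%*%z)*100/(100-1)))   ##still differs from summary(reg)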
> 
> Cheers,
> Daniel
