On Aug 1, 2011, at 15:27, Samuel Le wrote:
> Hello,
>
> I was wondering if someone knows the formula used by the function lm to
> compute the t-values.
>
> I am trying to implement a linear regression myself. Assuming that I have K
> variables, and N observations, the formula I am using is:
>
> For the k-th variable, t-value= b_k/sigma_k
>
> With b_k is the coefficient for the k-th variable, and sigma_k =(t(x) x
> )^(-1) _kk is its standard deviation.
>
> I find sigma_k = sigma * n/(n*Sum x_{k,i}^2 -(sum x_{k,i}^2))
>
> With sigma: the estimated standard deviation of the residuals,
>
> Sigma = sqrt(1/(N-K-1)*Sum epsilon_i^2)
>
> With:
>
> N: number of observations
>
> K: number of variables
>
> This formula comes from my old course of econometrics.
>
> For some reason it doesn't match the t-value produced by R (I am off by about
> 1%). I can match the other results produced by R (coefficients of the
> regression, r squared, etc.).
>
> I would be grateful if someone could provide some clarifications.

AFAICT, your formula only holds for K=1. Otherwise, the formula for sigma_k
involves matrix inversion: sigma_k is sigma times the square root of the k-th
diagonal element of (t(X) X)^(-1), where X includes the intercept column.
Also, even for K=1, beware that textbook formulas like
SSDx = SSx - (Sx)^2/n involve subtraction of nearly equal quantities and can
easily lose multiple digits of precision, so software tends to use rather more
careful algorithms.
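
To make both points concrete, here is a minimal sketch on simulated data. The variable names (x1, x2, x_big, and so on) are made up for illustration and are not from the original question. It computes the t-values from the matrix formula and compares them with those reported by summary(lm(...)), then contrasts the one-pass textbook formula for SSDx with the centered version. It solves the normal equations directly, which is fine for a small well-conditioned example but is not what lm() does internally (lm() works from a QR decomposition).

```r
# Simulated data; all names here are hypothetical, chosen for this sketch.
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)

fit <- lm(y ~ x1 + x2)

# Design matrix, including the intercept column.
X <- cbind(Intercept = 1, x1 = x1, x2 = x2)
K <- ncol(X) - 1                       # number of regressors, intercept excluded

# Coefficients via the normal equations (illustration only; lm() uses QR).
XtX_inv <- solve(crossprod(X))
b <- drop(XtX_inv %*% crossprod(X, y))

# Residual standard deviation on N - K - 1 degrees of freedom.
res   <- drop(y - X %*% b)
sigma <- sqrt(sum(res^2) / (n - K - 1))

# Standard errors: sigma * sqrt(diagonal of (X'X)^(-1)), then t = b / se.
se       <- sigma * sqrt(diag(XtX_inv))
t_manual <- b / se

t_lm <- coef(summary(fit))[, "t value"]
print(cbind(t_manual, t_lm))

# Cancellation in SSDx = SSx - (Sx)^2/n: large mean, small spread.
x_big        <- 1e8 + c(1, 2, 3, 4, 5)
ssd_textbook <- sum(x_big^2) - sum(x_big)^2 / length(x_big)
ssd_centered <- sum((x_big - mean(x_big))^2)   # two-pass, centered version
# The exact answer is 10; the textbook value is platform dependent.
print(c(textbook = ssd_textbook, centered = ssd_centered))
```

The two columns of t-values should agree to within rounding error, whereas the two SSDx values need not, because the squares in the textbook version are too large to be represented exactly in double precision.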
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [email protected] Priv: [email protected]
"Døden skal tape!" --- Nordahl Grieg
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help