On Aug 1, 2011, at 15:27, Samuel Le wrote:

> Hello,
>
> I was wondering if someone knows the formula used by the function lm to
> compute the t-values.
>
> I am trying to implement a linear regression myself. Assuming that I have K
> variables, and N observations, the formula I am using is:
>
> For the k-th variable, t-value = b_k/sigma_k
>
> With b_k is the coefficient for the k-th variable, and
> sigma_k = (t(x) x)^(-1)_kk is its standard deviation.
>
> I find sigma_k = sigma * n/(n*Sum x_{k,i}^2 - (sum x_{k,i}^2))
>
> With sigma: the estimated standard deviation of the residuals,
>
> Sigma = sqrt(1/(N-K-1)*Sum epsilon_i^2)
>
> With:
>
> N: number of observations
>
> K: number of variables
>
> This formula comes from my old course of econometrics.
>
> For some reason it doesn't match the t-value produced by R (I am off by about
> 1%). I can match the other results produced by R (coefficients of the
> regression, r squared, etc.).
>
> I would be grateful if someone could provide some clarifications.

AFAICT, your formula only holds for K=1. Otherwise, the formula for sigma_k involves matrix inversion.

Also, even for K=1, beware that textbook formulas like SSDx = SSx - (Sx)^2/n involve subtraction of nearly equal quantities and can easily lose multiple digits of precision, so software tends to use rather more careful algorithms.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
"Døden skal tape!" --- Nordahl Grieg
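
As an illustration of the two points in the reply, here is a minimal R sketch. It is not lm's actual code: lm works through a QR decomposition rather than inverting X'X. The data are simulated and the names x1, x2, y are hypothetical, with K = 2 correlated predictors. The standard error of the k-th coefficient is taken as sigma times the square root of the k-th diagonal element of (X'X)^(-1), where X includes the intercept column, and the resulting t-values are compared with those reported by summary(). The last few lines show the cancellation problem on a made-up vector whose exact sum of squared deviations is 10.

```r
# Simulated data (illustrative only): N observations, K = 2 predictors.
set.seed(1)
N  <- 50
x1 <- rnorm(N, mean = 10)
x2 <- 0.5 * x1 + rnorm(N)        # correlated with x1 on purpose
y  <- 1 + 2 * x1 - x2 + rnorm(N)

fit <- lm(y ~ x1 + x2)

# Design matrix, including the column of ones for the intercept.
X <- cbind(Intercept = 1, x1 = x1, x2 = x2)
K <- ncol(X) - 1

XtX_inv  <- solve(crossprod(X))                # (X'X)^(-1)
b        <- drop(XtX_inv %*% crossprod(X, y))  # coefficient estimates
res      <- drop(y - X %*% b)                  # residuals
sigma    <- sqrt(sum(res^2) / (N - K - 1))     # residual standard error
se       <- sigma * sqrt(diag(XtX_inv))        # sigma_k, one per coefficient
t_manual <- b / se

# t-values as reported by R, for comparison.
t_lm <- coef(summary(fit))[, "t value"]
print(cbind(t_manual, t_lm))

# Cancellation in the textbook shortcut SSDx = SSx - (Sx)^2/n.
x <- 1e8 + c(1, 2, 3, 4, 5)               # exact SSD is 10
n <- length(x)
ssd_textbook <- sum(x^2) - sum(x)^2 / n   # difference of two numbers near 5e16
ssd_centered <- sum((x - mean(x))^2)      # centre first, then square
print(c(textbook = ssd_textbook, centered = ssd_centered))
```

Note the square root and the use of the full inverse: with correlated predictors the diagonal of (X'X)^(-1) depends on every column of X, so a one-variable formula applied to each predictor separately can give slightly different standard errors.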