Dear list members,

I am having trouble interpreting the coefficients from an lm model that includes the interaction of a numeric and a factor variable, compared with separate lm models fitted for each level of the factor.

## data:
y1 <- rnorm(20) + 6.8
y2 <- rnorm(20) + (1:20*1.7 + 1)
y3 <- rnorm(20) + (1:20*6.7 + 3.7)
y <- c(y1,y2,y3)
x <- rep(1:20,3)
f <- gl(3,20, labels=paste("lev", 1:3, sep=""))
d <- data.frame(x=x,y=y, f=f)

## plot
# xyplot(y~x|f)

## lm model with interaction
summary(lm(y~x:f, data=d))

Call:
lm(formula = y ~ x:f, data = d)

Residuals:
    Min      1Q  Median      3Q     Max
-2.8109 -0.8302  0.2542  0.6737  3.5383

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.68799    0.41045   8.985 1.91e-12 ***
x:flev1      0.20885    0.04145   5.039 5.21e-06 ***
x:flev2      1.49670    0.04145  36.109  < 2e-16 ***
x:flev3      6.70815    0.04145 161.838  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.53 on 56 degrees of freedom
Multiple R-Squared: 0.9984,     Adjusted R-squared: 0.9984
F-statistic: 1.191e+04 on 3 and 56 DF,  p-value: < 2.2e-16

## separate lm fits
lapply(by(d, d$f, function(x) lm(y ~ x, data=x)), coef)

$lev1
(Intercept)           x
 6.77022860 -0.01667528

$lev2
(Intercept)           x
   1.019078    1.691982

$lev3
(Intercept)           x
   3.274656    6.738396

Can anybody give me a hint why the coefficients for the slopes (especially for lev1) are so different and how the coefficients from the lm model with interaction are related to the separate fits?

Thanks,
Sven

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.