Joao,

Your intuition is correct: the intercept represents the predicted value for wool A and tension L. Where you're tripping up is in working out that predicted value. In the model that you fit, the predicted value for wool A and tension L is not simply the mean of the observations with wool A and tension L, because the model contains only main effects (no interaction), so the fitted values are constrained to an additive pattern across wool and tension. Try this:

attach(warpbreaks)
# observed mean of breaks in each wool x tension cell
tapply(breaks, list(wool, tension), mean)
# main effects only: fitted cell values differ from the observed cell means
fit <- lm(breaks ~ wool + tension, data=warpbreaks)
tapply(fitted(fit), list(wool, tension), mean)
# with the interaction: fitted cell values equal the observed cell means
fit2 <- lm(breaks ~ wool*tension, data=warpbreaks)
tapply(fitted(fit2), list(wool, tension), mean)

I believe that your results will depend on how the "contrasts" option is set in R. For me it's like this:

options("contrasts")
$contrasts
        unordered           ordered
"contr.treatment"      "contr.poly"

Jean


Joao Azevedo <joao.c.azev...@gmail.com> wrote on 07/27/2012 07:16:10 AM:

>
> Hi!
>
> Thanks for the link. I've already stumbled upon that explanation. I'm
> able to understand how the coding schemes are applied in the supplied
> examples, but they only use a single explanatory variable. My problem
> is with understanding the model when there are multiple categorical
> explanatory variables.
>
> --
> Joao.
>
> On Fri, Jul 27, 2012 at 1:04 PM, Jean V Adams <jvad...@usgs.gov> wrote:
> > Joao,
> >
> > There's a very thorough explanation at
> > http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm
> >
> > Jean
> >
> >
> > Joao Azevedo <joao.c.azev...@gmail.com> wrote on 07/27/2012 06:32:31 AM:
> >
> >>
> >> Hi!
> >>
> >> I'm failing to understand the value of the intercept value in a
> >> multiple linear regression with categorical values. Taking the
> >> "warpbreaks" data set as an example, when I do:
> >>
> >> > lm(breaks ~ wool, data=warpbreaks)
> >>
> >> Call:
> >> lm(formula = breaks ~ wool, data = warpbreaks)
> >>
> >> Coefficients:
> >> (Intercept)        woolB
> >>      31.037       -5.778
> >>
> >> I'm able to understand that the value of intercept is the mean value
> >> of breaks when wool equals "A", and that adding up the "woolB"
> >> coefficient to the intercept value I get the mean value of breaks when
> >> wool equals "B".
> >> However, if I also consider the tension variable in
> >> the model, I'm unable to figure out the meaning of the intercept
> >> value:
> >>
> >> > lm(breaks ~ wool + tension, data=warpbreaks)
> >>
> >> Call:
> >> lm(formula = breaks ~ wool + tension, data = warpbreaks)
> >>
> >> Coefficients:
> >> (Intercept)        woolB     tensionM     tensionH
> >>      39.278       -5.778      -10.000      -14.722
> >>
> >> I thought it would be the mean value of breaks when either wool equals
> >> "A" or tension equals "L", but that isn't true for this dataset.
> >>
> >> Any clues on interpreting the value of intercept?
> >>
> >> Thanks!
> >>
> >> --
> >> Joao.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
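
A footnote to the explanation at the top of this message: the following is a rough sketch, not code from the thread, of one way to see where the intercept of the main-effects model comes from. It assumes the default treatment contrasts and relies on warpbreaks being balanced (9 observations in each wool x tension cell); in an unbalanced design the last line would generally not match. The names grand, wool_A, tens_L and ref are made up for illustration.

```r
# Sketch only; assumes the default treatment contrasts.
# grand, wool_A, tens_L and ref are illustrative names, not from the thread.
fit <- lm(breaks ~ wool + tension, data = warpbreaks)

# Marginal means; warpbreaks is balanced (9 observations per cell)
grand  <- mean(warpbreaks$breaks)
wool_A <- mean(warpbreaks$breaks[warpbreaks$wool == "A"])
tens_L <- mean(warpbreaks$breaks[warpbreaks$tension == "L"])

# Reference cell: the first level of each factor (wool A, tension L)
ref <- data.frame(
  wool    = factor("A", levels = levels(warpbreaks$wool)),
  tension = factor("L", levels = levels(warpbreaks$tension))
)

coef(fit)[["(Intercept)"]]    # intercept of the additive model
predict(fit, newdata = ref)   # fitted value for wool A, tension L
wool_A + tens_L - grand       # additive combination of marginal means
```

All three lines should print about 39.278, the intercept reported in the original question. That is the additive model's prediction for wool A and tension L, which is not the same thing as the observed mean of that cell.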