Joao,

Your intuition is correct: the intercept represents the predicted value 
for wool A and tension L.  But you're tripping up on how to figure out 
that predicted value.  In the model that you fit, the predicted value for 
wool A and tension L is not simply the mean of the observations for wool A 
and tension L, because there are only main effects in the model (no 
interaction).  Try this, and compare the observed cell means with the 
fitted values from each model:

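# observed mean of breaks in each wool x tension cell
with(warpbreaks, tapply(breaks, list(wool, tension), mean))

# main effects only: fitted values differ from the cell means
fit <- lm(breaks ~ wool + tension, data=warpbreaks)
with(warpbreaks, tapply(fitted(fit), list(wool, tension), mean))

# with the interaction: fitted values reproduce the cell means
fit2 <- lm(breaks ~ wool*tension, data=warpbreaks)
with(warpbreaks, tapply(fitted(fit2), list(wool, tension), mean))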
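
To make the link between the intercept and the prediction explicit, here is 
a small sketch.  The object names (ref, pred_AL, by_hand, cell_AL) are just 
ones I picked for illustration, and the "by hand" formula relies on the 
warpbreaks design being balanced (9 observations per cell); it would not 
hold in general for unbalanced data.

```r
fit <- lm(breaks ~ wool + tension, data = warpbreaks)

# the intercept is the prediction at the reference levels (wool A, tension L)
ref <- data.frame(wool = "A", tension = "L")
pred_AL <- unname(predict(fit, newdata = ref))
intercept <- unname(coef(fit)["(Intercept)"])

# balanced design only: additive prediction =
#   mean for wool A + mean for tension L - grand mean
by_hand <- with(warpbreaks,
                mean(breaks[wool == "A"]) +
                mean(breaks[tension == "L"]) -
                mean(breaks))

# the raw cell mean, which the main-effects model does not reproduce
cell_AL <- with(warpbreaks, mean(breaks[wool == "A" & tension == "L"]))

c(intercept = intercept, predicted = pred_AL,
  by_hand = by_hand, cell_mean = cell_AL)
```

The first three numbers should agree with each other (and with the 39.278 
in your output), while the last one is the plain cell mean that only the 
interaction model recovers.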

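The interpretation of the coefficients depends on how the "contrasts" 
option is set in R.  With treatment contrasts for unordered factors, as 
shown below, the intercept is the prediction for the reference levels 
(wool A, tension L).  For me it's like this: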
options("contrasts")

$contrasts
        unordered           ordered 
"contr.treatment"      "contr.poly" 
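
As a rough illustration of why the contrasts setting matters, the sketch 
below fits the same main-effects model twice, once with the default 
treatment contrasts and once with sum-to-zero contrasts passed through the 
contrasts argument of lm().  The names fit_trt and fit_sum are mine.  The 
statement that the sum-contrast intercept equals the grand mean assumes a 
balanced design, which warpbreaks is.

```r
# default treatment contrasts: intercept = prediction for wool A, tension L
fit_trt <- lm(breaks ~ wool + tension, data = warpbreaks)

# sum-to-zero contrasts, set for this model only (global options untouched)
fit_sum <- lm(breaks ~ wool + tension, data = warpbreaks,
              contrasts = list(wool = "contr.sum", tension = "contr.sum"))

coef(fit_trt)["(Intercept)"]   # prediction at the reference levels
coef(fit_sum)["(Intercept)"]   # grand mean of breaks (balanced design)
mean(warpbreaks$breaks)

# the coding changes the coefficients, not the fitted values
all.equal(fitted(fit_trt), fitted(fit_sum))
```

Passing contrasts to lm() rather than changing options("contrasts") keeps 
the change local to one model.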

Jean


Joao Azevedo <joao.c.azev...@gmail.com> wrote on 07/27/2012 07:16:10 AM:
> 
> Hi!
> 
> Thanks for the link. I've already stumbled upon that explanation. I'm
> able to understand how the coding schemes are applied in the supplied
> examples, but they only use a single explanatory variable. My problem
> is with understanding the model when there are multiple categorical
> explanatory variables.
> 
> --
> Joao.
> 
> On Fri, Jul 27, 2012 at 1:04 PM, Jean V Adams <jvad...@usgs.gov> wrote:
> > Joao,
> >
> > There's a very thorough explanation at
> > http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm
> >
> > Jean
> >
> >
> > Joao Azevedo <joao.c.azev...@gmail.com> wrote on 07/27/2012 06:32:31 AM:
> >
> >>
> >> Hi!
> >>
> >> I'm failing to understand the value of the intercept in a
> >> multiple linear regression with categorical variables. Taking the
> >> "warpbreaks" data set as an example, when I do:
> >>
> >> > lm(breaks ~ wool, data=warpbreaks)
> >>
> >> Call:
> >> lm(formula = breaks ~ wool, data = warpbreaks)
> >>
> >> Coefficients:
> >> (Intercept)        woolB
> >>      31.037       -5.778
> >>
> >> I'm able to understand that the value of the intercept is the mean value
> >> of breaks when wool equals "A", and that adding the "woolB"
> >> coefficient to the intercept value gives the mean value of breaks when
> >> wool equals "B". However, if I also consider the tension variable in
> >> the model, I'm unable to figure out the meaning of the intercept
> >> value:
> >>
> >> > lm(breaks ~ wool + tension, data=warpbreaks)
> >>
> >> Call:
> >> lm(formula = breaks ~ wool + tension, data = warpbreaks)
> >>
> >> Coefficients:
> >> (Intercept)        woolB     tensionM     tensionH
> >>      39.278       -5.778      -10.000      -14.722
> >>
> >> I thought it would be the mean value of breaks when either wool equals
> >> "A" or tension equals "L", but that isn't true for this dataset.
> >>
> >> Any clues on interpreting the value of intercept?
> >>
> >> Thanks!
> >>
> >> --
> >> Joao.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
