Your second fit makes no sense, as you can easily tell from the regression
summaries. Fitting with spray as a categorical variable gives an overall
p-value of less than 2.2e-16, while fitting with as.numeric(spray) gives an
overall p-value of 0.2118. The as.numeric fit treats the arbitrary labels A
through F as equally spaced numbers and forces a single slope through them,
which is a completely invalid model for these data, as others have tried to
point out.
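
If you want to see the difference side by side, here is a minimal sketch.
It assumes the built-in InsectSprays data and keeps the intercept in both
formulas (the quoted code below drops it with `0 +`); the names dat,
spray_num, fit_factor, fit_numeric and overall_p are my own and purely
illustrative.

```r
# Sketch only: compare factor coding with as.numeric coding of spray.
dat <- InsectSprays                      # work on a copy of the built-in data
dat$spray_num <- as.numeric(dat$spray)   # A..F become 1..6 (illustrative name)

fit_factor  <- lm(count ~ spray,     data = dat)  # one indicator per non-baseline spray
fit_numeric <- lm(count ~ spray_num, data = dat)  # a single slope across the codes 1..6

# Overall F-test p-value, computed from the F statistic stored by summary.lm.
overall_p <- function(fit) {
  f <- summary(fit)$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

# Degrees of freedom spent on spray under each coding.
anova(fit_factor)["spray", "Df"]
anova(fit_numeric)["spray_num", "Df"]

# Overall p-values of the two fits.
overall_p(fit_factor)
overall_p(fit_numeric)

# One line per term, as suggested with aov.
summary(aov(count ~ spray, data = dat))
```

The factor fit spends 5 degrees of freedom on spray and the numeric fit
only 1, which is the whole difference between the two models.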

Jonathan


On Fri, Aug 13, 2010 at 1:55 PM, TGS <cran.questi...@gmail.com> wrote:

> # I wasn't trying to do ANOVA. I was simply trying to figure out how to
> # regress count on sprays (this is after I saw another poster asking an
> # unrelated question with the InsectSprays dataset).
> #
> # Anyhow, David clarified this, but thanks for your explanation as well.
>
> rm(list = ls()); sprays <- as.numeric(InsectSprays$spray)
>
> lm(formula = count ~ 0 + spray, data = InsectSprays)
> lm(formula = count ~ 0 + sprays, data = InsectSprays)
>
> # Beside the point: in the ANOVA problem the degrees of freedom would be
> # 5, not 1.
>
> On Aug 13, 2010, at 12:27 PM, Greg Snow wrote:
>
> So you want 1 degree of freedom for InsectSprays?  You believe that the
> difference between A and B is exactly the same as between B and C, which is
> exactly the same as between D and E (etc.)?  That seems an odd assumption,
> but you can get that by using as.numeric (as I and others have already
> stated).
>
> If on the other hand you want InsectSprays to be treated correctly with the
> correct number of degrees of freedom, but have the output on a single line
> testing the overall effect, then you want to use the aov function rather
> than lm (internally they do the same thing, but the default summary output
> for aov is 1 line per term).
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
> > -----Original Message-----
> > From: TGS [mailto:cran.questi...@gmail.com]
> > Sent: Friday, August 13, 2010 11:51 AM
> > To: Greg Snow
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Dealing with data
> >
> > # Greg, if R automatically does that then I don't know why it's
> > treating each indicator
> > # as a different regressor. In other words, I am interested in treating
> > 'spray' as one
> > # independent variable.
> > #
> > # Erik, which book do you suggest I read? Thanks.
> >
> > data(InsectSprays)
> > lm(InsectSprays$count ~ 0 + InsectSprays$spray)
> >
> > On Aug 13, 2010, at 10:34 AM, Greg Snow wrote:
> >
> > R/S does all of that automatically for you; you do not need to manually
> > create the indicator variables.
> >
> > If you do something like:
> >
> >> fit <- lm( Sepal.Width ~ Species, data=iris, x=TRUE)
> >
> > Then look at the matrix actually used:
> >
> >> fit$x
> >
> > Or the output:
> >
> >> summary(fit)
> >
> > You will see that Species was automatically converted into indicator
> > variables and those were used in the regression.
> >
> > If you really need the indicator variables yourself, look at the
> > model.matrix function, e.g.:
> >
> >> model.matrix( ~Species, data=iris )
> >
> > Or
> >
> >> model.matrix( ~Species - 1, data=iris )
> >
> > If you really want 1 for A, 2 for B, etc. then look at as.numeric on a
> > factor variable (e.g. as.numeric(iris$Species) ).
> >
> > Hope this helps,
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > greg.s...@imail.org
> > 801.408.8111
> >
> >
> >> -----Original Message-----
> >> From: r-help-boun...@r-project.org
> >> [mailto:r-help-boun...@r-project.org] On Behalf Of TGS
> >> Sent: Friday, August 13, 2010 11:22 AM
> >> To: David Winsemius
> >> Cc: r-help@r-project.org
> >> Subject: Re: [R] Dealing with data
> >>
> >> To clarify, I'd like to create a column of indicators for the
> >> respective letters so that I could maybe do regression on indicators,
> >> etc.
> >>
> >> For instance, "A" gets "1", "B" gets "2", and so on.
> >>
> >> On Aug 13, 2010, at 10:19 AM, David Winsemius wrote:
> >>
> >>
> >> On Aug 13, 2010, at 1:03 PM, TGS wrote:
> >>
> >>> # how would I code in R to look at the letter of the alphabet
> >>> # in the second column and create an indicator column for the
> >>> # corresponding letter?
> >>>
> >>> data(InsectSprays)
> >>> InsectSprays$spray
> >>
> >> It's already what most people mean when they say "indicator column",
> >> i.e., a factor variable (and not a character vector) ... so, what do
> >> _you_ mean?
> >>>
> >>
> >>
> >> --
> >>
> >> David Winsemius, MD
> >> West Hartford, CT
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
