Your second fit makes no sense, as you can easily tell by comparing the two regression summaries. Fitting with spray as a categorical variable gives an overall p-value of less than 2.2e-16, while fitting with as.numeric(spray) gives an overall p-value of 0.2118. Coding the sprays as 1 through 6 forces a single slope, which assumes that A through F are equally spaced points on a numeric scale. That is a completely invalid model for these data, as others have tried to point out.
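If you want to check this yourself, something along the following lines should do it. This is only a sketch: I am assuming both fits keep the intercept, and the names fit_factor, fit_numeric and overall_p are made up for illustration. The overall p-value is recomputed from the fstatistic component that summary() stores for an lm fit.

```r
# Sketch only: fit_factor, fit_numeric and overall_p are illustrative names.
data(InsectSprays)

# spray as a factor: R builds the indicator columns itself
fit_factor <- lm(count ~ spray, data = InsectSprays)

# spray coerced to 1..6: a single slope, i.e. equal spacing between sprays
fit_numeric <- lm(count ~ as.numeric(spray), data = InsectSprays)

# overall F-test p-value, taken from the fstatistic stored by summary.lm
overall_p <- function(fit) {
  f <- summary(fit)$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

length(coef(fit_factor))   # intercept plus one coefficient per non-baseline spray
length(coef(fit_numeric))  # intercept plus a single slope
overall_p(fit_factor)
overall_p(fit_numeric)
```

Comparing the number of coefficients shows how many degrees of freedom each coding spends on spray, which is the whole disagreement in this thread.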
Jonathan

On Fri, Aug 13, 2010 at 1:55 PM, TGS <cran.questi...@gmail.com> wrote:

> # I wasn't trying to do ANOVA. I was simply trying to figure out how
> regress count on sprays (this is after I saw another poster asking an
> unrelated question with the InsectSprays dataset).
> #
> # Anyhow, David clarified this but also, thanks for your explanation as
> well.
>
> rm(list = ls()); sprays <- as.numeric(InsectSprays$spray)
>
> lm(formula = count ~ 0 + spray, data = InsectSprays)
> lm(formula = count ~ 0 + sprays, data = InsectSprays)
>
> # besides the point, in the ANOVA problem the degrees of freedom would be
> 5, not 1.
>
> On Aug 13, 2010, at 12:27 PM, Greg Snow wrote:
>
> So you want 1 degree of freedom for InsectSprays? You believe that the
> difference between A and B is exactly the same as between B and C which is
> exactly the same as between D and E (etc.)? that seems an odd assumption,
> but you can get that by using as.numeric (as I and others have already
> stated).
>
> If on the other hand you want InsectSprays to be treated correctly with the
> correct number of degrees of freedom, but have the output on a single line
> testing the overall effect, then you want to use the aov function rather
> than lm (internally they do the same thing, but the default summary output
> for aov is 1 line per term).
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
> > -----Original Message-----
> > From: TGS [mailto:cran.questi...@gmail.com]
> > Sent: Friday, August 13, 2010 11:51 AM
> > To: Greg Snow
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Dealing with data
> >
> > # Greg, if R automatically does that then I don't know why it's treating each indicator
> > # as a different regressor. In other words, I am interested in treating 'spray' as one
> > # independent variable.
> > #
> > # Erik, which book do you suggest I read? Thanks.
> >
> > data(InsectSprays)
> > lm(InsectSprays$count ~ 0 + InsectSprays$spray)
> >
> > On Aug 13, 2010, at 10:34 AM, Greg Snow wrote:
> >
> > R/S does all of that automatically for you, you do not need to manually
> > create the indicator variables.
> >
> > If you do something like:
> >
> >> fit <- lm( Sepal.Width ~ Species, data=iris, x=TRUE)
> >
> > Then look at the matrix actually used:
> >
> >> fit$x
> >
> > Or the output:
> >
> >> summary(fit)
> >
> > You will see that Species was automatically converted into indicator
> > variables and those were used in the regression.
> >
> > If you really need the indicator variables yourself, look at the
> > model.matrix function, e.g.:
> >
> >> model.matrix( ~Species, data=iris )
> >
> > Or
> >
> >> model.matrix( ~Species - 1, data=iris )
> >
> > If you really want 1 for A, 2 for B, etc. then look at as.numeric on a
> > factor variable (e.g. as.numeric(iris$Species) ).
> >
> > Hope this helps,
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > greg.s...@imail.org
> > 801.408.8111
> >
> >
> >> -----Original Message-----
> >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of TGS
> >> Sent: Friday, August 13, 2010 11:22 AM
> >> To: David Winsemius
> >> Cc: r-help@r-project.org
> >> Subject: Re: [R] Dealing with data
> >>
> >> To clarify, I'd like to create a column of indicators for the
> >> respective letters so that I could maybe do regression on indicators,
> >> etc.
> >>
> >> For instance, "A" gets "1", "B" gets "2", and so on.
> >>
> >> On Aug 13, 2010, at 10:19 AM, David Winsemius wrote:
> >>
> >>
> >> On Aug 13, 2010, at 1:03 PM, TGS wrote:
> >>
> >>> # how would I code in R to look at the letter of the alphabet
> >>> # in the second column and create a indicator column for the
> >>> # corresponding letter?
> >>>
> >>> data(InsectSprays)
> >>> InsectSprays$spray
> >>
> >> It's already what most people mean when they say "indicator column",
> >> i.e., a factor variable (and not a character vector) .... so, what do
> >> _you_ mean?
> >>>
> >>
> >>
> >> --
> >>
> >> David Winsemius, MD
> >> West Hartford, CT
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.