Re: [R] Linear Regression with 2 grouping variables

Nathan Miller Tue, 23 Aug 2011 07:46:33 -0700

Thanks Dennis! Worked perfectly. I keep forgetting that plyr can split data
based on multiple subsetting variables.


Thanks so much,
Nate

On Mon, Aug 22, 2011 at 10:12 PM, Dennis Murphy <djmu...@gmail.com> wrote:

> Hi:
>
> You're kind of on the right track, but there is no conditioning
> formula in lm(); it's not lattice :)  This is relatively easy to do
> with the plyr package, though:
>
> library('plyr')
> # Generate a list of models - the subsetting variables (Site, Vial) are
> # used to generate the data splits and the function is run on a generic
> # data split d, assumed to be a data frame.
> mlist <- dlply(feed1, .(Site, Vial),
>                function(d) lm(lnRFU ~ Time, data = d))
>
> # To get the set of model coefficients, take the list mlist as the input
> # data (argument 1) and then for each generic component 'm', extract its
> # coefficients:
> ldply(mlist, function(m) coef(m))
>
> For your test data, only Vial varies - the example code below reflects
> that:
>
> > mlist <- dlply(feed1, .(Vial), function(d) lm(lnRFU ~ Time, data = d))
> > length(mlist)  # three component model objects
> [1] 3
> > ldply(mlist, function(m) coef(m))
>   Vial (Intercept)         Time
> 1    1    10.75440 -0.001508621
> 2    2    10.83171 -0.005095100
> 3    3    10.81087 -0.004897600
>
> This idea can be generalized: if you want to pull out similar pieces
> of output from each model, run ldply() on the list of models and
> create a utility function that outputs, for a generic model object m,
> what you want to have returned. Common choices include R^2 values,
> tables of coefficients (as lists instead of data frames) or residuals
> and predicted values. The game is to write the function so that it
> takes a model object (here, m; a fitted model is itself a list) as
> input and returns a data frame as output. You can also extract output
> from summary(m) in a similar way, using m as the input object.
>
> HTH,
> Dennis
>
> On Mon, Aug 22, 2011 at 6:15 PM, Nathan Miller <natemille...@gmail.com>
> wrote:
> > Hi all,
> >
> > I have a data set that looks a bit like this.
> >
> > feed1
> >      RFU Site Vial Time     lnRFU
> > 1  44448    1    1   10 10.702075
> > 2  47521    1    1   20 10.768927
> > 3  42905    1    1   30  10.66674
> > 4  46867    1    1   40 10.755069
> > 5  42995    1    1   50 10.668839
> > 6  43074    1    1   60 10.670675
> > 7  41195    1    1   70 10.626072
> > 8  47090    1    2   10 10.759816
> > 9  48100    1    2   20 10.781037
> > 10 43215    1    2   30 10.673943
> > 11 39656    1    2   40 10.587998
> > 12 38799    1    2   50 10.566150
> > 13 38424    1    2   60 10.556438
> > 14 35240    1    2   70 10.469937
> > 15 46427    1    3   10 10.745636
> > 16 46418    1    3   20 10.745443
> > 17 42095    1    3   30 10.647684
> > ......
> > There are 5 columns of data, three levels of "Site", 10 "Vials" per site,
> > and measurements were taken at 10 min intervals from 10 to 70 min. I am
> > primarily interested in the relationship between "Time" and "lnRFU" to
> > calculate the rate at which lnRFU declines over time. I have a nice plot
> > made with ggplot2 code that looks like this
> >
> > p<-ggplot(data=feed1,aes(x=Time,y=lnRFU))
> > p+geom_point(size=4)+facet_grid(Site~Vial)+geom_smooth(method="lm")
> >
> > The graph is useful to visualize the changes over time and grouped by both
> > Site and Vial, but I also need the slopes of the linear regressions for each
> > Vial, within a Site. This is where I run into a problem. I want to run a
> > linear regression of lnRFU as a function of Time grouped by both Site and
> > Vial. It's easy to visualize this comparison in ggplot using facet_grid(),
> > but I'm not sure how to do a similar comparison/analysis within lm().
> >
> > I imagine something like
> >
> > fit<-lm(lnRFU~Time | Vial * Site, data=feed1)
> >
> >  in which I group by both Vial and Site, but obviously this code doesn't
> > work. Does anyone have an idea for how to do a linear regression with two
> > grouping variables? Do I have to go back and combine Vial and Site into a
> > single grouping variable or can I leave the dataframe the way it is? I'm
> > trying to imagine a means of accomplishing the same type of thing that
> > facet_grid does when it allows you to plot the data as a function of two
> > "grouping" variables.
> >
> > Thanks for your time. I greatly appreciate it.
> >
> > Nate Miller
> >
> >        [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
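
Dennis's closing suggestion (write a small function that takes one fitted model and returns a data frame, then hand it to ldply()) might look roughly like the sketch below. The data frame feed_demo is synthetic and only mimics the layout of feed1, and the helper name model_summary is made up for illustration; neither comes from the thread.

```r
library(plyr)

# Synthetic stand-in for feed1 (NOT the poster's data): 2 sites x 3 vials,
# with lnRFU declining linearly in Time plus a little noise.
set.seed(1)
feed_demo <- expand.grid(Time = seq(10, 70, by = 10), Vial = 1:3, Site = 1:2)
feed_demo$lnRFU <- 10.8 - 0.003 * feed_demo$Vial * feed_demo$Time +
  rnorm(nrow(feed_demo), sd = 0.02)

# One lm() fit per Site x Vial combination, as in the reply above.
mlist <- dlply(feed_demo, .(Site, Vial),
               function(d) lm(lnRFU ~ Time, data = d))

# Hypothetical helper: takes one fitted model m and returns a one-row
# data frame with the pieces of interest.
model_summary <- function(m) {
  s <- summary(m)
  data.frame(intercept = unname(coef(m)[1]),
             slope     = unname(coef(m)["Time"]),
             slope_se  = s$coefficients["Time", "Std. Error"],
             r_squared = s$r.squared)
}

# ldply() stacks the one-row results and keeps the Site and Vial labels.
fits <- ldply(mlist, model_summary)
print(fits)
```

Returning a one-row data frame from the helper keeps the result tidy: one row per Site/Vial pair, one column per quantity, ready for merging or plotting.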

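
On the original question of whether Vial and Site have to be merged into a single grouping variable: one possible answer without plyr is base R's split(), which accepts a list of factors and forms the combinations itself. The sketch below assumes a synthetic data frame (feed_demo, not the poster's feed1) with the same column names.

```r
# Synthetic stand-in for feed1 (NOT the poster's data).
set.seed(1)
feed_demo <- expand.grid(Time = seq(10, 70, by = 10), Vial = 1:3, Site = 1:2)
feed_demo$lnRFU <- 10.8 - 0.003 * feed_demo$Vial * feed_demo$Time +
  rnorm(nrow(feed_demo), sd = 0.02)

# split() on a list of two grouping columns gives one data frame per
# Site x Vial combination; drop = TRUE discards combinations with no rows.
groups <- split(feed_demo, list(feed_demo$Site, feed_demo$Vial), drop = TRUE)

# Fit lnRFU ~ Time within each piece and keep only the Time slope.
slopes <- sapply(groups, function(d) coef(lm(lnRFU ~ Time, data = d))[["Time"]])
print(slopes)
```

The names of the result have the form "Site.Vial" (for example "1.2"), so the original data frame can stay as it is.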
