Re: [R] splitting into multiple dataframes and then create a loop to work

Nilaya Sharma Tue, 30 Aug 2011 12:10:14 -0700

Thank you for the help. My focus was on splitting the data frame so that I
could apply a different function, not lm. Rather than post the details of that
lengthy function, I used lm as a stand-in.
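
For the general case (split by clvar, then apply some function to each piece),
a minimal sketch in base R might look like the following. It rebuilds the
example data from the original post. slope_p() is only a hypothetical
placeholder for the real function, and the names dfs, xvars and pmat are chosen
here purely for illustration.

```r
# Example data as in the original post (4 levels of clvar, 10 rows each)
set.seed(1234)
clvar <- rep(1:4, each = 10)
yvar  <- rnorm(40, 10, 6)
var1  <- rnorm(40, 10, 4); var2 <- rnorm(40, 10, 4); var3 <- rnorm(40, 5, 2)
var4  <- rnorm(40, 10, 3); var5 <- rnorm(40, 15, 8)
df <- data.frame(clvar, yvar, var1, var2, var3, var4, var5)

# split() returns a named list with one data frame per level of clvar,
# so no df1, df2, ... objects need to be created by hand
dfs <- split(df, df$clvar)

# Hypothetical stand-in for the "lengthy function":
# p-value of the slope in the simple regression yvar ~ <xname>
slope_p <- function(d, xname) {
  fit <- lm(reformulate(xname, response = "yvar"), data = d)
  summary(fit)$coefficients[2, 4]
}

xvars <- paste0("var", 1:5)

# Apply the function to every predictor within every sub-data frame;
# the result has one row per clvar level and one column per predictor
pmat <- t(sapply(dfs, function(d) sapply(xvars, function(v) slope_p(d, v))))
pmat
```

Replacing the body of slope_p() with any other function of a sub-data frame
should leave the rest unchanged, as long as it returns a single number.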


The comments were very helpful.

Thanks,

NIL

On Mon, Aug 29, 2011 at 3:37 PM, Dimitris Rizopoulos <
d.rizopou...@erasmusmc.nl> wrote:

> well, if a pooled estimate of the residual standard error is not desirable,
> then you just need to set argument 'pool' of lmList() to FALSE, e.g.,
>
> mlis <- lmList(yvar ~ .  - clvar | clvar, data = df, pool = FALSE)
> summary(mlis)
>
>
> Best,
> Dimitris
>
>
>
> On 8/29/2011 9:20 PM, Dennis Murphy wrote:
>
>> Hi:
>>
>> Dimitris' solution is appropriate, but it needs to be mentioned that
>> the approach I offered earlier in this thread differs from the
>> lmList() approach. lmList() uses a pooled measure of error MSE (which
>> you can see at the bottom of the output from summary(mlis) ), whereas
>> the plyr approach subdivides the data into distinct sub-data frames
>> and analyzes them as separate entities. As a result, the residual MSEs
>> will differ between the two approaches, which in turn affects the
>> significance tests on the model coefficients. You need to decide which
>> approach is better for your purposes.
>>
>> Cheers,
>> Dennis
>>
>> On Mon, Aug 29, 2011 at 12:02 PM, Dimitris Rizopoulos
>> <d.rizopou...@erasmusmc.nl>  wrote:
>>
>>> You can do this using function lmList() from package nlme, without having to
>>> split the data frames, e.g.,
>>>
>>> library(nlme)
>>>
>>> mlis<- lmList(yvar ~ .  - clvar | clvar, data = df)
>>> mlis
>>> summary(mlis)
>>>
>>>
>>> I hope it helps.
>>>
>>> Best,
>>> Dimitris
>>>
>>>
>>> On 8/29/2011 5:37 PM, Nilaya Sharma wrote:
>>>
>>>>
>>>> Dear All
>>>>
>>>> Sorry for this simple question; I could not solve it despite spending days on it.
>>>>
>>>> My data looks like this:
>>>>
>>>> # data
>>>> set.seed(1234)
>>>> clvar<- c( rep(1, 10), rep(2, 10), rep(3, 10), rep(4, 10))
>>>> # I have 100 levels for this factor variable
>>>> yvar<-  rnorm(40, 10,6)
>>>> var1<- rnorm(40, 10,4); var2<- rnorm(40, 10,4); var3<- rnorm(40, 5, 2)
>>>> var4<- rnorm(40, 10, 3); var5<- rnorm(40, 15, 8) # just example
>>>> df<- data.frame(clvar, yvar, var1, var2, var3, var4, var5)
>>>>
>>>> # manual splitting
>>>> df1<- subset(df, clvar == 1)
>>>> df2<- subset(df, clvar == 2)
>>>> df3<- subset(df, clvar == 3)
>>>> df4<- subset(df, clvar == 4)
>>>> df5<- subset(df, clvar == 5)  # empty here: the example data has only levels 1 to 4
>>>>
>>>> # i tried to mechanize it
>>>>
>>>> for(i in 1:5) {
>>>>
>>>>           df[i]<- subset(df, clvar == i)
>>>>
>>>> }
>>>>
>>>> I know it should not work, as df[i] is a single variable, and indeed it
>>>> did not. But I could not find a way to output multiple data frames from
>>>> this loop. My limited R knowledge did not help at all!
>>>>
>>>> # working on each of variable, just trying simple function
>>>>  a<- 3:8
>>>> out1<- lapply(1:5, function(ind){
>>>>                    lm(df1$yvar ~ df1[, a[ind]])
>>>>  })
>>>> p1<- lapply(out1, function(m)summary(m)$coefficients[,4][2])
>>>> p1<- do.call(rbind, p1)
>>>>
>>>>
>>>> My ultimate objective is to apply this function to all the data frames
>>>> created (i.e. df1, df2, df3, df4, df5) and create five corresponding
>>>> p-value vectors (p1, p2, p3, p4, p5). The output would then be a matrix
>>>> of clvar and the corresponding p-values:
>>>> clvar       var1   var2  var3  var4   var5
>>>> 1
>>>> 2
>>>> 3
>>>> 4
>>>>
>>>> Please help me !
>>>>
>>>> Thanks
>>>>
>>>> NIL
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>>
>>> --
>>> Dimitris Rizopoulos
>>> Assistant Professor
>>> Department of Biostatistics
>>> Erasmus University Medical Center
>>>
>>> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
>>> Tel: +31/(0)10/7043478
>>> Fax: +31/(0)10/7043014
>>> Web: http://www.erasmusmc.nl/biostatistiek/
>>>
>>>
>>
